Transmogrification: Beyond the Magic Box          Cris Ewing     PLONE CONFERENCE, 2011
Migration
It’s Easy!
Photo by Christopher Michel - CC-BY http://www.flickr.com/photos/cmichel67/4172613951/
A Good Plan. Photo by Steve Jurvetson - CC-BY http://www.flickr.com/photos/jurvetson/21470089/
Good Tools. Photo by Kimmo Palosaari - CC-BY http:/...
It doesn’t ...
Our Story...
heroes and villains. Photo by digital_ramapge via http://www.flickr.com/photos/d...
A Difficult ...
Obstacles to Overcome. Photo by Joe Marinaro - CC-BY http://www.flickr.com/photos/m500/5782771006/
Victory. Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr....
The Plan
Liferay
Proprietary Open Source
Image by Patrick Hoesly - CC-BY http://www.flickr.com/photos/zooboing/5566075309/
‘lacking in documentation explaining how it actually works’
data ...
DB Schema: Clear
mysql> show tables;
| Tables_in_lportal ...
| Users_Permissions |
| Users_Roles |
| Users_UserGroups |
| Vocabulary ...
DB Schema: Simple
mysql> describe JournalArticle;
| Field | Type | Null | Key | Default | Extra |
DB Schema: Easy to Understand
SELECT COALESCE(gui.guestName, mbm.userName) ...
DB Schema: Easy to Understand </sarcasm>
Simple Goal (relatively)
All Articles: ~2000
All Images: ~5000
All Comments: ~1000
Preserve Links
Preserve Dates
Preserve Authorship
Only a Few Tables
| Field | Type |...
| Field | Type | Null | Key | Default | Extra |+-...
| Field | Type | Null | Key | Defaul...
| Field | Type | Null | Key | Default | Ext...
| Field | Type | Null | Key | Defaul...
What’s in an Article?
1. Image at top
2. Author name
3. Links in body
4. Images in body
5. Footnotes
6. Comments
And the Data?
| Field | Type | Null | Key | Default...
Join Image
SELECT IGImage.imageId, IGImage.createDate, IGImage.modifiedDat...
Get Article Comments
SELECT COALESCE(gui.guestName, mbm.userName) AS userName, mbm.createDate, ...
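The comment query shown on the "DB Schema: Easy to Understand" slide can be tried against a throwaway database. What follows is only a minimal sketch: it assumes SQLite in place of MySQL (so `?` stands in for the `%d` placeholder), creates just the columns the query touches, and uses invented rows. The helper name `comments_for` and the idea that `classPK` identifies the article are assumptions for illustration.

```python
import sqlite3

# Hypothetical miniature of the three Liferay tables the query touches;
# only the columns used by the query are created, and all rows are invented.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE MBDiscussion (discussionId INTEGER, classPK INTEGER, threadId INTEGER);
    CREATE TABLE MBMessage (messageId INTEGER, threadId INTEGER, userName TEXT,
                            createDate TEXT, modifiedDate TEXT, subject TEXT, body TEXT);
    CREATE TABLE guestUserInfo (messageId INTEGER, guestName TEXT);

    INSERT INTO MBDiscussion VALUES (1, 500, 10);
    INSERT INTO MBMessage VALUES (100, 10, 'alice', '2011-01-01', '2011-01-01', 'First', 'Hello');
    INSERT INTO MBMessage VALUES (101, 10, 'guest', '2011-01-02', '2011-01-02', 'Second', 'Hi');
    INSERT INTO guestUserInfo VALUES (101, 'Visiting Bob');
""")

# Same shape as the query on the slide, with a SQLite-style placeholder.
COMMENT_QUERY = """
    SELECT COALESCE(gui.guestName, mbm.userName) AS userName,
           mbm.createDate, mbm.modifiedDate, mbm.subject, mbm.body
    FROM MBMessage mbm
    JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
    LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
    WHERE mbd.classPK = ?
    ORDER BY modifiedDate
"""

def comments_for(article_pk):
    """Return (userName, subject) pairs for one article's discussion thread."""
    rows = conn.execute(COMMENT_QUERY, (article_pk,)).fetchall()
    return [(row[0], row[3]) for row in rows]

comments = comments_for(500)
print(comments)
```

The `LEFT JOIN` keeps messages that have no guest record, and `COALESCE` prefers the guest name when one exists, which is why a single query can cover both registered and guest commenters.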
Alright, Good to Go!
The Toolbox
Transmogrifier
Transmogrifier: “It’s a series of tubes”
Not Too Opinionated
A migration deals with moving pieces of content from one place to another
A piece ...
A Nice Set of Tools
The Migration
Two Main Types
Images
Articles
Two Pipelines
One for each main content category
Two Pipelines
• Extract content from SQL
• Update text field encodings
• Transform Dates to Python
...
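The first three steps listed above can be sketched as chained generator functions. This is a simplified illustration, not the sections used in the talk: it assumes rows arrive as dicts, that text fields are Latin-1 byte strings, and that dates are `YYYY-MM-DD HH:MM:SS` strings. The encoding, the date format, and all function names are assumptions.

```python
from datetime import datetime

def extract(rows):
    """Stand-in for the SQL source: yield one dict per database row."""
    for row in rows:
        yield dict(row)

def decode_text(items, keys, encoding="latin-1"):
    """Decode the named byte-string fields to unicode text."""
    for item in items:
        for key in keys:
            value = item.get(key)
            if isinstance(value, bytes):
                item[key] = value.decode(encoding)
        yield item

def parse_dates(items, keys, fmt="%Y-%m-%d %H:%M:%S"):
    """Turn the named date strings into datetime objects."""
    for item in items:
        for key in keys:
            if item.get(key):
                item[key] = datetime.strptime(item[key], fmt)
        yield item

# Invented sample row using column names from the JournalArticle table.
rows = [{"title": b"Caf\xe9 culture", "createDate": "2010-03-04 12:30:00"}]

pipeline = parse_dates(decode_text(extract(rows), ["title"]), ["createDate"])
result = list(pipeline)
print(result)
```

Each step only pulls from the step before it, so adding or removing a step means changing one line where the pipeline is assembled.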
A Simplification, but you get the idea
Can You Spot the Flaw? Photo by pollyann - CC-BY-NC-ND http://www.flickr.com/photos/pollyann/4...
How Do We Match Them? Plone ID != Original SQL ID
How Do We Match Them? JournalArticle.smallImageId != Image.smallImageID
How Do We Match Them? JournalArticle.smallImageId != JournalArticle.smallImageURL | 449868 | ...
Photo by hobvias sudoneighm - CC-BY http://www.flickr.com/photos/striatic/2192192956/
Transmogrifier To The Rescue!!!
Two Features
Splitter Section: run content down different pipelines in one transmogrifier
Annotations: store data on the transmogrifier to communicate between pipeline sections
How Does This Help?
Three Facts
1. Pipeline sections are generators
class MySection(object): ...
2. SQL sections process items 1 query at a time
for query in self.querie...
3. Pipelines process one item at a time
class SectionOne(ob...
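The third fact is easy to see with two toy sections chained together. This is a sketch in plain Python rather than real transmogrifier sections: actual sections are constructed with `(transmogrifier, name, options, previous)`, while the constructors here take only `previous`, and the shared `log` list exists purely to record the order of events.

```python
log = []

class SectionOne(object):
    """Hypothetical section: records each item, then passes it on."""
    def __init__(self, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            log.append("one:%s" % item["id"])
            yield item

class SectionTwo(object):
    """Hypothetical section placed after SectionOne in the pipeline."""
    def __init__(self, previous):
        self.previous = previous

    def __iter__(self):
        for item in self.previous:
            log.append("two:%s" % item["id"])
            yield item

source = iter([{"id": "a"}, {"id": "b"}])
pipeline = SectionTwo(SectionOne(source))
for item in pipeline:
    pass

print(log)
```

The log interleaves the two sections (`one:a`, `two:a`, `one:b`, `two:b`): item `a` travels the whole pipeline before item `b` is even read from the source.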
Using Annotations
Using Annotations: Image Information
class IGImageIDMapper(object):
    classProvides(ISectionBlueprint)
    implements(ISection)
    def __init_...
Using Annotations: Image Information
Using Annotations
class GetArticleImage(object):
    classProvides(ISectionBlueprint)
    implements(ISection)
    def __iter_...
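Since the two classes above are cut off, here is a rough sketch of the idea they illustrate: one section writes an ID map into shared storage and a later section reads it. It is not the code from the talk. A plain dict on a stand-in object replaces the real annotation storage (in collective.transmogrifier this would typically be `IAnnotations(transmogrifier)`), and the class names, the annotation key, and the sample data are all hypothetical.

```python
class FakeTransmogrifier(object):
    """Stand-in for the real transmogrifier; only carries shared annotations."""
    def __init__(self):
        self.annotations = {}

IMAGE_MAP_KEY = "example.image_id_map"  # hypothetical annotation key

class ImageIDMapper(object):
    """Image pipeline: record which new path each original image ID ended up at."""
    def __init__(self, transmogrifier, previous):
        self.previous = previous
        self.id_map = transmogrifier.annotations.setdefault(IMAGE_MAP_KEY, {})

    def __iter__(self):
        for item in self.previous:
            self.id_map[item["smallImageId"]] = item["_path"]
            yield item

class ArticleImageLookup(object):
    """Article pipeline: look the top image up in the shared map."""
    def __init__(self, transmogrifier, previous):
        self.previous = previous
        self.id_map = transmogrifier.annotations.setdefault(IMAGE_MAP_KEY, {})

    def __iter__(self):
        for item in self.previous:
            item["image_path"] = self.id_map.get(item.get("smallImageId"))
            yield item

transmogrifier = FakeTransmogrifier()
images = [{"smallImageId": 1001, "_path": "/images/sunrise.jpg"}]
articles = [{"title": "Morning", "smallImageId": 1001},
            {"title": "No picture", "smallImageId": None}]

# The image pipeline has to be exhausted before the articles are processed.
list(ImageIDMapper(transmogrifier, iter(images)))
migrated = list(ArticleImageLookup(transmogrifier, iter(articles)))
print(migrated)
```

Both sections fetch the same dict through `setdefault`, so neither needs to know which of them was constructed first.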
Victory! Photo by Petr & Bara Ruzicka - CC-BY http://...
Image by justinshearer - CC-BY-NC-SA http://www.flickr.com/photos/justinshearer/3675295127/
What About Links In Articles To Other Articles?
Images finish before Articles Start
Image Information: Images finish before Articles Start
Article 1 ID: Article 1 Identifiers Stored
Article 1 ID
Article 1 ID: Article 2 Links Processed
Article 1 ID: Article 2 Links Processed, Link t...
Sad Pa...
Transmogrifier To The Rescue!!!
Three Facts
1. Pipeline sections are generators
class MySection(object): ...
One Fact, Really
1. Pipeline sections are generators
class MySection(object): clas...
After all items are gone
Cleanup code is run
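Because a section is a generator, any code placed after its `for` loop runs only once the upstream sections are exhausted. A minimal sketch of that behaviour, using a hypothetical `PostProcessing` class and a simplified constructor (real sections also receive the transmogrifier, a name, and options):

```python
class PostProcessing(object):
    """Hypothetical section that collects IDs and acts once the pipeline drains."""
    def __init__(self, previous, report):
        self.previous = previous
        self.report = report
        self.seen = []

    def __iter__(self):
        for item in self.previous:
            self.seen.append(item["id"])
            yield item
        # Reached only after the upstream sections have no more items,
        # so every item is known by this point.
        self.report.append("finished with %d items" % len(self.seen))

report = []
section = PostProcessing(iter([{"id": "a"}, {"id": "b"}, {"id": "c"}]), report)
consumed = []
for item in section:
    # Nothing has been reported while items are still flowing.
    assert report == []
    consumed.append(item["id"])

print(report)
```

This is the hook that makes whole-site work such as link rewriting possible: by the time the post-loop code runs, every article and image has already been created.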
Set up ID Map
class PostCreation(object):
    classProvides(ISectionBlueprint)
    i...
Find Image Links
class ImageTagsFinder(object):
    classProvides(ISectionBlueprint)
    implemen...
Find Other Tags
class LinkFinder...
Replace Found Links
class LinkReplacer(object):
    """ re-write links in body texts of all...
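The replacement step is truncated above, so the following is only a sketch of the general technique: look up each `href` in a map of old to new locations, rewrite the ones that are found, and remember the ones that are not. The regular expression, the function name `rewrite_links`, and the URL formats are assumptions for illustration; real article bodies would need more forgiving parsing.

```python
import re

# Deliberately simple pattern: double-quoted href attributes only.
HREF_RE = re.compile(r'href="([^"]+)"')

def rewrite_links(body, id_map):
    """Replace each href found in id_map; return the new body and the misses."""
    missing = []

    def swap(match):
        old = match.group(1)
        new = id_map.get(old)
        if new is None:
            missing.append(old)
            return match.group(0)  # leave unknown links untouched
        return 'href="%s"' % new

    return HREF_RE.sub(swap, body), missing

# Invented old and new URLs.
id_map = {"/web/guest/article?id=42": "/articles/plone-rocks"}
body = ('<a href="/web/guest/article?id=42">one</a> '
        '<a href="/web/guest/article?id=99">two</a>')
new_body, missing = rewrite_links(body, id_map)
print(new_body)
print(missing)
```

Returning the misses alongside the rewritten body keeps the rewrite itself simple and leaves reporting to the caller.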
Victory! Photo by Petr & Bara Ruzicka - CC-BY http://...
Victory! ...
please?
Photo of moo http://instagr.am/p/SSMBw/
Link Formats
how many can you imagine?
we had them all
~2,000 Articles
5-10 Links per Article
at least
you do the math
How to Fin...
Transmogrifier To The Rescue!!!
Python To The Rescue!!!
CSV Reports
Which links worked?
Which links didn’t?
class LinkReplacer(object):
    """ re-write links in body texts of all created items
    """
    classProvides(ISectionBlueprin...
class LinkReplacer(object):
    """ re-write links in body texts of all created items
    """
    def rewrite_image_ta...
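A report of which links worked and which did not needs nothing beyond the standard `csv` module. The sketch below assumes each result is an `(article, original link, replacement or None)` tuple; that tuple layout, the column names, and the sample URLs are invented for illustration and are not taken from the talk's code.

```python
import csv
import io

def write_link_report(results, stream):
    """Write one CSV row per link: article, original href, status, new href."""
    writer = csv.writer(stream)
    writer.writerow(["article", "original", "status", "replacement"])
    for article, original, replacement in results:
        status = "ok" if replacement else "missing"
        writer.writerow([article, original, status, replacement or ""])

results = [
    ("/articles/a", "/web/guest/article?id=42", "/articles/plone-rocks"),
    ("/articles/a", "/web/guest/article?id=99", None),
]
buffer = io.StringIO()  # a real run would write to a file instead
write_link_report(results, buffer)
report_lines = buffer.getvalue().splitlines()
print(report_lines)
```

Sorting or filtering such a file on the status column shows which link formats still fail, which is what makes repeated runs of the migration productive.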
Iteration FTW!
from > 2000 bad links
to < 75 bad links
fix those that remain by hand
Victory! Photo by Petr & Bara Ruzicka - CC-BY http://...
So What Did We Learn?
Clients, when asked to describe their existing system, will never describe it with enough accuracy to...
Learn as much as you can about the source system when planning a migration
but know that you...
Users, if given more than one way to do things, will use all the ways
Have a plan, but be prepared to adjust when reality hits. Plans are best when treated as jumping-off...
Estimate Migrations HIGH
Photo by neilspicys - CC-BY http://www.flickr.com/photos/neilspicys/2349770710/
Check p.com/de      out       mossixfeetu
Transmogrifier: Beyond the Magic Box


Transmogrifier is a fantastic tool for moving content from one website to another. Simple, flexible and powerful, it makes the difficult tasks of migration easy and the impossible possible. But there's more to migrating with transmogrifier than just learning the tool. The everyday task of managing content can lead to complex problems. You need a plan. In this talk, we'll look at a real-world example of the migration of a large, content-heavy website from Liferay to Plone. We'll talk about where the hidden traps were found, the tools we used to get past them, and the knowledge that would have helped us avoid them in the first place.

  • Transcript of "Transmogrifier: Beyond the Magic Box"

    1. Transmogrification: Beyond the Magic Box. Cris Ewing, PLONE CONFERENCE, 2011
    2. Migration
    3. It’s Easy!
    4. Photo by Christopher Michel - CC-BY http://www.flickr.com/photos/cmichel67/4172613951/
    5. A Good Plan. Photo by Steve Jurvetson - CC-BY http://www.flickr.com/photos/jurvetson/21470089/
    6. Good Tools. Photo by Kimmo Palosaari - CC-BY http://www.flickr.com/photos/kimmo-quva/4630630775/
    7. It doesn’t have to end like this. Photo by Mike Nelson - CC-BY http://www.flickr.com/photos/mike_nelson/4720252548/
    8. Our Story...
    9. heroes and villains. Photo by digital_ramapge via http://www.flickr.com/photos/digital_ramapge/6118703544/
    10. A Difficult Journey. Photo by krayker - CC-BY http://www.flickr.com/photos/krayker/2274246797/
    11. Obstacles to Overcome. Photo by Joe Marinaro - CC-BY http://www.flickr.com/photos/m500/5782771006/
    12. Victory. Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr.com/photos/pruzicka/207209564/
    13. The Plan
    14. Liferay
    15. Proprietary Open Source
    16. Image by Patrick Hoesly - CC-BY http://www.flickr.com/photos/zooboing/5566075309/
    17. ‘lacking in documentation explaining how it actually works’
    18. data spelunking. Photo by wjhunter - CC-BY http://www.flickr.com/photos/wjhunter/3581151063/
    19. DB Schema: Clear
        mysql> show tables;
        +--------------------------------+
        | Tables_in_lportal              |
        +--------------------------------+
    20. DB Schema: Clear
        | Users_Permissions              |
        | Users_Roles                    |
        | Users_UserGroups               |
        | Vocabulary                     |
        | WOL_MeetupsEntry               |
        | WOL_MeetupsRegistration        |
        | WOL_SVNRepository              |
        | WOL_SVNRevision                |
        | WOL_WallEntry                  |
        | WSRPConfiguredProducer         |
        | WSRPConsumerRegistration       |
        | WSRPPortlet                    |
        | WSRPProducer                   |
        | WebDAVProps                    |
        | Website                        |
        | WikiNode                       |
        | WikiPage                       |
        | WikiPageResource               |
        | guestUserInfo                  |
        +--------------------------------+
        155 rows in set (0.01 sec)
    21. DB Schema: Simple
        mysql> describe JournalArticle;
        +--------------------+--------------+------+-----+---------+-------+
        | Field              | Type         | Null | Key | Default | Extra |
        +--------------------+--------------+------+-----+---------+-------+
        | uuid_              | varchar(75)  | YES  | MUL | NULL    |       |
        | id_                | bigint(20)   | NO   | PRI | NULL    |       |
        | resourcePrimKey    | bigint(20)   | YES  | MUL | NULL    |       |
        | groupId            | bigint(20)   | YES  | MUL | NULL    |       |
        | companyId          | bigint(20)   | YES  | MUL | NULL    |       |
        | userId             | bigint(20)   | YES  |     | NULL    |       |
        | userName           | varchar(75)  | YES  |     | NULL    |       |
        | createDate         | datetime     | YES  |     | NULL    |       |
        | modifiedDate       | datetime     | YES  |     | NULL    |       |
        | articleId          | varchar(75)  | YES  |     | NULL    |       |
        | version            | double       | YES  |     | NULL    |       |
        | title              | varchar(100) | YES  |     | NULL    |       |
    22. DB Schema: Simple
        | Field              | Type         | Null | Key | Default | Extra |
        +--------------------+--------------+------+-----+---------+-------+
        | uuid_              | varchar(75)  | YES  | MUL | NULL    |       |
        | id_                | bigint(20)   | NO   | PRI | NULL    |       |
        | resourcePrimKey    | bigint(20)   | YES  | MUL | NULL    |       |
        | groupId            | bigint(20)   | YES  | MUL | NULL    |       |
        | companyId          | bigint(20)   | YES  | MUL | NULL    |       |
        | userId             | bigint(20)   | YES  |     | NULL    |       |
        | userName           | varchar(75)  | YES  |     | NULL    |       |
        | createDate         | datetime     | YES  |     | NULL    |       |
        | modifiedDate       | datetime     | YES  |     | NULL    |       |
        | articleId          | varchar(75)  | YES  |     | NULL    |       |
        | version            | double       | YES  |     | NULL    |       |
        | title              | varchar(100) | YES  |     | NULL    |       |
        | description        | longtext     | YES  |     | NULL    |       |
        | content            | longtext     | YES  |     | NULL    |       |
        | type_              | varchar(75)  | YES  |     | NULL    |       |
        | structureId        | varchar(75)  | YES  |     | NULL    |       |
        | templateId         | varchar(75)  | YES  |     | NULL    |       |
        | displayDate        | datetime     | YES  |     | NULL    |       |
        | approved           | tinyint(4)   | YES  |     | NULL    |       |
        | approvedByUserId   | bigint(20)   | YES  |     | NULL    |       |
        | approvedByUserName | varchar(75)  | YES  |     | NULL    |       |
        | approvedDate       | datetime     | YES  |     | NULL    |       |
        | expired            | tinyint(4)   | YES  |     | NULL    |       |
        | expirationDate     | datetime     | YES  |     | NULL    |       |
        | reviewDate         | datetime     | YES  |     | NULL    |       |
        | indexable          | tinyint(4)   | YES  |     | NULL    |       |
        | smallImage         | tinyint(4)   | YES  |     | NULL    |       |
        | smallImageId       | bigint(20)   | YES  | MUL | NULL    |       |
        | smallImageURL      | varchar(75)  | YES  |     | NULL    |       |
        | urlTitle           | varchar(150) | YES  |     | NULL    |       |
        +--------------------+--------------+------+-----+---------+-------+
    23. DB Schema: Easy to Understand
        SELECT
          COALESCE(gui.guestName, mbm.userName) AS userName,
          mbm.createDate,
          mbm.modifiedDate,
          mbm.subject,
          mbm.body
        FROM MBMessage mbm
        JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
        LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
        WHERE mbd.classPK=%d
        ORDER BY modifiedDate
    24. DB Schema: Easy to Understand
        SELECT te.name
        FROM TagsEntry te
        WHERE te.entryId in
          (SELECT tate.entryId
           FROM TagsAssets_TagsEntries tate
           WHERE tate.assetId in
             (SELECT ta.assetId
              FROM TagsAsset ta
              WHERE ta.classPK=%d))
        AND te.vocabularyid = 41473
    25. DB Schema: Easy to Understand </sarcasm>
    26. Simple Goal (relatively)
    27. All Articles: ~2000
    28. All Images: ~5000
    29. All Comments: ~1000
    30. Preserve Links
    31. Preserve Dates
    32. Preserve Authorship
    33. Only a Few Tables
    34. 1 for Articles
        +--------------------+--------------+------+-----+---------+-------+
        | Field              | Type         | Null | Key | Default | Extra |
        +--------------------+--------------+------+-----+---------+-------+
        | uuid_              | varchar(75)  | YES  | MUL | NULL    |       |
        | id_                | bigint(20)   | NO   | PRI | NULL    |       |
        | resourcePrimKey    | bigint(20)   | YES  | MUL | NULL    |       |
        | groupId            | bigint(20)   | YES  | MUL | NULL    |       |
        | companyId          | bigint(20)   | YES  | MUL | NULL    |       |
        | userId             | bigint(20)   | YES  |     | NULL    |       |
        | userName           | varchar(75)  | YES  |     | NULL    |       |
        | createDate         | datetime     | YES  |     | NULL    |       |
        | modifiedDate       | datetime     | YES  |     | NULL    |       |
        | articleId          | varchar(75)  | YES  |     | NULL    |       |
        | version            | double       | YES  |     | NULL    |       |
        | title              | varchar(100) | YES  |     | NULL    |       |
        | description        | longtext     | YES  |     | NULL    |       |
        | content            | longtext     | YES  |     | NULL    |       |
        | type_              | varchar(75)  | YES  |     | NULL    |       |
        | structureId        | varchar(75)  | YES  |     | NULL    |       |
        | templateId         | varchar(75)  | YES  |     | NULL    |       |
        | displayDate        | datetime     | YES  |     | NULL    |       |
        | approved           | tinyint(4)   | YES  |     | NULL    |       |
        | approvedByUserId   | bigint(20)   | YES  |     | NULL    |       |
        | approvedByUserName | varchar(75)  | YES  |     | NULL    |       |
        | approvedDate       | datetime     | YES  |     | NULL    |       |
        | expired            | tinyint(4)   | YES  |     | NULL    |       |
        | expirationDate     | datetime     | YES  |     | NULL    |       |
        | reviewDate         | datetime     | YES  |     | NULL    |       |
        | indexable          | tinyint(4)   | YES  |     | NULL    |       |
        | smallImage         | tinyint(4)   | YES  |     | NULL    |       |
        | smallImageId       | bigint(20)   | YES  | MUL | NULL    |       |
        | smallImageURL      | varchar(75)  | YES  |     | NULL    |       |
        | urlTitle           | varchar(150) | YES  |     | NULL    |       |
        +--------------------+--------------+------+-----+---------+-------+
    35. 3 for Comments
        +--------------+------------+------+-----+---------+-------+
        | Field        | Type       | Null | Key | Default | Extra |
        +--------------+------------+------+-----+---------+-------+
        | discussionId | bigint(20) | NO   | PRI | NULL    |       |
        | classNameId  | bigint(20) | YES  | MUL | NULL    |       |
        | classPK      | bigint(20) | YES  |     | NULL    |       |
        | threadId     | bigint(20) | YES  | UNI | NULL    |       |
        +--------------+------------+------+-----+---------+-------+
    36. 3 for Comments
        +--------------+------------+------+-----+---------+-------+
        | Field        | Type       | Null | Key | Default | Extra |
        +--------------+------------+------+-----+---------+-------+
        | discussionId | bigint(20) | NO   | PRI | NULL    |       |
        | classNameId  | bigint(20) | YES  | MUL | NULL    |       |
        | classPK      | bigint(20) | YES  |     | NULL    |       |
        | threadId     | bigint(20) | YES  | UNI | NULL    |       |
        +--------------+------------+------+-----+---------+-------+

        +-----------------+-------------+------+-----+---------+-------+
        | Field           | Type        | Null | Key | Default | Extra |
        +-----------------+-------------+------+-----+---------+-------+
        | uuid_           | varchar(75) | YES  | MUL | NULL    |       |
        | messageId       | bigint(20)  | NO   | PRI | NULL    |       |
        | companyId       | bigint(20)  | YES  | MUL | NULL    |       |
        | userId          | bigint(20)  | YES  |     | NULL    |       |
        | userName        | varchar(75) | YES  |     | NULL    |       |
        | createDate      | datetime    | YES  |     | NULL    |       |
        | modifiedDate    | datetime    | YES  |     | NULL    |       |
        | categoryId      | bigint(20)  | YES  | MUL | NULL    |       |
        | threadId        | bigint(20)  | YES  | MUL | NULL    |       |
        | parentMessageId | bigint(20)  | YES  |     | NULL    |       |
        | subject         | varchar(75) | YES  |     | NULL    |       |
        | body            | longtext    | YES  |     | NULL    |       |
        | attachments     | tinyint(4)  | YES  |     | NULL    |       |
        | anonymous       | tinyint(4)  | YES  |     | NULL    |       |
        | groupId         | bigint(20)  | YES  | MUL | NULL    |       |
        | classNameId     | bigint(20)  | YES  | MUL | NULL    |       |
        | classPK         | bigint(20)  | YES  |     | NULL    |       |
        | priority        | double      | YES  |     | NULL    |       |
        +-----------------+-------------+------+-----+---------+-------+

        +------------------+------------+------+-----+---------+-------+
        | Field            | Type       | Null | Key | Default | Extra |
        +------------------+------------+------+-----+---------+-------+
        | threadId         | bigint(20) | NO   | PRI | NULL    |       |
        | categoryId       | bigint(20) | YES  | MUL | NULL    |       |
        | rootMessageId    | bigint(20) | YES  |     | NULL    |       |
        | messageCount     | int(11)    | YES  |     | NULL    |       |
        | viewCount        | int(11)    | YES  |     | NULL    |       |
        | lastPostByUserId | bigint(20) | YES  |     | NULL    |       |
        | lastPostDate     | datetime   | YES  |     | NULL    |       |
        | priority         | double     | YES  |     | NULL    |       |
        | groupId          | bigint(20) | YES  | MUL | NULL    |       |
        +------------------+------------+------+-----+---------+-------+
    37. 3 for Images
        +----------------+-------------+------+-----+---------+-------+
        | Field          | Type        | Null | Key | Default | Extra |
        +----------------+-------------+------+-----+---------+-------+
        | uuid_          | varchar(75) | YES  | MUL | NULL    |       |
        | folderId       | bigint(20)  | NO   | PRI | NULL    |       |
        | groupId        | bigint(20)  | YES  | MUL | NULL    |       |
        | companyId      | bigint(20)  | YES  | MUL | NULL    |       |
        | userId         | bigint(20)  | YES  |     | NULL    |       |
        | createDate     | datetime    | YES  |     | NULL    |       |
        | modifiedDate   | datetime    | YES  |     | NULL    |       |
        | parentFolderId | bigint(20)  | YES  |     | NULL    |       |
        | name           | varchar(75) | YES  |     | NULL    |       |
        | description    | longtext    | YES  |     | NULL    |       |
        +----------------+-------------+------+-----+---------+-------+
    38. 3 for Images
        +----------------+-------------+------+-----+---------+-------+
        | Field          | Type        | Null | Key | Default | Extra |
        +----------------+-------------+------+-----+---------+-------+
        | uuid_          | varchar(75) | YES  | MUL | NULL    |       |
        | folderId       | bigint(20)  | NO   | PRI | NULL    |       |
        | groupId        | bigint(20)  | YES  | MUL | NULL    |       |
        | companyId      | bigint(20)  | YES  | MUL | NULL    |       |
        | userId         | bigint(20)  | YES  |     | NULL    |       |
        | createDate     | datetime    | YES  |     | NULL    |       |
        | modifiedDate   | datetime    | YES  |     | NULL    |       |
        | parentFolderId | bigint(20)  | YES  |     | NULL    |       |
        | name           | varchar(75) | YES  |     | NULL    |       |
        | description    | longtext    | YES  |     | NULL    |       |
        +----------------+-------------+------+-----+---------+-------+

        +----------------+-------------+------+-----+---------+-------+
        | Field          | Type        | Null | Key | Default | Extra |
        +----------------+-------------+------+-----+---------+-------+
        | uuid_          | varchar(75) | YES  | MUL | NULL    |       |
        | imageId        | bigint(20)  | NO   | PRI | NULL    |       |
        | companyId      | bigint(20)  | YES  |     | NULL    |       |
        | userId         | bigint(20)  | YES  |     | NULL    |       |
        | createDate     | datetime    | YES  |     | NULL    |       |
        | modifiedDate   | datetime    | YES  |     | NULL    |       |
        | folderId       | bigint(20)  | YES  | MUL | NULL    |       |
        | name           | varchar(75) | YES  |     | NULL    |       |
        | description    | longtext    | YES  |     | NULL    |       |
        | smallImageId   | bigint(20)  | YES  | MUL | NULL    |       |
        | largeImageId   | bigint(20)  | YES  | MUL | NULL    |       |
        | custom1ImageId | bigint(20)  | YES  | MUL | NULL    |       |
        | custom2ImageId | bigint(20)  | YES  | MUL | NULL    |       |
        | groupId        | bigint(20)  | YES  | MUL | NULL    |       |
        +----------------+-------------+------+-----+---------+-------+

        +--------------+-------------+------+-----+---------+-------+
        | Field        | Type        | Null | Key | Default | Extra |
        +--------------+-------------+------+-----+---------+-------+
        | imageId      | bigint(20)  | NO   | PRI | NULL    |       |
        | modifiedDate | datetime    | YES  |     | NULL    |       |
        | text_        | longtext    | YES  |     | NULL    |       |
        | type_        | varchar(75) | YES  |     | NULL    |       |
        | height       | int(11)     | YES  |     | NULL    |       |
        | width        | int(11)     | YES  |     | NULL    |       |
        | size_        | int(11)     | YES  | MUL | NULL    |       |
        +--------------+-------------+------+-----+---------+-------+
    39. What’s in an Article?
    40.
    41. (1) Image at top (2) Author name (3) Links in body (4) Images in body (5) Footnotes (6) Comments
    42. And the Data?
    43. +--------------------+--------------+------+-----+---------+-------+
        | Field              | Type         | Null | Key | Default | Extra |
        +--------------------+--------------+------+-----+---------+-------+
        | uuid_              | varchar(75)  | YES  | MUL | NULL    |       |
        | id_                | bigint(20)   | NO   | PRI | NULL    |       |
        | resourcePrimKey    | bigint(20)   | YES  | MUL | NULL    |       |
        | groupId            | bigint(20)   | YES  | MUL | NULL    |       |
        | companyId          | bigint(20)   | YES  | MUL | NULL    |       |
        | userId             | bigint(20)   | YES  |     | NULL    |       |
        | userName           | varchar(75)  | YES  |     | NULL    |       |
        | createDate         | datetime     | YES  |     | NULL    |       |
        | modifiedDate       | datetime     | YES  |     | NULL    |       |
        | articleId          | varchar(75)  | YES  |     | NULL    |       |
        | version            | double       | YES  |     | NULL    |       |
        | title              | varchar(100) | YES  |     | NULL    |       |
        | description        | longtext     | YES  |     | NULL    |       |
        | content            | longtext     | YES  |     | NULL    |       |
        | type_              | varchar(75)  | YES  |     | NULL    |       |
        | structureId        | varchar(75)  | YES  |     | NULL    |       |
        | templateId         | varchar(75)  | YES  |     | NULL    |       |
        | displayDate        | datetime     | YES  |     | NULL    |       |
        | approved           | tinyint(4)   | YES  |     | NULL    |       |
        | approvedByUserId   | bigint(20)   | YES  |     | NULL    |       |
        | approvedByUserName | varchar(75)  | YES  |     | NULL    |       |
        | approvedDate       | datetime     | YES  |     | NULL    |       |
        | expired            | tinyint(4)   | YES  |     | NULL    |       |
        | expirationDate     | datetime     | YES  |     | NULL    |       |
        | reviewDate         | datetime     | YES  |     | NULL    |       |
        | indexable          | tinyint(4)   | YES  |     | NULL    |       |
        | smallImage         | tinyint(4)   | YES  |     | NULL    |       |
        | smallImageId       | bigint(20)   | YES  | MUL | NULL    |       |
        | smallImageURL      | varchar(75)  | YES  |     | NULL    |       |
        | urlTitle           | varchar(150) | YES  |     | NULL    |       |
        +--------------------+--------------+------+-----+---------+-------+
    44. Same table as 43, labeled: Author
    45. Same table as 43, labeled: Author, Body Text (links)
    46. Same table as 43, labeled: Author, Body Text (links), Top Image
    47. 47. +--------------------+--------------+------+-----+---------+-------+ Plone Conference 2011 Author| Field | Type | Null | Key | Default | Extra |+--------------------+--------------+------+-----+---------+-------+| uuid_ | varchar(75) | YES | MUL | NULL | || id_ | bigint(20) | NO | PRI | NULL | || resourcePrimKey | bigint(20) | YES | MUL | NULL | || groupId | bigint(20) | YES | MUL | NULL | || companyId | bigint(20) | YES | MUL | NULL | || userId | bigint(20) | YES | | NULL | || userName | varchar(75) | YES | | NULL | || createDate | datetime | YES | | NULL | || modifiedDate | datetime | YES | | NULL | || articleId| version| title | varchar(75) | YES | | double | YES | | varchar(100) | YES | | NULL | | NULL | | NULL | | | | Body Text| description| content| type_ | longtext | longtext | YES | | YES | | varchar(75) | YES | | NULL | | NULL | | NULL | | | | (links)| structureId | varchar(75) | YES | | NULL | || templateId | varchar(75) | YES | | NULL | || displayDate | datetime | YES | | NULL | || approved | tinyint(4) | YES | | NULL | || approvedByUserId | bigint(20) | YES | | NULL | | Dates| approvedByUserName | varchar(75) | YES | | NULL | || approvedDate | datetime | YES | | NULL | || expired | tinyint(4) | YES | | NULL | || expirationDate | datetime | YES | | NULL | || reviewDate | datetime | YES | | NULL | || indexable | tinyint(4) | YES | | NULL | || smallImage | tinyint(4) | YES | | NULL | || smallImageId | bigint(20) | YES | MUL | NULL | | Top Image| smallImageURL | varchar(75) | YES | | NULL | || urlTitle | varchar(150) | YES | | NULL | |+--------------------+--------------+------+-----+---------+-------+
    48. Join Image Scales
        SELECT IGImage.imageId, IGImage.createDate, IGImage.modifiedDate,
               IGImage.uuid_, IGImage.userId, IGImage.name, IGImage.description,
               IGImage.smallImageId, IGImage.largeImageId,
               IGImage.custom1ImageId, IGImage.custom2ImageId,
               IGImage.folderId, Image.type_
        FROM IGImage
        JOIN Image ON IGImage.largeImageId = Image.imageId
    49. Get Article Comments
        SELECT COALESCE(gui.guestName, mbm.userName) AS userName,
               mbm.createDate, mbm.modifiedDate, mbm.subject, mbm.body
        FROM MBMessage mbm
        JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
        LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
        WHERE mbd.classPK=%d
        ORDER BY modifiedDate
    50. Alright, Good to Go!
    51. The Toolbox
    52. Transmogrifier
    53. Transmogrifier: “It’s a series of tubes”
    57-62. Not Too Opinionated
        • A migration deals with moving pieces of content from one place to another
        • A piece of content comes from somewhere
        • A piece of content ends up somewhere
        • You should be able to do what you want to a piece of content between point A and point B
    63-64. A Nice Set of Tools
    65. The Migration
    66-68. Two Main Types
        • Images
        • Articles
    69-70. Two Pipelines: one for each main content category
    71-77. Two Pipelines
        • Extract content from SQL
        • Update text field encodings
        • Transform dates to Python
        • Calculate final Plone location
        • Create Plone object
        • Post-process (publication, etc.)
    78. A simplification, but you get the idea
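The steps listed above map naturally onto a chain of Python generators, each one consuming the items produced by the one before it. The following is only a minimal sketch under stated assumptions: items are plain dicts, an in-memory list stands in for the SQL source, and every function name, field choice, and the `/articles` folder are hypothetical. It does not use the collective.transmogrifier API and is not the code from the talk.

```python
from datetime import datetime

# Hypothetical stand-in for rows returned by a SQL query.
ROWS = [
    {"title": b"Caf\xc3\xa9 opening", "createDate": "2011-01-28 21:29:12",
     "urlTitle": "cafe-opening"},
    {"title": b"Second story", "createDate": "2011-02-01 09:00:00",
     "urlTitle": "second-story"},
]


def source(rows):
    """Extract: yield one item (a dict) per row."""
    for row in rows:
        yield dict(row)


def decode_text(previous, keys=("title",), encoding="utf-8"):
    """Update text field encodings: turn bytes into text."""
    for item in previous:
        for key in keys:
            if isinstance(item.get(key), bytes):
                item[key] = item[key].decode(encoding)
        yield item


def parse_dates(previous, keys=("createDate",)):
    """Transform date strings into Python datetime objects."""
    for item in previous:
        for key in keys:
            if key in item:
                item[key] = datetime.strptime(item[key], "%Y-%m-%d %H:%M:%S")
        yield item


def set_path(previous, folder="/articles"):
    """Calculate the final location (hypothetical path scheme)."""
    for item in previous:
        item["_path"] = "%s/%s" % (folder, item["urlTitle"])
        yield item


def run_pipeline(rows):
    """Chain the sections; nothing runs until the result is iterated."""
    return list(set_path(parse_dates(decode_text(source(rows)))))


items = run_pipeline(ROWS)
```

Object creation and post-processing are left out because they need a running Plone site; in this sketch they would simply be two more generators at the end of the chain.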
    79. Can You Spot the Flaw? Photo by pollyann - CC-BY-NC-ND http://www.flickr.com/photos/pollyann/4299826600/
    80. How Do We Match Them?
        Plone ID != Original SQL ID
    81. How Do We Match Them?
        JournalArticle.smallImageId != Image.smallImageID
    82. How Do We Match Them?
        JournalArticle.smallImageId != JournalArticle.smallImageURL
        | 449868 | /image/image_gallery?img_id=449858&t=1296250152541 |
        | 449894 | /image/image_gallery?img_id=449888&t=1296250703087 |
        | 450604 | /image/image_gallery?img_id=450582&t=1296515550638 |
        | 450845 | /image/image_gallery?img_id=450835&t=1296599014396 |
        | 450917 | /image/image_gallery?img_id=450907&t=1296615851690 |
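The mismatch in the rows above is easy to check mechanically once the `img_id` query parameter is pulled out of each URL. The talk names a helper called `extract_img_id_from_url` but does not show its body, so the version below is a hypothetical sketch written with the Python 3 standard library (the talk's code was Python 2); its behavior for malformed URLs is an assumption.

```python
from urllib.parse import parse_qs, urlparse


def extract_img_id_from_url(url):
    """Return the integer img_id query parameter, or None if it is
    missing or not a number. Hypothetical implementation."""
    values = parse_qs(urlparse(url).query).get("img_id")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None


# (smallImageId, smallImageURL) pairs copied from the slide.
rows = [
    (449868, "/image/image_gallery?img_id=449858&t=1296250152541"),
    (449894, "/image/image_gallery?img_id=449888&t=1296250703087"),
]

# Pair each stored id with the id embedded in its URL.
pairs = [(small_id, extract_img_id_from_url(url)) for small_id, url in rows]
```

For both sample rows the id embedded in the URL differs from the stored `smallImageId`, which is why neither column can be used alone as the lookup key.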
    83. Photo by hobvias sudoneighm - CC-BY http://www.flickr.com/photos/striatic/2192192956/
    84. Transmogrifier To The Rescue!!!
    85. Two Features
    86. Splitter Section: run content down different pipelines in one transmogrifier
    87. Annotations: store data on the transmogrifier to communicate between pipeline sections
    88. How Does This Help?
    89. Three Facts
        1. Pipeline sections are generators
        class MySection(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                [setup]

            def __iter__(self):
                ...
                for item in self.previous:
                    [do some stuff]
                    ...
                    yield item
                [clean up]
    90. Three Facts
        2. SQL sections process items 1 query at a time
        for query in self.queries:
            result = self.connection.execute(query)
            for row in result:
                yield dict((x[0].encode('utf-8'), x[1]) for x in row.items())
    91. Three Facts
        3. Pipelines process one item at a time
        class SectionOne(object):
            def __iter__(self):
                ...
                for item in self.previous:
                    [do some stuff]
                    ...
                    yield item

        class SectionTwo(object):
            def __iter__(self):
                ...
                for item in self.previous:
                    [do some stuff]
                    ...
                    yield item
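The third fact is easiest to see by tracing a tiny chain of generators. This is a plain-Python sketch, not transmogrifier code; the `section` and `trace` helpers and the item names are made up for illustration.

```python
def section(name, previous, log):
    """A stand-in pipeline section that records each item it handles."""
    for item in previous:
        log.append("%s handles %s" % (name, item))
        yield item


def trace(items):
    """Run two chained sections and return the order of events."""
    log = []
    pipeline = section("two", section("one", iter(items), log), log)
    for _ in pipeline:
        pass
    return log


log = trace(["image-1", "image-2"])
```

The resulting log alternates between the two sections: the first item travels the whole chain before the second item is even read from the source.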
    92-93. Using Annotations
    94-95. Using Annotations: Image Information
    96-97.
        class IGImageIDMapper(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                self.transmogrifier = transmogrifier
                ...
                annotations = IAnnotations(self.transmogrifier)
                if IMAGE_MAPS_KEY in annotations:
                    self.image_maps = annotations[IMAGE_MAPS_KEY]
                else:
                    annotations[IMAGE_MAPS_KEY] = self.image_maps = {
                        'img_uuid': {}, 'sm_img_id': {}, 'lg_img_id': {},
                        'c1_img_id': {}, 'c2_img_id': {}}

            def __iter__(self):
                for item in self.previous:
                    if item['_type'] != 'Folder':
                        path = item.get('_path', None)
                        current = self.transmogrifier.context.unrestrictedTraverse(path)
                        if current:
                            cuid = current.UID()
                            info = {'uid': cuid, 'path': path}
                            self.image_maps['img_uuid'][item['uuid_']] = info
                            self.image_maps['lg_img_id'][item['largeImageId']] = info
                            self.image_maps['sm_img_id'][item['smallImageId']] = info
                            self.image_maps['c1_img_id'][item['custom1ImageId']] = info
                            self.image_maps['c2_img_id'][item['custom2ImageId']] = info
                    yield item
    98-100. Using Annotations: Image Information
    101. Using Annotations
    102.
        class GetArticleImage(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __iter__(self):
                annotations = IAnnotations(self.transmogrifier)
                self.image_maps = annotations[IMAGE_MAPS_KEY]
                for item in self.previous:
                    if item['smallImage'] != 0:
                        img_id = None
                        try:
                            url = item['smallImageURL']
                            img_id = extract_img_id_from_url(url)
                            if img_id is None:
                                img_id = extract_img_uuid_from_url(url)
                        except KeyError:
                            img_id = item['smallImageId']
                        if img_id is not None:
                            img_info = find_image_from_id(img_id, self.image_maps)
                            ...
                    yield item
    103. Victory! Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr.com/photos/pruzicka/207209564/
    104. Image by justinshearer - CC-BY-NC-SA http://www.flickr.com/photos/justinshearer/3675295127/
    105-106. What About Links In Articles To Other Articles?
    107-109. Images finish before Articles start (diagram label: Image Information)
    112-113. Article 1 Identifiers Stored (stored so far: Article 1 ID)
    114-116. Article 2 Links Processed (stored so far: Article 1 ID)
        Link to Article 1 ✓
        Link to Article 3 ✗
    117. Sad Panda. Photo by Luis Markovic - CC-BY http://www.flickr.com/photos/_lulu/3265194525/
    118. Transmogrifier To The Rescue!!!
    119-121. Three Facts... One Fact, Really
        1. Pipeline sections are generators
        class MySection(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                [setup]

            def __iter__(self):
                ...
                for item in self.previous:
                    [do some stuff]
                    ...
                    yield item
                [clean up]
    122-123. After all items are gone, cleanup code is run
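The point about cleanup code can be sketched in plain Python: one generator records identifiers as items pass, and a later generator waits until its loop is exhausted before resolving links, so a link to an article that appears later in the stream can still be matched. All names here (`record_ids`, `resolve_links`, the `id`, `path`, and `links` keys) are hypothetical, and a plain dict stands in for transmogrifier annotations.

```python
def record_ids(previous, doc_map):
    """Store each item's identifier as it passes through."""
    for item in previous:
        doc_map[item["id"]] = item["path"]
        yield item


def resolve_links(previous, doc_map, report):
    """Pass items through untouched; resolve links after the loop ends."""
    seen = []
    for item in previous:
        seen.append(item)
        yield item
    # Clean-up stage: by now every item has been through record_ids.
    for item in seen:
        for target in item["links"]:
            report.append((item["id"], target, doc_map.get(target)))


def run(items):
    """Drive the two-section chain to exhaustion and return the report."""
    doc_map, report = {}, []
    for _ in resolve_links(record_ids(iter(items), doc_map), doc_map, report):
        pass
    return report


ARTICLES = [
    {"id": "a1", "path": "/news/a1", "links": []},
    {"id": "a2", "path": "/news/a2", "links": ["a1", "a3"]},
    {"id": "a3", "path": "/news/a3", "links": []},
]
report = run(ARTICLES)
```

Article `a2` links forward to `a3`; resolving that link inside the loop would fail, while resolving it in the clean-up stage succeeds because `a3` has been recorded by then.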
    124-125. Set up ID Map
        class PostCreation(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                self.transmogrifier = transmogrifier
                ...
                if DOC_MAPS_KEY in annotations.keys():
                    self.doc_maps = annotations[DOC_MAPS_KEY]
                else:
                    annotations[DOC_MAPS_KEY] = self.doc_maps = {
                        'resourcePrimKey': {}, 'urlTitle': {}, 'articleId': {},
                        'uuid_': {}, '_path': {}}

            def __iter__(self):
                site = self.transmogrifier.context
                for item in self.previous:
                    path = item.get('_path', None)
                    if path:
                        # start from None so a failed traversal is skipped cleanly
                        current = None
                        try:
                            current = site.unrestrictedTraverse(path)
                        except KeyError:
                            # missing element in path somewhere, skip it?
                            pass
                        if current:
                            cuid = current.UID()
                            for key in ['resourcePrimKey', 'urlTitle', 'articleId',
                                        'uuid_', '_path']:
                                if key in item:
                                    self.doc_maps[key][item[key]] = cuid
                    yield item
    126-127. Find Image Links
        class ImageTagsFinder(object):
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                self.transmogrifier = transmogrifier
                ...
                annotations = IAnnotations(transmogrifier)
                if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                    annotations[REWRITABLE_ELEMENTS_KEY] = {}
                self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

            def __iter__(self):
                num_found = 0
                for item in self.previous:
                    path = item['_path']
                    if tree is None:
                        parser = etree.HTMLParser()
                        tree = etree.fromstring(item['text'], parser)
                    if tree is not None:
                        all_images = tree.xpath('//img')
                        if len(all_images) > 0:
                            # we have some anchors, do any need re-writing?
                            internal = []
                            for img in all_images:
                                src = img.attrib.get('src', '')
                                match = img_is_internal(src)
                                if match:
                                    internal.append(img)
                            mapped['images'] = internal
                            self.rewriteable[path] = internal
                    yield item
    128-129. Find Other Tags
        class LinkFinder(object):
            """ create a mapping of the items which have links to be modified
            """
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                self.transmogrifier = transmogrifier
                annotations = IAnnotations(transmogrifier)
                if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                    annotations[REWRITABLE_ELEMENTS_KEY] = {}
                self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

            def __iter__(self):
                """ save etrees of any documents with links that need fixing
                """
                for item in self.previous:
                    path = item['_path']
                    tree = etree.fromstring(item['text'], parser)
                    if tree is not None:
                        all_anchors = tree.xpath('//a')
                        if len(all_anchors) > 0:
                            # we have some anchors, do any need re-writing?
                            internal = []
                            for a in all_anchors:
                                href = a.attrib.get('href', '')
                                if is_internal_link(href):
                                    internal.append(href)
                            self.rewriteable[path] = internal
                    yield item
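The anchor-collecting step can be tried outside Plone with only the standard library. The sketch below swaps in `html.parser` for the lxml `etree` used on the slides, purely so that it runs without third-party packages. The talk does not show `is_internal_link`, so the rule used here (anything that is not absolute, `mailto:`, or a fragment counts as internal) and the sample URLs are assumptions.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect the href of every <a> tag in a document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.hrefs.append(dict(attrs).get("href") or "")


def is_internal_link(href):
    """Hypothetical rule: non-empty relative URLs count as internal."""
    external = ("http://", "https://", "mailto:", "#")
    return bool(href) and not href.startswith(external)


def find_internal_links(html):
    """Return the internal hrefs found in an HTML body."""
    collector = AnchorCollector()
    collector.feed(html)
    return [href for href in collector.hrefs if is_internal_link(href)]


body = (
    '<p><a href="/web/guest/story?id=42">a</a> '
    '<a href="http://example.org/">b</a> '
    '<a name="top">c</a></p>'
)
links = find_internal_links(body)
```

Only the first anchor is kept: the second is absolute and the third has no `href` at all.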
    130-131. Replace Found Links
        class LinkReplacer(object):
            """ re-write links in body texts of all created items

            The work done by this item takes place entirely in the clean-up
            stage of the section.
            """
            classProvides(ISectionBlueprint)
            implements(ISection)

            def __init__(self, transmogrifier, name, options, previous):
                self.transmogrifier = transmogrifier
                annotations = IAnnotations(transmogrifier)
                if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                    annotations[REWRITABLE_ELEMENTS_KEY] = {}
                self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

            def __iter__(self):
                for item in self.previous:
                    # no action takes place here
                    yield item

                # get the maps we will use
                annotations = IAnnotations(self.transmogrifier)
                doc_maps = annotations[DOC_MAPS_KEY]
                image_maps = annotations[IMAGE_MAPS_KEY]

                # rewrite image and anchor tags
                for path, info in self.rewriteable.items():
                    page = self.transmogrifier.context.unrestrictedTraverse(path)
                    tree = info.get('tree', None)
                    links = info.get('links', [])
                    images = info.get('images', [])
                    lg, ln, le = rewrite_links(links, page, doc_maps,
                                               self.logger, self.transmogrifier.context)
                    ig, _in, ie = rewrite_image_tags(images, page, image_maps,
                                                     self.logger)
    132. Victory! Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr.com/photos/pruzicka/207209564/
    133. Victory! Right?
    134-135. please?
    136. Photo of moo http://instagr.am/p/SSMBw/
    137-139. Link Formats: how many can you imagine? We had them all.
    140. ~2,000 Articles
    141-143. 5-10 Links per Article, at least. You do the math.
    144. How to Find the Bad Ones? Photo by Alessandra Oddi - CC-BY http://www.flickr.com/photos/uvafragola/4834037874/
    145. Transmogrifier To The Rescue!!!
    146. Python To The Rescue!!!
    147-149. CSV Reports
        • Which links worked?
        • Which links didn’t?
    150.
        class LinkReplacer(object):
            """ re-write links in body texts of all created items
            """
            classProvides(ISectionBlueprint)
            implements(ISection)
            ...
            def __iter__(self):
                good, notenough, errors = [], [], []
                for item in self.previous:
                    yield item
                for path, info in self.rewriteable.items():
                    page = self.transmogrifier.context.unrestrictedTraverse(path)
                    tree = info.get('tree', None)
                    links = info.get('links', [])
                    images = info.get('images', [])
                    lg, ln, le = rewrite_links(links, page, doc_maps,
                                               self.logger, self.transmogrifier.context)
                    ig, _in, ie = rewrite_image_tags(images, page, image_maps,
                                                     self.logger)
                    good.extend(lg + ig)
                    notenough.extend(ln + _in)
                    errors.extend(le + ie)
                with open('goodlinks.csv', 'w') as f:
                    goodwriter = csv.writer(f)
                    goodwriter.writerows(good)
                with open('badlinks.csv', 'w') as f:
                    badwriter = csv.writer(f)
                    badwriter.writerows(notenough)
                ...
    151.
        def rewrite_image_tags(images, page, img_maps, logger):
            good = []
            notenough = []
            errors = []
            for image in images:
                url = image.attrib.get('src', '')
                # get information about the image to be subbed, either by id or uuid
                img_id = extract_img_id_from_url(url)
                if img_id is None:
                    img_id = extract_img_uuid_from_url(url)
                if img_id is None:
                    # we could not find a mapped image matching, not enough to go on
                    logger.warn('unable to find image id in url: %s' % url)
                    notenough.append(('missing img', 'bad url', page.absolute_url(), url))
                    continue
                # resolve the id we found into a plone object UID via image maps
                img_info = find_image_from_id(img_id, img_maps)
                if img_info is None:
                    logger.warn('unable to find mapped plone image id %s' % img_id)
                    notenough.append(('missing img', 'no map', page.absolute_url(), url))
                    continue
                # by default, use the 300x300 px medium size for in-page images
                # To change this, adjust the value of STANDARD_IMG_SCALE
                newurl = "resolveuid/%s%s" % (img_info['uid'], STANDARD_IMG_SCALE)
                image.attrib['src'] = newurl
                good.append(('image match', page.absolute_url(), url))
            return good, notenough, errors
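The report-writing part of the code above can be exercised without Plone. This sketch writes `notenough`-style tuples with the standard `csv` module into an in-memory buffer and tallies them by failure reason, which is one way to decide what to fix before the next run. The header row, the `write_report` and `count_by_reason` helpers, and the sample rows are assumptions added for illustration; the slides write the tuples without a header.

```python
import csv
import io


def write_report(rows, stream):
    """Write report tuples as CSV, preceded by a (hypothetical) header row."""
    writer = csv.writer(stream, lineterminator="\n")
    writer.writerow(("kind", "reason", "page", "url"))
    writer.writerows(rows)


def count_by_reason(rows):
    """Tally rows by their second column to see which failure dominates."""
    counts = {}
    for _kind, reason, _page, _url in rows:
        counts[reason] = counts.get(reason, 0) + 1
    return counts


# Made-up sample rows in the 4-tuple shape used for `notenough` above.
notenough = [
    ("missing img", "bad url", "http://example.org/news/a1",
     "/image/image_gallery?foo=1"),
    ("missing img", "no map", "http://example.org/news/a2",
     "/image/image_gallery?img_id=449858"),
]

buffer = io.StringIO()
write_report(notenough, buffer)
report_lines = buffer.getvalue().splitlines()
counts = count_by_reason(notenough)
```

Swapping the `StringIO` buffer for an open file gives the on-disk report, and re-running after each fix shows whether the bad-link count is going down.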
    152-155. Iteration FTW!
        • from > 2000 bad links
        • to < 75 bad links
        • fix those that remain by hand
    156. Victory! Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr.com/photos/pruzicka/207209564/
    157. So What Did We Learn?
    158. Clients, when asked to describe their existing system, will never describe it with enough accuracy to properly plan for a migration
    159-161. Learn as much as you can about the source system when planning a migration, but know that you will always need to know more, and that you will not find it out until you actually start the migration
    162. Users, if given more than one way to do things, will use all the ways
    163. Have a plan, but be prepared to adjust when reality hits. Plans are best when treated as jumping-off points.
    164. Estimate Migrations HIGH
    165. Photo by neilspicys - CC-BY http://www.flickr.com/photos/neilspicys/2349770710/
    166. Check out sixfeetup.com/demos
