Transmogrifier: Beyond the Magic Box

Transmogrifier is a fantastic tool for moving content from one website to another. Simple, flexible and powerful, it makes the difficult tasks of migration easy and the impossible possible. But there's more to migrating with transmogrifier than just learning the tool. The everyday task of managing content can lead to complex problems. You need a plan. In this talk, we'll look at a real-world example of the migration of a large, content-heavy website from Liferay to Plone. We'll talk about where the hidden traps were found, the tools we used to get past them, and the knowledge that would have helped us avoid them in the first place.


Presentation Transcript

  • Transmogrification: Beyond the Magic Box. Cris Ewing, Plone Conference 2011
  • Migration
  • It’s Easy!
  • Photo by Christopher Michel - CC-BY http://www.flickr.com/photos/cmichel67/4172613951/
  • A Good Plan. Photo by Steve Jurvetson - CC-BY http://www.flickr.com/photos/jurvetson/21470089/
  • Good Tools. Photo by Kimmo Palosaari - CC-BY http://www.flickr.com/photos/kimmo-quva/4630630775/
  • It doesn’t have to end like this. Photo by Mike Nelson - CC-BY http://www.flickr.com/photos/mike_nelson/4720252548/
  • Our Story...
  • Heroes and villains. Photo by digital_ramapge via http://www.flickr.com/photos/digital_ramapge/6118703544/
  • A Difficult Journey. Photo by krayker - CC-BY http://www.flickr.com/photos/krayker/2274246797/
  • Obstacles to Overcome. Photo by Joe Marinaro - CC-BY http://www.flickr.com/photos/m500/5782771006/
  • Victory. Photo by Petr & Bara Ruzicka - CC-BY http://www.flickr.com/photos/pruzicka/207209564/
  • The Plan
  • Liferay
  • Proprietary Open Source
  • Image by Patrick Hoesly - CC-BY http://www.flickr.com/photos/zooboing/5566075309/
  • ‘lacking in documentation explaining how it actually works’
  • Data spelunking. Photo by wjhunter - CC-BY http://www.flickr.com/photos/wjhunter/3581151063/
  • DB Schema: Clear
    mysql> show tables;
    +--------------------------------+
    | Tables_in_lportal              |
    +--------------------------------+
    | ...                            |
    | Users_Permissions              |
    | Users_Roles                    |
    | Users_UserGroups               |
    | Vocabulary                     |
    | WOL_MeetupsEntry               |
    | WOL_MeetupsRegistration        |
    | WOL_SVNRepository              |
    | WOL_SVNRevision                |
    | WOL_WallEntry                  |
    | WSRPConfiguredProducer         |
    | WSRPConsumerRegistration       |
    | WSRPPortlet                    |
    | WSRPProducer                   |
    | WebDAVProps                    |
    | Website                        |
    | WikiNode                       |
    | WikiPage                       |
    | WikiPageResource               |
    | guestUserInfo                  |
    +--------------------------------+
    155 rows in set (0.01 sec)
  • DB Schema: Simple
    mysql> describe JournalArticle;
    +--------------------+--------------+------+-----+---------+-------+
    | Field              | Type         | Null | Key | Default | Extra |
    +--------------------+--------------+------+-----+---------+-------+
    | uuid_              | varchar(75)  | YES  | MUL | NULL    |       |
    | id_                | bigint(20)   | NO   | PRI | NULL    |       |
    | resourcePrimKey    | bigint(20)   | YES  | MUL | NULL    |       |
    | groupId            | bigint(20)   | YES  | MUL | NULL    |       |
    | companyId          | bigint(20)   | YES  | MUL | NULL    |       |
    | userId             | bigint(20)   | YES  |     | NULL    |       |
    | userName           | varchar(75)  | YES  |     | NULL    |       |
    | createDate         | datetime     | YES  |     | NULL    |       |
    | modifiedDate       | datetime     | YES  |     | NULL    |       |
    | articleId          | varchar(75)  | YES  |     | NULL    |       |
    | version            | double       | YES  |     | NULL    |       |
    | title              | varchar(100) | YES  |     | NULL    |       |
    | description        | longtext     | YES  |     | NULL    |       |
    | content            | longtext     | YES  |     | NULL    |       |
    | type_              | varchar(75)  | YES  |     | NULL    |       |
    | structureId        | varchar(75)  | YES  |     | NULL    |       |
    | templateId         | varchar(75)  | YES  |     | NULL    |       |
    | displayDate        | datetime     | YES  |     | NULL    |       |
    | approved           | tinyint(4)   | YES  |     | NULL    |       |
    | approvedByUserId   | bigint(20)   | YES  |     | NULL    |       |
    | approvedByUserName | varchar(75)  | YES  |     | NULL    |       |
    | approvedDate       | datetime     | YES  |     | NULL    |       |
    | expired            | tinyint(4)   | YES  |     | NULL    |       |
    | expirationDate     | datetime     | YES  |     | NULL    |       |
    | reviewDate         | datetime     | YES  |     | NULL    |       |
    | indexable          | tinyint(4)   | YES  |     | NULL    |       |
    | smallImage         | tinyint(4)   | YES  |     | NULL    |       |
    | smallImageId       | bigint(20)   | YES  | MUL | NULL    |       |
    | smallImageURL      | varchar(75)  | YES  |     | NULL    |       |
    | urlTitle           | varchar(150) | YES  |     | NULL    |       |
    +--------------------+--------------+------+-----+---------+-------+
  • DB Schema: Easy to Understand
    SELECT COALESCE(gui.guestName, mbm.userName) AS userName,
           mbm.createDate, mbm.modifiedDate, mbm.subject, mbm.body
    FROM MBMessage mbm
    JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
    LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
    WHERE mbd.classPK=%d ORDER BY modifiedDate
  • DB Schema: Easy to Understand
    SELECT te.name
    FROM TagsEntry te
    WHERE te.entryId in
        (SELECT tate.entryId
         FROM TagsAssets_TagsEntries tate
         WHERE tate.assetId in
             (SELECT ta.assetId
              FROM TagsAsset ta
              WHERE ta.classPK=%d))
    AND te.vocabularyid = 41473
  • DB Schema: Easy to Understand </sarcasm>
  • Simple Goal (relatively)
  • All Articles: ~2000
  • All Images: ~5000
  • All Comments: ~1000
  • Preserve Links
  • Preserve Dates
  • Preserve Authorship
  • Only a Few Tables
  • 1 for Articles: the JournalArticle table, with the same 30 columns (uuid_ through urlTitle) listed on the DB Schema slide above
  • 3 for Comments
    +--------------+------------+------+-----+---------+-------+
    | Field        | Type       | Null | Key | Default | Extra |
    +--------------+------------+------+-----+---------+-------+
    | discussionId | bigint(20) | NO   | PRI | NULL    |       |
    | classNameId  | bigint(20) | YES  | MUL | NULL    |       |
    | classPK      | bigint(20) | YES  |     | NULL    |       |
    | threadId     | bigint(20) | YES  | UNI | NULL    |       |
    +--------------+------------+------+-----+---------+-------+

    +-----------------+-------------+------+-----+---------+-------+
    | Field           | Type        | Null | Key | Default | Extra |
    +-----------------+-------------+------+-----+---------+-------+
    | uuid_           | varchar(75) | YES  | MUL | NULL    |       |
    | messageId       | bigint(20)  | NO   | PRI | NULL    |       |
    | companyId       | bigint(20)  | YES  | MUL | NULL    |       |
    | userId          | bigint(20)  | YES  |     | NULL    |       |
    | userName        | varchar(75) | YES  |     | NULL    |       |
    | createDate      | datetime    | YES  |     | NULL    |       |
    | modifiedDate    | datetime    | YES  |     | NULL    |       |
    | categoryId      | bigint(20)  | YES  | MUL | NULL    |       |
    | threadId        | bigint(20)  | YES  | MUL | NULL    |       |
    | parentMessageId | bigint(20)  | YES  |     | NULL    |       |
    | subject         | varchar(75) | YES  |     | NULL    |       |
    | body            | longtext    | YES  |     | NULL    |       |
    | attachments     | tinyint(4)  | YES  |     | NULL    |       |
    | anonymous       | tinyint(4)  | YES  |     | NULL    |       |
    | groupId         | bigint(20)  | YES  | MUL | NULL    |       |
    | classNameId     | bigint(20)  | YES  | MUL | NULL    |       |
    | classPK         | bigint(20)  | YES  |     | NULL    |       |
    | priority        | double      | YES  |     | NULL    |       |
    +-----------------+-------------+------+-----+---------+-------+

    +------------------+------------+------+-----+---------+-------+
    | Field            | Type       | Null | Key | Default | Extra |
    +------------------+------------+------+-----+---------+-------+
    | threadId         | bigint(20) | NO   | PRI | NULL    |       |
    | categoryId       | bigint(20) | YES  | MUL | NULL    |       |
    | rootMessageId    | bigint(20) | YES  |     | NULL    |       |
    | messageCount     | int(11)    | YES  |     | NULL    |       |
    | viewCount        | int(11)    | YES  |     | NULL    |       |
    | lastPostByUserId | bigint(20) | YES  |     | NULL    |       |
    | lastPostDate     | datetime   | YES  |     | NULL    |       |
    | priority         | double     | YES  |     | NULL    |       |
    | groupId          | bigint(20) | YES  | MUL | NULL    |       |
    +------------------+------------+------+-----+---------+-------+
  • 3 for Images
    +----------------+-------------+------+-----+---------+-------+
    | Field          | Type        | Null | Key | Default | Extra |
    +----------------+-------------+------+-----+---------+-------+
    | uuid_          | varchar(75) | YES  | MUL | NULL    |       |
    | folderId       | bigint(20)  | NO   | PRI | NULL    |       |
    | groupId        | bigint(20)  | YES  | MUL | NULL    |       |
    | companyId      | bigint(20)  | YES  | MUL | NULL    |       |
    | userId         | bigint(20)  | YES  |     | NULL    |       |
    | createDate     | datetime    | YES  |     | NULL    |       |
    | modifiedDate   | datetime    | YES  |     | NULL    |       |
    | parentFolderId | bigint(20)  | YES  |     | NULL    |       |
    | name           | varchar(75) | YES  |     | NULL    |       |
    | description    | longtext    | YES  |     | NULL    |       |
    +----------------+-------------+------+-----+---------+-------+

    +----------------+-------------+------+-----+---------+-------+
    | Field          | Type        | Null | Key | Default | Extra |
    +----------------+-------------+------+-----+---------+-------+
    | uuid_          | varchar(75) | YES  | MUL | NULL    |       |
    | imageId        | bigint(20)  | NO   | PRI | NULL    |       |
    | companyId      | bigint(20)  | YES  |     | NULL    |       |
    | userId         | bigint(20)  | YES  |     | NULL    |       |
    | createDate     | datetime    | YES  |     | NULL    |       |
    | modifiedDate   | datetime    | YES  |     | NULL    |       |
    | folderId       | bigint(20)  | YES  | MUL | NULL    |       |
    | name           | varchar(75) | YES  |     | NULL    |       |
    | description    | longtext    | YES  |     | NULL    |       |
    | smallImageId   | bigint(20)  | YES  | MUL | NULL    |       |
    | largeImageId   | bigint(20)  | YES  | MUL | NULL    |       |
    | custom1ImageId | bigint(20)  | YES  | MUL | NULL    |       |
    | custom2ImageId | bigint(20)  | YES  | MUL | NULL    |       |
    | groupId        | bigint(20)  | YES  | MUL | NULL    |       |
    +----------------+-------------+------+-----+---------+-------+

    +--------------+-------------+------+-----+---------+-------+
    | Field        | Type        | Null | Key | Default | Extra |
    +--------------+-------------+------+-----+---------+-------+
    | imageId      | bigint(20)  | NO   | PRI | NULL    |       |
    | modifiedDate | datetime    | YES  |     | NULL    |       |
    | text_        | longtext    | YES  |     | NULL    |       |
    | type_        | varchar(75) | YES  |     | NULL    |       |
    | height       | int(11)     | YES  |     | NULL    |       |
    | width        | int(11)     | YES  |     | NULL    |       |
    | size_        | int(11)     | YES  | MUL | NULL    |       |
    +--------------+-------------+------+-----+---------+-------+
  • What’s in an Article?
  • 1. Image at top
    2. Author name
    3. Links in body
    4. Images in body
    5. Footnotes
    6. Comments
  • And the Data?
  • The JournalArticle column listing again (the same 30 columns, uuid_ through urlTitle, as on the DB Schema slide above), built up across five slides with these callouts added one at a time: Author, Body Text (links), Top Image, Dates
  • Join Image Scales
    SELECT IGImage.imageId, IGImage.createDate, IGImage.modifiedDate,
           IGImage.uuid_, IGImage.userId, IGImage.name, IGImage.description,
           IGImage.smallImageId, IGImage.largeImageId,
           IGImage.custom1ImageId, IGImage.custom2ImageId,
           IGImage.folderId, Image.type_
    FROM IGImage
    JOIN Image ON IGImage.largeImageId = Image.imageId
  • Get Article Comments
    SELECT COALESCE(gui.guestName, mbm.userName) AS userName,
           mbm.createDate, mbm.modifiedDate, mbm.subject, mbm.body
    FROM MBMessage mbm
    JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
    LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
    WHERE mbd.classPK=%d ORDER BY modifiedDate
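The comments query above relies on COALESCE over a LEFT JOIN so that a guest name, when one exists, wins over the stored user name. A minimal sketch of that behaviour follows, assuming an in-memory SQLite database with heavily simplified stand-in tables (the real source was a Liferay MySQL schema). The sample rows, the `build_demo_db` and `comments_for` helpers, and the use of a bound `?` parameter in place of the slide's `%d` formatting are all assumptions made for illustration.

```python
import sqlite3

# Trimmed version of the slide's query; "?" is SQLite's placeholder syntax.
COMMENTS_SQL = """
SELECT COALESCE(gui.guestName, mbm.userName) AS userName, mbm.body
FROM MBMessage mbm
JOIN MBDiscussion mbd ON mbd.threadId = mbm.threadId
LEFT JOIN guestUserInfo gui ON gui.messageId = mbm.messageId
WHERE mbd.classPK = ?
ORDER BY mbm.modifiedDate
"""


def build_demo_db():
    """Create simplified stand-in tables with two comments on one article."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE MBDiscussion (discussionId INTEGER, classPK INTEGER,
                                   threadId INTEGER);
        CREATE TABLE MBMessage (messageId INTEGER, threadId INTEGER,
                                userName TEXT, modifiedDate TEXT, body TEXT);
        CREATE TABLE guestUserInfo (messageId INTEGER, guestName TEXT);
        INSERT INTO MBDiscussion VALUES (1, 100, 10);
        INSERT INTO MBMessage VALUES (1, 10, 'staff', '2011-01-01', 'first');
        INSERT INTO MBMessage VALUES (2, 10, 'guest', '2011-01-02', 'second');
        INSERT INTO guestUserInfo VALUES (2, 'Visitor');
    """)
    return conn


def comments_for(conn, class_pk):
    """Return (userName, body) rows for the article with this classPK."""
    return conn.execute(COMMENTS_SQL, (class_pk,)).fetchall()
```

Calling `comments_for(build_demo_db(), 100)` returns the staff comment under its own user name and the guest comment under the guest's display name.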
  • Alright, Good to Go!
  • The Toolbox
  • Transmogrifier
  • Transmogrifier: “It’s a series of tubes”
  • Not Too Opinionated
    A migration deals with moving pieces of content from one place to another
    A piece of content comes from somewhere
    A piece of content ends up somewhere
    You should be able to do what you want to a piece of content between point A and point B
  • A Nice Set of Tools
  • The Migration
  • Two Main Types: Images, Articles
  • Two Pipelines: One for each main content category
  • Two Pipelines
    • Extract content from SQL
    • Update text field encodings
    • Transform Dates to Python
    • Calculate Final Plone Location
    • Create Plone Object
    • Post-process (publication, etc.)
  • A Simplification, but you get the idea
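The pipeline steps listed above can be sketched as chained Python generators. This is only an illustration of the shape of such a pipeline, not the code used in the migration: it uses plain functions rather than transmogrifier blueprints, it stops before any Plone object is created, and the field names, the `latin-1` encoding, the date format and the `/articles` root are all assumptions.

```python
from datetime import datetime


def source(rows):
    """Stand-in for the SQL source section: yield one dict per row."""
    for row in rows:
        yield dict(row)


def decode_text(previous, fields=("title",), encoding="latin-1"):
    """Decode byte strings to text (field names and encoding are assumed)."""
    for item in previous:
        for field in fields:
            value = item.get(field)
            if isinstance(value, bytes):
                item[field] = value.decode(encoding)
        yield item


def parse_dates(previous, fields=("createDate",), fmt="%Y-%m-%d %H:%M:%S"):
    """Turn date strings into Python datetime objects."""
    for item in previous:
        for field in fields:
            if isinstance(item.get(field), str):
                item[field] = datetime.strptime(item[field], fmt)
        yield item


def set_path(previous, root="/articles"):
    """Compute the target location from the URL title (hypothetical rule)."""
    for item in previous:
        item["_path"] = "%s/%s" % (root, item["urlTitle"])
        yield item


def run_pipeline(rows):
    """Chain the sections; each item flows through every step in turn."""
    return list(set_path(parse_dates(decode_text(source(rows)))))
```

Each function takes the previous section's output as its input, which is the same arrangement the section classes later in the talk use through `self.previous`.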
  • Can You Spot the Flaw? Photo by pollyann - CC-BY-NC-ND http://www.flickr.com/photos/pollyann/4299826600/
  • How Do We Match Them? Plone ID != Original SQL ID
  • How Do We Match Them? JournalArticle.smallImageId != Image.smallImageID
  • How Do We Match Them? JournalArticle.smallImageId != JournalArticle.smallImageURL
    | 449868 | /image/image_gallery?img_id=449858&t=1296250152541 |
    | 449894 | /image/image_gallery?img_id=449888&t=1296250703087 |
    | 450604 | /image/image_gallery?img_id=450582&t=1296515550638 |
    | 450845 | /image/image_gallery?img_id=450835&t=1296599014396 |
    | 450917 | /image/image_gallery?img_id=450907&t=1296615851690 |
  • Photo by hobvias sudoneighm - CC-BY http://www.flickr.com/photos/striatic/2192192956/
  • Transmogrifier To The Rescue!!!
  • Two Features
  • Splitter Section: run content down different pipelines in one transmogrifier
  • Annotations: store data on the transmogrifier to communicate between pipeline sections
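The idea of storing data on the transmogrifier so that sections can communicate can be sketched without any Zope machinery. In the sketch below a plain dict on a stand-in object plays the role of the annotations. `FakeTransmogrifier`, the key string and both section functions are hypothetical names chosen for illustration; the slides themselves use `IAnnotations(self.transmogrifier)`.

```python
IMAGE_MAPS_KEY = "example.image_maps"  # hypothetical annotation key


class FakeTransmogrifier:
    """Stand-in for the transmogrifier: only carries shared annotations."""

    def __init__(self):
        self.annotations = {}


def record_images(transmogrifier, previous):
    """Image pipeline section: remember where each source image ended up."""
    maps = transmogrifier.annotations.setdefault(IMAGE_MAPS_KEY, {})
    for item in previous:
        maps[item["imageId"]] = item["_path"]
        yield item


def attach_image(transmogrifier, previous):
    """Article pipeline section: look up the image recorded earlier."""
    maps = transmogrifier.annotations.get(IMAGE_MAPS_KEY, {})
    for item in previous:
        item["image_path"] = maps.get(item.get("smallImageId"))
        yield item
```

The lookup only works if the image section has consumed all of its items before the article section starts, which is the ordering concern the following slides turn to.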
  • How Does This Help?
  • Three Facts
    1. Pipeline sections are generators
    from zope.interface import classProvides, implements
    from collective.transmogrifier.interfaces import ISection, ISectionBlueprint

    class MySection(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            # [setup]
            ...

        def __iter__(self):
            ...
            for item in self.previous:
                # [do some stuff]
                ...
                yield item
            # [clean up]
  • Three Facts
    2. SQL sections process items 1 query at a time
    for query in self.queries:
        result = self.connection.execute(query)
        for row in result:
            yield dict((x[0].encode('utf-8'), x[1])
                       for x in row.items())
  • Three Facts
    3. Pipelines process one item at a time
    class SectionOne(object):
        def __iter__(self):
            ...
            for item in self.previous:
                # [do some stuff]
                ...
                yield item

    class SectionTwo(object):
        def __iter__(self):
            ...
            for item in self.previous:
                # [do some stuff]
                ...
                yield item
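The third fact, that a pipeline handles one item at a time, follows from how chained generators are evaluated. The small sketch below records the order of events to make that visible; the function names and the log messages are invented for this illustration and are not part of transmogrifier.

```python
def numbers(log):
    """Source section: yields two items, logging each one."""
    for n in (1, 2):
        log.append("source yields %d" % n)
        yield n


def double(previous, log):
    """Downstream section: handles each item as soon as it arrives."""
    for n in previous:
        log.append("double handles %d" % n)
        yield n * 2
    # Code after the loop runs only once the source is exhausted.
    log.append("double cleans up")


def run():
    """Drive the two chained sections and return results plus event log."""
    log = []
    results = list(double(numbers(log), log))
    return results, log
```

The log shows the first item passing through both sections before the source produces the second one, and the cleanup line appears last.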
  • Using Annotations
  • Using Annotations: Image Information
  • from zope.annotation.interfaces import IAnnotations

    class IGImageIDMapper(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            self.transmogrifier = transmogrifier
            ...
            # Share one set of ID maps through the transmogrifier annotations
            annotations = IAnnotations(self.transmogrifier)
            if IMAGE_MAPS_KEY in annotations:
                self.image_maps = annotations[IMAGE_MAPS_KEY]
            else:
                annotations[IMAGE_MAPS_KEY] = self.image_maps = {
                    'img_uuid': {}, 'sm_img_id': {}, 'lg_img_id': {},
                    'c1_img_id': {}, 'c2_img_id': {}}

        def __iter__(self):
            for item in self.previous:
                if item['_type'] != 'Folder':
                    path = item.get('_path', None)
                    current = self.transmogrifier.context.unrestrictedTraverse(path)
                    if current:
                        # Index the created object under every source identifier
                        cuid = current.UID()
                        info = {'uid': cuid, 'path': path}
                        self.image_maps['img_uuid'][item['uuid_']] = info
                        self.image_maps['lg_img_id'][item['largeImageId']] = info
                        self.image_maps['sm_img_id'][item['smallImageId']] = info
                        self.image_maps['c1_img_id'][item['custom1ImageId']] = info
                        self.image_maps['c2_img_id'][item['custom2ImageId']] = info
                yield item
  • Using Annotations: Image Information
  • Using Annotations
  • class GetArticleImage(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __iter__(self):
            annotations = IAnnotations(self.transmogrifier)
            self.image_maps = annotations[IMAGE_MAPS_KEY]
            for item in self.previous:
                if item['smallImage'] != 0:
                    img_id = None
                    try:
                        # Prefer the ID embedded in the URL over smallImageId
                        url = item['smallImageURL']
                        img_id = extract_img_id_from_url(url)
                        if img_id is None:
                            img_id = extract_img_uuid_from_url(url)
                    except KeyError:
                        img_id = item['smallImageId']
                    if img_id is not None:
                        img_info = find_image_from_id(img_id, self.image_maps)
                        ...
                yield item
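The slide calls `extract_img_id_from_url` and `find_image_from_id` without showing them. The versions below are guesses at what such helpers might look like, written only to make the lookup concrete: they assume URLs of the `/image/image_gallery?img_id=...` form shown earlier in the talk and an `image_maps` dict of dicts keyed by source ID. The real helpers may have behaved differently.

```python
from urllib.parse import parse_qs, urlparse


def extract_img_id_from_url(url):
    """Return the integer img_id query parameter, or None if absent.

    Hypothetical implementation; the original helper is not shown.
    """
    values = parse_qs(urlparse(url).query).get("img_id")
    if not values:
        return None
    try:
        return int(values[0])
    except ValueError:
        return None


def find_image_from_id(img_id, image_maps):
    """Search every ID map for img_id; return its info dict or None.

    Hypothetical implementation; the original helper is not shown.
    """
    for mapping in image_maps.values():
        if img_id in mapping:
            return mapping[img_id]
    return None
```

Searching all of the maps, rather than only the small-image map, reflects the earlier observation that the ID in `smallImageURL` does not match `smallImageId`.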
  • Plone Conference 2011 Victory!Photo by Petr & Bara Ruzicka - CC-BYhttp://www.flickr.com/photos/pruzicka/207209564/
  • Plone Conference 2011Image by justinshearer - CC-BY-NC-SAhttp://www.flickr.com/photos/justinshearer/3675295127/
  • What About Links In Articles To Other Articles?
  • Image Information: Images finish before Articles start
  • Article 1: identifiers stored (Article 1 ID)
  • Article 2: links processed. Link to Article 1 ✓. Link to Article 3 ✗ (Article 3's identifiers are not stored yet)
  • Sad Panda (Photo by Luis Markovic, CC-BY, http://www.flickr.com/photos/_lulu/3265194525/)
  • Transmogrifier To The Rescue!!!
  • Three Facts (One Fact, Really)
    1. Pipeline sections are generators
    class MySection(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            [setup]

        def __iter__(self):
            ...
            for item in self.previous:
                [do some stuff]
                ...
                yield item
            [clean up]
  • After all items are gone, cleanup code is run
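The ordering described above can be checked without Plone. This is a plain-Python sketch, not transmogrifier itself: the `Section` class, the `log` list, and the section names are made up for illustration, and only the chaining through `previous` mirrors the slides.

```python
log = []


class Section(object):
    """A toy pipeline section: set up in __init__, clean up after the loop."""

    def __init__(self, name, previous):
        self.name = name
        self.previous = previous
        log.append("%s: setup" % name)

    def __iter__(self):
        for item in self.previous:
            log.append("%s: item %s" % (self.name, item))
            yield item
        # Reached only once the upstream section has no more items.
        log.append("%s: clean up" % self.name)


# Chain two sections over a two-item source, then drain the pipeline.
pipeline = Section("second", Section("first", iter([1, 2])))
results = list(pipeline)
```

Each item travels through every section before the next item starts, and each section's clean-up code runs only after every item has passed through it.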
  • Set up ID Map
    class PostCreation(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            self.transmogrifier = transmogrifier
            ...
            if DOC_MAPS_KEY in annotations.keys():
                self.doc_maps = annotations[DOC_MAPS_KEY]
            else:
                annotations[DOC_MAPS_KEY] = self.doc_maps = {
                    'resourcePrimKey': {},
                    'urlTitle': {},
                    'articleId': {},
                    'uuid_': {},
                    '_path': {}}

        def __iter__(self):
            site = self.transmogrifier.context
            for item in self.previous:
                path = item.get('_path', None)
                if path:
                    # reset so a failed traversal cannot reuse the previous item's object
                    current = None
                    try:
                        current = site.unrestrictedTraverse(path)
                    except KeyError:
                        # missing element in path somewhere, skip it?
                        pass
                    if current:
                        cuid = current.UID()
                        for key in ['resourcePrimKey', 'urlTitle', 'articleId',
                                    'uuid_', '_path']:
                            if key in item:
                                self.doc_maps[key][item[key]] = cuid
                yield item
  • Find Image Links
    class ImageTagsFinder(object):
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            self.transmogrifier = transmogrifier
            ...
            annotations = IAnnotations(transmogrifier)
            if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                annotations[REWRITABLE_ELEMENTS_KEY] = {}
            self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

        def __iter__(self):
            num_found = 0
            for item in self.previous:
                path = item['_path']
                # `tree` and `mapped` are not defined in the code shown on the slide
                if tree is None:
                    parser = etree.HTMLParser()
                    tree = etree.fromstring(item['text'], parser)
                if tree is not None:
                    all_images = tree.xpath('//img')
                    if len(all_images) > 0:
                        # we have some images, do any need re-writing?
                        internal = []
                        for img in all_images:
                            src = img.attrib.get('src', '')
                            match = img_is_internal(src)
                            if match:
                                internal.append(img)
                        mapped['images'] = internal
                        self.rewriteable[path] = internal
                yield item
  • Find Other Tags
    class LinkFinder(object):
        """ create a mapping of the items which have links to be modified """
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            self.transmogrifier = transmogrifier
            annotations = IAnnotations(transmogrifier)
            if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                annotations[REWRITABLE_ELEMENTS_KEY] = {}
            self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

        def __iter__(self):
            """ save etrees of any documents with links that need fixing """
            # the slide uses `parser` without defining it; created here as in ImageTagsFinder
            parser = etree.HTMLParser()
            for item in self.previous:
                path = item['_path']
                tree = etree.fromstring(item['text'], parser)
                if tree is not None:
                    all_anchors = tree.xpath('//a')
                    if len(all_anchors) > 0:
                        # we have some anchors, do any need re-writing?
                        internal = []
                        for a in all_anchors:
                            href = a.attrib.get('href', '')
                            if is_internal_link(href):
                                internal.append(href)
                        self.rewriteable[path] = internal
                yield item
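`LinkFinder` relies on an `is_internal_link` helper that the slides do not show. One way such a check could be written with the standard library is sketched below; the host names in `INTERNAL_HOSTS` and the rules themselves are assumptions for illustration, not the rules used in the actual migration.

```python
from urllib.parse import urlparse

# Hypothetical host names for the old site; the real ones are not in the slides.
INTERNAL_HOSTS = {"www.example.org", "example.org"}


def is_internal_link(href):
    """Guess whether href points at content on the site being migrated."""
    if not href or href.startswith("#"):
        return False  # empty, or an anchor within the same page
    parsed = urlparse(href)
    if parsed.scheme in ("mailto", "javascript", "tel"):
        return False  # not a link to a page at all
    if not parsed.netloc:
        return True  # relative URL, so it stays on the old site
    return parsed.hostname in INTERNAL_HOSTS


checks = {
    "/web/guest/article-one": is_internal_link("/web/guest/article-one"),
    "http://www.example.org/news": is_internal_link("http://www.example.org/news"),
    "http://other.example.com/": is_internal_link("http://other.example.com/"),
    "mailto:someone@example.org": is_internal_link("mailto:someone@example.org"),
    "#top": is_internal_link("#top"),
}
```

A real source system will have more link formats than this, which is why the finder only collects candidates and leaves the rewriting to a later step.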
  • Replace Found Links
    class LinkReplacer(object):
        """ re-write links in body texts of all created items

        The work done by this item takes place entirely in the clean-up
        stage of the section.
        """
        classProvides(ISectionBlueprint)
        implements(ISection)

        def __init__(self, transmogrifier, name, options, previous):
            self.transmogrifier = transmogrifier
            annotations = IAnnotations(transmogrifier)
            if REWRITABLE_ELEMENTS_KEY not in annotations.keys():
                annotations[REWRITABLE_ELEMENTS_KEY] = {}
            self.rewriteable = annotations[REWRITABLE_ELEMENTS_KEY]

        def __iter__(self):
            for item in self.previous:
                # no action takes place here
                yield item

            # get the maps we will use
            annotations = IAnnotations(self.transmogrifier)
            doc_maps = annotations[DOC_MAPS_KEY]
            image_maps = annotations[IMAGE_MAPS_KEY]

            # rewrite image and anchor tags
            for path, info in self.rewriteable.items():
                page = self.transmogrifier.context.unrestrictedTraverse(path)
                tree = info.get('tree', None)
                links = info.get('links', [])
                images = info.get('images', [])
                lg, ln, le = rewrite_links(links, page, doc_maps,
                                           self.logger, self.transmogrifier.context)
                ig, _in, ie = rewrite_image_tags(images, page, image_maps,
                                                 self.logger)
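To see why deferring the rewrite to the clean-up stage fixes the "Link to Article 3" failure, here is a plain-Python sketch of the same two-section idea. A dictionary stands in for the transmogrifier annotations, and the class names, keys, and sample articles are all made up for illustration.

```python
class RecordIds(object):
    """Stores each article's old id -> new UID and remembers its links."""

    def __init__(self, annotations, previous):
        self.previous = previous
        self.doc_map = annotations.setdefault("doc_map", {})
        self.pending = annotations.setdefault("pending", {})

    def __iter__(self):
        for item in self.previous:
            self.doc_map[item["articleId"]] = item["uid"]
            self.pending[item["path"]] = item["links"]
            yield item


class RewriteLinks(object):
    """Does nothing per item; resolves every link in the clean-up stage."""

    def __init__(self, annotations, previous):
        self.previous = previous
        self.annotations = annotations
        self.rewritten = {}

    def __iter__(self):
        for item in self.previous:
            yield item
        # All items have passed, so every article id is now in the map.
        doc_map = self.annotations["doc_map"]
        for path, old_ids in self.annotations["pending"].items():
            self.rewritten[path] = [
                "resolveuid/%s" % doc_map[old_id] for old_id in old_ids
            ]


# Article 2 links forward to Article 3, which is processed after it.
articles = [
    {"articleId": "1", "uid": "uid-1", "path": "/a1", "links": []},
    {"articleId": "2", "uid": "uid-2", "path": "/a2", "links": ["1", "3"]},
    {"articleId": "3", "uid": "uid-3", "path": "/a3", "links": []},
]
annotations = {}
rewriter = RewriteLinks(annotations, RecordIds(annotations, iter(articles)))
for _ in rewriter:
    pass
```

Because the rewrite loop runs after the last item, the forward link from Article 2 to Article 3 resolves just like the backward link to Article 1.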
  • Victory! Right? (Photo by Petr & Bara Ruzicka, CC-BY, http://www.flickr.com/photos/pruzicka/207209564/)
  • please?
  • Photo of moo, http://instagr.am/p/SSMBw/
  • Link Formats: how many can you imagine? We had them all.
  • ~2,000 Articles
  • 5-10 Links per Article, at least. You do the math.
  • How to Find the Bad Ones? (Photo by Alessandra Oddi, CC-BY, http://www.flickr.com/photos/uvafragola/4834037874/)
  • Transmogrifier To The Rescue!!!
  • Python To The Rescue!!!
  • CSV Reports: Which links worked? Which links didn't?
  • class LinkReplacer(object):
        """ re-write links in body texts of all created items """
        classProvides(ISectionBlueprint)
        implements(ISection)
        ...
        def __iter__(self):
            good, notenough, errors = [], [], []
            for item in self.previous:
                yield item
            for path, info in self.rewriteable.items():
                page = self.transmogrifier.context.unrestrictedTraverse(path)
                tree = info.get('tree', None)
                links = info.get('links', [])
                images = info.get('images', [])
                lg, ln, le = rewrite_links(links, page, doc_maps,
                                           self.logger, self.transmogrifier.context)
                ig, _in, ie = rewrite_image_tags(images, page, image_maps,
                                                 self.logger)
                good.extend(lg + ig)
                notenough.extend(ln + _in)
                errors.extend(le + ie)
            with open('goodlinks.csv', 'w') as f:
                goodwriter = csv.writer(f)
                goodwriter.writerows(good)
            with open('badlinks.csv', 'w') as f:
                badwriter = csv.writer(f)
                badwriter.writerows(notenough)
            ...
  • def rewrite_image_tags(images, page, img_maps, logger):
        good = []
        notenough = []
        errors = []
        for image in images:
            url = image.attrib.get('src', '')
            # get information about the image to be subbed, either by id or uuid
            img_id = extract_img_id_from_url(url)
            if img_id is None:
                img_id = extract_img_uuid_from_url(url)
            if img_id is None:
                # we could not find a mapped image matching, not enough to go on
                logger.warn('unable to find image id in url: %s' % url)
                notenough.append(('missing img', 'bad url', page.absolute_url(), url))
                continue
            # resolve the id we found into a plone object UID via image maps
            img_info = find_image_from_id(img_id, img_maps)
            if img_info is None:
                logger.warn('unable to find mapped plone image id %s' % img_id)
                notenough.append(('missing img', 'no map', page.absolute_url(), url))
                continue
            # by default, use the 300x300 px medium size for in-page images
            # To change this, adjust the value of STANDARD_IMG_SCALE
            newurl = "resolveuid/%s%s" % (img_info['uid'], STANDARD_IMG_SCALE)
            image.attrib['src'] = newurl
            good.append(('image match', page.absolute_url(), url))
        return good, notenough, errors
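The report rows built above are plain tuples, so they are easy to summarize between migration runs. The sketch below writes rows with the standard `csv` module and tallies the failure reasons; the sample rows, the `write_report` name, and the use of an in-memory buffer instead of `badlinks.csv` are assumptions made to keep the example self-contained.

```python
import csv
import io
from collections import Counter


def write_report(rows, stream):
    """Write one CSV line per report row to an open text stream."""
    csv.writer(stream).writerows(rows)


# Made-up rows in the same shape rewrite_image_tags appends to `notenough`.
notenough = [
    ("missing img", "bad url", "http://site/a", "/image/broken"),
    ("missing img", "no map", "http://site/a", "/image?img_id=9"),
    ("missing img", "no map", "http://site/b", "/image?img_id=12"),
]

buf = io.StringIO()
write_report(notenough, buf)
report_lines = buf.getvalue().splitlines()

# Count failures by reason (second column) to decide what to fix next.
reasons = Counter(row[1] for row in notenough)
```

Sorting the reasons by count shows which link format to handle next, which is the loop that the iteration slides describe.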
  • Iteration FTW! From > 2000 bad links to < 75 bad links. Fix those that remain by hand.
  • Victory! (Photo by Petr & Bara Ruzicka, CC-BY, http://www.flickr.com/photos/pruzicka/207209564/)
  • So What Did We Learn?
  • Clients, when asked to describe their existing system, will never describe it with enough accuracy to properly plan for a migration
  • Learn as much as you can about the source system when planning a migration, but know that you will always need to know more, and that you will not find it out until you actually start the migration
  • Users, if given more than one way to do things, will use all the ways
  • Have a plan, but be prepared to adjust when reality hits. Plans are best when treated as jumping-off points.
  • Estimate Migrations HIGH
  • Photo by neilspicys, CC-BY, http://www.flickr.com/photos/neilspicys/2349770710/
  • Check out sixfeetup.com/demos