Content Migrations: Getting from A to B


Published in: Technology

  1. 1. • Author of Website Migration Handbook v2 • First large migration: World Bank (1,000+ subsites) • Consults to large and medium organizations • David guides complex website transformations.
  2. 2. Deane Barker • Working in content management since 1996 • Founding partner in Blend Interactive • Board member of Content Management Professionals
  3. 3. Planning vs. Technical • The planning process encompasses the entire scope of your migration effort • The technical process is just one very critical part of this process
  4. 4. Agenda • David will discuss the larger planning process – Break • Deane will follow with a discussion about the specific technical challenges – End at 4:00 p.m. – Deane and David will be available for discussion until 5:00 p.m.
  5. 5. Ask Questions
  6. 6. It’s painful. [The End]
  7. 7. Requirements for Transfer • You know – …what is being moved – …how it has to change on the way over – …how it fits back together on the other side
  8. 8. Agenda • Original Content vs. Derived Content • Content Geography • The Four Tasks of Content Transfer • Automated vs. Manual Import • The Automated Import Process • QA Automation
  9. 9. Some HTML has to be moved. Some HTML will be generated by your new system as content is imported.
  10. 10. Index Pages vs. Content Pages
  11. 11. Many pages on your new site are not rendered via content, but via development.
  12. 12. Before you begin transfer, make sure you know which pages are derived and you have made plans to generate those in the new system.
  13. 13. Content has different levels of “geography”. Some content is very specifically placed, while other content is automatically organized.
  14. 14. [Site tree diagram] Home; Products (Product A, Product B); About (History)
  15. 15. Press Release
  16. 16. Highly-geographical content is much harder to migrate. You have to migrate both the content and the placement.
  17. 17. Pop Quiz: Why are blogs so easy to migrate? No geography. Lots of derived index pages.
  18. 18. Hierarchical content requires you to determine and transfer structure
  19. 19. Stub Mapping [diagram] Existing: Home; Products (Product A, Product B); About (History). New: Home; Products (Product A, Product B); About (History).
  20. 20. The Path to Stub Mapping • “We need to codify the new website structure…” • “…let’s just store this in the new CMS…” • “…and let’s store the old URL, just for reference…” • “…and…can we just use that old URL to transfer the content?”
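The stub-mapping idea above can be sketched in a few lines of Python. This is only an illustration, not any particular CMS's API: the `Stub` class, the `old_url_index` helper, and every old URL shown are hypothetical. Each empty page ("stub") in the new structure remembers the old URL it will be filled from, and walking the tree yields a lookup from old URL to new location.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Stub:
    """An empty placeholder page in the new site tree (hypothetical model)."""
    title: str
    old_url: Optional[str] = None          # stored "just for reference"
    children: List["Stub"] = field(default_factory=list)


def old_url_index(root: Stub, path: str = "") -> Dict[str, str]:
    """Walk the new tree and map each stored old URL to the stub's new path."""
    here = f"{path}/{root.title}"
    index: Dict[str, str] = {}
    if root.old_url:
        index[root.old_url] = here
    for child in root.children:
        index.update(old_url_index(child, here))
    return index


# Illustrative tree matching the slide; the old URLs are made up.
site = Stub("Home", "/index.html", [
    Stub("Products", "/prod/default.asp", [
        Stub("Product A", "/prod/a.asp"),
        Stub("Product B", "/prod/b.asp"),
    ]),
    Stub("About", "/about.html", [
        Stub("History", "/about/history.html"),
    ]),
])

index = old_url_index(site)
print(index["/prod/a.asp"])
```

Once this index exists, the same old URL can drive the content transfer itself: fetch from the old address, import into the stub at the new path.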
  21. 21. The Four Tasks • Extract • Transform • Import • Normalize • We can generalize about the first two – Extract and transform are platform-agnostic
  22. 22. #1: Extract • Get content out of the existing system • Break content into its necessary components • Store in a neutral format – XML, usually
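As a rough sketch of the "store in a neutral format" step, the snippet below writes one extracted page to XML with Python's standard `xml.etree.ElementTree`. The element names (`content`, `title`, `body`), the `old-url` attribute, and the `to_neutral_xml` helper are assumptions for illustration; the deck does not prescribe a schema.

```python
import xml.etree.ElementTree as ET


def to_neutral_xml(page: dict) -> str:
    """Serialize one extracted page into a small, CMS-neutral XML record."""
    item = ET.Element("content", {"old-url": page["url"]})
    ET.SubElement(item, "title").text = page["title"]
    # The HTML body is stored as escaped text, so it survives a round trip.
    ET.SubElement(item, "body").text = page["body_html"]
    return ET.tostring(item, encoding="unicode")


# Hypothetical extracted page, already broken into its components.
xml_text = to_neutral_xml({
    "url": "/about/history.html",
    "title": "History",
    "body_html": "<p>Our history.</p>",
})
print(xml_text)
```

Keeping the old URL on every record pays off later, when links have to be resolved and redirects set up.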
  23. 23. Migrating out of a CMS is a lot easier than the alternative. A CMS enforces at least some consistency.
  24. 24. Are you going to extract from the repository level or the publication level?
  25. 25. Repository vs. Publication Extraction [diagram labels: HTML, Repository, Processing]
  26. 26. You may need to make changes to your old site to make extraction easier or more complete.
  27. 27. You do not have to wait for anything to do this. You can start extraction on the very day you decide to migrate your website.
  28. 28. #2: Transform • Modify extracted content • Fix legacy problems with the content • Adapt content to fit the new architecture • Neutralize idiosyncrasies in the content
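A transform step is usually a chain of small, testable fixes applied to each extracted record. The sketch below assumes two typical legacy problems (presentational `<font>` tags and inline `style` attributes); these are illustrative choices, not a list from the deck, and a real project would more likely use an HTML parser than regular expressions.

```python
import re


def transform(html: str) -> str:
    """Strip two hypothetical legacy idiosyncrasies from a body of HTML."""
    # Drop opening and closing <font> tags but keep the text they wrap.
    html = re.sub(r"</?font\b[^>]*>", "", html, flags=re.IGNORECASE)
    # Drop double-quoted inline style attributes.
    html = re.sub(r'\s+style="[^"]*"', "", html, flags=re.IGNORECASE)
    return html


cleaned = transform('<p style="color:red"><font face="Arial">Hi</font></p>')
print(cleaned)
```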
  29. 29. Content Transformation
  30. 30. Common Transformations
  31. 31. Common Transformations
  32. 32. #3: Import • Move post-transformed content from a neutral format into the new system • This is different for every CMS • This capability should be part of the evaluation process
  33. 33. #4: Normalize • Fix problems that are only “fixable” once content is in its new home • Ex: – Relationship reconstruction – URL resolution – Navigation reconstruction
  34. 34. Content relationships can introduce chicken-egg problems.
  35. 35. How will URLs change on the new platform? If your content is interlinked, how are you going to keep all those links valid?
  36. 36. Embedded URLs
  37. 37. Embedded URL Resolution • If you have embedded URLs, they are now broken. • How do you “re-connect” these URLs to the correct content? • Usually performed as some kind of batch job. – You rarely get 100% accuracy. – Prepare to catch the remainder in QA.
  38. 38. Always store the old URL for a migrated page of content.
  39. 39. How it Works • Iterate over every piece of content… • …then iterate over every single property looking for anything that might contain links… • …then iterate over all those links looking for the new content holding that old link… • …then correct the link.
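The loop described above might look roughly like this as a batch job. It is a minimal sketch under stated assumptions: content items are plain dicts with hypothetical `old_url` and `new_url` keys, every other string property may contain links, and links are found with a simple `href="..."` regular expression rather than a real HTML parser.

```python
import re

HREF = re.compile(r'href="([^"]+)"')


def resolve_links(items):
    """Rewrite embedded old URLs to new ones; return the links left unresolved."""
    # Lookup built from the old URL stored on every migrated item.
    new_by_old = {item["old_url"]: item["new_url"] for item in items}
    unresolved = []

    for item in items:                                   # every piece of content
        for name, value in list(item.items()):           # every property
            if name in ("old_url", "new_url") or not isinstance(value, str):
                continue

            def fix(match, item=item):
                old = match.group(1)
                if old in new_by_old:                    # found the new home
                    return f'href="{new_by_old[old]}"'
                unresolved.append((item["new_url"], old))  # leave for QA
                return match.group(0)

            item[name] = HREF.sub(fix, value)            # correct the links
    return unresolved


items = [
    {"old_url": "/a.asp", "new_url": "/products/a",
     "body": '<a href="/b.asp">B</a> <a href="/gone.asp">?</a>'},
    {"old_url": "/b.asp", "new_url": "/products/b",
     "body": '<a href="/a.asp">A</a>'},
]
leftover = resolve_links(items)
print(leftover)
```

The returned list is the "remainder" to catch in QA. In this sketch external links would also land there, so a real job would filter those out first.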
  40. 40. Once migrated, use the old URL to do a lookup in your 404 handler.
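A 404 handler that consults the stored old URLs can be sketched without reference to any web framework. The `handle_not_found` function, its return shape, and the sample mapping are hypothetical; the point is only that a miss on the new site becomes a permanent redirect whenever the requested path is a known old URL.

```python
def handle_not_found(requested_path, new_by_old):
    """Return (status, headers) for a request the new site could not route."""
    target = new_by_old.get(requested_path)
    if target is not None:
        # Known old URL: send browsers and search engines to the new home.
        return 301, {"Location": target}
    return 404, {}


# Illustrative lookup built from the old URLs stored during migration.
redirects = {"/prod/a.asp": "/products/product-a"}

print(handle_not_found("/prod/a.asp", redirects))
print(handle_not_found("/never-existed", redirects))
```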
  41. 41. If you can preserve binary file URLs, do so. Your new CMS will likely make this easier.
  42. 42. Depending on volume, menu reconstruction might be a manual process.
  43. 43. What is the actual mechanism of movement? Copy-and-paste? Automated?
  44. 44. When Copy-and-Paste Works • When you don’t have a lot of content • When you have access to cheap labor • When your content is highly geographic • When you cannot automate transformation • When you have enough resources for sufficient QA
  45. 45. When Automated Migration Works • When you have large volumes of content • When your content is not highly-geographic • When you have sufficient technology and/or development resources
  46. 46. You don’t have to use the same method for your entire project.
  47. 47. Automated Migration Tools • Great answer to the Transfer phase • Less of an answer to everything else • They still have to be configured and tested
  48. 48. The Promise: You will be able to develop a script that will reduce your migration to a button-click.
  49. 49. The Promise: You will run this script, need to do nothing else, then launch your new website.
  50. 50. The Value-Add • A scripting environment • Tested tools for: – Extraction – Transformation – Import (maybe…) • Professional services $$$$
  51. 51. Automated Migration Process • Develop automated migration script – Configure – Execute – Evaluate – (Repeat) • Accept a cycle “as good as is reasonable” • Perform necessary manual editing • Re-do changes during content freeze • Launch
  52. 52. Automated migrations are highly iterative. Configure-Execute-Evaluate
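The Configure-Execute-Evaluate cycle can be pictured as a loop that stops once the output is "as good as is reasonable". The sketch below is a toy model: the `migrate_until_reasonable` function, the three callbacks, and the defect threshold are all hypothetical stand-ins for whatever script, import run, and QA check a real project uses.

```python
def migrate_until_reasonable(configure, execute, evaluate,
                             acceptable_defects, max_cycles=20):
    """Repeat configure/execute/evaluate until few enough defects remain."""
    config = configure(None)                    # initial configuration
    for cycle in range(1, max_cycles + 1):
        output = execute(config)                # run the migration script
        defects = evaluate(output)              # QA the result
        if len(defects) <= acceptable_defects:  # "as good as is reasonable"
            return cycle, output, defects
        config = configure(defects)             # adjust and iterate again
    raise RuntimeError("no acceptable cycle within max_cycles")


# Toy stand-ins: each cycle adds one rule that fixes one broken page.
rules = []

def configure(defects):
    if defects:
        rules.append(defects[0])
    return list(rules)

def execute(config):
    return [page for page in ["a", "b", "c"] if page not in config]

def evaluate(output):
    return output

cycle, output, defects = migrate_until_reasonable(
    configure, execute, evaluate, acceptable_defects=1)
print(cycle, defects)
```

Whatever defects remain when the loop exits are the ones left for manual editing.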
  53. 53. Automated Migration Cycle [diagram] Configure, Execute, Evaluate, iterate again… until “As good as is reasonable…”, then Manual Editing, then Launch. Weeks? Months? Days? Minutes?
  54. 54. Once you accept the output of a migration cycle, you are in a content freeze
  55. 55. Handling a Content Freeze • Don’t change any content on the existing site • Track changes so they can be re-changed on the new site
  56. 56. Ideally, track the QA process inside the CMS itself.
  57. 57. • WEB • TWITTER @gadgetopia • EMAIL