Drupalcampchicago2010.rachel.datamigration.

Promet Source - Rachel Joaro - Drupal Camp Chicago presentation on Data Migration. Migrating 100,000 pages of content.

Published in: Technology
  • TODO: compare the normal SDLC with the migration SDLC
  • http://www.flickr.com/photos/14804582@N08/2111269218/

    1. Migrating 100,000 pages of content: From Legacy CMS to Drupal
       Rachel Jaro, Solutions Architect at Promet Source
       www.prometsource.com
    2. Overview
       We’ll talk about:
         • Successful migration recipe
         • Common questions you should be asking before you start
         • Top 3 tools to do migration in Drupal
         • Issues
             • Tools to use in URL rewriting
             • File management comparison in D6
         • Testing
         • Deploying solution
    3. Data Migration
       “Data migration solutions extract data from a source system, correct errors, reformat, restructure and load the data into a replacement target system”. It sounds simple, but poorly managed data migration is the most common cause of failure in implementing a replacement system.
       -- Gershon Pick, March 2001
    4. Successful Migration Recipe
    5. Planning
       Source: http://www.flickr.com/photos/bjornmeansbear/4380595283/
    6. Plan: What to Ask
       Node types (content separation, fields)
         • Do you want to separate content into pages, articles, biographies, news, etc.?
         • What fields are needed for each node?
         • Who can access it?
         • Do you really need that content type, or can we use taxonomies instead for similar content?
    7. Plan: What to Ask
       Taxonomy (categorization, tags)
         • Do you need to categorize nodes?
         • Would you need different access?
         • What kind of taxonomy groups or vocabularies would you need?
       Permissions (per node) and user roles
         • Who is going to use the site?
         • What, specifically, are their access rights?
    8. Plan: What to Ask
       New URL mapping
         • Do you need to make SEO-friendly URLs?
       Files, file permissions and file directory
         • Do you need an advanced file management or document management tool?
         • Do you need a simpler solution? How simple?
         • Do you need access rights for each folder?
         • Do you need a browser-type interface to access them?
         • What kind of files do you need to store? Images, PDFs?
    9. Build
    10. Requirements
        • Use CSV files to import data
        • Divide migration into groups or sections
        • Map and replace old URLs with SEO-friendly URLs
            • Before: 05-200.htm
    11. Data in CSV
        Example:
        December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||Report Spotlights Need for Reform in Jackpot Jurisdictions||||||||||/press/releases/2005/december/||||||||||05-200||||||||||{UUID}||||||||||Economics^^^^^^^^^^Economy||||||||||<p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p><p>Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. </p>$$$$$$$$$$
        Separator: ||||||||||
        End of Row: $$$$$$$$$$
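The export format on this slide can be read with a few lines of code. The following is a minimal Python sketch, not the presenter's script (the demo itself is not in the transcript). The column names in `COLUMNS` are hypothetical, guessed from the values in the sample row, and the run of ten carets is assumed to separate multiple taxonomy terms. The sample body is abbreviated.

```python
FIELD_SEP = "|" * 10   # field separator shown on the slide
ROW_END = "$" * 10     # end-of-row marker shown on the slide
VALUE_SEP = "^" * 10   # assumption: separates multiple terms in one field

# Hypothetical column names, guessed from the sample row's values.
COLUMNS = ["created", "published", "title", "path",
           "legacy_id", "uuid", "terms", "body"]


def parse_export(text):
    """Split a raw export into a list of dicts, one per row."""
    records = []
    for raw_row in text.split(ROW_END):
        if not raw_row.strip():
            continue  # skip the empty chunk after the last end-of-row marker
        values = [v.strip() for v in raw_row.split(FIELD_SEP)]
        record = dict(zip(COLUMNS, values))
        # Multi-valued field: split on the caret run and drop empty entries.
        record["terms"] = [t.strip()
                           for t in record.get("terms", "").split(VALUE_SEP)
                           if t.strip()]
        records.append(record)
    return records


sample = (
    "December 13, 2005 3:39:54 PM||||||||||December 13, 2005||||||||||"
    "Report Spotlights Need for Reform in Jackpot Jurisdictions||||||||||"
    "/press/releases/2005/december/||||||||||05-200||||||||||{UUID}||||||||||"
    "Economics^^^^^^^^^^Economy||||||||||<p>Lorem Ipsum ...</p>$$$$$$$$$$"
)
rows = parse_export(sample)
print(rows[0]["title"], rows[0]["terms"])
```

Long, unusual separators like these are presumably chosen so that they cannot collide with commas, quotes, or markup inside the HTML body, which is why a plain string split is enough here.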
    12. Content Type Division
        Example: CNN.com
        Divide migration sequences into US, World, Politics, Justice, etc.
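One way to divide a migration into sections is to batch records by the first segment of their path. This is a minimal Python sketch under that assumption. The record layout and the `section_of` and `group_by_section` helpers are hypothetical, and the paths are made-up examples in the spirit of the CNN.com illustration.

```python
from collections import defaultdict


def section_of(path):
    """Return the first path segment, e.g. '/us/2010/story' -> 'us'."""
    parts = [p for p in path.split("/") if p]
    return parts[0] if parts else "(root)"


def group_by_section(records):
    """Group records into batches keyed by their top-level section."""
    batches = defaultdict(list)
    for record in records:
        batches[section_of(record["path"])].append(record)
    return dict(batches)


# Made-up records for illustration only.
records = [
    {"title": "A", "path": "/us/2010/story-a"},
    {"title": "B", "path": "/world/2010/story-b"},
    {"title": "C", "path": "/us/2010/story-c"},
]
batches = group_by_section(records)
for section, items in batches.items():
    print(section, len(items))
```

Each batch can then be imported, tested, and deployed on its own schedule.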
    13. Solutions/Tools
        • TW and Migrate modules combo
        • node_import()
        • Drush + custom script
    14. TW & Migrate Module Combo
        • http://drupal.org/project/tw
            • Supports the Migrate module to run views of source data
        • http://drupal.org/project/migrate
            • A flexible framework for migrating content
    15. Migrate Module
        Features:
        • Users browse their legacy data using views
        • Support for creating Drupal nodes, users, and comments is included
        • Hooks permit migration of other types of content
        • Provides a dashboard for running mini migrations
        • Drush support
    16. Why I did not choose Migrate
        • Importing to MySQL was not an option; CSV files were used instead
        • Cannot map old URLs to new URLs
    17. node_import()
        http://drupal.org/project/node_import
        Features:
        • Easy to learn, point and click
        • Uses CSV to upload content
        • Can easily delete previously imported data
        • Can download errors when an import fails, for easy reference when fixing issues
    18. node_import() Problems
        • I can’t map old URLs to new URLs
        • No Drush support
        • It doesn’t save my old settings for a CSV
    19. Drush + Custom script
        • Flexibility: I can do whatever I want with the data
    20. Create your own migration script [demo]
    21. Issues
        • File Management
        • URL Rewriting
    22. File Management
        Client requirements:
        • Intuitive
        • Has WYSIWYG support
        • Access control: upload, edit, delete, revise files by different roles
        • Revision control: optional but good to have
        • Limited time!
    23. File Management Modules
        *DbFm was not included due to problems encountered during tests in D6
    24. URL Rewriting
        Source: http://www.flickr.com/photos/randomfactor/483264915/
    25. URL Rewriting Solution
        Not recommended: .htaccess
            • Too many URLs to handle
            • Too much server load
        Recommended: pathauto + path_redirect modules
            • Automated alias settings
            • 301 redirect, set global redirect
        Additional reference: http://acquia.com/blog/migrating-drupal-way-part-ii-saving-those-old-urls
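The idea behind the recommended approach is to generate an alias from each node's title and to record a 301 redirect from the legacy URL to that alias. The following minimal Python sketch illustrates only that mapping. It is not the pathauto or path_redirect code, and the alias pattern (legacy directory plus slugified title) is an assumption, since the transcript does not show the "after" URL. The `slugify` and `build_redirects` helpers and the record keys are hypothetical.

```python
import re


def slugify(title):
    """Lowercase, replace runs of non-alphanumerics with '-', trim dashes."""
    return re.sub(r"[^a-z0-9]+", "-", title.lower()).strip("-")


def build_redirects(records):
    """Map each legacy URL to its new alias, to be served as 301 redirects."""
    redirects = {}
    for r in records:
        old = f"{r['path']}{r['legacy_id']}.htm"
        new = f"{r['path']}{slugify(r['title'])}"  # assumed alias pattern
        redirects[old] = new
    return redirects


# Record built from the sample row in the CSV slide.
records = [{
    "path": "/press/releases/2005/december/",
    "legacy_id": "05-200",
    "title": "Report Spotlights Need for Reform in Jackpot Jurisdictions",
}]
redirects = build_redirects(records)
for old, new in redirects.items():
    print(f"301 {old} -> {new}")
```

Keeping the map as data rather than as .htaccess rules is in line with the slide's concern about handling a very large number of URLs at the web-server level.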
    26. URL Checker
        http://drupal.org/project/linkchecker
    27. Access Control Alternative
        • /default/files/PressReleases
        • /default/files/Documents
        • /default/files/International
            • /default/files/International/America
            • /default/files/International/England
            • /default/files/International/Asia
    28. Test, Test and did I say Test?
        Source: http://www.flickr.com/photos/paperpariah/2424107350/
    29. Common problems
        • Broken links
        • Misconfigured pages
        • Empty pages
        • Invalid dates
        • File not found or orphan pages
        • Page format
        • Test when CACHE is on
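Two of these problems, empty pages and invalid dates, can be caught before import by checking each parsed record. This is a minimal Python sketch, assuming records are dicts with hypothetical `body` and `published` keys and that dates look like "December 13, 2005" as in the CSV sample. The `find_problems` helper is illustrative, not part of any Drupal module.

```python
from datetime import datetime

DATE_FORMAT = "%B %d, %Y"  # matches dates such as "December 13, 2005"


def find_problems(record):
    """Return a list of problem labels for one record (empty if clean)."""
    problems = []
    if not record.get("body", "").strip():
        problems.append("empty page")
    try:
        datetime.strptime(record.get("published", ""), DATE_FORMAT)
    except ValueError:
        problems.append("invalid date")
    return problems


good = {"body": "<p>Some text</p>", "published": "December 13, 2005"}
bad = {"body": "   ", "published": "13/12/2005"}
print(find_problems(good))
print(find_problems(bad))
```

Broken links and missing files still need a crawl of the migrated site, for example with the linkchecker module mentioned earlier.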
    30. Deployment
    31. Deployment
        2 ways to deploy your data to the live environment:
        1. All at once
        2. Divide and conquer
    32. Deployment: Divide and Conquer
        Example: CNN Division
    33. Deployment Mockup
        * shadow box is your migrated data’s production box
        * old CMS is still active at this time
    34. Deployment
        • Coordination between the old CMS and Drupal
        • URL testing
    35. Deployment Mockup
        * shadow box is your migrated data’s production box
        * replacing old CMS with Drupal
    36. Deployment
        Pros
        • Less risk, less stress
        • Editors can do continuous data entry daily
        Cons
        • URL rewriting can be tricky
        • Updating the production box with new content can be an arduous task
    37. Deployment: Updating Production
        Automation
        • SVN
        • Drush scripts to migrate content from tester’s box to shadow box
        • Deploy: http://drupal.org/project/deploy
        Manual
        • Document configuration changes
        • Document database changes
    38. Recap
        • SDLC + Agile
        • Common questions you should be asking before you start
        • Top 3 tools to do migration in Drupal
            • TW & Migrate, node_import(), Drush
        • Issues
            • File management comparison in D6
            • Tools to use in URL rewriting
        • Testing
        • Deployment solution
    39. Questions?
    40. Resources
        • http://groups.drupal.org/content-migration-import-and-export
        • http://drupal.org/handbook/migrating
