Migrating without Migraines 
Oscar Merida 
October 21, 2014
Migrate without migranes

In this talk, we'll look at the tools and modules available for migrating content into Drupal. I'll describe the workflow I've used to prepare, transform, and import thousands of records into Drupal. I'll share strategies for cleaning up and parsing data and doing it in a reliable, repeatable manner. You'll learn how to efficiently use PHP, Feeds, and Feeds XPath Parser modules to handle almost any data source thrown your way.

Transcript of "Migrate without migranes"

  1. Migrating without Migraines
     Oscar Merida
     October 21, 2014
  2. Content is nomadic
     Musketeers.me
  3. Managing Content Migrations
     • Technical process features
     • Automated
     • Safe
     • Adaptable
     • See also “Hitch your Wagon”, 2013
     • http://phpa.me/hitch-your-wagon
  4. Automated
     Photo by pubsubhashis
  5. Safe
  6. Adaptable
  7. Migration Workflow
     Sources → XML Documents → CMS
  8. Built from many sources
     • One or more database tables
     • … or CSV, or XML, or…
     • Reference other content
     • Reference image, audio, & other files
     • Reference other systems
     • for example, YouTube for video
  9. Transform to XML
     • Represent fully built content and attributes
     • Clean up errors
     • Character encodings, HTML entities
     • Filenames & paths
     • Assign a unique identifier
  10. Import to Drupal
      • Use Feeds & Feeds XPath Parser modules
      • http://drupal.org/project/feeds_xpathparser
      • also JSON Parser, and others
      • UI to map XML to entity attributes & fields
      • UI for importing, deleting content
  11. What else do we need?
      • Access to source(s)
      • Command line PHP to transform it to XML
      • https://github.com/omerida/importtools
      • … patience.
  12. Importing Author Profiles
      • We need to import profiles for authors on our site.
      • Authors are not users, just a biographical profile (content type)
      • Data is provided via a CSV file
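The cleanup goals on this slide can be sketched in plain PHP. This is a minimal illustration, not code from the talk or from importtools: the helper names `cleanText` and `makeGuid` are hypothetical, and so is the assumption that legacy text which is not valid UTF-8 arrives as Windows-1252.

```php
<?php
// Hypothetical helpers for illustration; not part of importtools.

/**
 * Normalize a legacy string: convert it to UTF-8 if needed, then
 * decode HTML entities into real characters.
 */
function cleanText(string $value, string $sourceEncoding = 'CP1252'): string
{
    // preg_match with the /u flag fails on byte sequences that are not valid UTF-8.
    if (preg_match('//u', $value) !== 1) {
        $converted = iconv($sourceEncoding, 'UTF-8', $value);
        if ($converted !== false) {
            $value = $converted;
        }
    }

    // Turn &eacute;, &amp;, &#8217; and friends into characters.
    $value = html_entity_decode($value, ENT_QUOTES | ENT_HTML5, 'UTF-8');

    return trim($value);
}

/**
 * Build a stable identifier from a source name and its legacy key, so
 * re-running the transform yields the same ID for the same record.
 */
function makeGuid(string $source, string $legacyId): string
{
    return md5($source . ':' . $legacyId);
}
```

Deriving the identifier from the legacy key, rather than generating a random one, is one way to keep repeated imports updating the same records instead of creating duplicates.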
  13. Sample CSV data

        first name,last name,email,city,country,date_joined,company,bio,id,tags
        Whilemina,Benton,sollicitudin.orci@arcuiaculisenim.ca,Ansfelden,Sri Lanka,12/25/13,Vivamus Institute,"amet nulla. Donec non justo. Proin non massa non ante bibendum ullamcorper. Duis cursus, diam at pretium aliquet, metus urna convallis",1,"contributor, author, "
        Kim,Sellers,amet.ultricies@ultriciesdignissim.com,Cabo de Santo Agostinho,Burundi,06/26/14,Maecenas Ornare Foundation,"eget massa. Suspendisse eleifend. Cras sed leo. Cras vehicula aliquet libero. Integer in magna. Phasellus dolor elit, pellentesque a, facilisis non, bibendum sed, est. Nunc laoreet lectus quis massa.",2,
  18. 1. Parse CSV

        // read in our sample CSV file
        // and clean up incoming data with profileParser
        $csv = new readCsv(__DIR__ . '/sample.csv');
        $csv->setKey('id');
        $items = $csv->getArray('profileParser');
  19. 2. Clean up incoming

        function profileParser($item) {
            // skip profiles without an email
            if (empty($item['email'])) return false;

            // create a "last, first" item
            $item['last_first'] = $item['last_name'] . ', ' . $item['first_name'];

            // clean up & split the tags column into an array;
            // trim before filtering so whitespace-only entries are dropped
            if (isset($item['tags'])) {
                $tags = explode(',', $item['tags']);
                $tags = array_map('trim', $tags);
                $tags = array_filter($tags);
                $item['tags'] = $tags;
            }

            // clean date_joined format
            $date = new DateTime($item['date_joined']);
            $item['date_joined_clean'] = $date->format('Y-m-d');

            return $item;
        }
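For readers without the importtools classes at hand, slides 18 and 19 can be approximated with PHP's standard library. This is a sketch under assumptions, not the `readCsv` implementation: `readCsvRows` and `cleanDate` are hypothetical helpers, header names are normalized by replacing spaces with underscores (the sample header says `first name` while the callback reads `first_name`), and dates are assumed to be US-style `m/d/y`.

```php
<?php
// Hypothetical stand-ins for illustration; not the importtools API.

/**
 * Read a CSV file into an array of rows keyed by $keyColumn, passing
 * each row through $callback. Rows for which the callback returns
 * false are skipped.
 */
function readCsvRows(string $path, string $keyColumn, callable $callback): array
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open $path");
    }

    $first = fgetcsv($handle, 0, ',', '"', '\\');
    if ($first === false) {
        fclose($handle);
        return [];
    }

    // Normalize header names: "first name" becomes "first_name".
    $header = array_map(
        fn($name) => str_replace(' ', '_', strtolower(trim((string) $name))),
        $first
    );

    $rows = [];
    while (($fields = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        if (count($fields) !== count($header)) {
            continue; // skip blank or malformed lines
        }
        $item = $callback(array_combine($header, $fields));
        if ($item !== false) {
            $rows[$item[$keyColumn]] = $item;
        }
    }
    fclose($handle);

    return $rows;
}

/** Parse an m/d/y date; returns null if the string does not match the pattern. */
function cleanDate(string $raw): ?string
{
    $date = DateTimeImmutable::createFromFormat('!m/d/y', $raw);
    return $date ? $date->format('Y-m-d') : null;
}
```

Calling `readCsvRows(__DIR__ . '/sample.csv', 'id', 'profileParser')` would then play the role of the three lines on slide 18. Naming the date format explicitly avoids relying on `new DateTime()` to guess whether `12/25/13` is month-first or day-first.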
  20. 3. Convert to XML

        // output XML
        $xml = new toXml("profiles", "profile");
        $xml->setHandler("tags", "tagHandler");
        $xml->convert($items);
        echo $xml->saveXML();

  21. 4. Save XML Output

        $ php csv2xml.php > profiles.xml

        <?xml version="1.0"?>
        <profiles>
          <profile>
            <first_name>Whilemina</first_name>
            <last_name>Benton</last_name>
            <email>sollicitudin.orci@arcuiaculisenim.ca</email>
            <city>Ansfelden</city>
            <country>Sri Lanka</country>
            <date_joined>12/25/13</date_joined>
            <company>Vivamus Institute</company>
            <bio>amet nulla. Donec non justo. Proin non massa non ante bibendum ullamcorper.
            Duis cursus, diam at pretium aliquet, metus urna convallis</bio>
            <id>1</id>
            <roles>
              <role>contributor</role>
              <role>author</role>
            </roles>
            <last_first>Benton, Whilemina</last_first>
            <date_joined_clean>2013-12-25</date_joined_clean>
          </profile>
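The `toXml` class comes from importtools. A rough equivalent using PHP's built-in DOM extension might look like the sketch below. `itemsToXml` is a hypothetical function, and wrapping array values in generic `<item>` children is an assumption made for brevity; the talk's `tagHandler` evidently produces `<roles>` and `<role>` elements instead.

```php
<?php
// Hypothetical sketch using ext-dom; not the importtools toXml class.

/**
 * Convert a list of associative arrays to an XML string.
 * Keys must be valid XML element names. Array values become a wrapper
 * element with one <item> child per entry.
 */
function itemsToXml(array $items, string $rootName, string $itemName): string
{
    $doc = new DOMDocument('1.0', 'UTF-8');
    $doc->formatOutput = true;
    $root = $doc->appendChild($doc->createElement($rootName));

    foreach ($items as $item) {
        $node = $root->appendChild($doc->createElement($itemName));
        foreach ($item as $field => $value) {
            $child = $node->appendChild($doc->createElement($field));
            if (is_array($value)) {
                foreach ($value as $entry) {
                    $sub = $child->appendChild($doc->createElement('item'));
                    $sub->appendChild($doc->createTextNode((string) $entry));
                }
            } else {
                // Text nodes are escaped on output, so & and < are safe here.
                $child->appendChild($doc->createTextNode((string) $value));
            }
        }
    }

    return $doc->saveXML();
}
```

Building the document through DOM calls rather than string concatenation means special characters in the source data cannot produce malformed XML.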
  22. Our Profile Content Type
      • Text fields: First Name, Last Name, City, Country, Company
      • Email field: E-mail
      • Date field: Date Joined
      • Long Text: bio
      • List: Tags (roles)
      • Integer: Legacy ID
  23. [No slide text]
  24. Step 4. Configure Feeds
      • Basic: Disable periodic import
      • Fetcher: File Upload
      • Parser: XPath XML Parser
      • Processor: Node processor
  25. 5. Configure Processor
      • Settings
      • Bundle: Target content type
      • Update: Update existing nodes
      • Text format: HTML
      • Expire Nodes: never
  26. 6. Map Source to Target
  27. 7. Map Inputs with XPath
  28. An XPath Primer
      • Used to query XML documents
      • A path can return multiple nodes
      • /profiles/profile - give me all profile nodes
      • Can test for attributes, elements, and more
      • http://github.com/GeorgeMac/xpath-primer
  29. 8. Run the Import
  30. Advanced Feeds Tricks
      • Provide a URL to an image, audio, or other media file, and it’ll download automatically.
      • Can create entity references
      • As long as your GUIDs are set
      • Can import to Field Collection
      • https://drupal.org/project/field_collection_feeds
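A few queries against a trimmed version of the profile XML from slide 21 make these points concrete. The sketch uses PHP's `DOMXPath`; the second profile and the specific queries are illustrative choices, not ones taken from the talk.

```php
<?php
// Trimmed sample based on slide 21, plus a second profile without roles.
$xmlString = <<<XML
<?xml version="1.0"?>
<profiles>
  <profile>
    <first_name>Whilemina</first_name>
    <id>1</id>
    <roles><role>contributor</role><role>author</role></roles>
  </profile>
  <profile>
    <first_name>Kim</first_name>
    <id>2</id>
  </profile>
</profiles>
XML;

$doc = new DOMDocument();
$doc->loadXML($xmlString);
$xpath = new DOMXPath($doc);

// A path can return multiple nodes: every <profile> element.
$profiles = $xpath->query('/profiles/profile');

// Test for a child element: only profiles with at least one role.
$withRoles = $xpath->query('/profiles/profile[roles/role]');

// Test an element's value: the first name of the profile whose id is 2.
$name = $xpath->evaluate('string(/profiles/profile[id="2"]/first_name)');

printf(
    "%d profiles, %d with roles, id 2 is %s\n",
    $profiles->length,
    $withRoles->length,
    $name
);
```

In Feeds XPath Parser, a context query such as `/profiles/profile` selects the items, and each field mapping is then an XPath expression evaluated relative to one item.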
  31. hook_feeds_presave
      • Clean up data on import
      • Can also use regular node hooks

        function grad_importers_feeds_presave(FeedsSource $source, $entity, $item) {
            if ('publications' == $source->id) {
                // ensure yes/no fields are imported
                $entity->field_submitted_web[LANGUAGE_NONE][0]['value'] = (int) $item['xpathparser:5'];
                $entity->field_media_promotion[LANGUAGE_NONE][0]['value'] = (int) $item['xpathparser:6'];

                // don't lose freeform notes but import the values cleanly
                $notes = array_filter(array($item['xpathparser:8'], $item['xpathparser:12']));
                $entity->field_history_notes[LANGUAGE_NONE][0]['value'] = join("\n", $notes);
            }
        }
  32. hook_feeds_presave II
      • Extract data from a field, assign it to another

        function foo_feeds_presave(FeedsSource $source, $entity, $item) {
            if ('news_feed' == $source->id) {
                // get link and title out of the description field
                $doc = new DOMDocument();
                $x = @$doc->loadHTML($item['xpathparser:2']);

                $links = $doc->getElementsByTagName('a');
                if ($link = $links->item(0)) {
                    $entity->field_link['und'][0]['title'] = $link->nodeValue;
                    $entity->field_link['und'][0]['url']
                        = $link->attributes->getNamedItem('href')->nodeValue;
                }
            }
        }
  33. Creating an Entity Reference
      • Map a source to a Feeds GUID
      • Set an XPath query to read the GUID
  34. Entity Reference XML

        <scholars>
          <scholar>
            <scholar_id>b36c6b08b402c12bd3a11657420cd5dd</scholar_id>
            <scholar_name>Sammy Zahran, PhD</scholar_name>
          </scholar>
        </scholars>
        <program_id>ee01cb8d56dbd67bf89c3c4fcf69e2f5</program_id>
        </publication>
  35. What about Drupal 8?
      • Import API is in core
      • Based on Migrate API
      • http://groups.drupal.org/imp
      • Porting Feeds module
      • http://drupal.org/node/1960800
  36. What did we learn?
      • Decouple your source from import
      • Sources will change … versioning!
      • Transform input sources to XML
      • Clean up data with PHP before it gets to Drupal.
      • Quicker & easier to run more than one import
  37. Thank You.
      • php[architect] - http://phparch.com
      • Magazine, books, trainings, and …
      • php[world] - this November
      • http://world.phparch.com
      • PHP Foundation for Drupal 8
      • 2 days of training
      • http://phpa.me/drupal8dec14
  38. Questions?
      @omerida on Twitter