Our expertise. Your digital DNA | evolvingweb.ca | @evolvingweb
DRUPAL 8 MIGRATION
CASE STUDY
ALEX DERGACHEV
• Co-founder at Evolving Web
• 10 years!
• @dergachev on Twitter + GitHub
• Co-organizer of Drupal Montréal
meetups
MATT CORKS
• Technical project manager at
Evolving Web
• Using Drupal since 2004
• mvc on Drupal.org, IRC, Slack
• @mvc447 on Twitter
• 9 years of great Drupal dev.
• Enterprise level projects
• Extensive training program
• Office in Old Montréal
• We’re hiring!
OUR WORK
OUR APPROACH
Training & Knowledge Transfer
MVP, Iterative approach
Expertise throughout the technology
stack.
Collaboration tools: Redmine/JIRA,
Basecamp, Google Suite, Slack
Great design and theming
Large scalable infrastructure
Multilingual CMS
Search UI
Content migration and sync
Custom Drupal development
OUR EXPERTISE
Princeton University Press is an independent publisher with close
connections, both formal and informal, to Princeton University. As
such it has overlapping responsibilities to the University, the
academic community, and the reading public. Our fundamental
mission is to disseminate scholarship (through print and digital
media) both within academia and to society at large.
• 9,000 published titles
• 15,000 pages in Einstein Papers
• 3,000 quote pages
• 2,000 sample chapters
• 2,000 ToC pages
• 600 bird section pages
• 6,000 PDFs
• 6,000 blog pages
• 18,000 content pages, catalogues,
sales pages, videos, and other
ancillary materials.
CURRENT SITE
• Minisites
• WordPress blog
• phpList newsletter
• Search (CGI + Google
Books)
• Two external shopping
carts (US/UK)
• ONIX XML feed + images
(distributor FTPs)
• Video + audio, moving to
Vimeo
MOVING PARTS
WHAT WE DID
• Analysis of the SQL Server DB, VB scripts, and current website
• Content audit - identify pages to drop
• Deploy Drupal CMS to organize the content
• Updated responsive design
• Nightly synchronization of book data from SQL Server
• Deploy to Pantheon
SCOPING A COMPLEX
MIGRATION
EXISTING DATABASE AUDIT
LEGACY VB SCRIPTS
• 30-50 scripts in MS Access
• Take days to run
• Code duplicated, out-of-date
• “Special” books
(edited in Dreamweaver)
• “Ann’s Biblio” MS SQL DB
• HTML files moved to W:
• COMPLEX! (5k-10k lines of code)
VB SCRIPT AUDIT
CURRENT WORKFLOW
[Diagram. Labels: Biblio (SQL Server), VB Scripts, Static HTML, Dynamic HTML (Works), Dynamic HTML (Pages), PDFs, Illustrations, Cover Images, Website]
NEW WORKFLOW
[Diagram. Labels: Biblio (SQL Server), Export, CSVs on FTP, Migrations (sync), Drupal, Dynamic Views, Works, Basic Pages, Static Files on FTP, PDFs, Illustrations, Static HTML, Cover Images, One-time import]
BIBLIO QUERIES - GENERATE CSV
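The slide title suggests that SQL queries against Biblio produce the CSV files that the migrations later read. Below is a minimal sketch of such an export step, using Python's built-in sqlite3 as a stand-in for SQL Server. The `works` table, its columns, and the `export_query_to_csv` helper are hypothetical and not taken from the project.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, out):
    """Run `query` and write the result set, with a header row, as CSV to `out`."""
    cursor = conn.execute(query)
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([col[0] for col in cursor.description])  # column names
    writer.writerows(cursor)


# In-memory stand-in for the Biblio database (hypothetical schema and data).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE works (book_id INTEGER, title TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO works VALUES (?, ?, ?)",
    [(1, "Example Title, Vol. 1", "in print"), (2, "Another Title", "out of print")],
)

buffer = io.StringIO()
export_query_to_csv(
    conn,
    "SELECT book_id, title FROM works WHERE status = 'in print' ORDER BY book_id",
    buffer,
)
csv_text = buffer.getvalue()
```

Writing a header row keeps the CSV self-describing, so a downstream importer can map columns by name rather than by position.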
DRUPAL DATA STRUCTURE
CONTENT CLEAN-UP
Category Number of Files
Dynamic Pages - Books and Book Lists 20,163
Static Pages to Keep 2,478*
Pages to Delete 53,929
Total 76,570
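As a quick check of the audit figures above, the three categories can be summed and expressed as shares of the total. This is only arithmetic on the numbers in the table; the dictionary keys are paraphrased labels.

```python
# File counts taken from the content clean-up table.
audit = {
    "Dynamic pages (books and book lists)": 20_163,
    "Static pages to keep": 2_478,
    "Pages to delete": 53_929,
}

total = sum(audit.values())

# Share of each category as a percentage of all audited files.
shares = {name: round(100 * count / total, 1) for name, count in audit.items()}
```

The categories add up to the stated total of 76,570, and roughly 70% of the audited files fall into the "delete" category.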
STATIC PAGE ANALYSIS
IMPLEMENTATION
• Modernized,
responsive
design
• Accessible and
good UX
• SEO
• Stable, secure,
maintainable
platform
• Easy to update
web content,
WYSIWYG
• Images are
auto-resized
• Clear
navigation
• Automatic
nightly sync
BENEFITS OF NEW SITE
• Existing branding + look-and-feel
• Adapted to standard modern
responsive template
• Changed side menu to consistent
drop-down UI
• Clean, accessible footer
• Search box collapsed
HOMEPAGE REFRESH
OLD
MENU
STRUCTURE
NEW MENU STRUCTURE
• Simplified UI (less in header)
• Displays books consistently
• Adapts based on the fields
available for each book
• Integrates with shopping carts,
Google Book search
• Renders videos + materials
• Collapses long sections of text
BOOK TEMPLATES
• Includes migration of ~8,000 books and
~4,000 jpgs*
• Process is optimized to prevent
downtime/inconsistency on the live site
• Migrations can be run manually or
automatically
• Out-of-print books are removed
[Diagram: Biblio (SQL Server) → export to CSV → CSVs on FTP → migrate to website]
MIGRATING BOOK DATA
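One way to picture the removal of out-of-print books is as a set comparison between the IDs in the latest export and the IDs already imported. The sketch below assumes that out-of-print books simply disappear from the CSV export, which the slides do not state explicitly. `plan_sync` and the sample IDs are hypothetical, and this is not how Drupal's migrate system implements it internally.

```python
def plan_sync(source_ids, imported_ids):
    """Compare the current export with what the site already has."""
    source, imported = set(source_ids), set(imported_ids)
    return {
        "create": sorted(source - imported),   # new in the export
        "delete": sorted(imported - source),   # gone from the export
        "keep": sorted(source & imported),     # present on both sides
    }


plan = plan_sync(
    source_ids=["b1", "b2", "b4"],
    imported_ids=["b1", "b2", "b3"],
)
```

Computing the plan before touching any content is one way to keep the live site consistent: nothing is deleted until the full export has been read.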
• Includes migration of works,
editions, reviews, contributors, etc.
• Handle source deletion
• Multi-CSV “join”
• SFTP + FTP support, with source
change detection
• Nightly cron via queue API
• Fixed encoding
• Resolved many issues with legacy
data and legacy book logic
ONLY CUSTOM MODULE: pup_migrate
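The "multi-CSV join" and "source change detection" bullets can be illustrated with a small sketch. It is written in Python for brevity and is not the pup_migrate code: the file contents, column names, and helper functions are all hypothetical, and hashing the file contents is just one plausible way to detect a changed source.

```python
import csv
import hashlib
import io


def read_rows(csv_text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def join_on(left_rows, right_rows, key):
    """Left join: merge the matching right-hand row into each left row."""
    right_by_key = {row[key]: row for row in right_rows}
    return [{**row, **right_by_key.get(row[key], {})} for row in left_rows]


def fingerprint(data):
    """Content hash of a downloaded file, used to decide whether to re-import."""
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


works_csv = "book_id,title\n1,First Book\n2,Second Book\n"
prices_csv = "book_id,price_us\n1,29.95\n"

joined = join_on(read_rows(works_csv), read_rows(prices_csv), "book_id")

# Change detection: compare the stored fingerprint with the new download.
last_seen = fingerprint(works_csv)
unchanged = fingerprint(works_csv) == last_seen
changed = fingerprint(works_csv + "3,Third Book\n") != last_seen
```

A left join keeps every work even when the second file has no matching row, which matters when legacy data is incomplete.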
$ drush migrate-status
Group: PUP Biblio (pup_biblio) Status Total Imported Unprocessed
pup_math_subjects Idle 30 30 0
pup_textbooks_by_author_redirects Idle 7 7 0
pup_catalog_by_math_subject_order_by_author_redirects Idle 30 30 0
pup_subjects Idle 50 50 0
pup_work_chapters Idle 4241 4242 -1
pup_chapters_by_subject_redirects Idle 50 50 0
pup_contrib_roles Idle 37 52 -15
pup_contribs Idle 10356 10357 -1
pup_editions_ebook Idle 4285 4292 -7
pup_region_restrictions Idle 131 131 0
pup_editions_physical Idle 11386 11407 -25
pup_prizes Idle 3096 3094 0
pup_series Idle 225 225 0
pup_textbook_cats Idle 7 7 0
pup_trans_languages Idle 45 45 0
pup_work_contribs Idle 16058 15941 117
pup_work_covers Idle 7540 7560 -23
pup_work_illustrations Idle 125 125 0
pup_work_interviews Idle 569 357 0
pup_work_interview_by_author_redirects Idle 5 5 0
pup_work_interview_by_title_redirects Idle 5 5 0
pup_work_links_header Idle 125 124 0
pup_work_links_mixed Idle 2872 1677 95
pup_work_reviews Idle 42328 42220 35
pup_works Idle 8219 8231 -12
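A status listing like the one above can be scanned for migrations that need attention. The sketch below parses a few of the rows and separates negative from positive Unprocessed counts. It assumes, hedged, that a negative value means the migration map holds more rows than the source currently provides (for example after rows were dropped from a CSV). The `MigrationStatus` type and parsing helper are hypothetical.

```python
from typing import NamedTuple


class MigrationStatus(NamedTuple):
    name: str
    status: str
    total: int
    imported: int
    unprocessed: int


def parse_status_line(line):
    """Parse one whitespace-separated row of the status listing."""
    name, status, total, imported, unprocessed = line.split()
    return MigrationStatus(name, status, int(total), int(imported), int(unprocessed))


# A few rows copied from the listing above.
sample = """\
pup_subjects Idle 50 50 0
pup_work_chapters Idle 4241 4242 -1
pup_work_contribs Idle 16058 15941 117
pup_works Idle 8219 8231 -12
"""

rows = [parse_status_line(line) for line in sample.splitlines()]

stale = [r.name for r in rows if r.unprocessed < 0]    # map larger than source
pending = [r.name for r in rows if r.unprocessed > 0]  # rows still to import
```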
• admin_toolbar
• context
• context_active_trail
• devel
• entity_reference_revisions
• metatag
• migrate_source_csv
• paragraphs
• token
• ctools
• entity_browser
• inline_entity_form
• migrate_plus
• migrate_tools
• pathauto
• views_slideshow
• editor_file
• redirect
• redis
• stage_file_proxy
CONTRIB MODULES
• ISBN
• Number of
Pages
• Price UK/US
• Region
• Season
• Status
• Type
• Affiliation
• Audible
• Book Club
• Sub-authors
• Book ID
• Co-publisher
• Sub-title
• Table of
Contents
• Cover
Caption
• Cover Image
• Description
• Tagline
• Title
• Need
Textbooks
• Edition
Notes
• Title
(formatted)
• Volume
• Primary
Authors
• Math
Subject
• Prior
Editions
• Reviews
• Links
• Illustrations
• Chapters
• Prizes
• Interviews
• Contributors
• Subjects
• Translation
language
[Diagram: Biblio (SQL Server) → export to CSV → CSVs on FTP → migrate to website]
IMPORTED BOOK FIELDS
• PDF links rewriting (assets)
• Merged denormalized records for
ebooks and print books
• Added publication type for book
apps
• Encoding problems
BOOK DATA CLEANUP
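The PDF link rewriting mentioned above might look roughly like the following sketch, which points root-relative PDF links at the legacy asset domain named in the deployment checklist. The `https` scheme, the link pattern, and the sample HTML are assumptions, and the real cleanup may have handled more cases.

```python
import re

# Domain taken from the deployment checklist; the https scheme is assumed.
ASSET_DOMAIN = "https://assets.press.princeton.edu"

# Matches href="/path/to/file.pdf" (root-relative links only).
PDF_HREF = re.compile(r'href="(/[^"]+\.pdf)"', re.IGNORECASE)


def rewrite_pdf_links(html):
    """Prefix root-relative PDF links with the asset domain; leave others alone."""
    return PDF_HREF.sub(lambda m: f'href="{ASSET_DOMAIN}{m.group(1)}"', html)


html = '<a href="/chapters/s1234.pdf">Chapter 1</a> <a href="/about.html">About</a>'
rewritten = rewrite_pdf_links(html)
```

Only links ending in `.pdf` are touched, so ordinary page links keep resolving against the new site.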
• Imported most-visited static pages
• Can be easily cleaned up using
the Drupal website admin
interface by client team
• Other lower-priority pages left to
be migrated manually
STATIC HTML MIGRATION
• Lists of books, dynamically
updated after Biblio sync
• Search pages
• In future, some static listings can
be replaced with Drupal Views (e.g.
books published in the last month,
or books that have won a prize)
DYNAMIC LISTINGS (Views)
GOING LIVE
DEPLOYMENT CHECKLIST (1/2)
• Deploy on
• Set legacy asset domain to
assets.press.princeton.edu
• Test migrations / cron sync on
Pantheon
• Ensure uploaded files stored
in subdirectories per field
• Google Analytics, etc.
• Enable/disable extensions:
admin_toolbar, devel, …
• Ensure that the live site
domain is white-listed as
trusted host in $settings
• Change redirects to use live
domain in settings.site.inc
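Drupal's trusted host setting (`$settings['trusted_host_patterns']`) takes regular expressions, so the patterns need anchors and escaped dots to avoid matching look-alike hosts. The sketch below illustrates that matching behaviour in Python rather than PHP. The live domain `press.princeton.edu` is an assumption inferred from the asset domain above, and `is_trusted` is a hypothetical helper.

```python
import re

# Hypothetical patterns; the live domain is assumed, not stated in the slides.
TRUSTED_HOST_PATTERNS = [
    r"^press\.princeton\.edu$",      # the site itself
    r"^.+\.press\.princeton\.edu$",  # any subdomain, e.g. the assets host
]


def is_trusted(host):
    """Return True if the Host header matches one of the anchored patterns."""
    return any(re.search(p, host, re.IGNORECASE) for p in TRUSTED_HOST_PATTERNS)


checks = {
    host: is_trusted(host)
    for host in [
        "press.princeton.edu",
        "assets.press.princeton.edu",
        "press.princeton.edu.evil.example",  # rejected thanks to the $ anchor
        "pressXprinceton.edu",               # rejected thanks to the escaped dot
    ]
}
```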
DEPLOYMENT CHECKLIST (2/2)
• Performance audit (frontend
+ backend)
• Enable page cache, CSS/JS
aggregation
• Broken link checker, Fast
404
• Soft launch: DNS, remove
HTTP Auth
• Pantheon Launch Checklist:
https://pantheon.io/docs/
guides/launch
• Set up backups & recovery
• Monitoring (Pingdom)
POST-LAUNCH
• Support agreement
• Friendly URLs & redirects
• Advanced search
• Convert static pages to dynamic
pages, Webforms
• More admin training + UI
improvements
• SEO enhancements - social
media tags, XML sitemap
• Annual sale (discounts), shopping
cart simplification
• Additional homepage content
(e.g. news in slideshow, events)
• Migration to VirtuSales Biblio
• Design overhaul
PROJECT MANAGEMENT
PROJECT TEAM
• Alex Dergachev: Business lead
• Jigar Mehta: Backend dev
• Jorge Diaz: Front-end dev
• Matt Wahba: Designer
• Matt Corks: Technical PM
• Suzanne Dergacheva: Training
• Dave Vasilevsky: Backend dev
• Alex Parker: QA (Intern)
CLOSE COLLABORATION
• 7 calendar days of onsite meetings in
Princeton
• 5 demos (in person or Zoom)
• Single point of contact (PUP + EW)
• Backlog of tasks with estimates for
client prioritization
• No client-initiated scope creep
TIMELINE
DISTRIBUTION OF WORK
SUMMARY / CHALLENGES
• Phases: Scoping, Design, Dev, Migration, Content, Deployment, Maintenance
• Gotchas: Migrations in cron, Legacy complexity, Content cleanup, Editor
interface
• PM: Timeline, Scoping phase, Budget expectations

Princeton University Press to Drupal 8: Migration case study by Evolving Web

  • 1. Our expertise. Your digital DNA | evolvingweb.ca | @evolvingweb DRUPAL 8 MIGRATION CASE STUDY
  • 2. ALEX DERGACHEV • Co-founder at Evolving Web • 10 years! • @dergachev on twitter + github • Co-organizer of Drupal Montréal meetups
  • 3. MATT CORKS • Technical project manager at Evolving Web • Using Drupal since 2004 • mvc on Drupal.org, IRC, slack • @mvc447 on Twitter
  • 4. • 9 years of great Drupal dev. • Enterprise level projects • Extensive training program • Office in Old Montréal • We’re hiring!
  • 6. OUR APPROACH • Training & Knowledge Transfer • MVP, Iterative approach • Expertise through the technology stack • Collaboration tools: Redmine/JIRA, Basecamp, Google Suite, Slack OUR EXPERTISE • Great design and theming • Large scalable infrastructure • Multilingual CMS • Search UI • Content migration and sync • Custom Drupal development
  • 7. Princeton University Press is an independent publisher with close connections, both formal and informal, to Princeton University. As such it has overlapping responsibilities to the University, the academic community, and the reading public. Our fundamental mission is to disseminate scholarship (through print and digital media) both within academia and to society at large.
  • 8. • 9,000 published titles • 15,000 pages in Einstein Papers • 3,000 quote pages • 2,000 sample chapters • 2,000 ToCs pages • 600 bird section pages • 6,000 PDFs • 6,000 blog pages • 18,000 content pages, catalogues, sales pages, videos, and other ancillary materials. CURRENT SITE
  • 9. • Minisites • Wordpress Blog • phpList newsletter • Search (CGI + Google Books) • Two external shopping carts (US/UK) • ONIX XML feed + images • (distributor FTPs) • videos + audios, moving to vimeo MOVING PARTS
  • 10. WHAT WE DID • Analysis of the SQL Server DB, VB scripts, and current website • Content audit - identify pages to drop • Deploy Drupal CMS to organize the content • Updated responsive design • Nightly synchronization of book data from SQL Server • Deploy to Pantheon
  • 13. LEGACY VB SCRIPTS • 30-50 scripts in MS Access • Take days to run • Code duplicated, out-of-date • “Special” books • (edited in Dreamweaver) • “Ann’s Biblio” MsSQL DB • HTML files moved to W: • COMPLEX! (5k-10k Lines of code)
  • 16. NEW WORKFLOW [diagram; labels: Biblio SQL Server, Dynamic Views, Works, Basic Pages, PDFs, Illustrations, Static HTML, Migrations (sync), CSVs on FTP, Export, Cover Images, Drupal, Static Files on FTP, One-time import]
  • 17. BIBLIO QUERIES - GENERATE CSV
  • 19. CONTENT CLEAN-UP
      Category                                Number of Files
      Dynamic Pages - Books and Book Lists    20,163
      Static Pages to Keep                    2,478*
      Pages to Delete                         53,929
      Total                                   76,570
  • 22. • Modernized, responsive design • Accessible and good UX • SEO • Stable, secure, maintainable platform • Easy to update web content, WYSIWYG • Images are auto-resized • Clear navigation • Automatic nightly sync BENEFITS OF NEW SITE
  • 23. • Existing branding + look-and-feel • Adapted to standard modern responsive template • Changed side menu to consistent drop-down UI • Clean, accessible footer • Search box collapsed HOMEPAGE REFRESH
  • 26. • Simplified UI (less in header) • Displays books consistently • Adapts based on the fields available for each book • Integrates with shopping carts, Google Book search • Renders videos + materials • Collapses long sections of text BOOK TEMPLATES
  • 27. • Includes migration of ~8,000 books and ~4,000 jpgs* • Process is optimized to prevent downtime/inconsistency on the live site • Migrations can be run manually or automatically • Out-of-print books are removed Biblio SQL Server CSVs on FTP Export to CSV Migrate to website MIGRATING BOOK DATA
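The slide says out-of-print books are removed during the sync but does not show how. A minimal Python sketch of one way to plan such a sync, assuming each CSV export carries a `book_id` column (a hypothetical column name) and that the IDs imported by the previous run are known:

```python
import csv
import io


def ids_in_csv(text, id_column="book_id"):
    """Return the set of non-empty IDs found in one CSV export."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        (row.get(id_column) or "").strip()
        for row in reader
        if (row.get(id_column) or "").strip()
    }


def plan_sync(previously_imported, source_ids):
    """Split IDs into those to import or update and those to remove."""
    return {
        "import_or_update": sorted(source_ids),
        # Anything imported earlier but missing from the new export
        # (for example an out-of-print book) is scheduled for removal.
        "remove": sorted(previously_imported - source_ids),
    }


# Hypothetical data: book 102 has dropped out of the export, 104 is new.
previous = {"101", "102", "103"}
export = "book_id,title\n101,Book A\n103,Book C\n104,Book D\n"
plan = plan_sync(previous, ids_in_csv(export))
print(plan)
```

Computing the full plan before touching the site is one way to keep the live site consistent: nothing is deleted until the new export has been read completely.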
  • 28. • Includes migration of works, editions, reviews, contributors, etc. • Handle source deletion • Multi-CSV “join” • SFTP + FTP support, with source change detection • Nightly cron via queue API • Fixed encoding • Resolved many issues with legacy data and legacy book logic ONLY CUSTOM MODULE: pup_migrate
      $ drush migrate-status
      Group: PUP Biblio (pup_biblio)                         Status  Total  Imported  Unprocessed
      pup_math_subjects                                      Idle    30     30        0
      pup_textbooks_by_author_redirects                      Idle    7      7         0
      pup_catalog_by_math_subject_order_by_author_redirects  Idle    30     30        0
      pup_subjects                                           Idle    50     50        0
      pup_work_chapters                                      Idle    4241   4242      -1
      pup_chapters_by_subject_redirects                      Idle    50     50        0
      pup_contrib_roles                                      Idle    37     52        -15
      pup_contribs                                           Idle    10356  10357     -1
      pup_editions_ebook                                     Idle    4285   4292      -7
      pup_region_restrictions                                Idle    131    131       0
      pup_editions_physical                                  Idle    11386  11407     -25
      pup_prizes                                             Idle    3096   3094      0
      pup_series                                             Idle    225    225       0
      pup_textbook_cats                                      Idle    7      7         0
      pup_trans_languages                                    Idle    45     45        0
      pup_work_contribs                                      Idle    16058  15941     117
      pup_work_covers                                        Idle    7540   7560      -23
      pup_work_illustrations                                 Idle    125    125       0
      pup_work_interviews                                    Idle    569    357       0
      pup_work_interview_by_author_redirects                 Idle    5      5         0
      pup_work_interview_by_title_redirects                  Idle    5      5         0
      pup_work_links_header                                  Idle    125    124       0
      pup_work_links_mixed                                   Idle    2872   1677      95
      pup_work_reviews                                       Idle    42328  42220     35
      pup_works                                              Idle    8219   8231      -12
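The multi-CSV “join” and source change detection could look roughly like the Python sketch below. It only illustrates the two ideas and is not the code of the pup_migrate module (which is a Drupal module). The column names `work_id` and `name`, and the use of a SHA-256 fingerprint to detect changes, are assumptions made for the example.

```python
import csv
import hashlib
import io


def file_fingerprint(data: bytes) -> str:
    """Hash the raw file contents so a later run can tell whether it changed."""
    return hashlib.sha256(data).hexdigest()


def has_changed(data: bytes, last_fingerprint: str) -> bool:
    """True when the downloaded file differs from the one seen last time."""
    return file_fingerprint(data) != last_fingerprint


def join_csvs(works_csv: str, contribs_csv: str, key: str = "work_id") -> dict:
    """Attach contributor names from a second CSV to each work row."""
    works = {
        row[key]: dict(row, contributors=[])
        for row in csv.DictReader(io.StringIO(works_csv))
    }
    for row in csv.DictReader(io.StringIO(contribs_csv)):
        work = works.get(row[key])
        if work is not None:  # skip contributor rows that reference no known work
            work["contributors"].append(row["name"])
    return works


# Hypothetical exports: work 3 exists only in the contributors file.
works_csv = "work_id,title\n1,Book A\n2,Book B\n"
contribs_csv = "work_id,name\n1,Ann\n1,Bob\n3,Orphan\n"

joined = join_csvs(works_csv, contribs_csv)
fingerprint = file_fingerprint(works_csv.encode("utf-8"))
unchanged = not has_changed(works_csv.encode("utf-8"), fingerprint)
print(joined["1"]["contributors"], unchanged)
```

With a stored fingerprint per file, a nightly job can skip any CSV that has not changed since the previous run and re-import only the ones that have.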
  • 29. • admin_toolbar • context • context_active_trail • devel • entity_reference_revisions • metatag • migrate_source_csv • paragraphs • token • ctools • entity_browser • inline_entity_form • migrate_plus migrate_tools • pathauto • views_slideshow • editor_file • redirect • redis • stage_file_proxy CONTRIB MODULES
  • 30. • ISBN • Number of Pages • Price UK/US • Region • Season • Status • Type • Affiliation • Audible • Book Club • Sub-authors • Book ID • Co-publisher • Sub-title • Table of Contents • Cover Caption • Cover Image • Description • Tagline • Title • Need Textbooks • Edition Notes • Title (formatted) • Volume • Primary Authors • Math Subject • Prior Editions • Reviews • Links • Illustrations • Chapters • Prizes • Interviews • Contributors • Subjects • Translation language Biblio SQL Server CSVs on FTP Export to CSV Migrate to website IMPORTED BOOK FIELDS
  • 31. • PDF links rewriting (assets) • Merged denormalized records for ebooks and print books • Added publication type for book apps • Encoding problems BOOK DATA CLEANUP
  • 32. • Imported most-visited static pages • Can be easily cleaned up by the client team using the Drupal website admin interface • Other lower-priority pages were left and can be migrated manually STATIC HTML MIGRATION
  • 33. • Lists of books, dynamically updated after Biblio sync • Search pages • In future, can replace some static listings with Drupal views (e.g. books published in the last month, or books that have won a prize) DYNAMIC LISTINGS (Views)
  • 35. DEPLOYMENT CHECKLIST (1/2) • Deploy on • Set legacy asset domain to assets.press.princeton.edu • Test migrations / cron sync on Pantheon • Ensure uploaded files stored in subdirectories per field • Google Analytics, etc. • Enable/disable extensions: admin_toolbar, devel, … • Ensure that the live site domain is white-listed as trusted host in $settings • Change redirects to use live domain in settings.site.inc
  • 36. DEPLOYMENT CHECKLIST (2/2) • Performance audit (frontend + backend) • Enable page cache, CSS/JS aggregation • Broken link checker, Fast 404 • Soft launch: DNS, remove HTTP Auth • Pantheon Launch Checklist: https://pantheon.io/docs/guides/launch • Set up backups & recovery • Monitoring (Pingdom)
  • 37. POST-LAUNCH • Support agreement • Friendly URLs & redirects • Advanced search • Convert static pages to dynamic pages, Webforms • More admin training + UI improvements • SEO enhancements - social media tags, XML sitemap • Annual sale (discounts), shopping cart simplification • Additional homepage content (eg. news in slideshow, events) • Migration to VirtuSales Biblio • Design overhaul
  • 39. PROJECT TEAM • Alex Dergachev, Business lead • Jigar Mehta, Backend dev • Jorge Diaz, Front-end dev • Matt Wahba, Designer • Matt Corks, Technical PM • Suzanne Dergacheva, Training • Dave Vasilevsky, Backend dev • Alex Parker, QA (Intern)
  • 40. CLOSE COLLABORATION • 7 calendar days of onsite meetings in Princeton • 5 demos (in person or Zoom) • Single point of contact (PUP + EW) • Backlog of tasks with estimates for client prioritization • No client-initiated scope creep
  • 43. SUMMARY / CHALLENGES • Phases: • Scoping, Design, Dev, Migration, Content, Deployment, Maintenance • Gotchas • Migrations in cron, Legacy complexity, Content cleanup, Editor interface • PM • Timeline, Scoping phase, Budget expectations