#evolverocks
CRX2OAK – ALL THE SECRETS OF
REPOSITORY MIGRATION
TOMEK RĘKAWEK, ADOBE RESEARCH
Aug 30, 2016
AGENDA
• Overview of CRX2Oak
• CRX2Oak command line
• Features
• Case study: large migration
• General migration tips
• Using CRX2Oak for AEM upgrade
• Q & (hopefully) A
OVERVIEW OF THE CRX2OAK
UPGRADE FROM CRX2
[Diagram: CQ 5.x – CRX2 → AEM 6.x – Jackrabbit Oak]
OVERVIEW OF THE CRX2OAK
UPGRADE OR SIDEGRADE
[Diagram labels: CQ 5.x – CRX2; AEM 6.x – Jackrabbit Oak; AEM 6.x – Oak]
OVERVIEW OF THE CRX2OAK
MIGRATING BINARIES
CRX2OAK COMMAND LINE
REPOSITORY PARAMETER TYPES
• CRX2Oak is a command-line tool:
  • java -jar crx2oak.jar [options] [datastore-options] SOURCE TARGET
• SOURCE and TARGET define the repositories. Supported formats:
  • path to the CRX2 “repository” directory, e.g. crx-quickstart/repository
  • path to the Oak SegmentMK “repository” directory, as above
  • Mongo URI, e.g. mongodb://localhost:27017/aem
  • JDBC URI, e.g. jdbc:mysql://localhost:3306/sakila?profileSQL=true
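The repository formats listed above can be combined as SOURCE and TARGET. The following is a minimal dry-run sketch: it only prints the command lines and does not start a migration. It assumes that crx2oak.jar is in the current directory; the target directory new-quickstart/repository is a hypothetical name, and the URIs are the examples from the slide.

```shell
# Dry run: print example CRX2Oak invocations instead of executing them.
CRX2OAK="java -jar crx2oak.jar"          # assumption: jar is in the current directory
SRC="crx-quickstart/repository"          # CRX2 repository directory (from the slide)

# CRX2 directory -> Oak SegmentMK directory (hypothetical target path)
CMD_SEGMENT="$CRX2OAK $SRC new-quickstart/repository"
# CRX2 directory -> MongoDB
CMD_MONGO="$CRX2OAK $SRC mongodb://localhost:27017/aem"
# CRX2 directory -> relational database over JDBC
CMD_JDBC="$CRX2OAK $SRC jdbc:mysql://localhost:3306/sakila?profileSQL=true"

printf '%s\n' "$CMD_SEGMENT" "$CMD_MONGO" "$CMD_JDBC"
```

When running such a command for real, it is safer to quote the JDBC URI, since the `?` character is otherwise subject to shell filename expansion.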
CRX2OAK COMMAND LINE
DEFINING DATASTORE TO BE USED
• java -jar crx2oak.jar [options] [datastore-options] SOURCE TARGET
• The source blob store is defined using --src-datastore or --src-s3datastore
• If no blob store is defined for the source, CRX2Oak assumes an embedded one
• If the source blob store is defined, it will be used for the target as well (only references will be copied, not the actual binaries)
• This behaviour can be overridden with --copy-binaries
• The destination blob store can be defined with --datastore or --s3datastore
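The datastore rules above can be illustrated with two dry-run command lines, one that shares the source datastore and one that copies the binaries into a separate one. This is a sketch under assumptions: all directory names are hypothetical, the `--option=value` form follows the style used elsewhere in this deck, and the commands are only printed.

```shell
# Dry run: assemble CRX2Oak command lines for two datastore scenarios.
SRC="crx-quickstart/repository"
DST="new-quickstart/repository"     # hypothetical target repository directory
SRC_DS="$SRC/datastore"             # hypothetical source datastore directory
DST_DS="$DST/datastore"             # hypothetical target datastore directory

# 1) Reuse the source datastore for the target: only references are copied.
CMD_SHARED="java -jar crx2oak.jar --src-datastore=$SRC_DS $SRC $DST"

# 2) Copy the binaries themselves into a separate target datastore.
CMD_COPY="java -jar crx2oak.jar --copy-binaries --src-datastore=$SRC_DS --datastore=$DST_DS $SRC $DST"

printf '%s\n' "$CMD_SHARED" "$CMD_COPY"
```

The first variant is typically the faster one, because no binary data has to be moved; the second is needed when the target must not depend on the old datastore.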
FEATURES
SELECTING PATHS TO MIGRATE
FEATURES
MIGRATING VERSION STORAGE
CASE STUDY
INTRODUCTION
• Client requirements
  • CQ 5.6.1 instance with a large number of sites and assets, storing binaries in S3
  • The content is authored 24/7
  • Migrating the whole content takes about 20 hours
  • The migration is done offline, and the instance can’t be down for that long
  • The upgraded instance has to be tested before going live
• Strategy
  • Snapshot the instance and migrate the copy
  • Perform tests on it
  • Top up the changes introduced after the snapshot
CASE STUDY
STRATEGY
CASE STUDY
REMARKS
• The migration (4) will be much faster, as only the diff will be migrated
• In step (4), use --skip-init so that the existing repository won’t be reinitialized
• Also, use --include-paths=/content/mysite to migrate only the modified subtree
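The two passes described in the case study could look roughly like the dry-run sketch below. The strategy diagram is not part of this transcript, so the split into a full pass and a top-up pass is an assumption based on the remarks above. All repository paths are hypothetical, and the S3 datastore options the client would also need are left out.

```shell
# Dry run of the two migration passes; nothing is executed.
SNAPSHOT="snapshot/crx-quickstart/repository"   # hypothetical: copy of the CQ 5.x repository
LATEST="latest/crx-quickstart/repository"       # hypothetical: source holding the newer changes
TARGET="aem6/crx-quickstart/repository"         # hypothetical: migrated Oak repository

# First pass: full migration of the snapshot (the long run).
CMD_FULL="java -jar crx2oak.jar $SNAPSHOT $TARGET"

# Top-up pass: keep the existing target and copy only the modified subtree.
CMD_TOPUP="java -jar crx2oak.jar --skip-init --include-paths=/content/mysite $LATEST $TARGET"

printf '%s\n' "$CMD_FULL" "$CMD_TOPUP"
```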
GENERAL MIGRATION TIPS
• When using Mongo (as either source or destination), run CRX2Oak on the same machine as the Mongo primary
• If you don’t need version history for deleted nodes, use --copy-orphaned-versions=false to make the migration faster
• CRX2Oak may be used to copy content between existing repositories. Use the following parameters:
  • --skip-init, so the destination is not initialized with the index definitions
  • --{include,merge}-paths to specify which subtrees should be copied
  • --copy-orphaned-versions=false
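The parameters for copying content between existing repositories can be combined as in the dry-run sketch below. The notation `--{include,merge}-paths` stands for the two options `--include-paths` and `--merge-paths`. The repository directories and the subtree are hypothetical, and the exact difference between including and merging a path is best checked in the oak-upgrade documentation rather than taken from this example.

```shell
# Dry run: copy one subtree between two existing repositories.
SRC="repo-a/repository"      # hypothetical source repository
DST="repo-b/repository"      # hypothetical, already initialized destination
SUBTREE="/content/shared"    # hypothetical subtree to copy
COMMON="--skip-init --copy-orphaned-versions=false"

# Variant 1: include the subtree only.
CMD_INCLUDE="java -jar crx2oak.jar $COMMON --include-paths=$SUBTREE $SRC $DST"
# Variant 2: additionally mark the same subtree as a merge path.
CMD_MERGE="java -jar crx2oak.jar $COMMON --include-paths=$SUBTREE --merge-paths=$SUBTREE $SRC $DST"

printf '%s\n' "$CMD_INCLUDE" "$CMD_MERGE"
```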
GENERAL MIGRATION TIPS
UPGRADING CQ 5.X STORING BINARIES IN AWS S3
• When upgrading CQ 5.x + S3, CRX2Oak calls AWS to ask for the length of each binary
  • the lengths are stored in Oak but not in CRX2, so CRX2Oak has to ask for them
• For a large repository, this may slow down the whole migration
• It’s possible to pre-fetch all lengths, store them in a text file and configure CRX (and therefore CRX2Oak) to use it
• More information:
  • https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/upgrade/blob/LengthCachingDataStore.html
• Sample configuration files:
  • http://bit.ly/cq5-s3-upgrade
TROUBLESHOOTING
• UUID conflict exception
  • may occur if the destination repository already exists (iterative migration)
  • remember to add --copy-orphaned-versions=false
  • when using --include-paths, include all modified paths:
    • otherwise, if a page has been moved and only the destination path is included, CRX2Oak won’t remove the page from its original position
• BlobId not found exception
  • either the source or the destination blob store is not configured correctly
• Unable to delete referenced node
  • CRX2Oak is probably trying to overwrite the whole version storage (removing existing versions)
  • add --copy-orphaned-versions=false
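The advice to include all modified paths can be sketched for a moved page as follows. The page locations are hypothetical, and passing several paths to --include-paths as a comma-separated list is an assumption based on the oak-upgrade option description. SOURCE and TARGET are placeholders, and the command is only printed.

```shell
# join_paths: join all arguments with commas (plain POSIX sh, no arrays).
join_paths() {
    joined=""
    for p in "$@"; do
        if [ -z "$joined" ]; then
            joined="$p"
        else
            joined="$joined,$p"
        fi
    done
    printf '%s' "$joined"
}

# Hypothetical case: a page was moved from .../old to .../new after the first pass.
# Both the original and the new location are included.
PATHS=$(join_paths /content/mysite/old /content/mysite/new)
CMD="java -jar crx2oak.jar --skip-init --copy-orphaned-versions=false --include-paths=$PATHS SOURCE TARGET"

printf '%s\n' "$CMD"
```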
USING EXTENSION VS RUNNING CRX2OAK MANUALLY
The official docs describe using the extension:
• java -jar aem-quickstart-6.2.0.jar -unpack # unpack the AEM jar
• java -jar aem-quickstart-6.2.0.jar -v -x crx2oak # prepare extension config
• java -jar aem-quickstart-6.2.0.jar -v -x crx2oak # prepare OSGi config
• java -Xmx4096m -XX:MaxPermSize=2048M -jar aem-quickstart-6.2.0.jar -v -x crx2oak -xargs -- -o migrate
To run CRX2Oak manually, replace the last command with:
• java -Xmx4096m -XX:MaxPermSize=2048M -jar crx-quickstart/opt/helpers/crx2oak/crx2oak.jar [source] [destination]
VERSIONS
• All CRX2Oak versions offer similar features
• They differ in:
  • the Oak version used underneath (CRX2Oak starts a normal Oak repository)
  • the index definitions created during repository initialisation
• Both of these are tied to the AEM version and shouldn’t be mismatched
• Table of truth:
    AEM      Oak    CRX2Oak
    AEM 6.0  1.0.x  1.0.x
    AEM 6.1  1.2.x  1.3.x (sic!)
    AEM 6.2  1.4.x  1.4.x
• CRX2Oak 1.2.x can be used with AEM 6.1 too, but it won’t have all the advanced features
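The version table above can be turned into a small pre-flight lookup so that a script refuses to continue with an unlisted AEM version. This is only a sketch: the function name `crx2oak_line` is hypothetical, and the mapping contains nothing beyond the three rows of the table.

```shell
# crx2oak_line: print the CRX2Oak release line listed for a given AEM version.
# Returns a non-zero status for versions that are not in the table.
crx2oak_line() {
    case "$1" in
        6.0) echo "1.0.x" ;;
        6.1) echo "1.3.x" ;;
        6.2) echo "1.4.x" ;;
        *)   echo "unknown"; return 1 ;;
    esac
}

AEM_VERSION="6.1"   # hypothetical input
echo "AEM $AEM_VERSION -> CRX2Oak $(crx2oak_line "$AEM_VERSION")"
```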
RESOURCES
• CRX2Oak downloads:
  • https://repo.adobe.com/nexus/content/groups/public/com/adobe/granite/crx2oak/
• CRX2Oak documentation:
  • https://docs.adobe.com/docs/en/aem/6-2/deploy/upgrade/using-crx2oak.html
• oak-upgrade documentation:
  • https://jackrabbit.apache.org/oak/docs/migration.html
THANK YOU!
http://tomek.rekawek.eu
@Tomek1024
rekawek@adobe.com

More Related Content

What's hot

Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Amazon Web Services
 
Heap exploitation
Heap exploitationHeap exploitation
Heap exploitationAngel Boy
 
A story of the passive aggressive sysadmin of AEM
A story of the passive aggressive sysadmin of AEMA story of the passive aggressive sysadmin of AEM
A story of the passive aggressive sysadmin of AEMFrans Rosén
 
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTroubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTanel Poder
 
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
 
Windbg cmds
Windbg cmdsWindbg cmds
Windbg cmdskewuc
 
Adobe AEM Maintenance - Customer Care Office Hours
Adobe AEM Maintenance - Customer Care Office HoursAdobe AEM Maintenance - Customer Care Office Hours
Adobe AEM Maintenance - Customer Care Office HoursAndrew Khoury
 
Benchmarking NGINX for Accuracy and Results
Benchmarking NGINX for Accuracy and ResultsBenchmarking NGINX for Accuracy and Results
Benchmarking NGINX for Accuracy and ResultsNGINX, Inc.
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingViller Hsiao
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceBrendan Gregg
 
What should a hacker know about WebDav?
What should a hacker know about WebDav?What should a hacker know about WebDav?
What should a hacker know about WebDav?Mikhail Egorov
 
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...HostedbyConfluent
 
Alfresco Security Best Practices Guide
Alfresco Security Best Practices GuideAlfresco Security Best Practices Guide
Alfresco Security Best Practices GuideToni de la Fuente
 
HTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versionsHTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versionsneexemil
 
Make ARM Shellcode Great Again
Make ARM Shellcode Great AgainMake ARM Shellcode Great Again
Make ARM Shellcode Great AgainSaumil Shah
 

What's hot (20)

Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
Going Deep on Amazon Aurora Serverless (DAT427-R1) - AWS re:Invent 2018
 
Heap exploitation
Heap exploitationHeap exploitation
Heap exploitation
 
A story of the passive aggressive sysadmin of AEM
A story of the passive aggressive sysadmin of AEMA story of the passive aggressive sysadmin of AEM
A story of the passive aggressive sysadmin of AEM
 
Fig 9-03
Fig 9-03Fig 9-03
Fig 9-03
 
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel PoderTroubleshooting Complex Oracle Performance Problems with Tanel Poder
Troubleshooting Complex Oracle Performance Problems with Tanel Poder
 
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRestPGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
PGDay.Amsterdam 2018 - Stefan Fercot - Save your data with pgBackRest
 
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
 
Windbg cmds
Windbg cmdsWindbg cmds
Windbg cmds
 
Adobe AEM Maintenance - Customer Care Office Hours
Adobe AEM Maintenance - Customer Care Office HoursAdobe AEM Maintenance - Customer Care Office Hours
Adobe AEM Maintenance - Customer Care Office Hours
 
Benchmarking NGINX for Accuracy and Results
Benchmarking NGINX for Accuracy and ResultsBenchmarking NGINX for Accuracy and Results
Benchmarking NGINX for Accuracy and Results
 
Meet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracingMeet cute-between-ebpf-and-tracing
Meet cute-between-ebpf-and-tracing
 
The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
How Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
 
Ansible 101
Ansible 101Ansible 101
Ansible 101
 
What should a hacker know about WebDav?
What should a hacker know about WebDav?What should a hacker know about WebDav?
What should a hacker know about WebDav?
 
淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道 淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道
 
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...
Know Your Topics – A Deep Dive on Topic IDs with KIP-516 with Justine Olshan ...
 
Alfresco Security Best Practices Guide
Alfresco Security Best Practices GuideAlfresco Security Best Practices Guide
Alfresco Security Best Practices Guide
 
HTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versionsHTTP Request Smuggling via higher HTTP versions
HTTP Request Smuggling via higher HTTP versions
 
Make ARM Shellcode Great Again
Make ARM Shellcode Great AgainMake ARM Shellcode Great Again
Make ARM Shellcode Great Again
 

Similar to CRX2Oak - all the secrets of repository migration

Postgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPostgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPivotalOpenSourceHub
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux ContainersJignesh Shah
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerEvan Chan
 
Productionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan ChanProductionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan ChanSpark Summit
 
The power of linux advanced tracer [POUG18]
The power of linux advanced tracer [POUG18]The power of linux advanced tracer [POUG18]
The power of linux advanced tracer [POUG18]Mahmoud Hatem
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introductionkanedafromparis
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture PerformanceEnkitec
 
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Bobby Curtis
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scalethelabdude
 
Docking postgres
Docking postgresDocking postgres
Docking postgresrycamor
 
Tanel Poder Oracle Scripts and Tools (2010)
Tanel Poder Oracle Scripts and Tools (2010)Tanel Poder Oracle Scripts and Tools (2010)
Tanel Poder Oracle Scripts and Tools (2010)Tanel Poder
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceEnkitec
 
Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Simon Bennetts
 
Simon Bennetts - Automating ZAP
Simon Bennetts - Automating ZAP Simon Bennetts - Automating ZAP
Simon Bennetts - Automating ZAP DevSecCon
 
Running Spark on Cloud
Running Spark on CloudRunning Spark on Cloud
Running Spark on CloudQubole
 
Java and Containers - Make it Awesome !
Java and Containers - Make it Awesome !Java and Containers - Make it Awesome !
Java and Containers - Make it Awesome !Dinakar Guniguntala
 
Performance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedPerformance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedTim Callaghan
 
Training Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryTraining Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryContinuent
 
Creating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksCreating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksTim Callaghan
 

Similar to CRX2Oak - all the secrets of repository migration (20)

Postgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh ShahPostgre sql linuxcontainers by Jignesh Shah
Postgre sql linuxcontainers by Jignesh Shah
 
PostgreSQL and Linux Containers
PostgreSQL and Linux ContainersPostgreSQL and Linux Containers
PostgreSQL and Linux Containers
 
Productionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job ServerProductionizing Spark and the Spark Job Server
Productionizing Spark and the Spark Job Server
 
Productionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan ChanProductionizing Spark and the REST Job Server- Evan Chan
Productionizing Spark and the REST Job Server- Evan Chan
 
The power of linux advanced tracer [POUG18]
The power of linux advanced tracer [POUG18]The power of linux advanced tracer [POUG18]
The power of linux advanced tracer [POUG18]
 
les01.pdf
les01.pdfles01.pdf
les01.pdf
 
Ippevent : openshift Introduction
Ippevent : openshift IntroductionIppevent : openshift Introduction
Ippevent : openshift Introduction
 
OGG Architecture Performance
OGG Architecture PerformanceOGG Architecture Performance
OGG Architecture Performance
 
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
Oracle GoldenGate Presentation from OTN Virtual Technology Summit - 7/9/14 (PDF)
 
Benchmarking Solr Performance at Scale
Benchmarking Solr Performance at ScaleBenchmarking Solr Performance at Scale
Benchmarking Solr Performance at Scale
 
Docking postgres
Docking postgresDocking postgres
Docking postgres
 
Tanel Poder Oracle Scripts and Tools (2010)
Tanel Poder Oracle Scripts and Tools (2010)Tanel Poder Oracle Scripts and Tools (2010)
Tanel Poder Oracle Scripts and Tools (2010)
 
Oracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture PerformanceOracle GoldenGate Architecture Performance
Oracle GoldenGate Architecture Performance
 
Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk Automating OWASP ZAP - DevCSecCon talk
Automating OWASP ZAP - DevCSecCon talk
 
Simon Bennetts - Automating ZAP
Simon Bennetts - Automating ZAP Simon Bennetts - Automating ZAP
Simon Bennetts - Automating ZAP
 
Running Spark on Cloud
Running Spark on CloudRunning Spark on Cloud
Running Spark on Cloud
 
Java and Containers - Make it Awesome !
Java and Containers - Make it Awesome !Java and Containers - Make it Awesome !
Java and Containers - Make it Awesome !
 
Performance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons LearnedPerformance Benchmarking: Tips, Tricks, and Lessons Learned
Performance Benchmarking: Tips, Tricks, and Lessons Learned
 
Training Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & RecoveryTraining Slides: 203 - Backup & Recovery
Training Slides: 203 - Backup & Recovery
 
Creating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just WorksCreating a Benchmarking Infrastructure That Just Works
Creating a Benchmarking Infrastructure That Just Works
 

More from Tomasz Rękawek

Deep-dive into cloud-native AEM deployments based on Kubernetes
Deep-dive into cloud-native AEM deployments based on KubernetesDeep-dive into cloud-native AEM deployments based on Kubernetes
Deep-dive into cloud-native AEM deployments based on KubernetesTomasz Rękawek
 
Emulating Game Boy in Java
Emulating Game Boy in JavaEmulating Game Boy in Java
Emulating Game Boy in JavaTomasz Rękawek
 
Zero downtime deployments for the Sling-based apps using Docker
Zero downtime deployments for the Sling-based apps using DockerZero downtime deployments for the Sling-based apps using Docker
Zero downtime deployments for the Sling-based apps using DockerTomasz Rękawek
 
Inter-Sling communication with message queue
Inter-Sling communication with message queueInter-Sling communication with message queue
Inter-Sling communication with message queueTomasz Rękawek
 
Shooting rabbits with sling
Shooting rabbits with slingShooting rabbits with sling
Shooting rabbits with slingTomasz Rękawek
 

More from Tomasz Rękawek (9)

Radio ad blocker
Radio ad blockerRadio ad blocker
Radio ad blocker
 
Deep-dive into cloud-native AEM deployments based on Kubernetes
Deep-dive into cloud-native AEM deployments based on KubernetesDeep-dive into cloud-native AEM deployments based on Kubernetes
Deep-dive into cloud-native AEM deployments based on Kubernetes
 
Emulating Game Boy in Java
Emulating Game Boy in JavaEmulating Game Boy in Java
Emulating Game Boy in Java
 
Zero downtime deployments for the Sling-based apps using Docker
Zero downtime deployments for the Sling-based apps using DockerZero downtime deployments for the Sling-based apps using Docker
Zero downtime deployments for the Sling-based apps using Docker
 
SlingQuery
SlingQuerySlingQuery
SlingQuery
 
Code metrics
Code metricsCode metrics
Code metrics
 
Inter-Sling communication with message queue
Inter-Sling communication with message queueInter-Sling communication with message queue
Inter-Sling communication with message queue
 
Sling Dynamic Include
Sling Dynamic IncludeSling Dynamic Include
Sling Dynamic Include
 
Shooting rabbits with sling
Shooting rabbits with slingShooting rabbits with sling
Shooting rabbits with sling
 

Recently uploaded

ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...Christina Lin
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...soniya singh
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxComplianceQuest1
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyFrank van der Linden
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
Optimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVOptimizing AI for immediate response in Smart CCTV
Optimizing AI for immediate response in Smart CCTVshikhaohhpro
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...
The Real-World Challenges of Medical Device Cybersecurity- Mitigating Vulnera...ICS
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 

Recently uploaded (20)

ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
ODSC - Batch to Stream workshop - integration of Apache Spark, Cassandra, Pos...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
A Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docxA Secure and Reliable Document Management System is Essential.docx
A Secure and Reliable Document Management System is Essential.docx
 
Engage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The UglyEngage Usergroup 2024 - The Good The Bad_The Ugly
Engage Usergroup 2024 - The Good The Bad_The Ugly
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
Optimizing AI for immediate response in Smart CCTV

CRX2Oak - all the secrets of repository migration

  • 1. #evolverocks CRX2OAK – ALL THE SECRETS OF REPOSITORY MIGRATION TOMEK RĘKAWEK, ADOBE RESEARCH Aug 30, 2016
  • 2. AGENDA
      • Overview of CRX2Oak
      • CRX2Oak command line
      • Features
      • Case study: large migration
      • General migration tips
      • Using CRX2Oak for AEM upgrade
      • Q & (hopefully) A
  • 3. OVERVIEW OF THE CRX2OAK: UPGRADE FROM CRX2 (diagram labels: CQ 5.x – CRX2; AEM 6.x – Jackrabbit Oak)
  • 4. OVERVIEW OF THE CRX2OAK: UPGRADE OR SIDEGRADE (diagram labels: CQ 5.x – CRX2; AEM 6.x – Jackrabbit Oak; AEM 6.x – Oak)
  • 5. OVERVIEW OF THE CRX2OAK: MIGRATING BINARIES
  • 6. CRX2OAK COMMAND LINE: REPOSITORY PARAMETER TYPES
      • CRX2Oak is a command-line tool:
          • java -jar crx2oak.jar [options] [datastore-options] SOURCE TARGET
      • SOURCE and TARGET define the repositories. Supported formats:
          • path to the CRX2 “repository” directory, e.g. crx-quickstart/repository
          • path to the Oak SegmentMK “repository” directory, as above
          • Mongo URI, e.g. mongodb://localhost:27017/aem
          • JDBC URI, e.g. jdbc:mysql://localhost:3306/sakila?profileSQL=true
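The repository formats listed on this slide can be told apart by their prefix. The sketch below is only an illustration of that list: `repo_kind` is a hypothetical helper, not part of CRX2Oak, and it does not reproduce how the tool itself parses its arguments. A plain directory path may be either a CRX2 or a SegmentMK repository; the path alone does not say which.

```sh
#!/bin/sh
# Hypothetical helper (not part of CRX2Oak): classify a SOURCE/TARGET argument
# according to the formats listed on the slide.
repo_kind() {
  case "$1" in
    mongodb://*) echo "mongo" ;;      # Mongo URI
    jdbc:*)      echo "jdbc" ;;       # JDBC URI
    *)           echo "directory" ;;  # CRX2 or Oak SegmentMK directory
  esac
}

repo_kind crx-quickstart/repository
repo_kind mongodb://localhost:27017/aem
repo_kind "jdbc:mysql://localhost:3306/sakila?profileSQL=true"
```

Note that the JDBC URI is quoted: an unquoted `?` could otherwise be treated by the shell as a glob character.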
  • 7. CRX2OAK COMMAND LINE: DEFINING DATASTORE TO BE USED
      • java -jar crx2oak.jar [options] [datastore-options] SOURCE TARGET
      • The source blob store is defined using --src-datastore or --src-s3datastore
      • If no blob store is defined for the source, CRX2Oak assumes an embedded one
      • If the source blob store is defined, it will be used for the target as well (only references will be copied, not the actual binaries)
          • This can be overridden with --copy-binaries
      • The destination blob store can be defined with --datastore or --s3datastore
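The two datastore scenarios from this slide might look roughly as follows. This is a dry-run sketch that only prints the commands. All paths are hypothetical placeholders, and it assumes the datastore flags take a directory in the `--flag=path` form.

```sh
#!/bin/sh
# Dry run: print the commands instead of executing them.
# All paths below are hypothetical placeholders.
SRC=/old/crx-quickstart/repository
DST=/new/crx-quickstart/repository
SRC_DS=/old/datastore
DST_DS=/new/datastore

# Case 1: only the source blob store is defined, so the target reuses it
# and only references are copied.
reuse_cmd() {
  echo "java -jar crx2oak.jar --src-datastore=$SRC_DS $SRC $DST"
}

# Case 2: override that behaviour and copy the binaries into a separate
# destination blob store.
copy_cmd() {
  echo "java -jar crx2oak.jar --copy-binaries --src-datastore=$SRC_DS --datastore=$DST_DS $SRC $DST"
}

reuse_cmd
copy_cmd
```

Printing first makes it easy to review the flags before a long-running migration is actually started.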
  • 10. CASE STUDY: INTRODUCTION
      • Client requirements
          • CQ 5.6.1 instance with a large number of sites and assets, storing binaries in S3
          • The content is authored 24/7
          • Migrating the whole content takes about 20h
          • The migration is done offline, and the instance can’t be down for that long
          • The upgraded instance has to be tested before going live
      • Strategy
          • Snapshot the instance and migrate the copy
          • Perform tests on it
          • Top up the changes introduced after the snapshot
  • 12. CASE STUDY: REMARKS
      • The migration in step (4) will be much faster, as only the diff will be migrated
      • In step (4), use --skip-init so the existing repository won’t be reinitialized
      • Also use --include-paths=/content/mysite to migrate only the modified subtree
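A top-up run combining the two flags from these remarks could be assembled as in the sketch below. It is a dry run that only prints the command. The repository locations and the `topup_cmd` helper are hypothetical, and the datastore options that an S3-backed instance would also need are left out for brevity.

```sh
#!/bin/sh
# Dry run of a top-up migration; nothing is executed.
# Hypothetical repository locations:
SOURCE_REPO=/mnt/cq561/crx-quickstart/repository
TARGET_REPO=/mnt/aem6/crx-quickstart/repository

# $1: the subtree modified since the snapshot was taken
topup_cmd() {
  echo "java -jar crx2oak.jar --skip-init --include-paths=$1 $SOURCE_REPO $TARGET_REPO"
}

topup_cmd /content/mysite
```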
  • 13. GENERAL MIGRATION TIPS
      • When using Mongo (as either source or destination), run CRX2Oak on the same machine as the Mongo primary
      • If you don’t need version history for deleted nodes, use --copy-orphaned-versions=false to make the migration faster
      • CRX2Oak may be used to copy content between existing repositories. Use the following parameters:
          • --skip-init, so the destination is not initialized with the index definitions
          • --include-paths or --merge-paths to select which subtrees should be copied
          • --copy-orphaned-versions=false
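Copying several subtrees between two existing repositories could be scripted along these lines. The sketch assumes that --include-paths accepts a comma-separated list of paths. The helpers `join_paths` and `copy_between_cmd` and the repository locations are hypothetical, and the command is printed rather than run.

```sh
#!/bin/sh
# Hypothetical repository locations:
SRC=/repos/a/crx-quickstart/repository
DST=/repos/b/crx-quickstart/repository

# Join all arguments with commas; the subshell keeps the IFS change local.
join_paths() {
  ( IFS=,; echo "$*" )
}

# Print (not run) a command that copies the given subtrees between
# two existing repositories, using the parameters listed on the slide.
copy_between_cmd() {
  paths=$(join_paths "$@")
  echo "java -jar crx2oak.jar --skip-init --copy-orphaned-versions=false --include-paths=$paths $SRC $DST"
}

copy_between_cmd /content/site-a /content/dam/site-a
```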
  • 14. GENERAL MIGRATION TIPS: UPGRADING CQ 5.X STORING BINARIES IN AWS S3
      • When upgrading CQ 5.x + S3, CRX2Oak calls AWS to ask for the length of each binary
          • the lengths are stored in Oak but not in CRX2, so they have to be requested
      • For large repositories this may slow down the whole migration
      • It’s possible to pre-fetch all lengths, store them in a text file and configure CRX (and therefore CRX2Oak) to use it
      • More information:
          • https://jackrabbit.apache.org/oak/docs/apidocs/org/apache/jackrabbit/oak/upgrade/blob/LengthCachingDataStore.html
      • Sample configuration files:
          • http://bit.ly/cq5-s3-upgrade
  • 15. TROUBLESHOOTING
      • UUID conflict exception
          • may occur if the destination repository already exists (iterative migration)
          • remember to add --copy-orphaned-versions=false
          • when using --include-paths, include all modified paths:
              • otherwise, if a page has been moved and only the destination path is included, CRX2Oak won’t remove the page from its original position
      • BlobId not found exception
          • either the source or the destination blob store is not configured correctly
      • Unable to delete referenced node
          • CRX2Oak is probably trying to overwrite the whole version storage (removing existing versions)
          • add --copy-orphaned-versions=false
  • 16. USING EXTENSION VS RUNNING CRX2OAK MANUALLY
      • The official docs describe using the extension:
          • java -jar aem-quickstart-6.2.0.jar -unpack # unpack the AEM jar
          • java -jar aem-quickstart-6.2.0.jar -v -x crx2oak # prepare extension config
          • java -jar aem-quickstart-6.2.0.jar -v -x crx2oak # prepare OSGi config
          • java -Xmx4096m -XX:MaxPermSize=2048M -jar aem-quickstart-6.2.0.jar -v -x crx2oak -xargs -- -o migrate
      • To run CRX2Oak manually, replace the last command with:
          • java -Xmx4096m -XX:MaxPermSize=2048M -jar crx-quickstart/opt/helpers/crx2oak/crx2oak.jar [source] [destination]
  • 17. VERSIONS
      • All CRX2Oak versions offer similar features
      • They differ in:
          • the Oak version used underneath (as CRX2Oak starts a normal Oak repository)
          • the index definitions created during repository initialisation
      • Both are tied to the AEM version and shouldn’t be mismatched
      • Table of truth:
          AEM       Oak     CRX2Oak
          AEM 6.0   1.0.x   1.0.x
          AEM 6.1   1.2.x   1.3.x (sic!)
          AEM 6.2   1.4.x   1.4.x
      • CRX2Oak 1.2.x can be used with AEM 6.1 too, but it won’t have all the advanced features
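The table of truth can be encoded as a small lookup, for example to guard a migration script against a mismatched jar. The function name `crx2oak_for_aem` is hypothetical; the values are taken directly from the table on this slide.

```sh
#!/bin/sh
# Hypothetical lookup: recommended CRX2Oak release line for an AEM version,
# taken from the "table of truth" on the slide.
crx2oak_for_aem() {
  case "$1" in
    6.0) echo "1.0.x" ;;
    6.1) echo "1.3.x" ;;   # not 1.2.x, despite Oak being 1.2.x
    6.2) echo "1.4.x" ;;
    *)   echo "unknown"; return 1 ;;
  esac
}

crx2oak_for_aem 6.1
```

The non-zero return status for unlisted versions lets a calling script stop early instead of guessing.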
  • 18. RESOURCES
      • CRX2Oak downloads:
          • https://repo.adobe.com/nexus/content/groups/public/com/adobe/granite/crx2oak/
      • CRX2Oak documentation:
          • https://docs.adobe.com/docs/en/aem/6-2/deploy/upgrade/using-crx2oak.html
      • oak-upgrade documentation:
          • https://jackrabbit.apache.org/oak/docs/migration.html