Presented by: Gaston Gonzalez, headwire.com, Inc.
+
Advanced AEM Search
Consuming External Content and Enriching Content with Apache
Camel
About Me
• Senior Technical Architect at
headwire.com, Inc.
• Search Engineer / Developer
• AEM Architect / Developer
• Creator of AEM Solr Search
• Tech Blogger
• UNIX Systems Administrator
+
+
Typical AEM + Search
Integration
Typical AEM + Search Architecture
+
Typical AEM + Search Architecture
+
Pros
• Straightforward implementation
• Simple architecture (AEM + Search)
Cons
• Complete data model in AEM?
  • Not all data may be in AEM
• Processing overhead
  • Data cleansing, transformation and enrichment handled in AEM
• Fault tolerance
  • What if Solr is down?
• Tight coupling to search platform
Is there another
way?
+
Goals for a better Architecture
• Offload processing outside of AEM
• Improve fault tolerance
• Provide flexible platform for data cleansing,
transformation and aggregation
• Allow for changes to indexing logic without impacting
AEM
• Search engine agnostic
+
Introduce an ETL / Document Processor
+
+
Document Processing
Document Processing Platform
• Roles & Responsibilities
• Enriches submitted documents prior to indexing.
• Submits documents for indexing.
• Terms & Definitions
• Enrichment: Data cleansing, filtering, transformation,
aggregation, etc.
• Processing Stage: Independent processing unit
responsible for contributing to the enrichment process.
• Pipeline: Consists of one or more processing stages or
sub pipelines.
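The terms above can be sketched in plain Java. This is only an illustration and not code from the talk: the class name `PipelineSketch`, the stage methods and the `title` / `titleLength` field names are all hypothetical. It models a document as a map, a processing stage as a function from document to document, and a pipeline as an ordered list of stages that can itself be used as a stage (a sub pipeline).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

public class PipelineSketch {

    /** A pipeline is an ordered list of stages; it is itself a stage, so pipelines can nest. */
    static final class Pipeline implements UnaryOperator<Map<String, Object>> {
        private final List<UnaryOperator<Map<String, Object>>> stages = new ArrayList<>();

        Pipeline add(UnaryOperator<Map<String, Object>> stage) {
            stages.add(stage);
            return this;
        }

        @Override
        public Map<String, Object> apply(Map<String, Object> doc) {
            Map<String, Object> current = doc;
            for (UnaryOperator<Map<String, Object>> stage : stages) {
                current = stage.apply(current); // each stage contributes to the enrichment
            }
            return current;
        }
    }

    /** Hypothetical cleansing stage: trims whitespace from the "title" field. */
    static Map<String, Object> trimTitle(Map<String, Object> doc) {
        Map<String, Object> out = new HashMap<>(doc);
        Object title = out.get("title");
        if (title instanceof String) {
            out.put("title", ((String) title).trim());
        }
        return out;
    }

    /** Hypothetical enrichment stage: adds a derived "titleLength" field. */
    static Map<String, Object> addTitleLength(Map<String, Object> doc) {
        Map<String, Object> out = new HashMap<>(doc);
        Object title = out.get("title");
        if (title instanceof String) {
            out.put("titleLength", ((String) title).length());
        }
        return out;
    }

    /** Builds a pipeline made of one sub pipeline (cleansing) followed by one enrichment stage. */
    static Pipeline defaultPipeline() {
        Pipeline cleansing = new Pipeline().add(PipelineSketch::trimTitle);
        return new Pipeline().add(cleansing).add(PipelineSketch::addTitleLength);
    }

    public static void main(String[] args) {
        Map<String, Object> doc = new HashMap<>();
        doc.put("title", "  Advanced AEM Search  ");
        System.out.println(defaultPipeline().apply(doc));
    }
}
```

Because every stage has the same document-in, document-out shape, stages can be reordered, removed or tested in isolation without touching the others.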
+
Document Processing Platform
+
Document processing is really an
integration problem, right?
+
Integration options, from low to high complexity:
• Integration Library: Apache Camel
• Integration Framework & Stream Processing: Spring Integration, Spring Cloud Data Flow & Cloud Stream
• Enterprise Service Bus: Mule ESB
+
Apache Camel
Apache Camel
• A lightweight, open source
integration library.
• Mediation engine
• Implements well-known Enterprise
Integration Patterns (EIPs)
• Aggregator
• Content Enricher
• Content-based router
• Message
• Message Translator
• Pipes and Filters
• Splitter…
+
Why Apache Camel?
• Lightweight: it’s a JAR
• Imposes no runtime constraints
• Routing engine
• Powerful, fluent Java DSL
• Mature open source project
• Extensive list of integration components
• Avoid writing boilerplate code: leverage EIPs
+
Apache Camel & EIP Concepts
+
Message
• Unit of information exchange between applications
Exchange
• Wraps inbound & outbound message + headers
Message Channel
• Allows applications to communicate using messaging
Pipes and Filters
• Perform loosely coupled processing on a message
• Routes and Processors in Camel
Camel’s Data Model
+
Camel’s Architecture
+
Importing Product Content into Solr
Problem: “As an AEM developer, I need to import product
content into Solr so that I can display products via search
and on PDPs on my AEM-powered site.”
+
Let’s use Best Buy’s Product API as example…
1. Fetch product data ZIP file via HTTP request.
2. Unzip product data.
3. Parse each JSON file to extract individual products.
4. Transform, enrich and cleanse each product as necessary.
5. Submit each product to Solr for indexing.
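A rough, JDK-only sketch of steps 2 to 5 follows. It is not the Camel route from the talk, and everything named here (`ProductImportSketch`, `unzipJsonFiles`, `importProducts`, the sample file names) is hypothetical. The HTTP fetch from step 1 is replaced by an in-memory ZIP, JSON parsing is left out because it needs a library (so each JSON file is passed on whole rather than split into products), and the Solr submission is a caller-supplied callback.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class ProductImportSketch {

    /** Step 2: unzip the archive and return the text of every .json entry. */
    static List<String> unzipJsonFiles(InputStream zipStream) throws IOException {
        List<String> files = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(zipStream)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (entry.isDirectory() || !entry.getName().endsWith(".json")) {
                    continue; // skip anything that is not a JSON data file
                }
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[4096];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                files.add(new String(buffer.toByteArray(), StandardCharsets.UTF_8));
            }
        }
        return files;
    }

    /** Steps 3 to 5 as placeholders: each file is cleansed and handed to the indexer callback. */
    static int importProducts(InputStream zipStream, Consumer<String> indexer) throws IOException {
        int submitted = 0;
        for (String json : unzipJsonFiles(zipStream)) {
            String cleansed = json.trim(); // stand-in for real transformation and enrichment
            indexer.accept(cleansed);      // stand-in for the Solr index request
            submitted++;
        }
        return submitted;
    }

    /** Builds a small in-memory ZIP so the sketch runs without network access or an API key. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("products_0001.json"));
            zip.write(" {\"sku\": 1} ".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("readme.txt"));
            zip.write("ignored".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        int n = importProducts(new ByteArrayInputStream(sampleZip()), System.out::println);
        System.out.println(n + " file(s) submitted");
    }
}
```

Even in this reduced form, most of the code is plumbing (stream handling, iteration, filtering) rather than product logic, which is the kind of work an integration library is meant to take over.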
A solution using EIPs
+
A solution using Camel
+
A short list of Camel Components
+
AMQP            Git              RabbitMQ
ATOM            HTTP / HTTP4     Rest
AWS             JCR              RSS
Bean            JDBC             Solr
Box             JMS              Apache Spark
Cache           Jsch             SQL
CouchDB         Log              Timer
Elasticsearch   MongoDB          XSLT
File            Netty / Netty4   Quartz
http://camel.apache.org/components.html
Back to AEM and
indexing AEM content…
+
A Better AEM + Search Architecture
+
Enrichment Use Cases for AEM
• Search Relevancy
• Merge ratings and review signals
• Merge analytics signals (visits, page views…)
• Merge social signals (likes, shares, …)
• Cleanse data for search
• Rich content processing (Tika)
• Natural Language Processing (OpenNLP)
• Filter / drop documents
• Classify content
+
AEM: Data Model (1/3)
• Use a serializable object to represent your document
• In fact, use a HashMap
• No dependency object graph
• Most search platforms already think of documents as a
series of key/value pairs
• Use key name prefixes to model:
• Index operation type (aem.op)
• Document Fields (aem.field.<field>)
• Metadata (aem.meta.<field>)
+
AEM: Data Model (1/3)
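// Requires: import java.util.HashMap;
// Assumes `page` is the com.day.cq.wcm.api.Page being indexed.
HashMap<String, Object> jmsDoc = new HashMap<>();
// Operation type: tells the consumer what to do with this document
jmsDoc.put("aem.op.type", "ADD_DOC");
// Document fields: become fields in the search index
jmsDoc.put("aem.field.id", page.getPath());
jmsDoc.put("aem.field.crxPath", page.getPath());
jmsDoc.put("aem.field.url", page.getPath() + ".html");
jmsDoc.put("aem.field.title", page.getTitle());
jmsDoc.put("aem.field.description", page.getDescription());
// Metadata: travels with the message alongside the document fields
jmsDoc.put("aem.meta.foo", "bar");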
+
AEM: Listener / JMS Producer (2/3)
+
• Create an AEM Listener
• Implement EventHandler interface
• Listen for the PageEvent topics
• Convert the Page resource to our data model
• Add operation type
• Add document fields
• Add metadata fields
• Send the message to JMS index topic
• Example: JmsIndexListener.java
AEM: JMS Camel Consumer (3/3)
+
• Define your Camel runtime (e.g., standalone, OSGi, etc.)
• Define your Camel routes
• Consume JMS topic
• Route operation type using content-based router
• Enrich document as needed
• Convert JMS document model to Solr model
• Submit index request
• Example: AemToSolr.java
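The routing and conversion steps above can be illustrated without Camel or JMS. The sketch below is not AemToSolr.java: the class and method names are hypothetical, `DELETE_DOC` is an assumed operation type (only `ADD_DOC` appears in the slides), and the "index request" is just a descriptive string. It shows how the key prefixes let a consumer route on `aem.op.type` and derive the search document by stripping `aem.field.`.

```java
import java.util.HashMap;
import java.util.Map;

public class AemDocRouterSketch {

    static final String OP_KEY = "aem.op.type";
    static final String FIELD_PREFIX = "aem.field.";

    /** Keeps only "aem.field.*" entries and strips the prefix to get index field names. */
    static Map<String, Object> toSolrFields(Map<String, Object> jmsDoc) {
        Map<String, Object> fields = new HashMap<>();
        for (Map.Entry<String, Object> entry : jmsDoc.entrySet()) {
            if (entry.getKey().startsWith(FIELD_PREFIX)) {
                fields.put(entry.getKey().substring(FIELD_PREFIX.length()), entry.getValue());
            }
        }
        return fields;
    }

    /** Content-based routing on the operation type; returns a description of the request. */
    static String route(Map<String, Object> jmsDoc) {
        Object op = jmsDoc.get(OP_KEY);
        if ("ADD_DOC".equals(op)) {
            return "add " + toSolrFields(jmsDoc).get("id");
        }
        if ("DELETE_DOC".equals(op)) { // assumed operation type, not taken from the slides
            return "delete " + jmsDoc.get(FIELD_PREFIX + "id");
        }
        return "unsupported operation: " + op;
    }

    public static void main(String[] args) {
        Map<String, Object> jmsDoc = new HashMap<>();
        jmsDoc.put("aem.op.type", "ADD_DOC");
        jmsDoc.put("aem.field.id", "/content/site/en");
        jmsDoc.put("aem.field.title", "Home");
        jmsDoc.put("aem.meta.foo", "bar");
        System.out.println(route(jmsDoc));
        System.out.println(toSolrFields(jmsDoc));
    }
}
```

Since the conversion depends only on key prefixes, the same message could be mapped to a different search engine by swapping this one step.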
+
Demo
Demo Prerequisites
• Java 8 / Maven 3.2.x
• AEM 6.1
• http://www.aemsolrsearch.com
• https://github.com/GastonGonzalez/aem-solr-search-product-sample
• Best Buy API Key
• Vagrant and VirtualBox
+
+
Camel Runtime Options
• Java main: CamelContext
• Java main: Wrapper
• OSGi Runtime
Resources
• My Blog - http://www.gastongonzalez.com/
• AEM Solr Search - http://www.aemsolrsearch.com
• Apache Camel
• http://camel.apache.org/index.html
• https://www.manning.com/books/camel-in-action-second-edition
• Contact Us: aemsolr@headwire.com
+
In summary…
+
• If you do not need enrichment, keep it simple and
use a direct indexing approach.
• If you need to enrich your AEM content,
consider using Camel as your document processing
platform.
• This architecture is NOT search-specific!
• Syndicate AEM content to other systems
• Workflow replacement
+
THANK YOU.

Consuming External Content and Enriching Content with Apache Camel

  • 1. Advanced AEM Search: Consuming External Content and Enriching Content with Apache Camel. Presented by: Gaston Gonzalez, headwire.com, Inc.
  • 2. About Me • Senior Technical Architect at headwire.com, Inc. • Search Engineer / Developer • AEM Architect / Developer • Creator of AEM Solr Search • Tech Blogger • UNIX Systems Administrator
  • 3. Typical AEM + Search Integration
  • 4. Typical AEM + Search Architecture
  • 5. Typical AEM + Search Architecture. Pros: • Straightforward implementation • Simple architecture (AEM + Search). Cons: • Complete data model in AEM? Not all data may be in AEM • Processing overhead: data cleansing, transformation and enrichment handled in AEM • Fault tolerance: what if Solr is down? • Tight coupling to search platform
  • 7. Goals for a better Architecture • Offload processing outside of AEM • Improve fault tolerance • Provide a flexible platform for data cleansing, transformation and aggregation • Allow for changes to indexing logic without impacting AEM • Search engine agnostic
  • 8. Introduce an ETL / Document Processor
  • 10. Document Processing Platform • Roles & Responsibilities: enriches submitted documents prior to indexing; submits documents for indexing. • Terms & Definitions: • Enrichment: data cleansing, filtering, transformation, aggregation, etc. • Processing Stage: independent processing unit responsible for contributing to the enrichment process. • Pipeline: consists of one or more processing stages or sub-pipelines.
  • 12. Document processing is really an integration problem, right? Options, from low to high complexity: • Integration Library: Apache Camel • Integration Framework & Stream Processing: Spring Integration, Spring Cloud Data Flow & Cloud Stream • Enterprise Service Bus: Mule ESB
  • 14. Apache Camel • A lightweight, open source integration library. • Mediation engine • Implements well-known Enterprise Integration Patterns (EIPs): Aggregator, Content Enricher, Content-based router, Message, Message Translator, Pipes and Filters, Splitter…
  • 15. Why Apache Camel? • Lightweight: it’s a JAR • Imposes no runtime constraints • Routing engine • Powerful, fluent Java DSL • Mature open source project • Extensive list of integration components • Avoid writing boilerplate code by leveraging EIPs
  • 16. Apache Camel & EIP Concepts • Message: unit of information exchange between applications • Exchange: wraps inbound & outbound message + headers • Message Channel: allows applications to communicate using messaging • Pipes and Filters: perform loosely coupled processing on a message; Routes and Processors in Camel
  • 19. Importing Product Content into Solr. Problem: “As an AEM developer, I need to import product content into Solr so that I can display products via search and on PDPs on my AEM-powered site.” Let’s use Best Buy’s Product API as an example… 1. Fetch product data ZIP file via HTTP request. 2. Unzip product data. 3. Parse each JSON file to extract individual products. 4. Transform, enrich and cleanse each product as necessary. 5. Submit each product to Solr for indexing.
  • 21. A solution using Camel
  • 22. A short list of Camel Components: AMQP, ATOM, AWS, Bean, Box, Cache, CouchDB, Elasticsearch, File, Git, HTTP / HTTP4, JCR, JDBC, JMS, Jsch, Log, MongoDB, Netty / Netty4, RabbitMQ, Rest, RSS, Solr, Apache Spark, SQL, Timer, XSLT, Quartz. Full list: http://camel.apache.org/components.html
  • 23. Back to AEM and indexing AEM content…
  • 24. A Better AEM + Search Architecture
  • 25. Enrichment Use Cases for AEM • Search Relevancy • Merge ratings and review signals • Merge analytics signals (visits, page views…) • Merge social signals (likes, shares, …) • Cleanse data for search • Rich content processing (Tika) • Natural Language Processing (OpenNLP) • Filter / drop documents • Classify content
  • 26. AEM: Data Model (1/3) • Use a serializable object to represent your document • In fact, use a HashMap • No dependency object graph • Most search platforms already think of documents as a series of key/value pairs • Use key name prefixes to model: • Index operation type (aem.op) • Document Fields (aem.field.<field>) • Metadata (aem.meta.<field>)
  • 27. AEM: Data Model (1/3)
      HashMap<String, Object> jmsDoc = new HashMap<String, Object>();
      // Operation Type
      jmsDoc.put("aem.op.type", "ADD_DOC");
      // Document fields
      jmsDoc.put("aem.field.id", page.getPath());
      jmsDoc.put("aem.field.crxPath", page.getPath());
      jmsDoc.put("aem.field.url", page.getPath() + ".html");
      jmsDoc.put("aem.field.title", page.getTitle());
      jmsDoc.put("aem.field.description", page.getDescription());
      // Metadata
      jmsDoc.put("aem.meta.foo", "bar");
  • 28. AEM: Listener / JMS Producer (2/3) • Create an AEM Listener • Implement EventHandler interface • Listen for the PageEvent topics • Convert the Page resource to our data model • Add operation type • Add document fields • Add metadata fields • Send the message to JMS index topic • Example: JmsIndexListener.java
  • 29. AEM: JMS Camel Consumer (3/3) • Define your Camel runtime (e.g., standalone, OSGi, etc.) • Define your Camel routes • Consume JMS topic • Route operation type using content-based router • Enrich document as needed • Convert JMS document model to Solr model • Submit index request • Example: AemToSolr.java
  • 31. Demo Prerequisites • Java 8 / Maven 3.2.x • AEM 6.1 • http://www.aemsolrsearch.com • https://github.com/GastonGonzalez/aem-solr-search-product-sample • Best Buy API Key • Vagrant and VirtualBox
  • 36. Resources • My Blog - http://www.gastongonzalez.com/ • AEM Solr Search - http://www.aemsolrsearch.com • Apache Camel • http://camel.apache.org/index.html • https://www.manning.com/books/camel-in-action-second-edition • Contact Us: aemsolr@headwire.com
  • 37. In summary… • If you do not need enrichment, keep it simple and use a direct indexing approach. • If you need to enrich your AEM content, consider using Camel as your document processing platform. • This architecture is NOT search-specific! • Syndicate AEM content to other systems • Workflow replacement
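
Slides 26 to 29 describe a prefixed key/value document, a content-based router on the operation type, and enrichment before the conversion to the Solr model. A minimal plain-Java sketch of that flow follows. It deliberately uses no Camel or JMS API, so it illustrates the idea and is not the code in JmsIndexListener.java or AemToSolr.java. The class IndexingSketch, the NORMALIZE_TITLE stage, the titleNormalized field, the DELETE_DOC operation and the returned action strings are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.UnaryOperator;

/** Plain-Java sketch of the slide 26-29 flow. All names here are hypothetical. */
class IndexingSketch {

    /** A processing stage: takes a document map and returns the enriched map. */
    interface Stage extends UnaryOperator<Map<String, Object>> {}

    /** Hypothetical enrichment stage: adds a trimmed, lower-cased copy of the title. */
    static final Stage NORMALIZE_TITLE = doc -> {
        Object title = doc.get("aem.field.title");
        if (title != null) {
            doc.put("aem.field.titleNormalized",
                    title.toString().trim().toLowerCase(Locale.ROOT));
        }
        return doc;
    };

    /** Pipeline: applies each stage in order to a copy of the document. */
    static Map<String, Object> runPipeline(Map<String, Object> doc, List<Stage> stages) {
        Map<String, Object> current = new HashMap<>(doc);
        for (Stage stage : stages) {
            current = stage.apply(current);
        }
        return current;
    }

    /** Converts the prefixed model to a flat search document: keeps aem.field.* keys only. */
    static Map<String, Object> toSearchDocument(Map<String, Object> doc) {
        String prefix = "aem.field.";
        Map<String, Object> out = new HashMap<>();
        for (Map.Entry<String, Object> entry : doc.entrySet()) {
            if (entry.getKey().startsWith(prefix)) {
                out.put(entry.getKey().substring(prefix.length()), entry.getValue());
            }
        }
        return out;
    }

    /** Content-based router: chooses an action from the aem.op.type key. */
    static String route(Map<String, Object> doc, List<Stage> stages) {
        Object op = doc.get("aem.op.type");
        if ("ADD_DOC".equals(op)) {
            Map<String, Object> searchDoc = toSearchDocument(runPipeline(doc, stages));
            return "index " + searchDoc.get("id");
        }
        if ("DELETE_DOC".equals(op)) { // assumed operation name, not from the slides
            return "delete " + doc.get("aem.field.id");
        }
        return "dead-letter"; // unknown operation types are set aside, not indexed
    }
}
```

Keeping the operation type and metadata under their own prefixes means the router and the stages can read them freely, while only the aem.field.* entries reach the search document.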

Editor's Notes

  1. This is how AEM Solr Search 2.0.0 behaves.
  2. This is how AEM Solr Search 2.0.0 behaves.
  3. Can be thought of as an ETL. Terms & Definitions Processing Stage – Typically reusable. DoT – Do One Thing.
  4. Can be thought of as an ETL. Terms & Definitions Processing Stage – Typically reusable. DoT – Do One Thing.
  5. Many defunct, search-specific projects: OpenPipe, Pypes, OpenPipeline Other interesting search-specific pipelines include: Hydra by Findwise
  6. Mediation EIPs Aggregator - Content Enricher - Content-based router - Message - Message Translator - Pipes and Filters - Polling Consumer - Splitter -
  7. Declarative Spring-based, route definition also available
  8. Declarative Spring-based, route definition also available
  9. Declarative Spring-based, route definition also available
  10. Take a minute to picture how much code would be needed to achieve this goal. Is most of it boilerplate (e.g., setting up HTTP client, dealing with file input/output, marshaling/unmarshaling JSON, etc.)?
  11. TODO: Add transform
  12. 3 routes defined, all of which are asynchronous Demo code available
  13. Declarative Spring-based, route definition also available
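
Note 10 asks how much of the product import on slide 19 is boilerplate. As a rough illustration, here is step 2 alone (unzip the product data) written against the JDK's java.util.zip, with no HTTP fetch and no JSON parsing. The class ZipPlumbingSketch, the in-memory sample archive and the entry name products_0001.json are assumptions for the sketch, not details of the Best Buy feed or of the demo code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hand-written unzip plumbing for slide 19, step 2. All names are hypothetical. */
class ZipPlumbingSketch {

    /** Reads every file entry of a ZIP archive into an entry-name to text map. */
    static Map<String, String> readEntries(byte[] zipBytes) {
        Map<String, String> files = new LinkedHashMap<>();
        try (ZipInputStream zin = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            byte[] buffer = new byte[8192];
            ZipEntry entry;
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.isDirectory()) {
                    continue; // directories carry no content
                }
                ByteArrayOutputStream content = new ByteArrayOutputStream();
                int read;
                while ((read = zin.read(buffer)) != -1) {
                    content.write(buffer, 0, read);
                }
                files.put(entry.getName(),
                        new String(content.toByteArray(), StandardCharsets.UTF_8));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return files;
    }

    /** Builds a one-entry ZIP in memory so the sketch runs without a download. */
    static byte[] sampleZip() {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zout = new ZipOutputStream(bytes)) {
            zout.putNextEntry(new ZipEntry("products_0001.json")); // assumed file name
            zout.write("[{\"sku\":1}]".getBytes(StandardCharsets.UTF_8));
            zout.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

Most of these lines are stream handling and error wrapping rather than product logic, which is the kind of code the notes suggest handing to an integration library.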