APACHE SLING & FRIENDS TECH MEETUP
BERLIN, 26-28 SEPTEMBER 2016
Integrating Apache Mahout with AEM
Ankit Gubrani & Rima Mittal
adaptTo() 2016
Speakers' Bio
Agenda
▪ Introduction to Apache Mahout
▪ Machine Learning
▪ Recommendations
▪ AEM with Apache Mahout
▪ Demo
▪ Extension Points
Introduction to Apache Mahout
What is Apache Mahout?
▪ A project of the Apache Software Foundation.
▪ Produces free implementations of scalable machine learning algorithms, written in Java.
History
▪ Started as a Lucene sub-project.
▪ Became an Apache top-level project (TLP) in April 2010.
▪ Latest version at the time of this talk: 0.12.2, released on 13 June 2016.
Why Apache Mahout?
▪ Increasing volume of data!
▪ Traditional data mining algorithms struggle to process very large datasets.
▪ Apache Mahout to the rescue!
Traditional Machine Learning
Machine Learning with Mahout
Applications
▪ Adobe, Facebook, LinkedIn, Twitter and Yahoo use Mahout internally.
▪ Twitter uses Mahout for interest modelling.
▪ Yahoo! uses Mahout for pattern mining.
Machine Learning
▪ Programming computers to optimize a performance criterion using example data or past experience.
▪ A branch of Artificial Intelligence.
▪ Computers evolve behavior based on empirical data.
Techniques
▪ Supervised Learning
  ➢ Use labelled training data to create a classifier that can predict output for unseen inputs.
▪ Unsupervised Learning
  ➢ Use unlabelled training data to infer a function that describes hidden structure in the data.
Machine Learning with Apache Mahout
▪ Data science use cases Mahout supports:
  ➢ Collaborative Filtering
  ➢ Clustering
  ➢ Classification
Collaborative Filtering
▪ User behavior mining to make product recommendations.
Clustering
▪ Organizing items into naturally occurring groups, such that items belonging to the same group are similar to each other.
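To make the idea of "naturally occurring groups" concrete, here is a minimal sketch of k-means on one-dimensional values in plain Java. It is an illustration only and does not use Mahout's own clustering code; the class name `KMeans1D`, the sample values, and the starting centroids are all assumptions made for the example.

```java
import java.util.Arrays;

/** Minimal one-dimensional k-means, for illustration only (not Mahout's implementation). */
final class KMeans1D {

    /** Returns the cluster index assigned to each point once the loop stops. */
    static int[] cluster(double[] points, double[] initialCentroids, int maxIterations) {
        double[] centroids = initialCentroids.clone();
        int[] assignment = new int[points.length];
        for (int iter = 0; iter < maxIterations; iter++) {
            // Assignment step: attach each point to its nearest centroid.
            for (int i = 0; i < points.length; i++) {
                int best = 0;
                for (int c = 1; c < centroids.length; c++) {
                    if (Math.abs(points[i] - centroids[c]) < Math.abs(points[i] - centroids[best])) {
                        best = c;
                    }
                }
                assignment[i] = best;
            }
            // Update step: move each centroid to the mean of the points assigned to it.
            double[] sum = new double[centroids.length];
            int[] count = new int[centroids.length];
            for (int i = 0; i < points.length; i++) {
                sum[assignment[i]] += points[i];
                count[assignment[i]]++;
            }
            boolean moved = false;
            for (int c = 0; c < centroids.length; c++) {
                if (count[c] == 0) {
                    continue; // an empty cluster keeps its centroid
                }
                double mean = sum[c] / count[c];
                if (mean != centroids[c]) {
                    centroids[c] = mean;
                    moved = true;
                }
            }
            if (!moved) {
                break; // no centroid changed, so the grouping is stable
            }
        }
        return assignment;
    }

    public static void main(String[] args) {
        // Hypothetical values that visibly fall into a "low" and a "high" group.
        double[] values = {1.0, 1.2, 0.8, 9.0, 9.5, 10.1};
        int[] groups = cluster(values, new double[] {0.0, 5.0}, 20);
        System.out.println(Arrays.toString(groups));
    }
}
```

No labels are supplied here: the two groups emerge only from how close the values are to each other, which is what distinguishes clustering from classification.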
Classification
▪ Learning from existing categorizations and assigning unclassified items to the best category.
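As a rough illustration of "learning from existing categorizations", the sketch below implements a 1-nearest-neighbour classifier in plain Java. This is a deliberately simple stand-in, not one of Mahout's classifiers; the class name, the feature layout and the product categories are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal 1-nearest-neighbour classifier, for illustration only. */
final class NearestNeighbourClassifier {

    private final List<double[]> examples = new ArrayList<>();
    private final List<String> labels = new ArrayList<>();

    /** Remembers one already-categorized example. */
    void train(double[] features, String category) {
        examples.add(features.clone());
        labels.add(category);
    }

    /** Assigns an unclassified item to the category of its closest known example. */
    String classify(double[] features) {
        if (examples.isEmpty()) {
            throw new IllegalStateException("no training data");
        }
        int best = 0;
        double bestDistance = Double.MAX_VALUE;
        for (int i = 0; i < examples.size(); i++) {
            double d = squaredDistance(examples.get(i), features);
            if (d < bestDistance) {
                bestDistance = d;
                best = i;
            }
        }
        return labels.get(best);
    }

    /** Squared Euclidean distance; both arrays are assumed to have the same length. */
    private static double squaredDistance(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) {
            double diff = a[i] - b[i];
            sum += diff * diff;
        }
        return sum;
    }

    public static void main(String[] args) {
        NearestNeighbourClassifier classifier = new NearestNeighbourClassifier();
        // Hypothetical features: {weight in kg, packed volume in litres}
        classifier.train(new double[] {3.0, 12.0}, "tent");
        classifier.train(new double[] {2.5, 10.0}, "tent");
        classifier.train(new double[] {0.5, 2.0}, "jacket");
        classifier.train(new double[] {0.7, 3.0}, "jacket");
        System.out.println(classifier.classify(new double[] {0.6, 2.5}));
    }
}
```

Unlike clustering, the categories here are given up front by the labelled examples; the classifier only decides which existing category fits a new item best.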
Recommendations
Apache Mahout Recommendation Engine
▪ Helps users find items they might like based on historical behavior and preferences.
▪ Mahout provides a rich set of components from which a customized recommender system can be constructed using a selection of algorithms.
Architecture
▪ Top-Level Packages
  ➢ DataModel
  ➢ UserSimilarity
  ➢ ItemSimilarity
  ➢ UserNeighborhood
  ➢ Recommender
AEM with Apache Mahout
Checklist
▪ AEM 6.2
▪ Mahout as a Maven dependency
    <dependency>
        <groupId>org.apache.mahout</groupId>
        <artifactId>mahout-mr</artifactId>
        <version>0.10.0</version>
    </dependency>
JCRDataModel
▪ DataModel
  ➢ Implementations representing a repository of information about users and their associated preferences.
  ➢ AbstractDataModel, JDBCDataModel, FileDataModel, GenericBooleanPrefDataModel, GenericDataModel.
  ➢ For AEM: JCRDataModel.
Code (1/2)
▪ Using the AEM JCRDataModel
    public JSONArray getUserBasedRecommendations(ResourceResolver resourceResolver,
            String userId, int numberOfRecommendations) {
        // Create a JCRDataModel to fetch preference data from the JCR
        DataModel model = JCRDataModel.createDataModel(resourceResolver);
        // ... continues with the recommendation steps in Code (2/2)
    }
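Mahout's recommender API identifies users and items by `long` IDs, while this method receives a `String userId`. The second code slide passes a `userIdHash` to the recommender, but the deck does not show how that value is derived. The following is a minimal sketch of one possible conversion, assuming an MD5-based hash; the class and method names are hypothetical and are not taken from the AEMMahout repository.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper that turns String IDs into the long IDs a recommender needs. */
final class UserIdHasher {

    /** Maps a String ID to a long built from the first 8 bytes of its MD5 digest. */
    static long toLongId(String id) {
        try {
            byte[] digest = MessageDigest.getInstance("MD5")
                    .digest(id.getBytes(StandardCharsets.UTF_8));
            long hash = 0L;
            for (int i = 0; i < 8; i++) {
                // Shift in one byte at a time, masking to avoid sign extension.
                hash = (hash << 8) | (digest[i] & 0xFFL);
            }
            return hash;
        } catch (NoSuchAlgorithmException e) {
            // MD5 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // "alice" is a made-up user ID used only for this example.
        System.out.println(toLongId("alice"));
    }
}
```

A hash like this is one-way and can in principle collide, so an integration that needs to translate recommended IDs back to repository paths would also have to store the mapping from hash to original ID.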
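Code (2/2)
▪ AEM-Mahout recommendation steps
    // 1. Choose how similarity between two users is measured
    UserSimilarity userSimilarity = getSimilarity(model);
    // 2. Select the neighborhood of users most similar to the target user
    UserNeighborhood neighborhood = getNeighbourHood(N_NEIGHOBUR_HOOD, userSimilarity, model);
    // 3. Build a user-based recommender from the model, neighborhood and similarity
    GenericUserBasedRecommender recommender =
            new GenericUserBasedRecommender(model, neighborhood, userSimilarity);
    // 4. Ask for recommendations (no rescorer, known items excluded)
    recommendations = recommender.recommend(userIdHash, numberOfRecommendations, null, false);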
AEM Product Recommendation
▪ User Based Recommendation
  ➢ Takes user ratings into consideration
  ➢ Based on PearsonCorrelationSimilarity
  ➢ Uses NearestNUserNeighborhood
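To show what these three choices amount to, here is a self-contained sketch in plain Java of a user-based recommender that uses Pearson correlation and a nearest-N neighborhood. It does not call Mahout and is not Mahout's implementation: the class `UserBasedSketch`, its methods, the sample ratings, and the decision to ignore negatively correlated neighbours are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java sketch of user-based recommendation (Pearson + nearest-N). */
final class UserBasedSketch {

    /** Pearson correlation over the items both users rated; NaN if undefined. */
    static double pearson(Map<Long, Double> a, Map<Long, Double> b) {
        List<double[]> pairs = new ArrayList<>();
        for (Map.Entry<Long, Double> e : a.entrySet()) {
            Double other = b.get(e.getKey());
            if (other != null) {
                pairs.add(new double[] {e.getValue(), other});
            }
        }
        if (pairs.isEmpty()) {
            return Double.NaN;
        }
        double meanA = 0, meanB = 0;
        for (double[] p : pairs) {
            meanA += p[0];
            meanB += p[1];
        }
        meanA /= pairs.size();
        meanB /= pairs.size();
        double cov = 0, varA = 0, varB = 0;
        for (double[] p : pairs) {
            double da = p[0] - meanA, db = p[1] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0.0 ? Double.NaN : cov / denom;
    }

    /** The n users most similar to userId, skipping users with undefined similarity. */
    static List<Long> nearestN(int n, long userId, Map<Long, Map<Long, Double>> ratings) {
        Map<Long, Double> sims = new HashMap<>();
        for (Map.Entry<Long, Map<Long, Double>> e : ratings.entrySet()) {
            if (e.getKey() == userId) {
                continue;
            }
            double s = pearson(ratings.get(userId), e.getValue());
            if (!Double.isNaN(s)) {
                sims.put(e.getKey(), s);
            }
        }
        List<Long> users = new ArrayList<>(sims.keySet());
        users.sort(Comparator.comparingDouble((Long u) -> sims.get(u)).reversed());
        return users.subList(0, Math.min(n, users.size()));
    }

    /** Ranks unrated items by the similarity-weighted average of neighbours' ratings. */
    static List<Long> recommend(long userId, int howMany, int n,
                                Map<Long, Map<Long, Double>> ratings) {
        Map<Long, Double> mine = ratings.get(userId);
        Map<Long, Double> weightedSum = new HashMap<>();
        Map<Long, Double> weightTotal = new HashMap<>();
        for (long neighbour : nearestN(n, userId, ratings)) {
            double sim = pearson(mine, ratings.get(neighbour));
            if (sim <= 0) {
                continue; // assumption of this sketch: ignore dissimilar neighbours
            }
            for (Map.Entry<Long, Double> e : ratings.get(neighbour).entrySet()) {
                if (mine.containsKey(e.getKey())) {
                    continue; // only recommend items the user has not rated yet
                }
                weightedSum.merge(e.getKey(), sim * e.getValue(), Double::sum);
                weightTotal.merge(e.getKey(), sim, Double::sum);
            }
        }
        List<Long> items = new ArrayList<>(weightedSum.keySet());
        items.sort(Comparator.comparingDouble(
                (Long i) -> weightedSum.get(i) / weightTotal.get(i)).reversed());
        return items.subList(0, Math.min(howMany, items.size()));
    }

    /** Made-up ratings: user ID -> (item ID -> rating). */
    static Map<Long, Map<Long, Double>> sampleRatings() {
        Map<Long, Map<Long, Double>> ratings = new HashMap<>();
        ratings.put(1L, rate(101, 5, 102, 3, 103, 4));
        ratings.put(2L, rate(101, 5, 102, 3, 103, 4, 104, 5));
        ratings.put(3L, rate(101, 1, 102, 5, 103, 2, 105, 5));
        return ratings;
    }

    /** Builds an item-to-rating map from alternating item ID and rating values. */
    private static Map<Long, Double> rate(long... itemThenRating) {
        Map<Long, Double> prefs = new HashMap<>();
        for (int i = 0; i + 1 < itemThenRating.length; i += 2) {
            prefs.put(itemThenRating[i], (double) itemThenRating[i + 1]);
        }
        return prefs;
    }

    public static void main(String[] args) {
        System.out.println(recommend(1L, 3, 2, sampleRatings()));
    }
}
```

In the sample data, user 2 rates the shared items exactly like user 1, so user 2's extra item becomes the recommendation, while user 3's opposite taste is left out. The in-memory nested map stands in for the DataModel; in the AEM integration that role is played by the JCR-backed model.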
Configuring JCRDataModel
▪ Configurations
  ➢ User Generated Content Path
      ▪ /content/usergenerated/asi/jcr
  ➢ Product Path
      ▪ /etc/commerce/products/geometrixx-outdoors
  ➢ Rating Resource Type
      ▪ Defaults to social/tally/components/response
Demo
Appendix
▪ https://mahout.apache.org/
▪ http://www.slideshare.net/VaradMeru/introduction-to-mahout-and-machine-learning
▪ https://www.youtube.com/watch?v=iMAMYzfRiS4
Clone the code!
Code Repository
▪ https://github.com/rimamittal/AEMMahout.git
Questions.
Thank you.
