© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Yotam Yarden
Data Scientist, Amazon Web Services
AWS Pop-up Loft Berlin 2018
Build a Recommendation Engine on AWS Today
Agenda
• Recommendation Engine – Why?
• Recommendation Engine – Common Techniques
• Introducing Amazon SageMaker
• Customer use cases
• Develop, Train & Deploy a Recommendation Engine
Artificial Intelligence
At Amazon (1995)
And today…
My Profile – amazon.de My Profile – amazon.com
Motivation
• Personalize and enhance customer experience
• Different goals:
• Increased time spent on a platform
• Suggest complementary items
• Customer satisfaction
Use Cases
Ecommerce:
• Amazon.com
Content:
• Movies (Netflix)
• Music (Amazon Music)
• Articles (The Globe and Mail)
Finance:
• Services Recommendation
• Stocks buying / selling
• Relevant news and stock related data
Education:
• Courses recommendations
Legal:
• Similar cases
Agenda
• Recommendation Engine – Why?
• Recommendation Engine – Common Techniques
• Introducing Amazon SageMaker
• Customer use cases
• Develop, Train & Deploy a Recommendation Engine
https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet?cmp=tw-data-na-article-engagement_sponsored+kibird
Supervised Machine Learning
• All labeled data is split into a train set and a test set
• Train set (with labels) → model training → model
• Test set → model → predictions
• |Predictions – True Labels| = error on the test set; the smaller the error, the more accurate the model
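The train/test flow above can be sketched in a few lines. This is a minimal illustration, not the speaker's code: the data is made up, and the "model" is a deliberately trivial mean predictor so that only the split and the error measurement are in focus. All function names are hypothetical.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle labeled rows and split them into a train set and a test set."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def fit_mean_model(train):
    """'Train' the simplest possible model: always predict the mean label."""
    mean = sum(label for _, label in train) / len(train)
    return lambda features: mean

def mean_absolute_error(model, test):
    """Average |prediction - true label| over the held-out test set."""
    return sum(abs(model(x) - y) for x, y in test) / len(test)

# Hypothetical labeled data: ((user, item), rating) pairs.
data = [((u, i), float((u + i) % 5 + 1)) for u in range(10) for i in range(10)]
train, test = train_test_split(data)
model = fit_mean_model(train)
mae = mean_absolute_error(model, test)
print(f"train={len(train)} test={len(test)} MAE={mae:.3f}")
```

The test rows are never seen during training, so the error measured on them estimates how the model behaves on new data.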
https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet?cmp=tw-data-na-article-engagement_sponsored+kibird
Test set (test / validation)
Naïve approach
Linear model? [type of user, movie genre, etc.]
Polynomial model? [+interactions]
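To make the naïve approach concrete, here is a small sketch of the two feature sets. The categories are invented for illustration. A linear model on the first encoding can only learn one weight per user type and one per genre; adding interactions gives one weight per (user type, genre) pair, but the number of features grows with the product of the two vocabularies.

```python
from itertools import product

USER_TYPES = ["student", "parent"]      # hypothetical categories
GENRES = ["comedy", "drama", "horror"]  # hypothetical categories

def linear_features(user_type, genre):
    """One-hot encode each attribute independently (linear model input)."""
    return ([1.0 if user_type == t else 0.0 for t in USER_TYPES]
            + [1.0 if genre == g else 0.0 for g in GENRES])

def interaction_features(user_type, genre):
    """Append one indicator per (user type, genre) pair (degree-2 terms)."""
    crosses = [1.0 if (user_type, genre) == pair else 0.0
               for pair in product(USER_TYPES, GENRES)]
    return linear_features(user_type, genre) + crosses

x_lin = linear_features("parent", "drama")
x_int = interaction_features("parent", "drama")
print(len(x_lin), len(x_int))
```

With individual user and item IDs instead of coarse categories, the interaction block would need one weight per user-item pair, most of which are never observed.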
Matrix Factorization
X ≈ User Embeddings × Item Embeddings
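A minimal sketch of matrix factorization trained with stochastic gradient descent, in plain Python. The ratings, embedding size, learning rate, and regularization strength are illustrative assumptions, not values from the talk; the talk's own examples use Apache MXNet.

```python
import random

def factorize(ratings, n_users, n_items, k=2, lr=0.05, reg=0.01,
              epochs=1000, seed=0):
    """Learn user and item embeddings so that dot(U[u], V[i]) ≈ rating."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(a * b for a, b in zip(U[u], V[i]))
            for f in range(k):
                uf, vf = U[u][f], V[i][f]
                # Gradient step on squared error with L2 regularization.
                U[u][f] += lr * (err * vf - reg * uf)
                V[i][f] += lr * (err * uf - reg * vf)
    return U, V

def predict(U, V, u, i):
    """Predicted rating is the dot product of the two embeddings."""
    return sum(a * b for a, b in zip(U[u], V[i]))

# Hypothetical observed (user, item, rating) triples; other cells are unknown.
ratings = [(0, 0, 5), (0, 1, 4), (1, 0, 5), (1, 2, 1), (2, 1, 1), (2, 2, 5)]
U, V = factorize(ratings, n_users=3, n_items=3)
rmse = (sum((r - predict(U, V, u, i)) ** 2 for u, i, r in ratings)
        / len(ratings)) ** 0.5
print(f"training RMSE: {rmse:.3f}")
print(f"user 0 on unseen item 2: {predict(U, V, 0, 2):.2f}")
```

Only observed cells contribute to the loss; the learned embeddings then fill in the unobserved cells of X.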
Matrix Factorization – “Neural Networks” Representation
Deep Matrix Factorization
Binary Predictions
Binary Predictions + Negative Sampling
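With binary data only positive interactions (clicks, purchases) are observed, so a model trained on them alone could simply predict 1 everywhere. Negative sampling adds unobserved user-item pairs with label 0. The sketch below assumes uniform sampling and a hypothetical `per_positive` ratio; note that an unobserved pair is not necessarily a disliked one.

```python
import random

def sample_negatives(positives, n_items, per_positive=2, seed=0):
    """For each observed (user, item) pair, draw items the user never touched."""
    rng = random.Random(seed)
    seen = {}
    for user, item in positives:
        seen.setdefault(user, set()).add(item)
    examples = []
    for user, item in positives:
        examples.append((user, item, 1))      # observed interaction
        candidates = [i for i in range(n_items) if i not in seen[user]]
        for neg in rng.sample(candidates, min(per_positive, len(candidates))):
            examples.append((user, neg, 0))   # sampled negative
    return examples

# Hypothetical observed (user, item) interactions.
positives = [(0, 1), (0, 3), (1, 0), (2, 4)]
examples = sample_negatives(positives, n_items=6)
for row in examples:
    print(row)
```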
Most of the Data is Still Untapped
• Images
• Titles
• Descriptions
• Reviews
• Episode Names
DSSM – Deep Structured Semantic Models
[Diagram: two towers. User side: user search as bag of words (BOW), dense layers, user embedding. Item side: title words (BOW) and images (ResNet), dense layers with dropout, concat, dense, item embedding. Output: score from the element-wise product (⨀) and sum (⨁) of the two embeddings.]
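A rough sketch of the two-tower forward pass in plain Python. It is not the architecture from the slide: the weights are random and untrained, dropout is omitted, layer sizes are arbitrary, and two made-up numbers stand in for ResNet image features. It only shows how text and image inputs are turned into a user embedding and an item embedding whose dot product is the score.

```python
import math
import random

def dense(x, weights, bias):
    """Fully connected layer with ReLU: one output per weight row."""
    return [max(0.0, sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(weights, bias)]

def init_dense(n_in, n_out, rng):
    """Random, untrained weights and zero biases for one dense layer."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def bag_of_words(tokens, vocab):
    """Count how often each vocabulary word occurs in the token list."""
    return [float(tokens.count(word)) for word in vocab]

rng = random.Random(0)
vocab = ["space", "robot", "love", "war"]   # hypothetical vocabulary
EMB = 3                                     # embedding size (assumed)

user_layer = init_dense(len(vocab), EMB, rng)
title_layer = init_dense(len(vocab), EMB, rng)
image_layer = init_dense(2, EMB, rng)       # 2 stand-in image features
item_layer = init_dense(2 * EMB, EMB, rng)

def score(search_tokens, title_tokens, image_features):
    """User tower and item tower, combined by a dot product."""
    user_emb = dense(bag_of_words(search_tokens, vocab), *user_layer)
    title_h = dense(bag_of_words(title_tokens, vocab), *title_layer)
    image_h = dense(image_features, *image_layer)
    item_emb = dense(title_h + image_h, *item_layer)  # concat, then dense
    # Element-wise product followed by a sum, i.e. a dot product.
    return sum(u * i for u, i in zip(user_emb, item_emb))

s = score(["space", "robot"], ["robot", "war"], [0.2, 0.7])
print(s)
```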
Which Technique to Choose? Roadmap Matrix (iterative process)
Data available → relevant algorithms → relative complexity
• Limited user data, binary user-item interaction → Matrix Factorization, Binary Matrix Factorization → 2
• User data, additional user-item interaction → Factorization Machines, DiFacto → 4
• More user data, extensive item data → DSSM → 5
• Extensive user data, extensive item data → Customized and more advanced DSSM → 5
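For the Factorization Machines step in the roadmap, the second-order model from the Rendle (2010) paper listed in the references predicts w0 + Σ w_i x_i + Σ_{i<j} ⟨v_i, v_j⟩ x_i x_j. The sketch below evaluates that formula two ways: with explicit pairs, and with the linear-time reformulation from the paper. The weights and the input vector are made-up numbers.

```python
def fm_predict_naive(x, w0, w, V):
    """Second-order FM with an explicit loop over all feature pairs."""
    n = len(x)
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = sum(
        sum(a * b for a, b in zip(V[i], V[j])) * x[i] * x[j]
        for i in range(n) for j in range(i + 1, n)
    )
    return w0 + linear + pairwise

def fm_predict(x, w0, w, V):
    """Same prediction in O(k * n), using
    sum_{i<j} <v_i, v_j> x_i x_j = 0.5 * sum_f ((sum_i v_if x_i)^2 - sum_i (v_if x_i)^2).
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    pairwise = 0.0
    for f in range(len(V[0])):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

# Hypothetical input: one-hot user (2 users) followed by one-hot item (2 items).
x = [1.0, 0.0, 1.0, 0.0]
w0 = 0.5
w = [0.1, 0.2, 0.3, 0.4]
V = [[1.0, 2.0], [0.0, 1.0], [3.0, -1.0], [2.0, 2.0]]

fast = fm_predict(x, w0, w, V)
slow = fm_predict_naive(x, w0, w, V)
print(fast, slow)
```

With only a user one-hot and an item one-hot as input, the pairwise term reduces to the dot product of one user vector and one item vector, which is matrix factorization with bias terms.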
Deployment Considerations
• Historical data size – 30d / 60d / 1y…
• Fine-tuning techniques (daily, weekly..)
• Inference – compressed model? Tradeoff between model complexity and inference latency
• Validation system setup
• Iterate fast and simple
Agenda
• Recommendation Engine – Why?
• Recommendation Engine – Common Techniques
• Introducing Amazon SageMaker
• Customer use cases
• Develop, Train & Deploy a Recommendation Engine
ML @ AWS: Our mission
Put machine learning in the hands of every developer and data scientist
Customers Running ML on AWS Today
ML is still too complicated for developers and data scientists
• Collect and prepare training data
• Choose and optimize your ML algorithm
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
Easily build, train, and deploy machine learning models
Amazon SageMaker
BUILD: Pre-built notebooks for common problems
• Choose and optimize your ML algorithm
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
BUILD: Pre-built notebooks for common problems; built-in, high performance algorithms
ALGORITHMS: K-Means Clustering, Principal Component Analysis, Neural Topic Modelling, Factorization Machines, Linear Learner - Regression, XGBoost, Latent Dirichlet Allocation, Image Classification, Seq2Seq, Linear Learner - Classification
FRAMEWORKS: Apache MXNet, TensorFlow, Caffe2, CNTK, PyTorch, Torch
• Set up and manage environments for training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
BUILD: Pre-built notebooks for common problems; built-in, high performance algorithms
TRAIN: One-click training
• Train and tune model (trial and error)
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
BUILD: Pre-built notebooks for common problems; built-in, high performance algorithms
TRAIN: One-click training; hyperparameter optimization
• Deploy model in production
• Scale and manage the production environment
Amazon SageMaker
BUILD: Pre-built notebooks for common problems; built-in, high performance algorithms
TRAIN: One-click training; hyperparameter optimization
DEPLOY: One-click deployment
• Scale and manage the production environment
Amazon SageMaker
BUILD: Pre-built notebooks for common problems; built-in, high performance algorithms
TRAIN: One-click training; hyperparameter optimization
DEPLOY: One-click deployment; fully managed hosting with auto-scaling
Agenda
• Recommendation Engine – Why?
• Recommendation Engine – Common Techniques
• Introducing Amazon SageMaker
• Customer Use Cases
• Develop, Train & Deploy a Recommendation Engine
Customer Use Case: Erento is using Amazon SageMaker to develop, train, and deploy
recommendation systems in their marketplace
“Erento’s in-house Data Science team is using Amazon SageMaker to build and deploy
ML models to solve item availability and decrease the enquiry-to-offer time through a
recommendation system, which suggests similar items that are available and increases
the chance for a successful booking. Using Amazon SageMaker reduced our
recommendation system building time from half a year to a few weeks and
reduced the algorithm training time from hours to a few seconds. It also helped us reduce
dependencies between projects, which has streamlined our whole pre-deployment
process.”
- Wassim Zoghlami, Data Scientist Engineer at Erento
Agenda
• Recommendation Engine – Why?
• Recommendation Engine – Common Techniques
• Introducing Amazon SageMaker
• Customer use cases
• Develop, Train & Deploy a Recommendation Engine
console
References
• https://www.oreilly.com/ideas/deep-matrix-factorization-using-apache-mxnet
• https://github.com/apache/incubator-mxnet
• https://github.com/awslabs/amazon-sagemaker-examples
• https://www.csie.ntu.edu.tw/~b97053/paper/Rendle2010FM.pdf
• https://www.youtube.com/watch?v=cftJAuwKWkA
• https://www.youtube.com/watch?v=1cRGpDXTJC8&t=640s