© 2019, Amazon Web Services, Inc. or its affiliates. All rights reserved. SUMMIT
Build Machine Learning Models with Amazon SageMaker
Julien Simon
Global Evangelist, AI & Machine Learning
@julsimon
The Amazon ML Stack: Broadest & Deepest Set of Capabilities
AI SERVICES
• Vision: Rekognition Image, Rekognition Video, Textract
• Speech: Polly, Transcribe
• Language: Translate, Comprehend, Comprehend Medical
• Chatbots: Lex
• Forecasting: Forecast
• Recommendations: Personalize
ML SERVICES
• Amazon SageMaker: Build, Train, Deploy
• Pre-built algorithms & notebooks
• Data labeling (Ground Truth)
• One-click model training & tuning
• Optimization (Neo)
• One-click deployment & hosting
• Models without training data (Reinforcement Learning)
• Algorithms & models (AWS Marketplace)
ML FRAMEWORKS & INFRASTRUCTURE
• Frameworks, Interfaces, Infrastructure
• EC2 P3 & P3dn, EC2 C5, FPGAs, Greengrass, Elastic Inference
Amazon SageMaker: Build, Train, and Deploy ML Models at Scale
1. Collect and prepare training data; choose and optimize your ML algorithm
2. Set up and manage environments for training; train and tune ML models
3. Deploy models in production; scale and manage the production environment
Machine learning cycle
• Business problem
• ML problem framing
• Data collection
• Data integration
• Data preparation and cleaning
• Data visualization and analysis
• Feature engineering
• Model training and parameter tuning
• Model evaluation
• Are business goals met? (YES / NO)
• Data augmentation
• Feature augmentation
• Model deployment
• Predictions
• Monitoring and debugging
• Re-training
Successful models require high-quality data
Amazon SageMaker Ground Truth
https://aws.amazon.com/blogs/aws/amazon-sagemaker-ground-truth-build-highly-accurate-datasets-and-reduce-labeling-costs-by-up-to-70
• Quickly label training data
• Easily integrate human labelers
• Get accurate results
KEY FEATURES
• Automatic labeling via machine learning
• Ready-made and custom workflows for image bounding box, segmentation, and text
• Label management
• Private and public human workforce
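A labeling job reads its items from an input manifest in JSON Lines format, where each line is a JSON object whose "source-ref" key holds the S3 URI of one item. A minimal sketch of building such a manifest; the bucket and object names are hypothetical and only serve as an illustration:

```python
import json


def build_input_manifest(s3_uris):
    """Return the text of a JSON Lines manifest, one JSON object per data item."""
    # Each line is a standalone JSON object; "source-ref" points at the S3 object to label.
    lines = [json.dumps({"source-ref": uri}) for uri in s3_uris]
    return "\n".join(lines) + "\n"


# Hypothetical bucket and keys, for illustration only.
manifest = build_input_manifest([
    "s3://example-bucket/images/floor-001.jpg",
    "s3://example-bucket/images/floor-002.jpg",
])
print(manifest, end="")
```

The resulting text would be uploaded to S3 and referenced when the labeling job is created.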
Manage data on AWS
[Machine learning cycle diagram, repeated]
Build and train models using SageMaker
[Machine learning cycle diagram, repeated]
Deploy models using SageMaker
[Machine learning cycle diagram, repeated]
Amazon SageMaker
1. Collect and prepare training data; choose and optimize your ML algorithm
2. Set up and manage environments for training; train and tune ML models
3. Deploy models in production; scale and manage the production environment
• Model compilation
• Elastic inference
• Inference pipelines
• P3DN, C5N
• TensorFlow on 256 GPUs
• Resume HPO tuning job
• New built-in algorithms
• scikit-learn environment
• Model marketplace
• Search
• Git integration
AWS Machine Learning Marketplace
The Amazon SageMaker API
• Python SDK orchestrating all Amazon SageMaker activity
• High-level objects for algorithm selection, training, deploying, automatic model tuning, etc.
• Spark SDK (Python & Scala)
• AWS CLI: ‘aws sagemaker’
• AWS SDK: boto3, etc.
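To make the lowest layer concrete, here is a sketch of a training job request as it could be passed to the boto3 `create_training_job` call. The job name, image URI, role ARN, S3 locations, instance type, and time limit are all placeholders chosen for illustration, and the `submit` helper is not called because it needs AWS credentials:

```python
def training_job_request(job_name, image_uri, role_arn, train_s3_uri, output_s3_uri,
                         hyperparameters=None):
    """Build keyword arguments for the SageMaker CreateTrainingJob API call."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,       # container holding the training code
            "TrainingInputMode": "File",      # copy the dataset to the instance first
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        "ResourceConfig": {
            "InstanceType": "ml.m4.xlarge",   # placeholder instance type
            "InstanceCount": 1,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
        # The API expects hyperparameter values as strings.
        "HyperParameters": {k: str(v) for k, v in (hyperparameters or {}).items()},
    }


def submit(request):
    """Send the request with boto3 (needs AWS credentials; not called here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed
    return boto3.client("sagemaker").create_training_job(**request)
```

The high-level Python SDK wraps this kind of request in estimator objects, so most notebooks never build the dictionary by hand.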
Model options
Training code:
• Built-in Algorithms: Factorization Machines, Linear Learner, Principal Component Analysis, K-Means Clustering, XGBoost, and more
• Bring Your Own Script
• Bring Your Own Container
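For the "Bring Your Own Script" option, the framework containers typically pass hyperparameters to the script as command-line arguments and expose data and model locations through environment variables such as SM_CHANNEL_TRAIN and SM_MODEL_DIR. A minimal sketch of the argument handling, assuming a channel named "train" and two hypothetical hyperparameters:

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and container-provided locations."""
    parser = argparse.ArgumentParser()
    # Hyperparameters arrive as command-line arguments (these names are hypothetical).
    parser.add_argument("--epochs", type=int, default=10)
    parser.add_argument("--learning-rate", type=float, default=0.01)
    # Locations come from environment variables set inside the training container;
    # the local fallbacks make the script usable outside SageMaker too.
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--train", default=os.environ.get("SM_CHANNEL_TRAIN", "./data"))
    return parser.parse_args(argv)


# Explicit arguments for demonstration; a real script would call parse_args() with no arguments.
args = parse_args(["--epochs", "3"])
# A real script would now load data from args.train, fit a model, and save it to args.model_dir.
print(f"epochs={args.epochs} lr={args.learning_rate}")
```

Because the locations fall back to local directories, the same script can be tested on a laptop before it is submitted as a training job.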
IMAGE RECOGNITION | for the good of | PRODUCT SEARCH
Sébastien BUTREAU
sebastien.butreau@tarkett.com
Group IT Projects & CCOE Manager
Tarkett
A worldwide leader in flooring & sports surfaces
→ €2.8B net sales (2018 figures)
→ 13,000 employees
→ Present in more than 100 countries
→ 1.3M square meters of flooring sold each day
Could we recommend products to each user?
VINYL & LINOLEUM | CARPET | WOOD & LAMINATE | ACCESSORIES & RUBBER | SPORTS SURFACES
«Building a bot»
Why image search?
Why does it make sense in the context of flooring?
• Field research showed that architects and designers were already leveraging image search engines
• Because getting results from an image search engine leads to inspiration
Project Aladdin
Deep Learning
GPU instances
Why we used Amazon SageMaker
• Go quicker from idea to production
• Distributed training out of the box
• One line of code to deploy models
• Almost the same cost as Amazon EC2 ($300/month)
Demo: https://professionnels.tarkett.fr
Next steps
Explore new possibilities opened by image search
• Find the best substitution when a product is out of stock
• Understand « in-situ » the real user demand
• Incorporate external factors (seasons, fashion, styles)
• Position our products against the competition
• Generate new designs
Thank you!
Sébastien BUTREAU
sebastien.butreau@tarkett.com
Group IT Projects & CCOE Manager
Tarkett
Built-in algorithms
(Slide legend: orange = supervised, yellow = unsupervised)
• Linear Learner: regression, classification
• Factorization Machines: regression, classification, recommendation
• K-Nearest Neighbors: non-parametric regression and classification
• XGBoost: regression, classification, ranking (https://github.com/dmlc/xgboost)
• K-Means: clustering
• Principal Component Analysis: dimensionality reduction
• Random Cut Forest: anomaly detection
• Object2Vec: general-purpose embedding
• Image Classification: Deep Learning (ResNet)
• Object Detection (SSD): Deep Learning (VGG or ResNet)
• Semantic Segmentation: Deep Learning
• Neural Topic Model: topic modeling
• Latent Dirichlet Allocation: topic modeling (mostly)
• BlazingText: GPU-based Word2Vec, and text classification
• Sequence to Sequence: machine translation, speech to text and more
• DeepAR: time-series forecasting (RNN)
• IP Insights: usage patterns for IP addresses
Demo: Text Classification with BlazingText
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/introduction_to_amazon_algorithms/blazingtext_text_classification_dbpedia
https://dl.acm.org/citation.cfm?id=3146354
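In text classification mode, BlazingText expects one example per line, with the label prefixed by `__label__` and followed by space-separated tokens. A minimal sketch of that preprocessing step; the regex tokenizer is a simplifying assumption and the sample sentence is made up:

```python
import re


def to_blazingtext_line(label, text):
    """Format one example as '__label__<label> token token ...'."""
    # Simple lowercase tokenization: runs of word characters, or single punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return f"__label__{label} " + " ".join(tokens)


print(to_blazingtext_line(3, "Amazon SageMaker is a managed service."))
# prints: __label__3 amazon sagemaker is a managed service .
```

Each formatted line would be written to the training and validation files that are uploaded to S3.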
Demo: Image classification with Caltech-256
https://gitlab.com/juliensimon/dlnotebooks/sagemaker/
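When the built-in image classification algorithm is fed raw image files, it reads a tab-separated .lst file with three columns: image index, class label (starting at 0), and relative image path. A sketch of generating those lines, assuming Caltech-256 style paths where each directory name starts with a 1-based class number (for example `001.ak47/001_0001.jpg`):

```python
def caltech_lst_lines(relative_paths):
    """Yield 'index<TAB>label<TAB>path' lines for paths like '001.ak47/001_0001.jpg'."""
    for index, path in enumerate(sorted(relative_paths)):
        class_dir = path.split("/")[0]
        # Assumes directory names start with a 1-based class number: '001.ak47' -> label 0.
        label = int(class_dir.split(".")[0]) - 1
        yield f"{index}\t{label}\t{path}"


for line in caltech_lst_lines(["002.american-flag/002_0001.jpg", "001.ak47/001_0001.jpg"]):
    print(line)
```

The notebooks linked above may prepare the dataset differently, for instance by converting it to RecordIO files.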
Amazon SageMaker
1. Build: collect and prepare training data; choose and optimize your ML algorithm
2. Train: set up and manage environments for training; train and tune ML models
3. Deploy: deploy models in production; scale and manage the production environment
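Once a model is deployed in production, applications send prediction requests to its endpoint. A minimal sketch of such a request using the boto3 `sagemaker-runtime` client, assuming a hypothetical endpoint whose model accepts CSV input; the `predict` helper is not called here because it needs AWS credentials and a live endpoint:

```python
def invoke_args(endpoint_name, rows):
    """Build keyword arguments for the SageMaker runtime InvokeEndpoint call."""
    # Serialize each row of feature values as one CSV line.
    body = "\n".join(",".join(str(value) for value in row) for row in rows)
    return {
        "EndpointName": endpoint_name,
        "ContentType": "text/csv",
        "Body": body.encode("utf-8"),
    }


def predict(endpoint_name, rows):
    """Call the endpoint and return the raw response text (not called here)."""
    import boto3  # imported lazily so the sketch loads without boto3 installed
    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(**invoke_args(endpoint_name, rows))
    return response["Body"].read().decode("utf-8")
```

The response format depends on the model behind the endpoint, so parsing is left to the caller.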
Getting started
http://aws.amazon.com/free
https://ml.aws
https://aws.amazon.com/sagemaker
https://github.com/aws/sagemaker-python-sdk
https://github.com/aws/sagemaker-spark
https://github.com/awslabs/amazon-sagemaker-examples
https://gitlab.com/juliensimon/ent321
https://medium.com/@julsimon
https://gitlab.com/juliensimon/dlnotebooks
Thank you!
Julien Simon
Global Evangelist, AI and Machine Learning
@julsimon
https://medium.com/@julsimon

Build Machine Learning Models with Amazon SageMaker (April 2019)
