© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Julien Simon
Principal AI/ML Evangelist, Amazon Web Services
Speed up your Machine Learning workflows with built-in algorithms
@julsimon
Amazon SageMaker
Build
• Pre-built notebook instances
• Highly-optimized machine learning algorithms
Train
• One-click training for ML, DL, and custom algorithms
• Easier training with hyperparameter optimization
Deploy
• Deployment without engineering effort
• Fully-managed hosting at scale
Amazon SageMaker: model options
Training code
Amazon provided Algorithms
• Matrix Factorization
• Regression
• Principal Component Analysis
• K-Means Clustering
• Gradient Boosted Trees
• And More!
Bring Your Own Script
IM Estimators in Apache Spark
Bring Your Own Container
Amazon SageMaker: 10x better algorithms
• Streaming datasets, for cheaper training
• Train faster, in a single pass
• Greater reliability on extremely large datasets
• Choice of several ML algorithms
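To make "streaming" and "single pass" concrete, here is a minimal sketch of one-pass training over a record stream. It is an illustration only and not SageMaker's implementation: the data source `record_stream`, the function `train_single_pass`, and the learning rate are all hypothetical. The point is that memory holds only the model weights, never the dataset.

```python
import random
from typing import Iterable, Iterator, List, Tuple

Record = Tuple[List[float], float]


def record_stream(n: int, seed: int = 0) -> Iterator[Record]:
    """Hypothetical data source: yields (features, label) one record at a time."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)]
        y = 3.0 * x[0] - 2.0 * x[1] + 0.5  # noiseless linear target
        yield x, y


def train_single_pass(stream: Iterable[Record], dim: int, lr: float = 0.05):
    """One SGD step per record; only the weights and bias stay in memory."""
    w = [0.0] * dim
    b = 0.0
    seen = 0
    for x, y in stream:
        pred = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = pred - y
        for i in range(dim):
            w[i] -= lr * err * x[i]
        b -= lr * err
        seen += 1
    return w, b, seen


weights, bias, seen = train_single_pass(record_stream(20000), dim=2)
print(seen, [round(v, 3) for v in weights], round(bias, 3))
```

Because each record is consumed once and then discarded, memory use stays flat as the dataset grows, and training time grows linearly with the number of records.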
Infinitely scalable algorithms
Streaming
[Diagram: GPU, State]
[Plots: Memory vs. Data Size; Time/Cost vs. Data Size]
Distributed
[Diagram: three GPUs, each with its own State]
Shared State
[Diagram: three GPUs, each with a Local State, plus a Shared State]
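One way to picture local and shared state is a mergeable summary: each worker streams over its own shard, keeps a small local state, and the local states are then combined into a shared one. The sketch below uses running mean and variance as the state. It is an assumption chosen for illustration (the `State` class, `local_pass`, and the shards are hypothetical) and does not describe how SageMaker synchronizes state.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class State:
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0  # sum of squared deviations from the mean

    def update(self, x: float) -> None:
        """Streaming (Welford) update from a single value."""
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "State") -> "State":
        """Combine two states without revisiting the data behind either one."""
        if other.count == 0:
            return State(self.count, self.mean, self.m2)
        if self.count == 0:
            return State(other.count, other.mean, other.m2)
        n = self.count + other.count
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / n
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / n
        return State(n, mean, m2)


def local_pass(shard: Iterable[float]) -> State:
    """What one worker does: a single streaming pass over its own shard."""
    state = State()
    for x in shard:
        state.update(x)
    return state


# Three hypothetical workers, each holding a different shard of the data.
shards: List[List[float]] = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0, 7.0, 8.0, 9.0]]
shared = State()
for local in (local_pass(shard) for shard in shards):
    shared = shared.merge(local)

variance = shared.m2 / shared.count
print(shared.count, shared.mean, variance)
```

The merged result matches what a single machine would compute over all the data, which is the property that lets work be split across machines.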
Cost vs. Time
[Plot: Cost ($ to $$$$) vs. Time (Minutes, Hours, Days, Weeks, Months); series: Best Alternative, Amazon SageMaker]
Linear Learner
Regression (mean squared error)
SageMaker   Other
1.02        1.06
1.09        1.02
0.332       0.183
0.086       0.129
83.3        84.5
Classification (F1 Score)
SageMaker   Other
0.980       0.981
0.870       0.930
0.997       0.997
0.978       0.964
0.914       0.859
0.470       0.472
0.903       0.908
0.508       0.508
30 GB datasets for web-spam and web-url classification
[Plot: Cost in Dollars (0 to 1.2) vs. Billable Time in Minutes (0 to 30); series: sagemaker-url, sagemaker-spam, other-url, other-spam]
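For reference, the two metrics used on this slide can be computed as follows. This is a plain-Python sketch of the standard definitions; the function names and sample values are illustrative and are not taken from the benchmark.

```python
from typing import Sequence


def mean_squared_error(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Average squared difference between targets and predictions (lower is better)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Harmonic mean of precision and recall for binary labels (higher is better)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


mse = mean_squared_error([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
print(mse, f1)
```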
Factorization Machines
                  Log_loss   F1 Score   Seconds
SageMaker         0.494      0.277      820
Other (10 Iter)   0.516      0.190      650
Other (20 Iter)   0.507      0.254      1300
Other (50 Iter)   0.481      0.313      3250
Click Prediction 1 TB advertising dataset, m4.4xlarge machines, perfect scaling.
[Plot: Cost in Dollars ($0 to $200) vs. Billable Time in Hours (1 to 8); points labeled 10, 20, 30, 40, 50 machines]
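As background, a factorization machine scores a feature vector with a bias, a linear term, and pairwise interactions whose weights are dot products of per-feature latent vectors. The sketch below follows the standard second-order model and its linear-time rewrite; the parameter values are made up for illustration, and nothing here is taken from SageMaker's implementation. For click prediction the raw score would typically be passed through a sigmoid.

```python
from typing import List


def fm_score_naive(x: List[float], w0: float, w: List[float], v: List[List[float]]) -> float:
    """Direct form: loops over every feature pair, O(k * n^2)."""
    n = len(x)
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(v[i], v[j]))
            score += dot * x[i] * x[j]
    return score


def fm_score_fast(x: List[float], w0: float, w: List[float], v: List[List[float]]) -> float:
    """Equivalent O(k * n) form: 0.5 * sum_f [(sum_i v_if x_i)^2 - sum_i (v_if x_i)^2]."""
    n, k = len(x), len(v[0])
    score = w0 + sum(w[i] * x[i] for i in range(n))
    for f in range(k):
        s = sum(v[i][f] * x[i] for i in range(n))
        s_sq = sum((v[i][f] * x[i]) ** 2 for i in range(n))
        score += 0.5 * (s * s - s_sq)
    return score


# Hypothetical parameters: 4 features, 2 latent factors per feature.
x = [1.0, 0.0, 1.0, 1.0]
w0 = 0.1
w = [0.2, -0.1, 0.3, 0.0]
v = [[0.1, 0.2], [0.3, -0.1], [0.0, 0.5], [-0.2, 0.4]]

naive = fm_score_naive(x, w0, w, v)
fast = fm_score_fast(x, w0, w, v)
print(naive, fast)
```

The linear-time form is what makes the model practical on sparse, very wide inputs such as click logs, since only non-zero features contribute to the sums.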
Demo: building a movie recommender with Factorization Machines
https://medium.com/@julsimon/building-a-movie-recommender-with-factorization-machines-on-amazon-sagemaker-cedbfc8c93d8
K-Means Clustering
Running Time vs. Number of Clusters: ~10x Faster!
[Plot: Billable Time in Minutes (0 to 8) vs. Number of Clusters (10, 100, 500); series: sagemaker, other]
Dataset       Size     k     SageMaker   Other
Text          1.2GB    10    1.18E3      1.18E3
Text          1.2GB    100   1.00E3      9.77E2
Text          1.2GB    500   9.18E2      9.03E2
Images        9GB      10    3.29E2      3.28E2
Images        9GB      100   2.72E2      2.71E2
Images        9GB      500   2.17E2      Failed
Videos        27GB     10    2.19E2      2.18E2
Videos        27GB     100   2.03E2      2.02E2
Videos        27GB     500   1.86E2      1.85E2
Advertising   127GB    10    1.72E7      Failed
Advertising   127GB    100   1.30E7      Failed
Advertising   127GB    500   1.03E7      Failed
Synthetic     1100GB   10    3.81E7      Failed
Synthetic     1100GB   100   3.51E7      Failed
Synthetic     1100GB   500   2.81E7      Failed
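To show how clustering can work in a single pass over streamed data, here is a sketch of one classic variant, sequential (MacQueen-style) k-means, on one-dimensional points. It is an assumption chosen for illustration and is not claimed to be the algorithm behind the numbers above; the function name, data, and initial centers are hypothetical.

```python
from typing import Iterable, List


def sequential_kmeans(points: Iterable[float], centers: List[float]) -> List[float]:
    """Each point moves its nearest center toward it by 1/count; no point is stored."""
    centers = list(centers)
    counts = [0] * len(centers)
    for x in points:
        nearest = min(range(len(centers)), key=lambda c: abs(x - centers[c]))
        counts[nearest] += 1
        centers[nearest] += (x - centers[nearest]) / counts[nearest]
    return centers


data = [1.0, 1.2, 0.8, 9.0, 9.2, 8.8]
final_centers = sequential_kmeans(data, centers=[0.0, 10.0])
print([round(c, 3) for c in final_centers])
```

Each center ends up as the running mean of the points assigned to it, so memory depends on the number of clusters and not on the size of the dataset.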
Principal Component Analysis (PCA)
More than 10x faster at a fraction of the cost!
[Plot: Cost vs. Time. Cost in Dollars (0 to 5) vs. Billable Time in Minutes (0 to 50); series: other, sagemaker-deterministic, sagemaker-randomized]
[Plot: Throughput and Scalability. Mb/Sec/Machine (0 to 120) vs. Number of Machines (8, 10, 20); series: other, sagemaker-deterministic, sagemaker-randomized]
Neural Topic Modeling
• Input: term counts vector
• Encoder: feedforward net
• Document Posterior
• Sampled Document Representation
• Decoder: Softmax
• Output: term counts vector
Perplexity vs. Number of Topics
[Plot: Perplexity (0 to 12000) vs. Number of Topics (0 to 200); series: NTM, Other]
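Perplexity, the metric on the plot, is the exponential of the average negative log-likelihood per token, so lower is better. The sketch below uses that standard definition on term-count vectors; the exact evaluation protocol behind the chart is not stated in the slides, and the function name and sample data are illustrative.

```python
import math
from typing import List


def perplexity(counts: List[List[int]], probs: List[List[float]]) -> float:
    """exp(-(1/N) * sum over documents and terms of count * log p(term | document))."""
    log_likelihood = 0.0
    tokens = 0
    for doc_counts, doc_probs in zip(counts, probs):
        for c, p in zip(doc_counts, doc_probs):
            if c > 0:
                log_likelihood += c * math.log(p)
                tokens += c
    return math.exp(-log_likelihood / tokens)


# Two toy documents over a 4-term vocabulary, scored by a model that
# predicts every term with equal probability.
doc_counts = [[2, 1, 0, 1], [0, 3, 1, 0]]
uniform = [[0.25] * 4, [0.25] * 4]
ppl = perplexity(doc_counts, uniform)
print(ppl)
```

A model that spreads probability evenly over V terms scores a perplexity of V, which gives a baseline for reading the chart.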
DeepAR: Time Series Forecasting
                    Mean absolute percentage error   P90 Loss
Dataset             DeepAR     R                     DeepAR     R
traffic             0.14       0.27                  0.13       0.24
electricity         0.07       0.11                  0.08       0.09
pageviews (10k)     0.32       0.32                  0.44       0.31
pageviews (180k)    0.32       0.34                  0.29       NA
• traffic: Hourly occupancy rate of 963 Bay Area freeways
• electricity: Electricity use of 370 homes over time
• pageviews: Page view hits of websites
One hour on p2.xlarge, $1
[Diagram: Input, Network]
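The two error measures in the table can be sketched as follows. MAPE uses its standard definition. For P90 loss, the sketch uses the plain average quantile (pinball) loss at the 0.9 quantile; the slides do not state which normalization was used, so treat that choice, the function names, and the sample values as assumptions.

```python
from typing import Sequence


def mape(actual: Sequence[float], forecast: Sequence[float]) -> float:
    """Mean absolute percentage error; assumes no actual value is zero."""
    return sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def quantile_loss(actual: Sequence[float], forecast_q: Sequence[float], q: float = 0.9) -> float:
    """Average pinball loss: under-forecasts cost q, over-forecasts cost (1 - q)."""
    total = 0.0
    for a, f in zip(actual, forecast_q):
        diff = a - f
        total += q * diff if diff >= 0 else (q - 1.0) * diff
    return total / len(actual)


mape_value = mape([100.0, 200.0], [110.0, 180.0])
p90_value = quantile_loss([10.0, 10.0], [8.0, 12.0], q=0.9)
print(mape_value, p90_value)
```

At q = 0.9, forecasting too low is penalized nine times more heavily than forecasting too high, which is why the P90 forecast sits above the median.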
DeepAR
https://arxiv.org/abs/1704.04110
Demo: predicting world temperature with DeepAR
https://medium.com/@julsimon/predicting-world-temperature-with-time-series-and-deepar-on-amazon-sagemaker-e371cf94ddb5
More built-in algorithms
Spectral LDA
Training Time vs. Number of Topics
[Plot: Training Time in Minutes (0 to 250) vs. Number of Topics (0 to 100); series: lda-data-a, lda-data-b, other-data-a, other-data-b]
Boosted Decision Trees
XGBoost is one of the most commonly used classifiers.
https://github.com/dmlc/xgboost
Throughput vs. Number of Machines
[Plot: Throughput in MB/Sec (0 to 1400) vs. Number of Machines (C4.8xLarge) (0 to 70)]
Sequence to Sequence
• Based on Sockeye and Apache MXNet.
• Multi-GPU.
• Can be used for Neural Machine Translation.
• Supports both RNN/CNN as encoder/decoder
English-German Translation
[Plot: BLEU Score (0 to 25) vs. Billable Time in Hours (0 to 25); series: P2.16x, P2.8x, P2.x; annotation: Best known result!]
https://arxiv.org/abs/1712.05690
https://github.com/awslabs/sockeye
Sockeye
Image Classification
• ResNet implementation with Apache MXNet.
• More networks to come.
• Transfer learning: begin with a model already trained on ImageNet!
Linear Speedup with Horizontal Scaling
[Plot: Speedup (0 to 3.5) vs. Number of Machines (P2) (0 to 5)]
Demo: fine-tuning an image classification model
https://medium.com/@julsimon/image-classification-on-amazon-sagemaker-9b66193c8b54
Latest addition: Blazing Text
https://dl.acm.org/citation.cfm?id=3146354
Resources
https://aws.amazon.com/machine-learning
https://aws.amazon.com/blogs/ai
https://aws.amazon.com/sagemaker (free tier available)
https://github.com/awslabs/amazon-sagemaker-examples
An overview of Amazon SageMaker https://www.youtube.com/watch?v=ym7NEYEx9x4
https://medium.com/@julsimon
Thank you!
Julien Simon
Principal AI/ML Evangelist, Amazon Web Services
@julsimon

Speed up your Machine Learning workflows with build-in algorithms

  • 1. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. Julien Simon Principal AI/ML Evangelist, Amazon Web Services Speed up your Machine Learning workflows with built-in algorithms @julsimon
  • 2. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. One-click training for ML, DL, and custom algorithms Easier training with hyperparameter optimization Highly-optimized machine learning algorithms Deployment without engineering effort Fully-managed hosting at scale Build Pre-built notebook instances Deploy Train Amazon SageMaker
  • 3. © 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved. Training code • Matrix Factorization • Regression • Principal Component Analysis • K-Means Clustering • Gradient Boosted Trees • And More! Amazon provided Algorithms Bring Your Own Container Amazon SageMaker: model options Bring Your Own Script IM Estimators in Apache Spark
  • 4. Amazon SageMaker: 10x better algorithms. Streaming datasets, for cheaper training; train faster, in a single pass; greater reliability on extremely large datasets; choice of several ML algorithms.
  • 5. Infinitely scalable algorithms
  • 6. Streaming. [Diagram labels: GPU, State.]
  • 7. Streaming. [Charts with axis labels: Data Size, Memory, Time/Cost.]
  • 8. Distributed. [Diagram labels: GPU, State (repeated three times).]
  • 9. Shared State. [Diagram labels: three GPUs, three Local States, Shared State.]
  • 10. Cost vs. Time. [Chart: cost ($ to $$$$) against time (Minutes, Hours, Days, Weeks, Months); series: Best Alternative, Amazon SageMaker.]
  • 11. Linear Learner. Regression (mean squared error), SageMaker vs. Other: 1.02 vs. 1.06; 1.09 vs. 1.02; 0.332 vs. 0.183; 0.086 vs. 0.129; 83.3 vs. 84.5. Classification (F1 Score), SageMaker vs. Other: 0.980 vs. 0.981; 0.870 vs. 0.930; 0.997 vs. 0.997; 0.978 vs. 0.964; 0.914 vs. 0.859; 0.470 vs. 0.472; 0.903 vs. 0.908; 0.508 vs. 0.508. [Chart: 30 GB datasets for web-spam and web-url classification; Cost in Dollars vs. Billable Time in Minutes; series: sagemaker-url, sagemaker-spam, other-url, other-spam.]
  • 12. Factorization Machines. Click Prediction: 1 TB advertising dataset, m4.4xlarge machines, perfect scaling. Results (Log_loss / F1 Score / Seconds): SageMaker 0.494 / 0.277 / 820; Other (10 Iter) 0.516 / 0.190 / 650; Other (20 Iter) 0.507 / 0.254 / 1300; Other (50 Iter) 0.481 / 0.313 / 3250. [Chart: Cost in Dollars vs. Billable Time in Hours; series: 10 machines, 20 machines, 30 machines, 40, 50.]
  • 13. Demo: building a movie recommender with Factorization Machines. https://medium.com/@julsimon/building-a-movie-recommender-with-factorization-machines-on-amazon-sagemaker-cedbfc8c93d8
  • 14. K-Means Clustering. ~10x Faster! Results by dataset and k (SageMaker / Other). Text, 1.2GB: k=10 1.18E3 / 1.18E3; k=100 1.00E3 / 9.77E2; k=500 9.18E2 / 9.03E2. Images, 9GB: k=10 3.29E2 / 3.28E2; k=100 2.72E2 / 2.71E2; k=500 2.17E2 / Failed. Videos, 27GB: k=10 2.19E2 / 2.18E2; k=100 2.03E2 / 2.02E2; k=500 1.86E2 / 1.85E2. Advertising, 127GB: k=10 1.72E7 / Failed; k=100 1.30E7 / Failed; k=500 1.03E7 / Failed. Synthetic, 1100GB: k=10 3.81E7 / Failed; k=100 3.51E7 / Failed; k=500 2.81E7 / Failed. [Chart: Running Time vs. Number of Clusters; Billable Time in Minutes for 10, 100, and 500 clusters; series: sagemaker, other.]
  • 15. Principal Component Analysis (PCA). More than 10x faster at a fraction of the cost! [Charts: Throughput and Scalability (Mb/Sec/Machine vs. Number of Machines: 8, 10, 20) and Cost vs. Time (Cost in Dollars vs. Billable Time in Minutes); series: other, sagemaker-deterministic, sagemaker-randomized.]
  • 16. Neural Topic Modeling. [Architecture diagram labels: Input term counts vector; Encoder: feedforward net; Document Posterior; Sampled Document Representation; Decoder: Softmax; Output term counts vector.] [Chart: Perplexity vs. Number of Topics; series: NTM, Other.]
  • 17. DeepAR: Time Series Forecasting. One hour on p2.xlarge, $1. Results (Mean absolute percentage error: DeepAR / R; P90 Loss: DeepAR / R). traffic (hourly occupancy rate of 963 Bay Area freeways): 0.14 / 0.27; 0.13 / 0.24. electricity (electricity use of 370 homes over time): 0.07 / 0.11; 0.08 / 0.09. pageviews (page view hits of websites), 10k: 0.32 / 0.32; 0.44 / 0.31. pageviews, 180k: 0.32 / 0.34; 0.29 / NA. [Diagram labels: Input, Network.]
  • 18. DeepAR https://arxiv.org/abs/1704.04110
  • 19. Demo: predicting world temperature with DeepAR. https://medium.com/@julsimon/predicting-world-temperature-with-time-series-and-deepar-on-amazon-sagemaker-e371cf94ddb5
  • 20. More built-in algorithms
  • 21. Spectral LDA. [Chart: Training Time vs. Number of Topics; Training Time in Minutes against Number of Topics; series: lda-data-a, lda-data-b, other-data-a, other-data-b.]
  • 22. Boosted Decision Trees. XGBoost is one of the most commonly used classifiers. https://github.com/dmlc/xgboost [Chart: Throughput vs. Number of Machines; Throughput in MB/Sec against Number of Machines (C4.8xLarge).]
  • 23. Sequence to Sequence. Based on Sockeye and Apache MXNet; multi-GPU; can be used for Neural Machine Translation; supports both RNN and CNN as encoder/decoder. Best known result! [Chart: English-German Translation; BLEU Score vs. Billable Time in Hours; series: P2.16x, P2.8x, P2.x.]
  • 24. Sockeye. https://arxiv.org/abs/1712.05690 https://github.com/awslabs/sockeye
  • 25. Image Classification. ResNet implementation with Apache MXNet; more networks to come; transfer learning: begin with a model already trained on ImageNet! [Chart: Linear Speedup with Horizontal Scaling; Speedup vs. Number of Machines (P2).]
  • 26. Demo: fine-tuning an image classification model. https://medium.com/@julsimon/image-classification-on-amazon-sagemaker-9b66193c8b54
  • 27. Latest addition: Blazing Text https://dl.acm.org/citation.cfm?id=3146354
  • 28. Resources. https://aws.amazon.com/machine-learning ; https://aws.amazon.com/blogs/ai ; https://aws.amazon.com/sagemaker (free tier available) ; https://github.com/awslabs/amazon-sagemaker-examples ; An overview of Amazon SageMaker: https://www.youtube.com/watch?v=ym7NEYEx9x4 ; https://medium.com/@julsimon
  • 29. Thank you! Julien Simon, Principal AI/ML Evangelist, Amazon Web Services. @julsimon
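
Slides 4 to 9 describe algorithms that stream the dataset in a single pass, keep a fixed-size state, and combine per-worker local state into a shared state. The slides do not show how SageMaker implements this, so the following is only a minimal sketch of the general idea, using a running mean and variance as a stand-in for a model. The `RunningStats` class, the `stream_stats` helper, and the two-shard setup are hypothetical names chosen for illustration, not part of any SageMaker API.

```python
from dataclasses import dataclass


@dataclass
class RunningStats:
    """Fixed-size state: count, mean, and sum of squared deviations (m2)."""
    count: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def update(self, x: float) -> None:
        # Welford's single-pass update: each value is seen once, then discarded,
        # so memory use does not grow with the size of the data.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self.m2 += delta * (x - self.mean)

    def merge(self, other: "RunningStats") -> "RunningStats":
        # Combine two local states into one shared state
        # (pairwise formula for merging counts, means, and m2).
        total = self.count + other.count
        if total == 0:
            return RunningStats()
        delta = other.mean - self.mean
        mean = self.mean + delta * other.count / total
        m2 = self.m2 + other.m2 + delta * delta * self.count * other.count / total
        return RunningStats(total, mean, m2)

    @property
    def variance(self) -> float:
        # Population variance; 0.0 when no data has been seen.
        return self.m2 / self.count if self.count else 0.0


def stream_stats(values) -> RunningStats:
    """Consume an iterable once and return the resulting state."""
    stats = RunningStats()
    for x in values:
        stats.update(x)
    return stats


# Two hypothetical workers each stream their own shard, then merge their states.
shard_a = stream_stats(float(i) for i in range(0, 500))
shard_b = stream_stats(float(i) for i in range(500, 1000))
combined = shard_a.merge(shard_b)
print(combined.count, combined.mean, combined.variance)
```

The merged state matches what a single worker would compute over the full data, which is the property that lets the work be split across machines without a second pass.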