© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker
Sunil Mallya
Sr. AI Solutions Architect/ ML Solutions Lab / Amazon Web Services
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
One-click training
for ML, DL, and
custom algorithms
Easier training with
hyperparameter
optimization
Highly-optimized
machine learning
algorithms
Deployment
without
engineering effort
Fully-managed
hosting at scale
Build
Pre-built
notebook
instances
Deploy
Train
Amazon SageMaker
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Training
code
• Matrix Factorization
• Regression
• Principal Component Analysis
• K-Means Clustering
• Gradient Boosted Trees
• And More!
Amazon provided Algorithms
Bring Your Own Container
Bring Your Own Script
IM Estimators in
Apache Spark
Amazon SageMaker: model options
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Streaming datasets,
for cheaper training
Train faster, in a
single pass
Greater reliability on
extremely large
datasets
Choice of several ML
algorithms
Amazon SageMaker: 10x better algorithms
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Infinitely scalable algorithms
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
GPU State
Streaming
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Data Size
Memory
Data Size
Time/Cost
Streaming
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
GPU State
GPU State
GPU State
Distributed
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Shared State
GPU
GPU
GPU Local
State
Shared
State
Local
State
Local
State
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Cost vs. Time
$$$$
$$$
$$
$
Minutes Hours Days Weeks Months
Best Alternative
Amazon SageMaker
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Linear Learner
Regression (mean squared error)
SageMaker Other
1.02 1.06
1.09 1.02
0.332 0.183
0.086 0.129
83.3 84.5
Classification (F1 Score)
SageMaker Other
0.980 0.981
0.870 0.930
0.997 0.997
0.978 0.964
0.914 0.859
0.470 0.472
0.903 0.908
0.508 0.508
30 GB datasets for web-spam and web-url classification
0
0.2
0.4
0.6
0.8
1
1.2
0 5 10 15 20 25 30
CostinDollars
Billable time in Minutes
sagemaker-url sagemaker-spam other-url other-spam
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
0
1
2
3
4
5
6
7
8
10 100 500
BillableTimeinMinutes Number of Clusters
sagemaker other
K-Means Clustering
k SageMaker Other
Text
1.2GB
10 1.18E3 1.18E3
100 1.00E3 9.77E2
500 9.18.E2 9.03E2
Images
9GB
10 3.29E2 3.28E2
100 2.72E2 2.71E2
500 2.17E2 Failed
Videos
27GB
10 2.19E2 2.18E2
100 2.03E2 2.02E2
500 1.86E2 1.85E2
Advertising
127GB
10 1.72E7 Failed
100 1.30E7 Failed
500 1.03E7 Failed
Synthetic
1100GB
10 3.81E7 Failed
100 3.51E7 Failed
500 2.81E7 Failed
Running Time vs. Number of Clusters
~10x Faster!
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
DeepAR: Time Series Forecasting
Mean absolute
percentage error
P90 Loss
DeepAR R DeepAR R
traffic
Hourly occupancy rate of 963
Bay Area freeways
0.14 0.27 0.13 0.24
electricity
Electricity use of 370
homes over time
0.07 0.11 0.08 0.09
pageviews
Page view hits
of websites
10k 0.32 0.32 0.44 0.31
180k 0.32 0.34 0.29 NA
One hour on p2.xlarge, $1
Input
Network
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Built in Algorithm Demo
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Image Classification
• ResNet implementation
with Apache MXNet.
• More networks to come.
• Transfer learning: begin
with a model already
trained on ImageNet!
0
0.5
1
1.5
2
2.5
3
3.5
0 1 2 3 4 5
Speedup
Number of Machines (P2)
Linear Speedup with Horizontal Scaling
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Customer Stories
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker Customers
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Amazon SageMaker to
automatically map buildings in
Vietnam
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
End-to-End
Machine Learning
Platform
Zero setup Flexible Model
Training
Pay by the second
$
Amazon SageMaker
Build, train, and deploy machine learning models at scale
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
AWS DeepLens
• Intel Atom Processor
• Gen9 graphics
• Ubuntu OS- 16.04 LTS
• 100 GFLOPS performance
• Dual band Wi-Fi
• 8 GB RAM
• 16 GB Storage (eMMC)
• 32 GB SD card
• 4 MP camera with MJPEG
• H.264 encoding at 1080p resolution
• 2 USB ports
• Micro HDMI
• Audio out
• AWS Greengrass preconfigured
• clDNN Optimized for MXNet
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
© 2018, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Resources
https://aws.amazon.com/machine-learning
https://aws.amazon.com/blogs/ai
https://aws.amazon.com/sagemaker (free tier available)
https://github.com/awslabs/amazon-sagemaker-examples
An overview of Amazon SageMaker https://www.youtube.com/watch?v=ym7NEYEx9x4
© 2018, Amazon Web Services, Inc. or Its Affiliates. All rights reserved.
Thank You

Amazon SageMaker

  • 1.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Sunil Mallya Sr. AI Solutions Architect/ ML Solutions Lab / Amazon Web Services
  • 2.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. One-click training for ML, DL, and custom algorithms Easier training with hyperparameter optimization Highly-optimized machine learning algorithms Deployment without engineering effort Fully-managed hosting at scale Build Pre-built notebook instances Deploy Train Amazon SageMaker
  • 3.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Training code • Matrix Factorization • Regression • Principal Component Analysis • K-Means Clustering • Gradient Boosted Trees • And More! Amazon provided Algorithms Bring Your Own Container Bring Your Own Script IM Estimators in Apache Spark Amazon SageMaker: model options
  • 4.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Streaming datasets, for cheaper training Train faster, in a single pass Greater reliability on extremely large datasets Choice of several ML algorithms Amazon SageMaker: 10x better algorithms
  • 5.
    © 2018, AmazonWeb Services, Inc. or Its Affiliates. All rights reserved. Infinitely scalable algorithms
  • 6.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. GPU State Streaming
  • 7.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Data Size Memory Data Size Time/Cost Streaming
  • 8.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. GPU State GPU State GPU State Distributed
  • 9.
    © 2018, AmazonWeb Services, Inc. or Its Affiliates. All rights reserved. Shared State GPU GPU GPU Local State Shared State Local State Local State
  • 10.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Cost vs. Time $$$$ $$$ $$ $ Minutes Hours Days Weeks Months Best Alternative Amazon SageMaker
  • 11.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Linear Learner Regression (mean squared error) SageMaker Other 1.02 1.06 1.09 1.02 0.332 0.183 0.086 0.129 83.3 84.5 Classification (F1 Score) SageMaker Other 0.980 0.981 0.870 0.930 0.997 0.997 0.978 0.964 0.914 0.859 0.470 0.472 0.903 0.908 0.508 0.508 30 GB datasets for web-spam and web-url classification 0 0.2 0.4 0.6 0.8 1 1.2 0 5 10 15 20 25 30 CostinDollars Billable time in Minutes sagemaker-url sagemaker-spam other-url other-spam
  • 12.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. 0 1 2 3 4 5 6 7 8 10 100 500 BillableTimeinMinutes Number of Clusters sagemaker other K-Means Clustering k SageMaker Other Text 1.2GB 10 1.18E3 1.18E3 100 1.00E3 9.77E2 500 9.18.E2 9.03E2 Images 9GB 10 3.29E2 3.28E2 100 2.72E2 2.71E2 500 2.17E2 Failed Videos 27GB 10 2.19E2 2.18E2 100 2.03E2 2.02E2 500 1.86E2 1.85E2 Advertising 127GB 10 1.72E7 Failed 100 1.30E7 Failed 500 1.03E7 Failed Synthetic 1100GB 10 3.81E7 Failed 100 3.51E7 Failed 500 2.81E7 Failed Running Time vs. Number of Clusters ~10x Faster!
  • 13.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. DeepAR: Time Series Forecasting Mean absolute percentage error P90 Loss DeepAR R DeepAR R traffic Hourly occupancy rate of 963 Bay Area freeways 0.14 0.27 0.13 0.24 electricity Electricity use of 370 homes over time 0.07 0.11 0.08 0.09 pageviews Page view hits of websites 10k 0.32 0.32 0.44 0.31 180k 0.32 0.34 0.29 NA One hour on p2.xlarge, $1 Input Network
  • 14.
    © 2018, AmazonWeb Services, Inc. or Its Affiliates. All rights reserved. Built in Algorithm Demo
  • 15.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Image Classification • ResNet implementation with Apache MXNet. • More networks to come. • Transfer learning: begin with a model already trained on ImageNet! 0 0.5 1 1.5 2 2.5 3 3.5 0 1 2 3 4 5 Speedup Number of Machines (P2) Linear Speedup with Horizontal Scaling
  • 16.
    © 2018, AmazonWeb Services, Inc. or Its Affiliates. All rights reserved. Customer Stories
  • 17.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker Customers
  • 18.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Amazon SageMaker to automatically map buildings in Vietnam
  • 19.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. End-to-End Machine Learning Platform Zero setup Flexible Model Training Pay by the second $ Amazon SageMaker Build, train, and deploy machine learning models at scale
  • 20.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. AWS DeepLens • Intel Atom Processor • Gen9 graphics • Ubuntu OS- 16.04 LTS • 100 GFLOPS performance • Dual band Wi-Fi • 8 GB RAM • 16 GB Storage (eMMC) • 32 GB SD card • 4 MP camera with MJPEG • H.264 encoding at 1080p resolution • 2 USB ports • Micro HDMI • Audio out • AWS Greengrass preconfigured • clDNN Optimized for MXNet
  • 21.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved.
  • 22.
    © 2018, AmazonWeb Services, Inc. or its Affiliates. All rights reserved. Resources https://aws.amazon.com/machine-learning https://aws.amazon.com/blogs/ai https://aws.amazon.com/sagemaker (free tier available) https://github.com/awslabs/amazon-sagemaker-examples An overview of Amazon SageMaker https://www.youtube.com/watch?v=ym7NEYEx9x4
  • 23.
    © 2018, AmazonWeb Services, Inc. or Its Affiliates. All rights reserved. Thank You