Machine Learning: From Notebook to Production
with Amazon SageMaker
Julien Simon, AI Evangelist, EMEA
@julsimon
© 2017, Amazon Web Services, Inc. or its Affiliates. All rights reserved.
Platform
Services
AWS ML Stack
Deploy machine learning models with high-performance machine learning
algorithms, broad framework support, and one-click training, tuning, and
inference.
The Machine Learning Process
[Diagram: flowchart of the machine learning process. Stages: Business Problem, ML Problem Framing, Data Collection, Data Integration, Data Preparation & Cleaning, Data Visualization & Analysis, Feature Engineering, Model Training & Parameter Tuning, Model Evaluation, the decision "Are Business Goals Met?" (Yes / No), Model Deployment, Monitoring & Debugging, Predictions. Feedback loops: Data Augmentation, Feature Augmentation, Re-training.]
Problem discovery
[Diagram: The Machine Learning Process flowchart, repeated]
• Help formulate the right questions
• Domain Knowledge
Integration
[Diagram: The Machine Learning Process flowchart, repeated]
• Need a data platform?
• Amazon S3
• AWS Glue
• Amazon Athena
• Amazon EMR
• Amazon Redshift Spectrum
Model Training
[Diagram: The Machine Learning Process flowchart, repeated]
• Set up and manage Notebook Environments
• Set up and manage Training Clusters
• Write Data Connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithms to multiple machines
• Secure Model artifacts
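One way to picture "distribute ML training algorithms to multiple machines" is data sharding, where each machine receives a disjoint slice of the training files. The sketch below is only an illustration under that assumption, not a description of how any particular service does it; the helper name `shard_keys` and the file names are hypothetical.

```python
from typing import List


def shard_keys(keys: List[str], num_machines: int) -> List[List[str]]:
    """Assign training files to machines round-robin (hypothetical helper)."""
    if num_machines < 1:
        raise ValueError("num_machines must be at least 1")
    shards: List[List[str]] = [[] for _ in range(num_machines)]
    # Sort first so the assignment is the same no matter how the keys were listed.
    for index, key in enumerate(sorted(keys)):
        shards[index % num_machines].append(key)
    return shards


# Hypothetical training files; in practice these might be objects in Amazon S3.
files = [f"train/part-{i:03d}.csv" for i in range(5)]
shards = shard_keys(files, num_machines=2)
```

Each file lands in exactly one shard, so two machines together see the whole dataset without overlap.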
Model Deployment
[Diagram: The Machine Learning Process flowchart, repeated]
• Set up and manage Model Inference Clusters
• Manage and Scale Model Inference APIs
• Monitor and Debug Model Predictions
• Model versioning and performance tracking
• Automate New Model version promotion to production (A/B testing)
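A/B testing between model versions can be thought of as weighted traffic splitting. The sketch below builds a list shaped like the `ProductionVariants` field of a SageMaker endpoint configuration and computes each variant's share of traffic as its weight divided by the sum of weights. It makes no AWS calls; the variant names, model names, and instance type are hypothetical placeholders, and the field names should be checked against the current API reference.

```python
from typing import Dict, List


def traffic_shares(variants: List[dict]) -> Dict[str, float]:
    """Return the fraction of requests each variant would receive."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    if total <= 0:
        raise ValueError("the sum of variant weights must be positive")
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


# Hypothetical variants: the current model keeps most of the traffic,
# the candidate model receives a small share for comparison.
production_variants = [
    {
        "VariantName": "model-v1",
        "ModelName": "example-model-v1",  # hypothetical
        "InstanceType": "ml.m4.xlarge",   # assumed instance type
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 9.0,
    },
    {
        "VariantName": "model-v2",
        "ModelName": "example-model-v2",  # hypothetical
        "InstanceType": "ml.m4.xlarge",
        "InitialInstanceCount": 1,
        "InitialVariantWeight": 1.0,
    },
]

shares = traffic_shares(production_variants)
```

Promoting the new version then amounts to shifting weight from `model-v1` to `model-v2` once its predictions look good.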
Amazon SageMaker
Build, train, and deploy machine learning models at scale
• End-to-End Machine Learning Platform
• Zero setup
• Flexible Model Training
• Pay by the second
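To make "pay by the second" concrete, the arithmetic below compares per-second billing with billing rounded up to whole hours for a short training job. The hourly price, instance count, and duration are made-up numbers chosen for illustration, not actual AWS prices.

```python
import math


def cost_per_second_billing(hourly_price: float, instances: int, seconds: int) -> float:
    """Cost when every instance is billed for exactly the seconds it ran."""
    return hourly_price / 3600 * instances * seconds


def cost_hourly_billing(hourly_price: float, instances: int, seconds: int) -> float:
    """Cost when usage is rounded up to whole hours (shown for comparison)."""
    return hourly_price * instances * math.ceil(seconds / 3600)


# Hypothetical job: 2 instances at $1.80 per hour, running for 750 seconds.
per_second = cost_per_second_billing(1.80, instances=2, seconds=750)
per_hour = cost_hourly_billing(1.80, instances=2, seconds=750)
```

With these assumed numbers the job costs about $0.75 when billed by the second, against $3.60 if each instance were charged a full hour.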
Amazon SageMaker
Build
• Pre-built notebook instances
• Highly-optimized machine learning algorithms
Amazon SageMaker
Build
• Pre-built notebook instances
• Highly-optimized machine learning algorithms
Train
• One-click training for ML, DL, and custom algorithms
• Easier training with hyperparameter optimization
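Behind "one-click training" sits a training job request that names the algorithm image, the data location, and the type and number of instances. The sketch below only assembles such a request as a plain dictionary, following the field names of the SageMaker `CreateTrainingJob` API as I understand them; it does not contact AWS. The helper `build_training_request`, the job name, image URI, role ARN, and S3 paths are all hypothetical placeholders.

```python
def build_training_request(job_name: str, image_uri: str, role_arn: str,
                           train_s3_uri: str, output_s3_uri: str,
                           instance_type: str, instance_count: int) -> dict:
    """Assemble a CreateTrainingJob-style request (hypothetical helper)."""
    return {
        "TrainingJobName": job_name,
        "AlgorithmSpecification": {
            "TrainingImage": image_uri,   # container image stored in Amazon ECR
            "TrainingInputMode": "File",
        },
        "RoleArn": role_arn,
        "InputDataConfig": [{
            "ChannelName": "train",
            "DataSource": {"S3DataSource": {
                "S3DataType": "S3Prefix",
                "S3Uri": train_s3_uri,
                "S3DataDistributionType": "FullyReplicated",
            }},
        }],
        "OutputDataConfig": {"S3OutputPath": output_s3_uri},
        # The "type and quantity of instances" for the training cluster.
        "ResourceConfig": {
            "InstanceType": instance_type,
            "InstanceCount": instance_count,
            "VolumeSizeInGB": 10,
        },
        "StoppingCondition": {"MaxRuntimeInSeconds": 3600},
    }


# All values below are placeholders, not real resources.
request = build_training_request(
    job_name="example-training-job",
    image_uri="123456789012.dkr.ecr.us-east-1.amazonaws.com/example-algorithm:latest",
    role_arn="arn:aws:iam::123456789012:role/ExampleSageMakerRole",
    train_s3_uri="s3://example-bucket/train/",
    output_s3_uri="s3://example-bucket/output/",
    instance_type="ml.p2.xlarge",
    instance_count=2,
)
# With credentials configured, a boto3 "sagemaker" client could receive this
# as create_training_job(**request); that call is deliberately omitted here.
```

Keeping the request as data makes it easy to review or change the instance count before anything is launched.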
Amazon SageMaker
Build
• Pre-built notebook instances
• Highly-optimized machine learning algorithms
Train
• One-click training for ML, DL, and custom algorithms
• Easier training with hyperparameter optimization
Deploy
• Deployment without engineering effort
• Fully-managed hosting at scale
Amazon SageMaker
[Diagram: Amazon SageMaker architecture. Labels: Amazon ECR (training code, inference code); Model Training (on EC2) with training code and helper code; training data; model artifacts; Model Hosting (on EC2) with inference code and helper code; Inference Endpoint; client application (inference request, inference response); ground truth.]
Amazon SageMaker customers
https://aws.amazon.com/sagemaker/customers/
Detecting buildings in Vietnam
https://developmentseed.org/blog/2018/01/19/sagemaker-label-maker-case/
Demos
1. Use a built-in algorithm:
fine-tuning a pre-trained model for image classification
2. Bring your own training code:
learning MNIST with Apache MXNet
3. Bring your own pre-trained model:
clustering MNIST with k-means in scikit-learn
4. Bring your own algorithm:
classifying the Iris data set with decision trees in scikit-learn
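For readers who have not met k-means, the clustering method named in demo 3, here is a minimal pure-Python sketch of the algorithm on a handful of toy 2-D points. It is not the demo's code (which uses scikit-learn on MNIST); the points and starting centroids are made up so that the run is deterministic.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def squared_distance(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def kmeans(points: Sequence[Point], centroids: Sequence[Point],
           iterations: int = 10) -> Tuple[List[Point], List[int]]:
    """Alternate between assigning points and moving centroids."""
    centroids = list(centroids)
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = (
                    sum(x for x, _ in members) / len(members),
                    sum(y for _, y in members) / len(members),
                )
    return centroids, labels


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
```

On this toy data the first three points end up in one cluster and the last three in the other, with each centroid at the mean of its group.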
Amazon SageMaker
Build, train, and deploy machine learning models at scale
• End-to-End Machine Learning Platform
• Zero setup
• Flexible Model Training
• Pay by the second
Resources
https://aws.amazon.com/machine-learning
https://aws.amazon.com/blogs/ai
https://aws.amazon.com/sagemaker
An overview of Amazon SageMaker
https://www.youtube.com/watch?v=ym7NEYEx9x4
https://medium.com/@julsimon
Thank you!
Julien Simon, AI Evangelist, EMEA
@julsimon

Machine Learning: From Notebook to Production with Amazon SageMaker (January 2018)


Editor's Notes

  • #6 The Data platform
  • #7 The Data platform
  • #8 The Data platform
  • #10 Pre-built Notebook Instances: For training data exploration and preprocessing, Amazon SageMaker provides fully managed notebook instances running Jupyter notebooks that include example code for common model training and hosting exercises. These notebook instances are pre-loaded with Anaconda packages and popular deep learning libraries such as TensorFlow and Apache MXNet.
    Highly-optimized Machine Learning Algorithms: Amazon SageMaker installs high-performance, scalable machine learning algorithms, optimized for speed, scale, and accuracy, that run on extremely large training datasets. Depending on the type of learning you are undertaking, you can choose supervised algorithms, such as linear or logistic regression for regression or classification, or unsupervised algorithms, such as k-means clustering.
  • #11 TRAIN
    One-click Training: When you're ready to train in Amazon SageMaker, simply indicate the type and quantity of instances you need and initiate training with a single click. SageMaker sets up the distributed compute cluster, performs the training, and tears down the cluster when complete. SageMaker seamlessly scales to tens of nodes with hundreds of GPUs, so you no longer need to worry about all the complexity and lost time involved in making distributed training architectures work.
    Built-in Automatic Hyperparameter Optimization (in Preview): Using built-in hyperparameter optimization (HPO), SageMaker can automatically tune your algorithm by adjusting hundreds of different combinations of parameters, to quickly arrive at the best solution for your machine learning problem. HPO lets you easily optimize an ML model on SageMaker by exploring lots of variations of the same algorithm with varying hyperparameters to pick the one with the best performance on your data.
  • #12 DEPLOY
    Deployment without Engineering Effort: After training, SageMaker provides the model artifacts and scoring images to you for deployment to Amazon EC2 or anywhere else. When you're ready to deploy your model, you can launch into a secure and elastically scalable environment, with one-click deployment from the SageMaker console.
    Fully Managed: Amazon SageMaker handles all of the compute infrastructure on your behalf, with built-in Amazon CloudWatch monitoring and logging. It performs health checks, applies security patches, carries out other routine maintenance, and updates the supported deep learning frameworks as new versions become available.
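The idea in note #11 of "exploring lots of variations of the same algorithm with varying hyperparameters to pick the one with the best performance" can be sketched with plain random search. This is a stand-in chosen for simplicity; the notes do not say which search strategy SageMaker uses. The function `validation_score` is a hypothetical toy objective standing in for "train a model and measure it", and the hyperparameter names and ranges are made up.

```python
import random
from typing import Tuple


def validation_score(learning_rate: float, depth: int) -> float:
    """Toy objective (hypothetical); highest near learning_rate=0.1, depth=6."""
    return 1.0 - 10 * (learning_rate - 0.1) ** 2 - 0.01 * (depth - 6) ** 2


def random_search(trials: int, seed: int = 0) -> Tuple[dict, float]:
    """Try random hyperparameter combinations and keep the best one."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    best_params, best_score = {}, float("-inf")
    for _ in range(trials):
        params = {
            "learning_rate": rng.uniform(0.001, 0.5),
            "depth": rng.randint(2, 10),
        }
        score = validation_score(**params)
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(trials=200)
```

In a real setting each trial would be a full training job, which is why running the trials on managed, per-second-billed clusters matters.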