Amazon SageMaker is a fully managed platform that enables developers and data scientists to build, train, and deploy machine learning (ML) models in production applications easily and at scale. In this chalk talk, we dive deep into training an ML model based on the TensorFlow framework. We discuss the specifics of training a model through Amazon SageMaker by taking an algorithm and running it on a training cluster in an auto-scaling group. This session showcases the scalability of training that is possible with Amazon SageMaker, which reduces the time and cost of training runs.
2. Questions We'll Answer in This Session:
1. Amazon SageMaker: What is it? How does it work?
2. Why run TensorFlow on SageMaker?
3. How to train a TensorFlow model using SageMaker (demo)
4. How to host a trained TensorFlow model on SageMaker as a scalable inference service (demo)
5. How to get started
14. ML Hosting Service
[Diagram: model artifacts and an inference image from Amazon ECR are combined into model versions; an EndpointConfiguration exposes them through an Amazon SageMaker inference endpoint with one-click deployment. Variant weights (30/50 and 10/10) are shown on the model versions.]
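Once a model is hosted, clients call the endpoint over HTTPS. A minimal sketch of building such a call, assuming a TensorFlow Serving container behind the endpoint (which accepts a JSON body with an "instances" key); the endpoint name `my-tf-endpoint` is a placeholder, and the actual `InvokeEndpoint` call is shown in comments since it requires AWS credentials and a live endpoint:

```python
import json

def build_invoke_args(endpoint_name, features):
    """Build the keyword arguments for the sagemaker-runtime
    InvokeEndpoint API. The TensorFlow Serving REST format wraps
    the input rows in an "instances" list."""
    return {
        "EndpointName": endpoint_name,
        "ContentType": "application/json",
        "Body": json.dumps({"instances": [features]}),
    }

args = build_invoke_args("my-tf-endpoint", [0.1, 0.2, 0.3])

# Against a real endpoint, the call would be:
# import boto3
# runtime = boto3.client("sagemaker-runtime")
# response = runtime.invoke_endpoint(**args)
# predictions = json.loads(response["Body"].read())
```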
Easy Model Deployment to Amazon SageMaker – with Split Testing
Versions of the same inference code are saved in inference containers. Prod (ModelName: prod) is the primary one; 50% of the traffic must be served there.
1. Create a Model (and versions of a Model)
2. Create weighted ProductionVariants
3. Create an EndpointConfiguration from one or more ProductionVariant(s)
4. Create an Endpoint from one EndpointConfiguration
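The steps above map to a handful of SageMaker control-plane calls. A minimal sketch of the request payloads, assuming two variants named `prod` and `canary` and an `ml.m5.xlarge` instance type (all placeholders); the boto3 calls themselves are shown in comments since they need AWS credentials:

```python
def production_variant(name, model_name, weight,
                       instance_type="ml.m5.xlarge", count=1):
    """One weighted variant for an EndpointConfiguration."""
    return {
        "VariantName": name,
        "ModelName": model_name,
        "InitialInstanceCount": count,
        "InstanceType": instance_type,
        "InitialVariantWeight": weight,
    }

variants = [
    production_variant("prod", "prod", weight=50.0),    # primary variant
    production_variant("canary", "canary", weight=50.0),
]

# Each variant's traffic share = its weight / sum of all weights,
# so equal weights of 50.0 give a 50/50 split.
total = sum(v["InitialVariantWeight"] for v in variants)
shares = {v["VariantName"]: v["InitialVariantWeight"] / total
          for v in variants}

# With boto3 the flow would be:
# sm = boto3.client("sagemaker")
# sm.create_model(ModelName="prod", ...)               # step 1
# sm.create_endpoint_config(                           # steps 2-3
#     EndpointConfigName="split-test",
#     ProductionVariants=variants)
# sm.create_endpoint(EndpointName="tf-endpoint",       # step 4
#     EndpointConfigName="split-test")
```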
16. Remove undifferentiated heavy lifting of ML
Iterate, iterate, iterate!
Training speed · Inference speed
Speed & agility in all phases with Amazon SageMaker
18. Enabling experimentation speed
• Train with local notebooks
• Train on notebook instances
• PetaFLOP training on p3.16xl
• Go distributed with one line of code
• Same containers throughout the Amazon SageMaker Training Service
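The "one line of code" that takes a job distributed is essentially the estimator's instance count (plus a distribution setting for TensorFlow). A hedged sketch using SageMaker Python SDK v2-style argument names, with `train.py` and the IAM role ARN as placeholders; the arguments are built as a plain dict so the sketch runs without AWS:

```python
# Arguments for sagemaker.tensorflow.TensorFlow (Python SDK).
single_node = {
    "entry_point": "train.py",   # placeholder training script
    "role": "arn:aws:iam::123456789012:role/SageMakerRole",  # placeholder
    "framework_version": "1.12",
    "instance_type": "ml.p3.16xlarge",  # 8x V100, ~1 PFLOP mixed precision
    "instance_count": 1,
}

# Going distributed: raise the instance count and enable a
# distribution strategy. Everything else stays the same.
distributed = dict(single_node, instance_count=2,
                   distribution={"parameter_server": {"enabled": True}})

# With the SDK installed, training would be launched as:
# from sagemaker.tensorflow import TensorFlow
# TensorFlow(**distributed).fit({"training": "s3://bucket/prefix"})
```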
19. Automatic Model Tuning
Adjusting "hyperparameters" (algorithm parameters that significantly affect model quality) to arrive at the best model.
• Neural networks: number of layers, hidden layer width, learning rate, embedding dimensions, dropout, …
• Decision trees: tree depth, max leaf nodes, gamma, eta, lambda, alpha, …
Tuning strategy: customized Bayesian optimization
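SageMaker's tuning service searches ranges of hyperparameters like those above using Bayesian optimization. As a concept illustration only (a toy random search, not the SageMaker algorithm), here is what searching a log-uniform learning-rate range and an integer layer-count range looks like, with a made-up objective standing in for a training job's validation metric:

```python
import random

def objective(learning_rate, num_layers):
    """Stand-in for a training job's validation metric (lower is better)."""
    return (learning_rate - 0.01) ** 2 + 0.001 * abs(num_layers - 4)

def random_search(trials=50, seed=0):
    """Sample hyperparameters at random and keep the best trial."""
    rng = random.Random(seed)
    best = None
    for _ in range(trials):
        params = {
            "learning_rate": 10 ** rng.uniform(-4, -1),  # log-uniform range
            "num_layers": rng.randint(1, 8),             # integer range
        }
        score = objective(**params)
        if best is None or score < best[0]:
            best = (score, params)
    return best

score, params = random_search()
```

Bayesian optimization improves on this by fitting a model of the objective and proposing promising hyperparameters rather than sampling blindly, which matters when each trial is a full training run.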
25. Summary
1. Amazon SageMaker supports the complete TensorFlow ML/DL life cycle with speed and agility:
• Jupyter notebooks provided
• Built-in cloud-scale ML algorithms
• Fully managed and scalable training service (incl. automatic model tuning)
• Model hosting with auto-scaling
• Full life cycle can be scripted for CI/CD
2. TensorFlow code can be brought to SageMaker either as scripts or within Docker containers.
3. We demonstrated a complete example of TensorFlow coding, training, and model hosting.
28. Please Submit Session Feedback:
1. Tap on the Agenda icon
2. Select the session you attended
3. Tap on Session Evaluation to submit your feedback
Thank you!
Questions?