Machine learning can seem harder than it really is because the process of building, training, and deploying models to production is too complicated and too slow. Amazon SageMaker is a fully managed service that lets developers and data scientists build, train, and deploy machine learning models at any scale. Amazon SageMaker offers a selection of high-performance machine learning algorithms and preconfigured frameworks such as Apache MXNet, TensorFlow, PyTorch, and Chainer; you can also use other frameworks or algorithms through Docker containers. In this session we take a closer look at how to use Amazon SageMaker, including some practical examples.
2. A LONG HISTORY OF ML AT AMAZON
THOUSANDS OF ENGINEERS ACROSS THE COMPANY FOCUSED ON AI
• Personalized recommendations
• Inventing entirely new customer experiences
• Fulfillment automation and inventory management
• Drones
• Voice-driven interactions
3. ML @ AWS
OUR MISSION
Put Machine Learning in the hands of every developer and data scientist
4. THE AWS MACHINE LEARNING STACK
APPLICATION SERVICES: Rekognition, Rekognition Video, Polly, Transcribe, Translate, Comprehend, Lex
PLATFORMS: Amazon SageMaker, Amazon Mechanical Turk, Spark on Amazon EMR
FRAMEWORKS & INFRASTRUCTURE: Frameworks, Interfaces (Keras), Machine Learning AMIs, P3 instances
P3: NVIDIA Tesla V100 GPUs (14x faster than P2), 5,120 Tensor cores, 128GB of memory, 1 Petaflop of compute, NVLink 2.0
7. THE MACHINE LEARNING PROCESS
Business Problem - ML problem framing
Set Business Goals
• Domain knowledge
• Help formulate the right questions
8. THE MACHINE LEARNING PROCESS
Business Problem - ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Build the Data Platform
• Amazon S3
• AWS Glue
• Amazon Athena
• Amazon EMR
• Amazon Redshift / Redshift Spectrum
• Amazon Kinesis
• AWS IoT Core
9. THE MACHINE LEARNING PROCESS
Business Problem - ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Data Visualization & Analysis
Feature Engineering
Model Training & Parameter Tuning
Model Evaluation
Experiment, Train, Tune and Evaluate
• Set up and manage Notebook Environments
• Set up and manage Training Clusters
• Write Data Connectors
• Scale ML algorithms to large datasets
• Distribute ML training algorithms to multiple machines
• Secure Model artifacts
10. THE MACHINE LEARNING PROCESS
Business Problem - ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Data Visualization & Analysis
Feature Engineering
Model Training & Parameter Tuning
Model Evaluation
Are Business goals met? Yes
Model Deployment
Monitoring & Debugging - Predictions
Re-training
Deploy, Monitor and Debug
• Set up and manage Model Inference Clusters
• Manage and Auto-Scale Model Inference APIs
• Monitor and Debug Model Predictions
• Model versioning and performance tracking
• Automate New Model version promotion to production (A/B testing)
11. THE MACHINE LEARNING PROCESS
Business Problem - ML problem framing
Data Collection
Data Integration
Data Preparation & Cleaning
Data Visualization & Analysis
Feature Engineering
Model Training & Parameter Tuning
Model Evaluation
Are Business goals met? Yes / No
Model Deployment
Monitoring & Debugging - Predictions
Data Augmentation
Feature Augmentation
Re-training
Enhance and re-train
• Add/Remove features
• Augment Data
17. Amazon SageMaker
A managed service that provides the quickest and easiest way for data scientists and developers to get ML models from idea to production
20. AMAZON SAGEMAKER
BUILD, TRAIN, TUNE AND HOST YOUR OWN MODELS
BUILD
• Pre-built notebooks for common problems
• Built-in, high performance algorithms
TRAIN & TUNE
• One-click training
• Hyperparameter optimization
DEPLOY
• One-click deployment
• Fully managed hosting with auto-scaling
21. AMAZON SAGEMAKER
BUILD, TRAIN, TUNE AND HOST YOUR OWN MODELS
BUILD
• Pre-built notebooks for common problems
• Built-in, high performance algorithms
TRAIN & TUNE
• One-click training
• Hyperparameter optimization
DEPLOY
• One-click deployment
• Fully managed hosting with auto-scaling
End-to-end encryption with KMS
End-to-end VPC support
Compliance and audit capabilities
Metadata and experiment management capabilities
Pay as you go
22. AMAZON SAGEMAKER CUSTOMERS
“With Amazon SageMaker, we can accelerate our Artificial Intelligence initiatives at scale by building and deploying our algorithms on the platform. We will create novel large-scale machine learning and AI algorithms and deploy them on this platform to solve complex problems that can power prosperity for our customers.”
- Ashok Srivastava, Chief Data Officer, Intuit
23. AMAZON SAGEMAKER @ INTUIT
From
• Ad-hoc setup and management of notebook environments
• Limited choices for model deployment
• Competing for compute resources across teams
To
• Easy data exploration in SageMaker notebooks
• Building around virtualization for flexibility
• Auto-scalable model hosting environment
24. AMAZON SAGEMAKER CUSTOMERS
“As the world’s leading provider of high-resolution Earth imagery, data and analysis, DigitalGlobe works with enormous amounts of data every day. DigitalGlobe is making it easier for people to find, access, and run compute against our entire 100PB image library, which is stored in AWS’s cloud, to apply deep learning to satellite imagery. We plan to use Amazon SageMaker to train models against petabytes of Earth observation imagery datasets using hosted Jupyter notebooks, so DigitalGlobe's Geospatial Big Data Platform (GBDX) users can just push a button, create a model, and deploy it all within one scalable distributed environment at scale.”
- Dr. Walter Scott, CTO of Maxar Technologies and founder of DigitalGlobe
26. AMAZON SAGEMAKER COMPONENTS
BUILT-IN ALGORITHMS
BRING YOUR OWN SCRIPT
BRING YOUR OWN ALGORITHM
NOTEBOOK INSTANCES
SDKs & LOCAL MODE
AWS CONSOLE
USER EXPERIENCE
ML TRAINING & TUNING SERVICE
ML HOSTING SERVICE
28. NOTEBOOK INSTANCES
ZERO SETUP FOR EXPLORATORY DATA ANALYSIS
Authoring & Notebooks
ETL Access to AWS Database services
Access to S3 Data Lake
VPC
• Fully managed Jupyter notebook instances
• Choice of CPU and GPU ml instances
• Sample notebooks and «just add data»
• Recommendations/Personalization
• Fraud Detection
• Forecasting
• Image Classification
• Churn Prediction
• Marketing Email/Campaign Targeting
• Log processing and anomaly detection
• Speech to Text
• More…
• VPC Integration
• Lifecycle Configurations
29. SDKs AND LOCAL MODE
Train with local notebooks
Train on notebook instances
PetaFLOP training on p3.16xl
Go distributed with one line of code
Same containers
Amazon SageMaker Python SDK
https://github.com/aws/sagemaker-python-sdk
Amazon SageMaker Spark SDK
https://github.com/aws/sagemaker-spark
LOCAL MODE
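The idea on this slide, the same training setup running first on a local machine and then on managed instances, can be illustrated with a small sketch. It does not import the SageMaker Python SDK. It only builds keyword arguments under the assumption that the SDK (version 2 naming) accepts `instance_type` and `instance_count`, with `"local"` selecting local mode. The helper name `training_resources` is hypothetical.

```python
def training_resources(mode: str, instance_count: int = 1) -> dict:
    """Build instance settings shaped like estimator keyword arguments.

    Assumption: the estimator takes `instance_type` and `instance_count`,
    and the value "local" runs the training container on the current machine.
    """
    if mode == "local":
        instance_type = "local"
    elif mode == "managed":
        # Instance size taken from the slide ("PetaFLOP training on p3.16xl").
        instance_type = "ml.p3.16xlarge"
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    return {"instance_type": instance_type, "instance_count": instance_count}


# Debug locally first, then change one argument to go distributed.
local_settings = training_resources("local")
distributed_settings = training_resources("managed", instance_count=4)
```

Everything else (training script, container image, hyperparameters) would stay the same between the two runs, which is the point of "Same containers" above.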
32. MANAGED DISTRIBUTED TRAINING
Fully managed
VPC
Training Code
Training Data Model Artifacts
CPU ML INSTANCES GPU ML INSTANCES HYPERPARAMETER TUNING
BUILT-IN ALGORITHMS BRING YOUR OWN SCRIPT BRING YOUR OWN ALGORITHM
Amazon ECR
33. BUILT-IN ALGORITHMS
Data Model
NEW DATA
PREDICTION
Algorithm
K-Means
k-nearest neighbors (k-NN)
Principal Component Analysis (PCA)
Latent Dirichlet Allocation (LDA)
Factorization Machines
Linear Learner
Neural Topic Model (NTM)
Random Cut Forest
Sequence to Sequence
XGBoost
Image Classification
Object Detection
DeepAR Forecasting
BlazingText
35. BRING YOUR OWN SCRIPT
Data Model
NEW DATA
PREDICTION
Your Own Script +
36. BRING YOUR OWN ALGORITHM
Data Model
NEW DATA
PREDICTION
Your algorithm and libraries
in your own Docker Container
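As a rough illustration of what the code inside such a container might look like, here is a minimal training entry point. It assumes the documented SageMaker container layout: hyperparameters in `/opt/ml/input/config/hyperparameters.json` (values as strings), one directory per input channel under `/opt/ml/input/data/`, and model artifacts written to `/opt/ml/model`. The channel name `training`, the `scale` hyperparameter, and the "model" itself (a scaled mean) are placeholders invented for this sketch.

```python
import json
import tempfile
from pathlib import Path


def train(base_dir: str = "/opt/ml") -> Path:
    """Read hyperparameters and training data, then write a model artifact."""
    base = Path(base_dir)

    # Hyperparameters arrive as a JSON object whose values are strings.
    hp_file = base / "input" / "config" / "hyperparameters.json"
    hyperparameters = json.loads(hp_file.read_text()) if hp_file.exists() else {}
    scale = float(hyperparameters.get("scale", "1.0"))  # hypothetical hyperparameter

    # One directory per input channel; the channel name "training" is an assumption.
    channel_dir = base / "input" / "data" / "training"
    values = [
        float(line.split(",")[0])
        for csv_file in sorted(channel_dir.glob("*.csv"))
        for line in csv_file.read_text().splitlines()
        if line.strip()
    ]

    # Placeholder "model": the scaled mean of the first CSV column.
    mean = sum(values) / len(values) if values else 0.0
    model = {"scaled_mean": scale * mean, "n_records": len(values)}

    # Whatever is written to the model directory becomes the model artifact.
    model_dir = base / "model"
    model_dir.mkdir(parents=True, exist_ok=True)
    artifact = model_dir / "model.json"
    artifact.write_text(json.dumps(model))
    return artifact


if __name__ == "__main__":
    # Exercise the entry point against a temporary directory instead of /opt/ml.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "input" / "config").mkdir(parents=True)
        (root / "input" / "data" / "training").mkdir(parents=True)
        (root / "input" / "config" / "hyperparameters.json").write_text('{"scale": "2.0"}')
        (root / "input" / "data" / "training" / "part-0.csv").write_text("1.0,a\n3.0,b\n")
        print(train(tmp).read_text())
```

Taking the base directory as a parameter keeps the script testable outside the container, which fits the local-first workflow described earlier in the deck.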
37. HYPERPARAMETER TUNING
Run a large set of training jobs
with varying hyperparameters...
... and search the
hyperparameter space for
improved accuracy.
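The search described above can be sketched in a few lines. This example uses plain random search, chosen here only for simplicity; the slide does not say which search strategy the service applies. The function `train_and_score` is a hypothetical stand-in for a real training job, and the hyperparameter names and ranges are made up for illustration.

```python
import random


def train_and_score(learning_rate: float, num_round: int) -> float:
    """Hypothetical stand-in for one training job; returns a validation score."""
    # Synthetic objective that peaks at learning_rate=0.1, num_round=100.
    return 1.0 - (learning_rate - 0.1) ** 2 - ((num_round - 100) / 1000) ** 2


def random_search(n_jobs: int, seed: int = 0):
    """Run n_jobs trials with random hyperparameters and keep the best one."""
    rng = random.Random(seed)
    best_score, best_params = None, None
    for _ in range(n_jobs):
        params = {
            "learning_rate": rng.uniform(0.01, 0.5),
            "num_round": rng.randint(10, 200),
        }
        score = train_and_score(**params)
        if best_score is None or score > best_score:
            best_score, best_params = score, params
    return best_score, best_params


best_score, best_params = random_search(n_jobs=50)
print(best_params, round(best_score, 4))
```

In a managed setting each trial would be a separate training job, so the trials can run in parallel rather than in a loop.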
40. HOSTING
Amazon ECR
30 50
10 10
Model Artifacts
Inference Image
Model versions
InstanceType: ml.c5.4xlarge
InitialInstanceCount: 3
maxInstanceCount: 10
ModelName: prod
VariantName: primary
InitialVariantWeight: 50
Create weighted ProductionVariant(s)
ProductionVariant
EASY MODEL DEPLOYMENT TO AMAZON SAGEMAKER
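To make the weighting concrete, the sketch below builds ProductionVariant entries as plain dictionaries and computes the share of traffic each one would receive, assuming traffic is split in proportion to each variant's weight divided by the sum of all weights. The field names follow the slide. The weights 30, 50, 10, 10 are read from the diagram. Every variant and model name other than `primary` and `prod` is hypothetical. `maxInstanceCount` is left out because, as far as this sketch assumes, the upper bound on instances is configured through auto scaling rather than on the variant itself.

```python
def production_variant(variant_name, model_name, weight,
                       instance_type="ml.c5.4xlarge", initial_instance_count=3):
    """Build one ProductionVariant entry using the field names from the slide."""
    return {
        "VariantName": variant_name,
        "ModelName": model_name,
        "InstanceType": instance_type,
        "InitialInstanceCount": initial_instance_count,
        "InitialVariantWeight": weight,
    }


def traffic_shares(variants):
    """Return the fraction of requests each variant receives (weight / total)."""
    total = sum(v["InitialVariantWeight"] for v in variants)
    return {v["VariantName"]: v["InitialVariantWeight"] / total for v in variants}


variants = [
    production_variant("primary", "prod", 50),
    production_variant("candidate-a", "prod-v2", 30),  # hypothetical names
    production_variant("candidate-b", "prod-v3", 10),  # hypothetical names
    production_variant("candidate-c", "prod-v4", 10),  # hypothetical names
]
shares = traffic_shares(variants)
print(shares)
```

A list like `variants` is the kind of structure an EndpointConfiguration groups together, which is how several model versions can sit behind one endpoint for A/B testing.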
41. HOSTING
Amazon ECR
30 50
10 10
Model Artifacts
Inference Image
Model versions
EndpointConfiguration
InstanceType: ml.c5.4xlarge
InitialInstanceCount: 3
maxInstanceCount: 10
ModelName: prod
VariantName: primary
InitialVariantWeight: 50
Create an EndpointConfiguration from one or many ProductionVariant(s)
ProductionVariant
EASY MODEL DEPLOYMENT TO AMAZON SAGEMAKER
42. HOSTING
Amazon ECR
30 50
10 10
Model Artifacts
Inference Image
Model versions
EndpointConfiguration
Inference Endpoint
InstanceType: ml.c5.4xlarge
InitialInstanceCount: 3
maxInstanceCount: 10
ModelName: prod
VariantName: primary
InitialVariantWeight: 50
Create an Endpoint from one EndpointConfiguration
ProductionVariant
EASY MODEL DEPLOYMENT TO AMAZON SAGEMAKER
43. HOSTING
Amazon ECR
30 50
10 10
Model Artifacts
Inference Image
Model versions
EndpointConfiguration
Inference Endpoint
InstanceType: ml.c5.4xlarge
InitialInstanceCount: 3
maxInstanceCount: 10
ModelName: prod
VariantName: primary
InitialVariantWeight: 50
One-click deployment for built-in algorithms and containers
ProductionVariant
EASY MODEL DEPLOYMENT TO AMAZON SAGEMAKER
45. BATCH TRANSFORM
Dataset in S3 bucket
AGENT
MODEL
Instance Node 1 … Instance Node n
Assembled Data
Record Batch
Request Data
Transformed Data
Cluster
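The flow in this diagram (split the dataset into record batches, send each batch to the model, assemble the transformed output) can be sketched locally as follows. This is only an illustration of the pattern under simplifying assumptions: everything runs in one process, and `uppercase_model` is a hypothetical stand-in for the hosted model.

```python
from typing import Callable, Iterable, Iterator, List


def record_batches(records: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group records into batches of at most batch_size items."""
    batch: List[str] = []
    for record in records:
        batch.append(record)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final, possibly smaller, batch


def batch_transform(records: Iterable[str],
                    model: Callable[[List[str]], List[str]],
                    batch_size: int = 2) -> List[str]:
    """Send each batch to the model and assemble the results in input order."""
    transformed: List[str] = []
    for batch in record_batches(records, batch_size):
        transformed.extend(model(batch))
    return transformed


def uppercase_model(batch: List[str]) -> List[str]:
    """Hypothetical model: 'predicts' by upper-casing each record."""
    return [record.upper() for record in batch]


results = batch_transform(["a", "b", "c"], uppercase_model, batch_size=2)
print(results)
```

In the managed version the batches would be spread across the instance nodes of the cluster instead of being processed in a single loop.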
46. SAGEMAKER SAMPLE END-TO-END ARCHITECTURE
SageMaker Notebooks
Training Algorithm
SageMaker Training
Amazon ECR
AWS CodeCommit
AWS CodePipeline
SageMaker Hosting
COCO dataset
AWS Lambda
Amazon API Gateway
Build
Train
Deploy
Static website hosted on S3
Inference requests
Amazon S3
Amazon CloudFront
Web assets on CloudFront
STYLE TRANSFER
47. IT’S NOT JUST ABOUT ML
Data Lake Storage
Amazon S3
Security
Access Control
Encryption
VPC
KMS
Auditing
Compliance
Roles
Fine Grained Access Controls
Compute
Powerful GPU & CPU Instances
AWS Lambda
Analytics
Amazon Athena
Amazon EMR
Amazon Redshift & Redshift Spectrum