© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Machine Learning Workflows with
Amazon SageMaker and AWS Step Functions
API325
Tom Faulhaber
Principal Engineer, AI Platforms
Amazon Web Services
Jeremy Irwin
Solution Architect
Cox Automotive Inc.
Andy Katz
Sr. Product Manager
Amazon Web Services
Today’s agenda
Build, train, and deploy machine learning models with
Amazon SageMaker
Build serverless workflows with less code to write and maintain using
AWS Step Functions
Learn how Cox Automotive combined SageMaker and Step Functions to
improve collaboration between data scientists and software engineers
New features to build and manage ML workflows even faster
“Once upon a time…”
Amazon SageMaker
SageMaker manages ML infrastructure
Build
• Pre-built notebook instances
• Highly optimized machine learning algorithms
Train
• One-click training for ML, deep learning, and custom algorithms
• Automatic model tuning (hyperparameter optimization)
Deploy
• Fully managed hosting at scale
• Deployment without engineering effort
Customers building and deploying on SageMaker
Machine learning cycle
[Figure: flow diagram of the machine learning cycle. Stages: Business problem, ML problem framing, Data collection, Data integration, Data preparation and cleaning, Data visualization and analysis, Feature engineering, Model training and parameter tuning, Model evaluation, Monitoring and debugging, Model deployment, Predictions. Decision branches are labeled YES and NO.]
Manage data on AWS
[Figure: the machine learning cycle diagram, repeated under this heading.]
Build and train models using SageMaker
[Figure: the machine learning cycle diagram, repeated under this heading.]
Deploy models using SageMaker
[Figure: the machine learning cycle diagram, repeated under this heading.]
What about the lines between the steps?
[Figure: the machine learning cycle diagram, repeated under this heading.]
What is Step Functions?
[Figure: example state machine diagram. State types shown: Task, Choice, Fail, Parallel. Other labels: Mountains, People, Snow, NotSupportedImageType.]
Step Functions uses Amazon States Language (JSON)
{
  "Comment": "Image Processing workflow",
  "StartAt": "ExtractImageMetadata",
  "States": {
    "ExtractImageMetadata": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:::function:photo-backendExtractImageMetadata-...",
      "InputPath": "$",
      "ResultPath": "$.extractedMetadata",
      "Next": "ImageTypeCheck",
      "Catch": [ {
        "ErrorEquals": [ "ImageIdentifyError" ],
        "Next": "NotSupportedImageType"
      } ],
      "Retry": [ {
        "ErrorEquals": [ "States.ALL" ],
        "IntervalSeconds": 1,
        "MaxAttempts": 2,
        "BackoffRate": 1.5 }, ...
Run tasks with any compute resource
[Figure: two ways to run a task. An activity worker on a traditional server long-polls Step Functions for work; an AWS Lambda function is invoked with a synchronous request.]
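The Retry block in the Amazon States Language example above retries with exponential backoff. As a rough sketch of the arithmetic, the wait before each retry is IntervalSeconds multiplied by BackoffRate once for every retry already made. The helper name `retry_delays` is ours for illustration and is not part of any AWS API.

```python
def retry_delays(interval_seconds, max_attempts, backoff_rate):
    """Return the wait (in seconds) before each retry attempt.

    Hypothetical helper for illustration only; it mirrors how the Amazon
    States Language describes IntervalSeconds, MaxAttempts, and BackoffRate.
    """
    return [interval_seconds * backoff_rate ** n for n in range(max_attempts)]


# Values taken from the Retry block on the slide
delays = retry_delays(interval_seconds=1, max_attempts=2, backoff_rate=1.5)
print(delays)  # [1.0, 1.5]
```

With the slide's values the task is retried at most twice, after 1 second and then after 1.5 seconds, before the error is passed on to the Catch block.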
Customers running workflows on Step Functions
“Back to our story…”
Amazon SageMaker
AWS Step Functions
Machine learning cycle
[Figure: the machine learning cycle diagram, repeated under this heading.]
Cox Automotive
OUR VISION
TRANSFORM THE WAY THE WORLD
BUYS, SELLS, OWNS, AND USES
CARS
“As Data Scientists, one of our biggest concerns with ML is that over
time the models learn bad behaviors from spoiled data.
We need to interject human expert oversight in our model
deployment process, in order to continuously deliver quality
models with minimal human intervention.”
Jeff Keller, Senior Decision Scientist
Cox Automotive
Digital advertising recommendations
Enable car dealers to make better
informed digital advertising
decisions
At Cox Automotive, ML-related
product development is
bifurcated:
• Decision Science builds
prediction models
• Engineering integrates models
into applications used by Cox
Automotive clients
Challenge: How can we
reduce the friction
between Data Science and
Engineering so that both
teams’ needs are fulfilled?
Engineering != Decision Science
Engineering
• Background: Computer Science
• Skills: automation, deployment, reusability, Java
• Imperatives: security, operability, scalability
• Cadence: 2-week sprints
Decision Science
• Background: Statistics
• Skills: statistics, modeling, analysis, R, Python
• Imperatives: accuracy, precision, interpretability
• Cadence: varies
AWS Compute Blog: Starting point
Amazon SageMaker model deployment pipeline
[Figure: pipeline architecture diagram. Visible labels: VPC (twice), Event, Data Scientist, Email.]
Requirements
• Model artifacts are created
as .zip files
• Models are created as
.tar.gz files
Configurable Parameters
• Source S3 buckets (landing
zone for newly built
models)
• Destination S3 buckets
(Engineering-owned)
• Email address
AWS Step Functions state machine definition
…
"StartAt": "GetNewModel",
"States": {
  "GetNewModel": {
    "Type": "Task",
    "Resource": "arn:aws:lambda:${region}:${act}:function:model-review-GetNewModelFunction",
    "ResultPath": "$",
    "Next": "GetManualReview"
  },
  "GetManualReview": {
    "Type": "Task",
    "Resource": "arn:aws:states:${region}:${act}:activity:model-review-getModelReviewDecision",
    "ResultPath": "$.taskresult",
    "TimeoutSeconds": 604800,
    "Next": "ApproveOrRejectNewModel"
  },
…
State machine activity workers
Call-work-respond: An external worker gets
token, does work, and updates activity with
success or failure
Call-work-delegate…respond: Our external
worker gets the token and then delegates
responsibility for updating the activity to
downstream AWS services
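[Figure: the two worker patterns. In the first, a traditional server calls GetActivityTask, receives the JSON input plus a TaskToken, and later calls SendTaskSuccess with the JSON result plus the TaskToken. In the second, the server delegates the TaskToken, and SendTaskSuccess is called downstream.]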
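The call-work-respond pattern can be sketched as a single polling step. This is a minimal illustration rather than the Cox Automotive implementation: `run_worker_once`, `FakeStepFunctions`, and the worker name are hypothetical, and the client is passed in so the example runs without AWS credentials. In practice the client would be `boto3.client('stepfunctions')`, whose `get_activity_task`, `send_task_success`, and `send_task_failure` calls are the ones used here.

```python
import json


def run_worker_once(sfn_client, activity_arn, do_work, worker_name="example-worker"):
    """Poll once for an activity task, do the work, and report the outcome.

    Returns True if a task was handled, False if the long poll found none.
    """
    response = sfn_client.get_activity_task(
        activityArn=activity_arn, workerName=worker_name
    )
    token = response.get("taskToken")
    if not token:
        # The long poll ended without a task being scheduled
        return False
    try:
        result = do_work(json.loads(response["input"]))
        sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    except Exception as exc:
        sfn_client.send_task_failure(
            taskToken=token, error=type(exc).__name__, cause=str(exc)
        )
    return True


class FakeStepFunctions:
    """In-memory stand-in for the Step Functions client (illustration only)."""

    def __init__(self, tasks):
        self.tasks = list(tasks)
        self.calls = []

    def get_activity_task(self, activityArn, workerName):
        return self.tasks.pop(0) if self.tasks else {}

    def send_task_success(self, taskToken, output):
        self.calls.append(("success", taskToken, output))

    def send_task_failure(self, taskToken, error, cause):
        self.calls.append(("failure", taskToken, error))


fake = FakeStepFunctions([{"taskToken": "tok-1", "input": '{"x": 2}'}])
handled = run_worker_once(fake, "arn:example", lambda data: {"doubled": data["x"] * 2})
print(handled, fake.calls)
```

In the call-work-delegate variant, the worker would stop after obtaining the token and hand it to another component, which later makes the `send_task_success` call itself.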
Activity token journey: Send models for review
import boto3
from urllib.parse import quote

# Long-poll the activity for a task and keep the token that identifies it
sfnClient = boto3.client('stepfunctions')
getActivityTaskResponse = sfnClient.get_activity_task(
    activityArn=activityArn,
    workerName='checkStateMachineActivityStatus'
)
taskToken = getActivityTaskResponse['taskToken']
sendEmail(taskToken, diagnosticsFileName,
          diagnosticsFile, diagnosticsFilePath, apiUrl)
…
def sendEmail(taskToken, diagnosticsFileName,
              diagnosticsFile, diagnosticsFilePath, apiUrl):
    sesClient = boto3.client('ses')
    # URL-encode the token so it is safe to embed in a path segment
    encodedtaskToken = quote(taskToken, safe='')
    approveLink = apiUrl + '/approve/' + encodedtaskToken
    rejectLink = apiUrl + '/reject/' + encodedtaskToken
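The token has to survive a round trip through a URL path, which is why it is percent-encoded before being placed in the approve and reject links. A small sketch of that round trip follows; the token value, the API URL, and the helper name `build_review_links` are made up for illustration.

```python
from urllib.parse import quote, unquote


def build_review_links(api_url, task_token):
    """Build approve/reject links that carry the task token as a path segment."""
    # safe='' also encodes '/', which would otherwise split the path
    encoded = quote(task_token, safe="")
    return {
        "approve": f"{api_url}/approve/{encoded}",
        "reject": f"{api_url}/reject/{encoded}",
    }


token = "AAAA/bbbb+cccc=="  # made-up token containing URL-unsafe characters
links = build_review_links("https://api.example.com/dev", token)

# The receiving side takes the last path segment and decodes it again
recovered = unquote(links["approve"].rsplit("/", 1)[1])
print(links["approve"])
print(recovered == token)
```

Without `safe=''`, the default behavior of `quote` leaves `/` unencoded, and a token containing a slash would no longer match a single `{taskToken}` path parameter.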
Activity token journey: Generate review request
[Figure: pipeline diagram. Visible labels: Data Scientist, Event.]
Activity token journey: Amazon API Gateway configuration
GetReviewDecisionFunction:
  handler: handler.getReviewDecision
  role: "${self:custom.terraformed.service.role}"
  events:
    - http:
        path: approve/{taskToken}
        method: get
        request:
          parameters:
            paths:
              taskToken: true
    - http:
        path: reject/{taskToken}
        method: get
        request:
          parameters:
            paths:
              taskToken: true
Activity token journey: Prepare arguments & output
path = event['path']
# The token was URL-encoded when the email links were built, so decode it
taskToken = unquote(event['pathParameters']['taskToken'])
taskSuccessOutput = '{"decision": "Approved"}'
taskFailureOutput = '{"decision": "Rejected"}'
if path.startswith('/reject'):
    message = "The model has been rejected and will not be promoted"
    status = 'rejected'
    kwargs = {
        'taskToken': taskToken,
        'output': taskFailureOutput
    }
elif path.startswith('/approve'):
    message = "The model has been approved and will be promoted"
    status = 'approved'
    kwargs = {
        'taskToken': taskToken,
        'output': taskSuccessOutput
    }
else:
    message = "The parameter does not match the expected parameter"
    status = None  # neither approve nor reject, so no activity update is sent
    print(message)
Activity token journey: Set activity status
try:
    if status == 'approved':
        sfnClient.send_task_success(**kwargs)
        responseData = {
            "statusCode": 200,
            "body": json.dumps({"decision": message})
        }
    elif status == 'rejected':
        # A rejection also completes the activity successfully; the
        # "Rejected" decision in the output is routed by the Choice state
        sfnClient.send_task_success(**kwargs)
        responseData = {
            "statusCode": 200,
            "body": json.dumps({"decision": message})
        }
    else:
        # Unexpected path: report the problem instead of failing on return
        responseData = {
            "statusCode": 400,
            "body": json.dumps({"decision": message})
        }
except Exception as e:
    raise e
return responseData
State input & output processing
Data produced by a Lambda function can be shared with downstream states via the
state output, a mutable JSON object that carries input and output data between
states.
Benefits:
• Upstream worker output can be used as input for downstream workers
(to reduce the number of repeat calls)
• Results from upstream states are preserved
State input & output processing: Append to output
{
  "name": "GetNewModel",
  "output": {
    "diagnosticsFilePath": "20181102/model_diagnostics.zip",
    "diagnosticsFileName": "model_diagnostics.zip"
  }
}
# State is configured to append the decision to its input
{
  "name": "GetManualReview",
  "output": {
    "diagnosticsFilePath": "20181102/model_diagnostics.zip",
    "diagnosticsFileName": "model_diagnostics.zip",
    "taskresult": {
      "decision": "Approved"
    }
  }
}
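The "append" behavior comes from the state's ResultPath: with "$.taskresult", the task result is written into a copy of the state's input instead of replacing it. The function below is a simplified, hypothetical model of that merge, written only to illustrate the idea. It handles "$" and plain dotted paths, not the full path syntax Step Functions supports.

```python
import copy


def apply_result_path(state_input, result, result_path):
    """Simplified model of ResultPath: merge a task result into the input.

    Hypothetical helper; supports only "$" and dotted paths like "$.a.b".
    """
    if result_path == "$":
        # The result replaces the input entirely
        return result
    if not result_path.startswith("$."):
        raise ValueError("unsupported ResultPath: " + result_path)
    output = copy.deepcopy(state_input)
    *parents, leaf = result_path[2:].split(".")
    node = output
    for key in parents:
        node = node.setdefault(key, {})
    node[leaf] = result
    return output


# Values taken from the slide
get_new_model_output = {
    "diagnosticsFilePath": "20181102/model_diagnostics.zip",
    "diagnosticsFileName": "model_diagnostics.zip",
}
review_output = apply_result_path(
    get_new_model_output, {"decision": "Approved"}, "$.taskresult"
)
print(review_output)
```

This is why GetNewModel uses "$" (its result becomes the whole state output) while GetManualReview uses "$.taskresult" (the diagnostics fields are kept and the decision is added next to them).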
State input & output processing: Choice states
"ApproveOrRejectNewModel": {
  "Type": "Choice",
  "Choices": [
    {
      "Variable": "$.taskresult.decision",
      "StringEquals": "Approved",
      "Next": "ApproveNewModel"
    },
    {
      "Variable": "$.taskresult.decision",
      "StringEquals": "Rejected",
      "Next": "RejectNewModel"
    }
  ]
}
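To make the routing concrete, here is a small Python model of how this Choice state picks the next state from its input. It is a sketch under simplifying assumptions: `get_path` and `next_state` are hypothetical helpers that understand only StringEquals rules and plain dotted paths, whereas the real Choice state supports many more comparison operators.

```python
def get_path(data, path):
    """Resolve a simple dotted path such as "$.a.b" (no array indexes)."""
    node = data
    for key in path[2:].split("."):
        node = node[key]
    return node


def next_state(choice_state, state_input):
    """Return the name of the state that the first matching rule points to."""
    for rule in choice_state["Choices"]:
        if get_path(state_input, rule["Variable"]) == rule["StringEquals"]:
            return rule["Next"]
    if "Default" in choice_state:
        return choice_state["Default"]
    # With no matching rule and no Default, Step Functions fails the execution
    raise RuntimeError("States.NoChoiceMatched")


approve_or_reject = {
    "Type": "Choice",
    "Choices": [
        {"Variable": "$.taskresult.decision", "StringEquals": "Approved",
         "Next": "ApproveNewModel"},
        {"Variable": "$.taskresult.decision", "StringEquals": "Rejected",
         "Next": "RejectNewModel"},
    ],
}

print(next_state(approve_or_reject, {"taskresult": {"decision": "Approved"}}))
print(next_state(approve_or_reject, {"taskresult": {"decision": "Rejected"}}))
```

Because the state on the slide has no Default, any decision other than "Approved" or "Rejected" would end the execution with an error.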
AWS Compute Blog: What we changed
• Step Functions
• Automating invocation of the state machine
• Using state input & output to pass upstream Lambda state/data to downstream Lambdas
• More than one state
• Amazon Simple Email Service (Amazon SES)
• Initial setup
• Attachments
• Model delivery to Engineering
• Infrastructure as code
“Engineering & Data Science
development cadences are different.
An ability to asynchronously collaborate
reduces wait states and frustration.”
Cox Automotive
What Decision Science learned about Engineering
• How to share
• AWS resources amongst different projects
• Infrastructure-as-code repo hierarchy and management
• An approach for working in multiple AWS environments (lab, non-prod,
prod)
What Engineering learned about Decision Science
• Human oversight is required to prevent unintended results and bias
• Data access & availability are real issues
• Are we collecting the right data to support future modeling efforts?
Example ML workflow
import boto3
import sagemaker

# `bucket` and `training_job` are defined elsewhere in the notebook
def upload_to_s3(channel, file):
    s3 = boto3.resource('s3')
    key = channel + '/' + file
    # Close the file handle once the upload has finished
    with open(file, "rb") as data:
        s3.Bucket(bucket).put_object(Key=key, Body=data)

# Training and validation input channels
train = sagemaker.s3_input('s3://{}/train/'.format(bucket),
                           content_type='application/x-recordio')
validation = sagemaker.s3_input('s3://{}/validation/'.format(bucket),
                                content_type='application/x-recordio')

# Batch transform over a folder of images
input_data = 's3://batch-test-data/caltech256/'
output_data = 's3://batch-test-output/DEMO-image-classification'
transformer = training_job.transformer(2, 'ml.p3.2xlarge', output_path=output_data,
                                       assemble_with='Line', max_payload=8,
                                       max_concurrent_transforms=8)
transformer.transform(input_data, content_type='application/x-image')
ML workflow in Step Functions
Manage asynchronous jobs without writing code!
Simplify machine learning workflows
Add AWS Glue ETL jobs in your workflows
"Synchronously Run a Glue Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::glue:startJobRun.sync",
  "Parameters": {
    "JobName.$": "$.myJobName",
    "AllocatedCapacity": 3
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}
Add Amazon SageMaker training and transform jobs in your workflows
"Synchronously Run a Training Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
  "Parameters": {
    "AlgorithmSpecification": {...},
    "HyperParameters": {...},
    "InputDataConfig": [...],
    ...
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}

"Synchronously Run a Transform Job": {
  "Type": "Task",
  "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync",
  "Parameters": {
    "TransformJobName.$": "$.transform",
    "ModelName.$": "$.model",
    "MaxConcurrentTransforms": 8,
    ...
  },
  "Catch": [
    {
      "ErrorEquals": ["States.TaskFailed"],
      "ResultPath": "$.cause",
      "Next": "Notify on Error"
    }
  ],
  "ResultPath": "$.jobInfo",
  "Next": "Report Success"
}
Define workflows in JSON
{
  "StartAt": "Download",
  "States": {
    "Download": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
      "Next": "Train"
    },
    "Train": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
      "ResultPath": "$.training_job",
      "Parameters": {
        "AlgorithmSpecification": {
          "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest",
          "TrainingInputMode": "File"
        }…
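A workflow written as JSON is just data, so it can be built and sanity-checked in code before it is deployed. The sketch below builds a cut-down version of the Download and Train workflow as a Python dict and checks that every StartAt, Next, Catch, and Choice target names a state that exists. The `check_transitions` helper is hypothetical, and the empty Parameters and the `"End": True` on Train are our placeholders, since the slide elides the rest of the definition.

```python
import json


def check_transitions(definition):
    """Return a list of transitions that point at states that do not exist."""
    states = definition["States"]
    problems = []
    if definition["StartAt"] not in states:
        problems.append("StartAt -> " + definition["StartAt"])
    for name, state in states.items():
        targets = [state["Next"]] if "Next" in state else []
        targets += [c["Next"] for c in state.get("Catch", [])]
        targets += [c["Next"] for c in state.get("Choices", [])]
        for target in targets:
            if target not in states:
                problems.append(name + " -> " + target)
    return problems


definition = {
    "StartAt": "Download",
    "States": {
        "Download": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data",
            "Next": "Train",
        },
        "Train": {
            "Type": "Task",
            "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync",
            "ResultPath": "$.training_job",
            "Parameters": {},  # placeholder: elided on the slide
            "End": True,       # placeholder: the slide's workflow continues
        },
    },
}

print(check_transitions(definition))  # []
print(json.dumps(definition, indent=2))
```

A check like this catches misspelled state names early, which is one of the more common mistakes when a definition is edited by hand.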
AWS Cloud Development Kit (AWS CDK)
JavaScript, TypeScript, Java, C#
Define your cloud resources using an imperative programming interface
Work in progress: Define workflows in Python
# Define an AWS Lambda task state
xfer_step = stepfunctions.task(self,
    name='Download',
    resource=lambda_.Function(self,
        name='xfer_recio',
        code=lambda_.Code.file('CodeFile.zip'),
        handler='download_data',
        runtime=lambda_.Runtime.python36,
        timeout=15 * 60
    ),
    result_path='$.training_data',
)
# Define an Amazon SageMaker task state
train_step = stepfunctions.task(self,
    "Train",
    resource='arn:aws:states:::sagemaker:createTrainingJob.sync',
    parameters=dict(
        TrainingJobName='string',
        HyperParameters={
            ...
# Define workflow in Python
sfn_state_machine = (
    xfer_step
    .next(train_step.add_catch(training_failure))
    .next(create_model_step)
    .next(transform_step.add_catch(transform_failure))
    .next(transform_success)
)
# Create an AWS Step Functions state machine
stepfunctions.StateMachine(self,
    name='ML Workflow',
    definition=sfn_state_machine,
    timeoutSec=30000
)
Amazon SageMaker
AWS Step Functions
Related breakouts
Tuesday, November 27
API302 - Serverless State Management & Orchestration for Modern Apps
10:45 AM – 11:45 AM | MGM, Level 1, Grand Ballroom 122
Wednesday, November 28
SRV373 - Building Massively Parallel Event-Driven Architectures
6:15 PM – 7:15 PM | Venetian, Level 3, Murano 3205
Thursday, November 29
AIM403 - Integrate Amazon SageMaker with Apache Spark
4:00 PM – 5:00 PM | Mirage, Grand Ballroom F
Resources
https://aws.amazon.com/machine-learning/
https://aws.amazon.com/modern-apps/
https://github.com/awslabs/aws-cdk
Thank you!
ML Workflows with Amazon SageMaker and AWS Step Functions (API325) - AWS re:Invent 2018

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Machine Learning Workflows with Amazon SageMaker and AWS Step Functions (API325). Speakers: Tom Faulhaber, Principal Engineer, AI Platforms, Amazon Web Services; Jeremy Irwin, Solution Architect, Cox Automotive Inc.; Andy Katz, Sr. Product Manager, Amazon Web Services.
  • 3. Today’s agenda: (1) build, train, and deploy machine learning models with Amazon SageMaker; (2) build serverless workflows with less code to write and maintain using AWS Step Functions; (3) learn how Cox Automotive combined SageMaker and Step Functions to improve collaboration between data scientists and software engineers; (4) new features to build and manage ML workflows even faster.
  • 4. “Once upon a time…” Amazon SageMaker
  • 5. SageMaker manages ML infrastructure. Build: pre-built notebook instances; highly optimized machine learning algorithms. Train: one-click training for ML, deep learning, and custom algorithms; automatic model tuning (hyperparameter optimization). Deploy: fully managed hosting at scale; deployment without engineering effort.
  • 6. Customers building and deploying on SageMaker
  • 7. Machine learning cycle [diagram stages: business problem; ML problem framing; data collection; data integration; data preparation and cleaning; data visualization and analysis; feature engineering; model training and parameter tuning; model evaluation; monitoring and debugging; model deployment; predictions; with a yes/no decision branch]
  • 8. Manage data on AWS [machine learning cycle diagram, as on slide 7]
  • 9. Build and train models using SageMaker [machine learning cycle diagram, as on slide 7]
  • 10. Deploy models using SageMaker [machine learning cycle diagram, as on slide 7]
  • 11. What about the lines between the steps? [machine learning cycle diagram, as on slide 7]
  • 12.
  • 13. What is Step Functions? [diagram: an example state machine with Task, Choice, Fail, and Parallel states; labels shown include Mountains, People, Snow, and NotSupportedImageType]
  • 14. Step Functions uses Amazon States Language (JSON) { "Comment": "Image Processing workflow", "StartAt": "ExtractImageMetadata", "States": { "ExtractImageMetadata": { "Type": "Task", "Resource": "arn:aws:lambda:::function:photo-backendExtractImageMetadata-...", "InputPath": "$", "ResultPath": "$.extractedMetadata", "Next": "ImageTypeCheck", "Catch": [ { "ErrorEquals": [ "ImageIdentifyError"], "Next": "NotSupportedImageType" } ], "Retry": [ { "ErrorEquals": [ "States.ALL"], "IntervalSeconds": 1, "MaxAttempts": 2, "BackoffRate": 1.5 }, ...
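The Retry block on slide 14 combines three fields: IntervalSeconds is the wait before the first retry, BackoffRate multiplies that wait for each later retry, and MaxAttempts caps the number of retries. A minimal sketch of the resulting schedule follows; the helper name `retry_delays` is made up for illustration and is not part of any AWS SDK.

```python
from typing import List


def retry_delays(interval_seconds: float, max_attempts: int, backoff_rate: float) -> List[float]:
    """Return the wait in seconds before each retry attempt.

    Retry n (counting from 0) waits interval_seconds * backoff_rate ** n.
    """
    return [interval_seconds * backoff_rate ** attempt for attempt in range(max_attempts)]


# Values taken from the Retry block on the slide.
delays = retry_delays(interval_seconds=1, max_attempts=2, backoff_rate=1.5)
print(delays)
```

With the slide's values, the task is retried at most twice, and the second wait is 1.5 times the first.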
  • 15. Run tasks with any compute resource. Activity worker on a traditional server: long poll. AWS Lambda function: synchronous request.
  • 16. Customers running workflows on Step Functions
  • 17. “Back to our story…” Amazon SageMaker AWS Step Functions
  • 18. Machine learning cycle [machine learning cycle diagram, as on slide 7]
  • 19. Cox Automotive. Our vision: transform the way the world buys, sells, owns, and uses cars.
  • 20. “As Data Scientists, one of our biggest concerns with ML is that over time the models learn bad behaviors from spoiled data. We need to interject human expert oversight in our model deployment process, in order to continuously deliver quality models with minimal human intervention.” Jeff Keller, Senior Decision Scientist Cox Automotive
  • 21. Digital advertising recommendations: enable car dealers to make better-informed digital advertising decisions. At Cox Automotive, ML-related product development is bifurcated: • Decision Science builds prediction models • Engineering integrates models into applications used by Cox Automotive clients. Challenge: How can we reduce the friction between Data Science and Engineering so that both teams’ needs are fulfilled?
  • 22.
  • 23.
  • 24. Engineering != Decision Science. Engineering: background in computer science; skills: automation, deployment, reusability, Java; imperatives: security, operability, scalability; cadence: 2-week sprints. Decision Science: background in statistics; skills: statistics, modeling, analysis, R, Python; imperatives: accuracy, precision, interpretability; cadence: varies.
  • 25. AWS Compute Blog: Starting point
  • 26. Amazon SageMaker model deployment pipeline [diagram labels: VPC, VPC, Event, Data Scientist, Email]. Requirements: • Model artifacts are created as .zip files • Models are created as .tar.gz files. Configurable parameters: • Source S3 buckets (landing zone for newly built models) • Destination S3 buckets (Engineering-owned) • Email address
  • 27. AWS Step Functions state machine definition … "StartAt": "GetNewModel", "States": { "GetNewModel": { "Type": "Task", "Resource": "arn:aws:lambda:${region}:${act}:function:model-review-GetNewModelFunction", "ResultPath": "$", "Next": "GetManualReview" }, "GetManualReview": { "Type": "Task", "Resource": "arn:aws:states:${region}:${act}:activity:model-review-getModelReviewDecision", "ResultPath": "$.taskresult", "TimeoutSeconds": 604800, "Next": "ApproveOrRejectNewModel" }, …
  • 28. State machine activity workers. Call-work-respond: an external worker gets a token, does the work, and updates the activity with success or failure (a traditional server calls GetActivityTask and receives JSON input + TaskToken, then calls SendTaskSuccess with JSON result + TaskToken). Call-work-delegate…respond: our external worker gets the token and then delegates responsibility for updating the activity to downstream AWS services (the worker hands off the TaskToken, and a downstream service calls SendTaskSuccess).
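The call-work-respond pattern on slide 28 can be sketched as one polling step. The calls `get_activity_task`, `send_task_success`, and `send_task_failure` are the boto3 Step Functions client methods; everything else here (the `run_worker_once` helper, the `FakeStepFunctions` stand-in, the ARN, the token, and the payload) is hypothetical and exists only so the sketch runs without AWS credentials.

```python
import json


def run_worker_once(sfn_client, activity_arn, do_work, worker_name="example-worker"):
    """Poll once for an activity task, run do_work on its input, and report the outcome."""
    task = sfn_client.get_activity_task(activityArn=activity_arn, workerName=worker_name)
    token = task.get("taskToken")
    if not token:  # the long poll returned without a task
        return None
    try:
        result = do_work(json.loads(task["input"]))
    except Exception as exc:
        sfn_client.send_task_failure(taskToken=token, error="WorkerError", cause=str(exc))
        return "failed"
    sfn_client.send_task_success(taskToken=token, output=json.dumps(result))
    return "succeeded"


class FakeStepFunctions:
    """Hypothetical stand-in for boto3.client('stepfunctions'), for offline runs."""

    def __init__(self):
        self.calls = []

    def get_activity_task(self, activityArn, workerName):
        return {"taskToken": "token-123", "input": json.dumps({"model": "m1"})}

    def send_task_success(self, taskToken, output):
        self.calls.append(("success", taskToken, output))

    def send_task_failure(self, taskToken, error, cause):
        self.calls.append(("failure", taskToken, error, cause))


fake = FakeStepFunctions()
status = run_worker_once(
    fake,
    "arn:aws:states:::activity:example",  # placeholder ARN, not a real resource
    lambda payload: {"decision": "Approved"},
)
print(status, fake.calls)
```

In the delegate variant, the worker would stop after reading the token and pass it to another service, which makes the `send_task_success` call later.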
  • 29. Activity token journey: Send models for review sfnClient = boto3.client('stepfunctions') getActivityTaskResponse = sfnClient.get_activity_task( activityArn=activityArn, workerName='checkStateMachineActivityStatus' ) taskToken = getActivityTaskResponse['taskToken'] sendEmail(taskToken, diagnosticsFileName, diagnosticsFile, diagnosticsFilePath, apiUrl) … def sendEmail(taskToken, diagnosticsFileName, diagnosticsFile, diagnosticsFilePath, apiUrl): sesClient = boto3.client('ses') encodedtaskToken = quote(taskToken, safe='') approveLink = apiUrl + '/approve/' + encodedtaskToken rejectLink = apiUrl + '/reject/' + encodedtaskToken
  • 30. Activity token journey: Generate review request
  • 31. Activity token journey: Amazon API Gateway configuration GetReviewDecisionFunction: handler: handler.getReviewDecision role: "${self:custom.terraformed.service.role}" events: - http: path: approve/{taskToken} method: get request: parameters: paths: taskToken: true - http: path: reject/{taskToken} method: get request: parameters: paths: taskToken: true
  • 32. Activity token journey: Prepare arguments & output path = event['path'] taskToken = unquote(event['pathParameters']['taskToken']) taskSuccessOutput = '{"decision": "Approved"}' taskFailureOutput = '{"decision": "Rejected"}' if path.startswith('/reject'): message = "The model has been rejected and will not be promoted" status = 'rejected' kwargs = { 'taskToken': taskToken, 'output': taskFailureOutput } else: if path.startswith('/approve'): message = "The model has been approved and will be promoted" status = 'approved' kwargs = { 'taskToken': taskToken, 'output': taskSuccessOutput } else: message = "The parameter does not match the expected parameter" print(message)
  • 33. Activity token journey: Set activity status try: if status == 'approved': sfnClient.send_task_success(**kwargs) responseData = { "statusCode": 200, "body": json.dumps({"decision": message}) } else: if status == 'rejected': sfnClient.send_task_success(**kwargs) responseData = { "statusCode": 200, "body": json.dumps({"decision": message}) } except Exception as e: raise e return responseData
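Slides 29 to 33 move the task token through an email link and back: the token is percent-encoded into the approve/reject URL, decoded by the handler, and sent back with a decision in the task output. Both decisions are reported through `send_task_success`, so a rejection is a normal result that the state machine routes on, not a task failure. The sketch below shows that round trip as pure functions; the helper names, the sample token, and the `example.invalid` URL are made up for illustration.

```python
import json
from urllib.parse import quote, unquote

# Path prefix -> decision written into the task output.
DECISIONS = {"/approve": "Approved", "/reject": "Rejected"}


def build_review_link(api_url, action, task_token):
    """Embed the token in a URL path; safe='' also encodes '/' characters."""
    return f"{api_url}/{action}/{quote(task_token, safe='')}"


def parse_review_request(path, encoded_token):
    """Return kwargs for send_task_success, or None for an unexpected path."""
    for prefix, decision in DECISIONS.items():
        if path.startswith(prefix):
            return {
                "taskToken": unquote(encoded_token),
                "output": json.dumps({"decision": decision}),
            }
    return None


token = "AAAA/bb+cc=="  # made-up token containing URL-unsafe characters
link = build_review_link("https://example.invalid", "approve", token)
encoded = link.rsplit("/", 1)[1]
kwargs = parse_review_request("/approve/" + encoded, encoded)
print(link)
print(kwargs)
```

Returning None for an unexpected path lets the caller respond with an error instead of reaching the status check with no status set, which the slide code would do.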
  • 34. State input & output processing. Lambda state can be shared with downstream (subsequent) states via the state output, a mutable JSON object that carries input and output data between states. Benefits: • Upstream worker output can be used as input for downstream workers (to reduce the number of repeat calls) • Maintain state of upstream states
  • 35. State input & output processing: Append to output { "name": "GetNewModel", "output": { "diagnosticsFilePath": "20181102/model_diagnostics.zip", "diagnosticsFileName": "model_diagnostics.zip" } } # State is configured to append the decision to its input { "name": "GetManualReview", "output": { "diagnosticsFilePath": "20181102/model_diagnostics.zip", "diagnosticsFileName": "model_diagnostics.zip", "taskresult": { "decision": "Approved" } } }
  • 36. State input & output processing: Choice states "ApproveOrRejectNewModel": { "Type": "Choice", "Choices": [ { "Variable": "$.taskresult.decision", "StringEquals": "Approved", "Next": "ApproveNewModel" }, { "Variable": "$.taskresult.decision", "StringEquals": "Rejected", "Next": "RejectNewModel" } ] }
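Slides 34 to 36 rely on two mechanisms: ResultPath appends a task's result to the state's input, and a Choice state then routes on a field in that merged document. The following is a simplified local simulation, not the Step Functions implementation: it handles only plain `$.a.b` paths and the StringEquals comparison, and the helper names are made up for illustration.

```python
import copy


def apply_result_path(state_input, result, result_path):
    """Merge a task result into the state input at a simple '$.a.b' path."""
    if result_path == "$":
        return result  # the result replaces the whole input
    output = copy.deepcopy(state_input)
    keys = result_path[2:].split(".")  # drop the leading "$."
    node = output
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = result
    return output


def get_path(doc, path):
    """Read a value at a simple '$.a.b' path."""
    node = doc
    for key in path[2:].split("."):
        node = node[key]
    return node


def choose(choices, doc):
    """Return the Next state of the first matching StringEquals rule, else None.

    In Step Functions itself, no match and no Default raises States.NoChoiceMatched.
    """
    for rule in choices:
        if get_path(doc, rule["Variable"]) == rule["StringEquals"]:
            return rule["Next"]
    return None


get_new_model_output = {
    "diagnosticsFilePath": "20181102/model_diagnostics.zip",
    "diagnosticsFileName": "model_diagnostics.zip",
}
merged = apply_result_path(get_new_model_output, {"decision": "Approved"}, "$.taskresult")
choices = [
    {"Variable": "$.taskresult.decision", "StringEquals": "Approved", "Next": "ApproveNewModel"},
    {"Variable": "$.taskresult.decision", "StringEquals": "Rejected", "Next": "RejectNewModel"},
]
next_state = choose(choices, merged)
print(merged, next_state)
```

The merged document matches the GetManualReview output on slide 35, and the Choice rules are those from slide 36.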
  • 37. AWS Compute Blog: What we changed • Step Functions • Automating invocation of the state machine • Using State input & output to pass upstream Lambda state/data to downstream Lambdas • > 1 state • Amazon Simple Email Service (Amazon SES) • Initial setup • Attachments • Model delivery to Engineering • Infrastructure as code
  • 38. “Engineering & Data Science development cadences are different. An ability to asynchronously collaborate reduces wait states and frustration.” Cox Automotive
  • 39.
  • 40. What Decision Science learned about Engineering • How to share • AWS resources amongst different projects • Infrastructure-as-code repo hierarchy and management • An approach for working in multiple AWS environments (lab, non-prod, prod)
  • 41. What Engineering learned about Decision Science • Human oversight is required to prevent unintended results and bias • Data access & availability are real issues • Are we collecting the right data to support future modeling efforts?
  • 42. Example ML workflow def upload_to_s3(channel, file): s3 = boto3.resource('s3') data = open(file, "rb") key = channel + '/' + file s3.Bucket(bucket).put_object(Key=key, Body=data) train = sagemaker.s3_input('s3://{}/train/'.format(bucket), content_type='application/x-recordio') validation = sagemaker.s3_input('s3://{}/validation/'.format(bucket), content_type='application/x-recordio') input_data = 's3://batch-test-data/caltech256/' output_data = 's3://batch-test-output/DEMO-image-classification' transformer = training_job.transformer(2, 'ml.p3.2xlarge', output_path=output_data, assemble_with='Line', max_payload=8, max_concurrent_transforms=8) transformer.transform(input_data, content_type='application/x-image')
  • 43. ML workflow in Step Functions
  • 44. Manage asynchronous jobs without writing code!
  • 45. Simplify machine learning workflows
  • 46. Add AWS Glue ETL jobs in your workflows "Synchronously Run a Glue Job": { "Type": "Task", "Resource": "arn:aws:states:::glue:startJobRun.sync", "Parameters": { "JobName.$": "$.myJobName", "AllocatedCapacity": 3 }, "Catch": [ {"ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" } ], "ResultPath": "$.jobInfo", "Next": "Report Success" }
  • 47. Add Amazon SageMaker training and transform jobs in your workflows "Synchronously Run a Training Job": { "Type": "Task", "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync", "Parameters": { "AlgorithmSpecification": {...}, "HyperParameters": {...}, "InputDataConfig": [...], ... }, "Catch": [ {"ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" } ], "ResultPath": "$.jobInfo", "Next": "Report Success" } "Synchronously Run a Transform Job": { "Type": "Task", "Resource": "arn:aws:states:::sagemaker:createTransformJob.sync", "Parameters": { "TransformJobName.$": "$.transform", "ModelName.$": "$.model", "MaxConcurrentTransforms": 8, ... }, "Catch": [ {"ErrorEquals": ["States.TaskFailed"], "ResultPath": "$.cause", "Next": "Notify on Error" } ], "ResultPath": "$.jobInfo", "Next": "Report Success" }
  • 48. Define workflows in JSON { "StartAt": "Download", "States": { "Download": { "Type": "Task", "Resource": "arn:aws:lambda:REGION:ACCT:function:download_data", "Next": "Train" }, "Train": { "Type": "Task", "Resource": "arn:aws:states:::sagemaker:createTrainingJob.sync", "ResultPath": "$.training_job", "Parameters": { "AlgorithmSpecification": { "TrainingImage": "811284229777.dkr.ecr.us-east-1.amazonaws.com/image-classification:latest", "TrainingInputMode": "File" }…
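A definition like the one on slide 48 can also be assembled as a Python dictionary and serialized to JSON, which makes it easy to check that every transition points at a state that exists. This is a sketch under stated assumptions: the `task_state` and `undefined_targets` helpers are made up for illustration, the Parameters block is cut down to one field, and the workflow ends at Train here, whereas the slide's definition is truncated and may continue.

```python
import json


def task_state(resource, next_state=None, **extra):
    """Build a Task state; a state with no successor is marked as terminal."""
    state = {"Type": "Task", "Resource": resource}
    state.update(extra)
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


def undefined_targets(definition):
    """Return the names used by StartAt or Next that are not defined states."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    targets.update(s["Next"] for s in states.values() if "Next" in s)
    return sorted(targets - set(states))


definition = {
    "StartAt": "Download",
    "States": {
        "Download": task_state(
            "arn:aws:lambda:REGION:ACCT:function:download_data",  # placeholders from the slide
            next_state="Train",
        ),
        "Train": task_state(
            "arn:aws:states:::sagemaker:createTrainingJob.sync",
            ResultPath="$.training_job",
            # Remaining training-job parameters are omitted in this sketch.
            Parameters={"AlgorithmSpecification": {"TrainingInputMode": "File"}},
        ),
    },
}

print(json.dumps(definition, indent=2))
print(undefined_targets(definition))
```

The serialized JSON is what would be supplied as the state machine definition; the check only catches dangling state names and does not validate the rest of the Amazon States Language.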
  • 49. AWS Cloud Development Kit (AWS CDK): JavaScript, TypeScript, Java, C#. Define your cloud resources using an imperative programming interface.
  • 50. Work in progress: Define workflows in Python # Define an AWS Lambda task state xfer_step = stepfunctions.task(self, name = 'Download', resource = lambda_.Function(self, name = 'xfer_recio', code=lambda_.Code.file('CodeFile.zip'), handler='download_data', runtime=lambda_.Runtime.python36, timeout=15 * 60 ), result_path='$.training_data', ) # Define an Amazon SageMaker task state train_step = stepfunctions.task(self, "Train", resource = 'arn:aws:states:::sagemaker:createTrainingJob.sync', parameters = ( TrainingJobName='string', HyperParameters={ ... # Define workflow in Python sfn_state_machine = ( xfer_step .next(train_step. add_catch(training_failure) ) .next(create_model_step) .next(transform_step. add_catch(transform_failure) ) .next(transform_success) ) # Create an AWS Step Functions state machine stepfunctions.StateMachine(self, name = 'ML Workflow', definition = sfn_state_machine, timeoutSec = 30000 )
  • 51. Amazon SageMaker, AWS Step Functions
  • 52. Related breakouts. Tuesday, November 27: API302 - Serverless State Management & Orchestration for Modern Apps, 10:45 AM – 11:45 AM | MGM, Level 1, Grand Ballroom 122. Wednesday, November 28: SRV373 - Building Massively Parallel Event-Driven Architectures, 6:15 PM – 7:15 PM | Venetian, Level 3, Murano 3205. Thursday, November 29: AIM403 - Integrate Amazon SageMaker with Apache Spark, 4:00 PM – 5:00 PM | Mirage, Grand Ballroom F.
  • 53. Resources: https://aws.amazon.com/machine-learning/ https://aws.amazon.com/modern-apps/ https://github.com/awslabs/aws-cdk
  • 54. Thank you!
  • 55.