Deliver customer value
FASTER with Step Functions
Yan Cui @theburningmonk
Agenda
What is Step Functions?
How does it work?
When to use it?
Orchestration vs Choreography
Real-world case studies
Design patterns
Step Functions
orchestration service that allows you to
model workflows as state machines
design with JSON
https://states-language.net/spec.html
Step Functions vs OOP
state machine = class
execution = instance
input = arguments
start a state machine via...
StepFunctions.startExecution(req).promise()
API Gateway
EventBridge, including cron
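A minimal sketch of the SDK route, assuming the AWS SDK for JavaScript v2 (the `.promise()` style used on the slide). The helper names and the ARN in the usage comment are hypothetical; the client is passed in so the code can be exercised with a stub.

```javascript
// Hypothetical helper: builds the request for StepFunctions.startExecution.
// The execution input has to be sent as a JSON string.
function buildStartExecutionRequest(stateMachineArn, input) {
  return {
    stateMachineArn,
    input: JSON.stringify(input)
  };
}

// `stepFunctions` is assumed to be an AWS SDK v2 client,
// e.g. `new AWS.StepFunctions()`.
async function startWorkflow(stepFunctions, stateMachineArn, input) {
  const req = buildStartExecutionRequest(stateMachineArn, input);
  const resp = await stepFunctions.startExecution(req).promise();
  return resp.executionArn;
}

// Usage (hypothetical ARN):
// startWorkflow(new AWS.StepFunctions(),
//   "arn:aws:states:us-east-1:123456789012:stateMachine:hello-world",
//   { x: 42, y: 13 });
```

API Gateway and EventBridge end up making the same StartExecution call on your behalf.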
state transitions
$25 PER MILLION
125X LAMBDA PRICING! ($0.20 per million requests)
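The per-transition price turns into a simple back-of-the-envelope calculation. This is a rough sketch that only uses the $25 per million figure from the slide; it ignores any free tier, and the function name is hypothetical.

```javascript
// Price for Standard Workflows as quoted on the slide ($25 per million
// state transitions). Check the current AWS pricing page before relying on it.
const PRICE_PER_MILLION_TRANSITIONS = 25;

// Hypothetical helper: estimated cost in USD for a number of executions,
// each making the given number of state transitions.
function estimateTransitionCost(executions, transitionsPerExecution) {
  const transitions = executions * transitionsPerExecution;
  return (transitions / 1e6) * PRICE_PER_MILLION_TRANSITIONS;
}

// 1 million executions of a workflow with 5 transitions each
console.log(estimateTransitionCost(1e6, 5)); // 125
```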
https://aws.amazon.com/about-aws/whats-new/2019/12/introducing-aws-step-functions-express-workflows
Yan Cui
http://theburningmonk.com
@theburningmonk
AWS user for 10 years
http://bit.ly/yubl-serverless
Developer Advocate @
Independent Consultant
advise
training development
theburningmonk.com/courses
homeschool.dev
designing state machines
types of states
"TaskState": {
 "Type": "Task",
 "Resource": "arn:aws:lambda:us-east-1:1234556788:function:hello-world",
 "Next": "NextState",
 "TimeoutSeconds": 300
}
Task
Performs a task.
TimeoutSeconds defaults to 60s, even if the function has a longer timeout.
Set this to match your function’s timeout.
Resource doesn’t have to be a Lambda function:
Activity, AWS Batch, ECS task, DynamoDB,
SNS, SQS, AWS Glue, SageMaker
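One way to keep the state's timeout and the function's timeout from drifting apart is to generate the Task state from a single value. A small sketch under that assumption; `lambdaTaskState` is a hypothetical helper, not part of any AWS library.

```javascript
// Hypothetical helper: builds a Task state whose TimeoutSeconds mirrors the
// Lambda function's own timeout.
function lambdaTaskState(functionArn, functionTimeoutSeconds, next) {
  if (!Number.isInteger(functionTimeoutSeconds) || functionTimeoutSeconds <= 0) {
    throw new RangeError("functionTimeoutSeconds must be a positive integer");
  }
  return {
    Type: "Task",
    Resource: functionArn,
    TimeoutSeconds: functionTimeoutSeconds,
    Next: next
  };
}

// Same shape as the "TaskState" definition shown earlier
const taskState = lambdaTaskState(
  "arn:aws:lambda:us-east-1:1234556788:function:hello-world",
  300,
  "NextState"
);
console.log(JSON.stringify({ TaskState: taskState }, null, 1));
```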
{ "x": 42, "y": 13 }
$ =>
{
  "x": 42,
  "y": 13
}
"choose": {
  "Type": "Choice",
  "Choices": [
    {
      "And": [
        {
          "Variable": "$.x",
          "NumericGreaterThanEquals": 42
        },
        {
          "Variable": "$.y",
          "NumericLessThan": 42
        }
      ],
      "Next": "subtract"
    }
  ],
  "Default": "add"
},
"subtract": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:…",
  "Next": "double",
  "ResultPath": "$.n"
},
module.exports.handler = async (input, context) => {
return input.x - input.y
}
ResultPath "$.n": the function's result is written to $.n
$ =>
{
  "x": 42,
  "y": 13,
  "n": 29
}
"double": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:...",
  "InputPath": "$.n",
  "End": true
}
module.exports.handler = async (input, context) => {
return input * 2;
}
InputPath "$.n": the function receives only $.n (29)
no ResultPath: the result replaces the whole state
$ => 58
{ "x": 42, "y": 13 }
{ "output": 58 }
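The walkthrough above can be replayed locally. This is a simplified simulation, not how Step Functions is implemented: it only understands `$` and `$.field` paths, and it uses synchronous stand-ins for the two handlers. `getPath` and `runTask` are hypothetical names.

```javascript
// Minimal JSONPath subset: "$" or "$.field" (enough for this walkthrough).
function getPath(state, path) {
  return path === "$" ? state : state[path.slice(2)];
}

// Mimics a Task state: InputPath selects what the function receives,
// ResultPath decides where the result lands ("$" replaces the whole state).
function runTask(state, handler, { InputPath = "$", ResultPath = "$" } = {}) {
  const result = handler(getPath(state, InputPath));
  return ResultPath === "$"
    ? result
    : { ...state, [ResultPath.slice(2)]: result };
}

// Synchronous stand-ins for the two Lambda handlers on the slides
const subtract = (input) => input.x - input.y;
const double = (input) => input * 2;

const afterSubtract = runTask({ x: 42, y: 13 }, subtract, { ResultPath: "$.n" });
const afterDouble = runTask(afterSubtract, double, { InputPath: "$.n" });

console.log(afterSubtract); // { x: 42, y: 13, n: 29 }
console.log(afterDouble);   // 58
```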
"NoOp": {
 "Type": "Pass",  
 "Result": {
   "is": 42
 },
 "ResultPath": "$.the_answer_to_the_question_of_life_the_universe_and_everything",
 "Next": "NextState"
}
Pass
Passes input to output without doing any work.
Given the input { }, the Result is injected at the ResultPath:
{
  "the_answer_to_the_question_of_life_the_universe_and_everything": {
    "is": 42
  }
}
"WaitTenSeconds" : {
 "Type" : "Wait",
 "Seconds" : 10,
 "Next": "NextState"
}
Wait
Wait before transitioning to next state.
"WaitUntil" : {
  "Type" : "Wait",
  "Timestamp": "2018-08-08T01:59:00Z",
  "Next": "NextState"
}
"WaitForSecondsPath" : {
  "Type" : "Wait",
  "SecondsPath" : "$.waitTime",
  "Next": "NextState"
}
"WaitUntilPath" : {
  "Type" : "Wait",
  "TimestampPath": "$.waitUntil",
  "Next": "NextState"
}
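With TimestampPath the caller decides the wake-up time at runtime. A small sketch of building that input, assuming the state reads `$.waitUntil` as in the definition above; `waitUntilInput` is a hypothetical helper, and the milliseconds are stripped only to match the timestamp format shown on the slide.

```javascript
// Hypothetical helper: builds execution input for a Wait state that uses
// "TimestampPath": "$.waitUntil".
function waitUntilInput(now, delaySeconds) {
  const wakeUp = new Date(now.getTime() + delaySeconds * 1000);
  // Date#toISOString returns e.g. 2018-08-08T01:59:00.000Z; drop the ".000"
  const waitUntil = wakeUp.toISOString().replace(/\.\d{3}Z$/, "Z");
  return { waitUntil };
}

console.log(waitUntilInput(new Date("2018-08-08T01:57:30Z"), 90));
// { waitUntil: '2018-08-08T01:59:00Z' }
```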
"ChoiceState": {
 "Type" : "Choice",
 "Choices": [
    {
      "Variable": "$.name",
      "StringEquals": "Neo",
      "Next": "RedPill"
    }
 ],
 "Default": "BluePill"
}
Choice
Adds branching logic to the state machine.
{
  "And": [
    {
      "Variable": "$.name",
      "StringEquals": "Cypher"
    },
    {
      "Variable": "$.afterNeoIsRescued",
      "BooleanEquals": true
    }
  ],
  "Next": "BluePill"
}
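To make the rule semantics concrete, here is a toy evaluator for the handful of comparison operators that appear in this deck. It is an illustration only: the real Choice state supports many more operators and full JSONPath, and `matches` and `choose` are hypothetical names.

```javascript
// Evaluates a small subset of Choice rules: And, StringEquals, BooleanEquals,
// NumericGreaterThanEquals and NumericLessThan. Variables are "$.field" only.
function matches(rule, input) {
  if (rule.And) return rule.And.every((r) => matches(r, input));
  const value = input[rule.Variable.slice(2)];
  if ("StringEquals" in rule) return value === rule.StringEquals;
  if ("BooleanEquals" in rule) return value === rule.BooleanEquals;
  if ("NumericGreaterThanEquals" in rule) {
    return typeof value === "number" && value >= rule.NumericGreaterThanEquals;
  }
  if ("NumericLessThan" in rule) {
    return typeof value === "number" && value < rule.NumericLessThan;
  }
  throw new Error("unsupported rule");
}

// Returns the Next of the first matching rule, otherwise Default.
function choose(choiceState, input) {
  const hit = choiceState.Choices.find((rule) => matches(rule, input));
  return hit ? hit.Next : choiceState.Default;
}

const choiceState = {
  Type: "Choice",
  Choices: [{ Variable: "$.name", StringEquals: "Neo", Next: "RedPill" }],
  Default: "BluePill"
};
console.log(choose(choiceState, { name: "Neo" }));     // RedPill
console.log(choose(choiceState, { name: "Trinity" })); // BluePill
```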
"FunWithMath": {
 "Type": "Parallel",
 "Branches": [
   {
     "StartAt": "Add",
     "States": {
       "Add": {
         "Type": "Task",
         "Resource": "arn:aws:lambda:us-east-1:1234556788:function:add",
         "End": true
       }
     }
   },
   …
 ],
 "Next": "NextState"
}
Parallel
Performs tasks in parallel.
"FunWithMath": {
  "Type": "Map",
  "Iterator": {
    "StartAt": "DoSomething",
    "States": {
      "DoSomething": {
        "Type": "Task",
        "Resource": "arn:aws:lambda:us-east-1:1234556788:function:doSomething",
        "End": true
      }
    }
  },
  "Next": "NextState"
}
Map
Dynamic parallelism!
Runs the Iterator once for each element of an input array,
e.g. [ { … }, { … }, { … } ]
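The difference between Parallel and Map is easy to see in plain JavaScript. This is only an analogy built on `Promise.all`, not how the service runs things: Parallel has a fixed list of branches that all receive the same input, while Map applies one iterator to every element of an array. The function names are hypothetical.

```javascript
// Parallel: a fixed list of branches, each receiving the same input.
// The output is an array with one entry per branch, in branch order.
async function parallel(branches, input) {
  return Promise.all(branches.map((branch) => branch(input)));
}

// Map: one iterator, applied to every element of an input array.
// The output array keeps the order of the input items.
async function map(iterator, items) {
  return Promise.all(items.map((item) => iterator(item)));
}

const add = async ({ x, y }) => x + y;
const subtract = async ({ x, y }) => x - y;

parallel([add, subtract], { x: 42, y: 13 }).then((out) => console.log(out)); // [ 55, 29 ]
map(add, [{ x: 1, y: 2 }, { x: 3, y: 4 }]).then((out) => console.log(out));  // [ 3, 7 ]
```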
"SuccessState" : {
 "Type" : "Succeed"
}
Succeed
Terminates the state machine successfully.
"FailState" : {
  "Type" : "Fail",
  "Error" : "TypeA",
  "Cause" : "Kaiju Attack"
}
Fail
Terminates the state machine and marks it as a failure.
https://aws.amazon.com/about-aws/whats-new/2019/08/aws-step-function-adds-support-for-nested-workflows
WHEN TO USE STEP FUNCTIONS?
$25 PER MILLION state transitions
125X LAMBDA PRICING! ($0.20 per million requests)
another moving part to manage
what’s the return-on-investment?
Pros: visual, error handling, audit
Cons: $$$
visual
Great for collaboration with team
members who are not engineers.
Makes it easy for anyone to identify
and debug application errors.
error handling
Makes deciding on timeout settings for Lambda
functions easier, since you don’t have to account for
retries, exponential backoff, etc.
Surfaces error handling for everyone to see,
without digging into the code.
business critical workflows
what: stuff that makes money, e.g. payment and
subscription flows.
why: more robust error handling worth the premium.
complex workflows
what: complex workflows that involve many states,
branching logic, etc.
why: visual workflow is a powerful design tool (for product)
and diagnostic tool (for customer support).
long running workflows
what: workflows that cannot complete in 15 minutes
(Lambda limit).
why: AWS discourages recursive Lambda functions;
Step Functions gives you explicit branching checks,
and can time out at the workflow level.
https://aws.amazon.com/about-aws/whats-new/2019/12/introducing-aws-step-functions-express-workflows
https://docs.aws.amazon.com/step-functions/latest/dg/concepts-standard-vs-express.html
use Express Workflows for high-throughput,
short-lived workflows (OLTP)
Orchestration vs Choreography
Rule of Thumb
orchestration within a bounded-context
choreography between bounded-contexts
bounded context
fits within my head
high cohesion
same ownership
bounded context
the workflow doesn’t exist
as a standalone concept,
but as the sum of a series of
loosely connected parts
[diagram: bounded context A, bounded context B, bounded context C (Lambda, SQS, API Gateway), connected via EventBridge / SNS]
https://lumigo.io/blog/5-reasons-why-you-should-use-eventbridge-instead-of-sns
don’t forget to
emit events from
the workflow
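Emitting an event from a workflow step could look roughly like this, assuming the AWS SDK for JavaScript v2 and EventBridge's PutEvents API. The source name, detail type and helper names are made up for illustration; the client is passed in so it can be stubbed.

```javascript
// Hypothetical helper: builds one entry for EventBridge's PutEvents API.
// "my-bounded-context" is an illustrative source name, not from the talk.
function workflowEvent(detailType, detail, eventBusName = "default") {
  return {
    Source: "my-bounded-context",
    DetailType: detailType,
    Detail: JSON.stringify(detail), // Detail has to be a JSON string
    EventBusName: eventBusName
  };
}

// `eventBridge` is assumed to be an AWS SDK v2 client,
// e.g. `new AWS.EventBridge()`.
async function emit(eventBridge, entry) {
  return eventBridge.putEvents({ Entries: [entry] }).promise();
}

// Usage (hypothetical event):
// emit(new AWS.EventBridge(), workflowEvent("order_paid", { orderId: "123" }));
```

Other bounded contexts can then subscribe to these events without knowing the workflow exists.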
Step Functions in the wild
Backend system was slow and had
timing issues, so they needed to add a
90s delay before processing payment.
Step Functions was the most cost-efficient
and scalable way to implement this wait.
Update nutritional info on over 100
brands to comply with FDA regulations.
Reduced processing time from 36 hours
to 10 seconds.
Transcode video segments in parallel.
Reduced processing time from ~20 mins
to ~2 mins.
Manages their food delivery experience.
Automates subscription fulfilments with
Step Functions.
Automates subscription billing system
with Step Functions.
Implements payment processing, and
subscription fulfillment systems with Step
Functions, and many more.
sagas
managing failures in a distributed transaction
Success path!
Failure paths…
"BookFlight": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:…",
  "Catch": [
    {
      "ErrorEquals": [ "States.ALL" ],
      "ResultPath": "$.BookFlightError",
      "Next": "CancelFlight"
    }
  ],
  "ResultPath": "$.BookFlightResult",
  "Next": "BookRental"
},
"CancelFlight": {
  "Type": "Task",
  "Resource": "arn:aws:lambda:…",
  "Catch": [
    {
      "ErrorEquals": [ "States.ALL" ],
      "ResultPath": "$.CancelFlightError",
      "Next": "CancelFlight"
    }
  ],
  "ResultPath": "$.CancelFlightResult",
  "Next": "CancelHotel"
},
https://github.com/theburningmonk/lambda-saga-pattern
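The control flow that the Catch clauses encode can be sketched in plain JavaScript. This is not the code from the repository above, just an illustration of the idea: when a step fails, the compensations of every step attempted so far (including the failed one, mirroring BookFlight going to CancelFlight) run in reverse order. Unlike the CancelFlight state, which catches its own errors and retries itself, this sketch does not retry failed compensations. All names are hypothetical.

```javascript
// Runs each step's action in order. If one throws, runs the compensations of
// the attempted steps in reverse order and reports the original error.
async function runSaga(steps, input) {
  const attempted = [];
  try {
    for (const step of steps) {
      attempted.push(step);
      await step.action(input);
    }
    return { status: "succeeded" };
  } catch (err) {
    for (const step of attempted.reverse()) {
      await step.compensate(input); // compensations should be idempotent
    }
    return { status: "compensated", error: err.message };
  }
}

// Example: the flight booking fails, so flight and hotel are cancelled.
const log = [];
const step = (name, fails = false) => ({
  action: async () => {
    log.push("Book" + name);
    if (fails) throw new Error("Book" + name + " failed");
  },
  compensate: async () => { log.push("Cancel" + name); }
});

runSaga([step("Hotel"), step("Flight", true), step("Rental")], {})
  .then((result) => console.log(result.status, log));
// compensated [ 'BookHotel', 'BookFlight', 'CancelFlight', 'CancelHotel' ]
```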
callbacks
https://docs.aws.amazon.com/step-functions/latest/dg/connect-to-resource.html#connect-wait-token
[diagram: the state machine hands a TaskToken to an external worker (?), which later calls sendTaskSuccess(TaskToken)]
"Publish SQS message": {
 "Type": "Task",
 "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
 "Parameters": {
 "QueueUrl": !Ref MyQueue,
"MessageBody": {
"Token.$": "$$.Task.Token"
}
},
 "Next": "NextState"
}
https://docs.aws.amazon.com/step-functions/latest/dg/input-output-contextobject.html
"Publish SQS message": {
 "Type": "Task",
 "Resource": "arn:aws:states:::sqs:sendMessage.waitForTaskToken",
 "Parameters": {
 "QueueUrl": !Ref MyQueue,
"MessageBody": {
"Token.$": "$$.Task.Token",
"ExecutionId.$": "$$.Execution.Id",
"StateMachineId.$": "$$.StateMachine.Id"
}
},
 "Next": "NextState"
}
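[diagram: a worker (?) that calls sendTaskSuccess(TaskToken) directly needs access to Step Functions]
[diagram: alternatively, the worker (?) sends the TaskToken via HTTP POST to an endpoint that calls sendTaskSuccess]
https://go.aws/38KynD1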
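On the receiving side, a worker that consumes the SQS message might report back along these lines. This sketch assumes the AWS SDK for JavaScript v2, an SQS-triggered Lambda-style event, and a message body carrying `Token` as in the state definition above; `handleMessages` and `doWork` are hypothetical names.

```javascript
// Handles SQS messages published by the "Publish SQS message" state and
// reports the outcome back to the waiting execution.
// `stepFunctions` is assumed to be an AWS SDK v2 client,
// e.g. `new AWS.StepFunctions()`; `doWork` holds the actual business logic.
async function handleMessages(stepFunctions, event, doWork) {
  for (const record of event.Records) {
    const message = JSON.parse(record.body);
    let result;
    try {
      result = await doWork(message);
    } catch (err) {
      // Fails the waiting state; Catch/Retry on the state can react to `error`
      await stepFunctions
        .sendTaskFailure({ taskToken: message.Token, error: err.name, cause: err.message })
        .promise();
      continue;
    }
    // Resumes the execution; `output` has to be a JSON string
    await stepFunctions
      .sendTaskSuccess({ taskToken: message.Token, output: JSON.stringify(result) })
      .promise();
  }
}
```

The function's IAM role needs permission to call SendTaskSuccess and SendTaskFailure.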
map-reduce
[diagram: a Map state fans out over the input items { … }, { … }, …; its output array [{ … }, { … } … ] is passed to a Reduce step]
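The Reduce step is just a task that receives the Map state's output array and folds it into one value. A minimal sketch in the same handler style as the earlier slides; the `count` field is a hypothetical per-item result chosen for illustration.

```javascript
// Reduce step: receives the Map state's output (one result per input item)
// and folds it into a single value.
const reduceHandler = async (mapResults, context) => {
  return mapResults.reduce((total, result) => total + result.count, 0);
};

// In a Lambda function this would be exported as the handler:
// module.exports.handler = reduceHandler;

reduceHandler([{ count: 2 }, { count: 3 }, { count: 5 }])
  .then((total) => console.log(total)); // 10
```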
https://github.com/awsdocs/aws-step-functions-developer-guide/blob/master/doc_source/limits.md
use S3 to store
large payloads
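One common shape for this is to pass a pointer instead of the payload once it grows past a threshold. A sketch assuming the AWS SDK for JavaScript v2 and Node.js; the threshold is a parameter because the actual limit should be taken from the limits page linked above, and `offloadIfLarge` is a hypothetical helper.

```javascript
// Hypothetical helper: keeps small payloads inline, stores large ones in S3
// and returns a pointer that later states can use to fetch the object.
// `s3` is assumed to be an AWS SDK v2 client, e.g. `new AWS.S3()`.
async function offloadIfLarge(s3, bucket, key, payload, maxInlineBytes) {
  const body = JSON.stringify(payload);
  if (Buffer.byteLength(body, "utf8") <= maxInlineBytes) {
    return { inline: payload };
  }
  await s3.putObject({ Bucket: bucket, Key: key, Body: body }).promise();
  return { s3: { bucket, key } };
}
```

A downstream task would check which of the two fields is present and call `getObject` when it receives a pointer.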
theburningmonk.com/hire-me
Advise
Training Delivery
“Fundamentally, Yan has improved our team by increasing our
ability to derive value from AWS and Lambda in particular.”
Nick Blair
Tech Lead
productionreadyserverless.com
@theburningmonk
theburningmonk.com
github.com/theburningmonk
