@theburningmonk #aws #awslambda #serverless
Serverless Design Patterns
Yan Cui @theburningmonk
“Serverless”
2014
It is serverless the same way WiFi is wireless.
Gojko Adzic
http://bit.ly/2yQgwwb
Serverless means…
don’t pay for it if no-one uses it
don’t need to worry about scaling
don’t need to provision and manage servers
“Function-as-a-Service”
AWS Lambda
Azure Functions
Google Cloud Functions
Auth0 Webtask
Spotinst Functions
Kubeless
IBM Cloud Functions
AWS Lambda
API Gateway IOT SNS Kinesis CloudWatch
[Diagram: the same stack (Function, Application, Runtime, Container, OS, Virtualization, Hardware) drawn once each for IaaS, CaaS, PaaS and FaaS, with every layer colour-coded as User, User (scalable unit) or Provider]
Serverless
FaaS
other services…
Database
Storage
BI
SERVERLESS WILL FUNDAMENTALLY CHANGE HOW WE BUILD BUSINESS AROUND TECHNOLOGY AND HOW YOU CODE.
Simon Wardley
more Scalable
(and scales faster!)
Cheaper
(don’t pay for idle servers)
more Resilient
(built-in redundancy and multi-AZ)
idea → production
choose language + framework
master language + framework
figure out deployment
configure AMI
configure ELB
configure autoscaling
capacity planning
over-provision for launch
are we doing microservices?
configure CI/CD
idea → production
greater Velocity from idea to product
minimise Undifferentiated heavy-lifting
Less ops responsibility on your shoulders
http://bit.ly/2Dpidje
events are an enabler for
COMPOSABILITY
AWS LAMBDA
is the...
PATTERNS
WARNING!!
DESIGN PATTERNS DO NOT GUARANTEE SUCCESS
Pattern
/pat(ə)n/
A pattern is the repeated or regular way in which something happens or is done.
http://bit.ly/2Goq5mY
there are no silver bullets
UNDERSTAND YOUR PROBLEMS AND CONSTRAINTS OVER FOLLOWING A PATTERN.
me
Yan Cui
http://theburningmonk.com
@theburningmonk
Principal Engineer @
available in Austria, Switzerland, Germany, Japan, Canada, Italy and US
available on 30+ platforms
~1,000,000 concurrent viewers
We’re hiring! Visit engineering.dazn.com to learn more.
follow @dazneng for updates about the engineering team
AWS user since 2009
https://www.youtube.com/watch?v=pptsgV4bKv8
https://bit.ly/production-ready-serverless
http://bit.ly/2C9LwIM
CRON
CloudWatch Events → AWS Lambda
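A minimal sketch of the cron pattern, assuming a CloudWatch Events schedule rule (cron or rate expression) invokes the function. The names `on_schedule` and `run_nightly_job` are hypothetical; the `source`, `detail-type` and `time` fields follow the shape of a scheduled event.

```python
from datetime import datetime, timezone


def is_scheduled_event(event: dict) -> bool:
    """True if the event looks like a CloudWatch Events scheduled invocation."""
    return (
        event.get("source") == "aws.events"
        and event.get("detail-type") == "Scheduled Event"
    )


def run_nightly_job(triggered_at: datetime) -> str:
    # Hypothetical job body; replace with the real work.
    return f"job ran for schedule tick {triggered_at.isoformat()}"


def on_schedule(event: dict, context: object = None) -> dict:
    """Lambda entry point for a schedule rule."""
    if not is_scheduled_event(event):
        return {"status": "ignored"}
    # "time" is the UTC timestamp of the schedule tick, e.g. 2018-03-01T02:00:00Z
    tick = datetime.strptime(event["time"], "%Y-%m-%dT%H:%M:%SZ")
    tick = tick.replace(tzinfo=timezone.utc)
    return {"status": "ok", "message": run_nightly_job(tick)}
```

Keeping the job body separate from the event parsing makes it easy to run the job locally without a real schedule.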
OPS AUTOMATION
[Diagram, built up over several slides: AWS Lambda and CloudWatch Logs, then CloudTrail and CloudWatch Events, with further AWS Lambda functions attached to them]
auto-update CloudWatch retention policy
auto-create alarms for new APIs
auto-create dashboards for new APIs
alert on suspicious console logins
alert on EC2 activities in unused regions
…
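As one illustration of the list above, here is a sketch of the "auto-update CloudWatch retention policy" idea. It assumes a CloudWatch Events rule forwards the CloudTrail record of the `CreateLogGroup` call to the function. `on_log_group_created` and `RETENTION_DAYS` are hypothetical names, and the client is injectable so the logic can be exercised without AWS.

```python
RETENTION_DAYS = 30  # assumed policy; CloudWatch Logs accepts only specific values


def default_logs_client():
    # Imported lazily so the module loads even where boto3 is not installed.
    import boto3
    return boto3.client("logs")


def on_log_group_created(event: dict, context: object = None, logs_client=None) -> str:
    """Apply a retention policy to a log group as soon as it is created."""
    detail = event["detail"]
    if detail.get("eventName") != "CreateLogGroup":
        return "ignored"
    log_group = detail["requestParameters"]["logGroupName"]
    if logs_client is None:
        logs_client = default_logs_client()
    logs_client.put_retention_policy(
        logGroupName=log_group,
        retentionInDays=RETENTION_DAYS,
    )
    return log_group
```

The other items in the list follow the same shape: react to an API activity event, then call the relevant service to bring the new resource in line.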
WEB APPS
[Diagram: Browser, Route53, CloudFront, S3, API Gateway, AWS Lambda, DynamoDB]
Cognito
Cognito User Pools, Federated Identities, Sync
Cognito User Pools: User Flows
Registration
Verify email/phone
Secure sign-in
Forgotten password
Change password
Sign out
Cognito User Pools: Leading Practices
Secure password handling with SRP protocol
Encrypt all data server-side
Password policies
Token-based authentication
MFA
Support CAPTCHA
identity providers: Cognito User Pools, Facebook, Twitter, Google, …
1. authenticate with an identity provider and get back a token
2. pass the token to Cognito Federated Identities, which validates it with the identity provider
3. Cognito Federated Identities returns an IAM credential
4. use the IAM credential to call API Gateway, S3, DynamoDB, SNS, IOT, Kinesis
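The token-for-credential exchange can be sketched as follows, assuming the token comes from a Cognito User Pool and the identity pool trusts that user pool. `exchange_token_for_credentials` and its parameters are hypothetical names; the two calls, `get_id` and `get_credentials_for_identity`, are operations on the boto3 `cognito-identity` client.

```python
def default_identity_client(region: str):
    # Imported lazily so the sketch loads even where boto3 is not installed.
    import boto3
    return boto3.client("cognito-identity", region_name=region)


def exchange_token_for_credentials(id_token, identity_pool_id, user_pool_id,
                                   region, client=None) -> dict:
    """Swap an identity provider token for temporary IAM credentials."""
    if client is None:
        client = default_identity_client(region)
    # Login key format for a Cognito User Pool acting as the identity provider.
    provider = f"cognito-idp.{region}.amazonaws.com/{user_pool_id}"
    logins = {provider: id_token}
    identity_id = client.get_id(
        IdentityPoolId=identity_pool_id, Logins=logins
    )["IdentityId"]
    response = client.get_credentials_for_identity(
        IdentityId=identity_id, Logins=logins
    )
    # Contains AccessKeyId, SecretKey, SessionToken and Expiration.
    return response["Credentials"]
```

In a browser the AWS SDK normally performs this exchange for you; the sketch only makes the steps visible.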
what about Multi-Region support?
https://aws.amazon.com/dynamodb/global-tables
http://amzn.to/2Bwb5j6
[Diagram: Browser, Route53, CloudFront and S3, with API Gateway and AWS Lambda deployed in both eu-west-1 and us-east-1, and DynamoDB]
http://bit.ly/2FGKsuA
DATA LAKES
[Diagram, built up over several slides: S3 Buckets with IAM, KMS and Macie; Kinesis Streams and Kinesis Firehose; AWS Lambda functions; DynamoDB, ElasticSearch and Google BigQuery; Athena and QuickSight]
EVENT DRIVEN
http://bit.ly/2Dpidje
[Diagram, built up over several slides: service-A and service-B (each API Gateway + AWS Lambda) around a Kinesis stream, with further AWS Lambda functions, DynamoDB and IOT attached]
build loosely-coupled systems through events
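A small sketch of a subscriber on the consuming side, assuming events are published to Kinesis as JSON. Lambda delivers Kinesis records in batches with base64-encoded payloads; the `type` field and the `order_placed` value are hypothetical names used only for illustration.

```python
import base64
import json


def decode_records(event: dict) -> list:
    """Decode the base64 payloads of a Kinesis batch into dicts, in order."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]))
        for record in event["Records"]
    ]


def on_kinesis_batch(event: dict, context: object = None) -> int:
    """React only to the event types this service cares about."""
    handled = 0
    for domain_event in decode_records(event):
        # Hypothetical event type; unknown types are skipped, not rejected,
        # so publishers can add new events without breaking this subscriber.
        if domain_event.get("type") == "order_placed":
            handled += 1
    return handled
```

The publisher does not know this function exists, which is what keeps the two services loosely coupled.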
[Diagram, built up over several slides: service A, service B, service C and service D grouped into bounded contexts and connected by events; caption: backward-compatible?]
DON’T use events to orchestrate workflows within the same bounded context
adds unnecessary complexity to logging, tracing, and end-to-end reporting
the workflow doesn’t exist as a standalone concept, but as the sum of a series of loosely connected parts
Step Functions
use Step Functions instead
don’t forget to emit events from the workflow, so others can react to state changes that happened as part of the workflow
DECOUPLED INVOCATION
Decoupled Invocation
How can a service handle normal request loads, peak request loads, and a continuous period of high load without failing?
business logic requires expensive processing
API Gateway
max integration timeout is 29 seconds
http://amzn.to/2BwW5Bx
downstream systems not as scalable
decouple reply from the initial request
Client → API: POST /do_something
API → Client: 202 /result_location
worker: do work
Client → API: GET /result_location
API → Client: 202 /result_location
worker: work done
Client → API: GET /result_location
API → Client: 200 OK
amortises spikes in load
allows fast response back to caller whilst promising to finish work later
allows flexible retry strategies by removing the urgency of having to reply to caller right away
"task results" table in DynamoDB: task id | created at | result
1. POST → API Gateway: PutItem a new row (task id xxx, created at xxx, result <null>, i.e. not ready) and put the task on SQS
2. API Gateway replies 202
3. GET → API Gateway reads the row; result is still <null> (not ready), so it replies 202
   use “created at” timestamp to timeout polling requests and avoid infinite retry
4. the task comes off SQS, the work is done, and UpdateItem sets result to { … } (done)
5. GET → API Gateway reads the row; result is { … } (done), so it replies 200 { … }
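The request and polling flow above can be sketched in memory. A dict stands in for the DynamoDB table and a deque for SQS; the function names, the `/result/<task id>` path, the 300-second cut-off and the 504 status for timed-out tasks are all assumptions made for the illustration.

```python
import time
import uuid
from collections import deque

TASK_TIMEOUT_SECONDS = 300  # assumed cut-off for polling

task_results = {}     # stands in for the DynamoDB "task results" table
task_queue = deque()  # stands in for the SQS queue


def post_do_something(payload, now=None):
    """POST /do_something: record the task, queue it, reply 202."""
    task_id = str(uuid.uuid4())
    created_at = time.time() if now is None else now
    task_results[task_id] = {"created_at": created_at, "done": False, "result": None}
    task_queue.append((task_id, payload))
    return 202, f"/result/{task_id}"


def get_result(task_id, now=None):
    """GET /result_location: 200 when done, 202 while pending."""
    now = time.time() if now is None else now
    item = task_results.get(task_id)
    if item is None:
        return 404, None
    if item["done"]:
        return 200, item["result"]
    if now - item["created_at"] > TASK_TIMEOUT_SECONDS:
        return 504, None  # assumed status: tells the client to stop polling
    return 202, f"/result/{task_id}"


def worker_step(do_work):
    """Process one queued task and store its result (the UpdateItem step)."""
    if not task_queue:
        return False
    task_id, payload = task_queue.popleft()
    task_results[task_id].update(done=True, result=do_work(payload))
    return True
```

Because the worker pulls from the queue at its own pace, a burst of POST requests only lengthens the queue instead of overloading whatever `do_work` depends on.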
control parallelism with Lambda reserved concurrency setting
also consider using Kinesis Streams or DynamoDB Streams as queue
if you use SNS, make sure to enable maxReceivesPerSecond delivery policy (otherwise, invocation-per-message means no amortisation)
PUB-SUB
msg broker
subscriber
subscriber
subscriber
subscriber
…
one message, many consumers
good for decoupling data processing
independent failures
partial failures are easier to manage
SNS, Kinesis Streams, DynamoDB Streams, etc…
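The properties listed above (one message, many consumers, independent failures) can be illustrated with an in-process broker. This is a toy sketch, not how SNS or Kinesis work internally; `Broker` and its methods are hypothetical names.

```python
from collections import defaultdict


class Broker:
    """Toy message broker that fans each message out to every subscriber."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, message):
        """Deliver to every subscriber; return failures keyed by subscriber name."""
        failures = {}
        for callback in self._subscribers[topic]:
            try:
                callback(message)
            except Exception as error:
                # One subscriber failing must not stop delivery to the others.
                failures[callback.__name__] = error
        return failures
```

A failing subscriber shows up in the returned dict while the remaining subscribers still receive the message, which is what makes partial failures manageable.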
SNS
2 retries then DLQ
invocation per msg
might run into throttling limits
consider impact on downstream
SNS
suffers from temporary issues
[Charts: msg/s over time against max throughput, showing messages that erred and were retried, and a downstream outage]
Kinesis
retried until success
invocation per shard
better handling of temporary issues
[Charts: msg/s over time, received vs processed, with max throughput amortised; and the same during a downstream outage]
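The difference between the two retry behaviours can be simulated in a few lines. This is a simplified model of the slides' description ("2 retries then DLQ" per message for SNS, "retried until success" per shard for Kinesis), not of the services themselves; the function names and the `max_polls` guard are assumptions.

```python
def deliver_sns_style(messages, process, max_attempts=3):
    """One invocation per message: 1 attempt + 2 retries, then dead-letter it."""
    processed, dead_letters = [], []
    for message in messages:
        for _attempt in range(max_attempts):
            try:
                process(message)
                processed.append(message)
                break
            except Exception:
                continue
        else:
            # Every attempt failed: the message goes to the dead-letter queue.
            dead_letters.append(message)
    return processed, dead_letters


def deliver_kinesis_style(batch, process, max_polls=1000):
    """One invocation per shard: the whole batch is retried until it succeeds."""
    for poll in range(1, max_polls + 1):
        try:
            for record in batch:
                process(record)
            return poll  # number of invocations it took
        except Exception:
            # Nothing is skipped; the same batch is handed over again, in order.
            continue
    raise RuntimeError("batch still failing after max_polls")
```

During a short downstream outage the first function moves messages to the DLQ, while the second simply falls behind and catches up later. Note that the batch retry can reprocess records that already succeeded, so handlers should be idempotent.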
DynamoDB
DynamoDB Streams
Kinesis Streams or DynamoDB Streams?
what is your source of truth?
limited to events from one table
records describe DynamoDB events, not events from your domain
auto-scales no. of shards
cannot extend data retention beyond 24 hours
charged based on no. of requests ($0.02 per 100,000 read request units)
1 msg/s for a month, 1KB per msg
  Kinesis Streams    $10.836    (1 x 60s x 60m x 24hr x 30days @ $0.014 per mil) + (24hrs x 30days @ $0.015 per hr)
  SNS                $1.296     1 x 60s x 60m x 24hr x 30days @ $0.5 per mil
  DynamoDB Streams   $0.47      1 Write Capacity Unit @ $0.47 per unit

1k msg/s for a month, 1KB per msg
  Kinesis Streams    $47.088    (1k x 60s x 60m x 24hr x 30days @ $0.014 per mil) + (24hrs x 30days @ $0.015 per hr)
  SNS                $1296.00   1k x 60s x 60m x 24hr x 30days @ $0.5 per mil
  DynamoDB Streams   $470.00    1k Write Capacity Unit @ $0.47 per unit
DON’T take these projections at face value!
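The arithmetic behind the projections can be reproduced with a short script. The prices are the ones quoted on the slides and will have changed since; the function names are hypothetical, and the single-shard default mirrors the slides' calculation rather than a sizing recommendation.

```python
SECONDS_PER_MONTH = 60 * 60 * 24 * 30  # 30-day month, as on the slides
HOURS_PER_MONTH = 24 * 30


def kinesis_monthly_cost(msgs_per_second, shards=1):
    """$0.014 per million PUT units plus $0.015 per shard-hour."""
    millions = msgs_per_second * SECONDS_PER_MONTH / 1_000_000
    return millions * 0.014 + shards * HOURS_PER_MONTH * 0.015


def sns_monthly_cost(msgs_per_second):
    """$0.50 per million publishes."""
    return msgs_per_second * SECONDS_PER_MONTH / 1_000_000 * 0.50


def dynamodb_streams_monthly_cost(msgs_per_second):
    """One write capacity unit per 1KB write per second, at $0.47 per unit."""
    return msgs_per_second * 0.47


for rate in (1, 1000):
    print(
        f"{rate} msg/s:",
        f"Kinesis ${kinesis_monthly_cost(rate):.3f},",
        f"SNS ${sns_monthly_cost(rate):.3f},",
        f"DynamoDB Streams ${dynamodb_streams_monthly_cost(rate):.2f}",
    )
```

The shard-hour charge dominates Kinesis at low volume and becomes negligible at high volume, which is why the ranking flips between the two scenarios.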
SNS
no restriction on destination target
need to handle partial failures & retries
SAGA
pattern for managing failures where each action has a compensating action for rollback
https://www.youtube.com/watch?v=xDuwrtwYHu8
Begin transaction
Start book hotel request
End book hotel request
Start book flight request
End book flight request
Start book car rental request
End book car rental request
End transaction
model actions and compensating actions as Lambda functions
actions
compensating actions
state machine in AWS Step Functions as the coordinator for the saga
AWS Step Functions
http://bit.ly/2uTJBE3
input
source code available here:
https://github.com/theburningmonk/lambda-saga-pattern
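The coordinator's job can be sketched in plain Python. This is not the code from the repository above and does not use Step Functions; it only illustrates the rule that when an action fails, the compensating actions of the completed steps run in reverse order. `run_saga` and the "Failed", "Compensated" and "rolled back" log entries are hypothetical.

```python
def run_saga(steps):
    """Run (name, action, compensate) steps in order.

    On failure, undo the steps that already completed, most recent first.
    Returns (succeeded, log).
    """
    log = ["Begin transaction"]
    completed = []  # (name, compensate) for every step that succeeded so far
    for name, action, compensate in steps:
        log.append(f"Start {name} request")
        try:
            action()
        except Exception:
            log.append(f"Failed {name} request")
            for done_name, undo in reversed(completed):
                undo()
                log.append(f"Compensated {done_name}")
            log.append("End transaction (rolled back)")
            return False, log
        log.append(f"End {name} request")
        completed.append((name, compensate))
    log.append("End transaction")
    return True, log
```

With hotel, flight and car rental steps where the car rental fails, the flight is cancelled first and the hotel second. In the Step Functions version each action and compensating action would be a Lambda function, with the state machine playing the role of this loop.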
API Gateway and Kinesis
Authentication & authorisation (IAM, Cognito)
Testing
Running & Debugging functions locally
Log aggregation
Monitoring & Alerting
X-Ray
Correlation IDs
CI/CD
Performance and Cost optimisation
Error Handling
Configuration management
VPC
Security
Leading practices (API Gateway, Kinesis, Lambda)
Canary deployments
http://bit.ly/prod-ready-serverless
get 40% off
with: ytcui
@theburningmonk
theburningmonk.com
github.com/theburningmonk

Serverless Design Patterns (Serverless Computing London)