WEBINARS
is the biggest tech conference for
developers in EMEA, open to all
languages and technologies.
SPECIAL DISCOUNT
May 27-28, 2020
LET’S MEET IN CODEMOTION AMSTERDAM 2020!
20% DISCOUNT FOR YOU
CODE: CodemotionAmsterdam20Cui
What do you mean
by ‘serverless’?
@theburningmonk theburningmonk.com
“Serverless”
Gojko Adzic
It is serverless the same way
WiFi is wireless.
http://bit.ly/2yQgwwb
Serverless means…
don’t pay for it if no-one uses it
don’t need to worry about scaling
don’t need to provision and manage servers
in other words, it’s a lot like taking a cab
Ownership
Fuel
Navigate
To get there!
Focus on
getting there!
Layers: HW Ownership, OS, Runtime & Scale, Code
Physical Servers, Virtual Machines, Containers, Serverless
Focus on getting there!
Nano Services
Self Managed
Cost Paradigm Change
Async
Dynamic agile env
“why are we failing at this?”
hidden dangers
monolith microservices serverless
observability
distributed
systems
bounded
context
event driven
monolith serverless
missing learnings
from microservices
poor decisions
Yan Cui
http://theburningmonk.com
@theburningmonk
AWS user for 10 years
http://bit.ly/yubl-serverless
Developer Advocate @
Independent Consultant
advise, training, delivery
https://theburningmonk.com/workshops
Amsterdam, March 19-20 Helsinki, May 4-5 Stockholm, May 14-15
Dublin, June 16-17 London, September 24-25 Berlin, October 8-9
#1
not letting go of legacy thinking
“we’re doing serverless, but why aren’t things going faster?”
Socio Technical
there are no silver bullets
centralised team
Team A Team B Team C Team D …
“but the developers don’t understand AWS and how
our infrastructure is set up”
let’s solve this
problem instead!
what got you here won’t get you there
if (path == "/user" && method == "GET") {
  return getUser(…);
} else if (path == "/user" && method == "DELETE") {
  return deleteUser(…);
} else if (path == "/user" && method == "POST") {
  return createUser(…);
} else if …
Monolithic Functions
GET /user
POST /user
DELETE /user
Single-Purposed Functions
Monolithic
user-api-dev (author: yan.cui, feature: user-api)
Single-Purposed
user-api-dev-get-user (author: yan.cui, feature: user-api)
user-api-dev-create-user (author: yan.cui, feature: user-api)
user-api-dev-delete-user (author: yan.cui, feature: user-api)
find related functions by prefix
discoverability (without having to dig into the code)
what does it do?
dynamodb:GetItem
dynamodb:PutItem
dynamodb:DeleteItem
no least privilege…
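One way to read the permissions slide: a monolithic function needs the union of all three DynamoDB actions, while each single-purposed function can be granted exactly the one it uses. The sketch below is only an illustration. The table ARN is hypothetical, and the mapping of function name to action is an assumption based on the function names in the slides.

```javascript
// Hypothetical table ARN, for illustration only.
const USERS_TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/users";

// The only DynamoDB action each single-purposed function needs (assumed mapping).
const ACTIONS_BY_FUNCTION = {
  "user-api-dev-get-user": ["dynamodb:GetItem"],
  "user-api-dev-create-user": ["dynamodb:PutItem"],
  "user-api-dev-delete-user": ["dynamodb:DeleteItem"],
};

// Build an IAM policy statement scoped to a single function.
function statementFor(functionName) {
  const actions = ACTIONS_BY_FUNCTION[functionName];
  if (!actions) {
    throw new Error(`unknown function: ${functionName}`);
  }
  return { Effect: "Allow", Action: actions, Resource: USERS_TABLE_ARN };
}

// The monolithic function needs the union of every action.
function monolithicStatement() {
  const all = Object.values(ACTIONS_BY_FUNCTION).flat();
  return { Effect: "Allow", Action: [...new Set(all)], Resource: USERS_TABLE_ARN };
}
```

With this split, a compromised get-user function can only read from the table; the monolithic statement would also allow writes and deletes.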
require(x)
require(y)
require(z)
more dependencies equals slower cold start
worse cold start performance
keep functions simple, and single-purposed
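A minimal sketch of the three routes as single-purposed functions. Each handler would be deployed as its own function, with routing left to API Gateway instead of an if/else chain. The in-memory `users` map and the event shape (`pathParameters`, `body`) are assumptions for illustration; a real handler would talk to a database.

```javascript
// In-memory stand-in for a users table (assumption, for illustration only).
const users = new Map();

// GET /user
async function getUser(event) {
  const user = users.get(event.pathParameters.id);
  return user
    ? { statusCode: 200, body: JSON.stringify(user) }
    : { statusCode: 404, body: "" };
}

// POST /user
async function createUser(event) {
  const user = JSON.parse(event.body);
  users.set(user.id, user);
  return { statusCode: 201, body: JSON.stringify(user) };
}

// DELETE /user
async function deleteUser(event) {
  users.delete(event.pathParameters.id);
  return { statusCode: 204, body: "" };
}
```

Because each function does one thing, its name says what it does, and it only needs the dependencies and permissions for that one operation.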
#2
one account that rules them all
mind the shared limits
Resource Limits
no. of DynamoDB tables
no. of API Gateway regional APIs
no. of API Gateway edge-optimized APIs
no. of Kinesis shards
no. of IAM roles
no. of S3 buckets
no. of CloudFormation stacks
no. of SNS subscription filters
no. of SSM parameters
…
Throughput Limits
DynamoDB read & write
API Gateway requests/second
Lambda concurrent executions
SSM parameter ops/second
…
compartmentalise security breaches
One account per Team per Environment
#3
do first, research later
https://einaregilsson.com/serverless-15-percent-slower-and-eight-times-more-expensive/
the platforms need to do better at educating users on
how to choose between different services
SNS vs SQS vs Kinesis vs MSK?
              | Kinesis | SQS | SNS | EventBridge
ordering      | by shard | none (standard), global (FIFO) | none | none
replay events | up to 7 days | none | none | none
mode          | batched | batched (up to 10) | singular | singular
retry         | retried until success (customizable) | retry + DLQ | retry + DLQ | retry + DLQ
concurrency   | 1 per shard | auto-scaled | fan-out!!! | fan-out!!!
subscribers   | many | one-to-one | many | many
https://medium.com/theburningmonk-com/all-my-posts-on-serverless-aws-lambda-43c17a147f91
https://www.jeremydaly.com/newsletter/
#4
not using a deployment toolkit
https://lumigo.io/blog/comparison-of-lambda-deployment-frameworks/
don’t write your own deployment framework
#5
console-driven development
#6
one repo per function
github repo (one for every function)
monorepo?
github repo
one repo per service?
user-api: github repo
timeline-api: github repo
relationship-api: github repo
search-api: github repo
https://lumigo.io/blog/mono-repo-vs-one-per-service/
CI/CD pipeline per service
functions are deployed together, as a stack
#7
unencrypted secrets in env vars
secrets should NEVER be in plain text in env variables
SSM Parameter Store
Secret 1
Secret 2
IAM
Environment:
SECRET_1: …
SECRET_2: …
yay!
SSM Parameter Store
Secret 1
Secret 2
IAM
fetch at cold start, cache, invalidate every x mins
https://github.com/middyjs/middy
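The “fetch at cold start, cache, invalidate every x mins” approach can be sketched without any library. Here `loadSecrets` is a hypothetical function standing in for the call to SSM Parameter Store, and `ttlMs` and the injectable clock `now` are assumptions made so the sketch can be tested. This is not middy's API, only an illustration of the idea.

```javascript
// Wrap a loader so its result is cached and reloaded once ttlMs has passed.
// If loadSecrets returns a Promise, the Promise itself is cached; note that a
// rejected Promise would then stay cached until the TTL expires, which a
// production version would want to handle.
function cachedSecrets(loadSecrets, ttlMs, now = Date.now) {
  let value;
  let expiresAt = -Infinity; // nothing cached yet: the first call always loads

  return function getSecrets() {
    if (now() >= expiresAt) {
      value = loadSecrets();
      expiresAt = now() + ttlMs;
    }
    return value;
  };
}
```

Creating the wrapper and calling it once outside the handler would do the initial fetch during the cold start; calls inside the handler then hit the cache until it expires, so the secrets never have to sit in environment variables.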
switch to Higher Throughput if you need more than 40 ops/s
#8
not following least privilege principle
#9
missing DLQs
async: S3, SNS, SES, CloudFormation, CloudWatch Logs, CloudWatch Events, Scheduled Events, CodeCommit, AWS Config
sync: Cognito, Alexa, Lex, API Gateway
pulling: DynamoDB Stream, Kinesis Stream, SQS
http://amzn.to/2v7Kc3b
async: Lambda handles retries (twice, then DLQ)
http://amzn.to/2vs2lIg
configure DLQ for async functions so you don’t lose failed events
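To make the “twice, then DLQ” behaviour concrete, here is a small simulation. It is not how Lambda is implemented and it calls no AWS API: `invokeAsync` and the in-memory `dlq` array are hypothetical names, and the retry count is the one stated on the slide.

```javascript
// Simulate an async invocation: one attempt plus up to `maxRetries` retries.
// If every attempt throws, the event goes to the DLQ when one is configured;
// without a DLQ the failed event is simply lost.
function invokeAsync(handler, event, dlq, maxRetries = 2) {
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { ok: true, result: handler(event), attempts: attempt + 1 };
    } catch (err) {
      // swallow the error and fall through to the next attempt
    }
  }
  if (dlq) {
    dlq.push(event);
  }
  return { ok: false, attempts: maxRetries + 1 };
}
```

Passing `undefined` as the `dlq` shows the failure mode the slide warns about: after the last retry the event is gone and nothing records that it ever failed.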
DLQ: payload
Lambda Destinations: payload, context(s), and response
#10
too much/too little concurrency
“Lambda generates too much load for the downstream system”
SNS → Lambda: one invocation per message
SNS → Lambda → Downstream System
if you want…
maximum throughput: SNS
precise control over throughput: Kinesis
how quickly it scales out
SQS, DynamoDB Streams
#11
cold starts
“cold starts only happen to the first request”
concurrent execution (i.e. a container) is like a class instance
function invocation is like a method call
Lambda scales the number of concurrent executions
based on traffic
existing “containers” are reused where possible
(diagram: invocations over time)
(diagram: invocations over time, with warmer pings in between)
Lambda warmers don’t work when you have > 1
concurrent executions
FREQUENCY DURATION
FREQUENCY: dictated by user traffic, out of your control
cold starts are generally not an issue if you have a steady traffic pattern
(chart: req/s over time)
(chart: req/s over time, El Clásico)
(chart: req/s over time, lunch and dinner)
DURATION: optimize this!
minimise the duration of cold starts so they
fall within acceptable latency range
(chart: req/s over time with lunch and dinner peaks, split into Provisioned Concurrency and On-Demand Concurrency)
https://lumigo.io/blog/provisioned-concurrency-the-end-of-cold-starts/
there are no silver bullets
provisioned concurrency is a powerful tool IFF you have a cold start problem
don’t use it by default
#12
RDS connection handling
default RDS configs are bad for Lambda
idle connections are not closed
too many connections per “container”
max open connections is too low
https://www.jeremydaly.com/manage-rds-connections-aws-lambda/
set “wait_timeout” and “interactive_timeout” to 10 mins
(default is 8 hours!)
increase “max_connections” setting
set client socket pool size to 1
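A back-of-the-envelope check shows why the pool size matters: each concurrent execution (“container”) keeps its own pool, so open connections scale with concurrency. The function names and the numbers below are illustrative assumptions, not recommended values.

```javascript
// Worst-case open connections: every container fills its own pool.
function worstCaseConnections(concurrentExecutions, poolSize) {
  return concurrentExecutions * poolSize;
}

// Does the database's max_connections setting cover the worst case?
function fitsMaxConnections(concurrentExecutions, poolSize, maxConnections) {
  return worstCaseConnections(concurrentExecutions, poolSize) <= maxConnections;
}

// Illustrative numbers: 100 concurrent executions against max_connections = 150.
console.log(fitsMaxConnections(100, 10, 150)); // pool of 10 needs 1000 connections
console.log(fitsMaxConnections(100, 1, 150)); // pool of 1 needs 100 connections
```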
#13
(lack of) observability
happened, user impact, system repaired
reduce MTTR
Identify & Resolve Issues
Understanding costs
Visibility
MTTDiscovery
“What alerts should I have?”
It depends on what you’re building…
But, this is a good starting point
Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency
API Gateway: p90/95/99 latency, success rate %, 4xx rate %, 5xx rate %
SQS: message age
Step Functions: failed count, throttle count, timed out count
monitor and alert on message flow rate for
event processing pipelines
“Can’t you codify these?”
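One hedged way to codify part of that starting point is to keep alarm definitions as data and evaluate them in one place. The sketch covers three of the Lambda alerts only. The thresholds and the shape of the `metrics` object are made-up assumptions for illustration; in practice the values would come from your monitoring system and the thresholds from what you are building.

```javascript
// Error rate as a percentage of invocations (0 when there were no invocations).
function errorRate(errors, invocations) {
  return invocations === 0 ? 0 : (errors / invocations) * 100;
}

// Alarm definitions as data. All thresholds are hypothetical.
const LAMBDA_ALARMS = [
  { name: "error rate %", threshold: 1, value: (m) => errorRate(m.errors, m.invocations) },
  { name: "throttle count", threshold: 0, value: (m) => m.throttles },
  { name: "iterator age", threshold: 60000, value: (m) => m.iteratorAgeMs },
];

// Return the names of the alarms whose value exceeds its threshold.
function firingAlarms(metrics, alarms = LAMBDA_ALARMS) {
  return alarms.filter((a) => a.value(metrics) > a.threshold).map((a) => a.name);
}
```

Keeping the definitions in one list makes it cheap to apply the same baseline to every function and to review the thresholds in code review.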
https://theburningmonk.com/hire-me
Advise, Training, Delivery
“Fundamentally, Yan has improved our team by increasing our
ability to derive value from AWS and Lambda in particular.”
Nick Blair
Tech Lead
https://theburningmonk.com/workshops
Amsterdam, March 19-20 Helsinki, May 4-5 Stockholm, May 14-15
Dublin, June 16-17 London, September 24-25 Berlin, October 8-9
10% off with code codemotion-2020
Production-Ready Serverless
@theburningmonk
theburningmonk.com
github.com/theburningmonk

Common mistakes in serverless adoption