applying the best parts of Microservices
to Serverless
Yan Cui
http://theburningmonk.com
@theburningmonk
Principal Engineer @
follow @DAZN_ngnrs
for updates about the
engineering team
We’re hiring! Visit
engineering.dazn.com
to learn more.
2006
2010
2016
SQL → NoSQL
OOP → Functional
On Premise → Cloud
Waterfall → Agile
Monoliths → Microservices
Server-ful → Serverless
https://en.wikipedia.org/wiki/Hype_cycle
https://gtnr.it/2KGyGCM
what’s this?
this solves all my problems!
this is rubbish!
I’m starting to get it..
I know what I’m doing
SQL → NoSQL
OOP → Functional
On Premise → Cloud
Waterfall → Agile
Monoliths → Microservices
Server-ful → Serverless
“those who cannot remember the
past are condemned to repeat it”
- George Santayana
what’s this?
this solves all my problems!
this is rubbish!
I’m starting to get it..
I know what I’m doing
lesson 1. don’t fly blind
2017
observability
http://bit.ly/2EXQZBj
http://bit.ly/2EXKEFZ
“These are the four pillars of the Observability Engineering team’s charter:
• Monitoring
• Alerting/visualization
• Distributed systems tracing infrastructure
• Log aggregation/analytics”
- Observability Engineering at Twitter, http://bit.ly/2DnjyuW
NO ACCESS
to underlying OS
NOWHERE
to install agents/daemons
critical paths (user request → handler): minimise user-facing latency
background processing (handler → StatsD, rsyslog): batched, asynchronous, low overhead
with Lambda, NO background processing except what the platform provides
2016-07-12T12:24:37.571Z 994f18f9-482b-11e6-8668-53e4eab441ae
GOT is off air, what do I do now?
UTC Timestamp | Request Id | your log message
one log group per
function
one log stream for each
concurrent invocation
“logs are not easily searchable in CloudWatch Logs” - me
CloudWatch Logs → AWS Lambda → ELK stack
http://bit.ly/lambda-log-aggregation
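A minimal sketch of the shipping function in the middle of that pipeline, assuming a CloudWatch Logs subscription triggers the Lambda; sendToElasticsearch is a hypothetical helper standing in for the actual delivery to the ELK stack (see the link above for a fuller treatment):

// sketch of a log-shipping function: CloudWatch Logs -> Lambda -> ELK
const zlib = require('zlib');

module.exports.handler = async (event) => {
  // CloudWatch Logs subscriptions deliver a base64-encoded, gzipped payload
  const payload = Buffer.from(event.awslogs.data, 'base64');
  const json = JSON.parse(zlib.gunzipSync(payload).toString('utf8'));

  // json.logEvents is an array of { id, timestamp, message }
  const docs = json.logEvents.map(e => ({
    logGroup: json.logGroup,
    logStream: json.logStream,
    timestamp: e.timestamp,
    message: e.message
  }));

  await sendToElasticsearch(docs); // hypothetical: bulk-index into the ELK stack
};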
“you need to use structured logging” - me
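For example, a minimal structured logger (a sketch, not any particular library) emits one JSON object per log line, so the fields can be indexed and searched downstream:

// minimal structured logger: one JSON object per log line
const log = (level, message, fields = {}) =>
  console.log(JSON.stringify({
    level,
    message,
    functionName: process.env.AWS_LAMBDA_FUNCTION_NAME, // set by the Lambda runtime
    timestamp: new Date().toISOString(),
    ...fields
  }));

log('info', 'hydrating yubls from db', { userId: '42' });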
CloudWatch Logs
$0.50 per GB ingested
$0.03 per GB archived per month
for comparison, 1M invocations of a 128MB function (at 100ms each) =
$0.000000208 * 1M + $0.20 = $0.408
DON’T leave debug logging ON in production
“you need to sample debug logs in production” - me
chart: volume of logs vs. observability
• all debug logs: $$$$$$
• no debug logs: hurts mean time to resolution (MTTR) during a production incident
• sampled debug logs
“always log the invocation event on error” - me
http://bit.ly/lambda-sample-debug-logs
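A sketch of both ideas, assuming a hypothetical SAMPLE_DEBUG_RATE environment variable and a hypothetical doWork function: enable debug logging for a small percentage of invocations, and always log the full invocation event when the handler throws (see the link above for a fuller write-up):

// sample debug logs in production, and log the invocation event on error
// SAMPLE_DEBUG_RATE is a hypothetical env var, e.g. "0.01" for 1% of invocations
const sampleRate = parseFloat(process.env.SAMPLE_DEBUG_RATE || '0.01');

module.exports.handler = async (event, context) => {
  const debugEnabled = Math.random() < sampleRate;
  const debug = (msg) => { if (debugEnabled) console.log(JSON.stringify({ level: 'debug', msg })); };

  try {
    debug('processing event');
    return await doWork(event); // hypothetical business logic
  } catch (err) {
    // always log the invocation event on error, so the failure can be debugged or replayed
    console.error(JSON.stringify({
      level: 'error',
      msg: err.message,
      awsRequestId: context.awsRequestId,
      invocationEvent: event
    }));
    throw err;
  }
};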
“what about metrics?”
diagram: my code → send metrics (over the internet)
press button → something happens
those extra 10-20ms for sending custom metrics compound when you have microservices and multiple APIs are called within a single user interaction
Amazon found every 100ms of latency cost them 1% in sales.
http://bit.ly/2EXPfbA
console.log("hydrating yubls from db…");
console.log("fetching user info from user-api");
console.log("MONITORING|1489795335|27.4|latency|user-api-latency");
console.log("MONITORING|1489795335|8|count|yubls-served");
timestamp | metric value | metric type | metric name
logs + metrics → CloudWatch Logs → AWS Lambda → logs to the ELK stack, metrics to CloudWatch
API Gateway functions: send custom metrics asynchronously
SNS, Kinesis, S3, … functions: send custom metrics as part of the function invocation
http://bit.ly/2Dpidje
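A sketch of the second approach, using the aws-sdk (v2) CloudWatch client to publish a metric during the invocation, which is fine when no caller is waiting on the response; the namespace and metric name here are made up:

// send a custom metric as part of the invocation (SNS/Kinesis/S3-triggered functions)
const AWS = require('aws-sdk');
const cloudwatch = new AWS.CloudWatch();

const recordLatency = (name, ms) =>
  cloudwatch.putMetricData({
    Namespace: 'theburningmonk.com', // hypothetical namespace
    MetricData: [{
      MetricName: name,
      Unit: 'Milliseconds',
      Value: ms,
      Dimensions: [{ Name: 'FunctionName', Value: process.env.AWS_LAMBDA_FUNCTION_NAME }]
    }]
  }).promise();

// API Gateway functions would instead console.log a MONITORING|... line (as above)
// and let a background process publish it, to keep user-facing latency down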
functions are often chained together via asynchronous invocations
event sources: SNS, Kinesis, CloudWatch Events, CloudWatch Logs, IoT, DynamoDB, S3, SES
tracing ASYNCHRONOUS invocations through so many different event sources is difficult
X-Ray
traces do not span over API Gateway
narrow focus on a function: good for homing in on performance issues for a particular function, but offers little to help you build intuition about how your system operates as a whole.
traces don’t span over async invocations: good for identifying the dependencies of a function, but not good enough for tracing the entire call chain as user requests/data flow through the system via async event sources.
traces don’t span over non-AWS services
Nitzan Shapira
@nitzanshapira
Ran Ribenzaft
@ranrib
correlation IDs*
* e.g. request-id, user-id, yubl-id, etc.
kinesis client
http client
sns client
http://bit.ly/lambda-correlation-ids
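A sketch of the idea for an API Gateway-triggered function: capture the correlation ID from the incoming request (or mint one), include it in every log line, and forward it on outgoing calls; the x-correlation-id header name is just a convention here, and callUserApi is a hypothetical downstream client:

// capture, log and forward a correlation ID
const { randomUUID } = require('crypto');

module.exports.handler = async (event) => {
  const correlationId =
    (event.headers && event.headers['x-correlation-id']) || randomUUID();

  console.log(JSON.stringify({ level: 'info', msg: 'handling request', correlationId }));

  // forward the correlation ID to downstream services
  await callUserApi({ headers: { 'x-correlation-id': correlationId } }); // hypothetical client

  return { statusCode: 200 };
};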
lesson 2. no shared DBs
shared DBs create TIGHT COUPLING
between services
build loosely-coupled systems through events
diagram: services A, B, C and D, each inside its own bounded context, communicating through events rather than a shared DB
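As a sketch of what “through events” can look like in practice, service A might publish a domain event to an SNS topic that the other services subscribe to, instead of letting them read its tables; the topic env var and event shape here are made up:

// service A publishes a domain event instead of exposing its database
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

const publishUserCreated = (user) =>
  sns.publish({
    TopicArn: process.env.USER_CREATED_TOPIC_ARN, // hypothetical topic owned by service A
    Message: JSON.stringify({
      type: 'user-created',
      userId: user.id,
      createdAt: new Date().toISOString()
    })
  }).promise();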
lesson 3. spiky load between services
service A → service B: downstream systems might not be as scalable
service A → Kinesis → Lambda → service B:
concurrency == no. of shards
retried until success
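A sketch of the consuming side shows why Kinesis smooths out the spikes: records arrive in batches, concurrency is capped by the number of shards, and a failed batch is retried until it succeeds (record format per the standard Kinesis-to-Lambda integration; processRecord is hypothetical):

// Kinesis-triggered consumer in service B
module.exports.handler = async (event) => {
  // records arrive in batches; one concurrent invocation per shard
  for (const record of event.Records) {
    const payload = JSON.parse(
      Buffer.from(record.kinesis.data, 'base64').toString('utf8'));
    await processRecord(payload); // hypothetical business logic
  }
  // throwing here makes Lambda retry the whole batch until it succeeds
};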
lesson 4. failures are inevitable
complex distributed systems fail in.. well, complex, sometimes cascading ways..
the only way to truly know your system’s
resilience against failures is to test it
through controlled experiments
there is more inherent chaos and complexity in a Serverless architecture
smaller units of deployment
but A LOT more of them!
more difficult to harden
around boundaries
serverful vs. serverless
event sources: SNS, Kinesis, CloudWatch Events, CloudWatch Logs, IoT, DynamoDB, S3, SES
more intermediary services, and greater variety too
each with its own set of failure modes
more configurations,
more opportunities for misconfiguration
serverful vs. serverless
more unknown failure modes in
infrastructure that we don’t control
often there’s little we can do when
an outage occurs in the platform
improperly tuned timeouts
missing error handling
missing fallback when downstream is unavailable
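A sketch of the kind of hardening those points call for, with a short timeout on the downstream call and a fallback when it is unavailable; the 2-second budget and fetchRecommendations are illustrative:

// call a downstream service with a deliberately short timeout, and fall back when it fails
const withTimeout = (promise, ms) =>
  Promise.race([
    promise,
    new Promise((_, reject) =>
      setTimeout(() => reject(new Error('timed out')), ms))
  ]);

const getRecommendations = async (userId) => {
  try {
    // leave enough of the invocation’s time budget to still respond gracefully
    return await withTimeout(fetchRecommendations(userId), 2000); // hypothetical downstream call
  } catch (err) {
    console.error(JSON.stringify({ level: 'error', msg: err.message, userId }));
    return []; // fallback: degrade gracefully instead of failing the whole request
  }
};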
FAILURE INJECTION
inject failures
validate failure handling
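A sketch of what injecting failures can look like, using hypothetical environment variables to add latency or throw errors for a configurable fraction of invocations, so the failure handling above can be validated in a controlled experiment:

// failure injection wrapper, driven by hypothetical env vars
// CHAOS_FAILURE_RATE: fraction of invocations to fail, CHAOS_LATENCY_MS: delay to add
const injectFailures = (handler) => async (event, context) => {
  const failureRate = parseFloat(process.env.CHAOS_FAILURE_RATE || '0');
  const latencyMs = parseInt(process.env.CHAOS_LATENCY_MS || '0', 10);

  if (latencyMs > 0) {
    await new Promise(resolve => setTimeout(resolve, latencyMs));
  }
  if (Math.random() < failureRate) {
    throw new Error('injected failure'); // validate that callers time out / fall back correctly
  }
  return handler(event, context);
};

module.exports.handler = injectFailures(async (event) => {
  return { statusCode: 200 }; // the real handler goes here
});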
Recap
Server-ful → Serverless
“those who cannot remember the
past are condemned to repeat it”
- George Santayana
don’t fly blind
no shared DBs
amortize spiky load between services
failures are inevitable
@theburningmonk
theburningmonk.com
github.com/theburningmonk

Apply best parts of microservices to serverless