How to build a social
network on #serverless
Yan Cui
@theburningmonk
http://theburningmonk.com
Principal Engineer @
Independent Consultant
Instructor @
Instructor @
Advisor @
“Netflix for sports”
offices in London, Leeds, Katowice and Amsterdam
available in Austria, Switzerland, Germany,
Japan, Italy, Spain, Canada, USA, and Brazil
available on 30+ platforms
1,000,000+ concurrent viewers
We’re hiring! Visit engineering.dazn.com to learn more.
Follow @dazneng for updates about the engineering team.
apr, 2016
nov, 2016
WHY?
hey guys, vote on this post and I’ll announce a winner at 10PM tonight
[Chart: traffic spikes 70-100x at 10PM]
low utilisation to leave room for spikes
EC2 scaling is slow, so scale earlier
lots of $$$ for unused resources
up to 30 mins for deployment
deployment required downtime
features took months to develop
“lead time to someone saying thank you is the only reputation metric that matters.”
- Dan North
WHY?
to deliver better UX
to deliver value faster
to be more cost efficient
HOW?
what would good look like for us?
deployments should be…
small
fast
zero downtime
no lock-step
features should be…
deployable independently
loosely-coupled
we want to…
minimise cost for unused resources
minimise ops effort
reduce tech mess
deliver visible improvements faster
WHY?
to deliver better UX
to deliver value faster
to be more cost efficient
HOW?
microservices
event-driven
serverless
WHAT?
this talk!
170 Lambda functions in prod
95% cost saving vs. EC2
15x no. of prod releases per month
(features were sometimes implemented on the same day)
[Timeline: is a good fit → 1st function in prod! → CI/CD, testing, logging, monitoring, alerting → tracing, config management, security → 170 functions]
API Gateway and Kinesis
Authentication & authorisation (IAM, Cognito)
Testing
Running & Debugging functions locally
Log aggregation
Monitoring & Alerting
X-Ray
Correlation IDs
CI/CD
Performance and Cost optimisation
Error Handling
Configuration management
VPC
Security
Leading practices (API Gateway, Kinesis, Lambda)
Canary deployments
http://bit.ly/production-ready-serverless
get 40% off with: ytcui
evolving the PLATFORM
Step 1.
Legacy Monolith → Amazon Kinesis
ALL state changes!
events are an enabler for
COMPOSABILITY
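As a rough illustration of publishing every state change to a stream, the sketch below wraps a change in a small envelope and writes it to Kinesis. The envelope fields, the stream name and the event type are assumptions made for this example, not details from the talk. The client is injected and is expected to behave like `boto3.client("kinesis")`, whose `put_record` call takes `StreamName`, `Data` and `PartitionKey`.

```python
import json
import time
import uuid


def build_state_change_record(event_type, entity_id, payload):
    """Wrap a state change in a small, self-describing envelope (hypothetical shape)."""
    return {
        "eventId": str(uuid.uuid4()),
        "eventType": event_type,
        "entityId": entity_id,
        "timestamp": int(time.time() * 1000),
        "payload": payload,
    }


def publish_state_change(kinesis_client, stream_name, event_type, entity_id, payload):
    """Publish one state change to a Kinesis stream.

    kinesis_client is expected to behave like boto3.client("kinesis");
    it is injected so the function can be exercised without AWS access.
    """
    record = build_state_change_record(event_type, entity_id, payload)
    kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(record).encode("utf-8"),
        # Partition by entity so events for one entity land on the same shard.
        PartitionKey=entity_id,
    )
    return record
```

Partitioning by entity id is one possible choice; it keeps events for the same entity in order within a shard.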
AWS LAMBDA
is the...
[Diagram: service-A and service-B (each API Gateway + AWS Lambda) connected through Kinesis, with additional AWS Lambda functions, DynamoDB and IoT attached to the stream]
build loosely-coupled systems through events
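A minimal sketch of a subscriber on the other side of the stream: a Lambda function that decodes a batch of Kinesis records and dispatches on event type. Lambda delivers Kinesis payloads base64-encoded under `Records[].kinesis.data`. The `eventType` and `payload` fields and the `user_followed` handler are hypothetical names chosen for this example.

```python
import base64
import json


def decode_kinesis_records(event):
    """Decode the base64-encoded JSON payloads in a Lambda Kinesis event."""
    return [
        json.loads(base64.b64decode(record["kinesis"]["data"]).decode("utf-8"))
        for record in event.get("Records", [])
    ]


# Hypothetical registry: each subscriber registers only the event types it cares about.
HANDLERS = {}


def on(event_type):
    """Decorator that registers a function as the handler for one event type."""
    def register(fn):
        HANDLERS[event_type] = fn
        return fn
    return register


@on("user_followed")
def update_follower_count(payload):
    return "follower count updated for " + payload["followee"]


def handler(event, context):
    """Lambda entry point: dispatch each decoded event and skip unknown types."""
    results = []
    for message in decode_kinesis_records(event):
        fn = HANDLERS.get(message.get("eventType"))
        if fn is not None:
            results.append(fn(message.get("payload", {})))
    return results
```

Because unknown event types are skipped, new events can be added to the stream without touching existing subscribers.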
[Diagram: service A, service B, service C and service D, grouped into bounded contexts]
there are no silver bullets
[Diagram: service A, service B, service C and service D; one service ships an update! Is it backward-compatible?]
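One common way to keep consumers working when a producer updates its events is a "tolerant reader": take only the fields you need, ignore the rest, and default anything that was added later. The sketch below is an illustration of that idea, not something the slides prescribe. The `user_created` event and its `userId` / `displayName` fields are hypothetical.

```python
def read_user_created(event):
    """Tolerant reader for a hypothetical user_created event.

    Unknown fields are ignored and later additions get defaults, so the
    producer can add fields without breaking this consumer.
    """
    payload = event.get("payload", {})
    return {
        "user_id": payload["userId"],                    # required in every version
        "display_name": payload.get("displayName", ""),  # assumed to be added later
    }
```

Removing or renaming a required field would still break this reader, which is why such changes usually need a new event type or an explicit version.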
DON’T use events to orchestrate workflows within the same bounded context
adds unnecessary complexity to logging, tracing, and end-to-end reporting
the workflow doesn’t exist as a standalone concept, but as the sum of a series of loosely connected parts
use Step Functions instead
don’t forget to emit events from the workflow, so others can react to state changes that happened as part of the workflow
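To make the Step Functions advice concrete, here is a small sketch that builds an Amazon States Language definition in which the workflow is one explicit object and its last step emits an event. `StartAt`, `States`, `Type`, `Resource`, `Next` and `End` are standard States Language fields. The function names, account id and region are made up for this example.

```python
import json


def lambda_arn(function_name):
    # Hypothetical account id and region, for illustration only.
    return "arn:aws:lambda:us-east-1:123456789012:function:" + function_name


def build_workflow_definition(step_names, emit_event_function):
    """Chain Task states in order, then finish with a task that emits an event."""
    names = list(step_names) + [emit_event_function]
    states = {}
    for index, name in enumerate(names):
        state = {"Type": "Task", "Resource": lambda_arn(name)}
        if index + 1 < len(names):
            state["Next"] = names[index + 1]
        else:
            state["End"] = True
        states[name] = state
    return {"StartAt": names[0], "States": states}


# Hypothetical workflow: two steps, then a function that publishes the outcome.
definition = build_workflow_definition(
    ["moderate-post", "publish-post"], "emit-post-published"
)
print(json.dumps(definition, indent=2))
```

The final task is where the workflow would publish its state change, for example to the same Kinesis stream other services already subscribe to.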
“how do I organize my functions
into code repositories?”
monorepo?
github repo
https://lumigo.io/blog/mono-repo-vs-one-per-service/
monorepo !== monostack
one repo per service?
github repos:
user-api
timeline-api
relationship-api
search-api
CI/CD pipeline per service
functions are deployed
together, as a stack
Strangler Pattern
incrementally migrate the legacy system by gradually moving pieces of functionality to the new system
rebuilt search
[Diagram: Legacy Monolith → Amazon Kinesis → AWS Lambda → Amazon CloudSearch, with Amazon API Gateway + AWS Lambda in front of CloudSearch]
proxy requests from monolith
to new service
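The proxying step of the Strangler Pattern can be sketched as a routing decision: paths that have already been migrated go to the new service, everything else stays on the monolith. The prefixes and base URLs below are placeholders, and how the monolith actually forwards the request is left out.

```python
# Hypothetical route table: path prefixes that have been migrated so far.
MIGRATED_PREFIXES = {
    "/search": "https://search-api.example.com",
}
LEGACY_BASE_URL = "https://legacy.example.com"


def resolve_upstream(path, migrated=MIGRATED_PREFIXES, legacy=LEGACY_BASE_URL):
    """Return the full URL that should serve this request path (no query string)."""
    for prefix, base_url in migrated.items():
        # Match the prefix itself or anything below it, but not "/searching".
        if path == prefix or path.startswith(prefix + "/"):
            return base_url + path
    return legacy + path
```

Migrating another piece of functionality then means adding one more prefix to the table, which keeps each step small and reversible.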
new analytics pipeline
expensive ($3000/month)
don’t understand our domain
JS based query language
[Diagram: Legacy Monolith → Amazon Kinesis → AWS Lambda → Google BigQuery]
1 developer, 2 days from design to production
(his 1st serverless project)
“nothing ever got done
this fast at Skype!”
- Chris Twamley
$3000/month → $0.03/month
Kinesis
sink
Kinesis Firehose: batch Kinesis events
S3: data lake
Glue: analyze data schema, catalog data into tables
Athena: query engine
QuickSight: visualization, dashboards
no code is required!
pay-per-use!
user action business intelligence
Problem
didn’t work…
over-engineered… (try to figure out what’s going on here…)
didn’t scale…
Rebuilt with Lambda
built-in retry and DLQ
avoid repeating the expensive work of fetching millions of relationships
github
repo
timeline-api
service: timeline-api
provider:
  name: aws
  runtime: nodejs6.10
  stage: dev
  region: us-east-1
functions:
  distribute-yubl:
    …
  undistribute-yubl:
    …
Problem
didn’t work…
“it returns the first 30 users in the
database, by creation time…”
Rebuilt
with Lambda
[Diagram labels: BigQuery, grapheneDB]
mostly built in one sleepless night…
Building a scalable notification system
expensive ($3000/month)
don’t understand our domain
JS based query language
all the analytics data is
already in BigQuery
powerful query engine
Design Goals
ad-hoc notifications
scheduled notifications
A/B testing
scalable
cost-effective
scheduled notifications
how to send notifications
what to send
other processes can leverage
this capability of sending
notifications
why not SNS?
ad-hoc notifications
Oversight vs. Frictionless
don’t make life difficult for
the marketing team
don’t let marketing
team spam users
driving usage/engagement
maintaining user experience
Marketing works with BI on the query
request approval from CPO/CTO
approver checks impact and tests message format
send notifications
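The approval steps above can be modelled as a small state machine, so that a notification cannot be sent without passing through approval. This is only a sketch: the status names, the `Campaign` class and the rule that a rejected request goes back to draft are assumptions for illustration, not the actual system.

```python
from enum import Enum


class Status(Enum):
    DRAFT = "draft"                        # Marketing and BI are working on the query
    PENDING_APPROVAL = "pending_approval"  # waiting for CPO/CTO
    APPROVED = "approved"                  # impact checked, message format tested
    SENT = "sent"


# Allowed transitions; anything else is rejected.
TRANSITIONS = {
    Status.DRAFT: {Status.PENDING_APPROVAL},
    # Assumption: a rejected request returns to draft for rework.
    Status.PENDING_APPROVAL: {Status.APPROVED, Status.DRAFT},
    Status.APPROVED: {Status.SENT},
    Status.SENT: set(),
}


class Campaign:
    """A hypothetical ad-hoc notification request."""

    def __init__(self, query, message):
        self.query = query
        self.message = message
        self.status = Status.DRAFT

    def move_to(self, new_status):
        if new_status not in TRANSITIONS[self.status]:
            raise ValueError(
                f"cannot go from {self.status.value} to {new_status.value}"
            )
        self.status = new_status
```

Keeping the transitions in one table gives oversight without adding friction: marketing can self-serve every step except the approval itself.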
more Scalable (and scales faster!)
Cheaper (don’t pay for idle servers)
Resilient (built-in redundancy and multi-AZ)
Secure
[Diagram: blue-green deployment of requests; auto-scaling with req/s; multi-AZ across us-east-1a, us-east-1b, us-east-1c]
idea → production
greater Velocity from idea to product
WHY?
to deliver better UX
to deliver value faster
to be more cost efficient
@theburningmonk
theburningmonk.com
github.com/theburningmonk
