
How to build a social network on serverless

Many people are building different workloads using serverless technologies these days, but what would a non-trivial system such as a social network look like on serverless?

In this talk Yan will discuss his journey of migrating a social network startup to serverless, and how his team was able to improve performance, scalability and feature delivery using serverless technologies.

Yan will discuss how serverless technologies such as Lambda are used to implement each part of their system, including search, push notifications, timeline, user recommendations, and business intelligence. If you're wondering how serverless can be used to solve a wide variety of challenges in your business, this is the talk for you.

Published in: Technology

  1. 1. How to build a social network on #serverless Yan Cui @theburningmonk
  2. 2. Yan Cui http://theburningmonk.com @theburningmonk Independent Consultant
  3. 3. Yan Cui http://theburningmonk.com @theburningmonk Developer Advocate @
  4. 4. apr, 2016
  5. 5. nov, 2016
  6. 6. WHY?
  7. 7. hey guys, vote on this post and I’ll announce a winner at 10PM tonight
  8. 8. 10PM traffic
  9. 9. 10PM traffic 70-100x
  10. 10. low utilisation to leave room for spikes; EC2 scaling is slow, so scale earlier
  11. 11. lots of $$$ for unused resources
  12. 12. up to 30 mins for deployment; deployment required downtime
  13. 13. features took months to develop
  14. 14. - Dan North “lead time to someone saying thank you is the only reputation metric that matters.”
  15. 15. WHY? to deliver better UX
  16. 16. WHY? to deliver better UX to deliver value faster
  17. 17. WHY? to deliver better UX to deliver value faster to be more cost efficient
  18. 18. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW?
  19. 19. what would good look like for us?
  20. 20. deployments should be… small, fast, zero downtime, no lock-step
  21. 21. features should be… deployable independently, loosely-coupled
  22. 22. we want to… minimise cost for unused resources
  23. 23. we want to… minimise cost for unused resources minimise ops effort
  24. 24. we want to… minimise cost for unused resources minimise ops effort reduce tech mess
  25. 25. we want to… minimise cost for unused resources minimise ops effort reduce tech mess deliver visible improvements faster
  26. 26. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW? microservices
  27. 27. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW? microservices event-driven
  28. 28. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW? microservices event-driven serverless
  29. 29. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW? microservices event-driven serverless WHAT? this talk!
  30. 30. WHY? to deliver better UX to deliver value faster to be more cost efficient HOW? microservices event-driven serverless WHAT? this talk!
  31. 31. 170 Lambda functions in prod
  32. 32. 95% cost saving vs. EC2
  33. 33. 15x no. of prod releases per month
  34. 34. 15x no. of prod releases per month (features were sometimes shipped on the same day)
  35. 35. time is a good fit
  36. 36. 1st function in prod! time is a good fit
  37. 37. ? time is a good fit 1st function in prod!
  38. 38. CI/CD?
  39. 39. CI/CD? testing?
  40. 40. CI/CD? testing? logging, monitoring, alerting?
  41. 41. time is a good fit 1st function in prod! CI/CD, testing, logging, monitoring, alerting
  42. 42. 170 functions ? time is a good fit 1st function in prod! CI/CD, testing, logging, monitoring, alerting
  43. 43. tracing?
  44. 44. tracing? config management?
  45. 45. tracing? config management? security?
  46. 46. 170 functions time is a good fit 1st function in prod! CI/CD, testing, logging, monitoring, alerting tracing, config management, security
  47. 47. API Gateway and Kinesis; Authentication & authorisation (IAM, Cognito); Testing; Running & Debugging functions locally; Log aggregation; Monitoring & Alerting; X-Ray; Correlation IDs; CI/CD; Performance and Cost optimisation; Error Handling; Configuration management; VPC; Security; Leading practices (API Gateway, Kinesis, Lambda); Canary deployments. http://bit.ly/production-ready-serverless (get 40% off with: ytcui)
  48. 48. evolving the PLATFORM
  49. 49. Legacy Monolith Amazon Kinesis Step 1. ALL state changes!
  50. 50. events are an enabler for COMPOSABILITY
  51. 51. AWS LAMBDA is the...
  52. 52. Kinesis
  53. 53. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B
  54. 54. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B
  55. 55. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B AWS Lambda AWS Lambda AWS Lambda
  56. 56. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B AWS Lambda AWS Lambda AWS Lambda DynamoDB IoT
  57. 57. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B AWS Lambda AWS Lambda AWS Lambda DynamoDB IoT
  58. 58. Kinesis API Gateway AWS Lambda API Gateway AWS Lambda service-A service-B AWS Lambda AWS Lambda AWS Lambda DynamoDB IoT AWS Lambda AWS Lambda
  59. 59. build loosely-coupled system through events
  60. 60. service A service B service C service D bounded context bounded context
  61. 61. service A service B service C service D bounded context bounded context
  62. 62. service A service B service C service D
  63. 63. there are no silver bullets
  64. 64. service A service B service C service D
  65. 65. service A service B service C service D
  66. 66. service A service B service C service D update!
  67. 67. service A service B service C service D backward-compatible? update!
  68. 68. bounded context DON’T use events to orchestrate workflows within the same bounded context
  69. 69. bounded context adds unnecessary complexity to logging, tracing, and end-to-end reporting
  70. 70. bounded context the workflow doesn’t exist as a standalone concept, but as the sum of a series of loosely connected parts
  71. 71. Step Functions use Step Functions instead
  72. 72. Step Functions don’t forget to emit events from the workflow
  73. 73. Step Functions so others can react to state changes that happened as part of the workflow
  74. 74. “how do I organize my functions into code repositories?”
  75. 75. one repo per function?
  76. 76. github repo github repo github repo github repo github repo github repo github repo github repo github repo
  77. 77. github repo github repo github repo github repo github repo github repo github repo github repo github repo
  78. 78. monorepo?
  79. 79. github repo
  80. 80. https://lumigo.io/blog/mono-repo-vs-one-per-service/
  81. 81. one repo per service?
  82. 82. github repo github repo github repo github repo user-api timeline-api relationship-api search-api
  83. 83. CI/CD pipeline per service
  84. 84. functions are deployed together, as a stack
  85. 85. Strangler Pattern: incrementally migrate the legacy system by gradually moving pieces of functionality to the new system
  86. 86. rebuilt search
  87. 87. Legacy Monolith Amazon Kinesis Amazon Lambda Amazon CloudSearch
  88. 88. Legacy Monolith Amazon Kinesis Amazon Lambda Amazon CloudSearch Amazon API Gateway Amazon Lambda
  89. 89. proxy requests from monolith to new service
  90. 90. new analytics pipeline
  91. 91. expensive ($3000/month) don’t understand our domain JS based query language
  92. 92. Legacy Monolith Amazon Kinesis Amazon Lambda Google BigQuery
  93. 93. Legacy Monolith Amazon Kinesis Amazon Lambda Google BigQuery 1 developer, 2 days, design → production (his 1st serverless project)
  94. 94. Legacy Monolith Amazon Kinesis Amazon Lambda Google BigQuery “nothing ever got done this fast at Skype!” - Chris Twamley
  95. 95. - Dan North “lead time to someone saying thank you is the only reputation metric that matters.”
  96. 96. $3000/month → $0.03/month
  97. 97. Kinesis sink
  98. 98. Kinesis Kinesis Firehose batch Kinesis events
  99. 99. Kinesis Kinesis Firehose S3 data lake
  100. 100. Kinesis Kinesis Firehose S3 Glue analyze data schema, catalog data into tables
  101. 101. Kinesis Kinesis Firehose S3 Athena Glue query engine
  102. 102. Kinesis Kinesis Firehose S3 Athena QuickSight Glue visualization, dashboards
  103. 103. Kinesis Kinesis Firehose S3 Athena QuickSight Glue no code is required!
  104. 104. Kinesis Kinesis Firehose S3 Athena QuickSight Glue no code is required! pay-per-use!
  105. 105. user action business intelligence
  106. 106. user action business intelligence
  107. 107. Problem didn’t work…
  108. 108. Problem didn’t work… over-engineered…
  109. 109. try to figure out what’s going on here…
  110. 110. Problem didn’t work… over-engineered… didn’t scale…
  111. 111. Rebuilt with Lambda
  112. 112. built-in retry and DLQ
  113. 113. built-in retry and DLQ; avoid repeating expensive work of fetching millions of relationships
  114. 114. github repo timeline-api
       service: timeline-api
       provider:
         name: aws
         runtime: nodejs6.10
         stage: dev
         region: us-east-1
       functions:
         distribute-yubl: …
         undistribute-yubl: …
  115. 115. Problem didn’t work…
  116. 116. “it returns the first 30 users in the database, by creation time…”
  117. 117. Rebuilt with Lambda
  118. 118. BigQuery
  119. 119. BigQuery
  120. 120. grapheneDB BigQuery
  121. 121. grapheneDB BigQuery
  122. 122. grapheneDB BigQuery
  123. 123. grapheneDB BigQuery mostly built in one sleepless night…
  124. 124. more Scalable (and scales faster!)
  125. 125. Cheaper (don’t pay for idle servers)
  126. 126. Resilience (built-in redundancy and multi-AZ)
  127. 127. Secure
  128. 128. request blue-green deployment req/s auto-scaling us-east-1a us-east-1b us-east-1c multi-AZ
  129. 129. idea → production: greater Velocity from idea to product
  130. 130. WHY? to deliver better UX to deliver value faster to be more cost efficient
  131. 131. @theburningmonk theburningmonk.com github.com/theburningmonk
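
Slides 49 to 59 describe publishing all state changes to Kinesis and letting Lambda functions react to them. The following is a minimal sketch of such a consumer, not the code from the talk. It relies on one fact about the platform: Lambda delivers Kinesis records with a base64-encoded payload under `Records[].kinesis.data`. Everything else is an assumption made for illustration: the JSON payload with `type` and `userId` fields, the event type names, and the `index_user` and `notify_followers` helpers. The timeline-api config on slide 114 targets Node.js; Python is used here only to keep the sketch short.

```python
import base64
import json


# Hypothetical subscribers; the talk does not show its event schema.
def index_user(event):
    return f"index user {event['userId']}"


def notify_followers(event):
    return f"notify followers of {event['userId']}"


# Assumed mapping from event type to the reactions this consumer cares about.
HANDLERS = {
    "user_created": [index_user],
    "post_created": [notify_followers],
}


def decode_records(kinesis_event):
    """Yield the JSON payload of each record in a Lambda Kinesis event."""
    for record in kinesis_event["Records"]:
        raw = base64.b64decode(record["kinesis"]["data"])
        yield json.loads(raw)


def handler(kinesis_event, context=None):
    """Lambda entry point: react to known event types, ignore the rest."""
    results = []
    for payload in decode_records(kinesis_event):
        for handle in HANDLERS.get(payload.get("type"), []):
            results.append(handle(payload))
    return results


def make_record(payload):
    """Build one record in the shape Lambda uses for Kinesis triggers."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("ascii")
    return {"kinesis": {"data": data}}


if __name__ == "__main__":
    sample = {"Records": [
        make_record({"type": "user_created", "userId": "u1"}),
        make_record({"type": "something_else"}),
    ]}
    print(handler(sample))
```

Because the consumer skips event types it does not know, new event types can be added to the stream without redeploying existing consumers, which is one way to read the "loosely-coupled system through events" point on slide 59.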
