
Running serverless at scale


In this talk, we will discuss some of the things to consider when running a serverless architecture at scale: how to think about and manage cold starts, strategies for scaling the app beyond the basic limits, and more.

Published in: Technology


  1. running serverless at SCALE
  2. Yan Cui http://theburningmonk.com @theburningmonk Principal Engineer @ Independent Consultant
  3. available in Austria, Switzerland, Germany, Japan, Canada, Italy, US, Spain and Brazil
  4. available on 30+ platforms
  5. ~1,000,000 concurrent viewers
  6. We’re hiring! Follow @dazneng for updates about the engineering team. Visit engineering.dazn.com to learn more.
  7. AWS user since 2009
  8. AWS user since 2009
  9. What do you mean by ‘serverless’?
  10. “Serverless”
  11. Gojko Adzic: “It is serverless the same way WiFi is wireless.” http://bit.ly/2yQgwwb
  12. Serverless means… don’t need to provision and manage servers; don’t need to worry about scaling; don’t pay for it if no-one uses it
  13. “Function-as-a-Service”: AWS Lambda, Azure Functions, Google Cloud Functions, Auth0 Webtask, Spotinst Functions, Kubeless, IBM Cloud Functions
  14. AWS Lambda
  15. AWS Lambda: API Gateway, IoT, SNS, Kinesis, CloudWatch
  16. [diagram] IaaS vs. CaaS vs. PaaS vs. FaaS: the same stack (function, application, runtime, container, OS, virtualization, hardware), with the user/provider split at a different layer in each model
  17. [diagram] same stacks, highlighting the function as the user’s scalable unit in FaaS
  18. Serverless = FaaS + other services… database, storage, BI
  19. Simon Wardley: “Serverless will fundamentally change how we build business around technology and how you code.”
  20. Why serverless?
  21. more scalable
  22. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit)
  23. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit); AUTO-APPROVED RAISE TO 3000
  24. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit)
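The limits on this slide imply a simple ceiling on how fast Lambda can absorb a spike. A rough model, treating the slide's figures (1,000 initial concurrency, +500 per minute, raisable to 3,000) as assumptions rather than AWS guarantees:

```javascript
// Rough model of how much concurrency Lambda can reach t minutes into a
// spike, using the slide's numbers. These are assumptions, not an AWS SLA.
function maxConcurrency(minutesSinceSpike, opts = {}) {
  const { initial = 1000, perMinute = 500, accountLimit = 3000 } = opts;
  // concurrency grows linearly per minute, capped at the account limit
  return Math.min(accountLimit, initial + perMinute * minutesSinceSpike);
}

// e.g. two minutes into a spike, with the auto-approved raise to 3000:
// maxConcurrency(2) → 2000
```

The takeaway is that a sudden burst beyond the initial pool takes minutes, not seconds, to absorb, which is why the later slides reach for caching and multi-region designs.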
  25. containers are reused
  26. 100% SERVERLESS IN PRODUCTION
  27. 80 MILLION MONTHLY USERS
  28. Cheaper (don’t pay for idle servers)
  29. Resilience (built-in redundancy and multi-AZ)
  30. http://bit.ly/2Vzfexo
  31. Secure
  32. Shared Responsibility Model
  33. Shared Responsibility Model
  34. protection from OS attacks: Amazon automatically applies the latest patches to host VMs
  35. Deploy
  36. [diagram] code + serverless.yml
  37. [diagram] code + serverless.yml
  38. [diagram] code + serverless.yml → S3
  39. [diagram] code + serverless.yml → S3 → CloudFormation
  40. [diagram] code + serverless.yml → S3 → CloudFormation
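The deploy flow on these slides is driven by a single config file. A hypothetical minimal serverless.yml (the service name and handler path are made up; the keys are standard Serverless Framework config), which the framework turns into a zip upload to S3 plus a generated CloudFormation deployment:

```yaml
# Hypothetical serverless.yml: the framework packages the code, uploads
# it to S3, and deploys it as a CloudFormation stack.
service: hello-at-scale        # made-up service name

provider:
  name: aws
  runtime: nodejs10.x          # era-appropriate Node.js runtime
  region: us-east-1
  memorySize: 1024             # 1GB, matching the cold-start figures later

functions:
  hello:
    handler: handler.hello     # assumed handler.js exporting `hello`
    events:
      - http:                  # exposed via API Gateway
          path: hello
          method: get
```

Running `serverless deploy` with a file like this is what produces the S3 → CloudFormation steps in the diagram.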
  41. [diagram] request → blue-green deployment
  42. [diagram] request → blue-green deployment
  43. [diagram] request → blue-green deployment
  44. [diagram] request → blue-green deployment, req/s auto-scaling, multi-AZ (us-east-1a, us-east-1b, us-east-1c)
  45. the DevOps force is strong with serverless
  46. [diagram] idea → production: choose language + framework, master language + framework, figure out deployment, configure AMI, configure ELB, configure autoscaling, capacity planning, over-provision for launch, “are we doing microservices?”, configure CI/CD
  47. [diagram] (same: the many steps between idea and production)
  48. idea → production: greater velocity from idea to product
  49. minimise undifferentiated heavy-lifting
  50. less ops responsibility on your shoulders
  51. [diagram] infrastructure vs. you
  52. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit)
  53. ~1,000,000 concurrent viewers
  54. serverless is not right for every use case (yet)
  55. http://bit.ly/2WSfcky
  56. Lambda + VPC
  57. there are no silver bullets
  58. [chart] Containers: days from “it works on my machine!” to “production ready!”
  59. [chart] Containers: days from “it works on my machine!” to “production ready!”; Serverless: 0 days from “it works!” to “production ready!”
  60. [chart] Containers: days from “it works on my machine!” to “production ready!”; Serverless: 0 days from “it works!” to “production ready!” (v2! v3! v4! v5! v6!)
  61. [diagram] Theory: EC2 + docker across us-east-1a and us-east-1b
  62. Reality
  63. Reality
  64. scale-to-zero
  65. [diagram] serverful vs. serverless across us-east-1a and us-east-1b: scaled to zero!
  66. scaling limits, VPC, long-running, cold starts, performance
  67. scaling limits, VPC, long-running, cold starts, performance
  68. scaling limits, VPC, long-running, cold starts, performance
  69. http://bit.ly/2X0ksCY
  70. http://bit.ly/2X0ksCY
  71. http://bit.ly/2X0ksCY
  72. scaling limits, VPC, long-running, cold starts, performance
  73. http://bit.ly/2I7GJeJ
  74. scaling limits, VPC, long-running, cold starts, performance
  75. scaling limits, VPC, long-running, cold starts, performance
  76. What can we do now?
  77. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit)
  78. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit)
  79. 1,000 concurrent executions (soft limit); 500 increase per minute (hard-ish limit): contact your AWS TAM about raising this
  80. multi-region, active-active
  81. [chart] overall traffic vs. regional traffic
  82. http://bit.ly/2Vzfexo
  83. identical, independent
  84. ingest from both regions
  85. ingest from both regions: idempotent
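When both regions ingest the same events, every consumer has to be idempotent. A minimal sketch of the idea using an in-memory set; a real implementation would use something durable, such as a conditional write to a table keyed by event id (the names here are made up):

```javascript
// Sketch: drop duplicate events so that processing the same event from
// both regions has the same effect as processing it exactly once.
const processedIds = new Set(); // stand-in for a durable store

function processOnce(event, handler) {
  if (processedIds.has(event.id)) {
    return false;               // duplicate, e.g. from the other region
  }
  processedIds.add(event.id);
  handler(event);               // side effects happen once per event id
  return true;
}
```

Keyed deduplication like this is what lets the two regional ingest pipelines stay identical and independent without double-counting.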
  86. [diagram] CloudFront → API Gateway → Lambda
  87. cache as close to the user as possible!
  88. [diagram] CloudFront → API Gateway → Lambda, with a cache
  89. [diagram] CloudFront → API Gateway → Lambda, with caches at two layers
  90. [diagram] CloudFront → API Gateway → Lambda, with caches at all three layers: take advantage of container reuse
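One way to "take advantage of container reuse" at the Lambda layer: anything held in module scope survives between warm invocations on the same container, so a module-scope cache turns repeat requests into memory lookups. A sketch, where the handler shape and `loadProduct` are made-up placeholders:

```javascript
// Module-scope state survives between invocations on a warm container,
// so this cache is shared by every warm invocation the container serves.
const cache = new Map();

function getCached(key, loader) {
  if (!cache.has(key)) {
    cache.set(key, loader(key)); // only hit the downstream on a cache miss
  }
  return cache.get(key);
}

// Hypothetical downstream call, standing in for a real data-store lookup.
const loadProduct = (id) => ({ id });

// Hypothetical API Gateway handler using the cache.
const handler = async (event) => {
  const product = getCached(event.pathParameters.id, loadProduct);
  return { statusCode: 200, body: JSON.stringify(product) };
};
```

The cache is per-container, so it is best treated as an optimisation on top of the CloudFront and API Gateway caches, not a replacement for them.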
  91. https://amzn.to/2VlyXzR
  92. CloudFront + Lambda@Edge: executed in the nearest AWS region
  93. CloudFront + Lambda@Edge: executed in the nearest AWS region; traffic is distributed across many regions
  94. CloudFront + Lambda@Edge: executed in the nearest AWS region; traffic is distributed across many regions; fewer hops, shorter round-trip = faster response
  95. avoid services that require persistent connections
  96. DynamoDB: on-demand tables can scale up to 40K TPS
  97. RDS requires a persistent connection
  98. RDS: ๏ set client-side pool size to 1 ๏ set server-side max connections high ๏ set server-side connection timeout to ~10 mins (to clean up phantom connections) ๏ use read replicas
  99. RDS: ๏ set client-side pool size to 1 ๏ set server-side max connections high ๏ set server-side connection timeout to ~10 mins (to clean up phantom connections) ๏ use read replicas. Serverless Aurora is not ready for production use (yet)
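The client-side half of that checklist, sketched for the (assumed) `mysql2` driver; the connection details are placeholders, and the server-side settings live in the RDS parameter group rather than in code:

```javascript
// One connection per container: a Lambda container handles one invocation
// at a time, so a pool bigger than 1 just wastes database connections.
const poolConfig = {
  connectionLimit: 1,                // client-side pool size of 1
  host: process.env.DB_HOST,         // placeholder connection details
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME,
};

// With mysql2: const pool = require('mysql2').createPool(poolConfig);
// Server side (RDS parameter group): raise max_connections, and set the
// connection timeout (e.g. wait_timeout) to ~600s so phantom connections
// left by recycled containers get cleaned up.
```

Because each concurrent execution holds its own connection, total connections still scale with concurrency, which is why the slide also recommends read replicas.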
  100. [diagram] RDS fronted by a new service: HTTPS in front, socket pooling behind, etc.
  101. requires persistent connection
  102. https://github.com/solve-hq/LaunchDarkly-relay-fargate
  103. function-less
  104. [diagram] CloudFront → API Gateway → Lambda: 500/min limit on scaling out
  105. [diagram] CloudFront → API Gateway → Lambda
  106. [diagram] CloudFront → API Gateway → DynamoDB
  107. [diagram] CloudFront → AppSync → DynamoDB
  108. engineer the problem away instead of solving it
  109. What about cold starts?
  110. INITIALISE CONTAINER → INITIALISE RUNTIME → INITIALISE HANDLER → YOUR CODE EXECUTES
  111. INITIALISE CONTAINER → INITIALISE RUNTIME → INITIALISE HANDLER (= COLD START) → YOUR CODE EXECUTES
  112. “cold starts only happen to the first request”
  113. “cold starts only happen to the first request” WRONG!!!
  114. concurrent execution (i.e. a container) vs. function invocation
  115. concurrent execution (i.e. a container) = class instance; function invocation = method call
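The slide's analogy in code: a concurrent execution (container) behaves like a class instance, an invocation like a method call, and the cold start like the constructor. Every new instance pays the constructor cost, which is why cold starts are not only "the first request". A hypothetical sketch:

```javascript
// A container is like a class instance; an invocation is a method call.
class Container {
  constructor() {
    // this is the "cold start": initialise container, runtime, handler
    this.invocations = 0;
  }

  invoke(event) {
    this.invocations += 1;
    // cold only on the first call on *this instance*; every additional
    // concurrent execution means a new instance, and a new cold start
    return { event, coldStart: this.invocations === 1 };
  }
}
```

Scaling out under load creates new instances, so a spike in concurrency produces cold starts even for users who are nowhere near "the first request".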
  116. Lambda scales the number of concurrent executions based on traffic
  117. existing “containers” are reused where possible
  118–127. [diagram] timeline of invocations: each additional concurrent invocation spins up a new container (a cold start); subsequent invocations reuse warm containers where possible
  128. FREQUENCY and DURATION
  129. FREQUENCY: dictated by user traffic, out of your control
  130. DURATION: optimize this!
  131. “Lambda is not a suitable solution for me because of cold starts”
  132. “what is your latency requirement?”
  133. cold starts that don’t add to user-facing latency are generally not worth worrying about
  134. Node.js functions, no VPC, 1GB, average ~500ms cold start with production workload
  135. Node.js functions, no VPC, 1GB, average ~500ms cold start with production workload (good enough for most web applications)
  136. sporadic latency spikes existed before Lambda
  137. GC pauses…
  138. overloaded servers…
  139. slow downstreams, databases, etc.
  140. networking issues…
  141. cold starts are generally not an issue if you have a steady traffic pattern
  142. [chart] req/s over time
  143. [chart] req/s over time: spike during El Clásico
  144. [chart] req/s over time: spikes at lunch and dinner
  145. minimise the duration of cold starts so they fall within the acceptable latency range
  146. use Node.js, Python or Golang
  147. trim dependencies
  148. don’t require the full AWS-SDK if you only need one client
  149. https://theburningmonk.com/2019/03/just-how-expensive-is-the-full-aws-sdk/
  150. https://theburningmonk.com/2019/03/just-how-expensive-is-the-full-aws-sdk/ (full AWS-SDK vs. DynamoDB client only)
  151. https://theburningmonk.com/2019/03/just-how-expensive-is-the-full-aws-sdk/ (full AWS-SDK vs. DynamoDB client only; webpack!!)
  152. keep functions single-purposed
  153. avoid VPCs unless you need to access VPC-protected resources
  154. caching, caching, caching
  155. multi-region, active-active
  156. horizontal scalability all the way
  157. a lambda warmer doesn’t work
  158. make cold start durations acceptable
  159. “not everything at Netflix runs at Netflix scale”
  160. @theburningmonk theburningmonk.com github.com/theburningmonk
