Common mistakes in serverless adoption

Looking in from the outside, serverless seems so simple! And yet, many companies are struggling on their journey to serverless. In this webinar, AWS Serverless Hero Yan Cui highlights a number of common mistakes companies are making when they adopt serverless so you can avoid them.

Published in: Technology

  1. 1. WEBINARS
  2. 2. Codemotion is the biggest tech conference for developers in EMEA, open to all languages and technologies.
  3. 3. SPECIAL DISCOUNT: let’s meet at Codemotion Amsterdam 2020, May 27-28, 2020! 20% discount for you with code CodemotionAmsterdam20Cui
  4. 4. What do you mean by ‘serverless’?
  5. 5. @theburningmonk theburningmonk.com “Serverless”
  7. 7. @theburningmonk theburningmonk.com “It is serverless the same way WiFi is wireless.” (Gojko Adzic) http://bit.ly/2yQgwwb
  8. 8. @theburningmonk theburningmonk.com Serverless means… you don’t pay for it if no one uses it; you don’t need to worry about scaling; you don’t need to provision and manage servers
  9. 9. @theburningmonk theburningmonk.com in other words, it’s a lot like taking a cab
  10. 10. @theburningmonk theburningmonk.com Ownership Fuel Navigate To get there! Focus on getting there!
  11. 11. @theburningmonk theburningmonk.com HW Ownership OS Runtime & Scale Code Focus on getting there! Physical Servers Virtual Machines Containers Serverless
  12. 12. @theburningmonk theburningmonk.com Nano Services Self Managed Cost Paradigm Change Async Dynamic agile env
  13. 13. “why are we failing at this?”
  14. 14. hidden dangers
  15. 15. @theburningmonk theburningmonk.com monolith microservices serverless
  16. 16. @theburningmonk theburningmonk.com monolith microservices serverless observability distributed systems bounded context
  18. 18. @theburningmonk theburningmonk.com monolith microservices serverless observability distributed systems bounded context event driven
  19. 19. @theburningmonk theburningmonk.com monolith serverless missing learnings from microservices
  20. 20. @theburningmonk theburningmonk.com monolith serverless missing learnings from microservices poor decisions
  21. 21. Yan Cui http://theburningmonk.com @theburningmonk AWS user for 10 years
  22. 22. http://bit.ly/yubl-serverless
  23. 23. Yan Cui http://theburningmonk.com @theburningmonk Developer Advocate @
  24. 24. Yan Cui http://theburningmonk.com @theburningmonk Independent Consultant: advise, training, delivery
  25. 25. @theburningmonk theburningmonk.com https://theburningmonk.com/workshops Amsterdam, March 19-20; Helsinki, May 4-5; Stockholm, May 14-15; Dublin, June 16-17; London, September 24-25; Berlin, October 8-9
  26. 26. #1 not letting go of legacy thinking
  27. 27. “we’re doing serverless, but why aren’t things going faster?”
  28. 28. @theburningmonk theburningmonk.com Socio Technical
  29. 29. @theburningmonk theburningmonk.com there are no silver bullets
  31. 31. @theburningmonk theburningmonk.com centralised team Team A Team B Team C Team D …
  32. 32. @theburningmonk theburningmonk.com “but the developers don’t understand AWS and how our infrastructure is set up”
  33. 33. @theburningmonk theburningmonk.com “but the developers don’t understand AWS and how our infrastructure is set up” let’s solve this problem instead!
  34. 34. @theburningmonk theburningmonk.com what got you here won’t get you there
  35. 35. @theburningmonk theburningmonk.com Monolithic Functions: if (path == "/user" && method == "GET") { return getUser(…); } else if (path == "/user" && method == "DELETE") { return deleteUser(…); } else if (path == "/user" && method == "POST") { return createUser(…); } else if …
  36. 36. @theburningmonk theburningmonk.com Single-Purposed Functions: GET /user, POST /user, DELETE /user
  37. 37. @theburningmonk theburningmonk.com Monolithic: one function, user-api-dev (author: yan.cui, feature: user-api). Single-Purposed: three functions, user-api-dev-get-user, user-api-dev-create-user and user-api-dev-delete-user (each tagged author: yan.cui, feature: user-api)
  38. 38. @theburningmonk theburningmonk.com (same diagram, annotated) find related functions by prefix
  39. 39. @theburningmonk theburningmonk.com (same diagram, annotated) discoverability (without having to dig into the code)
  40. 40. @theburningmonk theburningmonk.com (same diagram, annotated) what does it do?
  41. 41. @theburningmonk theburningmonk.com (same diagram, annotated) dynamodb:GetItem, dynamodb:PutItem, dynamodb:DeleteItem
  42. 42. @theburningmonk theburningmonk.com (same diagram, annotated) dynamodb:GetItem, dynamodb:PutItem, dynamodb:DeleteItem: no least privilege…
  43. 43. @theburningmonk theburningmonk.com (same diagram, annotated) require(x), require(y), require(z)
  46. 46. @theburningmonk theburningmonk.com more dependencies equals slower cold start
  47. 47. @theburningmonk theburningmonk.com (same diagram, annotated) require(x), require(y), require(z): worse cold start performance
  48. 48. @theburningmonk theburningmonk.com keep functions simple, and single-purposed
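As a rough sketch of the contrast between slides 35 and 36, the two styles might look like this in Python. The in-memory `USERS` store, the handler names and the event fields (`path`, `httpMethod`, `pathParameters`, `body`) are assumptions made for illustration; the slides do not specify them.

```python
import json

# Hypothetical in-memory store standing in for a DynamoDB table.
USERS = {}


def get_user(event, context=None):
    """Single-purposed: only reads (would only need dynamodb:GetItem)."""
    user = USERS.get(event["pathParameters"]["id"])
    if user is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(user)}


def create_user(event, context=None):
    """Single-purposed: only writes (would only need dynamodb:PutItem)."""
    user = json.loads(event["body"])
    USERS[user["id"]] = user
    return {"statusCode": 201, "body": json.dumps(user)}


def delete_user(event, context=None):
    """Single-purposed: only deletes (would only need dynamodb:DeleteItem)."""
    USERS.pop(event["pathParameters"]["id"], None)
    return {"statusCode": 204, "body": ""}


def monolithic_handler(event, context=None):
    """The style from slide 35: one function routes every request."""
    route = (event["path"], event["httpMethod"])
    if route == ("/user", "GET"):
        return get_user(event)
    elif route == ("/user", "POST"):
        return create_user(event)
    elif route == ("/user", "DELETE"):
        return delete_user(event)
    return {"statusCode": 404, "body": ""}
```

With single-purposed handlers, each function can be deployed, named, permissioned and packaged on its own, which is what the annotations on slides 38 to 47 build on.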
  49. 49. #2 one account that rules them all
  50. 50. @theburningmonk theburningmonk.com mind the shared limits
  51. 51. @theburningmonk theburningmonk.com Resource Limits: no. of DynamoDB tables, no. of API Gateway regional APIs, no. of API Gateway edge-optimized APIs, no. of Kinesis shards, no. of IAM roles, no. of S3 buckets, no. of CloudFormation stacks, no. of SNS subscription filters, no. of SSM parameters, …
  52. 52. @theburningmonk theburningmonk.com Throughput Limits: DynamoDB read & write, API Gateway requests/second, Lambda concurrent executions, SSM parameter ops/second, …
  54. 54. @theburningmonk theburningmonk.com compartmentalise security breaches
  55. 55. @theburningmonk theburningmonk.com One account per Team per Environment
  57. 57. #3 do first, research later
  58. 58. @theburningmonk theburningmonk.com https://einaregilsson.com/serverless-15-percent-slower-and-eight-times-more-expensive/
  62. 62. @theburningmonk theburningmonk.com the platforms need to do better at educating users on how to choose between different services
  63. 63. @theburningmonk theburningmonk.com SNS vs SQS vs Kinesis vs MSK? the platforms need to do better at educating users on how to choose between different services
  64. 64. @theburningmonk theburningmonk.com
      |               | Kinesis                              | SQS                            | SNS         | EventBridge |
      | subscribers   | many                                 | one-to-one                     | many        | many        |
      | ordering      | by shard                             | none (standard), global (FIFO) | none        | none        |
      | replay events | up to 7 days                         | none                           | none        | none        |
      | mode          | batched                              | batched (up to 10)             | singular    | singular    |
      | retry         | retried until success (customizable) | retry + DLQ                    | retry + DLQ | retry + DLQ |
      | concurrency   | 1 per shard                          | auto-scaled                    | fan-out!!!  | fan-out!!!  |
  65. 65. @theburningmonk theburningmonk.com https://medium.com/theburningmonk-com/all-my-posts-on-serverless-aws-lambda-43c17a147f91
  66. 66. @theburningmonk theburningmonk.com https://www.jeremydaly.com/newsletter/
  67. 67. #4 not using a deployment toolkit
  69. 69. @theburningmonk theburningmonk.com https://lumigo.io/blog/comparison-of-lambda-deployment-frameworks/
  70. 70. @theburningmonk theburningmonk.com don’t write your own deployment framework
  71. 71. #5 console-driven development
  74. 74. #6 one repo per function
  75. 75. @theburningmonk theburningmonk.com [diagram: nine separate github repos, one per function]
  77. 77. @theburningmonk theburningmonk.com monorepo?
  78. 78. @theburningmonk theburningmonk.com github repo
  79. 79. @theburningmonk theburningmonk.com one repo per service?
  80. 80. @theburningmonk theburningmonk.com [diagram: four github repos, one each for user-api, timeline-api, relationship-api and search-api]
  81. 81. @theburningmonk theburningmonk.com https://lumigo.io/blog/mono-repo-vs-one-per-service/
  90. 90. @theburningmonk theburningmonk.com CI/CD pipeline per service
  91. 91. @theburningmonk theburningmonk.com functions are deployed together, as a stack
  92. 92. #7 unencrypted secrets in env vars
  93. 93. @theburningmonk theburningmonk.com secrets should NEVER be in plain text in env variables
  94. 94. @theburningmonk theburningmonk.com [diagram: SSM Parameter Store (Secret 1, Secret 2) behind IAM; two Environment blocks, each with SECRET_1: …, SECRET_2: …]
  95. 95. @theburningmonk theburningmonk.com [same diagram, annotated] yay!
  98. 98. @theburningmonk theburningmonk.com SSM Parameter Store (Secret 1, Secret 2) behind IAM: fetch at cold start, cache, invalidate every x mins
  99. 99. @theburningmonk theburningmonk.com https://github.com/middyjs/middy
  101. 101. @theburningmonk theburningmonk.com SSM Parameter Store (Secret 1, Secret 2) behind IAM: switch to Higher Throughput if you need more than 40 ops/s
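The "fetch at cold start, cache, invalidate every x mins" pattern from slide 98 could be sketched as below. This is a minimal illustration, not the middy implementation: `CachedSecrets`, `fetch_from_store` and the 300-second TTL are hypothetical, and the real fetch would call SSM Parameter Store instead of returning fixed values.

```python
import time


class CachedSecrets:
    """Fetch secrets on first use, cache them, and refetch after ttl_seconds."""

    def __init__(self, fetch, ttl_seconds=300, clock=time.monotonic):
        self._fetch = fetch          # callable returning a dict of secrets
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so expiry can be tested
        self._value = None
        self._expires_at = 0.0

    def get(self):
        now = self._clock()
        if self._value is None or now >= self._expires_at:
            self._value = self._fetch()
            self._expires_at = now + self._ttl
        return self._value


def fetch_from_store():
    # Placeholder (assumption): a real function might call boto3's
    # ssm.get_parameter(Name=..., WithDecryption=True) here.
    return {"SECRET_1": "value-1", "SECRET_2": "value-2"}


# Created outside the handler so the cache survives across warm invocations.
secrets = CachedSecrets(fetch_from_store, ttl_seconds=300)


def handler(event, context=None):
    # Secrets stay in memory only; they are never written to env variables.
    return {"has_secret": "SECRET_1" in secrets.get()}
```

Caching per container also keeps the number of Parameter Store calls low, which matters given the throughput limit mentioned on slide 101.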
  102. 102. #8 not following least privilege principle
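One way to picture least privilege for the single-purposed functions from slide 41 is one IAM policy document per function, each granting a single DynamoDB action. The sketch below only builds the policy documents; the table ARN, account ID and the `policy_for` helper are placeholders invented for illustration.

```python
import json

# Hypothetical table ARN; the account ID and table name are placeholders.
TABLE_ARN = "arn:aws:dynamodb:us-east-1:123456789012:table/users"

# Each single-purposed function gets only the action it needs.
FUNCTION_ACTIONS = {
    "user-api-dev-get-user": ["dynamodb:GetItem"],
    "user-api-dev-create-user": ["dynamodb:PutItem"],
    "user-api-dev-delete-user": ["dynamodb:DeleteItem"],
}


def policy_for(function_name):
    """Build an IAM policy document scoped to one function's needs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": FUNCTION_ACTIONS[function_name],
                "Resource": TABLE_ARN,
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(policy_for("user-api-dev-get-user"), indent=2))
```

A monolithic function would need all three actions in one policy, so a bug or breach in any code path could read, write and delete.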
  105. 105. #9 missing DLQs
  106. 106. @theburningmonk theburningmonk.com async: S3, SNS, SES, CloudFormation, CloudWatch Logs, CloudWatch Events, Scheduled Events, CodeCommit, AWS Config; sync: Cognito, Alexa, Lex, API Gateway; pulling: DynamoDB Stream, Kinesis Stream, SQS (http://amzn.to/2v7Kc3b)
  107. 107. @theburningmonk theburningmonk.com async: S3, SNS, SES, CloudFormation, CloudWatch Logs, CloudWatch Events, Scheduled Events, CodeCommit, AWS Config; sync: Cognito, Alexa, Lex, API Gateway; pulling: DynamoDB Stream, Kinesis Stream, SQS (http://amzn.to/2vs2lIg). For async sources, Lambda handles retries (twice, then DLQ)
  108. 108. @theburningmonk theburningmonk.com configure DLQ for async functions so you don’t lose failed events
  111. 111. @theburningmonk theburningmonk.com DLQ captures the payload; Lambda Destinations captures the payload, context(s), and response
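The "retries twice, then DLQ" behaviour from slide 107 can be simulated locally. This is a toy model under stated assumptions, not Lambda's actual implementation: `invoke_async` is a hypothetical helper, the DLQ is a plain list, and there is no back-off between attempts.

```python
def invoke_async(handler, event, dlq, max_retries=2):
    """Run handler once plus up to max_retries retries; park failures in dlq."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return handler(event)
        except Exception:
            if attempts > max_retries:
                # Retries exhausted: keep the payload instead of losing it.
                dlq.append(event)
                return None
```

Without the `dlq.append(event)` step the failed event would simply disappear after the last retry, which is the situation slide 108 warns about.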
  113. 113. #10 too much/too little concurrency
  114. 114. @theburningmonk theburningmonk.com “Lambda generates too much load for the downstream system”
  115. 115. @theburningmonk theburningmonk.com one invocation per message SNS Lambda
  116. 116. @theburningmonk theburningmonk.com Downstream System SNS Lambda
  117. 117. @theburningmonk theburningmonk.com [the Kinesis / SQS / SNS / EventBridge comparison table from slide 64, repeated]
  118. 118. @theburningmonk theburningmonk.com if you want… maximum throughput: SNS; precise control over throughput: Kinesis
  119. 119. @theburningmonk theburningmonk.com if you want… maximum throughput: SNS; precise control over throughput: Kinesis [added label: how quickly it scales out]
  120. 120. @theburningmonk theburningmonk.com if you want… maximum throughput: SNS; precise control over throughput: Kinesis [added label: how quickly it scales out; also shown: SQS, DynamoDB Streams]
  121. 121. @theburningmonk theburningmonk.com [the Kinesis / SQS / SNS / EventBridge comparison table from slide 64, repeated]
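To see why the delivery mode in the comparison table matters for a downstream system, here is a back-of-the-envelope sketch. Both helper names are hypothetical; the batch sizes (1 for SNS, up to 10 for SQS) and the one-execution-per-shard rule for Kinesis are taken from the table.

```python
import math


def invocations_needed(message_count, batch_size):
    """Number of function invocations needed to process message_count messages."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return math.ceil(message_count / batch_size)


def kinesis_max_concurrency(shard_count):
    """Per the table, Kinesis runs one concurrent execution per shard."""
    return shard_count
```

For example, `invocations_needed(1000, 1)` models SNS delivering 1,000 messages as 1,000 separate invocations, while `invocations_needed(1000, 10)` models SQS batching them into 100.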
  122. 122. #11 cold starts
  123. 123. @theburningmonk theburningmonk.com “cold starts only happen to the first request”
  124. 124. @theburningmonk theburningmonk.com function invocation vs concurrent execution (i.e. a container)
  125. 125. @theburningmonk theburningmonk.com function invocation vs concurrent execution (i.e. a container): method call vs class instance
  126. 126. @theburningmonk theburningmonk.com Lambda scales the number of concurrent executions based on traffic
  127. 127. @theburningmonk theburningmonk.com existing “containers” are reused where possible
  128. 128. @theburningmonk theburningmonk.com [animated diagram, slides 128-137: a timeline on which invocations are added one build step at a time]
  142. 142. @theburningmonk theburningmonk.com [diagram: timeline of invocations with warm-up pings interleaved]
  143. 143. @theburningmonk theburningmonk.com Lambda warmers don’t work when you have > 1 concurrent executions
  144. 144. @theburningmonk theburningmonk.com FREQUENCY, DURATION
  145. 145. @theburningmonk theburningmonk.com FREQUENCY: dictated by user traffic, out of your control; DURATION
  146. 146. @theburningmonk theburningmonk.com cold starts are generally not an issue if you have a steady traffic pattern
  147. 147. @theburningmonk theburningmonk.com [chart: req/s over time]
  148. 148. @theburningmonk theburningmonk.com [chart: req/s over time, labelled El Clásico]
  149. 149. @theburningmonk theburningmonk.com [chart: req/s over time, labelled lunch and dinner]
  150. 150. @theburningmonk theburningmonk.com FREQUENCY; DURATION: optimize this!
  151. 151. @theburningmonk theburningmonk.com minimise the duration of cold starts so they fall within acceptable latency range
  152. 152. @theburningmonk theburningmonk.com [chart: req/s over time, labelled lunch and dinner] Provisioned Concurrency
  153. 153. @theburningmonk theburningmonk.com [chart: req/s over time, labelled lunch and dinner] Provisioned Concurrency, On-Demand Concurrency
  154. 154. @theburningmonk theburningmonk.com https://lumigo.io/blog/provisioned-concurrency-the-end-of-cold-starts/
  155. 155. @theburningmonk theburningmonk.com there are no silver bullets
  156. 156. @theburningmonk theburningmonk.com provisioned concurrency is a powerful tool IFF you have a cold start problem; don’t use it by default
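The point of slides 123 to 143, that cold starts happen per concurrent execution and not only on the first request, can be illustrated with a small simulation. It is a simplified model: `count_cold_starts` is a hypothetical helper, each "container" handles one invocation at a time, and idle containers are assumed never to be reclaimed.

```python
def count_cold_starts(invocations):
    """Count cold starts for a list of (start_time, duration) invocations.

    A new container is created (a cold start) only when every existing
    container is still busy at the moment an invocation arrives.
    """
    busy_until = []  # finish time of the latest invocation on each container
    cold_starts = 0
    for start, duration in sorted(invocations):
        for i, free_at in enumerate(busy_until):
            if free_at <= start:
                busy_until[i] = start + duration  # reuse a warm container
                break
        else:
            busy_until.append(start + duration)   # all busy: cold start
            cold_starts += 1
    return cold_starts
```

Three back-to-back invocations, such as `[(0, 1), (2, 1), (4, 1)]`, share one container and incur one cold start, whereas three overlapping ones, such as `[(0, 5), (1, 5), (2, 5)]`, need three containers. This is also why a single warm-up ping only keeps one container warm.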
  157. 157. #12 RDS connection handling
  158. 158. @theburningmonk theburningmonk.com default RDS configs are bad for Lambda
  159. 159. @theburningmonk theburningmonk.com default RDS configs are bad for Lambda: idle connections are not closed; too many connections per “container”; max open connections is too low
  160. 160. @theburningmonk theburningmonk.com https://www.jeremydaly.com/manage-rds-connections-aws-lambda/
  161. 161. @theburningmonk theburningmonk.com set “wait_timeout” and “interactive_timeout” to 10 mins (default is 8 hours!)
  162. 162. @theburningmonk theburningmonk.com increase “max_connections” setting
  163. 163. @theburningmonk theburningmonk.com set client socket pool size to 1
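The reason for a pool size of 1 becomes clearer with some arithmetic: every concurrent execution holds its own pool, so open connections multiply. The helper names and the numbers in the usage note are hypothetical and only illustrate the relationship.

```python
def peak_db_connections(concurrent_executions, pool_size_per_container):
    """Worst case: every container keeps its whole pool open."""
    return concurrent_executions * pool_size_per_container


def fits(max_connections, concurrent_executions, pool_size_per_container):
    """True if the database's max_connections setting covers the worst case."""
    peak = peak_db_connections(concurrent_executions, pool_size_per_container)
    return peak <= max_connections
```

With 200 concurrent executions, a pool of 10 per container could open 2,000 connections, while a pool of 1 caps it at 200; `fits(500, 200, 1)` is true and `fits(500, 200, 10)` is false.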
  166. 166. #13 (lack of) observability
  167. 167. @theburningmonk theburningmonk.com [timeline: happened, user impact, system repaired] reduce MTTR
  168. 168. @theburningmonk theburningmonk.com Identify & Resolve Issues, Understanding costs, Visibility
  170. 170. @theburningmonk theburningmonk.com [timeline: happened, user impact, system repaired] MTTDiscovery
  172. 172. @theburningmonk theburningmonk.com “What alerts should I have?”
  173. 173. @theburningmonk theburningmonk.com It depends on what you’re building…
  174. 174. @theburningmonk theburningmonk.com But, this is a good starting point
  175. 175. @theburningmonk theburningmonk.com Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency
  176. 176. @theburningmonk theburningmonk.com (adds) API Gateway: p90/95/99 latency, success rate %, 4xx rate %, 5xx rate %
  177. 177. @theburningmonk theburningmonk.com (adds) SQS: message age
  178. 178. @theburningmonk theburningmonk.com (adds) Step Functions: failed count, throttle count, timed out count
  180. 180. @theburningmonk theburningmonk.com monitor and alert on message flow rate for event processing pipelines
  181. 181. @theburningmonk theburningmonk.com “Can’t you codify these?”
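One possible answer to "Can’t you codify these?" is to generate alarm definitions from a small table of alerts. The sketch below covers only some of the Lambda alerts and produces plain dictionaries shaped like the parameters of CloudWatch's PutMetricAlarm API; it does not call AWS. The thresholds, the period and the `alarms_for_function` helper are illustrative assumptions, and error rate % is approximated by a raw error count.

```python
# (namespace, metric, statistic, threshold); thresholds are illustrative only.
LAMBDA_ALERTS = [
    ("AWS/Lambda", "Errors", "Sum", 1),
    ("AWS/Lambda", "Throttles", "Sum", 1),
    ("AWS/Lambda", "DeadLetterErrors", "Sum", 1),
    ("AWS/Lambda", "IteratorAge", "Maximum", 60000),  # milliseconds
]


def alarms_for_function(function_name, period_seconds=60):
    """Return one alarm definition per alert for the given function."""
    return [
        {
            "AlarmName": f"{function_name}-{metric}",
            "Namespace": namespace,
            "MetricName": metric,
            "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
            "Statistic": statistic,
            "Period": period_seconds,
            "EvaluationPeriods": 1,
            "Threshold": threshold,
            "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        }
        for namespace, metric, statistic, threshold in LAMBDA_ALERTS
    ]
```

Each dictionary could then be passed to a deployment tool or an SDK call, so every new function gets the same baseline alerts without manual setup.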
  183. 183. https://theburningmonk.com/hire-me Advise, Training, Delivery. “Fundamentally, Yan has improved our team by increasing our ability to derive value from AWS and Lambda in particular.” (Nick Blair, Tech Lead)
  184. 184. @theburningmonk theburningmonk.com https://theburningmonk.com/workshops Amsterdam, March 19-20; Helsinki, May 4-5; Stockholm, May 14-15; Dublin, June 16-17; London, September 24-25; Berlin, October 8-9. 10% off with code codemotion-2020
  185. 185. Production-Ready Serverless
  186. 186. @theburningmonk theburningmonk.com github.com/theburningmonk
