Beware the potholes on the road to serverless

  1. MIND THE POTHOLES
  2. @theburningmonk theburningmonk.com back in the day…
  3. we’re getting a new server in 3 months time
  4. yay! about time! hooray!! finally!
  5. but we have to decide what dependencies to install on it now..
  6. on premise, VMs in the cloud
  7. on premise, VMs in the cloud; abstraction level
  8. on premise, VMs in the cloud; abstraction level, productivity
  10. less is more
  13. right?
  15. “why are we failing at this?”
  16. hidden dangers
  17. monolith, microservices, serverless
  18. monolith, microservices, serverless; observability, distributed systems, bounded context
  20. monolith, microservices, serverless; observability, distributed systems, bounded context, event driven
  21. monolith, serverless; missing learnings from microservices
  22. monolith, serverless; missing learnings from microservices; poor decisions
  23. Yan Cui http://theburningmonk.com @theburningmonk AWS user for 10 years
  24. http://bit.ly/yubl-serverless
  25. Yan Cui http://theburningmonk.com @theburningmonk Developer Advocate @
  26. Yan Cui http://theburningmonk.com @theburningmonk Independent Consultant: advise, training, delivery
  27. theburningmonk.com/courses
  28. theburningmonk.com/workshops: in your company, flexible dates; 4-week virtual workshop, May 4 - May 29; Amsterdam, Jul 7-8; Helsinki, Aug 20-21; London, Sep 24-25; Berlin, Oct 8-9
  29. #1 not letting go of legacy thinking
      #2 one account that rules them all
      #3 do first, research later
      #4 not using a deployment framework
      #5 console-driven development
      #6 one repo per function
      #7 unencrypted secrets in env vars
      #8 not following least privilege principle
      #9 missing DLQs
      #10 too much/too little concurrency
      #11 cold starts
      #12 RDS connection handling
      #13 (lack of) observability
  30. #1 not letting go of legacy thinking
  31. “we’re doing serverless, but why aren’t things going faster?”
  32. Socio-Technical
  33. there are no silver bullets
  35. centralised team; Team A, Team B, Team C, Team D, …
  36. “but the developers don’t understand AWS and how our infrastructure is set up”
  37. “but the developers don’t understand AWS and how our infrastructure is set up” let’s solve this problem instead!
  38. what got you here won’t get you there
  39. Monolithic Functions: if (path == “/user” && method == “GET”) { return getUser(…); } else if (path == “/user” && method == “DELETE”) { return deleteUser(…); } else if (path == “/user” && method == “POST”) { return createUser(…); } else if ….
  40. Single-Purposed Functions: GET /user, POST /user, DELETE /user
  41. Monolithic: user-api-dev (author: yan.cui, feature: user-api). Single-Purposed: user-api-dev-get-user, user-api-dev-create-user, user-api-dev-delete-user (each tagged author: yan.cui, feature: user-api)
  42. Monolithic vs Single-Purposed: find related functions by prefix
  43. Monolithic vs Single-Purposed: discoverability (without having to dig into the code)
  44. Monolithic vs Single-Purposed: what does it do?
  45. Monolithic vs Single-Purposed: dynamodb:GetItem, dynamodb:PutItem, dynamodb:DeleteItem
  46. Monolithic vs Single-Purposed: dynamodb:GetItem, dynamodb:PutItem, dynamodb:DeleteItem; no least privilege…
  47. Monolithic vs Single-Purposed: require(x), require(y), require(z)
  50. more dependencies equals slower cold start
  51. Monolithic vs Single-Purposed: require(x), require(y), require(z); worse cold start performance
  52. keep functions simple, and single-purposed
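As a rough illustration of the single-purposed layout described on the slides above, the sketch below splits the monolithic router into three small handlers. It is only a sketch under stated assumptions: the in-memory `users` map stands in for a real table, the `{id}` path parameter is not on the slides, and the `json` helper is hypothetical. The event and response shapes follow the API Gateway proxy integration.

```javascript
// Hypothetical in-memory store standing in for a real table (assumption).
const users = new Map();

// Small helper (hypothetical) to build an API Gateway proxy response.
const json = (statusCode, payload) => ({
  statusCode,
  body: JSON.stringify(payload),
});

// GET /user/{id}: would be deployed on its own, e.g. as user-api-dev-get-user.
const getUser = async (event) => {
  const user = users.get(event.pathParameters.id);
  return user ? json(200, user) : json(404, { message: "user not found" });
};

// POST /user: would be deployed on its own, e.g. as user-api-dev-create-user.
const createUser = async (event) => {
  const user = JSON.parse(event.body);
  users.set(user.id, user);
  return json(201, user);
};

// DELETE /user/{id}: would be deployed on its own, e.g. as user-api-dev-delete-user.
const deleteUser = async (event) => {
  const existed = users.delete(event.pathParameters.id);
  return existed
    ? json(200, { deleted: true })
    : json(404, { message: "user not found" });
};
```

Each handler only loads what it needs and does one thing, so its name alone says what it does, and it can be given its own permissions and dependencies.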
  53. #2 one account that rules them all
  54. mind the shared limits
  55. Resource Limits: no. of DynamoDB tables, no. of API Gateway regional APIs, no. of API Gateway edge-optimized APIs, no. of Kinesis shards, no. of IAM roles, no. of S3 buckets, no. of CloudFormation stacks, no. of SNS subscription filters, no. of SSM parameters, …
  56. Throughput Limits: DynamoDB read & write, API Gateway requests/second, Lambda concurrent executions, SSM parameter ops/second, …
  58. compartmentalise security breaches
  59. One account per Team per Environment
  61. https://github.com/OlafConijn/AwsOrganizationFormation
  63. https://github.com/OlafConijn/AwsOrganizationFormation (Accounts, Org Units, SCPs, Pwd Policies, Multi-Region, Pseudo-Funs, Init & Validate, CI/CD)
  64. #3 do first, research later
  65. https://einaregilsson.com/serverless-15-percent-slower-and-eight-times-more-expensive/
  69. the platforms need to do better at educating users on how to choose between different services
  70. SNS vs SQS vs Kinesis vs MSK? the platforms need to do better at educating users on how to choose between different services
  71. Kinesis vs SQS vs SNS vs EventBridge:
      ordering: Kinesis by shard; SQS none (standard), global (FIFO); SNS none; EventBridge none
      replay events: Kinesis up to 7 days; SQS none; SNS none; EventBridge none
      mode: Kinesis batched; SQS batched (up to 10); SNS singular; EventBridge singular
      retry: Kinesis retried until success (customizable); SQS retry + DLQ; SNS retry + DLQ; EventBridge retry + DLQ
      concurrency: Kinesis 1 per shard; SQS auto-scaled; SNS fan-out!!!; EventBridge fan-out!!!
      subscribers: Kinesis many; SQS one-to-one; SNS many; EventBridge many
  72. https://medium.com/theburningmonk-com/all-my-posts-on-serverless-aws-lambda-43c17a147f91
  73. https://www.jeremydaly.com/newsletter/
  74. #4 not using a deployment toolkit
  76. https://lumigo.io/blog/comparison-of-lambda-deployment-frameworks/
  77. don’t write your own deployment framework
  78. #5 console-driven development
  81. #6 one repo per function
  82. [Diagram: nine separate github repos]
  84. monorepo?
  85. github repo
  86. one repo per service?
  87. github repos: user-api, timeline-api, relationship-api, search-api
  88. https://lumigo.io/blog/mono-repo-vs-one-per-service
  96. github repos: user-api, timeline-api, relationship-api, search-api
  97. CI/CD pipeline per service
  98. functions are deployed together, as a stack
  99. #7 unencrypted secrets in env vars
  100. secrets should NEVER be in plain text in env variables
  101. SSM Parameter Store: Secret 1, Secret 2; IAM; Environment: SECRET_1: …, SECRET_2: …; Environment: SECRET_1: …, SECRET_2: …
  102. SSM Parameter Store: Secret 1, Secret 2; IAM; Environment: SECRET_1: …, SECRET_2: …; Environment: SECRET_1: …, SECRET_2: …; yay!
  105. SSM Parameter Store: Secret 1, Secret 2; IAM; fetch at cold start, cache, invalidate every x mins
  106. https://github.com/middyjs/middy
  108. SSM Parameter Store: Secret 1, Secret 2; IAM; switch to Higher Throughput if you need more than 40 ops/s
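The "fetch at cold start, cache, invalidate every x mins" idea can be sketched roughly as below. This is a minimal illustration, not the middy implementation: `cachedSecrets` and the stand-in fetch function are hypothetical names, the real fetch would call SSM Parameter Store, and the 5-minute TTL is an assumption because the slide leaves "x" unspecified. The fetch here happens lazily on the first invocation of a new container rather than during initialisation.

```javascript
// Wraps an async fetch function with a time-based cache.
// `fetchSecrets` is a hypothetical function that would call SSM Parameter
// Store; `now` is injectable so the expiry logic can be exercised in tests.
function cachedSecrets(fetchSecrets, ttlMs, now = Date.now) {
  let cached; // last fetched value, kept in memory only
  let expiresAt = 0; // timestamp after which the value is refetched

  return async function getSecrets() {
    if (cached === undefined || now() >= expiresAt) {
      cached = await fetchSecrets();
      expiresAt = now() + ttlMs;
    }
    return cached;
  };
}

// Declared at module scope, so the cache lives as long as the container:
// the first invocation pays for the fetch, later ones reuse the result.
const getSecrets = cachedSecrets(
  async () => ({ SECRET_1: "from-ssm", SECRET_2: "from-ssm" }), // stand-in fetch
  5 * 60 * 1000 // assumed TTL of 5 minutes
);

const handler = async () => {
  const secrets = await getSecrets();
  return { statusCode: 200, body: `loaded ${Object.keys(secrets).length} secrets` };
};
```

Because the secrets only ever live in memory, they never show up as plain-text environment variables, and the TTL bounds how long a rotated secret stays stale.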
  109. #8 not following least privilege principle
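One way to picture least privilege for the three single-purposed functions from slides 45-46 is to give each function only the DynamoDB action it uses. The sketch below builds a standard IAM policy document per function; the `actionsByFunction` mapping, the `policyFor` helper and the table ARN in the usage note are illustrative assumptions, not taken from the talk.

```javascript
// Hypothetical mapping from function name to the only actions it needs.
const actionsByFunction = {
  "user-api-dev-get-user": ["dynamodb:GetItem"],
  "user-api-dev-create-user": ["dynamodb:PutItem"],
  "user-api-dev-delete-user": ["dynamodb:DeleteItem"],
};

// Builds an IAM policy document that allows only the listed actions on one
// specific table, rather than every action on every resource.
function policyFor(functionName, tableArn) {
  const actions = actionsByFunction[functionName];
  if (!actions) {
    throw new Error(`unknown function: ${functionName}`);
  }
  return {
    Version: "2012-10-17",
    Statement: [{ Effect: "Allow", Action: actions, Resource: tableArn }],
  };
}
```

For example, `policyFor("user-api-dev-get-user", "arn:aws:dynamodb:us-east-1:123456789012:table/users")` (a made-up ARN) yields a policy that can read items but cannot write or delete them.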
  112. #9 missing DLQs
  113. Lambda invocation types (http://amzn.to/2v7Kc3b):
       async: S3, SNS, SES, CloudFormation, CloudWatch Logs, CloudWatch Events, Scheduled Events, CodeCommit, AWS Config
       sync: Cognito, Alexa, Lex, API Gateway
       pulling: DynamoDB Stream, Kinesis Stream, SQS
  114. Lambda invocation types (http://amzn.to/2vs2lIg):
       async: S3, SNS, SES, CloudFormation, CloudWatch Logs, CloudWatch Events, Scheduled Events, CodeCommit, AWS Config
       sync: Cognito, Alexa, Lex, API Gateway
       pulling: DynamoDB Stream, Kinesis Stream, SQS
       Lambda handles retries (twice, then DLQ)
  115. configure DLQ for async functions so you don’t lose failed events
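To make the "retries twice, then DLQ" behaviour concrete, here is a toy simulation. It does not call Lambda; `invokeAsync` is a hypothetical helper, the handler is synchronous for brevity, and the dead-letter queue is modelled as a plain array.

```javascript
// Toy model of an asynchronous invocation: the event is attempted once,
// retried up to `maxRetries` more times, then handed to the dead-letter
// queue if one is configured, or lost if not.
function invokeAsync(handler, event, { dlq = null, maxRetries = 2 } = {}) {
  let lastError;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return { status: "succeeded", attempts: attempt + 1, result: handler(event) };
    } catch (err) {
      lastError = err; // remember the failure and try again
    }
  }
  if (dlq) {
    dlq.push({ event, error: lastError.message });
    return { status: "sent-to-dlq", attempts: maxRetries + 1 };
  }
  return { status: "lost", attempts: maxRetries + 1 };
}
```

With a handler that always throws, the event ends up in the array after three attempts when a queue is passed in, and is simply gone when it is not, which is the failure mode the slide warns about.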
  120. #10 too much/too little concurrency
  121. “Lambda generates too much load for the downstream system”
  122. SNS → Lambda: one invocation per message
  123. SNS → Lambda → Downstream System
  124. Kinesis vs SQS vs SNS vs EventBridge:
      ordering: Kinesis by shard; SQS none (standard), global (FIFO); SNS none; EventBridge none
      replay events: Kinesis up to 7 days; SQS none; SNS none; EventBridge none
      mode: Kinesis batched; SQS batched (up to 10); SNS singular; EventBridge singular
      retry: Kinesis retried until success (customizable); SQS retry + DLQ; SNS retry + DLQ; EventBridge retry + DLQ
      concurrency: Kinesis 1 per shard; SQS auto-scaled; SNS fan-out!!!; EventBridge fan-out!!!
      subscribers: Kinesis many; SQS one-to-one; SNS many; EventBridge many
  125. if you want… maximum throughput: SNS; precise control over throughput: Kinesis
  126. if you want… maximum throughput: SNS; precise control over throughput: Kinesis; how quickly it scales out
  127. if you want… maximum throughput: SNS; precise control over throughput: Kinesis; how quickly it scales out; SQS, DynamoDB Streams
  128. Kinesis vs SQS vs SNS vs EventBridge:
      ordering: Kinesis by shard; SQS none (standard), global (FIFO); SNS none; EventBridge none
      replay events: Kinesis up to 7 days; SQS none; SNS none; EventBridge none
      mode: Kinesis batched; SQS batched (up to 10); SNS singular; EventBridge singular
      retry: Kinesis retried until success (customizable); SQS retry + DLQ; SNS retry + DLQ; EventBridge retry + DLQ
      concurrency: Kinesis 1 per shard; SQS auto-scaled; SNS fan-out!!!; EventBridge fan-out!!!
      subscribers: Kinesis many; SQS one-to-one; SNS many; EventBridge many
  129. #11 cold starts
  130. “cold starts only happen to the first request”
  131. function; concurrent execution (i.e. a container); invocation
  132. function = class; concurrent execution (i.e. a container) = instance; invocation = method call
  133. Lambda scales the number of concurrent executions based on traffic
  134. existing “containers” are reused where possible
  135-144. [Timeline diagram, built up over ten slides: invocations plotted against time, growing from one invocation to eight]
  149. [Timeline diagram: invocations plotted against time, interleaved with pings]
  150. Lambda warmers don’t work when you have > 1 concurrent executions
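The point about warmers can be illustrated with a toy model of container reuse. This is a simplified sketch, not how Lambda is implemented: `countColdStarts` is a hypothetical helper, each container serves one invocation at a time, and idle containers are assumed never to expire.

```javascript
// Toy model: each invocation is { start, duration } in milliseconds.
// A container handles one invocation at a time and is reused once free.
function countColdStarts(invocations) {
  const busyUntil = []; // one entry per container: when it becomes free
  let coldStarts = 0;
  const sorted = [...invocations].sort((a, b) => a.start - b.start);

  for (const { start, duration } of sorted) {
    const free = busyUntil.findIndex((t) => t <= start);
    if (free === -1) {
      coldStarts++; // no idle container available: a new one is started
      busyUntil.push(start + duration);
    } else {
      busyUntil[free] = start + duration; // reuse a warm container
    }
  }
  return coldStarts;
}

// One warmer ping at t=0, then three user requests arriving together.
const withWarmer = countColdStarts([
  { start: 0, duration: 10 }, // the ping warms exactly one container
  { start: 100, duration: 50 },
  { start: 100, duration: 50 },
  { start: 100, duration: 50 },
]);
console.log(withWarmer); // 3: the ping plus two of the three user requests
```

A single ping keeps a single container warm, so as soon as requests overlap, the extra concurrent executions still start cold.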
  151. FREQUENCY; DURATION
  152. FREQUENCY: dictated by user traffic, out of your control; DURATION
  153. cold starts are generally not an issue if you have a steady traffic pattern
  154. [Chart: req/s over time]
  155. [Chart: req/s over time; El Clásico]
  156. [Chart: req/s over time; lunch, dinner]
  157. FREQUENCY; DURATION: optimize this!
  158. minimise the duration of cold starts so they fall within acceptable latency range
  159. [Chart: req/s over time; lunch, dinner; Provisioned Concurrency]
  160. [Chart: req/s over time; lunch, dinner; Provisioned Concurrency, On-Demand Concurrency]
  161. https://lumigo.io/blog/provisioned-concurrency-the-end-of-cold-starts/
  162. there are no silver bullets
  163. Provisioned concurrency is a powerful tool IFF you have a cold start problem; don’t use it by default
  164. #12 RDS connection handling
  165. default RDS configs are bad for Lambda
  166. default RDS configs are bad for Lambda: idle connections are not closed; too many connections per “container”; max open connection is too low
  167. https://www.jeremydaly.com/manage-rds-connections-aws-lambda/
  168. set “wait_timeout” and “interactive_timeout” to 10 mins (default is 8 hours!)
  169. increase “max_connections” setting
  170. set client socket pool size to 1
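A rough sketch of why a pool size of 1 makes sense: each container handles one invocation at a time, so one connection per container, created once and reused, is enough. `makeHandler`, `createConnection` and `maxOpenConnections` are hypothetical names; the connection factory is injected so the sketch runs without a database.

```javascript
// `createConnection` is a hypothetical async factory standing in for a real
// database client.
function makeHandler(createConnection) {
  let connection; // lives at container scope, shared across invocations

  return async function handler(event) {
    if (!connection) {
      connection = await createConnection(); // only on the first invocation
    }
    return connection.query(event.sql);
  };
}

// Upper bound on open connections: every concurrent execution holds its own
// pool, so the database sees concurrency × pool size connections.
const maxOpenConnections = (concurrentExecutions, poolSize) =>
  concurrentExecutions * poolSize;
```

With 100 concurrent executions, a pool size of 10 could open up to 1,000 connections, while a pool size of 1 caps it at 100, which is why the pool size matters alongside "max_connections".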
  173. #13 (lack of) observability
  174. timeline: happened, user impact, system repaired; reduce MTTR
  175. Identify & Resolve Issues; Understanding costs; Visibility
  177. timeline: happened, user impact, system repaired; MTTDiscovery
  179. “What alerts should I have?”
  180. It depends on what you’re building…
  181. But, this is a good starting point
  182. Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency
  183. Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency; API Gateway: p90/95/99 latency, success rate %, 4xx rate %, 5xx rate %
  184. Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency; API Gateway: p90/95/99 latency, success rate %, 4xx rate %, 5xx rate %; SQS: message age
  185. Lambda: error rate %, throttle count, DLQ error count, iterator age, regional concurrency; API Gateway: p90/95/99 latency, success rate %, 4xx rate %, 5xx rate %; SQS: message age; Step Functions: failed count, throttle count, timed out count
  187. monitor and alert on message flow rate for event processing pipelines
  188. “Can’t you codify these?”
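One possible way to codify the Lambda part of that list is to generate alarm definitions from a small table of metrics. This is only a sketch: `lambdaAlarmTemplates` and `alarmsFor` are hypothetical names and every threshold is an assumption. The metric names are CloudWatch metrics in the AWS/Lambda namespace, the object fields are modelled on CloudWatch alarm parameters, and plain error counts are used here instead of the error rate % from the slide to keep the example short.

```javascript
// Thresholds, periods and statistics below are illustrative assumptions.
const lambdaAlarmTemplates = [
  { metric: "Errors", statistic: "Sum", threshold: 1 },
  { metric: "Throttles", statistic: "Sum", threshold: 1 },
  { metric: "DeadLetterErrors", statistic: "Sum", threshold: 1 },
  { metric: "IteratorAge", statistic: "Maximum", threshold: 60000 }, // milliseconds
];

// Produces one alarm definition per template for the given function.
function alarmsFor(functionName) {
  return lambdaAlarmTemplates.map(({ metric, statistic, threshold }) => ({
    AlarmName: `${functionName}-${metric}`,
    Namespace: "AWS/Lambda",
    MetricName: metric,
    Dimensions: [{ Name: "FunctionName", Value: functionName }],
    Statistic: statistic,
    Period: 60, // seconds
    EvaluationPeriods: 1,
    Threshold: threshold,
    ComparisonOperator: "GreaterThanOrEqualToThreshold",
  }));
}
```

Calling `alarmsFor("user-api-dev-get-user")` returns four definitions that could be fed into whatever deployment tooling creates the alarms, so every new function gets the same baseline without manual setup.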
  190. #1 not letting go of legacy thinking
      #2 one account that rules them all
      #3 do first, research later
      #4 not using a deployment framework
      #5 console-driven development
      #6 one repo per function
      #7 unencrypted secrets in env vars
      #8 not following least privilege principle
      #9 missing DLQs
      #10 too much/too little concurrency
      #11 cold starts
      #12 RDS connection handling
      #13 (lack of) observability
  191. https://theburningmonk.com/hire-me Advise, Training, Delivery. “Fundamentally, Yan has improved our team by increasing our ability to derive value from AWS and Lambda in particular.” Nick Blair, Tech Lead
  192. Production-Ready Serverless
  193. theburningmonk.com/workshops: in your company, flexible dates; 4-week virtual workshop, May 4 - May 29; Amsterdam, Jul 7-8; Helsinki, Aug 20-21; London, Sep 24-25; Berlin, Oct 8-9; €100 off all my workshops: slsdays-virtual-202004
  194. lambdabestpractice.com, bit.ly/complete-guide-to-aws-step-functions; 20% off my courses: slsdays-virtual-202004
  195. github.com/theburningmonk