“Everything fails all the time!” is a quote many of us repeat every day. How does it feel when things fail in production? How do you recover from such situations? How can you make sure they don’t repeat? This talk discusses all of these questions using real production incidents and the measures taken to mitigate such failures. We will also look at a few of the most common failure modes in a serverless ecosystem.
Remember, when everything fails all the time, you must learn something every day to be operational all the time!
21. Art of coding – Copy & Paste
• CloudWatch Event rule → Heavy lifting function → Trigger StepFunction (2 GB RAM, 5 mins run, 2 x daily)
• Frontend Status check → API → Request handler → Status store (2 GB RAM, 100 ms run, 1000s x daily)
22. Oops! Ops
• Memory: 2 GB, Duration: 100 ms, Invocations: 1 per sec (2.5 mil / mo), Cost: ~$9.00 / month
• Memory: 256 MB, Duration: 100 ms, Invocations: 1 per sec (2.5 mil / mo), Cost: ~$1.50 / month
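The two monthly cost figures on this slide can be roughly reproduced with a few lines of arithmetic. The sketch below is an illustration, not the speaker's calculation: the function name `monthly_lambda_cost` is hypothetical, and the two prices are assumed AWS Lambda list prices (x86, no free tier, no billing-granularity rounding) that should be checked against current pricing for your region.

```python
# Assumed list prices (not taken from the slides); verify for your region.
PRICE_PER_GB_SECOND = 0.0000166667      # compute charge per GB-second
PRICE_PER_MILLION_REQUESTS = 0.20       # request charge per 1M invocations


def monthly_lambda_cost(memory_mb, duration_ms, invocations_per_month):
    """Estimate the monthly cost in USD of a single Lambda function."""
    # Billed compute is memory (GB) x duration (s) x number of invocations.
    gb_seconds = (memory_mb / 1024) * (duration_ms / 1000) * invocations_per_month
    compute_cost = gb_seconds * PRICE_PER_GB_SECOND
    request_cost = invocations_per_month / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    return compute_cost + request_cost


if __name__ == "__main__":
    # Same duration and traffic as on the slide; only the memory setting differs.
    for memory_mb in (2048, 256):
        cost = monthly_lambda_cost(memory_mb, 100, 2_500_000)
        print(f"{memory_mb} MB: ${cost:.2f} / month")
```

Under these assumptions the estimate comes out near the slide's rounded figures: the copy-pasted 2 GB setting costs several times more than 256 MB for the same 100 ms request handler.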
27. Serverless requires a new way of thinking, a new way of working, and a new way of running applications.
That means we need to change our way of thinking, our way of working, and our way of running applications.
36. From Oops of Sorrows to Operational Success…
• Know the service limits
• Be Well-Architected
• See through the Serverless Lens
• Alarm – Alert – Act
• Monitor monitor monitor
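As one possible reading of "Alarm – Alert – Act", the sketch below defines a CloudWatch alarm on a Lambda function's `Errors` metric that notifies an SNS topic. The function name, topic ARN, threshold, and helper names are all hypothetical placeholders; the slides do not say which alarms the speaker actually used.

```python
def build_error_alarm(function_name, sns_topic_arn, threshold=1):
    """Build keyword arguments for CloudWatch put_metric_alarm (illustrative values)."""
    return {
        "AlarmName": f"{function_name}-errors",
        "AlarmDescription": f"Errors reported by Lambda function {function_name}",
        "Namespace": "AWS/Lambda",
        "MetricName": "Errors",
        "Dimensions": [{"Name": "FunctionName", "Value": function_name}],
        "Statistic": "Sum",
        "Period": 60,                      # evaluate one-minute buckets
        "EvaluationPeriods": 1,
        "Threshold": float(threshold),
        "ComparisonOperator": "GreaterThanOrEqualToThreshold",
        "TreatMissingData": "notBreaching",  # no invocations means no alarm
        "AlarmActions": [sns_topic_arn],     # alert: notify whoever must act
    }


def create_alarm(params):
    """Create the alarm; needs boto3 and AWS credentials, so it is imported lazily."""
    import boto3
    boto3.client("cloudwatch").put_metric_alarm(**params)


if __name__ == "__main__":
    # Hypothetical function name and topic ARN, for illustration only.
    params = build_error_alarm(
        "request-handler",
        "arn:aws:sns:us-east-1:123456789012:ops-alerts",
    )
    print(params["AlarmName"], params["Threshold"])
```

Keeping the alarm definition in a plain function separate from the API call makes it easy to review thresholds and reuse the same definition across functions.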