Lambda gives you multi-AZ out-of-the-box, but still, things can go wrong in production. There are region-wide outages, and performance degradation in services your function depends on can cause it to time out or error. And what if you're dealing with downstream systems that just aren't as scalable and can't handle the load you put on them? The bottom line is many things can go wrong and they often do at the worst of times. The goal of building resilient systems is not to prevent failures, but to build systems that can withstand these failures. In this talk, we will look at a number of practices and architectural patterns that can help you build more resilient serverless applications. Such as multi-region, active-active, employing DLQs and surge queues. We'll also see how we can use chaos experiments to help us identify failure modes before they manifest in production
56. @theburningmonk theburningmonk.com
The Saga pattern
Begin transaction
Start book hotel request
End book hotel request
Start book flight request
End book flight request
Start book car rental request
End book car rental request
End transaction
98. @theburningmonk theburningmonk.com
circuit breaker pattern
When circuit is open, fail fast
but, allow 1 request through every Y mins
If request succeeds, close the circuit
After X consecutive timeouts, trip the circuit
119. @theburningmonk theburningmonk.com
“the discipline of experimenting on a system in order to build confidence in the
system’s capability to withstand turbulent conditions in production”
principlesofchaos.org