Lambda gives you a lot of scalability and multi-AZ redundancy out of the box, but things can still go wrong in production.
There are region-wide outages, and performance degradation in services your function depends on can cause it to time out or error. And what if you're dealing with downstream systems that just aren't as scalable and can't handle the load you put on them?
The bottom line is that many things can go wrong, and they often do at the worst times. The goal of building resilient systems is not to prevent failures, but to build systems that can withstand them. In this talk, we will look at a number of practices and architectural patterns that can help you build more resilient serverless applications, such as multi-region, active-active deployments, DLQs and surge queues, and chaos experiments that identify failure modes before they manifest in production.
The recording is available here: https://www.youtube.com/watch?v=elVeOYYtLM0