AWS Lambda is a serverless, event-driven compute service that lets you run code for virtually any type of application or backend service without provisioning or managing servers. Teams choose it for its flexibility, its easy integration with other AWS services, and the reduction in the amount of infrastructure they have to own. But over time, as the number of clients and requests grows and you start caring about latency, you may discover that there is no free lunch. Clients complain about latency, things you took for granted when running your software on EC2 or Fargate no longer apply, and costs start to ramp up. In this talk, I'm going to describe some of the lessons learned from working on multiple services backed by AWS Lambda: what cold starts are and how to reduce them, how the JVM makes them even more problematic, when AWS Lambda is more expensive than less abstract platforms, how to use provisioned concurrency, and why one of the biggest problems in computer science (caching) is even bigger on Lambda.