SoundCloud is the largest repository of audio on the web, used by more than 200 million people every month, who upload more than 11 hours of audio every minute. Like so many others, we have migrated from a typical monolithic architecture to microservices. While the benefits this style of SOA brings to our productivity and reliability are clear, the architecture required some non-obvious changes in the way we operate systems, as well as a way to tackle the overhead of having hundreds of small moving parts involved in serving every request. In this talk we’ll share the toolkit and strategy SoundCloud uses to keep its microservices explosion manageable. What do we do about the operations overhead? How do we spread devops skills across teams to support the “you build it, you run it” vision? How do we deal with breaking changes and asynchronous behaviours? How do we deal with chatty interactions? Which protocol should we use? And how do we even get a diagram showing how all this stuff is put together?