No Free Lunch, Indeed: Three Years of Micro-services at SoundCloud

SoundCloud is the largest repository of audio on the web, used by more than 200 million people every month, who upload more than 11 hours of audio every minute.

Like so many others, we have migrated from a typical monolithic architecture to micro-services. While the benefits brought by this style of SOA to our productivity and reliability are clear, the architecture required some non-obvious changes in the way we operate systems, and a way to tackle the overhead associated with having hundreds of small moving parts to serve every request.

In this talk we’ll share the toolkit and strategy SoundCloud uses to keep its micro-services explosion manageable. What do we do about the operations overhead? How to spread devops skills across teams to support the “you build it, you run it” vision? How to deal with breaking changes and asynchronous behaviours? How to deal with chatty interactions? Which protocol? How do I even get a diagram telling me how all this stuff is put together?

http://qconlondon.com/presentation/no-free-lunch-indeed-three-years-micro-services-soundcloud

Published in: Technology

  1. 1. No free lunch, indeed: Three years of microservices at SoundCloud (Phil Calçado, SoundCloud)
  2. 2. > 11 hours of audio uploaded every minute ~ 300 million people every month
  3. 3. heaps have been written about microservices in the past year-ish
  4. 4. tl;dr • Rapid provisioning • Basic Monitoring • Rapid application deployment
  5. 5. These make sense • Rapid provisioning • Basic Monitoring • Rapid application deployment. This makes sense, so why does it make me so nervous?
  6. 6. the SoundCloud story you might know
  7. 7. the pre-history
  8. 8. SoundCloud, circa 2011
  9. 9. let’s prepare for the “microservices explosion”
  10. 10. #1 provisioning
  11. 11. what was cool in 2010-11
  12. 12. what was cool in 2010-11 doozer lxc 12factor.net
  13. 13. much better than anything else at the time
  14. 14. a problem no resource limits (i.e. cgroups) + naïve scheduling = loud neighbour in your own datacentre
  15. 15. a problem made for most of our services migrated to
  16. 16. the problem time start work on v1 v1 100% deployed start work on v2
  17. 17. before we go sophisticated, let’s simplify what we have
  18. 18. warmed up pool machine intake
  19. 19. not the final solution, but will buy us some time
  20. 20. #2 telemetry
  21. 21. state of telemetry tools circa 2011-12 wasn’t great
  22. 22. StatsD
  23. 23. let’s build our own!
  24. 24. but that’s not what broke…
  25. 25. obvious with a monolith monolith
  26. 26. not so much now ? ? ?
  27. 27. standardise dashboards
  28. 28. standardise operations https://twitter.github.io/twitter-server/Features.html#http-admin-interface
  29. 29. visualise
  30. 30. add management lines, all the way up, to escalation policies
  31. 31. #3 deployment
  32. 32. > git SquashFS > make unit tests integration tests acceptance tests perf tests > make /dev/null
  33. 33. we ended up with 7 different deployment scripts
  34. 34. > make > git unit tests integration tests unit tests integration tests acceptance tests perf tests
  35. 35. containers let you spawn your mini-SoundCloud
  36. 36. but why was I so nervous?
  37. 37. because we messed up
  38. 38. there are simple and incremental ways to address these • Rapid provisioning • Basic Monitoring • Rapid application deployment
  39. 39. “uh? do you think Netflix got it right the first time?”
  40. 40. some good things
  41. 41. Q&A
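
Slide 22 names StatsD as part of the telemetry picture around 2011-12. As a rough illustration only, the sketch below shows the plain-text line format StatsD accepts over UDP (`<name>:<value>|<type>`). The metric names, the helper function names, and the host and port defaults are assumptions made for this example; nothing here is taken from SoundCloud's actual setup.

```python
import socket


def format_metric(name: str, value, metric_type: str) -> str:
    """Render one metric in the StatsD plain-text line format."""
    return f"{name}:{value}|{metric_type}"


def send_metric(line: str, host: str = "127.0.0.1", port: int = 8125) -> None:
    """Send one metric line over UDP, fire-and-forget.

    The host is an assumption for this sketch; 8125 is the port StatsD
    listens on by default.
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


# Hypothetical metric names, chosen only for illustration.
counter = format_metric("playlists.requests", 1, "c")   # counter increment
timer = format_metric("playlists.latency", 320, "ms")   # timing in milliseconds
```

Calling `send_metric(counter)` would emit the counter to a local StatsD daemon if one is running. Because UDP is connectionless, the service does not block or fail when the collector is down, which is part of why this style of instrumentation is cheap to add to many small services.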