When Spotify started in 2006, with just 20 people, they were more worried about selling the idea of music streaming than about setting up monitoring systems. Fast forward to 2015, and more than 400 engineers are collecting more than 30 million time series from more than 10,000 hosts. So how did we get here? The journey has been a long one, with plenty of false starts and growing pains, from scaling systems to scaling teams to scaling the business itself, and it has challenged what we thought we knew about operational monitoring at every step.
This talk is about some of the more interesting challenges we've faced along the way and what we've learned so far. It covers some of the technical details but focuses primarily on the human aspects, and on how our monitoring solutions have both shaped and been shaped by organizational structures and changing engineering practices.