David Poblador i Garcia // @davidpoblador
Service Availability Lead
DevOps at Spotify:
There and Back Again
david poblador i garcia
@davidpoblador
involved in free software since 1999
started up FS OSS consultancy company in 2000
free software in the public administration
leading infrastructure and operations teams
DevOps at Spotify:
from day 0
the beginning
a handful of…
engineers
a handful of…
systems
one sysadmin
one sysadmin
datacenters
servers
monitoring
on-call
operating systems
conf management
the result was…
a lot of technical debt…
(Not our DC)
but it was awesome…
and we moved fast…
Spotify interest over time (2008)
kept growing in all dimensions:
datacenters
employees
systems
servers
around 2010
DevOps was fading away…
40ish engineers
4 people in ops
dozens of systems
ownership issues
teams were changing fast
priorities moving
Operations team not scaling
the result was…
technical debt
lack of proper tooling
2013 and onwards
Spotify in numbers:
55 markets
20M+ songs
1.5 billion playlists
24M active users
6M+ paying subscribers
Spotify in numbers:
50+ teams building products/features
around 100 backend systems
4+ datacenters
6000+ servers
DevOps is back
every team is responsible for their
systems. end to end:
it works
it runs
it scales
we embed SRE engineers
in teams as needed
we have a core-SRE team in
charge of coordination duties:
cross-cutting incidents
core on-call
…
teams take on-call duties
teams take all operational duties:
provisioning
deployment
capacity planning
incident remediation
operations and infrastructure
merge
I/O builds the Spotify platform:
alerting
backups
monitoring
data pipelines
conf management
testing environment
service ecosystem
data collection
procurement
provisioning
network
…
we encourage teams to build
missing pieces
results so far
results so far
20% of our teams are taking full
operational responsibility
results so far
the availability for the systems has not
been damaged (it has even improved)
results so far
we remove technical debt
faster than we create it
results so far
high adoption rate of our I/O platform
results so far
very promising results in our joint
ventures with feature teams
results so far
I/O teams are also taking full operational
responsibility for their systems
and some mistakes!
problems
lack of buy-in
hidden work

(old ownership model)
Tack!
DevOps at Spotify: There and Back Again
David Poblador i Garcia
@davidpoblador
dpoblador@spotify.com
Published in: Technology

DevOps at Spotify: There and Back Again

  1. David Poblador i Garcia // @davidpoblador Service Availability Lead DevOps at Spotify: There and Back Again
  2. david poblador i garcia @davidpoblador involved in free software since 1999 started up FS OSS consultancy company in 2000 free software in the public administration leading infrastructure and operations teams
  3. DevOps at Spotify: from day 0
  4. the beginning
  5. a handful of… engineers
  6. a handful of… systems
  7. one sysadmin
  8. one sysadmin datacenters servers monitoring on-call operating systems conf management
  9. the result was…
  10. a lot of technical debt… Not our DC
  11. but it was awesome…
  12. and we moved fast…
  13. Spotify interest over time (2008)
  14. kept growing in all dimensions: datacenters employees systems servers
  15. around 2010
  16. DevOps was fading away…
  17. 40ish engineers 4 people in ops dozens of systems
  18. ownership issues teams were changing fast priorities moving
  19. Operations team not scaling
  20. the result was…
  21. technical debt
  22. lack of proper tooling
  23. 2013 and onwards
  24. Spotify in numbers: 55 markets 20M+ songs 1.5 billion playlists 24M active users 6M+ paying subscribers
  25. Spotify in numbers: 50+ teams building products/features around 100 backend systems 4+ datacenters 6000+ servers
  26. DevOps is back
  27. every team is responsible for their systems. end to end: it works it runs it scales
  28. we embed SRE engineers in teams as needed
  29. we have a core-SRE team in charge of coordination duties: cross-cutting incidents core on-call …
  30. teams take on-call duties
  31. teams take all operational duties: provisioning deployment capacity planning incident remediation
  32. operations and infrastructure merge
  33. I/O builds the Spotify platform: alerting backups monitoring data pipelines conf management testing environment service ecosystem data collection procurement provisioning network …
  34. we encourage teams to build missing pieces
  35. results so far
  36. results so far 20% of our teams are taking full operational responsibility
  37. results so far the availability for the systems has not been damaged (it has even improved)
  38. results so far we remove technical debt faster than we create it
  39. results so far high adoption rate of our I/O platform
  40. results so far very promising results in our joint ventures with feature teams
  41. results so far I/O teams are also taking full operational responsibility for their systems
  42. and some mistakes!
  43. problems lack of buy-in hidden work (old ownership model)
  44. Tack! DevOps at Spotify: There and Back Again David Poblador i Garcia @davidpoblador dpoblador@spotify.com
