SRE in Startup

Zonky, 17.1.2017

Published in: Technology

  1. SRE in startup. Zonky, 17.1.2017. Ladislav Prskavec, Apiary, ladislav@apiary.io, @abtris
  2. What is SRE?
  3. "What happens when a software engineer is tasked with what used to be called operations." » Ben Treynor Sloss, Vice President, Google Engineering, founder of Google SRE
  4. "Our work is like being part of the world's most intense pit crew. We change the tires of a race car as it's going 100 mph." » Andrew Widdowson, Site Reliability Engineer, Mountain View
  5. In general, an SRE team is responsible for: » availability » latency » performance » efficiency » change management » monitoring » emergency response » capacity planning
  6. (no transcribed text)
  7. If the team agrees on a 99.9% SLA, that gives them an error budget of 0.1%.
  8. (no transcribed text)
  9. Rule » If the service is within its SLA, launch away: the Dev team is clearly doing a good job » If the service is not within its SLA, launch freeze until you earn back enough error budget
  10. Error budget » removes the SRE-Dev conflict » Dev teams self-police
  11. Common staffing pool » one more SRE = one fewer Dev
  12. SRE hires only coders » they get bored easily » they speak the same language as Dev
  13. 50% cap on ops work » if you succeed, work scales with traffic » coding reduces the work/traffic ratio
  14. Keep Dev in rotation » 5% of ops work handled by Devs
  15. Speaking of Dev and Ops work » excess operations load (tickets, on-call, etc.)
  16. SRE portability » no requirement to stick with a project or with SRE
  17. Outages » minimize impact » prevent recurrence
  18. Minimize damage » no NOC » good diagnostic information » practice, practice, practice
  19. Prevent recurrence: 1. Handle the event 2. Write a post-mortem 3. Reset
  20. Post-mortem philosophy » blameless, focus on process and technology » create a timeline » get all the facts » create bugs for all follow-up work
  21. What is specific about SRE in a startup?
  22. 1:10
  23. Horizontal team
  24. SaaS oriented
  25. On-call culture
  26. It's cool work
  27. SRE book
  28. "May the Queries Flow, And the Pagers Remain Silent" SRE Benediction
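
The error-budget arithmetic on slide 7 and the launch rule on slide 9 can be sketched in a few lines of Python. This is only an illustration, not something taken from the deck: the function names, the 30-day measurement window, and the choice to express the budget as downtime minutes are all assumptions made for the example.

```python
MINUTES_PER_DAY = 24 * 60


def error_budget_fraction(sla_percent: float) -> float:
    """Fraction of the window that may be 'bad' under the agreed SLA.

    A 99.9% SLA leaves a 0.1% error budget (slide 7).
    """
    if not 0.0 < sla_percent <= 100.0:
        raise ValueError("sla_percent must be in (0, 100]")
    return (100.0 - sla_percent) / 100.0


def allowed_downtime_minutes(sla_percent: float, window_days: int = 30) -> float:
    """Error budget expressed as downtime minutes.

    The 30-day default window is an assumption for this sketch.
    """
    return error_budget_fraction(sla_percent) * window_days * MINUTES_PER_DAY


def can_launch(sla_percent: float, downtime_minutes: float,
               window_days: int = 30) -> bool:
    """Launch rule from slide 9: launch while within SLA, freeze otherwise."""
    return downtime_minutes <= allowed_downtime_minutes(sla_percent, window_days)


if __name__ == "__main__":
    budget = allowed_downtime_minutes(99.9)
    print(f"Budget for 99.9% over 30 days: {budget:.1f} minutes")
    print(can_launch(99.9, downtime_minutes=20.0))  # True: budget remains
    print(can_launch(99.9, downtime_minutes=50.0))  # False: launch freeze
```

Under these assumptions, a 99.9% SLA allows roughly 43 minutes of downtime per 30 days. Once that is spent, `can_launch` returns `False` until older incidents fall out of the window, which is one way to read "earn back enough error budget".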
