Using the SRE model to improve software reliability - Conflux

Site Reliability Engineering (SRE) is a new way of running large-scale software systems. Devised and popularised by Google, SRE is a specific set of disciplines and dynamics that work together with modern software engineering practices to help produce reliable software at scale. The SRE discipline combines deep awareness of technical infrastructure, operating systems and computer networking with attention to higher-level service level objectives (SLOs) to maintain a focus on business-relevant activities.

SRE requires new ways of organising work, new ways of hiring, and new modes of interaction between teams. We explore what these new approaches are and how they affect IT organisations.


  1. Using the SRE model to improve software reliability. Matthew Skelton, Head of Consulting, Conflux. confluxdigital.net / @ConfluxHQ. Manchester, 20 Sept 2018
  3. Operability is a shared concern. #BizDevTestSecOps
  6. Availability: Do you really need 99.999% uptime?
  8. Service Level Objective: A shared goal based on user expectations
  10. Error Budget: No budget? No deployment!
  14. Tooling: Service availability, Diagnostics
  17. Team Guide to Software Operability, Matthew Skelton & Rob Thatcher. operabilitybook.com. Download a free sample chapter
  18. Further reading: Site Reliability Engineering - 2018, https://landing.google.com/sre/book/; DevOps Team Topologies - Type 7 (SRE), http://devopstopologies.com/ (CC BY-SA)
  19. Thank you. @ConfluxHQ, confluxdigital.net
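
The availability, Service Level Objective and error budget slides can be made concrete with a little arithmetic. The following is a minimal sketch, not taken from the deck: the function names are hypothetical, and the 30-day measurement window is an assumption. It treats the error budget as the downtime an availability SLO permits over the window, and allows deployment only while some of that budget remains.

```python
# Illustrative sketch only: the helper names and the 30-day window
# are assumptions, not part of the original slides.

MINUTES_PER_DAY = 24 * 60


def allowed_downtime_minutes(slo: float, window_days: int = 30) -> float:
    """Minutes of downtime an availability SLO permits over the window.

    `slo` is a fraction, e.g. 0.999 for "three nines".
    """
    return (1.0 - slo) * window_days * MINUTES_PER_DAY


def error_budget_remaining(
    slo: float, downtime_minutes: float, window_days: int = 30
) -> float:
    """Error budget left (in minutes) after the downtime already incurred."""
    return allowed_downtime_minutes(slo, window_days) - downtime_minutes


def can_deploy(slo: float, downtime_minutes: float, window_days: int = 30) -> bool:
    """'No budget? No deployment!': deploy only while budget remains."""
    return error_budget_remaining(slo, downtime_minutes, window_days) > 0


if __name__ == "__main__":
    for slo in (0.999, 0.9999, 0.99999):
        budget = allowed_downtime_minutes(slo)
        print(f"SLO {slo:.3%}: about {budget:.2f} minutes of downtime per 30 days")
    print("Deploy after 10 min downtime at 99.9%?", can_deploy(0.999, 10.0))
    print("Deploy after 50 min downtime at 99.9%?", can_deploy(0.999, 50.0))
```

Under these assumptions, a 99.9% SLO leaves roughly 43 minutes of budget per 30 days, while 99.999% leaves well under one minute, which is one way to read the question on the Availability slide.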