Tyranny of the SLA

My talk with Jim Kimball on the tyranny of the SLA. In it, we:
- Deconstruct the purpose of the service level agreement
- Discuss the pitfalls of common SLA clauses, including how current SLAs inhibit the development of resilient systems and the cultivation of a DevOps culture
- Explore other potential SLA models that could foster healthier organizational behaviors and dynamics, and ultimately produce better technical, and therefore business, outcomes

Published in: Leadership & Management

  1. THE TYRANNY OF THE SLA
      JIM KIMBALL, J. PAUL REED
      VELOCITY NEW YORK, 2015
      99.999% uptime
  2. OUR SERVICE LEVEL AGREEMENT WITH YOU
      @jpaulreed @jimkimball #velocityconf
  3. THIS PRESENTATION WILL KEEP YOU AWAKE 99.5% OF THE TIME.
  4. THE MTTL WILL BE 4.2 MINUTES.
  5. YOU AGREE THAT, AS A VALUED CUSTOMER, ALL TWEETS REGARDING THIS PRESENTATION MUST BE POSITIVE IN NATURE.
  6-8. IN THE CASE OF DEGRADED PERFORMANCE OF THE PRESENTATION, OUR FINANCIAL LIABILITY IS LIMITED TO THE ASSESSED VALUE OF THIS SESSION.*
      * O’Reilly’s valuation pending**
      ** Value likely to be 1/100th of a cent
  9. IN CASE OF EXCESSIVE CELL PHONE UTILIZATION, YOUR CONFERENCE ACCESS WILL BE REVOKED.
  10. J. PAUL REED
      • @JPAULREED
      • HOST OF THE @SHIPSHOWPODCAST
      • 15+ YEARS IN BUILD/RELEASE ENGINEERING
      • WORK WITH ALL SORTS OF ORGS ON “THE DEVOPS™”
      • VISITING SCIENTIST/CHIEF DELIVERY OFFICER AT PRAXISFLOW
  11. JIM KIMBALL
      • CTO, HEDGESERV
      • 25 YEARS IN THE FINANCIAL SOFTWARE INDUSTRY
      • @JIMKIMBALL
      • TOC ICO JONAH
      • THOUGHTS ON LEADING SOFTWARE ORGANIZATIONS AT SHARINGLUNCH.TUMBLR.COM
  12. CAUTION: EXPERIMENT AHEAD
  13. PROBLEMS WITH TODAY’S SLAs
  15. WE HAVE SLA FATIGUE
  16-20. Remember All Those Nines?

      Availability   Year         Quarter      Month
      90.0%          36.5 days    9 days       72 hours
      99.0%          3.65 days    4.5 days     36 hours
      99.5%          1.83 days    11.7 hours   3.6 hours
      99.9%          8.76 hours   2.19 hours   43.8 mins
      99.99%         52.6 mins    13.1 mins    4.38 mins
      99.999%        5.26 mins    77.7 secs    25.9 secs
      55.5555555%    162.2 days   40 days      13.3 days

  21. Definitions Are Hard
      • What is an “outage”?
      • Uptime vs. Availability
      • Maintenance windows?
      • “Acts of God”
  22. Often Opaque
  23. SLAs Can Promote Silos
  26. Every conceivable thing has been taken into consideration. That’s why we have what we call defense in depth. Now that means backup systems to backup systems to backup systems. … Even with a faulty relay, even with a stuck valve, that system works.
  27. But we didn’t uncover [the core], did we? We stopped it in time for one simple reason, and I told you that: the system works. Dammit, the system works. That’s not the problem.
  28. FOR ABOUT A CENTURY, THEN, DETERMINISM WAS ASSUMED TO EXIST AND TO BE THE FIRST REQUIREMENT TO BE ABLE TO EXERT PRECISE CONTROL OVER THE WORLD. THIS HAS COME TO DOMINATE OUR ATTITUDES TOWARD CONTROL. TODAY, DETERMINISM IS KNOWN TO BE FUNDAMENTALLY FALSE, AND YET THE ILLUSION OF DETERMINISM IS STILL CLUNG ONTO WITH FERVOR IN OUR HUMAN WORLD OF … COMPUTERS AND INFORMATION SYSTEMS.
      MARK BURGESS, IN SEARCH OF CERTAINTY
  29. EXTERNAL SLAs
  31-33. Amazon Web Services
      • First, which SLA?
      • But not for account suspensions and terminations
      • Ditto maintenance (as defined!)
      • Also ignore “failures of individual instances or volumes not attributable to Region Unavailability”
      • Expect no more than a 10% credit. (Maybe 30%.)
      • (Doesn’t apply to one-time charges… aka “reserved instances.”)
  34. SLAs ARE SAFEGUARDS YOU PUT INTO BROKEN RELATIONSHIPS.
      Roy Rappaport, Netflix
  36. PagerDuty
      • Dig for the SLA
      • Basic plan: “best effort”
      • Standard plan: “5 minutes”
      • Enterprise plan: Insurance ($3 million!)
      • But really: they focus on reliability and resilience
  37. [REDACTED]
  39-41. “100% SLA availability? Really?!”
      “I’ve forwarded your question to our attorney and he’s suggested that we remove the reference to 100%. So we’ll do that ASAP.”
      (They did.)
  42. Why SLAs In The First Place?
  43. THE SLA ELEPHANT IN THE DATACENTER
  45. INTERNAL SLAs
  46. Cautionary Uses of Internal SLAs
  47. Language Matters
      • Service Level Agreements versus Service Level Commitments
      • Service Level Agreements as “relationship agreements”?
  49. NOT ALL SLAs ARE BAD (OR CREATED EQUAL)
  50. Making Prioritization Contextual
  51. Fostering the Right Conversations
  52. OTHER CONSIDERATIONS FOR IMPROVEMENT
  54. Brené Brown
      • “One of the ways we deal with it is: we numb.”
      • “We make everything that’s uncertain, certain.”
      • “We perfect. And, more dangerously, we perfect our kids.”
  55. Whether it’s a bailout, an oil spill, a recall: we pretend like what we’re doing doesn’t have a huge impact on other people. I would say to companies: this isn’t our first rodeo, people. We just need you to be authentic and real and say: We’re sorry. We’ll fix it.
  57. A MODEST PROPOSAL
  58-59. Remember The Manifesto!
  60. A LONG-TERM RELATIONSHIP BETWEEN PURCHASER AND SUPPLIER IS NECESSARY FOR BEST ECONOMY. … MORE IMPORTANT THAN PRICE IN THE JAPANESE WAY OF DOING BUSINESS IS CONTINUAL IMPROVEMENT OF QUALITY, WHICH CAN BE ACHIEVED ONLY ON A LONG-TERM RELATIONSHIP OF LOYALTY AND TRUST.
      W. EDWARDS DEMING, OUT OF THE CRISIS
  61-67. Obvious: Rigid Constraints, Best Practice
      Complicated: Governing Constraints, Good Practice
      Complex: Enabling Constraints, Emergent Practice
      Chaotic: Lack of Constraints, Novel Practice
      Disorder
      MTTR, etc. • Crew Formation • Formalized Response • MTTD, etc.
  68. A Minimum Viable SLA
      • Covers all complexity domains
      • Involves the business through to the customer
      • Prompts good behavior among teams…
      • … and within the organization
      • Facilitates organizational / team learning
      • Lightweight as possible
  69. THE SLA, HISTORICALLY, INCENTIVIZES THE WRONG BEHAVIORS
  70. MODERN SLAs MUST BE COMPLEXITY INFORMED
  71. BE AWARE OF THE SLAs YOU ARE A PART OF… AND THAT ARE A PART OF YOUR SYSTEM
  72. J. Paul Reed: preed@release-approaches.com, @jpaulreed, http://jpaulreed.com
      Jim Kimball: jkimball@hedgeserv.com, @jimkimball, http://eng.hedgeserv.com
      Anonymous Feedback: http://sayat.me/jimkimball, http://sayat.me/jpaulreed
  73. (Pssst… Yes, we’re hiring!) eng.hedgeserv.com
  74. DevOps in Practice, J. Paul Reed
  75. Photo Credits
      Slide 1: http://jcb.lunaimaging.com/luna/servlet/detail/JCBMAPS~2~2~313~100035:The-Colossus-of-the-North-or-The-St
      Slide 12: http://jimhillmedia.com/editor_in_chief1/b/jim_hill/archive/2009/04/27/wanna-make-a-muppet-happy-go-vote-for-beaker-s-ode-to-joy.aspx
      Slide 15: Generated via imgflip.com
      Slide 21: https://commons.wikimedia.org/wiki/File:Pigneau_de_Behaine_Annamite_Latin_Dictionary.jpg
      Slide 22: http://www.beckoncall.info/why-do-my-inside-windows-get-so-dirty/
      Slide 23: http://commons.wikimedia.org/wiki/File:Wooden_silo.JPG
  76. Photo Credits, Continued
      Slide 24: https://commons.wikimedia.org/wiki/File:Nuclear_Missile_Silo_%287332367192%29.jpg
      Slide 25: http://www.cio.com/article/2883770/cloud-computing/the-death-of-the-sla.html
      Slide 28: http://markburgess.org/bio.html
      Slide 43: http://res.freestockphotos.biz/pictures/10/10006-an-elephant-in-the-wild-pv.jpg
      Slide 46: http://www.oldhouseonline.com/patch-plaster-walls/
      Slide 47: http://seuss.wikia.com/wiki/The_Zax
      Slide 48: http://www.gapingvoidart.com/
  77. Photo Credits, Continued
      Slide 50: https://www.youtube.com/watch?v=c4QWgfHOvSM
      Slide 51: Courtesy Jabe Bloom / Will Evans
      Slide 53: https://www.flickr.com/photos/stignygaard/450640129, https://www.flickr.com/photos/midendian/358942212/
      Slides 54, 55: https://www.youtube.com/watch?v=iCvmsMzlF7o
      Slide 56: http://www.moviepostersetc.com/MoviePostersEtc/salvador-dali-poster-les-montres-molles-clock-explosion-36-x-24--ff8080813ae48649013aebe4facd03b6-p.html
      Slides 57, 58: https://www.flickr.com/photos/62981668@N06/6012392939/
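
The downtime figures in the availability table on slide 16 come from simple arithmetic: allowed downtime = (1 - availability) x period length. The Python sketch below is an illustration and not part of the talk. It assumes a 365-day year, a quarter of one fourth of a year, and a month of one twelfth of a year; all function names are hypothetical. Several cells on the slide (for example the 99.0% quarter and month cells) do not match this calculation, so treat the output as an independent check rather than a reproduction of the slide. The second function illustrates the "Definitions Are Hard" point from slide 21: whether a maintenance window counts as an outage changes the reported availability.

```python
# Assumed period lengths (not stated in the slides): 365-day year,
# quarter = year / 4, month = year / 12.
YEAR_SECONDS = 365 * 24 * 3600
PERIOD_SECONDS = {
    "year": YEAR_SECONDS,
    "quarter": YEAR_SECONDS / 4,
    "month": YEAR_SECONDS / 12,
}


def allowed_downtime_seconds(availability_pct, period):
    """Downtime budget, in seconds, for an availability target over one period."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return (1 - availability_pct / 100) * PERIOD_SECONDS[period]


def measured_availability_pct(period_seconds, downtime_seconds, excluded_seconds=0.0):
    """Reported availability when part of the downtime (for example a
    maintenance window) is excluded from both the downtime and the period."""
    counted_period = period_seconds - excluded_seconds
    counted_downtime = downtime_seconds - excluded_seconds
    return 100 * (1 - counted_downtime / counted_period)


def humanize(seconds):
    """Render a duration in the largest unit that keeps the value at or above 1."""
    for unit, size in (("days", 86400), ("hours", 3600), ("mins", 60)):
        if seconds >= size:
            return f"{seconds / size:.2f} {unit}"
    return f"{seconds:.1f} secs"


if __name__ == "__main__":
    for target in (90.0, 99.0, 99.5, 99.9, 99.99, 99.999):
        row = [humanize(allowed_downtime_seconds(target, p)) for p in PERIOD_SECONDS]
        print(f"{target}%", *row, sep="  ")
```

For example, under these assumptions four hours of downtime in a month reports roughly 99.45% availability, but about 99.86% if three of those hours are declared maintenance and excluded.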
