Ops Meta-Metrics
The Currency You Use to Pay For Change




John Allspaw
VP Operations
  Etsy.com
                        ...
Warning

Graphs and numbers in this
       presentation
   are sort of made up
/usr/nagios/libexec/check_ops.pl
How R U Doing?
            http://www.flickr.com/photos/a4gpa/190120662/
We track bugs already...




       Example: https://issues.apache.org/jira/browse/TS
We should track
     these, too...

Changes (Who/What/When/Type)
Incidents (Type/Severity)
Response to Incidents (TTR/TTD)
trepidation
noun
1 a feeling of fear or agitation about something that may
happen : the men set off in fear and trepidation.
Change

Required.
Often feared.
Why?



                http://www.flickr.com/photos/20408885@N03/3570184759/
This is why
                       OMGWTF OUTAGES!!!1!!

   la de da,
everything’s fine




            change
            happens
Change
 PTSD?




         http://www.flickr.com/photos/tzofia/270800047/
Brace For Impact?
But wait....
                          (OMGWTF)
   la de da,
everything’s fine




            change
            happens
But wait....
                                (OMGWTF)
   la de da,
everything’s fine
                   }   How much change is this?
                       What kind of change?
                       How often does this happen?

            change
            happens
Need to raise confidence that


change != outage
...incidents can be
    handled well




                 http://www.flickr.com/photos/axiepics/3181170364/
...root causes can be fixed
        quick enough




                   http://www.flickr.com/photos/ljv/213624799/
...change can be
  safe enough




     http://www.flickr.com/photos/marksetchell/43252686/
But how?
How do we have confidence in anything
in our infrastructure?



          We measure it.
          And graph it.
          And alert on it.
Tracking Change
1. Type
2. Frequency/Size
3. Results of those changes
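
A minimal sketch of what a change log covering these three items could look like. The `Change` record, its field names, and the two helper functions are hypothetical illustrations, not anything shown in the deck; any ticketing or tracking system that captures the same fields would do.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Change:
    """One production change (hypothetical record layout)."""
    when: datetime
    who: str
    what: str                 # e.g. a link to the diff or ticket
    type: str                 # e.g. "app code", "config", "db schema"
    lines_changed: int        # one possible measure of size
    caused_incident: bool = False


def changes_per_type(changes):
    """Frequency: how many changes of each type were made."""
    return dict(Counter(c.type for c in changes))


def incident_rate_per_type(changes):
    """Results: fraction of changes of each type that led to an incident."""
    totals = Counter(c.type for c in changes)
    failures = Counter(c.type for c in changes if c.caused_incident)
    return {t: failures[t] / totals[t] for t in totals}


log = [
    Change(datetime(2010, 5, 3, 14, 0), "alice", "diff-1", "app code", 5),
    Change(datetime(2010, 5, 3, 15, 0), "bob", "diff-2", "app code", 5000,
           caused_incident=True),
    Change(datetime(2010, 5, 4, 10, 0), "carol", "ticket-3", "db schema", 40),
]
print(changes_per_type(log))
print(incident_rate_per_type(log))
```

The sample names and values in `log` are made up for the example.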
Types of Change

        Layers                   Examples


      App code        PHP/Rails/etc or ‘front-end’ code

    ...
Code Deploys:
        Who/What/When
WHEN              WHO                                 WHAT
                  (guy who ...
Code Deploys:
            Who/What/When

                      Last 2 prod deploys
Last 2 Chef changes
other changes




(insert whatever ticketing/tracking you have)
Frequency
Size
Tracking Incidents
        http://www.flickr.com/photos/47684393@N00/4543311558/
Incident Frequency
Incident Size


      Big Outage
     TTR still going
Tracking Incidents

1. Frequency
2. Severity
3. Root Cause
4. Time-To-Detect (TTD)
5. Time-To-Resolve (TTR)
The How
Doesn’t
Matter




          http://www.flickr.com/photos/matsuyuki/2328829160/
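Incident/Degradation
               Tracking

Date     Start Time  Detect Time  Resolve Time  Severity  Root Cause  PostMortem Done?
1/2/08   12:30 ET    12:32 ET     12:45 ET      Sev1      DB Change   Yes
3/7/08   18:32 ET    18:40 ET     18:47 ET      Sev2      Capacity    Yes
5/3/08   17:55 ET    17:55 ET     18:14 ET      Sev3      Hardware    Yes

These will give you context for your rates of change.
(You’ll need them for postmortems, anyway.)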
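
As a rough sketch, TTD and TTR can be derived directly from the start, detect, and resolve times in a log like this one. The three rows below are the example rows the deck gives for this table. The dictionary layout and function names are hypothetical, and measuring TTR from the moment of detection is an assumption; measuring it from the start of the incident is an equally plausible reading.

```python
from datetime import datetime

# Example rows from the deck's incident tracking table (times are ET).
INCIDENTS = [
    {"date": "1/2/08", "start": "12:30", "detect": "12:32", "resolve": "12:45",
     "severity": "Sev1", "root_cause": "DB Change"},
    {"date": "3/7/08", "start": "18:32", "detect": "18:40", "resolve": "18:47",
     "severity": "Sev2", "root_cause": "Capacity"},
    {"date": "5/3/08", "start": "17:55", "detect": "17:55", "resolve": "18:14",
     "severity": "Sev3", "root_cause": "Hardware"},
]


def minutes_between(earlier, later):
    """Whole minutes between two HH:MM times on the same day."""
    fmt = "%H:%M"
    delta = datetime.strptime(later, fmt) - datetime.strptime(earlier, fmt)
    return int(delta.total_seconds() // 60)


def ttd(incident):
    """Time To Detect: start of the incident until someone noticed."""
    return minutes_between(incident["start"], incident["detect"])


def ttr(incident):
    """Time To Resolve, measured here from detection (an assumption)."""
    return minutes_between(incident["detect"], incident["resolve"])


for row in INCIDENTS:
    print(row["date"], row["severity"], "TTD", ttd(row), "min", "TTR", ttr(row), "min")
```

The sketch assumes every incident starts and ends on the same calendar day; an incident that runs past midnight would need full timestamps.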
Change:Incident Ratio

  Important.
  Not because all changes are equal.
  Not because all incidents are equal, or
  change-related.
Change:Incident Ratio
But because
humans will
irrationally
make a
permanent
connection
between the
two.
               htt...
Severity
Not all incidents are created equal.
Something like:

SEV1 Full outage, or effectively unusable.
SEV2 Significant degradation for subset of users.
SEV3 Minor impact on user experience.
SEV4 No impact, but time-sensitive failure.
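Root Cause?
          (Not all incidents are change related)

          Something like:

                         1. Hardware Failure
                         2. Datacenter Issue
                         3. Change: Code Issue
                         4. Change: Config Issue
                         5. Capacity/Traffic Issue
                         6. Other

Note: this can be difficult to categorize.
http://en.wikipedia.org/wiki/Root_cause_analysis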
Recording Your Response




                (worth the hassle)


              http://www.flickr.com/photos/mattblaze/26950...
[Timeline figure, built up over several slides. Along the Time axis:
 la de da, everything’s fine -> change happens -> Noticed there was a problem
 -> Figured out what the cause is -> Fixed the problem
 (•rolled back •rolled forward •temporary solution •etc)]

Response steps shown along the timeline:
  • Coordinate troubleshooting/diagnosis
  • Coordinate responses*
  • Confirm stability, resolving steps
Each step is paired with:
  • Communicate to support/community/execs

* usually, “One Thing At A Time” responses
Communications
http://etsystatus.com




twitter.com/etsystatus
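[Same timeline figure, with PostMortem added]

Time To Detect (TTD)
Time To Resolve (TTR)

[Timeline figure: la de da, everything’s fine; change happens; the TTD and TTR
 intervals; la de da, everything’s fine]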
Hypothetical Example:
 “We’re So Nimble!”
Nimble, But Stumbling?
Is There Any Pattern?
Nimble, But Stumbling?



          +
Maybe this is too much suck?

Maybe you’re changing too much at once?
Happening too often?
What percentage of incidents are related to
change?




                            http://www.flickr.com/photos/78364563@N...
What percentage of change-
related incidents are “off-hours”?

Do they have higher or
lower TTR?

                   http://www.flickr.com/photos/jeffreyanthonyrafolpiano/3266123838

What types of change have the worst success
rates?

Which ones have the best success rates?

                   http://www.flickr.com/photos/lwr/2257949828/
Does your   TTD/TTR increase
depending on the:

-   SIZE?
-   FREQUENCY?




                               http://www.flic...
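
Once incidents are recorded, questions like these reduce to small queries over the log. The following is a sketch under several assumptions: the record layout is hypothetical, "change-related" is taken to mean a root cause labelled with the "Change:" prefix used in the root cause list, and "off-hours" is arbitrarily defined as weekends or any time outside 09:00 to 18:00. You would substitute your own definitions.

```python
from datetime import datetime
from statistics import mean


def is_change_related(incident):
    """Assumes root causes are labelled like 'Change: Code Issue'."""
    return incident["root_cause"].startswith("Change:")


def is_off_hours(incident, day_start=9, day_end=18):
    """Assumed definition: weekends, or outside day_start..day_end."""
    started = incident["started"]
    return started.weekday() >= 5 or not (day_start <= started.hour < day_end)


def percent(part, whole):
    return 100.0 * part / whole if whole else 0.0


def summarize(incidents):
    change_related = [i for i in incidents if is_change_related(i)]
    off = [i for i in change_related if is_off_hours(i)]
    on = [i for i in change_related if not is_off_hours(i)]
    return {
        "pct_change_related": percent(len(change_related), len(incidents)),
        "pct_change_related_off_hours": percent(len(off), len(change_related)),
        "mean_ttr_off_hours": mean(i["ttr_minutes"] for i in off) if off else None,
        "mean_ttr_on_hours": mean(i["ttr_minutes"] for i in on) if on else None,
    }


# Made-up sample data, purely for illustration.
sample = [
    {"started": datetime(2010, 5, 3, 14, 0), "root_cause": "Change: Code Issue", "ttr_minutes": 10},
    {"started": datetime(2010, 5, 4, 23, 30), "root_cause": "Change: Config Issue", "ttr_minutes": 30},
    {"started": datetime(2010, 5, 8, 11, 0), "root_cause": "Hardware Failure", "ttr_minutes": 45},
    {"started": datetime(2010, 5, 5, 2, 0), "root_cause": "Change: Code Issue", "ttr_minutes": 50},
]
summary = summarize(sample)
print(summary)
```

The same pattern extends to grouping TTD/TTR by change size or frequency, provided those fields are captured when the change is logged.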
Side effect is
             that you’re
             also tracking
             successful
             changes to
             production as well

             http://www.flickr.com/photos/wwworks/2313927146
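Q2 2010

Type            Successes  Failures  Success Rate  Incident Minutes (Sev1/2)
App code        420        5         98.81         8
Config          404        3         99.26         5
DB Schema       15         1         93.33         10
DNS             45         0         100           0
Network (misc)  5          0         100           0
Network (core)  1          0         100           0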
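
A short sketch of how the Success Rate column in the Q2 2010 table can be computed. The deck does not state its formula; the published figures (98.81, 99.26, 93.33) happen to match 100 * (1 - failures / successes), which is what the code below uses. That reading is an inference from the numbers, and the deck itself warns that its numbers are "sort of made up". The more conventional successes / (successes + failures) would give 98.82 and 93.75 for the first and third rows. Function and variable names are hypothetical.

```python
# (type, successes, failures) as listed in the Q2 2010 table.
Q2_2010 = [
    ("App code", 420, 5),
    ("Config", 404, 3),
    ("DB Schema", 15, 1),
    ("DNS", 45, 0),
    ("Network (misc)", 5, 0),
    ("Network (core)", 1, 0),
]


def success_rate(successes, failures):
    """Percentage, using the formula that reproduces the table's figures."""
    return round(100.0 * (1 - failures / successes), 2)


def worst_type(rows):
    """Change type with the lowest success rate."""
    return min(rows, key=lambda row: success_rate(row[1], row[2]))[0]


for name, ok, failed in Q2_2010:
    print(f"{name:15s} {success_rate(ok, failed):6.2f}")
print("worst:", worst_type(Q2_2010))
```

With so few DB schema and network changes in the quarter, a single failure moves the rate a lot, so the counts are worth reading alongside the percentages.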
Some Observations
Incident Observations


Morale




    Length of Incident/Outage
Incident Observations


Mistakes




      Length of Incident/Outage
Change Observations
          Huge changesets (high TTR)
          deployed rarely


Change
 Size
                         Tiny changesets
                         deployed often (low TTR)


         Change Frequency
Specifically....

   la de da,
everything’s fine
                   }   What if this was only 5
                       lines of code that were changed?
                       Does that feel safer?
                       (it should)

            change
            happens
Pay attention to this stuff
                http://www.flickr.com/photos/plasticbag/2461247090/
We’re Hiring Ops!
SF & NYC
In May:

-   $22.9M of goods were sold by the community
-   1,895,943 new items listed
-   239,340 members joined
The End
Bonus Time!!1!
Continuous
   Deployment

     Described in 6 graphs
(Originally Cal Henderson’s idea)
Published in: Business, Technology
  • This is about metrics about YOU! Metrics *about* the metrics-makers!
  • They are basically taken from both Flickr and Etsy.
  • HOW MANY: write public-facing app code? maintain the release tools? release process? respond to incidents? have had an outage or notable degradation this month? that was change-related?
  • Too fast? Too loose? Too many issues? Too many upset and stressed out humans?
  • Everyone is used to bug tracking, it’s something worthwhile....



  • If this is a feeling you have often, please read on.

  • All you need is to see this happen once, and it’s hard to get out of your memory.
    No wonder why some people can start to think “code deploy = outage”.
  • Mild version of “Critical Incident Stress Management”?
    Change = risk, and sometimes risk = outage. And outages are stressful.

  • Not supposed to feel like this.

  • Details about the change play a huge role in your ability to respond to change-related incidents.

  • We do this by tracking our responses to outages and incidents.
  • We can do this by tracking our change, and learning from the results.

  • We need to raise confidence that we’re moving as fast as we can while still being safe enough to do so. And we can adjust the change to meet our requirements...
  • Why should change and results of changes be any different?
  • Type = code, schema, infrastructure, etc.
    Frequency/Size = how often each type is changed, implies risk
    Results = how often each change results in an incident/degradation
  • Lots of different types here. Might be different for everyone.
    Not all types of change bring the same amount of risk.
  • This info should be considered mandatory. This should also be done for db schema changes, network changes, changes in any part of the stack, really.
  • The header of our metrics tools has these statistics, too.
  • The tricky part: getting all prod changes written down without too much hassle.
  • Here’s one type of change....
  • Here’s another type of change....
  • Here’s yet another type of change...
  • Size does turn out to be important. Size = lines of code, level of SPOF risk, etc.
  • This seems like something you should do. Also: “incidents” = outages or degradations.
  • Just an example. This looks like it’s going well! Getting better!
  • Maybe I can’t say that it’s getting better, actually....

  • Some folks have Techcrunch as their incident log keeper. You could just use a spreadsheet.
  • An example!
  • You *are* doing postmortems on incidents that happen, right? Doing them comes at a certain point in your evolution.



  • Without the statistics, even a rare but severe outage can give the impression that change == outage.
  • Just examples. It’s important to categorize these things so you can count the ones that matter for the user’s experience. #4 Loss of redundancy
  • Just examples. This normally comes from a postmortem meeting. A good pointer on Root Cause Analysis is Eric Ries’ material on Five Whys, and the wikipedia page for RCA.
  • http://www.flickr.com/photos/mattblaze/2695044170/
  • What happens in our response to a change-related incident is just as important as the occurrence of the incident.
  • This might also be known as a ‘diagnose’ point.
  • These events usually spawn other events.
  • This should be standard operating procedure at this point.
  • Some folks might notice a “Time To Diagnose” missing here.
    ALSO: it’s usually more complex than this, but this is the gist of it.


  • Do incidents increase with size of change? With frequency? With frequency/size of different types?


  • If you don’t track: Change, Incidents, and Responses, you’ll never have answers for these questions.


  • Reasonable questions.

  • *YOU* get to decide what is “small” and “frequent”.


  • THIS is what can help give you confidence. Or not.

  • The longer an outage lasts, the bigger of a bummer it is for all those who are working on fixing it.
  • The longer an outage lasts, the more mistakes people make. (and, as the night gets longer)
    Red herrings...
  • put two points on this graph
  • It should, because it is.
  • How we feel about change and how it can (or not) cause outages is important.
    Some of the nastiest relationships emerge between dev and ops because of these things.





  • “Normal” = lots of change done at regular intervals, change = big, time = long.
  • 2 weeks? 5000 lines?
  • Scary Monster of Change! Each incident-causing deploy has only one recourse: roll it all back. Even code that was ok and unrelated to the incident. Boo!
  • Silly Monster of Nothing to Be Afraid Of Because His Teeth Are Small.
  • Problem? Roll that little piece back. Or better yet, roll it forward!
  • This looks like an adorable monster. Like a Maurice Sendak monster.

  • Transcript of "Ops Meta-Metrics: The Currency You Pay For Change"

    1. 1. Ops Meta-Metrics The Currency You Use to Pay For Change John Allspaw VP Operations Etsy.com http://www.flickr.com/photos/wwarby/3296379139
    2. 2. Warning Graphs and numbers in this presentation are sort of made up
    3. 3. /usr/nagios/libexec/check_ops.pl
    4. 4. How R U Doing? http://www.flickr.com/photos/a4gpa/190120662/
    5. 5. We track bugs already... Example: https://issues.apache.org/jira/browse/TS
    6. 6. We should track these, too...
    7. 7. We should track these, too... Changes (Who/What/When/Type)
    8. 8. We should track these, too... Changes (Who/What/When/Type) Incidents (Type/Severity)
    9. 9. We should track these, too... Changes (Who/What/When/Type) Incidents (Type/Severity) Response to Incidents (TTR/TTD)
    10. 10. trepidation noun 1 a feeling of fear or agitation about something that may happen : the men set off in fear and trepidation. 2 archaic trembling motion. DERIVATIVES trepidatious adjective ORIGIN late 15th cent.: from Latin trepidatio(n-), from trepidare ‘be agitated, tremble,’ from trepidus ‘alarme
    11. 11. Change Required. Often feared. Why? http://www.flickr.com/photos/20408885@N03/3570184759/
    12. 12. This is why OMGWTF OUTAGES!!!1!! la de da, everything’s fine change happens
    13. 13. Change PTSD? http://www.flickr.com/photos/tzofia/270800047/
    14. 14. Brace For Impact?
    15. 15. Brace For Impact?
    16. 16. But wait.... (OMGWTF) la de da, everything’s fine change happens
    17. 17. But wait.... (OMGWTF) la de da, } everything’s fine How much change is this? change happens
    18. 18. But wait.... (OMGWTF) la de da, } everything’s fine How much change is this? What kind of change? change happens
    19. 19. But wait.... (OMGWTF) la de da, } everything’s fine How much change is this? What kind of change? How often does this happen? change happens
    20. 20. Need to raise confidence that change != outage
    21. 21. ...incidents can be handled well http://www.flickr.com/photos/axiepics/3181170364/
    22. 22. ...root causes can be fixed quick enough http://www.flickr.com/photos/ljv/213624799/
    23. 23. ...change can be safe enough http://www.flickr.com/photos/marksetchell/43252686/
    24. 24. But how? How do we have confidence in anything in our infrastructure? We measure it. And graph it. And alert on it.
    25. 25. Tracking Change 1. Type 2. Frequency/Size 3. Results of those changes
    26. 26. Types of Change Layers Examples App code PHP/Rails/etc or ‘front-end’ code Apache, MySQL, DB schema, Services code PHP/Ruby versions, etc. OS/Servers, Switches, Routers, Infrastructure Datacenters, etc. (you decide what these are for your architecture)
    27. 27. Code Deploys: Who/What/When WHEN WHO WHAT (guy who pushed the button) (link to diff) (http://codeascraft.etsy.com/2010/05/20/quantum-of-deployment/)
    28. 28. Code Deploys: Who/What/When Last 2 prod deploys Last 2 Chef changes
    29. 29. other changes (insert whatever ticketing/tracking you have)
    30. 30. Frequency
    31. 31. Frequency
    32. 32. Frequency
    33. 33. Size
    34. 34. Tracking Incidents http://www.flickr.com/photos/47684393@N00/4543311558/
    35. 35. Incident Frequency
    36. 36. Incident Size Big Outage TTR still going
    37. 37. Tracking Incidents 1. Frequency 2. Severity 3. Root Cause 4. Time-To-Detect (TTD) 5. Time-To-Resolve (TTR)
    38. 38. The How Doesn’t Matter http://www.flickr.com/photos/matsuyuki/2328829160/
    39. 39. Incident/Degradation Tracking

            Date     Start Time  Detect Time  Resolve Time  Severity  Root Cause  PostMortem Done?
            1/2/08   12:30 ET    12:32 ET     12:45 ET      Sev1      DB Change   Yes
            3/7/08   18:32 ET    18:40 ET     18:47 ET      Sev2      Capacity    Yes
            5/3/08   17:55 ET    17:55 ET     18:14 ET      Sev3      Hardware    Yes

    40. 40. Incident/Degradation Tracking (columns: Date, Start Time, Detect Time, Resolve Time, Severity, Root Cause, PostMortem Done?) These will give you context for your rates of change. (You’ll need them for postmortems, anyway.)
    41. 41. Change:Incident Ratio
    42. 42. Change:Incident Ratio Important.
    43. 43. Change:Incident Ratio Important. Not because all changes are equal.
    44. 44. Change:Incident Ratio Important. Not because all changes are equal. Not because all incidents are equal, or change-related.
    45. 45. Change:Incident Ratio But because humans will irrationally make a permanent connection between the two. http://www.flickr.com/photos/michelepedrolli/449572596/
    46. 46. Severity
    47. 47. Severity Not all incidents are created equal.
    48. 48. Severity Not all incidents are created equal. Something like:
    49. 49. Severity Not all incidents are created equal. Something like:
    50. 50. Severity Not all incidents are created equal. Something like: SEV1 Full outage, or effectively unusable.
    51. 51. Severity Not all incidents are created equal. Something like: SEV1 Full outage, or effectively unusable. SEV2 Significant degradation for subset of users.
    52. 52. Severity Not all incidents are created equal. Something like: SEV1 Full outage, or effectively unusable. SEV2 Significant degradation for subset of users. SEV3 Minor impact on user experience.
    53. 53. Severity Not all incidents are created equal. Something like: SEV1 Full outage, or effectively unusable. SEV2 Significant degradation for subset of users. SEV3 Minor impact on user experience. SEV4 No impact, but time-sensitive failure.
    54. 54. Root Cause? (Not all incidents are change related) Something like: Note: this can be difficult to categorize. http://en.wikipedia.org/wiki/Root_cause_analysis
    55. 55. Root Cause? (Not all incidents are change related) Something like: 1. Hardware Failure 2. Datacenter Issue 3. Change: Code Issue 4. Change: Config Issue 5. Capacity/Traffic Issue 6. Other Note: this can be difficult to categorize. http://en.wikipedia.org/wiki/Root_cause_analysis
    56. 56. Recording Your Response (worth the hassle) http://www.flickr.com/photos/mattblaze/2695044170/
    57. 57. Time
    58. 58. la de da, everything’s fine Time
    59. 59. la de da, everything’s fine Time change happens
    60. 60. Noticed there was a problem la de da, everything’s fine Time change happens
    61. 61. Noticed there was a problem Figured out la de da, what the cause is everything’s fine Time change happens
    62. 62. Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time change happens
    63. 63. Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time change happens
    64. 64. • Coordinate troubleshooting/diagnosis Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time change happens
    65. 65. • Coordinate troubleshooting/diagnosis • Communicate to support/community/execs Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time change happens
    66. 66. Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time Time change happens
    67. 67. • Coordinate responses* Fixed the problem Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time Time change happens * usually, “One Thing At A Time” responses
    68. 68. • Coordinate responses* • Communicate to support/community/execs problem Fixed the Noticed there •rolled back was a problem •rolled forward •temporary solution •etc Figured out la de da, what the cause is everything’s fine Time Time change happens * usually, “One Thing At A Time” responses
    69. 69. Fixed the problem Figured out what the cause is Noticed there •rolled back was a problem •rolled forward •temporary solution •etc la de da, everything’s fine Time Time change happens
    70. 70. • Confirm stability, resolving steps Fixed the problem Figured out what the cause is Noticed there •rolled back was a problem •rolled forward •temporary solution •etc la de da, everything’s fine Time Time change happens
    71. 71. • Confirm stability, resolving steps • Communicate to support/community/execs Fixed the problem Figured out what the cause is Noticed there •rolled back was a problem •rolled forward •temporary solution •etc la de da, everything’s fine Time Time change happens
    72. 72. Communications http://etsystatus.com twitter.com/etsystatus
    73. 73. Fixed the problem Figured out what the cause is Noticed there •rolled back was a problem •rolled forward •temporary solution •etc la de da, everything’s fine Time Time change happens PostMortem
    74. 74. Time To Detect (TTD) Time To Resolve la de da, (TTR) la de da, everything’s fine everything’s fine Time change happens
    75. 75. Hypothetical Example: “We’re So Nimble!”
    76. 76. Nimble, But Stumbling?
    77. 77. Is There Any Pattern?
    78. 78. Nimble, But Stumbling? +
    79. 79. Nimble, But Stumbling? +
    80. 80. Maybe this is too Maybe you’re much suck? } changing too much at once? } Happening too often?
    81. 81. What percentage of incidents are related to change? http://www.flickr.com/photos/78364563@N00/2467989781/
    82. 82. What percentage of change- related incidents are “off-hours”? http://www.flickr.com/photos/jeffreyanthonyrafolpiano/3266123838
    83. 83. What percentage of change- related incidents are “off-hours”? Do they have higher or lower TTR? http://www.flickr.com/photos/jeffreyanthonyrafolpiano/3266123838
    84. 84. What types of change have the worst success rates? http://www.flickr.com/photos/lwr/2257949828/
    85. 85. What types of change have the worst success rates? Which ones have the best success rates? http://www.flickr.com/photos/lwr/2257949828/
    86. 86. Does your TTD/TTR increase depending on the: - SIZE? - FREQUENCY? http://www.flickr.com/photos/45409431@N00/2521827947/
    87. 87. Side effect is that you’re also tracking successful changes to production as well http://www.flickr.com/photos/wwworks/2313927146
    88. 88. Q2 2010

            Type            Successes  Failures  Success Rate  Incident Minutes (Sev1/2)
            App code        420        5         98.81         8
            Config          404        3         99.26         5
            DB Schema       15         1         93.33         10
            DNS             45         0         100           0
            Network (misc)  5          0         100           0
            Network (core)  1          0         100           0

    89. 89. Q2 2010 (same table as slide 88, with a “!” callout)
    90. 90. Some Observations
    91. 91. Incident Observations Morale Length of Incident/Outage
    92. 92. Incident Observations Mistakes Length of Incident/Outage
    93. 93. Change Observations Change Size Change Frequency
    94. 94. Change Observations Huge changesets deployed rarely Change Size Change Frequency
    95. 95. Change Observations Huge changesets (high TTR) deployed rarely Change Size Change Frequency
    96. 96. Change Observations Huge changesets (high TTR) deployed rarely Change Size Tiny changesets deployed often Change Frequency
    97. 97. Change Observations Huge changesets (high TTR) deployed rarely Change Size Tiny changesets deployed often (low TTR) Change Frequency
    98. 98. Specifically.... la de da, What if this was only 5 } everything’s fine lines of code that were changed? Does that feel safer? change happens (it should)
    99. 99. Pay attention to this stuff http://www.flickr.com/photos/plasticbag/2461247090/
    100. 100. We’re Hiring Ops! SF & NYC In May: - $22.9M of goods were sold by the community - 1,895,943 new items listed - 239,340 members joined
    101. 101. The End
    102. 102. Bonus Time!!1!
    103. 103. Continuous Deployment Described in 6 graphs (Originally Cal Henderson’s idea)