The Upside of Downtime (Velocity 2010)

  1. http://gapingvoid.com/ Sunday, June 20, 2010
  2. The Upside of Downtime: Turning disaster into opportunity
  3. Who’s had a site go down?
  4. Who hasn’t had a site go down?
  5. There’s always that one guy!
  15. Downtime sucks Source: http://www.motivatedphotos.com/?id=8080
  16. Why downtime sucks: Business [Chart: Sales, $0 to $3,000, plotted against 0 to 22]
  17. Why downtime sucks: Business; Brand
  18. Why downtime sucks: Business; Brand; You
  19. Why downtime sucks: Business; Brand; You; Users
  20. Downtime = Bad! (Duh)
  21. Approach #1: Don’t fail
  22. Source: http://kansansforlife.files.wordpress.com/2009/12/titanic.jpg
  23. “Everything fails all the time” -- Werner Vogels (Amazon, CTO)
  25. Your site will fail Werner Vogels (Amazon, CTO)
  26. Why?!?
  27. Why Failure Happens: Risk Homeostasis Source: http://joshuahind.files.wordpress.com/2009/09/bicycle-crash.jpg
  28. Why Failure Happens: Risk Homeostasis; Black Swan Source: Amazon.com
  29. Why Failure Happens: Risk Homeostasis; Black Swan; Unknown unknowns Source: http://www.apoliticus.com/wp-content/uploads/2009/01/6_21_080306_rumsfeld.jpg
  30. Why Failure Happens: Risk Homeostasis; Black Swan; Unknown unknowns; Change Source: http://bozark.net/wordpress/wp-content/uploads/2008/09/barack_obama_change_fairey.jpg
  31. Why Failure Happens: Risk Homeostasis; Black Swan; Unknown unknowns; Change; Many small failures Source: http://www.biojobblog.com/uploads/image/dominos.jpg
  32. Why Failure Happens: Risk Homeostasis; Black Swan; Unknown unknowns; Change; Many small failures; Humans Source: http://www.librarian.net/talks/clc/CLC.key/SJ_Shoulder_Shrug.jpg
  35. Polisher blocked. Not unusual Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  36. Polisher blocked; Moisture leaks into air system. Not unusual; Not expected Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  37. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Not good Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  38. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Backup disabled Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  39. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Backup disabled; Doh!; Indicator blocked Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  40. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Backup disabled; Doh!; Indicator blocked; Dammit; Relief valve broken Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  41. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Backup disabled; Doh!; Indicator blocked; Dammit; Relief valve broken; WTF; Gauge broken Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  42. Polisher blocked; Moisture leaks into air system; Flow of cold water stopped. Not unusual; Not expected; Backup disabled; Doh!; Indicator blocked; Dammit; Relief valve broken; Meltdown; Gauge broken Source: http://www.gladwell.com/1996/1996_01_22_a_blowup.htm
  44. Source: http://support.rightscale.com/09-Clouds/AWS/02-Amazon_EC2/Designing_Failover_Architectures_on_EC2/03-Advanced_Failover_Architecture
  45. “accidental power failure” Source: http://www.datacenterknowledge.com/archives/2010/06/16/power-failure-kos-intuit-sites-for-24-hours/
  46. “traffic accident damaged a nearby utility transformer” Source: http://www.datacenterknowledge.com/archives/2007/11/13/truck-crash-knocks-rackspace-offline/
  47. “unfortunate code change” Source: http://www.datacenterknowledge.com/archives/2010/06/11/errant-code-change-crashes-10-million-blogs/
  49. “Unhappy customers may get some attention, but unhappy networked customers can quickly impact your business” -- Clay Shirky Source: http://happenupon.files.wordpress.com/2009/02/technology-guru-clay-shir-001.jpg, http://scholarlykitchen.sspnet.org/2010/03/02/shirky-at-nfais-how-abundance-breaks-everything/
  56. http://labs.webmetrics.com/crowdsourceduptime
  61. Recap
  62. Your site will fail
  63. Your site will fail + Downtime is bad
  64. Your site will fail + Downtime is bad + Everyone will find out
  65. Your site will fail + Downtime is bad + Everyone will find out = Screw it, I’ll become a lumberjack Source: http://sbadrinath.files.wordpress.com/2009/03/different26rqcu3.jpg
  66. “Embrace fear of outages and degradation. Use it to guide your architecture, your code, your infrastructure. So lean into it.” -- John Allspaw, VP Tech. Ops at Etsy
  67. Approach #2: Prepare for downtime
  68. Disclaimer: Try hard to avoid downtime
  69. Learning by example...
  70. Case Study #1: Facebook
  77. “The larger issue here isn't just that a portion of Facebook's platform has gone down - numerous web services have issues from time to time, including everything from Gmail to Twitter. An outage of this length, however, with no official communication from the company itself is disturbing.” -- N.Y. Times
  78. Facebook Downtime: Disturbing
  80. Case Study #2: Google App Engine
  95. Google App Engine Downtime: Kudos
  96. Case Study #3: Atlassian
  108. Atlassian Downtime: Bravo
  109. http://atlassian.com/
  110. Downtime: Opportunity to Build Trust
  111. Downtime: Opportunity to Destroy Trust
  112. How To: Prepare for Downtime
  113. Something > Nothing
  114. Upside of Downtime Framework 1.0: Life is good; Oh crap; That sucked (axis: Time)
  115. Upside of Downtime Framework 1.0: Prepare; Communicate; Explain (axis: Time)
  119. Prepare Communicate Explain
  120. Prepare Communicate Explain: 1. Communication channel
  121. Prepare Communicate Explain: 1. Communication channel. Something is wrong; Can’t tell if it’s me or you; I’ll assume it’s you; You suck
  122. Prepare Communicate Explain: 1. Communication channel. Something is wrong; Can’t tell if it’s me or you; I’ll assume it’s you; You suck. Tell me when you’re back; You suck a lot less; I know it’s you
  131. Prepare Communicate Explain: 1. Communication channel: Easy to find
  132. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site
  133. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site; Real-time / automated
  134. 7 keys for public health dashboards
       1. Must show current status for each “service”
       2. Data must be accurate and timely
       3. Must be easy to find
       4. Must provide details for events in real time
       5. Provide historical uptime and performance data
       6. Provide a way to be notified of status changes
       7. Provide details on how the data is gathered
       Source: http://www.transparentuptime.com/2008/11/rules-for-successful-public-health.html
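A minimal sketch of the data behind such a dashboard, covering only key 1 (current status per service) and key 5 (historical uptime). The `Event` and `Service` record shapes, the field names, and the sample outage are assumptions made for illustration; they are not from the talk.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Event:
    """One incident on one service (hypothetical record shape)."""
    started: datetime
    ended: Optional[datetime]  # None while the incident is still open
    details: str


@dataclass
class Service:
    name: str
    events: List[Event] = field(default_factory=list)

    def current_status(self, now: datetime) -> str:
        # Key 1: current status for each service.
        for e in self.events:
            if e.started <= now and (e.ended is None or e.ended > now):
                return "down"
        return "up"

    def uptime_percent(self, since: datetime, now: datetime) -> float:
        # Key 5: historical uptime over the window [since, now].
        # Assumes events do not overlap each other.
        window = (now - since).total_seconds()
        down = 0.0
        for e in self.events:
            start = max(e.started, since)
            end = min(e.ended if e.ended is not None else now, now)
            if end > start:
                down += (end - start).total_seconds()
        return 100.0 * (window - down) / window


# Hypothetical sample data: one closed one-hour outage.
now = datetime(2010, 6, 20, 12, 0)
api = Service("api", [Event(now - timedelta(hours=10),
                            now - timedelta(hours=9),
                            "database failover")])
print(api.name, api.current_status(now),
      api.uptime_percent(now - timedelta(hours=100), now))
```

Keeping each event's details on the record is what would let a page show them in real time (key 4); notifications (key 6) are left out of this sketch.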
  135. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site; Real-time / automated. 2. Process
  136. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site; Real-time / automated. 2. Process: Authority
  137. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site; Real-time / automated. 2. Process: Authority; Mean-Time-To-Communicate (MTTC)
  138. Prepare Communicate Explain: 1. Communication channel: Easy to find; Hosted off-site; Real-time / automated. 2. Process: Authority; Mean-Time-To-Communicate (MTTC); On-call/drills/escalations/etc.
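The slides name Mean-Time-To-Communicate (MTTC) without defining it. One plausible reading, assumed here, is the average delay between the start of an incident and the first public update about it. The function name and the sample timestamps below are hypothetical.

```python
from datetime import datetime, timedelta
from typing import List, Tuple


def mean_time_to_communicate(
    incidents: List[Tuple[datetime, datetime]]
) -> timedelta:
    """Average delay between incident start and first public update.

    Each tuple is (started, first_public_update). This definition is an
    assumption; the talk only names the metric.
    """
    if not incidents:
        raise ValueError("need at least one incident")
    delays = [first_update - started for started, first_update in incidents]
    return sum(delays, timedelta()) / len(delays)


# Hypothetical incident log: 10 minutes and 30 minutes to first update.
incidents = [
    (datetime(2010, 6, 1, 9, 0), datetime(2010, 6, 1, 9, 10)),
    (datetime(2010, 6, 8, 14, 0), datetime(2010, 6, 8, 14, 30)),
]
mttc = mean_time_to_communicate(incidents)
print(mttc)
```

Tracking this number alongside time-to-repair makes the communication step something a team can drill and improve, rather than an afterthought.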
  139. Your servers
  140. Prepare Communicate Explain: 1. Communicate
  141. Prepare Communicate Explain: 1. Communicate: Use communication channel
  142. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC
  143. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC; Who/what is affected
  144. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC; Who/what is affected; When the incident started
  145. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC; Who/what is affected; When the incident started; ETA
  146. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC; Who/what is affected; When the incident started; ETA; Update regularly
  147. Prepare Communicate Explain: 1. Communicate: Use communication channel; MTTC; Who/what is affected; When the incident started; ETA; Update regularly. 2. Fix it!
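The checklist above (who/what is affected, when the incident started, ETA, regular updates) could be turned into a fill-in template so that nobody has to compose a message from scratch mid-incident. This is only a sketch: the `IncidentUpdate` class, its field names, the message wording, and the sample values are all assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class IncidentUpdate:
    """Fields taken from the Communicate checklist (names are hypothetical)."""
    affected: str              # who/what is affected
    started: datetime          # when the incident started
    eta: str                   # free text, since an ETA is often unknown
    next_update_in: timedelta  # commitment to update regularly

    def render(self, now: datetime) -> str:
        next_update = now + self.next_update_in
        return (
            f"[{now:%Y-%m-%d %H:%M} UTC] Affected: {self.affected}. "
            f"Started: {self.started:%Y-%m-%d %H:%M} UTC. "
            f"ETA: {self.eta}. "
            f"Next update by {next_update:%H:%M} UTC."
        )


# Hypothetical incident, posted 15 minutes after it began.
update = IncidentUpdate(
    affected="Checkout API",
    started=datetime(2010, 6, 20, 11, 45),
    eta="unknown, still investigating",
    next_update_in=timedelta(minutes=30),
)
message = update.render(datetime(2010, 6, 20, 12, 0))
print(message)
```

Stating the time of the next update in every message is one way to honor "Update regularly" even when there is nothing new to report.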
  148. Phew, close one!
  149. Prepare Communicate Explain: 1. Postmortem
  150. Prepare Communicate Explain: 1. Postmortem: Admit failure Source: http://en.blog.wordpress.com/2010/02/19/wp-com-downtime-summary/
  151. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human Source: http://www.bureauofcommunication.com/compose/apology
  152. Prepare Communicate Explain: “We apologize for any inconvenience this may have caused”
  153. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time Source: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf
  154. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time; Who/what was impacted Source: http://techcrunch.com/2009/11/02/large-scale-downtime-at-rackspace-cloud/
  155. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time; Who/what was impacted; What went wrong Source: http://www.zendesk.com/2010/03/tuesday-double-whammy.html
  156. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time; Who/what was impacted; What went wrong; Lessons learned Source: http://graysky.org/2010/02/downtime-postmortem/
  157. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time; Who/what was impacted; What went wrong; Lessons learned
  158. Prepare Communicate Explain: “I was completely overwhelmed by the amount of positive feedback and support I received.”
  159. Prepare Communicate Explain: 1. Postmortem: Admit failure; Sound like a human; Start time and end time; Who/what was impacted; What went wrong; Lessons learned. 2. Improve for the future
  160. Prepare Communicate Explain: “Google is not just saying sorry, they are actually implementing serious changes which probably represents millions of dollars of development to help make sure this doesn't happen again.” Source: http://news.ycombinator.com/item?id=1168493
  161. Prepare Communicate Explain Source: https://groups.google.com/group/google-appengine/browse_thread/thread/a7640a2743922dcf
  162. Prepare Communicate Explain: Be human
  163. Prepare Communicate Explain: Be authentic
  164. Prepare Communicate Explain: Be transparent
  165. Prepare Communicate Explain: Accept responsibility
  166. Prepare Communicate Explain: Learn and improve
  167. Prepare Communicate Explain: Trust
  168. Upside of Downtime Framework 1.0
       Prepare: 1. Communication channel (Easy to find; Off-site; Real-time) 2. Process (Give authority; M.T.T.C.; On-call/escalations)
       Communicate: 1. Communicate (Use channel; M.T.T.C.; Who/what affected; When started; ETA to resolution; Update regularly) 2. Fix it!
       Explain: 1. Post-mortem (Admit failure; Sound like a human; Start time and end time; Who/what was impacted; What went wrong; Lessons learned) 2. Learn and improve
  169. Upside of Downtime Framework 1.0: Be Prepared + Be Transparent + Be Human
  170. Upside of Downtime Framework 1.0: Be Prepared + Be Transparent + Be Human = Trust
  171. Disclaimer: Don’t screw up too often
  173. Downtime Prisoner’s Dilemma: Transparent / Not Transparent; Caught / Not Caught
  174. Downtime Prisoner’s Dilemma: Transparent / Not Transparent; Caught / Not Caught: Win
  175. Downtime Prisoner’s Dilemma: Transparent / Not Transparent; Caught: Big Loss; Not Caught: Win
  176. Downtime Prisoner’s Dilemma: Transparent / Not Transparent; Caught: Big Win, Big Loss; Not Caught: Win
  177. Downtime Prisoner’s Dilemma
                   Transparent   Not Transparent
       Caught      Big Win       Big Loss
       Not Caught  Win           Win
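Reading the grid as columns Transparent / Not Transparent and rows Caught / Not Caught, the argument can be checked with a little arithmetic. The numeric payoffs below (Big Win = 2, Win = 1, Big Loss = -2) and the probability of being caught are made-up values for illustration; the slide gives only the labels.

```python
# Hypothetical numeric payoffs; the slide only gives the labels.
PAYOFF = {"Big Win": 2.0, "Win": 1.0, "Big Loss": -2.0}

# Outcome labels as they appear in the completed grid.
MATRIX = {
    ("Transparent", "Caught"): "Big Win",
    ("Transparent", "Not Caught"): "Win",
    ("Not Transparent", "Caught"): "Big Loss",
    ("Not Transparent", "Not Caught"): "Win",
}


def expected_payoff(strategy: str, p_caught: float) -> float:
    """Expected payoff of a strategy given the chance users notice the outage."""
    caught = PAYOFF[MATRIX[(strategy, "Caught")]]
    not_caught = PAYOFF[MATRIX[(strategy, "Not Caught")]]
    return p_caught * caught + (1.0 - p_caught) * not_caught


for p in (0.0, 0.25, 0.5, 1.0):
    print(p,
          expected_payoff("Transparent", p),
          expected_payoff("Not Transparent", p))
```

With these assumed numbers, being transparent is never worse than hiding the problem, and it is strictly better whenever there is any chance of being caught. The same holds for any values where Big Win > Win > Big Loss.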
  179. Benefits: Gain trust; Reduce churn, increase loyalty; Reduce support costs; Ability to control the message; Competitive advantage; More time to focus on the actual problem; Reduce stress
  180. Change != Easy
  181. Change != Impossible
  182. Keys to Adoption: Getting past a culture of “hide the problem”
  183. Keys to Adoption: Getting past a culture of “hide the problem”; Overriding commitment to want to improve
  184. Keys to Adoption: Getting past a culture of “hide the problem”; Overriding commitment to want to improve; Available resources to improve
  185. Keys to Adoption: Getting past a culture of “hide the problem”; Overriding commitment to want to improve; Available resources to improve; Pain
  186. Keys to Adoption: Getting past a culture of “hide the problem”; Overriding commitment to want to improve; Available resources to improve; Pain; Buy-in
  187. Product Management; Support; Engineering/Operations; Sales/Marketing
  188. Product Management: Default: Let’s wait for complaints; Support; Engineering/Operations; Sales/Marketing
  189. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support; Engineering/Operations; Sales/Marketing
  190. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support: Default: Too much work; Engineering/Operations; Sales/Marketing
  191. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support: Default: Too much work / Reality: More upfront, less when it matters; Engineering/Operations; Sales/Marketing
  192. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support: Default: Too much work / Reality: More upfront, less when it matters; Engineering/Operations: Default: Don’t want to look bad; Sales/Marketing
  193. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support: Default: Too much work / Reality: More upfront, less when it matters; Engineering/Operations: Default: Don’t want to look bad / Reality: Opportunity to learn/improve; Sales/Marketing
  194. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness; Support: Default: Too much work / Reality: More upfront, less when it matters; Engineering/Operations: Default: Don’t want to look bad / Reality: Opportunity to learn/improve; Sales/Marketing: Default: I don’t want my customers to know
  195. Product Management: Default: Let’s wait for complaints / Reality: Proactiveness => Forgiveness
       Support: Default: Too much work / Reality: More upfront, less when it matters
       Engineering/Operations: Default: Don’t want to look bad / Reality: Opportunity to learn/improve
       Sales/Marketing: Default: I don’t want my customers to know / Reality: They’ll find out, better from us
  197. Source: http://delicious.com/lennysan/healthdashboard
  198. Simple as that!
  199. Your site will still fail!
  200. “The measure of a society is how well it transforms pain and suffering into something worthwhile.” -- Friedrich Nietzsche
  201. “The measure of a company is how well it transforms pain of downtime into something worthwhile.” -- Lenny Rachitsky Source: Original quote inspired by Friedrich Nietzsche
  202. Bare minimum: Register a Twitter account
  203. Thank You. Slides: http://bit.ly/upside-of-downtime Lenny Rachitsky, @lennysan, http://www.transparentuptime.com/ Webmetrics/Neustar, @webmetrics, http://www.webmetrics.com/
  204. Bonus
  210. Upside of Downtime Framework 1.0 "Unlikely that an accidental surface or subsurface oil spill would occur from the proposed activities" -- Exploration and environmental impact plan Source: http://en.wikipedia.org/wiki/Deepwater_Horizon_drilling_rig_explosion
  219. “Be not afraid of transparency; some are born transparent, some achieve transparency, and others have transparency thrust upon them.” -- Borrowed from William Shakespeare
  221. Making change: 1. Find the bright spots - (this presentation has a bunch)
  222. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework)
  223. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.)
  224. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.) 4. Find the feeling - (how would you feel?)
  225. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.) 4. Find the feeling - (how would you feel?) 5. Shrink the change - (start small)
  226. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.) 4. Find the feeling - (how would you feel?) 5. Shrink the change - (start small) 6. Grow your people - (everyone is learning as they go)
  227. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.) 4. Find the feeling - (how would you feel?) 5. Shrink the change - (start small) 6. Grow your people - (everyone is learning as they go) 7. Tweak the environment - (create a simple process)
  228. Making change: 1. Find the bright spots - (this presentation has a bunch) 2. Script the critical moves - (framework) 3. Point to the destination - (W.W.G.D.) 4. Find the feeling - (how would you feel?) 5. Shrink the change - (start small) 6. Grow your people - (everyone is learning as they go) 7. Tweak the environment - (create a simple process) 8. Build habits - (build process organically)
  229. Making change
       1. Find the bright spots - (this presentation has a bunch)
       2. Script the critical moves - (framework)
       3. Point to the destination - (W.W.G.D.)
       4. Find the feeling - (how would you feel?)
       5. Shrink the change - (start small)
       6. Grow your people - (everyone is learning as they go)
       7. Tweak the environment - (create a simple process)
       8. Build habits - (build process organically)
       9. Rally the herd - (get buy-in, rest will follow)