How To Run a 5 Whys (With Humans, Not Robots)



Slides from a talk at the Lean Startup conference (video link below).

Update: I've interleaved slides covering what I actually talked about onstage.

Update Update: video is up at

Published in: Technology
  • This is great! Especially like the point about having a sense of humor about it all. The true catastrophes in life don't really require people to 'escalate' b/c they are very obvious. I still like doing root cause analysis b/c it's like detective work, but point taken about eliminating shame. And very much like the idea of 'Planning a future where we are as stupid as we are today.'
  • I believe the 5 Why approach to be very powerful, and I also agree with you that you need to build a culture where people know you are not looking for blame; this is hard and not a five minute activity. I have seen this done over many years and seen great results by analysing how a defect escaped to the customer.

    One of the biggest issues with the 5 Whys is the 5 Whys. 'Why' engenders blame, and people by default want to defend their position; they often do this unconsciously, although not always. When I'm coaching people or teams, 'why' questions are rarely used for this very reason.

    Also to consider, not only do you need a culture that will not blame, you need a culture of people wanting to improve. This approach when done well will find deep rooted issues, which you need to be prepared to fix or there is little point in doing it.
  • Great presentation man! Laughed as I clicked through. Loved the failures part.

    And you're right, a little humor goes a long way!
  • Truly awesome deck ... shared with my whole team!
  • @willevans That particular point was me trying to write a simple bullet for something I explained at slightly greater length during the talk (that's the structure of these slides -- the visual slides I actually presented interleaved with bullets summarizing what I said over the slides).

    What I meant by the sentence you've picked out is that 'we, humanity' underestimate the power of situations to influence 'our, all of humanity's' behavior. Which is, I think, a fair summary of the FAE -- when someone else is in a situation and does something, we assume it's because of who they are (not the situation influencing them); when we're in the situation, we perceive the same behavior as a natural response to the situation. The psychology professor who taught me this idea emphasized the power of the situation more than just the asymmetry of how we explain things. And I've seen it referenced that way very commonly -- e.g. in this summary of the Stanford Prison Experiment, 'overestimat[ing] the importance of dispositional factors while underestimating situational factors.'

    So, e.g. the people listening to the talk will tend to assume that people who respond to outages with panic and a desire to blame are doing that because they are weak-willed or don't fully understand failure. But those listeners are falling for the FAE, because they don't understand that, when they're in that situation, they'll react the same way.

How To Run a 5 Whys (With Humans, Not Robots)

  1. How to run a 5 Whys (With Humans, Not Robots)
     Dan Milstein, @danmil
     Director of Product Development, Wingu
  2. What Is a 5 Whys Anyways?
     • Something you do when your company has badly screwed up
     • E.g. your CEO demos your cloud storage system to an early prospective customer, and, when he runs a search, it shows other customers’ data (I have done this, it was not awesome)
     • You get a bunch of people into a room and say: “How on earth did that happen? And how can we make sure it never, ever happens again?”
     • That’s a 5 Whys (aka, a Post-Mortem)
     • But, there’s a problem....
  3. Shameful Mistakes: Humans vs Robots
  4. Human Beings Will Eff It Up
     • Humans (unlike robots) feel this intense emotion called shame
     • Shame will suggest (strongly) “Slow Down, Stop Making So Many Mistakes”
     • Aka “Throw overboard everything the Lean Startup tells you is important”
     • Has potential to be incredibly damaging to your startup
     • And I have some bad news...
  5. You Will Totally Experience Shame (I Still Do) F.A.E.
  6. This Emotional Experience Cannot Be Avoided
     • I’ve run c. 50 post-mortems, have studied failure... and I still have this emotional reaction
     • You will, too. And so will your team.
     • Much more strongly than you realize right now
     • This is the “Fundamental Attribution Error” (FAE), from psychology
     • FAE = humans vastly underestimate the power of a situation on our behavior
  7. Big Idea: Adopt Economic, Not Moral Mindset $, FTW
  8. What Does That Mean?
     • Let me tell you a story...
  9. Parable: A Tale of Two Factories
  10. Two Factories
      • Both make widgets
      • Both are missing their monthly Widget Production goals by 10%
      • But for different reasons...
  11. Factory 1... Broken Machine
  12. When The Machine Breaks...
      • Belt slips off every once in a while
      • Ruins a bunch of widgets
      • Gotta replace it, drift a little behind plan
      • So... what questions do humans ask in this situation?
  13. Economic Mindset = Broken Machine
      • “How much is it costing us?”
      • “How much does it cost to repair?”
      • “Can we kludge a partial fix?”
      • “What are risks if we delay a fix?”
  14. Note the Key Words
      • “Cost”, “Partial”, “Risk”
      • These are things you hear a lot in an economic discussion
      • Okay, meanwhile in Factory 2, also missing by 10%, different reason...
  15. Factory 2... One Employee Is an Axe Murderer
  16. After Every Axe Murdering...
      • Have to, like, hire a new guy, train him on the machine, takes forever
      • Questions we asked before are now somehow deeply wrong:
      • “What if we just cut down on the rate, so there’s less axe murdering?”
      • “Hey, we can train a pool of temps on all the machines, when someone gets killed, we’ll just swap some new guy in, bang, problem solved!”
      • “How much is it really costing us, anyways?”
      • These ideas seem obscene, not merely bad
  17. Moral Mindset = Axe Murderer
      “Search for villains, elevation of accusers, and mobilization of authority to mete out punishment” (Pinker, The Blank Slate)
  18. Moral Mindset, Key Words
      • “Villains”, “Accusers”, “Authority”, “Punishment”
      • I believe that most companies, in investigating outages, act much more like they’re looking for an axe murderer than like they’re trying to fix a broken machine
  19. Your Challenge, As Person Running 5 Whys
      Get team out of moral mindset. Note: this is not, in fact, easy.
  20. Why It’s Hard
      • Mindsets control how we interpret the world...
      • ...including what people say to us
      • So, a team sitting there, fearing moral censure, hears you say “We’re not looking to blame anyone”, they just think you’re lying. How could you mean that, when the thing that happened was so terrible and wrong?
      • The deep trick (and this is the point of this whole presentation, frankly), is that you have to take advantage of the thing that separates humans and robots...
  21. Fundamental Tool: Make ‘Em Laugh
  22. Humor == Breaking Frames
      • That’s what humor actually is -- something that stretches or breaks the mental frame that people are using to interpret a situation
      • So, you use humor to break the frame, release people from the blame/fear/punishment of the moral mindset, and then refocus them on the economic challenges you’re facing
      • The humor is, IMHO, not a nice-to-have. It’s absolutely central. I’ve seen smart, caring leaders get this one wrong, and finish their post-mortems with a room full of tense, closed-up team members (and no good ideas on the table)
      • Rest of talk is specific examples of this, but this is the main point
  23. Tip 1: Always Share Worse “Bad Things”
  24. Place The Bad Thing on a Continuum
      • Moral mindset is very absolutist: this bad thing is The Worst Thing Ever
      • I like to say “Okay, well it’s pretty bad, let’s compare it to some things”
      • Did we irretrievably lose customer data? (I’ve done that, not awesome)
      • Did we almost get our customer fired by her boss? (also, not awesome)
      • Did we send hundreds of emails to everyone on our customer’s mailing list... but the emails were all question marks? For a customer who was in the proofreading business? (done that, very much not awesome)
      • People laugh, and then say “Okay, how bad was this, really?” Win.
  25. More Stories of Actual Failures (Just For Fun)
      • Did we break our allergies-to-medicines module, and risk having a doctor prescribe the wrong medication to someone?
      • Did our internet-connected home thermostat system have a server crash, causing all the thermostats to set the temp to the default... of 85 degrees?
      • Did our high-frequency trading program have flaws that led to our company losing 450 million dollars? (that is a tough one to beat, IMHO)
      • Collect your own! It’s fun!
  26. Tip 2: Mock Hindsight Bias To Its Face
      “Let’s plan for a future where we’re all as stupid as we are today.”
  27. How Hindsight Bias Shows up in a 5 Whys
      • Someone says “Oh, yeah, I screwed that one up, I knew I had to run the deploy in that one order, and I just forgot. I’m really sorry, I won’t make that mistake again, totally my bad.”
      • You have to utterly reject this. It’s pure hindsight bias (easy to see errors after the fact, very difficult in the moment).
      • I say “It’s like we’re saying ‘I was stupid, this one time, and we’ll fix that problem by never being stupid again.’”
      • Hence: “planning for a future where we’re as stupid as we are today”
      • aka “Must create a system which is resilient to occasional bouts of really intense stupidity”.
  28. Tip 3: Relish Absurdities of Your System
  29. 5 Whys Will Highlight That Your Code is a Mess
      • E.g. you’ve refactored, and rewritten in Python (or Node or something), and moved to the cloud, but this 5 Whys is making clear that your most important report is still run by a VisualCron job on a Windows server that never quite made it out of the office... and someone just tripped on the power cord
      • Team will feel ashamed, you have to give them license to relish absurdity
      • I often point out “There are two kinds of startups: the ones that achieve some modest traction on top of a pile of code of which they are vaguely ashamed... and the ones that go out of business. That’s it. No third kind.”
      • Also sometimes it helps to just laugh: “It’s kind of amazing this works at all”
  30. Tip 4: “Broadest Fixes” vs. “Root Causes”
  31. Handling a Fork in the Road
      • Example: a bad outage at Wingu was triggered by a mistake in db access code. But we couldn’t fix it for three hours, because our error reporting system was trying to send us hundreds of emails/minute, so our email provider throttled us, and we didn’t get those emails until hours later.
      • Which is the Root Cause? DB access bug or monitoring failure?
      • Answer: don’t care about “root causes”. They don’t exist (multiple things conspire for failures to happen). Also, kind of moral/blame-ish.
      • Ask instead: if we made an incremental improvement in area A or area B, which would prevent the broadest class of problems going ahead?
      • Much better conversation. Answer here is clear: monitoring.
  32. Remember, There Is No Axe Murderer (Probably)
  33. Photo Credits
      • “Robot de Martillo”, by Luis Perez
      • “Helios-Factory floor”
      • “old machine”, by Jun Aoyama
      • “Axe Marks The Spot”, by Alan Levine
      • “Failboat Has Arrived”, failboat2.jpg
      • “14 plugs but only 6 sockets”, by Jason Rogers, 2661016046/
      • “Life is like that… a fork in the road… decision required”, by Roger Price, photos/rwp-roger/6687024883/
  34. Thanks...
      Dan Milstein, @danmil
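
One way to picture the "broadest fix" from slide 31 (monitoring rather than the db bug): error reports that arrive in a burst can be collapsed into a single digest per time window, so the email provider never sees hundreds of messages a minute. The Python below is only a minimal sketch under that assumption. It is not what Wingu actually built, and the names `AlertDigest`, `report`, `flush`, and the `send` callback are all hypothetical, invented for illustration.

```python
import time
from collections import Counter
from typing import Callable, Optional


class AlertDigest:
    """Hypothetical helper: send at most one alert message per time window."""

    def __init__(
        self,
        send: Callable[[str], None],
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._send = send              # e.g. a function that emails the team
        self._window = window_seconds  # minimum gap between two messages
        self._clock = clock            # injectable so it can be faked in tests
        self._pending: Counter = Counter()
        self._last_sent: Optional[float] = None

    def report(self, error_key: str) -> None:
        """Count one error; send a digest only if the window has elapsed."""
        self._pending[error_key] += 1
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._window:
            self.flush(now)

    def flush(self, now: Optional[float] = None) -> None:
        """Send everything counted so far as one message, then reset."""
        if not self._pending:
            return
        lines = [f"{count}x {key}" for key, count in sorted(self._pending.items())]
        self._send("\n".join(lines))
        self._pending.clear()
        self._last_sent = self._clock() if now is None else now
```

The first error goes out immediately, and everything else that arrives inside the window is counted and sent as one summary line. In this sketch a digest only goes out when another error is reported after the window ends, so a real system would presumably also call `flush` from a timer.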