Too often root cause analysis of a development or support issue is skipped in our rush to recover. Often the actions taken address symptoms of the problem, but not the root cause. This presentation reviews two popular approaches for root cause analysis: 5 Whys and Fishbone.
Presented at Agile New England as an Agile 101 on 3 March 2023.
Root Cause Analysis
1. Root Cause Analysis
David Hanson
dphanson63@yahoo.com
https://www.linkedin.com/in/david-hanson/
https://www.slideshare.net/DavidHanson5
March 2023 David Hanson | ANE
2. Introduction
Too often root cause analysis of a development or support issue is skipped in our rush to recover.
Often the actions taken address symptoms of the problem, but not the root cause.
This presentation reviews two popular approaches for root cause analysis.
3. Two Most Popular Techniques
5 Whys
Use this method to find the deeper root cause
Keep asking why until you can blame management!
Fishbone
Use this method to find multiple root causes
Also known as Ishikawa; when diagrammed, resembles a set of fishbones
Can be used together too…
4. Why? Because!
1) Why did the issue occur? Because a process did not run.
2) Why did a process not run? Because schedule was not set during deployment.
3) Why was the schedule not set when deployed? Because worked alone, without checklist, without validation.
4) Why are we shortcutting our deployment process? Because releases happening ad hoc and on demand.
5) Why aren’t releases systematic and automated? Because management has been rushing us.
What are the corrective actions that you might propose?
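The 5 Whys chain from this slide can be written down as a small data structure, which makes it easy to pick the level a corrective action is aimed at. The following is a minimal sketch in Python; the names `Why`, `FIVE_WHYS`, and `cause_at` are illustrative assumptions and do not come from the presentation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Why:
    """One level of a 5 Whys chain: a question and its answer."""
    question: str
    because: str


# The five levels from the slide, shallowest first.
FIVE_WHYS: List[Why] = [
    Why("Why did the issue occur?",
        "Because a process did not run."),
    Why("Why did a process not run?",
        "Because schedule was not set during deployment."),
    Why("Why was the schedule not set when deployed?",
        "Because worked alone, without checklist, without validation."),
    Why("Why are we shortcutting our deployment process?",
        "Because releases happening ad hoc and on demand."),
    Why("Why aren't releases systematic and automated?",
        "Because management has been rushing us."),
]


def cause_at(chain: List[Why], level: Optional[int] = None) -> str:
    """Return the answer at a 1-based level; the deepest level by default."""
    if not chain:
        raise ValueError("chain must contain at least one why")
    if level is None:
        level = len(chain)
    if not 1 <= level <= len(chain):
        raise ValueError(f"level must be between 1 and {len(chain)}")
    return chain[level - 1].because


if __name__ == "__main__":
    for number, why in enumerate(FIVE_WHYS, start=1):
        print(f"{number}) {why.question} {why.because}")
    print("Deepest cause:", cause_at(FIVE_WHYS))
```

Calling `cause_at(FIVE_WHYS, 3)` returns the third-level answer, which is one way to frame the slide's closing question: a corrective action can target any level of the chain, not only the deepest one.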
6. Preventive or Corrective Action
Preventive Action
Definition: prevents occurrence
What are some examples of preventive action?
Corrective Action
Definition: prevents recurrence
What are some examples of corrective action?
7. A current practice for my current team…
Monitoring dashboards for successful data refresh every Monday morning: preventive action, corrective action, or something else?
8. Support and Maintenance
If the action taken results in manual or automated support or periodic maintenance, then the action arguably does not prevent occurrence or recurrence
Symptom: Acetaminophen
Cause: Amoxicillin
When possible, better to find and address the root cause rather than treating the symptom
9. Support Issue Triage
Not issue
Analysis of issue might conclude works as designed
Root cause analysis might lead to need for user or support team education
Defect
Analysis of issue might identify defect
Defects might be addressed now, soon, or later depending on workarounds and impact
If defects frequent, then process needs inspection more than code
Enhancement
Analysis of issue might identify missed, new, or changed requirement
Enhancements generally written as user stories for future sprints
Is sprint review with stakeholders generating enhancement requests?
Still worth root cause analysis with preventive or corrective actions
10. A Starting Point for Defect Classification
Classification | Support | Action
Critical | No workaround or control available | Add to top of sprint backlog and address ASAP by swarming
Major | Complex workaround or control | Add to sprint backlog and address in current sprint
Minor | Simple workaround or control | Add to product backlog and address in future sprint
Trivial | Workaround or control not required | Add to product backlog or incorporate into existing story to address when convenient
Severity of impact and frequency of occurrence are often considered in addition to whether workaround or control exists
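The classification table on this slide maps directly onto a lookup. The sketch below is one possible encoding in Python, assuming the workaround or control status is the only input; the names `Workaround`, `Classification`, `classify`, and `action_for` are hypothetical, and severity of impact and frequency of occurrence are deliberately left out.

```python
from enum import Enum


class Classification(Enum):
    CRITICAL = "Critical"
    MAJOR = "Major"
    MINOR = "Minor"
    TRIVIAL = "Trivial"


class Workaround(Enum):
    """Workaround or control status, following the table's Support column."""
    NONE_AVAILABLE = "No workaround or control available"
    COMPLEX = "Complex workaround or control"
    SIMPLE = "Simple workaround or control"
    NOT_REQUIRED = "Workaround or control not required"


# Support column -> Classification column.
_BY_WORKAROUND = {
    Workaround.NONE_AVAILABLE: Classification.CRITICAL,
    Workaround.COMPLEX: Classification.MAJOR,
    Workaround.SIMPLE: Classification.MINOR,
    Workaround.NOT_REQUIRED: Classification.TRIVIAL,
}

# Classification column -> Action column.
ACTIONS = {
    Classification.CRITICAL:
        "Add to top of sprint backlog and address ASAP by swarming",
    Classification.MAJOR:
        "Add to sprint backlog and address in current sprint",
    Classification.MINOR:
        "Add to product backlog and address in future sprint",
    Classification.TRIVIAL:
        "Add to product backlog or incorporate into existing story "
        "to address when convenient",
}


def classify(workaround: Workaround) -> Classification:
    """Classify a defect from its workaround status alone."""
    return _BY_WORKAROUND[workaround]


def action_for(workaround: Workaround) -> str:
    """Return the suggested backlog action for a workaround status."""
    return ACTIONS[classify(workaround)]


if __name__ == "__main__":
    for status in Workaround:
        print(f"{classify(status).value}: {action_for(status)}")
```

A team that also weighs severity and frequency would need extra inputs to `classify`; how those factors combine is not specified in the presentation, so they are not modeled here.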
11. My Approach for Support Issues
First Time: Understand how to recognize the symptom and implement the recovery
Second Time: Implement the recovery and understand the root cause or causes
Third Time: Implement the recovery and then take corrective action to address root cause
If impact modest and recovery simple might consider frequency of issue, instead of count
For high impact issues with complex recovery might want to take corrective action earlier
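The first, second, and third time escalation on this slide can be expressed as a short function. This is a sketch in Python under stated assumptions: the function name `support_response` and the `corrective_from` parameter are hypothetical, and lowering `corrective_from` is only one way to model "take corrective action earlier" for high impact issues.

```python
from typing import List


def support_response(occurrence: int, corrective_from: int = 3) -> List[str]:
    """Return the steps to take for the nth occurrence of a support issue.

    occurrence      -- 1 for the first time the issue is seen, 2 for the second, ...
    corrective_from -- occurrence at which corrective action starts (assumed
                       default of 3, matching the "Third Time" column)
    """
    if occurrence < 1:
        raise ValueError("occurrence must be 1 or greater")
    if corrective_from < 1:
        raise ValueError("corrective_from must be 1 or greater")

    steps = ["implement the recovery"]
    if occurrence == 1:
        # First time: learn to recognize the symptom before recovering.
        steps.insert(0, "understand how to recognize the symptom")
    if occurrence >= corrective_from:
        steps.append("take corrective action to address root cause")
    elif occurrence >= 2:
        steps.append("understand the root cause or causes")
    return steps


if __name__ == "__main__":
    for n in (1, 2, 3):
        print(n, support_response(n))
    # A high impact issue with complex recovery: act on the second occurrence.
    print(2, support_response(2, corrective_from=2))
```

The sketch counts occurrences; the slide's alternative of tracking frequency instead of count for modest issues would need a time window, which is not modeled.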
12. Other approaches for addressing root causes
1st Time
If leveraging 5 whys, address first level root cause
If leveraging fishbone, address most impactful* root cause
2nd Time
If leveraging 5 whys, address second level root cause
If leveraging fishbone, address second most impactful root cause
…
nth Time
If leveraging 5 whys, address nth level root cause
If leveraging fishbone, address nth most impactful root cause
*Pareto Principle: 20% effort yields 80% value
Pareto Principle (extrapolated): 40% effort yields 96% value
If keeps repeating, then keep addressing next root cause. Over time system should become very robust and frequency of related issue increasingly rare.
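The extrapolated Pareto figure on this slide follows from applying the 80/20 rule twice: the first fix captures 80% of the value, and the second captures 80% of the remaining 20%, so the value left uncaptured is 0.2 × 0.2 = 0.04 and the total is 96%. The Python sketch below works that arithmetic through for any number of rounds, assuming every round costs the same 20% effort; the function name `pareto_extrapolated` and its parameters are hypothetical.

```python
from typing import Tuple


def pareto_extrapolated(rounds: int,
                        effort_per_round: float = 0.20,
                        value_per_round: float = 0.80) -> Tuple[float, float]:
    """Return (cumulative effort, cumulative value) after a number of rounds.

    Each round is assumed to cost the same share of effort and to capture
    the same share of whatever value is still uncaptured.
    """
    if rounds < 0:
        raise ValueError("rounds must be 0 or greater")
    effort = rounds * effort_per_round
    remaining = (1.0 - value_per_round) ** rounds  # value still uncaptured
    return effort, 1.0 - remaining


if __name__ == "__main__":
    for rounds in (1, 2, 3):
        effort, value = pareto_extrapolated(rounds)
        print(f"{rounds} round(s): {effort:.0%} effort yields {value:.1%} value")
```

This is a simplification: cumulative effort is not capped at 100%, so the model only makes sense for the first few rounds.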
13. Why? Because.
1) Why? Because.
2) Why? Because.
3) Why? Because.
4) Why? Because.
5) Why? Because.
Who has a relatively simple issue that wants to try 5 whys?
14. Who has a relatively complex issue that wants to try fishbone?
[Fishbone diagram template: six branches, each labeled M or P and listing causes, converging on the symptom]
Agile Problems
1. Why is Scrum simple to understand but difficult to master?
2. Why is cycle time for release of valuable software routinely measured in months and
not weeks?
3. Why is the product owner routinely over utilized and the Scrum master routinely
under utilized?
4. Why do some team members have too much to do and other team members have
too little to do?
5. Why are so many user stories written as developer tasks with acceptance criteria
written as task lists?
These questions likely have multiple sources for root cause and lend themselves to fishbone technique.
David Hanson | ANE
16. Root cause analysis for a troublesome Agile problem?
[Fishbone diagram template: six branches, each labeled M or P and listing causes, converging on the Agile problem]
17. When and Where
When?
After recurring support issue
After escaped defect
After system outage
For unstable velocity
For long cycle times
After a failed sprint
When team morale low
When improvements stalled
When transformation stalled
Where?
Retrospectives
Post-mortems
Lessons Learned
Incident Reviews
18. Reflection
So, what was most useful for you here?
What was missing or what would you like to see next?
20. Related Concepts
What, So What, Now What
Has a similar aspect to 5 Whys, except it questions whether the problem is worth tackling, and if so, considers at least the first step in resolution.
Mind Mapping
Has a similar aspect to Fishbone, except this brainstorming approach is usually used to discover a range of interesting problems or creative solutions.
21. More Tools in the Lean Toolkit
Not to be overlooked 5 Whys and Fishbone covered in depth in this presentation
Lean Canvas
Single page project definitions useful to quickly assess and prioritize projects
A3 Problem Solving
Single 11x17 sheet useful for tracking problem from current state to future state to solution
Value Stream Mapping
Map steps from idea to use, noting value add, non-value but required, and non-value add
Voice of Customer
Ask who, what, why, when and where to gather the voice of your customer
Gemba Walk
Go-and-see, observing business processes and software development first-hand
SIPOC
Process flow technique following supplier > input > process > output > customer
5S
Organize work and workplace using sort, straighten, shine, standardize, sustain
Lean Waste
Checklist for identifying types of waste which can then be eliminated or minimized
• Defects (D)
• Overdone (O)
• Wait (W)
• Neglect (N)
• Transport (T)
• Inventory (I)
• Motion (M)
• Excess (E)
Fishbone also known as Ishikawa
Fishbone diagram modified
Original by Kathy Wu from Noun Project
Sometimes recommendation is to keep asking why until you can blame management.
Leveraging empathy, you might consider why management is doing what they are doing.
Why is management rushing us?
What happens when you fix on the first? Second? Third? Fourth?
Measurement | Policy
Machine | Program
Manpower | People
Environment | Place
Materials | Product
Method | Process
https://thenounproject.com/term/business-model-canvas/116286/
https://www.drcone.com/2017/12/02/3925/
Replace 5 Whys and Fishbone with visuals for Value Stream Mapping and Lean Waste