Since the dawn of software development, we've struggled with a huge disconnect between the management world and the engineering world. We try to explain our problems in terms of “technical debt”, but somehow the message seems to get lost in translation, and we drive our projects into the ground, over and over again.
What if we could detect the earliest indicators of a project going off the rails, and had data to convince management to take action? What if we could bridge this communication gap once and for all?
In this session, we'll focus on a key paradigm shift for how we can measure the human factors in software development, and translate the “friction” we experience into explicit risk models for project decision-making.
4. Artemis Starr
The most inspiring book…
Scientific Rigor in Organizational Learning
Published in 1990
Deserves Modernization
5. Artemis Starr
How do we do Science in Software Development?
The most inspiring book…
Published in 1990
Deserves Modernization
6. Artemis Starr
The Ultimate Metric
1. What Problem are we trying to solve?
2. What is the Strategy?
3. What is the Ultimate Metric?
How to Science with Paper & Pencil
Arty Starr
@janellekz
17. RESET
“A description of the goal is not a strategy.”
-- Richard P. Rumelt
What’s wrong with our current strategy?
18. Our “Strategy” for Success
High Quality Code
Low Technical Debt
Easy to Maintain
Good Code Coverage
19. RESET
“A good strategy is a specific and coherent response to—
and approach for overcoming—the obstacles to progress.”
-- Richard P. Rumelt
The problem is we don’t have a strategy...
24. Risk Management - Steering in response to emerging risks
Organization = Robot driving a car
25. Reality is more like an Agile Barge
We are failing to learn from our mistakes
26. What if we measured our PAIN?
Stress Wave
Happy Engineers
Run Away from Misery!
What if we made this visible?
“Thermodynamics of Emotion”
27. PAIN is a stress signal in our bodies that tells us we ought to move away from something
What is PAIN?
Problem: We Don’t Move!
28. Why don’t we move?
Arm
Controller
Hand
PAIN
Sensor
“PAIN Signal” is Lost in Translation
29. Artemis Starr
1. What Problem are we trying to solve?
2. What is the Strategy?
3. What is the Ultimate Metric?
Arty Starr
@janellekz
The Ultimate Metric
How to Science with Paper & Pencil
30. A story from the trenches…
How to Measure the PAIN in Software Development
Janelle Arty Starr
Learning what to Measure
leanpub.com/ideaflow
31. Great Team
Disciplined with Best Practices
Constantly Working on Improvements
Project FAILURE
About 10 Years Ago…
32. We had a pretty typical Scrum Process…
Planning Meeting
Retro
4 Week Sprints
Deploy
Why didn’t the Retros fix the problems?
33. “What’s the best opportunity for improvement?”
“The awful email template engine code!”
Our biggest problem
The Retrospective
34. “Fill in missing unit tests!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
35. “We should clean up the database code!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
36. “Let’s improve maintainability of our test framework!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
37. Just because a problem comes to mind doesn’t mean it’s an important problem to solve.
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
38. Our biggest problem
What do I feel the most intensely about?
Daniel Kahneman, Thinking, Fast and Slow
The Retrospective
“What’s the best opportunity for improvement?”
39. “The awful email template engine code!”
Recency Bias
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
40. Guilt Bias
“Fill in missing unit tests!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
41. Known Solution Bias
“We should clean up the database code!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
42. Sunk Cost Bias
“Let’s improve maintainability of our test framework!”
Our biggest problem
The Retrospective
“What’s the best opportunity for improvement?”
50. Data: We made significantly more mistakes in code that we didn’t write ourselves
Lower Familiarity = More Mistakes
There had to be more to the story...
53. The amount of Confusion was caused by…
Chart axes: Likeliness of Unexpected Behavior; Cost to Troubleshoot and Repair
Chart regions: High Frequency, Low Impact; Low Frequency, Low Impact; Low Frequency, High Impact
PAIN
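One way to read the two axes on this slide is as an expected-cost calculation: how often a kind of unexpected behavior shows up, multiplied by how long each occurrence takes to troubleshoot and repair. The sketch below is only an illustration of that reading. The `RiskCategory` class, the function name, and every number in it are hypothetical and do not come from the talk's data.

```python
from dataclasses import dataclass


@dataclass
class RiskCategory:
    name: str
    incidents_per_month: float  # likelihood of unexpected behavior (hypothetical)
    hours_per_incident: float   # cost to troubleshoot and repair (hypothetical)

    @property
    def pain_hours(self) -> float:
        # Expected monthly pain = frequency x impact
        return self.incidents_per_month * self.hours_per_incident


def rank_by_pain(categories):
    """Return categories sorted from most to least expected pain."""
    return sorted(categories, key=lambda c: c.pain_hours, reverse=True)


categories = [
    RiskCategory("high frequency, low impact", 20, 0.25),
    RiskCategory("low frequency, low impact", 2, 0.25),
    RiskCategory("low frequency, high impact", 2, 8),
]

for category in rank_by_pain(categories):
    print(f"{category.name}: {category.pain_hours:.1f} hours/month")
```

With these made-up numbers, the rare but expensive category dominates, which is the kind of result that intuition alone tends to miss.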
54. What causes Confusion?
What Causes Unexpected Behavior (likeliness)?
Semantic Mistakes
Stale Memory Mistakes
Association Mistakes
Bad Input Assumption
Tedious Change Mistakes
Copy-Edit Mistakes
Transposition Mistakes
Failed Refactor Mistakes
False Alarm
What Makes Troubleshooting Time-Consuming (impact)?
Non-Deterministic Behavior
Ambiguous Clues
Lots of Code Changes
Noisy Output
Cryptic Output
Long Execution Time
Environment Cleanup
Test Data Creation
Using Debugger
Most of the confusion was caused by human factors.
57. Confusion occurs during the process of understanding and changing the software
Complex Software
Not the Code.
Optimize “Idea Flow”
Confusion
58. Run Experiment
Bias: “What do I feel the most intensely about?”
Setup Experiment
Analyze Results & Decide Next Experiment
Execute
waiting - time goes slow
doing - time zooms by
59. Ugliness bothers us a LOT
Moderate Difficulty Is Enjoyable
Our PAIN Sensors are terribly miscalibrated!
Bias: “What do I feel the most intensely about?”
60. Ugliness bothers us a LOT
Moderate Difficulty Is Enjoyable
We can recalibrate our PAIN Sensors with data!
Bias: “What do I feel the most intensely about?”
61. My team spent tons of time working on improvements that didn’t make much difference.
We had tons of automation, but the automation didn’t catch our bugs.
62. My team spent tons of time working on improvements that didn’t make much difference.
We had well-modularized code, but it was still extremely time-consuming to troubleshoot defects.
63. “What are the Specific Examples of when we experience this pain?”
The hard part isn’t solving the problem, it’s identifying the right problem to solve
64. Strategy: Use Data to Optimize Flow
“Dream Team”
Code Infra.
Better Flow!
Measure Friction
YAY!
Commitment to focus on the highest-leverage 20%
66. Explained the problem of “Technical Debt”
Key Insight: Business Coaching
The Response: “That doesn’t sound so bad.”
WHAT?!
67. Loans are a Predictable Financial Tool
Revenue
- Cost
Profit + 10%
Increase Price?
Increase Sales?
Reduce Cost?
What makes investment decisions harder isn’t higher costs, it’s lower predictability.
Investment Strategy
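To make the predictability point concrete, here is a minimal sketch that compares two sets of task costs with the same average but very different spread. The numbers are hypothetical and chosen only so the averages match. Using the population standard deviation as the measure of "predictability" is an assumption of this example, not something the slide specifies.

```python
from statistics import mean, pstdev

# Hypothetical cost samples (in days) for the same kind of task on two codebases.
predictable = [9, 10, 11, 10, 10]
unpredictable = [2, 3, 30, 4, 11]


def summarize(costs):
    """Return (average cost, spread) for a list of observed task costs."""
    return mean(costs), pstdev(costs)


for label, costs in [("predictable", predictable), ("unpredictable", unpredictable)]:
    average, spread = summarize(costs)
    print(f"{label}: average {average} days, spread {spread:.1f} days")
```

Both lists average 10 days, so a cost-only view treats them as equal. The second list is much harder to plan around because any single task might take 2 days or 30.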
68. As Difficulty approaches “Cliff of the Impossible”…
Difficulty of Work Increases
Ultimate Constraint
Human Limitations
Once we lose our ability to understand cause and effect, we lose our capability to deliver software
69. “Technical Debt” is a misleading metaphor
We shouldn’t make decisions like we’re taking out a loan!
“Interest Payments” are not loan-like at all…
87. Just because we repaired the car…
Doesn’t mean we learned how to drive
88. How is this Different?
“Technical Debt”
Predictable increase in cost over time
Can add resources to compensate
Bias around problems in the code
“Escalating Risk”
Probabilistic model captures loss of predictability
Adding resources increases risk
Problems are in the interactions, and don’t have to be nouns
Chart axes: Likelihood of Unexpected Behavior; Cost to Troubleshoot and Repair
Chart regions: High Frequency, Low Impact; Low Frequency, Low Impact; Low Frequency, High Impact
PAIN
89. “Technical Debt”
Predictable increase in cost over time
Can add resources to compensate
Bias around problems in the code
This Metaphor is Misleading Management
Veto
“Escalating Risk”
Explain it this way instead
90. Team 1 Team 2 Team 3
Collaborative Support System
“Escalating Risk”
Shared Language
Strategy: Shared Language of Risk
The Fifth Discipline
Peter Senge
Creativity,
Love & Mastery
Powered
“Innovation Co.”
20% 20% 20%
91. Artemis Starr
1. What Problem are we trying to solve?
2. What is the Strategy?
3. What is the Ultimate Metric?
Arty Starr
@janellekz
The Ultimate Metric
How to Science with Paper & Pencil
92. Realization:
Lean Agile
When we mapped the metaphors…
Metaphorical Mapping of “Feature Factory” broke all the mathematics around Control Theory
Is there a way to fix the math?
93. Modeling Anatomy of Invisible Systems
Collaboration with
Michael Feathers
@mfeathers
A as a metaphor for B
System A System B
B as a metaphor for A
“Form”: Characterizes the similarities across systems in terms of explicit “Structure and Forces”
Book of Form
Systems with Similar Dynamics
94. Where do we see these similarities?
Systems with Similar Dynamics
A as a metaphor for B
System A System B
B as a metaphor for A
Sociotechnical Human Networks
Distributed Software Networks
Manufacturing Process Networks
“Rhyming Systems”
95. Control Theory mapped to Software World
How to Measure the PAIN in Software Development
Janelle Arty Starr
“Flow Control”
leanpub.com/ideaflow
Steering Capability
96. “Theory of Constraints”
Tool 1: 10 units per hour
Tool 2: 2 units per hour
Tool 3: 4 units per hour
“Rate of Flow” at the bottleneck determines the Rate of Flow of the whole system
Focus on Limiting Constraint
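The bottleneck rule on this slide can be sketched in a few lines. The example assumes the three tools run in series, and it pairs the three rates with the three tools in the order they appear on the slide. Both are assumptions of this sketch, and the function names are made up for illustration.

```python
def system_rate(stage_rates):
    """Throughput of a serial line is capped by its slowest stage."""
    return min(stage_rates.values())


def limiting_constraint(stage_rates):
    """Name of the stage with the lowest rate of flow."""
    return min(stage_rates, key=stage_rates.get)


# Units per hour for each tool (pairing of tools and rates is assumed).
rates = {"Tool 1": 10, "Tool 2": 2, "Tool 3": 4}

print(limiting_constraint(rates), system_rate(rates))

# Speeding up a stage that is not the bottleneck changes nothing.
faster_tool_1 = {**rates, "Tool 1": 20}
print(system_rate(faster_tool_1))

# Speeding up the bottleneck raises the rate of the whole system.
faster_tool_2 = {**rates, "Tool 2": 3}
print(system_rate(faster_tool_2))
```

Doubling the fastest stage leaves the system at 2 units per hour, while a small gain at the constraint lifts the whole line. That is why the slide says to focus on the limiting constraint.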
97. What is the Limiting Constraint in Software?
Difficulty of Work Increases
Limiting Constraint
Human Understanding
Once we lose our ability to understand cause and effect, we lose our capability to deliver software
99. Brain as a Stabilizing Feedback System
1. Flow of Inputs
2. Optimization Target
3. Chaos Signal
4. Observe & Adjust Loop
Decision-Making Engine
Control Theory mathematics require measuring deviations from an optimization target
100. What is Idea Flow?
The process of communicating an idea according to an Intention, and validating that the idea was understood
Make Sense?
Intention
Steering Loop
101. What is Idea Flow?
The process of communicating an idea according to an Intention, and validating that the idea was understood
Intention
Steering Loop
Software System
105. What is “Friction”?
Time to recover understanding when observable behavior doesn’t match expectations (TTR)
WTF?! YAY!
Friction measures the Frequency & Duration of the Confusion State
Confusion
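As a rough illustration of "Frequency & Duration of the Confusion State", the sketch below summarizes a list of confusion episodes for one task. Each episode is a hypothetical (start, end) pair in minutes from the start of the task, running from a WTF moment to the following YAY moment. The function name, the output fields, and the sample numbers are assumptions for this example and not a definition taken from the talk.

```python
def friction(confusion_episodes, task_minutes):
    """Summarize confusion episodes given as (start_minute, end_minute) pairs."""
    durations = [end - start for start, end in confusion_episodes]
    total = sum(durations)
    return {
        "frequency": len(durations),                    # how many WTFs
        "total_confusion_minutes": total,               # time spent confused
        "mean_ttr_minutes": total / len(durations) if durations else 0.0,
        "confusion_share": total / task_minutes,        # fraction of the task
    }


# Hypothetical two-hour task with three confusion episodes.
episodes = [(10, 15), (40, 70), (95, 100)]
summary = friction(episodes, task_minutes=120)
print(summary)
```

In this made-up task, three episodes add up to 40 minutes, so a third of the task was spent recovering understanding.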
106. Process Control in Lean Manufacturing
Control chart labels: Quality Target (Perfect Quality), Upper Control Limit, Lower Control Limit, Out of Control (OOC)
Lower Variability => Better Control
107. “Flow Control” in Software Development
Control chart labels: Optimal Friction, Upper Control Limit, Out of Control (OOC)
Timeline labels: Understanding In Sync, Confusion, Limit; time marks 20min, 0m, 50m, 0m
WTF = “Out of Control”
Lower Variability => Better Control
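Borrowing the control-chart idea, a team could flag any confusion episode whose time to recover exceeds an upper control limit. The sketch below is a hedged illustration. The 20-minute limit is a hypothetical threshold that a team would pick for itself, and the function name and sample data are made up.

```python
UPPER_CONTROL_LIMIT_MINUTES = 20  # hypothetical threshold, chosen by the team


def out_of_control(ttr_minutes, limit=UPPER_CONTROL_LIMIT_MINUTES):
    """Return the indexes of confusion episodes whose TTR exceeds the limit."""
    return [i for i, ttr in enumerate(ttr_minutes) if ttr > limit]


# Hypothetical time-to-recover values (minutes) for five confusion episodes.
ttrs = [5, 30, 5, 50, 12]
flagged = out_of_control(ttrs)
print(flagged)
```

Only the flagged episodes need a closer look, which keeps attention on the troubleshooting sessions that actually cost the most time.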
109. “Idea Flow Mapping”
Idea Idea
Explain to ‘Puter: To calculate factorial(n), multiply all the numbers together up to that number
Okay. Compiles.
Run a Test: What’s the factorial of 4?
6.
?!?!?! How did you get 6?
1*2*3 = 6
Fix It: I meant to include the 4…
Run a Test: What’s the factorial of 4?
24.
Yay!
Idea Flow
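The exchange on this slide maps onto a familiar off-by-one mistake. The slide does not show code, so the version below is one hypothetical way the conversation could look in Python. The function names are made up for illustration.

```python
def factorial_first_try(n):
    """Buggy: range(1, n) stops at n - 1, so n itself is never multiplied in."""
    result = 1
    for i in range(1, n):
        result *= i
    return result


def factorial(n):
    """Fixed: range(1, n + 1) includes n."""
    result = 1
    for i in range(1, n + 1):
        result *= i
    return result


print(factorial_first_try(4))  # 6, the "?!?!?!" moment: 1*2*3
print(factorial(4))            # 24, the "Yay!" moment: 1*2*3*4
```

The time between seeing the unexpected 6 and confirming the expected 24 is the confusion interval that Idea Flow Mapping records.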
111. “Idea Flow Mapping”
Idea Idea
Explain to ‘Puter
Run a Test
?!?!?!
Fix It
Run a Test
Yay!
Idea Flow
“Start Intention”
“WTF Event”
Confusion
Understanding In Sync
“YAY Event”
“Finish Intention”
113. Software
“Idea Sculpture”
“Idea Flow Network”
Shared
Intention
Flow is the capability of the system to evolve an “Idea Sculpture” toward a “Shared Intention”
“Friction”
% Capacity
“Flow Engine”
114. Human Understanding is the Ultimate Constraint
Difficulty of Work Increases
Ultimate Constraint
Human Understanding
WTF?!
Stress Curve
115. “WTFs” are the Ultimate Metric
“Friction” measures the Frequency & Duration of the Confusion State (TTR)
116. Artemis Starr
1. What Problem are we trying to solve?
2. What is the Strategy?
3. What is the Ultimate Metric?
Arty Starr
@janellekz
How to Science with Paper & Pencil
The Ultimate Metric
117. Why measure WTFs?
Microservices World
90% of our software is built from 3rd party
Skyrocketing Diagnostic Difficulty in the Integration Space
Code Metrics are almost completely out of alignment with our PAIN
120. 1. Talk through the Risk Factors before the Task
Checklist:
1. What’s the goal of this task?
2. What changes will be the most mistake-prone?
3. What changes will be most difficult to debug?
4. What can we do to mitigate those risks?
121. 2. Write down WTFs during the Task
WTF?! Jot down start time (take screenshot)
YAY! Jot down end time (take screenshot)
Keep notes on unexpected behavior & clues, as you diagnose the cause of confusion
Subtract
Confusion
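The paper-and-pencil step boils down to subtracting the jotted-down start time from the end time. Here is a minimal sketch of that arithmetic, assuming times are written as 24-hour "HH:MM" and that each episode starts and ends on the same day. The function name and the sample notes are hypothetical.

```python
from datetime import datetime


def confusion_minutes(start, end, fmt="%H:%M"):
    """Subtract the jotted-down WTF start time from the YAY end time."""
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return delta.total_seconds() / 60


# Hypothetical notes: (WTF start, YAY end, what was unexpected).
notes = [
    ("09:40", "09:52", "test passed locally, failed on build server"),
    ("11:05", "11:50", "stack trace pointed at the wrong module"),
]

total = sum(confusion_minutes(start, end) for start, end, _ in notes)
print(f"Total confusion: {total:.0f} minutes")
```

The free-text note travels with each interval, so the later review can ask what made the longer episodes take so long.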
122. 3. Refactor your Decision Habits after the Task
Checklist:
1. What were the biggest WTFs?
2. What made troubleshooting take so long?
3. What decisions would you make differently if you could do this task again?
4. What could we do to reduce risk on future tasks?
126. Artemis Starr
1. Sociotechnical Science
2. Community Learning Platform (Membership fees 100% reinvestment)
3. Building the Dream Together
Industry Collaborative Learning Network
Platform for the People
Arty Starr, Social Entrepreneur
@janellekz
Sign up: dreamscale.love
Thanks!