• What Are Metrics?
• A Good Agile Metric or Diagnostic…
• Metric / Diagnostic Evaluation Checklist
• Metrics and Examples
What Are Metrics?
• Quantitative measures of performance or production, used to indicate progress or achievement against strategic goals
• A measurable element of a service, process, or function. The real value of metrics is seen in their change over time. Reliance on a single metric is not advised, especially if it has the potential to affect User behaviour in an undesirable way
A Good Agile Metric or Diagnostic… (1 of 3)
Affirms and reinforces Agile principles
Supports the customer-intimate and value-focused traits that reinforce Agile principles. This requires that people
who understand Agile participate in metrics design. The truism “you get what you measure” reminds us that
counterproductive behaviors may ensue if you reinforce the wrong things (e.g., overtime, % utilization, paperwork).
Follows trends, not numbers
Measure “one level up” to ensure you measure aggregated information, not sub-optimized parts of a whole.
Aggregate above the individual team level for upper-management use. To promote process health, do not track at
levels more granular than “a team” and “an iteration”.
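As an illustration, here is a minimal sketch of “one level up” aggregation. The data shape, team names, and numbers are hypothetical:

```python
# Minimal sketch of "one level up" aggregation: roll per-team story points
# into a program-level trend for management use (data shape is assumed).
from collections import defaultdict

# Hypothetical records: (team, iteration, completed story points)
results = [
    ("team-a", 1, 21), ("team-b", 1, 13),
    ("team-a", 2, 24), ("team-b", 2, 18),
]

program_trend = defaultdict(int)
for team, iteration, points in results:
    program_trend[iteration] += points  # report the aggregate, not per-team splits

for iteration in sorted(program_trend):
    print(f"Iteration {iteration}: {program_trend[iteration]} points delivered")
```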
Belongs to a small set of metrics and diagnostics
A “just enough” metrics approach is recommended: too much information can obscure important trends.
A Good Agile Metric or Diagnostic… (2 of 3)
Measures outcome, not output
In an Agile environment where simplicity, or “maximizing the amount of work not done”, is promoted, the most
spectacular outcome might be achieved by reducing planned output while maximizing delivered value. Outcomes
are measured in terms of delivered Customer value.
Is easy to collect
For team-level diagnostics the ideal is “one-button” automation, where data is drawn from operational tools (e.g.,
the Product Backlog, acceptance-test tools, code analyzers). For management use, avoid rework (e.g., PowerPoint
decks) and manipulation of lower-level data; aggregation is preferable.
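For instance, a minimal “one-button” collection sketch. The CSV export layout (columns story_id, iteration, status, points) and file name are assumptions for illustration, not a specific tool's format:

```python
# Minimal "one button" collection sketch. The CSV layout is a hypothetical
# backlog export with columns: story_id, iteration, status, points.
import csv

def completed_points(path: str) -> dict[int, int]:
    """Sum the story points of Done stories, per iteration."""
    totals: dict[int, int] = {}
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            if row["status"] == "Done":
                iteration = int(row["iteration"])
                totals[iteration] = totals.get(iteration, 0) + int(row["points"])
    return totals

# Usage (hypothetical file): print(completed_points("backlog_export.csv"))
```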
Reveals, rather than conceals, its context and significant variables
Should be visibly accompanied by notes on significant influencing factors, to discourage false assumptions and misinterpretation.
A Good Agile Metric or Diagnostic… (3 of 3)
Provides fuel for meaningful conversation
Face-to-face conversation is a very useful tool for process improvement. A measurement isolated from its context
loses its meaning. Note: it's a good sign when people talk about what they've learned by using a metric or diagnostic.
Provides feedback on a frequent and regular basis
To amplify learning and accelerate process improvement, metrics should preferably be available at each iteration
retrospective, and at key periodic management meetings.
May measure Value (Product) or Process
Depending on where problems lie, diagnostics may measure anything suspected of inhibiting effectiveness.
Consider the appropriate audience for each metric, and document its context/assumptions to encourage proper
use of its content. And remember: you get what you measure!
Encourages “good-enough” quality
The definition of what's “good enough” in a given context must come from that context's Business Customer or
their proxy, not the developers.
Metric / Diagnostic Evaluation Checklist
Name: Should be well chosen to avoid ambiguity, confusion, or oversimplification.
Question: It should answer a specific, clear question for a particular role or group. If there are multiple questions, design other metrics.
Basis of Measurement: Clearly state what is being measured, including units. Labeling of graph axes must be clear rather than brief.
Assumptions: These should be identified to ensure clear understanding of the data represented.
Level and Usage: Indicate intended usages at various levels of the organization. Indicate limits on usage, if any.
Expected Trend: The designers of the metric should have some idea of what they expect to see happen. Once the metric is proven, document common trends.
When to Use It: What prompted creation or use of this metric? How has it historically been used?
When to Stop Using It: When will it outlive its usefulness, become misleading, or become extra baggage? Design this in from the start.
How to Game It: Think through the natural ways people will warp behavior or information to yield more ‘favorable’ outcomes.
Warnings: Recommend balancing metrics, limits on use, and dangers of improper use.
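One way to keep metric definitions consistent is to capture the checklist as a template. Below is a sketch using Python purely as illustration; the class and field names are not a standard API, and the example instance anticipates the Velocity sample on the next slide:

```python
# Sketch: the evaluation checklist as a reusable template. Field names
# follow the checklist above; the class itself is illustrative only.
from dataclasses import dataclass

@dataclass
class MetricDefinition:
    name: str                   # unambiguous, avoids oversimplification
    question: str               # the one question it answers, and for whom
    basis_of_measurement: str   # what is measured, including units
    assumptions: str            # influencing factors behind the data
    level_and_usage: str        # intended audience and limits on use
    expected_trend: str         # what designers expect to see over time
    when_to_use_it: str         # what prompted its creation or use
    when_to_stop_using_it: str  # designed-in expiry conditions
    how_to_game_it: str         # likely behavior-warping responses
    warnings: str               # balancing metrics and dangers of misuse

velocity = MetricDefinition(
    name="Velocity",
    question="How much software can my team deliver per iteration?",
    basis_of_measurement="Story points or ideal engineering hours",
    assumptions="The team is delivering working software every iteration",
    level_and_usage="Project level; forecasting from prior efforts",
    expected_trend="Rises for a stable team, then plateaus",
    when_to_use_it="Once work has started on the project",
    when_to_stop_using_it="When velocity has become stable and 'known'",
    how_to_game_it="Each team estimates differently; not comparable",
    warnings="Velocity is not value; never compare teams or individuals",
)
```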
Velocity – A Sample
Question: How much software can my team deliver per iteration?
Measurement: Story points or “ideal engineering hours”
Assumptions: The team is delivering working software every iteration
Level and Usage: Velocity is most useful at the project level. It allows the team to forecast how much work they can expect to complete, based on prior efforts.
Expected Trend: Velocity can be affected by many things: changing team members, obstacles, toolsets, the difficulty of a feature or the amount of learning required, etc. will all lower the velocity of the team. Barring unexpected obstacles, a stable team on the same project with the required resources will generally gain in velocity during the course of the project, then plateau.
When to Use It: Velocity is a very useful metric for the team, and should be used during the course of the project once work has started.
When to Stop Using It: In a longer project where the team, resources, and technology are all stable, velocity will also become stable. The team may suspend collecting velocity since it is “known.”
How to Game It: Velocity is only meaningful to the exact team providing the data; each team will estimate their work differently from other teams.
Warnings: Velocity is not the same as value. A team with excellent velocity could spend months quickly and effectively delivering software that has no investment potential. Comparing the velocity of different teams is problematic (see above) and should be avoided: this diagnostic is a barometer for the team itself, as a unit. Per-member velocities are also problematic: velocity should be measured at the team level, since the team is the unit that must self-organize to produce value.
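A minimal sketch of the arithmetic, with hypothetical iteration data; the rolling-average window is an illustrative choice, not a prescribed forecasting method:

```python
# Sketch: velocity as completed story points per iteration, with a simple
# forecast from the rolling average of recent iterations (data is assumed).
completed_points = [18, 22, 25, 27, 26, 28]  # points delivered, iterations 1-6

def forecast(history: list[int], window: int = 3) -> float:
    """Forecast next iteration's capacity from the last `window` iterations."""
    recent = history[-window:]
    return sum(recent) / len(recent)

print(f"Velocity last iteration: {completed_points[-1]} points")
print(f"Forecast for next iteration: {forecast(completed_points):.1f} points")
```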
QA Metrics – A Sample
Code Coverage
• Total Coverage: 90.3% (previous: 88.9%)
• Code Coverage: 88.5% (previous: 86.6%)
Bugs
• Iteration 17: 15 fixed, 5 carried over
• To Date: 207 fixed, 5 carried over
Summary – Selenium Tests
• Tests: 52 (677 assertions), Failures: 1, Errors: 0, Success: 98%, Time: 39 min
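The success figure is simple arithmetic over the test counts; a sketch:

```python
# Sketch of the success-rate arithmetic behind the Selenium summary above.
tests, failures, errors = 52, 1, 0
success_rate = (tests - failures - errors) / tests * 100
print(f"Success: {success_rate:.0f}%")  # 51 of 52 passing -> 98%
```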
• Frequency of builds
• Average duration of builds
• Number of builds per iteration
• Number of broken builds per iteration
• Average duration of a broken build
• Unit tests per story
• Functional tests per story
• Defects per story
• Defects carried over per iteration
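A sketch of how a few of the build diagnostics above could be computed, assuming build records of the form (iteration, duration in minutes, passed):

```python
# Minimal sketch computing build diagnostics from assumed build records.
builds = [
    (17, 12.0, True), (17, 14.5, False), (17, 11.0, True),
    (18, 10.5, True), (18, 13.0, True),
]

per_iteration = {}
for iteration, duration, passed in builds:
    stats = per_iteration.setdefault(iteration, {"count": 0, "broken": 0, "total_min": 0.0})
    stats["count"] += 1
    stats["total_min"] += duration
    if not passed:
        stats["broken"] += 1

for it, s in sorted(per_iteration.items()):
    avg = s["total_min"] / s["count"]
    print(f"Iteration {it}: {s['count']} builds, {s['broken']} broken, avg {avg:.1f} min")
```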
Development:
• Cyclomatic complexity measures
• Distribution of method and class lengths
• Rate of change of source (LOC in/out)
• Proportion of source code that is test code
Scope:
• Scope change (stories removed from or added to scope due to redundancy or rewrite)
• Scope changes not caused by additional stories, per iteration
• User Stories carried forward (hangover) per iteration
• Number of stories held in Analysis, Development, or Testing, per iteration
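For example, a minimal sketch of one development metric above, the proportion of source code that is test code; the tests/ directory and test_ filename conventions are assumptions about project layout:

```python
# Sketch: proportion of source that is test code, assuming tests live under
# a tests/ directory or in files named test_*.py (layout assumptions).
from pathlib import Path

def line_count(files) -> int:
    return sum(len(p.read_text(errors="ignore").splitlines()) for p in files)

root = Path(".")
all_py = list(root.rglob("*.py"))
test_py = [p for p in all_py if "tests" in p.parts or p.name.startswith("test_")]

total, tests = line_count(all_py), line_count(test_py)
if total:
    print(f"Test code: {tests}/{total} lines ({tests / total:.0%})")
```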
• One can get completely lost in too many metrics
• A “just enough” metrics approach is recommended: too much information can obscure important trends
• “If you can't measure it, you can't manage it.” – Peter Drucker. Just make it simple and practical: measure those metrics you choose, and act on them.
References
• Hartmann, Deborah, and Robin Dymond, “Appropriate Agile Measurement: Using Metrics and Diagnostics to Deliver Business Value,” Agile 2006 Conference.
• Cohn, Mike, Agile Estimating and Planning, Prentice Hall, 2006.