Key Measurements For Testers
http://www.qaprogrammer.com
© 2003 qaprogrammer.com. All Rights Reserved.
QAProgrammer
Delivering Software Project Success
Precision vs. Accuracy
• Accuracy
  - Saying PI = 3 is accurate, but not precise
  - "I'm 2 meters tall" is accurate, but not precise
• Precision
  - Saying PI = 4.378383 is precise, but not accurate
  - Airline flight times are precise to the minute, but not accurate
• The number of significant digits is the key
Precision vs. Accuracy
• People make assumptions about accuracy based on precision
  - "365 days" is not the same as "1 year" or "4 quarters" or even "52 weeks"
  - "10,000 staff hours" is not the same as "5 staff years"
• Unwarranted precision is the enemy of accuracy (e.g., 395.7 days +/- 6 months)
Introduction
Good Goals
• A goal should be SMART
  - Specific
  - Measurable/Testable
  - Attainable
  - Relevant
  - Time-bound
• Can use a Purpose, Issue, Object format
Introduction
GQM Hierarchy
[Diagram: each Goal decomposes into Questions, and each Question is answered by one or more Measures; a measure can serve more than one question.]
Introduction
GQM Example
Goal: Improve by 10% (purpose) the timeliness (issue) of change request processing (object: process) from the project manager's viewpoint (viewpoint)
Question: What is the current change request processing speed?
  Measures: average cycle time; standard deviation; % of cases outside the upper limit
Question: Is the performance of the process improving?
  Measures: (current average cycle time * 100) / baseline average cycle time; subjective rating of the manager's satisfaction
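The measures in this example are simple arithmetic over cycle-time data. A minimal sketch of how they might be computed (the function and field names are illustrative assumptions, not from the slides):

```python
from statistics import mean, stdev

def gqm_measures(current_cycle_times, baseline_cycle_times, upper_limit_days):
    """Compute the example GQM measures for change request processing.

    current_cycle_times / baseline_cycle_times: cycle times in days for
    change requests in the current and baseline periods (illustrative inputs).
    """
    avg_current = mean(current_cycle_times)
    avg_baseline = mean(baseline_cycle_times)
    return {
        # Question 1: what is the current processing speed?
        "average_cycle_time": avg_current,
        "std_dev_cycle_time": stdev(current_cycle_times),
        "pct_outside_upper_limit": 100.0 * sum(
            t > upper_limit_days for t in current_cycle_times
        ) / len(current_cycle_times),
        # Question 2: is the process improving?
        # Current average as a percentage of the baseline average.
        "current_vs_baseline_pct": avg_current * 100.0 / avg_baseline,
    }

# Example: baseline averaged 10 days; the current period averages 9 days,
# i.e. 90% of baseline, on track toward the "improve by 10%" goal.
print(gqm_measures([8, 9, 10, 9], [10, 11, 9, 10], upper_limit_days=12))
```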
Project Evaluation: Quality
Test Planning and Resources
• Do we have enough testing resources?
  - How many tests do we need to run (estimated)?
  - How long does each test case take to design and write?
  - How long does each test take, on average?
  - How many full testing cycles do we expect? (more than one, especially for early test cycles)
  - How many person-days do we need (# tests * time per test * # of cycles)? (see the sketch below)
  - How many testing staff do we have?
  - How long will the testing phase take, with our current staff?
  - Is the testing phase too long (i.e., our current staff is not sufficient)? Do we have to test less, or can we add staff?
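These questions reduce to straightforward arithmetic. A minimal sketch, with all numbers and names below being illustrative assumptions rather than figures from the slides:

```python
def testing_phase_estimate(num_tests, hours_per_test, num_cycles,
                           testers, hours_per_day=6.0):
    """Rough test-phase sizing: person-days needed and calendar duration.

    hours_per_day is productive test-execution time per tester per day
    (an assumed figure; replace it with your own historical data).
    """
    total_hours = num_tests * hours_per_test * num_cycles
    person_days = total_hours / hours_per_day
    calendar_days = person_days / testers
    return person_days, calendar_days

# e.g. 400 tests, 0.5 hours each, 3 full cycles, 4 testers
person_days, calendar_days = testing_phase_estimate(400, 0.5, 3, testers=4)
print(f"{person_days:.0f} person-days, about {calendar_days:.0f} working days")
```

If the resulting calendar duration does not fit the project schedule, the same arithmetic answers the follow-up question: add testers, cut cycles, or test less.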
Project Evaluation: Quality
Reported/Corrected Software Defects
[Chart: cumulative percentage (0-100%) of defects found, fixed, and still open over the testing phase, from start to end.]
From Manager's Handbook for Software Development, Revision 1, NASA Software Engineering Laboratory, 1990
Project Evaluation: Quality
Reported/Corrected Software Defects – Actual Project
[Chart: number of defect reports (in thousands, 0-1.0) found, open, and fixed over 40 weeks of testing.]
Project Evaluation: Quality
Defect Rate
[Chart: defects per month (0-70) over months from the start of the project (0-27), against the expected total defects and the points at which 95%, 99%, and 99.9% of them have been found.]
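The speaker notes for this deck describe the defect-detection curve as roughly a Rayleigh curve on large projects: it peaks and then tails off slowly. A hedged sketch of that idea, where the total defect count and peak month are illustrative assumptions, not data from the chart:

```python
import math

def rayleigh_defects_per_month(month, total_defects, peak_month):
    """Rayleigh model of the defect discovery rate.

    total_defects: expected total defects (assumed known from history);
    peak_month: month at which the discovery rate peaks.
    """
    s2 = peak_month ** 2
    return total_defects * (month / s2) * math.exp(-(month ** 2) / (2 * s2))

def months_to_find_fraction(fraction, peak_month):
    """Months from project start until `fraction` of all defects are found."""
    return peak_month * math.sqrt(-2.0 * math.log(1.0 - fraction))

# Illustrative: 600 total defects, discovery peaking at month 9.
print(f"Peak rate ~ {rayleigh_defects_per_month(9, 600, 9):.0f} defects/month")
for f in (0.95, 0.99, 0.999):
    print(f"{f:.1%} of defects found by month {months_to_find_fraction(f, 9):.1f}")
```

The widening gap between the 95%, 99%, and 99.9% months illustrates why the last few percent of defects cost disproportionately more to find.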
Project Evaluation: Quality
Statistics on Effort per Defect
• Data on the time required to fix defects, categorized by type of defect, provides a basis for estimating remaining defect-correction work
• You need to collect data on fix time in the defect tracking system
• Data on the phases in which defects are injected and later detected gives you a measure of the efficiency of the development process: if 95% of defects are detected in the same phase they were created, the project has an efficient process
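A minimal sketch of that phase-containment measure, assuming each defect record carries the phase it was injected in and the phase it was detected in (the field names and data are illustrative):

```python
from collections import Counter

def phase_containment(defects):
    """Fraction of defects detected in the same phase they were injected.

    `defects` is an iterable of (injected_phase, detected_phase) pairs,
    e.g. ("design", "design") or ("design", "system test").
    """
    contained = sum(1 for injected, detected in defects if injected == detected)
    return contained / len(defects)

def escapes_by_phase(defects):
    """Count where defects injected in each phase were eventually detected."""
    return Counter((injected, detected) for injected, detected in defects)

records = [("design", "design"), ("design", "system test"),
           ("code", "code"), ("code", "unit test"), ("code", "code")]
print(f"Phase containment: {phase_containment(records):.0%}")  # 60% here
print(escapes_by_phase(records))
```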
Project Evaluation: Quality
A Defect Fix Time Model for Testing
[Chart: distribution of defect fix times – fraction of defects vs. hours to fix:]
  25% – 2 hours
  50% – 5 hours
  20% – 10 hours
   4% – 20 hours
   1% – 50 hours
From Software Metrics: Establishing a Company-wide Program, by Robert B. Grady and Deborah L. Caswell, 1987
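Given a distribution like the one above, the expected fix effort per defect is a weighted average, and remaining correction work scales with the open-defect count. A small sketch using the percentages from this slide (the pairing of percentages to hours follows the order shown; the open-defect count is an illustrative assumption):

```python
# (fraction of defects, hours to fix) from the fix-time model above
fix_time_model = [(0.25, 2), (0.50, 5), (0.20, 10), (0.04, 20), (0.01, 50)]

expected_hours_per_defect = sum(p * h for p, h in fix_time_model)
open_defects = 120  # illustrative count of defects still open

print(f"Expected fix effort: {expected_hours_per_defect:.2f} hours/defect")
print(f"Remaining correction work: {open_defects * expected_hours_per_defect:.0f} hours")
```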
Product Characterization: Quality
Defects
• Defects are one of the most often used measures of quality
• Definitions of defects differ
  - Only items found by customers? Testers?
  - Items found during upstream reviews?
  - Only non-trivial items?
  - Small enhancements?
• The timing of "defect" detection is an important part of defect characterization
  - A "product defect" may be different from a "process defect"
Product Evaluation: Testing
System Test Profile
[Chart: tests planned, executed, and passed (0-140) over the system test phase.]
From NASA, Recommended Approach to Software Development, 1992
Product Evaluation: Testing
Cumulative Defects Found in Testing
[Chart: Error Rate Model – cumulative errors per KSLOC (0-8) across Design, Code/Test, System Test, and Acceptance Test, showing the historical norm with upper and lower bounds.]
From Manager's Handbook for Software Development, Revision 1, NASA Software Engineering Laboratory, 1990
Product Evaluation: Testing
Cumulative Defects – Actual Project
[Chart: Error Rate Model with the actual project's cumulative errors per KSLOC plotted against the historical norm and its upper and lower bounds.]
From Manager's Handbook for Software Development, Revision 1, NASA Software Engineering Laboratory, 1990
Product Prediction
Predicting Future Defect Rates
Increasing factors:
• System size
• Application complexity
• Compressing the schedule (up to a 4x increase)
• More staff
• Lower productivity
Decreasing factors:
• Simplifying the application/problem at hand
• Extending the planned development time (can cut defects in half)
• Fewer staff
• Higher productivity
Product Prediction
Defect Density Prediction
• To judge whether we've found all the defects for an application, estimate its defect density
  - You need statistics on the defect density of past similar projects
  - Use this data to predict the expected density on this project
• For example, if our prior projects had a defect density between 7 and 9.5 defects/KLOC, we expect a similar density on our new project
  - If our new project has 100,000 lines of code, we expect to find between 700 and 950 defects in total
  - If we've found 600 defects so far, we're not done: we expect to find between 100 and 350 more defects
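A minimal sketch of this calculation; the density range and counts match the example above, and the function name is illustrative:

```python
def remaining_defect_estimate(kloc, defects_found,
                              density_low=7.0, density_high=9.5):
    """Estimate total and remaining defects from historical defect density.

    density_low/high: defects per KLOC observed on similar past projects.
    """
    total_low, total_high = density_low * kloc, density_high * kloc
    remaining_low = max(0.0, total_low - defects_found)
    remaining_high = max(0.0, total_high - defects_found)
    return (total_low, total_high), (remaining_low, remaining_high)

totals, remaining = remaining_defect_estimate(kloc=100, defects_found=600)
print(f"Expected total: {totals[0]:.0f}-{totals[1]:.0f} defects")      # 700-950
print(f"Expected remaining: {remaining[0]:.0f}-{remaining[1]:.0f}")    # 100-350
```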
Product Prediction
Distribution of Software Defect Origins and Severities
• The highest severity faults come from requirements and design
[Chart: defect severity level (Minor, Moderate, Major, Critical) by origin – Requirements, Design, Coding, Documentation, Bad Fixes.]
Product Prediction
Defect Modeling
• Model the number of defects expected based on past experience
• Model the number of defects in requirements, design, construction, etc.
• Two approaches:
  - Model defects based on effort hours, i.e., X defects will be introduced per hour worked
  - Model defects per KSLOC (or another size unit) based on past experience and the code growth curve
Product Prediction
Defect Modeling continued
• Approach 1: SEI data, based on PSP data:
  - Design: 1.76 defects injected per hour
  - Coding: 4.20 defects injected per hour
• Approach 2:
  - Total defects per KSLOC are about 40 (range 30-85)
    - 10% requirements (4/KLOC)
    - 25% design (10/KLOC)
    - 40% coding (16/KLOC)
    - 15% user documentation (6/KLOC)
    - 10% bad fixes (4/KLOC)
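Both approaches are simple multiplications. A sketch combining them, using the rates quoted above; the project inputs (effort hours, KSLOC) are illustrative assumptions:

```python
# Approach 1: defects injected per effort hour (SEI/PSP figures quoted above)
INJECTED_PER_HOUR = {"design": 1.76, "coding": 4.20}

def defects_from_effort(design_hours, coding_hours):
    return (design_hours * INJECTED_PER_HOUR["design"]
            + coding_hours * INJECTED_PER_HOUR["coding"])

# Approach 2: defects per KSLOC, split by source (split quoted above, ~40 total)
DEFECTS_PER_KSLOC = {"requirements": 4, "design": 10, "coding": 16,
                     "documentation": 6, "bad fixes": 4}

def defects_from_size(ksloc):
    return {source: rate * ksloc for source, rate in DEFECTS_PER_KSLOC.items()}

print(defects_from_effort(design_hours=200, coding_hours=400))  # ~2032 defects
print(defects_from_size(ksloc=50))  # e.g. coding: 800, design: 500, ...
```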
Product Prediction
Predicted and Actual Defects Found
[Chart: defects (0-800) by development phase – Analysis, High Level Design, Low Level Design, Construction, Unit Test, Project Integration Test, Release Integration Test, System Test, Beta, General Availability – showing the phase injection estimate, phase expected and actual removal, cumulative injection estimate and reestimate, cumulative expected and actual removal, and a size reestimate point.]
From Edward F. Weller, "Practical Applications of Statistical Process Control," IEEE Software, May/June 2000
Product Prediction
Defect Profile by Type – Example
[Chart: sources of defects, by type.]
Release Measures
Defect Counts
• Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software
• Graph the cumulative reported defects, open defects, and fixed defects (see the plotting sketch below)
• When the software is nearing release, the number of open defects should trend downward, and the fixed-defects line should be approaching the reported-defects line
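A hedged sketch of that graph using matplotlib, assuming weekly snapshots of cumulative found and fixed counts can be pulled from the defect tracker; the data below is made up for illustration:

```python
import matplotlib.pyplot as plt

# Weekly cumulative counts from the defect tracking system (illustrative data)
weeks = list(range(1, 11))
found = [30, 75, 140, 210, 290, 360, 410, 450, 470, 480]
fixed = [10, 40, 90, 150, 220, 300, 360, 420, 455, 475]
open_ = [f - x for f, x in zip(found, fixed)]  # still-open defects each week

plt.plot(weeks, found, label="Found")
plt.plot(weeks, fixed, label="Fixed")
plt.plot(weeks, open_, label="Open")
plt.xlabel("Weeks of testing")
plt.ylabel("Cumulative defect reports")
plt.title("Reported/corrected defects")
plt.legend()
plt.show()
```

On a healthy project the open curve turns downward well before the planned release date, as in the near-release charts that follow.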
Release Measures
Defect Trends – Near Release: All Defects
[Chart: defect reports (in thousands, 0-1.0) found, open, and fixed over 40 weeks of testing, with the release target marked.]
Release Measures
Defect Trends – Near Release: Severity 1 and 2
[Chart: severity 1 and 2 defect reports (in thousands) found, open, and fixed over 40 weeks of testing, with the release target marked.]
Release Measures
Construx Measurable Release Criteria
• Acceptance testing successfully completed
• All open change requests dispositioned
• System testing successfully completed
• All requirements implemented, based on the spec
• All review goals have been met
• Declining defect rates are seen
• Declining change rates are seen
• No open Priority A defects exist in the database
• Code growth has stabilized
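Because these criteria are meant to be measurable, release readiness can be checked mechanically. A toy sketch; the criterion names paraphrase the list above and the boolean values are placeholders for whatever checks your project actually automates:

```python
def release_ready(criteria):
    """Return the unmet criteria; release only when the list comes back empty."""
    return [name for name, met in criteria.items() if not met]

criteria = {
    "acceptance testing completed": True,
    "open change requests dispositioned": True,
    "system testing completed": True,
    "no open Priority A defects": False,   # e.g. two still open
    "defect rate declining": True,
    "code growth stabilized": True,
}
blockers = release_ready(criteria)
print("Ready to release" if not blockers else f"Blocked by: {', '.join(blockers)}")
```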
Release Measures
HP Measurable Release Criteria
• Breadth – testing coverage of user-accessible and internal functions
• Depth – branch coverage testing
• Reliability – continuous hours of operation under stress; stability; ability to recover gracefully from defect conditions
• Remaining defect density at release
From Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, 1992
Release Measures
Post-Release Defect Density by Whether Release Criteria Were Met
[Chart: postrelease incoming defects submitted by customers (3-month moving average, normalized by KLOC) from release (MR) through the following 12 months, comparing products that did not meet the release criteria, the worst product that met them, and the average of products that met them.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Release Measures: Defect Counts
Defect Plot Before Release
[Chart: number of defects (0-12) over time before release – Severity 1, Severity 2, and Severity 1 & 2 combined – against the release target.]
From Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, 1992
Detection Effectiveness
[Chart: detection effectiveness (0-100%) by activity – design check, design review, design inspection, code inspection, prototype, code check, unit test, functional test, integration test, field trial – showing the lowest, modal, highest, and cumulative effectiveness.]
[Jones86]
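If each detection activity independently removes some fraction of the defects that reach it, cumulative effectiveness compounds as 1 - prod(1 - e_i). A sketch of that calculation; the per-activity percentages below are illustrative assumptions, not Jones's figures:

```python
def cumulative_effectiveness(stage_effectiveness):
    """Cumulative defect-detection effectiveness across sequential activities.

    stage_effectiveness: fraction of incoming defects each activity finds,
    assuming each activity acts independently on the defects that remain.
    """
    remaining = 1.0
    for e in stage_effectiveness:
        remaining *= (1.0 - e)  # defects that slip past this activity
    return 1.0 - remaining

# Illustrative per-activity effectiveness: design review, code inspection,
# unit test, functional test, integration test
stages = [0.35, 0.55, 0.30, 0.35, 0.35]
print(f"Cumulative effectiveness: {cumulative_effectiveness(stages):.0%}")  # ~91%
```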
Process Evaluation
Status Model
[Diagram: units created, units reviewed, units tested.]
Process Evaluation
Status Example
[Chart: units (0-800) created, reviewed, and tested over the implementation phase, against the target.]
From NASA, Manager's Handbook for Software Development, Revision 1, 1990
Goal #1 – Improve Software Quality
Postrelease Discovered Defect Density
[Chart: number of open serious and critical defect reports (normalized, 0-1) from Nov-84 to Jan-93, split into reports older than 12 months and reports less than 12 months old, against the 10X improvement goal.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Goal #1 – Improve Software Quality
Prerelease Defect Density
Question: How can we predict software quality based on early development processes?
[Chart: defects found in test per KLOC (0-80) by project release date, Oct-80 through Dec-88, with a linear trend line.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Goal #3 – Improve Productivity
Defect Repair Efficiency
Question: How efficient are defect-fixing activities? Are we improving?
[Chart: defects fixed per engineer-month (0-5) by year, 1987-1991.]
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Goal #4 – Maximize Customer Satisfaction
Mean Time to Fix Critical and Serious Defects
Question: How long does it take to fix a problem?
[Chart: days to fix (0-250), monthly from 7/18/1990 through 8/18/1991, broken down by status: AR, QA, KP+AD, LC, MR.]
  AR = awaiting release; QA = final QA testing; KP = known problem; AD = awaiting data; LC = lab classification; MR = marketing review
From Practical Software Metrics for Project Management and Process Improvement, by Robert B. Grady, 1992
Pamela Perrott
• 23+ years in IT
• Application programmer, systems programmer, programmer support
  - At two insurance companies
• Several years at Boeing
  - In the re-engineering group and the repository group
• Several years in wireless (stints in QA and SEPG groups)
• Currently an instructor/consultant at Construx Software
Construx Software
• Steve McConnell
  - Owner, CEO, Chief Software Engineer
  - Author: Code Complete, Rapid Development, Software Project Survival Guide, Professional Software Development
• Founded in 1996
• Staff of around 15, mostly software engineers
• Located in the Pacific Northwest
• Two primary lines of business: training and consulting
Contact Information
support@qaprogrammer.com
www.qaprogrammer.com
+91-40-65 70 57 57, +91-93 92 91 89 89
• Testing Projects
• Consulting
• Seminars
QAProgrammer
Delivering Software Project Success
Editor's Notes
  1. Main Point: Cover Slide Notes: Administrivia: Hours (9-4:30) or (8:30-4) Breaks (15 minutes 10:30, 2:30) (10:00, 2:00) Lunch 12-1 (or 11:30-12:30) Messages Facilities Notebook Show enthusiasm for the topic! Ask them for their enthusiasm! (maybe needs a slide on the morass of measurement – every awful measurement you’ve seen or heard of?) Don’t place laptop in center of table, I block screen. Put it at end of the table in our seminar room. Misc.: References: None
  2. Main Point: Precision is different than accuracy Notes: Accuracy = correct; closest to the truth Precision = Exactness; how finely you express the point. Precision implies accuracy that may not exist. So what would be the best way to say the project will take a year. Exactly a year? About a year? 12 months? 364 days? Miscellaneous: Story of the 70 million and 6 year old dinosaur Reference: None
  3. Main Point: People are misled Notes: How you communicate leads people to believe how confident you are. A precise, but inaccurate, estimate communicates a highly confident, accurate estimate. Miscellaneous: Story of the 70 million and 6 year old dinosaur Reference: None
  4. Main Point: SMART Goals Notes: The goals can either be project, area, or organizational in nature. We can talk about what the group is chartered to do. Purpose, Issue, Object format is from Basili’s group at Maryland. It also has viewpoint – see template. You may or may not find viewpoint useful. Make sure to have a good flow between Smart and GQM Misc: None References: None
  5. Main Point: Diagram of GQM – how goals, questions and measures relate to each other Notes: Goal: A point, end or place that an individual or group is striving to reach Question: decision maker issues related to progress toward or attainment of one or more goals Measure – A measurable characteristic of the organization or project that provides data which helps answer one or more of the questions. The same measure can be used in order to answer different questions under the same goal, and may answer questions under different goals as well. It’s an engineering problem. Smallest number of measurements to support the largest number of goals Sometimes a combination of two answers a question, too Misc: None Reference: Basili, Victor R., Gianluigi Caldiera, and H. Dieter Rombach, “The Goal Question Metric Approach”, in Encyclopedia of Software Engineering, Wiley, 1994, available at http://ftp.cs.umd.edu/pub/sel/papers/gqm.pdf
  6. Main Point: Example of GQM – how goals, questions and measures relate to each other Notes: Goal: A point, end or place that an individual or group is striving to reach Goals have a purpose (like improve), an issue, an object or process and a viewpoint. See example above. It also has a quality issue, timeliness here. Question: decision maker issues related to progress toward or attainment of one or more goals Measure – A measurable characteristic of the organization or project that provides data which helps answer one or more of the questions. Example of a goal, question and measures Misc: None Reference: Basili, Victor R., Gianluigi Caldiera, and H. Dieter Rombach, “The Goal Question Metric Approach”, in Encyclopedia of Software Engineering, Wiley, 1994, available at http://ftp.cs.umd.edu/pub/sel/papers/gqm.pdf
  7. Main Point: Test Planning measures Notes: Now we’ll talk about measures related to Test Planning. These are very basic, and could be done by test phase. Basically: how many tests do we think we’ll need to develop? How long does each take to run (on average, from past experience)? Thus how long will the whole set take to run (total effort)? How many staff do we have? Thus, how long will the testing phase (this particular phase) last? Is this OK for the project schedule, or do we need more staff? Do we have to run fewer tests? Allow for multiple runs, especially of early testing phases! Allow for 3 or 4 cycles. Time to find a defect varies by test type: it’s less for unit test, more for system test. Time to find a defect increases as the testing phase continues, as most defects are already found (or at least most easy ones!). So time to find the average defect isn’t linear – 2x time may not yield 2x defects. The defect detection curve is a Rayleigh curve, at least on a large project: it peaks, then tails off slowly. More important to talk about input (characteristics evaluation). Misc.: None References: None
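To make the planning arithmetic above concrete, here is a minimal sketch; every number in it (test count, hours per run, cycles, staffing) is an illustrative assumption, not data from the slides.

```python
# Back-of-the-envelope test planning; all figures below are assumed examples.
num_tests = 400           # estimated number of test cases to design and run
hours_per_run = 1.5       # average hours to execute one test case
num_cycles = 3            # allow 3-4 full cycles, especially for early test phases
testers = 4               # available testing staff
hours_per_day = 6         # productive testing hours per tester per day

total_hours = num_tests * hours_per_run * num_cycles
person_days = total_hours / hours_per_day
calendar_days = person_days / testers

print(f"Total effort: {total_hours:.0f} hours (~{person_days:.0f} person-days)")
print(f"Testing phase with {testers} testers: ~{calendar_days:.0f} working days")
```

If the resulting phase length does not fit the project schedule, the same arithmetic shows how many testers would be needed or how many tests would have to be cut.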
  8. Main Point: Information about Reported and Corrected Defects plots Notes: Plot defects found, defects fixed, and defects still open on one graph. This is a healthy project. No huge backlog. Strive for a graph like this. This is a basic graph that everyone should be plotting. Most shops have this data in their defect tracking system: how many found, how many fixed, how many still open, over time. You’ll have to figure the numbers out over time, or save them every Friday or whatever. The dates the defect was entered into the defect tracking system, and the date it was fixed, are known, so you can reconstruct this curve. Key information is in the slope of the open defects curve. Open defects should decline as testing progresses, unless there is inadequate staff correcting problems, or the software is exceptionally buggy. People introduce defects every hour they work, so the defect introduction curve looks like a Rayleigh curve (seen earlier). If the count of open defects is growing, possible causes may be: inadequate staffing to correct defects; very unreliable software; ambiguous or volatile requirements. If the count of open defects is decreasing, possible causes may be: stable development; testing/development progressing; reliable, well-designed software (easy to fix defects). Misc.: None References: [NASA90] page 6-9
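A small sketch of the reported/fixed/open plot described above, assuming matplotlib is available; the weekly counts are invented purely for illustration (a real plot would pull them from the defect tracking system).

```python
import matplotlib.pyplot as plt

weeks = list(range(1, 11))
reported = [5, 15, 30, 50, 75, 95, 110, 120, 126, 130]  # cumulative defects found
fixed = [2, 10, 22, 40, 60, 85, 100, 112, 122, 128]     # cumulative defects corrected
open_defects = [r - f for r, f in zip(reported, fixed)]  # backlog still open

plt.plot(weeks, reported, label="Reported (cumulative)")
plt.plot(weeks, fixed, label="Fixed (cumulative)")
plt.plot(weeks, open_defects, label="Open")
plt.xlabel("Week of testing")
plt.ylabel("Defects")
plt.legend()
plt.show()
```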
  9. Main Point: Information about Reported and Corrected Defects plots Notes: Data shown is from an actual project. This is more likely to be reality in your project! Have a celebration when the white line and blue line cross: you’re fixing defects faster than you’re finding them. Draw on the whiteboard an unhealthy open curve with a second hump. Most customers are happy with a project that has 95% of defects removed. Most companies don’t even manage that. It costs significantly more to get to 99% and 99.9%: Putnam says 25% more for 99% and 50% more for 99.9%, but a recent paper says up to 10 times more expensive. From the Trajectory Computation and Orbital Products System, developed from 1983 to 1987. The total size was 1.2 million SLOC. Early in testing, defects were not getting corrected. The cause was lower quality software. Defects were not found during system testing. Corrective actions: staffing was increased at week 20 to help address open defects. The system attained stability (fixed and open lines crossed) at week 35, with defects being corrected faster than they were being reported. Caution about metrics based on defect closure rate – the easy ones get corrected first, leaving the harder ones for the end. Misc.: None References: [NASA90] page 6-9
  10. Main Point: Defect Rate Notes: How long we need to test depends on the final reliability we need. This is a theoretical curve, but actual data has been shown to fit this curve. And, if we test for 2x the time, we’ll find 2x the defects if we’re in the peak of the curve, but not later on. It makes sense – as we work, we inject defects into the project. Since the effort curve (at least for large projects) follows a Rayleigh curve, the defect rate curve is also a Rayleigh curve. The first line labeled 95% is the reliability we often aim for. This translates into an MTBF of about 8-9 hours, which is enough for a typical batch program that usually doesn’t need to run for a long time without failure. 99% translates into an MTBF of more like 1.8 weeks – which is needed for software that must run in an online environment for days without failure. If 95% is a project time of 1.0, 99% is 1.25 (25% longer) according to Putnam. 99.9% translates into an MTBF of 10+ weeks – and a project time of 1.5 (50% longer) relative to 95% defect removal. Note that the time to find a defect varies as time goes on: the curve peaks, then tails off slowly – i.e. it gets longer and longer to find each new defect at the end. I didn’t talk about collecting the data. How did it come to be? Number of defects doesn’t equal reliability. What is a defect? Misc.: None References: Putnam and Myers, pages 125-130
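The Rayleigh-shaped detection curve mentioned above can be sketched numerically; the total defect count and the week of peak detection below are assumed values, chosen only to show how finding the next defect gets slower in the tail.

```python
import math

def rayleigh_cumulative(week, total_defects, peak_week):
    """Cumulative defects expected to be found by a given week for a Rayleigh profile."""
    return total_defects * (1 - math.exp(-week**2 / (2 * peak_week**2)))

# Assumed values: 1000 lifetime defects, detection rate peaking at week 10.
for week in (5, 10, 20, 30):
    found = rayleigh_cumulative(week, 1000, 10)
    print(f"week {week:2d}: ~{found:4.0f} found ({found / 1000:.0%} removed)")
```

With these assumptions roughly 39% of the defects are found by the peak week, about 87% by twice the peak, and 99% only around three times the peak, which is why doubling the test time late in the curve does not double the defects found.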
  11. Main Point: Effort per defect info related to release readiness Notes: From SPSG page 225 See slide xx showing effort per defect data from HP Need to collect effort to fix defects. Then can generate average fix times by type of defect. Then, if you know how many defects you have open, you know how many person-weeks of effort you still have to go to fix the remaining defects on the project This works grossly, for planning purposes, when there are hundreds of defects expected Towards the end of a project we don’t know how long it will take us to fix the last few nasty defects, so this breaks down a bit. And, just because you haven’t found any for a week, doesn’t mean there aren’t any there! Industry data: JPL: 5-17 hours to fix a defect Misc.: None References: SPSG pages 221-235, chapter 16
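As a rough sketch of the remaining-effort idea above: multiply the open defect counts by your historical average fix time per defect. The severity buckets and fix-time figures here are assumptions for illustration (the note only cites JPL's 5-17 hours per defect as industry data).

```python
# Remaining fix effort from open defects and historical average fix times (assumed figures).
open_defects = {"critical": 3, "serious": 12, "medium": 40, "low": 60}
avg_fix_hours = {"critical": 16, "serious": 10, "medium": 6, "low": 2}

remaining_hours = sum(count * avg_fix_hours[sev] for sev, count in open_defects.items())
print(f"Estimated remaining fix effort: {remaining_hours} hours "
      f"(~{remaining_hours / 40:.1f} person-weeks)")
```

As the note warns, this works only grossly, when hundreds of defects are expected; it breaks down for the last few nasty ones.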
  12. Main Point: HP’s Defect model – move to testing section? Notes: This model is for how long defects take to find and fix. This model was developed by Henry Kohoutek at HP, from actual data for one of HP’s product lines. You could figure it out for your defects. You can use a model like this to estimate the testing hours needed for your project. To use the model, start from the total code size (this is known at the start of test) and multiply it by the expected defect density; this yields the total defects expected. Then calculate, using the model, the total time necessary to discover all the defects. Then the available staffing can be applied to the total time to predict the testing schedule. Also, what percentage of your fixes break something else? It can be as high as 25%. Does this time include confirmation time for the fix? Yes. Got a question about schedule slips – schedule slips are systemic – i.e. see EV slides. Misc.: None References: From Software Metrics: Establishing a Company-Wide Program, by Robert B. Grady and Deborah L. Caswell, 1987, page 128
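A hedged sketch of how such a model might be applied as the note describes (size times defect density gives expected defects, times average find-and-fix time gives total hours, divided by staffing gives the schedule). The numbers are placeholders, not Kohoutek's actual HP figures.

```python
# Illustrative use of a find-and-fix defect model; all inputs are assumed values.
ksloc = 50                     # size of the code entering test
defects_per_ksloc = 8          # expected defect density from your own history
find_and_fix_hours = 6         # average hours to discover and correct one defect
testers = 5
test_hours_per_week = 30       # productive hours per tester per week

expected_defects = ksloc * defects_per_ksloc
total_hours = expected_defects * find_and_fix_hours
weeks = total_hours / (testers * test_hours_per_week)
print(f"Expect ~{expected_defects} defects, ~{total_hours} hours, ~{weeks:.0f} weeks of testing")
```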
  13. Main Point: A little bit about defects Notes: What is a defect in your company? A review issue? A small enhancement? A customer enhancement request? Defects found in unit testing? What’s a defect isn’t completely clear either Do you count things found in reviews? What about unit test defects if they’re found by the programmers themselves? Are they counted? Some shops put enhancements (either internally generated or customer requests) into the defect tracking system, as a convenient place to store them. Are they ‘defects’? Need to weed out duplicates, etc. Some companies use several defect tracking systems so finding out the total can be difficult (one for defects in production, another during testing, for example) Misc.: None References: None
  14. Main Point: System Test Profile Notes: An example of plotting testing progress: number of tests planned, number of tests executed, number of tests passed. We expect more or less linear growth as we test. This is an easy graph to do. Want the tests executed and tests passed lines close together. This is for an actual project shown in NASA Recommended Approach to Software Development, page 8-18. What’s happening here? Testing starts off well, then levels off and finally continues at a lower rate. Cause: midway through the phase, testers found they did not have the input coefficients needed to test flight software. There was a long delay before the data became available, and testing momentum declined. This S-shaped curve can be tracked for several items of interest during software development, for example: completion of design reviews over time; completion of code inspections over time; completion of code integration over time (the graph above); completion of component test in terms of number of test cases attempted and successful over time; completion of system test in terms of number of test cases attempted and successful over time. See also the units coded, read, tested graph shown earlier for build testing. Misc.: None References: NASA Recommended Approach to Software Development, page 8-18
  15. Main Point: System Test Profile Notes: An example of plotting testing progress: number of tests planned, number of tests executed, number of tests passed. We expect more or less linear growth as we test. This is an easy graph to do. Want the tests executed and tests passed lines close together. This is a made-up slide, showing a project with more problems than in the prior slide – tests passed are falling behind tests executed. This is a pattern to watch out for! This S-shaped curve can be tracked for several items of interest during software development, for example: completion of design reviews over time; completion of code inspections over time; completion of code integration over time (the graph above); completion of component test in terms of number of test cases attempted and successful over time; completion of system test in terms of number of test cases attempted and successful over time. See also the units coded, read, tested graph shown earlier for build testing. Misc.: None References: NASA Recommended Approach to Software Development, page 8-18
  16. Main Point: Defect Rates Notes: Track defects vs. total estimated size of the project. You need to define what a defect is. People often use what’s in the defect tracking system. This graph is of defects found in test only. NASA has developed software development processes which reduce defects – for example requirements and design reviews. If you work in a shop without those processes, your defects in test will be much higher than shown here. In testing, NASA has found the defect rates are halved in each succeeding testing phase (not counting defects found in requirements and design reviews): 4 defects/KSLOC in construction/unit test, 2 defects/KSLOC in system test, 1 defect/KSLOC in acceptance test. A graph of typical defect rates – NASA data. This shows their model upper and lower bounds as well as the expected rates. If a project’s defect rate is above the model upper bounds, possible causes: unreliable software; misinterpreted requirements; extremely complex software. If a project’s defect rate is below the model bounds, possible causes: reliable software; “easy” problem; inadequate testing. Misc.: None References: [NASA90] page 6-8
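The NASA rule of thumb quoted above (find rates roughly halving per test phase) lends itself to a tiny calculation; the project size here is an assumed example.

```python
# Expected defects per test phase using NASA's halving rule of thumb (4/2/1 per KSLOC).
ksloc = 120  # assumed project size
rates = {"construction/unit test": 4, "system test": 2, "acceptance test": 1}

for phase, rate in rates.items():
    print(f"{phase}: ~{rate * ksloc} defects expected ({rate}/KSLOC)")
```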
  17. Main Point: Defect Rates Notes: Defect rates from one actual project. What’s going on? This actual project had a lower defect rate and lower defect detection rate. In this case, this was an early indication of high quality. There was smooth progress in detecting defects – it’s not like they weren’t testing all along. This was one of the highest quality systems produced. If the defect density is lower than expected, possible causes: size estimate is high (good); inspection defect detection is low (bad); work product quality is high, as in the example above (good); insufficient level of detail in work product (bad). If the defect density is higher than expected, possible causes: size estimate is low (bad); work product quality is poor (bad); inspection defect detection is high (good); too much detail in work product (good or bad). Misc.: None References: [NASA90] page 6-8
  18. Main Point: Factors which affect the defect rate Notes: Several things increase the amount of faults we put into the system and their opposite tends to decrease the amount. Japanese study showed that perceived schedule pressure drove up defects by 4x. Schedule beyond the minimum (25%?) decreases faults inserted – to less than half If you know your history, here are the factors that can affect it up or down Misc. X References: Measures for Excellence, Reliable Software on Time, Within Budget, by Lawrence H. Putnam and Ware Myers, 1992 This is from Putnam, Chapter 8, pages 135-146
  19. Main Point: Defect density prediction related to release readiness Notes: For example, past projects have found defects per KSLOC to be in the range of 7 to 10. You have a system of 100,000 lines of code and have found 600 defects so far = 6 defects per KSLOC. Based on past experience from 2 projects, you expect 7 to 10, or 700 to 1000 defects in 100,000 lines of code. If you are trying to remove 95% of all defects before release, you need to plan to find 665 – 950 defects based on your past experience. If you have past experience on a number of projects, you may know your average lifetime defect rate is 7.4 ± 0.4 defects per KSLOC, which is a much closer range. Misc.: None References: SPSG pages 221-235, chapter 16
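The worked example in the note can be written out as a short calculation, using the same figures (100 KSLOC, 600 defects found so far, 7-10 defects/KSLOC expected, 95% removal target).

```python
# Defect density prediction against release readiness, using the figures from the note.
ksloc = 100
found_so_far = 600
low_rate, high_rate = 7, 10        # lifetime defects per KSLOC from past projects
removal_goal = 0.95                # fraction of defects to remove before release

expected_low, expected_high = low_rate * ksloc, high_rate * ksloc
target_low, target_high = removal_goal * expected_low, removal_goal * expected_high
print(f"Expected total defects: {expected_low}-{expected_high}")
print(f"Plan to find {target_low:.0f}-{target_high:.0f} before release")
print(f"Found so far: {found_so_far} ({found_so_far / ksloc:.0f}/KSLOC)")
```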
  20. Main Point: Worst faults come early Notes: Critical means the system doesn’t work; Major means some of the system doesn’t work; Moderate means the system works but with data corruption; Minor is workmanship. What is the percentage of total defects introduced in each phase? Vertical bars are 5% each. We see that the faults that will cause us the most pain come from activities early in the lifecycle. This highlights our problem with brute-force quality by testing at the end: too late to find them, too expensive. We see coding errors start to pick up at Major, but that category is dominated by design. Coding probably gives the largest number of errors but perhaps not the highest cost to the organization. Capers Jones says this might be a typical distribution for a medium to large system, of 50,000 LOC and larger. He says for a small system of 5,000 LOC or less, coding defects would be more than 50% of the total. Misc. X Reference: Capers Jones, Applied Software Measurement, page 368
  21. Main Point: Defect Modeling Notes: Defect modeling is another approach; we’ve seen this already in prior slides. Can model based on number of lines of code and past experience on defects per KSLOC, or on defects inserted per hour of effort in requirements, design, construction, etc., or based on anything else you might have tracked historical data on – defects per class, for example. Model defects against size in whatever units make sense – KLOC, FP, use cases, classes, whatever. Can also insert defects deliberately and see how many are caught by testing. Say we insert 100. Testing finds 220: 200 not inserted and 20 inserted. Since we found 20% of the inserted ones, predict 20% is the percentage of actual defects found. Thus total actual defects is 1000, and 800 remain (plus the 80 we inserted that weren’t found) = 880 total. Predict the same percentage of actual defects found as inserted defects, although this is only true if we can model actual defects and insert the same distribution of types in our inserted ones. If you do this, insert defects in a branch of the code in the CM library, test the branch, then kill it. Misc.: None References: SPSG pages 221-235, chapter 16
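The defect-seeding estimate in the note, written out as a small calculation with the same numbers (100 seeded defects, 20 of them found, 200 real defects found).

```python
# Seeded-defect estimate of remaining defects, as described in the note.
seeded = 100          # defects deliberately inserted
found_seeded = 20     # seeded defects caught by testing
found_real = 200      # non-seeded defects caught by testing

detection_rate = found_seeded / seeded               # 20% of seeded defects were caught
estimated_total_real = found_real / detection_rate   # assume 20% of real defects were caught too
remaining_real = estimated_total_real - found_real
remaining_seeded = seeded - found_seeded
print(f"Estimated total real defects: {estimated_total_real:.0f}")
print(f"Still latent: {remaining_real:.0f} real + {remaining_seeded} seeded = "
      f"{remaining_real + remaining_seeded:.0f}")
```

As the note says, this only holds if the seeded defects resemble the real ones in type and distribution.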
  22. Main Point: Defect Modeling Notes: Some industry numbers as an example for defect modeling. Based on defects injected per hour of design and construction, and on defects per KSLOC in different test phases. This is a level 3 prediction: you won’t have this data to start. Capers Jones (1999 seminar at Cx) says 83% of defects exist before a single line of code is written. Misc.: None References: SPSG pages 221-235, chapter 16; NASA Manager’s Handbook for Software Development, page 6-8 and Watts S Humphrey ‘Measuring Software Quality’ presented at SEPG 2000
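A sketch of the effort-based injection model mentioned above (defects injected per hour of requirements, design, and construction work). The effort figures and injection rates are assumptions for illustration, not the industry numbers from the slide.

```python
# Effort-based defect injection estimate; every figure below is an assumed example.
effort_hours = {"requirements": 400, "design": 800, "construction": 2000}
defects_injected_per_hour = {"requirements": 0.05, "design": 0.10, "construction": 0.20}

expected_injected = sum(effort_hours[p] * defects_injected_per_hour[p] for p in effort_hours)
print(f"Expected defects injected: {expected_injected:.0f}")
```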
  23. Main Point: Predicted and actual defects found Notes: If you track and record all defects, you can develop a profile of defects by phase for your organization. Then you can develop a graph that shows expected defect numbers as the project progresses. If you then find more or fewer defects than expected, you can research why. If you know you usually catch 50% of the requirements defects in inspections, you can predict how many you’ll catch and how many will ‘escape’. Phase containment numbers of defects by phase are graphed above – defect injection estimate, expected defect removal, actual defect removal, by phase, plus cumulative numbers. Defect removal in unit test was higher than estimated, which meant that a lower number of defects were removed in the integration test and system test phases. Without accurate defect removal data from the unit test phase, these low numbers would be of more concern with respect to product quality. This is a level 3 metric! Phase criteria for releasing by phase are about product and process quality. Misc.: None References: From Edward F. Weller, Practical Applications of Statistical Process Control, IEEE Software May/June 2000
  24. Main Point: Measures related to Inspections and other reviews Notes: Example of a defect profile by type. Could be other type classifications. HP found defect percent by type varied a lot between divisions – see green HP book page 139 ff. These are IEEE classifications. Specification is requirements – specs don’t describe the needs of the users. Functionality – incorrect or incompatible product features – is also requirements. Data Handling, Computation and Logic are coding errors. UI, Data Definition and Error Checking are design errors. Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, Prentice Hall PTR, 1992, page 139 ff
  25. Main Point: What defect counts tell us about release readiness Notes: From SPSG page 224 See slide xx showing defects graph If the project’s quality level is out of control, and the project is thrashing, you might see a steadily increasing number of open defects. Steps need to be taken to improve the quality of the existing designs and code before adding more new functionality. Misc.: None References: SPSG pages 221-235, chapter 16
  26. Main Point: Information about Reported and Corrected Defects plots Notes: Data shown is theoretical – I moved the lines from the actual project to show values at release. Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software Graph the cumulative reported defects, open defects and fixed defects When the software is nearing release, the number of open defects should trend downward and get near zero, and the fixed defects should be approaching the reported defects line Misc.: None References: [NASA90] page 6-9
  27. Main Point: Information about Reported and Corrected Defects plots Notes: Data shown is theoretical – I moved the lines from the actual project to show values at release. Defect counts give a quantitative handle on how much work the project team still has to do before it can release the software. Graph the cumulative reported defects, open defects and fixed defects. When the software is nearing release, the number of open defects should trend downward and get near zero, and the fixed defects should be approaching the reported defects line. And, just because you haven’t found any for a week, doesn’t mean there aren’t any there! The open line should be near zero for more than one week! Typically there are no sev 1 or sev 2 defects; and you look at the sev 3 with product support and fix some of the remaining ones – the ones product support says will be a problem – before release. So the total defects slide (the one before) doesn’t get to zero before release, although it gets low; but the sev 1 and sev 2 plot gets to zero. Misc.: None References: [NASA90] page 6-9
  28. Main Point: HP release criteria Notes: From green HP book Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
  29. Main Point: HP release criteria Notes: From green HP book Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
  30. Main Point: HP post-release criteria plot of critical/serious defects, by whether met release criteria or not Notes: From green HP book, page 78. Certification meant following the testing and release criteria. The bottom line is an average of a dozen projects that met the new release criteria. This graph compares products that met the criteria for release (shown 2 slides ago) vs. ones that didn’t. The graph shows that a combination of good development and testing processes will enable you to confidently predict a low incoming defect rate. This is in some ways an experiment. When standard release criteria were implemented, not all projects met them; so HP could compare, later, the post-release defects in the products that met them vs. products that didn’t. Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 78
  31. Main Point: HP release criteria plot of critical/serious defects Notes: From green HP book, page 77. The plot in the book includes defects/KLOC on the right: the target line (3 defects) is at 0.02, 6 defects is at 0.04, 9 defects = 0.06. The goal of the target defects was for the test cycle before the final one; the goal for the final test cycle would be 0 Sev 1 and 2 defects. Note it included some critical or serious defects – but not many! Track this (Sev 1 & 2) during the whole testing phase, not just at the end (also track total defects). Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 76
  32. Main Point: Need combination for reasonable level of removal Notes: A study was done to determine the effectiveness of various techniques. The “check” ones are personal desk checking. Function testing is of related modules; integration testing is of the whole system. As you can see, a desk check can do fairly well when run right, or be downright lousy. The lowest effectiveness rate was unit testing. The real message, though, is to look at what happens with the combined number. If our goal is 95%, we may not need to do all of these. This is where software engineering comes into play: what is the right set for our project that will get us there for the least cost? Misc. Most research doesn’t include requirements, which many purists don’t consider part of a software project. Too much variability and may end up in hardware. Reference: Capers Jones, Programming Productivity, p.179
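The "combined number" the note refers to behaves roughly like stacked filters: each technique removes some fraction of whatever defects are still left, so the combined efficiency is one minus the product of the miss rates. A sketch, with per-technique rates that are assumptions rather than the study's figures:

```python
# Combined defect removal efficiency of stacked techniques (assumed per-technique rates).
removal_rates = {
    "design review": 0.55,
    "code inspection": 0.60,
    "unit test": 0.25,
    "function test": 0.35,
    "system test": 0.40,
}

remaining_fraction = 1.0
for technique, rate in removal_rates.items():
    remaining_fraction *= (1 - rate)   # fraction of defects that slips past this technique

combined_efficiency = 1 - remaining_fraction
print(f"Combined removal efficiency: {combined_efficiency:.1%}")
```

With these assumed rates the combination lands near 95% even though no single technique exceeds 60%, which is the point of the slide.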
  33. Main Point: Development status model Notes: For a single build (single stage of a staged development project) This is a key progress indicator It is an indirect software quality indicator The model must represent how development is done – the development methodology Expect a lag between coding and review and between review and testing This is a measure of process quality. We want to keep consistent space between the lines. If reviewing and testing fall behind, the project is falling behind, even though coding is going well. If the project suddenly catches up, be suspicious, probably they didn’t do reviews and testing as thoroughly as they should have. Monitor only major activities Misc.: None References: NASA Manager’s Handbook for Software Development, Revision 1, page 6-11
  34. Main Point: Example for an actual project of development status model Notes: Shows target, units coded, units reviewed, units tested. The project shown finished code and unit testing nearly on schedule. When severe problems were encountered during system integration and testing, it was found that insufficient unit testing had resulted in poor quality software. Details shown above. Note the miracle finish at 1 – where all of a sudden code review and unit testing catch up with coding near the deadline, when there had been a 3-4 week lag. Cause: some crucial testing information was not available, and shortcuts were taken in reviews and unit testing to meet schedules. Result: the project entered the system testing phase with poor quality software. To bring the software up to standard, the system test phase took 100% longer than expected (!) Misc.: None References: NASA Manager’s Handbook for Software Development, Revision 1, page 6-11
  35. Main Point: HP post-release defects over time, as part of measuring process improvement Notes: From green HP book, page 207. A corporate-wide HP goal was to improve the product post-release defect density by a factor of 10 in five years. This graph, from one division, shows its progress in meeting its goals. Goal: improve software quality. Questions this graph helps to answer: What is our current software quality? This graph shows post-release defect density of products according to one of HP’s 10X improvement measures. It is only an after-the-fact indicator of the quality level produced by our processes, and thus can only influence future products through cause-effect analysis. Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 207
  36. Main Point: HP Division Prerelease Defect Density Notes: From green HP book, page 199. A corporate-wide HP goal was to improve the product post-release defect density by a factor of 10 in five years. This graph, from one division, shows its progress in meeting its goals. Goal: improve software quality. Questions this graph helps to answer: How can we predict product quality based on early development processes? This data can be used to predict the performance of the graph on the prior slide. For an unchanging process, there is a roughly predictable ratio between pre- and post-release defects. Keep in mind that an upward trend in this graph could show either better testing techniques or poorer pre-test defect avoidance. A downward trend could reflect better pre-test defect avoidance or poorer testing. Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 199
  37. Main Point: Defect Repair Efficiency (defects fixed per engineering month) Notes: A corporate-wide HP goal was to improve productivity. Our primary cost is in engineering months and calendar months. Our output today is most effectively measured in KLOC or function points for new development, and in defects fixed for maintenance. Productivity measurement is a particularly sensitive topic. It is best not to measure any finer level of detail than in these examples, and it is best to drive improvements from figure 15-4. Goal: improve productivity. Questions this graph helps to answer: How efficient are defect-fixing activities? This graph shows the trend of efficiency in fixing defects. It helps to ensure that we reduce the average effort to fix defects, besides whatever staffing actions we might take to reduce the backlog. (This graph does not show real data.) Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 201
  38. Main Point: Mean time to fix serious and critical defects Notes: A corporate-wide HP goal was to maximize customer satisfaction. The next three graphs show more direct aspects of customer satisfaction. They deal with responsiveness to important customer problems and indirectly with how well we understand all our customers’ needs. Goal: Maximize customer satisfaction. Questions this graph helps to answer: How long does it take to fix a problem? The trend of the total area under the curve is related to how long customers have to wait before they see fixes. The largest area represents the best opportunity to shorten cycle time. MR = marketing review, LC = Lab classification, KP = Known Problem, AD = waiting for data, QA = final quality assurance testing, AR = awaiting release Misc.: None References: Robert B. Grady, Practical Software Metrics for Project Management and Process Improvement, page 202, figure 15-8
  39. Main Point: Quick personal highlights Notes: A very brief note about myself: I have primarily been on the front end of software projects and all through process improvement efforts. While the end game of software development often has the frustration of not having enough time to get good code out, the early parts of a software development project have the frustration of not knowing what is going on! Construx is Steve McConnell’s company. Misc.: None Reference: None
  40. Main Point: Contact information for the instructor Notes: Say: really appreciate it if you fill out the seminar eval form; thanks for coming; good questions! Hope to see you in the future. Lack of data is sometimes as insightful as having data – from Earl. Misc: LAST SLIDE! Reference: