EuroSTAR Software Testing Conference 2008 presentation "Are Real Test Metrics Predictive for the Future?" by Rob Baarda. See more at conferences.eurostarsoftwaretesting.com/past-presentations/
3. Which test metrics?
[Diagram: the test process flow, from test basis via specifying test cases/scripts and test execution to defects, repair, and production, alongside the test object]
Metrics on artefacts:
> Size of test basis
> Size of test object
> # defects in test basis
> # defects in test object
> # defects in production
> # repair rounds
For each process:
> # hours effort
> lead time
> # test cases
(# = number of)
4. Derived metrics
• Effort estimation = # hours / size (FP, KLOC)
• Productivity = # test cases / # hours
• Efficiency = # defects / (# hours or # test cases)
> Specification
> Test execution
> Retest of repaired defects
• DDP: Defect Detection Percentage (Europe)
DRE: Defect Removal Efficiency (USA)
• Defect injection rate for rework
• Damage prevented in €?
HOW to get the data?
HOW to organize it?
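A small sketch of how these derived metrics fall out of the raw counts from the previous slide; all input numbers below are invented for illustration:

```python
# Raw counts as collected per test process (invented example values).
hours = 400.0                      # test effort in hours
size_fp = 250.0                    # size of the test object in function points
test_cases = 180
defects_found_in_test = 45
defects_found_in_production = 5

# Derived metrics from the slide.
effort_per_fp = hours / size_fp            # basis for effort estimation
productivity = test_cases / hours          # test cases per hour
efficiency = defects_found_in_test / hours # defects found per hour

# DDP (Europe) / DRE (USA): the share of defects caught before production.
ddp = 100 * defects_found_in_test / (
    defects_found_in_test + defects_found_in_production
)

print(f"hours/FP={effort_per_fp:.2f}, productivity={productivity:.2f} tc/h, "
      f"efficiency={efficiency:.3f} defects/h, DDP={ddp:.1f}%")
```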
5. Dutch test metrics experiences
• A Dutch initiative to gather test metrics
• Parties involved:
> NESMA (Netherlands Software Metrics Association)
> TestNet (Dutch testing community)
> LaQuSO (Laboratory for Quality Software, Universities of Eindhoven & Nijmegen)
• Approach: Goal Question Metric (GQM)
6. Goals
1. Test manager
Support in planning and controlling the testing project
2. Organization
Benchmark around:
1. Test process
2. Test products
3. IT products
to improve the test process and the IT process
7. Some Questions
• Test manager
> How many test cases are needed for my project?
> What percentage of the project team should be allocated to testing?
> How many retests are executed?
• Organization Benchmark
> What is the defect detection & removal efficiency (at which phase)?
> What test coverage do I need to ensure adequate testing?
> How many defects does development inject when repairing other defects?
8. What we have - Structure
[Diagram: data structure with entities Project, Project activity, Test project, Test activity, and Incidents/Defects]
Be careful using the data
Lack of statistical evidence
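A possible encoding of this structure as a data model; the nesting (project holds project activities and a test project, a test project holds test activities, defects hang off test activities) and all field names are assumptions, since the slide only names the entities:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Defect:            # "Incidents/Defects" on the slide
    identifier: str
    found_in_phase: str

@dataclass
class TestActivity:
    name: str            # e.g. "specification", "execution", "retest"
    hours: float
    test_cases: int
    defects: list = field(default_factory=list)  # list of Defect

@dataclass
class TestProject:
    activities: list = field(default_factory=list)  # list of TestActivity

@dataclass
class ProjectActivity:
    name: str
    hours: float

@dataclass
class Project:
    size_fp: float
    activities: list = field(default_factory=list)  # list of ProjectActivity
    test_project: Optional[TestProject] = None
```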
9. Feedback example: test effort
[Chart: % test effort / project effort (0 to 60%) plotted against project effort in man-days (0 to 3000)]
Be careful using the data
Lack of statistical evidence
10. Feedback example: test productivity
[Chart: # test hours per FP (0 to 18) plotted against size in function points (0 to 600)]
Be careful using the data
Lack of statistical evidence
11. Feedback example: defects per FP
[Chart: number of defects per function point (0 to 1.2) plotted against size in function points (0 to 900)]
Be careful using the data
Lack of statistical evidence
12. Defect Detection Percentage / Defect Removal Efficiency
[Chart: DDP (%) between 84 and 92 plotted against size in SKLOC (0 to 1000)]
Be careful using the data
Lack of statistical evidence
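For reference, DDP is commonly defined (consistently with its use on this slide, though the slides do not spell the formula out) as the share of defects found by testing out of all defects eventually found:

```latex
\mathrm{DDP} = \frac{\#\,\text{defects found by testing}}
                    {\#\,\text{defects found by testing} + \#\,\text{defects found in production}}
              \times 100\%
```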
13. Processes around metrics
• Collection in a project
> Embedded in daily work
> Weekly summarisation
> Sanity checks
> Cost: about 2% of the project budget
• Distribution
• For a benchmark at the level of:
> Project releases
> Organisation
> Country
> International: www.ISBSG.org
(International Software Benchmarking Standards Group)
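A minimal sketch of the kind of weekly sanity checks mentioned above; the record fields and the rules are assumptions, not prescribed by the slides:

```python
# Weekly metrics records as they might come out of daily collection
# (invented example data).
records = [
    {"week": 12, "hours": 38.0, "test_cases": 25, "defects": 4},
    {"week": 13, "hours": -2.0, "test_cases": 10, "defects": 1},  # bad entry
    {"week": 14, "hours": 41.0, "test_cases": 0,  "defects": 7},  # suspicious
]

def sanity_check(record: dict) -> list:
    """Return a list of problems found in one weekly record."""
    problems = []
    if record["hours"] <= 0:
        problems.append("non-positive hours")
    if record["defects"] > 0 and record["test_cases"] == 0:
        problems.append("defects reported without executed test cases")
    return problems

for record in records:
    for problem in sanity_check(record):
        print(f"week {record['week']}: {problem}")
```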
14. Some Considerations for future use
1. Accuracy of definitions
2. Number of types of defects
3. Is a batch test case the same as an online test case?
4. Only testing of functionality, or also security, performance, and usability?
5. How to include regression testing?
6. Measure personal productivity?
7. Predictive value: average (mean), median, standard deviation, correlations with what? Is a prediction model needed?
17. Some statistics for ST hours/FP
• Average = 4.0
• Standard deviation = 1.8
• As a predictor: 68% will be between 2.2 and 5.8
• Does not really help as a prediction
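The interval quoted above is the one-sigma band (mean plus or minus one standard deviation), which covers about 68% of values under a rough normality assumption. A minimal sketch, with invented per-project data since the slide only reports the aggregates:

```python
import statistics

# Hypothetical per-project system-test effort figures (hours per function
# point); the slide only gives the aggregate mean (4.0) and st. dev. (1.8).
hours_per_fp = [2.5, 3.1, 6.0, 4.4, 2.2, 5.8, 3.9, 6.5, 2.8, 4.1]

mean = statistics.mean(hours_per_fp)
sd = statistics.stdev(hours_per_fp)  # sample standard deviation

# Under a (rough) normality assumption, ~68% of projects fall within one
# standard deviation of the mean: the kind of interval quoted on the slide.
low, high = mean - sd, mean + sd
print(f"mean={mean:.1f} sd={sd:.1f} -> 68% interval: [{low:.1f}, {high:.1f}]")
```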
19. And now
• Non-GIS
> Average = 3.1
> Standard deviation = 0.8
> As a predictor: 68% between 2.3 and 3.9
• GIS
> Average = 6.2
> Standard deviation = 1.4
> As a predictor: 68% between 4.8 and 7.6
• Overall
> Average = 4.0
> Standard deviation = 1.8
> As a predictor: 68% between 2.2 and 5.8
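Splitting the sample by project category is what shrinks the prediction interval. A sketch of that segmentation step, with invented (category, hours-per-FP) pairs since the slides only give the per-group aggregates:

```python
import statistics
from collections import defaultdict

# Invented (category, hours-per-FP) pairs for illustration only.
projects = [
    ("non-GIS", 2.9), ("non-GIS", 3.4), ("non-GIS", 2.4), ("non-GIS", 3.8),
    ("GIS", 5.1), ("GIS", 7.0), ("GIS", 6.4), ("GIS", 5.9),
]

groups = defaultdict(list)
for category, hours_per_fp in projects:
    groups[category].append(hours_per_fp)

for category, values in groups.items():
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    # One-sigma interval: ~68% coverage under a normality assumption.
    print(f"{category}: mean={mean:.1f} sd={sd:.1f} "
          f"68% interval=[{mean - sd:.1f}, {mean + sd:.1f}]")
```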
20. Use metrics in your future
1. Starting point for a project
> Use an estimation model, mostly linear, with size categories Small, Medium, Large (see the sketch after this list)
> Use "common" metrics; a possible source: Chapter 11 of the TMap® Next book
2. Look at your real project data: is it consistent with the prediction?
> Yes: GO TO End
3. Find the major influencing factor
4. Adapt your
1. Estimation model
2. Metrics
5. GO TO 2
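A minimal sketch of the "mostly linear, with size categories" estimation model from step 1. The rates and category boundaries below are invented placeholders; in practice they would be calibrated against your own historical metrics (or sources such as TMap Next chapter 11 or ISBSG data):

```python
# Assumed test hours per function point, by size category (placeholders).
HOURS_PER_FP = {
    "small": 3.0,   # e.g. up to ~200 FP
    "medium": 4.0,  # e.g. ~200-500 FP
    "large": 5.0,   # e.g. over ~500 FP
}

def size_category(function_points: float) -> str:
    # Assumed category boundaries; calibrate against your own data.
    if function_points <= 200:
        return "small"
    if function_points <= 500:
        return "medium"
    return "large"

def estimate_test_hours(function_points: float) -> float:
    """Linear estimate: hours = rate(category) * size."""
    return HOURS_PER_FP[size_category(function_points)] * function_points

print(estimate_test_hours(350))  # medium category: 4.0 * 350 = 1400.0 hours
```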
21. Wrap up
• Metrics are possible
• Useful to predict
• A linear model needs localized fine-tuning
Test Metrics can predict your future!
Editor's Notes
Extra example: estimation at the test line of Guido N.: 3 categories; round 1: an extra category (ES); round 2: tiered rules for larger volumes (= accounting for the learning effect)