2. “There are lies, damned lies and statistics”
Attributed by Mark Twain to Benjamin Disraeli, British Prime Minister, 1874–1880
3. Software metrics don’t have a good reputation either
• Low success rate of long-term software metrics programmes
• Few senior managers understand, use and trust metrics
• Too many non-standardized sizing methods
• Project estimating via expert judgement or guesswork is more common than using reliable historic data
Why?
4. Agenda
• Don’t believe the hype-merchants
• Many analyses of software metrics are flawed
• Some mistakes I have made
• Some conclusions on how to analyse and use project data to get useful and trusted results
5. Capers Jones: Master of hype
‘Function Point metrics are the most accurate and effective metrics yet developed for software sizing and also for studying software productivity, quality ... (etc)’. 1)

ESTIMATING ACCURACY BY METRICS USED 2)
                                      Manual   Automated
IFPUG function points with SNAP         5%        5%
IFPUG function points without SNAP     10%        7%
COSMIC function points                 10%        7%
etc. (11 other methods)

‘The current state of software metrics and measurement practices in 2014 is a professional embarrassment’ 17)
6. Be careful about claims for automatic or fast FP sizing
1. “CAST Automated Function Points (AFP) capability is an automatic function points counting method based on the rules defined by the IFPUG. CAST automates this counting process by using the structural information retrieved by source code analysis, database structure and transactions.” 3)
(AFP does not measure based on the IFPUG rules. 4))
2. Do not trust any fast method of measuring FP sizes unless you know how the method was calibrated and that it is valid for the software you are measuring.
7. Hype can be very expensive
2011 HYPE: “The benefits of this change (adopting Agile) can improve delivery performance, in terms of cost, quality and speed, by a factor of 20” 5)
(Recommendation for UK public sector to adopt Agile methods, 2011)
2011 NAIVETY: “Government will apply agile methods to ICT procurement and delivery to reduce the risk of project failure” 6)
(UK Government ICT Strategy, March 2011)
2014 DISASTER: ‘Universal Credit’ project stopped for a ‘reset’
Cost to UK taxpayer so far: £180 million
8. Agenda
• Don’t believe the hype-merchants
• Many analyses of software metrics are flawed
• Some mistakes I have made
• Some conclusions on how to analyse and use project data to get useful and trusted results
9. Beware of non-standard sizing methods, doubtful conversion factors & uncalibrated estimating tools 7)
[Diagram: a chain of size conversions feeding effort estimation: counts (e.g. of Use Cases, User Stories) → approximate CFPs / approximate FPs → COSMIC CFPs / IFPUG FPs → SLOC → ISBSG, COCOMO & commercial estimating tools → effort estimate, with quoted uncertainties of ±10%, ±10%, ±14%, ±14%, ±32% and ±37% at the various conversion steps.]
Error propagation can lead to huge estimating errors 8)
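To see how the chain compounds, here is a minimal sketch (my addition, not from the original slides) combining an illustrative selection of the step errors quoted in the diagram; the independence assumption and the root-sum-of-squares rule are mine, in the spirit of reference 8).

```python
import math

# Relative (1-sigma) errors for three conversion steps, taken from the
# percentages quoted on the slide. Assumption: the step errors are
# independent, so relative errors combine as root-sum-of-squares.
steps = {
    "counts -> approx. FPs": 0.10,
    "approx. FPs -> IFPUG FPs": 0.14,
    "FPs -> SLOC": 0.37,
}

combined = math.sqrt(sum(e ** 2 for e in steps.values()))
print(f"combined relative error: +/-{combined:.0%}")
# ~ +/-41% before the estimating tool's own error is even added;
# a worst case (simple sum of the errors) gives +/-61%.
```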
10. Avoid compound indices; they destroy information
1. Putnam’s ‘productivity index’ 9)
   PI = size / (effort^(1/3) × duration^(4/3))
gives less insight than separate measures for:
   productivity = size / effort
   speed = size / duration
(see the sketch below)
2. All attempts to measure a size of Non-Functional Requirements (VAF, TCA*, SNAP) produce meaningless numbers, and will eventually fail
* (Mea culpa!)
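A minimal sketch of the information loss (my own illustration, with invented numbers): two projects with very different productivity and delivery speed can produce an identical PI.

```python
# Two hypothetical projects with the same Putnam-style PI but very
# different productivity and speed (numbers invented for illustration).
projects = {
    "A": {"size": 1000, "effort": 100, "duration": 10},   # UFP, work-months, months
    "B": {"size": 1000, "effort": 1600, "duration": 5},
}

for name, p in projects.items():
    pi = p["size"] / (p["effort"] ** (1 / 3) * p["duration"] ** (4 / 3))
    productivity = p["size"] / p["effort"]   # UFP per work-month
    speed = p["size"] / p["duration"]        # UFP per month
    print(f"{name}: PI={pi:.1f}  productivity={productivity:.2f}  speed={speed:.0f}")

# Both projects report PI = 10.0, yet A is 16x more productive and B
# delivers 2x faster -- the compound index hides exactly what a manager
# would need to know.
```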
11. The ISBSG published a report showing software development productivity has declined over 20 years – probably not true!
[Chart: ISBSG analysis of 1,172 projects over 20 years 10)]
12. Examine more closely: the project mix changed significantly over the 20 years
When roughly corrected for the change in mix, productivity has not changed much over the 20 years 11)
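The mix effect is a form of Simpson’s paradox. A minimal sketch with invented numbers: within each project type productivity stays constant, yet the portfolio average ‘declines’ as the mix shifts toward a lower-productivity type.

```python
# Hypothetical illustration of Simpson's paradox in a project portfolio:
# per-type productivity (UFP per work-month) is constant across both
# decades; only the mix of project types changes.
productivity = {"simple MIS": 20.0, "complex real-time": 5.0}

# Share of each project type in the portfolio, by decade (invented).
mix = {
    "1990s": {"simple MIS": 0.8, "complex real-time": 0.2},
    "2000s": {"simple MIS": 0.3, "complex real-time": 0.7},
}

for decade, shares in mix.items():
    avg = sum(productivity[t] * share for t, share in shares.items())
    print(f"{decade}: portfolio average = {avg:.1f} UFP/WM")

# 1990s: 17.0, 2000s: 9.5 -- a 44% apparent "decline" with zero change
# in how productive either kind of project actually is.
```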
13. Agenda
• Don’t believe the hype-merchants
• Many analyses of software metrics are flawed
• Some mistakes I have made
• Some conclusions on how to analyse and use project data to get useful and trusted results
14. I published a paper on the effort/duration trade-off relationship 12)
The way of presenting the data is helpful ….
[Chart: log–log plot of Relative Effort vs Relative Duration, with projects grouped by size band (SL < 2, 2 < SL < 5, 5 < SL < 8, 8 < SL < 16, SL > 20) and the quadrants labelled Inefficient/Fast, Inefficient/Slow, Efficient/Fast and Efficient/Slow.]
15. ….but the analysis of the relationship, relying on the Putnam model, is flawed! 13)
[Charts: four theories of the effort/duration relationship for schedule expansion, each plotting Relative Effort vs Relative Duration: the Putnam model (labelled ‘Unlikely’) versus SEER-SEM, True-S and COCOMO II (labelled ‘Much more likely’).]
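For context (my addition, not on the original slide), the extreme shape of the Putnam curve follows directly from the PI formula on slide 10. Holding size and PI fixed:

   PI = size / (effort^(1/3) × duration^(4/3))
   ⇒ effort^(1/3) × duration^(4/3) = constant
   ⇒ effort ∝ duration^(-4)

So stretching the schedule by just 10% would cut predicted effort by about 32% (1.1^(-4) ≈ 0.68), a far steeper trade-off than the other three models show, and the one that reference 13) argues the data does not support.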
16. I published data showing an economy of scale of productivity with size. 14)
The analysis is flawed.
[Chart: new development projects, 25%/50%/75% percentiles of productivity (UFP/WM) per size band (UFP): 0–50, 50–100, 100–200, 200–300, 300–500, 500–1000, 1000–2000, 2000+.]
(IFPUG-measured new development projects from ISBSG. Same result for COSMIC-measured projects)
17. But plotting productivity vs effort for the same projects shows a diseconomy of scale *
[Chart: new development projects, 25%/50%/75% percentiles of productivity (UFP/WM) per effort band.]
* For the explanation, see 15)
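Reference 15) explains the contradiction: productivity = size / effort, so binning by size puts the numerator in the bin variable while binning by effort puts the denominator there, and measurement noise alone produces opposite apparent scale effects. A minimal simulation (mine, with invented parameters):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
# Hypothetical model: a latent "true amount of work" W drives both
# measured size and effort, each with independent lognormal noise.
W = rng.lognormal(mean=5.0, sigma=1.0, size=n)
size = W * rng.lognormal(sigma=0.4, size=n)           # measured UFP
effort = 0.1 * W * rng.lognormal(sigma=0.4, size=n)   # work-months
prod = size / effort                                  # UFP per work-month

def median_by_quartile(bin_var):
    """Median productivity within each quartile band of bin_var."""
    q = np.quantile(bin_var, [0.25, 0.5, 0.75])
    bands = np.digitize(bin_var, q)
    return [round(float(np.median(prod[bands == b])), 1) for b in range(4)]

print("median productivity by size band:  ", median_by_quartile(size))
print("median productivity by effort band:", median_by_quartile(effort))
# Same data: productivity appears to *rise* with size (economy of scale)
# and *fall* with effort (diseconomy) -- both are artefacts of dividing
# by a noisy quantity correlated with the binning variable.
```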
18. The best way to explore any relationship such as effort vs size is to plot data for your own homogeneous project datasets
[Charts: two scatter plots of Effort (work-hours) vs Size (UFP).
Not very informative – multiple sources, mixed technologies; best fits y = 6.8466x + 2084 (R² = 0.2451) and y = 27.06x^0.7916 (R² = 0.4301).
Useful information 16) – single company, single set of technologies.]
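A minimal sketch of how to fit the two model forms shown above to your own data (the dataset here is invented; the power-law fit is done as a straight line in log–log space):

```python
import numpy as np

# Hypothetical (size, effort) pairs for one company's projects -- in
# practice, substitute your own homogeneous dataset as the slide advises.
size = np.array([120, 250, 400, 650, 900, 1400, 2100, 3200])      # UFP
effort = np.array([2100, 4000, 5500, 9000, 11000, 17000, 23000, 34000])  # work-hours

def r_squared(y, y_hat):
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1 - ss_res / ss_tot

# Linear model: effort = a * size + b
a, b = np.polyfit(size, effort, 1)
print(f"linear: y = {a:.3f}x + {b:.0f}, R^2 = {r_squared(effort, a * size + b):.3f}")

# Power model: effort = c * size^k, fitted in log-log space
k, log_c = np.polyfit(np.log(size), np.log(effort), 1)
c = np.exp(log_c)
print(f"power:  y = {c:.2f}x^{k:.4f}, R^2 = {r_squared(effort, c * size ** k):.3f}")
```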
19. Agenda
• Don’t believe the hype-merchants
• Many analyses of software metrics are flawed
• Some mistakes I have made
• Some conclusions on how to analyse and use project data to get useful and trusted results
20. Conclusions
• Don’t believe everything you read in the literature
• Do measure what matters in your organization (not what is easy to measure)
• Do collect, check and analyse your own data. Be honest about its accuracy
• Do be patient. It takes time to collect good data. There are no quick and easy answers
• Do master basic statistical methods – but sophisticated statistical analysis of poor data is unprofessional
• Don’t automatically discard outliers. First explain them
• Do be cautious if using external data, e.g. benchmarks
• Do keep it simple
• Do explore your data … but think, think, think!
21. Thank you for your attention
www.cosmicon.com
cr.symons@btinternet.com
22. References
1. ‘Function Points as a Universal Software Metric’, Capers Jones, July 2013, distributed in a CAI e-mail on March 20th 2014
2. ‘Keys to success: software measurement, software estimating, software quality’, presentation by Capers Jones, October 9, 2012
3. www.castsoftware.com/products/automated-function-points , September 2014
4. AFP relies on the OMG Automated Function Point Standard http://www.omg.org/spec/AFP , which does not distinguish EO’s and EQ’s.
5. ‘System Error: Fixing the flaws in government IT’, Institute for Government, March 2011
6. ‘Government ICT Strategy’, Cabinet Office (UK), March 2011
7. ‘From requirements to project effort estimates – work in progress (still?)’, Cigdem Gencel, Charles Symons, REFSQ Conference, Essen, Germany, April 2013
8. ‘Error Propagation in Software Measurement and Estimation’, Luca Santillo, 16th International Workshop on Software Measurement, Potsdam, Germany, 2006
9. ‘Familiar Metric Management – The Effort-Time Tradeoff: It’s in the Data’, Lawrence H. Putnam, Ware Myers, www.qsm.com
10. ‘Software Industry Performance Report’, ISBSG, August 2011
11. ‘Measures to get the best performance from your software suppliers’, Charles Symons, UKSMA Conference, 8th November 2012, www.uksma.co.uk
(Continued)
23. References (contd)
12. ‘Exploring the software project effort versus duration tradeoffs’, IEEE Software, July/August 2012
13. ‘The effect of project duration on effort in software development projects’, Han Suelman, IEEE Transactions on Software Engineering, 2013
14. ‘The performance of business application, real-time and component software projects: an analysis of COSMIC-measured projects in the ISBSG database’, March 2012, www.isbsg.org
15. ‘Interpretation problems related to the use of regression models to decide on economy of scale in software development’, Magne Jorgensen, Barbara Kitchenham, Journal of Systems & Software, 85 (2012)
16. From ‘Software Project Estimation’, Alain Abran, 2014, www.ca.wiley.com
17. ‘The mess of software metrics’, v5.0, Capers Jones, September 16, 2014