This document discusses applications of statistics in software engineering. It introduces a special issue that highlights papers applying statistical methods to solve software engineering problems and improve decision making. The issue includes papers on using statistical significance testing and Bayesian belief networks for risk management, using regression splines to understand factors affecting code inspection effectiveness, using Markov chains for reliability modeling, and applying clustering techniques for software partitioning and recovery. The document emphasizes that statistical analysis can help manage uncertainties in development, but challenges remain in collecting good data and integrating these methods into practice and education.
Applications Of Statistics In Software Engineering
Khaled El Emam a, Anita D. Carleton b,*
a National Research Council of Canada, Canada
b Software Engineering Institute, Carnegie Mellon University, 4500 Fifth Avenue, Pittsburgh, Pennsylvania 15213, USA
Received 15 March 2004; received in revised form 15 March 2004; accepted 21 March 2004
Available online 6 July 2004
The last decade or so has seen an increasing number
of companies learn how to apply statistical concepts to
software development. This is evidenced by the increase
in organizational maturity over that period (Software
Engineering Institute Maturity Profile, 1999), because
higher maturity levels stipulate more and better data
collection and analysis.
In spite of this, there is still debate as to the
applicability of statistical analysis to more than a
limited subset of the many development environments in
existence today. There is not yet a full understanding
of how statistical methods can be applied in software
engineering scenarios, and to date few case studies and
examples have been published. This gap provides the
motivation for this special issue.
Papers presenting examples of applying statistical
methods to solve software engineering problems and
improve decision making are highlighted in this special
issue. Also of interest in this special issue are method-
ological studies that evaluate accuracy, utility, and
assumptions of statistical methods in software engi-
neering contexts.
The following papers are showcased here:
• ‘‘Statistical Significance Testing––a Panacea for Soft-
ware Technology Experiments?’’ By James Miller,
University of Alberta, Edmonton,
• ‘‘Bayesian Belief Network (BBN)-based Software
Project Risk Management’’ By Chin-Feng Fan and
Yuan-Chang Yu, Yuan-Ze University, Taiwan,
• ‘‘Using Multiple Adaptive Regression Splines to Sup-
port Decision Making in Code Inspections’’ By Lio-
nel Briand, Carleton University; Bernd Freimut,
Fraunhofer Institute for Experimental Software
Engineering; and Ferdinand Vollei, Siemens AG,
• ‘‘Computing System Reliability Using Markov Chain
Usage Models’’ By S.J. Prowell and J.H. Poore,
University of Tennessee,
• ‘‘Applications of Clustering Techniques to Software
Partitioning, Recovery and Restructuring, and De-
coupling’’ By Chung-Horng Lung, Carleton Univer-
sity and Marzia Zaman, Nortel Networks.
This special issue begins with ‘‘Statistical Significance
Testing––a Panacea for Software Technology Experi-
ments?’’ by James Miller. It examines whether statistical
significance testing, initially designed for testing
hypotheses in a different area, is applicable to empirical
software engineering research. This paper addresses
some of the issues that result from doing this:
• formulating hypotheses,
• calculating probability values and the associated
cut-off value,
• and constructing the sample and its distribution.
There is also a discussion of which analysis ap-
proaches are preferable under which conditions.
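To make the notions of a probability value and its cut-off concrete, the following minimal Python sketch runs an exact permutation test on the difference in means between two small samples. It is an illustration only, not the analysis from Miller's paper: the function name `permutation_p_value`, the defect counts, and the 0.05 cut-off are all assumptions chosen for this example.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    extreme = 0
    total = 0
    # Enumerate every way of relabelling the pooled data into two groups
    # of the original sizes (under the null hypothesis labels do not matter).
    for chosen in combinations(range(len(pooled)), len(group_a)):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen_set]
        total += 1
        # Small tolerance guards against floating-point ties.
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical data: defects found per inspection with two techniques.
technique_a = [1, 2, 3]
technique_b = [4, 5, 6]
ALPHA = 0.05  # assumed cut-off; the value is a convention, not a law

p = permutation_p_value(technique_a, technique_b)
verdict = "reject" if p < ALPHA else "fail to reject"
print(f"p = {p:.3f}: {verdict} the null hypothesis at alpha = {ALPHA}")
```

With only three observations per group, even completely separated samples cannot produce a probability value below the assumed cut-off, which is one way to see why sample construction matters as much as the test itself.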
* Corresponding author. Tel.: +1-412-2687718; fax: +1-412-2685758.
E-mail addresses: khaled.el-emam@nrc-cnrc.gc.ca (K. El Emam), adc@sei.cmu.edu (A.D. Carleton).
0164-1212/$ - see front matter © 2004 Elsevier Inc. All rights reserved.
doi:10.1016/j.jss.2004.03.030
The Journal of Systems and Software 73 (2004) 181–182
www.elsevier.com/locate/jss
Many uncertainties exist in software development
processes and products, for example in estimating
project size, schedule, and quality, and in determining
resource allocation. While current software engineering
practices cannot eliminate all of these uncertainties,
focusing on risk management can be enormously helpful.
The next paper, ‘‘Bayesian Belief Network (BBN)-based
Software Project Risk Management’’ by Chin-Feng Fan and
Yuan-Chang Yu, shows that BBNs can be used in risk
management processes to make risk management
quantitative and more objective. A theoretical model is
defined to provide insights into risk management. Based
on these insights, a BBN-based procedure using a
feedback loop has been developed to predict potential
risks, identify sources of risks, and advise dynamic
resource adjustment. This approach facilitates the
visibility and repeatability of the decision-making
process of risk management. Several analytical and
simulated cases are presented.
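The core operation behind any belief network is updating a belief once evidence is observed. The sketch below shows that update for the smallest possible network, two nodes with a single arc from a risk to a warning sign. It is not the procedure from Fan and Yu's paper: the function name `posterior_risk`, the choice of nodes, and every probability are hypothetical values picked for illustration.

```python
def posterior_risk(prior, p_sign_given_risk, p_sign_given_no_risk):
    """P(risk | warning sign observed) for a two-node network Risk -> Sign."""
    joint_risk = prior * p_sign_given_risk
    joint_no_risk = (1.0 - prior) * p_sign_given_no_risk
    # Bayes' rule: normalise the joint probability by the total
    # probability of observing the warning sign.
    return joint_risk / (joint_risk + joint_no_risk)


# Hypothetical numbers, not taken from the paper.
PRIOR_SCHEDULE_RISK = 0.2      # belief in a schedule risk before evidence
P_CHURN_GIVEN_RISK = 0.9       # requirements churn seen when at risk
P_CHURN_GIVEN_NO_RISK = 0.3    # churn seen even when not at risk

updated = posterior_risk(
    PRIOR_SCHEDULE_RISK, P_CHURN_GIVEN_RISK, P_CHURN_GIVEN_NO_RISK
)
print(f"P(schedule risk | churn observed) = {updated:.3f}")
```

Under these assumed numbers, observing churn roughly doubles the belief in a schedule risk. A feedback loop of the kind the paper describes would repeat such updates as new project evidence arrives.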
The next paper ‘‘Using Multiple Adaptive Regression
Splines to Support Decision Making in Code Inspec-
tions’’ by Lionel Briand, Bernd Freimut, and Ferdinand
Vollei examines the factors that affect inspection effec-
tiveness (the rate of detected defects) in a given envi-
ronment, based on project data. Data was collected
from over 230 code inspections and a multivariate sta-
tistical analysis was performed in order to look at how
management factors, such as the effort assigned and the
inspection rate, affect inspection effectiveness. Because
the functional form of effectiveness models is a priori
unknown, they used a novel exploratory analysis tech-
nique: Multiple Adaptive Regression Splines (MARS).
They compared the MARS model with more classical
regression models and showed how it could help
understand the complex trends and interactions in the
data. Results are reported and discussed in light of
existing studies.
The next paper, ‘‘Computing System Reliability Using
Markov Chain Usage Models’’ by S.J. Prowell and J.H.
Poore, addresses the use of Markov chain models for
test planning and analysis. Markov chains have been
used successfully to model software use, generate
tests, and compute statistics about software used in
the field. A number of reliability models have been
used for Markov chain-based testing, but each has its
own limitations. This paper discusses a Bayesian
reliability model that is gaining support in the
community, focusing specifically on the arc-based
Bayesian model.
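As a rough illustration of how a Markov chain can model software use and generate tests, the sketch below walks a small usage model from its start state to its end state, choosing each next step according to the arc probabilities. The states, the probabilities, and the function name `generate_test_case` are hypothetical, and the sketch does not implement the arc-based Bayesian reliability model that the paper discusses.

```python
import random

# Hypothetical usage model: each state maps to (next_state, probability)
# arcs. The outgoing probabilities of every state sum to 1.
USAGE_MODEL = {
    "Invoke": [("MainMenu", 1.0)],
    "MainMenu": [("Edit", 0.6), ("Print", 0.3), ("Terminate", 0.1)],
    "Edit": [("MainMenu", 0.8), ("Print", 0.2)],
    "Print": [("MainMenu", 1.0)],
}


def generate_test_case(model, rng, start="Invoke", end="Terminate"):
    """Random walk over the usage model; the visited states form one test."""
    state = start
    path = [state]
    while state != end:
        next_states, weights = zip(*model[state])
        # Pick the next state in proportion to its arc probability.
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


rng = random.Random(2004)  # fixed seed so the generated tests are repeatable
for _ in range(3):
    print(" -> ".join(generate_test_case(USAGE_MODEL, rng)))
```

Because the tests are sampled from the usage probabilities, frequently used paths are exercised more often, which is what allows statistics about field use to be estimated from test results.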
The final paper in this special issue is ‘‘Applications
of Clustering Techniques to Software Partitioning,
Recovery and Restructuring, and Decoupling’’ by
Chung-Horng Lung and Marzia Zaman. This paper presents
studies of applying the numerical taxonomy clustering
technique to software applications. The objective is to
improve design, evaluation, and evolution. Numerical
taxonomy is mathematically relatively simple and yet it
is a useful mechanism for component clustering and
software partitioning. This technique can be useful when
applied to different levels of abstraction or to different
software life-cycle phases. This paper provides an
introduction to numerical taxonomy and discusses
experiences of applying the approach.
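The general idea of clustering components by their resemblance can be sketched in a few lines. The example below scores each pair of components with the Jaccard coefficient over the features they use and groups components whose similarity reaches a threshold (single linkage). The component names, the feature sets, the threshold, and the choice of coefficient are assumptions made for this illustration; the paper may use different resemblance measures and clustering methods.

```python
from itertools import combinations


def jaccard(a, b):
    """Shared features divided by all features used by either component."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster(features, threshold):
    """Group components linked by a pairwise similarity >= threshold."""
    parent = {name: name for name in features}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(features, 2):
        if jaccard(features[a], features[b]) >= threshold:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for name in features:
        groups.setdefault(find(name), set()).add(name)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical components and the data items or services each one uses.
COMPONENTS = {
    "parser": {"token", "ast"},
    "lexer": {"token", "buffer"},
    "logger": {"file", "format"},
    "report": {"file", "format", "ast"},
}

print(cluster(COMPONENTS, threshold=0.3))
```

Lowering the threshold merges more components into fewer partitions, so the threshold acts as the knob that trades cohesion within a partition against the number of partitions.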
As organizations seek to improve their software
engineering processes, they are turning to quantitative
measurement and analysis methods. Statistical process
control (SPC), a discipline that is common in
manufacturing and industrial environments but has only
recently received attention as an aid for software
engineering (Florac and Carleton, 1999; Florac et al.,
2000; Keeni, 2000), has been generating some interest,
as have six-sigma applications (Card, 2000; Pavlik et
al., 2000; Purcell, 2000) and capture/recapture methods
(Barnard et al., 2003). We hope that these areas will
see further research and application, and that they
will yield articles in the future. Effective use of
these applications requires a detailed understanding of
processes and a willingness to pursue exploratory anal-
ysis. As with anything new, there is a learning curve. To
learn how to use a specific method or technology, one
needs to be willing to conduct research, try things, make
mistakes, and try again. Knowing and understanding
the process is fundamental; consistency in data collec-
tion and reporting is imperative; and clarifying and
understanding how the data is defined is crucial to
knowing what the data represents.
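As one small example of what SPC looks like in practice, the sketch below computes the limits of an individuals and moving range (XmR) chart, a common control chart for data that arrive one observation at a time. The function name `xmr_limits` and the inspection data are hypothetical; 2.66 is the standard chart constant for moving ranges of two consecutive points.

```python
from statistics import mean


def xmr_limits(values):
    """Return (lower limit, center line, upper limit) of an individuals chart."""
    # Moving range: absolute difference between consecutive observations.
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    center = mean(values)
    spread = 2.66 * mean(moving_ranges)
    return center - spread, center, center + spread


# Hypothetical baseline: defects found in five consecutive inspections.
baseline = [10, 12, 11, 13, 12]
lower, center, upper = xmr_limits(baseline)
print(f"LCL = {lower:.2f}, CL = {center:.2f}, UCL = {upper:.2f}")

# A new observation outside the limits signals that the process may
# have changed and deserves investigation.
new_observation = 17
print("investigate" if not lower <= new_observation <= upper else "in control")
```

The limits are only as meaningful as the data behind them, which is why consistent data collection and clear data definitions come before any chart.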
Transitioning some of these concepts and techniques
into actual software engineering practice remains a
challenge. Many organizations do not collect
appropriate data about their products and processes.
Good data is a prerequisite to good analysis. Also,
software engineering curricula at universities need to
emphasize data
collection and analysis topics, perhaps through joint
efforts with statistics departments. It takes a long time
for graduates to unlearn the data-less decision making
practices that they were taught during their formal
education.
References
Barnard, J., El Emam, K., Zubrow, D., 2003. Using capture-recapture
models for the reinspection decision. Software Quality Professional
5 (2), 11–20.
Card, D., 2000. Sorting out six sigma and the CMM. IEEE Software
(May/June).
Florac, W.A., Carleton, A.D., 1999. Measuring the Software Process:
Statistical Process Control for Software Process Improvement.
Addison-Wesley, Reading, MA.
Florac, W.A., Carleton, A.D., Barnard, J.R., 2000. Statistical process
control: analyzing a space shuttle onboard software process. IEEE
Software (July/August).
Keeni, G., 2000. The evolution of quality processes at Tata
Consultancy Services. IEEE Software (July/August).
Pavlik, R., Riall, C., Janiszewski, S., 2000. Deploying PSP(SM),
TSP(SM), and six sigma plus at Honeywell. Honeywell Air
Transport, Software Engineering Process Group 2000 Conference
Proceedings.
Purcell, L., 2000. Experiences using six sigma in a SW-CMM based
process improvement program. Northrop Grumman, Software
Engineering Process Group 2000 Conference Proceedings.
Software Engineering Institute Maturity Profile. Available from
http://www.sei.cmu.edu/sema/profile.html.
Khaled El Emam is a Senior Research Officer at the National Research
Council of Canada. He is also Chief Scientist at TrialStat Corporation,
and a Senior Investigator at the CHEO Research Institute. Khaled
obtained his PhD from King’s College, University of London (UK) in
1994.
Anita D. Carleton is a Senior Member of the Technical Staff at the
Software Engineering Institute, Carnegie Mellon University. She
helped to launch the software measurement initiative at the SEI in
1988. She is currently working on the Team Software Process (TSP)
initiative. Carleton co-authored the book Measuring the Software
Process: Statistical Process Control for Software Process Improvement,
published by Addison-Wesley in June 1999.