This document describes a machine learning model for software defect prediction. It uses NASA software metrics data to train artificial neural network and decision tree models that predict defect density values via regression on test data. Experimental results show that, relative to the variance in the data, neither the ANN nor the decision tree initially produced acceptable predictions, but further experiments suggest that a two-step modeling approach could improve defect prediction performance.
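The regression step described above can be illustrated with a minimal sketch: a depth-1 decision tree (a "decision stump") fitted to hypothetical (complexity metric, defect density) pairs. The data and names here are invented for illustration and are not the NASA data the document refers to.

```python
# Minimal sketch: a depth-1 regression tree ("decision stump") predicting
# defect density from a single complexity metric. Toy data, not NASA data.

def fit_stump(xs, ys):
    """Find the split threshold minimizing total squared error."""
    best = None  # (sse, threshold, left_mean, right_mean)
    for t in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((y - lm) ** 2 for y in left)
               + sum((y - rm) ** 2 for y in right))
        if best is None or sse < best[0]:
            best = (sse, t, lm, rm)
    return best

def predict(stump, x):
    _, t, lm, rm = stump
    return lm if x <= t else rm

# Hypothetical (metric, defect density) pairs: low complexity, low density.
xs = [5, 8, 10, 40, 45, 60]
ys = [0.1, 0.2, 0.1, 1.0, 1.2, 1.1]
stump = fit_stump(xs, ys)
```

A real model would grow the tree recursively and use many metrics, but the split criterion shown here is the core of the decision tree regression the abstract mentions.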
Machine learning approaches are well suited to problems with limited information. In most cases, software domain problems can be characterized as a learning process that depends on varying circumstances and changes accordingly. A predictive model is constructed using machine learning approaches and used to classify modules as defective or non-defective. Machine learning techniques help developers retrieve useful information after classification and enable them to analyse data from different perspectives, and they have proven useful for software bug prediction. This study used publicly available data sets of software modules and provides a comparative performance analysis of different machine learning techniques for software bug prediction. Results showed that most of the machine learning methods performed well on the software bug datasets.
The Impact of Software Complexity on Cost and Quality - A Comparative Analysi... (ijseajournal)
Early prediction of software quality is important for better software planning and control. In early development phases, design complexity metrics are considered useful indicators of software testing effort and of some quality attributes. Although many studies investigate the relationship between design complexity and cost and quality, it is unclear what we have learned beyond the scope of individual studies. This paper presents a systematic review of the influence of software complexity metrics on quality attributes. We aggregated Spearman correlation coefficients from 59 data sets drawn from 57 primary studies using a tailored meta-analysis approach. We found that fault proneness and maintainability are the most frequently investigated attributes. The Chidamber & Kemerer metric suite is the most frequently used, but not all of its metrics are good indicators of quality attributes. Moreover, the impact of these metrics does not differ between proprietary and open source projects. These results have implications for building quality models across project types.
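The aggregation step above can be sketched with the standard Fisher z-transform approach to pooling correlation coefficients. The paper's tailored meta-analysis may differ; this is a minimal fixed-effect version with invented study values.

```python
import math

def pool_correlations(rs, ns):
    """Fixed-effect pooling of correlation coefficients via Fisher's z.

    rs: per-study correlations, ns: per-study sample sizes.
    """
    zs = [math.atanh(r) for r in rs]   # Fisher z-transform of each r
    ws = [n - 3 for n in ns]           # inverse-variance weights (var = 1/(n-3))
    z_bar = sum(w * z for w, z in zip(ws, zs)) / sum(ws)
    return math.tanh(z_bar)            # back-transform to the r scale

# Hypothetical correlations between a complexity metric and fault counts.
pooled = pool_correlations([0.30, 0.45, 0.55], [40, 120, 80])
```

The pooled estimate always lies between the smallest and largest study correlations, weighted toward the larger studies.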
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo... (Editor IJCATR)
Software reliability is a quantifiable metric, defined as the probability of software operating without failure for a specified period of time in a specific environment. Various software reliability growth models have been proposed to predict the reliability of software; these models help vendors predict the behaviour of the software before shipment. Reliability is predicted by estimating the parameters of the software reliability growth models. However, the model parameters generally stand in nonlinear relationships, which creates many problems in finding the optimal parameters with traditional techniques such as Maximum Likelihood Estimation and Least Squares Estimation. Various stochastic search algorithms have been introduced that make the task of parameter estimation more reliable and computationally easier. This paper explores parameter estimation of NHPP-based reliability models using MLE and using an evolutionary search algorithm called Particle Swarm Optimization.
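The PSO-based estimation can be sketched as follows: fit the two parameters of the Goel-Okumoto NHPP mean value function m(t) = a(1 - e^(-bt)) by minimizing squared error against failure data. The data here is synthetic and the PSO constants (inertia 0.7, acceleration 1.5) are common textbook choices, not necessarily those of the reviewed paper.

```python
import math, random

def m(t, a, b):
    """Goel-Okumoto mean value function: expected cumulative failures by t."""
    return a * (1.0 - math.exp(-b * t))

def sse(params, data):
    a, b = params
    return sum((m(t, a, b) - y) ** 2 for t, y in data)

def pso(data, n=30, iters=200, bounds=((1.0, 300.0), (1e-3, 1.0))):
    """Minimize sse over (a, b) with a basic particle swarm."""
    random.seed(0)
    dim = len(bounds)
    pos = [[random.uniform(*bounds[d]) for d in range(dim)] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    pbest = [p[:] for p in pos]
    pbest_f = [sse(p, data) for p in pos]
    g = min(range(n), key=lambda i: pbest_f[i])
    gbest, gbest_f = pbest[g][:], pbest_f[g]
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (0.7 * vel[i][d]
                             + 1.5 * r1 * (pbest[i][d] - pos[i][d])
                             + 1.5 * r2 * (gbest[d] - pos[i][d]))
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            f = sse(pos[i], data)
            if f < pbest_f[i]:
                pbest[i], pbest_f[i] = pos[i][:], f
                if f < gbest_f:
                    gbest, gbest_f = pos[i][:], f
    return gbest, gbest_f

# Synthetic failure data generated from a=100, b=0.1 (no noise).
data = [(t, m(t, 100.0, 0.1)) for t in range(1, 21)]
(a_hat, b_hat), err = pso(data)
```

Because the error surface is smooth in two dimensions, the swarm recovers parameters close to the generating values; MLE would instead maximize the NHPP likelihood rather than this least-squares objective.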
TOWARDS PREDICTING SOFTWARE DEFECTS WITH CLUSTERING TECHNIQUES (ijaia)
The purpose of software defect prediction is to improve the quality of a software project by building a
predictive model to decide whether a software module is or is not fault prone. In recent years, much
research in using machine learning techniques in this topic has been performed. Our aim was to evaluate
the performance of clustering techniques with feature selection schemes to address the problem of software
defect prediction. We analysed the National Aeronautics and Space Administration (NASA)
dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a
comparative analysis involving software defects prediction based on Bat, Cuckoo, Grey Wolf Optimizer
(GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models
enabled us to build an efficient predictive model with a satisfactory detection rate and acceptable number
of features.
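The clustering idea above can be illustrated with a minimal two-cluster k-means on two hypothetical module metrics. This is a stand-in for X-Means or SOM, with invented data and a deterministic min/max initialization so the result is reproducible.

```python
# Minimal sketch of centroid-based clustering on two hypothetical software
# metrics (e.g. scaled LOC and cyclomatic complexity).

def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans2(points, iters=20):
    """Two-cluster k-means with deterministic min/max initialization."""
    pts = sorted(points)
    c = [list(pts[0]), list(pts[-1])]
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [0 if dist2(p, c[0]) <= dist2(p, c[1]) else 1
                  for p in points]
        for j in (0, 1):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                c[j] = [sum(v) / len(members) for v in zip(*members)]
    return labels, c

# Hypothetical module metrics: (LOC/100, cyclomatic complexity/10).
modules = [(0.5, 0.3), (0.6, 0.4), (0.4, 0.5),   # likely clean
           (3.0, 2.5), (3.2, 2.8), (2.9, 2.6)]   # likely fault prone
labels, centers = kmeans2(modules)
```

A defect predictor built this way labels each cluster (e.g. by its metric means) rather than each module individually; feature selection such as the Bat or GWO schemes mentioned above would first choose which metrics feed the distance function.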
Software Cost Estimation Using Clustering and Ranking Scheme (Editor IJMTER)
Software cost estimation is an important task in the software design and development process.
Planning and budgeting tasks are carried out with reference to the software cost values. A variety of
software properties are used in the cost estimation process. Hardware, products, technology and
methodology factors are used in the cost estimation process. The software cost estimation quality is
measured with reference to the accuracy levels.
Software cost estimation is carried out using three types of techniques: regression-based models, analogy-based models, and machine learning models. Each model has a set of techniques for the software cost estimation process. Eleven cost estimation techniques under these three categories are used in the system. The Attribute-Relation File Format (ARFF) is used to maintain the software product property values, and the ARFF file is the main input to the system.
The proposed system is designed to perform clustering and ranking of software cost estimation methods. A non-overlapping clustering technique is enhanced with an optimal centroid estimation mechanism. The system improves the accuracy of the clustering and ranking process and produces efficient ranking results for software cost estimation methods.
Prioritizing Test Cases for Regression Testing: A Model Based Approach (IJTET Journal)
Abstract— Testing is an important phase of quality control in the Software Development Life Cycle (SDLC). Various testing methodologies are involved in testing an application. Regression testing is done to ensure that a modified feature or bug fix has not impacted existing functionality; defects are identified by executing a set of test cases. When test suites are large, regression test case selection alone cannot determine how much retesting is required to identify deviations. Test cases are therefore prioritized to change the order of execution based on severity. In the proposed model-based approach, prioritized test cases are generated from UML diagrams (sequence and state chart); the modified features are reflected in the generated model and in the number of states and transitions covered. Prioritized test cases are then clustered by severity using a dendrogram approach. This decreases the time and cost of regression testing.
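The prioritize-then-cluster step can be sketched as below: order test cases by severity, then group them with single-linkage agglomerative merging (the process a dendrogram visualizes) until a desired number of clusters remains. Test-case names and severities are hypothetical.

```python
def prioritize(cases):
    """Sort (name, severity) pairs so the most severe run first."""
    return sorted(cases, key=lambda c: c[1], reverse=True)

def cluster_by_severity(cases, k):
    """Agglomerative single-linkage clustering on 1-D severity values."""
    clusters = [[c] for c in cases]
    while len(clusters) > k:
        best = None  # (gap, i, j): closest pair of clusters so far
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                gap = min(abs(a[1] - b[1])
                          for a in clusters[i] for b in clusters[j])
                if best is None or gap < best[0]:
                    best = (gap, i, j)
        _, i, j = best
        clusters[i] += clusters.pop(j)   # merge the closest pair
    return clusters

cases = [("TC1", 9), ("TC2", 2), ("TC3", 8), ("TC4", 1), ("TC5", 5)]
ordered = prioritize(cases)
groups = cluster_by_severity(ordered, k=3)
```

High-severity cases TC1 and TC3 end up in one group and low-severity TC2 and TC4 in another, so whole groups can be scheduled or dropped together when the retest budget is tight.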
Comparative Performance Analysis of Machine Learning Techniques for Software ... (csandit)
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on publicly
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
Testability is a core quality assurance feature that combines fault prevention and fault detection. Many assessment techniques and quantification methods have evolved for software testability prediction, which identify testability weaknesses or factors and thereby help reduce test effort. This paper examines the measurement techniques that have been proposed for software testability assessment at the various phases of the object-oriented software development life cycle. The aim is to find the best metrics suite for software quality improvement through testability support. The ultimate objective is to establish the groundwork for reducing testing effort by improving software testability and its assessment, using well-planned guidelines for object-oriented software development with the help of suitable metrics.
Practical Guidelines to Improve Defect Prediction Model – A Review (inventionjournals)
Defect prediction models are used to pinpoint risky software modules and to understand past pitfalls that lead to defective modules. The predictions and insights derived from defect prediction models may not be accurate and reliable if researchers do not consider the impact of the experimental components (e.g., datasets, metrics, and classifiers) of defect prediction modeling; a lack of awareness and practical guidelines from previous research can therefore lead to invalid predictions and unreliable insights. Through case studies of systems spanning both proprietary and open-source domains, we find that (1) noise in defect datasets, (2) parameter settings of classification techniques, and (3) model validation techniques have a large impact on the predictions and insights of defect prediction models, suggesting that researchers should carefully select experimental components in order to produce more accurate and reliable defect prediction models.
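Of the three experimental components named above, model validation is the easiest to make concrete. A minimal sketch of k-fold cross-validation index generation, the validation technique most defect prediction studies rely on, follows; the fold layout is the standard contiguous scheme, not any particular paper's.

```python
def k_fold_indices(n, k):
    """Yield (train, test) index lists for k-fold cross-validation."""
    # Distribute n samples into k folds as evenly as possible.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    for i in range(k):
        test = folds[i]
        train = [idx for j, f in enumerate(folds) if j != i for idx in f]
        yield train, test

splits = list(k_fold_indices(10, 5))
```

Every sample appears in exactly one test fold, so each module is predicted by a model that never saw it during training; in practice the indices should also be shuffled or stratified to keep defective modules spread across folds.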
Reliability is concerned with decreasing faults and their impact; the earlier faults are detected, the better. This presentation therefore discusses automated techniques that use machine learning to detect faults as early as possible.
Determination of Software Release Instant of Three-Tier Client Server Softwar... (Waqas Tariq)
The quality of any software system depends mainly on how much time testing takes, what kind of testing methodologies are used, how complex the software is, the effort put in by software developers, and the type of testing environment, subject to cost and time constraints. The more time developers spend on testing, the more errors can be removed, leading to more reliable software, but testing cost also increases. Conversely, if testing time is too short, software cost can be reduced, provided customers accept the risk of buying unreliable software; however, this increases cost during the operational phase, since it is more expensive to fix an error in operation than during testing. It is therefore essential to decide when to stop testing and release the software to customers on the basis of cost and reliability assessment. In this paper we present a mechanism for deciding when to stop the testing process and release the software to end users, by developing a software cost model with a risk factor. Based on the proposed method, we specifically address how to decide when to stop testing and release software based on a three-tier client server architecture, which facilitates on-time delivery of a software product while achieving a predefined level of reliability and minimizing cost. A numerical example illustrates the experimental results, showing significant improvements over conventional statistical models based on NHPP.
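The stop-testing trade-off above can be made concrete with an illustrative cost model (not the paper's exact formulation): errors found by release time T are cheap to fix in testing, errors remaining escape to the field and cost far more, and testing time itself has a cost. All constants here are invented for illustration.

```python
import math

def expected_cost(T, a=100.0, b=0.15, c_fix=10.0, c_field=50.0, c_time=2.0):
    """Illustrative release-time cost: in-test fixes are cheap, field
    failures are expensive, and each unit of testing time costs c_time."""
    found = a * (1.0 - math.exp(-b * T))   # NHPP mean value function m(T)
    remaining = a - found                   # errors escaping to the field
    return c_fix * found + c_field * remaining + c_time * T

def best_release_time(horizon=200):
    """Grid search for the integer T minimizing expected cost."""
    return min(range(1, horizon + 1), key=expected_cost)

T_star = best_release_time()
```

The minimum balances the falling field-failure cost against the rising testing-time cost: releasing earlier leaves expensive residual errors, releasing later pays for testing that finds almost nothing new.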
A metrics suite for variable categorization to support program invariants (IJCSEA Journal)
Invariants are generally implicit. Explicitly stating program invariants helps programmers identify program properties that must be preserved while modifying the code. Existing dynamic techniques detect invariants over both relevant and irrelevant/unused variables, and thereby both relevant and irrelevant invariants in the program. The presence of irrelevant variables and invariants affects the speed and efficiency of these techniques; displaying properties of irrelevant variables and invariants also distracts the user from the properties of relevant variables. To overcome these deficiencies, only relevant variables are considered and irrelevant variables are ignored. Relevant variables are further categorized as design and non-design variables, and a metrics suite is proposed for this purpose. These metrics are validated against Weyuker's principles and applied to the RFV and JLex open source software. Similarly, relevant invariants are categorized as design, non-design, and hybrid invariants, and a set of rules is proposed for this purpose. This entire process greatly improves the speed and efficiency of dynamic invariant detection techniques.
A methodology to evaluate object oriented software systems using change requi... (ijseajournal)
It is a well known fact that software maintenance plays a major role and finds importance in the software development life cycle. As object-oriented programming has become the standard, it is very important to understand the problems of maintaining object-oriented software systems. This paper aims at evaluating object-oriented software systems through a change requirement traceability-based impact analysis methodology for non-functional requirements using functional requirements. The major issues have been related to change impact algorithms and inheritance of functionality.
In software measurement validation, assessing the validity of software metrics in software engineering is a very difficult task due to the lack of theoretical and empirical methodology [41, 44, 45]. In recent years a number of researchers have addressed the issue of validating software metrics; at present, software metrics are validated theoretically using properties of measures. Software measurement plays an important role in understanding and controlling software development practices and products. The major requirement in software measurement is that the measures must accurately represent the attributes they purport to quantify, and validation is critical to the success of software measurement. Validation is a collection of analysis and testing activities across the full life cycle that complements the efforts of other quality engineering functions, and it is a critical task in any engineering project. Its objective is to discover defects in a system and to assess whether or not the system is useful and usable in an operational situation. In software engineering, validation is one of the disciplines that help build quality into software; the major objective of the software validation process is to determine that the software performs its intended functions correctly and to provide information about its quality and reliability. This paper discusses the validation methodology, techniques, and the different properties of measures used for software metrics validation. In most cases, both theoretical and empirical validations are conducted for software metrics in software engineering [1-50].
A Software Measurement Using Artificial Neural Network and Support Vector Mac... (ijseajournal)
Today, software measurement is based on various techniques such as neural networks, genetic algorithms, and fuzzy logic. This study examines the efficiency of applying a support vector machine with a Gaussian radial basis kernel function to the software measurement problem to increase performance and accuracy. Support vector machines (SVMs) are an innovative approach to constructing learning machines that minimize generalization error. There is a close relationship between SVMs and Radial Basis Function (RBF) classifiers; both have found numerous applications, such as optical character recognition, object detection, face verification, and text categorization. The results demonstrate that the accuracy and generalization performance of the SVM with a Gaussian radial basis kernel function are better than those of the RBFN. We also examine and summarize several points on which the SVM is superior to the RBFN.
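The role of the Gaussian RBF kernel can be shown without the full SVM quadratic program, which is beyond a sketch. Below, a kernelized perceptron (a simpler dual-form learner, used here as a stand-in for the SVM) with the same kernel separates XOR data that no linear classifier can; the data and hyperparameters are illustrative.

```python
import math

def rbf(x, z, gamma=1.0):
    """Gaussian radial basis kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def train_kernel_perceptron(X, y, epochs=20, gamma=1.0):
    """Dual-form perceptron: alpha[i] counts mistakes on example i."""
    alpha = [0] * len(X)
    for _ in range(epochs):
        mistakes = 0
        for i, x in enumerate(X):
            f = sum(a * yi * rbf(xi, x, gamma)
                    for a, yi, xi in zip(alpha, y, X))
            if y[i] * f <= 0:          # wrong side (or undecided)
                alpha[i] += 1
                mistakes += 1
        if mistakes == 0:              # converged: all points correct
            break
    return alpha

def predict(alpha, X, y, x, gamma=1.0):
    f = sum(a * yi * rbf(xi, x, gamma) for a, yi, xi in zip(alpha, y, X))
    return 1 if f > 0 else -1

# XOR: not linearly separable in input space, separable under the RBF kernel.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = [-1, 1, 1, -1]
alpha = train_kernel_perceptron(X, y)
```

An SVM with the same kernel would pick the dual coefficients by margin maximization rather than mistake counting, which is what gives it the better generalization behaviour the abstract reports.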
Framework for a Software Quality Rating System (Karthik Murali)
Software Quality Engineering is a broad area concerned with various approaches to improving software quality. A quality model proves successful when it satisfies the requirements of both developers and consumers. This research focuses on establishing semantics between the existing techniques related to software quality engineering and thereby designing a framework for rating software quality.
STATE-OF-THE-ART IN EMPIRICAL VALIDATION OF SOFTWARE METRICS FOR FAULT PRONEN... (IJCSES Journal)
With the sharp rise in software dependability and failure cost, high quality has been in great demand. However, guaranteeing high quality in software systems, which have grown in size and complexity under the constraints imposed on their development, has become an increasingly difficult, time- and resource-consuming activity. Consequently, it becomes inevitable to deliver software that has no serious faults. Object-oriented (OO) products, being the de facto standard of software development, can with their unique features contain faults that are hard to find and whose change impacts are hard to pinpoint. The earlier faults are identified, found, and fixed, the lower the costs and the higher the quality. Software metrics are used to assess product quality, and many OO metrics have been proposed and developed. Furthermore, many empirical studies have validated the relationship between metrics and class fault proneness (FP). The challenge is which metrics are related to class FP and what activities are performed. This study therefore brings together the state of the art in prediction of FP utilizing the CK and size metrics. We conducted a systematic literature review over relevant published empirical validation articles, and the results obtained are analysed and presented. They indicate that 29 relevant empirical studies exist and that measures of complexity, coupling, and size are strongly related to FP.
Although there has been extensive study of delivering, increasing, and maintaining software quality, there has not been enough work on rating a software's quality. This study surveys the literature to date and outlines the scope of, and need for, an evolving software quality rating system.
A Complexity Based Regression Test Selection Strategy (CSEIJ Journal)
Software is unequivocally the foremost and indispensable entity in this technologically driven world. Quality assurance, and in particular software testing, is therefore a crucial step in the software development cycle. This paper presents an effective test selection strategy that uses a Spectrum of Complexity Metrics (SCM). Our aim is to increase the efficiency of the testing process by significantly reducing the number of test cases without a significant drop in test effectiveness. The strategy makes use of a comprehensive taxonomy of complexity metrics based on the product level (class, method, statement) and its characteristics. We use a series of experiments based on three applications with a significant number of mutants to demonstrate the effectiveness of our selection strategy. For further evaluation, we compare our approach to boundary value analysis. The results show the capability of our approach to detect both mutants and seeded errors.
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on public
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
One of the core quality assurance feature which combines fault prevention and fault detection, is often known as testability approach also. There are many assessment techniques and quantification method evolved for software testability prediction which actually identifies testability weakness or factors to further help reduce test effort. This paper examines all those measurement techniques that are being proposed for software testability assessment at various phases of object oriented software development life cycle. The aim is to find the best metrics suit for software quality improvisation through software testability support. The ultimate objective is to establish the ground work for finding ways reduce the testing effort by improvising software testability and its assessment using well planned guidelines for object-oriented software development with the help of suitable metrics.
Software Quality Engineering is a broad area that is concerned with various approaches to improve software quality. A quality model would prove successful when it suffices the requirements of the developers and the consumers. This research focuses on establishing semantics between the existing techniques related to the software quality engineering and thereby designing a framework for rating software quality.
Practical Guidelines to Improve Defect Prediction Model – A Reviewinventionjournals
Defect prediction models are used to pinpoint risky software modules and understand past pitfalls that lead to defective modules. The predictions and insights that are derived from defect prediction models may not be accurate and reliable if researchers do not consider the impact of experimental components (e.g., datasets, metrics, and classifiers) of defect prediction modeling. Therefore, a lack of awareness and practical guidelines from previous research can lead to invalid predictions and unreliable insights. Through case studies of systems that span both proprietary and open-source domains, find that (1) noise in defect datasets; (2) parameter settings of classification techniques; and (3) model validation techniques have a large impact on the predictions and insights of defect prediction models, suggesting that researchers should carefully select experimental components in order to produce more accurate and reliable defect prediction models.
Reliability is concerned with decreasing faults and their impact. The earlier the faults are detected the better. That's why this presentation talks about automated techniques using machine learning to detect faults as early as possible.
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
Determination of Software Release Instant of Three-Tier Client Server Softwar... (Waqas Tariq)
The quality of any software system mainly depends on how much testing takes place, what kind of testing methodologies are used, how complex the software is, the effort put in by software developers, and the type of testing environment, subject to cost and time constraints. The more time developers spend on testing, the more errors can be removed, leading to more reliable software, but testing cost will also increase. Conversely, if testing time is too short, software cost can be reduced, provided customers accept the risk of buying unreliable software. However, this increases cost during the operational phase, since it is more expensive to fix an error in the operational phase than during testing. It is therefore essential to decide when to stop testing and release the software to customers based on cost and reliability assessment. In this paper we present a mechanism for deciding when to stop the testing process and release the software to end-users, by developing a software cost model with a risk factor. Based on the proposed method, we specifically address how to decide when to stop testing and release software based on a three-tier client-server architecture, which facilitates software developers in ensuring on-time delivery of a software product that achieves a predefined level of reliability while minimizing cost. A numerical example is cited to illustrate the experimental results, showing significant improvements over conventional statistical models based on NHPP.
A metrics suite for variable categorization to support program invariants (IJCSEA Journal)
Invariants are generally implicit. Explicitly stating program invariants helps programmers identify program properties that must be preserved while modifying the code. Existing dynamic techniques detect invariants over both relevant and irrelevant/unused variables, and thereby both relevant and irrelevant invariants in the program. Due to the presence of irrelevant variables and irrelevant invariants, the speed and efficiency of these techniques suffer. Also, displaying properties of irrelevant variables and irrelevant invariants distracts the user from concentrating on properties of relevant variables. To overcome these deficiencies, only relevant variables are considered and irrelevant variables are ignored. Further, relevant variables are categorized as design variables and non-design variables; for this purpose a metrics suite is proposed. These metrics are validated against Weyuker's principles and applied to the RFV and JLex open-source software. Similarly, relevant invariants are categorized as design invariants, non-design invariants, and hybrid invariants; for this purpose a set of rules is proposed. This entire process greatly improves the speed and efficiency of dynamic invariant detection techniques.
A methodology to evaluate object oriented software systems using change requi... (ijseajournal)
It is a well known fact that software maintenance plays a major role and finds importance in the software development life cycle. As object-oriented programming has become the standard, it is very important to understand the problems of maintaining object-oriented software systems. This paper aims at evaluating object-oriented software systems through a change requirement traceability-based impact analysis methodology for non-functional requirements using functional requirements. The major issues have been related to change impact algorithms and inheritance of functionality.
In software measurement validation, assessing the validity of software metrics in software engineering is a very difficult task due to the lack of both theoretical and empirical methodology [41, 44, 45]. In recent years a number of researchers have addressed the issue of validating software metrics. At present, software metrics are validated theoretically using properties of measures. Software measurement plays an important role in understanding and controlling software development practices and products. The major requirement in software measurement is that measures must accurately represent the attributes they purport to quantify; validation is therefore critical to the success of software measurement. Normally, validation is a collection of analysis and testing activities across the full life cycle; it complements the efforts of other quality engineering functions and is a critical task in any engineering project. The objective of validation is to discover defects in a system and to assess whether or not the system is useful and usable in an operational situation. In software engineering, validation is one of the disciplines that help build quality into software. The major objective of the software validation process is to determine that the software performs its intended functions correctly and to provide information about its quality and reliability. This paper discusses the validation methodology, techniques, and the different properties of measures that are used for software metrics validation. In most cases, both theoretical and empirical validations are conducted for software metrics in software engineering [1-50].
A Software Measurement Using Artificial Neural Network and Support Vector Mac... (ijseajournal)
Today, software measurement is based on various techniques such as neural networks, genetic algorithms, fuzzy logic, etc. This study examines the efficiency of applying a support vector machine with a Gaussian radial basis kernel function to the software measurement problem to increase performance and accuracy. Support vector machines (SVM) are an innovative approach to constructing learning machines that minimize generalization error. There is a close relationship between SVMs and Radial Basis Function (RBF) classifiers; both have found numerous applications, such as optical character recognition, object detection, face verification, and text categorization. The results demonstrate that the accuracy and generalization performance of the SVM with a Gaussian radial basis kernel function are better than those of an RBFN. We also examine and summarize several points on which the SVM is superior to the RBFN.
Framework for a Software Quality Rating System (Karthik Murali)
STATE-OF-THE-ART IN EMPIRICAL VALIDATION OF SOFTWARE METRICS FOR FAULT PRONEN... (IJCSES Journal)
With the sharp rise in software dependability and failure cost, high quality has been in great demand. However, guaranteeing high quality in software systems, which have grown in size and complexity under the constraints imposed on their development, has become an increasingly difficult, time- and resource-consuming activity. Consequently, it becomes imperative to deliver software that has no serious faults. In this context, object-oriented (OO) products, being the de facto standard of software development, can have faults that are hard to find, or impacts of changes that are hard to pinpoint. The earlier faults are identified, found, and fixed, the lower the costs and the higher the quality. Software metrics are used to assess product quality, and many OO metrics have been proposed and developed. Furthermore, many empirical studies have validated the relationship between metrics and class fault proneness (FP). The challenge is which metrics are related to class FP and what activities are performed. Therefore, this study brings together the state-of-the-art in prediction of FP utilizing CK and size metrics. We conducted a systematic literature review over relevant published empirical validation articles, and the results obtained are analysed and presented. They indicate that 29 relevant empirical studies exist and that measures such as complexity, coupling, and size are strongly related to FP.
Although there has been extensive study of delivering, increasing, and maintaining software quality, there has not been enough of an aide-mémoire on 'Rating a Software's Quality'. This study surveys the literature thus far and also sketches the scope of, and need for, the evolution of a rating system for software quality in the future.
A Complexity Based Regression Test Selection Strategy (CSEIJJournal)
Software is unequivocally the foremost and indispensable entity in this technologically driven world. Therefore quality assurance, and in particular software testing, is a crucial step in the software development cycle. This paper presents an effective test selection strategy that uses a Spectrum of Complexity Metrics (SCM). Our aim is to increase the efficiency of the testing process by significantly reducing the number of test cases without a significant drop in test effectiveness. The strategy makes use of a comprehensive taxonomy of complexity metrics based on the product level (class, method, statement) and its characteristics. We use a series of experiments based on three applications with a significant number of mutants to demonstrate the effectiveness of our selection strategy. For further evaluation, we compare our approach to boundary value analysis. The results show the capability of our approach to detect mutants as well as the seeded errors.
Machine learning techniques can be used to analyse data from different perspectives and enable developers to retrieve useful information. Machine learning techniques have proven useful for software bug prediction. In this paper, a comparative performance analysis of different machine learning techniques is explored for software bug prediction on publicly available data sets. Results show that most of the machine learning methods performed well on software bug datasets.
ANALYSIS OF SOFTWARE QUALITY USING SOFTWARE METRICS (ijcsa)
Software metrics have a direct link with measurement in software engineering. Correct measurement is a precondition in any engineering field, and software engineering is no exception: as the size and complexity of software increase, manual inspection of software becomes a harder task. Most software engineers worry about the quality of software and how to measure and enhance it. The overall objective of this study was to assess and analyse the software metrics used to measure the software product and process.
In this study, the researcher used a collection of literature from various electronic databases, available since 2008, to understand and survey software metrics. The study identifies software quality as a means of measuring how software is designed and how well the software conforms to that design. Some of the attributes we look for in software quality are correctness, product quality, scalability, completeness, and absence of bugs. However, the quality standard used by one organization differs from that of others; for this reason it is better to apply software metrics, and the most common current software metrics tools, to measure the quality of software and reduce the subjectivity of fault assessment during the evaluation of software quality. The central contribution of this study is an overview of software metrics that illustrates the development of this area, together with a critical analysis of the main metrics found in the literature.
A survey of predicting software reliability using machine learning methods (IAESIJAI)
In light of technical and technological progress, software has become an urgent need in every aspect of human life, including the medical sector and industrial control. Therefore, it is imperative that software always works flawlessly. The information technology sector has witnessed rapid expansion in recent years; software companies can no longer rely only on cost advantages to stay competitive in the market, and programmers must provide reliable and high-quality software. To support estimating and predicting software reliability using machine learning and deep learning, this paper presents a brief overview of the important scientific contributions on software reliability and of the highly efficient methods and techniques researchers have found for predicting it.
In the present paper, the applicability and capability of AI techniques for effort estimation prediction have been investigated. It is seen that neuro-fuzzy models are very robust, characterized by fast computation, and capable of handling distorted data. Due to the presence of data non-linearity, the approach is an efficient quantitative tool for predicting effort estimation. A one-hidden-layer network, named OHLANFIS, has been developed using the MATLAB simulation environment. The initial parameters of the OHLANFIS are identified using the subtractive clustering method, and the parameters of the Gaussian membership function are optimally determined using a hybrid learning algorithm. The analysis shows that the effort estimation prediction model developed using the OHLANFIS technique performs well compared with a normal ANFIS model.
Contributors to Reduce Maintainability Cost at the Software Implementation Phase (Waqas Tariq)
Software maintenance is important and difficult to measure. The cost of maintenance is the highest among the phases of software development. One of the most critical processes in software development is the reduction of software maintainability cost based on the quality of source code at the design step; however, there is a lack of quality models and measures that can help assess the quality attributes of the software maintainability process. Software maintainability suffers from a number of challenges, such as lack of source code understanding, poor quality of software code, and non-adherence to programming standards during maintenance. This work describes model-based factors to assess software maintenance and explains the steps followed to obtain and validate them. Such a method can be used to reduce software maintenance cost. The research results will enhance the quality of the source code, increase software understandability, reduce maintenance time and cost, and give confidence for software reusability.
Insights of effectivity analysis of learning-based approaches towards softwar... (IJECEIAES)
Software defect prediction is one of the essential operations for mitigating risk-management issues in software development and is known to contribute to enhancing software quality. Various methodologies have evolved to address this issue, with learning-based methodology the most dominant contributor. The problem identified is that there are still many unresolved questions about the practical viability of adopting such learning-based approaches in software quality management. The approaches discussed in this paper contribute to mitigating this challenge by introducing a simplified, compact, and crisp analysis of the effectiveness of learning-based schemes. The paper presents its major findings on the effectivity of machine learning, deep learning, hybrid, and other miscellaneous approaches deployed for fault prediction, followed by a discussion of research trends. The major findings infer that feature selection, data imbalance, interpretability, and inadequate involvement of context are the prime gaps in existing methods. The paper also identifies research gaps and the essential learning outcomes of the present review.
How Should We Estimate Agile Software Development Projects and What Data Do W... (Glen Alleman)
Estimating techniques for an acquisition program progress from analogies to the actual-cost method as the program matures and more information is known. The analogy method is most appropriate early in the program life cycle, when the system is not yet fully defined.
Previous research concludes that testing plays a vital role in the development of a software product. Since software testing is the principal approach to assuring software quality, most development effort is put into testing. But software testing is an expensive process and consumes a lot of time, so testing should start as early as possible in development to control cost and schedule. Indeed, testing should be performed at every step of the software development life cycle (SDLC), the structured approach used in developing a software product. Software testing is a tradeoff between budget, time, and quality. Nowadays, testing has become a very important activity in terms of exposure, security, performance, and usability; hence, software testing faces a collection of challenges.
A Review on Software Fault Detection and Prevention Mechanism in Software Dev... (iosrjce)
A Machine Learning Based Model For Software Defect Prediction
Onur Kutlubay, Mehmet Balman, Doğu Gül, Ayşe B. Bener
Boğaziçi University, Computer Engineering Department
kutlubay@cmpe.boun.edu.tr; mbalman@ku.edu.tr; dogugul@yahoo.com; bener@boun.edu.tr
Abstract
Identifying and locating defects in software projects is a difficult task. Especially as project sizes grow, this task becomes expensive, requiring sophisticated testing and evaluation mechanisms. On the other hand, measuring software in a continuous and disciplined manner brings many advantages, such as accurate estimation of project costs and schedules and improved product and process quality. Detailed analysis of software metric data also gives significant clues about the locations of possible defects in a program's code.
The aim of this research is to establish a method for identifying software defects using machine learning methods. In this work we used NASA's Metrics Data Program (MDP) as the source of software metrics data. The repository at the NASA IV&V Facility MDP contains software metric data and error data at the function/method level.
We used machine learning methods to construct a two-step model that predicts potentially defective modules within a given set of software modules with respect to their metric data. Artificial neural network and decision tree methods are utilized throughout the learning experiments. The data set used in the experiments is organized into two forms for learning and prediction purposes: the training set and the testing set. The experiments show that the two-step model enhances defect prediction performance.
1. Introduction
According to a survey carried out by the Standish Group, the average software project exceeded its budget by 90 percent and its schedule by 222 percent (Chaos Chronicles, 1995). This survey took place in the mid 90s and contained data from about 8,000 projects. These statistics show the importance of measuring software early in its life cycle and taking the necessary precautions before such results come out. For software projects carried out in industry, an extensive metrics program is usually seen as unnecessary, and practitioners start to stress a metrics program only when things are bad or when there is a need to satisfy some external assessment body.
On the academic side, less attention is devoted to the decision-support power of software measurement. The results of these measurements are usually evaluated with naive methods such as regression and correlation between values. However, models for assessing software risk in terms of predicting defects in a specific module or function have also been proposed in previous research (Fenton and Neil, 1999). Some recent models also utilize machine-learning techniques for defect prediction (Neumann, 2002). But the main drawback of using machine learning in software defect prediction is the scarcity of data. Most companies do not share their software metric data with other organizations, so a useful database with a great amount of data cannot be formed. However, there are publicly available, well-established tools for extracting metrics such as size, McCabe's cyclomatic complexity, and Halstead's program vocabulary. These tools help automate the data collection process in software projects.
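The text does not name the extraction tools it refers to. As a rough illustration only, Python's standard `ast` module can approximate two of the cited metric families for a snippet of Python code: physical lines of code, and a crude McCabe cyclomatic complexity computed as one plus the number of decision points. The choice of decision-point node types below is a simplifying assumption, not the tools' actual rules.

```python
# Hypothetical sketch: approximate LOC and cyclomatic complexity for
# Python source using only the standard library's ast module.
import ast

# Node types counted as decision points (an assumption for illustration).
DECISION_NODES = (ast.If, ast.For, ast.While, ast.BoolOp,
                  ast.ExceptHandler, ast.IfExp)

def crude_metrics(source):
    tree = ast.parse(source)
    # Physical LOC: non-blank lines in the snippet.
    loc = len([ln for ln in source.splitlines() if ln.strip()])
    # McCabe-style complexity: 1 + number of decision points found.
    complexity = 1 + sum(isinstance(n, DECISION_NODES)
                         for n in ast.walk(tree))
    return {"loc": loc, "cyclomatic": complexity}

sample = """
def classify(x):
    if x > 10:
        return "high"
    for i in range(x):
        if i % 2:
            x += 1
    return "low"
"""
print(crude_metrics(sample))  # {'loc': 7, 'cyclomatic': 4}
```

Real metric extractors also compute Halstead measures from operator and operand counts, which a fuller version of this sketch could derive from the same syntax tree.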
A well-established metrics program yields better estimates of cost and schedule. Besides, analyses of measured metrics are good indicators of possible defects in the software being developed. Testing is the most popular method for defect detection in most software projects. However, when projects grow in terms of both lines of code and effort spent, the task of testing gets more difficult and computationally expensive with the use of sophisticated testing and evaluation procedures. Nevertheless, defects identified in previous segments of programs can be clustered according to their various properties and, most importantly, according to their severity. If the relationship between the software metrics measured at a certain state and the defects' properties can be formulated, it becomes possible to predict similar defects in other parts of the code.
The software metric data gives us the values of specific variables measuring a specific module/function or the whole software. When combined with the weighted error/defect data, this data set becomes the input to a machine learning system. A learning system is defined as a system that is said to learn from experience with respect to some class of tasks and a performance measure, such that its performance at these tasks improves with experience (Mitchell, 1997). To design a learning system, the data set in this work is divided into two parts: the training data set and the testing data set. Predictor functions are defined and trained using Multi-Layer Perceptron and Decision Tree algorithms, and the results are evaluated with the testing data set.
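The paper gives no code for this workflow. As a minimal sketch, assuming scikit-learn as a stand-in for the authors' tools, the fragment below trains a Multi-Layer Perceptron and a decision tree on a training split of synthetic "metric" rows (which replace the NASA MDP data) and evaluates each on a held-out testing split. Column meanings and model hyperparameters are illustrative assumptions.

```python
# Hypothetical sketch of the train/test workflow described in the text,
# using synthetic data in place of the NASA MDP metrics.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
# Stand-in metric columns (e.g. LOC, v(G), Halstead volume) -- assumptions.
X = rng.uniform(0, 50, size=(200, 3))
# Synthetic "defect value" loosely tied to the metrics, plus noise.
y = 0.1 * X[:, 0] + 0.5 * X[:, 1] + rng.normal(0, 1, 200)

# Split into the training set and the testing set, as the paper describes.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

for model in (MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0),
              DecisionTreeRegressor(max_depth=4, random_state=0)):
    model.fit(X_train, y_train)                       # train on the training set
    mse = mean_squared_error(y_test, model.predict(X_test))  # evaluate on the testing set
    print(type(model).__name__, round(mse, 2))
```

The held-out mean squared error gives a comparable score for the two learners, mirroring the paper's side-by-side evaluation of ANN and decision tree models.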
The second section gives a brief literature survey of previous research, and the third describes the data set used in our research. The fourth section states the problem, and the fifth explains the details of our proposed model for defect prediction; the tools and methods utilized throughout the experiments are also described in that section. The sixth section lists the results of the experiments together with a detailed evaluation of the machine learning algorithms. The last section concludes our work and summarizes future research that could be done in this area.
2. Related Work
2.1. Metrics and Software Risk Assessment
Software metrics are mostly used for product quality and process efficiency analysis and for risk assessment in software projects. Among their many benefits, one of the most significant is that they provide information for defect prediction. Metric analysis allows project managers to assess software risks, and there are currently numerous metrics for assessing them. Early research on software metrics focused mostly on McCabe, Halstead, and lines of code (LOC) metrics; among the many software metrics, these three categories contain the most widely used ones. In this work, too, we decided to use an evaluation mechanism mainly based on these metrics.
Metrics are usually defined in terms of polynomial equations when they are not directly measured but derived from other metrics. Researchers have used a neural network approach to generate new metrics instead of using metrics based on fixed polynomial equations (Boetticher et al., 1993); this was introduced as an alternative method to overcome the challenge of deriving a polynomial with the desired characteristics. Bayesian belief networks have also been used for risk assessment in previous research (Fenton and Neil, 1999), with basic metrics such as LOC, Halstead, and McCabe metrics used in the learning process. The authors argue that some metrics do not give the right prediction about the software's operational stage; for instance, the number of faults in the pre- and post-release versions of the software does not relate to cyclomatic complexity in the same way. To overcome this problem, a Bayesian belief network is used for defect modeling.
In other research, the approach used is to categorize metrics with respect to the models developed. The approach is based on the premise that "software metrics alone are difficult to evaluate". The authors apply metrics to three models, namely a "Complexity", a "Risk", and a "Test Targeting" model; different results are obtained for these models and each is evaluated distinctly (Hudepohl et al., 1996).
It has been shown that some metrics depict common features of software risk. Instead of using all the metrics adopted, a basic one that represents a cluster can be used (Neumann, 2002). Principal component analysis, one of the most popular approaches, is applied to determine the clusters that include similar metrics.
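Neumann's principal component analysis is not shown in the text. The sketch below, a purely illustrative sketch using NumPy on synthetic data, applies PCA to three correlated metric columns (standardized first) and shows that one component captures most of the variance, which is the sense in which a single representative metric can stand in for a cluster. The correlation structure of the synthetic columns is an assumption.

```python
# Hypothetical sketch: PCA over standardized synthetic metric columns.
import numpy as np

rng = np.random.default_rng(1)
loc = rng.uniform(10, 500, 300)
metrics = np.column_stack([
    loc,                                  # lines of code
    0.02 * loc + rng.normal(0, 1, 300),   # "complexity", correlated with LOC
    4.0 * loc + rng.normal(0, 20, 300),   # "volume", correlated with LOC
])

# Standardize so that no single metric dominates by scale alone.
z = (metrics - metrics.mean(axis=0)) / metrics.std(axis=0)

# Eigen-decomposition of the covariance of standardized data
# (i.e. the correlation matrix) gives the principal components.
eigvals = np.linalg.eigvalsh(np.cov(z, rowvar=False))  # ascending order
explained = eigvals[::-1] / eigvals.sum()              # variance ratio per component

print(np.round(explained, 3))  # first component carries most of the variance
```

Because the three columns are strongly correlated, the first principal component explains nearly all the variance, so one metric from the cluster suffices for the kind of risk analysis described above.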
2.2. Defect Prediction and Applications of Machine Learning
Defect prediction models can be classified according to the metrics used and the process step in the software life cycle. Most defect models use basic metrics such as the complexity and size of the software (Henry and Kafura, 1984). Testing metrics produced in the test phase are also used to estimate the sequence of defects (Cusumano, 1991). Another approach is to investigate the quality of the design and implementation processes, on the premise that the quality of the design process is the best predictor of product quality (Bertolino and Strigini, 1996; Diaz and Sligo, 1997).
The main idea behind prediction models is to estimate the reliability of the system and to investigate the effect of the design and testing processes on the number of defects. Previous studies show that metrics from all steps of the life cycle of a software project, such as design, implementation, and testing, should be utilized and connected with specific dependencies. Concentrating on only a single metric or process level is not enough for a satisfactory prediction model (Fenton and Neil, 1999).
Machine learning algorithms have proven practical for poorly understood problem domains whose conditions change across many values and regularities. Since software problems can be formulated as learning processes and classified according to the characteristics of defects, standard machine learning algorithms are applicable for preparing a probability distribution and analyzing errors (Fenton and Neil, 1999; Zhang, 2000). Decision trees, artificial neural networks, Bayesian belief networks, and clustering techniques such as k-nearest neighbor are examples of the most commonly used techniques for software defect prediction problems (Mitchell, 1997; Zhang, 2000; Jensen, 1996).
Machine learning algorithms can be applied to program executions to detect faulty runs, which leads to finding the underlying defects; in this approach, executions are clustered according to their procedural and functional properties (Dickinson et al., 2001). Machine learning is also used to generate models of program properties that are known to cause errors. Support vector and decision tree learning tools have been implemented to classify and investigate the most relevant subsets of program properties (Brun and Ernst, 2004). The underlying intuition is that most of the properties leading to faulty conditions can be classified into a few groups. The technique consists of two steps, training and classification: fault-relevant properties are used to generate a model, and this precomputed function selects the properties most likely to cause errors and defects in the software.
Clustering over function call profiles has been used to determine which features enable a model
to distinguish failures from non-failures (Podgurski et al., 2003). Dynamic invariant detection
extracts likely invariants from a test suite and investigates violations, which usually
indicate an erroneous state. This method has also been used to derive counterexamples and to find
properties that lead to correct results under all conditions (Groce and Visser, 2003).
3. Metric Data Used
The data set used in this research is provided by the NASA IV&V Metrics Data Program –
Metric Data Repository¹. The data repository contains software metrics and associated error
data at the function/method level. The data repository stores and organizes the data which has
been collected and validated by the Metrics Data Program.
The association between the error data and the metrics data in the repository provides the
opportunity to investigate the relationship of individual metrics, or combinations of metrics, to the
software's defects. The data that is made available to general users has been sanitized and authorized
for publication through the MDP website by officials representing the projects from which the
data has originated. The database uses unique numeric identifiers to describe the individual
error records and product entries. The level of abstraction allows data associations to be made
without having to reveal specific information about the originating data.
The repository contains detailed metric data in the form of product metrics, object-oriented
class metrics, requirement metrics, and defect/product association metrics. We specifically
concentrate on the product metrics and the related defect metrics. The data portion that feeds the
experiments in this research contains these metrics for the JM1 project.
The product metrics included in the data set are the McCabe metrics (Cyclomatic
Complexity and Design Complexity); the Halstead metrics (Halstead Content, Difficulty,
Effort, Error Estimate, Length, Level, Programming Time, and Volume); the LOC metrics
(Lines of Total Code, LOC Blank, Branch Count, LOC Comments, Number of Operands,
Number of Unique Operands, and Number of Unique Operators); and, lastly, the defect
metrics (Error Count, Error Density, and Number of Defects with severity and priority
information).
After constructing our data repository, we cleaned the data set of marginal
values, which could lead our experiments to faulty results. For each type of feature in the
database, records containing feature values outside a range of ten standard deviations from the
mean are deleted from the database.
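As an illustrative sketch of this cleaning step (the function name and the pure-Python representation are ours, not part of the original tooling), rows with any feature outside the allowed band are dropped:

```python
from statistics import mean, stdev

def remove_outliers(rows, n_std=10.0):
    """Drop any row in which some feature lies further than n_std
    standard deviations from that feature's mean."""
    columns = list(zip(*rows))
    means = [mean(col) for col in columns]
    stds = [stdev(col) or 1.0 for col in columns]  # guard constant columns
    def in_range(row):
        return all(abs(value - m) <= n_std * s
                   for value, m, s in zip(row, means, stds))
    return [row for row in rows if in_range(row)]
```

With the ten-standard-deviation default used in the paper, only truly extreme records are removed; a tighter bound would prune much more aggressively.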
Our analysis depends on machine learning techniques, so we divided the
data set into two groups: a training set and a testing set. For each experiment, these two groups
are extracted randomly from the overall data set using a simple shuffle algorithm.
This method provides randomly generated data sets, which are expected to contain
evenly distributed amounts of defect data.
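A minimal sketch of this shuffle-then-split procedure (the function name and signature are our own assumptions, not the authors' code):

```python
import random

def shuffle_split(rows, train_size, test_size, seed=None):
    """Shuffle the whole data set, then carve out disjoint training and
    testing sets, as done before each experiment run."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    return rows[:train_size], rows[train_size:train_size + test_size]
```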
4. Problem Statement
Two types of research can be conducted on code-based metrics in terms of defect prediction.
The first is predicting whether a given code segment is defective or not. The second is
predicting the magnitude of the possible defect, if any, with respect to various viewpoints
such as density, severity, or priority. Estimating the defect-causing potential of a given
software project is critical for the reliability of the project. Our work in this
research is primarily focused on the second type of prediction, but it also includes some
major experiments involving the first type.
Given a training data set, a learning system can be set up. This system produces
a score indicating how defective a test code segment is. After
predicting this score, the results can be evaluated with respect to common performance
functions. The two most common options are the Mean Absolute Error (MAE) and the
Mean Squared Error (MSE). The MAE is generally used for classification, while the MSE is most
commonly seen in function approximation.
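Both error measures are straightforward averages over the prediction errors; a plain-Python sketch (function names are ours):

```python
def mae(y_true, y_pred):
    """Mean absolute error: average magnitude of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error: average of the squared prediction errors."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
```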
In this research we used the MSE as the performance function, since our
experiments aim at the second type of prediction. Although the MAE could be a good measure for
the classification experiments, in that case our output values are zeros and
ones, so we chose to use some custom error measures instead. We explain them in detail in the
results section.
5. Proposed Model and Methodology
The data set used in this research contains defect density data, which corresponds to the total
number of defects per 1,000 lines of code. We used the software metric
data set together with this defect density data to predict the defect density value for a given project or
module. Artificial neural network and decision tree approaches are used to predict the defect
density values for a testing data set.
The multi-layer perceptron (MLP) method is used in the ANN experiments. Multi-layer perceptrons are
feedforward neural networks trained with the standard backpropagation algorithm.
Feedforward neural networks provide a general framework for representing non-linear
functional mappings between a set of input variables and a set of output variables. This is
achieved by representing the nonlinear function of many variables as compositions of
nonlinear functions of a single variable, called activation functions (Bishop, 1995).
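The composition described above can be illustrated with the forward pass of a one-hidden-layer network (a hedged sketch in plain Python, not the authors' MATLAB implementation; tanh is one common choice of activation, and the linear output unit matches the regression setup used later):

```python
import math

def mlp_forward(x, W1, b1, W2, b2):
    """Forward pass of a one-hidden-layer perceptron: tanh activation in
    the hidden layer and a single linear output unit."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(weights, x)) + b)
              for weights, b in zip(W1, b1)]
    return sum(w * h for w, h in zip(W2, hidden)) + b2
```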
Decision trees are one of the most popular approaches for both classification- and
regression-type predictions. They are generated based on specific rules. A decision tree is a
classifier with a tree structure, in which each leaf node holds an outcome, computed with respect to
the existing attributes, and each decision node tests an attribute, with a branch for each
possible outcome of that test. A decision tree can be thought of as a sequence of questions
leading to a final outcome, where each question depends on the answer to the previous one;
this dependency is what produces the branching in the tree. While generating the decision tree, the main
goal is to minimize the average number of questions in each case, which improves
prediction performance (Mitchell, 1997). One approach to creating a decision tree uses
entropy, a fundamental quantity in information theory. The entropy value
measures the level of uncertainty, and the degree of uncertainty is related to the success rate of
predicting the result. To overcome over-fitting, we used pruning: minimizing
the output variable variance on the validation data by selecting a simpler tree than the one
obtained when the tree-building algorithm stopped, but one that is equally accurate for
predicting or classifying new observations. In the regression-type prediction experiments we
used regression trees, a variant of decision trees designed to
approximate real-valued functions rather than to perform classification.
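The entropy quantity mentioned above has a standard definition; as a small illustration (the function name is ours):

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels; zero
    means a certain outcome, one bit means a 50/50 split."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())
```

A tree builder would choose, at each node, the attribute test that most reduces this quantity.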
In the experiments we first applied the two methods to perform a regression-based
prediction over the whole data set, and we calculated the corresponding MSE values,
which measure the amount of spread from the target values. To evaluate the performance of each algorithm, we
compared the square root of the MSE with the standard deviation of the testing data set.
The variance of the data set is in fact its MSE when all predictions are equal to
the mean value of the data set. For a specific experiment's performance to be
acceptable, its MSE value should be substantially smaller than the variance of the data set; otherwise
there is no need to apply such sophisticated learning methods, since one can obtain a similar level
of success by simply predicting every value as the mean of the data set.
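The acceptance criterion above can be stated compactly (an illustrative helper of our own naming, using the population variance as in the comparison described):

```python
def beats_mean_baseline(y_true, y_pred):
    """A model's MSE must fall below the variance of the targets, which
    equals the MSE obtained by always predicting the mean value."""
    n = len(y_true)
    mean = sum(y_true) / n
    variance = sum((t - mean) ** 2 for t in y_true) / n
    model_mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / n
    return model_mse < variance
```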
The first experiments, performed on the whole data set, show that the performance of
both algorithms is not in an acceptable range; these outcomes are detailed in the results
section. The data set consists mostly of non-defective modules, so a bias arises
toward underestimating the defect possibility in the prediction process. Any other input
data set will have the same characteristic, since real-life software projects typically contain
far more non-defective modules than defective ones.
As a second type of experiment, we repeated the experiments with metric data containing
only defective items. With such a data set, the influence of the dominant non-defective
items disappears, as depicted in the results section. These experiments yield
successful results, and since we are trying to estimate the density of the possible defects, using
the new data set is an improvement with respect to our primary goal.
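Filtering down to the defective subset is a one-liner; a sketch assuming a hypothetical per-module record with a `defect_density` field (not the repository's actual schema):

```python
def defective_only(modules):
    """Filter the data set down to the modules whose recorded defect
    density is positive, as in the second set of experiments."""
    return [m for m in modules if m["defect_density"] > 0]
```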
Although the second type of experiment is successful in terms of defect
prediction, it is practically impossible to start from such a fortunate position: without
knowing which modules are defective, we cannot meaningfully estimate the magnitude of the
possible defects among them. So, as a third type of experiment, we used the ANN and
decision tree methods to classify the whole data set in terms of being defective or not.
The classification process defines two clusters into which the testing data set is fit.
In these experiments the classification is done with respect to a threshold value, which is
close to zero but is calculated internally by the experiments; this threshold is the value at
which the performance of the classification algorithm is maximized. One of the two resulting
clusters consists of the values less than the threshold, indicating no defect, and the other
consists of the values greater than the threshold, indicating a defect. The threshold value
may vary with the input data set used, and it can be calculated throughout the experiments
for any data set. The performance of this classification process is measured by the total
number of correct predictions compared to the incorrect ones. The results section includes
the outcomes of these experiments in detail.
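The threshold search described above can be sketched as an exhaustive scan over candidate cut points (the function and its signature are our own illustration, not the authors' procedure):

```python
def best_threshold(scores, labels, candidates):
    """Select the cut point that maximizes classification accuracy:
    scores below the threshold are predicted non-defective."""
    def accuracy(threshold):
        predictions = [score >= threshold for score in scores]
        return sum(p == bool(label)
                   for p, label in zip(predictions, labels)) / len(labels)
    return max(candidates, key=accuracy)
```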
The three types of experiments explained above guided us in proposing a novel model for
defect prediction in software projects. According to their results, better
predictions are obtained when a classification is carried out first and a regression-type
prediction is then performed over the portion of the data set that is expected to be defective.
So the model has two steps: first, classify the input data set with respect to being defective
or not; then generate a new data set from the values predicted as defective and run a
regression to predict their defect density values.
The model thus predicts the possibly defective modules in a given data set and, in addition,
estimates the defect density of each module predicted as defective. It therefore helps
concentrate effort on specific suspected parts of the code, so that a significant amount of
time and resources can be saved in the software quality process.
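The two-step pipeline can be expressed abstractly over any classifier and regressor (a sketch with hypothetical callables, not tied to the MLP or tree implementations used in the experiments):

```python
def two_step_predict(classify, estimate_density, modules):
    """Two-step model: classify each module first, then run the density
    regression only on the modules predicted as defective."""
    results = {}
    for name, features in modules.items():
        if classify(features):
            results[name] = estimate_density(features)
        else:
            results[name] = 0.0  # predicted non-defective
    return results
```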
6. Results
In this research, training and testing are performed using MATLAB's MLP and decision tree
algorithms, based on one model for classification and one for regression. The data set used in the
experiments contains 6,000 training samples and 2,000 testing samples. The reported values are the
means of 30 separately run experiments.
In the experiment set for the MLP algorithm, a neural network is generated using a
linear function as the output unit activation function. 32 hidden units are used,
the alpha value is set to 0.01, and the experiments are run for 200
training cycles. In the experiment set for the decision tree algorithms, the Treefit and Treeprune
functions are used consecutively, with the method of the Treefit function switched between
classification and regression as appropriate.
6.1. Regression over the whole data set
In the first type of experiment, neither the ANN method nor the decision trees produced
successful results. The average variance of the data sets, generated randomly by the
shuffling algorithm, is 1,402.21, and the mean MSE value for the ANN experiments is
1,295.96. This value is far from acceptable, since the method fails to approximate the
defect density values. Figure 1 depicts the scatter graph of the predicted values against the real
values. The graph makes clear that the method tends to make faulty predictions
for the non-defective values: the points lying on the y-axis show an unacceptable
number of faulty predictions for non-defective values. Beyond missing the
non-defective items, the method is also biased toward underestimating the defective items,
since the vast majority of predictions lie under the line that marks correct predictions.
Figure 1. The predicted values and the real values in ANN experiments
The decision tree method similarly produces unsuccessful results when the input is the
complete data set, which contains both defective and non-defective items, with the
non-defective ones much more dense. The average variance of the data sets is 1,353.27 and the
mean MSE value for the decision tree experiments is 1,316.42, a result slightly worse than
that of the ANN. Figure 2 shows the predictions made by the decision tree method against the
real values. Like the ANN method, the decision tree method also fails to predict the non-defective
values; moreover, it makes far more non-defective predictions for items whose real
values show they are defective. On the other hand, the bias toward zero caused by the input
data set is not as strong as in the ANN case.
Figure 2. The predicted values and the real values in decision tree experiments
6.2. Regression over the data set containing only defective items
The second type of experiment is performed with input data sets that contain only defective
items. The results for both the ANN and decision tree methods are more successful than in the
first type of experiment.
The average variance of the data sets used in the ANN experiments is 1,637.41 and the
mean MSE value is 262.61. According to these results, the MLP algorithm approximates the
error density values well when only defective items reside in the input data set. This also shows
that the dense non-defective data affects the prediction capability of the algorithm
negatively. Figure 3 shows the predicted values and the real values after an ANN experiment
run. As seen from the graph, the algorithm estimates the defect density better for smaller
values, while the scatter deviates more from the line of correct predictions for higher defect
density values.
Figure 3. The predicted values and the real values in ANN experiments where the input data
set contains only defective items
The average variance of the data sets in the decision tree experiments is 1,656.23 and the
mean MSE value is 237.68. Like the ANN, the decision tree method is successful in
predicting the defect density values when only defective items are included in the input data
set. According to Figure 4, which depicts the experiment results, the decision tree algorithm
gives more accurate results than the ANN method for almost half of the samples. However, the
spread of the erroneous predictions shows that their deviations are larger than the ANN's.
Like the ANN method, the decision tree method also shows increasing deviations from the
real values as the defect density values increase.
Figure 4. The predicted values and the real values in decision tree experiments where the
input data set contains only defective items
6.3. Classification with respect to defectiveness
In the third type of experiment, the problem is reduced to predicting only whether a module is
defective or not. For this purpose both algorithms are used to classify the testing data set
into two clusters. The value that divides the output data set into two clusters is calculated
dynamically: it is selected among various candidate values according to their
performance in clustering the data set correctly. After several experiment runs, the
performance of the clustering algorithm is measured for each candidate, and the best
one is selected as the point that generates the two clusters; smaller values are non-defective
and the others are defective.
For both methods, the value that separates the two clusters is selected as 5, with trials
run over values ranging from 0 to 10. The performance drops significantly beyond that value,
and the best results are achieved when 5 is selected as the cluster separation point for both
the ANN and decision tree methods.
In the ANN experiments the clustering algorithm is partly successful in predicting the
defective items. The mean percentage of correct predictions is 88.35% for the ANN
experiments. The mean percentage of correct defective predictions is 54.44%, whereas the
mean percentage of correct non-defective predictions is 97.28%. These results show that the
method is very successful at identifying the non-defective items, but it finds only about
half of the truly defective ones.
The decision tree method is more successful than the ANN method in this type of
experiment. The mean percentage of correct predictions is 91.75% for the decision tree
experiments. The mean percentage of correct defective predictions is 79.38% and the mean
percentage of correct non-defective predictions is 95.06%. The main difference between the
two methods arises in predicting the defective and non-defective items separately: the
decision tree method is better at the former, while the ANN method is more successful at the
latter. It can be concluded that the classification experiments are much more successful than
the experiments aiming at regression. And since the regression methods perform better on the
data set containing only defective items, feeding them the items predicted as defective by
this classification process will improve the overall performance of defect density
prediction.
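The per-class percentages quoted above can be computed as in this illustrative sketch (the function name and tuple layout are ours):

```python
def per_class_rates(y_true, y_pred):
    """Overall accuracy plus the fraction of defective (1) and
    non-defective (0) modules predicted correctly."""
    overall = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    def rate(cls):
        pairs = [(t, p) for t, p in zip(y_true, y_pred) if t == cls]
        return sum(t == p for t, p in pairs) / len(pairs)
    return overall, rate(1), rate(0)
```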
As a result, the defect prediction problem can be divided into two parts. The first part
consists of predicting whether a given module is defective or not; the second is predicting the
magnitude of the possible defect if the module is labeled as defective by the first part. We
have seen that predicting the defect density value within a data set containing only defective
items yields much better results than using the whole data set, where an intrinsic bias toward
underestimating the magnitude of the defect arises. Also, by dividing the problem into two
separate problems, and knowing that the second part is successful enough in predicting the
defect density, it is possible to improve the overall performance of the learning system by
improving the performance of the classification part.
7. Conclusion
In this research, we proposed a new defect prediction model based on machine learning
methods. The MLP and decision tree methods make many more wrong defect predictions when
applied to the entire data set, which contains both defective and non-defective items. Since
most modules in the input data have zero defects (80% of the whole data), the applied machine
learning methods fail to predict scores within the expected performance. Indeed, because the
data set is 80% non-defective, an algorithm that simply labels every test item as non-defective
is guaranteed 80% accuracy without learning anything; the logic behind the learning
methodology therefore fails, and a different methodology that can manage such data sets of
software metrics is required.
Instead of predicting the defect density value of a given module directly, first determining
whether the module is defective and then estimating the magnitude of the defect appears to be
a better technique for such data sets. The metric values of modules within each class,
defective or not, are very similar, so it is much easier to learn the defectiveness probability.
Moreover, it is also much easier to learn the magnitude of the defects when training only on
modules that are known to be defective.
In a training set of software metrics, most modules have zero or very small defect
densities. Therefore, the defect density values can be classified into two clusters, a defective
and a non-defective set. This partitioning enhances the performance of the learning process
and enables the regression to work only on training data consisting of modules that were
predicted as defective in the first step.
Clustering into defective and non-defective sets based on a threshold value enhances the
learning and estimation in the classification process. This threshold value is set automatically
within the learning process, as the equilibrium point at which the learning performance is
maximized.
On our specific experiment data set, we observed that the decision tree algorithm performs
better than the MLP algorithm both at classifying the items with respect to being defective
and at estimating the defect density of the items thought to be defective. The decision tree
algorithm also generates rules in the classification process; these rules decide which branches
to follow toward the leaf nodes of the tree, and the effects of all features in the data set can
be observed by inspecting them.
With our two-step approach, along with predicting which modules are defective, the model
generates estimates of the defect magnitudes. Software practitioners may use these estimates
when making decisions about resources and effort in software quality processes such as
testing. Our model thus constitutes a sound risk assessment technique for software projects
based on the code metrics data about the project.
As future work, different machine learning algorithms, or improved versions of the
algorithms used here, may be included in the experiments; the algorithms used in our
evaluation are the simplest forms of some widely used methods. The model can also be
applied to other risk assessment procedures whose outputs can be supplied as input to the
system, provided these risk issues have quantitative representations suitable as input.
Notes
1. For information on NASA/WVU IV&V Facility Metrics Data Program see http://mdp.ivv.nasa.gov.
Bibliography
Bertolino, A., and Strigini, L., 1996. On the Use of Testability Measures for Dependability Assessment, IEEE
Trans. Software Engineering, vol. 22, no. 2, pp. 97-108.
Bishop, C.M., 1995. Neural Networks for Pattern Recognition, Oxford University Press.
Boetticher, G.D., Srinivas, K., Eichmann, D., 1993. A Neural Net-Based Approach to Software Metrics,
Proceedings of the Fifth International Conference on Software Engineering and Knowledge Engineering,
San Francisco, pp. 271-274.
CHAOS Chronicles, The Standish Group - Standish Group Internal Report, 1995.
Cusumano, M.A., 1991. Japan’s Software Factories, Oxford University Press.
Diaz, M., and Sligo, J., 1997. How Software Process Improvement Helped Motorola, IEEE Software, vol. 14,
no. 5, pp. 75-81.
Dickinson, W., Leon, D., Podgurski, A., 2001. Finding Failures by Cluster Analysis of Execution Profiles. In
ICSE, pp. 339-348.
Fenton, N., and Neil, M., 1999. A critique of software defect prediction models, IEEE Transactions on Software
Engineering, Vol. 25, No. 5, pp. 675-689.
Groce, A., and Visser, W., 2003. What Went Wrong: Explaining Counterexamples. In SPIN 2003, pp. 121-135.
Jensen, F.V., 1996. An Introduction to Bayesian Networks, Springer.
Henry, S., and Kafura, D., 1984. The Evaluation of Software System’s Structure Using Quantitative Software
Metrics, Software Practice and Experience, vol. 14, no. 6, pp. 561-573.
Hudepohl, P., Khoshgoftaar, M., Mayrand, J., 1996. Integrating Metrics and Models for Software Risk
Assessment, The Seventh International Symposium on Software Reliability Engineering (ISSRE '96).
Mitchell, T.M., 1997. Machine Learning, McGrawHill.
Neumann, D.E., 2002. An Enhanced Neural Network Technique for Software Risk Analysis, IEEE Transactions
on Software Engineering, pp. 904-912.
Podgurski, A., Leon, D., Francis, P., Masri, W., Minch, M., Sun, J., and Wang, B., 2003. Automated Support for
Classifying Software Failure Reports. In ICSE, pp. 465-475.
Brun, Y., and Ernst, M.D., 2004. Finding Latent Code Errors via Machine Learning over Program Executions.
Proceedings of the 26th International Conference on Software Engineering, Edinburgh, Scotland.
Zhang, D., 2000. Applying Machine Learning Algorithms in Software Development, The Proceedings of 2000
Monterey Workshop on Modeling Software System Structures, Santa Margherita Ligure, Italy, pp.275-285.