The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
ESTIMATING HANDLING TIME OF SOFTWARE DEFECTS csandit
The problem of accurately predicting handling time for software defects is of great practical
importance. However, it is difficult to suggest a practical generic algorithm for such estimates,
due in part to the limited information available when opening a defect and the lack of a uniform
standard for defect structure. We suggest an algorithm to address these challenges that is
implementable over different defect management tools. Our algorithm uses machine learning
regression techniques to predict the handling time of defects based on past behaviour of similar
defects. The algorithm relies only on a minimal set of assumptions about the structure of the
input data. We show how an implementation of this algorithm predicts defect handling time with
promising accuracy results.
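The abstract above does not name a specific regression technique. As a minimal sketch of the general idea (estimating handling time from similar past defects), the following uses k-nearest-neighbour regression, which is an assumption made here purely for illustration. The feature encoding, the sample records and the function name `predict_handling_time` are all hypothetical.

```python
from math import sqrt


def predict_handling_time(history, new_defect, k=3):
    """Average the handling time of the k most similar past defects.

    history: list of (feature_vector, handling_time_in_days) pairs.
    new_defect: feature vector of the defect being opened.
    """
    def distance(a, b):
        # Euclidean distance between two numeric feature vectors.
        return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    nearest = sorted(history, key=lambda rec: distance(rec[0], new_defect))[:k]
    return sum(days for _, days in nearest) / len(nearest)


# Hypothetical records: (severity, priority, components touched) -> days to handle.
history = [
    ((1, 1, 1), 2.0),
    ((1, 2, 1), 3.0),
    ((2, 2, 2), 5.0),
    ((3, 3, 4), 12.0),
    ((3, 2, 5), 14.0),
]

estimate = predict_handling_time(history, (1, 1, 2), k=3)
print(f"Estimated handling time: {estimate:.2f} days")
```

A real defect tracker would need categorical fields encoded and features scaled before distances are meaningful; the sketch skips both steps.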
Development of software defect prediction system using artificial neural network IJAAS Team
Software testing is an activity that aims to ensure a system is bug free during execution. Software bug prediction is one of the most promising activities of the testing phase of the software development life cycle. In this paper, a framework was created to predict the modules that are defect prone so that software quality assurance effort can be better prioritized. A Genetic Algorithm was used to extract relevant features from the acquired datasets to eliminate the possibility of overfitting, and modules were classified as defective or non-defective from the relevant features using an Artificial Neural Network. The system was implemented in the MATLAB (R2018a) runtime environment using a statistical toolkit, and its performance was assessed on accuracy, precision, recall, and F-score. The experiments yielded accuracy, precision, recall, and F-score of 86.93, 53.49, 79.31 and 63.89% respectively for ECLIPSE JDT CORE; 83.28, 31.91, 45.45 and 37.50% for ECLIPSE PDE UI; 83.43, 57.69, 45.45 and 50.84% for EQUINOX FRAMEWORK; and 91.30, 33.33, 50.00 and 40.00% for LUCENE. This paper presents an improved predictive system for software defect detection.
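The F-score figures quoted above can be cross-checked from the quoted precision and recall. The sketch below assumes the F-score is the usual F1 measure (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `f_score` is chosen here for illustration.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (same units as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision %, recall %, reported F-score %) as quoted in the abstract.
reported = {
    "ECLIPSE JDT CORE": (53.49, 79.31, 63.89),
    "ECLIPSE PDE UI": (31.91, 45.45, 37.50),
    "EQUINOX FRAMEWORK": (57.69, 45.45, 50.84),
    "LUCENE": (33.33, 50.00, 40.00),
}

for name, (p, r, f) in reported.items():
    print(f"{name}: computed F = {f_score(p, r):.2f}, reported F = {f:.2f}")
```

Under that assumption, each recomputed value agrees with the reported F-score to within rounding, so the four result tuples are internally consistent.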
an error in that computer program. In order to improve the software quality, prediction of faulty modules is
necessary. Various metric suites and techniques are available to predict the modules which are critical and
likely to be fault prone. A Genetic Algorithm is a problem-solving algorithm that uses genetics as its model of
problem solving. It is a search technique for finding approximate solutions to optimization and search
problems. The Genetic Algorithm is applied to the problem of faulty module prediction as well as
to finding the most important attribute for fault occurrence. To perform the analysis, the performance
of the Genetic Algorithm is validated using the open source software jEdit. The results are measured in
terms of accuracy and error in prediction by calculating the probability of detection and the probability of false
alarms.
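The two measures named above are conventionally derived from a confusion matrix. The following sketch uses the standard definitions, PD = TP / (TP + FN) and PF = FP / (FP + TN); the module counts are hypothetical and are not taken from the jEdit study.

```python
def detection_rates(tp, fp, tn, fn):
    """Return (probability of detection, probability of false alarm, accuracy)."""
    pd = tp / (tp + fn) if (tp + fn) else 0.0  # faulty modules correctly flagged
    pf = fp / (fp + tn) if (fp + tn) else 0.0  # clean modules wrongly flagged
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    return pd, pf, accuracy


# Hypothetical counts for 200 modules, 40 of which are actually faulty.
pd, pf, accuracy = detection_rates(tp=30, fp=20, tn=140, fn=10)
print(f"PD = {pd:.3f}, PF = {pf:.3f}, accuracy = {accuracy:.3f}")
```

A good fault predictor pushes PD towards 1 while keeping PF close to 0; accuracy alone can be misleading when faulty modules are rare.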
Function Point Software Cost Estimates using Neuro-Fuzzy technique ijceronline
Software estimation accuracy is among the greatest challenges for software developers. Because a neuro-fuzzy system can approximate non-linear functions with greater precision, it is used here as a soft computing approach that builds a model by learning the relationship from its training data. The approach presented in this paper is independent of the nature and type of estimation. Function points are used as the algorithmic model, and an attempt is made to validate the soundness of the neuro-fuzzy technique using ISBSG and NASA project data.
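For readers unfamiliar with function points, the sketch below shows a conventional count using average-complexity weights and the usual value adjustment factor, VAF = 0.65 + 0.01 × (sum of 14 general system characteristic ratings). It illustrates only the algorithmic model, not the neuro-fuzzy part of the paper, and the component counts and ratings are hypothetical.

```python
# Average-complexity weights commonly used in function point counting.
AVERAGE_WEIGHTS = {"EI": 4, "EO": 5, "EQ": 4, "ILF": 10, "EIF": 7}


def unadjusted_function_points(counts):
    """Weighted sum of inputs, outputs, inquiries and logical/interface files."""
    return sum(AVERAGE_WEIGHTS[kind] * n for kind, n in counts.items())


def adjusted_function_points(counts, gsc_ratings):
    """Apply the value adjustment factor from 14 ratings, each between 0 and 5."""
    if len(gsc_ratings) != 14 or any(not 0 <= r <= 5 for r in gsc_ratings):
        raise ValueError("expected 14 ratings in the range 0..5")
    vaf = 0.65 + 0.01 * sum(gsc_ratings)
    return unadjusted_function_points(counts) * vaf


# Hypothetical project: component counts and uniform ratings of 3.
counts = {"EI": 10, "EO": 8, "EQ": 5, "ILF": 4, "EIF": 2}
ufp = unadjusted_function_points(counts)
fp = adjusted_function_points(counts, [3] * 14)
print(f"UFP = {ufp}, adjusted FP = {fp:.2f}")
```

A full count would classify every component as low, average or high complexity; using the average weight throughout is a simplification.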
EARLY STAGE SOFTWARE DEVELOPMENT EFFORT ESTIMATIONS – MAMDANI FIS VS NEURAL N... cscpconf
Accurately estimating software size, cost, effort and schedule is probably the biggest
challenge facing software developers today. It has major implications for the management of
software development because both overestimates and underestimates can directly
damage software companies. Many models have been proposed over the years by
various researchers for carrying out effort estimation, and some studies stress the
importance of estimating at an early stage. New paradigms offer alternatives
for estimating software development effort, in particular Computational Intelligence (CI),
which exploits mechanisms of interaction between humans and processes domain
knowledge with the intention of building intelligent systems (IS). Among IS,
Artificial Neural Networks and Fuzzy Logic are the two most popular soft computing techniques
for software development effort estimation. In this paper, neural network models and a Mamdani
FIS model have been used to predict early stage effort using the student dataset.
The Mamdani FIS was found to predict early stage effort more efficiently than
the neural network based models.
TOWARDS PREDICTING SOFTWARE DEFECTS WITH CLUSTERING TECHNIQUES ijaia
The purpose of software defect prediction is to improve the quality of a software project by building a
predictive model to decide whether a software module is or is not fault prone. In recent years, much
research has applied machine learning techniques to this topic. Our aim was to evaluate
the performance of clustering techniques with feature selection schemes to address the problem of software
defect prediction. We analysed the National Aeronautics and Space Administration (NASA)
dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) self-organizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a
comparative analysis involving software defect prediction based on Bat, Cuckoo, Grey Wolf Optimizer
(GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models
enabled us to build an efficient predictive model with a satisfactory detection rate and an acceptable number
of features.
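Of the three clustering algorithms listed, Farthest First is the simplest to illustrate. The sketch below is a generic farthest-first traversal and not the authors' implementation: it deterministically seeds with the first point (implementations often pick the seed at random), and the module metrics are hypothetical.

```python
from math import dist  # Euclidean distance, available since Python 3.8


def farthest_first_centers(points, k):
    """Pick k centers: start with the first point, then repeatedly add the
    point whose distance to its nearest chosen center is largest."""
    centers = [points[0]]
    while len(centers) < k:
        nxt = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(nxt)
    return centers


def assign(points, centers):
    """Label each point with the index of its nearest center."""
    return [min(range(len(centers)), key=lambda i: dist(p, centers[i]))
            for p in points]


# Hypothetical module metrics: (lines of code, cyclomatic complexity).
modules = [(10, 1), (12, 2), (11, 1), (200, 30), (210, 35), (205, 28)]
centers = farthest_first_centers(modules, k=2)
labels = assign(modules, centers)
print(centers, labels)
```

In a defect prediction setting, the resulting clusters would still need to be labelled as fault prone or not, for example by inspecting the metric values of each cluster.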
Estimation of resources, cost, and schedule for a software engineering effort requires experience, access to good historical information, and the courage to commit to quantitative predictions when qualitative information is all that exists. Halstead’s Measure, the COCOMO Model and the COCOMO II Model: estimation techniques used for software development and maintenance.
A DECISION SUPPORT SYSTEM FOR ESTIMATING COST OF SOFTWARE PROJECTS USING A HY... ijfcstjournal
One of the major challenges in software projects today is software cost estimation. It refers to estimating the
cost of all activities, including software development, design, supervision, maintenance and so on. Accurate
cost estimation of software projects allows internal and external processes, staff work, effort and
overheads to be coordinated with one another. In the management of software projects, estimation must
be taken into account so that costs, schedule and possible risks are reduced and project failure is avoided. In this paper,
a decision support system using a combination of a multi-layer artificial neural network and a decision tree is
proposed to estimate the cost of software projects. In the model included in the proposed system,
normalization of factors, which is vital in evaluating effort and cost estimates, is carried out using the C4.5
decision tree. Moreover, testing and training of the factors are done by the multi-layer artificial neural network, and
the most suitable values are allocated to them. The experimental results and evaluations on the NASA60
dataset show that the proposed system has a lower total average relative error than the
COCOMO model.
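To make the comparison baseline concrete, the sketch below computes effort with the Basic COCOMO equation, effort = a × KLOC^b in person-months, and scores estimates with the mean relative error. The abstract does not say which COCOMO variant was used, so Basic COCOMO is an assumption here, and the project sizes and actual efforts are hypothetical.

```python
# Basic COCOMO coefficients (a, b) for the three classic project modes.
COEFFS = {
    "organic": (2.4, 1.05),
    "semi-detached": (3.0, 1.12),
    "embedded": (3.6, 1.20),
}


def basic_cocomo_effort(kloc, mode="organic"):
    """Estimated effort in person-months for a project of `kloc` thousand lines."""
    a, b = COEFFS[mode]
    return a * kloc ** b


def mean_relative_error(actual, predicted):
    """Average of |actual - predicted| / actual over all projects."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical projects: size in KLOC and actual effort in person-months.
sizes = [10, 32, 50]
actual_effort = [30.0, 95.0, 150.0]
predicted = [basic_cocomo_effort(k) for k in sizes]
print(f"Mean relative error: {mean_relative_error(actual_effort, predicted):.3f}")
```

Any competing estimator, such as the neural network and decision tree hybrid described above, can be scored with the same `mean_relative_error` function for a like-for-like comparison.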
Fault localization is time-consuming and difficult, which makes it the bottleneck of the
debugging process. To help facilitate this task, there exist many fault localization techniques
that help narrow down the region of suspicious code in a program. Better accuracy in fault
localization is achieved at a heavy computation cost: fault localization techniques that can
effectively locate faults also manifest a slow response rate. In this paper, we promote the use of
pre-computing to distribute the time-intensive computations to the idle periods of the coding phase,
in order to speed up such techniques and achieve both low cost and high accuracy. We raise the
research problems of finding suitable techniques that can be pre-computed and adapting them to the
pre-computing paradigm in a continuous integration environment. Further, we use an existing
fault localization technique to demonstrate our research exploration, and show visions and
challenges of the related methodologies.
Comparative Performance Analysis of Machine Learning Techniques for Software ... csandit
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques have proven useful
for software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is carried out for software bug prediction on publicly
available data sets. Results showed that most of the machine learning methods performed well on
software bug datasets.
Machine learning approaches are good at solving problems for which limited information is available. In most cases,
software domain problems can be characterized as a learning process that depends on various circumstances
and changes accordingly. A predictive model is constructed using machine learning approaches to
classify modules as defective or non-defective. Machine learning techniques help developers to
retrieve useful information after the classification and enable them to analyse data from different
perspectives. Machine learning techniques have proven useful for software bug prediction. This
study used publicly available data sets of software modules and provides a comparative performance analysis
of different machine learning techniques for software bug prediction. Results showed that most of the machine
learning methods performed well on software bug datasets.
The adoption of machine learning techniques for software defect prediction: A... RAKESH RANA
The adoption of machine learning techniques for software defect prediction: An initial industrial validation
Presented at:
11th Joint Conference On Knowledge-Based Software Engineering, JCKBSE, Volgograd, Russia, 2014
Get full text of publication at:
http://rakeshrana.website/index.php/work/publications/
Measuring effort for modifying software package as reusable package using pac... eSAT Journals
Abstract In any engineering field, the data associated with knowledge is important for making decisions and solving problems in current system development. Specification mining can support the analysis of collected data to help the project management team fulfill its responsibilities. In this paper, ‘Package Specification Mining’ is designed using the reusability quality factor of packages. It helps estimate the effort required to modify a package so that it can be reused in new software development. This methodology may reduce risks in various domains of software engineering. Keywords: Specification Mining, Reusability, Effort Estimation, Coupling, Project Management
A Defect Prediction Model for Software Product based on ANFIS IJSRD
Artificial intelligence techniques are increasingly involved in classification and prediction processes such as environmental monitoring, stock exchange conditions, biomedical diagnosis and software engineering. However, the challenge of selecting training criteria for the design of artificial intelligence models used for prediction has yet to be simplified. This work focuses on developing a defect prediction mechanism using the software metric data of KC1. We have taken a subtractive clustering approach to generate the fuzzy inference system (FIS). The FIS rules are generated at different radii of influence of the input attribute vectors, and the resulting rules are further tuned by the ANFIS technique to predict the number of defects in a software project using a fuzzy logic system.
The impact of innovation on travel and tourism industries (World Travel Marke... Brian Solis
From the impact of Pokemon Go on Silicon Valley to artificial intelligence, futurist Brian Solis talks to Mathew Parsons of World Travel Market about the future of travel, tourism and hospitality.
We’re all trying to find that idea or spark that will turn a good project into a great project. Creativity plays a huge role in the outcome of our work. Harnessing the power of collaboration and open source, we can make great strides towards excellence. Not just for designers, this talk can be applicable to many different roles – even development. In this talk, Seasoned Creative Director Sara Cannon is going to share some secrets about creative methodology, collaboration, and the strong role that open source can play in our work.
Gave a talk at StartCon about the future of Growth. I touch on viral marketing / referral marketing, fake news and social media, and marketplaces. Finally, the slides go through future technology platforms and how things might evolve there.
The Six Highest Performing B2B Blog Post Formats Barry Feldman
If your B2B blogging goals include earning social media shares and backlinks to boost your search rankings, this infographic lists the six best approaches.
Each technological age has been marked by a shift in how the industrial platform enables companies to rethink their business processes and create wealth. In the talk I argue that we are limiting our view of what this next industrial/digital age can offer because of how we read, measure and through that perceive the world (how we cherry pick data). Companies are locked in metrics and quantitative measures, data that can fit into a spreadsheet. And by that they see the digital transformation merely as an efficiency tool to the fossil fuel age. But we need to stretch further…
32 Ways a Digital Marketing Consultant Can Help Grow Your Business Barry Feldman
How can a digital marketing consultant help your business? In this resource we'll count the ways. 24 additional marketing resources are bundled for free.
A Review on Software Fault Detection and Prevention Mechanism in Software Dev... iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
Insights on Research Techniques towards Cost Estimation in Software Design IJECEIAES
Software cost estimation is one of the most challenging tasks in project management for ensuring smooth development operation and target achievement. Various standards, tools and techniques for cost estimation have evolved and are practiced in the industry at present. However, the overall effectiveness of such techniques has not been investigated to date. This paper begins by presenting taxonomies of conventional cost-estimation techniques and then investigates the research trends towards frequently addressed problems. The paper also reviews the existing techniques in a well-structured manner in order to highlight the problems addressed, techniques used, advantages associated and limitations explored in the literature. Finally, we also brief the open research issues explored as an added contribution of this manuscript.
COMPARATIVE STUDY OF SOFTWARE ESTIMATION TECHNIQUES ijseajournal
Many information technology firms among other organizations have been working on how to perform estimation of the sources such as fund and other resources during software development processes. Software development life cycles require lot of activities and skills to avoid risks and the best software estimation technique is supposed to be employed. Therefore, in this research, a comparative study was conducted, that consider the accuracy, usage, and suitability of existing methods. It will be suitable for the project managers and project consultants during the whole software project development process. In this project technique such as linear regression; both algorithmic and non-algorithmic are applied. Model, composite and regression techniques are used to derive COCOMO, COCOMO II, SLIM and linear multiple respectively. Moreover, expertise-based and linear-based rules are applied in non-algorithm methods. However, the technique needs some advancement to reduce the errors that are experienced during the software development process. Therefore, this paper in relation to software estimation techniques has proposed a model that can be helpful to the information technology firms, researchers and other firms that use information technology in the processes such as budgeting and decision-making processes.
A simplified predictive framework for cost evaluation to fault assessment usi...IJECEIAES
Software engineering is an integral part of any software development scheme which frequently encounters bugs, errors, and faults. Predictive evaluation of software fault contributes towards mitigating this challenge to a large extent; however, there is no benchmarked framework being reported in this case yet. Therefore, this paper introduces a computational framework of the cost evaluation method to facilitate a better form of predictive assessment of software faults. Based on lines of code, the proposed scheme deploys adopts a machine-learning approach to address the perform predictive analysis of faults. The proposed scheme presents an analytical framework of the correlation-based cost model integrated with multiple standards machine learning (ML) models, e.g., linear regression, support vector regression, and artificial neural networks (ANN). These learning models are executed and trained to predict software faults with higher accuracy. The study considers assessing the outcomes based on error-based performance metrics in detail to determine how well each learning model performs and how accurate it is at learning. It also looked at the factors contributing to the training loss of neural networks. The validation result demonstrates that, compared to logistic regression and support vector regression, neural network achieves a significantly lower error score for software fault prediction.
In the present paper, applicability and
capability of A.I techniques for effort estimation prediction has
been investigated. It is seen that neuro fuzzy models are very
robust, characterized by fast computation, capable of handling
the distorted data. Due to the presence of data non-linearity, it is
an efficient quantitative tool to predict effort estimation. The one
hidden layer network has been developed named as OHLANFIS
using MATLAB simulation environment.
Here the initial parameters of the OHLANFIS are
identified using the subtractive clustering method. Parameters of
the Gaussian membership function are optimally determined
using the hybrid learning algorithm. From the analysis it is seen
that the Effort Estimation prediction model developed using
OHLANFIS technique has been able to perform well over normal
ANFIS Model.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
Machine learning techniques can be used to analyse data from different perspectives and enable developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on public
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
The International Journal of Engineering And Science (IJES)
||Volume|| 1 ||Issue|| 2 ||Pages|| 239-242 ||2012||
ISSN: 2319 – 1813 ISBN: 2319 – 1805
Overview of Software Fault Prediction using Clustering Approaches
and Tree Data Structure
¹Swati M. Varade, ²Prof. M. D. Ingle
¹,²Department of Computer Science, Jayawantrao Sawant College of Engg, Hadapsar, Pune-028
----------------------------------------------------------------------ABSTRACT------------------------------------------------------------
Fault prediction gives the development team one more chance to retest the modules or files for which the
probability of defectiveness is high. By spending more time on the defective modules and no time on the
non-defective ones, the resources of the project would be utilized better and, as a result, the maintenance
phase of the project will be easier for both the customers and the project owners. Software fault prediction
decreases the total cost of the project and increases the overall project success rate. Accurate prediction of
where faults are likely to occur in code can help direct test effort, reduce costs and improve the quality of
software. This paper presents specific methods of fault prediction for software safety that directly address the
root causes of software faults and improve the quality of software.
Keywords - Clustering, Hyper-Quad Tree, K-Means clustering, Quad Tree, Software Fault Prediction.
----------------------------------------------------------------------------------------------------------------------------------------------------
Date of Submission: 11 December 2012          Date of Publication: 25 December 2012
----------------------------------------------------------------------------------------------------------------------------------------------------
I. INTRODUCTION
The main objective of this paper is to predict the faults that tend to occur while classifying the dataset, and thereby to improve the quality of software. Developing a defect-free software system is a very difficult task, and software projects sometimes contain unknown faults or deficiencies, so the principles of software development methodologies need to be applied carefully. By spending more time on the defective modules and no time on the non-defective ones, the resources of the project would be utilized better and, as a result, the maintenance phase of the project will be easier for both the customers and the project owners.
Publications on fault prediction show that early studies relied mostly on static code features. It was later understood that, besides static code metrics, other measures such as process metrics also affect fault prediction and should be investigated. For example, Fenton and Neil (1999) argue that static code measures alone are not able to predict software faults accurately.
In support of this idea, faulty software may be related to one of the following:
• The specification of the project may be wrong due to differing requirements or missing features.
• Because of improper documentation, realization of the project is too complex.
• Scarce and incorrect requirements result in poor design.
• Developers are not qualified enough for the project.
• The software life cycle methodologies might not be followed very well.
• Improper and incomplete testing of software.
Such faulty software classes may increase development and maintenance costs due to software failures, and decrease customer satisfaction.
• The main objective of this paper is to predict the faults that tend to occur while classifying the dataset.
• Hyper Quad-Trees are applied for finding initial cluster centers for the K-Means algorithm.
• The overall error rates of this prediction approach are compared to other existing algorithms and are found to be better in most of the cases.

II. RELATED WORK
Previous work on predicting faulty software components enables verification experts to concentrate their time and resources on the problem areas of the software system under development. One of the main purposes of these models is to assist in software maintenance budgeting.
Among the various clustering techniques available in the literature, K-Means clustering is the most widely used. Different authors apply different clustering techniques and expert-based approaches to the software fault prediction problem. They applied K-Means [8][9] and Neural-Gas techniques to different real data sets, and then an expert explored the representative module of each cluster and several statistical data in order to label each cluster as faulty or non-faulty. In their experience, the Neural-Gas-based prediction approach performed slightly worse than the K-Means clustering-based approach in terms of the overall error rate on large data sets. However, their approach depends on the availability and capability of the expert. Seliya and
www.theijes.com The IJES Page 239
Khoshgoftaar proposed a constraint-based semi-supervised clustering scheme. They showed that this approach helped the expert make better estimations than predictions made by an unsupervised learning algorithm. In [1], a Quad Tree-based K-Means algorithm has been applied for predicting faults in program modules. The aim of that work is twofold. First, Quad-Trees are applied for finding the initial cluster centers to be input to the K-Means algorithm. Bhattacherjee and Bishnu [1] have applied an unsupervised learning approach for fault prediction in software modules. An input threshold parameter delta governs the number of initial cluster centers, and by varying delta the user can generate the desired initial cluster centers. The clusters obtained by the Quad Tree-based algorithm were found to have maximum gain values. Second, the Quad Tree-based algorithm is applied for predicting faults in program modules. Supervised techniques have, however, been applied for software fault prediction and software effort prediction. There is no solution for finding the optimal number of clusters for any given data set in K-Means. The overall error rates of this prediction approach are compared to other existing algorithms and are found to be better in most of the cases. In this paper I try to find a better centroid than the Quad-tree algorithm by using the Hyper Quad-tree, whose output is given as input to the K-Means algorithm in order to lower the error rate and make software fault prediction more effective. Due to some defective software modules, the maintenance phase of software projects could become really painful for the users and costly for the enterprises. That is why predicting the defective modules or files in a software system prior to project deployment is a very crucial activity, since it leads to a decrease in the total cost of the project and an increase in the overall project success rate.

III. OVERVIEW
This paper presents a study of the K-Means clustering algorithm, the Quad-tree algorithm and the Hyper Quad-tree algorithm, followed by the proposed system architecture for software fault prediction, the expected result using a confusion matrix, and the conclusion.

3.1. K-Means clustering algorithm
K-Means (MacQueen, 1967) is one of the simplest unsupervised learning algorithms that solve the well-known clustering problem. The procedure follows a simple and easy way to classify a given data set through a certain number of clusters (assume k clusters) fixed a priori. The main idea is to define k centroids, one for each cluster. These centroids should be placed carefully, because different locations cause different results. So, the better choice is to place them as far away from each other as possible. The next step is to take each point belonging to the data set and associate it with the nearest centroid. When no point is pending, the first step is completed and an early grouping is done. At this point we need to recalculate k new centroids as barycenters of the clusters resulting from the previous step. After we have these k new centroids, a new binding has to be done between the same data set points and the nearest new centroid. A loop has thus been generated. As a result of this loop, the k centroids change their location step by step until no more changes occur.

3.1.1. The algorithm is composed of the following steps
1. Place K points into the space represented by the objects that are being clustered. These points represent initial group centroids.
2. Assign each object to the group that has the closest centroid.
3. When all objects have been assigned, recalculate the positions of the K centroids.
4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.

3.1.2. Limitations of K-Means
The cluster centers, thus found, serve as input to the clustering algorithms. However, K-Means has some inherent drawbacks:
• The user has to initialize the number of clusters, which is very difficult to identify in most of the cases.
• It requires selection of suitable initial cluster centers, which is again subject to error. Since the structure of the clusters depends on the initial cluster centers, this may result in an inefficient clustering.
• The K-Means algorithm is very sensitive to noise.

3.2. Quad Tree
This data structure was named a Quad tree by Raphael Finkel and J.L. Bentley in 1974. A similar partitioning is also known as a Q-tree. The Quad Tree-based method assigns the appropriate initial cluster centers and eliminates the outliers, hence overcoming the second and third drawbacks of the K-Means clustering algorithm.
Common features of quad trees:
• They decompose space into adaptable cells.
• Each cell (or bucket) has a maximum capacity. When maximum capacity is reached, the bucket splits.
• The tree directory follows the spatial decomposition of the Quad tree.
Figure 1 shows a simple Quad tree representation.

Figure 1. Simple Quad Tree.

3.2.1. Some definitions of notations and parameters
• Minimum: User defined threshold for the minimum number of data points in a sub bucket.
• Maximum: User defined threshold for the maximum number of data points in a sub bucket.
• White leaf bucket: A sub bucket having less than MIN number of data points of the parent bucket. Fig. shows an illustration of a white leaf bucket.
• Black leaf bucket: A sub bucket having more than MAX number of data points of the parent bucket.
• Gray bucket: A sub bucket which is neither white nor black.
• User specified distance for finding nearest neighbors.

3.2.2. Quad Tree Algorithm [8][9]
For each class:
• Find the minimum and maximum x and y co-ordinates.
• Find the midpoint using the values obtained in the previous step.
• Divide the spatial area into four sub regions based on the midpoint.
• Plot the points and classify regions as white leaf buckets or black leaf buckets.
• The white leaf buckets are left as such.
• The center data points of all black leaf buckets are calculated.
• The mean of all the center points obtained in the previous step is calculated.
• The computed mean gives the centroid point necessary for that class.

3.2.3. Limitations of Quad Tree
• The user has to initialize the number of clusters, which is very difficult to identify using the quad tree algorithm.
• It does not provide the exact centroid.

3.3. Hyper Quad Tree
The Hyper Quad Tree-based method assigns the appropriate initial cluster centers and eliminates the outliers, hence overcoming the second and third drawbacks of the K-Means algorithm, that is:
• Hyper Quad-Trees are applied for finding initial cluster centers for the K-Means algorithm. The user can generate the desired number of cluster centers that can be used as input to the simple K-Means.
• Second, the centroid obtained by the Hyper Quad Tree is more accurate than that of the Quad tree.
Figure 2 shows a simple hyper quad tree representation of a data set; the dots denote the data.

Hyper Quad-Trees are expected to give better cluster centers than the Quad-tree because:
• It is an eight-way branching tree whose nodes are associated with axis-parallel boxes.
• The d-dimensional analogue is known variously as a multidimensional Quad-tree and a hyper quad tree.
• It divides the regions recursively so that no region contains more than one data point.

The algorithm to generate a hyper quad tree is as follows:
Select Random Node
Require: Hyper Quad Tree
1. Initialize current node = root node
2. While current node is not a leaf node do
3.     Generate a random number n
4.     If n < p then    // n = number of data points, p = random path termination probability
5.         Break the while loop
6.     Else
7.         Randomly select one child
8.         current node = selected node
9.     End if
10. End while
11. Select the current node

Input: Dimension, Data set, Min, Max
Output: Centroid

IV. THE PROPOSED SYSTEM
The proposed system is "Software fault prediction using clustering approach", which classifies the given data using the Hyper Quad-tree algorithm.
The system consists of 3 modules:
• Create dataset parser. The data set is given as input to the Hyper Quad-tree algorithm, in which we create cells, insert cells, label buckets, split cells and perform spatial decomposition.
  Input: Dimension, Data set
  Output: Centroid
• The centroid points obtained using the Hyper Quad-tree are given as input to K-Means to get better clusters. It calculates the distances, shuffles the data points according to distance, and stops if the centroids are stable. The output of this module is the set of clusters.
• Measure the faults in terms of FPR, FNR and ERROR using the confusion matrix.

Figure 3. System Architecture
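The last two modules described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the authors' implementation: the Hyper Quad-tree step is not implemented, so the seed centroids are supplied by hand, and the function names (`kmeans`, `label_clusters`, `error_rates`), the sample modules, and the threshold values are all hypothetical. Clusters are labeled faulty when any centroid metric exceeds its threshold, and the rates use the confusion-matrix cells A (true negative), B (false positive), C (false negative) and D (true positive).

```python
from math import dist  # Euclidean distance, available in Python 3.8+


def kmeans(points, seeds, max_iter=100):
    """K-Means started from the given seed centroids (assumed to come from a tree-based step)."""
    centroids = [tuple(s) for s in seeds]
    assignment = []
    for _ in range(max_iter):
        # Assign every point to its closest centroid.
        assignment = [
            min(range(len(centroids)), key=lambda k: dist(p, centroids[k]))
            for p in points
        ]
        # Recompute each centroid as the mean of its members.
        updated = []
        for k, old in enumerate(centroids):
            members = [p for p, a in zip(points, assignment) if a == k]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # an empty cluster keeps its previous centroid
        if updated == centroids:  # centroids are stable: stop
            break
        centroids = updated
    return centroids, assignment


def label_clusters(centroids, thresholds):
    """A cluster is labeled faulty if any centroid metric exceeds its threshold."""
    return [any(v > t for v, t in zip(c, thresholds)) for c in centroids]


def error_rates(actual, predicted):
    """Return (FPR, FNR, Error) from boolean labels, where True means faulty."""
    pairs = list(zip(actual, predicted))
    tn = sum(not a and not p for a, p in pairs)  # cell A
    fp = sum(not a and p for a, p in pairs)      # cell B
    fn = sum(a and not p for a, p in pairs)      # cell C
    tp = sum(a and p for a, p in pairs)          # cell D
    fpr = fp / (tn + fp) if tn + fp else 0.0
    fnr = fn / (tp + fn) if tp + fn else 0.0
    return fpr, fnr, (fp + fn) / len(pairs)


# Hypothetical data: two metrics per module, with hand-picked seeds and thresholds.
modules = [(1.0, 2.0), (1.5, 1.8), (1.2, 2.2), (8.0, 9.0), (9.0, 8.5)]
actual = [False, False, True, True, True]
centroids, assignment = kmeans(modules, seeds=[(1.0, 2.0), (9.0, 9.0)])
cluster_faulty = label_clusters(centroids, thresholds=(5.0, 5.0))
predicted = [cluster_faulty[a] for a in assignment]
fpr, fnr, error = error_rates(actual, predicted)
print(fpr, fnr, error)
```

In this toy run the third module is actually faulty but sits in the low-metric cluster, so it is counted as a false negative. It illustrates why labeling by centroid alone can let error-prone modules escape testing.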
Figure 2. Hyper Quad Tree [7]

As shown in Table 1, the actual labels of data items are placed along the rows, while the predicted labels are placed along the columns. For example, a False actual label implies that the module is not faulty. If a non-faulty module (Actual label: False) is predicted as non-faulty (Predicted label: False), we get the condition of cell A, which is True Negative, and if it is predicted as faulty (Predicted label: True), we get the condition of cell B, which is False Positive. Similar definitions hold for False Negative and True Positive. The false positive rate (FPR) is the percentage of non-faulty modules labeled as faulty by the model, the false negative rate (FNR) is the percentage of faulty modules labeled as fault free, and Error is the percentage of mislabeled modules. The following equations are used to calculate FPR, FNR, Error, and Precision [1]:

$$\mathrm{FPR} = \frac{B}{A + B} \tag{1}$$

$$\mathrm{FNR} = \frac{C}{D + C} \tag{2}$$

$$\mathrm{ERROR} = \frac{B + C}{A + B + C + D} \tag{3}$$

TABLE 1. Confusion Matrix

                                  Predicted Labels
Actual Labels            False (Non-Faulty)      True (Faulty)
False (Non-Faulty)       True Negative (A)       False Positive (B)
True (Faulty)            False Negative (C)      True Positive (D)

The above performance indicators should be minimized. A high value of FPR would lead to wasted testing effort, while a high FNR value means error-prone modules will escape testing.

In this paper, for calculating the measures, if any metric value of the centroid data point of a cluster is greater than the threshold, that cluster is labeled as faulty; otherwise it is labeled as non-faulty. The predicted fault labels are then compared with the actual fault labels. The clusters can also be labeled according to the majority of their members (by comparing with metric thresholds), but this increases the complexity of the labeling procedure since all the modules in the cluster need to be examined.

CONCLUSION AND FUTURE SCOPE
The Hyper Quad-tree based K-Means clustering algorithm is to be evaluated for its effectiveness in predicting faulty software modules as compared to the original Quad-tree based K-Means algorithm. It is also expected to find better initial cluster centers for the K-Means algorithm. By using the hyper quad tree I try to meet the convergence criterion faster, which results in fewer iterations. There will also be a reduction in time and computational complexity by reducing the number of iterations (NOI), and better throughput with lower error rates of classification.
In this paper I am not focusing on automatic initialization of the number of clusters; this will be future work for better software fault prediction using the clustering approach.

ACKNOWLEDGMENT
This is a short review of my postgraduate project work, which I am about to start implementing. I especially thank my guide for his assistance.

REFERENCES
[1] P.S. Bishnu and V. Bhattacherjee, "Software Fault Prediction Using Quad Tree-Based K-Means Clustering Algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, June 2012.
[2] P.S. Bishnu and V. Bhattacherjee, "Outlier Detection Technique Using Quad Tree," Proc. Int'l Conf. Computer Comm. Control and Information Technology, pp. 143-148, Feb. 2009.
[3] P.S. Bishnu and V. Bhattacherjee, "Application of K-Medoids with kd-Tree for Software Fault Prediction," ACM Software Eng. Notes, vol. 36, pp. 1-6, Mar. 2011.
[4] V. Bhattacherjee and P.S. Bishnu, "Software Fault Prediction Using KMedoids Algorithm," Proc. Int'l Conf. Productivity, Quality, Reliability, Optimization and Modeling (ICPQROM '11), p. 191, Feb. 2011.
[5] J. Han and M. Kamber, "Data Mining Concepts and Techniques," second ed., pp. 401-404, Morgan Kaufmann Publishers, 2007.
[6] Parvinder S. Sandhu, Jagdeep Singh, Vikas Gupta, Mandeep Kaur, Sonia Manhas, Ramandeep Sidhu, "A K-Means Based Clustering Approach for Finding Faulty Modules in Open Source Software Systems," World Academy of Science, Engineering and Technology 48, 2010.
[7] Michael Laszlo and Sumitra Mukherjee, "A Genetic Algorithm Using Hyper-Quadtrees for Low-Dimensional K-means Clustering," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 28, no. 4, April 2006.
[8] Leela Rani P., Rajalakshmi P., "Clustering Gene Expression Data using Quad-tree based Expectation Maximization Approach," International Journal of Applied Information Systems (IJAIS), ISSN: 2249-0868, Foundation of Computer Science FCS, New York, USA, Volume 2, No. 2, June 2012, www.ijais.org.
[9] Meenakshi PC, Meenu S, Mithra M, Leela Rani P., "Fault Prediction using Quad-tree and Expectation Maximization Algorithm," International Journal of Applied Information Systems (IJAIS), ISSN: 2249-0868, Foundation of Computer Science FCS, New York, USA, Volume 2, No. 4, May 2012, www.ijais.org.
[10] P.S. Bishnu and V. Bhattacharjee, "A New Initialization Method for KMeans Algorithm Using Quad Tree," Proc. Nat'l Conf. Methods and Models in Computing (NCM2C), pp. 73-81, 2008.

Biographies and Photographs
Prof. M. D. Ingle is currently working as an Associate Professor at Jayawantrao Sawant College of Engineering, Hadapsar, Pune, India. His research interests are Networking, Information Security, Mobile Computing, Data Mining, etc.
Swati M. Varade is pursuing an M.E. (Computer) at Jayawantrao Sawant College of Engineering, Hadapsar, Pune, India, and is currently working as a Lecturer in the same institute. Her research interests are Software Testing, Data Mining, etc.