an error in that computer program. In order to improve the software quality, prediction of faulty modules is
necessary. Various Metric suites and techniques are available to predict the modules which are critical and
likely to be fault prone. Genetic Algorithm is a problem solving algorithm. It uses genetics as its model of
problem solving. It’s a search technique to find approximate solutions to optimization and search
problems.Genetic algorithm is applied for solving the problem of faulty module prediction and as well as
for finding the most important attribute for fault occurrence. In order to perform the analysis, performance
validation of the Genetic Algorithm using open source software jEdit is done. The results are measured in
terms Accuracy and Error in predicting by calculating probability of detection and probability of false
Alarms
ANALYSIS OF MACHINE LEARNING ALGORITHMS WITH FEATURE SELECTION FOR INTRUSION ...IJNSA Journal
In recent times, various machine learning classifiers are used to improve network intrusion detection. The researchers have proposed many solutions for intrusion detection in the literature. The machine learning classifiers are trained on older datasets for intrusion detection, which limits their detection accuracy. So, there is a need to train the machine learning classifiers on the latest dataset. In this paper, UNSW-NB15, the latest dataset is used to train machine learning classifiers. The selected classifiers such as K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) classifiers are used for training from the taxonomy of classifiers based on lazy and eager learners. In this paper, Chi-Square, a filter-based feature selection technique, is applied to the UNSW-NB15 dataset to reduce the irrelevant and redundant features. The performance of classifiers is measured in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR) and False Positive Rate (FPR) with or without feature selection technique and comparative analysis of these machine learning classifiers is carried out.
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on public
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
Machine Learning approaches are good in solving problems that have less information. In most cases, the
software domain problems characterize as a process of learning that depend on the various circumstances
and changes accordingly. A predictive model is constructed by using machine learning approaches and
classified them into defective and non-defective modules. Machine learning techniques help developers to
retrieve useful information after the classification and enable them to analyse data from different
perspectives. Machine learning techniques are proven to be useful in terms of software bug prediction. This
study used public available data sets of software modules and provides comparative performance analysis
of different machine learning techniques for software bug prediction. Results showed most of the machine
learning methods performed well on software bug datasets.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Implementation of reducing features to improve code change based bug predicti...eSAT Journals
Abstract Today, we are getting plenty of bugs in the software because of variations in the software and hardware technologies. Bugs are nothing but Software faults, existing a severe challenge for system reliability and dependability. To identify the bugs from the software bug prediction is convenient approach. To visualize the presence of a bug in a source code file, recently, Machine learning classifiers approach is developed. Because of a huge number of machine learned features current classifier-based bug prediction have two major problems i) inadequate precision for practical usage ii) measured prediction time. In this paper we used two techniques first, cos-triage algorithm which have a go to enhance the accuracy and also lower the price of bug prediction and second, feature selection methods which eliminate less significant features. Reducing features get better the quality of knowledge extracted and also boost the speed of computation. Keywords: Efficiency, Bug Prediction, Classification, Feature Selection, Accuracy
Development of software defect prediction system using artificial neural networkIJAAS Team
Software testing is an activity to enable a system is bug free during execution process. The software bug prediction is one of the most encouraging exercises of the testing phase of the software improvement life cycle. In any case, in this paper, a framework was created to anticipate the modules that deformity inclined in order to be utilized to all the more likely organize software quality affirmation exertion. Genetic Algorithm was used to extract relevant features from the acquired datasets to eliminate the possibility of overfitting and the relevant features were classified to defective or otherwise modules using the Artificial Neural Network. The system was executed in MATLAB (R2018a) Runtime environment utilizing a statistical toolkit and the performance of the system was assessed dependent on the accuracy, precision, recall, and the f-score to check the effectiveness of the system. In the finish of the led explores, the outcome indicated that ECLIPSE JDT CORE, ECLIPSE PDE UI, EQUINOX FRAMEWORK and LUCENE has the accuracy, precision, recall and the f-score of 86.93, 53.49, 79.31 and 63.89% respectively, 83.28, 31.91, 45.45 and 37.50% respectively, 83.43, 57.69, 45.45 and 50.84% respectively and 91.30, 33.33, 50.00 and 40.00% respectively. This paper presents an improved software predictive system for the software defect detections.
COMPUTER INTRUSION DETECTION BY TWOOBJECTIVE FUZZY GENETIC ALGORITHMcscpconf
The purpose of this paper is to describe two objective fuzzy genetics-based learning algorithms
and discusses its usage to detect intrusion in a computer network. Experiments were performed
with KDD-cup data set, which have information on computer networks, during normal behavior
and intrusive behavior. The performance of final fuzzy classification system has been
investigated using intrusion detection problem as a high dimensional classification problem.
This task is formulated as optimization problem with two objectives: To minimize the number of
fuzzy rules and to maximize the classification rate. We show a two-objective genetic algorithm
for finding non-dominated solutions of the fuzzy rule selection problem
ANALYSIS OF MACHINE LEARNING ALGORITHMS WITH FEATURE SELECTION FOR INTRUSION ...IJNSA Journal
In recent times, various machine learning classifiers are used to improve network intrusion detection. The researchers have proposed many solutions for intrusion detection in the literature. The machine learning classifiers are trained on older datasets for intrusion detection, which limits their detection accuracy. So, there is a need to train the machine learning classifiers on the latest dataset. In this paper, UNSW-NB15, the latest dataset is used to train machine learning classifiers. The selected classifiers such as K-Nearest Neighbors (KNN), Stochastic Gradient Descent (SGD), Random Forest (RF), Logistic Regression (LR), and Naïve Bayes (NB) classifiers are used for training from the taxonomy of classifiers based on lazy and eager learners. In this paper, Chi-Square, a filter-based feature selection technique, is applied to the UNSW-NB15 dataset to reduce the irrelevant and redundant features. The performance of classifiers is measured in terms of Accuracy, Mean Squared Error (MSE), Precision, Recall, F1-Score, True Positive Rate (TPR) and False Positive Rate (FPR) with or without feature selection technique and comparative analysis of these machine learning classifiers is carried out.
Comparative Performance Analysis of Machine Learning Techniques for Software ...csandit
Machine learning techniques can be used to analyse data from different perspectives and enable
developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on public
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
Machine Learning approaches are good in solving problems that have less information. In most cases, the
software domain problems characterize as a process of learning that depend on the various circumstances
and changes accordingly. A predictive model is constructed by using machine learning approaches and
classified them into defective and non-defective modules. Machine learning techniques help developers to
retrieve useful information after the classification and enable them to analyse data from different
perspectives. Machine learning techniques are proven to be useful in terms of software bug prediction. This
study used public available data sets of software modules and provides comparative performance analysis
of different machine learning techniques for software bug prediction. Results showed most of the machine
learning methods performed well on software bug datasets.
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Implementation of reducing features to improve code change based bug predicti...eSAT Journals
Abstract Today, we are getting plenty of bugs in the software because of variations in the software and hardware technologies. Bugs are nothing but Software faults, existing a severe challenge for system reliability and dependability. To identify the bugs from the software bug prediction is convenient approach. To visualize the presence of a bug in a source code file, recently, Machine learning classifiers approach is developed. Because of a huge number of machine learned features current classifier-based bug prediction have two major problems i) inadequate precision for practical usage ii) measured prediction time. In this paper we used two techniques first, cos-triage algorithm which have a go to enhance the accuracy and also lower the price of bug prediction and second, feature selection methods which eliminate less significant features. Reducing features get better the quality of knowledge extracted and also boost the speed of computation. Keywords: Efficiency, Bug Prediction, Classification, Feature Selection, Accuracy
Development of software defect prediction system using artificial neural networkIJAAS Team
Software testing is an activity to enable a system is bug free during execution process. The software bug prediction is one of the most encouraging exercises of the testing phase of the software improvement life cycle. In any case, in this paper, a framework was created to anticipate the modules that deformity inclined in order to be utilized to all the more likely organize software quality affirmation exertion. Genetic Algorithm was used to extract relevant features from the acquired datasets to eliminate the possibility of overfitting and the relevant features were classified to defective or otherwise modules using the Artificial Neural Network. The system was executed in MATLAB (R2018a) Runtime environment utilizing a statistical toolkit and the performance of the system was assessed dependent on the accuracy, precision, recall, and the f-score to check the effectiveness of the system. In the finish of the led explores, the outcome indicated that ECLIPSE JDT CORE, ECLIPSE PDE UI, EQUINOX FRAMEWORK and LUCENE has the accuracy, precision, recall and the f-score of 86.93, 53.49, 79.31 and 63.89% respectively, 83.28, 31.91, 45.45 and 37.50% respectively, 83.43, 57.69, 45.45 and 50.84% respectively and 91.30, 33.33, 50.00 and 40.00% respectively. This paper presents an improved software predictive system for the software defect detections.
COMPUTER INTRUSION DETECTION BY TWOOBJECTIVE FUZZY GENETIC ALGORITHMcscpconf
The purpose of this paper is to describe two objective fuzzy genetics-based learning algorithms
and discusses its usage to detect intrusion in a computer network. Experiments were performed
with KDD-cup data set, which have information on computer networks, during normal behavior
and intrusive behavior. The performance of final fuzzy classification system has been
investigated using intrusion detection problem as a high dimensional classification problem.
This task is formulated as optimization problem with two objectives: To minimize the number of
fuzzy rules and to maximize the classification rate. We show a two-objective genetic algorithm
for finding non-dominated solutions of the fuzzy rule selection problem
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksEditor IJCATR
Defects in modules of software systems is a major problem in software development. There are a variety of data mining
techniques used to predict software defects such as regression, association rules, clustering, and classification. This paper is concerned
with classification based software defect prediction. This paper investigates the effectiveness of using a radial basis function neural
network and a probabilistic neural network on prediction accuracy and defect prediction. The conclusions to be drawn from this work is
that the neural networks used in here provide an acceptable level of accuracy but a poor defect prediction ability. Probabilistic neural
networks perform consistently better with respect to the two performance measures used across all datasets. It may be advisable to use
a range of software defect prediction models to complement each other rather than relying on a single technique.
Survey of network anomaly detection using markov chainijcseit
Recently an internet threat has been increased. Our motive is detect the intrusion in the network in concise.
The real time issue such as DoS attack in banking, companies, industries and organization have been
increased significantly IDS has been used in both server and host side. The major challenge is to effectively
predict the periods of threats and protect the server from the unauthorized user. In this study, a novel
probabilistic approach is proposed effectively to detect the network intrusions. It uses a Markov chain for
probabilistic modelling of abnormal events in network systems. The degree of abnormality of the incoming
data is performed on the basis of the network states.
A Defect Prediction Model for Software Product based on ANFISIJSRD
Artificial intelligence techniques are day by day getting involvement in all the classification and prediction based process like environmental monitoring, stock exchange conditions, biomedical diagnosis, software engineering etc. However still there are yet to be simplify the challenges of selecting training criteria for design of artificial intelligence models used for prediction of results. This work focus on the defect prediction mechanism development using software metric data of KC1.We have taken subtractive clustering approach for generation of fuzzy inference system (FIS).The FIS rules are generated at different radius of influence of input attribute vectors and the developed rules are further modified by ANFIS technique to obtain the prediction of number of defects in software project using fuzzy logic system.
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...ijctcm
This paper reports on the empirical evaluation of five machine learning algorithm such as J48, BayesNet, OneR, NB and ZeroR using ten performance criteria: accuracy, precision, recall, F-Measure, incorrectly classified instances, kappa statistic, mean absolute error, root mean squared error, relative absolute error, root relative squared error. The aim of this paper is to find out which classifier is better in its performance for intrusion detection system. Machine Learning is one of the methods used in the intrusion detection system (IDS).Based on this study, it can be concluded that J48 decision tree is the most suitable associated algorithm than the other four algorithms. In this paper we compared the performance of Intrusion Detection System (IDS) Classifiers using seven feature reduction techniques.
Evaluation of network intrusion detection using markov chainIJCI JOURNAL
Day today life internet threat has been increased significantly. There is a need to develop model in order to
maintain security of system. The most effective techniques are Intrusion Detection System (IDS).The
purpose of intrusion system through the security devices detect and deal with it. In this paper, a
mathematical approach is used effectively to predict and detect intrusion in the network. Here we discuss
about two algorithms ‘K-Means + Apriori’, a method which classify normal and abnormal activities in
computer network. In K-Means process, it partitions the training set into K-clusters using Euclidean
distance and introduce an outlier factor, then it build Apriori Algorithm to prune the data by removing
infrequent data in the database. Based on defined state the degree of incoming data is evaluated through
the experiment using sample DARPA2000 dataset, and achieves high detection performance in level of
attack in stages.
The International Journal of Engineering and Science (IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Bug triage means to transfer a new bug to expertise developer. The manual bug triage is opulent in time
and poor in accuracy, there is a need to automatize the bug triage process. In order to automate the bug triage
process, text classification techniques are applied using stopword removal and stemming. In our proposed work
we have used NB-Classifiers to predict the developers. The data reduction techniques like instance selection
and keyword selection are used to obtain bug report and words. This will help the system to predict only those
developers who are expertise in solving the assigned bug. We will also provide the change of status of bug
report i.e. if the bug is solved then the bug report will be updated. If a particular developer fails to solve the bug
then the bug will go back to another developer.
Design and Development of Arm-Based Control System for Nursing Bed IJCSES Journal
This paper introduces a kind of ARM embedded system as the control systemof the nursing bed.The
embedded control system takes the ARM9 S3C2440 chip as the core of data processing. The design ofthe
control system includes hardware design, software design and PC monitoring system design.
The community detection in complex networks has attracted a growing interest and is the subject of several
researches that have been proposed to understand the network structure and analyze the network
properties. In this paper, we give a thorough overview of different community discovery strategies, we
propose taxonomy of these methods, and we specify the differences between the suggested classes which
helping designers to compare and choose the most suitable strategy for the various types of network
encountered in the real world.
Loan financings belong and parcel of any type of human being when the entire globe is viewing an uptrend of costs for all important commodities permanently to go perfectly. The increase in cost has actually been so high that every person is feeling the crunch of money to stabilize the need and the supply.
The internet has been playing an increasingly important role in our daily life, with the availability of many web services such as email and search engines. However, these are often threatened by attacks from computer programs such as bots. To address this problem, CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) was developed to distinguish between computer programs and human users. Although this mechanism offers good security and limits automatic registration to web services, some CAPTCHAs have several weaknesses which allow hackers to infiltrate the mechanism of the CAPTCHA. This paper examines recent research on various CAPTCHA methods and their categories. Moreover it discusses the weakness and strength of these types.
This paper presents a set of methods that uses a genetic algorithm for automatic test-data generation in
software testing. For several years researchers have proposed several methods for generating test data
which had different drawbacks. In this paper, we have presented various Genetic Algorithm (GA) based test
methods which will be having different parameters to automate the structural-oriented test data generation
on the basis of internal program structure. The factors discovered are used in evaluating the fitness
function of Genetic algorithm for selecting the best possible Test method. These methods take the test
populations as an input and then evaluate the test cases for that program. This integration will help in
improving the overall performance of genetic algorithm in search space exploration and exploitation fields
with better convergence rate.
In order to achieve the wide range of the robotic application it is necessary to provide iterative motions
among points of the goals. For instance, in the industry mobile robots can replace any components between
a storehouse and an assembly department. Ammunition replacement is widely used in military services.
Working place is possible in ports, airports, waste site and etc. Mobile agents can be used for monitoring if
it is necessary to observe control points in the secret place. The paper deals with path planning programme
for mobile robots. The aim of the research paper is to analyse motion-planning algorithms that contain the
design of modelling programme. The programme is needed as environment modelling to obtain the
simulation data. The simulation data give the possibility to conduct the wide analyses for selected
algorithm. Analysis means the simulation data interpretation and comparison with other data obtained
using the motion-planning. The results of the careful analysis were considered for optimal path planning
algorithms. The experimental evidence was proposed to demonstrate the effectiveness of the algorithm for
steady covered space. The results described in this work can be extended in a number of directions, and
applied to other algorithms.
Software Defect Prediction Using Radial Basis and Probabilistic Neural NetworksEditor IJCATR
Defects in modules of software systems is a major problem in software development. There are a variety of data mining
techniques used to predict software defects such as regression, association rules, clustering, and classification. This paper is concerned
with classification based software defect prediction. This paper investigates the effectiveness of using a radial basis function neural
network and a probabilistic neural network on prediction accuracy and defect prediction. The conclusions to be drawn from this work is
that the neural networks used in here provide an acceptable level of accuracy but a poor defect prediction ability. Probabilistic neural
networks perform consistently better with respect to the two performance measures used across all datasets. It may be advisable to use
a range of software defect prediction models to complement each other rather than relying on a single technique.
Survey of network anomaly detection using markov chainijcseit
Recently an internet threat has been increased. Our motive is detect the intrusion in the network in concise.
The real time issue such as DoS attack in banking, companies, industries and organization have been
increased significantly IDS has been used in both server and host side. The major challenge is to effectively
predict the periods of threats and protect the server from the unauthorized user. In this study, a novel
probabilistic approach is proposed effectively to detect the network intrusions. It uses a Markov chain for
probabilistic modelling of abnormal events in network systems. The degree of abnormality of the incoming
data is performed on the basis of the network states.
A Defect Prediction Model for Software Product based on ANFISIJSRD
Artificial intelligence techniques are day by day getting involvement in all the classification and prediction based process like environmental monitoring, stock exchange conditions, biomedical diagnosis, software engineering etc. However still there are yet to be simplify the challenges of selecting training criteria for design of artificial intelligence models used for prediction of results. This work focus on the defect prediction mechanism development using software metric data of KC1.We have taken subtractive clustering approach for generation of fuzzy inference system (FIS).The FIS rules are generated at different radius of influence of input attribute vectors and the developed rules are further modified by ANFIS technique to obtain the prediction of number of defects in software project using fuzzy logic system.
An Empirical Comparison and Feature Reduction Performance Analysis of Intrusi...ijctcm
This paper reports on the empirical evaluation of five machine learning algorithm such as J48, BayesNet, OneR, NB and ZeroR using ten performance criteria: accuracy, precision, recall, F-Measure, incorrectly classified instances, kappa statistic, mean absolute error, root mean squared error, relative absolute error, root relative squared error. The aim of this paper is to find out which classifier is better in its performance for intrusion detection system. Machine Learning is one of the methods used in the intrusion detection system (IDS).Based on this study, it can be concluded that J48 decision tree is the most suitable associated algorithm than the other four algorithms. In this paper we compared the performance of Intrusion Detection System (IDS) Classifiers using seven feature reduction techniques.
Evaluation of network intrusion detection using markov chainIJCI JOURNAL
Day today life internet threat has been increased significantly. There is a need to develop model in order to
maintain security of system. The most effective techniques are Intrusion Detection System (IDS).The
purpose of intrusion system through the security devices detect and deal with it. In this paper, a
mathematical approach is used effectively to predict and detect intrusion in the network. Here we discuss
about two algorithms ‘K-Means + Apriori’, a method which classify normal and abnormal activities in
computer network. In K-Means process, it partitions the training set into K-clusters using Euclidean
distance and introduce an outlier factor, then it build Apriori Algorithm to prune the data by removing
infrequent data in the database. Based on defined state the degree of incoming data is evaluated through
the experiment using sample DARPA2000 dataset, and achieves high detection performance in level of
attack in stages.
The International Journal of Engineering and Science (IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
Bug triage means to transfer a new bug to expertise developer. The manual bug triage is opulent in time
and poor in accuracy, there is a need to automatize the bug triage process. In order to automate the bug triage
process, text classification techniques are applied using stopword removal and stemming. In our proposed work
we have used NB-Classifiers to predict the developers. The data reduction techniques like instance selection
and keyword selection are used to obtain bug report and words. This will help the system to predict only those
developers who are expertise in solving the assigned bug. We will also provide the change of status of bug
report i.e. if the bug is solved then the bug report will be updated. If a particular developer fails to solve the bug
then the bug will go back to another developer.
Design and Development of Arm-Based Control System for Nursing Bed IJCSES Journal
This paper introduces a kind of ARM embedded system as the control systemof the nursing bed.The
embedded control system takes the ARM9 S3C2440 chip as the core of data processing. The design ofthe
control system includes hardware design, software design and PC monitoring system design.
The community detection in complex networks has attracted a growing interest and is the subject of several
researches that have been proposed to understand the network structure and analyze the network
properties. In this paper, we give a thorough overview of different community discovery strategies, we
propose taxonomy of these methods, and we specify the differences between the suggested classes which
helping designers to compare and choose the most suitable strategy for the various types of network
encountered in the real world.
Loan financings belong and parcel of any type of human being when the entire globe is viewing an uptrend of costs for all important commodities permanently to go perfectly. The increase in cost has actually been so high that every person is feeling the crunch of money to stabilize the need and the supply.
The internet has been playing an increasingly important role in our daily life, with the availability of many web services such as email and search engines. However, these are often threatened by attacks from computer programs such as bots. To address this problem, CAPTCHA (Completely Automated Public Turing Test to Tell Computers and Humans Apart) was developed to distinguish between computer programs and human users. Although this mechanism offers good security and limits automatic registration to web services, some CAPTCHAs have several weaknesses which allow hackers to infiltrate the mechanism of the CAPTCHA. This paper examines recent research on various CAPTCHA methods and their categories. Moreover it discusses the weakness and strength of these types.
This paper presents a set of methods that uses a genetic algorithm for automatic test-data generation in
software testing. For several years researchers have proposed several methods for generating test data
which had different drawbacks. In this paper, we have presented various Genetic Algorithm (GA) based test
methods which will be having different parameters to automate the structural-oriented test data generation
on the basis of internal program structure. The factors discovered are used in evaluating the fitness
function of Genetic algorithm for selecting the best possible Test method. These methods take the test
populations as an input and then evaluate the test cases for that program. This integration will help in
improving the overall performance of genetic algorithm in search space exploration and exploitation fields
with better convergence rate.
In order to achieve the wide range of the robotic application it is necessary to provide iterative motions
among points of the goals. For instance, in the industry mobile robots can replace any components between
a storehouse and an assembly department. Ammunition replacement is widely used in military services.
Working place is possible in ports, airports, waste site and etc. Mobile agents can be used for monitoring if
it is necessary to observe control points in the secret place. The paper deals with path planning programme
for mobile robots. The aim of the research paper is to analyse motion-planning algorithms that contain the
design of modelling programme. The programme is needed as environment modelling to obtain the
simulation data. The simulation data give the possibility to conduct the wide analyses for selected
algorithm. Analysis means the simulation data interpretation and comparison with other data obtained
using the motion-planning. The results of the careful analysis were considered for optimal path planning
algorithms. The experimental evidence was proposed to demonstrate the effectiveness of the algorithm for
steady covered space. The results described in this work can be extended in a number of directions, and
applied to other algorithms.
Phishing is the fraudulent acquisition of personal information like username, password, credit card information, etc. by tricking an individual into believing that the attacker is a trustworthy entity. It is affecting all the major sector of industry day by day with lots of misuse of user’s credentials. So in today
online environment we need to protect the data from phishing and safeguard our information, which can be done through anti-phishing tools. Currently there are many freely available anti-phishing browser extensions tools that warns user when they are browsing a suspected phishing site. In this paper we did a literature survey of some of the commonly and popularly used anti-phishing browser extensions by reviewing the existing anti-phishing techniques along with their merits and demerits.
Cloud computing is a technique that has a great capabilities and benefits for users. Cloud characteristics
encourage many organizations to move to this technology. But many consideration faces transmission
process. This paper outline some of these considerations and considerable efforts solved cloud scalability
issues.
THE SURVEY OF SENTIMENT AND OPINION MINING FOR BEHAVIOR ANALYSIS OF SOCIAL MEDIAIJCSES Journal
Nowadays, internet has changed the world into a global village. Social Media has reduced the gaps among
the individuals. Previously communication was a time consuming and expensive task between the people.
Social Media has earned fame because it is a cheaper and faster communication provider. Besides, social
media has allowed us to reduce the gaps of physical distance, it also generates and preserves huge amount
of data. The data are very valuable and it presents association degree between people and their opinions.The comprehensive analysis of the methods which are used on user behavior prediction is presented in this paper. This comparison will provide a detailed information, pros and cons in the domain of sentiment and
opinion mining.
OS VERIFICATION- A SURVEY AS A SOURCE OF FUTURE CHALLENGESIJCSES Journal
Formal verification of an operating system kernel manifests absence of errors in the kernel and establishes trust in it. This paper evaluates various projects on operating system kernel verification and presents indepth survey of them. The methodologies and contributions of operating system verification projects have been discussed in the present work. At the end, few unattended and interesting future challenges in
operating system verification area have been discussed and possible directions towards the challenge solution have been described in brief.
STATE-OF-THE-ART IN EMPIRICAL VALIDATION OF SOFTWARE METRICS FOR FAULT PRONEN...IJCSES Journal
With the sharp rise in software dependability and failure cost, high quality has been in great demand.However, guaranteeing high quality in software systems which have grown in size and complexity coupled with the constraints imposed on their development has become increasingly difficult, time and resource consuming activity. Consequently, it becomes inevitable to deliver software that have no serious faults. In
this case, object-oriented (OO) products being the de facto standard of software development with their unique features could have some faults that are hard to find or pinpoint the impacts of changes. The earlier faults are identified, found and fixed, the lesser the costs and the higher the quality. To assess product quality, software metrics are used. Many OO metrics have been proposed and developed. Furthermore,
many empirical studies have validated metrics and class fault proneness (FP) relationship. The challenge is which metrics are related to class FP and what activities are performed. Therefore, this study bring together the state-of-the-art in fault prediction of FP that utilizes CK and size metrics. We conducted a systematic literature review over relevant published empirical validation articles. The results obtained are
analysed and presented. It indicates that 29 relevant empirical studies exist and measures such as complexity, coupling and size were found to be strongly related to FP.
בעיצוב חווית משתמש צריך תמיד לקחת בחשבון את ההשפעה שלנו על המשתמש, כיצד אנחנו עושים לו נעים, מה מעניין אותו, איך נראית המציאות שלו, מה יהיה לו אינטואיטיבי..
עשינו מצגת לסקירה כיצד הטכנולוגיה מעצבת את המציאות שלנו, ממנה אפשר ללמוד כמה דברים על החוויה שלנו עם מגוון ממשקים בחיי היום יום
This paper explains new imaging techniques that show promising results in breast cancer detection. The
presented techniques use microwave-based methods, wavelet analyses, and neural networks to get a
suitable resolution for the breast image. One of the presented techniques (hybrid method) uses a
combination of microwaves and acoustic signals to improve the detection capability. Some promising
results are shown and explained.
A study of techniques for facial detection and expression classificationIJCSES Journal
Automatic recognition of facial expressions is an important component for human-machine interfaces. It
has lot of attraction in research area since 1990's.Although humans recognize face without effort or
delay, recognition by a machine is still a challenge. Some of its challenges are highly dynamic in their
orientation, lightening, scale, facial expression and occlusion. Applications are in the fields like user
authentication, person identification, video surveillance, information security, data privacy etc. The
various approaches for facial recognition are categorized into two namely holistic based facial
recognition and feature based facial recognition. Holistic based treat the image data as one entity without
isolating different region in the face where as feature based methods identify certain points on the face
such as eyes, nose and mouth etc. In this paper, facial expression recognition is analyzed with various
methods of facial detection,facial feature extraction and classification.
Reliability is concerned with decreasing faults and their impact. The earlier the faults are detected the better. That's why this presentation talks about automated techniques using machine learning to detect faults as early as possible.
Machine learning techniques can be used to analyse data from different perspectives and enable developers to retrieve useful information. Machine learning techniques are proven to be useful
in terms of software bug prediction. In this paper, a comparative performance analysis of
different machine learning techniques is explored for software bug prediction on public
available data sets. Results showed most of the machine learning methods performed well on
software bug datasets.
A Defect Prediction Model for Software Product based on ANFISIJSRD
Artificial intelligence techniques are day by day getting involvement in all the classification and prediction based process like environmental monitoring, stock exchange conditions, biomedical diagnosis, software engineering etc. However still there are yet to be simplify the challenges of selecting training criteria for design of artificial intelligence models used for prediction of results. This work focus on the defect prediction mechanism development using software metric data of KC1.We have taken subtractive clustering approach for generation of fuzzy inference system (FIS).The FIS rules are generated at different radius of influence of input attribute vectors and the developed rules are further modified by ANFIS technique to obtain the prediction of number of defects in software project using fuzzy logic system.
A NOVEL APPROACH TO ERROR DETECTION AND CORRECTION OF C PROGRAMS USING MACHIN...IJCI JOURNAL
There has always been a struggle for programmers to identify the errors while executing a program- be it
syntactical or logical error. This struggle has led to a research in identification of syntactical and logical
errors. This paper makes an attempt to survey those research works which can be used to identify errors as
well as proposes a new model based on machine learning and data mining which can detect logical and
syntactical errors by correcting them or providing suggestions. The proposed work is based on use of
hashtags to identify each correct program uniquely and this in turn can be compared with the logically
incorrect program in order to identify errors.
A Review on Software Fault Detection and Prevention Mechanism in Software Dev...iosrjce
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
TOWARDS PREDICTING SOFTWARE DEFECTS WITH CLUSTERING TECHNIQUESijaia
The purpose of software defect prediction is to improve the quality of a software project by building a
predictive model to decide whether a software module is or is not fault prone. In recent years, much
research in using machine learning techniques in this topic has been performed. Our aim was to evaluate
the performance of clustering techniques with feature selection schemes to address the problem of software
defect prediction problem. We analysed the National Aeronautics and Space Administration (NASA)
dataset benchmarks using three clustering algorithms: (1) Farthest First, (2) X-Means, and (3) selforganizing map (SOM). In order to evaluate different feature selection algorithms, this article presents a
comparative analysis involving software defects prediction based on Bat, Cuckoo, Grey Wolf Optimizer
(GWO), and particle swarm optimizer (PSO). The results obtained with the proposed clustering models
enabled us to build an efficient predictive model with a satisfactory detection rate and acceptable number
of features.
GENETIC-FUZZY PROCESS METRIC MEASUREMENT SYSTEM FOR AN OPERATING SYSTEMijcseit
Operating system (Os) is the most essential software of the computer system,deprived ofit, the computer
system is totally useless. It is the frontier for assessing relevant computer resources. It performance greatly
enhances user overall objective across the system. Related literatures have try in different methods and
techniques to measure the process matric performance of the operating system but none has incorporated
the use of genetic algorithm and fuzzy logic in their varied techniques which indeed is a novel approach.
Extending the work of Michalis, this research focuses on measuring the process matrix performance of an
operating system utilizing set of operating system criteria’s while fusing fuzzy logic to handle
impreciseness and genetic for process optimization.
GENETIC-FUZZY PROCESS METRIC MEASUREMENT SYSTEM FOR AN OPERATING SYSTEMijcseit
Operating system (Os) is the most essential software of the computer system,deprived ofit, the computer system is totally useless. It is the frontier for assessing relevant computer resources. It performance greatly
enhances user overall objective across the system. Related literatures have try in different methods and techniques to measure the process matric performance of the operating system but none has incorporated the use of genetic algorithm and fuzzy logic in their varied techniques which indeed is a novel approach. Extending the work of Michalis, this research focuses on measuring the process matrix performance of an
operating system utilizing set of operating system criteria’s while fusing fuzzy logic to handle impreciseness and genetic for process optimization.
Genetic fuzzy process metric measurement system for an operating systemijcseit
Operating system (Os) is the most essential software of the computer system,deprived ofit, the computer
system is totally useless. It is the frontier for assessing relevant computer resources. It performance greatly
enhances user overall objective across the system. Related literatures have try in different methods and
techniques to measure the process matric performance of the operating system but none has incorporated
the use of genetic algorithm and fuzzy logic in their varied techniques which indeed is a novel approach.
Extending the work of Michalis, this research focuses on measuring the process matrix performance of an
operating system utilizing set of operating system criteria’s while fusing fuzzy logic to handle
impreciseness and genetic for process optimization.
Software Testing: Issues and Challenges of Artificial Intelligence & Machine ...gerogepatton
The history of Artificial Intelligence and Machine Learning dates back to 1950’s. In recent years, there has been an increase in popularity for applications that implement AI and ML technology. As with traditional development, software testing is a critical component of an efficient AI/ML application. However, the approach to development methodology used in AI/ML varies significantly from traditional development. Owing to these variations, numerous software testing challenges occur. This paper aims to recognize and to explain some of the biggest challenges that software testers face in dealing with AI/ML applications. For future research, this study has key implications. Each of the challenges outlined in this paper is ideal for further investigation and has great potential to shed light on the way to more productive software testing strategies and methodologies that can be applied to AI/ML applications.
Software Testing: Issues and Challenges of Artificial Intelligence & Machine ...gerogepatton
The history of Artificial Intelligence and Machine Learning dates back to 1950’s. In recent years, there has been an increase in popularity for applications that implement AI and ML technology. As with traditional development, software testing is a critical component of an efficient AI/ML application. However, the approach to development methodology used in AI/ML varies significantly from traditional development. Owing to these variations, numerous software testing challenges occur. This paper aims to recognize and to explain some of the biggest challenges that software testers face in dealing with AI/ML applications. For
future research, this study has key implications. Each of the challenges outlined in this paper is ideal for further investigation and has great potential to shed light on the way to more productive software testing strategies and methodologies that can be applied to AI/ML applications.
SOFTWARE TESTING: ISSUES AND CHALLENGES OF ARTIFICIAL INTELLIGENCE & MACHINE ...ijaia
The history of Artificial Intelligence and Machine Learning dates back to 1950’s. In recent years, there has
been an increase in popularity for applications that implement AI and ML technology. As with traditional
development, software testing is a critical component of an efficient AI/ML application. However, the
approach to development methodology used in AI/ML varies significantly from traditional development.
Owing to these variations, numerous software testing challenges occur. This paper aims to recognize and
to explain some of the biggest challenges that software testers face in dealing with AI/ML applications. For
future research, this study has key implications. Each of the challenges outlined in this paper is ideal for
further investigation and has great potential to shed light on the way to more productive software testing
strategies and methodologies that can be applied to AI/ML applications.
A simplified predictive framework for cost evaluation to fault assessment usi...IJECEIAES
Software engineering is an integral part of any software development scheme which frequently encounters bugs, errors, and faults. Predictive evaluation of software fault contributes towards mitigating this challenge to a large extent; however, there is no benchmarked framework being reported in this case yet. Therefore, this paper introduces a computational framework of the cost evaluation method to facilitate a better form of predictive assessment of software faults. Based on lines of code, the proposed scheme deploys adopts a machine-learning approach to address the perform predictive analysis of faults. The proposed scheme presents an analytical framework of the correlation-based cost model integrated with multiple standards machine learning (ML) models, e.g., linear regression, support vector regression, and artificial neural networks (ANN). These learning models are executed and trained to predict software faults with higher accuracy. The study considers assessing the outcomes based on error-based performance metrics in detail to determine how well each learning model performs and how accurate it is at learning. It also looked at the factors contributing to the training loss of neural networks. The validation result demonstrates that, compared to logistic regression and support vector regression, neural network achieves a significantly lower error score for software fault prediction.
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...Editor IJCATR
Software reliability is considered as a quantifiable metric, which is defined as the probability of a software to operate
without failure for a specified period of time in a specific environment. Various software reliability growth models have been proposed
to predict the reliability of a software. These models help vendors to predict the behaviour of the software before shipment. The
reliability is predicted by estimating the parameters of the software reliability growth models. But the model parameters are generally
in nonlinear relationships which creates many problems in finding the optimal parameters using traditional techniques like Maximum
Likelihood and least Square Estimation. Various stochastic search algorithms have been introduced which have made the task of
parameter estimation, more reliable and computationally easier. Parameter estimation of NHPP based reliability models, using MLE
and using an evolutionary search algorithm called Particle Swarm Optimization, has been explored in the paper.
Search and Society: Reimagining Information Access for Radical FuturesBhaskar Mitra
The field of Information retrieval (IR) is currently undergoing a transformative shift, at least partly due to the emerging applications of generative AI to information access. In this talk, we will deliberate on the sociotechnical implications of generative AI for information access. We will argue that there is both a critical necessity and an exciting opportunity for the IR community to re-center our research agendas on societal needs while dismantling the artificial separation between the work on fairness, accountability, transparency, and ethics in IR and the rest of IR research. Instead of adopting a reactionary strategy of trying to mitigate potential social harms from emerging technologies, the community should aim to proactively set the research agenda for the kinds of systems we should build inspired by diverse explicitly stated sociotechnical imaginaries. The sociotechnical imaginaries that underpin the design and development of information access technologies needs to be explicitly articulated, and we need to develop theories of change in context of these diverse perspectives. Our guiding future imaginaries must be informed by other academic fields, such as democratic theory and critical theory, and should be co-developed with social science scholars, legal scholars, civil rights and social justice activists, and artists, among others.
Essentials of Automations: Optimizing FME Workflows with ParametersSafe Software
Are you looking to streamline your workflows and boost your projects’ efficiency? Do you find yourself searching for ways to add flexibility and control over your FME workflows? If so, you’re in the right place.
Join us for an insightful dive into the world of FME parameters, a critical element in optimizing workflow efficiency. This webinar marks the beginning of our three-part “Essentials of Automation” series. This first webinar is designed to equip you with the knowledge and skills to utilize parameters effectively: enhancing the flexibility, maintainability, and user control of your FME projects.
Here’s what you’ll gain:
- Essentials of FME Parameters: Understand the pivotal role of parameters, including Reader/Writer, Transformer, User, and FME Flow categories. Discover how they are the key to unlocking automation and optimization within your workflows.
- Practical Applications in FME Form: Delve into key user parameter types including choice, connections, and file URLs. Allow users to control how a workflow runs, making your workflows more reusable. Learn to import values and deliver the best user experience for your workflows while enhancing accuracy.
- Optimization Strategies in FME Flow: Explore the creation and strategic deployment of parameters in FME Flow, including the use of deployment and geometry parameters, to maximize workflow efficiency.
- Pro Tips for Success: Gain insights on parameterizing connections and leveraging new features like Conditional Visibility for clarity and simplicity.
We’ll wrap up with a glimpse into future webinars, followed by a Q&A session to address your specific questions surrounding this topic.
Don’t miss this opportunity to elevate your FME expertise and drive your projects to new heights of efficiency.
GraphRAG is All You need? LLM & Knowledge GraphGuy Korland
Guy Korland, CEO and Co-founder of FalkorDB, will review two articles on the integration of language models with knowledge graphs.
1. Unifying Large Language Models and Knowledge Graphs: A Roadmap.
https://arxiv.org/abs/2306.08302
2. Microsoft Research's GraphRAG paper and a review paper on various uses of knowledge graphs:
https://www.microsoft.com/en-us/research/blog/graphrag-unlocking-llm-discovery-on-narrative-private-data/
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
UiPath Test Automation using UiPath Test Suite series, part 3DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 3. In this session, we will cover desktop automation along with UI automation.
Topics covered:
UI automation Introduction,
UI automation Sample
Desktop automation flow
Pradeep Chinnala, Senior Consultant Automation Developer @WonderBotz and UiPath MVP
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
The Art of the Pitch: WordPress Relationships and SalesLaura Byrne
Clients don’t know what they don’t know. What web solutions are right for them? How does WordPress come into the picture? How do you make sure you understand scope and timeline? What do you do if sometime changes?
All these questions and more will be explored as we talk about matching clients’ needs with what your agency offers without pulling teeth or pulling your hair out. Practical tips, and strategies for successful relationship building that leads to closing the deal.
DevOps and Testing slides at DASA ConnectKari Kakkonen
My and Rik Marselis slides at 30.5.2024 DASA Connect conference. We discuss about what is testing, then what is agile testing and finally what is Testing in DevOps. Finally we had lovely workshop with the participants trying to find out different ways to think about quality and testing in different parts of the DevOps infinity loop.
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
Keynote at DIGIT West Expo, Glasgow on 29 May 2024.
Cheryl Hung, ochery.com
Sr Director, Infrastructure Ecosystem, Arm.
The key trends across hardware, cloud and open-source; exploring how these areas are likely to mature and develop over the short and long-term, and then considering how organisations can position themselves to adapt and thrive.
Key Trends Shaping the Future of Infrastructure.pdf
Genetic algorithm based approach for
1. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
DOI : 10.5121/ijcses.2014.5303 29
GENETIC ALGORITHM BASED APPROACH FOR
FINDING FAULTY MODULES IN OPEN SOURCE
SOFTWARE SYSTEMS
Aditi Puri1
and Harshpreet Singh2
1
Department of Computer Science and Engineering, Lovely Professional University,
Punjab, India
2
Assistant Professor, Department of Computer Science, Lovely Professional University,
Punjab, India
ABSTRACT
Computer program produces an incorrect or unexpected result or behaves in haphazard way then there is
an error in that computer program. In order to improve the software quality, prediction of faulty modules is
necessary. Various Metric suites and techniques are available to predict the modules which are critical and
likely to be fault prone. Genetic Algorithm is a problem solving algorithm. It uses genetics as its model of
problem solving. It’s a search technique to find approximate solutions to optimization and search
problems.Genetic algorithm is applied for solving the problem of faulty module prediction and as well as
for finding the most important attribute for fault occurrence. In order to perform the analysis, performance
validation of the Genetic Algorithm using open source software jEdit is done. The results are measured in
terms Accuracy and Error in predicting by calculating probability of detection and probability of false
Alarms
KEYWORDS
Genetic Algorithm, software metrics, fault, faulty modules, fault proneness
1. INTRODUCTION
A fault prone module is one in which the number of faults are higher than a selected threshold.
Making any application or software 100% defects free is the need of the hour and the greatest
challenge being faced by software industry. Fault proneness of software depends on the faults it
contains and is directly proportional to its measurable attributes and testing. Detection of fault-
prone modules of software helps the experts to concentrate more on the development work. Fault
proneness can be predicted by classifying software modules into categories of faulty and non-
faulty modules at an early stage of development.
Abundant research has been done on the software fault prediction techniques and software
metrics were considered for the classification and evaluation of the data.
Predicting fault prone modules adds up to the software quality assurance. Many algorithms have
been used to predict the fault proneness including the hierarchical and k-means clustering
algorithms. Some work has also been done using Bayes Network Classification Algorithm and
spamfiltering technique for finding fault prone software modules.Support vector machine (SVM)
and module dependency graphs (MDGs) have also proved helpful for predicting the fault
proneness.
2. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
30
Software metrics has been widely used and has proven to be one of the critical attribute to
measure the fault proneness of applications and software. Chidamber & Kemerer metrics suite [1]
includes the various product metrics and a quality oriented extension. Martin's metrics [10] and
Henderson-Sellers metrics [9] are also there.
In the first section of this paper a brief introduction of the topic is described including the use of
metrics and the historic methods used in fault findings. Second section of the paper gives the
literature review.
2. RELATED WORK
Parvinder S. Sandhu et al.[2] presented K-means clustering approach to cluster the modules into
different categories of faulty and non-faulty modules and thereafter empirically validated them.
Chidamber & Kemerer metrics suite [1] was used to divide the data into categories. To find the
attributes that are important for predicting Chi-squared ranking filter was applied subsequently
to reduce the data. The reduced attributes is then given as input to the K-means clustering
algorithm that divides data into two or more clusters depending upon that whether they are fault
free or fault prone.
Ritika et al. [3]empirically evaluated the performance of SVM technique in predicting fault-prone
classes using open source software. SVM structure was generated from the faulty data in
MATLAB environment and then evaluated for the JEdit dataset. The performance of the system
was evaluated by calculating the accuracy%, the percentage of the predicted values that matches
with the obtained values.
D. Doval et al. [4]used MDGs to represent the structure of complex software systems in
understandable form. In MDGs, the system’s modules (e.g., files, classes) are represented as
nodes and their relationships (e.g., function calls, inheritance relationships) as directed edges that
connect the nodes. Genetic Algorithm is then used to efficiently partition the modules to form
MDG and thus clustered and classified the modules as fault prone or fault free.
Dr. Parvinder S. Sandhu [5] evaluated the fault proneness of modules in open source software
system JEdit using k-NN clustering algorithm based on Chidamber & Kemerer metrics suite [1]
.The DIT, CBO, RFC, NPM and LOC metrics were selected and the predicted model so
developed was applied to 274 classes of the dataset. The percentage of Accuracy as 82.84,
Probability of Detection, Probability of False alarm was recorded for the fault dataset.
Simranjit Kaur et al. [6] investigated the accuracy of the fault prediction of the software modules
using Hierarchical based clustering. Structural code and design attributes of the software systems
was found and suitable metric values were selected for representing the dataset. Further the
metrics values were refined and normalized. Hierarchical Clustering algorithm was then used to
classify the software components into two categories of faulty and fault-free systems. Both
agglomerative as well as divisive approach was used for the predicting the accuracy of 85.12 and
finally confusion metrics was used to predict the outcome.
Aarti Mahajan et al. [7] predicted the software quality by investigating the capabilities of Bayes
Network Classification Algorithm using open source software JEdit. Structural code and design
attributes of the software systems was found and suitable metric values were selected for
representing the dataset. Further the metrics values were refined and normalized. The dataset so
obtained was given as input to the Bayes Network Classification Approach.Mean Absolute Error
and Root Mean-Squared Error both were calculated. Finally confusion metrics with an accuracy
of 70.8 was used to predict the outcome.
3. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
31
Osamu Mizuno et al. [8] applied the spam filtering technique for finding fault prone software
modules. Each software module was considered as an e-mail message and either of fault-prone
(FP) category or not-fault prone (NFP) category. New modules were classified as FP or NFP by
studying the characteristics of existing FP and NFP modules using spam filter. Instead of metric
measure, source code was used and the text classification is made on the basis of the past history
of the development. FPFinder, prototype tool was used to track bugs. The result of the experiment
concluded that spam filter approach can classify more than 75% of the modules correctly.
Table 1. Overview of the techniques/methods followed in literature review
Sno. Technique
Name
Methodology followed Accuracy
percentage
Limitations
1. K-means
clustering
approach [2]
1. Categorized the dataset into
metrics.
2. Used chi-squared to filter
out the important metrics.
3. Made two clusters of faulty
& non-faulty modules using
k-means
62.4 Most important
attributes for
fault prediction
was not found.
2. Support
vector
machine
(SVM)
technique [3]
1. SVM structure was
generated from the faulty
data in MATLAB 7.4
environment
2. Structure evaluated for the
JEdit dataset.
78.1022 Most important
attributes for
fault prediction
was not found.
3. Module
dependency
graphs
(MDGs) [4]
1. MDGs were used to
represent the structure of
complex software systems
in understandable form.
2. Genetic Algorithm is then
used to efficiently partition
the modules to form MDG
Not
calculated
Does not
provide a
mechanism to
integrate a
designer’s
knowledge of a
system into the
automatic
clustering
process
4. k-NN
clustering
algorithm [5]
1. Metrics was selected to
form a model.
2. Predicted model so
developed was applied to
274 classes of the dataset.
82.84 Similar studies
were not carried
out to confirm
the acceptability
of the
prediction.
5. Hierarchical
based
clustering [6]
1. Hierarchical based
clustering is used.
2. Components were divided
into two categories of faulty
and fault-free.
85.12 Most important
attribute for fault
prediction was
not found.
4. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
32
6. Bayes
Network
classification
algorithm [7]
1. Suitable metric values were
selected for representing the
dataset.
2. Bayes network
classification is used.
3. Mean absolute error and
root mean-squared error
were calculated.
70.8 Most important
attribute for fault
prediction was
not found.
7. Spam filtering
technique [8]
1. Each module considered as
an e-mail message
2. New modules were
classified as FP and NFP on
basis of past history
3. FPFinder tool was used to
track bugs.
Classified
more than
75% of
modules
correctly.
1. Some
modules
were
misclassifie
d.
2. No practical
advantage.
3. PROPOSED APPROACH
Step 1:Raw data is collected in the form of structural codes, source code of open source system
and design attributes.
Step 2: Evaluation of the data obtained in step 1 is done on Chidamber & Kemerer metrics suite
[1] and out of those some metrics are chosen to illustrate the principal design. They are-
Coupling between Objects (CBO), CBO for a class is the count of the total number of other
classes to which it is coupled and vice versa.
Lack of Cohesion (LCOM), it is the measurement of the dissimilarity of methods in a class.
It is measured by looking at the instance variable or attributes used by methods in a class.
Number of Children (NOC), The NOC is the number of immediate subclasses of a class in a
hierarchy.
Depth of inheritance (DOI), the depth of a class within the inheritance hierarchy is the
maximum number of steps from the class node to the root of the tree and is measured by the
number of ancestor classes
Weighted Methods per Class (WMC), it is the count of sum of complexities of all methods
in a class. Consider a class P1, with Methods M1…… Mr that are defined in the class. Let.
C1, C2....Cr be the complexity of the methods
WMC= ΣCi where i=1 to r
If all the methods complexities are considered to be unity, then WMC = r the number of
methods in the class.
Response for a class (RFC), it is defined as set of methods that can be potentially executed
in response to a message received by an object of that class.
It is given by RFC= |RS|, where RS, the response set of the class
RS = Mi U all j{Rij}
Number of Public Methods(NPM), it is the count of number of Public methods in a class
Lines of Code ( LOC), It is the count of the total number of lines in the text of the source
code excluding comment lines.
5. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
33
Figure 1: Flowchart of methodology followed
Step 3:The data collected in step 2 is now filteredto find the most important metrics for the fault
prediction.
Step 4:Further the reduced or selected attributes are given as input to Genetic Algorithm.
Step5:The performance of the algorithm is evaluated on the basis of confusion matrix. This
allows us to have more detailed analysis than mere having correct guesses. The following sets of
attributes are being used to refine our results:
• Probability of Detection (PD), it is defined as the probability of correct classification of a
module that contains a fault.
PD = TP / (TP + FN)
• Probability of False Alarms (PF) is defined as the ratio of false positives to all non-defect
modules. PF = FP / (FP + TN)
Recall or true positive rate (TP) it is the proportion of faulty modules that are correctly
identified as faulty.
False positive rate (FP)it is the proportion of non-faulty modules that are incorrectly
classified as faulty.
True negative rate (TN)it is defined as the proportion of non-faulty cases that are classified
correctly as non-faulty.
False negative rate (FN) is the proportion of faulty modules that are incorrectly classified as
non-faulty.
6. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
34
Table 2. Confusion matrix of prediction outcomes
It can be clearly concluded as probability of detection or PD should be maximum and probability
of false alarm or PF should be minimum.
The accuracy (AC) is the proportion of the total number of predictions that were correct. It is
determined using the equation:
AC = TP+TN /TP+FP+FN+TN
The error in the calculation of the values of PD and PF is calculated as:
Error= FN+FP/ TP+FP+FN+TN
4. RESULTS AND FINDINGS
Step 1: The data for the research is taken as the source code of the open source jEdit project. The
source code of the two versions of jEdit is collected and analyzed and fed as input into JArchitect,
static analysis tool for JAVA code. This tool supports a large number of software code metrics. It
also allows visualization of dependencies using directed graphs and dependency matrix. Code
base snapshots comparison, validation of architectural and quality rules are also performed by this
tool.
Step 2: Using JArchitect_4.0.0.8041 analysis of jEdit version 4.5pre 1 is done and the following
values are calculated-
Public methods on classes give the total number of public methods in the classes of the
software.
Cyclomatic Complexity (CC) is calculated for calculating the Weighted Methods per
Class (WMC) which is the count of sum of complexities of all methods in a class.
Lines of code (LOC) is calculated as the count of number of lines in the text of the source
code of jEdit version 4.5pre 1 and jEdit version 5.0.1excluding comment lines.
Efferent Coupling is calculated both at the method level and at the package level. Efferent
Coupling for a particular method is the number of methods that method directly depends
on. Efferent Coupling for a particular package is the number of packages it directly
depends on. It is used to calculate Coupling between Objects (CBO) metric.
Relational Cohesion (H)gives the average number of internal relationships per type. Let R
be the number of type relationships that are internal to the project and N be the number of
types within the project then
H=(R+1)/N Calculating H gives the value of Lack of Cohesion (LCOM) metric.
The Depth of Inheritance Tree for a class or a structure is the number of its base classes
(including the System.Object class thus DIT >= 1)
7. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
35
Table 3. Number of occurrences of the property in jEdit version 4.5pre 1
Property Occurrences
Public methods on classes 1123 classes
Cyclomatic complexity(cc) 8114 methods
LOC 54270
Efferent Coupling 459
Relational Cohesion 4.47
Properties on interfaces 63 interfaces
Methods on interfaces 63 interfaces
Arguments on methods on interfaces 177 methods
Public properties on classes 1123 classes
Arguments on public methods on classes 6594 methods
BC instructions in non-abstract methods 8114 methods
Table 4. Maximum value of the metrics that exists in jEdit version 4.5pre 1
Property Max
Public methods on classes 241 public methods
Cyclomatic complexity(cc) 309 cyclomatic complexity
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0 properties
Methods on interfaces 17 methods
Arguments on methods on interfaces 8 arguments
Public properties on classes 0 public properties
Arguments on public methods on classes 12 arguments
BC instructions in non-abstract methods 1445 BC instructions
Table 5. Average values of the metrics in jEdit version 4.5pre 1
Property Avg
Public methods on classes 5.87
Cyclomatic complexity(cc) 3.1
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0
Methods on interfaces 2.81
Arguments on methods on interfaces 1.34
Public properties on classes 0
Arguments on public methods on classes 1.14
BC instructions in non-abstract methods 29.37
8. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
36
Table 6. Standard deviation of the metrics values in jEdit version 4.5pre 1
Property StdDev
Public methods on classes 11.64
Cyclomatic complexity(cc) 7.56
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0
Methods on interfaces 2.56
Arguments on methods on interfaces 1.53
Public properties on classes 0
Arguments on public methods on classes 1.26
BC instructions in non-abstract methods 56.78
Using JArchitect_4.0.0.8041 analysis of jEdit version 5.0.1 is done and the following values are
calculated-
Table 7. Number of occurrences of the property in jEdit version 5.0.1
Property Occurrences
Public methods on classes 46 classes
Cyclomatic complexity(cc) 299 methods
LOC 2774
Efferent Coupling 82
Relational Cohesion 2.51
Properties on interfaces 3 interfaces
Methods on interfaces 3 interfaces
Arguments on methods on interfaces 9 methods
Public properties on classes 46 classes
Arguments on public methods on classes 214 methods
BC instructions in non-abstract methods 299 methods
Table 8. Maximum value of the metrics that exists in jEdit version 5.0.1
Property Max
Public methods on classes 32 public methods
Cyclomatic complexity(cc) 64 cyclomatic complexity
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0 properties
Methods on interfaces 5 methods
Arguments on methods on interfaces 1 arguments
Public properties on classes 0 public properties
Arguments on public methods on classes 6 arguments
BC instructions in non-abstract methods 2052 BC instructions
9. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
37
Table 9. Average values of the metrics in jEdit version 5.0.1
Property Avg
Public methods on classes 4.65
Cyclomatic complexity(cc) 2.95
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0
Methods on interfaces 3
Arguments on methods on interfaces 0.78
Public properties on classes 0
Arguments on public methods on classes 1
BC instructions in non-abstract methods 51.21
Table 10. Standard deviation of the metrics values in jEdit version 5.0.1
Property StdDev
Public methods on classes 5.56
Cyclomatic complexity(cc) 5.41
LOC NA
Efferent Coupling NA
Relational Cohesion NA
Properties on interfaces 0
Methods on interfaces 1.63
Arguments on methods on interfaces 0.42
Public properties on classes 0
Arguments on public methods on classes 0.98
BC instructions in non-abstract methods 155.74
Step 3: The values of the metrics collected in the previous step is converted into binary form.
This conversion is done on the basis of the threshold value of the metrics available.
A good range for Relational Cohesion is 1.5 to 4.0. Projects where Relational Cohesion <
1.5 or Relational Cohesion > 4.0 might be problematic.
Methods where CC is higher than 15 are hard to understand and maintain. Methods where
CC is higher than 30 are extremely complex and should be split in smaller methods
(except if they are automatically generated by a tool.)
Types where Ce > 50 are the types that depends on too many other types. They are
complex and have more than one responsibility.
Methods where BC Instructions is higher than 100 are hard to understand and maintain.
Methods where BC Instructions is higher than 200 are extremely complex
Number of Methods > 20 might be hard to understand and maintain but there might be
cases where it is relevant to have a high value for Number of Methods.
10. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
38
Table 11. Transformed values for jEdit version 4.5pre 1
Property Transformed
value
Public methods on classes 1
Cyclomatic complexity(cc) 1
LOC 0
Efferent Coupling 1
Relational Cohesion 1
Properties on interfaces 1
Methods on interfaces 1
Arguments on methods on
interfaces
1
Public properties on classes 0
Arguments on public methods
on classes
0
BC instructions in non-
abstract methods
1
Table 12. Transformed values for jEdit version 5.0.1
Property Transformed
value
Public methods on classes 1
Cyclomatic complexity(cc) 0
LOC 0
Efferent Coupling 1
Relational Cohesion 0
Properties on interfaces 0
Methods on interfaces 0
Arguments on methods on
interfaces
1
Public properties on classes 1
Arguments on public methods
on classes
1
BC instructions in non-
abstract methods
1
Step 4. Genetic Algorithm is applied on the above transformed values. Input to the Genetic
Algorithm is the string of the transformed metric values. The result so obtained after the
application of the algorithm is thus further analyzed for calculating the most important attribute
for the fault prediction.
11. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
39
Figure 2: Output of the Genetic Algorithm
Genetic Algorithm applied on the metric values gives the public methods on Classes, Efferent
Coupling, Arguments on methods and interfaces, public properties on classes, arguments on
public methods on classes and BC instructions in non-abstract methods as the most important
attributes in the fault prediction.
Step 5: The performance of the algorithm is evaluated using Confusion Matrix. The
following sets of attributes are calculated as:
Table 13. Recorded Confusion matrix
The values of Probability of Detection (PD) and Probability of False Alarms (PF) are 0.875 and
0.44 respectively.The error and accuracy values are calculated and recorded as 0.294 and 0.705
respectively.
5. CONCLUSION
In this paper Genetic Algorithm is used to find critical classes and metrics that are fault prone.The
proposed Genetic Algorithm technique shows high value of Probability of Detection (PD) i.e.
0.875 and low value of Probability of False Alarms (PF) i.e. 0.44.The error and accuracy values
are calculated and recorded as 0.294 and 0.705 respectively.
It is therefore concluded that Genetic algorithm can be used for object-oriented systems and is
useful in predicting the fault prone classes.
The work can be extended by using other evolutionary algorithms for finding the most important
attribute for fault prediction and finding the critical classes and metrics.
REFERENCES
[1] J. Gray. Why do computers stop and what can be done about it? Technical Report 85.7, PN87614,
Tandem Computers, Cupertino, 1985.
12. International Journal of Computer Science & Engineering Survey (IJCSES) Vol.5, No.3, June 2014
40
[2] Grottke, Michael, and Kishor S. Trivedi. "A classification of software faults."Journal of Reliability
Engineering Association of Japan 27, no. 7 (2005): 425-438.
[3] WikiWikiWeb. Heisen bug examples. Last modified Jan. 21, 2004, URL =
http://c2.com/cgi/wiki?HeisenBugExamples (Link verified on May 26, 2005).
[4] S.Chidamber and C.F.Kemerer, “A metrics Suite for Object-Oriented Design,” IEEE Transactions on
Software Engineering, vol. SE-20, no.6, 476-493, 1994.
[5] R.Martin, “Oo design quality metrics - an analysis of dependencies,” In Workshop Pragmatic and
Theoretical Directions in Object-Oriented Software Metrics, OOPSLA’94.
[6] B.Henderson-Sellers, “Object-oriented metrics: measures of complexity,” UpperSaddle River, NJ,
USA: Prentice-Hall, Inc 1996.
[7] J.Bansiya and C.G Davis, “A hierarchical model for object-oriented design quality assessment,” IEEE
Transactions on Software Engineering, vol. 28, no. 1, 4–17, 2002.
[8] L.Madeyski and M.Jureczko “Which process metrics improve software defect prediction models? An
empirical study,” submitted to Information and Software Technology, 2011.
[9] S. N Sivanandam., and S. N. Deepa. Introduction to genetic algorithms. Springer, 2007.
[10] P S. Sandhu, Jagdeep Singh, Vikas Gupta, Mandeep Kaur, Sonia Manhas, Ramandeep Sidhu. “A K-
Means Based Clustering Approach for Finding Faulty Modules in Open Source Software Systems.” In
World Academy of Science, Engineering and Technology 48 2010.
[11] R.Malhotra, Bhupinder Singh and Satinder Pal Ahuja. “Determination of Fault Proneness of Modules
in Open Source Software Systems using SVM Clustering Approach.” In International Conference on
Computer Graphics, Simulation and Modeling (ICGSM'2012).
[12] D. Doval, Diego, Spiros Mancoridis, and Brian S. Mitchell. "Automatic clustering of software
systems using a genetic algorithm." In Software Technology and Engineering Practice, 1999.
STEP'99. Proceedings, pp. 73-81. IEEE, 1999.
[13] P S Sandhu., Rubinderjit Kaur, and Anamika Sharma. "Evaluation of Fault Proneness of Modules in
Open Source Software Systems Using k-NN Clustering."
[14] S. Kaur, Manish Mahajan, and Dr Parvinder S. Sandhu. "Identification of Fault Prone Modules in
Open Source Software Systems using Hierarchical based Clustering." ISEMS, Bangkok, July (2011).
[15] Mahajan, Aarti, Vikas Gupta, and Parvinder S. Sandhu. "A Bayes Network Classification Approach
For Finding Faulty Modules In Open Source Software Systems."
[16] Mizuno, Osamu, Shiro Ikami, Shuya Nakaichi, and Tohru Kikuno. "Spam filter based approach for
finding fault-prone software modules." In Mining Software Repositories, 2007. ICSE Workshops
MSR'07. Fourth International Workshop on, pp. 4-4. IEEE, 2007.
[17] X. Xu, C.-H. Lung, M. Zaman, and A. Srinivasan, “Program restructuring through clustering
techniques,” in Proceedings of the IEEE International Working Conference on Source Code Analysis
and Manipulation (SCAM '04), pp. 75–84, IEEE Computer Society, Ottawa, Canada, September
2004.
[18] G. C. Murphy, D. Notkin, and K. Sullivan, “Software reflexion models: bridging the gap between
source and high-level models,” in Proceedings of the 1995 3rd ACM SIGSOFT Symposium on the
Foundations of Software Engineering, pp. 18–27, October 1995.
[19] Wiggerts, Theo A. "Using clustering algorithms in legacy systems remodularization." In Reverse
Engineering, 1997. Proceedings of the Fourth Working Conference on, pp. 33-43. IEEE, 1997.
Authors
Aditi Puri is a post graduate student in the Computer Science and Engineering
Department at Lovely Professional University. Her research interests include software
engineering, Genetic Algorithm and software evolution.
Harshpreet Singh received the B.tech and M.tech degrees in Computer Science in 2007
and 2009. He is an assistant professor in Computer Science Department at Lovely
Professional University and pursuing PhD degree. His research interests include process
modelling, CBSD, Software reuse.