The document describes an automated technique for classifying bug fixes as complete or incomplete. It defines a "bug neighborhood" as related flows of invalid values that could cause exceptions. The technique analyzes two program versions to identify fixed bugs and classify fixes as complete if the entire bug neighborhood is addressed, or incomplete otherwise. An empirical study on open-source projects found many incomplete fixes that could still cause failures.
Software reliability models: error seeding model and failure model - IV (Gurbakash Phonsa)
There are two main types of software reliability models: deterministic and probabilistic. Deterministic models use metrics like lines of code and complexity to estimate errors, while probabilistic models treat failures as random events. The Jelinski-Moranda model is a foundational probabilistic model that assumes a constant failure rate proportional to remaining faults. Variants include models that decrease the failure rate over time as faults are removed, like the Jelinski-Moranda geometric model. Curve fitting models use regression to relate metrics like complexity to outcomes like errors or failure rates.
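The Jelinski-Moranda assumption summarized above (a failure rate proportional to the number of remaining faults) is simple enough to sketch. The following Python is an illustrative sketch of that relationship, not code from any of the listed documents; the function and parameter names are assumptions:

```python
# Illustrative sketch of the Jelinski-Moranda hazard rate (names assumed).
# Between the (i-1)-th and i-th failure the rate is phi * (N - i + 1):
# proportional to the faults still remaining in the program.

def jm_hazard(phi, N, i):
    """Failure rate before the i-th failure, given N initial faults."""
    return phi * (N - i + 1)

def jm_expected_time_to_next(phi, N, i):
    """Expected waiting time for the i-th failure (mean of the exponential)."""
    return 1.0 / jm_hazard(phi, N, i)

# With N = 10 initial faults and per-fault rate phi = 0.02, the hazard
# decreases stepwise as faults are removed: 0.2, 0.18, ..., 0.02.
rates = [jm_hazard(0.02, 10, i) for i in range(1, 11)]
```

This also makes the contrast with the geometric variant concrete: there the rate decreases by a constant ratio per fix rather than by a constant step.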
Testing parallel software is more complicated than testing a standard sequential program. Programmers should be aware both of the pitfalls they can face while testing parallel code and of the existing methodologies and toolkits.
The document discusses various time series and non-homogeneous Poisson process (NHPP) models for software reliability growth. Some key models mentioned include the ARIMA, Goel-Okumoto, Musa exponential, hyperexponential growth, discrete reliability growth, S-shaped, and generalized NHPP models. The models make different assumptions about factors like error detection rates, removal of faults after failure, and introduction of new faults over time.
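For concreteness, the Goel-Okumoto model mentioned above has the well-known mean value function m(t) = a(1 - e^(-bt)), where a is the expected total number of faults and b the detection rate. A minimal Python sketch (illustrative, not drawn from the document):

```python
import math

def goel_okumoto_mean(t, a, b):
    """Expected cumulative failures by time t: m(t) = a * (1 - e^(-b*t))."""
    return a * (1.0 - math.exp(-b * t))

def goel_okumoto_intensity(t, a, b):
    """Failure intensity lambda(t) = m'(t) = a * b * e^(-b*t)."""
    return a * b * math.exp(-b * t)

# m(t) grows from 0 toward the asymptote a, while the intensity decays:
# the NHPP analogue of "fewer failures as testing proceeds".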
Finding latent code errors via machine learning over program ...butest
The document proposes a technique that uses machine learning to identify program properties from dynamic analysis that are likely to indicate errors. It trains models on properties from erroneous and fixed programs, and applies the models to rank properties of new code based on their likelihood of revealing errors. An implementation demonstrates it can increase the concentration of useful error-indicating properties in its output by factors of 50x for C programs and 4.8x for Java programs.
This document summarizes a research paper on mutation testing for C# programs. Mutation testing involves making small changes to a program to generate mutant versions, then testing if test cases can detect the changes. The paper proposes using mutation operators that model common programming errors specific to object-oriented features in C#, like access control, inheritance, and polymorphism. It presents a framework for mutation testing of C# programs and results showing the proposed approach improves accuracy and speed over traditional methods.
The document presents Debug Me, a statistical-based fault localization toolbox. It discusses categories of bug localization techniques, including static and dynamic analysis. It also covers related work on statistical approaches like SOBER and challenges they address. Debug Me ranks predicates based on their evaluation biases in passing and failing tests to localize bugs. It was evaluated on Siemens programs and showed improvements over previous techniques. Future work includes enhancing Debug Me and studying how test suites impact accuracy.
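The evaluation-bias idea can be sketched generically: for each predicate, compare how often it evaluates to true in failing versus passing runs, and rank predicates by the gap. This is an illustrative simplification in Python, not the actual Debug Me or SOBER ranking formula:

```python
# Illustrative predicate ranking by evaluation-bias gap (not SOBER's exact statistic).

def evaluation_bias(true_count, total_count):
    """Fraction of evaluations in which a predicate was true in one run."""
    return true_count / total_count if total_count else 0.0

def rank_predicates(profiles):
    """profiles: {predicate: (mean bias in failing runs, mean bias in passing runs)}.
    Rank by the absolute bias difference: a larger gap is more suspicious."""
    score = {p: abs(fail - ok) for p, (fail, ok) in profiles.items()}
    return sorted(score, key=score.get, reverse=True)

# "x > 0" is true 90% of the time in failing runs but only 10% in passing
# runs, so it outranks the predicate whose profile barely differs.
ranked = rank_predicates({"x > 0": (0.9, 0.1), "y == 0": (0.5, 0.45)})
```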
Debugging and optimization of multi-threaded OpenMP programs (PVS-Studio)
Familiarizing programmers with the development of parallel applications is an increasingly urgent task. This article is a brief introduction to creating multi-threaded applications based on the OpenMP technology, and describes approaches to debugging and optimizing parallel applications.
A Simplified Model for Evaluating Software Reliability at the Developmental S... (Waqas Tariq)
The use of open source software is becoming more and more predominant, and it is important that the reliability of this software be evaluated. Although many researchers have tried to establish the failure patterns of different packages, a deterministic model for evaluating reliability has not yet been developed. The present work details a simplified model for evaluating the reliability of open source software based on the available failure data. The methodology involves identifying a fixed number of packages at the start of the period and defining the failure rate based on the failure data for this preset number of packages. The defined failure-rate function is used to arrive at the reliability model. The reliability values obtained using the developed model are also compared with the exact reliability values. Keywords: bugs, failure density, failure rate, open source software, reliability.
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo... (Editor IJCATR)
Software reliability is a quantifiable metric, defined as the probability of software operating without failure for a specified period of time in a specific environment. Various software reliability growth models (SRGMs) have been proposed to predict the reliability of software; these models help vendors predict the behaviour of the software before shipment. Reliability is predicted by estimating the parameters of the SRGMs, but the model parameters are generally in nonlinear relationships, which creates many problems in finding the optimal parameters with traditional techniques such as Maximum Likelihood Estimation and Least Squares Estimation. Various stochastic search algorithms have been introduced that make the task of parameter estimation more reliable and computationally easier. The paper explores parameter estimation of NHPP-based reliability models using MLE and an evolutionary search algorithm called Particle Swarm Optimization.
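As a rough illustration of that approach (not the paper's implementation), a minimal particle swarm can fit the two parameters (a, b) of a Goel-Okumoto NHPP model to cumulative failure counts by minimizing squared error. The data, bounds, and PSO hyperparameters below are invented for the sketch:

```python
# Illustrative PSO fit of Goel-Okumoto parameters (all values invented).
import math
import random

random.seed(1)
times = [1, 2, 3, 4, 5]
counts = [18, 33, 45, 54, 62]          # illustrative cumulative failures

def sse(p):
    """Sum of squared errors of m(t) = a*(1 - e^(-b*t)) against the data."""
    a, b = p
    return sum((c - a * (1 - math.exp(-b * t))) ** 2 for t, c in zip(times, counts))

def pso(f, bounds, n=20, iters=200, w=0.7, c1=1.5, c2=1.5):
    """Minimal particle swarm: inertia w, cognitive c1, social c2."""
    dim = len(bounds)
    pos = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(n)]
    vel = [[0.0] * dim for _ in range(n)]
    best = [p[:] for p in pos]          # per-particle best positions
    gbest = min(best, key=f)            # swarm-wide best
    for _ in range(iters):
        for i in range(n):
            for d in range(dim):
                r1, r2 = random.random(), random.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                pos[i][d] = min(max(pos[i][d] + vel[i][d], bounds[d][0]),
                                bounds[d][1])
            if f(pos[i]) < f(best[i]):
                best[i] = pos[i][:]
        gbest = min(best + [gbest], key=f)
    return gbest

a_hat, b_hat = pso(sse, [(1.0, 500.0), (0.01, 2.0)])
```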
Fault localization is time-consuming and difficult, which makes it the bottleneck of the debugging process. To facilitate this task, many fault localization techniques exist that help narrow down the region of suspicious code in a program. However, better accuracy in fault localization comes at a heavy computational cost: techniques that effectively locate faults also manifest slow response times. In this paper, we promote the use of pre-computing to shift the time-intensive computations to the idle periods of the coding phase, in order to speed up such techniques and achieve both low cost and high accuracy. We raise the research problems of finding suitable techniques that can be pre-computed and adapting them to the pre-computing paradigm in a continuous integration environment. Further, we use an existing fault localization technique to demonstrate our research exploration, and present visions and challenges of the related methodologies.
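One concrete family of such techniques is spectrum-based fault localization, where per-statement coverage counts (which could plausibly be accumulated during idle time) feed a suspiciousness formula. The sketch below computes the well-known Ochiai score; it is an illustrative stand-in, not the specific technique the paper pre-computes:

```python
# Illustrative spectrum-based fault localization with the Ochiai formula:
# suspiciousness(s) = ef / sqrt(total_failed * (ef + ep)), where ef/ep are
# the numbers of failing/passing tests that cover statement s.
import math

def ochiai(failed_cov, passed_cov, total_failed):
    """failed_cov/passed_cov: {statement: count of covering failed/passed tests}."""
    scores = {}
    for stmt in set(failed_cov) | set(passed_cov):
        ef = failed_cov.get(stmt, 0)
        ep = passed_cov.get(stmt, 0)
        denom = math.sqrt(total_failed * (ef + ep))
        scores[stmt] = ef / denom if denom else 0.0
    return scores

# s3 is covered by both failing tests and no passing test -> most suspicious.
scores = ochiai({"s3": 2, "s1": 1}, {"s1": 5, "s2": 6}, total_failed=2)
ranking = sorted(scores, key=scores.get, reverse=True)
```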
This document provides an overview of iterative and recursive algorithms. It begins with defining iterative algorithms as executing steps in iterations to find successive approximations of a solution. Key aspects of iterative algorithms discussed include loop invariants, typical errors, and different types of iterative methods. Recursion is then introduced as algorithms that call themselves with smaller inputs and solve larger cases based on smaller cases. Examples of recursive algorithms are provided for computing even numbers, powers of 2, sequential search, and testing natural numbers. In summary, the document covers the basic concepts and structures of iterative and recursive algorithms through definitions, examples, and comparisons between the two approaches.
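Two of the recursive examples mentioned (powers of 2 and sequential search) can be sketched directly. The Python below is illustrative, not the document's own code: each call solves a smaller instance, and a base case stops the recursion.

```python
def power_of_two(n):
    """Compute 2**n recursively: 2^n = 2 * 2^(n-1), with base case 2^0 = 1."""
    if n == 0:
        return 1
    return 2 * power_of_two(n - 1)

def sequential_search(items, target, i=0):
    """Recursive sequential search: check items[i], then recurse on the rest.

    Returns the index of target, or -1 when the list is exhausted."""
    if i == len(items):
        return -1
    if items[i] == target:
        return i
    return sequential_search(items, target, i + 1)
```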
DETECTION OF CONTROL FLOW ERRORS IN PARALLEL PROGRAMS AT COMPILE TIME (ijdpsjournal)
This paper describes a general technique to identify control flow errors in parallel programs, which can be automated in a compiler. The compiler builds a system of linear equations that describes the global control flow of the whole program; solving these equations using standard techniques of linear algebra can locate a wide range of control flow bugs at compile time. This paper also describes an implementation of this control flow analysis technique in a prototype compiler for a well-known parallel programming language. In contrast to previous research in automated parallel program analysis, our technique is efficient for large programs and does not limit the range of language features.
The document discusses formal verification techniques for proving program correctness. It describes the axiomatic approach, where assertions are made at various points in a program about variable relationships. To prove correctness, it must be shown that applying the program statements transforms one assertion into the next. Formal verification can reduce testing costs but cannot replace testing for system validation.
This document summarizes a technique called partition-based regression verification (PRV) that aims to gradually verify software regressions by exploring differential partitions between program versions. The key points are:
- PRV computes differential partitions based on test inputs that lead to different program states between versions. It then generates test cases to verify the partitions.
- An empirical study on 83 program versions shows PRV finds regression-revealing test cases on average for 48% more versions and 63% faster than alternative approaches.
- When applied to the Apache CLI project over 6 years, PRV suggested an average progression of 49%, regression of 18% and intermediate semantic changes of 33% between successive versions.
Architecture of a morphological malware detector (UltraUploader)
This document proposes an architecture for a morphological malware detector that combines syntactic and semantic analysis. It builds an efficient signature matching engine using tree automata techniques to represent control flow graphs (CFG). It also describes a graph rewriting engine to handle common malware mutations. The detector extracts CFGs from malware binaries to generate signatures, which are compiled into a minimal automaton database for efficient matching. Experiments showed promising results with a low false positive rate.
178 - A replicated study on duplicate detection: Using Apache Lucene to searc... (ESEM 2014)
Context: Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate may decrease costs by limiting rework and collecting related pieces of information. Goal: Our work explores using Apache Lucene for large-scale duplicate detection based on textual content. Also, we evaluate the previous claim that results are improved if the title is weighted as more important than the description. Method: We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Results: We show that Lucene obtains the best results when the defect report title is weighted three times higher than the description, a bigger difference than has been previously acknowledged. Conclusions: Our work shows the potential of using Lucene as a scalable solution for duplicate detection.
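The title-weighting idea can be illustrated outside Lucene with a toy bag-of-words cosine similarity in which title terms count three times as much as description terms. This is a hedged sketch only; the bug-report texts are invented and Lucene's actual scoring (TF-IDF/BM25 with field boosts) is considerably richer:

```python
# Toy title-boosted bag-of-words similarity (illustrative; not Lucene scoring).
from collections import Counter
import math

def weighted_terms(title, description, title_weight=3):
    """Bag of words where each title term counts title_weight times."""
    bag = Counter(title.lower().split())
    for w in bag:
        bag[w] *= title_weight
    bag.update(description.lower().split())
    return bag

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a if w in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

new = weighted_terms("camera crash", "app closes when opening camera")
dup = weighted_terms("camera crashes on open", "application force closes")
other = weighted_terms("battery drain", "phone gets hot while idle")
# The report sharing title terms with `new` scores higher than the unrelated one.
```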
Software reverse engineering is an active threat against software programs. One of the popular techniques used to make software reverse engineering harder is obfuscation. Among the various control flow obfuscation methods proposed in the last decade, there is a lack of inter-functional control flow obfuscation techniques. In this paper we propose an inter-functional control flow obfuscation that manipulates return instructions. In our proposed method each function is split into different units, with each unit ending with a return instruction. The linear order in which functions appear in the program is obscured by shuffling these units, thereby creating an inter-functional control flow obfuscation. Experimental results show that the algorithm performs well against automated reverse engineering attacks.
Bug triage means assigning a new bug to an expert developer. Manual bug triage is expensive in time and poor in accuracy, so there is a need to automate the bug triage process. To automate it, text classification techniques are applied using stopword removal and stemming. In our proposed work we have used Naive Bayes classifiers to predict the developers. Data reduction techniques such as instance selection and keyword selection are used to obtain bug reports and words. This helps the system to predict only those developers who have expertise in solving the assigned bug. We also provide for the change of status of a bug report: if the bug is solved, the bug report is updated, and if a particular developer fails to solve the bug, the bug is reassigned to another developer.
Abstract—Combinatorial testing (also called interaction testing) is an effective specification-based test input generation technique. By now most of research work in combinatorial testing aims to propose novel approaches trying to generate test suites with minimum size that still cover all the pairwise, triple, or n-way combinations of factors. Since the difficulty of solving this problem is demonstrated to be NP-hard, existing approaches have been designed to generate optimal or near optimal combinatorial test suites in polynomial time. In this paper, we try to apply particle swarm optimization (PSO), a kind of meta-heuristic search technique, to pairwise testing (i.e. a special case of combinatorial testing aiming to cover all the pairwise combinations). To systematically build pairwise test suites, we propose two different PSO based algorithms. One algorithm is based on one-test-at-a-time strategy and the other is based on IPO-like strategy. In these two different algorithms, we use PSO to complete the construction of a single test. To successfully apply PSO to cover more uncovered pairwise combinations in this construction process, we provide a detailed description on how to formulate the search space, define the fitness function and set some heuristic settings. To verify the effectiveness of our approach, we implement these algorithms and choose some typical inputs. In our empirical study, we analyze the impact factors of our approach and compare our approach to other well-known approaches. Final empirical results show the effectiveness and efficiency of our approach.
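The pairwise-coverage objective that the paper's PSO optimizes can be sketched in a few lines: enumerate all pairwise combinations to cover, then build tests one at a time, each scored by how many still-uncovered pairs it hits. The code below uses a greedy stand-in for the PSO search step; it is illustrative, not the paper's algorithm:

```python
# Illustrative pairwise-coverage construction (greedy stand-in for PSO).
from itertools import combinations, product

def all_pairs(factors):
    """Every (factor i, factor j, value_i, value_j) combination to cover."""
    pairs = set()
    for i, j in combinations(range(len(factors)), 2):
        for a, b in product(factors[i], factors[j]):
            pairs.add((i, j, a, b))
    return pairs

def covered_by(test):
    """Pairs covered by one concrete test (a tuple of factor values)."""
    return {(i, j, test[i], test[j]) for i, j in combinations(range(len(test)), 2)}

factors = [["a", "b"], [0, 1], ["x", "y"]]
uncovered = all_pairs(factors)
suite = []
while uncovered:
    # One-test-at-a-time: the fitness of a candidate is the number of
    # still-uncovered pairs it covers; PSO would search this space instead.
    best = max(product(*factors), key=lambda t: len(covered_by(t) & uncovered))
    suite.append(best)
    uncovered -= covered_by(best)
```

For three binary factors this yields a pairwise-covering suite of 4 tests out of 8 possible, illustrating why near-minimal suites are worth searching for.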
Are current antivirus programs able to detect complex metamorphic malware an ... (UltraUploader)
This document presents research evaluating current antivirus programs' ability to detect complex metamorphic malware. The researchers designed a metamorphic engine to mutate malware code while preserving functionality. They applied this engine to the MyDoom worm and tested major antivirus products. Most products were unable to reliably detect the mutated MyDoom, demonstrating limitations in static detection techniques and the need for dynamic behavioral analysis to identify evolved malware variants.
The article describes 7 types of metrics and more than 50 of their representatives, provides a detailed description of them and the calculation algorithms used, and touches upon the role of metrics in software development.
Software Reliability Growth Model with Logistic-Exponential Testing-Effort F... (IDES Editor)
Software reliability is one of the important factors of software quality. Before software is delivered to market, it is thoroughly checked and errors are removed. Every software company wants to develop software that is error free, and software reliability growth models help the software industry to develop software that is error free and reliable. In this paper, an analysis is done by incorporating a logistic-exponential testing-effort function into an NHPP software reliability growth model, and its release policy is also observed. Experiments are performed on real datasets; parameters are calculated, and it is observed that our model best fits the datasets.
A study of the Behavior of Floating-Point Errors (ijpla)
The dangers of programs performing floating-point computations are well known, due to numerical reliability issues resulting from rounding errors arising during the computations. In general, these round-off errors are neglected because they are small. However, they can accumulate and propagate and lead to faulty execution and failures. In critical embedded systems, such faults may cause dramatic damage (e.g. the failures of the Ariane 5 launch and the Patriot missile system). The ufp (unit in the first place) and ulp (unit in the last place) functions are used to estimate the maximum value of round-off errors. In this paper, the idea consists in studying the behavior of round-off errors, checking their numerical stability using a set of constraints, and ensuring that the computed round-off errors do not become larger when solving constraints over the ufp and ulp values.
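Python's standard library exposes the ulp concept directly as `math.ulp` (available since Python 3.9), which makes the per-operation rounding-error bound and the accumulation effect easy to demonstrate:

```python
# math.ulp(x) is the "unit in the last place" at x: the spacing between x
# and the next representable float, a bound on one rounding step near x.
import math

x = 0.1 + 0.2                      # not exactly 0.3 in binary floating point
abs_err = abs(x - 0.3)
assert x != 0.3
assert abs_err <= math.ulp(0.3)    # the single-operation error is within 1 ulp

# Accumulation: summing 0.1 ten times drifts slightly from 1.0.
s = 0.0
for _ in range(10):
    s += 0.1
drift = abs(s - 1.0)               # tiny but nonzero
```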
The document discusses software testing concepts like validation testing vs defect testing, system and component testing strategies, and test automation tools. It defines key terms like bugs, defects, errors, faults, and failures. It also describes techniques like equivalence partitioning and boundary value analysis that are used to generate test cases that thoroughly test software. Component testing tests individual program parts while system testing tests integrated groups of components. Test cases specify conditions to determine if software works as intended.
Software Engineering Domain Knowledge to Identify Duplicate Bug Reports (IJCERT)
This document summarizes a research paper that proposes a technique to improve the detection of duplicate bug reports using contextual information extracted from software engineering literature. It describes extracting word lists from software engineering textbooks and project documentation to measure contextual features of bug reports. The technique was evaluated on real bug report datasets and showed potential to significantly reduce manual effort in contextual bug deduplication while maintaining accuracy. Key findings indicate that leveraging domain knowledge from software engineering texts can help automate and enhance the identification of duplicate bug reports.
This document discusses debugging complex programs by breaking them into smaller components that can each be tested independently. It recommends systematically understanding the intended behavior, observing the actual behavior through experiments and small changes, and tracking notes. It then introduces two ways to observe program behavior in DrRacket - using display to print intermediate values, and printf to print multiple values in a formatted string. Both produce side effects by printing rather than returning output values.
This document discusses metamorphic testing, which is an automated software testing technique that can be used by end-user programmers to test their programs without requiring an oracle. It focuses on applying metamorphic testing to spreadsheet applications. Metamorphic testing works by using the expected properties of the target function to generate follow-up test cases based on successful initial test cases. These follow-up cases are then checked against metamorphic relations, which define properties that should hold across multiple executions. The document proposes implementing metamorphic testing for spreadsheets by defining six metamorphic relations to detect errors without needing to determine expected outputs.
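A metamorphic relation check needs no oracle, only a property that must hold across executions. Here is a toy Python sketch for a spreadsheet-style SUM, whose output must be unchanged under any permutation of its inputs; the relation and the function under test are illustrative, not the paper's six spreadsheet relations:

```python
# Illustrative metamorphic test: SUM must be invariant under input permutation.
import random

def spreadsheet_sum(cells):
    """Function under test (a stand-in for a spreadsheet SUM formula)."""
    total = 0.0
    for c in cells:
        total += c
    return total

def check_permutation_relation(f, cells, trials=20):
    """Metamorphic relation: f(shuffle(x)) == f(x), checked over random shuffles.

    No expected output is needed: only the original result is compared
    against results for the follow-up (shuffled) test cases."""
    base = f(cells)
    for _ in range(trials):
        shuffled = cells[:]
        random.shuffle(shuffled)
        if abs(f(shuffled) - base) > 1e-9:
            return False
    return True

ok = check_permutation_relation(spreadsheet_sum, [1.0, 2.5, 3.0, 4.25])
```

A faulty SUM that, say, skipped the first cell would typically violate the relation on some shuffle, so the defect is detected without ever computing the expected total by hand.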
Similar to Automated Bug Neighborhood Analysis for Identifying Incomplete Bug Fixes
A Review on Parameter Estimation Techniques of Software Reliability Growth Mo...Editor IJCATR
Software reliability is considered as a quantifiable metric, which is defined as the probability of a software to operate
without failure for a specified period of time in a specific environment. Various software reliability growth models have been proposed
to predict the reliability of a software. These models help vendors to predict the behaviour of the software before shipment. The
reliability is predicted by estimating the parameters of the software reliability growth models. But the model parameters are generally
in nonlinear relationships which creates many problems in finding the optimal parameters using traditional techniques like Maximum
Likelihood and least Square Estimation. Various stochastic search algorithms have been introduced which have made the task of
parameter estimation, more reliable and computationally easier. Parameter estimation of NHPP based reliability models, using MLE
and using an evolutionary search algorithm called Particle Swarm Optimization, has been explored in the paper.
Fault localization is time-consuming and difficult, which makes it the bottleneck of the
debugging progress. To help facilitate this task, there exist many fault localization techniques
that help narrow down the region of the suspicious code in a program. Better accuracy in fault
localization is achieved from heavy computation cost. Fault localization techniques that can
effectively locate faults also manifest slow response rate. In this paper, we promote the use of
pre-computing to distribute the time-intensive computations to the idle period of coding phase,
in order to speed up such techniques and achieve both low-cost and high accuracy. We raise the
research problems of finding suitable techniques that can be pre-computed and adapt it to the
pre-computing paradigm in a continuous integration environment. Further, we use an existing
fault localization technique to demonstrate our research exploration, and shows visions and
challenges of the related methodologies.
Fault localization is time-consuming and difficult,
which makes it the bottleneck of the
debugging progress. To help facilitate this task, t
here exist many fault localization techniques
that help narrow down the region of the suspicious
code in a program. Better accuracy in fault
localization is achieved from heavy computation cos
t. Fault localization techniques that can
effectively locate faults also manifest slow respon
se rate. In this paper, we promote the use of
pre-computing to distribute the time-intensive comp
utations to the idle period of coding phase,
in order to speed up such techniques and achieve bo
th low-cost and high accuracy. We raise the
research problems of finding suitable techniques th
at can be pre-computed and adapt it to the
pre-computing paradigm in a continuous integration
environment. Further, we use an existing
fault localization technique to demonstrate our res
earch exploration, and shows visions and
challenges of the related methodologies.
This document provides an overview of iterative and recursive algorithms. It begins with defining iterative algorithms as executing steps in iterations to find successive approximations of a solution. Key aspects of iterative algorithms discussed include loop invariants, typical errors, and different types of iterative methods. Recursion is then introduced as algorithms that call themselves with smaller inputs and solve larger cases based on smaller cases. Examples of recursive algorithms are provided for computing even numbers, powers of 2, sequential search, and testing natural numbers. In summary, the document covers the basic concepts and structures of iterative and recursive algorithms through definitions, examples, and comparisons between the two approaches.
DETECTION OF CONTROL FLOW ERRORS IN PARALLEL PROGRAMS AT COMPILE TIME ijdpsjournal
This paper describes a general technique to identify control flow errors in parallel programs, which can be automated into a compiler. The compiler builds a system of linear equations that describes the global control flow of the whole program. Solving these equations using standard techniques of linear algebra
can locate a wide range of control flow bugs at compile time. This paper also describes an implementation of this control flow analysis technique in a prototype compiler for a well-known parallel programming language. In contrast to previous research in automated parallel program analysis, our technique is efficient for large programs, and does not limit the range of language features.
The document discusses formal verification techniques for proving program correctness. It describes the axiomatic approach, where assertions are made at various points in a program about variable relationships. To prove correctness, it must be shown that applying the program statements transforms one assertion into the next. Formal verification can reduce testing costs but cannot replace testing for system validation.
This document summarizes a technique called partition-based regression verification (PRV) that aims to gradually verify software regressions by exploring differential partitions between program versions. The key points are:
- PRV computes differential partitions based on test inputs that lead to different program states between versions. It then generates test cases to verify the partitions.
- An empirical study on 83 program versions shows PRV finds regression-revealing test cases on average for 48% more versions and 63% faster than alternative approaches.
- When applied to the Apache CLI project over 6 years, PRV suggested an average progression of 49%, regression of 18% and intermediate semantic changes of 33% between successive versions.
Architecture of a morphological malware detectorUltraUploader
This document proposes an architecture for a morphological malware detector that combines syntactic and semantic analysis. It builds an efficient signature matching engine using tree automata techniques to represent control flow graphs (CFG). It also describes a graph rewriting engine to handle common malware mutations. The detector extracts CFGs from malware binaries to generate signatures, which are compiled into a minimal automaton database for efficient matching. Experiments showed promising results with a low false positive rate.
178 - A replicated study on duplicate detection: Using Apache Lucene to searc...ESEM 2014
Context: Duplicate detection is a fundamental part of issue management. Systems able to predict whether a new defect report will be closed as a duplicate may decrease costs by limiting rework and collecting related pieces of information. Goal: Our work explores using Apache Lucene for large-scale duplicate detection based on textual content. Also, we evaluate the previous claim that results are improved if the title is weighted as more important than the description. Method: We conduct a conceptual replication of a well-cited study conducted at Sony Ericsson, using Lucene for searching in the public Android defect repository. In line with the original study, we explore how varying the weighting of the title and the description affects the accuracy. Results: We show that Lucene obtains the best results when the defect report title is weighted three times higher than the description, a bigger difference than has been previously acknowledged. Conclusions: Our work shows the potential of using Lucene as a scalable solution for duplicate detection.
Software reverse engineering is an active threat against software programs. One popular technique for making reverse engineering harder is obfuscation. Among the various control flow obfuscation methods proposed in the last decade, there is a lack of inter-functional techniques. In this paper we propose an inter-functional control flow obfuscation that manipulates return instructions. In our proposed method, each function is split into different units, with each unit ending in a return instruction. The linear order in which functions appear in the program is obscured by shuffling these units, thereby creating an inter-functional control flow obfuscation. Experimental results show that the algorithm performs well against automated reverse engineering attacks.
Bug triage means assigning a new bug report to a developer with the appropriate expertise. Manual bug triage is expensive in time and poor in accuracy, so there is a need to automate the bug triage process. To automate it, text classification techniques are applied using stopword removal and stemming. In our proposed work we use Naive Bayes classifiers to predict the developers. Data reduction techniques such as instance selection and keyword selection are used to obtain a reduced set of bug reports and words. This helps the system predict only those developers who have expertise in solving the assigned bug. We also track the change of status of a bug report: if the bug is solved, the bug report is updated, and if a particular developer fails to solve the bug, the bug is reassigned to another developer.
Abstract—Combinatorial testing (also called interaction testing) is an effective specification-based test input generation technique. To date, most research in combinatorial testing aims to propose novel approaches that generate test suites of minimum size while still covering all pairwise, triple, or n-way combinations of factors. Since this problem has been shown to be NP-hard, existing approaches are designed to generate optimal or near-optimal combinatorial test suites in polynomial time. In this paper, we apply particle swarm optimization (PSO), a kind of meta-heuristic search technique, to pairwise testing (a special case of combinatorial testing that aims to cover all pairwise combinations). To systematically build pairwise test suites, we propose two different PSO-based algorithms: one based on a one-test-at-a-time strategy and the other on an IPO-like strategy. In both algorithms, we use PSO to complete the construction of a single test. To successfully apply PSO to cover more uncovered pairwise combinations during this construction, we provide a detailed description of how to formulate the search space, define the fitness function, and choose heuristic settings. To verify the effectiveness of our approach, we implement both algorithms and choose some typical inputs. In our empirical study, we analyze the impact factors of our approach and compare it to other well-known approaches. The empirical results show the effectiveness and efficiency of our approach.
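The fitness function mentioned in the abstract above can be made concrete: for pairwise testing, a candidate test's fitness is typically the number of not-yet-covered pairs it would cover. A small Python sketch follows; this is my own illustrative formulation, not the paper's code.

```python
from itertools import combinations

def all_pairs(factors):
    """All (position, value) pairs that must be covered, given a
    list of per-factor value domains."""
    pairs = set()
    for (i, vi), (j, vj) in combinations(enumerate(factors), 2):
        for a in vi:
            for b in vj:
                pairs.add(((i, a), (j, b)))
    return pairs

def covered_pairs(test):
    """Pairs covered by one concrete test (a tuple of values)."""
    return {((i, test[i]), (j, test[j]))
            for i, j in combinations(range(len(test)), 2)}

def fitness(test, uncovered):
    """PSO-style fitness: how many still-uncovered pairs this
    candidate test would cover."""
    return len(covered_pairs(test) & uncovered)
```

For three binary factors there are 12 pairs in total, and any single test covers exactly 3 of them, which is why a suite of several tests is needed for full pairwise coverage.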
Are current antivirus programs able to detect complex metamorphic malware an ...UltraUploader
This document presents research evaluating current antivirus programs' ability to detect complex metamorphic malware. The researchers designed a metamorphic engine to mutate malware code while preserving functionality. They applied this engine to the MyDoom worm and tested major antivirus products. Most products were unable to reliably detect the mutated MyDoom, demonstrating limitations in static detection techniques and the need for dynamic behavioral analysis to identify evolved malware variants.
The article describes 7 types of metrics and more than 50 of their representatives, provides a detailed description of each, and explains the calculation algorithms used. It also touches upon the role of metrics in software development.
Software Reliability Growth Model with Logistic- Exponential Testing-Effort F...IDES Editor
Software reliability is one of the important factors of software quality. Before software is delivered to market, it is thoroughly checked and errors are removed. Every software company wants to develop software that is error free. Software reliability growth models help the software industry to develop software that is error free and reliable. In this paper, an analysis is done by incorporating a logistic-exponential testing-effort function into an NHPP software reliability growth model, and its release policy is also observed. Experiments are performed on real datasets; parameters are calculated, and we observe that our model best fits the datasets.
A study of the Behavior of Floating-Point Errorsijpla
The dangers of programs performing floating-point computations are well known, due to numerical reliability issues resulting from rounding errors arising during the computations. In general, these round-off errors are neglected because they are small. However, they can accumulate and propagate and lead to faulty execution and failures. In critical embedded systems, these faults may cause dramatic damage (e.g., the failures of the Ariane 5 launch and the Patriot missile system). The ufp (unit in the first place) and ulp (unit in the last place) functions are used to estimate the maximum value of round-off errors. In this paper, the idea consists of studying the behavior of round-off errors, checking their numerical stability using a set of constraints, and ensuring that the computed round-off errors do not become larger when solving constraints over the ufp and ulp values.
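The ulp function discussed above is available directly in Python's standard library (`math.ulp`, Python 3.9+), which makes it easy to see both the size of one rounding step and how small errors accumulate; the summation example is my own, not from the summarized paper.

```python
import math

# One ulp at 1.0 equals the double-precision machine epsilon, 2**-52.
eps = math.ulp(1.0)

# 0.1 is not exactly representable in binary floating point, so
# summing it ten times accumulates round-off error instead of
# yielding exactly 1.0.
total = 0.0
for _ in range(10):
    total += 0.1

error = abs(total - 1.0)
```

The accumulated error is nonzero but only on the order of a few ulps of 1.0, illustrating why such errors are usually neglected yet can matter when they propagate.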
The document discusses software testing concepts like validation testing vs defect testing, system and component testing strategies, and test automation tools. It defines key terms like bugs, defects, errors, faults, and failures. It also describes techniques like equivalence partitioning and boundary value analysis that are used to generate test cases that thoroughly test software. Component testing tests individual program parts while system testing tests integrated groups of components. Test cases specify conditions to determine if software works as intended.
Software Engineering Domain Knowledge to Identify Duplicate Bug ReportsIJCERT
This document summarizes a research paper that proposes a technique to improve the detection of duplicate bug reports using contextual information extracted from software engineering literature. It describes extracting word lists from software engineering textbooks and project documentation to measure contextual features of bug reports. The technique was evaluated on real bug report datasets and showed potential to significantly reduce manual effort in contextual bug deduplication while maintaining accuracy. Key findings indicate that leveraging domain knowledge from software engineering texts can help automate and enhance the identification of duplicate bug reports.
This document discusses debugging complex programs by breaking them into smaller components that can each be tested independently. It recommends systematically understanding the intended behavior, observing the actual behavior through experiments and small changes, and tracking notes. It then introduces two ways to observe program behavior in DrRacket - using display to print intermediate values, and printf to print multiple values in a formatted string. Both produce side effects by printing rather than returning output values.
This document discusses metamorphic testing, which is an automated software testing technique that can be used by end-user programmers to test their programs without requiring an oracle. It focuses on applying metamorphic testing to spreadsheet applications. Metamorphic testing works by using the expected properties of the target function to generate follow-up test cases based on successful initial test cases. These follow-up cases are then checked against metamorphic relations, which define properties that should hold across multiple executions. The document proposes implementing metamorphic testing for spreadsheets by defining six metamorphic relations to detect errors without needing to determine expected outputs.
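A metamorphic relation, as described above, can be checked without knowing the expected output. For example, for a function that averages a range of cells, permuting the inputs must not change the result. The sketch below (my own example, not one of the six relations the document proposes) shows how a follow-up test derived from a successful initial test can expose a bug without an oracle:

```python
def average(cells):
    """Correct average of a list of cell values."""
    return sum(cells) / len(cells)

def buggy_average(cells):
    """Faulty version: silently drops the last cell."""
    return sum(cells[:-1]) / (len(cells) - 1)

def check_permutation_relation(f, cells):
    """Metamorphic relation: f(cells) must equal f(reversed cells).
    No knowledge of the true average is needed."""
    return f(cells) == f(list(reversed(cells)))
```

The correct implementation satisfies the relation on any input, while the buggy one violates it whenever the first and last cells differ, so the relation acts as a partial oracle.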
was fixed in an incomplete way. In this paper, we address these limitations of previous approaches.

First, we present a characterization of completeness of a bug fix. Our characterization is based on the definition of a bug neighborhood, which intuitively is a set of related flows of invalid values. For an attempted bug fix to be complete, all related value flows in the bug neighborhood of the flow must be fixed as well. We describe different ways in which the bug neighborhood may be affected by code changes that are targeted toward fixing a bug. Our technique is applicable to the class of bugs that involves the flow of an invalid value from one program point to another, where the value causes a runtime exception. Examples of such exceptions include null-pointer exceptions, array-index exceptions, and class-cast exceptions. In this paper, we illustrate the application of the approach to null-pointer exceptions.
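To make the (NPA, NPR) terminology concrete: an NPA is the point where the invalid (null) value is assigned, and an NPR is a dereference that value can reach. The paper's examples are Java null-pointer exceptions; the Python analogue below (names and scenario are mine) shows an attempted fix that guards one dereference but leaves a second dereference of the same value reachable, i.e., an incomplete fix:

```python
def lookup(table, key):
    # NPA analogue: may produce the invalid value None.
    return table.get(key)

def render(table, key):
    item = lookup(table, key)
    # Attempted fix: this dereference (one NPR) is now guarded.
    return item.upper() if item is not None else ""

def describe(table, key):
    item = lookup(table, key)
    # Unfixed NPR in the same bug neighborhood: still raises
    # AttributeError (the Python analogue of a NullPointerException)
    # when the key is missing.
    return "value: " + item.upper()
```

Guarding only `render` removes the failure in one execution, but the invalid value from `lookup` still flows to the dereference in `describe`, which is exactly the kind of residual manifestation the bug neighborhood captures.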
Second, we present an automated analysis that identifies whether an (NPA, NPR) pair in program P has been fixed in a modified version P′ of the program. For an attempted bug fix, the analysis also identifies whether the fix is complete according to our definition of completeness. Given an (NPA, NPR) pair in P, the analysis technique identifies the matching pair (NPA′, NPR′) in P′.¹ Next, the technique determines whether there are other NPRs in P′ related to NPA′ and other NPAs in P′ related to NPR′ that could cause null-pointer exceptions in other executions of P′. To do this, the technique performs a backward analysis from NPR′ and a forward analysis from NPA′, and determines whether a bug fix has been attempted with respect to (NPA, NPR). If a fix has been attempted, it classifies the fix as complete or incomplete. This step of the analysis leverages the interprocedural null-dereference analysis implemented in a tool called XYLEM [5] and extensions to that analysis [9], which we presented in previous work.
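The classification step just described can be summarized in a few lines: a fix has been attempted if the matched pair no longer fails, and it is complete only if no related pair in the bug neighborhood can still fail. The real analysis derives these facts from interprocedural value flows via XYLEM; the Python sketch below only encodes the decision logic, with my own data representation:

```python
def classify_fix(original_pair_fails, neighborhood_pairs_fail):
    """Classify an attempted bug fix.

    original_pair_fails: whether the matched (NPA', NPR') pair in P'
        can still raise the exception.
    neighborhood_pairs_fail: for each related pair in the bug
        neighborhood, whether it can still fail in P'.
    """
    if original_pair_fails:
        return "not fixed"
    if any(neighborhood_pairs_fail):
        return "incomplete fix"
    return "complete fix"
```

A fix that removes the original failure but leaves any neighborhood pair live is thus flagged as incomplete, which is the case the paper argues prior techniques miss.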
In this paper, we also present the results of empirical studies that we performed using open-source and commercial software. Our studies show that bug neighborhoods can be quite large and that there are many instances in which bug fixes are incomplete. These studies illustrate the need for a technique like ours that can alert the developer to attempted bug fixes that are incomplete and that can cause exceptions to occur in other executions.

The main benefit of our technique is that it can automatically detect incomplete bug fixes, and highlight parts of the program that should be examined, and potentially changed, to ensure that a fix is complete. Our technique could be implemented in an interactive debugging tool that, for a given bug, suggests to the developer the types of fixes that would be complete, and identifies the parts of the program that should be examined to implement the fix. Thus, using the technique, bug fixes can be made more effective.
¹ It is possible that the NPA, the NPR, or both are deleted and, therefore, do not occur in P′.
The main contributions of the paper are:
- An approach for classifying attempted bug fixes as complete or incomplete that provides information to assist the developer in getting a complete fix.
- An implementation of the technique for null-dereference bugs.
- Empirical results that show that, for the subjects we studied, bug neighborhoods for (NPA, NPR) pairs are large in size and varied in type, and that many attempted bug fixes are incomplete.
II. COMPLETENESS OF AN ATTEMPTED BUG FIX

In this section, we define completeness of an attempted bug fix for a null-pointer dereference in a Java program. First, we present the definition and analysis of a bug neighborhood, and use the bug neighborhood to define a complete fix (Section II-A). Then, we illustrate the potential resulting bug neighborhoods after various kinds of fixes are attempted for a null-pointer bug (Section II-B).
A. Bug neighborhoods and complete fixes
A null-dereference bug occurs with respect to an (NPA,
NPR) pair in a program P. As we illustrated in the Introduction,
an attempted fix for such a bug can leave unresolved
manifestations related to the pair in the modified version P′,
although it removes the current bug in this execution. Those
manifestations, which could cause null-pointer exceptions
related to the (NPA, NPR) pair in other executions, constitute a
bug neighborhood for the (NPA, NPR) pair.
Figure 2 illustrates the concept of a bug neighborhood.
The figure shows that the bug neighborhood for (NPA₁,
NPR₁) can contain (NPA, NPR) pairs consisting of four
types of NPAs and NPRs related to (NPA₁, NPR₁): Maybe
NPAs, Maybe NPRs, Forward NPRs, and Backward NPAs.
In the figure, Maybe NPAs and Maybe NPRs are connected
to NPR₁ and NPA₁, respectively, with solid edges to indicate
that they are reachable in other executions. Forward NPRs
and Backward NPAs are connected to NPR₁ and NPA₁,
respectively, with dashed edges to indicate that they may be
reachable in other executions, depending on how the (NPA₁,
NPR₁) pair is fixed. Forward NPRs and Backward NPAs are also
connected to Maybe NPRs and Maybe NPAs.
Given (NPA₁, NPR₁), our bug-neighborhood analysis
automatically identifies the other NPAs and NPRs that induce
its bug neighborhood.
1) Maybe NPRs: Maybe NPRs are those null-pointer
dereferences that might be reached by NPA₁ in another
execution if the (NPA₁, NPR₁) pair is not fixed completely.
Our technique performs a forward analysis from NPA₁ to
find these Maybe NPRs. In Figure 2, there are two Maybe
NPRs associated with NPA₁—Maybe NPR₂ and Maybe
NPR₃—resulting in (NPA₁, Maybe NPR₂) and (NPA₁,
Maybe NPR₃) being added to the bug neighborhood for
(NPA₁, NPR₁).
Figure 2. Bug neighborhood for (NPA₁, NPR₁) pair.
2) Maybe NPAs: Maybe NPAs are analogous to Maybe
NPRs, and are those null-pointer assignments that might
reach NPR₁ in another execution if the (NPA₁, NPR₁) pair
is not fixed completely. Our technique performs a backward
analysis from NPR₁ to find these Maybe NPAs. In Figure 2,
there are two Maybe NPAs associated with NPR₁—Maybe
NPA₂ and Maybe NPA₃—resulting in (Maybe NPA₂, NPR₁)
and (Maybe NPA₃, NPR₁) being added to the bug neighborhood
for (NPA₁, NPR₁).
3) Forward NPRs: Forward NPRs are NPRs that may
become reachable from NPA₁ if the program is changed
so that NPR₁ is no longer an NPR. These Forward NPRs
are masked by NPR₁: if NPR₁ is changed so that it cannot
be reached by NPA₁ in P′, these Forward NPRs may
now be reachable from NPA₁. Additionally, Forward NPRs
can mask other Forward NPRs. Thus, Forward NPRs are
computed transitively until no possible NPRs can be reached
from NPA₁ because of fixes to NPR₁ and to other Forward
NPRs. This transitive step results in a tree of Forward NPRs
rooted at NPR₁, which is represented in Figure 2 by the
triangle under NPR₁. For Forward NPRs, (NPA₁, Forward
NPR₁), …, (NPA₁, Forward NPRₙ) are added to the bug
neighborhood for (NPA₁, NPR₁).
To obtain a complete fix, the bug-neighborhood analysis
also requires a similar computation of the Forward NPRs
associated with Maybe NPRs. Otherwise, even if (NPA₁,
NPR₁) has been fixed, it is possible that the Forward NPRs
associated with Maybe NPRs can be reached, and they
may cause other null-pointer exceptions after the attempted
fix. This computation generates additional (NPA₁, Forward
NPR) pairs, and adds them to the bug neighborhood for
(NPA₁, NPR₁).
4) Backward NPAs: Backward NPAs are analogous to
Forward NPRs, and are NPAs that may reach NPR₁ if the
program is changed so that NPA₁ no longer generates a
null value. These Backward NPAs are masked by NPA₁:
if NPA₁ is changed so that it cannot reach NPR₁ in P′,
these Backward NPAs may now reach NPR₁. Additionally,
Backward NPAs could mask other Backward NPAs. Thus,
Backward NPAs are computed transitively until no possible
NPAs can reach NPR₁ because of fixes to NPA₁ and to
other Backward NPAs. This transitive step results in a tree
of Backward NPAs rooted at NPA₁, which is represented in
Figure 2 as the triangle above NPA₁. For Backward NPAs,
(Backward NPA₁, NPR₁), …, (Backward NPAₙ, NPR₁) are
added to the bug neighborhood for (NPA₁, NPR₁).
Like the Forward-NPR computation, computing the bug
neighborhood also requires computing the Backward NPAs
associated with Maybe NPAs. This computation generates
additional (Backward NPA, NPR₁) pairs, and adds them
to the bug neighborhood for (NPA₁, NPR₁).
5) Definitions: We now define bug neighborhood and
complete fix for a null-dereference bug represented by the pair
(NPA₁, NPR₁).
Definition 1: A bug neighborhood for (NPA₁, NPR₁) is
the set of (NPA, NPR) pairs that are induced by manifestations
of Maybe NPAs, Maybe NPRs, Forward NPRs, and
Backward NPAs related to (NPA₁, NPR₁). The size of the
bug neighborhood for (NPA₁, NPR₁) is the number of pairs
in the neighborhood.
Definition 2: An attempted bug fix (i.e., a change from P
to P′) for (NPA₁, NPR₁) in P is a complete fix if the bug
neighborhood for the associated pair in P′ is empty. Note
that our analysis does not determine the correctness of fixes.
Thus, it may be possible that a complete fix is incorrect.
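The Maybe components of Definition 1 can be sketched concretely. The following is a minimal Python sketch, not the paper's implementation: it models flows of null values as edges in a toy graph (the node names, the flow facts, and the bug_neighborhood helper are all hypothetical), and collects the pairs induced by Maybe NPRs and Maybe NPAs. Forward NPRs and Backward NPAs, which depend on how the attempted fix changes the program, are omitted from the sketch.

```python
# Minimal sketch, not the paper's implementation: null flows as edges in a
# toy graph. Maybe NPRs are other dereferences forward-reachable from NPA1;
# Maybe NPAs are other null sources that backward-reach NPR1.
from collections import deque

def reachable(graph, start):
    """Nodes reachable from `start` by following one or more edges."""
    seen, work = set(), deque(graph.get(start, ()))
    while work:
        n = work.popleft()
        if n not in seen:
            seen.add(n)
            work.extend(graph.get(n, ()))
    return seen

def invert(graph):
    """Reverse every edge of the flow graph."""
    rev = {}
    for src, dsts in graph.items():
        for dst in dsts:
            rev.setdefault(dst, set()).add(src)
    return rev

def bug_neighborhood(flow, npas, nprs, npa1, npr1):
    """(NPA, NPR) pairs induced by Maybe NPRs and Maybe NPAs for (npa1, npr1)."""
    maybe_nprs = {d for d in reachable(flow, npa1) if d in nprs and d != npr1}
    maybe_npas = {a for a in reachable(invert(flow), npr1) if a in npas and a != npa1}
    return {(npa1, d) for d in maybe_nprs} | {(a, npr1) for a in maybe_npas}

# Hypothetical flow facts: NPA1's null value reaches NPR1 and NPR2, and
# NPR1 is also reachable from a second null source, NPA2.
FLOW = {"NPA1": {"NPR1", "NPR2"}, "NPA2": {"NPR1"}}
```

For this toy graph, the neighborhood of (NPA1, NPR1) is {(NPA1, NPR2), (NPA2, NPR1)}: one pair per Maybe NPR and one per Maybe NPA, as in the definitions above.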
B. Effects of attempted fixes on bug neighborhoods
To illustrate the way in which an attempted bug fix can
change the neighborhood, we present four general types of
attempted fixes for an (NPA, NPR) pair: removing both the
NPA and the NPR, removing the flow between the NPA and
the NPR, removing only the NPA, and removing only the
NPR. For each type, we provide examples of changes that
we have seen in practice to fix the (NPA, NPR) pair, show
how the change affects the bug neighborhood, and discuss
how the computation of the bug neighborhood is affected
for specific types of attempted fixes.
1) Removing both the NPA and the NPR: A fix for an
(NPA, NPR) pair might result in the deletion of both the
NPA and the NPR. This fix could occur, for example, if the
method containing the pair is deleted from P in creating
P′. In such cases, no null-pointer exceptions associated with
either the NPA or the NPR could occur in P′. Therefore, the
bug neighborhood for (NPA, NPR) in P′ will be empty, and
thus, the fix is a complete fix.
2) Removing the flow between the NPA and the NPR: A
fix for an (NPA, NPR) pair might involve removing the flow
from the NPA to the NPR, so that the NPR is not reachable
from the NPA in P′. This fix could occur, for example, if
the NPA is in method M₁, the NPR is in method M₂, and the
method call in M₁ to M₂, on which the NPA flows, is deleted
from P in creating P′. In this case, there may still be Maybe
Figure 3. Potential bug neighborhoods after attempted fixes: (a) removing the flow between the NPA and the NPR, (b) removing only the NPA, and (c)
removing only the NPR.
NPAs and Maybe NPRs to consider, and thus, the analysis
will search for them in P′. Additionally, because neither
the NPA nor the NPR is removed, the Backward NPAs and
Forward NPRs must be considered in the computation of the
bug neighborhood.
Figure 3(a) illustrates the potential components of the bug
neighborhood that result for this type of attempted fix. In the
figure, the flow between NPA₁ and NPR₁ has been removed
by the change. Because of this, Backward NPAs₁ may now
reach NPR₁ and Forward NPRs₁ may now be reached from
NPA₁; all these must be considered in the computation of the
bug neighborhood. Moreover, the Maybe NPRs reachable
from NPA₁ (i.e., Maybe NPR₂ and Maybe NPR₃), along
with their Forward NPRs, must be considered. Similarly,
the Maybe NPAs that reach NPR₁ (i.e., Maybe NPA₂ and
Maybe NPA₃), along with their Backward NPAs, must be
considered.
3) Removing only the NPA: A fix for an (NPA, NPR)
pair might involve removing only the NPA so that this NPA
cannot reach the NPR in P′. For example, this fix could
occur if the NPA is deleted from P in creating P′. For
another example, this fix could occur if a new object is
created at the NPA so that a null value cannot flow from
this location to the NPR in P′. In this case, there may still
be Maybe NPAs to consider because they could reach the
NPR. However, because the NPA has been removed, there
are no Maybe NPRs or Forward NPRs to consider. Thus,
the analysis for computing the bug neighborhood is reduced
to searching only for components of the bug neighborhood
related to the NPR.
Figure 3(b) illustrates the potential components of the bug
neighborhood that result for this type of attempted fix. In
the figure, NPA₁ has been removed by the change. Because
NPA₁ is removed, Backward NPAs₁ may now reach NPR₁,
and must be considered. The figure also shows that the
Maybe NPAs that reach NPR₁, along with the Backward
NPAs associated with them, must also be considered.
4) Removing only the NPR: A fix for an (NPA, NPR)
pair might involve removing only the NPR so that this NPR
cannot be reached by the NPA in P′. For example, this fix
could occur if the NPR is deleted from P in creating P′.
For another example, this fix could occur if a condition is
added around the NPR in P′ so that the NPA cannot flow
to the NPR in P′. In this case, there may still be Maybe
NPRs to consider because they could be reached from the
NPA. However, there are no Maybe NPAs or Backward
NPAs to consider. Thus, the analysis for computing the bug
neighborhood is reduced to searching only for components
of the bug neighborhood related to the NPA.
Figure 3(c) illustrates the potential components of the bug
neighborhood that result for this type of attempted fix. In the
figure, NPR₁ has been removed by the change. Therefore,
Forward NPRs₁ may now be reached by NPA₁, and must be
considered. The figure also shows that the Maybe NPRs that
are reachable from NPA₁, and the Forward NPRs associated
with them, must also be considered.
III. AUTOMATED ANALYSIS
In this section, we present the bug-neighborhood analysis
that, given a program P and a modified version P′ of P,
identifies the null-dereference bugs in P for which fixes
have been attempted in P′, and classifies each attempted fix
as complete or incomplete. The analysis leverages the null-
dereference analysis, implemented in XYLEM [5]. First, we
provide an overview of the XYLEM analysis (Section III-A);
Reference [5] contains details. Then, we present the bug-
neighborhood analysis (Section III-B).
Before presenting the analysis, we present an example
we use to illustrate the XYLEM and the bug-neighborhood
analyses. In Figure 4, the code fragment on the left shows
the original version of function foo(); the fragment on the
right shows a modified version foo’() in which a null check
has been added at line 6.
foo( int i, int j ) {             foo’( int i, int j ) {
[1]   x = null;     // NPA        [1]   x = null;     // NPA
[2]   if ( i == 0 )               [2]   if ( i == 0 )
[3]     x = new C();              [3]     x = new C();
[4]   y = x;                      [4]   y = x;
[5]   if ( j > 10 ) {             [5]   if ( j > 10 ) {
                                  [6]     if ( x != null )
[7]     x.m1();     // NPR        [7]       x.m1();
[8]     x.m2();     // F-NPR      [8]     x.m2();     // F-NPR
      } else {                          } else {
[9]     if ( i == 0 )             [9]     if ( i == 0 )
[10]      y.m();                  [10]      y.m();
[11]    x.m3();     // M-NPR      [11]    x.m3();     // M-NPR
[12]    x.m4();     // M-F-NPR    [12]    x.m4();     // M-F-NPR
      }                                 }
}                                 }
Figure 4. Example to illustrate XYLEM and bug-neighborhood analyses.
A. Overview of Xylem Analysis
Starting at a statement s that dereferences a variable v, the
XYLEM analysis traverses backward to identify a path over
which a null value for v can flow to s. During the analysis,
it propagates a set of abstract state predicates backward in
the interprocedural control-flow graph.² The analysis starts
with a predicate asserting that v is null at s, and updates
the state during the path traversal. If the updated state becomes
inconsistent, no null value for v can flow to s along that path.
Thus, the analysis does not traverse further along that path.
Consider the application of the XYLEM analysis for the
dereference of x at line 7 in foo(). The analysis initializes
the state to contain the predicate x == null. Traversing
backward, the analysis reaches the conditional statement at
line 5 and adds j > 10 to the state. The statement at
line 4 does not update the state. There are two backward
paths from line 4. Along the path to line 3, the analysis
finds an assignment of a new object to x, which contradicts
the existing predicate x == null; therefore, the traversal
along this path stops. Along the other backward path from
line 4, the analysis reaches conditional statement 2 along
its false branch, and adds i != 0 to the state. Next, the
analysis reaches line 1, where it finds the null assignment to
x, and identifies statement 7 as an NPR.
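The backward traversal just described can be illustrated with a small sketch. The following is a minimal Python sketch under simplifying assumptions, not the XYLEM tool: statements are encoded as (kind, variable) pairs in a hypothetical encoding, and a plain visited set stands in for XYLEM's predicate-state bookkeeping, so the sketch is far less precise than the path- and context-sensitive analysis described above.

```python
# Minimal sketch of XYLEM-style backward traversal, not the real tool.
# A path is pruned when a non-null assignment to `var` contradicts the
# "var is null" predicate; a null assignment to `var` identifies an NPA.

def find_npas(nodes, preds, deref_id, var):
    """Null assignments (NPAs) that can reach the dereference of `var`."""
    npas = []
    stack = [deref_id]
    visited = set()
    while stack:
        nid = stack.pop()
        if nid in visited:
            continue
        visited.add(nid)
        kind, v = nodes[nid]
        if nid != deref_id and v == var:
            if kind == "assign_new":   # new object contradicts "var is null"
                continue               # prune this backward path
            if kind == "assign_null":  # reaching NPA found
                npas.append(nid)
                continue
        stack.extend(preds.get(nid, ()))
    return sorted(npas)

# Backward CFG for foo() of Figure 4, restricted to the statements that
# matter for the dereference at line 7 (a hand-built model of the example).
NODES = {
    1: ("assign_null", "x"),
    2: ("branch", None),
    3: ("assign_new", "x"),
    4: ("copy", "y"),
    5: ("branch", None),
    7: ("deref", "x"),
}
PREDS = {7: [5], 5: [4], 4: [3, 2], 3: [2], 2: [1]}
```

Starting from the dereference at line 7, the path through line 3 is pruned at the x = new C() assignment, while the other backward path reaches the NPA at line 1, mirroring the traversal in the paragraph above.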
B. Bug-neighborhood Analysis
Figure 5 presents BugNeighborhoodAnalysis, the algorithm
that performs the bug-neighborhood analysis. The
algorithm inputs an (NPA, NPR) pair, (a, d), from the
original program P. The algorithm determines whether a
fix with respect to (a, d) has been attempted. If
not, the bug is reported as “unfixed.” If there has been an
attempted fix, the algorithm determines whether the fix is
complete or incomplete in P′, and reports it as “complete”
or “incomplete,” respectively.
The algorithm first (lines 1–2) maps a and d to the
respective null-assignment and dereference statements (if
²The control-flow graph (CFG) for a method contains nodes
that represent statements and edges that represent the flow of
control between statements. The interprocedural control-flow graph
(ICFG) contains a CFG for each method in the program.
algorithm BugNeighborhoodAnalysis
input: (a, d): (NPA, NPR) pair in the original program P
output: status: “unfixed”, “complete fix”, or “incomplete fix”
declare N′: pairs of reaching NPAs for d′ (mapped from d) and
  reachable NPRs for a′ (mapped from a) in modified program P′
begin
1.  a′ := matching null-assignment statement in P′
2.  d′ := matching dereference statement in P′
3.  N′ := ∅  // initialization
    // check whether d′ has reaching NPAs in P′
4.  if d′ ≠ null then  // d exists in P′
5.    A′ := NPAs identified by the XYLEM analysis starting at d′
6.    foreach a″ ∈ A′ do add (a″, d′) to N′
    // check whether a′ has reachable NPRs in P′
7.  if a′ ≠ null then  // a exists in P′
8.    D′ := dereferences (excluding d′) reachable from a′ in P′
9.    foreach d″ ∈ D′ do
10.     A″ := NPAs identified by the XYLEM analysis starting at d″
11.     if a′ ∈ A″ then add (a′, d″) to N′
12. foreach (NPA, NPR) ∈ N′ do
      find Forward NPRs, Backward NPAs
    // set status for (a, d)
13. if (a′, d′) ∈ N′ then status := “unfixed”
14. else if N′ = ∅ then status := “complete fix”
15. else status := “incomplete fix”
16. return N′, status
end
Figure 5. Algorithm for the bug-neighborhood analysis.
any) in P′. The computation of this mapping is orthogonal
to our analysis; it can be computed using different analyses
(e.g., [10], [11]), which vary in accuracy and cost. After
computing the mapped statements a′ and d′, the remainder
of the algorithm determines whether an attempted fix has
been applied, and if so, whether the fix is complete.
After initializing N′ (line 3), lines 4–6 of
the algorithm check whether there exist reaching NPAs for
d′ in P′. If there exists a matching dereference statement
d′ (i.e., d′ is not null), the algorithm performs the XYLEM
analysis, starting at d′, to identify A′—the set of reaching
NPAs (lines 4–5). For all statements a″ in A′, it adds
(a″, d′) to N′ (line 6).
To illustrate, consider applying the algorithm to (NPA,
NPR) pair (1, 7) in function foo(). In foo’(), this NPR has
been fixed by adding the null check in line 6. After matching
the NPR to line 7 in foo’(), the algorithm invokes the
XYLEM analysis, which, because of the added null check,
finds no reaching NPAs. Thus, A′ is empty in this case.
Lines 7–11 of the algorithm determine whether there
exist reachable NPRs for a′. The algorithm computes, in
two steps, the set of NPRs in P′ that can dereference the
null value generated at a′. In the first step, the algorithm
performs a forward reachability analysis, starting at a′, to
identify reachable dereferences of the null value generated
at a′ (line 8). In the second step (lines 9–10), for each
dereference identified in the first step, the algorithm uses
the XYLEM analysis to determine whether, for a dereference
d″, a′ is identified as an NPA. If this is the case, the
algorithm adds the (a′, d″) pair to N′ (line 11).
The benefit of the two-step approach is that the second
step, using the XYLEM analysis, can potentially filter out
infeasible reachable dereferences that may be identified in
the first step [9]. Line 12 of the algorithm uses similar
backward and forward analyses to find the Forward NPRs
and the Backward NPAs for each pair in N′,
and adds these pairs to N′.
Lines 13–15 of the algorithm determine and set the value
of status. If (a′, d′) is in N′, then there
is still a flow of a null value from the pair associated
with (a, d), and thus, the pair has not been fixed. Thus,
the algorithm sets status as “unfixed.” Otherwise, if
N′ is empty, the algorithm sets status
as “complete fix”; else (N′ is not empty),
the algorithm sets status as “incomplete fix.” Finally,
in line 16, the algorithm returns N′ and
status.
Consider again the invocation of the algorithm for pair
(1, 7) in foo(). Starting at line 1 (a′) in foo’(), the
algorithm performs forward reachability analysis to identify
dereferences 7, 8, 10, 11, and 12. For each of these
dereferences, the algorithm invokes XYLEM. For dereferences 8,
11, and 12, the XYLEM analysis identifies statement 1 as a
reaching NPA, whereas for dereferences 7 and 10, it finds
no reaching NPA. Therefore, at line 13 of the algorithm,
N′ = {(1, 8), (1, 11), (1, 12)}: statement 8
is a Forward NPR that has been uncovered by the attempted
fix, statement 11 is an unfixed Maybe NPR, and statement 12
is a Forward NPR associated with statement 11. Thus, a fix
was attempted for pair (1, 7) because the pair does not occur
in N′. But the fix is incomplete because
N′ is not empty.
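The classification performed at lines 13–15 of the algorithm can be sketched as follows. This is a minimal Python sketch with hypothetical names; the real algorithm first computes the neighborhood N′ interprocedurally, whereas here it is simply a set of pairs.

```python
# Minimal sketch (hypothetical names) of lines 13-15 of
# BugNeighborhoodAnalysis: classify the attempted fix from the mapped
# pair (a', d') and the computed bug neighborhood N'.

def classify_fix(mapped_pair, neighborhood):
    """Classify the attempted fix for the pair mapped into P'."""
    if mapped_pair in neighborhood:
        return "unfixed"          # null still flows along the mapped pair
    if not neighborhood:
        return "complete fix"     # nothing left in the bug neighborhood
    return "incomplete fix"       # related pairs can still raise exceptions
```

For pair (1, 7) of Figure 4, the neighborhood {(1, 8), (1, 11), (1, 12)} does not contain the mapped pair but is non-empty, so the fix is classified as incomplete, matching the walkthrough above.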
IV. EMPIRICAL EVALUATION
To evaluate our approach, we implemented it, and con-
ducted two empirical studies to investigate the occurrences
of bug neighborhoods and the completeness of attempted
bug fixes. After describing the experimental setup (Sec-
tion IV-A), we present the results of the two studies (Sec-
tions IV-B and IV-C). Finally, we discuss threats to the
validity of our empirical evaluation (Section IV-D).
A. Experimental Setup
Our experimental subjects consist of three open-source
projects (Ant-1.6.0, Lucene-2.2.0, and Tomcat-4.1.27)³ and
three proprietary commercial products (referred to as App-
A, App-B, and App-C). Table I lists the subjects, along with
the numbers of classes, methods, bytecode instructions, and
(NPA, NPR) pairs in each subject.
We integrated our analysis into the XYLEM tool. The im-
plementation consists of two main components: the mapping
component and the bug-neighborhood analysis component.
Suppose, for our discussion, that the null-pointer bug in P
is represented by the pair (NPAₚ, NPRₚ).
³Available at apache.org
Table I
SUBJECTS USED IN THE EMPIRICAL STUDIES.
Subject        Classes  Methods  Bytecode Instructions  (NPA, NPR) Pairs
Ant-1.6.0         1858    17204                 443254               167
Lucene-2.2.0       381     2815                  72691                86
Tomcat-4.1.27      260     4077                 101075                97
App-A              278     3933                  98225                63
App-B              169     1876                  46286               119
App-C             2488    13746                 340896               107
Table II
EIGHT BUG-NEIGHBORHOOD CATEGORIES FOR AN (NPA, NPR) PAIR.
Category  Maybe NPA Present  Maybe NPR Present  Forward NPR Present
   1             no                 no                  no
   2             no                 no                  yes
   3             no                 yes                 no
   4             no                 yes                 yes
   5             yes                no                  no
   6             yes                no                  yes
   7             yes                yes                 no
   8             yes                yes                 yes
The mapping component maps statements (bytecode
instructions) based on their signatures and order of occurrence
in the source code. The result is (1) a mapping between
(NPAₚ, NPRₚ) and a pair in P′ or (2) an indication that there
is no matched statement in P′ for either NPAₚ or NPRₚ.
The component for performing the bug-neighborhood
analysis uses the XYLEM analysis to identify the
bug neighborhood for the pair in P′ associated with
(NPAₚ, NPRₚ). The component implements algorithm
BugNeighborhoodAnalysis (Figure 5), and performs the
appropriate backward and forward analyses to find the
Maybe NPAs, Maybe NPRs, and Forward NPRs associated
with the mapped pair in P′. Currently, the implementation
does not compute Backward NPAs.
Using these NPAs and NPRs in P′ and the pair in P′
mapped from (NPAₚ, NPRₚ), the implementation creates
new null-pointer bug pairs in P′ and adds them to the bug
neighborhood for the mapped pair in P′. Using the bug
neighborhood for the mapped pair in P′, the implementation
classifies (NPAₚ, NPRₚ) as not fixed (i.e., no attempt was
made to fix the bug), attempted but incomplete, or complete.
B. Study 1: Neighborhood Categories and Sizes
1) Goal and Method: The goal of this study is to examine
the consistency and sizes of bug neighborhoods in practice.
To do this, we ran our implementation on each subject
(Table I). For each (NPA, NPR) pair in a subject, we
computed the bug neighborhood and the neighborhood size.
To understand the consistency of bug neighborhoods, we
created a classification that consists of eight categories,
shown in Table II. Each category is defined based on whether
a Maybe NPA, a Maybe NPR, or a Forward NPR is present
in a neighborhood (recall that our current implementation
Figure 6. Number of occurrences of bug-neighborhood categories,
aggregated over the subjects. Each bar segment shows the number of (NPA,
NPR) pairs in a bug-neighborhood category that occur in a subject.
does not compute Backward NPAs). For example, Cate-
gory 1 represents the simplest neighborhood, in which no
Maybe NPAs, Maybe NPRs, or Forward NPRs occur. Thus,
for this category, each type of attempted fix shown in
Figure 3 is a complete fix. Category 8 represents the most
complex neighborhoods, in which all three of Maybe NPAs,
Maybe NPRs, and Forward NPRs are present.
To present the data on the sizes of the bug neighborhoods,
we created three groups of neighborhood sizes: small (five
or fewer (NPA, NPR) pairs), medium (greater than five but
not greater than 15 (NPA, NPR) pairs), and large (greater
than 15 (NPA, NPR) pairs).
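The two groupings used in this study can be sketched directly. The helper names below are hypothetical; the sketch maps the presence of the three component types to the eight categories of Table II, and a neighborhood size to the three size groups.

```python
# Sketch of the bookkeeping for Study 1 (hypothetical helper names).

def category(has_maybe_npa, has_maybe_npr, has_forward_npr):
    """Category 1-8, reading Table II's yes/no columns as a 3-bit number."""
    return 1 + ((int(has_maybe_npa) << 2) |
                (int(has_maybe_npr) << 1) |
                int(has_forward_npr))

def size_group(num_pairs):
    """Size groups used in Study 1: small (<= 5), medium (6-15), large (> 15)."""
    if num_pairs <= 5:
        return "small"
    if num_pairs <= 15:
        return "medium"
    return "large"
```

With this encoding, Category 1 (no/no/no) through Category 8 (yes/yes/yes) reproduce the rows of Table II in order.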
2) Results and Analysis: First, we present the data on oc-
currences and distribution of bug-neighborhood categories.
Then, we present the data on bug-neighborhood sizes.
Figure 6 presents data that show the number of occur-
rences of each neighborhood category, aggregated over the
subjects. The horizontal axis lists the eight categories; the
vertical axis represents the number of (NPA, NPR) pairs
whose neighborhoods are of a particular category. The data
for a category are represented as a segmented bar, in which
the segments represent the number of pairs in each subject;
the height of the bar shows the total number of pairs for the
category. The order of the subjects is the same in all the bars:
Ant is the segment at the bottom, App-C is the segment at
the top. Over all subjects, neighborhood Category 1 occurs
63 times. Of these, 20 occur in Ant, 11 occur in Lucene, and
9 occur in Tomcat; App-A, App-B, and App-C contain 3, 6,
and 14 Category 1 neighborhoods, respectively. Category 8
occurs 189 times in total, of which 91 occur in Ant, 26 in
Tomcat, 68 in App-B, and 4 in App-C.
The data illustrate that Category 8 neighborhoods, which
are the most complex neighborhoods, occur most frequently
in our subjects. Category 4, which is also fairly complex
(neighborhoods of this category contain Maybe NPRs and
Forward NPRs), has the second highest occurrence. Another
interesting observation is that neighborhood categories that
have Maybe NPRs but no Forward NPRs (represented as
the sum of Categories 3 and 7) occur often. For such
neighborhoods, a developer can target the NPR and Maybe
NPRs in an attempted fix, without being concerned about
whether the fix would unmask Forward NPRs.
In Figure 7, we present a different view of the data. The
horizontal axis lists the subjects. For each subject, the chart
contains one bar for each category. The vertical axis, as
in Figure 6, represents the number of (NPA, NPR) pairs
whose bug neighborhoods are of a particular category. The
segments in each bar show the number of pairs that have
small, medium, and large bug neighborhoods.
As the data illustrate, Ant and App-B contain a large
percentage of Category 8 bug neighborhoods: more than
half their (NPA, NPR) pairs have neighborhoods of Cate-
gory 8. Two subjects (Ant and App-C) contain pairs for all
neighborhood categories; all subjects except App-B have at
least seven of the eight categories. App-C has a different
distribution of neighborhood categories from all the other
subjects: most of its (NPA, NPR) pairs belong to Category 7
whereas, in all other subjects, Category 7 is one of the
categories with the fewest (NPA, NPR) pairs.
The data also show a relationship between bug-
neighborhood size and bug-neighborhood categories: all
large neighborhoods occur in the most complex neighbor-
hood categories (Categories 4 and 8). Medium neighbor-
hoods occur in categories that contain Maybe NPAs or
Maybe NPRs (Categories 3 to 8); small neighborhoods are
distributed across all categories. With two exceptions, the
large neighborhoods in our subjects have fewer than 35 (NPA,
NPR) pairs: Ant has a Category 8 neighborhood of size 72
and Lucene has a Category 4 neighborhood of size 46.
Overall, the data indicate that complex bug neighborhoods
can occur frequently in practice and be large in size. There-
fore, an approach, such as ours, that points the developer to
bug-neighborhood statements can be useful in practice as it
provides context information about a bug that can help the
developer ensure that an attempted fix is complete.
C. Study 2: Completeness of Attempted Bug Fixes
1) Goal and Method: The goal of this study is to inves-
tigate the existence and frequency of incomplete fixes. We
considered successive releases of the subjects, and identified
fixes made between each pair of a release and its successive
release. For each P and P′,⁴ we ran the analysis to compute
the bugs in P for which fixes were attempted in P′ and the
classification of each fix as complete or incomplete.
Our mapping component for identifying (NPA, NPR) pair
association in P and P′ produces some false positives
⁴For consistency in terminology, we refer to the releases in a pair as the
original program, denoted by P, and the modified program, denoted by P′.
Figure 7. Number of occurrences of bug-neighborhood categories per subject. The bar segments show the number of (NPA,
NPR) pairs in a bug-neighborhood category that have small, medium, and large neighborhoods.
Table III
NUMBER OF INCOMPLETE BUG FIX ATTEMPTS AND THEIR
NEIGHBORHOOD CATEGORIES AND SIZES.
          Incomplete/        Neighborhood Categories  Neighborhood Sizes
Subject   Attempted Fixes    of Incomplete Fixes      of Incomplete Fixes
Ant           4 / 26             2, 4, 4, 5           Small, 3 x Medium
Lucene        3 / 17             4, 4, 4              Small, 2 x Large
Tomcat        0 / 9              —                    —
App-A         0 / 7              —                    —
(reported as attempted fixes but not actually attempted).
Because we want to report only precise results, we manually
verified our analysis results for four subjects—Ant, Lucene,
Tomcat, App-A—and removed these false positives. We
report the detailed results for these four subjects.
2) Results and Analysis: Table III illustrates the num-
ber of incomplete and attempted bug fixes and shows the
neighborhood categories and neighborhood sizes of the
incomplete fixes. In Ant, four of 26 attempted bug fixes are
incomplete; in Lucene, three of 17 attempted bug fixes are
incomplete. Five of the seven incomplete fixes belong to
Category 4; the other two incomplete fixes belong to Category 2
and Category 5. The (NPA, NPR) pairs of the incomplete
fixes have small, medium, and large bug neighborhoods; the
largest neighborhood has size 36. In Tomcat and App-A, all
attempted bug fixes—9 and 7 respectively—are complete.
The data from this study show that, in practice, attempted
bug fixes can be incomplete and, thus, null-pointer bugs
related to the attempted fix of one null-pointer bug can
occur in other executions of the program. Moreover, the bug
neighborhoods of these incomplete fixes can be large and,
therefore, difficult to identify manually. For example, for
two of the incomplete fixes, the bug neighborhoods were
larger than 30. Without automated analysis, determining
whether these additional pairs related to the (NPA, NPR)
exist, and then locating these pairs manually can be a time-
consuming task. Even in the cases where the fix is complete,
the developer may want to know this to avoid searching for
related NPAs and NPRs manually.
D. Threats to Validity
There are several threats to the validity of the evaluation.
Threats to internal validity arise when factors affect the
dependent variables without the researchers’ knowledge. In
our case, our implementation could have flaws that would
affect the accuracy of the results we obtained. We have
used the XYLEM analysis for many experiments and checked
its results manually. Thus, we are confident in the results
it produced. We discovered that, with the mapping we
used in this implementation, our results contained some
false positives. Thus, we verified our results manually, and
reported only those results in Study 2.
Threats to external validity arise when the results of the
experiment cannot be generalized to other situations. One
external threat involves the ability to generalize our results
to other programs. In our study, we used six programs and
thus, we are unable to conclude that our results will hold
in general. However, these programs are used in practice
and are somewhat diverse. Thus, we can conclude that
incomplete fixes for null-pointer bugs do exist in practice,
and that their bug neighborhoods can be significant in size.
Another external threat concerns the ability to generalize
the results for null-pointer exceptions to other types of
bugs that result from incorrect flows of values. We believe
that a similar analysis could provide information for other
related types of bugs. However, only an implementation and
experimentation with such other bugs would confirm our
belief in the applicability of our technique.
V. RELATED WORK
In previous work [9], we presented an approach for
identifying and fixing faults that cause runtime exceptions.
Given a statement at which a null-pointer exception occurs,
the approach uses the XYLEM analysis, guided by the stack
trace of the failing execution, to identify definite and possible
NPAs that could have caused the exception. The approach
also identifies maybe NPAs and maybe NPRs that could
cause exceptions in other executions. Unlike our previous
work, in this paper, we investigate how bugs are fixed. We
extend our previous approach to define a bug neighborhood
and characterize the completeness of a bug fix in terms of the
bug neighborhood. We also present analysis that compares
two code versions to identify attempted fixes and classify
each fix as complete or incomplete.
There has been much research on mining bug and source-
code repositories to study the history of bug fixes. Infor-
mation about past bug fixes is used for many applications,
such as assessing and improving the effectiveness of static
bug detection (e.g., [7], [12]), prioritizing warnings (e.g.,
[13], [14]), recommending bug fixes and pertinent software
artifacts (e.g., [15], [16], [17]), assigning defect reports
to developers (e.g., [18]), identifying code changes that
subsequently lead to bug fixes (e.g., [19]), and estimating
the fault-proneness of code components (e.g., [20], [21]). We
discuss a sample of this research that is related to different
aspects of our work.
The main distinguishing aspect of our work from the
previous research is that none of the existing work has
investigated whether bugs fixes are complete and how in-
complete bug fixes could be made complete. Much of this
previous research has focused on the bugs reported in bug
repositories, which are more general than the class of bugs
(involving flows of invalid values) to which our approach is
applicable. Determining completeness for general bugs, such
as those related to application functionality, is not addressed
by our approach.
Ayewah and colleagues [7] investigated, for three software
systems, how the bugs reported by FINDBUGS [22] were
fixed across different builds of the system. They manually
classified the bugs as those that could have little, some,
or substantial functional impact. They also examined the
types of program changes that caused the bugs to be fixed
and determined whether such a change was localized and
intended to remove the bug. However, they did not evaluate
whether a bug fix was complete.
Spacco, Hovemeyer, and Pugh [8] examined the evolution
of bugs over successive builds of Sun’s JDK core runtime
library. They presented two approaches for mapping bugs
from one version to another, and reported trends about defect
lifetimes, decay in defects reported, and defect density over
successive builds. However, they performed no evaluation of
whether the bugs that were removed were completely fixed.
Williams and Hollingsworth [12] developed a checker to
detect bugs where the return value of a function was used
before being tested (e.g., dereferencing a returned pointer
without performing a null check). They also developed a
ranking technique that orders the warnings based on whether
the relevant function was involved in a past bug fix that
added a check on its return value, and how often the return
value of the function is tested before use. In their work, a
notion of completeness similar to ours could be used: if a
bug fix added a check on the return value of a function at one
call site, but not at another call site to the same function, the
fix could be marked incomplete. However, the goal of that
work was not to classify fixes as complete or incomplete.
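The call-site scenario described above can be sketched as follows (hypothetical code; the attempted fix guards one call site of the function but misses the other, so under our notion of completeness it would be classified incomplete):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of an incomplete fix: a null check was added at one
// call site of findUser but not at the other, so the same failure remains
// reachable and the fix would be marked incomplete.
public class IncompleteFix {
    static String findUser(Map<String, String> db, String id) {
        return db.get(id);                       // may return null for unknown ids
    }

    // Call site 1: the attempted fix added a null check here.
    static int nameLength(Map<String, String> db, String id) {
        String name = findUser(db, id);
        if (name == null) {                      // fix applied at this call site
            return 0;
        }
        return name.length();
    }

    // Call site 2: no check was added, so the same NPE can still occur.
    static String upperName(Map<String, String> db, String id) {
        return findUser(db, id).toUpperCase();   // NPE when id is unknown
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("u1", "Ada");
        System.out.println(nameLength(db, "nope")); // guarded call site: prints 0
    }
}
```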
Existing work on recommender systems performs several
activities: suggests fixes for bugs based on patterns of past
fixes [15]; suggests relevant software artifacts, based on a
memory of the artifacts created during the development of a
software system, that should be examined to perform a task,
such as fixing a bug [16]; or provides context information
for a bug, based on mining information (from different data
repositories) about how similar bugs have been fixed in the
past [17]. Our work is related to such approaches in that it
can be used to recommend parts of the code that should
be examined to ensure that a fix is complete. However,
unlike these techniques, our current approach is not based
on mining information from software repositories but on an
analysis of the code.
Researchers have explored using the history of bug fixes
to prioritize new bugs based on machine learning [13] or
statistical modeling [14]. Our approach could be used to
infer correlations between characteristics of bug neighbor-
hoods and the likelihood of attempted fixes, which could
be used to prioritize a new bug based on its neighborhood
characteristics.
VI. CONCLUSION AND FUTURE WORK
In this paper, we presented a new approach for automati-
cally identifying incomplete bug fixes in Java programs. We
examined bugs that involve a flow of invalid values causing
a runtime exception. We introduced the concept of a bug
neighborhood that lets us categorize bugs and also
lets us determine the completeness
of attempted bug fixes: if a bug neighborhood is not empty
after an attempted fix, the fix is incomplete.
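Assuming a bug neighborhood can be represented as a set of (NPA, NPR) pair identifiers, the classification rule can be sketched as follows (a simplified sketch, not our actual implementation):

```java
import java.util.Set;

// Hypothetical sketch: classify an attempted fix by comparing the bug
// neighborhood (a set of "NPA->NPR" pair identifiers) before and after the fix.
public class FixClassifier {
    enum Verdict { COMPLETE, INCOMPLETE, NO_FIX_ATTEMPTED }

    static Verdict classify(Set<String> before, Set<String> after) {
        if (after.isEmpty()) {
            return Verdict.COMPLETE;         // the entire neighborhood was addressed
        }
        if (after.containsAll(before)) {
            return Verdict.NO_FIX_ATTEMPTED; // nothing in the neighborhood changed
        }
        return Verdict.INCOMPLETE;           // some pairs remain after the fix
    }

    public static void main(String[] args) {
        System.out.println(classify(Set.of("p1"), Set.of()));
    }
}
```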
Our approach can assist debugging in two different
scenarios. During maintenance, our approach can identify
incomplete bug fixes by analyzing two (or more) revisions
of a program. If incomplete fixes are found, developers can
utilize the remaining information in the bug neighborhood
to complete the incomplete fixes. During development, our
approach can check whether a non-empty bug neighborhood
remains before or after developers attempt to fix a bug, and
guide them through the process of completing the fixes.
Thus, our approach can prevent the introduction of incom-
plete fixes into new revisions.
To evaluate our approach, we implemented it and conducted
two empirical studies on three open-source and three
industrial subjects. Our analysis results indicate that, for the
subjects considered, large and complex bug neighborhoods
occur frequently, and attempted fixes can be incomplete.
There are several areas of future work; we discuss some
of them here. First, we will address the imprecision of the
mapping component we use to identify the association of the
(NPA, NPR) pairs in the two program versions. We will
investigate whether other mapping techniques, such as
JDiff [10], can produce better and more reliable results.
Second, we will gather more analysis results from other
subjects and we will investigate several interesting open
questions: is there a correlation between incomplete bug
fixes and the size of a bug neighborhood? Is there a correla-
tion between incomplete bug fixes and a bug-neighborhood
category? Are incomplete bug fixes in open-source projects
different from incomplete fixes in industrial projects? Do
incomplete fixes occur more frequently in open-source soft-
ware than in industrial projects?
Third, we will investigate other types of bugs that involve
a flow of invalid values that cause a runtime exception.
We will investigate whether there exist well-defined fault
categories and fix categories for these types of bugs, and how
these categories correlate. Our goal is to support developers
further in addressing incomplete bug fixes. For some fault
categories, we may be able to apply the fixes automatically;
for others, we may be able to recommend
possible fixes to developers or, at least, provide additional
information that helps the developers to fix the bug.
Finally, we will enhance our analysis to guide developers
in addressing an incomplete fix and making it complete. We
will investigate prioritizing (NPA, NPR) pairs using several
approaches, and presenting the neighborhoods using intuitive
visualization techniques. Such techniques may reduce the
manual process in debugging and save time and cost.
ACKNOWLEDGMENT
This work was supported in part by awards from NSF
under CCF-0429117, CCF-0541049, and CCF-0725202, and
by an IBM Software Quality Innovation Faculty Award.
REFERENCES
[1] C. Flanagan, K. R. M. Leino, M. Lillibridge, G. Nelson, J. B.
Saxe, and R. Stata, “Extended static checking for Java,” in
Proc. of PLDI, Jun. 2002, pp. 234–245.
[2] D. Hovemeyer and W. Pugh, “Finding more null pointer bugs,
but not too many,” in Proc. of PASTE, Jun. 2007, pp. 9–14.
[3] D. Hovemeyer, J. Spacco, and W. Pugh, “Evaluating and
tuning a static analysis to find null pointer bugs,” in Proc.
of PASTE, Sep. 2005, pp. 13–19.
[4] A. Loginov, E. Yahav, S. Chandra, N. Fink, S. Rinetzky, and
M. G. Nanda, “Verifying dereference safety via expanding-
scope analysis,” in Proc. of ISSTA, Jul. 2008, pp. 213–223.
[5] M. G. Nanda and S. Sinha, “Accurate interprocedural null-
dereference analysis for Java,” in Proc. of ICSE, May 2009,
pp. 133–144.
[6] A. Tomb, G. Brat, and W. Visser, “Variably interprocedural
program analysis for runtime error detection,” in Proc. of
ISSTA, Jul. 2007, pp. 97–107.
[7] N. Ayewah, W. Pugh, J. D. Morgenthaler, J. Penix, and
Y. Zhou, “Evaluating static analysis defect warnings on
production software,” in Proc. of PASTE, Jun. 2007, pp. 1–8.
[8] J. Spacco, D. Hovemeyer, and W. Pugh, “Tracking defect
warnings across versions,” in Proc. of Intl. Workshop on MSR,
May 2006, pp. 133–136.
[9] S. Sinha, H. Shah, C. Görg, S. Jiang, M. Kim, and M. J.
Harrold, “Fault localization and repair for Java runtime ex-
ceptions,” in Proc. of ISSTA, Jul. 2009, pp. 153–164.
[10] T. Apiwattanapong, A. Orso, and M. J. Harrold, “JDiff: A
differencing technique and tool for object-oriented programs,”
ASE, vol. 14, no. 1, pp. 3–36, Mar. 2007.
[11] S. Raghavan, R. Rohana, D. Leon, A. Podgurski, and V. Au-
gustine, “Dex: A semantic-graph differencing tool for study-
ing changes in large code bases,” in Proc. of ICSM, Sep. 2004,
pp. 188–197.
[12] C. C. Williams and J. K. Hollingsworth, “Automatic mining of
source code repositories to improve bug finding techniques,”
IEEE TSE, vol. 31, no. 6, pp. 466–480, Jun. 2005.
[13] S. Kim and M. Ernst, “Which warnings should I fix first?”
in Proc. of ESEC/FSE, Sep. 2007, pp. 45–54.
[14] J. R. Ruthruff, J. Penix, J. D. Morgenthaler, S. Elbaum,
and G. Rothermel, “Predicting accurate and actionable static
analysis warnings: An experimental approach,” in Proc. of
ICSE, May 2008, pp. 341–350.
[15] S. Kim, K. Pan, and E. J. Whitehead, Jr., “Memories of bug
fixes,” in Proc. of Intl. Symp. on FSE, Nov. 2006, pp. 35–45.
[16] D. Čubranić, G. C. Murphy, J. Singer, and K. S. Booth,
“Hipikat: A project memory for software development,” IEEE
TSE, vol. 31, no. 6, pp. 446–465, Jun. 2005.
[17] B. Ashok, J. Joy, H. Liang, S. K. Rajamani, G. Srinivasa,
and V. Vangala, “DebugAdvisor: A recommender system for
debugging,” in Proc. of ESEC/FSE, Aug. 2009, pp. 373–382.
[18] J. Anvik, L. Hiew, and G. C. Murphy, “Who should fix this
bug?” in Proc. of ICSE, May 2006, pp. 361–370.
[19] S. Kim, T. Zimmermann, K. Pan, and E. J. Whitehead Jr.,
“Automatic identification of bug-introducing changes,” in
Proc. of ASE, Sep. 2006, pp. 81–90.
[20] T. L. Graves, A. F. Karr, J. S. Marron, and H. Siy, “Predicting
fault incidence using software change history,” IEEE TSE,
vol. 26, no. 7, pp. 653–661, Jul. 2000.
[21] T. J. Ostrand, E. J. Weyuker, and R. M. Bell, “Predicting
the location and number of faults in large software systems,”
IEEE TSE, vol. 31, no. 4, pp. 340–355, Apr. 2005.
[22] D. Hovemeyer and W. Pugh, “Finding bugs is easy,” ACM
SIGPLAN Notices (Proceedings of Onward! at OOPSLA
2004), vol. 39, no. 10, pp. 92–106, Dec. 2004.