Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization

Jaekwon Lee¹, Dongsun Kim¹, Tegawendé F. Bissyandé¹, Woosung Jung², Yves Le Traon¹
¹ SnT, University of Luxembourg, Luxembourg
² Seoul National University of Education, South Korea
Bug Localization
!2
Bug Localization
!3
Where should we fix?
Bug Localization
!4
[Figure: a Model, a Bug Report, and a set of code files (Java)]
Bug Localization
[Figure: F(x), three test cases, and an example function]
Function:
    MultiMap<Method, Long> duration = new MultiMap<Method, Long>();
    // run all benchmarks in same order, recording duration
    for (Method m : benchmarks) {
        System.err.println("# " + m.getName() + " benchmarking");
        List<Integer> reps = getReps(min_reps, m);
        for (int r : reps) {
            System.gc();
            long start = System.nanoTime();
            m.invoke(suite, r);
            long stop = System.nanoTime();
            duration.map(m, stop - start);
        }
    }
Fault Localization
Bug Localization
!6
[Figure: a Model, a Bug Report, and a set of code files (Java); the inputs are labeled Source Codes and Bug Report]
Information Retrieval based
Bug Localization (IRBL)
!7
[Figure: Extracting Features from the Source Codes and from the Bug Report: NL tokens, code elements, and meta information]
Information Retrieval based
Bug Localization (IRBL)
!8
[Figure: the features extracted from the Source Codes and the Bug Report (NL tokens, code elements, meta information) are turned into feature vectors]
Information Retrieval based
Bug Localization (IRBL)
!9
[Figure: the feature vectors of the Bug Report and the Source Codes are compared for similarity and ranked; the code files ranked 1, 2, 3, ..., N are recommended]
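The pipeline in these slides (extract features, build feature vectors, compare similarity, rank) can be sketched in a few lines. This is a minimal illustration using a plain TF-IDF vector space model and cosine similarity. It is not the implementation of BugLocator, BLUiR, or any other technique in the study, and the names `tokenize`, `cosine`, and `rank_files` are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase word tokens; camelCase identifiers are split into parts."""
    tokens = []
    for word in re.findall(r"[A-Za-z]+", text):
        # "PolicyDefinition" -> ["policy", "definition"]
        parts = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", word)
        tokens.extend(part.lower() for part in parts)
    return tokens


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_files(bug_report, files):
    """Return file names ordered from most to least similar to the report.

    files maps a file name to its source text (an assumed input format).
    """
    file_tokens = {name: tokenize(src) for name, src in files.items()}
    n = len(file_tokens)
    document_frequency = Counter()
    for tokens in file_tokens.values():
        document_frequency.update(set(tokens))

    def vector(tokens):
        # Terms that occur in no source file carry no signal and are dropped.
        tf = Counter(t for t in tokens if t in document_frequency)
        return {
            t: (1 + math.log(c)) * math.log(1 + n / document_frequency[t])
            for t, c in tf.items()
        }

    query = vector(tokenize(bug_report))
    scores = {name: cosine(query, vector(tokens))
              for name, tokens in file_tokens.items()}
    return sorted(scores, key=lambda name: scores[name], reverse=True)


# Toy inputs (hypothetical file contents, not taken from any project).
files = {
    "PolicyDefinition.java": "class PolicyDefinition { void setOutputs() {} }",
    "GoogleBigQueryProducer.java": "class GoogleBigQueryProducer { void process() {} }",
}
ranking = rank_files("Transacted and Policy should not have outputs", files)
```

Real techniques add further signals on top of this kind of text similarity, such as the bug report fixing history, stack traces, or revision history.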
Information Retrieval based
Bug Localization (IRBL)
!10
!11
Is there any issue?
!12
Are these results mature enough?
Not enough maturity of performance (metric: MAP)
Subjects   BRTracer   BLUiR   AmaLgam   Locus
ZXing      0.445      0.380   0.410     0.502
SWT        0.467      0.560   0.578     0.640
AspectJ    0.264      0.263   0.271     0.320
PDE        0.367      0.349   0.322     0.422
JDT        0.232      0.277   0.282     0.359
Are the subjects still usable?
!13
Out-of-date subjects
Subject: PDE, Eclipse, ZXing, AspectJ, JDT, SWT
#Reports: 98, 286, 20, 60, 98
Period: 2004 - 2016, 2004 - 2010, 2002 - 2006, 2010 - 2010
Evaluation Configuration?
!15
Inconsistent evaluation settings
Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
Settings: version matching, test file inclusion
Study Design
!16
Research Questions
Experiment Data Set
RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
Experiment Configuration
RQ2: What is the impact of version matching on the performance of IRBL techniques?
RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
Potential Improvement
RQ4: What potential performance gain can be reached by leveraging duplicate bug reports?
!19
IRBL Techniques we used
Technique (venue)        IRBL Features        Sub Modules
BugLocator (ICSE 2012)   Full text            Bug report fixing history
BRTracer (ICSME 2014)    Code segmentations   Bug report fixing history, Stack Trace Analysis
BLUiR (ASE 2013)         Identifiers          Bug report fixing history
AmaLgam (ICPC 2014)      Identifiers          Bug report fixing history, Revision history
BLIA (APSEC 2015)        Identifiers          Bug report fixing history, Stack Trace Analysis, Revision history
Locus (ICSE 2016)        Identifiers          Revision history
!20
Subjects
!21
Written in Java
Publicly available bug reports
20+ source code files in one of its versions
Subjects
New Subjects: 46 projects, 690 major versions, 9,459 bug reports, 807 duplicate reports
Old Subjects: 5 projects, 5 major versions, 558 bug reports, 136 duplicate reports
!23
!24
RQ1: The use of old vs. new subjects
[Figure: bug reports, Java code files, and the bug oracle of the New Subjects vs. those of the Old Subjects]
Configuration: single version matching, test files included
!25
RQ2: The importance of version matching
Single version matching vs. multiple version matching
Configuration: new subjects, test files included
!26
RQ3: The impact of test file inclusion
[Figure: bug oracle with test files included (+Test) vs. bug oracle with test files excluded]
Commit Log:
  CAMEL-12558: Transacted and Policy should not have outputs
    M main/java/org/apache/camel/model/PolicyDefinition.java
    M main/java/org/apache/camel/model/TransactedDefinition.java
    A test/java/org/apache/camel/catalog/CamelCatalog.java
    A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
  Added camel-web3j Spring-boot test
    A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
  Update GoogleBigQueryProducer.java
    M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
Configuration: multiple version matching, new subjects
!27
RQ4: Leveraging duplicate bug reports
[Figure: three settings, each with its own bug oracle: master reports, duplicate reports, and merged reports]
Configuration: all subjects, test files included in the bug oracle
Experiment Results
!28
Metrics
!29
MAP (Mean Average Precision): $\mathrm{MAP} = \frac{1}{M} \sum_{j=1}^{M} AP(j)$
MRR (Mean Reciprocal Rank): $\mathrm{MRR} = \frac{1}{M} \sum_{i=1}^{M} \frac{1}{f\text{-}rank_i}$
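As a worked illustration of the two metrics, the sketch below computes AP, MAP, and MRR for a list of ranked results. The function names and the toy data are hypothetical. AP is divided here by the number of buggy files of the report, which is one common convention and an assumption about how the study computes it.

```python
def average_precision(ranked_files, buggy_files):
    """AP(j): mean of the precision values at each rank where a buggy file appears."""
    hits = 0
    precision_sum = 0.0
    for position, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            hits += 1
            precision_sum += hits / position
    return precision_sum / len(buggy_files) if buggy_files else 0.0


def reciprocal_rank(ranked_files, buggy_files):
    """1 / f-rank_i, where f-rank_i is the rank of the first buggy file."""
    for position, name in enumerate(ranked_files, start=1):
        if name in buggy_files:
            return 1.0 / position
    return 0.0


def mean_metrics(results):
    """results: one (ranked_files, buggy_files) pair per bug report (M pairs)."""
    m = len(results)
    mean_ap = sum(average_precision(r, b) for r, b in results) / m
    mean_rr = sum(reciprocal_rank(r, b) for r, b in results) / m
    return mean_ap, mean_rr


# Two toy bug reports over the same ranking of four files.
results = [
    (["A.java", "B.java", "C.java", "D.java"], {"A.java", "C.java"}),
    (["A.java", "B.java", "C.java", "D.java"], {"B.java"}),
]
map_value, mrr_value = mean_metrics(results)
```

For the first report AP = (1/1 + 2/3) / 2 and the first buggy file is at rank 1; for the second report AP = 1/2 and the first buggy file is at rank 2. This gives MAP = 2/3 and MRR = 0.75.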
[Figure: Distribution of MAP values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis from 0.00 to 1.00; values shown: 0.359, 0.35, 0.359, 0.365, 0.363, 0.38]
[Figure: Distribution of MRR values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis from 0.00 to 1.00; values shown: 0.455, 0.501, 0.43, 0.516, 0.497, 0.506]
Baseline Performance
!30
Bug localization still has much room for improvement.
!31
RQ1: The use of old vs. new subjects
Configuration: single version matching, test files included
Summary of MAP/MRR of IRBL techniques
Technique    Old Subjects        New Subjects
             MAP      MRR        MAP       MRR
BugLocator   0.2692   0.3985     ↗0.3052   ↗0.4223
BRTracer     0.2645   0.3664     ↗0.3330   ↗0.4690
BLUiR        0.3102   0.4556     0.2881    0.3869
AmaLgam      0.2950   0.4072     0.2906    0.3899
BLIA         0.2935   0.4242     ↗0.3014   0.4155
Locus        0.2641   0.3399     ↗0.3289   ↗0.4430
!32
The techniques are not over-fitted to the old subjects.
!33
!34
RQ2: The importance of version matching
Configuration: new subjects, test files included
Summary of MAP/MRR of IRBL techniques
Technique    Single Version      Multiple Version
             MAP      MRR        MAP       MRR
BugLocator   0.3052   0.4223     ↗0.3713   ↗0.5075
BRTracer     0.3330   0.4690     ↗0.3992   ↗0.5526
BLUiR        0.2881   0.3869     ↗0.3623   ↗0.4802
AmaLgam      0.2906   0.3899     ↗0.3657   ↗0.4840
BLIA         0.3014   0.4155     ↗0.3777   ↗0.5124
Locus        0.3289   0.4430     ↗0.4217   ↗0.5514
The evaluation/execution of IRBL techniques should apply multiple version matching.
!35
!36
RQ3: The impact of test file inclusion
Configuration: multiple version matching, new subjects
Summary of MAP/MRR of IRBL techniques
Technique    Test files excluded   Test files included
             MAP      MRR          MAP       MRR
BugLocator   0.3811   0.4647       0.3713    ↗0.5075
BRTracer     0.4141   0.5090       0.3992    ↗0.5526
BLUiR        0.3603   0.4385       ↗0.3623   ↗0.4802
AmaLgam      0.3633   0.4420       0.3657    ↗0.4840
BLIA         0.3902   0.4728       ↗0.3777   ↗0.5124
Locus        0.4146   0.5002       ↗0.4217   ↗0.5514
Including test files does not bring bias or noise.
!37
RQ4: Leveraging duplicate bug reports
!38
Summary of MAP/MRR of IRBL techniques
Technique    Master              Duplicate            Merged (Master+Duplicate)
             MAP      MRR        MAP      MRR         MAP       MRR
BugLocator   0.3503   0.5051     0.3259   0.4667      0.3502    ↗0.5249
BRTracer     0.3852   0.5508     0.3776   0.5430      0.3787    ↗0.5692
BLUiR        0.3159   0.4540     0.2804   0.4192      ↗0.3325   ↗0.4728
AmaLgam      0.3202   0.4581     0.2829   0.4223      ↗0.3327   ↗0.4725
BLIA         0.3518   0.4915     0.3231   0.4537      ↗0.3577   ↗0.5041
Locus        0.2915   0.4707     0.2871   ↗0.4724     ↗0.3042   ↗0.5021
!39
Duplicate reports complement master bug reports and guarantee a minimum level of performance.
Summary
!40
!41
Dataset Available
https://github.com/exatoa/Bench4BL
Bug Linking
!42
Bug-Code Linking
[Figure: a Bug Report is linked to the Code Repository through the Commit Log]
Commit Log:
  CAMEL-12558: Transacted and Policy should not have outputs
    M main/java/org/apache/camel/model/PolicyDefinition.java
    M main/java/org/apache/camel/model/TransactedDefinition.java
    A test/java/org/apache/camel/catalog/CamelCatalog.java
    A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
  Added camel-web3j Spring-boot test
    A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
  Update GoogleBigQueryProducer.java
    M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
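One way to derive such links is to look for an issue key in each commit message and collect the Java files touched by that commit. The sketch below assumes that commits mention the key in the `PROJECT-123` form seen in the commit log, and that the log has already been parsed into `(message, changes)` pairs. Both the input format and the name `build_oracle` are hypothetical, and this is not necessarily how the Bench4BL scripts do it.

```python
import re
from collections import defaultdict

# Issue keys such as "CAMEL-12558": uppercase project key, hyphen, number.
BUG_ID = re.compile(r"\b[A-Z][A-Z0-9]*-\d+\b")


def build_oracle(commits):
    """Map each bug ID found in a commit message to the Java files of that commit.

    commits is a list of (message, changes) pairs, where changes is a list of
    (status, path) pairs with status "M" (modified) or "A" (added).
    """
    oracle = defaultdict(set)
    for message, changes in commits:
        for bug_id in BUG_ID.findall(message):
            for _status, path in changes:
                if path.endswith(".java"):
                    oracle[bug_id].add(path)
    return dict(oracle)


# The three commits shown in the commit log.
commits = [
    ("CAMEL-12558: Transacted and Policy should not have outputs", [
        ("M", "main/java/org/apache/camel/model/PolicyDefinition.java"),
        ("M", "main/java/org/apache/camel/model/TransactedDefinition.java"),
        ("A", "test/java/org/apache/camel/catalog/CamelCatalog.java"),
        ("A", "main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java"),
    ]),
    ("Added camel-web3j Spring-boot test", [
        ("A", "test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java"),
    ]),
    ("Update GoogleBigQueryProducer.java", [
        ("M", "main/java/org/apache/camel/component/GoogleBigQueryProducer.java"),
    ]),
]
oracle = build_oracle(commits)
```

Under this rule only the first commit is linked, because the other two messages carry no issue key.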
!43
!44
Bug Oracle
Bug Report 1:
  main/java/org/apache/camel/model/PolicyDefinition.java
  main/java/org/apache/camel/model/TransactedDefinition.java
  test/java/org/apache/camel/catalog/CamelCatalog.java
  main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
Bug Report 2:
  main/java/org/apache/camel/GoogleBigQueryProducer.java
Bug Report 3:
  main/java/org/apache/camel/component/StringConcatenator.java
…….
!45
Version Matching
!46
Version Matching Strategy
Single version matching (previous techniques) vs. multiple version matching
!48
Version Matching Approach
!49
Version Matching Approach
Selecting earliest version
!50
Test Case Inclusion
!51
Test File Inclusion
Commit Log (Code Repository):
  CAMEL-12558: Transacted and Policy should not have outputs
    M main/java/org/apache/camel/model/PolicyDefinition.java
    M main/java/org/apache/camel/model/TransactedDefinition.java
    A test/java/org/apache/camel/catalog/CamelCatalog.java
    A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
  Added camel-web3j Spring-boot test
    A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
  Update GoogleBigQueryProducer.java
    M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
!52
We remove files including “test” or “Test” in a path or filename.
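The removal rule can be written as a plain substring check. The sketch below applies it to the paths from the commit log; the function names `is_test_file` and `exclude_test_files` are hypothetical.

```python
def is_test_file(path):
    """True if the path or file name contains "test" or "Test"."""
    return "test" in path or "Test" in path


def exclude_test_files(paths):
    """Keep only the files that the rule does not classify as test files."""
    return [path for path in paths if not is_test_file(path)]


paths = [
    "main/java/org/apache/camel/model/PolicyDefinition.java",
    "main/java/org/apache/camel/model/TransactedDefinition.java",
    "test/java/org/apache/camel/catalog/CamelCatalog.java",
    "main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java",
    "test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java",
    "main/java/org/apache/camel/component/GoogleBigQueryProducer.java",
]
kept = exclude_test_files(paths)
```

A literal substring match is simple but coarse: it would also drop a non-test file whose name happens to contain the letters "test", such as a hypothetical `LatestVersion.java`.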
!53
Duplicate Report
!54
Duplicate Bug Reports
New Subjects: 46 projects, 690 major versions, 9,459 bug reports, 807 duplicate reports
Old Subjects: 5 projects, 5 major versions, 558 bug reports, 136 duplicate reports
!55
Duplicate Bug Reports
!56
Duplicate Bug Reports
MATH-760 MATH-1192 MATH-2022
MATH-760 MATH-1192
MATH-760 MATH-2022
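The pairs above can be grouped per master report, and the text of a master and its duplicates concatenated into one merged query. This is a minimal sketch under two assumptions: the first ID of each pair is taken to be the master report, and merging is plain text concatenation. The report texts are placeholders, and the names `group_duplicates` and `merged_query` are hypothetical.

```python
from collections import defaultdict


def group_duplicates(pairs):
    """Group (master_id, duplicate_id) pairs by master report."""
    groups = defaultdict(list)
    for master_id, duplicate_id in pairs:
        groups[master_id].append(duplicate_id)
    return dict(groups)


def merged_query(master_id, groups, texts):
    """Concatenate the text of a master report and all of its duplicates."""
    report_ids = [master_id] + groups.get(master_id, [])
    return "\n".join(texts[report_id] for report_id in report_ids)


pairs = [("MATH-760", "MATH-1192"), ("MATH-760", "MATH-2022")]
groups = group_duplicates(pairs)

# Placeholder texts; the real report contents are not part of the slides.
texts = {
    "MATH-760": "text of the master report",
    "MATH-1192": "text of the first duplicate",
    "MATH-2022": "text of the second duplicate",
}
query = merged_query("MATH-760", groups, texts)
```

The merged text can then be used as the query of an IRBL technique in place of the master report alone.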
!57
1 of 57

Recommended

iFixR: Bug Report Driven Program Repair by
iFixR: Bug Report Driven Program RepairiFixR: Bug Report Driven Program Repair
iFixR: Bug Report Driven Program RepairDongsun Kim
554 views35 slides
TBar: Revisiting Template-based Automated Program Repair by
TBar: Revisiting Template-based Automated Program RepairTBar: Revisiting Template-based Automated Program Repair
TBar: Revisiting Template-based Automated Program RepairDongsun Kim
379 views26 slides
AVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations by
AVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis ViolationsAVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis Violations
AVATAR : Fixing Semantic Bugs with Fix Patterns of Static Analysis ViolationsDongsun Kim
468 views25 slides
A Closer Look at Real-World Patches by
A Closer Look at Real-World PatchesA Closer Look at Real-World Patches
A Closer Look at Real-World PatchesDongsun Kim
425 views26 slides
Mining Fix Patterns for FindBugs Violations by
Mining Fix Patterns for FindBugs ViolationsMining Fix Patterns for FindBugs Violations
Mining Fix Patterns for FindBugs ViolationsDongsun Kim
641 views59 slides
You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati... by
You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati...You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati...
You Cannot Fix What You Cannot Find! --- An Investigation of Fault Localizati...Dongsun Kim
529 views29 slides

More Related Content

What's hot

Impact of Tool Support in Patch Construction by
Impact of Tool Support in Patch ConstructionImpact of Tool Support in Patch Construction
Impact of Tool Support in Patch ConstructionDongsun Kim
589 views33 slides
Are current antivirus programs able to detect complex metamorphic malware an ... by
Are current antivirus programs able to detect complex metamorphic malware an ...Are current antivirus programs able to detect complex metamorphic malware an ...
Are current antivirus programs able to detect complex metamorphic malware an ...UltraUploader
223 views19 slides
The Last Line Effect by
The Last Line EffectThe Last Line Effect
The Last Line EffectAndrey Karpov
466 views4 slides
Effective Fault-Localization Techniques for Concurrent Software by
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent SoftwareSangmin Park
837 views66 slides
A hybrid model to detect malicious executables by
A hybrid model to detect malicious executablesA hybrid model to detect malicious executables
A hybrid model to detect malicious executablesUltraUploader
111 views6 slides
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour... by
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...Lionel Briand
2.1K views22 slides

What's hot(20)

Impact of Tool Support in Patch Construction by Dongsun Kim
Impact of Tool Support in Patch ConstructionImpact of Tool Support in Patch Construction
Impact of Tool Support in Patch Construction
Dongsun Kim589 views
Are current antivirus programs able to detect complex metamorphic malware an ... by UltraUploader
Are current antivirus programs able to detect complex metamorphic malware an ...Are current antivirus programs able to detect complex metamorphic malware an ...
Are current antivirus programs able to detect complex metamorphic malware an ...
UltraUploader223 views
Effective Fault-Localization Techniques for Concurrent Software by Sangmin Park
Effective Fault-Localization Techniques for Concurrent SoftwareEffective Fault-Localization Techniques for Concurrent Software
Effective Fault-Localization Techniques for Concurrent Software
Sangmin Park837 views
A hybrid model to detect malicious executables by UltraUploader
A hybrid model to detect malicious executablesA hybrid model to detect malicious executables
A hybrid model to detect malicious executables
UltraUploader111 views
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour... by Lionel Briand
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Known XML Vulnerabilities Are Still a Threat to Popular Parsers ! & Open Sour...
Lionel Briand2.1K views
A preliminary study on using code smells to improve bug localization by krws
A preliminary study on using code smells to improve bug localizationA preliminary study on using code smells to improve bug localization
A preliminary study on using code smells to improve bug localization
krws38 views
Architecture of a morphological malware detector by UltraUploader
Architecture of a morphological malware detectorArchitecture of a morphological malware detector
Architecture of a morphological malware detector
UltraUploader181 views
Detection of vulnerabilities in programs with the help of code analyzers by PVS-Studio
Detection of vulnerabilities in programs with the help of code analyzersDetection of vulnerabilities in programs with the help of code analyzers
Detection of vulnerabilities in programs with the help of code analyzers
PVS-Studio420 views
IRJET- Code Cloning using Abstract Syntax Tree by IRJET Journal
IRJET- Code Cloning using Abstract Syntax TreeIRJET- Code Cloning using Abstract Syntax Tree
IRJET- Code Cloning using Abstract Syntax Tree
IRJET Journal13 views
C and CPP Interview Questions by Sagar Joshi
C and CPP Interview QuestionsC and CPP Interview Questions
C and CPP Interview Questions
Sagar Joshi1.1K views
A tale of experiments on bug prediction by Martin Pinzger
A tale of experiments on bug predictionA tale of experiments on bug prediction
A tale of experiments on bug prediction
Martin Pinzger819 views
Welcome to International Journal of Engineering Research and Development (IJERD) by IJERD Editor
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
IJERD Editor309 views
Xp Day 080506 Unit Tests And Mocks by guillaumecarre
Xp Day 080506 Unit Tests And MocksXp Day 080506 Unit Tests And Mocks
Xp Day 080506 Unit Tests And Mocks
guillaumecarre846 views
Parallel Lint by PVS-Studio
Parallel LintParallel Lint
Parallel Lint
PVS-Studio368 views
Speeding-up Software Testing With Computational Intelligence by Annibale Panichella
Speeding-up Software Testing With Computational IntelligenceSpeeding-up Software Testing With Computational Intelligence
Speeding-up Software Testing With Computational Intelligence
Multi step automated refactoring for code smell by eSAT Journals
Multi step automated refactoring for code smellMulti step automated refactoring for code smell
Multi step automated refactoring for code smell
eSAT Journals185 views

Similar to Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization

pawel_jakubowski_master_thesis_swarm by
pawel_jakubowski_master_thesis_swarmpawel_jakubowski_master_thesis_swarm
pawel_jakubowski_master_thesis_swarmPaweł Jakubowski
93 views55 slides
08000182 by
0800018208000182
08000182Chathuranga Disanayaka
123 views23 slides
Machine Learning for Application-Layer Intrusion Detection by
Machine Learning for Application-Layer Intrusion DetectionMachine Learning for Application-Layer Intrusion Detection
Machine Learning for Application-Layer Intrusion Detectionbutest
6.6K views151 slides
masteroppgave_larsbrusletto by
masteroppgave_larsbruslettomasteroppgave_larsbrusletto
masteroppgave_larsbruslettoLars Brusletto
737 views151 slides
Zap Scanning by
Zap ScanningZap Scanning
Zap ScanningSuresh Kumar
924 views92 slides
JConrad_Mod11_FinalProject_031816 by
JConrad_Mod11_FinalProject_031816JConrad_Mod11_FinalProject_031816
JConrad_Mod11_FinalProject_031816Jeff Conrad
1.1K views29 slides

Similar to Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization(20)

Machine Learning for Application-Layer Intrusion Detection by butest
Machine Learning for Application-Layer Intrusion DetectionMachine Learning for Application-Layer Intrusion Detection
Machine Learning for Application-Layer Intrusion Detection
butest6.6K views
JConrad_Mod11_FinalProject_031816 by Jeff Conrad
JConrad_Mod11_FinalProject_031816JConrad_Mod11_FinalProject_031816
JConrad_Mod11_FinalProject_031816
Jeff Conrad1.1K views
Java Performance & Profiling by Isuru Perera
Java Performance & ProfilingJava Performance & Profiling
Java Performance & Profiling
Isuru Perera1K views
Dissertation_of_Pieter_van_Zyl_2_March_2010 by Pieter Van Zyl
Dissertation_of_Pieter_van_Zyl_2_March_2010Dissertation_of_Pieter_van_Zyl_2_March_2010
Dissertation_of_Pieter_van_Zyl_2_March_2010
Pieter Van Zyl154 views
The Hacking Games - Operation System Vulnerabilities Meetup 29112022 by lior mazor
The Hacking Games - Operation System Vulnerabilities Meetup 29112022The Hacking Games - Operation System Vulnerabilities Meetup 29112022
The Hacking Games - Operation System Vulnerabilities Meetup 29112022
lior mazor23 views
Enterprise Java: Just What Is It and the Risks, Threats, and Exposures It Poses by Alex Senkevitch
Enterprise Java: Just What Is It and the Risks, Threats, and Exposures It PosesEnterprise Java: Just What Is It and the Risks, Threats, and Exposures It Poses
Enterprise Java: Just What Is It and the Risks, Threats, and Exposures It Poses
Alex Senkevitch92 views
Applying Static Analysis For Detecting Null Pointers In Java Programs by Don Dooley
Applying Static Analysis For Detecting Null Pointers In Java ProgramsApplying Static Analysis For Detecting Null Pointers In Java Programs
Applying Static Analysis For Detecting Null Pointers In Java Programs
Don Dooley3 views

Recently uploaded

Attacking IoT Devices from a Web Perspective - Linux Day by
Attacking IoT Devices from a Web Perspective - Linux Day Attacking IoT Devices from a Web Perspective - Linux Day
Attacking IoT Devices from a Web Perspective - Linux Day Simone Onofri
15 views68 slides
Vertical User Stories by
Vertical User StoriesVertical User Stories
Vertical User StoriesMoisés Armani Ramírez
12 views16 slides
Empathic Computing: Delivering the Potential of the Metaverse by
Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization

  • 1. Bench4BL: Reproducibility Study on the Performance of IR-Based Bug Localization
     Jaekwon Lee (1), Dongsun Kim (1), Tegawendé F. Bissyandé (1), Woosung Jung (2), Yves Le Traon (1)
     (1) SnT, University of Luxembourg, Luxembourg; (2) Seoul National University of Education, South Korea
  • 4. Bug Localization
     [Figure labels: Bug Report; A set of code files (Java); Model]
  • 5. Fault Localization vs. Bug Localization
     [Figure labels, Fault Localization: F(x); Test Case (three); Function (two copies of the listing below)]
         MultiMap<Method, Long> duration = new MultiMap<Method, Long>();
         // run all benchmarks in same order, recording duration
         for (Method m : benchmarks) {
             System.err.println("# "+m.getName()+" benchmarking");
             List<Integer> reps = getReps(min_reps, m);
             for (int r : reps) {
                 System.gc();
                 long start = System.nanoTime();
                 m.invoke(suite,r);
                 long stop = System.nanoTime();
                 duration.map(m, stop - start);
             }
         }
     [Figure labels, Bug Localization: Bug Report; A set of code files (Java); Model]
  • 6. Bug Localization
     [Figure: the Fault Localization vs. Bug Localization diagram of slide 5, with the same labels (F(x), Test Case, Function listing, Model, Bug Report, A set of code files)]
  • 8. Information Retrieval based Bug Localization (IRBL)
     Inputs: Source Codes; Bug Report
     Extracting Features (from each input): NL tokens, Code elements, Meta Info.
  • 9. Information Retrieval based Bug Localization (IRBL)
     Inputs: Source Codes; Bug Report
     Extracting Features (from each input): NL tokens, Code elements, Meta Info.
     Outputs of extraction: Feature Vectors (source code files); Feature Vector (bug report)
  • 10. Information Retrieval based Bug Localization (IRBL)
     Inputs: Source Codes; Bug Report
     Extracting Features (from each input): NL tokens, Code elements, Meta Info.
     Outputs of extraction: Feature Vectors (source code files); Feature Vector (bug report)
     Comparing Similarity & Ranking: Recommend Code Files, ranked 1, 2, 3, ..., N
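The pipeline on this slide (extract features, compare similarity, rank files) can be sketched as below. This is only a minimal illustration, assuming plain term counts as features and cosine similarity as the comparison; it is not the method of any technique in the study, and `IrblSketch`, `termCounts`, `cosine`, and `rank` are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

final class IrblSketch {
    // Feature extraction stand-in (assumption): lowercase alphabetic tokens and their counts.
    static Map<String, Integer> termCounts(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String tok : text.toLowerCase().split("[^a-z]+")) {
            if (!tok.isEmpty()) counts.merge(tok, 1, Integer::sum);
        }
        return counts;
    }

    // Cosine similarity between two term-count vectors; 0.0 if either is empty.
    static double cosine(Map<String, Integer> a, Map<String, Integer> b) {
        double dot = 0, na = 0, nb = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            na += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) dot += e.getValue() * other;
        }
        for (int v : b.values()) nb += v * v;
        return (na == 0 || nb == 0) ? 0.0 : dot / Math.sqrt(na * nb);
    }

    // Rank file names by descending similarity to the bug report (ties broken by name).
    static List<String> rank(String bugReport, Map<String, String> files) {
        Map<String, Integer> query = termCounts(bugReport);
        Map<String, Double> scores = new HashMap<>();
        files.forEach((name, src) -> scores.put(name, cosine(query, termCounts(src))));
        return scores.entrySet().stream()
                .sorted(Map.Entry.<String, Double>comparingByValue().reversed()
                        .thenComparing(Map.Entry.<String, Double>comparingByKey()))
                .map(Map.Entry::getKey)
                .collect(Collectors.toList());
    }
}
```

The techniques in the study add further signals on top of textual similarity (for example fixing history or stack traces); the sketch covers the similarity-and-ranking step only.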
  • 12. Are these results mature enough? Not enough maturity of performance (metric: MAP)
     Subjects   BRTracer   BLUiR   AmaLgam   Locus
     ZXing      0.445      0.380   0.410     0.502
     SWT        0.467      0.560   0.578     0.640
     AspectJ    0.264      0.263   0.271     0.320
     PDE        0.367      0.349   0.322     0.422
     JDT        0.232      0.277   0.282     0.359
  • 13. Are the subjects still usable? Outdated subjects
     Subject: PDE, Eclipse, ZXing, AspectJ, JDT, SWT
     #Reports: 98, 286, 20, 60, 98
     Period: 2004 - 2016, 2004 - 2010, 2002 - 2006, 2010 - 2010
  • 15. Evaluation Configuration? Inconsistent evaluation settings
     Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
     Settings compared: Version Matching; Test file inclusion
  • 17. Research Questions
     Experiment Data Set
       RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
  • 18. Research Questions
     Experiment Data Set
       RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
     Experiment Configuration
       RQ2: What is the impact of version matching on the performance of IRBL techniques?
       RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
  • 19. Research Questions
     Experiment Data Set
       RQ1: To what extent do IRBL techniques perform on up-to-date subjects?
     Experiment Configuration
       RQ2: What is the impact of version matching on the performance of IRBL techniques?
       RQ3: To what extent are IRBL techniques sensitive to the inclusion of test code files?
     Potential Improvement
       RQ4: What potential performance gain can be reached by leveraging duplicate bug reports?
  • 20. IRBL Techniques we used
     Techniques: BugLocator (ICSE 2012); BLIA (APSEC 2015); Locus (ICSE 2016); AmaLgam (ICPC 2014); BRTracer (ICSME 2014); BLUiR (ASE 2013)
     IRBL Features: Full text; Code segmentations; Identifiers; Identifiers; Identifiers; Identifiers
     Sub Modules: Bug report fixing history; Bug report fixing history; Bug report fixing history, Revision history; Revision history; Bug report fixing history, Stack Trace Analysis, Revision history; Bug report fixing history, Stack Trace Analysis
  • 21. Subjects
     Selection criteria: written in Java; publicly available bug reports; 20+ source code files in one of its versions
  • 22. Subjects
     New Subjects: 46 projects; 9,459 bug reports
     Old Subjects: 5 projects; 558 bug reports
  • 23. Subjects
                    Projects   Major Versions   Bug Reports   Duplicate Reports
     New Subjects   46         690              9,459         807
     Old Subjects   5          5                558           136
  • 24. RQ1: The use of old vs. new subjects
     Comparison: New Subjects vs. Old Subjects (each with bug reports, Java source files, and a Bug Oracle)
     Configuration: Single version matching; Test files included
  • 25. RQ2: The importance of version matching
     Comparison: Single version matching vs. Multiple version matching
     Configuration: New subjects; Test files included
  • 26. RQ3: The impact of test file inclusion
     Comparison: Test File Included (+Test) vs. Test File Excluded (each with bug reports, Java source files, and a Bug Oracle)
     Commit Log:
       CAMEL-12558: Transacted and Policy should not have outputs
         M main/java/org/apache/camel/model/PolicyDefinition.java
         M main/java/org/apache/camel/model/TransactedDefinition.java
         A test/java/org/apache/camel/catalog/CamelCatalog.java
         A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
       Added camel-web3j Spring-boot test
         A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
       Update GoogleBigQueryProducer.java
         M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
     Configuration: Multiple version matching; New subjects
  • 27. RQ4: Leveraging duplicate bug reports
     Comparison: Master reports vs. Duplicate reports vs. Merged reports
     Each set has its own Bug Oracle: Bug Oracle (Master reports); Bug Oracle (Duplicate reports); Bug Oracle (Merged reports)
     Configuration: For all subjects; Including test files in Bug Oracle
  • 29. Metrics: Mean Average Precision (MAP) and Mean Reciprocal Rank (MRR)
     $MAP = \frac{1}{M} \sum_{j=1}^{M} AP(j)$
     $MRR = \frac{1}{M} \sum_{i=1}^{M} \frac{1}{f\text{-}rank_i}$
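A rough sketch of how the two metrics on this slide could be computed. It assumes M is the number of bug reports and f-rank_i is the rank of the first correctly recommended file for report i (the slide does not define them). `RankingMetrics` and its methods are hypothetical names, and average precision is divided by the number of fixed files found in the ranking, which matches the usual definition when the ranking covers every file.

```java
import java.util.List;
import java.util.Set;

final class RankingMetrics {
    // AP for one report: mean of precision@k over the ranks k where a fixed file appears.
    static double averagePrecision(List<String> ranked, Set<String> fixed) {
        int hits = 0;
        double sum = 0.0;
        for (int k = 0; k < ranked.size(); k++) {
            if (fixed.contains(ranked.get(k))) {
                hits++;
                sum += (double) hits / (k + 1);
            }
        }
        return hits == 0 ? 0.0 : sum / hits;
    }

    // Reciprocal of the rank of the first fixed file; 0.0 if none is recommended.
    static double reciprocalRank(List<String> ranked, Set<String> fixed) {
        for (int k = 0; k < ranked.size(); k++) {
            if (fixed.contains(ranked.get(k))) return 1.0 / (k + 1);
        }
        return 0.0;
    }

    // MAP: average of AP over all M reports (rankings.get(j) pairs with oracles.get(j)).
    static double map(List<List<String>> rankings, List<Set<String>> oracles) {
        if (rankings.isEmpty()) return 0.0;
        double total = 0.0;
        for (int j = 0; j < rankings.size(); j++) {
            total += averagePrecision(rankings.get(j), oracles.get(j));
        }
        return total / rankings.size();
    }

    // MRR: average of the reciprocal rank over all M reports.
    static double mrr(List<List<String>> rankings, List<Set<String>> oracles) {
        if (rankings.isEmpty()) return 0.0;
        double total = 0.0;
        for (int i = 0; i < rankings.size(); i++) {
            total += reciprocalRank(rankings.get(i), oracles.get(i));
        }
        return total / rankings.size();
    }
}
```

For example, a ranking [a, b, c, d] with fixed files {a, c} gives AP = (1/1 + 2/3) / 2 and a reciprocal rank of 1.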
  • 30. Baseline Performance
     [Figure: Distribution of MAP values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis 0.00 to 1.00; labeled values 0.359, 0.35, 0.359, 0.365, 0.363, 0.38]
     [Figure: Distribution of MRR values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis 0.00 to 1.00; labeled values 0.455, 0.501, 0.43, 0.516, 0.497, 0.506]
  • 31. Baseline Performance: Bug localization still has much room for improvement.
     [Figure: Distribution of MAP values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis 0.00 to 1.00; labeled values 0.359, 0.35, 0.359, 0.365, 0.363, 0.38]
     [Figure: Distribution of MRR values of all subjects for each technique (Locus, BLIA, AmaLgam, BLUiR, BRTracer, BugLocator); axis 0.00 to 1.00; labeled values 0.455, 0.501, 0.43, 0.516, 0.497, 0.506]
  • 32. RQ1: The use of old vs. new subjects. Summary of MAP/MRR of IRBL techniques
     Configuration: Single version matching; Test files included
                  Old Subjects        New Subjects
     Technique    MAP      MRR        MAP       MRR
     BugLocator   0.2692   0.3985     ↗0.3052   ↗0.4223
     BRTracer     0.2645   0.3664     ↗0.3330   ↗0.4690
     BLUiR        0.3102   0.4556     0.2881    0.3869
     AmaLgam      0.2950   0.4072     0.2906    0.3899
     BLIA         0.2935   0.4242     ↗0.3014   0.4155
     Locus        0.2641   0.3399     ↗0.3289   ↗0.4430
  • 33. RQ1: The use of old vs. new subjects. Summary of MAP/MRR of IRBL techniques
     Finding: Not over-fitted to old subjects
     (Same table and configuration as slide 32: Single version matching; Test files included)
  • 34. RQ2: The importance of version matching. Summary of MAP/MRR of IRBL techniques
     Configuration: New subjects; Test files included
                  Single Version      Multiple Version
     Technique    MAP      MRR        MAP       MRR
     BugLocator   0.3052   0.4223     ↗0.3713   ↗0.5075
     BRTracer     0.3330   0.4690     ↗0.3992   ↗0.5526
     BLUiR        0.2881   0.3869     ↗0.3623   ↗0.4802
     AmaLgam      0.2906   0.3899     ↗0.3657   ↗0.4840
     BLIA         0.3014   0.4155     ↗0.3777   ↗0.5124
     Locus        0.3289   0.4430     ↗0.4217   ↗0.5514
  • 35. RQ2: The importance of version matching. Summary of MAP/MRR of IRBL techniques
     Finding: The evaluation/execution of IRBL techniques should apply multiple version matching
     (Same table and configuration as slide 34: New subjects; Test files included)
  • 36. RQ3: The impact of test file inclusion. Summary of MAP/MRR of IRBL techniques
     Configuration: Multiple version matching; New subjects
                  Test files excluded   Test files included
     Technique    MAP      MRR          MAP       MRR
     BugLocator   0.3811   0.4647       0.3713    ↗0.5075
     BRTracer     0.4141   0.5090       0.3992    ↗0.5526
     BLUiR        0.3603   0.4385       ↗0.3623   ↗0.4802
     AmaLgam      0.3633   0.4420       0.3657    ↗0.4840
     BLIA         0.3902   0.4728       ↗0.3777   ↗0.5124
     Locus        0.4146   0.5002       ↗0.4217   ↗0.5514
  • 37. RQ3: The impact of test file inclusion. Summary of MAP/MRR of IRBL techniques
     Finding: Including test files does not bring bias or noise
     (Same table and configuration as slide 36: Multiple version matching; New subjects)
  • 38. RQ4: Leveraging duplicate bug reports. Summary of MAP/MRR of IRBL techniques
                  Master              Duplicate            Merged (Master+Duplicate)
     Technique    MAP      MRR        MAP      MRR         MAP       MRR
     BugLocator   0.3503   0.5051     0.3259   0.4667      0.3502    ↗0.5249
     BRTracer     0.3852   0.5508     0.3776   0.5430      0.3787    ↗0.5692
     BLUiR        0.3159   0.4540     0.2804   0.4192      ↗0.3325   ↗0.4728
     AmaLgam      0.3202   0.4581     0.2829   0.4223      ↗0.3327   ↗0.4725
     BLIA         0.3518   0.4915     0.3231   0.4537      ↗0.3577   ↗0.5041
     Locus        0.2915   0.4707     0.2871   ↗0.4724     ↗0.3042   ↗0.5021
  • 39. RQ4: Leveraging duplicate bug reports. Summary of MAP/MRR of IRBL techniques
     Finding: Duplicate reports complement master bug reports and guarantee a minimum level of performance
     (Same table as slide 38: Master, Duplicate, and Merged (Master+Duplicate) MAP/MRR per technique)
  • 43. Bug-Code Linking (Bug Report; Code Repository; Commit Log)
     Commit Log:
       CAMEL-12558: Transacted and Policy should not have outputs
         M main/java/org/apache/camel/model/PolicyDefinition.java
         M main/java/org/apache/camel/model/TransactedDefinition.java
         A test/java/org/apache/camel/catalog/CamelCatalog.java
         A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
       Added camel-web3j Spring-boot test
         A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
       Update GoogleBigQueryProducer.java
         M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
  • 45. Bug Oracle
     Bug Report 1:
       main/java/org/apache/camel/model/PolicyDefinition.java
       main/java/org/apache/camel/model/TransactedDefinition.java
       test/java/org/apache/camel/catalog/CamelCatalog.java
       main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
     Bug Report 2:
       main/java/org/apache/camel/GoogleBigQueryProducer.java
     Bug Report 3:
       main/java/org/apache/camel/component/StringConcatenator.java
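One way the bug-code linking behind such an oracle might be sketched: scan a commit log, pick up an issue key at the start of a commit message, and attach the changed files that follow. This is an assumption-laden illustration, not the study's tooling. The log format (a message line followed by `M`/`A`/`D` file lines), the key pattern, and the names `BugCodeLinker` and `buildOracle` are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BugCodeLinker {
    // Assumed issue-key shape, e.g. "CAMEL-12558", at the start of a commit message.
    private static final Pattern ISSUE_KEY = Pattern.compile("^([A-Z][A-Z0-9]*-\\d+)\\b");
    // Assumed changed-file line: a status letter (A, M, or D), whitespace, then a path.
    private static final Pattern FILE_LINE = Pattern.compile("^[AMD]\\s+(\\S+)$");

    // Maps each issue key to the files changed by commits that mention it.
    static Map<String, Set<String>> buildOracle(List<String> logLines) {
        Map<String, Set<String>> oracle = new LinkedHashMap<>();
        String currentKey = null;
        for (String raw : logLines) {
            String line = raw.trim();
            if (line.isEmpty()) continue;
            Matcher file = FILE_LINE.matcher(line);
            if (file.matches()) {
                // File lines belong to the most recent commit message.
                if (currentKey != null) {
                    oracle.computeIfAbsent(currentKey, k -> new LinkedHashSet<>())
                          .add(file.group(1));
                }
                continue;
            }
            // Any other line starts a new commit; commits without a key are not linked.
            Matcher key = ISSUE_KEY.matcher(line);
            currentKey = key.find() ? key.group(1) : null;
        }
        return oracle;
    }
}
```

Under these assumptions, commits whose message carries no issue key contribute nothing to the oracle; a real linker would need another rule for them.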
  • 47. Version Matching Strategy
     Single version matching (Previous Techniques)
  • 48. Version Matching Strategy
     Single version matching (Previous Techniques) vs. Multiple version matching
  • 50. Version Matching Approach
     Selecting earliest version
  • 52. Test File Inclusion (Code Repository; Commit Log)
     Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
     Commit Log:
       CAMEL-12558: Transacted and Policy should not have outputs
         M main/java/org/apache/camel/model/PolicyDefinition.java
         M main/java/org/apache/camel/model/TransactedDefinition.java
         A test/java/org/apache/camel/catalog/CamelCatalog.java
         A main/java/org/apache/camel/tools/apt/CoreEipAnnotationPrint.java
       Added camel-web3j Spring-boot test
         A test/java/org/apache/camel/itest/springboot/CamelWeb3jTest.java
       Update GoogleBigQueryProducer.java
         M main/java/org/apache/camel/component/GoogleBigQueryProducer.java
  • 53. Test File Inclusion (Code Repository; Commit Log)
     Techniques: BugLocator, BLIA, Locus, AmaLgam, BRTracer, BLUiR
     Commit Log: same as slide 52
     We remove files that include “test” or “Test” in their path or filename
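The exclusion rule stated on this slide (drop anything with “test” or “Test” in its path or filename) is simple enough to sketch directly. `TestFileFilter`, `isTestFile`, and `excludeTests` are hypothetical names; only the substring rule itself comes from the slide.

```java
import java.util.Collection;
import java.util.List;
import java.util.stream.Collectors;

final class TestFileFilter {
    // True if the path or filename contains "test" or "Test" (the slide's rule).
    static boolean isTestFile(String path) {
        return path.contains("test") || path.contains("Test");
    }

    // Keeps only the non-test files, preserving the input order.
    static List<String> excludeTests(Collection<String> paths) {
        return paths.stream()
                .filter(p -> !isTestFile(p))
                .collect(Collectors.toList());
    }
}
```

Applied to the six files in the slide's commit log, this keeps the four under `main/` and drops `CamelCatalog.java` (under `test/`) and `CamelWeb3jTest.java`. A plain substring match would also drop a production file whose name happens to contain "test", such as a hypothetical `Latest.java`.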
  • 55. Duplicate Bug Reports
                    Projects   Major Versions   Bug Reports   Duplicate Reports
     New Subjects   46         690              9,459         807
     Old Subjects   5          5                558           136
  • 57. Duplicate Bug Reports
     Reports: MATH-760, MATH-1192, MATH-2022
     Pairs: (MATH-760, MATH-1192); (MATH-760, MATH-2022)