A Framework for Evaluating the SZZ Approach for Identifying Bug-Introducing Changes
Daniel da Costa, Shane McIntosh, Weiyi (Ian) Shang, Uirá Kulesza, Roberta Coelho, Ahmed E. Hassan
Queen’s University, McGill University, Concordia University, Federal University, Federal University, Queen’s University
After a bug is spotted, one has to find its root cause
[Figure: a bug and the effort needed to find its root causes]
The SZZ approach was proposed to find which changes induce a future fix
[Figure callout: "These induced our later fix"]
SZZ traces historical data from bug-fixing changes back to fix-inducing changes

Change #1: Initial import
+ function sum(x, y)
+ begin
+   x - y
+ end

Change #2: We need documentation
  function sum(x, y)
  begin
+   # mandatory comment
    x - y

Change #3: Fixing a bug
  function sum(x, y)
  begin
    # mandatory comment
-   x - y
+   x + y
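The tracing step can be sketched in Python. This is a toy model under our own assumptions, not the authors' tooling: each change is a hypothetical `Change` record listing the lines it adds and removes, and `trace_fix_inducing` asks which earlier change last added each line that the fix removes (a stand-in for running blame on a real repository).

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """Hypothetical change record: an id plus the lines it adds and removes."""
    change_id: int
    message: str
    added: list = field(default_factory=list)
    removed: list = field(default_factory=list)


def trace_fix_inducing(history, fix):
    """Return ids of earlier changes that last added a line removed by `fix`.

    Assumes `history` is ordered by change_id. Lines are matched by their
    text, which is a simplification of what blame does on real files.
    """
    last_added_by = {}  # line text -> id of the change that most recently added it
    for change in history:
        if change.change_id >= fix.change_id:
            break  # only look at changes made before the fix
        for line in change.added:
            last_added_by[line] = change.change_id
    return sorted({last_added_by[line] for line in fix.removed if line in last_added_by})


history = [
    Change(1, "Initial import", added=["function sum(x, y)", "begin", "x - y", "end"]),
    Change(2, "We need documentation", added=["# mandatory comment"]),
    Change(3, "Fixing a bug", added=["x + y"], removed=["x - y"]),
]

inducing = trace_fix_inducing(history, history[2])
print(inducing)  # [1]
```

In this toy history the fix (Change #3) removes `x - y`, which was last added by Change #1, so Change #1 is reported as fix-inducing while the documentation change is not.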
SZZ is the foundational piece behind several empirical findings in recent studies
What is the buggiest day? (Śliwerski et al., 2005)
What is the buggiest time of day? (Eyolfson et al., 2011)
Buggy / Not buggy (Kim et al., 2008)
However… SZZ is not without its
limitations
Are we really finding changes that
induce fixes?
To make matters harder, no ground truth is readily available
"Can you tell me the changes that induced the fix of bug #50505?"
:-(
In spite of this, SZZ-generated results
must be evaluated
We propose metrics to highlight
suspicious values in SZZ-generated data
Fix-inducing changes that disagree with team members are suspicious
[Figure: timeline marking the day release 5.12.3 shipped. SZZ flags two buggy changes; one is labeled "OK!" and the other "Suspicious!"]
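One way to read this check, sketched in Python under our own assumptions: the team's bug report names the release in which the bug was observed (5.12.3 on the slide), so a change that SZZ flags but that was committed after that release shipped cannot have caused the bug and is marked suspicious. The function name, dates, and change ids below are made up for illustration.

```python
from datetime import date


def flag_suspicious(flagged_changes, release_date):
    """Label each SZZ-flagged change relative to the affected release.

    flagged_changes maps a (hypothetical) change id to its commit date.
    A change committed after the affected release is labeled suspicious.
    """
    return {
        change_id: "suspicious" if committed > release_date else "ok"
        for change_id, committed in flagged_changes.items()
    }


# Made-up dates: the release day and two changes that SZZ flagged.
release_5_12_3 = date(2015, 6, 1)
flagged = {"c1": date(2015, 3, 10), "c2": date(2015, 9, 2)}

labels = flag_suspicious(flagged, release_5_12_3)
print(labels)  # {'c1': 'ok', 'c2': 'suspicious'}
```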
Changes that are implicated in several future fixes are suspicious
[Figure: timeline on which a single buggy change is linked to Bug #1, Bug #2, Bug #3, … up to Bug #400]
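Counting future fixes per change can be sketched as follows. The `(bug_id, change_id)` link format and the threshold are assumptions for illustration; the slides do not state a cut-off above which a change counts as suspicious.

```python
from collections import Counter


def count_future_fixes(links):
    """Count how many distinct bug fixes each change is blamed for.

    links is an iterable of (bug_id, change_id) pairs produced by SZZ.
    """
    return Counter(change_id for bug_id, change_id in set(links))


def suspicious_changes(links, threshold):
    """Return changes implicated in more than `threshold` fixes (hypothetical cut-off)."""
    counts = count_future_fixes(links)
    return sorted(change for change, n in counts.items() if n > threshold)


# Made-up links: change "a" is blamed for bugs 1..400, change "b" for a single bug.
links = [(bug, "a") for bug in range(1, 401)] + [(7, "b")]

print(count_future_fixes(links)["a"])          # 400
print(suspicious_changes(links, threshold=3))  # ['a']
```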
Fixes that are induced by changes that are spaced by years are suspicious
[Figure: timeline on which Bug #1 is linked to several buggy changes spread over 2 years]
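A minimal sketch of the time-span measurement, assuming it is taken as the distance between the earliest and the latest change that SZZ blames for the same fix. The commit dates are made up for illustration.

```python
from datetime import date


def inducing_time_span_days(change_dates):
    """Days between the earliest and latest change blamed for one fix."""
    return (max(change_dates) - min(change_dates)).days


# Made-up commit dates of the changes SZZ blames for Bug #1.
blamed_for_bug_1 = [date(2012, 5, 1), date(2013, 1, 15), date(2014, 5, 1)]

span = inducing_time_span_days(blamed_for_bug_1)
print(span)  # 730
```

A span of 730 days corresponds to the two-year gap shown on the slide.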
We evaluate four SZZ variations in our empirical study: B-SZZ, MA-SZZ, R-SZZ, and L-SZZ

Review #1
+ if (age > 18) {
+   can_watch = true;
+ }
+ return price;

Review #2
  if (age > 18) {
+   //check loyalty card
+   if(loyaltycard){
+     //give 30% discount
+     price = price * 0.8;
+   }
    can_watch = true;
  }
  return price;

Review #3
  if (age >= 18) {
    // check loyalty card
    if(loyaltycard){
      //give 30% discount
      price = price * 0.7;
    }
    can_watch = true;
  }
  return price;

(Review #3 changes the age check, the spacing of the first comment, and the discount factor.)
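The slides name the variants without defining them. As a hedged sketch, assume that R-SZZ keeps only the most recent candidate change and L-SZZ keeps only the largest one (both rules are our assumption here, as are the `Candidate` record, the dates, and the line counts, which are made up).

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Candidate:
    """Hypothetical candidate change: its label, commit date, and size in lines."""
    review: str
    committed: date
    lines_changed: int


def pick_most_recent(candidates):
    """Assumed R-SZZ rule: keep only the latest candidate."""
    return max(candidates, key=lambda c: c.committed)


def pick_largest(candidates):
    """Assumed L-SZZ rule: keep only the candidate touching the most lines."""
    return max(candidates, key=lambda c: c.lines_changed)


# Made-up candidates for a single fix.
candidates = [
    Candidate("Review #1", date(2014, 1, 10), lines_changed=12),
    Candidate("Review #2", date(2014, 6, 3), lines_changed=4),
]

print(pick_most_recent(candidates).review)  # Review #2
print(pick_largest(candidates).review)      # Review #1
```

The two rules can disagree on the same set of candidates, which is one reason the variants can produce different fix-inducing changes.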
RESULTS
SZZ disagrees with team members regarding when fixes are induced
[Bar chart: disagreement ratio (%), y-axis from 0 to 50, for B-SZZ, MA-SZZ, R-SZZ, and L-SZZ]
Changes may induce future fixes that span several days
[Plot: number of days (axis ticks at 10 and 1,000) for B-SZZ, MA-SZZ, R-SZZ, and L-SZZ]
Many fixes take years to be induced
[Plot: number of days (axis ticks at 10 and 1,000) for B-SZZ and MA-SZZ]
To what extent should we trust SZZ data at the end of the day?
We believe that the extreme values of our metrics may indicate false positives
[Plot: number of days (axis ticks at 10 and 1,000) for B-SZZ and MA-SZZ]
We manually analyze 60 bugs for each evaluated SZZ variation
38 out of 60 fix-inducing changes were false positives (MA-SZZ)

Metric                          Missed  Initial Code Imp.  Directory Renaming  Equivalent  Backout  Low Likelihood  True Positive
Disagreement Ratio              5       0                  3                   1           0        2               9
# of Future Fixes               0       5                  4                   5           0        0               6
Fix-Inducing Changes Time-span  0       5                  3                   0           1        4               7
Total                           5       10                 10                  6           1        6               22
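The headline figure follows from the table by simple arithmetic, which the short Python check below reproduces using only the numbers shown on the slide.

```python
categories = [
    "Missed", "Initial Code Imp.", "Directory Renaming", "Equivalent",
    "Backout", "Low Likelihood", "True Positive",
]
rows = {
    "Disagreement Ratio": [5, 0, 3, 1, 0, 2, 9],
    "# of Future Fixes": [0, 5, 4, 5, 0, 0, 6],
    "Fix-Inducing Changes Time-span": [0, 5, 3, 0, 1, 4, 7],
}

# Column totals, one per category.
totals = [sum(column) for column in zip(*rows.values())]
print(totals)  # [5, 10, 10, 6, 1, 6, 22]

analyzed = sum(totals)
true_positives = totals[categories.index("True Positive")]
false_positives = analyzed - true_positives
print(analyzed, false_positives)  # 60 38
```

Each metric row sums to 20, so the 60 analyzed changes are split evenly across the three metrics, and 60 minus the 22 true positives gives the 38 false positives.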
Changes that do not change the software behaviour
  public interface MyAPI {
-   getMoney(int howMuch);
+   public getMoney(int howMuch);
  }
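A crude filter for some behaviour-preserving edits can be sketched in Python. It is our own illustration, not part of the evaluated SZZ variations, and it only recognises lines that differ in `//` comments or whitespace. It does not understand language rules, so it would not catch the redundant `public` modifier shown above.

```python
import re


def normalize(line):
    """Drop a trailing // comment and all whitespace from one source line."""
    without_comment = re.sub(r"//.*$", "", line)
    return re.sub(r"\s+", "", without_comment)


def is_cosmetic(old_line, new_line):
    """True when two lines differ only in // comments or whitespace."""
    return normalize(old_line) == normalize(new_line)


comment_edit = is_cosmetic("//check loyalty card", "// check loyalty card")
value_edit = is_cosmetic("price = price * 0.8;", "price = price * 0.7;")
modifier_edit = is_cosmetic("getMoney(int howMuch);", "public getMoney(int howMuch);")

print(comment_edit)   # True
print(value_edit)     # False
print(modifier_edit)  # False
```

Because whitespace is removed entirely, the comparison is deliberately coarse; a real implementation would need a parser for the language at hand.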
FINAL REMARKS
SZZ still has opportunities for improvement
Despite the lack of a “ground truth”, our framework helps to evaluate the SZZ-generated data at hand
SUMMARY
SZZ traces historical data from bug-fixing changes back to fix-inducing changes
Fix-inducing changes that disagree with team members are suspicious
38 out of 60 fix-inducing changes were false positives
Changes that do not change the software behaviour
A Framework for Evaluating the Results of the SZZ Approach for Identifying Bug-Introducing Changes

(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...OnePlan Solutions
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)OPEN KNOWLEDGE GmbH
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 

Recently uploaded (20)

英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作英国UN学位证,北安普顿大学毕业证书1:1制作
英国UN学位证,北安普顿大学毕业证书1:1制作
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
Dealing with Cultural Dispersion — Stefano Lambiase — ICSE-SEIS 2024
 
Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)Recruitment Management Software Benefits (Infographic)
Recruitment Management Software Benefits (Infographic)
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
Russian Call Girls in Karol Bagh Aasnvi ➡️ 8264348440 💋📞 Independent Escort S...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
Maximizing Efficiency and Profitability with OnePlan’s Professional Service A...
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 

A Framework for Evaluating the Results of the SZZ Approach for Identifying Bug-Introducing Changes

  • 1. A Framework for Evaluating the SZZ Approach for Identifying Bug-Introducing Changes. Daniel da Costa, Shane McIntosh, Weiyi (Ian) Shang, Uirá Kulesza, Roberta Coelho, Ahmed E. Hassan. Queen’s University, McGill University, Concordia University, Federal University, Federal University, Queen’s University
  • 5. After a bug is spotted, one has to find its root cause. [Figure labels: Bug; The effort to find the root causes]
  • 8. The SZZ approach was proposed to find which changes induce a future fix. [Figure annotation: These induced our later fix]
  • 17. SZZ traces historical data from bug-fixing changes back to fix-inducing changes
      Change #1 (Initial Import):
        + function sum(x, y)
        + begin
        +   x - y
        + end
      Change #2 (We need documentation):
          function sum(x, y)
          begin
        +   # mandatory comment
            x - y
      Change #3 (Fixing a bug):
          function sum(x, y)
          begin
            # mandatory comment
        +   x + y
        -   x - y
  • 21. SZZ is the foundational piece behind several empirical findings in recent studies: Buggy / Not buggy (Kim et al., 2008); What is the buggiest day? (Śliwerski et al., 2005); What is the buggiest time of day? (Eyolfson et al., 2011)
  • 22. However… SZZ is not without its limitations. Are we really finding changes that induce fixes?
  • 25. To make matters harder, no ground truth is readily available. “Can you tell me the changes that induced the fix of bug #50505?”
  • 26. To make matters harder, no ground truth is readily available. :-(
  • 27. In spite of this, SZZ-generated results must be evaluated
  • 28. We propose metrics to highlight suspicious values in SZZ-generated data
  • 31. Fix-inducing changes that disagree with team members are suspicious. [Figure annotation: The day of this release]
  • 35. Fix-inducing changes that disagree with team members are suspicious. [Timeline labels: Time; 5.12.3; SZZ; Buggy Change; Suspicious!]
  • 42. Changes that are implicated in several future fixes are suspicious. [Timeline labels: Time; Buggy Change; Bug #1; Bug #2; Bug #3 … Bug #400]
  • 49. Fixes that are induced by changes that are spaced by years are suspicious. [Timeline labels: Time; Bug #1; Buggy Change (three occurrences); 2 years]
  • 54. We evaluate four SZZ variations in our empirical study
      Review #1:
        + if (age > 18) {
        +   can_watch = true;
        + }
        + return price;
      Review #2:
          if (age > 18) {
        +   //check loyalty card
        +   if(loyaltycard){
        +     //give 30% discount
        +     price = price * 0.8;
            }
            can_watch = true;
          }
          return price;
      Review #3:
          if (age >= 18) {
        -   // check loyalty card
            if(loyaltycard){
              //give 30% discount
              price = price * 0.7;
            }
            can_watch = true;
          }
          return price;
  • 55. B-SZZ
  • 67. R-SZZ
  • 70. L-SZZ
  • 76. SZZ disagrees with team members regarding when fixes are induced. [Bar chart: Disagreement Ratio (%), axis from 0 to 50, for B-SZZ, MA-SZZ, R-SZZ, L-SZZ]
  • 79. Changes may induce future fixes that span several days. [Plot: Days (axis ticks 10 and 1,000) for B-SZZ, MA-SZZ, R-SZZ, L-SZZ]
  • 82. Many fixes take years to be induced. [Plot: Days (axis ticks 10 and 1,000) for B-SZZ, MA-SZZ]
  • 83. To what extent should we trust SZZ data at the end of the day? [Figure labels: SZZ; We]
  • 87. We believe that the extreme values of our metrics may indicate false positives. We manually analyze 60 bugs for each evaluated SZZ variation. [Plot: Days (axis ticks 10 and 1,000) for B-SZZ, MA-SZZ]
  • 90. 38 out of 60 fix-inducing changes were false positives (MA-SZZ)
      Metric                         | Missed | Initial Code Imp. | Directory Renaming | Equivalent | Backout | Low Likelihood | True Positive
      Disagreement Ratio             |      5 |                 0 |                  3 |          1 |       0 |              2 |             9
      # of Future Fixes              |      0 |                 5 |                  4 |          5 |       0 |              0 |             6
      Fix Inducing Changes Time-span |      0 |                 5 |                  3 |          0 |       1 |              4 |             7
      Total                          |      5 |                10 |                 10 |          6 |       1 |              6 |            22
  • 93. Changes that do not change the software behaviour
      Before:
          public interface MyAPI {
            getMoney(int howMuch);
          }
      After:
          public interface MyAPI {
        -   getMoney(int howMuch);
        +   public getMoney(int howMuch);
          }
  • 95. SZZ still has room for improvement
  • 96. Despite the lack of a “ground truth”, our framework helps to evaluate the SZZ-generated data at hand
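
The three suspicious-value checks from the slides can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the record layout, the sample data, and the function names are all hypothetical, and the disagreement check assumes that "disagreeing with team members" means a fix-inducing change is dated after the date on which the team reported the bug as already present.

```python
from collections import Counter
from datetime import date

# Hypothetical SZZ output: (bug id, fix-inducing change id, change date).
links = [
    ("BUG-1", "c1", date(2012, 3, 1)),
    ("BUG-1", "c7", date(2014, 6, 1)),
    ("BUG-2", "c1", date(2012, 3, 1)),
    ("BUG-3", "c9", date(2015, 2, 10)),
]

# Hypothetical dates on which the team says each bug was already present
# (assumption: for example, the release date of an affected version).
reported_present = {
    "BUG-1": date(2014, 1, 1),
    "BUG-2": date(2013, 1, 1),
    "BUG-3": date(2015, 1, 1),
}


def disagreement_ratio(links, reported_present):
    """Share of links whose change is dated after the bug was reported present."""
    late = [link for link in links if link[2] > reported_present[link[0]]]
    return len(late) / len(links)


def future_fix_counts(links):
    """Number of distinct bugs that each change is implicated in."""
    pairs = {(change, bug) for bug, change, _ in links}
    return Counter(change for change, _ in pairs)


def time_span_days(links):
    """Days between the earliest and latest fix-inducing change of each bug."""
    dates = {}
    for bug, _, when in links:
        dates.setdefault(bug, []).append(when)
    return {bug: (max(ds) - min(ds)).days for bug, ds in dates.items()}


print(disagreement_ratio(links, reported_present))
print(future_fix_counts(links))
print(time_span_days(links))
```

In this toy data, change `c1` is linked to two bugs and the changes blamed for `BUG-1` are more than two years apart; extreme values of this kind are what the slides flag as candidates for manual inspection.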

Editor's Notes

  1. The part that we can see
  2. Why no ground truth is readily available:
     - It is hard to build a database of bug causes
     - Developers are not available to do such work
     - Even if they are available, they might not recall the details if the bug is too old
     - The developer may no longer be within reach (turnover)
  3. We should describe the SZZ implementations instead
  4. Replace this one to explain the SZZ implementations
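
The tracing step that the slides illustrate with the `sum` example can be sketched as below. This is a toy under stated assumptions, not one of the evaluated SZZ variations: lines are matched by their text, the `history` structure and both function names are hypothetical, and the closing `end` line is assumed to be present in every version. Real implementations rely on the diff and annotate/blame facilities of the version control system instead.

```python
# Each entry is (change label, file content after that change), oldest first.
# Hypothetical reconstruction of the `sum` example from the slides.
history = [
    ("Change #1", ["function sum(x, y)", "begin", "x - y", "end"]),
    ("Change #2", ["function sum(x, y)", "begin", "# mandatory comment",
                   "x - y", "end"]),
    ("Change #3", ["function sum(x, y)", "begin", "# mandatory comment",
                   "x + y", "end"]),
]


def introduced_by(line, history, fix_index):
    """Walk back from the version before the fix while the line is present.

    Returns the label of the oldest change in that unbroken run, which is
    the change assumed to have introduced the line.
    """
    origin = None
    for label, content in reversed(history[:fix_index]):
        if line not in content:
            break
        origin = label
    return origin


def fix_inducing_changes(history, fix_index):
    """Map each line removed by the fix to the change that introduced it."""
    before = history[fix_index - 1][1]
    after = history[fix_index][1]
    removed = [line for line in before if line not in after]
    return {line: introduced_by(line, history, fix_index) for line in removed}


# Change #3 is the bug fix, so trace the lines it removed.
print(fix_inducing_changes(history, 2))
```

Here the fix removes `x - y`, and walking back through the history attributes that line to Change #1 rather than to Change #2, which only added a comment.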