A Framework for Evaluating the Results of the SZZ Approach for Identifying Bug-Introducing Changes
1. A Framework for Evaluating the SZZ Approach for Identifying Bug-Introducing Changes
Daniel da Costa (Queen’s University)
Shane McIntosh (McGill University)
Weiyi (Ian) Shang (Concordia University)
Uirá Kulesza (Federal University)
Roberta Coelho (Federal University)
Ahmed E. Hassan (Queen’s University)
5. After a bug is spotted, one has to find its root cause
Bug
The effort to find the root causes
8. The SZZ approach was proposed to find which changes induce a future fix
"These induced our later fix"
10. SZZ traces historical data from bug-fixing changes back to fix-inducing changes

Change #1: Initial Import
+ function sum(x, y)
+ begin
+   x - y
+ end

Change #2: We need documentation
  function sum(x, y)
  begin
+   # mandatory comment
    x - y

Change #3: Fixing a bug
  function sum(x, y)
  begin
    # mandatory comment
-   x - y
+   x + y
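The tracing step on this slide can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not the SZZ implementation evaluated in the talk: the helper names `blame` and `fix_inducing_changes` are hypothetical, the file history is passed in as plain snapshots, and `difflib` stands in for the blame/annotate feature of a real version control system. The closing `end` line is added to the snapshots for completeness.

```python
import difflib


def blame(versions):
    """For each line of the last snapshot, return the number (1-based)
    of the change that last introduced or modified it."""
    origin, prev = [], []
    for change_no, lines in enumerate(versions, start=1):
        matcher = difflib.SequenceMatcher(a=prev, b=lines, autojunk=False)
        # Lines default to "introduced by this change" ...
        new_origin = [change_no] * len(lines)
        # ... unless they are carried over unchanged from the previous snapshot.
        for block in matcher.get_matching_blocks():
            for k in range(block.size):
                new_origin[block.b + k] = origin[block.a + k]
        origin, prev = new_origin, lines
    return origin


def fix_inducing_changes(versions, fix_index):
    """Return the change numbers that last touched the lines which the
    bug-fixing change (1-based fix_index) removed or rewrote."""
    before = versions[fix_index - 2]
    after = versions[fix_index - 1]
    origin = blame(versions[:fix_index - 1])
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    induced = set()
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            induced.update(origin[i1:i2])
    return induced


# The three changes from the slide, as full file snapshots.
v1 = ["function sum(x, y)", "begin", "  x - y", "end"]
v2 = ["function sum(x, y)", "begin", "  # mandatory comment", "  x - y", "end"]
v3 = ["function sum(x, y)", "begin", "  # mandatory comment", "  x + y", "end"]

result = fix_inducing_changes([v1, v2, v3], fix_index=3)
print(result)
```

In this toy history the fix rewrites the `x - y` line, which was last touched by Change #1, so Change #1 is reported as fix-inducing; Change #2 only added a comment and is not flagged.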
21. SZZ is the foundational piece behind several empirical findings in recent studies
What is the buggiest day? (Śliwerski et al., 2005)
What is the buggiest time of day? (Eyolfson et al., 2011)
Buggy / Not buggy (Kim et al., 2008)
22. However… SZZ is not without its limitations
Are we really finding changes that induce fixes?
23. To make matters harder, no ground truth is readily available
"Can you tell me the changes that induced the fix of bug #50505?"
:-(
27. In spite of this, SZZ-generated results must be evaluated
28. We propose metrics to highlight suspicious values in SZZ-generated data
81. Many fixes take years to be induced
[Figure: values in days for B-SZZ and MA-SZZ; axis ticks at 10 and 1,000]
83. To what extent should we trust SZZ data at the end of the day?
84. We believe that the extreme values of our metrics may indicate false positives
[Figure: values in days for B-SZZ and MA-SZZ; axis ticks at 10 and 1,000]
We manually analyze 60 bugs for each evaluated SZZ implementation
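One way to pick candidates for such a manual analysis is to rank bugs by a metric and inspect the largest values first. The sketch below is a hypothetical illustration only: the slides do not define the metrics, so it assumes the time-span is the number of days between the earliest and latest fix-inducing change linked to one bug, and the bug ids, dates, and helper names (`time_span_days`, `most_extreme`) are made up.

```python
from datetime import date


def time_span_days(change_dates):
    """Days between the earliest and latest fix-inducing change
    linked to one bug (assumed definition, not taken from the slides)."""
    return (max(change_dates) - min(change_dates)).days


def most_extreme(metric_by_bug, n):
    """Return the n bug ids with the largest metric values, largest first."""
    return sorted(metric_by_bug, key=metric_by_bug.get, reverse=True)[:n]


# Made-up data: bug id -> dates of the changes SZZ flagged as fix-inducing.
flagged = {
    "BUG-A": [date(2010, 1, 1), date(2010, 1, 11)],
    "BUG-B": [date(2005, 3, 1), date(2008, 3, 1)],
    "BUG-C": [date(2012, 6, 1)],
}

spans = {bug: time_span_days(dates) for bug, dates in flagged.items()}
suspicious = most_extreme(spans, n=1)
print(spans, suspicious)
```

Here `BUG-B` has flagged changes spread over roughly three years, so it would be the first candidate to check by hand for a false positive.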
88. 38 out of 60 fix-inducing changes were false positives (MA-SZZ)

Metric                           Missed  Initial Code Imp.  Directory Renaming  Equivalent  Backout  Low Likelihood  True Positive
Disagreement Ratio                    5                  0                   3           1        0               2              9
# of Future Fixes                     0                  5                   4           5        0               0              6
Fix-Inducing Changes Time-span        0                  5                   3           0        1               4              7
Total                                 5                 10                  10           6        1               6             22
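The headline figure follows from the table: every change not classified as a true positive counts as a false positive. A short Python cross-check of the arithmetic, with the counts copied from the table above (the variable names are illustrative only):

```python
categories = [
    "Missed", "Initial Code Imp.", "Directory Renaming", "Equivalent",
    "Backout", "Low Likelihood", "True Positive",
]

# One row per metric, one count per category, as shown on the slide.
counts = {
    "Disagreement Ratio":             [5, 0, 3, 1, 0, 2, 9],
    "# of Future Fixes":              [0, 5, 4, 5, 0, 0, 6],
    "Fix Inducing Changes Time-span": [0, 5, 3, 0, 1, 4, 7],
}

# Column sums reproduce the "Total" row.
totals = [sum(column) for column in zip(*counts.values())]
by_category = dict(zip(categories, totals))

analyzed = sum(totals)
false_positives = analyzed - by_category["True Positive"]
print(totals, analyzed, false_positives)
```

Each metric row sums to 20 analyzed changes, the column sums match the Total row, and 60 - 22 = 38 false positives.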
92. Changes that do not change the software behaviour
Interface methods in Java are implicitly public, so adding the modifier leaves the behaviour unchanged.

public interface MyAPI {
    // The return type is missing on the slide; void is assumed here.
-   void getMoney(int howMuch);
+   public void getMoney(int howMuch);
}
Editor's Notes
The part that we can see
- It is hard to build a database of bug causes
- Developers are not available to do such work
- Even if they are available, they might not recall the cause if the bug is too old
- The developer may not even be reachable (turnover)
We should describe the SZZ implementations instead
Replace this one to explain the SZZ implementations