
Automatic Identification of Bug Introducing Changes


A talk I gave for an MSR course. Original work by Sung Kim, Tom Zimmermann et al.

Published in: Technology
Transcript of "Automatic Identification of Bug Introducing Changes"

  1. Automatic Identification of Bug-Introducing Changes, presented by Nicolas Bettenburg
  2. Source Control System (e.g., Subversion): control development, multi-user access, change history. Error Reporting System (e.g., BugZilla): capture problems, multi-user access, error history.
  3. Link errors to fixes: connect the Source Control System with the Error Reporting System.
  4-8. Timeline: ... 0.11, #5612, 0.12, 0.13 ... (time)
      • Bug Report #5612 reports an error
      • Error fixed in version 0.13
      • Commit message in 0.13: "Fixed Bug #5612"
      • Changed code in 0.13 refers to the location of the bug
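The link between a commit message and a bug report on slides 4-8 can be sketched as a simple text match. This is a minimal illustration, not the matching procedure from the original work: the regular expression and the function name `linked_bug_ids` are assumptions made for this example.

```python
import re

# Hypothetical pattern (an assumption for this sketch): a keyword such as
# "bug", "fix", "fixed" or "fixes", followed by an optional "#" and a number.
BUG_ID = re.compile(r"\b(?:bug|fix(?:ed|es)?)\s*#?\s*(\d+)", re.IGNORECASE)


def linked_bug_ids(commit_message):
    """Return the set of bug report numbers mentioned in a commit message."""
    return {int(number) for number in BUG_ID.findall(commit_message)}


if __name__ == "__main__":
    print(linked_bug_ids("Fixed Bug #5612"))             # {5612}
    print(linked_bug_ids("Refactored report printing"))  # set()
```

A commit whose message yields a non-empty set is a candidate bug-fixing change for the reports it names.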
  9. Timeline: ... 0.11, #5612, 0.12, 0.13 ... (time). When was the bug introduced? Who was responsible?
  10. Why do we want this information?
  11. Measure developer performance. [Chart: bug-introducing and bug-fixing changes for Developer A and Developer B]
  12. Measure the residency time of bugs in the system
  13. Find bug-prone change patterns
  14. How do we get this information?
  15. The SZZ Algorithm. Timeline: 0.11, #5612, 0.12, 0.13 (time), with the commit message "Fixed Bug #5612". First find the bug-fixing changes.
  16. The SZZ Algorithm. Timeline: 0.11, #5612, 0.12, 0.13 (time), with the commit message "Fixed Bug #5612" and a diff between 0.12 and 0.13. Run diff to find out what changed.
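The diff step on slide 16 can be sketched with Python's standard `difflib` module. This is an illustration under assumptions, not the tooling used in the original work: the two revisions are the 0.12 and 0.13 listings from the next slide with their line-number prefixes stripped, and `changed_line_numbers` is a name invented here.

```python
import difflib

# Revisions 0.12 and 0.13 as transcribed from the slides (indentation is not
# preserved in the transcript, so none is used here).
REV_0_12 = [
    "public void foo() {",
    "// print report",
    "if (report == null)",
    "{",
    "println(report);",
    "}",
    "}",
]
REV_0_13 = [
    "public void foo() {",
    "// print out report",
    "if (report != null)",
    "{",
    "println(report);",
    "}",
    "}",
]


def changed_line_numbers(old, new):
    """Return 1-based line numbers in `old` that were replaced or deleted."""
    matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
    changed = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            changed.extend(range(i1 + 1, i2 + 1))
    return changed


if __name__ == "__main__":
    print(changed_line_numbers(REV_0_12, REV_0_13))  # [2, 3]
```

The lines reported for the pre-fix revision are the ones whose origin is then traced back through the history.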
  17. The SZZ Algorithm

      0.11                       0.12                       0.13
      1: public void bar() {     1: public void foo() {     1: public void foo() {
      2: // print report         2: // print report         2: // print out report
      3: if (report == null) {   3: if (report == null)     3: if (report != null)
      4: println(report);        4: {                       4: {
      5: }                       5: println(report);        5: println(report);
      6:}                        6: }                       6: }
      7:                         7:}                        7:}

      Highlighted lines in 0.13: 2: // print out report, 3: if (report != null), 6: }
  18. The SZZ Algorithm. Timeline: 0.11, #5612, 0.12, 0.13 (time), with a "Bug Introduction" marker. Wrong!
      1. A bug can only appear before its report
      2. Not all changes are fixes
      3. Annotate information is insufficient
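The first objection on slide 18, that a bug can only appear before it is reported, suggests a simple date filter on candidate revisions. The sketch below is hypothetical: the function name and all dates are invented for illustration and do not come from the slides.

```python
from datetime import date


def plausible_introducers(candidates, report_date):
    """Keep only candidate revisions committed before the bug was reported.

    `candidates` is a list of (revision, commit_date) pairs.
    """
    return [rev for rev, committed in candidates if committed < report_date]


if __name__ == "__main__":
    # Invented dates, for illustration only.
    candidates = [("0.11", date(2006, 1, 10)), ("0.12", date(2006, 3, 1))]
    reported = date(2006, 2, 1)  # hypothetical date of bug report #5612
    print(plausible_introducers(candidates, reported))  # ['0.11']
```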
  19. Improving the Algorithm
      • use annotation graphs
      • ignore comments
      • ignore blank lines
      • ignore formatting changes
      • ignore outliers
      • manual inspection
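Three of the improvements on slide 19 (ignoring comments, blank lines and formatting changes) amount to normalizing code before comparing it. The following is a deliberately naive sketch, not the approach from the original work: it only handles `//` comments, it strips all whitespace (so it would also merge tokens such as `int x`), and the names `normalize` and `same_modulo_formatting` are assumptions.

```python
import re


def normalize(source_lines):
    """Drop // comments and blank lines, and remove all whitespace."""
    kept = []
    for line in source_lines:
        code = line.split("//", 1)[0]   # naive: ignores "//" inside string literals
        code = re.sub(r"\s+", "", code)
        if code:
            kept.append(code)
    return kept


def same_modulo_formatting(old_lines, new_lines):
    """True if two hunks differ only in comments, blank lines or layout."""
    return "".join(normalize(old_lines)) == "".join(normalize(new_lines))


if __name__ == "__main__":
    # Brace moved to its own line between 0.11 and 0.12: formatting only.
    print(same_modulo_formatting(
        ["if (report == null) {", "println(report);", "}"],
        ["if (report == null)", "{", "println(report);", "}"],
    ))  # True
    # Comment reworded between 0.12 and 0.13: ignored.
    print(same_modulo_formatting(["// print report"], ["// print out report"]))  # True
    # Condition changed between 0.12 and 0.13: a real change.
    print(same_modulo_formatting(["if (report == null)"], ["if (report != null)"]))  # False
```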
  20-22. Annotation graph over the three revisions:

      0.11                       0.12                       0.13
      1: public void bar() {     1: public void foo() {     1: public void foo() {
      2: // print report         2: // print report         2: // print out report
      3: if (report == null) {   3: if (report == null)     3: if (report != null)
      4: println(report);        4: {                       4: {
      5: }                       5: println(report);        5: println(report);
      6:}                        6: }                       6: }
      7:                         7:}                        7:}

      [Annotation graph linking lines across 0.11, 0.12 and 0.13; node labels as extracted: 1 1 1 3 4 4 4 5 5 6 7 7]
  23. Outliers. Idea: not all file changes in the version that fixes a bug are bug-fixing changes. Ignore these revisions!

      Excerpt from the paper shown behind the slide text: "... each bug-fix revision for our two projects, as shown in Figure 12. Most bug-fix revisions contain changes to just one or two files. All 50% of file change numbers per revision (between 25% and 75% quartiles) are about 1-3. A typical approach for removing outliers from data is if a data item is 1.5 times greater than the 50% quartile, it is assumed to be an outlier. In our experiment, we adopt a very conservative approach, and use as our definition of outlier file change counts that are greater than 5 times the 50% quartile. This ensures that any changes we note as outliers truly have a large number of file changes. Changes identified as outliers for our two projects are shown as '+' in Figure 12."

      Figure 12. Box plots for the number of file changes per revision.
      Figure 14. Bug-introducin... ignoring outlier revisions.

      The right-hand column of the excerpt is clipped on the slide: "4.5. Manual Fix ... We identify bug-fix rev... and bug-fix revision dat... introducing changes. If a ch... is a bug-fix, we assume th... hunks in the revision are b... them are true bug-fixes? It ... change log and understandi... One developer may think ... others think it is only a s... feature addition. To check ... true bug-fixes, we manually ... marked them as bug-fix ... judges, graduate students w... development experience, ... verification. A judge mark... projects (see Table 1) an... marks. Judges use a GUI-b... tool. The tool shows ind... Judges read the ... carefully and decide if the ..."
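The outlier rule quoted on slide 23 (file change counts greater than 5 times the 50% quartile, i.e. the median) can be written down directly. This is a minimal sketch: the revision names and counts are invented sample data, and `outlier_revisions` is a hypothetical helper.

```python
from statistics import median


def outlier_revisions(files_changed, factor=5):
    """Return revisions whose file-change count exceeds factor * median.

    `files_changed` maps a revision identifier to its number of changed files.
    """
    cutoff = factor * median(files_changed.values())
    return {rev for rev, count in files_changed.items() if count > cutoff}


if __name__ == "__main__":
    # Invented counts: most revisions touch 1-3 files, one touches 40.
    counts = {"r1": 1, "r2": 2, "r3": 1, "r4": 3, "r5": 40}
    print(outlier_revisions(counts))  # {'r5'}
```

With a median of 2 files per revision the cutoff is 10, so only the 40-file revision is dropped from the bug-fix set.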
  24. Experimental Setup. Identify all bug-introducing changes at method-level granularity. Compare the performance of SZZ against the new algorithm.
  25. False positive: the algorithm reports a bug-introducing change, but in reality that change did not introduce the bug. False negative: the algorithm misses a change that in reality is bug-introducing.
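The two definitions on slide 25 map onto set differences between what an algorithm reports and what is actually bug-introducing. The change identifiers below are invented for illustration; the function names are assumptions.

```python
def false_positives(reported, actual):
    """Changes the algorithm flags that did not really introduce a bug."""
    return reported - actual


def false_negatives(reported, actual):
    """Real bug-introducing changes the algorithm fails to flag."""
    return actual - reported


if __name__ == "__main__":
    # Hypothetical change identifiers.
    reported = {"c1", "c2", "c3"}
    actual = {"c2", "c3", "c4"}
    print(false_positives(reported, actual))  # {'c1'}
    print(false_negatives(reported, actual))  # {'c4'}
```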
  26. Experimental Results

      Performance Increase    False Positives    False Negatives
      Annotation Graphs       2%                 1-4%
      Formatting              32-45%             13-14%
      Outliers                7-16% (single value on the slide)
      Manual Inspection       4-5% (single value on the slide)

  27. Performance increase: removes 36-51% of false positives (FP) and 14% of false negatives (FN).
  28. Still not perfect, but better than anything existing so far!
  29. Good about this Paper
      • Good evaluation
      • Clear methodology
      • Performance increase
      • Extension of an existing algorithm
      • Identification of threats to validity
  30. Not so Good
      • Most issues only named but not addressed
      • Applications section very short
      • Changes after bug reports?
      • Errors in illustrations
      • Manual inspection
      • Fuzzy terms
