
Evaluating the Usefulness of IR-Based Fault Localization Techniques

Software debugging is tedious and time-consuming. To reduce the manual effort needed for debugging, researchers have proposed a considerable number of techniques to automate the process of fault localization; in particular, techniques based on information retrieval (IR) have drawn increased attention in recent years. Although reportedly effective, these techniques have some potential limitations that may affect their performance. First, their effectiveness is likely to depend heavily on the quality of the bug reports; unfortunately, high-quality bug reports that contain rich information are not always available. Second, these techniques have not been evaluated through studies that involve actual developers, which is less than ideal, as purely analytical evaluations can hardly show the actual usefulness of debugging techniques. The goal of this work is to evaluate the usefulness of IR-based techniques in real-world scenarios. Our investigation shows that bug reports do not always contain rich information, and that low-quality bug reports can considerably affect the effectiveness of these techniques. Our research also shows, through a user study, that high-quality bug reports benefit developers just as much as they benefit IR-based techniques. In fact, the information provided by IR-based techniques when operating on high-quality reports is only helpful to developers in a limited number of cases. Even in these cases, such information only helps developers get to the faulty file quickly, but does not help them in their most time-consuming task: understanding and fixing the bug within that file.



  1. 1. Evaluating the Usefulness of IR-Based Fault Localization Techniques Qianqian Wang* Chris Parnin** Alessandro Orso* * Georgia Institute of Technology, USA ** North Carolina State University, USA
  2. 2. Debugging is Difficult
  3. 3. Debugging is Difficult Let’s see… over 50 years of research on automated debugging: 1962, Symbolic Debugging (UNIVAC FLIT); 1981, Weiser, Program Slicing; 1999, Delta Debugging; 2001, Statistical Debugging; … and it is STILL difficult
  6. 6. IR-Based FL Techniques • How do they work? • Rank source files based on their lexical similarity to bug reports • How well do they work? Percentage of bugs whose faulty file is ranked in the Top 1: 35%, Top 5: 58%, Top 10: 69%
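The ranking idea on this slide can be sketched as plain TF-IDF cosine similarity between a bug report and each source file. This is a minimal illustration using invented file contents; real tools such as BugLocator refine the basic scheme (e.g., with document-length weighting and revised vector space models).

```python
# Minimal sketch of IR-based fault localization: rank source files by
# TF-IDF cosine similarity to a bug report. All data here is hypothetical.
import math
import re
from collections import Counter

def tokenize(text):
    # Lowercased alphanumeric tokens; real tools also split camelCase.
    return re.findall(r"[a-z0-9]+", text.lower())

def tfidf_vectors(docs):
    # docs: {name: text}; returns {name: {term: tf-idf weight}}
    tokenized = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n = len(docs)
    df = Counter()
    for counts in tokenized.values():
        df.update(counts.keys())
    return {name: {t: tf * math.log(n / df[t]) for t, tf in counts.items()}
            for name, counts in tokenized.items()}

def cosine(u, v):
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def rank_files(bug_report, source_files):
    # Build one corpus of files plus the report, then rank files by
    # similarity of their vectors to the report's vector.
    corpus = dict(source_files)
    corpus["__report__"] = bug_report
    vecs = tfidf_vectors(corpus)
    report_vec = vecs.pop("__report__")
    scores = {name: cosine(report_vec, vec) for name, vec in vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

files = {
    "CTabFolder.java": "tooltip listener showToolTip hideToolTip dispose tab folder",
    "Button.java": "button click widget paint",
}
report = "Native tooltips left around on CTabFolder; tooltip never goes away"
print(rank_files(report, files))  # CTabFolder.java ranks first
```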
  7. 7. Understanding IR-Based FL Techniques Bug ID: 90018 Summary: Native tooltips left around on CTabFolder. Description: Hover over the PartStack CTabFolder inside eclipse until some native tooltip is displayed. For example, the maximize button. When the tooltip appears, change perspectives using the keybinding. The CTabFolder gets hidden, but its tooltip is permanently displayed and never goes away, even if that CTabFolder is disposed (I'm assuming) when the perspective is closed. Source code file: CTabFolder.java: public class CTabFolder extends Composite { // tooltip int[] toolTipEvents = new int[] {SWT.MouseExit, SWT.MouseHover, SWT.MouseMove, SWT.MouseDown, SWT.DragDetect}; Listener toolTipListener; … /* Returns <code>true</code> if the CTabFolder only displays the selected tab and <code>false</code> if the CTabFolder displays multiple tabs. */ … void onMouseHover(Event event) { showToolTip(event.x, event.y); } void onDispose() { inDispose = true; hideToolTip(); … } }
  9. 9. • Does the presence of technical information affect the fault localization results? • How often do bug reports contain such information? • Is such information enough for developers to find the faulty files easily? Assessing IR-based FL Techniques
  10. 10. Analytical Study User Study
  11. 11. • Q1: Does technical information affect fault localization results? • Q2: How often do bug reports contain technical information? Analytical Study
  12. 12. Subjects Project (# bugs, # source files): AspectJ (286, 6k) SWT (98, 0.5k) ZXing (20, 0.4k) Joda-Time (9, 0.2k)
  13. 13. • Categorize bug reports • Stack traces/test cases/program entity names/natural language descriptions • Generate ranked lists • BugLocator - IR-based fault localization tool • Perform statistical analysis Q1: Method
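The "statistical analysis" step can be illustrated with a Mann-Whitney U test, a common choice for comparing the faulty file's rank across two groups of bug reports; whether the authors used this exact test is an assumption, and the data below is invented. This sketch uses the normal approximation without a tie correction in the variance.

```python
# Sketch of the kind of significance test behind the p < 0.05 threshold:
# a two-sided Mann-Whitney U test (normal approximation, no tie correction).
import math

def mann_whitney_u(xs, ys):
    """Return (U statistic for xs, two-sided approximate p-value)."""
    n1, n2 = len(xs), len(ys)
    combined = sorted([(v, 0) for v in xs] + [(v, 1) for v in ys])
    # Assign average ranks to tied values.
    ranks = [0.0] * (n1 + n2)
    i = 0
    while i < len(combined):
        j = i
        while j < len(combined) and combined[j][0] == combined[i][0]:
            j += 1
        avg = (i + j + 1) / 2.0  # average of 1-based ranks i+1 .. j
        for k in range(i, j):
            ranks[k] = avg
        i = j
    r1 = sum(r for r, (_, grp) in zip(ranks, combined) if grp == 0)
    u1 = r1 - n1 * (n1 + 1) / 2.0
    mu = n1 * n2 / 2.0
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u1 - mu) / sigma if sigma else 0.0
    # Two-sided p-value from the standard normal CDF.
    p = 2.0 * (1.0 - 0.5 * (1.0 + math.erf(abs(z) / math.sqrt(2.0))))
    return u1, p

# Hypothetical ranks of the faulty file in the tool's output:
with_entity = [1, 1, 2, 3, 1, 4, 2, 1, 5, 2]       # reports naming entities
without_entity = [8, 12, 30, 7, 25, 14, 40, 9, 18, 22]
u, p = mann_whitney_u(with_entity, without_entity)
print(f"U = {u}, p = {p:.4f}")  # p < 0.05 -> statistically significant
```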
  14. 14. Q1: Results • Does bug report information affect fault localization results? Program entity: √ Stack trace: X Test case: X (√ statistically significant difference: p < 0.05; X no statistically significant difference: p >= 0.05) Bug report characteristics affect IR-based fault localization results
  17. 17. • How often do bug reports contain technical information? • Select 10,000 bug reports from SWT Bugzilla • Check presence of technical information: • Stack traces • Test cases • Program entity names Q2: Method
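A rough sketch of how this presence check could be automated with regular expressions. The patterns below are illustrative guesses, not the study's actual heuristics; in particular, the CamelCase entity pattern also matches acronyms such as SWT.

```python
# Detect the three kinds of technical information the study counts.
import re

# Illustrative patterns (assumptions, not the study's exact heuristics).
PATTERNS = {
    # Java stack-trace frame, e.g. "at pkg.Cls.method(Cls.java:42)"
    "stack trace": re.compile(r"^\s*at\s+[\w$.]+\([\w$.]*:\d+\)", re.M),
    # JUnit-style test method or annotation
    "test case": re.compile(r"(?:\bpublic\s+void\s+test\w*|@Test\b)"),
    # Two or more CamelCase humps; also matches acronyms such as SWT.
    "program entity": re.compile(r"\b(?:[A-Z][a-z0-9]*){2,}\b"),
}

def categorize(report_text):
    """Return which kinds of technical information a bug report contains."""
    return {kind: bool(p.search(report_text)) for kind, p in PATTERNS.items()}

report = ("Native tooltips left around on CTabFolder.\n"
          "Hover over the PartStack CTabFolder until a tooltip is displayed.")
print(categorize(report))  # only "program entity" is True
```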
  18. 18. Q2: Results • How often do bug reports contain technical information? Stack traces: 10% Test cases: 3% Program entity names: 32% The majority of bug reports do not contain enough information
  21. 21. Additional Finding: “Optimistic” Evaluation Approach • Assumption: changed files = faulty files • Reality: • 40% of bugs involve multiple changed files • Not all changed files contain bugs • Best-ranked files may not be faulty Results of existing studies might be worse than reported
  23. 23. • Q3: Does bug report information affect developers’ performance? • Q4: Do IR-based techniques help developers’ performance? User Study
  24. 24. Experimental Protocol: Setup Participants: 70 developers (graduate students) Software subject: • Eclipse SWT • 2 bugs for each developer Task: find and fix the bug Tools: • Eclipse plug-in • Integrating ranked lists • Logging
  25. 25. Experimental Protocol: Variables • Bug related: good/bad bug report • Tool related (i.e., with/without the tool): with/without a ranked list; good/bad ranked list
  27. 27. Experimental Protocol: Evaluation Metrics Time • To find the faulty file • To locate the bug Debugging score
  28. 28. Q3: Results Compared the performance of 2 groups:
 1. without tool, good bug reports
 2. without tool, bad bug reports Time used to find the faulty file: √ Time used to locate the bug: X Debugging score: √ (√ statistically significant difference: p < 0.05; X no statistically significant difference: p >= 0.05) Good bug reports (i.e., with entity names) allow developers to shorten the time to find the faulty file and help them find better fixes
  31. 31. Q4: Results Compared the performance of 2 groups under 4 conditions:
 1. without tool, {good|bad} bug reports, {good|bad} ranked list
 2. with tool, {good|bad} bug reports, {good|bad} ranked list Conditions crossed {good|bad} ranked list with {good|bad} bug report; metrics were debugging score, time to find the file, and time to locate the bug. Only one of the twelve condition/metric combinations showed a statistically significant difference: the time to find the faulty file given a good ranked list but a bad bug report. Only perfect ranked lists help when users cannot get enough hints from bug reports. The tool only helps find the faulty file, but developers spend much more time locating the bug in the faulty file than finding that file
  36. 36. Additional Observations • Developers used program entity names in the bug report as search keywords. • Ranked lists generated by IR-based techniques affected users’ debugging behavior • Gave a starting point • Gave them confidence
  37. 37. Summary • Studied the practical usefulness of IR-based FL techniques • Performed both an analytical study and a user study • Main findings • Bug report characteristics affect IR-based fault localization results • Results of existing studies might be worse than reported • The majority of bug reports do not contain enough information • “Good” bug reports allow developers to shorten the time to find the faulty file and help them find better fixes • Only perfect ranked lists help when users cannot get enough hints from bug reports • The tool only helps find the faulty file, but developers spend much more time locating the bug in the faulty file than finding that file
  38. 38. Implications • Better bug reports are needed • Automated debugging techniques should focus on improving results for bug reports with little information • Automated debugging techniques should provide finer-grained information and context • More user studies and realistic evaluations are needed
