Duplicate Bug Reports Considered Harmful ... Really?

Talk given at ICSM 2008 Conference in Beijing, China.
Duplicate bug reports are commonly believed to pollute bug reporting systems and to have negative effects on a development team's productivity. Therefore, duplicate reports are usually ignored once identified. The findings of this research show that duplicate reports actually contain extra information that is not present in the original bug reports, and that developers can potentially benefit from this information. We conduct experiments and a case study on ECLIPSE to quantify the amount of extra information. We show that this extra information can be used to enhance techniques related to bug fixing, such as triaging.

Published in: Education, Technology


  1. 1. Duplicate Bug Reports Considered Harmful ... Really?! Nicolas Bettenburg (Saarland University; Queenʼs University), Rahul Premraj (Saarland University; Vrije Uni. Amsterdam), Tom Zimmermann (University of Calgary; Microsoft Research), Sunghun Kim (MIT CSAIL; Hong Kong University)
  2. 2. 2
  3. 3. Duplicate Bug Reports # 2271 A Bug Database 3
  4. 4. Duplicate Bug Reports BUG # 2271 A Bug Database 3
  5. 5. Duplicate Bug Reports # 2271 # 3219 A B Bug Database 3
  6. 6. Duplicate Bug Reports # 2271 # 3219 A B Bug Database 4
  7. 7. Duplicate Bug Reports # 2271 # 3219 A B Bug Database 4
  8. 8. What are the reasons for duplicates? 5
  9. 9. Inexperienced Users 6
  10. 10. Poor Search Feature 7
  11. 11. Multiple Failures - One Defect 8
  12. 12. Accidental Resubmission 9
  13. 13. FIX THAT BUG! Intentional Resubmission 10
  14. 14. ECLIPSE 20% Duplicates 371 per month 11
  15. 15. Duplicate reports are usually ignored once identified! 12
  16. 16. But Wait! Is this really the right thing to do? 13
  17. 17. “Duplicates [...] often add useful information. [It is unfortunate that this information is filed in a new report.]” (Developer; source: What Makes a Good Bug Report?, to appear in FSE 2008) 14
  18. 18. Alan Page, Director of Test Excellence, Microsoft 15
  19. 19. “Bug duplicates can provide valuable information [...]” Alan Page, Director of Test Excellence, Microsoft 15
  20. 20. 2 EXPERIMENTS. Experiment 1: Do duplicate bug reports contain additional information? Experiment 2: Can additional information improve bug triaging? 16
  21. 21. Experiment 1 Do duplicate bug reports contain additional information? 17
  22. 22. The infoZilla Tool Detects and Extracts Structural Information: SCREENSHOTS, SOURCE CODE, PATCHES, STACK TRACES. Source: Extracting Structural Information from Bug Reports, MSR 2008. Example report shown on the slide:
      Bug 137808
      Summary: Exceptions from createFromString lock-up the editor
      Product: [Modeling] EMF
      Reporter: Patrick Sodre <psodre@gmail.com>
      Component: Core
      Assignee: Marcelo Paternostro <marcelop@ca.ibm.com>
      Status: VERIFIED FIXED
      QA Contact:
      Severity: normal
      Priority: P3
      CC: merks@ca.ibm.com
      Version: 2.2
      Target Milestone: ---
      Hardware: PC
      OS: Windows XP
      Whiteboard:
      Opened: 2006-04-20 14:25 -0400
      Description:
      As discussed on the newsgroup under the Thread with the same name I am opening this bug entry. Here is a history of the thread.
      -- From Ed Merks
      Patrick, The value is checked before it's applied and can't be applied until it's valid. But this BigDecimal cases behaves oddly because the exception thrown by new BigDecimal("badvalue") has a null message and the property editor relies on returning a non-null message string to indicate there is an error. Please open a bugzilla which I'll fix like this:
      ### Eclipse Workspace Patch 1.0
      #P org.eclipse.emf.edit.ui
      Index: src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java
      ===================================================================
      RCS file: /cvsroot/tools/org.eclipse.emf/plugins/org.eclipse.emf.edit.ui/src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java,v
      retrieving revision 1.10
      diff -u -r1.10 PropertyDescriptor.java
      --- src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 21 Mar 2006 16:42:30 -0000 1.10
      +++ src/org/eclipse/emf/edit/ui/provider/PropertyDescriptor.java 20 Apr 2006 11:59:10 -0000
      @@ -162,7 +162,8 @@
       }
       catch (Exception exception)
       {
      - return exception.getMessage();
      + String message = exception.getMessage();
      + return message == null ? exception.toString() : message;
       }
       }
       Diagnostic diagnostic = Diagnostician.INSTANCE.validate(EDataTypeCellEditor.this.eDataType, value);
      Patrick Sodre wrote:
      Hi, It seems that if the user inputs an invalid parameter that gets created from "createFromString" the Editor locks-up until the user explicitly calls "restore Default Value". Is this the expected behavior or could something better be done? For instance if an exception is thrown restore the value back to what it was before after displaying a pop-up error message. I understand that for DataTypes defined by the user he/she should take care of catching the exceptions but for the default ones like BigInteger/BigDecimal I think the EMF runtime could do some of the grunt work... If you think this is something worth pursuing I could post an entry in Bugzilla. Regards, Patrick Sodre
      Below is the stack trace that I got from the Editor...
      java.lang.NumberFormatException
      at java.math.BigDecimal.<init>(BigDecimal.java:368)
      at java.math.BigDecimal.<init>(BigDecimal.java:647)
      at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createEBigDecimalFromString(EcoreFactoryImpl.java:559)
      at org.eclipse.emf.ecore.impl.EcoreFactoryImpl.createFromString(EcoreFactoryImpl.java:116)
      at org.eclipse.emf.edit.ui.provider.PropertyDescriptor$EDataTypeCellEditor.doGetValue(PropertyDescriptor.java:183)
      at org.eclipse.jface.viewers.CellEditor.getValue(CellEditor.java:449)
      at org.eclipse.ui.views.properties.PropertySheetEntry.applyEditorValue(PropertySheetEntry.java:135)
      at org.eclipse.ui.views.properties.PropertySheetViewer.applyEditorValue(PropertySheetViewer.java:249)
      at
      ------- Comment #1 From Ed Merks 2006-04-20 15:09:23 -0400 -------
      The fix has been committed to CVS. Thanks for reporting this problem.
  23. 23. Experimental Setup [figure: infoZilla extracts Elements from the Master Report and Elements from the Duplicate Report] 19
  24. 24. Experimental Setup [figure: compare the Elements of the Master Report with the Elements of the Extended Report] 20
  25. 25. ECLIPSE 21
  26. 26. 16,511 Master Reports ECLIPSE 21
  27. 27. 16,511 Master Reports ECLIPSE 27,838 Duplicate Reports 21
  28. 28. 16,511 Master Reports ECLIPSE 27,838 Duplicate Reports Unique elements per report: Master Extended 21
  29. 29. 16,511 Master Reports ECLIPSE 27,838 Duplicate Reports Unique elements per report [bar chart: Patches, Stacktraces, Screenshots; Master vs. Extended; axis 0 to 2.5; bar labels 1.94, 1.83, 1.42, 0.50, 0.29, 0.14] 21
  30. 30. Experiment 1 Do duplicate bug reports contain additional information? 22
  31. 31. Experiment 1 Do duplicate bug reports contain additional information? They do! 22
  32. 32. Experiment 2 Can additional information improve bug triaging? 23
  33. 33. Bug Triage Developer 24
  34. 34. Bug Triage BUG Report Developer 24
  35. 35. Bug Triage BUG Developer ✓BUG Fixed Report 24
  36. 36. Bug Triage BUG BUG ✓ BUG BUG BUG BUG BUG BUG Report BUG Developer Fixed 24
  37. 37. Bug Triage BUG BUG ✓ BUG BUG BUG BUG BUG BUG Report BUG Triager Developer Fixed 24
  38. 38. Experimental Setup • Machine learning to predict developers • Train using master reports • Train using extended reports • 10 runs 25
  39. 39. Results for predicting Top-5 developers [bar chart: Precision (axis 43.75 to 70.00) for runs 1 to 10 and All; Master vs. Extended; bar labels range from 42 to 65] 26
  40. 40. Experiment 2 Can additional information improve bug triaging? 27
  41. 41. Experiment 2 Can additional information improve bug triaging? They can! 27
  42. 42. Duplicate reports are usually ignored once identified! 28
  43. 43. Duplicate reports are usually ignored once identified! [crossed out] Merge Reports 28
  44. 44. 29
  45. 45. 29
  46. 46. 29
  47. 47. 29
  48. 48. 29
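
The setup on slides 22 to 24 (extract structural elements from each report, then compare the master report with the extended report) can be illustrated with a small sketch. This is not the infoZilla implementation: the two regular expressions, the function names `extract_elements` and `extended_elements`, and the sample report texts are all assumptions made for illustration, and real bug reports would need far more robust detection.

```python
import re

# Assumed, simplified patterns (not infoZilla's): a Java stack frame line
# and a unified-diff hunk header.
STACK_FRAME = re.compile(
    r"^[ \t]*at[ \t]+[\w.$<>]+\([^)\n]*\)[ \t]*$", re.MULTILINE
)
DIFF_HUNK = re.compile(r"^@@ -\d+(?:,\d+)? \+\d+(?:,\d+)? @@", re.MULTILINE)


def extract_elements(text):
    """Return the set of (kind, value) pairs found in one report's text."""
    elements = set()
    for match in STACK_FRAME.finditer(text):
        elements.add(("stack_frame", match.group(0).strip()))
    for match in DIFF_HUNK.finditer(text):
        elements.add(("patch_hunk", match.group(0)))
    return elements


def extended_elements(master_text, duplicate_texts):
    """Union of the master report's elements with those of its duplicates."""
    merged = extract_elements(master_text)
    for text in duplicate_texts:
        merged |= extract_elements(text)
    return merged


# Hypothetical report texts, loosely modeled on the slide's example.
master = """Editor locks up on bad input.
java.lang.NumberFormatException
    at java.math.BigDecimal.<init>(BigDecimal.java:368)
"""
duplicate = """Same lock-up here, with a longer trace:
    at java.math.BigDecimal.<init>(BigDecimal.java:368)
    at java.math.BigDecimal.<init>(BigDecimal.java:647)
"""

master_only = extract_elements(master)
extended = extended_elements(master, [duplicate])
added = extended - master_only  # elements contributed only by duplicates
print(len(master_only), len(extended), sorted(added))
```

Using sets means an element that appears in both the master and a duplicate is counted once, which matches the idea of counting unique elements per report.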

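The second experiment (slides 38 and 39) trains a developer-prediction model once on master reports and once on extended reports, then scores the top-5 predictions. The slides do not say which learner was used, so the sketch below swaps in a deliberately naive token-count scorer. The names `train`, `top_k_developers`, and `top_k_hit_rate`, the developers `dev_a` and `dev_b`, and the toy reports are all hypothetical, and the hit rate is only an assumed stand-in for the precision shown on the slide. The toy numbers say nothing about the real results.

```python
from collections import Counter, defaultdict


def train(reports):
    """Count token occurrences per developer from (text, developer) pairs."""
    token_counts = defaultdict(Counter)
    for text, developer in reports:
        token_counts[developer].update(text.lower().split())
    return token_counts


def top_k_developers(token_counts, text, k=5):
    """Rank developers by how often they have seen the report's tokens."""
    tokens = text.lower().split()
    scores = {
        dev: sum(counts[token] for token in tokens)
        for dev, counts in token_counts.items()
    }
    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda dev: (-scores[dev], dev))
    return ranked[:k]


def top_k_hit_rate(token_counts, test_reports, k=5):
    """Fraction of test reports whose actual fixer is in the top-k list."""
    hits = sum(
        1
        for text, developer in test_reports
        if developer in top_k_developers(token_counts, text, k)
    )
    return hits / len(test_reports)


# Toy data (hypothetical): the extended report adds text from a duplicate.
master_train = [
    ("build fails on linux", "dev_a"),
    ("editor locks up", "dev_b"),
]
extended_train = [
    ("build fails on linux", "dev_a"),
    ("editor locks up NumberFormatException BigDecimal", "dev_b"),
]
test_reports = [("NumberFormatException in BigDecimal parsing", "dev_b")]

master_rate = top_k_hit_rate(train(master_train), test_reports, k=1)
extended_rate = top_k_hit_rate(train(extended_train), test_reports, k=1)
print(master_rate, extended_rate)
```

The example uses `k=1` only because the toy data has two developers; the slides evaluate top-5 lists over 10 runs.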