Bug Prediction and Analysis

  • Bug Prediction and Analysis

    1. 1. Bug Prediction & Analysis Marco D’Ambros
    2. 2. As users, we are used to bugs...
    3. 3. ... and also as developers
    4. 4. But the perception in reverse engineering is different
    5. 5. But the perception in reverse engineering is different There are thousands of bugs
    6. 6. Prediction
    7. 7. Theory: prove correlations with software metrics. Practice: focus resources on bug-prone components; rank components according to their bug-proneness.
    8. 8. Release x -> Bug prediction. Classification: class A will/won't have bugs. Ranking: class A will have more bugs than class B.
    9. 9. Release x -> Bug prediction. Classification: class A will/won't have bugs. Ranking: class A will have more bugs than class B. Correct?
    10. 10. Release x Bug prediction
    11. 11. Release x: software metrics (mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc) -> bug prediction -> list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database -> list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. [Figure: sample metric tables]
    12. 12. Svn / Cvs repository -> check out -> Release x: software metrics (mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc) -> bug prediction -> list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database -> list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. [Figure: sample metric tables]
    13. 13. Svn / Cvs repository -> check out -> Release x: software metrics (mccabe, fanout, fanin, wloc, nom, noa, ona, pna, noc, loc) -> bug prediction -> list of classes ranked by the prediction. Release x+1: bug extraction from the Bugzilla database -> list of classes ranked by the number of actual bugs. Comparing the two lists gives the prediction performance. [Figure: sample metric tables]
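    14. 14. Svn / Cvs repository -> check out -> system release -> parsing -> FAMIX model (classes with attributes). Svn / Cvs repository -> versioning system logs -> parsing -> commit comments. Bugzilla database -> query -> parsing -> bug reports. A class / file is linked to its versioning system log; a bug reference in the commit comment gives the inferred link between a bug and a class / file.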
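The linking step on this slide (finding bug references in commit comments) can be sketched as follows. This is a minimal illustration, not the tooling used in the talk: the regular expression `BUG_REF`, the function `link_bugs_to_files`, and the commit dictionaries are all hypothetical, and real projects follow their own commit-message conventions.

```python
import re

# Hypothetical pattern: matches "bug 12345" or "Bug #12345" (case-insensitive).
# This is an assumption; adapt it to the conventions of the project at hand.
BUG_REF = re.compile(r"\bbug\s*#?\s*(\d+)", re.IGNORECASE)


def link_bugs_to_files(commits, known_bug_ids):
    """Map each file to the set of known bug ids referenced by commits touching it."""
    links = {}
    for commit in commits:
        referenced = {int(m) for m in BUG_REF.findall(commit["comment"])}
        # Keep only ids that exist in the bug database, to drop false matches.
        for bug_id in referenced & known_bug_ids:
            for path in commit["files"]:
                links.setdefault(path, set()).add(bug_id)
    return links


# Illustrative data only.
commits = [
    {"comment": "Fix for bug 12345: null check in parser", "files": ["Parser.java"]},
    {"comment": "Refactoring, no functional change", "files": ["Parser.java", "Lexer.java"]},
    {"comment": "bug #67890 and bug 99999", "files": ["Lexer.java"]},
]
links = link_bugs_to_files(commits, known_bug_ids={12345, 67890})
print(links)
```

Checking the extracted ids against the bug database is one simple way to limit false links; the reference to 99999 is ignored here because no such bug is known.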
    15. 15. Classification: precision & recall. Ranking: Spearman correlation coefficient.
    16. 16. Classification: precision & recall. Ranking: Spearman correlation coefficient. [Venn diagram: buggy classes and classes predicted as buggy]
    17. 17. Classification: precision & recall. Ranking: Spearman correlation coefficient. [Venn diagram: buggy classes and classes predicted as buggy, with regions FN, TP and FP]
    18. 18. Classification: precision & recall. Precision: how small FP is. Recall: how small FN is. Ranking: Spearman correlation coefficient. [Venn diagram: buggy classes and classes predicted as buggy, with regions FN, TP and FP]
    19. 19. Classification: precision & recall. Precision: how small FP is. Recall: how small FN is. Ranking: Spearman correlation coefficient, comparing the predicted ranking (Class D, Class A, Class E, ...) with the observed ranking (Class E, Class A, Class D, ...). [Venn diagram: buggy classes and classes predicted as buggy, with regions FN, TP and FP]
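The two evaluation measures named on these slides can be sketched in plain Python. This is an illustrative implementation under stated assumptions, not code from the talk: the function names and sample data are hypothetical, and Spearman's coefficient is computed here as the Pearson correlation of average ranks (ties share the mean rank).

```python
def precision_recall(actual_buggy, predicted_buggy):
    """Precision and recall from two sets of class names."""
    tp = len(actual_buggy & predicted_buggy)   # buggy and predicted as buggy
    fp = len(predicted_buggy - actual_buggy)   # predicted as buggy, but clean
    fn = len(actual_buggy - predicted_buggy)   # buggy, but not predicted
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def ranks(values):
    """1-based ranks in ascending order; tied values share the mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = mean_rank
        pos = end + 1
    return result


def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = ranks(xs), ranks(ys)
    n = len(xs)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Illustrative data only.
p, r = precision_recall({"A", "D", "E"}, {"A", "E", "F", "G"})
rho = spearman([5, 3, 1, 0], [4, 2, 3, 0])  # predicted vs. observed bug counts
print(p, r, rho)
```

In this example two of the four predicted classes are actually buggy (precision 0.5), two of the three buggy classes are found (recall 2/3), and the two rankings swap one adjacent pair (Spearman 0.8).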
    20. 20. Approaches are based on: History Metrics
    21. 21. Predicting Defects for Eclipse Thomas Zimmermann Rahul Premraj Andreas Zeller  Saarland University
    22. 22. Experimental settings. Release 2.0: 6740 files, 376 packages. Release 2.1: 7900 files, 433 packages. Release 3.0: 6614 files, 429 packages. Pre-release defects and post-release defects: 6 months before/after release.
    23. 23. Classification of classes using logistic regression models: max precision 0.68, max recall 0.38. [Venn diagram: buggy classes and classes predicted as buggy, with regions FN, TP and FP]
    24. 24. Ranking classes (correlation, scale 0 to 1.00): McCabe complexity 0.401; Method LOC 0.405; Total LOC 0.42; Linear regression model 0.416.
    25. 25. Ranking classes (correlation, scale 0 to 1.00): McCabe complexity 0.401; Method LOC 0.405; Total LOC 0.42; Linear regression model 0.416; Pre-release defects 0.907.
    26. 26. Conclusion: past defects are the predictor for future defects.
    27. 27. Conclusion: software metrics correlate with defects but are not usable in practice. Past defects are the predictor for future defects.
    28. 28. Mining metrics to predict component failures Nachiappan Nagappan Thomas Ball Microsoft Research Andreas Zeller  Saarland University
    29. 29. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC); DirectX (306 KLOC); Process messaging component (147 KLOC); NetMeeting (109 KLOC); IIS Core (37 KLOC). Granularity level: module.
    30. 30. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC); DirectX (306 KLOC); Process messaging component (147 KLOC); NetMeeting (109 KLOC); IIS Core (37 KLOC). Granularity level: module (a binary file within Windows).
    31. 31. Experimental settings. Project (code size): Internet Explorer 6 (511 KLOC); DirectX (306 KLOC); Process messaging component (147 KLOC); NetMeeting (109 KLOC); IIS Core (37 KLOC). Granularity level: module (a binary file within Windows), i.e. a set of classes.
    32. 32. Q1 Do complexity metrics correlate with defects?
    33. 33. Q1 Do complexity metrics correlate with defects? [Bar chart for projects A to E, scale 0 to 1.00: maximum correlation and percentage of correlated metrics]
    34. 34. Q2 Is there a unique set of metrics that predicts defects in all projects?
    35. 35. Q3 Can we combine metrics to predict defects?
    36. 36. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics
    37. 37. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics -> Principal component analysis
    38. 38. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics -> Principal component analysis -> Linear/logistic regression model
    39. 39. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics -> Principal component analysis -> Linear/logistic regression model. [Bar chart for projects A to E, scale 0 to 1.00: Spearman/Pearson correlation and percentage of splits which correlate]
    40. 40. Q3 Can we combine metrics to predict defects? Multicollinearity of metrics -> Principal component analysis -> Linear/logistic regression model. [Bar chart for projects A to E, scale 0 to 1.00: Spearman/Pearson correlation and percentage of splits which correlate; annotation: too few samples]
    41. 41. Q4 Are predictors obtained from one project applicable to other projects?
    42. 42. Conclusion Metrics can be used to predict defects
    43. 43. Conclusion Metrics can be used to predict defects but
    44. 44. Conclusion Metrics can be used to predict defects but they must be validated on the history
    45. 45. Improving Defect Prediction Using Temporal Features and Non Linear Models Abraham Bernstein Jayalath Ekanayake Martin Pinzger University of Zurich
    46. 46. Experimental settings. Plugin (years, files): updateui (7, 757); updatecore (7, 459); search (6.5, 540); pdeui (6.5, 1621); pdebuild (6, 198); compare (6.5, 315). Non-linear models based on 21 historical metrics + LOC.
    47. 47. Classification of files using decision tree learners. All files: A. Correctly classified files: CC. Accuracy = Size(CC) / Size(A).
    48. 48. Classification of files using decision tree learners. All files: A. Correctly classified files: CC. Accuracy = Size(CC) / Size(A). Best predictor (7 metrics): accuracy 99.16%.
    49. 49. Ranking of files using the m5 tree regression algorithm. Spearman correlation: predictor based on 7 metrics 0.966; Zimmermann’s pre-release defects 0.907.
    50. 50. Conclusion Defect prediction can be improved with: Historical information Non-linear function
    51. 51. Predicting Faults Using the Complexity of Code Changes Ahmed E. Hassan Queen’s University
    52. 52. Complexity = Entropy. The more changes are distributed, the higher the entropy. The intuition is that one change affecting one file only is simpler than one affecting many different files, as the developer who has to perform the change has to keep track of all of them. Hassan proposed to use Shannon entropy, defined as $H_n(P) = -\sum_{k=1}^{n} p_k \log_2 p_k$ (1), where $p_k$ is the probability that file k changes during the considered time interval. [Figure: changes to File A, File B and File C over three time intervals t1, t2, t3 of 2 weeks each]
    53. 53. Complexity = Entropy. Worked example for one time interval: computing $H_n(P)$.
    54. 54. Complexity = Entropy. Worked example: 4 changes in the interval.
    55. 55. Complexity = Entropy. Worked example: File A has 2 of the 4 changes. $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4}$
    56. 56. Complexity = Entropy. Worked example: File B has 1 of the 4 changes. $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4}$
    57. 57. Complexity = Entropy. Worked example: File C has 1 of the 4 changes. $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4} - \frac{1}{4} \log_2 \frac{1}{4}$
    58. 58. Complexity = Entropy. Worked example: $H_n(P) = -\frac{2}{4} \log_2 \frac{2}{4} - \frac{1}{4} \log_2 \frac{1}{4} - \frac{1}{4} \log_2 \frac{1}{4} = 0.5 + 0.5 + 0.5 = 1.5$
    59. 59. Complexity = Entropy. Timeline annotations: H = 1.5 and H > 1.5?
    60. 60. Complexity = Entropy. Timeline annotations: H = 1.5 and H > 1.5?
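The Shannon entropy formula on these slides can be checked with a few lines of Python. This is a minimal sketch: the function name `change_entropy` and the per-file change counts are illustrative assumptions, with $p_k$ estimated as the share of the interval's changes that touch file k.

```python
import math


def change_entropy(change_counts):
    """Shannon entropy (in bits) of how changes are spread over files.

    change_counts maps a file name to its number of changes in one interval.
    """
    total = sum(change_counts.values())
    entropy = 0.0
    for count in change_counts.values():
        if count:  # files without changes contribute nothing
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


# Change shares 2/4, 1/4 and 1/4, as in the worked example.
h_spread = change_entropy({"File A": 2, "File B": 1, "File C": 1})
# All changes in a single file: no scattering at all.
h_single = change_entropy({"File A": 4})
print(h_spread, h_single)
```

With shares 2/4, 1/4 and 1/4 the result is 0.5 + 0.5 + 0.5 = 1.5 bits, while changes confined to one file give 0, which matches the intuition that scattered changes are more complex.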
    61. 61. History of Complexity Metric (HCM). To use the entropy of code change as a bug predictor, Hassan defined the History of Complexity Metric (HCM) of a file j as $HCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} HCPF_i(j)$ (3), where {a,..,b} is a set of evolution periods and HCPF is defined as $HCPF_i(j) = \begin{cases} c_{ij} * H_i, & j \in F_i \\ 0, & \text{otherwise} \end{cases}$ (4), where i is a period with entropy $H_i$, $F_i$ is the set of files modified in the period i and j is a file belonging to $F_i$. According to the definition of $c_{ij}$, there are three types of HCM: (1) $c_{ij} = 1$: every file modified in the considered period i gets the entropy of the system in the considered time interval; this approach defines HCM. (2) $c_{ij} = p_j$: each modified file gets the entropy of the system weighted with the probability of the file being modified (WHCM).
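The HCM definition above can be sketched as follows, covering the two variants listed on the slide ($c_{ij} = 1$ and $c_{ij} = p_j$). This is an illustration under assumptions, not Hassan's implementation: the function names, the data layout (one dictionary of per-file change counts per period) and the estimate of $p_j$ as a file's share of the period's changes are all hypothetical choices.

```python
import math


def entropy(counts):
    """Shannon entropy H_i (bits) of the change distribution of one period."""
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values() if c)


def hcm(periods, weighted=False):
    """Sum HCPF_i(j) = c_ij * H_i over all periods for every file j.

    periods: list of dicts mapping file -> number of changes in that period.
    weighted=False uses c_ij = 1 (HCM); weighted=True uses c_ij = p_j (WHCM).
    """
    scores = {}
    for counts in periods:
        total = sum(counts.values())
        if total == 0:
            continue  # no changes in this period
        h_i = entropy(counts)
        for file_name, count in counts.items():
            if count == 0:
                continue  # file not in F_i, so HCPF_i(j) = 0
            c_ij = count / total if weighted else 1.0
            scores[file_name] = scores.get(file_name, 0.0) + c_ij * h_i
    return scores


# Illustrative data: two periods with entropies 1.5 and 1.0.
periods = [{"A": 2, "B": 1, "C": 1}, {"A": 1, "B": 1}]
plain = hcm(periods)
weighted = hcm(periods, weighted=True)
print(plain, weighted)
```

In this example file C is touched only in the first period, so it receives only that period's entropy, and the weighted variant additionally scales each contribution by the file's share of the changes.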
    62. 62. HCM with decay factors. In EDHCM (Exponentially Decayed HCM), entropies for earlier periods of time, i.e., earlier modifications, have their contribution reduced exponentially over time, modelling an exponential decay. EDHCM was introduced by Hassan. Similarly, LDHCM (Linearly Decayed) and LGDHCM (LoGarithmically Decayed) have their contributions reduced over time in a respectively linear and logarithmic fashion. Both are novel. The definitions of the variants follow: $EDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{e^{\phi_1 * (|\{a,..,b\}| - i)}}$; $LDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_2 * (|\{a,..,b\}| + 1 - i)}$; $LGDHCM_{\{a,..,b\}}(j) = \sum_{i \in \{a,..,b\}} \frac{HCPF_i(j)}{\phi_3 * \ln(|\{a,..,b\}| + 1.01 - i)}$, where $\phi_1$, $\phi_2$ and $\phi_3$ are the decay factors.
    63. 63. HCM with decay factors (same definitions as the previous slide). Highlighted: EDHCM, exponentially decayed.
    64. 64. HCM with decay factors (same definitions as the previous slide). Highlighted: EDHCM, exponentially decayed, with the exponential factor $e^{\phi_1 * (|\{a,..,b\}| - i)}$.
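The three decayed variants differ only in the divisor applied to each period's contribution, which the following sketch makes explicit. It is an illustration, not the authors' code: the function name `decayed_hcm` and its `kind` parameter are hypothetical, and periods are assumed to be numbered 1 to n so that i can be used directly in the divisor.

```python
import math


def decayed_hcm(hcpf, kind, phi=1.0):
    """Decayed HCM of one file.

    hcpf: list of HCPF_i(j) values for periods i = 1..n (oldest first).
    kind: "exp" (EDHCM), "lin" (LDHCM) or "log" (LGDHCM).
    phi:  the decay factor (phi_1, phi_2 or phi_3 respectively).
    """
    n = len(hcpf)
    total = 0.0
    for i, value in enumerate(hcpf, start=1):
        if kind == "exp":
            divisor = math.exp(phi * (n - i))
        elif kind == "lin":
            divisor = phi * (n + 1 - i)
        elif kind == "log":
            # 1.01 keeps the logarithm positive for the latest period (i = n)
            divisor = phi * math.log(n + 1.01 - i)
        else:
            raise ValueError(f"unknown decay kind: {kind}")
        total += value / divisor
    return total


# Illustrative values: HCPF 1.5 in the older period, 1.0 in the latest one.
history = [1.5, 1.0]
edhcm = decayed_hcm(history, "exp")
ldhcm = decayed_hcm(history, "lin")
lgdhcm = decayed_hcm(history, "log")
print(edhcm, ldhcm, lgdhcm)
```

In every variant the older period is divided by a larger number than the latest one, so recent modifications weigh more.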
    65. 65. Experimental settings. System (start date, #subsystems): NetBSD (March 1993, 235); FreeBSD (June 1993, 152); OpenBSD (Oct 1995, 265); Postgres (July 1996, 280); KDE (April 1997, 108); KOffice (April 1998, 158). Predictors at subsystem level: entropy metrics, number of past modifications, number of past defects.
    66. 66. Models fitting in terms of $R^2$. [Bar chart, scale 0 to 0.6: past defects, past changes, HCM, WHCM and EDHCM for NetBSD, FreeBSD, OpenBSD, Postgres, KDE and KOffice]
    67. 67. Prediction error: number of past changes vs entropy. [Bar chart for NetBSD, FreeBSD, OpenBSD, Postgres, KDE and KOffice, scale 0 to 37.5: #Changes - WHCM (%) and #Changes - EDHCM (%)]
    68. 68. Prediction error: number of past defects vs entropy. [Bar chart for NetBSD, FreeBSD, OpenBSD, Postgres, KDE and KOffice, scale -20.0 to 40.0: #Defects - WHCM (%) and #Defects - EDHCM (%)]
    69. 69. Conclusion: models based on entropy of changes are better defect predictors than number of past changes or defects.
    70. 70. Conclusion: models based on entropy of changes are better defect predictors than number of past changes or defects. A complex code change process negatively affects its product, the software system.
    71. 71. Epilogue
    72. 72. Epilogue: defect prediction research has been active for several years. A large number of scientific papers have been published.
    73. 73. Epilogue: we can predict defects, but the results still have limited practical usability.
    74. 74. Epilogue Predicting bugs is very difficult because developing code is a human activity
    75. 75. Epilogue A human activity influenced by too many factors How complex was the piece of code? How tested? How experienced was the developer?
    76. 76. Epilogue A human activity influenced by too many factors How complex was the piece of code? How tested? How experienced was the developer? How tired was the developer? How integrated was the developer in the team? Did he like his job?
    77. 77. Epilogue A human activity influenced by too many factors. FOCUS: How complex was the piece of code? How tested? How experienced was the developer? How tired was the developer? How integrated was the developer in the team? Did he like his job?
    78. 78. Epilogue A human activity influenced by too many factors. FOCUS: How complex was the piece of code? How tested? How experienced was the developer? No data yet: How tired was the developer? How integrated was the developer in the team? Did he like his job?
    79. 79. Analysis
    80. 80. Detect the critical bugs properties of components number of bugs
    81. 81. Detect the critical bugs properties of components number of bugs
    82. 82. Detect the critical components number of bugs properties of bugs
    83. 83. bugzero, bugzilla, census, customerfirst, defect-agent, extraview-bug-tracker, fast-bugtrack, fogbugz, gnats, ibm-rational-clearquest, ictracker, issue-organizer, issuenet-intercept, issueview, jira, legendsoft-spots, mantis, new-fire, omnitracker, pointinsight, pr-tracker, problemtracker, quickbugs, radar, razor, rmtrack-bug-tracking
    84. 84. 4 facts about bugs
    85. 85. Bugs are differently harmful Blocker Critical Major Normal Minor Trivial Enhancement
    86. 86. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement. Bugzilla is used to report bugs and change requests.
    87. 87. Bugs are differently harmful: Blocker, Critical, Major, Normal, Minor, Trivial, Enhancement. Bugzilla is used to report bugs and change requests.
    88. 88. Bugs are graphs
    89. 89. Bugs evolve
    90. 90. An ideal bug life cycle Unconfirmed
    91. 91. An ideal bug life cycle Unconfirmed Verified New Resolved Closed Assigned
    92. 92. A bit less ideal Unconfirmed Verified New Resolved Closed Assigned
    93. 93. A bit less ideal Unconfirmed Verified New Resolved Closed Assigned Reopened
    94. 94. The reality Unconfirmed Verified New Resolved Closed Assigned Reopened
    95. 95. The reality Unconfirmed Verified New Resolved Closed Assigned Reopened
    96. 96. All bug properties can change over time. Bug: Problem (id, description, product, component); Criticality (severity, priority); Involved people (assignedTo, reporter, qa); State (status, resolution); ...
    97. 97. All bug properties can change over time. Bug: Problem (id, description, product, component); Criticality (severity, priority); Involved people (assignedTo, reporter, qa); State (status, resolution); ... An activity turns one version of the bug into the next, e.g. a change of AssignedTo (names in the example: steve, mike, john).
    98. 98. All bug properties can change over time. Bug: Problem (id, description, product, component); Criticality (severity, priority); Involved people (assignedTo, reporter, qa); State (status, resolution); ... An activity turns one version of the bug into the next, e.g. a change of AssignedTo (names in the example: steve, mike, john). Bug history: the sequence of bug versions produced by the activities.
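One way to model a bug as an evolving entity is to store its current properties together with the list of activities that changed them, and to reconstruct earlier versions by undoing later activities. The sketch below is an assumption about how such a model could look, not the data model used in the talk: the class names, field names and sample values are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Activity:
    when: date
    property_name: str  # e.g. "status" or "assignedTo"
    old: str
    new: str


@dataclass
class Bug:
    bug_id: int
    reported: date
    properties: dict  # current values of the bug properties
    activities: list = field(default_factory=list)

    def state_at(self, when):
        """Property values at `when`, obtained by undoing later activities."""
        state = dict(self.properties)
        for act in sorted(self.activities, key=lambda a: a.when, reverse=True):
            if act.when > when:
                state[act.property_name] = act.old
        return state

    def lifetime_days(self):
        """Days between the report and the last recorded activity."""
        last = max((a.when for a in self.activities), default=self.reported)
        return (last - self.reported).days


# Illustrative bug; dates and names are sample values.
bug = Bug(
    bug_id=1,
    reported=date(1999, 10, 19),
    properties={"status": "Reopened", "assignedTo": "mike"},
    activities=[
        Activity(date(1999, 11, 1), "assignedTo", "steve", "mike"),
        Activity(date(1999, 12, 21), "status", "Assigned", "Resolved"),
        Activity(date(2000, 1, 31), "status", "Resolved", "Reopened"),
    ],
)
early = bug.state_at(date(1999, 10, 20))
print(early, bug.lifetime_days())
```

Storing only the current state plus the activity log keeps the model compact, at the cost of replaying activities whenever a past version is needed.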
    99. 99. Are there many activities? How long do they live?
    100. 100. Are there many activities? How long do they live? Time period: Sep 1998 - Apr 2003. #Bugs: 255,302. #Activities: 2,706,201.
    101. 101. Number of activities. [Histogram, scale 0% to 30%, with buckets 0, 1-3, 4-5, 6-10, 11-20, 21-30, > 30]
    102. 102. Lifetime (reported - last activity). [Histogram, scale 0% to 40%, with buckets 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More]
    103. 103. Lifetime (reported - last activity). [Histogram, scale 0% to 40%, with buckets 12 Hours, 1 Day, 1 Week, 1 Month, 6 Months, 1 Year, 2 Years, More; annotation: > 50%]
    104. 104. Bugs have long and intense lives
    105. 105. 4 facts about bugs: they are differently harmful, they are graphs, they evolve, and they have long and intense lives.
    106. 106. There is a need to analyze bug repositories, analyzing bugs as evolving entities.
    107. 107. “A Bug’s Life” Visualizing a Bug Database Marco D’Ambros Michele Lanza Martin Pinzger
    108. 108. System radiography view “Where (in the system and in its history) are the open bugs located?”
    109. 109. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis, time on the x axis. [Figure: Product A with Component 1 and Component 2, Product B]
    110. 110. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis; (x,y): (time interval, component); color: # open bugs. [Figure: Product A with Component 1 and Component 2, Product B]
    111. 111. System radiography view “Where (in the system and in its history) are the open bugs located?” Visualization principle: system decomposition (Product :: Component) on the y axis; (x,y): (time interval, component); color: # open bugs. [Figure: Product A with Component 1 and Component 2, Product B]
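The data behind such a view is a matrix with one row per Product :: Component, one column per time interval, and the number of open bugs in each cell. The sketch below shows one way to compute it; the function name, the tuple layout and the rule that a bug counts as open in every interval it overlaps are assumptions made for illustration, and mapping counts to colors is left to the plotting layer.

```python
from datetime import date


def radiography(bugs, interval_starts):
    """Count open bugs per (product, component) row and time-interval column.

    bugs: list of (product, component, opened, closed) with closed=None if still open.
    interval_starts: sorted dates; consecutive pairs delimit the intervals [start, end).
    """
    intervals = list(zip(interval_starts, interval_starts[1:]))
    matrix = {}
    for product, component, opened, closed in bugs:
        row = matrix.setdefault((product, component), [0] * len(intervals))
        for idx, (start, end) in enumerate(intervals):
            # Open in the interval: reported before its end, not closed before its start.
            if opened < end and (closed is None or closed >= start):
                row[idx] += 1
    return matrix


# Illustrative data only: two monthly intervals.
starts = [date(2003, 1, 1), date(2003, 2, 1), date(2003, 3, 1)]
bugs = [
    ("Browser", "Networking", date(2003, 1, 10), date(2003, 1, 20)),
    ("Browser", "Networking", date(2003, 1, 25), None),
    ("Mailnews", "Component 1", date(2003, 2, 5), None),
]
matrix = radiography(bugs, starts)
print(matrix)
```

Each row can then be drawn as a horizontal strip whose cell color encodes the count, which reproduces the (time interval, component) layout described on the slide.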
    112. 112. Mozilla example [Sep ‘98 - Apr ‘03] [speaker note: add a transition to the next slide, possibly also in the video]
    113. 113. Mozilla example [Sep ‘98 - Apr ‘03]: Browser [speaker note: add a transition to the next slide, possibly also in the video]
    114. 114. Mozilla example [Sep ‘98 - Apr ‘03]: Browser, Mailnews [speaker note: add a transition to the next slide, possibly also in the video]
    115. 115. Mozilla example [Sep ‘98 - Apr ‘03]: Browser, Mailnews [speaker note: add a transition to the next slide, possibly also in the video]
    116. 116. The Bug Watch View “How are bugs characterized with respect to their history?”
    117. 117. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001).
    118. 118. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers.
    119. 119. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status.
    120. 120. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    121. 121. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    122. 122. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    123. 123. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    124. 124. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    125. 125. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status, activity. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    126. 126. The Bug Watch View “How are bugs characterized with respect to their history?” Visualization principle: a bug shown along time, from its beginning (10/19/1999) to its end (10/16/2001). 3 layers: status, activity, severity. Status (from, to): Assigned (10/19/99, 12/21/99); Resolved (12/21/99, 1/31/00); Reopened (1/31/00, 2/6/00); New (2/6/00, 6/5/00); ...
    127. 127. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03] [speaker notes: tell more about the clustering; say what the size means]
    128. 128. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]. Reopened 4 times. The developer in charge of fixing it changed 6 times. Many people added to the CC. [speaker notes: tell more about the clustering; say what the size means]
    129. 129. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03] [speaker notes: tell more about the clustering; say what the size means]
    130. 130. Examples from Mozilla: Browser :: Networking [Nov ‘02 - Apr ‘03]. One status but many activities (addition of CC). [speaker notes: tell more about the clustering; say what the size means]
    131. 131. Conclusion: analyzing a bug database provides useful insights into a software system and helps in detecting the most harmful bugs.
    132. 132. Epilogue
    133. 133. Epilogue We are just touching the surface The analysis of bug repositories is still a very open field
