Promise 2011: "Does Measuring Code Change Improve Fault Prediction?"

Robert Bell, Thomas Ostrand and Elaine Weyuker.

  1. Code Change and Fault Prediction
     Tom Ostrand, Robert Bell, Elaine Weyuker
     AT&T Labs – Research, Florham Park, NJ, USA
     PROMISE 2011, Banff, Alberta, September 20-21, 2011
     © 2007 AT&T Knowledge Ventures. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Knowledge Ventures.
  2. Overview
     • Do measures of code change or churn provide useful input to fault prediction models?
     • Standard model
     • Base models
     • Churn-augmented models
  3. The Standard Model
     • Underlying statistical model
       • Negative binomial regression
     • Output (dependent) variable
       • Predicted fault count in each file of release n
     • Predictor (independent) variables
       • KLOC (n)
       • Previous faults (n-1)
       • Previous changes (n-1, n-2)
       • File age (number of releases)
       • File type (C, C++, java, sql, make, sh, perl, ...)
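A minimal sketch of how a fitted model of this kind turns predictors into a predicted fault count. Negative binomial regression is commonly fitted with a log link, so the expected count is the exponential of a linear combination of the predictors. Every coefficient and file-type offset below is hypothetical, the function name `predicted_fault_count` is an illustration, and the choice to enter KLOC as a logarithm while leaving the other predictors untransformed is an assumption. Fitting the coefficients is not shown.

```python
import math

# Hypothetical coefficients, invented for illustration; the slides report none.
COEFFICIENTS = {
    "intercept": -2.0,
    "log_kloc": 0.6,
    "prev_faults": 0.3,
    "prev_changes_n1": 0.2,
    "prev_changes_n2": 0.1,
    "age": -0.05,
}
# Hypothetical per-file-type offsets (java is the reference level here).
FILE_TYPE_OFFSET = {"java": 0.0, "c": 0.1, "sql": -0.2}


def predicted_fault_count(kloc, prev_faults, prev_changes_n1,
                          prev_changes_n2, age, file_type):
    """Expected fault count for one file under a log-link count model."""
    linear = (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["log_kloc"] * math.log(kloc)
        + COEFFICIENTS["prev_faults"] * prev_faults
        + COEFFICIENTS["prev_changes_n1"] * prev_changes_n1
        + COEFFICIENTS["prev_changes_n2"] * prev_changes_n2
        + COEFFICIENTS["age"] * age
        + FILE_TYPE_OFFSET[file_type]
    )
    # The log link means the linear predictor is exponentiated.
    return math.exp(linear)
```

Files in release n would then be ranked by this predicted count, from highest to lowest.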
  4. Evaluating prediction models
     • Model produces ranking of files in a release, from predicted most faults to fewest faults
     • Choose cutoff point in ranking, X%
     • Yield = percent of all faults in the release that are in the first X% of the ranked files
       • We’ve usually evaluated models at a 20% cutoff.
     • Fault-percentile average (FPA) is the average yield over all values of X
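The yield and FPA definitions above can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function names are hypothetical, the cutoff is given as a whole-number percent, and "all values of X" is taken to mean one step per file in the ranking.

```python
def rank_by_prediction(predicted, actual):
    """Actual fault counts, ordered from highest to lowest predicted count."""
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    return [actual[i] for i in order]


def yield_at(ranked_actual, cutoff_percent):
    """Percent of all faults found in the first cutoff_percent of ranked files."""
    total = sum(ranked_actual)
    if total == 0:
        return 0.0
    n_top = len(ranked_actual) * cutoff_percent // 100
    return 100.0 * sum(ranked_actual[:n_top]) / total


def fault_percentile_average(ranked_actual):
    """Average yield over cutoffs at the top 1, 2, ..., K files (assumed steps)."""
    total = sum(ranked_actual)
    if total == 0 or not ranked_actual:
        return 0.0
    running = 0
    yield_sum = 0.0
    for faults in ranked_actual:
        running += faults
        yield_sum += running / total
    return 100.0 * yield_sum / len(ranked_actual)


# Made-up data: five files with predicted and actual fault counts.
predicted = [0.1, 2.5, 0.7, 1.2, 0.0]
actual = [0, 3, 1, 0, 0]
ranked = rank_by_prediction(predicted, actual)
```

With this made-up data the ranking places the three-fault file first, so the 20% cutoff (one file out of five) captures three of the four faults.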
  5. Prediction Results, from the Standard Model
     [Bar chart: percent of faults in top 20% of files, and FPA. Vertical axis 0 to 100; bar labels range from 75 to 93.]
  6. Measures of Code Change
     • Changed/not changed
     • Number of changes during a release
     • Number of lines added
     • Number of lines deleted
     • Number of lines modified
     • Relative churn (line changes/LOC)
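A minimal sketch of how the measures listed above could be computed for one file in one release. The `CheckIn` record and the `churn_measures` function are hypothetical names, and two choices are assumptions: a file counts as changed if it has at least one check-in, and "line changes" in relative churn means added plus deleted plus modified lines.

```python
from dataclasses import dataclass


@dataclass
class CheckIn:
    """Line counts for one check-in of a file (hypothetical record layout)."""
    added: int
    deleted: int
    modified: int


def churn_measures(check_ins, loc):
    """Change measures for one file in one release, given its check-ins and LOC."""
    added = sum(c.added for c in check_ins)
    deleted = sum(c.deleted for c in check_ins)
    modified = sum(c.modified for c in check_ins)
    line_changes = added + deleted + modified  # assumed meaning of "line changes"
    return {
        "changed": len(check_ins) > 0,
        "num_changes": len(check_ins),
        "lines_added": added,
        "lines_deleted": deleted,
        "lines_modified": modified,
        "relative_churn": line_changes / loc if loc else 0.0,
    }


# Made-up example: two check-ins to a 200-line file.
measures = churn_measures([CheckIn(10, 2, 3), CheckIn(0, 5, 0)], loc=200)
```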
  7. Two Subject Systems: Large provisioning system
     • 18 releases, 5 year lifespan
     • 6 programming languages:
       • Java (60%), C, C++, SQL, SQL-C, SQL-C++
     • 3000+ files
     • 1.5 million LOC
     • Average of 395 faults/release
  8. Two Subject Systems: Utility, data aggregation system
     • 18 releases, 5 year lifespan
     • >10 programming languages:
       • Java (77%), Perl, xml, sh, ...
     • 800 files
     • 280K LOC
     • Average of 90 faults/release
  9. Distribution of files, averages over all releases
     [Two pie charts: Percent of Files: Provisioning, and Percent of Files: Utility. Each is split into New, Changed, and Unchanged. Slice labels shown: 6.8%, 1.6%, 15.1%, 11.0%, 82.2%, 84.4%.]
  10. Where do faults occur? Distribution of faults over files
     [Two pie charts: Faults/file: Provisioning, and Faults/file: Utility. Each is split into New, Changed, and Unchanged. Slice labels shown: 0.02, 0.12, 0.24, 0.80, 0.82.]
  11. Provisioning system faults per file, by release
     [Line chart: Faults per File, by Change Status and Release. Horizontal axis: Release, 1 to 18. Vertical axis: Faults per file, 0 to 2.5. Series: New (Mean=0.24), Unchanged (Mean=0.02), Changed (Mean=0.80).]
  12. Utility system faults per file, by release
     [Line chart: Faults per File, by Change Status and Release. Horizontal axis: Release, 1 to 18. Vertical axis: Faults per file, 0 to 3. Series: New (Mean=0.09), Unchanged (Mean=0.002), Changed (Mean=0.92).]
  13. Potential predictor combinations
     • Added lines only
     • Deleted lines only
     • Modified lines only
     • Adds & Deletes
     • Adds & Mods
     • Deletes & Mods
     • Adds & Deletes & Mods
     • Relative values: changed lines/LOC
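As a small illustration of the combinations listed above, a file's check-in totals can be mapped to the combination of change types it contains. The function name `change_combination` and the "Unchanged" label for a file with no line changes are assumptions made for this sketch.

```python
def change_combination(added, deleted, modified):
    """Label the combination of change types present in a file's line counts."""
    parts = []
    if added > 0:
        parts.append("Adds")
    if deleted > 0:
        parts.append("Deletes")
    if modified > 0:
        parts.append("Mods")
    # "Unchanged" is an assumed label for files with no line changes.
    return " & ".join(parts) if parts else "Unchanged"
```

For example, a file with 3 added and 2 modified lines falls into "Adds & Mods".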
  14. Distribution of change combinations, all check-ins, all releases: Provisioning system
     Number of Files
     • Mods: 683
     • Deletes: 296
     • Adds: 597
     • Mods & Deletes: 168
     • Mods & Adds: 1894
     • Deletes & Adds: 126
     • M & D & A: 2625
  15. Average lines touched for each combination of changes
     Average Lines Touched
     • Mods: 4
     • Deletes: 5
     • Adds: 21
     • Mods & Deletes: 23
     • Mods & Adds: 37
     • Deletes & Adds: 21
     • M & D & A: 210
  16. Faults per file, changed files only: Provisioning system
     Faults per File
     • Deletes: 0.04
     • Mods: 0.19
     • Adds: 0.3
     • Mods & Deletes: 0.36
     • Mods & Adds: 0.55
     • Deletes & Adds: 0.5
     • M & D & A: 1.38
  17. Fault prediction models
     • Univariate models
     • Base model: log(KLOC), File age, File type
     • Augmented models:
       • Previous Changes
       • Previous {Adds / Deletes / Mods}
       • Previous Adds + Deletes + Modifications
       • Previous {Adds / Deletes / Mods} / LOC (relative churn)
       • Previous Developers
  18. Fault-percentile averages for univariate predictor models: Provisioning system (best result from raw variable, square root, fourth root)
     [Bar chart: FPA, univariate models. Horizontal axis: 0% to 100%. Bars: Standard Model, Age, Language, Prior Lines Deleted, Prior Faults, Prior Changed, Prior Lines Modified, Prior Lines Added, Prior Developers, Prior Adds+Deletes+Mods, Prior Changes, log(KLOC).]
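The three candidate forms of each predictor named in the slide title (raw variable, square root, fourth root) can be sketched as below. The helper names and the FPA numbers in the example are hypothetical; only the idea of keeping whichever form scores the highest FPA comes from the slide.

```python
import math


def candidate_transforms(value):
    """The three forms of a non-negative predictor value tried for each variable."""
    root = math.sqrt(value)
    return {
        "raw": float(value),
        "square_root": root,
        "fourth_root": math.sqrt(root),
    }


def best_transform(fpa_by_transform):
    """Name of the transform whose model achieved the highest FPA."""
    return max(fpa_by_transform, key=fpa_by_transform.get)


# Made-up FPA scores for one predictor, for illustration only.
example_scores = {"raw": 80.1, "square_root": 82.4, "fourth_root": 81.9}
chosen = best_transform(example_scores)
```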
  19. Base Model 1
     • KLOC
     • File age (number of releases)
     • File type (C, C++, java, sql, make, sh, perl, ...)
  20. Base Model 1, and added variables
     [Two bar charts: Mean FPA, Provisioning System (axis 89 to 94) and Mean FPA, Utility System (axis 87 to 93). Models compared: Standard Model, prev-changes, prev-adds,dels,mods, (prev-adds,dels,mods)/LOC, prev-developers, prev-adds, prev-changed, prev-mods, prev-deletes, prev-prev changes, Base 1.]
     • Base model 1
       • KLOC
       • File age (number of releases)
       • File type (C, C++, java, sql, make, sh, perl, ...)
  21. Base Model 2
     • KLOC
     • File age (number of releases)
     • File type (C, C++, java, sql, make, sh, perl, ...)
     • (Previous changes)^(1/2)
  22. Base Model 2, and added variables
     [Bar chart: Mean FPA, Provisioning System (axis 93.2 to 93.55). Models compared: prev-prev changes, prev-adds,dels,mods, prev-adds, prev-mods, prev-developers, (prev-adds,dels,mods)/LOC, prev-deletes, prev-changed, Base 2.]
     • Base model 2
       • KLOC
       • File age (number of releases)
       • File type (C, C++, java, sql, make, sh, perl, ...)
       • (Previous changes)^(1/2)
  23. Summary
     • Churn can be an effective aid for improving fault prediction
     • {Adds+Deletes+Mods} improves the accuracy of a model that doesn’t include any change information
     BUT
     • A simple count of prior changes slightly outperforms {Adds+Deletes+Mods}
     • Prior changed is nearly as good as either, when added to a model without change info
     • Lines added is the most effective single predictor
     • Lines deleted is the least effective single predictor
     • Relative churn is no better than absolute churn for predicting total fault count