Promise 2011: "Does Measuring Code Change Improve Fault Prediction?"
Robert Bell, Thomas Ostrand and Elaine Weyuker.

Promise 2011: "Does Measuring Code Change Improve Fault Prediction?" Presentation Transcript

  • 1. Code Change and Fault Prediction
    Tom Ostrand, Robert Bell, Elaine Weyuker
    AT&T Labs – Research, Florham Park, NJ, USA
    PROMISE 2011, Banff, Alberta, September 20-21, 2011
    © 2007 AT&T Knowledge Ventures. All rights reserved. AT&T and the AT&T logo are trademarks of AT&T Knowledge Ventures.
  • 2. Overview
    • Do measures of code change or churn provide useful input to fault prediction models?
    • Standard model
    • Base models
    • Churn-augmented models
  • 3. The Standard Model
    • Underlying statistical model
        • Negative binomial regression
    • Output (dependent) variable
        • Predicted fault count in each file of release n
    • Predictor (independent) variables
        • KLOC (n)
        • Previous faults (n-1)
        • Previous changes (n-1, n-2)
        • File age (number of releases)
        • File type (C, C++, java, sql, make, sh, perl, ...)
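As a reading aid for the slide above, a negative binomial regression for per-file fault counts can be written as follows. This is a sketch in the common NB2 parameterization, which is an assumption: the slide does not state the parameterization or any variable transformations. Here $Y_i$ is the fault count of file $i$ in release $n$, $x_{ij}$ are the listed predictors (KLOC, previous faults, previous changes, file age, and indicator variables for file type), and $\alpha$ is the dispersion parameter.

```latex
\begin{aligned}
Y_i &\sim \operatorname{NegBin}(\mu_i, \alpha), \\
\log \mu_i &= \beta_0 + \sum_{j} \beta_j \, x_{ij}, \\
\operatorname{Var}(Y_i) &= \mu_i + \alpha \, \mu_i^{2}.
\end{aligned}
```

Files are then ranked by the fitted mean $\mu_i$, the predicted fault count.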
  • 4. Evaluating prediction models
    • Model produces a ranking of the files in a release, from most predicted faults to fewest
    • Choose a cutoff point in the ranking, X%
    • Yield = percent of all faults in the release that are in the first X% of the ranked files
      We’ve usually evaluated models at a 20% cutoff.
    • Fault-percentile average (FPA) is the average yield over all values of X
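The two measures on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are hypothetical, the cutoff is rounded down to a whole number of files, and "all values of X" is interpreted as every cutoff from the top 1 file to all K files.

```python
from typing import Sequence


def _ranked_indices(predicted: Sequence[float]) -> list:
    """File indices ordered from most to fewest predicted faults."""
    return sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)


def yield_at_cutoff(predicted: Sequence[float], actual: Sequence[int],
                    cutoff_percent: float) -> float:
    """Percent of all actual faults found in the top cutoff_percent% of ranked files."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0:
        return 0.0
    order = _ranked_indices(predicted)
    n_top = int(len(order) * cutoff_percent / 100)  # assumption: round down
    return 100.0 * sum(actual[i] for i in order[:n_top]) / total


def fault_percentile_average(predicted: Sequence[float], actual: Sequence[int]) -> float:
    """Average yield (in percent) over the cutoffs top-1, top-2, ..., top-K files."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    total = sum(actual)
    if total == 0 or not actual:
        return 0.0
    running = 0
    accumulated = 0.0
    for i in _ranked_indices(predicted):
        running += actual[i]              # faults captured so far
        accumulated += running / total    # yield at this cutoff, as a fraction
    return 100.0 * accumulated / len(actual)


if __name__ == "__main__":
    predicted = [5.0, 3.0, 1.0, 0.5, 0.1]   # hypothetical model output per file
    actual = [4, 0, 1, 0, 0]                # hypothetical observed faults per file
    print(yield_at_cutoff(predicted, actual, 20))
    print(fault_percentile_average(predicted, actual))
```

Unlike the yield at a single cutoff, FPA rewards a model for ordering files well throughout the whole ranking, which is why the slides report both.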
  • 5. Prediction Results, from the Standard Model
    [Bar chart: percent of faults in the top 20% of files, and FPA; vertical axis 0 to 100. Bar labels in extraction order: 93, 93, 91, 93, 92, 88, 88, 87, 83, 83, 81, 75, 76.]
  • 6. Measures of Code Change
    • Changed/not changed
    • Number of changes during a release
    • Number of lines added
    • Number of lines deleted
    • Number of lines modified
    • Relative churn (line changes/LOC)
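The measures listed on this slide could be held per file and release in a small record like the one below. This is an illustrative sketch only: the class and field names are hypothetical, and relative churn is taken to be (added + deleted + modified) / LOC, which is one reading of "line changes/LOC".

```python
from dataclasses import dataclass


@dataclass
class FileChurn:
    """Change measures for one file in one release (hypothetical record layout)."""
    loc: int            # lines of code in the file
    n_changes: int      # number of changes (check-ins) during the release
    added: int          # lines added
    deleted: int        # lines deleted
    modified: int       # lines modified

    @property
    def changed(self) -> bool:
        """Binary changed/not changed indicator."""
        return self.n_changes > 0

    @property
    def total_churn(self) -> int:
        """Adds + deletes + mods, the absolute churn."""
        return self.added + self.deleted + self.modified

    @property
    def relative_churn(self) -> float:
        """Line changes divided by LOC; 0.0 for an empty file (assumption)."""
        return self.total_churn / self.loc if self.loc > 0 else 0.0


if __name__ == "__main__":
    f = FileChurn(loc=200, n_changes=3, added=10, deleted=4, modified=6)
    print(f.changed, f.total_churn, f.relative_churn)
```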
  • 7. Two Subject Systems: Large provisioning system
    • 18 releases, 5 year lifespan
    • 6 programming languages:
        • Java (60%), C, C++, SQL, SQL-C, SQL-C++
    • 3000+ files
    • 1.5 million LOC
    • Average of 395 faults/release
  • 8. Two Subject Systems: Utility, data aggregation system
    • 18 releases, 5 year lifespan
    • >10 programming languages:
        • Java (77%), Perl, xml, sh, ...
    • 800 files
    • 280K LOC
    • Average of 90 faults/release
  • 9. Distribution of files, averages over all releases.
    [Two pie charts: Percent of Files: Provisioning, and Percent of Files: Utility; categories New, Changed, Unchanged. Slice labels in extraction order: 6.8%, 1.6%, 15.1%, 11.0%, 82.2%, 84.4%.]
  • 10. Where do faults occur? Distribution of faults over files
    [Two charts: Faults/file: Provisioning, and Faults/file: Utility; categories New, Changed, Unchanged. Labels in extraction order: 0.02, 0.12, 0.24, 0.80, 0.82.]
  • 11. Provisioning system faults per file, by release
    [Line chart: Faults per File, by Change Status and Release; x-axis: Release (1 to 18); y-axis: faults per file (0 to 2.5); series: New (Mean=0.24), Unchanged (Mean=0.02), Changed (Mean=0.80).]
  • 12. Utility system faults per file, by release
    [Line chart: Faults per File, by Change Status and Release; x-axis: Release (1 to 18); y-axis: faults per file (0 to 3); series: New (Mean=.09), Unchanged (Mean=.002), Changed (Mean=.92).]
  • 13. Potential predictor combinations
    • Added lines only
    • Deleted lines only
    • Modified lines only
    • Adds & Deletes
    • Adds & Mods
    • Deletes & Mods
    • Adds & Deletes & Mods
    • Relative values: changed lines/LOC
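The seven add/delete/modify combinations above can be derived mechanically from the three line counts of a change. The helper below is a hypothetical sketch of that bookkeeping; its name and label strings are illustrative choices, not taken from the authors' tooling.

```python
def change_combination(added: int, deleted: int, modified: int) -> str:
    """Label a change by which kinds of line edits it contains.

    Returns one of the seven combinations (for example "Adds" or
    "Mods & Deletes & Adds"), or "Unchanged" when all three counts are zero.
    """
    parts = []
    if modified > 0:
        parts.append("Mods")
    if deleted > 0:
        parts.append("Deletes")
    if added > 0:
        parts.append("Adds")
    return " & ".join(parts) if parts else "Unchanged"


if __name__ == "__main__":
    # Hypothetical check-ins as (added, deleted, modified) line counts.
    for counts in [(12, 0, 0), (0, 3, 5), (7, 2, 9), (0, 0, 0)]:
        print(counts, "->", change_combination(*counts))
```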
  • 14. Distribution of change combinations, all check-ins, all releases: Provisioning system
    Number of Files:
    • Mods: 683
    • Deletes: 296
    • Adds: 597
    • Mods & Deletes: 168
    • Mods & Adds: 1894
    • Deletes & Adds: 126
    • M & D & A: 2625
  • 15. Average lines touched for each combination of changes
    Average Lines touched:
    • Mods: 4
    • Deletes: 5
    • Adds: 23
    • Mods & Deletes: 21
    • Mods & Adds: 37
    • Deletes & Adds: 21
    • M & D & A: 210
  • 16. Faults per file, changed files only: Provisioning system
    Faults per File:
    • Deletes: 0.04
    • Mods: 0.19
    • Adds: 0.3
    • Mods & Deletes: 0.36
    • Mods & Adds: 0.55
    • Deletes & Adds: 0.5
    • M & D & A: 1.38
  • 17. Fault prediction models
    • Univariate models
    • Base model: log(KLOC), File age, File type
    • Augmented models:
        • Previous Changes
        • Previous {Adds / Deletes / Mods}
        • Previous Adds + Deletes + Modifications
        • Previous {Adds / Deletes / Mods} / LOC (relative churn)
        • Previous Developers
  • 18. Fault-percentile averages for univariate predictor models: Provisioning system (best result from raw variable, square root, fourth root)
    [Bar chart: FPA, univariate models; x-axis 0% to 100%. Bars in extraction order: Standard Model, Age, Language, Prior Lines Deleted, Prior Faults, Prior Changed, Prior Lines Modified, Prior Lines Added, Prior Developers, Prior Adds+Deletes+Mods, Prior Changes, log(KLOC).]
  • 19. Base Model 1
    • KLOC
    • File age (number of releases)
    • File type (C, C++, java, sql, make, sh, perl, ...)
  • 20. Base Model 1, and added variables
    [Two bar charts: Mean FPA, Provisioning System (x-axis 89 to 94) and Mean FPA, Utility System (x-axis 87 to 93). Bar labels in extraction order: Standard Model, prev-changes, prev-adds,dels,mods, (prev-adds,dels,mods)/LOC, prev-developers, prev-adds, prev-changed, prev-mods, prev-deletes, prev-prev changes, Base 1.]
    • Base model 1
        • KLOC
        • File age (number of releases)
        • File type (C, C++, java, sql, make, sh, perl, ...)
  • 21. Base Model 2
    • KLOC
    • File age (number of releases)
    • File type (C, C++, java, sql, make, sh, perl, ...)
    • (Previous changes)^(1/2), the square root of the previous-changes count
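To make the Base Model 2 inputs concrete, the sketch below builds one feature row for a file. It is an illustration under stated assumptions: the function name and dictionary keys are hypothetical, file type is encoded as 0/1 indicator variables, and KLOC is log-transformed as on the "Fault prediction models" slide (the Base Model slides list plain KLOC).

```python
import math
from typing import Dict, Sequence


def base_model_2_features(kloc: float, age_releases: int, file_type: str,
                          prev_changes: int,
                          known_types: Sequence[str]) -> Dict[str, float]:
    """Build a feature row: log(KLOC), file age, file-type indicators, sqrt(prev changes)."""
    if kloc <= 0:
        raise ValueError("kloc must be positive to take its logarithm")
    features = {
        "log_kloc": math.log(kloc),
        "age_releases": float(age_releases),
        # Square-root transform of the previous release's change count.
        "sqrt_prev_changes": math.sqrt(prev_changes),
    }
    # One indicator per known file type (assumption: simple one-hot encoding).
    for t in known_types:
        features["type_" + t] = 1.0 if file_type == t else 0.0
    return features


if __name__ == "__main__":
    row = base_model_2_features(kloc=1.0, age_releases=4, file_type="java",
                                prev_changes=9, known_types=["c", "java", "sql"])
    print(row)
```

A row like this would be the input to whatever regression routine fits the model; the square root damps the influence of files with very many prior changes.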
  • 22. Base Model 2, and added variables
    [Bar chart: Mean FPA, Provisioning System; x-axis 93.2 to 93.55. Bar labels in extraction order: prev-prev changes, prev-adds,dels,mods, prev-adds, prev-mods, prev-developers, (prev-adds,dels,mods)/LOC, prev-deletes, prev-changed, Base 2.]
    • Base model 2
        • KLOC
        • File age (number of releases)
        • File type (C, C++, java, sql, make, sh, perl, ...)
        • (Previous changes)^(1/2)
  • 23. Summary
    • Churn can be an effective aid for improving fault prediction
    • {Adds+Deletes+Mods} improves the accuracy of a model that doesn’t include any change information
    BUT
    • A simple count of prior changes slightly outperforms {Adds+Deletes+Mods}
    • Prior changed (the changed/not changed indicator) is nearly as good as either, when added to a model without change info
    • Lines added is the most effective single predictor
    • Lines deleted is the least effective single predictor
    • Relative churn is no better than absolute churn for predicting total fault count