Predicting Faults from Cached History


29th International Conference on Software Engineering (ICSE 2007), ACM SIGSOFT Distinguished Paper Award winner.


    1. BugCache: Predicting Defects. Sung Kim • MIT, Tom Zimmermann • Saarland University, Jim Whitehead • UC Santa Cruz, Andreas Zeller • Saarland University
    2. The Problem: How should we allocate our resources for quality assurance? Which files should we focus on?
    3. The Problem: Which files are most bug-prone?
    4. Where are bugs? Temporal locality: files with defects are likely to have more soon [Ostrand, Weyuker]. Spatial locality: near other bugs [Zimmermann et al.]. In modified files [Nagappan et al.]. In new files [Graves et al.].
    5. Our Solution: Cache
         • List of most bug-prone files
         • Combine all bug occurrence models
    6. Bug Cache (diagram): out of all files, the cache holds the 10% that are most defect-prone. Operations shown: load, pre-fetch, replacement. Nearby files: co-changes.
    7. Outline
         • BugCache Model
             • Cache update
             • Replacement Policies
             • Pre-fetch
         • Evaluation
             • 7 open source projects
         • Related Work
         • Summary
    8. Bug Cache (diagram): a change history over files A, B, and C containing two fix changes and one non-fix change, with arrows labeled "load if missed" (twice) and "pre-fetch".
    9. Cache Model (diagram): cache size: 2; files A, B, C; outcome shown: Miss.
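    10. Cache Update. Parameter: block size (neighborhood size)
         • Load missed files
         • Load nearby files (spatial locality)
         Table residue: "File" against "Number of common changes with" a file whose name is missing from the transcript; values 1, 4, 0 appear for files among A, B, C, D, and B (4 common changes) is called out.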
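The cache update on this slide can be sketched in Python as follows. This is a minimal illustration, not the authors' implementation: the function names `co_change_counts` and `nearby_files` are hypothetical, and it assumes that "nearby" means "most often changed in the same commit" and that the block size counts the missed file itself, so that `block_size - 1` neighbors are loaded.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(commits):
    """Count how often each pair of files changed in the same commit.

    `commits` is an iterable of file-name lists (hypothetical input format).
    """
    counts = Counter()
    for files in commits:
        # Sort so each unordered pair is always stored as (smaller, larger).
        for a, b in combinations(sorted(set(files)), 2):
            counts[(a, b)] += 1
    return counts


def nearby_files(target, counts, block_size):
    """Return up to block_size - 1 files most often co-changed with target.

    Assumption: the block size includes the missed file itself.
    Ties are broken alphabetically to keep the result deterministic.
    """
    scores = Counter()
    for (a, b), n in counts.items():
        if a == target:
            scores[b] += n
        elif b == target:
            scores[a] += n
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[: max(block_size - 1, 0)]]
```

On a miss for a file, a cache following this sketch would load that file plus whatever `nearby_files` returns for it.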
    11. Cache Model (diagram): cache size: 2, block size: 2; hit and miss outcomes shown over files A, B, C, D. Which one should be replaced?
    12. Replacement Policies. Parameter: replacement policy
         • Least recently used (LRU): unload the files whose defects were found least recently.
         • Least frequently changed (CHANGE): unload the files that have the fewest changes.
         • Least frequent defects (BUG): unload the files that have the fewest defects.
    13. Cache Model (diagram): cache size: 2, block size: 2 (a second label reads block size: 1), replacement: BUG; hit and miss outcomes shown over files A, B, C, D.
         File   LRU   CHANGE   BUG
         B      -5    2        2
         C      -3    3        1 (replace)
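The three replacement policies can be sketched as a single selection function, checked here against the values this slide gives for files B and C. This is an illustrative sketch under stated assumptions: `FileStats` and `choose_victim` are hypothetical names, and the LRU column is read as a timestamp where a smaller value means the last defect was found longer ago.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    last_defect: int  # assumed: time of the most recent defect, larger = more recent
    changes: int      # number of changes to the file
    bugs: int         # number of defects found in the file


def choose_victim(cache, policy):
    """Pick the cached file to unload under the given replacement policy.

    `cache` maps file names to FileStats. Ties go to the alphabetically
    first file (an arbitrary choice made for determinism).
    """
    keys = {
        "LRU": lambda s: s.last_defect,  # least recently found defect
        "CHANGE": lambda s: s.changes,   # fewest changes
        "BUG": lambda s: s.bugs,         # fewest defects
    }
    if policy not in keys:
        raise ValueError(f"unknown policy: {policy}")
    key = keys[policy]
    return min(sorted(cache), key=lambda name: key(cache[name]))
```

With B = (-5, 2, 2) and C = (-3, 3, 1), the BUG policy selects C, which matches the "(replace)" marker on the slide; under this reading LRU and CHANGE would both select B.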
    14. Pre-fill and pre-fetch. Parameter: pre-fetch size
         • Pre-fill
             • Fill cache with largest files (LOC)
         • Pre-fetch
             • Load changed files
             • Load added files
             • Unload deleted files
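The pre-fill step above is simple enough to sketch directly. This is a minimal illustration only: `pre_fill` is a hypothetical name, and the input is assumed to be a mapping from file name to lines of code.

```python
def pre_fill(loc_by_file, cache_size):
    """Return the initial cache contents: the largest files by LOC.

    Ties are broken alphabetically (an arbitrary choice for determinism).
    """
    ranked = sorted(loc_by_file.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[: max(cache_size, 0)]]
```

For example, with sizes `{"a.c": 120, "b.c": 900, "c.c": 450}` and a cache size of 2, the sketch fills the cache with `b.c` and `c.c`.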
    15. Cache Model (diagram): cache size: 2, block size: 2, replacement: BUG, pre-fetch size: 1; pre-fill and pre-fetch steps shown over files A, B, C, D, with hit and miss outcomes. Hit rate = #Hits / #Defects = 25%.
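The hit-rate formula on this slide (#Hits / #Defects) can be illustrated with a deliberately reduced simulation. The sketch below is an assumption-laden simplification, not the slide's exact trace: it uses LRU replacement only, ignores block loading, pre-fill, and pre-fetch, and `simulate_hit_rate` is a hypothetical name.

```python
from collections import OrderedDict


def simulate_hit_rate(defect_files, cache_size):
    """Replay defects in order and return #hits / #defects.

    `defect_files` lists, in time order, the file each defect was found in.
    A defect is a hit if its file is already cached, otherwise a miss that
    loads the file, evicting the least recently faulty file when full.
    """
    if cache_size < 1:
        raise ValueError("cache_size must be at least 1")
    cache = OrderedDict()  # keys ordered from least to most recently faulty
    hits = 0
    for name in defect_files:
        if name in cache:
            hits += 1
            cache.move_to_end(name)  # mark as most recently faulty
        else:
            if len(cache) >= cache_size:
                cache.popitem(last=False)  # evict least recently faulty
            cache[name] = True
    return hits / len(defect_files) if defect_files else 0.0
```

With a cache of size 2 and defects found in A, B, A, C (in that order), only the third defect is a hit, so the sketch reports 1/4 = 25%.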
    16. Evaluation: PostgreSQL, jEdit, Mozilla, Columba
    17. Hit Rates (chart): cache size = 10%; block/pre-fetch size = 50% of the cache size; replacement policy = LRU.
    18. Exhaustive Evaluation
         • Cache size: fixed to 10%
         • Vary block size: 0% to 100% of cache size
         • Vary pre-fetch size: 0% to 100% of cache size
         • Vary replacement: LRU, CHANGE, BUG
    19. Function Level: Default vs Optimal Options (chart). Cache size = 10% of all functions/methods.
    20. Function Level: Optimal Hit Rates. Cache size = 10% of all functions/methods.
         Project      Functions   Hit rate   Block   Pre-fetch   Replace
         Apache 1.3   2,113       62%        15%     17%         BUG
         Columba      8,428       68%        57%     20%         BUG
         Eclipse      33,214      72%        20%     4%          BUG
         JEdit        5,489       49%        85%     8%          BUG
         Mozilla      8,203       55%        41%     14%         LRU
         PostgreSQL   8,659       59%        29%     17%         LRU
         Subversion   3,693       46%        71%     14%         BUG
    21. File Level: Default vs Optimal Options (chart). Cache size = 10% of all files.
    22. File Level: Optimal Hit Rates. Cache size = 10% of all files.
         Project      Files   Hit rate   Block   Pre-fetch   Replace
         Apache 1.3   154     82%        50%     0%          LRU
         Columba      1,428   83%        59%     0%          BUG
         Eclipse      3,330   95%        20%     0%          LRU
         JEdit        420     85%        23%     0%          LRU
         Mozilla      396     88%        23%     0%          LRU
         PostgreSQL   598     79%        22%     0%          LRU
         Subversion   255     73%        42%     0%          LRU
    23. Related Work: In previous work, 10% predicts 44%~78% and 20% predicts 71%~93%. BugCache at 10% predicts 73%~95%.
    24. Summary
    25. BugCache: Predicting Defects. Hit rates of 73%~95%.
         • Simple
         • Combines all bug occurrence models
    26. BugCache: Predicting Defects. Sung Kim • MIT, Tom Zimmermann • Saarland University, Jim Whitehead • UC Santa Cruz, Andreas Zeller • Saarland University
    27. Bug-introducing Changes: changes that lead to problems, as indicated by later fixes. Example: the bug-introducing change "... if (foo==null) { foo.bar(); ..." is later fixed by "... if (foo!=null) { foo.bar(); ...".
