ESEC/FSE 2007 presentation slide by Osamu Mizuno

  1. Training on Errors Experiment to Detect Fault-Prone Software Modules by Spam Filter
     Osamu Mizuno, Tohru Kikuno
     Graduate School of Information Science and Technology, Osaka University, Japan
     ESEC/FSE2007 presentation
     (C) 2007 Osamu Mizuno @ Osaka University / All rights reserved
  3. What We Tried…
     Idea: fault-prone filtering, that is, detection of fault-prone modules using a generic text discriminator such as a spam filter.
     Experiment: spam filter CRM114 (a generic text discriminator); fault-proneness data from an OSS project (Eclipse); Training Only Errors (TOE) procedure.
     Result: achieved high recall, despite low precision.
  4. Overview
     Preliminary; Fault-Prone Filtering; Experiments; Training Only Errors (TOE) procedure; Results; Conclusions
  5. Preliminary: Fault-Prone Modules
     Fault-prone modules are software modules (a certain unit of source code) that may include faults.
     In this study: the source code of Java methods that appear to include faults according to the information in a bug tracking system.
  6. Preliminary: Spam E-mail Filtering (1)
     Spam e-mail increases year by year. About 94% of all e-mail messages are spam.
     Various spam filters have been developed.
     The pattern-matching approach causes a rat race between spammers and developers.
     The Bayesian classification approach has been recognized as effective [1].
     [1] P. Graham, Hackers and Painters: Big Ideas from the Computer Age, chapter 8, pp. 121-129, 2004.
  8. Preliminary: Spam E-mail Filtering (2)
     All e-mail messages can be classified into spam (undesired e-mail) and ham (desired e-mail).
     Learning (training): existing spam and ham e-mail messages are tokenized and learned as text data to construct a spam corpus and a ham corpus.
     Classification: incoming e-mail messages are classified into spam or ham by the spam filter.
  11. Fault-Prone Filtering
      All software modules can be classified into bug-detected (fault-prone: FP) and not-bug-detected (not-fault-prone: NFP).
      Learning (training): existing code modules, both FP and NFP, are tokenized and learned as text data to construct an FP corpus and an NFP corpus.
      Classification: newly developed modules are classified into FP or NFP by the FP filter.
  12. Fault-Prone Filtering: Spam Filter CRM114
      Spam filter: CRM114 (http://crm114.sourceforge.net/)
      A generic text discriminator for various purposes.
      Implements several classifiers: Markov, OSB, kNN, ...
      Characteristic: generation of tokens. Tokens are generated from combinations of words, not from single words.
      Example: tokens generated from "return (x + 2 * y );" with the OSB tokenizer:
        return x    x +    + 2    2 *
        return +    x 2    + *    2 y
        return 2    x *    + y    * y
        return *    x y
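The word pairing in the CRM114 example can be sketched as follows. This is a minimal illustration, not CRM114's actual code: it assumes a window of five consecutive words, which happens to reproduce the 14 pairs listed for `return (x + 2 * y );`, and it takes the words as already split. The class name `OsbSketch` and the method `pairs` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

final class OsbSketch {
    // Assumption: each word is paired with up to 4 following words (window of 5).
    static final int WINDOW = 5;

    // Returns "first second" pairs in the order they are generated.
    static List<String> pairs(List<String> words) {
        List<String> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            for (int d = 1; d < WINDOW && i + d < words.size(); d++) {
                out.add(words.get(i) + " " + words.get(i + d));
            }
        }
        return out;
    }
}
```

Calling `OsbSketch.pairs(List.of("return", "x", "+", "2", "*", "y"))` yields 14 pairs, starting with `return x` and ending with `* y`; `return y` is not generated because the two words are more than four positions apart.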
  16. Fault-Prone Filtering: Example of Fault-Prone Filtering (CRM114, OSB)
      Source code (mFP):  public int fact(int x) { return (x<=1?1:x*fact(++x)); }
      Source code (mNFP): public int fact(int x) { return (x<=1?1:x*fact(--x)); }
      Tokens (TFP):  public int, public fact, public x, int fact, int int, ..., x ++, x x, * fact, * ++, * x, fact ++, fact x, ++ x
      Tokens (TNFP): public int, public fact, public x, int fact, int int, ..., x --, x x, * fact, * --, * x, fact --, fact x, -- x
      Training: starting from empty corpora, TFP is trained into the FP corpus and TNFP into the NFP corpus of the FP filter.
  20. Fault-Prone Filtering: Example of Fault-Prone Filtering (CRM114, OSB)
      Source code (mnew): public int sigma(int x) { return (x<=0?0:x+sigma(++x)); }
      Tokens (TFP):  public int, public fact, public x, int fact, int int, ..., x ++, x x, * fact, * ++, * x, fact ++, fact x, ++ x
      Tokens (Tnew): public int, public sigma, public x, int sigma, int int, ..., x ++, x x, + sigma, + ++, + x, sigma ++, sigma x, ++ x
      Tokens (TNFP): public int, public fact, public x, int fact, int int, ..., x --, x x, * fact, * --, * x, fact --, fact x, -- x
  22. Fault-Prone Filtering: Example of Fault-Prone Filtering (CRM114, OSB)
      Source code (mnew): public int sigma(int x) { return (x<=0?0:x+sigma(++x)); }
      Tokens (Tnew): public int, public sigma, public x, int sigma, int int, ..., x ++, x x, + sigma, + ++, + x, sigma ++, sigma x, ++ x
      Prediction: the FP filter compares Tnew with the FP corpus and the NFP corpus. mnew is predicted as FP because Tnew is more similar to TFP than to TNFP.
      Probability: 0.52. Predicted: FP.
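The intuition behind this prediction can be illustrated with a toy overlap count. The sketch below is not how CRM114 computes its probability and cannot reproduce the value 0.52; it only counts how many word pairs of the new module also occur in each training module. The class `OverlapSketch`, the word pattern, and the window of five words are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class OverlapSketch {
    // Hypothetical word pattern: identifiers, integers, and a few operators.
    // Brackets, braces, and ';' match nothing and are therefore skipped.
    private static final Pattern WORD = Pattern.compile(
        "[A-Za-z_][A-Za-z0-9_]*|[0-9]+|\\+\\+|--|<=|>=|==|[-+*/?:<>=]");

    // Splits the source into words and builds the set of word pairs (window of 5).
    static Set<String> tokens(String source) {
        List<String> words = new ArrayList<>();
        Matcher m = WORD.matcher(source);
        while (m.find()) {
            words.add(m.group());
        }
        Set<String> out = new HashSet<>();
        for (int i = 0; i < words.size(); i++) {
            for (int d = 1; d < 5 && i + d < words.size(); d++) {
                out.add(words.get(i) + " " + words.get(i + d));
            }
        }
        return out;
    }

    // Number of distinct pairs that occur in both sets.
    static int shared(Set<String> a, Set<String> b) {
        Set<String> common = new HashSet<>(a);
        common.retainAll(b);
        return common.size();
    }

    // Toy decision rule: FP if the new module shares more pairs with the FP module.
    static String predict(String fpSource, String nfpSource, String newSource) {
        Set<String> tNew = tokens(newSource);
        int withFp = shared(tNew, tokens(fpSource));
        int withNfp = shared(tNew, tokens(nfpSource));
        return withFp > withNfp ? "FP" : "NFP";
    }
}
```

For the three methods of the example, `predict` returns `FP`: the two `fact` variants differ only in `++` versus `--`, so the new `sigma` method shares exactly three more pairs with mFP (`: ++`, `x ++`, `++ x`) than with mNFP.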
  24. Experiments
      Target: Eclipse project
      Written in Java; "methods" in Java classes are considered as modules.
      Date of the snapshots of the CVS repository and the Bugzilla database: January 30, 2007.
      Large CVS repository (about 14 GB).
      Faults are recorded precisely.
  25. Experiment: Collecting FP & NFP Modules
      Track FP modules from the CVS log based on an algorithm by Sliwerski et al. [2].
      Search the CVS log for terms such as "issue", "problem", "#", and a bug id, as well as "fixed", "resolved", or "removed", then identify the revision in which the bug was removed.
      Get the difference from the previous revision and identify the modified modules.
      Track back through the repository and identify the modules that have not been modified since the bug was reported. They are FP modules.
      [2] J. Sliwerski et al., When do changes induce fixes? (On Fridays.) In Proc. of MSR2005, pp. 24-28, 2005.
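The first step, spotting bug-fix revisions in the log, might look roughly like the sketch below. The keywords are the ones named on the slide; everything else is an assumption for illustration: the class `FixCommitSketch`, the rule that a message needs both a keyword and a number, and the choice to treat a run of 3 to 9 digits as a bug id. The later steps (diffing revisions and tracking back through the repository) need repository access and are not shown.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class FixCommitSketch {
    // Keywords taken from the slide; matching is case-insensitive.
    private static final Pattern KEYWORD = Pattern.compile(
        "\\b(issue|problem|fixed|resolved|removed)\\b|#", Pattern.CASE_INSENSITIVE);

    // Assumption: a bug id is a standalone run of 3 to 9 digits.
    private static final Pattern BUG_ID = Pattern.compile("\\b(\\d{3,9})\\b");

    // Returns the first bug id in the message if the message also contains a keyword.
    static Optional<Integer> fixedBugId(String logMessage) {
        if (!KEYWORD.matcher(logMessage).find()) {
            return Optional.empty();
        }
        Matcher m = BUG_ID.matcher(logMessage);
        return m.find() ? Optional.of(Integer.parseInt(m.group(1))) : Optional.empty();
    }
}
```

A hypothetical message such as "Fixed bug 123456: NPE in parser" yields the id 123456, while "Update copyright to 2007" yields nothing because it contains no keyword. In a real setting the extracted id would still have to be checked against the bug database.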
  26. Experiment: Result of Module Collection
      Extracted bugs from the Bugzilla database of Eclipse.
      Conditions:
        Type of faults: Bugs
        Status of faults: Resolved, Verified, or Closed
        Resolution of faults: Fixed
        Severity: Blocker, Critical, Major, or Normal
        Total # of faults: 40,627
      Result of collection:
        # of faults found in CVS log: 21,761 (52% of total)
        # of fault-prone (FP) modules: 65,782
        # of not-fault-prone (NFP) modules: 1,113,063
  28. Training Only Errors Procedure
      In spam filtering: apply e-mail messages to the spam filter in order of arrival. Only misclassified e-mail messages are trained into the corpora. This is what you may already do when sorting your daily e-mail.
      In fault-prone filtering: apply software modules to the fault-prone filter in order of construction and modification. Only misclassified modules are trained into the corpora.
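The TOE procedure for fault-prone filtering can be sketched as a simple loop. The `Filter` interface, the `LabeledModule` class, and the method `run` are hypothetical names introduced for this illustration; they are not CRM114's API, and the sketch assumes the actual status of each module is known once it has been classified.

```java
import java.util.List;

final class ToeSketch {
    /** Hypothetical classifier interface; not CRM114's API. */
    interface Filter {
        boolean predictFp(String moduleSource);
        void train(String moduleSource, boolean actualFp);
    }

    /** A module together with its actual status (true means fault-prone). */
    static final class LabeledModule {
        final String source;
        final boolean actualFp;

        LabeledModule(String source, boolean actualFp) {
            this.source = source;
            this.actualFp = actualFp;
        }
    }

    /** Applies modules oldest first; trains only misclassified ones. Returns how many were trained. */
    static int run(Filter filter, List<LabeledModule> modulesOldestFirst) {
        int trained = 0;
        for (LabeledModule m : modulesOldestFirst) {
            boolean predictedFp = filter.predictFp(m.source);
            if (predictedFp != m.actualFp) {
                // Misclassified: train the module with its actual status.
                filter.train(m.source, m.actualFp);
                trained++;
            }
            // Correctly classified modules leave the corpora unchanged.
        }
        return trained;
    }
}
```

Training only on errors keeps the corpora small compared with training on every module, which is one plausible reason the procedure is borrowed from day-to-day spam filter use.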
  30. Evaluation Measurements
      Result of prediction:
                     Predicted NFP   Predicted FP
        Actual NFP   N1              N2
        Actual FP    N3              N4
      Accuracy: overall accuracy of prediction. (N1+N4) / (N1+N2+N3+N4)
      Recall: the fraction of actual FP modules that are predicted as FP. N4 / (N3+N4)
      Precision: the fraction of modules predicted as FP that are actually FP. N4 / (N2+N4)
  31. Result of Experiment (Transition of Rates)
      All extracted modules are sorted by date and applied to the FP filter one by one, starting from the oldest.
      [Figure: accuracy, recall, precision, false-negative rate, and false-positive rate (y-axis: Rate, 0 to 1) plotted against methods sorted by date (old to new).]
      Observation: the prediction result becomes stable after 50,000 modules have been classified.
  32. Result of Experiment (Final Accuracy)
      Cumulative prediction result at the end of TOE (OSB classifier):
                     Predicted NFP   Predicted FP
        Actual NFP   1,022,895       90,168
        Actual FP    17,890          47,892
      Precision: 0.347
      Recall: 0.728
      Accuracy: 0.908
      In other words, 72% of actual FP modules are predicted as FP, and 34% of the modules predicted as FP include faults.
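The reported rates can be re-derived from the four counts in the final TOE result. The sketch below assumes the usual definitions, with N4 (actual FP, predicted FP) in the numerator of both recall and precision, since these are the definitions that reproduce the reported values. The class name `Measures` is hypothetical.

```java
final class Measures {
    // n1: actual NFP, predicted NFP    n2: actual NFP, predicted FP
    // n3: actual FP,  predicted NFP    n4: actual FP,  predicted FP

    static double accuracy(long n1, long n2, long n3, long n4) {
        return (double) (n1 + n4) / (n1 + n2 + n3 + n4);
    }

    // Fraction of actual FP modules that were predicted as FP.
    static double recall(long n3, long n4) {
        return (double) n4 / (n3 + n4);
    }

    // Fraction of modules predicted as FP that are actually FP.
    static double precision(long n2, long n4) {
        return (double) n4 / (n2 + n4);
    }
}
```

With N1 = 1,022,895, N2 = 90,168, N3 = 17,890, and N4 = 47,892, these functions give an accuracy of about 0.908, a recall of about 0.728, and a precision of about 0.347, in line with the slide.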
  34. Threats to Validity
      Threats to construct validity: collection of fault-prone modules from OSS projects. We could not cover all faults in the Bugzilla database. We have to collect more reliable data in future work.
      Threats to external validity: generalizability of the results. We have to apply fault-prone filtering to many projects, including industrial ones.
  35. Related Works
      Much research has been done so far: logistic regression, CART, Bayesian classification, and more.
      Most of these approaches use software metrics: McCabe, Halstead, object-oriented, and so on.
      Intuitively speaking, our approach uses a new metric, "frequency of tokens".
  36. Conclusions
      Summary: we proposed a new approach to detect fault-prone modules using a spam filter. The case study showed that our approach can predict fault-prone modules with high accuracy.
      Future work: using semantic parsing information instead of raw code; using differences between revisions as the input of fault-prone filtering (seems more reasonable...).
  37. Q&A
      Thank you! Any questions?
  38. Result of Cross Validation (from the MSR2007 slides)
      Result for the Eclipse BIRT plugin, 10-fold cross validation (OSB classifier):
                     Predicted NFP   Predicted FP
        Actual NFP   70,369          16,011
        Actual FP    2,039           7,501
      Precision: 0.319
      Recall: 0.786
      Accuracy: 0.811
      Recall is important for quality assurance. Precision implies the cost of finding FP modules. Recall is rather high, and precision is rather low.
  39. Training Only Errors Procedure, Case 2: the prediction matches the actual status
      [Figure: a .java module is applied to the FP filter. Probability: 0.9; predicted: FP; actual: FP. The prediction matches, so no training takes place and the next module is applied.]
  45. Training Only Errors Procedure, Case 1: the prediction does not match the actual status
      [Figure: a .java module is applied to the FP filter. Probability: 0.2; predicted: NFP; actual: FP. The prediction is wrong, so the module is trained into the FP corpus before the next module is applied.]
  51. Procedure of Experiment
      Two experiments with different thresholds of probability (tFP) to determine FP and NFP. Changing tFP may achieve higher recall.
      Experiment 1: TOE with OSB classifier, tFP = 0.5
      Experiment 2: TOE with OSB classifier, tFP = 0.25 (predicts more modules as FP than Experiment 1)
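The effect of the threshold can be sketched in a few lines. The slides do not say whether a probability exactly equal to tFP counts as FP, so the `>=` comparison below is an assumption, as are the class name `ThresholdSketch` and the sample probabilities used in the note that follows.

```java
import java.util.Arrays;

final class ThresholdSketch {
    // Assumption: a module is predicted FP when its FP probability reaches the threshold.
    static boolean isFp(double probabilityFp, double tFp) {
        return probabilityFp >= tFp;
    }

    // Number of modules predicted as FP under the given threshold.
    static long countFp(double[] probabilities, double tFp) {
        return Arrays.stream(probabilities).filter(p -> isFp(p, tFp)).count();
    }
}
```

For the illustrative probabilities 0.9, 0.52, 0.3, and 0.2, two modules are predicted as FP with tFP = 0.5 and three with tFP = 0.25: lowering the threshold moves borderline modules such as the one at 0.3 into the FP class, which tends to raise recall at the expense of precision.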
  52. Result of Experiment (OSB, tFP = 0.25)
      [Figure: recall, accuracy, precision, false-positive rate, and false-negative rate (y-axis: Rate, 0 to 1) plotted against methods sorted by date.]
      Comparison with threshold = 0.50:
        Precision becomes lower. Only 1/4 of the modules predicted as FP are actually faulty.
        Recall becomes much higher. 83% of actual faulty modules can be detected.
      TOE final result (OSB classifier):
                     Predicted NFP   Predicted FP
        Actual NFP   930,218         182,845
        Actual FP    10,592          55,190
      Precision: 0.232
      Recall: 0.839
      Accuracy: 0.835
