
Promise 2011: "Customization Support for CBR-Based Defect Prediction"

Promise 2011:
"Customization Support for CBR-Based Defect Prediction"
Elham Paikari, Guenther Ruhe, Bo Sun and Emadoddin Livani.

Published in: Technology, Education

  1. Customization Support for CBR-Based Defect Prediction
     • Elham Paikari, Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada, epaikari@ucalgary.ca
     • Bo Sun, Department of Computer Science, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada, sbo@ucalgary.ca
     • Guenther Ruhe, Department of Computer Science & Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada, ruhe@ucalgary.ca
     • Emadoddin Livani, Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive NW, Calgary, AB, Canada, elivani@ucalgary.ca
  2. Agenda
     • Parameters of a CBR Model
     • Parameters Instantiation
     • Weighting Method SANN
     • Frequency Analysis
     • Dependency Network and the Customization Support
     • Rules Transferability
     • Conclusions and Future Work
  3. CBR and the Parameters
     Each combination of the parameters is an instantiation of the general CBR-based prediction method.
     Parameters shown as influencing the prediction performance of the CBR model:
     • Solution algorithm
     • Similarity function
     • Number of nearest neighbor cases
     • Weighting technique used for attributes
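To make the four parameters concrete, here is a minimal sketch of one CBR instantiation. It is not the authors' implementation: the weighted Euclidean distance, the rank-weighting scheme (closest case gets weight k, the next k-1, and so on), and the function name `cbr_predict` are all assumptions chosen for illustration.

```python
import numpy as np

def cbr_predict(cases, defects, query, weights, k, solution="unweighted"):
    """Predict the defect value of `query` from its k most similar stored cases."""
    cases = np.asarray(cases, dtype=float)
    defects = np.asarray(defects, dtype=float)
    query = np.asarray(query, dtype=float)
    weights = np.asarray(weights, dtype=float)

    # Similarity function (assumed): weighted Euclidean distance, smaller = more similar.
    dist = np.sqrt((((cases - query) ** 2) * weights).sum(axis=1))

    # Number of nearest neighbors: keep the k closest cases.
    nearest = np.argsort(dist, kind="stable")[:k]

    # Solution algorithm: how the retrieved cases are combined.
    if solution == "unweighted":
        return float(defects[nearest].mean())
    if solution == "rank_weighted":
        # Assumed scheme: rank weights k, k-1, ..., 1 from closest to farthest.
        ranks = np.arange(len(nearest), 0, -1)
        return float((defects[nearest] * ranks).sum() / ranks.sum())
    raise ValueError(f"unknown solution algorithm: {solution}")
```

Changing `weights`, `k`, `solution`, or the distance expression gives a different instantiation of the same general method.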
  5. Instantiation Parameters of the CBR
  6. Sensitivity Analysis Based on Neural Network (SANN)
     Diagram: the data set attributes (CC, ..., LOC) are the inputs of a neural network (NN). For attribute A1, the input Xmin(A1) yields OUTPUTmin(A1) and the input Xmax(A1) yields OUTPUTmax(A1).
     ∆1 = |OUTPUTmin(A1) - OUTPUTmax(A1)|
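The ∆ computation on this slide can be sketched as follows. The slide does not say how the other attributes are set while one attribute is moved to its minimum and maximum, nor how the ∆ values become weights; holding the others at their mean and normalizing the ∆ values to sum to 1 are assumptions, and `model` stands in for any trained network exposed as a plain function.

```python
import numpy as np

def sann_weights(model, X):
    """Attribute weights from output sensitivity: delta_j = |out(min_j) - out(max_j)|."""
    X = np.asarray(X, dtype=float)
    baseline = X.mean(axis=0)  # assumption: other attributes held at their mean
    deltas = np.empty(X.shape[1])
    for j in range(X.shape[1]):
        x_min, x_max = baseline.copy(), baseline.copy()
        x_min[j] = X[:, j].min()  # attribute j at its minimum
        x_max[j] = X[:, j].max()  # attribute j at its maximum
        deltas[j] = abs(model(x_min) - model(x_max))
    total = deltas.sum()
    if total == 0:
        # No attribute changes the output: fall back to equal weights.
        return np.full(len(deltas), 1.0 / len(deltas))
    return deltas / total  # assumption: normalize deltas into weights
```

With a stand-in model such as `lambda x: 3 * x[0] + x[1]`, the attribute whose range moves the output most receives the largest weight.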
  7. • How do the evaluation results compare with existing methods (un-weighted)?
     • How do the evaluation results compare with existing methods (MLR)?
     • How do different numbers of nearest neighbors affect the results?
  8. Instantiation Parameters of the CBR
  9. Data Repository
     • PROMISE Repository
     • 120 different CBR instantiations were created and applied to 11 data sets from the PROMISE repository
     • Characterization of data sets
  10. Is One Instantiation Always the Best?
      Chart panels: MW1, PC1, MC2, KC3, AR5, CM1
  11. Experimental Design for Frequency Analysis
      Diagram: for each of the 11 different data sets, the 120 different instantiations are applied, and the instantiations with Min(MMRE) and Max(Pred(0.25)) are identified.
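The two selection criteria can be written out with the standard definitions: MMRE is the mean of |actual - predicted| / actual, and Pred(0.25) is the fraction of cases whose relative error is at most 0.25. The sketch below assumes all actual values are positive; the `results` mapping and the function names are hypothetical.

```python
import numpy as np

def mre(actual, predicted):
    """Magnitude of relative error per case (assumes actual > 0)."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return np.abs(actual - predicted) / actual

def mmre(actual, predicted):
    """Mean magnitude of relative error: lower is better."""
    return float(mre(actual, predicted).mean())

def pred(actual, predicted, level=0.25):
    """Fraction of cases with relative error <= level: higher is better."""
    return float((mre(actual, predicted) <= level).mean())

def best_instantiations(results):
    """results maps an instantiation name to (actual, predicted) for one data set.

    Returns the name with minimum MMRE and the name with maximum Pred(0.25).
    """
    scores = {name: (mmre(a, p), pred(a, p)) for name, (a, p) in results.items()}
    best_by_mmre = min(scores, key=lambda name: scores[name][0])
    best_by_pred = max(scores, key=lambda name: scores[name][1])
    return best_by_mmre, best_by_pred
```

The two criteria need not pick the same instantiation, which is why the design tracks both.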
  12. Frequency Analysis
      Frequency of the best performance in single attribute analysis:
      • Neural network based sensitivity analysis (as the weighting technique)
      • Un-weighted average (as the solution algorithm)
      • Maximum number of nearest neighbors (as the number of nearest neighbors)
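A single-attribute frequency count of this kind can be sketched as below: for every parameter, count how often each value appears among the best-performing instantiations. The dictionary layout and the function name are assumptions, and any example values passed in are illustrative, not the study's results.

```python
from collections import Counter

def parameter_frequencies(best_instantiations):
    """Count, per parameter, how often each value occurs among the best instantiations.

    best_instantiations: list of dicts, one per data set, each mapping a
    parameter name (e.g. "WeightingTech") to the value used by the winner.
    """
    freq = {}
    for instantiation in best_instantiations:
        for param, value in instantiation.items():
            freq.setdefault(param, Counter())[value] += 1
    return freq
```

Calling `freq["WeightingTech"].most_common(1)` then gives the value that wins most often for that parameter.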
  14. Customization Support Using DNA
      Data set: eight attributes defined as condition attributes
      • Four data set-related attributes: (NumOfModule), (DefectRatio), (Language), (LOC)
      • Four CBR-related attributes: (SimFunc), (WeightingTech), (NumOfNN), (SolutionAlgorithm)
      • The decision attributes: Pred(0.25) and MMRE
      Diagram: a data set (a1,a2,a3,a4) and the CBR model instantiated by (p1,p2,p3,p4) are the input to the DNA; rule induction produces a rule set; for new data (a1,a2,a3,a4), the customization support returns the recommendation f(a1,a2,a3,a4).
  15. Application of DNA Results
      Generation of the Decision Trees
      Given:
      • NumOfModule = High
      • DefectRatio = High
      • LOC = Medium
      • Language = JAVA
      Question: How to customize a CBR defect prediction model towards achieving high prediction accuracy measured in MMRE?
      Recommendation: customize the CBR model by means of:
      • WeightingTech = SANN
      • NumOfNN ≥ 10
      • SolutionAlgorithm = Rank-weighted Average
      Justification:
      • Based on the data set characteristics, the assumptions of rules 3, 4, 5, 11 and 12 are fulfilled.
      • Comparing the probability distributions of MMRE, rule No. 11 is the best, having the highest probability (69.2%) of achieving “Low” MMRE.
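The justification step (keep the rules whose assumptions hold, then take the one with the highest probability of "Low" MMRE) can be sketched as follows. Only rule 11's probability (69.2%) and its recommendation come from the slide; the rule conditions, the second rule, and the dictionary layout are hypothetical placeholders, since the slide does not list the rule contents.

```python
def applicable(rule, dataset):
    """A rule applies when every condition matches the data set characteristics."""
    return all(dataset.get(attr) == value for attr, value in rule["if"].items())

def recommend(rules, dataset):
    """Return the applicable rule with the highest probability of 'Low' MMRE."""
    candidates = [rule for rule in rules if applicable(rule, dataset)]
    if not candidates:
        return None
    return max(candidates, key=lambda rule: rule["p_low_mmre"])

# Data set characteristics from the slide.
dataset = {"NumOfModule": "High", "DefectRatio": "High", "LOC": "Medium", "Language": "JAVA"}

rules = [
    {   # Probability and recommendation from the slide; the condition is a placeholder.
        "id": 11,
        "if": {"NumOfModule": "High", "DefectRatio": "High"},
        "then": {"WeightingTech": "SANN", "NumOfNN": ">= 10",
                 "SolutionAlgorithm": "Rank-weighted Average"},
        "p_low_mmre": 0.692,
    },
    {   # Entirely hypothetical rule, included only to exercise the comparison.
        "id": "hypothetical",
        "if": {"Language": "JAVA"},
        "then": {"WeightingTech": "None"},
        "p_low_mmre": 0.40,
    },
]

best = recommend(rules, dataset)
```

Here `best["then"]` holds the recommended CBR parameter settings for the new data set.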
  16. Transferability of Rules across Sets of Data
      • Prediction on Pred(0.25) = 48.33%
      • Prediction on MMRE = 62.92%
      • The per-class (High, Medium and Low) values are calculated based on the confusion matrix…
      Diagram: for each of 9 different data sets, the 120 different instantiations are applied and the instantiations with Min(MMRE) and Max(Pred(0.25)) are identified; the data sets and the CBR models instantiated by (p1,p2,p3,p4) are the input to the DNA; rule induction produces a rule set; for new data (a1,a2,a3,a4), the customization support returns the recommendation f(a1,a2,a3,a4).
  18. Validation and Limitations
      • Tools used for attribute selection and modeling tasks
        • Neural network, regression analysis, CBR, and dependency network analysis
      • Only four parameters of the CBR instantiation
      • The composition of the training and testing data sets
      • Another aspect of the analysis undertaken is the definition of classification intervals for dependency networks
        • Two discretization algorithms
        • Sensitivity analysis
  19. Conclusions and Future Work
      • Starting with 11 data sets from the PROMISE repository
      • Calculating the prediction performance of 120 instantiations of the CBR-based defect prediction model based on the values of MMRE and Pred(0.25)
      • The frequency analysis on the top performances
      • Generating the DNA to provide customization support for a new data set
      • The compatibility of rule sets extracted from different contexts
      • Enhancement of the validity with inclusion of further data sets
      • Comparing the performance against other measures
      • Other methods for rule induction
  21. Thanks
      Elham Paikari
      epaikari@ucalgary.ca