Promise 2011: "Customization Support for CBR-Based Defect Prediction"
 

Elham Paikari, Guenther Ruhe, Bo Sun and Emadoddin Livani.

    Presentation Transcript

    • Customization Support for CBR-Based Defect Prediction
      Elham Paikari
      Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, epaikari@ucalgary.ca
      Bo Sun
      Department of Computer Science, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, sbo@ucalgary.ca
      Guenther Ruhe
      Department of Computer Science & Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, ruhe@ucalgary.ca
      Emadoddin Livani
      Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, elivani@ucalgary.ca
    • Agenda
      Parameters of a CBR Model
      Parameters Instantiation
      Weighting Method SANN
      Frequency Analysis
      Dependency Network and the Customization Support
      Rules Transferability
      Conclusions and Future Work
    • CBR and the Parameters
      Each combination of the parameters is an instantiation of the general CBR-based prediction method.
      Four parameters determine the prediction performance of the CBR model:
      Solution Algorithm
      Similarity Function
      Number of Nearest Neighbor Cases
      Weighting Technique used for Attributes
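A minimal sketch of how the four parameters combine into one instantiation of a CBR predictor. The function names, the distance-to-similarity conversion, and the toy case base are illustrative assumptions, not the authors' implementation.

```python
import math

def euclidean_similarity(a, b, weights):
    """Weighted Euclidean distance mapped to a similarity in (0, 1] (assumed form)."""
    dist = math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))
    return 1.0 / (1.0 + dist)

def unweighted_average(neighbors):
    """neighbors: list of (similarity, defects), most similar first."""
    return sum(d for _, d in neighbors) / len(neighbors)

def rank_weighted_average(neighbors):
    """The closest neighbor gets rank weight k, the next k-1, and so on."""
    k = len(neighbors)
    ranks = range(k, 0, -1)
    return sum(r * d for r, (_, d) in zip(ranks, neighbors)) / sum(ranks)

def cbr_predict(case_base, query, similarity, weights, k, solution):
    """Predict defects for `query` from the k most similar stored cases."""
    scored = [(similarity(attrs, query, weights), defects)
              for attrs, defects in case_base]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return solution(scored[:k])

# Hypothetical case base: ((attribute values), number of defects)
case_base = [((10.0, 2.0), 1.0), ((50.0, 8.0), 4.0), ((52.0, 9.0), 6.0)]
query = (51.0, 8.0)
plain = cbr_predict(case_base, query, euclidean_similarity,
                    (1.0, 1.0), 2, unweighted_average)
ranked = cbr_predict(case_base, query, euclidean_similarity,
                     (1.0, 1.0), 2, rank_weighted_average)
```

Swapping any one argument (similarity function, attribute weights, k, or solution algorithm) yields a different instantiation of the same general method.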
    • Instantiation Parameters of the CBR
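    • Sensitivity Analysis Based on Neural Network (SANN)
      [Figure: the data set attributes (CC, …, LOC) are fed to a neural network (NN). For attribute A1, the input Xmin(A1) yields OUTPUTmin(A1) and the input Xmax(A1) yields OUTPUTmax(A1).]
      ∆1 = |OUTPUTmin(A1) - OUTPUTmax(A1)|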
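The sensitivity computation on this slide can be sketched as follows. This is a sketch under stated assumptions: `model` stands in for any already-trained network exposed as a callable, the other attributes are held at their mean while one attribute is moved between its minimum and maximum, and the deltas are normalized to sum to 1. The slide specifies none of these choices.

```python
def sann_weights(model, rows):
    """Return one weight per attribute from min/max output differences."""
    n_attrs = len(rows[0])
    # Assumption: attributes not under test are fixed at their mean value.
    means = [sum(r[j] for r in rows) / len(rows) for j in range(n_attrs)]
    deltas = []
    for j in range(n_attrs):
        lo = min(r[j] for r in rows)
        hi = max(r[j] for r in rows)
        x_min = means[:j] + [lo] + means[j + 1:]
        x_max = means[:j] + [hi] + means[j + 1:]
        # Delta_j = |OUTPUTmin(A_j) - OUTPUTmax(A_j)|
        deltas.append(abs(model(x_min) - model(x_max)))
    total = sum(deltas)
    if total == 0:
        return [1.0 / n_attrs] * n_attrs
    # Assumption: weights are the deltas normalized to sum to 1.
    return [d / total for d in deltas]

def toy_model(x):
    """Hypothetical stand-in for a trained neural network."""
    return 3.0 * x[0] + 1.0 * x[1]

rows = [[0.0, 0.0], [1.0, 2.0], [2.0, 4.0]]
weights = sann_weights(toy_model, rows)
```

With the toy model, the first attribute changes the output by 6 and the second by 4, so the first receives the larger weight.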
    • How do the evaluation results compare with existing methods (un-weighted)?
      How do the evaluation results compare with existing methods (MLR)?
      How do different numbers of nearest neighbors affect the results?
    • Instantiation Parameters of the CBR
    • Data Repository
      PROMISE Repository
      120 different CBR instantiations were created and applied to 11 data sets from the PROMISE repository
      Characterization of data sets
    • Is One Instantiation Always the Best?
      [Figure: results per data set; panels MW1, PC1, MC2, KC3, AR5, CM1]
    • Experimental Design for Frequency Analysis
      [Figure: for each of the 11 different data sets, the 120 different instantiations are evaluated and the ones achieving Min(MMRE) and Max(Pred(0.25)) are selected]
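The two selection criteria, Min(MMRE) and Max(Pred(0.25)), use the standard definitions of the mean magnitude of relative error and the share of predictions within 25% of the actual value. The sketch below assumes every actual value is greater than zero; the instantiation names and numbers are hypothetical.

```python
def mre(actual, predicted):
    """Magnitude of relative error; assumes actual > 0."""
    return abs(actual - predicted) / actual

def mmre(actuals, predictions):
    errors = [mre(a, p) for a, p in zip(actuals, predictions)]
    return sum(errors) / len(errors)

def pred(actuals, predictions, level=0.25):
    """Fraction of predictions whose MRE is at most `level`."""
    hits = sum(1 for a, p in zip(actuals, predictions) if mre(a, p) <= level)
    return hits / len(actuals)

def best_instantiations(results):
    """results: {instantiation name: (actuals, predictions)} for one data set."""
    scores = {name: (mmre(a, p), pred(a, p)) for name, (a, p) in results.items()}
    best_by_mmre = min(scores, key=lambda n: scores[n][0])
    best_by_pred = max(scores, key=lambda n: scores[n][1])
    return best_by_mmre, best_by_pred

# Hypothetical predictions from two instantiations on one data set
actuals = [4.0, 10.0, 2.0]
results = {
    "inst_A": (actuals, [5.0, 10.0, 2.0]),
    "inst_B": (actuals, [4.0, 5.0, 4.0]),
}
winners = best_instantiations(results)
```

The two criteria need not agree on a data set, which is why both winners are recorded.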
    • Frequency Analysis
      Parameter values most frequently associated with the best performance in single attribute analysis:
      Neural network based sensitivity analysis (as the weighting technique)
      Un-weighted average (as the solution algorithm)
      Maximum number of nearest neighbors (as the number of nearest neighbors)
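A single attribute frequency count of this kind can be sketched by tallying, per parameter, how often each value appears among the best-performing instantiations. The data set names and winning values below are hypothetical placeholders, not results from the study.

```python
from collections import Counter

def parameter_frequencies(best_per_dataset):
    """Count how often each parameter value occurs among the winners."""
    counts = {}
    for instantiation in best_per_dataset.values():
        for param, value in instantiation.items():
            counts.setdefault(param, Counter())[value] += 1
    return counts

# Hypothetical winners, one per data set
best_per_dataset = {
    "dataset_1": {"WeightingTech": "SANN", "NumOfNN": 5},
    "dataset_2": {"WeightingTech": "SANN", "NumOfNN": 3},
    "dataset_3": {"WeightingTech": "None", "NumOfNN": 5},
}
freq = parameter_frequencies(best_per_dataset)
top_weighting = freq["WeightingTech"].most_common(1)[0]
```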
    • Customization Support Using DNA
      Eight attributes defined as condition attributes
      Four data set-related attributes (a1,a2,a3,a4):
      (NumOfModule),(DefectRatio),(Language),(LOC)
      Four CBR-related attributes (p1,p2,p3,p4):
      (SimFunc),(WeightingTech),(NumOfNN),(SolutionAlgorithm)
      The decision attributes: Pred(0.25) and MMRE
      [Figure: Rule Induction: data sets described by (a1,a2,a3,a4), together with CBR models instantiated by (p1,p2,p3,p4), are passed to DNA, which produces a Rule Set. Customization Support: for new data described by (a1,a2,a3,a4), the Rule Set gives a Recommendation f(a1,a2,a3,a4).]
    • Application of DNA Results
      Generation of the Decision Trees
      Given:
      NumOfModule = High
      DefectRatio = High
      LOC = Medium
      Language = JAVA
      Question: How should a CBR defect prediction model be customized to achieve high prediction accuracy, measured in MMRE?
      Recommendation:
      Customize the CBR model by means of:
      WeightingTech = SANN
      NumOfNN ≥ 10
      SolutionAlgorithm = Rank-weighted Average
      Justification:
      Based on the data set characteristics, the assumptions of rules 3, 4, 5, 11 and 12 are fulfilled.
      Comparing the probability distributions of MMRE, rule No. 11 is the best because it has the highest probability (69.2%) of achieving “Low” MMRE.
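The justification step (keep the rules whose assumptions the data set fulfils, then pick the one most likely to give "Low" MMRE) can be sketched as below. The rule structure, rule names, and probabilities are hypothetical and only illustrate the selection logic; they are not the rules induced in the study.

```python
def recommend(rules, profile, target="Low"):
    """Return the applicable rule with the highest probability of `target` MMRE."""
    applicable = [
        rule for rule in rules
        if all(profile.get(attr) == value for attr, value in rule["if"].items())
    ]
    if not applicable:
        return None
    return max(applicable, key=lambda rule: rule["mmre_dist"].get(target, 0.0))

# Data set characteristics from the slide
profile = {"NumOfModule": "High", "DefectRatio": "High",
           "LOC": "Medium", "Language": "JAVA"}

# Hypothetical rules with made-up MMRE probability distributions
rules = [
    {"name": "rule_x", "if": {"DefectRatio": "High"},
     "then": {"WeightingTech": "SANN"},
     "mmre_dist": {"Low": 0.40, "Medium": 0.35, "High": 0.25}},
    {"name": "rule_y", "if": {"LOC": "Medium", "Language": "JAVA"},
     "then": {"SolutionAlgorithm": "Rank-weighted Average"},
     "mmre_dist": {"Low": 0.60, "Medium": 0.30, "High": 0.10}},
    {"name": "rule_z", "if": {"Language": "C"},
     "then": {"NumOfNN": 3},
     "mmre_dist": {"Low": 0.90, "Medium": 0.05, "High": 0.05}},
]
chosen = recommend(rules, profile)
```

Here `rule_z` is skipped because its condition does not match the profile, even though its "Low" probability is the highest.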
    • Transferability of Rules across Sets of Data
      • Prediction on Pred(0.25) = 48.33%
      • Prediction on MMRE = 62.92%
      • The per-class (High, Medium and Low) are calculated based on the confusion matrix…
      [Figure: Rule Induction: for 9 different data sets, the 120 different instantiations are evaluated by Min(MMRE) and Max(Pred(0.25)); data sets and the CBR models instantiated by (p1,p2,p3,p4) are passed to DNA, which produces a Rule Set. Customization Support: for new data described by (a1,a2,a3,a4), the Rule Set gives a Recommendation f(a1,a2,a3,a4).]
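Overall and per-class figures can be derived from a confusion matrix over the classes High, Medium and Low as sketched below. The slide does not say which per-class measures were used, so recall and precision are assumptions here, and the counts are hypothetical.

```python
def accuracy(matrix):
    """matrix[actual][predicted] holds counts; returns the share on the diagonal."""
    total = sum(sum(row.values()) for row in matrix.values())
    correct = sum(matrix[c][c] for c in matrix)
    return correct / total

def per_class_recall(matrix):
    """Correct predictions of a class divided by its actual occurrences."""
    result = {}
    for c, row in matrix.items():
        actual_total = sum(row.values())
        result[c] = row[c] / actual_total if actual_total else 0.0
    return result

def per_class_precision(matrix):
    """Correct predictions of a class divided by all predictions of that class."""
    result = {}
    for c in matrix:
        predicted_total = sum(matrix[a][c] for a in matrix)
        result[c] = matrix[c][c] / predicted_total if predicted_total else 0.0
    return result

# Hypothetical confusion matrix: outer key = actual class, inner key = predicted class
matrix = {
    "High":   {"High": 3, "Medium": 1, "Low": 0},
    "Medium": {"High": 1, "Medium": 2, "Low": 1},
    "Low":    {"High": 0, "Medium": 1, "Low": 3},
}
overall = accuracy(matrix)
recall = per_class_recall(matrix)
precision = per_class_precision(matrix)
```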
    • Validation and Limitations
      Tools used for attribute selection, and modeling tasks
      Neural network, regression analysis, CBR, and dependency network analysis
      Only four parameters of the CBR instantiation
      The composition of the training and testing data sets
      Another aspect of the analysis undertaken is the definition of classification intervals for dependency networks:
      Two discretization algorithms
      Sensitivity analysis
    • Conclusions and Future Work
      Starting with 11 data sets from the PROMISE repository
      Calculating the prediction performance of 120 instantiations of the CBR-based defect prediction model based on the value of the MMRE and Pred(0.25)
      The frequency analysis on the top performances
      Generating the DNA to provide a customization support for a new data set
      The compatibility of rule sets extracted from different contexts
      Enhancement of the validity with inclusion of further data sets
      Comparing the performance against other measures
      Other methods for rule induction
    • Thanks
      Elham Paikari
      epaikari@ucalgary.ca