Promise 2011: "Customization Support for CBR-Based Defect Prediction"
Elham Paikari, Guenther Ruhe, Bo Sun and Emadoddin Livani.

Presentation Transcript

  • Customization Support for CBR-Based Defect Prediction
    Elham Paikari
    Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, epaikari@ucalgary.ca
    Bo Sun
    Department of Computer Science, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, sbo@ucalgary.ca
    Guenther Ruhe
    Department of Computer Science & Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, ruhe@ucalgary.ca
    Emadoddin Livani
    Department of Electrical and Computer Engineering, University of Calgary, 2500 University Drive, NW, Calgary, AB, Canada, elivani@ucalgary.ca
  • Agenda
    Parameters of a CBR Model
    Parameters Instantiation
    Weighting Method SANN
    Frequency Analysis
    Dependency Network and the Customization Support
    Rules Transferability
    Conclusions and Future Work
  • CBR and the Parameters
    • Each combination of the parameters is an instantiation of the general CBR-based prediction method.
    • Parameters that affect the prediction performance of the CBR model:
      Similarity function
      Number of nearest neighbor cases
      Weighting technique used for attributes
      Solution algorithm
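
The four parameters named on this slide can be pictured as arguments to a single nearest-neighbor prediction routine. The sketch below is only an illustration under stated assumptions, not the authors' implementation: the weighted Euclidean distance, the rank-weighting scheme (closest neighbor gets the largest rank weight), the function names, and the toy case base are all hypothetical.

```python
import math

def weighted_distance(a, b, weights):
    """Weighted Euclidean distance; stands in for the similarity function."""
    return math.sqrt(sum(w * (x - y) ** 2 for x, y, w in zip(a, b, weights)))

def unweighted_average(neighbors):
    """Solution algorithm: plain mean of the neighbors' defect counts."""
    return sum(defects for _, defects in neighbors) / len(neighbors)

def rank_weighted_average(neighbors):
    """Solution algorithm (assumed form): closest neighbor weighs k, next k-1, ..."""
    k = len(neighbors)
    ranks = range(k, 0, -1)
    return sum(r * defects for r, (_, defects) in zip(ranks, neighbors)) / sum(ranks)

def cbr_predict(case_base, query, weights, k, solution):
    """Predict the defect count of `query` from its k nearest cases."""
    ranked = sorted(
        ((weighted_distance(attrs, query, weights), defects)
         for attrs, defects in case_base),
        key=lambda pair: pair[0],
    )
    return solution(ranked[:k])

# Hypothetical case base: (attribute vector, number of defects).
case_base = [((10.0, 2.0), 1.0), ((20.0, 4.0), 3.0), ((40.0, 9.0), 8.0)]
query = (18.0, 4.0)

plain = cbr_predict(case_base, query, (0.5, 0.5), 2, unweighted_average)
ranked = cbr_predict(case_base, query, (0.5, 0.5), 2, rank_weighted_average)
print(plain, ranked)
```

Swapping the distance function, the weight vector, `k`, or the `solution` callable yields a different instantiation of the same general method, which is the sense in which the slide speaks of parameter combinations.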
  • Instantiation Parameters of the CBR
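  • Sensitivity Analysis Based on Neural Network (SANN)
    [Diagram: the data set attributes (CC, …, LOC) are inputs to a neural network (NN). For attribute A1, the network is fed its minimum value Xmin(A1) and its maximum value Xmax(A1), producing OUTPUTmin(A1) and OUTPUTmax(A1).]
    ∆1 = |OUTPUTmin(A1) - OUTPUTmax(A1)|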
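
The ∆ computation on this slide can be sketched for all attributes at once. This is a rough illustration only: the slide does not say how the other attributes are set while one attribute is probed, nor how the ∆ values become weights, so holding the others at their mean and normalizing the deltas to sum to 1 are both assumptions. `toy_network` is a hypothetical stand-in for a trained neural network.

```python
def sann_weights(predict, rows):
    """Derive one weight per attribute from the output swing of `predict`.

    `predict` stands in for the trained network; `rows` are attribute vectors.
    """
    n_attrs = len(rows[0])
    columns = [[row[i] for row in rows] for i in range(n_attrs)]
    # Assumption: attributes other than the probed one are held at their mean.
    baseline = [sum(col) / len(col) for col in columns]
    deltas = []
    for i, col in enumerate(columns):
        low = list(baseline)
        low[i] = min(col)           # Xmin(Ai)
        high = list(baseline)
        high[i] = max(col)          # Xmax(Ai)
        # Delta_i = |OUTPUTmin(Ai) - OUTPUTmax(Ai)|
        deltas.append(abs(predict(low) - predict(high)))
    total = sum(deltas)
    # Assumption: deltas are normalized to sum to 1 so they can act as weights.
    if total == 0:
        return [1.0 / n_attrs] * n_attrs
    return [d / total for d in deltas]

def toy_network(x):
    """Hypothetical stand-in for a trained network: a fixed linear function."""
    return 0.3 * x[0] + 0.1 * x[1]

rows = [(0.0, 0.0), (10.0, 10.0), (5.0, 20.0)]
weights = sann_weights(toy_network, rows)
print(weights)
```

An attribute whose swing from minimum to maximum moves the output more receives a larger weight, which is the intuition behind using sensitivity as an attribute-weighting technique.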
  • How does the evaluation result compare with existing methods (un-weighted)?
    How does the evaluation result compare with existing methods (MLR)?
    How do different numbers of nearest neighbors affect the results?
  • Instantiation Parameters of the CBR
  • Data Repository
    PROMISE Repository
    120 different CBR instantiations were created and applied to 11 data sets from the PROMISE repository
    Characterization of data sets
  • Is One Instantiation Always the Best?
    [Figure: results per data set; panels: MW1, PC1, MC2, KC3, AR5, CM1]
  • Experimental Design for Frequency Analysis
    [Diagram: for each of the 11 different data sets, the 120 different instantiations are evaluated, and the instantiations with Min(MMRE) and Max(Pred(0.25)) are selected.]
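
The two selection criteria in this design can be sketched with the standard definitions of MMRE (mean of |actual - predicted| / actual) and Pred(0.25) (share of cases with relative error at most 0.25). The instantiation labels and numbers below are hypothetical, and the sketch assumes every actual value is non-zero.

```python
def mmre(actual, predicted):
    """Mean magnitude of relative error: mean(|actual - predicted| / actual)."""
    return sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def pred(actual, predicted, level=0.25):
    """Fraction of cases whose relative error is at most `level`."""
    hits = sum(1 for a, p in zip(actual, predicted) if abs(a - p) / a <= level)
    return hits / len(actual)

# Hypothetical results for one data set: instantiation label -> predictions.
actual = [4.0, 10.0, 8.0, 2.0]
predictions = {
    "inst_A": [5.0, 10.0, 6.0, 2.0],
    "inst_B": [4.0, 12.0, 8.0, 4.0],
}

scores = {name: (mmre(actual, p), pred(actual, p)) for name, p in predictions.items()}
best_by_mmre = min(scores, key=lambda name: scores[name][0])   # Min(MMRE)
best_by_pred = max(scores, key=lambda name: scores[name][1])   # Max(Pred(0.25))
print(scores, best_by_mmre, best_by_pred)
```

The two criteria need not agree in general, which is why the design tracks the best instantiation under each measure separately.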
  • Frequency Analysis
    Frequency of the best performance in single attribute analysis:
    • Neural network based sensitivity analysis (as the weighting technique)
    • Un-weighted average (as the solution algorithm)
    • Maximum number of nearest neighbors (as the number of nearest neighbors)
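
A single-attribute frequency count of this kind can be sketched as follows: for each parameter, tally how often each value occurs among the best-performing instantiations. The list of winners and the parameter values (for example the similarity function names) are hypothetical; only the four parameter names come from the slides.

```python
from collections import Counter

PARAMETERS = ("SimFunc", "WeightingTech", "NumOfNN", "SolutionAlgorithm")

# Hypothetical best instantiations, one per data set, in PARAMETERS order.
best_instantiations = [
    ("Euclidean", "SANN", "max", "unweighted"),
    ("Manhattan", "SANN", "max", "unweighted"),
    ("Euclidean", "none", "max", "rank-weighted"),
]

def single_attribute_frequencies(instantiations):
    """Count, per parameter, how often each value appears among the winners."""
    return {
        name: Counter(inst[i] for inst in instantiations)
        for i, name in enumerate(PARAMETERS)
    }

freq = single_attribute_frequencies(best_instantiations)
most_frequent = {name: counts.most_common(1)[0][0] for name, counts in freq.items()}
print(most_frequent)
```

Each parameter is counted independently here, so the most frequent values do not necessarily form the single best combination.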
  • Customization Support Using DNA
    Eight attributes defined as condition attributes
    Four data set-related attributes (a1,a2,a3,a4):
    (NumOfModule),(DefectRatio),(Language),(LOC)
    Four CBR-related attributes (p1,p2,p3,p4):
    (SimFunc),(WeightingTech),(NumOfNN),(SolutionAlgorithm)
    The decision attributes: Pred(0.25) and MMRE
    [Diagram: Rule induction: data sets described by (a1,a2,a3,a4) and (p1,p2,p3,p4) are fed to the DNA, which produces the rule set. Customization support: for new data with attributes (a1,a2,a3,a4), the rule set gives the recommendation f(a1,a2,a3,a4), a CBR model instantiated by (p1,p2,p3,p4).]
  • Application of DNA Results
    Generation of the Decision Trees
    Given:
    NumOfModule = High
    DefectRatio = High
    LOC = Medium
    Language = JAVA
    Question: How to customize a CBR defect prediction model towards achieving high prediction accuracy measured in MMRE?
    Recommendation:
    Customize CBR model by means of:
    WeightingTech = SANN
    NumOfNN ≥ 10
    SolutionAlgorithm = Rank-weighted Average
    Justification:
    Based on the data set characteristics, assumptions of rules 3, 4, 5, 11 and 12 are fulfilled.
    By comparing the probability distributions of MMRE, rule No. 11 is the best, having the highest probability (69.2%) of achieving “Low” MMRE.
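
The selection logic in this justification (keep the rules whose assumptions the data set fulfills, then pick the one most likely to give “Low” MMRE) can be sketched as below. The rule contents, identifiers, and probabilities are invented for illustration and are not the rules from the study; only the attribute names and the given data set characteristics come from the slide.

```python
# Hypothetical rules: conditions on data set attributes, a recommended CBR
# setting, and an assumed probability of reaching "Low" MMRE.
rules = [
    {"id": "A", "if": {"Language": "JAVA"},
     "then": {"WeightingTech": "SANN"}, "p_low_mmre": 0.6},
    {"id": "B", "if": {"Language": "JAVA", "DefectRatio": "High"},
     "then": {"WeightingTech": "SANN",
              "SolutionAlgorithm": "Rank-weighted Average"},
     "p_low_mmre": 0.7},
    {"id": "C", "if": {"Language": "C"},
     "then": {"WeightingTech": "None"}, "p_low_mmre": 0.8},
]

def recommend(rules, dataset):
    """Return the fulfilled rule with the highest probability of low MMRE."""
    matching = [
        rule for rule in rules
        if all(dataset.get(key) == value for key, value in rule["if"].items())
    ]
    if not matching:
        return None
    return max(matching, key=lambda rule: rule["p_low_mmre"])

dataset = {"NumOfModule": "High", "DefectRatio": "High",
           "LOC": "Medium", "Language": "JAVA"}
best = recommend(rules, dataset)
print(best["id"], best["then"])
```

Rule C has the highest probability overall but is not fulfilled by this data set, so it is never considered; the choice is made only among the matching rules.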
  • Transferability of Rules across Sets of Data
    • Prediction on Pred(0.25) = 48.33%
    • Prediction on MMRE = 62.92%
    • The per-class results (High, Medium and Low) are calculated based on the confusion matrix…
    [Diagram: Rule induction: for 9 different data sets, the 120 different instantiations are evaluated by Min(MMRE) and Max(Pred(0.25)), and the DNA produces the rule set. Customization support: for new data with attributes (a1,a2,a3,a4), the rule set gives the recommendation f(a1,a2,a3,a4), a CBR model instantiated by (p1,p2,p3,p4), and the 120 different instantiations are also evaluated on the new data set by Min(MMRE) and Max(Pred(0.25)).]
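
Per-class figures derived from a confusion matrix can be sketched as follows. The slide does not say which per-class measures were reported, so precision and recall are an assumption here, and the matrix values are hypothetical.

```python
CLASSES = ("High", "Medium", "Low")

def per_class_metrics(matrix):
    """matrix[i][j] = number of cases with actual class i predicted as class j."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    metrics = {"accuracy": correct / total}
    for i, label in enumerate(CLASSES):
        predicted_i = sum(row[i] for row in matrix)   # column sum
        actual_i = sum(matrix[i])                     # row sum
        metrics[label] = {
            "precision": matrix[i][i] / predicted_i if predicted_i else 0.0,
            "recall": matrix[i][i] / actual_i if actual_i else 0.0,
        }
    return metrics

# Hypothetical confusion matrix (rows: actual, columns: predicted).
matrix = [
    [6, 2, 0],
    [1, 5, 2],
    [0, 1, 3],
]
metrics = per_class_metrics(matrix)
print(metrics)
```

The overall accuracy plays the role of the single prediction percentages quoted on the slide, while the per-class entries show which of the High, Medium and Low intervals are predicted reliably.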
  • Validation and Limitations
    Tools used for attribute selection, and modeling tasks
    Neural network, regression analysis, CBR, and dependency network analysis
    Only four parameters of the CBR instantiation
    The composition of the training and testing data sets
    Another aspect of the analysis undertaken is the definition of classification intervals for dependency networks:
    Two discretization algorithms
    Sensitivity analysis
  • Conclusions and Future Work
    Starting with 11 data sets from the PROMISE repository
    Calculating the prediction performance of 120 instantiations of the CBR-based defect prediction model based on the value of the MMRE and Pred(0.25)
    The frequency analysis on the top performances
    Generating the DNA to provide a customization support for a new data set
    The compatibility of rule sets extracted from different contexts
    Enhancement of the validity with inclusion of further data sets
    Comparing the performance against other measures
    Other methods for rule induction
  • Thanks
    Elham Paikari
    epaikari@ucalgary.ca