  • Table 1 shows that, on average, the observed classes have: a low DIT (shallow inheritance depth); a low NOC (most classes have only a few children); and a low LCOM (most classes have high cohesion). From these data it can be seen that these metrics do little to differentiate the sample classes.
  • Correlations between the metrics are low. Only the values printed in bold are reasonably significant. In the scatterplots, the relationship between CBO and RFC is not caused by outliers.
  • Logistic regression: a standard technique, based on maximum likelihood estimation, for analyzing the relationships between metrics and the fault-proneness of classes. Univariate: evaluates the relationship between each metric in isolation and fault-proneness. Multivariate: evaluates the predictive capability of those metrics that were assessed as sufficiently significant in the univariate analysis. The formula above is the relation for multivariate logistic regression. The univariate model is a special case of the multivariate one (when only one variable appears in the equation). π denotes the probability that a fault is found in a class during validation. The Xi are the design metrics, included as explanatory variables in the model (the covariates of the logistic regression equation).
  • The statistics used in Tables 3 and 4: Coefficient: the larger the coefficient, the stronger the effect of the explanatory variable on the response variable. Δψ: the odds-ratio difference, i.e. the increase or decrease in the odds ratio when X increases by one unit. ψ(X): shows the effect of the metric on the predicted variable. p-value: the accuracy of the coefficient estimate (coefficient difference).
  • Explanation for NOC: most classes in the sample have no more than one child. Reuse is a significant negative factor for fault density [5]. Classes with a large NOC are less fault-prone. Explanation for LCOM: LCOM cannot be analyzed to predict the fault probability, because LCOM values that should be negative are reset to zero. As a result, a fair comparison between classes is not possible.
  • Of the 58 classes that actually contained faults, 48 are predicted as faulty. Of the 268 faults that actually occurred, 250 are predicted. To summarize, the results show that the studied OO metrics are useful predictors of fault-proneness.
  • These code metrics are provided by the Amadeus tool [2]. The model flags 112 classes as faulty (actually faulty: 58 classes). That leaves 61 classes that would have to be inspected even though they are not faulty.
  • In logistic regression, R² > 0.3 is considered good, so our model seems to be a good predictor.

    1. 1. A Validation of Object-Oriented Design Metrics as Quality Indicators. Evi Yulianti (1006833110), Iis Solichah (1006800094), Mubarik Ahmad (1006833294). Victor R. Basili, Lionel C. Briand, Walcelio L. Melo, University of Maryland
    2. 2. Content <ul><li>Article & Author </li></ul><ul><li>Introduction </li></ul><ul><li>Data Analysis </li></ul><ul><li>Case Study </li></ul><ul><li>Conclusion & Future Work </li></ul>
    3. 3. Article <ul><li>Publication:  Journal IEEE Transactions on Software Engineering </li></ul><ul><li>Issue Date: Oct 1996 </li></ul><ul><li>Volume: 22 Issue:10 </li></ul><ul><li>On page(s): 751 - 761 </li></ul>
    4. 4. Author Dr. Victor R. Basili (University of Maryland) Department of Computer Science , Professor, 1970 – Present Institute for Advanced Computer Studies , Research Professor, 1984-Present Dr. Lionel C. Briand (Carleton University) Canada Research Chair (Tier I) in Software Quality Engineering Dr. Walcelio (Walt) L. Melo, Professor, Catholic University of Brasilia, DF, Brazil , 1997-2001 Lead Architect, Model Driven Solutions, 2008-now
    5. 5. Introduction
    6. 6. Introduction. TESTING: a time- and resource-consuming activity. SOFTWARE METRIC: helps managers 1. make decisions, plan and schedule activities; 2. allocate resources for the different software activities; and identify fault-prone modules. … ?
    7. 7. Introduction (cont.) Metrics must be defined and validated in order to be used in industry. Empirical validation aims at demonstrating the usefulness of a measure in practice and is, therefore, a crucial activity to establish the overall validity of a measure. Criterion: ability to identify fault-prone classes.
    8. 8. Chidamber & Kemerer’s metric [13] <ul><li>Weighted Methods per Class (WMC) </li></ul><ul><li>Depth of Inheritance Tree of a class (DIT) </li></ul><ul><li>Number Of Children of a Class (NOC) </li></ul><ul><li>Coupling Between Object classes (CBO) </li></ul><ul><li>Response For a Class (RFC) </li></ul><ul><li>Lack of Cohesion on Methods (LCOM) </li></ul>
    9. 9. Hypothesis <ul><li>H-WMC </li></ul><ul><li>H-DIT </li></ul><ul><li>H-NOC </li></ul><ul><li>H-CBO </li></ul><ul><li>H-RFC </li></ul><ul><li>H-LCOM </li></ul>
    10. 10. Data Analysis
    11. 11. DATA ANALYSIS <ul><li>Assess empirically whether the OO design metrics defined in [13] are useful predictors of fault-prone classes. </li></ul><ul><li>Analyses used: </li></ul><ul><ul><li>Descriptive distributions of the OO metrics </li></ul></ul><ul><ul><li>Univariate and multivariate analysis of the relationship between the OO metrics and fault-proneness. </li></ul></ul>
    12. 12. Distribution and Correlation Analysis <ul><li>Distribution of the analyzed OO metrics based on the 180 classes studied. </li></ul>(Figure: metric distributions; axis label: number of classes.)
    13. 13. Distribution and Correlation Analysis (cont.) <ul><li>Descriptive statistics of the metric distributions. </li></ul>
    14. 14. Distribution and Correlation Analysis (cont.) <ul><li>Linear Pearson's correlations (R²: coefficient of determination) between the studied OO metrics are very weak. </li></ul><ul><li>These metrics are mostly statistically independent. </li></ul>
    15. 15. The Relationships Between Fault Probability and OO Metrics <ul><li>Analysis Methodology </li></ul><ul><ul><li>Explanatory variable: metrics from [13] </li></ul></ul><ul><ul><li>Response variable: was a fault detected in a class during testing phases? (binary) </li></ul></ul><ul><ul><li>Used methods: Logistic regression </li></ul></ul><ul><ul><ul><li>Univariate </li></ul></ul></ul><ul><ul><ul><li>Multivariate </li></ul></ul></ul>
    16. 16. The Relationships Between Fault Probability and OO Metrics (cont.) <ul><li>Resulting Table </li></ul>
    17. 17. The Relationships Between Fault Probability and OO Metrics (cont.) <ul><li>Statistics: </li></ul><ul><ul><li>Coefficient: the estimated regression coefficient. </li></ul></ul><ul><ul><li>Δψ: the change in the odds ratio when the metric increases by one unit. </li></ul></ul><ul><ul><li>Statistical significance (p-value) </li></ul></ul>
    18. 18. Univariate Analysis <ul><li>Analyze the relationships between the six OO metrics introduced in [13] and the probability of fault detection in a class during test phases. </li></ul><ul><li>Test the hypotheses. </li></ul>
    19. 19. Univariate Analysis (cont.) <ul><li>Test Result: (analyzed from Table 3) </li></ul><ul><ul><li>WMC : H-WMC is supported. </li></ul></ul><ul><ul><li>DIT : H-DIT is supported. </li></ul></ul><ul><ul><li>RFC : H-RFC is supported. </li></ul></ul><ul><ul><li>NOC : H-NOC is not supported. </li></ul></ul><ul><ul><li>LCOM : can’t be analyzed. </li></ul></ul><ul><ul><li>CBO : H-CBO is supported. </li></ul></ul>
    20. 20. Multivariate Analysis <ul><li>To evaluate the predictive capability of those metrics that had been assessed as sufficiently significant in the univariate analysis. </li></ul><ul><li>Only the metrics that significantly improve the predictive power of the multivariate model are included. </li></ul>
    21. 21. Result I: The figures before the parentheses in the right column are the numbers of classes classified as faulty. The figures within the parentheses are the faults contained in those classes. <ul><li>Based on OO Design Metrics [13] </li></ul><ul><li>With such a model used for classification, the results are obtained with a classification threshold of π (Fault detection) = 0.5, i.e., when π > 0.5, the class is classified as faulty and, otherwise, as nonfaulty. </li></ul>
    22. 22. Result II <ul><li>Based on Code Metrics [2] </li></ul><ul><li>112 classes (predicted as faulty) out of 180 would be inspected and 51 faulty classes out of 58 would be detected </li></ul>
    23. 23. Result III <ul><li>Accuracies of OO and Code Metrics </li></ul><ul><li>Correctness (percentage of classes correctly predicted as faulty) </li></ul><ul><li>Completeness (percentage of faulty classes detected) </li></ul>Values in parentheses give the correctness and completeness of the predictions when classes are weighted according to the number of faults they contain.
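The two accuracy measures defined on this slide can be worked through with the counts quoted on the Result II slide for the code-metrics model (112 classes predicted faulty, 51 of the 58 faulty classes detected). This is a minimal sketch; the function and variable names are illustrative and do not come from the paper.

```python
def correctness(detected_faulty: int, predicted_faulty: int) -> float:
    """Share of the classes flagged as faulty that really are faulty."""
    return detected_faulty / predicted_faulty


def completeness(detected_faulty: int, actual_faulty: int) -> float:
    """Share of the actually faulty classes that the model flags."""
    return detected_faulty / actual_faulty


# Counts quoted on the Result II slide (code-metrics model).
predicted_faulty = 112   # classes the model flags for inspection
detected_faulty = 51     # flagged classes that are actually faulty
actual_faulty = 58       # faulty classes in the whole sample

# Classes inspected although they contain no fault.
false_alarms = predicted_faulty - detected_faulty

if __name__ == "__main__":
    print(f"correctness:  {correctness(detected_faulty, predicted_faulty):.1%}")
    print(f"completeness: {completeness(detected_faulty, actual_faulty):.1%}")
    print(f"false alarms: {false_alarms}")
```

The false-alarm count is the inspection effort spent on classes that turn out to be fault-free, which is why a model with high completeness can still be costly to use.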
    24. 24. CASE STUDY: DATA ANALYSIS FOR SAMPLE CLASSES
    25. 25. Class Diagram
    26. 26. Distribution Analysis <ul><li>Distribution of the analyzed OO metrics based on eight sample classes from “Tugas 1” (Assignment 1). </li></ul>
    27. 27. Distribution Analysis (cont.) <ul><li>Distribution of the analyzed OO metrics based on eight sample classes from “Tugas 1” (Assignment 1). </li></ul>
    28. 28. Distribution Analysis (cont.) <ul><li>Descriptive statistics of the metric value distributions. </li></ul>
                 WMC     DIT   RFC      NOC     LCOM     CBO
        Maximum  15      2     18       2       93       2
        Minimum  1       0     1        0       0        0
        Mean     6.875   0.5   10.375   0.375   11.625   1.375
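These descriptive statistics can be recomputed from the per-class metric values that appear in the univariate-analysis tables on the later slides. The sketch below assumes those tabulated values are the full data set for the eight sample classes; the dictionary layout and the `describe` helper are illustrative.

```python
from statistics import mean

# Per-class metric values as tabulated on the univariate-analysis slides.
metrics = {
    "Tool.java":           {"WMC": 1,  "DIT": 0, "RFC": 1,  "NOC": 0, "LCOM": 0,  "CBO": 1},
    "CTextbox.java":       {"WMC": 1,  "DIT": 2, "RFC": 6,  "NOC": 0, "LCOM": 0,  "CBO": 2},
    "DrawingPackage.java": {"WMC": 3,  "DIT": 0, "RFC": 15, "NOC": 0, "LCOM": 0,  "CBO": 2},
    "Screen.java":         {"WMC": 5,  "DIT": 0, "RFC": 5,  "NOC": 0, "LCOM": 0,  "CBO": 0},
    "CCircle.java":        {"WMC": 8,  "DIT": 1, "RFC": 9,  "NOC": 0, "LCOM": 0,  "CBO": 2},
    "ShapeList.java":      {"WMC": 10, "DIT": 0, "RFC": 18, "NOC": 0, "LCOM": 0,  "CBO": 1},
    "CRect.java":          {"WMC": 12, "DIT": 1, "RFC": 14, "NOC": 1, "LCOM": 0,  "CBO": 2},
    "CShape.java":         {"WMC": 15, "DIT": 0, "RFC": 15, "NOC": 2, "LCOM": 93, "CBO": 1},
}


def describe(metric: str) -> dict:
    """Maximum, minimum and mean of one metric over all sample classes."""
    values = [row[metric] for row in metrics.values()]
    return {"max": max(values), "min": min(values), "mean": mean(values)}


summary = {name: describe(name) for name in ("WMC", "DIT", "RFC", "NOC", "LCOM", "CBO")}

if __name__ == "__main__":
    for name, stats in summary.items():
        print(name, stats)
```

Note that the LCOM mean is driven entirely by one class (CShape.java), which is worth keeping in mind when reading the LCOM regression.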
    29. 29. Logistic Regression <ul><li>It is used to predict the probability of occurrence of an event. </li></ul><ul><li>It makes use of several predictor variables. </li></ul><ul><li>Multivariate logistic regression function: </li></ul>Univariate logistic regression is the special case where only one variable appears.
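Written out, and assuming the standard logistic form (which matches the π = exp(z) / (1+exp(z)) expression used on the later slides), the multivariate function can be stated as follows. The coefficient names C_0, ..., C_n are illustrative; the X_i are the design metrics used as explanatory variables.

```latex
\pi(X_1, \dots, X_n) = \frac{e^{z}}{1 + e^{z}},
\qquad
z = C_0 + C_1 X_1 + \dots + C_n X_n
```

The univariate case keeps a single term, z = C_0 + C_1 X_1.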
    30. 30. SPSS V.15 VARIABLE VIEW
    31. 31. DATA VIEW
    32. 32. Univariate Analysis:
    33. 33. Annotated SPSS output: coefficient, constant, R², odds ratio. z = -0.513 + 0.075*WMC; π = exp(z) / (1 + exp(z))
    34. 34. WMC: z = -0.513 + 0.075*WMC; π = exp(z) / (1 + exp(z))
        Class                 WMC   π
        Tool.java             1     0.392218
        CTextbox.java         1     0.392218
        DrawingPackage.java   3     0.428494
        Screen.java           5     0.465555
        CCircle.java          8     0.521736
        ShapeList.java        10    0.558974
        CRect.java            12    0.59556
        CShape.java           15    0.648397
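The WMC column can be turned into fault probabilities with a few lines of code. The sketch below plugs the slide's univariate coefficients into the logistic function and applies the 0.5 classification threshold mentioned on the Result I slide; the function names are illustrative.

```python
import math

# Univariate WMC model from the slide: z = -0.513 + 0.075 * WMC
CONSTANT = -0.513
COEFFICIENT = 0.075


def fault_probability(wmc: int) -> float:
    """pi = exp(z) / (1 + exp(z)), written in the equivalent 1 / (1 + exp(-z)) form."""
    z = CONSTANT + COEFFICIENT * wmc
    return 1.0 / (1.0 + math.exp(-z))


def is_predicted_faulty(wmc: int, threshold: float = 0.5) -> bool:
    """Classify a class as faulty when its predicted probability exceeds the threshold."""
    return fault_probability(wmc) > threshold


# WMC values of the eight sample classes, as tabulated on the slide.
sample_wmc = {
    "Tool.java": 1, "CTextbox.java": 1, "DrawingPackage.java": 3, "Screen.java": 5,
    "CCircle.java": 8, "ShapeList.java": 10, "CRect.java": 12, "CShape.java": 15,
}

if __name__ == "__main__":
    for name, wmc in sample_wmc.items():
        print(f"{name:22s} WMC={wmc:2d} pi={fault_probability(wmc):.6f} "
              f"faulty={is_predicted_faulty(wmc)}")
```

With this threshold the model flags the classes whose WMC pushes z above zero, which for these coefficients means a WMC of 7 or more.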
    35. 35. DIT: z = -1.386 + 21.566*DIT; π = exp(z) / (1 + exp(z))
        Class                 DIT   π
        Tool.java             0     0.200047
        Screen.java           0     0.200047
        ShapeList.java        0     0.200047
        CShape.java           0     0.200047
        DrawingPackage.java   0     0.200047
        CCircle.java          1     1
        CRect.java            1     1
        CTextbox.java         2     1
    36. 36. NOC: z = 0.196 - 0.5309*NOC; π = exp(z) / (1 + exp(z))
        Class                 NOC   π
        Tool.java             0     0.548844
        Screen.java           0     0.548844
        ShapeList.java        0     0.548844
        CCircle.java          0     0.548844
        CTextbox.java         0     0.548844
        DrawingPackage.java   0     0.548844
        CRect.java            1     0.417049
        CShape.java           2     0.296129
    37. 37. CBO: z = -2.884 + 2.027*CBO; π = exp(z) / (1 + exp(z))
        Class                 CBO   π
        Screen.java           0     0.05295
        Tool.java             1     0.297967
        ShapeList.java        1     0.297967
        CShape.java           1     0.297967
        CCircle.java          2     0.763145
        CRect.java            2     0.763145
        CTextbox.java         2     0.763145
        DrawingPackage.java   2     0.763145
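The coefficients of a univariate model can be recovered from its tabulated probabilities by taking log-odds, which gives a quick consistency check on the sign and size of the CBO slope. A minimal sketch using only the three distinct π values in the table; the `logit` helper is illustrative.

```python
import math


def logit(p: float) -> float:
    """Log-odds of a probability: the inverse of pi = exp(z) / (1 + exp(z))."""
    return math.log(p / (1.0 - p))


# Tabulated probabilities for CBO = 0, 1 and 2.
pi_by_cbo = {0: 0.05295, 1: 0.297967, 2: 0.763145}

# At CBO = 0 the log-odds equal the constant term of the model.
constant = logit(pi_by_cbo[0])

# Each unit increase in CBO shifts the log-odds by the CBO coefficient.
slope_0_to_1 = logit(pi_by_cbo[1]) - logit(pi_by_cbo[0])
slope_1_to_2 = logit(pi_by_cbo[2]) - logit(pi_by_cbo[1])

if __name__ == "__main__":
    print(f"constant     = {constant:.3f}")
    print(f"slope (0->1) = {slope_0_to_1:.3f}")
    print(f"slope (1->2) = {slope_1_to_2:.3f}")
```

Because the probabilities in the table rise with CBO, both slope estimates come out positive, so the CBO term enters the model with a plus sign.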
    38. 38. RFC: z = -0.941 + 0.09*RFC; π = exp(z) / (1 + exp(z))
        Class                 RFC   π
        Tool.java             1     0.299223
        Screen.java           5     0.379658
        CTextbox.java         6     0.401072
        CCircle.java          9     0.467297
        CRect.java            14    0.579081
        CShape.java           15    0.600848
        DrawingPackage.java   15    0.600848
        ShapeList.java        18    0.663515
    39. 39. LCOM: z = 0.288 - 0.231*LCOM; π = exp(z) / (1 + exp(z))
        Class                 LCOM   π
        Tool.java             0      0.571506429
        Screen.java           0      0.571506429
        ShapeList.java        0      0.571506429
        CCircle.java          0      0.571506429
        CRect.java            0      0.571506429
        CTextbox.java         0      0.571506429
        DrawingPackage.java   0      0.571506429
        CShape.java           93     6.23919E-10
    40. 40. Multivariate Analysis:
    41. 41. Annotated SPSS output: constant <ul><li>coefficient WMC </li></ul><ul><li>coefficient DIT </li></ul><ul><li>coefficient NOC </li></ul><ul><li>coefficient CBO </li></ul><ul><li>coefficient RFC </li></ul>R². z = 50.930 - 4.249*WMC - 33.403*DIT + 28.433*NOC + 5.256*CBO - 1.885*RFC; π = exp(z) / (1 + exp(z))
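The multivariate model can be evaluated the same way as the univariate ones. The sketch below applies the slide's coefficients to three of the sample classes, taking their metric values from the univariate tables; the dictionary layout and function names are illustrative.

```python
import math

# Multivariate model from the slide.
CONSTANT = 50.930
COEFFICIENTS = {"WMC": -4.249, "DIT": -33.403, "NOC": 28.433, "CBO": 5.256, "RFC": -1.885}

# Metric values of three sample classes, taken from the univariate tables.
classes = {
    "Tool.java":   {"WMC": 1,  "DIT": 0, "NOC": 0, "CBO": 1, "RFC": 1},
    "CRect.java":  {"WMC": 12, "DIT": 1, "NOC": 1, "CBO": 2, "RFC": 14},
    "CShape.java": {"WMC": 15, "DIT": 0, "NOC": 2, "CBO": 1, "RFC": 15},
}


def linear_predictor(values: dict) -> float:
    """z = constant + sum of coefficient * metric value."""
    return CONSTANT + sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


def fault_probability(values: dict) -> float:
    """pi = exp(z) / (1 + exp(z)), in the equivalent 1 / (1 + exp(-z)) form."""
    return 1.0 / (1.0 + math.exp(-linear_predictor(values)))


if __name__ == "__main__":
    for name, values in classes.items():
        print(f"{name:12s} z={linear_predictor(values):8.3f} "
              f"pi={fault_probability(values):.6f}")
```

For these three classes z is large in magnitude, so the predicted probabilities land very close to 0 or 1; with only eight classes in the sample, such sharp predictions should be read with caution.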
    42. 42. Related Works
    43. 43. CONCLUSION & FUTURE WORK
    44. 44. Conclusions <ul><li>Five out of the six Chidamber and Kemerer OO metrics appear to be useful to predict class fault-proneness during the high- and low-level design phases of the life cycle. </li></ul><ul><li>Chidamber and Kemerer’s OO metrics prove to be better predictors than the best set of “traditional” code metrics, which can only be collected during later phases of the software development process. </li></ul><ul><li>Most of these metrics appear to be complementary indicators which are relatively independent of each other. </li></ul>
    45. 45. Future Works <ul><li>Replicating this study in an industrial setting: a sample of large-scale projects developed in C++ and Ada95 in the framework of the NASA Goddard Flight Dynamics Division (Software Engineering Laboratory). </li></ul><ul><li>Studying the variations, in terms of metric definitions and experimental results, between different OO programming languages. </li></ul><ul><li>Extending the empirical investigation to other OO metrics proposed in the literature and developing improved metrics. </li></ul>
    46. 46. Paper Reference List <ul><li>[2] Amadeus Software Research, Getting Started With Amadeus, Amadeus Measurement System, 1994. </li></ul><ul><li>[13] S.R. Chidamber and C.F. Kemerer, “A Metrics Suite for Object-Oriented Design,” IEEE Trans. Software Eng., vol. 20, no. 6, pp. 476-493, June 1994. </li></ul>
