ICSM 2011

Paper:
Vasilescu B, Serebrenik A and van den Brand MGJ (2011), "You can't control the unfamiliar: A study on the relations between aggregation techniques for software metrics", In Proceedings of the 27th IEEE International Conference on Software Maintenance, pp. 313-322. IEEE.

  1. Metrics are usually computed at a low level: classes, methods, … / W&I / MDSE 23-4-2012
  2. A multitude of data values obscures the general picture of system maintainability
  3. That we are actually interested in!
  4. You Can't Control the Unfamiliar: A Study on the Relations Between Aggregation Techniques for Software Metrics. Bogdan Vasilescu, Alexander Serebrenik, Mark van den Brand
  5. Two kinds of aggregation • Same metrics, different artifacts • Same artifact, different metrics
  6. Various techniques can be found in the literature • Same metrics, different artifacts • Traditional: mean, median, sum, … • Econometric inequality indices: Gini, Theil, Hoover, Kolm, Atkinson
  7. Various techniques can be found in the literature • Same metrics, different artifacts • Traditional: mean, median, sum, … • Econometric inequality indices: Gini, Theil, Hoover, Kolm, Atkinson • Which aggregation technique should we use?
  8. Questions 1. Which of the different aggregation techniques agree, and to what extent? 2. What is the nature of the relation between the various aggregation techniques? 3. How does the correlation coefficient change as the systems evolve?
  9. Qualitas Corpus 20101126 • Qualitas Corpus 20101126r, 106 systems • FitJava v1.1, 2 packages, 2240 SLOC • NetBeans v6.9.1, 3373 packages, 1890536 SLOC
  10. 1) Agreement between different techniques • Agreement: • Aggregation: Class SLOC → Package • Techniques agree if they rank the packages similarly • We use a rank-based correlation coefficient: Kendall's τ
  11. 1) Agreement: different inequality indices? • Gini, Theil, Hoover, Atkinson agree • The aggregates obtained convey the same information • Kolm does not!
  12. 1) Agreement: traditional and inequality indices? • mean • Kolm: strong (0.8) and statistically significant (92%) • median, standard deviation, and variance • sum • does not correlate with any other aggregation technique
  13. 2) Nature of the relation: Typical patterns • Theil is known to be more sensitive to the rich • Linear relation with a “fat” head • Theil increases faster when Gini increases
  14. Which aggregation technique? (1) • Theil, Hoover, Gini and Atkinson agree • Any can be chosen from the correlation point of view • Some might be “better” in each specific case • easy to interpret: Gini ∈ [0,1] • provide additional insights: Theil (explanation) • negative values: Gini, Hoover − affects the domain! • sensitive to high values: Theil, Atkinson • deviations from uniformity: Gini, Hoover
  15. Which aggregation technique? (2) • Kolm and mean agree • Kolm is reliable for skewed distributions − better alternative (“by no means”) • Not in the paper: − agreement observed for NOC − but not for DIT!
  16. Conclusions
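
The agreement check described on the slides (aggregate class-level SLOC per package, then compare how two techniques rank the packages using Kendall's τ) can be illustrated with a minimal sketch. Everything concrete here is an assumption for illustration: the package names and SLOC values are made up, the Gini and Theil formulas are the common textbook definitions, and the τ-a variant (no tie correction) is chosen for simplicity; the slides do not state which variant or tooling the authors used.

```python
from itertools import combinations
from math import log
from statistics import mean


def gini(values):
    """Gini index: sum of absolute pairwise differences over 2 * n^2 * mean."""
    n, mu = len(values), mean(values)
    total = sum(abs(x - y) for x in values for y in values)
    return total / (2 * n * n * mu)


def theil(values):
    """Theil index: mean of (x/mu) * ln(x/mu), treating 0 * ln(0) as 0.

    Assumes non-negative values with a positive mean (true for SLOC).
    """
    mu = mean(values)
    return mean((x / mu) * log(x / mu) if x > 0 else 0.0 for x in values)


def kendall_tau(a, b):
    """Kendall's tau-a: (concordant - discordant) / number of pairs."""
    pairs = list(combinations(range(len(a)), 2))
    score = 0
    for i, j in pairs:
        s = (a[i] - a[j]) * (b[i] - b[j])
        score += (s > 0) - (s < 0)  # +1 concordant, -1 discordant, 0 tied
    return score / len(pairs)


# Hypothetical packages, each a list of class-level SLOC values.
packages = {
    "pkg.a": [120, 120, 120, 120],
    "pkg.b": [10, 20, 30, 340],
    "pkg.c": [50, 150, 200, 400],
    "pkg.d": [5, 5, 5, 985],
}

# Aggregate class SLOC to one value per package with each technique.
names = sorted(packages)
gini_agg = [gini(packages[p]) for p in names]
theil_agg = [theil(packages[p]) for p in names]
mean_agg = [mean(packages[p]) for p in names]

# Two techniques "agree" when they rank the packages similarly.
tau_gini_theil = kendall_tau(gini_agg, theil_agg)
tau_gini_mean = kendall_tau(gini_agg, mean_agg)
print(f"tau(Gini, Theil) = {tau_gini_theil:.2f}")
print(f"tau(Gini, mean)  = {tau_gini_mean:.2f}")
```

On this toy data Gini and Theil order the packages identically, while Gini and the mean do not; that only illustrates the mechanics and says nothing about the corpus results reported on the slides.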
