Software metrics are usually right-skewed


[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]
2/11




Aggregation of software metrics using the
          “softnometric” index

               Bogdan Vasilescu
         b.n.vasilescu@student.tue.nl

          Eindhoven University of Technology
                  The Netherlands


                 March 9, 2011
Aggregation techniques                                          3/11

Classical:
    Mean
    Sum
    Cardinality

Distribution fitting:
    Log-normal
    Exponential
    Negative binomial

Inequality indices:
    Theil
    Gini
    Kolm
    Atkinson
Gini index                                                            4/11

The Gini index is based on the Lorenz curve:
     proportion of the total income of the population (y-axis)
     cumulatively earned by the bottom x% of the people.
     0     perfect equality: every person receives the same income.
     1     perfect inequality: one person receives all the income.
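$I_{\mathrm{Gini}}(X) = \frac{A}{A + B}$, where A is the area between the line of
perfect equality and the Lorenz curve, and B is the area under the Lorenz curve.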
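
The area-based definition above can be sketched in a few lines of Python. This is a minimal illustration, not the author's implementation: the function name `gini_from_lorenz` and the sample values are assumptions, and the Lorenz curve is taken to be the piecewise-linear curve through the cumulative shares of the sorted values.

```python
def gini_from_lorenz(values):
    """Gini index A / (A + B) computed from the empirical Lorenz curve.

    Hypothetical helper for illustration; expects non-negative values
    with a positive total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or xs[0] < 0 or total <= 0:
        raise ValueError("need non-negative values with a positive total")

    # Lorenz curve: lorenz[k] is the share of the total held by the
    # k smallest values, so lorenz[0] = 0 and lorenz[n] = 1.
    lorenz = [0.0]
    running = 0.0
    for x in xs:
        running += x
        lorenz.append(running / total)

    # B: area under the Lorenz curve (trapezoids of width 1/n).
    area_b = sum((lorenz[k - 1] + lorenz[k]) / (2 * n) for k in range(1, n + 1))
    # A: area between the diagonal (which encloses 1/2) and the Lorenz curve.
    area_a = 0.5 - area_b
    return area_a / (area_a + area_b)


print(gini_from_lorenz([5, 5, 5, 5]))    # every class has the same size
print(gini_from_lorenz([0, 0, 0, 10]))   # one class holds all the code
```

With a finite sample of n values, the "one person receives all" case yields 1 - 1/n under this construction (0.75 for the four values above), which approaches 1 as n grows.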
Theoretical comparison                                                     5/11




Criteria:
     Domain → determines applicability

     Range → determines interpretation
     Invariance
        •   w.r.t. addition → LOC, ignore headers
        •   w.r.t. multiplication → LOC, percentages vs. absolute values

     Decomposability → explain inequality by partitioning the
     population into groups
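
The two invariance criteria can be checked numerically. The sketch below is illustrative only: the SLOC values are made up, the Gini index is written in its mean-absolute-difference form, and the Kolm index uses the common form (1/alpha) * log(mean(exp(alpha * (mu - x_i)))) with an arbitrarily chosen alpha = 0.01. Neither the formula variants nor the parameter come from the slides.

```python
import math


def gini(values):
    """Gini index as the mean absolute difference over twice the mean."""
    n = len(values)
    mean = sum(values) / n
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * n * mean)


def kolm(values, alpha=0.01):
    """Kolm index; alpha > 0 is an assumed sensitivity parameter."""
    n = len(values)
    mean = sum(values) / n
    return math.log(sum(math.exp(alpha * (mean - v)) for v in values) / n) / alpha


sloc = [12, 40, 55, 130, 480]        # hypothetical class sizes
scaled = [3 * v for v in sloc]       # multiplication: change of unit
shifted = [v + 20 for v in sloc]     # addition: e.g. a 20-line header per file

# Gini is unchanged by multiplication but not by addition.
print(gini(sloc), gini(scaled), gini(shifted))
# Kolm is unchanged by addition but not by multiplication.
print(kolm(sloc), kolm(shifted), kolm(scaled))
```

An index that is invariant under addition gives the same value whether or not a fixed-size header is counted in every file, while one that is invariant under multiplication gives the same value for absolute sizes and for percentages of the total.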
Theoretical comparison                                             6/11




Agg. technique   Domain   Range          Invariance   Decomposability
Mean             R        R              -            N/A
Sum              R        R              -            N/A
Cardinality      R        N              -            N/A
Gini Index       R+       [0, 1]         mult.        -
                 R        R              mult.        -
Theil Index      R+       [0, log n]     mult.        yes
Kolm Index       R        R+             add.         yes
Atkinson Index   R+       [0, 1 − 1/n]   mult.        -
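
The upper bounds in the Range column and the "yes" in the Decomposability column for the Theil index can be illustrated as follows. This is a sketch under stated assumptions: the slides do not give the formulas or parameters, so the standard Theil T form is used, and the Atkinson index is evaluated with epsilon = 0.5, the parameter value for which the maximum works out to 1 - 1/n. All function names and data are hypothetical.

```python
import math


def theil(values):
    """Theil index: mean of (x/mu) * log(x/mu), taking 0 * log(0) as 0."""
    n = len(values)
    mean = sum(values) / n
    return sum((v / mean) * math.log(v / mean) for v in values if v > 0) / n


def atkinson(values, epsilon=0.5):
    """Atkinson index for 0 < epsilon < 1 (epsilon = 0.5 is an assumption)."""
    n = len(values)
    mean = sum(values) / n
    power = 1.0 - epsilon
    equally_distributed = (sum(v ** power for v in values) / n) ** (1.0 / power)
    return 1.0 - equally_distributed / mean


def theil_decomposition(groups):
    """Split the overall Theil index into (within-group, between-group) parts."""
    all_values = [v for group in groups for v in group]
    total = sum(all_values)
    mean = total / len(all_values)
    within = 0.0
    between = 0.0
    for group in groups:
        share = sum(group) / total            # group's share of the total
        group_mean = sum(group) / len(group)
        within += share * theil(group)
        between += share * math.log(group_mean / mean)
    return within, between


extreme = [0, 0, 0, 100]                      # one element holds everything
print(theil(extreme), math.log(len(extreme)))       # both equal log n
print(atkinson(extreme), 1 - 1 / len(extreme))      # both equal 1 - 1/n

groups = [[10, 20, 30], [100, 400]]           # e.g. classes split by package
within, between = theil_decomposition(groups)
print(within + between, theil([10, 20, 30, 100, 400]))
```

The within-group and between-group parts add up to the overall index, which is what makes it possible to attribute inequality to a partition of the population into groups.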
Empirical comparison                                                  7/11




Research questions:

    Does LOC relate to bugs?

    Do the aggregation techniques influence the presence/strength of
    this relation?

    Is there any difference between the aggregation techniques?
    Do they express the same thing?
Empirical comparison                                    8/11




Case study: ArgoUML
    Open-source, ∼ 1200 Java classes, ∼ 100 packages.

Methodology:
    Tool chain to automatically process issue tracker and version
    control system data.
    Mapped defects to Java classes and then packages.
    Measured SLOC of each class, aggregated to package level.
    For each aggregation technique, statistically studied correlation
    with bugs.
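
The aggregation and correlation steps of the methodology can be sketched as below. Everything here is hypothetical: the class names, SLOC values, and defect counts are invented, and since the slides do not name the correlation coefficient, Kendall's tau-a is used purely as an assumed example of a rank correlation.

```python
from collections import defaultdict


def package_of(class_name):
    """Package of a fully qualified Java class name."""
    return class_name.rsplit(".", 1)[0]


def aggregate_by_package(class_sloc, aggregate):
    """Group class-level SLOC by package and apply an aggregation function."""
    by_package = defaultdict(list)
    for class_name, sloc in class_sloc.items():
        by_package[package_of(class_name)].append(sloc)
    return {pkg: aggregate(values) for pkg, values in by_package.items()}


def mean(values):
    return sum(values) / len(values)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant pairs) / number of pairs."""
    n = len(xs)
    score = 0
    for i in range(n):
        for j in range(i + 1, n):
            product = (xs[i] - xs[j]) * (ys[i] - ys[j])
            score += (product > 0) - (product < 0)
    return score / (n * (n - 1) / 2)


# Hypothetical input: SLOC per class and defects already mapped to packages.
class_sloc = {
    "org.example.ui.Window": 300,
    "org.example.ui.Button": 40,
    "org.example.model.Node": 80,
    "org.example.model.Edge": 60,
    "org.example.util.Strings": 20,
}
defects = {"org.example.ui": 5, "org.example.model": 2, "org.example.util": 0}

per_package = aggregate_by_package(class_sloc, mean)
packages = sorted(per_package)
tau = kendall_tau([per_package[p] for p in packages],
                  [defects[p] for p in packages])
print(per_package, tau)
```

Passing a different function as `aggregate` (for example an inequality index instead of `mean`) repeats the same analysis for another aggregation technique.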
Results                                                         9/11

             mean   IGini   ITheil   IKolm       IAtkinson   defects
mean                0.170   0.192    0.676 [1]   0.203       0.0096
IGini                       0.908    0.467       0.903       0.27
ITheil                               0.488       0.918       0.273
IKolm                                            0.501       0.119
IAtkinson                                                    0.229

     IGini, ITheil and IAtkinson show the strongest correlations with the
     number of defects, and these correlations are statistically significant.
     However, these three indices are also highly and significantly
     correlated with one another.
     The mean shows the lowest correlation with the number of defects.

[1] Statistically significant correlations, with two-sided p-values not exceeding 0.01, are typeset in boldface.
Threats to validity                                                  10/11




No control over the issue tracker → mapping of defects to classes.
    bugs missing from the issue tracker.
    bug fixes not showing up in the commit log.

How representative is the case? How about the version?
    replicate on more systems and more versions.

Is LOC the most suitable metric?
    replicate with more metrics.
Conclusions                                                     11/11

Software metrics are not distributed normally.

[Figure: Histogram of SLOC(org.argouml.ui). x-axis: SLOC for classes in org.argouml.ui (0 to 500); y-axis: Frequency (0 to 25).]

Theoretical comparison.

Agg. technique   Domain   Range          Invariance   Decomposability
Mean             R        R              -            N/A
Sum              R        R              -            N/A
Cardinality      R        N              -            N/A
Gini Index       R+       [0, 1]         mult.        -
                 R        R              mult.        -
Theil Index      R+       [0, log n]     mult.        yes
Kolm Index       R        R+             add.         yes
Atkinson Index   R+       [0, 1 − 1/n]   mult.        -

Empirical comparison.

            mean   Gini    Theil   Kolm    Atkinson   defects
mean               0.170   0.192   0.676   0.203      0.0096
Gini                       0.908   0.467   0.903      0.27
Theil                              0.488   0.918      0.273
Kolm                                       0.501      0.119
Atkinson                                              0.229

Classical aggregation techniques have problems when distributions are
skewed. Inequality indices look more promising.

Benevol 2010
