Aggregation of software metrics
      Bogdan Vasilescu
      b.n.vasilescu@student.tue.nl

      Alexander Serebrenik
      a.serebrenik@tue.nl




April 7, 2011
Aggregation techniques for software metrics




     Better understand aggregation techniques for software metrics.
     [Figure: density plot "Source lines of code - freecol-0.9.4"; x-axis: SLOC per class (0 to 3000), y-axis: density (0.000 to 0.004)]




     Traditional: mean, sum, median, standard deviation, variance,
     skewness, kurtosis.
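
As a rough illustration of the traditional aggregation techniques listed above, the following Python sketch reduces a list of per-class SLOC values to single numbers. The function name `traditional_aggregates` and the sample values are assumptions made for this example and are not taken from the tooling described in these slides. Skewness and kurtosis use the plain moment-based (population) formulas; other estimators would give slightly different values.

```python
import math
import statistics


def traditional_aggregates(values):
    """Aggregate metric values (e.g. SLOC per class) into single numbers.

    Hypothetical helper for illustration only.
    """
    n = len(values)
    mean = statistics.fmean(values)
    # Population variance (divides by n, not n - 1).
    variance = statistics.pvariance(values, mu=mean)
    std_dev = math.sqrt(variance)
    # Third and fourth central moments.
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    # Guard against a constant sample, where the ratios are undefined.
    skewness = m3 / std_dev ** 3 if std_dev > 0 else 0.0
    kurtosis = m4 / std_dev ** 4 if std_dev > 0 else 0.0
    return {
        "sum": sum(values),
        "mean": mean,
        "median": statistics.median(values),
        "std_dev": std_dev,
        "variance": variance,
        "skewness": skewness,
        "kurtosis": kurtosis,
    }


# Made-up SLOC values for the classes of one package: one very large class
# pulls the mean far above the median and yields a positive skewness.
sample_sloc = [10, 20, 30, 40, 900]
for name, value in traditional_aggregates(sample_sloc).items():
    print(f"{name}: {value:.3f}")
```

On right-skewed data such as the SLOC distribution shown above, the mean and the median can differ widely, which is one reason to look at more than one aggregate.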



/   department of mathematics and computer science
     [Figure: two density plots side by side. Left: "Household income in Ilocos, the Philippines (1998)"; x-axis: Income (0 to 2500000), y-axis: density (0e+00 to 5e-06). Right: "Source lines of code - freecol-0.9.4"; x-axis: SLOC per class (0 to 3000), y-axis: density (0.000 to 0.004)]




     Inequality indices: Gini, Theil, Atkinson, Hoover, Kolm.
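
To make the inequality indices more concrete, here is a minimal Python sketch of three of them (Gini, Theil and Hoover) using their common textbook definitions for non-negative values. The function names and the sample data are assumptions made for this example; the slides compute these indices in R, and the exact variants used there are not stated. Atkinson and Kolm are left out because they take an additional parameter whose value is not given in the slides.

```python
import math


def gini(values):
    """Gini index: 0 for perfect equality, approaching 1 for maximal inequality."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def theil(values):
    """Theil index: 0 for perfect equality, at most ln(n)."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    mean = total / n
    # Zero values contribute nothing, by the convention 0 * ln(0) = 0.
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


def hoover(values):
    """Hoover index: share of the total that would need to be redistributed."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    mean = total / n
    return sum(abs(x - mean) for x in values) / (2.0 * total)


# Made-up SLOC values for the classes of one package.
sample_sloc = [10, 20, 30, 40, 900]
print("Gini:  ", round(gini(sample_sloc), 3))
print("Theil: ", round(theil(sample_sloc), 3))
print("Hoover:", round(hoover(sample_sloc), 3))
```

All three functions return 0 when every class has the same size and grow as the code concentrates in a few large classes.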

Correlation study



     Aggregate SLOC from class to package level.

     Study statistical correlation between pairs of aggregation techniques.

     Not enough to measure.
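
The two steps above (aggregating SLOC from class to package level, then correlating pairs of aggregation techniques) might look as follows in Python. This is only a sketch under assumptions: the data layout, the helper names `aggregate_by_package` and `kendall_tau_b`, and the sample numbers are all hypothetical, and the tie-corrected tau-b variant of the Kendall coefficient is chosen here because the slides do not say which variant was used.

```python
import math
import statistics
from collections import defaultdict
from itertools import combinations


def aggregate_by_package(class_sloc, aggregator):
    """Group per-class SLOC by package and reduce each group to one number.

    class_sloc maps (package, class_name) pairs to SLOC (assumed layout).
    """
    groups = defaultdict(list)
    for (package, _class_name), sloc in class_sloc.items():
        groups[package].append(sloc)
    return {package: aggregator(values) for package, values in groups.items()}


def kendall_tau_b(xs, ys):
    """Kendall rank correlation (tau-b) between two equally long sequences."""
    concordant = discordant = ties_x = ties_y = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        dx, dy = x1 - x2, y1 - y2
        if dx == 0 and dy == 0:
            continue  # tied in both variables: counted in neither term
        if dx == 0:
            ties_x += 1
        elif dy == 0:
            ties_y += 1
        elif dx * dy > 0:
            concordant += 1
        else:
            discordant += 1
    denominator = math.sqrt((concordant + discordant + ties_x)
                            * (concordant + discordant + ties_y))
    if denominator == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denominator


# Made-up per-class SLOC for a small system with four packages.
class_sloc = {
    ("core", "A"): 120, ("core", "B"): 80, ("core", "C"): 900,
    ("ui", "D"): 40, ("ui", "E"): 60,
    ("io", "F"): 300, ("io", "G"): 310, ("io", "H"): 20,
    ("util", "I"): 15, ("util", "J"): 25, ("util", "K"): 35,
}

mean_by_pkg = aggregate_by_package(class_sloc, statistics.fmean)
median_by_pkg = aggregate_by_package(class_sloc, statistics.median)

# One data point per package; correlate the two aggregation techniques.
packages = sorted(mean_by_pkg)
tau = kendall_tau_b([mean_by_pkg[p] for p in packages],
                    [median_by_pkg[p] for p in packages])
print("Kendall tau-b (mean vs. median):", tau)
```

Any other pair of aggregation functions, for instance two inequality indices, can be passed to `aggregate_by_package` in the same way. The pairwise loop is quadratic in the number of packages, which is acceptable for a sketch but not tuned for large systems.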




Available datasets


     Qualitas Corpus 20101126 r+e.
         r (recent): the most recent versions from 106 systems.
         e (evolution): all available versions from 13 systems (≥ 10 versions
         available), 414 versions in total.




Tooling


     Developed and available tooling to analyze the corpus:
         Extract metrics: SLOCCount, Understand (still not generic enough)
         Compute inequality indices, perform statistical analyses: R (highly scriptable)
         Put everything together: Python toolchain (easily extensible)

     [Figure: three plots of the Kendall correlation coefficient (y-axis from -1.0 to 1.0) for SLOC: Atkinson - skewness, Gini - Theil, mean - kurtosis]




Sample results - shape



     [Figure: three scatter plots for jfreechart. "Atkinson - skewness (SLOC)": skewness (SLOC) against Atkinson (SLOC). "Gini - Theil (SLOC)": Theil (SLOC) against Gini (SLOC). "mean - kurtosis (SLOC)": kurtosis (SLOC) against mean (SLOC)]




Sample results - evolution

     [Figure: two panels showing the correlation coefficient (axis from -1.0 to 1.0) per version, for 60 versions from 0.8.1 to 3.2.0.ga. Left: Atkinson(SLOC) - Kolm(SLOC). Right: Gini(SLOC) - Theil(SLOC)]
                                                                                                                                                                                                                                      hibernate − Kendall(Gini(SLOC), Theil(SLOC)) (86 releases)




                 3.2.1−ga                                                                                                                                             3.2.1−ga
                                                                                                hibernate − Kendall(Atkinson(SLOC), Kolm(SLOC)) (86 releases)




                 3.2.2−ga                                                                                                                                             3.2.2−ga
                 3.2.3−ga                                                                                                                                             3.2.3−ga
                 3.2.4−ga                                                                                                                                             3.2.4−ga
               3.2.4−sp1                                                                                                                                            3.2.4−sp1
                 3.2.5−ga                                                                                                                                             3.2.5−ga
                 3.2.6−ga                                                                                                                                             3.2.6−ga
                 3.2.7−ga                                                                                                                                             3.2.7−ga
                3.3.0−cr2                                                                                                                                            3.3.0−cr2
                 3.3.0−ga                                                                                                                                             3.3.0−ga
               3.3.0−sp1                                                                                                                                            3.3.0−sp1
                 3.3.0.cr1                                                                                                                                            3.3.0.cr1
                 3.3.1−ga                                                                                                                                             3.3.1−ga
                 3.3.2−ga                                                                                                                                             3.3.2−ga
           3.5.0−beta−1                                                                                                                                         3.5.0−beta−1
           3.5.0−beta−2                                                                                                                                         3.5.0−beta−2
           3.5.0−beta−3                                                                                                                                         3.5.0−beta−3
           3.5.0−beta−4                                                                                                                                         3.5.0−beta−4
              3.5.0−cr−1                                                                                                                                           3.5.0−cr−1
              3.5.0−cr−2                                                                                                                                           3.5.0−cr−2
               3.5.3−final                                                                                                                                          3.5.3−final
               3.5.5−final                                                                                                                                          3.5.5−final
             3.6.0−beta1                                                                                                                                          3.6.0−beta1
             3.6.0−beta2                                                                                                                                          3.6.0−beta2
             3.6.0−beta3                                                                                                                                          3.6.0−beta3
             3.6.0−beta4                                                                                                                                          3.6.0−beta4
                                                                                                                                                                                                                                                                                                   7/8

Sattose 2011

  • 1.
    Aggregation of software metrics Bogdan Vasilescu b.n.vasilescu@student.tue.nl Alexander Serebrenik a.serebrenik@tue.nl April 7, 2011
  • 2.
    Aggregation techniques for software metrics 2/8 Better understand aggregation techniques for software metrics. [Figure: density of SLOC per class, freecol−0.9.4.] Traditional: mean, sum, median, standard deviation, variance, skewness, kurtosis.
  • 3.
    Aggregation techniques for software metrics 2/8 Better understand aggregation techniques for software metrics. [Figures: density of household income in Ilocos, the Philippines (1998); density of SLOC per class, freecol−0.9.4.] Traditional: mean, sum, median, standard deviation, variance, skewness, kurtosis. Inequality indices: Gini, Theil, Atkinson, Hoover, Kolm.
  • 4.
    Correlation study 3/8 Aggregate SLOC from class to package level. Study statistical correlation between pairs of aggregation techniques. Not enough to measure.
  • 5.
    Available datasets 4/8 Qualitas Corpus 20101126 r+e. r (recent): the most recent versions from 106 systems. e (evolution): all available versions from 13 systems (≥ 10 versions available), 414 versions in total.
  • 6.
    Tooling 5/8 Developed and available tooling to analyze the corpus. Extract metrics: SLOCCount, Understand (still not generic enough). Compute inequality indices, perform statistical analyses: R (highly scriptable). Put everything together: Python toolchain (easily extendable). [Figures: Kendall correlation coefficient, on a scale from −1.0 to 1.0, for Atkinson − skewness (SLOC), Gini − Theil (SLOC), and mean − kurtosis (SLOC).]
  • 7.
    Sample results - shape 6/8 [Figures: jfreechart : Atkinson − skewness (SLOC), with Atkinson (SLOC) on the horizontal axis and skewness (SLOC) on the vertical axis; jfreechart : Gini − Theil (SLOC), with Gini (SLOC) horizontal and Theil (SLOC) vertical; jfreechart : mean − kurtosis (SLOC), with mean (SLOC) horizontal and kurtosis (SLOC) vertical.]
  • 8.
    Sample results - evolution 7/8 [Figures: hibernate − Kendall(Gini(SLOC), Theil(SLOC)) (86 releases) and hibernate − Kendall(Atkinson(SLOC), Kolm(SLOC)) (86 releases). Each panel shows the correlation coefficient, on a scale from −1.0 to 1.0, per release from 0.8.1 to 3.6.0−beta4.]
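
The correlation study summarized in the transcript above (aggregate SLOC from class to package level, then correlate pairs of aggregation techniques) can be illustrated with a minimal Python sketch. It is not the authors' toolchain, which the slides describe only as R plus a Python toolchain. The package names and SLOC values are made up, the function names `gini`, `theil`, and `kendall_tau_b` are assumptions chosen for illustration, and tau-b is one common variant of Kendall's correlation; the slides do not say which variant was used.

```python
import math
from itertools import combinations


def gini(values):
    """Gini index of non-negative values (0 means perfectly equal)."""
    xs = sorted(values)
    n, total = len(xs), sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted sum over the sorted values (ranks start at 1).
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2.0 * weighted / (n * total) - (n + 1.0) / n


def theil(values):
    """Theil index, using the convention 0 * ln(0) = 0."""
    n = len(values)
    mean = sum(values) / n if n else 0.0
    if mean == 0:
        return 0.0
    return sum((x / mean) * math.log(x / mean) for x in values if x > 0) / n


def kendall_tau_b(a, b):
    """Kendall's tau-b rank correlation between two equally long sequences."""
    s = nx = ny = 0
    for (a1, b1), (a2, b2) in combinations(zip(a, b), 2):
        dx = (a1 > a2) - (a1 < a2)  # sign of the difference in a
        dy = (b1 > b2) - (b1 < b2)  # sign of the difference in b
        s += dx * dy                # +1 concordant, -1 discordant, 0 tied
        nx += dx != 0               # pairs not tied in a
        ny += dy != 0               # pairs not tied in b
    if nx == 0 or ny == 0:
        raise ValueError("tau-b is undefined when one sequence is constant")
    return s / math.sqrt(nx * ny)


# Made-up SLOC-per-class values, grouped by hypothetical package name.
sloc_per_class = {
    "core": [120, 80, 300, 45],
    "ui": [50, 50, 50, 50],
    "util": [10, 500, 20],
    "net": [200, 220, 180, 10, 15],
}

# Aggregate class-level SLOC to one number per package, once per technique.
gini_per_package = [gini(v) for v in sloc_per_class.values()]
theil_per_package = [theil(v) for v in sloc_per_class.values()]

# Correlate the two aggregation techniques across packages.
tau = kendall_tau_b(gini_per_package, theil_per_package)
print(f"Kendall tau-b between Gini and Theil: {tau:.3f}")
```

A coefficient near 1.0 would indicate that the two techniques rank the packages almost identically, which is the kind of agreement the Gini − Theil panels examine per system and per release.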