Modeling History to Understand Software Evolution

Tudor Gîrba
www.tudorgirba.com




creativecommons.org/licenses/by/3.0/
Modeling History to Understand Software Evolution with Hismo 2008-03-12

Over the past three decades, more and more research has been devoted to understanding software evolution. However, the approaches developed so far rely on ad-hoc models or on overly specific meta-models, which makes it difficult to reuse or compare their results. We argue for the need for an explicit and generic meta-model that recognizes evolution as an explicit phenomenon and models it as a first-class entity. Our solution is to encapsulate evolution in the explicit notion of history as a sequence of versions, and to build a meta-model, called Hismo, around these notions. To show the usefulness of our meta-model, we exercise its different characteristics by building several reverse engineering applications.


  1. Modeling History to Understand Software Evolution. Tudor Gîrba, www.tudorgirba.com
  2. Modeling History to Understand Software Evolution. Inaugural dissertation of the Faculty of Science (Philosophisch-naturwissenschaftliche Fakultät) of the Universität Bern, submitted by Tudor Gîrba from Romania. Supervisors: Prof. Dr. Stéphane Ducasse and Prof. Dr. Oscar Nierstrasz, Institut für Informatik und angewandte Mathematik.
  3. [Diagram: code blocks] forward engineering
  4. [Diagram: code blocks] forward engineering, actual development
  5. [Diagram: code blocks] forward engineering, reverse engineering, actual development
  6. [Diagram: code blocks] reverse engineering, actual development, forward engineering
  7. Most often, time is put on the horizontal axis and a property on the vertical axis. (Lehman et al., 2001)
  8. Evolution Matrix shows how classes evolve. Time is still on the horizontal axis. (Lanza, Ducasse, 2002)
  9. Co-change analysis recovers hidden dependencies. Time is the lines. (Gall et al., 2003)
  10. Evolution information can be mapped onto structural information. (Eick et al., 2002)
  11. Eick et al., 2002; Lehman et al., 2001; Lanza, Ducasse, 2002; Gall et al., 2003; ...
  12. Eick et al., 2002; Lehman et al., 2001; Lanza, Ducasse, 2002; Gall et al., 2003; ... How can we accommodate all these techniques?
  13. Short intermezzo: What is a model?
  14. Short intermezzo: A model is a simplification of the subject, and its purpose is to answer some particular questions aimed towards the subject. (Bezivin, Gerbe, 2001)
  15. Short intermezzo: What is a meta-model?
  16. Short intermezzo: A meta-model is a model that makes statements about what can be expressed in valid models. (Seidewitz, 2003)
  17. Short intermezzo: A good meta-model allows for succinct expression of analyses.
  18. Eick et al., 2002; Lehman et al., 2001; Lanza, Ducasse, 2002; Gall et al., 2003; ...
  19. Eick et al., 2002; Lehman et al., 2001; Lanza, Ducasse, 2002; Gall et al., 2003; ... How can we accommodate all these techniques?
  20. Evolution Matrix shows class changes. (Lanza, Ducasse, 2004) Patterns: Idle class, Pulsar class, Supernova class, White dwarf class. Legend: each Class box encodes attributes and methods.
  21. Evolution Matrix shows class changes. (Lanza, Ducasse, 2004) Patterns: Idle class, Pulsar class, Supernova class, White dwarf class. Legend: each Class box encodes attributes and methods.
  22. Evolution Matrix shows class changes. (Lanza, Ducasse, 2004) Patterns: Idle class, Pulsar class, Supernova class, White dwarf class. Legend: each Class box encodes attributes and methods. Evolution needs to be modeled as a first class entity.
  23. Evolution Matrix shows class changes. (Lanza, Ducasse, 2004) Patterns: Idle class, Pulsar class, Supernova class, White dwarf class. Legend: each Class box encodes attributes and methods.
  24. Evolution Matrix shows class changes. Patterns: Idle class history, Pulsar class history, Supernova class history, White dwarf class history. ClassHistory: isPulsar, isIdle, ...
  25. [Meta-model diagram] System Version; Class Version
  26. [Meta-model diagram] System Version; Class History, Class Version
  27. [Meta-model diagram] System History, System Version; Class History, Class Version
  28. [Meta-model diagram] System History, System Version; Class History, Class Version
  29. [Meta-model diagram] System History, System Version; Class History, Class Version. Hismo models history as a first class entity. (Girba, 2005)
  30. Outline: 1 Measuring history; 2 Yesterday's Weather; 3 Time-based Detection Strategies; 4 Visualizing the evolution of hierarchies; 5 Detecting co-change patterns; 6 How developers drive evolution
  31. Part 1: Measuring history
  32. What changed? When did it change? ... Example histories (one value per version): 2 4 3 5 7; 2 2 3 4 9; 2 2 1 2 3; 2 2 2 2 2; 1 5 3 4 4
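  33. Evolution of Number of Methods: $ENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)|$. For the history 1 5 3 4 4: $ENOM(C) = 4 + 2 + 1 + 0 = 7$.
  34. Latest Evolution of Number of Methods: $LENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{i-n}$. Earliest Evolution of Number of Methods: $EENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{2-i}$. For the history 1 5 3 4 4: $LENOM(C) = 4 \cdot 2^{-3} + 2 \cdot 2^{-2} + 1 \cdot 2^{-1} + 0 \cdot 2^{0} = 1.5$ and $EENOM(C) = 4 \cdot 2^{0} + 2 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} = 5.25$.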
  35. History        ENOM   LENOM   EENOM
      2 4 3 5 7      7      3.5     3.25
      2 2 3 4 9      7      5.75    1.37
      2 2 1 2 3      3      1       2
      2 2 2 2 2      0      0       0
      1 5 3 4 4      7      1.25    5.25
  36. Pattern            ENOM   LENOM   EENOM
      balanced changer   7      3.5     3.25
      late changer       7      5.75    1.37
      (unlabeled)        3      1       2
      dead stable        0      0       0
      early changer      7      1.25    5.25
  37. Pattern            ENOM   LENOM   EENOM
      balanced changer   7      3.5     3.25
      late changer       7      5.75    1.37
      (unlabeled)        3      1       2
      dead stable        0      0       0
      early changer      7      1.25    5.25
      History as first class entity enables comparison through measurements.
  38. History can be measured in many ways: measurements (Evolution, Stability, Historical Max, Growth Trend, ...) of properties (Number of Methods, Number of Lines of Code, Cyclomatic Complexity, Number of Modules, ...).
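
The ENOM, LENOM and EENOM definitions from the slides above can be sketched in a few lines of Python. This is only an illustration: the function names and the representation of a history as a plain list of per-version method counts are assumptions made here, not part of Hismo itself.

```python
from typing import List, Sequence


def changes(nom: Sequence[int]) -> List[int]:
    """Absolute change in the number of methods between consecutive versions."""
    return [abs(curr - prev) for prev, curr in zip(nom, nom[1:])]


def enom(nom: Sequence[int]) -> int:
    """Evolution of Number of Methods: the sum of all changes."""
    return sum(changes(nom))


def lenom(nom: Sequence[int]) -> float:
    """Latest Evolution: the change into version i is weighted by 2^(i-n)."""
    n = len(nom)
    return sum(d * 2.0 ** (i - n) for i, d in enumerate(changes(nom), start=2))


def eenom(nom: Sequence[int]) -> float:
    """Earliest Evolution: the change into version i is weighted by 2^(2-i)."""
    return sum(d * 2.0 ** (2 - i) for i, d in enumerate(changes(nom), start=2))


# The worked example from slides 33 and 34.
history = [1, 5, 3, 4, 4]
print(enom(history), lenom(history), eenom(history))  # prints: 7 1.5 5.25
```

Because the weights are powers of two, recent changes dominate LENOM and early changes dominate EENOM, which is what separates a late changer from an early changer with the same ENOM.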
  39. Part 2: Yesterday's Weather
  40. The recently changed parts are likely to change in the near future. (Common wisdom)
  41. The recently changed parts are likely to change in the near future. (Common wisdom) Are they really?
  42. 30% 90%
  43. [Timeline] present
  44. [Timeline] past, present
  45. [Timeline] past, present, future
  46. [Timeline] past, present, future
  47. [Timeline] past, present, future
  48. [Timeline] past, present, future; prediction hit
  49. [Timeline] past, present, future; prediction hit
      YesterdayWeatherHit(present):
          past := histories.topLENOM(start, present)
          future := histories.topEENOM(present, end)
          past.intersectWith(future).notEmpty()
  50. Overall Yesterday's Weather shows the localization of changes in time. (Girba et al., 2004) Examples: 3 hits in 8 versions give YW = 3 / 8 = 37%; 7 hits in 8 versions give YW = 7 / 8 = 87%.
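
The YesterdayWeatherHit pseudocode and the overall YW ratio can be sketched as follows. The sketch assumes that each history is a list of per-version method counts, that "top" means the `k` highest-scoring histories with a non-zero score, and that ties are broken by name. The parameter `k`, the tie-breaking rule and all function names are hypothetical choices for illustration, not taken from the slides.

```python
from typing import Callable, Dict, List, Sequence, Set


def _changes(nom: Sequence[int]) -> List[int]:
    return [abs(b - a) for a, b in zip(nom, nom[1:])]


def lenom(nom: Sequence[int]) -> float:
    n = len(nom)
    return sum(d * 2.0 ** (i - n) for i, d in enumerate(_changes(nom), start=2))


def eenom(nom: Sequence[int]) -> float:
    return sum(d * 2.0 ** (2 - i) for i, d in enumerate(_changes(nom), start=2))


def top(histories: Dict[str, List[int]],
        metric: Callable[[Sequence[int]], float],
        lo: int, hi: int, k: int) -> Set[str]:
    """Names of the k histories scoring highest on versions lo..hi (0-based, inclusive)."""
    scores = {name: metric(nom[lo:hi + 1]) for name, nom in histories.items()}
    changed = [name for name in scores if scores[name] > 0]
    changed.sort(key=lambda name: (-scores[name], name))  # assumed tie-break: by name
    return set(changed[:k])


def yesterday_weather_hit(histories: Dict[str, List[int]], present: int, k: int = 1) -> bool:
    """True if a recent top changer is also an early changer right after `present`."""
    last = len(next(iter(histories.values()))) - 1
    past = top(histories, lenom, 0, present, k)
    future = top(histories, eenom, present, last, k)
    return bool(past & future)


def yesterday_weather(histories: Dict[str, List[int]], k: int = 1) -> float:
    """Fraction of versions (with both a past and a future) that are prediction hits."""
    n = len(next(iter(histories.values())))
    if n < 3:
        raise ValueError("need at least three versions")
    presents = range(1, n - 1)
    hits = sum(yesterday_weather_hit(histories, p, k) for p in presents)
    return hits / len(presents)


# The five example histories from slide 32, with hypothetical names.
HISTORIES = {
    "A": [2, 4, 3, 5, 7],
    "B": [2, 2, 3, 4, 9],
    "C": [2, 2, 1, 2, 3],
    "D": [2, 2, 2, 2, 2],
    "E": [1, 5, 3, 4, 4],
}
print(yesterday_weather(HISTORIES, k=1), yesterday_weather(HISTORIES, k=2))
```

A high ratio suggests that changes are localized in time, so starting reverse engineering from the most recently changed parts is a reasonable bet; a low ratio suggests the opposite.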
  51. Part 3: Time-based Detection Strategies
  52. Detection Strategies are metric-based queries to detect design flaws. (Lanza, Marinescu, 2006) Rule 1: METRIC 1 > Threshold 1. Rule 2: METRIC 2 < Threshold 2. Rule 1 AND Rule 2 indicate a quality problem.
  53. Example: a God Class centralizes too much intelligence in the system. GodClass = (ATFD > FEW) AND (WMC ≥ VERY HIGH) AND (TCC < ONE THIRD). ATFD > FEW: the class directly uses more than a few attributes of other classes. WMC ≥ VERY HIGH: the functional complexity of the class is very high. TCC < ONE THIRD: class cohesion is low.
  54. Example: a God Class centralizes too much intelligence in the system. GodClass = (ATFD > FEW) AND (WMC ≥ VERY HIGH) AND (TCC < ONE THIRD). But, what if it is stable?
  55. History-based Detection Strategies take evolution into account. (Ratiu et al., 2004) Harmless God Class = isGodClass(last) AND (Stability > 90%). isGodClass(last): God Class in the last version. Stability > 90%: stable throughout the history.
  56. History-based Detection Strategies take evolution into account. (Ratiu et al., 2004) Harmless God Class = isGodClass(last) AND (Stability > 90%). Time and space are treated the same.
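
A minimal sketch of the two strategies above, assuming a class history is a list of per-version metric records. The numeric values chosen for FEW and VERY HIGH, the `ClassVersion` record, and the definition of Stability as the fraction of version transitions in which the number of methods did not change are all assumptions made for illustration; the slides only give the symbolic thresholds.

```python
from dataclasses import dataclass
from typing import List

FEW = 5            # assumed value for the FEW threshold
VERY_HIGH = 47     # assumed value for the VERY HIGH threshold
ONE_THIRD = 1 / 3


@dataclass(frozen=True)
class ClassVersion:
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion
    nom: int     # Number Of Methods, used here to decide whether a version changed


def is_god_class(version: ClassVersion) -> bool:
    """Structural strategy: ATFD > FEW AND WMC >= VERY HIGH AND TCC < ONE THIRD."""
    return version.atfd > FEW and version.wmc >= VERY_HIGH and version.tcc < ONE_THIRD


def stability(history: List[ClassVersion]) -> float:
    """Assumed definition: share of transitions in which NOM stayed the same."""
    transitions = list(zip(history, history[1:]))
    if not transitions:
        return 1.0
    unchanged = sum(1 for prev, curr in transitions if prev.nom == curr.nom)
    return unchanged / len(transitions)


def is_harmless_god_class(history: List[ClassVersion]) -> bool:
    """History-based strategy: God Class in the last version AND Stability > 90%."""
    return is_god_class(history[-1]) and stability(history) > 0.9


stable = [ClassVersion(atfd=8, wmc=60, tcc=0.1, nom=40)] * 12
volatile = [ClassVersion(atfd=8, wmc=60, tcc=0.1, nom=40 + i) for i in range(12)]
print(is_harmless_god_class(stable), is_harmless_god_class(volatile))  # prints: True False
```

The point of the history-based variant is that the same structural query is applied to a version, while the stability condition is a query over the whole history, so both are expressed at the same level.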
  57. Part 4: Visualizing the evolution of hierarchies
  58. What happens with inheritance? [Figure: five versions (ver. 1 to ver. 5) of an inheritance hierarchy rooted in A, with subclasses B, C, D and E appearing and disappearing]
  59. History contains too much data. [Figure: the five-version hierarchy diagram repeated many times]
  60. [Meta-model diagram] System History, System Version; Class History, Class Version
  61. [Meta-model diagram] System History, System Version; Inheritance Version; Class History, Class Version
  62. [Meta-model diagram] System History, System Version; Inheritance History, Inheritance Version; Class History, Class Version
  63. [Figure: hierarchy versions ver. 1 to ver. 5] A is persistent, B is stable, C was removed, E is newborn ...
  64. Hierarchy Evolution View encapsulates time. (Girba et al., 2005) [Figure: one hierarchy of A, B, C, D, E; legend: changed methods, changed lines, age, removed] A is persistent, B is stable, C was removed, E is newborn ...
  65. Hierarchy Evolution View reveals patterns. (Girba et al., 2005)
  66. Hierarchy Evolution View reveals patterns. (Girba et al., 2005) History as first class entity enables mapping to a graph.
  67. Part 5: Identifying co-change patterns
  68. [Figure: entities A, B, C, D, E over versions 1 to 6, and the co-change relationships between them] (Gall et al., 1998)
  69. Co-change patterns are n-ary relationships.
  70. [Matrix: A, B, C, D, E (rows) by versions 1 to 6 (columns)] Version
  71. [Matrix: A, B, C, D, E (rows) by versions 1 to 6 (columns)] Version: changed
  72. [Matrix: A, B, C, D, E (rows) by versions 1 to 6 (columns)] History: changed(i). Version: changed
  73. What is Concept Analysis?
  74. [Matrix: A, B, C, D, E (rows) by versions 1 to 6 (columns)]
  75. [Matrix: A, B, C, D, E by versions 1 to 6] FCA yields the concept lattice: ({A, B, C, D, E}, Ø); ({A, D, B}, {2}); ({A, E, C, D}, {6}); ({D, B}, {2, 4}); ({A, D}, {2, 6}); ({A, E, C}, {5, 6}); ({D}, {2, 4, 6}); ({A}, {2, 5, 6}); ({C}, {3, 5, 6}); (Ø, {1, 2, 3, 4, 5, 6})
  76. [Matrix: A, B, C, D, E by versions 1 to 6] FCA yields the concept lattice: ({A, B, C, D, E}, Ø); ({A, D, B}, {2}); ({A, E, C, D}, {6}); ({D, B}, {2, 4}); ({A, D}, {2, 6}); ({A, E, C}, {5, 6}); ({D}, {2, 4, 6}); ({A}, {2, 5, 6}); ({C}, {3, 5, 6}); (Ø, {1, 2, 3, 4, 5, 6}) (Girba et al., 2007)
  77. Parallel Inheritance: children are added simultaneously to several classes. Shotgun Surgery: several classes change simultaneously, but no methods are added.
  78. [Matrix: A, B, C, D, E by versions 1 to 6] FCA yields the concept lattice: ({A, B, C, D, E}, Ø); ({A, D, B}, {2}); ({A, E, C, D}, {6}); ({D, B}, {2, 4}); ({A, D}, {2, 6}); ({A, E, C}, {5, 6}); ({D}, {2, 4, 6}); ({A}, {2, 5, 6}); ({C}, {3, 5, 6}); (Ø, {1, 2, 3, 4, 5, 6})
  79. [Matrix: A, B, C, D, E by versions 1 to 6] FCA yields the concept lattice: ({A, B, C, D, E}, Ø); ({A, D, B}, {2}); ({A, E, C, D}, {6}); ({D, B}, {2, 4}); ({A, D}, {2, 6}); ({A, E, C}, {5, 6}); ({D}, {2, 4, 6}); ({A}, {2, 5, 6}); ({C}, {3, 5, 6}); (Ø, {1, 2, 3, 4, 5, 6}) History as first class entity enables mapping to FCA.
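
The mapping from histories to Formal Concept Analysis can be illustrated with a brute-force sketch. The `CHANGED` table below is an assumption: it is the change matrix implied by the lattice on the slides (for example, D alone pairs with versions {2, 4, 6}), not data read off the original figure. Histories are the FCA objects, and "changed in version i" is the attribute. Enumerating every subset of histories is exponential and only suitable for such a toy context.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Set, Tuple

# Assumed context: the versions in which each history changed.
CHANGED: Dict[str, Set[int]] = {
    "A": {2, 5, 6},
    "B": {2, 4},
    "C": {3, 5, 6},
    "D": {2, 4, 6},
    "E": {5, 6},
}
VERSIONS: FrozenSet[int] = frozenset(range(1, 7))

Concept = Tuple[FrozenSet[str], FrozenSet[int]]


def common_versions(histories) -> FrozenSet[int]:
    """Versions in which all the given histories changed (all versions for none)."""
    result = set(VERSIONS)
    for name in histories:
        result &= CHANGED[name]
    return frozenset(result)


def histories_changed_in(versions: FrozenSet[int]) -> FrozenSet[str]:
    """Histories that changed in every one of the given versions."""
    return frozenset(name for name, vs in CHANGED.items() if versions <= vs)


def concepts() -> Set[Concept]:
    """All (histories, versions) pairs that are closed in both directions."""
    found: Set[Concept] = set()
    names = sorted(CHANGED)
    for size in range(len(names) + 1):
        for subset in combinations(names, size):
            intent = common_versions(subset)
            extent = histories_changed_in(intent)
            found.add((extent, intent))
    return found


for extent, intent in sorted(concepts(), key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(extent), sorted(intent))
```

Each concept is a candidate co-change pattern: a maximal group of histories that changed together in a maximal set of versions, which is why the relationship is n-ary rather than pairwise.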
  80. Part 6: How developers drive software evolution
  81. CVS shows activity.
  82. Who is responsible for this?
  83. Who is responsible for this?
  84. Alphabetical order is no order.
  85. The Hausdorff metric can be used to compute the similarity between commits: $d(A, B) = \sum_{a \in A} \min^2 \{ |a - b| : b \in B \}$
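
The distance on slide 85 can be sketched directly from the formula. The sketch assumes that A and B are non-empty collections of commit timestamps (plain numbers here) and reads min² as the square of the minimum; the function name is hypothetical.

```python
from typing import Iterable


def commit_distance(a: Iterable[float], b: Iterable[float]) -> float:
    """d(A, B): for each a in A, the squared gap to the nearest b in B, summed."""
    a_list, b_list = list(a), list(b)
    if not a_list or not b_list:
        raise ValueError("both commit sets must be non-empty")
    return sum(min(abs(x - y) for y in b_list) ** 2 for x in a_list)


print(commit_distance([1, 5], [2, 9]))  # prints: 10
print(commit_distance([2, 9], [1, 5]))  # prints: 17
```

As written, the measure is not symmetric, so a clustering that uses it to order files by similar commit behavior has to settle on one direction or combine both.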
  86. Alphabetical order is no order.
  87. Ownership Map reveals development patterns. (Girba et al., 2006)
  88. Ownership Map reveals development patterns. (Girba et al., 2006) History as first class entity enables reasoning about holistic changes.
  89. Outline: 1 Measuring history; 2 Yesterday's Weather; 3 Time-based Detection Strategies; 4 Visualizing the evolution of hierarchies; 5 Detecting co-change patterns; 6 How developers drive evolution
  90. [Meta-model diagram] System History, System Version; Inheritance History, Inheritance Version; Class History, Class Version
  91. [Meta-model diagram] three History / Version pairs
  92. [Meta-model diagram] three History / Version pairs. Hismo models history as first class entity. (Girba, 2005)
  93. Modeling History to Understand Software Evolution. Inaugural dissertation of the Faculty of Science (Philosophisch-naturwissenschaftliche Fakultät) of the Universität Bern, submitted by Tudor Gîrba from Romania. Supervisors: Prof. Dr. Stéphane Ducasse and Prof. Dr. Oscar Nierstrasz, Institut für Informatik und angewandte Mathematik. www.tudorgirba.com
  94. Tudor Gîrba, www.tudorgirba.com, creativecommons.org/licenses/by/3.0/
