Girba PhD Presentation 2005-11-14
1. Modeling History
to Understand Software Evolution
PhD Defense
Tudor Gîrba
Supervisors: Stéphane Ducasse, Oscar Nierstrasz
2. Context: Reverse engineering is creating
high level views of the system
[Figure: Requirements, Analysis, Design and Implementation laid out over Time, linked by Forward Engineering and Reverse Engineering arrows]
© Tudor Gîrba 2 /47
4. Context: History holds useful information for
reverse engineering
The doctor always looks at my health file
Historical information is useful, but it is hidden among
huge amounts of data
N versions mean N times more data
Version 1, Version 2, Version 3, …, Version n
The more data there is, the more techniques are needed to analyze it
5. Context: Many techniques were developed
Trend analysis [Lehman et al. ‘01]
Evolution patterns [Lanza, Ducasse ‘02]
Co-change analysis [Gall et al. ‘03]
Authors analysis [Eick et al. ‘02]
…
6. Problem: Current approaches rely on ad-hoc
models or on too specific meta-models
Research question:
How can we build a generic meta-model?
8. Overview
Hismo (History, Version)
Applications:
Historical measurements
Yesterday’s Weather
History-based Detection Strategies
Hierarchy evolution
Co-change patterns
Ownership Map
9. Overview
Hismo: Modeling History
10. Example: Evolution Matrix reveals different
evolution patterns [Lanza, Ducasse ‘02]
[Figure: Evolution Matrix, a polymetric view of classes across versions using the NOM and NOA metrics, showing a Pulsar Class, an Idle Class, a White Dwarf Class and a Supernova Class]
Thesis: Evolution needs to be modeled
as a first class entity
12. Solution: History encapsulates and
characterizes the evolution
[Figure: the rows of the Evolution Matrix over versions, labeled Pulsar Class History, Idle Class History, White Dwarf Class History and Supernova Class History]
ClassHistory: isPulsar, isIdle, …
13. Hismo: The history meta-model
[Figure: System History paired with System Version; Class History paired with Class Version]
16. … but, what about relationships?
[Figure: Inheritance History paired with Inheritance Version, next to the System and Class pairs]
18. Hismo is obtained by transforming the
structural meta-model
[Figure: each structural entity is paired with a History and a Version]
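The pairing of histories and versions can be illustrated with a small Python sketch. It is not Hismo's actual implementation. It assumes that a history holds an ordered list of versions of one entity, and the `Version`, `History` and `ClassSnapshot` classes, their fields, and the `is_idle` rule (the number of methods never changes) are hypothetical names and choices made for this example. The slides themselves only name `ClassHistory`, `isPulsar` and `isIdle`.

```python
from dataclasses import dataclass, field
from typing import Generic, List, TypeVar

S = TypeVar("S")  # type of the structural snapshot (class, system, inheritance, ...)


@dataclass
class Version(Generic[S]):
    """One snapshot of a structural entity, tagged with its version number."""
    number: int
    snapshot: S


@dataclass
class History(Generic[S]):
    """An ordered collection of versions of the same entity (assumed structure)."""
    name: str
    versions: List[Version[S]] = field(default_factory=list)

    def add(self, number: int, snapshot: S) -> None:
        self.versions.append(Version(number, snapshot))
        self.versions.sort(key=lambda v: v.number)


@dataclass
class ClassSnapshot:
    """Hypothetical structural data for a class in one version."""
    number_of_methods: int


class ClassHistory(History[ClassSnapshot]):
    def is_idle(self) -> bool:
        # Assumed rule for this sketch: the number of methods never changes.
        counts = {v.snapshot.number_of_methods for v in self.versions}
        return len(counts) <= 1
```

The point of the sketch is that a question such as "is this class idle?" becomes a method on the history object instead of a query spread over separate version models.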
20. Overview
Application: History measurements
21. Problem: History holds useful information
hidden among large amounts of data
Number of methods (NOM) of five classes (rows) in five versions (columns):
2 4 3 5 7
2 2 3 4 9
2 2 1 2 3
2 2 2 2 2
1 5 3 4 4
How much was a class changed?
When was a class changed?
…
22. History can be measured:
How much was a class changed?
Evolution of Number of Methods:
ENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)|
For the versions 1 5 3 4 4:
ENOM(C) = 4 + 2 + 1 + 0 = 7
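As a minimal sketch, ENOM can be computed directly from the list of NOM values per version. The function name `enom` and the list-based input are assumptions of this example, not part of the original tooling.

```python
from typing import Sequence


def enom(nom: Sequence[int]) -> int:
    """Sum of absolute changes in number of methods between consecutive versions."""
    return sum(abs(curr - prev) for prev, curr in zip(nom, nom[1:]))


print(enom([1, 5, 3, 4, 4]))  # 4 + 2 + 1 + 0 = 7
```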
23. History can be measured:
When was a class changed?
Latest Evolution of Number of Methods:
LENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{i-n}
Earliest Evolution of Number of Methods:
EENOM(C) = \sum_{i=2}^{n} |NOM_i(C) - NOM_{i-1}(C)| \cdot 2^{2-i}
For the versions 1 5 3 4 4:
LENOM(C) = 4 \cdot 2^{-3} + 2 \cdot 2^{-2} + 1 \cdot 2^{-1} + 0 \cdot 2^{0} = 1.5
EENOM(C) = 4 \cdot 2^{0} + 2 \cdot 2^{-1} + 1 \cdot 2^{-2} + 0 \cdot 2^{-3} = 5.25
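The two weighted sums can be sketched in Python as follows. The function names and the list-based input are assumptions of this example. The index `i` is kept 1-based to mirror the formulas, so `nom[i - 1]` stands for NOM_i. Evaluated term by term on the series 1 5 3 4 4, the sums come to 1.5 and 5.25.

```python
from typing import Sequence


def lenom(nom: Sequence[int]) -> float:
    """Weight the change leading to version i by 2^(i - n): recent changes count most."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (i - n) for i in range(2, n + 1))


def eenom(nom: Sequence[int]) -> float:
    """Weight the change leading to version i by 2^(2 - i): early changes count most."""
    n = len(nom)
    return sum(abs(nom[i - 1] - nom[i - 2]) * 2.0 ** (2 - i) for i in range(2, n + 1))


print(lenom([1, 5, 3, 4, 4]))  # 0.5 + 0.5 + 0.5 + 0 = 1.5
print(eenom([1, 5, 3, 4, 4]))  # 4 + 1 + 0.25 + 0 = 5.25
```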
24. History measurements compress
aspects of the evolution into numbers
Class  NOM per version  ENOM  LENOM  EENOM  Characterization
A      2 4 3 5 7        7     3.5    3.25   Balanced changer
B      2 2 3 4 9        7     5.75   1.375  Late changer
C      2 2 1 2 3        3     1.75   0.875
D      2 2 2 2 2        0     0      0      Dead stable
E      1 5 3 4 4        7     1.5    5.25   Early changer
26. Many measurements can be defined at
different levels of abstraction …
Any history measurement: Evolution, Latest/Earliest Evolution, Stability, Historical Max/Min, Historical Average, Growth Trend, …
of any structural measurement: Number of Methods, Number of Statements, Cyclomatic Complexity, Lines of Code, Number of Classes, Number of Modules, …
… but measurements are a means, not a goal
28. Overview
Application: Yesterday’s Weather
29. Common Wisdom: The recently changed
parts are likely to change in the near future
[Mens, Demeyer ‘01]
Is the common wisdom relevant?
Yesterday’s Weather metaphor:
It expresses the chance of having the same weather today as we
had yesterday
It is location specific: Switzerland 30%, Sahara 90%
30. Yesterday’s Weather: For each given version
we check the common wisdom
YesterdayWeatherHit(present):
  past := histories.topLENOM(start, present)
  future := histories.topEENOM(present, end)
  past.intersectWith(future).notEmpty()
[Figure: timeline split into past versions, the present version and future versions; the Past Late Changers are compared with the Future Early Changers, and a non-empty intersection is a hit]
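The pseudocode above can be sketched in Python under a few stated assumptions. Histories are given as lists of NOM values per version, `present` is a 0-based index into those lists, "top" means the `k` histories with the highest non-zero score, and the helper names `_weighted_change` and `_top` are hypothetical. The slides do not say how many top changers are selected, so `k` is a free parameter of this sketch.

```python
from typing import Dict, List, Sequence, Set


def _weighted_change(nom: Sequence[int], latest: bool) -> float:
    """LENOM if latest is True, EENOM otherwise, over the given slice of versions."""
    n = len(nom)
    total = 0.0
    for i in range(2, n + 1):  # i is the 1-based version index
        weight = 2.0 ** (i - n) if latest else 2.0 ** (2 - i)
        total += abs(nom[i - 1] - nom[i - 2]) * weight
    return total


def _top(scores: Dict[str, float], k: int) -> Set[str]:
    # Keep the k highest-scoring histories; histories that never changed are ignored.
    ranked = sorted((name for name, s in scores.items() if s > 0),
                    key=lambda name: scores[name], reverse=True)
    return set(ranked[:k])


def yesterday_weather_hit(histories: Dict[str, List[int]], present: int, k: int = 2) -> bool:
    """True if a top late changer of the past is also a top early changer of the future."""
    past = _top({n: _weighted_change(v[:present + 1], latest=True)
                 for n, v in histories.items()}, k)
    future = _top({n: _weighted_change(v[present:], latest=False)
                   for n, v in histories.items()}, k)
    return len(past & future) > 0
```

With the five example series from the measurement slides and the third version as the present, the past top two are E and A and the future top two are B and A, so the check reports a hit.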
34. Overall Yesterday’s Weather shows the
localization of changes in time
YW = hits / possible hits
Example: 3 hits out of 8 possible hits give YW ≈ 37%
Example: 7 hits out of 8 possible hits give YW ≈ 87%
Case studies:
40 versions of CodeCrawler (180 classes): 100%
40 versions of Jun (700 classes): 79%
40 versions of JBoss (4000 classes): 53%
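The overall value is simply the share of checked versions that produced a hit. A minimal sketch, assuming the per-version hit results are already available as booleans (the function name is hypothetical):

```python
from typing import Sequence


def yesterday_weather(hits: Sequence[bool]) -> float:
    """Fraction of checked versions for which the hit test succeeded."""
    if not hits:
        raise ValueError("at least one checked version is required")
    return sum(hits) / len(hits)


print(yesterday_weather([True] * 3 + [False] * 5))  # 0.375
print(yesterday_weather([True] * 7 + [False] * 1))  # 0.875
```

The two calls reproduce the 3-out-of-8 and 7-out-of-8 examples, which the slide reports as 37% and 87%.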
37. Overview
Application: History-based Detection Strategies
38. Context: Detection Strategies detect design
flaws based on measurements [Marinescu ‘04]
Example: God Class
Maintainability problem because it encapsulates a lot of
knowledge
God Class := (ATFD > 40) AND ((WMC > 75) OR ((TCC < 0.2) AND (NOA > 20)))
39. History-based Detection Strategies
take evolution into account
Example: a Stable God Class is not necessarily a bad one
Stable God Class := (the last version of the history is a God Class) AND (the history's Stability > 95%)
Case study: 5 out of 24 God Classes in Jun were stable
and harmless
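Both rules can be written as plain predicates. The sketch below makes three assumptions. The AND/OR grouping follows one reading of the slide's flattened diagram: ATFD combined with either a high WMC or a low-TCC, high-NOA pair. Stability is taken to be the share of version transitions in which the number of methods did not change, since the slides do not define it. The `ClassMetrics` type is hypothetical.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ClassMetrics:
    """Measurements of one class version; field names follow the slide's abbreviations."""
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion
    noa: int     # Number Of Attributes
    nom: int     # number of methods, used only by the assumed stability measure


def is_god_class(m: ClassMetrics) -> bool:
    return m.atfd > 40 and (m.wmc > 75 or (m.tcc < 0.2 and m.noa > 20))


def stability(history: Sequence[ClassMetrics]) -> float:
    """Assumed definition: share of version transitions with an unchanged NOM."""
    if len(history) < 2:
        return 1.0
    unchanged = sum(1 for a, b in zip(history, history[1:]) if a.nom == b.nom)
    return unchanged / (len(history) - 1)


def is_stable_god_class(history: Sequence[ClassMetrics]) -> bool:
    # The last version must be a God Class and the history must barely change.
    return bool(history) and is_god_class(history[-1]) and stability(history) > 0.95
```

Keeping the structural rule (`is_god_class`) separate from the historical condition lets the same history wrapper be reused for other design flaws.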
42. Overview
Application: Characterizing the evolution of class hierarchies
43. Context: Given the evolution of a hierarchy …
[Figure: five successive versions of a class hierarchy rooted in A, with subclasses B, C, D and E appearing and disappearing over time]
A is persistent
B is stable
C was removed
E is newborn
D inherited from C and then from A
…
44. … but useful information is hidden among large
amounts of data
How did the hierarchies evolve?
45. Hierarchy Evolution Complexity View
characterizes class hierarchy histories
[Figure: Hierarchy Evolution Complexity View of the hierarchy with classes A, B, C, D and E. Legend: a Class History node is drawn using ENOM, ENOS and Age; an Inheritance History edge is drawn using Age; removed class and inheritance histories are marked as Removed]
46. Case study: Class hierarchies in Jun reveal
evolution patterns
[Figure: Hierarchy Evolution Complexity Views of Jun hierarchies, annotated with the following characterizations]
Young, unstable root, reliable inheritance
Persistent, unbalanced, stable, reliable inheritance
Newborn
Old, stable, balanced, reliable inheritance
Old, unstable, unbalanced, unreliable inheritance
48. Overview
Application: Detecting co-change patterns
49. Context: Repeated co-changes reveal
hidden dependencies [Gall et al. ‘98]
[Figure: changes of entities A to E across versions v1 to v6]
Can we identify co-change patterns like
Parallel Inheritance, Shotgun Surgery, …?
50. Formal Concept Analysis (FCA) finds
elements that have properties in common
[Ganter, Wille ‘99]
[Figure: incidence table of elements A to E against properties P1 to P6, and the concept lattice that FCA derives from it]
Concepts in the lattice, as (elements) (shared properties):
(A,B,C,D,E) ()
(A,D,E) (P2)
(A,B,C,D) (P6)
(D,E) (P2,P4)
(A,D) (P2,P6)
(A,B,C) (P5,P6)
(D) (P2,P4,P6)
(A) (P2,P5,P6)
(C) (P3,P5,P6)
() (P1,P2,P3,P4,P5,P6)
To use FCA, we need to map our interests
onto elements and properties
52. We use FCA to identify entities that
co-changed repeatedly
Elements = Histories
Properties = “changed in version X”
[Figure: change table of histories A to E against versions v1 to v6, and the concept lattice that FCA derives from it]
Concepts in the lattice, as (histories) (versions in which they all changed):
(A,B,C,D,E) ()
(A,D,E) (v2)
(A,B,C,D) (v6)
(D,E) (v2,v4)
(A,D) (v2,v6)
(A,B,C) (v5,v6)
(D) (v2,v4,v6)
(A) (v2,v5,v6)
(C) (v3,v5,v6)
() (v1,v2,v3,v4,v5,v6)
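The concepts of this example can be recomputed with a brute-force FCA sketch. The change table used below is an assumption: it is one context consistent with the concepts listed for this example (A changed in v2, v5, v6; B in v5, v6; C in v3, v5, v6; D in v2, v4, v6; E in v2, v4). The exhaustive subset enumeration is chosen for brevity and is not the algorithm used by the original tools; it is only practical for a handful of elements.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Set, Tuple

# Context inferred from the listed concepts: history -> versions in which it changed.
CHANGED_IN: Dict[str, Set[str]] = {
    "A": {"v2", "v5", "v6"},
    "B": {"v5", "v6"},
    "C": {"v3", "v5", "v6"},
    "D": {"v2", "v4", "v6"},
    "E": {"v2", "v4"},
}
ALL_VERSIONS = {"v1", "v2", "v3", "v4", "v5", "v6"}


def concepts(context: Dict[str, Set[str]], properties: Set[str]
             ) -> List[Tuple[FrozenSet[str], FrozenSet[str]]]:
    """Brute-force FCA: close every subset of elements and keep the distinct concepts."""
    found = set()
    names = sorted(context)
    for size in range(len(names) + 1):
        for group in combinations(names, size):
            # Intent: properties shared by every element of the group.
            intent = set(properties)
            for name in group:
                intent &= context[name]
            # Extent: every element that has all properties of the intent.
            extent = {n for n in names if intent <= context[n]}
            found.add((frozenset(extent), frozenset(intent)))
    # Largest extents first, so the output starts at the top of the lattice.
    return sorted(found, key=lambda c: (-len(c[0]), sorted(c[0])))


for extent, intent in concepts(CHANGED_IN, ALL_VERSIONS):
    print(sorted(extent), sorted(intent))
```

Run on this context, the sketch yields ten concepts, from all five histories with no shared version down to the empty group with all six versions.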
53. Example: Parallel inheritance denotes
children added in several hierarchies
Elements = ClassHistories
Properties = “changed number of
children in version X”
Number of children of class A per version:
v1 v2 v3 v4 v5 v6
0  1  1  1  2  4
Case study: JBoss
ServiceMBeanSupport, JBossTestCase: 14 versions
EJBLocalHome, EJBLocalObject: 9 versions
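Deriving the property “changed number of children in version X” from a series of child counts is a one-line comparison of neighbouring versions. The sketch below assumes the counts are given as a list ordered from v1 onward; the function name is hypothetical.

```python
from typing import List, Sequence


def changed_children_versions(children: Sequence[int]) -> List[str]:
    """Versions (1-based, as 'vN') whose child count differs from the previous version."""
    return [f"v{i + 1}" for i in range(1, len(children))
            if children[i] != children[i - 1]]


print(changed_children_versions([0, 1, 1, 1, 2, 4]))  # ['v2', 'v5', 'v6']
```

Class histories that share several of these versions are the candidates for parallel inheritance once the properties are fed into FCA.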
56. Overview
Application: Ownership map
57. Context: The code history might tell you what
happened, but not why it happened
Case study: Outsight [Rysselberghe, Demeyer ‘04]
[Figure: change history plot with files on one axis and time on the other]
Who is responsible for this?
59. We color the lines to show which author
owned which files in which period
[Figure: File History A and File History B drawn as lines over time, with a large commit by the green author, a small commit by the blue author, and a line segment colored to show green author ownership]
62. We cluster the file histories to favor
colored blocks inside each module
We use the Hausdorff distance between the commit
timestamps
d(A, B) = \sum_{a \in A} \min^2 \{ |a - b| : b \in B \}
[Figure: commit timestamps of file histories A and B]
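The distance formula can be sketched as follows, reading min² as the square of the minimum gap (an assumption about the notation). The function name and the plain numeric timestamps are also assumptions. Note that this sum-of-squares form follows the slide rather than the classical Hausdorff distance, which takes a maximum of minimum gaps, and that it is not symmetric in A and B.

```python
from typing import Sequence


def commit_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Sum over a's timestamps of the squared gap to the closest timestamp in b."""
    if not b:
        raise ValueError("b must contain at least one timestamp")
    return sum(min(abs(x - y) for y in b) ** 2 for x in a)


print(commit_distance([1, 5, 9], [1, 6]))  # 0 + 1 + 9 = 10
print(commit_distance([1, 6], [1, 5, 9]))  # 0 + 1 = 1
```

Files whose commits fall close together in time get a small distance, so clustering on it tends to place files touched in the same periods next to each other.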
63. Ownership Map on alphabetically ordered
files is not very useful, but …
65. The ordered Ownership Map reveals
developer patterns
Edit, Takeover, Monologue, Familiarization, Dialogue
67. Overview
Implementation: Both Hismo and its applications
are implemented in one single infrastructure
68. Implementation: All tools are integrated into Moose
Tools: Van, CodeCrawler, ConAn, Chronia
Moose: Model repository, Extensible meta-model, Integration mechanism
69. Conclusion: Hismo offers a uniform way of
expressing evolution analyses
Questions?