Girba Phd Presentation 2005-11-14

  • 1,542 views
Uploaded on

I used this set of slides for my PhD defense.

I used this set of slides for my PhD defense.

More in: Technology , Spiritual
  • Full Name Full Name Comment goes here.
    Are you sure you want to
    Your message goes here
    Be the first to comment
    Be the first to like this
No Downloads

Views

Total Views
1,542
On Slideshare
0
From Embeds
0
Number of Embeds
0

Actions

Shares
Downloads
24
Comments
0
Likes
0

Embeds 0

No embeds

Report content

Flagged as inappropriate Flag as inappropriate
Flag as inappropriate

Select your reason for flagging this presentation as inappropriate.

Cancel
    No notes for slide

Transcript

  • 1. Modeling History to Understand Software Evolution PhD Defense Tudor Gîrba Supervisors: Stéphane Ducasse, Oscar Nierstrasz 13 27 73
  • 2. Context: Reverse engineering is creating high level views of the system Requirements Analysis Fo g in rw er ar ne Design d gi En En gi se ne er e rin ev g R Time Implementation © Tudor Gîrba 2 /47
  • 3. Context: Reverse engineering is creating high level views of the system Requirements Analysis Fo g in rw er ar ne Design d gi En En gi se ne er e rin ev g R Time Implementation © Tudor Gîrba 2 /47
  • 4. Context: History holds useful information for reverse engineering The doctor always looks at my health file Historical information is useful but, it is hidden among huge amounts of data N versions means N times more data Version 1 Version 2 Version 3 … Version n The more data the more techniques are needed to analyze it © Tudor Gîrba 3 /47
  • 5. Context: Many techniques were developed Trend analysis [Lehman etal. ‘01] Evolution patterns … [Lanza, Ducasse ‘02] Co-change analysis [Gall etal. ‘03] Authors analysis [Eick etal. ‘02] © Tudor Gîrba 4 /47
  • 6. Problem: Current approaches rely on ad-hoc models or on too specific meta-models Trend analysis [Lehman etal. ‘01] Evolution patterns … [Lanza, Ducasse ‘02] Co-change analysis [Gall etal. ‘03] Authors analysis [Eick etal. ‘02] © Tudor Gîrba 5 /47
  • 7. Problem: Current approaches rely on ad-hoc models or on too specific meta-models Trend analysis [Lehman etal. ‘01] Evolution patterns … [Lanza, Ducasse ‘02] Research question: How can we build a generic meta-model? Co-change analysis [Gall etal. ‘03] Authors analysis [Eick etal. ‘02] © Tudor Gîrba 5 /47
  • 8. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 6 /47
  • 9. Overview History Version History Version History Version Hismo: Hismo Modeling History Applications History Version 13 27 History Version 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership History Version measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 6 /47
  • 10. Example: Evolution Matrix reveals different evolution patterns [Lanza, Ducasse ‘02] Pulsar Class Idle Polymetric Class NOM view White Dwarf Class NOA Class Supernova Class versions © Tudor Gîrba 7 /47
  • 11. Example: Evolution Matrix reveals different evolution patterns [Lanza, Ducasse ‘02] Pulsar Class Idle Polymetric Class view Thesis: NOM White Dwarf Class NOA Class Evolution needs to be modeled as a first class entity Supernova Class versions © Tudor Gîrba 7 /47
  • 12. Solution: History encapsulates and characterizes the evolution Pulsar Class History Idle Class History ClassHistory White Dwarf isPulsar Class History isIdle … Supernova Class History versions © Tudor Gîrba 8 /47
  • 13. Hismo: The history meta-model System Version Class Class History Version © Tudor Gîrba 9 /47
  • 14. Hismo: The history meta-model System System History Version Class Class History Version © Tudor Gîrba 9 /47
  • 15. Hismo: The history meta-model System System History Version Class Class History Version © Tudor Gîrba 9 /47
  • 16. … but, what about relationships? System System History Version Inheritance Version Class Class History Version © Tudor Gîrba 10 /47
  • 17. … but, what about relationships? System System History Version Inheritance Inheritance History Version Class Class History Version © Tudor Gîrba 10 /47
  • 18. Hismo is obtained by transforming the structural meta-model History Version History Version History Version © Tudor Gîrba 11 /47
  • 19. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 12 /47
  • 20. Overview Application: History Version History measurements History Version 13 History Version 27 Hismo 73 Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 12 /47
  • 21. Problem: History holds useful information hidden among large amounts of data 2 4 3 5 7 2 2 3 4 9 2 2 1 2 3 2 2 2 2 2 1 5 3 4 4 How much was a class changed? When was a class changed? … © Tudor Gîrba 13 /47
  • 22. History can be measured: 13 27 How much was a class changed? 73 n Evolution of Number ENOM(C)= ∑ |NOMi(C)-NOMi-1(C)| of Methods i=2 ENOM(C)= 4 + 2 + 1 + 0 = 7 1 5 3 4 4 © Tudor Gîrba 14 /47
  • 23. History can be measured: 13 27 When was a class changed? 73 n Latest Evolution of LENOM(C)= ∑ |NOMi(C)-NOMi-1(C)| 2i - n Number of Methods i=2 Earliest Evolution of n Number of Methods EENOM(C)= ∑ |NOMi(C)-NOMi-1(C)| 22 - i i=2 LENOM(C)= 4 2-3 + 2 2-2 + 1 2-1 + 0 20 =1 1 5 3 4 4 EENOM(C)= 4 20 + 2 2-1 + 1 2-2 + 0 2-3 = 5.125 © Tudor Gîrba 15 /47
  • 24. History measurements compress 13 27 aspects of the evolution into numbers 73 ENOM LENOM EENOM A 2 4 3 5 7 7 3.37 3.25 B 2 2 3 4 9 7 5.75 1.37 C 2 2 1 2 3 3 1 2 D 2 2 2 2 2 0 0 0 E 1 5 3 4 4 7 1 5.12 © Tudor Gîrba 16 /47
  • 25. History measurements compress 13 27 aspects of the evolution into numbers 73 ENOM LENOM EENOM A Balanced changer 7 3.37 3.25 B Late changer 7 5.75 1.37 C 3 1 2 D Dead stable 0 0 0 E Early changer 7 1 5.12 © Tudor Gîrba 17 /47
  • 26. Many measurements can be defined at 13 27 different levels of abstraction … 73 Evolution Number of Methods Latest/Earliest Evolution Number of Statements Stability Cyclomatic Complexity Historical Max/Min of Lines of Code Historical Average Number of Classes Growth Trend Number of modules … … … But measurements are a means not a goal © Tudor Gîrba 18 /47
  • 27. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 19 /47
  • 28. Overview Application: History Version Yesterday’s Weather History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 19 /47
  • 29. Common Wisdom: The recently changed parts are likely to change in the near future [Mens,Demeyer ‘01] Is the common wisdom relevant? Yesterday’s Weather metaphor: It expresses the chances of having the same weather today as we had yesterday It is location specific Switzerland - 30% Sahara - 90% © Tudor Gîrba 20 /47
  • 30. Yesterday’s Weather: For each given version we check the common wisdom YesterdayWeatherHit(present): past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end) past.intersectWith(future).notEmpty() Past Present Future versions version versions © Tudor Gîrba 21 /47
  • 31. Yesterday’s Weather: For each given version we check the common wisdom Past Late Changers YesterdayWeatherHit(present): past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end) past.intersectWith(future).notEmpty() Past Present Future versions version versions © Tudor Gîrba 21 /47
  • 32. Yesterday’s Weather: For each given version we check the common wisdom Past Late Future Early Changers Changers YesterdayWeatherHit(present): past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end) past.intersectWith(future).notEmpty() Past Present Future versions version versions © Tudor Gîrba 21 /47
  • 33. Yesterday’s Weather: For each given version we check the common wisdom Past Late Future Early Changers Changers YesterdayWeatherHit(present): past:=histories.topLENOM(start, present) future:=histories.topEENOM(present, end) past.intersectWith(future).notEmpty() Past Present Future hit versions version versions © Tudor Gîrba 21 /47
  • 34. Overall Yesterday’s Weather shows the localization of changes in time hit hit hit hit hit hit hit hit hit hit 3 hits 7 hits YW = = 37% YW = = 87% 8 possible 8 possible hits hits © Tudor Gîrba 22 /47
  • 35. Overall Yesterday’s Weather shows the localization of changes in time hit hit hit hit hit hit hit hit hit hit 3 hits 7 hits YW = = 37% YW = = 87% 8 possible 8 possible hits hits Case studies: 40 versions of CodeCrawler (180 classes): 100% 40 versions of Jun (700 classes): 79% 40 versions of Jboss (4000 classes): 53% © Tudor Gîrba 22 /47
  • 36. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 23 /47
  • 37. Overview Application: History Version History-based Detection Strategies History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 23 /47
  • 38. Context: Detection Strategies detect design flaws based on measurements [Marinescu ‘04] Example: God Class Maintainability problem because it encapsulates a lot of knowledge Class ATFD > 40 AND Class WMC > 75 OR God Class Class TCC < 0.2 AND Class NOA > 20 © Tudor Gîrba 24 /47
  • 39. History-based Detection Strategies take evolution into account Example: a Stable God Class is not necessarily a bad one History Last God Class AND Stable God Class History Stability > 95% © Tudor Gîrba 25 /47
  • 40. History-based Detection Strategies take evolution into account Example: a Stable God Class is not necessarily a bad one History Last God Class AND Stable God Class History Stability > 95% Case study: 5 out of 24 God Classes in Jun were stable and harmless © Tudor Gîrba 25 /47
  • 41. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 26 /47
  • 42. Overview Application: Characterizing the History evolution Version of class hierarchies History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 26 /47
  • 43. Context: Given the evolution of a hierarchy … A A A A A B C B C B C B B D D D E time A is persistent C was removed B is stable E is newborn D inherited from C and then from A … © Tudor Gîrba 27 /47
  • 44. … but useful information is hidden among large amounts of data How were the hierarchies evolved? © Tudor Gîrba 28 /47
  • 45. Hierarchy Evolution Complexity View characterizes class hierarchy histories ENOM A ENOS Age Class History Removed C B Age Inheritance D E Removed History A is persistent C was removed B is stable E is newborn D inherited from C and then from A … © Tudor Gîrba 29 /47
  • 46. Case study: Class hierarchies in Jun reveal evolution patterns Young Unstable root Reliable inheritance Persistent Unbalanced Stable Newborn Reliable inheritance Old Old Stable Unstable Balanced Unbalanced Reliable inheritance Unreliable inheritance © Tudor Gîrba 30 /47
  • 47. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 31 /47
  • 48. Overview Application: History Version Detecting co-change patterns History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 31 /47
  • 49. Context: Repeated co-changes reveal hidden dependencies [Gall etal. ‘98] v1 v2 v3 v4 v5 v6 A B Can we identify co-change patterns like: C Parallel Inheritance Shotgun Surgery D … ? E © Tudor Gîrba 32 /47
  • 50. Formal Concept Analysis (FCA) finds elements that have properties in common [Ganter, Wille ‘99] P1 P2 P3 P4 P5 P6 (A,B,C,D,E) () A (A,D,E) (A,B,C,D) (P2) (P6) B (D,E) (A,D) (A,B,C) C FCA (P5,P6) (P2,P4) (P2,P6) D (D) (A) (C) (P2,P4,P6) (P2,P5,P6) (P3,P5,P6) E () (P1,P2,P3,P4,P5,P6) To use FCA, we need to map our interests on elements and properties © Tudor Gîrba 33 /47
  • 51. Formal Concept Analysis (FCA) finds elements that have properties in common [Ganter, Wille ‘99] P1 P2 P3 P4 P5 P6 (A,B,C,D,E) () A (A,D,E) (A,B,C,D) (P2) (P6) B (D,E) (A,D) (A,B,C) C FCA (P5,P6) (P2,P4) (P2,P6) D (D) (A) (C) (P2,P4,P6) (P2,P5,P6) (P3,P5,P6) E () (P1,P2,P3,P4,P5,P6) To use FCA, we need to map our interests on elements and properties © Tudor Gîrba 34 /47
  • 52. We use FCA to identify entities that co-changed repeatedly v1 v2 v3 v4 v5 v6 (A,B,C,D,E) () A (A,D,E) (A,B,C,D) (v2) (v6) B (D,E) (A,D) (A,B,C) C FCA (v5,v6) (v2,v4) (v2,v6) D (D) (A) (C) (v2,v4,v6) (v2,v5,v6) (v3,v5,v6) E () (v1,v2,v3,v4,v5,v6) Elements = Histories Properties = “changed in version X” © Tudor Gîrba 35 /47
  • 53. Example: Parallel inheritance denotes children added in several hierarchies v1 v2 v3 v4 v5 v6 A 0 1 1 1 2 4 A A A A A A Elements = ClassHistories Properties = “changed number of children in version X” © Tudor Gîrba 36 /47
  • 54. Example: Parallel inheritance denotes children added in several hierarchies v1 v2 v3 v4 v5 v6 A 0 1 1 1 2 4 Case study: JBoss A A A A A A ServiceMBeanSupport 14 JBossTestCase versions EJBLocalHome 9 EJBLocalObject versions Elements = ClassHistories Properties = “changed number of children in version X” © Tudor Gîrba 36 /47
  • 55. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 37 /47
  • 56. Overview Application: History Version Ownership map History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 37 /47
  • 57. Context: The code history might tell you what happened, but not why it happened Case study: Outsight [Rysselberghe, Demeyer ‘04] files time © Tudor Gîrba 38 /47
  • 58. Context: The code history might tell you what happened, but not why it happened Case study: Outsight [Rysselberghe, Demeyer ‘04] files time Who is responsible for this? © Tudor Gîrba 38 /47
  • 59. We color the lines to show which author owned which files in which period Green author Green author large commit ownership File History A File History B Blue author small commit © Tudor Gîrba 39 /47
  • 60. The commit history shows what happened © Tudor Gîrba 40 /47
  • 61. Ownership Map shows which author owned which files in which period © Tudor Gîrba 41 /47
  • 62. We cluster the file histories to favor colored blocks inside each module We use the Hausdorf distance between the commit timestamps d(A, B) = ∑ min2{ | a - b | b ∈ B } a∈A A B © Tudor Gîrba 42 /47
  • 63. Ownership Map on alphabetically ordered files is not very useful, but … © Tudor Gîrba 43 /47
  • 64. The ordered Ownership Map reveals developer patterns © Tudor Gîrba 44 /47
  • 65. The ordered Ownership Map reveals developer patterns Edit Takeover © Tudor Gîrba Monologue Familiarization Dialogue 44 /47
  • 66. Overview History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 45 /47
  • 67. Overview History Version Implementation: History Version History Version Both Hismo and its applications Hismo are implemented in Applications one single infrastructure 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 45 /47
  • 68. Implementation: All tools are integrated into Moose Van CodeCrawler ConAn Chronia 13 27 73 Moose Model repository Extensible meta-model Integration mechanism © Tudor Gîrba 46 /47
  • 69. Conclusion: Hismo offers a uniform way of expressing evolution analyses History Version History Version History Version Hismo Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 47 /47
  • 70. Conclusion: Hismo offers a uniform way of expressing evolution analyses History Version History Version History Version Hismo Questions? Applications 13 27 73 Historical Yesterday’s History-based Hierarchy Co-change Ownership measurements Weather Detection evolution patterns Map Strategies © Tudor Gîrba 47 /47