LSE


         A Portfolio of
         Software Evolution
         Expertise
                   Stéphane Ducasse
         ...
A word of presentation
            Co-author of Object-Oriented Reengineering Patterns
            Co-developer of Moose (...
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose an op...
Software is complex.

                 29% Succeeded

                    18% Failed



                 53% Challenged


...
How large is your project?




                                  LSE
S.Ducasse             5
How large is your project?


                1’000’000 lines of code
               * 2 = 2’000’000 seconds
              ...
Maintenance is Continuous Development

                                                                 4.1% Other
       ...
Lehman’s Software Evolution Laws
            Continuous Change: “A program that is used in a
            real-world enviro...
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose an op...
Supporting the evolution of applications
            A research goal and agenda grounded in reality

            How to he...
Covered topics
                                                                         Analyses


            Topics     ...
Software Metrics
                                                    [LMO99, OOPSLA00]
                                   ...
One Example: who is responsible for what?


                      (4) Visualisation
    (3) Analyses

(2) Model


         ...
Moose is a reengineering tool which integrates
     multiple techniques
                Number of classes = 382
          ...
Moose is open and open-source
            meta-described
            meta-model aware


                          Method  ...
Designed to be extensible
                            Class
                            History

            Duplication  ...
Roadmap
        •   Some facts
        •   Our approach
            •   Supporting maintenance
            •   Moose an op...
Understanding large systems
            Understanding code is difficult!
            Systems are large
            Code is ...
Polymetric views




                             W: # fields
                             H: # methods
                  ...
Polymetric views condense information
    To get a feel of the inheritance
    semantics: adding vs. reusing




         ...
Navigating Views...




Understanding classes
            Understanding even a class is difficult!




                                            ...
Class Blueprint
       Enriched call flow annotated with
       metrics to give semantics
            Initialization   Exte...
Class Blueprint




Large delegating interface




Sharing Flows




Regular Subclasses




Patterns




How can we predict changes?
            Common wisdom stresses that what changed yesterday
            will change today, ...
With history analysis we can get the
     climate of a software system
                       Past Late               Futu...
How do developers develop?
        •   More efficient to put people working together in the
            same office?
        • ...
Who did that?




Files




               Time

Line colors show which author owned
     which files in which period

                     Green author    Green author
   ...
Which author “possesses” which files?




Alphabetical order is no order!




Based on similar commit signature


                                               Edit       Takeover




            Mon...
Understanding evolution of large systems
        •   How old are the hierarchies?
        •   How did the classes change?
...
Evolution holds useful information

            A           A              A              A                  A

          ...
Hierarchy Evolution Complexity View
     characterizes class hierarchy histories
                                         ...
Class hierarchies over 40 versions of
     Jun, a 3D framework of 740 classes




                                         ...
Identifying Duplicated Code
            “Parsing the program suite of interest requires a parser for the
            langu...
Language Independent                             a b c defa b cdef



            Language independent, Textual,
         ...
A Conceptual Matrix
                   File A            File B
    a b c defa b cdef




                            File...
Entities that change together can reveal hidden
     dependencies

                                                      (...
How do properties spread in large systems?
            Properties:
              Metrics
              People
              S...
Distribution Map




Ownership
        •   Authors in JBoss




Characterizing Packages
            Butterflies [Metrics05]
              Kind of Radar




                               ...
Relative version




How to understand Packages
            Packages are key structuring elements
            But complex:
              import...
Surfaces represent package communication



                                                           classes in P1
     ...
Principle

         P2                  P3                  P4
            A2    B2        A3       B3               A4


...
Example




Symbols contain domain information
        •   What are the concepts used in an application?
        •   How can we use sy...
Looking at the Symbols
        •   Developers use meaningful names, which capture
            the domain knowledge.




  ...
A cluster is a group of documents
     which use the same terms




Moose has been validated on real life systems
            Several large, industrial case studies (NDA)
              Harma...
Possible New Research Directions

        •   Remodularization
            •   Clustering analysis
            •   Open an...
Evolution/Maintenance is a challenge

            Understanding and maintaining large and complex
            applications...
Ducasse's Maintenance Expertise

  1. LSE: A Portfolio of Software Evolution Expertise. Stéphane Ducasse, stephane.ducasse@inria.fr, http://stephane.ducasse.free.fr/
  2. A word of presentation: co-author of Object-Oriented Reengineering Patterns; co-developer of Moose (reengineering platform); 10 PhD theses in reengineering; 50+ articles. Grounded in reality: was maintainer of Squeak 3.9; worked with Harman-Becker AG, Bedag AG, Nokia, Daimler.
  3. Roadmap • Some facts • Our approach • Supporting maintenance • Moose, an open platform • Some visual examples • Conclusion
  4. Software is complex. 29% Succeeded, 18% Failed, 53% Challenged (The Standish Group, 2004).
  13. How large is your project? 1’000’000 lines of code × 2 seconds per line = 2’000’000 seconds; / 3600 ≈ 560 hours; / 8 hours per day = 70 days; / 20 working days per month ≈ 3 months.
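The back-of-the-envelope estimate on this slide can be replayed in a few lines. This is only a sketch: the 2 seconds per line, the 8-hour working day and the 20-day working month are read off the slide's factors, and the function name `reading_time` is hypothetical.

```python
def reading_time(lines_of_code, seconds_per_line=2,
                 hours_per_day=8, days_per_month=20):
    """Return (hours, days, months) needed just to read the code once."""
    seconds = lines_of_code * seconds_per_line
    hours = seconds / 3600
    days = hours / hours_per_day
    months = days / days_per_month
    return hours, days, months


hours, days, months = reading_time(1_000_000)
print(f"{hours:.0f} hours, {days:.0f} days, {months:.1f} months")
```

The slide rounds each intermediate result, which is why its figures differ slightly from the unrounded ones printed here.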
  14. Maintenance is Continuous Development. Relative maintenance effort: 60.3% Perfective (new functionality), 18.2% Adaptive (new platforms or OS), 17.4% Corrective (fixing reported errors), 4.1% Other. Between 50% and 75% of global effort is spent on “maintenance”! The bulk of the maintenance cost is due to new functionality: even with better requirements, it is hard to predict new functions.
  15. Lehman’s Software Evolution Laws. Continuous Change: “A program that is used in a real-world environment must change, or become progressively less useful in that environment.” Software Entropy: “As a program evolves, it becomes more complex, and extra resources are needed to preserve and simplify its structure.”
  16. Roadmap • Some facts • Our approach • Supporting maintenance • Moose, an open platform • Some visual examples • Conclusion
  17. Supporting the evolution of applications: a research goal and agenda grounded in reality. How to help companies maintain their large software? What is the x-ray for software (code, people, practices)? Which analyses? How can you monitor your system (dashboards...)? How to present extracted information?
  18. Covered topics. Topics: Metamodeling, Software metrics, Program understanding, Visualization, Evolution analysis, Duplicated code detection, Code Analysis, Refactorings, Tests. [Diagram labels: Analyses, Reverse Engineering, Representation, Transformations, Evolution.] Contributions: Moose, an open-source extensible reengineering environment (Lugano, Bern, Annecy, Anvers, Louvain la neuve, ULB, UTSL). Contacts: Harman-Becker (3 million lines of C++), Bedag (Cobol), Nokia, ABB, IMEC.
  19. Software Metrics [LMO99, OOPSLA00] Duplicated Code Identification Understanding Large Systems [ICSM99, ICSM02] Group Identification [WCRE99, TSI00, TSE03] Static/Dynamic Information [ASE03] Test Generation [ICSM99] Feature Analysis [CSMR 06] Concept Identification [JSME 06] Analyses [WCRE 06] Class Understanding [OOPSLA01,TSE04] Package Blueprints Reverse [ICSM 07] Engineering Distribution Maps [ICSM 06] Representation Transformations Language Independent Refactorings [IWPSE 00] Evolution Language Independent Meta Model (FAMIX) Reengineering Patterns [UML99] Version Analyses An Extensible Reengineering [ICSM 05] Environment (Moose) HISMO metamodel [Models 06] [JSME 05]
  20. One Example: who is responsible for what? (1) Extraction, (2) Model, (3) Analyses, (4) Visualisation. Distribution Map of authors on JBoss.
  21. Moose is a reengineering tool which integrates multiple techniques: Metrics (Number of classes = 382, Number of methods = 4268, …), Visualization, Queries and Navigation, Semantic Analysis (word1, word2, …), Evolution Analysis.
  22. Moose is open and open-source: meta-described, meta-model aware. [Diagram labels: Method, Class, Inheritance.]
  23. Designed to be extensible. [Diagram labels: Class History, Duplication, Class, Author, Version, Method, Class, File, Event, Inheritance, Trace.]
  24. Roadmap • Some facts • Our approach • Supporting maintenance • Moose, an open platform • Some visual examples • Conclusion
  25. Understanding large systems. Understanding code is difficult! Systems are large. Code is abstract. Should I really convince you? Some existing approaches: Metrics (problem: you often get meaningless results once combined); Visualization (often beautiful but without meaning).
  26. Polymetric views. W: # fields; H: # methods; C: # lines of code.
  27. Polymetric views condense information. To get a feel of the inheritance semantics: adding vs. reusing. Classes + Inheritance: W: # of added methods; H: # of overridden methods; C: # of extended methods. Method: LOC, # statements, # parameters.
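The mapping used by a polymetric view (W for width, H for height, C for color) can be sketched as a small function that turns class metrics into a rectangle. This is an illustrative sketch, not Moose's implementation: the `ClassMetrics` and `Glyph` types, the `unit` scale factor and the linear gray scale are all assumptions; only the metric-to-dimension mapping (fields, methods, lines of code) comes from the slides.

```python
from dataclasses import dataclass


@dataclass
class ClassMetrics:
    name: str
    fields: int
    methods: int
    lines_of_code: int


@dataclass
class Glyph:
    name: str
    width: int
    height: int
    gray: int  # 0 = black (most lines of code), 255 = white


def to_glyph(metrics, max_loc, unit=4):
    """Map W -> # fields, H -> # methods, C -> # lines of code."""
    darkness = metrics.lines_of_code / max_loc if max_loc else 0.0
    return Glyph(
        name=metrics.name,
        width=max(1, metrics.fields) * unit,    # keep empty classes visible
        height=max(1, metrics.methods) * unit,
        gray=round(255 * (1 - darkness)),
    )


def polymetric_view(classes):
    """Return one glyph per class, shaded relative to the largest class."""
    max_loc = max((c.lines_of_code for c in classes), default=0)
    return [to_glyph(c, max_loc) for c in classes]


view = polymetric_view([
    ClassMetrics("Small", fields=1, methods=2, lines_of_code=10),
    ClassMetrics("Big", fields=5, methods=30, lines_of_code=1000),
])
```

A tall, narrow, dark glyph then reads at a glance as a class with many methods, few fields and a lot of code.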
  28. Navigating Views...
  29. Understanding classes. Understanding even a class is difficult!
  30. Class Blueprint: enriched call flow annotated with metrics to give semantics. [Figure labels: Initialization, External Interface, Internal Implementation, Accessor, Attribute; Invocation Sequence.]
  31. Class Blueprint
  32. Large delegating interface
  33. Sharing Flows
  34. Regular Subclasses
  35. Patterns
  36. How can we predict changes? Common wisdom stresses that what changed yesterday will change today, but is it true? In the Sahara the weather is constant: there is a 90% chance that tomorrow is the same as today. In Belgium the weather changes really fast (sea influence): there is a 30% chance that tomorrow is the same as today.
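  37. With history analysis we can get the climate of a software system. [Figure: a timeline of past versions, the present version and future versions; the late changers of the past are compared with the early changers of the future, and an overlap counts as a hit.] $YW_i(S, t_1, t_2) = 1$ if $\mathrm{TopLENOM}_{1..i}(S, t_1) \cap \mathrm{TopEENOM}_{i..n}(S, t_2) \neq \emptyset$, and $YW_i(S, t_1, t_2) = 0$ if that intersection is empty; $YW(S, t_1, t_2) = \frac{\sum_i YW_i(S, t_1, t_2)}{n-2}$.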
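The formula on this slide can be read as "count the versions at which the recent top changers overlap with the upcoming top changers, then average". Below is a minimal sketch under stated assumptions: the two Top sets are passed in as functions because the slide does not define how they are computed, the index range 2..n-1 is inferred from the n-2 denominator rather than given on the slide, and the function name and toy data are hypothetical.

```python
def yesterdays_weather(top_late, top_early, n):
    """Average hit rate over the n - 2 inner versions (assumed range 2..n-1).

    top_late(i)  -> set standing in for TopLENOM_{1..i}(S, t1)
    top_early(i) -> set standing in for TopEENOM_{i..n}(S, t2)
    """
    if n < 3:
        raise ValueError("need at least three versions")
    # YW_i is 1 when the two sets intersect, 0 otherwise.
    hits = sum(1 for i in range(2, n) if top_late(i) & top_early(i))
    return hits / (n - 2)


# Toy data for a history of n = 5 versions (hypothetical class names).
late = {2: {"A"}, 3: {"A", "B"}, 4: {"C"}}
early = {2: {"A"}, 3: {"D"}, 4: {"C", "E"}}
climate = yesterdays_weather(late.__getitem__, early.__getitem__, 5)
```

A value close to 1 corresponds to the Sahara case of the previous slide (recent changers keep changing); a value close to 0 corresponds to the Belgium case.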
  38. How do developers develop? • Is it more efficient to put people working together in the same office? • How can we optimize software development?
  39. Who did that? [Figure axes: Files, Time.]
  40. Line colors show which author owned which files in which period. [Figure labels: File A, File B; green author: large commit; green author: ownership; blue author: small commit.]
  41. Which author “possesses” which files?
  42. Alphabetical order is no order!
  43. Based on similar commit signature. [Figure labels: Edit, Takeover, Monologue, Familiarization, Dialogue.]
  44. Understanding evolution of large systems • How old are the hierarchies? • How did the classes change? • How did the inheritance change?
  45. Evolution holds useful information. [Figure: a class hierarchy (classes A to E) shown over successive versions along a time axis.] A is persistent; C was removed; B is stable; E is newborn; D inherited from C and then from A; …
  46. Hierarchy Evolution Complexity View characterizes class hierarchy histories. [Figure labels: Class History, Inheritance History, ENOM, ENOS, Age, Removed; classes A to E.] A is persistent; C was removed; B is stable; E is newborn; D inherited from C and then from A; …
  47. Class hierarchies over 40 versions of Jun, a 3D framework of 740 classes
  48. Identifying Duplicated Code. “Parsing the program suite of interest requires a parser for the language dialect of interest. While this is nominally an easy task, in practice one must acquire a tested grammar for the dialect of the language at hand. Often for legacy codes, the dialect is unique and the developing organization will need to build their own parser. Worse, legacy systems often have a number of languages and a parser is needed for each. Standard tools such as Lex and Yacc are rather a disappointment for this purpose, as they deal poorly with lexical hiccups and language ambiguities.” [Baxter 98] Problems: unknown duplicated code, scalability, understanding.
  49. Language Independent. Language independent, textual [ICSM’99], M. Rieger’s PhD thesis. Duploc handled Pascal, Java, Smalltalk, Python, Cobol, C++, PDP-11, C. Slower than other approaches but... max 45 min to adapt our approach to a new language. Between 3% and 10% less identification than parametrized match. [Figure: Exact Copies (a b c d e f / a b c d e f); Copies with Variations (a b c d e f / a b x y e f).]
  50. A Conceptual Matrix. [Figure: File A compared against File B in a matrix; Exact Copies (a b c d e f / a b c d e f) and Copies with Variations (a b c d e f / a b x y e f).]
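The conceptual matrix can be illustrated with a purely textual, line-by-line comparison. The sketch below is not Duploc's actual code: the whitespace-stripping normalization, the function names and the `min_length` threshold are assumptions chosen for illustration. Each cell of the matrix says whether line i of file A equals line j of file B, so a copied fragment shows up as a diagonal run.

```python
def normalize(line):
    """Drop all whitespace so that layout differences do not hide a copy."""
    return "".join(line.split())


def comparison_matrix(lines_a, lines_b):
    """matrix[i][j] is True when line i of A matches line j of B."""
    norm_a = [normalize(line) for line in lines_a]
    norm_b = [normalize(line) for line in lines_b]
    # Blank lines never count as matches.
    return [[a != "" and a == b for b in norm_b] for a in norm_a]


def diagonals(matrix, min_length=2):
    """Return (start_a, start_b, length) for every maximal diagonal run."""
    runs = []
    for i, row in enumerate(matrix):
        for j, match in enumerate(row):
            starts_run = match and not (i > 0 and j > 0 and matrix[i - 1][j - 1])
            if not starts_run:
                continue
            length = 0
            while (i + length < len(matrix) and j + length < len(row)
                   and matrix[i + length][j + length]):
                length += 1
            if length >= min_length:
                runs.append((i, j, length))
    return runs


# The two cases from the slide, one letter standing for one source line.
exact = diagonals(comparison_matrix(list("abcdef"), list("abcdef")))
varied = diagonals(comparison_matrix(list("abcdef"), list("abxyef")))
```

An exact copy yields one unbroken diagonal, while a copy with variations yields a diagonal with a hole where the lines differ. No parser is involved, which is what makes this style of comparison language independent.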
  51. Entities that change together can reveal hidden dependencies. Measurements over versions v1 to v6: A = 2 3 3 3 4 6; B = 6 6 6 5 6 7; C = 3 3 5 5 8 9; D = 1 3 3 4 4 6; E = 4 5 5 6 6 6. Groups of entities with the versions in which they all changed: (A,B,C,D,E) (); (A,B,C,D) (v6); (A,D,E) (v2); (A,B,C) (v5,v6); (D,E) (v2,v4); (A,D) (v2,v6); (D) (v2,v4,v6); (C) (v3,v5,v6); (A) (v2,v5,v6); () (v1,v2,v3,v4,v5,v6).
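The groups on this slide can be recomputed from the measurement table. The sketch below assumes that an entity "changed" in a version when its measurement differs from the previous version, which is consistent with the slide's data; the function names are hypothetical.

```python
# Measurements per entity for versions v1..v6, copied from the slide.
measurements = {
    "A": [2, 3, 3, 3, 4, 6],
    "B": [6, 6, 6, 5, 6, 7],
    "C": [3, 3, 5, 5, 8, 9],
    "D": [1, 3, 3, 4, 4, 6],
    "E": [4, 5, 5, 6, 6, 6],
}


def changed_versions(values):
    """Version numbers (1-based) whose value differs from the previous one."""
    return {i + 1 for i in range(1, len(values)) if values[i] != values[i - 1]}


changes = {name: changed_versions(v) for name, v in measurements.items()}


def entities_changed_in(versions):
    """All entities that changed in every one of the given versions."""
    return {name for name, changed in changes.items() if versions <= changed}


def shared_versions(entities):
    """All versions in which every one of the given entities changed."""
    return set.intersection(*(changes[name] for name in entities))


print(entities_changed_in({5, 6}))  # the group that changed in both v5 and v6
print(shared_versions({"A", "D"}))  # versions in which A and D both changed
```

Entities that repeatedly land in the same group are candidates for a hidden dependency, even when no explicit reference links them in the code.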
  52. How do properties spread in large systems? Properties: metrics, people, symbols/concepts. Spread = how many packages does it touch? Focus = do packages and properties match? Distribution Map: a generic visualization.
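Spread, as defined on this slide, is simply the number of packages a property value touches. A minimal sketch follows, assuming the input is a list of (package, class, property) triples; the package, class and author names are made up for illustration. Focus is left out because the slide gives no formula for it.

```python
from collections import defaultdict


def spread(assignments):
    """Map each property value to the number of packages it touches.

    assignments: iterable of (package, class_name, property_value) triples.
    """
    packages = defaultdict(set)
    for package, _class_name, prop in assignments:
        packages[prop].add(package)
    return {prop: len(pkgs) for prop, pkgs in packages.items()}


# Hypothetical ownership data: which author owns which class.
ownership = [
    ("core", "Parser", "alice"),
    ("core", "Lexer", "alice"),
    ("ui", "Window", "alice"),
    ("ui", "Button", "bob"),
]
result = spread(ownership)
```

Here `alice` touches two packages and `bob` one, which is the kind of difference a Distribution Map makes visible by coloring classes inside their package boxes.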
  53. Distribution Map
  54. Ownership • Authors in JBoss
  55. Characterizing Packages. Butterflies [Metrics05]: a kind of radar.
  56. Relative version
  57. How to understand packages. Packages are key structuring elements, but complex: import classes.... Package Blueprints [ICSM 2007]
  58. Surfaces represent package communication. [Figure: blueprint of the analyzed package P1, with one surface for each of P2, P3 and P4; each surface shows the classes in P1 that do the referencing and the referenced classes.]
  59. Principle. [Figure: package P1 under analysis and packages P2, P3, P4; each surface has a head with the internal references and a body with the external references; internal referencing classes, internal referenced classes and external referenced classes are ordered from most to least.]
  60. Example
  61. Symbols contain domain information • What are the concepts used in an application? • How can we use symbolic information?
  62. Looking at the Symbols • Developers use meaningful names, which capture the domain knowledge.
  63. A cluster is a group of documents which use the same terms
  64. Moose has been validated on real-life systems. Several large, industrial case studies (NDA): Harman-Becker, Nokia, Daimler, Siemens. Different implementation languages (C++, Java, Smalltalk, Cobol); we use external C++ parsers. Different sizes. Moose is used in several research groups.
  65. Possible New Research Directions • Remodularization • Clustering analysis • Open and Modular modules • Service Identification in Service Oriented Architecture • Architecture Extraction/Validation • Software Quality • Cost/Bugs prediction • EJB evaluation • Business rules extraction • Model transformation • Test
  66. Evolution/Maintenance is a challenge. Understanding and maintaining large and complex applications needs better tools/analyses. Moose is a platform for developing new analyses. Transfer to tool vendors.