Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

20080115 04 - La qualimétrie pour comprendre et appréhender les SI

34 views

Published on

Looking differently at code

Published in: Software
  • Be the first to comment

  • Be the first to like this

20080115 04 - La qualimétrie pour comprendre et appréhender les SI

  1. 1. Stéphane Ducasse LSE Stéphane Ducasse stephane.ducasse@inria.fr http://stephane.ducasse.free.fr/ Looking differently at code 1
  2. 2. S.Ducasse LSE A word of presentation Co-author of Object-Oriented Reengineering Patterns Co-developer of Moose (reengineering platform) 10 PhD Theses in reengineering 50+ articles Grounded in reality Maintainer of open-source projects Worked with: Harman-Becker AG Bedag AG Nokia, Daimler 2
  3. 3. S.Ducasse LSE Roadmap • Some software development facts • Our approach • Supporting maintenance • Moose an open-platform • Visual principles in 3 min • Some visual examples • Conclusion 3
  4. 4. S.Ducasse LSE Software is complex. The Standish Group, 2004 53% Challenged 18% Failed 29% Succeeded 4
  5. 5. 1946
  6. 6. S.Ducasse LSE How large is your project? 1’000’000 lines of code * 2 = 2’000’000 seconds / 3600 = 560 hours / 8 = 70 days / 20 = 3 months 6
  7. 7. S.Ducasse LSE Maintenance is Continuous Development 7 Relative Maintenance Effort Between 50% and 75% of global effort is spent on “maintenance” ! 17.4% Corrective (fixing reported errors) 18.2% Adaptive (new platforms or OS) 60.3% Perfective (new functionality) 4.1% Other The bulk of the maintenance cost is due to new functionality even with better requirements, it is hard to predict new functions
  8. 8. S.Ducasse LSE System evolution is like... SimCity 8
  9. 9. S.Ducasse LSE Software are living… Early decisions were certainly good at that time But the context changes Customers change Technology changes People change 9
  10. 10. Software development is more than forward engineering. Forward engineering Actual development } { } { } { } { } { } { } { } { } {
  11. 11. Maintenance is is needed to evolve the code. Reverse engineering Forward engineering Actual development } { } { } { } { } { } { } { } { } {
  12. 12. S.Ducasse LSE Roadmap • Some software development facts • Our approach • Supporting maintenance • Moose an open-platform • Visual principles in 3 min • Some visual examples • Conclusion 12
  13. 13. S.Ducasse Help teams maintaining large software What is the xray for software? code, people, practices Which analyses? How can you monitor your system (dashboards....) How to present extracted information? 13
  14. 14. S.Ducasse Covered topics Topics Metamodeling, metrics, program understanding, visualization, evolution analysis, duplicated code detection, code Analysis, refactorings, test generation... Contributions Moose: an open-source extensible reengineering environment: (Lugano, Bern,Annecy,Anvers, Louvain la neuve, ULB, UTSL) Contacts Harman-Becker (3 Millions C++), Bedag (Cobol), Nokia, ABB, IMEC 14 Representation Transformations Reverse Engineering Analyses Evolution
  15. 15. S.Ducasse LSE15 Representation Transformations Reverse Engineering Analyses EvolutionLanguage Independent Meta Model (FAMIX) [UML99] An Extensible Reengineering Environment (Moose) [Models 06] Reengineering Patterns Version Analyses [ICSM 05] HISMO metamodel [JSME 05] Understanding Large Systems [WCRE99,TSI00,TSE03] Static/Dynamic Information [ICSM99] Feature Analysis [JSME 06] Class Understanding [OOPSLA01,TSE04] Package Blueprints [ICSM 07] Distribution Maps [ICSM 06] Software Metrics [LMO99, OOPSLA00] Duplicated Code Identification [ICSM99, ICSM02] Group Identification [ASE03] Test Generation [CSMR 06] Concept Identification [WCRE 06] Language Independent Refactorings [IWPSE 00]
  16. 16. S.Ducasse An example: who is responsible of what? 16 (1) Extraction (2) Modèle (4) Visualisation (3) Analyses Distribution Map of authors on JBoss
  17. 17. S.Ducasse LSE Distribution Map 17
  18. 18. S.Ducasse LSE How properties spread in large systems? Properties: Metrics People Symbol/Concepts Spread = how many packages does it touch? Focus = do packages and properties match? Distribution Map: a generic visualization 18
  19. 19. } { } { } { } { } { McCabe = 21 LO C = 753,000NOM = 102 Metrics Queries Visualizations ... Moose is a powerful environment
  20. 20. Moose is designed to be extensible Method Class Inheritance Method Class Inheritance Author File Duplication Event Trace Class Version Class History open meta-described
  21. 21. S.Ducasse LSE Moose has been validated on real life systems Several large, industrial case studies (NDA) Harman-Becker Nokia Daimler Siemens Different implementation languages (C++, Java, Smalltalk, Cobol) We use external C++ parsers Different sizes Moose is used in several research groups 21
  22. 22. S.Ducasse LSE Visualization principles in 3 min • Preattentive visualization (unconscious < 200ms) • Gestalt principles (from 1912) • 70% of our sensors are dedicated to vision 22
  23. 23. Tudor Gîrba How many 5? 23 3332123466509000096766689877835367 7866750910919818971746453039821768 34567865860880221167687687789762 345678915116718199101081876616161 61819010180980808097767674333
  24. 24. Tudor Gîrba How many 5? 24 3332123466509000096766689877835367 7866750910919818971746453039821768 34567865860880221167687687789762 345678915116718199101081876616161 61819010180980808097767674333
  25. 25. Tudor Gîrba Preattentive attributes 25 Color intensity Form: orientation, line length, line width, size, shape, added marks, enclosure Spatial position (2D location) Motion (flicker)
  26. 26. Tudor Gîrba Color / intensity 26
  27. 27. Tudor Gîrba Position 27
  28. 28. Tudor Gîrba Form / Orientation 28
  29. 29. Tudor Gîrba Form / Line length 29
  30. 30. Tudor Gîrba Form / Line width 30
  31. 31. Tudor Gîrba Form / Size 31
  32. 32. Tudor Gîrba Form / Shapes 32
  33. 33. Tudor Gîrba Form / Added marks 33
  34. 34. Tudor Gîrba Form / Enclosure 34
  35. 35. Tudor Gîrba Context 35
  36. 36. Tudor Gîrba Principle of Proximity 36
  37. 37. Tudor Gîrba Principle of Similarity 37
  38. 38. Tudor Gîrba Principle of Similarity 38
  39. 39. Tudor Gîrba 39 Principle of Enclosure
  40. 40. Tudor Gîrba Principle of Enclosure 40
  41. 41. Tudor Gîrba Principle of Closure 41
  42. 42. Tudor Gîrba 42 Principle of connectivity
  43. 43. Tudor Gîrba Principle of connectivity 43
  44. 44. S.Ducasse LSE Roadmap • Some software development facts • Our approach • Supporting maintenance • Moose an open-platform • Visual principles in 3 min • Some visual examples • Conclusion 44
  45. 45. S.Ducasse LSE Challenges inVisualization Screen size Max 12 colors Edge-crossing Limited short-term memory (three to nine) Extracting semantics out Beauty cannot be a goal Get some help from Gestalt principles pre-attentive visualization 45
  46. 46. S.Ducasse LSE Understanding large systems Understanding code is difficult! Systems are large Code is abstract Should I really convinced you? Some existing approaches Metrics: you often get meaningless results once combined Visualization: often beautiful but with little meaning 46
  47. 47. S.Ducasse LSE Polymetric views 47 W: # fields H: # methods C: # lines of code
  48. 48. S.Ducasse LSE Polymetric views condense information 48 Classes+Inheritance W: # of Added Methods H: # of Overridden Method C: # of Method Extended To get a feel of the inheritance semantics: adding vs. reusing methods LOC # statements # parameters
  49. 49. S.Ducasse LSE Understanding classes Understanding even a class is difficult! 49
  50. 50. S.Ducasse LSE Class Blueprint 50 Invocation Sequence Initialization External Interface Internal Implementation Accessor Attribute Enriched call flow annotated with metrics to give semantics
  51. 51. S.Ducasse LSE Class Blueprint 51
  52. 52. S.Ducasse LSE Large delegating interface 52
  53. 53. S.Ducasse LSE Sharing Flows 53
  54. 54. S.Ducasse LSE Regular Subclasses 54
  55. 55. S.Ducasse LSE How developers develop? • More efficient to put people working together in the same office? • How can we optimize software development? 55
  56. 56. S.Ducasse LSE Who did that? 56 Files Time
  57. 57. S.Ducasse LSE Line colors show which author owned which files in which period 57 File A File B Green author large commit Green author ownership Blue author small commit
  58. 58. S.Ducasse LSE Which author “possesses” which files? 58
  59. 59. S.Ducasse LSE Alphabetical order is no order! 59
  60. 60. S.Ducasse LSE Based on similar commit signature 60 DialogueMonologue Edit Takeover Familiarization
  61. 61. S.Ducasse LSE Understanding evolution of large systems • How old are the hierarchies? • How did the classes change? • How did the inheritance change? 61
  62. 62. S.Ducasse LSE Evolution holds useful information 62 A B A BC A BC D A BC D A D time B is stable C was removed E is newborn A is persistent D inherited from C and then from A …
  63. 63. S.Ducasse LSE Hierarchy Evolution Complexity View characterizes class hierarchy histories 63 B is stable C was removed E is newborn A is persistent D inherited from C and then from A … A B E C D ENOS Removed Age Removed Age Inheritance History Class History ENOM
  64. 64. S.Ducasse LSE Class hierarchies over 40 versions of Jun - a 740 classes, 3D framework 64
  65. 65. S.Ducasse LSE Identifying Duplicated Code “Parsing the program suite of interest requires a parser for the language dialect of interest.While this is nominally an easy task, in practice one must acquire a tested grammar for the dialect of the language at hand. Often for legacy codes, the dialect is unique and the developing organization will need to build their own parser.Worse, legacy systems often have a number of languages and a parser is needed for each. Standard tools such as Lex andYacc are rather a disappointment for this purpose, as they deal poorly with lexical hiccups and language ambiguities.” [Baxter 98] Problems Unknown Duplicated Code Scalability Understanding 65
  66. 66. S.Ducasse LSE Language Independent Language independent,Textual, [ICSM’99], M. Rieger’s PhD.Thesis Duploc handled Pascal, Java, Smalltalk, Python, Cobol, C++, PDP-11, C Slower than other approaches but... Max 45 min to adapt our approach to a new language Between 3% and 10% less identification than parametrized match 66 Exact Copies a b c d e f a b c d e f Copies with a b c d e fa b x y e f
  67. 67. S.Ducasse LSE A Conceptual MatrixFile A File A File B File B Exact Copies a b c d e f a b c d e f Copies with a b c d e fa b x y e f Variations 67
  68. 68. S.Ducasse LSE Butterflies 68
  69. 69. S.Ducasse LSE We are interested in your problems! • Remodularization/Repackaging • SOA - Service Identification • Architecture Extraction/Validation • Software Quality • Cost prediction • EJB Analysis • Business rules extraction • Model transformation 69
  70. 70. S.Ducasse LSE Evolution is difficult • We are interested in your problems! • Moose is open-source, you can use it, extend it, change it • We can collaborate! 70 } { } { } { } { } { NOM > 10 & LOC > 100

×