Dependencies and Complexity
John Murray, Senior Product Engineer
Agenda
• Treating Code As Data
• Structure101
• The Architecture of Code
• Tangled Code and the LSM
• Complex Code
• Yuzinji In Operation
• Questions and Comments
Mining the Codebase – Treating Code as Data
• From its earliest days George James Software has developed tools to extract information from code
• Core engine is a table-driven parser written in Caché ObjectScript (COS)
• Over the years we have developed new parse tables and ancillary processing to meet specific needs, e.g.
  • Y2K conversion projects – finding date handling code
  • Global references – locate and transform
  • Coding standards – e.g. use of deprecated commands
• Effective communication of such results demands increasing UI sophistication
• As a small company we seek to leverage other tools rather than trying to build everything from scratch
Structure101
• In 2012 we identified an interesting set of tools from an Irish software company
• The Structure101 tools were originally created for Java developers, but later released as a “generic” variant capable of being extended by language-specific plug-ins called flavors
• Using our parser technology we created a flavor for InterSystems class definitions and traditional COS routines
• We bundle Structure101g Studio plus our flavor as a tool called Yuzinji
• Structure101g Studio is described by its creators as an Architecture Development Environment
Code Architecture
• Modularity and Connectedness
• Modularity
  • Methods
  • Classes
  • Packages and Package Hierarchy
• Connectedness
  • Definitional, e.g.
    • extends
    • is-of-type
    • returns
  • Executional, e.g.
    • invokes-method
    • queries
    • calls-label
    • jumps-to
Tangled Code
• Connectedness leads to dependencies
  • Not inherently bad, but need to be understood
• Levelized Structure Map (LSM)
  • A key innovation of Structure101
  • Positions modules so dependencies point downward
  • Modules not used by any other module float to top
  • Modules not using any other module sink to bottom
  • Other modules find their levels somewhere between
• Dependency cycles (“tangles”) cause upward-pointing connections
  • Can make codebases harder to understand, modify, extend, test and deploy
  • Appearance of new cycles may be evidence of unwanted architectural degradation
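
The levelizing idea on this slide can be illustrated with a small Python sketch. This is not Structure101's actual algorithm, which the slides do not describe. It assumes a hypothetical `deps` mapping from each module name to the set of modules it uses, groups mutually dependent modules into tangles with Tarjan's strongly connected components algorithm, and assigns each module a level counted up from the bottom. Because levels are counted from the bottom only, an unused module sits just above whatever it depends on and is not forced to the top row.

```python
def find_tangles(deps):
    """Group modules into strongly connected components (Tarjan's algorithm).

    Components are returned so that every component appears after the
    components it depends on.
    """
    index, low = {}, {}
    stack, on_stack = [], set()
    components = []
    counter = [0]

    def visit(node):
        index[node] = low[node] = counter[0]
        counter[0] += 1
        stack.append(node)
        on_stack.add(node)
        for target in deps.get(node, ()):
            if target not in index:
                visit(target)
                low[node] = min(low[node], low[target])
            elif target in on_stack:
                low[node] = min(low[node], index[target])
        if low[node] == index[node]:
            # node is the root of a component: pop its members off the stack
            component = set()
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.add(member)
                if member == node:
                    break
            components.append(component)

    nodes = set(deps) | {t for targets in deps.values() for t in targets}
    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return components


def levelize(deps):
    """Return (levels, tangles): level 0 is the bottom row of the map."""
    components = find_tangles(deps)
    owner = {m: i for i, comp in enumerate(components) for m in comp}
    comp_level = {}
    for i, comp in enumerate(components):
        # components this one depends on; they were emitted earlier
        below = {owner[t] for m in comp for t in deps.get(m, ()) if owner[t] != i}
        comp_level[i] = 1 + max((comp_level[b] for b in below), default=-1)
    levels = {m: comp_level[owner[m]] for m in owner}
    tangles = [comp for comp in components if len(comp) > 1]
    return levels, tangles


# Hypothetical example: Service and Data depend on each other (a tangle).
example = {
    "UI": {"Service"},
    "Service": {"Data", "Util"},
    "Data": {"Util", "Service"},
    "Util": set(),
}
levels, tangles = levelize(example)
```

In this example `Util` lands on level 0, the `Service`/`Data` tangle shares level 1, and `UI` sits on level 2. Placing a tangle's members on one level is a simplification chosen for the sketch; inside a tangle, at least one dependency necessarily points upward however the members are arranged.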
Complex Code
• Most software systems can be called “complex”
• Challenge is to manage complexity
  • Eliminate / control regions of excessive complexity
• Too much being done in a single module
  • Hard to “keep it all in mind”
• Too many possible paths through the module
  • Hard to understand / follow
  • Hard to test thoroughly
• Metrics
  • Lines of Code (LoC)
  • McCabe Cyclomatic Complexity (CC) within a method
  • XS – excessive structural complexity
Yuzinji in Operation
• Launch from a Caché Studio add-in
• Analyze selected packages or entire Studio project
• Load the results into Structure101g Studio
• Optionally:
  • Publish results to a local or remote S101 repository
  • View repository using a free web application
  • Compare repository snapshots, seeing changes over time
Demo of Yuzinji
Yuzinji Beyond Complexity and Dependencies
• With Structure101g Studio you can do more, e.g.
  • Define your application architecture
  • Communicate this to developers via diagrams
  • Detect violations of the defined architecture
  • Model a set of refactoring steps, in order to investigate their effect on complexity
  • Provide todo lists to drive actual refactoring of code
Further Info
• http://georgejames.com
• http://structure101.com
• Visit George James Software at the Partner Pavilion

Dependencies & Complexity - Using Structure101 Studio on your InterSystems Caché ObjectScript codebases


Editor's Notes

  • #6 As in Java, our packages can’t actually contain subpackages. But the naming convention typically communicates something about Form.REST and Form.JSON being elements of a higher-level Form package. Java 9 will introduce Modules. See the following refs: https://en.wikipedia.org/wiki/Java_package#Modules https://en.wikipedia.org/wiki/Java_Platform_Module_System http://openjdk.java.net/projects/jigsaw/spec/sotms/
  • #8 LoC is arguably of limited value. In COS, developers may reduce it artificially by putting multiple commands on a single line.
    S101 uses CC as a metric of “fat” within a method. At levels above the method, the number of dependencies is what’s counted. For example, the “fatness” of a class is how many dependencies there are between class members (e.g. methods). The fatness of a package is how many sets of dependencies there are between classes within the package. The same counting applies to subpackages: if we have packages X.A, X.B and X.C, with A depending on B and C, and B depending on A and C, then the fatness of X is 4. Note that it doesn’t matter how many ways anything in A depends on something in B; for the fatness calculation of X this counts as one dependency. Put another way, we count the number of edges in the dependency graph, in which each edge is unidirectional.
    The S101 design tangledness metric applies at the package level and is the ratio of upward-pointing dependencies to the total number of dependencies. For example, in the Wasabi package there are 15 upward-pointing dependencies (i.e. in the minimum feedback set) out of a total of 57 dependencies in any direction. So Wasabi is 26.32% tangled (15/57). Note that the slice tangledness metric is computed differently.
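
The two calculations in note #8 can be checked with a short Python sketch. The function names are hypothetical and this is only the arithmetic described in the note, not code from Structure101. Fatness counts each distinct directed edge between sibling items once, and design tangledness is the share of dependencies that point upward.

```python
def fatness(dependencies):
    """Count distinct directed edges; repeated (source, target) pairs count once."""
    return len({(src, dst) for src, dst in dependencies if src != dst})


def tangledness_percent(upward, total):
    """Upward-pointing dependencies as a percentage of all dependencies."""
    if total == 0:
        return 0.0
    return round(100.0 * upward / total, 2)


# Package X from the note: A uses B and C, B uses A and C.
# The duplicate A -> B entry shows that multiple usages collapse to one edge.
x_dependencies = [("A", "B"), ("A", "B"), ("A", "C"), ("B", "A"), ("B", "C")]

print(fatness(x_dependencies))      # 4
print(tangledness_percent(15, 57))  # 26.32
```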