These are the slides for the first presentation at the InterSystems Developer Community Meetup in the UK on 17th October 2017. The associated video and audio are at https://www.youtube.com/watch?v=yOxpuzXmCms
2. Agenda
Treating Code As Data
Structure101
The Architecture of Code
Tangled Code and the LSM
Complex Code
Yuzinji In Operation
Questions and Comments
3. Mining the Codebase – Treating Code as Data
From its earliest days George James Software has
developed tools to extract information from code
Core engine is a table-driven parser written in COS
Over the years we have developed new parse tables
and ancillary processing to meet specific needs, e.g.
Y2K conversion projects – finding date handling code
Global references – locate and transform
Coding standards – e.g. use of deprecated commands
Effective communication of such results demands
increasing UI sophistication
As a small company we seek to leverage other tools
rather than trying to build everything from scratch
4. Structure101
In 2012 we identified an interesting set of tools
from an Irish software company
The Structure101 tools were originally created for
Java developers, but later released as a
“generic” variant capable of being extended by
language-specific plug-ins called flavors
Using our parser technology we created a flavor
for InterSystems class definitions and traditional
COS routines
We bundle Structure101g Studio plus our flavor
as a tool called Yuzinji
Structure101g Studio is described by its creators
as an Architecture Development Environment
5. Code Architecture
Modularity and Connectedness
Modularity
Methods
Classes
Packages and Package Hierarchy
Connectedness
Definitional, e.g.
extends
is-of-type
returns
Executional, e.g.
invokes-method
queries
calls-label
jumps-to
6. Tangled Code
Connectedness leads to dependencies
Not inherently bad, but need to be understood
Levelized Structure Map (LSM)
A key innovation of Structure101
Positions modules so dependencies point downward
Modules not used by any other module float to top
Modules not using any other module sink to bottom
Other modules find their levels somewhere between
Dependency cycles (“tangles”) cause upward-pointing
connections
Can make codebases harder to understand, modify,
extend, test and deploy
Appearance of new cycles may be evidence of
unwanted architectural degradation
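The levelization idea above can be sketched in a few lines of Python. This is only an illustration of the principle, not Structure101's actual algorithm: the function name `levelize` and the module names are hypothetical, levels are assigned by a simple depth-first walk, and the edges it reports as upward-pointing depend on visit order, so they are not necessarily a minimum feedback set.

```python
def levelize(deps):
    """Assign a level to each module so that dependencies point downward.

    deps maps a module name to the set of modules it uses.
    Returns (levels, upward): levels maps module -> level (0 = bottom),
    upward lists the (user, used) edges that close a cycle.
    """
    modules = set(deps) | {m for used in deps.values() for m in used}
    levels = {}
    upward = []
    in_progress = set()

    def visit(module):
        if module in levels:
            return levels[module]
        in_progress.add(module)
        level = 0  # a module that uses nothing sinks to the bottom
        for used in sorted(deps.get(module, ())):
            if used in in_progress:
                # Edge closes a cycle (a "tangle"): it must point upward.
                upward.append((module, used))
                continue
            level = max(level, visit(used) + 1)
        in_progress.discard(module)
        levels[module] = level
        return level

    for module in sorted(modules):
        visit(module)
    return levels, upward


# Hypothetical four-module codebase with no cycles.
levels, upward = levelize({
    "UI": {"Service"},
    "Service": {"Data", "Util"},
    "Data": {"Util"},
    "Util": set(),
})
print(levels, upward)
```

In the acyclic example every module sits one level above the highest module it uses, and the list of upward-pointing edges is empty. Adding a dependency from `Util` back to `Service` would create a tangle and at least one upward-pointing edge.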
7. Complex Code
Most software systems can be called “complex”
Challenge is to manage complexity
Eliminate / control regions of excessive complexity
Too much being done in a single module
Hard to “keep it all in mind”
Too many possible paths through the module
Hard to understand / follow
Hard to test thoroughly
Metrics
Lines of Code (LoC)
McCabe Cyclomatic Complexity (CC) within a
method
XS – excessive structural complexity
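As a rough illustration of the CC metric, McCabe's cyclomatic complexity of a single method can be computed from its control-flow graph as E - N + 2, where E is the number of edges and N the number of nodes. The sketch below assumes the graph is already available as a list of (from, to) pairs; the function and node names are hypothetical, and this is not how Yuzinji or Structure101 derive the figure.

```python
def cyclomatic_complexity(edges):
    """McCabe's CC for one connected control-flow graph: E - N + 2."""
    nodes = {node for edge in edges for node in edge}
    return len(edges) - len(nodes) + 2


# Hypothetical method containing a single if/else.
if_else = [
    ("entry", "then"),
    ("entry", "else"),
    ("then", "exit"),
    ("else", "exit"),
]

# Hypothetical method containing a single loop.
loop = [
    ("entry", "test"),
    ("test", "body"),
    ("body", "test"),
    ("test", "exit"),
]

print(cyclomatic_complexity(if_else), cyclomatic_complexity(loop))
```

Each additional decision point adds one independent path through the method, which is why a high CC makes a method harder to follow and to test thoroughly.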
8. Yuzinji in Operation
Launch from a Caché Studio add-in
Analyze selected packages or entire Studio project
Load the results into Structure101g Studio
Optionally:
Publish results to a local or remote S101 repository
View repository using a free web application
Compare repository snapshots, seeing changes over
time
10. Yuzinji Beyond Complexity and Dependencies
With Structure101g Studio you can do more, e.g.
Define your application architecture
Communicate this to developers via diagrams
Detect violations of the defined architecture
Model a set of refactoring steps, in order to
investigate their effect on complexity
Provide todo lists to drive actual refactoring of code
As in Java, our packages can’t actually contain subpackages. But the naming convention typically communicates that, for example, Form.REST and Form.JSON are elements of a higher-level Form package.
Java 9 introduces modules. See the following references:
https://en.wikipedia.org/wiki/Java_package#Modules
https://en.wikipedia.org/wiki/Java_Platform_Module_System
http://openjdk.java.net/projects/jigsaw/spec/sotms/
LoC is arguably of limited value. In COS, developers may reduce it artificially by putting multiple commands on a single line.
S101 uses CC as a metric of “fat” within a method. At levels above the method, the number of dependencies is what’s counted. For example, the “fatness” of a class is how many dependencies there are between class members (e.g. methods). The fatness of a package is how many sets of dependencies there are between classes within the package. E.g. if we have packages X.A, X.B and X.C, with A depending on B and C, and B depending on A and C, then the fatness of X is 4. Note that it doesn’t matter how many ways anything in A depends on something in B; for the fatness of X this counts as one dependency. Put another way, we count the number of edges in the dependency graph, in which each edge is unidirectional.
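The X.A, X.B, X.C example can be reproduced with a short Python sketch. It assumes item-level dependencies are given as (from, to) pairs and that each item's containing package is known; the names `fatness`, `parent_of` and the class names are hypothetical, and S101's own computation may differ in detail.

```python
def fatness(item_deps, parent_of):
    """Count distinct directed edges between the children of a package.

    item_deps: iterable of (from_item, to_item) dependencies.
    parent_of: maps each item to the child package that contains it.
    Many item-level dependencies from one child to another count once.
    """
    edges = set()
    for src, dst in item_deps:
        a, b = parent_of[src], parent_of[dst]
        if a != b:  # dependencies inside one child are not counted here
            edges.add((a, b))
    return len(edges)


# Hypothetical classes inside packages X.A, X.B and X.C.
parent_of = {
    "X.A.Foo": "X.A",
    "X.B.Bar": "X.B",
    "X.B.Baz": "X.B",
    "X.C.Qux": "X.C",
}
item_deps = [
    ("X.A.Foo", "X.B.Bar"),  # A -> B
    ("X.A.Foo", "X.B.Baz"),  # A -> B again, still one edge
    ("X.A.Foo", "X.C.Qux"),  # A -> C
    ("X.B.Bar", "X.A.Foo"),  # B -> A
    ("X.B.Baz", "X.C.Qux"),  # B -> C
]
print(fatness(item_deps, parent_of))  # 4
```

Using a set of (from, to) pairs is what makes each edge unidirectional and counted once, however many ways A depends on B.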
The S101 design tangledness metric applies at the package level and is the ratio of upward-pointing dependencies to the total number of dependencies. For example, in the Wasabi package there are 15 upward-pointing dependencies (i.e. in the minimum feedback set) out of a total of 57 dependencies in any direction. So Wasabi is 26.32% tangled (15/57).
Note that the slice tangledness metric is computed differently.
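The design tangledness ratio described above is simple arithmetic, shown here with the Wasabi figures. The function name is hypothetical, and returning 0 for a package with no dependencies is an assumption made for this sketch rather than documented S101 behaviour.

```python
def design_tangledness(upward, total):
    """Percentage of dependencies that point upward in the LSM."""
    if total == 0:
        return 0.0  # assumption: no dependencies means nothing is tangled
    return 100.0 * upward / total


# Wasabi: 15 upward-pointing dependencies out of 57 in total.
print(f"{design_tangledness(15, 57):.2f}%")  # 26.32%
```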