1. The Interplay between Semantic Coupling and Co-Change of Software Classes (journal first)
Nemitari Ajienka – Edge Hill University (UK)
Andrea Capiluppi – Brunel University London (UK)
Steve Counsell – Brunel University London (UK)
2. ICSE 2018, Gothenburg (A. Capiluppi)
Outline
Rationale
Definitions: Semantic coupling and Co-change
Experimental set-up
Results
Conclusion
3. Rationale – Software changes: origin and impact
[Generated by Doxygen and Graphviz]
Certain classes tend to change more often than others
Identify patterns or metrics that characterise those classes
4. Definitions
Semantic coupling
– Degree of relationship between classes’ semantic content
Co-change (Logical coupling)
– Based on historical data
– Classes changed in the same timeframe (day? week? commit?)
7. Research questions
RQ1: Is there a linear relationship between semantic and logical coupling?
– Are semantically very similar classes bound to co-evolve more often?
RQ2: Is there a directional relationship between semantic and logical coupling?
– If A and B co-evolve, does that mean they are semantically linked, or
– If A and B are semantically similar, will they co-evolve?
12. Co-evolution data (logical coupling)
Per project
Per revision
Per pair of OO classes
“What is the likelihood that classes A and B co-evolve, based on historical data?”
– Low, medium or high likelihood
13. Logical coupling: operationalisation
Support
– Class A was modified in 3 transactions
– 2 of those also included changes to C
– Support for A → C is 2
Confidence
– Confidence for A → C (“C depends on A”) is 2/3 = 0.67
– Confidence for C → A (“A depends on C”) is 2/4 = 0.5 (the denominator implies that C was modified in 4 transactions)
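The support and confidence figures above can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' tooling: the function name `support_and_confidence` and the toy `history` are assumptions made for illustration, with the history chosen so that A appears in 3 transactions, C in 4, and both together in 2.

```python
from collections import Counter
from itertools import permutations


def support_and_confidence(transactions):
    """Return {(a, b): (support, confidence)} for every ordered pair of classes.

    support(a -> b)    = number of transactions that changed both a and b
    confidence(a -> b) = support(a -> b) / number of transactions that changed a
    """
    changes = Counter()   # transactions touching each class
    together = Counter()  # transactions touching both classes of an ordered pair
    for tx in transactions:
        classes = set(tx)
        changes.update(classes)
        together.update(permutations(sorted(classes), 2))
    return {(a, b): (n, n / changes[a]) for (a, b), n in together.items()}


# Hypothetical change history (one set of class names per transaction).
history = [
    {"A", "C"},
    {"A", "C"},
    {"A"},
    {"C"},
    {"C"},
]

rules = support_and_confidence(history)
print(rules[("A", "C")])  # support 2, confidence 2/3
print(rules[("C", "A")])  # support 2, confidence 2/4
```

Support is symmetric while confidence is not, which is what makes the directional reading ("C depends on A" versus "A depends on C") possible.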
14. Semantic coupling: operationalisation
Per project
All revisions
Pair of classes
UrSQLController vs UrSQLEntry
– N-gram similarity of 0.6 for n-grams of n=4
Vector Space Model (VSM): text corpora (full code)
N-gram technique: small sentences (class identifiers)
Disco word synonyms: small sentences (class identifiers)
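To make the identifier-based idea concrete, here is one possible way to score two class names by their shared character 4-grams. It is a sketch under assumptions: the slides do not say which n-gram similarity formula was used, so the Dice coefficient below is a stand-in, and its score for UrSQLController vs UrSQLEntry need not match the 0.6 reported above. The function names are hypothetical.

```python
def char_ngrams(text, n=4):
    """Set of lower-cased character n-grams of a string."""
    text = text.lower()
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def ngram_similarity(a, b, n=4):
    """Dice coefficient over the character n-gram sets of a and b (0.0 to 1.0).

    Assumption: Dice is used here only for illustration; the measure behind
    the slide's figures may differ.
    """
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    shared = len(grams_a & grams_b)
    return 2 * shared / (len(grams_a) + len(grams_b))


score = ngram_similarity("UrSQLController", "UrSQLEntry")
print(round(score, 2))
```

Identical names score 1.0 and names with no 4-gram in common score 0.0, so the value can be thresholded into low, medium and high coupling.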
16. RQ1: linear relationship between logical and semantic coupling
Chi-squared test
Spearman’s rank correlation (ρ)
Per project, per pair of classes, in all revisions:
– All confidence metrics (logical coupling)
– All coupling strengths between pairs
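As an illustration of the rank correlation step, the sketch below computes Spearman's ρ between a list of semantic coupling strengths and a list of confidence values, one entry per class pair. The data are invented for the example and the helper names are assumptions; in practice a statistics package would normally be used instead of this hand-rolled version.

```python
def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the ranks of xs and ys."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values for five class pairs.
semantic = [0.10, 0.35, 0.60, 0.80, 0.95]
confidence = [0.50, 0.20, 0.67, 0.10, 0.40]
rho = spearman_rho(semantic, confidence)
print(round(rho, 2))
```

A ρ close to 0 means the ranking of pairs by semantic strength says little about their ranking by co-change confidence.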
17. RQ1 results
No linear relationship between the strengths of logical and semantic dependencies
Co-evolution frequency cannot be inferred from semantic strength
Using semantic coupling to predict co-change has low precision
18. RQ2: directional relationship between logical and semantic coupling
Co-changed Semantic Dependencies (CSD, in %)
– Percentage of semantic dependencies that also co-change
Semantic Logical Dependencies (SLD, in %)
– Percentage of logical dependencies that are also semantically related
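Both percentages are set overlaps, so they can be sketched directly from the two definitions above. The function name and the sample pairs are hypothetical, and treating a dependency as an unordered pair of classes is an assumption of this example.

```python
def csd_and_sld(semantic_pairs, logical_pairs):
    """Return (CSD, SLD) as percentages.

    CSD: share of semantic dependencies that also co-change.
    SLD: share of logical dependencies that are also semantically related.
    Assumption: a dependency is an unordered pair of class names.
    """
    semantic = {frozenset(pair) for pair in semantic_pairs}
    logical = {frozenset(pair) for pair in logical_pairs}
    both = semantic & logical
    csd = 100 * len(both) / len(semantic) if semantic else 0.0
    sld = 100 * len(both) / len(logical) if logical else 0.0
    return csd, sld


# Hypothetical dependencies for one project.
semantic_deps = [("A", "B"), ("A", "C")]
logical_deps = [("B", "A"), ("A", "C"), ("C", "D"), ("B", "D")]

csd, sld = csd_and_sld(semantic_deps, logical_deps)
print(csd, sld)  # 100.0 50.0
```

In this toy project every semantic dependency also co-changes (CSD of 100%), while only half of the logical dependencies are semantically related (SLD of 50%), which shows how the two directions can differ.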
19. RQ2 results
The numbers of semantic and logical dependencies are of a similar order of magnitude
In most projects, 100% of semantic dependencies are also logical dependencies
If two classes are semantically coupled, there is a high chance that they will co-change in the future
20. Serendipitous findings
Semantic coupling
– Use full source code or just identifiers?
– Which is more efficient?
Chi-squared test of independence
– VSM
– N-gram + Disco
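A chi-squared test of independence on a contingency table can be sketched as follows. The table cross-tabulates class pairs by whether each technique reports them as coupled; the counts are invented, the function name is hypothetical, and no continuity correction is applied. The p-value shortcut via `math.erfc` is valid only for 1 degree of freedom, that is, for a 2x2 table.

```python
import math


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for a contingency table.

    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof


# Hypothetical counts of class pairs.
# Rows: coupled / not coupled according to VSM (full class corpora).
# Columns: coupled / not coupled according to identifiers (N-gram + Disco).
table = [
    [30, 10],
    [10, 30],
]

stat, dof = chi_squared(table)
# Survival function of the chi-squared distribution with 1 degree of freedom.
p_value = math.erfc(math.sqrt(stat / 2))
print(stat, dof, p_value)
```

A small p-value rejects independence, meaning the two ways of measuring semantic coupling tend to agree on which pairs are coupled.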
21. Results: class corpora or identifiers?
Class corpora and identifiers are related: if one shows semantic coupling, so does the other
– Identifier-based techniques are much more effective
– N-gram is more efficient than Disco
22. Take-away messages
Very similar (highly semantically coupled) classes do not co-change more often
Semantically linked classes are very likely to co-evolve
Using identifiers instead of full corpora is an efficient and effective way of measuring semantic coupling
Work shared at https://goo.gl/eLuDbB