AN EMPIRICAL STUDY OF THE RELATION BETWEEN STRONG CHANGE COUPLING AND DEFECTS USING HISTORY AND SOCIAL METRICS IN THE APACHE ARIES PROJECT
+ Maintenance effort
[Banker et al. 1998]
[D’ambros et al. 2009b, Cataldo et al. 2009, Kirbas et al. 2014]
+ ripple effects
Software systems comprise artifacts that depend on each other.
A change coupling indicates that two artifacts changed together (co-changed) in the past, making them evolutionarily connected.
WHY IS CHANGE COUPLING IMPORTANT?
[D’ambros et al. 2009b, Kirbas et al. 2014]
Change Impact Analysis
[Zimmermann et al. 2005, Zhou et al. 2008, Hassan 2009]
[D’ambros et al. 2009a]
[Zimmermann et al.
[Silva et al. 2014]
[Ali et al. 2013]
Change couplings reveal relationships not present in
the code or in the documentation.
THE MAIN GOALS OF THIS PAPER ARE:
investigate the relation between
strong change couplings and
the number of defects (RQ1)
characterize strong change couplings
using historical and social metrics (RQ2)
predict defects associated with strong
change couplings (RQ3)
Identify the strong
set of Metrics
RQ1: Are strong change couplings
related to defects?
RQ2: Can historical and social
metrics identify if a change
coupling is strong?
RQ3: Can we predict defects
associated with strong change
couplings?
IDENTIFYING STRONG CHANGE COUPLINGS
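Support: the number of transactions in which both artifacts were changed together (here, support = 3).
Confidence is directional: the support divided by the number of transactions that changed the source artifact.
The confidence for the change coupling from A to B would be 3/9 = 0.33.
In turn, the confidence for the change coupling from B to A would be 3/4 = 0.75.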
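The arithmetic above can be reproduced with a minimal sketch. It assumes each transaction (commit) is available as a list of changed artifact names; the function names and the toy history are hypothetical, chosen only so that A changes in 9 transactions, B in 4, and both together in 3.

```python
from collections import Counter
from itertools import combinations


def change_coupling_stats(transactions):
    """Count changes per artifact and co-changes per unordered artifact pair."""
    changes = Counter()   # artifact -> number of transactions that touched it
    support = Counter()   # (artifact, artifact) -> number of joint transactions
    for files in transactions:
        unique = sorted(set(files))
        changes.update(unique)
        support.update(combinations(unique, 2))
    return support, changes


def confidence(support, changes, source, target):
    """Directional confidence: support(source, target) / changes(source)."""
    pair = tuple(sorted((source, target)))
    return support[pair] / changes[source]


# Hypothetical toy history: 3 joint commits, 6 commits of A alone, 1 of B alone.
transactions = [["A", "B"]] * 3 + [["A"]] * 6 + [["B"]]
support, changes = change_coupling_stats(transactions)

print(support[("A", "B")])                               # support of the pair
print(round(confidence(support, changes, "A", "B"), 2))  # A -> B
print(round(confidence(support, changes, "B", "A"), 2))  # B -> A
```

Support is symmetric, while confidence depends on the direction because the denominator is the change count of the source artifact.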
Strong change coupling
After ranking all change couplings by support value, we used a quartile analysis to
determine the relevant change couplings. All couplings with support value higher
than the third quartile were labeled as “strong”.
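The quartile rule can be sketched as follows. The slides do not say which quartile convention was used, so the "inclusive" method of Python's statistics.quantiles is an assumption here, and the support values are made up for illustration.

```python
import statistics


def label_strong(support_by_pair):
    """Return (strong_pairs, q3): pairs whose support is above the third quartile."""
    values = list(support_by_pair.values())
    # quantiles(..., n=4) returns the three cut points Q1, Q2, Q3.
    q3 = statistics.quantiles(values, n=4, method="inclusive")[2]
    strong = {pair for pair, s in support_by_pair.items() if s > q3}
    return strong, q3


# Hypothetical support values for five change couplings.
support_by_pair = {
    ("A", "B"): 3,
    ("A", "C"): 1,
    ("B", "C"): 1,
    ("C", "D"): 2,
    ("D", "E"): 8,
}
strong, q3 = label_strong(support_by_pair)
print(q3, strong)
```

Because the comparison is strict, a coupling whose support equals the third quartile is not labeled as strong.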
The Aries project delivers a set of pluggable Java components enabling an enterprise OSGi application programming model.
The project has had 3,901 commits made by 35 contributors, representing 189,241 lines of code.
397 commits, up 77 (24%) from the previous 12 months
11 contributors, up 1 (10%) from the previous 12 months
We collected data from six releases, 556
issue reports, five years of software
RQ1: ARE STRONG CHANGE COUPLINGS
RELATED TO DEFECTS?
The majority of the strong change couplings could be
associated with at least one defect.
Strong change couplings are moderately correlated with the number of defects (rho = 0.46, p < 0.001).
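A rank correlation of this kind can be computed as in the sketch below. It assumes the reported rho is Spearman's coefficient and that the two variables are, per coupling, its support and the number of associated defects; neither detail is stated on the slide, and the sample data is invented.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical data: support of each coupling and its number of linked defects.
supports = [3, 5, 8, 12, 4]
defects = [0, 2, 1, 4, 1]
print(round(spearman_rho(supports, defects), 2))
```

The sketch only yields the coefficient; the p-value reported on the slide would need a significance test on top of it.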
RQ2: CAN HISTORICAL AND SOCIAL METRICS
IDENTIFY IF A CHANGE COUPLING IS STRONG?
To answer this question, we conducted two studies:
• Cross-validation for each release;
• Using one release to train, the consecutive release to test
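The two evaluation set-ups can be sketched as index bookkeeping, independent of the classifier. The release labels R1 to R6 and the fold count are placeholders; the slides state only that six releases were studied.

```python
def consecutive_release_pairs(releases):
    """Pair each release (training set) with the release that follows it (test set)."""
    return list(zip(releases, releases[1:]))


def k_fold_splits(n_items, k):
    """Yield (train_indices, test_indices) for a simple k-fold cross-validation."""
    folds = [list(range(i, n_items, k)) for i in range(k)]
    for held_out in folds:
        held = set(held_out)
        train = [i for i in range(n_items) if i not in held]
        yield train, held_out


# Hypothetical labels for the six studied releases.
releases = ["R1", "R2", "R3", "R4", "R5", "R6"]
pairs = consecutive_release_pairs(releases)
print(pairs)  # five train/test pairs

# Cross-validation within a single release containing, say, 6 change couplings.
splits = list(k_fold_splits(6, 3))
for train, test in splits:
    print(train, test)
```

With six releases, the second set-up yields five train/test pairs, since the last release has no successor to test on.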
Cross-validation results:
• Models with high
• AUC ranging from 88% to 99%
• F-measure from
• Correctly predicted strong change
couplings range from 56%
• Problems when the training
release has few
RQ2: WHICH ARE THE BEST METRICS TO IDENTIFY
STRONG CHANGE COUPLINGS?
• On average, 5 metrics were used (models were
built with 16 metrics)
• The best subset selected:
• The length of discussion (number of words) – 4
• The number of distinct committers – 3 releases
• The experience of committers – 2 releases
• The number of defect issues associated with – 2
• The number of weeks between the first and last
commit – 2 releases
RQ3: CAN WE PREDICT DEFECTS ASSOCIATED
WITH STRONG CHANGE COUPLINGS?
We found 81 strong change couplings in the post-release
to fix defects. We correctly predicted 37 strong change
• We studied the relationship between strong change couplings
• We developed models with high accuracy to identify strong
• In some cases, it is necessary to group the data.
• On average, 5 metrics were necessary to get better results
• 6 metrics were selected in at least two releases.
• We correctly predicted 45.67% of strong change couplings changed
• We want to go deeper and investigate the ways in which strong
change couplings can influence code quality
• Build tools to monitor and track the “damage” caused by these
• Change propagation analysis using contextual information;