1. AN EMPIRICAL STUDY OF THE RELATION BETWEEN STRONG CHANGE COUPLING AND DEFECTS USING HISTORY AND SOCIAL METRICS IN THE APACHE ARIES PROJECT
Igor Wiese, Rodrigo Kuroda, Reginaldo Ré, Gustavo Oliva, Marco Gerosa
2. MOTIVATION
Software systems comprise artifacts that depend on each other.
+ Dependencies
+ Maintenance effort [Banker et al. 1998]
+ Defects [D’Ambros et al. 2009b, Cataldo et al. 2009, Kirbas et al. 2014]
+ Ripple effects [Pressman 2001]
4. CHANGE COUPLING
[Figure: timeline of commits to artifacts a1 and a2, marking the commits in which they co-change]
A change coupling indicates that two artifacts
changed together (co-changed) in the past,
making them evolutionarily connected.
5. WHY IS CHANGE COUPLING IMPORTANT?
Change coupling is relevant to:
• Correlation with defects [D’Ambros et al. 2009b, Kirbas et al. 2014]
• Change impact analysis [Zimmermann et al. 2005, Zhou et al. 2008, Hassan 2009]
• Code refactoring [D’Ambros et al. 2009a]
• Understanding the software architecture [Zimmermann et al. 2003]
• Modularity assessment [Silva et al. 2014]
• Tracing requirements [Ali et al. 2013]
Change couplings reveal relationships not present in
the code or in the documentation.
6. THE MAIN GOALS OF THIS PAPER ARE:
(1) Investigate the relation between strong change couplings and the number of defects (RQ1)
(2) Characterize strong change couplings using historical and social metrics (RQ2)
(3) Predict defects associated with strong change couplings (RQ3)
7. METHODOLOGY
Data collection
Identify the strong change couplings
Compute the set of metrics
Build the classifiers
RQ1: Are strong change couplings related to defects?
RQ2: Can historical and social metrics identify if a change coupling is strong?
RQ3: Can we predict defects associated with strong change couplings?
8. IDENTIFYING STRONG CHANGE COUPLINGS
Support
Number of transactions in which both artifacts changed together (in the example, support = 3).
Confidence
The confidence for the change coupling from A to B would be 3/9 = 0.33.
In turn, the confidence for the change coupling from B to A would be 3/4 = 0.75.
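The arithmetic above can be reproduced with a minimal sketch. It assumes that the denominators 9 and 4 are the total numbers of commits touching each artifact, and it uses hypothetical artifact names "x" and "y" so that it does not claim which artifact in the figure changed 9 times. The function names and the commit history are illustrative, not taken from the study.

```python
from collections import Counter
from itertools import combinations


def count_changes(commits):
    """Return (changes per artifact, support per unordered artifact pair)."""
    changes = Counter()
    support = Counter()
    for artifacts in commits:
        unique = sorted(set(artifacts))
        changes.update(unique)                    # one change per artifact per commit
        support.update(combinations(unique, 2))   # one co-change per pair per commit
    return changes, support


def confidence(pair_support, artifact_changes):
    """Support normalized by the total changes of one artifact of the pair."""
    return pair_support / artifact_changes


# Hypothetical history: "x" changes in 9 commits, "y" in 4, both together in 3.
commits = [{"x", "y"}] * 3 + [{"x"}] * 6 + [{"y"}]
changes, support = count_changes(commits)
pair_support = support[("x", "y")]
conf_over_x = confidence(pair_support, changes["x"])  # 3 / 9
conf_over_y = confidence(pair_support, changes["y"])  # 3 / 4
```

Support is symmetric, while confidence depends on which artifact's change count is used as the denominator, which is why the two directions give different values.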
Strong change coupling
After ranking all change couplings by support value, we used a quartile analysis to
determine the relevant change couplings. All couplings with support value higher
than the third quartile were labeled as “strong”.
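The quartile rule can be sketched as follows. The support values are hypothetical, and the quantile method (Python's default "exclusive" interpolation) is an assumption, since the slides do not say how the third quartile was computed.

```python
from statistics import quantiles


def label_strong(support_by_pair):
    """Label as strong every coupling whose support exceeds the third quartile."""
    # quantiles(..., n=4) returns the three quartile cut points.
    _, _, third_quartile = quantiles(support_by_pair.values(), n=4)
    return {pair for pair, s in support_by_pair.items() if s > third_quartile}


# Hypothetical support values for eight change couplings.
support_by_pair = {
    ("a", "b"): 1, ("a", "c"): 1, ("a", "d"): 2, ("b", "c"): 2,
    ("b", "d"): 3, ("c", "d"): 4, ("c", "e"): 5, ("d", "e"): 8,
}
strong = label_strong(support_by_pair)
```

With these values only the two couplings with support 5 and 8 lie above the third quartile.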
[Figure: timeline of commits to Artifact A and Artifact B, marking the commits in which they co-change]
9. APACHE ARIES
The Aries project delivers a set of pluggable Java components enabling an enterprise OSGi application programming model.
The project has had 3,901 commits made by 35 contributors, representing 189,241 lines of code.
397 commits, up 77 (24%) from the previous 12 months
11 contributors, up 1 (10%) from the previous 12 months
We collected data from six releases, 556 issue reports, and five years of software development history.
10. RQ1: ARE STRONG CHANGE COUPLINGS RELATED TO DEFECTS?
The majority of the strong change couplings could be
associated with at least one defect.
Strong change couplings are moderately correlated with the number of defects (rho = 0.46, p < 0.001).
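As an illustration of how such a coefficient can be computed, here is a minimal sketch assuming that rho denotes Spearman's rank correlation (the slide does not name the coefficient). The helper names and the sample data are hypothetical, and the sketch computes only rho, not the p-value.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical data: support of each coupling and its number of linked defects.
supports = [3, 5, 8, 13]
defects = [0, 1, 4, 9]
rho = spearman_rho(supports, defects)  # perfectly monotonic, so rho is 1
```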
11. RQ2: CAN HISTORICAL AND SOCIAL METRICS IDENTIFY IF A CHANGE COUPLING IS STRONG?
To answer this question, we conducted two studies:
• Cross-validation within each release
• Training on one release and testing on the next consecutive release
Cross-Validation Results:
• Models with high accuracy
• AUC ranging from 88% to 99%
• F-measure ranging from 70% to 99%
Inter-Release Train/Test Results:
• Models correctly predicted between 56% and 100% of the strong change couplings
• Problems arise when the training release has few instances
12. RQ2: WHICH ARE THE BEST METRICS TO IDENTIFY STRONG CHANGE COUPLINGS?
• On average, 5 metrics were used (the models were built from a set of 16 metrics)
• The best subset selected:
• The length of discussion (number of words) – 4 releases
• The number of distinct committers – 3 releases
• The experience of committers – 2 releases
• The number of associated defect issues – 2 releases
• The number of weeks between the first and last commit – 2 releases
13. RQ3: CAN WE PREDICT DEFECTS ASSOCIATED WITH STRONG CHANGE COUPLINGS?
We found 81 strong change couplings that were changed post-release to fix defects. We correctly predicted 37 of them (45.67%).
14. CONCLUSIONS
• We studied the relationship between strong change couplings
and defects
• We developed models with high accuracy to identify strong
change couplings
• In some cases, it is necessary to group the data.
• On average, 5 metrics were necessary to get better results
• 6 metrics were selected in at least two releases
• We correctly predicted 45.67% of the strong change couplings changed post-release to fix defects
• We want to go deeper and investigate the ways in which strong change couplings can influence code quality
• Build tools to monitor and track the “damage” caused by these couplings
• Change propagation analysis using contextual information