5. • If change CA1 implements FA and change CB1 implements FB
• If a change CA2 is added to modify FA and CA2 is dependent on CB1
[Figure: Feature / Code Changes / Missing Dependencies]
• Integrate FA: code changes CA1, CA2; missing dependency CB1
• Integrate FB: code change CB1
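The scenario on this slide can be sketched in a few lines of Python. This is only an illustration under assumptions: the dictionaries `feature_of` and `depends_on` and the function `missing_dependencies` are hypothetical names, not part of the presented tooling. The sketch lists the changes a feature's commits rely on but that are mapped to a different feature.

```python
# Hypothetical data mirroring the slide: CA1 and CA2 implement FA, CB1 implements FB,
# and CA2 depends on CB1.
feature_of = {"CA1": "FA", "CA2": "FA", "CB1": "FB"}
depends_on = {"CA2": {"CB1"}}


def missing_dependencies(feature, feature_of, depends_on):
    """Return changes required by `feature`'s changes but mapped to other features."""
    picked = {change for change, feat in feature_of.items() if feat == feature}
    missing = set()
    stack = list(picked)
    while stack:  # follow dependencies transitively
        change = stack.pop()
        for dep in depends_on.get(change, set()):
            if dep not in picked and dep not in missing:
                missing.add(dep)
                stack.append(dep)
    return missing


missing_fa = missing_dependencies("FA", feature_of, depends_on)
missing_fb = missing_dependencies("FB", feature_of, depends_on)
print(sorted(missing_fa))  # ['CB1']
print(sorted(missing_fb))  # []
```

Integrating FA by picking only the changes mapped to FA would leave out CB1, which is the kind of missing dependency the grouping approaches aim to catch.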
7. Approach overview:
• Define Dissimilarity Metrics
• Calibrate the Metrics on Prior Versions
• Commit Assignment Algorithm
– Automated Grouping (during Integration)
– Developer Guided Grouping (during Development)
8. Given two commits characterized by files, developers and change requests (CRs):
Metric: Description
• File Dependency Distance (FD): Captures source code dependencies between files involved in two commits
• File Association Distance (FA): Captures logical dependencies between files involved in two commits
• Developer Dissimilarity Distance (DD): Captures the working relation between two developers submitting commits
• CR Dependency Distance (CRD): Captures the dissimilarity between the CRs implemented by two commits
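10. For each of the four metrics:
• Min_Threshold = Avg(a)
• Max_Threshold = Avg(bmin)
• Silhouette = Avg{(bmin - a) / max(bmin, a)}
Here a is the distance from a commit to its own group, and bmin is the smallest of its distances to the other groups (b1, b2, b3 in the figure).
A higher silhouette value is better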
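As a rough illustration of this calibration step, the sketch below computes the three quantities for one metric from a pairwise distance matrix and the known grouping of a prior version. The function name `calibrate`, the toy matrix, and the choice to skip commits that are alone in their group are assumptions made for the example. Here a and bmin are taken as average distances, as in the standard silhouette definition.

```python
def calibrate(dist, groups):
    """dist[i][j]: distance between commits i and j under one metric.
    groups[i]: known group label of commit i in a prior version.
    Returns (min_threshold, max_threshold, silhouette)."""
    n = len(groups)
    labels = set(groups)
    a_vals, b_vals, s_vals = [], [], []
    for i in range(n):
        own = [dist[i][j] for j in range(n) if j != i and groups[j] == groups[i]]
        if not own:
            continue  # assumption: skip commits that are alone in their group
        a = sum(own) / len(own)  # average distance to the commit's own group
        b_candidates = []
        for g in labels:
            if g == groups[i]:
                continue
            other = [dist[i][j] for j in range(n) if groups[j] == g]
            b_candidates.append(sum(other) / len(other))
        if not b_candidates:
            continue  # only one group exists, bmin is undefined
        b_min = min(b_candidates)  # average distance to the nearest other group
        denom = max(b_min, a)
        a_vals.append(a)
        b_vals.append(b_min)
        s_vals.append((b_min - a) / denom if denom > 0 else 0.0)
    if not s_vals:
        raise ValueError("need at least two groups and one group with two commits")
    avg = lambda xs: sum(xs) / len(xs)
    return avg(a_vals), avg(b_vals), avg(s_vals)


# Toy data: commits 0 and 1 form one group, commits 2 and 3 another.
toy_dist = [
    [0.0, 0.1, 0.8, 0.9],
    [0.1, 0.0, 0.7, 0.8],
    [0.8, 0.7, 0.0, 0.2],
    [0.9, 0.8, 0.2, 0.0],
]
min_t, max_t, sil = calibrate(toy_dist, [0, 0, 1, 1])
print(round(min_t, 2), round(max_t, 2), round(sil, 2))
```

A silhouette close to 1 on the toy data reflects that distances within groups are much smaller than distances across groups.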
12. • Apply the similarity metrics in order of their precedence
• If no suitable group is found for a commit, assign the commit to a new group
[Figure: illustration of metric precedence, Color > Shape]
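The two rules above can be sketched as follows, reusing the Color > Shape illustration from the slide. This is a minimal sketch under assumptions, not the presented algorithm: the function `assign`, the 0/1 toy distances, and the rule that a group is "suitable" when the commit's average distance to its members is at most the metric's maximum threshold are all hypothetical.

```python
def assign(commit, groups, metrics):
    """Add `commit` to a group and return that group's index.
    metrics: list of (distance_fn, max_threshold) in precedence order."""
    for distance, max_threshold in metrics:
        best_idx, best_avg = None, None
        for idx, members in enumerate(groups):
            avg = sum(distance(commit, m) for m in members) / len(members)
            # Assumption: a group is suitable if the average distance is within the threshold.
            if avg <= max_threshold and (best_avg is None or avg < best_avg):
                best_idx, best_avg = idx, avg
        if best_idx is not None:
            groups[best_idx].append(commit)
            return best_idx
    groups.append([commit])  # no suitable group under any metric
    return len(groups) - 1


# Toy metrics: distance is 0 when the attribute matches, 1 otherwise.
color = lambda x, y: 0.0 if x["color"] == y["color"] else 1.0
shape = lambda x, y: 0.0 if x["shape"] == y["shape"] else 1.0
metrics = [(color, 0.5), (shape, 0.5)]  # Color > Shape

groups = [
    [{"color": "red", "shape": "circle"}],
    [{"color": "blue", "shape": "square"}],
]
first = assign({"color": "red", "shape": "square"}, groups, metrics)
second = assign({"color": "green", "shape": "square"}, groups, metrics)
third = assign({"color": "yellow", "shape": "triangle"}, groups, metrics)
print(first, second, third)  # 0 1 2
```

The red square joins the red group even though its shape matches the other group, because color is consulted first. The yellow triangle matches nothing and starts a new group.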
14. Developer-guided grouping groups commits incrementally and uses developers’ feedback to improve the grouping during development.
Both approaches follow the k-means clustering method, which consists of assigning each item to the cluster with the nearest mean.
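The nearest-mean rule mentioned above can be sketched as below. The representation of commits as numeric tuples, the function names, and the optional `developer_choice` argument standing in for developer feedback are assumptions made for illustration only.

```python
import math


def mean(points):
    """Component-wise mean of a non-empty list of equal-length tuples."""
    dims = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dims))


def nearest_mean(item, clusters):
    """Index of the cluster whose mean is closest to `item` (Euclidean distance)."""
    means = [mean(c) for c in clusters]
    return min(range(len(clusters)), key=lambda k: math.dist(item, means[k]))


def add_incrementally(item, clusters, developer_choice=None):
    """Assign one new item; a developer's explicit choice (hypothetical) overrides the rule."""
    k = developer_choice if developer_choice is not None else nearest_mean(item, clusters)
    clusters[k].append(item)
    return k


clusters = [
    [(0.0, 0.0), (0.2, 0.1)],
    [(1.0, 1.0), (0.9, 1.1)],
]
k1 = add_incrementally((0.1, 0.3), clusters)
k2 = add_incrementally((0.8, 0.9), clusters)
k3 = add_incrementally((0.5, 0.5), clusters, developer_choice=1)
print(k1, k2, k3)  # 0 1 1
```

Because each cluster mean is recomputed from its current members, every accepted or corrected assignment influences where later commits land.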
15. We analyzed three major versions of a family of mobile applications
16. • Validate the dissimilarity metrics
Can the proposed metrics be used to identify commit dependencies?
• Validate the grouping approaches
How efficient are our proposed grouping approaches?
• Value for Developers
Can the proposed approaches identify commit dependencies missed by developers?
17. The four similarity metrics display good abilities in grouping commits (i.e. high silhouette values)
[Figure: bar chart of Silhouette Value (axis 0 to 1) for the metrics CRD, FA, DD and FD across Version 1, Version 2 and Version 3; plotted values range from 0.46 to 0.96]
CRD > FA > DD > FD
18. • Efficiency of the Grouping Approaches
– 82% of commit dependencies were recovered by the automated grouping with a precision of 95%
– The accuracy of the developer-guided grouping approach is 98%
– We observed that precision and recall improve with longer history data
• Value for Developers
– The automated grouping and developer-guided grouping approaches were able to reduce integration failures by 76% and 94% respectively
Software product lines allow the development of similar products using common software components.
Whenever developers modify the main branch, integrators selectively propagate the modifications to the respective products by picking the changes relevant to each product.
To ensure the success of this selective integration, development teams attempt to maintain clear mappings between the code changes made by developers and the features of the products. However, this mapping is not always maintained carefully, which makes the integration process very brittle.