Event-Driven Architecture Masterclass: Challenges in Stream Processing
WCRE11a.ppt
1. An Exploratory Study of Macro Co-changes
Fehmi Jaafar, Yann-Gaël Guéhéneuc, Sylvie Hamel, and Giuliano Antoniol
Université de Montréal, Québec, Canada
Thursday, October 20, 2011
Outline: Introduction and context; Problems and Motivation; Macocha; Empirical study; Validation; Conclusion and Ongoing Work
Pattern Trace Identification, Detection, and Enhancement in Java
SOftware Cost-effective Change and Evolution Research Lab
2. Introduction
Context
◮ Developers must continually change their software programs to meet new requirements and user needs.
◮ Many approaches extract and analyse the changes undergone by artefacts and infer change propagation. [a]
◮ Several of these approaches identify co-changes among artefacts. [b]
[a] A. E. Hassan and R. C. Holt. ICSM 2004.
[b] Z. Xing and E. Stroulia. TSE 2005.
3. Introduction
Co-change
◮ Two artefacts are co-changing if they were changed by the same author and with the same log message in a time-window of less than 200 ms. [a]
◮ Mockus et al. [b] defined the proximity in time of check-ins as the check-in times of adjacent files differing by less than three minutes.
◮ Other studies [c] described issues in identifying atomic change sets and reported that, in all cases, they differed by a few minutes.
[a] T. Zimmermann et al. ICSE 2004.
[b] A. Mockus et al. TSE 2004.
[c] D. M. German. TSE 2006.
4. Problems and Motivation
Missing Dependencies
◮ Some files (e.g., in ArgoUML, NotationUtilityJava.java and ModelElementNameNotationUml.java) were never changed by the same developer at the same time but were changed by different developers (mvw and tfmorris) in two consecutive change periods. Existing co-change definitions miss such dependencies.
◮ Previous definitions of co-change are intrinsically limited in time. They cannot express patterns of changes across long time intervals (e.g., ArgoDiagram.java and ModeCreateAssociationClass.java were maintained by the same developer, but in changes separated by a few hours).
5. Problems and Motivation
Macro co-change
A macro co-change is two or more files that change together, i.e., they were maintained in the same change periods.
Figure: Two changes performed by one developer are sequential in time (a few hours apart); F1 and F2 are macro co-changing.
6. Problems and Motivation
Dephase macro co-change
A dephase macro co-change is two or more files that have been observed to change with the same shift s.
Figure: Files F1 and F2 are changed by different developers in two consecutive periods of time. They are dephase macro co-changing.
7. Macocha (1/9)
We propose Macocha to:
1. Mine version-control systems (CVS and SVN).
2. Identify the change periods in a program.
3. Group the program source files according to their stability through the change periods.
4. Identify among changed files those that have similar co-change patterns.
8. Macocha (2/9)
We draw inspiration from, and extend, the classical sliding window approach [a] to consider that two subsequent changes are part of one change period if they were committed:
◮ by any author;
◮ with any log message;
◮ without an interrupt between the two changes.
[a] T. Zimmermann et al. ICSE 2004.
9. Macocha (3/9)
◮ The duration of a change period is less than 40 hours. [a]
◮ If the interrupt between two subsequent changes is more than t = 5.17 hours, these two changes belong to two different change periods.
◮ An SMCC is a set of two or more files that have identical profiles during the life cycle of a program.
◮ An SDMCC is the set composed of F1 and one or more files, F2...FM, such that F2...FM always macro co-change with the same shift in time s with respect to F1.
◮ We use the Hamming distance DH to measure the amount of differences between two change profiles.
[a] L. Hatton. Computer 2007.
10. Macocha (4/9)
Figure: Analysis process of Macocha
11. Macocha (5/9)
Step 1: Detection of change periods.
Figure: Analysis of commits and creation of change periods
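Step 1 can be illustrated with a minimal sketch, assuming only the rule stated in the slides: two subsequent changes separated by more than t = 5.17 hours fall into different change periods, regardless of author or log message. The function name and the sample timestamps are hypothetical, and the sketch does not enforce the 40-hour bound on period duration.

```python
from datetime import datetime, timedelta

# Threshold from the slides: a gap longer than t = 5.17 hours starts a new period.
MAX_GAP = timedelta(hours=5.17)


def split_into_change_periods(commit_times, max_gap=MAX_GAP):
    """Group commit timestamps into change periods (hypothetical helper).

    Author and log message are ignored on purpose: only the time gap
    between two subsequent changes decides where a period ends.
    """
    periods = []
    for ts in sorted(commit_times):
        if periods and ts - periods[-1][-1] <= max_gap:
            periods[-1].append(ts)   # gap small enough: same change period
        else:
            periods.append([ts])     # first commit, or gap too long: new period
    return periods


# Made-up commit times, for illustration only.
commits = [
    datetime(2009, 1, 5, 9, 0),
    datetime(2009, 1, 5, 11, 30),   # 2.5 h after the previous commit
    datetime(2009, 1, 5, 18, 0),    # 6.5 h after the previous commit
    datetime(2009, 1, 6, 8, 0),     # 14 h after the previous commit
]
periods = split_into_change_periods(commits)
print([len(p) for p in periods])
```

With these sample timestamps the first two commits share a change period and each of the last two opens a new one.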
12. Macocha (6/9)
Step 2: Creation of file profiles.
Figure: From revision control systems to file profiles
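A file profile can be pictured as a bit vector with one bit per change period. The following sketch assumes that each change period has already been reduced to the set of files changed in it; the function name and the sample data are hypothetical and not taken from the Macocha implementation.

```python
def build_profiles(changes_per_period):
    """Build one bit vector per file (hypothetical helper).

    changes_per_period: list of sets of file names, one set per change
    period, in chronological order. Bit i of a profile is 1 if the file
    was changed in period i and 0 otherwise.
    """
    files = sorted(set().union(*changes_per_period))
    return {
        name: [1 if name in changed else 0 for changed in changes_per_period]
        for name in files
    }


# Made-up history with four change periods.
history = [{"F1", "F2"}, {"F3"}, {"F1", "F2"}, set()]
profiles = build_profiles(history)
for name, profile in profiles.items():
    print(name, profile)
```

Files that never change do not appear in this sketch; an idle file would simply have an all-zero profile if the full file list were supplied.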
13. Macocha (7/9)
Step 3: Stability analysis.
Figure: Profiles showing file stability
14. Macocha (8/9)
Step 4: Detection of macro co-changes.
Figure: Files F1 and F2 are in macro co-change
Figure: Three different bit vectors showing approximate macro co-changes (D(F1,F2)=2; D(F1,F3)=4)
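The two ideas behind Step 4 can be sketched as follows: files with identical profiles are grouped together, and the Hamming distance counts the change periods in which two profiles disagree. The bit vectors below are invented so that they reproduce the distances quoted in the caption (2 and 4); the figure's actual vectors are not recoverable from the transcript, and the helper names are hypothetical.

```python
from collections import defaultdict


def hamming(p, q):
    """Number of change periods in which two profiles differ."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same change periods")
    return sum(a != b for a, b in zip(p, q))


def group_identical_profiles(profiles):
    """Group changed files whose profiles are identical (hypothetical helper).

    Files with an all-zero profile are skipped, on the assumption that
    idle files are set aside before macro co-changes are looked for.
    """
    groups = defaultdict(list)
    for name, profile in profiles.items():
        if any(profile):
            groups[tuple(profile)].append(name)
    return [sorted(g) for g in groups.values() if len(g) >= 2]


# Invented vectors chosen to match D(F1,F2)=2 and D(F1,F3)=4.
F1 = [1, 0, 1, 1, 0, 1, 0, 1]
F2 = [1, 0, 1, 0, 0, 1, 1, 1]
F3 = [0, 1, 1, 1, 1, 1, 0, 0]
print(hamming(F1, F2), hamming(F1, F3))

sample = {"A": [1, 0, 1], "B": [1, 0, 1], "C": [0, 1, 0],
          "D": [0, 0, 0], "E": [0, 0, 0]}
print(group_identical_profiles(sample))
```

An approximate variant could accept pairs whose Hamming distance stays below a chosen threshold; the threshold used by the authors is not given in the slides.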
15. Macocha (9/9)
Step 5: Detection of dephase macro co-changes.
Figure: Three different bit vectors showing dephase macro co-changes
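One possible reading of Step 5 is that a file F2 is in dephase macro co-change with F1 when F2's profile equals F1's profile delayed by s change periods. The sketch below encodes that reading only; comparing just the overlapping part of the two profiles, requiring at least one change in it, and capping the search at a hypothetical max_shift are all assumptions rather than details given in the slides.

```python
def is_dephase(p1, p2, s):
    """True if p2 repeats p1 with a delay of s change periods (assumed reading)."""
    if len(p1) != len(p2) or not 0 < s < len(p1):
        return False
    head = list(p1[:-s])      # p1 without its last s periods
    shifted = list(p2[s:])    # p2 without its first s periods
    return head == shifted and any(head)


def find_shift(p1, p2, max_shift=3):
    """Smallest shift s in 1..max_shift that aligns p2 with p1, or None."""
    for s in range(1, max_shift + 1):
        if is_dephase(p1, p2, s):
            return s
    return None


# Invented profiles: every change to F1 is followed by a change to F2
# one change period later.
F1 = [1, 0, 1, 0, 0, 1, 0]
F2 = [0, 1, 0, 1, 0, 0, 1]
print(find_shift(F1, F2))
print(find_shift(F1, F1))
```

Here F2 aligns with F1 at s = 1, while F1 compared with itself has no positive shift within the search range.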
16. Empirical study
Research questions
1. How does Macocha compare to previous work in terms of precision and recall?
2. Are there (approximate) dephase macro co-changes among files and what is their usefulness?
Objects
We now present the results of our empirical study. We apply Macocha to four different programs: ArgoUML, FreeBSD, SIP, and XalanC, developed with three different programming languages, C, C++, and Java.
17. Empirical study
Objects
             ArgoUML   FreeBSD   SIP       XalanC
Languages    Java      C         Java      C++
Versions     30        8         2         21
Files        3,148     3,603     2,790     529
Changes      16,727    186,959   8,046     397,052
Start Dates  98-01-26  94-05-25  05-07-21  99-12-18
End Dates    09-01-29  09-02-11  10-12-09  09-01-17
CPs          2,843     1,121     1,553     924
Table: Descriptive statistics of the object programs (CPs: numbers of change periods)
18. Empirical study
Results
                ArgoUML  FreeBSD  SIP    XalanC
Idle files      202      1,856    963    7
Changed files   2,946    1,747    1,827  522
# of SMCC       166      121      142    36
  Max # files   35       24       15     17
  Min # files   2        2        2      2
# of SMCCH      196      163      182    85
  Max # files   46       44       32     22
  Min # files   2        2        2      2
# of SDMCC      11       1        6      1
  Max # files   4        2        3      2
  Min # files   2        2        2      2
# of SDMCCH     53       63       36     4
  Max # files   6        8        5      2
  Min # files   2        2        2      2
Table: Cardinalities of the sets obtained in the empirical study
19. Validation (1/3)
We perform two types of validation:
◮ Quantitatively, we compare the stability analysis of Macocha with that of UMLDiff [a] and the co-change analysis of Macocha with association rules [b].
◮ Qualitatively, we use external information provided by bug reports, mailing lists, and requirement descriptions to validate the (dephase) macro co-changes not found using association rules.
[a] Z. Xing et al. ICSE 2005.
[b] T. Zimmermann et al. ICSE 2004.
20. Validation (2/3)
Program                         Idle Groups  Changed Groups
ArgoUML  Idle Clusters          202          0
         Short-lived Clusters   0            1,390
         Active Clusters        0            1,556
SIP      Idle Clusters          963          0
         Short-lived Clusters   0            997
         Active Clusters        0            830
XalanC   Idle Clusters          7            0
         Short-lived Clusters   0            291
         Active Clusters        0            231
Table: Cardinality of Macocha sets in comparison to UMLDiff
21. Validation (3/3)
          Association Rules    Macocha
          Precision  Recall    Precision  Recall
ArgoUML   15%        66%       20%        75%
FreeBSD   22%        100%      24%        100%
SIP       18%        89%       24%        91%
XalanC    16%        100%      22%        100%
Table: Association rules approach vs. Macocha

          Association Rules    External Information
          Precision  Recall    Precision  Recall
ArgoUML   86%        98%       100%       99%
FreeBSD   98%        100%      100%       100%
SIP       85%        96%       100%       98%
XalanC    90%        100%      100%       100%
Table: External evaluation of Macocha when using the results of the association rules approach as oracle and after manual validation using external information
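For readers unfamiliar with the two measures in these tables, the sketch below uses the standard definitions: precision is the share of reported co-changes that appear in the oracle, and recall is the share of oracle co-changes that were reported. The file pairs are invented and do not come from the study, and the helper name is hypothetical.

```python
def precision_recall(found, oracle):
    """Standard precision and recall of a set of results against an oracle."""
    true_positives = len(found & oracle)
    precision = true_positives / len(found) if found else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall


# Invented co-changing file pairs, for illustration only.
found = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
oracle = {("A", "B"), ("B", "D"), ("E", "F")}
precision, recall = precision_recall(found, oracle)
print(f"precision={precision:.0%} recall={recall:.0%}")
```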
22. Conclusion and Ongoing Work
Conclusion
1. We introduced the novel concepts of macro co-changes and dephase macro co-changes to describe that two files were changed by developers within the same change periods, with possible shifts in time.
2. We described Macocha, an approach to detect (dephase) macro co-changes using file profiles and their stability in time.
3. We performed two types of validation, quantitative and qualitative, and we showed that SMCC and SDMCC do exist and bring supplementary information.
23. Conclusion and Ongoing Work
Ongoing Work
1. Performing a comprehensive study of the number of MCCs and DMCCs with varying values of t and s.
2. Performing a comprehensive study of the different kinds of DMCCs.
3. Relating MCCs and DMCCs with static analysis and external software characteristics, such as change proneness.