Mining and Untangling Change Genealogies (PhD Defense Talk)

kherzig@acm.org
Kim Herzig

Mining Software Repositories
Software repositories contain
events and artifacts collected
during software development.

Prediction models.
Recommender systems.
Development process measures.
Models that allow replaying history.

Collect & combine ➡ Filter ➡ Interpret

Which are the most defect-prone source files?

Data sources: code history and bugs.
Collect & combine:
‣ Which change fixed which bug?
‣ Which files were changed to do so?
Filter:
‣ Consider only closed and resolved bug reports.
‣ Distinguish between pre- and post-release bugs.
Interpret:
‣ Count the number of distinct bugs per file as a quality measure.
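The filtering and counting steps can be sketched as follows. This is a minimal illustration, not the tooling used in the talk: the record layout (bug id mapped to status and resolution, fixes as pairs of bug id and changed files) and all file names are assumptions made for the example. The pre-/post-release distinction is left out.

```python
from collections import defaultdict

# Hypothetical bug reports: id -> (status, resolution).
bugs = {
    101: ("CLOSED", "FIXED"),
    102: ("OPEN", None),
    103: ("RESOLVED", "FIXED"),
}

# Hypothetical links between bug-fixing changes and the files they touched.
fixes = [
    (101, ["A.java", "B.java"]),
    (101, ["A.java"]),   # second commit for the same bug
    (102, ["C.java"]),   # bug still open, so it is ignored
    (103, ["A.java"]),
]

def distinct_bugs_per_file(bugs, fixes):
    """Count distinct closed or resolved bugs per file."""
    bug_ids = defaultdict(set)
    for bug_id, files in fixes:
        status, _resolution = bugs[bug_id]
        if status not in ("CLOSED", "RESOLVED"):
            continue
        for path in files:
            bug_ids[path].add(bug_id)
    return {path: len(ids) for path, ids in bug_ids.items()}

counts = distinct_bugs_per_file(bugs, fixes)
```

Using a set per file ensures that a bug fixed in several commits is counted only once for that file.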

Mining Version Archives
Developer A changes File A and submits the change to the code base.
Model:
‣ Detect changes about to be submitted.
‣ Analyze historic code changes.
‣ Suggest further code changes that other developers made when changing File A.
The developer might react to the suggestion.
Goal: help developers prevent incomplete changes and thereby prevent bugs.

Mining Change Couplings
[Figure: code history along a time axis]
Interpret: a rule states that when one item is changed, then someone also changes another item.
Filter: select only frequently occurring patterns.
Prediction: if someone changes the first item, we suggest the second item as a likely further change.
The assumption was that one item depends on the other.
But what if both are frequently occurring but independent changes?
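A minimal sketch of such coupling mining, assuming that each transaction is simply the set of files committed together. The thresholds, file names and the function name `mine_couplings` are illustrative assumptions, not values from the talk.

```python
from collections import Counter
from itertools import permutations

# Hypothetical history: each transaction is the set of files changed together.
transactions = [
    {"A", "B"},
    {"A", "B", "C"},
    {"A", "B"},
    {"A", "C"},
    {"D"},
]

def mine_couplings(transactions, min_support=2, min_confidence=0.7):
    """Return rules (x, y, support, confidence): 'x changed => y changed'."""
    single = Counter()
    pair = Counter()
    for files in transactions:
        single.update(files)
        pair.update(permutations(sorted(files), 2))
    rules = []
    for (x, y), support in pair.items():
        confidence = support / single[x]
        # Filter step: keep only frequently occurring patterns.
        if support >= min_support and confidence >= min_confidence:
            rules.append((x, y, support, confidence))
    # Rank by confidence, then by support.
    return sorted(rules, key=lambda r: (-r[3], -r[2], r[0], r[1]))

rules = mine_couplings(transactions)
```

Nothing in these counts says whether the coupled files actually depend on each other, which is the gap the question above points at.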

Change Genealogies
Combine spatial and temporal dimension of archives.
‣ Reasoning over multiple components at multiple points in time
Model dependencies between code changes.
‣ Reasoning over the dependencies and impact of changes

Change Genealogies
When does a change depend on another change?
‣ If we cannot apply the one without applying the other first (e.g. without breaking compilation).
Which dependencies are to be modeled?
‣ We cannot model all dependencies.
‣ We need a change abstraction that covers most situations and allows efficient dependency rules.

Change Operations
Source code change:
    int cache = 10;
    ...
    public class C {
        public C() {
            List<String> list = new List<String>();
            if(list.size() < 1){
            if(list.isEmpty()){
            }
        }
Analyze source code
‣ many revisions cannot be compiled
‣ but must be considered in the genealogy.
Track changes applied to
‣ method definitions
‣ method calls
Reduce applied changes to
‣ AD added method definition
‣ DD deleted method definition
‣ MD modified method definition
‣ AC added method call
‣ DC deleted method call
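The reduction to the five operation types can be sketched as below. The input format is an assumption made for illustration: each revision maps a method signature to its body text and the list of methods it calls, so no compiler is needed. The example assumes, hypothetically, that the `size()` check in the constructor is replaced by `isEmpty()`.

```python
def change_operations(old, new):
    """Reduce the difference between two revisions to AD/DD/MD/AC/DC.

    `old` and `new` map a method signature to a pair (body, calls), where
    `calls` lists the methods called in that body (assumed input format).
    """
    ops = []
    for sig in sorted(new.keys() - old.keys()):
        ops.append(("AD", sig))
    for sig in sorted(old.keys() - new.keys()):
        ops.append(("DD", sig))
    for sig in sorted(old.keys() & new.keys()):
        old_body, old_calls = old[sig]
        new_body, new_calls = new[sig]
        if old_body != new_body:
            ops.append(("MD", sig))
        ops.extend(("AC", c) for c in new_calls if c not in old_calls)
        ops.extend(("DC", c) for c in old_calls if c not in new_calls)
    return ops

# Hypothetical revisions of the constructor C() from the snippet above.
old = {"C.C()": ("if(list.size() < 1){", ["List.size()"])}
new = {"C.C()": ("if(list.isEmpty()){", ["List.isEmpty()"])}
ops = change_operations(old, new)
```

Because only signatures and call names are compared, the sketch also works for revisions that do not compile.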

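Dependency Rules
Each rule reads "X depends on Y"; the edge label (D or C) is given in parentheses.
You cannot define the same method twice.
‣ AD depends on DD (D)
You cannot modify a method that does not exist.
‣ MD depends on AD (D)
‣ MD depends on MD (D)
You cannot delete a method that does not exist.
‣ DD depends on AD (D)
‣ DD depends on MD (D)
You can only call methods that exist.
‣ AC depends on AD (D)
‣ AC depends on MD (D)
You cannot delete a method call that does not exist.
‣ DC depends on AC (C)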
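A minimal sketch of how such rules could turn a time-ordered list of change operations into genealogy edges. The tuple layout `(op_id, kind, target)` is an assumption, and the sketch simplifies by tracking one call per called method instead of individual call sites.

```python
def build_edges(operations):
    """Apply the dependency rules to time-ordered change operations.

    Each operation is (op_id, kind, target): `target` is the method for
    AD/MD/DD and the called method for AC/DC (simplified, assumed format).
    Returns edges (dependent_id, dependency_id, label).
    """
    last_def = {}    # method -> (op_id, kind) of the latest AD/MD/DD
    last_call = {}   # called method -> op_id of the latest AC
    edges = []
    for op_id, kind, target in operations:
        prev = last_def.get(target)
        if kind == "AD":
            # Re-adding a method depends on its earlier deletion.
            if prev and prev[1] == "DD":
                edges.append((op_id, prev[0], "D"))
        elif kind in ("MD", "DD", "AC"):
            # Modifying, deleting or calling requires an existing definition.
            if prev and prev[1] in ("AD", "MD"):
                edges.append((op_id, prev[0], "D"))
        elif kind == "DC":
            # Deleting a call requires that the call was added before.
            if target in last_call:
                edges.append((op_id, last_call[target], "C"))
        if kind in ("AD", "MD", "DD"):
            last_def[target] = (op_id, kind)
        elif kind == "AC":
            last_call[target] = op_id
    return edges

# Hypothetical operation sequence.
operations = [
    (1, "AD", "A.foo(int)"),
    (2, "AD", "B.bar(int)"),
    (3, "AC", "B.bar(int)"),
    (4, "DD", "A.foo(int)"),
    (5, "DC", "B.bar(int)"),
]
edges = build_edges(operations)
```

Each edge points from the dependent operation to the earlier operation it relies on and carries the label of the rule that produced it.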

Genealogy Example
[2010] Herzig, “Capturing the Long-Term Impact of Changes”, ICSE
[Figure: change operations in File 1 to File 4 across change sets CS1 to CS5]
+ int A.foo(int)
+ int B.bar(int)
+ B.bar(5)
- int A.foo(int)
+ float A.foo(float)
+ d = A.foo(5d)
- x = B.bar(5)
+ x = A.foo(5f)
+ d = A.foo(d)
+ e = A.foo(-1f)
Vertex annotation: author, timestamp, change set ID, method name, fully qualified class name, file name.
Each edge is labeled with the dependency rule that caused it.

Change Set Layer
Change Operation Layer ➡ Change Set Layer
[Figure: Files 1 to 4 across change sets CS1 to CS5, first on the change operation layer, then collapsed onto the change set layer]
‣ The change set layer is the default layer.
‣ The change set layer is directed and acyclic.
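Lifting operation-level edges to the change set layer can be sketched as follows. The edge list and the mapping from operations to change sets are hypothetical and do not reproduce the figure.

```python
def to_change_set_layer(op_edges, change_set_of):
    """Lift operation-level edges to edges between change sets."""
    cs_edges = set()
    for dependent, dependency, _label in op_edges:
        src = change_set_of[dependent]
        dst = change_set_of[dependency]
        if src != dst:   # dependencies inside one change set disappear
            cs_edges.add((src, dst))
    return cs_edges

# Hypothetical operation-level edges (dependent, dependency, label).
op_edges = [(3, 2, "D"), (4, 1, "D"), (5, 3, "C")]
# Hypothetical assignment of operations to change sets.
change_set_of = {1: "CS1", 2: "CS2", 3: "CS2", 4: "CS3", 5: "CS4"}

cs_edges = to_change_set_layer(op_edges, change_set_of)
```

Since every dependency points to an operation applied earlier, edges between distinct change sets always point back in time, which keeps the resulting graph acyclic.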

Applications & Assumptions
Cause effect, Prediction, Classification, Untangling

Applications & Assumptions: Cause effect, Prediction
[2011] Herzig and Zeller, “Mining Cause-Effect-Chains from Version Archives”, ISSRE
Funded by
Faculty Grant

Expressing Cause Effect Chains
[Figure: Developers A, B, C, … D changing one software system]
What is the long-term impact of the initial change?
‣ Related to change couplings but requires dependencies.
The original problem stems from large repositories combining multiple projects that use each other's code.
‣ Example: Developer A might deprecate and replace an authentication method in system A, forcing the authentication in system D to be updated.

Did developer A trigger the green changes?
‣ Does one change depend on the other? Is there a path between them in the genealogy?
We can use CTL to formulate the property:
    a ⇒ EF b
where a and b stand for the two changes.
Change genealogies allow model checking version archives.
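On a finite genealogy graph, checking an EF property amounts to a reachability search. The sketch below is an illustration under assumptions, not the model checker used in the talk: the graph maps each change set to the change sets that depend on it, and the change set and file names are made up.

```python
from collections import deque

def ef(graph, start, holds):
    """CTL 'EF': is some vertex satisfying `holds` reachable from `start`?"""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        if holds(node):
            return True
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return False

# Hypothetical genealogy: change set -> change sets depending on it.
graph = {"CS1": ["CS2", "CS3"], "CS2": ["CS4"], "CS3": [], "CS4": []}
# Hypothetical files touched by each change set.
files = {"CS1": {"File 1"}, "CS2": {"File 2"},
         "CS3": {"File 1"}, "CS4": {"File 3"}}

def touches_file_3(change_set):
    return "File 3" in files[change_set]

triggered = ef(graph, "CS1", touches_file_3)
not_triggered = ef(graph, "CS3", touches_file_3)
```

As in CTL, the start vertex itself counts, so EF also holds when the property is already true at the starting change set.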

Model Checking Genealogies
‣ Create a change genealogy from the version archive.
‣ Extract valid CTL rules using model checking.
‣ Report frequently occurring rules as recommendations.
Recommendations:
    a ⇒ EF b
    a ⇒ AG(b ⇒ EF c)
    a ⇒ EF(b ∧ c)
    ...
where a, b and c stand for changes.
Rules must be valid within a time window and are ranked by confidence and support.
Recommendations are ensured to be based on structural dependencies.

Predicting Future Events
(having a structural dependency on the current change)
[Figure: project history timeline; 10% initial training set; training set +1]
Prediction model:
‣ Use the top 3 ranked recommendations (ranked by confidence and support) to predict files that will change in the time window.
precision = #correct predictions / (#correct predictions + #false predictions)

How well can we predict cause-effect chains?
[Figure: precision (0 to 1.0) for ArgoUML, Jaxen, JRuby and XStream; model checking vs. most frequently changed files]

More than 60% of all predictions are valid.
‣ Which is in sync with other approaches.
‣ Structural dependencies ensured.
‣ For over 47% of commits all recommendations are valid.
‣ Average rank of highest hit is 1.8.
[Figure: precision (0 to 1.0) with [eRose] and [Canfora] for comparison]
[Canfora] Canfora et al., “Using Multivariate Time Series and Association Rules to Detect Logical Change Couplings: an Empirical Study”, ICSM 2010
[eRose] Zimmermann et al., “Mining Version Histories to Guide Software Changes”, ICSE 2004

Predicting Defects
[2012] Herzig et al., “Classifying Changes and Predicting Defects Using Change Genealogies”, under submission
Using network metrics on change genealogies.
‣ Metrics express the dependency structures.
‣ Motivated by studies using call graph network metrics.
‣ Assumption: central changes tend to be more crucial and thus more likely to add defects.
Setup: four open-source projects (HTTPClient, Jackrabbit, Lucene-Java, Rhino), stratified 100 cross-fold setup, four different machine learners.
[Figure: precision and recall per project for complexity metrics, network metrics, genealogy metrics, and network + genealogy metrics]
Genealogy metrics outperform code and network metrics when predicting defects.
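To give an idea of what a network metric on a genealogy looks like, the sketch below computes three simple per-vertex values. The slides do not list the metrics actually used, so in-degree, out-degree and the number of transitive dependents are illustrative choices, and the edge list is hypothetical.

```python
def genealogy_metrics(edges):
    """Per-vertex in-degree, out-degree and number of transitive dependents.

    `edges` holds (dependent, dependency) pairs between change sets.
    """
    vertices = {v for edge in edges for v in edge}
    dependents = {v: set() for v in vertices}   # dependency -> direct dependents
    out_degree = {v: 0 for v in vertices}
    for dependent, dependency in edges:
        if dependent not in dependents[dependency]:
            dependents[dependency].add(dependent)
            out_degree[dependent] += 1
    metrics = {}
    for v in vertices:
        # Depth-first search over everything that builds on v.
        reached, stack = set(), list(dependents[v])
        while stack:
            node = stack.pop()
            if node not in reached:
                reached.add(node)
                stack.extend(dependents[node])
        metrics[v] = {
            "in_degree": len(dependents[v]),
            "out_degree": out_degree[v],
            "transitive_dependents": len(reached),
        }
    return metrics

# Hypothetical change set layer edges (dependent, dependency).
edges = [("CS2", "CS1"), ("CS3", "CS1"), ("CS4", "CS2")]
metrics = genealogy_metrics(edges)
```

Values like these would form one feature row per change set, which a machine learner can then relate to defect data.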

Untangling Changes
[2013] Herzig and Zeller, “The Impact of Tangled Code Changes”, MSR
The genealogy change set layer requires atomic code changes.
‣ Change sets applying changes targeting multiple issues cause false dependencies.
‣ Manual classification of 7,000 change sets shows that ~10% of all bug fixes are tangled.
[Figure: natural blob size occurrence for blob sizes 2, 3, 4 and >4, with shares of 4%, 5%, 18% and 73%]
Can we untangle tangled code changes?
‣ Yes, for most tangled changes (blob size 2) with a precision between 0.7 and 0.9.
[Figure: precision (0 to 0.9) for blob sizes 2, 3 and 4 on ArgoUML, Google Webtool Kit, JRuby and XStream]
Tangled change set ➡ Untangling algorithm ➡ Change set partition (A, B)
Confidence voters: distance measures, data dependencies, change couplings, call graph.
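One way to combine several confidence voters into a partition is greedy agglomerative merging. The sketch below is an assumption-laden illustration, not the algorithm from the talk: the two voters, the averaging of votes, the maximum linkage and the sample operations are all hypothetical.

```python
from itertools import combinations

def untangle(operations, voters, n_partitions):
    """Greedily merge the most related groups until n_partitions remain.

    `voters` are functions (op_a, op_b) -> score in [0, 1]; their average is
    taken as the confidence that two operations belong together.
    """
    def confidence(a, b):
        return sum(vote(a, b) for vote in voters) / len(voters)

    def linkage(g1, g2):
        return max(confidence(a, b) for a in g1 for b in g2)

    groups = [[op] for op in operations]
    while len(groups) > n_partitions:
        i, j = max(combinations(range(len(groups)), 2),
                   key=lambda ij: linkage(groups[ij[0]], groups[ij[1]]))
        groups[i] = groups[i] + groups[j]
        del groups[j]   # safe because i < j
    return groups

# Hypothetical change operations as (file, line) pairs.
ops = [("A.java", 10), ("A.java", 12), ("B.java", 200), ("B.java", 205)]

def same_file(a, b):
    return 1.0 if a[0] == b[0] else 0.0

def close_lines(a, b):
    return 1.0 / (1 + abs(a[1] - b[1]))

partition = untangle(ops, [same_file, close_lines], n_partitions=2)
```

The number of partitions is passed in here; in practice it corresponds to the blob size, that is, the number of issues addressed by the tangled change set.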

Misclassified Issue Reports
[2013] Herzig et al., “It’s not a Bug, It’s a Feature: How Misclassification Impacts Bug Prediction”, ICSE
After untangling:
‣ Which partition is a bug fix, which one a new feature?
But are bug reports reporting bugs? No.
‣ Manually classified >7,000 issue reports.
‣ Every 3rd bug report does not require a code fix.
[Figure: per-project split into wrongly classified and correctly classified reports; trackers: …-Tracker, Bugzilla-Tracker]
‣ HTTPClient: 63% / 37%
‣ Jackrabbit: 75% / 25%
‣ Lucene-Java: 65% / 35%
‣ Rhino: 59% / 41%
‣ Tomcat5: 61% / 39%
With severe impact on bug count models.
‣ Up to 40% false positives among the most defect-prone files.
[Figure: 0% to 40% for the TOP 5%, TOP 10%, TOP 15% and TOP 20% of files in HTTPClient, Jackrabbit, Lucene-Java, Rhino and Tomcat5]
‣ Automatic classification of issue reports is possible.
[2008] Antoniol et al., “Is it a bug or an enhancement? A text based approach to classify change requests.”, CASCON
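The effect on a bug count model can be illustrated by comparing the top-ranked files before and after removing misclassified reports. The sketch below uses made-up counts and an assumed definition of the false positive rate: the share of files in the top set by reported bugs that drop out of the top set once counts are corrected.

```python
import math

def top_fraction(counts, fraction):
    """Files in the top `fraction` when ranked by descending bug count."""
    size = max(1, math.ceil(len(counts) * fraction))
    ranked = sorted(counts, key=lambda f: (-counts[f], f))
    return set(ranked[:size])

def false_positive_rate(reported, corrected, fraction):
    """Share of top files by reported bugs that leave the top set once
    misclassified reports are removed."""
    top_reported = top_fraction(reported, fraction)
    top_corrected = top_fraction(corrected, fraction)
    return len(top_reported - top_corrected) / len(top_reported)

# Hypothetical bug counts per file as reported in the tracker.
reported = {"f0": 9, "f1": 8, "f2": 7, "f3": 6, "f4": 5,
            "f5": 4, "f6": 3, "f7": 2, "f8": 1, "f9": 0}
# Hypothetical corrected counts: most reports against f1 were not bugs.
corrected = dict(reported, f1=2)

rate_top_20 = false_positive_rate(reported, corrected, 0.20)
```

With these made-up numbers, one of the two files in the top 20% is there only because of misclassified reports.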
