Adaptive Change Propagation Using Heuristics

Supporting Software Evolution Using
Adaptive Change Propagation Heuristics
Haroon Malik
Ahmed E. Hassan
School of Computing, Queen’s University, Canada

What is Change Propagation
It is the process of propagating code
changes to other entities in a software
system.
It ensures the consistency of assumptions
in the system after an entity is changed.
Mis-propagating a change is likely to introduce bugs.

The Change Propagation
Process
[Flowchart: a new requirement or bug fix starts the process: Determine Initial Entity To Change -> Change Entity -> Determine Other Entities To Change (repeat for each suggested entity) -> Consult Guru for Advice -> No More Changes.]
“How does a change in one source code entity propagate to other
entities?”
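One way to read the flowchart above is as a worklist loop. The sketch below is only an illustration under that reading: `propagate_change`, `suggest` and `needs_change` are hypothetical names, with `suggest` standing in for a heuristic (or the guru) and `needs_change` for the developer's judgment.

```python
from collections import deque
from typing import Callable, Iterable, Set


def propagate_change(
    initial: str,
    suggest: Callable[[str], Iterable[str]],
    needs_change: Callable[[str], bool],
) -> Set[str]:
    """Return the entities changed while propagating a change from `initial`."""
    changed: Set[str] = set()
    worklist = deque([initial])
    while worklist:  # an empty worklist corresponds to "No More Changes"
        entity = worklist.popleft()
        # Skip entities already handled or judged not to need a change.
        if entity in changed or not needs_change(entity):
            continue
        changed.add(entity)  # "Change Entity"
        # "Determine Other Entities To Change": queue every suggestion.
        worklist.extend(suggest(entity))
    return changed


# Hypothetical data: suggestions per entity, and D turns out not to need a change.
suggestions = {"A": ["B", "C"], "B": ["D"], "C": [], "D": []}
result = propagate_change(
    "A",
    suggest=lambda e: suggestions.get(e, []),
    needs_change=lambda e: e != "D",
)
print(sorted(result))
```

Every suggestion that `needs_change` rejects is effort spent inspecting an entity for nothing, which is why the quality of `suggest` matters.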

Consider a change set with A, B and C
changing together
[Figure: entities A, B and C.]

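[Figure, continued: change set A, B, C with the entities suggested by the HIST heuristic and the CUD heuristic (static dependency); suggestions shown include B, C, D and E, labelled as either helpful or wasted developer time.]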

[Figure, continued: change set A, B, C with the entities suggested by the HIST heuristic and the CUD heuristic (static dependency).]
- Which heuristic should we pick?
- We should track the performance of a pool of heuristics
over time for each entity.

[Figure, continued: change set A, B, C with the entities suggested by the HIST heuristic and the CUD heuristic (static dependency).]
- Best Heuristic Table
(BHT)
- Tracks and updates

[Figure, continued: change sets along a time axis, with the entities suggested for A by the HIST heuristic and the CUD heuristic (static dependency).]
- HIST or CUD?
- BHT says HIST always works
well with A [A-Freq].
- We use HIST.
- BHT might also say HIST
worked well with A last time
[A-Rec].

Consider a change set with A, B and D
changing together
[Figure: entities E, D and A.]

[Figure, continued: entities E, D, A and B.]

[Figure, continued: entities E, D, A, B, X and Y.]
Precision = 1/5 = 20%
Recall = 1/1 = 100%
We want high Precision and high
Recall
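The two measures can be sketched as set operations. The entity sets below are hypothetical, chosen only so that the result matches the 1/5 and 1/1 figures above; they are not a reconstruction of the slide's diagram. Returning 0 for an empty set is also an assumption.

```python
from typing import Set, Tuple


def precision_recall(suggested: Set[str], actual: Set[str]) -> Tuple[float, float]:
    """Precision = correct / suggested; recall = correct / actually changed."""
    correct = suggested & actual
    # Assumption: an empty suggestion list (or empty actual set) scores 0.
    precision = len(correct) / len(suggested) if suggested else 0.0
    recall = len(correct) / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical example: five suggestions, one of which was actually needed.
suggested = {"B", "D", "E", "X", "Y"}
actual = {"B"}
precision, recall = precision_recall(suggested, actual)
print(f"precision={precision:.0%} recall={recall:.0%}")
```

A heuristic that suggests everything reaches 100% recall at very low precision, which is why both measures are tracked.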

Change Propagation Challenge
Mostly manual and time-consuming process
Requires depending on others:
- Knowledge of senior developers, who are usually too
busy to guide every change
- Experience of a guru, who rarely exists in large projects
- Communication among different teams, which is itself a
challenge in large projects
- Use of documentation and previous test suites, which are
rarely up to date

Shortcomings of Current
Practices
Explore a single dimension
- HIST: Given a changed entity A, a HIST heuristic would suggest
all entities that changed often with A in the past.
- CUD: Given a modified entity A, a CUD heuristic returns all
entities that depend on A or that A depends on.
- FILE: Given a modified entity A, a FILE heuristic would return all
entities in the same file as A.
Static heuristics
- Do not adjust over time
- Do not adapt to the particular changed entity
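The three basic heuristics described above can be sketched as small functions over simple in-memory data. The data structures (`change_sets`, `depends_on`, `file_of`) and the `min_count` threshold for "changed often" are assumptions made for illustration; the slides do not specify them.

```python
from collections import Counter
from typing import Dict, List, Set


def hist(entity: str, change_sets: List[Set[str]], min_count: int = 1) -> Set[str]:
    """HIST: entities that co-changed with `entity` at least `min_count` times."""
    counts: Counter = Counter()
    for change_set in change_sets:
        if entity in change_set:
            counts.update(change_set - {entity})
    return {other for other, n in counts.items() if n >= min_count}


def cud(entity: str, depends_on: Dict[str, Set[str]]) -> Set[str]:
    """CUD: entities that `entity` depends on, plus entities depending on it."""
    outgoing = depends_on.get(entity, set())
    incoming = {src for src, targets in depends_on.items() if entity in targets}
    return (outgoing | incoming) - {entity}


def file_heuristic(entity: str, file_of: Dict[str, str]) -> Set[str]:
    """FILE: all other entities defined in the same file as `entity`."""
    home = file_of.get(entity)
    if home is None:
        return set()
    return {other for other, f in file_of.items() if f == home and other != entity}


# Hypothetical project data.
change_sets = [{"A", "B", "C"}, {"A", "B"}, {"D", "E"}]
depends_on = {"A": {"D"}, "E": {"A"}}
file_of = {"A": "x.c", "B": "x.c", "D": "y.c"}

print(sorted(hist("A", change_sets)))
print(sorted(cud("A", depends_on)))
print(sorted(file_heuristic("A", file_of)))
```

Each function looks at exactly one source of evidence (history, static dependencies, or file layout), which is the "single dimension" limitation noted above.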

Proposed Approach
Adaptive co-change meta-heuristics:
Track the best-performing heuristics for each
entity in a Best Heuristic Table (BHT)
Update the table as the project evolves

BHT Update
The BHT holds the best-performing heuristics
A-Recency (A-Rec):
- For the last change of an entity
A-Frequency (A-Freq):
- Over all changes of an entity
By continuously updating the BHT, we ensure that we
are always using the best-performing heuristic for an entity
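A minimal sketch of such a table, assuming each heuristic is given a numeric score (for example an F-measure) every time an entity changes. The class name, the method names and the fallback to HIST for unseen entities are all hypothetical choices, not details taken from the slides.

```python
from collections import defaultdict
from typing import Dict


class BestHeuristicTable:
    """Tracks, per entity, which heuristic performed best."""

    def __init__(self, default: str = "HIST") -> None:
        self.default = default  # assumption: fallback for unseen entities
        self._last_best: Dict[str, str] = {}
        self._wins: Dict[str, Dict[str, int]] = defaultdict(lambda: defaultdict(int))

    def update(self, entity: str, scores: Dict[str, float]) -> None:
        """Record the heuristic that scored best for the latest change of `entity`."""
        best = max(scores, key=scores.get)
        self._last_best[entity] = best
        self._wins[entity][best] += 1

    def a_rec(self, entity: str) -> str:
        """A-Recency: the heuristic that was best for the entity's last change."""
        return self._last_best.get(entity, self.default)

    def a_freq(self, entity: str) -> str:
        """A-Frequency: the heuristic that was best most often over all changes."""
        wins = self._wins.get(entity)
        if not wins:
            return self.default
        return max(wins, key=wins.get)


# Hypothetical scores for three successive changes of entity A.
bht = BestHeuristicTable()
bht.update("A", {"HIST": 0.5, "CUD": 0.1, "FILE": 0.2})
bht.update("A", {"HIST": 0.4, "CUD": 0.0, "FILE": 0.3})
bht.update("A", {"HIST": 0.1, "CUD": 0.6, "FILE": 0.2})
print(bht.a_rec("A"), bht.a_freq("A"))
```

In this example the two policies disagree: CUD won the most recent change, while HIST won most often overall.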

Empirical Study
Used change sets from 4 open source projects
with over 39 years of development:
PostgreSQL, FreeBSD, GCluster and GCC
Recovered change sets from source control
repositories (CVS)
Replayed the history to measure
performance
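Replaying history can be sketched as a loop that scores a heuristic on each change set using only the change sets seen before it. This is a simplified illustration, not the study's actual procedure: it evaluates only a HIST-style heuristic, treats every entity of a change set in turn as the changed entity, and averages the per-query results. All names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple


def replay(change_sets: List[Set[str]]) -> Tuple[float, float]:
    """Replay change sets in order; return mean (precision, recall) of HIST."""
    co_changes: Dict[str, Set[str]] = defaultdict(set)  # history seen so far
    precisions: List[float] = []
    recalls: List[float] = []
    for change_set in change_sets:
        for entity in change_set:
            actual = change_set - {entity}
            if not actual:
                continue  # nothing to propagate for a single-entity change
            suggested = co_changes[entity]
            correct = suggested & actual
            precisions.append(len(correct) / len(suggested) if suggested else 0.0)
            recalls.append(len(correct) / len(actual))
        # Update the history only after scoring, so no future data leaks in.
        for entity in change_set:
            co_changes[entity] |= change_set - {entity}
    n = len(precisions)
    if n == 0:
        return 0.0, 0.0
    return sum(precisions) / n, sum(recalls) / n


# Hypothetical history of three change sets, oldest first.
history = [{"A", "B"}, {"A", "B"}, {"A", "C"}]
mean_precision, mean_recall = replay(history)
print(round(mean_precision, 2), round(mean_recall, 2))
```

Updating the history after scoring is the key design point: the heuristic is always judged on a change it has not yet seen.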

Performance Measures of
Heuristics
Project      HIST         CUD          FILE         A-Freq       A-Rec
             Rec   Prec   Rec   Prec   Rec   Prec   Rec   Prec   Rec   Prec
PostgreSQL   0.69  0.14   0.44  0.02   0.73  0.13   0.45  0.25   0.40  0.30
FreeBSD      0.70  0.12   0.40  0.02   0.76  0.11   0.41  0.27   0.41  0.30
GCluster     0.52  0.18   0.38  0.09   0.70  0.14   0.39  0.22   0.35  0.28
GCC          0.78  0.10   0.43  0.02   0.80  0.12   0.51  0.21   0.47  0.25
All          0.67  0.13   0.41  0.04   0.74  0.12   0.44  0.23   0.40  0.28
F-measure    0.23         0.06         0.21         0.30         0.33
Recall: Adaptive heuristics are similar to traditional heuristics
Precision: Adaptive heuristics outperform traditional heuristics
F-measure: Adaptive heuristics outperform traditional heuristics
(23% better than the best heuristic, HIST)
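The slides do not define the F-measure. Assuming the usual balanced form (the harmonic mean of precision and recall), the adaptive columns of the "All" row can be checked as follows. Reported values may have been computed from unrounded data, so not every column need reproduce exactly from the rounded table entries.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall taken from the "All" row of the table above.
a_rec_f = round(f_measure(precision=0.28, recall=0.40), 2)
a_freq_f = round(f_measure(precision=0.23, recall=0.44), 2)
print(a_rec_f, a_freq_f)
```

Because the harmonic mean is pulled toward the smaller of the two values, the low precision of the traditional heuristics limits their F-measure even where their recall is high.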

Performance Characteristics of
Adaptive Heuristics
To better understand our adaptive heuristics, we
examined their performance along three dimensions:
Performance over time
BHT composition over time
BHT suggestions vs. optimal suggestions

Performance Over Time
[Figure: two line charts, Precision (0 to 0.35) and Recall (0 to 0.8), by year from 1993 to 2005, for HIST, CUD, FILE, A-Freq and A-Rec.]
For Precision:
- Adaptive heuristics outperform traditional heuristics.
For Recall:
- Adaptive heuristics do not perform as well as the traditional heuristics.
- Overall, A-Rec has lower recall than A-Freq for all projects.

BHT Composition over Time
[Figure: two panels (A-Freq and A-Rec) plotting BHT composition (%) for HIST, FILE and CUD against time in days (0 to 4000).]
- BHT for FreeBSD
- All projects show the same trends
- At the start, HIST is not widely used
- As the project evolves, HIST becomes the most effective

BHT Suggestion Vs. Optimal
- Since we are replaying historical change sets, we can
compare the adaptive heuristics against an optimal heuristic
- The optimal heuristic always (100% of the time) suggests the best heuristic
- Suggestion: share of correctly suggested heuristics:
76-85%
- Performance:
63% of the optimal F-measure
(HIST, the best-performing basic heuristic, reaches 44% of optimal)
- 37% room for improvement

Improving the Performance of
Adaptive Heuristics
Improve HIST, in the hope of improving the adaptive
heuristics, by employing advanced techniques
Two improved HIST heuristics [Hassan, Holt: 2005]
- RECN(M): given a changed entity E, RECN(M) suggests all
entities that changed with E in the past M months.
- FREQ(A): given a changed entity E, FREQ(A) suggests all
entities that changed with E at least twice in the past and
changed more than A% of the time with E.
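The two improved heuristics can be sketched over a dated change history. Several details here are assumptions rather than the cited definitions: a month is approximated as 30 days, "A% of the time" is read as the share of E's changes in which the other entity also changed, and the function and type names are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta
from typing import List, Set, Tuple

ChangeSet = Tuple[date, Set[str]]  # (commit date, entities changed together)


def recn(entity: str, history: List[ChangeSet], today: date, months: int) -> Set[str]:
    """RECN(M): entities that changed with `entity` in the past `months` months."""
    cutoff = today - timedelta(days=30 * months)  # assumption: 30-day months
    result: Set[str] = set()
    for when, change_set in history:
        if entity in change_set and when >= cutoff:
            result |= change_set - {entity}
    return result


def freq(entity: str, history: List[ChangeSet], percent: float) -> Set[str]:
    """FREQ(A): co-changed at least twice and in more than `percent`% of changes."""
    together: Counter = Counter()
    total = 0  # number of change sets that include `entity`
    for _, change_set in history:
        if entity in change_set:
            total += 1
            together.update(change_set - {entity})
    return {
        other
        for other, n in together.items()
        if n >= 2 and 100.0 * n / total > percent
    }


# Hypothetical history for entity E.
history = [
    (date(2024, 1, 10), {"E", "X", "Y"}),
    (date(2024, 5, 1), {"E", "X"}),
    (date(2024, 6, 1), {"E", "Z"}),
]
today = date(2024, 6, 15)
print(sorted(recn("E", history, today, months=4)))
print(sorted(freq("E", history, percent=60)))
```

Both variants prune the plain HIST suggestion list, RECN by age and FREQ by strength of co-change, which trades some recall for precision.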

Improved HIST heuristics
- Integrated RECN(4) and FREQ(60) into the heuristic pool
used by the adaptive meta-heuristics
- Achieved 0.73 to 0.78 for Recall and 0.64 for Precision
- Nearly 30% increase in performance:
  - A-Freq is within 91% of the optimal heuristic
  - A-Rec is within 93% of the optimal heuristic
RECN(M)   F-measure   FREQ(A)    F-measure
RECN(2)   0.39        FREQ(50)   0.39
RECN(4)   0.40        FREQ(60)   0.44
RECN(6)   0.34        FREQ(70)   0.42
RECN(8)   0.28        FREQ(80)   0.39

Findings
Adaptive heuristics can achieve:
- 0.73 to 0.78 for Recall and
0.64 for Precision
- 57% improvement over traditional heuristics
Performance differences are statistically
significant based on a paired Wilcoxon signed
rank test at the 5% level of significance
(alpha = 0.05)
