4. Atomic vs. Composite Code Change
Atomic code change:
• "Fixed bug #123"
Composite code change:
• "Fixed bugs #12, #34, #56; removed duplicate code; added a feature; updated Javadoc"
• Difficult to review
• Likely to be rejected
5. Research Questions
• RQ1: Are composite code changes prevalent?
• RQ2: Can we propose an approach to improve the semantic atomicity of composite code changes?
• RQ3: Can our approach help developers better review composite code changes?
6. RQ1: Occurrence of composite code changes
• Data source
• 4 open-source Java projects
• Revisions that changed >= 2 lines of code
• Commit logs and source code were manually inspected
Project        Time period               Revisions  Avg. cLOC  Avg. files
Ant            2010/04/27 - 2012/03/05   137        26.1       2.0
Commons Math   2011/11/28 - 2012/04/12   107        84.7       3.5
Xerces         2008/11/03 - 2012/03/13   116        63.6       3.0
JFreeChart     2008/07/02 - 2010/03/30    93        144.9      4.1
Total revisions: 453
10. Approach: Partition of the Patch (Formatting)
A composite code change is modeled as a set of changed statements, which is partitioned into change-slices using three heuristics: Formatting, Dependency, and Similarity.
Heuristic 1: Formatting
• Compute both Unix diff (text differencing) and ChangeDistiller* (AST differencing) results for the change
• Changed lines identified by the textual diff but not by the AST diff are treated as formatting changes and placed in their own change-slice
*http://www.ifi.uzh.ch/seal/research/tools/changeDistiller.html
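A minimal sketch of this heuristic in Python, for illustration only (not the authors' implementation): the textual diff comes from difflib, while the set of semantically changed lines from an AST differencer such as ChangeDistiller is stubbed out.

# A minimal sketch of the formatting heuristic (illustrative only).
# `ast_changed_lines` stands in for the output of an AST differencer.
import difflib

def textual_changed_lines(before: str, after: str) -> set:
    """Line numbers (in the new version) that a textual diff reports as changed."""
    changed = set()
    matcher = difflib.SequenceMatcher(None, before.splitlines(), after.splitlines())
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            changed.update(range(j1 + 1, j2 + 1))  # 1-based line numbers
    return changed

before = "int x = compute();\nif(x>0){ use(x); }\n"
after = "int x = compute();\nif (x > 0) {\n    use(x);\n}\n"

# Assumption: an AST diff reports no semantic change for pure reformatting.
ast_changed_lines = set()

# Lines changed textually but not semantically form the formatting change-slice.
formatting_slice = textual_changed_lines(before, after) - ast_changed_lines
print(formatting_slice)  # {2, 3, 4}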
11. Approach: Partition of the Patch (Dependency)
Heuristic 2: Dependency
• Changed statements that have a static dependency on each other belong in the same change-slice
• Inter-procedural backward static slicing, using the IBM T.J. Watson Libraries for Analysis (WALA)*, determines whether two changed statements are dependent
*http://wala.sourceforge.net/wiki/index.php/Main_Page
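A minimal sketch (in Python, for illustration; the talk's implementation is WALA-based) of how statements connected by dependency edges can be grouped into change-slices with union-find. The edges here are hypothetical stand-ins for slicing output.

# Union-find grouping: statements connected by a dependency edge end up in
# the same change-slice. Edges are hypothetical stand-ins for the output of
# inter-procedural backward static slicing (e.g., via WALA).
def partition_by_dependency(statements, dep_edges):
    parent = {s: s for s in statements}

    def find(s):
        while parent[s] != s:
            parent[s] = parent[parent[s]]  # path halving
            s = parent[s]
        return s

    for a, b in dep_edges:
        parent[find(a)] = find(b)  # union the two groups

    slices = {}
    for s in statements:
        slices.setdefault(find(s), []).append(s)
    return list(slices.values())

changed = ["stmt1", "stmt2", "stmt3", "stmt4"]
edges = [("stmt1", "stmt2"), ("stmt3", "stmt4")]  # hypothetical slicing output
print(partition_by_dependency(changed, edges))
# [['stmt1', 'stmt2'], ['stmt3', 'stmt4']]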
12. Approach: Partition of the Patch (Similarity)
Example commit: “protect array entries against corruption by returning a clone”
Heuristic 3: Similarity
• Two changed statements are related if they have the same change type (i.e., the same AST node type)
• and if the similarity of their string deltas is above a threshold (0.8, the Python library default)
[Figure: the patch before (P) and after (P’) the change]
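A minimal sketch of this heuristic using Python's difflib; the 0.8 cutoff follows the talk's note that the threshold is the Python library default, and the change-type labels and deltas below are invented for illustration, not taken from the actual patch.

# Two changes are related if they share a change type (AST node type) and
# their string deltas are similar above a threshold.
import difflib

THRESHOLD = 0.8  # per the talk: 0.8, the Python library default

def similar(change_a, change_b):
    type_a, delta_a = change_a
    type_b, delta_b = change_b
    if type_a != type_b:  # must be the same kind of AST edit
        return False
    return difflib.SequenceMatcher(None, delta_a, delta_b).ratio() >= THRESHOLD

c1 = ("RETURN_STATEMENT", "return (double[]) data.clone();")
c2 = ("RETURN_STATEMENT", "return (double[]) weights.clone();")
print(similar(c1, c2))  # True: same change type, near-identical deltas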
13. Example: Partitioning a Patch
[Figure: an example patch with added (+), deleted (-), and modified (^) code, partitioned into 3 change-slices: a formatting slice, a dependency-related slice, and two added methods with similar names]
14. Evaluation
• 78 composite code changes from the previously inspected data
• 3 human evaluators establish manual partitions for these changes
• Automatic partition results are compared to manual partitions
• Considered acceptable if it exactly matched the manual partition
Revisions by number of logical issues addressed (from RQ1):
Project        1 issue   2 issues   > 2 issues
Xerces         82%       13%        5%
Ant            92%       7%         1%
Commons Math   82%       12%        6%
JFreeChart     71%       18%        11%
16. Evaluation: Results
Automatic partitions acceptable to the human evaluators, per project:
Project        Acceptable # / Total #
Ant            8 / 11
Commons Math   10 / 19
Xerces         16 / 21
JFreeChart     20 / 27
Total          54 / 78 (69%)
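For reference, the totals above can be recomputed directly (a sanity check on the arithmetic, not part of the authors' tooling):

# Per-project (acceptable, total) counts from the table above.
results = {"Ant": (8, 11), "Commons Math": (10, 19),
           "Xerces": (16, 21), "JFreeChart": (20, 27)}
acceptable = sum(a for a, _ in results.values())
total = sum(t for _, t in results.values())
print(f"{acceptable} / {total} = {acceptable / total:.0%}")  # 54 / 78 = 69%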
17. Ant revision 943068 (24 changed LOC)
“Wrong assignment after I renamed the parameter. Unfortunately there doesn’t seem to be a testcase that catches the error.”
Later fixed in revision 943070
[Figure: one of the two change-slices after partitioning]
18. Preliminary User Study
• RQ3
• Can our automatic partition help developers better review composite changes?
• Participants
• 18 CS graduate students
• Task
• Participants review 12 composite code changes
• Answer a series of code review questions [1], e.g.,
• “What is the consequence of removing the schemaType field?”
• “What do changes in these files have in common?”
[1] Sillito et al., “Questions programmers ask during software evolution tasks,” FSE 2006
23. Related Work
• “Helping developers help themselves: Automatic decomposition of code review changesets.” Barnett et al., ICSE 2015
• The industrial perspective of composite code changes
• “The impact of tangled code changes.” Herzig and Zeller, MSR 2013
• “Filtering Noise in Mixed-Purpose Fixing Commits to Improve Defect Prediction and Localization.” Nguyen et al., ISSRE 2013
• How composite code changes affect defect prediction
24. Acknowledgment
• We thank all the students who participated in our user study.
• We especially thank Tao He and Hai Wan for their kind help in arranging the user study.
• We also thank Ananya Kanjilal for her comments on the paper draft.
Editor's Notes
The focus of this session, and of course this work, is code review, which in modern software development is typically practiced on lightweight code changes.
Characteristics of changes directly affect how code review is practiced.
Considering two types of code changes/commits/revisions.
The second change fixed 3 bugs, did some refactoring….
Which one is more acceptable from a code reviewer’s perspective?
From our previous study, we found evidence that CCC are … compared to atomic changes.
And this is the particular problem we try to tackle in this work.
Specifically, we want to answer these 3 questions.
We first investigate whether the problem of CCC is really prominent by taking a look at the revision history of these 4 open-source projects.
We inspected revisions that changed at least 2 lines of code within this time period, which results in 453 revisions in total.
We mainly inspect the commit log to see how many logical issues are addressed.
Filter criteria: multiple lines of source code changes, compilable.
Here is what we observed. Blue means the revision is atomic, that is, it addresses one single issue. And orange and grey mean that the revision addresses 2 or more than 2 issues.
Among the revisions that changed multiple source code lines, from 8% to 29%, on average 17% of patches are composite, that is, they address 2 or more logical issues.
non-trivial
So next, we propose an approach to partition a composite code change, so that in the end each subset is more semantically cohesive and hopefully easier to review.
First, a code change is modeled as a set of changed statements.
Basically, we identify relations between statements to derive the change partition. Statements in the same partition (change-slice) are related, while statements in different subsets are not.
Now the core of our approach is this black box: how do we …
We use 3 heuristics here
First, we believe that formatting changes should be on their own instead of mingled with other types of changes, such as bug fixes.
So we compute textual and AST differencing results of the change.
Changed lines that are not identified by AST differencing but appear in the textual differencing are considered formatting changes.
Second, we believe that changed statements that have static dependency should belong to the same subset.
We use WALA to perform … on changed lines so that we know whether two changes have a static dependency.
Third, we believe that changes with similar patterns are also related.
For example, these two functions both change the return of a variable in the same way. Although they don’t have a static dependency, it’s obvious that they are serving the same purpose…
So, as the 3rd heuristic, we check whether two changes have the same change type (e.g., if they both update the return statement, which we can tell from the AST node type).
If so, we check whether the similarity of their string deltas is above a threshold (0.8, the Python library default).
Here is an example of how our approach works. Before the change and after the change. Some code is added with the plus sign, some is deleted with the minus sign, others are modified with the caret sign
Based on our heuristics, this change is partitioned into 3 change-slices, marked with different colors.
First, the orange part is formatting changes
Another change-slice is the green part of the change, …
Finally, these two added methods also form a partition because their method names are similar.
Remember in RQ1, we inspected over 400 revisions and some of them address multiple logical issues?
We used these 78 revisions, marked in yellow and grey, in our evaluation.
We compare the automatic partitions produced by our approach to the manual partitions. An automatic partition for a change is considered acceptable if …
We are not claiming that the manual partition by human evaluators is the only way of partitioning a change, because code semantics are subject to …. In other words, the solution space for partitioning is potentially infinite.
However, our goal is to facilitate code review. So if our automatic partition results match a common way of manual partition, then we say it’s acceptable.
Among the 78 CCC, 54 are automatically partitioned the same way as human evaluators did.
Yet, this brings up our third research question: does this automatic partition really help code review?
We’ve observed several cases that indicate the benefits of our approach. Here is one of them.
Our approach partitions an Ant revision, which changes 24 LOC, into 2 subsets. One subset pinpoints a buggy area, which was later fixed by developers.
Specifically, developers said “no testcase…”
We think that our patch partition may help make this bug more obvious and identifiable from a big, composite code change.
We also conduct a user study to investigate…?
We recruit … to review 12 composite … from our evaluation data and answer
There are 2 treatments.
The control condition is that participants review … by file, as they usually do
The experimental condition is that .. by partition
We analyze the correctness of their answers and the time they spent coming up with the answers.
We found that the group using partitions to review patches performs significantly better than the control group in answering questions correctly, although they spend a slightly, but not significantly, longer time.
This result is also promising in showing that our approach potentially helps the patch review process.
One-tailed Paired-samples t-test
Time: no significant difference. Participants even spent a slightly longer time under the by-slice treatment. We conjecture that such a time increase mainly resulted from participants’ additional learning cost for using the unfamiliar “partitions” to understand patches.
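For illustration, the one-tailed paired-samples t-test named above could be run in Python with SciPy as follows; the correctness scores below are made-up placeholders, since the study's raw data are not in the slides.

# One-tailed paired-samples t-test on per-participant correctness scores.
# The numbers are hypothetical, not the study's data.
from scipy import stats

by_file = [0.55, 0.60, 0.50, 0.65, 0.58]   # control: review by file
by_slice = [0.70, 0.72, 0.61, 0.80, 0.69]  # treatment: review by change-slice

t, p_two_tailed = stats.ttest_rel(by_slice, by_file)
# ttest_rel is two-tailed; halve p when the effect is in the predicted direction.
p_one_tailed = p_two_tailed / 2 if t > 0 else 1 - p_two_tailed / 2
print(f"t = {t:.2f}, one-tailed p = {p_one_tailed:.3f}")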
69% composite changes are automatically partitioned the same way as human evaluators did
Our user study shows that participants’ code review effectiveness is significantly improved.
Finally, I would like to bring up these issues that are highly relevant and important, yet we haven’t been able to address them in this work.
For example ….
I think these would be interesting to discuss, and I’ll leave them for the discussion slot.
Cost: learning cost, execution cost
Hypothesis: size and complexity