How Do Software Engineers Understand Code Changes?
An Exploratory Study in Industry
FSE 2012
Yida Tao¹, Yingong Dang², Tao Xie³, Dongmei Zhang², Sunghun Kim¹
¹ The Hong Kong University of Science & Technology
² Microsoft Research Asia
³ North Carolina State University
>+ struct CIDEntry
>+ {
>+ const nsCID* cid;
>+ bool service;
“What is this used for, I can’t spot it in use anywhere.”

> if (hudRef && hud) {
> if (hudRef.consolePanel) {
> + hudRef.consolePanel.hidePopup()
“Why this change here? This is the only one that doesn’t seem to make sense for me…”

> browser_hide_removing.js
>+ browser_imageReload.js
>+ image_Reload.html
“These files are missing from this patch, aren’t they?”

>+ for (var i = aURL.length - 1; i >= 1; i--) {
>+ var chPrev = aURL.charAt(i - 1);
>+ var ch = aURL.charAt(i);
“I’m not sure why you walk this char by char…”

“…”
Research Questions
RQ1: How frequently is code change understanding practiced, and in which development tasks is it required?
RQ2: What are engineers’ information needs and difficulties in understanding code changes?
RQ3: How can the effectiveness and efficiency of the practices for understanding code changes be improved?
Study Methodology
• Literature Review: potential information needs
• Questionnaire Design: investigate RQ1, RQ2
• Pilot Interview: question is relevant & clear
• Online Survey: over 1000 MS employees
• Follow-up Interview: investigate RQ3
• Analysis: answering RQs
Survey Participants
180 respondents (16% response rate)
Role Distribution: Dev 55%, Test 31%, PM 14%
Product Teams: OS, Desktop App, Web App, Mobile App, Service, Others
RQs
RQ1 • Frequency ?
• Development tasks ?
RQ2 • Information needs ?
• Difficulty ?
RQ3 • Improvement ?
RQ1: Frequency of Understanding Code Changes
How often do you need to understand code changes?
o Several times each hour
o About once an hour
o Several times each day
o About once a day
o Several times each week
o About once a week
o Rarely
o Never
RQ1: Frequency of Understanding Code Changes
[Bar chart: percentage of responses (y-axis, 0% to 40%) for each frequency option; a callout on the chart reads “80%”.]
RQ1: Tasks Requiring Code Change Understanding
“Select the top three tasks that most often require you
to understand code changes”
[Design/Planning] Refactoring
[Implementation] Developing new feature
[Implementation] Fixing bug
[Integration] Resolving merge conflict
[Verification] Reviewing others’ code changes
[Verification] Reviewing my own code changes
[Verification] Writing & updating test cases
Other, please specify
RQ1: Tasks Requiring Code Change Understanding
[Bar chart: percentage of participants who selected the task (axis 0% to 75%); bar labels give the number of participants.]
Reviewing others' changes: 121
Fixing bug: 100
Developing new feature: 89
Reviewing my own changes: 73
Writing/updating test cases: 48
Refactoring: 34
Resolving merge conflict: 30
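The bar labels above are counts, while the chart axis is in percent. As a rough check, the shares can be recomputed from the counts. The sketch below assumes that all 180 survey respondents answered this question; the slide does not state the denominator, so treat the resulting percentages as approximate.

```python
# Counts as shown on the slide. The denominator of 180 is an assumption:
# it is the total number of survey respondents reported earlier in the deck.
RESPONDENTS = 180

task_counts = {
    "Reviewing others' changes": 121,
    "Fixing bug": 100,
    "Developing new feature": 89,
    "Reviewing my own changes": 73,
    "Writing/updating test cases": 48,
    "Refactoring": 34,
    "Resolving merge conflict": 30,
}


def share_of_respondents(counts, total):
    """Return each task's share of respondents as a percentage (one decimal)."""
    return {task: round(100 * n / total, 1) for task, n in counts.items()}


shares = share_of_respondents(task_counts, RESPONDENTS)
for task, pct in shares.items():
    print(f"{task:<30} {pct:5.1f}%")
```

Under this assumption every share stays below the 75% upper bound of the chart axis, which is consistent with the slide.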
Answers to RQs
RQ1 • Frequently practiced
• Major development tasks
RQ2 • Information needs ?
• Difficulty ?
RQ3 • Improvement ?
Potential Information Needs
Literature review (code-change analysis and management)
180 articles in 10 SE venues over the past decade
Reasoning & assessing the change
• Clones
• Design
•…
Exploring the change’s context & impact
• Risk
• Consistency
•…
Evaluating the change history
• Change proneness
• Defect proneness
Survey Questions
“Rate the importance & difficulty of each information need (formulated as question) in a change understanding task”
Rating scales:
Score   Importance           Difficulty
3       Very Important       Very Difficult
2       Important            Difficult
1       Somewhat Important   Relatively Easy
0       Not Important        Straightforward
Example questions:
• Does this change introduce code clones?
• Does this change break any code elsewhere?
• Which tests should be run to verify this change?
• Is this changed location a hotspot for past fixes?
• ……
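To make the two 0 to 3 scales concrete, the following sketch maps the rating labels to scores and reduces the answers for one question to a single (importance, difficulty) point. The slides do not say how ratings were aggregated, so the use of the arithmetic mean is an assumption, and the sample responses are hypothetical, not survey data.

```python
from statistics import mean

# Label-to-score mappings taken from the rating scales on the slide.
IMPORTANCE = {
    "Not Important": 0,
    "Somewhat Important": 1,
    "Important": 2,
    "Very Important": 3,
}
DIFFICULTY = {
    "Straightforward": 0,
    "Relatively Easy": 1,
    "Difficult": 2,
    "Very Difficult": 3,
}


def summarize(responses):
    """Reduce (importance label, difficulty label) pairs to mean scores.

    Averaging is an assumed aggregation, chosen only for illustration.
    """
    importance = mean(IMPORTANCE[imp] for imp, _ in responses)
    difficulty = mean(DIFFICULTY[dif] for _, dif in responses)
    return importance, difficulty


# Hypothetical answers for a single question; NOT data from the study.
hypothetical_responses = [
    ("Very Important", "Difficult"),
    ("Important", "Very Difficult"),
    ("Important", "Difficult"),
    ("Somewhat Important", "Relatively Easy"),
]

point = summarize(hypothetical_responses)
print(point)
```

Each information need then becomes one point that can be placed on an importance versus difficulty plane.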
RQ2: Information Needs
[Scatter plot: importance (x-axis, 0 to 3) versus difficulty of acquiring the information (y-axis, 0 to 3), one point per information need. Labeled points: Consistency, Risk, Completeness, Correctness, Design.]
Answers to RQs
RQ1 • Frequently practiced
• Major development tasks
RQ2 • Risk & Quality are important
but difficult to know
RQ3 • Improvement ?
RQ3: Interview Items
[Scatter plot: importance (x-axis, 0 to 3) versus difficulty of acquiring the information (y-axis, 0 to 3). Highlighted points: Risk, Defect proneness, Change proneness, Rationale.]
Assessing a Change’s Risk
[Scatter plot: importance (x-axis, 0 to 3) versus difficulty of acquiring the information (y-axis, 0 to 3). Highlighted point: Risk.]
Current Practice on Assessing a Change’s Risk
Manual Code Review
•Error prone
•Cross-components
•Unclear interface
•Hidden assumptions
•…
Unit & Regression Testing
•Time consuming
•Depends on how thorough the tests are
•…
Support Assessing a Change’s Risk
• Navigation in diff: manual code review using code analysis tools (e.g., go to definition, find all references, caller/callee tree) on the code change
“…miss a level of understanding object relationships”
[Diagram labels: Diff; Code Analysis; Navigation in diff]
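One way to picture “find all references” applied to a code change is a purely textual approximation: collect the identifiers on the added lines of a unified diff, then search the rest of the code base for whole-word matches. This is only a sketch under that assumption; the function names and the in-memory `files` mapping are hypothetical, and a real tool would rely on a parser or language service rather than regular expressions.

```python
import re

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def added_identifiers(diff_text):
    """Collect identifier-like tokens from the added ('+') lines of a diff.

    Language keywords are not filtered out; this is a textual heuristic.
    """
    found = set()
    for line in diff_text.splitlines():
        # Skip the '+++ b/file' header, keep real added lines.
        if line.startswith("+") and not line.startswith("+++"):
            found.update(IDENTIFIER.findall(line[1:]))
    return found


def find_references(identifier, files):
    """Return (path, line number) pairs where the identifier appears.

    `files` maps a path to its source text (hypothetical input format).
    """
    pattern = re.compile(rf"\b{re.escape(identifier)}\b")
    hits = []
    for path, text in files.items():
        for number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((path, number))
    return hits


# Hypothetical diff and code base, loosely modeled on the review example.
diff = (
    "--- a/registry.h\n"
    "+++ b/registry.h\n"
    "+struct CIDEntry\n"
    "+{\n"
    "+  const nsCID* cid;\n"
    "+  bool service;\n"
)
files = {"registry.cpp": "static CIDEntry kEntries[4];\nint unrelated;\n"}

idents = added_identifiers(diff)
print(find_references("CIDEntry", files))
print(find_references("service", files))
```

An identifier that yields no hits outside the diff is the situation behind a review question such as “What is this used for, I can’t spot it in use anywhere.”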
Support Assessing a Change’s Risk
Testing
• Which code must be retested as it is dependent upon the change?
• Who owns testing that dependency?
• Which tests must be run?
“An ‘Intelli-sense’ for updating these (affected) tests would be nice as well.”
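The questions about dependent code and tests can be illustrated with a small reverse-dependency walk. The sketch below assumes that a map from each unit to the units depending on it, and a map from units to their tests, are already available; both maps and all names in them are hypothetical, and the slides do not prescribe this technique.

```python
from collections import deque


def affected_tests(changed, dependents, tests_for):
    """Breadth-first walk over reverse dependencies.

    changed:    units touched by the change
    dependents: unit -> units that depend on it (assumed to be known)
    tests_for:  unit -> tests covering it (assumed to be known)
    Returns (sorted affected units, sorted tests to run).
    """
    seen = set(changed)
    queue = deque(changed)
    while queue:
        unit = queue.popleft()
        for dependent in dependents.get(unit, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    tests = set()
    for unit in seen:
        tests.update(tests_for.get(unit, ()))
    return sorted(seen), sorted(tests)


# Hypothetical project layout.
dependents = {
    "url_parser": ["image_loader"],
    "image_loader": ["browser_ui"],
}
tests_for = {
    "url_parser": ["test_url_parser"],
    "browser_ui": ["test_browser_ui"],
    "console": ["test_console"],
}

units, tests = affected_tests(["url_parser"], dependents, tests_for)
print(units)
print(tests)
```

The same traversal could return an owner per affected unit if an ownership map were available, which would address the “who owns testing that dependency” question.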
Discussion
[Scatter plot: importance (x-axis, 0 to 3) versus difficulty of acquiring the information (y-axis, 0 to 3). Highlighted points: Defect proneness, Change proneness, Rationale.]
Why is understanding the rationale of a change easy?
• Availability & quality of commit message
• “It’s entirely up to the dev making the change as to how hard or easy it is for someone else to figure out why the change was made.”
Why are historical metrics not that important?
• Developers
  o Here and now
  o Short-term issue
  o Own knowledge
• Testers & PMs
  o Historical metrics might be good to reflect bugginess and complexity of a specific area
Other Information Needs
“In addition to the information needs listed above, what else
would you ask when you try to understand a code change? How
difficult is it for you to answer?”
“Can this change be broken into smaller discrete changes?”
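As an illustration of what breaking a change into smaller ones could mean, the sketch below groups the hunks of a change so that hunks sharing at least one identifier end up together. This is a simple heuristic chosen for illustration, not a method from the study; the hunk labels and identifier sets are hypothetical.

```python
def decompose(hunks):
    """Group hunks that share identifiers, using union-find.

    hunks: hunk label -> set of identifiers the hunk touches (hypothetical).
    Returns a sorted list of sorted groups of hunk labels.
    """
    parent = {hunk: hunk for hunk in hunks}

    def find(hunk):
        # Follow parent links to the group representative (path halving).
        while parent[hunk] != hunk:
            parent[hunk] = parent[parent[hunk]]
            hunk = parent[hunk]
        return hunk

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    first_hunk_for = {}
    for hunk, identifiers in hunks.items():
        for identifier in identifiers:
            if identifier in first_hunk_for:
                union(first_hunk_for[identifier], hunk)
            else:
                first_hunk_for[identifier] = hunk

    groups = {}
    for hunk in hunks:
        groups.setdefault(find(hunk), []).append(hunk)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical composite change with two unrelated concerns.
change = {
    "console.js:10": {"hudRef", "consolePanel"},
    "console.js:42": {"consolePanel", "hidePopup"},
    "registry.cpp:7": {"CIDEntry"},
    "registry.cpp:90": {"CIDEntry", "nsCID"},
}

parts = decompose(change)
print(parts)
```

Each resulting group is a candidate for a separate, smaller change that a reviewer could read on its own.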
Answers to RQs
RQ1 • Frequently practiced
• Major development tasks
RQ2 • Risk & Quality are important but difficult to know
RQ3 • Assessing the risk of a change
• Decomposing a composite change
Summary
• A large-scale exploratory study on industrial practice in understanding code changes
  o Understanding code changes happens frequently in major development tasks
• An extensive exploration of engineers’ information needs for understanding code changes
  o Assessing a change’s risk, consistency & completeness
• A guideline for future research and tool design that aims at supporting change-understanding tasks
  o Navigation in diff
  o Change decomposition
Acknowledgment
All participants of survey / interview
Miryung Kim, Robin Moeur, Thomas Zimmermann, Jacek
Czerwonka, and Kathryn McKinley