VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking
A Longitudinal Study of
Programmers’ Backtracking
YoungSeok Yoon
(youngseok@cs.cmu.edu)
Institute for Software Research
Carnegie Mellon University
Brad Myers
(bam@cs.cmu.edu)
Human-Computer Interaction Institute
Carnegie Mellon University
What is Backtracking?
• Reverting code fragments to an earlier state
• Examples
– Reverting a parameter to a previously used value
– Removing debugging statements after fixing a bug
– Restoring some deleted code
– …
Previous Studies of Backtracking
• Two qualitative studies of backtracking
[Yoon+, CHASE’12]
1. Preliminary lab study (12 programmers)
2. Online survey (48 respondents)
Previous Studies of Backtracking
• Observation
– Programmers face challenges when backtracking
• locating the right code to be backtracked
• restoring some deleted code correctly
• reverting inter-related code fragments together
– Programmers backtrack relatively often
(75% answered at least “sometimes”)
Limitations of the Previous Studies
• Lab study tasks required participants to backtrack
• Survey results may not accurately reflect reality
(e.g., programmers might backtrack unconsciously)
• The analyses were mostly qualitative
Longitudinal Study of Backtracking
• Two main goals
– Obtain backtracking statistics in order to quantify
the need for backtracking tools
– Identify backtracking situations that are not very
well supported by existing programming tools
Data Collection – Fluorite Logger
http://www.cs.cmu.edu/~fluorite/
• Eclipse logger for fine-grained code
editing data [Yoon+, PLATEAU’11]
• Information collected:
– Initial snapshot of each source file
– All edit operations (insert, delete, or replace)
– Timestamps, executed editor commands, etc.
• Distributed to programmers since April 2012
[Image source: Rob Lavinsky, iRocks.com, CC-BY-SA-3.0]
Study Participants
Group Description | No. of Participants | Coding Time in hours (min / avg / max / sum)
The first author (myself) | 1 | 294 / 294 / 294 / 294
Graduate students @ CMU | 13 | 3 / 40 / 216 / 520
Research programmers / System scientists @ CMU | 5 | 6 / 118 / 446 / 588
Graduate students @ UPitt | 2 | 6 / 29 / 51 / 57
Total | 21 people | 1,460 hours
Analysis Process
• The data was too big for manual inspection
– 1,345,241 coding events in the logs
• Key idea of the automated analysis
– Keep the evolution history of individual AST nodes
of interest throughout the lifetime of the nodes
– Detect backtracking instances within each node
Analysis Process Illustrated
    package example;
    public class Example {
        public void printRectangleInfo() {
            Rectangle rect = getEnclosingRect();      // S1
            int value = rect.getHeight();             // S2
            System.out.println("Value:" + value);     // S3
        }
        public Rectangle getEnclosingRect() {
            // return some rectangle here
            // actual code omitted
            // ...
        }
    }
[Example Source Code Being Processed]
Change history of S1
    [v1] Rectangle rect = getEnclosingRect();
Change history of S2
    [v1] int value = rect.getHeight();
    [v2] int value = rect.getWidth();
    [v3] int value = rect.getSize();
    [v4] int value = rect.getHeight();    <- Backtracking Detected! (same as v1)
Change history of S3
    [v1] System.out.println(value);
    [v2] System.out.println("Value:" + value);
[Memory of the Analyzer]
Backtracking Instance
Version (in time order) | State | Code
v1 | A | getHeight();
v2 | B | getWidth();
v3 | C | getSize();
v4 | A | getHeight();
v5 | B | getWidth();
v6 | A | getHeight();
Three Backtracking Instances:
• v1..v4
• v2..v5
• v4..v6
NOTE: v1..v6 is NOT a backtracking instance
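The rule implied by this example can be sketched in a few lines of Java. This is only a minimal sketch under stated assumptions, not the analyzer used in the study: it treats each version of a node as a plain string rather than an AST node, and it assumes an instance runs from a version to the next later version with identical content. The class and method names (BacktrackingDetector, detect) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class BacktrackingDetector {
    /**
     * Returns {from, to} pairs of 1-based version numbers.
     * Assumption: an instance links a version to the NEXT later version
     * with identical content, so v1..v6 is not reported when v4 already
     * returned to the same state in between.
     */
    static List<int[]> detect(List<String> versions) {
        Map<String, Integer> lastSeen = new HashMap<>(); // content -> latest version number
        List<int[]> instances = new ArrayList<>();
        for (int i = 0; i < versions.size(); i++) {
            int version = i + 1;
            String content = versions.get(i);
            Integer earlier = lastSeen.get(content);
            // Skip adjacent identical versions: nothing was changed and reverted.
            if (earlier != null && version - earlier > 1) {
                instances.add(new int[] {earlier, version});
            }
            lastSeen.put(content, version);
        }
        return instances;
    }

    public static void main(String[] args) {
        List<String> history = Arrays.asList(
            "getHeight();", "getWidth();", "getSize();",
            "getHeight();", "getWidth();", "getHeight();");
        for (int[] pair : detect(history)) {
            System.out.println("v" + pair[0] + "..v" + pair[1]);
        }
    }
}
```

For the six-version history above, this reports v1..v4, v2..v5, and v4..v6, and does not report v1..v6.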
Research Questions
1. How frequently do programmers backtrack in reality?
2. How large are the backtrackings?
3. How exactly do programmers perform backtracking?
Are they backtracking manually?
4. Is there evidence of exploratory programming?
5. Are there backtrackings performed across multiple editing
sessions?
6. Are there selective backtrackings, which cannot be
performed by the undo command?
7. Do programmers backtrack to the same code repeatedly?
1. Frequency of Backtracking
“How frequently do programmers backtrack in reality?”
• A total of 15,095 backtracking instances detected
• 10.3 instances/hour on average
[Chart: Backtracking Instances per Hour, per participant (P0 to P20); min 3.8, max 28.4, average 10.3/h]
Rate varied across
participants
(min=3.8/h, max=28.4/h),
but all of them backtracked
frequently
2. Size of Backtracking
“How large are the backtrackings?”
• How did we define the size of a backtracking?
– Measured the edit distance (Levenshtein distance) between the original
version and the other versions
– Took the maximum value as the size of the backtracking instance
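[Diagram: versions v1 (A), v2 (B), v3 (C), v4 (D), v5 (E), v6 (A) in time order. v1 is the original version; the farthest version is the one with the maximum edit distance from it. Changes leading up to the farthest version are forward changes; changes from there back to the original state are backward changes.]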
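The size definition above can be sketched as follows. This is an illustrative sketch rather than the study's implementation: it assumes each version is available as a string, uses the standard dynamic-programming Levenshtein distance, and the names (BacktrackingSize, levenshtein, size) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class BacktrackingSize {
    /** Standard Levenshtein distance using two rolling rows. */
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Size of one backtracking instance: the maximum edit distance between
     * the original version (first element) and every other version in it.
     */
    static int size(List<String> versions) {
        String original = versions.get(0);
        int max = 0;
        for (String version : versions) {
            max = Math.max(max, levenshtein(original, version));
        }
        return max;
    }

    public static void main(String[] args) {
        List<String> instance = Arrays.asList(
            "rect.getHeight()", "rect.getWidth()", "rect.getSize()", "rect.getHeight()");
        System.out.println("Backtracking size: " + size(instance));
    }
}
```

Taking the maximum rather than the sum means that a long detour through many small edits still counts as a small backtracking if the code never strayed far from the original.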
2. Size of Backtracking
“How large are the backtrackings?”
[Chart: Number of Backtracking Instances by Backtracking Size (No. of Characters)]
Backtracking Size (No. of Characters) | Number of Backtracking Instances
1 | 1304
2-9 | 3752
10-49 | 5269
50-99 | 2026
100-499 | 2259
500-999 | 265
≥1000 | 220
2. Size of Backtracking
“How large are the backtrackings?”
• Method / variable names
• String literals
• Number literals
2. Size of Backtracking
“How large are the backtrackings?”
• Simple parameter changes
• Reverting renaming changes on methods or variables
2. Size of Backtracking
“How large are the backtrackings?”
• Single statement changes
• Surrounding existing code (e.g., try-catch) then reverting
2. Size of Backtracking
“How large are the backtrackings?”
• Adding, removing, or modifying multiple statements and then reverting them altogether
2. Size of Backtracking
“How large are the backtrackings?”
• Significant algorithmic changes
• Adding / removing / modifying multiple methods and then reverting
2. Size of Backtracking
“How large are the backtrackings?”
Programmers backtrack at
varying granularities, from
simple name changes to
significant algorithmic changes
3. Backtracking Tactics
“How exactly do programmers perform backtracking?”
How were the backtrackings performed?
• Manually (38%)
• Using Existing Tools (49%)
• Others (13%)
• Using Existing Tools (49%)
– Undo (37%)
– Paste (6%)
– Redo (3%)
– Content Assist (2%)
– Toggle Comment (1%)
• Others (13%)
– Unidentified (9%)
– Multiple (4%)
• Manually (38%)
– Manual Deletion (25%)
– Manual Typing (13%)
38% of the backtracking
instances were NOT
supported by existing tools,
indicating programmers need
better backtracking tools
4. Cross-Run Backtracking
“Is there evidence of exploratory programming?”
• Make some changes → run the application → revert the code back to the way it was before
• 20.4% of all instances were cross-
run instances on average.
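One way to read the cross-run criterion is as a simple time-window check. The sketch below is an assumption about how such a check could be written, not the study's actual code: it supposes that each backtracking instance carries the time the code first diverged from its original state and the time it was reverted, and that the log provides the timestamps of application runs. All names (CrossRunCheck, isCrossRun, divergedAt, revertedAt) are hypothetical.

```java
import java.util.Arrays;
import java.util.List;

class CrossRunCheck {
    /**
     * Returns true if at least one application run happened after the code
     * diverged from its original state and before it was reverted.
     * Timestamps are in milliseconds; the exact unit does not matter as long
     * as all three inputs use the same clock.
     */
    static boolean isCrossRun(long divergedAt, long revertedAt, List<Long> runTimes) {
        for (long run : runTimes) {
            if (run > divergedAt && run < revertedAt) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Edited at t=100, ran the application at t=200, reverted at t=300.
        System.out.println(isCrossRun(100, 300, Arrays.asList(200L)));
    }
}
```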
[Chart: Cross-Run Backtracking Percentage, per participant (P0 to P20); average 20.4%]
This supports the claim that
programmers engage in this kind of
exploratory programming.
5. Cross-Session Backtracking
“Are there backtrackings performed across multiple editing sessions?”
[Chart: Cumulative Percentage of All Backtracking Instances (BIs) by Editing Session Distance]
Editing Session Distance | Cumulative Percentage of All BIs
Same Session | 96.7%
≤1 | 98.2%
≤2 | 98.8%
≤3 | 99.0%
≤4 | 99.2%
≤5 | 99.3%
A backtracking tool would
work for 97% of the cases
with only the history within
the same editing session.
6. Selective Backtracking
“Are there backtrackings that could not have been done by regular undo?”
• Selective backtracking?
– There are edits in the middle of a backtracking that change other parts of the same file, and those edits are not backtracked together
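The definition can be made concrete with a small check. This is a hedged sketch and not the analyzer's real logic: it assumes we have a snapshot of every node in the file (node id mapped to its source text) just before the backtracked node diverged, and another snapshot at the moment it was reverted. If any other node differs between the two snapshots, a linear undo would have reverted that node as well, so the backtracking is treated as selective. The names (SelectiveBacktrackingCheck, isSelective) and the snapshot representation are hypothetical.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Objects;
import java.util.Set;

class SelectiveBacktrackingCheck {
    /**
     * Returns true if some node other than targetNode changed (or was added
     * or removed) between the two snapshots, i.e. an intermediate edit
     * elsewhere in the file was kept while targetNode was reverted.
     */
    static boolean isSelective(String targetNode,
                               Map<String, String> beforeDivergence,
                               Map<String, String> atRevert) {
        Set<String> nodes = new HashSet<>(beforeDivergence.keySet());
        nodes.addAll(atRevert.keySet());
        for (String node : nodes) {
            if (node.equals(targetNode)) {
                continue; // the backtracked node itself is back to its old state
            }
            if (!Objects.equals(beforeDivergence.get(node), atRevert.get(node))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("S2", "int value = rect.getHeight();");
        before.put("S3", "System.out.println(value);");
        Map<String, String> after = new HashMap<>(before);
        // S3 was edited while S2 was being changed, and that edit was kept.
        after.put("S3", "System.out.println(\"Value:\" + value);");
        System.out.println(isSelective("S2", before, after));
    }
}
```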
[Chart: Selective Backtracking Percentage, per participant (P0 to P20); average 9.5%]
On average, 9.5% of all
backtracking instances were
selective, which suggests that
programmers need better
selective backtracking tools
7. Repeat Count
“Do programmers backtrack to the same code repeatedly?”
[Chart: Percentage of Backtracked Nodes by Repeat Count]
Repeat Count | Percentage of Backtracked Nodes
1 | 85.0%
2 | 11.1%
3 | 2.7%
4 | 0.7%
≥5 | 0.6%
Most (85%) of the time,
programmers backtrack once
and then never get back to
the same state after diverging
from it
Limitations of the Analysis
• Only exact and successful backtracking instances were
detected
• Only for Java / Eclipse
• Could not determine the semantic relationships
among the backtracking instances
Main Takeaways
• Programmers backtrack quite frequently (10.3/hr)
• 38% of the backtrackings are done purely manually
• 9.5% of the backtrackings are selective, meaning that
they are not supported by conventional undo
• Programmers would benefit from better
backtracking tools!
Azurite – Selective Undo Tool
http://www.cs.cmu.edu/~azurite/
• A selective undo plug-in for Eclipse IDE
– can handle the 9.5% of selective backtrackings
• Presented at VL/HCC
– Initial User Interfaces of the Tool:
Yoon, Myers, & Koo, “Visualization of Fine-Grained Code Change History”,
Full Paper at VL/HCC’13
– Tool Demonstration (yesterday):
Yoon & Myers, “A Demonstration of Azurite: Backtracking Tool for Programmers”,
Showpiece at VL/HCC’14
[Image source: cobalt, flickr.com, CC-BY-SA-2.0]
Thank You!
• FLUORITE: A logging plug-in for Eclipse
(Full of Low-level User Operations Recorded In The Editor)
available at: http://www.cs.cmu.edu/~fluorite/
• AZURITE: A selective undo plug-in for Eclipse
(Adding Zest to Undoing and Restoring Improves Textual Exploration)
available at: http://www.cs.cmu.edu/~azurite/
• Thanks for funding from: