VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Graduate Student
Aug. 5, 2014

  1. A Longitudinal Study of Programmers’ Backtracking
     YoungSeok Yoon (youngseok@cs.cmu.edu), Institute for Software Research, Carnegie Mellon University
     Brad Myers (bam@cs.cmu.edu), Human-Computer Interaction Institute, Carnegie Mellon University
  2. Background VL/HCC 2014 2
  3. What is Backtracking?
     •  Reverting code fragments to an earlier state
     •  Examples
        – Reverting a parameter to a previously used value
        – Removing debugging statements after fixing a bug
        – Restoring some deleted code
        – …
  4. Previous Studies of Backtracking
     •  Two qualitative studies of backtracking [Yoon+, CHASE’12]
        1.  Preliminary lab study (12 programmers)
        2.  Online survey (48 respondents)
  5. Previous Studies of Backtracking
     •  Observation
        – Programmers face challenges when backtracking
           •  locating the right code to be backtracked
           •  restoring some deleted code correctly
           •  reverting inter-related code fragments together
        – Programmers backtrack relatively often (75% answered at least “sometimes”)
  6. Limitations of the Previous Studies
     •  Lab study tasks required participants to backtrack
     •  Survey results may not correctly reflect the reality (e.g., programmers might backtrack unconsciously)
     •  The analyses were mostly qualitative
  7. As a follow-up: A Longitudinal Study of Backtracking
  8. Longitudinal Study of Backtracking
     •  Two main goals
        – Obtain backtracking statistics in order to quantify the need for backtracking tools
        – Identify backtracking situations that are not very well supported by existing programming tools
  9. Data Collection – Fluorite Logger
     http://www.cs.cmu.edu/~fluorite/
     •  Eclipse logger for fine-grained code editing data [Yoon+, PLATEAU’11]
     •  Information collected:
        – Initial snapshot of each source file
        – All edit operations (insert, delete, or replace)
        – Timestamps, executed editor commands, etc.
     •  Distributed to programmers since April 2012
     [Image credit: Rob Lavinsky, iRocks.com, CC-BY-SA-3.0]
  10. Study Participants
      Group Description                                | No. of Participants | Coding Time (hours): min / avg / max / sum
      The first author (myself)                        | 1                   | 294 / 294 / 294 / 294
      Graduate students @ CMU                          | 13                  |   3 /  40 / 216 / 520
      Research programmers / System scientists @ CMU   | 5                   |   6 / 118 / 446 / 588
      Graduate students @ UPitt                        | 2                   |   6 /  29 /  51 /  57
      Total                                            | 21 people           | 1,460 hours
  11. Analysis Process
      •  The data was too big for manual inspection
         – 1,345,241 coding events in the logs
      •  Key idea of the automated analysis
         – Keep the evolution history of individual AST nodes of interest throughout the lifetime of the nodes
         – Detect backtracking instances within each node
  12. Analysis Process Illustrated
      [Example Source Code Being Processed]
          package example;
          public class Example {
              public void printRectangleInfo() {
                  Rectangle rect = getEnclosingRect();      // S1
                  int value = rect.getHeight();             // S2
                  System.out.println("Value:" + value);     // S3
              }
              public Rectangle getEnclosingRect() {
                  // return some rectangle here
                  // actual code omitted
                  // ...
              }
          }
      [Memory of the Analyzer]
          Change history of S1
              [v1] Rectangle rect = getEnclosingRect();
          Change history of S2
              [v1] int value = rect.getHeight();
              [v2] int value = rect.getWidth();
              [v3] int value = rect.getSize();
              [v4] int value = rect.getHeight();
          Change history of S3
              [v1] System.out.println(value);
              [v2] System.out.println("Value:" + value);
      Backtracking Detected! (S2: v4 is identical to v1)
  13. Backtracking Instance
      Version history of one node over time:
          v1   A   getHeight();
          v2   B   getWidth();
          v3   C   getSize();
          v4   A   getHeight();
          v5   B   getWidth();
          v6   A   getHeight();
      Three Backtracking Instances:
      •  v1..v4
      •  v2..v5
      •  v4..v6
      NOTE: v1..v6 is NOT a backtracking instance
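The detection idea on slides 11 to 13 can be sketched in a few lines of Java. This is a minimal sketch and not the authors' analyzer. It assumes one plausible reading of slide 13: each version is paired with the most recent earlier version that has identical text, which yields v1..v4, v2..v5 and v4..v6 but not v1..v6. The class name `BacktrackingDetector` and method `detect` are hypothetical, and the history is assumed to contain no two consecutive identical versions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BacktrackingDetector {

    /**
     * Scans the version history of a single node. Whenever a version's text
     * equals an earlier version's text, it records {from, to} (1-based),
     * pairing the version with the MOST RECENT earlier occurrence.
     * Assumption: consecutive versions always differ.
     */
    public static List<int[]> detect(List<String> versions) {
        Map<String, Integer> lastSeen = new HashMap<>(); // text -> last index holding it
        List<int[]> instances = new ArrayList<>();
        for (int i = 0; i < versions.size(); i++) {
            String text = versions.get(i);
            Integer previous = lastSeen.get(text);
            if (previous != null) {
                instances.add(new int[] {previous + 1, i + 1});
            }
            lastSeen.put(text, i);
        }
        return instances;
    }

    public static void main(String[] args) {
        // The history shown on slide 13.
        List<String> history = Arrays.asList(
            "getHeight();", "getWidth();", "getSize();",
            "getHeight();", "getWidth();", "getHeight();");
        for (int[] instance : detect(history)) {
            System.out.println("v" + instance[0] + "..v" + instance[1]);
        }
        // prints v1..v4, v2..v5, v4..v6 (one per line)
    }
}
```

Remembering only the last occurrence of each text is what excludes v1..v6: by the time v6 is processed, the entry for `getHeight();` already points at v4.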
  14. Research Questions
      1.  How frequently do programmers backtrack in reality?
      2.  How large are the backtrackings?
      3.  How exactly do programmers perform backtracking? Are they backtracking manually?
      4.  Is there evidence of exploratory programming?
      5.  Are there backtrackings performed across multiple editing sessions?
      6.  Are there selective backtrackings, which cannot be performed by the undo command?
      7.  Do programmers backtrack to the same code repeatedly?
  15. 1. Frequency of Backtracking
      “How frequently do programmers backtrack in reality?”
      •  A total of 15,095 backtracking instances detected
      •  10.3 instances/hour on average
      [Bar chart: Backtracking Instances per Hour, participants P0–P20; min 3.8, max 28.4, average 10.3/h]
      Rate varied across participants (min=3.8/h, max=28.4/h), but all of them backtracked frequently
  16. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  How did we define the size of a backtracking?
         – Measured the edit distance (Levenshtein distance) between the original version and the other versions
         – Took the maximum value as the size of backtracking instance
      [Diagram: versions v1..v6 over time with states A B C D E A; v1 is the original version, forward changes lead to the farthest version (max edit distance), and backward changes return to A]
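The size definition on slide 16 can be illustrated with a short Java sketch. It is only a sketch under stated assumptions: versions are compared as plain strings at the character level, and the class name `BacktrackingSize` and its methods are hypothetical rather than taken from the authors' tooling. The Levenshtein routine is the standard two-row dynamic-programming formulation.

```java
import java.util.Arrays;
import java.util.List;

public class BacktrackingSize {

    /** Standard dynamic-programming Levenshtein distance, using two rows. */
    public static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // distance from the empty prefix of a
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1),
                                   prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    /**
     * Size of one backtracking instance: the largest edit distance between
     * the original version (first element) and any other version in it.
     */
    public static int size(List<String> versions) {
        String original = versions.get(0);
        int max = 0;
        for (String version : versions) {
            max = Math.max(max, levenshtein(original, version));
        }
        return max;
    }

    public static void main(String[] args) {
        // Toy instance: "abc" -> "abd" -> "xyz" -> "abc"
        System.out.println(size(Arrays.asList("abc", "abd", "xyz", "abc"))); // prints 3
    }
}
```

Taking the maximum rather than the last distance matters because the final version equals the original by definition, so its distance is always zero.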
  17. 2. Size of Backtracking
      “How large are the backtrackings?”
      [Histogram: Number of Backtracking Instances by Backtracking Size (No. of Characters)]
          Size (characters) | Instances
          1                 | 1304
          2-9               | 3752
          10-49             | 5269
          50-99             | 2026
          100-499           | 2259
          500-999           |  265
          ≥1000             |  220
  18. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  Method / variable names
      •  String literals
      •  Number literals
      [Histogram of backtracking sizes, as on slide 17]
  19. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  Simple parameter changes
      •  Reverting renaming changes on methods or variables
      [Histogram of backtracking sizes, as on slide 17]
  20. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  Single statement changes
      •  Surrounding existing code (e.g., try-catch) then reverting
      [Histogram of backtracking sizes, as on slide 17]
  21. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  Adding, removing, or modifying multiple statements and then reverting them altogether
      [Histogram of backtracking sizes, as on slide 17]
  22. 2. Size of Backtracking
      “How large are the backtrackings?”
      •  Significant algorithmic changes
      •  Adding / removing / modifying multiple methods and then reverting
      [Histogram of backtracking sizes, as on slide 17]
  23. 2. Size of Backtracking
      “How large are the backtrackings?”
      [Histogram of backtracking sizes, as on slide 17]
      Programmers backtrack at varying granularities, from simple name changes to significant algorithmic changes
  24. 3. Backtracking Tactics
      “How exactly do programmers perform backtracking?”
      [Pie chart: How were the backtrackings performed?]
      •  Manually: 38%
      •  Using Existing Tools: 49%
      •  Others: 13%
  25. 3. Backtracking Tactics
      “How exactly do programmers perform backtracking?”
      Using Existing Tools (49%):
      •  Undo (37%)
      •  Paste (6%)
      •  Redo (3%)
      •  Content Assist (2%)
      •  Toggle Comment (1%)
  26. 3. Backtracking Tactics
      “How exactly do programmers perform backtracking?”
      Others (13%):
      •  Unidentified (9%)
      •  Multiple (4%)
  27. 3. Backtracking Tactics
      “How exactly do programmers perform backtracking?”
      Manually (38%):
      •  Manual Deletion (25%)
      •  Manual Typing (13%)
      38% of the backtracking instances were NOT supported by existing tools, indicating programmers need better backtracking tools
  28. 4. Cross-Run Backtracking
      “Is there evidence of exploratory programming?”
      •  Make some changes → run the application → revert the code back to the way it was before
      •  20.4% of all instances were cross-run instances on average.
      [Bar chart: Cross-Run Backtracking Percentage, participants P0–P20; average 20.4%]
      This supports the claim that programmers do this kind of exploratory programming.
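The change → run → revert pattern can be checked mechanically once run events are logged alongside edits. The sketch below is an assumption about how such a check could look, not the study's implementation: it treats an instance as cross-run when at least one application launch falls strictly between the timestamp of the first diverging edit and the timestamp of the restoring edit. The class `CrossRunCheck` and its parameters are hypothetical.

```java
public class CrossRunCheck {

    /**
     * Returns true if any run timestamp lies strictly inside the
     * backtracking instance's time span (startMillis, endMillis).
     *
     * @param startMillis   time of the first edit that diverged from the original
     * @param endMillis     time of the edit that restored the original
     * @param runTimestamps times at which the application was launched
     */
    public static boolean isCrossRun(long startMillis, long endMillis, long[] runTimestamps) {
        for (long run : runTimestamps) {
            if (run > startMillis && run < endMillis) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        long[] runs = {1_500L, 9_000L};
        System.out.println(isCrossRun(1_000L, 2_000L, runs)); // prints true
        System.out.println(isCrossRun(3_000L, 4_000L, runs)); // prints false
    }
}
```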
  29. 5. Cross-Session Backtracking
      “Are there backtrackings performed across multiple editing sessions?”
      [Chart: Cumulative Percentage of All Backtracking Instances (BIs) by Editing Session Distance]
          Editing Session Distance | Cumulative %
          Same Session             | 96.7%
          ≤1                       | 98.2%
          ≤2                       | 98.8%
          ≤3                       | 99.0%
          ≤4                       | 99.2%
          ≤5                       | 99.3%
      A backtracking tool would work for 97% of the cases with only the history within the same editing session.
  30. 6. Selective Backtracking
      “Are there backtrackings that could not have been done by regular undo?”
      •  Selective backtracking?
         – There are edits in the middle of a backtracking that change other parts of the same file and that are not backtracked together
      [Bar chart: Selective Backtracking Percentage, participants P0–P20; average 9.5%]
  31. 6. Selective Backtracking
      “Are there backtrackings that could not have been done by regular undo?”
      •  Selective backtracking?
         – There are edits in the middle of a backtracking that change other parts of the same file and that are not backtracked together
      [Bar chart: Selective Backtracking Percentage, participants P0–P20; average 9.5%]
      On average, 9.5% of all backtracking instances were selective, which supports the claim that programmers need better selective backtracking tools
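One way to make the definition of selective backtracking concrete is to compare whole-file snapshots taken at the start and at the end of an instance. The following Java sketch is a simplification under explicit assumptions: a file is modeled as a map from a hypothetical node id (such as `"S2"`) to that node's text, and an instance counts as selective when the target node is restored while some other node still differs. A linear undo would have rolled those other edits back as well. The class `SelectiveCheck` is illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

public class SelectiveCheck {

    /**
     * @param target node id whose text was backtracked
     * @param before node id -> text when the instance started
     * @param after  node id -> text when the instance ended
     * @return true if the target was restored but the rest of the file was not
     */
    public static boolean isSelective(String target,
                                      Map<String, String> before,
                                      Map<String, String> after) {
        boolean targetRestored = before.containsKey(target)
            && Objects.equals(before.get(target), after.get(target));
        // The target is identical in both maps, so any remaining difference
        // must come from other nodes that were edited and not reverted.
        return targetRestored && !before.equals(after);
    }

    public static void main(String[] args) {
        Map<String, String> before = new HashMap<>();
        before.put("S2", "int value = rect.getHeight();");
        before.put("S3", "System.out.println(value);");

        Map<String, String> after = new HashMap<>(before);
        after.put("S3", "System.out.println(\"Value:\" + value);"); // kept, not reverted

        System.out.println(isSelective("S2", before, after)); // prints true
    }
}
```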
  32. 7. Repeat Count
      “Do programmers backtrack to the same code repeatedly?”
      [Bar chart: Percentage of Backtracked Nodes by Repeat Count]
          Repeat Count | Backtracked Nodes
          1            | 85.0%
          2            | 11.1%
          3            |  2.7%
          4            |  0.7%
          ≥5           |  0.6%
      Most (85%) of the time, programmers backtrack once and then never get back to the same state after diverging from it
  33. Wrapping Up
  34. Limitations of the Analysis
      •  Only exact and successful backtracking instances were detected
      •  Only for Java / Eclipse
      •  Could not determine the semantic relationships among the backtracking instances
  35. Main Takeaways
      •  Programmers backtrack quite frequently (10.3/hr)
      •  38% of the backtrackings are done purely manually
      •  9.5% of the backtrackings are selective, meaning that they are not supported by conventional undo
      •  Programmers would benefit from better backtracking tools!
  36. Azurite – Selective Undo Tool
      http://www.cs.cmu.edu/~azurite/
      •  A selective undo plug-in for Eclipse IDE
         – can handle the 9.5% of selective backtrackings
      •  Presented at VL/HCC
         – Initial User Interfaces of the Tool: Yoon, Myers, & Koo, “Visualization of Fine-Grained Code Change History”, Full Paper at VL/HCC’13
         – Tool Demonstration (yesterday): Yoon & Myers, “A Demonstration of Azurite: Backtracking Tool for Programmers”, Showpiece at VL/HCC’14
      [Image credit: cobalt, flickr.com, CC-BY-SA-2.0]
  37. Thank You!
      •  FLUORITE: A logging plug-in for Eclipse (Full of Low-level User Operations Recorded In The Editor), available at: http://www.cs.cmu.edu/~fluorite/
      •  AZURITE: A selective undo plug-in for Eclipse (Adding Zest to Undoing and Restoring Improves Textual Exploration), available at: http://www.cs.cmu.edu/~azurite/
      •  Thanks for funding from: