These are the slides for a series of work-in-progress talks given about Timelapse in June 2012. Please send comments and questions to @brrian on Twitter, or find my email on the Timelapse project page:
http://cs.washington.edu/homes/burg/timelapse/
2. Reproducing program behavior is essential in software development
Demonstration: interacting with a program to achieve a behavior
Investigation: interacting with developer tools to understand behavior
The phases of demonstrating and investigating are inseparable.
3. Task coupling can be hazardous
Example: debugging a color picker widget
4. Barriers to reproducing behavior
Demonstration
• Tedious, time-consuming, and manual
• Difficult to explain repro. steps (e.g., in a bug report)
Investigation
• Repeated demonstration to investigate same behavior
• The behavior being investigated is not held constant
• Interleaving the two phases incurs task switching penalties
5. Simplifying by decoupling
Decoupling behavior demonstration from investigation
simplifies reproduction and enables new practices.
Behavior could be demonstrated, and
later, investigated, by different people.
Behavior could be reproduced precisely,
automatically, quickly, and easily.
The means of investigation could change
independently without re-demonstration.
http://dryicons.com, http://www.getfirebug.com/
6. Solution: deterministic record/replay
Old idea: achieve identical execution by controlling nondeterminism.
Recent idea: modern browsers are virtual machines.
New idea: adapt virtual machine record/replay techniques to web browsers.
Prior systems by level:
• User-space libraries: Jockey (C), Mugshot (JavaScript)
• Application-specific: video game engines
• Operating systems: dOS, Determinator
• Virtual machines: ReVirt, LEAP, VMware Workstation
• Hardware: DMP, RCDC, Calvin
7. Timelapse: record/replay for the web
Timelapse is an infrastructure and developer tool to
record, replay, and investigate web application behavior.
Contributions
• Adaptation of VM record/replay techniques to the web
• Infrastructure and tool for interactive record/replay
• Design concepts for visualizing, interacting with recordings
8. Demonstrating and Investigating
(with Timelapse)
1. Begin recording
2. Demonstrate behavior
3. End recording
4. Investigate as needed
Example: color picker, revisited
9. Recreating a specific execution
Executions differ because of sources of nondeterminism.
To eliminate nondeterminism,
• Identify what is/isn’t deterministic
• Decide what sources to capture, and how
• Figure out how to replay the recorded nondeterminism
Timelapse adapts definitions and techniques from VM research.
10. Programs and Inputs
An execution is determined by a program and its inputs.
main.js
function foo(a,b) { return a + b; }
function bar(c) { return c * Math.random(); }
foo(argv[1], argv[2]);
bar(argv[1]); bar(argv[3]);
12. Programs and Inputs
An execution is determined by a program and its inputs.
main.js
function foo(a,b) { return a + b; }
function bar(c) { return c * Math.random(); }
foo(argv[1], argv[2]);
bar(argv[1]); bar(argv[3]);
Sources of nondeterminism that affect the course
of execution are implicit inputs to the program.
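The point about implicit inputs can be sketched in a few lines. This is an illustration only, not Timelapse code: makeRecorder and makeReplayer are hypothetical helpers, and bar is rewritten to take its source of randomness as a parameter so that the source can be swapped.

```javascript
// Hypothetical helpers (not part of Timelapse): treat Math.random()
// as an implicit input by logging its results, then feeding them back.
function makeRecorder(source) {
  const log = [];
  const random = () => {
    const value = source();
    log.push(value); // capture the implicit input
    return value;
  };
  return { random, log };
}

function makeReplayer(log) {
  let next = 0;
  return () => {
    if (next >= log.length) throw new Error("replay ran past the end of the log");
    return log[next++]; // reuse the captured input
  };
}

// bar from main.js, with its source of randomness made explicit.
function bar(c, random) { return c * random(); }

// Recording: the real Math.random() is called and its results are logged.
const recorder = makeRecorder(Math.random);
const recorded = [bar(5, recorder.random), bar(5, recorder.random)];

// Replay: the logged values stand in for Math.random().
const replayRandom = makeReplayer(recorder.log);
const replayed = [bar(5, replayRandom), bar(5, replayRandom)];
```

After this runs, recorded and replayed hold the same two numbers, even though Math.random() was only called during recording.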
13. Interpreters and Inputs
In an interpreter session, the program is the interpreter itself.
The session could be reproduced exactly by reusing inputs.
defs.js
function foo(a,b) { return a + b; }
function bar(c) { return c * Math.random(); }
REPL session
>> load("defs.js");
>> foo(1,2);
>> foo(3,4);
>> bar(5);
>> bar(5);
Interpreter inputs
1. load defs.js
2. CALL foo 1, 2
3. CALL foo 3, 4
4. CALL bar, 5
5. RET Math.random, 0.453863
6. CALL bar, 5
7. RET Math.random, 0.986364
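As a rough sketch of what reusing inputs means here, the log above can drive a toy interpreter. The log entry shape ({ op, fn, args, value }) and the replaySession function are assumptions made for this example; the slides do not specify a log format.

```javascript
// Illustrative only: a toy "interpreter" whose whole session is
// determined by an input log like the one on the slide.
const defs = {
  foo: (env, a, b) => a + b,
  bar: (env, c) => c * env.random(),
};

// CALL entries are explicit inputs; RET Math.random entries are the
// implicit inputs that were observed while recording.
const inputLog = [
  { op: "CALL", fn: "foo", args: [1, 2] },
  { op: "CALL", fn: "foo", args: [3, 4] },
  { op: "CALL", fn: "bar", args: [5] },
  { op: "RET", fn: "Math.random", value: 0.453863 },
  { op: "CALL", fn: "bar", args: [5] },
  { op: "RET", fn: "Math.random", value: 0.986364 },
];

function replaySession(log) {
  // Logged return values are handed out in order instead of calling Math.random().
  const randomValues = log.filter((e) => e.op === "RET").map((e) => e.value);
  let next = 0;
  const env = { random: () => randomValues[next++] };
  return log
    .filter((e) => e.op === "CALL")
    .map((e) => defs[e.fn](env, ...e.args));
}

const firstReplay = replaySession(inputLog);
const secondReplay = replaySession(inputLog);
```

Every replay of the same log yields the same four results, because both the calls and the values returned by Math.random come from the log.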
14. Web browsers are interpreters
Web browsers interpret JavaScript, CSS, HTML, and user input.
[Diagram: Input -> Web Interpreter -> Output]
15. Web browsers are interpreters
Executions can be reproduced by capturing and reusing inputs.
[Diagram: Input -> Web Interpreter -> Output, with inputs captured in an inputs log]
16. Record/replay goals
Precise deterministic record/replay
• Must capture and reuse explicit and implicit inputs
• Able to detect divergence of replay from recording
Self-contained, sharable recordings
• Must not depend on any machine-specific context
• Replay does not require external communication
Integration with existing architecture and tools
• Record/replay must have low performance overhead
• Controllable and compatible with existing developer tools
17. Record/replay technique: proxying
During normal execution, Date.now() returns the current time.
function foo(a,b) { return a + b; }
function bar(c) { return "now:" + Date.now(); }
/* file: Source/wtf/DateMath.h */
inline double jsCurrentTime()
{
    return floor(WTF::currentTimeMS());
}
18. Record/replay technique: proxying
During recording, the return value of Date.now() is saved.
function foo(a,b) { return a + b; }
function bar(c) { return "now:" + Date.now(); }
/* file: Source/wtf/DateMath.h */
inline double jsCurrentTime()
{
    return floor(WTF::currentTimeMS());
}
Inputs log
19. Record/replay technique: proxying
On replay, the logged return value of Date.now() is used.
function foo(a,b) { return a + b; }
function bar(c) { return "now:" + Date.now(); }
/* file: Source/wtf/DateMath.h */
inline double jsCurrentTime()
{
    return floor(WTF::currentTimeMS());
}
Inputs log
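The three proxying slides (normal execution, recording, replay) can be summarized in one sketch. The real shim sits in the browser's C++ code around jsCurrentTime(); the JavaScript below only illustrates the idea, and createTimeShim, its mode names, and its state object are hypothetical.

```javascript
// A sketch of the proxying idea; createTimeShim and its modes are
// hypothetical and are not Timelapse's actual API.
function createTimeShim(realNow) {
  const state = { mode: "normal", log: [], cursor: 0 };
  function now() {
    if (state.mode === "replay") {
      if (state.cursor >= state.log.length) throw new Error("inputs log exhausted");
      return state.log[state.cursor++]; // replay: use the logged value
    }
    const value = realNow();
    if (state.mode === "record") state.log.push(value); // record: save the value
    return value; // normal and record: pass the real value through
  }
  return { now, state };
}

const shim = createTimeShim(() => Date.now());
function bar(c) { return "now:" + shim.now(); }

shim.state.mode = "record";
const recordedOutput = bar(1);

shim.state.mode = "replay";
const replayedOutput = bar(1);
```

Here replayedOutput equals recordedOutput regardless of how much wall-clock time passes between the two calls, because replay never consults the real clock.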
20. Shim: the thing in the middle
Shims are used to implement deterministic record/replay.
The hard part of implementing record/replay is designing and placing shims.
21. Background: browser architectures
Web interpreters are embeddable and cross-platform.
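[Diagram: Embedders <-> Web Interpreter (WebKit, Gecko) <-> Platforms. Embedders supply user input and receive visual output; the interpreter calls platform APIs such as Date.now and setTimeout and receives callbacks such as timer fires.]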
22. Shim placement in the browser stack
Shims are placed at the API boundaries of the web interpreter.
[Diagram: Embedders <-> Web Interpreter (WebKit, Gecko) <-> Platforms]
24. Why not place shims elsewhere?
At the OS or VM level (VMware Replay)
• Low overhead and exact replay
• Can’t integrate with developer tools
At the JS library level (Mugshot: Mickens et al., 2010)
• Cross-platform
• Slower, lower fidelity, can’t integrate with tools
At the level of DOM events only (WaRR: Andrica et al., 2011)
• Most DOM events are deterministic and/or have no effect
• Skipping pre-dispatch browser code causes strange behavior
28. Mechanics of execution replay
Injecting “push” inputs causes dispatch of DOM event(s).
DOM Event Dispatch
1. mousedown
2. mouseup
3. click
4. focus
5. submit
JavaScript execution
function onClicked(..)
{
... = Date.now();
}
29. Comparison: replay timing
VMs and Timelapse share notions of counting and injecting.
Problem: when to inject “push” inputs to cause the same execution
VM record/replay
• Count branches, instructions
• Inject hardware interrupts
Browser record/replay
• Count DOM event dispatches
• Inject “push” inputs
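A minimal sketch of counting and injecting on the browser side follows. The slides do not say how Timelapse encodes replay positions, so the dispatchCount field, the ReplayScheduler class, and the sample events are all assumptions made for illustration.

```javascript
// Hypothetical sketch: each recorded "push" input is tagged with the
// number of DOM event dispatches that had occurred when it arrived.
class ReplayScheduler {
  constructor(recordedInputs) {
    this.pending = [...recordedInputs].sort((a, b) => a.dispatchCount - b.dispatchCount);
    this.dispatchCount = 0;
    this.trace = [];
  }

  // Called for every event dispatch, whether injected or caused by page script.
  dispatch(eventType) {
    this.trace.push(eventType);
    this.dispatchCount++;
  }

  // Inject every push input whose recorded position has been reached.
  injectDue() {
    while (this.pending.length > 0 &&
           this.pending[0].dispatchCount === this.dispatchCount) {
      const input = this.pending.shift();
      for (const type of input.events) this.dispatch(type);
    }
  }
}

const scheduler = new ReplayScheduler([
  { dispatchCount: 0, events: ["mousedown", "mouseup", "click"] },
  { dispatchCount: 4, events: ["keydown"] },
]);

scheduler.injectDue();       // position 0 reached: inject the mouse press
scheduler.dispatch("focus"); // an event caused by page script, not by a push input
scheduler.injectDue();       // position 4 reached: inject the key press
```

The second input is held back until the dispatch count reaches the position at which it was recorded, which plays the same role as waiting for a given instruction or branch count before injecting an interrupt in a VM.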
30. Challenge: finding and taming nondeterminism
Taming nondeterminism is easier with principled design.
Video games explicitly design to a
record/replay-like architecture.
Taming nondeterminism with
shims relies on well-defined APIs.
Taming nondeterminism post-hoc
requires refactoring and hacks.
Networked games like Starcraft 2
rely on record/replay facilities.
31. Challenge: dealing with real web platforms
Web interpreters are complex, many-limbed, and fast-changing.
• >1 MLOC (mainly C++/JavaScript)
• Hundreds of committers, and 500-1000 commits per week
• 6 independent build systems
• Supports many OSes, CPUs, GUI toolkits, graphics engines, toggleable features, embedders
http://hawanja.deviantart.com/
32. Lesson: consider many, implement one
For research, only address essential architectural complexity.
It’s fine to choose one reasonable
configuration. But, don’t ignore
essential complexity.
Platform-specific examples:
• input method editor handling
• rendering and animation
• resource caching, cookies
Embedder-specific examples:
• JavaScript engine
• Embedding API usage
http://commons.wikimedia.org
33. Interacting with a recording
Controlling video on YouTube
The Omniscient Debugger for Java
34. Interacting with Whyline [Ko & Myers]
Whyline for Java: program output
38. Table of program inputs
InfoVis mantra: overview, zoom and filter, details-on-demand.
[Screenshot callouts: inputs by id and time; input types by color; table is adjustable; panel-specific controls; global record controls]
39. Recording, replaying, and modality
• Recording can be started/stopped from any Inspector panel
• Replay is controlled from the Timelapse panel only
• Modality displayed next to record/replay controls
40. Slider control and modality
• Timeline and table are linked by sliders and filters.
• Dragging slider seeks replay to the dropped position.
Sliders show same position
Dragging induces drop previews
41. How do developers use Timelapse?
Actual use cases? Good affordances? Workflow integration?
Formative user study
• Subjects: 9+3 professional web developers
• Tasks: create a test case, fix a bug
• Time: 45 min/task, + tutorial & exit interview
• How: within-subjects design, Camtasia, Timelapse
42. Important use cases
Play, watch and pause
Visual binary search
Stepping through inputs
Execution anchoring
Refining repro. steps
48. Study Results
No significant effect on task completion time. Why?
• Timelapse didn’t affect types of information sought
• “Program inputs” not related to existing concepts
• New strategies only possible with tight integration
Interactive record/replay mainly* accelerates existing strategies.
49. Future Work
“Canned” executions can bring research ideas into practice.
Dynamic analysis
on-demand race detection, code coverage, optimizations
Program synthesis
create automated tests from recorded interactions
Corpus analysis
train models, test type systems, or evaluate analyses
51. On-demand dynamic analysis
Dynamic analyses generate data at runtime with instrumentation.
[Diagram: Instrumented execution -> Generated data -> Dynamic analysis -> Analysis output]
52. On-demand dynamic analysis
Record/replay enables on-demand dynamic analysis tools.
[Diagram: Normal execution -> Timelapse recording -> Developer tools -> Instrumented execution -> Data and analysis]
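To make the on-demand idea concrete, the sketch below replays one recording twice with different instrumentation. The recording format, the program object, and replayWith are invented for this example and are far simpler than a real browser recording.

```javascript
// Illustrative only: a recording is a list of inputs, and replaying it
// feeds those inputs to the program with an instrumentation hook attached.
const recording = [
  { fn: "foo", args: [1, 2] },
  { fn: "foo", args: [3, 4] },
  { fn: "baz", args: [5] },
];

const program = {
  foo: (a, b) => a + b,
  baz: (c) => c * 2,
  unused: () => 0,
};

function replayWith(recordedInputs, target, onCall) {
  return recordedInputs.map(({ fn, args }) => {
    onCall(fn, args); // instrumentation chosen after the fact
    return target[fn](...args);
  });
}

// Analysis 1: call counts, computed from the recording.
const counts = {};
replayWith(recording, program, (fn) => { counts[fn] = (counts[fn] || 0) + 1; });

// Analysis 2: function coverage, computed from the same recording.
const covered = new Set();
replayWith(recording, program, (fn) => covered.add(fn));
const uncovered = Object.keys(program).filter((fn) => !covered.has(fn));
```

Neither analysis required the behavior to be demonstrated again; both were run against the one recording.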
53. Conclusion
Timelapse decouples demonstration and investigation
using ideas from virtual machine record/replay work.
The Timelapse tool supports interactive inspection.
Timelapse’s infrastructure is useful for other purposes.
The Timelapse research prototype is an open-source fork
of WebKit, and is freely available from the project website.
https://bitbucket.org/burg/timelapse/
55. Network: a real proxy, not a shim
An HTTP proxy records and replays network traffic.
This is a hack.
In the short term, this is easier than attempting to recreate
platform-dependent network streams/handles.
Long-term solution:
A resource loader implementation with built-in
record/replay primitives, or a higher-level network API
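The proxy's behavior can be sketched as an in-memory stand-in. This is not the HTTP proxy used by Timelapse: createNetworkProxy, the "METHOD URL" key, and the synchronous fakeServer are assumptions chosen to keep the example small.

```javascript
// Hypothetical in-memory stand-in for a record/replay network proxy.
function createNetworkProxy(realFetch) {
  const recordings = new Map(); // "METHOD URL" -> queue of recorded responses
  let mode = "record";

  function request(method, url) {
    const key = method + " " + url;
    if (mode === "record") {
      const response = realFetch(method, url);
      if (!recordings.has(key)) recordings.set(key, []);
      recordings.get(key).push(response);
      return response;
    }
    // Replay: answer from the recording, never from the network.
    const queue = recordings.get(key);
    if (!queue || queue.length === 0) {
      throw new Error("no recorded response for " + key);
    }
    return queue.shift();
  }

  return { request, setMode: (newMode) => { mode = newMode; } };
}

// A fake server whose answers change on every request.
let serverCalls = 0;
const fakeServer = (method, url) => ({ status: 200, body: "response " + (++serverCalls) });

const proxy = createNetworkProxy(fakeServer);
const firstRecorded = proxy.request("GET", "/colors");
const secondRecorded = proxy.request("GET", "/colors");

proxy.setMode("replay");
const firstReplayed = proxy.request("GET", "/colors");
const secondReplayed = proxy.request("GET", "/colors");
```

Keeping a queue per request key preserves the order of responses when the same URL is fetched more than once, and replay makes no calls to the server, which matches the goal that replay should not require external communication.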
56. Replay fidelity and completeness
Divergence detection supports piecewise implementation.
Web interpreters expose a large and ever-changing API.
Timelapse doesn’t tame all sources of nondeterminism.
Excepting untamed sources, the DOM tree and JavaScript heap
are identical for all recorded and replayed executions.
Divergence is automatically detected by comparing DOM events.
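A simple form of this comparison might look like the sketch below. The event shape ({ type, target }) and the findDivergence function are assumptions; the slides say only that DOM events are compared.

```javascript
// Hypothetical sketch: compare the sequence of DOM events dispatched
// during replay against the sequence seen during recording.
function findDivergence(recordedEvents, replayedEvents) {
  const length = Math.max(recordedEvents.length, replayedEvents.length);
  for (let i = 0; i < length; i++) {
    const expected = recordedEvents[i];
    const actual = replayedEvents[i];
    const matches = expected && actual &&
      expected.type === actual.type &&
      expected.target === actual.target;
    if (!matches) {
      // Report the first position at which the two executions disagree.
      return { index: i, expected: expected || null, actual: actual || null };
    }
  }
  return null; // the replay followed the recording
}

const recordedEvents = [
  { type: "mousedown", target: "#red-slider" },
  { type: "mouseup", target: "#red-slider" },
  { type: "click", target: "#red-slider" },
];

const faithfulReplay = recordedEvents.map((e) => ({ ...e }));
const divergentReplay = [
  { type: "mousedown", target: "#red-slider" },
  { type: "mouseup", target: "#green-slider" },
];

const noDivergence = findDivergence(recordedEvents, faithfulReplay);
const divergence = findDivergence(recordedEvents, divergentReplay);
```

Reporting the first mismatching index gives a starting point for finding the untamed source of nondeterminism, which is what makes a piecewise implementation workable.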
57. Performance characteristics
Performance hasn’t been a priority, but is adequate anyway.
Record and replay slowdowns are negligible because
shims are placed along existing architectural boundaries.
Timelapse is ideal for CPU- and user-bound programs.
Network-bound applications are slower, due to prototype
shortcuts (disabling caching and pipelining).
Replay/seek speed could be improved dramatically by
checkpointing and clipping unnecessary layout/renders.
58. Embedding and platform APIs
Abstraction layers separate web interpreters from platforms/embedders.
[Diagram: Embedders | EMBEDDING API | Web Interpreter (WebKit, Gecko) | PLATFORM API | Platforms]
60. Embedding and platform APIs
Shims sit between the web interpreter and abstraction layers.
[Diagram: Embedders | EMBEDDING API | Web Interpreter (WebKit, Gecko) | PLATFORM API | Platforms]
62. Shim behavior when recording
EMBEDDING API
PLATFORM API
Start
recording!
63. Shim behavior when recording
EMBEDDING API
PLATFORM API
[user mouse press]
-> (IPC communication)
-> WebKit::WebPage::mouseEvent()
-> WebCore::InputProxy::handleMousePress()
---> WebCore::DeterminismController::capture()
-> WebCore::EventHandler::handleMousePress()
-> WebCore::Node::dispatchMouseEvent()
64. Shim behavior when replaying
EMBEDDING API
PLATFORM API
Start
replaying!
65. Shim behavior when replaying
EMBEDDING API
PLATFORM API
DeterminismController::dispatchNextAction()
-> MousePressAction::dispatch()
-> InputProxy::handleMousePress(true)
-> EventHandler::handleMousePress()
-> Node::dispatchMouseEvent()
-> WebKit::WebPage::mouseEvent()
-> InputProxy::handleMousePress(false)
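One plausible reading of the two call stacks above is that, during replay, the shim forwards inputs re-injected by the controller and ignores live inputs arriving from the embedder. The JavaScript below sketches that reading; the classes are hypothetical stand-ins for the C++ classes named on the slides, and the meaning given to the boolean argument is an assumption.

```javascript
// Hypothetical JavaScript stand-ins for the C++ classes named on the slides.
class DeterminismController {
  constructor() {
    this.mode = "record";
    this.log = [];
    this.cursor = 0;
  }
  capture(event) { this.log.push(event); }
  // Re-inject the next logged input through the same shim used when recording.
  dispatchNextAction(inputProxy) {
    if (this.cursor >= this.log.length) return false;
    return inputProxy.handleMousePress(this.log[this.cursor++], true);
  }
}

class InputProxy {
  constructor(controller, eventHandler) {
    this.controller = controller;
    this.eventHandler = eventHandler;
  }
  // fromReplay is assumed to be true only for inputs re-injected by the controller.
  handleMousePress(event, fromReplay) {
    if (this.controller.mode === "replay" && !fromReplay) return false; // drop live input
    if (this.controller.mode === "record") this.controller.capture(event);
    this.eventHandler(event); // continue to normal event handling
    return true;
  }
}

const dispatched = [];
const controller = new DeterminismController();
const inputProxy = new InputProxy(controller, (event) => dispatched.push(event.x));

inputProxy.handleMousePress({ x: 10 }, false); // recording: captured and forwarded
controller.mode = "replay";
inputProxy.handleMousePress({ x: 99 }, false); // live input during replay: ignored
controller.dispatchNextAction(inputProxy);     // logged input: forwarded again
```

After these three calls, dispatched holds the recorded press twice (once from recording, once from replay) and never the live press made during replay.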
Editor's Notes
Show the color picker widget. A user has reported that moving one of the RGB sliders up and down repeatedly can cause the others to change values. Try to reproduce this. Once it’s been reproduced, try to find the event handler for mousemove. Set a breakpoint there, then try to reproduce. Oh, it won’t work. I guess I’ll have to simulate... Or insert logging. How would I describe this bug any more concisely than the user? Test it?
Demo1: create a repro of the bug with two distinct repro interactions w/ the slider. Demo2: Real-time playback, past the second interaction. Seek to the beginning of second interaction. See where we are in the repro. Disable timer category, scope to just this interaction. Find a breakpoint. Link scope to breakpoints. Use breakpoint to binary search until see the change. Step into
TODO Need better connection of goals to strategies
For Timelapse, the browser and platform need to appear deterministic. For VMware, only the platform needs to appear deterministic.
I will say that the boxes are things that must execute at the same time. Rerunning an analysis on new data means re-demonstrating the behavior.
With a pipeline approach, one recording is sufficient for any number of different sets of instrumentation, as needed.
This slide might be backup, or combined with fidelity slide, or need inspiration.