1. Nov, 2015
Review of Mystery Machine
Ivan Glushkov
ivan.glushkov@gmail.com
@gliush
2. Why
❖ Need to debug and optimize applications
❖ Complex, heterogeneous systems
❖ Different parts written in different languages
❖ Different communication channels
❖ Different execution environments
❖ Even if individual components are optimized, the whole system might not perform optimally
3. What
❖ They develop performance analysis tools
❖ They apply them to their pipeline
❖ They measure end-to-end performance:
❖ from the point of initiating a page load
❖ to the point when the browser finishes rendering
4. Why not
❖ All current approaches assume you instrument your code, specify relations, etc.
❖ Usually you don’t have the time or the ability to do that
❖ Large systems are developed by large teams
❖ Adding instrumentation retroactively is a Herculean task
5. Overview
❖ They generate a model via large-scale reasoning over logs
❖ They can confirm relationships
❖ They need only (requestId, hostId, hostTS, eventId) in each
log message
❖ UberTrace gathers all the logs in one place
❖ MysteryMachine constructs a causal model from those traces
❖ MysteryMachine performs analyses: identifying critical
paths, slack analysis, outlier detection
6. UberTrace: why
❖ No tools to analyze inter-process optimality
❖ They need to have a single end-to-end performance
tracing tool for all logs
7. UberTrace: requirements
❖ Each log message should contain
❖ Unique request id
❖ Computer id (server node / client laptop)
❖ Timestamp (local clock)
❖ Event name (e.g. “start DOM rendering”)
❖ Task name (<Event,Task> should be unique)
❖ Propagate the decision about whether to log a particular request
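The required fields can be pictured as a small record type. The following is only a sketch: the class name `LogEvent`, the field names, and the helper `group_by_request` are hypothetical and not taken from UberTrace itself.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEvent:
    """One log message carrying the fields listed on the slide (names are illustrative)."""
    request_id: str  # unique id of the traced request
    host_id: str     # server node or client machine that emitted the message
    host_ts: float   # timestamp taken from that host's local clock
    event: str       # event name, e.g. "start DOM rendering"
    task: str        # task name


def group_by_request(events):
    """Collect the events of each request, ordered by host and local timestamp."""
    traces = defaultdict(list)
    for e in events:
        traces[e.request_id].append(e)
    for trace in traces.values():
        # Local clocks are only comparable within one host, so sort per host.
        trace.sort(key=lambda e: (e.host_id, e.host_ts))
    return dict(traces)
```

Grouping by request id is the step that lets messages from different components be stitched into a single end-to-end trace.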
8. UberTrace
❖ TS are from local clocks -> translated to a global clock
❖ Execution time = Latest TS - Earliest TS
❖ RTT = Ec - Es (client-observed minus server-side execution time)
❖ Clock skew is estimated assuming a one-way delay of 1/2 RTT
❖ Multiple observations: choose the minimal RTT
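A minimal sketch of this clock adjustment follows. It assumes the RTT is the client-observed interval minus the server-side interval (one reading of Es and Ec), that request and response delays are symmetric, and that server timestamps are mapped onto the client's clock. The function names and the tuple layout are hypothetical.

```python
def estimate_rtt(client_send, client_recv, server_first, server_last):
    """RTT estimate for one request/response exchange."""
    client_elapsed = client_recv - client_send   # as seen on the client's clock
    server_elapsed = server_last - server_first  # as seen on the server's clock
    return client_elapsed - server_elapsed


def estimate_offset(observations):
    """Offset to add to server timestamps to express them on the client's clock.

    observations: list of (client_send, client_recv, server_first, server_last).
    The observation with the smallest RTT is used, since it is least
    disturbed by queueing and other network noise.
    """
    best = min(observations, key=lambda o: estimate_rtt(*o))
    rtt = estimate_rtt(*best)
    client_send, _, server_first, _ = best
    # Assumption: symmetric delays, so the first server event should land
    # half an RTT after the client sent the request.
    return (client_send + rtt / 2) - server_first
```

For example, if the server clock runs 1000 units ahead of the client, an exchange with symmetric delays yields an offset of -1000, while a noisier exchange with a larger RTT is ignored.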
9. Mystery Machine: causal model
❖ Split all logs into segments (two consecutive events for the same task)
❖ Create a causal model
❖ They validated this model for a client-side JS library (42 and 84 segments -> 2583 and 10458 causal relationships)
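The two steps above can be sketched as follows. Segment extraction follows the slide directly. The inference step is an assumption about how "reasoning over logs" might work: hypothesize that every segment happens before every other, then discard any hypothesis contradicted by a trace. Only a happens-before relation is shown, and all function names are hypothetical.

```python
def segments(trace):
    """Split one trace into segments.

    trace: list of (task, event, timestamp) on a common clock.
    Returns {(task, start_event, end_event): (start_ts, end_ts)}.
    """
    by_task = {}
    for task, event, ts in sorted(trace, key=lambda r: r[2]):
        by_task.setdefault(task, []).append((event, ts))
    segs = {}
    for task, evs in by_task.items():
        # A segment is two consecutive events of the same task.
        for (e1, t1), (e2, t2) in zip(evs, evs[1:]):
            segs[(task, e1, e2)] = (t1, t2)
    return segs


def happens_before(traces):
    """Keep each hypothesis 'a finishes before b starts' until a trace refutes it."""
    seg_traces = [segments(t) for t in traces]
    names = set().union(*seg_traces)
    hypotheses = {(a, b) for a in names for b in names if a != b}
    for segs in seg_traces:
        for a, b in list(hypotheses):
            # Counterexample: b started before a ended in this trace.
            if a in segs and b in segs and segs[b][0] < segs[a][1]:
                hypotheses.discard((a, b))
    return hypotheses
```

With more traces, more spurious hypotheses are eliminated, which is why the approach depends on a large volume of logs. As a side note, the counts on the slide are consistent with three candidate relationships per unordered pair of segments (3 × 42 × 41 / 2 = 2583 and 3 × 84 × 83 / 2 = 10458), though the slides do not name them.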
13. Mystery Machine: critical path
Critical path: the set of segments for which a differential increase in a segment's execution time would result in the same differential increase in end-to-end latency
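To make the definition concrete, here is a sketch that treats the causal model as a dependency graph and finds the longest path through it. It assumes the graph is acyclic and that a segment starts as soon as all of its dependencies have finished. The segment names and the function `critical_path` are illustrative only.

```python
def critical_path(durations, deps):
    """Longest path through a dependency DAG of segments.

    durations: {segment: execution time}
    deps:      {segment: [segments that must finish before it starts]}
    Returns (end_to_end_latency, segments on the critical path in order).
    """
    finish, prev = {}, {}

    def visit(seg):
        if seg in finish:
            return finish[seg]
        start, pred = 0.0, None
        for d in deps.get(seg, []):
            f = visit(d)
            if f > start:  # the latest-finishing dependency gates this segment
                start, pred = f, d
        finish[seg] = start + durations[seg]
        prev[seg] = pred
        return finish[seg]

    last = max(durations, key=visit)  # the segment that finishes last
    path = []
    while last is not None:
        path.append(last)
        last = prev[last]
    return finish[path[0]], path[::-1]
```

With durations server=100, network=50, js=30, css=10, render=20, where network follows server, js and css follow network, and render follows both, the critical path is server, network, js, render with latency 200. Adding 5 to js raises the latency to 205, while adding 5 to css changes nothing, because css has slack.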