distributed tracing in 5 minutes

lightning talk from surge 2012

Published in: Technology

  1. distributed tracing
  2. twitter zipkin, google dapper, x-trace, tracelytics ... more!
  3. motivation
  4-5. what is slow?
  6-7. causal flow of control
  8. how to
  9-16. possible approaches
      • Unique identifier
          • propagate throughout
          • write instrumentation for various transports
      • Observe and correlate
          • always on the outside - black box
          • difficult to get threaded + evented processes right
  17. 1BD57B58AE7E315BBEAB6795F0BDC198296357
  18-32. [animated diagram, t = start to t = end; components labeled: nginx, cache, python, db, java, the internet]
  33. piggyback rides
      • More Doable
          • HTTP: x-headers
          • Thrift: secret argument
          • Internal RPC protocol: you’re the boss
      • Less Doable
          • SQL: one way ticket, also you’re not percona
          • memcache: not extensible so not backwards compatible
  34. [diagram, t = start to t = end; components labeled: nginx, cache, python, db, java, the internet]
  35. timing and structure
      • Timing
          • distributed = clock skew
      • Structure -- two approaches
          • Encode in ID
          • Encode in back-pointers
  36. encode in ID?
      • nginx1
      • nginx1python1
      • nginx1python1cache1
      • nginx1python1cache1python2
      • nginx1python1cache1python2sql1
      • nginx1python1cache1python2sql1python3
      • ...
  37. encode in back-pointer? [diagram; nodes labeled: nginx, python, cache, python]
  38-39. reporting
  40. other things worth figuring out
      • sampling
      • reporting
      • aggregate analysis
  41. thanks!
      • tracelytics.com
      • @dankosaur
      • dan@tracelytics.com
  42. resources
      • X-Trace: http://x-trace.net
          • http://x-trace.net/pubs/xtr-nsdi07.pdf
      • Google Dapper: http://research.google.com/pubs/pub36356.html
      • Twitter Zipkin: https://github.com/twitter/zipkin
      • CMU PDL: www.pdl.cmu.edu
          • StarDust: http://www.pdl.cmu.edu/PDL-FTP/SelfStar/thereska_sigmetrics06.pdf
          • Trace Diff: http://www.pdl.cmu.edu/PDL-FTP/SelfStar/NSDI11.pdf
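
The "unique identifier" approach, the HTTP x-header piggyback, and the back-pointer encoding from the slides can be sketched roughly as follows. This is a minimal illustration, not the speaker's or any tracer's actual implementation: the header names `X-Trace-Id` and `X-Parent-Id`, the `handle` and `path_to_root` helpers, and the in-memory `REPORTS` list are all hypothetical, and real service calls are simulated with plain function calls.

```python
import uuid

# Hypothetical header names; the slides only say "HTTP: x-headers".
TRACE_HEADER = "X-Trace-Id"
PARENT_HEADER = "X-Parent-Id"

REPORTS = []  # stand-in for a real reporting backend


def handle(service, headers, downstream=()):
    """Record one event for `service`, then call each downstream service.

    `headers` plays the role of incoming HTTP request headers.
    `downstream` is a list of (service_name, its_own_downstream) pairs.
    """
    # Reuse the caller's trace ID, or start a new trace at the edge.
    trace_id = headers.get(TRACE_HEADER) or uuid.uuid4().hex.upper()
    event_id = uuid.uuid4().hex[:8]
    REPORTS.append({
        "trace": trace_id,
        "event": event_id,
        "parent": headers.get(PARENT_HEADER),  # back-pointer; None at the root
        "service": service,
    })
    # Piggyback both IDs on every outgoing call.
    outgoing = {TRACE_HEADER: trace_id, PARENT_HEADER: event_id}
    for name, children in downstream:
        handle(name, outgoing, children)
    return trace_id


def path_to_root(reports, event_id):
    """Follow back-pointers from one event up to the root of its trace."""
    by_id = {r["event"]: r for r in reports}
    path = []
    while event_id is not None:
        record = by_id[event_id]
        path.append(record["service"])
        event_id = record["parent"]
    return list(reversed(path))


# Simulated request: nginx calls python, which calls cache and then db.
trace = handle("nginx", {}, [("python", [("cache", []), ("db", [])])])
print(path_to_root(REPORTS, REPORTS[-1]["event"]))  # ['nginx', 'python', 'db']
```

Every report carries the same trace ID, so a collector can group events per request, and the parent field recovers the causal structure without relying on timestamps, which sidesteps the clock skew concern. The "encode in ID" alternative would instead append each hop to the ID itself, which makes the ID grow with the depth of the request.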
