Successfully reported this slideshow.
Your SlideShare is downloading. ×

Zipkin - Runtime open house

Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Ad
Loading in …3
×

Check these out next

1 of 66 Ad

Zipkin - Runtime open house

Download to read offline

Zipkin is a distributed tracing system that helps us gather timing data for all the disparate services at Twitter, and manages collection and lookup of data through a Collector and a Query service.
With Zipkin, we can trace a subset of all requests made to the site, and collect detailed data about the path taken through our systems, as well as timings. Then, we can visualize and ultimately pinpoint where and possibly why a response took longer than expected.

Zipkin is a distributed tracing system that helps us gather timing data for all the disparate services at Twitter, and manages collection and lookup of data through a Collector and a Query service.
With Zipkin, we can trace a subset of all requests made to the site, and collect detailed data about the path taken through our systems, as well as timings. Then, we can visualize and ultimately pinpoint where and possibly why a response took longer than expected.

Advertisement
Advertisement

More Related Content

Recently uploaded (20)

Advertisement

Zipkin - Runtime open house

  1. 1. Z I P K I N A distributed tracing framework
  2. 2. Why Zipkin?
  3. 3. Google 0.5 sec slower 20% traffic drop @skr | @thisisfranklin 3
  4. 4. Google 0.5 sec slower 1.5 sec faster 20% traffic drop CTR up 12% @skr | @thisisfranklin 3
  5. 5. Performance matters
  6. 6. Front end @skr | @thisisfranklin 5
  7. 7. Front end Back end @skr | @thisisfranklin 5
  8. 8. @skr | @thisisfranklin 6
  9. 9. @skr | @thisisfranklin 6
  10. 10. @skr | @thisisfranklin 7
  11. 11. @skr | @thisisfranklin 8 Picture from http://www.flickr.com/photos/jpellgen
  12. 12. @skr | @thisisfranklin 9
  13. 13. • Collects traces from production requests • Low overhead • Minimum of extra work for developers @skr | @thisisfranklin 9
  14. 14. Finagle “Finagle is an asynchronous network stack for the JVM that you can use to build asynchronous Remote Procedure Call (RPC) clients and servers in Java, Scala, or any JVM-hosted language.” github.com/twitter/finagle
  15. 15. @skr | @thisisfranklin 11
  16. 16. Finagle Http service @skr | @thisisfranklin 11
  17. 17. Finagle Http service @skr | @thisisfranklin 11
  18. 18. Finagle Http service Finagle Thrift Service @skr | @thisisfranklin 11
  19. 19. Finagle Http service Finagle Thrift Service @skr | @thisisfranklin 11
  20. 20. Finagle Http service Finagle Thrift Service @skr | @thisisfranklin 11
  21. 21. Finagle Http service Finagle Thrift Service Service @skr | @thisisfranklin 11
  22. 22. Finagle Http service Finagle Thrift Service Service @skr | @thisisfranklin 11
  23. 23. Finagle Http service Finagle Thrift Service Service @skr | @thisisfranklin 11
  24. 24. Zipkin terminology ‣ Annotation: string data associated with a particular timestamp, service, and host Time time: 2012-01-21 22:37:01 value: “something happened” server: 135.34.53.2 service: “timelineservice” @skr | @thisisfranklin 12
  25. 25. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send Span @skr | @thisisfranklin 13
  26. 26. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send Span T:10ms Server Receive @skr | @thisisfranklin 13
  27. 27. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send Span T:10ms Server Receive T:90ms Server Send @skr | @thisisfranklin 13
  28. 28. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send T:100ms Client Receive Span T:10ms Server Receive T:90ms Server Send @skr | @thisisfranklin 13
  29. 29. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send T:100ms Client Receive Span T:20ms Read 30 kbytes from file T:10ms Server Receive T:90ms Server Send @skr | @thisisfranklin 13
  30. 30. ‣ Span: represents one specific method call; made up of a set of annotations. Has a name and an id. Time T:0ms Client Send T:100ms Client Receive Span T:20ms Read 30 kbytes from file T:10ms Server Receive T:90ms Server Send ‣ Trace: a set of spans all associated with the same request @skr | @thisisfranklin 13
  31. 31. Finagle http service @skr | @thisisfranklin 14
  32. 32. • Generate a random i64 trace id Finagle http service @skr | @thisisfranklin 14
  33. 33. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http service @skr | @thisisfranklin 14
  34. 34. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http service Finagle thrift service @skr | @thisisfranklin 14
  35. 35. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http service • Generate new span id Finagle thrift service @skr | @thisisfranklin 14
  36. 36. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http service • Generate new span id • Pass trace header Finagle thrift service @skr | @thisisfranklin 14
  37. 37. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http struct RequestHeader { service i64 trace_id, i64 span_id, • Generate new span id optional i64 parent_span_id, • Pass trace header optional bool sampled } Finagle thrift service @skr | @thisisfranklin 14
  38. 38. • Generate a random i64 trace id • Decide if we should sample the trace or not Finagle http struct RequestHeader { service i64 trace_id, i64 span_id, • Generate new span id optional i64 parent_span_id, • Pass trace header optional bool sampled } Finagle • Thrift service adopts trace id from thrift header if it exists service @skr | @thisisfranklin 14
  39. 39. Finagle http service Finagle thrift service @skr | @thisisfranklin 14
  40. 40. Finagle http service Finagle Finagle thrift thrift service service @skr | @thisisfranklin 14
  41. 41. Finagle http service Finagle Finagle thrift thrift service service Finagle thrift service @skr | @thisisfranklin 14
  42. 42. Finagle http service S Finagle Finagle thrift thrift S service service S Finagle thrift S service @skr | @thisisfranklin 14
  43. 43. Finagle http service S Zipkin collector Finagle Finagle thrift thrift S service service S Cassandra Finagle thrift S service @skr | @thisisfranklin 14
  44. 44. Finagle http service S Zipkin collector Finagle Finagle thrift thrift S service service S Zipkin Zipkin Cassandra Query UI Finagle thrift S service @skr | @thisisfranklin 14
  45. 45. Finagle ♥ Zipkin
  46. 46. Finagle Thrift server ServerBuilder() .bindTo(address) .codec(ThriftServerFramedCodec()) .name("servicename") .build(someService) @skr | @thisisfranklin 16
  47. 47. Finagle Thrift server ServerBuilder() .bindTo(address) .codec(ThriftServerFramedCodec()) .name("servicename") .build(someService) @skr | @thisisfranklin 16
  48. 48. Finagle Thrift server ServerBuilder() .bindTo(address) .codec(ThriftServerFramedCodec()) .name("servicename") .build(someService) @skr | @thisisfranklin 16
  49. 49. Finagle Thrift server ServerBuilder() .bindTo(address) .codec(ThriftServerFramedCodec()) .name("servicename") .build(someService) @skr | @thisisfranklin 16
  50. 50. Finagle Http server ServerBuilder() .bindTo(address) .codec(Http()) .name("servicename") .build(someService) @skr | @thisisfranklin 17
  51. 51. ServerBuilder() .bindTo(address) .codec(ThriftServerFramedCodec()) .name("servicename") .tracerFactory(ZipkinTracer()) .build(someService) @skr | @thisisfranklin 18
  52. 52. ClientBuilder() .cluster(hosts) .codec(ThriftServerFramedCodec()) .name("clientname") .tracerFactory(ZipkinTracer()) .build(someService) @skr | @thisisfranklin 19
  53. 53. Trace.record("doing stuff") @skr | @thisisfranklin 20
  54. 54. Trace.record("doing stuff") @skr | @thisisfranklin 20
  55. 55. Trace.record("doing stuff") time: 2012-01-21 22:37:01 value: “doing stuff” server: 135.34.53.2 service: “timelineservice” @skr | @thisisfranklin 20
  56. 56. Trace.recordBinary("key", data) @skr | @thisisfranklin 21
  57. 57. Key Value Trace.recordBinary("key", data) responsecode 500 cache:somekey Hit sql.query select *... @skr | @thisisfranklin 21
  58. 58. Platform Protocol Client Server Finagle Thrift Yes Yes Finagle HTTP Yes Yes Finagle Memcache Yes No Finagle Redis Yes No Cassie Thrift Yes No Querulous JDBC Yes No Ruby Thrift Yes Yes @skr | @thisisfranklin 22
  59. 59. Zipkin UI
  60. 60. @skr | @thisisfranklin 24
  61. 61. @skr | @thisisfranklin 24
  62. 62. @skr | @thisisfranklin 24
  63. 63. What did we find?
  64. 64. github.com/twitter/zipkin @zipkinproject @skr | @thisisfranklin 27

Editor's Notes

  • Talk more about what tracing is. Compare in process profiling to distributed system?\n
  • Before we get into what Zipkin is: why we created it\n
  • Shit is slow, you lose users and money\n
  • \n
  • Simplify wildly, there are two parts to web performance\nFront end. The order assets are loaded, minifying and other tricsk\nBack end. How quickly can we generate and push out the HTML/JSON/whatever\n
  • For the front end we have these nice development tools. Shows us how the assets loaded, what it was waiting for and so on.\nWe want a fancy tool like this for the backend.\nPicked a page where server side was unusually bad, normally it’s mostly FE\n
  • For the back end we only had these graphs to look at per service.\nNothing that ties them together\n
  • Capture information about how services in a datacentre is working together to respond to a request. \n
  • Read this paper from Google called Dapper\n
  • Read this paper from Google called Dapper\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • Mention that Cassandra and Scribe can both be replaced. Zookeeper coordination?\n
  • \n
  • Mention dependency on finagle-zipkin\n
  • Mention dependency on finagle-zipkin\n
  • Mention dependency on finagle-zipkin\n
  • Mention dependency on finagle-zipkin\n
  • Mention dependency on finagle-zipkin\n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • \n
  • Trace view\nServices on the left. Time scale on top. Least impactful parts of trace collapsed automatically.\nMention bootstrap and dj\n
  • \n
  • It’s all open source, check it out now.\n

×