
Microservices Tracing with Spring Cloud and Zipkin

Presentation from the Geecon conference


  1. 1. Microservices tracing with Spring Cloud and Zipkin. Marcin Grzejszczak @mgrzejszczak, 11-13 May 2016
  2. 2. About me
     Developer at Pivotal, part of the Spring Cloud team. Working with OSS:
     ● Accurest - Consumer Driven Contracts verifier for Java
     ● JSON Assert - fluent JSON assertions
     ● Spock Subjects Collaborators Extension
     ● Gradle Test Profiler
     ● Up To Date Gradle Plugin
     Twitter: @MGrzejszczak
     Blog: http://TOOMUCHCODING.COM
     Marcin Grzejszczak @mgrzejszczak, Kraków, 11-13 May 2016
  4. 4. Agenda
     ● What is distributed tracing?
     ● How to correlate logs with Spring Cloud Sleuth?
     ● How to visualize latency with Spring Cloud Sleuth and Zipkin?
  5. 5. An ordinary system...
  6. 6. UI calls backend (UI -> BACKEND)
  7. 7. Everything is awesome (click returns HTTP 200)
  8. 8. Until it’s not (click returns HTTP 500)
  10. 10. Time to debug https://tonysbologna.files.wordpress.com/2015/09/mario-and-luigi.jpg?w=468&h=578&crop=1
  11. 11. It doesn’t look like this
  12. 12. More like this
  13. 13. On which server / instance was the exception thrown?
  14. 14. SSH and grep for ERROR to find it?
  15. 15. Distributed tracing - terminology
      ● Span
      ● Trace
      ● Logs (annotations)
      ● Tags (binary annotations)
  17. 17. Span
      The basic unit of work (e.g. sending an RPC).
      ● Spans are started and stopped
      ● They keep track of their timing information
      ● Once you create a span, you must stop it at some point in the future
      ● A span has a parent (except for the root span) and can have multiple children
  18. 18. Trace
      A set of spans forming a tree-like structure.
      ● For example, if you are running a book store, then
        ○ a trace could be retrieving a list of available books
        ○ assuming that retrieving the books requires sending 3 requests to 3 services, you could have at least 3 spans (1 for each hop) forming 1 trace
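The span and trace definitions above can be sketched as a small data structure. This is a minimal illustration in plain Java under stated assumptions, not the Spring Cloud Sleuth API: the class name `SketchSpan`, its fields, and its methods are all hypothetical, and timestamps are passed in by hand to keep the example deterministic.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified span for illustration; NOT the Spring Cloud Sleuth Span class.
class SketchSpan {
    final String traceId;     // shared by every span of one trace
    final String spanId;      // identifies this unit of work
    final SketchSpan parent;  // null for the root span
    final List<SketchSpan> children = new ArrayList<>();
    private final long startMillis;
    private long stopMillis = -1; // -1 means the span is still running

    SketchSpan(String traceId, String spanId, SketchSpan parent, long startMillis) {
        this.traceId = traceId;
        this.spanId = spanId;
        this.parent = parent;
        this.startMillis = startMillis;
        if (parent != null) {
            parent.children.add(this); // register under the parent to form the tree
        }
    }

    // Every started span must be stopped, and only once.
    void stop(long nowMillis) {
        if (stopMillis != -1) {
            throw new IllegalStateException("Span " + spanId + " is already stopped");
        }
        stopMillis = nowMillis;
    }

    boolean isRunning() {
        return stopMillis == -1;
    }

    long durationMillis() {
        if (isRunning()) {
            throw new IllegalStateException("Span " + spanId + " is still running");
        }
        return stopMillis - startMillis;
    }

    // Number of spans in the subtree rooted at this span (the whole trace for the root).
    int size() {
        int count = 1;
        for (SketchSpan child : children) {
            count += child.size();
        }
        return count;
    }

    public static void main(String[] args) {
        // Book store example: a root span plus one span per hop to 3 services.
        SketchSpan root = new SketchSpan("X", "A", null, 0);
        SketchSpan hop1 = new SketchSpan("X", "B", root, 10);
        SketchSpan hop2 = new SketchSpan("X", "C", root, 20);
        SketchSpan hop3 = new SketchSpan("X", "D", root, 30);
        hop1.stop(18);
        hop2.stop(29);
        hop3.stop(41);
        root.stop(50);
        System.out.println("spans in trace: " + root.size());
        System.out.println("root duration:  " + root.durationMillis() + " ms");
    }
}
```

Keeping the parent reference on each span is what lets a tool rebuild the tree later from a flat list of reported spans.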
  19. 19. [Diagram: propagation of trace and span IDs across four services]
      ● A REQUEST with No Trace Id and No Span Id reaches SERVICE 1, which works in Trace Id = X, Span Id = A and sends its RESPONSE under the same IDs
      ● SERVICE 1 calls SERVICE 2 in Span Id = B (Client Sent, Server Received, Server Sent, Client Received); SERVICE 2 does its own work in Span Id = C
      ● SERVICE 2 calls SERVICE 3 in Span Id = D (same four events); SERVICE 3 does its own work in Span Id = E
      ● SERVICE 2 calls SERVICE 4 in Span Id = F (same four events); SERVICE 4 does its own work in Span Id = G
      ● Every span carries Trace Id = X
  20. 20. Parent / child relationships of the spans
      ● Span Id = A, Parent Id = null
      ● Span Id = B, Parent Id = A
      ● Span Id = C, Parent Id = B
      ● Span Id = D, Parent Id = C
      ● Span Id = E, Parent Id = D
      ● Span Id = F, Parent Id = C
      ● Span Id = G, Parent Id = F
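The span ID / parent ID pairs from this slide are enough to rebuild the tree. The sketch below does that in plain Java; the class name `SpanTree` and its methods are hypothetical, chosen only for this illustration, and the IDs are the letters used on the slide.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper that rebuilds a span tree from (span ID -> parent ID) pairs.
class SpanTree {
    // Span ID -> parent ID as listed on the slide (null marks the root span).
    static Map<String, String> slideParents() {
        Map<String, String> parents = new LinkedHashMap<>();
        parents.put("A", null);
        parents.put("B", "A");
        parents.put("C", "B");
        parents.put("D", "C");
        parents.put("E", "D");
        parents.put("F", "C");
        parents.put("G", "F");
        return parents;
    }

    // Invert the child -> parent map into parent -> children.
    static Map<String, List<String>> childrenOf(Map<String, String> parents) {
        Map<String, List<String>> children = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : parents.entrySet()) {
            children.putIfAbsent(entry.getKey(), new ArrayList<>());
            if (entry.getValue() != null) {
                children.computeIfAbsent(entry.getValue(), k -> new ArrayList<>())
                        .add(entry.getKey());
            }
        }
        return children;
    }

    // Depth of a span = number of parent links between it and the root.
    static int depth(Map<String, String> parents, String spanId) {
        int depth = 0;
        for (String p = parents.get(spanId); p != null; p = parents.get(p)) {
            depth++;
        }
        return depth;
    }

    // Render the tree with two spaces of indentation per level.
    static String render(Map<String, String> parents) {
        Map<String, List<String>> children = childrenOf(parents);
        StringBuilder out = new StringBuilder();
        for (Map.Entry<String, String> entry : parents.entrySet()) {
            if (entry.getValue() == null) {
                append(out, children, entry.getKey(), 0);
            }
        }
        return out.toString();
    }

    private static void append(StringBuilder out, Map<String, List<String>> children,
                               String spanId, int level) {
        for (int i = 0; i < level; i++) {
            out.append("  ");
        }
        out.append(spanId).append('\n');
        for (String child : children.get(spanId)) {
            append(out, children, child, level + 1);
        }
    }

    public static void main(String[] args) {
        System.out.print(render(slideParents()));
    }
}
```

In the rendered tree, span C has two children, D and F, which corresponds to the two outgoing calls made from the same service.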
  21. 21. Is it that simple?
  22. 22. Is it that simple?
      How do you pass tracing information (incl. Trace ID) between:
      ● different libraries?
      ● thread pools?
      ● asynchronous communication?
      ● …?
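To see why thread pools are a problem, consider a trace ID kept in a `ThreadLocal`: a task submitted to a pool runs on another thread and no longer sees it. The sketch below shows one common way around this, capturing the ID at submission time and restoring it inside the task. It is a plain-Java illustration under stated assumptions, not how Sleuth itself is implemented; `TraceContextSketch` and all of its methods are hypothetical names.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical helper for illustration; NOT part of Spring Cloud Sleuth.
class TraceContextSketch {
    // Holds the trace ID of whatever request the current thread is working on.
    private static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();

    static void set(String traceId) { CURRENT_TRACE_ID.set(traceId); }
    static String get() { return CURRENT_TRACE_ID.get(); }
    static void clear() { CURRENT_TRACE_ID.remove(); }

    // Capture the caller's trace ID now and restore it on the thread that runs the task.
    static <T> Callable<T> wrap(final Callable<T> task) {
        final String captured = get();
        return () -> {
            String previous = get();
            set(captured);
            try {
                return task.call();
            } finally {
                // Put the worker thread back into the state it had before the task.
                if (previous == null) {
                    clear();
                } else {
                    set(previous);
                }
            }
        };
    }

    // Runs the same task without and with wrapping; returns {withoutWrap, withWrap}.
    static String[] demo() {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            set("X");
            Callable<String> readTraceId = TraceContextSketch::get;
            String withoutWrap = pool.submit(readTraceId).get();
            String withWrap = pool.submit(wrap(readTraceId)).get();
            return new String[] {withoutWrap, withWrap};
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            clear();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("without wrap: " + result[0]); // null, the pool thread has no context
        System.out.println("with wrap:    " + result[1]); // X
    }
}
```

The same capture-and-restore idea has to be repeated for every library that hops threads or processes, which is the work a tracing library takes off your hands.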
  23. 23. Log correlation with Spring Cloud Sleuth
      We take care of passing tracing information between threads / libraries / contexts for
      ● Hystrix
      ● RxJava
      ● RestTemplate
      ● Feign
      ● Messaging with Spring Integration
      ● Zuul
      ● ...
      Unless you do something unexpected, there is nothing you need to do to make Sleuth work. Check the docs for more info.
  24. 24. Now let’s aggregate the logs!
      Instead of SSHing to the machines, aggregate the logs!
      ● With Cloud Foundry’s (CF) Loggregator the logs from different instances are streamed into a single place
      ● You can harvest your logs with Logstash Forwarder / FileBeat
      ● You can use the ELK stack to stream and visualize the logs
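Once logs from all instances land in one place, correlating them comes down to filtering by trace ID. The sketch below assumes each log line carries a correlation block of the form `[service,traceId,spanId,exportable]`; that layout, the sample lines, and the class name `LogCorrelationSketch` are assumptions made for this illustration, so check your own log pattern before reusing the regular expression.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper that picks the lines of one trace out of aggregated logs.
class LogCorrelationSketch {
    // Assumed layout of the correlation block: [service,traceId,spanId,exportable]
    private static final Pattern CORRELATION =
            Pattern.compile("\\[([^,\\]]+),([0-9a-f]+),([0-9a-f]+),(true|false)\\]");

    // Returns the trace ID found in a log line, or null when the line carries none.
    static String traceIdOf(String line) {
        Matcher matcher = CORRELATION.matcher(line);
        return matcher.find() ? matcher.group(2) : null;
    }

    // Keeps only the lines that belong to the given trace, preserving their order.
    static List<String> linesForTrace(List<String> aggregated, String traceId) {
        List<String> result = new ArrayList<>();
        for (String line : aggregated) {
            if (traceId.equals(traceIdOf(line))) {
                result.add(line);
            }
        }
        return result;
    }

    // Made-up sample lines from two services, interleaved as an aggregator would store them.
    static List<String> sampleLogs() {
        return Arrays.asList(
                "INFO  [service1,2485ec27856c56f4,2485ec27856c56f4,true] Calling service2",
                "INFO  [service2,9f3a1c0d5e7b2a44,9f3a1c0d5e7b2a44,true] Unrelated request",
                "ERROR [service2,2485ec27856c56f4,5b1d9c3e7a2f4e10,true] Something went wrong",
                "INFO  Application started");
    }

    public static void main(String[] args) {
        for (String line : linesForTrace(sampleLogs(), "2485ec27856c56f4")) {
            System.out.println(line);
        }
    }
}
```

In practice the same filter is usually a search query in the log aggregation UI rather than code, but the principle is identical: one trace ID selects every line of one request, whichever instance wrote it.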
  25. 25. Spring Cloud Sleuth with Maven
      <dependencyManagement>
        <dependencies>
          <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Brixton.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
          </dependency>
        </dependencies>
      </dependencyManagement>
      <!-- The starter goes into the regular <dependencies> section of the POM -->
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-sleuth</artifactId>
      </dependency>
  26. 26. Spring Cloud Sleuth with Gradle
      // The dependencyManagement block requires the Spring dependency-management Gradle plugin
      dependencies {
          compile "org.springframework.cloud:spring-cloud-starter-sleuth"
      }
      dependencyManagement {
          imports {
              mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
          }
      }
  27. 27. Log correlation with Spring Cloud Sleuth - DEMO
  31. 31. Great! We’ve found the exception! But meanwhile...
  32. 32. The system is slow... (click returns HTTP 200)
  33. 33. One of the services is slow?
  34. 34. Which one? How to measure that?
  35. 35. Let’s log events!
      ● Client Sent (CS) - The client has made a request
      ● Server Received (SR) - The server side got the request and will start processing it
      ● Server Sent (SS) - Annotated upon completion of request processing
      ● Client Received (CR) - The client has successfully received the response from the server side
  36. 36. Timeline: CS 0 ms, SR 100 ms, SS 200 ms, CR 300 ms
  37. 37. Conclusions (for the timeline CS 0 ms, SR 100 ms, SS 200 ms, CR 300 ms)
      ● The request started at T=0 ms
      ● It took 300 ms for the client to receive a response
      ● Server side received the request at T=100 ms
      ● The request got processed on the server side in 100 ms
      ● Why is there a delay between sending and receiving messages?
  38. 38. https://blogs.oracle.com/jag/resource/Fallacies.html
  39. 39. Distributed tracing - terminology
      ● Span
      ● Trace
      ● Logs (annotations)
      ● Tags (binary annotations)
  40. 40. Logs
      A log represents an event in time associated with a span.
      ● Every span has zero or more logs
      ● Each log is a timestamped event name
      ● The event name should be the stable name of some notable moment in the lifetime of a span
      ● For instance, a span representing a browser page load might add an event for each of the Performance.timing moments (check https://developer.mozilla.org/en-US/docs/Web/API/PerformanceTiming)
  41. 41. Main logs
      ● Client Sent (CS)
        ○ The client has made a request - the span was started
      ● Server Received (SR)
        ○ The server side got the request and will start processing it
        ○ SR timestamp - CS timestamp = NETWORK LATENCY
  42. 42. Main logs
      ● Server Sent (SS)
        ○ Annotated upon completion of request processing
        ○ SS timestamp - SR timestamp = SERVER SIDE PROCESSING TIME
      ● Client Received (CR)
        ○ The client has successfully received the response from the server side
        ○ CR timestamp - CS timestamp = TIME NEEDED TO RECEIVE RESPONSE
        ○ CR timestamp - SS timestamp = NETWORK LATENCY
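The four subtractions above can be checked against the earlier timeline (CS 0 ms, SR 100 ms, SS 200 ms, CR 300 ms). The sketch below is a plain-Java illustration; the class `SpanTimings` and its method names are hypothetical and only restate the formulas from these two slides.

```java
// Hypothetical value class holding the four main log timestamps of one span, in milliseconds.
class SpanTimings {
    final long clientSent;      // CS
    final long serverReceived;  // SR
    final long serverSent;      // SS
    final long clientReceived;  // CR

    SpanTimings(long clientSent, long serverReceived, long serverSent, long clientReceived) {
        this.clientSent = clientSent;
        this.serverReceived = serverReceived;
        this.serverSent = serverSent;
        this.clientReceived = clientReceived;
    }

    // SR - CS: network latency on the way to the server.
    long requestNetworkLatency() {
        return serverReceived - clientSent;
    }

    // SS - SR: time the server spent processing the request.
    long serverProcessingTime() {
        return serverSent - serverReceived;
    }

    // CR - SS: network latency on the way back to the client.
    long responseNetworkLatency() {
        return clientReceived - serverSent;
    }

    // CR - CS: total time the client waited for the response.
    long timeToReceiveResponse() {
        return clientReceived - clientSent;
    }

    public static void main(String[] args) {
        SpanTimings timings = new SpanTimings(0, 100, 200, 300);
        System.out.println("request latency:  " + timings.requestNetworkLatency() + " ms");  // 100 ms
        System.out.println("server time:      " + timings.serverProcessingTime() + " ms");   // 100 ms
        System.out.println("response latency: " + timings.responseNetworkLatency() + " ms"); // 100 ms
        System.out.println("total:            " + timings.timeToReceiveResponse() + " ms");  // 300 ms
    }
}
```

Note that the two network latency figures subtract timestamps taken on different machines, so in a real system they are only as accurate as the clock synchronization between client and server.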
  43. 43. Tag
      A key-value pair.
      ● Every span may also have zero or more key/value tags
      ● They do not have timestamps and simply annotate the spans
      ● Examples of default tags in Sleuth
        ○ message/payload-size
        ○ http.method
        ○ commandKey for Hystrix
  44. 44. How to visualise latency in a distributed system?
  45. 45. The answer is: Zipkin
      ● Zipkin is a distributed tracing system
      ● It runs as a separate process (you can run it as a Spring Boot application)
      ● It helps gather timing data needed to troubleshoot latency problems in microservice architectures
      ● The front end is a "waterfall" style graph of service calls showing call durations as horizontal bars
  46. 46. How does Zipkin work?
      [Diagram]
      ● Each APP sends its spans to the collectors
      ● The collectors store the spans in a DB
      ● The UI queries for trace info via an API
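The flow on this slide (apps report spans, collectors store them, the UI queries by trace) can be mimicked with an in-memory sketch. Everything below is hypothetical and simplified for illustration: the classes `ReportedSpan`, `SpanStore`, `Collector`, and `QueryApi` are not Zipkin's classes, and a real deployment uses network transport and a real database instead of a `HashMap`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of the app -> collector -> store -> query flow.
class ZipkinFlowSketch {
    // A reported span, reduced to the fields needed for this illustration.
    static class ReportedSpan {
        final String traceId;
        final String spanId;
        final String service;
        final long durationMillis;

        ReportedSpan(String traceId, String spanId, String service, long durationMillis) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.service = service;
            this.durationMillis = durationMillis;
        }
    }

    // Stands in for the database behind the collectors.
    static class SpanStore {
        private final Map<String, List<ReportedSpan>> byTraceId = new HashMap<>();

        void save(ReportedSpan span) {
            byTraceId.computeIfAbsent(span.traceId, k -> new ArrayList<>()).add(span);
        }

        List<ReportedSpan> findByTraceId(String traceId) {
            List<ReportedSpan> spans = byTraceId.get(traceId);
            if (spans == null) {
                return Collections.emptyList();
            }
            return Collections.unmodifiableList(spans);
        }
    }

    // Receives batches of spans from the apps and writes them to the store.
    static class Collector {
        private final SpanStore store;

        Collector(SpanStore store) {
            this.store = store;
        }

        void accept(List<ReportedSpan> batch) {
            for (ReportedSpan span : batch) {
                store.save(span);
            }
        }
    }

    // What the UI would call to fetch all spans of one trace.
    static class QueryApi {
        private final SpanStore store;

        QueryApi(SpanStore store) {
            this.store = store;
        }

        List<ReportedSpan> trace(String traceId) {
            return store.findByTraceId(traceId);
        }
    }

    public static void main(String[] args) {
        SpanStore store = new SpanStore();
        Collector collector = new Collector(store);
        // Two apps report spans belonging to the same trace.
        collector.accept(Arrays.asList(new ReportedSpan("X", "A", "service1", 300)));
        collector.accept(Arrays.asList(new ReportedSpan("X", "B", "service2", 100)));
        for (ReportedSpan span : new QueryApi(store).trace("X")) {
            System.out.println(span.service + " " + span.spanId + " " + span.durationMillis + " ms");
        }
    }
}
```

The point of the split is that the apps only ever write spans and the UI only ever reads them; the store keyed by trace ID is what joins spans reported by different services into one picture.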
  47. 47. Spring Cloud Sleuth and Zipkin integration
      ● We take care of passing tracing information between threads / libraries / contexts
      ● Upon closing of a span we will send it to Zipkin
        ○ either via HTTP (spring-cloud-sleuth-zipkin)
        ○ or via Spring Cloud Stream (spring-cloud-sleuth-stream)
      ● You can run the Zipkin Spring Cloud Stream Collector as a Spring Boot app (spring-cloud-sleuth-zipkin-stream)
        ○ you can add a dependency on Zipkin UI to it!
  48. 48. Spring Cloud Sleuth Zipkin with Maven
      <dependencyManagement>
        <dependencies>
          <dependency>
            <groupId>org.springframework.cloud</groupId>
            <artifactId>spring-cloud-dependencies</artifactId>
            <version>Brixton.RELEASE</version>
            <type>pom</type>
            <scope>import</scope>
          </dependency>
        </dependencies>
      </dependencyManagement>
      <!-- The starter goes into the regular <dependencies> section of the POM -->
      <dependency>
        <groupId>org.springframework.cloud</groupId>
        <artifactId>spring-cloud-starter-zipkin</artifactId>
      </dependency>
  49. 49. Spring Cloud Sleuth Zipkin with Gradle
      // The dependencyManagement block requires the Spring dependency-management Gradle plugin
      dependencies {
          compile "org.springframework.cloud:spring-cloud-starter-zipkin"
      }
      dependencyManagement {
          imports {
              mavenBom "org.springframework.cloud:spring-cloud-dependencies:Brixton.RELEASE"
          }
      }
  50. 50. [Diagram: propagation of trace and span IDs across the four demo services]
      ● A REQUEST with No Trace Id and No Span Id reaches SERVICE 1 (/start), which works in Trace Id = X, Span Id = A and sends its RESPONSE under the same IDs
      ● SERVICE 1 calls SERVICE 2 (/foo) in Span Id = B (Client Sent, Server Received, Server Sent, Client Received); SERVICE 2 does its own work in Span Id = C
      ● SERVICE 2 calls SERVICE 3 (/bar) in Span Id = D (same four events); SERVICE 3 does its own work in Span Id = E
      ● SERVICE 2 calls SERVICE 4 (/baz) in Span Id = F (same four events); SERVICE 4 does its own work in Span Id = G
      ● Every span carries Trace Id = X
  51. 51. DEMO
  52. 52. Zipkin for Brewery
      ● A test app for Spring Cloud end to end tests
      ● Source code: https://github.com/spring-cloud-samples/brewery
      ● Around 10 applications involved
  55. 55. Summary
      ● Log correlation allows you to match logs for a given trace
      ● Distributed tracing allows you to quickly see latency issues in your system
      ● Zipkin is a great tool to visualize the latency graph and system dependencies
      ● Spring Cloud Sleuth integrates with Zipkin and grants you log correlation
  57. 57. THANK YOU
      ● https://github.com/marcingrzejszczak/vagrant-elk-box/tree/presentation - code for this presentation (clone and run getReadyForConference.sh - NOTE: you need Vagrant!)
      ● https://github.com/spring-cloud/spring-cloud-sleuth - Spring Cloud Sleuth repository
      ● http://cloud.spring.io/spring-cloud-sleuth/spring-cloud-sleuth.html - Sleuth’s documentation
      ● http://toomuchcoding.com/blog/2016/03/25/spring-cloud-sleuth-rc1-deployed/ - article about RC1 release
      ● https://github.com/openzipkin/zipkin-java - Repo with Spring Boot Zipkin server
      ● http://docssleuth-service1.cfapps.io/start - The service1 app from this presentation deployed to Pivotal Cloud Foundry - point of entry to the app
      ● http://docssleuth-zipkin-server.cfapps.io/ - Zipkin deployed to Pivotal Cloud Foundry
      ● http://brewery-zipkin-web.cfapps.io - Zipkin deployed to PCF for Brewery Sample app
