
Latency tracing in distributed Java applications

"When systems involve not just dozens of subsystems but dozens of engineering teams, even our best and most experienced engineers routinely guess wrong about the root cause of poor end-to-end performance." That is the view from Google.

Latency tracing helps Google and many other companies keep stability and performance under control and find the root causes of performance degradation, even in huge and complex distributed systems.

I'll explain what latency tracing is, how it helps, and how you can implement it in your project. Finally, I'll show a live demo using tools such as Dynatrace and Zipkin.

Examples: https://github.com/kslisenko/java-performance

http://javaday.org.ua/kanstantsin-slisenka-profiling-distributed-java-applications/

Published in: Technology

  1. JAVA DAY KYIV 2017. LATENCY TRACING IN DISTRIBUTED JAVA APPLICATIONS. KONSTANTIN SLISENKO, LEAD SOFTWARE ENGINEER. NOV 5, 2017
  2. Konstantin Slisenko (kanstantsin_slisenka@epam.com): Java Team Lead at EPAM; financial services, trading solutions; speaker at Java meetups; github.com/kslisenko
  3. MY TALK IS ABOUT…
  4. (no text)
  5. AGENDA: 1. What is latency tracing? 2. How does it work? 3. Live demo
  6. LITTLE STORY ABOUT ONE SYSTEM
  7. (no text)
  8. Team 1, Team 2, Team 3
  9. Response time: 150 ms at every tier (Team 1, Team 2, Team 3)
  10. Response time: 300 ms at every tier (Team 1, Team 2, Team 3)
  11. Response time: timeout at every tier (Team 1, Team 2, Team 3)
  12. Response time: timeout at every tier (Team 1, Team 2, Team 3). Frustrated user
  13. 1. Whose fault?
  14. 1. Whose fault? Where?
  15. 1. Whose fault? Where? 2. Why?
  16. 1. Whose fault? Where? 2. Why? 3. How to prevent?
  17. PROFILERS, LOGS, METRICS?
  18. "When systems involve not just dozens of subsystems but dozens of engineering teams, even our best and most experienced engineers routinely guess wrong about the root cause of poor end-to-end performance" https://research.google.com/pubs/pub36356.html
  19. THE MOMENT WHEN YOU NEED LATENCY TRACING
  20. AGENDA: 1. What is latency tracing? 2. How does it work? 3. Live demo
  21. HOW IT WORKS
  22. HOW IT WORKS: request ID = 1, 300 ms
  23. HOW IT WORKS: request ID = 1, 300 ms; ID = 1, 150 ms
  24. HOW IT WORKS: request ID = 1, 300 ms; ID = 1, 150 ms; ID = 1, 120 ms
  25. TRACES AND SPANS: Service 1 (span id: 1, no parent id); Service 2 (parent id: 1); Service 3 (parent id: 1)
  26. TRACES AND SPANS: Service 1 (span id: 1, no parent id); Service 2 (span id: 2, parent id: 1); JAVA CODE (span id: 3, parent id: 2); DB CALL (span id: 4, parent id: 2); Service 3 (span id: 5, parent id: 1); JAVA CODE (span id: 6, parent id: 5); DB CALL (span id: 7, parent id: 5)
  27. TRACES AND SPANS: Service 1 (span id: 1, no parent id); Service 2 (span id: 2, parent id: 1); JAVA CODE (span id: 3, parent id: 2); DB CALL (span id: 4, parent id: 2); Service 3 (span id: 5, parent id: 1); JAVA CODE (span id: 6, parent id: 5); DB CALL (span id: 7, parent id: 5). Teams: Team 1, Team 2, Team 3
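
The span hierarchy on the slides above can be sketched as a small data model: every span carries its own ID and its parent's ID, and the collector rebuilds the call tree from those links. This is only an illustration. The class names are hypothetical, the 300/150/120 ms totals come from the "HOW IT WORKS" slides, and the durations of the JAVA CODE and DB CALL spans are invented.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One timed unit of work; parentId is null for the root span of a trace. */
final class SpanRecord {
    final long spanId;
    final Long parentId;
    final String name;
    final long durationMs;

    SpanRecord(long spanId, Long parentId, String name, long durationMs) {
        this.spanId = spanId;
        this.parentId = parentId;
        this.name = name;
        this.durationMs = durationMs;
    }
}

final class TraceTree {
    /** Groups spans by parent ID so the call tree can be walked from the root. */
    static Map<Long, List<SpanRecord>> childrenByParent(List<SpanRecord> spans) {
        Map<Long, List<SpanRecord>> children = new HashMap<>();
        for (SpanRecord s : spans) {
            if (s.parentId != null) {
                children.computeIfAbsent(s.parentId, k -> new ArrayList<>()).add(s);
            }
        }
        return children;
    }

    /** Time spent in the span itself, assuming its direct children ran one after another. */
    static long selfTimeMs(SpanRecord span, Map<Long, List<SpanRecord>> children) {
        long childTotal = 0;
        for (SpanRecord c : children.getOrDefault(span.spanId, Collections.<SpanRecord>emptyList())) {
            childTotal += c.durationMs;
        }
        return span.durationMs - childTotal;
    }

    /** Span IDs and parent IDs as on the slide; leaf durations are made up. */
    static List<SpanRecord> sampleTrace() {
        return Arrays.asList(
            new SpanRecord(1, null, "Service 1", 300),
            new SpanRecord(2, 1L, "Service 2", 150),
            new SpanRecord(3, 2L, "Java code", 30),
            new SpanRecord(4, 2L, "DB call", 110),
            new SpanRecord(5, 1L, "Service 3", 120),
            new SpanRecord(6, 5L, "Java code", 20),
            new SpanRecord(7, 5L, "DB call", 90));
    }

    public static void main(String[] args) {
        List<SpanRecord> spans = sampleTrace();
        Map<Long, List<SpanRecord>> children = childrenByParent(spans);
        for (SpanRecord s : spans) {
            System.out.println(s.name + " (span " + s.spanId + "): total " + s.durationMs
                + " ms, self " + selfTimeMs(s, children) + " ms");
        }
    }
}
```

With per-span self time, the "Whose fault? Where?" question becomes a lookup: the span with the largest self time points at the tier, and therefore the team, to talk to first.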
  28. 1. Whose fault? Where? 2. Why? 3. How to prevent?
  29. HOW DO I ADD THIS TO MY PROJECT?
  30. SO, THE PLAN IS: 1. Pass request IDs between tiers 2. Measure and report processing time 3. Collect traces and spans
  31. Communication protocols: pass trace/span IDs; use HTTP headers, JMS attrs; modify custom protocols
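
As a rough sketch of passing trace/span IDs in HTTP headers: the caller writes its IDs into the outgoing headers, and the callee reads them back and uses the caller's span ID as the parent of its own span. A plain `Map` stands in for real request headers, and the header names and class names are hypothetical; real tracers define their own header conventions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** IDs that identify the current span within a trace. */
final class TraceContext {
    final String traceId;
    final String parentSpanId; // null for the first span of a trace
    final String spanId;

    TraceContext(String traceId, String parentSpanId, String spanId) {
        this.traceId = traceId;
        this.parentSpanId = parentSpanId;
        this.spanId = spanId;
    }
}

final class HeaderPropagation {
    // Hypothetical header names chosen for this sketch.
    static final String TRACE_HEADER = "X-Trace-Id";
    static final String PARENT_HEADER = "X-Parent-Span-Id";

    /** Caller side: copy the current IDs into the outgoing request headers. */
    static void inject(TraceContext current, Map<String, String> outgoingHeaders) {
        outgoingHeaders.put(TRACE_HEADER, current.traceId);
        outgoingHeaders.put(PARENT_HEADER, current.spanId);
    }

    /** Callee side: continue the caller's trace, or start a new one if no IDs arrived. */
    static TraceContext extract(Map<String, String> incomingHeaders) {
        String traceId = incomingHeaders.get(TRACE_HEADER);
        String parentSpanId = incomingHeaders.get(PARENT_HEADER);
        if (traceId == null) {
            traceId = newId();   // entry point of the system: this request starts a trace
            parentSpanId = null;
        }
        return new TraceContext(traceId, parentSpanId, newId());
    }

    static String newId() {
        return UUID.randomUUID().toString();
    }

    public static void main(String[] args) {
        TraceContext caller = extract(new HashMap<String, String>()); // no headers: new trace
        Map<String, String> headers = new HashMap<>();
        inject(caller, headers);
        TraceContext callee = extract(headers);
        System.out.println("same trace: " + callee.traceId.equals(caller.traceId));
        System.out.println("callee parent is caller span: " + callee.parentSpanId.equals(caller.spanId));
    }
}
```

The same inject/extract pair would apply to JMS message properties; a custom protocol needs an extra field for the IDs, which is why the slide lists it as a modification.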
  32. Entry points: intercept communication frameworks (HTTP, JMS, RPC, …); start new traces
  33. Method execution flow: measure execution time; report new spans; capture method arguments; thread locals for trace/span IDs
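
The method-level part might look roughly like the sketch below: a thread-local stack remembers which span is currently active, so a nested call knows its parent without the ID being passed as an argument. `MethodTimer` and its in-memory `REPORTED` list are hypothetical stand-ins for a real tracer and reporter, and span names are used in place of generated IDs to keep the sketch short.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collections;
import java.util.Deque;
import java.util.List;
import java.util.function.Supplier;

final class MethodTimer {
    /** Active spans of the current thread; the top of the stack is the parent of any new span. */
    private static final ThreadLocal<Deque<String>> ACTIVE =
        ThreadLocal.withInitial(() -> new ArrayDeque<String>());

    /** Stand-in for a real span reporter (assumption: spans are kept in memory). */
    static final List<String> REPORTED = Collections.synchronizedList(new ArrayList<String>());

    /** Runs body, measures its execution time, and reports it as a span. */
    static <T> T timed(String name, Supplier<T> body) {
        Deque<String> stack = ACTIVE.get();
        String parent = stack.peek(); // null when no span is active on this thread
        stack.push(name);
        long start = System.nanoTime();
        try {
            return body.get();
        } finally {
            long micros = (System.nanoTime() - start) / 1_000;
            stack.pop();
            REPORTED.add(name + " parent=" + parent + " took=" + micros + "us");
        }
    }

    public static void main(String[] args) {
        Integer rows = MethodTimer.<Integer>timed("handleRequest",
            () -> MethodTimer.<Integer>timed("dbCall", () -> 42));
        System.out.println("result: " + rows);
        for (String span : REPORTED) {
            System.out.println(span);
        }
    }
}
```

Because the inner call finishes first, its span is reported before the outer one. In practice the `timed` wrapper would be woven in by instrumentation rather than written by hand.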
  34. Asynchronous invocation: intercept new thread starting; pass trace/span IDs to the new threads
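
Thread locals do not follow work onto another thread, which is why asynchronous calls need special handling. One common way to deal with this, shown here as a sketch with hypothetical names, is to capture the trace ID when the task is created and restore it inside the worker thread.

```java
import java.util.concurrent.atomic.AtomicReference;

final class AsyncTracePropagation {
    /** Trace ID of the request being handled by the current thread. */
    static final ThreadLocal<String> CURRENT_TRACE_ID = new ThreadLocal<>();

    /** Wraps a task so that it runs with the trace ID of the thread that created it. */
    static Runnable propagating(Runnable task) {
        final String captured = CURRENT_TRACE_ID.get(); // read on the submitting thread
        return () -> {
            String previous = CURRENT_TRACE_ID.get();
            CURRENT_TRACE_ID.set(captured);
            try {
                task.run();
            } finally {
                CURRENT_TRACE_ID.set(previous); // avoid leaking IDs into reused pool threads
            }
        };
    }

    /** Demo helper: runs a task on a new thread and returns the trace ID it observed. */
    static String observeOnWorker(boolean wrap) {
        AtomicReference<String> seen = new AtomicReference<>();
        Runnable task = () -> seen.set(CURRENT_TRACE_ID.get());
        Thread worker = new Thread(wrap ? propagating(task) : task);
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        CURRENT_TRACE_ID.set("trace-1");
        System.out.println("without wrapping: " + observeOnWorker(false));
        System.out.println("with wrapping:    " + observeOnWorker(true));
    }
}
```

A tracing agent would typically apply such a wrapper automatically by intercepting thread and executor APIs, so application code does not have to call it.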
  35. WHAT NEEDS TO BE CHANGED. Communication protocols: pass trace/span IDs; use HTTP headers, JMS attrs; modify custom protocols. Method execution flow: measure execution time; report new spans; capture method arguments; thread locals for trace/span IDs. Entry points: intercept communication frameworks (HTTP, JMS, RPC…); start new traces. Asynchronous invocation: intercept new thread starting; pass trace/span IDs to the new threads
  36. HOW DO I MODIFY MY JAVA APP?
  37. Instrumentation in Java
  38. Instrumentation in Java: source code
  39. Instrumentation in Java: source code; byte code
  40. Instrumentation in Java: source code; byte code. On the fly: custom class loader, Java agents, run-time aspects. Tools: JAVASSIST, ASM, …
  41. Instrumentation in Java: source code; byte code. On the fly: custom class loader, Java agents, run-time aspects. Offline: compile-time aspects. Tools: JAVASSIST, ASM, …
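
For the "Java agents" option, the standard `java.lang.instrument` API gives an agent a chance to see, and optionally rewrite, the bytes of every class as it is loaded. The sketch below only records the names of classes in a chosen package and leaves the bytecode untouched. The class names and the use of the agent argument as a package prefix are assumptions made for illustration, and the class is package-private so the snippet compiles as a single file (a deployed agent class would normally be public).

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

final class TracingAgent {
    /** Internal names (with '/') of the classes the transformer would instrument. */
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    /**
     * Invoked by the JVM before main() when started with -javaagent:agent.jar=prefix;
     * the JAR manifest must name this class in its Premain-Class attribute.
     */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer(agentArgs == null ? "" : agentArgs));
    }

    static final class LoggingTransformer implements ClassFileTransformer {
        private final String packagePrefix; // e.g. "com/example/"

        LoggingTransformer(String packagePrefix) {
            this.packagePrefix = packagePrefix;
        }

        @Override
        public byte[] transform(ClassLoader loader, String className, Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain, byte[] classfileBuffer) {
            if (className != null && className.startsWith(packagePrefix)) {
                SEEN.add(className);
                // A real tracing agent would rewrite classfileBuffer here, for example
                // with Javassist or ASM, to add timing code around method bodies.
            }
            return null; // null tells the JVM to keep the original bytes
        }
    }
}
```

The appeal of this route is that the application source stays unchanged: tracing is switched on with a JVM flag.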
  42. ANY EXISTING TOOLS?
  43. COMMERCIAL: Magic Quadrant for Application Performance Monitoring Suites (21 December 2016), https://www.gartner.com/doc/reprints?id=1-3OGTPY9&ct=161221. OPEN-SOURCE: Java Performance Monitoring: 5 Open Source Tools You Should Know (19 January 2017), https://dzone.com/articles/java-performance-monitoring-5-open-source-tools-you-should-know. www.stagemonitor.org, github.com/naver/pinpoint, www.moskito.org, glowroot.org, kamon.io, zipkin.io
  44. A vendor-neutral open standard for distributed tracing (http://opentracing.io): Tracer tracer = ...; Span parentSpan = ...; Span span = tracer.buildSpan("someWork").asChildOf(parentSpan.context()).withTag("foo", "bar").start(); try { /* Do things */ } finally { span.finish(); }
  45. AGENDA: 1. What is latency tracing? 2. How does it work? 3. Live demo
  46. I'M GOING TO SHOW
  47. http://github.com/kslisenko/java-performance
  48. http://github.com/kslisenko/java-performance 1. HTTP
  49. http://github.com/kslisenko/java-performance 1. HTTP 2. JMS
  50. http://github.com/kslisenko/java-performance 1. HTTP 2. JMS 3. Custom protocol
  51. LET'S GO! github.com/kslisenko/java-performance
  52. TAKEAWAYS
  53. LATENCY TRACING ISSUES AND LIMITATIONS: 1. Computation and I/O overhead 2. Custom protocols 3. Reactive streams, batch processing 4. Security and privacy
  54. Latency tracing: a must-have for microservices; better in production; at the very least, during performance testing
  55. (no text)
  56. QUESTIONS? THANK YOU!
