1
JAVA DAY KYIV 2017
LATENCY TRACING
IN DISTRIBUTED JAVA
APPLICATIONS
KONSTANTIN SLISENKO
LEAD SOFTWARE ENGINEER
NOV 5, 2017
2
kanstantsin_slisenka@epam.com
Konstantin Slisenko
Java Team Lead in EPAM
Financial services, trading solutions
Speaker at Java Meetups
github.com/kslisenko
3
MY TALK IS ABOUT…
4
5
AGENDA
What is latency tracing?
2
3
1
How it works?2
Live demo3
6
ONE SYSTEM
LITTLE STORY ABOUT
7
8
Team 1
Team 3
Team 2
9
Response time
150ms
150ms
Team 1
Team 3
Team 2
150ms
150ms
150ms
150ms
10
Response time
300ms
300ms
Team 1
Team 3
Team 2
300ms
300ms
300ms
300ms
11
Response time
timeout
timeout
Team 1
Team 3
Team 2
timeout
timeout
timeout
timeout
12
Response time
timeout
timeout
Team 1
Team 3
Team 2
timeout
timeout
timeout
timeout
Frustrated
user
13
1. Whose fault?
14
1. Whose fault?
Where?
15
1. Whose fault?
Where?
2. Why?
16
1. Whose fault?
Where?
2. Why?
3. How to prevent?
17
PROFILERS, LOGS, METRICS?
?
?
?
?
?
18
“When systems involve not just dozens of subsystems
but dozens of engineering teams, even our best and
most experienced engineers routinely guess wrong about
the root cause of poor end-to-end performance”
https://research.google.com/pubs/pub36356.html
19
LATENCY TRACING
THE MOMENT WHEN YOU NEED
20
AGENDA
What is latency tracing?
2
3
1
How it works?2
Live demo3
21
HOW IT WORKS
22
HOW IT WORKS
ID=1 300 ms
ID = 1
23
HOW IT WORKS
ID=1 300 ms
ID = 1
ID = 1
ID=1 150 ms
24
HOW IT WORKS
ID=1 300 ms
ID = 1
ID = 1
ID = 1
ID=1 120 ms
ID=1 150 ms
25
Service 2
parent id: 1
Service 3
parent id: 1
TRACES AND SPANS
Service 1
no parent id
span id: 1
26
Service 1
no parent id
span id: 1
Service 2
parent id: 1
span id: 2
JAVA CODE
parent id: 2
span id: 3
DB CALL
parent id: 2
span id: 4
Service 3
parent id: 1
span id: 5
TRACES AND SPANS
JAVA CODE
parent id: 5
span id: 6
DB CALL
parent id: 5
span id: 7
27
Service 1
no parent id
span id: 1
Service 2
parent id: 1
span id: 2
JAVA CODE
parent id: 2
span id: 3
DB CALL
parent id: 2
span id: 4
Service 3
parent id: 1
span id: 5
TRACES AND SPANS
JAVA CODE
parent id: 5
span id: 6
DB CALL
parent id: 5
span id: 7 Team 3
Team 1
Team 2
28
1. Whose fault?
Where?
2. Why?
3. How to prevent?
29
HOW DO I ADD THIS
TO MY PROJECT?
30
1. Pass request IDs between tiers
2. Measure and report processing time
3. Collect traces and spans
SO, THE PLAN IS
31
Communication protocols
 Pass trace/span IDs
 Use HTTP headers, JMS attrs
 Modify custom protocols
32
Entry points
 Intercept communication
frameworks (HTTP, JMS, RPC, …)
 Start new traces
33
Method execution flow
 Measure execution time
 Report new spans
 Capture method arguments
 Thread locals for trace/span IDs
34
Asynchronous invocation
 Intercept new thread starting
 Pass trace/span IDs to the new
threads
35
WHAT NEEDS TO BE CHANGED
Communication protocols
 Pass trace/span IDs
 Use HTTP headers, JMS attrs
 Modify custom protocols
Method execution flow
 Measure execution time
 Report new spans
 Capture method arguments
 Thread locals for trace/span IDs
Entry points
 Intercept communication
frameworks (HTTP, JMS, RPC…)
 Start new traces
Asynchronous invocation
 Intercept new thread starting
 Pass trace/span IDs to the new
threads
36
HOW DO I MODIFY
MY JAVA APP?
37
Instrumentation in
Java
38
Instrumentation in
Java
Source code
39
Instrumentation in
Java
Source code Byte code
40
Instrumentation in
Java
Source code Byte code
On the fly
Custom class loader
Java agents
JAVASSIST, ASM, …
Run-time aspects
41
Instrumentation in
Java
Source code Byte code
OfflineOn the fly
Custom class loader
Java agents
Compile-time
aspects
JAVASSIST, ASM, …
Run-time aspects
42
ANY EXISTING TOOLS?
43
COMMERCIAL
Magic Quadrant for Application Performance
Monitoring Suites (21 December 2016)
OPEN-SOURCE
Java Performance Monitoring: 5 Open Source
Tools You Should Know (19 January 2017)
www.stagemonitor.org github.com/naver/pinpoint
www.moskito.org
glowroot.org kamon.io
zipkin.io
https://www.gartner.com/doc/reprints?id=1-3OGTPY9&ct=161221
https://dzone.com/articles/java-performance-
monitoring-5-open-source-tools-you-should-know
44
Tracer tracer = ...;
Span parentSpan = ...;
Span span = tracer
.buildSpan(“someWork”)
.asChildOf(parentSpan.context())
.withTag(“foo”, “bar”)
.start();
try {
// Do things
} finally {
span.finish();
}
A vendor-neutral open standard
for distributed tracing
http://opentracing.io
45
AGENDA
What is latency tracing?
2
3
1
How it works?2
Live demo3
46
I’M GOING TO SHOW
47
http://github.com/kslisenko/java-performance
48
http://github.com/kslisenko/java-performance
HTTP1
49
http://github.com/kslisenko/java-performance
2
JMS
HTTP1
50
http://github.com/kslisenko/java-performance
2
JMS
HTTP1
Custom
protocol
3
51
LET’S GO!
github.com/kslisenko/java-performance
52
TAKE AWAYS
53
LATENCY TRACING ISSUES AND LIMITATIONS
1. Computation and I/O overhead
2. Custom protocols
3. Reactive streams, batch processing
4. Security and privacy
54
Latency tracing
 Must have — for microservices
 Better — in production
 At least — at performance testing
55
56
QUESTIONS?
THANK YOU!

Latency tracing in distributed Java applications