1CONFIDENTIAL
PROFILING DISTRIBUTED
JAVA APPLICATIONS
KANSTANTSIN SLISENKA
LEAD SOFTWARE ENGINEER
MAY 25, 2017
ABOUT ME
Kanstantsin Slisenka
Java Backend Developer
Speaker at Tech Talks, IT Week
skype: kslisenko
kslisenko@gmail.com
kanstantsin_slisenka@epam.com
WHAT IS COMMON?
AGENDA
1. Profiling single JVM
– How profilers work
– Java agents (live demo)
2. Profiling distributed systems
– Google experience
– Dynatrace, Zipkin (live demo)
PROFILING SINGLE JVM
“You can’t measure the
performance of Java
code without interfering
with the JVM”
https://zeroturnaround.com/rebellabs/top-5-java-profilers-revealed-real-world-data-with-visualvm-jprofiler-java-mission-control-yourkit-and-custom-tooling/
Is the profiler honest?
NO!*
*Measured performance = (app performance + profiler overhead) * profiler accuracy
MEASURING TIME
System.currentTimeMillis()
System.nanoTime()
Both calls take time themselves and are not always accurate
1. Use benchmarks
http://openjdk.java.net/projects/code-tools/jmh/
2. Warm up your JVM, …
https://shipilev.net/talks/jpoint-April2014-benchmarking.pdf
https://shipilev.net/blog/2014/nanotrusting-nanotime/
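A minimal hand-rolled sketch of the two points above (measure with System.nanoTime(), warm up first). It is only an illustration, not a replacement for JMH; the class name, the sumTo workload and the run counts are assumptions made up for this example.

```java
import java.util.function.LongSupplier;

class ElapsedTimeSketch {

    // Results are written here so the JIT cannot discard the measured work.
    static volatile long sink;

    // Hypothetical workload used only for illustration.
    static long sumTo(int n) {
        long sum = 0;
        for (int i = 0; i < n; i++) {
            sum += i;
        }
        return sum;
    }

    // Returns the average time per call in nanoseconds, measured after a warm-up phase.
    static long averageNanos(LongSupplier task, int warmupRuns, int measuredRuns) {
        for (int i = 0; i < warmupRuns; i++) {
            sink = task.getAsLong(); // warm-up: give the JIT a chance to compile the code
        }
        long start = System.nanoTime(); // monotonic clock, suited to measuring intervals
        for (int i = 0; i < measuredRuns; i++) {
            sink = task.getAsLong();
        }
        long elapsed = System.nanoTime() - start;
        return elapsed / measuredRuns;
    }

    public static void main(String[] args) {
        long avg = averageNanos(() -> sumTo(1_000_000), 20, 100);
        System.out.println("Average per call: " + avg + " ns");
    }
}
```

Even with a warm-up loop, numbers from a sketch like this can be distorted by JIT optimizations and timer granularity, which is why the slide points to JMH.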
CPU VS WALL-CLOCK TIME
public void main() throws InterruptedException {
    a(); // 100 ms
    Thread.sleep(200);
    b(); // 100 ms
    // GC is running: 50 ms
    c(); // 100 ms
}
Wall-clock time
Total time it takes to execute
100 + 200 + 100 + 50 + 100 = 550 ms
CPU time
Time the CPU was busy executing the code
100 + 100 + 100 = 300 ms
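The difference can be observed from inside a program. The sketch below assumes the JVM supports per-thread CPU time through ThreadMXBean (most HotSpot builds do); the class name, busyWork and the chosen durations are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

class CpuVsWallClock {

    // Keeps the CPU busy for roughly the given number of milliseconds.
    static long busyWork(long millis) {
        long deadline = System.nanoTime() + millis * 1_000_000L;
        long counter = 0;
        while (System.nanoTime() < deadline) {
            counter++;
        }
        return counter;
    }

    // Returns {wallClockNanos, cpuNanos} for: busy work, sleep, busy work.
    static long[] measure(long busyMillis, long sleepMillis) throws InterruptedException {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        long wallStart = System.nanoTime();
        // CPU time consumed by the current thread only; assumes the JVM supports it.
        long cpuStart = threads.getCurrentThreadCpuTime();
        busyWork(busyMillis);
        Thread.sleep(sleepMillis); // counts as wall-clock time, not as CPU time
        busyWork(busyMillis);
        long cpu = threads.getCurrentThreadCpuTime() - cpuStart;
        long wall = System.nanoTime() - wallStart;
        return new long[] {wall, cpu};
    }

    public static void main(String[] args) throws InterruptedException {
        long[] result = measure(100, 200);
        System.out.println("Wall-clock: " + result[0] / 1_000_000 + " ms");
        System.out.println("CPU:        " + result[1] / 1_000_000 + " ms");
    }
}
```

The wall-clock figure includes the sleep, while the CPU figure should stay close to the two busy phases.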
JVM DIAGNOSTIC INTERFACES
• JVMTI (native C++ API)
• Attach API
• jstack, jmap, jps, …
• Performance counters
• Heap Dumps
• Flight Recorder
• JMX
– java.lang.management
– custom MBeans
• Java Agents
– java.lang.instrument
github.com/aragozin/jvm-tools
JAVA.LANG.MANAGEMENT
github.com/kslisenko/java-performance/tree/master/java-agent-monitoring
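import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Print the live thread count, then a dump of every thread
ThreadMXBean threadMBean = ManagementFactory.getThreadMXBean();
System.out.println("Thread count = " + threadMBean.getThreadCount());
// true, true: include locked monitors and locked ownable synchronizers
ThreadInfo[] threads = threadMBean.dumpAllThreads(true, true);
for (ThreadInfo thread : threads) {
    System.out.println(thread);
}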
SAMPLING
Thread dumps at regular intervals
Overhead depends on the sampling interval
Pros
• relatively small overhead
• can be used for unknown code
Cons
• accuracy (probability-based approach)
• triggers JVM safe-points
INSTRUMENTATION
Injection of measurement code
Overhead depends on the speed of the measurement code
Pros
• accuracy (we measure each execution)
• we can also modify the code
Cons
• relatively big overhead
• we must know the code we are instrumenting
Call stack in both diagrams: main() → a() → b() → c()
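The sampling side can be sketched in a few lines: take the stack of a thread at a fixed interval and count which method is on top. This is a toy illustration, not how a production profiler is built; the class name, hotLoop and the interval are assumptions for this example. Thread.getStackTrace() is used here for brevity and, like the thread dumps above, it relies on safe-points.

```java
import java.util.HashMap;
import java.util.Map;

class SamplingSketch {

    // Samples the target thread's stack at a fixed interval and counts the top frame.
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis)
            throws InterruptedException {
        Map<String, Integer> hits = new HashMap<>();
        for (int i = 0; i < samples && target.isAlive(); i++) {
            StackTraceElement[] stack = target.getStackTrace();
            if (stack.length > 0) {
                String top = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(top, 1, Integer::sum);
            }
            Thread.sleep(intervalMillis); // a shorter interval means more overhead
        }
        return hits;
    }

    // Hypothetical workload: spins until the thread is interrupted.
    static void hotLoop() {
        long x = 0;
        while (!Thread.currentThread().isInterrupted()) {
            x += System.nanoTime() % 7;
        }
    }

    public static void main(String[] args) throws InterruptedException {
        Thread worker = new Thread(SamplingSketch::hotLoop, "worker");
        worker.start();
        Map<String, Integer> hits = sample(worker, 50, 10);
        worker.interrupt();
        worker.join();
        hits.forEach((method, count) -> System.out.println(count + "  " + method));
    }
}
```

Methods that run between two samples are never seen, which is the probability-based accuracy trade-off listed above.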
SAMPLING
How to capture a thread dump
1. jstack -l JAVA_PID
2. ManagementFactory.getThreadMXBean().dumpAllThreads(true, true);
3. AsyncGetCallTrace (unofficial HotSpot API, called from native agents)
Options 1 and 2: the JVM goes to a safe-point
• Application threads are paused
• We never see code in which a safe-point is never reached
Option 3: doesn’t trigger safe-points
github.com/jvm-profiling-tools/honest-profiler
Safe-points
> jstack -l JAVA_PID
Total time for which application threads were stopped: 0.0132329 seconds, Stopping threads took: 0.0007617 seconds
Total time for which application threads were stopped: 0.0002887 seconds, Stopping threads took: 0.0000385 seconds
-XX:+PrintGCApplicationStoppedTime
-XX:+PrintSafepointStatistics
-XX:PrintSafepointStatisticsCount=1
INSTRUMENTATION
Where measurement code can be injected
1. Source code (.java): Dropwizard Metrics, Perf4J, or manual timing
long start = System.currentTimeMillis();
// Your code goes here
long finish = System.currentTimeMillis();
System.out.println(finish - start);
2. Compilation (.java → .class): AspectJ compiler
3. Loading (.class → byte code in runtime): custom ClassLoader, Java agents
Class loaders: bootstrap (rt.jar), extension (lib/ext), application (classpath), custom
4. Runtime: proxy classes generation, frameworks
Byte code libraries: ASM, Javassist, CGLib, AspectJ, BCEL
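As an illustration of the "proxy classes generation" option, the sketch below uses the JDK's java.lang.reflect.Proxy to wrap an object so that every call through its interface is timed. The Greeter interface and the timed helper are hypothetical names invented for this example; frameworks generate comparable proxies with their own machinery.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Proxy;

class TimingProxySketch {

    // Hypothetical service interface used only for this illustration.
    interface Greeter {
        String greet(String name);
    }

    // Wraps 'target' in a JDK dynamic proxy that prints the duration of every call.
    @SuppressWarnings("unchecked")
    static <T> T timed(Class<T> type, T target) {
        InvocationHandler handler = (proxy, method, args) -> {
            long start = System.nanoTime();
            try {
                return method.invoke(target, args);
            } catch (InvocationTargetException e) {
                throw e.getCause(); // rethrow the exception thrown by the target
            } finally {
                long micros = (System.nanoTime() - start) / 1_000;
                System.out.println(method.getName() + " took " + micros + " us");
            }
        };
        return (T) Proxy.newProxyInstance(type.getClassLoader(), new Class<?>[] {type}, handler);
    }

    public static void main(String[] args) {
        Greeter greeter = timed(Greeter.class, name -> "Hello, " + name);
        System.out.println(greeter.greet("JVM"));
    }
}
```

The measured code is untouched; only callers that go through the proxy are timed, and the reflective call adds overhead of its own.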
JAVA AGENTS
Native agents (C++)
> java -agentpath:agent.dll -jar app.jar
Use for a deep dive into the JVM
• Have access to the JVM state, can receive JVMTI events
• Independent from the JVM (not interrupted by GC, can collect debug information between safe-points, etc.)
API
• JVMTI (C++ native interface of the JVM)
Java agents
> java -javaagent:agent.jar -jar app.jar
Use for byte-code modification
• Allow transforming byte-code before it is loaded by the ClassLoader
• Follow the JVM lifecycle (suspended by GC, etc.)
API
• java.lang.instrument, java.lang.management
AGENT EXAMPLES
HPROF Java profiler
-agentlib:hprof[=options] ToBeProfiledClass
JDWP Java debugger
-agentlib:jdwp=transport=dt_socket,address=localhost:9009,server=y,suspend=y
JRebel/XRebel
• https://zeroturnaround.com/software/jrebel/
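// DemoAgent.java
package com.example;
import java.lang.instrument.Instrumentation;
public class DemoAgent {
    // Called by the JVM before the application's main method
    public static void premain(String args, Instrumentation instr) {
        instr.addTransformer(new ClassLoadingLogger());
    }
}
// ClassLoadingLogger.java
package com.example;
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.IllegalClassFormatException;
import java.security.ProtectionDomain;
public class ClassLoadingLogger implements ClassFileTransformer {
    @Override
    public byte[] transform(ClassLoader loader, String className,
            Class<?> classBeingRedefined, ProtectionDomain protectionDomain,
            byte[] classfileBuffer) throws IllegalClassFormatException {
        System.out.println(className); // log every class as it is loaded
        return classfileBuffer; // byte code is returned unchanged
    }
}
MANIFEST.MF
Manifest-Version: 1.0
Agent-Class: com.example.DemoAgent
Premain-Class: com.example.DemoAgent
> java -javaagent:agent.jar -jar app.jar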
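Participants: JVM, ClassLoader, Agent, ClassFileTransformer, byte code manipulation library
1. premain (the JVM starts the Agent)
2. addTransformer (the Agent registers a ClassFileTransformer)
3. load class (the ClassLoader loads Class A, Class B, Class C)
4. transform (Class A is passed to the ClassFileTransformer)
5. modify byte code (byte code manipulation library)
6. redefine class (Class A becomes Class A*)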
JAVASSIST
High-level, object-oriented API
github.com/jboss-javassist/javassist
JAVA AGENT + JAVASSIST
LIVE DEMO
https://github.com/kslisenko/java-performance/tree/master/java-agent
PROFILING DISTRIBUTED
SYSTEM
DISTRIBUTED SYSTEM
Client → HTTP → Server 1
Server 1 → HTTP → Server 2 → DB
Server 1 → HTTP → Server 3 → DB
LOOKING GOOD
Responses: HTTP 200 150ms, HTTP 200 150ms
SOMETHING WENT WRONG
Responses: HTTP 200 150ms, HTTP 200 270ms, HTTP 200 270ms, HTTP 200 150ms
FAIL
Responses: HTTP 500 timeout, HTTP 200 150ms, HTTP 200 270ms, HTTP 200 270ms, HTTP 200 150ms
Frustrated user
IDENTIFYING PERFORMANCE PROBLEM
Trace propagation: every HTTP call between the servers carries the header req-id: 1
Each server logs the request id, start time and duration
Req-1 12:45:31.000 150 ms
Req-1 12:45:31.010 130 ms
Req-1 12:45:31.020 120 ms
Responses: HTTP 500 timeout, HTTP 200 150ms, HTTP 200 270ms, HTTP 200 270ms, HTTP 200 150ms
Frustrated user
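Trace propagation can be sketched without any HTTP stack: each server reuses the incoming req-id header, forwards it downstream and logs its own duration under that id. The code below is a single-process simulation under that assumption; the class name, the handle method and the log format are made up for this example.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.UUID;

class RequestIdPropagation {

    static final String HEADER = "req-id"; // header name taken from the slide
    static final List<String> LOG = new ArrayList<>();

    // Simulates one server handling a request and calling its downstream servers.
    static void handle(String server, Map<String, String> headers, String... downstream) {
        // Reuse the incoming id, or start a new trace at the edge of the system.
        String reqId = headers.computeIfAbsent(HEADER, k -> UUID.randomUUID().toString());
        long start = System.nanoTime();
        for (String next : downstream) {
            handle(next, new HashMap<>(headers)); // forward the same headers
        }
        long micros = (System.nanoTime() - start) / 1_000;
        LOG.add("Req-" + reqId + " " + server + " " + micros + " us");
    }

    public static void main(String[] args) {
        Map<String, String> headers = new HashMap<>();
        headers.put(HEADER, "1");
        handle("server1", headers, "server2", "server3");
        LOG.forEach(System.out::println);
    }
}
```

Because all three log lines share the same id, they can be joined later into one picture of the request, even when the servers write to separate log files.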
TRACE EXAMPLE 1
Response HTTP 200, 150 ms
http://server1/service: 150 ms
– http://server2/service: 120 ms (server2 to DB, business logic: 80 ms, 30 ms)
– http://server3/service: 130 ms (server3 to DB, business logic: 100 ms, 20 ms)
Response HTTP 200, 270 ms
http://server1/service: 270 ms
– http://server2/service: 120 ms (server2 to DB, business logic: 80 ms, 30 ms)
– http://server3/service: 130 ms (server3 to DB, business logic: 100 ms, 20 ms)
TRACE EXAMPLE 2
Response HTTP 200, 150 ms
http://server1/service: 150 ms
– http://server2/service: 120 ms (server2 to DB, business logic: 80 ms, 30 ms)
– http://server3/service: 130 ms (server3 to DB, business logic: 100 ms, 20 ms)
Response HTTP 500, timeout
http://server1/service: 500 ms, timeout
– http://server2/service: 120 ms (server2 to DB, business logic: 80 ms, 30 ms)
– http://server3/service: 370 ms (server3 to DB: 350 ms)
“When systems involve not just dozens of subsystems but
dozens of engineering teams, even our best and most
experienced engineers routinely guess wrong about the root
cause of poor end-to-end performance.”
Google Dapper
https://research.google.com/pubs/pub36356.html
GOOGLE DAPPER
Use cases
1. Identify performance problems
across multiple teams and services
2. Build dynamic environment map
Requirements
1. Low overhead
– no impact on running services
2. Application-level transparency*
– programmers should not need to be aware of
the tracing system
3. Scalability
*They instrumented Google Search almost without modifications
GOOGLE DAPPER: TRACES AND SPANS
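In Dapper a trace is a tree of spans, and each span refers to its parent span. A minimal sketch of that idea follows; the field names are illustrative and not Dapper's actual schema, and the durations loosely follow the timeout trace shown earlier. The slowestPath helper is a hypothetical addition that walks from the root to the slowest child at each level.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SpanSketch {

    // Simplified span: field names are illustrative, not Dapper's actual schema.
    static final class Span {
        final String traceId;
        final String spanId;
        final String parentId; // null for the root span
        final String name;
        final long durationMillis;

        Span(String traceId, String spanId, String parentId, String name, long durationMillis) {
            this.traceId = traceId;
            this.spanId = spanId;
            this.parentId = parentId;
            this.name = name;
            this.durationMillis = durationMillis;
        }
    }

    // Follows the slowest child at every level, starting from the given root span.
    static List<String> slowestPath(Span root, List<Span> trace) {
        List<String> path = new ArrayList<>();
        Span current = root;
        while (current != null) {
            path.add(current.name);
            Span slowest = null;
            for (Span s : trace) {
                if (current.spanId.equals(s.parentId)
                        && (slowest == null || s.durationMillis > slowest.durationMillis)) {
                    slowest = s;
                }
            }
            current = slowest;
        }
        return path;
    }

    // Example data: one trace with four spans.
    static List<Span> exampleTrace() {
        return Arrays.asList(
                new Span("t1", "1", null, "server1", 500),
                new Span("t1", "2", "1", "server2", 120),
                new Span("t1", "3", "1", "server3", 370),
                new Span("t1", "4", "3", "server3 to DB", 350));
    }

    public static void main(String[] args) {
        List<Span> trace = exampleTrace();
        System.out.println(slowestPath(trace.get(0), trace)); // prints [server1, server3, server3 to DB]
    }
}
```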
GOOGLE DAPPER: ARCHITECTURE
GOOGLE DAPPER: TECHNICAL DETAILS
Technical facts
1. Adaptive sampling
2. 1TB/day to BigTable
3. API + MapReduce
4. Instrumentation of common
Google libraries
Issues and limitations
1. Request buffering
2. Batch jobs
3. Queued requests
4. Relative latency
WANT THE SAME AS GOOGLE?
COMMERCIAL
Magic Quadrant for Application Performance Monitoring Suites (21 December 2016)
https://www.gartner.com/doc/reprints?id=1-3OGTPY9&ct=161221
OPEN-SOURCE
Java Performance Monitoring: 5 Open Source Tools You Should Know (19 January 2017)
https://dzone.com/articles/java-performance-monitoring-5-open-source-tools-you-should-know
www.stagemonitor.org
github.com/naver/pinpoint
www.moskito.org
glowroot.org
kamon.io
zipkin.io
https://university.dynatrace.com/education/appmon/913/10859
ZIPKIN (SPRING CLOUD SLEUTH)
http://zipkin.io/pages/architecture.html
Server 1 → HTTP → Server 2, the trace id is passed along with each HTTP call
Instrumented libraries send traces and spans to Zipkin
Zipkin components: transport, storage, API, user interface
DEMO APPLICATION
github.com/kslisenko/java-performance
Components: browser, Frontend, two Backends (Spring Boot), JMS chat queue, MySQL
Demo cases
1. HTTP calls
2. JMS
3. Custom protocol (TCP/IP)
4. DB, JDBC, Hibernate
5. Exceptions
6. Async invocations
– New threads
– ExecutorService
– CompletableFuture
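Async invocations are a common place where a trace gets lost, because context stored per thread does not follow the work into a pool thread. A minimal sketch of one way to carry it across, assuming the trace id is kept in a ThreadLocal; the class and method names are hypothetical, and tracing libraries ship their own wrappers for this.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class TraceContextSketch {

    // Holds the trace id of the request handled by the current thread (hypothetical helper).
    static final ThreadLocal<String> TRACE_ID = new ThreadLocal<>();

    // Captures the caller's trace id and restores it inside the worker thread.
    static <T> Callable<T> traced(Callable<T> task) {
        String captured = TRACE_ID.get();
        return () -> {
            String previous = TRACE_ID.get();
            TRACE_ID.set(captured);
            try {
                return task.call();
            } finally {
                TRACE_ID.set(previous); // do not leak the id to the next pooled task
            }
        };
    }

    // Runs a task in a pool thread and returns what that thread saw as its trace id.
    static String runInPool(String traceId) throws Exception {
        TRACE_ID.set(traceId);
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = traced(() -> "worker sees " + TRACE_ID.get());
            Future<String> result = pool.submit(task);
            return result.get();
        } finally {
            pool.shutdown();
            TRACE_ID.remove();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runInPool("req-1")); // prints "worker sees req-1"
    }
}
```

Without the traced wrapper the worker would see null, and the spans created in the pool thread could not be attached to the request that triggered them.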
DYNATRACE + ZIPKIN
LIVE DEMO
github.com/kslisenko/java-performance
CONCLUSION
1. Make it work
2. Make it right
3. Make it fast
REFERENCES
Metric libraries
Perf4J https://github.com/perf4j/perf4j
Metrics http://metrics.dropwizard.io
Servo https://github.com/Netflix/servo
Byte-code modification with JAVASSIST
https://blog.newrelic.com/2014/09/29/diving-bytecode-manipulation-creating-audit-log-asm-javassist
https://www.youtube.com/watch?v=39kdr1mNZ_s
Java Agents
https://www.slideshare.net/arhan/oredev-2015-taming-java-agents
http://www.barcelonajug.org/2015/04/java-agents.html
Profiling
https://blog.codecentric.de/en/2011/10/measure-java-performance-sampling-or-instrumentation/
https://blog.codecentric.de/en/2014/10/profiler-tell-truth-javaone/
https://www.youtube.com/watch?v=YCC-CpTE2LU&t=2312s
https://www.slideshare.net/aragozin/java-black-box-profiling
https://www.slideshare.net/aragozin/java-profiling-diy-jugmskru-2016
Safe-points
http://blog.ragozin.info/2012/10/safepoints-in-hotspot-jvm.html
https://www.cberner.com/2015/05/24/debugging-jvm-safepoint-pauses/
QUESTIONS?
THANK YOU!
KANSTANTSIN_SLISENKA@EPAM.COM
