Diagnosing HotSpot JVM Memory Leaks with JFR and JMC
1. Diagnosing HotSpot JVM Memory Leaks with JFR and JMC
Mushfekur Rahman
Software Engineer II
2. Today We’ll Talk About...
● Java Reference Types
● Understanding GC Reachability
● Memory Leak and java.lang.OutOfMemoryError
● Brief Introduction to JFR and JMC
● A Few Classic Leak Scenarios
3. Java Reference Types
● What about them?
○ How do we create a reference to an object?
Car c = new Car();
● Newsflash: this is not the only way to create a reference
● Java offers four types of references
1. Strong
2. Soft
3. Weak
4. Phantom
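The four reference types can be sketched with the classes in java.lang.ref. This is a minimal illustration, not part of the original slides: the Car class is a hypothetical stand-in for the one used above, and the method only reports what each reference's get() returns while the object is still strongly held.

```java
import java.lang.ref.PhantomReference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;
import java.util.Arrays;

class ReferenceTypesDemo {
    // Hypothetical stand-in for the Car class on the slide.
    static class Car { }

    /** Reports whether soft, weak and phantom get() return the referent while it is strongly held. */
    static boolean[] referentVisibleWhileStronglyHeld() {
        Car strong = new Car();                                   // 1. strong reference
        SoftReference<Car> soft = new SoftReference<>(strong);    // 2. soft: cleared under memory pressure
        WeakReference<Car> weak = new WeakReference<>(strong);    // 3. weak: cleared once not strongly/softly reachable
        ReferenceQueue<Car> queue = new ReferenceQueue<>();
        PhantomReference<Car> phantom = new PhantomReference<>(strong, queue); // 4. phantom: enqueued after collection

        return new boolean[] {
            soft.get() == strong,
            weak.get() == strong,
            phantom.get() == strong   // PhantomReference.get() always returns null
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(referentVisibleWhileStronglyHeld())); // [true, true, false]
    }
}
```

A phantom reference never hands the object back; it is only useful together with its ReferenceQueue, as a notification that the referent has been collected.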
4. GC Reachability
● Let’s be a little hypothetical: what does GC do?
○ It makes a finite memory space look infinite
● A common misunderstanding
○ GC collects unused/dead objects and reclaims memory space occupied by them
○ It’s actually the exact opposite
■ GC marks live objects
■ Reclaims everything else
6. GC Reachability (contd.)
● Whether an object is eligible for GC depends on the references to it
● GC algorithms traverse the object graph starting from the GC roots
○ GC roots
■ The sources of the object trees
■ Possible candidates are active threads and their local variables, static variables, and JNI references
○ All objects reachable from a GC root are marked as alive (not eligible for GC)
● If an object is not strongly reachable from any GC root, it becomes eligible for collection
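Reachability from a GC root can be observed with a weak reference used as a probe. The sketch below is an illustration under stated assumptions: Node and the static root field are hypothetical names, and System.gc() is only a request, so the second printed value is typical HotSpot behaviour rather than a guarantee.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    static class Node {
        Node next;
    }

    // A static variable is a GC root: everything reachable from it stays alive.
    static Node root;

    /** Builds root -> tail and returns a weak probe that does not keep the tail alive by itself. */
    static WeakReference<Node> buildChain() {
        Node tail = new Node();
        root = new Node();
        root.next = tail;
        return new WeakReference<>(tail);
    }

    /** True if the tail survives a GC request while it is still reachable from the static root. */
    static boolean tailAliveWhileRooted() {
        WeakReference<Node> probe = buildChain();
        System.gc();                      // a request only; the JVM decides when to collect
        return probe.get() != null;       // still strongly reachable via root.next
    }

    public static void main(String[] args) {
        WeakReference<Node> probe = buildChain();
        System.gc();
        System.out.println("alive while rooted: " + (probe.get() != null));

        root = null;                      // cut the only strong path to both nodes
        System.gc();
        // Usually true on HotSpot, but collection timing is not guaranteed.
        System.out.println("cleared after unrooting: " + (probe.get() == null));
    }
}
```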
8. Behold The OutOfMemoryError !
● It’s not as generic as you might think!
● Heap
○ Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
○ This is what today’s session is mostly about
● PermGen
○ Exception in thread "main" java.lang.OutOfMemoryError: PermGen space
○ We’ll talk about PermGen leaks in another session
● Doesn’t always mean there’s a memory leak
○ Fragmentation
○ HotSpot VM also throws OutOfMemoryError ("GC overhead limit exceeded") on excessive GC activity (by default, when more than 98% of the time is spent in GC and less than 2% of the heap is recovered)
9. Wait a minute...
● So, we are fine as long as there’s no OutOfMemoryError?
○ Not really!
○ In fact, small unnoticed leaks are both deceptive and vicious.
● What does all this have to do with application performance?
○ GC is essential but expensive
○ GC is one of the biggest contributors to overall latency in JVM applications
○ The less memory is used, the less GC work there is
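The cost of GC can be made visible from inside the application through the standard java.lang.management API. The following is a rough sketch, not a benchmark: the churn method and its sizes are arbitrary assumptions chosen only to produce some garbage, and the reported numbers depend on the collector and heap settings.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCostDemo {
    /** Total collections reported by all collectors; -1 ("undefined") is treated as 0. */
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionCount());
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            total += Math.max(0, gc.getCollectionTime());
        }
        return total;
    }

    /** Allocates short-lived arrays; the small sink keeps the allocations from being optimized away. */
    static long churn(int iterations) {
        byte[][] sink = new byte[16][];
        long allocated = 0;
        for (int i = 0; i < iterations; i++) {
            sink[i % sink.length] = new byte[64 * 1024];
            allocated += sink[i % sink.length].length;
        }
        return allocated;
    }

    public static void main(String[] args) {
        long countBefore = totalGcCount();
        long millisBefore = totalGcMillis();
        long bytes = churn(20_000);
        System.out.println("allocated bytes: " + bytes);
        System.out.println("collections during churn: " + (totalGcCount() - countBefore));
        System.out.println("GC millis during churn: " + (totalGcMillis() - millisBefore));
    }
}
```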
10. Java Mission Control
● A set of powerful monitoring and diagnostic tools that ships with the Oracle JDK
● Two main parts
○ JMX Console
■ For monitoring JVM in real time
○ Java Flight Recorder
■ For collecting data about JVM (profiling)
● Can monitor both local and remote JVM instances
○ Event triggers
○ Execute troubleshooting commands on target JVM
● Supports plugins
○ For additional functionality (e.g. heap dump analysis, DTrace recording, etc.)
● Free for development use
11. Java Flight Recorder
● Flight Recorder
“A flight recorder, commonly known as a black
box, although it is now orange-coloured, is an
electronic recording device placed in an aircraft
for the purpose of facilitating the investigation of
aviation accidents and incidents.” - Wikipedia
Fig. An actual aircraft Flight Recorder
12. Java Flight Recorder (contd.)
● High-performance event recorder
○ Built into the runtime
● Recordings are stored as binary chunks
○ Very detailed (records everything according to settings)
○ Self-contained and self-describing
● Very low overhead (<=1% according to Oracle)
○ Can be kept running all the time
○ Dumps can be taken from time to time using JVM debugging commands (e.g. jcmd)
13. Java Flight Recorder (contd.)
● Recording Types
○ Continuous Recording
■ Has no end time
■ Must be explicitly dumped
○ Fixed-Time Recordings (a.k.a. Profiling Recordings)
■ Events will be captured within the specified timeframe
■ Will be automatically dumped in the specified location
14. JFR Setup
● Start JVM with the following flags
-XX:+UnlockCommercialFeatures -XX:+FlightRecorder
● For remote profiling, add the JMX-related com.sun.management flags
● Additional parameters can be passed with -XX:StartFlightRecording
● Templates can be specified with the settings parameter
○ Templates can be created using JMC
○ Default template location $JAVA_HOME/jre/lib/jfr
○ Can only be used with JRockit or HotSpot (7u40 or later)
● Disable any other profiling tool that instruments the application (e.g. XRebel), if one is present
15. Recording Events
● Default Recording
○ Starts a continuous recording
○ The defaultrecording parameter needs to be added when starting the JVM
○ dumponexit and dumponexitpath parameters can also be used with it
○ To get more info on what’s going on we can change log level
-XX:FlightRecorderOptions=loglevel=trace
○ Example: The following will start a default recording
-XX:FlightRecorderOptions=defaultrecording=true,dumponexit=true,dumponexitpath=<path>
16. Recording Events (contd.)
● Creating recordings on the fly
○ Send commands to the target JVM using jcmd
○ Available commands: JFR.start, JFR.check, JFR.stop, JFR.dump
○ Command format:
jcmd <pid> <command> <parameters>
○ Starting a recording
jcmd 4609 JFR.start name=DemoRecording settings=profile delay=20s duration=5m filename=/home/demorecording.jfr
18. Common Pitfalls
● Non-static Inner Classes
○ An inner class instance has an implicit reference to its containing instance
○ It isn’t supposed to outlive the containing instance
○ What if some outside context holds a reference to the inner instance?
■ Due to the implicit strong reference, the outer instance cannot be GC’d
public class OuterClass {
    /**
     * Outer class contents
     */

    // Non-static inner class: every instance keeps a hidden reference to its OuterClass instance
    public class InnerClass {
        /**
         * Inner class contents
         */
    }
}
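The pinning effect of an inner instance can be demonstrated with a weak reference as a probe. This is a sketch under assumptions: Outer, Inner, Nested and the longLived field are hypothetical names, and the inner class deliberately reads an outer field so that the hidden outer reference is definitely retained by the compiler.

```java
import java.lang.ref.WeakReference;

class InnerClassLeakDemo {
    static class Outer {
        final byte[] payload = new byte[1024 * 1024];   // stands in for expensive outer state

        // Non-static inner class: carries a hidden reference to the enclosing Outer.
        class Inner {
            int payloadSize() { return payload.length; } // reads Outer.this.payload
        }

        // Static nested class: no hidden reference to any Outer instance.
        static class Nested { }
    }

    // Hypothetical long-lived outside context, e.g. a listener registry.
    static Object longLived;

    /** True: the Outer stays reachable through the Inner held by the long-lived context. */
    static boolean outerPinnedByInner() {
        Outer outer = new Outer();
        WeakReference<Outer> probe = new WeakReference<>(outer);
        longLived = outer.new Inner();
        outer = null;
        System.gc();
        return probe.get() != null;
    }

    /** Usually false on HotSpot: a Nested instance does not keep any Outer alive. */
    static boolean outerPinnedByNested() {
        Outer outer = new Outer();
        WeakReference<Outer> probe = new WeakReference<>(outer);
        longLived = new Outer.Nested();
        outer = null;
        System.gc();                      // a request only, so this result is not guaranteed
        return probe.get() != null;
    }

    public static void main(String[] args) {
        System.out.println("pinned by inner: " + outerPinnedByInner());   // true
        System.out.println("pinned by nested: " + outerPinnedByNested());
    }
}
```

Declaring the class static, when it does not need the enclosing instance, is the usual way to avoid this kind of accidental retention.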
19. Common Pitfalls (contd.)
● Flyweight Pattern
○ Structural design pattern that aims to reduce the cost of object creation
○ Reuses existing objects by maintaining an object pool
○ Two states in every object
■ Intrinsic: Shareable values
■ Extrinsic: Non-shareable values, created and/or destroyed based on the client object’s actions.
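A minimal flyweight sketch shows where the leak risk sits. Glyph and GlyphFactory are hypothetical names chosen for illustration: the flyweight stores only intrinsic state, extrinsic state is passed in per call, and the pool holds a strong reference to every flyweight it has ever created.

```java
import java.util.HashMap;
import java.util.Map;

class FlyweightDemo {
    /** Flyweight: holds only intrinsic (shareable, immutable) state. */
    static final class Glyph {
        private final char symbol;

        Glyph(char symbol) { this.symbol = symbol; }

        /** Extrinsic state (the position) is supplied by the caller and never stored. */
        String render(int x, int y) { return symbol + "@" + x + "," + y; }
    }

    static final class GlyphFactory {
        // Strongly references every flyweight ever requested; entries are never evicted.
        private final Map<Character, Glyph> pool = new HashMap<>();

        Glyph get(char symbol) { return pool.computeIfAbsent(symbol, Glyph::new); }

        int poolSize() { return pool.size(); }
    }

    public static void main(String[] args) {
        GlyphFactory factory = new GlyphFactory();
        System.out.println(factory.get('a') == factory.get('a')); // true: same shared instance
        System.out.println(factory.get('a').render(1, 2));        // a@1,2
        factory.get('b');
        System.out.println(factory.poolSize());                   // 2
    }
}
```

With a small, bounded key space the pool is harmless. If keys are unbounded (for example, built from user input) or extrinsic state ends up stored inside the pooled objects, the pool itself becomes the leak.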
20. Common Pitfalls (contd.)
● JSP Tag Pooling
○ Servlet containers cache tag instances for performance reasons
○ State goes back to the pool along with the tag instance
○ Wait... it’s not just about heap leaks
■ What if you have logic based on the current state?
● Kaboom!
○ Clean up instance variable state after use
○ Example
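The stale-state problem can be simulated in plain Java without a servlet container. Everything here is a hypothetical stand-in and not the real javax.servlet.jsp API: GreetingTag plays the role of a tag handler, TagPool mimics a container that reuses instances, and reset() is an illustrative cleanup hook.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class TagPoolDemo {
    /** Hypothetical stand-in for a pooled tag handler with one optional attribute. */
    static class GreetingTag {
        private String name;                       // instance state set from a tag attribute

        void setName(String name) { this.name = name; }

        String render() { return "Hello, " + (name == null ? "guest" : name); }

        void reset() { name = null; }              // illustrative cleanup hook
    }

    /** Minimal pool that hands the same instance back out, as a container might. */
    static class TagPool {
        private final Deque<GreetingTag> idle = new ArrayDeque<>();

        GreetingTag borrow() {
            GreetingTag tag = idle.poll();
            return tag != null ? tag : new GreetingTag();
        }

        void giveBack(GreetingTag tag) { idle.push(tag); }
    }

    /** Renders the tag twice; the second use sets no name attribute. */
    static String secondRender(boolean cleanUp) {
        TagPool pool = new TagPool();

        GreetingTag first = pool.borrow();
        first.setName("alice");
        first.render();
        if (cleanUp) {
            first.reset();                         // clear instance state before pooling
        }
        pool.giveBack(first);

        GreetingTag second = pool.borrow();        // same instance as 'first'
        return second.render();
    }

    public static void main(String[] args) {
        System.out.println(secondRender(false));   // Hello, alice  (stale state leaks into the next use)
        System.out.println(secondRender(true));    // Hello, guest
    }
}
```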
21. Common Pitfalls (contd.)
● Stateless Session Beans (Stateless EJBs)
○ Conversational state with a client is not stored
○ However, instance variable values are retained when the bean returns to the pool
○ As with JSP tag pooling, it’s not just about memory
○ Clean up instance variable state unless it is required
22. Common Pitfalls (contd.)
● ThreadLocal instances
○ Lifecycle bound to the owning thread
○ Typical multithreaded services use thread pools
■ Threads get reused
○ What if you forget to clean up after the thread finishes its work?
● Implicit leaks
○ Autoboxing in a loop
○ Serializing/cloning/copying complex object graphs
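The ThreadLocal pitfall above can be reproduced with a single-threaded pool, which guarantees that both tasks run on the same worker thread. This is a sketch under assumptions: CURRENT_USER is a hypothetical per-request context, and the empty try block marks where request handling would happen.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

class ThreadLocalLeakDemo {
    // Hypothetical per-request context stored on the worker thread.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    /** Runs two tasks on one pooled thread and returns what the second task sees. */
    static String valueSeenByNextTask(boolean cleanUp) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable handleRequest = () -> {
                CURRENT_USER.set("alice");
                try {
                    // ... handle the request ...
                } finally {
                    if (cleanUp) {
                        CURRENT_USER.remove();     // detach the value from the pooled thread
                    }
                }
            };
            Callable<String> nextRequest = CURRENT_USER::get;

            pool.submit(handleRequest).get();      // wait for the first task to finish
            return pool.submit(nextRequest).get(); // runs on the same worker thread
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(valueSeenByNextTask(false)); // alice  (value outlives the task)
        System.out.println(valueSeenByNextTask(true));  // null
    }
}
```

Without remove(), the value stays strongly reachable for as long as the pooled thread lives, and later tasks on that thread can observe it.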
23. Additional Resources
● Troubleshooting Guide for HotSpot VM (Chapter 3)
● Using Java Flight Recorder (JavaOne ’15)
● Java Mission Control for Earthlings (Devoxx France ’15)
● Running Java Flight Recorder