Memory leaks are not always simple or easy to find. Heap dumps from production systems are often gigantic (4+ gigs) with millions of objects in memory. Simple spot checking with traditional tools is woefully inadequate in these situations, especially with real data. Leaks can be entire object graphs with enormous amounts of noise. This session will show you how to build custom tools using the Apache NetBeans Profiler/Heapwalker APIs. Using these APIs, you can read and analyze Java heaps programmatically to ask really hard questions. This gives you the power to analyze complex object graphs with tens of thousands of objects in seconds.
2. Java Heap Review
• Java objects are stored in the heap
• All objects are globally reachable in the heap
• Heap is created when an application starts
• Size of heap is configured using -Xms (initial) and -Xmx (maximum)
• Garbage collection prunes the heap and removes objects
no longer reachable
• Stack memory holds local variable values while their
methods are executing
Heap Contains Everything and can be DUMPED to DISK
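The limits set with -Xms and -Xmx can be read back from inside a running JVM. A minimal sketch using only java.lang.Runtime; the class name HeapLimits is an illustrative choice and not part of the talk.

```java
// Illustrative sketch: HeapLimits is a hypothetical helper name.
final class HeapLimits {
    /** Upper bound the heap may grow to (governed by -Xmx), in bytes. */
    static long maxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently reserved by the JVM (starts near -Xms), in bytes. */
    static long committedBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    /** Part of the committed heap occupied by objects, including garbage not yet collected. */
    static long usedBytes() {
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        System.out.printf("max=%,d committed=%,d used=%,d%n",
                maxBytes(), committedBytes(), usedBytes());
    }
}
```

Running it twice with different -Xmx values (for example -Xmx256m and -Xmx1g) should show the max figure change accordingly.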
3. Why Analyze Heaps?
• IT reports Java EE/Spring server memory footprint has
grown to 9 gigs
• Server app logs contain OutOfMemoryError entries
• Connections to the message queue or database are exhausted
• Serialized Java objects in queue are unreasonably large
• Desktop application becomes unresponsive
• Excessive amount of garbage collection
9. Heap Dump Panic
Too much data!
• Impossible to comprehend
• No human way to explore the data
• Application data model is too
complicated
10. Real Memory Leaks
[Diagram: object graph of Bank Account 1231209, Owner Bob, Owner Julie, Report January 2018, and Bank Account 1231210, alongside a second Bank Account 1231209 with its own Report January 2018 and Owner Bob]
Challenge:
Data looks good
everywhere…
11. Real Memory Leaks
Causes:
• Faulty clone methods
• Duplicate singletons
• Accidentally cached data
• Cache logic bugs
Complications
• May NOT GROW over time (leaks get cleaned up)
• More than one non-trivial memory leak
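One symptom of a faulty clone or a duplicate singleton is the same logical entity living in more than one object. A minimal sketch of that check, assuming the instances have already been collected from somewhere; BankAccount and DuplicateFinder are hypothetical names used only for illustration.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.IdentityHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class DuplicateFinder {
    /** Hypothetical domain object standing in for a class found in the heap. */
    static final class BankAccount {
        final String number;
        BankAccount(String number) { this.number = number; }
    }

    /** Maps each account number held by more than one distinct object to the object count. */
    static Map<String, Integer> duplicates(Collection<BankAccount> instances) {
        Map<String, Set<BankAccount>> byNumber = new TreeMap<>();
        for (BankAccount account : instances) {
            // Identity-based set: the same object seen twice still counts once.
            byNumber.computeIfAbsent(account.number,
                    k -> Collections.newSetFromMap(new IdentityHashMap<>())).add(account);
        }
        Map<String, Integer> result = new TreeMap<>();
        byNumber.forEach((number, objects) -> {
            if (objects.size() > 1) {
                result.put(number, objects.size());
            }
        });
        return result;
    }
}
```

Each copy looks valid on its own, which is why this kind of leak is hard to spot by browsing individual objects.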
12. What about OQL?
• OQL
• Object Query Language – used for querying heaps
• SQL-like language
• Supports JavaScript expressions
• Supported in NetBeans and VisualVM
• Downside
• Poorly documented and hard to use
• Easy to create runaway queries
14. NetBeans Profiler
• NetBeans is an open source IDE/platform
• Modular architecture
• Clean code base
Profiler GUI
Profiler API
15. NetBeans Profiler API
• Parses hprof files
• Creates an object model representing the hprof file
• Pages data in from disk
• Simple API (master in about 10 minutes)
• Independent of NetBeans
• Can be extracted and used in any IDE – Plain old Java
Talk is really about how to build a custom heap analysis tool:
• To answer specific data model questions
• With custom logic for your data model
20. Heap Dump Warning
Dumping the heap:
• Takes time
• Consumes disk space
• Negatively affects performance
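On HotSpot-based JVMs a dump can be triggered from code through the HotSpotDiagnosticMXBean. A minimal sketch, assuming a HotSpot JVM and a writable target directory; HeapDumper and the file naming scheme are illustrative choices.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.management.ManagementFactory;
import java.nio.file.Path;

final class HeapDumper {
    /**
     * Writes an hprof snapshot of the current JVM into dir and returns the file.
     * liveOnly = true dumps only reachable objects, which keeps the file smaller.
     */
    static Path dump(Path dir, boolean liveOnly) {
        // The target file must not exist yet, so use a unique name.
        Path file = dir.resolve("snapshot-" + System.nanoTime() + ".hprof");
        HotSpotDiagnosticMXBean bean =
                ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
        try {
            bean.dumpHeap(file.toString(), liveOnly);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return file;
    }
}
```

The call blocks until the dump is written, so the warnings above apply to it in full.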
21. Targeted Heap Dumps
• Serialize object graphs from application to a file.
• Read the serialized data into another tool and then
programmatically create a heap dump.
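The first half of this approach can be sketched with standard Java serialization. This assumes the object graph of interest implements Serializable; GraphSnapshot is a hypothetical helper name.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

final class GraphSnapshot {
    /** Serializes everything reachable from root into file. */
    static void write(Serializable root, Path file) {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new BufferedOutputStream(Files.newOutputStream(file)))) {
            out.writeObject(root);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the graph back; shared references inside the graph stay shared. */
    static Object read(Path file) {
        try (ObjectInputStream in = new ObjectInputStream(
                new BufferedInputStream(Files.newInputStream(file)))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

A separate small tool can call read() and then dump its own, much smaller, heap. Unlike heap analysis, that tool does need the application's classes on its classpath to deserialize the graph.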
27. Which Approach?
• Copying sources is easiest
• Most analysis apps are command line (one-offs)
- Note -
You don’t need the classpath of the application from which
the heap was generated.
32. Heap Summary
• getTotalLiveInstances() : long
• getTime() : long
• getTotalAllocatedBytes() : long
• getTotalAllocatedInstances() : long
• getTotalLiveBytes() : long
34. GC Roots
• Garbage Collection Root is an object that is accessible from
outside the heap.
• Objects that aren’t accessible from a GC Root are garbage
collected
• GC root categorization:
• Class loaded by system class loader
• Thread
• Stack Local
• Monitor
• JNI Reference
• Held by JVM
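The reachability rule can be illustrated on a toy graph. This is a sketch of the marking idea only, not how any particular collector is implemented; Node and Reachability are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;

final class Reachability {
    /** Hypothetical node type; in a real dump this would be a heap instance. */
    static final class Node {
        final String name;
        final List<Node> refs = new ArrayList<>();
        Node(String name) { this.name = name; }
    }

    /** Returns every node reachable from the given roots; anything else is garbage. */
    static Set<Node> reachableFrom(Collection<Node> gcRoots) {
        Set<Node> marked = Collections.newSetFromMap(new IdentityHashMap<>());
        Deque<Node> pending = new ArrayDeque<>(gcRoots);
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            // add() returns false for nodes already marked, which also breaks cycles.
            if (marked.add(node)) {
                pending.addAll(node.refs);
            }
        }
        return marked;
    }
}
```

Note that two objects referencing each other are still garbage if no root leads to them.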
38. Finding Classes
• Can perform lookup using:
• Fully qualified class name (ex. java.lang.String)
• Class ID
• Instance ID
• IDs are unique to heap dump
• Hash codes are not available!
41. Instances
From an instance:
• Who references the instance
• Who does the instance
reference
Perform instanceof to find out:
• ObjectArray
• PrimitiveArray
Finding the nearest GC root can take forever…
42. Values
If you ask an instance for its
references, you get a list of Value
objects.
67. Best Practices
• Be mindful of your heap
• Cache analysis on disk when processing large heaps
• Heap processing is I/O bound
• Not all profiler calls are the same
• Look for Javadoc: Speed: normal
• Maintain a list of processed objects
• Easy to run in circles
• Exclude JVM internal classes from analysis
• Revisit graph algorithms!
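Two of these practices, tracking processed objects and excluding JVM internals, can be sketched on a toy heap keyed by object ID. Obj and GraphWalk are hypothetical stand-ins, and the package prefixes treated as JVM internal are an assumption to adjust per application.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

final class GraphWalk {
    /** Hypothetical stand-in for a heap instance: ID, class name, outgoing reference IDs. */
    static final class Obj {
        final long id;
        final String className;
        final long[] refs;
        Obj(long id, String className, long... refs) {
            this.id = id;
            this.className = className;
            this.refs = refs;
        }
    }

    /** Assumed prefixes for JVM internal classes. */
    static boolean isJvmInternal(String className) {
        return className.startsWith("java.") || className.startsWith("javax.")
                || className.startsWith("jdk.") || className.startsWith("sun.");
    }

    /** Counts application instances per class reachable from startId, visiting each ID once. */
    static Map<String, Integer> countAppInstances(Map<Long, Obj> heap, long startId) {
        Set<Long> processed = new HashSet<>();
        Deque<Long> pending = new ArrayDeque<>();
        pending.push(startId);
        Map<String, Integer> counts = new TreeMap<>();
        while (!pending.isEmpty()) {
            long id = pending.pop();
            if (!processed.add(id)) {
                continue; // already handled; without this check a cycle never ends
            }
            Obj obj = heap.get(id);
            if (obj == null) {
                continue; // dangling reference in the toy heap
            }
            // JVM classes are walked through (collections hold app objects) but not counted.
            if (!isJvmInternal(obj.className)) {
                counts.merge(obj.className, 1, Integer::sum);
            }
            for (long ref : obj.refs) {
                pending.push(ref);
            }
        }
        return counts;
    }
}
```

An explicit stack is used instead of recursion because deep object graphs can overflow the call stack.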
68. Summary
• Heap snapshot can be easily explored
• Excellent way to verify application logic
• Only way to identify deep data model/logic errors
• Can be used to recover data
• Generate a heap snapshot from a frozen/corrupted application and
then mine it