We explain various kinds of bad memory utilization patterns in Java applications, present a tool to efficiently detect them, and give a number of common solutions to these problems.
Assume that, at a high level, your data is represented efficiently
• Data doesn’t sit in memory for longer than needed
• No unnecessary duplicate data structures
- E.g. don't keep the same objects in both a List and a Set
• Data structures are appropriate
- E.g. don’t use ConcurrentHashMap when no concurrency
• Data format is appropriate
- E.g. don’t use Strings for int/double numbers
Main sources of memory waste
(from bottom to top level)
• JVM internal object implementation
• Inefficient common data structures
- Collections
- Boxed numbers
• Data duplication - often biggest overhead
• Memory leaks
Internal Object Format: Alignment
• To enable 4-byte pointers (compressedOops) with
>4G heap, objects are 8-byte aligned
• Thus, for example:
- java.lang.Integer effective size is 16 bytes
(12b header + 4b int)
- java.lang.Long effective size is 24 bytes - not 20!
(12b header + 8b long + 4b padding)
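These sizes can be checked empirically with the OpenJDK JOL library (org.openjdk.jol); a minimal sketch, assuming the jol-core jar is on the classpath:

import org.openjdk.jol.info.ClassLayout;

public class LayoutCheck {
    // Prints the field layout, padding and total instance size of a class.
    public static void main(String[] args) {
        System.out.println(ClassLayout.parseClass(Integer.class).toPrintable());
        System.out.println(ClassLayout.parseClass(Long.class).toPrintable());
    }
}

On a 64-bit JVM with compressed oops, the report for java.lang.Long should show the 12-byte header, the 8-byte long field and 4 bytes of alignment padding, i.e. 24 bytes total.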
Summary: small objects are bad
• A small object's overhead can be up to 400% of its
payload (the actual data)
• There are apps with up to 40% of the heap wasted
due to this
• See if you can change your code to "consolidate"
objects or put their contents into flat arrays (see the sketch after this list)
• Avoid heap size > 32G! (really ~30G)
- Unless your data is mostly int[], byte[] etc.
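A minimal sketch of the flat-array idea (class names are ours):

// Before: one object per point. Each instance pays a 12-16 byte header plus
// alignment, and the array holds a 4-byte pointer per element - most of the
// memory is overhead, not data.
class Point {
    int x, y;
}

// After: the same data "consolidated" into parallel primitive arrays;
// point i is (xs[i], ys[i]). No per-point headers or pointers.
class PointCloud {
    final int[] xs, ys;
    PointCloud(int capacity) { xs = new int[capacity]; ys = new int[capacity]; }
    int x(int i) { return xs[i]; }
    int y(int i) { return ys[i]; }
}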
Common Collections
• JDK: java.util.ArrayList, java.util.HashMap,
java.util.concurrent.ConcurrentHashMap etc.
• Third-party - mainly Google:
com.google.common.collect.*
• Scala has its own equivalent of JDK collections
• JDK collections are nothing magical
- Written in Java, easy to load and read in IDE
Memory Footprint of JDK Collections
• The JDK pays little attention to memory footprint
- Just some optimizations for empty ArrayLists and HashMaps
- ConcurrentHashMap and some Google collections are the
worst “memory hogs”
• Memory is wasted due to:
- Default size of the internal array (10 for ArrayList, 16 for HashMap) is too
high for small collections, and it never shrinks after initialization.
- $Entry objects used by all Maps take at least 32b each!
- Sets just reuse Map structure, no footprint optimization
Boxed numbers - related to collections
• java.lang.Integer, java.lang.Double etc.
• Were introduced mainly to avoid creating
specialized classes like IntToObjectHashMap
• However, they have proven to be extremely wasteful:
- Single int takes 4b. java.lang.Integer effective size is 16b
(12b header + 4b int), plus 4b pointer to it
- Single long takes 8b. java.lang.Long effective size is 24b
(12b header + 8b long + 4b padding), plus 4b pointer to it
JDK Collections: Summary
• Initialized but empty collections waste memory
• Things like HashMap<Object, Integer> are bad
• HashMap$Entry etc. may take up to 30% of memory
• Some third-party libraries provide alternatives
- In particular, fastutil.di.unimi.it (University of Milan, Italy)
- Has Object2IntHashMap, Long2ObjectHashMap,
Int2DoubleHashMap, etc. - no boxed numbers
- Has Object2ObjectOpenHashMap : no $Entry objects
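For example, counting occurrences per key with a plain HashMap boxes every count and creates an entry object per mapping; a hedged sketch of the fastutil alternative, assuming it.unimi.dsi.fastutil is on the classpath:

import java.util.HashMap;
import java.util.Map;
import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;

class WordCounts {
    // JDK version: each mapping costs a HashMap$Node plus a boxed Integer.
    static Map<String, Integer> jdkCount(String[] words) {
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) counts.merge(w, 1, Integer::sum);
        return counts;
    }

    // fastutil version: open-addressing table with plain int values -
    // no per-entry objects and no boxing.
    static Object2IntOpenHashMap<String> fastutilCount(String[] words) {
        Object2IntOpenHashMap<String> counts = new Object2IntOpenHashMap<>();
        for (String w : words) counts.addTo(w, 1);
        return counts;
    }
}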
Data Duplication
• Can happen for many reasons:
- Operations like s = s1 + s2 or s = s.toUpperCase()
always generate a new String object
- intObj = new Integer(intScalar) always generates
a new Integer object
- Duplicate byte[] buffers in I/O, serialization, etc.
• Very hard to detect without tooling
- Small amount of duplication is inevitable
- 20-40% waste is not uncommon in unoptimized apps
• Duplicate Strings are most common and easy to fix
Dealing with String duplication
• Use tooling to determine where dup strings are either
- generated, e.g. s = s.toUpperCase();
- permanently attached, e.g. this.name = name;
• Use String.intern() to de-duplicate
- Uses a JVM-internal fast, scalable canonicalization hashtable
- Table is fixed and preallocated - no extra memory overhead
- Small CPU overhead is normally offset by reduced GC time
and improved cache locality
• s = s.toUpperCase().intern();
this.name = name.intern(); …
Other duplicate data
• Can be almost anything. Examples:
- Timestamp objects
- Partitions (with HashMaps and ArrayLists) in Apache Hive
- Various byte[], char[] etc. data buffers everywhere
• So far there is no convenient tooling for automatic
detection of arbitrary duplicate objects
• But one can often guess correctly
- Just look at classes that take most memory…
Dealing with non-string duplicates
• Use a WeakHashMap to store canonicalized objects
- com.google.common.collect.Interner wraps a
(Weak)HashMap - see the sketch after this list
• For big data structures, interning may cause some
CPU performance impact
- Interning calls hashCode() and equals()
- GC time reduction would likely offset this
• If duplicate objects are mutable, like HashMap…
- May need CopyOnFirstChangeHashMap, etc.
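A minimal sketch of interning an immutable value class with Guava's Interner (DateRange is a made-up example; correct equals()/hashCode() are required):

import com.google.common.collect.Interner;
import com.google.common.collect.Interners;

// Hypothetical immutable value class with value-based equals()/hashCode().
final class DateRange {
    final long start, end;
    DateRange(long start, long end) { this.start = start; this.end = end; }
    @Override public boolean equals(Object o) {
        return o instanceof DateRange
            && ((DateRange) o).start == start && ((DateRange) o).end == end;
    }
    @Override public int hashCode() { return Long.hashCode(start * 31 + end); }
}

class Canonicalizer {
    // Weak interner: canonical instances can be GCed once nothing else references them.
    private static final Interner<DateRange> INTERNER = Interners.newWeakInterner();

    // Returns an existing equal instance if one is already interned, otherwise r itself.
    static DateRange canonicalize(DateRange r) { return INTERNER.intern(r); }
}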
Duplicate Data: Summary
• Duplicate data may cause huge memory waste
- Observed up to 40% overhead in unoptimized apps
• Duplicate Strings are easy to
- Detect (but need tooling to analyze a heap dump)
- Get rid of - just use String.intern()
• Other kinds of duplicate data more difficult to find
- But it’s worth the effort!
- Mutable duplicate data is more difficult to deal with
Memory Leaks
• Unlike C++, Java doesn't have real leaks, but has close analogs:
- Data that's not used anymore, yet still referenced and never released
- Too much persistent data cached in memory
• No reliable way to distinguish leaked data…
- But any data structure that just keeps growing is bad
• So, just pay attention to the biggest (and growing)
data structures
- Heap dump: see which GC root(s) hold most memory
- Runtime profiling can be more accurate, but more expensive
JXRay: what is it
• Offline heap analysis tool
- Runs once on a given heap dump, produces a text report
• Simple command-line interface:
- Just one jar + .sh script
- No complex installation
- Can run anywhere (laptop or remote headless machine)
- Needs JDK 8
• See http://www.jxray.com for more info
JXRay: main features
• Shows you what occupies the heap
- Object histogram: which objects take most memory
- Reference chains: which GC roots/data structures keep
biggest object “lumps” in memory
• Shows you where memory is wasted
- Object headers
- Duplicate Strings
- Bad collections (empty; 1-element; small (2-4 element))
- Bad object arrays (empty (all nulls); length 0 or 1; 1-element)
- Boxed numbers
- Duplicate primitive arrays (e.g. byte[] buffers)
Keeping results succinct
• No GUI - generates a plain text report
- Easy to save and exchange
- Small: ~50K regardless of the dump size
- Details a given problem once its overhead is above a
threshold (by default 0.1% of used heap)
• Knows about internals of most standard collections
- More compact/informative representation
• Aggregates reference chains from GC roots to
problematic objects
Reference chain aggregation:
assumptions
• A problem is important if many objects have it
- E.g. 1000s/1,000,000s of duplicate strings
• Usually there are not too many places in the code
responsible for such a problem
- Foo(String s) {
this.s = s.toUpperCase(); …
}
- Bar(String s1, String s2) {
this.s = s1 + s2; …
}
Reference chain aggregation: what is it
• In the heap, we may have e.g.
Baz.stat1 -> HashMap@243 -> ArrayList@650 -> Foo.s = “xyz”
Baz.stat2 -> LinkedList@798 -> HashSet@134 -> Bar.s = “0”
Baz.stat1 -> HashMap@529 -> ArrayList@351 -> Foo.s = “abc”
Baz.stat2 -> LinkedList@284 -> HashSet@960 -> Bar.s = “1”
… 1000s more chains like this
• JXRay aggregates them all into just two lines:
Baz.stat1 -> {HashMap} -> {ArrayList} -> Foo.s (“abc”, “xyz” and
3567 more dup strings)
Baz.stat2 -> {LinkedList} -> {HashSet} -> Bar.s (“0”, “1” and …)
Bad collections
• Empty: no elements at all
- Is it used at all? If yes, allocate lazily.
• 1-element
- Always has only 1 element - replace with object
- Almost always has 1 element - solution more complex.
Switch between Object and collection/array lazily.
• Small: 2..4 elements
- Consider smaller initial capacity
- Consider replacing with a plain array
Bad object arrays
• Empty: only nulls
- Same as empty collections - delete or allocate lazily
• Length 0
- Replace with a singleton zero-length array (see the sketch after this list)
• Length 1
- Replace with an object?
• Single non-null element
- Replace with an object? Reduce length?
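A minimal sketch of the shared zero-length array (the surrounding class and method are hypothetical):

import java.util.List;

class NameHolder {
    // One shared zero-length array instead of "new String[0]" on every call;
    // zero-length arrays are immutable, so sharing them is safe.
    private static final String[] NO_STRINGS = new String[0];

    static String[] namesOrEmpty(List<String> names) {
        return names.isEmpty() ? NO_STRINGS : names.toArray(NO_STRINGS);
    }
}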
A Monitoring app
• Scalability wasn’t great
- Some users had to increase -Xmx again and again.
- Unclear how to choose the correct size
• Big heap -> long full GC pauses -> frozen UI
• Some OOMs in small clusters
- Not a scale problem - a bug?
Investigation, part 1
• Started with the smaller dumps with OOMs
- Immediately found duplicate strings
- One string repeated 1000s of times used 90% of the heap
- Long SQL query saved in DB many times, then retrieved
- Adding two String.intern() calls solved the problem… almost
• Duplicate byte[] buffers in 3rd-party library code
- That still caused noticeable overhead
- Ended up limiting the saved query size at a higher level
- Library/auto-gen code may be difficult to change…
Investigation, part 2
• Next, looked into heap dumps with scalability
problems
- Both real and artificial benchmark setup
• Found all the usual issues
- String duplication
- Empty or small (1-4 elements) collections
- Tons of small objects (object headers used 31% of heap!)
- Boxed numbers
Standard solutions applied
• Duplicate strings: add more String.intern() calls
- Easy: check jxray report, find what data structures reference
bad strings, edit code
- Non-trivial when a String object is mostly managed by auto-
generated code
• Bad collections: less trivial
- Sometimes it's enough to replace new HashMap() with new
HashMap(expectedSize) - see the sizing sketch after this list
- Found ArrayLists that almost always have size 0 or 1
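Note that new HashMap<>(n) sets the initial table capacity, not the expected number of entries; a hedged sizing sketch (Guava's Maps.newHashMapWithExpectedSize() does the equivalent):

import java.util.HashMap;

class Maps2 {   // hypothetical helper class
    // With the default 0.75 load factor, the table is resized once
    // size > capacity * 0.75, so pick a capacity large enough to hold
    // expectedSize entries without resizing (and without the 16-slot default).
    static <K, V> HashMap<K, V> withExpectedSize(int expectedSize) {
        return new HashMap<>((int) (expectedSize / 0.75f) + 1);
    }
}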
Dealing with mostly 0/1-size ArrayLists
• Replaced ArrayList list; -> Object valueOrArray; (minimal sketch after this list)
• Depending on the situation, valueOrArray may
- be null
- point to a single object (element)
- point to an array of objects (elements)
• ~70 LOC hand-written for this
- But memory savings were worth the effort
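A minimal sketch of the idea, much simplified compared to the ~70-line production version (class and method names are ours; it assumes elements are never themselves Object[]):

import java.util.Arrays;

class CompactList {
    private Object valueOrArray;   // null | single element | Object[] of elements

    void add(Object e) {
        if (valueOrArray == null) {
            valueOrArray = e;                                  // 0 -> 1: no array allocated
        } else if (valueOrArray instanceof Object[]) {
            Object[] old = (Object[]) valueOrArray;
            Object[] grown = Arrays.copyOf(old, old.length + 1);
            grown[old.length] = e;
            valueOrArray = grown;
        } else {
            valueOrArray = new Object[] { valueOrArray, e };   // 1 -> 2: switch to array
        }
    }

    int size() {
        if (valueOrArray == null) return 0;
        return valueOrArray instanceof Object[] ? ((Object[]) valueOrArray).length : 1;
    }

    Object get(int i) {
        if (valueOrArray instanceof Object[]) return ((Object[]) valueOrArray)[i];
        if (i == 0 && valueOrArray != null) return valueOrArray;
        throw new IndexOutOfBoundsException(String.valueOf(i));
    }
}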
Dealing with non-string duplicate data
• Heap contained a lot of small objects
class TimestampAndData {
long timestamp;
long value;
… }
• Guessed that there may be many duplicates
- E.g. many values are just 0/1
• Added a simple canonicalization cache (sketch below). Result:
- 8x fewer TimestampAndData objects
- 16% memory savings
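A minimal sketch of such a cache, assuming TimestampAndData is made effectively immutable and gets equals()/hashCode() over both fields:

import java.util.concurrent.ConcurrentHashMap;

class TimestampAndDataCache {
    // Maps each distinct value to its canonical instance. Unbounded here for
    // simplicity; a real cache may need a size limit or weak references.
    private static final ConcurrentHashMap<TimestampAndData, TimestampAndData> CACHE =
            new ConcurrentHashMap<>();

    static TimestampAndData canonicalize(TimestampAndData v) {
        TimestampAndData existing = CACHE.putIfAbsent(v, v);
        return existing != null ? existing : v;
    }
}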
A Monitoring app: conclusions
• Fixing string/other data duplication, boxed nums,
small/empty collections: together saved ~50%
- Depends on the workload
- Scalability improved: more data - higher savings
• Can still save more - replace standard HashMaps
with more memory-friendly maps
- HashMap$Entry objects may take a lot of memory!
Apache Hive: Hive Server 2 (HS2)
• HS2 may run out of memory
• Most scenarios involve 1000s of partitions and 10s
of concurrent queries
• Not many heap dumps from real users
• Create a benchmark which reproduces the
problem, measure where memory goes, optimize
Experimental setup
• Created a Hive table with 2000 small partitions
• Running 50 concurrent queries like “select
count(myfield_1) from mytable;” crashes an HS2
server with -Xmx500m
• More partitions or concurrent queries - more
memory needed
HS2: Investigation
• Looked into the heap dump generated after OOM
• Not too many different problems:
- Duplicate strings: 23%
- java.util.Properties objects take 20% of memory
- Various bad collections: 18%
• Apparently, many Properties objects are duplicates
- A separate copy per partition per query
- For a read-only partition, all per-query copies are identical
HS2: Fixing duplicate strings
• Some String.intern() calls added
• Some strings come from HDFS code
- Need separate changes in Hadoop code
• Most interesting: String fields of java.net.URI
- private fields initialized internally - no access
- But still can read/write using Java Reflection
- Wrote StringInternUtils.internStringsInURI(URI) method
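A hedged sketch of the reflection approach (the exact field set of java.net.URI is a JDK implementation detail, so the real utility targets specific known fields; this sketch simply interns every non-null String field):

import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.net.URI;

class StringInternSketch {
    // Interns every non-null String field of the given URI in place.
    // Relies on reflective access to JDK internals; newer JDKs may require
    // the java.base module to be opened for this to work.
    static URI internStringsInURI(URI uri) {
        for (Field f : URI.class.getDeclaredFields()) {
            if (f.getType() != String.class || Modifier.isStatic(f.getModifiers())) continue;
            try {
                f.setAccessible(true);
                String value = (String) f.get(uri);
                if (value != null) {
                    f.set(uri, value.intern());
                }
            } catch (ReflectiveOperationException e) {
                // Give up quietly - interning is only an optimization.
            }
        }
        return uri;
    }
}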
HS2: Fixing duplicate
java.util.Properties objects
• Main problem: Properties object is mutable
- All PartitionDesc objects representing the same partition
cannot simply use one “canonicalized” Properties object
- If one is changed, others should not!
• Had to implement a new class
class CopyOnFirstWriteProperties extends Properties {
Properties interned; // Used until/unless a mutator is called
// Inherited table is filled and used after first mutation
…
}
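A hedged sketch of the copy-on-first-write idea (heavily simplified; the real Hive class overrides many more read and write methods):

import java.util.Properties;

class CopyOnFirstWriteProperties extends Properties {
    private Properties interned;   // shared canonical copy; null after first mutation

    CopyOnFirstWriteProperties(Properties interned) {
        this.interned = interned;
    }

    // Reads delegate to the shared copy until this instance is mutated.
    @Override public String getProperty(String key) {
        return interned != null ? interned.getProperty(key) : super.getProperty(key);
    }

    // Every mutator must detach first; only put() is shown here.
    @Override public synchronized Object put(Object key, Object value) {
        copyOnWrite();
        return super.put(key, value);
    }

    private synchronized void copyOnWrite() {
        if (interned != null) {
            super.putAll(interned);   // fill our own (inherited) table once
            interned = null;          // stop delegating reads
        }
    }
}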
HS2: Improvements based on simple
read-only benchmark
• Fixing duplicate strings and properties together
saved ~37% of memory
• Another ~5% can be saved by deduplicating
strings coming from HDFS code
• Another ~10% can be saved by dealing with bad
collections
Investigating/fixing concrete apps:
conclusions
• Any app can develop memory problems over time
- Check and optimize periodically
• Many such problems are easy enough to fix
- Intern strings, initialize collections lazily, etc.
• Duplication other than strings is frequent
- More difficult to fix, but may be well worth the effort
- Need to improve tooling to detect it automatically