
Java Memory Analysis: Problems and Solutions

May 2, 2017




  1. Java Memory Analysis: Problems and Solutions
  2. How Java apps (ab)use memory
  3. First, make sure that at the high level your data is represented efficiently • Data doesn’t sit in memory for longer than needed • No unnecessary duplicate data structures
 - E.g. don’t keep the same objects in both a List and a Set • Data structures are appropriate
 - E.g. don’t use ConcurrentHashMap when there is no concurrency • Data format is appropriate
 - E.g. don’t use Strings for int/double numbers (see the sketch below)
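A hedged illustration of the last point (names like loadReadings() are hypothetical): parsing numeric input into primitives once avoids keeping one String object per value.

    // Wasteful: one String object (header + char[]) per numeric value
    List<String> rawReadings = loadReadings();

    // Better: parse once, store as a flat primitive array
    double[] readings = new double[rawReadings.size()];
    for (int i = 0; i < rawReadings.size(); i++) {
        readings[i] = Double.parseDouble(rawReadings.get(i));
    }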

  4. Main sources of memory waste (from bottom to top level) • JVM internal object implementation • Inefficient common data structures
 - Collections
 - Boxed numbers • Data duplication - often the biggest overhead • Memory leaks
  5. Internal Object Format in HotSpot JVM
  6. Internal Object Format: Alignment • To enable 4-byte pointers (compressedOops) with >4G heap, objects are 8-byte aligned • Thus, for example:
 - java.lang.Integer effective size is 16 bytes
 (12b header + 4b int)
 - java.lang.Long effective size is 24 bytes - not 20!
 (12b header + 8b long + 4b padding)
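These numbers are easy to verify with the OpenJDK JOL library; a minimal sketch, assuming jol-core is on the classpath:

    import org.openjdk.jol.info.ClassLayout;

    public class LayoutDemo {
        public static void main(String[] args) {
            // Prints header size, field offsets and alignment padding
            System.out.println(ClassLayout.parseClass(Integer.class).toPrintable());
            System.out.println(ClassLayout.parseClass(Long.class).toPrintable());
        }
    }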
  7. Summary: small objects are bad • A small object’s overhead can be up to 400% of its payload • There are apps with up to 40% of the heap wasted due to this • See if you can change your code to “consolidate” objects or put their contents into flat arrays (see the sketch below) • Avoid heap size > 32G! (really ~30G; compressed pointers stop working above ~32G)
 - Unless your data is mostly int[], byte[] etc.
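A hedged sketch of the “flat arrays” advice above (class and field names hypothetical): replace many small two-field objects with parallel primitive arrays, paying the header cost once per array instead of once per object.

    // Before: ~1M objects, each with a 12-16b header plus alignment padding
    // class Point { int x; int y; }  ->  Point[] points;

    // After: two flat arrays - headers paid once, data tightly packed
    int[] xs = new int[1_000_000];
    int[] ys = new int[1_000_000];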
  8. Common Collections • JDK: java.util.ArrayList, java.util.HashMap, java.util.concurrent.ConcurrentHashMap etc. • Third-party - mainly Google: com.google.common.collect.* • Scala has its own equivalents of the JDK collections • JDK collections are nothing magical
 - Written in Java, easy to load and read in an IDE
  9. ArrayList Internals
  10. HashMap Internals
  11. Memory Footprint of JDK Collections • The JDK pays little attention to memory footprint
 - Just some optimizations for empty ArrayLists and HashMaps
 - ConcurrentHashMap and some Google collections are the worst “memory hogs” • Memory is wasted due to:
 - Default size of the internal array (10 for ArrayList, 16 for HashMap) being too high for small collections; the array never shrinks after initialization
 - $Entry objects used by all Maps take at least 32b each! 
 - Sets just reuse Map structure, no footprint optimization
  12. Boxed numbers - related to collections • java.lang.Integer, java.lang.Double etc. • Were introduced mainly to avoid creating specialized classes like IntToObjectHashMap • However, they have proven extremely wasteful:
 - Single int takes 4b. java.lang.Integer effective size is 16b (12b header + 4b int), plus 4b pointer to it
 - Single long takes 8b. java.lang.Long effective size is 24b (12b header + 8b long + 4b padding), plus 4b pointer to it
  13. JDK Collections: Summary • Initialized, but empty collections waste memory • Things like HashMap<Object, Integer> are bad • HashMap$Entry etc. may take up to 30% of memory • Some third-party libraries provide alternatives
 - In particular, fastutil.di.unimi.it (University of Milan, Italy)
 - Has Object2IntOpenHashMap, Long2ObjectOpenHashMap, Int2DoubleOpenHashMap, etc. - no boxed numbers
 - Has Object2ObjectOpenHashMap: no $Entry objects
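A minimal sketch of the fastutil alternative, assuming the fastutil jar is on the classpath:

    import it.unimi.dsi.fastutil.objects.Object2IntOpenHashMap;

    Object2IntOpenHashMap<String> counts = new Object2IntOpenHashMap<>();
    counts.put("requests", 42);        // the int is stored directly - no Integer box
    int n = counts.getInt("requests"); // no unboxing, and no $Entry objects inside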
  14. Data Duplication • Can happen for many reasons:
 - s = s1 + s2 or s = s.toUpperCase() etc. always generates a new String object
 - intObj = new Integer(intScalar) always generates a new Integer object
 - Duplicate byte[] buffers in I/O, serialization, etc. • Very hard to detect without tooling
 - Small amount of duplication is inevitable
 - 20-40% waste is not uncommon in unoptimized apps • Duplicate Strings are most common and easy to fix
  15. Dealing with String duplication • Use tooling to determine where dup strings are either
 - generated, e.g. s = s.toUpperCase();
 - permanently attached, e.g. this.name = name; • Use String.intern() to de-duplicate
 - Uses a JVM-internal fast, scalable canonicalization hashtable
 - Table is fixed and preallocated - no extra memory overhead 
 - Small CPU overhead is normally offset by reduced GC time and improved cache locality • s = s.toUpperCase().intern();
 this.name = name.intern(); …
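(One practical note: on HotSpot JVMs the interned-string table’s bucket count can be tuned with the -XX:StringTableSize=<N> flag, which may help if many millions of strings end up interned.)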
  16. Other duplicate data • Can be almost anything. Examples:
 - Timestamp objects
 - Partitions (with HashMaps and ArrayLists) in Apache Hive
 - Various byte[], char[] etc. data buffers everywhere • No convenient tooling so far for automatic detection of arbitrary duplicate objects • But one can often guess correctly
 - Just look at the classes that take the most memory…
  17. Dealing with non-string duplicates • Use WeakHashMap to store canonicalized objects
 - com.google.common.collect.Interner wraps a (Weak)HashMap • For big data structures, interning may cause some CPU performance impact
 - Interning calls hashCode() and equals()
 - GC time reduction would likely offset this • If duplicate objects are mutable, like HashMap…
 - May need CopyOnFirstChangeHashMap, etc.
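A minimal sketch of the Guava Interner mentioned above (the value class MyTimestamp and the field are hypothetical; whatever is interned must have proper equals()/hashCode()):

    import com.google.common.collect.Interner;
    import com.google.common.collect.Interners;

    // Weak interner: canonical copies get GC'ed once nothing else references them
    private static final Interner<MyTimestamp> INTERNER = Interners.newWeakInterner();

    void setTimestamp(MyTimestamp ts) {
        this.timestamp = INTERNER.intern(ts); // returns the canonical instance
    }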
  18. Duplicate Data: Summary • Duplicate data may cause huge memory waste
 - Observed up to 40% overhead in unoptimized apps • Duplicate Strings are easy to
 - Detect (but need tooling to analyze a heap dump)
 - Get rid of - just use String.intern() • Other kinds of duplicate data more difficult to find
 - But it’s worth the effort!
 - Mutable duplicate data is more difficult to deal with
  19. Memory Leaks • Unlike C++, Java doesn’t have real leaks; what it has instead:
 - Data that’s no longer used, but never released
 - Too much persistent data cached in memory • No reliable way to distinguish leaked data…
 - But any data structure that just keeps growing is bad • So, just pay attention to the biggest (and growing) data structures
 - Heap dump: see which GC root(s) hold most memory
 - Runtime profiling can be more accurate, but more expensive
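A hedged sketch of the most common Java “leak” shape (class and method names hypothetical): a static, ever-growing cache. Every entry is still reachable, so the GC cannot help, yet the effect is the same as a leak.

    import java.util.HashMap;
    import java.util.Map;

    class QueryCache {
        private static final Map<String, Object> CACHE = new HashMap<>();

        static Object lookup(String sql) {
            // Entries are never evicted, so this map grows without bound
            return CACHE.computeIfAbsent(sql, QueryCache::runQuery);
        }

        private static Object runQuery(String sql) { return new Object(); } // stub
    }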
  20. JXRay Memory Analysis Tool
  21. What is it? • Offline heap analysis tool
 - Runs once on a given heap dump, produces a text report • Simple command-line interface: 
 - Just one jar + .sh script
 - No complex installation
 - Can run anywhere (laptop or remote headless machine)
 - Needs JDK 8 • See http://www.jxray.com for more info
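The dumps such a tool consumes can be produced with the standard JDK jmap utility, e.g.:

    jmap -dump:live,format=b,file=heap.hprof <pid>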
  22. JXRay: main features • Shows you what occupies the heap
 - Object histogram: which objects take most memory
 - Reference chains: which GC roots/data structures keep biggest object “lumps” in memory • Shows you where memory is wasted
 - Object headers
 - Duplicate Strings
 - Bad collections (empty; 1-element; small (2-4 element))
 - Bad object arrays (empty (all nulls); length 0 or 1; 1-element)
 - Boxed numbers
 - Duplicate primitive arrays (e.g. byte[] buffers)
  23. Keeping results succinct • No GUI - generates a plain text report
 - Easy to save and exchange
 - Small: ~50K regardless of the dump size
 - Details a given problem only once its overhead is above a threshold (by default 0.1% of used heap) • Knows about internals of most standard collections
 - More compact/informative representation • Aggregates reference chains from GC roots to problematic objects
  24. Reference chain aggregation: assumptions • A problem is important if many objects have it
 - E.g. 1000s/1,000,000s of duplicate strings • Usually there are not too many places in the code responsible for such a problem
 - Foo(String s) {
 this.s = s.toUpperCase(); …
 }
 - Bar(String s1, String s2) {
 this.s = s1 + s2; …
 }
  25. Reference chain aggregation: what is it • In the heap, we may have e.g.
 Baz.stat1 -> HashMap@243 -> ArrayList@650 -> Foo.s = “xyz”
 Baz.stat2 -> LinkedList@798 -> HashSet@134 -> Bar.s = “0”
 Baz.stat1 -> HashMap@529 -> ArrayList@351 -> Foo.s = “abc”
 Baz.stat2 -> LinkedList@284 -> HashSet@960 -> Bar.s = “1”
 … 1000s more chains like this • JXRay aggregates them all into just two lines:
 Baz.stat1 -> {HashMap} -> {ArrayList} -> Foo.s (“abc”,”xyz” and
 3567 more dup strings)
 Baz.stat2 -> {LinkedList} -> {HashSet} -> Bar.s (“0”, “1” and …)
  26. Treating collections specially • Object histogram: standard vs JXRay view
 HashMap$Entry 21500 objs 430K
 HashMap$Entry[] 3200 objs 180K
 HashMap 3200 objs 150K 
 vs
 {HashMap} 3200 objs 760K • Reference chains:
 Foo <- HashMap$Entry.value <- HashMap$Entry[] <-
 <- HashMap <- Object[] <- ArrayList <- rootX
 vs
 Foo <- {HashMap.values} <- {ArrayList} <- rootX
  27. Bad collections • Empty: no elements at all
 - Is it used at all? If yes, allocate lazily (see the sketch after this slide). • 1-element
 - Always has only 1 element - replace with object
 - Almost always has 1 element - solution more complex. Switch between Object and collection/array lazily. • Small: 2..4 elements
 - Consider smaller initial capacity
 - Consider replacing with a plain array
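A minimal sketch of the lazy-allocation advice above (class and field names hypothetical):

    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;

    class Node {
        private List<Node> children; // stays null for the many leaf nodes

        void addChild(Node c) {
            if (children == null) {
                children = new ArrayList<>(4); // small initial capacity
            }
            children.add(c);
        }

        List<Node> getChildren() {
            return children == null ? Collections.emptyList() : children;
        }
    }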
  28. Bad object arrays • Empty: only nulls
 - Same as empty collections - delete or allocate lazily • Length 0
 - Replace with a singleton zero-length array • Length 1
 - Replace with an object? • Single non-null element
 - Replace with an object? Reduce length?
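For the length-0 case, a shared singleton is a one-line fix (names hypothetical):

    private static final String[] NO_NAMES = new String[0];

    String[] namesOrEmpty(List<String> names) {
        return names.isEmpty() ? NO_NAMES : names.toArray(new String[0]);
    }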
  29. Memory Analysis and Reducing Footprint: concrete cases
  30. A Monitoring app • Scalability wasn’t great
 - Some users had to increase -Xmx again and again.
 - Unclear how to choose the correct size • Big heap -> long full GC pauses -> frozen UI • Some OOMs in small clusters
 - Not a scale problem - a bug?
  31. Investigation, part 1 • Started with the smaller dumps with OOMs
 - Immediately found duplicate strings
 - One string repeated 1000s of times used 90% of the heap
 - Long SQL query saved in the DB many times, then retrieved
 - Adding two String.intern() calls solved the problem… almost • Duplicate byte[] buffers in 3rd-party library code
 - That still caused noticeable overhead
 - Ended up limiting saved query size at high level
 - Library/auto-gen code may be difficult to change…
  32. Investigation, part 2 • Next, looked into heap dumps with scalability problems
 - Both real and artificial benchmark setup • Found all the usual issues
 - String duplication
 - Empty or small (1-4 elements) collections
 - Tons of small objects (object headers used 31% of heap!)
 - Boxed numbers
  33. Standard solutions applied • Duplicate strings: add more String.intern() calls
 - Easy: check jxray report, find what data structures reference bad strings, edit code
 - Non-trivial when a String object is mostly managed by auto-generated code • Bad collections: less trivial
 - Sometimes it’s enough to replace new HashMap() with new HashMap(expectedSize)
 - Found ArrayLists that almost always have size 0/1
  34. Dealing with mostly 0/1-size ArrayLists • Replaced ArrayList list; -> Object valueOrArray; • Depending on the situation, valueOrArray may
 - be null
 - point to a single object (element)
 - point to an array of objects (elements) • ~70 LOC hand-written for this
 - But memory savings were worth the effort
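A hedged sketch of what those ~70 lines might look like (not the app’s actual code):

    import java.util.Arrays;

    class SmallBag {
        // null = empty, Object[] = many elements, anything else = one element.
        // Caveat: this encoding assumes elements are never Object[] themselves.
        private Object valueOrArray;

        void add(Object e) {
            if (valueOrArray == null) {
                valueOrArray = e;                                // first element
            } else if (valueOrArray instanceof Object[]) {
                Object[] a = (Object[]) valueOrArray;
                Object[] b = Arrays.copyOf(a, a.length + 1);
                b[a.length] = e;
                valueOrArray = b;                                // grow the array
            } else {
                valueOrArray = new Object[] { valueOrArray, e }; // second element
            }
        }

        int size() {
            if (valueOrArray == null) return 0;
            return valueOrArray instanceof Object[] ? ((Object[]) valueOrArray).length : 1;
        }
    }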
  35. Dealing with non-string duplicate data • Heap contained a lot of small objects
 class TimestampAndData {
 long timestamp;
 long value; 
 … } • Guessed that there may be many duplicates
 - E.g. many values are just 0/1 • Added a simple canonicalization cache. Result:
 - 8x fewer TimestampAndData objects
 - 16% memory savings
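A minimal sketch of such a cache, assuming TimestampAndData implements equals() and hashCode() over its fields:

    import java.util.HashMap;
    import java.util.Map;

    private final Map<TimestampAndData, TimestampAndData> canon = new HashMap<>();

    TimestampAndData canonicalize(TimestampAndData v) {
        TimestampAndData existing = canon.putIfAbsent(v, v);
        return existing != null ? existing : v; // reuse the first copy seen
    }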
  36. A Monitoring app: conclusions • Fixing string/other data duplication, boxed numbers, small/empty collections: together saved ~50%
 - Depends on the workload
 - Scalability improved: more data - higher savings • Can still save more - replace standard HashMaps with more memory-friendly maps
 - HashMap$Entry objects may take a lot of memory!
  37. Apache Hive: Hive Server 2 (HS2) • HS2 may run out of memory • Most scenarios involve 1000s of partitions and 10s of concurrent queries • Not many heap dumps from real users • Create a benchmark which reproduces the problem, measure where memory goes, optimize
  38. Experimental setup • Created a Hive table with 2000 small partitions • Running 50 concurrent queries like “select count(myfield_1) from mytable;” crashes an HS2 server with -Xmx500m • More partitions or concurrent queries - more memory needed
  39. HS2: Investigation • Looked into the heap dump generated after OOM • Not too many different problems:
 - Duplicate strings: 23%
 - java.util.Properties objects take 20% of memory
 - Various bad collections: 18% • Apparently, many Properties objects are duplicates
 - A separate copy per partition per query
 - For a read-only partition, all per-query copies are identical
  40. HS2: Fixing duplicate strings • Some String.intern() calls added • Some strings come from HDFS code
 - Need separate changes in Hadoop code • Most interesting: String fields of java.net.URI
 - private fields initialized internally - no access
 - But still can read/write using Java Reflection
 - Wrote StringInternUtils.internStringsInURI(URI) method
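A hedged sketch of that reflection trick (field names such as "host" or "path" match the JDK 8 implementation of java.net.URI, but they are implementation-dependent and may break on newer JDKs):

    import java.lang.reflect.Field;
    import java.net.URI;

    static void internUriField(URI uri, String fieldName) throws Exception {
        Field f = URI.class.getDeclaredField(fieldName);
        f.setAccessible(true);
        String s = (String) f.get(uri);
        if (s != null) {
            f.set(uri, s.intern());
        }
    }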
  41. HS2: Fixing duplicate java.util.Properties objects • Main problem: a Properties object is mutable
 - All PartitionDesc objects representing the same partition cannot simply use one “canonicalized” Properties object
 - If one is changed, others should not! • Had to implement a new class
 class CopyOnFirstWriteProperties extends Properties {
 Properties interned; // Used until/unless a mutator called
 // Inherited table is filled and used after first mutation
 …
 }
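A hedged sketch of the copy-on-first-write idea (the real Hive class overrides every mutator; only put() is shown here):

    import java.util.Properties;

    class CopyOnFirstWriteProperties extends Properties {
        private Properties interned; // shared canonical copy; dropped on first write

        CopyOnFirstWriteProperties(Properties canonical) { this.interned = canonical; }

        @Override
        public synchronized Object put(Object key, Object value) {
            if (interned != null) {
                super.putAll(interned); // copy shared state into our own table
                interned = null;        // from now on this instance owns its data
            }
            return super.put(key, value);
        }

        @Override
        public String getProperty(String key) {
            return interned != null ? interned.getProperty(key) : super.getProperty(key);
        }
    }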
  42. HS2: Improvements based on simple read-only benchmark • Fixing duplicate strings and properties together saved ~37% of memory • Another ~5% can be saved by de-duplicating strings in HDFS • Another ~10% can be saved by dealing with bad collections
  43. Investigating/fixing concrete apps: conclusions • Any app can develop memory problems over time
 - Check and optimize periodically • Many such problems are easy enough to fix
 - Intern strings, initialize collections lazily, etc. • Duplication other than strings is frequent
 - More difficult to fix, but may be well worth the effort
 - Need to improve tooling to detect it automatically