Java Garbage Collection: 
A Performance Impact 
Blynov Viacheslav 
Lead Software Developer 
24 November 2014 
www.luxoft.com
Introduction 
Agenda 
 Garbage Collection Overview 
 Java GC Algorithms 
 Basic GC Tuning 
GC Purpose 
Objects that are referenced are said to be live 
Objects that are no longer referenced are considered dead and termed garbage 
The garbage collector is responsible for: 
 allocating memory 
 ensuring that any referenced objects remain in memory 
 recovering memory used by objects that are no longer reachable from references in 
executing code 
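The distinction between live and dead objects can be observed with a weak reference, which does not keep its target alive. The sketch below is only an illustration: the class and method names are hypothetical, and `System.gc()` is merely a request, so the second printed value may differ between JVMs and runs.

```java
import java.lang.ref.WeakReference;

class ReachabilityDemo {
    /** Returns true while the object behind the weak reference is still in memory. */
    static boolean isStillInMemory(WeakReference<?> ref) {
        return ref.get() != null;
    }

    public static void main(String[] args) {
        Object payload = new byte[1024 * 1024];               // strongly referenced: live
        WeakReference<Object> probe = new WeakReference<>(payload);
        System.out.println("in memory while referenced: " + isStillInMemory(probe));

        payload = null;   // no strong reference remains: the array is now garbage
        System.gc();      // a request only; the JVM is free to ignore it
        System.out.println("in memory after GC request: " + isStillInMemory(probe));
    }
}
```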
GC performance impact 
 The collector needs computational resources (CPU cycles) to perform garbage collection 
 Because garbage collection involves moving objects in memory, the collector must ensure that no 
thread is using those objects while they are moved 
The pauses when all application threads are stopped are called 
stop-the-world pauses 
These pauses generally have the greatest impact on the performance of an application, and 
minimizing those pauses is the key consideration when tuning GC.
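One way to get a rough feel for this cost is to ask the JVM how much time its collectors have accumulated. The sketch below uses the standard `java.lang.management` API; the class name, the helper method and the allocation loop are hypothetical. Note that `getCollectionTime()` reports approximate accumulated collection time, which for concurrent collectors is not the same thing as stop-the-world pause time.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcCost {
    /** Sums the approximate accumulated collection time (ms) over all collectors. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = gc.getCollectionTime();   // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long gcBefore = totalGcMillis();
        long start = System.nanoTime();

        // Hypothetical workload: many short-lived arrays to provoke collections.
        long checksum = 0;
        for (int i = 0; i < 2_000_000; i++) {
            byte[] junk = new byte[256];
            junk[0] = (byte) i;
            checksum += junk[0];
        }

        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println("checksum: " + checksum);
        System.out.println("elapsed ms: " + elapsedMs
                + ", of which GC ms (approx.): " + (totalGcMillis() - gcBefore));
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```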
Generational model 
Summary 
 all GC algorithms divide the heap into old and young generations 
 all GC algorithms use a stop-the-world approach to clear objects from the 
young generation, which is usually a very quick operation 
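The generations show up at runtime as separate memory pools. The sketch below lists the heap pools of the running JVM through the standard `java.lang.management` API. The class and method names are hypothetical, and the pool names printed depend on the JVM and on the collector in use.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

class HeapPools {
    /** Names of the memory pools that belong to the Java heap. */
    static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() != MemoryType.HEAP) {
                continue;   // skip non-heap pools such as the code cache
            }
            MemoryUsage usage = pool.getUsage();   // may be null for an invalid pool
            String used = (usage == null) ? "n/a" : (usage.getUsed() / 1024) + " KB used";
            System.out.println(pool.getName() + ": " + used);
        }
    }
}
```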
“Client” and “Server” JVM types 
Depending on the underlying hardware platform and version, 
the JVM can act as a “client” or “server” VM. This affects the 
choice of JIT compiler and default GC algorithm. 
 a “client” platform is usually 32-bit and has 1 CPU 
 a “server” platform is usually 64-bit (but 32-bit is also 
possible) and has several CPUs 
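The inputs to that decision can be inspected from inside a running program. This is a minimal sketch with hypothetical class and method names; `sun.arch.data.model` is a HotSpot-specific property and may be `null` on other JVMs.

```java
class VmKind {
    /** Number of processors the JVM believes it can use. */
    static int cpus() {
        return Runtime.getRuntime().availableProcessors();
    }

    public static void main(String[] args) {
        // The VM name typically mentions "Client VM" or "Server VM" on HotSpot.
        System.out.println("VM name:    " + System.getProperty("java.vm.name"));
        System.out.println("OS arch:    " + System.getProperty("os.arch"));
        // HotSpot-specific: "32" or "64"; may be null elsewhere.
        System.out.println("Data model: " + System.getProperty("sun.arch.data.model"));
        System.out.println("CPUs:       " + cpus());
    }
}
```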
GC Algorithms 
 Serial garbage collector (-XX:+UseSerialGC) 
 Throughput collector (-XX:+UseParallelGC , -XX:+UseParallelOldGC) 
 CMS collector (-XX:+UseConcMarkSweepGC, -XX:+UseParNewGC) 
 G1 collector (-XX:+UseG1GC) 
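These flags are passed on the `java` command line. The sketch below reads the arguments the current JVM was started with and maps them to the collector names used in the list above. The class and method names are hypothetical, and only the flags from this list are recognised. The flags reflect the Java 7/8 era: newer JDKs no longer accept some of them (CMS, for instance, was removed in JDK 14).

```java
import java.lang.management.ManagementFactory;
import java.util.List;

class CollectorFlags {
    /** Maps the first collector flag found in the JVM arguments to a collector name. */
    static String requestedCollector(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            switch (arg) {
                case "-XX:+UseSerialGC":
                    return "Serial";
                case "-XX:+UseParallelGC":
                case "-XX:+UseParallelOldGC":
                    return "Throughput";
                case "-XX:+UseConcMarkSweepGC":
                    return "CMS";
                case "-XX:+UseG1GC":
                    return "G1";
                default:
                    break;   // not a collector flag, keep looking
            }
        }
        return "default (no collector flag given)";
    }

    public static void main(String[] args) {
        // Arguments given to the JVM itself, not to main().
        List<String> jvmArgs = ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("Requested collector: " + requestedCollector(jvmArgs));
    }
}
```

After compiling, running it as `java -XX:+UseSerialGC CollectorFlags` should report the serial collector, while running it without a collector flag reports that the platform default is in effect.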
Serial garbage collector 
 default collector for client-class platforms (32-bit JVMs on Windows or single-processor 
machines) 
 uses a single thread to process the heap 
 stops all application threads for both minor and full GC 
Use cases: 
 no low-pause requirements 
 “client-style” single-CPU environment 
 very small heap (a few hundred MB) 
 several JVMs running on a single platform (number of JVMs > number of 
available CPUs)
Throughput (parallel) garbage collector 
 default collector for server-class machines (multi-CPU Unix machines or any 64-bit JVM) 
 uses multiple threads for garbage collection to gain speed and shorten pauses 
 stops all application threads for both minor and full GC 
 -XX:+UseParallelGC enables multi-threaded collection of the young generation and single-threaded 
old-generation collection/compaction 
 -XX:+UseParallelOldGC enables multi-threaded collection of the young generation and multi-threaded 
old-generation collection/compaction 
Use cases: 
 multiple CPUs are available 
 large heap and many objects created/discarded
CMS (Concurrent Mark Sweep) collector 
 designed to eliminate the long pauses associated with full GC cycles 
 stops all application threads during minor GC 
 uses a different algorithm to collect the young generation (-XX:+UseParNewGC) 
 uses one or more background threads to periodically scan the old generation and discard 
unused objects; this makes CMS a low-pause collector 
 does not perform any compaction 
 if CPU is unavailable and/or the heap becomes fragmented, it falls back to a serial (single-threaded, 
stop-the-world) collection 
 by default does not collect permgen 
Use cases: 
 low-pause requirement and available CPU resources 
 on a single-CPU machine it can be used with -XX:+CMSIncrementalMode (deprecated in Java 8) 
G1 (Garbage First) garbage collector 
 designed to process large heaps (more than 4 GB) with minimal pauses 
 divides the heap into separate regions 
 performs incremental compaction of the old generation by copying data between 
regions 
Summary (choosing GC algorithm) 
 Serial GC is best only for applications with a heap of 100 MB or less 
 Batch jobs which consume all available CPUs will get better performance 
with the throughput collector, because no CPU is left over for concurrent background GC threads 
 Batch jobs which DON’T consume all CPUs could get better performance 
with a concurrent collector 
 When measuring response time, the choice between throughput and 
concurrent collectors depends on CPU availability 
 Most of the time CMS should outperform G1 for heaps < 4 GB 
 For large heaps G1 is better because of the way it can divide work between 
different threads and heap regions 
Sizing the heap 
 A heap that is too small -> too much time spent in GC 
 Main rule – never set the heap larger than the amount of available physical 
memory 
 -Xms – initial heap size 
 -Xmx – maximum heap size 
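The effect of these two flags can be checked from inside the application. The sketch below uses the standard `Runtime` API; the class and method names are hypothetical. `maxMemory()` roughly corresponds to `-Xmx` (some collectors report slightly less), and `totalMemory()` is the heap currently committed, which starts near `-Xms` and may grow.

```java
class HeapLimits {
    /** Upper bound the JVM will try to use for the heap (roughly -Xmx). */
    static long maxHeapBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    /** Heap currently committed by the JVM (starts near -Xms and may grow). */
    static long committedHeapBytes() {
        return Runtime.getRuntime().totalMemory();
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        System.out.println("max heap (MB):       " + maxHeapBytes() / mb);
        System.out.println("committed heap (MB): " + committedHeapBytes() / mb);
        System.out.println("free in committed (MB): " + Runtime.getRuntime().freeMemory() / mb);
    }
}
```

Running it as, for example, `java -Xms256m -Xmx1g HeapLimits` and comparing the printed values with the flags is a quick sanity check that the settings were picked up.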
Sizing the Generations 
 -XX:NewRatio=N – sets the ratio of the old generation to the young generation (old = N × young) 
 -XX:NewSize=N – sets the initial size of the young generation 
 -XX:MaxNewSize=N – sets the maximum size of the young generation 
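In HotSpot, `-XX:NewRatio=N` means the old generation is N times the size of the young generation, so the young generation gets heap / (1 + N). The arithmetic can be sketched as follows; the class and method names are hypothetical, and the real JVM additionally applies alignment and the `NewSize`/`MaxNewSize` bounds, which this sketch ignores.

```java
class YoungGenSizing {
    /** Young generation size implied by a heap size and a NewRatio value. */
    static long youngGenBytes(long heapBytes, int newRatio) {
        if (newRatio < 1) {
            throw new IllegalArgumentException("newRatio must be >= 1");
        }
        return heapBytes / (1L + newRatio);   // old = newRatio * young
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024L * 1024L;
        long heap = 3 * gb;
        // With a 3 GB heap and NewRatio=2: young = 1 GB, old = 2 GB.
        long young = youngGenBytes(heap, 2);
        System.out.println("young (MB): " + young / (1024 * 1024));
        System.out.println("old   (MB): " + (heap - young) / (1024 * 1024));
    }
}
```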
Sizing Permgen and Metaspace 
Java 7: 
 -XX:PermSize=N 
 -XX:MaxPermSize=N 
Java 8: 
 -XX:MetaspaceSize=N 
 -XX:MaxMetaspaceSize=N 
Adaptive Sizing 
The JVM can try to find optimal performance according to its policies and configuration 
 Adaptive sizing controls how the JVM alters the ratio of young generation to old 
 Generation sizes are adjusted based on the GC algorithm's attempts to meet its pause goals 
 Adaptive tuning can be disabled for a small performance boost (usually not recommended) 
 Command-line arguments differ between GC algorithms. For example, for the throughput collector: 
-XX:+UseAdaptiveSizePolicy – whether to use adaptive policy (true by default) 
-XX:MaxGCPauseMillis=nnn – maximal GC pause we can tolerate 
-XX:GCTimeRatio=nnn - hint to the virtual machine that it's desirable that not more than 1 / (1 + nnn) of the 
application execution time be spent in the collector 
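The `1 / (1 + nnn)` formula for `-XX:GCTimeRatio` is easier to read with concrete numbers. The sketch below only evaluates that formula; the class and method names are hypothetical, and the sample values are chosen for illustration.

```java
class GcTimeGoal {
    /** Fraction of execution time the collector may use for a given GCTimeRatio. */
    static double maxGcFraction(int gcTimeRatio) {
        if (gcTimeRatio < 0) {
            throw new IllegalArgumentException("gcTimeRatio must be >= 0");
        }
        return 1.0 / (1.0 + gcTimeRatio);
    }

    public static void main(String[] args) {
        // 9 -> 10% in GC, 19 -> 5% in GC, 99 -> 1% in GC
        for (int ratio : new int[] {9, 19, 99}) {
            double percent = 100.0 * maxGcFraction(ratio);
            System.out.println("-XX:GCTimeRatio=" + ratio
                    + " -> at most about " + Math.round(percent) + "% of time in GC");
        }
    }
}
```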
THANK YOU 
