HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
In this presentation, we will introduce HotSpot's Garbage First collector (G1GC) as the most suitable collector for latency-sensitive applications running in large-memory environments. We will first discuss G1GC internal operations and tuning opportunities, and also cover tuning flags that set desired GC pause targets, change adaptive GC thresholds, and adjust GC activities at runtime. We will provide several HBase case studies using Java heaps as large as 100GB that show how to best tune applications to remove unpredictable, protracted GC pauses.

1. Taming GC Pauses for Large HBase Heaps
Liqi Yi – Senior Performance Engineer, Intel Corporation
Acknowledgments: Eric Kaczmarek – Senior Java Performance Architect, Intel Corporation; Yanping Wang – Java GC Performance Architect, Intel Corporation
2. Legal Disclaimer
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
The products described in this document may contain design defects or errors known as errata which may cause the product to deviate from published specifications. Current characterized errata are available on request. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Tests document performance of components on a particular test, in specific systems. Differences in hardware, software, or configuration will affect actual performance. Consult other sources of information to evaluate performance as you consider your purchase. For more complete information about performance and benchmark results, visit http://www.intel.com/performance.
Results have been estimated based on internal Intel analysis and are provided for informational purposes only. Any difference in system hardware or software design or configuration may affect actual performance. Results have been simulated and are provided for informational purposes only. Results were derived using simulations run on an architecture simulator or model. Any difference in system hardware or software design or configuration may affect actual performance.
Intel does not control or audit the design or implementation of third party benchmark data or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmark data are reported and confirm whether the referenced benchmark data are accurate and reflect performance of systems available for purchase.
Intel and the Intel logo are trademarks of Intel Corporation in the U.S. and other countries. *Other names and brands may be claimed as the property of others. Copyright © 2015 Intel Corporation. All rights reserved.
3. Motivation
• HBase SLAs are driven by query response times
• Unpredictable and lengthy (seconds+) stop-the-world garbage collections burden applications, keeping HBase out of mission-critical use
• Frequent GCs hurt throughput
• Growing memory capacity (200+ GB) is not uncommon
• A more efficient GC algorithm is needed to tame 100 GB+ Java heaps
4-7. Garbage Collectors in JDK 8 (one slide, built up over four)
• Parallel Compacting Collector (-XX:+UseParallelOldGC): throughput-friendly collector
• Concurrent Mark Sweep (CMS) Collector (-XX:+UseConcMarkSweepGC): low-latency collector for heaps < 32 GB; in maintenance mode and slated to be replaced by G1
• Garbage First (G1) Collector (-XX:+UseG1GC): low-latency collector; "One Garbage Collector To Rule Them All"
8-10. G1 Collector Promises (one slide, built up over three)
• Concurrent marking
• Natural compaction
• Simplified tuning
• Low and predictable latency
Not yet there, but getting closer…
11. Garbage First Collector Architecture Overview
• The heap is divided into ~2K non-contiguous regions of eden, survivor, and old spaces; region size can be 1 MB, 2 MB, 4 MB, 8 MB, 16 MB, or 32 MB.
• Humongous regions are old regions for large objects that are bigger than ½ of the region size. Humongous objects are stored in contiguous regions in the heap.
• The number of eden and survivor regions can change between GCs.
• Young GC: multi-threaded, low stop-the-world pauses.
• Concurrent marking and cleanup: mostly parallel, multi-threaded.
• Mixed GC: multi-threaded, incremental GC that collects and compacts the heap partially; more expensive stop-the-world pauses.
• Full GC: serial; collects and compacts the entire heap.
* Picture from: http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html
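As a back-of-the-envelope check of what these numbers mean for the 100 GB heaps used later in this deck (derived from G1's default targeting of roughly 2,048 regions, not stated on the slides): 100 GB / 2048 ≈ 50 MB desired region size, which clamps to the 32 MB maximum, giving about 3,200 regions and a humongous-object threshold of 32 MB / 2 = 16 MB.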
12-13. Garbage First Collector Parameters
G1GC Diagnostic Flags:
• -XX:+PrintFlagsFinal — prints all JVM runtime flags when the JVM starts (no overhead)
• -XX:+PrintGCDetails — prints GC status (a must-have for GC tuning; low overhead)
• -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps — track timestamps for each GC activity (low overhead)
• -XX:+PrintAdaptiveSizePolicy — prints information whenever the GC decides to change a setting or hits certain conditions
• -XX:+PrintReferenceGC — prints GC reference processing for each GC
G1GC Performance Flags:
• -XX:+UseG1GC -Xms100g -Xmx100g
• -XX:+ParallelRefProcEnabled — processes references with multiple threads in parallel
• -XX:MaxGCPauseMillis=100 — sets a low desired GC pause target (the default is 200 ms); ideally, the only option needed
• -XX:ParallelGCThreads=48 — sets the number of parallel GC threads; recommended: 8 + (#of_logical_processors − 8) × (5/8)
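To sanity-check that recommendation against the hardware used later in this deck: a dual-socket Xeon E5-2699 v3 has 2 × 18 cores, or 72 logical processors with Hyper-Threading enabled (the 72 is inferred from the part number, not stated on the slides), so 8 + (72 − 8) × 5/8 = 8 + 40 = 48, matching the -XX:ParallelGCThreads=48 setting above.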
14. Experiment Environment
Hardware: 8 regionservers; dual-socket Xeon E5-2699 v3 @ 2.3 GHz; 128 GB DDR4 @ 2133 MHz; 4× Intel S3700 SSD; 10 Gbps Ethernet
Software: HBase 0.98.6; Zookeeper 3.4.5; Hadoop 2.5.0; YCSB (Yahoo Cloud Serving Benchmark) 0.14; Oracle JDK8 update 40
15-16. Experiment Configuration
• Regionserver: 100 GB Java heap; 40% BlockCache, 40% Memstore; 1 table, 1 column family; 1 billion rows, 1 KB each; 170 GB of table data per regionserver; HDFS replication factor 3
• YCSB client: 8 separate YCSB clients; 50% read / 50% write workload (the worst offender for GC); Zipfian row-key generator; cold start, one-hour runs; query rate unthrottled (a workload sketch follows this list)
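For reference, this setup maps closely onto YCSB's standard core-workload properties. A minimal sketch, assuming the stock CoreWorkload property names; the exact invocation for YCSB 0.14 may differ, and the field sizing shown is illustrative rather than taken from the slides:

    # YCSB core workload properties (illustrative sketch)
    workload=com.yahoo.ycsb.workloads.CoreWorkload
    # 1 billion rows; 10 fields x 100 bytes = ~1 KB per row
    recordcount=1000000000
    fieldcount=10
    fieldlength=100
    # 50% read / 50% write
    readproportion=0.5
    updateproportion=0.5
    # Zipfian row key generator
    requestdistribution=zipfian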
17-18. CMS Baseline
[Chart: CMS GC pause time (ms) vs. time since the JVM launched. Highlights: one 4.96-second pause, one 2.6-second pause, and 10 pauses > 400 ms.]
Yes! Your application just paused for 5 seconds.
19. G1 Baseline
-XX:+UseG1GC -Xms100g -Xmx100g -XX:MaxGCPauseMillis=100
[Chart: G1GC pause time series. 71% of pauses > 100 ms, 26% > 200 ms, 11% > 300 ms; 3 pauses around 400 ms.]
20. How to Improve Garbage Collection Behavior
1. Enable GC logging
2. Look for outliers (long pauses) in the logs
3. Understand the root cause of long GC pauses
4. Tune the GC command line to avoid or alleviate the symptoms
5. Examine the logs and repeat from step 2 (multiple iterations)
A sketch of steps 1 and 2 follows this list.
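A minimal sketch of steps 1 and 2, assuming the diagnostic flags from slide 12 and that pause wall-clock times appear in the log as "real=… secs" (the gc.log file name is an assumption):

    # Step 1: enable GC logging on the JVM command line
    -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps -Xloggc:gc.log

    # Step 2: list the ten longest stop-the-world pauses (wall-clock seconds)
    grep -o 'real=[0-9.]*' gc.log | sort -t= -k2 -rn | head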
21. Examining Logs – Sample log snippet #1

    2957.020: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate old regions available, candidate old regions: 597 regions, reclaimable: 5504622248 bytes (5.13 %), threshold: 5.00 %], 0.3933224 secs]
    [Parallel Time: 353.2 ms, GC Workers: 48]
    [GC Worker Start (ms): Min: 2956632.9, Avg: 2956633.2, Max: 2956633.4, Diff: 0.6]
    [Ext Root Scanning (ms): Min: 0.5, Avg: 0.7, Max: 1.5, Diff: 1.1, Sum: 35.3]
    [Update RS (ms): Min: 7.5, Avg: 8.6, Max: 9.7, Diff: 2.2, Sum: 411.7]
    [Processed Buffers: Min: 4, Avg: 6.9, Max: 14, Diff: 10, Sum: 333]
    [Scan RS (ms): Min: 70.5, Avg: 71.5, Max: 73.9, Diff: 3.4, Sum: 3432.5]
    [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.8]
    [Object Copy (ms): Min: 269.6, Avg: 271.3, Max: 272.3, Diff: 2.7, Sum: 13024.7]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 65.6G(100.0G)->61.5G(100.0G)]
    [Times: user=17.28 sys=0.04, real=0.40 secs]

Callouts: a 400 ms GC pause; it is a mixed GC; Scan RS and Object Copy costs are expensive; only 4 GB freed.
22. Examining Logs – Sample log snippet #1 (continued)

    2942.508: [G1Ergonomics (Mixed GCs) continue mixed GCs, reason: candidate old regions available, candidate old regions: 1401 regions, reclaimable: 15876399368 bytes (14.79 %), threshold: 5.00 %], 0.1843897 secs]
    [Parallel Time: 160.3 ms, GC Workers: 48]
    [GC Worker Start (ms): Min: 2942325.1, Avg: 2942325.4, Max: 2942325.7, Diff: 0.6]
    [Ext Root Scanning (ms): Min: 0.4, Avg: 0.8, Max: 1.8, Diff: 1.3, Sum: 38.0]
    [Update RS (ms): Min: 6.9, Avg: 7.8, Max: 9.2, Diff: 2.3, Sum: 374.9]
    [Processed Buffers: Min: 4, Avg: 6.7, Max: 11, Diff: 7, Sum: 320]
    [Scan RS (ms): Min: 26.1, Avg: 27.1, Max: 27.7, Diff: 1.6, Sum: 1299.1]
    [Code Root Scanning (ms): Min: 0.0, Avg: 0.0, Max: 0.2, Diff: 0.2, Sum: 0.3]
    [Object Copy (ms): Min: 123.2, Avg: 123.8, Max: 124.4, Diff: 1.1, Sum: 5942.0]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 68.3G(100.0G)->61.3G(100.0G)]
    [Times: user=6.89 sys=1.17, real=0.19 secs]

Callouts: a 190 ms GC pause; this mixed GC ran 15 seconds before the one above; 7 GB freed; Scan RS and Object Copy are much cheaper.
23. Root Causes and Possible Solutions
• The later mixed GC takes much more time yet frees fewer wasted regions: not cost-efficient
• The heap is not full, so those regions could be left for the next collection cycle
• Exclude the most expensive mixed GCs by increasing the allowed heap waste: raise -XX:G1HeapWastePercent to 20% (default 5%)
24. Examining Logs – Sample log snippet #2

    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 70.5G(100.0G)->68.4G(100.0G)]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 70.2G(100.0G)->68.4G(100.0G)]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 70.6G(100.0G)->70.3G(100.0G)]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 71.9G(100.0G)->69.5G(100.0G)]

Callout: 30 GB of heap goes unused because the concurrent marking phase started too early.
25. Root Cause and Solution
• Concurrent marking started while the heap still had plenty of free space
• This causes extra GC pauses and extra copying of objects that die in the near future
• The goal is to increase heap usage:
• Increase InitiatingHeapOccupancyPercent, so more of the heap can be used before marking begins
• Increase ConcGCThreads, so the concurrent marking phase completes early enough to avoid a full GC (a worked check follows this list)
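For scale, two derived checks that are not stated on the slides: the JDK 8 default for InitiatingHeapOccupancyPercent is 45, so marking cycles on this 100 GB heap would begin near 45 GB of occupancy even though snippet #2 shows steady-state usage around 68-70 GB; and assuming the JDK 8 default of roughly one quarter of ParallelGCThreads for ConcGCThreads, the baseline would run about 48 / 4 = 12 concurrent marking threads, so the tuned value of 32 (slide 26) nearly triples that.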
26. G1 Parameters Tuning

    -XX:+UseG1GC -XX:MaxGCPauseMillis=100
    -XX:G1HeapWastePercent=20 -XX:InitiatingHeapOccupancyPercent=75
    -XX:ConcGCThreads=32 -XX:ParallelGCThreads=48
    -XX:InitialHeapSize=107374182400 -XX:MaxHeapSize=107374182400
    -XX:+ParallelRefProcEnabled
    -XX:+PrintAdaptiveSizePolicy -XX:+PrintFlagsFinal
    -XX:+PrintGC -XX:+PrintGCDateStamps -XX:+PrintGCDetails
    -XX:+PrintGCTimeStamps -XX:+PrintReferenceGC
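In practice these options would be attached to the regionserver JVM. A minimal sketch, assuming the stock hbase-env.sh hook (HBASE_REGIONSERVER_OPTS is standard HBase; the GC log path is an assumption):

    # hbase-env.sh (illustrative)
    export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS \
      -XX:+UseG1GC -Xms100g -Xmx100g \
      -XX:MaxGCPauseMillis=100 \
      -XX:G1HeapWastePercent=20 -XX:InitiatingHeapOccupancyPercent=75 \
      -XX:ConcGCThreads=32 -XX:ParallelGCThreads=48 \
      -XX:+ParallelRefProcEnabled \
      -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCTimeStamps \
      -XX:+PrintAdaptiveSizePolicy -XX:+PrintReferenceGC \
      -Xloggc:/var/log/hbase/gc-regionserver.log"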
27. GC Log After Tuning – Sample log snippet

    [Eden: 4800.0M(4800.0M)->0.0B(4640.0M) Survivors: 640.0M->704.0M Heap: 81.5G(100.0G)->78.6G(100.0G)]
    …
    [Eden: 4640.0M(4640.0M)->0.0B(5664.0M) Survivors: 640.0M->672.0M Heap: 81.2G(100.0G)->78.3G(100.0G)]
    …
    [Eden: 4480.0M(4480.0M)->0.0B(4480.0M) Survivors: 640.0M->640.0M Heap: 82.2G(100.0G)->79.6G(100.0G)]
    …
    [Eden: 4640.0M(4480.0M)->0.0B(7712.0M) Survivors: 640.0M->640.0M Heap: 80.4G(100.0G)->77.4G(100.0G)]

Callout: heap usage increased from 70 GB to 80 GB without triggering a full GC.
28. G1 After Tuning
[Chart: G1GC pause time series after tuning. 40% of pauses > 100 ms, 8% > 200 ms, 1% > 300 ms; a single 340 ms pause. 2,189 GC pauses in total, average 105 ms, 99th percentile 310 ms.]
29-30. G1 Before and After Tuning
[Charts: G1GC pause time series before tuning (71% > 100 ms, 26% > 200 ms, 11% > 300 ms) and after tuning (40% > 100 ms, 8% > 200 ms, 1% > 300 ms).]
60% of pauses are now shorter than 100 ms.
31-33. G1 Characteristics Before and After Tuning (one chart, annotated across three slides)

    Cluster GC pause time (seconds)    Before Tuning    After Tuning
    Mixed GC Pauses                        1,114,798         519,062
    Young GC Pauses                          446,482         759,254
    Cleanup Pauses                            81,322          76,089
    Remark Pauses                             51,826          39,855
    Total STW Pauses                       1,694,428       1,394,260

Slide callouts: 53% less mixed-GC time, 7% less cleanup time, 23% less remark time, and 18% less total stop-the-world time; young-GC time grew (the slide's 41% figure), as cheap young GCs replace expensive mixed GCs. Net effect: a roughly 20% shorter total application pause.
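Those callouts can be checked directly against the bar values (the arithmetic here is derived, not printed on the slides): 1 − 519,062/1,114,798 ≈ 53% less mixed-GC time; 1 − 39,855/51,826 ≈ 23% less remark time; and 1 − 1,394,260/1,694,428 ≈ 18% less total stop-the-world time, which is the "~20% shorter total application pause" headline.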
34-35. Comparison
[Chart: GC pause time comparison over time since JVM launch — CMS baseline vs. G1 before tuning vs. G1 after tuning, with 100 ms / 200 ms / 300 ms reference lines.]
Less than 1% of pauses exceed 300 ms.
36. Conclusion
• CMS behaves well until it doesn't (stop-the-world GCs lasting 5+ seconds)
• The JDK8u40 G1 collector is a good CMS alternative for large Java heaps (100+ GB)
• Default G1 settings still do not offer the best performance; tuning is required
• A tuned G1 provides low and predictable latencies for HBase running with large heaps (100+ GB)
The G1 collector constantly improves!
37. Contacts
• Eric Kaczmarek — eric.kaczmarek@intel.com
• Yanping Wang — yanping.wang@intel.com, Twitter: @YanWang6
• Liqi Yi — liqi.yi@intel.com, Twitter: @yi_liqi
38. Additional Resources
• https://blogs.oracle.com/g1gc/
• http://www.infoq.com/articles/G1-One-Garbage-Collector-To-Rule-Them-All
• http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/G1GettingStarted/index.html#ClearCT
• https://blogs.oracle.com/g1gc/entry/g1gc_logs_how_to_print
• http://blog.cloudera.com/blog/2014/12/tuning-java-garbage-collection-for-hbase/
