Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Advanced Tuning

Garbage-First Garbage Collector
Charlie Hunt
Monica Beckwith
John Cuthbertson
What to Expect
Learn what you need to know to experience nirvana in your evaluation of G1 GC, whether you are migrating from Parallel GC to G1 GC or from CMS GC to G1 GC
You also get a walk-through of some case study data
Who are these guys?
About us
• Charlie Hunt
• Architect, Performance Engineering, Salesforce.com
• Monica Beckwith
• Performance Architect, Servergy
• John Cuthbertson
• GC Engineer, Azul Systems
Agenda
Three Topics
• Migration / Evaluation Prep
• Parallel GC to G1 GC
• CMS GC to G1 GC
And, at the end
• Summary of things to watch for during your G1
evaluation
Prep Work
Prep Work
• Want less stress in your performance testing?
• Define success in terms of throughput, latency, footprint and
capacity
• Approach
• Get your application stakeholders together and start asking
questions!
Throughput
• What’s the expected throughput?
• Can you fall below that expected throughput?
• How long can you fall below that expected throughput?
• What throughput can you never fall below?
• How is throughput measured?
• Txns per sec, messages per sec, one metric, multiple metrics?
• Where is throughput measured?
• Application server? At the client?
Latency
• What’s the expected latency?
• Can you exceed the expected latency?
• By how much, and for how long?
• What latency should you never exceed?
• How is latency measured?
• Response time, percentile(s), one metric, multiple metrics?
• Where is latency measured?
• Application server? At the client?
Footprint
• How much RAM / memory can you use?
• What amount of RAM / memory should you never
exceed?
• How is memory consumption measured?
• What tools?
• At what time during application execution?
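• For example, a low-overhead way to sample both Java heap usage and overall process footprint over time (a sketch using standard JDK/OS tools; <pid> is a placeholder for your JVM's process id):
jstat -gc <pid> 10s (samples GC heap usage every 10 seconds)
ps -o rss= -p <pid> (resident set size, in KB, of the JVM process on Linux)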
Capacity
• What’s the expected load?
• i.e. concurrent users, number of concurrent txns, injection rate
• How much additional (load) capacity do you need?
• i.e. concurrent users, number of concurrent txns, injection rate
• Capacity metrics are often some combination of
• # of concurrent users, an injection rate, number of concurrent
txns in flight and/or CPU utilization
One last one …. Power Consumption
• Power consumption may also be a requirement
• What’s the expected power consumption?
• FYI - power consumption BIOS settings can impact
throughput, latency and capacity
• Can observe as much as 15% difference in latency and CPU
utilization
Experimental Design
• Now you’ve got the inputs to begin experimental design
• Do you have a workload that can be configured to test the
performance requirements?
• You may have to develop or enhance a workload
• Do you have an environment that can test the
requirements you’ve gathered?
Share the Experiment Design(s)
• Once you have the set of experiments to test defined
• Share those plans with your stakeholders
• It’s a great way to validate that you’re testing the right things in an
appropriate manner
• You’ll likely get feedback and tweak your experiment’s design(s)
• Most of all, you get agreement on what’s being tested, and how
• Saves you a lot of hassle!
Migrating from
Parallel GC to G1 GC
When to consider migration
• When Parallel GC can’t be tuned to meet latency goals
• Usually due to Full GC pauses
• Duration and/or frequency
• When Parallel GC can’t meet footprint and latency goals
• G1 will likely be able to run with smaller Java heap and shorter
pauses than Parallel GC’s full GCs
• Max load capacity will likely be lower with G1
• G1 has more overhead than Parallel GC
When to consider migration
• If you can tune Parallel GC to meet your
throughput, latency, footprint and capacity goals
• Stay with Parallel GC!
• If you can tune Parallel GC to meet latency, footprint and capacity
goals, it offers the best throughput of the HotSpot GCs
Parallel GC Heap Sizing
• Suppose an application running Parallel GC with these
heap sizing options
-Xms5g -Xmx5g -Xmn2g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4
-XX:+UseParallelOldGC
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
Does this make sense?
Remove all young gen size refining options
• Updated for G1
-Xms5g -Xmx5g -Xmn2g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4
-XX:+UseG1GC -XX:MaxGCPauseMillis=250
-XX:InitiatingHeapOccupancyPercent=45 (default is 45)
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
(-Xmn2g, -XX:-UseAdaptiveSizePolicy and -XX:SurvivorRatio=4 are the young gen size refining options to remove)
Example Application Requirements
• At expected load
• 2 second response time at 99th percentile
• No response time above 10 seconds
• 2x expected load
• 2 second response time at 99th percentile
• 6 GB RAM available to JVM
Current Observations
• With Parallel GC at expected capacity / load
• Observing 2.5 second latency at 99th percentile
• Above 2 second 99th percentile requirement
• ~ 29% CPU utilization
• 2x expected load not evaluated
• Why bother? Can’t meet 99th percentile requirement for
expected capacity / load!
Again, our Parallel GC heap sizing
-Xms5g -Xmx5g -Xmn2g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4
-XX:+UseParallelOldGC
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
Again, our starting place for G1 GC
-Xms5g -Xmx5g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:+UseG1GC -XX:MaxGCPauseMillis=250
-XX:InitiatingHeapOccupancyPercent=45
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
First G1 GC results
• Initial observations
• 99th percentile response time
• G1 GC - 1.4 seconds
• Parallel GC - 2.5 seconds
• ~ 44% reduction
[Chart: 99th Percentile Response Time in seconds (lower is better): Parallel GC vs. Initial G1 GC, 99th percentile requirement marked]
First G1 GC results
• Initial observations
• CPU utilization
• G1 GC - 39%
• Parallel GC – 29%
[Chart: CPU Utilization in percent (lower is better): Parallel GC vs. Initial G1 GC]
Analysis
• Are we done?
• Should we next test for 2x capacity?
• No, go look at the raw G1 GC logs for potential improvement!!!
• Additionally, that jump in CPU utilization is a bit
concerning, 29% to 39%
• How will the app behave as the injection rate increases towards
our capacity goals of 2x expected load?
G1 GC Analysis
• Starting point
• Look for “to-space overflow”, “to-space exhausted” and “Full
GCs” in the G1 GC logs
• #1 goal is to avoid these with G1 GC!
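• One quick way to scan a G1 GC log for these events (a sketch; "gc.log" is a placeholder for your log file name):
grep -nE "to-space (overflow|exhausted)|Full GC" gc.log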
Evacuation failures (i)
• to-space overflow and to-space exhausted
• These are G1 region evacuation failures
• Situations where no G1 regions are available to evacuate live
objects
• Note, the region being evacuated could be an eden
region, survivor region or an old region – i.e. “to-space overflow” is
not limited to a survivor space overflow situation like what can
occur with other HotSpot GCs
Evacuation failures (ii)
• to-space overflow and to-space exhausted
• (Very) expensive events to deal with
• For any object that has already been evacuated, its object references
must be updated, and the region must be tenured
• For any object that has not been evacuated, its object references must
be self-forwarded and the region tenured in place
• Evacuation failures are usually followed by a Full GC
• Full GC collects all regions, and is single threaded
G1 GC Analysis
• What did we find?
About every 25 minutes …
2216.206: [GC pause (young) (to-space overflow), 0.62894300 secs]
[Parallel Time: 239.4 ms] …
2217.958: [Full GC 4802M->1571M(5120M), 3.0889520 secs]
[Times: user=4.34 sys=0.01, real=3.09 secs]
This is not good!
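Reading the Full GC line (standard log format): at 2217.958 seconds of JVM uptime, a single-threaded Full GC reduced heap occupancy from 4802 MB to 1571 MB of a 5120 MB heap, taking about 3.09 seconds of real (wall clock) time.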
G1 GC Analysis
• What next?
• Enable -XX:+PrintAdaptiveSizePolicy to see what adaptive
decisions G1 is making
• Look at GC logs for those areas leading up to to-space
overflows, to-space exhausted and Full GC
G1 GC Analysis
• What did we find?
2219.546: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 2311061504 bytes, allocation request: 6440816 bytes, threshold: 2312110080 bytes (45.00 %), source: concurrent humongous allocation]
250+ of these in two hours!
• Resolution: avoid humongous allocations
G1 GC Analysis
• What next?
• Object allocations greater than or equal to 50% of G1’s region
size are considered humongous allocations
• Humongous object allocation & collection are not optimized in G1
• G1 region size is determined automatically at JVM launch time
based on the size of the Java heap (-Xms in 7u40 & later)
• But you can override it with -XX:G1HeapRegionSize=#
• Size must be a power of 2 ranging from 1 MB to 32 MB
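• As a worked example against this application’s 5 GB heap (assuming G1’s usual target of roughly 2048 regions; the exact heuristic is version dependent):
5120 MB / 2048 regions ≈ 2.5 MB, rounded down to a power of 2 → 2 MB region size
humongous threshold = 2 MB * 50% = 1 MB
allocation request of 6440816 bytes ≈ 6.4 MB > 1 MB → humongous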
G1 GC Analysis
• What next (continued)?
• Object allocation from GC log, 6440816 bytes, or about 6.4 MB
• Need a region size such that 6.4 MB is less than 50% of the
region size, and where the region size is a power of 2 between
1 MB and 32 MB
• Set -XX:G1HeapRegionSize=16m
• 16 MB * 50% = 8 MB, 8 MB > 6.4 MB
• Note: -Xms & -Xmx must be aligned with the region size
• 5g == 5120m, 5120m/16m == 320, 5000m/16m == 312.5
Adjust -Xms & -Xmx to gain alignment at 16m
Updated G1 GC Options
-Xms5g -Xmx5g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:+UseG1GC -XX:MaxGCPauseMillis=250
-XX:InitiatingHeapOccupancyPercent=45
-XX:G1HeapRegionSize=16m
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
16 MB Region Size
• Observations with G1 region size at 16 MB
• 99th percentile response time dropped another 25%
• G1 GC – 1.05 seconds (16 MB region size)
• G1 GC - 1.4 seconds (initial configuration)
• Parallel GC - 2.5 seconds
[Chart: 99th Percentile Response Time in seconds (lower is better): Parallel GC vs. Initial G1 GC vs. G1 Region Size, 99th percentile requirement marked]
16 MB Region Size
• Observations with G1 region size at 16 MB
• CPU utilization dropped, but still higher than Parallel GC
• G1 GC - 33% (16 MB region size)
• G1 GC - 39% (initial configuration)
• Parallel GC - 29%
[Chart: CPU Utilization in percent (lower is better): Parallel GC vs. Initial G1 GC vs. G1 Region Size]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for some additional potential
improvement!!!
• CPU utilization still a little higher, 29% vs. 33%
• Still need to know how the app will behave at the 2x expected load
capacity goal
G1 GC Analysis
• What did we find?
• No humongous allocations, no adaptive / ergonomic issues
• Looking at the amount of space reclaimed per marking & mixed
GC cycle
• 300 MB to 900 MB per cycle, ~ 6% - 18% of the Java heap
• ~ 95% heap occupancy just prior to mixed GC cycles
• Seems pretty good
• Delicate balance between starting the marking cycle too early and
wasting CPU cycles at the expense of capacity vs starting too late and
experiencing to-space overflow or to-space exhausted events
Analysis
• What about meeting 99th percentile latency at 2x
expected load capacity goal?
• 99th percentile response time
• 1.2 seconds, ~ 150 ms higher, 1.05 seconds at 1x
• CPU utilization
• 59%, ~ 1.8x increase, 33% at 1x
[Chart: 99th Percentile Response Time in seconds (lower is better): Expected Load vs. 2x Expected Load, 99th percentile requirement marked]
[Chart: CPU Utilization for G1 GC in percent: Expected Load vs. 2x Expected Load]
Analysis
• Are we done?
• Yes, for now ;-)
• Requirements met
• Some additional
performance realized via GC
analysis
• Stakeholders happy!
Migrating from
CMS GC to G1 GC
When to consider migration
• When CMS GC cannot meet your latency goals
• Usually due to fragmented old gen space, or not enough
available Java heap
• When CMS GC has difficulties meeting footprint goals
• G1 GC will likely run with a smaller Java heap
• G1 GC may use more CPU
• G1 has more overhead than CMS GC
• Though not as much of a delta as G1 GC to Parallel GC
Example Application Requirements
• Expected load
• 1500 concurrent users, 2 Txns/user per second
• 99th percentile – 250 ms
• Desire all response times to be less than 1 second
• 12 GB RAM per JVM
• 2x load capacity
• 3000 users, 2 Txns/user per second
Current CMS GC Configuration
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
Does this make sense?
CMS Configuration Updated for G1
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseG1GC -XX:MaxGCPauseMillis=200 (default is 200)
-XX:InitiatingHeapOccupancyPercent=50 (default is 45)
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
(the young gen sizing, survivor ratio, tenuring threshold and GC thread count options are dropped in the G1 starting configuration that follows)
Current Observations
• With CMS GC
• At expected capacity
• Meeting 99th percentile response time of 250 ms
• 240 ms
• Not meeting max response time less than 1 second
• ~ 5 seconds
• At 2x expected capacity
• Not meeting 99th percentile (250 ms), or max response time < 1 sec
• 99th percentile – 310 ms
• Max response time ~ 8 seconds
Current Observations
• What did we find?
• CMS GC promotion failures
• Here’s an example
[ParNew (promotion failed) 1797568K->1775476K(1797568K), 0.7051600 secs] [CMS: 3182978K->1773645K(8388608K), 5.2594980 secs] 4736752K->1773645K(10186176K), [CMS Perm : 52858K->52843K(65536K)] 5.9649620 secs] [Times: user=7.43 sys=0.00, real=5.96 secs]
This is not good!
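Reading the promotion failure line (standard CMS log format): ParNew could not promote survivors into the old generation, so the young gen only dropped from 1797568K to 1775476K (of 1797568K capacity), and the JVM fell back to a stop-the-world collection of the old gen, 3182978K->1773645K(8388608K), costing about 5.96 seconds of real time in total.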
Again, CMS GC Heap Sizing
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
Again, our starting place for G1 GC
-Xms10g -Xmx10g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:+UseG1GC -XX:MaxGCPauseMillis=200 (default is 200)
-XX:InitiatingHeapOccupancyPercent=50 (default is 45)
First G1 GC results (response times)
• Initial observations
• 99th percentile at expected load
• CMS – 240 ms
• G1 – 78 ms
[Chart: 99th Percentile in milliseconds (lower is better): CMS GC vs. G1 GC, 99th percentile requirement marked]
First G1 GC results (response times)
• Initial observations
• Max at expected load
• CMS – 5000 ms
• G1 – 582 ms
[Chart: Max Response Time in milliseconds (lower is better): CMS GC vs. G1 GC, max response time requirement marked]
First G1 GC results (CPU utilization)
• Initial observations
• At expected load
• CMS – 29%
• G1 – 33%
[Chart: CPU Utilization in percent (lower is better): CMS GC vs. G1 GC]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for potential improvement!!!
• Additionally, the increase in CPU utilization, (29% to 33%) may
be a little concerning
• How will the app behave at 2x load capacity?
G1 2x Capacity (response times)
• Observations
• 99th percentile at 2x capacity
• CMS – 310 ms
• G1 – 103 ms
[Chart: 99th Percentile in milliseconds (lower is better): CMS GC vs. G1 GC at 2x capacity, 99th percentile requirement marked]
G1 2x Capacity (response times)
• Observations
• Max at 2x capacity
• CMS – 8000 ms
• G1 – 5960 ms
[Chart: Max Response Time in milliseconds (lower is better): CMS GC vs. G1 GC at 2x capacity, max response time requirement marked]
G1 2x Capacity (CPU utilization)
• Observations
• CPU at 2x capacity
• CMS – 59%
• G1 – 56%
[Chart: CPU Utilization in percent (lower is better): CMS GC vs. G1 GC at 2x capacity]
G1 GC Analysis
• What next?
• Go look at the GC logs!!!
G1 GC Analysis
• What did we find?
• Excessive RSet update times
[GC pause (young), 0.9929170 secs]
[Parallel Time: 988.0 ms, GC Workers: 10]
…. <other GC info removed> ….
[Update RS (ms): Min: 154.5, Avg: 237.4, Max: 980.2, Diff: 825.7, Sum: 2373.8]
Very high RSet update time, and a very high pause time against a 200 ms pause time goal!
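Reading the line: Min / Avg / Max are per-GC-worker times across the 10 workers, so a Max of 980.2 ms means at least one worker spent nearly the entire 988.0 ms parallel phase updating remembered sets.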
G1 GC Analysis
• What next?
• When RSet update times are high
• Tune -XX:G1RSetUpdatingPauseTimePercent (default is 10)
• Target time to spend updating RSets during GC evacuation pause
• Lower value pushes more work onto G1 Concurrent Refinement
threads
• Higher value pushes more to be done during GC (stop the world)
pause
• Tune -XX:G1ConcRefinementThreads
• Defaults to value of -XX:ParallelGCThreads
G1 GC Analysis
• What next? (continued)
• Test system has 12 virtual processors
• Currently using default -XX:ParallelGCThreads=10
• Change -XX:ParallelGCThreads=12
• This will also set -XX:G1ConcRefinementThreads=12
• Decrease -XX:G1RSetUpdatingPauseTimePercent=5
• default is 10
• Watch for change / increase in CPU usage
Updated G1 GC JVM Options
-Xms10g -Xmx10g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:+UseG1GC -XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=50
-XX:ParallelGCThreads=12
-XX:G1RSetUpdatingPauseTimePercent=5
Updated G1 GC Config (response times)
• Observations at 2x capacity
• 99th percentile
• CMS – 310 ms
• Initial G1 – 103 ms
• Updated G1 - 194 ms
[Chart: 99th Percentile in milliseconds (lower is better): CMS vs. Initial G1 vs. Updated G1, 99th percentile requirement marked]
Updated G1 GC Config (response times)
• Observations at 2x capacity
• Max response time
• CMS – 8000 ms
• Initial G1 – 5960 ms
• Updated G1 – 548 ms
[Chart: Max Response Time in milliseconds (lower is better): CMS vs. Initial G1 vs. Updated G1, max response time requirement marked]
Updated G1 GC Config (CPU utilization)
• Observations at 2x capacity
• CPU Utilization
• CMS – 59%
• Initial G1 – 56%
• Updated G1 – 57%
[Chart: CPU Utilization in percent (lower is better): CMS vs. Initial G1 vs. Updated G1]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for some additional potential
improvement!!!
G1 GC Analysis
• What did we find?
• No humongous allocations, no adaptive / ergonomic issues
• Looking at the amount of space reclaimed per marking & mixed
GC cycle
• 1 GB to 2 GB per cycle, ~ 10% - 20% of the Java heap
• ~ 90% heap occupancy just prior to mixed GC cycle
• Seems pretty good
• Delicate balance between starting the marking cycle too early and
wasting CPU cycles at the expense of capacity vs starting too late and
experiencing to-space exhausted or to-space overflow events
Analysis
• Are we done?
• Yes, for now ;-)
• Requirements met
• 99th percentile
• Max response time
• Both expected load and 2x
expected load
• With lower CPU utilization!
• Stakeholders happy!
Summary
Call to Action (i)
• Understand the “success criteria”
• Understand when it makes sense to transition and
evaluate G1 GC
• Don’t over-tune!
• Start G1 evaluation with minimal settings
• Initial and max heap size, pause time target and when to start
marking cycle
• Don’t use young gen sizing options, e.g. -Xmn, -XX:SurvivorRatio, etc. (a minimal sketch follows)
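• A minimal starting command line along these lines (a sketch; heap size, pause target and application name are illustrative placeholders):
java -Xms<heap> -Xmx<heap> -XX:+UseG1GC -XX:MaxGCPauseMillis=<target>
-XX:InitiatingHeapOccupancyPercent=45
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps MyApp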
Call to Action (ii)
• Do GC analysis
• Use tools for a high level view and identify troublesome time
periods --- then look at the raw GC logs
• Look at several GC events before and after to see how things got into
an undesirable state, what action G1 took, and how it recovered
Call to Action (iii)
• Do GC analysis
• If you see evacuation failures (to-space overflow, to-space
exhausted or Full GC)
• Enable -XX:+PrintAdaptiveSizePolicy
• If you see humongous object allocations
• Increase -XX:G1HeapRegionSize
• If you see to-space exhausted, i.e. marking cycle starting too late
• Remove young gen sizing constraints if they exist
• Decrease -XX:InitiatingHeapOccupancyPercent, (default is 45)
• Increase Java heap size -Xms and -Xmx
Call to Action (iv)
• Do GC analysis
• If you see high RSet update times
• Increase -XX:ParallelGCThreads if not at the max budget of virtual
processors per JVM
• This will increase -XX:G1ConcRefinementThreads
• Can set independently of -XX:ParallelGCThreads
• Decrease -XX:G1RSetUpdatingPauseTimePercent
• Lowers time spent in RSet updating in GC evacuation pause
• Note: May observe increase in CPU utilization
Call to Action (iv) continued …
• To observe RSet update and RSet scan information
• Add these three JVM command line options
• -XX:+G1SummarizeRSetStats
• -XX:G1SummarizeRSetStatsPeriod=1
• -XX:+UnlockDiagnosticVMOptions
• Can use these to identify if some of the updating and
coarsening of RSets is being pushed onto the mutator
(application) threads
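• A sketch of adding these to a launch command (note that -XX:+UnlockDiagnosticVMOptions must appear before the diagnostic flags it unlocks; "MyApp" and the existing options are placeholders):
java <your existing options> -XX:+UnlockDiagnosticVMOptions
-XX:+G1SummarizeRSetStats -XX:G1SummarizeRSetStatsPeriod=1 MyApp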
Call to Action (v)
• If the time from marking cycle start until mixed GCs start is long
• Increase -XX:ConcGCThreads (default ParallelGCThreads / 4)
• Note: may observe an increase in CPU utilization
Call to Action (vi)
• If marking cycles are running too frequently
• Look at space reclaimed in the mixed GC cycles after a marking
cycle completes
• If you’re reclaiming (very) little space per marking cycle & mixed
GC cycle
• Increase -XX:InitiatingHeapOccupancyPercent
• Note: the more frequent the marking cycles & mixed GC
cycles, the higher the CPU utilization you may observe
Call to Action …. Last One
• Parallel Reference Processing
• Look for high “[Ref Proc:” times or high reference processing
times in remark pauses --- should be a fraction of the overall GC
pause time
[Other: 24.8 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 16.9 ms]
Or, a remark pause with very high reference processing time:
[GC remark 17212.791: [GC ref-proc, 7.8638730 secs], 8.4352460 secs]
• Enable -XX:+ParallelRefProcEnabled to address this
Additional Info
• JavaOne Sessions
• G1 Collector: Current and Future Adaptability and Ergonomics
• If you missed it, get a recording!
• Tuning Low-Pause Garbage Collection (Hands on Lab)
• If you missed it, get the materials!
Additional Info
• GC Analysis Tools
• JClarity has some very good GC analysis tools
• Currently adding G1 analysis
• http://www.jclarity.com
• We most often use awk, perl, specialized JFreeChart and
spreadsheets :-O
If you can,
share observations and GC logs at
hotspot-gc-use@openjdk.java.net
Happy G1 GC Trails!
Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Advanced Tuning
1 of 91

Recommended

Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ... by
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...
Garbage First Garbage Collector (G1 GC): Current and Future Adaptability and ...Monica Beckwith
7.2K views58 slides
G1 Garbage Collector: Details and Tuning by
G1 Garbage Collector: Details and TuningG1 Garbage Collector: Details and Tuning
G1 Garbage Collector: Details and TuningSimone Bordet
18.7K views66 slides
Enabling Vectorized Engine in Apache Spark by
Enabling Vectorized Engine in Apache SparkEnabling Vectorized Engine in Apache Spark
Enabling Vectorized Engine in Apache SparkKazuaki Ishizaki
830 views58 slides
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6 by
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6
Garbage First Garbage Collection (G1 GC) #jjug_ccc #ccc_cd6Yuji Kubota
74.8K views86 slides
Autoscaling Flink with Reactive Mode by
Autoscaling Flink with Reactive ModeAutoscaling Flink with Reactive Mode
Autoscaling Flink with Reactive ModeFlink Forward
926 views17 slides
Tuning Apache Kafka Connectors for Flink.pptx by
Tuning Apache Kafka Connectors for Flink.pptxTuning Apache Kafka Connectors for Flink.pptx
Tuning Apache Kafka Connectors for Flink.pptxFlink Forward
430 views54 slides

More Related Content

What's hot

Introduction to Redis by
Introduction to RedisIntroduction to Redis
Introduction to RedisDvir Volk
121.1K views24 slides
Practical learnings from running thousands of Flink jobs by
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobsFlink Forward
269 views18 slides
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data by
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataProblems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataJignesh Shah
1.7K views36 slides
PostgreSQL and Benchmarks by
PostgreSQL and BenchmarksPostgreSQL and Benchmarks
PostgreSQL and BenchmarksJignesh Shah
5.3K views38 slides
Hive on Tezのベストプラクティス by
Hive on TezのベストプラクティスHive on Tezのベストプラクティス
Hive on TezのベストプラクティスYahoo!デベロッパーネットワーク
3.5K views40 slides
Stephan Ewen - Experiences running Flink at Very Large Scale by
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large ScaleVerverica
3.5K views76 slides

What's hot(20)

Introduction to Redis by Dvir Volk
Introduction to RedisIntroduction to Redis
Introduction to Redis
Dvir Volk121.1K views
Practical learnings from running thousands of Flink jobs by Flink Forward
Practical learnings from running thousands of Flink jobsPractical learnings from running thousands of Flink jobs
Practical learnings from running thousands of Flink jobs
Flink Forward269 views
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data by Jignesh Shah
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte DataProblems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Problems with PostgreSQL on Multi-core Systems with MultiTerabyte Data
Jignesh Shah1.7K views
PostgreSQL and Benchmarks by Jignesh Shah
PostgreSQL and BenchmarksPostgreSQL and Benchmarks
PostgreSQL and Benchmarks
Jignesh Shah5.3K views
Stephan Ewen - Experiences running Flink at Very Large Scale by Ververica
Stephan Ewen -  Experiences running Flink at Very Large ScaleStephan Ewen -  Experiences running Flink at Very Large Scale
Stephan Ewen - Experiences running Flink at Very Large Scale
Ververica 3.5K views
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,... by confluent
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
Performance Tuning RocksDB for Kafka Streams' State Stores (Dhruba Borthakur,...
confluent6.3K views
Iceberg: A modern table format for big data (Strata NY 2018) by Ryan Blue
Iceberg: A modern table format for big data (Strata NY 2018)Iceberg: A modern table format for big data (Strata NY 2018)
Iceberg: A modern table format for big data (Strata NY 2018)
Ryan Blue2K views
Top 5 Mistakes to Avoid When Writing Apache Spark Applications by Cloudera, Inc.
Top 5 Mistakes to Avoid When Writing Apache Spark ApplicationsTop 5 Mistakes to Avoid When Writing Apache Spark Applications
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc.127.8K views
Thrift vs Protocol Buffers vs Avro - Biased Comparison by Igor Anishchenko
Thrift vs Protocol Buffers vs Avro - Biased ComparisonThrift vs Protocol Buffers vs Avro - Biased Comparison
Thrift vs Protocol Buffers vs Avro - Biased Comparison
Igor Anishchenko240.8K views
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase by HBaseCon
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBaseHBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon 2015: Taming GC Pauses for Large Java Heap in HBase
HBaseCon14.6K views
Producer Performance Tuning for Apache Kafka by Jiangjie Qin
Producer Performance Tuning for Apache KafkaProducer Performance Tuning for Apache Kafka
Producer Performance Tuning for Apache Kafka
Jiangjie Qin44.6K views
Linux Performance Analysis: New Tools and Old Secrets by Brendan Gregg
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
Brendan Gregg604K views
Dynamic Rule-based Real-time Market Data Alerts by Flink Forward
Dynamic Rule-based Real-time Market Data AlertsDynamic Rule-based Real-time Market Data Alerts
Dynamic Rule-based Real-time Market Data Alerts
Flink Forward756 views
Where is my bottleneck? Performance troubleshooting in Flink by Flink Forward
Where is my bottleneck? Performance troubleshooting in FlinkWhere is my bottleneck? Performance troubleshooting in Flink
Where is my bottleneck? Performance troubleshooting in Flink
Flink Forward540 views
Webinar: Deep Dive on Apache Flink State - Seth Wiesman by Ververica
Webinar: Deep Dive on Apache Flink State - Seth WiesmanWebinar: Deep Dive on Apache Flink State - Seth Wiesman
Webinar: Deep Dive on Apache Flink State - Seth Wiesman
Ververica 1.2K views
YugaByte DB Internals - Storage Engine and Transactions by Yugabyte
YugaByte DB Internals - Storage Engine and Transactions YugaByte DB Internals - Storage Engine and Transactions
YugaByte DB Internals - Storage Engine and Transactions
Yugabyte2.6K views
Building a fully managed stream processing platform on Flink at scale for Lin... by Flink Forward
Building a fully managed stream processing platform on Flink at scale for Lin...Building a fully managed stream processing platform on Flink at scale for Lin...
Building a fully managed stream processing platform on Flink at scale for Lin...
Flink Forward856 views
Jvm tuning for low latency application & Cassandra by Quentin Ambard
Jvm tuning for low latency application & CassandraJvm tuning for low latency application & Cassandra
Jvm tuning for low latency application & Cassandra
Quentin Ambard3.8K views
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022 by HostedbyConfluent
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
Introducing KRaft: Kafka Without Zookeeper With Colin McCabe | Current 2022
HostedbyConfluent1.1K views

Viewers also liked

Concurrent Mark-Sweep Garbage Collection #jjug_ccc by
Concurrent Mark-Sweep Garbage Collection #jjug_cccConcurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_cccYuji Kubota
115.2K views93 slides
Jdk9で変更になる(かも知れない)jvmオプションの標準設定 by
Jdk9で変更になる(かも知れない)jvmオプションの標準設定Jdk9で変更になる(かも知れない)jvmオプションの標準設定
Jdk9で変更になる(かも知れない)jvmオプションの標準設定Kazuyuki Nakamura
1.6K views27 slides
Metaspace by
MetaspaceMetaspace
MetaspaceYasumasa Suenaga
24.9K views56 slides
第六回渋谷Java Java8のJVM監視を考える by
第六回渋谷Java Java8のJVM監視を考える第六回渋谷Java Java8のJVM監視を考える
第六回渋谷Java Java8のJVM監視を考えるchonaso
15.4K views22 slides
Solving Multi-tenancy and G1GC in Apache HBase by
Solving Multi-tenancy and G1GC in Apache HBase Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase HBaseCon
2.2K views46 slides
GCについて by
GCについてGCについて
GCについてcactusman
12.5K views27 slides

Viewers also liked(10)

Concurrent Mark-Sweep Garbage Collection #jjug_ccc by Yuji Kubota
Concurrent Mark-Sweep Garbage Collection #jjug_cccConcurrent Mark-Sweep Garbage Collection #jjug_ccc
Concurrent Mark-Sweep Garbage Collection #jjug_ccc
Yuji Kubota115.2K views
Jdk9で変更になる(かも知れない)jvmオプションの標準設定 by Kazuyuki Nakamura
Jdk9で変更になる(かも知れない)jvmオプションの標準設定Jdk9で変更になる(かも知れない)jvmオプションの標準設定
Jdk9で変更になる(かも知れない)jvmオプションの標準設定
Kazuyuki Nakamura1.6K views
第六回渋谷Java Java8のJVM監視を考える by chonaso
第六回渋谷Java Java8のJVM監視を考える第六回渋谷Java Java8のJVM監視を考える
第六回渋谷Java Java8のJVM監視を考える
chonaso15.4K views
Solving Multi-tenancy and G1GC in Apache HBase by HBaseCon
Solving Multi-tenancy and G1GC in Apache HBase Solving Multi-tenancy and G1GC in Apache HBase
Solving Multi-tenancy and G1GC in Apache HBase
HBaseCon2.2K views
GCについて by cactusman
GCについてGCについて
GCについて
cactusman12.5K views
JVM のいろはにほ #javajo by Yuji Kubota
JVM のいろはにほ #javajoJVM のいろはにほ #javajo
JVM のいろはにほ #javajo
Yuji Kubota27.3K views
java.lang.OutOfMemoryError #渋谷java by Yuji Kubota
java.lang.OutOfMemoryError #渋谷javajava.lang.OutOfMemoryError #渋谷java
java.lang.OutOfMemoryError #渋谷java
Yuji Kubota37.7K views
楽して JVM を学びたい #jjug by Yuji Kubota
楽して JVM を学びたい #jjug楽して JVM を学びたい #jjug
楽して JVM を学びたい #jjug
Yuji Kubota23.8K views

Similar to Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Advanced Tuning

Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC by
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCErik Krogen
1.1K views20 slides
淺談 Java GC 原理、調教和 新發展 by
淺談 Java GC 原理、調教和新發展淺談 Java GC 原理、調教和新發展
淺談 Java GC 原理、調教和 新發展Leon Chen
12.2K views80 slides
ZGC-SnowOne.pdf by
ZGC-SnowOne.pdfZGC-SnowOne.pdf
ZGC-SnowOne.pdfMonica Beckwith
99 views44 slides
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz... by
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Spark Summit
6K views24 slides
hbaseconasia2017: HBase Practice At XiaoMi by
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMiHBaseCon
1.8K views45 slides
GC Tuning Confessions Of A Performance Engineer by
GC Tuning Confessions Of A Performance EngineerGC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance EngineerMonica Beckwith
12.2K views71 slides

Similar to Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Advanced Tuning(20)

Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC by Erik Krogen
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GCHadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Hadoop Meetup Jan 2019 - Dynamometer and a Case Study in NameNode GC
Erik Krogen1.1K views
淺談 Java GC 原理、調教和 新發展 by Leon Chen
淺談 Java GC 原理、調教和新發展淺談 Java GC 原理、調教和新發展
淺談 Java GC 原理、調教和 新發展
Leon Chen12.2K views
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz... by Spark Summit
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Taming GC Pauses for Humongous Java Heaps in Spark Graph Computing-(Eric Kacz...
Spark Summit6K views
hbaseconasia2017: HBase Practice At XiaoMi by HBaseCon
hbaseconasia2017: HBase Practice At XiaoMihbaseconasia2017: HBase Practice At XiaoMi
hbaseconasia2017: HBase Practice At XiaoMi
HBaseCon1.8K views
GC Tuning Confessions Of A Performance Engineer by Monica Beckwith
GC Tuning Confessions Of A Performance EngineerGC Tuning Confessions Of A Performance Engineer
GC Tuning Confessions Of A Performance Engineer
Monica Beckwith12.2K views
Am I reading GC logs Correctly? by Tier1 App
Am I reading GC logs Correctly?Am I reading GC logs Correctly?
Am I reading GC logs Correctly?
Tier1 App2K views
Kafka to the Maxka - (Kafka Performance Tuning) by DataWorks Summit
Kafka to the Maxka - (Kafka Performance Tuning)Kafka to the Maxka - (Kafka Performance Tuning)
Kafka to the Maxka - (Kafka Performance Tuning)
DataWorks Summit6.4K views
Мониторинг. Опять, rootconf 2016 by Vsevolod Polyakov
Мониторинг. Опять, rootconf 2016Мониторинг. Опять, rootconf 2016
Мониторинг. Опять, rootconf 2016
Vsevolod Polyakov587 views
Slices Of Performance in Java - Oleksandr Bodnar by GlobalLogic Ukraine
Slices Of Performance in Java - Oleksandr BodnarSlices Of Performance in Java - Oleksandr Bodnar
Slices Of Performance in Java - Oleksandr Bodnar
Java Garbage Collectors – Moving to Java7 Garbage First (G1) Collector by Gurpreet Sachdeva
Java Garbage Collectors – Moving to Java7 Garbage First (G1) CollectorJava Garbage Collectors – Moving to Java7 Garbage First (G1) Collector
Java Garbage Collectors – Moving to Java7 Garbage First (G1) Collector
Gurpreet Sachdeva1.8K views
Toronto meetup 20190917 by Bill Liu
Toronto meetup 20190917Toronto meetup 20190917
Toronto meetup 20190917
Bill Liu393 views
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah... by Lucidworks
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
High Performance Solr and JVM Tuning Strategies used for MapQuest’s Search Ah...
Lucidworks13.1K views
JVM @ Taobao - QCon Hangzhou 2011 by Kris Mok
JVM @ Taobao - QCon Hangzhou 2011JVM @ Taobao - QCon Hangzhou 2011
JVM @ Taobao - QCon Hangzhou 2011
Kris Mok5.8K views
G1 Garbage Collector - Big Heaps and Low Pauses? by C2B2 Consulting
G1 Garbage Collector - Big Heaps and Low Pauses?G1 Garbage Collector - Big Heaps and Low Pauses?
G1 Garbage Collector - Big Heaps and Low Pauses?
C2B2 Consulting16.1K views
SOLR Power FTW: short version by Alex Pinkin
SOLR Power FTW: short versionSOLR Power FTW: short version
SOLR Power FTW: short version
Alex Pinkin3.4K views

More from Monica Beckwith

A G1GC Saga-KCJUG.pptx by
A G1GC Saga-KCJUG.pptxA G1GC Saga-KCJUG.pptx
A G1GC Saga-KCJUG.pptxMonica Beckwith
135 views55 slides
QCon London.pdf by
QCon London.pdfQCon London.pdf
QCon London.pdfMonica Beckwith
136 views33 slides
Enabling Java: Windows on Arm64 - A Success Story! by
Enabling Java: Windows on Arm64 - A Success Story!Enabling Java: Windows on Arm64 - A Success Story!
Enabling Java: Windows on Arm64 - A Success Story!Monica Beckwith
193 views32 slides
Applying Concurrency Cookbook Recipes to SPEC JBB by
Applying Concurrency Cookbook Recipes to SPEC JBBApplying Concurrency Cookbook Recipes to SPEC JBB
Applying Concurrency Cookbook Recipes to SPEC JBBMonica Beckwith
302 views43 slides
Intro to Garbage Collection by
Intro to Garbage CollectionIntro to Garbage Collection
Intro to Garbage CollectionMonica Beckwith
199 views33 slides
OpenJDK Concurrent Collectors by
OpenJDK Concurrent CollectorsOpenJDK Concurrent Collectors
OpenJDK Concurrent CollectorsMonica Beckwith
396 views51 slides

More from Monica Beckwith(17)

Enabling Java: Windows on Arm64 - A Success Story! by Monica Beckwith
Enabling Java: Windows on Arm64 - A Success Story!Enabling Java: Windows on Arm64 - A Success Story!
Enabling Java: Windows on Arm64 - A Success Story!
Monica Beckwith193 views
Applying Concurrency Cookbook Recipes to SPEC JBB by Monica Beckwith
Applying Concurrency Cookbook Recipes to SPEC JBBApplying Concurrency Cookbook Recipes to SPEC JBB
Applying Concurrency Cookbook Recipes to SPEC JBB
Monica Beckwith302 views
OPENJDK: IN THE NEW AGE OF CONCURRENT GARBAGE COLLECTORS by Monica Beckwith
OPENJDK: IN THE NEW AGE OF CONCURRENT GARBAGE COLLECTORSOPENJDK: IN THE NEW AGE OF CONCURRENT GARBAGE COLLECTORS
OPENJDK: IN THE NEW AGE OF CONCURRENT GARBAGE COLLECTORS
Monica Beckwith278 views
The Performance Engineer's Guide to Java (HotSpot) Virtual Machine by Monica Beckwith
The Performance Engineer's Guide to Java (HotSpot) Virtual MachineThe Performance Engineer's Guide to Java (HotSpot) Virtual Machine
The Performance Engineer's Guide to Java (HotSpot) Virtual Machine
Monica Beckwith1.9K views
Garbage First Garbage Collector: Where the Rubber Meets the Road! by Monica Beckwith
Garbage First Garbage Collector: Where the Rubber Meets the Road!Garbage First Garbage Collector: Where the Rubber Meets the Road!
Garbage First Garbage Collector: Where the Rubber Meets the Road!
Monica Beckwith2K views
JFokus Java 9 contended locking performance by Monica Beckwith
JFokus Java 9 contended locking performanceJFokus Java 9 contended locking performance
JFokus Java 9 contended locking performance
Monica Beckwith1.3K views
Java Performance Engineer's Survival Guide by Monica Beckwith
Java Performance Engineer's Survival GuideJava Performance Engineer's Survival Guide
Java Performance Engineer's Survival Guide
Monica Beckwith3.7K views
The Performance Engineer's Guide To (OpenJDK) HotSpot Garbage Collection - Th... by Monica Beckwith
The Performance Engineer's Guide To (OpenJDK) HotSpot Garbage Collection - Th...The Performance Engineer's Guide To (OpenJDK) HotSpot Garbage Collection - Th...
The Performance Engineer's Guide To (OpenJDK) HotSpot Garbage Collection - Th...
Monica Beckwith2.3K views
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation by Monica Beckwith
The Performance Engineer's Guide To HotSpot Just-in-Time CompilationThe Performance Engineer's Guide To HotSpot Just-in-Time Compilation
The Performance Engineer's Guide To HotSpot Just-in-Time Compilation
Monica Beckwith1.7K views
Java 9: The (G1) GC Awakens! by Monica Beckwith
Java 9: The (G1) GC Awakens!Java 9: The (G1) GC Awakens!
Java 9: The (G1) GC Awakens!
Monica Beckwith39.7K views
Game of Performance: A Song of JIT and GC by Monica Beckwith
Game of Performance: A Song of JIT and GCGame of Performance: A Song of JIT and GC
Game of Performance: A Song of JIT and GC
Monica Beckwith1.5K views
Way Improved :) GC Tuning Confessions - presented at JavaOne2015 by Monica Beckwith
Way Improved :) GC Tuning Confessions - presented at JavaOne2015Way Improved :) GC Tuning Confessions - presented at JavaOne2015
Way Improved :) GC Tuning Confessions - presented at JavaOne2015
Monica Beckwith4.1K views
GC Tuning Confessions Of A Performance Engineer - Improved :) by Monica Beckwith
GC Tuning Confessions Of A Performance Engineer - Improved :)GC Tuning Confessions Of A Performance Engineer - Improved :)
GC Tuning Confessions Of A Performance Engineer - Improved :)
Monica Beckwith2.7K views

Recently uploaded

Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesShapeBlue
210 views15 slides
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueShapeBlue
222 views23 slides
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...ShapeBlue
144 views12 slides
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ... by
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...ShapeBlue
146 views15 slides
DRBD Deep Dive - Philipp Reisner - LINBIT by
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBITShapeBlue
140 views21 slides
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue by
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueShapeBlue
94 views13 slides

Recently uploaded(20)

Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates by ShapeBlue
Keynote Talk: Open Source is Not Dead - Charles Schulz - VatesKeynote Talk: Open Source is Not Dead - Charles Schulz - Vates
Keynote Talk: Open Source is Not Dead - Charles Schulz - Vates
ShapeBlue210 views
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue by ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlueWhat’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
What’s New in CloudStack 4.19 - Abhishek Kumar - ShapeBlue
ShapeBlue222 views
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ... by ShapeBlue
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
Backup and Disaster Recovery with CloudStack and StorPool - Workshop - Venko ...
ShapeBlue144 views
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ... by ShapeBlue
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...
Backroll, News and Demo - Pierre Charton, Matthias Dhellin, Ousmane Diarra - ...
ShapeBlue146 views
DRBD Deep Dive - Philipp Reisner - LINBIT by ShapeBlue
DRBD Deep Dive - Philipp Reisner - LINBITDRBD Deep Dive - Philipp Reisner - LINBIT
DRBD Deep Dive - Philipp Reisner - LINBIT
ShapeBlue140 views
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue by ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlueCloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
CloudStack Managed User Data and Demo - Harikrishna Patnala - ShapeBlue
ShapeBlue94 views
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R... by ShapeBlue
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
Setting Up Your First CloudStack Environment with Beginners Challenges - MD R...
ShapeBlue132 views
Business Analyst Series 2023 - Week 4 Session 7 by DianaGray10
Business Analyst Series 2023 -  Week 4 Session 7Business Analyst Series 2023 -  Week 4 Session 7
Business Analyst Series 2023 - Week 4 Session 7
DianaGray10126 views
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue by ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlueMigrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
Migrating VMware Infra to KVM Using CloudStack - Nicolas Vazquez - ShapeBlue
ShapeBlue176 views
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava... by ShapeBlue
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
Centralized Logging Feature in CloudStack using ELK and Grafana - Kiran Chava...
ShapeBlue101 views
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda... by ShapeBlue
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
Hypervisor Agnostic DRS in CloudStack - Brief overview & demo - Vishesh Jinda...
ShapeBlue120 views
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti... by ShapeBlue
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
DRaaS using Snapshot copy and destination selection (DRaaS) - Alexandre Matti...
ShapeBlue98 views
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive by Network Automation Forum
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLiveAutomating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Automating a World-Class Technology Conference; Behind the Scenes of CiscoLive
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue by ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlueElevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
Elevating Privacy and Security in CloudStack - Boris Stoyanov - ShapeBlue
ShapeBlue179 views
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT by ShapeBlue
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBITUpdates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
Updates on the LINSTOR Driver for CloudStack - Rene Peinthor - LINBIT
ShapeBlue166 views
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue by ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
2FA and OAuth2 in CloudStack - Andrija Panić - ShapeBlue
ShapeBlue103 views
The Role of Patterns in the Era of Large Language Models by Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li80 views
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O... by ShapeBlue
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
Declarative Kubernetes Cluster Deployment with Cloudstack and Cluster API - O...
ShapeBlue88 views

Garbage First Garbage Collector (G1 GC) - Migration to, Expectations and Advanced Tuning

  • 1. Garbage-First Garbage Collector Charlie Hunt Monica Beckwith John Cuthbertson
  • 2. What to Expect Learn what you need to know to experience nirvana in the evaluation of G1 GC even if your are migrating from Parallel GC to G1, or CMS GC to G1 GC You also get a walk through of some case study data
  • 4. About us • Charlie Hunt • Architect, Performance Engineering, Salesforce.com • Monica Beckwith • Performance Architect, Servergy • John Cuthbertson • GC Engineer, Azul Systems
  • 6. Three Topics • Migration / Evaluation Prep • Parallel GC to G1 GC • CMS GC to G1 GC
  • 7. And, at the end • Summary of things to watch for during your G1 evaluation
  • 9. Prep Work • Want less stress in your performance testing?
  • 10. Prep Work • Want less stress in your performance testing? • Define success in terms of throughput, latency, footprint and capacity • Approach • Get your application stakeholders together and start asking questions!
  • 11. Throughput • What’s the expected throughput? • Can you fall below that expected throughput? • How long can you fall below that expected throughput? • What throughput can you never fall below? • How is throughput measured? • Txns per sec, messages per sec, one metric, multiple metrics? • Where is throughput measured? • Application server? At the client?
  • 12. Latency • What’s the expected latency? • Can you exceed the expected latency? • By how much, and for how long? • What latency should you never exceed? • How is latency measured? • Response time, percentile(s), one metric, multiple metrics? • Where is latency measured? • Application server? At the client?
  • 13. Footprint • How much RAM / memory can you use? • What amount of RAM / memory should you never exceed? • How is memory consumption measured? • What tools? • At what time during application execution?
  • 14. Capacity • What’s the expected load? • i.e. concurrent users, number of concurrent txns, injection rate • How much additional (load) capacity do you need? • i.e. concurrent users, number of concurrent txns, injection rate • Capacity metrics are often some combination of • # of concurrent users, an injection rate, number of concurrent txns in flight and/or CPU utilization
  • 15. One last one …. Power Consumption • Power consumption may also be a requirement • What’s the expected power consumption? • FYI - power consumption BIOS settings can impact throughput, latency and capacity • Can observe as much as 15% difference in latency and CPU utilization
  • 16. Experimental Design • Now you’ve got the inputs to begin experimental design • Do you have a workload that be configured to test the performance requirements? • You may have to develop or enhance a workload • Do you have an environment that can test the requirements you’ve gathered?
  • 17. Share the Experiment Design(s) • Once you have the set of experiments to test defined • Share those plans with your stakeholders • It’s a great way to validate your testing for the right things in an appropriate manner • You’ll likely get feedback and tweak your experiment’s design(s) • Most of all, you get agreement on what’s being tested, and how • Saves you a lot of hassle!
  • 19. When to consider migration • When Parallel GC can’t be tuned to meet latency goals • Usually due to Full GC pauses • Duration and/or frequency • When Parallel GC can’t meet footprint and latency goals • G1 will likely be able to run with smaller Java heap and shorter pauses than Parallel GC’s full GCs • Max load capacity will likely be lower with G1 • G1 has more overhead than Parallel GC
  • 20. When to consider migration • If you can tune Parallel GC to meet your throughput, latency, footprint and capacity goals • Stay with Parallel GC! • If you can tune Parallel GC to meet latency, footprint and capacity goals, it offers the best throughput of the HotSpot GCs
  • 21. Parallel GC Heap Sizing • Suppose an application using Parallel GC and these heap sizing options -Xms5g -Xmx5g -Xmn2g -XX:PermSize=700m -XX:MaxPermSize=700m -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4 -XX:+UseParallelOldGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  • 22. Does this make sense? • Updated for G1 -Xms5g -Xmx5g -Xmn2g -XX:PermSize=700m -XX:MaxPermSize=700m -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4 -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=45 (default is 45) -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  • 23. Remove all young gen size refining options • Updated for G1 -Xms5g -Xmx5g -Xmn2g -XX:PermSize=700m -XX:MaxPermSize=700m -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4 -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=45 (default is 45) -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  • 24. Example Application Requirements • At expected load • 2 second response time at 99th percentile • No response time above 10 seconds • 2x expected load • 2 second response time at 99th percentile • 6 GB RAM available to JVM
  • 25. Current Observations • With Parallel GC at expected capacity / load • Observing 2.5 second latency at 99th percentile • Above 2 second 99th percentile requirement • ~ 29% CPU utilization • 2x expected load not evaluated • Why bother? Can’t meet 99th percentile requirement for expected capacity / load!
  • 26. Again, our Parallel GC heap sizing -Xms5g -Xmx5g -Xmn2g -XX:PermSize=700m -XX:MaxPermSize=700m -XX:-UseAdaptiveSizePolicy -XX:SurvivorRatio=4 -XX:+UseParallelOldGC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  • 27. Again, our starting place for G1 GC -Xms5g -Xmx5g -XX:PermSize=700m -XX:MaxPermSize=700m -XX:+UseG1GC -XX:MaxGCPauseMillis=250 -XX:InitiatingHeapOccupancyPercent=45 -XX:+PrintGCDetails -XX:+PrintGCTimeStamps
  • 28. First G1 GC results • Initial observations • 99th percentile response time • G1 GC - 1.4 seconds • Parallel GC - 2.5 seconds • ~ 44% reduction seconds 0 0.5 1 1.5 2 2.5 Parallel GC Initial G1 GC 99th Percentile Response Time (lower is better) Parallel GC Initial G1 GC 99th percentile requirement
  • 29. First G1 GC results • Initial observations • CPU utilization • G1 GC - 39% • Parallel GC – 29% 0 10 20 30 40 50 60 70 80 90 100 Parallel GC Initial G1 GC CPU Utilization (lower is better) Parallel GC Initial G1 GC percent
Analysis
• Are we done? Should we next test for 2x capacity?
• No, go look at the raw G1 GC logs for potential improvement!!!
• Additionally, that jump in CPU utilization, 29% to 39%, is a bit concerning
• How will the app behave as the injection rate increases toward our capacity goal of 2x expected load?
G1 GC Analysis
• Starting point
• Look for “to-space overflow”, “to-space exhausted” and “Full GC” entries in the G1 GC logs (a scanning sketch follows below)
• The #1 goal is to avoid these with G1 GC!
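As a rough illustration of that first pass over a log, a minimal sketch (the class name, default log file name and exact match strings are assumptions; adapt them to your own log format):

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class G1LogScan {
    // Markers named on this slide; adjust to match your own logs.
    private static final String[] MARKERS =
        { "to-space overflow", "to-space exhausted", "Full GC" };

    public static void main(String[] args) throws IOException {
        String path = args.length > 0 ? args[0] : "gc.log"; // placeholder file name
        try (BufferedReader in = new BufferedReader(new FileReader(path))) {
            String line;
            int lineNo = 0;
            while ((line = in.readLine()) != null) {
                lineNo++;
                for (String marker : MARKERS) {
                    if (line.contains(marker)) {
                        System.out.printf("line %d [%s]: %s%n", lineNo, marker, line.trim());
                    }
                }
            }
        }
    }
}

Matching lines point you at the time periods worth inspecting in detail in the raw log.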
Evacuation failures (i)
• to-space overflow and to-space exhausted
• These are G1 region evacuation failures – situations where no G1 regions are available to evacuate live objects into
• Note: the region being evacuated could be an eden region, a survivor region or an old region – i.e. “to-space overflow” is not limited to a survivor space overflow situation like what can occur with other HotSpot GCs
Evacuation failures (ii)
• to-space overflow and to-space exhausted
• (Very) expensive events to deal with
• Any object that has already been evacuated must have its object references updated, and its region is tenured
• Any object that has not been evacuated must have its object references self-forwarded, and its region is tenured in place
• Evacuation failures are usually followed by a Full GC, which collects all regions and is single threaded
G1 GC Analysis
• What did we find? About every 25 minutes:
2216.206: [GC pause (young) (to-space overflow), 0.62894300 secs]
[Parallel Time: 239.4 ms]
…
2217.958: [Full GC 4802M->1571M(5120M), 3.0889520 secs]
[Times: user=4.34 sys=0.01, real=3.09 secs]
• This is not good!
G1 GC Analysis
• What next?
• Enable -XX:+PrintAdaptiveSizePolicy to see what adaptive decisions G1 is making
• Look at the GC log entries leading up to the to-space overflows, to-space exhausted events and Full GCs
G1 GC Analysis
• What did we find?
2219.546: [G1Ergonomics (Concurrent Cycles) request concurrent cycle initiation, reason: occupancy higher than threshold, occupancy: 2311061504 bytes, allocation request: 6440816 bytes, threshold: 2312110080 bytes (45.00 %), source: concurrent humongous allocation]
• 250+ of these in two hours!
• Resolution: avoid humongous allocations
G1 GC Analysis
• What next?
• Object allocations greater than or equal to 50% of G1’s region size are considered humongous allocations (illustrated below)
• Humongous object allocation & collection are not optimized in G1
• G1 region size is determined automatically at JVM launch time, based on the size of the Java heap (-Xms in 7u40 & later)
• But you can override it with -XX:G1HeapRegionSize=#
• The size must be a power of 2, ranging from 1 MB to 32 MB
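To make the 50% rule concrete, a small hypothetical example (class name and sizes are mine, chosen only for illustration):

public class HumongousDemo {
    public static void main(String[] args) {
        // Run with: java -XX:+UseG1GC -XX:G1HeapRegionSize=8m HumongousDemo
        // With 8 MB regions, any single allocation of 4 MB (50% of a region)
        // or more is treated as humongous by G1.
        byte[] humongous = new byte[4 * 1024 * 1024]; // at the 50% threshold -> humongous
        byte[] regular   = new byte[1024 * 1024];     // well under 50% -> normal allocation
        System.out.println(humongous.length + " / " + regular.length);
    }
}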
G1 GC Analysis
• What next (continued)?
• The object allocation from the GC log is 6440816 bytes, or about 6.4 MB
• We need a region size such that 6.4 MB is less than 50% of the region size, where the region size is a power of 2 between 1 MB and 32 MB
• Set -XX:G1HeapRegionSize=16m, since 16 MB * 50% = 8 MB, and 8 MB > 6.4 MB (sketched below)
• Note: -Xms & -Xmx should be aligned with the region size
• 5g == 5120m, and 5120m/16m == 320 regions, whereas 5000m/16m == 312.5 – adjust -Xms & -Xmx to gain alignment at 16m
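A sketch of that region sizing arithmetic (the class and method names are mine; this simply applies the 50%-of-region and power-of-two rules described on these slides, it is not anything G1 runs itself):

public class RegionSize {
    // Smallest power-of-two region size (1 MB..32 MB) where the given
    // allocation stays under 50% of the region, i.e. is not humongous.
    static long minRegionSizeFor(long allocationBytes) {
        for (long region = 1L << 20; region <= 32L << 20; region <<= 1) {
            if (allocationBytes < region / 2) {
                return region;
            }
        }
        return -1; // humongous even at the 32 MB maximum region size
    }

    public static void main(String[] args) {
        long alloc = 6_440_816L; // allocation request from the GC log above
        long region = minRegionSizeFor(alloc);
        System.out.println("-XX:G1HeapRegionSize=" + (region >> 20) + "m");
        // Alignment check: 5g == 5120 MB, and 5120 MB / 16 MB == 320 regions
        System.out.println("regions in a 5g heap: " + ((5L * 1024) / (region >> 20)));
    }
}

Run as-is it prints -XX:G1HeapRegionSize=16m, matching the choice above.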
Updated G1 GC Options
-Xms5g -Xmx5g
-XX:PermSize=700m -XX:MaxPermSize=700m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=250
-XX:InitiatingHeapOccupancyPercent=45
-XX:G1HeapRegionSize=16m
-XX:+PrintGCDetails -XX:+PrintGCTimeStamps
16 MB Region Size
• Observations with the G1 region size at 16 MB
• 99th percentile response time dropped another 25%
• G1 GC – 1.05 seconds (16 MB region size)
• G1 GC – 1.4 seconds (initial configuration)
• Parallel GC – 2.5 seconds
[Chart: 99th Percentile Response Time (lower is better) – Parallel GC vs. initial G1 GC vs. G1 with 16 MB regions, with the 99th percentile requirement marked]
16 MB Region Size
• Observations with the G1 region size at 16 MB
• CPU utilization dropped, but is still higher than Parallel GC
• G1 GC – 33% (16 MB region size)
• G1 GC – 39% (initial configuration)
• Parallel GC – 29%
[Chart: CPU Utilization (lower is better) – Parallel GC vs. initial G1 GC vs. G1 with 16 MB regions]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for some additional potential improvement!!!
• CPU utilization is still a little higher: 29% vs. 33%
• We still need to know how the app will behave at the 2x expected load capacity goal
G1 GC Analysis
• What did we find?
• No humongous allocations, no adaptive / ergonomic issues
• Looking at the amount of space reclaimed per marking & mixed GC cycle
• 300 MB to 900 MB per cycle, ~6% – 18% of the Java heap
• ~95% heap occupancy just prior to mixed GC cycles
• Seems pretty good
• There is a delicate balance between starting the marking cycle too early, wasting CPU cycles at the expense of capacity, versus starting too late and experiencing to-space overflow or to-space exhausted events
Analysis
• What about meeting the 99th percentile latency at the 2x expected load capacity goal?
• 99th percentile response time
• 1.2 seconds, ~150 ms higher than the 1.05 seconds at 1x
• CPU utilization
• 59%, ~1.8x increase over the 33% at 1x
[Charts: 99th Percentile Response Time (lower is better, with the 99th percentile requirement marked) and CPU Utilization for G1 GC – expected load vs. 2x expected load]
Analysis
• Are we done?
• Yes, for now ;-)
• Requirements met
• Some additional performance realized via GC analysis
• Stakeholders happy!
Migrating from CMS GC to G1 GC
When to consider migration
• When CMS GC cannot meet your latency goals
• Usually due to a fragmented old gen space, or not enough available Java heap
• When CMS GC has difficulty meeting footprint goals
• G1 GC will likely run with a smaller Java heap
• G1 GC may use more CPU
• G1 has more overhead than CMS GC, though not as much of a delta as G1 GC vs. Parallel GC
Example Application Requirements
• Expected load
• 1500 concurrent users, 2 txns/user per second
• 99th percentile – 250 ms
• Desire all response times to be less than 1 second
• 12 GB RAM per JVM
• 2x load capacity
• 3000 users, 2 txns/user per second
Current CMS GC Configuration
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
Does this make sense?
• Updated for G1:
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=50
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
CMS Configuration Updated for G1
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200 (default is 200)
-XX:InitiatingHeapOccupancyPercent=50 (default is 45)
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
Current Observations
• With CMS GC
• At expected capacity
• Meeting the 99th percentile response time requirement of 250 ms: 240 ms
• Not meeting the max response time of less than 1 second: ~5 seconds
• At 2x expected capacity
• Not meeting the 99th percentile (250 ms) or the max response time < 1 sec
• 99th percentile – 310 ms
• Max response time – ~8 seconds
Current Observations
• What did we find? CMS GC promotion failures. Here’s an example:
[ParNew (promotion failed) 1797568K->1775476K(1797568K), 0.7051600 secs]
[CMS: 3182978K->1773645K(8388608K), 5.2594980 secs]
4736752K->1773645K(10186176K), [CMS Perm : 52858K->52843K(65536K)], 5.9649620 secs]
[Times: user=7.43 sys=0.00, real=5.96 secs]
• This is not good!
Again, CMS GC Heap Sizing
-Xms10g -Xmx10g -Xmn2g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:InitialSurvivorRatio=5 -XX:SurvivorRatio=5
-XX:InitialTenuringThreshold=15 -XX:MaxTenuringThreshold=15
-XX:+UseConcMarkSweepGC -XX:+UseParNewGC
-XX:CMSInitiatingOccupancyFraction=60
-XX:+UseCMSInitiatingOccupancyOnly
-XX:+CMSClassUnloadingEnabled
-XX:ParallelGCThreads=12 -XX:ConcGCThreads=4
Again, our starting place for G1 GC
-Xms10g -Xmx10g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200 (default is 200)
-XX:InitiatingHeapOccupancyPercent=50 (default is 45)
First G1 GC results (response times)
• Initial observations
• 99th percentile at expected load
• CMS – 240 ms
• G1 – 78 ms
[Chart: 99th Percentile (lower is better) – CMS GC vs. G1 GC, with the 99th percentile requirement marked]
First G1 GC results (response times)
• Initial observations
• Max at expected load
• CMS – 5000 ms
• G1 – 582 ms
[Chart: Max Response Time (lower is better) – CMS GC vs. G1 GC, with the max response time requirement marked]
First G1 GC results (CPU utilization)
• Initial observations
• At expected load
• CMS – 29%
• G1 – 33%
[Chart: CPU Utilization (lower is better) – CMS GC vs. G1 GC]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for potential improvement!!!
• Additionally, the increase in CPU utilization (29% to 33%) may be a little concerning
• How will the app behave at 2x load capacity?
G1 2x Capacity (response times)
• Observations
• 99th percentile at 2x capacity
• CMS – 310 ms
• G1 – 103 ms
[Chart: 99th Percentile (lower is better) – CMS GC vs. G1 GC, with the 99th percentile requirement marked]
G1 2x Capacity (response times)
• Observations
• Max at 2x capacity
• CMS – 8000 ms
• G1 – 5960 ms
[Chart: Max Response Time (lower is better) – CMS GC vs. G1 GC, with the max response time requirement marked]
G1 2x Capacity (CPU utilization)
• Observations
• CPU at 2x capacity
• CMS – 59%
• G1 – 56%
[Chart: CPU Utilization (lower is better) – CMS GC vs. G1 GC]
G1 GC Analysis
• What next?
• Go look at the GC logs!!!
G1 GC Analysis
• What did we find? Excessive RSet update times:
[GC pause (young), 0.9929170 secs]
[Parallel Time: 988.0 ms, GC Workers: 10]
… <other GC info removed> …
[Update RS (ms): Min: 154.5, Avg: 237.4, Max: 980.2, Diff: 825.7, Sum: 2373.8]
• A very high RSet update time (note the Sum of 2373.8 ms ≈ the 237.4 ms average × 10 workers), and a very high pause time for a 200 ms pause time goal!
G1 GC Analysis
• What next? When RSet update times are high:
• Tune -XX:G1RSetUpdatingPauseTimePercent (default is 10)
• The target time to spend updating RSets during a GC evacuation pause, as a percent of the pause time goal
• A lower value pushes more work onto the G1 concurrent refinement threads
• A higher value pushes more work into the (stop the world) GC pause
• Tune -XX:G1ConcRefinementThreads
• Defaults to the value of -XX:ParallelGCThreads
G1 GC Analysis
• What next? (continued)
• The test system has 12 virtual processors
• Currently using the default -XX:ParallelGCThreads=10
• Change to -XX:ParallelGCThreads=12
• This will also set -XX:G1ConcRefinementThreads=12
• Decrease -XX:G1RSetUpdatingPauseTimePercent=5 (default is 10)
• Watch for a change / increase in CPU usage
Updated G1 GC JVM Options
-Xms10g -Xmx10g
-XX:PermSize=64m -XX:MaxPermSize=64m
-XX:+UseG1GC
-XX:MaxGCPauseMillis=200
-XX:InitiatingHeapOccupancyPercent=50
-XX:ParallelGCThreads=12
-XX:G1RSetUpdatingPauseTimePercent=5
Updated G1 GC Config (response times)
• Observations at 2x capacity
• 99th percentile
• CMS – 310 ms
• Initial G1 – 103 ms
• Updated G1 – 194 ms
[Chart: 99th Percentile (lower is better) – CMS vs. initial G1 vs. updated G1, with the 99th percentile requirement marked]
Updated G1 GC Config (response times)
• Observations at 2x capacity
• Max response time
• CMS – 8000 ms
• Initial G1 – 5960 ms
• Updated G1 – 548 ms
[Chart: Max Response Time (lower is better) – CMS vs. initial G1 vs. updated G1, with the max response time requirement marked]
Updated G1 GC Config (CPU utilization)
• Observations at 2x capacity
• CPU utilization
• CMS – 59%
• Initial G1 – 56%
• Updated G1 – 57%
[Chart: CPU Utilization (lower is better) – CMS vs. initial G1 vs. updated G1]
Analysis
• Are we done?
• No, go look at the raw G1 GC logs for some additional potential improvement!!!
G1 GC Analysis
• What did we find?
• No humongous allocations, no adaptive / ergonomic issues
• Looking at the amount of space reclaimed per marking & mixed GC cycle
• 1 GB to 2 GB per cycle, ~10% – 20% of the Java heap
• ~90% heap occupancy just prior to mixed GC cycles
• Seems pretty good
• There is a delicate balance between starting the marking cycle too early, wasting CPU cycles at the expense of capacity, versus starting too late and experiencing to-space exhausted or to-space overflow events
Analysis
• Are we done?
• Yes, for now ;-)
• Requirements met
• 99th percentile
• Max response time
• Both expected load and 2x expected load
• With lower CPU utilization!
• Stakeholders happy!
Call to Action (i)
• Understand the “success criteria”
• Understand when it makes sense to transition to, and evaluate, G1 GC
• Don’t over-tune!
• Start your G1 evaluation with minimal settings (see the sketch below)
• Initial and max heap size, pause time target and when to start the marking cycle
• Don’t use young gen sizing, i.e. -Xmn, -XX:SurvivorRatio, etc.
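A minimal starting command line in that spirit (the heap size, pause target and jar name are placeholders to adapt, not recommendations):

java -Xms4g -Xmx4g \
     -XX:+UseG1GC -XX:MaxGCPauseMillis=200 \
     -XX:+PrintGCDetails -XX:+PrintGCTimeStamps \
     -jar yourapp.jar

-XX:InitiatingHeapOccupancyPercent can be left at its default of 45 until GC analysis suggests otherwise.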
Call to Action (ii)
• Do GC analysis
• Use tools for a high level view and to identify troublesome time periods --- then look at the raw GC logs
• Look at several GC events before and after to see how things got into an undesirable state, what action G1 took, and how it recovered
Call to Action (iii)
• Do GC analysis
• If you see evacuation failures (to-space overflow, to-space exhausted or Full GC)
• Enable -XX:+PrintAdaptiveSizePolicy
• If you see humongous object allocations
• Increase -XX:G1HeapRegionSize
• If you see to-space exhausted, i.e. the marking cycle starting too late
• Remove young gen sizing constraints if they exist
• Decrease -XX:InitiatingHeapOccupancyPercent (default is 45)
• Increase the Java heap size, -Xms and -Xmx
Call to Action (iv)
• Do GC analysis
• If you see high RSet update times
• Increase -XX:ParallelGCThreads if not at the max budget of virtual processors per JVM
• This will also increase -XX:G1ConcRefinementThreads
• -XX:G1ConcRefinementThreads can also be set independently of -XX:ParallelGCThreads
• Decrease -XX:G1RSetUpdatingPauseTimePercent
• Lowers the time spent updating RSets during a GC evacuation pause
• Note: you may observe an increase in CPU utilization
Call to Action (iv) continued …
• To observe RSet update and RSet scan information
• Add these three JVM command line options
• -XX:+UnlockDiagnosticVMOptions
• -XX:+G1SummarizeRSetStats
• -XX:G1SummarizeRSetStatsPeriod=1
• Use these to identify whether some of the updating and coarsening of RSets is being pushed onto the mutator (application) threads
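For example, added to an existing G1 command line (the jar name is a placeholder; -XX:+UnlockDiagnosticVMOptions must precede the two diagnostic flags):

java -XX:+UseG1GC \
     -XX:+UnlockDiagnosticVMOptions \
     -XX:+G1SummarizeRSetStats -XX:G1SummarizeRSetStatsPeriod=1 \
     -jar yourapp.jar

With a period of 1, the RSet statistics are summarized at every GC.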
Call to Action (v)
• If the time from marking cycle start until mixed GCs begin is long
• Increase -XX:ConcGCThreads (the default is ParallelGCThreads / 4, e.g. 3 when -XX:ParallelGCThreads=12)
• Note: you may observe an increase in CPU utilization
Call to Action (vi)
• If marking cycles are running too frequently
• Look at the space reclaimed by the mixed GC cycles after a marking cycle completes
• If you’re reclaiming (very) little space per marking cycle & mixed GC cycle
• Increase -XX:InitiatingHeapOccupancyPercent
• Note: the more frequent the marking cycles & mixed GC cycles, the higher the CPU utilization you may observe
Call to Action …. Last One
• Parallel Reference Processing
• Look for high “[Ref Proc:” times, or high reference processing times in remark pauses --- these should be a fraction of the overall GC pause time
[Other: 24.8 ms]
[Choose CSet: 0.0 ms]
[Ref Proc: 16.9 ms]
Or,
[GC remark 17212.791: [GC ref-proc, 7.8638730 secs], 8.4352460 secs]
• Very high reference processing times
• Enable -XX:+ParallelRefProcEnabled to address this
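For context on why reference processing can dominate a pause, a contrived sketch (the class name and counts are mine, chosen only for illustration) that builds a large WeakReference population the collector must process:

import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;

public class RefProcPressure {
    public static void main(String[] args) {
        // Every WeakReference must be discovered and processed by the GC.
        // Reference processing is single threaded unless
        // -XX:+ParallelRefProcEnabled is set (in JDKs of this era), so a
        // population like this can stretch "[Ref Proc:" times noticeably.
        List<WeakReference<byte[]>> refs = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            refs.add(new WeakReference<>(new byte[16]));
        }
        System.out.println("weak refs created: " + refs.size());
        // Run with -XX:+UseG1GC -XX:+PrintGCDetails, with and without
        // -XX:+ParallelRefProcEnabled, and compare the reported times.
    }
}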
Additional Info
• JavaOne Sessions
• G1 Collector: Current and Future Adaptability and Ergonomics
• If you missed it, get a recording!
• Tuning Low-Pause Garbage Collection (Hands-on Lab)
• If you missed it, get the materials!
Additional Info
• GC Analysis Tools
• JClarity has some very good GC analysis tools, and is currently adding G1 analysis
• http://www.jclarity.com
• We most often use awk, perl, specialized JFreeChart and spreadsheets :-O
If you can, share observations and GC logs at hotspot-gc-use@openjdk.java.net
Happy G1 GC Trails!