Shenandoah GC: Java Without The Garbage Collection Hiccups (Christine Flood)

Just as a spoonful of sugar will cure your hiccups, running your JVM with -XX:+UseShenandoahGC will cure your Java garbage collection hiccups. Shenandoah GC is a new garbage collection algorithm developed for OpenJDK at Red Hat. It produces much better pause times than the currently available algorithms without a significant decrease in throughput. In this session, we'll explain how Shenandoah works and compare it to the currently available OpenJDK garbage collectors.

Published in: Software

  1. 1. Shenandoah Garbage Collector: Java without the GC Hiccups Christine H. Flood
  2. 2. A Cure for hiccups
  3. 3. Java Hiccup cure
  4. 4. Stop-The-World Evacuation: Java threads run, pause while the GC threads evacuate, then resume. Response time predictability disappears.
  5. 5. Shenandoah phases: Init Mark, Concurrent Mark, Final Mark, Concurrent Evacuation
  6. 6. SpecJBB2015
      | Algorithm  | Max-JOPS | Critical-JOPS | Total Pause Time | Average Pause Time | Max Pause Time |
      | Shenandoah | 71652    | 43472         | 106.66s          | 60.06ms            | 558.12ms       |
      | G1         | 80467    | 6199          | 545.70s          | 597.70ms           | 1709.61ms      |
      Intel platform: Brickland-EX. CPU: Broadwell-EX, QDF: QKT3B0, stepping (QS), 2.2GHz, 24 cores, 60MB shared cache, COD enabled. Intel(R) Xeon(R) CPU E7-8890 v4 @2.20GHz, 524288 MB memory, 1598 GB disk space.
  7. 7. Can we try it ourselves? ● Released in Fedora 24 ● May also be downloaded and built from – http://openjdk.java.net/projects/shenandoah/
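To confirm that the flag took effect, one option is to ask the running JVM which collectors it reports. The following is a minimal sketch using the standard `java.lang.management` API. The class name `ActiveGcCheck` is made up for illustration, and the collector names printed depend on the JVM build, so no expected output is shown.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: lists the collectors the running JVM reports.
class ActiveGcCheck {
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // Run on a Shenandoah-enabled build, for example:
        //   java -XX:+UseShenandoahGC ActiveGcCheck
        System.out.println(collectorNames());
    }
}
```

Running the same class with and without the flag and comparing the output shows whether the collector actually changed.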
  8. 8. How Does It Work?
  9. 9. Brief Intro to Compacting GCs Heap after several binary tree modifications Heap after Compaction And reclamation of Unreachable objects Two phases: 1) Trace 2) Compact
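The two phases can be illustrated on a toy heap. This is only a sketch under simplifying assumptions: the heap is an array of slots, each object is identified by its slot index, and `ToyMarkCompact` is a hypothetical name. It computes where each live object would slide to; a real compactor would also rewrite the references and move the data.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;

class ToyMarkCompact {
    // refs[i] lists the slots that object i points to.
    // Returns the new slot of each object, or -1 if it is unreachable.
    static int[] collect(int[][] refs, int[] roots) {
        int n = refs.length;
        boolean[] live = new boolean[n];

        // Phase 1: trace everything reachable from the roots.
        Deque<Integer> work = new ArrayDeque<>();
        for (int r : roots) work.push(r);
        while (!work.isEmpty()) {
            int obj = work.pop();
            if (live[obj]) continue;
            live[obj] = true;
            for (int child : refs[obj]) work.push(child);
        }

        // Phase 2: compact by sliding live objects toward slot 0.
        int[] newSlot = new int[n];
        Arrays.fill(newSlot, -1);
        int next = 0;
        for (int i = 0; i < n; i++) {
            if (live[i]) newSlot[i] = next++;
        }
        return newSlot;
    }
}
```

For example, with object 4 as the only root, 4 pointing to 0 and 0 pointing to 1, objects 2 and 3 are reclaimed and the three survivors end up in slots 0 to 2.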
  10. 10. Concurrent Tracing ● Solved Problem ● Snapshot At The Beginning (SATB) – Used by several OpenJDK GC algorithms ● CMS ● G1 ● Shenandoah
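The idea behind snapshot-at-the-beginning can be sketched with a pre-write barrier that records the reference about to be overwritten while marking is active. The code below is a single-threaded simulation, not the OpenJDK implementation; `SatbSketch`, `writeRef`, and `satbQueue` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;

class SatbSketch {
    static final class Node {
        Node ref;        // a single outgoing reference, for simplicity
        boolean marked;
    }

    static boolean markingActive = false;
    static final Deque<Node> satbQueue = new ArrayDeque<>();

    // Pre-write barrier: while marking runs, remember the old value
    // so objects reachable at the snapshot are still marked.
    static void writeRef(Node holder, Node newValue) {
        if (markingActive && holder.ref != null) {
            satbQueue.add(holder.ref);
        }
        holder.ref = newValue;
    }

    // Marks from a root, then drains whatever the barrier queued.
    static void mark(Node root) {
        Deque<Node> work = new ArrayDeque<>();
        work.push(root);
        while (!work.isEmpty() || !satbQueue.isEmpty()) {
            Node n = work.isEmpty() ? satbQueue.poll() : work.pop();
            if (n.marked) continue;
            n.marked = true;
            if (n.ref != null) work.push(n.ref);
        }
    }
}
```

If a reference is cut after marking starts but before the marker reaches it, the target still ends up marked because the barrier queued it. The cost is some floating garbage that survives until the next cycle.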
  11. 11. Concurrent Compaction ● Only in Shenandoah ● Move the Live Objects While the Java Threads are Running.
  12. 12. Shenandoah Heap: Region Based. Choose regions with the most garbage. [Diagram: heap regions labeled 80% live, empty, 100% live, empty, empty, empty, 50% live, 100% live, 20% live, 10% live]
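Choosing regions by garbage content can be sketched as a filter and a sort. The live percentages below come from the slide; the 50% cutoff and the names `CollectionSetSketch` and `choose` are assumptions made for illustration, not Shenandoah's actual heuristic.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class CollectionSetSketch {
    // livePercent[i] < 0 marks an empty region, which needs no evacuation.
    // Returns indices of regions at or below the cutoff, most garbage first.
    static List<Integer> choose(int[] livePercent, int maxLivePercent) {
        List<Integer> chosen = new ArrayList<>();
        for (int i = 0; i < livePercent.length; i++) {
            if (livePercent[i] >= 0 && livePercent[i] <= maxLivePercent) {
                chosen.add(i);
            }
        }
        chosen.sort(Comparator.comparingInt(i -> livePercent[i]));
        return chosen;
    }
}
```

With the slide's regions `{80, empty, 100, empty, empty, empty, 50, 100, 20, 10}` and a 50% cutoff, the 10%, 20%, and 50% live regions are picked in that order, since they free the most space for the least copying.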
  13. 13. Why is Concurrent Compaction Complicated? ● Java Thread 1 – Foo.x = 1 ● Java Thread 2 – Foo.y = 2 ● GC Thread - Copies Foo to Foo' ● Java Thread 3 – Foo.z = 3
  14. 14. What we want to happen when the GC thread copies Foo ● Before: T1, T2, and T3 all reference Foo ● After: T1, T2, and T3 all reference Foo'. But finding and updating all the references to Foo takes time.
  15. 15. What Shenandoah does ● Before: T1, T2, and T3 reference Foo, which carries an indirection pointer ● After: T1, T2, and T3 still reference Foo, whose indirection pointer now leads to Foo' (which carries its own indirection pointer). Almost as good, as long as all accesses go through the forwarding pointer.
  16. 16. Read Barriers [Diagram: the indirection pointer of Foo leads to Foo', which has its own indirection pointer] ● Without Read Barrier – Read(Foo + Field Offset) ● With Read Barrier – Read(Read(Foo - 8) + Field Offset)
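The indirection scheme can be modeled in plain Java by giving every object an explicit forwarding field. This is a sketch only: in the slides the pointer lives in the word before the object (offset -8), whereas here `forward` is an ordinary field, and `ForwardingSketch`, `readX`, and `evacuate` are hypothetical names.

```java
class ForwardingSketch {
    static final class Obj {
        Obj forward = this; // indirection pointer, initially the object itself
        int x;
    }

    // Read barrier: resolve through the indirection pointer, then read the field.
    static int readX(Obj ref) {
        return ref.forward.x;
    }

    // GC side: copy the object, then redirect the old copy's indirection pointer.
    static Obj evacuate(Obj from) {
        Obj to = new Obj();
        to.x = from.x;
        from.forward = to;
        return to;
    }
}
```

After `evacuate(foo)`, any thread still holding the old reference sees the new copy's fields through `readX(foo)`, so references can be updated lazily.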
  17. 17. Read Barriers: Reading a Field
      ● Without Shenandoah
          0x00007fffe1102cd1: mov 0x10(%rsi),%rsi    ;*getfield value
                                                     ; - java.lang.String::equals@20 (line 982)
      ● With Shenandoah
          0x00007fffe1102ccd: mov -0x8(%rsi),%rsi    ← read the contents of the indirection pointer for the address contained in register rsi back into rsi
          0x00007fffe1102cd1: mov 0x10(%rsi),%rsi    ;*getfield value
                                                     ; - java.lang.String::equals@20 (line 982)
      Smart compiler will fill delay slots
  18. 18. But there is still a race condition ● Java Thread: reads ResolveLocation(Foo – 0x8) … writes to Foo ● GC Thread: … copies Foo to Foo' ● Solution: copying write barriers. ● Java threads aid in evacuation: they never write to the old copy of an object targeted for evacuation, and copy it first instead.
  19. 19. Write Barrier
          0x00007fffe1110318: movabs $0x7fffec0b92c0,%rax
          0x00007fffe1110322: mov (%rax,%rbx,1),%al
          0x00007fffe1110325: test $0x1,%al              ← evacuation in progress?
          0x00007fffe1110328: je 0x00007fffe1110339      ← if not, jump to putfield
          0x00007fffe111032e: xchg %rdi,%rax
          0x00007fffe1110331: callq 0x00007fffe10ffd20   ; {runtime_call} ← else make a call out to the runtime to copy the object to an evacuation region
          0x00007fffe1110336: xchg %rax,%rdi
          0x00007fffe1110339: mov %esi,0x10(%rdi)        ;*putfield count
                                                         ; - java.util.Hashtable::addEntry@83 (line 436)
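The same logic can be sketched at the Java level: check whether evacuation is in progress, copy the object if it is targeted and not yet copied, then write to the copy. This is an illustrative model, not the runtime's code. The names `WriteBarrierSketch`, `resolveForWrite`, and `inCollectionSet` are hypothetical, and the use of a compare-and-set to install the forwarding pointer is an assumption about how racing copiers could be reconciled.

```java
import java.util.concurrent.atomic.AtomicReference;

class WriteBarrierSketch {
    static final class Obj {
        final AtomicReference<Obj> forward = new AtomicReference<>();
        volatile int count;
        boolean inCollectionSet; // true if the object's region is being evacuated

        Obj() {
            forward.set(this); // initially the object forwards to itself
        }
    }

    static volatile boolean evacuationInProgress = false;

    // Returns the copy that is safe to write to.
    static Obj resolveForWrite(Obj ref) {
        Obj current = ref.forward.get();
        if (!evacuationInProgress || !current.inCollectionSet) {
            return current; // fast path: nothing to copy
        }
        Obj copy = new Obj();
        copy.count = current.count;
        // Only one thread installs its copy; a loser uses the winner's copy.
        if (current.forward.compareAndSet(current, copy)) {
            return copy;
        }
        return current.forward.get();
    }

    // Models the putfield on `count` from the listing above.
    static void putCount(Obj ref, int value) {
        resolveForWrite(ref).count = value;
    }
}
```

Because the write always lands on the resolved copy, a concurrent GC copy of the same object cannot lose the update, which is the race the previous slide describes.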
  20. 20. Aren't Those Barriers Expensive?
  21. 21. So, what do these barriers cost? ● Not as much as you might think…. – Barrier Optimizations ● New Objects ● Immutable Fields ● Array Size ● Class Pointers ● Read after Read ● Read after Write ● Hoisting
  22. 22. We ran several DaCapo benchmarks without any GC activity
      | Benchmark | Shenandoah | G1     | Percentage Overhead |
      | Avrora    | 2096ms     | 2052ms | 2.1%                |
      | FOP       | 1103ms     | 1044ms | 5.6%                |
      | LUIndex   | 861ms      | 832ms  | 3.5%                |
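The overhead column appears to be the extra run time relative to G1. Assuming that definition, which the slide does not state explicitly, the Avrora row works out as follows:

```latex
\[
\text{overhead} = \frac{t_{\text{Shenandoah}} - t_{\text{G1}}}{t_{\text{G1}}} \times 100\%,
\qquad
\text{Avrora: } \frac{2096 - 2052}{2052} \times 100\% \approx 2.1\%
\]
```

Since no GC activity occurred in these runs, the difference is attributable to the read and write barriers rather than to collection work.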
  23. 23. Why not generational?
  24. 24. Why not Generational? ● The generational hypothesis is the observation that, in most cases, young objects are much more likely to die than old objects. – Memory Management Glossary
  25. 25. Why Not Generational? ● LRU Benchmark – Models a URL cache mapping URL to web page content. – Generational GC pays a steep penalty for copying data.
      | Collector  | Total Time | Total Pause Time | Average Pause Time | Max Pause Time |
      | Shenandoah | 15167ms    | 3.81s            | 23.19ms            | 44.85ms        |
      | G1         | 178244ms   | 11.89s           | 116.60ms           | 230.573ms      |
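The benchmark's source is not shown in the slides, but the kind of workload it describes can be sketched with a small LRU cache built on `LinkedHashMap` in access order. The class name `UrlLruCache` and the example URLs are hypothetical. In such a cache, entries tend to live a long time and die only when evicted, which is the opposite of what the generational hypothesis expects.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical LRU cache mapping a URL to page content.
class UrlLruCache extends LinkedHashMap<String, String> {
    private final int capacity;

    UrlLruCache(int capacity) {
        super(16, 0.75f, true); // accessOrder = true gives LRU iteration order
        this.capacity = capacity;
    }

    // Called by LinkedHashMap after each insertion; evicts the least
    // recently used entry once the cache grows past its capacity.
    @Override
    protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
        return size() > capacity;
    }
}
```

With a capacity of 2, inserting `a` and `b`, reading `a`, then inserting `c` evicts `b`, the entry that has gone longest without access.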
  26. 26. How Does Shenandoah Compare With Other OpenJDK Collectors?
  27. 27. Currently Available OpenJDK GCs ● Serial GC – Small Footprint – Minimal overhead ● Parallel GC – High Throughput ● G1 – Managed Pause Times – Compaction ● ParNew/CMS – Minimal Pause Times
  28. 28. What's Next?
  29. 29. Shorter Pause Times ● We are moving more of our work into concurrent phases to meet the original 10ms goal.
  30. 30. Shenandoah 2.0 ● Observations – Marking the entire heap takes a long time and touches rarely used parts of memory. – Garbage is only created by stack changes or writes to the heap.
  31. 31. Focus GC wherever writes are happening. ● Generational Application – Writes happening in recently allocated regions ● LRU – Writes happening in oldest regions
  32. 32. Shenandoah 2.0 Theory ● Keep track of writes to regions. ● Focus on regions which have changed ● Collect Region Sets together.
  33. 33. Table of Inter-Region References [Table: an 8×8 grid over regions 0–7 with marked entries in rows 3, 5, and 7; the column positions are not recoverable from this transcript]. Regions 3 & 5 collected together.
  34. 34. Table of Inter-Region References [same table]. Scan Region 7 when collecting region 6.
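One way to picture the table is as a boolean matrix indexed by source and target region. The sketch below is an assumption-laden model, not the Shenandoah 2.0 design: `InterRegionRefs` and its methods are hypothetical names, and the marked cells used in the example (3→5, 5→3, 7→6) are guesses chosen to match the two captions, since the transcript does not preserve the columns.

```java
import java.util.ArrayList;
import java.util.List;

class InterRegionRefs {
    final boolean[][] refs; // refs[from][to]: region `from` holds a reference into `to`

    InterRegionRefs(int regions) {
        refs = new boolean[regions][regions];
    }

    // Would be called when a write stores a cross-region reference.
    void recordReference(int from, int to) {
        if (from != to) refs[from][to] = true;
    }

    // Regions that must be scanned when `target` is collected.
    List<Integer> regionsToScan(int target) {
        List<Integer> result = new ArrayList<>();
        for (int from = 0; from < refs.length; from++) {
            if (refs[from][target]) result.add(from);
        }
        return result;
    }

    // Regions that point at each other are candidates for one region set.
    boolean mutuallyReferencing(int a, int b) {
        return refs[a][b] && refs[b][a];
    }
}
```

Under these assumed entries, regions 3 and 5 reference each other and so would be collected together, while collecting region 6 requires scanning only region 7.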
  35. 35. Shenandoah Heap: Region Based. Choose regions with the most updates. [Diagram: heap regions labeled 80% live, empty, 100% live, empty, empty, empty, and regions labeled with update counts 50, 0, 200, 100]
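Tracking writes per region could be as simple as one counter per region, bumped from the write path. This is a sketch of the idea only: `RegionWriteCounters` and its methods are hypothetical, and a real implementation would need a cheaper and thread-safe mechanism.

```java
class RegionWriteCounters {
    final long[] writes; // writes[i] counts recorded writes into region i

    RegionWriteCounters(int regions) {
        writes = new long[regions];
    }

    // Would be called from the write barrier with the written object's region.
    void recordWrite(int region) {
        writes[region]++;
    }

    // Index of the region with the most recorded writes (lowest index on ties).
    int mostUpdated() {
        int best = 0;
        for (int i = 1; i < writes.length; i++) {
            if (writes[i] > writes[best]) best = i;
        }
        return best;
    }
}
```

With the update counts from the slide (50, 0, 200, 100), the region with 200 updates is selected first.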
  36. 36. Partial Collections ● Scan Thread Stacks and Other Roots ● Scan Entire Region Group and Referencing Regions.
  37. 37. Region Groups Help NUMA ● Regions that reference each other will be collected together.
  38. 38. NUMA Aware GC Threads ● NUMA node 1: Java threads and concurrent GC threads ● NUMA node 2: Java threads and concurrent GC threads [Diagram: heap regions labeled N1 Region, N2 Region, Shared Region, and Empty Region]
  39. 39. Takeaway Message Shenandoah rocks for some applications!
  40. 40. Who would benefit from Shenandoah? Stock trading applications E-commerce web sites Any applications with QOS guarantees Interactive Applications
  41. 41. Who would benefit from Shenandoah? Applications with large heaps that require fast response times
  42. 42. More Information http://openjdk.java.net/projects/shenandoah/ chf@redhat.com rkennke@redhat.com