Scott Andreas - Garbage, Garbage Everywhere: GC Strategies for Event Processing Systems on the JVM, Boundary Tech Talks 11/17/11

This presentation from the November 17, 2011 Boundary Meetup takes us through the architecture of Boundary's stream processing infrastructure and how the architecture is pushing the bounds of JVM throughput.


  1. Garbage, Garbage Everywhere: GC Strategies for Event Processing Systems on the JVM. C. Scott Andreas. Pizza, Beer, and Tech Talks, November 17, 2011
  2. What’s ESP / CEP?
       • Event Stream Processing: Selecting events on dimensions among a stream of moving data, maintaining them for a brief period, emitting aggregations.
       • Complex Event Processing: Identifying correlations between events, predicting trends, and programmatically reacting to emergent trends.
  3. ESP and Network Analytics
       • Packet flows are event streams with many dimensions.
       • Blast them into the engine, select over the stream, emit aggregations based on queries.
       • IPFIX data flows in, JSON comes out.
  4. Back of the Envelope
       • 500 Mbps of data comes into the JVM; xxx Mbps of data goes out of the JVM.
       • This memory must be allocated, retained for processing, freed, and collected.
       • Actual allocation rates are far higher than data in / out (memory is also used for deserializing, aggregations, etc.).
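
The back-of-the-envelope figure on slide 4 can be turned into rough numbers. The sketch below assumes the slide means 500 megabits per second of ingest. The amplification factor and the young generation size are hypothetical values chosen only for illustration; the talk does not state them.

```java
public final class AllocationEstimate {

    /** Converts megabits per second to bytes per second (1 Mb = 1,000,000 bits). */
    public static long bytesPerSecond(long megabitsPerSecond) {
        return megabitsPerSecond * 1_000_000L / 8L;
    }

    /** Seconds until a young generation of the given size fills at the given allocation rate. */
    public static double secondsToFill(long youngGenBytes, long allocBytesPerSecond) {
        return (double) youngGenBytes / allocBytesPerSecond;
    }

    public static void main(String[] args) {
        long ingest = bytesPerSecond(500);            // 62,500,000 bytes per second on the wire
        long amplification = 4;                       // hypothetical: deserialization, aggregation, etc.
        long allocation = ingest * amplification;     // assumed heap allocation rate
        long youngGen = 4L * 1024 * 1024 * 1024;      // hypothetical 4 GiB young generation

        System.out.printf("ingest=%,d B/s, assumed allocation=%,d B/s, young gen fills in %.1f s%n",
                ingest, allocation, secondsToFill(youngGen, allocation));
    }
}
```

Even with modest assumptions, the young generation fills within seconds, which is why collection frequency and pause length dominate the rest of the talk.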
  5. Opening the Grimoire
       • GC Tuning to the rescue
       • Oracle guide with 84 -XX: options
         http://www.oracle.com/technetwork/java/javase/tech/vmoptions-jsp-140102.html
       • Stas’s guide: 830 options (OpenJDK debug builds)
         http://stas-blogspot.blogspot.com/2011/07/most-complete-list-of-xx-options-for.html
  6. Moment of Pause
       • Don’t touch the knobs unless you need to.
       • Server defaults are a decent place to start for local development.
       • Defaults shipped with Cassandra are decent for bimodal GC profiles.
       • Basic rule of thumb: if you’re aggressively tuning garbage collection, you can trade hours of frustration for ~10% gain.
  7. Generational Garbage Collection
       • Modern JVMs divide heap space up into multiple “generations.”
       • Most applications have a lot of objects which live for a very short time, and a lot which live (nearly) forever.
       • Generational collection enables the JVM to collect unused memory more efficiently by avoiding unnecessarily scanning heap / object graphs for references or free regions.
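
The generations described on slide 7 are visible at runtime through the standard `java.lang.management` API. The sketch below lists the heap memory pools of the running JVM. Pool names depend on the collector in use, so none are hard-coded; the class name `HeapPools` is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;
import java.util.ArrayList;
import java.util.List;

public final class HeapPools {

    /** Returns the names of all heap memory pools (for example eden, survivor, old/tenured). */
    public static List<String> heapPoolNames() {
        List<String> names = new ArrayList<>();
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            if (pool.getType() == MemoryType.HEAP) {
                names.add(pool.getName());
            }
        }
        return names;
    }

    public static void main(String[] args) {
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage(); // may be null if the pool is no longer valid
            if (pool.getType() != MemoryType.HEAP || usage == null) {
                continue;
            }
            // getMax() reports -1 when the pool has no defined maximum.
            System.out.printf("%-30s used=%,d bytes max=%,d bytes%n",
                    pool.getName(), usage.getUsed(), usage.getMax());
        }
    }
}
```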
  8. first attempt: “deploy the g1”
  9. Low-Pause Collector: The G1 (chart: v% => threshold violations; lower is better)
  10. G1 Collector
       • Hundreds of tiny 1ms collections / second rather than ParNew’s ~100 - 200ms larger collections.
       • Capable of meeting ambitious pause targets.
       • Powered by a gang of threads working in parallel
       • ...cooperating to chew through CPU like it’s free.
  11. second attempt: you’re gonna laugh
  12. the unsafe
  13. Unsafe
       • An OpenJDK/HotSpot class exposing direct access to the underlying VM, OS, and memory.
       • This includes the ability to allocate, manage, and free memory.
       • Perhaps we can outsmart the JVM and do a better job than it!
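
For readers who have not seen the class on slide 13, here is a minimal sketch of manual allocation with `sun.misc.Unsafe`. It is not the code from the talk. It assumes a HotSpot/OpenJDK runtime where the `theUnsafe` field is reachable by reflection; newer JDK releases deprecate these memory-access methods and may warn or refuse at runtime. The class name `OffHeapLong` is made up for this example.

```java
import java.lang.reflect.Field;
import sun.misc.Unsafe;

public final class OffHeapLong {

    /** Obtains the Unsafe singleton reflectively; an internal API, not a supported one. */
    private static Unsafe loadUnsafe() {
        try {
            Field field = Unsafe.class.getDeclaredField("theUnsafe");
            field.setAccessible(true);
            return (Unsafe) field.get(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("sun.misc.Unsafe is not available on this JVM", e);
        }
    }

    /** Writes a long to manually allocated off-heap memory, reads it back, then frees the memory. */
    public static long roundTrip(long value) {
        Unsafe unsafe = loadUnsafe();
        long address = unsafe.allocateMemory(8); // 8 bytes, invisible to the garbage collector
        try {
            unsafe.putLong(address, value);
            return unsafe.getLong(address);
        } finally {
            unsafe.freeMemory(address);          // forgetting this leaks native memory
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(42L));
    }
}
```

Every allocation needs a matching `freeMemory` call, which is exactly the bookkeeping the garbage collector normally does for you.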
  14. Learned While Astray
       • Finalization occurs in a single thread.
       • Jumping from native finalization back into Java is expensive.
       • Attempting to outsmart the garbage collector by creating hundreds of thousands of tiny ByteBuffers is...a thing.
       • Java’s collectors are very good at collecting garbage. Your home-grown in-app GC go-kart is probably not.
  15. returning to earth :: attempt 3[a]
  16. Lessons from Science
       • Your rate of “freeing” must be equal to or exceed your rate of object allocation on the heap.
       • High rates of allocation speed up heap fragmentation, which compounds the problem.
       • Creating less garbage reduces your rate of allocation (and freeing).
       • This means less work for the garbage collector.
  17. best way to help out the gc ::
  18. PRODUCE LESS GARBAGE
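
One common way to produce less garbage in an aggregation path is to read primitives straight from a reused buffer and accumulate into a preallocated array, so that no object is created per event. The sketch below illustrates the idea only. The 6-byte record layout (2-byte port, 4-byte size) and the class name `ReusableAggregator` are hypothetical, not the format or code used in the talk.

```java
import java.nio.ByteBuffer;

public final class ReusableAggregator {

    // One preallocated counter per port: no map entries or boxed values per event.
    private final long[] bytesPerPort = new long[65536];

    /** Reads a hypothetical record (unsigned 16-bit port, unsigned 32-bit size) without allocating. */
    public void record(ByteBuffer event) {
        int port = event.getShort(0) & 0xFFFF;
        long size = event.getInt(2) & 0xFFFFFFFFL;
        bytesPerPort[port] += size;
    }

    public long total(int port) {
        return bytesPerPort[port];
    }

    public static void main(String[] args) {
        ReusableAggregator aggregator = new ReusableAggregator();
        ByteBuffer scratch = ByteBuffer.allocate(6); // single buffer reused for every event
        for (int i = 0; i < 1_000_000; i++) {
            scratch.putShort(0, (short) 443).putInt(2, 1500);
            aggregator.record(scratch);
        }
        System.out.println(aggregator.total(443));
    }
}
```

The trade-off is a fixed up-front footprint (here 65,536 longs) in exchange for a steady-state allocation rate near zero on this path.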
  19. Breaking out YourKit
  20. attempt 3[b]: responsible tuning of the old hat
  21. Optimizing for Infant Mortality (figure: default newgen ratios in Java 6)
       • Java 6 AMD64 (server) defaults to allocating 1/3 of heap to the new gen, 2/3 to the old gen.
       • ESP/CEP workloads place tremendous pressure on the newgen. The vast majority of objects survive less than five seconds.
       • Experiment: Allocate 80% of heap to the new gen, set a higher tenuring threshold, and lean hard on the ParNew collector.
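
The experiment on slide 21 maps onto a handful of HotSpot flags from the Java 6 era. The helper below is a sketch that derives them from a heap size and a new-gen fraction. The heap size and tenuring threshold in `main` are illustrative assumptions, since the talk does not give its exact values. CMS and its ParNew young collector were removed from later JDK releases, so these flags only apply to older JVMs.

```java
public final class NewGenFlags {

    /** Builds Java 6 era HotSpot flags for a fixed-size heap with an oversized new generation. */
    public static String flagsFor(int heapMb, double newGenFraction, int tenuringThreshold) {
        long newGenMb = Math.round(heapMb * newGenFraction);
        return "-Xms" + heapMb + "m -Xmx" + heapMb + "m"
                + " -Xmn" + newGenMb + "m"                          // new generation size
                + " -XX:MaxTenuringThreshold=" + tenuringThreshold  // young collections survived before promotion
                + " -XX:+UseParNewGC -XX:+UseConcMarkSweepGC";      // ParNew for young, CMS for tenured
    }

    public static void main(String[] args) {
        // Hypothetical 10 GiB heap, 80% new gen as in the experiment, illustrative threshold of 8.
        System.out.println(flagsFor(10240, 0.8, 8));
    }
}
```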
  22. CMS Collector
       • Guardian of the tenured generation, favorite workhorse for years.
       • Primarily concurrent, easier on the CPU than the G1.
       • ...But contains a significant pause phase, and is less suited to meeting low pause targets.
  23. ParNew Collector
       • Designed for the small, but works great in the large. Excellent throughput, parallel collection.
       • Can collect ~5GB in ~200ms on a quad-core Xeon w/HT.
       • A 200ms pause every several seconds compares favorably to less frequent multi-second pauses and promotion failures.
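
Figures like the ~200ms collections on slide 23 can be sanity-checked from inside the process with the standard `GarbageCollectorMXBean` API. The sketch below sums collection counts and accumulated collection time across all collectors. The accumulated time is only a rough proxy for pause time, and GC logs remain the more precise source. The class name `GcStats` and the allocation loop are made up for this example.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public final class GcStats {

    /** Total collections across all collectors; collectors reporting -1 (undefined) are skipped. */
    public static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds across all collectors. */
    public static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime();
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long countBefore = totalCollections();
        long millisBefore = totalCollectionMillis();

        byte[] sink = new byte[0];
        for (int i = 0; i < 100_000; i++) {
            sink = new byte[16 * 1024]; // short-lived garbage to put pressure on the young generation
        }

        System.out.printf("collections=%d, accumulated time=%d ms (last array: %d bytes)%n",
                totalCollections() - countBefore, totalCollectionMillis() - millisBefore, sink.length);
    }
}
```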
  24. Explosions in the Barrel
  25. “real-time” and the jvm
  26. Real Time and the JVM
       • Real Time: Ability to meet specific targets with low variance is critical to the bare minimum functionality of the product (e.g., air bags).
       • “Soft” Real Time: Ability to meet targets is important but not critical. Value of the system’s functionality is diminished but not eliminated by delay.
  27. Real Time and “The Pause”
       • To what extent can a system which can endure pauses of unpredictable duration be considered “real-time”?
       • Is it sufficient to mitigate the frequency and duration of pauses for a system to still deliver value as “soft real-time”?
       • Is the alternative worth the cost?
  28. what does your app sound like?
  29. Garbage, Garbage Everywhere: GC Strategies for Event Processing Systems on the JVM. C. Scott Andreas. Pizza, Beer, and Tech Talks, November 17, 2011
