Large data volumes and garbage collection in Java
 


Alexey Ragozin, Technical Lead, Caching and Data Grid Services, VP, Risk and PnL, Deutsche Bank

    Large data volumes and garbage collection in Java: Presentation Transcript

    • Big JVM and garbage collection Alexey Ragozin alexey.ragozin@gmail.com Sep 2012
    • What is it all about?
        • Automatic memory management, how it works
        • Why the JVM needs Stop-the-World pauses
        • Tuning GC in HotSpot JVM
    • Automatic memory management
        • Languages with automatic memory management: Java, JavaScript, Erlang, Haskell, Python, PHP, C#, Ruby, Perl, Smalltalk, OCaml, Lisp, Scala, ML, Go, D, … and counting
        • Languages without automatic memory management: C, C++, Pascal/Delphi, Objective-C
        • Anything else, anyone?
    • How to manage memory?
        • Garbage – a data structure (object) in memory that is unreachable for the program.
        • How to find garbage?
            • Reference counting
            • Object graph traversal
            • Do not collect garbage at all
    • Reference counting
        + Simple
        + No Stop-the-World pauses required
        – Cannot collect cyclic graphs
        – 15-30% CPU overhead
        – Pretty bad for multi-core systems
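The "cannot collect cyclic graphs" point can be illustrated with a toy reference counter. This is only a sketch: `RcNode` and `RefCountDemo` are hypothetical names, and the counting is done by hand rather than by any real runtime.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy reference-counted object (hypothetical, for illustration only).
final class RcNode {
    int refCount = 0;
    boolean freed = false;
    final List<RcNode> fields = new ArrayList<>();

    // Store a reference to target in this object: target gains one referrer.
    void addRef(RcNode target) {
        fields.add(target);
        target.refCount++;
    }

    // Drop one reference to this object; free it when the count reaches zero.
    void release() {
        refCount--;
        if (refCount == 0) {
            freed = true;
            for (RcNode child : fields) {
                child.release();
            }
            fields.clear();
        }
    }
}

final class RefCountDemo {
    // Returns {aFreed, bFreed} after the only external reference is dropped.
    static boolean[] run(boolean cyclic) {
        RcNode a = new RcNode();
        RcNode b = new RcNode();
        a.refCount++;            // external (root) reference to a
        a.addRef(b);             // a -> b
        if (cyclic) {
            b.addRef(a);         // b -> a closes the cycle
        }
        a.release();             // the root reference goes away
        return new boolean[] { a.freed, b.freed };
    }

    public static void main(String[] args) {
        System.out.println("chain: " + Arrays.toString(run(false)));
        System.out.println("cycle: " + Arrays.toString(run(true)));
    }
}
```

In the acyclic case both objects are freed once the root reference is released. In the cyclic case each object keeps the other's count above zero, so neither is ever freed, even though the program can no longer reach them.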
    • Object graph traversal
        • Roots: static fields, local variables (stack frames)
        • Reachable objects are alive
        • Unreachable objects are garbage
        • In general, the graph should not be mutated during traversal. As a consequence, the application has to be frozen while the runtime is collecting garbage.
    • Garbage collection: Algorithms
        • Copy collection
            • Traverse the object graph and copy reachable objects to another space
            • Mark the old space as free
        • Mark / Sweep
            • Traverse the object graph and mark reachable objects
            • Scan (sweep) the whole memory and “free” unmarked objects
        • Mark / Sweep / Compact
            • … mark … sweep …
            • Relocate live objects to defragment free space
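A minimal sketch of the Mark / Sweep idea on a toy heap, assuming the heap is just a list of objects and the roots are passed in explicitly. `ToyObject` and `ToyMarkSweep` are hypothetical names; a real collector works on raw memory, not on Java collections.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy heap object with outgoing references and a mark bit (hypothetical).
final class ToyObject {
    final List<ToyObject> refs = new ArrayList<>();
    boolean marked;
}

final class ToyMarkSweep {
    // Mark phase: traverse the object graph from the roots and mark reachable objects.
    static void mark(List<ToyObject> roots) {
        Deque<ToyObject> stack = new ArrayDeque<>(roots);
        while (!stack.isEmpty()) {
            ToyObject o = stack.pop();
            if (o.marked) {
                continue;
            }
            o.marked = true;
            for (ToyObject r : o.refs) {
                stack.push(r);
            }
        }
    }

    // Sweep phase: scan the whole heap, drop unmarked objects, reset mark bits.
    // Returns the number of reclaimed objects.
    static int sweep(List<ToyObject> heap) {
        int before = heap.size();
        heap.removeIf(o -> !o.marked);
        for (ToyObject o : heap) {
            o.marked = false;
        }
        return before - heap.size();
    }

    // Builds a -> b (reachable) and an unreachable cycle c <-> d, then collects.
    static int demo() {
        ToyObject a = new ToyObject();
        ToyObject b = new ToyObject();
        ToyObject c = new ToyObject();
        ToyObject d = new ToyObject();
        a.refs.add(b);
        c.refs.add(d);
        d.refs.add(c);
        List<ToyObject> heap = new ArrayList<>(List.of(a, b, c, d));
        mark(List.of(a));
        return sweep(heap);
    }

    public static void main(String[] args) {
        System.out.println("reclaimed objects: " + demo());
    }
}
```

Unlike reference counting, traversal reclaims the unreachable cycle, because liveness is decided by reachability from the roots rather than by per-object counters.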
    • Garbage collection: Economics
        • $S$ – whole heap size
        • $L$ – size of live objects
        • $S - L$ – total amount of garbage
        • Copy collection: $\text{Throughput} \approx c \cdot \frac{S - L}{L}$
        • Mark / Sweep: $\text{Throughput} \approx c_1 \cdot \frac{S - L}{L} + c_2 \cdot \frac{S - L}{S}$
        • For all algorithms based on reference reachability, GC efficiency is inversely proportional to the amount of live objects.
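A small worked example of the two estimates, with the constants and heap sizes chosen arbitrarily for illustration (`GcEconomics` is a hypothetical name, and the units are whatever the constants imply):

```java
final class GcEconomics {
    // Copy collection estimate: c * (S - L) / L
    static double copyThroughput(double c, double heap, double live) {
        return c * (heap - live) / live;
    }

    // Mark / Sweep estimate: c1 * (S - L) / L + c2 * (S - L) / S
    static double markSweepThroughput(double c1, double c2, double heap, double live) {
        return c1 * (heap - live) / live + c2 * (heap - live) / heap;
    }

    public static void main(String[] args) {
        double heap = 10.0; // S, e.g. in GiB (assumed value)
        for (double live : new double[] { 1.0, 5.0, 9.0 }) {
            System.out.printf("L=%.0f copy=%.2f markSweep=%.2f%n",
                    live,
                    copyThroughput(1.0, heap, live),
                    markSweepThroughput(1.0, 1.0, heap, live));
        }
    }
}
```

With S fixed, growing the live set from 1 to 5 units drops the copy collection estimate from 9c to c, which is the "efficiency falls as live data grows" point in numbers.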
    • Garbage collection: Generational approach
    • WHAT DO WE HAVE IN TOOL BOX?
    • Garbage collection: Terms dictionary
        • Stop-the-world (STW) pause – a pause of all application threads required by the collector
        • Compacting algorithms – can move objects in memory to defragment free space
        • Parallel collection – uses multiple cores to reduce STW time
        • Concurrent collection – collection algorithms that work in parallel with application threads
    • Garbage collection: Throughput vs low latency
        • Throughput algorithms
            – minimize total time of program execution
            – economically efficient CPU utilization
        • Low pause algorithms
            – minimize the time of an individual STW pause
            – may use background (concurrent) collection
            – may use incremental collection
    • Oracle HotSpot JVM
        • Throughput algorithms
            • Parallel GC (-XX:+UseParallelOldGC)
                Young: copy collector
                Old: parallel Mark Sweep Compact
        • Low pause algorithms
            • Concurrent Mark Sweep (-XX:+UseConcMarkSweepGC)
                Young: copy collector
                Old: Mark Sweep
                – not compacting (prone to fragmentation)
                – most work is done in the background
                – young collections are STW
    • Oracle HotSpot JVM
        • Low pause algorithms
            • Garbage First – G1 (-XX:+UseG1GC)
                Young: copy collector
                Old: incremental copy collector
                – incremental: more STW pauses, but shorter ones
                – collects regions with more garbage first
                – compacting, but has had problems with large objects
        • G1 – the algorithm of the future, hopefully not forever
            – bad throughput
            – pauses are normally twice as long as with CMS
    • Garbage collection: Generational approach
        • Young space collection
            • High throughput
            • Low memory utilization
        • Promotion
            • Eden (nursery) -> Survivor (keep) space -> Old space
        • Old space collection
            • Better memory utilization
            • Orders of magnitude lower throughput
        • Memory barrier
            • JVM “tracks” references from old to young space
    • Oracle’s HotSpot JVM
        • Default (serial) collector – Young: serial copy collector, Old: serial MSC
        • Parallel scavenge / Parallel old GC – Young: parallel copy collector, Old: serial MSC or parallel MSC
        • Concurrent mark sweep (CMS) – Young: serial or parallel copy collector, Old: concurrent mark sweep
        • G1 (garbage first) – Young: copy collector (region based), Old: incremental MSC
        • http://blog.ragozin.info/2011/07/hotspot-jvm-garbage-collection-options.html
    • Oracle’s HotSpot JVM
        Young collector | Old collector | JVM option
        Serial (DefNew) | Serial Mark-Sweep-Compact | -XX:+UseSerialGC
        Parallel scavenge (PSYoungGen) | Serial Mark-Sweep-Compact (PSOldGen) | -XX:+UseParallelGC
        Parallel scavenge (PSYoungGen) | Parallel Mark-Sweep-Compact (ParOldGen) | -XX:+UseParallelOldGC
        Serial (DefNew) | Concurrent Mark Sweep | -XX:+UseConcMarkSweepGC -XX:-UseParNewGC
        Parallel (ParNew) | Concurrent Mark Sweep | -XX:+UseConcMarkSweepGC -XX:+UseParNewGC
        G1 | G1 | -XX:+UseG1GC
        http://blog.ragozin.info/2011/09/hotspot-jvm-garbage-collection-options.html
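To check which young/old collector pair a running JVM actually selected for a given set of options, one option is to ask the standard `java.lang.management` API. The sketch below only lists what the JVM reports; the collector names it prints depend on the JVM version and flags, and `ActiveCollectors` is a hypothetical class name.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

final class ActiveCollectors {
    // Names of the collectors registered in this JVM, as reported by the platform MXBeans.
    static List<String> names() {
        List<String> result = new ArrayList<>();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            result.add(gc.getName());
        }
        return result;
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName()
                    + " collections=" + gc.getCollectionCount()
                    + " timeMs=" + gc.getCollectionTime());
        }
    }
}
```

Running it with different options from the table (for example `java -XX:+UseSerialGC ActiveCollectors`) shows one bean for the young collector and one for the old collector, provided the JVM version still supports that option.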
    • Oracle’s JRockit JVM
        -Xgc: option | Generational | Mark | Sweep/Compact
        genconcon or gencon | Yes | concurrent | incremental
        singleconcon or singlecon | No | concurrent | incremental
        genconpar | Yes | concurrent | parallel
        singleconpar | No | concurrent | parallel
        genparpar or genpar | Yes | parallel | parallel
        singleparpar or singlepar | No | parallel | parallel
        genparcon | Yes | parallel | incremental
        singleparcon | No | parallel | incremental
        http://blog.ragozin.info/2011/07/jrockit-gc-in-action.html
    • Azul Zing JVM
        • Generational GC
        • Young – concurrent mark sweep compact (MSC)
        • Old – concurrent mark sweep compact (MSC)
        • Azul Zing can relocate objects in memory without an STW pause.
        • Secret – read barrier.
        • Requires special Linux kernel modules to run
    • JVM HEAP SIZE AND PAUSES
    • Concurrent Mark Sweep
        • Initial mark (Stop-the-World)
            • Collect root references (thread stacks) and mark them gray
        • Concurrent mark (concurrent)
            • Do three color marking until no gray objects remain
            • Mark all black objects on dirty regions as gray (by card table)
            • Repeat
        • Remark (Stop-the-World)
            • Final remark
        • Sweep (concurrent)
            • Scan the heap and reclaim white objects
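The three color marking step can be sketched as a worklist algorithm. This toy version runs with the application stopped, so it leaves out the part that makes CMS hard (re-graying black objects on dirty cards while the application mutates the graph). `GcColor`, `ColoredObject` and `TriColorMarking` are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// WHITE: not visited yet; GRAY: visited, references not scanned; BLACK: fully scanned.
enum GcColor { WHITE, GRAY, BLACK }

final class ColoredObject {
    final List<ColoredObject> refs = new ArrayList<>();
    GcColor color = GcColor.WHITE;
}

final class TriColorMarking {
    static void mark(List<ColoredObject> roots) {
        Deque<ColoredObject> gray = new ArrayDeque<>();
        // Initial mark: roots become gray.
        for (ColoredObject root : roots) {
            shade(root, gray);
        }
        // Marking: scan gray objects until none remain.
        while (!gray.isEmpty()) {
            ColoredObject o = gray.poll();
            for (ColoredObject ref : o.refs) {
                shade(ref, gray);
            }
            o.color = GcColor.BLACK;
        }
    }

    // Turn a white object gray and queue it for scanning.
    private static void shade(ColoredObject o, Deque<ColoredObject> gray) {
        if (o.color == GcColor.WHITE) {
            o.color = GcColor.GRAY;
            gray.add(o);
        }
    }

    public static void main(String[] args) {
        ColoredObject a = new ColoredObject();
        ColoredObject b = new ColoredObject();
        ColoredObject c = new ColoredObject(); // unreachable
        a.refs.add(b);
        mark(List.of(a));
        System.out.println(a.color + " " + b.color + " " + c.color);
    }
}
```

After marking, everything still white is garbage and can be reclaimed by the sweep phase.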
    • Cost structure of pauses (CMS) Summary of pauses
    • MOVING OUT OF HEAP
    • Direct memory buffers: java.nio.ByteBuffer.allocateDirect()
        • Pro
            • Memory is allocated out of heap
            • Memory is deallocated when the ByteBuffer is collected
            • Cross platform, native Java
        • Con
            • Fragmentation of non-heap memory
            • Memory is deallocated when the ByteBuffer is collected
            • Complicated multi-threaded programming
            • -XX:MaxDirectMemorySize=<value>
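A minimal sketch of storing data outside the Java heap with a direct buffer. The buffer size and the `DirectBufferDemo` class are arbitrary choices for illustration; only the small `ByteBuffer` wrapper object lives on the heap, while the 64 bytes of storage are native memory.

```java
import java.nio.ByteBuffer;

final class DirectBufferDemo {
    // Writes a long into off-heap memory and reads it back.
    static long roundTrip(long value) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64); // 64 bytes outside the Java heap
        buffer.putLong(0, value);                          // absolute write at byte offset 0
        return buffer.getLong(0);                          // absolute read at byte offset 0
    }

    public static void main(String[] args) {
        ByteBuffer buffer = ByteBuffer.allocateDirect(64);
        System.out.println("direct=" + buffer.isDirect() + " capacity=" + buffer.capacity());
        System.out.println("value=" + roundTrip(42L));
    }
}
```

The native memory is released only after the `ByteBuffer` object itself is garbage collected, which is why the same point appears in both the pro and the con list.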
    • RTSJ: Scoped memory
        • Objects can be allocated in chosen memory areas
        • Scoped and immortal areas are not garbage collected
        • Scoped areas are released as a whole
        • Cross references between areas are limited, and this limitation is enforced at run time
    • Unsafe Java: sun.misc.Unsafe
        • Unsafe.allocateMemory(…)
        • Unsafe.reallocateMemory(…)
        • Unsafe.freeMemory(…)
    • Thank you. http://blog.ragozin.info – my articles. Alexey Ragozin alexey.ragozin@gmail.com
    • YOUNG COLLECTION
    • Memory spaces in HotSpot JVM: Memory geometry
        • Young space: -XX:NewSize=<n> -XX:MaxNewSize=<n>
        • Survivor space: -XX:SurvivorRatio=<n> (ratio of Eden size to the size of one survivor space)
        • Young + old space: -Xms<n> -Xmx<n>
        • Permanent space: -XX:PermSize=<n> -XX:MaxPermSize=<n>
        * G1 has the same set of spaces, but they are dynamic sets of regions rather than contiguous address ranges
    • How does young collection work?
        • Collect root references
            • Stack frame references
            • References from other spaces (tenured + permanent): does it mean scanning old space?
        • Traverse object graph
            • Visit only live objects
            • Copy live objects to another region of young space or to old space
        • Consider whole Eden and old survivor space to be free memory
        • A write barrier is required to efficiently collect references from old to young space.
    • How does young collection work? Collecting root references
        • Card marking barrier: each 512 bytes of heap is associated with a flag (card). Once a reference is written in memory, the associated card is marked dirty.
    • How does young collection work? Copying live objects: the card table is reset just before the copy collector starts to move objects.
    • How does young collection work? Collection finished: since every object in young space has been relocated, a clean card means that there are no references to young space in that particular 512 bytes of heap.
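The card marking barrier can be sketched as a byte array indexed by heap offset shifted right by 9 bits (512 = 2^9). This is a toy model: `CardTable` is a hypothetical class, offsets stand in for real addresses, and the choice of 1 for "dirty" is an assumption of the sketch rather than HotSpot's actual encoding.

```java
final class CardTable {
    static final int CARD_SHIFT = 9;          // 2^9 = 512 bytes of heap per card
    private final byte[] cards;

    CardTable(long heapBytes) {
        // One card per 512 bytes, rounding up.
        cards = new byte[(int) ((heapBytes + 511) >>> CARD_SHIFT)];
    }

    // Write barrier: called whenever a reference is stored at the given heap offset.
    void recordWrite(long fieldOffset) {
        cards[(int) (fieldOffset >>> CARD_SHIFT)] = 1;
    }

    boolean isDirty(int card) {
        return cards[card] == 1;
    }

    // Number of 512-byte areas the young collector would have to scan for old-to-young references.
    int dirtyCount() {
        int count = 0;
        for (byte card : cards) {
            if (card == 1) {
                count++;
            }
        }
        return count;
    }

    // Reset before the copy collector starts moving objects.
    void reset() {
        java.util.Arrays.fill(cards, (byte) 0);
    }

    public static void main(String[] args) {
        CardTable table = new CardTable(4096);
        table.recordWrite(100);
        table.recordWrite(600);
        System.out.println("dirty cards: " + table.dirtyCount());
    }
}
```

The point of the design is that the young collector reads one byte per 512 bytes of old space and then scans only the dirty areas, instead of walking the whole old space.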
    • Thread local allocation blocks: TLAB in HotSpot JVM
        • Each thread preallocates a block in Eden
        • The thread allocates new objects in its TLAB
        • When the TLAB is full, a new TLAB is allocated
        • If an object does not fit in a TLAB
            • Allocate in Eden space
        • If it does not fit in Eden (or exceeds -XX:PretenureSizeThreshold)
            • Allocate in old space
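The allocation path above can be sketched with two bump pointers: a shared one for Eden and a private one per thread. `ToyEden` and `ToyTlab` are hypothetical names, offsets stand in for addresses, and a return value of -1 stands for "fall back to old space or trigger a collection"; real TLAB sizing is adaptive and not modelled here.

```java
// Shared Eden space: allocation must be synchronized between threads.
final class ToyEden {
    private final long capacity;
    private long top = 0;

    ToyEden(long capacity) {
        this.capacity = capacity;
    }

    // Returns the start offset of the allocated block, or -1 if Eden is exhausted.
    synchronized long allocate(long size) {
        if (size > capacity - top) {
            return -1;
        }
        long start = top;
        top += size;
        return start;
    }
}

// Per-thread allocation block: no synchronization on the fast path.
final class ToyTlab {
    private final ToyEden eden;
    private final long tlabSize;
    private long top = 0;
    private long end = 0; // top == end means "no usable TLAB yet"

    ToyTlab(ToyEden eden, long tlabSize) {
        this.eden = eden;
        this.tlabSize = tlabSize;
    }

    long allocate(long size) {
        if (size > tlabSize) {
            return eden.allocate(size);          // does not fit any TLAB: allocate in Eden directly
        }
        if (size > end - top) {                  // current TLAB is full: request a new one
            long start = eden.allocate(tlabSize);
            if (start < 0) {
                return -1;
            }
            top = start;
            end = start + tlabSize;
        }
        long result = top;                       // fast path: bump the thread-private pointer
        top += size;
        return result;
    }
}
```

For example, with a 1024-byte Eden and 256-byte TLABs, three 100-byte allocations land at offsets 0, 100 and 256: the third one no longer fits in the first TLAB, so a fresh TLAB is taken from Eden.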
    • Young collection stop-the-world
        • Total STW time
            • Collect roots
                • Scan thread stacks
                • Scan dirty cards
                    • Read card table ~ $S_{heap}$
                    • Scan pages marked as dirty ~ $\frac{1}{C - S_{heap}}$
            • Copy live objects
            • Process special references
        * You can use -XX:+PrintGCTaskTimeStamps to analyze time of individual phases
        * You can use -XX:+PrintReferenceGC to analyze reference processing times
    • OLD SPACE COLLECTION
    • HotSpot: Old space collection
        • Stop-the-World Mark-Sweep-Compact
            • Single threaded
            • Multithreaded
        • Concurrent Mark Sweep (CMS)
            • Background collection of old space
        • G1 (Garbage First)
            • Incremental Stop-the-World collection
    • HotSpot: Old space collection, Concurrent Mark Sweep (CMS)
        • Does not compact
        • Prone to fragmentation
        • Uses separate free lists for each object size
        • Uses statistics to manage fragmentation
        • Introduces 2 short STW phases
    • HotSpot: Old space collection, incremental collection (G1)
        • Space is divided into regions
        • Regions can be collected individually
        • Write barrier tracks references between regions
        • A subset of regions is collected during an STW pause
            • Live objects are “evacuated” to other regions
        • Young collections – all Eden regions are collected
        • Partial collection – a few old regions are collected
        • Global marking is used to estimate the live population
    • Concurrent Mark Sweep: Three color marking
    • Concurrent Marking Artifacts: SATB barrier example
    • Concurrent Mark Sweep
        • Initial mark (Stop-the-World)
            • Collect root references (thread stacks) and mark them gray
        • Concurrent mark (concurrent)
            • Do three color marking until no gray objects remain
            • Mark all black objects on dirty regions as gray (by card table)
            • Repeat
        • Remark (Stop-the-World)
            • Final remark
        • Sweep (concurrent)
            • Scan the heap and reclaim white objects
    • Cost structure of pauses (CMS) Summary of pauses
    • Patching OpenJDK: Serial collector gain. http://aragozin.blogspot.com/2011/07/openjdk-patch-cutting-down-gc-pause.html
    • Patching OpenJDK: CMS collector gain. http://aragozin.blogspot.com/2011/07/openjdk-patch-cutting-down-gc-pause.html
    • Concurrent Mark Sweep: Full GC
        • Concurrent mode failure: if background collection cannot free memory fast enough, CMS will perform a Stop-the-World single-threaded Mark-Sweep-Compact.
        • Promotion failure: due to fragmentation, old space may not have a contiguous block of memory to accommodate a promoted object even if free space is available. CMS will perform a Stop-the-World single-threaded Mark-Sweep-Compact to defragment memory.
    • TUNING, TROUBLESHOOTING
    • Common reasons for long STW
        [Times: user=0.53 sys=0.06, real=0.15 secs]
        • Full GC
        • OS swapping
        • Too many survivors in young space
        • Long reference processing
        • JNI delays
        • Long CMS initial mark / remark
    • CMS check list
        • jdk6u22 - jdk6u26 – broken free lists logic
        • -XX:CMSWaitDuration=…
        • -XX:+CMSScavengeBeforeRemark
        • -XX:-CMSConcurrentMTEnabled
        • Consider CMS for permanent space
        • Size your heap: -Xmn / -Xms / -Xmx
            • Expected data + young space + CMS overhead
            • CMS overhead ~30% of expected data
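The heap sizing rule at the end of the check list is simple arithmetic. As a worked example, assume 20 GiB of expected data and a 2 GiB young space; both numbers and the `CmsHeapSizing` class are made up for illustration.

```java
final class CmsHeapSizing {
    // Rule of thumb from the check list: CMS overhead is roughly 30% of expected data.
    static long cmsOverheadMiB(long expectedDataMiB) {
        return expectedDataMiB * 30 / 100;
    }

    // Total heap = expected data + young space + CMS overhead.
    static long totalHeapMiB(long expectedDataMiB, long youngMiB) {
        return expectedDataMiB + youngMiB + cmsOverheadMiB(expectedDataMiB);
    }

    public static void main(String[] args) {
        long youngMiB = 2 * 1024;
        long heapMiB = totalHeapMiB(20 * 1024, youngMiB);
        // Prints candidate sizing options for these assumed inputs.
        System.out.println("-Xms" + heapMiB + "m -Xmx" + heapMiB + "m -Xmn" + youngMiB + "m");
    }
}
```

For these inputs the overhead is 6144 MiB, so the estimate comes to 28672 MiB (28 GiB) of total heap.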
    • Tuning young collectionEden size too small – frequent YGC, objects promoted to old space early too large – more long lived objects need to be copiedSurvivor space size too small – overflow, objects prematurely promoted too large – memory wastedTenuring threshold higher – objects are kept in young space for longer higher – more objects in young space, more copy time
    • Tuning young collectionEden size -XX:MaxNewSize=<n> -XX:NewSize=<n> Eden size = new size – 2 * survivor space sizeSurvivor space size -XX:SurvivorRatio=<n> Survivor space size = new size / survivor ratioTenuring threshold -XX:MaxTenuringThreshold=<n>
    • Tuning young collection
        • Small heap sizes
            • Balance tenuring threshold / survivor space to keep objects in the limited young space for longer
        • Large heap sizes (4 GB and greater)
            • Limit tenuring threshold to avoid an increase in copy time
            • Limit survivor space to avoid accidental long young collections
            • Increase Eden size instead of increasing tenuring threshold
    • Tuning young collection GC tuning is based on application allocation pattern If application allocation patterns is changed – you are in trouble In practice application always have different “modes of operation” GC tuning – choosing better evil
    • Diagnostics
    • Surviving with huge heap
        • CMS is very good in terms of pauses
            • You can reliably keep pauses under 50 ms to 150 ms on 30 GiB to 50 GiB heaps
        • Fragmentation threat
            • Not a big deal for server type applications
            • XML processing is a GC disaster
        • Very narrow GC comfort zone
            • If you tune for the “long run”, you are likely to have pauses during initial loads / bulk refreshes