Designing for garbage collection
 

    Designing for garbage collection: Presentation Transcript

    • Designing for Garbage Collection Gregg Donovan Senior Software Engineer Etsy.com Wednesday, July 31, 13
    • 3.5 years Search Engineering at Etsy.com; 5 years Search & Web Engineering at TheLadders.com
    • 25+ million members
    • 20+ million items
    • 900k+ active sellers
    • 60+ million monthly unique visitors
    • CodeAsCraft.etsy.com
    • Understanding GC; Monitoring GC; Debugging Memory Leaks; Design for Partial Availability
    • public class BuzzwordDetector {
          // Substrings to look for (matched with contains(), so not strictly prefixes).
          static String[] prefixes = { "synergy", "win-win" };
          static String[] myArgs = { "clown synergy", "gorilla win-wins", "whamee" };

          public static void main(String[] args) {
              args = myArgs;  // ignore the command line and use the sample input
              int buzzwords = 0;
              for (int i = 0; i < args.length; i++) {
                  String lc = args[i].toLowerCase();  // allocates a new String per argument
                  for (int j = 0; j < prefixes.length; j++) {
                      if (lc.contains(prefixes[j])) {
                          buzzwords++;
                      }
                  }
              }
              System.out.println("Found " + buzzwords + " buzzwords");
          }
      }
    • New():
          ref <- allocate()
          if ref = null                  /* Heap is full */
              collect()
              ref <- allocate()
              if ref = null              /* Heap is still full */
                  error "Out of memory"
          return ref

      atomic collect():
          markFromRoots()
          sweep(HeapStart, HeapEnd)

      From Garbage Collection Handbook
    • markFromRoots():
          initialise(worklist)
          for each fld in Roots
              ref <- *fld
              if ref != null && not isMarked(ref)
                  setMarked(ref)
                  add(worklist, ref)
                  mark()

      initialise(worklist):
          worklist <- empty

      mark():
          while not isEmpty(worklist)
              ref <- remove(worklist)    /* ref is marked */
              for each fld in Pointers(ref)
                  child <- *fld
                  if child != null && not isMarked(child)
                      setMarked(child)
                      add(worklist, child)

      From Garbage Collection Handbook
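The two pseudocode slides above can be sketched in Java over a toy object graph. This is only an illustration of the mark-and-sweep idea, not how HotSpot manages its heap; the names ToyHeap, Node, and demo are hypothetical and do not come from the talk or the handbook.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

/** Toy mark-and-sweep over an explicit object graph (hypothetical names throughout). */
class ToyHeap {
    /** A heap cell with a mark bit and outgoing pointers. */
    static final class Node {
        final String name;
        final List<Node> fields = new ArrayList<>();
        boolean marked;
        Node(String name) { this.name = name; }
    }

    final List<Node> cells = new ArrayList<>();  // every allocated cell
    final List<Node> roots = new ArrayList<>();  // stand-in for thread stacks and statics

    Node allocate(String name) {
        Node n = new Node(name);
        cells.add(n);
        return n;
    }

    /** markFromRoots(): mark each root, then trace everything reachable from it. */
    void markFromRoots() {
        Deque<Node> worklist = new ArrayDeque<>();
        for (Node ref : roots) {
            if (ref != null && !ref.marked) {
                ref.marked = true;
                worklist.add(ref);
                mark(worklist);
            }
        }
    }

    /** mark(): drain the worklist, marking unmarked children as they are found. */
    private void mark(Deque<Node> worklist) {
        while (!worklist.isEmpty()) {
            Node ref = worklist.remove();
            for (Node child : ref.fields) {
                if (child != null && !child.marked) {
                    child.marked = true;
                    worklist.add(child);
                }
            }
        }
    }

    /** sweep(): free unmarked cells and clear the mark on survivors; returns cells freed. */
    int sweep() {
        int freed = 0;
        for (Iterator<Node> it = cells.iterator(); it.hasNext(); ) {
            Node n = it.next();
            if (n.marked) {
                n.marked = false;
            } else {
                it.remove();
                freed++;
            }
        }
        return freed;
    }

    int collect() {
        markFromRoots();
        return sweep();
    }

    /** Builds root -> a -> b plus an unreachable cycle c <-> d, then collects. */
    static int demo() {
        ToyHeap heap = new ToyHeap();
        Node a = heap.allocate("a");
        Node b = heap.allocate("b");
        Node c = heap.allocate("c");
        Node d = heap.allocate("d");
        a.fields.add(b);
        c.fields.add(d);
        d.fields.add(c);
        heap.roots.add(a);
        return heap.collect();
    }

    public static void main(String[] args) {
        System.out.println("Freed " + demo() + " cells");
    }
}
```

The cycle between c and d is reclaimed because tracing starts from the roots, so objects that only reference each other are never marked.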
    • Trivia: Who invented the first GC and Mark-and-Sweep?
    • Weak Generational Hypothesis
    • Where do objects in your application live?
    • GC Terminology: Concurrent vs Parallel
    • JVM Collectors
    • Serial
    • Throughput
    • CMS
    • Garbage First (G1)
    • Continuously Concurrent Compacting Collector (C4)
    • IBM, Dalvik, etc.?
    • Why Throughput?
    • Questions so far?
    • Monitoring
    • GC time per request
    • ...
      import java.lang.management.*;
      ...
      // Accumulated GC time in milliseconds, summed over every collector in this JVM.
      public static long getCollectionTime() {
          long collectionTime = 0;
          for (GarbageCollectorMXBean mbean : ManagementFactory.getGarbageCollectorMXBeans()) {
              collectionTime += mbean.getCollectionTime();
          }
          return collectionTime;
      }

      Available via JMX
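One way to turn the counter above into "GC time per request" is to sample it before and after handling a request and report the difference. The sketch below assumes that approach; the names GcTimer, Timed, and timeGc are hypothetical and not taken from the talk. The counter is JVM-wide, so with concurrent requests the delta is an upper bound on the GC time that overlapped this request, not time attributable to it alone.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.function.Supplier;

/** Hypothetical helper: attributes JVM-wide GC time to the request that was running. */
class GcTimer {
    /** A request's result plus the GC milliseconds that accrued while it ran. */
    static final class Timed<T> {
        final T result;
        final long gcMillis;
        Timed(T result, long gcMillis) {
            this.result = result;
            this.gcMillis = gcMillis;
        }
    }

    /** Sum of accumulated collection time over all collectors, in milliseconds. */
    static long getCollectionTime() {
        long total = 0;
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            long t = bean.getCollectionTime();  // -1 when the collector does not report it
            if (t > 0) {
                total += t;
            }
        }
        return total;
    }

    /** Runs the request and records how much the GC counter advanced meanwhile. */
    static <T> Timed<T> timeGc(Supplier<T> request) {
        long before = getCollectionTime();
        T result = request.get();
        long after = getCollectionTime();
        return new Timed<>(result, after - before);
    }

    public static void main(String[] args) {
        Timed<Integer> timed = timeGc(() -> {
            int length = 0;
            for (int i = 0; i < 100_000; i++) {
                length += String.valueOf(i).length();  // allocate some short-lived garbage
            }
            return length;
        });
        System.out.println("result=" + timed.result + " gcMillis=" + timed.gcMillis);
    }
}
```

The per-request figure could then be logged or sent to a metrics system alongside the request latency.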
    • Visual GC
    • export GC_DEBUG="-verbose:gc \
        -XX:+PrintGCDateStamps \
        -XX:+PrintHeapAtGC \
        -XX:+PrintGCApplicationStoppedTime \
        -XX:+PrintGCApplicationConcurrentTime \
        -XX:+PrintAdaptiveSizePolicy \
        -XX:AdaptiveSizePolicyOutputInterval=1 \
        -XX:+PrintTenuringDistribution \
        -XX:+PrintGCDetails \
        -XX:+PrintCommandLineFlags \
        -XX:+PrintSafepointStatistics \
        -Xloggc:/var/log/search/gc.log"
    • 2013-04-08T20:14:00.162+0000: 4197.791: [Full GCAdaptiveSizeStart: 4206.559 collection: 213
      PSAdaptiveSizePolicy::compute_generation_free_space limits: desired_promo_size: 9927789154 promo_limit: 8321564672 free_in_old_gen: 4096 max_old_gen_size: 22190686208 avg_old_live: 22190682112
      AdaptiveSizePolicy::compute_generation_free_space limits: desired_eden_size: 9712028790 old_eden_size: 8321564672 eden_limit: 8321564672 cur_eden: 8321564672 max_eden_size: 8321564672 avg_young_live: 7340911616
      AdaptiveSizePolicy::compute_generation_free_space: gc time limit gc_cost: 1.000000 GCTimeLimit: 98
      PSAdaptiveSizePolicy::compute_generation_free_space: costs minor_time: 0.167092 major_cost: 0.965075 mutator_cost: 0.000000 throughput_goal: 0.990000 live_space: 29859940352 free_space: 16643129344 old_promo_size: 8321564672 old_eden_size: 8321564672 desired_promo_size: 8321564672 desired_eden_size: 8321564672
      AdaptiveSizeStop: collection: 213
      [PSYoungGen: 8126528K->7599356K(9480896K)] [ParOldGen: 21670588K->21670588K(21670592K)] 29797116K->29269944K(31151488K) [PSPermGen: 58516K->58512K(65536K)], 8.7690670 secs] [Times: user=137.36 sys=0.03, real=8.77 secs]
      Heap after GC invocations=213 (full 210):
       PSYoungGen total 9480896K, used 7599356K [0x00007fee47ab0000, 0x00007ff0dd000000, 0x00007ff0dd000000)
        eden space 8126528K, 93% used [0x00007fee47ab0000,0x00007ff0177ef080,0x00007ff037ac0000)
        from space 1354368K, 0% used [0x00007ff037ac0000,0x00007ff037ac0000,0x00007ff08a560000)
        to space 1354368K, 0% used [0x00007ff08a560000,0x00007ff08a560000,0x00007ff0dd000000)
       ParOldGen total 21670592K, used 21670588K [0x00007fe91d000000, 0x00007fee47ab0000, 0x00007fee47ab0000)
        object space 21670592K, 99% used [0x00007fe91d000000,0x00007fee47aaf0e0,0x00007fee47ab0000)
       PSPermGen total 65536K, used 58512K [0x00007fe915000000, 0x00007fe919000000, 0x00007fe91d000000)
        object space 65536K, 89% used [0x00007fe915000000,0x00007fe918924130,0x00007fe919000000)
      }
    • GC Log Analyzers? GCHisto, GCViewer, garbagecat, github.com/Netflix/gcviz
    • Graphing with Logster: github.com/etsy/logster
    • GC Dashboard: github.com/etsy/dashboard
    • YourKit.com
    • Designing for Partial Availability
    • JVMTI GC Hook?
    • How can a client ignore GC-ing hosts?
    • Server lies to clients about availability (TCP socket receive buffer, TCP write buffer)
    • “Banner” protocol
      1. Connect via TCP
      2. Wait ~1-10ms
      3. Either receive magic four-byte header or try another host
      4. Only send query after receiving header from server
    • 0xC0DEA5CF
    • public function open() {
          $this->handle_ = @fsockopen($this->host_, $this->port_, $errno, $errstr,
                                      $this->connectTimeout_ / 1000.0);
          try {
              // $banner_timeout, $short_hostname and $message are not defined on this slide.
              stream_set_timeout($this->handle_, 0, $banner_timeout * 1000);
              $read_start = microtime(true);
              $data = $this->readAll(4);
              $read_time = (microtime(true) - $read_start) * 1000; // seconds to millis
              $arr = unpack('N', $data); // four bytes, big-endian unsigned
              $value = $arr[1];
              if ($value !== 0xC0DEA5CF) {
                  StatsD::increment("search.baddata.{$short_hostname}.{$this->getPort()}");
                  throw new TTransportException("[$value] does not match banner [0xC0DEA5CF]");
              }
          } catch (Exception $e) {
              $this->close(); // this won't necessarily be closed by clients
              throw new TTransportException($message, self::BANNER_TIMEOUT_CODE);
          }
      }
    • private static class BannerSendingTProcessorFactory extends TProcessorFactory {
          private final TProcessor base;

          public BannerSendingTProcessorFactory(TProcessor base) {
              super(base);
              this.base = base;
          }

          @Override
          public TProcessor getProcessor(TTransport trans) {
              return new BannerTProcessor(base, (TSocket) trans);
          }
      }

      private static final class BannerTProcessor implements TProcessor {
          private final TProcessor base;
          private final TSocket tsocket;

          private BannerTProcessor(TProcessor base, TSocket tsocket) {
              this.base = checkNotNull(base);
              this.tsocket = checkNotNull(tsocket);
          }

          @Override
          public boolean process(TProtocol in, TProtocol out) throws TException {
              // Send the four-byte banner before handing off to the wrapped processor.
              this.tsocket.write(TBannerUtil.BANNER, 0, 4);
              this.tsocket.flush();
              return this.base.process(in, out);
          }
      }
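The banner handshake can also be tried end to end with plain java.net sockets, without Thrift. The sketch below is an assumption-laden illustration rather than Etsy's code: BannerDemo, openWithBanner, firstResponsive, and demo are hypothetical names, and a listening socket that never calls accept() stands in for a JVM paused in GC (the kernel still completes the TCP handshake, which is how the server "lies" about availability). The demo uses a 200 ms banner timeout to stay robust on a loaded machine; the slides suggest roughly 1-10 ms.

```java
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical stand-alone demo of the banner handshake. */
class BannerDemo {
    static final int BANNER = 0xC0DEA5CF;

    /** Connects, then waits up to bannerTimeoutMs for the four-byte banner. */
    static Socket openWithBanner(InetSocketAddress host, int connectTimeoutMs,
                                 int bannerTimeoutMs) throws IOException {
        Socket socket = new Socket();
        try {
            socket.connect(host, connectTimeoutMs);
            socket.setSoTimeout(bannerTimeoutMs);  // a stalled read throws SocketTimeoutException
            int value = new DataInputStream(socket.getInputStream()).readInt();  // big-endian
            if (value != BANNER) {
                throw new IOException("unexpected banner " + Integer.toHexString(value));
            }
            socket.setSoTimeout(0);  // banner seen; later reads use the caller's own policy
            return socket;
        } catch (IOException e) {
            socket.close();
            throw e;
        }
    }

    /** Returns the index of the first host that sends the banner in time, or -1. */
    static int firstResponsive(InetSocketAddress[] hosts, int bannerTimeoutMs) {
        for (int i = 0; i < hosts.length; i++) {
            try (Socket s = openWithBanner(hosts[i], 1000, bannerTimeoutMs)) {
                return i;  // a real client would send its query on s here
            } catch (IOException e) {
                // No banner in time (or a bad one): treat the host as unavailable.
            }
        }
        return -1;
    }

    /** One stalled listener (never accepts) and one healthy listener. */
    static int demo() {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocket stalled = new ServerSocket(0, 50, loopback);
             ServerSocket healthy = new ServerSocket(0, 50, loopback)) {
            Thread server = new Thread(() -> {
                try (Socket client = healthy.accept()) {
                    DataOutputStream out = new DataOutputStream(client.getOutputStream());
                    out.writeInt(BANNER);
                    out.flush();
                } catch (IOException ignored) {
                    // The demo ended before a client connected.
                }
            });
            server.setDaemon(true);
            server.start();
            InetSocketAddress[] hosts = {
                new InetSocketAddress(loopback, stalled.getLocalPort()),
                new InetSocketAddress(loopback, healthy.getLocalPort())
            };
            return firstResponsive(hosts, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("First responsive host index: " + demo());
    }
}
```

The client connects to the stalled host successfully but never receives the banner, so it moves on to the second host after the timeout.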
    • What if GC happens mid-request?
    • Backup requests
    • Jeff Dean: Achieving Rapid Response Time in Large Online Services
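A backup request, as commonly described, means sending the same query to a second replica if the first has not answered within a short delay, and using whichever response arrives first. The sketch below assumes that reading of the slide; BackupRequests, queryWithBackup, and the 100 ms delay are hypothetical choices, and the suppliers stand in for real replica calls.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical sketch of a hedged ("backup") request against two replicas. */
class BackupRequests {
    /**
     * Starts the primary call immediately and the backup call after backupDelayMs
     * if no result has arrived yet. Returns the first successful result.
     * The pool needs at least two threads so a stalled primary cannot block the backup.
     */
    static <T> T queryWithBackup(Supplier<T> primary, Supplier<T> backup,
                                 long backupDelayMs, ScheduledExecutorService pool)
            throws Exception {
        CompletableFuture<T> result = new CompletableFuture<>();
        AtomicInteger failures = new AtomicInteger();
        pool.execute(() -> attempt(primary, result, failures));
        pool.schedule(() -> {
            if (!result.isDone()) {
                attempt(backup, result, failures);
            }
        }, backupDelayMs, TimeUnit.MILLISECONDS);
        return result.get();
    }

    /** The first success wins; the call fails only once both attempts have failed. */
    private static <T> void attempt(Supplier<T> replica, CompletableFuture<T> result,
                                    AtomicInteger failures) {
        try {
            result.complete(replica.get());
        } catch (RuntimeException e) {
            if (failures.incrementAndGet() == 2) {
                result.completeExceptionally(e);
            }
        }
    }

    private static void sleep(long ms) {
        try {
            Thread.sleep(ms);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    /** Simulates a primary replica that takes primaryLatencyMs (for example, a GC pause). */
    static String demo(long primaryLatencyMs) {
        ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
        try {
            return queryWithBackup(
                    () -> { sleep(primaryLatencyMs); return "primary"; },
                    () -> "backup",
                    100, pool);
        } catch (Exception e) {
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();  // abandon whichever attempt is still outstanding
        }
    }

    public static void main(String[] args) {
        System.out.println(demo(0));     // fast primary answers first
        System.out.println(demo(2000));  // stalled primary; backup answers
    }
}
```

The delay is the main tuning knob: too short doubles load on the replicas, too long does little for tail latency.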
    • Sharding? Naive approach: only as fast as the slowest shard.
    • “Make a reliable whole out of unreliable parts.”
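One possible alternative to waiting for the slowest shard is to give every shard a shared deadline and merge whatever has arrived by then. The slides do not say this is the approach Etsy took; the sketch below is an assumption, with hypothetical names (PartialResults, gather) and callables standing in for real shard queries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical sketch: fan out to shards and keep only the answers that meet a deadline. */
class PartialResults {
    /** Queries all shards in parallel and returns the results that finished in time. */
    static <T> List<T> gather(List<Callable<T>> shards, long deadlineMs, ExecutorService pool)
            throws InterruptedException {
        List<T> results = new ArrayList<>();
        // invokeAll with a timeout cancels every task still running at the deadline.
        for (Future<T> future : pool.invokeAll(shards, deadlineMs, TimeUnit.MILLISECONDS)) {
            if (future.isCancelled()) {
                continue;  // shard missed the deadline (for example, paused in GC)
            }
            try {
                results.add(future.get());
            } catch (ExecutionException e) {
                // Shard failed outright; serve the response without it.
            }
        }
        return results;
    }

    /** Two fast shards and one that stalls well past the 200 ms deadline. */
    static List<Integer> demo() {
        List<Callable<Integer>> shards = new ArrayList<>();
        shards.add(() -> 1);
        shards.add(() -> 2);
        shards.add(() -> {
            Thread.sleep(5000);
            return 3;
        });
        ExecutorService pool = Executors.newFixedThreadPool(3);
        try {
            return gather(shards, 200, pool);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("Partial results: " + demo());
    }
}
```

A caller would normally also report how many shards answered, so the response can be flagged as partial.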
    • Memory Leaks
    • SolrIndexSearcher generation marking with YourKit triggers
    • Questions so far?
    • Miscellaneous Topics
    • System.gc()?
    • -XX:+UseCompressedOops
    • -XX:+UseNUMA
    • Paging
    • #!/usr/bin/env bash
      # This script is designed to be run every minute by cron.
      host=$(hostname -s)
      psout=$(ps h -p `cat /var/run/etsy-search.pid` -o min_flt,maj_flt 2>/dev/null)
      min_flt=$(echo $psout | awk '{print $1}') # minor page faults
      maj_flt=$(echo $psout | awk '{print $2}') # major page faults
      epoch_s=$(date +%s)
      echo -e "search_memstats.$host.etsy-search.min_flt\t${min_flt:-0}\t$epoch_s" | nc graphite.etsycorp.com 2003
      echo -e "search_memstats.$host.etsy-search.maj_flt\t${maj_flt:-0}\t$epoch_s" | nc graphite.etsycorp.com 2003
    • Solution 1: Buy more RAM. Ideally enough RAM to keep data in OS file buffers AND ensure no paging of VM memory AND whatever else happens on the box. ~$5-10/GB
    • echo "0" > /proc/sys/vm/swappiness
    • mlock()/mlockall(): github.com/LucidWorks/mlockall-agent
    • Mercy from the OOM Killer: echo "-17" > /proc/$PID/oom_adj
    • Huge Pages
    • -XX:+AlwaysPreTouch
    • Future Directions
    • Many small VMs instead of one large VM (microsharding)
    • Off-heap memory with sun.misc.Unsafe?
    • Try G1 again
    • Try C4 again
    • Resources
    • gchandbook.org
    • bit.ly/mmgcb: Mark Miller’s GC Bootcamp
    • bit.ly/giltene: Gil Tene, Understanding Java Garbage Collection
    • bit.ly/cpumemory: Ulrich Drepper, What Every Programmer Should Know About Memory
    • github.com/pingtimeout/jvm-options
    • Read the JVM Source (Not as scary as it sounds.) hg.openjdk.java.net/jdk7/jdk7
    • Mechanical Sympathy Google Group: bit.ly/mechsym
    • Questions? Thanks for coming! Gregg Donovan, gregg@etsy.com