Cache is King
(Or How to Stop Worrying and
Start Caching in Java)

SriSatish Ambati
Performance & Partner Engineer
Now: Azul Systems,
Upcoming: Riptano, Apache Cassandra
Twitter: @srisatish
The Trail


• Examples
• Elements of Cache Performance
   – Theory
   – Metrics
   – Focus on Oracle Coherence, Apache Cassandra
• 200GB Cache Design
• JVM in BigData Land
   – Overheads in Java – Objects, GC
   – Locks
   – Serialization
   – JMX
Wer die Wahl hat, hat die Qual!
He who has the choice has the agony!
You are on the train,
     and you are *asked* to pick a stock to invest in.

What does one do?
a) Pick all single letter stocks (Citi, Ford, Hyatt, Sprint)
b) Pick commodities
c) Pick penny stocks
d) Pick an index
e) Pick the nearest empty seat
If only we could pick an index of Cache Vendors!
Some example caches

• Homegrown caches – work surprisingly well.
  – (Do it yourself! It’s a giant hash.)
• Coherence, GemStone/VMware, GigaSpaces,
  Ehcache/Terracotta, Infinispan/JBoss etc.
• NoSQL stores: Apache Cassandra, HBase,
  SimpleDB
• Non-Java alternatives: Memcached & clones,
  Redis, CouchDB, MongoDB
Visualize Cache

• Simple example
[Figure: Replicated Cache vs. Distributed Cache]

Example, RTView: Coherence heat map
[Figure: Coherence heat map in RTView]
Elements of Cache Performance : Metrics

• Inserts: Puts/sec, Latencies
• Reads: Gets/sec, Latencies, Indexing
• Updates: mods/sec, latencies (Locate, Modify &
  Notify)
• Replication
    – Synchronous, Asynchronous (faster w)
• Consistency – Eventual
• Persistence
• Size of Objects, Number of Objects/rows
• Size of Cache
• # of cache server nodes (read-only, read-write)
• # of clients
Elements of Cache Performance:
            “Think Locality”

• Hot or Not: The 80/20 rule.
   – A small set of objects are very popular!
   – Most popular commodity of the day?
• Hit or Miss: Hit Ratio
   – How effective is your cache?
   – LRU, LFU, FIFO.. Expiration
• Long-lived objects lead to better locality.
• Spikes happen
   – Cascading events
   – Cache Thrash: full table scans
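
The hit-ratio and LRU bullets above can be sketched in a few lines. This is a minimal single-JVM illustration built on java.util.LinkedHashMap in access order; the class name LruCache and its methods are made up for this example and do not come from any product named in this deck.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache that also tracks its own hit ratio (illustrative names). */
final class LruCache<K, V> {
    private final Map<K, V> map;
    private long hits;
    private long misses;

    LruCache(final int capacity) {
        // accessOrder=true keeps the least recently accessed entry first.
        this.map = new LinkedHashMap<K, V>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
                return size() > capacity; // evict once we exceed capacity
            }
        };
    }

    synchronized V get(K key) {
        V value = map.get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    synchronized void put(K key, V value) {
        map.put(key, value);
    }

    /** Hits divided by total lookups; 0.0 before the first lookup. */
    synchronized double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }
}
```

A coarse synchronized wrapper like this is fine for a demo; the locking slides later in the deck explain why it would not scale across many cores.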
A feather in the CAP
• Tunable Consistency
   – Levels, 0,1, ALL
   – Doesn’t mean data loss
     (journaled systems)


• SEDA
   – Partitioning, Cluster-membership
     & Failure detection, Storage
     engines
   – Event driven & non-blocking io
   – Pure Java
NoSQL/Cassandra:
           furiously fast writes

[Diagram: client issues write → partitioner → find node (n1, n2) → commit log → apply to memory]


• Append only writes
   – Sequential disk access
• No locks in critical path
• Key based atomicity
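
A rough sketch of the "commit log, then apply to memory" order described above. It assumes a single writer thread and uses any OutputStream as the log sink; AppendOnlyStore and its methods are hypothetical names, and this is not Cassandra's actual implementation.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.ConcurrentSkipListMap;

/** Hypothetical write path: append to a log first, then apply to memory. */
final class AppendOnlyStore {
    private final DataOutputStream commitLog;
    private final ConcurrentSkipListMap<String, String> memtable =
            new ConcurrentSkipListMap<>();

    AppendOnlyStore(OutputStream logSink) {
        this.commitLog = new DataOutputStream(logSink);
    }

    /** Assumes one writer thread; a real store coordinates writers differently. */
    void write(String key, String value) {
        try {
            // Sequential append only: existing log bytes are never rewritten.
            commitLog.writeUTF(key);
            commitLog.writeUTF(value);
            commitLog.flush();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Only after the append succeeds is the in-memory view updated.
        memtable.put(key, value);
    }

    String read(String key) {
        return memtable.get(key);
    }
}
```

In practice the sink would be a file opened in append mode, for example new FileOutputStream(path, true).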
Performance


• Facebook Inbox
  – Writes: 0.12 ms, Reads: 15 ms @ 50 GB
    data
  – More than 10x better than MySQL
• YCSB/PNUTS benchmarks
  – 5 ms reads/writes @ 5k ops/s (50/50
    update heavy)
  – 8 ms reads/5 ms writes @ 5k ops/s (95/5
    read heavy)
• Lab environment
  – ~5k writes per sec per node, <5 ms
    latencies
  – ~10k reads per sec per node, <5 ms
[Figure: Yahoo! Cloud Serving Benchmark (YCSB), 50/50 – update heavy]
[Figure: Yahoo! Cloud Serving Benchmark (YCSB), 95/5 – read heavy]
I/O considerations


• Asynchronous
• Sockets
• Persistence
  – File, DB (CacheLoaders)
  – Dedicated disks: What happens in the cloud?
• Data Access Patterns of Doom
  – “Death by a million gets” – Batch your reads.
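
To make "batch your reads" concrete, here is a small sketch that counts round trips. CountingCacheClient is a hypothetical stand-in for a remote cache client; real products expose their own bulk-read calls, so treat the method names as assumptions.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical remote cache client that counts network round trips. */
final class CountingCacheClient {
    private final Map<String, String> remote = new HashMap<>();
    private int roundTrips;

    /** Seeds test data; not counted as a round trip in this sketch. */
    void put(String key, String value) {
        remote.put(key, value);
    }

    /** One key, one round trip. */
    String get(String key) {
        roundTrips++;
        return remote.get(key);
    }

    /** Many keys, still one round trip. Missing keys are left out. */
    Map<String, String> getAll(Collection<String> keys) {
        roundTrips++;
        Map<String, String> result = new HashMap<>();
        for (String key : keys) {
            String value = remote.get(key);
            if (value != null) {
                result.put(key, value);
            }
        }
        return result;
    }

    int roundTrips() {
        return roundTrips;
    }
}
```

Fetching 1,000 keys one by one costs 1,000 round trips here; one getAll call costs a single trip, which is where the latency saving comes from.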
Partitioning & Distributed Caches


• Near Cache/L1 Cache
    – Bring data close to the logic that is using it. (HBase)
    – Birds of a feather flock together: related data lives closer
• Read-only nodes, Read-Write nodes
• Ranges, Bloom Filters
• Management nodes
• Communication Costs
• Balancing (buckets)
• Serialization (more later)
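
One common way to handle the partitioning and balancing concerns listed above is a consistent-hash ring. The sketch below is an illustration only: HashRing, the virtual-node count and the use of CRC32 as the hash are assumptions for this example, not what any specific product in this deck does.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.TreeMap;
import java.util.zip.CRC32;

/** Illustrative consistent-hash ring with virtual nodes. */
final class HashRing {
    private final TreeMap<Long, String> ring = new TreeMap<>();
    private final int virtualNodes;

    HashRing(int virtualNodes) {
        this.virtualNodes = virtualNodes;
    }

    void addNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            ring.put(hash(node + "#" + i), node);
        }
    }

    void removeNode(String node) {
        for (int i = 0; i < virtualNodes; i++) {
            // Two-argument remove: only drop points still owned by this node.
            ring.remove(hash(node + "#" + i), node);
        }
    }

    /** Owner is the first ring point at or after the key's hash, wrapping around. */
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        Map.Entry<Long, String> owner = ring.ceilingEntry(hash(key));
        return (owner != null ? owner : ring.firstEntry()).getValue();
    }

    private static long hash(String s) {
        CRC32 crc = new CRC32();
        crc.update(s.getBytes(StandardCharsets.UTF_8));
        return crc.getValue();
    }
}
```

The point of the ring is that removing a node only moves the keys that node owned; keys on other nodes stay where they are.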
Birthdays, Collisions &
       Hashing functions

• Birthday Paradox
   – For N=21 people in a room
   – Probability that at least 2 of them share the same birthday
     is ~0.44
• Collisions are real!
• An unbalanced HashMap behaves like a list: O(n) retrieval
• Chaining & Linear probing
• Performance degrades
    – at around 80% table density
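
The birthday figure is easy to check numerically. The sketch below computes the probability that at least two of n items land in the same one of `buckets` equally likely slots; the class and method names are illustrative.

```java
/** Probability of at least one collision among n items in equally likely buckets. */
final class Birthday {
    static double collisionProbability(int n, int buckets) {
        double allDistinct = 1.0;
        for (int k = 0; k < n; k++) {
            // The (k+1)-th item must avoid the k slots already taken.
            allDistinct *= (double) (buckets - k) / buckets;
        }
        return 1.0 - allDistinct;
    }
}
```

With 365 buckets the probability crosses 0.5 at n = 23, which is the usual statement of the paradox. The same function applies to hash tables: substitute the table size for 365.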
Bloom Filter: in full bloom

• “Constant” time
• Size: compact
• False positives
• Single lookup
   for key in file
• Deletion (not supported by the basic filter)
• Improve
   – Counting BF
   – Bloomier filters
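
A minimal Bloom filter sketch to go with the bullets above. It assumes String keys and derives its k probe positions from String.hashCode() by double hashing, which is a deliberately cheap choice for illustration; SimpleBloomFilter is a made-up name and production filters use stronger hash functions.

```java
import java.util.BitSet;

/** Illustrative Bloom filter: no false negatives, some false positives. */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    SimpleBloomFilter(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    void add(String key) {
        for (int i = 0; i < hashes; i++) {
            bits.set(index(key, i));
        }
    }

    /** false means definitely absent; true means possibly present. */
    boolean mightContain(String key) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    private int index(String key, int i) {
        int h1 = key.hashCode();
        int h2 = Integer.rotateLeft(h1, 16) | 1; // cheap second hash, kept odd
        return Math.floorMod(h1 + i * h2, size);
    }
}
```

Bits are only ever set, never cleared, which is why the plain filter cannot delete keys.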
How many nodes to get a 200G cache?
Imagine
          – John Lennon
How many nodes to get a 200G cache?


• Who needs a 200G cache?
  – Disk is the new Tape!
• 200 nodes @ 1GB heap each
• 2 nodes @ 100GB heap each
  – (plus overhead)
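
The node-count arithmetic above, with the "plus overhead" made explicit. The usableFraction parameter is an assumption introduced for this sketch (the share of each heap that can actually hold cache data); the deck itself does not give a number.

```java
/** Back-of-the-envelope cache sizing (illustrative helper). */
final class CacheSizing {
    /**
     * @param cacheGb        total data to cache, in GB
     * @param heapGbPerNode  heap size per node, in GB
     * @param usableFraction assumed share of heap usable for cache data (0..1]
     */
    static long nodesNeeded(long cacheGb, long heapGbPerNode, double usableFraction) {
        double usablePerNode = heapGbPerNode * usableFraction;
        return (long) Math.ceil(cacheGb / usablePerNode);
    }
}
```

With usableFraction = 1.0 this reproduces the slide: 200 nodes at 1 GB or 2 nodes at 100 GB. If only half of each heap is usable, both counts double.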
the devil’s in the details
JVM in BigData Land!

A few limits for scale
• Object overhead
   – average enterprise collection has 3 elements!
   – Use byte[ ], primitives where possible!
• Locks: synchronized
   – Can’t use all my multi-cores!
   – java.util collections (e.g. Hashtable, Collections.synchronizedMap) also hold locks
   – Use non-blocking collections!
• (De)serialization is expensive
   – Hampers object portability, cluster scalability
   – Use Avro, Thrift!
• Garbage Collection
   – Can’t throw memory at the problem!?
   – Mitigate, Monitor, Measure footprint
Tools

• What is the JVM doing:
   – dtrace, hprof, introscope, dynatrace, jconsole,
     visualvm, yourkit, azul zvision
• Invasive JVM observation tools
   – bci, jvmti, jvmdi/pi agents, jmx, logging
• What is the OS doing:
   – dtrace, oprofile, vtune
• What are the network and disk doing:
   – Ganglia, iostat, lsof, netstat, nagios
Java Limits: Objects are not cheap!


• How many bytes for an 8-char String?
• (assume 32-bit)
   – A. 64 bytes (31% overhead)
   – String: JVM overhead 16 bytes, bookkeeping fields 12 bytes, pointer 4 bytes
   – char[]: JVM overhead 16 bytes, data 16 bytes
   – Size of String varies with JVM

• How many objects in a Tomcat idle
  instance?
Picking the right collection: Mozart or Bach?

• 100 elements of TreeMap
  of <Double, Double>
   – 82% overhead, 88 bytes constant cost
     per element
   – Enables updates while maintaining
     order
• double[], double[] –
   – 2% overhead, amortized
   – [con: load-then-use]
• Sparse collections, Empty
  collections,
• Wrong collections for the
  problem

[Diagram: TreeMap, fixed overhead 48 bytes; TreeMap$Entry, per-entry overhead 40 bytes; each Double, JVM overhead 16 bytes + double data 8 bytes]
*From one 32-bit JVM. Varies with JVM architecture.
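
A sketch of the "double[], double[]" alternative: two parallel primitive arrays, loaded once in sorted key order and then only read. SortedDoubleTable is an illustrative name; returning NaN for a missing key is a choice made for this example.

```java
import java.util.Arrays;

/** Read-only sorted table backed by two primitive arrays (load-then-use). */
final class SortedDoubleTable {
    private final double[] keys;   // must be sorted ascending
    private final double[] values; // values[i] belongs to keys[i]

    SortedDoubleTable(double[] sortedKeys, double[] values) {
        if (sortedKeys.length != values.length) {
            throw new IllegalArgumentException("keys and values differ in length");
        }
        this.keys = sortedKeys.clone();
        this.values = values.clone();
    }

    /** Returns the value for key, or NaN if the key is absent. */
    double get(double key) {
        int i = Arrays.binarySearch(keys, key);
        return i >= 0 ? values[i] : Double.NaN;
    }
}
```

Lookups stay O(log n) as with TreeMap, but there is no per-entry object and no boxing. The price is the "con" on the slide: inserting after load means rebuilding the arrays.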
JEE is not cheap either!

Million objects       Allocated   Live
JBoss 5.1             20          4
Apache Tomcat 6.0     0.25        0.1

JBoss 5.1 (allocated)
Class name                              Size (B)        Count   Avg (B)
Total                              1,410,764,512   19,830,135      71.1
char[]                               423,372,528    4,770,424      88.7
byte[]                               347,332,152    1,971,692     176.2
int[]                                 85,509,280    1,380,642      61.9
java.lang.String                      73,623,024    3,067,626        24
java.lang.Object[]                    64,788,840      565,693     114.5
java.util.regex.Matcher               51,448,320      643,104        80
java.lang.reflect.Method              43,374,528      301,212       144
java.util.HashMap$Entry[]             27,876,848      140,898     197.9
java.util.TreeMap$Entry               22,116,136      394,931        56
java.util.HashMap$Entry               19,806,440      495,161        40
java.nio.HeapByteBuffer               17,582,928      366,311        48
java.nio.HeapCharBuffer               17,575,296      366,152        48
java.lang.StringBuilder               15,322,128      638,422        24
java.util.TreeMap$EntryIterator       15,056,784      313,683        48
java.util.ArrayList                   11,577,480      289,437        40
java.util.HashMap                      7,829,056      122,329        64
java.util.TreeMap                      7,754,688      107,704        72

Apache Tomcat 6.0 (allocated)
Class name                              Size (B)        Count   Avg (B)
Total                                 21,580,592      228,805      94.3
char[]                                 4,215,784       48,574      86.8
byte[]                                 3,683,984        5,024     733.3
Built-in VM methodKlass                2,493,064       16,355     152.4
Built-in VM constMethodKlass           1,955,696       16,355     119.6
Built-in VM constantPoolKlass          1,437,240        1,284  1,119.30
Built-in VM instanceKlass              1,078,664        1,284     840.1
java.lang.Class[]                        922,808       45,354      20.3
Built-in VM constantPoolCacheK           903,360        1,132       798
java.lang.String                         753,936       31,414        24
java.lang.Object[]                       702,264        8,118      86.5
java.lang.reflect.Method                 310,752        2,158       144
short[]                                  261,112        3,507      74.5
java.lang.Class                          255,904        1,454       176
int[][]                                  184,680        2,032      90.9
java.lang.String[]                       173,176        1,746      99.2
java.util.zip.ZipEntry                   172,080        2,390        72
Another example, Overhead in collection
Garbage Collection
• Pause Times
   if stop_the_world_FullGC > ttl_of_node
    => failed requests; node repair
    => node is declared dead
• Allocation Rate
   – New object creation, insertion rate
• Live Objects (residency)
   – if residency in heap > 50%
   – GC overheads dominate.
• Overhead: space, CPU cycles spent in GC
• 64-bit does not address pause times
   – Bigger is not better!
   – 40-50% increase in heap sizes for same
      workloads.
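
One low-effort way to measure the GC overhead discussed above from inside the JVM is the standard java.lang.management API. The GcStats helper below is an illustrative sketch; which collectors are reported, and how concurrent phases are counted, varies by JVM.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

/** Reads cumulative GC time from the platform MXBeans (illustrative helper). */
final class GcStats {
    /** Total milliseconds reported by all collectors since JVM start. */
    static long totalGcMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    /** Reported GC time as a fraction of JVM uptime. */
    static double gcOverhead() {
        long uptime = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptime <= 0 ? 0.0 : (double) totalGcMillis() / uptime;
    }
}
```

Sampling gcOverhead() periodically and trending it gives an early signal before pauses start to exceed node timeouts.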
Too many free parameters!!
Tune GC:
• Entropy: the number of flags it takes to tune GC.
• Workloads in the lab do not represent production.
• Fragile: the meaning of flags changes.

Solution:
• Ask the VM vendor to provide a one-flag solution.
• Azul’s Pauseless GC (now in software)

⇒ Avoid OOM; configure node death on OOM
⇒ Azul’s Cooperative-Memory (swap space for your JVM
  under spike: No more OOM!)
Memory Fragmentation

• Fragmentation
   – Performance degrades over time
   – Inducing “Full GC” makes problem go away
   – Free memory that cannot be used

• Reduce occurrence
   – Use a compacting collector
   – Promote less often
   – Use uniform sized objects

• Solution – unsolved
   – Use latest CMS with CR:6631166
   – Azul’s Zing JVM & Pauseless GC
Sizing: Young Generation

• Should we set -Xms == -Xmx ?
• Use -Xmn (fixed eden)

[Diagram labels: allocations {new Object();} → eden → survivor spaces (survivor ratio, Tenuring Threshold) → promotion → old generation; allocation by jvm]
Generations

• Don’t promote too often!
  – Frequent promotion causes fragmentation
• Size the generations
  – Min GC times are a function of Live Set
  – Old Gen should host steady state comfortably
• Parallelize on multicores:
  – -XX:ParallelCMSThreads=4
  – -XX:ParallelGCThreads=4
• Avoid CMS initiating heuristic
  – -XX:+UseCMSInitiatingOccupancyOnly
• Use Concurrent for System.gc()
  – -XX:+ExplicitGCInvokesConcurrent
Memory Leaks

• Application takes all
  memory you got!
• Live heap shows sawtooth
• Eventually throws OOM
Theory:
• Allocated, Live heap,
  PermGen
Common sources:
• Finalizers, Classloaders,
  ThreadLocal
synchronized:
       Amdahl’s law trumps Moore’s!


• Coarse-grained locks
• I/O under lock
• Stop signal on a highway
• java.util.concurrent does not mean no locks
• Non-blocking, lock-free, wait-free collections
Locks: Distributed Caching

• Schemes
   – Optimistic, Pessimistic
• Consistency
   – Eventually vs. ACID
• Contention, Waits
• java.util.concurrent, critical sections
   – Use lock striping
• MVCC, lock-free, wait-free data structures
  (NBHM)
• Transactions are expensive
⇒ Reduce JTA abuse; set the right isolation levels.
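
Lock striping, mentioned above, can be sketched as a map split into independently locked segments. StripedMap is an illustrative name and the design is deliberately simple; java.util.concurrent.ConcurrentHashMap is the usual ready-made choice.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustrative lock-striped map: one lock per segment instead of one global lock. */
final class StripedMap<K, V> {
    private final Object[] locks;
    private final Map<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedMap(int stripes) {
        locks = new Object[stripes];
        segments = (Map<K, V>[]) new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new Object();
            segments[i] = new HashMap<>();
        }
    }

    private int stripe(Object key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    V put(K key, V value) {
        int s = stripe(key);
        synchronized (locks[s]) { // only this stripe is blocked
            return segments[s].put(key, value);
        }
    }

    V get(K key) {
        int s = stripe(key);
        synchronized (locks[s]) {
            return segments[s].get(key);
        }
    }
}
```

Threads touching keys in different stripes never contend, so the critical section shrinks roughly in proportion to the stripe count.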
writes: monitors
UUID

Are you using UUID gen for messaging?

• java.util.UUID is slow
   – static use leads to contention
SecureRandom
• Uses /dev/urandom for seed initialization
    -Djava.security.egd=file:/dev/urandom

• PRNG without file is at least 20%-40% better.
• Use TimeUUIDs where possible – much faster
• JUG – Java UUID Generator
   – http://github.com/cowtowncoder/java-uuid-generator
   – http://jug.safehaus.org/
   – http://johannburkard.de/blog/programming/java/Java-UUID-generators-compared.html
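
As a contrast to the shared SecureRandom behind UUID.randomUUID(), here is a sketch that builds a version-4 style UUID from ThreadLocalRandom. This is a swapped-in technique for illustration, not the TimeUUID approach recommended above, and the result is not cryptographically strong, so it only fits cases like message correlation IDs. FastUuid is a made-up name.

```java
import java.util.UUID;
import java.util.concurrent.ThreadLocalRandom;

/** Contention-free, non-secure random UUIDs (illustrative). */
final class FastUuid {
    static UUID next() {
        ThreadLocalRandom rnd = ThreadLocalRandom.current(); // per-thread, no shared lock
        long msb = rnd.nextLong();
        long lsb = rnd.nextLong();
        msb = (msb & 0xFFFFFFFFFFFF0FFFL) | 0x0000000000004000L; // set version to 4
        lsb = (lsb & 0x3FFFFFFFFFFFFFFFL) | 0x8000000000000000L; // set IETF variant
        return new UUID(msb, lsb);
    }
}
```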
Towards Non-blocking high scale collections!

• Big Array to hold Data
• Concurrent writes via: CAS & Finite State Machine
   – No locks, no volatile
   – Much faster than locking under heavy load
   – Directly reach main data array in 1 step
• Resize as needed
   – Copy Array to a larger Array on demand
   – Use State Machine to help copy
   – “Mark” old Array words to avoid missing late updates
• Use Non-Blocking HashMap, Google Collections
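
The CAS idea above can be shown on a tiny scale with java.util.concurrent.atomic. This sketch is not the Non-Blocking HashMap algorithm: it has no resize or state machine, and AtomicReferenceArray gives volatile semantics, unlike the "no volatile" design on the slide. CasSlots is an illustrative name.

```java
import java.util.concurrent.atomic.AtomicReferenceArray;
import java.util.function.UnaryOperator;

/** Fixed array of slots updated with compare-and-set instead of locks. */
final class CasSlots<V> {
    private final AtomicReferenceArray<V> slots;

    CasSlots(int size) {
        this.slots = new AtomicReferenceArray<>(size);
    }

    /** Claims an empty slot; returns false if another writer got there first. */
    boolean putIfEmpty(int index, V value) {
        return slots.compareAndSet(index, null, value);
    }

    /** Lock-free read-modify-write: retry until no other thread interfered. */
    V update(int index, UnaryOperator<V> fn) {
        while (true) {
            V old = slots.get(index);
            V next = fn.apply(old);
            if (slots.compareAndSet(index, old, next)) {
                return next;
            }
        }
    }

    V get(int index) {
        return slots.get(index);
    }
}
```

A failed compareAndSet costs one retry rather than a blocked thread, which is why this style holds up under heavy contention.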
Non-Blocking HashMap

[Figure: throughput (M-ops/sec) vs. threads (0 to 800) on Azul Vega2 – 768 cpus. Panel "1K Table": series NB-99, CHM-99, NB-75, CHM-75. Panel "1M Table": series NB, CHM.]
Inter-node communication

• TCP for mgmt & data: Infinispan
• TCP for mgmt, UDP for data: Coherence, Infinispan
• UDP for mgmt, TCP for data: Cassandra, Infinispan
• Instrumentation: EHCache/Terracotta

Bandwidth & Latency considerations
⇒ Ensure proper network configuration in the kernel
⇒ Run Datagram tests
⇒ Limit number of management nodes & nodes
Example, Apache Cassandra


• Partition, Ring, Gateway, BloomFilters
• Gossip Protocol
  – It’s exponential
  – (epidemic algorithm)
• Failure Detector
  – Accrual rate phi
• Anti-Entropy
  – Bringing replicas up to date
Coherence Communication Issues
Marshal Arts:
     Serialization/Deserialization



• java.io.Serializable is S.L.O.W
• Use “transient”
• jserial, Avro, etc.
• + Google Protocol Buffers,
    PortableObjectFormat (Coherence)
• + JBoss Marshalling
• + Externalizable + byte[]
• + Roll your own
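
A sketch of the "roll your own" option: writing fields straight to a byte[] with DataOutputStream instead of going through java.io.Serializable. The Quote class and its fields are hypothetical; a real format would also need versioning.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical value object with a hand-written binary encoding. */
final class Quote {
    final String symbol;
    final double price;
    final long timestamp;

    Quote(String symbol, double price, long timestamp) {
        this.symbol = symbol;
        this.price = price;
        this.timestamp = timestamp;
    }

    /** Encodes only the field values, in a fixed order, with no class metadata. */
    byte[] toBytes() {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buffer);
            out.writeUTF(symbol);
            out.writeDouble(price);
            out.writeLong(timestamp);
            out.flush();
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Reads the fields back in exactly the order toBytes() wrote them. */
    static Quote fromBytes(byte[] bytes) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes));
            return new Quote(in.readUTF(), in.readDouble(), in.readLong());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The payload carries only the data, so it is much smaller than default Java serialization; the cost is that reader and writer must agree on field order by hand.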
Serialization + Deserialization uBench

•   http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
Count what is countable, measure what is measurable, and what is not
measurable, make measurable
                                          -Galileo
Latency:
       Where have all the millis gone?

• Moore’s law amplifies bandwidth
    – Latencies are still lagging!
• Measure. 90th percentile. Look for consistency.
• JMX is great! JMX is also very slow.
• Fewer nodes means fewer MBeans!
• Monitor (network, memory, CPU), Ganglia
• Know thyself: Application Footprint, Trend data.
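
For the "Measure. 90th percentile." advice, a nearest-rank percentile over recorded latencies is enough to get started. The Percentiles helper is illustrative; monitoring tools may use interpolating definitions that give slightly different numbers.

```java
import java.util.Arrays;

/** Nearest-rank percentile over a set of latency samples (illustrative). */
final class Percentiles {
    /**
     * @param samples latency samples, in any order
     * @param pct     percentile in (0, 100], for example 90
     */
    static long percentile(long[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        // Nearest rank: smallest value with at least pct% of samples at or below it.
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }
}
```

Comparing the 50th and 90th percentiles over time shows whether latency is consistent or whether a slow tail is hiding behind a good average.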
Optimization hinders evolution
                – Alan Perlis
Q&A

References:
• Making Sense of Large Heaps, Nick Mitchell, IBM
• Oracle Coherence 3.5, Aleksandar Seovic
• Large Pages in Java, http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html
• Patterns of Doom, http://3.latest.googtst23.appspot.com/
• Infinispan Demos, http://community.jboss.org/wiki/5minutetutorialonInfinispan
• RTView, Tom Lubinski, http://www.sl.com/pdfs/SL-BACSIG-100429-final.pdf
• Google Protocol Buffers, http://code.google.com/p/protobuf/
• Azul’s Pauseless GC, http://www.azulsystems.com/technology/zing-virtual-machine
• Cliff Click’s Non-Blocking Hash Map, http://sourceforge.net/projects/high-scale-lib/
• JVM Serialization Benchmarks, http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
Cassandra links

• Werner Vogels, Eventually Consistent, http://www.allthingsdistributed.com/2008/12/eventually_consistent.html
• Bloom, Burton H. (1970), "Space/time trade-offs in hash coding with allowable errors"
• Avinash Lakshman, http://static.last.fm/johan/nosql-20090611/cassandra_nosql.pdf
• Eric Brewer, CAP, http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf
• Tony Printezis, Charlie Hunt, JavaOne Talk, http://www.scribd.com/doc/36090475/GC-Tuning-in-the-Java
• http://github.com/digitalreasoning/PyStratus/wiki/Documentation
• http://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf
• Cassandra on Cloud, http://www.coreyhulen.org/?p=326
• Cliff Click’s Non-blocking HashMap, http://sourceforge.net/projects/high-scale-lib/
• Brian F. Cooper, Yahoo! Cloud Serving Benchmark, http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf
• www.riptano.com
Further questions, reach me:
  Azul Systems, sris@azulsystems.com
  Twitter: @srisatish, srisatish@riptano.com

Cache is King (Or How To Stop Worrying And Start Caching in Java) at Chicago Mercantile Group, 2010

Search and Society: Reimagining Information Access for Radical FuturesSearch and Society: Reimagining Information Access for Radical Futures
Search and Society: Reimagining Information Access for Radical Futures
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
From Siloed Products to Connected Ecosystem: Building a Sustainable and Scala...
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
Integrating Telephony Systems with Salesforce: Insights and Considerations, B...
 
Essentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with ParametersEssentials of Automations: Optimizing FME Workflows with Parameters
Essentials of Automations: Optimizing FME Workflows with Parameters
 
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
Empowering NextGen Mobility via Large Action Model Infrastructure (LAMI): pav...
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Knowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and backKnowledge engineering: from people to machines and back
Knowledge engineering: from people to machines and back
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1UiPath Test Automation using UiPath Test Suite series, part 1
UiPath Test Automation using UiPath Test Suite series, part 1
 
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMsTo Graph or Not to Graph Knowledge Graph Architectures and LLMs
To Graph or Not to Graph Knowledge Graph Architectures and LLMs
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 

Cache is King (Or How to Stop Worrying and Start Caching in Java) at Chicago Mercantile Group, 2010

  • 1. Cache is King (Or How to Stop Worrying and Start Caching in Java) SriSatish Ambati Performance & Partner Engineer Now: Azul Systems, Upcoming: Riptano, Apache Cassandra Twitter: @srisatish
  • 2. The Trail • Examples • Elements of Cache Performance – Theory – Metrics – Focus on Oracle Coherence, Apache Cassandra • 200GB Cache Design • JVM in BigData Land – Overheads in Java – Objects, GC – Locks, – Serialization – JMX
  • 3. Wer die Wahl hat, hat die Qual! He who has the choice has the agony!
  • 4. You are on the train and are *asked* to pick a stock to invest in. What does one do? a) Pick all single-letter stocks (Citi, Ford, Hyatt, Sprint) b) Pick commodities c) Pick penny stocks d) Pick an index e) Pick the nearest empty seat
  • 5. If only we could pick an index of Cache Vendors!
  • 6. Some example caches • Homegrown caches – Surprisingly work well. – (do it yourself! It’s a giant hash) • Coherence, GemStone/VMware, GigaSpaces, Ehcache/Terracotta, Infinispan/JBoss etc. • NoSQL stores: Apache Cassandra, HBase, SimpleDB • Non-Java alternatives: Memcached & clones, Redis, CouchDB, MongoDB
  • 7. Visualize Cache • Simple example • [Figure: Replicated Cache vs. Distributed Cache]
  • 8. Example, RTView : Coherence heat map:
  • 9. Elements of Cache Performance : Metrics • Inserts: Puts/sec, Latencies • Reads: Gets/sec, Latencies, Indexing • Updates: mods/sec, latencies (Locate, Modify & Notify) • Replication – Synchronous, Asynchronous (faster w) • Consistency – Eventual • Persistence • Size of Objects, Number of Objects/rows • Size of Cache • # of cacheserver Nodes (read only, read write) • # of clients
  • 10. Elements of Cache Performance: “Think Locality” • Hot or Not: The 80/20 rule. – A small set of objects are very popular! – Most popular commodity of the day? • Hit or Miss: Hit Ratio – How effective is your cache? – LRU, LFU, FIFO.. Expiration • Long-lived objects lead to better locality. • Spikes happen – Cascading events – Cache Thrash: full table scans
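
The hit-ratio and LRU points on this slide can be sketched in a few lines of Java. This is a minimal, single-threaded illustration built on `java.util.LinkedHashMap` in access order; the class name `LruCache` and the methods `lookup` and `hitRatio` are hypothetical names chosen for this example, not part of any cache product named in the talk.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Minimal LRU cache that also tracks its hit ratio (illustrative, not thread-safe). */
final class LruCache<K, V> extends LinkedHashMap<K, V> {
    private final int capacity;
    private long hits;
    private long misses;

    LruCache(int capacity) {
        // accessOrder = true: iteration order runs from least to most recently used
        super(16, 0.75f, true);
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        // Called after every insert; evict the least recently used entry when full
        return size() > capacity;
    }

    /** Like get(), but counts hits and misses (a stored null would count as a miss). */
    V lookup(K key) {
        V value = get(key);
        if (value == null) {
            misses++;
        } else {
            hits++;
        }
        return value;
    }

    /** Fraction of lookups served from the cache; 0.0 before any lookup. */
    double hitRatio() {
        long total = hits + misses;
        return total == 0 ? 0.0 : (double) hits / total;
    }

    public static void main(String[] args) {
        LruCache<String, String> cache = new LruCache<>(2);
        cache.put("GOLD", "hot");
        cache.put("OIL", "warm");
        cache.lookup("GOLD");       // hit; GOLD becomes most recently used
        cache.put("CORN", "new");   // cache is full, so OIL (least recently used) is evicted
        cache.lookup("OIL");        // miss
        System.out.printf("hit ratio = %.2f%n", cache.hitRatio());
    }
}
```

A production cache would add expiration and thread safety; the point here is only that the eviction policy and the hit ratio are cheap to make measurable.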
  • 11. A feather in the CAP • Tunable Consistency – Levels, 0,1, ALL – Doesn’t mean data loss (journaled systems) • SEDA – Partitioning, Cluster-membership & Failure detection, Storage engines – Event driven & non-blocking io – Pure Java
  • 12. NoSQL/Cassandra: furiously fast writes • [Diagram labels: client issues write; n1; partitioner; find node; n2; commit log; apply to memory] • Append-only writes – Sequential disk access • No locks in critical path • Key-based atomicity
  • 13. Performance • Facebook Inbox – Writes: 0.12ms, Reads: 15ms @ 50GB data – More than 10x better than MySQL • YCSB/PNUTS benchmarks – 5ms read/writes @ 5k ops/s (50/50 Update heavy) – 8ms reads/5ms writes @ 5k ops/s (95/5 read heavy) • Lab environment – ~5k writes per sec per node, <5ms latencies – ~10k reads per sec per node, <5ms
  • 14. Yahoo! Cloud Serving Benchmark (YCSB) 50/50 – Update Heavy
  • 15. Yahoo! Cloud Serving Benchmark (YCSB) 95/5 – Read Heavy
  • 16. I/O considerations • Asynchronous • Sockets • Persistence – File, DB (CacheLoaders) – Dedicated disks: What happens in the cloud? • Data Access Patterns of Doom – “Death by a million gets” – Batch your reads.
  • 17. Partitioning & Distributed Caches • Near Cache/L1 Cache – Bring data close to the logic that is using it. (HBase) – Birds of a feather flock together: related data live closer • Read-only nodes, Read-Write nodes • Ranges, Bloom Filters • Management nodes • Communication Costs • Balancing (buckets) • Serialization (more later)
  • 18. Birthdays, Collisions & Hashing functions • Birthday Paradox – For N=21 people in a room – Probability that at least 2 of them share the same birthday is ~0.44 (it passes 0.5 at N=23) • Collisions are real! • An unbalanced HashMap behaves like a list: O(n) retrieval • Chaining & Linear probing • Performance degrades – with 80% table density
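
The birthday-paradox probability is easy to compute directly, and the same formula estimates the chance of at least one collision when n keys hash uniformly into a table with a given number of buckets. The sketch below assumes a perfectly uniform hash; the class and method names are made up for this example.

```java
/** Probability of at least one collision among n uniformly distributed items (illustrative). */
final class BirthdayParadox {

    /**
     * P(at least two of n items land in the same bucket), assuming each item
     * picks one of `buckets` slots uniformly and independently.
     */
    static double collisionProbability(int n, int buckets) {
        double pNoCollision = 1.0;
        for (int i = 0; i < n; i++) {
            // The (i+1)-th item must avoid the i slots already taken
            pNoCollision *= (double) (buckets - i) / buckets;
        }
        return 1.0 - pNoCollision;
    }

    public static void main(String[] args) {
        // Birthdays: 365 "buckets"
        for (int people = 20; people <= 24; people++) {
            System.out.printf("N=%d -> %.4f%n", people, collisionProbability(people, 365));
        }
        // Hashing: a few hundred keys in a 64K-slot table
        System.out.printf("300 keys, 65536 slots -> %.4f%n", collisionProbability(300, 65536));
    }
}
```

The second call in `main` illustrates the slide's point for hash tables: collisions become likely long before a table is anywhere near full.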
  • 19. Bloom Filter: in full bloom • “constant” time • size:compact • false positives • Single lookup for key in file • Deletion • Improve – Counting BF – Bloomier filters
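
A Bloom filter with the properties listed on this slide (compact, "constant" time, false positives but no false negatives, no deletion) can be sketched as follows. This is a teaching sketch, not Cassandra's or any other product's implementation: the class name `SimpleBloomFilter`, the FNV-1a-style hash, and the double-hashing scheme are all choices made for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.BitSet;

/** Minimal Bloom filter over strings (illustrative; sizes and hashes are arbitrary choices). */
final class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;       // number of bits in the filter
    private final int hashCount;  // number of bit positions set per key

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    void add(String key) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(key, i));
        }
    }

    /** false means definitely absent; true means possibly present (false positives happen). */
    boolean mightContain(String key) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(key, i))) {
                return false;
            }
        }
        return true;
    }

    // Double hashing: derive the i-th position from two base hashes, h1 + i * h2
    private int index(String key, int i) {
        byte[] data = key.getBytes(StandardCharsets.UTF_8);
        int h1 = fnv1a(data, 0x811C9DC5);
        int h2 = fnv1a(data, 0x01000193) | 1; // force odd so the stride is never zero
        return Math.floorMod(h1 + i * h2, size);
    }

    // FNV-1a style mixing; the seed parameter is an assumption made to get two hashes
    private static int fnv1a(byte[] data, int seed) {
        int h = seed;
        for (byte b : data) {
            h ^= (b & 0xff);
            h *= 16777619;
        }
        return h;
    }

    public static void main(String[] args) {
        SimpleBloomFilter filter = new SimpleBloomFilter(1024, 3);
        filter.add("row-key-1");
        filter.add("row-key-2");
        System.out.println(filter.mightContain("row-key-1")); // present keys are always reported
        System.out.println(filter.mightContain("row-key-9")); // usually false, occasionally a false positive
    }
}
```

Because bits are shared between keys, clearing them on delete would create false negatives; that is why the slide points to counting Bloom filters as an improvement.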
  • 20. How many nodes to get a 200G cache?
  • 21. Imagine – John Lennon
  • 22. How many nodes to get a 200G cache? • Who needs a 200G cache? – Disk is the new Tape! • 200 nodes @ 1GB heap each • 2 nodes @ 100GB heap each – (plus overhead)
  • 23. the devil’s in the details
  • 24. JVM in BigData Land! A few limits for scale • Object overhead – average enterprise collection has 3 elements! – Use byte[ ], primitives where possible! • Locks : synchronized – Can’t use all my multi-cores! – java.util collections also hold locks – Use non-blocking collections! • (de) Serialization is expensive – Hampers object portability, cluster scalability – Use Avro, Thrift! • Garbage Collection – Can’t throw memory at the problem!? – Mitigate, Monitor, Measure footprint
  • 25. Tools • What is the JVM doing: – dtrace, hprof, introscope, dynatrace, jconsole, visualvm, yourkit, azul zvision • Invasive JVM observation tools – bci, jvmti, jvmdi/pi agents, jmx, logging • What is the OS doing: – dtrace, oprofile, vtune • What is the network/disk doing: – Ganglia, iostat, lsof, netstat, nagios
  • 26. Java Limits: Objects are not cheap! • How many bytes for an 8-char String? (assume 32-bit; varies with JVM) • A. 64 bytes – String object: JVM overhead 16 bytes, bookkeeping fields 12 bytes, pointer 4 bytes – char[]: JVM overhead 16 bytes, data 16 bytes – [Figure label: 31% overhead, size of String] • How many objects in an idle Tomcat instance?
  • 27. Picking the right collection: Mozart or Bach? • TreeMap: 100 elements of TreeMap<Double, Double> – Fixed overhead: 48 bytes – Per-entry overhead (TreeMap$Entry): 40 bytes – Double: JVM overhead 16 bytes, data 8 bytes – 82% overhead, 88 bytes constant cost per element – Enables updates while maintaining order • double[], double[] – 2% overhead, amortized – [con: load-then-use] • Sparse collections, empty collections, wrong collections for the problem • *From one 32-bit JVM. Varies with JVM architecture.
  • 28. JEE is not cheap either!

      Million objects      Allocated   Live
      JBoss 5.1            20          4
      Apache Tomcat 6.0    0.25        0.1

      JBoss 5.1 (allocated)
      Class name                         Size (B)        Count        Avg (B)
      Total                              1,410,764,512   19,830,135   71.1
      char[]                             423,372,528     4,770,424    88.7
      byte[]                             347,332,152     1,971,692    176.2
      int[]                              85,509,280      1,380,642    61.9
      java.lang.String                   73,623,024      3,067,626    24
      java.lang.Object[]                 64,788,840      565,693      114.5
      java.util.regex.Matcher            51,448,320      643,104      80
      java.lang.reflect.Method           43,374,528      301,212      144
      java.util.HashMap$Entry[]          27,876,848      140,898      197.9
      java.util.TreeMap$Entry            22,116,136      394,931      56
      java.util.HashMap$Entry            19,806,440      495,161      40
      java.nio.HeapByteBuffer            17,582,928      366,311      48
      java.nio.HeapCharBuffer            17,575,296      366,152      48
      java.lang.StringBuilder            15,322,128      638,422      24
      java.util.TreeMap$EntryIterator    15,056,784      313,683      48
      java.util.ArrayList                11,577,480      289,437      40
      java.util.HashMap                  7,829,056       122,329      64
      java.util.TreeMap                  7,754,688       107,704      72

      Apache Tomcat 6.0 (allocated)
      Class name                         Size (B)        Count        Avg (B)
      Total                              21,580,592      228,805      94.3
      char[]                             4,215,784       48,574       86.8
      byte[]                             3,683,984       5,024        733.3
      Built-in VM methodKlass            2,493,064       16,355       152.4
      Built-in VM constMethodKlass       1,955,696       16,355       119.6
      Built-in VM constantPoolKlass      1,437,240       1,284        1,119.30
      Built-in VM instanceKlass          1,078,664       1,284        840.1
      java.lang.Class[]                  922,808         45,354       20.3
      Built-in VM constantPoolCacheK     903,360         1,132        798
      java.lang.String                   753,936         31,414       24
      java.lang.Object[]                 702,264         8,118        86.5
      java.lang.reflect.Method           310,752         2,158        144
      short[]                            261,112         3,507        74.5
      java.lang.Class                    255,904         1,454        176
      int[][]                            184,680         2,032        90.9
      java.lang.String[]                 173,176         1,746        99.2
      java.util.zip.ZipEntry             172,080         2,390        72
  • 29. Another example, Overhead in collection
  • 30. Garbage Collection • Pause Times: if stop_the_world_FullGC > ttl_of_node => failed requests; node repair => node is declared dead • Allocation Rate – New object creation, insertion rate • Live Objects (residency) – if residency in heap > 50% – GC overheads dominate. • Overhead: space, CPU cycles spent in GC • 64-bit not addressing pause times – Bigger is not better! – 40-50% increase in heap sizes for same workloads.
  • 31. Too many free parameters!! Tune GC: • Entropy is: Number of flags it takes to tune GC. • Workloads in lab do not represent production • Fragile, Meaning of flags changes. Solution: • Ask VM vendor to provide one flag soln. • Azul’s PauselessGC (now in software) ⇒ Avoid OOM, configure node death if OOM ⇒ Azul’s Cooperative-Memory (swap space for your jvm under spike: No more OOM!)
  • 32. Memory Fragmentation • Fragmentation – Performance degrades over time – Inducing “Full GC” makes problem go away – Free memory that cannot be used • Reduce occurrence – Use a compacting collector – Promote less often – Use uniform sized objects • Solution – unsolved – Use latest CMS with CR:6631166 – Azul’s Zing JVM & Pauseless GC
  • 33. Sizing: Young Generation • Should we set -Xms == -Xmx ? • Use -Xmn (fixed eden) • [Diagram labels: allocations {new Object();}; eden; survivor ratio; survivor spaces; tenuring threshold; promotion; old generation; allocation by JVM]
  • 34. Generations • Don’t promote too often! – Frequent promotion causes fragmentation • Size the generations – Min GC times are a function of Live Set – Old Gen should host steady state comfortably • Parallelize on multicores: – -XX:ParallelCMSThreads=4 – -XX:ParallelGCThreads=4 • Avoid CMS Initiating heuristic – -XX:+UseCMSInitiatingOccupancyOnly • Use Concurrent for System.gc() – -XX:+ExplicitGCInvokesConcurrent
  • 35. Memory Leaks • Application takes all memory you got! • Live heap shows sawtooth • Eventually throws OOM Theory: • Allocated, Live heap, PermGen Common sources: • Finalizers, Classloaders, ThreadLocal
  • 36. synchronized: Amdahl’s law trumps Moore’s! • Coarse grained locks • io under lock • Stop signal on a highway • java.util.concurrent does not mean no locks • Non Blocking, Lock free, Wait free collections
  • 37. Locks: Distributed Caching • Schemes – Optimistic, Pessimistic • Consistency – Eventually vs. ACID • Contention, Waits • java.util.concurrent, critical sections, – Use Lock Striping • MVCC, Lock-free, wait-free DataStructures. (NBHM) • Transactions are expensive ⇒Reduce JTA abuse, Set the right isolation levels.
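
Lock striping, mentioned on this slide, replaces one coarse lock with an array of locks so that threads touching different keys rarely contend. The sketch below is a minimal illustration under that assumption; `StripedMap` is a hypothetical class written for this example (in practice `java.util.concurrent.ConcurrentHashMap` or a non-blocking map would normally be used instead).

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReentrantLock;

/** Map guarded by N independent locks instead of one global lock (illustrative sketch). */
final class StripedMap<K, V> {
    private final ReentrantLock[] locks;
    private final Map<K, V>[] segments;

    @SuppressWarnings("unchecked")
    StripedMap(int stripes) {
        locks = new ReentrantLock[stripes];
        segments = (Map<K, V>[]) new Map[stripes];
        for (int i = 0; i < stripes; i++) {
            locks[i] = new ReentrantLock();
            segments[i] = new HashMap<>();
        }
    }

    // Each key always maps to the same stripe, so one lock protects all of its state
    private int stripeFor(Object key) {
        return Math.floorMod(key.hashCode(), locks.length);
    }

    V put(K key, V value) {
        int s = stripeFor(key);
        locks[s].lock();
        try {
            return segments[s].put(key, value);
        } finally {
            locks[s].unlock();
        }
    }

    V get(Object key) {
        int s = stripeFor(key);
        locks[s].lock();
        try {
            return segments[s].get(key);
        } finally {
            locks[s].unlock();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        StripedMap<Integer, String> map = new StripedMap<>(16);
        Thread[] writers = new Thread[4];
        for (int t = 0; t < writers.length; t++) {
            final int base = t * 1000;
            // Each writer uses its own key range; keys spread over the 16 stripes
            writers[t] = new Thread(() -> {
                for (int i = 0; i < 1000; i++) {
                    map.put(base + i, "v" + (base + i));
                }
            });
            writers[t].start();
        }
        for (Thread w : writers) {
            w.join();
        }
        System.out.println(map.get(2500));
    }
}
```

Operations that span stripes (size, iteration, resize) need every lock or a different design, which is part of why the slide moves on to lock-free and wait-free structures.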
  • 39. UUID: Are you using UUID gen for messaging? • java.util.UUID is slow – static use leads to contention SecureRandom • Uses /dev/urandom for seed initialization -Djava.security.egd=file:/dev/urandom • PRNG without file is at least 20%-40% better. • Use TimeUUIDs where possible – much faster • JUG – java.uuid.generator • http://github.com/cowtowncoder/java-uuid-generator • http://jug.safehaus.org/ • http://johannburkard.de/blog/programming/java/Java-UUID-generators-compared.html
  • 40. Towards Non-blocking high scale collections! • Big Array to hold Data • Concurrent writes via: CAS & Finite State Machine – No locks, no volatile – Much faster than locking under heavy load – Directly reach main data array in 1 step • Resize as needed – Copy Array to a larger Array on demand – Use State Machine to help copy – “Mark” old Array words to avoid missing late updates • Use Non-Blocking HashMap, Google Collections
  • 41. Non-Blocking HashMap • Azul Vega2 – 768 cpus • [Figure: two plots, “1K Table” and “1M Table”; y-axis M-ops/sec (0 to 1200), x-axis Threads (0 to 800); series labels NB-99, CHM-99, NB-75, CHM-75, NB, CHM]
  • 42. Inter-node communication • TCP for mgmt & data: Infinispan • TCP for mgmt, UDP for data: Coherence, Infinispan • UDP for mgmt, TCP for data: Cassandra, Infinispan • Instrumentation: EHCache/Terracotta Bandwidth & Latency considerations ⇒ Ensure proper network configuration in the kernel ⇒ Run Datagram tests ⇒ Limit number of management nodes & nodes
  • 43. Example, Apache Cassandra • Partition, Ring, Gateway, BloomFilters • Gossip Protocol – It’s exponential – (epidemic algorithm) • Failure Detector – Accrual rate phi • Anti-Entropy – Bringing replicas up to date
  • 45. Marshal Arts: Serialization/Deserialization • java.io.Serializable is S.L.O.W • Use “transient” • jserial, Avro, etc. • + Google Protocol Buffers, • PortableObjectFormat (Coherence) • + JBossMarshalling • + Externalizable + byte[] • + Roll your own
  • 46. Serialization + Deserialization uBench • http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
  • 47. Count what is countable, measure what is measurable, and what is not measurable, make measurable -Galileo
  • 48. Latency: Where have all the millis gone? • Moore’s law amplifies bandwidth – Latencies are still lagging! • Measure. 90th percentile. Look for consistency. • => JMX is great! JMX is also very slow. • Fewer nodes means fewer MBeans! • Monitor (network, memory, cpu), ganglia, • Know thyself: Application Footprint, Trend data.
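
Measuring the 90th percentile rather than the average takes only a few lines. The sketch below uses the nearest-rank definition of a percentile, which is one of several common definitions; the class name and the sample values are invented for this example.

```java
import java.util.Arrays;

/** Nearest-rank percentile over recorded latencies (illustrative sketch). */
final class LatencyPercentile {

    /**
     * Returns the smallest sample such that at least pct percent of all
     * samples are less than or equal to it (nearest-rank method).
     */
    static long percentile(long[] samples, double pct) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone(); // leave the caller's array untouched
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(pct * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Hypothetical request latencies in milliseconds, with one slow outlier
        long[] latenciesMs = {1, 2, 3, 4, 5, 6, 7, 8, 9, 100};
        double mean = Arrays.stream(latenciesMs).average().orElse(0);
        System.out.println("mean = " + mean);
        System.out.println("p50  = " + percentile(latenciesMs, 50));
        System.out.println("p90  = " + percentile(latenciesMs, 90));
        System.out.println("p100 = " + percentile(latenciesMs, 100));
    }
}
```

With a single outlier the mean is pulled far above what most requests experience, while the percentiles show both the typical case and the tail; tracking them over time is what "look for consistency" amounts to.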
  • 50. Q&A • References: • Making Sense of Large Heaps, Nick Mitchell, IBM • Oracle Coherence 3.5, Aleksandar Seovic • Large Pages in Java http://andrigoss.blogspot.com/2008/02/jvm-performance-tuning.html • Patterns of Doom http://3.latest.googtst23.appspot.com/ • Infinispan Demos http://community.jboss.org/wiki/5minutetutorialonInfinispan • RTView, Tom Lubinski, http://www.sl.com/pdfs/SL-BACSIG-100429-final.pdf • Google Protocol Buffers, http://code.google.com/p/protobuf/ • Azul’s Pauseless GC http://www.azulsystems.com/technology/zing-virtual-machine • Cliff Click’s Non-Blocking Hash Map http://sourceforge.net/projects/high-scale-lib/ • JVM Serialization Benchmarks: http://code.google.com/p/thrift-protobuf-compare/wiki/BenchmarkingV2
  • 51. Cassandra links • Werner Vogels, Eventually Consistent http://www.allthingsdistributed.com/2008/12/eventually_consistent.html • Bloom, Burton H. (1970), "Space/time trade-offs in hash coding with allowable errors" • Avinash Lakshman, http://static.last.fm/johan/nosql-20090611/cassandra_nosql.pdf • Eric Brewer, CAP http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf • Tony Printezis, Charlie Hunt, JavaOne Talk http://www.scribd.com/doc/36090475/GC-Tuning-in-the-Java • http://github.com/digitalreasoning/PyStratus/wiki/Documentation • http://www.cs.cornell.edu/home/rvr/papers/flowgossip.pdf • Cassandra on Cloud, http://www.coreyhulen.org/?p=326 • Cliff Click’s Non-blocking HashMap http://sourceforge.net/projects/high-scale-lib/ • Brian F. Cooper, Yahoo! Cloud Serving Benchmark, http://www.brianfrankcooper.net/pubs/ycsb-v4.pdf • www.riptano.com
  • 52. Further questions, reach me: Azul Systems, sris@azulsystems.com Twitter: @srisatish, srisatish@riptano.com