From distributed caches to in-memory data grids


A brief introduction to modern caching technologies, from distributed Memcached to modern data grids like Oracle Coherence.
The slides were presented during a distributed caching tech talk in Moscow, May 17, 2012.


    1. From distributed caches to in-memory data grids
       TechTalk by Max A. Alexejev
    2. Memory hierarchy
       Cost per byte grows toward the top of the hierarchy; capacity and storage term grow toward the bottom:
       • Registers: <1 ns
       • L1 cache: ~4 cycles (~1 ns)
       • L2 cache: ~10 cycles (~3 ns)
       • L3 cache: ~42 cycles (~15 ns)
       • DRAM: >65 ns
       • Flash / SSD / USB
       • HDD
       • Tapes, remote systems, etc.
    3. Software caches
       • Improve response times by reducing data access latency
       • Offload persistent storage
       • Only help IO-bound applications!
    4. Caches and data location
       Caches differ by where the data lives: local, remote, shared, distributed, hierarchical.
       Spreading data beyond one location brings in a distribution algorithm and a consistency protocol.
    5. OK, so how do we grow beyond one node? Data replication.
    6. Pros and cons of replication
       Pros:
       • Best read performance (for local replicated caches)
       • Fault-tolerant cache (both local and remote)
       • Can be smart: replicate only part of the CRUD cycle
       Cons:
       • Poor write performance
       • Additional network load
       • Scales only vertically: limited by single machine size
       • Master-master replication requires a complex consistency protocol
    7. OK, so how do we grow beyond one node? Data distribution.
    8. Pros and cons of data distribution
       Pros:
       • Can scale horizontally beyond single machine size
       • Read and write performance scales horizontally
       Cons:
       • No fault tolerance for cached data
       • Increased read latency (network round-trip and serialization costs)
    9. What do high-load applications need from a cache?
       • Linear horizontal scalability
       • Distributed cache
       • Low latency
    10. Cache access patterns: Cache Aside
       Reading data:
       1. Application asks for data for a given key
       2. Check the cache
       3. If the data is in the cache, return it to the user
       4. If not, fetch it from the DB, put it in the cache, and return it to the user
       Writing data:
       1. Application writes new data or updates existing data
       2. Write it to the cache
       3. Write it to the DB
       Overall:
       • Increases read performance
       • Offloads DB reads
       • Introduces race conditions for writes
       (A minimal code sketch of this pattern follows below.)
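       To make the read and write paths concrete, here is a minimal cache-aside sketch in Java.
       The UserStore interface and the in-memory map are assumptions standing in for a real DAO
       and a real cache client; it illustrates the pattern, not any product's API.

       import java.util.Map;
       import java.util.concurrent.ConcurrentHashMap;

       public class CacheAsideRepository {

           /** Hypothetical persistent store interface. */
           public interface UserStore {
               String load(String key);
               void save(String key, String value);
           }

           private final Map<String, String> cache = new ConcurrentHashMap<>();
           private final UserStore db;

           public CacheAsideRepository(UserStore db) {
               this.db = db;
           }

           /** Read path: check the cache first, fall back to the DB and populate the cache. */
           public String read(String key) {
               String cached = cache.get(key);
               if (cached != null) {
                   return cached;
               }
               String fromDb = db.load(key);    // cache miss: go to the DB
               if (fromDb != null) {
                   cache.put(key, fromDb);      // populate the cache for the next readers
               }
               return fromDb;
           }

           /** Write path: update both the cache and the DB, with no coordination between them. */
           public void write(String key, String value) {
               cache.put(key, value);           // concurrent writers may interleave here,
               db.save(key, value);             // which is the race condition the slide mentions
           }
       }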
    11. Cache access patterns: Read Through
       Reading data:
       1. Application asks for data for a given key
       2. Check the cache
       3. If the data is in the cache, return it to the user
       4. If not, the cache itself fetches the value from the DB, saves it, and returns it to the user
       Overall:
       • Reduces read latency
       • Offloads read load from the underlying storage
       • May have blocking behavior, thus helping against the dog-pile effect on the DB
       • Requires "smarter" cache nodes
       (A minimal sketch follows below.)
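       A minimal read-through sketch, assuming the loader function stands in for a DB call;
       computeIfAbsent is used here to show the blocking, dog-pile-safe behavior mentioned above.

       import java.util.concurrent.ConcurrentHashMap;
       import java.util.function.Function;

       public class ReadThroughCache<K, V> {

           private final ConcurrentHashMap<K, V> store = new ConcurrentHashMap<>();
           private final Function<K, V> loader;   // e.g. key -> dao.findById(key), an assumption

           public ReadThroughCache(Function<K, V> loader) {
               this.loader = loader;
           }

           public V get(K key) {
               // computeIfAbsent blocks other callers asking for the same key while the value
               // is being loaded, so only one DB fetch happens per missing key.
               return store.computeIfAbsent(key, loader);
           }
       }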
    12. Cache access patterns: Write Through
       Writing data:
       1. Application writes new data or updates existing data
       2. Write it to the cache
       3. The cache then synchronously writes it to the DB
       Overall:
       • Slightly increases write latency
       • Provides natural invalidation
       • Removes race conditions on writes
       (A minimal sketch follows below.)
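       A minimal write-through sketch; the dbWriter callback is a hypothetical stand-in for a DB DAO.
       A put() only returns after both the cache and the synchronous DB write have completed.

       import java.util.Map;
       import java.util.concurrent.ConcurrentHashMap;
       import java.util.function.BiConsumer;

       public class WriteThroughCache<K, V> {

           private final Map<K, V> store = new ConcurrentHashMap<>();
           private final BiConsumer<K, V> dbWriter;   // e.g. (k, v) -> dao.upsert(k, v), an assumption

           public WriteThroughCache(BiConsumer<K, V> dbWriter) {
               this.dbWriter = dbWriter;
           }

           public void put(K key, V value) {
               store.put(key, value);       // update the cache
               dbWriter.accept(key, value); // ...and synchronously write to the DB before returning
           }

           public V get(K key) {
               return store.get(key);
           }
       }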
    13. Cache access patterns: Write Behind
       Writing data:
       1. Application writes new data or updates existing data
       2. Write it to the cache
       3. The cache adds the write request to its internal queue
       4. Later, the cache asynchronously flushes the queue to the DB, periodically and/or when the queue reaches a certain size
       Overall:
       • Dramatically reduces write latency, at the price of an inconsistency window
       • Provides write batching
       • May provide deduplication of updates
       (A minimal sketch follows below.)
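       A minimal write-behind sketch, assuming a hypothetical batchWriter callback for the DB.
       Puts go to the cache and to an internal queue; a background task flushes the queue in batches.
       Deduplication could be added by keying the pending writes by K instead of queueing every update.

       import java.util.AbstractMap;
       import java.util.ArrayList;
       import java.util.List;
       import java.util.Map;
       import java.util.concurrent.BlockingQueue;
       import java.util.concurrent.ConcurrentHashMap;
       import java.util.concurrent.Executors;
       import java.util.concurrent.LinkedBlockingQueue;
       import java.util.concurrent.ScheduledExecutorService;
       import java.util.concurrent.TimeUnit;
       import java.util.function.Consumer;

       public class WriteBehindCache<K, V> {

           private final Map<K, V> store = new ConcurrentHashMap<>();
           private final BlockingQueue<Map.Entry<K, V>> queue = new LinkedBlockingQueue<>();
           private final Consumer<List<Map.Entry<K, V>>> batchWriter;

           public WriteBehindCache(Consumer<List<Map.Entry<K, V>>> batchWriter,
                                   long flushPeriodMillis) {
               this.batchWriter = batchWriter;
               ScheduledExecutorService flusher = Executors.newSingleThreadScheduledExecutor();
               flusher.scheduleAtFixedRate(this::flush, flushPeriodMillis,
                       flushPeriodMillis, TimeUnit.MILLISECONDS);
           }

           public void put(K key, V value) {
               store.put(key, value);                                          // visible to readers immediately
               queue.add(new AbstractMap.SimpleImmutableEntry<>(key, value));  // the DB write is deferred
           }

           private void flush() {
               List<Map.Entry<K, V>> batch = new ArrayList<>();
               queue.drainTo(batch);
               if (!batch.isEmpty()) {
                   batchWriter.accept(batch);  // one batched DB round-trip instead of many
               }
           }
       }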
    14. A variety of products on the market: Memcached, Hazelcast, Cassandra, GigaSpaces, Redis, Terracotta, Oracle Coherence, Infinispan, MongoDB, Riak, EhCache, ...
    15. Let's sort them out!
       • KV caches: Memcached, Ehcache, ...
       • NoSQL: Redis, Cassandra, MongoDB, ...
       • Data grids: Oracle Coherence, GemFire, GigaSpaces, GridGain, Hazelcast, Infinispan, ...
       Some products are really hard to sort, like Terracotta in both DSO and Express modes.
    16. Why don't we have any distributed in-memory RDBMS?
       Master – multi-slave configuration:
       • Is, in fact, an example of replication
       • Helps distribute reads, but does not help with writes
       • Does not scale beyond a single master
       Horizontal partitioning (sharding):
       • Helps with reads and writes for datasets with good data affinity
       • Does not play nicely with join semantics (i.e., there are no distributed joins)
    17. Key-value caches
       • Memcached and EHCache are good examples to look at
       • Keys and values are arbitrary binary (serializable) entities
       • Basic operations are put(K,V), get(K), replace(K,V), remove(K)
       • May provide group operations like getAll(...) and putAll(...)
       • Some operations provide atomicity guarantees (CAS, inc/dec)
       (A sketch of this contract as a Java interface follows below.)
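       The contract above, sketched as a Java interface. Method names mirror the slide,
       not any particular product's API; the atomic operations are optional extras.

       import java.util.Collection;
       import java.util.Map;

       public interface KeyValueCache<K, V> {

           void put(K key, V value);

           V get(K key);

           /** Replace only if the key is already present; returns the previous value or null. */
           V replace(K key, V value);

           void remove(K key);

           // Optional group operations
           Map<K, V> getAll(Collection<? extends K> keys);

           void putAll(Map<? extends K, ? extends V> entries);

           // Optional atomic operations (CAS-style updates and counters)
           boolean compareAndSet(K key, V expected, V update);

           long increment(K key, long delta);
       }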
    18. Memcached
       • Developed for LiveJournal in 2003
       • Has client libraries in PHP, Java, Ruby, Python and many others
       • Nodes are independent and don't communicate with each other: the client decides which node owns a key (see the sketch below)
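       Since the nodes do not coordinate, the client library picks the node for each key.
       Below is a naive sketch of that selection (hash modulo node count); real clients
       typically use consistent hashing (e.g. ketama) so that adding or removing a node
       remaps only a fraction of the keys. The class and its setup are illustrative only.

       import java.net.InetSocketAddress;
       import java.util.List;

       public class NodeSelector {

           private final List<InetSocketAddress> nodes;

           public NodeSelector(List<InetSocketAddress> nodes) {
               this.nodes = nodes;
           }

           public InetSocketAddress nodeFor(String key) {
               int hash = key.hashCode();
               int index = Math.floorMod(hash, nodes.size()); // keep the index non-negative
               return nodes.get(index);
           }
       }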
    19. EHCache
       • Initially named "Easy Hibernate Cache"
       • Java-centric, mature product with open-source and commercial editions
       • The open-source version provides only replication; distributed caching requires a commercial license for both EHCache and Terracotta TSA
    20. NoSQL systems
       A whole bunch of different products with both persistent and non-persistent storage options; let's call them caches and storages accordingly.
       • Built to provide good horizontal scalability
       • Try to fill the feature gap between pure KV stores and a full-blown RDBMS
    21. Case study: Redis
       • Written in C, supported by VMware
       • Client libraries for C, C#, Java, Scala, PHP, Erlang, etc.
       • Single-threaded asynchronous implementation
       • Has configurable persistence
       • Works with K-V pairs, where K is a string and V may be a number, a string or an object (JSON)
       • Provides 5 interfaces: strings, hashes, lists, sets, sorted sets
       • Supports transactions
       Example:
           hset users:goku powerlevel 9000
           hget users:goku powerlevel
       (A Java client example follows below.)
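       The same hash commands, issued from Java via the Jedis client (one common Redis
       client library; the host, port and key names are assumptions for illustration).

       import redis.clients.jedis.Jedis;

       public class RedisHashExample {
           public static void main(String[] args) {
               try (Jedis jedis = new Jedis("localhost", 6379)) {
                   jedis.hset("users:goku", "powerlevel", "9000");        // HSET users:goku powerlevel 9000
                   String level = jedis.hget("users:goku", "powerlevel"); // HGET users:goku powerlevel
                   System.out.println(level);                             // prints 9000
               }
           }
       }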
    22. Use cases: Redis
       • Good for fixed lists, tagging, ratings, counters, analytics and queues (pub-sub messaging)
       • Has master – multi-slave replication support; the master node is currently a SPOF
       • Distributed Redis, named "Redis Cluster", is currently under development
    23. Case study: Cassandra
       • Written in Java, developed at Facebook
       • Inspired by Amazon Dynamo's replication mechanics, but uses a column-based data model
       • Good for log processing, index storage, voting, job storage, etc.
       • Bad for transactional processing
       • Want to know more? Ask Alexey!
    24. In-memory data grids
       A new generation of caching products that try to combine the benefits of the replicated and distributed schemes.
    25. IMDG: evolution
       • Data grids: reliable storage and live data balancing among grid nodes
       • Computational grids: reliable job execution, scheduling and load balancing
       Modern IMDGs combine both.
    26. IMDG: caching concepts
       • Implements a KV cache interface
       • Provides indexed search by values
       • Provides a reliable distributed locks interface
       • Caching scheme – partitioned or distributed, may be specified per cache or cache service
       • Provides event subscriptions for entries (change notifications)
       • Configurable fault tolerance for distributed schemes (HA)
       • Equal distribution of data (and read/write load) among grid nodes
       • Live data redistribution when nodes go up or down – no data loss, no client termination
       • Supports RT, WT, WB caching patterns and hierarchical caches (near caching; see the sketch below)
       • Supports atomic computations on grid nodes
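       A rough near-cache sketch: a small local (L1) map in front of the distributed cache,
       invalidated via the grid's entry-change notifications. The DistributedCache interface and
       its listener registration are hypothetical; vendor APIs differ, this only illustrates the idea.

       import java.util.Map;
       import java.util.concurrent.ConcurrentHashMap;
       import java.util.function.Consumer;

       public class NearCache<K, V> {

           /** Hypothetical grid-side cache with change notifications. */
           public interface DistributedCache<K, V> {
               V get(K key);
               void addEntryChangedListener(Consumer<K> onChange);
           }

           private final Map<K, V> local = new ConcurrentHashMap<>();
           private final DistributedCache<K, V> grid;

           public NearCache(DistributedCache<K, V> grid) {
               this.grid = grid;
               // Drop the local copy whenever the grid reports that the entry changed.
               grid.addEntryChangedListener(local::remove);
           }

           public V get(K key) {
               // A local hit avoids a network round-trip; a miss falls through to the grid.
               return local.computeIfAbsent(key, grid::get);
           }
       }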
    27. IMDG: under the hood
       • All data is split into a number of sections called partitions
       • A partition, rather than an entry, is the atomic unit of data migration when the grid rebalances; the number of partitions is fixed for the cluster lifetime
       • Indexes are distributed among grid nodes
       • Clients may or may not be part of the grid cluster
    28. IMDG under the hood: request routing
       For get() and put() requests:
       1. The cluster member making the request calculates the key's hash code
       2. The partition number is calculated from this hash code
       3. The owning node is identified by the partition number
       4. The request is routed to that node, executed, and the results are sent back to the client member that initiated the request
       For filter queries:
       1. The cluster member initiating the request sends it to all storage-enabled nodes in the cluster
       2. The query is executed on every node using distributed indexes, and partial results are sent to the requesting member
       3. The requesting member merges the partial results locally
       4. The final result set is returned from the filter method
       (A sketch of the key-to-node routing follows below.)
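       A sketch of the key -> hash -> partition -> node routing for get()/put(). The partition
       count is fixed; the partition-to-node step below is a plain modulo for brevity, whereas a
       real grid keeps an explicit partition ownership table that it updates during rebalancing.

       import java.util.List;

       public class PartitionRouter {

           private final int partitionCount;   // fixed for the cluster lifetime
           private final List<String> nodes;   // storage-enabled members

           public PartitionRouter(int partitionCount, List<String> nodes) {
               this.partitionCount = partitionCount;
               this.nodes = nodes;
           }

           public int partitionFor(Object key) {
               // Steps 1-2: key hash code -> partition number
               return Math.floorMod(key.hashCode(), partitionCount);
           }

           public String nodeFor(Object key) {
               // Step 3: partition number -> owning node (simplified mapping)
               int partition = partitionFor(key);
               return nodes.get(partition % nodes.size());
           }
       }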
    29. IMDG: advanced use cases
       • Messaging
       • Map-Reduce calculations
       • Cluster-wide singleton
       • And more...
    30. GC tuning for large grid nodes
       • The easy way: rolling restarts of storage-enabled cluster nodes. Cannot be used in every project.
       • The complex way: fine-tune the CMS collector so that it always keeps up cleaning garbage concurrently under the normal production workload.
       • The expensive way: use the off-heap storage provided by some vendors (Oracle, Terracotta), which relies on the direct memory buffers available to the JVM.
       (An illustrative set of JVM flags follows below.)
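       For illustration only, a possible CMS starting point for a large storage-enabled node.
       Every flag value and the jar name are assumptions to be validated against the actual heap
       size and workload, and CMS itself has since been deprecated in favor of G1/ZGC.

       java -Xms16g -Xmx16g \
            -XX:+UseConcMarkSweepGC \
            -XX:CMSInitiatingOccupancyFraction=70 \
            -XX:+UseCMSInitiatingOccupancyOnly \
            -XX:MaxDirectMemorySize=32g \
            -jar grid-node.jar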
    31. IMDG: market players
       • Oracle Coherence: commercial, free for evaluation use
       • GigaSpaces: commercial
       • GridGain: commercial
       • Hazelcast: open source
       • Infinispan: open source
    32. Terracotta
       The company behind EHCache, Quartz and the Terracotta Server Array. Acquired by Software AG.
    33. Terracotta Server Array
       • All data is split into a number of sections called stripes
       • A stripe consists of 2 or more Terracotta nodes; one of them is the active node, the others are passive
       • All data is distributed among stripes and replicated inside each stripe
       • Open-source limitation: only one stripe. Such a setup supports HA but does not distribute cache data, i.e. it is not horizontally scalable
    34. Q&A session
       And thank you for coming!
       Max A. Alexejev