Memory is the new disk, disk is the new tape, Bela Ban (JBoss by RedHat)



  1. Memory is the new disk, disk is the new tape (Bela Ban, JBoss / Red Hat)
  2. Motivation
     ● We want to store our data in memory
       – Memory access is faster than disk access, even across a network
       – A DB requires network communication, too
     ● The disk is used for archival purposes
     ● Not a replacement for DBs!
       – Only a key-value store (NoSQL)
  3. Problems
     ● #1: How do we provide memory large enough to store the data (e.g. 2 TB)?
     ● #2: How do we guarantee persistence?
       – Survival of data between reboots / crashes
  4. #1: Large memory
     ● We aggregate the memory of all nodes in a cluster into a large virtual memory space
       – 100 nodes of 10 GB each == 1 TB of virtual memory
  5. #2: Persistence
     ● We store keys redundantly on multiple nodes
       – Unless all nodes on which key K is stored crash at the same time, K is persistent
     ● We can also store the data on disk
       – To prevent data loss in case all cluster nodes crash
       – This can be done asynchronously, on a background thread
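The asynchronous disk write mentioned on the slide can be sketched as a write-behind queue: a put() updates memory immediately, and a background thread flushes the write to "disk" later. This is a minimal illustrative model (the class and a plain dict standing in for a file are assumptions, not ReplCache's actual implementation):

```python
import queue
import threading

class WriteBehindStore:
    """In-memory store that flushes writes to 'disk' on a background thread."""
    def __init__(self):
        self.memory = {}                  # primary copy, always up to date
        self.disk = {}                    # stand-in for a file on disk
        self._q = queue.Queue()
        self._flusher = threading.Thread(target=self._flush_loop, daemon=True)
        self._flusher.start()

    def put(self, key, value):
        self.memory[key] = value          # fast path: memory only
        self._q.put((key, value))         # the disk write happens later

    def get(self, key):
        return self.memory.get(key)

    def close(self):
        self._q.put(None)                 # sentinel: stop the flusher
        self._flusher.join()              # drain pending writes

    def _flush_loop(self):
        while True:
            item = self._q.get()
            if item is None:
                return
            key, value = item
            self.disk[key] = value        # archived copy survives a reboot

store = WriteBehindStore()
store.put("K1", "v1")
store.put("K2", "v2")
store.close()
assert store.disk == {"K1": "v1", "K2": "v2"}
```

Because the queue is drained in order, the disk copy eventually reflects every put, without the caller ever blocking on I/O.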
  6. How do we provide redundancy?
  7. Store every key on every node
     [Diagram: nodes A, B, C, D each hold K1, K2, K3, K4]
     ● RAID 1
     ● Pro: data is available everywhere
       – No network round trip
       – Data loss only when all nodes crash
     ● Con: we can only use 25% of our memory
  8. Store every key on 1 node only
     [Diagram: A holds K1, B holds K2, C holds K3, D holds K4]
     ● RAID 0, JBOD
     ● Pro: we can use 100% of our memory
     ● Con: data loss on node crash
       – No redundancy
  9. Store every key on K nodes
     [Diagram: each of K1, K2, K3, K4 is stored on 2 of the nodes A, B, C, D]
     ● K is configurable (2 in the example)
     ● Variable RAID
     ● Pro: we can use a variable % of our memory
       – The user determines the tradeoff between memory consumption and risk of data loss
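The memory tradeoff on the last three slides is simple arithmetic: with N nodes of M memory each and every key stored K times, the usable capacity is N*M/K. A quick sketch (function name is illustrative):

```python
def usable_memory(nodes, mem_per_node_gb, copies):
    """Usable capacity when every key is stored `copies` times.
    copies == nodes -> RAID 1 (full replication)
    copies == 1     -> RAID 0 (no redundancy)
    """
    return nodes * mem_per_node_gb / copies

# 4 nodes with 10 GB each (40 GB total):
assert usable_memory(4, 10, 4) == 10.0   # RAID 1: 25% usable
assert usable_memory(4, 10, 1) == 40.0   # RAID 0: 100% usable
assert usable_memory(4, 10, 2) == 20.0   # variable RAID, K=2: 50% usable
```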
  10. So how do we determine on which nodes the keys are stored?
  11. Consistent hashing
      ● Given a key K and a set of nodes, CH(K) will always pick the same node P for K
        – We can also pick a list {P,Q} for K
      ● Any node can compute that K is on P
      ● If P leaves, CH(K) will pick another node Q and rebalance the affected keys
      ● A good CH rebalances at most 1/N of the keys (where N = number of cluster nodes)
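The property on this slide, that removing a node only moves the keys that node owned, can be demonstrated with a minimal hash ring. This is a bare-bones sketch (real implementations add virtual nodes per physical node for better balance):

```python
import bisect
import hashlib

def _h(s):
    """Stable hash onto a 32-bit ring."""
    return int(hashlib.md5(s.encode()).hexdigest(), 16) % (2 ** 32)

class ConsistentHash:
    def __init__(self, nodes):
        self.ring = sorted((_h(n), n) for n in nodes)

    def pick(self, key):
        """Always maps the same key to the same node: the first node at
        or after the key's position on the ring (wrapping around)."""
        points = [p for p, _ in self.ring]
        i = bisect.bisect_left(points, _h(key)) % len(self.ring)
        return self.ring[i][1]

    def remove(self, node):
        self.ring = [(p, n) for p, n in self.ring if n != node]

ch = ConsistentHash(["A", "B", "C", "D"])
keys = ["K%d" % i for i in range(1000)]
before = {k: ch.pick(k) for k in keys}

ch.remove("B")                       # node B leaves the cluster
after = {k: ch.pick(k) for k in keys}

# Only keys that lived on B move; everything else keeps its owner.
moved = [k for k in keys if before[k] != after[k]]
assert all(before[k] == "B" for k in moved)
```

Picking a list {P,Q} for a key corresponds to taking the first two distinct nodes clockwise from the key's position instead of just the first.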
  12. Example
      [Diagram: K1 on A and B, K2 on B and C, K3 on C and D, K4 on D and A]
      ● K2 is stored on B (primary owner) and C (backup owner)
  13. Example
      [Diagram: same cluster as before]
      ● Node B now crashes
  14. Example
      [Diagram: keys are rebalanced onto the surviving nodes A, C, D]
      ● C (the backup owner of K2) copies K2 to D
        – C is now the primary owner of K2
      ● A copies K1 to C
        – C is now the backup owner of K1
  15. Rebalancing
      ● Unless all N owners of a key K crash at exactly the same time, K is always stored redundantly
      ● When fewer than N owners crash, rebalancing copies/moves keys to other nodes, so that we have N owners again
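The rebalancing invariant from slides 12 to 15 can be simulated end to end: each key lives on 2 owners, a node crashes, and each affected key is re-copied from a surviving owner. The sketch below uses rendezvous hashing as a simple stand-in for the ring-based owner selection on the slides; all names are illustrative:

```python
import hashlib

def owners(key, nodes, copies=2):
    """The `copies` nodes responsible for a key (rendezvous hashing,
    a simple stand-in for picking successors on a hash ring)."""
    score = lambda n: hashlib.md5((key + ":" + n).encode()).hexdigest()
    return sorted(nodes, key=score)[:copies]

nodes = ["A", "B", "C", "D"]
keys = ["K1", "K2", "K3", "K4"]

# Each node's local store; every key lives on its 2 owners.
store = {n: {} for n in nodes}
for k in keys:
    for n in owners(k, nodes):
        store[n][k] = "value-of-" + k

# Node B crashes: drop its data, recompute owners, and rebalance by
# copying each key from a surviving owner to its new owner set.
del store["B"]
survivors = ["A", "C", "D"]
for k in keys:
    holder = next(n for n in survivors if k in store[n])  # at least one survives
    for n in owners(k, survivors):
        store[n][k] = store[holder][k]

# Invariant restored: every key has at least 2 owners again; nothing was lost.
for k in keys:
    assert sum(k in store[n] for n in survivors) >= 2
```

Because only one of a key's two owners can be the crashed node, a surviving owner always exists to copy from, which is exactly the argument on slide 15.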
  16. Enter ReplCache
      ● ReplCache is a distributed hashmap spanning the entire cluster
      ● Operations: put(K,V), get(K), remove(K)
      ● For every key, we can define how many times we'd like it to be stored in the cluster
        – 1: RAID 0
        – -1: RAID 1 (every node)
        – N: variable RAID
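The per-key replication counts on this slide can be modeled in a few lines: -1 means every node, otherwise the key goes to that many owners. This is a toy single-process model of the semantics, not the actual JGroups ReplCache API; all class and method details beyond put/get/remove are assumptions:

```python
import hashlib

class MiniReplCache:
    """Toy model of a cluster-wide hashmap with per-key replication counts."""
    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.store = {n: {} for n in self.nodes}   # one local map per node

    def _owners(self, key, repl_count):
        if repl_count == -1:                       # RAID 1: every node
            return self.nodes
        score = lambda n: hashlib.md5((key + ":" + n).encode()).hexdigest()
        return sorted(self.nodes, key=score)[:repl_count]

    def put(self, key, val, repl_count=1):
        for n in self._owners(key, repl_count):
            self.store[n][key] = val

    def get(self, key):
        for n in self.nodes:                       # ask nodes until one has it
            if key in self.store[n]:
                return self.store[n][key]
        return None

    def remove(self, key):
        for n in self.nodes:
            self.store[n].pop(key, None)

cache = MiniReplCache(["A", "B", "C", "D"])
cache.put("session-1", "data", repl_count=-1)   # RAID 1: on all 4 nodes
cache.put("tmp-file", "data", repl_count=1)     # RAID 0: on exactly 1 node
cache.put("order-42", "data", repl_count=2)     # variable RAID: on 2 nodes

assert sum("session-1" in cache.store[n] for n in cache.nodes) == 4
assert sum("tmp-file" in cache.store[n] for n in cache.nodes) == 1
assert sum("order-42" in cache.store[n] for n in cache.nodes) == 2
```

The point of the per-key count is visible in the three puts: unimportant data pays no memory overhead, while critical data survives any single crash.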
  17. Use of ReplCache
      [Diagram: HTTP clients → Apache with mod_jk → a cluster of JBoss nodes, each running a Servlet backed by ReplCache, with a DB behind the cluster]
  18. Demo
  19. Use cases
      ● JBoss AS: session distribution using Infinispan
        – For data scalability, sessions are stored only N times in the cluster
      ● GridFS (Infinispan): I/O over the grid
        – Files are chunked into slices; each slice is stored in the grid (redundantly if needed)
        – Store a 4 GB DVD in a grid where each node has only 2 GB of heap
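The GridFS idea, chunking a file into slices so it fits into a grid of small heaps, can be sketched as follows. The chunk size and the "name#index" key format are illustrative assumptions, not GridFS's actual layout:

```python
CHUNK_SIZE = 4  # tiny for the example; a real store would use kilobytes or more

def chunk(name, data, size=CHUNK_SIZE):
    """Split `data` into slices keyed "<name>#<index>", ready to be
    put() into the grid one slice at a time."""
    return {f"{name}#{i // size}": data[i:i + size]
            for i in range(0, len(data), size)}

def reassemble(name, grid):
    """Read slices back in order until one is missing."""
    parts = []
    i = 0
    while f"{name}#{i}" in grid:
        parts.append(grid[f"{name}#{i}"])
        i += 1
    return b"".join(parts)

grid = chunk("dvd.iso", b"0123456789abcde")     # 15 bytes -> 4 slices
assert len(grid) == 4
assert reassemble("dvd.iso", grid) == b"0123456789abcde"
```

Since each slice is an independent key, slices of one file land on different nodes, which is how a 4 GB file fits into 2 GB heaps.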
  20. Use cases
      ● Hibernate Over Grid (OGM)
        – Replaces the DB backend with an Infinispan-backed grid
  21. Conclusion
      ● Given enough nodes in a cluster, we can provide persistence for data
      ● Unlike RAID, where everything is stored fully redundantly (even /tmp), we can define persistence guarantees per key
      ● Ideal for data sets which need to be accessed quickly
        – For the paranoid, we can still stream to disk
  22. Conclusion
      ● Data is distributed over a grid
        – The cache is closer to the clients
        – No bottleneck at the DBMS
        – Keys are spread over different nodes
  23. Conclusion
      [Diagram: many clients connected to a grid of interconnected caches]
  24. Questions?
      ● Demo (JGroups) –
      ● Infinispan –
      ● OGM –