Memory is the new disk,
 disk is the new tape


 Bela Ban, JBoss / Red Hat
Motivation
●   We want to store our data in memory
        –   Memory access is faster than disk access
        –   Even across a network
        –   A DB requires network communication, too
●   The disk is used for archival purposes
●   Not a replacement for DBs !
        –   Only a key-value store
        –   NoSQL
Problems
●   #1: How do we provide memory large
    enough to store the data (e.g. 2 TB of
    memory) ?
●   #2: How do we guarantee persistence ?
        –   Survival of data between reboots / crashes
#1: Large memory
●   We aggregate the memory of all nodes in a
    cluster into a large virtual memory space
       –   100 nodes of 10 GB == 1 TB of virtual
            memory
#2: Persistence
●   We store keys redundantly on multiple
    nodes
       –   Unless all nodes on which key K is stored
            crash at the same time, K is persistent
●   We can also store the data on disk
       –   To prevent data loss in case all cluster
            nodes crash
       –   This can be done asynchronously, on a
            background thread
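The asynchronous disk write mentioned above could look roughly like the following. This is a minimal sketch, not taken from JGroups or Infinispan: the class name `WriteBehindStore`, the JSON-lines log format and the `flush()` helper are all assumptions made for illustration.

```python
import json
import queue
import threading


class WriteBehindStore:
    """In-memory dict whose updates are appended to a log file by a background thread."""

    def __init__(self, path: str) -> None:
        self._data = {}
        self._pending = queue.Queue()
        self._path = path
        self._writer = threading.Thread(target=self._drain, daemon=True)
        self._writer.start()

    def put(self, key: str, value) -> None:
        self._data[key] = value            # the caller sees the update immediately
        self._pending.put((key, value))    # the disk write happens later

    def get(self, key: str):
        return self._data.get(key)

    def flush(self) -> None:
        """Block until every queued update has reached the file."""
        self._pending.join()

    def _drain(self) -> None:
        # Hypothetical on-disk format: one JSON object per line, append-only.
        with open(self._path, "a", encoding="utf-8") as log:
            while True:
                key, value = self._pending.get()
                log.write(json.dumps({"key": key, "value": value}) + "\n")
                log.flush()
                self._pending.task_done()
```

Reads and writes never wait for the disk; only `flush()` does. Updates still queued when the process dies are lost, which is why the redundant in-memory copies remain the first line of defence.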
How do we provide redundancy ?
Store every key on every node
           A          B            C        D
           K1        K1            K1       K1
           K2        K2            K2       K2
           K3        K3            K3       K3
           K4        K4            K4       K4

●   RAID 1
●   Pro: data is available everywhere
       –   No network round trip
       –   Data loss only when all nodes crash
●   Con: we can only use 25% of our memory (1/4, with 4 nodes)
Store every key on 1 node only
           A        B       C       D
           K1       K2      K3     K4




●   RAID 0, JBOD
●   Pro: we can use 100% of our memory
●   Con: data loss on node crash
       –   No redundancy
Store every key on K nodes
            A         B         C          D
            K1        K1
                      K2        K2
                                K3         K3
            K4                             K4


●   K (the number of copies per key) is configurable (2 in the example)
●   Variable RAID
●   Pro: we can use a variable % of our memory
        –   User determines tradeoff between memory
             consumption and risk of data loss
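The three layouts differ only in how many copies of each key are kept, so the usable share of memory can be estimated as 1 / copies. The sketch below assumes keys are spread evenly and that every node contributes the same amount of memory; the function name `usable_memory_gb` is made up for this example.

```python
def usable_memory_gb(nodes: int, gb_per_node: float, copies: int) -> float:
    """Amount of distinct data that fits when every key is stored `copies` times."""
    if not 1 <= copies <= nodes:
        raise ValueError("copies must be between 1 and the number of nodes")
    return nodes * gb_per_node / copies


# Four nodes of 10 GB each, as in the A-D tables above.
for copies in (1, 2, 4):
    total = usable_memory_gb(4, 10, copies)
    print(f"{copies} copies: {total:.0f} GB usable ({100 / copies:.0f}% of raw memory)")
# 1 copies: 40 GB usable (100% of raw memory)
# 2 copies: 20 GB usable (50% of raw memory)
# 4 copies: 10 GB usable (25% of raw memory)
```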
So how do we determine on which nodes the
            keys are stored ?
Consistent hashing
●   Given a key K and a set of nodes, CH(K)
    will always pick the same node P for K
        –   We can also pick a list {P,Q} for K
●   Any node can compute CH(K) locally, so everyone 'knows' that K is on P
●   If P leaves, CH(K) will pick another node Q
    and rebalance affected keys
●   A good CH will rebalance at most about 1/N of the keys
    (where N = number of cluster nodes)
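One common way to get these properties is a hash ring with virtual nodes. The sketch below is an illustration under that assumption, not the algorithm used by ReplCache or Infinispan; `HashRing`, `owners` and the choice of SHA-256 and 64 virtual nodes per member are all hypothetical.

```python
import bisect
import hashlib


def _point(name: str) -> int:
    """Map a string to a stable position on the ring."""
    return int.from_bytes(hashlib.sha256(name.encode("utf-8")).digest()[:8], "big")


class HashRing:
    def __init__(self, nodes, vnodes: int = 64) -> None:
        self._vnodes = vnodes
        self._ring = []  # sorted list of (position, node)
        for node in nodes:
            self.add(node)

    def add(self, node: str) -> None:
        for i in range(self._vnodes):
            bisect.insort(self._ring, (_point(f"{node}#{i}"), node))

    def remove(self, node: str) -> None:
        self._ring = [entry for entry in self._ring if entry[1] != node]

    def owners(self, key: str, count: int = 2):
        """Return up to `count` distinct nodes, walking clockwise from the key."""
        if not self._ring:
            return []
        start = bisect.bisect(self._ring, (_point(key), ""))
        found = []
        for offset in range(len(self._ring)):
            node = self._ring[(start + offset) % len(self._ring)][1]
            if node not in found:
                found.append(node)
                if len(found) == count:
                    break
        return found


ring = HashRing(["A", "B", "C", "D"])
primary, backup = ring.owners("K2")   # the list {P, Q} for the key
ring.remove(primary)                  # P leaves the cluster
assert ring.owners("K2")[0] == backup # Q takes over as primary
```

Keys that were not stored on the departed node keep exactly the same owners, so only the keys that node held need to be copied.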
Example
         A         B         C        D
         K1        K1
                   K2       K2
                            K3        K3
         K4                           K4

●   K2 is stored on B (primary owner) and C
    (backup owner)
Example
         A       B       C    D
        K1       K1
                 K2      K2
                         K3   K3
        K4                    K4

●   Node B now crashes
Example
           A          B          C            D
           K1         K1         K1
                      K2         K2           K2
                                 K3           K3
           K4                                 K4

●   C (the backup owner of K2) copies K2 to D
       –   C is now the primary owner of K2
●   A copies K1 to C
       –   C is now the backup owner of K1
Rebalancing
●   Unless all N owners of a key K crash
    at exactly the same time, K survives on
    at least one node (N = number of owners
    per key)
●   When fewer than N owners crash,
    rebalancing will copy/move keys to other
    nodes, so that we have N owners again
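The A-D example can be replayed in a few lines. This sketch assumes a fixed ring order A, B, C, D and that each key's backup is simply the next live node after its primary; the initial primaries, including D for K4, are inferred from the tables and are an assumption. The names `owners` and `rebalance` are made up for this example.

```python
RING = ["A", "B", "C", "D"]                             # fixed node order, as in the tables
PRIMARY = {"K1": "A", "K2": "B", "K3": "C", "K4": "D"}  # assumed initial primaries


def owners(key: str, live: set, copies: int = 2) -> list:
    """Walk the ring from the key's original primary, skipping crashed nodes."""
    start = RING.index(PRIMARY[key])
    candidates = [RING[(start + i) % len(RING)] for i in range(len(RING))]
    return [node for node in candidates if node in live][:copies]


def rebalance(before: set, after: set) -> list:
    """List the (key, source, target) copies needed after a membership change."""
    moves = []
    for key in sorted(PRIMARY):
        old, new = owners(key, before), owners(key, after)
        survivors = [node for node in old if node in after]
        for target in new:
            if target not in old and survivors:
                moves.append((key, survivors[0], target))
    return moves


all_nodes = set(RING)
print(rebalance(all_nodes, all_nodes - {"B"}))
# [('K1', 'A', 'C'), ('K2', 'C', 'D')]
```

The two moves match the slide: A copies K1 to C, and C copies K2 to D. K3 and K4 are untouched because neither was stored on B.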
Enter ReplCache
●   ReplCache is a distributed hashmap
    spanning the entire cluster
●   Operations: put(K,V), get(K), remove(K)
●   For every key, we can define how many
    times we'd like it to be stored in the cluster
        –   1: RAID 0
        –   -1: RAID 1
        –   N: variable RAID
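To make the per-key replication count concrete, here is a single-process toy model of such a map. It is not the JGroups ReplCache API: `ToyReplCache`, the `repl_count` parameter name, the `holders` helper and the placement rule are assumptions for illustration, and a real cache would place keys with consistent hashing and talk to other nodes over the network.

```python
class ToyReplCache:
    """Single-process model of a cluster-wide map with a per-key replica count."""

    def __init__(self, nodes) -> None:
        self._nodes = list(nodes)
        self._stores = {node: {} for node in self._nodes}

    def _targets(self, key, repl_count: int):
        if repl_count == -1:                      # -1: store on every node
            return list(self._nodes)
        if not 1 <= repl_count <= len(self._nodes):
            raise ValueError("repl_count must be -1 or between 1 and the node count")
        # Placeholder placement rule; stands in for consistent hashing.
        start = sum(str(key).encode("utf-8")) % len(self._nodes)
        return [self._nodes[(start + i) % len(self._nodes)] for i in range(repl_count)]

    def put(self, key, value, repl_count: int = 1) -> None:
        self.remove(key)                          # drop copies left by an earlier put
        for node in self._targets(key, repl_count):
            self._stores[node][key] = value

    def get(self, key):
        for store in self._stores.values():
            if key in store:
                return store[key]
        return None

    def remove(self, key) -> None:
        for store in self._stores.values():
            store.pop(key, None)

    def holders(self, key):
        return [node for node in self._nodes if key in self._stores[node]]


cache = ToyReplCache(["A", "B", "C", "D"])
cache.put("scratch", "tmp", repl_count=1)    # one copy, like RAID 0
cache.put("config", "v1", repl_count=-1)     # every node, like RAID 1
cache.put("session", "s42", repl_count=2)    # variable RAID
print(len(cache.holders("scratch")), len(cache.holders("config")), len(cache.holders("session")))
# 1 4 2
```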
Use of ReplCache

[Diagram: HTTP -> Apache (mod_jk) -> three JBoss instances (Servlet),
 each with its own ReplCache; the three ReplCache instances form a
 cluster; a DB sits below]
Demo
Use cases
●   JBoss AS: session distribution using
    Infinispan
        –   For data scalability, sessions are stored
             only N times in a cluster
●   GridFS (Infinispan)
        –   I/O over grid
        –   Files are chunked into slices, each slice is
              stored in the grid (redundantly if needed)
        –   Store a 4GB DVD in a grid where each
              node has only 2GB of heap
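Chunking is what lets a file larger than any single heap live in the grid: each slice becomes its own key and can land on a different node. The sketch below shows only the slicing and reassembly, with a plain dict standing in for the grid; the `path#index` key scheme and the function names are assumptions, not the Infinispan GridFS API.

```python
def chunk_keys(path: str, data: bytes, chunk_size: int) -> dict:
    """Split `data` into fixed-size slices keyed by path and slice index."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        f"{path}#{index}": data[offset:offset + chunk_size]
        for index, offset in enumerate(range(0, len(data), chunk_size))
    }


def reassemble(path: str, grid: dict) -> bytes:
    """Read slices back in index order until one is missing."""
    parts, index = [], 0
    while f"{path}#{index}" in grid:
        parts.append(grid[f"{path}#{index}"])
        index += 1
    return b"".join(parts)


grid = {}                                    # stand-in for the distributed map
payload = bytes(range(10))
grid.update(chunk_keys("/movies/dvd.iso", payload, chunk_size=4))
print(sorted(grid))
# ['/movies/dvd.iso#0', '/movies/dvd.iso#1', '/movies/dvd.iso#2']
assert reassemble("/movies/dvd.iso", grid) == payload
```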
Use cases
●   Hibernate Over Grid (OGM)
       –   Replaces DB backend with Infinispan
            backed grid
Conclusion
●   Given enough nodes in a cluster, we can
    provide persistence for data
●   Unlike RAID, where everything is stored
    fully redundantly (even /tmp), we can
    define persistence guarantees per key
●   Ideal for data sets which need to be
    accessed quickly
        –   For the paranoid we can still stream to disk
Conclusion
●   Data is distributed over a grid
        –   Cache is closer to clients
        –   No bottleneck to the DBMS
        –   Keys are on different nodes
Conclusion


[Diagram: groups of clients arranged around a grid of cache nodes]
Questions ?
●   Demo (JGroups)
        –   http://www.jgroups.org
●   Infinispan
        –   http://www.infinispan.org
●   OGM
        –   http://community.jboss.org/en/hibernate/ogm
