Gossip Protocol & Key-Value Store
Theory and Practice
Dr. SAJEEV G P
April 16, 2016
Dr. SAJEEV G P Gossip Protocol & Key-Value Store 1 / 1
Contents
Outline
Background
Introduction to Gossip
Gossip Model
Key-Value Store
Cassandra
CAP Theorem
Further study
Background
Real World Applications
Resume Youtube Video
Last watched status..
Already Watched?
Youtube
Some Real World Applications
Facebook Search
Term Search
Amazon Search
Amazon Recommendations
Cloud Platform
App Engine
Compute Engine
Cloud storage
Cloud BigTable
Google Dataflow
Google Translate API
Google BigQuery
Cloud Prediction API
Gossip & Key-Value Store
Gossip is for communication
Key-Value store is the database
Gossip Protocol
Multicast
Multicast is group communication where information is addressed to a
group of destination computers simultaneously.
Types of Casting
Unicast
Multicast
Broadcast
Multicast at the application level
Multicast at the network level
Multicast Protocol: Centralized & Tree Based
Centralized
Tree based
Gossip Protocol: Epidemic multicast
Gossip Analysis
Basics
Population of (n+1) individuals mixing homogeneously
Contact rate between any individual pair is β
At any time, each individual is either uninfected (numbering x) or
infected (numbering y)
Then, x0 = n , y0 = 1 and at all times x + y = n + 1
Infected-uninfected contact turns latter infected, and it stays infected
Gossip Analysis
Gossip Properties
Lightweight in large groups
Spreads quickly
Fault-tolerant
Terms
n + 1 nodes
x: # of uninfected nodes
y: # of infected nodes
x + y = n + 1
Continuous time ..
$$\frac{dx}{dt} = -\beta x y$$
Contact rate: $\beta = \frac{b}{n}$
Solution:
$$x = \frac{n(n+1)}{n + e^{\beta(n+1)t}}, \qquad y = \frac{n+1}{1 + n e^{-\beta(n+1)t}}$$
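The closed-form solution can be checked numerically. The sketch below is only an illustration: it evaluates y(t) from the formula and compares it with a forward-Euler integration of dx/dt = −βxy started from x0 = n, y0 = 1. The function names, the step size, and the values n = 1000 and b = 2 are assumptions chosen for the example, not part of the slides.

```python
import math


def infected_closed_form(n: int, b: float, t: float) -> float:
    """y(t) = (n+1) / (1 + n * exp(-beta*(n+1)*t)), with beta = b/n."""
    beta = b / n
    return (n + 1) / (1 + n * math.exp(-beta * (n + 1) * t))


def infected_euler(n: int, b: float, t_end: float, dt: float = 1e-3) -> float:
    """Integrate dx/dt = -beta*x*y with forward Euler from x0 = n, y0 = 1."""
    beta = b / n
    x = float(n)                      # uninfected nodes
    steps = int(round(t_end / dt))
    for _ in range(steps):
        y = (n + 1) - x               # infected nodes, since x + y = n + 1
        x += dt * (-beta * x * y)
    return (n + 1) - x


if __name__ == "__main__":
    n, b = 1000, 2.0                  # illustrative values only
    for t in (0.0, 2.0, 4.0, 8.0):
        print(t, infected_closed_form(n, b, t), infected_euler(n, b, t))
```

Both columns should start at one infected node and approach n + 1 as t grows, which is the S-shaped epidemic curve the analysis relies on.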
Gossip Analysis ..
No. of infected nodes ..
At $t = c \log(n)$,
$$y \approx (n + 1) - \frac{1}{n^{cb-2}}$$
Low Latency
Set c, b to be small numbers
Within $t = c \log(n)$ rounds, all but $\frac{1}{n^{cb-2}}$ nodes will receive the multicast
Lightweight
Each node has transmitted no more than $cb \log(n)$ gossip messages
Fault tolerance
With 50% packet drop, $b \leftarrow \frac{b}{2}$: takes twice as many rounds
With 50% node failures, $n \leftarrow \frac{n}{2}$: same as above
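The latency and load claims above can be explored with a small simulation. This is a minimal sketch under stated assumptions: time is split into discrete rounds (the analysis above is in continuous time), every infected node pushes to b targets picked uniformly at random each round, and node 0 is the original sender. The function name and parameter values are hypothetical.

```python
import random


def push_gossip_rounds(n: int, b: int, seed: int = 0, max_rounds: int = 10_000):
    """Simulate push gossip among n nodes with fan-out b per round.

    Returns (rounds_used, infected_count, max_messages_sent_by_one_node).
    """
    rng = random.Random(seed)
    infected = {0}                    # node 0 starts with the multicast
    sent = [0] * n                    # gossip messages transmitted per node
    rounds = 0
    while len(infected) < n and rounds < max_rounds:
        newly_infected = set()
        for node in infected:
            for _ in range(b):
                target = rng.randrange(n)   # uniform random gossip target
                sent[node] += 1
                if target not in infected:
                    newly_infected.add(target)
        infected |= newly_infected
        rounds += 1
    return rounds, len(infected), max(sent)


if __name__ == "__main__":
    rounds, count, max_sent = push_gossip_rounds(n=1000, b=2)
    print("rounds:", rounds, "infected:", count, "max messages per node:", max_sent)
```

The busiest node is the original sender, which transmits b messages in every round, so its load is b times the number of rounds. Halving b in this sketch is one way to mimic the 50% packet-drop case.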
Key-Value Stores
Simplest form of database
management systems.
They store pairs of keys and
values as well as retrieve
values when a key is known.
Examples
twitter.com: Tweet id ⇒
information about tweet
amazon.com: Item number ⇒
information about it
kayak.com: Flight number ⇒
information about flight, e.g.,
availability
yourbank.com: Account number ⇒
information about it
Key-Value Stores..
It's a dictionary data
structure.
NoSQL database
Insert, lookup, and delete
by key
E.g., hash table, binary tree
But distributed
Key-Value stores reuse many
techniques from DHTs
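A minimal sketch of the dictionary interface spread over several servers is shown below. It is an assumption-laden toy: each "server" is an in-process dict, and a key is routed by hashing it and taking the result modulo the number of servers. Real DHT-style stores use consistent hashing rather than a plain modulo; the class and method names are hypothetical.

```python
import hashlib


class TinyKVStore:
    """Toy distributed dictionary: insert, lookup, and delete by key."""

    def __init__(self, num_servers: int = 4):
        # Each dict stands in for one storage server.
        self.servers = [{} for _ in range(num_servers)]

    def _server_for(self, key: str) -> dict:
        # Hash the key so that keys spread evenly across servers.
        digest = hashlib.md5(key.encode("utf-8")).hexdigest()
        return self.servers[int(digest, 16) % len(self.servers)]

    def insert(self, key: str, value) -> None:
        self._server_for(key)[key] = value

    def lookup(self, key: str, default=None):
        return self._server_for(key).get(key, default)

    def delete(self, key: str) -> bool:
        server = self._server_for(key)
        if key in server:
            del server[key]
            return True
        return False


if __name__ == "__main__":
    store = TinyKVStore()
    store.insert("tweet:42", {"text": "hello"})
    print(store.lookup("tweet:42"))
```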
NoSQL Databases
Traditional RDBMS..
Schema-based, i.e., structured tables
Primary key that is unique within that table
Queried using SQL, supports joins
Today's Workload
Big Data Era ..
Data: Large and unstructured
Lots of random reads and writes
Sometimes write-heavy
Foreign keys rarely needed
Joins infrequent
Needs of Today's Workload
Speed
Avoid Single Point of Failure (SPOF)
Low TCO (Total cost of ownership)
Fewer system administrators
Incremental scalability, Scale out, not up
Key-Value or NoSQL
NoSQL systems often use column-oriented storage
RDBMSs store an entire row together (on disk or at a server)
NoSQL systems typically store a column together (or a group of
columns).
Entries within a column are indexed and easy to locate, given a key
(and vice-versa)
Why useful?
Range searches within a column are fast since you don't need to
fetch the entire database
E.g., get me all the blog-ids from the blog table that were updated
within the past month
Search in the last-updated column, fetch corresponding blog-id
column
Don't need to fetch the other columns
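The blog-table query above can be sketched with plain Python lists. The table contents, column names, and the cut-off date are made up for illustration; the point is only that the column layout touches two columns, while the row layout reads every full row.

```python
from datetime import date

# Hypothetical blog table; all values are invented for this example.
rows = [
    {"blog_id": 1, "title": "gossip", "last_updated": date(2016, 4, 10)},
    {"blog_id": 2, "title": "cap", "last_updated": date(2015, 12, 1)},
    {"blog_id": 3, "title": "sstable", "last_updated": date(2016, 3, 20)},
]

# Column-oriented layout: one list per column, aligned by position.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def recent_blog_ids_by_row(rows, since):
    """Row store: every full row is read even though only two fields matter."""
    return [row["blog_id"] for row in rows if row["last_updated"] >= since]


def recent_blog_ids_by_column(columns, since):
    """Column store: scan last_updated only, then fetch matching blog_id entries."""
    hits = [i for i, updated in enumerate(columns["last_updated"]) if updated >= since]
    return [columns["blog_id"][i] for i in hits]


# "Past month" relative to an assumed current date of 2016-04-16.
since = date(2016, 3, 16)

if __name__ == "__main__":
    print(recent_blog_ids_by_row(rows, since))
    print(recent_blog_ids_by_column(columns, since))
```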
CASSANDRA
What is CASSANDRA
A distributed key-value store
Intended to run in a
data-center (and also across
DCs)
Originally designed at
Facebook
Open-sourced later, today an
Apache project
In use
Some of the companies that
use Cassandra in their
production clusters
IBM, Adobe, HP, eBay,
Ericsson, Symantec
Twitter, Spotify
PBS Kids
Netflix: uses Cassandra to
keep track of your current
position in the video you're
watching
CASSANDRA
Objectives and Functions
Performance
Availability
Scalability
Fault-Tolerance
P2P Cluster
Decentralized design
Each node has the same role
No single point of failure
Avoids issues of master-slave DBMS's
No bottlenecking
Cassandra Architecture
DHT Like
Cassandra Architecture ..
Replication Strategy
SimpleStrategy: uses the Partitioner, of which there are two kinds
RandomPartitioner: Chord-like hash partitioning
ByteOrderedPartitioner: Assigns ranges of keys to servers.
Easier for range queries (e.g., get me all twitter users starting with
[a-b])
NetworkTopologyStrategy: for multi-DC
deployments
Two replicas per DC
Three replicas per DC
Per DC
First replica placed according to Partitioner
Then go clockwise around ring until you hit a different rack
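The placement rule for one DC can be sketched as a walk around a hash ring. This is a toy model under stated assumptions, not Cassandra's actual placement code: each node owns a single token derived from an MD5 hash of its name, the first replica is the first node clockwise from the key's token, and further replicas are taken clockwise while preferring racks not used yet. Node names, rack names, and the class itself are hypothetical.

```python
import hashlib
from bisect import bisect_right


def token(name: str) -> int:
    """Hash a key or node name onto the ring (MD5 used as an example hash)."""
    return int(hashlib.md5(name.encode("utf-8")).hexdigest(), 16)


class ToyRing:
    def __init__(self, node_racks: dict):
        self.rack = dict(node_racks)                         # node name -> rack name
        self.ring = sorted((token(n), n) for n in node_racks)
        self.tokens = [t for t, _ in self.ring]

    def replicas(self, key: str, count: int) -> list:
        # First node clockwise from the key's position on the ring.
        start = bisect_right(self.tokens, token(key)) % len(self.ring)
        order = [self.ring[(start + i) % len(self.ring)][1]
                 for i in range(len(self.ring))]
        chosen = [order[0]]
        used_racks = {self.rack[order[0]]}
        # Walk clockwise, taking nodes that sit on a rack not used yet.
        for node in order[1:]:
            if len(chosen) == count:
                break
            if self.rack[node] not in used_racks:
                chosen.append(node)
                used_racks.add(self.rack[node])
        # If distinct racks run out, fill up with the next nodes on the ring.
        for node in order[1:]:
            if len(chosen) == count:
                break
            if node not in chosen:
                chosen.append(node)
        return chosen


if __name__ == "__main__":
    racks = {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack2", "n5": "rack3"}
    print(ToyRing(racks).replicas("user:alice", 3))
```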
Cassandra Architecture ..
At node
On receiving a write
Log it in disk commit log (for failure recovery)
Make changes to appropriate memtables
Memtable = in-memory representation of multiple key-value pairs
Later, when memtable is full or old, flush to disk
Data file: an SSTable (Sorted String Table), a list of key-value
pairs, sorted by key
Index file: an SSTable of (key, position in data SSTable) pairs
And a Bloom filter (for efficient search)
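The write path above can be sketched in a few dozen lines. Everything here is a simplified stand-in: the commit log and SSTables are in-memory lists rather than files, the memtable is flushed when it reaches a small entry limit, and the Bloom filter uses SHA-256 with three probes. The class names, the limit, and the read method (added only to exercise the index and filter) are assumptions, not Cassandra's implementation.

```python
import hashlib


class BloomFilter:
    """Small Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, key: str):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode("utf-8")).hexdigest()
            yield int(digest, 16) % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class ToyNode:
    def __init__(self, memtable_limit: int = 3):
        self.commit_log = []      # stands in for the on-disk commit log
        self.memtable = {}        # in-memory key-value pairs
        self.sstables = []        # each entry: (sorted data, index, bloom filter)
        self.memtable_limit = memtable_limit

    def write(self, key: str, value) -> None:
        self.commit_log.append((key, value))   # 1. log for failure recovery
        self.memtable[key] = value             # 2. update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. memtable is full: flush

    def flush(self) -> None:
        data = sorted(self.memtable.items())                  # data file, sorted by key
        index = {k: pos for pos, (k, _) in enumerate(data)}   # key -> position in data
        bloom = BloomFilter()
        for k, _ in data:
            bloom.add(k)
        self.sstables.append((data, index, bloom))
        self.memtable = {}

    def read(self, key: str):
        if key in self.memtable:
            return self.memtable[key]
        for data, index, bloom in reversed(self.sstables):    # newest SSTable first
            if bloom.might_contain(key) and key in index:
                return data[index[key]][1]
        return None
```

The Bloom filter lets a read skip SSTables that cannot contain the key, which is the "efficient search" role mentioned above.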
Cassandra Architecture ..
CAP
Consistency: all nodes see same data at any time, or reads
return latest written value by any client
Availability: the system allows operations all the time, and
operations return quickly
Partition-tolerance: the system continues to work in spite of
network partitions
CAP Theorem
In a distributed system you can satisfy at most 2 out of the 3 guarantees.
Eventual consistency : weak consistency model
Cassandra Architecture ..
CAP Theorem
Cassandra Consistency
Cassandra chooses Availability and Partition-tolerance,
with eventual (tunable) consistency
Cassandra has consistency levels
CAP Theorem ..
Consistency: No. of Replicas ..
Client is allowed to choose a consistency level for each operation
(read/write)
ANY: any server, Fastest
ALL: all replicas, Ensures strong consistency, but slowest
ONE: at least one replica, Faster than ALL
QUORUM: quorum across all replicas in all datacenters (DCs)
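The trade-off between these levels comes down to how many replicas must answer. The sketch below is plain arithmetic, not driver code: it takes a quorum to be a majority of the N replicas, and it models ANY as guaranteeing zero replica acknowledgements because any server may accept the write. Reads and writes see each other when R + W > N. The function names and this way of modeling ANY are assumptions made for the example.

```python
def replica_acks(level: str, replication_factor: int) -> int:
    """Number of replicas guaranteed to have responded at a consistency level."""
    level = level.upper()
    if level == "ANY":
        return 0                              # any server may acknowledge, not only a replica
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1    # majority of replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")


def reads_see_latest_write(read_level: str, write_level: str,
                           replication_factor: int) -> bool:
    """True when R + W > N, so every read set overlaps every write set."""
    r = replica_acks(read_level, replication_factor)
    w = replica_acks(write_level, replication_factor)
    return r + w > replication_factor


if __name__ == "__main__":
    for read, write in [("ONE", "ONE"), ("QUORUM", "QUORUM"), ("ONE", "ALL")]:
        print(read, write, reads_see_latest_write(read, write, 3))
```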
Cassandra Architecture ..
Cassandra & Gossip
For further learning
Gossip Applications
Distributed
Computing/Networking
Information Dissemination
Gossip Learning
Distributed Data
Research/Projects
Gossip Protocol:
Anti-entropy (simple
epidemics)
Rumor mongering
(complex epidemics)
Eager epidemic
dissemination
Key-Value Store:
Cassandra
Redis
python-Cassandra
Further Learning ..
Cassandra
Resources
http://cassandra.apache.org/
http://academy.datastax.com/
http://www.planetcassandra.org/
Use locally installed
apache-cassandra-3.4
Use Cassandra Cluster service
Python-Cassandra
Cassandra Cluster Setup
cqlengine: Cassandra CQL
object mapper for Python
Python Application
Thank You..
Dr. SAJEEV G P Gossip Protocol  Key-Value Store 30 / 1

More Related Content

What's hot

Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RRadek Maciaszek
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureP. Taylor Goetz
 
Storm: The Real-Time Layer - GlueCon 2012
Storm: The Real-Time Layer  - GlueCon 2012Storm: The Real-Time Layer  - GlueCon 2012
Storm: The Real-Time Layer - GlueCon 2012Dan Lynn
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time ComputationSonal Raj
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleDataWorks Summit/Hadoop Summit
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormEugene Dvorkin
 
Apache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integrationApache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integrationUday Vakalapudi
 
Self-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingSelf-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingVasia Kalavri
 
Storm presentation
Storm presentationStorm presentation
Storm presentationShyam Raj
 
Apache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapApache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapKostas Tzoumas
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsData Con LA
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Stormthe100rabh
 
Patterns of Streaming Applications
Patterns of Streaming ApplicationsPatterns of Streaming Applications
Patterns of Streaming ApplicationsC4Media
 
Storm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataStorm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataDataWorks Summit
 

What's hot (20)

Data Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and RData Stream Algorithms in Storm and R
Data Stream Algorithms in Storm and R
 
Hadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm ArchitectureHadoop Summit Europe 2014: Apache Storm Architecture
Hadoop Summit Europe 2014: Apache Storm Architecture
 
Storm: The Real-Time Layer - GlueCon 2012
Storm: The Real-Time Layer  - GlueCon 2012Storm: The Real-Time Layer  - GlueCon 2012
Storm: The Real-Time Layer - GlueCon 2012
 
Storm
StormStorm
Storm
 
Storm Real Time Computation
Storm Real Time ComputationStorm Real Time Computation
Storm Real Time Computation
 
Improved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as exampleImproved Reliable Streaming Processing: Apache Storm as example
Improved Reliable Streaming Processing: Apache Storm as example
 
Learning Stream Processing with Apache Storm
Learning Stream Processing with Apache StormLearning Stream Processing with Apache Storm
Learning Stream Processing with Apache Storm
 
Apache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integrationApache Storm and twitter Streaming API integration
Apache Storm and twitter Streaming API integration
 
Self-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processingSelf-managed and automatically reconfigurable stream processing
Self-managed and automatically reconfigurable stream processing
 
Storm presentation
Storm presentationStorm presentation
Storm presentation
 
Apache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmapApache Flink: API, runtime, and project roadmap
Apache Flink: API, runtime, and project roadmap
 
Storm
StormStorm
Storm
 
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big AnalyticsReal time big data analytics with Storm by Ron Bodkin of Think Big Analytics
Real time big data analytics with Storm by Ron Bodkin of Think Big Analytics
 
STORM
STORMSTORM
STORM
 
Distributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache StormDistributed Realtime Computation using Apache Storm
Distributed Realtime Computation using Apache Storm
 
Apache Storm
Apache StormApache Storm
Apache Storm
 
Jan 2012 HUG: Storm
Jan 2012 HUG: StormJan 2012 HUG: Storm
Jan 2012 HUG: Storm
 
Patterns of Streaming Applications
Patterns of Streaming ApplicationsPatterns of Streaming Applications
Patterns of Streaming Applications
 
Resource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache StormResource Aware Scheduling in Apache Storm
Resource Aware Scheduling in Apache Storm
 
Storm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-DataStorm-on-YARN: Convergence of Low-Latency and Big-Data
Storm-on-YARN: Convergence of Low-Latency and Big-Data
 

Viewers also liked

2014 09-23 Mechanism of Gossip protocol
2014 09-23 Mechanism of Gossip protocol2014 09-23 Mechanism of Gossip protocol
2014 09-23 Mechanism of Gossip protocolSugawara Genki
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for ElasticsearchJodok Batlogg
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage systemArunit Gupta
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architectureT Jake Luciani
 
Big data key-value and column stores redis - cassandra
Big data  key-value and column stores redis - cassandraBig data  key-value and column stores redis - cassandra
Big data key-value and column stores redis - cassandraJWORKS powered by Ordina
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaMatteo Baglini
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architectureMarkus Klems
 
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQL
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQLEstudo comparativo entr bancos RDBMS, NoSQL e NewSQL
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQLOrlando Vitali
 
Sistemas NoSQL, surgimento, características e exemplos
Sistemas NoSQL, surgimento, características e exemplosSistemas NoSQL, surgimento, características e exemplos
Sistemas NoSQL, surgimento, características e exemplosAricelio Souza
 
Banco de Dados Não Relacionais vs Banco de Dados Relacionais
Banco de Dados Não Relacionais vs Banco de Dados RelacionaisBanco de Dados Não Relacionais vs Banco de Dados Relacionais
Banco de Dados Não Relacionais vs Banco de Dados Relacionaisalexculpado
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache CassandraDataStax
 
Gossip-based algorithms
Gossip-based algorithmsGossip-based algorithms
Gossip-based algorithmsAmir Payberah
 
Introducao aos Bancos de Dados Não-relacionais
Introducao aos Bancos de Dados Não-relacionaisIntroducao aos Bancos de Dados Não-relacionais
Introducao aos Bancos de Dados Não-relacionaisMauricio De Diana
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenLorenzo Alberton
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonGrisha Weintraub
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremRahul Jain
 

Viewers also liked (20)

2014 09-23 Mechanism of Gossip protocol
2014 09-23 Mechanism of Gossip protocol2014 09-23 Mechanism of Gossip protocol
2014 09-23 Mechanism of Gossip protocol
 
SQL for Elasticsearch
SQL for ElasticsearchSQL for Elasticsearch
SQL for Elasticsearch
 
Cassandra
CassandraCassandra
Cassandra
 
Tech Talk Buscapé - Redis
Tech Talk Buscapé - RedisTech Talk Buscapé - Redis
Tech Talk Buscapé - Redis
 
Cassandra - A decentralized storage system
Cassandra - A decentralized storage systemCassandra - A decentralized storage system
Cassandra - A decentralized storage system
 
Cassandra architecture
Cassandra architectureCassandra architecture
Cassandra architecture
 
Big data key-value and column stores redis - cassandra
Big data  key-value and column stores redis - cassandraBig data  key-value and column stores redis - cassandra
Big data key-value and column stores redis - cassandra
 
Key-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscanaKey-value databases in practice Redis @ DotNetToscana
Key-value databases in practice Redis @ DotNetToscana
 
Cassandra background-and-architecture
Cassandra background-and-architectureCassandra background-and-architecture
Cassandra background-and-architecture
 
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQL
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQLEstudo comparativo entr bancos RDBMS, NoSQL e NewSQL
Estudo comparativo entr bancos RDBMS, NoSQL e NewSQL
 
Tech Talk Buscapé - Clean Code
Tech Talk Buscapé - Clean CodeTech Talk Buscapé - Clean Code
Tech Talk Buscapé - Clean Code
 
Sistemas NoSQL, surgimento, características e exemplos
Sistemas NoSQL, surgimento, características e exemplosSistemas NoSQL, surgimento, características e exemplos
Sistemas NoSQL, surgimento, características e exemplos
 
NoSQL: Introducción a las Bases de Datos no estructuradas
NoSQL: Introducción a las Bases de Datos no estructuradasNoSQL: Introducción a las Bases de Datos no estructuradas
NoSQL: Introducción a las Bases de Datos no estructuradas
 
Banco de Dados Não Relacionais vs Banco de Dados Relacionais
Banco de Dados Não Relacionais vs Banco de Dados RelacionaisBanco de Dados Não Relacionais vs Banco de Dados Relacionais
Banco de Dados Não Relacionais vs Banco de Dados Relacionais
 
An Overview of Apache Cassandra
An Overview of Apache CassandraAn Overview of Apache Cassandra
An Overview of Apache Cassandra
 
Gossip-based algorithms
Gossip-based algorithmsGossip-based algorithms
Gossip-based algorithms
 
Introducao aos Bancos de Dados Não-relacionais
Introducao aos Bancos de Dados Não-relacionaisIntroducao aos Bancos de Dados Não-relacionais
Introducao aos Bancos de Dados Não-relacionais
 
NoSQL Databases: Why, what and when
NoSQL Databases: Why, what and whenNoSQL Databases: Why, what and when
NoSQL Databases: Why, what and when
 
Dynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and ComparisonDynamo and BigTable - Review and Comparison
Dynamo and BigTable - Review and Comparison
 
What is NoSQL and CAP Theorem
What is NoSQL and CAP TheoremWhat is NoSQL and CAP Theorem
What is NoSQL and CAP Theorem
 

Similar to Gossip & Key Value Store

Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupesh Bansal
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsVineet Gupta
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Applicationsupertom
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudRevolution Analytics
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Josef Hardi
 
What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0ScyllaDB
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistencyScyllaDB
 
Distributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityDistributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityRenato Lucindo
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Anton Nazaruk
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011sandeep_tata
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisHelena Edelson
 
Inter Task Communication On Volatile Nodes
Inter Task Communication On Volatile NodesInter Task Communication On Volatile Nodes
Inter Task Communication On Volatile Nodesnagarajan_ka
 
Data Engineering for Data Scientists
Data Engineering for Data Scientists Data Engineering for Data Scientists
Data Engineering for Data Scientists jlacefie
 
The Need for Async @ ScalaWorld
The Need for Async @ ScalaWorldThe Need for Async @ ScalaWorld
The Need for Async @ ScalaWorldKonrad Malawski
 
Processing large-scale graphs with Google Pregel
Processing large-scale graphs with Google PregelProcessing large-scale graphs with Google Pregel
Processing large-scale graphs with Google PregelMax Neunhöffer
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"Jihyun Ahn
 

Similar to Gossip & Key Value Store (20)

Bhupeshbansal bigdata
Bhupeshbansal bigdata Bhupeshbansal bigdata
Bhupeshbansal bigdata
 
Handling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web SystemsHandling Data in Mega Scale Web Systems
Handling Data in Mega Scale Web Systems
 
Using Cassandra with your Web Application
Using Cassandra with your Web ApplicationUsing Cassandra with your Web Application
Using Cassandra with your Web Application
 
Speed up R with parallel programming in the Cloud
Speed up R with parallel programming in the CloudSpeed up R with parallel programming in the Cloud
Speed up R with parallel programming in the Cloud
 
No sql
No sqlNo sql
No sql
 
Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!Ontology-based data access: why it is so cool!
Ontology-based data access: why it is so cool!
 
What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0What’s New in ScyllaDB Open Source 5.0
What’s New in ScyllaDB Open Source 5.0
 
Distributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark MeetupDistributed Deep Learning + others for Spark Meetup
Distributed Deep Learning + others for Spark Meetup
 
NoSql Database
NoSql DatabaseNoSql Database
NoSql Database
 
Renegotiating the boundary between database latency and consistency
Renegotiating the boundary between database latency  and consistencyRenegotiating the boundary between database latency  and consistency
Renegotiating the boundary between database latency and consistency
 
Distributed Systems: scalability and high availability
Distributed Systems: scalability and high availabilityDistributed Systems: scalability and high availability
Distributed Systems: scalability and high availability
 
Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?Big Data Streams Architectures. Why? What? How?
Big Data Streams Architectures. Why? What? How?
 
No sql
No sqlNo sql
No sql
 
Spinnaker VLDB 2011
Spinnaker VLDB 2011Spinnaker VLDB 2011
Spinnaker VLDB 2011
 
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch AnalysisNoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
NoLambda: Combining Streaming, Ad-Hoc, Machine Learning and Batch Analysis
 
Inter Task Communication On Volatile Nodes
Inter Task Communication On Volatile NodesInter Task Communication On Volatile Nodes
Inter Task Communication On Volatile Nodes
 
Data Engineering for Data Scientists
Data Engineering for Data Scientists Data Engineering for Data Scientists
Data Engineering for Data Scientists
 
The Need for Async @ ScalaWorld
The Need for Async @ ScalaWorldThe Need for Async @ ScalaWorld
The Need for Async @ ScalaWorld
 
Processing large-scale graphs with Google Pregel
Processing large-scale graphs with Google PregelProcessing large-scale graphs with Google Pregel
Processing large-scale graphs with Google Pregel
 
About "Apache Cassandra"
About "Apache Cassandra"About "Apache Cassandra"
About "Apache Cassandra"
 

Recently uploaded

Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...Booking open Available Pune Call Girls Koregaon Park  6297143586 Call Hot Ind...
Booking open Available Pune Call Girls Koregaon Park 6297143586 Call Hot Ind...Call Girls in Nagpur High Profile
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
AKTU Computer Networks notes --- Unit 3.pdf
AKTU Computer Networks notes ---  Unit 3.pdfAKTU Computer Networks notes ---  Unit 3.pdf
AKTU Computer Networks notes --- Unit 3.pdfankushspencer015
 
UNIT-III FMM. DIMENSIONAL ANALYSIS
UNIT-III FMM.        DIMENSIONAL ANALYSISUNIT-III FMM.        DIMENSIONAL ANALYSIS
UNIT-III FMM. DIMENSIONAL ANALYSISrknatarajan
 
HARDNESS, FRACTURE TOUGHNESS AND STRENGTH OF CERAMICS
Gossip & Key Value Store

  • 1. Gossip Protocol & Key-Value Store Theory and Practice Dr. SAJEEV G P April 16, 2016 Dr. SAJEEV G P Gossip Protocol & Key-Value Store 1 / 1
  • 2. Contents. Outline: Background; Introduction to Gossip; Gossip Model; Key-Value Store; Cassandra; CAP Theorem; Further study.
  • 3. Background
  • 4. Real World Applications: YouTube. Resume video; last-watched status; already watched?
  • 5. Some Real World Applications: Facebook Search; Term Search; Amazon Search; Amazon Recommendations.
  • 6. Cloud Platform: App Engine; Compute Engine; Cloud Storage; Cloud Bigtable; Google Dataflow; Google Translate API; Google BigQuery; Cloud Prediction API. Gossip & Key-Value Store: gossip is for communication; the key-value store is the database.
  • 7. Gossip Protocol. Multicast: group communication in which information is addressed to a group of destination computers simultaneously. Types of casting: unicast, multicast, broadcast. Multicast can be done at the application level or at the network level.
  • 8. Multicast Protocol: Centralized & Tree Based. Centralized; tree based.
  • 9. Gossip Protocol: Epidemic multicast
  • 10. Gossip Analysis. Basics: a population of $(n+1)$ individuals mixes homogeneously. The contact rate between any pair of individuals is $\beta$. At any time, each individual is either uninfected (numbering $x$) or infected (numbering $y$). Then $x_0 = n$, $y_0 = 1$, and at all times $x + y = n + 1$. Contact between an infected and an uninfected individual turns the latter infected, and it stays infected.
  • 11. Gossip Analysis. Gossip properties: lightweight in large groups; spreads quickly; fault-tolerant. Terms: $n + 1$ nodes; $x$ = number of uninfected nodes; $y$ = number of infected nodes; $x + y = n + 1$. Continuous time: $\frac{dx}{dt} = -\beta x y$, where the contact rate is $\beta = \frac{b}{n}$. Solution: $x = \frac{n(n+1)}{n + e^{\beta(n+1)t}}$, $y = \frac{n+1}{1 + n e^{-\beta(n+1)t}}$.
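The closed-form solution on slide 11 can be sanity-checked numerically. The sketch below is not from the slides: it integrates dx/dt = −βxy with a plain forward-Euler loop (an assumed, deliberately simple integrator) and compares the result with the formula for y. The function names and the values n = 1000, b = 2 are illustrative choices only.

```python
import math

def infected_closed_form(n, b, t):
    """Slide formula: y(t) = (n+1) / (1 + n*exp(-beta*(n+1)*t)), beta = b/n."""
    beta = b / n
    return (n + 1) / (1 + n * math.exp(-beta * (n + 1) * t))

def infected_euler(n, b, t_end, dt=1e-4):
    """Integrate dx/dt = -beta*x*y (with x + y = n + 1) by forward Euler."""
    beta = b / n
    x = float(n)                      # x0 = n uninfected, y0 = 1 infected
    for _ in range(int(round(t_end / dt))):
        y = (n + 1) - x
        x += dt * (-beta * x * y)
    return (n + 1) - x                # number infected at t_end

if __name__ == "__main__":
    n, b = 1000, 2                    # illustrative values, not from the slides
    for t in (1.0, 3.0, 5.0):
        print(t, infected_closed_form(n, b, t), infected_euler(n, b, t))
```

The two columns should agree closely, and both start from a single infected node at t = 0.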
  • 12. Gossip Analysis (continued). Number of infected nodes: at $t = c \log(n)$, $y \approx (n+1) - \frac{1}{n^{cb-2}}$. Low latency: set $c$, $b$ to be small numbers; within $t = c \log(n)$ rounds, all but $\frac{1}{n^{cb-2}}$ nodes receive the multicast. Lightweight: each node has transmitted no more than $cb \log(n)$ gossip messages. Fault tolerance: with 50% packet drop, $b \leftarrow \frac{b}{2}$, so it takes twice as many rounds; with 50% node failures, $n \leftarrow \frac{n}{2}$, same as above.
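To make the logarithmic-latency claim concrete, here is a minimal round-based push-gossip simulation. It is a sketch under assumptions the slides do not spell out: rounds are synchronous, every infected node picks b targets uniformly at random each round (possibly itself or an already infected node), and no messages are lost. The function name and parameters are hypothetical.

```python
import math
import random

def push_gossip_rounds(n_nodes, b, seed=0, max_rounds=1000):
    """Return the number of rounds until every node holds the multicast."""
    rng = random.Random(seed)
    infected = {0}                          # node 0 starts with the message
    rounds = 0
    while len(infected) < n_nodes and rounds < max_rounds:
        newly = set()
        for _ in infected:                  # each infected node gossips...
            for _ in range(b):              # ...to b random targets
                newly.add(rng.randrange(n_nodes))
        infected |= newly
        rounds += 1
    return rounds

if __name__ == "__main__":
    for n in (100, 1000, 10000):
        r = push_gossip_rounds(n, b=2)
        # Each infected node sends b messages per round, so at most b*r in total.
        print(n, "rounds:", r, "log(n):", round(math.log(n), 2))
```

Growing n by a factor of ten should add only a few rounds, which is the behaviour the c log(n) bound describes.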
  • 13. Key-Value Stores: the simplest form of database management system. They store pairs of keys and values, and retrieve a value when its key is known. Examples: twitter.com: tweet id ⇒ information about the tweet; amazon.com: item number ⇒ information about it; kayak.com: flight number ⇒ information about the flight, e.g., availability; yourbank.com: account number ⇒ information about it.
  • 14. Key-Value Stores (continued). It's a dictionary data structure (NoSQL database): insert, lookup, and delete by key, e.g., hash table, binary tree. But distributed: key-value stores reuse many techniques from DHTs. NoSQL databases vs. traditional RDBMS: an RDBMS is schema-based, i.e., structured tables; has a primary key that is unique within that table; is queried using SQL; supports joins.
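The dictionary interface named on slide 14 (insert, lookup, delete by key) can be sketched on a single machine as follows. This is only an illustration of the interface: the class name is hypothetical, it wraps a Python hash table, and it does none of the distribution or replication a real key-value store provides.

```python
class KeyValueStore:
    """Single-node dictionary-style store: insert, lookup, delete by key."""

    def __init__(self):
        self._data = {}                 # Python dict = hash table

    def insert(self, key, value):
        self._data[key] = value         # overwrites any existing value

    def lookup(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        """Remove the key; return True if it was present."""
        if key in self._data:
            del self._data[key]
            return True
        return False

if __name__ == "__main__":
    store = KeyValueStore()
    store.insert("flight:AI101", {"seats_available": 12})   # made-up example data
    print(store.lookup("flight:AI101"))
    print(store.delete("flight:AI101"), store.lookup("flight:AI101"))
```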
  • 15. Today's Workload. Big Data era: data is large and unstructured; lots of random reads and writes; sometimes write-heavy; foreign keys rarely needed; joins infrequent. Needs of today's workload: speed; avoid single point of failure (SPOF); low TCO (total cost of operation); fewer system administrators; incremental scalability (scale out, not up).
  • 16. Key-Value or NoSQL. NoSQL systems often use column-oriented storage. RDBMSs store an entire row together (on disk or at a server). NoSQL systems typically store a column together (or a group of columns). Entries within a column are indexed and easy to locate, given a key (and vice versa). Why useful? Range searches within a column are fast since you don't need to fetch the entire database. E.g., get me all the blog-ids from the blog table that were updated within the past month: search in the last-updated column and fetch the corresponding blog-id column; you don't need to fetch the other columns.
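The blog-table range search on slide 16 can be illustrated with a toy column layout. Everything below is assumed for illustration: the table contents, the integer YYYYMMDD date encoding, and the sorted index on the last-updated column. It shows the idea only, not how any particular NoSQL system stores columns.

```python
from bisect import bisect_left

# Hypothetical blog table stored column by column (row i is position i in each list).
blog_id = [101, 102, 103, 104]
last_updated = [20160110, 20160402, 20160215, 20160411]   # YYYYMMDD as integers
body = ["...", "...", "...", "..."]                       # never touched by the query

# Index on the last_updated column: (value, row) pairs sorted by value.
updated_index = sorted((value, row) for row, value in enumerate(last_updated))

def blog_ids_updated_since(cutoff):
    """Range search in one column, then fetch only the blog_id column."""
    start = bisect_left(updated_index, (cutoff, -1))      # first entry >= cutoff
    return [blog_id[row] for _, row in updated_index[start:]]

if __name__ == "__main__":
    print(blog_ids_updated_since(20160316))
```

Only the index and the `blog_id` list are read; the `body` column is never fetched.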
  • 17. CASSANDRA. What is Cassandra? A distributed key-value store; intended to run in a data center (and also across DCs); originally designed at Facebook; open-sourced later, today an Apache project. In use: some of the companies that use Cassandra in their production clusters: IBM, Adobe, HP, eBay, Ericsson, Symantec; Twitter, Spotify; PBS Kids; Netflix, which uses Cassandra to keep track of your current position in the video you're watching.
  • 18. CASSANDRA. Objectives and functions: performance; availability; scalability; fault tolerance. P2P cluster: decentralized design; each node has the same role; no single point of failure; avoids the issues of master-slave DBMSs; no bottlenecking.
  • 19. Cassandra Architecture: DHT-like
  • 20. Cassandra Architecture (continued). Replication strategy. SimpleStrategy: uses the Partitioner, of which there are two kinds. RandomPartitioner: Chord-like hash partitioning. ByteOrderedPartitioner: assigns ranges of keys to servers; easier for range queries (e.g., get me all Twitter users starting with [a-b]). NetworkTopologyStrategy: for multi-DC deployments; two replicas per DC or three replicas per DC. Per DC: the first replica is placed according to the Partitioner; then go clockwise around the ring until you hit a different rack.
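The per-DC placement rule on slide 20 (first replica by the Partitioner, then clockwise until a different rack) can be sketched on a toy ring. This is an assumption-laden illustration, not Cassandra's implementation: the six nodes, their racks, the evenly spaced tokens, and the 2^32 ring size are all made up, and MD5 is used simply as a convenient hash for a Chord-like placement.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 32      # illustrative ring size, chosen for readability

# Hypothetical nodes in one data center: (name, rack).
NODES = [("n1", "rack1"), ("n2", "rack1"), ("n3", "rack2"),
         ("n4", "rack2"), ("n5", "rack1"), ("n6", "rack2")]

# Evenly spaced tokens; entries are (token, name, rack) sorted by token.
RING = sorted((i * RING_SIZE // len(NODES), name, rack)
              for i, (name, rack) in enumerate(NODES))
RACK_OF = {name: rack for _, name, rack in RING}

def token_for(key):
    """Hash a key onto the ring (hash partitioning)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % RING_SIZE

def replicas(key, count=2):
    """First replica by token; further replicas clockwise on racks not yet used."""
    tokens = [t for t, _, _ in RING]
    start = bisect_left(tokens, token_for(key)) % len(RING)   # wrap past the end
    chosen = [RING[start][1]]
    racks_used = {RING[start][2]}
    for step in range(1, len(RING)):
        if len(chosen) == count:
            break
        _, name, rack = RING[(start + step) % len(RING)]
        if rack not in racks_used:          # skip nodes on an already used rack
            chosen.append(name)
            racks_used.add(rack)
    return chosen

if __name__ == "__main__":
    for key in ("alice", "bob", "carol"):
        print(key, replicas(key))
```

With only two racks in this toy ring, asking for more than two replicas returns fewer nodes than requested; a fuller sketch would fall back to reusing racks.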
  • 21. Cassandra Architecture (continued). At a node, on receiving a write: log it in the disk commit log (for failure recovery); make changes to the appropriate memtables. Memtable = in-memory representation of multiple key-value pairs. Later, when the memtable is full or old, flush it to disk. Data file: an SSTable (Sorted String Table), a list of key-value pairs sorted by key. Index file: an SSTable of (key, position in data SSTable) pairs. And a Bloom filter (for efficient search).
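The write path on slide 21 can be sketched in memory as below. Several things here are assumptions for illustration: the commit log and SSTables are plain Python lists rather than files, the memtable flushes after three entries, the separate index file is omitted, and the read path (memtable first, then SSTables newest to oldest, skipping those the Bloom filter rules out) is added to show why the Bloom filter helps; the slide itself describes only writes. Class and method names are hypothetical.

```python
import hashlib
from bisect import bisect_left

class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""
    def __init__(self, size=256, hashes=3):
        self.size, self.hashes, self.bits = size, hashes, 0

    def _positions(self, key):
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key):
        for p in self._positions(key):
            self.bits |= 1 << p

    def might_contain(self, key):
        return all((self.bits >> p) & 1 for p in self._positions(key))

class Node:
    def __init__(self, memtable_limit=3):
        self.commit_log = []          # stands in for the on-disk commit log
        self.memtable = {}            # in-memory key-value pairs
        self.sstables = []            # oldest first; each is (sorted pairs, bloom)
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. log for failure recovery
        self.memtable[key] = value             # 2. update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush when "full"

    def flush(self):
        pairs = sorted(self.memtable.items())  # SSTable: pairs sorted by key
        bloom = BloomFilter()
        for key, _ in pairs:
            bloom.add(key)
        self.sstables.append((pairs, bloom))
        self.memtable = {}

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for pairs, bloom in reversed(self.sstables):   # newest SSTable first
            if not bloom.might_contain(key):
                continue                               # definitely not in this one
            keys = [k for k, _ in pairs]
            i = bisect_left(keys, key)
            if i < len(keys) and keys[i] == key:
                return pairs[i][1]
        return None
```

For example, after `write("a", 1)`, `write("b", 2)`, `write("c", 3)` the memtable is flushed to one SSTable, and a later `write("a", 9)` lives in the new memtable, so `read("a")` returns the newer value.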
  • 22. Cassandra Architecture (continued). CAP. Consistency: all nodes see the same data at any time, or reads return the latest value written by any client. Availability: the system allows operations all the time, and operations return quickly. Partition tolerance: the system continues to work in spite of network partitions. CAP Theorem: in a distributed system you can satisfy at most 2 out of the 3 guarantees. Eventual consistency: a weak consistency model.
  • 23. Cassandra Architecture (continued). CAP Theorem: CAP and Cassandra. Cassandra chooses availability and partition tolerance, and offers eventual (weak) consistency by default. Cassandra has consistency levels that let the client trade latency for stronger consistency.
  • 24. CAP Theorem (continued). Consistency: number of replicas. The client is allowed to choose a consistency level for each operation (read/write). ANY: any server; fastest. ALL: all replicas; ensures strong consistency, but slowest. ONE: at least one replica; faster than ALL. QUORUM: quorum across all replicas in all datacenters (DCs).
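The consistency levels on slide 24 translate into a number of replica acknowledgements the coordinator waits for. The helper below is a sketch with hypothetical function names. Two points are not stated on the slide and are added here as standard background: a quorum is a majority, floor(N/2) + 1 of N replicas, and a read is guaranteed to see the latest write when the read and write replica sets must overlap, i.e. R + W > N. ANY is treated as guaranteeing no replica acknowledgement, since any server may accept the write.

```python
def required_replica_acks(level, replication_factor):
    """Replica acknowledgements guaranteed by a consistency level."""
    if level == "ANY":
        return 0                              # any server may accept the write
    if level == "ONE":
        return 1
    if level == "QUORUM":
        return replication_factor // 2 + 1    # majority of replicas
    if level == "ALL":
        return replication_factor
    raise ValueError(f"unknown consistency level: {level}")

def read_sees_latest_write(read_level, write_level, replication_factor):
    """True when read and write replica sets must overlap (R + W > N)."""
    r = required_replica_acks(read_level, replication_factor)
    w = required_replica_acks(write_level, replication_factor)
    return r + w > replication_factor

if __name__ == "__main__":
    n = 3                                     # illustrative replication factor
    for level in ("ANY", "ONE", "QUORUM", "ALL"):
        print(level, required_replica_acks(level, n))
    print(read_sees_latest_write("QUORUM", "QUORUM", n))
    print(read_sees_latest_write("ONE", "ONE", n))
```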
  • 25. Cassandra Architecture (continued): Cassandra gossip
  • 26. For Further Learning. Gossip applications: distributed computing/networking; information dissemination; gossip learning; distributed data. Research/projects. Gossip protocol: anti-entropy (simple epidemics); rumor mongering (complex epidemics); eager epidemic dissemination. Key-value store: Cassandra; Redis; python-Cassandra.
  • 27. Further Learning (continued). Cassandra resources: http://cassandra.apache.org/ and http://academy.datastax.com/ and http://www.planetcassandra.org/ . Use a locally installed apache-cassandra-3.4, or use a Cassandra cluster service. Python-Cassandra: Cassandra cluster setup; cqlengine: Cassandra CQL object mapper for Python; Python application.
  • 30. Thank You.