Cassandra - Research Paper Overview

Cassandra
A Decentralized Structured Storage System
Avinash Lakshman (Facebook), Prashant Malik (Facebook)
Presented by Sameera Nelson

Outline …
 Introduction
 Data Model
 System Architecture
 Bootstrapping & Scaling
 Local Persistence
 Conclusion

What is Cassandra?
 Distributed Storage System
 Manages Structured Data
 Highly available, no single point of failure (SPoF)
 Not a Relational Data Model
 Handles high write throughput
◦ Without sacrificing read efficiency

Motivation
 Operational Requirements in Facebook
◦ Performance
◦ Reliability/ Dealing with Failures
◦ Efficiency
◦ Continuous Growth
 Application
◦ Inbox Search Problem, Facebook

Related Work
 Google File System
◦ Distributed FS, Single master/Slave
 Ficus/ Coda
◦ Distributed FS
 Farsite
◦ Distributed FS, No centralized server
 Bayou
◦ Distributed Relational DB System
 Dynamo
◦ Distributed Storage system

Data Model
Figure from Eben Hewitt’s slides.

• Table
  • Multidimensional map indexed by key
• Columns
  • Grouped into Column Families
    • Simple
    • Super (Nested Column Families)
• Column has
  • Name/ Value/ Timestamp

Supported Operations
 insert(table, key, rowMutation)
 get(table, key, columnName)
 delete(table, key, columnName)
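The data model and the three operations above can be sketched in memory as follows. This is only an illustration, not Cassandra's implementation: the `Table` and `Column` classes, the `family:column` naming of `column_name`, and the rule that the newest timestamp wins are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Column:
    name: str
    value: str
    timestamp: int


class Table:
    """Hypothetical in-memory table: row key -> column family -> column name -> Column."""

    def __init__(self) -> None:
        self.rows: Dict[str, Dict[str, Dict[str, Column]]] = {}

    def insert(self, key: str, row_mutation: Dict[str, List[Column]]) -> None:
        # row_mutation maps a column family name to the columns to write (assumed shape).
        row = self.rows.setdefault(key, {})
        for family, columns in row_mutation.items():
            stored = row.setdefault(family, {})
            for col in columns:
                current = stored.get(col.name)
                # Assumption: keep the column with the newest timestamp.
                if current is None or col.timestamp >= current.timestamp:
                    stored[col.name] = col

    def get(self, key: str, column_name: str) -> Optional[Column]:
        family, name = column_name.split(":", 1)
        return self.rows.get(key, {}).get(family, {}).get(name)

    def delete(self, key: str, column_name: str) -> None:
        family, name = column_name.split(":", 1)
        self.rows.get(key, {}).get(family, {}).pop(name, None)
```

For example, `Table().insert("user1", {"profile": [Column("fname", "john", 1)]})` stores one column under the hypothetical `profile` family, which `get("user1", "profile:fname")` then returns.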

Query Language
CREATE TABLE users (
    user_id int PRIMARY KEY,
    fname text,
    lname text
);
INSERT INTO users (user_id, fname, lname)
    VALUES (1745, 'john', 'smith');
SELECT * FROM users;

Fully Distributed …
 No Single Point of Failure

Cassandra Architecture
 Partitioning
 Data distribution across nodes
 Replication
 Data duplication across nodes
 Cluster Membership
 Node management in cluster
 adding/ deleting

Partitioning
 The Token Ring
 Partitions using Consistent hashing
 Assignment into the relevant partition
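A minimal sketch of how a key could be assigned to a node on a token ring with consistent hashing. The `TokenRing` class, its method names, and the use of MD5 as the hash function are assumptions for illustration; the slides do not specify them.

```python
import bisect
import hashlib
from typing import Dict


def token_for(key: str) -> int:
    # Assumption: MD5 is used only to spread keys over the ring in this sketch.
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class TokenRing:
    """Hypothetical ring: each node owns the range ending at its token."""

    def __init__(self, node_tokens: Dict[str, int]) -> None:
        ordered = sorted((token, node) for node, token in node_tokens.items())
        self._tokens = [token for token, _ in ordered]
        self._nodes = [node for _, node in ordered]

    def owner_for_token(self, token: int) -> str:
        # First node whose token is >= the key's token; wrap around past the end.
        idx = bisect.bisect_left(self._tokens, token) % len(self._tokens)
        return self._nodes[idx]

    def owner(self, key: str) -> str:
        return self.owner_for_token(token_for(key))
```

Walking clockwise to the first node at or after the key's position means that adding or removing a node only affects its immediate neighbours' ranges.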

Replication
 Based on configured replication factor

Replication
 Different Replication Policies
◦ Rack Unaware
 Replicas placed on the N-1 successors of the coordinator on the ring
◦ Rack Aware
 A leader, elected via Zookeeper, assigns replica ranges
◦ Data center Aware
 Similar to Rack Aware, with the leader chosen at data center level
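As a rough illustration of the Rack Unaware policy, the sketch below picks the coordinator plus its next N-1 neighbours on the ring. The function name and the list-of-names representation of the ring are hypothetical; the rack-aware and data-center-aware policies are not covered.

```python
from typing import List


def replicas_rack_unaware(ring_nodes: List[str], coordinator_index: int, n: int) -> List[str]:
    """Return the coordinator followed by its N-1 successors.

    ring_nodes is assumed to be listed in token order; n is the
    configured replication factor.
    """
    count = min(n, len(ring_nodes))  # cannot place more replicas than there are nodes
    return [ring_nodes[(coordinator_index + i) % len(ring_nodes)] for i in range(count)]
```

With four nodes in ring order `["A", "B", "C", "D"]`, coordinator `C` (index 2) and a replication factor of 3, the replicas wrap around the ring to `C`, `D`, `A`.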

Cluster Membership
 Based on Scuttlebutt
 Efficient gossip-based mechanism
 Inspired by real-life rumor spreading
 Anti-entropy protocol
◦ Repairs replicated data by comparing and reconciling differences
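The idea of gossip-based membership can be sketched as nodes exchanging their views and keeping the freshest entry for each peer. This is a simplified toy, not the Scuttlebutt protocol itself: the heartbeat-version map, the `merge` and `gossip_round` helpers, and the push-pull exchange with one random peer per round are assumptions.

```python
import random
from typing import Dict


def merge(local: Dict[str, int], remote: Dict[str, int]) -> None:
    """Reconcile two views: for each node keep the higher heartbeat version."""
    for node, version in remote.items():
        if version > local.get(node, -1):
            local[node] = version


def gossip_round(states: Dict[str, Dict[str, int]], rng: random.Random) -> None:
    """states maps each node to its own view {node name: heartbeat version}."""
    for node in list(states):
        # Each node bumps its own heartbeat, then exchanges views with one random peer.
        states[node][node] = states[node].get(node, 0) + 1
        peers = [p for p in states if p != node]
        if not peers:
            continue
        peer = rng.choice(peers)
        merge(states[peer], states[node])
        merge(states[node], states[peer])
```

Starting from views in which every node knows only itself, a few calls to `gossip_round` spread membership information to the whole cluster without any central coordinator.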

Cluster Membership
Gossip Based

Cluster Membership
 Failure Detection
◦ Accrual Failure Detector
If a node is faulty, the suspicion level increases.
Φ(t) → k as t → k
k - threshold variable
◦ If node is correct
Φ(t) = 0
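A hedged sketch of how a suspicion level Φ might be computed from heartbeat arrival times. The formula used here, Φ = -log10(P(heartbeat arrives later than the elapsed time)) under an exponential model of inter-arrival times, is an assumption for the example and is not stated on the slide; the class and method names are hypothetical.

```python
import math
from collections import deque
from typing import Deque, Optional


class AccrualFailureDetector:
    def __init__(self, threshold: float, window: int = 1000) -> None:
        self.threshold = threshold                      # the threshold k from the slide
        self.intervals: Deque[float] = deque(maxlen=window)
        self.last_heartbeat: Optional[float] = None

    def heartbeat(self, now: float) -> None:
        # Record the gap since the previous heartbeat in a sliding window.
        if self.last_heartbeat is not None:
            self.intervals.append(now - self.last_heartbeat)
        self.last_heartbeat = now

    def phi(self, now: float) -> float:
        if not self.intervals or self.last_heartbeat is None:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        elapsed = now - self.last_heartbeat
        if mean <= 0:
            return math.inf if elapsed > 0 else 0.0
        # Exponential model (assumed): P(later) = exp(-elapsed / mean),
        # so phi = -log10(P(later)) = elapsed / (mean * ln 10).
        return elapsed / (mean * math.log(10))

    def is_suspected(self, now: float) -> bool:
        return self.phi(now) > self.threshold
```

In this sketch Φ is 0 at the moment a heartbeat arrives and grows the longer the node stays silent, so a node is suspected once Φ exceeds the chosen threshold rather than after a fixed timeout.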

Bootstrapping & Scaling
 Bootstrapping
◦ Node selects random token
◦ Locally persisted, gossiped to cluster
 Scaling
◦ Cassandra bootstrap algorithm initiated by operator
◦ New node gets a split range of a heavily loaded node
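One way to picture the scaling step: the new node receives a token that splits the range of the most heavily loaded node. The sketch below assumes a ring of integer tokens, a per-node load figure, and a midpoint split; none of these details come from the slides, and `token_to_relieve` is a hypothetical helper.

```python
from typing import Dict


def token_to_relieve(ring: Dict[str, int], load: Dict[str, float], ring_size: int) -> int:
    """Pick a token that halves the range owned by the most loaded node.

    ring maps node name -> token; a node owns the range
    (predecessor token, own token]. The midpoint split is an assumption.
    """
    ordered = sorted(ring.items(), key=lambda item: item[1])
    heavy = max(load, key=load.get)
    idx = [node for node, _ in ordered].index(heavy)
    end = ordered[idx][1]
    start = ordered[idx - 1][1]  # index -1 wraps to the last node on the ring
    # A single-node ring owns everything, so treat a zero span as the full ring.
    span = (end - start) % ring_size or ring_size
    return (start + span // 2) % ring_size
```

The new node would then persist this token locally and gossip it to the cluster, taking over the lower half of the heavy node's range.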

Local Persistence
 Write Operation
◦ Flush to disk after threshold
◦ Sequential writes to disk, with an index per data file
◦ Data file merging
◦ Rolling Commit logs
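The write path above can be sketched as: append to a commit log, update an in-memory table, flush to a sorted data file once a threshold is reached, and merge data files later. This toy keeps everything in Python lists and dicts; the `LocalStore` class, the entry-count threshold, and clearing the log on flush are simplifying assumptions.

```python
from typing import Dict, List, Tuple


class LocalStore:
    def __init__(self, flush_threshold: int) -> None:
        self.flush_threshold = flush_threshold
        self.commit_log: List[Tuple[str, str]] = []        # stands in for the on-disk log
        self.memtable: Dict[str, str] = {}
        self.data_files: List[List[Tuple[str, str]]] = []  # oldest first, each sorted by key

    def write(self, key: str, value: str) -> None:
        self.commit_log.append((key, value))  # 1. log the write for durability
        self.memtable[key] = value            # 2. apply it in memory
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        # Dump the in-memory table as one sorted, immutable data file.
        if self.memtable:
            self.data_files.append(sorted(self.memtable.items()))
            self.memtable = {}
            self.commit_log.clear()  # assumption: flushed entries no longer need the log

    def compact(self) -> None:
        # Merge all data files; values from newer files overwrite older ones.
        merged: Dict[str, str] = {}
        for data_file in self.data_files:
            merged.update(data_file)
        self.data_files = [sorted(merged.items())] if merged else []
```

Because each flush writes a whole sorted file at once, disk writes stay sequential, and the periodic merge keeps the number of files a read has to consult small.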

Local Persistence
 Read Operation
◦ Indexes all data on primary key
◦ Maintain column indices
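A read can be pictured as checking the in-memory table first and then the data files from newest to oldest, using each file's key index to avoid scanning it. The `DataFile` class and `read` function below are hypothetical, and a sorted key list with binary search stands in for the on-disk index; column indices are not modelled.

```python
import bisect
from typing import Dict, List, Optional


class DataFile:
    """Immutable file of rows sorted by key, with the key list acting as its index."""

    def __init__(self, rows: Dict[str, str]) -> None:
        self.keys = sorted(rows)
        self.values = [rows[k] for k in self.keys]

    def lookup(self, key: str) -> Optional[str]:
        # Binary search over the sorted keys instead of scanning the whole file.
        i = bisect.bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.values[i]
        return None


def read(key: str, memtable: Dict[str, str], data_files: List[DataFile]) -> Optional[str]:
    if key in memtable:                      # freshest data lives in memory
        return memtable[key]
    for data_file in reversed(data_files):   # then newest file first
        value = data_file.lookup(key)
        if value is not None:
            return value
    return None
```

Checking newer files before older ones means the most recent value for a key is found first, so older copies never need to be updated in place.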

Conclusion
 Proven high scalability, performance, and wide applicability
 Very high update throughput, delivering low latency
 Future work
◦ Adding compression
◦ Support atomicity across keys
◦ Secondary index support
