
Cassandra - Research Paper Overview


  1. 1. Cassandra: A Decentralized Structured Storage System. Avinash Lakshman and Prashant Malik, Facebook. Presented by Sameera Nelson
  2. 2. Outline …  Introduction  Data Model  System Architecture  Bootstrapping & Scaling  Local Persistence  Conclusion
  3. 3. What is Cassandra?  Distributed Storage System  Manages structured data  Highly available, no single point of failure (SPoF)  Not a relational data model  Handles high write throughput ◦ No impact on read efficiency
  4. 4. Motivation  Operational requirements at Facebook ◦ Performance ◦ Reliability/dealing with failures ◦ Efficiency ◦ Continuous growth  Application ◦ Inbox Search problem, Facebook
  5. 5. Related Work  Google File System ◦ Distributed FS, single master  Ficus/Coda ◦ Distributed FS  Farsite ◦ Distributed FS, no centralized server  Bayou ◦ Distributed relational DB system  Dynamo ◦ Distributed storage system
  6. 6. Data Model
  7. 7. Data Model Figure from Eben Hewitt’s slides.
  8. 8. Data Model • Table: a multidimensional map indexed by key • Columns: grouped into Column Families • Simple • Super (nested Column Families) • Each column has a Name / Value / Timestamp
  9. 9. Supported Operations  insert(table; key; rowMutation)  get(table; key; columnName)  delete(table; key; columnName)
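The data model and the three supported operations can be sketched together as a nested in-memory map; this is a hypothetical illustration only (the `Table` class, the `family:column` name convention, and wall-clock timestamps are assumptions, not the paper's implementation):

```python
import time

# Hypothetical sketch: a table is a map from row key to column families,
# each mapping column name -> (value, timestamp).
class Table:
    def __init__(self):
        self.rows = {}  # key -> {column_family: {column_name: (value, ts)}}

    def insert(self, key, row_mutation):
        # row_mutation: {column_family: {column_name: value}}
        row = self.rows.setdefault(key, {})
        for cf, columns in row_mutation.items():
            fam = row.setdefault(cf, {})
            for name, value in columns.items():
                fam[name] = (value, time.time())  # timestamp for conflict resolution

    def get(self, key, column_name):
        cf, _, name = column_name.partition(":")  # assumed "family:column" form
        return self.rows.get(key, {}).get(cf, {}).get(name, (None, None))[0]

    def delete(self, key, column_name):
        cf, _, name = column_name.partition(":")
        self.rows.get(key, {}).get(cf, {}).pop(name, None)

users = Table()
users.insert("1745", {"profile": {"fname": "john", "lname": "smith"}})
print(users.get("1745", "profile:fname"))  # -> john
users.delete("1745", "profile:fname")
print(users.get("1745", "profile:fname"))  # -> None
```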
  10. 10. Query Language
     CREATE TABLE users (
       user_id int PRIMARY KEY,
       fname text,
       lname text
     );
     INSERT INTO users (user_id, fname, lname) VALUES (1745, 'john', 'smith');
     SELECT * FROM users;
  11. 11. System Architecture
  12. 12. Fully Distributed …  No Single Point of Failure
  13. 13. Cassandra Architecture  Partitioning ◦ Data distribution across nodes  Replication ◦ Data duplication across nodes  Cluster Membership ◦ Node management in the cluster: adding/removing nodes
  14. 14. Partitioning  The Token Ring
  15. 15. Partitioning  Partitions using Consistent hashing
  16. 16. Partitioning  Each key is assigned to the relevant partition
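The token-ring assignment described above can be sketched as follows; the `Ring` class and MD5-based token function are illustrative assumptions (Cassandra's actual partitioners differ), showing only the core idea that a key is owned by the first node walking clockwise from the key's position on the ring:

```python
import hashlib
from bisect import bisect_right

def token(s):
    # map a string onto the ring's hash space
    return int(hashlib.md5(s.encode()).hexdigest(), 16)

class Ring:
    def __init__(self, nodes):
        # each node gets a position (token) on the ring
        self.tokens = sorted((token(n), n) for n in nodes)

    def coordinator(self, key):
        # walk clockwise: first node whose token >= hash(key), wrapping around
        t = token(key)
        idx = bisect_right([tok for tok, _ in self.tokens], t)
        return self.tokens[idx % len(self.tokens)][1]

ring = Ring(["node-a", "node-b", "node-c"])
print(ring.coordinator("user:1745"))
```

Because only ring neighbors are affected when a node joins or leaves, consistent hashing moves a minimal amount of data on membership changes.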
  17. 17. Replication  Based on configured replication factor
  18. 18. Replication  Different Replication Policies ◦ Rack Unaware  Replicate on the N-1 successive ring nodes ◦ Rack Aware  Uses Zookeeper to elect a leader ◦ Datacenter Aware  Similar to Rack Aware; leader chosen at the datacenter level
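The Rack Unaware policy above can be sketched as a small function over the ring; the token values and node names are made up for illustration. The key's coordinator plus its N-1 successors on the ring hold the replicas:

```python
from bisect import bisect_right

# Hypothetical rack-unaware replication sketch.
def replicas(ring, key_token, n):
    # ring: list of (token, node) pairs sorted by token
    tokens = [t for t, _ in ring]
    idx = bisect_right(tokens, key_token)  # coordinator position, clockwise
    return [ring[(idx + i) % len(ring)][1] for i in range(n)]

ring = [(10, "a"), (40, "b"), (70, "c"), (90, "d")]
print(replicas(ring, 55, 3))  # -> ['c', 'd', 'a']
```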
  19. 19. Cluster Membership  Based on Scuttlebutt  Efficient Gossip-based mechanism  Inspired by real-life rumor spreading  Anti-Entropy protocol ◦ Repairs replicated data by comparing and reconciling differences
  20. 20. Cluster Membership Gossip Based
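A toy simulation of gossip-based spread, with made-up parameters, shows why membership state converges quickly: each round, every node that knows a piece of state tells one random peer, so knowledge spreads to all n nodes in roughly O(log n) rounds:

```python
import random

def gossip_rounds(n_nodes, seed=0):
    # count rounds until a rumor starting at node 0 reaches everyone
    random.seed(seed)
    informed = {0}
    rounds = 0
    while len(informed) < n_nodes:
        for node in list(informed):
            informed.add(random.randrange(n_nodes))  # tell one random peer
        rounds += 1
    return rounds

print(gossip_rounds(64))
```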
  21. 21. Cluster Membership  Failure Detection ◦ Accrual Failure Detector  If a node is faulty, the suspicion level increases: Φ(t) → ∞ as t → ∞  Failure is declared when Φ exceeds a threshold k ◦ If the node is correct, Φ(t) → 0
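A minimal sketch of the accrual idea, under a simplifying assumption not in the slides (exponentially distributed heartbeat inter-arrivals): Φ is the negative log of the probability that a heartbeat would arrive this late, so it grows without bound while a node stays silent:

```python
import math

def phi(time_since_last_heartbeat, mean_interval):
    # P(next heartbeat arrives later than t) under an exponential model
    p_later = math.exp(-time_since_last_heartbeat / mean_interval)
    # suspicion level: -log10 of that probability (phi-accrual convention)
    return -math.log10(p_later)

print(round(phi(1.0, 1.0), 3))   # low suspicion just past the expected beat
print(round(phi(10.0, 1.0), 3))  # suspicion keeps growing while silent
```

Unlike a boolean detector, each application can pick its own threshold k on Φ, trading detection speed against false positives.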
  22. 22. Bootstrapping & Scaling
  23. 23. Bootstrapping & Scaling  Bootstrapping ◦ Node selects a random token ◦ Locally persisted, gossiped to the cluster  Scaling ◦ Cassandra bootstrap algorithm initiated by an operator ◦ New node gets a split range of a heavily loaded node
  24. 24. Local Persistence
  25. 25. Local Persistence  Write Operation
  26. 26. Local Persistence  Write Operation ◦ Flush to disk after a threshold ◦ Sequential entries, with an index per data file ◦ Data file merging ◦ Rolling commit logs
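The write path above can be sketched end to end; the `WritePath` class and its tiny flush threshold are illustrative assumptions, showing only the ordering the slide describes: append to the commit log, update the in-memory table, and flush sequentially to an immutable sorted data file once a threshold is crossed:

```python
# Hypothetical write-path sketch: commit log -> memtable -> sorted data file.
class WritePath:
    def __init__(self, flush_threshold=3):
        self.commit_log = []
        self.memtable = {}
        self.sstables = []  # immutable, sorted key->value segments
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # durability first
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # sequential dump, sorted by key
        self.sstables.append(dict(sorted(self.memtable.items())))
        self.memtable.clear()
        self.commit_log.clear()  # rolling log: entries obsolete after flush

wp = WritePath()
for i, k in enumerate(["c", "a", "b"]):
    wp.write(k, i)
print(len(wp.sstables), wp.memtable)  # -> 1 {}
```

Merging the accumulated segments in the background (the "data file merging" bullet) keeps reads from having to consult too many files.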
  27. 27. Local Persistence  Read Operation ◦ Indexes all data on primary key ◦ Maintains column indices
  28. 28. Conclusion
  29. 29. Conclusion  Proven high scalability, performance, and wide applicability  Very high update throughput, delivering low latency  Future work ◦ Adding compression ◦ Support atomicity across keys ◦ Secondary index support
  30. 30. Thank You
