Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.



Published on

A NoSQL Database

Published in: Internet
  • Be the first to comment


  2. 2. Agenda What is Cassandra? History Architecture Key Features and Benefits Who’s using Cassandra? Where to get Cassandra
  3. 3. Definition of Cassandra Apache Cassandra™ is a free Distributed… High performance… Extremely scalable… Fault tolerant (i.e. no single point of failure)… post-relational database solution. Cassandra can serve as both real-time datastore (the “system of record”) for online/transactional applications, and as a read-intensive database for business intelligence systems.
  4. 4. The History of Cassandra Bigtable Dynamo
  5. 5. Architecture Overview  Cassandra was designed with the understanding that system/hardware failures can and do occur  Peer-to-peer, distributed system  All nodes the same  Data partitioned among all nodes in the cluster  Custom data replication to ensure fault tolerance  Read/Write-anywhere design
  6. 6. Architecture Overview  Each node communicates with each other through the Gossip protocol, which exchanges information across the cluster every second  A commit log is used on each node to capture write activity. Data durability is assured  Data also written to an in-memory structure (memtable) and then to disk once the memory structure is full (an SStable)
  7. 7. Architecture Overview  The schema used in Cassandra is mirrored after Google Bigtable. It is a row-oriented, column structure  A keyspace is akin to a database in the RDBMS world  A column family is similar to an RDBMS table but is more flexible/dynamic  A row in a column family is indexed by its key. Other columns may be indexed as well ID Name SSN DOB Portfolio Keyspace Customer Column Family
  8. 8. Data Model keyspace settings column family settings column name value timestamp * Figure taken from Eben Hewitt’s (author of Oreilly’s Cassandra book) slides.
  9. 9.  Partitioning How data is partitioned across nodes  Replication How data is duplicated across nodes  Cluster Membership How nodes are added, deleted to the cluster System Architecture
  10. 10. • Nodes are logically structured in Ring Topology. • Hashed value of key associated with data partition is used to assign it to a node in the ring. • Hashing rounds off after certain value to support ring structure. • Lightly loaded nodes moves position to alleviate highly loaded nodes. Partitioning
  11. 11. Replication  Each data item is replicated at N (replication factor) nodes.  Different Replication Policies ◦ Rack Unaware – replicate data at N-1 successive nodes after its coordinator ◦ Rack Aware – uses ‘Zookeeper’ to choose a leader which tells nodes the range they are replicas for ◦ Datacenter Aware – similar to Rack Aware but leader is chosen at Datacenter level instead of Rack level.
  12. 12. 01 1/2 F E D C B A N=3 h(key2) h(key1) 12 Partitioning and Replication * Figure taken from Avinash Lakshman and Prashant Malik (authors of the paper) slides.
  13. 13. Gossip Protocols • Network Communication protocols inspired for real life rumour spreading. • Periodic, Pairwise, inter-node communication. • Low frequency communication ensures low cost. • Random selection of peers. • Example – Node A wish to search for pattern in data – Round 1 – Node A searches locally and then gossips with node B. – Round 2 – Node A,B gossips with C and D. – Round 3 – Nodes A,B,C and D gossips with 4 other nodes …… • Round by round doubling makes protocol very robust.
  14. 14. Gossip Protocols • Variety of Gossip Protocols exists – Dissemination protocol • Event Dissemination: multicasts events via gossip. high latency might cause network strain. • Background data dissemination: continuous gossip about information regarding participating nodes – Anti Entropy protocol • Used to repair replicated data by comparing and reconciling differences. This type of protocol is used in Cassandra to repair data in replications.
  15. 15. Cluster Management  Uses Scuttleback (a Gossip protocol) to manage nodes.  Uses gossip for node membership and to transmit system control state.  Node Fail state is given by variable ‘phi’ which tells how likely a node might fail (suspicion level) instead of simple binary value (up/down).  This type of system is known as Accrual Failure Detector.
  16. 16. Why Cassandra?  Gigabyte to Petabyte scalability  Linear performance gains through adding nodes  No single point of failure  Easy replication / data distribution  Multi-data center and Cloud capable  No need for separate caching layer  Tunable data consistency  Flexible schema design  Data Compression  CQL language (like SQL)  Support for key languages and platforms  No need for special hardware or software
  17. 17. Big Data Scalability  Capable of comfortably scaling to petabytes  New nodes = Linear performance increases  Add new nodes online 1 2 Double Throughput Capabilities 1 2 3 4
  18. 18. No Single Point of Failure  All nodes the same  Customized replication affords tunable data redundancy  Read/write from any node  Can replicate data among different physical data center racks
  19. 19. Easy Replication / Data Distribution  Transparently handled by Cassandra  Multi-data center capable  Exploits all the benefits of Cloud computing  Able to do hybrid Cloud/On-premise setup
  20. 20. No Need for Caching Software  Peer-to-peer architecture removes need for special caching layer and the programming that goes with it  The database cluster uses the memory from all participating nodes to cache the data assigned to each node  No irregularities between a memory cache and database are encountered Database Server Memcached Servers Application Servers Writes Reads
  21. 21. Tunable Data Consistency  Choose between strong and eventual consistency (All to any node responding) depending on the need  Can be done on a per-operation basis, and for both reads and writes  Handles Multi-data center operations 1 2 3 4 5 6  Any  One  Quorum  Local_Quorum  Each_Quorum  All Writes  One  Quorum  Local_Quorum  Each_Quorum  All Reads
  22. 22. Flexible Schema  Dynamic schema design allows for much more flexible data storage than rigid RDBMS  Handles structured, semi-structured, and unstructured data. Counters also supported  No offline/downtime for schema changes  Supports primary and secondary indexes ID Name SSN DOB Portfolio Keyspace Customer Column Family
  23. 23. Data Compression  Uses Google’s Snappy data compression algorithm  Compresses data on a per column family level  Internal tests at DataStax show up to 80%+ compression of raw data  No performance penalty (and some increases in overall performance due to less physical I/O)!
  24. 24. CQL Language  Very similar to RDBMS SQL syntax  Create objects via DDL (e.g. CREATE…)  Core DML commands supported: INSERT, UPDATE, DELETE  Query data with SELECT 1 2 3 4 5 6 SELECT * FROM USERS WHERE STATE = ‘TX’;
  25. 25. Query Closest replica Cassandra Cluster Replica A Result Replica B Replica C Digest Query Digest Response Digest Response Result Client Read repair if digests differ Read Operation * Figure taken from Avinash Lakshman and Prashant Malik (authors of the paper) slides.
  26. 26. Facebook Inbox Search • Cassandra developed to address this problem. • 50+TB of user messages data in 150 node cluster on which Cassandra is tested. • Search user index of all messages in 2 ways. – Term search : search by a key word – Interactions search : search by a user id Latency Stat Search Interactions Term Search Min 7.69 ms 7.78 ms Median 15.69 ms 18.27 ms Max 26.13 ms 44.41 ms
  27. 27. Comparison with MySQL • MySQL > 50 GB Data Writes Average : ~300 ms Reads Average : ~350 ms • Cassandra > 50 GB Data Writes Average : 0.12 ms Reads Average : 15 ms • Stats provided by Authors using facebook data.
  28. 28. Who’s Using Cassandra?