Cassandra

Published in: Technology

  1. 1. Cassandra <br />Jahangir Mohammed<br />md.jahangir27@gmail.com<br />
  2. 2. What is Cassandra?<br />Distributed data store<br />O(1) DHT (one-hop key lookup)<br />Column-oriented<br />Dynamo (distribution model) + Bigtable (data model)<br />
  3. 3. Why not RDBMS?<br />Many-to-many relationships -> Joins -> Denormalization -> Multiple copies of data or redundancy<br />Rigid schema<br />Vertical scaling is easier than horizontal<br />ACID, Distributed transaction, Two-phase commit<br />Slower writes<br />
  4. 4. CAP Theorem<br />Consistency – All clients read the same data at the same time.<br />Availability – The service is always up and running.<br />Partition tolerance – The system as a whole keeps operating despite network failures.<br />
  5. 5.
  6. 6. Features<br />Proven<br />Rich data model<br />Scalable<br />Distributed & Decentralized<br />Cross datacenter support<br />High performance writes/reads<br />No SPOF<br />Schema free<br />Tunable consistency<br />
  7. 7. Limitations<br />No ACID transactions (if needed)<br />Eventually consistent (tunable consistency, traded off against performance)<br />
  8. 8. ARCHITECTURE<br />Ring<br />Each node is assigned a unique token<br />Tokens range from 0 to 2**127<br />The MD5 hash of a row key determines which node owns it<br />
  9. 9. RING<br />h(key1)<br />0<br />1<br />N=3<br />B<br />h(key2)<br />A<br />C<br />F<br />E<br />D<br />1/2<br />9<br />
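The ring lookup can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's actual code: the six node names and evenly spaced tokens are hypothetical, folding the MD5 digest into the token range with a modulo is an assumption made for simplicity, and N=3 is taken from the diagram.

```python
import bisect
import hashlib

TOKEN_SPACE = 2 ** 127  # token range stated on the slide


def token_for(key: str) -> int:
    """MD5-hash the key and fold it into the token range (modulo is an assumption)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def replicas_for(key: str, ring: list[tuple[int, str]], n: int = 3) -> list[str]:
    """Walk clockwise from the key's token and take the next n nodes.

    `ring` is a list of (token, node_name) pairs sorted by token.
    """
    tokens = [t for t, _ in ring]
    # First node whose token is >= the key's token; wrap around past the end.
    start = bisect.bisect_left(tokens, token_for(key)) % len(ring)
    count = min(n, len(ring))
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical six-node ring (A..F) with evenly spaced tokens.
ring = [(i * TOKEN_SPACE // 6, name) for i, name in enumerate("ABCDEF")]
print(replicas_for("key1", ring))
```

Because the owner is found with a binary search over sorted tokens, every node that knows the ring can route a key without asking a master.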
  10. 10. ARCHITECTURE<br /> P2P: <br />All nodes are identical<br />No “master” node<br />Gossip: <br />Protocol for intra-ring communication<br />Each node has state information about other nodes<br />Anti-entropy & Read Repair: <br />Replica synchronization mechanisms<br />Anti-entropy occurs during major compaction<br />Anti-entropy uses Merkle trees to compare replicas<br />
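To show why Merkle trees suit replica comparison, here is a small Python sketch. It assumes each replica's rows fit in a dict and uses SHA-256 for hashing; both are illustrative choices, not details taken from the slides.

```python
import hashlib


def _h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def merkle_root(rows: dict[str, str]) -> bytes:
    """Hash each (key, value) pair into a leaf, then pair hashes up to a single root."""
    level = [_h(f"{k}={v}".encode("utf-8")) for k, v in sorted(rows.items())]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:  # odd count: duplicate the last hash so every node has a pair
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]


# Hypothetical replica contents: replica_b holds a stale value for k2.
replica_a = {"k1": "v1", "k2": "v2", "k3": "v3"}
replica_b = {"k1": "v1", "k2": "stale", "k3": "v3"}
print(merkle_root(replica_a) == merkle_root(replica_b))
```

Two replicas only need to exchange the root hash to learn whether they agree; a full implementation would then descend into mismatching subtrees to find the differing ranges.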
  11. 11. READ REPAIR<br />Client<br />Result<br />Query<br />Cassandra Cluster<br />Read repair if digests differ<br />Closest replica<br />Result<br />Replica A<br />Digest Query<br />Digest Response<br />Digest Response<br />Replica B<br />Replica C<br />
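The read-repair flow in the diagram can be approximated as follows. This is a simplified sketch: replicas are modelled as plain dicts mapping a key to a hypothetical `(value, timestamp)` pair, the key is assumed to exist on every replica, and MD5 is used for the digest purely for illustration.

```python
import hashlib

Cell = tuple[str, int]  # (value, timestamp)


def digest(cell: Cell) -> str:
    value, ts = cell
    return hashlib.md5(f"{value}:{ts}".encode("utf-8")).hexdigest()


def read_with_repair(replicas: list[dict[str, Cell]], key: str) -> str:
    """replicas[0] plays the closest replica; the others answer with digests only."""
    closest, others = replicas[0], replicas[1:]
    result = closest[key]                       # full read from the closest replica
    digests = [digest(r[key]) for r in others]  # digest queries to the remaining replicas
    if any(d != digest(result) for d in digests):
        # Digests differ: pick the cell with the newest timestamp and push it everywhere.
        newest = max((r[key] for r in replicas), key=lambda cell: cell[1])
        for r in replicas:
            r[key] = newest
        result = newest
    return result[0]


a = {"k": ("old", 1)}
b = {"k": ("new", 2)}
c = {"k": ("old", 1)}
print(read_with_repair([a, b, c], "k"))
```

Sending digests instead of full rows keeps the common case cheap: data is only transferred in full when the replicas actually disagree.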
  12. 12. WRITE PATH<br />Commit log: Every write is appended here first, for durability.<br />Memtable: In-memory data structure, written after the commit log.<br />SSTable: <br />Immutable on-disk table<br />Created when a memtable is flushed to disk<br />
  13. 13. WRITE PATH<br />Key (CF1, CF2, CF3)<br /><ul><li>Data size</li><li>Number of objects</li><li>Lifetime</li></ul>Memtable (CF1)<br />Commit Log<br />Binary serialized<br />Key (CF1, CF2, CF3)<br />Memtable (CF2)<br />FLUSH<br />Memtable (CF2)<br />Data file on disk<br /><Key name><Size of key data><Index of columns/supercolumns><Serialized column family><br />---<br />---<br />---<br />---<br /><Key name><Size of key data><Index of columns/supercolumns><Serialized column family><br />Dedicated Disk<br />
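The write path described above (commit log first, then memtable, then a flush to an immutable SSTable) can be sketched like this. The class name `ColumnFamilyStore`, the in-memory lists, and the flush trigger based on the number of row keys are all simplifying assumptions; a real node also flushes on data size and memtable lifetime, and keeps the log and SSTables on disk.

```python
class ColumnFamilyStore:
    """Toy model of the write path for a single column family."""

    def __init__(self, flush_threshold: int = 2):
        self.commit_log = []   # append-only record of every write
        self.memtable = {}     # row key -> {column name: value}
        self.sstables = []     # immutable, sorted snapshots of flushed memtables
        self.flush_threshold = flush_threshold

    def write(self, key: str, column: str, value: str) -> None:
        self.commit_log.append((key, column, value))       # 1. append to the commit log
        self.memtable.setdefault(key, {})[column] = value   # 2. update the memtable
        if len(self.memtable) >= self.flush_threshold:      # 3. flush when "full"
            self.flush()

    def flush(self) -> None:
        # Freeze the memtable as a sorted tuple so it can no longer be modified.
        sstable = tuple(
            (key, tuple(sorted(columns.items())))
            for key, columns in sorted(self.memtable.items())
        )
        self.sstables.append(sstable)
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs the log for recovery


store = ColumnFamilyStore(flush_threshold=2)
store.write("k1", "name", "alice")
store.write("k2", "name", "bob")   # second row key triggers a flush
print(len(store.sstables), store.memtable)
```

Writes never touch existing SSTables, so the only disk activity on the write path is a sequential append to the commit log plus the occasional sequential flush.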
  16. 16. READ PATH<br />
  17. 17. ARCHITECTURE<br />Bloom filter:<br />Performance booster<br />Fast, probabilistic data structure (false positives possible, no false negatives)<br />In memory<br />Used during read operations to skip SSTables that cannot contain the key<br />Tombstones:<br />Deletion marker<br />Soft delete<br />Markers older than a set time are garbage-collected (GC’ed)<br />
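A Bloom filter is small enough to sketch in full. The version below is illustrative only: the bit-array size, the number of hash functions, and the use of seeded MD5 hashes are assumptions chosen for readability, not Cassandra's actual parameters.

```python
import hashlib


class BloomFilter:
    def __init__(self, size_bits: int = 1024, num_hashes: int = 3):
        self.size = size_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit array

    def _positions(self, key: str):
        # Derive several bit positions by hashing the key with different seeds.
        for seed in range(self.num_hashes):
            digest = hashlib.md5(f"{seed}:{key}".encode("utf-8")).digest()
            yield int.from_bytes(digest, "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


bf = BloomFilter()
bf.add("row-42")
print(bf.might_contain("row-42"))
```

On a read, a negative answer lets the node skip an SSTable without touching disk; a positive answer may still turn out to be a false positive, so the SSTable has to be checked.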
  18. 18. HINTED HANDOFF & COMPACTION<br />Hinted Handoff:<br />The node responsible for a write is down<br />Coordinator stores a hint to replay once the node is back<br />Compaction:<br />Merges SSTables<br />Keys merged<br />Columns combined<br />Tombstones discarded<br />New index created<br />
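The compaction steps listed above can be sketched as a merge over in-memory dicts. This is a simplified model under stated assumptions: each SSTable is a dict of row key to `{column: (value, timestamp)}`, a tombstone is represented by a `None` value, and `gc_grace` is a hypothetical name for the "set time" after which tombstones may be dropped.

```python
TOMBSTONE = None  # a deleted column keeps a marker instead of a value


def compact(sstables: list[dict], now: int, gc_grace: int) -> dict:
    """Merge SSTables: newest timestamp wins per column, expired tombstones are dropped."""
    merged: dict = {}
    for table in sstables:
        for key, columns in table.items():
            row = merged.setdefault(key, {})          # keys merged
            for name, (value, ts) in columns.items():
                if name not in row or ts > row[name][1]:
                    row[name] = (value, ts)           # columns combined, newest wins

    result = {}
    for key in sorted(merged):                        # output stays sorted by key
        live = {
            name: (value, ts)
            for name, (value, ts) in merged[key].items()
            if not (value is TOMBSTONE and now - ts > gc_grace)
        }
        if live:
            result[key] = live
    return result


old = {"k1": {"a": ("1", 10), "b": ("x", 10)}, "k2": {"a": ("z", 5)}}
new = {"k1": {"a": ("2", 20), "b": (TOMBSTONE, 15)}, "k3": {"a": (TOMBSTONE, 95)}}
print(compact([old, new], now=100, gc_grace=10))
```

In the example, the tombstone on `k1.b` is older than the grace period and disappears, while the recent tombstone on `k3.a` is kept so that the deletion can still reach replicas that have not seen it.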
  19. 19. PARTITIONER<br />Decides where a row key (and its data) is placed in the ring.<br />Random Partitioner:<br />MD5 hash<br />Spreads keys evenly<br />Inefficient range queries<br />Order-Preserving Partitioner:<br />Rows sorted by key<br />
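The difference between the two partitioners comes down to what is used as the token. The sketch below is an illustration with hypothetical keys; using the raw key string as the order-preserving token is a simplification.

```python
import hashlib


def random_token(key: str) -> int:
    """Random partitioner: the token is derived from the MD5 hash of the key."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


def order_preserving_token(key: str) -> str:
    """Order-preserving partitioner: the key itself determines ring position."""
    return key


keys = ["user:001", "user:002", "user:003", "user:004"]
by_random = sorted(keys, key=random_token)
by_order = sorted(keys, key=order_preserving_token)

print(by_random)  # adjacent keys end up scattered around the ring
print(by_order)   # adjacent keys stay next to each other
```

With hashed tokens, a scan over `user:001` to `user:004` has to visit nodes all over the ring, which is why range queries are inefficient; with ordered tokens the same scan reads a contiguous slice, at the cost of uneven load when keys cluster.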
  20. 20. DATA MODEL<br />Keyspace:<br />Like a database.<br />Container for column families (CFs).<br />Column Family:<br />Like a table (but not exactly a relational database table).<br />Container of rows.<br />Row:<br />Sorted collection of columns.<br />Column:<br />Basic unit of data.<br />Triplet of name, value and timestamp.<br />
  21. 21. DATA MODEL<br />Super Column:<br />Special column.<br />Sorted associative array of columns.<br />Map of maps.<br />Only one level deep.<br />Super Column Family:<br />Container of rows having super columns.<br />4-D DHT = Standard CF:<br />[Keyspace][ColumnFamily][Key][Column].<br />5-D DHT = Super CF:<br />[Keyspace][ColumnFamily][Key][SuperColumn][SubColumn].<br />
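The "4-D" and "5-D" lookups map directly onto nested dictionaries. In this sketch every name (`Blog`, `Posts`, `Comments`, the row keys and columns) is hypothetical, and each column is stored as a `(value, timestamp)` pair keyed by its name.

```python
store = {
    "Blog": {                                   # keyspace
        "Posts": {                              # standard column family
            "post-1": {                         # row key
                "title": ("Hello", 1),          # column name -> (value, timestamp)
                "body": ("First post", 1),
            },
        },
        "Comments": {                           # super column family
            "post-1": {                         # row key
                "comment-1": {                  # super column
                    "author": ("alice", 2),     # sub-column name -> (value, timestamp)
                    "text": ("Nice!", 2),
                },
            },
        },
    },
}

# 4-D lookup: [Keyspace][ColumnFamily][Key][Column]
title, title_ts = store["Blog"]["Posts"]["post-1"]["title"]

# 5-D lookup: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
author, author_ts = store["Blog"]["Comments"]["post-1"]["comment-1"]["author"]

print(title, author)
```

The super column adds exactly one extra level of nesting, which matches the "only one level deep" restriction on the slide.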
  22. 22. REPLICATION & CONSISTENCY<br />Replication factor: Number of copies of each piece of data in the system.<br />Consistency level: Number of replicas that must respond before an operation succeeds.<br />
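The interplay between replication factor and consistency level is simple arithmetic. The helper names below are hypothetical; the rule they encode is the usual quorum condition that a read set and a write set must overlap in at least one replica.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def reads_see_latest_write(n: int, w: int, r: int) -> bool:
    """True when every read set overlaps every write set (R + W > N)."""
    return r + w > n


n = 3
print(quorum(n))                                         # majority of 3 replicas
print(reads_see_latest_write(n, quorum(n), quorum(n)))   # quorum writes + quorum reads
print(reads_see_latest_write(n, 1, 1))                   # single-replica writes + reads
```

Lower consistency levels answer faster because fewer replicas have to respond, which is the performance trade-off behind tunable consistency.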
  23. 23. WRITE<br />
  24. 24. READ<br />
  25. 25. REPLICA PLACEMENT STRATEGY <br />Simple Strategy:<br />Rack-Unaware<br />Fast<br />Single D.C.<br />
  26. 26. SIMPLE STRATEGY<br />
  27. 27. OLD NTS<br />Rack-aware<br />Same D.C.<br />
  28. 28. NTS<br />Rack-aware<br />D.C. aware<br />
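A rack-aware, datacenter-aware placement can be approximated as follows. This is a deliberately simplified sketch of the idea, not the actual strategy implementation: the ring is a hypothetical list of `(node, datacenter, rack)` tuples in token order, `start` is the index of the first node clockwise from the key's token, and `per_dc` is an assumed mapping of datacenter name to replica count.

```python
def place_replicas(ring: list[tuple[str, str, str]], start: int, per_dc: dict[str, int]) -> list[str]:
    """Pick replicas per datacenter, preferring nodes on racks not used yet."""
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]  # clockwise order
    chosen = []
    for dc, wanted in per_dc.items():
        in_dc = [node for node in walk if node[1] == dc]
        picked, racks = [], set()
        for node in in_dc:                       # first pass: one node per unseen rack
            if len(picked) < wanted and node[2] not in racks:
                picked.append(node)
                racks.add(node[2])
        for node in in_dc:                       # second pass: fill up if racks ran out
            if len(picked) < wanted and node not in picked:
                picked.append(node)
        chosen.extend(node[0] for node in picked)
    return chosen


ring = [
    ("n1", "dc1", "r1"), ("n2", "dc1", "r1"), ("n3", "dc1", "r2"),
    ("n4", "dc2", "r1"), ("n5", "dc2", "r1"), ("n6", "dc2", "r2"),
]
print(place_replicas(ring, 0, {"dc1": 2, "dc2": 1}))
```

Compared with the simple strategy, which just takes the next nodes clockwise, this skips `n2` because it shares a rack with `n1`, so a single rack failure cannot take out both replicas in `dc1`.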
  29. 29. IMAGE REFERENCES<br /> Nathan Hurst’s Blog<br />http://2.bp.blogspot.com/_YGilJHLjrrI/TJy3K0wshLI/AAAAAAAAAOI/ogAvf8Ckq3k/s1600/cassandra-ring2.png<br />SIGMOD presentation: Avinash et al., Facebook<br />Datastax<br />http://answers.oreilly.com/topic/2408-replica-placement-strategies-when-using-cassandra/<br />
  30. 30. QUESTIONS?<br />
