ARCHITECTURE
- P2P: All nodes are identical; there is no “master” node.
- Gossip: Protocol for intra-ring communication. Each node has state information about the other nodes.
- Anti-entropy & Read Repair: Replica synchronization mechanisms. Anti-entropy occurs during major compaction and uses Merkle trees.
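To make the Merkle tree idea concrete, here is a minimal sketch of how two replicas could compare their data by exchanging only a root hash. It is a simplification: the function names, the SHA-256 hash, and the row layout are assumptions for illustration, not Cassandra's actual implementation, which builds its trees over token ranges.

```python
import hashlib

def _h(data: bytes) -> bytes:
    # Hash helper; SHA-256 is an arbitrary choice for this sketch.
    return hashlib.sha256(data).digest()

def merkle_root(rows):
    """Return the Merkle root for a list of (key, value) string pairs."""
    level = [_h(f"{k}={v}".encode()) for k, v in sorted(rows)]
    if not level:
        return _h(b"")
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last hash on odd-sized levels
            level.append(level[-1])
        level = [_h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
    return level[0]

replica_a = [("k1", "v1"), ("k2", "v2"), ("k3", "v3")]
replica_b = [("k1", "v1"), ("k2", "stale"), ("k3", "v3")]

# Equal roots mean the replicas hold the same rows; different roots mean
# at least one row differs and the subtrees need to be compared.
in_sync = merkle_root(replica_a) == merkle_root(replica_b)
```

Comparing roots first keeps the exchange small: full row data only needs to be transferred for the parts of the tree whose hashes disagree.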
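READ REPAIR
[Figure: read repair flow. The client sends a query to the Cassandra cluster; the closest replica returns the result, while the other replicas receive digest queries and return digest responses. Read repair is performed if the digests differ. Diagram labels: Replica A, Replica B, Replica C.]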
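The read repair flow can be sketched roughly as follows. This is an illustrative model only: `digest`, `read_with_repair`, the MD5 digest, and the dict-based replicas are hypothetical, and the sketch assumes the key exists on every replica and that the newest timestamp wins.

```python
import hashlib

def digest(value: str, timestamp: int) -> str:
    # A digest is a short hash of the data, cheaper to ship than the data itself.
    return hashlib.md5(f"{value}:{timestamp}".encode()).hexdigest()

def read_with_repair(replicas, key):
    """replicas: list of dicts mapping key -> (value, timestamp).
    The first entry plays the role of the closest replica."""
    closest, others = replicas[0], replicas[1:]
    result = closest[key]                       # full data from the closest replica
    digests = [digest(*r[key]) for r in others]  # digest responses from the rest
    if any(d != digest(*result) for d in digests):
        # Digests differ: pick the newest version and push it to every replica.
        newest = max((r[key] for r in replicas), key=lambda vt: vt[1])
        for r in replicas:
            r[key] = newest
        result = newest
    return result

a = {"k": ("old", 1)}
b = {"k": ("new", 2)}
c = {"k": ("old", 1)}
value = read_with_repair([a, b, c], "k")
```

After the call, all three replicas hold the newest version of the key.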
WRITE PATH
- Commit log: Responsible for all writes; every write is recorded here first.
- Memtable: In-memory data structure, written after the commit log.
- SSTable: Immutable table on disk, created when a memtable is flushed.
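The order of operations on the write path can be illustrated with a small toy model. The `Node` class, its `memtable_limit` threshold, and the in-memory lists standing in for disk files are all assumptions made for this sketch.

```python
class Node:
    def __init__(self, memtable_limit=2):
        self.commit_log = []      # append-only record of every write
        self.memtable = {}        # mutable, in-memory
        self.sstables = []        # immutable flushed tables, newest last
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        self.commit_log.append((key, value))   # 1. log the write first
        self.memtable[key] = value             # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                       # 3. flush when the memtable is full

    def flush(self):
        # A sorted tuple stands in for an immutable, key-ordered SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

node = Node(memtable_limit=2)
node.write("b", 1)
node.write("a", 2)   # reaches the limit and triggers a flush
```

Because an SSTable is never modified after the flush, later updates to the same key end up in newer SSTables and are reconciled by compaction.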
ARCHITECTURE
- Bloom filter: Performance booster. A fast, probabilistic data structure (it can report false positives but never false negatives), kept in memory and used during read operations.
- Tombstones: Deletion markers (soft delete). A marker older than a set time is garbage-collected.
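A Bloom filter can be sketched in a few lines. The class below is a generic textbook version with hypothetical names and parameters (`size`, `hashes`); it is not Cassandra's implementation.

```python
import hashlib

class BloomFilter:
    def __init__(self, size=1024, hashes=3):
        self.size, self.hashes = size, hashes
        self.bits = bytearray(size)            # one byte per bit, for simplicity

    def _positions(self, key: str):
        # Derive several bit positions from differently salted hashes of the key.
        for i in range(self.hashes):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.size

    def add(self, key: str):
        for p in self._positions(key):
            self.bits[p] = 1

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p] for p in self._positions(key))

bf = BloomFilter()
bf.add("row1")
```

On a read, a negative answer lets the node skip an SSTable without touching disk; a positive answer still requires checking the SSTable, since it may be a false positive.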
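HINTED HANDOFF & COMPACTION
- Hinted Handoff: When the node responsible for a write is down, the coordinator creates a hint so the write can be delivered once the node is back.
- Compaction: Merges SSTables. Keys are merged, columns are combined, tombstones are discarded, and a new index is created.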
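The compaction steps can be sketched as a merge over dictionaries. Everything here is a simplified assumption: the `TOMBSTONE` sentinel, the `compact` function, the `gc_grace` parameter, and the rule that the highest timestamp wins per column. Index creation is not modeled beyond emitting keys in sorted order.

```python
TOMBSTONE = object()   # sentinel marking a deleted column

def compact(sstables, now, gc_grace):
    """sstables: list of dicts mapping key -> {column: (value, timestamp)}."""
    merged = {}
    for table in sstables:
        for key, columns in table.items():
            row = merged.setdefault(key, {})          # merge keys
            for col, (value, ts) in columns.items():
                if col not in row or ts > row[col][1]:
                    row[col] = (value, ts)            # combine columns, newest wins
    result = {}
    for key in sorted(merged):
        # Discard tombstones older than the grace period; keep recent ones.
        live = {c: (v, ts) for c, (v, ts) in merged[key].items()
                if not (v is TOMBSTONE and now - ts > gc_grace)}
        if live:
            result[key] = live
    return result

old = {"r1": {"a": ("1", 1), "b": ("x", 1)}}
new = {"r1": {"a": ("2", 5), "b": (TOMBSTONE, 2)}, "r2": {"c": ("3", 3)}}
compacted = compact([old, new], now=100, gc_grace=10)
```

Here column `b` disappears entirely because its tombstone is newer than the value it shadows and older than the grace period.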
PARTITIONER
- Decides where a row key (and its data) is placed in the ring.
- Random Partitioner: Uses an MD5 hash of the key. Spreads keys evenly, but makes range queries inefficient.
- Order-Preserving Partitioner: Rows are stored sorted by key.
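A rough sketch of hash-based placement on a ring follows. The token range of 2**127, the modulo step, and the helper names `token_for` and `owner` are assumptions for illustration; the real partitioner's token arithmetic differs in detail.

```python
import hashlib
from bisect import bisect_left

RING_SIZE = 2 ** 127   # assumed token space for this sketch

def token_for(row_key: str) -> int:
    # Hash the row key with MD5 and map it into the token space.
    return int.from_bytes(hashlib.md5(row_key.encode()).digest(), "big") % RING_SIZE

def owner(row_key: str, node_tokens: dict) -> str:
    """node_tokens maps node name -> token. The owner is the first node whose
    token is >= the key's token, wrapping around to the start of the ring."""
    ring = sorted((t, n) for n, t in node_tokens.items())
    tokens = [t for t, _ in ring]
    i = bisect_left(tokens, token_for(row_key)) % len(ring)
    return ring[i][1]

nodes = {"n1": 0, "n2": RING_SIZE // 3, "n3": 2 * RING_SIZE // 3}
placed_on = owner("user:42", nodes)
```

Because the hash scatters adjacent keys across the ring, a scan over a key range has to contact many nodes, which is why range queries are inefficient with this scheme.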
DATA MODEL
- Keyspace: Like a database. Container for column families (CFs).
- Column Family: Like a table (but not exactly a relational database table). Container of rows.
- Row: Sorted collection of columns.
- Column: Basic unit of the data structure. A triplet of name, value, and timestamp.
DATA MODEL
- Super Column: Special column. A sorted associative array of columns (a map of maps). Only one level deep.
- Super Column Family: Container of rows that have super columns.
- 4-D DHT = Standard CF: [Keyspace][ColumnFamily][Key][Column]
- 5-D DHT = Super CF: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
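The 4-D and 5-D lookups can be pictured as nested maps. The keyspace, column family, row, and column names below are invented for illustration, and plain Python dicts do not reproduce the sorted ordering of columns.

```python
# Each column is stored as a (value, timestamp) pair; its name is the dict key.
store = {
    "AppKeyspace": {
        "Users": {                              # standard column family
            "alice": {"email": ("alice@example.com", 1001)},
        },
        "Addresses": {                          # super column family
            "alice": {"home": {"city": ("Springfield", 1002)}},
        },
    }
}

def get_column(keyspace, cf, key, column):
    # 4-D lookup: [Keyspace][ColumnFamily][Key][Column]
    return store[keyspace][cf][key][column][0]

def get_subcolumn(keyspace, cf, key, super_column, sub_column):
    # 5-D lookup: [Keyspace][ColumnFamily][Key][SuperColumn][SubColumn]
    return store[keyspace][cf][key][super_column][sub_column][0]

email = get_column("AppKeyspace", "Users", "alice", "email")
city = get_subcolumn("AppKeyspace", "Addresses", "alice", "home", "city")
```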
REPLICATION & CONSISTENCY
- Replication: Number of copies of the data kept in the system.
- Consistency level: Number of replicas that must respond before an operation is considered successful.
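The interplay between the replication factor and the consistency level is often summarized by the rule that a read is guaranteed to see the latest write when the read and write replica counts together exceed the replication factor. A small sketch, with hypothetical helper names:

```python
def quorum(replication_factor: int) -> int:
    # A quorum is a strict majority of the replicas.
    return replication_factor // 2 + 1

def reads_see_latest_write(replication_factor: int, write_replicas: int,
                           read_replicas: int) -> bool:
    # If W + R > N, every read set overlaps every write set in at least one replica.
    return write_replicas + read_replicas > replication_factor

rf = 3
q = quorum(rf)                                         # majority of 3 replicas
strong = reads_see_latest_write(rf, q, q)              # quorum writes + quorum reads
weak = reads_see_latest_write(rf, 1, 1)                # one replica each way
```

Lower consistency levels respond faster and tolerate more failed nodes, at the cost of possibly returning stale data until the replicas are synchronized.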