Cassandra is a NoSQL database designed for scalability and high availability without compromising performance. It distributes data across nodes and datacenters for redundancy and fault tolerance. Cassandra uses a decentralized architecture and dynamic replication to balance requests and allow incremental growth without reconfiguring the system. It provides tunable consistency options and supports column-oriented data storage for flexibility and efficient queries.
5. “NoSQL”
• Performance
• Reliability
• Scaling
• B&D
Tuesday, April 20, 2010
6. Performance
Each Cassandra node manages its storage locally. It is not limited by obsolete systems and is not
slowed by layering on top of a distributed file system (DFS).
9. Durable
• Write to commitlog
• fsync is cheap since it’s append-only
• Write to memtable
• [amortized] flush memtable to sstable
Cassandra is one of the few NoSQL systems that is suitable for use when data loss is
unacceptable.
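The write path on the "Durable" slide can be illustrated with a small sketch. This is a toy model, not Cassandra's code: the class `TinyStore`, the `flush_threshold` parameter, the file names, and the tab-separated record format are all hypothetical and exist only to show the order of the steps (append to the commit log and fsync, update the in-memory memtable, occasionally flush the memtable to a sorted file).

```python
import os
import tempfile


class TinyStore:
    """Toy write path: commit log append, memtable update, sorted flush."""

    def __init__(self, directory, flush_threshold=4):
        self.directory = directory
        self.flush_threshold = flush_threshold  # hypothetical knob, not a Cassandra setting
        self.memtable = {}   # in-memory view of recent writes
        self.sstables = []   # paths of flushed, sorted files
        # Opened in append mode: every write lands at the end of the file.
        self.log = open(os.path.join(directory, "commitlog"), "ab")

    def write(self, key, value):
        # 1. Append the record to the commit log and force it to disk.
        self.log.write(f"{key}\t{value}\n".encode("utf-8"))
        self.log.flush()
        os.fsync(self.log.fileno())
        # 2. Apply the write to the memtable.
        self.memtable[key] = value
        # 3. Flush only once enough writes have accumulated (amortized cost).
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # Write the memtable out in key order, then start a fresh memtable.
        path = os.path.join(self.directory, f"sstable-{len(self.sstables)}")
        with open(path, "w", encoding="utf-8") as f:
            for key in sorted(self.memtable):
                f.write(f"{key}\t{self.memtable[key]}\n")
            f.flush()
            os.fsync(f.fileno())
        self.sstables.append(path)
        self.memtable = {}

    def close(self):
        self.log.close()


if __name__ == "__main__":
    store = TinyStore(tempfile.mkdtemp(), flush_threshold=2)
    store.write("b", "2")
    store.write("a", "1")  # second write reaches the threshold and triggers a flush
    store.close()
```

The sketch leaves out everything a real system needs around this path, such as replaying the commit log after a crash and discarding log segments once their data has been flushed.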
10. SSTable format, briefly
<key 127>
<key 255>
...

<row data 0>
<row data 1>
...
<row data 127>
...
<row data 255>
...
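One way to read the `<key 127>`, `<key 255>` entries is as a sparse index that samples one key per block of 128 sorted rows. The sketch below works under that assumption and keeps everything in memory. The names `INDEX_INTERVAL`, `build_sparse_index`, and `lookup` are hypothetical, and the real on-disk format is more involved than this.

```python
from bisect import bisect_left

INDEX_INTERVAL = 128  # assumed sampling interval, inferred from <key 127>, <key 255>


def build_sparse_index(rows, interval=INDEX_INTERVAL):
    """Sample the last key of each full block: positions 127, 255, ...

    `rows` is a list of (key, data) pairs already sorted by key.
    """
    return [rows[i][0] for i in range(interval - 1, len(rows), interval)]


def lookup(rows, sampled_keys, key, interval=INDEX_INTERVAL):
    """Find `key` by binary-searching the samples, then scanning one block."""
    # Each sampled key is the largest key in its block, so the first sample
    # that is >= key identifies the only block that can contain it.
    block = bisect_left(sampled_keys, key)
    start = block * interval
    for k, data in rows[start:start + interval]:
        if k == key:
            return data
        if k > key:
            break  # rows are sorted, so the key is absent
    return None


if __name__ == "__main__":
    rows = [(i, f"row data {i}") for i in range(300)]
    index = build_sparse_index(rows)
    print(index)
    print(lookup(rows, index, 200))
```

The design point is that the index holds one entry per 128 rows rather than one per row, so it stays small enough to keep in memory, at the cost of scanning at most one block per lookup.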
26. Bondage & Discipline
• Twitter: “Fifteen months ago, it took two
weeks to perform ALTER TABLE on the
statuses [tweets] table.”