The Cassandra Distributed Database

             Eric Evans
        eevans@rackspace.com
             @jericevans


      ...
A prophetess in Troy during the Trojan War. Her predictions were
always true, but never believed.
A massively scalable, decentralized, structured data store (aka
database).
Outline



1 Project History


2 Description


3 Case Studies


4 Roadmap
• 7 new committers added
• Dozens of contributors
• 100+ people on IRC
• Hundreds of closed issues (bugs, features, etc)
•...
Outline



1 Project History


2 Description


3 Case Studies


4 Roadmap
Cassandra is...




• O(1) DHT
• Eventual consistency
• Tunable trade-offs, consistency vs. latency
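The "O(1) DHT" bullet is commonly read as one-hop routing: every node knows the whole token ring, so the owner of a key can be computed locally instead of being discovered hop by hop. A minimal sketch of that idea follows. The `Ring` class, the node names, and the MD5-based token assignment are illustrative assumptions, not Cassandra's actual partitioner code.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns the key range ending at its token."""

    def __init__(self, nodes):
        # Assumption for illustration: a node's token is the MD5 of its name.
        self._ring = sorted((self._token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    @staticmethod
    def _token(value):
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, key, replication_factor=1):
        """Return the nodes responsible for key, walking the ring clockwise."""
        # First node whose token is >= the key's token; wraps around at the end.
        start = bisect.bisect_left(self._tokens, self._token(key))
        count = min(replication_factor, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas("user:42", replication_factor=3)
```

Because the lookup is a local binary search over a sorted token list, any node can answer "who owns this key?" without asking another node.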
But...




• Values are structured, indexed
• Columns / column families
• Slicing w/ predicates (queries)
Column families
Supercolumn families
Querying



• get(): retrieve by column name
• multiget(): by column name for a set of keys
• get_slice(): by column name,...
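The difference between fetching a single column and slicing a row can be sketched with an in-memory stand-in. The functions below only mirror the names on the slide; their signatures and the sample row are hypothetical and are not the Thrift API.

```python
from bisect import bisect_left, bisect_right

# One row of a column family: column name -> value (made-up sample data).
row = {"2010-01-03": "c", "2010-01-01": "a", "2010-01-02": "b", "2010-01-04": "d"}


def get(row, name):
    """Retrieve a single column by name."""
    return row[name]


def get_slice(row, names=None, start=None, finish=None, count=100):
    """Return (name, value) pairs by explicit names or by a range of names."""
    if names is not None:
        # Slice by a list of column names; names that are absent are skipped.
        return [(n, row[n]) for n in names if n in row]
    # Slice by range: columns are kept in sorted order, bounds are inclusive.
    ordered = sorted(row)
    lo = 0 if start is None else bisect_left(ordered, start)
    hi = len(ordered) if finish is None else bisect_right(ordered, finish)
    return [(n, row[n]) for n in ordered[lo:hi][:count]]


middle = get_slice(row, start="2010-01-02", finish="2010-01-03")
```

Range slices work here only because the column names are kept sorted, which is why the choice of column comparator matters for queries.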
Column comparators



• TimeUUID
• LexicalUUID
• UTF8
• Long
• Bytes
• ...
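A comparator decides the order in which columns are stored within a row, and therefore what a range of names means. The sketch below sorts the same column names under three orderings. It assumes, for illustration only, that a Long comparator treats names as 8-byte big-endian signed integers and that a Bytes comparator compares raw bytes; the helper `pack_long` is hypothetical.

```python
import struct


def pack_long(n):
    """Encode an integer as an 8-byte big-endian signed value."""
    return struct.pack(">q", n)


def unpack_long(raw):
    return struct.unpack(">q", raw)[0]


names = [pack_long(n) for n in (10, -1, 9, 100)]

# Bytes-style ordering: raw byte comparison, so -1 (all 0xff bytes) sorts last.
by_bytes = [unpack_long(b) for b in sorted(names)]

# Long-style ordering: compare the decoded integer values.
by_long = [unpack_long(b) for b in sorted(names, key=unpack_long)]

# UTF8-style ordering: the same numbers written as text sort lexically.
by_utf8 = sorted(["10", "9", "100"])
```

The three results differ, so a range slice over the same stored names returns different columns depending on the comparator chosen for the column family.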
Updating




• insert(): add/update column (by key)
• batch_insert(): add/update multiple columns (by key)
• remove(): rem...
Consistency



CAP Theorem: choose any two of Consistency, Availability, or
Partition tolerance.
  • Zero
  • One
  • Quor...
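The deck defines the Quorum level as (N / 2) + 1, where N is the replication factor. A small sketch of that arithmetic, and of the usual rule that a read is guaranteed to overlap the latest write when R + W > N, is given below. The function names and the level-to-count mapping are illustrative assumptions, not Cassandra's API.

```python
def quorum(replication_factor):
    """Majority of replicas: (N / 2) + 1 using integer division."""
    return replication_factor // 2 + 1


def replicas_required(level, replication_factor):
    """Number of replicas that must respond for a given consistency level."""
    required = {
        "ZERO": 0,
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return required[level]


def read_sees_latest_write(write_level, read_level, replication_factor):
    """True when the read set and write set must share at least one replica."""
    w = replicas_required(write_level, replication_factor)
    r = replicas_required(read_level, replication_factor)
    return r + w > replication_factor


strong = read_sees_latest_write("QUORUM", "QUORUM", 3)
weak = read_sees_latest_write("ONE", "ONE", 3)
```

With N = 3, quorum reads and quorum writes each touch 2 replicas and must overlap, while ONE/ONE trades that guarantee for lower latency.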
Client API


• Thrift (12 different languages!)
• Ruby
    • http://github.com/fauna/cassandra/tree/master
    • http://git...
Performance vs MySQL w/ 50GB




• MySQL
   • 300ms write
   • 350ms read

• Cassandra
    • 0.12ms write
    • 15ms read
Writes
About writes...



• No reads
• No seeks
• Sequential disk access
• Atomic within a column family
• Fast
• Any node
• Alwa...
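"No reads, no seeks, sequential disk access" describes a log-structured write path: a write is appended to a log and applied to an in-memory table, which is later flushed as a sorted, immutable file. The sketch below is a heavily simplified model under that assumption. The class name, the `flush_threshold` parameter, and the use of Python lists in place of files are all hypothetical.

```python
class ColumnFamilyStore:
    """Toy write path: append to a log, update memory, flush in sorted order."""

    def __init__(self, flush_threshold=3):
        self.commit_log = []  # stands in for an append-only file on disk
        self.memtable = {}    # in-memory view of recent writes
        self.sstables = []    # immutable, sorted snapshots of flushed memtables
        self.flush_threshold = flush_threshold

    def insert(self, key, column, value, timestamp):
        # Appending never requires reading or seeking to existing data.
        self.commit_log.append((key, column, value, timestamp))
        self.memtable[(key, column)] = (value, timestamp)
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        # One sequential write of the whole memtable, sorted by (key, column).
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}


store = ColumnFamilyStore(flush_threshold=3)
store.insert("row2", "name", "bob", 1)
store.insert("row1", "name", "alice", 2)
store.insert("row1", "email", "alice@example.com", 3)
```

After the third insert the memtable is flushed, leaving one sorted snapshot and an empty memtable; at no point did a write have to read existing data first.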
Reads
About reads...




• Any node
• Read repair
• Usual caching conventions apply
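Read repair can be sketched as: read from several replicas, return the newest version, and write that version back to any replica that was stale or missing it. The example below models replicas as plain dictionaries mapping a key to a (value, timestamp) pair and resolves conflicts by newest timestamp. The function name and data layout are illustrative assumptions.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and bring stale replicas up to date."""
    found = [replica[key] for replica in replicas if key in replica]
    if not found:
        return None
    # The version with the highest timestamp wins.
    newest = max(found, key=lambda version: version[1])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[0]


replica_a = {"k": ("old", 1)}
replica_b = {"k": ("new", 2)}
replica_c = {}
value = read_with_repair([replica_a, replica_b, replica_c], "k")
```

After the call every replica holds the newest version, which is how reads help replicas converge under eventual consistency.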
Outline



1 Project History


2 Description


3 Case Studies


4 Roadmap
Case 1: Digg




Digg is a social news site that allows people to discover and share
content from anywhere on the Internet...
Digg
Problem




• Terabytes of data; high transaction rate (reads dominated)
• Multiple clusters; heavily sharded
• Management...
Solution




• Currently in production on "Green Badges"
• Cassandra as primary data store RSN
• Datacenter and rack-aware re...
Case 2: Twitter




Twitter is a social networking and microblogging service that
enables its users to send and read tweet...
Twitter
MySQL




• Terabytes of data, ~1,000,000 ops/s
• Calls for heavy sharding, light replication
• Schema changes are very di...
Case 3: Facebook




Facebook is a social networking site where users can create a
profile, add friends, and send them mess...
Inbox Search




• 100 TB
• 160 nodes
• 1/2 billion writes per day (2yr old number?)
Case 4: Mahalo




Mahalo.com is a web directory and knowledge exchange. It
differentiates itself by tracking and building ...
MySQL




• Partial deployment; 16 million video records (and growing)
• Writes (and storage) rapidly exceeding single box...
Outline



1 Project History


2 Description


3 Case Studies


4 Roadmap
0.6


• batch mutate command
• authentication (basic)
• new consistency level, ANY
• fat client
• mmapped i/o reads (defau...
0.7


• more efficient compactions (row sizes bigger than memory)
• easier (dynamic?) column family changes
• SSTable versio...
THE END
Apache Cassandra is a highly scalable second-generation distributed database, bringing together Dynamo's fully distributed design and Bigtable's ColumnFamily-based data model.

This presentation, given at FOSDEM in 2010, provides a brief summary of Cassandra's history, a high-level overview of the architecture and data model, and showcases some real-life use cases.

Transcript of "The Cassandra Distributed Database"

  1. The Cassandra Distributed Database Eric Evans eevans@rackspace.com @jericevans FOSDEM February 7, 2010
  2. A prophetess in Troy during the Trojan War. Her predictions were always true, but never believed.
  3. A massively scalable, decentralized, structured data store (aka database).
  4. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  5. • 7 new committers added • Dozens of contributors • 100+ people on IRC • Hundreds of closed issues (bugs, features, etc.) • 3 major releases, 2 point releases • Graduation to TLP?
  6. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  7. Cassandra is... • O(1) DHT • Eventual consistency • Tunable trade-offs, consistency vs. latency
  8. But... • Values are structured, indexed • Columns / column families • Slicing w/ predicates (queries)
  9. Column families
  10. Supercolumn families
  11. Querying • get(): retrieve by column name • multiget(): by column name for a set of keys • get_slice(): by column name, or a range of names • returning columns • returning super columns • multiget_slice(): a subset of columns for a set of keys • get_count: number of columns or sub-columns • get_range_slice(): subset of columns for a range of keys
  12. Column comparators • TimeUUID • LexicalUUID • UTF8 • Long • Bytes • ...
  13. Updating • insert(): add/update column (by key) • batch_insert(): add/update multiple columns (by key) • remove(): remove a column • batch_mutate(): like batch_insert() but can also delete (new for 0.6, deprecates batch_insert()) • Remove key range RSN
  14. Consistency CAP Theorem: choose any two of Consistency, Availability, or Partition tolerance. • Zero • One • Quorum ((N / 2) + 1) • All
  15. Client API • Thrift (12 different languages!) • Ruby • http://github.com/fauna/cassandra/tree/master • http://github.com/NZKoz/cassandra_object/tree/master • Python • http://github.com/digg/lazyboy/tree/master • http://github.com/driftx/Telephus/tree/master (Twisted) • Scala • http://github.com/viktorklang/Cassidy/tree/master • http://github.com/nodeta/scalandra/tree/master
  16. Performance vs MySQL w/ 50GB • MySQL • 300ms write • 350ms read • Cassandra • 0.12ms write • 15ms read
  17. Writes
  18. About writes... • No reads • No seeks • Sequential disk access • Atomic within a column family • Fast • Any node • Always writeable (hinted hand-off)
  19. Reads
  20. About reads... • Any node • Read repair • Usual caching conventions apply
  21. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  22. Case 1: Digg Digg is a social news site that allows people to discover and share content from anywhere on the Internet by submitting stories and links, and voting and commenting on submitted stories and links. Ranked 98th by Alexa.com.
  23. Digg
  24. Problem • Terabytes of data; high transaction rate (reads dominated) • Multiple clusters; heavily sharded • Management nightmare (high effort, error prone) • Unsatisfied availability requirements (geographic isolation)
  25. Solution • Currently in production on "Green Badges" • Cassandra as primary data store RSN • Datacenter and rack-aware replication
  26. Case 2: Twitter Twitter is a social networking and microblogging service that enables its users to send and read tweets, text-based posts of up to 140 characters. Ranked 12th by Alexa.com.
  27. Twitter
  28. MySQL • Terabytes of data, ~1,000,000 ops/s • Calls for heavy sharding, light replication • Schema changes are very difficult (if possible at all) • Manual sharding is very high effort • Automated sharding and replication is Hard
  29. Case 3: Facebook Facebook is a social networking site where users can create a profile, add friends, and send them messages. Users can also join groups organized by location or other points of common interest. Ranked #2 by Alexa.com.
  30. Inbox Search • 100 TB • 160 nodes • 1/2 billion writes per day (2yr old number?)
  31. Case 4: Mahalo Mahalo.com is a web directory and knowledge exchange. It differentiates itself by tracking and building hand-crafted result sets for many of the popular search terms. (It also means "thank you" in Hawaiian.)
  32. MySQL • Partial deployment; 16 million video records (and growing) • Writes (and storage) rapidly exceeding single box limitations • Manageability suffering (clustering is painful) • Concerns over availability
  33. Outline 1 Project History 2 Description 3 Case Studies 4 Roadmap
  34. 0.6 • batch_mutate command • authentication (basic) • new consistency level, ANY • fat client • mmapped i/o reads (default on 64-bit JVM) • improved write concurrency (HH) • networking optimizations • row caching • improved management tools • per-keyspace replication factor
  35. 0.7 • more efficient compactions (row sizes bigger than memory) • easier (dynamic?) column family changes • SSTable versioning • SSTable compression • support for column family truncation • improved configuration handling • remove key range command • even more improved management tools • vector clocks w/ server-side conflict resolution
  36. THE END