Outside The Box With Apache Cassandra

                 Eric Evans
            eevans@rackspace.com
                 @jericevans
     Palmetto Open Source Software Conference
               April 16, 2010
Cassandra is...




A massively scalable, decentralized, structured data store (aka
database).
Outline


1 Background


2 Project History


3 Description


4 Case Studies


5 Roadmap
The Digital Universe
Consolidation
Old Guard
Vertical Scaling Sucks
CAP Theorem (aka Brewer’s Theorem)




Distributed systems cannot provide all three of:
  • Consistency
  • Availability
  • Partition Tolerance
Influential Papers



Dynamo: Amazon’s Highly Available Key-value Store   1

  • Voldemort
  • Riak
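Bigtable: A Distributed Storage System for Structured Data   2

  • Hypertable
  • HBase

1 http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html
2 http://labs.google.com/papers/bigtable-osdi06.pdf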
Outline


1 Background


2 Project History


3 Description


4 Case Studies


5 Roadmap
• 7 new committers added
• Dozens of contributors
• 200+ (!) people on IRC
• Hundreds of closed issues (bugs, features, etc)
• 4 major releases; a number of stable point releases
• Graduation to TLP
Outline


1 Background


2 Project History


3 Description


4 Case Studies


5 Roadmap
Cassandra is...




• O(1) DHT
• Eventual consistency
• Tunable trade-offs, consistency vs. availability
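The "O(1) DHT" bullet is usually read as "one routing hop": if every node knows the whole token ring, it can work out locally which nodes hold a key. The sketch below illustrates that idea only. It assumes MD5-derived tokens and a simple "next N nodes clockwise" placement; the `Ring` class, the `token_for` helper, and the node names are hypothetical and are not taken from Cassandra's code.

```python
import bisect
import hashlib


def token_for(key: bytes) -> int:
    """Map a key to a ring position (assumption: MD5 used as an example hash)."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


class Ring:
    """Hypothetical token ring in which each node owns a single token."""

    def __init__(self, nodes):
        # Keep (token, node) pairs sorted so a lookup is a local binary search.
        self._entries = sorted((token_for(n.encode()), n) for n in nodes)
        self._tokens = [t for t, _ in self._entries]

    def replicas(self, key: bytes, n: int = 3):
        """Return up to n distinct nodes, walking clockwise from the key's token."""
        start = bisect.bisect_left(self._tokens, token_for(key))
        count = min(n, len(self._entries))
        size = len(self._entries)
        return [self._entries[(start + i) % size][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
owners = ring.replicas(b"user:42", n=3)
print(owners)  # three distinct node names; no network round trip was needed
```

Any node holding the same ring description computes the same answer, which is why a client can talk to any node.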
But...




• Values are structured, indexed
• Columns / column families
• Slicing w/ predicates (queries)
Column families
Supercolumn families
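One way to picture column families and supercolumn families is as nested maps whose keys stay sorted. The sketch below uses plain Python dictionaries with made-up rows; the names `users`, `timelines`, and `column_names` are illustrative assumptions, not part of any Cassandra API.

```python
# Hypothetical picture of the model as nested maps (illustrative data only).
# Column family:       row key -> column name -> (value, timestamp)
users = {
    "alice": {"email": ("alice@example.com", 1), "name": ("Alice", 1)},
    "bob": {"name": ("Bob", 2)},  # rows need not share the same columns
}

# Super column family: row key -> super column -> column name -> (value, timestamp)
timelines = {
    "alice": {
        "2010-04-16": {"post-1": ("hello", 3), "post-2": ("world", 4)},
    },
}


def column_names(row):
    """Return column names in sorted order, the order a slice would walk them."""
    return sorted(row)


print(column_names(users["alice"]))                    # ['email', 'name']
print(column_names(timelines["alice"]["2010-04-16"]))  # ['post-1', 'post-2']
```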
Client API


• Thrift (12 different languages!)   3
• High-level client libraries
    • Ruby
    • Perl
    • Python (Twisted too)
    • Scala
    • Java
    • PHP
    • Grails
    • C++

3 http://incubator.apache.org/thrift
Querying



• get(): retrieve by column name
• multiget(): by column name for a set of keys
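• get_slice(): by column name, or a range of names
    • returning columns
    • returning super columns
• multiget_slice(): a subset of columns for a set of keys
• get_count(): number of columns or sub-columns
• get_range_slice(): subset of columns for a range of keys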
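To make the difference between the read calls concrete, here is a toy in-memory stand-in for a single column family. The function names echo the slide, but the signatures are simplified assumptions and do not match the real Thrift interface (which also takes a keyspace, column parent, and consistency level).

```python
# Hypothetical in-memory stand-in for one column family:
# row key -> {column name: value}. Signatures are simplified, not the Thrift API.
ROWS = {
    "alice": {"a": 1, "b": 2, "c": 3, "d": 4},
    "bob": {"b": 20, "e": 50},
}


def get(key, column):
    """Retrieve a single column by name."""
    return ROWS[key][column]


def get_slice(key, names=None, start="", finish="", count=100):
    """Return columns by explicit names, or by an inclusive range of names."""
    row = ROWS.get(key, {})
    if names is not None:
        return [(n, row[n]) for n in names if n in row]
    selected = [
        (n, row[n])
        for n in sorted(row)
        if n >= start and (finish == "" or n <= finish)
    ]
    return selected[:count]


def multiget_slice(keys, **predicate):
    """Apply the same slice predicate to several row keys."""
    return {k: get_slice(k, **predicate) for k in keys}


def get_count(key):
    """Number of columns in a row."""
    return len(ROWS.get(key, {}))


print(get_slice("alice", start="b", finish="c"))  # [('b', 2), ('c', 3)]
print(multiget_slice(["alice", "bob"], names=["b"]))
```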
Updating




• insert(): add/update column (by key)
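• batch_insert(): add/update multiple columns (by key)
• remove(): remove a column
• batch_mutate(): like batch_insert() but can also delete (new for
  0.6, deprecates batch_insert())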
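The update calls can be sketched the same way. The toy store below assumes every column carries a client-supplied timestamp and that the highest timestamp wins. The mutation tuples and function signatures are hypothetical simplifications, and the sketch simply drops deleted columns rather than recording a deletion marker as a real store would.

```python
# Hypothetical store: row key -> {column name: (value, timestamp)}.
store = {}


def insert(key, column, value, timestamp):
    """Add or overwrite a column; the write with the highest timestamp wins."""
    row = store.setdefault(key, {})
    current = row.get(column)
    if current is None or timestamp >= current[1]:
        row[column] = (value, timestamp)


def remove(key, column, timestamp):
    """Delete a column unless a newer write already exists."""
    row = store.get(key, {})
    current = row.get(column)
    if current is not None and timestamp >= current[1]:
        del row[column]


def batch_mutate(key, mutations, timestamp):
    """Apply a mixed list of ('insert', name, value) and ('delete', name) mutations."""
    for mutation in mutations:
        if mutation[0] == "insert":
            insert(key, mutation[1], mutation[2], timestamp)
        elif mutation[0] == "delete":
            remove(key, mutation[1], timestamp)
        else:
            raise ValueError(f"unknown mutation: {mutation[0]}")


insert("alice", "email", "old@example.com", timestamp=1)
batch_mutate(
    "alice",
    [("insert", "email", "new@example.com"), ("insert", "name", "Alice")],
    timestamp=2,
)
batch_mutate("alice", [("delete", "name")], timestamp=3)
insert("alice", "email", "stale@example.com", timestamp=1)  # older write is ignored
print(store)  # {'alice': {'email': ('new@example.com', 2)}}
```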
Column comparators



• TimeUUID
• LexicalUUID
• UTF8
• Long
• Bytes
• ...
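The comparator decides how column names are ordered within a row, and therefore what a range slice returns. The sketch below mimics four of the comparator names with Python sort keys to show that the same raw bytes can sort differently; the `COMPARATORS` table and the `make_time_uuid` helper are assumptions made for illustration, not Cassandra's implementation.

```python
import struct
import uuid


def make_time_uuid(ts: int) -> bytes:
    """Build version-1 UUID bytes carrying the 60-bit timestamp ts (demo helper)."""
    fields = (
        ts & 0xFFFFFFFF,                    # time_low
        (ts >> 32) & 0xFFFF,                # time_mid
        ((ts >> 48) & 0x0FFF) | 0x1000,     # time_hi plus version 1
        0x80, 0, 0,                         # variant bits, clock sequence, node
    )
    return uuid.UUID(fields=fields).bytes


# Sort keys standing in for the comparators named on the slide.
COMPARATORS = {
    "Bytes": lambda name: name,
    "UTF8": lambda name: name.decode("utf-8"),
    "Long": lambda name: struct.unpack(">q", name)[0],
    "TimeUUID": lambda name: uuid.UUID(bytes=name).time,
}


def sort_columns(names, comparator):
    """Order raw column names the way the chosen comparator would."""
    return sorted(names, key=COMPARATORS[comparator])


longs = [struct.pack(">q", n) for n in (10, -1, 2)]
by_bytes = [struct.unpack(">q", b)[0] for b in sort_columns(longs, "Bytes")]
by_long = [struct.unpack(">q", b)[0] for b in sort_columns(longs, "Long")]
print(by_bytes)  # [2, 10, -1]
print(by_long)   # [-1, 2, 10]

early = make_time_uuid(0xFFFFFFFF)
late = make_time_uuid(0x100000000)
print(sort_columns([late, early], "TimeUUID") == [early, late])  # True
print(sort_columns([early, late], "Bytes") == [late, early])     # True
```

The last two lines show why a time-based comparator matters: in a version-1 UUID the low bits of the timestamp come first, so plain byte order does not follow time order.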
Consistency



CAP Theorem: choose any two of Consistency, Availability, or
Partition tolerance.
  • Zero
  • One
  • Quorum ((N / 2) + 1)
  • All
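The quorum formula and the usual Dynamo-style overlap rule (a read is guaranteed to see the latest acknowledged write when R + W > N) can be checked with a few lines of arithmetic. The function names and the `LEVELS` table below are illustrative assumptions; Zero is modelled here as waiting for no acknowledgements.

```python
def quorum(n: int) -> int:
    """Replicas that must respond for a quorum: (N / 2) + 1 with integer division."""
    return n // 2 + 1


def overlaps(n: int, r: int, w: int) -> bool:
    """True when every read set must intersect every write set (R + W > N)."""
    return r + w > n


# Responses required at each level for replication factor n.
LEVELS = {
    "ZERO": lambda n: 0,
    "ONE": lambda n: 1,
    "QUORUM": quorum,
    "ALL": lambda n: n,
}

n = 3
for name, required in LEVELS.items():
    print(name, required(n))  # ZERO 0, ONE 1, QUORUM 2, ALL 3

print(overlaps(n, r=quorum(n), w=quorum(n)))  # True: quorum reads see quorum writes
print(overlaps(n, r=1, w=1))                  # False: ONE/ONE may return stale data
```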
About writes...




• Atomic within a column family
• Any node
• Always writeable (hinted hand-off)
• Fast
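"Always writeable (hinted hand-off)" can be pictured as follows: when a replica is unreachable, a live node keeps a note of the write and replays it once the replica returns. This is a deliberately small sketch under that assumption; the `Node` class, `write`, and `deliver_hints` are hypothetical names, and the real mechanism has more moving parts.

```python
class Node:
    """Hypothetical replica with local data and a list of hints for other nodes."""

    def __init__(self, name):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # (target node name, key, value) for unreachable replicas


def write(replicas, key, value):
    """Write to every live replica; leave a hint on a live one for each dead replica."""
    live = [n for n in replicas if n.up]
    if not live:
        raise RuntimeError("no replica reachable")
    for node in live:
        node.data[key] = value
    for node in replicas:
        if not node.up:
            live[0].hints.append((node.name, key, value))
    return len(live)  # acknowledgements available to satisfy the consistency level


def deliver_hints(holder, recovered):
    """Replay hints held for a node once it is reachable again."""
    remaining = []
    for target, key, value in holder.hints:
        if target == recovered.name and recovered.up:
            recovered.data[key] = value
        else:
            remaining.append((target, key, value))
    holder.hints = remaining


a, b, c = Node("a"), Node("b"), Node("c")
c.up = False
acks = write([a, b, c], "k", "v")  # succeeds with two acknowledgements
c.up = True
deliver_hints(a, c)
print(acks, c.data)  # 2 {'k': 'v'}
```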
Writes
About reads...




• Any node
• Read repair
• Key cache
• Record cache
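Read repair can be sketched in a few lines: compare the versions returned by the replicas, answer with the newest one, and push it to any replica that was behind. The code assumes each stored value carries a timestamp; `read_with_repair` and the replica dictionaries are hypothetical names used only for illustration.

```python
def read_with_repair(replicas, key):
    """Return the newest value for key and bring stale replicas up to date.

    replicas: list of dicts mapping key -> (value, timestamp).
    """
    versions = [r[key] for r in replicas if key in r]
    if not versions:
        return None
    newest = max(versions, key=lambda v: v[1])
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair: overwrite the missing or older copy
    return newest[0]


r1 = {"k": ("new", 2)}
r2 = {"k": ("old", 1)}
r3 = {}
print(read_with_repair([r1, r2, r3], "k"))  # new
print(r2, r3)  # {'k': ('new', 2)} {'k': ('new', 2)}
```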
Reads
Outline


1 Background


2 Project History


3 Description


4 Case Studies


5 Roadmap
Case 1: Digg




Digg is a social news site that allows people to discover and share
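content from anywhere on the Internet by submitting stories and
links, and voting and commenting on submitted stories and links.
Ranked 98th by Alexa.com.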
Digg
Problem




• Terabytes of data; high transaction rate (reads dominated)
• Multiple clusters; heavily sharded
• Management nightmare (high effort, error prone)
• Unsatisfied availability requirements (geographic isolation)
Solution




• Currently in production on "Green Badges"
• Cassandra as primary data store RSN
• Datacenter and rack-aware replication
Case 2: Twitter




Twitter is a social networking and microblogging service that
enables its users to send and read tweets, text-based posts of up to
140 characters. Ranked 12th by Alexa.com.
Twitter
MySQL




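• Terabytes of data, ~1,000,000 ops/s
• Calls for heavy sharding, light replication
• Schema changes are very difficult (if possible at all)
• Manual sharding is very high effort
• Automated sharding and replication is Hard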
Case 3: Facebook




Facebook is a social networking site where users can create a
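profile, add friends, and send them messages. Users can also join
groups organized by location or other points of common interest.
Ranked #2 by Alexa.com.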
Inbox Search




• 100 TB
• 160 nodes
• 1/2 billion writes per day (2yr old number?)
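For a sense of scale, the three figures above can be turned into rough averages. This is back-of-the-envelope arithmetic only: it assumes an even spread across nodes and across the day, and it ignores replication, which the slide does not describe.

```python
total_tb = 100
nodes = 160
writes_per_day = 500_000_000  # "1/2 billion writes per day"

tb_per_node = total_tb / nodes
writes_per_second = writes_per_day / (24 * 60 * 60)
writes_per_node_per_second = writes_per_second / nodes

print(tb_per_node)                           # 0.625
print(round(writes_per_second))              # 5787
print(round(writes_per_node_per_second, 1))  # 36.2
```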
Outline


1 Background


2 Project History


3 Description


4 Case Studies


5 Roadmap
0.6


• batch mutate command
• authentication (basic)
• new consistency level, ANY
• fat client
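• mmapped i/o reads (default on 64bit jvm)
• improved write concurrency (HH)
• networking optimizations
• row caching
• improved management tools
• per-keyspace replication factor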
0.7


• more efficient compactions (row sizes bigger than memory)
• easier (dynamic?) column family changes
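• SSTable versioning
• SSTable compression
• support for column family truncation
• improved configuration handling
• remove key range command
• even more improved management tools
• vector clocks w/ server-side conflict resolution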
Questions?
Cassandra presentation given at the 3rd annual Palmetto Open Source Software Conference (POSSCON 2010).


  1. Outside The Box With Apache Cassandra Eric Evans eevans@rackspace.com @jericevans Palmetto Open Source Software Conference April 16, 2010
  2. Cassandra is... A massively scalable, decentralized, structured data store (aka database).
  3. Outline 1 Background 2 Project History 3 Description 4 Case Studies 5 Roadmap
  4. The Digital Universe
  5. Consolidation
  6. Old Guard
  7. Vertical Scaling Sucks
  8. CAP Theorem (aka Brewer’s Theorem) Distributed systems cannot provide all three of: • Consistency • Availability • Partition Tolerance
  9. Influential Papers Dynamo: Amazon’s Highly Available Key-value Store 1 • Voldemort • Riak Bigtable: A Distributed Storage System for Structured Data 2 • Hypertable • HBase 1 http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html 2 http://labs.google.com/papers/bigtable-osdi06.pdf
  10. Outline 1 Background 2 Project History 3 Description 4 Case Studies 5 Roadmap
  11. • 7 new committers added • Dozens of contributors • 200+ (!) people on IRC • Hundreds of closed issues (bugs, features, etc) • 4 major releases; a number of stable point releases • Graduation to TLP
  12. Outline 1 Background 2 Project History 3 Description 4 Case Studies 5 Roadmap
  13. Cassandra is... • O(1) DHT • Eventual consistency • Tunable trade-offs, consistency vs. availability
  14. But... • Values are structured, indexed • Columns / column families • Slicing w/ predicates (queries)
  15. Column families
  16. Supercolumn families
  17. Client API • Thrift (12 different languages!) 3 • High-level client libraries • Ruby • Perl • Python (Twisted too) • Scala • Java • PHP • Grails • C++ 3 http://incubator.apache.org/thrift
  18. Querying • get(): retrieve by column name • multiget(): by column name for a set of keys • get_slice(): by column name, or a range of names • returning columns • returning super columns • multiget_slice(): a subset of columns for a set of keys • get_count(): number of columns or sub-columns • get_range_slice(): subset of columns for a range of keys
  19. Updating • insert(): add/update column (by key) • batch_insert(): add/update multiple columns (by key) • remove(): remove a column • batch_mutate(): like batch_insert() but can also delete (new for 0.6, deprecates batch_insert())
  20. Column comparators • TimeUUID • LexicalUUID • UTF8 • Long • Bytes • ...
  21. Consistency CAP Theorem: choose any two of Consistency, Availability, or Partition tolerance. • Zero • One • Quorum ((N / 2) + 1) • All
  22. About writes... • Atomic within a column family • Any node • Always writeable (hinted hand-off) • Fast
  23. Writes
  24. About reads... • Any node • Read repair • Key cache • Record cache
  25. Reads
  26. Outline 1 Background 2 Project History 3 Description 4 Case Studies 5 Roadmap
  27. Case 1: Digg Digg is a social news site that allows people to discover and share content from anywhere on the Internet by submitting stories and links, and voting and commenting on submitted stories and links. Ranked 98th by Alexa.com.
  28. Digg
  29. Problem • Terabytes of data; high transaction rate (reads dominated) • Multiple clusters; heavily sharded • Management nightmare (high effort, error prone) • Unsatisfied availability requirements (geographic isolation)
  30. Solution • Currently in production on "Green Badges" • Cassandra as primary data store RSN • Datacenter and rack-aware replication
  31. Case 2: Twitter Twitter is a social networking and microblogging service that enables its users to send and read tweets, text-based posts of up to 140 characters. Ranked 12th by Alexa.com.
  32. Twitter
  33. MySQL • Terabytes of data, ~1,000,000 ops/s • Calls for heavy sharding, light replication • Schema changes are very difficult (if possible at all) • Manual sharding is very high effort • Automated sharding and replication is Hard
  34. Case 3: Facebook Facebook is a social networking site where users can create a profile, add friends, and send them messages. Users can also join groups organized by location or other points of common interest. Ranked #2 by Alexa.com.
  35. Inbox Search • 100 TB • 160 nodes • 1/2 billion writes per day (2yr old number?)
  36. Outline 1 Background 2 Project History 3 Description 4 Case Studies 5 Roadmap
  37. 0.6 • batch mutate command • authentication (basic) • new consistency level, ANY • fat client • mmapped i/o reads (default on 64bit jvm) • improved write concurrency (HH) • networking optimizations • row caching • improved management tools • per-keyspace replication factor
  38. 0.7 • more efficient compactions (row sizes bigger than memory) • easier (dynamic?) column family changes • SSTable versioning • SSTable compression • support for column family truncation • improved configuration handling • remove key range command • even more improved management tools • vector clocks w/ server-side conflict resolution
  39. Questions?