Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Cassandra Explained

12,903 views

Published on

Disruptive Code; September 22, 2010; Stockholm Sweden.

Published in: Technology

Cassandra Explained

  1. 1. Cassandra Explained Berlin Buzzwords June 6, 2010 Eric Evans eevans@rackspace.com @jericevans http://blog.sym-link.com
  2. 2. Outline ● Background ● Description ● API ● Examples
  3. 3. Background
  4. 4. Influential Papers ● BigTable ● Strong consistency ● Sparse map data model ● GFS, Chubby, et al ● Dynamo ● O(1) distributed hash table (DHT) ● BASE (aka eventual consistency) ● Client tunable consistency/availability
  5. 5. NoSQL ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  6. 6. NoSQL Big data ● HBase ● Hypertable ● MongoDB ● HyperGraphDB ● Riak ● Memcached ● Voldemort ● Tokyo Cabinet ● Neo4J ● Redis ● Cassandra ● CouchDB
  7. 7. Bigtable / Dynamo Bigtable Dynamo ● HBase ● Riak ● Hypertable ● Voldemort Cassandra ??
  8. 8. Dynamo-Bigtable Lovechild
  9. 9. CAP Theorem “Pick Two” ● CP ● AP ● Bigtable ● Dynamo ● Hypertable ● Voldemort ● HBase ● Cassandra
  10. 10. CAP Theorem “Pick Two” ● Consistency ● Availability ● Partition Tolerance
  11. 11. Description
  12. 12. Properties ● Symmetric ● No single point of failure ● Linearly scalable ● Ease of administration ● Flexible partitioning, replica placement ● Automated provisioning ● High availability (eventual consistency)
  13. 13. P2P Routing
  14. 14. P2P Routing
  15. 15. Partitioning ● Random ● 128bit namespace, (MD5) ● Good distribution ● Order Preserving ● Tokens determine namespace ● Natural order (lexicographical) ● Range / cover queries ● Yours ??
  16. 16. Replica Placement ● SimpleSnitch ● Default ● N-1 successive nodes ● RackInferringSnitch ● Infers DC/rack from IP ● PropertyFileSnitch ● Configured w/ a properties file
  17. 17. Bootstrap
  18. 18. Bootstrap
  19. 19. Bootstrap
  20. 20. Choosing Consistency Write Read Level Description Level Description ZERO Hail Mary ZERO N/A ANY 1 replica (HH) ANY N/A ONE 1 replica ONE 1 replica QUORUM (N / 2) +1 QUORUM (N / 2) +1 ALL All replicas ALL All replicas R+W>N
  21. 21. Quorum ((N/2) + 1)
  22. 22. Quorum ((N/2) + 1)
  23. 23. Data Model
  24. 24. Overview ● Keyspace ● Uppermost namespace ● Typically one per application ● ColumnFamily ● Associates records of a similar kind ● Record-level Atomicity ● Indexed ● Column ● Basic unit of storage
  25. 25. Sparse Table
  26. 26. Column ● name ● byte[] ● Queried against (predicates) ● Determines sort order ● value ● byte[] ● Opaque to Cassandra ● timestamp ● long ● Conflict resolution (Last Write Wins)
  27. 27. Column Comparators ● Bytes ● UTF8 ● TimeUUID ● Long ● LexicalUUID ● Composite (third-party) http://github.com/edanuff/CassandraCompositeType
  28. 28. API
  29. 29. Low / High ● Thrift ● Compact binary RPC framework ● 12 different languages ● Idiomatic ● Hector (Java) ● Pycassa (Python) ● Others... http://wiki.apache.org/cassandra/ClientOptions
  30. 30. Thrift Read Methods ● get() → Column ● get_slice() → list<Column> ● mulitget_slice() → map<key, list<Column>> ● get_count() → int ● multiget_count() → map<key, int> ● get_range_slices()
  31. 31. Thrift Write Methods ● insert() ● batch_insert() ● remove() ● batch_mutate()
  32. 32. Examples
  33. 33. Pycassa – Python Client API ● connect() → Thrift proxy ● cf = ColumnFamily(proxy, ksp, cfname) ● cf.insert() → long ● cf.get() → dict ● cf.get_range() → dict http://github.com/vomjom/pycassa
  34. 34. Address Book – Setup <!-- conf/storage-conf.xml --> <Keyspace Name=”AddressBook”> <ColumnFamily Name=”Addresses” CompareWith=”BytesType” RowsCached=”10000” KeysCached=”50%” Comment=”Too lame” /> </Keyspace>
  35. 35. Adding an entry key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'Austin', 'zip': 78250 } addresses.insert(key, columns)
  36. 36. Fetching a record # fetching the record by key record = addresses.get(key) # accessing columns by name zipcode = record['zip'] city = record['city']
  37. 37. Indexing <!-- conf/storage-conf.xml --> <Keyspace Name=”AddressBook”> <ColumnFamily Name=”Addresses” CompareWith=”BytesType” RowsCached=”10000” KeysCached=”50%” Comment=”Too lame” /> <ColumnFamily Name=”ByCity” CompareWith=”UTF8Type” /> </Keyspace>
  38. 38. Updating the index key = uuid() columns = { 'first': 'Eric', 'last': 'Evans', 'email': 'eevans@rackspace.com', 'city': 'Austin', 'zip': 78250 } addresses.insert(key, columns) byCity.insert('Austin', {key: ''})
  39. 39. Timeseries <!-- conf/storage-conf.xml --> <Keyspace Name=”Sites”> <ColumnFamily Name=”Stats” CompareWith=”LongType”/> </Keyspace>
  40. 40. Logging values # time as a long, binary, network-order ts = pack('>d', long(time() * 1e6)) stats.insert('org.apache', {ts: value})
  41. 41. Slicing begin = pack('>d', long(s * 1e6)) stats.get_range('org.apache', column_start=begin) end = pack('>d', long((s + 86400) * 1e6)) stats.get_range(start='org.apache', finish='org.debian', column_start=begin, column_finish=end)
  42. 42. Questions?

×