Building a distributed Key-Value store with Cassandra

Slides from my talk at Kiwi PyCon in 2010.

Covers why we chose Cassandra, an overview of its features and data model, and how we implemented our application.

Transcript

  • 1. Building a Key-Value Store with Cassandra Kiwi PyCon 2010 Aaron Morton @aaronmorton Weta Digital 1
  • 2. Why Cassandra? • Part of a larger project started earlier this year to build new systems for code running on the render farm of 35,000 cores • Larger project goals were Scalability, Reliability, Flexible Schema 2
  • 3. How about MySQL ? • It works. But... • Schema changes • Write redundancy • Query language mismatch • So went looking for the right tool for the job 3
  • 4. Redis ? • Fast, flexible. But... • Single core limit • Replication, but no cluster (itʼs coming) • Limited support options 4
  • 5. CouchDB ? • Schema free, scalable (sort of), redundant (sort of). But... • Single write thread limit • Replication, but no cluster (itʼs coming) • Low consistency with asynchronous replication 5
  • 6. Cassandra ? • Just right, perhaps. Letʼs see... • Highly available • Tuneable synchronous replication • Scalable writes and reads • Schema free (sort of) • Lots of new mistakes to be made 6
  • 7. Availability • Row data is kept together and replicated around the cluster • Replication Factor is configurable • Partitioner determines the position of a row key in the distributed hash table • Replication Strategy determines where in the cluster to place the replicas 7
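The partitioner/replication-strategy split on this slide can be sketched with a toy ring. Assumptions: a 32-bit MD5 ring and a SimpleStrategy-style "next rf nodes clockwise" placement, for illustration only; Cassandra's real RandomPartitioner uses a far larger token space and production strategies are rack and datacenter aware.

```python
import hashlib
from bisect import bisect_right

def token(key: str) -> int:
    """Toy partitioner: hash a row key onto a 0..2**32 ring."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16) % 2**32

def replicas(key, node_tokens, rf):
    """Toy replication strategy: the first node at or after the key's
    token owns the row; the next rf-1 nodes on the ring hold replicas."""
    ring = sorted(node_tokens)                       # (token, node) pairs
    i = bisect_right([t for t, _ in ring], token(key)) % len(ring)
    return [ring[(i + n) % len(ring)][1] for n in range(rf)]
```

With four nodes at evenly spaced tokens and a Replication Factor of 3, any key maps to three distinct, consecutive nodes on the ring.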
  • 8. Consistency • Each read or write request specifies a Consistency Level • Individual nodes may be inconsistent with respect to others • Reads may give consistent results while some nodes have inconsistent values • The entire cluster will eventually move to a state where there is one version of each value 8
  • 9. Consistency • R+W>N • R = Read Consistency • W = Write Consistency • N = Replication Factor 9
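The rule on this slide can be checked directly: when R + W > N, every read quorum overlaps every write quorum, so at least one replica in any read has seen the latest acknowledged write (`is_strongly_consistent` is a hypothetical helper name, not part of any Cassandra API).

```python
def is_strongly_consistent(r: int, w: int, n: int) -> bool:
    """R + W > N guarantees read and write quorums overlap in at
    least one node, so reads cannot miss the latest write."""
    return r + w > n

# QUORUM reads + QUORUM writes at Replication Factor 3: 2 + 2 > 3
assert is_strongly_consistent(2, 2, 3)
# ONE + ONE at Replication Factor 3 can return stale data
assert not is_strongly_consistent(1, 1, 3)
```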
  • 10. Scale • Distributed hash table • Scale throughput and capacity with more nodes, more disk, more memory • Adding or removing nodes is an online operation • Gossip based protocol for discovery 10
  • 11. Data Model • Column oriented • Denormalise • Cassandra is an index-building machine • Simple explanation: a row has a key and stores an ordered hash in one or more Column Families 11
  • 12. Data Model • Keyspace • Row / Key • Column Family or Super Column Family • Column 12
  • 13. Data Model • User CF: Fred → { email: fred@..., dob: 04/03 }, Bob → { email: bob } • Posts SCF: Fred → { post_1: { title: foo, body: bar } }, Bob → { post_100: { title: monkeys, body: naughty } } 13
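The example rows above can be written as plain nested dicts — a minimal mental model only (assumption: real columns are sorted by name and carry timestamps, both omitted here; the `fred@...` address is truncated as on the slide).

```python
# Keyspace -> Column Family -> row key -> columns.
# "Posts" is a Super Column Family: one extra level of nesting.
keyspace = {
    "User": {                                   # standard CF
        "Fred": {"email": "fred@...", "dob": "04/03"},
        "Bob":  {"email": "bob"},
    },
    "Posts": {                                  # super CF
        "Fred": {"post_1":   {"title": "foo", "body": "bar"}},
        "Bob":  {"post_100": {"title": "monkeys", "body": "naughty"}},
    },
}

# Reading one column walks the same path a request does:
assert keyspace["User"]["Fred"]["dob"] == "04/03"
```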
  • 14. API • Thrift • Avro (beta) • Auto generated bindings for many languages • Stateful connections • Python wrappers pycassa, Telephus (twisted) 14
  • 15. API • Client supplied time stamp for all mutations • Client supplied Consistency Level for all mutations and reads 15
  • 16. API • insert (key, column_family, super_column, column, value) • get(key, column_family, super_column, column) • remove(key, column_family, super_column, column) 16
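These three calls can be mimicked against a toy in-memory store. Assumption: `ToyStore` is illustrative only; the real Thrift API also takes a keyspace, wraps the path arguments in ColumnPath/ColumnParent structs, and requires a timestamp and Consistency Level.

```python
class ToyStore:
    """In-memory sketch of the insert/get/remove calls on slide 16."""
    def __init__(self):
        # (key, column_family, super_column, column) -> value
        self.data = {}

    def insert(self, key, column_family, super_column, column, value):
        self.data[(key, column_family, super_column, column)] = value

    def get(self, key, column_family, super_column, column):
        return self.data[(key, column_family, super_column, column)]

    def remove(self, key, column_family, super_column, column):
        self.data.pop((key, column_family, super_column, column), None)
```

For a standard Column Family the `super_column` argument is simply None.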
  • 17. API • Slicing columns or super columns • list of names • start, finish, count, reversed • get_slice() to slice one row • multiget_slice() to slice multiple rows • get_range_slices() to slice rows and columns 17
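The start/finish/count/reversed semantics above can be sketched over one row's ordered columns. Assumption: `get_slice` here is a simplified stand-in mirroring the Thrift SliceRange fields, where an empty start or finish means unbounded.

```python
def get_slice(row, start="", finish="", count=100, reversed=False):
    """Slice one row's columns by name, in sorted (or reversed) order."""
    names = sorted(row)
    if reversed:
        names = names[::-1]
        # walking backwards, start is the upper bound, finish the lower
        ok = lambda n: (not start or n <= start) and (not finish or n >= finish)
    else:
        ok = lambda n: (not start or n >= start) and (not finish or n <= finish)
    return [(n, row[n]) for n in names if ok(n)][:count]
```

`multiget_slice()` is then just this applied to several rows, and `get_range_slices()` applies it across a range of keys as well.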
  • 18. API • Slicing keys • start key, finish key, count • Partitioner affects key order • get_range_slices() to slice rows and columns 18
  • 19. API • batch_mutate() • multiple rows and CFʼs • delete or insert / update • Individual mutations are atomic • Request is not atomic, no rollback 19
  • 20. Our Application Varnish Nginx Tornado Cassandra Rabbit MQ 20
  • 21. Our Application • Similar to Amazon S3. • REST API. • Databases, Buckets, Keys+Values. 21
  • 22. Our Column Families • Database (super) • Bucket (super) • Bucket Index • Object • Object Index (super) 22
  • 23. Our API http:// db_name.wetafx.co.nz/bucket/key 23
  • 24. PUT Object • /bucket/object • batch_mutate() • one row in Objects CF with columns for meta and the body • one column in ObjectIndex CF row for the bucket 24
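The two-row write above, and the atomicity caveat from slide 19, can be sketched like this. Assumptions: a simplified payload shape against a toy store laid out as {CF: {row: {column: value}}}; real Thrift mutations carry timestamps and may be deletions, and the column names here are hypothetical placeholders.

```python
def batch_mutate(store, mutations):
    """Apply mutations shaped {row_key: {cf_name: [(column, value), ...]}}.
    Each row's change is atomic, but the request as a whole is not:
    there is no rollback if a later row fails."""
    for key, cfs in mutations.items():
        for cf, cols in cfs.items():
            store.setdefault(cf, {}).setdefault(key, {}).update(dict(cols))

# One PUT touches two rows: the object's row in the Objects CF,
# and one column appended to the bucket's row in the ObjectIndex CF.
mutations = {
    "mybucket/report.txt": {"Objects": [("meta", "{...}"), ("body", "...data...")]},
    "mybucket":            {"ObjectIndex": [("mybucket/report.txt", "")]},
}
```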
  • 25. List Objects • /bucket_name?start=foo • get_slice() • for the bucket row in ObjectIndex CF • if needed, multiget_slice() to “join” to the Object CF 25
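The index-then-join pattern above can be sketched against the same toy store shape, {CF: {row: {column: value}}}. Assumption: `list_objects` and the store layout are illustrative, not the talk's actual code.

```python
def list_objects(store, bucket, start="", count=100):
    """Slice the bucket's row in the ObjectIndex CF for object names,
    then 'join' each name to its full row in the Object CF."""
    index_row = store["ObjectIndex"].get(bucket, {})
    names = sorted(n for n in index_row if n >= start)[:count]
    # the multiget_slice step: fetch each object's row by key
    return {n: store["Object"].get(n) for n in names}
```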
  • 26. Delete Bucket • /bucket_name • get_slice() on ObjectIndex CF • batch_mutate() to delete Object CF and ObjectIndex CF • delete Bucket CF row 26
  • 27. Thanks • http://wetafx.co.nz • http://cassandra.apache.org/ 27