Building a distributed Key-Value store with Cassandra


Slides from my talk at Kiwi Pycon in 2010.

Covers why we chose Cassandra, an overview of its features and data model, and how we implemented our application.

Building a distributed Key-Value store with Cassandra Presentation Transcript

  • 1. Building a Key-Value Store with Cassandra Kiwi PyCon 2010 Aaron Morton @aaronmorton Weta Digital 1
  • 2. Why Cassandra? • Part of a larger project started earlier this year to build new systems for code running on the render farm of 35,000 cores • Larger project goals were Scalability, Reliability, Flexible Schema 2
  • 3. How about MySQL? • It works. But... • Schema changes • Write redundancy • Query language mismatch • So we went looking for the right tool for the job 3
  • 4. Redis? • Fast, flexible. But... • Single core limit • Replication, but no cluster (it's coming) • Limited support options 4
  • 5. CouchDB? • Schema free, scalable (sort of), redundant (sort of). But... • Single write thread limit • Replication, but no cluster (it's coming) • Low consistency with asynchronous replication 5
  • 6. Cassandra? • Just right, perhaps. Let's see... • Highly available • Tuneable synchronous replication • Scalable writes and reads • Schema free (sort of) • Lots of new mistakes to be made 6
  • 7. Availability • Row data is kept together and replicated around the cluster • Replication Factor is configurable • Partitioner determines the position of a row key in the distributed hash table • Replication Strategy determines where in the cluster to place the replicas 7
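The partitioner and replication strategy on slide 7 can be sketched in a few lines of Python. This is a minimal illustration, not Cassandra's implementation: the `Ring` class, the `token` helper, and the node names are hypothetical, MD5 is used only as an illustrative hash, and node positions are derived from node names, whereas a real cluster assigns tokens to nodes. The strategy shown is the simplest one: walk clockwise from the key's position and take the next N nodes.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Hash a string to a position on the ring (MD5 is an illustrative choice)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Hypothetical model of a cluster as a ring of nodes."""

    def __init__(self, nodes, replication_factor):
        if replication_factor > len(nodes):
            raise ValueError("replication factor cannot exceed node count")
        self.replication_factor = replication_factor
        # Assumption: each node's position is the hash of its name.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, row_key):
        """Return the nodes that hold replicas of row_key."""
        # Partitioner step: first node at or after the key's token, wrapping around.
        start = bisect.bisect_left(self._tokens, token(row_key)) % len(self._ring)
        # Replication strategy step: take the next N nodes clockwise.
        return [
            self._ring[(start + i) % len(self._ring)][1]
            for i in range(self.replication_factor)
        ]
```

With `Ring(["node_a", "node_b", "node_c", "node_d"], replication_factor=3)`, every call to `replicas("fred")` returns the same three distinct nodes, which is what lets any node route a request for a row without a central lookup.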
  • 8. Consistency • Each read or write request specifies a Consistency Level • Individual nodes may be inconsistent with respect to others • Reads may give consistent results while some nodes have inconsistent values • The entire cluster will eventually move to a state where there is one version of each 8
  • 9. Consistency • R+W>N • R = Read Consistency • W = Write Consistency • N = Replication Factor 9
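The R + W > N rule on slide 9 works because any set of R replicas read must then share at least one replica with any set of W replicas written. A small sketch, with hypothetical function names, states the rule and checks it by brute force over every possible read and write set:

```python
from itertools import combinations


def overlap_guaranteed(r: int, w: int, n: int) -> bool:
    """The rule from the slide: reads see the latest write when R + W > N."""
    return r + w > n


def overlap_by_enumeration(r: int, w: int, n: int) -> bool:
    """Check every pair of read set and write set for a shared replica."""
    replicas = range(n)
    return all(
        set(read_set) & set(write_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def quorum(n: int) -> int:
    """Smallest majority of n replicas."""
    return n // 2 + 1
```

For a Replication Factor of 3, `quorum(3)` is 2, and reading and writing at quorum satisfies the rule (2 + 2 > 3). Reading and writing one replica each does not (1 + 1 is not greater than 3), so such a read may return a stale value.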
  • 10. Scale • Distributed hash table • Scale throughput and capacity with more nodes, more disk, more memory • Adding or removing nodes is an online operation • Gossip based protocol for discovery 10
  • 11. Data Model • Column orientated • Denormalise • Cassandra is an index building machine • Simple explanation: a row has a key and stores an ordered hash in one or more Column Families 11
  • 12. Data Model • Keyspace • Row / Key • Column Family or Super Column Family • Column 12
  • 13. Data Model • Row Fred: User CF { email: fred@..., dob: 04/03 }, Posts SCF { post_1: { title: foo, body: bar } } • Row Bob: User CF { email: bob }, Posts SCF { post_100: { title: monkeys, body: naughty } } 13
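One way to picture the data model on slides 11 to 13 is as nested dictionaries. The sketch below assumes the slide's diagram pairs `post_1` with the row Fred and `post_100` with the row Bob; the `keyspace` variable and the `ordered_columns` helper are hypothetical names, and plain dicts stand in for Cassandra's sorted storage.

```python
# Keyspace -> column family -> row key -> columns.
keyspace = {
    # Standard column family: row key -> {column name: value}
    "User": {
        "Fred": {"email": "fred@...", "dob": "04/03"},
        "Bob": {"email": "bob"},
    },
    # Super column family: row key -> {super column name: {column name: value}}
    "Posts": {
        "Fred": {"post_1": {"title": "foo", "body": "bar"}},
        "Bob": {"post_100": {"title": "monkeys", "body": "naughty"}},
    },
}


def ordered_columns(column_family: str, row_key: str):
    """Return a row's columns sorted by name, mimicking the 'ordered hash'."""
    row = keyspace[column_family].get(row_key, {})
    return sorted(row.items())
```

`ordered_columns("User", "Fred")` returns the columns sorted by name, `dob` before `email`. The same row key ("Fred") appears in both column families, which is how one row stores data in more than one Column Family.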
  • 14. API • Thrift • Avro (beta) • Auto generated bindings for many languages • Stateful connections • Python wrappers: pycassa, Telephus (Twisted) 14
  • 15. API • Client supplied time stamp for all mutations • Client supplied Consistency Level for all mutations and reads 15
  • 16. API • insert(key, column_family, super_column, column, value) • get(key, column_family, super_column, column) • remove(key, column_family, super_column, column) 16
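Slides 15 and 16 together imply that every mutation carries a timestamp chosen by the client, and that conflicting writes are resolved by that timestamp. The sketch below is an in-memory stand-in, not the Thrift API: the `ColumnStore` class is hypothetical, the argument order differs from the slide so that `super_column` can be optional, a delete is modelled as a tombstone marker, and ties between equal timestamps simply keep the existing cell.

```python
_TOMBSTONE = object()  # marker stored in place of a value when a column is removed


class ColumnStore:
    """Hypothetical single-node store keyed by (key, CF, super column, column)."""

    def __init__(self):
        self._cells = {}  # path -> (timestamp, value)

    def _apply(self, path, value, timestamp):
        current = self._cells.get(path)
        # Last write wins: only a strictly newer timestamp replaces the cell.
        if current is None or timestamp > current[0]:
            self._cells[path] = (timestamp, value)

    def insert(self, key, column_family, column, value, timestamp, super_column=None):
        self._apply((key, column_family, super_column, column), value, timestamp)

    def remove(self, key, column_family, column, timestamp, super_column=None):
        self._apply((key, column_family, super_column, column), _TOMBSTONE, timestamp)

    def get(self, key, column_family, column, super_column=None):
        cell = self._cells.get((key, column_family, super_column, column))
        if cell is None or cell[1] is _TOMBSTONE:
            return None
        return cell[1]
```

Because the timestamp decides the winner, the order in which mutations arrive does not matter: an insert with timestamp 1 that arrives after an insert with timestamp 2 is ignored, and a remove only hides writes older than itself.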
  • 17. API • Slicing columns or super columns • list of names • start, finish, count, reversed • get_slice() to slice one row • multiget_slice() to slice multiple rows • get_range_slices() to slice rows and columns 17
  • 18. API • Slicing keys • start key, finish key, count • Partitioner affects key order • get_range_slices() to slice rows and columns 18
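The two ways of slicing columns from slide 17, by a list of names or by a start, finish, count, reversed range, can be sketched against a single row held as a dict. The function names here are hypothetical and the semantics are an assumption modelled on the slide: an empty `start` or `finish` means unbounded, both bounds are inclusive, and when `reverse` is set the row is walked from high names to low, so `start` becomes the upper bound.

```python
def slice_by_names(row: dict, names):
    """Return (name, value) pairs for the requested names that exist in the row."""
    return [(name, row[name]) for name in names if name in row]


def slice_by_range(row: dict, start="", finish="", count=100, reverse=False):
    """Return up to `count` columns between start and finish, in column order."""
    if count <= 0:
        return []
    selected = []
    for name in sorted(row, reverse=reverse):
        if reverse:
            if start and name > start:
                continue  # not yet down to the starting column
            if finish and name < finish:
                break  # walked past the finishing column
        else:
            if start and name < start:
                continue  # not yet up to the starting column
            if finish and name > finish:
                break  # walked past the finishing column
        selected.append((name, row[name]))
        if len(selected) == count:
            break
    return selected
```

The `count` limit is what makes paging possible: a caller can fetch one page, then pass the last column name it saw as the next `start`.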
  • 19. API • batch_mutate() • multiple rows and CFs • delete or insert / update • Individual mutations are atomic • Request is not atomic, no rollback 19
  • 20. Our Application Varnish Nginx Tornado Cassandra Rabbit MQ 20
  • 21. Our Application • Similar to Amazon S3. • REST API. • Databases, Buckets, Keys+Values. 21
  • 22. Our Column Families • Database (super) • Bucket (super) • Bucket Index • Object • Object Index (super) 22
  • 23. Our API http://db_name.wetafx.co.nz/bucket/key 23
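The URL layout on slide 23 puts the database name in the host and the bucket and key in the path. A request handler might split it as in the sketch below; `parse_object_url` is a hypothetical helper, and treating everything after the bucket as the key (so that keys may contain slashes) is an assumption, not something the slides state.

```python
from urllib.parse import urlsplit


def parse_object_url(url: str, domain: str = "wetafx.co.nz"):
    """Split a URL of the form http://db_name.<domain>/bucket/key."""
    parts = urlsplit(url)
    host = parts.hostname or ""
    suffix = "." + domain
    if not host.endswith(suffix):
        raise ValueError(f"host {host!r} is not a database under {domain}")
    db_name = host[: -len(suffix)]
    segments = [s for s in parts.path.split("/") if s]
    bucket = segments[0] if segments else None
    # Assumption: the rest of the path, slashes included, is the object key.
    key = "/".join(segments[1:]) or None
    return db_name, bucket, key
```

`parse_object_url("http://db_name.wetafx.co.nz/bucket/key")` returns `("db_name", "bucket", "key")`, and a URL that stops at the bucket returns `None` for the key, which is the shape the bucket-level operations need.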
  • 24. PUT Object • /bucket/object • batch_mutate() • one row in Objects CF with columns for meta and the body • one column in ObjectIndex CF row for the bucket 24
  • 25. List Objects • /bucket_name?start=foo • get_slice() • for the bucket row in ObjectIndex CF • if needed, multiget_slice() to “join” to the Object CF 25
  • 26. Delete Bucket • /bucket_name • get_slice() on ObjectIndex CF • batch_mutate() to delete Object CF and ObjectIndex CF • delete Bucket CF row 26
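The three operations on slides 24 to 26 share one pattern: the Object CF holds the data and the ObjectIndex CF holds one column per object in a row for the bucket, so listing is a slice of a single row. The sketch below models that pattern with in-memory dicts. It is not the talk's code: the `BucketStore` class, the `bucket/key` row key format, and the flattening of the ObjectIndex super column family into a plain dict are all assumptions made for illustration.

```python
class BucketStore:
    """Hypothetical in-memory model of the Bucket, Object and ObjectIndex CFs."""

    def __init__(self):
        self.bucket_cf = {}     # bucket name -> bucket columns
        self.object_cf = {}     # object row key -> {"meta": ..., "body": ...}
        self.object_index = {}  # bucket name -> {object key: object row key}

    def put_object(self, bucket, key, body, meta=None):
        """PUT /bucket/object: one Object row plus one ObjectIndex column."""
        self.bucket_cf.setdefault(bucket, {})
        row_key = f"{bucket}/{key}"  # assumed row key format
        self.object_cf[row_key] = {"meta": meta or {}, "body": body}
        self.object_index.setdefault(bucket, {})[key] = row_key

    def list_objects(self, bucket, start="", count=100, with_objects=False):
        """GET /bucket?start=foo: slice the bucket's row in ObjectIndex."""
        index_row = self.object_index.get(bucket, {})
        keys = [k for k in sorted(index_row) if k >= start][:count]
        if not with_objects:
            return keys
        # The optional "join": fetch the matching rows from the Object CF.
        return [(k, self.object_cf[index_row[k]]) for k in keys]

    def delete_bucket(self, bucket):
        """DELETE /bucket: remove indexed objects, the index row, then the bucket."""
        index_row = self.object_index.pop(bucket, {})
        for row_key in index_row.values():
            self.object_cf.pop(row_key, None)
        self.bucket_cf.pop(bucket, None)
```

The design choice this illustrates is denormalisation: every write maintains the index as well as the data, so that the read path never has to scan the Object CF to find what belongs to a bucket.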
  • 27. Thanks • http://wetafx.co.nz • http://cassandra.apache.org/ 27