Getting Started with Cassandra: Key-Value Store Concepts

MySQL for Beginners Gary Dusbabek Rackspace April Fools!!!11

Apache Cassandra: Gary Dusbabek, Rackspace

What is Cassandra? A key-value store (with some structure). Highly scalable. Eventually consistent. Distributed. Tunable: partitioning, replication.

Where did it come from? Created at Facebook. Dynamo: distribution architecture. BigTable: data model. Open-sourced in 2008. Apache Incubator in early 2009. Graduation in March 2010.

Who uses it? Rackspace, Facebook (of course), Twitter, Digg, Reddit, IBM, and others.

What problems does it solve? Reliability at scale: no single point of failure (all nodes are identical). Simple, linear scaling. High write throughput. Large data sets.

What problems can't it solve? No flexible indices. No querying on non-primary-key values. Not good for big binary data (>64 MB) unless you chunk it. A row's contents must fit in available memory.
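The "unless you chunk" caveat can be illustrated with a small sketch. This is not a Cassandra API: `chunk_columns` and `reassemble` are hypothetical helpers, and the zero-padded column naming scheme is an assumption chosen so that lexical column order matches chunk order.

```python
def chunk_columns(blob: bytes, chunk_size: int) -> dict:
    """Split a large value into fixed-size pieces, one per column (hypothetical scheme)."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return {
        # Zero-padded names keep the chunks in order under a lexical comparator.
        f"chunk-{index:08d}": blob[offset:offset + chunk_size]
        for index, offset in enumerate(range(0, len(blob), chunk_size))
    }


def reassemble(columns: dict) -> bytes:
    """Join the chunks back together in column-name order."""
    return b"".join(columns[name] for name in sorted(columns))
```

Each chunk could then be stored as its own column (or its own row), so that no single value approaches the size limit.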

Concepts: CAP. The CAP theorem: Consistency, Availability, Partition tolerance. A distributed system can fully guarantee at most two of the three at once.

Cassandra chooses A and P, but lets you tune operations to get more C.

Concepts: Replication & Consistency. You specify the replication factor (RF). You specify a consistency level for each read/write operation: ZERO, ONE, QUORUM, ALL, or ANY.
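How replication factor and consistency level interact can be sketched with a little arithmetic. The function names below are hypothetical, and the treatment of ZERO and ANY (no replica acknowledgement guaranteed) reflects my understanding of the levels at the time of this talk, so treat it as an assumption rather than a specification.

```python
def quorum(replication_factor: int) -> int:
    """Smallest number of replicas that forms a majority."""
    return replication_factor // 2 + 1


def guaranteed_replicas(level: str, replication_factor: int) -> int:
    """Replicas guaranteed to hold (or have served) the data at a given level."""
    table = {
        "ZERO": 0,  # assumption: fire-and-forget write, nothing guaranteed
        "ANY": 0,   # assumption: a hint stored on a non-replica node may satisfy it
        "ONE": 1,
        "QUORUM": quorum(replication_factor),
        "ALL": replication_factor,
    }
    return table[level]


def read_sees_latest_write(read_level: str, write_level: str, rf: int) -> bool:
    """True when the read set and write set must share at least one replica."""
    r = guaranteed_replicas(read_level, rf)
    w = guaranteed_replicas(write_level, rf)
    return r + w > rf
```

With RF=3, QUORUM reads plus QUORUM writes overlap (2 + 2 > 3), while ONE plus ONE does not (1 + 1 is not greater than 3). That overlap is the sense in which consistency is tunable.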

Ring Topology (RF=3). Storage ring: every node gets a token, which defines its place in the storage ring and which keys it is responsible for (its ranges). [Diagram: ring of nodes with tokens a, d, g, j.]

Ring Topology (RF=2). Storage ring: every node gets a token, which defines its place in the storage ring and which keys it is responsible for (its ranges). [Diagram: ring of nodes with tokens a, d, g, j.]
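The token-to-replica lookup on these two slides can be sketched as below. It assumes the key's token is the key itself (as with an order-preserving partitioner) and that replicas are simply the next nodes clockwise on the ring; `replicas_for` is a hypothetical helper, not a Cassandra function.

```python
from bisect import bisect_left


def replicas_for(key_token: str, node_tokens: list, rf: int) -> list:
    """Return the tokens of the nodes that hold a key, primary replica first."""
    ring = sorted(node_tokens)
    # A node owns the range (previous token, its own token], so the primary
    # replica is the first node whose token is >= the key's token, wrapping
    # around to the start of the ring when the key sorts past the last token.
    start = bisect_left(ring, key_token) % len(ring)
    count = min(rf, len(ring))
    return [ring[(start + i) % len(ring)] for i in range(count)]
```

For the ring a, d, g, j, a key with token "e" lands on g, j, a with RF=3 and on g, j with RF=2.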

Ring: New Node (RF=3). A new node joins and the ranges are adjusted. [Diagram: ring of nodes with tokens a, d, g, j, plus new node m.]

Ring: New Node (RF=2). A new node joins and the ranges are adjusted. [Diagram: ring of nodes with tokens a, d, g, j, plus new node m.]
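A small sketch of what "ranges are adjusted" means for primary ranges. It assumes each node owns the range from the previous token (exclusive) up to its own token (inclusive); `primary_ranges` and `changed_nodes` are hypothetical helpers for illustration only.

```python
def primary_ranges(node_tokens: list) -> dict:
    """Map each node token to its primary range (previous_token, own_token]."""
    ring = sorted(node_tokens)
    # ring[i - 1] with i == 0 wraps around to the last token on the ring.
    return {token: (ring[i - 1], token) for i, token in enumerate(ring)}


def changed_nodes(before: dict, after: dict) -> set:
    """Nodes whose primary range is new or different after a topology change."""
    return {token for token in after if before.get(token) != after[token]}
```

Adding m to the ring a, d, g, j only affects two primary ranges: m takes (j, m], and a shrinks from (j, a] to (m, a]. The other nodes keep their primary ranges, although with RF greater than 1 the set of ranges they replicate also shifts.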

Ring Partition (RF=3). A node dies or becomes isolated from the ring. Hinted handoff. [Diagram: ring of nodes with tokens a, d, g, j, m.]
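The idea behind hinted handoff can be sketched as follows. This is a toy model under stated assumptions, not Cassandra's implementation: the `Node` class, the choice of the first live replica as hint holder, and the `deliver_hints` function are all hypothetical.

```python
class Node:
    """Toy node: a name, an up/down flag, stored data, and hints held for others."""

    def __init__(self, name: str):
        self.name = name
        self.up = True
        self.data = {}
        self.hints = []  # list of (target_name, key, value)


def write(key, value, replicas: list) -> int:
    """Write to every live replica; leave a hint for each replica that is down."""
    live = [node for node in replicas if node.up]
    for node in replicas:
        if node.up:
            node.data[key] = value
        elif live:
            # Assumption: the first live replica holds the hint.
            live[0].hints.append((node.name, key, value))
    return len(live)


def deliver_hints(holder: Node, nodes_by_name: dict) -> None:
    """Replay held hints to targets that are back up; keep the rest."""
    remaining = []
    for target_name, key, value in holder.hints:
        target = nodes_by_name[target_name]
        if target.up:
            target.data[key] = value
        else:
            remaining.append((target_name, key, value))
    holder.hints = remaining
```

The write still succeeds on the live replicas while one node is isolated, and the missed write is replayed once that node rejoins the ring.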

Data Model. A keyspace contains column families. A ColumnFamily is either Standard or Super. Two levels of indexes (key and column name).

Data Model. Column and subcolumn sorting: specify your own comparator (TimeUUID, LexicalUUID, UTF8, Long, Bytes, or create your own).
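Why the comparator matters can be shown with a tiny sketch. The `COMPARATORS` table and `sorted_columns` function are hypothetical stand-ins: they only mimic the effect of a text comparator versus a numeric one on column order, not Cassandra's actual comparator classes.

```python
# Hypothetical stand-ins for two comparator choices.
COMPARATORS = {
    "UTF8": lambda name: name,       # compare column names as text
    "Long": lambda name: int(name),  # compare column names as integers
}


def sorted_columns(column_names: list, comparator: str) -> list:
    """Return column names in the order the chosen comparator would store them."""
    return sorted(column_names, key=COMPARATORS[comparator])
```

The same names sort differently: as text, "10" comes before "2"; as numbers, it comes last. Since slices are taken over the sorted order, the comparator decides what a slice returns.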

Data Model Standard Column Family

Data Model Super Column Family
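One way to picture the two column family types is as nested maps. The sketch below uses plain dictionaries and made-up sample data (the `alice` row and its values are hypothetical); real columns also carry timestamps, which are omitted here.

```python
# Standard column family: row key -> column name -> value.
standard_cf = {
    "alice": {"city": "Austin", "email": "alice@example.com"},
}

# Super column family: row key -> super column name -> subcolumn name -> value.
super_cf = {
    "alice": {
        "home": {"city": "Austin", "zip": "78701"},
        "work": {"city": "San Antonio", "zip": "78205"},
    },
}


def get_column(cf: dict, key: str, column: str):
    """Two index levels: row key, then column name."""
    return cf[key][column]


def get_subcolumn(cf: dict, key: str, super_column: str, column: str):
    """Super column families add one more level of nesting."""
    return cf[key][super_column][column]
```

A lookup such as `get_subcolumn(super_cf, "alice", "work", "zip")` walks one level deeper than `get_column(standard_cf, "alice", "email")`.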

Inserting: Overview. Simple: put(key, col, value). Complex: put(key, [col:value, …, col:value]). Batch: multiple keys.
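The three insert shapes can be sketched against an in-memory stand-in. `ColumnFamilySketch` and its method names are hypothetical and only mirror the put(...) notation on the slide; they are not the Thrift API.

```python
from collections import defaultdict


class ColumnFamilySketch:
    """In-memory stand-in for one column family: row key -> {column: value}."""

    def __init__(self):
        self.rows = defaultdict(dict)

    def put(self, key, col, value):
        """Simple: one column in one row."""
        self.rows[key][col] = value

    def put_columns(self, key, columns: dict):
        """Complex: several columns in one row."""
        self.rows[key].update(columns)

    def batch_put(self, mutations: dict):
        """Batch: several rows, each with its own columns."""
        for key, columns in mutations.items():
            self.put_columns(key, columns)
```

For example, `cf.batch_put({"a": {"bar": 1}, "b": {"cat": 2}})` touches two row keys in one call.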

Inserting: Writes. Commit log for durability. Memtable: no disk access (no reads or seeks). SSTables are final (they become read-only) and consist of an index, a Bloom filter, and the raw data. Atomic within a ColumnFamily. Bottom line: FAST!!!
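The write path on this slide can be sketched in miniature. Everything here is a simplified assumption for illustration: the class names, the in-memory "commit log", the flush threshold, and the Bloom filter parameters are hypothetical, and a real SSTable lives on disk with a separate index.

```python
import hashlib


class BloomFilter:
    """Tiny Bloom filter: may report false positives, never false negatives."""

    def __init__(self, size: int = 1024, hashes: int = 3):
        self.size = size
        self.hashes = hashes
        self.bits = 0  # an int used as a bit set

    def _positions(self, key: str):
        for i in range(self.hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.size

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


class SSTable:
    """Immutable snapshot of a flushed memtable, sorted by row key."""

    def __init__(self, rows: dict):
        self.rows = {key: dict(rows[key]) for key in sorted(rows)}
        self.bloom = BloomFilter()
        for key in self.rows:
            self.bloom.add(key)

    def get(self, key: str):
        if not self.bloom.might_contain(key):
            return None  # skip the lookup entirely
        return self.rows.get(key)


class Store:
    def __init__(self, memtable_limit: int = 2):
        self.commit_log = []  # append-only; stands in for the on-disk log
        self.memtable = {}    # in-memory, so a write needs no read or seek
        self.sstables = []
        self.memtable_limit = memtable_limit

    def put(self, key: str, column: str, value) -> None:
        self.commit_log.append((key, column, value))  # durability first
        self.memtable.setdefault(key, {})[column] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self) -> None:
        self.sstables.append(SSTable(self.memtable))
        self.memtable = {}
```

A write is one sequential append plus one in-memory update, which is where the speed comes from; the sorted, read-only files are produced later in bulk.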

Querying: Overview. You need a key or keys. Single: key='a'. Range: key='a' through 'f'. And columns to retrieve. Slice: cols={bar through kite}. By name: key='b' cols={bar, cat, llama}. Nothing like SQL "WHERE col='faz'", but secondary indices are being worked on (see CASSANDRA-749).
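The query shapes above can be sketched over a plain dictionary of rows. The helper names are hypothetical, the bounds are treated as inclusive (an assumption based on the "through" wording), and the key-range helper assumes keys are stored in sorted order, as with an order-preserving partitioner.

```python
def get_by_name(rows: dict, key: str, names: list) -> dict:
    """By name: fetch specific columns from one row."""
    row = rows[key]
    return {name: row[name] for name in names if name in row}


def get_slice(rows: dict, key: str, start: str, finish: str) -> dict:
    """Slice: fetch a contiguous run of columns, in sorted column order."""
    row = rows[key]
    return {name: row[name] for name in sorted(row) if start <= name <= finish}


def get_key_range(rows: dict, start_key: str, end_key: str) -> dict:
    """Range: fetch every row whose key falls between two keys."""
    return {key: rows[key] for key in sorted(rows) if start_key <= key <= end_key}
```

Every one of these starts from a row key; none of them can answer "which rows have col='faz'", which is the limitation the slide points out.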

Querying: Reads. Not as fast as writes. Read repair when replicas are out of sync. New in 0.6: row cache (avoids SSTable lookup) and key cache (avoids index scan).
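Read repair can be sketched as "compare the replicas, return the newest value, and push it to the stale ones". The function below is a hypothetical simplification: each replica is a dictionary of key to (timestamp, value), and the repair happens inline rather than in the background.

```python
def read_with_repair(key: str, replicas: list):
    """Return the newest value for a key and bring stale replicas up to date."""
    versions = [replica[key] for replica in replicas if key in replica]
    if not versions:
        return None
    newest = max(versions, key=lambda version: version[0])  # highest timestamp wins
    for replica in replicas:
        if replica.get(key) != newest:
            replica[key] = newest  # repair the stale or missing copy
    return newest[1]
```

After one such read, every replica consulted holds the same version of the row.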

Client API (Low level). Fat client: maybe too low level, not well tested. Thrift (currently best supported): many language bindings, not much of a community, no streaming, fast transport. Avro: just getting started, shows promise.

Client API (High Level). Rapidly changing, getting feature-rich: connection pools, load balancing/failover. Reduces the verbosity of working with Thrift. For Java, see Hector (http://github.com/rantav/hector). Also Ruby, Python, C++, C#, Perl, PHP (http://wiki.apache.org/cassandra/ClientExamples).

Java Bits: JMX. Relatively easy to expose objects and services as MBeans. Simplifies aspects of cluster and node management. Easy monitoring. You choose the JMX-enabled system management tool (jconsole is alright).

Java Bits: available libraries. Excellent: Google Collections (Multimap, BiMap, Iterators), java.util.concurrent, NIO files (including mmap). Meh: NIO sockets.

Java Bits: Heap & GC. Cassandra tweaks the default GC settings quite a bit: -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:+CMSParallelRemarkEnabled -XX:TargetSurvivorRatio=90 -XX:SurvivorRatio=128 -XX:MaxTenuringThreshold=0 -XX:+HeapDumpOnOutOfMemoryError -XX:+AggressiveOpts

Java Bits: code management. Library versioning: no standard way, mostly declarative, not readily queryable. Must ship every dependency, or use ant/mvn. Now you have two (or more!) problems.

Java Bits: daemonization. Java doesn't make it easy (re: stdout, stderr). After setting up, System.out and System.err are close()d. Windows: don't ask.

Future Direction. Range delete (delete these cols from those keys). Vector clocks (including server-side conflict resolution). Altering keyspace/column family definitions on a live cluster. byte[] keys. Compression. Multi-tenant support. Fewer memory restrictions.

Linky. Cassandra: wiki.apache.org/cassandra, cassandra.apache.org. Google BigTable: labs.google.com/papers/bigtable.html. Amazon Dynamo: s3.amazonaws.com/AllThingsDistributed/sosp/amazon-dynamo-sosp2007.pdf. Facebook Cassandra: www.facebook.com/note.php?note_id=24413138919. Java tuning: java.sun.com/performance/reference/whitepapers/tuning.html, java.sun.com/javase/technologies/hotspot/gc/index.jsp. Me: gdusbabek@gmail.com, gdusbabek on Twitter and just about everything else.
