Big Data Grows Up
A (re)introduction to Cassandra
Robbie Strickland
Who am I?
Robbie Strickland
Software Development Manager
The Weather Channel
rostrickland@gmail.com
@dont_use_twitter
Who am I?
● Cassandra user/contributor since 2010
● … it was at release 0.5 back then
● 4 years? Oracle DBAs aren’t impressed
● Done lots of dumb stuff with Cassandra
● … and some really awesome stuff too
Cassandra in 2010
Cassandra in 2010
Cassandra in 2014
Why Cassandra?
It’s fast:
● No locks
● Tunable consistency
● Sequential R/W
● Decentralized
Why Cassandra?
It scales (linearly):
● Multi data center
● No SPOF
● DHT
● Hadoop integration
Why Cassandra?
It’s fault tolerant:
● Automatic replication
● Masterless
● Failed nodes replaced with ease
What’s different?
… a lot in the last year (ish)
What’s new?
● Virtual nodes
● O(n) data moved off-heap
● CQL3 (and defining schemas)
● Native protocol/driver
● Collections
● Lightweight transactions
● Compaction throttling that actually works
What’s gone?
● Manual token management
● Supercolumns
● Thrift (if you use the native driver)
● Directly managing storage rows
What’s still the same?
● Still not an RDBMS
● Still no joins (see above)
● Still no ad-hoc queries (see above again)
● Still requires a denormalized data model (^^)
● Still need to know what the heck you’re doing
Token Management
Linear scalability without the migraine
The old way
● 1 token per node
● Assigned manually
● Adding nodes == reassignment of all tokens
● Node rebuild heavily taxes a few nodes

[Diagram: ring of six nodes A–F, each owning one contiguous token range; cluster with no vnodes]
… enter Vnodes
[Diagram: ring of many small token ranges A–N interleaved around the cluster; cluster with vnodes]
● n tokens per node
● Assigned magically
● Adding nodes == painless
● Node rebuild distributed across many nodes
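Vnodes are switched on per node in cassandra.yaml. A minimal sketch (the value shown was the common default when vnodes shipped; tune to taste):

```yaml
# cassandra.yaml: give each node many small token ranges instead of one.
# Leave initial_token unset so tokens are assigned automatically.
num_tokens: 256
```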
Node rebuild without Vnodes
Node rebuild with Vnodes
Going Off-heap
because the JVM sometimes sucks
Why go off-heap
● GC overhead
● JVM no good with big heap sizes
● GC overhead
● GC overhead
● GC overhead
O(n) data structures
● Row cache
● Bloom filters
● Compression offsets
● Partition summary

… all these are moved off-heap
New memory allocation
[Diagram: the partition key cache stays on the JVM heap; the row cache, bloom filters, compression offsets, and partition summary move to native (off-heap) memory]
Death of a (Thrift)
Salesman
Or, how to build a killer data store
without a crappy interface
Reasons not to ditch Thrift
● Lots of client libraries still use it
● You finally got it installed
● You didn’t know there was another choice
● It sucks less than many alternatives
… in spite of all those benefits, you really should ditch Thrift because:
● It requires your entire result set to fit into RAM on both client and server
● The native protocol is better, faster, and supports all the new features
● Thrift-based client libraries are always a step behind
● It’s going away eventually
… and did I mention ...
It requires your entire result set to fit into RAM on both client and server!!!
Requesting too much data
Going Native
really catchy tag line here
Native protocol
● It’s binary, making it lighter weight
● It supports cursors (FTW!)
● It supports prepared statements
● Cluster awareness built-in
● Either synchronous or asynchronous ops
● Only supports CQL-based operations
● Can be used side-by-side with Thrift
Native drivers
from DataStax:
Java
C#
Python
… other community supported drivers available
Native query example
val cluster = Cluster.builder().addContactPoints(host1, host2, host3).build()
val session = cluster.connect()

val insert =
  session.prepare("INSERT INTO myKsp.myTable (myKey, col1, col2) VALUES (?,?,?)")
val select = session.prepare("SELECT * FROM myKsp.myTable WHERE myKey = ?")

session.execute(insert.bind(myKey, col1, col2))
val result = session.execute(select.bind(myKey))
Wait, was that SQL?!!
Or, how to make Cassandra more awesome
while simultaneously irritating early adopters
Introducing CQL3
● Because the first two attempts sucked
● Stands for “Cassandra Query Language”
● Looks a heck of a lot like SQL
● … but isn’t
● Substantially lowers the learning curve
● … but also makes it easier to screw up
● An abstraction over the storage rows
Storage rows
[default@unknown] create keyspace Library;
[default@unknown] use Library;
[default@Library] create column family Books
...     with comparator=UTF8Type
...     and key_validation_class=UTF8Type
...     and default_validation_class=UTF8Type;
[default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';
[default@Library] set Books['Patriot Games']['year'] = '1987';
[default@Library] list Books;
RowKey: Patriot Games
=> (name=author, value=Tom Clancy, timestamp=1393102991499000)
=> (name=year, value=1987, timestamp=1393103015955000)
Storage rows - composites
[default@Library] create column family Authors
...     with key_validation_class=UTF8Type
...     and comparator='CompositeType(LongType,UTF8Type,UTF8Type)'
...     and default_validation_class=UTF8Type;
[default@Library] set Authors['Tom Clancy']['1987:Patriot Games:publisher'] = 'Putnam';
[default@Library] set Authors['Tom Clancy']['1987:Patriot Games:ISBN'] = '0-399-13241-4';
[default@Library] set Authors['Tom Clancy']['1993:Without Remorse:publisher'] = 'Putnam';
[default@Library] set Authors['Tom Clancy']['1993:Without Remorse:ISBN'] = '0-399-13825-0';
[default@Library] list Authors;
RowKey: Tom Clancy
=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)
=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)
=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)
=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)
CQL - simple intro
cqlsh> CREATE KEYSPACE Library WITH REPLICATION = {'class':'SimpleStrategy', 'replication_factor':1};
cqlsh> use Library;
cqlsh:library> CREATE TABLE Books (
           ...     title varchar,
           ...     author varchar,
           ...     year int,
           ...     PRIMARY KEY (title)
           ... );
cqlsh:library> INSERT INTO Books (title, author, year) VALUES ('Patriot Games', 'Tom Clancy', 1987);
cqlsh:library> INSERT INTO Books (title, author, year) VALUES ('Without Remorse', 'Tom Clancy', 1993);
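Reading the data back is a familiar-looking SELECT (a sketch; as later slides explain, the WHERE clause must name the primary key):

```sql
cqlsh:library> SELECT * FROM Books WHERE title = 'Patriot Games';
```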
CQL - simple intro

Storage rows:
CQL - composite key
CREATE TABLE Authors (
name varchar,
year int,
title varchar,
publisher varchar,
ISBN varchar,
PRIMARY KEY (name, year, title)
)
CQL - composite key

Storage rows:
Keys and Filters
● Ad hoc queries are NOT supported
● Query by key
● Key must include all potential filter columns
● Must include partition key in filter
● Subsequent filters must be in order
● Only last filter can be a range
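Applied to the Authors table defined earlier, with PRIMARY KEY (name, year, title), the rules work out like this (illustrative queries, not from the deck):

```sql
-- OK: partition key only
SELECT * FROM Authors WHERE name = 'Tom Clancy';

-- OK: partition key, then clustering columns in declared order
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND year = 1987;

-- OK: the last filter may be a range
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND year >= 1987 AND year < 1994;

-- Rejected: no partition key
SELECT * FROM Authors WHERE year = 1987;

-- Rejected: skips the 'year' clustering column to filter on 'title'
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND title = 'Patriot Games';
```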
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (title)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author, title)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author, year)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (year, author)
)
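Each key choice above supports a different set of queries. A sketch of what each one buys you (queries are illustrative):

```sql
-- PRIMARY KEY (title): point lookup by title only
SELECT * FROM Books WHERE title = 'Patriot Games';

-- PRIMARY KEY (author, title): all of an author's books, or one by title
SELECT * FROM Books WHERE author = 'Tom Clancy';
SELECT * FROM Books WHERE author = 'Tom Clancy' AND title = 'Patriot Games';

-- PRIMARY KEY (author, year): an author's books within a year range
SELECT * FROM Books WHERE author = 'Tom Clancy' AND year >= 1987;

-- PRIMARY KEY (year, author): a year's books, optionally narrowed by author
SELECT * FROM Books WHERE year = 1993 AND author = 'Tom Clancy';
```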
Secondary Indexes
● Allows query-by-value
● CREATE INDEX myIdx ON myTable (myCol)
● Works well on low cardinality fields
● Won’t scale for high cardinality fields
● Don’t overuse it -- not a quick fix for a bad data model
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author)
)
CREATE INDEX Books_year ON Books(year)
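With the index in place, query-by-value on year becomes legal even though year is not part of the key:

```sql
SELECT * FROM Books WHERE year = 1987;
```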
Composite Partition Keys
● PRIMARY KEY((year, author), title)
● Creates a more granular shard key
● Can be useful to make certain queries more efficient, or to better distribute data
● Updates sharing a partition key are atomic and isolated
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY ((year, author), title)
)
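With a composite partition key, both components must appear in the filter; title, as a clustering column, remains optional (illustrative queries):

```sql
-- OK: the full partition key is supplied
SELECT * FROM Books WHERE year = 1987 AND author = 'Tom Clancy';

-- Rejected: only half of the partition key
SELECT * FROM Books WHERE year = 1987;
```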
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (year, author, title)
)
Collections
denormalization done well
Supported types
● Sets - ordered naturally
● Lists - ordered by index
● Maps - key/value pairs
Caveats
● Max 64k items in a collection
● Max 64k size per item
● Collections are read in their entirety, so keep them small
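A sketch of the three collection types in use (the table and column names here are illustrative, not from the deck):

```sql
CREATE TABLE AuthorProfiles (
  name varchar PRIMARY KEY,
  genres set<varchar>,           -- ordered naturally
  titles list<varchar>,          -- ordered by index
  isbns map<varchar, varchar>    -- title -> ISBN
);

UPDATE AuthorProfiles SET genres = genres + {'thriller'} WHERE name = 'Tom Clancy';
UPDATE AuthorProfiles SET titles = titles + ['Patriot Games'] WHERE name = 'Tom Clancy';
UPDATE AuthorProfiles SET isbns['Patriot Games'] = '0-399-13241-4' WHERE name = 'Tom Clancy';
```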
Sets
[Diagram: storage row for a set; the column name encodes the set name and the item value]
Lists
[Diagram: storage row for a list; the column name encodes the list name plus ordering metadata, and the column value holds the list item]
Maps
[Diagram: storage row for a map; the column name encodes the map name and the key, and the column value holds the value]
TRON
(tracing on)
Using tracing
● In cqlsh, “tracing on”
● … enjoy!
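In practice (sketch):

```sql
cqlsh> TRACING ON;
cqlsh> SELECT * FROM Library.Books WHERE title = 'Patriot Games';
-- cqlsh now prints each internal step with its elapsed time,
-- including any tombstones scanned along the way
```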
Example
[Screenshot: query trace output; timestamp 1393126200000]
Antipattern
CREATE TABLE WorkQueue (
name varchar,
time bigint,
workItem varchar,
PRIMARY KEY (name, time)
)
… do a bunch of inserts ...
SELECT * FROM WorkQueue WHERE name='ToDo' ORDER BY time ASC;
DELETE FROM WorkQueue WHERE name='ToDo' AND time=[some_time]
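Queues remain an antipattern in Cassandra, but if you must build one, a commonly suggested mitigation (a sketch, not from the deck) is to bucket the partition by a coarse time window so reads never rescan old tombstones:

```sql
CREATE TABLE WorkQueue (
  name varchar,
  bucket int,          -- e.g. day or hour number; readers target only the current bucket
  time bigint,
  workItem varchar,
  PRIMARY KEY ((name, bucket), time)
);

-- bucket value below is illustrative
SELECT * FROM WorkQueue WHERE name = 'ToDo' AND bucket = 16117 ORDER BY time ASC;
```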
Antipattern - enqueue
Antipattern - dequeue
Antipattern

20k tombstones!!
13ms of 17ms spent reading tombstones
Lightweight Transactions
(no it’s not ACID)
Primer
● Supports basic Compare-and-Set ops
● Provides linearizable consistency
● … aka serial isolation
● Uses “Paxos light” under the hood
● Still expensive -- four round trips!
● For most cases quorum reads/writes will be sufficient
Usage
INSERT INTO Users (login, name)
VALUES ('rs_atl', 'Robbie Strickland')
IF NOT EXISTS;

UPDATE Users
SET password='super_secure_password'
WHERE login='rs_atl'
IF reset_token='some_reset_token';
Other cool stuff
● Triggers (experimental)
● Batching multiple requests
● Leveled compaction
● Configuration via CQL
● Gossip-based rack/DC configuration
Thank you!
Robbie Strickland
Software Development Manager
The Weather Channel
rostrickland@gmail.com
@dont_use_twitter
