Big Data Grows Up
A (re)introduction to Cassandra
Robbie Strickland
Who am I?
Robbie Strickland
Software Development Manager
The Weather Channel
rostrickland@gmail.com
@dont_use_twitter
Who am I?
● Cassandra user/contributor since 2010
● … it was at release 0.5 back then
● 4 years? Oracle DBAs aren’t impressed
● Done lots of dumb stuff with Cassandra
● … and some really awesome stuff too
Cassandra in 2010
Cassandra in 2010
Cassandra in 2014
Why Cassandra?
It’s fast:
● No locks
● Tunable consistency
● Sequential R/W
● Decentralized
Why Cassandra?
It scales (linearly):
● Multi data center
● No SPOF
● DHT
● Hadoop integration
Why Cassandra?
It’s fault tolerant:
● Automatic replication
● Masterless
● Failed nodes replaced with ease
What’s different?
… a lot in the last year (ish)
What’s new?
● Virtual nodes
● O(n) data moved off-heap
● CQL3 (and defining schemas)
● Native protocol/driver
● Collections
● Lightweight transactions
● Compaction throttling that actually works
What’s gone?
● Manual token management
● Supercolumns
● Thrift (if you use the native driver)
● Directly managing storage rows
What’s still the same?
● Still not an RDBMS
● Still no joins (see above)
● Still no ad-hoc queries (see above again)
● Still requires a denormalized data model (^^)
● Still need to know what the heck you’re doing
Token Management
Linear scalability without the migraine
The old way
● 1 token per node
● Assigned manually
● Adding nodes == reassignment of all tokens
● Node rebuild heavily taxes a few nodes

[Diagram: ring of six nodes A–F, each owning one contiguous token range; cluster with no vnodes]
… enter Vnodes
[Diagram: ring of many small token ranges A–N interleaved around the cluster; cluster with vnodes]
● n tokens per node
● Assigned magically
● Adding nodes == painless
● Node rebuild distributed across many nodes
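Vnodes are switched on per node in cassandra.yaml. A minimal sketch (the value shown was the common default when vnodes shipped; tune to taste):

```yaml
# cassandra.yaml: give each node many small token ranges instead of one.
# Leave initial_token unset so tokens are assigned automatically.
num_tokens: 256
```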
Node rebuild without Vnodes
Node rebuild with Vnodes
Going Off-heap
because the JVM sometimes sucks
Why go off-heap
● GC overhead
● JVM no good with big heap sizes
● GC overhead
● GC overhead
● GC overhead
O(n) data structures
● Row cache
● Bloom filters
● Compression offsets
● Partition summary

… all these are moved off-heap
New memory allocation
[Diagram: the partition key cache stays on the JVM heap; the row cache, bloom filters, compression offsets, and partition summary move to native (off-heap) memory]
Death of a (Thrift)
Salesman
Or, how to build a killer data store
without a crappy interface
Reasons not to ditch Thrift
● Lots of client libraries still use it
● You finally got it installed
● You didn’t know there was another choice
● It sucks less than many alternatives
… in spite of all those benefits, you really should ditch Thrift because:
● It requires your entire result set to fit into RAM on both client and server
● The native protocol is better, faster, and supports all the new features
● Thrift-based client libraries are always a step behind
● It’s going away eventually
… and did I mention ...
It requires your entire result set to fit into RAM on both client and server!!!
Requesting too much data
Going Native
really catchy tag line here
Native protocol
● It’s binary, making it lighter weight
● It supports cursors (FTW!)
● It supports prepared statements
● Cluster awareness built-in
● Either synchronous or asynchronous ops
● Only supports CQL-based operations
● Can be used side-by-side with Thrift
Native drivers
from DataStax:
Java
C#
Python
… other community supported drivers available
Native query example
val cluster = Cluster.builder().addContactPoints(host1, host2, host3).build()
val session = cluster.connect()

val insert =
  session.prepare("INSERT INTO myKsp.myTable (myKey, col1, col2) VALUES (?,?,?)")
val select = session.prepare("SELECT * FROM myKsp.myTable WHERE myKey = ?")

session.execute(insert.bind(myKey, col1, col2))
val result = session.execute(select.bind(myKey))
Wait, was that SQL?!!
Or, how to make Cassandra more awesome
while simultaneously irritating early adopters
Introducing CQL3
● Because the first two attempts sucked
● Stands for “Cassandra Query Language”
● Looks a heck of a lot like SQL
● … but isn’t
● Substantially lowers the learning curve
● … but also makes it easier to screw up
● An abstraction over the storage rows
Storage rows
[default@unknown] create keyspace Library;
[default@unknown] use Library;
[default@Library] create column family Books
...     with comparator=UTF8Type
...     and key_validation_class=UTF8Type
...     and default_validation_class=UTF8Type;
[default@Library] set Books['Patriot Games']['author'] = 'Tom Clancy';
[default@Library] set Books['Patriot Games']['year'] = '1987';
[default@Library] list Books;
RowKey: Patriot Games
=> (name=author, value=Tom Clancy, timestamp=1393102991499000)
=> (name=year, value=1987, timestamp=1393103015955000)
Storage rows - composites
[default@Library] create column family Authors
...     with key_validation_class=UTF8Type
...     and comparator='CompositeType(LongType,UTF8Type,UTF8Type)'
...     and default_validation_class=UTF8Type;
[default@Library] set Authors['Tom Clancy']['1987:Patriot Games:publisher'] = 'Putnam';
[default@Library] set Authors['Tom Clancy']['1987:Patriot Games:ISBN'] = '0-399-13241-4';
[default@Library] set Authors['Tom Clancy']['1993:Without Remorse:publisher'] = 'Putnam';
[default@Library] set Authors['Tom Clancy']['1993:Without Remorse:ISBN'] = '0-399-13825-0';
[default@Library] list Authors;
RowKey: Tom Clancy
=> (name=1987:Patriot Games:ISBN, value=0-399-13241-4, timestamp=1393104011458000)
=> (name=1987:Patriot Games:publisher, value=Putnam, timestamp=1393103948577000)
=> (name=1993:Without Remorse:ISBN, value=0-399-13825-0, timestamp=1393104109214000)
=> (name=1993:Without Remorse:publisher, value=Putnam, timestamp=1393104083773000)
CQL - simple intro
cqlsh> CREATE KEYSPACE Library WITH REPLICATION = {'class':'SimpleStrategy', 'replication_factor':1};
cqlsh> use Library;
cqlsh:library> CREATE TABLE Books (
           ...     title varchar,
           ...     author varchar,
           ...     year int,
           ...     PRIMARY KEY (title)
           ... );
cqlsh:library> INSERT INTO Books (title, author, year) VALUES ('Patriot Games', 'Tom Clancy', 1987);
cqlsh:library> INSERT INTO Books (title, author, year) VALUES ('Without Remorse', 'Tom Clancy', 1993);
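Reading the data back is a familiar-looking SELECT (a sketch; as later slides explain, the WHERE clause must name the primary key):

```sql
cqlsh:library> SELECT * FROM Books WHERE title = 'Patriot Games';
```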
CQL - simple intro

Storage rows:
CQL - composite key
CREATE TABLE Authors (
name varchar,
year int,
title varchar,
publisher varchar,
ISBN varchar,
PRIMARY KEY (name, year, title)
)
CQL - composite key

Storage rows:
Keys and Filters
● Ad hoc queries are NOT supported
● Query by key
● Key must include all potential filter columns
● Must include partition key in filter
● Subsequent filters must be in order
● Only last filter can be a range
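Applied to the Authors table defined earlier, with PRIMARY KEY (name, year, title), the rules work out like this (illustrative queries, not from the deck):

```sql
-- OK: partition key only
SELECT * FROM Authors WHERE name = 'Tom Clancy';

-- OK: partition key, then clustering columns in declared order
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND year = 1987;

-- OK: the last filter may be a range
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND year >= 1987 AND year < 1994;

-- Rejected: no partition key
SELECT * FROM Authors WHERE year = 1987;

-- Rejected: skips the 'year' clustering column to filter on 'title'
SELECT * FROM Authors WHERE name = 'Tom Clancy' AND title = 'Patriot Games';
```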
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (title)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author, title)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author, year)
)
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (year, author)
)
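Each key choice above supports a different set of queries. A sketch of what each one buys you (queries are illustrative):

```sql
-- PRIMARY KEY (title): point lookup by title only
SELECT * FROM Books WHERE title = 'Patriot Games';

-- PRIMARY KEY (author, title): all of an author's books, or one by title
SELECT * FROM Books WHERE author = 'Tom Clancy';
SELECT * FROM Books WHERE author = 'Tom Clancy' AND title = 'Patriot Games';

-- PRIMARY KEY (author, year): an author's books within a year range
SELECT * FROM Books WHERE author = 'Tom Clancy' AND year >= 1987;

-- PRIMARY KEY (year, author): a year's books, optionally narrowed by author
SELECT * FROM Books WHERE year = 1993 AND author = 'Tom Clancy';
```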
Secondary Indexes
● Allows query-by-value
● CREATE INDEX myIdx ON myTable (myCol)
● Works well on low cardinality fields
● Won’t scale for high cardinality fields
● Don’t overuse it -- not a quick fix for a bad data model
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (author)
)
CREATE INDEX Books_year ON Books(year)
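With the index in place, query-by-value on year becomes legal even though year is not part of the key:

```sql
SELECT * FROM Books WHERE year = 1987;
```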
Composite Partition Keys
● PRIMARY KEY((year, author), title)
● Creates a more granular shard key
● Can be useful to make certain queries more efficient, or to better distribute data
● Updates sharing a partition key are atomic and isolated
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY ((year, author), title)
)
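With a composite partition key, both components must appear in the filter; title, as a clustering column, remains optional (illustrative queries):

```sql
-- OK: the full partition key is supplied
SELECT * FROM Books WHERE year = 1987 AND author = 'Tom Clancy';

-- Rejected: only half of the partition key
SELECT * FROM Books WHERE year = 1987;
```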
Example - Books table
CREATE TABLE Books (
title varchar,
author varchar,
year int,
PRIMARY KEY (year, author, title)
)
Collections
denormalization done well
Supported types
● Sets - ordered naturally
● Lists - ordered by index
● Maps - key/value pairs
Caveats
● Max 64k items in a collection
● Max 64k size per item
● Collections are read in their entirety, so keep them small
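A sketch of the three collection types in use (the table and column names here are illustrative, not from the deck):

```sql
CREATE TABLE AuthorProfiles (
  name varchar PRIMARY KEY,
  genres set<varchar>,           -- ordered naturally
  titles list<varchar>,          -- ordered by index
  isbns map<varchar, varchar>    -- title -> ISBN
);

UPDATE AuthorProfiles SET genres = genres + {'thriller'} WHERE name = 'Tom Clancy';
UPDATE AuthorProfiles SET titles = titles + ['Patriot Games'] WHERE name = 'Tom Clancy';
UPDATE AuthorProfiles SET isbns['Patriot Games'] = '0-399-13241-4' WHERE name = 'Tom Clancy';
```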
Sets
[Diagram: storage row for a set; the column name encodes the set name and the item value]
Lists
[Diagram: storage row for a list; the column name encodes the list name plus ordering metadata, and the column value holds the list item]
Maps
[Diagram: storage row for a map; the column name encodes the map name and the key, and the column value holds the value]
TRON
(tracing on)
Using tracing
● In cqlsh, “tracing on”
● … enjoy!
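In practice (sketch):

```sql
cqlsh> TRACING ON;
cqlsh> SELECT * FROM Library.Books WHERE title = 'Patriot Games';
-- cqlsh now prints each internal step with its elapsed time,
-- including any tombstones scanned along the way
```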
Example
[Screenshot: query trace output; timestamp 1393126200000]
Antipattern
CREATE TABLE WorkQueue (
name varchar,
time bigint,
workItem varchar,
PRIMARY KEY (name, time)
)
… do a bunch of inserts ...
SELECT * FROM WorkQueue WHERE name='ToDo' ORDER BY time ASC;
DELETE FROM WorkQueue WHERE name='ToDo' AND time=[some_time]
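Queues remain an antipattern in Cassandra, but if you must build one, a commonly suggested mitigation (a sketch, not from the deck) is to bucket the partition by a coarse time window so reads never rescan old tombstones:

```sql
CREATE TABLE WorkQueue (
  name varchar,
  bucket int,          -- e.g. day or hour number; readers target only the current bucket
  time bigint,
  workItem varchar,
  PRIMARY KEY ((name, bucket), time)
);

-- bucket value below is illustrative
SELECT * FROM WorkQueue WHERE name = 'ToDo' AND bucket = 16117 ORDER BY time ASC;
```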
Antipattern - enqueue
Antipattern - dequeue
Antipattern

20k tombstones!!
13ms of 17ms spent reading tombstones
Lightweight Transactions
(no it’s not ACID)
Primer
● Supports basic Compare-and-Set ops
● Provides linearizable consistency
● … aka serial isolation
● Uses “Paxos light” under the hood
● Still expensive -- four round trips!
● For most cases quorum reads/writes will be sufficient
Usage
INSERT INTO Users (login, name)
VALUES ('rs_atl', 'Robbie Strickland')
IF NOT EXISTS;

UPDATE Users
SET password='super_secure_password'
WHERE login='rs_atl'
IF reset_token='some_reset_token';
Other cool stuff
● Triggers (experimental)
● Batching multiple requests
● Leveled compaction
● Configuration via CQL
● Gossip-based rack/DC configuration
Thank you!
Robbie Strickland
Software Development Manager
The Weather Channel
rostrickland@gmail.com
@dont_use_twitter
