@doanduyhai
Introduction to Cassandra
DuyHai DOAN, Technical Advocate
@doanduyhai
Who Am I ?!
Duy Hai DOAN
Cassandra technical advocate
•  talks, meetups, confs
•  open-source devs (Achilles, …)
•  OSS Cassandra point of contact
☞ duy_hai.doan@datastax.com
☞ @doanduyhai 
2
@doanduyhai
Datastax!
•  Founded in April 2010 
•  We contribute a lot to Apache Cassandra™
•  400+ customers (25 of the Fortune 100), 200+ employees
•  Headquarter in San Francisco Bay area
•  EU headquarter in London, offices in France and Germany
•  Datastax Enterprise = OSS Cassandra + extra features
3
@doanduyhai
Agenda!
Architecture
•  Cluster, Replication, Consistency
Data model
•  Last Write Win (LWW), CQL basics, From SQL to CQL,
Lightweight Transaction
DSE
Use Cases
4
@doanduyhai
Cassandra history!
NoSQL database
•  created at Facebook
•  open-sourced since 2008
•  current version = 2.1
•  column-oriented ☞ distributed table
5
@doanduyhai
Cassandra 5 key facts!
Key fact 1: linear scalability
C*
C*C*
NetcoSports
3 nodes, ≈3GB
1k+ nodes, PB+
YOU
6
@doanduyhai
Cassandra 5 key facts!
Key fact 2: continuous availability (≈100% up-time)
•  resilient architecture (Dynamo)

7
@doanduyhai
Cassandra 5 key facts!
Key fact 3: multi-data centers
•  out-of-the-box (config only)
•  AWS conf for multi-region DCs 
•  GCE/CloudStack support
•  Microsoft Azure



8
@doanduyhai
Multi-DC usages!
New York (DC1)
 London (DC2)
Data-locality, disaster recovery 
n2
n3
n4
n5
n6
n7
n8
n1
n2
n3
n4n5
n1
Async
replication
9
@doanduyhai
Multi-DC usages!
Workload segregation/virtual DC
n2
n3
n4
n5
n6
n7
n8
n1
n2
n3
n4n5
n1
Production
(Live)
Analytics
(Spark/Hadoop)
Same
room
Async
replication
10
@doanduyhai
Multi-DC usages!
Prod data copy for testing/benchmarking
n2
n3
n4
n5
n6
n7
n8
n1
n2
n3n1
Use
LOCAL
consistency
My tiny test
cluster
Data copy
NEVER WRITE HERE !!!
11
@doanduyhai
Cassandra 5 key facts!
Key fact 4: operational simplicity
•  1 node = 1 process + 2 config file (main + IP)
•  deployment automation
•  OpsCenter for monitoring


12
@doanduyhai
Cassandra 5 key facts!
13
@doanduyhai
Cassandra 5 key facts!
Key fact 5: analytics combo
•  Cassandra + Spark = awesome !
•  realtime streaming/analytics/aggregation …
14
Cassandra architecture!
Cluster
Replication
Consistency
@doanduyhai
Cassandra architecture!
Cluster layer
•  Amazon DynamoDB paper
•  masterless architecture

Data-store layer
•  Google Big Table paper
•  Columns/columns family
16
@doanduyhai
Data distribution!
Random: hash of #partition → token = hash(#p)

Hash: ]-X, X]

X = huge number (264/2)

 n1
n2
n3
n4
n5
n6
n7
n8
17
@doanduyhai
Token Ranges!
A: ]0, X/8]
B: ] X/8, 2X/8]
C: ] 2X/8, 3X/8]
D: ] 3X/8, 4X/8]
E: ] 4X/8, 5X/8]
F: ] 5X/8, 6X/8]
G: ] 6X/8, 7X/8]
H: ] 7X/8, X]
n1
n2
n3
n4
n5
n6
n7
n8
A
B
C
D
E
F
G
H
18
@doanduyhai
Distributed Table!
n1
n2
n3
n4
n5
n6
n7
n8
A
B
C
D
E
F
G
H
user_id1
user_id2
user_id3
user_id4
user_id5
19
@doanduyhai
Distributed Table!
n1
n2
n3
n4
n5
n6
n7
n8
A
B
C
D
E
F
G
H
user_id1
user_id2
user_id3
user_id4
user_id5
20
@doanduyhai
Linear scalability!
n1
n2
n3
n4
n5
n6
n7
n8
n1
n2
n3 n4
n5
n6
n7
n8n9
n10
8 nodes 10 nodes
21
@doanduyhai
Failure tolerance!
Replication Factor (RF) = 3
n1
n2
n3
n4
n5
n6
n7
n8
1
2
3
{B, A, H}
{C, B, A}
{D, C, B}
A
B
C
D
E
F
G
H
22
@doanduyhai
Coordinator node!
Incoming requests (read/write)

Coordinator node handles the request
Every node can be coordinator àmasterless
n1
n2
n3
n4
n5
n6
n7
n8
1
2
3
coordinator
request
23
@doanduyhai
Consistency!
Tunable at runtime
•  ONE
•  QUORUM (strict majority w.r.t. RF)
•  ALL

Apply both to read & write


24
@doanduyhai
Consistency in action!
RF = 3, Write ONE, Read ONE
B A A
B A A
Read ONE: A
data replication in progress …
Write ONE: B
25
@doanduyhai
Consistency in action!
RF = 3, Write ONE, Read QUORUM
B A A
Write ONE: B
Read QUORUM: A
B A A
data replication in progress …
26
@doanduyhai
Consistency in action!
RF = 3, Write ONE, Read ALL
B A A
Read ALL: B
B A A
data replication in progress …
Write ONE: B
27
@doanduyhai
Consistency in action!
RF = 3, Write QUORUM, Read ONE

B B A
Write QUORUM: B
Read ONE: A
B B A
data replication in progress …
28
@doanduyhai
Consistency in action!
RF = 3, Write QUORUM, Read QUORUM
B B A
Read QUORUM: B
B B A
data replication in progress …
Write QUORUM: B
29
@doanduyhai
Consistency trade-off!
30
@doanduyhai
Consistency level!
ONE
Fast, may not read latest written value

31
@doanduyhai
Consistency level!
QUORUM
Strict majority w.r.t. Replication Factor
Good balance
32
@doanduyhai
Consistency level!
ALL
Paranoid
Slow, no high availability
33
@doanduyhai
Consistency summary!

ONERead + ONEWrite
☞ available for read/write even (N-1) replicas down


QUORUMRead + QUORUMWrite
☞ available for read/write even 1+ replica down
34
Q & R
! "!
Data model!
Last Write Win!
CQL basics!
From SQL to CQL!
Lightweight Transaction!
@doanduyhai
Cassandra Write Path!
Commit log1
. . .
1
Commit log2
Commit logn
Memory
37
@doanduyhai
Cassandra Write Path!
Memory
Commit log1
. . .
1
Commit log2
Commit logn
MemTable
Table1
MemTable
Table2
MemTable
TableN
2
. . .
38
@doanduyhai
Cassandra Write Path!
Commit log1
Commit log2
Commit logn
Table1
SSTable1
Table2 Table3
SSTable2 SSTable3
3
Memory
. . .
39
@doanduyhai
Cassandra Write Path!
Commit log1
Commit log2
Commit logn
Table1
SSTable1
Table2 Table3
SSTable2 SSTable3
Memory. . .
MemTable
Table1
MemTable
Table2
MemTable
TableN
. . .
40
@doanduyhai
Cassandra Write Path!
Commit log1
Commit log2
Commit logn
Table1
SSTable1
Table2 Table3
SSTable2 SSTable3
Memory
SSTable1
SSTable2
SSTable3
. . .
41
@doanduyhai
Last Write Win (LWW)!
jdoe
age
 name
33 John DOE
INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);
#partition 
42
@doanduyhai
Last Write Win (LWW)!
jdoe
age (t1) name (t1)
33 John DOE
INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);
auto-generated timestamp
.
43
@doanduyhai
Last Write Win (LWW)!
UPDATE users SET age = 34 WHERE login = ‘jdoe’;
jdoe
age (t1) name (t1)
33 John DOE
jdoe
age (t2)
34
SSTable1 SSTable2
44
@doanduyhai
Last Write Win (LWW)!
DELETE age FROM users WHERE login = ‘jdoe’;
jdoe
age (t3)
ý
tombstone
jdoe
age (t1) name (t1)
33 John DOE
jdoe
age (t2)
34
SSTable1 SSTable2 SSTable3
45
@doanduyhai
Last Write Win (LWW)!
SELECT age FROM users WHERE login = ‘jdoe’;
???
SSTable1 SSTable2 SSTable3
jdoe
age (t3)
ý
jdoe
age (t1) name (t1)
33 John DOE
jdoe
age (t2)
34
46
@doanduyhai
Last Write Win (LWW)!
SELECT age FROM users WHERE login = ‘jdoe’;
✓✕✕
SSTable1 SSTable2 SSTable3
jdoe
age (t3)
ý
jdoe
age (t1) name (t1)
33 John DOE
jdoe
age (t2)
34
47
@doanduyhai
Compaction!
SSTable1 SSTable2 SSTable3
jdoe
age (t3)
ý
jdoe
age (t1) name (t1)
33 John DOE
jdoe
age (t2)
34
New SSTable
jdoe
age (t3) name (t1)
ý John DOE
48
@doanduyhai
Historical data!
history
id
date1(t1) date2(t2) … date9(t9)
… … … …
SSTable1 SSTable2
You want to keep data history ?
•  do not use internal generated timestamp !!!
•  ☞ time-series data modeling
id
date10(t10)date11(t11) …
 …
… … … …
49
@doanduyhai
CRUD operations!

INSERT INTO users(login, name, age) VALUES(‘jdoe’, ‘John DOE’, 33);

UPDATE users SET age = 34 WHERE login = ‘jdoe’;

DELETE age FROM users WHERE login = ‘jdoe’;

SELECT age FROM users WHERE login = ‘jdoe’;
50
@doanduyhai
Simple Table!

CREATE TABLE users (

 
login text,

 
name text,

 
age int,

 
…

 
PRIMARY KEY(login));
partition key (#partition)
51
@doanduyhai
What about joins ?!
How can I join data between tables ?
How can I model 1 – N relationships ?

How to model a mailbox ?
EmailsUser
1 n
52
@doanduyhai
Clustered table (1 – N)!

CREATE TABLE mailbox (

 
login text,

 
message_id timeuuid,

 
interlocutor text,

 
message text,

 
PRIMARY KEY((login), message_id));
partition key clustering column
(sorted)
unicity
53
@doanduyhai
SSTable2
SSTable1
On disk layout
jdoe
message_id1 message_id2 … message_id104
… … … …
hsue
message_id1 message_id2 … message_id78
… … … …
jdoe
message_id105 message_id106 … message_id169
… … … …
54
@doanduyhai
Queries!
Get message by user and message_id (date)

SELECT * FROM mailbox WHERE login = jdoe 

and message_id = ‘2014-09-25 16:00:00’;
Get message by user and date interval

SELECT * FROM mailbox WHERE login = jdoe 

and message_id <= ‘2014-09-25 16:00:00’

and message_id >= ‘2014-09-20 16:00:00’;
55
@doanduyhai
Queries!
Get message by message_id only ?

SELECT * FROM mailbox WHERE message_id = ‘2014-09-25 16:00:00’;
Get message by date interval only ?

SELECT * FROM mailbox WHERE

and message_id <= ‘2014-09-25 16:00:00’

and message_id >= ‘2014-09-20 16:00:00’;
❓
❓
56
@doanduyhai
Queries!
Get message by message_id only (#partition not provided) 

SELECT * FROM mailbox WHERE message_id = ‘2014-09-25 16:00:00’;
Get message by date interval only (#partition not provided)

SELECT * FROM mailbox WHERE

and message_id <= ‘2014-09-25 16:00:00’

and message_id >= ‘2014-09-20 16:00:00’;
57
@doanduyhai
Without #partition
?
?
?
?
?
?
?
?
❓
❓
❓
❓
❓
❓
❓
❓
No #partition
☞ no token
☞ where are my data ?
58
@doanduyhai
The importance of #partition
In RDBMS, no primary key
☞ full table scan
😭
59
@doanduyhai
The importance of #partition
With Cassandra, no partition key
☞ full CLUSTER scan
😱
60
@doanduyhai
Queries!

SELECT * FROM mailbox WHERE login >= ‘hsue’ and login <= ‘jdoe’;
Get message by user range (range query on #partition)

SELECT * FROM mailbox WHERE login like ‘%doe%‘;
Get message by user pattern (non exact match on #partition)
61
@doanduyhai
WHERE clause restrictions!
All queries (INSERT/UPDATE/DELETE/SELECT) must provide #partition
Only exact match (=) on #partition, range queries (<, ≤, >, ≥) not allowed
•  ☞ full cluster scan

On clustering columns, only range queries (<, ≤, >, ≥) and exact match 

WHERE clause only possible
•  on columns defined in PRIMARY KEY
•  on indexed columns ( )
62
@doanduyhai
WHERE clause restrictions!
What if I want to perform « arbitrary » WHERE clause ?
•  search form scenario, dynamic search fields
63
@doanduyhai
WHERE clause restrictions!
What if I want to perform « arbitrary » WHERE clause ?
•  search form scenario, dynamic search fields

DO NOT RE-INVENT THE WHEEL !
☞ Apache Solr (Lucene) integration (Datastax Enterprise)
☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)
64
@doanduyhai
WHERE clause restrictions!
What if I want to perform « arbitrary » WHERE clause ?
•  search form scenario, dynamic search fields

DO NOT RE-INVENT THE WHEEL !
☞ Apache Solr (Lucene) integration (Datastax Enterprise)
☞ Same JVM, 1-cluster-2-products (Solr & Cassandra)

SELECT * FROM users WHERE solr_query = ‘age:[33 TO *] AND gender:male’;


SELECT * FROM users WHERE solr_query = ‘lastname:*schwei?er’;
65
@doanduyhai
Collections & maps!

CREATE TABLE users (

 
login text,

 
name text,

 
age int,

 
friends set<text>,

 
hobbies list<text>,

 
languages map<int, text>,

 
…

 
PRIMARY KEY(login));
66
Keep the cardinality low ≈ 1000
@doanduyhai
User Defined Type (UDT)!

CREATE TABLE users (

 
login text,

 
…

 
street_number int,

 
street_name text,

 
postcode int,

 
country text,

 
…

 
PRIMARY KEY(login));
Instead of
67
@doanduyhai
User Defined Type (UDT)!

CREATE TYPE address (

 
street_number int,

 
street_name text,

 
postcode int,

 
country text);


CREATE TABLE users (

 
login text,

 
…

 
location frozen <address>,

 
…

 
PRIMARY KEY(login));
68
@doanduyhai
UDT insert!

INSERT INTO users(login,name, location) VALUES (

 
‘jdoe’, 

 
’John DOE’,

 
{
‘street_number’: 124,
‘street_name’: ‘Congress Avenue’,
‘postcode’: 95054,
‘country’: ‘USA’
});
69
@doanduyhai
UDT update!

UPDATE users set location = 

 
{
‘street_number’: 125,
‘street_name’: ‘Congress Avenue’,
‘postcode’: 95054,
‘country’: ‘USA’
}
WHERE login = jdoe;
Can be nested ☞ store documents
•  but no dynamic fields (or use map<text, blob>)
70
@doanduyhai
From SQL to CQL!
Normalized
Comment
User
1
n
CREATE TABLE comments (

article_id uuid, 

comment_id timeuuid, 

author_login text, // typical join id

content text, 

PRIMARY KEY((article_id), comment_id));
71
@doanduyhai
From SQL to CQL
1 SELECT
-  10 last comments
-  10 author_login

What to do with 10 author_login ???
Comment
User
1
n
72
@doanduyhai
From SQL to CQL
1 SELECT
-  10 last comments
-  10 author_login

What to do with 10 author_login ???
10 extra SELECT → N+1 SELECT problem !
Comment
User
1
n
73
@doanduyhai
From SQL to CQL!
De-normalized
Comment
User
1
n
CREATE TABLE comments (

article_id uuid, 

comment_id timeuuid, 

author frozen<person>, // person is UDT

content text, 

PRIMARY KEY((article_id), comment_id));
74
@doanduyhai
Data modeling best practices!
Start by queries
•  identify core functional read paths
•  1 read scenario ≈ 1 SELECT 

75
@doanduyhai
Data modeling best practices!
Start by queries
•  identify core functional read paths
•  1 read scenario ≈ 1 SELECT 

Denormalize
•  wisely, only duplicate necessary & immutable data
•  functional/technical trade-off
76
@doanduyhai
Data modeling best practices!
Person UDT
- firstname/lastname
- date of birth
- gender
- mood
- location
77
@doanduyhai
Data modeling best practices!
John DOE, male
birthdate: 21/02/1981
subscribed since 03/06/2011
☉ San Mateo, CA
’’Impossible is not John DOE’’
Full detail read from
User table on click
78
@doanduyhai
Data modeling trade-off
What if ...
•  not possible to de-normalize with immutable data ?
•  have to duplicate mutable data ? 

79
@doanduyhai
Data modeling trade-off
2 strategies
•  either accept to normalize some data (extra SELECT required)
•  or de-normalize and update everywhere upon data mutation 
80
@doanduyhai
Data modeling trade-off
2 strategies
•  either accept to normalize some data (extra SELECT required)
•  or de-normalize and update everywhere upon data mutation 

But always keep those scenarios rare (5%-10% max), focus on the 90%

81
@doanduyhai
Data modeling trade-off
2 strategies
•  either accept to normalize some data (extra SELECT required)
•  or de-normalize and update everywhere upon data mutation 

But always keep those scenarios rare (5%-10% max), focus on the 90%

Example: Twitter tweet deletion
82
Q & R
! "!
@doanduyhai
Lightweight Transaction (LWT)!
What ? ☞ make operations linearizable

Why ? ☞ solve a class of race conditions in Cassandra that
would require installing an external lock manager 
84
@doanduyhai
Lightweight Transaction (LWT)!
INSERT INTO account (id, email) 
VALUES (‘jdoe’,
‘john_doe@fiction.com’);
SELECT * FROM account
WHERE id= ‘jdoe’;
(0 rows)
SELECT * FROM account
WHERE id= ‘jdoe’;
(0 rows)
INSERT INTO account (id, email) 
VALUES (‘jdoe’, 
‘jdoe@fiction.com’);
winner
85
@doanduyhai
Lightweight Transaction (LWT)!
How ? ☞ implementing Paxos protocol on Cassandra

Syntax ? 

INSERT INTO account (id, email) VALUES (‘jdoe’, ‘john_doe@fiction.com’)

IF NOT EXISTS;


UPDATE account SET email = ‘jdoe@fiction.com’ 

IF email = ‘john_doe@fiction.com’ WHERE id=‘jdoe’;
86
@doanduyhai
Lightweight Transaction (LWT)!
Recommendations
•  insert with LWT ☞ delete must use LWT

INSERT INTO my_table … IF NOT EXISTS 

☞ DELETE FROM my_table … IF EXISTS
87
@doanduyhai
Lightweight Transaction (LWT)!
Recommendations
•  LWT expensive (4 round-trips), do not abuse
•  only for 1% – 5% use cases 
88
@doanduyhai
Lightweight Transaction (LWT)!
1
2
3
4Compare Swap / Learn
Queue-in Consensus
89
Q & R
! "!
@doanduyhai
DSE (Datastax Enterprise)!
Security
Analytics (Spark & Hadoop)
Search (Solr)
91
@doanduyhai
Use Cases!
Messaging
Collections/
Playlists
Fraud
detection
Recommendation/
Personalization
Internet of things/
Sensor data
92
@doanduyhai
Use Cases!
Messaging
Collections/
Playlists
Fraud
detection
Recommendation/
Personalization
Internet of things/
Sensor data
93
Thank You
@doanduyhai
duy_hai.doan@datastax.com
https://academy.datastax.com/

Cassandra introduction @ NantesJUG