Apache Cassandra - Data modelling

Cassandra Data Modelling
CQL is not SQL
Querying simple tables + CQL TRACE (your new best friend)
C* columns and disk storage
C* column nesting (clustering)
Querying clustering columns
RDBMS data modelling: normalize tables then define queries
C* data modelling: define queries then define denormalized tables
C* data modelling: use-case

Where is my data?
[Diagram: ring of 8 nodes, each owning one token range: 01->09, 10->19, 20->29, 30->39, 40->49, 50->59, 60->69, 70->79]
CREATE keyspace...
CREATE TABLE users (
    username text,
    password text,
    address text,
    PRIMARY KEY (username)
);
username        | password | address
----------------+----------+--------------
john@site.com   | xxx      | 35 Arthur St
bill@yahoo.com  | xxx      | 21 Jump St
james@gmail.com | xxx      | 18 Smith St

Where is my data?
[Diagram: the 8-node token ring, ranges 01->09 through 70->79]
Each node owns a range of tokens, and the ring ALWAYS forms a complete token range.
hash(primary key) -> token
The token always falls between the lower bound and the upper bound of the complete token range (0->79)*
*It doesn't matter whether the PK is a string, int, float, GUID or blob: the token always falls within the token range.
The hash spreads keys evenly (effectively at random) across the range, so hash(john@site.com) could be any number between 0 and 79, but the same key always produces the same number
-> consistent hashing
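The key-to-token step can be sketched in a few lines of Python. This is only an illustration of the idea on the slide's toy range of 0->79: the function name `token_for` and the use of MD5 are assumptions made for the example, so the tokens it prints will not match the numbers on the slides. A real cluster uses its configured partitioner (Murmur3Partitioner by default) over a far larger token range.

```python
import hashlib

TOKEN_RANGE = 80  # the slides' toy ring covers tokens 0..79


def token_for(primary_key: str) -> int:
    """Map a primary key to a token in [0, TOKEN_RANGE).

    MD5 is a stand-in hash chosen for this sketch, not Cassandra's partitioner.
    """
    digest = hashlib.md5(primary_key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_RANGE


if __name__ == "__main__":
    for key in ("john@site.com", "bill@yahoo.com", "james@gmail.com"):
        # The same key always yields the same token, whatever its type or length.
        print(key, "->", token_for(key))
```

Whatever the key looks like, the result lands inside the ring's range and is repeatable, which is all the placement logic needs.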

Where is my data?
[Diagram: the 8-node token ring with the rows placed on it: james@gmail.com in 01->09, john@site.com in 20->29, bill@yahoo.com in 70->79]
token = hash(primary key)
eg
hash(john@site.com) = 26
hash(bill@yahoo.com) = 79
hash(james@gmail.com) = 5
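Once the token is known, finding the owning range is a simple lookup. The sketch below uses the example tokens from this slide and the eight ranges from the ring diagram; the names `RANGES` and `owning_range` are made up for the illustration.

```python
from bisect import bisect_left

# Upper bound and label of each token range on the slides' 8-node ring.
RANGES = [
    (9, "01->09"), (19, "10->19"), (29, "20->29"), (39, "30->39"),
    (49, "40->49"), (59, "50->59"), (69, "60->69"), (79, "70->79"),
]
UPPER_BOUNDS = [upper for upper, _ in RANGES]


def owning_range(token: int) -> str:
    """Return the label of the token range that owns the given token."""
    if not 0 <= token <= UPPER_BOUNDS[-1]:
        raise ValueError(f"token {token} is outside the ring")
    # First range whose upper bound is >= token.
    return RANGES[bisect_left(UPPER_BOUNDS, token)][1]


if __name__ == "__main__":
    examples = {"john@site.com": 26, "bill@yahoo.com": 79, "james@gmail.com": 5}
    for key, token in examples.items():
        print(f"{key}: token {token} lives in range {owning_range(token)}")
```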

What is the difference between:
SQL> SELECT * FROM users;
and
CQL> SELECT * FROM users;
Answer : Where is my data?
CQL is not SQL

SELECT * FROM users;
1. query node 3
Where is my data?
[Diagram: ring of nodes 1 to 8 with their token ranges; john@site.com sits in 20->29 and bill@yahoo.com in 70->79]

1. query node 3
2. query node 8
Where is my data?
[Diagram: the same 8-node ring as on the previous slide]

1. query node 3
2. query node 8
3. query node 1
Where is my data?
[Diagram: the same 8-node ring as on the previous slide]

Username (PK) | Password | Address
--------------+----------+--------
aaaaaaaa      | xxx      | xxx
aaaaaaab      | xxx      | xxx
bbbbbbbb      | xxx      | xxx
ccccccccc     | xxx      | xxx
zzzzzzzzz     | xxx      | xxx
[Diagram label: token range 40->49]
1. query node 3
2. query node 8
3. query node 1
4. query node 2
This is called a table scan in C* parlance and it is a performance anti-pattern: every node has to be queried in turn, so the query will very quickly time out.
Proper design means this query is unnecessary.
Test all queries by running a TRACE in DevCenter or at the CQLSH prompt as a sanity check.
Where is my data?
[Diagram: the same 8-node ring, with john@site.com in 20->29 and bill@yahoo.com in 70->79]

C* is very slow at scanning down a list of partition_keys because they are distributed over many partitions / nodes.
C* is very fast at scanning across columns for a specific partition_key because they are on a single partition.
[Diagram: partition_key aaaaaaab with its columns col1 | col2 | ... | col16 laid out in a row; scanning across the columns is labelled "fast scan !!", scanning down the partition_keys is labelled "slow scan"]
OK, I get that the partition_keys are spread out over different nodes and that's why scanning down them is slow, but that doesn't explain why scanning across columns is fast for a specific partition_key (e.g. john@site.com).
It all comes down to how the columns for a specific partition_key are stored on disk.

[Diagram: on disk, partition_key aaaaaaab is followed by its columns col1 | col2 | ... | col16, stored contiguously]
-> efficient disk scan, query and slice semantics
(partition_key:aaaaaaab)
(column=col1, value=xx, timestamp=1357866010549000)
(partition_key:xxxxxxxx)
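A rough way to picture why slicing across columns is cheap: if the cells of one partition are kept sorted and contiguous, a range of columns is a single binary search plus one sequential read. The sketch below models that with an in-memory list; the zero-padded column names (col01..col16) and the function `slice_columns` are assumptions for the illustration, not Cassandra's actual storage format.

```python
from bisect import bisect_left, bisect_right

# One partition: cells kept sorted by column name, each cell is
# (column name, value, write timestamp in microseconds).
CELLS = [(f"col{i:02d}", "xx", 1357866010549000) for i in range(1, 17)]
NAMES = [name for name, _value, _ts in CELLS]


def slice_columns(start: str, end: str):
    """Return the contiguous run of cells with start <= column name <= end."""
    lo = bisect_left(NAMES, start)
    hi = bisect_right(NAMES, end)
    return CELLS[lo:hi]  # one sequential read, no jumping around


if __name__ == "__main__":
    for name, value, timestamp in slice_columns("col03", "col05"):
        print(name, value, timestamp)
```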

C* column nesting (clustering)
CREATE TABLE sessions (
    username text,
    session_id text,
    url text,
    time_spent int,
    browser text,
    PRIMARY KEY (username, session_id)
);
PRIMARY KEY(<partition_key>, <clustering column>)
<partition_key>: dictates what node the data is stored on
<clustering column>: dictates how data is stored and sorted under the partition key
john@site.com
    session2
        url xxx
        time_spent xxx
        browser xxx
    session3 ...
SELECT * FROM sessions WHERE username='john@site.com' AND session_id='session2';

[Diagram: ring of 3 nodes owning token ranges 01->29, 30->59 and 60->89; each partition below lives on one node]
john@site.com
    session1
        url xxx
        time_spent xxx
        browser xxx
    session2 ...
bill@yahoo.com
    session1
        url xxx
        time_spent xxx
        browser xxx
    session2 ...
james@gmail.com
    session1
        url xxx
        time_spent xxx
        browser xxx
    session2 ...

C* column nesting (clustering) - queries
CREATE TABLE sessions (
    username text,
    session_id text,
    url text,
    time_spent int,
    browser text,
    PRIMARY KEY (username, session_id)
);
GOOD:
SELECT * FROM sessions WHERE username='john@site.com';
SELECT * FROM sessions WHERE username='john@site.com' AND session_id='session2';
WRONG:
SELECT * FROM sessions WHERE session_id='session2';
RULE: The partition_key, and every clustering column that precedes the most granular clustering column you filter on, must be present in the query.
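The rule can be written down as a small check. This is a simplified sketch, not how Cassandra validates queries: it only considers equality restrictions on key columns and ignores secondary indexes, range restrictions and ALLOW FILTERING. The function name `is_valid_where` is made up for the example.

```python
def is_valid_where(partition_key, clustering_columns, restricted):
    """Check a set of equality-restricted columns against a primary key.

    partition_key:      list of partition key column names
    clustering_columns: list of clustering column names, in PRIMARY KEY order
    restricted:         column names that appear in the WHERE clause
    """
    restricted = set(restricted)
    # The whole partition key must be present, otherwise no node can be picked.
    if not set(partition_key) <= restricted:
        return False
    remaining = restricted - set(partition_key)
    # The restricted clustering columns must form a prefix of the declared order.
    prefix_len = 0
    for column in clustering_columns:
        if column not in remaining:
            break
        prefix_len += 1
    return remaining == set(clustering_columns[:prefix_len])


if __name__ == "__main__":
    pk, cc = ["username"], ["session_id"]
    print(is_valid_where(pk, cc, {"username"}))                # GOOD
    print(is_valid_where(pk, cc, {"username", "session_id"}))  # GOOD
    print(is_valid_where(pk, cc, {"session_id"}))              # WRONG
```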

CREATE TABLE timeline (
day text,
hour int,
min int,
sec int,
reading text,
PRIMARY KEY (day, hour, min, sec)
);
PRIMARY KEY(<partition_key>, <cl. column>, <cl. column>, <cl. column>)
day1
    hour1
        min1
            sec1
                reading
            sec2
                reading
day2 ...
SELECT * FROM timeline WHERE day='day1' AND hour=1 AND min=1 AND sec=1;

CREATE TABLE timeline (
day text,
hour int,
min int,
sec int,
reading text,
PRIMARY KEY (day, hour, min, sec)
);
GOOD:
SELECT * FROM timeline WHERE day='day1';
SELECT * FROM timeline WHERE day='day1' AND hour=1;
SELECT * FROM timeline WHERE day='day1' AND hour=1 AND min=1;
SELECT * FROM timeline WHERE day='day1' AND hour=1 AND min=1 AND sec=1;
WRONG:
SELECT * FROM timeline WHERE day='day1' AND min=1;
RULE: Clustering columns must be restricted in the same order as they appear in the PRIMARY KEY; you cannot skip one (here hour) and filter on a later one (min).
CAREFUL: be aware how much data you are returning !!
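Why the order matters becomes clearer if the rows of one partition are pictured as a list sorted by the tuple (hour, min, sec). A query that fixes a prefix of that tuple reads one contiguous run; a query that skips hour would have to pick rows from all over the partition. The sketch below models this with made-up readings for a single day; `prefix_scan` is a name invented for the illustration.

```python
from bisect import bisect_left

# One partition (day1): rows sorted by the clustering tuple (hour, min, sec).
ROWS = sorted(
    ((hour, minute, sec), f"reading-{hour}-{minute}-{sec}")
    for hour in range(3) for minute in range(3) for sec in range(3)
)
KEYS = [key for key, _reading in ROWS]


def prefix_scan(prefix: tuple):
    """Return the contiguous rows whose clustering key starts with prefix."""
    lo = bisect_left(KEYS, prefix)  # jump straight to the first match
    hi = lo
    while hi < len(KEYS) and KEYS[hi][:len(prefix)] == prefix:
        hi += 1
    return ROWS[lo:hi]


if __name__ == "__main__":
    print(len(prefix_scan((1,))))       # hour=1: one contiguous run
    print(len(prefix_scan((1, 2))))     # hour=1 AND min=2: a shorter run
    # min=1 without hour: the matching rows are scattered through the partition.
    print([i for i, key in enumerate(KEYS) if key[1] == 1])
```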

Notes and limitations
(partition_key:aaaaaaab)
RULE: Always design your tables so that the amount of data stored under a single partition_key stays within the in_memory_compaction_limit_in_mb that is set in cassandra.yaml (default 64 MB).
Why? Compaction (which we will cover later) needs to be able to process a complete partition_key and all its underlying data in memory. Swapping to and from disk introduces serious performance degradation and poor JVM GC behaviour.
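A back-of-the-envelope check against that limit might look like the sketch below. The 100-byte average row size is an assumed figure for the example, and the estimate ignores per-cell and per-row storage overhead, so treat the result as a rough lower bound rather than a measurement.

```python
IN_MEMORY_COMPACTION_LIMIT_MB = 64  # the default quoted above


def estimated_partition_mb(rows_per_partition: int, avg_row_bytes: int) -> float:
    """Very rough partition size: row count times average row size."""
    return rows_per_partition * avg_row_bytes / (1024 * 1024)


def fits_in_limit(rows_per_partition: int, avg_row_bytes: int,
                  limit_mb: float = IN_MEMORY_COMPACTION_LIMIT_MB) -> bool:
    """True if the estimated partition size stays within the limit."""
    return estimated_partition_mb(rows_per_partition, avg_row_bytes) <= limit_mb


if __name__ == "__main__":
    # One partition per day with one reading per second = 86,400 rows.
    print(round(estimated_partition_mb(86_400, 100), 2), fits_in_limit(86_400, 100))
    # A partition that keeps growing to a million rows no longer fits.
    print(round(estimated_partition_mb(1_000_000, 100), 2), fits_in_limit(1_000_000, 100))
```

If the estimate gets anywhere near the limit, that is a hint to add another column to the partition key so the data is split into smaller partitions.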

Where is my data?
CQL TRACE will show you where your data is,
how costly it is to get it in terms of time and
how many node hops it is going to take to get it.

Sane queries on simple tables + introducing indexes
CREATE TABLE users (
    username text,
    password text,
    address text,
    age int,
    PRIMARY KEY (username)
);
SELECT * FROM users WHERE username='john@site.com';
SELECT address, age FROM users WHERE username='bill@yahoo.com';
CREATE INDEX age_key ON users(age);
SELECT * FROM users WHERE age=35;
CAREFUL: think about how much data you are returning and where that data is coming from. If you don't know, or can't work it out, run a TRACE at the CQLSH console or run the query under DevCenter 1.3...

CQL TRACE - your new best friend
TRACE describes each step taken to satisfy the request, the names of the nodes involved, the time for each step and the total time for the request. TRACE is the most powerful tool in a data modeller's hands. The example below traces an INSERT.
activity | timestamp | source | source_elapsed (microseconds)
-------------------------------------+--------------+-----------+----------------
execute_cql3_query | 16:41:00,754 | 127.0.0.1 | 0
Parsing statement | 16:41:00,754 | 127.0.0.1 | 48
Preparing statement | 16:41:00,755 | 127.0.0.1 | 658
Determining replicas for mutation | 16:41:00,755 | 127.0.0.1 | 979
Message received from /127.0.0.1 | 16:41:00,756 | 127.0.0.3 | 37
Acquiring switchLock read lock | 16:41:00,756 | 127.0.0.1 | 1848
Sending message to /127.0.0.3 | 16:41:00,756 | 127.0.0.1 | 1853
Appending to commitlog | 16:41:00,756 | 127.0.0.1 | 1891
Adding to emp memtable | 16:41:00,756 | 127.0.0.1 | 1997
Enqueuing response to /127.0.0.1 | 16:41:00,758 | 127.0.0.3 | 1282
Enqueuing response to /127.0.0.1 | 16:41:00,758 | 127.0.0.2 | 1024
Processing response from /127.0.0.2 | 16:41:00,765 | 127.0.0.1 | 11063
Processing response from /127.0.0.3 | 16:41:00,765 | 127.0.0.1 | 11066
Request complete | 16:41:00,765 | 127.0.0.1 | 11139
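If you paste a trace like the one above into a script, a few lines are enough to see which nodes took part and how long each one was busy. The sketch assumes the pipe-separated layout shown on this slide (your cqlsh version may format it differently) and uses a handful of rows copied from the trace above; `summarize` is a name made up for the example.

```python
TRACE = """\
execute_cql3_query | 16:41:00,754 | 127.0.0.1 | 0
Determining replicas for mutation | 16:41:00,755 | 127.0.0.1 | 979
Message received from /127.0.0.1 | 16:41:00,756 | 127.0.0.3 | 37
Enqueuing response to /127.0.0.1 | 16:41:00,758 | 127.0.0.3 | 1282
Enqueuing response to /127.0.0.1 | 16:41:00,758 | 127.0.0.2 | 1024
Request complete | 16:41:00,765 | 127.0.0.1 | 11139
"""


def summarize(trace_text: str) -> dict:
    """Return {source node: largest source_elapsed in microseconds}."""
    per_node = {}
    for line in trace_text.strip().splitlines():
        _activity, _timestamp, source, elapsed = (p.strip() for p in line.split("|"))
        per_node[source] = max(per_node.get(source, 0), int(elapsed))
    return per_node


if __name__ == "__main__":
    summary = summarize(TRACE)
    print(f"{len(summary)} nodes involved")
    for node, elapsed in sorted(summary.items()):
        print(node, elapsed, "microseconds")
```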

CQL TRACE - how do I invoke it?
Option 1: CQLSH
All C* installs come with a command-line client called cqlsh. You can run any CQL command against a Cassandra cluster using cqlsh. To turn tracing on:
cqlsh> TRACING ON;
cqlsh> SELECT * FROM mytable WHERE id=1;
After running the query, cqlsh returns both the query results and the TRACE results.
Option 2: DevCenter 1.3+
For the GUI inclined (like me), DevCenter automatically runs a TRACE on every query in a tab behind the execution/results screen; you can see the formatted results there.

RDBMS data modelling
1. Design normalised tables
2. Define SQL queries
3. Build consuming application
JOINS -> normalize tables -> queries last

Cassandra data modelling
1. Define CQL queries
2. Design de-normalised tables for each query
3. Build consuming application
no JOINS -> queries first -> then denormalize tables

Data modelling use-case #1 - Music data

Q1
CREATE TABLE performers_by_style (
    style TEXT,
    name TEXT,
    PRIMARY KEY (style, name)
)
WITH CLUSTERING ORDER BY (name ASC);
(partition_key:style1)
(column=name1:, value=, timestamp=1357866010549000)
(partition_key:style2)
SELECT * FROM performers_by_style WHERE style='rock';

Q2
CREATE TABLE performer (
    name TEXT,
    type TEXT,
    country TEXT,
    style LIST<TEXT>,
    founded INT,
    born INT,
    died TEXT,
    PRIMARY KEY (name)
);
SELECT * FROM performer WHERE name='someName';
(partition_key:someName)
(column=type, value=, timestamp=1357866010549000)
...

Q3
CREATE TABLE album (
title TEXT,
year INT,
performer TEXT,
genre TEXT,
tracks map<INT,TEXT>,
PRIMARY KEY((title,year))
);
(partition_key:myTitle:2014)
(column=performer, value=Blondie, timestamp=1357866010549000)
(column=genre, value=rock, timestamp=1357866010549000)
(column=tracks, value={1:track1, 2:track2, 3:track3}, timestamp=1357866010549000)
(partition_key:title56:1999)
...
SELECT * FROM album WHERE title='myTitle' AND year=2014;

Q4
CREATE TABLE albums_by_performer (
performer TEXT,
year INT,
title TEXT,
genre TEXT,
PRIMARY KEY(performer, year, title)
)
WITH CLUSTERING ORDER BY (year DESC, title ASC);
SELECT * FROM albums_by_performer WHERE performer='myPerformer';
(partition_key:myPerformer)
(column=year1:, value=, timestamp=1357866010549000)
(column=title1:, value=, timestamp=1357866010549000)
(column=genre, value=rock, timestamp=1357866010549000)
(partition_key:performer2)

Q5
CREATE TABLE albums_by_genre (
genre TEXT,
performer TEXT,
year INT,
title TEXT,
PRIMARY KEY(genre, performer, year, title)
)
WITH CLUSTERING ORDER BY (performer ASC, year DESC, title ASC);
(partition_key:myGenre)
(column=performer1, value=, timestamp=1357866010549000)
(column=year1, value=, timestamp=1357866010549000)
(column=title1, value=, timestamp=1357866010549000)
(partition_key:genre2)
SELECT * FROM albums_by_genre WHERE genre='myGenre';

Q6
CREATE TABLE albums_by_track (
track TEXT,
performer TEXT,
year INT,
title TEXT,
PRIMARY KEY(track, performer, year, title)
)
WITH CLUSTERING ORDER BY (performer ASC, year DESC, title ASC);
(partition_key:myTrack)
(column=performer1, value=, timestamp=1357866010549000)
(column=year1:, value=, timestamp=1357866010549000)
(column=title1, value=, timestamp=1357866010549000)
(partition_key:track2)
SELECT * FROM albums_by_track WHERE track='myTrack';

Q7
CREATE TABLE tracks_by_album (
album TEXT,
year INT,
number INT,
performer TEXT,
genre TEXT,
title TEXT,
PRIMARY KEY((album, year), number)
)
WITH CLUSTERING ORDER BY (number ASC);
(partition_key:myAlbum:2014)
(column=number1, value=, timestamp=1357866010549000)
(column=performer, value=performer1, timestamp=1357866010549000)
(column=genre, value=genre1, timestamp=1357866010549000)
(column=title, value=title1, timestamp=1357866010549000)
(column=number2, value=, timestamp=1357866010549000)
(column=performer, value=performer1, timestamp=1357866010549000)
(column=genre, value=genre1, timestamp=1357866010549000)
(column=title, value=title1, timestamp=1357866010549000)
SELECT title, year FROM tracks_by_album WHERE album='myAlbum' AND year=2014;
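The price of one table per query is that a single logical write fans out to several tables. The sketch below shows that fan-out for one album, using the column names from the Q3 to Q7 tables above. It only builds plain dictionaries; the function name `rows_for_album` and the sample album are made up for the illustration, and a real application would turn each row into an INSERT, typically through a driver.

```python
def rows_for_album(title, year, performer, genre, tracks):
    """Return {table name: [row, ...]} for one album.

    tracks maps track number -> track title, as in the album table's map column.
    """
    return {
        "album": [{"title": title, "year": year, "performer": performer,
                   "genre": genre, "tracks": dict(tracks)}],
        "albums_by_performer": [{"performer": performer, "year": year,
                                 "title": title, "genre": genre}],
        "albums_by_genre": [{"genre": genre, "performer": performer,
                             "year": year, "title": title}],
        # One row per track, so an album can be found from any of its tracks.
        "albums_by_track": [{"track": name, "performer": performer,
                             "year": year, "title": title}
                            for _number, name in sorted(tracks.items())],
        # One row per track number inside the (album, year) partition.
        "tracks_by_album": [{"album": title, "year": year, "number": number,
                             "performer": performer, "genre": genre, "title": name}
                            for number, name in sorted(tracks.items())],
    }


if __name__ == "__main__":
    writes = rows_for_album("myTitle", 2014, "Blondie", "rock",
                            {1: "track1", 2: "track2", 3: "track3"})
    for table, rows in writes.items():
        print(table, len(rows), "row(s)")
```

The same facts are deliberately stored several times, once in the shape each query needs.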

Cassandra is not an RDBMS. Cassandra is vastly more powerful than any RDBMS in existence, with the proven ability in production to run clusters of 1000+ nodes.
But as a Cassandra Data Modeller you need to *think different*: you need to think distributed and denormalized, and ultimately you need to ask the question:
“Where is my data?”
