Cassandra Data Modeling Workshop

Matthew F. Dennis // @mdennis

Overview
● Hopefully interactive
● Use cases submitted via Google Moderator,
email, IRC, etc.
● Interesting and/or common requests in the
slides to get us started
● Bring up others if you have them!

Data Modeling Goals
● Keep data queried together on disk together
● In a more general sense think about the
efficiency of querying your data and work
backward from there to a model in Cassandra
● Don't try to normalize your data (contrary to
many use cases in relational databases)
● Usually better to keep a record that something
happened as opposed to changing a value (not
always advisable or possible)

ClickStream Data
(use case #1)

● A ClickStream (in this context) is the sequence
of actions a user of an application performs
● Usually this refers to clicking links in a WebApp
● Useful for ad selection, error recording, UI/UX
improvement, A/B testing, debugging, et cetera
● Not a lot of detail in the Google Moderator
request on what the purpose of collecting the
ClickStream data was – so I made some up

ClickStream Data Defined
● Record actions of a user within a session for
debugging purposes if app/browser/page/server
crashes

Recording Sessions
● CF for sessions a user has had
● Row Key is user name/id
● Column Name is session id (TimeUUID)
● Column Value is empty (or length of session, or some
aggregated details about the session after it ended)
● CF for actual sessions
● Row Key is TimeUUID session id
● Column Name is timestamp/TimeUUID of each click
● Column Value is details about that click (serialized)

UserSessions Column Family
            Session_01    Session_02    Session_03
            (TimeUUID)    (TimeUUID)    (TimeUUID)
userId
            (empty/agg)   (empty/agg)   (empty/agg)

● Most recent session
● All sessions for a given time period

Sessions Column Family
             timestamp_01     timestamp_02     timestamp_03
SessionId
(TimeUUID)   ClickData        ClickData        ClickData
             (json/xml/etc)   (json/xml/etc)   (json/xml/etc)

● Retrieve entire session's ClickStream (row)
● Order of clicks/events preserved
● Retrieve ClickStream for a slice of time within the session
● First action taken in a session
● Most recent action taken in a session
● Why JSON/XML/etc?
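
The two column families and the queries listed above can be sketched in memory. This is a toy stand-in written for illustration, not Cassandra client code: the `ColumnFamily` class, its `get_slice` signature, the `time_uuid` helper, the `user42` key and the click payloads are all assumptions, and timestamps are tiny millisecond values chosen to keep the example readable.

```python
import uuid

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def time_uuid(unix_ms, clock_seq=0, node=0):
    """Build a version 1 (time-based) UUID for a Unix timestamp in milliseconds."""
    ticks = unix_ms * 10_000 + UUID_EPOCH_OFFSET
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,                  # time_low
        (ticks >> 32) & 0xFFFF,              # time_mid
        ((ticks >> 48) & 0x0FFF) | 0x1000,   # time_hi with version 1
        ((clock_seq >> 8) & 0x3F) | 0x80,    # clock_seq_hi with RFC 4122 variant
        clock_seq & 0xFF,                    # clock_seq_low
        node,
    ))


class ColumnFamily:
    """Hypothetical stand-in: row key -> {column name (TimeUUID) -> value}."""

    def __init__(self):
        self.rows = {}

    def insert(self, row_key, column, value):
        self.rows.setdefault(row_key, {})[column] = value

    def get_slice(self, row_key, start=None, end=None, reverse=False, count=None):
        """Columns of one row ordered by TimeUUID timestamp, bounds inclusive."""
        cols = sorted(self.rows.get(row_key, {}).items(),
                      key=lambda kv: (kv[0].time, kv[0].bytes))
        cols = [(c, v) for c, v in cols
                if (start is None or c.time >= start.time)
                and (end is None or c.time <= end.time)]
        if reverse:
            cols.reverse()
        return cols[:count]


user_sessions = ColumnFamily()   # row key: user id, column name: session id
sessions = ColumnFamily()        # row key: session id, column name: click time

s1, s2 = time_uuid(1_000), time_uuid(5_000)
user_sessions.insert("user42", s1, "")
user_sessions.insert("user42", s2, "")
for ms, click in [(5_100, '{"page": "/home"}'),
                  (5_200, '{"page": "/cart"}'),
                  (5_300, '{"page": "/pay"}')]:
    sessions.insert(s2, time_uuid(ms), click)

# Most recent session, then first action, most recent action and a time slice.
latest_session = user_sessions.get_slice("user42", reverse=True, count=1)[0][0]
first_click = sessions.get_slice(latest_session, count=1)[0][1]
last_click = sessions.get_slice(latest_session, reverse=True, count=1)[0][1]
middle = sessions.get_slice(latest_session,
                            start=time_uuid(5_150), end=time_uuid(5_250))
```

Because column names sort by time, "most recent" and "first" are just a one-column slice from either end of the row.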

Of Course
(depends on what you want to do)
● Secondary Indexes
● All Sessions in one row
● Track by time of activity instead of session

Secondary Indexes Applied
● Drop UserSessions CF and use secondary
indexes
● Uses a “well known” column to record the user
in the row; secondary index is created on that
column
● Doesn't work so well when storing aggregates
about sessions in the UserSessions CF
● Better when you want to retrieve all sessions a
user has had

All Sessions In One Row Applied
● Row Key is userId
● Column Name is composite of timestamp and
sessionId
● Can efficiently request activity of a user across
all sessions within a specific time range
● Rows could potentially grow quite large, be
careful
● Reads will almost always require at least two
seeks on disk

Time Period Partitioning Applied
● Row Key is composite of userId and time “bucket”
● e.g. jan_2011 or jan_01_2011 for month or day buckets respectively
● Column Name is TimeUUID of click
● Column Value is serialized click data
● Avoids always requiring multiple seeks when the user has old
data but only recent data is requested
● Easy to lazily aggregate old activity
● Can still efficiently request activity of a user across all
sessions within a specific time range
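
A minimal sketch of how the bucketed row keys might be built, assuming month buckets named in the jan_2011 style above. The ":" separator between userId and bucket, the function names and the `user42` id are assumptions made for illustration.

```python
from datetime import datetime

# Spelled out rather than taken from strftime("%b"), which depends on locale.
MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]


def row_key(user_id, ts):
    """Row key for the month bucket that contains ts."""
    return f"{user_id}:{MONTHS[ts.month - 1]}_{ts.year}"


def row_keys_for_range(user_id, start, end):
    """Row keys for every month bucket touched by [start, end], oldest first."""
    keys = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        keys.append(f"{user_id}:{MONTHS[month - 1]}_{year}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return keys


write_key = row_key("user42", datetime(2011, 1, 15, 9, 30))
read_keys = row_keys_for_range("user42",
                               datetime(2010, 11, 20), datetime(2011, 1, 5))
```

A request for recent activity touches only the newest bucket; a range that spans buckets reads one row per bucket and concatenates the slices.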

Rolling Time Window Of Data Points
(use case #2)
● The example given was something similar to RRDTool
● Essentially store a series of data points within a
rolling window
● Common request from Cassandra users for this
and/or similar

Data Points Defined
● Each data point has a value (or multiple values)
● Each data point corresponds to a specific point
in time or an interval/bucket (e.g. 5th minute of
the 17th hour on some date)

Time Window Model
System7:RenderTime

          TimeUUID0   TimeUUID1   TimeUUID2
s7:rt     0.051       0.014       0.173

TimeUUID1: some request took 0.014 seconds to render

● Row Key is the id of the time window data you are
tracking (e.g. server7:render_time)
● Column Name is timestamp (or TimeUUID) the event
occurred at
● Column Value is the value of the event (e.g. 0.051)

The Details
● Cassandra TTL values are key here
● When you insert each data point set the TTL to the max time
range you will ever request; there is very little overhead to
expiring columns
● When querying, construct TimeUUIDs for the min/max of
the time range in question and use them as the start/end
in your get_slice call
● Consider partitioning the rows by a known time period
(e.g. “year”) if you plan on keeping a long history of data
(NB: requires slightly more complex logic in the app if a
time range spans such a period)
● Very efficient queries for any window of time
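
The min/max TimeUUID construction and the effect of the TTL can be sketched as follows. This is an in-memory illustration under stated assumptions, not client code: `build_time_uuid`, `min_time_uuid`, `max_time_uuid`, `insert` and this `get_slice` are hypothetical helpers, expiry is simulated by storing an expiry time next to each value, and ordering ties within the same timestamp are broken by raw bytes, which may differ from Cassandra's own comparator.

```python
import uuid

# 100 ns ticks between the UUID epoch (1582-10-15) and the Unix epoch.
UUID_EPOCH_OFFSET = 0x01B21DD213814000


def build_time_uuid(ticks, clock_seq, node):
    """Assemble a version 1 UUID from a 60-bit tick count, clock_seq and node."""
    return uuid.UUID(fields=(
        ticks & 0xFFFFFFFF,
        (ticks >> 32) & 0xFFFF,
        ((ticks >> 48) & 0x0FFF) | 0x1000,
        ((clock_seq >> 8) & 0x3F) | 0x80,
        clock_seq & 0xFF,
        node,
    ))


def min_time_uuid(unix_ms):
    """Smallest version 1 UUID for the given millisecond."""
    return build_time_uuid(unix_ms * 10_000 + UUID_EPOCH_OFFSET, 0, 0)


def max_time_uuid(unix_ms):
    """Largest version 1 UUID that still falls inside the given millisecond."""
    return build_time_uuid(unix_ms * 10_000 + 9_999 + UUID_EPOCH_OFFSET,
                           0x3FFF, 0xFFFFFFFFFFFF)


def sort_key(u):
    return (u.time, u.bytes)


def insert(row, unix_ms, value, ttl_ms):
    """Store a data point together with the time at which it expires."""
    name = build_time_uuid(unix_ms * 10_000 + UUID_EPOCH_OFFSET, 1, 1)
    row[name] = (value, unix_ms + ttl_ms)


def get_slice(row, start, end, now_ms):
    """Unexpired values with start <= column name <= end, oldest first."""
    live = [(name, value) for name, (value, expires) in row.items()
            if expires > now_ms and sort_key(start) <= sort_key(name) <= sort_key(end)]
    return [value for _, value in sorted(live, key=lambda nv: sort_key(nv[0]))]


row = {}  # stands in for the row with key server7:render_time
for ms, seconds in [(1_000, 0.051), (2_000, 0.014), (3_000, 0.173)]:
    insert(row, ms, seconds, ttl_ms=2_500)

recent = get_slice(row, min_time_uuid(1_500), max_time_uuid(3_000), now_ms=3_000)
after_expiry = get_slice(row, min_time_uuid(0), max_time_uuid(3_000), now_ms=3_600)
```

The first query excludes the oldest point by range; the second asks for everything, but the oldest point has already expired, so the window rolls forward without any explicit deletes.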

Rolling Window Of Counters
(use case #3)
● “How to model rolling time window that contains counters with time
buckets of monthly (12 months), weekly (4 weeks), daily (7 days),
hourly (24 hours)? Example would be; how many times user logged
into a system in last 24 hours, last 7 days ...”
● Timezones and the “rolling window” are what make this interesting

Rolling Time Window Details
● One row for every granularity you want to track
(e.g. day, hour)
● Row Key consists of the granularity, metric, user
and system
● Column Name is a “fixed” time bucket on UTC time
● Column Values are counts of the logins in that
bucket
● get_slice calls to return multiple counters which
are then summed up
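
A minimal in-memory sketch of the write side, assuming row keys of the form user:system:metric:granularity and column names of the form YYYYMMDD or YYYYMMDDHH on UTC time. The `rows` structure and the `record_login` and `sum_slice` helpers are hypothetical stand-ins for counter increments and get_slice.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta, timezone


def row_key(user, system, metric, granularity):
    return f"{user}:{system}:{metric}:{granularity}"


def record_login(rows, user, system, when):
    """Increment the day and hour counters for the UTC bucket containing `when`."""
    utc = when.astimezone(timezone.utc)
    rows[row_key(user, system, "logins", "by_day")][utc.strftime("%Y%m%d")] += 1
    rows[row_key(user, system, "logins", "by_hour")][utc.strftime("%Y%m%d%H")] += 1


def sum_slice(rows, key, start, end):
    """Sum the counters whose column name lies in [start, end], bounds inclusive."""
    return sum(n for col, n in rows[key].items() if start <= col <= end)


rows = defaultdict(Counter)
eastern = timezone(timedelta(hours=-5))  # example fixed offset, an assumption

record_login(rows, "user3", "system5", datetime(2011, 1, 7, 10, 15, tzinfo=timezone.utc))
record_login(rows, "user3", "system5", datetime(2011, 1, 7, 23, 30, tzinfo=timezone.utc))
# 20:00 at UTC-5 is 01:00 UTC on the next day, so it lands in the 20110108 bucket.
record_login(rows, "user3", "system5", datetime(2011, 1, 7, 20, 0, tzinfo=eastern))

by_day = rows["user3:system5:logins:by_day"]
january = sum_slice(rows, "user3:system5:logins:by_day", "20110101", "20110131")
```

Fixed-width, zero-padded column names sort in time order as plain strings, which is what makes the inclusive slice bounds work.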

Rolling Time Window Counter Model
user3:system5:logins:by_day

             20110107   ...   20110523
U3:S5:L:D    2          ...   7

20110107: 2 logins on Jan 7th 2011 for user 3 on system 5
20110523: 7 logins on May 23rd 2011 for user 3 on system 5

user3:system5:logins:by_hour

             2011010710   ...   2011052316
U3:S5:L:H    1            ...   7

2011010710: one login for user 3 on system 5 on Jan 7th 2011 for the 10th hour
2011052316: 2 logins for user 3 on system 5 on May 23rd 2011 for the 16th hour

Rolling Time Window Queries
● Time window is rolling and there are other
timezones besides UTC
● one get_slice for the “middle” counts
● one get_slice for the “left end”
● one get_slice for the “right end”

Example: logins for the past 7 days
● Determine date/time boundaries
● Determine UTC days that are wholly contained
within your boundaries to select and sum
● Select and sum counters for the remaining hours
on either side of the UTC days
● O(1) queries (3 in this case), can be requested
from C* in parallel
● NB: some timezones are annoying (e.g. 15 or
30 minute offsets); I try to ignore them
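
The boundary arithmetic for the three queries can be sketched like this. It assumes the window boundaries are timezone-aware and aligned to the hour (so the 15 and 30 minute offsets are ignored here too) and that column names follow the YYYYMMDD / YYYYMMDDHH convention; `split_window` and its helpers are hypothetical names.

```python
from datetime import datetime, timedelta, timezone


def _columns(start, end, step, fmt):
    """Column names for each step in [start, end)."""
    names = []
    while start < end:
        names.append(start.strftime(fmt))
        start += step
    return names


def split_window(start, end):
    """Split [start, end) into (left hours, whole UTC days, right hours)."""
    start = start.astimezone(timezone.utc)
    end = end.astimezone(timezone.utc)
    hour, day = timedelta(hours=1), timedelta(days=1)

    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    first_day = midnight if midnight == start else midnight + day
    last_day_end = end.replace(hour=0, minute=0, second=0, microsecond=0)

    if first_day >= last_day_end:
        # No UTC day is wholly contained: hour counters cover the whole window.
        return _columns(start, end, hour, "%Y%m%d%H"), [], []
    return (_columns(start, first_day, hour, "%Y%m%d%H"),
            _columns(first_day, last_day_end, day, "%Y%m%d"),
            _columns(last_day_end, end, hour, "%Y%m%d%H"))


# "Past 7 days" for a user at UTC-5 whose local time is 14:00 on May 23rd 2011.
local = timezone(timedelta(hours=-5))
end = datetime(2011, 5, 23, 14, 0, tzinfo=local)
left, days, right = split_window(end - timedelta(days=7), end)
```

The first and last name in each list are the start and end of one slice: `left` and `right` go to the by_hour row and `days` to the by_day row, and the three sums are added together.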

Alternatives?
(of course)
● If you're counting logins and each user doesn't log
in hundreds of times a day, just have one row per
user with a TimeUUID column name for the time the
login occurred
● Supports any timezone/range/granularity easily
● More expensive for large ranges (e.g. year)
regardless of granularity, so cache results (in C*)
lazily.
● NB: caching results for rolling windows is not usually
helpful (because, well, it's rolling and always changes)

Eventually Atomic
(use case #4)
● “When there are many to many or one to many relations involved how
to model that and also keep it atomic? for eg: one user can upload
many pictures and those pictures can somehow be related to other
users as well.”
● Attempting full ACID compliance in distributed systems is a bad idea
(and impossible in the general sense)
● However, consistency is important and can certainly be achieved in
C*
● Many approaches / alternatives
● I like the transaction log approach, especially in the context of C*

Transaction Logs
(in this context)
● Records what is going to be performed before it
is actually performed
● Performs the actions that need to be atomic (in
the indivisible sense, not the all at once sense)
● Marks that the actions were performed

In Cassandra
● Serialize all actions that need to be performed
in a single column – JSON, XML, YAML (yuck!),
cpickle, JSO, et cetera
● Row Key = randomly chosen C* node token
● Column Name = TimeUUID
● Perform actions
● Delete Column

Configuration Details
● Short GC_Grace on the XACT_LOG Column
Family (e.g. 1 hour)
● Write to XACT_LOG at CL.QUORUM or
CL.LOCAL_QUORUM for durability (if it fails
with an unavailable exception, pick a different
node token and/or node and try again; same
semantics as a traditional relational DB)
● 1M memtable ops, 1 hour memtable flush time

Failures
● Before insert into the XACT_LOG
● After insert, before actions
● After insert, in middle of actions
● After insert, after actions, before delete
● After insert, after actions, after delete

Recovery
● Each C* node has a cron job, offset from every
other node's job by some time period
● Each job runs the same code: multiget_slice for
all node tokens for all columns older than some
time period
● Any columns found need to be replayed in their
entirety and are deleted after replay (normally
there are no columns because normally things
are working normally)
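
The log / perform / delete cycle and the recovery job can be sketched in memory. Everything here is a hypothetical stand-in: the `XactLog` class, the action format, the `crash_after` switch used to simulate a client dying mid-transaction, and plain integer timestamps in place of TimeUUID column names. The picture-upload actions mirror the question from the use case.

```python
import json


class XactLog:
    """Toy XACT_LOG: row key (node token) -> {column name (time) -> serialized actions}."""

    def __init__(self):
        self.rows = {}

    def write(self, token, ts, actions):
        self.rows.setdefault(token, {})[ts] = json.dumps(actions)

    def delete(self, token, ts):
        self.rows.get(token, {}).pop(ts, None)

    def older_than(self, cutoff):
        """All logged transactions, across all tokens, written before cutoff."""
        return [(token, ts, json.loads(blob))
                for token, cols in self.rows.items()
                for ts, blob in sorted(cols.items()) if ts < cutoff]


def apply_action(store, action):
    """Idempotent write: repeating it leaves the store unchanged."""
    store.setdefault(action["row"], {})[action["column"]] = action["value"]


def run_transaction(log, store, token, ts, actions, crash_after=None):
    log.write(token, ts, actions)              # 1. record what will be done
    for i, action in enumerate(actions):       # 2. perform the actions
        if crash_after is not None and i >= crash_after:
            return False                       # simulated failure; log entry stays
        apply_action(store, action)
    log.delete(token, ts)                      # 3. mark the actions as performed
    return True


def recover(log, store, cutoff):
    """What each cron job would do: replay old entries in full, then delete them."""
    replayed = 0
    for token, ts, actions in log.older_than(cutoff):
        for action in actions:
            apply_action(store, action)
        log.delete(token, ts)
        replayed += 1
    return replayed


log, store = XactLog(), {}
upload = [
    {"row": "pictures:pic9", "column": "owner", "value": "user1"},
    {"row": "user_pictures:user1", "column": "pic9", "value": ""},
    {"row": "user_pictures:user2", "column": "pic9", "value": "tagged"},
]
finished = run_transaction(log, store, "token_a", 100, upload, crash_after=1)
rows_before_recovery = sorted(store)
replayed = recover(log, store, cutoff=200)
```

Replaying the whole entry, including the action that already succeeded, is safe only because each write is idempotent; the same loop would double-count if the actions were counter increments.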

XACT_LOG Comments
● Idempotent writes are awesome (that's why this
works so well)
● Doesn't work so well for counters (they're not
idempotent)
● Clients must be able to deal with temporarily
inconsistent data (they have to do this anyway)
● Could use a reliable queuing service (e.g. SQS)
instead of polling – push to SQS first, then
XACT log.

Q?