Cassandra Data Modeling
Presentation Transcript

  • Cassandra Data Modeling Workshop
    Matthew F. Dennis // @mdennis
  • Overview
    ● Hopefully interactive
    ● Use cases submitted via Google Moderator, email, IRC, etc.
    ● Interesting and/or common requests are in the slides to get us started
    ● Bring up others if you have them!
  • Data Modeling Goals
    ● Keep data queried together on disk together
    ● In a more general sense, think about the efficiency of querying your data and work backward from there to a model in Cassandra
    ● Don't try to normalize your data (contrary to many use cases in relational databases)
    ● It is usually better to keep a record that something happened than to change a value (not always advisable or possible)
  • ClickStream Data (use case #1)
    ● A ClickStream (in this context) is the sequence of actions a user of an application performs
    ● Usually this refers to clicking links in a WebApp
    ● Useful for ad selection, error recording, UI/UX improvement, A/B testing, debugging, et cetera
    ● Not a lot of detail in the Google Moderator request on what the purpose of collecting the ClickStream data was, so I made some up
  • ClickStream Data Defined
    ● Record the actions of a user within a session, for debugging purposes if the app/browser/page/server crashes
  • Recording Sessions
    ● CF for the sessions a user has had
      ● Row Key is user name/id
      ● Column Name is session id (TimeUUID)
      ● Column Value is empty (or the length of the session, or some aggregated details about the session after it ended)
    ● CF for the actual sessions
      ● Row Key is the TimeUUID session id
      ● Column Name is the timestamp/TimeUUID of each click
      ● Column Value is details about that click (serialized)
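A minimal sketch of what these writes might look like with pycassa (the Thrift-era Python client); the keyspace name, column family setup, and helper names are illustrative assumptions, not part of the original deck:

    import json
    import uuid

    import pycassa

    # Assumes a keyspace "ClickStream" with two CFs, UserSessions and Sessions,
    # both with TimeUUIDType comparators; Sessions is assumed to use
    # key_validation_class TimeUUIDType so uuid objects work as row keys.
    pool = pycassa.ConnectionPool('ClickStream', ['localhost:9160'])
    user_sessions = pycassa.ColumnFamily(pool, 'UserSessions')
    sessions = pycassa.ColumnFamily(pool, 'Sessions')

    def start_session(user_id):
        session_id = uuid.uuid1()                        # TimeUUID session id
        user_sessions.insert(user_id, {session_id: ''})  # empty value (or aggregates later)
        return session_id

    def record_click(session_id, click_details):
        # column name is a TimeUUID per click, value is the serialized details
        sessions.insert(session_id, {uuid.uuid1(): json.dumps(click_details)})

    session = start_session('user42')
    record_click(session, {'page': '/home', 'action': 'click', 'target': 'signup'})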
  • UserSessions Column Family

                 Session_01 (TimeUUID)   Session_02 (TimeUUID)   Session_03 (TimeUUID)
        userId   (empty/agg)             (empty/agg)             (empty/agg)

    Queries this model supports:
    ● Most recent session
    ● All sessions for a given time period
  • Sessions Column Family

                               timestamp_01     timestamp_02     timestamp_03
        SessionId (TimeUUID)   ClickData        ClickData        ClickData
                               (json/xml/etc)   (json/xml/etc)   (json/xml/etc)

    ● Retrieve an entire session's ClickStream (one row)
    ● Order of clicks/events is preserved
    ● Retrieve the ClickStream for a slice of time within the session
    ● First action taken in a session
    ● Most recent action taken in a session
    ● Why JSON/XML/etc?
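The listed query patterns all map onto slice reads. Continuing the hypothetical pycassa sketch above:

    from datetime import datetime, timedelta

    from pycassa.util import convert_time_to_uuid

    # Most recent session for a user: reversed slice of one column
    latest_session = user_sessions.get('user42', column_count=1, column_reversed=True)

    # Entire ClickStream for a session, in click order (the whole row)
    stream = sessions.get(session, column_count=10000)

    # ClickStream for a slice of time within the session
    start = convert_time_to_uuid(datetime.utcnow() - timedelta(minutes=5), lowest_val=True)
    end = convert_time_to_uuid(datetime.utcnow(), lowest_val=False)
    recent = sessions.get(session, column_start=start, column_finish=end)

    # First and most recent actions taken in the session
    first_action = sessions.get(session, column_count=1)
    last_action = sessions.get(session, column_count=1, column_reversed=True)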
  • Alternatives?
  • Of Course (depends on what you want to do)
    ● Secondary Indexes
    ● All Sessions in one row
    ● Track by time of activity instead of session
  • Secondary Indexes Applied
    ● Drop the UserSessions CF and use secondary indexes
    ● Uses a “well known” column to record the user in the row; a secondary index is created on that column
    ● Doesn't work so well when storing aggregates about sessions in the UserSessions CF
    ● Better when you want to retrieve all sessions a user has had
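In pycassa terms the secondary-index variant might look like the sketch below; the well-known column name 'user_id' and the CF layout are assumptions:

    import pycassa
    from pycassa.index import create_index_clause, create_index_expression

    pool = pycassa.ConnectionPool('ClickStream', ['localhost:9160'])
    sessions = pycassa.ColumnFamily(pool, 'Sessions')

    # Assumes each Sessions row carries a well-known 'user_id' column with a
    # secondary index declared on it (which in turn implies a comparator that
    # permits that named column alongside the click columns).
    expr = create_index_expression('user_id', 'user42')
    clause = create_index_clause([expr], count=100)
    for row_key, columns in sessions.get_indexed_slices(clause):
        print(row_key)  # one row per session user42 has had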
  • All Sessions In One Row Applied
    ● Row Key is userId
    ● Column Name is a composite of timestamp and sessionId
    ● Can efficiently request the activity of a user across all sessions within a specific time range
    ● Rows could potentially grow quite large; be careful
    ● Reads will almost always require at least two seeks on disk
  • Time Period Partitioning Applied
    ● Row Key is a composite of userId and a time “bucket”
      ● e.g. jan_2011 or jan_01_2011 for month or day buckets respectively
    ● Column Name is the TimeUUID of the click
    ● Column Value is the serialized click data
    ● Avoids always requiring multiple seeks when the user has old data but only recent data is requested
    ● Easy to lazily aggregate old activity
    ● Can still efficiently request the activity of a user across all sessions within a specific time range
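A sketch of the bucketed row key; the CF name, bucket format, and helper names are hypothetical:

    import json
    import uuid
    from datetime import datetime

    import pycassa

    pool = pycassa.ConnectionPool('ClickStream', ['localhost:9160'])
    clicks = pycassa.ColumnFamily(pool, 'ClicksByPeriod')  # hypothetical CF

    def bucket_key(user_id, when):
        # month buckets, e.g. "user42:jan_2011"; a day bucket would use '%b_%d_%Y'
        return '%s:%s' % (user_id, when.strftime('%b_%Y').lower())

    def record_click(user_id, click_details):
        clicks.insert(bucket_key(user_id, datetime.utcnow()),
                      {uuid.uuid1(): json.dumps(click_details)})

    # A time range that spans buckets reads one row per bucket and
    # concatenates the slices client-side.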
  • Rolling Time Window Of Data Points (use case #2)
    ● RRDTool was the example given; this is similar
    ● Essentially, store a series of data points within a rolling window
    ● A common request from Cassandra users, for this or something similar
  • Data Points Defined
    ● Each data point has a value (or multiple values)
    ● Each data point corresponds to a specific point in time or an interval/bucket (e.g. the 5th minute of the 17th hour on some date)
  • Time Window Model

                                     TimeUUID0   TimeUUID1   TimeUUID2
        s7:rt (System7:RenderTime)   0.051       0.014       0.173

        (some request took 0.014 seconds to render)

    ● Row Key is the id of the time window data you are tracking (e.g. server7:render_time)
    ● Column Name is the timestamp (or TimeUUID) the event occurred at
    ● Column Value is the value of the event (e.g. 0.051)
  • The Details
    ● Cassandra TTL values are key here
      ● When you insert each data point, set the TTL to the max time range you will ever request; there is very little overhead to expiring columns
    ● When querying, construct TimeUUIDs for the min/max of the time range in question and use them as the start/end in your get_slice call
    ● Consider partitioning the rows by a known time period (e.g. “year”) if you plan on keeping a long history of data (NB: this requires slightly more complex logic in the app if a time range spans such a period)
    ● Very efficient queries for any window of time
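A sketch of both halves (the write with a TTL, then a windowed read) in pycassa; the keyspace, CF name, and the 7-day maximum window are assumptions:

    import uuid
    from datetime import datetime, timedelta

    import pycassa
    from pycassa.util import convert_time_to_uuid

    pool = pycassa.ConnectionPool('Metrics', ['localhost:9160'])
    data_points = pycassa.ColumnFamily(pool, 'DataPoints')  # TimeUUIDType comparator

    MAX_WINDOW = timedelta(days=7)  # the largest range we will ever request

    def record(row_key, value):
        # TTL = the max window; expired columns cost very little
        data_points.insert(row_key, {uuid.uuid1(): str(value)},
                           ttl=int(MAX_WINDOW.total_seconds()))

    def window(row_key, start_dt, end_dt):
        # TimeUUIDs for the min/max of the range, used as the slice start/end
        start = convert_time_to_uuid(start_dt, lowest_val=True)
        end = convert_time_to_uuid(end_dt, lowest_val=False)
        return data_points.get(row_key, column_start=start, column_finish=end,
                               column_count=100000)

    record('server7:render_time', 0.051)
    now = datetime.utcnow()
    last_hour = window('server7:render_time', now - timedelta(hours=1), now)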
  • Rolling Window Of Counters (use case #3)
    ● “How to model rolling time window that contains counters with time buckets of monthly (12 months), weekly (4 weeks), daily (7 days), hourly (24 hours)? Example would be; how many times user logged into a system in last 24 hours, last 7 days ...”
    ● Timezones and the “rolling window” are what make this interesting
  • Rolling Time Window Details
    ● One row for every granularity you want to track (e.g. day, hour)
    ● Row Key consists of the granularity, metric, user and system
    ● Column Name is a “fixed” time bucket on UTC time
    ● Column Values are counts of the logins in that bucket
    ● get_slice calls return multiple counters, which are then summed up
  • Rolling Time Window Counter Model

        row key: user3:system5:logins:by_day (abbreviated U3:S5:L:D)

                     20110107   ...   20110523
        U3:S5:L:D    2          ...   7

        (2 logins on Jan 7th 2011 and 7 logins on May 23rd 2011,
         for user 3 on system 5)

        row key: user3:system5:logins:by_hour (abbreviated U3:S5:L:H)

                     2011010710   ...   2011052316
        U3:S5:L:H    1            ...   2

        (one login in the 10th hour of Jan 7th 2011 and 2 logins in the
         16th hour of May 23rd 2011, for user 3 on system 5)
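Recording a login is then one counter increment per granularity; a sketch assuming counter column families named LoginsByDay and LoginsByHour (default_validation_class CounterColumnType), with bucket formats matching the model above:

    from datetime import datetime

    import pycassa

    pool = pycassa.ConnectionPool('Logins', ['localhost:9160'])
    by_day = pycassa.ColumnFamily(pool, 'LoginsByDay')
    by_hour = pycassa.ColumnFamily(pool, 'LoginsByHour')

    def record_login(user_id, system_id, when=None):
        when = when or datetime.utcnow()                # buckets are on UTC time
        base = '%s:%s:logins' % (user_id, system_id)
        by_day.add(base + ':by_day', when.strftime('%Y%m%d'))      # e.g. 20110523
        by_hour.add(base + ':by_hour', when.strftime('%Y%m%d%H'))  # e.g. 2011052316

    record_login('user3', 'system5')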
  • Rolling Time Window Queries
    ● The time window is rolling and there are other timezones besides UTC, so:
      ● one get_slice for the “middle” counts
      ● one get_slice for the “left end”
      ● one get_slice for the “right end”
  • Example: logins for the past 7 days
    ● Determine the date/time boundaries
    ● Determine the UTC days that are wholly contained within your boundaries; select and sum those day counters
    ● Select and sum the counters for the remaining hours on either side of those UTC days
    ● O(1) queries (3 in this case), which can be requested from C* in parallel
    ● NB: some timezones are annoying (e.g. 15- or 30-minute offsets); I try to ignore them
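A sketch of the three-slice sum, continuing the counter sketch above; the total helper and boundary handling are illustrative, and both boundary hours are counted in full, so the result is accurate to hour granularity:

    from datetime import datetime, timedelta

    import pycassa

    def total(cf, row_key, start_col, end_col):
        # sum one contiguous slice of counter columns; empty slices count as 0
        try:
            return sum(cf.get(row_key, column_start=start_col,
                              column_finish=end_col).values())
        except pycassa.NotFoundException:
            return 0

    def logins_last_7_days(user_id, system_id, end_utc):
        start_utc = end_utc - timedelta(days=7)
        base = '%s:%s:logins' % (user_id, system_id)
        day_start = datetime(start_utc.year, start_utc.month, start_utc.day)
        left_end = day_start + timedelta(days=1)     # end of the first partial day
        right_start = datetime(end_utc.year, end_utc.month, end_utc.day)

        # left end: the remaining hours of the first (partial) UTC day
        left = total(by_hour, base + ':by_hour',
                     start_utc.strftime('%Y%m%d%H'),
                     (left_end - timedelta(hours=1)).strftime('%Y%m%d%H'))
        # middle: UTC days wholly contained in the window
        middle = total(by_day, base + ':by_day',
                       left_end.strftime('%Y%m%d'),
                       (right_start - timedelta(days=1)).strftime('%Y%m%d'))
        # right end: the hours of the last (partial) UTC day
        right = total(by_hour, base + ':by_hour',
                      right_start.strftime('%Y%m%d%H'),
                      end_utc.strftime('%Y%m%d%H'))
        return left + middle + right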
  • Alternatives? (of course)
    ● If you're counting logins and each user doesn't log in hundreds of times a day, just have one row per user with a TimeUUID column name for the time the login occurred
    ● Supports any timezone/range/granularity easily
    ● More expensive for large ranges (e.g. a year) regardless of granularity, so cache results (in C*) lazily
    ● NB: caching results for rolling windows is not usually helpful (because, well, it's rolling and always changes)
  • Eventually Atomic (use case #4)
    ● “When there are many to many or one to many relations involved how to model that and also keep it atomic? for eg: one user can upload many pictures and those pictures can somehow be related to other users as well.”
    ● Attempting full ACID compliance in distributed systems is a bad idea (and impossible in the general sense)
    ● However, consistency is important and can certainly be achieved in C*
    ● There are many approaches/alternatives
    ● I like the transaction log approach, especially in the context of C*
  • Transaction Logs (in this context)
    ● Records what is going to be performed before it is actually performed
    ● Performs the actions that need to be atomic (in the indivisible sense, not the all-at-once sense)
    ● Marks that the actions were performed
  • In Cassandra
    ● Serialize all actions that need to be performed into a single column; JSON, XML, YAML (yuck!), cPickle, et cetera
      ● Row Key = a randomly chosen C* node token
      ● Column Name = TimeUUID
    ● Perform the actions
    ● Delete the column
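A sketch of the write / perform / delete cycle; the XACT_LOG layout, the token list, and the apply_action helper are all assumptions for illustration:

    import json
    import random
    import uuid

    import pycassa

    pool = pycassa.ConnectionPool('MyApp', ['localhost:9160'])
    xact_log = pycassa.ColumnFamily(pool, 'XACT_LOG')  # TimeUUIDType comparator

    # stand-in for "randomly chosen C* node token" (a 2-node example ring)
    NODE_TOKENS = ['0', '85070591730234615865843651857942052864']

    def apply_action(action):
        # hypothetical helper that performs one idempotent write
        pass

    def eventually_atomic(actions):
        row_key = random.choice(NODE_TOKENS)
        col = uuid.uuid1()
        # 1. durably record what we are about to do
        xact_log.insert(row_key, {col: json.dumps(actions)},
                        write_consistency_level=pycassa.ConsistencyLevel.QUORUM)
        # 2. perform the (idempotent) actions
        for action in actions:
            apply_action(action)
        # 3. mark the actions as performed
        xact_log.remove(row_key, columns=[col])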
  • Configuration Details
    ● Short GC_Grace on the XACT_LOG Column Family (e.g. 1 hour)
    ● Write to XACT_LOG at CL.QUORUM or CL.LOCAL_QUORUM for durability (if the write fails with an unavailable exception, pick a different node token and/or node and try again; these are the same semantics as a traditional relational DB)
    ● 1M memtable ops, 1 hour memtable flush time
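The GC grace setting can be applied when the CF is created; a sketch with pycassa's SystemManager (the keyspace name is assumed, and the memtable settings would be set the same way via the CF attributes for your Cassandra version):

    from pycassa.system_manager import SystemManager, TIME_UUID_TYPE

    sys_mgr = SystemManager('localhost:9160')
    sys_mgr.create_column_family('MyApp', 'XACT_LOG',
                                 comparator_type=TIME_UUID_TYPE,
                                 gc_grace_seconds=3600)  # short GC grace: 1 hour
    sys_mgr.close()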
  • Failures
    ● Before the insert into the XACT_LOG
    ● After the insert, before the actions
    ● After the insert, in the middle of the actions
    ● After the insert, after the actions, before the delete
    ● After the insert, after the actions, after the delete
  • Recovery
    ● Each C* node has a cron job, offset from every other by some time period
    ● Each job runs the same code: a multiget_slice for all node tokens, for all columns older than some time period
    ● Any columns found need to be replayed in their entirety and are deleted after replay (normally there are no columns, because normally things are working normally)
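A sketch of the sweep each node's cron job might run, continuing the XACT_LOG sketch above (the half-hour replay age is an assumed value):

    from datetime import datetime, timedelta

    from pycassa.util import convert_time_to_uuid

    def recover(max_age=timedelta(minutes=30)):
        cutoff = convert_time_to_uuid(datetime.utcnow() - max_age, lowest_val=False)
        # one multiget covering every node token, columns older than the cutoff
        stale = xact_log.multiget(NODE_TOKENS, column_finish=cutoff)
        for row_key, columns in stale.items():
            for col, serialized in columns.items():
                for action in json.loads(serialized):
                    apply_action(action)  # idempotent, so replay is safe
                xact_log.remove(row_key, columns=[col])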
  • XACT_LOG Comments
    ● Idempotent writes are awesome (that's why this works so well)
    ● Doesn't work so well for counters (they're not idempotent)
    ● Clients must be able to deal with temporarily inconsistent data (they have to do this anyway)
    ● Could use a reliable queuing service (e.g. SQS) instead of polling: push to SQS first, then to the XACT log
  • Q?
    Cassandra Data Modeling Workshop
    Matthew F. Dennis // @mdennis