Cassandra Paris Meetup, November 2013
Slides for the Paris Cassandra Meetup November 2013


Data Modeling Patterns with Cassandra

Presentation Transcript

  • Data Modeling Patterns with Cassandra
  • Duy Hai DOAN
    • Freelance Java/NoSQL developer
    • Passionate about C*, creator of Achilles
    • Working on
    • @doanduyhai
    • doanduyhai@gmail.com
    http://achilles.archinnov.info
  • STORAGE ENGINE
    • Core concepts
    • Query API
    • Composites
  • Table/Column Family
    • Physical storage
      - Row key = Partition Key
          Partition Key1 | Column1=cell1 | Column2=cell2 | Column3=cell3 | Column4=cell4
          Partition Key2 | Column1=cell1 | Column2=cell2 | Column3=cell3
          Partition Key3 | Column1=cell1 | Column2=cell2
          Partition Key4 | Column1=cell1 | Column2=cell2 | Column3=cell3 | Column4=cell4
          …
  • Row distribution
    • Random (Murmur3 hash)
      - Ring of nodes n1 … n8
      - Row Key1 (Column1 … Column4), Row Key2 (Column1 … Column3) and Row Key3 (Column1, Column2) are placed on the ring according to the hash of their key
    • OrderPreserving (bad)
  • Logical view
    • Map<PartitionKey,SortedMap<Column,Cell>>
    • Access by PartitionKey + Column is very fast
    • Query on a range of PartitionKey is extremely slow: avoid it
    • Query on a range of Column is very fast
          Partition Key1 | Column1=cell1 | Column2=cell2 | Column3=cell3 | Column4=cell4 | …
    • Values can be stored in the Column itself → value-less rows
          Partition Key4 | aaa=Ø | bbb=Ø | ccc=Ø | ddd=Ø | …
      - but a Column is limited to 64k
  • Queries
    • Slice query
      - For a given partition key k
        • Give the cell where column = n
        • List all columns and cells where column is
          - between n1 and n2
          - in ascending/descending order
          - limited to the first p results
        • Ex: k = friends, n1 = robin, n2 = null, descending, p = 3
              friends | alfred=Ø | batgirl=Ø | catwoman=Ø | lois_lane=Ø | robin=Ø | zatana=Ø
          The slice starts at n1 = robin and walks backwards: robin, lois_lane, catwoman (3 results)
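The slice query described above can be imitated on an in-memory sorted map. This is only a sketch of the semantics, not the storage engine's code: `slice_query`, its parameter names and the `friends` dictionary are illustrative, and it assumes (as in the slide's example) that for a descending slice the start column is the upper bound.

```python
from bisect import bisect_left, bisect_right

def slice_query(partition, start=None, finish=None, reverse=False, limit=None):
    """Return (column, cell) pairs of one partition between start and finish.

    `partition` maps column names to cells. A missing bound (None) means
    "open-ended". When reverse=True the slice walks from `start` downwards.
    """
    columns = sorted(partition)  # columns are kept sorted by the comparator
    low, high = (finish, start) if reverse else (start, finish)
    lo = 0 if low is None else bisect_left(columns, low)
    hi = len(columns) if high is None else bisect_right(columns, high)
    selected = columns[lo:hi]
    if reverse:
        selected.reverse()
    if limit is not None:
        selected = selected[:limit]
    return [(column, partition[column]) for column in selected]

# Value-less row from the slide: every cell is empty (None here).
friends = dict.fromkeys(
    ["alfred", "batgirl", "catwoman", "lois_lane", "robin", "zatana"])

last_three = slice_query(friends, start="robin", reverse=True, limit=3)
```

Because the columns are already sorted, both bounds are found by binary search, which is why a column range inside one partition is cheap.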
  • Queries
    • Multiget slice query
      - Slice query for a given set of partition keys {k1, k2, …}
      - ≈ n slice queries
      - Semantics:
        • Partition Key: k1, from column: n1, to column: n2
        • UNION
        • Partition Key: k2, from column: n1, to column: n2
        • UNION
        • …
  • Data types
    • Static definition
      - PartitionKey: key_validation_class
      - Column: comparator
      - Cell: default_validation_class
    • Dynamic
      - BytesType
      - UTF8Type (JSON serialization)
    • DDL creation script (Thrift)
          create column family test
            with key_validation_class = LongType
            and comparator = UUIDType
            and default_validation_class = UTF8Type;
  • Composites
    • n components for PartitionKey/Column:
      - Type1:Type2:…:Typen
      - Ordering by successive components
    • Ex: Composite(IntType,UTF8Type)
      - IntType: character category {1 → bad, 2 → friend, 3 → hero}
      - UTF8Type: character name
          index | 1:joker=Ø | 1:lex_luthor=Ø | … | 2:alfred=Ø | 2:batgirl=Ø | … | 3:batman=Ø | 3:flash=Ø
  • Queries with composite
          index | 1:joker=Ø | 1:lex_luthor=Ø | … | 2:alfred=Ø | 2:batgirl=Ø | … | 3:batman=Ø | 3:flash=Ø
    - Find characters where partition key = 'index'
      • and type=1 (bad) → {joker, lex_luthor, …}
      • and type=1 (bad) and name='joker' → {joker}
      • and type=3 (hero) and name>='batman' and name<='green_lantern' → {batman, flash}
      • and type>=1 (bad) and type<=2 (friend) → {joker, lex_luthor, …, alfred, batgirl, …}
    - Not answerable as a single slice, because the leading component (type) is not restricted:
      • and name='lex_luthor'
      • and name>='alfred' and name<='flash'
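The composite queries above can be reproduced with Python tuples, which also compare component by component. This is a rough sketch: `names_between` and the `MAX` sentinel are illustrative helpers, and the sentinel is only assumed to sort after every name used here.

```python
from bisect import bisect_left, bisect_right

# Columns of partition 'index', sorted by (type, name) like a
# Composite(IntType, UTF8Type) comparator would sort them.
columns = sorted([
    (1, "joker"), (1, "lex_luthor"),
    (2, "alfred"), (2, "batgirl"),
    (3, "batman"), (3, "flash"),
])

MAX = "\U0010ffff"  # assumption: sorts after every character name

def names_between(low, high):
    """Names of all columns c with low <= c <= high in composite order."""
    lo = bisect_left(columns, low)
    hi = bisect_right(columns, high)
    return [name for _, name in columns[lo:hi]]

bad = names_between((1, ""), (1, MAX))                       # type = 1
heroes = names_between((3, "batman"), (3, "green_lantern"))  # type = 3, name range
bad_and_friends = names_between((1, ""), (2, MAX))           # 1 <= type <= 2

# name = 'lex_luthor' alone is not a contiguous range of this ordering:
# answering it means scanning every column of the partition.
scan = [c for c in columns if c[1] == "lex_luthor"]
```

Each supported query maps to one contiguous range of the sorted columns; the unsupported ones would need a full scan of the partition.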
  • Physical view
    • Skinny row
          Partition Key: LongType
          Column: UTF8Type
          Cell: UTF8Type
          -------------------
          RowKey: 10
          => (column=age, value=35, timestamp=…)
          => (column=name, value=John DOE, timestamp=…)
          -------------------
          RowKey: 11
          => (column=age, value=28, timestamp=…)
          => (column=name, value=Helen SUE, timestamp=…)
  • Physical view
    • Wide row with composites
          Partition Key: UTF8Type
          Column: CompositeType(IntType,UTF8Type)
          Cell: LongType
          -------------------
          RowKey: popularity_index
          => (column=1:joker, value=85, timestamp=…)
          => (column=1:lex_luthor, value=…, timestamp=…)
          => (column=2:alfred, value=…, timestamp=…)
          => (column=2:batgirl, value=…, timestamp=…)
          => (column=3:batman, value=…, timestamp=…)
          => (column=3:flash, value=…, timestamp=…)
  • CQL3
    • Mapping to storage engine
    • Implementation details
    • Collections & Maps mapping
  • Simple table (skinny row)
    • CQL3 schema
          CREATE TABLE users (
            id bigint PRIMARY KEY,
            name text,
            age int);
    • Storage engine structure
          Partition Key: LongType
          Column: CompositeType(UTF8Type)
          Cell: BytesType
  • Simple table (skinny row)
    • CQL3 logical view
           id | age | name
          ----+-----+-----------
           10 |  35 | John DOE
           11 |  28 | Helen SUE
    • Storage engine physical view
          RowKey: 10
          => (column=, value=, timestamp=…)
          => (column=age, value=00000023, timestamp=…)            (0x23 = 35)
          => (column=name, value=4a6f686e20444f45, timestamp=…)   (4a6f686e20444f45 = John DOE)
          RowKey: 11
          => (column=, value=, timestamp=…)
          => (column=age, value='28', timestamp=…)
          => (column=name, value='Helen SUE', timestamp=…)
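The hexadecimal cell values in the physical view are just the serialized bytes of the CQL values. A quick way to check them, assuming a 4-byte big-endian encoding for `int` and UTF-8 for `text` (which is what the values on the slide correspond to):

```python
# int 35 as 4 big-endian bytes, text as UTF-8 bytes, both shown in hex.
age_cell = (35).to_bytes(4, "big").hex()
name_cell = "John DOE".encode("utf-8").hex()

# Decoding goes the other way round.
age = int.from_bytes(bytes.fromhex("00000023"), "big")
name = bytes.fromhex("4a6f686e20444f45").decode("utf-8")
```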
  • Marker column
          RowKey: 10
          => (column=, value=, timestamp=…)
          => (column=age, value=00000023, timestamp=…)
          => (column=name, value=4a6f686e20444f45, timestamp=…)
    • The cell with an empty column name is a marker: it keeps the row visible even when only the primary key is set
    • INSERT INTO users(id) VALUES(12);
          RowKey: 12
          => (column=, value=, timestamp=…)
  • Clustered table (wide row)
    • CQL3 schema
          CREATE TABLE comments (
            article_id int,        // partition key
            posted_at timestamp,   // clustering component
            author text,
            content text,
            PRIMARY KEY (article_id, posted_at));
    • Storage engine structure
          Partition key: IntType
          Column: CompositeType(DateType,UTF8Type)
          Cell: BytesType
  • Clustered table
    • CQL3 logical view
           article_id | posted_at                         | author        | content
          ------------+-----------------------------------+---------------+-----------------------
                    1 | 2013-11-05 19:00:00 Paris, Madrid | John DOE      | This article is great
                    1 | 2013-11-05 19:01:00 Paris, Madrid | Helen SUE     | Worth reading
                    2 | 2013-11-05 18:00:00 Paris, Madrid | Richard SMITH | Good point
    • Storage engine physical view
          RowKey: 1
          (John DOE)
          => (column=2013-11-05 19:00:00+0100:, value=, timestamp=…)
          => (column=2013-11-05 19:00:00+0100:author, value=4a6f686e20444f45, timestamp=…)
          => (column=2013-11-05 19:00:00+0100:content, value=546869732061727469636c65206973206772656174, timestamp=…)
          (Helen SUE)
          => (column=2013-11-05 19:01:00+0100:, value=, timestamp=…)
          => (column=2013-11-05 19:01:00+0100:author, value=48656c656e20535545, timestamp=…)
          => (column=2013-11-05 19:01:00+0100:content, value=576f7274682072656164696e67, timestamp=…)
          RowKey: 2
          (Richard SMITH)
          => (column=2013-11-05 18:00:00+0100:, value=, timestamp=…)
          => (column=2013-11-05 18:00:00+0100:author, value=5269636861726420534d495448, timestamp=…)
          => (column=2013-11-05 18:00:00+0100:content, value=476f6f6420706f696e74, timestamp=…)
  • Clustered table
          Partition Key: IntType
          Column: CompositeType(DateType,UTF8Type)
          Cell: BytesType

          CREATE TABLE comments (
            article_id int,
            posted_at timestamp,
            author text,
            content text,
            PRIMARY KEY (article_id, posted_at));

           article_id | posted_at                         | author    | content
          ------------+-----------------------------------+-----------+-----------------------
                    1 | 2013-11-05 19:00:00 Paris, Madrid | John DOE  | This article is great
                    1 | 2013-11-05 19:01:00 Paris, Madrid | Helen SUE | Worth reading

          RowKey: 1
          => (column=2013-11-05 19:00:00+0100:, value=, …)
          => (column=2013-11-05 19:00:00+0100:author, value='John DOE', …)
          => (column=2013-11-05 19:00:00+0100:content, value='This article is great', …)
          => (column=2013-11-05 19:01:00+0100:, value=, …)
          => (column=2013-11-05 19:01:00+0100:author, value='Helen SUE', …)
          => (column=2013-11-05 19:01:00+0100:content, value='Worth reading', …)
  • Clustered table
    • Complicated? One more example
    • CQL3 schema
          CREATE TABLE phone_network_quality (
            provider text,
            city text,
            date int, // format YYYYmmDDhh
            rate double,
            PRIMARY KEY (provider, city, date));
    • Storage engine structure
          Partition Key: UTF8Type
          Column: CompositeType(UTF8Type,IntType,UTF8Type)
          Cell: BytesType
  • Clustered table
           provider | city  | date       | rate
          ----------+-------+------------+------
           Orange   | Lyon  | 2013110512 | 0.93
           Orange   | Paris | 2013110512 | 0.89
           Orange   | Paris | 2013110513 | 0.91
           Orange   | Paris | 2013110515 | 0.97
           SFR      | Lyon  | 2013110512 | 0.88
           SFR      | Paris | 2013110512 | 0.78
           SFR      | Paris | 2013110515 | 0.93

          CREATE TABLE phone_network_quality (
            provider text,
            city text,
            date int,
            rate double,
            PRIMARY KEY (provider, city, date));

          RowKey: Orange
          => (column=Lyon:2013110512:rate, value=0.93, timestamp=…)
          => (column=Paris:2013110512:rate, value=0.89, timestamp=…)
          => (column=Paris:2013110513:rate, value=0.91, timestamp=…)
          => (column=Paris:2013110515:rate, value=0.97, timestamp=…)
          -------------------
          RowKey: SFR
          => (column=Lyon:2013110512:rate, value=0.88, timestamp=…)
          => (column=Paris:2013110512:rate, value=0.78, timestamp=…)
          => (column=Paris:2013110515:rate, value=0.93, timestamp=…)
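The mapping from a CQL3 row to storage engine cells shown above can be sketched as a small function. `to_cells` is a hypothetical helper written for this illustration; it leaves out the marker cell and timestamps, and joins the composite components with ':' only to mimic the textual rendering used on the slides.

```python
def to_cells(row, partition_key, clustering):
    """Flatten one CQL row into (row_key, column_name, value) triples.

    The column name is the composite clustering1:clustering2:...:cql_column.
    """
    row_key = row[partition_key]
    prefix = [str(row[c]) for c in clustering]
    cells = []
    for name, value in row.items():
        if name == partition_key or name in clustering:
            continue  # key components live in the row key / column name
        cells.append((row_key, ":".join(prefix + [name]), value))
    return cells

cells = to_cells(
    {"provider": "Orange", "city": "Lyon", "date": 2013110512, "rate": 0.93},
    partition_key="provider",
    clustering=("city", "date"),
)
```

Every non-key CQL column becomes one physical cell, which is why a table with many regular columns repeats the clustering prefix in each cell name.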
  • Rationale
    • Familiar query language instead of the ugly Thrift API
    • Adopt a schema
    • Schema changes are easy → ADD/REMOVE COLUMN (except for primary key components)
    • Decouple the logical view from the physical storage
  • Rationale
    • 'SQL'-like WHERE clauses are now possible
      - SELECT rate FROM phone_network_quality WHERE provider = 'Orange'
      - SELECT rate FROM phone_network_quality WHERE provider = 'Orange' AND city = 'Paris'
      - SELECT rate FROM phone_network_quality WHERE provider = 'Orange' AND city = 'Paris' AND date >= 2013110519 AND date <= 2013110521
    • Some queries are still not possible
      - SELECT rate FROM phone_network_quality WHERE city = 'Paris'
      - SELECT rate FROM phone_network_quality WHERE provider = 'Orange' AND date >= 2013110519 AND date <= 2013110521
      - SELECT rate FROM phone_network_quality WHERE date >= 2013110519 AND date <= 2013110521
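The difference between the possible and impossible queries above comes down to one rule of thumb: equality on the partition key, then clustering columns restricted in declaration order, with at most one range at the end. The checker below is a simplified sketch of that rule. `is_supported` is a hypothetical helper, and it deliberately ignores secondary indexes, IN clauses and ALLOW FILTERING.

```python
def is_supported(where, partition_key, clustering):
    """`where` maps a column name to 'eq' or 'range'."""
    # Every partition key column needs an equality restriction.
    if any(where.get(c) != "eq" for c in partition_key):
        return False
    # Only primary key columns may be restricted in this simplified model.
    if not set(where) <= set(partition_key) | set(clustering):
        return False
    gap_seen = range_seen = False
    for column in clustering:
        op = where.get(column)
        if op is None:
            gap_seen = True        # later columns may no longer be restricted
            continue
        if gap_seen or range_seen:
            return False           # skipped a column, or restricted after a range
        range_seen = op == "range"
    return True

PK, CK = ("provider",), ("city", "date")
ok = is_supported({"provider": "eq", "city": "eq", "date": "range"}, PK, CK)
no_partition = is_supported({"city": "eq"}, PK, CK)
skipped_city = is_supported({"provider": "eq", "date": "range"}, PK, CK)
```

The rule follows directly from the physical layout: a supported query is one contiguous slice of columns inside one partition.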
  • Impl details
    • Where are the column names of the clustering components stored?
          SELECT key_aliases, key_validator, column_aliases, comparator
          FROM system.schema_columnfamilies
          WHERE keyspace_name='xxx' AND columnfamily_name='phone_network_quality';

           key_aliases  | key_validator | column_aliases  | comparator
          --------------+---------------+-----------------+--------------------------------------------
           ["provider"] | UTF8Type      | ["city","date"] | CompositeType(UTF8Type,Int32Type,UTF8Type)
  • Impl details
    • Why not store all data in the Column and use the value-less strategy?
          Actual layout:
          RowKey: SFR
          => (column=Lyon:2013110512:rate, value=0.88, …)
          => (column=Paris:2013110512:rate, value=0.78, …)
          => (column=Paris:2013110515:rate, value=0.93, …)
          Value-less alternative:
          RowKey: SFR
          => (column=Lyon:2013110512:0.88, value=, …)
          => (column=Paris:2013110512:0.78, value=, …)
          => (column=Paris:2013110515:0.93, value=, …)
      - Column size is limited to 64k (all components included)
      - Impossible to store a counter value in the Column → a counter column CANNOT be a clustering component
      - Impossible to add/remove a column with ALTER TABLE
  • Impl details
    • Eager fetching
          CREATE TABLE user_file (
            user_id bigint,
            file_id timeuuid,
            name text,
            payload blob,
            size int,
            PRIMARY KEY(user_id, file_id));
          SELECT name, size FROM user_file WHERE user_id = xxx ORDER BY file_id DESC;
      - The payload is also loaded server-side: memory pressure!
          10 | uuid1:=Ø | uuid1:name=video.mp4 | uuid1:payload=xxxxxxxxxxxx | uuid1:size=9Mb | uuid2:=Ø | …
    • Solution: 2 tables
      - user_file_metadata & user_file_payload
  • Collections & Map Impl
          CREATE TABLE user (
            id int PRIMARY KEY,
            friends list<text>,
            followers set<text>,
            preferences map<text,text>);
          INSERT INTO user(id,friends,followers,preferences)
          VALUES(10,['John','Helen'],{'Paul','George'},{'country':'FR','city':'Paris'});

          RowKey: 10
          => (column=, value=, timestamp=…)
          => (column=followers:'George', value=, timestamp=…)
          => (column=followers:'Paul', value=, timestamp=…)
          => (column=friends:'uuid1', value='John', timestamp=…)
          => (column=friends:'uuid2', value='Helen', timestamp=…)
          => (column=preferences:'city', value='Paris', timestamp=…)
          => (column=preferences:'country', value='FR', timestamp=…)
  • Collections & Map Impl
    • Set & Map
      - Add element(s)
      - Remove element(s)
      - Get element(s)
    • List
      - Append element(s): timeuuid taken from now()
      - Prepend element(s) with decreasing timeuuid, mirrored around a REFERENCE TIME
            Prepend2: -xxxxxxx (mirror) | REFERENCE TIME: 1262304000000 (01/01/2010) | Append1: 1383084226280 (now()) | Append3: 1383084371483
      - Remove element at index i: read-before-write
      - Insert element at index i: read-delete_range-write_range
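The mirror trick for list prepends can be illustrated with plain millisecond integers. This is a simplified sketch of the idea on the slide, not the actual implementation (which builds timeuuids); `append_position`, `prepend_position` and the prepend time used below are illustrative assumptions.

```python
REFERENCE_TIME_MS = 1262304000000  # 2010-01-01 00:00:00 UTC

def append_position(now_ms):
    """Appends sort by wall-clock time: later appends go further right."""
    return now_ms

def prepend_position(now_ms):
    """Prepends are mirrored around the reference time:
    the later the prepend, the smaller its position."""
    return REFERENCE_TIME_MS - (now_ms - REFERENCE_TIME_MS)

events = [
    ("Append1", append_position(1383084226280)),
    ("Prepend2", prepend_position(1383084300000)),  # hypothetical prepend time
    ("Append3", append_position(1383084371483)),
]
# List elements are stored sorted by position.
list_order = [name for name, _ in sorted(events, key=lambda e: e[1])]
```

Both operations are plain writes with no read-before-write, which is what makes append and prepend cheap compared with removing or inserting at an index.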
  • Collections & Maps Impl
    • Elements
      - Loaded all at once
      - Recommended cardinality ≈ [100-10000]
    • Avoid
      - Set<Blob>, Map<Blob,xxx> → 64k limit
    • No secondary index possible on elements (yet)
    • No IN clause on a collection in CQL3
      - SELECT … FROM user WHERE friends IN('Bob','Alice') is not supported
  • BASIC MODELING PATTERNS
    • Modeling relationships
    • To denormalize or not
  • Modeling relationships
    • 1-1, n-1: User 1 - write - 1 Comment
      - Normalized
          CREATE TABLE comment (
            article_id uuid,
            posted_at timestamp,
            author_id uuid,
            content text,
            PRIMARY KEY(article_id, posted_at));
      - Denormalized
          CREATE TABLE comment (
            article_id uuid,
            posted_at timestamp,
            author text, // author object as JSON
            content text,
            PRIMARY KEY(article_id, posted_at));
  • Modeling relationships
    • 1-n, n-n: User N - follow - N User
      - Normalized
          CREATE TABLE user_followers (
            user_id uuid,
            follower_id uuid,
            PRIMARY KEY(user_id, follower_id));
      - Denormalized
          CREATE TABLE user_followers (
            user_id uuid,
            follower_id uuid,
            follower_data text, // JSON content
            PRIMARY KEY(user_id, follower_id));
  • Normalize... or not?
    • Pros
      - Data in one place
      - Easy update (one place)
      - No discrepancy
    • Cons
      - Way too SQLish
      - Lots of reads
      - Does not scale!
  • Denormalization types
    • 1-1, n-1
      - Compact
          CREATE TABLE comment (
            article_id uuid,
            posted_at timestamp,
            author text, // author object as JSON
            content text,
            PRIMARY KEY(article_id, posted_at));
      - Flatten
          CREATE TABLE comment (
            article_id uuid,
            posted_at timestamp,
            content text,
            author_id uuid,
            author_name text,
            author_biography text,
            …
            PRIMARY KEY(article_id, posted_at));
  • Denormalization types
    • n-1, n-n
      - Collection & Map
          CREATE TABLE user_followers (
            user_id uuid PRIMARY KEY,
            followers list<text> // JSON content
          );
      - Clustered
          CREATE TABLE user_followers (
            user_id uuid,
            follower_id uuid,
            follower_name text,
            follower_biography text,
            …,
            PRIMARY KEY (user_id, follower_id));
  • Mutability
    • Denormalize only the immutable part
          CREATE TABLE comment (
            article_id uuid,
            posted_at timestamp,
            content text,
            author_id uuid,
            author_name text,
            author_biography text,   → GET RID OF IT! REALLY USEFUL?
            …,
            PRIMARY KEY(article_id, posted_at));
    • CQRS approach
          CREATE TABLE user_followers (
            user_id uuid,
            follower_id uuid,
            follower_name text,
            follower_biography text,   → SPREAD BIO UPDATE TO ALL COPIES
            PRIMARY KEY (user_id, follower_id));
  • ADVANCED MODELING PATTERNS
    • Compound primary key
    • Type density
    • Advanced wide rows and bucketing
    • Clustering order
    • Counters
  • Compound primary key
    • Simple partition key
          CREATE TABLE xxx (
            partition_key uuid,
            clustering_key1 text,
            clustering_key2 timeuuid,
            …,
            PRIMARY KEY (partition_key, clustering_key1, clustering_key2));
    • Composite partition key
          CREATE TABLE xxx (
            partition_key1 uuid,
            partition_key2 text,
            clustering_key1 text,
            clustering_key2 timeuuid,
            …,
            PRIMARY KEY ((partition_key1, partition_key2), clustering_key1, clustering_key2));
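  • Compound primary key
    • Ordering (ORDER BY): within a partition, columns are sorted by clustering_key1, then clustering_key2
          PRIMARY KEY (partition_key, clustering_key1, clustering_key2)
          Index | 1:joker=Ø | 1:lex_luthor=Ø | … | 2:alfred=Ø | 2:batgirl=Ø | … | 3:batman=Ø | 3:flash=Ø
    • Uniqueness by tuple: each (partition_key, clustering_key1, clustering_key2) tuple identifies exactly one row
          PRIMARY KEY (partition_key, clustering_key1, clustering_key2)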
  • Compound primary key
    • Example
          CREATE TABLE song_rating_by_value (
            song_id uuid,
            date timestamp,
            rating_value int,
            user_id bigint,
            user_name text,
            PRIMARY KEY (song_id, rating_value));
      Unique? No: two ratings with the same value for the same song overwrite each other
          CREATE TABLE song_rating_by_value (
            song_id uuid,
            date timestamp,
            rating_value int,
            user_id bigint,
            user_name text,
            PRIMARY KEY (song_id, rating_value, date));
      Unique? Yes, as long as two ratings do not share the same value and date
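Since every write is an upsert keyed by the full primary key, the uniqueness question above can be checked with a dictionary. This is an illustrative sketch only; `upsert` and the sample ratings (song id, dates, users) are made up for the example.

```python
def upsert(table, primary_key, row):
    """Insert or silently overwrite the row identified by its primary key."""
    key = tuple(row[column] for column in primary_key)
    table[key] = row

ratings = [
    {"song_id": "s1", "rating_value": 4, "date": "2013-11-05 19:00:00", "user_name": "John DOE"},
    {"song_id": "s1", "rating_value": 4, "date": "2013-11-05 19:01:00", "user_name": "Helen SUE"},
]

by_value, by_value_and_date = {}, {}
for rating in ratings:
    upsert(by_value, ("song_id", "rating_value"), rating)
    upsert(by_value_and_date, ("song_id", "rating_value", "date"), rating)
```

With the shorter key the second rating replaces the first without any error; adding `date` to the key keeps both.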
  • Clustering type density
    • Offers ordering
          CREATE TABLE offer_ordering (
            offer_type text, // PROMO, PREMIUM, TRIAL, FREE
            order int,
            id uuid,
            label text,
            PRIMARY KEY (offer_type, order));
          INSERT INTO offer_ordering(…) VALUES('PREMIUM',1,uuid1,'Full premium');
          INSERT INTO offer_ordering(…) VALUES('PREMIUM',2,uuid2,'Light premium');
    • New offer between the existing 'Full' & 'Light'? There is no int between 1 and 2, so use a denser type:
          CREATE TABLE offer_ordering (
            offer_type text, // PROMO, PREMIUM, TRIAL, FREE
            order double,
            id uuid,
            label text,
            PRIMARY KEY (offer_type, order));
  • Ultra wide rows
    • Heavy read-repair & anti-entropy
      1. Hashes of rows are compared
      2. If not equal, complete rows are exchanged!
      • EVEN IF ONLY ONE COLUMN DIFFERS!
    • Recommendations
      - 100,000 cols of 1 kB / 100 MB per row
        • 1 col/sec, column 8 bytes (long), cell 8 bytes (double) → 16 bytes/sec
        • 1 row → 6,553,600 secs → 75 days → 2.5 months of data
    • Bucketing
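The sizing figures above can be re-derived in a few lines. The sketch assumes 100 MB means 100 × 1024 × 1024 bytes, which is the reading that reproduces the 6,553,600 seconds on the slide; the variable names are illustrative.

```python
ROW_BUDGET_BYTES = 100 * 1024 * 1024   # recommended upper bound per row
BYTES_PER_COLUMN = 8 + 8               # 8-byte column name (long) + 8-byte cell (double)
COLUMNS_PER_SECOND = 1

seconds_per_row = ROW_BUDGET_BYTES // (BYTES_PER_COLUMN * COLUMNS_PER_SECOND)
days_per_row = seconds_per_row / 86_400
months_per_row = days_per_row / 30
```

Changing `COLUMNS_PER_SECOND` or the cell size shows quickly how long a single row can accumulate data before it needs to be split into buckets.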
  • Bucketing
    - Original wide row
          CREATE TABLE metrics (
            type text, // CPU, MEMORY, DISK …
            date timestamp,
            value double,
            PRIMARY KEY (type, date));
    - Split by year (YYYY)
          CREATE TABLE metrics (
            type text,
            year int, // format YYYY
            date timestamp,
            value double,
            PRIMARY KEY ((type, year), date));
          CPU:2012 | 2012-01-01 00:00:00=0.24 | … | 2012-12-31 23:59:59=0.25
          CPU:2013 | 2013-01-01 00:00:00=0.24 | … | 2013-12-31 23:59:59=0.25
  • Bucketing
    - Split by month (YYYYMM)
          CREATE TABLE metrics (
            type text,
            month int,
            date timestamp,
            value double,
            PRIMARY KEY ((type, month), date));
          CPU:201301 | 2013-01-01 00:00:00=0.24 | … | 2013-01-31 23:59:59=0.25
    - Split by day (YYYYMMDD)
          CREATE TABLE metrics (
            type text,
            day int,
            date timestamp,
            value double,
            PRIMARY KEY ((type, day), date));
          CPU:20130101 | 2013-01-01 00:00:00=0.24 | … | 2013-01-01 23:59:59=0.25
  • Querying across buckets
    • With split by month
      - Get all CPU metrics from January 25th to February 10th?
          SELECT * FROM metrics
          WHERE type = 'CPU' AND month IN (201301, 201302)
          AND date >= '2013-01-25 00:00:00+0100' AND date <= '2013-02-10 00:00:00+0100';
  • Querying across buckets
    • Semantics
          SELECT * FROM metrics
          WHERE type = 'CPU' AND month = 201301
          AND date >= '2013-01-25 00:00:00+0100' AND date <= '2013-02-10 00:00:00+0100'
          UNION
          SELECT * FROM metrics
          WHERE type = 'CPU' AND month = 201302
          AND date >= '2013-01-25 00:00:00+0100' AND date <= '2013-02-10 00:00:00+0100'
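On the client side, the only extra work for a bucketed table is listing the buckets that a date range touches before putting them in the IN clause. A minimal sketch, assuming the YYYYMM month buckets used above; `month_buckets` is a hypothetical helper name.

```python
from datetime import datetime

def month_buckets(start, end):
    """List the YYYYMM buckets covering [start, end], in chronological order."""
    buckets = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        buckets.append(year * 100 + month)
        month += 1
        if month > 12:          # roll over to January of the next year
            year, month = year + 1, 1
    return buckets

buckets = month_buckets(datetime(2013, 1, 25), datetime(2013, 2, 10))
in_clause = "month IN (" + ", ".join(str(b) for b in buckets) + ")"
```

The date restriction itself can stay identical for every bucket, since each bucket only returns the part of the range it actually contains.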
  • Querying across buckets
    • Storage engine view
          CPU:201301 | 2013-01-01 00:00:00=0.24 | 2013-01-01 00:00:10=0.23 | … | 2013-01-25 00:00:00=0.27 | … | 2013-01-31 23:59:59=0.25
                       (slice read: Jan. 25th - Jan. 31st)
          CPU:201302 | 2013-02-01 00:00:00=0.18 | … | 2013-02-10 23:59:59=0.25 | …
                       (slice read: Feb. 1st - Feb. 10th)
  • Querying across buckets
    • Prerequisites
      - Column ranges are disjoint between rows
      - Date/timeuuid are good candidates
    • Counter-example
          CPU:2012 | 01=0.75 | 02=0.71 | 03=0.68 | 04=0.79 | 05=0.69 | 06=0.73 | 07=0.75 | 08=0.77 | 09=0.83 | 10=0.78 | 11=0.79 | 12=0.88
          CPU:2013 | 01=0.71 | 02=0.72 | 03=0.67 | 04=0.80 | 05=0.71 | 06=0.75 | 07=0.74 | 08=0.76 | 09=0.81 | 10=0.79 | 11=0.83 | 12=0.87
          SELECT average_value FROM metrics WHERE type = 'CPU' AND year IN (2012,2013) AND month >= 11 AND month <= 02
      - The same month numbers exist in both rows, so one column range cannot express "November 2012 to February 2013"
  • Clustering order
    • For natural reverse order (time)
          CREATE TABLE user_tweets (
            user_id uuid,
            tweet_id timeuuid,
            tweet_content text,
            PRIMARY KEY (user_id, tweet_id))
          WITH CLUSTERING ORDER BY (tweet_id DESC);
          SELECT tweet_id, tweet_content FROM user_tweets WHERE user_id = 10 LIMIT 10;
    • Rationale
      - Avoids backward seeks in SSTables
          10 | uuid1 | uuid2 | … | uuid98 | uuid99 | uuid100
          33 | uuid1 | uuid2 | … | uuid51 | uuid52 | uuid53
  • Counters
    • Basic recommendations
      - Only increment/decrement/get
      - Delete is possible
      - Never read after a delete: no guarantee on purge
  • Counters patterns
    • Up/down vote
      - Naive impl: incr/decr one counter
      - Dual counter (up_vote/down_vote)
      - Extended info
        • Vote count (up_vote + down_vote)
        • Vote ratio (up_vote/(up_vote + down_vote))
          CREATE TABLE article_popularity (
            article_id uuid PRIMARY KEY,
            up_vote counter,
            down_vote counter,
            total_vote_count counter   → NEED ATOMIC BATCH
          );
  • Counters patterns
    • Rating system
          CREATE TABLE movie_rating (
            movie_id uuid PRIMARY KEY,
            zero_star counter,
            one_star counter,
            two_star counter,
            three_star counter,
            four_star counter,
            five_star counter);
  • Counters patterns
    • Votes grouping
          CREATE TABLE votes_grouping (
            candidate text,
            state text,
            city text,
            district text,
            vote counter,
            PRIMARY KEY(candidate,state,city,district));
    • Hierarchical grouping (static)
    • Possible results
      - Vote count for 'Obama' in 'Florida'
      - Vote count for 'Obama' in 'New York State' and 'New York City'
      - Vote count for 'Obama' and 'Romney' in 'Florida' (using IN clause on primary key candidate)
    • Not possible with this table (the two localities sit at different levels of the hierarchy)
      - Vote count for 'Obama' in 'Florida' and 'New York City'
  • Counters patterns
    • Votes per locality
          CREATE TABLE votes_per_locality (
            candidate text,
            locality text,
            vote counter,
            PRIMARY KEY((candidate,locality)));
    • Write path (atomic batch)
          UPDATE votes_grouping SET vote = vote + 1
            WHERE candidate = 'Obama' AND state = 'State of New York' AND city = 'New York City' AND district = 'Manhattan';
          UPDATE votes_per_locality SET vote = vote + 1
            WHERE candidate = 'Obama' AND locality = 'State of New York';
          UPDATE votes_per_locality SET vote = vote + 1
            WHERE candidate = 'Obama' AND locality = 'New York City';
          UPDATE votes_per_locality SET vote = vote + 1
            WHERE candidate = 'Obama' AND locality = 'Manhattan';
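The write path above fans one vote out to four counters. The in-memory sketch below mimics that fan-out with `collections.Counter` to show which keys get incremented and why mixed-level lookups become possible. `record_vote`, `votes_in` and the Florida city and district names are illustrative assumptions.

```python
from collections import Counter

votes_grouping = Counter()      # key: (candidate, state, city, district)
votes_per_locality = Counter()  # key: (candidate, locality)

def record_vote(candidate, state, city, district):
    """One vote increments the hierarchical counter and one counter per level."""
    votes_grouping[(candidate, state, city, district)] += 1
    for locality in (state, city, district):
        votes_per_locality[(candidate, locality)] += 1

def votes_in(candidate, localities):
    """Equivalent of: WHERE candidate = ? AND locality IN (...)."""
    return {loc: votes_per_locality[(candidate, loc)] for loc in localities}

record_vote("Obama", "State of New York", "New York City", "Manhattan")
record_vote("Obama", "Florida", "Miami", "Downtown")  # hypothetical city/district

mixed_levels = votes_in("Obama", ["Florida", "New York City"])
```

One design caveat of this flat table: a state and a city sharing the same name would share a counter, so locality names have to be unambiguous.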
  • Counters patterns
    • New possible queries
      - Vote count for 'Obama' in 'Florida' and 'Nevada'
          SELECT vote FROM votes_per_locality
          WHERE candidate = 'Obama' AND locality IN ('Nevada', 'Florida');
      - Vote count for 'Obama' in 'Florida' and 'New York City'
          SELECT vote FROM votes_per_locality
          WHERE candidate = 'Obama' AND locality IN ('Florida', 'New York City');
      - Vote count for 'Obama' and 'Romney' in 'Florida' and 'New York City'
  • EXOTIC MODELING PATTERNS
    • TTL, my 2 cents
    • Timestamp, this old buddy
  • Time to live
    • Abstract
      - Set on columns only
      - Resolution in seconds
      - CQL3 function: ttl(<column_name>)
      - Expired server-side
  • Time to live use-cases
    • Security token
          CREATE TABLE oauth_token (
            token uuid PRIMARY KEY,
            user_id bigint,
            permissions list<text>);
          INSERT INTO oauth_token(…) VALUES(…) USING TTL 3600;
    • One-time token
          CREATE TABLE validation_code (
            validation_code uuid PRIMARY KEY,
            user_registration_email text);
          INSERT INTO validation_code(…) VALUES(…) USING TTL 86400; // 1 day
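The token use-cases above rely on cells disappearing by themselves once their TTL has elapsed. The toy store below imitates that behaviour with an injectable clock so the expiry can be observed without waiting. `TtlStore` and its methods are purely illustrative and say nothing about how expiration is implemented server-side.

```python
class TtlStore:
    """Key/value store whose entries expire after a time to live in seconds."""

    def __init__(self, clock):
        self._clock = clock   # callable returning the current time in seconds
        self._cells = {}      # key -> (value, expiry time)

    def put(self, key, value, ttl_seconds):
        self._cells[key] = (value, self._clock() + ttl_seconds)

    def get(self, key):
        entry = self._cells.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._cells[key]   # expired: behaves as if it was never written
            return None
        return value

    def ttl(self, key):
        """Remaining seconds, like the CQL3 ttl() function; None once expired."""
        if self.get(key) is None:
            return None
        return self._cells[key][1] - self._clock()

now = [0]                                   # fake clock, in seconds
store = TtlStore(clock=lambda: now[0])
store.put("token-1", "user-42", ttl_seconds=3600)
```

Advancing `now[0]` past 3600 makes `store.get("token-1")` return `None`, which is the whole token-invalidation logic: no cleanup job has to be written by the application.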
  • Time to live use-cases
    • Natural countdown
    • Rate-limiting/thresholding
    • Live Demo! COUNT(*)
      - https://github.com/doanduyhai/Paris_Cassandra_Meetup-Nov_2013
  • Timestamp
    • Properties
      - Automatically set on each column
      - Can be set manually to any value
      - Resolution in microseconds
      - Used to resolve conflicting values
    • Live Demo!
      - https://github.com/doanduyhai/Paris_Cassandra_Meetup-Nov_2013
  • Timestamp
    • Write barrier
      - A delete operation (which creates a tombstone) with a timestamp in the future shadows every write carrying an older timestamp
    • Live Demo!
      - https://github.com/doanduyhai/Paris_Cassandra_Meetup-Nov_2013
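Conflict resolution by timestamp and the write barrier can both be shown on a single cell. This is a deliberately simplified model: the `Cell` class is illustrative, the highest timestamp wins, and the handling of equal timestamps is left out.

```python
TOMBSTONE = object()  # marker value written by a delete

class Cell:
    """Last-write-wins register: the write with the highest timestamp survives."""

    def __init__(self):
        self.value = None
        self.timestamp = None

    def write(self, value, timestamp):
        if self.timestamp is None or timestamp > self.timestamp:
            self.value, self.timestamp = value, timestamp

    def delete(self, timestamp):
        self.write(TOMBSTONE, timestamp)

    def read(self):
        return None if self.value is TOMBSTONE else self.value

cell = Cell()
cell.write("a", timestamp=100)
cell.write("b", timestamp=90)    # older timestamp: loses the conflict
after_conflict = cell.read()

cell.delete(timestamp=1000)      # tombstone dated in the future
cell.write("c", timestamp=500)   # shadowed by the tombstone: the write barrier
after_barrier = cell.read()

cell.write("d", timestamp=1001)  # only a newer timestamp gets through
after_late_write = cell.read()
```

Arrival order plays no role here, only the timestamps do, which is why a manually set future timestamp can block later writes.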
  • Thank you
    @doanduyhai
    doanduyhai@gmail.com
    http://achilles.archinnov.info
    Youtube: http://www.youtube.com/user/paloitTV