Cassandra Basics, Counters and Time Series Modeling

Presented at Athens Cassandra Users Group meetup http://www.meetup.com/Athens-Cassandra-Users/events/177040142/

  1. C* @ Icon Platforms. Vassilis Bekiaris (@karbonized1), Software Architect
  2. Presentation outline • Meet Cassandra • CQL - Data modeling basics • Counters & time-series use case: Polls
  3. Meet Cassandra
  4. History • Started at Facebook • Historically builds on: • Dynamo for distribution: consistent hashing, eventual consistency • BigTable for the disk storage model. Amazon’s Dynamo: http://www.allthingsdistributed.com/2007/10/amazons_dynamo.html Google’s BigTable: http://research.google.com/archive/bigtable.html
  5. Cassandra is • A distributed database written in Java • Scalable • Masterless, no single point of failure • Tunable consistency • Network topology aware
  6. Cassandra Data Model • Original “Map of Maps” schema: row key ➞ Map<ColumnName, Value> • Now (in CQL): Keyspace = Database, ColumnFamily = Table, Row = Partition, Column = Cell • Data types: strings, booleans, integers, decimals; collections (list, set, map): not indexable, not individually queryable; counters; custom types
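A minimal CQL sketch of the data types above (hypothetical table and column names, not from the original deck), including collections:

    CREATE TABLE user_profiles (
      username text PRIMARY KEY,   -- the row key: identifies the partition
      age int,
      active boolean,
      emails set<text>,            -- collection: not indexable, not individually queryable
      prefs map<text, text>        -- collection: read and written as a whole
    );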
  7. Cassandra Replication Factor & Consistency Levels • CAP Theorem: Consistency, Availability, Tolerance in the face of network partitions. Original article: http://www.cs.berkeley.edu/~brewer/cs262b-2004/PODC-keynote.pdf Review 12 years later: http://www.infoq.com/articles/cap-twelve-years-later-how-the-rules-have-changed Fun with distributed systems under partitions: http://aphyr.com/tags/jepsen
  8. Cassandra Replication Factor & Consistency Levels • RF: designated per keyspace • CL: Writes: ANY, ONE, QUORUM, ALL; Reads: ONE, QUORUM, ALL • Consistent reads & writes are achieved when CL(W) + CL(R) > RF, i.e. when the number of replicas written plus the number of replicas read exceeds RF • QUORUM = RF/2 + 1 • Additional QUORUM variants: LOCAL_QUORUM: quorum of replica nodes within the same DC; EACH_QUORUM: quorum of replica nodes from all DCs. Cassandra parameters calculator: http://www.ecyrd.com/cassandracalculator/
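A short worked example of the rule above (hypothetical keyspace name, cqlsh assumed): with RF = 3, QUORUM = 3/2 + 1 = 2, so QUORUM writes plus QUORUM reads give 2 + 2 = 4 > 3, i.e. every read overlaps every write on at least one replica:

    CREATE KEYSPACE demo
      WITH replication = {'class': 'NetworkTopologyStrategy', 'DC1': 3};  -- RF = 3 in DC1

    -- In cqlsh, the per-session consistency level is set with:
    CONSISTENCY QUORUM;   -- 2 of the 3 replicas must acknowledge each read/write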
  9. Masterless design • All nodes in the cluster are equal • Gossip protocol among servers • Adding / removing nodes is easy • Clients are cluster-aware [Slide shows Figure 2 from “Dynamo: Amazon’s Highly Available Key-value Store”: partitioning and replication of keys in the Dynamo ring; nodes B, C and D store keys in range (A,B), including key K]
  10. Write path • Storage is log-structured; updates do not overwrite, deletes do not remove • Commit log: sequential disk access • Memtables: in-memory data structure (partially off-heap since 2.1b2) • Memtables are flushed to SSTables on disk • Compaction: merges SSTables, removes tombstones
  11. Read path • For each SSTable that may contain a partition key: • Bloom filters: estimate the probability of locating partition data per SSTable • Locate offset in SSTable • Sequential read in SSTable (if the query involves several columns) • A partition’s columns are merged from several SSTables / memtables, as column updates never overwrite data
  12. CQL - Data Modeling Basics
  13. CQL • Cassandra Query Language • Client API for Cassandra • CQL3 available since Cassandra 1.2 • Familiar syntax • Easy to use • Drivers available for Java, Python, C# and more
  14. Creating a table
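The CQL statement on this slide did not survive the transcript; judging from the next slide and the partition layout on slide 21, it was presumably something like:

    CREATE TABLE user_attributes (
      username text PRIMARY KEY,  -- partition key: identifies the row, queried quickly
      last_login timestamp,
      married_to text
    );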
  15. Creating a table - what happened? • A new table was created • It looks familiar! • We defined the username as the primary key, so we can identify a row and query quickly by username • Primary keys can be composite; the first part of the primary key is the partition key and determines the primary node for the partition
  16.-20. Composite Primary Key (diagram built up across slides 16-20): the primary key is composed of the partition key (not ordered) plus clustering column(s) (ordered); see the sketch below.
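A composite-primary-key sketch (hypothetical table, not from the deck): the first component is the partition key, the remaining columns are clustering columns:

    CREATE TABLE user_logins (
      username text,          -- partition key: not ordered, determines the node
      login_time timestamp,   -- clustering column: ordered within the partition
      ip inet,
      PRIMARY KEY (username, login_time)
    );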
  21. Composite Primary Key - Partition Layout (diagram): partition “johndoe” holds cells last_login = 2014-01-04T12:00:00 and married_to = janedoe; partition “anna” holds cell last_login = 2014-04-03T13:57:13
  22. Insert/Update • INSERT & UPDATE are functionally equivalent • New in Cassandra 2.0: support for lightweight transactions (compare-and-set) • e.g. INSERT INTO users (username, email) VALUES ('tony', 'tony@gmail.com') IF NOT EXISTS; • Based on the Paxos consensus protocol. Paxos Made Live: An Engineering Perspective: http://research.google.com/archive/paxos_made_live.pdf
  23. Select query • SELECT * FROM user_attributes; • Selecting across several partitions can be slow • Default LIMIT 10,000 • Results can be filtered with WHERE clauses on the partition key, on the partition key & clustering columns, or on indexed columns • EQ & IN operators allowed for partition keys • EQ, <, > … operators allowed for clustering columns
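Hypothetical queries against the user_logins table sketched above, illustrating the allowed WHERE patterns:

    SELECT * FROM user_logins WHERE username = 'johndoe';             -- EQ on partition key
    SELECT * FROM user_logins WHERE username IN ('johndoe', 'anna');  -- IN on partition key
    SELECT * FROM user_logins
      WHERE username = 'johndoe'
        AND login_time >= '2014-01-01';                               -- range on clustering column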
  24. Select query - Ordering • Partition keys are not ordered • … but clustering columns are ordered • Default ordering is dictated by the clustering columns • ORDER BY can be specified on clustering columns at query time; the default order can be set WITH CLUSTERING ORDER at table creation
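For example (again a hypothetical table), the default order is fixed at creation time and can be reversed per query:

    CREATE TABLE user_logins_desc (
      username text,
      login_time timestamp,
      ip inet,
      PRIMARY KEY (username, login_time)
    ) WITH CLUSTERING ORDER BY (login_time DESC);  -- newest first by default

    SELECT * FROM user_logins_desc
      WHERE username = 'johndoe'
      ORDER BY login_time ASC;                     -- override the default at query time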
  25. Secondary Indexes • Secondary indexes allow queries using EQ or IN operators on columns other than the partition key • Internally implemented as hidden tables • “Cassandra's built-in indexes are best on a table having many rows that contain the indexed value. The more unique values that exist in a particular column, the more overhead you will have, on average, to query and maintain the index.” http://www.datastax.com/documentation/cql/3.0/cql/ddl/ddl_when_use_index_c.html
  26. Secondary Indexes (example)
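The example on this slide was not captured; a sketch against the hypothetical user_attributes table from earlier:

    CREATE INDEX ON user_attributes (married_to);

    -- An EQ query on the indexed (non-partition-key) column now works:
    SELECT * FROM user_attributes WHERE married_to = 'janedoe';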
  27. Query Performance • Single-partition queries are fast! • Queries for ranges on clustering columns are fast! • Queries for multiple partitions are slow • Use secondary indexes with caution
  28. Counter columns
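The slide content was not captured; a minimal counter sketch (hypothetical table): in a counter table every non-primary-key column must be a counter, and counters are modified only through UPDATE:

    CREATE TABLE page_views (
      page text PRIMARY KEY,
      views counter
    );

    UPDATE page_views SET views = views + 1 WHERE page = '/home';  -- increment
    SELECT views FROM page_views WHERE page = '/home';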
  29. Tracing CQL requests
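The slide content was not captured; in cqlsh, request tracing is toggled per session:

    TRACING ON;   -- cqlsh prints a trace of each subsequent request
    SELECT * FROM user_attributes WHERE username = 'johndoe';
    TRACING OFF;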
  30. Setting TTL
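The slide content was not captured; a TTL sketch against the hypothetical user_attributes table: TTL is set per write, in seconds, and the remaining TTL of a cell can be queried back:

    INSERT INTO user_attributes (username, last_login)
      VALUES ('johndoe', '2014-04-03 13:57:13') USING TTL 86400;  -- expires after one day

    SELECT TTL(last_login) FROM user_attributes WHERE username = 'johndoe';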
  31. Counters and Time Series use case: Polls
  32. Use cases
  33. Data access patterns • View poll ➞ Get poll name & sorted list of answers by poll id • User votes ➞ Insert answer with user id, poll id, answer id, timestamp • View result ➞ Retrieve counts per poll & answer
  34. Poll & answers (tables) • POLL: POLL_ID, TEXT • POLL_ANSWER: POLL_ID, ANSWER_ID, SORT_ORDER • ANSWER: ANSWER_ID, TEXT
  35. Poll & answers • Need 3 queries to display a poll • 2 by PK EQ • 1 for multiple rows by PK IN
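In CQL, the three-table model and its three queries might look like this (hypothetical sketch; int ids and adapted column names for brevity):

    CREATE TABLE poll (poll_id int PRIMARY KEY, poll_text text);
    CREATE TABLE poll_answer (poll_id int, answer_id int, sort_order int,
                              PRIMARY KEY (poll_id, answer_id));
    CREATE TABLE answer (answer_id int PRIMARY KEY, answer_text text);

    SELECT * FROM poll        WHERE poll_id = 1;             -- PK EQ
    SELECT * FROM poll_answer WHERE poll_id = 1;             -- PK EQ
    SELECT * FROM answer      WHERE answer_id IN (1, 2, 3);  -- PK IN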
  36. Poll & answers revisited (tables) • POLL: POLL_ID, TEXT • POLL_ANSWER: POLL_ID, SORT_ORDER, ANSWER_ID, ANSWER_TEXT
  37. Poll & answers revisited • Need 2 queries to display a poll • both by PK EQ
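The denormalized version folds the answer text into poll_answer and clusters by sort_order, so answers come back in display order (sketch, replacing the earlier poll_answer definition):

    CREATE TABLE poll_answer (
      poll_id int,
      sort_order int,
      answer_id int,
      answer_text text,
      PRIMARY KEY (poll_id, sort_order)  -- clustered in display order
    );

    SELECT * FROM poll        WHERE poll_id = 1;  -- PK EQ
    SELECT * FROM poll_answer WHERE poll_id = 1;  -- PK EQ, rows sorted by sort_order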
  38. Poll & answers re-revisited (table) • POLL: POLL_ID, POLL_TEXT (STATIC), SORT_ORDER, ANSWER_ID, ANSWER_TEXT • (Requires Cassandra 2.0.6+)
  39. Poll & answers re-revisited • One table to rule them all • One query by PK EQ
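With a static column (Cassandra 2.0.6+), the poll text is stored once per partition and the whole poll reads back with a single query (sketch, replacing the earlier two tables):

    CREATE TABLE poll (
      poll_id int,
      poll_text text STATIC,  -- one value shared by all rows of the partition
      sort_order int,
      answer_id int,
      answer_text text,
      PRIMARY KEY (poll_id, sort_order)
    );

    SELECT * FROM poll WHERE poll_id = 1;  -- poll text + ordered answers in one query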
  40. Votes • Record user’s votes in a timeline • Count of votes per answer
  41. Votes (table) • VOTE: POLL_ID, VOTED_ON, USER_ID, ANSWER_ID
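A sketch of the vote timeline: one partition per poll, clustered by vote time (timeuuid assumed here, to keep concurrent votes distinct):

    CREATE TABLE vote (
      poll_id int,
      voted_on timeuuid,   -- clustering column: votes stored as a timeline
      user_id int,
      answer_id int,
      PRIMARY KEY (poll_id, voted_on)
    );

    INSERT INTO vote (poll_id, voted_on, user_id, answer_id)
      VALUES (1, now(), 21, 4);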
  42. Time buckets • If you have tons of votes to record, you may want to split your partitions into buckets, e.g. per day
  43. Time buckets • Partition layout (diagram): partition (poll_id=1, day=20140401) holds rows user_id=21 ➞ answer_id=4 and user_id=22 ➞ answer_id=1; partition (poll_id=1, day=20140402) holds rows user_id=27 ➞ answer_id=2 and user_id=29 ➞ answer_id=3
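The bucketed layout above corresponds to a composite partition key of (poll_id, day), so each day's votes land in their own partition (sketch):

    CREATE TABLE vote_bucketed (
      poll_id int,
      day int,             -- bucket, e.g. 20140401
      voted_on timeuuid,
      user_id int,
      answer_id int,
      PRIMARY KEY ((poll_id, day), voted_on)  -- note the double parentheses
    );

    SELECT * FROM vote_bucketed WHERE poll_id = 1 AND day = 20140401;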
  44. Counting votes • Count per poll_id & answer_id
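A counter table keyed by poll and answer keeps the running totals: the counter is bumped on each vote, and the results page reads a single partition (sketch):

    CREATE TABLE vote_count (
      poll_id int,
      answer_id int,
      votes counter,
      PRIMARY KEY (poll_id, answer_id)
    );

    UPDATE vote_count SET votes = votes + 1 WHERE poll_id = 1 AND answer_id = 4;
    SELECT answer_id, votes FROM vote_count WHERE poll_id = 1;  -- counts per answer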
  45. Links • http://cassandra.apache.org • http://planetcassandra.org/ : Cassandra binary distributions, use cases, webinars • http://www.datastax.com/docs : excellent documentation for all things Cassandra (and DSE) • http://www.slideshare.net/patrickmcfadin/cassandra-20-and-timeseries : Cassandra 2.0 new features & time series modeling
  46. Thank you!
