Apache Cassandra Opinion and Fact


Yet another Cassandra Slide deck.


  1. Oh, why is my cluster’s read performance terrible?
  2. What it is.
     • In one sentence:
       – A data store that resembles:
         • A hash table (k/v) that’s evenly distributed across a cluster of servers. Practically speaking, not n-level k/v.
         • Or, an Excel spreadsheet that is chopped up and housed on different servers.
     • Basic nomenclature..
  3. Where did it come from?
     • Legacy:
       – Google Bigtable and Amazon Dynamo.
       – Open sourced by Facebook in 2008; it entered the Apache Incubator in 2009.
       – Facebook had designed it for ‘Inbox search’.
  4. What it is not.
     • Not a general-purpose data store.
       – Highly specialized use cases.
         • The business use case must align with the Cassandra architecture.
       – No transactions.
       – No joins (make multiple round trips). De-normalize data.
       – No stored procedures.
       – No range queries across keys (default).
       – No referential integrity (pk constraints, foreign keys).
       – No locking.
       – Uses timestamps to upsert data: the write with the newest timestamp wins.
       – Charlie must be aghast.
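The "no locking, uses timestamp to upsert" bullets can be illustrated with a minimal last-write-wins sketch. This is not Cassandra's code: the `Cell` class, the `upsert` helper, and the integer timestamps are hypothetical names chosen for illustration, and tie-breaking is simplified.

```python
from dataclasses import dataclass


@dataclass
class Cell:
    value: str
    timestamp: int  # writer-supplied timestamp (hypothetical unit)


def upsert(row: dict, column: str, value: str, timestamp: int) -> None:
    """Insert or overwrite a column; the newest timestamp wins, no lock is taken."""
    current = row.get(column)
    # Simplification: on equal timestamps the later call wins in this sketch.
    if current is None or timestamp >= current.timestamp:
        row[column] = Cell(value, timestamp)


row = {}
upsert(row, "email", "a@example.com", timestamp=100)
upsert(row, "email", "b@example.com", timestamp=200)
upsert(row, "email", "stale@example.com", timestamp=150)  # arrives late, is ignored
print(row["email"].value)
```

There is no separate insert and update path: every write is the same operation, and an out-of-order write with an older timestamp simply loses.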
  5. Who uses it?
     • Web-scale companies.
       – Netflix, Twitter.
         • Capture clickstream data.
         • User activity/gaming.
         • Backing store for search tools (Lucene).
       – Structured/unstructured data.
     • Trend: web-scale companies moving from distributed MySQL to Cassandra.
  6. Interest over time / Google Trends…
  7. Netflix slide..
  8. Where do people use it?
     • Mostly in analytic/reporting ‘stacks’.
       – Fire-hose in vast amounts of ‘log-like’ data (value proposition 1).
       – Hopefully your data model ensures that your data is physically clustered on read (value proposition 2).
         • Data that is physically clustered is conducive to reporting.
         • Can be used ‘real time’, but that is not its strength.
  9. Important to know right up front.
     • Designed for high write rates (all write activity is sequential IO).
       – If improperly used, read performance will suffer.
       – Always strive to minimize disk seeks on read.
     • Millions of inserts should result in tens of thousands of row keys (not millions of keys).
     • Main usage pattern: high write / low read (rates). See Netflix slide.
     • Anti-pattern (OLTP-like): millions of inserts / millions of reads (for main data tables).
     • If your Cassandra use is kosher, you will find that IO is the bottleneck. Need better performance? Simply add more boxes (more IO bandwidth for your cluster).
     • It’s all about physically clustering your data for efficient reads.
     • You have a query in mind? Design a Cassandra table that satisfies that query (lots of data duplication all over the place). Make sure that the query is satisfied by navigating to a single row key.
     • Favors throughput over latency.
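The "many inserts, few row keys" guideline can be sketched with an in-memory stand-in for a table. The row key scheme (`symbol:day`), the `insert` and `read_day` helpers, and the generated tick data are all assumptions made for illustration; they are not from the deck.

```python
from collections import defaultdict

# row key -> {column name (a timestamp here) -> value}
table = defaultdict(dict)


def row_key(symbol: str, day: str) -> str:
    """Hypothetical key design: one wide row per symbol per day."""
    return f"{symbol}:{day}"


def insert(symbol: str, day: str, ts: int, price: float) -> None:
    table[row_key(symbol, day)][ts] = price


def read_day(symbol: str, day: str) -> list:
    """The whole query is answered from a single row key, columns sorted by time."""
    return sorted(table[row_key(symbol, day)].items())


symbols = [f"SYM{i}" for i in range(100)]
for symbol in symbols:
    for day in ("2012-06-01", "2012-06-02"):
        for ts in range(500):
            insert(symbol, day, ts, 28.0 + ts * 0.01)

total_inserts = sum(len(columns) for columns in table.values())
print(total_inserts, "inserts landed in", len(table), "row keys")
```

The ratio is the point: each read navigates to one key and then scans many columns, instead of visiting one key per data point.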
  10. With analytics/reporting in mind:
      – Let’s explore RDBMS storage inefficiency (for a large query) and Cassandra’s value proposition 2.
  11. Data in an RDBMS (physical)
      Query: Select * … Where Symbol=MSFT
      Block size 8k, 1k row size. Columns: Symbol, Price, Time.
        db block 1:    … MSFT 28.01 t1 … MSFT 28.03 t5 …
        db block 20:   … MSFT 28.03 t7 …
        db block 1000: … MSFT 28.01 t22 …
      Minimum IO = 24K (8k x blocks visited), 3 seeks.
      Slow.
  12. Data in Cassandra (physical)
      Query: Select * Where Symbol=MSFT
        KEY     Col/Value
        MSFT    t1 => 28.03   t5 => 28.03   t7 => 28.03   t22 => 28.01
      Minimum IO = 8K (8k x 1)
      • 1 seek to KEY (+ overhead), then sequentially read the data.
      • You want to make sure that you are getting a lot of value per seek! Cassandra likes “wide rows”.
      • Your data is physically clustered and sorted (t1, t5…).
      • Millions of inserts have resulted in thousands of keys (high write / low read).
      • Fast.
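The IO arithmetic on the two storage slides can be reproduced in a few lines. The function names are hypothetical, and the model deliberately ignores caches, indexes, and the "+ overhead" the slide mentions.

```python
BLOCK_SIZE_KB = 8  # block size used on the slides


def scattered_read(blocks_visited: int) -> dict:
    """Matching rows spread over several blocks: one seek and one block read each."""
    return {"seeks": blocks_visited, "io_kb": blocks_visited * BLOCK_SIZE_KB}


def clustered_read(blocks_in_row: int = 1) -> dict:
    """Matching columns stored together: one seek, then a sequential read."""
    return {"seeks": 1, "io_kb": blocks_in_row * BLOCK_SIZE_KB}


rdbms = scattered_read(blocks_visited=3)  # db blocks 1, 20 and 1000
wide_row = clustered_read()               # the single MSFT row
print(rdbms, wide_row)
```

With the slide's numbers this gives 3 seeks and 24K for the scattered layout against 1 seek and 8K for the wide row.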
  13. What it is, in depth:
      • Log-structured data store (all activity is sequentially written).
      • Favors Availability and Partition tolerance over Consistency.
        – Consistency is tunable, but if you tune for high consistency you trade off performance.
        – Cassandra consistency is not the same as database consistency. It is read-your-writes consistency.
      • Column oriented.
      • TTL (time to live / expire data).
      • Compaction (coalesce data in files).
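How sequential writes, TTL, and compaction fit together can be sketched as follows. This is a heavily simplified illustration, not Cassandra's algorithm: segments are plain dicts, the `(value, written_at, ttl)` cell layout and the `compact` function are assumptions, and real-world details such as tombstones are left out.

```python
def compact(segments: list, now: int) -> dict:
    """Coalesce several write-once segments into one.

    Each segment maps key -> (value, written_at, ttl), with ttl=None meaning
    "never expires". For every key the newest cell is kept, and cells whose
    TTL has elapsed by `now` are dropped.
    """
    merged = {}
    for segment in segments:
        for key, cell in segment.items():
            if key not in merged or cell[1] > merged[key][1]:
                merged[key] = cell
    return {
        key: cell
        for key, cell in merged.items()
        if cell[2] is None or cell[1] + cell[2] > now
    }


# Writes only ever create new segments; nothing is updated in place.
older = {"a": ("1", 10, None), "b": ("x", 10, 5)}
newer = {"a": ("2", 20, None)}
print(compact([older, newer], now=100))
```

Reads get cheaper after compaction because a key that was spread over several files is found in one.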
  14. System Properties
      • Distributed / elastic scalability (value proposition 3).
      • Fault tolerant: rack aware, inter/intra datacenter data replication (value proposition 4).
      • Peer to peer, no single point of failure (value proposition 5). Write to or read from any node; it will act as the proxy to the cluster. No master node.
      • Durable.
  15. Evenly distributes data (default)
      • Consistent hashing.
      • Token range: 0 to 2^127 - 1.
      • Your ‘key’ gets assigned a token. E.g. key = smith = token 15; place it on the Eastern node.
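A minimal sketch of the token ring idea, under stated assumptions: keys are hashed with MD5 and reduced into the slide's token range, nodes are given evenly spaced tokens, and a key belongs to the first node whose token is greater than or equal to the key's token, wrapping around. The node names and helper functions are hypothetical, and this is not Cassandra's exact token computation.

```python
import hashlib
from bisect import bisect_left

TOKEN_SPACE = 2 ** 127  # tokens fall in 0 .. 2^127 - 1, as on the slide


def token_for(key: str) -> int:
    """Map a key to a token (assumption: MD5 digest reduced modulo the token space)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TOKEN_SPACE


def build_ring(node_names: list) -> list:
    """Give each node an evenly spaced token; returns a sorted list of (token, node)."""
    step = TOKEN_SPACE // len(node_names)
    return sorted((i * step, name) for i, name in enumerate(node_names))


def owner(ring: list, key: str) -> str:
    """Find the first node whose token is >= the key's token, wrapping past the end."""
    tokens = [token for token, _ in ring]
    index = bisect_left(tokens, token_for(key)) % len(ring)
    return ring[index][1]


ring = build_ring(["north", "east", "south", "west"])
print("smith lives on:", owner(ring, "smith"))
```

Because the hash spreads keys uniformly over the token space, evenly spaced node tokens give each node roughly the same share of the data.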
  16. Replication Factor = 3
      • Consistency Level
      • Hinted Handoff
  17. Consistency Level
  18. Data types.
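The interplay between replication factor and consistency level comes down to simple arithmetic: a read is guaranteed to see the latest acknowledged write when the number of replicas read plus the number of replicas written exceeds the replication factor. The sketch below encodes that rule; the function names are hypothetical.

```python
def quorum(replication_factor: int) -> int:
    """A majority of the replicas."""
    return replication_factor // 2 + 1


def read_sees_latest_write(write_replicas: int, read_replicas: int,
                           replication_factor: int) -> bool:
    """True when the read set and the write set must overlap in at least one replica."""
    return write_replicas + read_replicas > replication_factor


rf = 3
q = quorum(rf)
print("quorum for RF=3:", q)
print("QUORUM write + QUORUM read:", read_sees_latest_write(q, q, rf))
print("ONE write + ONE read:", read_sees_latest_write(1, 1, rf))
```

This is the tunable trade-off from the earlier slide: waiting for more replicas on each request buys stronger consistency at the cost of latency and availability.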
  19. ACID?
      • A/I/D (in bits and bobs).
      • BASE: Basically Available, Soft-state, Eventual consistency.
  20. Cassandra/Future
      • Will slowly take on more RDBMS-like features.
        – Cassandra 1.1 has row-level isolation. Previously you could read someone else’s in-flight data.
  21. Reference: CAP Theorem.
      • Consistency (all nodes see the same data at the same time).
      • Availability (a guarantee that every request receives a response about whether it was successful or failed).
      • Partition tolerance (the system continues to operate despite arbitrary message loss or failure of part of the system).
