HRX Meetup Group 8/20/2014: Cassandra and How to Scale your Database

HR5 alum Stephen Portanova will be presenting on the highly scalable database Cassandra, which is used by Reddit, Netflix, CERN, and The Weather Channel. 'nuff said.

Published in: Technology


  1. Cassandra: Pretty Cool
  2. History: Google Bigtable, Amazon Dynamo
  3. Today
  4. Why Should You Care?
     ● Horizontal scaling (basically auto-sharding)
     ● Multiple nodes - highly available
     ● Really fast writes
     ● Not too shabby at reads either - SLICES!!
     ● Bright future
  5. The Cluster
     ● Replication factor (rf)
     ● Read consistency (r)
     ● Write consistency (w)
     ● Clustering - shard on partition key
  6. The One Ring
  7. Storage - Vnodes
  8. Data Model
     ● Wide rows
     ● Slice queries
     ● Denormalization
     ● Index tables
  9. Data Model - Simple Key
     CREATE TABLE email_app.emails (
       user_id text,
       subject text,
       to_add text,
       cc text,
       body text,
       PRIMARY KEY (user_id));
     Row key: user_id
  10. Data Model - Simple Inserts
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('999', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
  11. Data Model - Simple Inserts Result
      SELECT * FROM email_app.emails;
      Row key 111: subject = party, to_add = cat@, cc = hippo@, body = at my place
      Row key 999: subject = wat, to_add = horse@, cc = giraffe@, body = is going on?
  12. Mental Model - Nested Hash
      Row key → column values
      111 → subject, to, cc, body
      999 → subject, to, cc, body
  13. Data Model - Simple Insert - Again
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
      Row key 111: subject = party, to_add = cat@, cc = hippo@, body = at my place
      Row key 999: subject = wat, to_add = horse@, cc = giraffe@, body = is going on?
      IDEMPOTENT: repeating the insert leaves the result unchanged.
  14. Data Model - Composite Key 1
      CREATE TABLE email_app.emails (
        user_id text,
        subject text,
        to_add text,
        cc text,
        body text,
        PRIMARY KEY (user_id, subject));
      Row key: user_id
      Clustering key: subject
  15. Data Model - Composite Insert 1
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
      Same as before. Right???
  16. Data Model - Composite Insert Result
      SELECT * FROM emails WHERE user_id = '111';
      Row key 111 (column names prefixed by subject):
        party|to_add = cat@
        party|cc     = hippo@
        party|body   = at my place
  17. Mental Model - Nested Hash
      Row key (user_id) → clustering column (subject) → column values
      111 → party → to_add, cc, body
  18. Data Model - Composite Insert 2
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'swim', 'cat@b.com', 'hippo@b.com', 'in the pool');
  19. Composite Insert 2 Result
      SELECT * FROM emails WHERE user_id = '111';
      Row key 111 (column names prefixed by subject):
        party|to_add = cat@
        party|cc     = hippo@
        party|body   = at my place
        swim|to_add  = cat@
        swim|cc      = hippo@
        swim|body    = in the pool
      Sorted by clustering column - "subject"
  20. Mental Model - Nested Sorted Hash
      Row key (user_id) → clustering column (subject) → column values
      111 → party → to, cc, body
          → swim  → to, cc, body
  21. Why sorted? SLICE QUERIES!!
      SELECT * FROM emails
        WHERE user_id = '111' AND (subject) >= ('s') AND (subject) < ('t');
      Row key 111 (column names prefixed by subject):
        party|to_add = cat@
        party|cc     = hippo@
        party|body   = at my place
        swim|to_add  = cat@
        swim|cc      = hippo@
        swim|body    = in the pool
      Only the swim|... columns fall inside the slice from 's' up to (not including) 't'.
  22. DM - Compound Composite Key
      CREATE TABLE email_app.emails (
        user_id text,
        subject text,
        to_add text,
        cc text,
        body text,
        PRIMARY KEY ((user_id, subject), to_add));
      Row key: (user_id, subject)
      Clustering key: to_add
  23. Composite / Compound Inserts
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'wat', 'horse@b.com', 'giraffe@b.com', 'is going on?');
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'at my place');
  24. Composite Insert 2 Result
      SELECT * FROM emails WHERE user_id = '111';
      SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';
      The first query is rejected: the row key is now (user_id, subject), so both parts must be supplied. The second returns:
      Row key 111:party (column names prefixed by to_add):
        cat@|cc   = hippo@
        cat@|body = at my place
  25. Data Model - Composite Insert 1
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
      SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';
      Row key 111:party (column names prefixed by to_add):
        cat@|cc   = hippo@
        cat@|body = at my place
        dog@|cc   = hippo@
        dog@|body = all the time
      Sorting / slicing on "to_add"
  26. DM - Compound Composite Key 2
      CREATE TABLE email_app.emails (
        user_id text,
        subject text,
        to_add text,
        cc text,
        body text,
        PRIMARY KEY ((user_id, subject), to_add, cc));
      Row key: (user_id, subject)
      Clustering keys: to_add, cc
  27. Composite / Clustered Inserts
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'dog@b.com', 'hippo@b.com', 'all the time');
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'hippo@b.com', 'At my place');
      INSERT INTO email_app.emails (user_id, subject, to_add, cc, body)
        VALUES ('111', 'party', 'cat@b.com', 'mouse@b.com', 'At my place');
  28. DM - Composite / Clustered Inserts
      SELECT * FROM emails WHERE user_id = '111' AND subject = 'party';
      Row key 111|party (column names prefixed by to_add and cc):
        cat@|hippo@|body = At my place
        cat@|mouse@|body = At my place
        dog@|hippo@|body = all the time
      Slice on (to_add) OR (to_add, cc)
  29. Mental Model - Nested Sorted Hash
      Row key (user_id + subject) → clustering columns (to_add → cc) → column values
      111|party → cat → hippo → body
                      → mouse → body
                → dog → hippo → body
  30. Part 2 / 8 of this 7-hour talk
      ● Denormalization
      ● Index column families
      ● Cassandra internals (memtables, SSTables, compaction, repair)
  31. Part 8 / 8: The Future
      ● Continually improving
      ● More and more adoption
      ● Awesome projects
      ● http://www.datastax.com/documentation/cassandra/2.0/pdf/cassandra20.pdf
      ● http://planetcassandra.org/
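
The "nested sorted hash" mental model from slides 17 to 21 can be sketched in plain Python. This is only an illustration of the idea, not how Cassandra actually stores data: the `Partition` class, the `insert` helper, and the use of `bisect` are all assumptions made for this sketch. It mirrors the composite-key table, where `user_id` is the row key and `subject` is the clustering key.

```python
import bisect


class Partition:
    """One row key whose clustering keys stay sorted (hypothetical sketch)."""

    def __init__(self):
        self._keys = []   # clustering keys, kept in sorted order
        self._cells = {}  # clustering key -> {column name: value}

    def insert(self, clustering_key, **columns):
        # Writing the same key again overwrites the same cells,
        # so replaying an insert leaves the data unchanged.
        if clustering_key not in self._cells:
            bisect.insort(self._keys, clustering_key)
            self._cells[clustering_key] = {}
        self._cells[clustering_key].update(columns)

    def slice(self, start, end):
        # Entries with start <= clustering key < end, in sorted order.
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        return [(k, self._cells[k]) for k in self._keys[lo:hi]]


table = {}  # row key (user_id) -> Partition


def insert(user_id, subject, **columns):
    table.setdefault(user_id, Partition()).insert(subject, **columns)


insert("111", "swim", to_add="cat@b.com", cc="hippo@b.com", body="in the pool")
insert("111", "party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")
# Replaying the same insert changes nothing.
insert("111", "party", to_add="cat@b.com", cc="hippo@b.com", body="at my place")

all_subjects = [k for k, _ in table["111"].slice("", "\uffff")]
sliced = [k for k, _ in table["111"].slice("s", "t")]
print(all_subjects)  # ['party', 'swim']
print(sliced)        # ['swim']
```

Because the clustering keys are sorted, the slice is two binary searches plus a contiguous read, which is the intuition behind the "Why sorted? SLICE QUERIES!!" slide.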
