C* Summit EU 2013: Apache Cassandra 2.0 — Data Model on Fire
Speaker: Patrick McFadin, Chief Evangelist at DataStax
Video: http://www.youtube.com/watch?v=oUEKMcTsbfU&list=PLqcm6qE9lgKLoYaakl3YwIWP4hmGsHm5e&index=22
Functional data models are great, but how can you squeeze out more performance and make them awesome? Let's talk through some example Cassandra 2.0 models, go through the tuning steps and understand the tradeoffs. Many times, just a simple understanding of the underlying Cassandra 2.0 internals can make all the difference. I've helped some of the biggest companies in the world do this and I can help you. Do you feel the need for Cassandra 2.0 speed?

    Presentation Transcript

    • #CASSANDRAEU Data Model on Fire Patrick McFadin | Chief Evangelist DataStax @PatrickMcFadin Friday, October 18, 13
    • Data Model is King
        • With 2.0 we now have more choices
        • Sometimes the data model is only the first part
        • Understanding the underlying engine helps
        • You aren’t done until you tune
        Load test baby!
    • Light Weight Transactions
    • The race is on
        T0, Process 1:
            SELECT firstName, lastName FROM users WHERE username = 'pmcfadin';
            (0 rows)
        T1, Process 2:
            SELECT firstName, lastName FROM users WHERE username = 'pmcfadin';
            (0 rows)
        Got nothing! Good to go!
        T2, Process 1:
            INSERT INTO users (username, firstname, lastname, email, password, created_date)
            VALUES ('pmcfadin','Patrick','McFadin', ['patrick@datastax.com'],
                    'ba27e03fd95e507daf2937c937d499ab', '2011-06-20 13:50:00');
        T3, Process 2:
            INSERT INTO users (username, firstname, lastname, email, password, created_date)
            VALUES ('pmcfadin','Paul','McFadin', ['paul@oracle.com'],
                    'ea24e13ad95a209ded8912e937d499de', '2011-06-20 13:51:00');
        This one wins
    • Solution LWT
        Process 1, T0 to T1:
            INSERT INTO users (username, firstname, lastname, email, password, created_date)
            VALUES ('pmcfadin','Patrick','McFadin', ['patrick@datastax.com'],
                    'ba27e03fd95e507daf2937c937d499ab', '2011-06-20 13:50:00')
            IF NOT EXISTS;

             [applied]
            -----------
                  True
        • Check performed for record
        • Paxos ensures exclusive access
        • applied = true: Success
    • Solution LWT
        Process 2, T2 to T3:
            INSERT INTO users (username, firstname, lastname, email, password, created_date)
            VALUES ('pmcfadin','Paul','McFadin', ['paul@oracle.com'],
                    'ea24e13ad95a209ded8912e937d499de', '2011-06-20 13:51:00')
            IF NOT EXISTS;

             [applied] | username | created_date             | firstname | lastname
            -----------+----------+--------------------------+-----------+----------
                 False | pmcfadin | 2011-06-20 13:50:00-0700 | Patrick   | McFadin
        • applied = false: Rejected
        • No record stomping!
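The race and its fix can be modeled without a cluster. What follows is a minimal Python sketch, assuming an in-memory dict stands in for the users table and a thread lock stands in for the exclusive access that Paxos provides. `FakeUsersTable` and its methods are hypothetical names for illustration, not a Cassandra driver API.

```python
import threading


class FakeUsersTable:
    """Toy stand-in for the users table; NOT a Cassandra client."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # assumption: stands in for Paxos

    def insert(self, username, row):
        # Plain INSERT: the last write silently replaces the earlier one.
        self._rows[username] = row

    def insert_if_not_exists(self, username, row):
        # INSERT ... IF NOT EXISTS: the check and the write are one step.
        with self._lock:
            existing = self._rows.get(username)
            if existing is not None:
                return False, existing  # [applied] = False plus the current row
            self._rows[username] = row
            return True, None  # [applied] = True

    def get(self, username):
        return self._rows.get(username)


patrick = {"firstname": "Patrick", "lastname": "McFadin"}
paul = {"firstname": "Paul", "lastname": "McFadin"}

# Without the condition, Process 2 stomps on Process 1's record.
racy = FakeUsersTable()
racy.insert("pmcfadin", patrick)
racy.insert("pmcfadin", paul)

# With the condition, the second insert is rejected and sees the winner.
safe = FakeUsersTable()
applied_1, _ = safe.insert_if_not_exists("pmcfadin", patrick)
applied_2, current = safe.insert_if_not_exists("pmcfadin", paul)
```

The sketch only shows the contract (one writer wins, the loser is told who won). It says nothing about the latency cost of the real protocol, which is what the next slide warns about.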
    • LWT Fine Print
        • Light Weight Transactions solve edge conditions
        • They have a latency cost.
            • Be aware
            • Load test
            • Consider in your data model
        • Now go shut down that ZooKeeper mess you have!
    • Form Versioning: Revisited
    • Form Versioning Pt 1
        • From “Next top data model”
        • Great idea, but edge conditions
        CREATE TABLE working_version (
            username varchar,
            form_id int,
            version_number int,
            locked_by varchar,
            form_attributes map<varchar,varchar>,
            PRIMARY KEY ((username, form_id), version_number)
        ) WITH CLUSTERING ORDER BY (version_number DESC);
        • Each user has a form
        • Each form needs versioning
        • Need an exclusive lock on the form
    • Form Versioning Pt 1
        1. Insert first version
            INSERT INTO working_version (username, form_id, version_number, locked_by, form_attributes)
            VALUES ('pmcfadin',1138,1,'',
                    {'FirstName<text>':'First Name: ',
                     'LastName<text>':'Last Name: ',
                     'EmailAddress<text>':'Email Address: ',
                     'Newsletter<radio>':'Y,N'});
        2. Lock for one user (Danger Zone)
            UPDATE working_version
            SET locked_by = 'pmcfadin'
            WHERE username = 'pmcfadin' AND form_id = 1138 AND version_number = 1;
        3. Insert new version. Release lock
            INSERT INTO working_version (username, form_id, version_number, locked_by, form_attributes)
            VALUES ('pmcfadin',1138,2,null,
                    {'FirstName<text>':'First Name: ',
                     'LastName<text>':'Last Name: ',
                     'EmailAddress<text>':'Email Address: ',
                     'Newsletter<checkbox>':'Y'});
    • Form Versioning Pt 2
        1. Insert first version
            INSERT INTO working_version (username, form_id, version_number, locked_by, form_attributes)
            VALUES ('pmcfadin',1138,1,'pmcfadin',
                    {'FirstName<text>':'First Name: ',
                     'LastName<text>':'Last Name: ',
                     'EmailAddress<text>':'Email Address: ',
                     'Newsletter<radio>':'Y,N'})
            IF NOT EXISTS;
            Exclusive lock

            UPDATE working_version
            SET form_attributes['EmailAddress<text>'] = 'Primary Email Address: '
            WHERE username = 'pmcfadin' AND form_id = 1138 AND version_number = 1
            IF locked_by = 'pmcfadin';
            Accepted

            UPDATE working_version
            SET form_attributes['EmailAddress<text>'] = 'Email Adx: '
            WHERE username = 'pmcfadin' AND form_id = 1138 AND version_number = 1
            IF locked_by = 'dude';
            Rejected (sorry dude)
    • Form Versioning Pt 2
        • Old way: Edge cases with problems
            • Use external locking?
            • Take your chances?
        • New way: Managed expectations (LWT)
            • Exclusive by existence check
            • Continued with IF clause
            • Downside: More latency
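The "exclusive by existence check, continued with IF clause" pattern can be sketched in plain Python. This is a toy, single-threaded model under stated assumptions: `FakeWorkingVersion` and its method names are hypothetical, a dict replaces the working_version table, and no real Cassandra call is involved.

```python
class FakeWorkingVersion:
    """Toy model of the working_version table (hypothetical, in memory)."""

    def __init__(self):
        # (username, form_id, version_number) -> row dict
        self._rows = {}

    def insert_if_not_exists(self, key, locked_by, form_attributes):
        # Models INSERT ... IF NOT EXISTS: creating the row also takes the lock.
        if key in self._rows:
            return False
        self._rows[key] = {
            "locked_by": locked_by,
            "form_attributes": dict(form_attributes),
        }
        return True

    def set_attribute_if_locked_by(self, key, owner, name, value):
        # Models UPDATE ... SET form_attributes[name] = value IF locked_by = owner.
        row = self._rows.get(key)
        if row is None or row["locked_by"] != owner:
            return False  # rejected, nothing changes
        row["form_attributes"][name] = value
        return True

    def attributes(self, key):
        return dict(self._rows[key]["form_attributes"])


forms = FakeWorkingVersion()
key = ("pmcfadin", 1138, 1)

created = forms.insert_if_not_exists(
    key, "pmcfadin", {"EmailAddress<text>": "Email Address: "})
accepted = forms.set_attribute_if_locked_by(
    key, "pmcfadin", "EmailAddress<text>", "Primary Email Address: ")
rejected = forms.set_attribute_if_locked_by(
    key, "dude", "EmailAddress<text>", "Email Adx: ")
```

The lock holder's update goes through and the other user's update leaves the row untouched, which mirrors the Accepted and Rejected outcomes on the slide.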
    • Fire: Bring it
    • Cassandra 2.0 Fire
        • Great changes in both 1.2 and 2.0 for perf
        • Three big changes in 2.0 I like
            Single pass compaction
            Hints to reduce SSTable reads
            Faster index reads from off-heap
    • Why is this important?
        • Reducing SSTable reads means fewer seeks
        • Disk seeks can add up fast
        • 5 seeks on SATA = 60ms of just disk!
        Avg Access Time*   Rotation Speed
        12ms               7200 RPM
        7ms                10k RPM
        5ms                15k RPM
        .04ms              SSD
        Shared storage == Great sadness
        * Source: www.tomshardware.com
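The "5 seeks on SATA = 60ms" figure is seeks multiplied by average access time. A small Python sketch of that arithmetic, using only the access times from the table on the slide; the function name `disk_time_ms` is made up for this example.

```python
# Average access times in milliseconds, copied from the slide's table.
AVG_ACCESS_MS = {
    "7200 RPM": 12.0,
    "10k RPM": 7.0,
    "15k RPM": 5.0,
    "SSD": 0.04,
}


def disk_time_ms(seeks, drive):
    """Time spent on disk access alone for a read that needs `seeks` seeks."""
    return seeks * AVG_ACCESS_MS[drive]


for drive in AVG_ACCESS_MS:
    print(f"{drive:>8}: {disk_time_ms(5, drive):.2f} ms for 5 seeks")
```

Each SSTable a read has to touch can add another seek, so removing one SSTable from the read path on a 7200 RPM disk saves roughly the full 12 ms.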
    • Quick Diversion
        • cfhistograms is your friend
        • Histograms of statistics per table
        • Collected...
            • per read
            • per write
            • SSTable flush
            • Compaction
        nodetool cfhistograms <keyspace> <table>
    • How do I even read this thing!
    • Histograms How to: Offset
        nodetool cfhistograms videodb users
        videodb/users histograms
        Offset  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                          (micros)       (micros)      (bytes)
        1       107       0              0             0               0
        2       0         0              0             0               0
        10      0         0              0             0               5
        250     0         5              0             0               0
        800     0         10             50            0               0
        1250    0         0              300           5               0
        • Unit-less column
        • Units are assigned by each column
        • Numerical buckets
    • Histograms How to: SSTables
        nodetool cfhistograms videodb users
        videodb/users histograms
        Offset  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                          (micros)       (micros)      (bytes)
        1       107       0              0             0               0
        2       2         0              0             0               0
        10      0         0              0             0               5
        250     0         5              0             0               0
        800     0         10             50            0               0
        1250    0         0              300           5               0
        • Per read. How many seeks?
        • Offset is number of SSTables read
        • Less == lower read latency
        • 107 reads took 1 seek to satisfy
    • Histograms How to: Write Latency (micros)
        • Per write. How fast?
        • Offset is microseconds
    • Histograms How to: Read Latency (micros)
        • Per read. How fast?
        • Offset is microseconds
    • Histograms How to: Partition Size (bytes)
        • Per partition (storage row)
        • Offset is size in bytes
        • 5 partitions are 1250 bytes
    • Histograms How to: Cell Count
        • Per partition (storage row)
        • Offset is count of cells in partition
        • 5 partitions have 10 cells
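Reading a cfhistograms column comes down to treating it as a mapping from offset (bucket) to count. A minimal Python sketch, assuming each offset can be used as the representative value of its bucket; the helper names are invented here, and the numbers are the sample values from the videodb/users slides (107 reads at offset 1 and 2 at offset 2 for SSTables; 50 reads at 800 and 300 at 1250 for read latency).

```python
def total_count(histogram):
    """Number of samples recorded in one histogram column."""
    return sum(histogram.values())


def weighted_mean(histogram):
    """Mean offset, weighting each bucket by its count."""
    total = total_count(histogram)
    return sum(offset * count for offset, count in histogram.items()) / total


def percentile_offset(histogram, fraction):
    """Smallest offset whose cumulative count covers `fraction` of samples."""
    threshold = fraction * total_count(histogram)
    running = 0
    for offset in sorted(histogram):
        running += histogram[offset]
        if running >= threshold:
            return offset
    raise ValueError("empty histogram")


# Non-zero buckets only, taken from the sample slides.
sstables_per_read = {1: 107, 2: 2}
read_latency_micros = {800: 50, 1250: 300}

print("mean SSTables per read:", round(weighted_mean(sstables_per_read), 3))
print("95th percentile read bucket:", percentile_offset(read_latency_micros, 0.95))
```

Because the buckets are coarse, a percentile computed this way is only as precise as the bucket boundaries; it is useful for comparing runs, not for quoting exact latencies.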
    • Histograms + Data Model
        • Your data model is the key to success
        • How do you ensure that?
            Test
            Measure
            Repeat
    • Real World Example
        • Real Customer
        • Needed very tight SLA on reads
        Problem
        • Read response highly variable
        • Loading data increases latency
    • Offset  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                        (micros)       (micros)      (bytes)
      1       2016550   0              0             0               0
      2       2064495   0              0             0               0
      3       434526    0              0             0               0
      4       51084     0              0             0               0
      5       0         0              0             0               0
      6       0         0              0             0               0
      7       0         0              0             0               0
      8       0         0              0             0               0
      10      0         0              0             0               1629
      12      0         0              0             0               2971
      14      0         0              0             0               1286
      17      0         0              0             0               68
      20      0         0              0             0               188
      24      0         0              0             0               101
      29      0         0              0             0               50799
      35      0         0              0             0               269
      42      0         0              0             0               132414
      50      0         0              0             0               32943
      60      0         0              0             0               62099
      72      0         0              0             0               116855
      86      0         0              0             0               41562
      103     0         0              0             0               42796
      124     0         0              0             0               46719
      149     0         0              0             0               57693
      179     0         0              3             0               27659
      215     0         0              18            0               26941
      258     0         0              47            0               21589
      310     0         0              71            0               19494
      372     0         0              141           0               8681
      446     0         0              67            0               9499
      535     0         0              36466         1629            9360
      642     0         0              263829        0               4349
      770     0         0              608488        2971            4242
      924     0         0              209549        1468            2422
      1109    0         0              398845        59              1685
      1331    0         0              625099        45105           954
      1597    0         0              462636        5731            610
      1916    0         0              499920        132391          366
      2299    0         0              380787        16265           303
      2759    0         0              285323        20015           188
      3311    0         0              202417        30980           106
      3973    0         0              148920        44973           64
      4768    0         0              106452        38502           55
      5722    0         0              81533         69479           23
      6866    0         0              55470         39218           15
      8239    0         0              43512         23027           3
      9887    0         0              30810         58498           2
      11864   0         0              22375         73629           0
      14237   0         0              15148         33444           1
      17084   0         0              12047         28321           0
      20501   0         0              11298         17021           0
      24601   0         0              9652          13072           3
      29521   0         0              6715          7790            0
      35425   0         0              13788         7764            0
      42510   0         0              15322         5890            0
      51012   0         0              8585          4046            0
      61214   0         0              5041          2973            0
      73457   0         0              2892          1954            0
      88148   0         0              1543          936             0
      105778  0         0              900           661             0
      126934  0         0              486           409             0
      152321  0         0              285           289             0
        • Compactions behind
        • Disk IO problems
        • How to optimize?
    • Offset  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                        (micros)       (micros)      (bytes)
      1       2045656   0              0             0               0
      2       1813961   0              0             0               0
      3       70496     0              0             0               0
      4       0         0              0             0               0
      5       0         0              0             0               0
      6       0         0              0             0               0
      7       0         0              0             0               0
      8       0         0              0             0               0
      10      0         0              0             0               47
      12      0         0              0             0               860
      14      0         0              0             0               393
      17      0         0              0             0               50
      20      0         0              0             0               0
      24      0         0              0             0               21
      29      0         0              0             0               34489
      35      0         0              0             0               32
      42      0         0              0             0               97226
      50      0         0              0             0               24490
      60      0         0              0             0               47077
      72      0         0              0             0               94761
      86      0         0              0             0               32559
      103     0         0              0             0               33885
      124     0         0              0             0               37051
      149     0         0              1             0               48429
      179     0         0              17            0               23272
      215     0         0              95            0               22459
      258     0         0              84            0               17953
      310     0         0              174           0               16178
      372     0         0              53082         0               7123
      446     0         0              318074        0               7836
      535     0         0              423140        47              7904
      642     0         0              382926        0               3552
      770     0         0              365670        860             3525
      924     0         0              414824        392             1998
      1109    0         0              442701        46              1411
      1331    0         0              335862        30325           757
      1597    0         0              302920        4082            518
      1916    0         0              236448        97224           294
      2299    0         0              171726        11843           254
      2759    0         0              122880        15160           162
      3311    0         0              90413         23484           89
      3973    0         0              66682         34799           62
      4768    0         0              53385         29619           54
      5722    0         0              39121         53155           23
      6866    0         0              26828         30702           12
      8239    0         0              18930         18627           3
      9887    0         0              12517         47739           2
      11864   0         0              8269          61853           0
      14237   0         0              6049          28875           1
      17084   0         0              4614          24391           0
      20501   0         0              5868          14450           0
      24601   0         0              6167          11112           0
      29521   0         0              2879          6609            0
      35425   0         0              2054          6654            0
      42510   0         0              8913          4986            0
      51012   0         0              4429          3352            0
      61214   0         0              1541          2465            0
      73457   0         0              560           1607            0
      88148   0         0              192           809             0
      105778  0         0              59            523             0
      126934  0         0              19            333             0
      152321  0         0              0             262             0
        Less seeks
        2 ms!
        • Tuned data disk
        • Compactions better
        • 1 less seek overall
        • Further tuning made it even better!
        What about the partition size?
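The "1 less seek overall" claim can be checked from the SSTables column of the two histograms. A short Python sketch using only the non-zero SSTables counts from the before and after slides; the function names are invented for this example.

```python
# SSTables-per-read counts (offset -> number of reads), non-zero buckets only.
before_tuning = {1: 2016550, 2: 2064495, 3: 434526, 4: 51084}
after_tuning = {1: 2045656, 2: 1813961, 3: 70496}


def mean_sstables_per_read(histogram):
    """Average number of SSTables touched per read."""
    reads = sum(histogram.values())
    return sum(n * count for n, count in histogram.items()) / reads


def worst_case_sstables(histogram):
    """Largest SSTable count that any read needed."""
    return max(n for n, count in histogram.items() if count > 0)


for label, histogram in (("before", before_tuning), ("after", after_tuning)):
    print(label,
          "worst case:", worst_case_sstables(histogram),
          "mean:", round(mean_sstables_per_read(histogram), 2))
```

The worst case drops from 4 SSTables to 3, and the average moves down as well, since far fewer reads need a third SSTable.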
    • Partition Size
        • Tuning is an option based on size in bytes
        • All about the reads
        • index_interval
            • How many samples taken
            • Lower for faster access but more memory usage
        • column_index_size_in_kb
            • Add column indexes to a row when the data reaches this size
            • Partial row reads? Maybe smaller.
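The index_interval tradeoff (faster access, more memory) follows from how many index samples end up in memory. A rough Python sketch, assuming one sampled entry is kept per index_interval partition keys; the partition count is hypothetical, 128 is assumed here as a starting value, and `index_samples` is a name made up for this example.

```python
import math


def index_samples(partition_count, index_interval):
    """Approximate number of sampled index entries held in memory."""
    return math.ceil(partition_count / index_interval)


partitions = 1_000_000  # hypothetical table size, not from the talk

for interval in (128, 64, 32):
    print(f"index_interval={interval}: ~{index_samples(partitions, interval)} samples")
```

Halving the interval roughly doubles the number of samples, which is the memory cost the slide refers to; the benefit is a shorter scan between samples when locating a partition.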
    • Tuning results
        • Spent a lot of time tuning disk
        • Played with
            • index_interval (Lowered)
            • concurrent_reads (Increased)
            • column_index_size_in_kb (Lowered)
        220 Million Ops/Day
        10000 Transactions/Sec Peak
        9ms at 95th percentile. Measured at the application!
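To put the daily and peak figures on the same scale, the daily total can be converted to an average rate. A small Python sketch of that arithmetic, assuming "ops" and "transactions" count the same unit of work, which the slide does not state.

```python
SECONDS_PER_DAY = 24 * 60 * 60

ops_per_day = 220_000_000  # "220 Million Ops/Day" from the slide
peak_tps = 10_000          # "10000 Transactions/Sec Peak" from the slide

average_tps = ops_per_day / SECONDS_PER_DAY
peak_to_average = peak_tps / average_tps

print(f"average: {average_tps:.0f} ops/sec")
print(f"peak is {peak_to_average:.1f}x the average")
```

Under that assumption the peak is several times the daily average, which is why the load test has to target the peak rate rather than the average.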
    • Offset  SSTables  Write Latency  Read Latency  Row Size        Column Count
      1       27425403  0              0             0               0
      2       0         0              0             0               0
      3       0         0              0             0               0
      4       0         0              1             0               0
      5       0         0              24            0               0
      6       0         0              56            0               0
      7       0         0              92            0               0
      8       0         0              283           0               0
      10      0         0              2834          0               0
      12      0         0              11954         0               0
      14      0         0              32621         0               1218345
      17      0         0              135311        0               0
      20      0         0              314195        0               0
      24      0         0              610665        0               0
      29      0         0              536736        0               0
      35      0         0              162541        0               0
      42      0         0              25277         0               0
      50      0         0              7847          0               0
      60      0         0              5864          0               0
      72      0         0              9580          0               0
      86      0         0              5517          0               0
      103     0         0              3822          0               0
      124     0         0              1850          0               0
      149     0         0              394           0               0
      179     0         0              253           0               0
      215     0         0              305           0               0
      258     0         0              4657297       0               0
      310     0         0              12748409      0               0
      372     0         0              7475534       0               0
      446     0         0              263549        0               0
      535     0         0              217171        0               0
      642     0         0              41908         1218345         0
      770     0         0              24876         0               0
      924     0         0              13566         0               0
      1109    0         0              10875         0               0
      1331    0         0              9379          0               0
      1597    0         0              7111          0               0
      1916    0         0              5333          0               0
      2299    0         0              5072          0               0
      2759    0         0              3987          0               0
      3311    0         0              5290          0               0
      3973    0         0              5169          0               0
      4768    0         0              2867          0               0
      5722    0         0              2093          0               0
      6866    0         0              3177          0               0
      8239    0         0              2161          0               0
      9887    0         0              1552          0               0
      11864   0         0              1200          0               0
      14237   0         0              834           0               0
      17084   0         0              1380          0               0
      20501   0         0              6219          0               0
      24601   0         0              4977          0               0
      29521   0         0              2114          0               0
      35425   0         0              6479          0               0
      42510   0         0              18417         0               0
      51012   0         0              5532          0               0
        • The two hump problem
            • Reads awesome until
            • Compaction!
        • Solution:
            • Throttle down compaction
            • Tune disk
            • Ignore it
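The two humps can be located mechanically in the Read Latency column of the histogram above. A minimal Python sketch, assuming a "hump" is a bucket whose count is strictly higher than both neighbours and keeping the two largest; the offsets and counts are copied from the slide (offsets in microseconds per the earlier slides), and the helper name is invented.

```python
offsets = [
    1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 17, 20, 24, 29, 35, 42, 50, 60, 72,
    86, 103, 124, 149, 179, 215, 258, 310, 372, 446, 535, 642, 770, 924,
    1109, 1331, 1597, 1916, 2299, 2759, 3311, 3973, 4768, 5722, 6866, 8239,
    9887, 11864, 14237, 17084, 20501, 24601, 29521, 35425, 42510, 51012,
]
read_latency_counts = [
    0, 0, 0, 1, 24, 56, 92, 283, 2834, 11954, 32621, 135311, 314195, 610665,
    536736, 162541, 25277, 7847, 5864, 9580, 5517, 3822, 1850, 394, 253, 305,
    4657297, 12748409, 7475534, 263549, 217171, 41908, 24876, 13566, 10875,
    9379, 7111, 5333, 5072, 3987, 5290, 5169, 2867, 2093, 3177, 2161, 1552,
    1200, 834, 1380, 6219, 4977, 2114, 6479, 18417, 5532,
]


def local_peaks(counts):
    """Indexes whose count is strictly higher than both neighbours."""
    return [i for i in range(1, len(counts) - 1)
            if counts[i - 1] < counts[i] > counts[i + 1]]


peaks = local_peaks(read_latency_counts)
# Keep the two tallest peaks, then list them in offset order.
two_biggest = sorted(
    sorted(peaks, key=lambda i: read_latency_counts[i], reverse=True)[:2])
hump_offsets = [offsets[i] for i in two_biggest]
print("humps at offsets:", hump_offsets)
```

The code only finds where the two modes sit; attributing the slower one to compaction is the talk's diagnosis, not something the histogram alone proves.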
    • Disk + Data Model
        • Understand the internals
            • Size of partition
            • Compaction
        • Learn how to measure
        • Load test
    • Thank you! Time for questions...
        *More? My data modeling talks:
            The Data Model is Dead, Long Live the Data Model
            Become a Super Modeler
            The World's Next Top Data Model