Cassandra SF 2012 - Technical Deep Dive: query performance
Aaron Morton, Apache Cassandra Committer
@aaronmorton
www.thelastpickle.com

Presentation Transcript

  • TECHNICAL DEEP DIVE: QUERY PERFORMANCE. CASSANDRA SF 2012. Aaron Morton, Apache Cassandra Committer, @aaronmorton, www.thelastpickle.com. Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License.
  • Today: Write Path, Read Path.
  • 7 Microseconds of Terror.
  • Standard write path... 1. Append to Commit Log. 2. Merge with Memtable.
  • Blast through the thin Martian atmosphere! (Also time to synchronise around the commit log.)
  • Write path details... Acquire read lock from server-wide read-write switchLock.
  • Memtable flush path acquires the write lock from switchLock.
  • Flush will block if memtable_flush_queue_size Memtables are waiting to flush.
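The blocking behaviour described in the three slides above can be sketched in Python (not Cassandra's Java). This is a toy model under stated assumptions. `Table`, `write` and `flush_one` are hypothetical names. A plain `threading.Lock` stands in for the read-write switchLock, so writers here serialize, which real Cassandra writers holding the read lock do not. A bounded `queue.Queue` stands in for the flush queue.

```python
import queue
import threading


class Table:
    """Toy model of a memtable with a bounded flush queue (hypothetical names)."""

    def __init__(self, memtable_flush_queue_size, flush_threshold):
        # Stand-in for switchLock; a real read-write lock would let writers share it.
        self.switch_lock = threading.Lock()
        # At most memtable_flush_queue_size memtables may wait to be flushed.
        self.flush_queue = queue.Queue(maxsize=memtable_flush_queue_size)
        self.flush_threshold = flush_threshold
        self.memtable = {}

    def write(self, key, value, timeout=None):
        with self.switch_lock:
            self.memtable[key] = value
            if len(self.memtable) >= self.flush_threshold:
                # Blocks (or raises queue.Full after `timeout`) when the queue is
                # full. The lock is held meanwhile, so every other writer stalls.
                self.flush_queue.put(self.memtable, timeout=timeout)
                self.memtable = {}

    def flush_one(self):
        """Pretend to flush the oldest queued memtable to disk."""
        return self.flush_queue.get_nowait()
```

With a queue size of 1, a second flush request stalls until `flush_one` drains the queue. That stall is the effect the latency charts for memtable_flush_queue_size measure.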
  • memtable_flush_queue_size test... m1.xlarge Cassandra node; m1.xlarge client node; 1 CF with 6 Secondary Indexes; 1 Client Thread; 10,000 Inserts, 100 Columns per Row; 1100 bytes per Column.
  • memtable_flush_queue_size test... create column family WithIndex with comparator = AsciiType and default_validation_class = AsciiType and key_validation_class = AsciiType and max_compaction_threshold = 0 and min_compaction_threshold = 0 and compression_options = null and column_metadata = [{ column_name : 0000000000, validation_class : AsciiType, index_type : 0, index_name : 0000000000},
  • Chart: CF write latency and memtable_flush_queue_size. Series: memtable_flush_queue_size=7, memtable_flush_queue_size=1. Y axis: latency in microseconds (0 to 1,200). X axis: 85th, 95th, 99th and 100th percentiles.
  • Chart: Request latency and memtable_flush_queue_size. Series: memtable_flush_queue_size=7, memtable_flush_queue_size=1. Y axis: latency in microseconds (0 to 5,000,000). X axis: 85th, 95th, 99th and 100th percentiles.
  • Deploy the Hypersonic parachute! (Also time to write to the CommitLog.)
  • Commit Log... Write to CommitLog if Keyspace wide durable_writes is enabled. (Enabled by default.)
  • durable_writes test... 10,000 Inserts; 50 Columns per Row; 50 bytes per Column.
  • Chart: Request latency and durable_writes (1 client). Series: enabled, disabled. Y axis: latency in microseconds (0 to 7,000). X axis: 85th, 95th and 99th percentiles.
  • Chart: Request latency and durable_writes (10 clients). Series: enabled, disabled. Y axis: latency in microseconds (0 to 30,000). X axis: 85th, 95th and 99th percentiles.
  • Chart: Request latency and durable_writes (20 clients). Series: enabled, disabled. Y axis: latency in microseconds (0 to 90,000). X axis: 85th, 95th and 99th percentiles.
  • CommitLog has either periodic or batch file syncing. (Periodic with fsync every 10 seconds is the default.)
  • Periodic commit log adds the mutation to a queue, then acknowledges. The Commit Log is appended to by a single thread, and sync is called every commitlog_sync_period_in_ms.
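A minimal, deterministic sketch of the periodic mode described above, assuming an explicit clock instead of a real writer thread. `PeriodicCommitLog`, `add` and `tick` are hypothetical names, and the "sync" only moves items between two lists.

```python
class PeriodicCommitLog:
    """Toy model: acknowledge on enqueue, sync on a timer (hypothetical names)."""

    def __init__(self, sync_period_ms):
        self.sync_period_ms = sync_period_ms
        self.last_sync_ms = 0
        self.unsynced = []   # appended but not yet fsynced
        self.synced = []     # durable on disk in this model

    def add(self, mutation, now_ms):
        """Queue the mutation and return the time the client is acknowledged."""
        self.unsynced.append(mutation)
        return now_ms        # acknowledged immediately, before any sync

    def tick(self, now_ms):
        """Called by the single writer thread; syncs once the period has elapsed."""
        if now_ms - self.last_sync_ms >= self.sync_period_ms:
            self.synced.extend(self.unsynced)
            self.unsynced.clear()
            self.last_sync_ms = now_ms
```

In this model the acknowledgement never waits for the sync. An acknowledged mutation can therefore sit in `unsynced` for up to one sync period.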
  • CommitLog tests... 10,000 Inserts; 50 Columns per Row; 50 bytes per Column.
  • Chart: Request latency and commitlog_sync_period_in_ms. Series: 10,000 ms, 10 ms. Y axis: latency in microseconds (170 to 220). X axis: 85th, 95th and 99th percentiles.
  • Batch commit log adds the mutation to a queue and waits before acknowledging. The writer thread processes mutations for commitlog_sync_batch_window_in_ms, then syncs, then signals.
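The batch mode can be sketched the same way. This is a simplified model with stated assumptions: a batch opens when its first mutation arrives, stays open for the window, and the sync itself takes no time. `batch_ack_times` is a hypothetical helper, not a Cassandra API.

```python
def batch_ack_times(arrivals_ms, window_ms):
    """Return the time each mutation is acknowledged under a batch-sync model.

    Every mutation in a batch is acknowledged together, when the batch window
    that was opened by the first mutation in the batch closes.
    """
    acks = []
    batch_end = None
    for t in sorted(arrivals_ms):
        if batch_end is None or t > batch_end:
            batch_end = t + window_ms   # this mutation opens a new batch
        acks.append(batch_end)          # the client waits until the batch syncs
    return acks
```

With a 50 ms window, mutations arriving at 0, 10 and 60 ms are acknowledged at 50, 50 and 110 ms. With a 0 ms window, each is acknowledged at its arrival time.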
  • Chart: Request latency and commitlog_sync_batch_window_in_ms. Series: 50 ms, 0 ms. Y axis: latency in microseconds (600 to 800). X axis: 85th, 95th and 99th percentiles.
  • Chart: Request latency comparing periodic and batch sync. Series: periodic, batch. Y axis: latency in microseconds (0 to 800). X axis: 85th, 95th and 99th percentiles.
  • Engage the rocket powered sky crane! (Also time to merge the mutation with the current Memtable.)
  • Merge mutation... Row level Isolation provided via SnapTree. (https://github.com/nbronson/snaptree)
  • 00 REM Cassandra for C64
    05 REM Clone row_cols into my_cols
    10 GOSUB 1000
    20 FOR col = write.first_col TO write.last_col
    25 REM Add or Reconcile col with my_cols
    30 GOSUB 2000
    40 IF my_cols != row_cols THEN GOTO 05
    50 NEXT col
    55 REM Atomic swap row_cols with my_cols
    60 GOSUB 3000
    70 IF swapped_cols = FALSE THEN GOTO 05
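The clone, reconcile and atomic-swap loop in the BASIC listing above can be sketched in Python. This is an illustration only. `AtomicRef`, `reconcile` and `apply_write` are hypothetical names, a lock emulates compare-and-swap, and a plain `dict` copy replaces SnapTree's cheap copy-on-write clone. Columns are modelled as `(value, timestamp)` tuples, and timestamp ties keep the existing column, which is a simplification.

```python
import threading


class AtomicRef:
    """Tiny compare-and-swap cell; a lock emulates the hardware CAS."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def get(self):
        return self._value

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._value is expected:   # identity check, like a reference CAS
                self._value = new
                return True
            return False


def reconcile(old, new):
    """Keep the column with the higher timestamp (ties keep the old column)."""
    return new if old is None or new[1] > old[1] else old


def apply_write(row_ref, write):
    """Clone the row, merge the write into the clone, then try to swap it in."""
    while True:
        row_cols = row_ref.get()
        my_cols = dict(row_cols)                          # clone row_cols
        for name, col in write.items():
            my_cols[name] = reconcile(my_cols.get(name), col)
        if row_ref.compare_and_set(row_cols, my_cols):    # atomic swap
            return my_cols
        # Another writer changed the row first: retry from a fresh clone.
```

Writers to different rows never contend in this model. Writers to the same row may redo the merge after a failed swap, which would be consistent with the single-row results in the row concurrency test.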
  • Row concurrency tests... 10,000 Columns per Row; 50 bytes per Column; 50 Columns per Insert.
  • Chart: CF write latency and row concurrency (10 clients). Series: different rows, single row. Y axis: latency in microseconds (0 to 2,000). X axis: 85th, 95th and 99th percentiles.
  • Write path with KEYS secondary indexes... 1. Append to Commit Log. 2. Read modified Index Columns. 3. Merge with Memtable. 4. Update Secondary Indexes.
  • Secondary Indexes... synchronized access to indexed rows. (Keyspace wide)
  • Index concurrency tests... CF with 2 Indexes; 10,000 Inserts; 6 Columns per Row; 35 bytes per Column; alternating column values.
  • Chart: Request latency and index concurrency (10 clients). Series: different rows, single row. Y axis: latency in microseconds (0 to 4,000). X axis: 85th, 95th and 99th percentiles.
  • Read the modified indexed columns, or all indexed columns for a row deletion.
  • Insert or delete changes to index rows.
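The read-before-write step that a KEYS index adds can be sketched as follows. This is a toy model. `write_indexed` and the two dictionaries are hypothetical stand-ins for the Memtable and the index rows, and the commit log append and the keyspace-wide synchronization are left out.

```python
def write_indexed(rows, index, row_key, column, new_value):
    """Write one indexed column and keep a value -> row-keys index in step.

    rows:  {row_key: {column: value}}          stand-in for the Memtable
    index: {(column, value): set of row_keys}  stand-in for the index rows
    """
    # Read the current value of the indexed column (the extra read).
    old_value = rows.get(row_key, {}).get(column)
    # Merge the new value into the row.
    rows.setdefault(row_key, {})[column] = new_value
    # Update the index: delete the stale entry, insert the new one.
    if old_value is not None and old_value != new_value:
        index[(column, old_value)].discard(row_key)
    index.setdefault((column, new_value), set()).add(row_key)
```

Every write to an indexed column pays for one read and up to two index mutations in this model. That would be consistent with the gap between the no-index and six-index results.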
  • Index tests... 10,000 Inserts; 50 Columns per Row; 50 bytes per Column.
  • Chart: Request latency and secondary indexes. Series: no indexes, six indexes. Y axis: latency in microseconds (0 to 3,000). X axis: 85th, 95th and 99th percentiles.
  • Today: Write Path, Read Path.
  • Two read paths... SliceByNamesReadCommand and SliceFromReadCommand.
  • Locating row start per SSTable... 1. Check BloomFilter. 2. Read KeyCache. 3. Read Index Samples, seek and partial scan -Index.db. (Step 3 only used if KeyCache lookup missed.)
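The three lookup steps above can be sketched for a single SSTable. The data structures are simplified assumptions. A Python `set` stands in for the Bloom filter, a `dict` for the key cache, and keys are ordered as plain strings rather than by token. `build_samples` and `locate_row` are hypothetical names.

```python
import bisect


def build_samples(index_entries, index_interval):
    """Sample every index_interval-th key of the sorted (key, offset) index."""
    positions = list(range(0, len(index_entries), index_interval))
    return [index_entries[p][0] for p in positions], positions


def locate_row(key, bloom, key_cache, sample_keys, sample_pos, index_entries):
    """Return the data-file offset of `key` in one SSTable, or None."""
    if key not in bloom:              # 1. Bloom filter says "definitely absent"
        return None
    if key in key_cache:              # 2. key cache hit: the index is not read
        return key_cache[key]
    # 3. Find the closest sample at or before the key, then scan forward.
    i = bisect.bisect_right(sample_keys, key) - 1
    if i < 0:
        return None                   # key sorts before the first key in the file
    for entry_key, offset in index_entries[sample_pos[i]:]:
        if entry_key == key:
            key_cache[key] = offset   # remember the position for the next read
            return offset
        if entry_key > key:
            break                     # passed where the key would be
    return None                       # Bloom filter false positive
```

A false positive from the filter costs a full step 3 that finds nothing, which is why the filter's accuracy matters for read latency.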
  • Column Family bloom_filter_fp_chance controls BloomFilter effectiveness.
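To give a feel for what the false positive chance costs in memory, the textbook Bloom filter sizing formulas can be evaluated for the two values used in the test. These are the standard formulas for an optimally configured filter. They are an assumption here and may not match how Cassandra actually sizes its filters.

```python
import math


def bits_per_key(fp_chance):
    """Textbook optimum: m/n = -ln(p) / (ln 2)^2 bits per stored key."""
    return -math.log(fp_chance) / (math.log(2) ** 2)


def optimal_hashes(fp_chance):
    """Textbook optimum: k = -log2(p) hash functions, rounded, at least 1."""
    return max(1, round(-math.log2(fp_chance)))


for p in (0.000744, 0.1):
    print(p, round(bits_per_key(p), 2), optimal_hashes(p))
```

Under these formulas a false positive chance of 0.000744 needs roughly 15 bits per key, while 0.1 needs under 5.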
  • bloom_filter_fp_chance tests... 1,000,000 Rows; 50 Columns per Row; 50 bytes per Column; commitlog_total_space_in_mb: 1; read random 10% of rows.
  • Chart: CF read latency and bloom_filter_fp_chance. Series: default (0.000744), 0.1. Y axis: latency in microseconds (0 to 7,000). X axis: 85th, 95th and 99th percentiles.
  • Chart: SSTables per read and bloom_filter_fp_chance (14 SSTables). Series: default (0.000744), 0.1. Y axis: SSTables per read. X axis: 85th, 95th and 99th percentiles.
  • Chart: Bloom Filter False Ratio and bloom_filter_fp_chance. Series: default (0.000744), 0.1. Y axis: false positive ratio.
  • key_cache_size_in_mb controls the size of the key cache.
  • key_cache_size_in_mb tests... 10,000 Rows; 50 Columns per Row; 50 bytes per Column; read all Rows.
  • Chart: CF read latency and key_cache_size_in_mb. Series: default (100MB, 100% hit rate), disabled. Y axis: latency in microseconds (0 to 300). X axis: 85th, 95th and 99th percentiles.
  • index_interval controls the size of the -Index.db sampling.
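The trade-off behind index_interval can be sketched with simple arithmetic, assuming one sample is kept for every index_interval keys, starting with the first key. `index_sample_stats` is a hypothetical helper.

```python
import math


def index_sample_stats(row_count, index_interval):
    """Return (samples held in memory, worst-case index entries scanned per read)."""
    samples = math.ceil(row_count / index_interval)
    worst_case_scan = min(row_count, index_interval)
    return samples, worst_case_scan


for interval in (128, 512):
    print(interval, index_sample_stats(100_000, interval))
```

A larger interval keeps fewer samples in memory but lengthens the partial scan of -Index.db whenever the key cache misses.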
  • index_interval tests... 100,000 Rows; 50 Columns per Row; 50 bytes per Column; key_cache_size_in_mb: 0; read 1 Column from random 10% of Rows.
  • Chart: CF read latency and index_interval. Series: index_interval=128 (default), index_interval=512. Y axis: latency in microseconds (0 to 20,000). X axis: 85th, 95th and 99th percentiles.
  • A Row Cache hit removes all disk IO for the read.
  • row_cache_size_in_mb controls the size of Row Cache.
  • row_cache_size_in_mb tests... 100,000 Rows; 50 Columns per Row; 50 bytes per Column; read all Rows.
  • Chart: CF read latency and row_cache_size_in_mb. Series: row_cache_size_in_mb=0 with key_cache_size_in_mb=100mb, row_cache_size_in_mb=100mb with key_cache_size_in_mb=0. Y axis: latency in microseconds (0 to 260). X axis: 85th, 95th and 99th percentiles.
  • Slice by Name (excluding Counter Columns)... 1. Order SSTables by maxTimestamp. 2. Break if SSTable covered by previous RowTombstone. 3. Remove irrelevant Columns from query. 4. Eagerly read Columns from SSTable. 5. Hoist Columns into current Size Tier.
  • When reading by Column name... A query on a wide row must read the entire Column Index.
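Steps 1 to 4 of the by-name read can be sketched as below. This is a simplified reading of the slide, not Cassandra's code. SSTables are modelled as dictionaries with hypothetical keys `max_timestamp`, `row_tombstone` and `columns`, columns are `(value, timestamp)` tuples, and step 5 (hoisting) is omitted.

```python
def read_by_names(sstables, names):
    """Collect the newest version of each named column across SSTables."""
    wanted = set(names)
    result = {}
    deleted_at = None   # newest row tombstone timestamp seen so far
    # 1. Visit SSTables from newest to oldest maxTimestamp.
    for sst in sorted(sstables, key=lambda s: s["max_timestamp"], reverse=True):
        # 2. Everything in this and older SSTables is shadowed by the tombstone.
        if deleted_at is not None and sst["max_timestamp"] < deleted_at:
            break
        # 3. Stop asking for columns already found with a newer timestamp
        #    than anything this SSTable can contain.
        wanted = {n for n in wanted
                  if n not in result or result[n][1] <= sst["max_timestamp"]}
        if not wanted:
            break
        tomb = sst.get("row_tombstone")
        if tomb is not None and (deleted_at is None or tomb > deleted_at):
            deleted_at = tomb
        # 4. Eagerly read the remaining columns from this SSTable.
        for n in wanted:
            col = sst["columns"].get(n)
            if col is not None and (n not in result or col[1] > result[n][1]):
                result[n] = col
    # Drop columns that the row tombstone shadows.
    return {n: c for n, c in result.items()
            if deleted_at is None or c[1] > deleted_at}
```

The ordering by maxTimestamp is what allows the early exits: once every requested column has a value newer than the next SSTable's maxTimestamp, older files need not be opened.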
  • Column Index tests... Read first Column by name from 1,200 Columns. Read first Column by name from 1,000,000 Columns.
  • CF read latency and Column Index... First Column from 1,200 First Column from 1,000,000 6,000 4,500Latecy Microseconds 3,000 1,500 0 85th 95th 99th
  • When reading by Column name... A narrow query on a wide row performs better.
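One way to see why a narrow query does better is to count how many blocks of a wide row the requested columns fall into. The figure of 1,000 columns per block is an illustrative assumption that ignores per-column overhead, and `index_blocks_touched` is a hypothetical helper.

```python
def index_blocks_touched(column_positions, columns_per_block):
    """Count the distinct row blocks that contain the requested columns."""
    return len({pos // columns_per_block for pos in column_positions})


COLUMNS_PER_BLOCK = 1_000                    # assumed, for illustration only
adjacent = range(500_000, 500_100)           # 100 columns from the middle
spread = range(0, 1_000_000, 10_000)         # 100 columns across the row

print(index_blocks_touched(adjacent, COLUMNS_PER_BLOCK))
print(index_blocks_touched(spread, COLUMNS_PER_BLOCK))
```

Under this assumption the adjacent columns sit in a single block, while the spread columns each land in a different block, so the spread read touches about a hundred times as many blocks.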
  • Name Locality tests... 1,000,000 Columns; 50 bytes per Column. Read 100 Columns from middle of row. Read 100 Columns spread across row.
  • Chart: CF read latency and name locality. Series: Adjacent Columns, Spread Columns. Y axis: latency in microseconds (0 to 200,000). X axis: 85th, 95th and 99th percentiles.
  • Slice... 1. Skip if SSTable covered by previous RowTombstone. 2. Read and merge Columns from SSTable. (May be eager or lazy reading.)
  • When slicing... A query using natural Column order without a start has the highest performance.
  • Start position tests... 1,000,000 Columns; 50 bytes per Column. Read first 100 Columns without start. Read first 100 Columns with start.
  • Chart: CF read latency and start position. Series: Without start position, With start position. Y axis: latency in microseconds (0 to 40,000). X axis: 85th, 95th and 99th percentiles.
  • When slicing... Any start column requires the Column index to be read.
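A small sketch of why a start column changes the work done, assuming the Column index is a sorted list holding the first column name of each block. `first_block_for_start` is a hypothetical helper.

```python
import bisect


def first_block_for_start(block_first_names, start):
    """Return the index of the block where a forward slice begins reading."""
    if start is None:
        return 0   # no start column: begin at the first block, no lookup needed
    # With a start column the index must be loaded and searched first.
    return max(0, bisect.bisect_right(block_first_names, start) - 1)
```

Without a start the slice can stream from the beginning of the row. With one, the index has to be read before the first column is returned, even when the start happens to be the first column.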
  • Start offset tests... 1,000,000 Columns; 50 bytes per Column. Read first 100 Columns with start. Read middle 100 Columns with start.
  • Chart: CF read latency and start offset. Series: First, Middle. Y axis: latency in microseconds (0 to 40,000). X axis: 85th, 95th and 99th percentiles.
  • When slicing... Reverse requires the Column index to be read.
  • Reversed tests... 1,000,000 Columns; 50 bytes per Column. Read first 100 Columns without start. Read last 100 Columns with reversed.
  • Chart: CF read latency and reversed. Series: Forward, Reversed. Y axis: latency in microseconds (0 to 40,000). X axis: 85th, 95th and 99th percentiles.
  • Aaron Morton, @aaronmorton, www.thelastpickle.com. Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License.