Cassandra Summit 2010 Performance Tuning

Brandon Williams, Riptano, Inc.
brandon@riptano.com / brandonwilliams@apache.org
@faltering, driftx on freenode
August 10, 2010

Making writes faster
  - Use a separate IO device for the commit log.
  - Hard to accomplish in the cloud:
    - Rackspace: one IO device, but it is persistent (RAID array underneath).
    - EC2: EBS is slow, and the local disk is not persistent.
      - You could put the commitlog on the ephemeral drive anyway, at the price of durability.
      - But then, why have a commitlog at all? You may be able to disable it in 0.7/0.8.
  - Real servers: one RAID array, bad RAID options.
  - Will anyone ever offer SSDs?

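Whether the commit log really has its own IO device is easy to check. A minimal sketch in Python, assuming the stock directory layout of /var/lib/cassandra/commitlog and /var/lib/cassandra/data; substitute whatever paths your storage-conf.xml or cassandra.yaml actually uses:

    import os

    # Hypothetical paths; point these at the directories from your own config.
    COMMITLOG_DIR = "/var/lib/cassandra/commitlog"
    DATA_DIR = "/var/lib/cassandra/data"

    def same_device(a, b):
        """True if both paths live on the same block device (same st_dev)."""
        return os.stat(a).st_dev == os.stat(b).st_dev

    if __name__ == "__main__":
        if same_device(COMMITLOG_DIR, DATA_DIR):
            print("commitlog shares a device with the data files: its sequential "
                  "writes will contend with read and compaction IO")
        else:
            print("commitlog has its own device")
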
What else?
  - concurrent writers (concurrent readers for reads): increase if you have lots of cores.
  - memtable flush writers: increase if you have lots of IO (see the rule-of-thumb sketch below).

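One commonly cited starting point is roughly 16 readers per data disk and 8 writers per core; treat it as a hypothesis to benchmark, not a rule. A sketch of that arithmetic:

    import multiprocessing

    def suggested_concurrency(num_data_disks, cores=None):
        """Rule-of-thumb starting values; verify against your own benchmarks."""
        cores = cores or multiprocessing.cpu_count()
        return {"concurrent_reads": 16 * num_data_disks,
                "concurrent_writes": 8 * cores}

    print(suggested_concurrency(num_data_disks=1))
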
What are all these options?
  - memtable throughput in MB
  - memtable operations in millions
  - memtable flush after minutes
  (The memtable flushes when whichever of these limits trips first; see the sketch below.)
  - Do bigger memtables improve writes? No, but they can improve reads. How?

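A rough sketch of that flush decision; the default values shown are illustrative, not recommendations:

    def should_flush(memtable_mb, memtable_ops, minutes_since_flush,
                     throughput_in_mb=64, operations_in_millions=0.3,
                     flush_after_mins=60):
        """Flush as soon as any one of the three limits is reached."""
        return (memtable_mb >= throughput_in_mb
                or memtable_ops >= operations_in_millions * 1_000_000
                or minutes_since_flush >= flush_after_mins)

    # 70 MB written but only 100k operations, 10 minutes since the last flush:
    print(should_flush(70, 100_000, 10))   # True: the size limit tripped first
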
Compaction: the slayer of reads
  - A necessary evil.
  - IO contention hell.
  - You can reduce compaction priority in 0.6.4 or later: -Dcassandra.compaction.priority=1
    - Reducing the priority affects CPU usage, not IO.
  - Constantly outstripping compaction means you need more nodes.
  - Avoid reading from slow hosts:
    - dynamic snitch (see the toy sketch below)
    - accrual failure detector

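The "avoid reading from slow hosts" point is what the dynamic snitch does for you: it keeps a recent-latency score per replica and routes reads to the fastest one. This is only a toy illustration of that idea (an exponentially weighted moving average per host), not Cassandra's implementation:

    class LatencyScores:
        """Toy per-host latency tracker: lower score = preferred replica."""

        def __init__(self, alpha=0.2):
            self.alpha = alpha    # weight given to the newest sample
            self.scores = {}      # host -> smoothed latency in ms

        def record(self, host, latency_ms):
            prev = self.scores.get(host, latency_ms)
            self.scores[host] = (1 - self.alpha) * prev + self.alpha * latency_ms

        def best(self, replicas):
            # Unknown hosts score 0.0 so they get tried at least once.
            return min(replicas, key=lambda h: self.scores.get(h, 0.0))

    scores = LatencyScores()
    scores.record("10.0.0.1", 2.0)     # a fast replica
    scores.record("10.0.0.2", 40.0)    # a replica busy compacting
    print(scores.best(["10.0.0.1", "10.0.0.2"]))   # -> 10.0.0.1
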
Compaction (cont'd)
  - Bigger memtables absorb more overwrites.
  - Fewer SSTables make for more efficient compaction.
  - If you write once and are then read-only, you *could* turn compaction off:
    - Merge-on-read and bloom filters save you (see the toy example below).
    - But someday you'll want to run repair.

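Merge-on-read and bloom filters save you because a read only touches the SSTables whose filter says the key might be present, and the newest value wins in the merge. A toy model of the filter check (real bloom filters use better hashing; this only shows the "skip files that can't contain the key" effect):

    import hashlib

    class ToyBloomFilter:
        def __init__(self, nbits=1024, nhashes=3):
            self.nbits, self.nhashes = nbits, nhashes
            self.bits = 0

        def _positions(self, key):
            for i in range(self.nhashes):
                digest = hashlib.md5(("%d:%s" % (i, key)).encode()).hexdigest()
                yield int(digest, 16) % self.nbits

        def add(self, key):
            for pos in self._positions(key):
                self.bits |= 1 << pos

        def might_contain(self, key):
            return all(self.bits & (1 << pos) for pos in self._positions(key))

    # Three SSTables, each with its own filter; only one contains "user42".
    sstables = []
    for keys in (["user1", "user7"], ["user42"], ["user9"]):
        bf = ToyBloomFilter()
        for k in keys:
            bf.add(k)
        sstables.append((keys, bf))

    # A read for "user42" seeks into one SSTable instead of all three.
    print([keys for keys, bf in sstables if bf.might_contain("user42")])

The false-positive rate is small but nonzero, which is why the read still verifies the key against the data it fetches.
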
Know your read pattern
  - How much data is in the working set?
    - Disk is slow: you want the working set in memory.
    - Sometimes you can't afford the cost.
  - How many reads are repeats?
  - Doing lots of random IO within a row? Tune column index size in KB (see the sketch below).

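Lots of random IO inside one row is what the column index setting addresses: a row is indexed in blocks of that many kilobytes, so a slice deep inside a big row does not have to scan from the start. A quick back-of-the-envelope, assuming the commonly shipped 64 KB default:

    def column_index_blocks(row_size_kb, column_index_size_in_kb=64):
        """Roughly how many column-index blocks a row of this size gets."""
        return max(1, row_size_kb // column_index_size_in_kb)

    print(column_index_blocks(10 * 1024))   # a 10 MB row -> ~160 index blocks
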
Caches
  - On a cold hit, each row requires two seeks:
    - one to find the row's position in the index (the key cache eliminates this)
    - another to read the row (the row cache eliminates this, too)
  - Columns in the row are contiguous afterwards, so make fat rows.
    - But not too fat, since the row is the unit of distribution.
  - The OS file cache: use a good OS.
  (The full lookup path is sketched below.)

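Put together, the cold-read path is: row cache (no seeks), else key cache for the row's position (saves the index seek), else seek the index, then seek the data file. A schematic sketch of that flow; the function names are illustrative, not Cassandra's internal API:

    def read_row(key, row_cache, key_cache, index_seek, data_seek):
        """Return (row, seeks): how many disk seeks the lookup cost."""
        seeks = 0
        if key in row_cache:                 # whole row in memory: zero seeks
            return row_cache[key], seeks

        position = key_cache.get(key)
        if position is None:                 # cold key: pay the index seek
            position = index_seek(key)
            seeks += 1
            key_cache[key] = position

        row = data_seek(position)            # always pay the data seek
        seeks += 1
        row_cache[key] = row
        return row, seeks

    row_cache, key_cache = {}, {}
    row, seeks = read_row("user42", row_cache, key_cache,
                          index_seek=lambda k: 1234,               # stand-in index file
                          data_seek=lambda pos: {"name": "alice"}) # stand-in data file
    print(seeks)   # 2 on a fully cold read; a repeat against the same caches costs 0
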
Caching Strategies
  - Key cache: excellent bang for your buck.
    - Half your seeks are gone.
    - A lot of keys fit in a relatively small amount of memory.
  - Row cache: all seeks are gone.
    - But more heap usage means more GC pressure; trying to use 32 GB of row cache will wreck you.
    - Estimating the correct size can be difficult (see the sizing sketch below):
      - Use the average row size in cfstats as a starting point.
      - In 0.7, each SSTable has a persistent row size histogram.
    - The penalty for being wrong can be catastrophic: OOM.
      - Measuring in-memory size can't be done programmatically in Java, or Cassandra would do it for you; this is why you can't set an absolute amount in bytes.
    - If you enable it on very fat rows, it can be bad; keep your indexes in a different column family.

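Since the row cache is sized in rows rather than bytes, the sizing exercise is roughly: how many average-sized rows, plus JVM overhead, fit in the heap you are willing to spend? A back-of-the-envelope sketch; the overhead factor is a guess to validate against your own heap, not a measured constant:

    def rows_cached_budget(heap_budget_mb, avg_row_size_bytes, overhead_factor=2.0):
        """How many rows of the given average size fit in the chosen heap budget.

        avg_row_size_bytes comes from `nodetool cfstats` (or, in 0.7, the
        per-SSTable row size histogram); overhead_factor pads for Java object
        overhead and is only an assumption.
        """
        budget_bytes = heap_budget_mb * 1024 * 1024
        return int(budget_bytes / (avg_row_size_bytes * overhead_factor))

    # e.g. 2 GB of heap set aside for the row cache, 4 KB average rows:
    print(rows_cached_budget(2048, 4096))   # ~262,000 rows, not "32 GB worth"
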
Caching Strategies (cont'd)
  - OS file cache: it's free, and no size estimation is needed.
  - mmap is great unless it makes you swap.
    - Switch to mmap index only.
    - Why do you have swap enabled, anyway? (A quick check is sketched below.)
  - Absolute numbers vs. percentages:
    - Percentages can be an OOM time bomb: a percentage-sized cache keeps growing as the number of keys grows.
    - It's harder to calculate how much memory the cache will use.

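The swap question is a one-liner to answer on Linux: /proc/swaps lists a header plus one line per active swap device. A minimal, Linux-specific check:

    def swap_enabled():
        """True if the kernel reports any active swap devices."""
        with open("/proc/swaps") as f:
            return len(f.readlines()) > 1    # only the header line means no swap

    print(swap_enabled())
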
Caching Strategies (cont'd)
  - Lookup order: row cache, then key cache, then disk (file cache?).
  - Sizing your caches:
    - a large key cache
    - a smaller row cache for very hot rows
    - leave the rest to the OS, and don't make your heap larger than needed.
  - Monitor hit rates via JMX (actually, monitor everything you can); see the sketch below.

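The hit rates live in JMX; the quickest way to watch them without a JMX console is to scrape nodetool cfstats. A hedged sketch that shells out to nodetool and greps the cache hit-rate lines; the exact label wording can differ between versions, so adjust the pattern to what yours prints:

    import re
    import subprocess

    def cache_hit_rates(host="localhost"):
        """Scrape key/row cache hit rates from `nodetool cfstats` output."""
        out = subprocess.run(["nodetool", "-h", host, "cfstats"],
                             capture_output=True, text=True, check=True).stdout
        # Lines look roughly like "Key cache hit rate: 0.95"; match loosely.
        return re.findall(r"(\w+ cache hit rate)\s*:\s*([0-9.NaN]+)", out, re.I)

    if __name__ == "__main__":
        for name, rate in cache_hit_rates():
            print(name, rate)
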
Test, Measure, Tweak, Repeat
  - Use stress.py as a baseline (make sure you have the Python multiprocessing module).
  - Then move to real-world data. (A minimal harness in the same spirit is sketched below.)

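stress.py ships with the Cassandra source and is the right baseline; the sketch below only shows the shape of such a harness (N processes hammering the cluster and reporting ops/sec). The do_insert function is a placeholder: wire it to your Thrift or pycassa client before trusting any numbers.

    import time
    from multiprocessing import Pool

    NUM_PROCS = 4
    OPS_PER_PROC = 10_000

    def do_insert(key):
        # Placeholder: replace with a real write (e.g. a pycassa/Thrift insert).
        pass

    def worker(proc_id):
        start = time.time()
        for i in range(OPS_PER_PROC):
            do_insert("key-%d-%d" % (proc_id, i))
        return OPS_PER_PROC / (time.time() - start)

    if __name__ == "__main__":
        with Pool(NUM_PROCS) as pool:
            rates = pool.map(worker, range(NUM_PROCS))
        print("aggregate ops/sec: %.0f" % sum(rates))
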
Settings you don't need to touch
  - commitlog rotation threshold in MB
  - SlicedBufferSizeInKB
  - FlushIndexBufferSizeInMB

The End
  Questions?
