Al has been using Cassandra since version 0.6 and has spent the last few months doing little else but tuning Cassandra clusters. In this talk, Al will show how to tune Cassandra for efficient operation using multiple views into system metrics, including OS stats, GC logs, JMX, and cassandra-stress.
5. Questions to ask:
• Look at the available hardware and make an educated guess
• How many sockets/cores? Hyperthreading? NUMA?
• How much RAM?
• memory bandwidth matters
• What kind of storage?
• How much per node?
• What kind of network interface is it?
• Some clouds impose a packets-per-second (PPS) limit
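One way to collect answers to the questions above on a Linux node is sketched below. The function name `hw_summary` is made up for this example, and it assumes `lscpu` and `lsblk` (util-linux) may or may not be installed, so each is checked first. It only reads; nothing is changed.

```shell
#!/bin/sh
# Print a short hardware summary for one node. Read-only, no root needed.
hw_summary() {
    # logical CPUs visible to the OS (hyperthreads count as CPUs here)
    echo "logical_cpus: $(getconf _NPROCESSORS_ONLN 2>/dev/null || echo unknown)"

    # sockets, cores per socket, threads per core, NUMA node count
    if command -v lscpu >/dev/null 2>&1; then
        lscpu | grep -E '^(Socket|Core|Thread|NUMA node\(s\))' || true
    fi

    # total RAM as reported by the kernel (kB)
    if [ -r /proc/meminfo ]; then
        grep '^MemTotal' /proc/meminfo
    fi

    # block devices: ROTA=1 is a spinning disk, ROTA=0 is SSD/NVMe
    if command -v lsblk >/dev/null 2>&1; then
        lsblk -d -o NAME,SIZE,ROTA,MODEL 2>/dev/null || true
    fi
}

hw_summary
```

Memory bandwidth and cloud PPS limits do not show up in this output; those still have to come from the vendor specs or from measurement.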
10. JVM
• Use Hotspot Java 8 >= u45
• Java 7 is EOL and slower
• OpenJDK is fine
• Zulu is a handy way to get the latest
• http://www.azulsystems.com/products/zulu
• Speaking of Azul …
• Some Datastax customers are having success with C4
• But I can’t talk about any of them
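A minimal sketch for checking the "Java 8 >= u45" advice on a node. The function name `java_version_ok` is hypothetical, and the sketch assumes the pre-Java-9 version string format (`1.8.0_NN`); any other format is rejected rather than guessed at.

```shell
#!/bin/sh
# Return 0 if the version string is Java 8 with update >= 45, else 1.
java_version_ok() {
    ver=$1
    case "$ver" in
        1.8.0_*)
            update=${ver#1.8.0_}        # strip the "1.8.0_" prefix
            update=${update%%[!0-9]*}   # drop suffixes such as "-b14"
            [ -n "$update" ] && [ "$update" -ge 45 ]
            ;;
        *)
            return 1                    # Java 7, 8 GA, or unknown format
            ;;
    esac
}

# Check the java on the PATH, if there is one.
if command -v java >/dev/null 2>&1; then
    # "java -version" prints to stderr; the version is the quoted field
    current=$(java -version 2>&1 | awk -F'"' '/version/ {print $2; exit}')
    if java_version_ok "$current"; then
        echo "ok: $current"
    else
        echo "check JVM version: $current"
    fi
fi
```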
12. cassandra-env.sh: CMS
MAX_HEAP_SIZE=8G
HEAP_NEWSIZE=2G # start here, adjust to workload
# http://blog.ragozin.info/2012/03/secret-hotspot-option-improving-gc.html
JVM_OPTS="$JVM_OPTS -XX:+UnlockDiagnosticVMOptions"
JVM_OPTS="$JVM_OPTS -XX:ParGCCardsPerStrideChunk=4096"
# these will need to be adjusted to the workload; start here
JVM_OPTS="$JVM_OPTS -XX:SurvivorRatio=2"
JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=15"
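Adjusting SurvivorRatio and MaxTenuringThreshold to the workload needs evidence, and GC logs are the usual source. The lines below are a sketch of Java 8 HotSpot GC logging flags in the same cassandra-env.sh style. The log path in `GC_LOG` is an assumption; point it wherever your install keeps logs.

```shell
# Assumed log location; change to match the local layout.
GC_LOG=${GC_LOG:-/var/log/cassandra/gc.log}

JVM_OPTS="$JVM_OPTS -Xloggc:$GC_LOG"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails"
JVM_OPTS="$JVM_OPTS -XX:+PrintGCDateStamps"
# shows how many bytes survive at each age, which is what
# MaxTenuringThreshold and SurvivorRatio are tuned against
JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"
# reports total stop-the-world time, not just GC pauses
JVM_OPTS="$JVM_OPTS -XX:+PrintGCApplicationStoppedTime"
```

If most objects die at age 1 or 2 in the tenuring output, a lower threshold may be enough; if survivor spaces overflow, the new generation is probably too small for the workload.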
16. cassandra.yaml: commitlog
# Cassandra >= 2.1.9
commitlog_segment_recycling: false
# on SSDs and some HDD RAID
trickle_fsync: true
trickle_fsync_interval_in_kb: 1024
# and/or set vm.dirty_background_bytes low (shell, as root; not a cassandra.yaml setting)
echo 8388608 > /proc/sys/vm/dirty_background_bytes
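The echo above does not survive a reboot. A sketch of making it persistent follows; the helper name `sysctl_line` and the file name `99-cassandra.conf` are assumptions for illustration, and 8 MiB is simply the value from the slide.

```shell
#!/bin/sh
# Emit a sysctl.conf-style line for a dirty_background_bytes value in MiB.
sysctl_line() {
    mib=$1
    echo "vm.dirty_background_bytes = $((mib * 1024 * 1024))"
}

# 8 MiB is the value used above (8388608 bytes)
sysctl_line 8

# To persist it (as root), something like:
#   sysctl_line 8 > /etc/sysctl.d/99-cassandra.conf
#   sysctl --system
# Current value on this host, if readable:
#   cat /proc/sys/vm/dirty_background_bytes
```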
17. cassandra.yaml: miscellaneous
num_tokens: 32 # or 1, if you prefer
# default in OSS is “all”
internode_compression: dc
# Cassandra >= 2.1.5
otc_coalescing_strategy: TIMEHORIZON
# https://issues.apache.org/jira/browse/CASSANDRA-8611
streaming_socket_timeout_in_ms: 600000
18. cassandra: schema
• The data model is the single most important factor for performance!
• Check your compression block size (per table)
• Use size-tiered compaction (STCS)
• leveled compaction (LCS) for read-heavy workloads on fast storage
• the current default of 160MB sstable_size_in_mb is fine
• DTCS for time series (http://www.datastax.com/dev/blog/dtcs-notes-from-the-field)
20. Linux: storage
cd /sys/block
for drive in sd* xvd* vd* nvme*
do
  # skip glob patterns that matched no device on this host
  [ -d "$drive/queue" ] || continue
  echo deadline > "$drive/queue/scheduler"
  echo 8 > "$drive/queue/read_ahead_kb"
  # only on fast SSDs
  echo 0 > "$drive/queue/nomerges"
done
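These sysfs settings are lost on reboot and are easy to apply to the wrong device, so it helps to read them back. A minimal read-only sketch; the function name `show_queue_settings` is made up here, and the optional argument exists only so the function can be pointed at a different sysfs root.

```shell
#!/bin/sh
# Print scheduler, read-ahead and nomerges for every block device found
# under the given sysfs root (defaults to /sys/block).
show_queue_settings() {
    root=${1:-/sys/block}
    for q in "$root"/*/queue; do
        [ -d "$q" ] || continue              # glob matched nothing
        dev=$(basename "$(dirname "$q")")
        sched=$(cat "$q/scheduler" 2>/dev/null)
        ra=$(cat "$q/read_ahead_kb" 2>/dev/null)
        nm=$(cat "$q/nomerges" 2>/dev/null)
        echo "$dev scheduler=$sched read_ahead_kb=$ra nomerges=$nm"
    done
}

show_queue_settings
```

The active scheduler is the one shown in square brackets in the scheduler field.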
21. Linux: RAID & filesystems
• use xfs
• ext4 if you must
• ZFS if you love yourself and want to be happy
• btrfs if you like to live dangerously
• RAID*: Pass stripe size & width to mkfs whenever possible
• RAID0 is by far the most common choice
• RAID10 is fine if you can afford the disks
• RAID5/6 in some circumstances, but there’s a tradeoff
• JBOD is great but has tradeoffs
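"Pass stripe size & width to mkfs" can be sketched for xfs as follows. The helper `xfs_mkfs_cmd` is hypothetical and only prints the command instead of running it. It assumes you know the RAID chunk size in KiB and the number of data-bearing disks (all disks for RAID0, half for RAID10, disks minus parity for RAID5/6). `/dev/md0` is a placeholder device.

```shell
#!/bin/sh
# Print an mkfs.xfs command with stripe geometry.
#   $1 = RAID chunk (stripe unit) size in KiB
#   $2 = number of data-bearing disks
#   $3 = target block device
xfs_mkfs_cmd() {
    chunk_kb=$1
    data_disks=$2
    device=$3
    # su = stripe unit, sw = stripe width as a multiple of su
    echo "mkfs.xfs -d su=${chunk_kb}k,sw=${data_disks} ${device}"
}

# Example: 4-disk RAID0 with a 64 KiB chunk
xfs_mkfs_cmd 64 4 /dev/md0
```

Printing rather than executing is deliberate: mkfs destroys whatever is on the device, so the command is worth reading before it is pasted.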
23. Disable Frequency Scaling
# make sure the CPUs run at max frequency
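for sysfs_cpu in /sys/devices/system/cpu/cpu[0-9]*
do
  # many VMs expose no cpufreq interface; skip those CPUs
  [ -w "$sysfs_cpu/cpufreq/scaling_governor" ] || continue
  echo performance > "$sysfs_cpu/cpufreq/scaling_governor"
done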
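To confirm the governor change took effect on every core, a read-back along these lines can be used. The function name `count_non_performance` is made up for this sketch, and the optional argument only exists so it can be run against a different sysfs root.

```shell
#!/bin/sh
# Count CPUs whose scaling governor is something other than "performance".
# CPUs without a cpufreq interface are ignored.
count_non_performance() {
    root=${1:-/sys/devices/system/cpu}
    n=0
    for gov in "$root"/cpu[0-9]*/cpufreq/scaling_governor; do
        [ -r "$gov" ] || continue            # no cpufreq on this CPU
        [ "$(cat "$gov")" = "performance" ] || n=$((n + 1))
    done
    echo "$n"
}

echo "CPUs not in performance mode: $(count_non_performance)"
```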