This document discusses using Cassandra to store time-series metrics data. It describes how the schema was matched to storage by using a measurement column family with rows organized by metric ID and time. It also covers optimizing data expiration through techniques like TTL expiration, synchronized compactions, and leveraging immutable sstable modification times. Effective monitoring is emphasized as well, including dashboards to track the ring and using Cassandra log volumes to identify issues.
24. #CASSANDRA13
Cleanup
• Not just for topology changes
• Tombstoned rows (not referenced)
• Rotated row keys decrease references
• Cons: Must process every sstable
26. #CASSANDRA13
Leverage SSTable Mod Time
• If now - mtime > TTL => all data in the sstable is expired
• We can quickly eliminate entire sstables (find's -mtime counts in days):
find -mtime +<TTL_in_days> -name '*.db' | xargs rm
• Fast and low overhead
• Cons: Requires a rolling restart
26G 2013-05-17 09:44 Metrics-metrics_60-hf-7209-Data.db
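The mtime rule on this slide can be sketched in Python as a dry-run check. This is a minimal sketch, not the presenters' tooling: the function name `expired_sstable_files` and its parameters are hypothetical, and it only lists candidates instead of deleting them. It assumes every column in the column family is written with the same TTL, and that any actual removal happens while the node is stopped, as part of the rolling restart noted above.

```python
import time
from pathlib import Path


def expired_sstable_files(data_dir, ttl_seconds, now=None):
    """Return *.db files under data_dir whose mtime is older than the TTL.

    Hypothetical helper: sstables are immutable, so if the file was last
    modified more than one TTL ago, every column in it has expired.
    """
    now = time.time() if now is None else now
    expired = []
    for path in sorted(Path(data_dir).rglob("*.db")):
        # Same test as the slide: now - mtime > TTL
        if now - path.stat().st_mtime > ttl_seconds:
            expired.append(path)
    return expired
```

Printing the returned paths before removing anything gives a chance to confirm that only sstables of the intended column family are selected.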
28. #CASSANDRA13
Increasing minor compactions
• By default, size-tiered compaction (STCS) requires a minimum of 4 sstables
• Leads to large non-compacted sstables
• Dropping the minimum to 2 can flatten the storage growth
nodetool setcompactionthreshold <ks> <cf> 2 <max>
• Cons: CPU/IO increase
32. #CASSANDRA13
Disk Errors => Throw Away
• If you ever see this, replace!
end_request: I/O error, dev xvdb, sector 467940617
end_request: I/O error, dev xvdb, sector 467940617
• Mark node down, bootstrap new
• No metric for this?
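One way to turn these kernel messages into a metric is to count them per device. The following is a rough sketch: the function name `count_io_errors` is hypothetical, and the pattern is based only on the two sample lines on this slide, so other kernel versions may word the message differently.

```python
import re
from collections import Counter

# Pattern taken from the sample lines on the slide (an assumption for
# other kernels, which may format I/O errors differently).
IO_ERROR_RE = re.compile(r"end_request: I/O error, dev (\w+), sector (\d+)")


def count_io_errors(lines):
    """Count I/O error lines per block device in kernel log output."""
    counts = Counter()
    for line in lines:
        match = IO_ERROR_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding it the output of `dmesg` or the system log and alerting on any non-zero count would match the "replace on first sight" policy above.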
33. #CASSANDRA13
Cassandra Log Volume
• Count log lines seen every 10 minutes
• Track over time
• Can identify:
  - Unbalanced workloads
  - Schema disagreements
  - Phantom gossip nodes
  - GC activity
• grep -v '\.java' => exceptions
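The 10-minute line count can be sketched as follows. This is an illustration under assumptions, not the presenters' collector: the function name `log_lines_per_window` is hypothetical, and it assumes each regular log line carries a `YYYY-MM-DD HH:MM:SS` timestamp. Lines without a timestamp, such as exception text, are counted in the window of the most recent timestamped line.

```python
import re
from collections import Counter
from datetime import datetime

# Assumed timestamp layout: "2013-05-17 09:44:01" somewhere in the line.
TS_RE = re.compile(r"(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def log_lines_per_window(lines, window_minutes=10):
    """Count log lines per fixed window (window_minutes should divide 60)."""
    counts = Counter()
    bucket = None
    for line in lines:
        match = TS_RE.search(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            # Floor the timestamp to the start of its window.
            floored = ts.minute - ts.minute % window_minutes
            bucket = ts.replace(minute=floored, second=0)
        if bucket is not None:
            counts[bucket] += 1
    return counts
```

Plotting these counts per node over time makes a node whose log volume diverges from the rest of the ring easy to spot.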