Cassandra meetup 20150331

1. Time Series Metrics with Cassandra
   @WrathOfChris · blog.wrathofchris.com · github.com/WrathOfChris

2. About Me
   • Chris Maxwell
   • @WrathOfChris
   • Sr Systems Engineer @ Ubiquiti Networks
   • Cloud Guy
   • DevOps

3. Mission
   • Metrics service for internal services
   • Deliver 90 → 60 → 30 days of system and app metrics
   • Gain experience with Cassandra

4. History
   Ancient designs, aging tools, pitfalls (image: https://flic.kr/p/6pqVnP)

5. Graphite (v1)
   • Single instance
   • carbon-relay + 2-4 carbon-cache processes (one per CPU)

6. Graphite (v1) Problems
   • Single point of SUCCESS!
   • Can grow to 16-32 cores, but hits I/O saturation
   • Carbon write-amplifies 10x (flushes every 10s)

7. Graphite (v2)
   • Frontend: carbon-relay
   • Backend: carbon-relay + 4x carbon-cache
   • m3.2xlarge ephemeral SSD
   • Manual consistent-hash by IP
   • Replication 3

8. Graphite (v2) Problems
   • Kind of like a Dynamo, but not
   • Replacing a node requires a full partition-key shuffle
   • Adding 5 nodes took 6 days on 1Gbps to re-replicate the ring
   • Less than 50% disk free means pain during a reshuffle

9. Limitations
   • Cloud Native
   • Avoid manual intervention
   • Ephemeral SSD > EBS
   (image: https://flic.kr/p/2hZy6P)

10. Design
    What we set out to build (image: https://flic.kr/p/2spiXb)

11. Graphite (v3)
    …it got complicated…

12. Graphite (v3) Ingest
    • carbon-c-relay: https://github.com/grobian/carbon-c-relay
    • cyanite: https://github.com/pyr/cyanite
    • cassandra

13. Graphite (v3) Retrieval
    • graphite-api: https://github.com/brutasse/graphite-api
    • grafana: https://github.com/grafana/grafana
    • cyanite: https://github.com/pyr/cyanite
    • elasticsearch (metric path cache)

14. (architecture diagram)

15. Journey
    Lessons learned along the way (image: https://flic.kr/p/hjY15L)
16. Size Tiered Compaction
    • A Sorted String Table (SSTable) is an immutable data file
    • New data is written to small SSTables
    • Periodically merged into larger SSTables

17. Size Tiered Compaction
    • Merges 4 similarly sized SSTables into 1 new SSTable
    • Data migrates into larger SSTables that are compacted less regularly
    • Disk space required: the sum of the 4 largest SSTables
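The disk-space rule on the slide above can be sketched numerically. This is an illustrative calculation only, not Cassandra's actual compaction code; the function name and SSTable sizes are made up:

```python
# Illustrative only: size-tiered compaction merges the N largest
# similarly sized SSTables, so the scratch space you must keep free
# is roughly the sum of the merge candidates.
def stcs_scratch_space_gb(sstable_sizes_gb, merge_width=4):
    """Worst case: merging the `merge_width` largest SSTables at once."""
    largest = sorted(sstable_sizes_gb, reverse=True)[:merge_width]
    return sum(largest)

print(stcs_scratch_space_gb([100, 80, 40, 20, 10, 5]))  # 240
```

With a 100GB top-tier SSTable in the mix, you need 240GB free just for one merge, which is why the later slides call out the 50% free-space requirement for major compactions.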
18. Size Tiered Compaction
    • Updating a partition frequently may spread it across many SSTables
    • Metrics workload writes to all partitions, every period

19. Size Tiered Compaction
    • Metrics workload writes to all partitions, every period
    • Result: range queries that spanned 50+ SSTables !!!

20. Size Tiered Compaction
    Getting to the older data…
    • Ingest 25% more data, or
    • Major Compaction:
      • Requires 50% free space
      • Compacts all SSTables into 1 large SSTable
21. Aside: DELETE
    • DELETE is the INSERT of a TOMBSTONE at the end of a partition
    • INSERTs with TTL become tombstones in the future
    • Tombstones live for at least gc_grace_seconds
    • Data is only deleted during compaction
    (image: https://flic.kr/p/35RACf)

22. gc_grace_seconds
    Grace is getting something you don't deserve
    (time to nodetool repair a node that is down)

23. gc_grace_seconds
    Without repair within the grace period, deleted data reappears!

24. Time To Live
    • INSERT with TTL becomes a tombstone after expiry
    • 10s resolution for 6 hours
    • 60s for 3 days
    • 300s for 30 days
    (image: https://flic.kr/p/6Fxv7M)

25. TTL
    • gc_grace_seconds is 10 days (by default)
    • 10s for 6 hours → 10.25 days on disk
    • 60s for 3 days → 13 days
    • 300s for 30 days → 40 days
    (image: https://flic.kr/p/gBLHYf)
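The retention arithmetic on the slide above is simply intended retention plus the tombstone grace period; a quick sketch with the default 10-day gc_grace_seconds:

```python
# Expired cells become tombstones, but the data still occupies disk
# until gc_grace_seconds (default 10 days) has elapsed AND a
# compaction has run over it.
GC_GRACE_DAYS = 10

def days_on_disk(retention_days):
    return retention_days + GC_GRACE_DAYS

for period, retention in [("10s", 0.25), ("60s", 3), ("300s", 30)]:
    print(f"{period} rollup: {days_on_disk(retention)} days on disk")
# 10s rollup: 10.25 days on disk
# 60s rollup: 13 days on disk
# 300s rollup: 40 days on disk
```

In other words, the "30 days" tier actually costs 40 days of disk, a 33% overhead purely from tombstone grace.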
26. 1.4TB disks
    (images: https://flic.kr/p/4LNiXg, https://flic.kr/p/35RACf)

27. Levelled Compaction
    Based on Google's LevelDB implementation

28. Levelled Compaction
    • Data is ingested at Level 0
    • Immediately compacted and merged with L1
    • Partitions are merged up into Ln
    • 90% of partition data is guaranteed to be in the same level

29. Levelled Compaction
    • Metrics workload writes to all partitions, every period
    • Immediately rolled up to L1
    • Immediately rolled up to L2
    • Immediately rolled up to L3
    • Immediately rolled up to L4
    • Immediately rolled up to L5

30. Levelled Compaction
    • Metrics workload writes to all partitions, every period
    • 1 batch of writes → 5 writes

31. Increasing write rate, constant ingest rate

32. Increasing write rate, constant ingest rate
    (image: https://flic.kr/p/4LNiXg)

33. compaction_throughput_mb_per_sec: 128
    …then 0 (unlimited)

34. Speeding Compactions … Don't Do This …
    multithreaded_compaction: true
    in_memory_compaction_limit_in_mb: 256
35. Date Tiered Compaction

36. Date Tiered Compaction
    • Written by Björn Hegerfors at Spotify
    • Experimental!
    • Released in 2.0.11 / 2.1.1
    • Group data by time
    • Compact by time
    • Drop expired data by time
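The "group by time" idea can be sketched as bucketing SSTables into fixed windows. This is a toy model for intuition, not DTCS's actual tiering math:

```python
from collections import defaultdict

# Toy model: bucket SSTables by the window containing their minimum
# timestamp, so same-age data is compacted together and a whole
# window's SSTables can be dropped once every cell in them has expired.
def group_by_window(sstable_min_timestamps, window_secs=3600):
    windows = defaultdict(list)
    for ts in sstable_min_timestamps:
        windows[ts // window_secs].append(ts)
    return dict(windows)

print(group_by_window([10, 3599, 3600, 7300]))
# {0: [10, 3599], 1: [3600], 2: [7300]}
```

Because a metrics workload writes strictly in time order, each window stops receiving new data once it closes, which is exactly the property the levelled strategy above lacked.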
37. Compact SSTables by date window

38. "…but the docs say 8GB maximum heap!"
    MAX_HEAP_SIZE=16G
    HEAP_NEWSIZE=2048M

39. GC settings via Rick Branson, Instagram
    http://www.slideshare.net/planetcassandra/cassandra-summit-2014-cassandra-at-instagram-2014
    -XX:+CMSScavengeBeforeRemark
    -XX:CMSMaxAbortablePrecleanTime=60000
    -XX:CMSWaitDuration=30000

40. All systems normal
    Inadvertently tested 30,000 writes/sec during launch

41. Cloud Native
    (image: http://wattsupwiththat.com/2015/03/17/spaceship-lenticular-cloud-maybe-the-coolest-cloud-picture-evah/)

42. Cloud Native
    Ec2MultiRegionSnitch

43. Cloud Native
    Ephemeral RAID0
    -Djava.io.tmpdir=/mnt/cassandra/tmp

44. Disable the AutoScaling Terminate process:
    aws autoscaling suspend-processes --scaling-processes Terminate

45. Cloud Native
    This design works to 50 instances per region

46. Security Groups
    IAM instance-profile role
    Security Group + (per region) Security Group

47. Management (OpsCenter)
    IAM instance-profile role
    Security Group + (per region) Security Group

48. Internode Encryption
    server_encryption_options:
      internode_encryption: all
    • keytool -genkeypair -alias test-cass -keyalg RSA -validity 3650 -keystore test-cass.keystore
    • keytool -export -alias test-cass -keystore test-cass.keystore -rfc -file test-cass.crt
    • keytool -import -alias test-cass -file test-cass.crt -keystore test-cass.truststore
49. Seeds
    Cheated….

50. Seeds
    • Select the first 3 nodes from each region, using Autoscale Group order
    • Ignore (self) as a seed when bootstrapping the first 3 nodes in each region
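The seed rule above can be sketched as follows. This is a hypothetical helper, not the deck's actual tooling; the region names and IPs are made up, and a real deployment would pull the instance list from the autoscaling API:

```python
# Hypothetical sketch of the rule on the slide: the first three
# instances per region, in autoscaling-group order, act as seeds;
# a bootstrapping node never lists itself as its own seed.
def pick_seeds(instances_by_region, self_ip, per_region=3):
    seeds = []
    for instances in instances_by_region.values():
        seeds.extend(ip for ip in instances[:per_region] if ip != self_ip)
    return seeds

print(pick_seeds(
    {"us-east-1": ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"],
     "us-west-2": ["10.1.0.1", "10.1.0.2"]},
    self_ip="10.0.0.2"))
# ['10.0.0.1', '10.0.0.3', '10.1.0.1', '10.1.0.2']
```

Excluding (self) matters because a node configured as its own seed will not bootstrap; deriving the list deterministically from ASG order means every node computes the same seeds with no coordination.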
51. General
    • >= 4 cores per node, always
    • >= 8 cores as soon as feasible
    • EC2 sweet spots:
      • m3.2xlarge (8c/160GB) for small workloads
      • i2.2xlarge (8c/1.6TB) for production
    • Avoid c3.2xlarge: the CPU-to-memory ratio is too high

52. Breaking News!
    Dense-storage (d2) instances for EC2

53. Questions?

54. d2 instances
    Joining a node: system/network

55. d2 instances
    Joining a node: disk performance

56. General Metrics

57. General Cassandra Metrics

58. Metrics: CPU (DateTiered)

59. Metrics: JVM (DateTiered)

60. Metrics: Compaction/CommitLog (DateTiered)
