Come hear about our experience scaling Cassandra on EKS to over 1,000 nodes and 20 million transactions per second. This session covers the lessons learned, successes, failures, and tools used to get there.
This talk was given by Matt Overstreet for DoK Day Europe @ KubeCon 2022.
2. Who am I?
Matt Overstreet
Field CTO, Cloud
DataStax
"Our Mission is to Connect every
developer in the world to the power of
Apache Cassandra, with the freedom to
run their data on any device and in any
cloud."
DoK Day Europe 2022 @ KubeCon 2
3. What did we do?
Built a 1,200 node K8ssandra cluster on Kubernetes
And used Kubernetes jobs to bench test it with NoSQLBench
4. A what now?
Kubernetes - You know this one, and if you are watching today
you know about the history of using it for stateful stuff like
databases.
Apache Cassandra - An open source database that runs some
of the world's biggest "fast data" and geo-distributed
workloads.
K8ssandra - A way to run Cassandra on Kubernetes.
NoSQLBench - A reliable way to benchmark servers.
5. Why do we care?
The first rule of distributed system
troubleshooting: Compare the configs
Config drift between machines in
distributed software causes weird
issues.
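"Compare the configs" can be sketched in a few lines of Python. This is a minimal illustration, not the tooling used in the talk: it assumes each node's settings have already been collected into a flat dictionary, and the node names and values are made up. The setting names (concurrent_writes, num_tokens) are ordinary cassandra.yaml options used only as sample keys.

```python
from collections import defaultdict


def find_config_drift(configs):
    """Return {key: {value: [nodes]}} for every key whose value differs across nodes.

    `configs` maps a node name to a flat dict of settings. A key that is
    absent on a node is reported under the placeholder "<missing>".
    """
    all_keys = set()
    for settings in configs.values():
        all_keys.update(settings)

    drift = {}
    for key in sorted(all_keys):
        by_value = defaultdict(list)
        for node, settings in sorted(configs.items()):
            value = repr(settings[key]) if key in settings else "<missing>"
            by_value[value].append(node)
        # More than one distinct value means the nodes disagree on this key.
        if len(by_value) > 1:
            drift[key] = dict(by_value)
    return drift


# Hypothetical settings for three nodes (names and values are assumptions).
nodes = {
    "node-a": {"concurrent_writes": 32, "num_tokens": 16},
    "node-b": {"concurrent_writes": 32, "num_tokens": 16},
    "node-c": {"concurrent_writes": 64, "num_tokens": 16},
}

for key, groups in find_config_drift(nodes).items():
    print(key, groups)
```

Grouping nodes by value, rather than diffing pairs, keeps the report readable at cluster scale: one odd node stands out against the hundreds that agree.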
6. What did K8ssandra bring to the party?
• Configuration consistency for the cluster
• Ease of Ops - 1,000 nodes would have taken weeks if
hand configured (Shout out to Amazon EKS)
Setup for other Kubernetes goodness!
• Consolidated logging
• Full stack visibility with a service mesh
8. What worked?
• 1,200+ nodes/1 engineer/1 week
• Petabyte+ of data/tens of millions of reads and writes per
second/p75 latencies under 10ms
• Destroyed 800 nodes in 5 minutes using the combined
superpowers of helm upgrade and Amazon Elastic
Kubernetes Service
• Scaling up the load test through Kubernetes was super easy
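One way scaling a load test through Kubernetes could look is a Job whose parallelism is just a number you turn up. The sketch below builds a batch/v1 Job manifest as a Python dictionary. It is an illustration under assumptions, not the manifest from the talk: the job name, image, and arguments are hypothetical placeholders, and the real benchmark command line is not shown in the slides.

```python
import json


def build_load_test_job(name, image, args, workers):
    """Build a batch/v1 Job manifest that runs `workers` identical pods in parallel."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Run all worker pods at once; the Job is done when each has finished.
            "parallelism": workers,
            "completions": workers,
            # Do not retry failed pods, so a failed benchmark run stays visible.
            "backoffLimit": 0,
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": "load", "image": image, "args": list(args)}
                    ],
                }
            },
        },
    }


job = build_load_test_job(
    name="load-test",                           # hypothetical name
    image="example.org/load-generator:latest",  # hypothetical image
    args=["--placeholder-workload"],            # hypothetical arguments
    workers=50,
)
print(json.dumps(job, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be piped to kubectl apply -f -. Doubling the load is then a matter of changing workers and submitting a new Job.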
10. What didn’t?
Gotchas with K8ssandra
• K8ssandra (v1) monitoring wasn’t up to the task; tweaks are
required for large clusters
• K8ssandra needs a little love when deciding how to
coordinate new Cassandra node "racks" with their
K8ssandra pod startup order
11. What didn’t?
Gotchas with EKS (hyperscalers in general)
• Resource limits on AWS EKS were often a surprise
• If you are using cloud hosted Kubernetes, think twice before
using an autoscaler
• Know before you grow... IP addresses can run out quickly
for large Kubernetes clusters with single or limited pod
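A quick back-of-the-envelope check shows how fast addresses disappear. The sketch below uses Python's standard ipaddress module and assumes the default Amazon VPC CNI behavior, where each pod takes an address from the subnet just as each node does. AWS reserves five addresses in every subnet. The subnet ranges and pod counts are hypothetical, and real consumption can be higher because the CNI keeps spare addresses warm on each node.

```python
import ipaddress

# AWS reserves the first four addresses and the last address of every subnet.
AWS_RESERVED_PER_SUBNET = 5


def usable_addresses(cidr):
    """Number of addresses in an AWS subnet that can be assigned to nodes or pods."""
    network = ipaddress.ip_network(cidr)
    return max(network.num_addresses - AWS_RESERVED_PER_SUBNET, 0)


def check_capacity(cidrs, nodes, pods):
    """Return (capacity, needed, fits), assuming one address per node and per pod."""
    capacity = sum(usable_addresses(cidr) for cidr in cidrs)
    needed = nodes + pods
    return capacity, needed, capacity >= needed


# Hypothetical plan: 1,200 nodes with one database pod each (an assumption).
print(check_capacity(["10.0.0.0/24"], nodes=1200, pods=1200))
print(check_capacity(["10.0.0.0/22", "10.0.4.0/22", "10.0.8.0/22"], nodes=1200, pods=1200))
```

Under these assumptions a single /24 offers 251 usable addresses against 2,400 needed, while three /22 subnets offer 3,057, which fits before counting system pods or warm spares.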
12. What did I miss?
Thanks!
Me matt.overstreet@datastax.com | @omnifroodle - twitter
K8ssandra https://K8ssandra.io | @k8ssandra - twitter
DataStax https://www.datastax.com