SlideShare a Scribd company logo
1 of 12
Download to read offline
1,000 Cassandra nodes on
Kubernetes?
Matt Overstreet
DoK Day Europe 2022 @ KubeCon 1
Who am I?
Matt Overstreet
Field CTO, Cloud
DataStax
"Our Mission is to Connect every
developer in the world to the power of
Apache Cassandra, with the freedom to
run their data on any device and in any
cloud."
DoK Day Europe 2022 @ KubeCon 2
What did we do?
Built a 1,200 node K8ssandra cluster on Kubernetes
And used Kubernetes jobs to bench test it with NoSQLBench
DoK Day Europe 2022 @ KubeCon 3
A what now?
Kubernetes - You know this one, and if you are watching today
you know about the history of using it for stateful stuff like
databases.
Apache Cassandra - An open source database that runs some
of the worlds biggest "fast data", and geo-distributed
workloads.
K8ssandra - A way to run Cassandra on Kubernetes.
NoSQLBench - A reliable way to benchmark servers.
DoK Day Europe 2022 @ KubeCon 4
Why do we care?
The first rule of distributed system
troubleshooting: Compare the configs
Config drift between machines in
distributed software causes weird
issues.
! "
DoK Day Europe 2022 @ KubeCon 5
What did K8ssandra bring to the party?
• Configuration consistency for the cluster
• Ease of Ops - 1,000 nodes would have taken weeks if
hand configured (Shout out to Amazon EKS )
Setup for other Kubernetes goodness!
• Consolidated logging
• Full stack visibility with a service mesh
DoK Day Europe 2022 @ KubeCon 6
What worked?
DoK Day Europe 2022 @ KubeCon 7
What worked?
• 1,200+ nodes/1 engineer/1 week
• Petabyte+ of data/tens of millions of reads and writes per
second/p75 latencies under 10ms
• Destroyed 800 nodes in 5 minutes using the combined
superpowers of helm update and Amazon Elastic
Kubernetes Service
! " #
• Scaling up the load test through Kubernetes was super easy
DoK Day Europe 2022 @ KubeCon 8
What didn’t?
DoK Day Europe 2022 @ KubeCon 9
What didn’t?
Gotchas with K8ssandra
• K8ssandra (v1) monitoring wasn’t up to the task, tweaks are
required for large clusters
• K8ssandra needs a little love when deciding how to
coordinate new Cassandra node "racks" with their
K8ssandra pod startup order
DoK Day Europe 2022 @ KubeCon 10
What didn’t?
Gotchas with EKS (Hyperscalers in
general)
• Resource limits on AWS EKS were often a surprise
• If you are using cloud hosted Kubernetes, think twice before
using an autoscaler
!
• Know before you grow... IP addresses can run out quickly
for large Kubernetes clusters with single or limited pod
DoK Day Europe 2022 @ KubeCon 11
What did I miss?
Thanks!
Me matt.overstreet@datastax.com | @omnifroodle - twitter
K8ssandra https://K8ssandra.io | @k8ssandra - twitter
DataStax https://www.datastax.com
DoK Day Europe 2022 @ KubeCon 12

More Related Content

Similar to 1000 node Cassandra cluster on Amazon's EKS?

The many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent dataThe many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent data
DoKC
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
DataWorks Summit
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Anant Corporation
 

Similar to 1000 node Cassandra cluster on Amazon's EKS? (20)

Day 2 Kubernetes - Tools for Operability (QConSF)
Day 2 Kubernetes - Tools for Operability (QConSF)Day 2 Kubernetes - Tools for Operability (QConSF)
Day 2 Kubernetes - Tools for Operability (QConSF)
 
Zenko & MetalK8s @ Dublin Docker Meetup, June 2018
Zenko & MetalK8s @ Dublin Docker Meetup, June 2018Zenko & MetalK8s @ Dublin Docker Meetup, June 2018
Zenko & MetalK8s @ Dublin Docker Meetup, June 2018
 
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
Confluent On Azure: Why you should add Confluent to your Azure toolkit | Alic...
 
The many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent dataThe many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent data
 
The many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent dataThe many uses of Kubernetes cross cluster migration of persistent data
The many uses of Kubernetes cross cluster migration of persistent data
 
Kubernetes 101
Kubernetes 101Kubernetes 101
Kubernetes 101
 
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
On CloudStack, Docker, Kubernetes, and Big Data…Oh my ! By Sebastien Goasguen...
 
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
[WSO2Con Asia 2018] Deploying Applications in K8S and Docker
 
Day 2 Kubernetes - Tools for Operability (Velocity London Meetup)
Day 2 Kubernetes - Tools for Operability (Velocity London Meetup)Day 2 Kubernetes - Tools for Operability (Velocity London Meetup)
Day 2 Kubernetes - Tools for Operability (Velocity London Meetup)
 
JOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your CostsJOSA TechTalks - Downgrade your Costs
JOSA TechTalks - Downgrade your Costs
 
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and KnativeBuild and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
Build and Deploy Cloud Native Camel Quarkus routes with Tekton and Knative
 
Unleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platformUnleashing k8 s to reduce complexities of an entire middleware platform
Unleashing k8 s to reduce complexities of an entire middleware platform
 
Migrating a build farm from on-prem to AWS
Migrating a build farm from on-prem to AWSMigrating a build farm from on-prem to AWS
Migrating a build farm from on-prem to AWS
 
Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...Why Kubernetes as a container orchestrator is a right choice for running spar...
Why Kubernetes as a container orchestrator is a right choice for running spar...
 
Containerized Hadoop beyond Kubernetes
Containerized Hadoop beyond KubernetesContainerized Hadoop beyond Kubernetes
Containerized Hadoop beyond Kubernetes
 
Run Cloud Native MySQL NDB Cluster in Kubernetes
Run Cloud Native MySQL NDB Cluster in KubernetesRun Cloud Native MySQL NDB Cluster in Kubernetes
Run Cloud Native MySQL NDB Cluster in Kubernetes
 
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
Data Engineer's Lunch #86: Building Real-Time Applications at Scale: A Case S...
 
Metal-k8s presentation by Julien Girardin @ Paris Kubernetes Meetup
Metal-k8s presentation by Julien Girardin @ Paris Kubernetes MeetupMetal-k8s presentation by Julien Girardin @ Paris Kubernetes Meetup
Metal-k8s presentation by Julien Girardin @ Paris Kubernetes Meetup
 
Webinar: The Future of SQL
Webinar: The Future of SQLWebinar: The Future of SQL
Webinar: The Future of SQL
 
25 12 18 meetup - road to k8s
25 12 18 meetup - road to k8s25 12 18 meetup - road to k8s
25 12 18 meetup - road to k8s
 

More from DoKC

Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
DoKC
 
We will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8sWe will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8s
DoKC
 
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
DoKC
 

More from DoKC (20)

Distributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and HowDistributed Vector Databases - What, Why, and How
Distributed Vector Databases - What, Why, and How
 
Is It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes OperatorsIs It Safe? Security Hardening for Databases Using Kubernetes Operators
Is It Safe? Security Hardening for Databases Using Kubernetes Operators
 
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster RecoveryStop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
Stop Worrying and Keep Querying, Using Automated Multi-Region Disaster Recovery
 
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
Transforming Data Processing with Kubernetes: Journey Towards a Self-Serve Da...
 
The State of Stateful on Kubernetes
The State of Stateful on KubernetesThe State of Stateful on Kubernetes
The State of Stateful on Kubernetes
 
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
Colocating Data Workloads and Web Services on Kubernetes to Improve Resource ...
 
Make Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-ReadyMake Your Kafka Cluster Production-Ready
Make Your Kafka Cluster Production-Ready
 
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
Dynamic Large Scale Spark on Kubernetes: Empowering the Community with Argo W...
 
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the CloudRun PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
Run PostgreSQL in Warp Speed Using NVMe/TCP in the Cloud
 
ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023ING Data Services hosted on ICHP DoK Amsterdam 2023
ING Data Services hosted on ICHP DoK Amsterdam 2023
 
Implementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch governmentImplementing data and databases on K8s within the Dutch government
Implementing data and databases on K8s within the Dutch government
 
StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154StatefulSets in K8s - DoK Talks #154
StatefulSets in K8s - DoK Talks #154
 
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
Running PostgreSQL in Kubernetes: from day 0 to day 2 with CloudNativePG - Do...
 
Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151Analytics with Apache Superset and ClickHouse - DoK Talks #151
Analytics with Apache Superset and ClickHouse - DoK Talks #151
 
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
Overcoming challenges with protecting and migrating data in multi-cloud K8s e...
 
Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147Evaluating Cloud Native Storage Vendors - DoK Talks #147
Evaluating Cloud Native Storage Vendors - DoK Talks #147
 
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
Kubernetes Cluster Upgrade Strategies and Data: Best Practices for your State...
 
We will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8sWe will Dok You! - The journey to adopt stateful workloads on k8s
We will Dok You! - The journey to adopt stateful workloads on k8s
 
Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators Mastering MongoDB on Kubernetes, the power of operators
Mastering MongoDB on Kubernetes, the power of operators
 
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
Leveraging Running Stateful Workloads on Kubernetes for the Benefit of Develo...
 

Recently uploaded

Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
Joaquim Jorge
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 

Recently uploaded (20)

Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
Artificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : UncertaintyArtificial Intelligence Chap.5 : Uncertainty
Artificial Intelligence Chap.5 : Uncertainty
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
GenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdfGenAI Risks & Security Meetup 01052024.pdf
GenAI Risks & Security Meetup 01052024.pdf
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
Bajaj Allianz Life Insurance Company - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 

1000 node Cassandra cluster on Amazon's EKS?

  • 1. 1,000 Cassandra nodes on Kubernetes? Matt Overstreet DoK Day Europe 2022 @ KubeCon 1
  • 2. Who am I? Matt Overstreet Field CTO, Cloud DataStax "Our Mission is to Connect every developer in the world to the power of Apache Cassandra, with the freedom to run their data on any device and in any cloud." DoK Day Europe 2022 @ KubeCon 2
  • 3. What did we do? Built a 1,200 node K8ssandra cluster on Kubernetes And used Kubernetes jobs to bench test it with NoSQLBench DoK Day Europe 2022 @ KubeCon 3
  • 4. A what now? Kubernetes - You know this one, and if you are watching today you know about the history of using it for stateful stuff like databases. Apache Cassandra - An open source database that runs some of the worlds biggest "fast data", and geo-distributed workloads. K8ssandra - A way to run Cassandra on Kubernetes. NoSQLBench - A reliable way to benchmark servers. DoK Day Europe 2022 @ KubeCon 4
  • 5. Why do we care? The first rule of distributed system troubleshooting: Compare the configs Config drift between machines in distributed software causes weird issues. ! " DoK Day Europe 2022 @ KubeCon 5
  • 6. What did K8ssandra bring to the party? • Configuration consistency for the cluster • Ease of Ops - 1,000 nodes would have taken weeks if hand configured (Shout out to Amazon EKS ) Setup for other Kubernetes goodness! • Consolidated logging • Full stack visibility with a service mesh DoK Day Europe 2022 @ KubeCon 6
  • 7. What worked? DoK Day Europe 2022 @ KubeCon 7
  • 8. What worked? • 1,200+ nodes/1 engineer/1 week • Petabyte+ of data/tens of millions of reads and writes per second/p75 latencies under 10ms • Destroyed 800 nodes in 5 minutes using the combined superpowers of helm update and Amazon Elastic Kubernetes Service ! " # • Scaling up the load test through Kubernetes was super easy DoK Day Europe 2022 @ KubeCon 8
  • 9. What didn’t? DoK Day Europe 2022 @ KubeCon 9
  • 10. What didn’t? Gotchas with K8ssandra • K8ssandra (v1) monitoring wasn’t up to the task, tweaks are required for large clusters • K8ssandra needs a little love when deciding how to coordinate new Cassandra node "racks" with their K8ssandra pod startup order DoK Day Europe 2022 @ KubeCon 10
  • 11. What didn’t? Gotchas with EKS (Hyperscalers in general) • Resource limits on AWS EKS were often a surprise • If you are using cloud hosted Kubernetes, think twice before using an autoscaler ! • Know before you grow... IP addresses can run out quickly for large Kubernetes clusters with single or limited pod DoK Day Europe 2022 @ KubeCon 11
  • 12. What did I miss? Thanks! Me matt.overstreet@datastax.com | @omnifroodle - twitter K8ssandra https://K8ssandra.io | @k8ssandra - twitter DataStax https://www.datastax.com DoK Day Europe 2022 @ KubeCon 12