3. DataStax's Cass-Operator
Datacenter provisioning
● Schedule all pods
● Bootstrap nodes in the appropriate order
○ Seeds
○ Across racks
○ etc.
○ Uniform configuration
● Scale-up
○ Add new nodes in a balanced manner across racks
● Scale-down
○ Remove nodes one at a time across racks
● Node recovery
○ Restart process
○ Reschedule instance (i.e., replace node)
○ Replace instance
■ Specific workflows for seed node replacements
● Multi-DC / Multi-Rack
● Multi-Region / Multi-K8s Cluster
○ Note: this requires support at the networking layer for pod-to-pod IP connectivity. This may be accomplished within the cluster with CNIs like Cilium, or externally via traditional networking tools.
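The balanced scale-up and one-at-a-time scale-down behavior listed above can be illustrated with a minimal Python sketch. This is not cass-operator's actual implementation. The function names `scale_up` and `scale_down` and the tie-breaking by rack name are hypothetical choices made for the example. The only idea taken from the slide is that new nodes go to the least-populated rack and removals come from the most-populated rack, so rack sizes never differ by more than one.

```python
def scale_up(rack_sizes, new_nodes):
    """Return the rack chosen for each new node, filling the smallest rack first.

    rack_sizes maps rack name -> current node count (hypothetical input shape).
    Ties are broken by rack name so the result is deterministic.
    """
    sizes = dict(rack_sizes)
    placements = []
    for _ in range(new_nodes):
        rack = min(sizes, key=lambda r: (sizes[r], r))
        sizes[rack] += 1
        placements.append(rack)
    return placements


def scale_down(rack_sizes, remove_nodes):
    """Return the rack to remove a node from at each step, largest rack first.

    Nodes are removed one at a time, which mirrors the slide's
    "one at a time across racks" wording.
    """
    sizes = dict(rack_sizes)
    removals = []
    for _ in range(remove_nodes):
        rack = max(sizes, key=lambda r: (sizes[r], r))
        if sizes[rack] == 0:
            raise ValueError("no nodes left to remove")
        sizes[rack] -= 1
        removals.append(rack)
    return removals


if __name__ == "__main__":
    print(scale_up({"rack1": 2, "rack2": 1, "rack3": 1}, 2))
    print(scale_down({"rack1": 2, "rack2": 2, "rack3": 1}, 2))
```

A real operator would also have to wait for each node to finish bootstrapping or decommissioning before moving to the next one. The sketch only covers the ordering decision.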
4. DataStax's Cass-Operator
Differentiators
● OSS Ecosystem / Components
● Cass Config Builder - OSS project extracted from DataStax OpsCenter Life Cycle Manager to provide automated configuration file rendering
● Cass Config Definitions - definition files for cass-config-builder; defines all configuration files, their parameters, and templates
● Management API for Apache Cassandra (MAAC)
● Metrics Collector for Apache Cassandra (MCAC)
● Reference Prometheus Operator CRDs
○ ServiceMonitor
○ Instance
● Reference Grafana Operator CRDs
○ Instance
○ Dashboards
○ Datasource
● PodTemplateSpec
○ Customization of existing pods, including support for adding containers, volumes, etc.
● Advanced Networking
○ Node Port
○ Host Network
● Simple security
○ Management API mTLS support
○ Automated generation of keystore and truststore for internode and client-to-node TLS
● Automated superuser account configuration
○ The default superuser (cassandra/cassandra) is disabled and never available to clients
○ Cluster administration account may be generated automatically (or provided), with credentials stored in a Kubernetes secret
● Automatic application of NetworkTopologyStrategy with appropriate RF for system keyspaces
● Validating webhook
○ Invalid changes are rejected with a helpful message
● Rolling cluster updates
○ Change in binary (C* upgrade)
○ Change in configuration
○ Canary deployments - single rack application of changes for validation before broader deployment
○ Rolling restart
● Platform Integration / Testing / Certification
○ Red Hat OpenShift compatible and certified
■ Secure, Universal Base Image (UBI) foundation images with security scanning performed by Red Hat
■ cass-operator
■ cass-config-builder
■ apache-cassandra w/ MCAC and MAAC
■ Integration with Red Hat certification pipeline / marketplace
■ Presence in Red Hat Operator Hub built into OpenShift interface
○ VMware Tanzu Kubernetes Grid Integrated Edition compatible and certified
■ Security scanning for images performed by VMware
○ Amazon EKS
○ Azure AKS
○ Google GKE
● Documentation / Reference Implementations
○ Cloud storage classes
○ Ingress solutions
■ Sample connection validation application with reference implementations of Java Driver client connection parameters
● Cluster-level Stop / Resume
○ Stop all running instances while keeping persistent storage
○ Allows for scaling compute down to zero. Bringing the cluster back up follows expected startup procedures
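As a rough illustration of the cluster-level stop / resume item above, the Python sketch below builds a datacenter manifest as a plain dict and toggles a stop flag on a copy of it. The helper names `cassandra_datacenter` and `set_stopped` are hypothetical. The `apiVersion`, `kind`, and `spec` field names (including `stopped`) are assumptions about the CassandraDatacenter resource and should be checked against the CRD shipped with the operator version in use. Required settings such as storage configuration are deliberately omitted, so this is not a complete manifest.

```python
import copy
import json


def cassandra_datacenter(name, cluster, size, racks, server_version="3.11.7"):
    """Build a partial datacenter manifest as a dict.

    Field names are assumptions for illustration; storage and other
    required settings are left out.
    """
    return {
        "apiVersion": "cassandra.datastax.com/v1beta1",  # assumed API group/version
        "kind": "CassandraDatacenter",
        "metadata": {"name": name},
        "spec": {
            "clusterName": cluster,
            "serverType": "cassandra",
            "serverVersion": server_version,
            "size": size,
            "racks": [{"name": rack} for rack in racks],
            "stopped": False,  # assumed flag for cluster-level stop
        },
    }


def set_stopped(manifest, stopped):
    """Return a copy of the manifest with the stop flag set.

    Size and racks are left untouched, so resuming restores the same
    topology on top of the persistent storage that was kept.
    """
    updated = copy.deepcopy(manifest)
    updated["spec"]["stopped"] = stopped
    return updated


dc = cassandra_datacenter("dc1", "cluster1", 6, ["rack1", "rack2", "rack3"])
stopped_dc = set_stopped(dc, True)
resumed_dc = set_stopped(stopped_dc, False)

if __name__ == "__main__":
    print(json.dumps(stopped_dc["spec"], indent=2))
```

The point of keeping `size` unchanged while stopped is that compute can drop to zero without losing the intended node count, which is what lets the cluster come back up through the normal startup procedure.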
5. DataStax's Cass-Operator
Road Map / Inflight
● Repair
○ Reaper integration
● Backups
○ Velero integration
○ Medusa integration
● Advanced Networking via sidecar
○ Combination of proxy sidecars (a la Envoy) to allow for persistent IP addresses despite Kubernetes' best efforts to shuffle them.
● Single pod canary deployments
● Platform Certification
○ VMware Project Pacific
○ Rancher Kubernetes Engine (K3s)
○ Documentation
● Multi-region
● Multi-cloud
● Additional ingress providers
○ Voyager
○ HAProxy
○ Gloo
○ Ambassador
○ Envoy
○ NGINX Ingress Controller
● Additional storage class references
○ OpenEBS
● Cassandra Enhancements
6. Orange Telecom's CassKop
● Node labeling to map any internal architecture (including network-specific labels for multi-DC setups)
● Volumes & sidecars management (possibly linked to PodTemplateSpec)
● Backup & restore (we ruled out Velero and can share why we went with Instaclustr, but Medusa could work too)
● kubectl plugin integration (quite useful on the ops side without an admin UI)
● MultiCassKop evolution to drive multiple cass-operators instead of multiple CassKops (this could remain Orange-internal if too specific)
7. K8ssandra
● K8ssandra provides a production-ready platform for running Apache Cassandra on Kubernetes. This includes automation for operational tasks such as repairs, backups, and monitoring.
● K8ssandra is a cloud native distribution of Apache Cassandra meant to run on Kubernetes.
● At a pure component level, K8ssandra integrates and packages together:
○ Apache Cassandra 3.11.7
○ Kubernetes Operator for Apache Cassandra (cass-operator)
○ Reaper, also known as the Repair Web Interface
○ Medusa for backup and restore
○ Metrics Collector, with Prometheus integration, and visualization via preconfigured Grafana dashboards
○ Templates for connections into your Kubernetes environment via Ingress solutions such as Traefik
● Right now K8ssandra is deployed as an entire stack. It currently assumes your deployment uses the entire stack. Trading out certain components for others is not supported.
8. Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM, CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037