Link: https://www.youtube.com/watch?v=D8kJCvsHD9Q&list=PLHgdNuGxrJt04Fwaip9aDYvXrbRSmc5HZ&index=12
https://go.dok.community/slack
https://dok.community/
From DoK Day NA 2022 (https://www.youtube.com/watch?v=YWTa-DiVljY&list=PLHgdNuGxrJt04Fwaip9aDYvXrbRSmc5HZ)
In the software industry we’re fond of terms that define major trends, like “cloud native”, “Kubernetes native” and “serverless”. As more and more organizations move stateful workloads to Kubernetes, we’ve started to see these terms applied to data infrastructure, where they can get overtaken by marketing hype unless we work to define them.
In this talk, we’ll examine two different databases, TiDB and Apache Cassandra, in order to identify what it means for a database to be Kubernetes native and why it matters. We’ll look at points including:
- The differences between cloud native, Kubernetes native, and serverless
- How databases become Kubernetes native
- Benefits of Kubernetes native databases
- How Kubernetes can better support databases
-----
Jeff has worked as a software engineer and architect in multiple industries and as a developer advocate helping engineers get up to speed on Apache Cassandra. He's involved in multiple open source projects in the Cassandra and Kubernetes ecosystems including Stargate and K8ssandra. Jeff is the author of the O’Reilly books “Cassandra: The Definitive Guide” and “Managing Cloud Native Data on Kubernetes”.
2. Cloud Native Database
DoK Day North America 2022 @ KubeCon
Jeff Carpenter, DataStax
“The Kubernetes Native Database”
3. Cloud Native Database
Kubernetes Native Database
4. Cloud Native Database
Kubernetes Native Database
Serverless Database
5. Managing Cloud Native Data on Kubernetes
● Coming Dec 2022
● This talk based on Chapter 7, “The Kubernetes Native Database”
6. Cloud Native Data Principles
1. Leverage compute / network / storage as commodity APIs
2. Separate the control and data planes
3. Make observability easy
4. Make the default configuration secure
5. Prefer declarative configuration
7. TiDB
● MySQL Compatible
● Hybrid Transactional and Analytical Processing (HTAP)
● Separation of compute and storage
○ TiDB - compute
○ TiKV - row-based key-value storage
○ TiFlash - columnar storage
● Spark Integration
● K8s only
8. TiDB Operator
● Operator controls all components
● Optional extension to K8s Scheduler
● Manages CRDs including TidbCluster, TidbMonitor
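The operator pattern described on this slide can be illustrated with a small control loop. This is a minimal in-memory sketch, not TiDB Operator code: the `ComponentState` class and the `plan_actions` and `reconcile` functions are hypothetical names invented here, and a real operator would watch custom resources through the Kubernetes API rather than compare two Python objects.

```python
from dataclasses import dataclass
from typing import List

COMPONENTS = ("tidb", "tikv", "tiflash")


@dataclass
class ComponentState:
    """Replica counts for the three components named in the talk (hypothetical type)."""
    tidb: int = 0
    tikv: int = 0
    tiflash: int = 0


def plan_actions(desired: ComponentState, observed: ComponentState) -> List[str]:
    """Describe the scaling steps needed to move the observed state to the desired one."""
    actions = []
    for name in COMPONENTS:
        want, have = getattr(desired, name), getattr(observed, name)
        if want > have:
            actions.append(f"scale up {name}: {have} -> {want}")
        elif want < have:
            actions.append(f"scale down {name}: {have} -> {want}")
    return actions


def reconcile(desired: ComponentState, observed: ComponentState) -> List[str]:
    """One pass of the loop: plan the actions, then apply them to the in-memory state."""
    actions = plan_actions(desired, observed)
    for name in COMPONENTS:
        setattr(observed, name, getattr(desired, name))
    return actions
```

Running `reconcile` a second time with the same desired state returns an empty list, which is the idempotence that makes declarative management safe to repeat.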
9. TiDB Cluster
● TiDB resource
○ Allows specification of TiDB / TiKV / TiFlash instances and supporting infrastructure
○ Monitor with Prometheus / Grafana stack (not shown)
● Not fully cloud-native
○ Could use object storage instead of PVs
○ Could use etcd instead of Discovery Service
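To make the declarative resource concrete, here is a rough sketch that builds a TidbCluster-style manifest as a plain Python dict. The `apiVersion` and `kind` values follow the TiDB Operator's published examples as best recalled; the function name `tidb_cluster_manifest`, the default replica counts, and the reduced set of `spec` fields are assumptions for illustration, so check the operator's CRD reference before relying on any field. A real cluster also needs supporting components and storage settings that are omitted here.

```python
import json


def tidb_cluster_manifest(name, version, tidb=2, tikv=3, tiflash=1):
    """Build an illustrative TidbCluster-style custom resource (hypothetical helper).

    Only replica counts are set; storage, images, and supporting
    components are left out of this sketch on purpose.
    """
    return {
        "apiVersion": "pingcap.com/v1alpha1",
        "kind": "TidbCluster",
        "metadata": {"name": name},
        "spec": {
            "version": version,
            "tidb": {"replicas": tidb},        # compute layer
            "tikv": {"replicas": tikv},        # row storage layer
            "tiflash": {"replicas": tiflash},  # columnar storage layer
        },
    }


if __name__ == "__main__":
    # The cluster name and version string are placeholders, not recommendations.
    print(json.dumps(tidb_cluster_manifest("demo", "v6.1.0"), indent=2))
```

The point of the shape is that the user states how many instances of each component should exist, and the operator is responsible for making the cluster match.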
10. Astra DB
● Apache Cassandra factored into microservices and available as a managed service
● Uses object storage instead of PVs for a true serverless architecture
● Leverages etcd and Prometheus/Grafana stack
● API access via Stargate
○ REST, GraphQL, Docs, gRPC
● Multi-tenant, multi-cluster
11. Astra DBInstallation
● Astra DB operator deploys multi-tenant clusters using DBInstallation resource
● Ingress routes incoming traffic by tenant to specific Coordinator / Data Service instances
○ Metadata stored in etcd (not shown)
● Authentication delegated to IAM service
● Data Services use local PVs for caching, object storage for longer term persistence
● Compaction Service processes data files in object storage in the background
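Two ideas from this slide, routing by tenant and caching locally in front of object storage, can be sketched in a few lines. This is not Astra DB code: `TieredStore` and `route` are hypothetical names, plain dicts stand in for object storage, the local PV cache, and the routing metadata, and the write-through behavior is an assumption, since the slide does not say how writes are handled.

```python
class TieredStore:
    """Read-through cache in front of a slower, durable store (illustrative only)."""

    def __init__(self, object_store):
        self.object_store = object_store  # stand-in for object storage (a dict here)
        self.local_cache = {}             # stand-in for files cached on a local PV
        self.cache_hits = 0

    def write(self, key, value):
        # Assumption: write-through, so the durable copy is always current.
        self.object_store[key] = value
        self.local_cache[key] = value

    def read(self, key):
        if key in self.local_cache:
            self.cache_hits += 1
            return self.local_cache[key]
        value = self.object_store[key]  # raises KeyError if the key was never written
        self.local_cache[key] = value   # populate the cache for later reads
        return value


def route(tenant_id, routing_table):
    """Look up the service instances assigned to a tenant (table stands in for metadata)."""
    try:
        return routing_table[tenant_id]
    except KeyError:
        raise LookupError(f"no route for tenant {tenant_id!r}") from None
```

Because the durable copy lives in the shared store, a fresh `TieredStore` with an empty cache can still serve reads, which is what lets data service instances be replaced without losing data.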
12. What makes a Database Kubernetes Native?
● Maximum leverage of Kubernetes APIs
○ StatefulSets, Deployments, etcd, Ingress, Scheduler
● Automated, declarative management
○ Via operators and CRDs
● Observable through standard APIs
○ E.g. Prometheus
● Secure by default
○ E.g. no default passwords
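The observability and security criteria on this slide can be illustrated with standard-library Python. The sketch below renders a gauge in the Prometheus text exposition format and generates a random credential instead of shipping a default one. The function names and the metric name are hypothetical, and label values are not escaped, so treat it as an illustration rather than a replacement for a Prometheus client library.

```python
import secrets


def render_gauge(name, help_text, samples):
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` is a list of (labels_dict, value) pairs. Label values are
    not escaped in this sketch.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        label_str = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        series = f"{name}{{{label_str}}}" if label_str else name
        lines.append(f"{series} {value}")
    return "\n".join(lines) + "\n"


def generate_admin_password(num_bytes=24):
    """Return a fresh random credential so no install shares a default password."""
    return secrets.token_urlsafe(num_bytes)


if __name__ == "__main__":
    print(render_gauge("db_open_connections", "Open client connections.",
                       [({"tenant": "a"}, 3)]), end="")
```

An operator following the same idea would typically store the generated credential in a Kubernetes Secret at install time rather than print or log it.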
13. The future of Kubernetes Native Databases
● Microservices / serverless
● Multi-cluster / Multi-cloud
● Multi-tenant
● Community based
● Open source
14. What Databases Need from Kubernetes
● Improved StatefulSets
● Resources to manage multi-tenancy and multi-cluster
● Additional hypervisor support
● Compute resource management (e.g. quotas)
● Better disk initialization (e.g. striping)
15. Thank you!
Special thanks to:
● Ed Huang, PingCAP
● Jake Luciani, DataStax