© 2022 Altinity, Inc.
Life in the Herd - Cloud Native ClickHouse
Robert Hodges, Altinity
DoK San Francisco Meetup - 20 July 2022
Personal and company introductions
Robert Hodges: Database geek with 30+ years on DBMS systems. Day job: Altinity CEO
Altinity Engineering: Database geeks with centuries of experience in DBMS and applications
ClickHouse support and services including Altinity.Cloud
Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects
The Kubernetes Vision: Databases as cattle, not pets
Introducing ClickHouse. It’s complicated.
[Diagram: an analytic application queries six ClickHouse servers (three for shard1, three for shard2), coordinated by three Zookeeper servers, spread across three availability zones.]
Operators make complex databases work on Kubernetes
[Diagram: the Altinity ClickHouse Operator (Apache 2.0 source, distributed as Docker image) runs in the kube-system namespace. Running kubectl apply -f my-cluster.yaml submits a ClickHouse Resource Definition; the operator “adjusts reality” to produce a best practice deployment in your-favorite namespace.]
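The “adjust reality” step is what Kubernetes operators usually call reconciliation: compare the desired state in the resource definition with what is actually running, then act on the difference. The sketch below is a minimal illustration of that idea only. The ReplicaSpec type, the reconcile function, and the image names are hypothetical and are not taken from the Altinity operator.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ReplicaSpec:
    """Hypothetical desired state for one replica (not the operator's real schema)."""
    name: str
    image: str


def reconcile(desired: dict[str, ReplicaSpec],
              actual: dict[str, ReplicaSpec]) -> list[str]:
    """Return the actions needed to make `actual` match `desired`."""
    actions = []
    # Create replicas that are missing and update replicas that have drifted.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(f"create {name} ({spec.image})")
        elif actual[name] != spec:
            actions.append(f"update {name} ({actual[name].image} -> {spec.image})")
    # Delete replicas that are running but no longer requested.
    for name in sorted(actual):
        if name not in desired:
            actions.append(f"delete {name}")
    return actions
```

A real operator runs this comparison continuously against the Kubernetes API rather than against in-memory dictionaries.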
Operators make complex configurations simple
apiVersion: "clickhouse.altinity.com/v1"
kind: "ClickHouseInstallation"
metadata:
  name: "demo-01"
spec:
  configuration:
    zookeeper:                 # Where is Zookeeper?
      nodes:
        - host: zookeeper.zoo1ns
          port: 2181
    clusters:
      - name: "cl"
        layout:                # Shards and replicas
          shardsCount: 1
          replicasCount: 2
  . . .                        # More stuff like storage and versions
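To see what the shard and replica counts expand into, here is a small sketch that lists the pods one would expect from a layout. It assumes a naming pattern of chi-<installation>-<cluster>-<shard>-<replica>-0, inferred from the pod name chi-crash-demo-ch-0-0-0 shown later in this deck. The expected_pods helper is hypothetical and is not part of the operator.

```python
def expected_pods(chi_name: str, cluster_name: str,
                  shards: int, replicas: int) -> list[str]:
    """List pod names for a layout, assuming one pod per (shard, replica) pair.

    The naming pattern is an assumption inferred from sample kubectl output,
    not a documented contract.
    """
    return [
        f"chi-{chi_name}-{cluster_name}-{shard}-{replica}-0"
        for shard in range(shards)
        for replica in range(replicas)
    ]


if __name__ == "__main__":
    # Layout from the example above: 1 shard, 2 replicas.
    for pod in expected_pods("demo-01", "cl", shards=1, replicas=2):
        print(pod)
```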
Well-written operators make your life better
● Rolling upgrade
● Prometheus metrics export
● Vertical scaling
● Horizontal scaling
● Multi-AZ deployment
● Node anti-affinity
But… Cattle are individuals, too!
A typical distributed service
[Diagram: traffic enters a load balancer, which routes to Service #1, Service #2, and Service #3, each with its own storage.]
Created using Kubernetes stateful set
[Diagram: a single stateful set behind Service “svc” manages pods “svc-1”, “svc-2”, and “svc-3”. Each pod has its own persistent volume claim and persistent volume, along with config maps.]
Database replicas are asymmetric
[Diagram: DBMS Replica 1, DBMS Replica 2, and DBMS Replica 3.] Replicas can differ in:
● Software versions
● Resources
● R/W vs. R/O
● AZs
Databases need a more sophisticated replica model
[Diagram: a ClickHouse Installation takes the place of the single stateful set. Behind Service “svc”, each replica (“svc-1”, “svc-2”, “svc-3”) gets its own stateful set and pod, with its own persistent volume claim and persistent volume, along with config maps.]
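One way to picture why a stateful set per replica permits variation: each replica can start from a shared template and carry its own overrides, whereas pods in a single stateful set share one pod template. The sketch below is illustrative only. The field names, image tags, and the build_replica_specs helper are hypothetical.

```python
def build_replica_specs(base: dict, overrides: dict, replica_names: list) -> dict:
    """Return one spec per replica: the shared base plus per-replica overrides."""
    specs = {}
    for name in replica_names:
        spec = dict(base)                      # copy so the base stays untouched
        spec.update(overrides.get(name, {}))   # apply this replica's differences
        specs[name] = spec
    return specs


if __name__ == "__main__":
    # Hypothetical values covering the asymmetries listed on the earlier slide.
    base = {"image": "clickhouse:v1", "cpu": "4", "zone": "az-a", "mode": "rw"}
    overrides = {
        "svc-2": {"image": "clickhouse:v2"},     # different software version
        "svc-3": {"zone": "az-c", "mode": "ro"}, # different AZ, read-only
    }
    for name, spec in build_replica_specs(
        base, overrides, ["svc-1", "svc-2", "svc-3"]
    ).items():
        print(name, spec)
```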
What if your calf gets sick?
$ kubectl apply -f crash.yaml
clickhouseinstallation.clickhouse.altinity.com/crash-demo configured
$ kubectl get pods
NAME                      READY   STATUS             RESTARTS      AGE
chi-crash-demo-ch-0-0-0   0/1     CrashLoopBackOff   1 (12s ago)   15s
Every [Kubernetes] SRE’s least favorite error
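For monitoring scripts, the same check can be automated. The following is a minimal sketch that scans the text output of kubectl get pods for pods in CrashLoopBackOff. It assumes the default column layout shown above (NAME, READY, STATUS, ...). The crashing_pods helper is hypothetical.

```python
def crashing_pods(kubectl_output: str) -> list[str]:
    """Return names of pods whose STATUS column is CrashLoopBackOff.

    Assumes the default `kubectl get pods` text layout: a header line,
    then rows whose first three whitespace-separated fields are
    NAME, READY, and STATUS.
    """
    names = []
    for line in kubectl_output.strip().splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[2] == "CrashLoopBackOff":
            names.append(fields[0])
    return names
```

In practice, structured output (for example kubectl get pods -o json) is a more robust input than column text, at the cost of a slightly longer script.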
Traditional cure for database crashes
Database
DBA
Malign effects of EBS hardware failure in AWS
[Diagram: the ClickHouse Installation layout again: Service “svc” in front of one stateful set and pod per replica (“svc-1”, “svc-2”, “svc-3”), each with its own persistent volume claim and persistent volume, along with config maps.]
Operators need adaptation to handle DBMS failures
Failure modes:
● Volume corruption
● Metadata corruption
● Misconfiguration
● Bad upgrade
● Unfulfilled resources
● Accidental volume deletion
● Slow startup
Remedies:
● Logs accessible outside pod
● Change CRD and apply
● Take DBMS out of rotation & fix online
● Robust node replace protocol
● kubectl exec
● Built-in locks on PVC and/or PV
How can cattle adapt to the herd?
File-based vs. network management
[Diagram, file-based: a server with storage, logs & metric data, and config files. Direct access to file system required for configuration, backup & restore, and metrics export. Lots of sidecars!]
[Diagram, network: a server with storage, logs & metric data, and config files. All management via network: configuration, metrics export, backup & restore. Just one pod!]
Attached local storage vs. network only with local cache
[Diagram, attached local storage: three servers, each with its own NVMe SSD. Pods pinned to host! (But low latency)]
[Diagram, network only with local cache: three servers, each with an SSD cache, in front of a shared storage layer. Pods scheduled on any host! (But higher latency)]
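The “network only with local cache” option can be pictured as a read-through cache: reads are served from local SSD when possible and fall back to the shared storage layer on a miss. The sketch below is a generic illustration with least-recently-used eviction, not a description of how ClickHouse implements caching. The ReadThroughCache class and its fetch callback are hypothetical.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Small LRU read-through cache in front of a slower shared store."""

    def __init__(self, fetch, capacity: int):
        self._fetch = fetch            # callable that reads from shared storage
        self._capacity = capacity      # maximum number of cached blocks
        self._blocks = OrderedDict()   # key -> block, oldest first
        self.hits = 0
        self.misses = 0

    def read(self, key):
        if key in self._blocks:
            # Local hit: low latency, and the block becomes most recently used.
            self._blocks.move_to_end(key)
            self.hits += 1
            return self._blocks[key]
        # Miss: pay the higher latency of the shared storage layer.
        self.misses += 1
        value = self._fetch(key)
        self._blocks[key] = value
        if len(self._blocks) > self._capacity:
            self._blocks.popitem(last=False)  # evict the least recently used block
        return value
```

Because the cache holds no data that cannot be fetched again, a pod using it can be rescheduled to any host and simply starts with a cold cache.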
Shared nothing versus separated compute/storage
[Diagram, shared nothing: three servers, each with dedicated storage.]
[Diagram, separated compute/storage: several servers over a shared storage layer.]
Increasing volume of data, increasing elasticity
Final thoughts on life in the herd
Database happiness on Kubernetes
● Build a replica model that permits variation
○ Not stateful sets
● Fix things without resorting to ‘kubectl exec’
● Access management and storage through the network
Thank you!
Questions?
https://altinity.com
rhodges at altinity.com
Altinity.Cloud
Altinity Kubernetes Operator for ClickHouse
We’re hiring!
