SlideShare a Scribd company logo
© 2022 Altinity, Inc.
Deep Dive on
ClickHouse Sharding
and Replication
Robert Hodges and Altinity Engineering
22 September 2022
1
© 2202 Altinity, Inc.
© 2022 Altinity, Inc.
Let’s make some introductions
ClickHouse support and services including Altinity.Cloud
Authors of Altinity Kubernetes Operator for ClickHouse
and other open source projects
Us
Database geeks with centuries
of experience in DBMS and
applications
You
Applications developers
looking to learn about
ClickHouse
2
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
What’s a
ClickHouse?
3
© 2022 Altinity, Inc.
Understands SQL
Runs on bare metal to cloud
Shared nothing architecture
Stores data in columns
Parallel and vectorized execution
Scales to many petabytes
Is Open source (Apache 2.0)
ClickHouse is a SQL Data Warehouse
It’s the core engine for
real-time analytics
ClickHouse
Event
Streams
ELT
Object
Storage
Interactive
Graphics
Dashboards
APIs
4
© 2022 Altinity, Inc.
Distributed data is deeper than it looks
5
Width:
2 meters
Depth:
60 meters
“The
Bolton
Strid”
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Introducing
sharding and
replication
6
© 2022 Altinity, Inc.
Clickhouse nodes can scale vertically
Network-
Attached
Storage
CPU
RAM
Host
7
© 2022 Altinity, Inc.
Clickhouse nodes can scale vertically
CPU
RAM
Host
Network-
Attached
Storage
8
© 2022 Altinity, Inc.
Clusters introduce horizontal scaling
Shards
Replicas
Host Host Host
Host
Replicas improve read
IOPs and concurrency
Shards add write
IOPS
9
© 2022 Altinity, Inc.
Different sharding and replication patterns
Shard 1
Shard 3
Shard 2
Shard 4
All Sharded
Data sharded 4
ways without
replication
Replica 1
Replica 3
Replica 2
Replica 4
All Replicated
Data replicated 4
times without
sharding
Shard 1
Replica 1
Shard 1
Replica 2
Shard 2
Replica 1
Shard 2
Replica 2
Sharded and
Replicated
Data sharded 2
ways and
replicated 2 times
10
© 2022 Altinity, Inc.
MergeTree tables support replication
MergeTree
SummingMergeTree
AggregatingMergeTree
CollapsingMergeTree
VersionedCollapsing
MergeTree
ReplicatedMergeTree
ReplicatedSummingMergeTree
ReplicatedAggregatingMergeTree
ReplicatedCollapsingMergeTree
ReplicatedVersionedCollapsing
MergeTree
ReplacingMergeTree ReplicatedReplacingMergeTree
Source data
Aggregated
data; single
row per group
Evolving data
11
© 2022 Altinity, Inc.
How replication works
INSERT
Replicate
ClickHouse Node 1
Table: ontime
(Parts)
ReplicatedMergeTree
:9009
:9443 ClickHouse Node 2
Table: ontime
(Parts)
ReplicatedMergeTree
:9009
:9443
zookeeper-1
ZNodes
:2181 zookeeper-2
ZNodes
:2181 zookeeper-3
ZNodes
:2181
12
© 2022 Altinity, Inc.
What is replicated?
Replicated statements Non-replicated statements
● INSERT
● ALTER TABLE
exceptions: FREEZE, MOVE TO
DISK, FETCH
● OPTIMIZE
● TRUNCATE
● CREATE table
● DROP table
● RENAME table
● DETACH table
● ATTACH table
Replicated*MergeTree ONLY
13
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Building
distributed
schema
14
© 2022 Altinity, Inc.
Example of a distributed data set with shards and replicas
clickhouse-0
ontime
_local
airports
ontime
clickhouse-1
ontime
_local
airports
ontime
clickhouse-2
ontime
_local
airports
ontime
clickhouse-3
ontime
_local
airports
ontime
Distributed
table
(No data)
Sharded,
replicated
table
(Partial data)
Fully
replicated
table
(All data)
15
© 2022 Altinity, Inc.
Step 1: A sharded, replicated fact table
CREATE TABLE IF NOT EXISTS `ontime_local` (
`Year` UInt16 CODEC(DoubleDelta, ZSTD(1)),
`Quarter` UInt8,
`Month` UInt8,
`DayofMonth` UInt8,
`DayOfWeek` UInt8, ...
) Engine=ReplicatedMergeTree(
'/clickhouse/{cluster}/tables/{shard}/{database}/ontime_local',
'{replica}')
PARTITION BY toYYYYMM(FlightDate)
ORDER BY (FlightDate, `Year`, `Month`, DepDel15)
Replication is at the table level! Use a Replicated% Engine
16
© 2022 Altinity, Inc.
Step 2: A distributed table to find data
CREATE TABLE IF NOT EXISTS ontime
AS ontime_local
ENGINE = Distributed(
'{cluster}', currentDatabase(), ontime_local, rand())
Cluster
layout
Database Table Sharding
key
(optional)
17
© 2022 Altinity, Inc.
Step 3: A fully replicated dimension table
CREATE TABLE IF NOT EXISTS airports
AS default.dot_airports
Engine=ReplicatedMergeTree(
'/clickhouse/{cluster}/tables/all/{database}/airports',
'{replica}')
PARTITION BY tuple()
PRIMARY KEY AirportID
ORDER BY AirportID
Don’t bother with partitions
for small tables
Resolves to current
database
18
© 2022 Altinity, Inc.
Macros help CREATE TABLE ON CLUSTER
/etc/clickhouse-server/config.d/macros.xml:
<clickhouse>
<macros>
<all-sharded-shard>2</all-sharded-shard>
<cluster>demo</cluster>
<shard>0</shard>
<replica>clickhouse-0-1</replica>
</macros>
</clickhouse>
select * from system.macros
Replica names
should be unique
per host
19
© 2022 Altinity, Inc.
What does ON CLUSTER do?
ON CLUSTER executes a command over a set of nodes
CREATE TABLE IF NOT EXISTS `ontime_local` ON CLUSTER `{cluster}` ...
DROP TABLE IF EXISTS `ontime_local` ON CLUSTER `{cluster}` ...
ALTER TABLE `ontime_local` ON CLUSTER `{cluster}` ...
20
© 2022 Altinity, Inc.
How does ON CLUSTER know where to go?
/etc/clickhouse-server/config.d/remote_servers.xml:
<clickhouse>
<remote_servers>
<demo>
<!-- <secret>top secret</secret> -->
<shard>
<replica><host>10.0.0.71</host><port>9000</port></replica>
<replica><host>10.0.0.72</host><port>9000</port></replica>
<internal_replication>true</internal_replication>
</shard>
<shard>
. . .
</shard>
</demo>
</remote_servers>
</clickhouse>
“It’s a cluster
because I said so!”
Cluster name
21
Shared secret
© 2022 Altinity, Inc.
List layouts using system.clusters
-- Find name and hosts in each layout
SELECT
cluster,
groupArray(concat(host_name,':',toString(port))) AS hosts
FROM system.clusters
GROUP BY cluster ORDER BY cluster
22
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Loading and
querying data
23
© 2022 Altinity, Inc.
Data loading: Distributed vs. local INSERTs
ontime
_local
ontime
Insert via
distributed
table
Insert directly
to shards
ontime
_local
ontime
ontime
_local
ontime
ontime
_local
ontime
Data
Pipeline Data
Pipeline
Applications may have to
be more intelligent
May require more
resources
(Queue)
24
© 2022 Altinity, Inc.
INSERT into a distributed vs. local table
-- Insert into distributed table
INSERT INTO ontime VALUES
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
...
-- Insert into a local table
INSERT INTO ontime_local VALUES
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
...
25
© 2022 Altinity, Inc.
How does a distributed INSERT work?
ontime
_local
ontime
Insert via
distributed table
ontime
_local
ontime
ontime
_local
ontime
Data
Pipeline
(Queue)
insert_distributed_sync:
● 0 = async propagation
● 1 = sync propagation ontime
_local
ontime
Thread Pool
select * from
system.distribution_queue
replication
26
© 2022 Altinity, Inc.
Options for processing INSERTs
● Local vs distributed data insertion
○ INSERT to local table – no need to sync, larger blocks, faster
○ INSERT to Distributed table – sharding by ClickHouse
○ CHProxy -- distributes transactions across nodes, only works with HTTP
connections
● Asynchronous (default) vs synchronous insertions
○ insert_distributed_sync - Wait until batches make it to local tables
○ insert_quorum, select_sequential_consistency – Wait until replicas sync
27
© 2022 Altinity, Inc.
How do distributed SELECTs work?
ontime
_local
ontime
Application
ontime
_local
ontime
ontime
_local
ontime
ontime
_local
ontime
Application
Innermost
subselect is
distributed
AggregateState
computed
locally
Aggregates
merged on
initiator node
28
© 2022 Altinity, Inc.
Queries are pushed to all shards
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime
GROUP BY Carrier ORDER BY Delay DESC
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime_local
GROUP BY Carrier ORDER BY Delay DESC
29
© 2022 Altinity, Inc.
ClickHouse pushes down JOINs by default
SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad
FROM default.ontime o
JOIN default.airports a ON (a.IATA = o.Dest)
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC
LIMIT 10
SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS
ad
FROM default.ontime_local AS o
ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10
30
© 2022 Altinity, Inc.
...Unless the left side “table” is a subquery
SELECT d, Name n, c AS flights, ad
FROM
(
SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad
FROM default.ontime
GROUP BY d HAVING c > 100000
ORDER BY ad DESC
) AS o
LEFT JOIN airports ON airports.IATA = o.d
LIMIT 10
Remote
Servers
31
© 2022 Altinity, Inc.
It’s more complex when multiple tables are distributed
select foo from T1 where a in (select a from T2)
distributed_product_mode=?
local
select foo
from T1_local
where a in (
select a
from T2_local)
allow
select foo
from T1_local
where a in (
select a
from T2)
global
create temporary table
tmp Engine = Set
AS select a from T2;
select foo from
T1_local where a in
tmp;
(Subquery runs on
local table)
(Subquery runs on
distributed table) (Subquery runs on initiator;
broadcast to local temp table)
32
© 2022 Altinity, Inc.
What’s actually happening with queries? Let’s find out!
SELECT hostName() host, event_time, query_id,
is_initial_query AS initial,
if(is_initial_query, '', initial_query_id) as initial_q,
query
FROM cluster('{cluster}', system.query_log) AS st
WHERE type = 'QueryFinish' AND has(databases, 'test')
ORDER BY st.event_time DESC LIMIT 25
33
© 2022 Altinity, Inc.
Thinking about distributed data and joins
Large
id
1
2
…
…
1000
Small
id
1
…
100
Large
id
1
2
…
…
1000
Large
id
1
2
…
…
1000
Large
id
1001
1002
…
…
2000
Large
id
2001
2002
…
…
2000
Large
id
1001
1002
…
…
2000
Small
id
1
…
100
Shard 1 Shard 2 Shard 1 Shard 2
“Bucketing Model”
“Big Table Model”
All keys replicated Matching keys in
each bucket
34
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Tricks to query
distributed tables
35
© 2022 Altinity, Inc.
Use remote() to select from another node
SELECT count()
FROM remote('host-2', currentDatabase(), 'ontime_ref')
SELECT count()
FROM remoteSecure('host-2', currentDatabase(), 'ontime_ref')
┌───count()─┐
│ 196508419 │
└───────────┘
-- You can insert too, with FUNCTION keyword.
INSERT INTO FUNCTION remote(host, database, table, login,
password)
VALUES . . .
36
© 2022 Altinity, Inc.
More remote query tricks!
SELECT hostName() AS h, count() AS c FROM sdata GROUP BY h
┌─h─────────────────────────┬───c─┐
│ chi-test-rh-test-rh-1-0-0 │ 492 │
│ chi-test-rh-test-rh-0-0-0 │ 508 │
└───────────────────────────┴─────┘
SELECT hostName() AS h, count() AS c
FROM remote('chi-test-rh-test-rh-{0,1}-{0,1}', default, sdata)
GROUP BY h
┌─h─────────────────────────┬────c─┐
│ chi-test-rh-test-rh-1-0-0 │ 984 │
│ chi-test-rh-test-rh-1-1-0 │ 984 │
│ chi-test-rh-test-rh-0-1-0 │ 1016 │
│ chi-test-rh-test-rh-0-0-0 │ 1016 │
└───────────────────────────┴──────┘
Distributed table
Remote query all 4
hosts
37
© 2022 Altinity, Inc.
cluster() distributes queries dynamically
SELECT
hostName() AS host, count() AS tables
FROM cluster('{cluster}', system.tables)
WHERE database = 'default'
GROUP BY host
┌─host──────────────────────┬─tables─┐
│ chi-test-rh-test-rh-1-0-0 │ 2 │
│ chi-test-rh-test-rh-0-1-0 │ 2 │
└───────────────────────────┴────────┘
38
© 2022 Altinity, Inc.
clusterAllReplicas() goes to every node
SELECT
hostName() AS host, count() AS tables
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database = 'default'
GROUP BY host
┌─host──────────────────────┬─tables─┐
│ chi-test-rh-test-rh-1-0-0 │ 2 │
│ chi-test-rh-test-rh-1-1-0 │ 2 │
│ chi-test-rh-test-rh-0-1-0 │ 2 │
│ chi-test-rh-test-rh-0-0-0 │ 2 │
└───────────────────────────┴────────┘
39
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Scaling up
40
© 2022 Altinity, Inc.
Load testing and capacity planning made simple…
1. Establish single node baseload
● Use production data
● Max out SELECT & INSERT capacity with load tests
● Adjust schema and queries, retest
2. Add replicas to increase SELECT capacity
3. Add shards to increase INSERT capacity
41
© 2022 Altinity, Inc.
Selecting the sharding key
Shard 2 Shard 3
Shard 1
Randomized Key, e.g.,
cityHash64(Url)
Must query
all shards
Nodes are
balanced
Shard 3
Specific Key e.g.,
cityHash64(TenantId)
Unbalanced
nodes
Queries can
skip shards
Shard 2
Shard 1
Easier to
add nodes
Hard to
add nodes
42
© 2022 Altinity, Inc.
Options for shard rebalancing
● INSERT INTO new_cluster SELECT FROM old_cluster
○ Clickhouse-copier automates this
● Use (undocumented) ALTER TABLE MOVE PART TO SHARD
○ Example: ALTER TABLE test_move MOVE PART 'all_0_0_0' TO SHARD
'/clickhouse/shard_1/tables/test_move
● Move parts manually
○ ALTER TABLE FREEZE PARTITION
○ rsync to new host
○ ALTER TABLE ATTACH PARTITION
○ Drop original partition
43
© 2022 Altinity, Inc.
Bi-level sharding combines both approaches
cityHash64(Url)
Shard 2 Shard 3
Shard 1
TenantId
Shard 2
Shard 1
cityHash64(Url) cityHash64(Url)
Shard 2
Shard 1
Tenant-Group-1 Tenant-Group-2 Tenant-Group-3
Application chooses group
Distributed table
44
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Wrap-up and more
information
45
© 2022 Altinity, Inc.
Where is the documentation?
ClickHouse official docs – https://clickhouse.com/docs/
Altinity Blog – https://altinity.com/blog/
Altinity Youtube Channel –
https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA
Altinity Knowledge Base – https://kb.altinity.com/
ClickHouse Capacity Planning by Mik Kocikowski of CloudFlare
Meetups, other blogs, and external resources. Use your powers of Search!
46
© 2022 Altinity, Inc.
Where can I get help?
Telegram - ClickHouse Channel
Slack
● ClickHouse Public Workspace - clickhousedb.slack.com
● Altinity Public Workspace - altinitydbworkspace.slack.com
Education - Altinity ClickHouse Training
Support - Altinity offers support for ClickHouse in all environments
47
© 2022 Altinity, Inc. 48
© 2202 Altinity, Inc.
Thank you and
good luck!
Website: https://altinity.com
Email: info@altinity.com
Slack: altinitydbworkspace.slack.com
Altinity.Cloud
Altinity Support
Altinity Stable
Builds
We’re hiring!

More Related Content

What's hot

ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
Altinity Ltd
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
Altinity Ltd
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
Altinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
Altinity Ltd
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
Altinity Ltd
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
Altinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
Altinity Ltd
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
Altinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
Altinity Ltd
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
Altinity Ltd
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
rpolat
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
Yoshinori Matsunobu
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Altinity Ltd
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
Altinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
Altinity Ltd
 
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse OperatorData Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
Altinity Ltd
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
Altinity Ltd
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Altinity Ltd
 

What's hot (20)

ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEOClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
ClickHouse Query Performance Tips and Tricks, by Robert Hodges, Altinity CEO
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
 
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...HTTP Analytics for 6M requests per second using ClickHouse, by  Alexander Boc...
HTTP Analytics for 6M requests per second using ClickHouse, by Alexander Boc...
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
 
RocksDB Performance and Reliability Practices
RocksDB Performance and Reliability PracticesRocksDB Performance and Reliability Practices
RocksDB Performance and Reliability Practices
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTOClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
ClickHouse on Kubernetes, by Alexander Zaitsev, Altinity CTO
 
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse OperatorData Warehouse on Kubernetes: lessons from Clickhouse Operator
Data Warehouse on Kubernetes: lessons from Clickhouse Operator
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
Analytics at Speed: Introduction to ClickHouse and Common Use Cases. By Mikha...
 

Similar to Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf

A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
CristinaMunteanu43
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Ltd
 
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Altinity Ltd
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
Altinity Ltd
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
InfluxData
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Altinity Ltd
 
Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)
Ed Thewlis
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
Taro L. Saito
 
All you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICSAll you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICS
EDB
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
Senturus
 
OpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoTOpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoT
Ohyama Hiroyasu
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Ltd
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
ArangoDB Database
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
Altinity Ltd
 
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxData
 
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Mitsunori Komatsu
 
Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?
Data Con LA
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
Chris Adkin
 
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdfClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
Altinity Ltd
 
Andre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStackAndre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStack
ShapeBlue
 

Similar to Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf (20)

A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
 
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
All you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICSAll you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICS
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
OpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoTOpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoT
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
 
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
 
Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
 
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdfClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
 
Andre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStackAndre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStack
 

More from Altinity Ltd

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Altinity Ltd
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Altinity Ltd
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Altinity Ltd
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
Altinity Ltd
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Altinity Ltd
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Altinity Ltd
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Altinity Ltd
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Altinity Ltd
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
Altinity Ltd
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
Altinity Ltd
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Altinity Ltd
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
Altinity Ltd
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
Altinity Ltd
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
Altinity Ltd
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
Altinity Ltd
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
Altinity Ltd
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
Altinity Ltd
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
Altinity Ltd
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
Altinity Ltd
 
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
Altinity Ltd
 

More from Altinity Ltd (20)

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
 
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
 

Recently uploaded

From Clues to Connections: How Social Media Investigators Expose Hidden Networks
From Clues to Connections: How Social Media Investigators Expose Hidden NetworksFrom Clues to Connections: How Social Media Investigators Expose Hidden Networks
From Clues to Connections: How Social Media Investigators Expose Hidden Networks
Milind Agarwal
 
ISBP 821 - UCP 600 - ed).pdf banking standards
ISBP 821 - UCP 600 - ed).pdf banking standardsISBP 821 - UCP 600 - ed).pdf banking standards
ISBP 821 - UCP 600 - ed).pdf banking standards
DevanshuAnada1
 
Maruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekhoMaruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekho
kamli sharma#S10
 
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
dizzycaye
 
Nipissing University degree offer Nipissing diploma Transcript
Nipissing University degree offer Nipissing diploma TranscriptNipissing University degree offer Nipissing diploma Transcript
Nipissing University degree offer Nipissing diploma Transcript
zyqedad
 
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptxArtificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
vaishnavisharma877623
 
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).docbai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
PhngThLmHnh
 
Fine-Tuning of Small/Medium LLMs for Business QA on Structured Data
Fine-Tuning of Small/Medium LLMs for Business QA on Structured DataFine-Tuning of Small/Medium LLMs for Business QA on Structured Data
Fine-Tuning of Small/Medium LLMs for Business QA on Structured Data
kevig
 
potential development of the A* search algorithm specifically
potential development of the A* search algorithm specificallypotential development of the A* search algorithm specifically
potential development of the A* search algorithm specifically
huseindihon
 
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy DsouzaOpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
OpenMetadata
 
Universidad de Valladolid degree offer diploma Transcript
Universidad de Valladolid  degree offer diploma TranscriptUniversidad de Valladolid  degree offer diploma Transcript
Universidad de Valladolid degree offer diploma Transcript
taqyea
 
University of the Sunshine Coast degree offer diploma Transcript
University of the Sunshine Coast  degree offer diploma TranscriptUniversity of the Sunshine Coast  degree offer diploma Transcript
University of the Sunshine Coast degree offer diploma Transcript
taqyea
 
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
solankikamal004
 
How We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeachHow We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeach
javier ramirez
 
The University of New England degree offer diploma Transcript
The University of New England  degree offer diploma TranscriptThe University of New England  degree offer diploma Transcript
The University of New England degree offer diploma Transcript
taqyea
 
Amul goes international: Desi dairy giant to launch fresh ...
Amul goes international: Desi dairy giant to launch fresh ...Amul goes international: Desi dairy giant to launch fresh ...
Amul goes international: Desi dairy giant to launch fresh ...
chetankumar9855
 
Introduction to the Red Hat Portfolio.pdf
Introduction to the Red Hat Portfolio.pdfIntroduction to the Red Hat Portfolio.pdf
Introduction to the Red Hat Portfolio.pdf
kihus38
 
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
janvikumar4133
 
Supervised Learning (Data Science).pptx
Supervised Learning  (Data Science).pptxSupervised Learning  (Data Science).pptx
Supervised Learning (Data Science).pptx
TARIKU ENDALE
 
Sin Involves More Than You Might Think (We'll Explain)
Sin Involves More Than You Might Think (We'll Explain)Sin Involves More Than You Might Think (We'll Explain)
Sin Involves More Than You Might Think (We'll Explain)
sapna sharmap11
 

Recently uploaded (20)

From Clues to Connections: How Social Media Investigators Expose Hidden Networks
From Clues to Connections: How Social Media Investigators Expose Hidden NetworksFrom Clues to Connections: How Social Media Investigators Expose Hidden Networks
From Clues to Connections: How Social Media Investigators Expose Hidden Networks
 
ISBP 821 - UCP 600 - ed).pdf banking standards
ISBP 821 - UCP 600 - ed).pdf banking standardsISBP 821 - UCP 600 - ed).pdf banking standards
ISBP 821 - UCP 600 - ed).pdf banking standards
 
Maruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekhoMaruti Wagon R on road price in Faridabad - CarDekho
Maruti Wagon R on road price in Faridabad - CarDekho
 
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
Female Service Girls Call Navi Mumbai 9930245274 Provide Best And Top Girl Se...
 
Nipissing University degree offer Nipissing diploma Transcript
Nipissing University degree offer Nipissing diploma TranscriptNipissing University degree offer Nipissing diploma Transcript
Nipissing University degree offer Nipissing diploma Transcript
 
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptxArtificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
Artificial Intelligence (AI) Technology Project Proposal _ by Slidesgo.pptx
 
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).docbai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
bai-tap-tieng-anh-lop-12-unit-4-the-mass-media (1).doc
 
Fine-Tuning of Small/Medium LLMs for Business QA on Structured Data
Fine-Tuning of Small/Medium LLMs for Business QA on Structured DataFine-Tuning of Small/Medium LLMs for Business QA on Structured Data
Fine-Tuning of Small/Medium LLMs for Business QA on Structured Data
 
potential development of the A* search algorithm specifically
potential development of the A* search algorithm specificallypotential development of the A* search algorithm specifically
potential development of the A* search algorithm specifically
 
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy DsouzaOpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
OpenMetadata Spotlight - OpenMetadata @ Aspire by Vinol Joy Dsouza
 
Universidad de Valladolid degree offer diploma Transcript
Universidad de Valladolid  degree offer diploma TranscriptUniversidad de Valladolid  degree offer diploma Transcript
Universidad de Valladolid degree offer diploma Transcript
 
University of the Sunshine Coast degree offer diploma Transcript
University of the Sunshine Coast  degree offer diploma TranscriptUniversity of the Sunshine Coast  degree offer diploma Transcript
University of the Sunshine Coast degree offer diploma Transcript
 
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in CityGirls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
Girls Call Chennai 000XX00000 Provide Best And Top Girl Service And No1 in City
 
How We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeachHow We Added Replication to QuestDB - JonTheBeach
How We Added Replication to QuestDB - JonTheBeach
 
The University of New England degree offer diploma Transcript
The University of New England  degree offer diploma TranscriptThe University of New England  degree offer diploma Transcript
The University of New England degree offer diploma Transcript
 
Amul goes international: Desi dairy giant to launch fresh ...
Amul goes international: Desi dairy giant to launch fresh ...Amul goes international: Desi dairy giant to launch fresh ...
Amul goes international: Desi dairy giant to launch fresh ...
 
Introduction to the Red Hat Portfolio.pdf
Introduction to the Red Hat Portfolio.pdfIntroduction to the Red Hat Portfolio.pdf
Introduction to the Red Hat Portfolio.pdf
 
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
Beautiful Girls Call 9711199171 9711199171 Provide Best And Top Girl Service ...
 
Supervised Learning (Data Science).pptx
Supervised Learning  (Data Science).pptxSupervised Learning  (Data Science).pptx
Supervised Learning (Data Science).pptx
 
Sin Involves More Than You Might Think (We'll Explain)
Sin Involves More Than You Might Think (We'll Explain)Sin Involves More Than You Might Think (We'll Explain)
Sin Involves More Than You Might Think (We'll Explain)
 

Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf

  • 1. © 2022 Altinity, Inc. Deep Dive on ClickHouse Sharding and Replication Robert Hodges and Altinity Engineering 22 September 2022 1 © 2202 Altinity, Inc.
  • 2. © 2022 Altinity, Inc. Let’s make some introductions ClickHouse support and services including Altinity.Cloud Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects Us Database geeks with centuries of experience in DBMS and applications You Applications developers looking to learn about ClickHouse 2
  • 3. © 2022 Altinity, Inc. © 2022 Altinity, Inc. What’s a ClickHouse? 3
  • 4. © 2022 Altinity, Inc. Understands SQL Runs on bare metal to cloud Shared nothing architecture Stores data in columns Parallel and vectorized execution Scales to many petabytes Is Open source (Apache 2.0) ClickHouse is a SQL Data Warehouse It’s the core engine for real-time analytics ClickHouse Event Streams ELT Object Storage Interactive Graphics Dashboards APIs 4
  • 5. © 2022 Altinity, Inc. Distributed data is deeper than it looks 5 Width: 2 meters Depth: 60 meters “The Bolton Strid”
  • 6. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Introducing sharding and replication 6
  • 7. © 2022 Altinity, Inc. Clickhouse nodes can scale vertically Network- Attached Storage CPU RAM Host 7
  • 8. © 2022 Altinity, Inc. Clickhouse nodes can scale vertically CPU RAM Host Network- Attached Storage 8
  • 9. © 2022 Altinity, Inc. Clusters introduce horizontal scaling Shards Replicas Host Host Host Host Replicas improve read IOPs and concurrency Shards add write IOPS 9
  • 10. © 2022 Altinity, Inc. Different sharding and replication patterns Shard 1 Shard 3 Shard 2 Shard 4 All Sharded Data sharded 4 ways without replication Replica 1 Replica 3 Replica 2 Replica 4 All Replicated Data replicated 4 times without sharding Shard 1 Replica 1 Shard 1 Replica 2 Shard 2 Replica 1 Shard 2 Replica 2 Sharded and Replicated Data sharded 2 ways and replicated 2 times 10
  • 11. © 2022 Altinity, Inc. MergeTree tables support replication MergeTree SummingMergeTree AggregatingMergeTree CollapsingMergeTree VersionedCollapsing MergeTree ReplicatedMergeTree ReplicatedSummingMergeTree ReplicatedAggregatingMergeTree ReplicatedCollapsingMergeTree ReplicatedVersionedCollapsing MergeTree ReplacingMergeTree ReplicatedReplacingMergeTree Source data Aggregated data; single row per group Evolving data 11
  • 12. © 2022 Altinity, Inc. How replication works INSERT Replicate ClickHouse Node 1 Table: ontime (Parts) ReplicatedMergeTree :9009 :9443 ClickHouse Node 2 Table: ontime (Parts) ReplicatedMergeTree :9009 :9443 zookeeper-1 ZNodes :2181 zookeeper-2 ZNodes :2181 zookeeper-3 ZNodes :2181 12
  • 13. © 2022 Altinity, Inc. What is replicated? Replicated statements Non-replicated statements ● INSERT ● ALTER TABLE exceptions: FREEZE, MOVE TO DISK, FETCH ● OPTIMIZE ● TRUNCATE ● CREATE table ● DROP table ● RENAME table ● DETACH table ● ATTACH table Replicated*MergeTree ONLY 13
  • 14. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Building distributed schema 14
  • 15. © 2022 Altinity, Inc. Example of a distributed data set with shards and replicas clickhouse-0 ontime _local airports ontime clickhouse-1 ontime _local airports ontime clickhouse-2 ontime _local airports ontime clickhouse-3 ontime _local airports ontime Distributed table (No data) Sharded, replicated table (Partial data) Fully replicated table (All data) 15
  • 16. © 2022 Altinity, Inc. Step 1: A sharded, replicated fact table CREATE TABLE IF NOT EXISTS `ontime_local` ( `Year` UInt16 CODEC(DoubleDelta, ZSTD(1)), `Quarter` UInt8, `Month` UInt8, `DayofMonth` UInt8, `DayOfWeek` UInt8, ... ) Engine=ReplicatedMergeTree( '/clickhouse/{cluster}/tables/{shard}/{database}/ontime_local', '{replica}') PARTITION BY toYYYYMM(FlightDate) ORDER BY (FlightDate, `Year`, `Month`, DepDel15) Replication is at the table level! Use a Replicated% Engine 16
  • 17. © 2022 Altinity, Inc. Step 2: A distributed table to find data CREATE TABLE IF NOT EXISTS ontime AS ontime_local ENGINE = Distributed( '{cluster}', currentDatabase(), ontime_local, rand()) Cluster layout Database Table Sharding key (optional) 17
  • 18. © 2022 Altinity, Inc. Step 3: A fully replicated dimension table CREATE TABLE IF NOT EXISTS airports AS default.dot_airports Engine=ReplicatedMergeTree( '/clickhouse/{cluster}/tables/all/{database}/airports', '{replica}') PARTITION BY tuple() PRIMARY KEY AirportID ORDER BY AirportID Don’t bother with partitions for small tables Resolves to current database 18
  • 19. © 2022 Altinity, Inc. Macros help CREATE TABLE ON CLUSTER /etc/clickhouse-server/config.d/macros.xml: <clickhouse> <macros> <all-sharded-shard>2</all-sharded-shard> <cluster>demo</cluster> <shard>0</shard> <replica>clickhouse-0-1</replica> </macros> </clickhouse> select * from system.macros Replica names should be unique per host 19
  • 20. © 2022 Altinity, Inc. What does ON CLUSTER do? ON CLUSTER executes a command over a set of nodes CREATE TABLE IF NOT EXISTS `ontime_local` ON CLUSTER `{cluster}` ... DROP TABLE IF EXISTS `ontime_local` ON CLUSTER `{cluster}` ... ALTER TABLE `ontime_local` ON CLUSTER `{cluster}` ... 20
  • 21. © 2022 Altinity, Inc. How does ON CLUSTER know where to go? /etc/clickhouse-server/config.d/remote_servers.xml: <clickhouse> <remote_servers> <demo> <!-- <secret>top secret</secret> --> <shard> <replica><host>10.0.0.71</host><port>9000</port></replica> <replica><host>10.0.0.72</host><port>9000</port></replica> <internal_replication>true</internal_replication> </shard> <shard> . . . </shard> </demo> </remote_servers> </clickhouse> “It’s a cluster because I said so!” Cluster name 21 Shared secret
  • 22. © 2022 Altinity, Inc. List layouts using system.clusters -- Find name and hosts in each layout SELECT cluster, groupArray(concat(host_name,':',toString(port))) AS hosts FROM system.clusters GROUP BY cluster ORDER BY cluster 22
  • 23. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Loading and querying data 23
  • 24. © 2022 Altinity, Inc. Data loading: Distributed vs. local INSERTs ontime _local ontime Insert via distributed table Insert directly to shards ontime _local ontime ontime _local ontime ontime _local ontime Data Pipeline Data Pipeline Applications may have to be more intelligent May require more resources (Queue) 24
  • 25. © 2022 Altinity, Inc. INSERT into a distributed vs. local table -- Insert into distributed table INSERT INTO ontime VALUES (2017,1,1,1,7,'2017-01-01','AA',19805,...), (2017,1,1,1,7,'2017-01-01','AA',19805,...), ... -- Insert into a local table INSERT INTO ontime_local VALUES (2017,1,1,1,7,'2017-01-01','AA',19805,...), (2017,1,1,1,7,'2017-01-01','AA',19805,...), ... 25
  • 26. © 2022 Altinity, Inc. How does a distributed INSERT work? ontime _local ontime Insert via distributed table ontime _local ontime ontime _local ontime Data Pipeline (Queue) insert_distributed_sync: ● 0 = async propagation ● 1 = sync propagation ontime _local ontime Thread Pool select * from system.distribution_queue replication 26
  • 27. © 2022 Altinity, Inc. Options for processing INSERTs ● Local vs distributed data insertion ○ INSERT to local table – no need to sync, larger blocks, faster ○ INSERT to Distributed table – sharding by ClickHouse ○ CHProxy -- distributes transactions across nodes, only works with HTTP connections ● Asynchronous (default) vs synchronous insertions ○ insert_distributed_sync - Wait until batches make it to local tables ○ insert_quorum, select_sequential_consistency – Wait until replicas sync 27
  • 28. © 2022 Altinity, Inc. How do distributed SELECTs work? ontime _local ontime Application ontime _local ontime ontime _local ontime ontime _local ontime Application Innermost subselect is distributed AggregateState computed locally Aggregates merged on initiator node 28
  • 29. © 2022 Altinity, Inc. Queries are pushed to all shards SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC SELECT Carrier, avg(DepDelay) AS Delay FROM ontime_local GROUP BY Carrier ORDER BY Delay DESC 29
  • 30. © 2022 Altinity, Inc. ClickHouse pushes down JOINs by default SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad FROM default.ontime o JOIN default.airports a ON (a.IATA = o.Dest) GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10 SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS ad FROM default.ontime_local AS o ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10 30
  • 31. © 2022 Altinity, Inc. ...Unless the left side “table” is a subquery SELECT d, Name n, c AS flights, ad FROM ( SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad FROM default.ontime GROUP BY d HAVING c > 100000 ORDER BY ad DESC ) AS o LEFT JOIN airports ON airports.IATA = o.d LIMIT 10 Remote Servers 31
  • 32. © 2022 Altinity, Inc. It’s more complex when multiple tables are distributed select foo from T1 where a in (select a from T2) distributed_product_mode=? local select foo from T1_local where a in ( select a from T2_local) allow select foo from T1_local where a in ( select a from T2) global create temporary table tmp Engine = Set AS select a from T2; select foo from T1_local where a in tmp; (Subquery runs on local table) (Subquery runs on distributed table) (Subquery runs on initiator; broadcast to local temp table) 32
  • 33. © 2022 Altinity, Inc. What’s actually happening with queries? Let’s find out! SELECT hostName() host, event_time, query_id, is_initial_query AS initial, if(is_initial_query, '', initial_query_id) as initial_q, query FROM cluster('{cluster}', system.query_log) AS st WHERE type = 'QueryFinish' AND has(databases, 'test') ORDER BY st.event_time DESC LIMIT 25 33
  • 34. © 2022 Altinity, Inc. Thinking about distributed data and joins Large id 1 2 … … 1000 Small id 1 … 100 Large id 1 2 … … 1000 Large id 1 2 … … 1000 Large id 1001 1002 … … 2000 Large id 2001 2002 … … 2000 Large id 1001 1002 … … 2000 Small id 1 … 100 Shard 1 Shard 2 Shard 1 Shard 2 “Bucketing Model” “Big Table Model” All keys replicated Matching keys in each bucket 34
  • 35. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Tricks to query distributed tables 35
  • 36. © 2022 Altinity, Inc. Use remote() to select from another node SELECT count() FROM remote('host-2', currentDatabase(), 'ontime_ref') SELECT count() FROM remoteSecure('host-2', currentDatabase(), 'ontime_ref') ┌───count()─┐ │ 196508419 │ └───────────┘ -- You can insert too, with FUNCTION keyword. INSERT INTO FUNCTION remote(host, database, table, login, password) VALUES . . . 36
  • 37. © 2022 Altinity, Inc. More remote query tricks! SELECT hostName() AS h, count() AS c FROM sdata GROUP BY h ┌─h─────────────────────────┬───c─┐ │ chi-test-rh-test-rh-1-0-0 │ 492 │ │ chi-test-rh-test-rh-0-0-0 │ 508 │ └───────────────────────────┴─────┘ SELECT hostName() AS h, count() AS c FROM remote('chi-test-rh-test-rh-{0,1}-{0,1}', default, sdata) GROUP BY h ┌─h─────────────────────────┬────c─┐ │ chi-test-rh-test-rh-1-0-0 │ 984 │ │ chi-test-rh-test-rh-1-1-0 │ 984 │ │ chi-test-rh-test-rh-0-1-0 │ 1016 │ │ chi-test-rh-test-rh-0-0-0 │ 1016 │ └───────────────────────────┴──────┘ Distributed table Remote query all 4 hosts 37
  • 38. © 2022 Altinity, Inc. cluster() distributes queries dynamically SELECT hostName() AS host, count() AS tables FROM cluster('{cluster}', system.tables) WHERE database = 'default' GROUP BY host ┌─host──────────────────────┬─tables─┐ │ chi-test-rh-test-rh-1-0-0 │ 2 │ │ chi-test-rh-test-rh-0-1-0 │ 2 │ └───────────────────────────┴────────┘ 38
  • 39. © 2022 Altinity, Inc. clusterAllReplicas() goes to every node SELECT hostName() AS host, count() AS tables FROM clusterAllReplicas('{cluster}', system.tables) WHERE database = 'default' GROUP BY host ┌─host──────────────────────┬─tables─┐ │ chi-test-rh-test-rh-1-0-0 │ 2 │ │ chi-test-rh-test-rh-1-1-0 │ 2 │ │ chi-test-rh-test-rh-0-1-0 │ 2 │ │ chi-test-rh-test-rh-0-0-0 │ 2 │ └───────────────────────────┴────────┘ 39
  • 40. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Scaling up 40
  • 41. © 2022 Altinity, Inc. Load testing and capacity planning made simple… 1. Establish single node baseload ● Use production data ● Max out SELECT & INSERT capacity with load tests ● Adjust schema and queries, retest 2. Add replicas to increase SELECT capacity 3. Add shards to increase INSERT capacity 41
  • 42. © 2022 Altinity, Inc. Selecting the sharding key Shard 2 Shard 3 Shard 1 Randomized Key, e.g., cityHash64(Url) Must query all shards Nodes are balanced Shard 3 Specific Key e.g., cityHash64(TenantId) Unbalanced nodes Queries can skip shards Shard 2 Shard 1 Easier to add nodes Hard to add nodes 42
  • 43. © 2022 Altinity, Inc. Options for shard rebalancing ● INSERT INTO new_cluster SELECT FROM old_cluster ○ Clickhouse-copier automates this ● Use (undocumented) ALTER TABLE MOVE PART TO SHARD ○ Example: ALTER TABLE test_move MOVE PART 'all_0_0_0' TO SHARD '/clickhouse/shard_1/tables/test_move ● Move parts manually ○ ALTER TABLE FREEZE PARTITION ○ rsync to new host ○ ALTER TABLE ATTACH PARTITION ○ Drop original partition 43
  • 44. © 2022 Altinity, Inc. Bi-level sharding combines both approaches cityHash64(Url) Shard 2 Shard 3 Shard 1 TenantId Shard 2 Shard 1 cityHash64(Url) cityHash64(Url) Shard 2 Shard 1 Tenant-Group-1 Tenant-Group-2 Tenant-Group-3 Application chooses group Distributed table 44
  • 45. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Wrap-up and more information 45
  • 46. © 2022 Altinity, Inc. Where is the documentation? ClickHouse official docs – https://clickhouse.com/docs/ Altinity Blog – https://altinity.com/blog/ Altinity Youtube Channel – https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA Altinity Knowledge Base – https://kb.altinity.com/ ClickHouse Capacity Planning by Mik Kocikowski of CloudFlare Meetups, other blogs, and external resources. Use your powers of Search! 46
  • 47. © 2022 Altinity, Inc. Where can I get help? Telegram - ClickHouse Channel Slack ● ClickHouse Public Workspace - clickhousedb.slack.com ● Altinity Public Workspace - altinitydbworkspace.slack.com Education - Altinity ClickHouse Training Support - Altinity offers support for ClickHouse in all environments 47
  • 48. © 2022 Altinity, Inc. 48 © 2202 Altinity, Inc. Thank you and good luck! Website: https://altinity.com Email: info@altinity.com Slack: altinitydbworkspace.slack.com Altinity.Cloud Altinity Support Altinity Stable Builds We’re hiring!