SlideShare a Scribd company logo
© 2022 Altinity, Inc.
Deep Dive on
ClickHouse Sharding
and Replication
Robert Hodges and Altinity Engineering
22 September 2022
1
© 2202 Altinity, Inc.
© 2022 Altinity, Inc.
Let’s make some introductions
ClickHouse support and services including Altinity.Cloud
Authors of Altinity Kubernetes Operator for ClickHouse
and other open source projects
Us
Database geeks with centuries
of experience in DBMS and
applications
You
Applications developers
looking to learn about
ClickHouse
2
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
What’s a
ClickHouse?
3
© 2022 Altinity, Inc.
Understands SQL
Runs on bare metal to cloud
Shared nothing architecture
Stores data in columns
Parallel and vectorized execution
Scales to many petabytes
Is Open source (Apache 2.0)
ClickHouse is a SQL Data Warehouse
It’s the core engine for
real-time analytics
ClickHouse
Event
Streams
ELT
Object
Storage
Interactive
Graphics
Dashboards
APIs
4
© 2022 Altinity, Inc.
Distributed data is deeper than it looks
5
Width:
2 meters
Depth:
60 meters
“The
Bolton
Strid”
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Introducing
sharding and
replication
6
© 2022 Altinity, Inc.
Clickhouse nodes can scale vertically
Network-
Attached
Storage
CPU
RAM
Host
7
© 2022 Altinity, Inc.
Clickhouse nodes can scale vertically
CPU
RAM
Host
Network-
Attached
Storage
8
© 2022 Altinity, Inc.
Clusters introduce horizontal scaling
Shards
Replicas
Host Host Host
Host
Replicas improve read
IOPs and concurrency
Shards add write
IOPS
9
© 2022 Altinity, Inc.
Different sharding and replication patterns
Shard 1
Shard 3
Shard 2
Shard 4
All Sharded
Data sharded 4
ways without
replication
Replica 1
Replica 3
Replica 2
Replica 4
All Replicated
Data replicated 4
times without
sharding
Shard 1
Replica 1
Shard 1
Replica 2
Shard 2
Replica 1
Shard 2
Replica 2
Sharded and
Replicated
Data sharded 2
ways and
replicated 2 times
10
© 2022 Altinity, Inc.
MergeTree tables support replication
MergeTree
SummingMergeTree
AggregatingMergeTree
CollapsingMergeTree
VersionedCollapsing
MergeTree
ReplicatedMergeTree
ReplicatedSummingMergeTree
ReplicatedAggregatingMergeTree
ReplicatedCollapsingMergeTree
ReplicatedVersionedCollapsing
MergeTree
ReplacingMergeTree ReplicatedReplacingMergeTree
Source data
Aggregated
data; single
row per group
Evolving data
11
© 2022 Altinity, Inc.
How replication works
INSERT
Replicate
ClickHouse Node 1
Table: ontime
(Parts)
ReplicatedMergeTree
:9009
:9443 ClickHouse Node 2
Table: ontime
(Parts)
ReplicatedMergeTree
:9009
:9443
zookeeper-1
ZNodes
:2181 zookeeper-2
ZNodes
:2181 zookeeper-3
ZNodes
:2181
12
© 2022 Altinity, Inc.
What is replicated?
Replicated statements Non-replicated statements
● INSERT
● ALTER TABLE
exceptions: FREEZE, MOVE TO
DISK, FETCH
● OPTIMIZE
● TRUNCATE
● CREATE table
● DROP table
● RENAME table
● DETACH table
● ATTACH table
Replicated*MergeTree ONLY
13
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Building
distributed
schema
14
© 2022 Altinity, Inc.
Example of a distributed data set with shards and replicas
clickhouse-0
ontime
_local
airports
ontime
clickhouse-1
ontime
_local
airports
ontime
clickhouse-2
ontime
_local
airports
ontime
clickhouse-3
ontime
_local
airports
ontime
Distributed
table
(No data)
Sharded,
replicated
table
(Partial data)
Fully
replicated
table
(All data)
15
© 2022 Altinity, Inc.
Step 1: A sharded, replicated fact table
CREATE TABLE IF NOT EXISTS `ontime_local` (
`Year` UInt16 CODEC(DoubleDelta, ZSTD(1)),
`Quarter` UInt8,
`Month` UInt8,
`DayofMonth` UInt8,
`DayOfWeek` UInt8, ...
) Engine=ReplicatedMergeTree(
'/clickhouse/{cluster}/tables/{shard}/{database}/ontime_local',
'{replica}')
PARTITION BY toYYYYMM(FlightDate)
ORDER BY (FlightDate, `Year`, `Month`, DepDel15)
Replication is at the table level! Use a Replicated% Engine
16
© 2022 Altinity, Inc.
Step 2: A distributed table to find data
CREATE TABLE IF NOT EXISTS ontime
AS ontime_local
ENGINE = Distributed(
'{cluster}', currentDatabase(), ontime_local, rand())
Cluster
layout
Database Table Sharding
key
(optional)
17
© 2022 Altinity, Inc.
Step 3: A fully replicated dimension table
CREATE TABLE IF NOT EXISTS airports
AS default.dot_airports
Engine=ReplicatedMergeTree(
'/clickhouse/{cluster}/tables/all/{database}/airports',
'{replica}')
PARTITION BY tuple()
PRIMARY KEY AirportID
ORDER BY AirportID
Don’t bother with partitions
for small tables
Resolves to current
database
18
© 2022 Altinity, Inc.
Macros help CREATE TABLE ON CLUSTER
/etc/clickhouse-server/config.d/macros.xml:
<clickhouse>
<macros>
<all-sharded-shard>2</all-sharded-shard>
<cluster>demo</cluster>
<shard>0</shard>
<replica>clickhouse-0-1</replica>
</macros>
</clickhouse>
select * from system.macros
Replica names
should be unique
per host
19
© 2022 Altinity, Inc.
What does ON CLUSTER do?
ON CLUSTER executes a command over a set of nodes
CREATE TABLE IF NOT EXISTS `ontime_local` ON CLUSTER `{cluster}` ...
DROP TABLE IF EXISTS `ontime_local` ON CLUSTER `{cluster}` ...
ALTER TABLE `ontime_local` ON CLUSTER `{cluster}` ...
20
© 2022 Altinity, Inc.
How does ON CLUSTER know where to go?
/etc/clickhouse-server/config.d/remote_servers.xml:
<clickhouse>
<remote_servers>
<demo>
<!-- <secret>top secret</secret> -->
<shard>
<replica><host>10.0.0.71</host><port>9000</port></replica>
<replica><host>10.0.0.72</host><port>9000</port></replica>
<internal_replication>true</internal_replication>
</shard>
<shard>
. . .
</shard>
</demo>
</remote_servers>
</clickhouse>
“It’s a cluster
because I said so!”
Cluster name
21
Shared secret
© 2022 Altinity, Inc.
List layouts using system.clusters
-- Find name and hosts in each layout
SELECT
cluster,
groupArray(concat(host_name,':',toString(port))) AS hosts
FROM system.clusters
GROUP BY cluster ORDER BY cluster
22
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Loading and
querying data
23
© 2022 Altinity, Inc.
Data loading: Distributed vs. local INSERTs
ontime
_local
ontime
Insert via
distributed
table
Insert directly
to shards
ontime
_local
ontime
ontime
_local
ontime
ontime
_local
ontime
Data
Pipeline Data
Pipeline
Applications may have to
be more intelligent
May require more
resources
(Queue)
24
© 2022 Altinity, Inc.
INSERT into a distributed vs. local table
-- Insert into distributed table
INSERT INTO ontime VALUES
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
...
-- Insert into a local table
INSERT INTO ontime_local VALUES
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
(2017,1,1,1,7,'2017-01-01','AA',19805,...),
...
25
© 2022 Altinity, Inc.
How does a distributed INSERT work?
ontime
_local
ontime
Insert via
distributed table
ontime
_local
ontime
ontime
_local
ontime
Data
Pipeline
(Queue)
insert_distributed_sync:
● 0 = async propagation
● 1 = sync propagation ontime
_local
ontime
Thread Pool
select * from
system.distribution_queue
replication
26
© 2022 Altinity, Inc.
Options for processing INSERTs
● Local vs distributed data insertion
○ INSERT to local table – no need to sync, larger blocks, faster
○ INSERT to Distributed table – sharding by ClickHouse
○ CHProxy -- distributes transactions across nodes, only works with HTTP
connections
● Asynchronous (default) vs synchronous insertions
○ insert_distributed_sync - Wait until batches make it to local tables
○ insert_quorum, select_sequential_consistency – Wait until replicas sync
27
© 2022 Altinity, Inc.
How do distributed SELECTs work?
ontime
_local
ontime
Application
ontime
_local
ontime
ontime
_local
ontime
ontime
_local
ontime
Application
Innermost
subselect is
distributed
AggregateState
computed
locally
Aggregates
merged on
initiator node
28
© 2022 Altinity, Inc.
Queries are pushed to all shards
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime
GROUP BY Carrier ORDER BY Delay DESC
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime_local
GROUP BY Carrier ORDER BY Delay DESC
29
© 2022 Altinity, Inc.
ClickHouse pushes down JOINs by default
SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad
FROM default.ontime o
JOIN default.airports a ON (a.IATA = o.Dest)
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC
LIMIT 10
SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS
ad
FROM default.ontime_local AS o
ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10
30
© 2022 Altinity, Inc.
...Unless the left side “table” is a subquery
SELECT d, Name n, c AS flights, ad
FROM
(
SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad
FROM default.ontime
GROUP BY d HAVING c > 100000
ORDER BY ad DESC
) AS o
LEFT JOIN airports ON airports.IATA = o.d
LIMIT 10
Remote
Servers
31
© 2022 Altinity, Inc.
It’s more complex when multiple tables are distributed
select foo from T1 where a in (select a from T2)
distributed_product_mode=?
local
select foo
from T1_local
where a in (
select a
from T2_local)
allow
select foo
from T1_local
where a in (
select a
from T2)
global
create temporary table
tmp Engine = Set
AS select a from T2;
select foo from
T1_local where a in
tmp;
(Subquery runs on
local table)
(Subquery runs on
distributed table) (Subquery runs on initiator;
broadcast to local temp table)
32
© 2022 Altinity, Inc.
What’s actually happening with queries? Let’s find out!
SELECT hostName() host, event_time, query_id,
is_initial_query AS initial,
if(is_initial_query, '', initial_query_id) as initial_q,
query
FROM cluster('{cluster}', system.query_log) AS st
WHERE type = 'QueryFinish' AND has(databases, 'test')
ORDER BY st.event_time DESC LIMIT 25
33
© 2022 Altinity, Inc.
Thinking about distributed data and joins
Large
id
1
2
…
…
1000
Small
id
1
…
100
Large
id
1
2
…
…
1000
Large
id
1
2
…
…
1000
Large
id
1001
1002
…
…
2000
Large
id
2001
2002
…
…
2000
Large
id
1001
1002
…
…
2000
Small
id
1
…
100
Shard 1 Shard 2 Shard 1 Shard 2
“Bucketing Model”
“Big Table Model”
All keys replicated Matching keys in
each bucket
34
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Tricks to query
distributed tables
35
© 2022 Altinity, Inc.
Use remote() to select from another node
SELECT count()
FROM remote('host-2', currentDatabase(), 'ontime_ref')
SELECT count()
FROM remoteSecure('host-2', currentDatabase(), 'ontime_ref')
┌───count()─┐
│ 196508419 │
└───────────┘
-- You can insert too, with FUNCTION keyword.
INSERT INTO FUNCTION remote(host, database, table, login,
password)
VALUES . . .
36
© 2022 Altinity, Inc.
More remote query tricks!
SELECT hostName() AS h, count() AS c FROM sdata GROUP BY h
┌─h─────────────────────────┬───c─┐
│ chi-test-rh-test-rh-1-0-0 │ 492 │
│ chi-test-rh-test-rh-0-0-0 │ 508 │
└───────────────────────────┴─────┘
SELECT hostName() AS h, count() AS c
FROM remote('chi-test-rh-test-rh-{0,1}-{0,1}', default, sdata)
GROUP BY h
┌─h─────────────────────────┬────c─┐
│ chi-test-rh-test-rh-1-0-0 │ 984 │
│ chi-test-rh-test-rh-1-1-0 │ 984 │
│ chi-test-rh-test-rh-0-1-0 │ 1016 │
│ chi-test-rh-test-rh-0-0-0 │ 1016 │
└───────────────────────────┴──────┘
Distributed table
Remote query all 4
hosts
37
© 2022 Altinity, Inc.
cluster() distributes queries dynamically
SELECT
hostName() AS host, count() AS tables
FROM cluster('{cluster}', system.tables)
WHERE database = 'default'
GROUP BY host
┌─host──────────────────────┬─tables─┐
│ chi-test-rh-test-rh-1-0-0 │ 2 │
│ chi-test-rh-test-rh-0-1-0 │ 2 │
└───────────────────────────┴────────┘
38
© 2022 Altinity, Inc.
clusterAllReplicas() goes to every node
SELECT
hostName() AS host, count() AS tables
FROM clusterAllReplicas('{cluster}', system.tables)
WHERE database = 'default'
GROUP BY host
┌─host──────────────────────┬─tables─┐
│ chi-test-rh-test-rh-1-0-0 │ 2 │
│ chi-test-rh-test-rh-1-1-0 │ 2 │
│ chi-test-rh-test-rh-0-1-0 │ 2 │
│ chi-test-rh-test-rh-0-0-0 │ 2 │
└───────────────────────────┴────────┘
39
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Scaling up
40
© 2022 Altinity, Inc.
Load testing and capacity planning made simple…
1. Establish single node baseload
● Use production data
● Max out SELECT & INSERT capacity with load tests
● Adjust schema and queries, retest
2. Add replicas to increase SELECT capacity
3. Add shards to increase INSERT capacity
41
© 2022 Altinity, Inc.
Selecting the sharding key
Shard 2 Shard 3
Shard 1
Randomized Key, e.g.,
cityHash64(Url)
Must query
all shards
Nodes are
balanced
Shard 3
Specific Key e.g.,
cityHash64(TenantId)
Unbalanced
nodes
Queries can
skip shards
Shard 2
Shard 1
Easier to
add nodes
Hard to
add nodes
42
© 2022 Altinity, Inc.
Options for shard rebalancing
● INSERT INTO new_cluster SELECT FROM old_cluster
○ Clickhouse-copier automates this
● Use (undocumented) ALTER TABLE MOVE PART TO SHARD
○ Example: ALTER TABLE test_move MOVE PART 'all_0_0_0' TO SHARD
'/clickhouse/shard_1/tables/test_move
● Move parts manually
○ ALTER TABLE FREEZE PARTITION
○ rsync to new host
○ ALTER TABLE ATTACH PARTITION
○ Drop original partition
43
© 2022 Altinity, Inc.
Bi-level sharding combines both approaches
cityHash64(Url)
Shard 2 Shard 3
Shard 1
TenantId
Shard 2
Shard 1
cityHash64(Url) cityHash64(Url)
Shard 2
Shard 1
Tenant-Group-1 Tenant-Group-2 Tenant-Group-3
Application chooses group
Distributed table
44
© 2022 Altinity, Inc.
© 2022 Altinity, Inc.
Wrap-up and more
information
45
© 2022 Altinity, Inc.
Where is the documentation?
ClickHouse official docs – https://clickhouse.com/docs/
Altinity Blog – https://altinity.com/blog/
Altinity Youtube Channel –
https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA
Altinity Knowledge Base – https://kb.altinity.com/
ClickHouse Capacity Planning by Mik Kocikowski of CloudFlare
Meetups, other blogs, and external resources. Use your powers of Search!
46
© 2022 Altinity, Inc.
Where can I get help?
Telegram - ClickHouse Channel
Slack
● ClickHouse Public Workspace - clickhousedb.slack.com
● Altinity Public Workspace - altinitydbworkspace.slack.com
Education - Altinity ClickHouse Training
Support - Altinity offers support for ClickHouse in all environments
47
© 2022 Altinity, Inc. 48
© 2202 Altinity, Inc.
Thank you and
good luck!
Website: https://altinity.com
Email: info@altinity.com
Slack: altinitydbworkspace.slack.com
Altinity.Cloud
Altinity Support
Altinity Stable
Builds
We’re hiring!

More Related Content

What's hot

Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Altinity Ltd
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Altinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
Altinity Ltd
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
Altinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
Altinity Ltd
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Ltd
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
Altinity Ltd
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
Altinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
Altinity Ltd
 
ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
Altinity Ltd
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
Altinity Ltd
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Altinity Ltd
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
Altinity Ltd
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Altinity Ltd
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
Altinity Ltd
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Altinity Ltd
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
Altinity Ltd
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
Altinity Ltd
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
rpolat
 

What's hot (20)

Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
Building ClickHouse and Making Your First Contribution: A Tutorial_06.10.2021
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
 
ClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and howClickHouse Monitoring 101: What to monitor and how
ClickHouse Monitoring 101: What to monitor and how
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
 
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, CloudflareClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
ClickHouse Mark Cache, by Mik Kocikowski, Cloudflare
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
 

Similar to Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf

A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
CristinaMunteanu43
 
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Altinity Ltd
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
Altinity Ltd
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
InfluxData
 
Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)
Ed Thewlis
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
Taro L. Saito
 
All you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICSAll you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICS
EDB
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
Senturus
 
OpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoTOpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoT
Ohyama Hiroyasu
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Ltd
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
ArangoDB Database
 
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxData
 
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Mitsunori Komatsu
 
Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?
Data Con LA
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
Chris Adkin
 
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdfClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
Altinity Ltd
 
Andre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStackAndre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStack
ShapeBlue
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebook
ragho
 
Hive User Meeting 2009 8 Facebook
Hive User Meeting 2009 8 FacebookHive User Meeting 2009 8 Facebook
Hive User Meeting 2009 8 FacebookZheng Shao
 
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsExtreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Kent Graziano
 

Similar to Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf (20)

A day in the life of a click house query
A day in the life of a click house queryA day in the life of a click house query
A day in the life of a click house query
 
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
Polyglot ClickHouse -- ClickHouse SF Meetup Sept 10
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
 
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
Anais Dotis-Georgiou & Faith Chikwekwe [InfluxData] | Top 10 Hurdles for Flux...
 
Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)
 
Introduction to Presto at Treasure Data
Introduction to Presto at Treasure DataIntroduction to Presto at Treasure Data
Introduction to Presto at Treasure Data
 
All you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICSAll you need to know about CREATE STATISTICS
All you need to know about CREATE STATISTICS
 
Snowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern AnalyticsSnowflake’s Cloud Data Platform and Modern Analytics
Snowflake’s Cloud Data Platform and Modern Analytics
 
OpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoTOpenInfra Summit Vancouver 2023 - SSoT
OpenInfra Summit Vancouver 2023 - SSoT
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
 
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale Webinar: ArangoDB 3.8 Preview - Analytics at Scale
Webinar: ArangoDB 3.8 Preview - Analytics at Scale
 
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...InfluxDB IOx Tech Talks:  A Rusty Introduction to Apache Arrow and How it App...
InfluxDB IOx Tech Talks: A Rusty Introduction to Apache Arrow and How it App...
 
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
Presto in Treasure Data (presented at db tech showcase Sapporo 2015)
 
Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?
 
Data relay introduction to big data clusters
Data relay introduction to big data clustersData relay introduction to big data clusters
Data relay introduction to big data clusters
 
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdfClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
ClickHouse -If Combinators for Fun and Profit-2022-05-04.pdf
 
Andre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStackAndre Paul: Importing VMware infrastructures into CloudStack
Andre Paul: Importing VMware infrastructures into CloudStack
 
Hive User Meeting August 2009 Facebook
Hive User Meeting August 2009 FacebookHive User Meeting August 2009 Facebook
Hive User Meeting August 2009 Facebook
 
Hive User Meeting 2009 8 Facebook
Hive User Meeting 2009 8 FacebookHive User Meeting 2009 8 Facebook
Hive User Meeting 2009 8 Facebook
 
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 DimensionsExtreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
Extreme BI: Creating Virtualized Hybrid Type 1+2 Dimensions
 

More from Altinity Ltd

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Altinity Ltd
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Altinity Ltd
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Altinity Ltd
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
Altinity Ltd
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Altinity Ltd
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Altinity Ltd
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Altinity Ltd
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Altinity Ltd
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
Altinity Ltd
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Altinity Ltd
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
Altinity Ltd
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
Altinity Ltd
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
Altinity Ltd
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
Altinity Ltd
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
Altinity Ltd
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
Altinity Ltd
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
Altinity Ltd
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
Altinity Ltd
 
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
Altinity Ltd
 
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
Altinity Ltd
 

More from Altinity Ltd (20)

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
 
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
OSA Con 2022 - Specifics of data analysis in Time Series Databases - Roman Kh...
 
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
OSA Con 2022 - Signal Correlation, the Ho11y Grail - Michael Hausenblas - AWS...
 

Recently uploaded

Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
correoyaya
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
Tiktokethiodaily
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
jerlynmaetalle
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
ewymefz
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
NABLAS株式会社
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
yhkoc
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
axoqas
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
AbhimanyuSinha9
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
ewymefz
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
Opendatabay
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
MaleehaSheikh2
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
ewymefz
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
benishzehra469
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
vcaxypu
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
axoqas
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
nscud
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
AlejandraGmez176757
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
StarCompliance.io
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
ewymefz
 

Recently uploaded (20)

Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
Innovative Methods in Media and Communication Research by Sebastian Kubitschk...
 
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
1.Seydhcuxhxyxhccuuxuxyxyxmisolids 2019.pptx
 
The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...The affect of service quality and online reviews on customer loyalty in the E...
The affect of service quality and online reviews on customer loyalty in the E...
 
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
一比一原版(IIT毕业证)伊利诺伊理工大学毕业证成绩单
 
社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .社内勉強会資料_LLM Agents                              .
社内勉強会資料_LLM Agents                              .
 
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
一比一原版(CU毕业证)卡尔顿大学毕业证成绩单
 
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
做(mqu毕业证书)麦考瑞大学毕业证硕士文凭证书学费发票原版一模一样
 
SOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape ReportSOCRadar Germany 2024 Threat Landscape Report
SOCRadar Germany 2024 Threat Landscape Report
 
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...Best best suvichar in gujarati english meaning of this sentence as Silk road ...
Best best suvichar in gujarati english meaning of this sentence as Silk road ...
 
一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单一比一原版(NYU毕业证)纽约大学毕业证成绩单
一比一原版(NYU毕业证)纽约大学毕业证成绩单
 
Opendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptxOpendatabay - Open Data Marketplace.pptx
Opendatabay - Open Data Marketplace.pptx
 
FP Growth Algorithm and its Applications
FP Growth Algorithm and its ApplicationsFP Growth Algorithm and its Applications
FP Growth Algorithm and its Applications
 
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
一比一原版(UMich毕业证)密歇根大学|安娜堡分校毕业证成绩单
 
Empowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptxEmpowering Data Analytics Ecosystem.pptx
Empowering Data Analytics Ecosystem.pptx
 
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
一比一原版(ArtEZ毕业证)ArtEZ艺术学院毕业证成绩单
 
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
哪里卖(usq毕业证书)南昆士兰大学毕业证研究生文凭证书托福证书原版一模一样
 
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
一比一原版(CBU毕业证)卡普顿大学毕业证成绩单
 
Business update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMIBusiness update Q1 2024 Lar España Real Estate SOCIMI
Business update Q1 2024 Lar España Real Estate SOCIMI
 
Investigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_CrimesInvestigate & Recover / StarCompliance.io / Crypto_Crimes
Investigate & Recover / StarCompliance.io / Crypto_Crimes
 
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
一比一原版(UPenn毕业证)宾夕法尼亚大学毕业证成绩单
 

Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf

  • 1. © 2022 Altinity, Inc. Deep Dive on ClickHouse Sharding and Replication Robert Hodges and Altinity Engineering 22 September 2022 1 © 2202 Altinity, Inc.
  • 2. © 2022 Altinity, Inc. Let’s make some introductions ClickHouse support and services including Altinity.Cloud Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects Us Database geeks with centuries of experience in DBMS and applications You Applications developers looking to learn about ClickHouse 2
  • 3. © 2022 Altinity, Inc. © 2022 Altinity, Inc. What’s a ClickHouse? 3
  • 4. © 2022 Altinity, Inc. Understands SQL Runs on bare metal to cloud Shared nothing architecture Stores data in columns Parallel and vectorized execution Scales to many petabytes Is Open source (Apache 2.0) ClickHouse is a SQL Data Warehouse It’s the core engine for real-time analytics ClickHouse Event Streams ELT Object Storage Interactive Graphics Dashboards APIs 4
  • 5. © 2022 Altinity, Inc. Distributed data is deeper than it looks 5 Width: 2 meters Depth: 60 meters “The Bolton Strid”
  • 6. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Introducing sharding and replication 6
  • 7. © 2022 Altinity, Inc. Clickhouse nodes can scale vertically Network- Attached Storage CPU RAM Host 7
  • 8. © 2022 Altinity, Inc. Clickhouse nodes can scale vertically CPU RAM Host Network- Attached Storage 8
  • 9. © 2022 Altinity, Inc. Clusters introduce horizontal scaling Shards Replicas Host Host Host Host Replicas improve read IOPs and concurrency Shards add write IOPS 9
  • 10. © 2022 Altinity, Inc. Different sharding and replication patterns Shard 1 Shard 3 Shard 2 Shard 4 All Sharded Data sharded 4 ways without replication Replica 1 Replica 3 Replica 2 Replica 4 All Replicated Data replicated 4 times without sharding Shard 1 Replica 1 Shard 1 Replica 2 Shard 2 Replica 1 Shard 2 Replica 2 Sharded and Replicated Data sharded 2 ways and replicated 2 times 10
  • 11. © 2022 Altinity, Inc. MergeTree tables support replication MergeTree SummingMergeTree AggregatingMergeTree CollapsingMergeTree VersionedCollapsing MergeTree ReplicatedMergeTree ReplicatedSummingMergeTree ReplicatedAggregatingMergeTree ReplicatedCollapsingMergeTree ReplicatedVersionedCollapsing MergeTree ReplacingMergeTree ReplicatedReplacingMergeTree Source data Aggregated data; single row per group Evolving data 11
  • 12. © 2022 Altinity, Inc. How replication works INSERT Replicate ClickHouse Node 1 Table: ontime (Parts) ReplicatedMergeTree :9009 :9443 ClickHouse Node 2 Table: ontime (Parts) ReplicatedMergeTree :9009 :9443 zookeeper-1 ZNodes :2181 zookeeper-2 ZNodes :2181 zookeeper-3 ZNodes :2181 12
  • 13. © 2022 Altinity, Inc. What is replicated? Replicated statements Non-replicated statements ● INSERT ● ALTER TABLE exceptions: FREEZE, MOVE TO DISK, FETCH ● OPTIMIZE ● TRUNCATE ● CREATE table ● DROP table ● RENAME table ● DETACH table ● ATTACH table Replicated*MergeTree ONLY 13
  • 14. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Building distributed schema 14
  • 15. © 2022 Altinity, Inc. Example of a distributed data set with shards and replicas clickhouse-0 ontime _local airports ontime clickhouse-1 ontime _local airports ontime clickhouse-2 ontime _local airports ontime clickhouse-3 ontime _local airports ontime Distributed table (No data) Sharded, replicated table (Partial data) Fully replicated table (All data) 15
  • 16. © 2022 Altinity, Inc. Step 1: A sharded, replicated fact table CREATE TABLE IF NOT EXISTS `ontime_local` ( `Year` UInt16 CODEC(DoubleDelta, ZSTD(1)), `Quarter` UInt8, `Month` UInt8, `DayofMonth` UInt8, `DayOfWeek` UInt8, ... ) Engine=ReplicatedMergeTree( '/clickhouse/{cluster}/tables/{shard}/{database}/ontime_local', '{replica}') PARTITION BY toYYYYMM(FlightDate) ORDER BY (FlightDate, `Year`, `Month`, DepDel15) Replication is at the table level! Use a Replicated% Engine 16
  • 17. © 2022 Altinity, Inc. Step 2: A distributed table to find data CREATE TABLE IF NOT EXISTS ontime AS ontime_local ENGINE = Distributed( '{cluster}', currentDatabase(), ontime_local, rand()) Cluster layout Database Table Sharding key (optional) 17
  • 18. © 2022 Altinity, Inc. Step 3: A fully replicated dimension table CREATE TABLE IF NOT EXISTS airports AS default.dot_airports Engine=ReplicatedMergeTree( '/clickhouse/{cluster}/tables/all/{database}/airports', '{replica}') PARTITION BY tuple() PRIMARY KEY AirportID ORDER BY AirportID Don’t bother with partitions for small tables Resolves to current database 18
  • 19. © 2022 Altinity, Inc. Macros help CREATE TABLE ON CLUSTER /etc/clickhouse-server/config.d/macros.xml: <clickhouse> <macros> <all-sharded-shard>2</all-sharded-shard> <cluster>demo</cluster> <shard>0</shard> <replica>clickhouse-0-1</replica> </macros> </clickhouse> select * from system.macros Replica names should be unique per host 19
  • 20. © 2022 Altinity, Inc. What does ON CLUSTER do? ON CLUSTER executes a command over a set of nodes CREATE TABLE IF NOT EXISTS `ontime_local` ON CLUSTER `{cluster}` ... DROP TABLE IF EXISTS `ontime_local` ON CLUSTER `{cluster}` ... ALTER TABLE `ontime_local` ON CLUSTER `{cluster}` ... 20
  • 21. © 2022 Altinity, Inc. How does ON CLUSTER know where to go? /etc/clickhouse-server/config.d/remote_servers.xml: <clickhouse> <remote_servers> <demo> <!-- <secret>top secret</secret> --> <shard> <replica><host>10.0.0.71</host><port>9000</port></replica> <replica><host>10.0.0.72</host><port>9000</port></replica> <internal_replication>true</internal_replication> </shard> <shard> . . . </shard> </demo> </remote_servers> </clickhouse> “It’s a cluster because I said so!” Cluster name 21 Shared secret
  • 22. © 2022 Altinity, Inc. List layouts using system.clusters -- Find name and hosts in each layout SELECT cluster, groupArray(concat(host_name,':',toString(port))) AS hosts FROM system.clusters GROUP BY cluster ORDER BY cluster 22
  • 23. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Loading and querying data 23
  • 24. © 2022 Altinity, Inc. Data loading: Distributed vs. local INSERTs ontime _local ontime Insert via distributed table Insert directly to shards ontime _local ontime ontime _local ontime ontime _local ontime Data Pipeline Data Pipeline Applications may have to be more intelligent May require more resources (Queue) 24
  • 25. © 2022 Altinity, Inc. INSERT into a distributed vs. local table -- Insert into distributed table INSERT INTO ontime VALUES (2017,1,1,1,7,'2017-01-01','AA',19805,...), (2017,1,1,1,7,'2017-01-01','AA',19805,...), ... -- Insert into a local table INSERT INTO ontime_local VALUES (2017,1,1,1,7,'2017-01-01','AA',19805,...), (2017,1,1,1,7,'2017-01-01','AA',19805,...), ... 25
  • 26. © 2022 Altinity, Inc. How does a distributed INSERT work? ontime _local ontime Insert via distributed table ontime _local ontime ontime _local ontime Data Pipeline (Queue) insert_distributed_sync: ● 0 = async propagation ● 1 = sync propagation ontime _local ontime Thread Pool select * from system.distribution_queue replication 26
  • 27. © 2022 Altinity, Inc. Options for processing INSERTs ● Local vs distributed data insertion ○ INSERT to local table – no need to sync, larger blocks, faster ○ INSERT to Distributed table – sharding by ClickHouse ○ CHProxy -- distributes transactions across nodes, only works with HTTP connections ● Asynchronous (default) vs synchronous insertions ○ insert_distributed_sync - Wait until batches make it to local tables ○ insert_quorum, select_sequential_consistency – Wait until replicas sync 27
  • 28. © 2022 Altinity, Inc. How do distributed SELECTs work? ontime _local ontime Application ontime _local ontime ontime _local ontime ontime _local ontime Application Innermost subselect is distributed AggregateState computed locally Aggregates merged on initiator node 28
  • 29. © 2022 Altinity, Inc. Queries are pushed to all shards SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC SELECT Carrier, avg(DepDelay) AS Delay FROM ontime_local GROUP BY Carrier ORDER BY Delay DESC 29
  • 30. © 2022 Altinity, Inc. ClickHouse pushes down JOINs by default SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad FROM default.ontime o JOIN default.airports a ON (a.IATA = o.Dest) GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10 SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS ad FROM default.ontime_local AS o ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10 30
  • 31. © 2022 Altinity, Inc. ...Unless the left side “table” is a subquery SELECT d, Name n, c AS flights, ad FROM ( SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad FROM default.ontime GROUP BY d HAVING c > 100000 ORDER BY ad DESC ) AS o LEFT JOIN airports ON airports.IATA = o.d LIMIT 10 Remote Servers 31
  • 32. © 2022 Altinity, Inc. It’s more complex when multiple tables are distributed select foo from T1 where a in (select a from T2) distributed_product_mode=? local select foo from T1_local where a in ( select a from T2_local) allow select foo from T1_local where a in ( select a from T2) global create temporary table tmp Engine = Set AS select a from T2; select foo from T1_local where a in tmp; (Subquery runs on local table) (Subquery runs on distributed table) (Subquery runs on initiator; broadcast to local temp table) 32
  • 33. © 2022 Altinity, Inc. What’s actually happening with queries? Let’s find out! SELECT hostName() host, event_time, query_id, is_initial_query AS initial, if(is_initial_query, '', initial_query_id) as initial_q, query FROM cluster('{cluster}', system.query_log) AS st WHERE type = 'QueryFinish' AND has(databases, 'test') ORDER BY st.event_time DESC LIMIT 25 33
  • 34. © 2022 Altinity, Inc. Thinking about distributed data and joins Large id 1 2 … … 1000 Small id 1 … 100 Large id 1 2 … … 1000 Large id 1 2 … … 1000 Large id 1001 1002 … … 2000 Large id 2001 2002 … … 2000 Large id 1001 1002 … … 2000 Small id 1 … 100 Shard 1 Shard 2 Shard 1 Shard 2 “Bucketing Model” “Big Table Model” All keys replicated Matching keys in each bucket 34
  • 35. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Tricks to query distributed tables 35
  • 36. © 2022 Altinity, Inc. Use remote() to select from another node SELECT count() FROM remote('host-2', currentDatabase(), 'ontime_ref') SELECT count() FROM remoteSecure('host-2', currentDatabase(), 'ontime_ref') ┌───count()─┐ │ 196508419 │ └───────────┘ -- You can insert too, with FUNCTION keyword. INSERT INTO FUNCTION remote(host, database, table, login, password) VALUES . . . 36
  • 37. © 2022 Altinity, Inc. More remote query tricks! SELECT hostName() AS h, count() AS c FROM sdata GROUP BY h ┌─h─────────────────────────┬───c─┐ │ chi-test-rh-test-rh-1-0-0 │ 492 │ │ chi-test-rh-test-rh-0-0-0 │ 508 │ └───────────────────────────┴─────┘ SELECT hostName() AS h, count() AS c FROM remote('chi-test-rh-test-rh-{0,1}-{0,1}', default, sdata) GROUP BY h ┌─h─────────────────────────┬────c─┐ │ chi-test-rh-test-rh-1-0-0 │ 984 │ │ chi-test-rh-test-rh-1-1-0 │ 984 │ │ chi-test-rh-test-rh-0-1-0 │ 1016 │ │ chi-test-rh-test-rh-0-0-0 │ 1016 │ └───────────────────────────┴──────┘ Distributed table Remote query all 4 hosts 37
  • 38. © 2022 Altinity, Inc. cluster() distributes queries dynamically SELECT hostName() AS host, count() AS tables FROM cluster('{cluster}', system.tables) WHERE database = 'default' GROUP BY host ┌─host──────────────────────┬─tables─┐ │ chi-test-rh-test-rh-1-0-0 │ 2 │ │ chi-test-rh-test-rh-0-1-0 │ 2 │ └───────────────────────────┴────────┘ 38
  • 39. © 2022 Altinity, Inc. clusterAllReplicas() goes to every node SELECT hostName() AS host, count() AS tables FROM clusterAllReplicas('{cluster}', system.tables) WHERE database = 'default' GROUP BY host ┌─host──────────────────────┬─tables─┐ │ chi-test-rh-test-rh-1-0-0 │ 2 │ │ chi-test-rh-test-rh-1-1-0 │ 2 │ │ chi-test-rh-test-rh-0-1-0 │ 2 │ │ chi-test-rh-test-rh-0-0-0 │ 2 │ └───────────────────────────┴────────┘ 39
  • 40. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Scaling up 40
  • 41. © 2022 Altinity, Inc. Load testing and capacity planning made simple… 1. Establish single node baseload ● Use production data ● Max out SELECT & INSERT capacity with load tests ● Adjust schema and queries, retest 2. Add replicas to increase SELECT capacity 3. Add shards to increase INSERT capacity 41
  • 42. © 2022 Altinity, Inc. Selecting the sharding key Shard 2 Shard 3 Shard 1 Randomized Key, e.g., cityHash64(Url) Must query all shards Nodes are balanced Shard 3 Specific Key e.g., cityHash64(TenantId) Unbalanced nodes Queries can skip shards Shard 2 Shard 1 Easier to add nodes Hard to add nodes 42
  • 43. © 2022 Altinity, Inc. Options for shard rebalancing ● INSERT INTO new_cluster SELECT FROM old_cluster ○ Clickhouse-copier automates this ● Use (undocumented) ALTER TABLE MOVE PART TO SHARD ○ Example: ALTER TABLE test_move MOVE PART 'all_0_0_0' TO SHARD '/clickhouse/shard_1/tables/test_move ● Move parts manually ○ ALTER TABLE FREEZE PARTITION ○ rsync to new host ○ ALTER TABLE ATTACH PARTITION ○ Drop original partition 43
  • 44. © 2022 Altinity, Inc. Bi-level sharding combines both approaches cityHash64(Url) Shard 2 Shard 3 Shard 1 TenantId Shard 2 Shard 1 cityHash64(Url) cityHash64(Url) Shard 2 Shard 1 Tenant-Group-1 Tenant-Group-2 Tenant-Group-3 Application chooses group Distributed table 44
  • 45. © 2022 Altinity, Inc. © 2022 Altinity, Inc. Wrap-up and more information 45
  • 46. © 2022 Altinity, Inc. Where is the documentation? ClickHouse official docs – https://clickhouse.com/docs/ Altinity Blog – https://altinity.com/blog/ Altinity Youtube Channel – https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA Altinity Knowledge Base – https://kb.altinity.com/ ClickHouse Capacity Planning by Mik Kocikowski of CloudFlare Meetups, other blogs, and external resources. Use your powers of Search! 46
  • 47. © 2022 Altinity, Inc. Where can I get help? Telegram - ClickHouse Channel Slack ● ClickHouse Public Workspace - clickhousedb.slack.com ● Altinity Public Workspace - altinitydbworkspace.slack.com Education - Altinity ClickHouse Training Support - Altinity offers support for ClickHouse in all environments 47
  • 48. © 2022 Altinity, Inc. 48 © 2202 Altinity, Inc. Thank you and good luck! Website: https://altinity.com Email: info@altinity.com Slack: altinitydbworkspace.slack.com Altinity.Cloud Altinity Support Altinity Stable Builds We’re hiring!