© 2022 Altinity, Inc.
A Day in the Life of a ClickHouse Query
Intro to ClickHouse Internals
Robert Hodges & Altinity Engineering
10 February 2022
Let’s make some introductions
● Robert Hodges: Database geek with 30+ years on DBMS systems. Day job: Altinity CEO
● Altinity Engineering: Database geeks with centuries of experience in DBMS and applications
● ClickHouse support and services including Altinity.Cloud
● Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects
Foundations
ClickHouse is a SQL Data Warehouse
● Understands SQL
● Runs on bare metal to cloud
● Shared nothing architecture
● Stores data in columns
● Parallel and vectorized execution
● Scales to many petabytes
● Is open source (Apache 2.0)
It’s the core engine for real-time analytics
Diagram: ClickHouse sits between inputs (Event Streams, ELT, Object Storage) and outputs (Interactive Graphics, Dashboards, APIs)
If you understand the engine you can make it faster
● ClickHouse has a simple execution model: there’s no magic
● Any developer can understand how it works
● Knowledge leads to faster and more efficient queries
(Image caption: Another fast engine!)
What happens when you insert data?
Let’s create a table!
CREATE TABLE IF NOT EXISTS sdata (
    -- Table columns
    DevId Int32,
    Type String,
    MDate Date,
    MDatetime DateTime,
    Value Float64
) ENGINE = MergeTree()          -- Table engine type
PARTITION BY toYYYYMM(MDate)    -- How to break data into parts
ORDER BY (DevId, MDatetime)     -- How to index and sort data in each part
Let’s now insert some data…
INSERT INTO sdata VALUES
    (15, 'TEMP', '2018-01-01', '2018-01-01 23:29:55', 18.0),
    (15, 'TEMP', '2018-01-01', '2018-01-01 23:30:56', 18.7)
(This is an example. Most people don’t insert data this way!)
How does ClickHouse process an insert?
INSERT INTO sdata
VALUES
    (15, 'TEMP', . . .),
    (15, 'TEMP', . . .)
2 rows in set. Elapsed: 0.271 sec.
Diagram: ClickHouse Server: Parse/Plan → Load → Part in RAM (sort rows by table ORDER BY) → Part in Storage → Respond
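The "Part in RAM" step can be pictured with a small Python sketch. It is an illustration only, not ClickHouse code: the function name `build_part` and the tuple layout are assumptions made for this example. It sorts an inserted block by the table's ORDER BY key and then lays the rows out column by column.

```python
from operator import itemgetter

# Column order of the sdata table from the CREATE TABLE slide.
COLUMNS = ("DevId", "Type", "MDate", "MDatetime", "Value")

def build_part(rows, order_by=("DevId", "MDatetime")):
    """Sort an inserted block by the ORDER BY key, then store it by column."""
    key_positions = [COLUMNS.index(name) for name in order_by]
    sorted_rows = sorted(rows, key=itemgetter(*key_positions))
    # One list per column, the way a columnar part keeps its data.
    return {name: [row[i] for row in sorted_rows] for i, name in enumerate(COLUMNS)}

# The two example rows, deliberately supplied out of order.
rows = [
    (15, "TEMP", "2018-01-01", "2018-01-01 23:30:56", 18.7),
    (15, "TEMP", "2018-01-01", "2018-01-01 23:29:55", 18.0),
]
part = build_part(rows)
```

After the call, every column list in `part` follows the (DevId, MDatetime) order, which is what lets the part be indexed and merged cheaply later.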
How can we make this more efficient? Parallelize!
set max_insert_threads=4
insert into ontime_test
select * from ontime
where toYear(FlightDate) between 2000 and 2001
2 rows in set. Elapsed: 0.271 sec.
Diagram: ClickHouse Server: Parse/Plan → Load → Parts in RAM → Part in Storage → Respond
Parallelism affects speed and memory usage
insert into ontime_test
select * from ontime
where toYear(FlightDate) between 2000 and 2001
set max_insert_threads=1
. . .
set max_insert_threads=2
. . .
set max_insert_threads=4
OK, where did those awesome stats come from?
SELECT
    event_time,
    type,
    is_initial_query,
    query_duration_ms / 1000 AS duration,
    read_rows,
    read_bytes,
    result_rows,
    formatReadableSize(memory_usage) AS memory,
    query
FROM system.query_log
WHERE (user = 'default') AND (type = 'QueryFinish')
ORDER BY event_time DESC
LIMIT 50
What’s going on down there when you INSERT?
Diagram: a Table holds several Parts; each Part has an Index and Columns
● Sparse index finds rows by Carrier, Origin, FlightDate
● Columns sorted by Carrier, Origin, FlightDate
● Rows in the part all belong to same Year
Understanding what’s in a MergeTree part
/var/lib/clickhouse/data/airline/ontime
    20170701_20170731_355_355_2/
        primary.idx, with entries on (FlightDate, Carrier...):
            2017-07-01 AA
            2017-07-01 EV
            2017-07-01 UA
            2017-07-02 AA
            ...
        One .mrk file and one .bin file per column: ActualElapsedTime, Airline, AirlineID...
Diagram labels: each primary.idx entry leads through a Mark (.mrk) to a Granule inside a Compressed Block (.bin)
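How a sparse index narrows a search can be sketched in a few lines of Python. This is a toy model under stated assumptions: the granule size of 4 rows and the helper names `build_sparse_index` and `granules_to_read` are invented for the example and are not ClickHouse internals. The index keeps only the key of the first row of each granule, and a lookup returns the granules that could contain a key.

```python
from bisect import bisect_left, bisect_right

GRANULE_ROWS = 4  # toy granule size, chosen only to keep the example small

def build_sparse_index(sorted_keys, granule_rows=GRANULE_ROWS):
    """Keep one entry per granule: the key of the granule's first row."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_rows)]

def granules_to_read(index, key):
    """Return the granule numbers that may contain `key`."""
    # The granule just before the first entry >= key may end with `key`.
    first = max(bisect_left(index, key) - 1, 0)
    # The last granule whose first key is <= key may still start with `key`.
    last = bisect_right(index, key) - 1
    return range(first, last + 1)

# Twelve rows sorted by (FlightDate, Carrier), as in the part on the slide.
keys = (
    [("2017-07-01", "AA")] * 3
    + [("2017-07-01", "EV")] * 2
    + [("2017-07-01", "UA")] * 3
    + [("2017-07-02", "AA")] * 4
)
index = build_sparse_index(keys)
ua_granules = list(granules_to_read(index, ("2017-07-01", "UA")))
ev_granules = list(granules_to_read(index, ("2017-07-01", "EV")))
```

The index holds 3 entries for 12 rows. A lookup for UA touches one granule, while EV touches two because its rows straddle a granule boundary.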
Why MergeTree? Because it merges!
Diagram: Part (Index, Columns) + Part (Index, Columns) → Rewritten, Bigger Part (Index, Columns)
Update and delete also rewrite parts
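Because every part is already sorted by the same key, a merge can stream through its inputs instead of re-sorting them. The Python sketch below shows that idea with `heapq.merge`. The row layout (key tuple, value) and the function name `merge_parts` are assumptions for illustration, not the actual merge code.

```python
import heapq

def merge_parts(*parts):
    """Merge parts that are each sorted by key into one bigger sorted part."""
    return list(heapq.merge(*parts, key=lambda row: row[0]))

# Two small parts, each sorted by (Carrier, FlightDate).
part_a = [(("AA", "2017-07-01"), 5), (("UA", "2017-07-01"), 12)]
part_b = [(("AA", "2017-07-02"), -3), (("EV", "2017-07-01"), 40)]
merged = merge_parts(part_a, part_b)
merged_keys = [key for key, _ in merged]
```

The merged output is a new, larger sorted run and the inputs are left untouched, which mirrors the "rewritten, bigger part" in the diagram.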
Bigger parts are more efficient!
● Pick a PARTITION BY that gives nice, fat partitions (1-300GB, < 1000 total parts per table)
○ Can’t decide? Partition by month.
● Insert large blocks of data to avoid lots of merges afterwards
○ ClickHouse is fine with tens of millions of rows!
● The simplest way to make blocks bigger is to batch input data
○ Avoid different partition keys in the same block
○ ClickHouse has parameters like max_insert_block_size but defaults are OK
○ Look at logs and actual part sizes to see if you need to do more
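One way to follow the batching advice on the client side is to buffer rows per partition key and emit a block only when it is large enough. The Python sketch below is a hypothetical client-side helper, not a ClickHouse API: `batch_by_partition`, `partition_of`, and the tiny `max_rows` value are all assumptions for the example.

```python
from collections import defaultdict

def batch_by_partition(rows, partition_of, max_rows):
    """Group rows by partition key and yield blocks of up to max_rows rows."""
    buffers = defaultdict(list)
    for row in rows:
        key = partition_of(row)
        buffers[key].append(row)
        if len(buffers[key]) >= max_rows:
            yield key, buffers.pop(key)
    # Flush the remainders so that no row is lost.
    for key, block in buffers.items():
        if block:
            yield key, block

def partition_of(row):
    """Hypothetical stand-in for toYYYYMM(MDate): '2018-01-01' -> '201801'."""
    return row[0][:7].replace("-", "")

rows = [("2018-01-01", 1), ("2018-02-01", 2), ("2018-01-02", 3), ("2018-01-03", 4)]
blocks = list(batch_by_partition(rows, partition_of, max_rows=2))
```

Each emitted block contains rows from a single partition, so one insert produces one part instead of several small ones. In practice `max_rows` would be far larger than 2.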
How can I see how big table parts are?
SELECT
    table, partition, name,
    marks, rows, data_compressed_bytes,
    data_uncompressed_bytes, bytes_on_disk
FROM system.parts
WHERE active                    -- Part is in use
    AND level = 0               -- Part has not been merged
    AND database = 'default'
    AND table = 'ontime_test'
ORDER BY table DESC, partition ASC, name ASC
Tips to optimize INSERT
Making INSERT faster
● Increase max_insert_threads (parallel creation of parts)
● Enable input_format_parallel_parsing to parallelize input parsing
○ Works for TSV/CSV/Values data
● Write bigger blocks (less merging afterwards)
Making INSERT less memory intensive
● Decrease max_insert_threads (fewer parts held in memory at the same time)
● Disable input_format_parallel_parsing
● Write smaller blocks (less memory required at INSERT time)
How do basic queries work?
Aggregation is a key feature of analytic queries
SELECT Carrier,                 -- Dimension
    avg(DepDelay) AS Delay      -- Aggregate function
FROM ontime
GROUP BY Carrier
ORDER BY Delay DESC
Aggregates group measurements for one or more dimensions
How does ClickHouse process a query with aggregates?
SELECT Carrier,
    avg(DepDelay) AS Delay
FROM ontime
GROUP BY Carrier
ORDER BY Delay DESC
┌─FlightDate─┬──────────────Delay─┐
│ 2016-12-17 │ 53.72058516196447 │
│ 1990-12-21 │ 43.87767499654839 │
│ 1990-12-22 │ 43.64346952962076 │
. . .
Diagram: ClickHouse Server: Parse/Plan → Scan of Parts in Storage → In-RAM Hash Tables → Merge/Sort
How can you compute an average in parallel?
Scan: each thread reduces its values to a Partial Aggregate
    1 2 3     → Numerator=6, Denominator=3
    1 3 5     → Numerator=9, Denominator=3
    0 5 0 0   → Numerator=5, Denominator=4
Merge: (6 + 9 + 5) / (3 + 3 + 4) = 2
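The arithmetic on this slide can be replayed in Python. This is a minimal sketch with invented function names (`partial_avg`, `merge_avg`): each chunk is reduced to a (numerator, denominator) pair, and the division happens only once, after the pairs are merged.

```python
def partial_avg(values):
    """Scan step: reduce one chunk to a (numerator, denominator) pair."""
    return sum(values), len(values)

def merge_avg(partials):
    """Merge step: add up the partial pairs, then divide once."""
    numerator = sum(n for n, _ in partials)
    denominator = sum(d for _, d in partials)
    return numerator / denominator

# The three chunks of values shown on the slide.
chunks = [[1, 2, 3], [1, 3, 5], [0, 5, 0, 0]]
partials = [partial_avg(chunk) for chunk in chunks]
average = merge_avg(partials)
```

Carrying the pair rather than a finished average is what makes the merge correct when chunks have different sizes. Averaging the three per-chunk averages (2, 3, and 1.25) would not give 2.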
How does a ClickHouse thread do aggregation?
Diagram: Parts in Storage → Scan Threads (one Hash Table per scan thread) → Merge/Sort → Result
Hash table contents (GROUP BY Key => Partial Aggregates):
AL => 4259/1070, 2385/415, …
DL => 20663/1198, 25166/2711, …
…
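The per-thread hash tables can be imitated with Python dictionaries. The sketch below is an illustration under assumptions: the function names, the tiny data set, and the use of `ThreadPoolExecutor` are choices made for this example and say nothing about how ClickHouse schedules threads. Each scan builds GROUP BY key => [sum, count], and the merge step combines states for the same key before sorting.

```python
from concurrent.futures import ThreadPoolExecutor

def scan(rows):
    """One scan thread: build a hash table of GROUP BY key => [sum, count]."""
    table = {}
    for carrier, dep_delay in rows:
        state = table.setdefault(carrier, [0, 0])
        state[0] += dep_delay
        state[1] += 1
    return table

def merge(tables):
    """Combine partial aggregates per key, finalize avg, ORDER BY Delay DESC."""
    merged = {}
    for table in tables:
        for key, (total, count) in table.items():
            state = merged.setdefault(key, [0, 0])
            state[0] += total
            state[1] += count
    result = {key: total / count for key, (total, count) in merged.items()}
    return sorted(result.items(), key=lambda item: item[1], reverse=True)

# Each inner list stands for the rows one thread reads from its parts.
parts = [
    [("AL", 10), ("DL", 4), ("AL", 20)],
    [("DL", 8), ("AL", 0), ("DL", 0)],
]
with ThreadPoolExecutor(max_workers=2) as pool:
    tables = list(pool.map(scan, parts))
result = merge(tables)
```

Every thread holds its own table, so memory grows with both the number of threads and the number of distinct GROUP BY keys.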
We can now understand aggregation performance drivers
Simple aggregate, short GROUP BY key with few values (0.84 sec, 1.6 KB RAM):
SELECT Carrier,
    avg(DepDelay) AS Delay
FROM ontime
GROUP BY Carrier
ORDER BY Delay DESC
LIMIT 50
More complex aggregates, longer GROUP BY with more values (3.4 sec, 2.4 GB RAM):
SELECT Carrier, FlightDate,
    avg(DepDelay) AS Delay,
    uniqExact(TailNum) AS Aircraft
FROM ontime
GROUP BY Carrier, FlightDate
ORDER BY Delay DESC
LIMIT 50
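The gap in memory use follows from what each aggregate has to remember. The Python sketch below is a rough model, not a measurement of ClickHouse: it assumes an avg state is a [sum, count] pair and an exact distinct count is a set of every value seen, and the data is synthetic.

```python
def aggregate(rows):
    """Group by (carrier, flight_date), tracking avg and exact-distinct states."""
    avg_states = {}   # key => [sum, count]: fixed size per key
    uniq_states = {}  # key => set of tail numbers: grows with distinct values
    for carrier, flight_date, dep_delay, tail_num in rows:
        key = (carrier, flight_date)
        state = avg_states.setdefault(key, [0, 0])
        state[0] += dep_delay
        state[1] += 1
        uniq_states.setdefault(key, set()).add(tail_num)
    return avg_states, uniq_states

# Synthetic data: 10,000 flights for one key, flown by 500 distinct aircraft.
rows = [("WN", "2017-07-01", i % 7, f"N{i % 500}") for i in range(10_000)]
avg_states, uniq_states = aggregate(rows)
avg_values = sum(len(state) for state in avg_states.values())
uniq_values = sum(len(state) for state in uniq_states.values())
```

For this single key the avg state stays at two numbers however many rows arrive, while the exact-distinct state holds all 500 tail numbers. Multiply that by every (Carrier, FlightDate) pair and every scan thread and the larger footprint is unsurprising.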
Parallelism affects speed and memory usage
SELECT Origin, FlightDate,
    avg(DepDelay) AS Delay,
    uniqExact(TailNum) AS Aircraft
FROM ontime
WHERE Carrier='WN'
GROUP BY Origin, FlightDate
ORDER BY Delay DESC
LIMIT 5
SET max_threads = 1
. . .
SET max_threads = 4
. . .
SET max_threads = 16
This is a good time to mention ClickHouse memory limits
● max_memory_usage (Default=10GB): Single query limit
● max_memory_usage_for_user (Default=Unlimited): All queries for a user
● max_server_memory_usage (Default=90% of available RAM): All memory on server, covering every SQL query in the ClickHouse process
Tips to make aggregation queries faster
● Remove/exchange “heavy” aggregation functions
● Reduce the number of values in GROUP BY
● Increase max_threads (parallelism)
● Reduce I/O
○ Filter out unnecessary rows
○ Improve compression of data in storage
Tips to reduce memory usage in aggregation queries
● Remove/exchange “heavy” aggregation functions
● Reduce number of values in GROUP BY
● Change max_threads value
● Dump aggregates to external storage
○ Set max_bytes_before_external_group_by to a value greater than 0
● Filter out unnecessary rows
How do joins work?
JOIN combines data between tables
SELECT o.Dest,
    any(a.Name) AS AirportName,
    count(Dest) AS Flights
FROM ontime o               -- “Left” table
JOIN airports a             -- “Right” table
    ON a.IATA = o.Dest      -- Join condition
GROUP BY Dest
ORDER BY Flights DESC
LIMIT 10
How does ClickHouse process a query with a join?
● Load the right side table (small, after Filter) into an In-RAM Hash Table. It holds keys and column values!
● Scan left side table (big) in parallel, computing aggregates and adding joined data
● Merge and sort results into the Merged Result
Let’s look more deeply at what’s happening in the scan
SELECT . . . FROM ontime o JOIN airports a ON a.IATA = o.Dest
Diagram: the Scan matches Dest in ontime against IATA in airports and builds partial aggregates. The joined airport name is carried along with every scanned row:
ATL | 576  | Hartsfield Jackson Atlanta International Airport
ATL | 1501 | Hartsfield Jackson Atlanta International Airport
ATL | 3302 | Hartsfield Jackson Atlanta International Airport
…
ORD | 255  | Chicago O'Hare International Airport
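The scan with a join can be approximated in Python as a classic hash join. This is a sketch under assumptions: it treats IATA as unique in airports, models an inner join, and uses invented function names. It is not a description of ClickHouse's actual join implementation.

```python
def build_hash_table(right_rows):
    """Load the right side table into RAM: join key => column values."""
    return {iata: name for iata, name in right_rows}

def scan_and_join(left_rows, hash_table):
    """Scan the left side, probe the hash table per row, and aggregate."""
    partial = {}  # Dest => [any(Name), count(Dest)]
    for (dest,) in left_rows:
        name = hash_table.get(dest)
        if name is None:
            continue  # inner join: rows without a match are dropped
        state = partial.setdefault(dest, [name, 0])
        state[1] += 1
    return partial

airports = [
    ("ATL", "Hartsfield Jackson Atlanta International Airport"),
    ("ORD", "Chicago O'Hare International Airport"),
]
ontime = [("ATL",), ("ORD",), ("ATL",), ("XXX",), ("ATL",)]
hash_table = build_hash_table(airports)
partial = scan_and_join(ontime, hash_table)
```

Every left-side row pays for a hash probe and carries the airport name with it, which is the cost the next slides set out to avoid.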
It would be more efficient to join after aggregating
● Load the right side table (small, after Filter) into an In-RAM Hash Table
● Scan left side table (big) in parallel, computing aggregates
● Merge, then join and sort
You can do exactly that with a subquery
Join before aggregating (2.71 sec, 19.9 MB RAM):
SELECT o.Dest, any(a.Name) AS AirportName,
    count(Dest) AS Flights
FROM ontime o
JOIN default.airports a ON a.IATA = o.Dest
GROUP BY Dest
ORDER BY Flights DESC
LIMIT 10
Aggregate in a subquery, then join (0.663 sec, 1.58 KB RAM):
SELECT o.Dest, a.Name AS AirportName, o.Flights
FROM (
    SELECT Dest, count(Dest) AS Flights
    FROM ontime GROUP BY Dest ) AS o
JOIN default.airports a ON a.IATA = o.Dest
ORDER BY Flights DESC
LIMIT 10
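Counting hash table probes shows why the second form is cheaper. The Python sketch below is an illustration with invented function names and synthetic data: it computes the same result both ways and records how many lookups each plan performs.

```python
from collections import Counter

def join_then_aggregate(dests, airports):
    """Probe the airport hash table once per flight, then count."""
    lookups = 0
    flights = Counter()
    names = {}
    for dest in dests:
        lookups += 1
        name = airports.get(dest)
        if name is not None:
            flights[dest] += 1
            names[dest] = name
    return [(d, names[d], n) for d, n in flights.items()], lookups

def aggregate_then_join(dests, airports):
    """Count flights per destination first, then probe once per destination."""
    flights = Counter(dests)
    lookups = 0
    rows = []
    for dest, n in flights.items():
        lookups += 1
        name = airports.get(dest)
        if name is not None:
            rows.append((dest, name, n))
    return rows, lookups

airports = {"ATL": "Hartsfield Jackson Atlanta International Airport",
            "ORD": "Chicago O'Hare International Airport"}
dests = ["ATL", "ORD", "ATL", "ATL", "ORD"] * 1000
rows_a, lookups_a = join_then_aggregate(dests, airports)
rows_b, lookups_b = aggregate_then_join(dests, airports)
```

Both plans return the same rows, but the first probes once per flight and the second once per distinct destination.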
Simple ways to keep JOINs fast and efficient
● Keep the overall size of the right side table(s) small
● Minimize the columns joined from the right side
● Add filter conditions to the right side table to reduce rows
● JOIN after aggregation if possible
● Use a Dictionary instead of a JOIN
○ Dictionaries are just loaded once and can be shared across queries
Pro tip: The SQL IN operator is also a join under the covers.
How does a distributed query work?
Example of a distributed data set with shards and replicas
Each of the four nodes (clickhouse-0, clickhouse-1, clickhouse-2, clickhouse-3) holds three tables:
● ontime: Distributed table (No data)
● ontime_local: Sharded, replicated table (Partial data)
● airports: Fully replicated table (All data)
Distributed tables send subqueries to multiple nodes
Diagram: the Application queries the distributed ontime table on one node, which forwards the query to ontime_local on every node
● Innermost subselect is distributed
● AggregateState computed locally
● Aggregates merged on initiator node
Queries are pushed to all shards
Query sent to the distributed table:
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime
GROUP BY Carrier ORDER BY Delay DESC
Query run on each shard:
SELECT Carrier, avg(DepDelay) AS Delay
FROM ontime_local
GROUP BY Carrier ORDER BY Delay DESC
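The split between shards and initiator can be sketched in Python. This is a simplified model with invented function names and data, not the ClickHouse wire protocol: each shard returns aggregate states (sum and count per Carrier) from its local rows, and the initiator merges the states, finalizes avg, and applies ORDER BY.

```python
def shard_query(local_rows):
    """Runs on one shard against ontime_local: return avg states per Carrier."""
    states = {}
    for carrier, dep_delay in local_rows:
        total, count = states.get(carrier, (0, 0))
        states[carrier] = (total + dep_delay, count + 1)
    return states

def initiator_merge(shard_states):
    """Runs on the initiator: merge states, finalize avg, ORDER BY Delay DESC."""
    merged = {}
    for states in shard_states:
        for carrier, (total, count) in states.items():
            old_total, old_count = merged.get(carrier, (0, 0))
            merged[carrier] = (old_total + total, old_count + count)
    delays = {carrier: total / count for carrier, (total, count) in merged.items()}
    return sorted(delays.items(), key=lambda item: item[1], reverse=True)

# Two shards holding different numbers of rows for the same carriers.
shard_0 = [("AA", 10), ("AA", 20), ("AA", 30), ("DL", 2)]
shard_1 = [("AA", 0), ("DL", 4)]
result = initiator_merge([shard_query(shard_0), shard_query(shard_1)])
```

If each shard sent back a finished average instead of a state, AA would come out as (20 + 0) / 2 = 10 rather than the correct 15, which is why the states are merged on the initiator.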
ClickHouse pushes down JOINs by default
Query sent to the distributed table:
SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad
FROM default.ontime o
JOIN default.airports a ON (a.IATA = o.Dest)
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC
LIMIT 10
Query run on each shard:
SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS ad
FROM default.ontime_local AS o
ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest
GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10
...Unless the left side “table” is a subquery
SELECT d, Name n, c AS flights, ad
FROM
(
    -- This subquery runs on the remote servers
    SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad
    FROM default.ontime
    GROUP BY d HAVING c > 100000
    ORDER BY ad DESC
) AS o
LEFT JOIN airports ON airports.IATA = o.d
LIMIT 10
It’s more complex when multiple tables are distributed
select foo from T1 where a in (select a from T2)
distributed_product_mode=?
● local (Subquery runs on local table)
select foo
from T1_local
where a in (
    select a
    from T2_local)
● allow (Subquery runs on distributed table)
select foo
from T1_local
where a in (
    select a
    from T2)
● global (Subquery runs on initiator; broadcast to local temp table)
create temporary table tmp Engine = Set
AS select a from T2;
select foo from T1_local where a in tmp;
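The difference between the local and global modes shows up as soon as matching rows live on different shards. The Python sketch below is a toy emulation with invented function names and data, where each inner list stands for one shard's local table. It is not how ClickHouse executes these modes internally.

```python
def run_global_in(shards_t1, shards_t2):
    """'global': run the subquery once over all of T2, then broadcast the set."""
    tmp = {a for t2_local in shards_t2 for a in t2_local}
    result = []
    for t1_local in shards_t1:
        # Every shard filters its T1_local against the same broadcast set.
        result.extend(foo for foo, a in t1_local if a in tmp)
    return result

def run_local_in(shards_t1, shards_t2):
    """'local': each shard evaluates the subquery on its own T2_local only."""
    result = []
    for t1_local, t2_local in zip(shards_t1, shards_t2):
        local_set = set(t2_local)
        result.extend(foo for foo, a in t1_local if a in local_set)
    return result

# T1 rows are (foo, a); the matching values of a sit on the other shard.
shards_t1 = [[("x", 1), ("y", 2)], [("z", 3)]]
shards_t2 = [[3], [1]]
global_result = run_global_in(shards_t1, shards_t2)
local_result = run_local_in(shards_t1, shards_t2)
```

With this layout the global mode finds x and z while the local mode finds nothing. The local mode only gives the full answer when related rows of T1 and T2 are placed on the same shard.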
Tips to make distributed queries more efficient
● Think about where your data are located
● Move WHERE and heavy grouping work to left hand side of join
● Use a subquery to order joins after the remote scan
● Use the query_log to see what actually executes on the remote node(s)
Where to learn more
Where is the documentation?
ClickHouse official docs – https://clickhouse.com/docs/
Altinity Blog – https://altinity.com/blog/
Altinity YouTube Channel – https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA
Altinity Knowledge Base – https://kb.altinity.com/
Meetups, other blogs, and external resources. Use your powers of Search!
References for this talk
Altinity Knowledge Base – https://kb.altinity.com/
ClickHouse Source Code – https://github.com/ClickHouse/ClickHouse
Talks and Blog Articles:
● ClickHouse Deep Dive, Alexey Milovidov
● About JOINs (in ClickHouse), Artyem Zuikov
● DISTINCT and ORDER BY modifiers for all aggregate functions, Sofia Sergeevna Borzenkova
● ClickHouse Kernel Analysis - Storage Structure and Query Acceleration of MergeTree, Alibaba Cloud
Thank you!
Questions?
https://altinity.com
47

More Related Content

What's hot

Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Altinity Ltd
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Altinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Altinity Ltd
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Altinity Ltd
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
Altinity Ltd
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
Altinity Ltd
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
Altinity Ltd
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
rpolat
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
Altinity Ltd
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Altinity Ltd
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
Altinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Altinity Ltd
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
Altinity Ltd
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
Altinity Ltd
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
Altinity Ltd
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
Altinity Ltd
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
Altinity Ltd
 
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
Altinity Ltd
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
Altinity Ltd
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Altinity Ltd
 

What's hot (20)

Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEODangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
Dangerous on ClickHouse in 30 minutes, by Robert Hodges, Altinity CEO
 
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
All About JSON and ClickHouse - Tips, Tricks and New Features-2022-07-26-FINA...
 
Better than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouseBetter than you think: Handling JSON data in ClickHouse
Better than you think: Handling JSON data in ClickHouse
 
A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides A Day in the Life of a ClickHouse Query Webinar Slides
A Day in the Life of a ClickHouse Query Webinar Slides
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander ZaitsevClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
ClickHouse in Real Life. Case Studies and Best Practices, by Alexander Zaitsev
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
Altinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouseAltinity Quickstart for ClickHouse
Altinity Quickstart for ClickHouse
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEOClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
ClickHouse on Kubernetes! By Robert Hodges, Altinity CEO
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
 
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UIData Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
Data Warehouses in Kubernetes Visualized: the ClickHouse Kubernetes Operator UI
 

Similar to A day in the life of a click house query

Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
Altinity Ltd
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
Altinity Ltd
 
Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]Don't Do This [FOSDEM 2023]
Don't Do This [FOSDEM 2023]
Jimmy Angelakos
 
Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?Data warehouse or conventional database: Which is right for you?
Data warehouse or conventional database: Which is right for you?
Data Con LA
 
Maryna Popova "Deep dive AWS Redshift"
Maryna Popova "Deep dive AWS Redshift"Maryna Popova "Deep dive AWS Redshift"
Maryna Popova "Deep dive AWS Redshift"
Lviv Startup Club
 
How to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'rollHow to teach an elephant to rock'n'roll
How to teach an elephant to rock'n'roll
PGConf APAC
 
Practical Partitioning in Production with Postgres
Practical Partitioning in Production with PostgresPractical Partitioning in Production with Postgres
Practical Partitioning in Production with Postgres
Jimmy Angelakos
 
Creating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouseCreating Beautiful Dashboards with Grafana and ClickHouse
Creating Beautiful Dashboards with Grafana and ClickHouse
Altinity Ltd
 
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster PerformanceWebinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Webinar: Strength in Numbers: Introduction to ClickHouse Cluster Performance
Altinity Ltd
 
Hash join use memory optimization
Hash join use memory optimizationHash join use memory optimization
Hash join use memory optimization
ICTeam S.p.A.
 
Common Table Expressions (CTE) & Window Functions in MySQL 8.0
Common Table Expressions (CTE) & Window Functions in MySQL 8.0Common Table Expressions (CTE) & Window Functions in MySQL 8.0
Common Table Expressions (CTE) & Window Functions in MySQL 8.0
oysteing
 
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platformcloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
cloudera Apache Kudu Updatable Analytical Storage for Modern Data Platform
Rakuten Group, Inc.
 
PHP tips by a MYSQL DBA
PHP tips by a MYSQL DBAPHP tips by a MYSQL DBA
PHP tips by a MYSQL DBA
Amit Kumar Singh
 
Pro Techniques for the SSAS MD Developer
Pro Techniques for the SSAS MD DeveloperPro Techniques for the SSAS MD Developer
Pro Techniques for the SSAS MD Developer
Jens Vestergaard
 
Partition and conquer large data in PostgreSQL 10
Partition and conquer large data in PostgreSQL 10Partition and conquer large data in PostgreSQL 10
Partition and conquer large data in PostgreSQL 10
Ashutosh Bapat
 
Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)Relational Database to Apache Spark (and sometimes back again)
Relational Database to Apache Spark (and sometimes back again)
Ed Thewlis
 
Developers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman OracleDevelopers' mDay 2017. - Bogdan Kecman Oracle
Developers' mDay 2017. - Bogdan Kecman Oracle
mCloud
 
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
Developers’ mDay u Banjoj Luci - Bogdan Kecman, Oracle – MySQL Server 8.0
mCloud
 
15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance15 Ways to Kill Your Mysql Application Performance
15 Ways to Kill Your Mysql Application Performance
guest9912e5
 

Similar to A day in the life of a click house query (20)

Altinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdfAltinity Quickstart for ClickHouse-2202-09-15.pdf
Altinity Quickstart for ClickHouse-2202-09-15.pdf
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
Don't Do This [FOSDEM 2023]


A Day in the Life of a ClickHouse Query

  • 1. © 2022 Altinity, Inc. A Day in the Life of a ClickHouse Query: Intro to ClickHouse Internals. Robert Hodges & Altinity Engineering. 10 February 2022
  • 2. Let’s make some introductions. ClickHouse support and services including Altinity.Cloud. Authors of Altinity Kubernetes Operator for ClickHouse and other open source projects. Robert Hodges: database geek with 30+ years on DBMS systems. Day job: Altinity CEO. Altinity Engineering: database geeks with centuries of experience in DBMS and applications.
  • 3. Foundations
  • 4. ClickHouse is a SQL Data Warehouse: understands SQL; runs on bare metal to cloud; shared nothing architecture; stores data in columns; parallel and vectorized execution; scales to many petabytes; is open source (Apache 2.0). It’s the core engine for real-time analytics. Diagram labels around ClickHouse: Event Streams, ELT, Object Storage, Interactive Graphics, Dashboards, APIs.
  • 5. If you understand the engine you can make it faster. ClickHouse has a simple execution model: there’s no magic. Any developer can understand how it works. Knowledge leads to faster and more efficient queries. (Image caption: Another fast engine!)
  • 6. What happens when you insert data?
  • 7. Let’s create a table! CREATE TABLE IF NOT EXISTS sdata ( DevId Int32, Type String, MDate Date, MDatetime DateTime, Value Float64 ) ENGINE = MergeTree() PARTITION BY toYYYYMM(MDate) ORDER BY (DevId, MDatetime) Annotations: the column list defines the table columns; ENGINE is the table engine type; PARTITION BY is how to break data into parts; ORDER BY is how to index and sort data in each part.
  • 8. Let’s now insert some data… INSERT INTO sdata VALUES (15, 'TEMP', '2018-01-01', '2018-01-01 23:29:55', 18.0), (15, 'TEMP', '2018-01-01', '2018-01-01 23:30:56', 18.7) (This is an example. Most people don’t insert data this way!)
  • 9. How does ClickHouse process an insert? INSERT INTO sdata VALUES (15, 'TEMP', . . .), (15, 'TEMP', . . .) Result: 2 rows in set. Elapsed: 0.271 sec. Diagram: the ClickHouse server parses and plans the statement, loads the rows into a part in RAM, sorts the rows (table ORDER BY), writes the part in storage, and responds.
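The insert path on slide 9 can be sketched in a few lines of Python. This is only an illustration of the idea (group rows by the PARTITION BY expression, then sort each group by the ORDER BY key), not ClickHouse's actual code; the function name `build_parts` and the dict-based row layout are assumptions made for the example.

```python
from collections import defaultdict
from datetime import date, datetime


def build_parts(rows):
    """Group rows by partition key, then sort each group by the ORDER BY key.

    Hypothetical helper: mimics PARTITION BY toYYYYMM(MDate)
    and ORDER BY (DevId, MDatetime) from the sdata table.
    """
    parts = defaultdict(list)
    for row in rows:
        partition = row["MDate"].year * 100 + row["MDate"].month  # like toYYYYMM
        parts[partition].append(row)
    for partition_rows in parts.values():
        partition_rows.sort(key=lambda r: (r["DevId"], r["MDatetime"]))
    return dict(parts)


# The two rows from slide 8, deliberately given out of order.
rows = [
    {"DevId": 15, "Type": "TEMP", "MDate": date(2018, 1, 1),
     "MDatetime": datetime(2018, 1, 1, 23, 30, 56), "Value": 18.7},
    {"DevId": 15, "Type": "TEMP", "MDate": date(2018, 1, 1),
     "MDatetime": datetime(2018, 1, 1, 23, 29, 55), "Value": 18.0},
]
parts = build_parts(rows)
```

Both rows land in one in-memory part for partition 201801, sorted by time, which is the state the server would then write to storage.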
  • 10. How can we make this more efficient? Parallelize! set max_insert_threads=4 insert into ontime_test select * from ontime where toYear(FlightDate) between 2000 and 2001 Result shown: 2 rows in set. Elapsed: 0.271 sec. Diagram: the ClickHouse server parses and plans the statement, loads parts in RAM, writes parts in storage, and responds.
  • 11. Parallelism affects speed and memory usage. insert into ontime_test select * from ontime where toYear(FlightDate) between 2000 and 2001, run with set max_insert_threads=1 . . . set max_insert_threads=2 . . . set max_insert_threads=4
  • 12. OK, where did those awesome stats come from? SELECT event_time, type, is_initial_query, query_duration_ms / 1000 AS duration, read_rows, read_bytes, result_rows, formatReadableSize(memory_usage) AS memory, query FROM system.query_log WHERE (user = 'default') AND (type = 'QueryFinish') ORDER BY event_time DESC LIMIT 50
  • 13. What’s going on down there when you INSERT? Diagram: a table consists of parts; each part holds an index and columns. The sparse index finds rows by Carrier, Origin, FlightDate. Columns are sorted by Carrier, Origin, FlightDate. Rows in the part all belong to the same Year.
  • 14. Understanding what’s in a MergeTree part. Table directory /var/lib/clickhouse/data/airline/ontime contains the part directory 20170701_20170731_355_355_2/. primary.idx holds the sparse index on (FlightDate, Carrier...) with entries such as 2017-07-01 AA, 2017-07-01 EV, 2017-07-01 UA, 2017-07-02 AA, ... Each column (ActualElapsedTime, Airline, AirlineID...) has a .mrk file and a .bin file. Diagram labels: Granule, Compressed Block, Mark.
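To make the sparse index on slides 13 and 14 concrete, here is a minimal Python sketch, assuming an index that stores only the key of the first row in each granule. The helper names and the tiny granule size of 4 rows are hypothetical and chosen for readability; real parts use much larger granules and separate mark files.

```python
from bisect import bisect_left, bisect_right

GRANULE_SIZE = 4  # illustrative assumption; real granules are far larger


def build_sparse_index(sorted_keys, granule_size=GRANULE_SIZE):
    """Keep only the key of the first row of every granule."""
    return [sorted_keys[i] for i in range(0, len(sorted_keys), granule_size)]


def granules_to_read(index, key):
    """Return the granule numbers that might contain `key`.

    Granule g covers keys between index[g] and index[g + 1], so only
    those granules need to be read and decompressed.
    """
    first = max(bisect_left(index, key) - 1, 0)
    last = bisect_right(index, key)  # exclusive upper bound
    return range(first, last)


# Rows sorted by (FlightDate, Carrier), as in the ORDER BY of the part.
keys = [
    ("2017-07-01", "AA"), ("2017-07-01", "AA"), ("2017-07-01", "DL"), ("2017-07-01", "EV"),
    ("2017-07-01", "EV"), ("2017-07-01", "UA"), ("2017-07-01", "UA"), ("2017-07-01", "WN"),
    ("2017-07-02", "AA"), ("2017-07-02", "DL"), ("2017-07-02", "UA"), ("2017-07-02", "WN"),
]
index = build_sparse_index(keys)
ua_granules = list(granules_to_read(index, ("2017-07-01", "UA")))
ev_granules = list(granules_to_read(index, ("2017-07-01", "EV")))
```

The index has three entries for twelve rows. A lookup for UA on 2017-07-01 touches one granule, while EV straddles a granule boundary and touches two.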
  • 15. Why MergeTree? Because it merges! Diagram: two parts, each with an index and columns, are rewritten into one bigger part with its own index and columns. Update and delete also rewrite parts.
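Because every part is already sorted by the ORDER BY key, merging parts is conceptually a streaming merge of sorted runs. The sketch below uses Python's `heapq.merge` to show that idea; the sample rows and the `merge_parts` helper are made up for illustration, and real merges also rewrite column files and the index.

```python
import heapq


def merge_parts(*parts):
    """Merge already-sorted parts into one bigger sorted part."""
    return list(heapq.merge(*parts))


# Each part is sorted by (DevId, MDatetime); values are illustrative.
part_a = [(15, "2018-01-01 23:29:55"), (15, "2018-01-01 23:30:56")]
part_b = [(12, "2018-01-01 10:00:00"), (15, "2018-01-01 23:30:00"), (20, "2018-01-02 08:15:00")]

merged = merge_parts(part_a, part_b)
```

No re-sort of the full data set is needed: the merge reads each input part once, front to back.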
  • 16. Bigger parts are more efficient! ● Pick a PARTITION BY that gives nice, fat partitions (1-300GB, < 1000 total parts per table) ○ Can’t decide? Partition by month. ● Insert large blocks of data to avoid lots of merges afterwards ○ ClickHouse is fine with tens of millions of rows! ● The simplest way to make blocks bigger is to batch input data ○ Avoid different partition keys in the same block ○ ClickHouse has parameters like max_insert_block_size but defaults are OK ○ Look at logs and actual part sizes to see if you need to do more
  • 17. How can I see how big table parts are? SELECT table, partition, name, marks, rows, data_compressed_bytes, data_uncompressed_bytes, bytes_on_disk FROM system.parts WHERE active AND level=0 AND database = 'default' AND table = 'ontime_test' ORDER BY table DESC, partition ASC, name ASC Annotations: active means the part is in use; level=0 means the part has not been merged.
  • 18. Tips to optimize INSERT. Making INSERT faster ● Increase max_insert_threads (parallel creation of parts) ● Enable input_format_parallel_parsing to parallelize input parsing ○ Works for TSV/CSV/Values data ● Write bigger blocks (less merging afterwards) Making INSERT less memory intensive ● Decrease max_insert_threads (reduces parts simultaneously in memory) ● Disable input_format_parallel_parsing ● Write smaller blocks (less memory required at INSERT time)
  • 19. How do basic queries work?
  • 20. Aggregation is a key feature of analytic queries. SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC Aggregates group measurements for one or more dimensions. Annotations: avg(DepDelay) is the aggregate function; Carrier is the dimension.
  • 21. How does ClickHouse process a query with aggregates? SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC Output shown on the slide: ┌─FlightDate─┬──────────────Delay─┐ │ 2016-12-17 │ 53.72058516196447 │ │ 1990-12-21 │ 43.87767499654839 │ │ 1990-12-22 │ 43.64346952962076 │ . . . Diagram: the ClickHouse server parses and plans the query, scans parts in storage into in-RAM hash tables, then merges and sorts.
  • 22. How can you compute an average in parallel? Scan: three scans read the values (1, 2, 3), (1, 3, 5), and (0, 5, 0, 0) and produce the partial aggregates Numerator=6 Denominator=3, Numerator=9 Denominator=3, and Numerator=5 Denominator=4. Merge: (6 + 9 + 5) / (3 + 3 + 4) = 2.
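The arithmetic on slide 22 can be checked with a short Python sketch. Each scan keeps a (numerator, denominator) pair, and the merge step adds the pairs before dividing. The function names are illustrative, not ClickHouse internals.

```python
def partial_avg(values):
    """Partial aggregate for avg: (sum of values, number of values)."""
    return (sum(values), len(values))


def merge_avg(partials):
    """Combine partial aggregates, then divide once at the end."""
    numerator = sum(p[0] for p in partials)
    denominator = sum(p[1] for p in partials)
    return numerator / denominator


# The three chunks of values from the slide, one per scan.
chunks = [[1, 2, 3], [1, 3, 5], [0, 5, 0, 0]]
partials = [partial_avg(chunk) for chunk in chunks]
result = merge_avg(partials)
```

Averaging the three per-chunk averages (2, 3, and 1.25) would give the wrong answer, which is why the scans carry numerator and denominator separately.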
  • 23. How does a ClickHouse thread do aggregation? Diagram: a scan thread reads parts in storage into its own hash table, which maps each GROUP BY key to partial aggregates: AL => 4259/1070, 2385/415, … DL => 20663/1198, 25166/2711, … The merge/sort step combines this hash table with the other scan threads’ hash tables to produce the result.
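A rough Python analogue of slide 23, assuming one hash table per scan thread keyed by the GROUP BY value and a final merge step. The thread pool, the `scan` and `merge` helpers, and the sample delays are all illustrative assumptions; the sketch mirrors the shape of the computation rather than ClickHouse's implementation.

```python
from concurrent.futures import ThreadPoolExecutor


def scan(rows):
    """One scan thread: build a hash table of GROUP BY key -> (sum, count)."""
    table = {}
    for carrier, delay in rows:
        total, count = table.get(carrier, (0, 0))
        table[carrier] = (total + delay, count + 1)
    return table


def merge(tables):
    """Merge step: combine partial aggregates per key, then finalize avg."""
    merged = {}
    for table in tables:
        for key, (total, count) in table.items():
            t, c = merged.get(key, (0, 0))
            merged[key] = (t + total, c + count)
    return {key: t / c for key, (t, c) in merged.items()}


# Two chunks of (Carrier, DepDelay) rows, one per scan thread.
chunks = [
    [("AL", 10), ("DL", 20)],
    [("AL", 30), ("DL", 0), ("DL", 10)],
]
with ThreadPoolExecutor(max_workers=2) as pool:
    tables = list(pool.map(scan, chunks))

averages = merge(tables)
# ORDER BY Delay DESC
result = sorted(averages.items(), key=lambda kv: kv[1], reverse=True)
```

Memory use grows with the number of distinct GROUP BY keys times the number of threads, since every thread holds its own table until the merge.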
  • 24. We can now understand aggregation performance drivers. Simple aggregate, short GROUP BY key with few values: SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC LIMIT 50 More complex aggregates, longer GROUP BY with more values: SELECT Carrier, FlightDate, avg(DepDelay) AS Delay, uniqExact(TailNum) AS Aircraft FROM ontime GROUP BY Carrier, FlightDate ORDER BY Delay DESC LIMIT 50 Measurements shown on the slide: 3.4 sec, 2.4 GB RAM; 0.84 sec, 1.6 KB RAM.
  • 25. Parallelism affects speed and memory usage. SELECT Origin, FlightDate, avg(DepDelay) AS Delay, uniqExact(TailNum) AS Aircraft FROM ontime WHERE Carrier='WN' GROUP BY Origin, FlightDate ORDER BY Delay DESC LIMIT 5, run with SET max_threads = 1 . . . SET max_threads = 4 . . . SET max_threads = 16
  • 26. This is a good time to mention ClickHouse memory limits. max_memory_usage (Default=10Gb): single query limit. max_memory_usage_for_user (Default=Unlimited): all queries for a user. max_server_memory_usage (Default=90% of available RAM): all memory on server, for the whole ClickHouse process.
  • 27. Tips to make aggregation queries faster ● Remove/exchange “heavy” aggregation functions ● Reduce the number of values in GROUP BY ● Increase max_threads (parallelism) ● Reduce I/O ○ Filter out unnecessary rows ○ Improve compression of data in storage
  • 28. Tips to reduce memory usage in aggregation queries ● Remove/exchange “heavy” aggregation functions ● Reduce number of values in GROUP BY ● Change max_threads value ● Dump aggregates to external storage ○ SET max_bytes_before_external_group_by > 0 ● Filter out unnecessary rows
  • 29. How do joins work?
  • 30. JOIN combines data between tables. SELECT o.Dest, any(a.Name) AS AirportName, count(Dest) AS Flights FROM ontime o JOIN airports a ON a.IATA = o.Dest GROUP BY Dest ORDER BY Flights DESC LIMIT 10 Annotations: ontime o is the “left” table; airports a is the “right” table; ON a.IATA = o.Dest is the join condition.
  • 31. How does ClickHouse process a query with a join? First, load the right side table (small) into an in-RAM hash table (keys and column values!). Then filter and scan the left side table (big) in parallel, computing aggregates and adding joined data. Finally, merge and sort results into the merged result.
  • 32. Let’s look more deeply at what’s happening in the scan. SELECT . . . FROM ontime o JOIN airports a ON a.IATA = o.Dest Diagram: the scan matches Dest in ontime against IATA in airports and builds partial aggregates that carry the joined airport name, for example ATL with 576, 1501, 3302, …, each paired with a copy of “Hartsfield Jackson Atlanta International Airport”, and ORD with 255 and “Chicago O'Hare International Airport”.
  • 33. It would be more efficient to join after aggregating. First, load the right side table (small) into an in-RAM hash table. Then filter and scan the left side table (big) in parallel, computing aggregates. Finally, merge, then join and sort.
  • 34. You can do exactly that with a subquery. Original query: SELECT o.Dest, any(a.Name) AS AirportName, count(Dest) AS Flights FROM ontime o JOIN default.airports a ON a.IATA = o.Dest GROUP BY Dest ORDER BY Flights DESC LIMIT 10 Rewritten with a subquery: SELECT o.Dest, a.Name AS AirportName, o.Flights FROM ( SELECT Dest, count(Dest) AS Flights FROM ontime GROUP BY Dest ) AS o JOIN default.airports a ON a.IATA = o.Dest ORDER BY Flights DESC LIMIT 10 Measurements shown on the slide: 2.71 sec, 19.9 MB RAM; 0.663 sec, 1.58 KB RAM.
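The difference between slides 31 and 33 shows up even in a toy hash join. In this Python sketch (an illustration with made-up rows and hypothetical function names), the right side table is a dict, and each strategy reports how many hash-table probes it needed.

```python
from collections import Counter

# Right side table (small), loaded into an in-RAM hash table: IATA -> Name.
airports = {
    "ATL": "Hartsfield Jackson Atlanta International Airport",
    "ORD": "Chicago O'Hare International Airport",
}
# Left side table (big): one Dest value per flight row.
flights = ["ATL", "ORD", "ATL", "ATL", "ORD"]


def join_then_aggregate(flights, airports):
    """Probe the hash table for every left-side row, then count per Dest."""
    probes = 0
    counts = Counter()
    names = {}
    for dest in flights:
        probes += 1
        if dest in airports:  # inner join: unmatched rows are dropped
            names[dest] = airports[dest]
            counts[dest] += 1
    result = sorted(((d, names[d], c) for d, c in counts.items()),
                    key=lambda row: row[2], reverse=True)
    return result, probes


def aggregate_then_join(flights, airports):
    """Count per Dest first, then probe once per distinct Dest."""
    counts = Counter(flights)
    probes = 0
    rows = []
    for dest, count in counts.items():
        probes += 1
        if dest in airports:
            rows.append((dest, airports[dest], count))
    rows.sort(key=lambda row: row[2], reverse=True)
    return rows, probes


result_a, probes_a = join_then_aggregate(flights, airports)
result_b, probes_b = aggregate_then_join(flights, airports)
```

Both strategies return the same rows, but the second probes the hash table once per distinct destination instead of once per flight, and never copies airport names into partial aggregates.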
  • 35. Simple ways to keep JOINs fast and efficient ● Keep the right side table(s) overall size small ● Minimize the columns joined from the right side ● Add filter conditions to the right side table to reduce rows ● JOIN after aggregation if possible ● Use a Dictionary instead of a JOIN ○ Dictionaries are just loaded once and can be shared across queries Pro tip: The SQL IN operator is also a join under the covers.
  • 36. How does a distributed query work?
  • 37. Example of a distributed data set with shards and replicas. Each of the four nodes (clickhouse-0, clickhouse-1, clickhouse-2, clickhouse-3) has three tables: ontime, a Distributed table (no data); ontime_local, a sharded, replicated table (partial data); and airports, a fully replicated table (all data).
  • 38. Distributed tables send subqueries to multiple nodes. Diagram: the application queries the ontime Distributed table on one node, which sends work to ontime_local on every node. The innermost subselect is distributed; AggregateState is computed locally; aggregates are merged on the initiator node.
  • 39. Queries are pushed to all shards. Query on the distributed table: SELECT Carrier, avg(DepDelay) AS Delay FROM ontime GROUP BY Carrier ORDER BY Delay DESC Query sent to each shard: SELECT Carrier, avg(DepDelay) AS Delay FROM ontime_local GROUP BY Carrier ORDER BY Delay DESC
  • 40. ClickHouse pushes down JOINs by default. Query on the distributed table: SELECT o.Dest d, a.Name n, count(*) c, avg(o.ArrDelayMinutes) ad FROM default.ontime o JOIN default.airports a ON (a.IATA = o.Dest) GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10 Query sent to each shard: SELECT Dest AS d, Name AS n, count() AS c, avg(ArrDelayMinutes) AS ad FROM default.ontime_local AS o ALL INNER JOIN default.airports AS a ON a.IATA = o.Dest GROUP BY d, n HAVING c > 100000 ORDER BY d DESC LIMIT 10
  • 41. ...Unless the left side “table” is a subquery. SELECT d, Name n, c AS flights, ad FROM ( SELECT Dest d, count(*) c, avg(ArrDelayMinutes) ad FROM default.ontime GROUP BY d HAVING c > 100000 ORDER BY ad DESC ) AS o LEFT JOIN airports ON airports.IATA = o.d LIMIT 10 Annotation on the subquery: Remote Servers.
  • 42. It’s more complex when multiple tables are distributed. For select foo from T1 where a in (select a from T2), the rewrite depends on distributed_product_mode. local: select foo from T1_local where a in ( select a from T2_local) (Subquery runs on local table). allow: select foo from T1_local where a in ( select a from T2) (Subquery runs on distributed table). global: create temporary table tmp Engine = Set AS select a from T2; select foo from T1_local where a in tmp; (Subquery runs on initiator; broadcast to local temp table).
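A toy model of the `local` and `global` rewrites from slide 42, written in Python under simplifying assumptions: two shards held as plain dicts, no replicas, and made-up rows. It only illustrates why the two modes can return different answers when matching T1 and T2 rows live on different shards.

```python
# Each shard holds its slice of T1 as (foo, a) rows and its slice of T2 as a values.
shards = [
    {"T1": [("x", 1), ("y", 2)], "T2": [2]},
    {"T1": [("z", 1)], "T2": [1]},
]


def query_local(shards):
    """local: every shard evaluates the IN subquery against its own T2 slice."""
    result = []
    for shard in shards:
        local_set = set(shard["T2"])
        result += [foo for foo, a in shard["T1"] if a in local_set]
    return result


def query_global(shards):
    """global: the initiator builds the full set once and broadcasts it."""
    broadcast_set = {a for shard in shards for a in shard["T2"]}
    result = []
    for shard in shards:
        result += [foo for foo, a in shard["T1"] if a in broadcast_set]
    return result


local_rows = sorted(query_local(shards))
global_rows = sorted(query_global(shards))
```

Row x is missed in the local variant because its matching T2 value sits on the other shard. The two modes agree only when related rows are stored on the same shard, which is one reason to think about where your data are located.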
  • 43. Tips to make distributed queries more efficient ● Think about where your data are located ● Move WHERE and heavy grouping work to left hand side of join ● Use a subquery to order joins after the remote scan ● Use the query_log to see what actually executes on the remote node(s)
  • 44. Where to learn more
  • 45. Where is the documentation? ClickHouse official docs: https://clickhouse.com/docs/ Altinity Blog: https://altinity.com/blog/ Altinity Youtube Channel: https://www.youtube.com/channel/UCE3Y2lDKl_ZfjaCrh62onYA Altinity Knowledge Base: https://kb.altinity.com/ Meetups, other blogs, and external resources. Use your powers of Search!
  • 46. References for this talk. Altinity Knowledge Base: https://kb.altinity.com/ ClickHouse Source Code: https://github.com/ClickHouse/ClickHouse Talks and Blog Articles: ● ClickHouse Deep Dive, Alexey Milovidov ● About JOINs (in ClickHouse) [Про JOIN’ы (в ClickHouse)], Artyem Zuikov ● DISTINCT and ORDER BY modifiers for all aggregate functions [Модификаторы DISTINCT и ORDER BY для всех агрегатных функций], Sofia Sergeevna Borzenkova ● ClickHouse Kernel Analysis - Storage Structure and Query Acceleration of MergeTree, Alibaba Cloud
  • 47. Thank you! Questions? https://altinity.com