Unleashing Real-time Insights with ClickHouse:
Navigating the Landscape in 2024
ALKIN TEZUYSAL
FOSSASIA, Hanoi, Vietnam - Apr 2024
@ask_dba
@ChistaDATA Inc. 2024
Let’s get connected with Alkin first
Alkin Tezuysal - EVP - Global Services @chistadata
● LinkedIn: https://www.linkedin.com/in/askdba/
● Open Source Database Evangelist
● Previously at PlanetScale, Percona and Pythian as Technical Manager, SRE, DBA
● Previously Enterprise DBA: Informix, Oracle, DB2, SQL Server
About ChistaDATA Inc.
Founded in 2021 by Shiv Iyer - CEO and Principal
Strong lineage, backed by leading investors
Focusing on ClickHouse infrastructure engineering and performance operations
What’s ClickHouse anyway?
Services and Products around dedicated DBaaS, Managed Services, Support and Consulting
www.chistadata.io www.chistadata.com
Recognitions
● Most Influential in Database Community 2022 - The Redgate 100
● MySQL Cookbook, 4th Edition 2022 - O'Reilly Media, Inc.
● MySQL Rockstar 2023 - Oracle (MySQL Community)
● Database Design and Modeling with PostgreSQL and MySQL 2024 - Packt
Maritime Trivia
What is the term for the process of turning a sailing vessel away from
the wind, allowing the sails to fill and propel the boat forward?
Tacking…
Gybing…
Tremola?
What is ClickHouse?
ClickHouse is:
● Open source, Apache 2.0 licensed
● Column-oriented
● A database management system engineered for high-speed analytics
● Its columnar storage model and advanced compression enable real-time analysis on large data volumes.
Row oriented
mysql> select customer_id, customer_zip_code_prefix, customer_city , customer_state from customers limit 5;
+----------------------------------+--------------------------+---------------+----------------+
| customer_id | customer_zip_code_prefix | customer_city | customer_state |
+----------------------------------+--------------------------+---------------+----------------+
| 00012a2ce6f8dcda20d059ce98491703 | 6273 | osasco | SP |
| 000161a058600d5901f007fab4c27140 | 35550 | itapecerica | MG |
| 0001fd6190edaaf884bcaf3d49edf079 | 29830 | nova venecia | ES |
| 0002414f95344307404f0ace7a26f1d5 | 39664 | mendonca | MG |
| 000379cdec625522490c315e70c7a9fb | 4841 | sao paulo | SP |
+----------------------------------+--------------------------+---------------+----------------+
Each column is stored in a separate file, and the values of one row sit at the same offset in every file.
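A minimal sketch of that layout, assuming plain Python lists stand in for the per-column files (this is an illustration of the idea, not how ClickHouse stores data on disk). The helper names `to_columns` and `read_row` are hypothetical; the sample values come from the query output in this deck.

```python
# Illustrative only: in-memory lists play the role of per-column files.
rows = [
    ("00012a2ce6f8dcda20d059ce98491703", 6273, "osasco", "SP"),
    ("000161a058600d5901f007fab4c27140", 35550, "itapecerica", "MG"),
    ("0001fd6190edaaf884bcaf3d49edf079", 29830, "nova venecia", "ES"),
]
column_names = (
    "customer_id",
    "customer_zip_code_prefix",
    "customer_city",
    "customer_state",
)


def to_columns(rows, names):
    """Pivot row tuples into one list per column; index i in every list is row i."""
    return {name: [row[pos] for row in rows] for pos, name in enumerate(names)}


def read_row(columns, offset):
    """Rebuild a full row by reading the same offset from every column."""
    return tuple(values[offset] for values in columns.values())


columns = to_columns(rows, column_names)

# A query that needs a single column touches only that column's data.
print(columns["customer_state"])

# A full row is still recoverable because the offsets line up.
print(read_row(columns, 1) == rows[1])
```

The trade-off this hints at: reading one column is cheap, while rebuilding a whole row means visiting every column.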
Column-oriented?
Query id: e8312155-2b9a-4ced-8af4-e05a2a977842
┌─customer_id──────────────────────┬─customer_zip_code_prefix─┬─customer_city─┬─customer_state─┐
1. │ 00012a2ce6f8dcda20d059ce98491703 │ 6273 │ osasco │ SP │
2. │ 000161a058600d5901f007fab4c27140 │ 35550 │ itapecerica │ MG │
3. │ 0001fd6190edaaf884bcaf3d49edf079 │ 29830 │ nova venecia │ ES │
4. │ 0002414f95344307404f0ace7a26f1d5 │ 39664 │ mendonca │ MG │
5. │ 000379cdec625522490c315e70c7a9fb │ 4841 │ sao paulo │ SP │
└──────────────────────────────────┴──────────────────────────┴───────────────┴────────────────┘
5 rows in set. Elapsed: 0.010 sec.
The importance of real-time analytics
● Helps deliver on strategic imperatives
● Competitive advantage
● Improve efficiencies
● Enhance customer experience
● Increase revenues
ClickHouse Highlights
● Efficient compression
○ Supports multiple compression codecs, such as LZ4 and ZSTD
● Vectorized Query Execution
○ Vectorized query execution processes data in batches, operating on multiple data
points with a single CPU instruction.
● CPU Efficiency
○ Full use of modern CPUs' capabilities, including SIMD (Single Instruction, Multiple
Data) instructions
● Scalability
○ Built-in horizontal sharding and replication.
● Rich Function Library
○ Built-in functions and operators for data transformation, filtering, and aggregation
● Geospatial Support, Materialized Views, Support for SQL Syntax
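To make the batch-at-a-time idea behind vectorized execution concrete, here is a rough sketch in pure Python. It only shows the control flow (small primitives that each take a whole block of column values); Python itself will not emit SIMD instructions, and the block size, function names and threshold are all assumptions chosen for illustration.

```python
from array import array

BLOCK_SIZE = 4  # hypothetical; real engines use far larger blocks

# One column of values, taken from the sample output in this deck.
zip_codes = array("i", [6273, 35550, 29830, 39664, 4841])


def count_row_at_a_time(values, threshold):
    """Baseline: evaluate the predicate and update the counter per value."""
    count = 0
    for value in values:
        if value > threshold:
            count += 1
    return count


def greater_than(block, threshold):
    """Filter primitive: takes a whole block, returns a 0/1 mask of equal length."""
    return array("b", (1 if value > threshold else 0 for value in block))


def count_set(mask):
    """Aggregate primitive: counts the set entries of a mask block."""
    return sum(mask)


def count_in_blocks(values, threshold, block_size=BLOCK_SIZE):
    """Drive the two primitives over the column one block at a time."""
    total = 0
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        total += count_set(greater_than(block, threshold))
    return total


print(count_in_blocks(zip_codes, 30000))
```

In a compiled engine the body of each primitive is a tight loop over contiguous memory, which is what lets the compiler and CPU apply one instruction to several values at once.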
Benchmarks
ClickHouse Engine Family
● MergeTree: the most universal and functional table engines for high-load tasks.
● Log: lightweight engines with minimum functionality.
● Integration engines: engines for communicating with other data storage and processing systems.
ClickHouse Integration Engine Family
Sample integration with MySQL
mysql> desc customers ;
+--------------------------+-------------+------+-----+---------+-------+
| Field | Type | Null | Key | Default | Extra |
+--------------------------+-------------+------+-----+---------+-------+
| customer_id | varchar(45) | NO | PRI | NULL | |
| customer_unique_id | varchar(45) | NO | UNI | NULL | |
| customer_zip_code_prefix | int | YES | | NULL | |
| customer_city | varchar(25) | YES | | NULL | |
| customer_state | char(2) | YES | | NULL | |
+--------------------------+-------------+------+-----+---------+-------+
5 rows in set (0.00 sec)
Sample integration with MySQL
statement: CREATE TABLE olist.mysql_data
(
`customer_id` String,
`customer_unique_id` String,
`customer_zip_code_prefix` Nullable(Int32) DEFAULT NULL,
`customer_city` Nullable(String) DEFAULT NULL,
`customer_state` Nullable(String) DEFAULT NULL
)
ENGINE = MySQL('127.0.0.1:3306', 'olist', 'customers', 'root', '[HIDDEN]')
1 row in set. Elapsed: 0.001 sec.
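The relationship between the MySQL `desc` output and the ClickHouse column list above can be sketched as a small mapping. This is a hypothetical helper, not part of ClickHouse: it covers only the three MySQL types that appear in this example and simply mirrors the result shown on the slide (NOT NULL columns become plain types, nullable columns become `Nullable(...) DEFAULT NULL`).

```python
import re

# Assumption: only the MySQL types seen in the `desc customers` output are handled.
_BASE_TYPES = {"varchar": "String", "char": "String", "int": "Int32"}


def clickhouse_type(mysql_type, nullable):
    """Translate one MySQL column type into the ClickHouse type shown on the slide."""
    base = re.sub(r"\(.*\)", "", mysql_type).strip().lower()  # drop "(45)" etc.
    ch_type = _BASE_TYPES[base]
    return f"Nullable({ch_type})" if nullable else ch_type


def render_columns(described):
    """Build the column list of a CREATE TABLE from (name, type, null) triples."""
    lines = []
    for name, mysql_type, null in described:
        nullable = null == "YES"
        definition = f"`{name}` {clickhouse_type(mysql_type, nullable)}"
        if nullable:
            definition += " DEFAULT NULL"
        lines.append(definition)
    return ",\n".join(lines)


# Field, Type and Null columns copied from the MySQL `desc customers` output.
mysql_desc = [
    ("customer_id", "varchar(45)", "NO"),
    ("customer_unique_id", "varchar(45)", "NO"),
    ("customer_zip_code_prefix", "int", "YES"),
    ("customer_city", "varchar(25)", "YES"),
    ("customer_state", "char(2)", "YES"),
]

print(render_columns(mysql_desc))
```

An unknown type raises `KeyError` on purpose, so a column is never mapped silently to the wrong type.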
Load data to ClickHouse
:) INSERT INTO customers SELECT *
FROM mysql_data
Query id: f4e154ad-c6dd-497d-988e-d0d019319a53
Ok.
Transferred table in ClickHouse
:) select count(*) from customers;
SELECT count(*)
FROM customers
Query id: dffacd95-a0ae-4027-b12b-dfa17d780e79
┌─count()─┐
1. │ 192016 │
└─────────┘
1 row in set. Elapsed: 0.008 sec.
ClickHouse default compression
+------------------------------------+------------+
| Table | Size in MB |
+------------------------------------+------------+
| geolocation | 54.58 |
| customers | 34.39 |
| order_reviews | 20.58 |
| orders | 20.58 |
| order_items | 14.56 |
| order_payments | 11.55 |
| products | 5.52 |
| sellers | 0.33 |
| product_cateegory_name_translation | 0.02 |
+------------------------------------+------------+
Query id: e18d6938-9213-462d-a0d5-12a297ffa1ef
┌─table_name─┬─size──────┬─total_rows─┐
1. │ customers │ 13.64 MiB │ 192016 │
2. │ mysql_data │ ᴺᵁᴸᴸ │ ᴺᵁᴸᴸ │
└────────────┴───────────┴────────────┘
2 rows in set. Elapsed: 0.001 sec.
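The two size figures for `customers` can be compared with a little arithmetic. This sketch assumes both numbers are in the same unit (the MySQL "MB" column is treated as MiB here) and that they measure comparable things; index overhead on the MySQL side is not separated out.

```python
# Figures copied from the two outputs above; the shared unit is an assumption.
mysql_size = 34.39       # "Size in MB" reported for customers in MySQL
clickhouse_size = 13.64  # size reported for customers in ClickHouse (MiB)

ratio = mysql_size / clickhouse_size
saving_pct = (1 - clickhouse_size / mysql_size) * 100

print(f"{ratio:.2f}x smaller, {saving_pct:.1f}% less space")
```

The `mysql_data` row shows NULL for size and row count, which fits a table that only points at the MySQL server and holds no data of its own in ClickHouse.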
Use Case Ideas
● Analytics on denormalized tables
● Star Schema migration
● Time Series data ingestion via streaming
● Log data
● OLTP data archive
● Data Lake and Fabric solutions
● Observability
Streaming Data to Real Time Analytics
Get started with clickhouse-local
$ curl https://clickhouse.com/ | sh
$ ./clickhouse local -q "SELECT * FROM 'customers.tsv'"
Get started with Homebrew on macOS
$ brew install --cask clickhouse
$ clickhouse
ClickHouse local version 24.3.1.2672 (official build).
macbook-pro-4.local :) SELECT
name AS table_name,
formatReadableSize(total_bytes) AS size,
total_rows
FROM system.tables
WHERE database = 'olist'
ORDER BY total_bytes DESC;
SELECT
name AS table_name,
Born to Sail, Forced to Work!
Catching winds
@svrubato
How to contribute to the community?
THANK YOU
Q&A
References & Credits
● Blog - ChistaDATA Inc.
● Knowledge Base Archive - ChistaDATA Inc.
● What Is ClickHouse? | ClickHouse Docs
● ClickHouse Quick Start
● Benchmarking Opentelemetry
● clickhouse-benchmark | ClickHouse Docs