Building an Observability Platform with ClickHouse
Ankit Nayan (CTO, SigNoz)
https://github.com/SigNoz/signoz
About Me
- Building SigNoz, part of Y Combinator
- 8 years of experience building scalable tech architectures
- Passionate about distributed systems
- Love the Himalayas and trekking
SigNoz - Open source observability platform
👉 Metrics + Traces in a single pane
👉 Powerful trace filtering and aggregation capabilities
👉 Out-of-the-box experience like SaaS solutions, requiring minimal dev effort
👉 Backend and frontend natively built for OpenTelemetry
👉 Can be run within your own cloud
Quick Demo
Architecture of SigNoz
Components
● OpenTelemetry Collector
● ClickHouse
● Query Service
● Frontend
Why ClickHouse for storing Observability data
● Wide tables where each query processes a large number of rows but only a few columns
● Running aggregate queries
● Distributed columnar datastore and blazing fast queries
● Compression capabilities
● Hot and cold storage (see the TTL sketch below)
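The last two points map directly onto table definitions. Below is a minimal sketch (hypothetical table and column names, not SigNoz's actual schema) of per-column compression codecs plus a TTL clause that moves week-old parts to a cold volume; it assumes a storage policy named hot_cold with volumes hot and cold is configured on the server.

```sql
-- Hypothetical example: per-column codecs plus TTL-based hot/cold tiering.
-- Assumes a storage policy 'hot_cold' is defined in the server configuration.
CREATE TABLE spans_demo
(
    timestamp    DateTime64(9) CODEC(Delta, ZSTD(1)),  -- delta-encode near-monotonic timestamps
    traceID      String        CODEC(ZSTD(1)),
    serviceName  LowCardinality(String),                -- dictionary-encodes repeated values
    durationNano UInt64        CODEC(T64, ZSTD(1))
)
ENGINE = MergeTree
PARTITION BY toDate(timestamp)
ORDER BY (serviceName, timestamp)
TTL toDateTime(timestamp) + INTERVAL 7 DAY TO VOLUME 'cold'
SETTINGS storage_policy = 'hot_cold';
```

Delta-style codecs do well on timestamps and counters, which is a large part of why observability data compresses so effectively in columnar storage.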
Tracing data model in ClickHouse
Using Array(String) to store tagKeys and tagValues
Extracting frequently used tags to columns
https://github.com/jaegertracing/jaeger/issues/1438
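A minimal sketch of that model (hypothetical names, not the exact SigNoz schema): arbitrary tags live in parallel Array(String) columns, and a frequently queried tag is promoted to its own MATERIALIZED column so filters on it never touch the arrays.

```sql
-- Hypothetical sketch of the tag model: parallel arrays for arbitrary tags,
-- plus a dedicated column materialized for one frequently used tag.
CREATE TABLE spans_tags_demo
(
    timestamp DateTime64(9),
    traceID   String,
    spanID    String,
    tagKeys   Array(String),
    tagValues Array(String),
    -- indexOf returns 0 when the key is absent, and arr[0] yields the
    -- element type's default (''), so missing tags become empty strings.
    httpStatusCode String MATERIALIZED tagValues[indexOf(tagKeys, 'http.status_code')]
)
ENGINE = MergeTree
ORDER BY timestamp;

-- Filtering on a non-promoted tag scans the arrays instead:
-- SELECT count() FROM spans_tags_demo
-- WHERE tagValues[indexOf(tagKeys, 'http.method')] = 'POST';
```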
Storage Performance of ClickHouse
ClickHouse for Logs Management at Uber
[1] https://eng.uber.com/logging/
Comparison with Elastic
● A single ClickHouse node could ingest 300K logs per second, 10x the rate of a single ES node
● >80% of queries are aggregation queries, such as terms, histogram, and percentile aggregations; Elasticsearch is not designed to support fast aggregations (see the sketch below)
[1] https://eng.uber.com/logging/
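To illustrate why that workload favors ClickHouse (hypothetical names, reusing the spans_demo sketch from earlier): a percentile aggregation is a single function call and one scan over a few columns.

```sql
-- Hypothetical example of the aggregation-heavy workload described above:
-- per-service p50/p95/p99 latency over the last hour, bucketed per minute.
SELECT
    serviceName,
    toStartOfMinute(timestamp)                     AS minute,
    count()                                        AS requests,
    quantiles(0.5, 0.95, 0.99)(durationNano / 1e6) AS latency_ms_p50_p95_p99
FROM spans_demo
WHERE timestamp >= now() - INTERVAL 1 HOUR
GROUP BY serviceName, minute
ORDER BY serviceName, minute;
```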
Metrics Data model for ClickHouse
{"__name__":"up","instance":"clickhouse_exporter_1:91
16","job":"clickhouse"} │
Metrics @ SigNoz
● Metrics stored in ClickHouse use 1.16 bytes per sample vs 1.37 bytes with Prometheus local storage
● PromQL-like query model for reading metrics data from ClickHouse (see the sketch below)
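As a rough illustration of that PromQL-like read path (my assumption about the translation, not SigNoz's actual query planner), a selector such as up{job="clickhouse"} could resolve fingerprints from the series table and then fetch their samples:

```sql
-- Hypothetical translation of the selector up{job="clickhouse"}:
-- find matching series, then pull their recent samples.
SELECT fingerprint, timestamp_ms, value
FROM metric_samples_demo
WHERE fingerprint IN
(
    SELECT fingerprint
    FROM metric_series_demo
    WHERE JSONExtractString(labels, '__name__') = 'up'
      AND JSONExtractString(labels, 'job') = 'clickhouse'
)
  AND timestamp_ms >= toUnixTimestamp(now() - INTERVAL 5 MINUTE) * 1000
ORDER BY fingerprint, timestamp_ms;
```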
Planned Improvements
● Another table to store the complete span, indexed by traceID
● Use ReplicatedMergeTree for data redundancy
● Use SummingMergeTree for faster aggregation queries
● Use distributed tables for distributed queries (see the sketch below)
● Separate data and query nodes: each role scales independently and can run on different hardware SKUs
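A minimal sketch of the replication and distribution items (hypothetical cluster, table, and macro names; assumes a cluster signoz_cluster in remote_servers and {shard}/{replica} macros set on each node):

```sql
-- Hypothetical sketch: a replicated local table on every node of the cluster,
-- fronted by a Distributed table that fans queries out across shards.
CREATE TABLE spans_local ON CLUSTER signoz_cluster
(
    timestamp    DateTime64(9),
    traceID      String,
    serviceName  LowCardinality(String),
    durationNano UInt64
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/spans_local', '{replica}')
ORDER BY (serviceName, timestamp);

CREATE TABLE spans_distributed ON CLUSTER signoz_cluster
AS spans_local
ENGINE = Distributed(signoz_cluster, currentDatabase(), spans_local, rand());
```

Queries go to spans_distributed, which fans out to the shard-local replicated tables; inserts are sharded here by rand().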
Summary
● ClickHouse seems a good fit for observability data
● Uber has used ClickHouse at scale for logs
● Aggregation of span data is more efficient with OLAP DBs
● Less resource intensive at smaller scale compared to Druid, ideal for an open source project like ours
Thank You
ankit@signoz.io
@ankitnayan
Give SigNoz a try. Check out our GitHub repo at https://github.com/SigNoz/signoz