Real-Time, Exactly-Once Data
Ingestion from Kafka to ClickHouse
Mohammad Roohitavaf, Jun Li
October 21, 2021
The Real-Time Analytics Processing Pipeline
ClickHouse as Real-Time Analytics Database
• ClickHouse: an open-source columnar database for OLAP workloads
• Data insertion favors large blocks over individual rows
• Kafka serves as a data buffer
• A Block Aggregator is a data loader that aggregates Kafka messages into large blocks before loading them to ClickHouse
Block Aggregator Failures
• With respect to the block aggregator:
• Kafka can fail
• The database backend can fail
• Network connections to Kafka and the database can fail
• The block aggregator itself can crash
• Blindly retrying data loads leads to data loss or data duplication in the data persisted to the database
• Kafka's transaction mechanism cannot be applied here
Our Solution: Exactly-Once Message Delivery to ClickHouse
• Have the aggregator deterministically produce identical blocks for ClickHouse
• Relying on existing runtime support:
• Kafka as a metadata store to keep track of execution state, and
• ClickHouse's block duplication detection
The Outline of the Talk
• The block aggregator developed for multi-DC deployment
• The deterministic message replay protocol in block aggregator
• The runtime verifier as a monitoring/debugging tool for block aggregator
• Issues and experiences in block aggregator’s implementation and deployment
• The block aggregator deployment in production
The Multi-DC Kafka/ClickHouse Deployment
• Each database shard has its own topic
• #partitions in topic = #replicas in shard
• A block aggregator is co-located with each replica (as two containers in a Kubernetes pod)
• The block aggregator only inserts data into the local database replica (the ClickHouse replication protocol replicates the data to the other replicas)
• Each block aggregator subscribes to both Kafka clusters (a consumer sketch follows below)
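As a minimal sketch of the dual-cluster subscription, assuming librdkafka's C++ client (the talk does not name its Kafka client); the broker addresses, group id, and topic name are illustrative placeholders, not values from the talk:

#include <librdkafka/rdkafkacpp.h>
#include <memory>
#include <string>

// One consumer per Kafka DC; aggregators in the same shard use the same
// group id so partitions rebalance across them. All names are placeholders.
std::unique_ptr<RdKafka::KafkaConsumer> makeConsumer(const std::string& brokers) {
    std::string err;
    std::unique_ptr<RdKafka::Conf> conf(RdKafka::Conf::create(RdKafka::Conf::CONF_GLOBAL));
    conf->set("bootstrap.servers", brokers, err);
    conf->set("group.id", "block-aggregator-shard-1", err);
    conf->set("enable.auto.commit", "false", err);   // offsets/metadata are committed explicitly
    std::unique_ptr<RdKafka::KafkaConsumer> c(RdKafka::KafkaConsumer::create(conf.get(), err));
    c->subscribe({"shard_1_topic"});                 // the shard's own topic
    return c;
}

int main() {
    auto dc1 = makeConsumer("kafka-dc1.example:9092");   // Kafka cluster in DC 1
    auto dc2 = makeConsumer("kafka-dc2.example:9092");   // Kafka cluster in DC 2
    // ... both consumers feed the same per-partition block aggregation pipeline ...
}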
The Multi-DC Kafka/ClickHouse Failure Scenario (1) (Kafka DC Down)
The Multi-DC Kafka/ClickHouse Failure Scenario (2) (DC Down) (ClickHouse DC Down)
• ClickHouse insert-quorum = 2
The Multi-DC Kafka/ClickHouse Failure Scenario (3) (Kafka DC Down) (ClickHouse DC Down)
• ClickHouse insert-quorum = 2
Mappings of Topics, Tables, Rows, Messages
• One topic contains messages associated with multiple tables in the database
• One message contains multiple rows belonging to the same table
• Each message is an opaque byte array to Kafka, encoded with a protobuf-based mechanism (illustrated below)
• The block aggregator relies on the ClickHouse table schema to decode Kafka messages
• When a new table is added to the database, no schema changes are needed in the Kafka clusters
• The number of topics does not grow as tables continue to be added
• Table rows constructed from Kafka messages in the two Kafka DCs get merged in the database
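The relationship can be pictured with the following illustrative C++ types (not the actual wire format): Kafka sees only bytes, and the aggregator's decoder, driven by the ClickHouse table schema, turns one message into rows for exactly one table.

#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Illustrative shapes only; the real payload is protobuf-encoded.
struct RawKafkaMessage {
    int32_t partition;
    int64_t offset;
    std::vector<uint8_t> payload;   // opaque byte array to Kafka
};

struct DecodedBatch {
    std::string table;              // all rows in one message belong to one table
    size_t rowCount;                // columns are built using the table's schema
};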
The Block Aggregator Architecture
The Key Features of Block Aggregator
• Support multi-datacenter deployment model
• Multiple tables per topic/partition
• No data loss/duplication
• Monitoring with over a hundred metrics:
• Message processing rates
• Block insertion rate and failure rate
• Block size distribution
• Block loading time distribution
• Kafka metadata commit time and failure rate
• Whether abnormal message consumption behavior occurred (such as message offsets being rewound or skipped)
The Outline of the Talk
• The block aggregator developed for multi-DC deployment
• The deterministic message replay protocol in block aggregator
• The runtime verifier as a monitoring/debugging tool for block aggregator
• Issues and experiences in block aggregator’s implementation and deployment
• The block aggregator deployment in production
A Naïve Way for Block Aggregator to Replay Messages (1)
A Naïve Way for Block Aggregator to Replay Messages (2)
Our Solution: Block-Level Deduplication in ClickHouse (1)
• ClickHouse relies on ZooKeeper to store metadata
• Each stored block has a recorded hash value (sketched below)
• A new block is inserted only after its hash is checked for uniqueness
• Blocks are identical if they have:
• The same block size
• The same rows
• And the rows in the same order
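As a sketch of the dedup criterion: hashing the rows in order yields equal hashes exactly for equal row sequences (differing blocks collide only with negligible probability). The FNV-style combine below stands in for ClickHouse's actual block checksum and is an illustration only.

#include <cstdint>
#include <functional>
#include <string>
#include <vector>

// Illustration of order-sensitive block hashing; not ClickHouse's checksum.
uint64_t blockHash(const std::vector<std::string>& rows) {
    uint64_t h = 1469598103934665603ULL;      // FNV-1a offset basis
    for (const std::string& row : rows) {
        h ^= std::hash<std::string>{}(row);   // per-row hash
        h *= 1099511628211ULL;                // FNV prime keeps the combine order-sensitive
    }
    return h;
}
// Same size + same rows + same order => same hash => a replayed block is
// detected as a duplicate and dropped by ClickHouse.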
Our Solution: Guarantee to Form Identical Blocks (2)
• Store metadata back to Kafka describing the latest blocks formed for each table
• In case of failure, the next Block Aggregator that picks up the partition knows exactly how to reconstruct the latest blocks formed for each table by the previous Block Aggregator
• The two Block Aggregators can be on different ClickHouse replicas, if Kafka partition rebalancing happens
The Metadata Structure
For each Kafka connector, the metadata persisted to Kafka, per partition, is:
replica_1,table1,0,29,20,table2,5,20,10
The last block for table1 decided to load to ClickHouse: [0, 29].
Starting from offset min = 0, we have consumed 20 messages for table1.
The last block for table2 decided to load to ClickHouse: [5, 20].
Starting from offset min = 0, we have consumed 10 messages for table2.
In total, we have consumed all 30 messages from offset min = 0 to offset max = 29: 20 for table1 and 10 for table2.
Format: replica-id, [table-name, begin-msg-offset, end-msg-offset, count]+ (parsed in the sketch below)
Metadata.min = MIN(begin-msg-offset); Metadata.max = MAX(end-msg-offset)
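As a worked companion to the format above, here is a minimal C++ sketch that parses the per-partition metadata string and computes Metadata.min and Metadata.max; the type and function names are illustrative, not the aggregator's actual code.

#include <algorithm>
#include <cstdint>
#include <limits>
#include <map>
#include <sstream>
#include <string>

struct TableRange {
    int64_t begin = 0;   // begin-msg-offset of the last block
    int64_t end = 0;     // end-msg-offset of the last block
    int64_t count = 0;   // messages consumed for this table
};

struct Metadata {
    std::string replicaId;
    std::map<std::string, TableRange> tables;

    int64_t min() const {                       // Metadata.min = MIN(begin-msg-offset)
        int64_t m = std::numeric_limits<int64_t>::max();
        for (const auto& [name, r] : tables) m = std::min(m, r.begin);
        return m;
    }
    int64_t max() const {                       // Metadata.max = MAX(end-msg-offset)
        int64_t m = std::numeric_limits<int64_t>::min();
        for (const auto& [name, r] : tables) m = std::max(m, r.end);
        return m;
    }
};

// Parses e.g. "replica_1,table1,0,29,20,table2,5,20,10".
Metadata parseMetadata(const std::string& s) {
    Metadata md;
    std::stringstream ss(s);
    std::string table, b, e, c;
    std::getline(ss, md.replicaId, ',');
    while (std::getline(ss, table, ',') && std::getline(ss, b, ',') &&
           std::getline(ss, e, ',') && std::getline(ss, c, ',')) {
        // A "special block" (next slide) appears here as begin == end + 1.
        md.tables[table] = TableRange{std::stoll(b), std::stoll(e), std::stoll(c)};
    }
    return md;
}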
The Metadata Structure for Special Block
• Special block: when begin-msg-offset = end-msg-offset + 1
• Either there is no message for the table with offset less than begin-msg-offset
• Or every message for the table with offset less than begin-msg-offset has been received and acknowledged by ClickHouse
• Example: replica_id,table1,30,29,20,table2,5,20,10
• All messages with offset less than 30 for table1 have been acknowledged by ClickHouse
Message Processing Sequence: Consume/Commit/Load
The message processing shown here is per partition
Two Execution Modes:
• The aggregator starts from the message offset previously committed
• REPLAY: the aggregator retries sending the last blocks sent for each table, to avoid data loss
• CONSUME: the aggregator is done with REPLAY and is in the normal state
• Mode switching:
DetermineState(current_offset, saved_metadata) {
    begin = saved_metadata.min
    end = saved_metadata.max
    if (current_offset > end) state = CONSUME   // past all previously formed blocks
    else state = REPLAY                         // rebuild and re-send the last blocks
}
The Top-Level Processing Loop of a Kafka Connector
• For each Kafka Connector:
while (running) {                                    // outer loop
    wait for ClickHouse and Kafka to be healthy and connected
    while (running) {                                // inner loop
        batch = read a batch from Kafka              // on error, break the inner loop
        for (msg : batch.messages) {
            // consume appends the message to its table's buffer
            partitionHandlers[msg.partition].consume(msg)    // on error, break the inner loop
        }
        for (ph : partitionHandlers) {
            if (ph.state == CONSUME) {
                // checkBuffers commits metadata to Kafka, then flushes blocks to ClickHouse
                ph.checkBuffers()                    // on error, break the inner loop
            }
        }
        // each inner-loop iteration must complete within max_poll_interval
    }
    disconnect from Kafka
    clear partitionHandlers
}
Some Clarifications
• Partition handlers can be dynamically created or deleted due to the Kafka broker's decisions
• Under some failure conditions, one Kafka Connector can have more than one partition assigned
• Each partition handler performs metadata commits for its own partition
• Each partition handler can process multiple tables (because a Kafka partition can carry multiple tables)
• At any given time, each partition handler can have only one in-flight block per table to be inserted into ClickHouse (see the sketch after this list)
• No new block can be submitted until the current in-flight block gets a successful ACK from ClickHouse
• Thus, the committed metadata is at most one block per table ahead, i.e., "write-ahead logging with one block"
• In other words, when a replay happens, at most one block per table needs to be replayed
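The commit-then-flush ordering can be sketched as below; KafkaCommitter and ClickHouseClient are hypothetical interfaces standing in for the real clients, and only the call ordering reflects the talk.

#include <string>

// Hypothetical interfaces; not the aggregator's actual API.
struct Block { std::string table; /* rows and their offsets ... */ };

struct KafkaCommitter {
    virtual bool commitMetadata(const std::string& serialized) = 0;
    virtual ~KafkaCommitter() = default;
};
struct ClickHouseClient {
    virtual bool insertBlock(const Block& block) = 0;
    virtual ~ClickHouseClient() = default;
};

// Write-ahead logging with one block: commit the metadata describing the
// block to Kafka first, then flush the block; no new block for the same
// table is formed until this insert is ACKed.
bool flushTable(KafkaCommitter& kafka, ClickHouseClient& ch,
                const std::string& metadata, const Block& block) {
    if (!kafka.commitMetadata(metadata))
        return false;               // commit failed: connector disconnects, rebalance follows
    return ch.insertBlock(block);   // insert failed: the successor replays exactly this block
}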
Some Clarifications (cont’d)
• If block insertion into ClickHouse fails:
• The outermost loop disconnects the Kafka Connector from the Kafka broker
• Kafka consumer group rebalancing is triggered automatically
• A different replica's Kafka Connector is assigned the partition, and block insertion continues on this new replica
• Thus, rebalancing provides "global retries with the last committed state" across multiple replicas
• The same failure handling mechanism applies, for example, when a metadata commit to Kafka fails
• Thus, Kafka consumer group rebalancing is an indicator of a failure that a block aggregator cannot recover from locally
Example on Partition Rebalancing on Replicas
The following diagram shows two aggregators in one shard being killed (to simulate one datacenter going down); block insertion traffic is picked up by the two remaining aggregators in the same shard.
The Outline of the Talk
• The block aggregator developed for multi-DC deployment
• The deterministic message replay protocol in block aggregator
• The runtime verifier as a monitoring/debugging tool for block aggregator
• Issues and experiences in block aggregator’s implementation and deployment
• The block aggregator deployment in production
Runtime Verification
• Aggregator Verifier (AV): checks that the blocks flushed by all aggregators to ClickHouse cause no data loss/duplication
• How can AV know which blocks were flushed by the aggregators?
• Each aggregator commits metadata to Kafka, per partition, before flushing anything to ClickHouse
• All metadata records committed by the aggregators are appended to an internal Kafka topic called __consumer_offsets
• Thus, AV subscribes to this topic and learns about all blocks flushed to ClickHouse by all aggregators
Runtime Verification Algorithm
Let M.t.start and M.t.end be the start and end offsets for table t in metadata M, respectively.
For any given metadata instances M and M', where M was committed before M' in time (both checks are rendered in code below):
• Backward Anomaly: for some table t, M'.t.end < M.t.start
• Overlap Anomaly: for some table t, M.t.start < M'.t.end AND M'.t.start < M.t.end
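The two checks translate directly into code. A small C++ rendering with illustrative names, following the M.t.start/M.t.end notation above:

#include <cstdint>

// Offsets for one table t in a metadata instance (M.t.start, M.t.end).
struct Range { int64_t start; int64_t end; };

// Precondition for both checks: m was committed before mPrime in time.
bool backwardAnomaly(const Range& m, const Range& mPrime) {
    return mPrime.end < m.start;                          // M'.t.end < M.t.start
}
bool overlapAnomaly(const Range& m, const Range& mPrime) {
    return m.start < mPrime.end && mPrime.start < m.end;  // ranges intersect
}
// The verifier applies these predicates per table to every ordered pair of
// committed metadata instances.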
Runtime Verifier Implementation
• The verifier reads metadata instances in their commit order to Kafka, stored in the system topic __consumer_offsets.
• __consumer_offsets is a partitioned topic, and Kafka does not guarantee ordering across partitions.
• We order metadata instances by their commit timestamps at the brokers. This approach requires the Kafka brokers' clocks to be synchronized with an uncertainty window smaller than the time between committing two metadata instances. Thus, we should not commit metadata to Kafka too frequently.
• This is not a problem for the block aggregator, as it commits metadata to Kafka for each block every several seconds, which is infrequent compared to the clock skew.
The Outline of the Talk
• The block aggregator developed for multi-DC deployment
• The deterministic message replay protocol in block aggregator
• The runtime verifier as a monitoring/debugging tool for block aggregator
• Issues and experiences in block aggregator’s implementation and deployment
• The block aggregator deployment in production
Compile and Link ClickHouse into Block Aggregator
• Instead of using the C++ client library from the ClickHouse repo, we compiled and linked the entire ClickHouse codebase into the block aggregator
• This lets us leverage the native ClickHouse implementation:
• Native TCP/IP communication protocol (with TLS and connection pooling)
• Select query capabilities, just like clickhouse-client (for testing purposes)
• Table schema retrieval, and block header construction from the schema
• Column construction from protobuf-based Kafka message deserialization
• Column default expression evaluation
• ZooKeeper client for distributed locking
Dynamic Table Schema Update
• To dynamically update a table schema:
• Step 1: the table schema is updated on each ClickHouse shard
• Step 2: the block aggregators in each shard are restarted, so they load the updated schema from the co-located ClickHouse replica
• Step 3: after offline confirmation of the schema update, the client application updates its logic to produce new Kafka messages following the updated schema
• Requirement: the block aggregator must be able to deserialize Kafka messages into blocks, whether or not the messages follow the updated schema (see the sketch below)
• Solution: enforce that columns in a table schema can only be added, never deleted afterwards
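A sketch of why add-only columns keep old messages decodable: rows encoded against an older schema are simply shorter, and the missing trailing columns are filled from default expressions. The types and names below are illustrative, not the real decoder (which builds native ClickHouse columns).

#include <cstddef>
#include <string>
#include <vector>

struct ColumnDef {
    std::string name;
    std::string defaultExpr;   // evaluated when the field is absent
};

// Decode one row against the current schema, tolerating messages produced
// under an older (shorter) schema.
std::vector<std::string> decodeRow(const std::vector<std::string>& encodedFields,
                                   const std::vector<ColumnDef>& schema) {
    std::vector<std::string> row;
    for (size_t i = 0; i < schema.size(); ++i) {
        if (i < encodedFields.size())
            row.push_back(encodedFields[i]);       // field present in the message
        else
            row.push_back(schema[i].defaultExpr);  // older message: use the column default
    }
    return row;
}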
Multiple ZooKeeper Clusters for One ClickHouse Cluster
• ClickHouse relies on ZooKeeper as its metadata store and replication coordinator
• Each block insertion takes roughly 15 remote calls to the ZooKeeper server cluster
• Block insertion is performed per table
• Our ZooKeeper cluster (version 3.5.8) is deployed across three datacenters with ~20 ms cross-datacenter communication latency
• For a large ClickHouse cluster with 250 shards (each shard having 4 replicas), a single ZooKeeper deployment can show a high ZooKeeper "hardware exception" rate
• The exceptions are due to frequently expired ZooKeeper sessions
• Multiple ZooKeeper clusters are deployed instead, each allocated a subset of the ClickHouse shards
• In our deployment, 50 shards share one ZK cluster
• The right ratio depends on the block insertion rate per table and the total number of tables involved in real-time insertion
Distributed Locking at Block Aggregator
• Before "insert_quorum_parallel" was introduced in ClickHouse:
• In each shard, for each table, only one replica is allowed to perform data insertion
• Distributed locking is used to coordinate block insertion among the block aggregators
• The ZooKeeper locking implementation in ClickHouse is used (see the sketch below)
• More recent ClickHouse versions introduce "insert_quorum_parallel"
• The default value is true
• According to the Altinity blog article, the current ClickHouse implementation breaks sequential consistency and may have other side effects
• In our recent product release based on ClickHouse 21.8, we turned this option off
• And we still enforce distributed locking at the block aggregator
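The locking follows the standard ZooKeeper ephemeral-sequential recipe (the aggregator reuses ClickHouse's ZooKeeper client for the actual implementation); ZkClient below is a hypothetical interface, not that API.

#include <string>
#include <vector>

// Hypothetical ZooKeeper wrapper; the real code reuses ClickHouse's client.
struct ZkClient {
    virtual std::string createEphemeralSequential(const std::string& prefix) = 0;
    virtual std::vector<std::string> getChildrenSorted(const std::string& dir) = 0;
    virtual void waitForDelete(const std::string& node) = 0;   // set a watch and block
    virtual ~ZkClient() = default;
};

// Standard ZooKeeper lock recipe: the lowest ephemeral sequence node holds
// the lock; everyone else watches the node just ahead of theirs. Ephemeral
// nodes vanish with their session, so a crashed holder cannot leak the lock.
void acquireInsertLock(ZkClient& zk, const std::string& lockDir) {
    const std::string me = zk.createEphemeralSequential(lockDir + "/lock-");
    for (;;) {
        std::vector<std::string> kids = zk.getChildrenSorted(lockDir);
        size_t idx = 0;
        while (lockDir + "/" + kids[idx] != me) ++idx;
        if (idx == 0) return;                                  // we hold the lock
        zk.waitForDelete(lockDir + "/" + kids[idx - 1]);       // then re-check
    }
}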
Testing on Block Aggregator
• Resiliency testing (in an 8-shard cluster with 32 replicas)
• Follows the "Chaos Monkey" approach
• Kill individual processes and individual containers, across ZooKeeper, ClickHouse, and Block Aggregator
• Kill all processes and containers in one datacenter, across ZooKeeper, ClickHouse, and Block Aggregator
• Validate that data loading recovers and continues
• Smaller-scale integration testing
• The whole cluster runs on a single machine, with multiple processes for ZooKeeper, ClickHouse, and the Block Aggregators
• Programmatically control process start/stop, along with small table insertions
• In addition, turn on fault injection at predefined points in the Block Aggregator code (see the sketch below)
- For example, deliberately refuse to accept Kafka messages for 10 seconds
• Validate that no data loss or data duplication happens
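One of the predefined fault-injection points could look like the following; the flag and function names are illustrative of the technique, not the actual test harness.

#include <atomic>
#include <chrono>
#include <thread>

// Illustrative fault-injection point: when the test harness sets the flag,
// the consume path stalls for 10 seconds, simulating "not accepting Kafka
// messages deliberately".
std::atomic<bool> injectConsumeStall{false};

bool faultPointBeforeConsume() {
    if (injectConsumeStall.exchange(false)) {
        std::this_thread::sleep_for(std::chrono::seconds(10));   // simulated stall
    }
    return true;   // proceed with normal consumption afterwards
}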
ClickHouse Troubleshooting and Remediation
• The setting "insert_quorum = 2" guarantees high data reliability
• A ClickHouse exception (error code 286) can happen occasionally:
2021.04.10 16:26:38.896509 [ 59963 ] {8421e4d6-43f0-4792-8570-7ef2bf8f595a} <Error> executeQuery: Code: 286, e.displayText()
= DB::Exception: Quorum for previous write has not been satisfied yet. Status: version: 1
part_name: 20210410-0_990_990_0
required_number_of_replicas: 2
actual_number_of_replicas: 1
replicas: SLC-74137
Data insertion in the whole shard stops when this exception happens!
ClickHouse Troubleshooting and Remediation (cont’d)
• An in-house tool was developed to:
• Scan the ZooKeeper subtree associated with the log replication queues
• Inspect why queued commands cannot be performed
• Once all queued commands are cleared, the quorum is automatically satisfied
• Afterwards, data insertion resumes in the shard
• Real-time alerts are defined for:
• A shard going a long time without block insertion
• Block insertion experiencing a non-zero failure rate with error code 286
• Replicas whose replication queues grow too large
The Outline of the Talk
• The block aggregator developed for multi-DC deployment
• The deterministic message replay protocol in block aggregator
• The runtime verifier as a monitoring/debugging tool for block aggregator
• Issues and experiences in block aggregator’s implementation and deployment
• The block aggregator deployment in production
Block Aggregator Deployment in Production
One Example Deployment
Kafka clusters: 2 datacenters
The ClickHouse cluster:
• 2 datacenters
• 250 shards
• Each shard has 4 replicas (2 replicas per DC)
• An aggregator is co-located with each replica
Metric | Measured Result
Total messages processed/sec (peak) | 280 K
Total message bytes processed/sec (peak) | 220 MB/sec
95th-percentile block insertion time (quorum=2) | 3.8 sec (table 1), 1.1 sec (table 2), 4.0 sec (table 3)
95th-percentile block size | 0.16 MB (table 1), 0.03 MB (table 2), 0.46 MB (table 3)
95th-percentile number of rows in a block | 1358 rows (table 1), 1.8 rows (table 2), 1894 rows (table 3)
95th-percentile Kafka commit time | 64 ms
End-to-end message consumption lag time | < 30 sec
Block Aggregator Deployment in Production
• The block insertion rate at the shard level in a 24-hour window
Block Aggregator Deployment in Production
• The message consumption lag time at the shard level captured in a 24-hour window
Block Aggregator Deployment in Production
• The Kafka consumer group rebalance rate at the shard level in a 24-hour window (always 0)
Block Aggregator Deployment in Production
• ZooKeeper hardware exceptions in a 24-hour window (close to 0)
Summary
• Using streaming platforms like Kafka is a standard way to transfer data across data processing systems
• For a columnar DB, loading large blocks is more efficient than loading individual records
• Under failure conditions, replaying Kafka messages may cause data loss or data duplication at the block loaders
• Our solution deterministically produces identical blocks under various failure conditions, so that the backend columnar DB can detect and discard duplicated blocks
• The same solution allows us to verify that blocks are always produced correctly under failure conditions
• This solution has been developed and deployed into production

More Related Content

What's hot

ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovAltinity Ltd
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOAltinity Ltd
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Altinity Ltd
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...Altinity Ltd
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouserpolat
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseAltinity Ltd
 
Materialize: a platform for changing data
Materialize: a platform for changing dataMaterialize: a platform for changing data
Materialize: a platform for changing dataAltinity Ltd
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfAltinity Ltd
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAltinity Ltd
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesAltinity Ltd
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesAltinity Ltd
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareAltinity Ltd
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOAltinity Ltd
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouseAltinity Ltd
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevAltinity Ltd
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouseVianney FOUCAULT
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...HostedbyConfluent
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...Altinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 

What's hot (20)

ClickHouse Keeper
ClickHouse KeeperClickHouse Keeper
ClickHouse Keeper
 
ClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei MilovidovClickHouse Deep Dive, by Aleksei Milovidov
ClickHouse Deep Dive, by Aleksei Milovidov
 
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEOTricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
Tricks every ClickHouse designer should know, by Robert Hodges, Altinity CEO
 
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
Webinar slides: MORE secrets of ClickHouse Query Performance. By Robert Hodge...
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse10 Good Reasons to Use ClickHouse
10 Good Reasons to Use ClickHouse
 
High Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouseHigh Performance, High Reliability Data Loading on ClickHouse
High Performance, High Reliability Data Loading on ClickHouse
 
Materialize: a platform for changing data
Materialize: a platform for changing dataMaterialize: a platform for changing data
Materialize: a platform for changing data
 
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdfDeep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
Deep Dive on ClickHouse Sharding and Replication-2202-09-22.pdf
 
All about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdfAll about Zookeeper and ClickHouse Keeper.pdf
All about Zookeeper and ClickHouse Keeper.pdf
 
ClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic ContinuesClickHouse Materialized Views: The Magic Continues
ClickHouse Materialized Views: The Magic Continues
 
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert HodgesA Fast Intro to Fast Query with ClickHouse, by Robert Hodges
A Fast Intro to Fast Query with ClickHouse, by Robert Hodges
 
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlareClickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
Clickhouse Capacity Planning for OLAP Workloads, Mik Kocikowski of CloudFlare
 
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEOClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
ClickHouse tips and tricks. Webinar slides. By Robert Hodges, Altinity CEO
 
Your first ClickHouse data warehouse
Your first ClickHouse data warehouseYour first ClickHouse data warehouse
Your first ClickHouse data warehouse
 
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander ZaitsevMigration to ClickHouse. Practical guide, by Alexander Zaitsev
Migration to ClickHouse. Practical guide, by Alexander Zaitsev
 
[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse[Meetup] a successful migration from elastic search to clickhouse
[Meetup] a successful migration from elastic search to clickhouse
 
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
Real-time Data Ingestion from Kafka to ClickHouse with Deterministic Re-tries...
 
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
ClickHouse Data Warehouse 101: The First Billion Rows, by Alexander Zaitsev a...
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 

Similar to Real-Time Data Ingestion from Kafka to ClickHouse

Swift container sync
Swift container syncSwift container sync
Swift container syncOpen Stack
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scalejimriecken
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaAngelo Cesaro
 
Apache Kafka - From zero to hero
Apache Kafka - From zero to heroApache Kafka - From zero to hero
Apache Kafka - From zero to heroApache Kafka TLV
 
Kafka zero to hero
Kafka zero to heroKafka zero to hero
Kafka zero to heroAvi Levi
 
Strict-Data-Consistency-in-Distrbuted-Systems-With-Failures
Strict-Data-Consistency-in-Distrbuted-Systems-With-FailuresStrict-Data-Consistency-in-Distrbuted-Systems-With-Failures
Strict-Data-Consistency-in-Distrbuted-Systems-With-FailuresSlava Imeshev
 
Deep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumptionDeep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumptionAlexandre Tamborrino
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Gwen (Chen) Shapira
 
Exactly-once Stream Processing Done Right with Matthias J Sax
Exactly-once Stream Processing Done Right with Matthias J SaxExactly-once Stream Processing Done Right with Matthias J Sax
Exactly-once Stream Processing Done Right with Matthias J SaxHostedbyConfluent
 
Linked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarLinked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarKarthik Ramasamy
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streamsYoni Farin
 
Stateful streaming and the challenge of state
Stateful streaming and the challenge of stateStateful streaming and the challenge of state
Stateful streaming and the challenge of stateYoni Farin
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache KafkaChhavi Parasher
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthNETWAYS
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthNETWAYS
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafkaconfluent
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPCMax Alexejev
 

Similar to Real-Time Data Ingestion from Kafka to ClickHouse (20)

Swift container sync
Swift container syncSwift container sync
Swift container sync
 
Building an Event Bus at Scale
Building an Event Bus at ScaleBuilding an Event Bus at Scale
Building an Event Bus at Scale
 
Fundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache KafkaFundamentals and Architecture of Apache Kafka
Fundamentals and Architecture of Apache Kafka
 
Apache Kafka - From zero to hero
Apache Kafka - From zero to heroApache Kafka - From zero to hero
Apache Kafka - From zero to hero
 
Kafka zero to hero
Kafka zero to heroKafka zero to hero
Kafka zero to hero
 
Strict-Data-Consistency-in-Distrbuted-Systems-With-Failures
Strict-Data-Consistency-in-Distrbuted-Systems-With-FailuresStrict-Data-Consistency-in-Distrbuted-Systems-With-Failures
Strict-Data-Consistency-in-Distrbuted-Systems-With-Failures
 
Deep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumptionDeep dive into Apache Kafka consumption
Deep dive into Apache Kafka consumption
 
Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017Multi-Datacenter Kafka - Strata San Jose 2017
Multi-Datacenter Kafka - Strata San Jose 2017
 
Exactly-once Stream Processing Done Right with Matthias J Sax
Exactly-once Stream Processing Done Right with Matthias J SaxExactly-once Stream Processing Done Right with Matthias J Sax
Exactly-once Stream Processing Done Right with Matthias J Sax
 
Linked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache PulsarLinked In Stream Processing Meetup - Apache Pulsar
Linked In Stream Processing Meetup - Apache Pulsar
 
Real time data pipline with kafka streams
Real time data pipline with kafka streamsReal time data pipline with kafka streams
Real time data pipline with kafka streams
 
Stateful streaming and the challenge of state
Stateful streaming and the challenge of stateStateful streaming and the challenge of state
Stateful streaming and the challenge of state
 
Fundamentals of Apache Kafka
Fundamentals of Apache KafkaFundamentals of Apache Kafka
Fundamentals of Apache Kafka
 
Kafka101
Kafka101Kafka101
Kafka101
 
L6.sp17.pptx
L6.sp17.pptxL6.sp17.pptx
L6.sp17.pptx
 
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 - Monasca - Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland HochmuthOSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
OSMC 2016 | Monasca: Monitoring-as-a-Service (at-Scale) by Roland Hochmuth
 
Apache kafka
Apache kafkaApache kafka
Apache kafka
 
Disaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache KafkaDisaster Recovery Plans for Apache Kafka
Disaster Recovery Plans for Apache Kafka
 
Modern Distributed Messaging and RPC
Modern Distributed Messaging and RPCModern Distributed Messaging and RPC
Modern Distributed Messaging and RPC
 

More from Altinity Ltd

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxAltinity Ltd
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Altinity Ltd
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceAltinity Ltd
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfAltinity Ltd
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfAltinity Ltd
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Altinity Ltd
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Altinity Ltd
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfAltinity Ltd
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsAltinity Ltd
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAltinity Ltd
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache PinotAltinity Ltd
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Ltd
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...Altinity Ltd
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfAltinity Ltd
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...Altinity Ltd
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...Altinity Ltd
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...Altinity Ltd
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...Altinity Ltd
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...Altinity Ltd
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfAltinity Ltd
 

More from Altinity Ltd (20)

Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptxBuilding an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
Building an Analytic Extension to MySQL with ClickHouse and Open Source.pptx
 
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
Cloud Native ClickHouse at Scale--Using the Altinity Kubernetes Operator-2022...
 
Building an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open SourceBuilding an Analytic Extension to MySQL with ClickHouse and Open Source
Building an Analytic Extension to MySQL with ClickHouse and Open Source
 
Fun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdfFun with ClickHouse Window Functions-2021-08-19.pdf
Fun with ClickHouse Window Functions-2021-08-19.pdf
 
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdfCloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
Cloud Native Data Warehouses - Intro to ClickHouse on Kubernetes-2021-07.pdf
 
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
Building High Performance Apps with Altinity Stable Builds for ClickHouse | A...
 
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
Application Monitoring using Open Source - VictoriaMetrics & Altinity ClickHo...
 
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdfOwn your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
Own your ClickHouse data with Altinity.Cloud Anywhere-2023-01-17.pdf
 
ClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom AppsClickHouse ReplacingMergeTree in Telecom Apps
ClickHouse ReplacingMergeTree in Telecom Apps
 
Adventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree EngineAdventures with the ClickHouse ReplacingMergeTree Engine
Adventures with the ClickHouse ReplacingMergeTree Engine
 
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with  Apache Pulsar and Apache PinotBuilding a Real-Time Analytics Application with  Apache Pulsar and Apache Pinot
Building a Real-Time Analytics Application with Apache Pulsar and Apache Pinot
 
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdfAltinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
Altinity Webinar: Introduction to Altinity.Cloud-Platform for Real-Time Data.pdf
 
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
OSA Con 2022 - What Data Engineering Can Learn from Frontend Engineering - Pe...
 
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdfOSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
OSA Con 2022 - Welcome to OSA CON Version 2022 - Robert Hodges - Altinity.pdf
 
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
OSA Con 2022 - Using ClickHouse Database to Power Analytics and Customer Enga...
 
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
OSA Con 2022 - Tips and Tricks to Keep Your Queries under 100ms with ClickHou...
 
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
OSA Con 2022 - The Open Source Analytic Universe, Version 2022 - Robert Hodge...
 
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
OSA Con 2022 - Switching Jaeger Distributed Tracing to ClickHouse to Enable A...
 
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
OSA Con 2022 - Streaming Data Made Easy - Tim Spann & David Kjerrumgaard - St...
 
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdfOSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
OSA Con 2022 - State of Open Source Databases - Peter Zaitsev - Percona.pdf
 

Recently uploaded

Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Andreas Granig
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureDinusha Kumarasiri
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...stazi3110
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样umasea
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesPhilip Schwarz
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanyChristoph Pohl
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesŁukasz Chruściel
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...gurkirankumar98700
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...OnePlan Solutions
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based projectAnoyGreter
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaHanief Utama
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEEVICTOR MAESTRE RAMIREZ
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....kzayra69
 

Recently uploaded (20)

Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
 
Implementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with AzureImplementing Zero Trust strategy with Azure
Implementing Zero Trust strategy with Azure
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
办理学位证(UQ文凭证书)昆士兰大学毕业证成绩单原版一模一样
 
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort ServiceHot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
Hot Sexy call girls in Patel Nagar🔝 9953056974 🔝 escort Service
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte GermanySuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
SuccessFactors 1H 2024 Release - Sneak-Peek by Deloitte Germany
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
(Genuine) Escort Service Lucknow | Starting ₹,5K To @25k with A/C 🧑🏽‍❤️‍🧑🏻 89...
 
Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...Advancing Engineering with AI through the Next Generation of Strategic Projec...
Advancing Engineering with AI through the Next Generation of Strategic Projec...
 
MYjobs Presentation Django-based project
MYjobs Presentation Django-based projectMYjobs Presentation Django-based project
MYjobs Presentation Django-based project
 
React Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief UtamaReact Server Component in Next.js by Hanief Utama
React Server Component in Next.js by Hanief Utama
 
Cloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEECloud Data Center Network Construction - IEEE
Cloud Data Center Network Construction - IEEE
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 

Real-Time Data Ingestion from Kafka to ClickHouse

  • 15. A Naïve Way for Block Aggregator to Replay Messages (1)
  • 16. A Naïve Way for Block Aggregator to Replay Messages (2)
  • 17. Our Solution: Block-Level Deduplication in ClickHouse (1)
    • ClickHouse relies on ZooKeeper to store metadata
    • Each stored block carries a hash value
    • New blocks are checked for hash uniqueness before insertion
    • Blocks are identical (see the sketch below) if they
      • have the same block size,
      • contain the same rows,
      • and keep those rows in the same order
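To make the dedup rule concrete, here is a minimal sketch of a deterministic block hash. It is an illustration only, not ClickHouse's actual hashing code: the point is that the hash covers the block size, every row, and the row order, so a deterministically re-formed block hashes to the same value and is rejected as a duplicate.

```cpp
#include <cstdint>
#include <cstddef>
#include <iostream>
#include <string>
#include <vector>

// Deterministic FNV-1a hash over block size, rows, and row order.
uint64_t blockHash(const std::vector<std::string>& rows) {
    uint64_t h = 1469598103934665603ULL;  // FNV-1a offset basis
    auto mix = [&h](const void* data, size_t len) {
        const auto* p = static_cast<const unsigned char*>(data);
        for (size_t i = 0; i < len; ++i) {
            h ^= p[i];
            h *= 1099511628211ULL;  // FNV-1a prime
        }
    };
    uint64_t n = rows.size();
    mix(&n, sizeof(n));                   // the block size participates
    for (const auto& row : rows)
        mix(row.data(), row.size());      // every row, in order
    return h;
}

int main() {
    std::vector<std::string> a = {"row1", "row2"};
    std::vector<std::string> b = {"row2", "row1"};  // same rows, different order
    std::cout << (blockHash(a) == blockHash(a)) << "\n";  // 1: identical blocks collide
    std::cout << (blockHash(a) == blockHash(b)) << "\n";  // 0: order changes the hash
}
```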
  • 18. Our Solution: Guarantee to Form Identical Blocks (2)
    • Store metadata back to Kafka describing the latest blocks formed for each table
    • In case of failure, the next Block Aggregator that picks up the partition knows exactly how to reconstruct the latest blocks formed for each table by the previous Block Aggregator
    • The two Block Aggregators can be in different ClickHouse replicas, if Kafka partition rebalancing happens
  • 19. The Metadata Structure
    • For each Kafka connector, the metadata persisted to Kafka, per partition, has the form:
      replica-id, [table-name, begin-msg-offset, end-msg-offset, count]+
      Metadata.min = MIN(begin-msg-offset); Metadata.max = MAX(end-msg-offset)
    • Example: replica_1,table1,0,29,20,table2,5,20,10
      • The last block for table1 decided to load to ClickHouse spans offsets [0, 29]; 20 of the consumed messages belong to table1
      • The last block for table2 decided to load to ClickHouse spans offsets [5, 20]; 10 of the consumed messages belong to table2
      • In total, all 30 messages from offset min = 0 to offset max = 29 have been consumed: 20 for table1 and 10 for table2
  • 20. The Metadata Structure for Special Block (see the parsing sketch below)
    • Special block: when begin-msg-offset = end-msg-offset + 1
      • Either no message for the table has an offset less than begin-msg-offset
      • Or every message for the table with an offset less than begin-msg-offset has been received and acknowledged by ClickHouse
    • Example: replica_id,table1,30,29,20,table2,5,20,10
      • All messages with offset less than 30 for table1 are acknowledged by ClickHouse
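The following is a small self-contained sketch of how the metadata string on slides 19 and 20 could be parsed; the helper names (`parseMetadata`, `isSpecialBlock`) are illustrative, not the production code.

```cpp
#include <algorithm>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// One [table-name, begin-msg-offset, end-msg-offset, count] group.
struct TableEntry {
    std::string table;
    long begin = 0, end = 0, count = 0;
};

struct Metadata {
    std::string replicaId;
    std::vector<TableEntry> tables;

    long min() const {  // Metadata.min = MIN(begin-msg-offset)
        long v = tables.front().begin;
        for (const auto& e : tables) v = std::min(v, e.begin);
        return v;
    }
    long max() const {  // Metadata.max = MAX(end-msg-offset)
        long v = tables.front().end;
        for (const auto& e : tables) v = std::max(v, e.end);
        return v;
    }
};

// begin = end + 1 marks a "special block": every message for this table with
// an offset below begin has been acknowledged by ClickHouse (or never existed).
bool isSpecialBlock(const TableEntry& e) { return e.begin == e.end + 1; }

// Parses "replica-id,table,begin,end,count[,table,begin,end,count]...".
// Assumes well-formed input; a production parser would validate each field.
Metadata parseMetadata(const std::string& s) {
    Metadata m;
    std::stringstream ss(s);
    std::string tok;
    std::getline(ss, m.replicaId, ',');
    while (std::getline(ss, tok, ',')) {
        TableEntry e;
        e.table = tok;
        std::getline(ss, tok, ','); e.begin = std::stol(tok);
        std::getline(ss, tok, ','); e.end = std::stol(tok);
        std::getline(ss, tok, ','); e.count = std::stol(tok);
        m.tables.push_back(e);
    }
    return m;
}

int main() {
    Metadata m = parseMetadata("replica_1,table1,0,29,20,table2,5,20,10");
    std::cout << "min=" << m.min() << " max=" << m.max() << "\n";  // min=0 max=29
    Metadata sp = parseMetadata("replica_1,table1,30,29,20");
    std::cout << isSpecialBlock(sp.tables[0]) << "\n";             // 1: special block
}
```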
  • 21. Message Processing Sequence: Consume/Commit/Load
    • The message processing shown here is per partition
  • 22. Two Execution Modes
    • The aggregator starts from the message offset previously committed
    • REPLAY: the aggregator retries sending the last block formed for each table, to avoid data loss
    • CONSUME: the aggregator is done with REPLAY and is in the normal state
    • Mode switching (a runnable version follows this list):
      DetermineState(current_offset, saved_metadata) {
        begin = saved_metadata.min
        end = saved_metadata.max
        if (current_offset > end)
          state = CONSUME
        else
          state = REPLAY
      }
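The mode-switching pseudocode translates almost directly into runnable code; a minimal sketch, with illustrative type and field names:

```cpp
#include <iostream>

enum class Mode { REPLAY, CONSUME };

// MIN(begin-msg-offset) and MAX(end-msg-offset) from the committed metadata.
struct SavedMetadata { long min = 0; long max = 0; };

Mode determineState(long currentOffset, const SavedMetadata& saved) {
    // Past the last committed block: the previous blocks are fully covered,
    // so normal consumption can resume; otherwise replay them first.
    return currentOffset > saved.max ? Mode::CONSUME : Mode::REPLAY;
}

int main() {
    SavedMetadata saved{0, 29};  // the slide-19 example: last blocks span [0, 29]
    std::cout << (determineState(25, saved) == Mode::REPLAY) << "\n";   // 1: must replay
    std::cout << (determineState(30, saved) == Mode::CONSUME) << "\n";  // 1: normal consume
}
```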
  • 23. The Top-Level Processing Loop of a Kafka Connector
    • For each Kafka Connector:
      while (running) {                 // outer loop
        wait for ClickHouse and Kafka to be healthy and connected
        while (running) {               // inner loop
          // Consume loop: append each message to its table's buffer
          batch = read a batch from Kafka
          if error, break inner loop
          for (msg : batch.messages) {
            partitionHandlers[msg.partition].consume(msg)
            if error, break inner loop
          }
          // Check-buffers loop: commit metadata to Kafka, flush blocks to ClickHouse
          for (ph : partitionHandlers) {
            if (ph.state == CONSUME) {
              ph.checkBuffers()
              if error, break inner loop
            }
          }
        }
        disconnect from Kafka
        clear partitionHandlers
      }
    • Each inner-loop iteration must complete within Kafka's max_poll_interval
  • 24. Some Clarifications
    • Partition handlers can be dynamically created or deleted due to the Kafka broker's decisions
      • Under some failure conditions, one Kafka Connector can have more than one partition assigned
    • A partition handler performs metadata commits on its corresponding partition
    • Each partition handler can process multiple tables (because a Kafka partition can carry messages for multiple tables)
    • At any given time, each partition handler can have only one in-flight block, per table, to be inserted to ClickHouse (sketched below)
      • No new block can be submitted until the current in-flight block gets a successful ACK from ClickHouse
      • Thus, the metadata committed is just one block per table ahead, i.e., "Write-Ahead Logging with One Block"
      • In other words, when replay happens, at most one block per table needs to be replayed
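A minimal sketch of the "one in-flight block per table" rule, with hypothetical types (`Block`, `TableFlusher`) standing in for the real partition-handler state. Note the write-ahead ordering: metadata is committed to Kafka before the block is flushed, matching the protocol described on slide 28.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

struct Block {
    std::string table;
    long begin = 0, end = 0;          // message offset range covered by the block
    std::vector<std::string> rows;
};

class TableFlusher {
    bool inFlight = false;            // at most one un-acked block per table
public:
    void submit(const Block& block) {
        if (inFlight)
            throw std::logic_error("previous block not yet acknowledged");
        commitMetadataToKafka(block); // write-ahead: commit metadata first...
        inFlight = true;
        flushToClickHouse(block);     // ...then flush the block itself
    }
    void onAck() { inFlight = false; } // a ClickHouse ACK frees the slot

private:
    // Placeholders for the real Kafka commit and ClickHouse insert paths.
    void commitMetadataToKafka(const Block&) {}
    void flushToClickHouse(const Block&) {}
};
```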
  • 25. Some Clarifications (cont'd)
    • If block insertion to ClickHouse fails:
      • The outermost loop disconnects the Kafka Connector from the Kafka broker
      • Kafka consumer group rebalancing is triggered automatically
      • A different replica's Kafka Connector is assigned the partition, and block insertion continues at this new replica
      • Thus, rebalancing provides "Global Retries with Last Committed State" over multiple replicas
    • The same failure-handling mechanism applies, for example, when a metadata commit to Kafka fails
    • Thus, Kafka consumer group rebalancing is an indicator of a failure that cannot be recovered within a single block aggregator
  • 26. Example of Partition Rebalancing across Replicas
    • The following diagram shows two aggregators in one shard being killed (to simulate one datacenter going down); block insertion traffic gets picked up by the two remaining aggregators in the same shard
  • 27. The Outline of the Talk
    • The block aggregator developed for multi-DC deployment
    • The deterministic message replay protocol in block aggregator
    • The runtime verifier as a monitoring/debugging tool for block aggregator
    • Issues and experiences in block aggregator's implementation and deployment
    • The block aggregator deployment in production
  • 28. Runtime Verification
    • Aggregator Verifier (AV): checks that the blocks flushed by all aggregators to ClickHouse cause no data loss or duplication
    • How does AV know which blocks the aggregators flushed?
      • Each aggregator commits metadata to Kafka before flushing anything to ClickHouse, for each partition
      • All metadata records committed by the aggregators are appended to an internal Kafka topic called __consumer_offsets
      • Thus, AV subscribes to this topic and learns about all blocks flushed to ClickHouse by all aggregators
  • 29. Runtime Verification Algorithm
    • Let M.t.start and M.t.end be the start offset and end offset for table t in metadata M, respectively
    • For any given metadata instances M and M', where M was committed before M' in time:
      • Backward Anomaly: for some table t, M'.t.end < M.t.start
      • Overlap Anomaly: for some table t, M.t.start < M'.t.end AND M'.t.start < M.t.end
  • 30. Runtime Verifier Implementation (see the sketch below)
    • The verifier reads metadata instances in their commit order to Kafka, stored in the system topic __consumer_offsets
    • __consumer_offsets is a partitioned topic, and Kafka does not guarantee ordering across partitions
    • We therefore order metadata instances by their commit timestamps at the brokers
      • This requires the Kafka brokers' clocks to be synchronized with an uncertainty window smaller than the time between two metadata commits; thus, we should not commit metadata to Kafka too frequently
    • This is not a problem for the block aggregator: it commits metadata once per block, every several seconds, which is infrequent relative to the clock skew
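Putting slides 29 and 30 together, a self-contained sketch of the verifier's core check might look like the following. The record layout and names are illustrative, and a production verifier would stream records from __consumer_offsets rather than take a vector; the anomaly predicates follow the slide-29 definitions verbatim.

```cpp
#include <algorithm>
#include <iostream>
#include <map>
#include <string>
#include <vector>

struct TableRange { long start, end; };   // [M.t.start, M.t.end]

struct MetadataRecord {
    long commitTimestampMs;                    // broker-side commit time
    std::map<std::string, TableRange> tables;  // per-table offset range
};

void verify(std::vector<MetadataRecord> records) {
    // Order by broker commit timestamp, since Kafka gives no cross-partition order.
    std::sort(records.begin(), records.end(),
              [](const MetadataRecord& a, const MetadataRecord& b) {
                  return a.commitTimestampMs < b.commitTimestampMs;
              });
    // Check every earlier record M against every later record M', per table.
    for (size_t i = 0; i < records.size(); ++i)
        for (size_t j = i + 1; j < records.size(); ++j)
            for (const auto& [table, m] : records[i].tables) {
                auto it = records[j].tables.find(table);
                if (it == records[j].tables.end()) continue;
                const TableRange& mp = it->second;          // M' (later record)
                if (mp.end < m.start)                       // M'.t.end < M.t.start
                    std::cout << "backward anomaly on " << table << "\n";
                else if (m.start < mp.end && mp.start < m.end)
                    std::cout << "overlap anomaly on " << table << "\n";
            }
}

int main() {
    verify({{1000, {{"table1", {0, 29}}}},
            {2000, {{"table1", {30, 59}}}},    // clean: no anomaly reported
            {3000, {{"table1", {40, 70}}}}});  // overlaps [30, 59]: flagged
}
```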
  • 31. The Outline of the Talk
    • The block aggregator developed for multi-DC deployment
    • The deterministic message replay protocol in block aggregator
    • The runtime verifier as a monitoring/debugging tool for block aggregator
    • Issues and experiences in block aggregator's implementation and deployment
    • The block aggregator deployment in production
  • 32. Compile and Link ClickHouse into Block Aggregator
    • Instead of using the C++ client library from the ClickHouse repo, we compiled and linked the entire ClickHouse codebase into the block aggregator
    • This lets us leverage the native ClickHouse implementation:
      • Native TCP/IP communication protocol (with TLS and connection pooling)
      • Select query capabilities just like clickhouse-client (for testing purposes)
      • Table schema retrieval, and block header construction from the schema
      • Column construction from protobuf-based Kafka message deserialization
      • Column default-expression evaluation
      • ZooKeeper client for distributed locking
  • 33. Dynamic Table Schema Update
    • To dynamically update a table schema:
      • Step 1: The table schema is updated on each ClickHouse shard
      • Step 2: The block aggregators in each shard are restarted, so they load the updated schema from the co-located ClickHouse replica
      • Step 3: After offline confirmation of the schema update, the client application updates its logic to produce new Kafka messages following the updated schema
    • Requirement: the block aggregator must be able to deserialize Kafka messages into blocks whether or not the messages follow the updated schema
    • Solution: enforce that columns in a table schema can only be added, and cannot be deleted afterwards (see the sketch below)
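One possible compatibility check consistent with the add-only rule; this is a hypothetical helper, and it is somewhat stricter than the rule as stated, in that it also requires the old columns to appear unchanged, in order, as a prefix of the new column list.

```cpp
#include <algorithm>
#include <iostream>
#include <string>
#include <utility>
#include <vector>

// A column is identified here by (name, type).
using Column = std::pair<std::string, std::string>;

// A new schema version is accepted only if it keeps every old column
// unchanged and in place, and at most appends new columns at the end.
bool isCompatibleUpgrade(const std::vector<Column>& oldSchema,
                         const std::vector<Column>& newSchema) {
    if (newSchema.size() < oldSchema.size()) return false;  // columns deleted
    return std::equal(oldSchema.begin(), oldSchema.end(), newSchema.begin());
}

int main() {
    std::vector<Column> v1 = {{"id", "UInt64"}, {"name", "String"}};
    std::vector<Column> v2 = {{"id", "UInt64"}, {"name", "String"}, {"ts", "DateTime"}};
    std::cout << isCompatibleUpgrade(v1, v2) << "\n";  // 1: column appended
    std::cout << isCompatibleUpgrade(v2, v1) << "\n";  // 0: column removed
}
```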
  • 34. Multiple ZooKeeper Clusters for One ClickHouse Cluster
    • ClickHouse relies on ZooKeeper as its metadata store and for replication coordination
    • Each block insertion takes roughly 15 remote calls to the ZooKeeper server cluster
      • Block insertion is performed per table
    • Our ZooKeeper cluster (version 3.5.8) is deployed across three datacenters with ~20 ms cross-datacenter communication latency
    • For a large ClickHouse cluster with 250 shards (each shard having 4 replicas), a single ZooKeeper deployment can introduce a high ZooKeeper "hardware exception" rate
      • The exceptions are due to ZooKeeper sessions frequently expiring
    • Multiple ZooKeeper clusters are deployed instead, each allocated a subset of the ClickHouse shards
      • In our deployment, 50 shards share one ZooKeeper cluster
      • The right ratio depends on the block insertion rate per table, and the total number of tables involved in real-time insertion
  • 35. Distributed Locking at Block Aggregator
    • Before "insert_quorum_parallel" was introduced in ClickHouse:
      • In each shard, for each table, only one replica is allowed to perform data insertion
      • Distributed locking is used to coordinate block insertion across block aggregators
      • The ZooKeeper locking implementation in ClickHouse is used
    • More recent ClickHouse versions have "insert_quorum_parallel" (default: true)
      • According to the Altinity blog article, the current ClickHouse implementation breaks sequential consistency and may have other side effects
      • In our recent product release based on ClickHouse 21.8, we turned this option off
      • And we still enforce distributed locking at the block aggregator
  • 36. Testing on Block Aggregator
    • Resiliency testing (in an 8-shard cluster with 32 replicas)
      • Follows the "Chaos Monkey" approach
      • Kill individual processes and individual containers, across ZooKeeper, ClickHouse, and Block Aggregator
      • Kill all processes and containers in one datacenter, across ZooKeeper, ClickHouse, and Block Aggregator
      • Validate that data loading recovers and continues
    • Smaller-scale integration testing
      • The whole cluster runs on a single machine, with multiple processes for ZooKeeper, ClickHouse, and Block Aggregators
      • Programmatically control process start/stop, along with small-table insertion
      • In addition, turn on fault injection at predefined points in the Block Aggregator code; for example, deliberately stop accepting Kafka messages for 10 seconds (see the sketch after this list)
      • Validate whether data loss or data duplication happens
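A hypothetical fault-injection hook of the kind described above, gated by an environment variable; the names are illustrative, not from the actual codebase.

```cpp
#include <chrono>
#include <cstdlib>
#include <string>
#include <thread>

// Returns true when the named fault is enabled via an environment variable,
// e.g. FAULT_PAUSE_CONSUME=1 (an assumed convention for this sketch).
bool faultEnabled(const std::string& name) {
    return std::getenv(("FAULT_" + name).c_str()) != nullptr;
}

// Placed at a predefined point in the consume path: deliberately stop
// accepting Kafka messages for 10 seconds, as in the slide's example, to
// exercise replay and consumer group rebalancing.
void maybeInjectPauseConsume() {
    if (faultEnabled("PAUSE_CONSUME"))
        std::this_thread::sleep_for(std::chrono::seconds(10));
}

int main() {
    maybeInjectPauseConsume();  // sleeps only when FAULT_PAUSE_CONSUME is set
}
```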
  • 37. ClickHouse Troubleshooting and Remediation
    • The setting "insert_quorum = 2" guarantees high data reliability
    • A ClickHouse exception (error code 286) can happen occasionally:
      2021.04.10 16:26:38.896509 [ 59963 ] {8421e4d6-43f0-4792-8570-7ef2bf8f595a} <Error> executeQuery: Code: 286,
      e.displayText() = DB::Exception: Quorum for previous write has not been satisfied yet. Status:
        version: 1
        part_name: 20210410-0_990_990_0
        required_number_of_replicas: 2
        actual_number_of_replicas: 1
        replicas: SLC-74137
    • Data insertion in the whole shard stops when this exception happens!
  • 38. ClickHouse Troubleshooting and Remediation (cont'd)
    • An in-house tool was developed to:
      • Scan the ZooKeeper subtree associated with the log replication queues
      • Inspect why queued commands cannot be performed
    • Once the queued commands all get cleared, the quorum automatically becomes satisfied, and data insertion resumes in the shard
    • Real-time alerts are defined for:
      • A long duration in which a shard has no block insertion
      • Block insertion experiencing a non-zero failure rate with error code 286
      • Replicas whose replication queues grow too large
  • 39. The Outline of the Talk
    • The block aggregator developed for multi-DC deployment
    • The deterministic message replay protocol in block aggregator
    • The runtime verifier as a monitoring/debugging tool for block aggregator
    • Issues and experiences in block aggregator's implementation and deployment
    • The block aggregator deployment in production
  • 40. Block Aggregator Deployment in Production
    • One example deployment
      • Kafka clusters: 2 datacenters
      • The ClickHouse cluster: 2 datacenters, 250 shards, each shard having 4 replicas (2 replicas per DC), with one aggregator co-located in each replica

      Metric                                      Measured Result
      Total messages processed/sec (peak)         280 K
      Total message bytes processed/sec (peak)    220 MB/sec
      95%-tile block insertion time (quorum=2)    3.8 sec (table 1), 1.1 sec (table 2), 4.0 sec (table 3)
      95%-tile block size                         0.16 MB (table 1), 0.03 MB (table 2), 0.46 MB (table 3)
      95%-tile number of rows in a block          1358 rows (table 1), 1.8 rows (table 2), 1894 rows (table 3)
      95%-tile Kafka commit time                  64 ms
      End-to-end message consumption lag time     < 30 sec
  • 41. Block Aggregator Deployment in Production
    • The block insertion rate at the shard level in a 24-hour window
  • 42. Block Aggregator Deployment in Production
    • The message consumption lag time at the shard level captured in a 24-hour window
  • 43. Block Aggregator Deployment in Production
    • The Kafka consumer group rebalance rate at the shard level in a 24-hour window (always 0)
  • 44. Block Aggregator Deployment in Production
    • The ZooKeeper hardware exception rate in a 24-hour window (close to 0)
  • 45. Summary
    • Using a streaming platform like Kafka is a standard way to transfer data across data processing systems
    • For a columnar DB, block loading is more efficient than loading individual records
    • Under failure conditions, replaying Kafka messages may cause data loss or data duplication at block loaders
    • Our solution deterministically produces identical blocks under various failure conditions, so that the backend columnar DB can detect and discard duplicate blocks
    • The same solution allows us to verify that blocks are always produced correctly under failure conditions
    • This solution has been developed and deployed into production