How is Kafka so Fast?

I explain the key points of Kafka's design that make it a very fast and reliable distributed message bus, even on mechanical disks.

How is Kafka so fast?
Understanding the design of Kafka and how it handles the Criteo workload
Ricardo Paiva and Hervé Rivière

What is Kafka?
Apache Kafka is a distributed message queue
• Open-sourced by LinkedIn in 2011
• High-throughput
• Highly distributed
• Fault-tolerant
• Low-latency

Kafka @ Criteo
• Use cases:
  • GLUP pipeline (aka Kafka Local)
  • Streaming event processing platform (aka Kafka Stream)
• Some figures:
  • 14 clusters / 200 servers / 7 data centers
  • Up to 7 million messages/sec
  • Up to 150 TB processed per day

Topics, Partitions and Offsets
[Figure: Topic A has two partitions (0 and 1) and Topic B has four (0 to 3). Each partition is an ordered sequence of records numbered by offset, from old (offset 0) to new; writes are appended at the new end.]
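
The topic / partition / offset model on this slide can be sketched with a small in-memory structure. This is only an illustration, not Kafka's API: the class `TopicLogSketch` and its methods are hypothetical names, and each partition is modelled as a plain list whose index plays the role of the offset.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of topics, partitions and offsets (not Kafka's API).
class TopicLogSketch {
    // topic name -> partitions; each partition is an append-only list of raw records
    private final Map<String, List<List<byte[]>>> topics = new HashMap<>();

    void createTopic(String name, int partitionCount) {
        List<List<byte[]>> partitions = new ArrayList<>();
        for (int i = 0; i < partitionCount; i++) {
            partitions.add(new ArrayList<>());
        }
        topics.put(name, partitions);
    }

    // Appends a record at the "new" end and returns the offset it was assigned.
    long append(String topic, int partition, byte[] record) {
        List<byte[]> log = topics.get(topic).get(partition);
        log.add(record);
        return log.size() - 1;
    }

    // Reads the record stored at an offset; records already written are never modified.
    byte[] read(String topic, int partition, long offset) {
        return topics.get(topic).get(partition).get((int) offset);
    }

    public static void main(String[] args) {
        TopicLogSketch cluster = new TopicLogSketch();
        cluster.createTopic("A", 2); // Topic A: two partitions, as on the slide
        cluster.createTopic("B", 4); // Topic B: four partitions

        long first = cluster.append("A", 0, "click-1".getBytes(StandardCharsets.UTF_8));
        long second = cluster.append("A", 0, "click-2".getBytes(StandardCharsets.UTF_8));
        long other = cluster.append("A", 1, "click-3".getBytes(StandardCharsets.UTF_8));

        // Offsets grow independently inside each partition.
        System.out.println(first + " " + second + " " + other);
        System.out.println(new String(cluster.read("A", 0, 1), StandardCharsets.UTF_8));
    }
}
```

Offsets are only meaningful inside one partition, which is why ordering is guaranteed per partition and not across a whole topic.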

Brokers
• Manage partitions
  • Receive records from producers for a (topic, partition)
  • Answer consumers asking for records at a (topic, partition, offset)
• Manage replicas
• Manage consumer coordination
  • Assign the right partitions to the right consumers
[Figure: Producer, Broker 1, Broker 2 and two consumers. The consumers send requests such as Fetch (Topic A, Partition 4, Offset 10) and Fetch (Topic B, Partition 1, Offset 10) and receive bytes in response.]

Producers
[Figure: Producer, Broker (partition leader) and two Broker (replica) boxes, with an ack returned to the producer.]
• Producers decide which partition to send to;
• Producers can send a batch of messages;
• Producers can compress a batch;
• Producers wait for an acknowledgement from the partition leader only (acks=1) or from the leader and its in-sync replicas (acks=all).
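
As a rough sketch of what these four points look like on the client side, the block below picks a partition from a record key and builds a producer configuration. The property names (`acks`, `compression.type`, `batch.size`, `linger.ms`, `bootstrap.servers`) are standard Kafka producer settings; everything else is an assumption made for illustration: the class name, the broker address, the chosen values, and the use of a plain array hash in place of the hash the real client applies to keys.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.Properties;

// Hypothetical sketch of client-side producer decisions (not the Kafka client itself).
class ProducerSettingsSketch {
    // The producer, not the broker, decides the partition: same key -> same partition.
    // Assumption: Arrays.hashCode stands in for the real client's key hash.
    static int choosePartition(String key, int partitionCount) {
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        return Math.floorMod(hash, partitionCount);
    }

    // Settings that map to the bullets above: batching, compression and acks.
    static Properties producerConfig() {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker1:9092"); // hypothetical address
        props.setProperty("acks", "all");             // wait for leader + in-sync replicas
        props.setProperty("compression.type", "lz4"); // compress whole batches on the client
        props.setProperty("batch.size", "65536");     // illustrative batch size in bytes
        props.setProperty("linger.ms", "10");         // illustrative wait to fill a batch
        return props;
    }

    public static void main(String[] args) {
        System.out.println("partition for user-42: " + choosePartition("user-42", 4));
        System.out.println("acks = " + producerConfig().getProperty("acks"));
    }
}
```

Because batches are compressed as a unit, larger batches generally compress better and need fewer acknowledgements, at the cost of a little extra latency while the batch fills.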

Consumers
[Figure: a consumer and a broker holding partition 2, shown with offsets 6 to 12. In three numbered steps the consumer asks for records from partition 2 by offset, receives records 7, 8 and 9, then commits offset 9.]
• Consumers control which offset to consume from;
• Consumers commit offsets to Kafka, which stores them in just another Kafka topic;
• Consumers can receive batched and/or compressed data;
• Kafka coordinates which partitions each consumer will consume from.
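
The pull model can be illustrated with a toy consumer that owns its read position. This is a sketch under stated assumptions, not the Kafka consumer API: `ConsumerPullSketch` and its methods are hypothetical, the partition is a plain in-memory list, and the committed value is stored as the next offset to read (the convention the Java client uses), whereas the diagram labels the commit with the last record received.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical consumer that pulls from one partition and tracks its own offsets.
class ConsumerPullSketch {
    private final List<String> partitionLog; // stand-in for one partition on a broker
    private long position;                   // next offset this consumer will read
    private long committed;                  // offset to resume from after a restart

    ConsumerPullSketch(List<String> partitionLog, long startOffset) {
        this.partitionLog = partitionLog;
        this.position = startOffset;
        this.committed = startOffset;
    }

    // Pulls up to maxRecords starting at the consumer's own position.
    List<String> poll(int maxRecords) {
        int from = (int) Math.min(position, partitionLog.size());
        int to = Math.min(from + maxRecords, partitionLog.size());
        List<String> batch = new ArrayList<>(partitionLog.subList(from, to));
        position = to;
        return batch;
    }

    // The consumer is free to rewind or skip ahead; the log itself is unchanged.
    void seek(long offset) {
        position = offset;
    }

    // Records progress; in Kafka this would be a write to an internal topic.
    void commit() {
        committed = position;
    }

    long committedOffset() {
        return committed;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        for (int i = 0; i < 13; i++) {
            log.add("r" + i);
        }
        ConsumerPullSketch consumer = new ConsumerPullSketch(log, 7);
        System.out.println(consumer.poll(3));           // [r7, r8, r9]
        consumer.commit();
        System.out.println(consumer.committedOffset()); // 10
        consumer.seek(0);                               // replay from the start
        System.out.println(consumer.poll(2).equals(Arrays.asList("r0", "r1"))); // true
    }
}
```

Since the broker keeps no per-message delivery state, replaying old data is just a matter of the consumer choosing an earlier offset.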

Only sequential I/O
• Each Kafka partition is mapped to segment files
• Segment file: append-only log structure
• Records are immutable
• The broker does very few random disk seeks
[Figure: Kafka writes to the active segment file; the old segment files are shown next to it.]
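
To make the append-only idea concrete, here is a minimal sketch of a segment file written with sequential appends only. It is an illustration under assumptions, not Kafka's on-disk format: the class `SegmentFileSketch`, the 4-byte length prefix and the temp file name are all invented for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical append-only segment: records are length-prefixed and never rewritten.
class SegmentFileSketch {
    // Appends one record at the end of the file and returns its starting byte position.
    static long append(FileChannel segment, byte[] record) throws IOException {
        long start = segment.size();
        ByteBuffer buf = ByteBuffer.allocate(4 + record.length);
        buf.putInt(record.length);
        buf.put(record);
        buf.flip();
        while (buf.hasRemaining()) {
            segment.write(buf); // channel is opened in APPEND mode: always writes at the end
        }
        return start;
    }

    // Reads back the record that starts at the given byte position.
    static byte[] readAt(FileChannel segment, long position) throws IOException {
        ByteBuffer header = ByteBuffer.allocate(4);
        readFully(segment, header, position);
        header.flip();
        ByteBuffer body = ByteBuffer.allocate(header.getInt());
        readFully(segment, body, position + 4);
        return body.array();
    }

    private static void readFully(FileChannel ch, ByteBuffer buf, long position) throws IOException {
        while (buf.hasRemaining()) {
            if (ch.read(buf, position + buf.position()) < 0) {
                throw new IOException("unexpected end of segment");
            }
        }
    }

    // Writes two records, then reads the second one back by position.
    static String demo() {
        try {
            Path path = Files.createTempFile("segment-", ".log");
            try {
                long second;
                try (FileChannel out = FileChannel.open(path,
                        StandardOpenOption.WRITE, StandardOpenOption.APPEND)) {
                    append(out, "first".getBytes(StandardCharsets.UTF_8));
                    second = append(out, "second".getBytes(StandardCharsets.UTF_8));
                }
                try (FileChannel in = FileChannel.open(path, StandardOpenOption.READ)) {
                    return second + " -> " + new String(readAt(in, second), StandardCharsets.UTF_8);
                }
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo()); // 9 -> second
    }
}
```

Writes only ever touch the end of the active file, so the disk head rarely has to move, which is what makes spinning disks viable for this workload.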

Caching data for free
• Kafka relies on the native Linux page cache (read-ahead and write-behind)
• JVM off-heap cache for free
• Kafka records aren’t deserialized in the Kafka JVM
• No Java object memory overhead
• No OutOfMemory issue
• No big GC pauses
[Figure: Kafka, the active segment file, the OS, the disk and the old segment files.]

Reliability with replication
• Kafka disk writes are asynchronous
• Kafka replica synchronisation (over the network) is synchronous
• Trust replicas in case of data corruption / server crash
[Figure: Broker (partition leader), Broker (replica) and Broker (previous leader).]

Sending data from file to network (traditional approach)
read(file, tmp_buf, len);
write(socket, tmp_buf, len);
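
The two calls above can be written out in Java roughly as follows. This is a sketch under assumptions: `BufferedCopySketch` is a hypothetical name, the 8 KB buffer size is arbitrary, and an in-memory channel stands in for the network socket so the example runs without a network.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical "traditional" copy: every byte passes through a buffer in the application.
class BufferedCopySketch {
    static long copy(FileChannel file, WritableByteChannel socket) throws IOException {
        ByteBuffer tmpBuf = ByteBuffer.allocate(8192); // user-space temporary buffer
        long total = 0;
        while (file.read(tmpBuf) != -1) {       // kernel -> application buffer
            tmpBuf.flip();
            while (tmpBuf.hasRemaining()) {
                total += socket.write(tmpBuf);  // application buffer -> kernel again
            }
            tmpBuf.clear();
        }
        return total;
    }

    // Copies a 20,000-byte temp file into an in-memory sink standing in for a socket.
    static int demo() {
        try {
            Path path = Files.createTempFile("payload-", ".bin");
            try {
                Files.write(path, new byte[20_000]);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel file = FileChannel.open(path, StandardOpenOption.READ);
                     WritableByteChannel socket = Channels.newChannel(sink)) {
                    copy(file, socket);
                }
                return sink.size();
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() + " bytes copied through the temporary buffer");
    }
}
```

Each chunk crosses the kernel/user boundary twice here, once on the read and once on the write.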

Sending data from file to network (zero-copy approach)
transferTo(position, count, writableChannel);
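
A fuller version of this call, using Java NIO's `FileChannel.transferTo`, might look like the sketch below. The class name `ZeroCopySketch` is hypothetical, and an in-memory channel again stands in for the socket so the example is self-contained. With such a stand-in the JVM may well copy internally; whether the kernel really skips the intermediate copies depends on the operating system and on the target being a channel type, such as a socket, that supports direct transfer.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.channels.Channels;
import java.nio.channels.FileChannel;
import java.nio.channels.WritableByteChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical sender that hands the file-to-channel transfer over to the JVM and OS.
class ZeroCopySketch {
    static long send(FileChannel file, WritableByteChannel target) throws IOException {
        long position = 0;
        long size = file.size();
        // transferTo may move fewer bytes than requested, so loop until done.
        while (position < size) {
            position += file.transferTo(position, size - position, target);
        }
        return position;
    }

    // Sends a 20,000-byte temp file to an in-memory sink standing in for a socket.
    static int demo() {
        try {
            Path path = Files.createTempFile("payload-", ".bin");
            try {
                Files.write(path, new byte[20_000]);
                ByteArrayOutputStream sink = new ByteArrayOutputStream();
                try (FileChannel file = FileChannel.open(path, StandardOpenOption.READ);
                     WritableByteChannel target = Channels.newChannel(sink)) {
                    send(file, target);
                }
                return sink.size();
            } finally {
                Files.deleteIfExists(path);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo() + " bytes sent with transferTo");
    }
}
```

Note that the application never allocates a buffer for the payload: it only names the byte range to send, which fits a broker that stores and forwards opaque bytes.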

Key takeaways
• Parallelism based on topic partitions;
• Data compressed/decompressed on the client;
• Producers send batches of messages;
• No serialization/deserialization costs on the brokers;
• Writing directly to file:
  • Append only (cheaper disks);
  • No complex data structure (no B-tree or LSM tree);
• Uses OS memory management;
• Relies on replicas, not on disks;
• Zero-copy.

How is Kafka so Fast?

1. How is Kafka so fast? Understanding the design of Kafka and how it handles the Criteo workload. Ricardo Paiva and Hervé Rivière

2. What is Kafka?

3. What is Kafka? Apache Kafka is a distributed message queue • Open-sourced by LinkedIn in 2011 • High-throughput • Highly distributed • Fault-tolerant • Low-latency

4. Kafka @ Criteo • Use cases: GLUP pipeline (aka Kafka Local), streaming event processing platform (aka Kafka Stream) • Some figures: 14 clusters / 200 servers / 7 data centers, up to 7 million messages/sec, up to 150 TB processed per day

5. Topics, Partitions and Offsets [Figure: Topic A with partitions 0 and 1, Topic B with partitions 0 to 3; in each partition offsets run from old to new and writes are appended at the new end]

6. Complexity inside the clients

7. Brokers • Manage partitions: receive records from producers for a (topic, partition); answer consumers asking for records at a (topic, partition, offset) • Manage replicas • Manage consumer coordination: assign the right partitions to the right consumers [Figure: Producer, Broker 1, Broker 2 and two consumers; the consumers send Fetch (Topic A, Partition 4, Offset 10) and Fetch (Topic B, Partition 1, Offset 10) and receive bytes]

8. Producers • Producers decide which partition to send to; • Producers can send a batch of messages; • Producers can compress a batch; • Producers wait for an acknowledgement from the partition leader only (acks=1) or from the leader and its in-sync replicas (acks=all) [Figure: Producer, Broker (partition leader), two Broker (replica) boxes, ack]

9. Consumers • Consumers control which offset to consume from; • Consumers commit offsets to Kafka, which stores them in just another Kafka topic; • Consumers can receive batched and/or compressed data; • Kafka coordinates which partitions each consumer will consume from. [Figure: the consumer asks the broker for records from partition 2 by offset, receives records 7, 8 and 9, then commits offset 9]

10. Did you say SSD is better than HDD?

11. Faster, but not that much

12. Only sequential I/O • Each Kafka partition is mapped to segment files • Segment file: append-only log structure • Records are immutable • The broker does very few random disk seeks [Figure: Kafka, the active segment file and the old segment files]

13. Caching data for free • Kafka relies on the native Linux page cache (read-ahead and write-behind) • JVM off-heap cache for free • Kafka records aren’t deserialized in the Kafka JVM • No Java object memory overhead • No OutOfMemory issue • No big GC pauses [Figure: Kafka, the active segment file, the OS, the disk and the old segment files]

14. Reliability with replication • Kafka disk writes are asynchronous • Kafka replica synchronisation (over the network) is synchronous • Trust replicas in case of data corruption / server crash [Figure: Broker (partition leader), Broker (replica), Broker (previous leader)]

15. Zero Copy

16. Sending data from file to network (traditional approach): read(file, tmp_buf, len); write(socket, tmp_buf, len);

17. Sending data from file to network (zero-copy approach): transferTo(position, count, writableChannel);

18. Make things simple

19. Key takeaways • Parallelism based on topic partitions; • Data compressed/decompressed on the client; • Producers send batches of messages; • No serialization/deserialization costs on the brokers; • Writing directly to file: append only (cheaper disks); no complex data structure (no B-tree or LSM tree); • Uses OS memory management; • Relies on replicas, not on disks; • Zero-copy.

20. Thank you! #rivers

Editor's Notes

Quick introduction of each presenter, then a short agenda: first the Kafka basics, then the design choices that made it a great tool at our scale.
Why this name: simply because the initial creator (Jay Kreps) liked the author Franz Kafka, liked the fact that he was a writer, and thought it was a good name for an open-source project.
A topic is like a table in a database, but for a queue; in a queue we call it a topic. You send messages to the bid request topic and you receive messages from the billable click topic. A partition is a section of a topic: here topic A has two partitions and topic B has four. Partitions are spread over different servers, but a single partition always fits on one server. For example, topics can be bid request and billable click, and bid request has 1,000 partitions. Ordering is guaranteed only inside a partition. Each message has a monotonically increasing offset. Focus on: Kafka just stores bytes, with no schema, so you can send an image through Kafka if you want (not a wonderful idea, but it works).
The first thing we want to explain is that the complexity is not in the server but in the client.
Producer and consumer. The broker is dumb. Differences from RabbitMQ or other queues: you can have a huge queue if you want (cf. an event sourcing store), the only limit is disk; the broker doesn't track the status of a message (whether it was received or not); and consumers pull, the broker does not push. You can group consumers together to create a consumer group, and so a distributed application. The broker manages the coordination of consumers to assign the right partitions to the right consumers.
Focus on: no SPOF and no broker acting as a gateway for the cluster; the producer maintains the mapping (topic, partition) -> broker. A batch is only logical: one physical message (one send request / ack) contains several messages. Batch advantages: compression is efficient and network acks are efficient, for instance one ack per 1,000 messages.
Warning: consumers receive compressed batches only if the producer sent them that way.
Cost efficiency + highest performance. The advantage here is being able to use JBOD or RAID. SSDs would cost more for equal or even lower performance.
- Same cache system as Varnish (HTTP cache server)
- Designed to work with Linux only
- A 4 GB heap is enough because it holds no data (it only manages metadata and client connections)
Disk is async (and it's ok because network is sync)
