3. 3 •
Apache Kafka is a distributed message queue
• Open-sourced by LinkedIn in 2011
• High-throughput
• Highly distributed
• Fault-tolerant
• Low-latency
What is Kafka?
4. 4 •
• Use cases
• GLUP pipeline (aka Kafka Local)
• Streaming event processing platform (aka Kafka Stream)
• Some figures:
• 14 clusters / 200 servers / 7 DCs
• Up to 7 million messages / sec
• Up to 150 TB processed per day
Kafka @ Criteo ?
12. 12 •
• Each Kafka partition is mapped to segment files
• Segment file: append-only log structure
• Records are immutable
• The broker does very few random disk seeks
Only sequential I/O
[Diagram: Kafka, the active segment file, and the old segment files]
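To make the segment-file idea concrete, here is a minimal sketch of an append-only log split into segments. It is a toy model, not Kafka's on-disk format: the `SegmentedLog` class, the 4-byte length prefix and the tiny `max_segment_bytes` value are all assumptions chosen for illustration. Writes only ever go to the end of the active segment, and reads scan a segment sequentially.

```python
import os
import tempfile


class SegmentedLog:
    """Toy append-only log for one partition, split into segment files."""

    def __init__(self, directory, max_segment_bytes=64):
        self.directory = directory
        self.max_segment_bytes = max_segment_bytes  # hypothetical, tiny on purpose
        self.next_offset = 0
        self.segments = []  # base offsets of the segment files, oldest first
        self.active = None
        self._roll()

    def _path(self, base_offset):
        # Each segment is named after the offset of its first record.
        return os.path.join(self.directory, f"{base_offset:020d}.log")

    def _roll(self):
        # The active segment becomes an old segment; a new active one is opened.
        if self.active is not None:
            self.active.close()
        self.segments.append(self.next_offset)
        self.active = open(self._path(self.next_offset), "ab")

    def append(self, payload: bytes) -> int:
        # Records are only ever written at the end of the active segment.
        size = self.active.tell()
        if size > 0 and size + 4 + len(payload) > self.max_segment_bytes:
            self._roll()
        self.active.write(len(payload).to_bytes(4, "big") + payload)
        self.active.flush()
        offset = self.next_offset
        self.next_offset += 1
        return offset

    def read(self, offset: int) -> bytes:
        if not 0 <= offset < self.next_offset:
            raise IndexError(offset)
        # Pick the segment with the largest base offset <= offset,
        # then scan it sequentially up to the requested record.
        base = max(b for b in self.segments if b <= offset)
        with open(self._path(base), "rb") as f:
            for _ in range(base, offset + 1):
                size = int.from_bytes(f.read(4), "big")
                payload = f.read(size)
        return payload

    def close(self):
        self.active.close()


with tempfile.TemporaryDirectory() as directory:
    log = SegmentedLog(directory, max_segment_bytes=64)
    offsets = [log.append(f"event-{i}".encode()) for i in range(10)]
    seventh = log.read(7)
    segment_bases = list(log.segments)
    log.close()
```

Existing records are never rewritten, so old segments can be deleted or archived as whole files.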
13. 13 •
• Kafka relies on the native Linux page cache (read-ahead and write-behind)
• JVM off-heap cache for free
• Kafka records aren't deserialized in the Kafka JVM
• No Java object memory overhead
• No OutOfMemory issues
• No big GC pauses
Caching data for free
[Diagram: Kafka, active segment file and old segment files, OS, disk]
14. 14 •
Reliability with replication
• Kafka disk writes are asynchronous
• Replica synchronisation (over the network) is synchronous
• Replicas are trusted in case of data corruption / server crash
[Diagram: broker (partition leader), broker (replica), broker (previous leader)]
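The trade-off on this slide (asynchronous disk, synchronous replicas) can be sketched with a small in-memory model. This is only an illustration under simplifying assumptions: the `Replica` and `Partition` classes are hypothetical, the followers copy records inside `produce` rather than through a real fetch loop, and leader election is reduced to "take the first follower". It does not reproduce Kafka's actual replication protocol.

```python
class Replica:
    def __init__(self, name):
        self.name = name
        self.log = []      # records held in memory (stand-in for the page cache)
        self.flushed = 0   # number of records that have reached the disk

    def fetch_from(self, leader):
        # Followers pull the records they are missing from the leader.
        self.log.extend(leader.log[len(self.log):])

    def flush(self):
        # Would run later, in the background; never called on the produce path.
        self.flushed = len(self.log)


class Partition:
    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers

    def produce(self, record):
        # The leader appends in memory only: no disk flush here.
        self.leader.log.append(record)
        offset = len(self.leader.log) - 1
        # The record is acknowledged only once every follower holds a copy.
        for follower in self.followers:
            follower.fetch_from(self.leader)
        return offset

    def fail_over(self):
        # The leader crashes before flushing; a follower becomes the leader.
        self.leader = self.followers.pop(0)
        return self.leader


old_leader = Replica("broker-1")
partition = Partition(old_leader, [Replica("broker-2"), Replica("broker-3")])
acked = [partition.produce(f"msg-{i}".encode()) for i in range(3)]
flushed_before_crash = old_leader.flushed
new_leader = partition.fail_over()
```

In this model the old leader never flushed anything, yet every acknowledged record survives on the new leader, which is the sense in which replicas are trusted rather than disks.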
19. 19 •
• Parallelism based on topic partitions;
• Data compressed/decompressed on the client;
• Producers send batches of messages;
• No serialization/deserialization costs on the brokers;
• Writing directly to file:
• Append only (cheaper disks);
• No complex data structure (no B-tree or LSM tree);
• Uses OS memory management;
• Relies on replicas, not on disks;
• Zero-copy;
Key takeaways
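As an illustration of the zero-copy takeaway, the sketch below sends a slice of a file straight to a socket with Python's `socket.sendfile`, which uses the `sendfile` system call where the platform supports it and falls back to plain `send` otherwise. The temporary file standing in for a segment, the record contents and the offsets are assumptions for the example; a Kafka broker does the equivalent from the JVM.

```python
import os
import socket
import tempfile

# A stand-in "segment file" that already holds three 8-byte records.
with tempfile.NamedTemporaryFile(delete=False) as segment:
    segment.write(b"record-0record-1record-2")
    path = segment.name

broker_side, consumer_side = socket.socketpair()
try:
    with open(path, "rb") as f:
        # Where os.sendfile is available, the kernel moves the bytes from the
        # page cache to the socket without copying them into this process.
        sent = broker_side.sendfile(f, offset=8, count=16)

    # The consumer reads until it has received everything that was sent.
    received = b""
    while len(received) < sent:
        received += consumer_side.recv(1024)
finally:
    broker_side.close()
    consumer_side.close()
    os.remove(path)
```

The sending side never inspects or deserializes the payload, which is why brokers can serve stored bytes to consumers exactly as the producer wrote them.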
Quick introduction of each speaker
Short agenda (first Kafka basics, then the design choices that made it a great tool for our scale)
Why this name: simply because the initial creator (Jay Kreps) liked this author (Franz Kafka), liked the fact that he was a writer, and thought it was a good name for an open-source project.
A topic is just like a table in a DB, but for a queue: in a queue, we call that a topic. You send messages to the bid request topic and you receive messages from the billable click topic
Partitions are sections of a topic. So here topic A has two partitions / topic B has 4. Partitions are spread over different servers, but one partition always fits on one server.
Topics can be bid request and billable click
Bid request has 1000 partitions
Partitions are on different servers
Ordering is guaranteed only inside a partition
Each message has a monotonically increasing offset.
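A small sketch of how topics, partitions and offsets relate. The `Topic` class and the CRC32-based partitioner are assumptions made for the example (Kafka's Java client hashes keys with murmur2, and the notes above do not say how Criteo keys its messages). The point is only that each partition is its own ordered list with its own offsets.

```python
import zlib


class Topic:
    def __init__(self, name, num_partitions):
        self.name = name
        # One independent, ordered log per partition.
        self.partitions = [[] for _ in range(num_partitions)]

    def partition_for(self, key: bytes) -> int:
        # Hypothetical partitioner: hash of the key modulo the partition count.
        return zlib.crc32(key) % len(self.partitions)

    def send(self, key: bytes, value: bytes):
        p = self.partition_for(key)
        self.partitions[p].append(value)
        # The offset is the position of the record inside its partition.
        return p, len(self.partitions[p]) - 1


topic = Topic("bid_request", num_partitions=4)
results = [
    topic.send(f"user-{i % 3}".encode(), f"bid-{i}".encode())
    for i in range(9)
]
```

Messages with the same key always land in the same partition and keep their relative order there; nothing is guaranteed about ordering between two partitions.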
Focus on:
- Kafka just stores bytes / no schema --> you can send images in Kafka if you want (not a wonderful idea, but it works)
The first thing we want to explain is that the complexity is not in the server but in the client
Producer and consumer
The broker is dumb
Difference with RabbitMQ or other queues: you can have a huge queue if you want (cf. event sourcing store), the limit is the disk / the broker doesn't track the status of a message (whether it was received or not), it is dumb / pull and not push
You can group consumers together to create a consumer group, and so a distributed application. The broker manages the coordination of consumers to assign the right partitions to the right consumers
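A minimal sketch of what "assigning partitions to the consumers of a group" means. The round-robin `assign` function is an assumption for illustration, not the broker's actual assignment strategy. It shows the invariant that matters: inside a group, each partition is owned by exactly one consumer, and the assignment is recomputed when a member leaves.

```python
def assign(partitions, consumers):
    """Spread partitions over the members of a group, round-robin."""
    members = sorted(consumers)
    assignment = {member: [] for member in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment


partitions = range(6)
before = assign(partitions, ["c1", "c2", "c3"])
# c3 leaves the group: its partitions are redistributed among the others.
after = assign(partitions, ["c1", "c2"])
```

With six partitions and three consumers each member gets two partitions; once `c3` leaves, the two remaining members get three each.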
Focus on:
- No SPOF / no broker acting as a gateway for the cluster: the producer maintains the mapping (topic, partition) -> broker
A batch is only logical: one physical message (one send request / ack) contains several messages
Batch advantages: compression is efficient / network acks are efficient: one ack per 1,000 messages, for instance
Warning: the consumer receives compressed batch data only if the producer sent it that way
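To see why compressing a whole batch pays off, here is a rough sketch comparing 1,000 small messages compressed one by one with the same messages compressed as a single batch. The message shape, the 4-byte length framing and the choice of gzip are assumptions for the example. The exact sizes depend on the data, so none are quoted here.

```python
import gzip
import json

# 1,000 small, similar-looking messages (hypothetical payload).
messages = [
    json.dumps({"event": "click", "user": i % 10}).encode()
    for i in range(1000)
]

# Option 1: every message is compressed and sent on its own.
one_by_one = sum(len(gzip.compress(m)) for m in messages)

# Option 2: the producer frames the messages into one batch and
# compresses the batch once (one send request, one ack).
batch = b"".join(len(m).to_bytes(4, "big") + m for m in messages)
compressed_batch = gzip.compress(batch)
batched = len(compressed_batch)


def unpack(compressed: bytes):
    """What the consumer does: decompress once, then split the batch."""
    data = gzip.decompress(compressed)
    out, pos = [], 0
    while pos < len(data):
        size = int.from_bytes(data[pos:pos + 4], "big")
        out.append(data[pos + 4:pos + 4 + size])
        pos += 4 + size
    return out


round_trip = unpack(compressed_batch)
```

The compressor sees the repetition across messages only in the batched case, and the broker in between can store and forward `compressed_batch` without touching it.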
Cost efficiency + highest perf
The advantage here is being able to use JBOD or RAID
SSDs would cost more for equal or even lower perf
- Same cache system as Varnish (HTTP cache server)
- Designed to work with Linux only.
- A heap of 4 GB is enough because there is no data inside (it only manages metadata and client connections)
Disk writes are async (and that's OK because replication over the network is sync)