Deep Dive into Apache Kafka
Jun Rao, co-founder of Confluent
In the last few years, Apache Kafka has been used extensively in enterprises for real-time data collection, delivery, and processing. In this presentation, Jun Rao, co-founder of Confluent, gives a deep dive into some of the key internals that help make Kafka popular.

- Companies like LinkedIn are now sending more than 1 trillion messages per day to Kafka. Learn about the underlying design in Kafka that leads to such high throughput.
- Many companies (e.g., financial institutions) are now storing mission-critical data in Kafka. Learn how Kafka supports high availability and durability through its built-in replication mechanism.
- One common use case of Kafka is propagating updatable database records. Learn how a unique feature called compaction in Apache Kafka is designed to solve this kind of problem more naturally.

1. Deep Dive into Apache Kafka (Jun Rao, co-founder of Confluent)
2. Kafka Usage
3. Agenda
• High throughput
• Reliability and durability
• Compacted topic
4. Scaled-out Architecture
[Diagram: producers write to topic partitions spread across a Kafka cluster of brokers 1 through n, and consumers read from those brokers.]
5. Persistent Log
[Diagram: a topic partition stored as a persistent, append-only log.]
6. Detailed Log Representation
[Diagram: the log is split into segments (offsets 0 - 10000, 10001 - 20000, 20001 - 30000); each segment has an offset index and a timestamp index.]
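
The timestamp index is what makes lookups by time efficient; a minimal sketch using the Java consumer's offsetsForTimes, which the broker answers from those indexes (broker address and topic name are placeholder values):

    import java.util.Collections;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.clients.consumer.OffsetAndTimestamp;
    import org.apache.kafka.common.TopicPartition;

    public class SeekByTimestamp {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("topic1", 0); // placeholder topic
                consumer.assign(Collections.singletonList(tp));
                long oneHourAgo = System.currentTimeMillis() - 3_600_000L;
                // The broker resolves this time-to-offset lookup via the timestamp indexes.
                Map<TopicPartition, OffsetAndTimestamp> offsets =
                    consumer.offsetsForTimes(Collections.singletonMap(tp, oneHourAgo));
                OffsetAndTimestamp ot = offsets.get(tp);
                if (ot != null) {
                    consumer.seek(tp, ot.offset()); // start reading from roughly one hour ago
                }
            }
        }
    }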
7. Message Format
Each log entry is laid out as:
• offset: 8 bytes
• message length: 4 bytes
• CRC: 4 bytes
• magic byte: 1 byte
• attribute: 1 byte
• timestamp: 8 bytes
• key length: 4 bytes
• key content: variable
• value length: 4 bytes
• value content: variable
8. Batching and Compression
[Diagram: the producer buffers send() calls into batches, compresses each batch, and flushes them asynchronously to the broker; the consumer fetches the compressed batches 1, 2, 3 via poll().]
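
These batching and compression knobs are ordinary producer configs; a minimal sketch of a producer tuned for throughput (broker address, topic name, and the specific values are illustrative):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class BatchingProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("batch.size", 65536);       // accumulate up to 64 KB per partition batch
            props.put("linger.ms", 10);           // wait up to 10 ms to fill a batch
            props.put("compression.type", "lz4"); // compress each batch as a unit

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                for (int i = 0; i < 1000; i++) {
                    // send() is asynchronous: records are buffered into batches
                    producer.send(new ProducerRecord<>("topic1", "key" + i, "value" + i));
                }
                producer.flush(); // push out any partially filled batches
            }
        }
    }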
9. Agenda
• High throughput
• Reliability and durability
• Compacted topic
10. Kafka Replication
• Configurable replication factor
• Tolerates f – 1 failures with f replicas (unlike quorum-based replication, which needs 2f + 1 replicas to tolerate f failures)
• Automated failover
11. Replicas and Layout
• Each topic partition has replicas
• Replicas are spread evenly among brokers
[Diagram: replicas of topic1-part1, topic1-part2, topic2-part1, and topic2-part2 spread across the logs of brokers 1 through 4.]
12. High Level Data Flow in Replication
[Diagram: the producer writes to the leader replica of topic1-part1 on broker 1; follower replicas on brokers 2 and 3 fetch from the leader; the leader commits the message and acks the producer.]
When producer receives ack   | Latency              | Durability on failures
acks=0 (no ack)              | no network delay     | some data loss
acks=1 (wait for leader)     | 1 network roundtrip  | a few messages may be lost
acks=all (wait for committed)| 2 network roundtrips | no data loss
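
As a sketch of the durable end of that table: a producer configured with acks=all, with a callback to observe whether each record was committed (broker address and topic name are placeholder values):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class AcksAllProducer {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("acks", "all"); // leader acks only after all in-sync replicas have the record

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                producer.send(new ProducerRecord<>("topic1", "key", "value"), (metadata, exception) -> {
                    if (exception != null) {
                        exception.printStackTrace(); // the record may not have been committed
                    } else {
                        System.out.printf("committed at partition %d, offset %d%n",
                            metadata.partition(), metadata.offset());
                    }
                });
            } // close() waits for in-flight requests to complete
        }
    }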
13. Extend to Multiple Partitions
Leaders are evenly spread among brokers.
[Diagram: leaders and followers of topic1-part1, topic2-part1, and topic3-part1 distributed across brokers 1 through 4, each partition with its own producer writing to its leader.]
14. In-sync Replicas (ISR)
[Diagram: the leader on broker 1 and the followers on brokers 2 and 3 all hold m1 and m2; the ISR contains all three replicas and the last committed message is m2.]
In-sync: a replica reads from the leader's log end within replica.lag.time.max.ms.
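
A toy model of that definition, not Kafka's actual broker code: a follower counts as in sync while its fetches have reached the leader's log end within the replica.lag.time.max.ms window.

    public class IsrCheck {
        // Last time this follower's fetch position reached the leader's log end offset.
        static final class ReplicaState {
            final long lastCaughtUpTimeMs;
            ReplicaState(long lastCaughtUpTimeMs) { this.lastCaughtUpTimeMs = lastCaughtUpTimeMs; }
        }

        // Toy predicate: in sync if caught up within the configured lag window.
        static boolean isInSync(ReplicaState replica, long nowMs, long replicaLagTimeMaxMs) {
            return nowMs - replica.lastCaughtUpTimeMs <= replicaLagTimeMaxMs;
        }

        public static void main(String[] args) {
            long now = System.currentTimeMillis();
            ReplicaState healthy = new ReplicaState(now - 2_000);  // caught up 2 s ago
            ReplicaState lagging = new ReplicaState(now - 60_000); // caught up 60 s ago
            long lagMax = 10_000; // illustrative replica.lag.time.max.ms of 10 s
            System.out.println(isInSync(healthy, now, lagMax)); // true  -> stays in the ISR
            System.out.println(isInSync(lagging, now, lagMax)); // false -> shrunk out of the ISR
        }
    }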
15. Follower Failure
[Diagram: same setup; the follower on broker 3 fails and stops fetching from the leader.]
16. Shrinking ISR
[Diagram: the failed follower on broker 3 is dropped from the ISR; the leader and the remaining follower advance to m3 and m4, and the last committed message moves up to m4.]
17. Failed Replica Coming Back
[Diagram: the follower on broker 3 comes back and starts fetching from the leader again to catch up on m3 and m4.]
18. Leader Failure
[Diagram: the leader on broker 1 fails; the partition needs a new leader.]
19. Selecting New Leader from ISR
[Diagram: the follower on broker 2, a member of the ISR, becomes the new leader.]
20. Expanding ISR
[Diagram: the replica on broker 3 catches up to the new leader (through m5) and is added back to the ISR.]
21. Unclean Leader Election
[Diagram: if the remaining ISR members fail and the stale replica on broker 1, which is missing m4 and m5, is elected leader, committed messages are lost; this is unclean leader election.]
22. Guaranteed Replicas
[Diagram: before acknowledging a new message m6, the leader on broker 2 checks whether the ISR is large enough (min.insync.replicas).]
23. Mission Critical Data
• Disable unclean leader election: unclean.leader.election.enable = false
• Set the replication factor: default.replication.factor = 3
• Set the minimum ISR size: min.insync.replicas = 2
• Set producer acks: acks = all
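
Putting those settings together, a sketch using the Java AdminClient to create such a topic (broker address and topic name are placeholders; producers writing to it would additionally set acks=all as in the earlier example):

    import java.util.Collections;
    import java.util.HashMap;
    import java.util.Map;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;

    public class CriticalTopicSetup {
        public static void main(String[] args) throws Exception {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker

            try (AdminClient admin = AdminClient.create(props)) {
                // 3 partitions, replication factor 3 (requires at least 3 brokers).
                NewTopic topic = new NewTopic("critical-topic", 3, (short) 3);
                Map<String, String> configs = new HashMap<>();
                configs.put("min.insync.replicas", "2");                // writes need >= 2 in-sync replicas
                configs.put("unclean.leader.election.enable", "false"); // never elect an out-of-ISR leader
                topic.configs(configs);
                admin.createTopics(Collections.singleton(topic)).all().get();
            }
        }
    }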
24. Failure Detection and Controller Flow
[Diagram: the controller (here on broker 1) detects broker failures, selects new leaders and ISRs for the affected partitions, and propagates them to the other brokers.]
25. Agenda
• High throughput
• Reliability and durability
• Compacted topic
26. Log Compaction: What is it?
27. Use Case
[Diagram: updates from a product catalog (e.g., item1 → "new description") flow through a Kafka topic into a search index.]
28. Adding a New Index Instance
[Diagram: a new search index instance is added and must rebuild the current value of every key, not just pick up the latest updates.]
29. Using a Compacted Topic
[Diagram: with a compacted topic, the new search index instance sets its consumer offset to 0 and replays the log, which retains at least the latest value for each key.]
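
A sketch of that pattern: create the topic with cleanup.policy=compact, then have a new index instance rewind to the beginning and replay the log (broker address, topic name, and partition count are placeholders):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.admin.AdminClient;
    import org.apache.kafka.clients.admin.NewTopic;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.TopicPartition;

    public class CompactedCatalog {
        public static void main(String[] args) throws Exception {
            Properties adminProps = new Properties();
            adminProps.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            try (AdminClient admin = AdminClient.create(adminProps)) {
                NewTopic topic = new NewTopic("product-catalog", 1, (short) 3); // placeholder topic
                topic.configs(Collections.singletonMap("cleanup.policy", "compact"));
                admin.createTopics(Collections.singleton(topic)).all().get();
            }

            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");
            props.put("key.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            props.put("value.deserializer", "org.apache.kafka.common.serialization.StringDeserializer");
            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                TopicPartition tp = new TopicPartition("product-catalog", 0);
                consumer.assign(Collections.singletonList(tp));
                consumer.seekToBeginning(Collections.singletonList(tp)); // replay from offset 0
                for (ConsumerRecord<String, String> record : consumer.poll(Duration.ofSeconds(1))) {
                    // The compacted log retains at least the latest value per key,
                    // so this loop rebuilds the current catalog state.
                    System.out.println(record.key() + " -> " + record.value());
                }
            }
        }
    }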
30. Log Cleaner Implementation
[Diagram: cleaning starts at firstDirty and proceeds in two passes. 1. Build a map from each key to its last offset in the dirty segments (e.g., key1 → 3500, key2 → 3700, key3 → 4200). 2. Scan the log and probe the map: key1 at offset 1500 is rejected because a later offset exists; key4 at 2100 and key3 at 4200 are kept. After cleaning, segments 1001 - 2000 through 5001 - 6000 are rewritten as 1001 - 3000, 3001 - 5000, and 5001 - 6000.]
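
A toy, in-memory model of those two passes; the real cleaner works segment by segment on disk and builds the map only over the dirty section, but the keep/reject logic is the same idea:

    import java.util.ArrayList;
    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    public class ToyCompaction {
        record Entry(long offset, String key, String value) {}

        static List<Entry> compact(List<Entry> log) {
            // Pass 1: build a map from key to the last offset at which it appears.
            Map<String, Long> lastOffset = new HashMap<>();
            for (Entry e : log) {
                lastOffset.put(e.key(), e.offset());
            }
            // Pass 2: scan and probe the map; keep an entry only if it is the
            // latest occurrence of its key, otherwise reject it.
            List<Entry> cleaned = new ArrayList<>();
            for (Entry e : log) {
                if (lastOffset.get(e.key()) == e.offset()) {
                    cleaned.add(e);
                }
            }
            return cleaned;
        }

        public static void main(String[] args) {
            List<Entry> log = List.of(
                new Entry(1500, "key1", "old"),
                new Entry(2100, "key4", "v"),
                new Entry(3500, "key1", "new"),
                new Entry(3700, "key2", "v"),
                new Entry(4200, "key3", "v"));
            // key1@1500 is rejected; key4@2100, key1@3500, key2@3700, key3@4200 are kept.
            compact(log).forEach(e -> System.out.println(e.offset() + " " + e.key()));
        }
    }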
31. Cleaning Configs
• log.cleaner.min.cleanable.ratio (default 0.5): the dirty/total ratio at which the log cleaner is triggered
• log.cleaner.io.max.bytes.per.second (default infinite): the maximum rate at which cleaning can be done; can be used for throttling
32. Be Careful with Deletes
• A delete tombstone is modeled as a message with a null value
• There is a danger of removing a deleted key too soon: a consumer that misses the tombstone still assumes the old value for the key
• log.cleaner.delete.retention.ms (default 1 day): delete tombstones are removed after that time
• Consumers need to finish consuming the tombstone before that time
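
A sketch of issuing such a delete: the tombstone is simply a record whose value is null (broker address and topic name are placeholders):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.Producer;
    import org.apache.kafka.clients.producer.ProducerRecord;

    public class DeleteFromCatalog {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092"); // placeholder broker
            props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
            props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

            try (Producer<String, String> producer = new KafkaProducer<>(props)) {
                // A null value is the delete tombstone: after compaction plus
                // log.cleaner.delete.retention.ms, "item1" disappears from the log.
                producer.send(new ProducerRecord<>("product-catalog", "item1", null));
            }
        }
    }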
33. Summary
• Apache Kafka is a streaming platform
• The storage part supports:
  • High throughput
  • High availability and durability
  • Retaining database-like data
34. Coming Up Next
Date  | Title                                                          | Speaker
10/27 | Data Integration with Kafka                                    | Gwen Shapira
11/17 | Demystifying Stream Processing                                 | Neha Narkhede
12/1  | A Practical Guide To Selecting A Stream Processing Technology | Michael Noll
12/15 | Streaming in Practice: Putting Apache Kafka in Production     | Roger Hoover
