Velocity 2019 - Kafka Operations Deep Dive

1
Monitor Disk
Space
and other ways to keep Kafka
happy
Gwen Shapira, @gwenshap, Software Engineer

Me
• Software engineer @ Confluent
• Committer on Apache Kafka
• Co-author of
“Kafka - the Definitive Guide”
• Tweets a lot: @gwenshap
• Learning to devops

3
In which disk-related failure
scenarios are discussed in
unprecedented level of detail

Producer
Consume
r
Kafka Cluster
Stream Processing Apps
Connectors Connectors

Partitions
• Kafka organizes messages into
topics
• Each topics have a set of
partitions
• Each partition is a replicated log
of messages, referenced by
sequential offset
Partition 0
Partition 1
Partition 2
0 1 2 3 4 5
0 1 2 3 4 5 6 7
0 1 2 3 4
Offset

Replication
• Each Partition is replicated 3
times
• Each replica lives on separate
broker
• Leader handles all reads and
writes.
• Followers replicate events
from leader.
01234567
Replica 1 Replica 2 Replica 3
01234567
01234567
Producer

9
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 100 Broker 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
Zookeeper: /brokers/100, 101, 102

10
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
Zookeeper: /brokers/101, 102
✗

11
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
✗
Oh no.
Broker 100
is missing.

12
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
✗
Broker 102: you
now lead
partition 1
Broker 101: you
now follow
broker 102 for
partition 1

13
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
✗

14
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102

15
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
Broker 100
is back!
Broker 100:
Note the
new
leaders: 101
and 102

16
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
What did I
miss?

17
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
What did I
miss?
Lots of
events!

18
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
Thanks
guys, I
caught up!

19
Partition 1
Replica 102
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 100
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
Broker 100,
you are
preferred
leader for
partition 1
Broker 101,
follow broker
100 for
partition 1
Broker 102,
follow broker
100 for
partition 1

20
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102

21
What could possibly go wrong?

22
When Kafka runs out of disk
space

23
Best case scenario:
Broker ran out of disk space
and crashed.

24
Solution:
1. Get bigger disks
2. Store less data

25
What not to do. Ever:
cat /dev/null > /data/log/my_topic-
15/00000000000001548736.log
While Kafka is up and running.

26
When you are in a hole
Stop digging.
Don’t know where the holes are?
Walk slowly.

27
General Tips for
Stable Kafka
● Over-provision
● Upgrade to latest bug-fixes
● Don’t mess with stuff you don’t
understand.
● Call support when you have to

28
Bad Scenario
https://issues.apache.org/jira/browse/KAFKA-7151

29
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 100 Broker 101 Broker 102
Partition 1
Replica 102
Latest offset 1000 Latest offset 1000 Latest offset 1000

30
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
What did
I miss?
Anything
after
1000?

31
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
Here is
1001 to
1010

32
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102

33
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102

34
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
What did
I miss?
Anything
after
1010?

35
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
Too busy
trying to
access
disk

36
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
IO
ERROR

37
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
✗
Too far
behind to
be leader
Too far
behind to
be leader

39
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
I’m back.
As you know, I’m the
leader.
Based on my disk,
latest event is 1000

40
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
What did I
miss?
LOL. No.
Latest is 1010.
We can’t follow
you.

41
Solution:
Enable unclean leader election.
Lose messages 1010-1020.

42
Solution:
https://issues.apache.org/jira/browse/KAFKA-7151

45
Systems Hierarchy of Needs
CPU
Bandwidth
Disk
RAM

46
Most common Symptom:
Under-replicated partitions
You basically can’t alert on that.
We monitor the resources,
act early and add resources.

47
How to add CPU / Bandwidth?
Normally by adding brokers
And rebalancing partitions

48
When good EBS volumes go bad

49
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 1
Replica 102
Hanging.
Not talking to
anyone

50
What will happen?
Lets zoom in

51
Network Threads
Also reading from disk
Partition 1
Replica 101
Partition 2
Replica 101
Leader
Partition 1
Replica 100
Leader
Partition 2
Replica 100Request Threads
Writing to disk
Replica Fetchers
Reading from leader
Writing to disk
Zookeeper Client
No disks involved
Hanging.
Not talking to
anyone

52
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
✗Only part
of the
broker
that is
alive!

53
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Broker 102
Controller
Partition 1
Replica 102
Partition 2
Replica 100
Partition 2
Replica 101
Leader
Partition 2
Replica 102
✗
Broker 100 is
totally alive!
No need to
elect leaders!

55
Solution:
Stop the broker ASAP.
Open ticket to replace disk

56
How to detect this?
● Broker is up
● Logs look fine
● Request Handler idle% is 0
● Network Handler idle% is 0
● Client time-out

58
Canary
● Lead partition on every broker
● Produce and Consume
● Every 10 seconds
● Yell if 3 consecutive misses
Partition 1
Replica 100
Leader
Partition 1
Replica 101
Partition 2
Replica 100
Partition 2
Replica 101
Leader

59
You can reuse canary for simple
failure injection testing

61
You don’t really know how your
software will behave until it is in
production for quite a while.

62
More Key Points
● Keep an eye on your key resources
● Tread carefully in unknown territory
● Sometimes crashed broker is GOOD
● Monitor user scenarios -
especially for SLAs

Velocity 2019 - Kafka Operations Deep Dive

Recommended

Recommended

More Related Content

Similar to Velocity 2019 - Kafka Operations Deep Dive

Similar to Velocity 2019 - Kafka Operations Deep Dive (20)

More from Gwen (Chen) Shapira

More from Gwen (Chen) Shapira (20)

Recently uploaded

Recently uploaded (20)

Velocity 2019 - Kafka Operations Deep Dive

Editor's Notes