Watch this talk here: https://www.confluent.io/online-talks/everything-you-always-wanted-to-know-about-kafkas-rebalance-protocol-but-were-afraid-to-ask-on-demand
Apache Kafka® is a scalable streaming platform with built-in dynamic client scaling. The elastic scale-in/scale-out feature leverages Kafka’s “rebalance protocol”, which was designed in the 0.9 release and has been improved ever since. The original design targets on-prem deployments of stateless clients. However, it does not always align well with modern deployment tools like Kubernetes or with stateful stream processing clients like Kafka Streams. Those shortcomings led to two major recent improvement proposals: static group membership and incremental rebalancing.
This talk provides a deep dive into the details of the rebalance protocol, starting from its original design in version 0.9 up to the latest improvements and future work.
We discuss internal technical details and the pros and cons of the existing approaches, and explain how to configure your client correctly for your use case. Additionally, we discuss configuration tradeoffs for stateless, stateful, on-prem, and containerized deployments.
Everything You Always Wanted to Know About Kafka's Rebalance Protocol but Were Afraid to Ask
1. Everything you always wanted to know about Kafka’s rebalance protocol but you were afraid to ask
Matthias J. Sax | Software Engineer
@MatthiasJSax
2. What is rebalancing about?
● Group membership
● Resource assignment
● Example: KafkaConsumer
○ Consumer group
○ Partition ownership
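To make the two ideas on this slide concrete, here is a minimal sketch of how partition ownership follows from group membership. It is not Kafka's actual assignor: `assign_round_robin` is a hypothetical helper, and the member names and partition count are made up for illustration.

```python
def assign_round_robin(members, partitions):
    """Spread partitions over the group members, one partition at a time.

    Illustrative only: a toy stand-in for a partition assignor.
    """
    if not members:
        raise ValueError("a group needs at least one member")
    ordered = sorted(members)
    assignment = {m: [] for m in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


partitions = list(range(6))

# Three consumers in the group: each owns two partitions.
before = assign_round_robin(["C1", "C2", "C3"], partitions)

# C3 leaves the group: membership changed, so ownership is recomputed.
after = assign_round_robin(["C1", "C2"], partitions)

print(before)
print(after)
```

Every change in membership leads to a new assignment, which is why group membership and resource assignment are handled by the same protocol.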
19. Static Group Membership
GroupCoordinator (broker side), mapping group.instance.id to member.id:
● A: member.id=1
● B: member.id=2
● C: member.id=3
Application pods, all with group.id=“grp”:
● POD with group.instance.id=“A”: member.id=1
● POD with group.instance.id=“B”: member.id=2
● POD with group.instance.id=“C”: member.id=3
20. Static Group Membership
GroupCoordinator (broker side), mapping group.instance.id to member.id:
● A: member.id=1
● B: member.id=2
● C: member.id=3
Application pods, all with group.id=“grp”:
● POD with group.instance.id=“B”: member.id=2
● POD with group.instance.id=“C”: member.id=3
21. Static Group Membership
GroupCoordinator (broker side), mapping group.instance.id to member.id:
● A: member.id=1
● B: member.id=2
● C: member.id=3
Application pods, all with group.id=“grp”:
● POD with group.instance.id=“A”: member.id=1
● POD with group.instance.id=“B”: member.id=2
● POD with group.instance.id=“C”: member.id=3
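The three slides above show pod A disappearing and coming back while the coordinator keeps its entry. A rough way to picture this is the toy model below. It is a sketch that follows the slides' simplified view, not broker code: `ToyCoordinator`, its methods, and the timeout value are hypothetical, and the real protocol has more detail (for example around how member.id values are issued on rejoin).

```python
class ToyCoordinator:
    """Toy stand-in for the broker-side GroupCoordinator (not Kafka code)."""

    def __init__(self, session_timeout_s):
        self.session_timeout_s = session_timeout_s
        self.members = {}        # group.instance.id -> [member.id, last_seen]
        self.next_member_id = 1
        self.rebalances = 0

    def heartbeat(self, instance_id, now):
        self.members[instance_id][1] = now

    def _expire(self, now):
        # Drop members that were silent for longer than the session timeout.
        for instance_id in list(self.members):
            if now - self.members[instance_id][1] > self.session_timeout_s:
                del self.members[instance_id]
                self.rebalances += 1   # membership changed

    def join(self, instance_id, now):
        self._expire(now)
        if instance_id in self.members:
            # Known static member came back in time: keep its entry,
            # no rebalance is triggered.
            self.members[instance_id][1] = now
            return self.members[instance_id][0]
        member_id = self.next_member_id
        self.next_member_id += 1
        self.members[instance_id] = [member_id, now]
        self.rebalances += 1           # membership changed
        return member_id


coordinator = ToyCoordinator(session_timeout_s=30)   # illustrative value
ids = {name: coordinator.join(name, now=0) for name in "ABC"}
baseline = coordinator.rebalances

# Pod A restarts and is back after 20s, within the session timeout.
coordinator.heartbeat("B", now=20)
coordinator.heartbeat("C", now=20)
quick_id = coordinator.join("A", now=20)
quick_rebalances = coordinator.rebalances - baseline

# Pod A is gone for 40s this time, longer than the session timeout.
coordinator.heartbeat("B", now=60)
coordinator.heartbeat("C", now=60)
slow_id = coordinator.join("A", now=60)
slow_rebalances = coordinator.rebalances - baseline

print(ids, quick_id, quick_rebalances, slow_id, slow_rebalances)
```

In this model the quick restart causes no rebalance, while the slow one causes two (one when A is expired, one when it rejoins). That is the tradeoff behind choosing a session timeout for containerized deployments: it has to be long enough to cover a typical pod restart.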
22. Looking into the Future
● Work in progress
○ Incremental rebalancing for Kafka Consumers and Kafka Streams
● Future work
○ Smooth scale-out for Kafka Streams
26. Incremental Rebalancing
[Sequence diagram: consumers C1, C2, C3 and the GroupCoordinator (broker side). Labels: JoinGroup; Group Leader (received all subscriptions); JoinResponse; synchronization barrier; SyncGroup (intended assignment); SyncResponse (enforce revoke); JoinGroup; Group Leader (received all subscriptions); JoinResponse; SyncGroup; SyncResponse]
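The labels "intended assignment" and "enforce revoke" on this slide suggest a two-round scheme: first only the partitions that must move are revoked, then they are handed to their new owners. The sketch below illustrates that idea under stated assumptions. `cooperative_rebalance` is a hypothetical helper, not part of any Kafka client, and the partition numbers are made up.

```python
def cooperative_rebalance(owned, target):
    """Split a move from `owned` to `target` into two rounds.

    Both arguments map a member name to the set of partitions it has
    (owned) or should end up with (target, the intended assignment).
    """
    # Round 1: members keep what they own and should still own.
    # Only partitions that move to another member are revoked.
    round1 = {m: owned.get(m, set()) & target[m] for m in target}
    revoked = {m: owned[m] - target.get(m, set()) for m in owned}

    # Round 2: the revoked partitions are free now, so every member
    # can receive its full intended assignment.
    round2 = {m: set(parts) for m, parts in target.items()}
    return round1, revoked, round2


# C3 joins a group in which C1 and C2 own three partitions each.
owned = {"C1": {0, 1, 2}, "C2": {3, 4, 5}}
target = {"C1": {0, 1}, "C2": {3, 4}, "C3": {2, 5}}

round1, revoked, round2 = cooperative_rebalance(owned, target)

# Partitions that stay with the same owner in both rounds.
never_revoked = set().union(*round1.values())

print(round1, revoked, round2, never_revoked)
```

In this example only partitions 2 and 5 change hands. Partitions 0, 1, 3, and 4 stay with their owners throughout, in contrast to a stop-the-world rebalance in which every member gives up everything first.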
27. Summary
● Deep dive into rebalance protocol
○ Powerful and flexible
○ Stop-the-world property
● Since AK 2.3 / CP 5.3
○ Static group membership
○ Incremental rebalancing (Connect)
● Work in progress for AK 2.4 / CP 5.4