"While Apache Kafka lacks native support for topic renaming, there are scenarios where renaming topics becomes necessary. This presentation will delve into the utilization of MirrorMaker 2.0 as a solution for renaming Kafka topics. It will illustrate how MirrorMaker 2.0 can efficiently facilitate the migration of messages from the old topic to the new one and how Kafka Connect Metrics can be employed to monitor the mirroring progress. The discussion will encompass the complexity of renaming Kafka topics, addressing certain limitations, and exploring potential workarounds when using MirrorMaker 2.0 for this purpose. Despite not being originally designed for topic renaming, MirrorMaker 2.0 has a suitable solution for renaming Kafka topics.
Blog Post : https://engineering.hellofresh.com/renaming-a-kafka-topic-d6ff3aaf3f03"
Renaming a Kafka Topic | Kafka Summit London
1. Renaming a Kafka Topic
Nahidul Kibria
Senior Platform Engineer @ HelloFresh SE
@nahidupa
2. The Challenge of Inconsistent Kafka Topic Names
A context change can make a Kafka topic name irrelevant.
3. A proper naming convention gives us some benefits out of the box.
For example, we can
● Find topic ownership.
● Implement ACLs based on the convention.
● Implement namespaces.
Unlike k8s, Kafka does not have namespaces, so the only way to get one is through
a naming convention.
You can imagine how important this is on a shared cluster.
Why Consistent Naming Matters
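To make the benefits above concrete, here is a minimal sketch of how ownership and an ACL prefix could be derived from a topic name. The `<namespace>.<team>.<dataset>` convention, the pattern, and the helper names are all assumptions for illustration; the talk does not state the convention actually used.

```python
import re
from typing import NamedTuple, Optional

# Hypothetical convention (not from the talk): <namespace>.<team>.<dataset>,
# lowercase letters, digits and dashes only.
TOPIC_PATTERN = re.compile(
    r"^(?P<namespace>[a-z0-9-]+)\.(?P<team>[a-z0-9-]+)\.(?P<dataset>[a-z0-9-]+)$"
)


class TopicName(NamedTuple):
    namespace: str
    team: str
    dataset: str


def parse_topic(name: str) -> Optional[TopicName]:
    """Return the parsed parts, or None if the name breaks the convention."""
    match = TOPIC_PATTERN.match(name)
    return TopicName(**match.groupdict()) if match else None


def acl_prefix(topic: TopicName) -> str:
    """Prefix that a prefixed ACL for the owning team could be bound to."""
    return f"{topic.namespace}.{topic.team}."
```

With such a convention, a name like `logistics.delivery.order-events` identifies its owning team directly, while a legacy name like `OrdersTopic` fails to parse and becomes a candidate for renaming.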
4. Moving Towards Consistent Naming
→One option is creating a new cluster that satisfies topic name consistency and other
policies, and asking everyone to move there.
- It involves downtime and increased cost, and we can end up maintaining
multiple clusters for a while.
→We were looking for an incremental approach.
→A better alternative is topic migration with improved naming
conventions.
- Simpler: less risky than creating a new
cluster.
- Efficient: Data migration tools can automate the process, minimizing
downtime.
- Sustainable: Naming conventions can be addressed one subset of topics at a time.
5. Kafka Topic Renaming: A Hurdle
Apache Kafka does not currently support renaming topics natively.
6. Can we just copy the data to a new topic?
● Partition Mismatch: Cloning might distribute messages unevenly across partitions in the
new topic.
● Consumer Group Offset Loss: Consumer groups won't automatically track their progress
(offsets) in the new topic, leading to potential data loss or reprocessing.
● Message Order Issues: The order of messages might not be preserved during the cloning
process.
The Naive Approach - Cloning Topics
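The partition mismatch can be illustrated with a small sketch. It assumes keyed messages, a hash-modulo partitioner, and a new topic with a different partition count (6 versus 8, chosen arbitrarily). `zlib.crc32` is only a stand-in hash here; Kafka's default partitioner uses murmur2.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Hash-modulo partitioning; crc32 stands in for Kafka's murmur2."""
    return zlib.crc32(key) % num_partitions


# Hypothetical keys, re-produced into a topic with a different partition count.
keys = [f"customer-{i}".encode() for i in range(1000)]
moved = [k for k in keys if partition_for(k, 6) != partition_for(k, 8)]

print(f"{len(moved)} of {len(keys)} keys land on a different partition")
```

Any key that changes partition makes the old per-partition consumer offsets meaningless for the new topic, which is why a plain copy also loses consumer group progress.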
7. Leverage Existing Solutions (MirrorMaker2)
MM2 is based on the Kafka Connect framework. It moves data from a source cluster to a
destination cluster.
- It is a set of Kafka Connect connectors built specifically for replicating and migrating data.
- What makes it especially valuable for renaming is its ability to handle the complexities we just
discussed.
- MirrorMaker2 ensures proper partition management, tracks consumer group offsets, and
preserves the original order of your messages.
- However, it's primarily designed for Disaster Recovery (DR) scenarios.
10. Shared Internal Topics: By default, MirrorMaker2 creates internal topics that are
shared across all renaming jobs to track their progress.
- To enable renaming, we can customize internal topics with a 1:1 mapping to the
topic being renamed. This ensures isolation and simplifies management.
- This also gives us a redo capability: deleting a specific internal topic allows
mirroring to be redone for that particular renamed topic if necessary.
Customizing MirrorMaker2 for Renaming
13. Lessons learned (Optimizing MirrorMaker2 for Renaming)
● Tuning Connect tasks:
○ Set the connector's tasks.max close to the topic's partition count.
● Balance group offset refresh rates:
○ Avoid aggressive refresh.groups.interval.seconds values. Aim for a balance
between responsiveness and stability, e.g.:
sync.group.offsets.interval.seconds: 120
refresh.groups.interval.seconds: 120
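The tuning advice above can be sketched as a pair of connector configs. The property names and connector classes are standard MirrorMaker2 settings; the aliases, topic name, connector layout, and helper names are hypothetical, and connection settings (bootstrap servers, security) are omitted, so this is a starting point rather than the configuration used in the talk.

```python
import json
from typing import Dict


def source_connector_config(topic: str, partitions: int,
                            source_alias: str, target_alias: str) -> Dict[str, str]:
    """MirrorSourceConnector config with tasks.max matched to the partition count."""
    return {
        "connector.class": "org.apache.kafka.connect.mirror.MirrorSourceConnector",
        "source.cluster.alias": source_alias,
        "target.cluster.alias": target_alias,
        "topics": topic,
        "tasks.max": str(partitions),
    }


def checkpoint_connector_config(source_alias: str, target_alias: str,
                                interval_s: int = 120) -> Dict[str, str]:
    """MirrorCheckpointConnector config with moderate group refresh/sync intervals."""
    return {
        "connector.class": "org.apache.kafka.connect.mirror.MirrorCheckpointConnector",
        "source.cluster.alias": source_alias,
        "target.cluster.alias": target_alias,
        "sync.group.offsets.enabled": "true",
        "sync.group.offsets.interval.seconds": str(interval_s),
        "refresh.groups.interval.seconds": str(interval_s),
    }


if __name__ == "__main__":
    # Hypothetical topic and aliases, for illustration only.
    print(json.dumps(source_connector_config("orders", 12, "old", "new"), indent=2))
    print(json.dumps(checkpoint_connector_config("old", "new"), indent=2))
```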
14. Lessons learned (User Considerations for Kafka Topic Renaming)
Explaining the renaming process to the teams
○ Offset Translation:
■ Explain offset translation to users in an FAQ or documentation. This clarifies why
partition offsets might appear different after renaming.
○ Consumer Group Considerations:
■ Advise users to stop consumers and verify that consumer group offsets have synced.
○ Downtime:
■ Clearly communicate any expected downtime, or how it is avoided, during the renaming process.
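For an FAQ, a toy model can help explain why offsets look different after renaming. The sketch below is a deliberately simplified illustration and not MirrorMaker2's actual translation logic: it assumes recorded pairs of (source offset, target offset) and no gaps between records. The function name and the sample numbers are made up.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def translate_offset(syncs: List[Tuple[int, int]], source_offset: int) -> Optional[int]:
    """Map a committed source offset to the renamed topic.

    `syncs` holds (source_offset, target_offset) pairs sorted by source offset.
    Simplified model: assumes a gap-free copy between sync points.
    """
    source_points = [src for src, _ in syncs]
    i = bisect_right(source_points, source_offset) - 1
    if i < 0:
        return None  # committed offset is older than anything that was mirrored
    src, dst = syncs[i]
    return dst + (source_offset - src)


# The old topic has aged out its first 5000 records; the new topic starts at 0.
syncs = [(5000, 0), (5100, 100)]
print(translate_offset(syncs, 5042))
```

In this model a consumer group that had committed offset 5042 on the old topic continues from offset 42 on the new one: the position in the data is the same, only the numbering differs.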
15. ● Life before KIP-875
a. The only way to reset MM2 and redo mirroring was to delete the internal offset topic and redeploy with
a new cluster alias, as __consumer_offsets is global.
b. After KIP-875, it is possible to reset the connector offsets:
curl -s -X DELETE http://..//connectors/{}/offsets
● Before cloning any big topic, do the capacity planning.
First-class offsets support in Kafka Connect
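The reset can also be scripted against the Kafka Connect REST API. KIP-875 requires the connector to be in the STOPPED state before its offsets can be reset, so this sketch builds a stop, delete, resume sequence. The worker URL and connector name below are placeholders, and the requests are only constructed here, not sent.

```python
from typing import List
from urllib.parse import quote
from urllib.request import Request


def build_offset_reset_requests(connect_url: str, connector: str) -> List[Request]:
    """Build the REST calls to stop a connector, reset its offsets, and resume it."""
    base = f"{connect_url.rstrip('/')}/connectors/{quote(connector, safe='')}"
    return [
        Request(f"{base}/stop", method="PUT"),       # offsets can only be reset when stopped
        Request(f"{base}/offsets", method="DELETE"),  # the KIP-875 reset endpoint
        Request(f"{base}/resume", method="PUT"),      # start mirroring again from scratch
    ]


# Placeholder worker URL and connector name, for illustration only.
requests_to_send = build_offset_reset_requests("http://localhost:8083", "mm2-rename-orders")
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

Each request can then be sent with `urllib.request.urlopen(req)`, checking the response status before moving on to the next step.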
16. All the complexity is hidden from users/developers; all they are asked for is a
PR.
Streamlined User Experience