Presented at the Stream Processing Meetup, July 19, 2018 (https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/251481797/).
At Uber, we operate 20+ Kafka clusters to collect system and application logs as well as event data from rider and driver apps. We need a Kafka replication solution to replicate data between Kafka clusters across multiple data centers for different purposes. This talk introduces the history behind uReplicator and its high-level architecture. As the original uReplicator ran into scalability challenges and operational overhead as the scale of our Kafka clusters increased, we built the Federated uReplicator, which addresses these issues and provides an extensible architecture for further scaling.
20. Solution 2: 1-1 Partition Mapping
● Round robin
● Topic with N partitions
● N^2 connections
● DoS-like traffic
[Diagram: source partitions p0–p3 consumed round-robin across mirror maker workers MM 1–MM 4 and produced to destination partitions p0–p3]
[Diagram: source partitions p0–p3 each mapped 1-1 through mirror maker workers MM 1–MM 4 to destination partitions p0–p3]
● Deterministic partition mapping
● Topic with N partitions
● N connections
● Reduced contention
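The deterministic mapping above can be sketched in a few lines. The modulo rule and worker count are illustrative assumptions, not uReplicator's exact code; the point is that each partition is pinned to exactly one worker, so a topic with N partitions needs only N connections instead of N^2:

```python
# Hypothetical sketch of deterministic 1-1 partition mapping: each source
# partition is pinned to a single mirror maker worker, unlike round robin,
# where every worker may touch every partition over time.
def deterministic_worker(partition: int, num_workers: int) -> int:
    """Pin a source partition to one mirror maker worker."""
    return partition % num_workers
```

With 4 workers, partitions p0–p3 map to workers 0–3 respectively, matching the diagram.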
21. Full Speed Phase Throughput
● First hour: 27.2 MB/s => 53.6 MB/s per aggregate broker
[Charts: per-broker throughput before and after 1-1 partition mapping]
22. Problem 2: Long Tail Phase
● During the full-speed phase, all workers are busy
● Some workers catch up and become idle, while others are still busy
23. Solution 1: Dynamic Workload Balance
● Original partition assignment
○ Based only on the number of partitions
○ Heavy partitions can land on the same worker
● Workload-based assignment
○ Uses the topic's total workload when it is added
■ Source cluster bytes-in rate
■ Retrieved from Chaperone3
○ Rebalances dynamically and periodically
■ Triggered when a worker exceeds 1.5x the average workload
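The workload-based scheme above can be sketched as a greedy assignment plus a rebalance trigger. The function names and data shapes are illustrative assumptions; the bytes-in rates stand in for the per-partition metrics a service like Chaperone3 would report:

```python
from collections import defaultdict

# Hypothetical sketch of workload-based assignment: each partition carries a
# bytes-in rate, and we always place it on the currently least-loaded worker.
def assign(partitions, workers):
    """partitions: dict of partition -> bytes-in rate; workers: list of names."""
    load = {w: 0.0 for w in workers}
    assignment = defaultdict(list)
    # Place heaviest partitions first so they spread across workers.
    for p, rate in sorted(partitions.items(), key=lambda kv: -kv[1]):
        w = min(load, key=load.get)  # least-loaded worker
        assignment[w].append(p)
        load[w] += rate
    return assignment, load

def needs_rebalance(load, threshold=1.5):
    """Trigger a rebalance when any worker exceeds 1.5x the average load."""
    avg = sum(load.values()) / len(load)
    return any(l > threshold * avg for l in load.values())
```

Sorting heaviest-first keeps two heavy partitions from piling onto the same worker, which is exactly the failure mode of the original count-based assignment.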
24. Solution 2: Lag-Feedback Rebalance
● Monitor topic lags
○ Balance workers based on lag
■ Larger lag = heavier workload
○ Dedicated workers for lagging topics
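The lag-feedback idea above can be sketched as follows. This is a hypothetical weighting, not uReplicator's actual formula: lag inflates a topic's effective workload so the balancer treats lagging topics as heavier, and the worst laggards are carved out for dedicated workers:

```python
# Hypothetical lag-feedback weighting: a topic's effective workload grows
# with its lag, so a lagging topic looks "heavier" to the balancer.
def effective_workload(bytes_in_rate: float, lag: float,
                       lag_factor: float = 1e-6) -> float:
    # lag_factor is an assumed tuning knob, not a value from the talk
    return bytes_in_rate * (1.0 + lag_factor * lag)

def pick_dedicated(topic_lags: dict, threshold: float) -> list:
    """Topics whose lag exceeds the threshold get dedicated workers."""
    return sorted(t for t, lag in topic_lags.items() if lag > threshold)
```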
25. Dynamic Workload Rebalance
● Adjust workload assignments every 10 minutes
● Multiple lagging topics on the same worker are spread across multiple workers
26. Issues Again
● Scalability
○ Grows with the number of partitions
○ Up to 1 hour for a full rebalance during a rolling restart
● Operation
○ Grows with the number of deployments
○ Onboarding a new cluster
● Sanity
○ Replication loops
○ Double routes
28. Federated uReplicator
● uReplicator Front
○ Whitelists/blacklists topics as (topic, src, dst) tuples
○ Persists state in a DB
○ Runs sanity checks
○ Assigns topics to the uReplicator Manager
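The sanity checks the Front runs guard against the loop and double-route issues mentioned earlier. A hypothetical sketch (the function and return values are illustrative, not uReplicator's API) of validating a new whitelist entry against the existing routes:

```python
# Hypothetical sanity check for a new (topic, src, dst) whitelist entry:
# reject an exact duplicate ("double route") and any hop that would close a
# replication cycle back to the source ("loop").
def check_route(routes, topic, src, dst):
    """routes: set of (topic, src, dst) tuples already whitelisted."""
    if (topic, src, dst) in routes:
        return "double route"
    # Existing hops for this topic, plus the proposed one.
    hops = {(s, d) for t, s, d in routes if t == topic}
    hops.add((src, dst))
    graph = {}
    for s, d in hops:
        graph.setdefault(s, []).append(d)
    # DFS from dst: if we can reach src, the new hop creates a cycle.
    seen, stack = set(), [dst]
    while stack:
        node = stack.pop()
        if node == src:
            return "loop"
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return "ok"
```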
29. Federated uReplicator
● uReplicator Manager route assignment:
○ if no route for src->dst exists:
■ select a controller and workers from the pool -> new route
■ add the topic to the route -> done
○ else:
■ for each existing route:
● if the route's #partitions < max:
○ add the topic to the route -> done
■ otherwise, select a controller and workers from the pool -> new route
■ add the topic to the route -> done
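The pseudocode above can be made runnable as a small sketch. The class name, route capacity, and data shapes are illustrative assumptions; in the real system a new route also acquires a controller and workers from a shared pool:

```python
# Runnable sketch of the Manager's route-assignment logic: routes are keyed
# by (src, dst); a topic goes into the first route with spare partition
# capacity, otherwise a new route is created.
class Manager:
    def __init__(self, max_partitions_per_route: int = 1000):  # assumed limit
        # (src, dst) -> list of routes; each route is a list of (topic, n_partitions)
        self.routes = {}
        self.max = max_partitions_per_route

    def _new_route(self, key):
        # Real system: select a controller and workers from the pool here.
        route = []
        self.routes.setdefault(key, []).append(route)
        return route

    def add_topic(self, src: str, dst: str, topic: str, n_partitions: int):
        key = (src, dst)
        # Try each existing route for this src->dst pair first.
        for route in self.routes.get(key, []):
            if sum(n for _, n in route) + n_partitions <= self.max:
                route.append((topic, n_partitions))
                return route
        # No route exists, or none has room: create a new one.
        route = self._new_route(key)
        route.append((topic, n_partitions))
        return route
```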
31. Federated uReplicator
● Auto-scaling, based on:
○ Total workload of the route
○ Total lag in the route
○ Total workload / expected workload per worker
● Automatically adds workers to the route
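A minimal sketch of the sizing rule implied by the bullets above, under assumed names and a scale-up-only policy (the talk only mentions adding workers):

```python
import math

# Hypothetical auto-scaling rule: a route's desired worker count is its total
# workload divided by the expected per-worker workload; this rule only ever
# adds workers, never removes them.
def desired_workers(total_workload: float, expected_per_worker: float,
                    current: int) -> int:
    needed = math.ceil(total_workload / expected_per_worker)
    return max(current, needed)  # scale up only
```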
32. Chaperone Auditing
● Consume messages from Kafka
● Compare message counts across tiers
● https://eng.uber.com/chaperone/
● https://github.com/uber/chaperone
33. Agenda
● Use cases & scale
● Motivation
● High-level design
● Future work
34. Future Work
● Open source
● Extensible framework
○ From anything to anything