Presented at the Stream Processing Meetup, July 19, 2018 (https://www.meetup.com/Stream-Processing-Meetup-LinkedIn/events/251481797/).
At Uber, we operate 20+ Kafka clusters to collect system and application logs as well as event data from rider and driver apps. We need a Kafka replication solution to replicate data between Kafka clusters across multiple data centers for different purposes. This talk introduces the history behind uReplicator and its high-level architecture. As the original uReplicator ran into scalability challenges and operational overhead as the scale of our Kafka clusters increased, we built the Federated uReplicator, which addresses these issues and provides an extensible architecture for further scaling.
20. Solution 2: 1-1 Partition Mapping
● Round robin
● Topic with N partitions
● N^2 connections
● DoS-like traffic
[Diagram: source partitions p0–p3 consumed round-robin across mirror maker workers MM 1–MM 4 and produced to destination partitions p0–p3]
[Diagram: source partitions p0–p3 each mapped 1-1 through mirror maker workers MM 1–MM 4 to destination partitions p0–p3]
● Deterministic partition mapping
● Topic with N partitions
● N connections
● Reduced contention
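The deterministic mapping above can be sketched in a few lines. The modulo rule and worker count are illustrative assumptions, not uReplicator's exact code; the point is that each partition is pinned to exactly one worker, so a topic with N partitions needs only N connections instead of N^2:

```python
# Hypothetical sketch of deterministic 1-1 partition mapping: each source
# partition is pinned to a single mirror maker worker, unlike round robin,
# where every worker may touch every partition over time.
def deterministic_worker(partition: int, num_workers: int) -> int:
    """Pin a source partition to one mirror maker worker."""
    return partition % num_workers
```

With 4 workers, partitions p0–p3 map to workers 0–3 respectively, matching the diagram.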
21. Full Speed Phase Throughput
● First hour: 27.2 MB/s => 53.6 MB/s per aggregate broker
[Charts: per-broker throughput before and after 1-1 partition mapping]
22. Problem 2: Long Tail Phase
● During the full-speed phase, all workers are busy
● Some workers catch up and become idle, while others are still busy
23. Solution 1: Dynamic Workload Balance
● Original partition assignment
○ Based only on the number of partitions
○ Heavy partitions can land on the same worker
● Workload-based assignment
○ Uses the topic's total workload when it is added
■ Source cluster bytes-in rate
■ Retrieved from Chaperone3
○ Rebalances dynamically and periodically
■ Triggered when a worker exceeds 1.5x the average workload
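The workload-based scheme above can be sketched as a greedy assignment plus a rebalance trigger. The function names and data shapes are illustrative assumptions; the bytes-in rates stand in for the per-partition metrics a service like Chaperone3 would report:

```python
from collections import defaultdict

# Hypothetical sketch of workload-based assignment: each partition carries a
# bytes-in rate, and we always place it on the currently least-loaded worker.
def assign(partitions, workers):
    """partitions: dict of partition -> bytes-in rate; workers: list of names."""
    load = {w: 0.0 for w in workers}
    assignment = defaultdict(list)
    # Place heaviest partitions first so they spread across workers.
    for p, rate in sorted(partitions.items(), key=lambda kv: -kv[1]):
        w = min(load, key=load.get)  # least-loaded worker
        assignment[w].append(p)
        load[w] += rate
    return assignment, load

def needs_rebalance(load, threshold=1.5):
    """Trigger a rebalance when any worker exceeds 1.5x the average load."""
    avg = sum(load.values()) / len(load)
    return any(l > threshold * avg for l in load.values())
```

Sorting heaviest-first keeps two heavy partitions from piling onto the same worker, which is exactly the failure mode of the original count-based assignment.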
24. Solution 2: Lag-Feedback Rebalance
● Monitor topic lags
○ Balance workers based on lag
■ Larger lag = heavier workload
○ Dedicated workers for lagging topics
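The lag-feedback idea above can be sketched as follows. This is a hypothetical weighting, not uReplicator's actual formula: lag inflates a topic's effective workload so the balancer treats lagging topics as heavier, and the worst laggards are carved out for dedicated workers:

```python
# Hypothetical lag-feedback weighting: a topic's effective workload grows
# with its lag, so a lagging topic looks "heavier" to the balancer.
def effective_workload(bytes_in_rate: float, lag: float,
                       lag_factor: float = 1e-6) -> float:
    # lag_factor is an assumed tuning knob, not a value from the talk
    return bytes_in_rate * (1.0 + lag_factor * lag)

def pick_dedicated(topic_lags: dict, threshold: float) -> list:
    """Topics whose lag exceeds the threshold get dedicated workers."""
    return sorted(t for t, lag in topic_lags.items() if lag > threshold)
```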
25. Dynamic Workload Rebalance
● Adjust workload assignments every 10 minutes
● Multiple lagging topics on the same worker are spread across multiple workers
26. Issues Again
● Scalability
○ Grows with the number of partitions
○ Up to 1 hour for a full rebalance during a rolling restart
● Operation
○ Grows with the number of deployments
○ Onboarding a new cluster
● Sanity
○ Replication loops
○ Double routes
28. Federated uReplicator
● uReplicator Front
○ Whitelists/blacklists topics as (topic, src, dst) tuples
○ Persists state in a DB
○ Runs sanity checks
○ Assigns topics to the uReplicator Manager
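The sanity checks the Front runs guard against the loop and double-route issues mentioned earlier. A hypothetical sketch (the function and return values are illustrative, not uReplicator's API) of validating a new whitelist entry against the existing routes:

```python
# Hypothetical sanity check for a new (topic, src, dst) whitelist entry:
# reject an exact duplicate ("double route") and any hop that would close a
# replication cycle back to the source ("loop").
def check_route(routes, topic, src, dst):
    """routes: set of (topic, src, dst) tuples already whitelisted."""
    if (topic, src, dst) in routes:
        return "double route"
    # Existing hops for this topic, plus the proposed one.
    hops = {(s, d) for t, s, d in routes if t == topic}
    hops.add((src, dst))
    graph = {}
    for s, d in hops:
        graph.setdefault(s, []).append(d)
    # DFS from dst: if we can reach src, the new hop creates a cycle.
    seen, stack = set(), [dst]
    while stack:
        node = stack.pop()
        if node == src:
            return "loop"
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, []))
    return "ok"
```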
29. Federated uReplicator
● uReplicator Manager route assignment:
○ if no route for src->dst exists:
■ select a controller and workers from the pool -> new route
■ add the topic to the route -> done
○ else:
■ for each existing route:
● if the route's #partitions < max:
○ add the topic to the route -> done
■ otherwise, select a controller and workers from the pool -> new route
■ add the topic to the route -> done
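The pseudocode above can be made runnable as a small sketch. The class name, route capacity, and data shapes are illustrative assumptions; in the real system a new route also acquires a controller and workers from a shared pool:

```python
# Runnable sketch of the Manager's route-assignment logic: routes are keyed
# by (src, dst); a topic goes into the first route with spare partition
# capacity, otherwise a new route is created.
class Manager:
    def __init__(self, max_partitions_per_route: int = 1000):  # assumed limit
        # (src, dst) -> list of routes; each route is a list of (topic, n_partitions)
        self.routes = {}
        self.max = max_partitions_per_route

    def _new_route(self, key):
        # Real system: select a controller and workers from the pool here.
        route = []
        self.routes.setdefault(key, []).append(route)
        return route

    def add_topic(self, src: str, dst: str, topic: str, n_partitions: int):
        key = (src, dst)
        # Try each existing route for this src->dst pair first.
        for route in self.routes.get(key, []):
            if sum(n for _, n in route) + n_partitions <= self.max:
                route.append((topic, n_partitions))
                return route
        # No route exists, or none has room: create a new one.
        route = self._new_route(key)
        route.append((topic, n_partitions))
        return route
```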
31. Federated uReplicator
● Auto-scaling, based on:
○ Total workload of the route
○ Total lag in the route
○ Total workload / expected workload per worker
● Automatically adds workers to the route
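A minimal sketch of the sizing rule implied by the bullets above, under assumed names and a scale-up-only policy (the talk only mentions adding workers):

```python
import math

# Hypothetical auto-scaling rule: a route's desired worker count is its total
# workload divided by the expected per-worker workload; this rule only ever
# adds workers, never removes them.
def desired_workers(total_workload: float, expected_per_worker: float,
                    current: int) -> int:
    needed = math.ceil(total_workload / expected_per_worker)
    return max(current, needed)  # scale up only
```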
32. Chaperone Auditing
● Consume messages from Kafka
● Compare message counts across tiers
● https://eng.uber.com/chaperone/
● https://github.com/uber/chaperone
33. Agenda
● Use cases & scale
● Motivation
● High-level design
● Future work
34. Future Work
● Open source
● Extensible framework
○ From anything to anything