Disaster Recovery for Multi-Region Apache Kafka Ecosystems at Uber
Yupeng Fu
Streaming Data Team, Uber
Apr 29, 2019
About myself
● Yupeng Fu
● Staff Engineer @ Uber
● Streaming Data
● Worked at Alluxio, Palantir
● UCSD & Tsinghua
Data Infrastructure @ Uber
[Diagram: PRODUCERS (Rider App, Driver App, API / Services, Mobile App, Surge, Payment, internal services, etc.) and DATABASES (Cassandra, MySQL) feed into Apache Kafka; CONSUMERS include Samza / Flink (real-time analytics, alerts, dashboards), applications, data science, analytics and reporting via Vertica / Hive, ad-hoc exploration, debugging via ELK, Hadoop, and AWS S3.]
Apache Kafka at Uber
● General pub-sub, messaging queue (minimal sketch after this list)
● Stream processing
○ AthenaX - self-service streaming analytics platform (Apache Samza &
Apache Flink)
● Database changelog transport
○ Cassandra, MySQL, etc.
● Ingestion into data lake
○ HDFS, S3
● Logging
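For orientation, here is the pub-sub role from the list above as a minimal sketch using the standard Java client; the broker address and topic name are placeholders, not Uber's actual setup:

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class PubSubSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Illustrative broker address; any number of consumer groups can
        // independently subscribe to the topic written here.
        props.put("bootstrap.servers", "kafka-regional:9092");
        props.put("key.serializer", StringSerializer.class.getName());
        props.put("value.serializer", StringSerializer.class.getName());

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Publish an event keyed by trip id (topic/key/value are made up).
            producer.send(new ProducerRecord<>("trip-events", "trip-123",
                    "{\"status\":\"started\"}"));
        }
    }
}
```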
Scale (excluding replication)
● Trillions of messages / day
● PBs of data / day
● Tens of thousands of topics
● Thousands of services
Multi-region at Uber
● Provide business resilience and continuity as the top priority
○ Survive outages and disasters without major business impact
○ Region isolation to avoid cascading failure
● Take care of the customer experience
○ Serve user requests in a closer region
○ Data integrity and consistency matters
● Improve infrastructure flexibility and efficiency
○ Decrease compliance and policy risks
○ Leverage both on-premise and cloud partners
Considerations for apps/services
● Highly available
○ Auto and on-demand region failover
● Highly flexible
○ Stateless and mobile
○ Data sharded by Geo
● Tradeoffs in SLA
○ Local data vs aggregated view
○ Latency vs consistency
● Leverage active-active storage layer for state sharing
Considerations for Apache Kafka
● Producer
○ Data produced locally
● Data aggregation
○ Topics replicated to agg clusters
● Active-active consumers
○ Double compute (consumer sketch after this list)
○ Data ingestion
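To make the active-active pattern concrete, a minimal sketch: each region runs the identical consumer group against its own agg cluster, so both regions compute the full result independently ("double compute"). Cluster, group, and topic names are invented:

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ActiveActiveConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Each region points at its own agg cluster; the other region runs
        // the same loop against r2-agg, so both hold the full computed state.
        props.put("bootstrap.servers", "r1-agg:9092");
        props.put("group.id", "surge-pricing");
        props.put("key.deserializer", StringDeserializer.class.getName());
        props.put("value.deserializer", StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("trip-events"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    // Update local pricing state here; an active-active
                    // storage layer would share results across regions.
                }
            }
        }
    }
}
```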
Active-active example: surge
● Real-time dynamic pricing
● Critical service with strict SLA
● Heavy distributed computation
● Large memory footprint
● Latency over consistency
[Diagram: dynamic pricing loop between Rider and Driver]
Active-active example: surge
[Diagram: the surge pipeline deployed active-active across two regions]
Data replication - uReplicator
● Uber’s Apache Kafka replication service (core copy loop sketched below)
● Goals
○ Stable replication, e.g. rebalance only occurs during startup
○ Operate with ease, e.g. add/remove whitelists
○ Scalable
○ High throughput
● Open sourced: https://github.com/uber/uReplicator
● Blog: https://eng.uber.com/ureplicator/
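uReplicator itself is a fleet of workers managed by Apache Helix, but the essence of each worker is a consume-from-source, produce-to-destination copy loop. A simplified sketch with made-up cluster and topic names (not uReplicator's actual code):

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.ByteArrayDeserializer;
import org.apache.kafka.common.serialization.ByteArraySerializer;

public class ReplicationWorkerSketch {
    public static void main(String[] args) {
        Properties src = new Properties();
        src.put("bootstrap.servers", "r1-region:9092"); // source regional cluster (illustrative)
        src.put("group.id", "replicator-sketch");
        src.put("key.deserializer", ByteArrayDeserializer.class.getName());
        src.put("value.deserializer", ByteArrayDeserializer.class.getName());

        Properties dst = new Properties();
        dst.put("bootstrap.servers", "r1-agg:9092"); // destination agg cluster (illustrative)
        dst.put("key.serializer", ByteArraySerializer.class.getName());
        dst.put("value.serializer", ByteArraySerializer.class.getName());

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(src);
             KafkaProducer<byte[], byte[]> producer = new KafkaProducer<>(dst)) {
            consumer.subscribe(Collections.singletonList("trip-events")); // whitelisted topic
            while (true) {
                // Copy every record from the source topic to the same topic
                // on the destination cluster, preserving key and value.
                for (ConsumerRecord<byte[], byte[]> r : consumer.poll(Duration.ofSeconds(1))) {
                    producer.send(new ProducerRecord<>(r.topic(), r.key(), r.value()));
                }
            }
        }
    }
}
```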
Considerations for Apache Kafka
● Producer
○ Data produced locally
● Data aggregation
○ Topics replicated to agg clusters
● Active-active consumers
○ Double compute
○ Data ingestion
● Active-passive consumers
○ Consistency sensitive apps
○ Challenge on offset sync
Offset sync - challenges
● Requirements (neither built-in reset policy satisfies both; see the sketch after this list)
○ No data loss -> cannot resume from the largest offset
○ Reduce duplicates -> cannot resume from the smallest offset
● Constraints
○ Not all messages have timestamps
○ Messages in the agg cluster are out of order due to the merge
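For context, the two rejected resume points correspond to the consumer's `auto.offset.reset` policies; a small sketch (cluster and group names are made up):

```java
import java.util.Properties;

public class ResetPolicyTradeoff {
    // Sketch: why neither built-in auto.offset.reset policy fits failover.
    public static Properties consumerProps(boolean acceptDataLoss) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "r2-agg:9092"); // failover target (illustrative)
        props.put("group.id", "consistency-sensitive-app");
        // "latest"   = resume from the largest offset  -> skips unconsumed messages (data loss)
        // "earliest" = resume from the smallest offset -> replays the whole topic (duplicates)
        props.put("auto.offset.reset", acceptDataLoss ? "latest" : "earliest");
        return props;
    }
}
```

Hence the need for an offset-sync service that maps a committed offset in one agg cluster to a safe starting offset in the other.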
Offset sync - architecture
● uReplicator reports the src-to-dst offset mapping to the offset manager
● Offset manager (data model sketched after this list)
○ Stores the checkpoint state
○ Translates between src and dst offsets
● Sync job periodically translates the offsets and pushes the new offsets
● Internal consumer looks up the offsets
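A minimal sketch of what the checkpoints and the manager's two lookups might look like. The record shape, class names, and in-memory storage are assumptions (the talk does not specify the implementation), and the per-partition dimension is elided for brevity:

```java
import java.util.ArrayList;
import java.util.List;

// Assumed checkpoint shape: "the source cluster has been replicated up to
// src-offset, which landed at dst-offset on the destination cluster".
record Checkpoint(String srcCluster, long srcOffset, String dstCluster, long dstOffset) {}

// Minimal in-memory stand-in for the offset manager; the real service
// persists checkpoint state and serves translations to the sync job.
class OffsetManager {
    private final List<Checkpoint> checkpoints = new ArrayList<>();

    void report(Checkpoint cp) { checkpoints.add(cp); }

    // Most recent checkpoint for (src, dst) whose dst offset is at or
    // below the consumer's committed position on dst.
    Checkpoint latestAtOrBefore(String src, String dst, long committedDstOffset) {
        Checkpoint best = null;
        for (Checkpoint cp : checkpoints) {
            if (cp.srcCluster().equals(src) && cp.dstCluster().equals(dst)
                    && cp.dstOffset() <= committedDstOffset
                    && (best == null || cp.dstOffset() > best.dstOffset())) {
                best = cp;
            }
        }
        return best;
    }

    // Largest dst offset on `dst` covering messages up to `srcOffset`
    // from `src` (i.e. how far the replicated stream had reached).
    long dstOffsetFor(String src, String dst, long srcOffset) {
        long best = 0;
        for (Checkpoint cp : checkpoints) {
            if (cp.srcCluster().equals(src) && cp.dstCluster().equals(dst)
                    && cp.srcOffset() <= srcOffset) {
                best = Math.max(best, cp.dstOffset());
            }
        }
        return best;
    }
}
```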
Offset sync - checkpoint

src-cluster  src-offset  dst-cluster  dst-offset
R1-region    1           R1-agg       1
R2-region    1           R2-agg       1
R2-region    1           R1-agg       3
R1-region    1           R2-agg       3
R1-region    3           R1-agg       5
R2-region    3           R2-agg       5
R2-region    3           R1-agg       7
R1-region    3           R2-agg       7

[Diagram: messages 11-14 from R1-region and 21-24 from R2-region are merged in different interleavings in R1-agg and R2-agg]
Offset sync - translation
● Find the mapped offset during the failover (worked sketch after this list)
○ Find the src offsets from the most recent checkpoints
○ Take the min of the checkpointed offsets on the failed-over agg cluster

[Diagram: a consumer committed at offset 6 on R1-agg; the latest checkpoints at or below it recover src offsets 3 (R1-region, via dst 5) and 1 (R2-region, via dst 3); those map to offsets 7 and 1 on R2-agg, so the consumer resumes from min(7, 1) = 1]
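Putting the checkpoint table and the rule above together, a hedged sketch of the translation, building on the `OffsetManager` sketch earlier:

```java
import java.util.List;

// Builds on the Checkpoint/OffsetManager sketch above. Translates a
// committed offset on the failed agg cluster into a safe resume offset
// on the failover agg cluster.
class OffsetTranslator {
    private final OffsetManager manager;

    OffsetTranslator(OffsetManager manager) { this.manager = manager; }

    long resumeOffset(List<String> srcClusters, String failedAgg,
                      String failoverAgg, long committedOnFailedAgg) {
        long resume = Long.MAX_VALUE;
        for (String src : srcClusters) {
            // 1. Recover the src offset from the most recent checkpoint on
            //    the failed cluster at or below the committed offset.
            Checkpoint onFailed = manager.latestAtOrBefore(src, failedAgg, committedOnFailedAgg);
            if (onFailed == null) return 0; // no checkpoint: fall back to the smallest offset
            // 2. Map that src offset onto the failover agg cluster.
            long mapped = manager.dstOffsetFor(src, failoverAgg, onFailed.srcOffset());
            // 3. Take the min across src clusters: no data loss, at the
            //    cost of re-reading some already-consumed messages.
            resume = Math.min(resume, mapped);
        }
        return resume;
    }
}
```

Taking the min is the conservative end of the tradeoff stated on the challenges slide: it guarantees no loss and merely reduces, rather than eliminates, duplicates.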
Offset sync - active-passive producer
● Find the mapped offset during the failover (variant sketched after this list)
○ Find the src offsets from the most recent checkpoints
○ Take the min of the checkpointed offsets on the failed-over agg cluster, ignoring checkpoints whose src offset is already the latest on the source (a fully caught-up source hides no unconsumed data)

[Diagram: after the producer fails over, new messages (e.g. 15, 16) are produced only in the new region, so the caught-up old source's checkpoint is excluded from the min]
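One way to express the variant, again as an assumption-laden sketch: an extra method on the `OffsetTranslator` above, where `latestOffsetOn()` is a hypothetical helper returning the source cluster's log-end offset (a real implementation could use the consumer's `endOffsets()` API):

```java
// Hypothetical addition to the OffsetTranslator sketch for the
// active-passive producer case.
long resumeOffsetActivePassive(List<String> srcClusters, String failedAgg,
                               String failoverAgg, long committedOnFailedAgg) {
    long resume = Long.MAX_VALUE;
    for (String src : srcClusters) {
        Checkpoint onFailed = manager.latestAtOrBefore(src, failedAgg, committedOnFailedAgg);
        if (onFailed == null) return 0; // no checkpoint: fall back to the smallest offset
        // A source that replication has fully caught up with can hide no
        // unconsumed data, so skip it when taking the min.
        if (onFailed.srcOffset() >= latestOffsetOn(src)) continue;
        resume = Math.min(resume, manager.dstOffsetFor(src, failoverAgg, onFailed.srcOffset()));
    }
    // Long.MAX_VALUE here means every source was caught up: the consumer
    // can resume from the failover cluster's log end.
    return resume;
}

// Hypothetical helper: latest (log-end) offset on the source cluster.
long latestOffsetOn(String srcCluster) { throw new UnsupportedOperationException("sketch"); }
```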
Q&A
Motivation - Why not MirrorMaker
● Pain points
○ Expensive rebalancing
○ Difficulty adding topics
○ Possible data loss
○ Metadata sync issues
