Brooklin Mirror Maker - How and why we moved away from Kafka Mirror Maker

Shun-ping Chiu
Software engineer @ LinkedIn Data Pipelines

Agenda
Kafka Mirroring Use Cases
Limitations for Kafka Mirror Maker
Future Work

Mirroring Use Cases
● Aggregating data from all data centers
● Moving data between LinkedIn and external cloud services

Tremendous Kafka Data
● Kafka data at LinkedIn continues to grow rapidly
● We are at 5T messages and 1.4 PB every day

Big Scale to Operate
● 40+ Kafka src clusters in different DCs
● 100+ pipelines
● 2T messages/day

Kafka Mirror Maker(KMM) Topology
[Diagram: Datacenter A and Datacenter B, each showing "tracking" and "aggregate tracking" Kafka clusters with two KMM clusters]
● Each KMM pipeline
○ mirrors data from 1 source cluster to 1 destination cluster
○ constitutes its own KMM cluster

KMM Setup
[Diagram: Datacenters A, B, C, ..., each showing "tracking", "aggregate tracking", "metrics" and "aggregate metrics" Kafka clusters connected by many KMM clusters]
● # of KMM clusters = # of data centers x # of Kafka src clusters
● Need to operate 100+ KMM clusters

Example - Add a Topic in KMM
● A static configuration file per KMM cluster means every change has to be deployed
● Let’s say we have a pipeline (a KMM cluster) with 100+ hosts
● And what about 100+ pipelines?

KMM Pain Points
● Hard to operate
○ Hard to add a new topic
○ Difficult to split a pipeline
● One bad partition brings down the pipeline
○ Deleted topic
○ ACL issue
● Performance issues
○ Unable to catch up with traffic
○ Increased lag

:(
Your Kafka Mirror Maker ran into a problem and needs to restart. We’re just collecting some error info, and then we’ll restart it for you. (0% complete)

Brooklin - Stream Ingestion Service
Sources: Data stores, Messaging systems, Microsoft EventHubs
Destinations: Data stores, Messaging systems, Microsoft EventHubs

BMM is built on Brooklin

● Built on top of our stream ingestion service, Brooklin
○ Better operability
○ Fault isolation
○ Performance optimizations
● BMM has fully replaced KMM at LinkedIn today

KMM vs BMM
[Diagram: with KMM, Datacenters A and B each run two KMM clusters alongside their "tracking" and "aggregate tracking" Kafka clusters; with BMM, each data center runs a single BMM cluster]
● BMM is one cluster per data center

BMM Topology
[Diagram: Datacenters A, B, C, ..., each with one BMM cluster serving the "tracking", "aggregate tracking", "metrics" and "aggregate metrics" Kafka clusters]
● 100+ KMM clusters vs. ~10 BMM clusters

Dynamic Management API
[Diagram: the Brooklin engine with its Kafka src connector and Kafka dest connector, backed by ZooKeeper; the Management Rest API and Diagnostics Rest API serve the management/monitoring portal and SRE/op dashboards]

RESTful API - Creating a Pipeline
[Diagram: Brooklin engine, Management Rest API, ZooKeeper]
create: POST /datastream
    name: mm_DC1-tracking_DC2-aggregate-tracking
    connectorName: KafkaMirrorMaker
    source:
        connectionString: kafkassl://DC1-tracking-vip:12345/topicA|topicB
    destination:
        connectionString: kafkassl://DC2-aggregate-tracking-vip:12345
    metadata:
        taskNums: 5
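As a rough illustration of the create request above, the fields can be assembled programmatically before being posted. This is a minimal sketch: the helper name `build_mirror_datastream` is hypothetical, JSON is only an assumed wire format, and the real Brooklin schema may carry more fields than the slide shows.

```python
import json


def build_mirror_datastream(name, source_vip, topics, destination_vip,
                            task_count, port=12345):
    """Assemble the fields shown on the slide (hypothetical helper)."""
    return {
        "name": name,
        "connectorName": "KafkaMirrorMaker",
        "source": {
            # Topics are joined with '|' exactly as in the slide's example.
            "connectionString":
                f"kafkassl://{source_vip}:{port}/{'|'.join(topics)}",
        },
        "destination": {
            "connectionString": f"kafkassl://{destination_vip}:{port}",
        },
        "metadata": {"taskNums": task_count},
    }


payload = build_mirror_datastream(
    name="mm_DC1-tracking_DC2-aggregate-tracking",
    source_vip="DC1-tracking-vip",
    topics=["topicA", "topicB"],
    destination_vip="DC2-aggregate-tracking-vip",
    task_count=5,
)
# Assumed serialization; this string would be the body of POST /datastream.
body = json.dumps(payload, indent=2)
```

Adding a topic then becomes a matter of rebuilding the payload with a longer `topics` list and sending it as an update, rather than redeploying a configuration file to every host.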

RESTful API - Updating a Pipeline
[Diagram: Brooklin engine, Management Rest API, ZooKeeper]
update: PUT /datastream/mm_DC1-tracking_DC2-aggregate-tracking
    name: mm_DC1-tracking_DC2-aggregate-tracking
    connectorName: KafkaMirrorMaker
    source:
        connectionString: kafkassl://DC1-tracking-vip:12345/topicA|topicB|topicC|topicD
    destination:
        connectionString: kafkassl://DC2-aggregate-tracking-vip:12345
    metadata:
        taskNums: 10
^topic*.

Pause a Pipeline
● Mirroring can be paused and resumed manually for each pipeline
● BMM can automatically pause mirroring for bad partitions for fault isolation
○ Flow of messages from healthy partitions continues
○ Auto-resumes the partitions after a configurable duration
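The auto-pause behavior described above can be sketched as a small bookkeeping class. This is an illustration only, not Brooklin's implementation: the class name `PartitionPauser`, its methods, and the injected clock are all assumptions made so the example is self-contained.

```python
import time


class PartitionPauser:
    """Tracks auto-paused partitions and resumes them after a delay."""

    def __init__(self, resume_after_seconds, clock=time.monotonic):
        self._resume_after = resume_after_seconds
        self._clock = clock
        self._paused = {}  # partition -> (reason, resume deadline)

    def auto_pause(self, partition, reason):
        # Remember why the partition was paused and when it may resume.
        self._paused[partition] = (reason, self._clock() + self._resume_after)

    def resume_expired(self):
        # Resume every partition whose pause duration has elapsed.
        now = self._clock()
        expired = [p for p, (_, deadline) in self._paused.items()
                   if deadline <= now]
        for partition in expired:
            del self._paused[partition]
        return sorted(expired)

    def active(self, assigned):
        # Healthy partitions keep flowing while bad ones sit out.
        return [p for p in assigned if p not in self._paused]


now = [0.0]  # fake clock so the example runs instantly
pauser = PartitionPauser(resume_after_seconds=600, clock=lambda: now[0])
assigned = ["topicA-0", "topicA-3", "topicB-0", "topicB-2"]

pauser.auto_pause("topicA-3", "SEND_ERROR")
healthy = pauser.active(assigned)   # topicA-3 is skipped, the rest continue

now[0] = 600.0                      # the configured duration has passed
resumed = pauser.resume_expired()   # topicA-3 is eligible again
```

The point of the design is that a failure is scoped to one partition: the other partitions assigned to the same task are never taken out of the active set.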

Diagnostic API
[Diagram: the Brooklin engine with its Kafka src connector and Kafka dest connector, backed by ZooKeeper; the Management Rest API and Diagnostics Rest API serve the management/monitoring portal and SRE/op dashboards]

RESTful API - On-demand Diagnostics
[Diagram: Brooklin engine, Diagnostics Rest API, ZooKeeper]
getAllStatus: GET /diag?datastream=mm_DC1-tracking_DC2-aggregate-tracking
    host1.prod.linkedin.com:
        datastream: mm_DC1-tracking_DC2-aggregate-tracking
        assignedTopicPartitions: [topicA-0, topicA-3, topicB-0, topicB-2]
        autoPausedPartitions: [{topicA-3: {reason: SEND_ERROR, description: failed to produce messages from this partition}}]
        manuallyPausedPartitions: []
    host2.prod.linkedin.com:
        datastream: mm_DC1-tracking_DC2-aggregate-tracking
        assignedTopicPartitions: [topicA-1, topicA-2, topicB-1, topicB-3]
        autoPausedPartitions: []
        manuallyPausedPartitions: []
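A response shaped like the one above lends itself to simple tooling, for example listing every auto-paused partition across hosts. The sketch below assumes the response has already been parsed into nested dicts and lists mirroring the slide's layout; the function name `auto_paused_by_host` is hypothetical and the actual response encoding is not shown in the talk.

```python
def auto_paused_by_host(diag_response):
    """Map each host to its (partition, reason) pairs for auto-paused partitions."""
    result = {}
    for host, status in diag_response.items():
        for entry in status.get("autoPausedPartitions", []):
            for partition, detail in entry.items():
                result.setdefault(host, []).append((partition, detail["reason"]))
    return result


# Same content as the slide, expressed as Python data (assumed shape).
diag = {
    "host1.prod.linkedin.com": {
        "datastream": "mm_DC1-tracking_DC2-aggregate-tracking",
        "assignedTopicPartitions": ["topicA-0", "topicA-3", "topicB-0", "topicB-2"],
        "autoPausedPartitions": [
            {"topicA-3": {
                "reason": "SEND_ERROR",
                "description": "failed to produce messages from this partition",
            }},
        ],
        "manuallyPausedPartitions": [],
    },
    "host2.prod.linkedin.com": {
        "datastream": "mm_DC1-tracking_DC2-aggregate-tracking",
        "assignedTopicPartitions": ["topicA-1", "topicA-2", "topicB-1", "topicB-3"],
        "autoPausedPartitions": [],
        "manuallyPausedPartitions": [],
    },
}

paused = auto_paused_by_host(diag)
```

Hosts with nothing paused are left out of the result, so an empty dict means the whole pipeline is healthy.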

Brooklin Mirroring Pseudocode
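while (!shutdown) {
    records = consumer.poll();      // fetch the next batch from the source cluster
    producer.send(records);         // asynchronous send to the destination cluster
    if (timeToCommit) {
        producer.flush();           // block until every in-flight send is acknowledged
        consumer.commit();          // only then is it safe to commit source offsets
    }
}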
Producer flush can be expensive: it blocks until every buffered record has been acknowledged
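To make the cost of the flush-then-commit loop concrete, here is a small runnable model of it. Everything in it is an assumption for illustration: `FakeProducer` stands in for a real Kafka producer, batches are assumed non-empty, and a commit is triggered every `commit_every` batches instead of on a timer.

```python
class FakeProducer:
    """In-memory stand-in for a producer with an internal send buffer."""

    def __init__(self):
        self.buffered = []
        self.delivered = []
        self.flush_calls = 0

    def send(self, record):
        # A real send is asynchronous; here the record just waits in a buffer.
        self.buffered.append(record)

    def flush(self):
        # A real flush blocks until the buffer drains; here it drains at once.
        self.delivered.extend(self.buffered)
        self.buffered.clear()
        self.flush_calls += 1


def mirror(batches, producer, commit_every):
    """Send each batch; flush, then commit, every `commit_every` batches."""
    committed = None
    for count, batch in enumerate(batches, start=1):
        for offset, value in batch:
            producer.send((offset, value))
        if count % commit_every == 0:
            producer.flush()            # everything sent so far is now delivered
            committed = batch[-1][0]    # so the last offset read can be committed
    return committed


producer = FakeProducer()
batches = [[(0, "a"), (1, "b")], [(2, "c")], [(3, "d"), (4, "e")]]
committed = mirror(batches, producer, commit_every=2)
```

After the run, offsets up to 2 are committed and the last batch is still sitting in the buffer. Every commit pays for a full drain of the producer, which is the stall that motivates avoiding the flush.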

Flushless Produce
Only commit “safe” acknowledged checkpoints:
● Flushless: consumer.poll() → producer.send(records) → consumer.commit(offsets)
● With flush: consumer.poll() → producer.send(records) → producer.flush() → consumer.commit()

Flushless Produce
[Diagram: the consumer reads o1, o2 from source partition sp0; the producer sends them to destination partitions dp0 and dp1; ack(sp0, o2) reaches the checkpoint manager first]
● Checkpoint manager maintains producer-acknowledged offsets for each source partition
Source partition sp0
    in-flight: [o1]
    acked: [o2]
    safe checkpoint: --

Flushless Produce
[Diagram: the consumer reads o3, o4 from source partition sp0; the producer sends them to destination partitions dp0 and dp1; ack(sp0, o1) reaches the checkpoint manager]
● Update safe checkpoint to largest acknowledged offset that is less than oldest in-flight (if any)
Source partition sp0
    in-flight: [o3, o4]
    acked: [o1, o2]
    safe checkpoint: o2
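The safe-checkpoint rule above can be sketched in a few lines. This is a minimal model under stated assumptions, not Brooklin's code: the class and method names are hypothetical, offsets are plain integers, and acknowledged offsets at or below the safe checkpoint are pruned as a bookkeeping convenience.

```python
class CheckpointManager:
    """Tracks in-flight and acknowledged offsets per source partition."""

    def __init__(self):
        self._in_flight = {}  # partition -> offsets sent but not yet acked
        self._acked = {}      # partition -> acked offsets above the checkpoint
        self._safe = {}       # partition -> current safe checkpoint

    def sent(self, partition, offset):
        self._in_flight.setdefault(partition, set()).add(offset)

    def ack(self, partition, offset):
        in_flight = self._in_flight.setdefault(partition, set())
        acked = self._acked.setdefault(partition, set())
        in_flight.discard(offset)
        acked.add(offset)

        # Largest acked offset below the oldest in-flight offset (if any).
        oldest = min(in_flight) if in_flight else None
        candidates = [o for o in acked if oldest is None or o < oldest]
        if candidates:
            safe = max(candidates)
            self._safe[partition] = safe
            # Offsets at or below the checkpoint no longer need tracking.
            self._acked[partition] = {o for o in acked if o > safe}

    def safe_checkpoint(self, partition):
        return self._safe.get(partition)


manager = CheckpointManager()
manager.sent("sp0", 1)
manager.sent("sp0", 2)
manager.ack("sp0", 2)                     # o2 acked while o1 is still in flight
first = manager.safe_checkpoint("sp0")    # nothing is safe to commit yet

manager.sent("sp0", 3)
manager.sent("sp0", 4)
manager.ack("sp0", 1)                     # o1 acked; o3 and o4 in flight
second = manager.safe_checkpoint("sp0")   # safe checkpoint advances to o2
```

The two calls reproduce the two slides: after the out-of-order ack of o2 there is no safe checkpoint, and once o1 is acknowledged the checkpoint jumps to o2 without any producer flush.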

Managing Performance through Tasks
● Datastream task
○ Consists of a dedicated Kafka consumer and uses a shared producer pool to produce the data
○ Performance is controlled by the # of tasks
○ Tasks are assigned to each host within the BMM cluster
● BMM uses sticky assignment to speed up task allocation

Sticky Task Assignment
[Diagram: before, four BMM hosts coordinated through ZooKeeper hold Tasks 1-6, with one host acting as leader; after a fifth BMM host is added, Task 6 runs on the new host while Tasks 1-5 keep their hosts]
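One generic way to get sticky behavior is to start from the previous assignment and move only as many tasks as balance requires. The sketch below is an assumption about the general technique, not Brooklin's actual strategy; the function name, host names, and the "load differs by at most one" balance criterion are all illustrative.

```python
def sticky_assign(previous, hosts, tasks):
    """Reassign tasks to hosts while keeping existing placements where possible."""
    wanted = set(tasks)
    # Keep every task on its previous host if that host is still alive.
    assignment = {h: [t for t in previous.get(h, []) if t in wanted]
                  for h in hosts}
    placed = {t for ts in assignment.values() for t in ts}

    # New or orphaned tasks go to the least-loaded host.
    for task in tasks:
        if task not in placed:
            least = min(hosts, key=lambda h: len(assignment[h]))
            assignment[least].append(task)

    # Move one task at a time until host loads differ by at most one.
    while True:
        most = max(hosts, key=lambda h: len(assignment[h]))
        least = min(hosts, key=lambda h: len(assignment[h]))
        if len(assignment[most]) - len(assignment[least]) <= 1:
            return assignment
        assignment[least].append(assignment[most].pop())


before = {"host1": [1, 5], "host2": [2, 6], "host3": [3], "host4": [4]}
after = sticky_assign(
    before,
    hosts=["host1", "host2", "host3", "host4", "host5"],
    tasks=[1, 2, 3, 4, 5, 6],
)
# Tasks that ended up on a different host than before.
moved = [t for h, ts in before.items() for t in ts if t not in after[h]]
```

When the fifth host joins, a single task moves and the other five keep their consumers running. Which task moves depends on tie-breaking, so it need not be the same one as in the diagram.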

BMM Performance Numbers
● Testing environment
○ Intel(R) Xeon(R) CPU E5-2680 v3 @ 2.50GHz, 12 cores, 64GB RAM
● Performance Metrics with 20 datastream tasks:
○ Throughput: compressed bytes up to 28 MB/s
○ Memory utilization: 70%
○ CPU utilization: ~100%

Passthrough Compression
● BMM is CPU bound: 70%+ of CPU time is spent in decompression & re-compression
○ GZIPInputStream.read(): ~10%
○ GZIPOutputStream.write(): ~61%
● “Passthrough” mirroring - skip the decompression & recompression
○ Throughput ~ 100MB/s
○ CPU utilization drops to 50%
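The difference between the two mirroring modes can be illustrated with Python's standard `gzip` module. This is only a sketch of where the CPU time goes: real Kafka record batches have their own framing, and the function names here are hypothetical.

```python
import gzip


def mirror_with_recompression(compressed_batch):
    """Default path: inflate the batch to records, then deflate it again."""
    records = gzip.decompress(compressed_batch)   # CPU spent decompressing
    return gzip.compress(records)                 # CPU spent recompressing


def mirror_passthrough(compressed_batch):
    """Passthrough path: forward the compressed bytes untouched."""
    return compressed_batch


original = b"tracking-event " * 1000
batch = gzip.compress(original)

recompressed = mirror_with_recompression(batch)
forwarded = mirror_passthrough(batch)
```

Both paths deliver the same payload once the destination consumer decompresses it, but the passthrough path does no compression work in the mirror itself. That matches the slide's observation that most CPU time was going to the gzip read and write calls.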

Performance & Stability
● Better workload distribution - workload-based assignment
● Auto-scaling - adjust number of tasks based on throughput

Open Source
Expected at EOM, April 2019
