Disaster Recovery Solutions Deep Dive
Customer Success Engineering
August 2022
Table of Contents
1. Brokers, Zookeeper, Producers & Consumers
A quick primer
2. Disaster Recovery Options - Cluster Linking & Schema Linking
An asynchronous, multi-region solution
3. Stretch Clusters & Multi-Region Cluster
A synchronous, and optionally asynchronous, solution
4. Summary
Which solution is right for me?
01. Brokers, Zookeeper, Producers & Consumers 101
Brokers & Zookeeper
Apache Kafka: Scale Out Vs. Failover
[Diagram: Topic1 has four partitions spread across Brokers 1-4 for scale-out; each partition is replicated on three brokers for failover.]
Apache Zookeeper - Cluster coordination
[Diagram: Brokers 1-4 (Broker 2 is the controller) coordinate through a three-node Zookeeper ensemble; Zookeeper 3 is the leader.]
Zookeeper stores cluster metadata: heartbeats, watches, controller elections, cluster/topic configs, and permissions. Writes go to the leader.
Clients
Smart clients, dumb pipes
Producer
[Diagram: A producer writes to partitions 1-4 of a topic.]
A Kafka producer sends data to multiple partitions based on a partitioning strategy (default: hash(key) % number of partitions).
Data is batched per partition, and batches are grouped into one request per broker.
You can configure the batch size, linger time, and parallel connections per broker.
Producer
[Diagram: A producer writes to partitions 1-4; replicas acknowledge the write.]
A producer can request acknowledgement (acks) from 0, 1, or all in-sync replicas of the partition.
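As a minimal sketch of these producer settings (the broker address, topic name, and tuning values are illustrative placeholders, not taken from the deck):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClicksProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 32 * 1024); // max bytes in a per-partition batch
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);         // wait up to 10 ms to fill a batch
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION_CONFIG, 5); // parallel requests per broker
        props.put(ProducerConfig.ACKS_CONFIG, "all");           // "0", "1", or "all" (in-sync replicas)

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The default partitioner hashes the key, so "user-42" always lands on the same partition.
            producer.send(new ProducerRecord<>("clicks", "user-42", "page-view"));
        }
    }
}
```

With acks=all, durability also depends on the topic's min.insync.replicas setting, which drives several of the failure scenarios later in this deck.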
Consumer
[Diagram: A consumer reads from its assigned partitions.]
A consumer polls data from the partitions it has been assigned, based on its subscription.
Consumer
[Diagram: The consumer polls records, sends heartbeats, and commits offsets back to the cluster.]
As the consumer reads and processes the data, it can commit offsets (where it has read up to) in different ways: per time interval, per individual record, or at the "end of the current batch".
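A minimal consumer sketch along those lines, committing at the end of each polled batch (the bootstrap address, group id, and topic are placeholders):

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ClicksConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "clicks-app");            // the consumer group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false"); // commit manually instead

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("clicks"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    // process(record);
                }
                consumer.commitSync(); // the "end of current batch" commit style
            }
        }
    }
}
```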
Consumers - Consumer Groups
[Diagram: Two consumer groups, C1 and C2, each read all partitions of the same topic.]
Different applications (consumer groups) can independently read the same topic partitions at their own pace.
Consumers - Consumer group members
[Diagram: Four consumers in one group share the topic's partitions.]
Within the same application (consumer group), different partitions can be assigned to different consumers to increase parallel consumption as well as support failover.
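A hypothetical sketch of both ideas: two different group.ids read the same topic independently, while extra members inside one group split its partitions (addresses and names are placeholders):

```java
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class TwoGroups {
    static KafkaConsumer<String, String> consumerFor(String groupId) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, groupId);
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props);
        consumer.subscribe(List.of("clicks"));
        return consumer;
    }

    public static void main(String[] args) {
        // Different group.ids: each application independently receives every record.
        KafkaConsumer<String, String> reporting = consumerFor("reporting-app");
        KafkaConsumer<String, String> billing = consumerFor("billing-app");
        // Starting a second consumer with group.id "reporting-app" would instead
        // split the topic's partitions between the two members of that group.
    }
}
```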
Make Kafka Widely Accessible to Developers
Enable all developers to leverage Kafka throughout the organization with a wide variety of Confluent clients.
Confluent Clients: battle-tested and high-performing producer and consumer APIs (plus an admin client).
2. Disaster Recovery Options
Why Disaster Recovery?
Recent Regional Cloud Outages

AWS
● Dec 2021: An unexplained AWS outage created business disruptions all day (CNBC)
● Nov 2020: A Kinesis outage brought down over a dozen AWS services for 17 hours in us-east-1 (CRN, AWS)

Azure
● Apr 1 2021: Some critical Azure services were unavailable for an hour (Coralogix)
● Sept 2018: The South Central US region was unavailable for over a day (The Register)

GCP
● Nov 2021: An outage that affected Home Depot, Snap, Spotify, and Etsy (Bloomberg)
Outages hurt business performance
● Data Center has an outage: a data center or a region may be down for multiple hours, up to a day, based on historical outages.
● Mission-critical applications fail: the applications in that data center that run your business go offline.
● Customer impact: customers are unable to place orders, discover products, receive service, etc.
● Financial/reputational damage: revenue is lost directly from the inability to do business during downtime, and indirectly by damaging brand image and customer trust.
Failure Types

Transient Failures
● Transient failures in data centers or clusters are common and worth protecting against for business continuity purposes.
● Regional outages are rare but still worth protecting against for mission-critical systems.

Permanent Failures (Data Loss)
● Outages are typically transient but occasionally permanent: users accidentally delete topics, human error occurs.
● If your data is unrecoverable and mission critical, you need an additional, complementary solution.
Failure Scenarios

Data-Center / Regional Outages
● Data centers have single points of failure associated with hardware, resulting in associated outages.
● Regional outages arise from failures in the underlying cloud provider.

Human Error
● People delete topics, clusters, and worse.
● Unexpected behaviour arises from standard operations and within the CI/CD pipeline.

Platform Failures
● Load is applied unevenly or in short bursts by batch processing systems.
● Performance limitations arise unexpectedly.
● Bugs occur in Kafka, Zookeeper, and associated systems.
Cluster Linking & Schema Linking
Cluster Linking
Cluster Linking, built into Confluent Platform and Confluent Cloud, allows you to directly connect clusters together, mirroring topics from one cluster to another.
Cluster Linking makes it easier to build multi-cluster, multi-cloud, and hybrid cloud deployments.
[Diagram: Producers and consumers use the "clicks" topics on the active cluster in the Primary Region; a cluster link mirrors them to mirror topics on the DR cluster in the DR Region.]
Schema Linking
Schema Linking, built into Schema Registry, allows you to directly connect Schema Registry clusters together, mirroring subjects or entire contexts.
Contexts, introduced alongside Schema Linking, allow you to create namespaces within Schema Registry, which ensures mirrored subjects don't run into schema naming clashes.
[Diagram: A schema link mirrors the "clicks" schemas from the active cluster in the Primary Region to mirror schemas on the DR cluster in the DR Region.]
Prefixing
Prefixing allows you to add a prefix to a topic and, if desired, the associated consumer group, to avoid topic and consumer group naming clashes between the primary and Disaster Recovery cluster.
This is important when used in an active-active setup, and it is required for a two-way Cluster Link strategy, which is the recommended approach.
[Diagram: The "clicks" topic and consumer group on the active cluster are mirrored to the DR cluster under prefixed names ("DR-topic", "DR-Consumer-Group") via a prefixed cluster link.]
Active-Passive
HA/DR Active-Passive
1. Steady state
Setup
● The cluster link can automatically create mirror topics for any new topics on the active cluster.
● Historical data is replicated & incoming data is synced in real-time.
[Diagram: Producers and consumers use the active cluster in the Primary Region; the cluster link feeds mirror topics on the DR cluster in the DR Region.]
HA/DR Active-Passive
2. Failover
1. Detect a regional outage via metrics going to zero in that region; decide to fail over.
2. Call the failover API (REST API or CLI) on mirror topics to make them writable.
3. Update DNS to point at the DR cluster.
4. Start clients in the DR region.
[Diagram: Producers and consumers move from the failed active cluster to the DR cluster, whose mirror topics have been promoted via the failover REST API or CLI.]
HA/DR Active-Passive
3. Fail forward
The standard strategy is to "fail forward", promoting the DR region to be the new Primary Region:
● Cloud regions offer identical service.
● All applications & data systems have already moved to the DR region.
● Failing back would introduce risk with little benefit.
To fail forward, simply:
1. Delete topics on the original cluster (or spin up a new cluster).
2. Establish a cluster link in the reverse direction.
[Diagram: The former DR cluster is now the active cluster; a reversed cluster link mirrors its topics back to the former primary, which is now the DR cluster.]
HA/DR Active-Passive
3. Failback (alternative)
If you can't fail forward and need to fail back to the original region:
1. Delete topics on the Primary cluster (or spin up a new cluster).
2. Establish a cluster link in the reverse direction.
3. When the Primary has caught up, migrate producers & consumers back:
a. Stop clients.
b. Promote the mirror topic(s).
c. Restart clients pointed at the Primary cluster.
[Diagram: A reversed cluster link syncs the DR cluster's topics back to mirror topics on the Primary cluster asynchronously.]
HA/DR - Consumers must tolerate some duplicates
Consumers must tolerate duplicate messages because Cluster Linking is asynchronous.
[Diagram: Consumer X had read up to message C on the primary cluster at the time of the outage, but its replicated offset on the DR cluster lags behind; after failover it re-reads from that earlier offset and consumes message C twice.]
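The deck doesn't prescribe a dedup mechanism; the following is only a hypothetical application-level idempotency sketch, where a business key per record lets a message like C, redelivered after failover, be applied once:

```java
import java.util.HashSet;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecord;

public class DedupingProcessor {
    private final Set<String> processedKeys = new HashSet<>(); // in production: a durable store

    public void process(ConsumerRecord<String, String> record) {
        String businessKey = record.key(); // assumes the key uniquely identifies the event
        if (!processedKeys.add(businessKey)) {
            return; // duplicate delivered after failover; skip it
        }
        // ... apply the record's effect exactly once ...
    }
}
```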
Active-Passive
Bi-Directional Cluster Linking
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
1. Steady state
Setup, for a topic named clicks:
● We create duplicate topics on both the Primary and DR cluster.
● Create prefixed cluster links in both directions.
● Produce records to clicks on the Primary cluster.
● Consumers consume from a regex pattern (.*clicks), as sketched below.
[Diagram: Primary cluster "West" holds clicks plus mirror topic east.clicks; DR cluster "East" holds clicks plus mirror topic west.clicks. Each link adds its source cluster's prefix ("west." or "east.").]
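The regex subscription uses the standard Kafka consumer API; a minimal sketch (the bootstrap address and group id are placeholders):

```java
import java.time.Duration;
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RegexConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "west-broker:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "clicks-app");
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Matches both the local "clicks" and the mirrored "east.clicks" /
            // "west.clicks", so the same subscription works on either cluster.
            consumer.subscribe(Pattern.compile(".*clicks"));
            while (true) {
                consumer.poll(Duration.ofMillis(500)).forEach(record -> { /* process(record) */ });
            }
        }
    }
}
```

Because the pattern matches both clicks and the prefixed mirror topics, the same application code runs unchanged against either cluster.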
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
2. Outage strikes!
An outage in the primary region:
● stops producers & consumers in the primary region,
● temporarily pauses cluster link mirroring,
● and a small set of data may not have been replicated yet to the DR cluster – this is your "RPO".
[Diagram: The West cluster and its clients are down; East retains clicks and the west.clicks mirror, which may lag slightly.]
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
3. Failover
To failover:
● Move consumers and producers to the DR cluster; keep the same topic names / regex.
● Consumers consume both:
○ pre-failover data in west.clicks
○ post-failover data in clicks
● Don't delete the cluster link.
● Disable clicks -> west.clicks offset replication.
[Diagram: Clients now produce to and consume from the East cluster; the West cluster is down.]
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
4. Recovery
If/when the outage is over:
● The primary-to-DR cluster link automatically recovers the lagged data (RPO) from the primary cluster. Note: this data will be "late arriving" to the consumers.
● New records produced to the DR cluster will automatically begin replicating to the primary.
[Diagram: The links recover lagged data into the DR cluster's west.clicks and fail back new data into the primary's east.clicks.]
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
5. Failback
To fail back to the primary region, consumers need to pick up at the end of the writable topics, so:
● Ensure that all consumer groups have 0 consumer lag for their DR topics, e.g. west.clicks.
● Reset all consumer offsets to the last offset (the Log End Offset, LEO); this can be done by the platform operator, as sketched below.
Finally, move consumers & producers back to the Primary:
● Each producer / consumer group can be moved independently.
[Diagram: Consumers are reset to resume at the log end offset on the primary; producers and consumers move back from East to West.]
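The deck doesn't show how the operator performs that reset; one way is the standard Kafka AdminClient, as in this sketch (the bootstrap address, topic, partitions, and group name are placeholders, and the group must have no active members while its offsets are altered):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetToLatest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "primary-broker:9092"); // placeholder address
        try (Admin admin = Admin.create(props)) {
            // Partitions of the writable topic the group should resume from.
            Map<TopicPartition, OffsetSpec> request = new HashMap<>();
            request.put(new TopicPartition("clicks", 0), OffsetSpec.latest());
            request.put(new TopicPartition("clicks", 1), OffsetSpec.latest());

            ListOffsetsResult latest = admin.listOffsets(request);

            Map<TopicPartition, OffsetAndMetadata> newOffsets = new HashMap<>();
            for (TopicPartition tp : request.keySet()) {
                long leo = latest.partitionResult(tp).get().offset(); // the log end offset
                newOffsets.put(tp, new OffsetAndMetadata(leo));
            }
            // Point the consumer group at the LEO so it skips nothing and re-reads nothing.
            admin.alterConsumerGroupOffsets("clicks-app", newOffsets).all().get();
        }
    }
}
```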
HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback
6. And beyond
● Re-enable clicks -> west.clicks consumer offset replication.
● Once consumer lag is 0 on east.clicks, reset all consumer groups to the Log End Offset (the last offset of the partition) of clicks on the DR cluster.
[Diagram: Steady state restored; both clusters mirror each other's clicks topic again, and consumers resume at the log end offset.]
Active-Active
Bi-Directional Cluster Linking
[Diagram: A load balancer (as an example) splits application / web traffic across both regions. The West cluster holds clicks plus mirror topic east.clicks; the East cluster holds clicks plus mirror topic west.clicks. Producers in each region write to their local clicks, and consumers in each region subscribe to .*clicks.]
HA/DR Bi-Directional Cluster Linking: Active-Active
1. Steady state
[Diagram: Same topology as above; traffic flows to both clusters, and each cluster mirrors the other's clicks topic.]
HA/DR Bi-Directional Cluster Linking: Active-Active
2. Outage strikes!
[Diagram: The West cluster goes down; the load balancer re-routes all traffic to East.]
HA/DR Bi-Directional Cluster Linking: Active-Active
3. Failover
Any remaining pre-failure data is automatically recovered by the consumers.
[Diagram: All traffic is re-routed to the East cluster; its consumers pick up the pre-failure data from the west.clicks mirror.]
HA/DR Bi-Directional Cluster Linking: Active-Active
4. Return to Steady State
3. Stretch Clusters & Multi-Region Cluster

Stretch Cluster
A Stretch Cluster is ONE Kafka cluster that is "stretched" across multiple availability zones or data centers.
It uses Kafka's internal replication features to achieve RPO = 0 & low RTO.
Stretch Cluster - Why?
Stretch Cluster: Non-Stretch Cluster Behaviour
1. Steady State
Setup
● Any number of brokers, represented here by brokers 1-4, spread across 2 DCs.
● A standard three-node Zookeeper cluster spread across 2 DCs.
[Diagram: DC "West" hosts Brokers 1-2 and Zookeepers 1-2; DC "East" hosts Brokers 3-4 and Zookeeper 3.]
Stretch Cluster: Non-Stretch Cluster Behaviour
1. Steady State… continued
Setup
● Any number of brokers, represented here by brokers 1-4, spread across 2 DCs.
● A standard three-node Zookeeper cluster spread across 2 DCs.
● We'll also assume a replication factor of 3, min.insync.replicas of 2, and acks=all.
[Diagram: Replicas 1-2 in DC "West", Replica 3 in DC "East"; one broker in "East" is unused.]
Stretch Cluster: Non-Stretch Cluster Behaviour
2. DC Outage
An outage in DC "West"
● … let's start by just focusing on Kafka.
[Diagram: DC "West" (Replicas 1-2, Zookeepers 1-2) is down; DC "East" retains Replica 3 and Zookeeper 3.]
Stretch Cluster: Non-Stretch Cluster Behaviour
2. DC Outage
An outage in DC "West"
● min.insync.replicas can no longer be met, and we lose availability.
[Diagram: Only Replica 3 in DC "East" survives, below min.insync.replicas=2.]
Stretch Cluster: Non-Stretch Cluster Behaviour
3. Fixing Broker Availability
Increase to rf=4
● Looks like we've solved our issue…
[Diagram: Replicas 1-2 in DC "West", Replicas 3-4 in DC "East".]
Stretch Cluster: Non-Stretch Cluster Behaviour
3. Fixing Broker Availability… But
Increase to rf=4
● Looks like we've solved our issue… but if our 2 surviving replicas are out of sync, then we lose availability unless we trigger an unclean leader election and accept data loss.
[Diagram: DC "West" is down; Replicas 3-4 in DC "East" are both out of sync.]
Stretch Cluster: Non-Stretch Cluster Behaviour
4. Fixing Data Loss
Increase min.insync.replicas to 3
● Consumers continue to operate.
● Producers continue to operate once we revert to min.insync.replicas=2.
[Diagram: With min.insync.replicas=3, at least one replica in DC "East" is guaranteed in sync for every acknowledged write, so losing DC "West" loses no acknowledged data.]
Stretch Cluster: Non-Stretch Cluster Behaviour
4. Fixing Data Loss… But What About Zookeeper?
[Diagram: Zookeepers 1-2 sit in DC "West" and Zookeeper 3 in DC "East". If DC "West" goes down, only one of the three Zookeeper nodes survives, so quorum is lost and the whole cluster becomes unavailable.]
Stretch Cluster - 2 DC
Stretch Cluster: 2 DC + Observer
1. Steady State
Setup
● A minimum of 4 brokers.
● 6 Zookeeper nodes, one of which is an observer.
● Replication factor of 4, min.insync.replicas of 3, and acks=all (topic creation with these settings is sketched below).
[Diagram: DC "West" hosts Brokers 1-2 and Zookeepers 1-3; DC "East" hosts Brokers 3-4, Zookeepers 4-5, and Zookeeper 6 (observer).]
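Creating a topic with these settings via the standard AdminClient might look like the following sketch (the bootstrap address, topic name, and partition count are placeholders; spreading the 4 replicas across the DCs is assumed to be handled by broker.rack placement):

```java
import java.util.Map;
import java.util.Properties;
import java.util.Set;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateStretchTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "west-broker:9092"); // placeholder address
        try (Admin admin = Admin.create(props)) {
            // 6 partitions (placeholder), replication factor 4, min.insync.replicas 3.
            NewTopic topic = new NewTopic("clicks", 6, (short) 4)
                    .configs(Map.of("min.insync.replicas", "3"));
            admin.createTopics(Set.of(topic)).all().get();
        }
    }
}
```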
Stretch Cluster: 2 DC + Observer
2. DC Outage - On observer DC
An outage in DC "East"
● Consumers continue to operate.
● Producers continue to operate once we revert to min.insync.replicas=2 (see the sketch below).
[Diagram: DC "East", including the Zookeeper observer, is down; DC "West" retains Brokers 1-2 and Zookeepers 1-3, which still form a quorum.]
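That revert to min.insync.replicas=2 can be done per topic with the standard AdminClient, sketched here (the bootstrap address and topic are placeholders; the same setting can also be changed at the broker level):

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RelaxMinIsr {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "west-broker:9092"); // placeholder: a broker in the surviving DC
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "clicks");
            // Lower min.insync.replicas so producers with acks=all can make progress again.
            AlterConfigOp lowerMinIsr = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            admin.incrementalAlterConfigs(Map.of(topic, List.of(lowerMinIsr))).all().get();
        }
    }
}
```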
Stretch Cluster: 2 DC + Observer
3. DC Outage - On non-observer DC
An outage in DC "West"
● We can't reach Zookeeper quorum!
[Diagram: DC "West" (Brokers 1-2, Zookeepers 1-3) is down; the remaining voters, Zookeepers 4-5, are fewer than the 3 of 5 needed for quorum, and Zookeeper 6 is only an observer.]
Stretch Cluster: 2 DC + Observer
3. DC Outage - On non-observer DC… but
An outage in DC "West"
● We promote the Zookeeper observer to a full follower.
● Remove Zookeeper 1, 2 & 3 from the quorum list.
● Perform a rolling restart of the Zookeeper nodes.
[Diagram: DC "East" now runs a three-node quorum of Zookeepers 4, 5, and 6.]
Stretch Cluster: 2 DC + Observer
3. DC Outage - On non-observer DC
An outage in DC "West"
● Consumers continue to operate.
● Producers continue to operate once we revert to min.insync.replicas=2.
[Diagram: Brokers 3-4 and Zookeepers 4-6 in DC "East" carry the cluster.]
Stretch Cluster: 2 DC + Observer
4. Network Partition
A network partition occurs between DCs
● Consumers continue to operate as usual up until they've consumed all fully replicated data.
● Producers will fail, as we can no longer meet min.insync.replicas=3 (see the sketch below).
[Diagram: The link between DC "West" and DC "East" is severed; each side can still reach only its local brokers and Zookeeper nodes.]
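On the producer side, that failure surfaces through the send callback; a sketch of detecting it (addresses and names are placeholders):

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.errors.NotEnoughReplicasException;
import org.apache.kafka.common.errors.TimeoutException;
import org.apache.kafka.common.serialization.StringSerializer;

public class MinIsrAwareProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "west-broker:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            producer.send(new ProducerRecord<>("clicks", "user-42", "page-view"), (metadata, exception) -> {
                // NotEnoughReplicasException is retriable, so the producer retries internally;
                // if the partition stays below min.insync.replicas, the send eventually fails,
                // often surfacing as a TimeoutException after delivery.timeout.ms.
                if (exception instanceof NotEnoughReplicasException
                        || exception instanceof TimeoutException) {
                    // Alert operators; writes resume after failover or after min.insync.replicas is lowered.
                }
            });
            producer.flush();
        }
    }
}
```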
Stretch Cluster: 2 DC + Observer
5. Fixing Network Partition
A network partition occurs between DCs
● We manually shut down DC "East", then update min.insync.replicas=2.
● Clients resume operating as normal.
● Consumers failing over from DC "East" will consume some duplicate records.
[Diagram: DC "East" is shut down; DC "West" continues with Brokers 1-2 and Zookeepers 1-3.]
Observer Risk!
Zookeeper observers solve our availability and split-brain issues but risk data loss!
[Diagram: The Zookeeper leader and two followers in DC "West" form the quorum; the two followers and the observer in DC "East" are all out of sync, so losing DC "West" loses committed Zookeeper state.]
Hierarchical Quorum
Hierarchical Quorum involves getting consensus between multiple Zookeeper "groups", each of which forms its own quorum. In the case of a two-DC hierarchy, consensus must be reached between BOTH DCs.
[Diagram: Zookeepers 1-3 (leader in "West") form one group and Zookeepers 4-6 in "East" form another; both groups participate in the overall quorum.]
Stretch Cluster: 2 DC + Hierarchical Quorum
1. Steady State
Setup
● A minimum of 4 brokers.
● 6 Zookeeper nodes, arranged into two groups.
● Replication factor of 4, min.insync.replicas of 3, and acks=all.
[Diagram: DC "West" hosts Brokers 1-2 and Zookeepers 1-3; DC "East" hosts Brokers 3-4 and Zookeepers 4-6.]
Stretch Cluster: 2 DC + Hierarchical Quorum
2. DC Outage
An outage in DC "East"
● Consumers continue to operate for leaders on DC "West".
● Leaders can't be elected and configuration updates can't be made until we have hierarchical quorum.
[Diagram: DC "East" (Brokers 3-4, Zookeepers 4-6) is down.]
Stretch Cluster: 2 DC + Hierarchical Quorum
3. DC Outage
An outage in DC "East"
● Remove the DC "East" Zookeeper group from the hierarchy.
● Revert to min.insync.replicas=2.
[Diagram: DC "West" continues alone with Brokers 1-2 and Zookeepers 1-3.]
Stretch Cluster: 2 DC + Hierarchical Quorum
4. Network Partition
A network partition occurs between DCs
● Consumers continue to operate as usual up until they've consumed all fully replicated data.
● Producers will fail, as we can no longer meet min.insync.replicas=3.
[Diagram: The link between the two DCs is severed; neither side alone can reach hierarchical quorum.]
Stretch Cluster: 2 DC + Hierarchical Quorum
5. Fixing Network Partition
A network partition occurs between DCs
● We manually shut down DC "East", remove it from the hierarchy & update min.insync.replicas=2.
● Clients resume operating as normal.
● Consumers failing over from DC "East" will consume some duplicate records.
[Diagram: DC "West" continues alone.]
Stretch Cluster - 2.5 DC
Stretch Cluster: 2.5 DC
1. Steady State
Setup
● A minimum of 4 brokers.
● 3 Zookeeper nodes.
● Replication factor of 4, min.insync.replicas of 3, and acks=all.
● Note: it's actually better for the DCs with brokers to be closest to each other.
[Diagram: DC "West" hosts Brokers 1-2 and Zookeeper 1; DC "East" hosts Brokers 3-4 and Zookeeper 3; DC "Central", the "half" DC, hosts only Zookeeper 2.]
Stretch Cluster: 2.5 DC
2. DC Outage
An outage in DC "West"
● Consumers continue to operate.
● Producers continue to operate once we revert to min.insync.replicas=2.
[Diagram: DC "West" is down; Zookeepers 2-3 in "Central" and "East" retain quorum.]
Stretch Cluster: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Consumers connected to DC "East" continue to operate.
[Diagram: DC "West" is cut off from "Central" and "East".]
Stretch Cluster: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Consumers connected to DC "West" continue to operate until they've processed all fully replicated records.
[Diagram: same topology as above.]
Stretch Cluster: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Producers connected to DC "East" continue to operate once we revert to min.insync.replicas=2.
[Diagram: same topology as above.]
Stretch Cluster: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Producers connected to DC "West" continue to operate once we shut down DC "West", fail over, and revert to min.insync.replicas=2.
[Diagram: same topology as above.]
Multi-Region Cluster
Multi-Region Clusters: Followers vs Observers
Followers are normal replicas; observers act the same, except that they are not counted for acks=all produce requests.
[Diagram: In DC "West", producers write to the leader, which replicates synchronously to a follower; the observer in DC "East" replicates asynchronously.]
Multi-Region Clusters: Automatic Observer Promotion
As of Confluent Platform 6.1, observers can be configured to be promoted according to an ObserverPromotionPolicy, including:
● under-min-isr: promoted if the in-sync replica count drops below min.insync.replicas.
● under-replicated: promoted to cover any replica which is no longer in sync.
● leader-is-observer: promoted if the current leader is an observer.
[Diagram: The asynchronous observer in DC "East" becomes a follower when the policy triggers.]
Multi-Region Clusters: 2.5 DC
1. Steady State
Setup
● A minimum of 6 brokers.
● 3 Zookeeper nodes.
● Replication factor of 4, plus 2 additional observers, min.insync.replicas of 3, and acks=all.
[Diagram: DC "West" hosts Brokers 1-2, Broker 3 (Observer 1), and Zookeeper 1; DC "East" hosts Brokers 4-5, Broker 6 (Observer 2), and Zookeeper 3; DC "Central" hosts Zookeeper 2.]
Multi-Region Clusters: 2.5 DC
2. DC Outage
An outage in DC "West"
● The observer in DC "East" is promoted.
● Consumers and producers continue to operate as usual.
● RPO = 0
● RTO ~ 0
[Diagram: Broker 6 in DC "East" is promoted from Observer 2 to Replica 5; Brokers 4-5 hold Replicas 3-4.]
Multi-Region Clusters: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● The observer in DC "East" is promoted.
● Consumers and producers connected to DC "East" continue to operate as usual.
[Diagram: DC "West" is cut off; DC "East" runs Replicas 3-5.]
Multi-Region Clusters: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● The observer in DC "West" cannot be promoted, as it has no Zookeeper quorum.
[Diagram: Broker 3 (Observer 1) in DC "West" stays an observer; only Zookeeper 1 is reachable from "West".]
Multi-Region Clusters: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Consumers connected to DC "West" continue to operate until they've processed all fully replicated records. Once we shut down DC "West", the consumers will fail over and consume from the same point. This will result in duplicate consumption.
[Diagram: same topology as above.]
Multi-Region Clusters: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● Producers connected to DC "West" fail, as we can no longer meet min.insync.replicas=3.
[Diagram: same topology as above.]
Multi-Region Clusters: 2.5 DC
3. DC Network Partition
A network partition in DC "West"
● To continue operating as normal, we must manually shut down DC "West".
[Diagram: same topology as above.]
Stretch Cluster - 3 DC (ICG EAP KaaS)
Multi-Region Clusters: 3 DC
1. Steady State
Setup
● 9 brokers, of which the 3 brokers in DC "MW" host only observers.
● 5 Zookeeper nodes.
● Replication factor of 4, plus 1 additional observer, min.insync.replicas of 3, and acks=all.
[Diagram: DC "MW" hosts Brokers 1-3 (observers) and Zookeeper 1; DC "NY" hosts Brokers 4-6 and Zookeepers 2-3; DC "NJ" hosts Brokers 7-9 and Zookeepers 4-5.]
Multi-Region Clusters: 3 DC
2. DC Outage
An outage in DC "NY"
● The observers in DC "MW" are promoted.
● Consumers and producers continue to operate as usual.
● RPO = 0
● RTO ~ 0
[Diagram: Brokers 1-3 in DC "MW" are promoted from observers to replicas.]
Multi-Region Clusters: 3 DC
3. DC Network Partition
A network partition in DC "NJ"
● The observer in DC "MW" is promoted.
● Consumers and producers connected to DC "NY" continue to operate as usual.
[Diagram: DC "NJ" is cut off; "MW" and "NY" continue together.]
Multi-Region Clusters: 3 DC
3. DC Network Partition
A network partition in DC "NJ"
● Consumers connected to DC "NJ" continue to operate until they've processed all fully replicated records. Once we shut down DC "NJ", or the application is restarted, the consumers will fail over and consume from the same point. This will result in duplicate consumption.
[Diagram: same topology as above.]
Multi-Region Clusters: 3 DC
3. DC Network Partition
A network partition in DC "NJ"
● Producers connected to DC "NJ" fail, as we can no longer meet min.insync.replicas=3.
● Once the application is restarted, the producers will fail over and produce their data by connecting to DC "NY" / DC "MW".
[Diagram: same topology as above.]
Multi-Region Clusters: 3 DC
3. DC Network Partition
A network partition in DC "NJ"
● To continue operating as normal, we must manually shut down DC "NJ".
[Diagram: same topology as above.]
4. Summary
Comparison

Supported                          | Cluster Linking | Stretch / Multi-Region Cluster | Replicator / MirrorMaker 2
RPO=0                              |                 | ✓                              |
RTO=~0                             | ✓               | ✓                              | ✓
Active-Active                      | ✓               | ✓                              | ✓
Failover With All Clients          | ✓               | ✓                              |
Failover With Transactions         | ✓               | ✓                              |
Failover Maintains Record Ordering | ✓               | ✓                              |
Smooth Failback                    | ✓               | ✓                              |
Handles Full Cluster Failure       | ✓               |                                | ✓
Hybrid Cloud / Multi-Cloud         | ✓               |                                | ✓
Open Source                        |                 | ✓*                             | ✓*
Preserves Metadata                 | ✓               | ✓                              | ✓*
 
2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva2.pdf Ejercicios de programación competitiva
2.pdf Ejercicios de programación competitiva
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company OdishaBalasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
Balasore Best It Company|| Top 10 IT Company || Balasore Software company Odisha
 
How to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdfHow to Track Employee Performance A Comprehensive Guide.pdf
How to Track Employee Performance A Comprehensive Guide.pdf
 
What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....What are the key points to focus on before starting to learn ETL Development....
What are the key points to focus on before starting to learn ETL Development....
 
Unveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New FeaturesUnveiling the Future: Sylius 2.0 New Features
Unveiling the Future: Sylius 2.0 New Features
 
What is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need ItWhat is Fashion PLM and Why Do You Need It
What is Fashion PLM and Why Do You Need It
 
Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024Automate your Kamailio Test Calls - Kamailio World 2024
Automate your Kamailio Test Calls - Kamailio World 2024
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
Folding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a seriesFolding Cheat Sheet #4 - fourth in a series
Folding Cheat Sheet #4 - fourth in a series
 
EY_Graph Database Powered Sustainability
EY_Graph Database Powered SustainabilityEY_Graph Database Powered Sustainability
EY_Graph Database Powered Sustainability
 
Intelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalmIntelligent Home Wi-Fi Solutions | ThinkPalm
Intelligent Home Wi-Fi Solutions | ThinkPalm
 
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
Building a General PDE Solving Framework with Symbolic-Numeric Scientific Mac...
 
Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...Software Project Health Check: Best Practices and Techniques for Your Product...
Software Project Health Check: Best Practices and Techniques for Your Product...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
CRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. SalesforceCRM Contender Series: HubSpot vs. Salesforce
CRM Contender Series: HubSpot vs. Salesforce
 
Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)Der Spagat zwischen BIAS und FAIRNESS (2024)
Der Spagat zwischen BIAS und FAIRNESS (2024)
 

Citi Tech Talk Disaster Recovery Solutions Deep Dive

  • 1. Disaster Recovery Solutions Deep Dive Customer Success Engineering August 2022
  • 2. Table of Contents 2 1. Brokers, Zookeeper, Producers & Consumers A quick Primer 3. Stretch Clusters & Multi-Region Cluster An asynchronous, multi-region solution 2. Disaster Recovery Options - Cluster Linking & Schema Linking A synchronous and optionally asynchronous solution 4. Summary Which solution is right for me?
  • 5. Apache Kafka: Scale Out Vs. Failover 5 Broker 1 Topic1 partition1 Broker 2 Broker 3 Broker 4 Topic1 partition1 Topic1 partition1 Topic1 partition2 Topic1 partition2 Topic1 partition2 Topic1 partition3 Topic1 partition4 Topic1 partition3 Topic1 partition3 Topic1 partition4 Topic1 partition4
  • 6. Apache Zookeeper - Cluster coordination 6 6 Broker 1 partition Broker 2 (controller) Broker 3 Broker 4 Zookeeper 2 partition partition Zookeeper 1 Zookeeper 3 (leader) partition partition partition partition Stores metadata: heartbeats, watches, controller elections, cluster/topic configs, permissions writes go to leader
  • 8. Producer 8 P partition 1 partition 2 partition 3 partition 4 A Kafka producer sends data to multiple partitions based on partitioning strategy (default is hash(key) % no of partitions). Data is sent in batch per partition then batched in request per broker. Can configure batch size, linger, parallel connections per broker
  • 9. Producer 9 P partition 1 partition 2 partition 3 partition 4 A producer can choose to get acknowledgement (acks) from 0, 1, or ALL (in-sync) replicas of the partition
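To make the batching and acknowledgement knobs above concrete, here is a minimal Java producer sketch using the Apache Kafka client. The broker address, topic name and tuning values are placeholders for illustration, not recommendations.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;
import org.apache.kafka.common.serialization.StringSerializer;

public class ClickProducer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class.getName());
        props.put(ProducerConfig.ACKS_CONFIG, "all");      // wait for all in-sync replicas to acknowledge
        props.put(ProducerConfig.BATCH_SIZE_CONFIG, 65536); // bytes batched per partition (example value)
        props.put(ProducerConfig.LINGER_MS_CONFIG, 10);     // wait up to 10 ms to fill a batch
        props.put(ProducerConfig.MAX_IN_FLIGHT_REQUESTS_PER_CONNECTION, 5); // parallel requests per broker
        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // Records sharing a key are hashed to the same partition by the partitioning strategy.
            producer.send(new ProducerRecord<>("clicks", "user-42", "page=/home"));
        }
    }
}
```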
  • 10. Consumer 10 C A consumer polls data from partitions it has been assigned based on a subscription
  • 11. Consumer 11 C As the consumer reads the data and processes the data, it can commit offsets (where it has read up to) in different ways (per time interval, individual records, or “end of current batch”) commit offset heartbeat poll records
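The "end of current batch" commit style described above looks like this in a minimal Java sketch; the broker address, group id and topic are placeholders, and process() stands in for real application logic.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ClickConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder address
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "click-processor");       // placeholder group
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        // Commit manually at the end of each processed batch rather than on a timer.
        props.put(ConsumerConfig.ENABLE_AUTO_COMMIT_CONFIG, "false");
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(List.of("clicks"));
            while (true) {
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                for (ConsumerRecord<String, String> record : records) {
                    process(record); // application logic
                }
                consumer.commitSync(); // "end of current batch" commit style
            }
        }
    }
    static void process(ConsumerRecord<String, String> record) { /* ... */ }
}
```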
  • 12. Consumers - Consumer Groups 12 C C C1 C C C2 Different applications can independently read from same topic partitions at their own pace
  • 13. Consumers - Consumer group members 13 C C C C Within the same application (consumer group), different partitions can be assigned to different consumers to increase parallel consumption as well as support failover
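A sketch of that parallel consumption, assuming a hypothetical 4-thread layout and placeholder connection details: each KafkaConsumer below shares the same group.id, so Kafka divides the topic's partitions among them, and a failed member's partitions are reassigned to the survivors.

```java
import java.time.Duration;
import java.util.List;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class ParallelGroup {
    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // never shut down in this sketch
        for (int i = 0; i < 4; i++) {
            pool.submit(() -> {
                Properties props = new Properties();
                props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker1:9092"); // placeholder
                // Same group.id: the 4 consumers split the topic's partitions between them.
                props.put(ConsumerConfig.GROUP_ID_CONFIG, "click-processor");
                props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
                try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                    consumer.subscribe(List.of("clicks"));
                    while (!Thread.currentThread().isInterrupted()) {
                        consumer.poll(Duration.ofMillis(500)).forEach(r -> { /* process */ });
                    }
                }
            });
        }
    }
}
```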
  • 14. Make Kafka Widely Accessible to Developers 14 Enable all developers to leverage Kafka throughout the organization with a wide variety of Confluent clients Confluent Clients Battle-tested and high performing producer and consumer APIs (plus admin client)
  • 17. Recent Regional Cloud Outages 1 7 AWS Azure GCP Dec 2021: An unexplained AWS outage created business disruptions all day (CNBC) Nov 2020: A Kinesis outage brought down over a dozen AWS services for 17 hours in us-east-1 (CRN, AWS) Apr 1 2021: Some critical Azure services were unavailable for an hour (Coralogix) Sept 2018: South Central US region was unavailable for over a day (The Register) Nov 2021: An outage that affected Home Depot, Snap, Spotify, and Etsy (Bloomberg)
  • 18. Outages hurt business performance 18 A data center or a region may be down for multiple hours–up to a day–based on historical experience Data Center has an outage The applications in that data center that run your business go offline Mission-critical applications fail Customers are unable to place orders, discover products, receive service, etc. Customer Impact Revenue is lost directly from the inability to do business during downtime, and indirectly by damaging brand image and customer trust Financial/Reputational Impact
  • 19. Failure Types 1 9 Transient Failures Permanent Failures (Data Loss) Transient failures in data-centers or clusters are common and worth protecting against for business continuity purposes. Regional outages are rare but still worth protecting against for mission critical systems. Outages are typically transient but occasionally permanent. Users accidentally delete topics, human error occurs. If your data is unrecoverable and mission critical, you need an additional complementary solution.
  • 20. Failure Scenarios Data-Center / Regional Outages Platform Failures Human Error Data-Centers have single points of failure associated with hardware, resulting in outages. Regional Outages arise from failures in the underlying cloud provider. People delete topics, clusters and worse. Unexpected behaviour arises from standard operations and within the CI/CD pipeline. Load is applied unevenly or in short bursts by batch processing systems. Performance limitations arise unexpectedly. Bugs occur in Kafka, Zookeeper and associated systems.
  • 21. Cluster Linking & Schema Linking
  • 22. 22 Cluster Linking Cluster Linking, built into Confluent Platform and Confluent Cloud, allows you to directly connect clusters together, mirroring topics from one cluster to another. Cluster Linking makes it easier to build multi-cluster, multi-cloud, and hybrid cloud deployments. Active cluster Consumers Producers clicks clicks Topics DR cluster clicks clicks Mirror Topics Cluster Link Primary Region DR Region
  • 23. 23 Schema Linking Schema Linking, built into Schema Registry, allows you to directly connect Schema Registry clusters together, mirroring subjects or entire contexts. Contexts, introduced alongside Schema Linking, allow you to create namespaces within Schema Registry, which ensures mirrored subjects don’t run into schema naming clashes. Active cluster Consumers Producers clicks clicks Schemas DR cluster clicks clicks Mirror Schemas Schema Link Primary Region DR Region Consumers Producers
  • 24. 24 Prefixing Prefixing allows you to add a prefix to a topic and, if desired, the associated consumer group, to avoid topic and consumer group naming clashes between the primary and Disaster Recovery cluster. This is important in an active-active setup and required for the two-way Cluster Link strategy, which is the recommended approach. Active cluster Consumer-Group clicks clicks Topic DR cluster clicks clicks DR-topic Cluster Link Primary Region DR Region DR-Consumer-Group
  • 26. HA/DR Active-Passive 1. Steady state Setup ● The cluster link can automatically create mirror topics for any new topics on the active cluster ● Historical data is replicated & incoming data is synced in real-time Active cluster Consumers Producers clicks clicks topics DR cluster clicks clicks mirror topics Cluster Link Primary Region DR Region
  • 27. HA/DR Active-Passive 2. Failover 1. Detect a regional outage via metrics going to zero in that region; decide to fail over 2. Call the failover API on mirror topics to make them writable 3. Update DNS to point at the DR cluster 4. Start clients in the DR region Active cluster Consumers Producers clicks clicks topics DR cluster clicks clicks mirror topics failover REST API or CLI Consumers Producers Primary Region DR Region
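Step 2 above is an API call against the DR cluster. As a rough sketch, the Java snippet below posts the failover request over HTTP; the endpoint path and payload follow the Confluent v3 REST API for cluster links as we understand it, and the host, cluster id and link name are placeholders — verify the exact route against your Confluent version's API reference before relying on it.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class MirrorFailover {
    public static void main(String[] args) throws Exception {
        String restEndpoint = "https://dr-kafka-rest:8090"; // placeholder host
        String clusterId = "dr-cluster-id";                 // placeholder cluster id
        String linkName = "dr-link";                        // placeholder link name
        // Ask the DR cluster to make the listed mirror topics writable.
        String body = "{\"mirror_topic_names\":[\"clicks\"]}";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(restEndpoint + "/kafka/v3/clusters/" + clusterId
                        + "/links/" + linkName + "/mirrors:failover"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.statusCode() + " " + response.body());
    }
}
```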
  • 28. HA/DR Active-Passive 3. Fail forward The standard strategy is to “fail forward”, promoting the DR region to be the new Primary Region: ● Cloud regions offer identical service ● You have already moved all of your applications & data systems to the DR region ● Failing back would introduce risk with little benefit To fail forward, simply: 1. Delete topics on the original cluster (or spin up a new cluster) 2. Establish a cluster link in the reverse direction Active DR cluster clicks clicks mirror topics DR Active cluster clicks clicks mirror topics Cluster Link Consumers Producers Primary DR Region DR Primary Region
  • 29. HA/DR Active-Passive 3. Failback (alternative) If you can’t fail forward and need to fail back to the original region: 1. Delete topics on the Primary cluster (or spin up a new cluster) 2. Establish a cluster link in the reverse direction 3. When the Primary has caught up, migrate producers & consumers back: a. Stop clients b. Promote mirror topic(s) c. Restart clients pointed at the Primary cluster DR cluster clicks clicks mirror topics Consumers Producers Cluster Link Primary Region DR Region Primary cluster clicks clicks mirror topics
  • 30. Synced asynchronously HA/DR - Consumers must tolerate some duplicates Consumers must tolerate duplicate messages because Cluster Linking is asynchronous. Primary cluster Consumer X A B C D Topic Consumer X offset at time of outage DR cluster A B C D Mirror Topic Consumer X offset at time of failover ... ... A B C C D ... Consumes message C twice
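One common way to tolerate such duplicates is to make processing idempotent by de-duplicating on a business-level identifier; offsets can't be used for this, since they differ between a topic and its mirror after failover. A minimal sketch, assuming the record key carries a unique event id and an in-memory set stands in for a real persistent store:

```java
import java.time.Duration;
import java.util.HashSet;
import java.util.Set;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class DedupingProcessor {
    // In production this would be a bounded or persistent store; a set keeps the sketch short.
    private final Set<String> processedIds = new HashSet<>();

    public void run(KafkaConsumer<String, String> consumer) {
        while (true) {
            ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
            for (ConsumerRecord<String, String> record : records) {
                // Assumed: the key is a unique business-level event id.
                String eventId = record.key();
                if (processedIds.add(eventId)) {
                    process(record);
                } // else: duplicate redelivered after failover; skip it
            }
            consumer.commitSync();
        }
    }
    void process(ConsumerRecord<String, String> record) { /* idempotent side effects */ }
}
```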
  • 32. DR cluster “East” HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 1. Steady state Setup For a topic named clicks ● We create duplicate topics on both the Primary and DR cluster ● Create prefixed cluster links in both directions ● Produce records to clicks on the Primary cluster ● Consumers consume from a Regex pattern Primary cluster “West” clicks Consumers .*clicks Producers clicks Add prefix west clicks clicks clicks west.clicks east.clicks Add prefix east
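The regex subscription in the last bullet is plain Kafka consumer API. A minimal sketch with placeholder connection details — the same ".*clicks" pattern picks up clicks, west.clicks and east.clicks, so the application code is identical on either cluster:

```java
import java.util.Properties;
import java.util.regex.Pattern;
import org.apache.kafka.clients.consumer.ConsumerConfig;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.serialization.StringDeserializer;

public class RegexSubscriber {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "west-broker:9092"); // placeholder
        props.put(ConsumerConfig.GROUP_ID_CONFIG, "click-processor");           // placeholder
        props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class.getName());
        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            // Matches clicks, west.clicks and east.clicks, before or after failover.
            consumer.subscribe(Pattern.compile(".*clicks"));
            // poll loop as in the earlier consumer example...
        }
    }
}
```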
  • 33. DR cluster “East” HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 2. Outage strikes! An outage in the primary region: ● Stops producers & consumers in the primary region ● Temporarily pauses cluster link mirroring ● A small set of data may not have been replicated yet to the DR cluster – this is your “RPO” Primary cluster “West” clicks Consumers .*clicks Producers clicks clicks clicks clicks west.clicks east.clicks
  • 34. DR cluster “East” HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 3. Failover To fail over: ● Move consumers and producers to the DR cluster - keep the same topic names / regex ● Consumers consume both ○ Pre-failover data in west.clicks ○ Post-failover data in clicks ● Don’t delete the cluster link ● Disable clicks -> west.clicks offset replication Primary cluster “West” clicks Consumers .*clicks Producers clicks clicks clicks clicks west.clicks east.clicks
  • 35. DR cluster “East” HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 4. Recovery If/when the outage is over: ● The primary-to-DR cluster link automatically recovers the lagged data (RPO) from the primary cluster Note: this data will be “late arriving” to the consumers ● New records generated to the DR cluster will automatically begin replicating to the primary Primary cluster “West” clicks Consumers .*clicks Producers clicks Recovers data Fails back data clicks clicks clicks west.clicks east.clicks
  • 36. DR cluster “East” HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 5. Failback To fail back to the primary region, consumers need to pick up at the end of the writable topics, so: ● Ensure that all consumer groups have 0 consumer lag for their DR topics e.g. west.clicks ● Reset all consumer offsets to the last offsets (LEO); this can be done by the platform operator Finally, move consumers & producers back to the Primary ● Each producer / consumer group can be moved independently Primary cluster “West” clicks Consumers .*clicks Producers clicks Recovers data Fails back data clicks clicks clicks west.clicks east.clicks Reset consumers to resume here move move
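The operator-driven offset reset in these failback steps can be scripted with the standard Java AdminClient: list each partition's log-end offset, then commit it for the group. A sketch with placeholder addresses, topic and group names; the group must have no active members while its offsets are altered.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.ListOffsetsResult;
import org.apache.kafka.clients.admin.OffsetSpec;
import org.apache.kafka.clients.consumer.OffsetAndMetadata;
import org.apache.kafka.common.TopicPartition;

public class ResetToLatest {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "primary:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // Partitions of the writable topic the group should resume from.
            Map<TopicPartition, OffsetSpec> query = Map.of(
                    new TopicPartition("clicks", 0), OffsetSpec.latest(),
                    new TopicPartition("clicks", 1), OffsetSpec.latest());
            Map<TopicPartition, ListOffsetsResult.ListOffsetsResultInfo> ends =
                    admin.listOffsets(query).all().get();
            // Point the group's committed offsets at the log-end offset of each partition.
            Map<TopicPartition, OffsetAndMetadata> newOffsets = new HashMap<>();
            ends.forEach((tp, info) -> newOffsets.put(tp, new OffsetAndMetadata(info.offset())));
            admin.alterConsumerGroupOffsets("click-processor", newOffsets).all().get();
        }
    }
}
```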
  • 37. DR cluster “East” Primary cluster “West” clicks Recovers data Fails back data clicks clicks clicks west.clicks east.clicks HA/DR Bi-Directional Cluster Linking: Automatic Data Recovery & Failback 6. And beyond Re-enable clicks -> west.clicks consumer offset replication Once consumer lag is 0 on east.clicks, then reset all consumer groups to Log End Offset (last offset of the partition) on “clicks” on DR cluster Consumers .*clicks Producers clicks Reset consumers to resume here
  • 39. West cluster clicks east.clicks East cluster west.clicks clicks Consumers .*clicks Producers Add prefix west Add prefix east Consumers .*clicks Producers Applications / Web Traffic Load Balancer (example) Applications Applications 39 HA/DR Bi-Directional Cluster Linking: Active-Active 1. Steady state
  • 40. West cluster clicks east.clicks East cluster west.clicks clicks Consumers .*clicks Producers Add prefix west Add prefix east Consumers .*clicks Producers Applications / Web Traffic Load Balancer (example) Applications Applications 40 HA/DR Bi-Directional Cluster Linking: Active-Active 2. Outage strikes!
  • 41. West cluster clicks east.clicks East cluster west.clicks clicks Consumers .*clicks Producers Add prefix west Add prefix east Consumers .*clicks Producers Applications / Web Traffic Load Balancer (example) Applications Applications re-route 41 HA/DR Bi-Directional Cluster Linking: Active-Active 3. Failover
  • 42. West cluster clicks east.clicks East cluster west.clicks clicks Consumers .*clicks Producers Add prefix west Add prefix east Consumers .*clicks Producers Applications / Web Traffic Load Balancer (example) Applications Applications Any remaining pre-failure data is automatically recovered by the consumers re-route 42 HA/DR Bi-Directional Cluster Linking: Active-Active 4. Return to Steady State
  • 43. 43 Stretch Cluster A Stretch Cluster is ONE Kafka cluster that is “stretched” across multiple availability zones or data centers. Uses Kafka internal replication features to achieve RPO = 0 & low RTO.
  • 44. 3. Stretch Clusters & Multi-Region Cluster 44
  • 45. Stretch Cluster - Why? 45
  • 46. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 1. Steady State Setup ● An arbitrary number of brokers, represented here by brokers 1-4, spread across 2 DCs ● A standard three node Zookeeper cluster spread across 2 DCs DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 3
  • 47. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 1. Steady State… continued Setup ● An arbitrary number of brokers, represented here by brokers 1-4, spread across 2 DCs ● A standard three node Zookeeper cluster spread across 2 DCs ● We’ll also assume a replication-factor of 3, min.insync.replicas of 2 and acks=all DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Unused Broker Zookeeper 3
  • 48. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 2. DC Outage An outage in DC “West” ● … let’s start by just focusing on Kafka. DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Unused Broker Zookeeper 3
  • 49. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 2. DC Outage An outage in DC “West” ● Min.insync.replicas can no longer be met and we lose availability DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Unused Broker Zookeeper 3
  • 50. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 3. Fixing Broker Availability Increase to rf=4 ● Looks like we’ve solved our issue… DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Replica 4 Zookeeper 3
  • 51. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 3. Fixing Broker Availability… But Increase to rf=4 ● Looks like we’ve solved our issue… but, if our 2 remaining replicas are down or out of sync then we lose availability unless we trigger an unclean leader election and accept data loss. DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Out of Sync Replica 4 Out of Sync Zookeeper 3
  • 52. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 4. Fixing Data Loss Increase min.insync.replicas to 3 ● Consumers continue to operate ● Producers continue to operate once we revert to min.insync.replicas=2 DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Replica 4 Out of Sync Zookeeper 3
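The min.insync.replicas changes used throughout these scenarios are dynamic topic (or broker) config updates. A minimal Java AdminClient sketch that relaxes a topic back to 2 during a failure, with placeholder connection and topic names:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class RelaxMinIsr {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "east-broker:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "clicks");
            // Lower the bar so producers with acks=all can make progress on the surviving DC.
            AlterConfigOp relax = new AlterConfigOp(
                    new ConfigEntry("min.insync.replicas", "2"), AlterConfigOp.OpType.SET);
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(topic, List.of(relax));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```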
  • 53. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 4. Fixing Data Loss… But What About Zookeeper? DC “West” Replica 1 Replica 2 Zookeeper 1 Zookeeper 2 DC “East” Replica 3 Replica 4 Out of Sync Zookeeper 3
  • 54. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 4. Fixing Data Loss… But What About Zookeeper? DC “West” Zookeeper 1 Zookeeper 2 DC “East” Zookeeper 3 Broker 1 Broker 2 Broker 3 Broker 4
  • 55. Stretch Cluster: Non-Stretch Cluster Cluster Behaviour 4. Fixing Data Loss… But What About Zookeeper? DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 3
  • 56. Stretch Cluster - 2 DC 56
  • 57. Stretch Cluster: 2 DC + Observer 1. Steady State Setup ● A minimum of 4 brokers ● 6 Zookeeper nodes, one of which is an observer ● Replication factor of 4, min.insync.replicas of 3 and acks=all DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6 (Observer)
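Under this setup, topics would be created with the stated settings. A minimal Java AdminClient sketch, with the partition count, broker address and topic name as placeholders:

```java
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateStretchTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "west-broker:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            // 6 partitions (example), replication factor 4, min.insync.replicas 3.
            NewTopic clicks = new NewTopic("clicks", 6, (short) 4)
                    .configs(Map.of("min.insync.replicas", "3"));
            admin.createTopics(List.of(clicks)).all().get();
        }
    }
}
```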
  • 58. Stretch Cluster: 2 DC + Observer 2. DC Outage - On observer DC An outage in DC “East” ● Consumers continue to operate ● Producers continue to operate once we revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6 (Observer)
  • 59. Stretch Cluster: 2 DC + Observer 3. DC Outage - On non-observer DC An outage in DC “West” ● We can’t reach Zookeeper quorum! DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6 (Observer)
  • 60. Stretch Cluster: 2 DC + Observer 3. DC Outage - On non-observer DC… but An outage in DC “West” ● We promote the Zookeeper observer to a full follower ● Remove Zookeeper 1, 2 & 3 from quorum list ● Perform rolling restart of Zookeeper nodes DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 61. Stretch Cluster: 2 DC + Observer 3. DC Outage - On non-observer DC An outage in DC “West” ● Consumers continue to operate ● Producers continue to operate once we revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 62. Stretch Cluster: 2 DC + Observer 4. Network Partition A network partition occurs between DCs ● Consumers continue to operate as usual up until they’ve consumed all fully replicated data ● Producers will fail as we can no longer meet min.insync.replicas=3 DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6 (Observer)
  • 63. Stretch Cluster: 2 DC + Observer 5. Fixing Network Partition A network partition occurs between DCs ● We manually shut down DC “East”, then update min.insync.replicas=2 ● Clients resume operating as normal ● Consumers failing over from DC “East” will consume some duplicate records DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6 (Observer)
  • 64. 64 Observer Risk! Zookeeper observers solve our availability and split-brain issues but risk data loss! DC “West” Zookeeper Leader Zookeeper Follower Zookeeper Follower DC “East” Zookeeper Follower (Out of Sync) Zookeeper Follower (Out of Sync) Zookeeper Observer (Out of Sync) Quorum
  • 65. 65 Hierarchical Quorum Hierarchical Quorum involves getting consensus between multiple Zookeeper “groups” which each form their own quorum. In the case of a two-DC hierarchy, consensus must be reached between BOTH DCs. DC “West” Zookeeper 1 (Leader) Zookeeper 2 Zookeeper 3 DC “East” Zookeeper 4 Zookeeper 5 Zookeeper 6 Quorum
  • 66. Stretch Cluster: 2 DC + Hierarchical Quorum 1. Steady State Setup ● A minimum of 4 brokers ● 6 Zookeeper nodes, arranged into two groups ● Replication factor of 4, min.insync.replicas of 3 and acks=all DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 67. Stretch Cluster: 2 DC + Hierarchical Quorum 2. DC Outage An outage in DC “East” ● Consumers continue to operate for leaders on DC “West” ● Leaders can’t be elected and configuration updates can’t be made until we have hierarchical quorum DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 68. Stretch Cluster: 2 DC + Hierarchical Quorum 3. DC Outage An outage in DC “East” ● Remove DC “East” Zookeeper group from hierarchy ● Revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 69. Stretch Cluster: 2 DC + Hierarchical Quorum 4. Network Partition A network partition occurs between DCs ● Consumers continue to operate as usual up until they’ve consumed all fully replicated data ● Producers will fail as we can no longer meet min.insync.replicas=3 DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 70. Stretch Cluster: 2 DC + Hierarchical Quorum 5. Fixing Network Partition A network partition occurs between DCs ● We manually shut down DC “East”, remove it from the hierarchy & update min.insync.replicas=2 ● Clients resume operating as normal ● Consumers failing over from DC “East” will consume some duplicate records DC “West” Broker 1 Broker 2 Zookeeper 1 Zookeeper 2 DC “East” Broker 3 Broker 4 Zookeeper 4 Zookeeper 5 Zookeeper 3 Zookeeper 6
  • 71. Stretch Cluster - 2.5 DC 71
  • 72. Stretch Cluster: 2.5 DC 1. Steady State Setup ● A minimum of 4 brokers ● 3 Zookeeper nodes ● Replication factor of 4, min.insync.replicas of 3 and acks=all ● Note: it’s actually better for the DCs with brokers to be closest together DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 73. Stretch Cluster: 2.5 DC 2. DC Outage An outage in DC “West” ● Consumers continue to operate ● Producers continue to operate once we revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 74. Stretch Cluster: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Consumers connected to DC “East” continue to operate DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 75. Stretch Cluster: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Consumers connected to DC “West” continue to operate until they’ve processed all fully replicated records DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 76. Stretch Cluster: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Producers connected to DC “East” continue to operate once we revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 77. Stretch Cluster: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Producers connected to DC “West” continue to operate once we shut down DC “West”, fail over and revert to min.insync.replicas=2 DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 3 Broker 4 Zookeeper 3 DC “Central” Zookeeper 2
  • 79. 79 Multi-Region Clusters: Followers vs Observers Followers are normal replicas; observers act the same except that they are not considered for acks=all produce requests. DC “West” Producers Follower Synchronous Leader DC “East” Observer Asynchronous
  • 80. 80 Multi-Region Clusters: Automatic Observer Promotion As of Confluent Platform v6.1 observers can be configured to be promoted to meet the ObserverPromotionPolicy, including: ● Under-min-isr: Promoted if the in-sync replica size drops below min.insync.replicas ● Under-replicated: Promoted to cover any replica which is no longer in sync ● Leader-is-observer: Promoted if the current leader is an observer DC “West” Producers Follower Synchronous Leader DC “East” Follower Asynchronous
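In Confluent Server these policies are expressed in the replica placement JSON attached to a topic via the confluent.placement.constraints config. The sketch below sets a version-2 placement with under-min-isr promotion; the rack names, replica counts, broker address and topic name are assumptions for illustration, so check the replica placement schema for your CP version:

```java
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.AdminClientConfig;
import org.apache.kafka.clients.admin.AlterConfigOp;
import org.apache.kafka.clients.admin.ConfigEntry;
import org.apache.kafka.common.config.ConfigResource;

public class ObserverPlacement {
    public static void main(String[] args) throws Exception {
        // Version-2 replica placement JSON; rack names and counts are illustrative assumptions.
        String placement = "{\"version\":2,"
                + "\"replicas\":[{\"count\":2,\"constraints\":{\"rack\":\"west\"}}],"
                + "\"observers\":[{\"count\":2,\"constraints\":{\"rack\":\"east\"}}],"
                + "\"observerPromotionPolicy\":\"under-min-isr\"}";
        Properties props = new Properties();
        props.put(AdminClientConfig.BOOTSTRAP_SERVERS_CONFIG, "west-broker:9092"); // placeholder
        try (Admin admin = Admin.create(props)) {
            ConfigResource topic = new ConfigResource(ConfigResource.Type.TOPIC, "clicks");
            Map<ConfigResource, Collection<AlterConfigOp>> updates = Map.of(topic, List.of(
                    new AlterConfigOp(new ConfigEntry("confluent.placement.constraints", placement),
                            AlterConfigOp.OpType.SET)));
            admin.incrementalAlterConfigs(updates).all().get();
        }
    }
}
```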
  • 81. Multi-Region Clusters: 2.5 DC 1. Steady State Setup ● A minimum of 6 brokers ● 3 Zookeeper nodes ● Replication factor of 4, 2 additional observers, min.insync.replicas of 3 and acks=all DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 Broker 5 Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer1) Broker 6 (Observer2)
  • 82. Multi-Region Clusters: 2.5 DC 2. DC Outage An outage in DC “West” ● The Observer in DC “East” is promoted ● Consumers and Producers continue to operate as usual ● RPO = 0 ● RTO ~ 0 DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer1) Broker 6 (Replica 5)
  • 83. Multi-Region Clusters: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● The Observer in DC “East” is promoted ● Consumers and Producers connected to DC “East” continue to operate as usual DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer1) Broker 6 (Replica 5)
  • 84. Multi-Region Clusters: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● The Observer in DC “West” cannot be promoted as it has no Zookeeper Quorum DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer 1) Broker 6 (Replica 5)
  • 85. Multi-Region Clusters: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Consumers connected to DC “West” continue to operate until they’ve processed all fully replicated records. Once we shut down DC “West” the consumers will fail over and consume from the same point. This will result in duplicate consumption DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer 1) Broker 6 (Replica 5)
  • 86. Multi-Region Clusters: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● Producers connected to DC “West” fail as we can no longer meet min.insync.replicas=3 DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer 1) Broker 6 (Replica 5)
  • 87. Multi-Region Clusters: 2.5 DC 3. DC Network Partition A network partition in DC “West” ● To continue operating as normal we must manually shut down DC “West” DC “West” Broker 1 Broker 2 Zookeeper 1 DC “East” Broker 4 (Replica 3) Broker 5 (Replica 4) Zookeeper 3 DC “Central” Zookeeper 2 Broker 3 (Observer 1) Broker 6 (Replica 5)
  • 88. Stretch Cluster - 3 DC (ICG EAP KaaS) 88
  • 89. Multi-Region Clusters: 3 DC 1. Steady State Setup ● 9 brokers, of which the 3 brokers in DC “MW” host only observers ● 5 Zookeeper nodes ● Replication factor of 4, 1 additional observer, min.insync.replicas of 3 and acks=all DC “MW” Broker 1 (Observer) Broker 2 (Observer) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Observer) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 90. Multi-Region Clusters: 3 DC 2. DC Outage An outage in DC “NY” ● The Observers in DC “MW” are promoted ● Consumers and Producers continue to operate as usual ● RPO = 0 ● RTO ~ 0 DC “MW” Broker 1 (Replica) Broker 2 (Replica) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Replica) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 91. Multi-Region Clusters: 3 DC 3. DC Network Partition A network partition in DC “NJ” ● The Observer in DC “MW” is promoted ● Consumers and Producers connected to DC “NY” continue to operate as usual DC “MW” Broker 1 (Replica) Broker 2 (Replica) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Replica) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 92. Multi-Region Clusters: 3 DC 3. DC Network Partition A network partition in DC “NJ” ● Consumers connected to DC “NJ” continue to operate until they’ve processed all fully replicated records. Once we shut down DC “NJ” or the application is restarted, the consumers will fail over and consume from the same point. This will result in duplicate consumption DC “MW” Broker 1 (Replica) Broker 2 (Replica) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Replica) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 93. Multi-Region Clusters: 3 DC 3. DC Network Partition A network partition in DC “NJ” ● Producers connected to DC “NJ” fail as we can no longer meet min.insync.replicas=3 ● Once the application is restarted, the producers will fail over and produce data by connecting to DC “NY” / DC “MW” DC “MW” Broker 1 (Replica) Broker 2 (Replica) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Replica) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 94. Multi-Region Clusters: 3 DC 3. DC Network Partition A network partition in DC “NJ” ● To continue operating as normal we must manually shut down DC “NJ” DC “MW” Broker 1 (Replica) Broker 2 (Replica) Zookeeper 1 DC “NJ” Broker 7 Broker 8 Zookeeper 4 DC “NY” Zookeeper 2 Broker 3 (Replica) Broker 9 Broker 4 Broker 5 Broker 6 Zookeeper 3 Zookeeper 5
  • 96. Comparison 96

    Supported                            Cluster Linking   Stretch Cluster / Multi-Region Cluster   Replicator / MirrorMaker 2
    RPO=0                                                  ✓
    RTO=~0                               ✓                 ✓                                        ✓
    Active-Active                        ✓                 ✓                                        ✓
    Failover With All Clients            ✓                 ✓
    Failover With Transactions                             ✓                                        ✓
    Failover Maintains Record Ordering                     ✓                                        ✓
    Smooth Failback                      ✓                 ✓
    Handles Full Cluster Failure         ✓                                                          ✓
    Hybrid Cloud / Multi-Cloud           ✓                                                          ✓
    Open Source                                            ✓*                                       ✓*
    Preserves Metadata                   ✓                 ✓                                        ✓*

Editor's Notes

  1. Today we are going to go through a comparison of disaster recovery solutions.
  2. So, a quick summary of our agenda: We’ll discuss what disaster recovery solutions are and why we should use them; We’ll discuss cluster linking & schema linking as a complementary component; We’ll cover Stretch Clusters and multi-region clusters as an extension; We’ll go into S3 backup and restore as a complementary solution; We’ll cover legacy solutions including Replicator and MirrorMaker 2; and Finally, we’ll wrap up with a summary.
  3. So, let’s start with what disaster recovery solutions are and why you should use them.
  4. So, let’s start with what disaster recovery solutions are and why you should use them.
  5. Let’s start with why you should care. I’ve included here five examples of regional outages across the three largest cloud providers. It should go without saying, but if these incidents occur on services like AWS, Azure and GCP then we should assume they can happen to you too.
  6. So, why does this matter? Simply put, outages hurt business performance! As an example of this: Say you experience a cloud region outage - This service may be down for multiple hours, up to a day based on historical experience; Then mission-critical applications will fail - The applications in that region that run your business go offline; As such customers are impacted - Customers are unable to place orders, discover products, receive service, etc; and Ultimately, we are impacted financially - Revenue is lost directly from the inability to do business during downtime, and indirectly by damaging brand image and customer trust.
  7. Now that we know why we want to avoid them, let’s review the types of failures we need to consider: First, we have transient failures - Transient failures refer to disaster scenarios which involve a temporary failure where part or the entirety of the platform goes offline, but no records or information are necessarily lost forever; Second, we have permanent failures - Permanent failures are characterized by their associated data loss due to hardware failures or human error.
  8. There are an endless number of failure scenarios but they broadly fall under: Data-center & regional outages - These occur due to a range of hardware, infrastructure and cloud provider issues resulting in both transient and permanent failures. However, they only impact services running within the zone of impact; Platform failures - These arise from a range of issues such as batch processing systems overwhelming the Kafka cluster and bugs in Kafka or third party software. The key issue here is that transient or permanent, the entire Kafka cluster may be impacted and occasionally this failure will propagate between Kafka clusters; and Human error - People make mistakes and design choices in our software can have broad and unexpected consequences. At best these cause transient outages. At worst they cause permanent data loss across our Kafka clusters.
  9. Generally Available (GA) since Confluent Platform v7, Cluster Linking is Confluent’s preferred and long-term supported solution for Disaster Recovery. Schema Linking is a complementary feature, Generally Available since Confluent Platform v7.1.
  10. Cluster Linking, built into Confluent Platform and Confluent Cloud, allows you to directly connect clusters together, mirroring topics from one cluster to another. Cluster Linking works by replicating topics byte for byte including records with their associated offsets, consumer groups with their associated offsets, topics with their associated configuration and ACLs. Some key features Cluster Linking supports are: Low Recovery Time Objectives, effectively RTO=0; It supports active-active setups, so you can make use of both clusters and fail over in whichever direction is required; Failover support for all clients including librdkafka-based clients; Smooth failback after disaster recovery scenarios; The ability to support multi-cloud and hybrid cloud setups; As described it preserves metadata including offsets, consumer groups, topics and ACLs; As it copies data byte for byte it avoids decompression and recompression, saving significant resources; and As it’s built into CP Server you don’t need any additional components like Kafka Connect. Some of the challenges associated with using Cluster Linking are: The nature of asynchronous replication means that it cannot support RPO=0; Presently the DR cluster is unaware of transactions meaning it cannot cancel a transaction midway through processing during a disaster recovery scenario; and To support high availability you must accept a small to moderate breach of record ordering guarantees.
  11. Schema Linking, built into Schema Registry allows you to directly connect Schema Registry clusters together mirroring subjects or entire contexts. Contexts, introduced alongside Schema Linking allow you to create namespaces within Schema Registry which ensures mirrored subjects don’t run into schema naming clashes. Schema Linking complements Cluster Linking by allowing you to copy all associated schemas alongside the data without needing a centralised Schema Registry which would otherwise be the case.
  12. Prefixing allows you to add a prefix to a topic and if desired the associated consumer group to avoid topic and consumer group naming clashes between the primary and DR cluster. This is important when used in an active-active setup and required to use a two way Cluster Link strategy which is the recommended approach and we’ll go into this later.
  13. So, how does this work in practice? Let’s start by discussing a standard active-passive setup
  14. First, starting with a standard cluster and an empty DR cluster, we create our Cluster Linking rules, which can be used to copy any current or new topics matching our criteria, and which will replicate historic data as well as sync all future data in real time.
  15. Now, we experience the hypothetical failure of our primary cluster. During this scenario our monitoring alerts us to the failure and we manually (ideally via a script) trigger a failover using the REST API or CLI. We then update our DNS entry to redirect all clients to the Disaster Recovery cluster and either restart our clients or wait for them to reconnect.
  16. A standard strategy is to “fail forward” promoting the DR region to be their new Primary Region, this is because: Cloud regions offer identical service; They already moved all of their applications & data systems to the DR region; and Failing back would introduce risk with little benefit. To fail forward, simply: Delete topics on original cluster (or spin up new cluster) Establish cluster link in reverse direction You may optionally implement a solution to retrieve the subset of data which had not yet been replicated at the time of the original failover.
  17. If it is required that you fail back to the original cluster, the solution is to wipe the original cluster, cluster link back until you’re synchronised, and fail over again back to the original cluster.
  18. As cluster linking is asynchronous it means that the consumer offsets which describe which records the consumer has processed may not yet have been replicated across at the time of failover. This may result in the consumer reprocessing these records when it fails over to the DR cluster.
  19. Bi-directional Cluster Linking is an alternative which vastly simplifies your DR strategy.
  20. Let’s start with our steady state, we: Create duplicate topics on both the Primary and DR cluster; Create prefixed cluster links in both directions; Produce records to clicks on the Primary cluster; and Consume records from all variants of the clicks topic on the Primary cluster using a regex pattern. Note, at this point data is being generated to the clicks topic in the primary cluster and replicated to west.clicks in the DR cluster. However, no data is being produced to the clicks topic in the DR cluster.
  21. Now, an outage occurs impacting the primary region. This: Brings down the producers and consumers in the primary region; and Temporarily pauses the cluster links. It’s important to note here that a small amount of data has not yet been replicated to your DR cluster at this point.
  22. To recover, we update our DNS entry to redirect all clients to the Disaster Recovery cluster and either restart our clients or wait for them to reconnect. It’s important to specify that we don’t change topic name or regex during this failover. The consumers will continue to consume pre-failover data in west.clicks and post-failover data in clicks, both from the DR cluster. It’s important to note that unlike the prior strategy we don’t delete the cluster links. You’ll also need to temporarily disable offset replication from clicks to west.clicks. This will stop the stale consumer offsets overwriting the new consumer offsets when the Primary Cluster is brought back online.
  23. When the outage is over we automatically recover the records which had yet to be replicated. This means that although the records will arrive out of order, assuming the primary cluster eventually recovers, your RPO=0. New records generated to the DR cluster will also automatically begin replicating to the primary.
  24. To failback to the primary region. Consumers need to pick up at the end of the writable topics, so: Ensure that all consumer groups have 0 consumer lag for their DR topics e.g. west.clicks Reset all consumer offsets to the last offsets (LEO), this can be done by the platform operator Finally, move consumers & producers back to Primary cluster: Each producer / consumer group can be moved independently
  25. Now that we’ve moved our consumers back to the Primary Cluster we can re-enable consumer offset replication between clicks and west.clicks. Once consumer lag is 0 on east.clicks, then reset all consumer groups to Log End Offset (last offset of the partition) on “clicks” on the DR cluster.
  26. Bi-directional Cluster Linking is an alternative which vastly simplifies your DR strategy.
  27. Bi-Directional Cluster Linking easily translates to an active-active strategy as seen here where we use a load balancer to spread load across the clusters.
  28. Again, we see an outage which brings down our West cluster (formerly considered our primary cluster).
  29. Now we utilise our load balancer to re-route traffic to our East cluster.
  30. Finally once our West cluster recovers we re-route traffic back to it.
  31. Looking at Stretch Clusters A Stretch Cluster works by splitting a cluster over more than one data center. Some key features Stretch Clusters supports are: RPO=0 and low RTO; As it’s a single cluster it doesn’t require failover / failback or syncing of metadata; and For the same reasons it supports all clients as well as transactions and maintains record ordering. Some of the challenges associated with using a Stretch Cluster are: It requires low latency connections between data centers. We recommend 50ms, but it can support up to 100ms; It increases end-to-end latency; and It doesn’t protect against certain types of failures, such as: Full cluster failures; or Deleting topics. There are a few different approaches to implementing Stretch Clusters and we’ll review them now.
  32. Let’s start with discussing what problem Stretch Clusters are solving.
  33. Let’s start with discussing what problem Stretch Clusters are solving.
  34. Let’s start with our steady state which is: An unknown number of brokers represented here by brokers one through four spread across two DCs; and A standard three node Zookeeper cluster spread across two DCs.
  35. We’ll also assume a replication-factor of three, min.insync.replicas of two and acks=all.
  36. Now, let’s look at what happens when DC “West” fails. But, let’s start by just focusing on Kafka.
  37. First, although we have no data loss, we are now no longer able to meet min.insync.replicas of two, so we lose availability. We could drop min.insync.replicas to one, but then if another broker failed during this period we would lose data.
  38. To resolve this, the immediate solution is to increase replication factor to four. Now when we lose the first two replicas we still have two replicas available to meet the min.insync.replicas requirement.
  39. But, if our 2 replicas are down or out of sync then we lose availability unless we trigger an unclean leader election and accept data loss.
  40. To resolve this, we increase min.insync.replicas to three and in the case of a failure scenario we roll back to min.insync.replicas of two.
  41. But… what about Zookeeper?
  42. We were ignoring it as a factor but now that we’ve solved the Kafka component of the issue we need to consider Zookeepers behaviour as well.
  43. For a Zookeeper cluster to be considered available it needs a minimum of (n/2 + 1) nodes available. This allows it to achieve quorum, which is required to elect a leader or commit writes. Here, Zookeeper has two out of three nodes unavailable and as such cannot form quorum, and is now offline.
  44. This is where our first Stretch Cluster design comes into play.
  45. Here, our steady state is: An unknown number of brokers represented here by brokers one through four spread across two DCs; Six Zookeeper nodes with three in each DCs of which one is an observer; and Replication factor of 4, min.insync.replicas of 3 and acks=all.
  46. Now, let’s assume DC “East” which has our Zookeeper observer has an outage. We still have three copies of the Zookeeper data allowing us to reach a “degraded” quorum. Consumers continue to operate as normal and producers continue to operate once we revert to min.insync.replicas=2.
  47. Now, let’s assume DC West has an outage. We can no longer reach Zookeeper quorum so new leaders can’t be elected.
  48. But… we can modify Zookeeper six’s configuration to change it to a standard follower. We must also update Zookeeper four, five and six’s configuration to remove Zookeeper one, two and three from the list of quorum participants and perform a rolling restart so they receive the new configuration.
  49. Now consumers continue to operate. Producers continue to operate once we revert to min.insync.replicas=2.
  50. Now let’s assume a network partition arises. Consumers continue to operate as usual up until they’ve consumed all fully replicated data. Producer will fail as we can no longer meet min.insync.replicas=3.
  51. We manually shutdown DC “East” then update min.insync.replicas=2. Clients resume operating as normal. Consumers failing over from DC “East” will consume some duplicate records.
  52. It’s important to raise the risk we run with using Zookeeper observers. While they solve our availability and split-brain issues, they risk data loss in the unlikely scenario that the data-center with the observer is out of sync at the time the data-center without one fails.
  53. Our second option is to use hierarchical quorum. Hierarchical quorum involves getting consensus between multiple Zookeeper “groups” which each form their own quorum.
  54. Here, our steady state is: An unknown number of brokers represented here by brokers one through four spread across two DCs; Six Zookeeper nodes with three in each DCs and each DC representing a Zookeeper group in the hierarchy; and Replication factor of 4, min.insync.replicas of 3 and acks=all.
  55. Now, let’s assume a DC outage on DC “East”. We still have three copies of the Zookeeper data, however, we only have consensus from one group, meaning we no longer meet the requirement for hierarchical quorum. Consumers continue to operate for leaders on DC “West”, but new leaders can’t be elected on DC “West” and configuration updates can’t be made until we have hierarchical quorum.
  56. To resolve this we must remove the DC “East” Zookeeper group from hierarchy then update min.insync.replicas to two.
  57. Now let’s assume a network partition arises. Consumers continue to operate as usual up until they’ve consumed all fully replicated data. Producer will fail as we can no longer meet min.insync.replicas=3 on either DC.
  58. We manually shutdown DC “East”, remove from the hierarchy & update min.insync.replicas=2 Clients now resume operating as normal, but, consumers failing over from DC “East” will consume some duplicate records.
  59. Next, let’s review the “gold standard” for Stretch Clusters which is 2.5 DCs.
  60. Just like our previous solution we use a minimum of four brokers split across two DCs, replication factor of four, min.insync.replicas of three and acks=all. Where this solution differs is that instead of using Zookeeper observers, we spread the Zookeeper nodes across three DCs; ideally the DCs containing brokers should be located closer together. This ensures that, under either a single network partition or a DC failure, we still always have exactly one leader.
  61. We still have two copies of the Zookeeper data allowing us to reach a “degraded” quorum. Consumers continue to operate Producers continue to operate once we revert to min.insync.replicas=2
  62. Now, let’s assume DC “West” becomes network partitioned from the rest of the cluster. Consumers connected to DC “East” continue to operate.
  63. Consumers connected to DC “West” continue to operate until they’ve processed all fully replicated records.
  64. Producers connected to DC “East” continue to operate once we revert to min.insync.replicas=2.
  65. Producers connected to DC “West” continue to operate once we shutdown DC “West”, failover and revert to min.insync.replicas=2.
  66. Let’s looks at how Multi-Region Clusters can enhance a standard Stretch Cluster.
  67. Followers are normal replicas, however, observers act the same except that they are not considered for acks=all produce requests.
  68. As of CP v6.1 observers can be configured to be promoted to meet the ObserverPromotionPolicy, including: Under-min-isr: Promoted if the in-sync replica size drops below min.insync.replicas Under-replicated: Promoted to cover any replica which is no longer in sync Leader-is-observer: Promoted if the current leader is an observer
  69. In this example we extend our 2.5 DC Stretch Cluster with an additional Observer on each of the two primary DCs. Just like our previous solution we use a minimum of four brokers split across two DCs, replication factor of four, min.insync.replicas of three and acks=all. Where this solution differs is that instead of using a hierarchical Zookeeper cluster, we spread the Zookeeper nodes across three DCs; ideally the DCs containing brokers should be located closer together. This ensures that, under either a single network partition or a DC failure, we still always have exactly one leader.
  70. In the case of an outage on DC “West”, our Observer on DC “East” gets promoted to a full-fledged replica, and as such we continue to meet min.insync.replicas of 3 and operate as usual.
  71. In the case of a network partition which separates DC “West” from DC “Central” and DC “East” the Observer in DC “East” is promoted to a full replica and all clients connected to it continue to operate as normal.
  72. The Observer in DC “West” cannot be promoted as it has no Zookeeper Quorum.
  73. Consumers connected to DC “West” continue to operate until they’ve processed all fully replicated records. Once we shutdown DC “West” the consumers will failover and consume from the same point. This will result in duplicate consumption.
  74. Producers connected to DC “West” fail as we can no longer meet min.insync.replicas=3.
  75. To continue operating as normal we must manually shut down DC “West”.
  83. Finally, we’ll wrap up with a quick summary and recommendations.
  84. So, I’ll leave this here for you to review in your own time, however, the key details are: Cluster Linking should be considered the default solution; Stretch Clusters should be utilised when RPO=0 is required and cross-DC latency is acceptable; and MirrorMaker 2 can be used if an open-source solution is mandatory, but this is strongly advised against.
  85. Thanks for listening.