Building a Fault Tolerant Distributed Architecture

Rodrigo Toste Gomes
Software Engineer

A MemSQL Cluster
- Master Aggregator
- Leaves
Leaves store shards of data.
Master Aggregator is the source of truth for cluster state:
- Location of data shards;

Fault Tolerance and High Availability
Maintain replicas of shards on different nodes.
System can tolerate one node failure.
Master Aggregator is also responsible for:
- Location of data shard replicas;
- Replication states;
- Moving shards around.

Example Cluster
Nodes:
- Master Aggregator
- Leaf A
- Leaf B
Cluster State:
Node State:
- Leaf A is online
- Leaf B is online
Databases:
- data
Shards:
- data:0 - primary on A
- data:0 - replica on B
- data:1 - primary on B
- data:1 - replica on A
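
The example cluster state above can be written down as a small data structure. This is only a sketch: the `ClusterState` class, its field names, and the `shards_on` helper are assumptions made for illustration, not MemSQL's actual representation.

```python
from dataclasses import dataclass, field


@dataclass
class ClusterState:
    # Node name -> True if the MA currently considers the node online.
    nodes: dict = field(default_factory=dict)
    # Shard name -> {"primary": node, "replica": node}.
    shards: dict = field(default_factory=dict)

    def shards_on(self, node, role):
        """Return the sorted names of shards for which `node` holds `role`."""
        return sorted(s for s, p in self.shards.items() if p[role] == node)


# The two-leaf example cluster from the slide (hypothetical encoding).
example = ClusterState(
    nodes={"A": True, "B": True},
    shards={
        "data:0": {"primary": "A", "replica": "B"},
        "data:1": {"primary": "B", "replica": "A"},
    },
)

print(example.shards_on("A", "primary"))  # ['data:0']
print(example.shards_on("A", "replica"))  # ['data:1']
```

Each leaf holds the primary of one shard and the replica of the other, so either leaf can fail without a shard losing its only copy.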

Maintaining the Cluster State
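Diagram: heartbeats between the MA and Leaves A and B.
Cluster State (before the MA detects the failure):
- Leaf A is online
Leaf A:
Offline
Cluster State (after the MA detects the failure):
- Leaf A is offline
Leaf A:
Offline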
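
One way heartbeat-based detection might look is sketched below. The timeout policy, the `HeartbeatMonitor` class, and its method names are assumptions for illustration. The slides only say that the MA monitors node state via heartbeats.

```python
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (hypothetical helper)."""

    def __init__(self, timeout):
        # Seconds of silence after which a node is marked offline (assumed policy).
        self.timeout = timeout
        self.last_seen = {}
        self.online = {}

    def record_heartbeat(self, node, now):
        self.last_seen[node] = now
        self.online[node] = True

    def sweep(self, now):
        """Mark silent nodes offline and return the nodes that just went offline."""
        newly_offline = []
        for node, seen in self.last_seen.items():
            if self.online[node] and now - seen > self.timeout:
                self.online[node] = False
                newly_offline.append(node)
        return newly_offline


monitor = HeartbeatMonitor(timeout=3.0)
monitor.record_heartbeat("A", now=0.0)
monitor.record_heartbeat("B", now=0.0)
monitor.record_heartbeat("B", now=4.0)  # Leaf A stays silent
print(monitor.sweep(now=5.0))           # ['A']
print(monitor.online)                   # {'A': False, 'B': True}
```

Timestamps are passed in explicitly so the example is deterministic. A real monitor would read a clock.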

Cluster State:
- Leaf A is online
Leaf A:
Offline
Leaf B:
Online; replica of data:0
Cluster State:
- Leaf A is offline
Leaf A:
Offline
Leaf B:
Online; replica of data:0

Cluster State:
- Leaf A is online
Leaf A:
Offline
Leaf B:
Online; replica of data:0
Cluster State:
- Leaf A is offline
Leaf A:
Offline
Leaf B:
Online; replica of data:0
Leaf B:
Online; primary of data:0
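
The promotion step shown on this slide can be sketched as a function over the shard map. The `fail_over` function and the dictionary layout are hypothetical. The slides do not describe how the MA implements the change.

```python
def fail_over(shards, offline_node):
    """Return a new shard map with `offline_node` removed from every placement.

    A replica is promoted when its primary was on the offline node.
    """
    updated = {}
    for shard, placement in shards.items():
        primary, replica = placement["primary"], placement["replica"]
        if primary == offline_node and replica is not None:
            # Promote the surviving replica to primary.
            updated[shard] = {"primary": replica, "replica": None}
        elif replica == offline_node:
            # The primary survives but no longer has a replica.
            updated[shard] = {"primary": primary, "replica": None}
        else:
            # Unaffected, or no surviving copy to promote: leave as recorded.
            updated[shard] = dict(placement)
    return updated


shards = {
    "data:0": {"primary": "A", "replica": "B"},
    "data:1": {"primary": "B", "replica": "A"},
}
after = fail_over(shards, "A")
print(after["data:0"])  # {'primary': 'B', 'replica': None}
print(after["data:1"])  # {'primary': 'B', 'replica': None}
```

After Leaf A goes offline, Leaf B holds the primary of both shards and neither shard has a replica, which is why the system tolerates only one node failure.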

MemSQL Distributed Architecture
Master Aggregator:
- Maintains primary copy of cluster state;
- Monitors node state via heartbeats.
Leaves:
- Monitor for changes in local and cluster state;
- Reconcile local state with cluster state.
What happens when a node is unable to reconcile its local
state with the cluster state?
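
Before turning to that question, the normal reconcile step on a leaf might look like the sketch below. The `reconcile` function, the role names, and the tuple format of the returned actions are assumptions made for illustration.

```python
def reconcile(node, local_roles, cluster_shards):
    """Bring `local_roles` (shard -> role) in line with the cluster state.

    Returns a list of (shard, old_role, new_role) changes that were applied.
    """
    actions = []
    for shard, placement in sorted(cluster_shards.items()):
        if placement.get("primary") == node:
            wanted = "primary"
        elif placement.get("replica") == node:
            wanted = "replica"
        else:
            wanted = None
        current = local_roles.get(shard)
        if wanted != current:
            actions.append((shard, current, wanted))
            if wanted is None:
                local_roles.pop(shard, None)
            else:
                local_roles[shard] = wanted
    return actions


# Leaf B still thinks it is a replica, but the cluster state says primary.
local_b = {"data:0": "replica"}
cluster = {"data:0": {"primary": "B", "replica": None}}
print(reconcile("B", local_b, cluster))  # [('data:0', 'replica', 'primary')]
print(local_b)                           # {'data:0': 'primary'}
print(reconcile("B", local_b, cluster))  # []
```

The second call returns no actions because the local state already matches the cluster state.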

Shard Can’t Replicate
Cluster State:
- data:0 primary on A
- data:0 replica on B
Leaf A:
Primary data:0
Replicating data:0 to B
Leaf B:
Replica of data:0
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.

Shard Can’t Replicate
Cluster State:
Leaf A:
Primary data:0
Leaf B:
Replica of data:0
Network Partition A | B
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.

Strategy 1: Don’t block writes
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
Leaf A:
Primary data:0
Leaf B:
Out of Date Replica of data:0

Strategy 1: Don’t block writes (Leaf A crashes)
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
Leaf A:
Offline
Leaf B:
Cluster State:
- data:0 primary on B
Leaf B:
Primary of data:0
Not a MemSQL approach, given the potential for data loss.
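
The data-loss risk of Strategy 1 can be shown with a toy model. The `Shard` class and the `write_without_blocking` function are hypothetical and only mimic the sequence on the slides: a partition, then a crash of the primary.

```python
class Shard:
    """A toy copy of a shard: just an ordered list of rows."""

    def __init__(self):
        self.rows = []


def write_without_blocking(primary, replica, row, link_up):
    """Strategy 1: acknowledge the write even if the replica is unreachable."""
    primary.rows.append(row)
    if link_up:
        replica.rows.append(row)
    return True  # acknowledged to the client either way


a, b = Shard(), Shard()
write_without_blocking(a, b, "row-1", link_up=True)
write_without_blocking(a, b, "row-2", link_up=False)  # network partition A | B

# Leaf A crashes and B is promoted with whatever it has.
lost = [row for row in a.rows if row not in b.rows]
print(lost)  # ['row-2']
```

The write of `row-2` was acknowledged, yet it exists only on the crashed leaf, so promoting the out-of-date replica loses it.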

Strategy 2: Do nothing - block writes indefinitely
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
Leaf A:
Primary data:0
Leaf B:
Replica of data:0
Write workload stalls

Strategy 3: Leaf Notify
Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
- data:0 synchronous replica on B
Leaf A:
Primary data:0
Leaf B:
Replica of data:0
Write workload is stalled

Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
- data:0 synchronous replica on B
Leaf A:
Primary data:0
Leaf Notifies MA - no replication
Leaf B:
Replica of data:0

Diagram: MA, Leaf A, Leaf B; heartbeats; data:0.
Cluster State:
- data:0 asynchronous replica on B
Leaf A:
Primary data:0
Leaf B:

MemSQL Distributed Architecture
Master Aggregator:
- Maintains primary copy of cluster state;
- Monitors node state via heartbeats.
Leaves:
- Monitor for changes in local and cluster state;
- Reconcile local state with cluster state;
- Notify MA to change cluster state when reconciling is impossible.
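
The Leaf Notify flow (Strategy 3) can be sketched as follows. The `MasterAggregator` and `PrimaryLeaf` classes and their methods are hypothetical. The sketch only mirrors the slide sequence, in which the primary leaf cannot replicate, notifies the MA, and the cluster state changes from a synchronous to an asynchronous replica.

```python
class MasterAggregator:
    def __init__(self):
        # Shard -> replication mode of its replica, as recorded in cluster state.
        self.replication_mode = {"data:0": "synchronous"}

    def notify_cannot_replicate(self, shard):
        """Called by a leaf that cannot reconcile; downgrade the replica."""
        self.replication_mode[shard] = "asynchronous"


class PrimaryLeaf:
    def __init__(self, ma, shard):
        self.ma = ma
        self.shard = shard
        self.committed = []

    def write(self, row, replica_reachable):
        """Commit `row` and return the replication mode it was committed under."""
        mode = self.ma.replication_mode[self.shard]
        if mode == "synchronous" and not replica_reachable:
            # Local state cannot match the cluster state, so ask the MA to
            # change the cluster state instead of stalling indefinitely.
            self.ma.notify_cannot_replicate(self.shard)
        self.committed.append(row)
        return self.ma.replication_mode[self.shard]


ma = MasterAggregator()
leaf_a = PrimaryLeaf(ma, "data:0")
print(leaf_a.write("row-1", replica_reachable=True))   # synchronous
print(leaf_a.write("row-2", replica_reachable=False))  # asynchronous
print(ma.replication_mode)                             # {'data:0': 'asynchronous'}
```

In this sketch the write is committed only after the cluster state has been updated, so the MA's record of the replica matches what the primary leaf has actually replicated.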
