Eventually consistent systems are often more cost-effective to implement and maintain than their strongly consistent cousins. Gossip-based anti-entropy methods can be used to improve the consistency of such systems even as they expand geographically.
2. Imagine a network of connected servers, a distributed system.
The servers are peers. Working together, they can make the system more tolerant to failures like earthquakes and outages.
They also make it more available, that is, responsive even to many simultaneous client requests. Each peer has a copy of the database and can respond independently.
[Figure: five peers named alpha, bravo, charlie, delta, and echo]
3. This means that the peers have to figure out how to synchronize independent updates to the data…
alpha: set key X to value Y
bravo: set key P to value Q
… and sometimes they’ll have to resolve conflicts!
alpha: set key M to value N
bravo: set key M to value F
4. The peers can’t see into each other’s copies of the database, so they have to communicate via remote procedure calls (RPCs) to share information.
The communication has to be extremely efficient so the system doesn’t slow down too much while synchronization is happening.
Note: This is a good use case for protocol buffers!
[Figure: peers alpha and bravo]
5. There are a number of ways to implement synchronization and achieve consensus, depending on how consistent the system needs to be.
[Figure: consistency spectrum running from strong through causal to eventual; more consistency comes with higher latency and lower availability]
The tradeoffs include things like:
- staleness (how bad will it be if users sometimes see old data before it’s updated?)
- expense (how many messages have to travel across the wire; how many clusters are we hosting?)
6. Eventual consistency is a good budget-friendly choice for applications that choose to prioritize higher availability and lower latency over consistency.
[Figure: timeline in which alpha holds M:N and bravo holds M:F at first; later both hold M:N]
Given enough time to catch up on synchronizations, the system will eventually be fully consistent, meaning that every peer will eventually contain the same data objects.
7. Anti-entropy (aka “gossip”) is a mechanism for achieving eventual consistency. It enables peers to periodically share their perspective of the data with another peer in the system.
Each time a peer gets a client request to update a value, it also increments the object’s version vector. Version vectors encode provenance and change history information.
Alpha shares its current values for each object with bravo as well as each object’s version vector.
[Figure: alpha’s objects P:Q, M:N, D:F, L:S are shared with bravo as P:Q v3, M:N v11, D:F v5, L:S v1]
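The update path on slide 7 could be sketched roughly as follows. This is a minimal illustration, not the implementation behind the slides: the `Version` and `Replica` names are hypothetical, and it assumes a version is simply an update counter paired with the PID of the peer that last wrote the object.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Version:
    """Hypothetical version: an update counter plus the PID of the last writer."""
    counter: int
    pid: int


@dataclass
class Replica:
    """One peer's local copy of the database (illustrative only)."""
    pid: int
    data: dict = field(default_factory=dict)  # key -> (value, Version)

    def put(self, key, value):
        """Handle a client update: store the value and increment its version."""
        previous = self.data.get(key)
        counter = previous[1].counter + 1 if previous else 1
        self.data[key] = (value, Version(counter, self.pid))
        return self.data[key][1]


alpha = Replica(pid=1)
alpha.put("M", "N")           # first write: counter becomes 1
latest = alpha.put("M", "F")  # second write: counter becomes 2
```

Keeping the writer's PID next to the counter is what lets a peer tell later where a given version came from.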
8. The version vector will allow bravo to compare the values it has for each object, and to decide whether or not to update them.
object   what alpha has   what I have (bravo)
P        Q v3             Q v3
M        N v11            Z v10
D        F v5             F v5
L        S v1             (nothing yet)
If bravo has a smaller version number for an object compared to alpha, it knows it should update its local value.
If bravo sees alpha has an object it doesn’t already know about, it can add it locally.
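Bravo's two rules on slide 8 can be written as a small merge function. This is a sketch under simplifying assumptions: versions are plain integers, the function name `merge_from_peer` is made up for illustration, and equal versions are left alone (the tie-breaker is a separate step).

```python
def merge_from_peer(local, remote):
    """Apply a peer's objects to our local copy.

    Both arguments map key -> (value, version), where version is an int.
    Returns the keys that were added or updated locally, in the order seen.
    """
    changed = []
    for key, (value, version) in remote.items():
        if key not in local:
            # Rule 2: an object we have never seen is simply added.
            local[key] = (value, version)
            changed.append(key)
        elif local[key][1] < version:
            # Rule 1: our version is smaller, so the peer's value replaces ours.
            local[key] = (value, version)
            changed.append(key)
    return changed


# The values shown on the slide.
what_alpha_has = {"P": ("Q", 3), "M": ("N", 11), "D": ("F", 5), "L": ("S", 1)}
what_bravo_has = {"P": ("Q", 3), "M": ("Z", 10), "D": ("F", 5)}
changed = merge_from_peer(what_bravo_has, what_alpha_has)
```

With the slide's data, bravo updates M (v10 to v11) and adds L, and leaves P and D untouched.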
9. What if alpha and bravo have different values for the same key, but the same version vector? In this case, bravo needs a tie-breaker.
what alpha has: D:F v5
what I have (bravo): D:B v5
Bravo will compare its global process id (PID) to alpha’s. If alpha’s PID is lower, bravo will update its local value for the object. PIDs are unique, so each peer is guaranteed to have a different number.
[Figure: alpha has PID 1; bravo has PID 2]
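The tie-breaker can be sketched as a single predicate. The function name and argument layout are hypothetical; the only rule taken from the slide is that, when version numbers are equal, the peer with the lower PID wins.

```python
def remote_wins(local_version, local_pid, remote_version, remote_pid):
    """Return True if the local peer should adopt the remote peer's value."""
    if remote_version != local_version:
        # No tie: the higher version number wins.
        return remote_version > local_version
    # Same version number: the lower PID wins the tie.
    return remote_pid < local_pid


# Bravo (PID 2, D:B v5) looking at alpha (PID 1, D:F v5).
bravo_should_update = remote_wins(5, 2, 5, 1)
# Alpha looking back at bravo with the same data.
alpha_should_update = remote_wins(5, 1, 5, 2)
```

Because PIDs are unique, the two peers always reach opposite answers for the same tie, so both end up holding the same value.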
10. With bilateral anti-entropy, we can enable alpha not only to share its data and version vectors with bravo, but also receive data and version vectors back from bravo.
If alpha is initiating anti-entropy, it will begin by sharing its version vectors for all objects with bravo. This is known as the “push” phase.
After it notifies bravo that the push phase is complete, alpha will receive data and version vectors from bravo, which is the “pull” phase.
[Figure: alpha pushes to bravo, then pulls from bravo]
11. As the receiver, bravo also experiences anti-entropy in multiple phases.
First, bravo starts receiving version vectors for each of alpha’s objects.
[Figure: bravo gets version vectors from alpha]
12. Bravo then uses the version vectors to determine which objects to ask alpha for.
[Figure: bravo requests data from alpha]
13. Once bravo gets back this data from alpha, it can use it to make local repairs.
[Figure: bravo gets data from alpha]
14. As soon as bravo is notified that alpha’s push phase is complete, it determines which objects it has locally that alpha doesn’t yet know about, and sends those back to alpha.
[Figure: alpha signals COMPLETE; bravo sends repairs to alpha]
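Slides 10 through 14 can be pulled together into one in-memory sketch of a bilateral session. Several things here are assumptions rather than details from the slides: versions are plain integers, the RPCs are replaced by direct dictionary access, "objects alpha doesn’t yet know about" is read as including objects for which bravo holds a newer version, and ties are ignored. The key K is a made-up extra object so that the pull phase has something to send.

```python
def bilateral_anti_entropy(initiator, receiver):
    """Run one push/pull session between two in-memory replicas.

    Each replica maps key -> (value, version). Returns (pushed, pulled):
    keys repaired on the receiver and keys repaired on the initiator.
    """
    # Push phase: the initiator sends only version numbers, not values.
    offered = {key: version for key, (_, version) in initiator.items()}

    # The receiver works out which objects it is missing or holds stale.
    wanted = [
        key for key, version in offered.items()
        if key not in receiver or receiver[key][1] < version
    ]
    # It also notes what the initiator lacks or holds an older copy of.
    repairs = {
        key: entry for key, entry in receiver.items()
        if key not in offered or offered[key] < entry[1]
    }

    # The initiator answers the request; the receiver repairs locally.
    for key in wanted:
        receiver[key] = initiator[key]

    # Pull phase: once the push is complete, repairs flow back.
    for key, entry in repairs.items():
        initiator[key] = entry

    return wanted, sorted(repairs)


alpha = {"P": ("Q", 3), "M": ("N", 11), "D": ("F", 5), "L": ("S", 1)}
bravo = {"P": ("Q", 3), "M": ("Z", 10), "D": ("F", 5), "K": ("V", 2)}
pushed, pulled = bilateral_anti_entropy(alpha, bravo)
```

After the session both replicas hold the same five objects: bravo received M and L, and alpha received K.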
16. bravo: Why doesn’t alpha just send all the data for each object it has during the push phase? That would save me from having to ask for it later.
alpha: Sending only version vectors is a lot cheaper! It reduces the size of messages across the wire.
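A rough way to see the size difference is to serialize both kinds of message and compare lengths. This sketch uses JSON from the standard library purely for convenience (the slides suggest protocol buffers for the real wire format), and the 1,000-character values are invented for illustration.

```python
import json

# Hypothetical objects: small keys and versions, comparatively large values.
objects = {
    "P": {"value": "Q" * 1000, "version": 3},
    "M": {"value": "N" * 1000, "version": 11},
    "D": {"value": "F" * 1000, "version": 5},
    "L": {"value": "S" * 1000, "version": 1},
}

# Push phase as described: versions only.
versions_only = json.dumps({key: obj["version"] for key, obj in objects.items()})

# The alternative bravo asks about: every value sent up front.
full_payload = json.dumps(objects)

savings = len(full_payload) - len(versions_only)
```

The larger the values are relative to their versions, the more the version-only push saves, especially when most objects are already in sync and never need to be requested.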
17. So why is it called a version “vector” then?
In order to make the right changes (and make them in the right order), we need to communicate the count of updates an object has had as well as our PIDs (which express each of our perspectives in the system).
[Figure: alpha and bravo in conversation]
18. How do we know which peers to perform anti-entropy with?
At least one peer needs to know about some other peers in the network to get anti-entropy started. That peer will select a peer at random to start “gossiping” with.
As part of the anti-entropy process, peers can also tell each other about newly joined peers in the network (and their PIDs).
[Figure: peers alpha, bravo, charlie, delta, and echo, with PIDs 1, 2, 21, 42, and 88]
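Random partner selection and membership sharing might look something like this. The function names, the PID-to-name dictionaries, and the choice to merge peer lists in both directions are all assumptions made for the sketch; the slides only say that a peer is picked at random and that peers can tell each other about new members.

```python
import random


def choose_gossip_partner(known_peers, self_pid, rng=random):
    """Pick a random known peer other than ourselves, or None if there is none."""
    candidates = sorted(pid for pid in known_peers if pid != self_pid)
    if not candidates:
        return None
    return rng.choice(candidates)


def share_membership(local_peers, remote_peers):
    """Merge two peer lists (PID -> name) so both sides learn about new peers."""
    merged = {**local_peers, **remote_peers}
    local_peers.update(merged)
    remote_peers.update(merged)
    return merged


# Alpha knows only bravo; bravo has already heard about charlie.
alpha_view = {1: "alpha", 2: "bravo"}
bravo_view = {2: "bravo", 21: "charlie"}
share_membership(alpha_view, bravo_view)

# A seeded generator keeps the example repeatable.
partner = choose_gossip_partner(alpha_view, self_pid=1, rng=random.Random(7))
```

After one exchange alpha can gossip with charlie directly, which is how knowledge of a newly joined peer spreads without any central registry.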
19. delta: How often should we gossip?
echo: That’s configurable. More gossip means more consistency at the price of more messages traveling across the network.
charlie: We could each have different anti-entropy intervals. We also have a configured “jitter” to prevent our anti-entropy sessions from all happening at the exact same time.
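One common way to apply jitter is to add a random offset to each peer's interval before every session. The slides do not say how their jitter is computed, so the uniform plus-or-minus offset, the function name, and the 10-second and 2-second settings below are assumptions.

```python
import random


def next_gossip_delay(interval_seconds, jitter_seconds, rng=random):
    """Return the wait before the next session: the interval plus or minus jitter."""
    offset = rng.uniform(-jitter_seconds, jitter_seconds)
    # Never return a negative wait, even with a jitter larger than the interval.
    return max(0.0, interval_seconds + offset)


# Hypothetical settings: gossip roughly every 10 seconds, give or take 2.
rng = random.Random(42)
delays = [next_gossip_delay(10.0, 2.0, rng) for _ in range(5)]
```

Each peer draws its own offsets, so even peers configured with the same interval drift apart instead of starting their sessions in lockstep.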