Grokking Techtalk #39: Gossip protocol and applications

Gossip protocol and applications
Tu Nguyen
Staff Software Engineer - Axon

Gossip in computer science
A peer-to-peer communication protocol●
Inspired by epidemics, human gossip and social networks (spreading rumors)●
epidemic protocol (synonym)■
Why?■
Rumors and epidemics in society travel at great speed and reach almost every member of the community without needing a central coordinator.●
Gossip was originally devised to solve the multicast problem●
Multicast●
we want to communicate a message to all the nodes in the network■
each node sends the message to only a few of the nodes■
Multicast problems●
Fault-tolerance: nodes might crash, packets might be dropped, etc.○
Scalability: millions, even hundreds of millions, of nodes○
Centralized: a single sender “multicasts” TCP/UDP packets to all other nodes.○
Tree-based multicast: too much redundancy from ACK/NACK messages.○
Multicast was originally used heavily in network devices (e.g. routers); how can it be leveraged at the application layer?○

Gossip basic
A node wants to share some information with the other nodes in the network. Periodically, it
randomly selects a node from the set of nodes and exchanges the information with it. The node
that receives the information then does exactly the same thing.
Cycle●
number of rounds to spread the information■
Fanout●
number of nodes that a node “gossips” with in each cycle■
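The cycle and fanout parameters above can be explored with a small simulation. This is only a sketch under simplifying assumptions: the function name `gossip_cycles` is hypothetical, all nodes act in lock-step, every informed node keeps gossiping forever, and no messages are lost.

```python
import random


def gossip_cycles(num_nodes: int, fanout: int, seed: int = 0) -> int:
    """Return how many cycles one update needs to reach every node.

    Assumes fanout <= num_nodes - 1 and a loss-free network.
    """
    rng = random.Random(seed)
    informed = {0}  # node 0 starts with the information
    cycles = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in informed:
            # Each informed node gossips with `fanout` random peers.
            candidates = [p for p in range(num_nodes) if p != node]
            newly_informed.update(rng.sample(candidates, fanout))
        informed |= newly_informed
        cycles += 1
    return cycles


if __name__ == "__main__":
    for fanout in (1, 2, 3):
        print(fanout, gossip_cycles(1000, fanout))
```

Because the informed set can grow by at most a factor of (1 + fanout) per cycle, the number of cycles is at least logarithmic in the number of nodes; a larger fanout trades extra messages for fewer cycles.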

Gossip properties
Node selection must be random (or guarantee enough peer diversity)●
Node only stores local information. There is no shared global state.●
Communication is round-based (periodic).●
Transmission and processing capacity per round is limited.●
All nodes run the same protocol.●
Not deterministic (because of random peer sampling).●

Advantages of Gossip
Scalable●
Fault-tolerant●
Robust●
Decentralized●
Convergent consistency●

Gossip modeling
Consider a distributed network where nodes communicate by passing messages to
each other.
State of a node●
Susceptible - node has not received update yet (is not infected).■
Infected - node with an update it is willing to share.■
Removed - node has received the update but is not willing to share.■
Two basic models●
SI (anti-entropy)■
SIR (rumor-mongering)■
When does a node enter the R state?
👉 There are many algorithms. One of them is counting redundant messages.
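One way to read the "counting redundant messages" rule is sketched below for the SIR (rumor-mongering) model. The function name `rumor_mongering` and the threshold parameter `max_redundant` are hypothetical, and the rule used here (a node stops sharing after it has pushed the update to `max_redundant` peers that already had it) is just one possible variant.

```python
import random


def rumor_mongering(num_nodes: int, max_redundant: int, seed: int = 0) -> list:
    """Simulate SIR gossip and return the final state of every node.

    States: "S" susceptible, "I" infected, "R" removed.
    Assumes num_nodes >= 2.
    """
    rng = random.Random(seed)
    state = ["S"] * num_nodes
    redundant = [0] * num_nodes  # redundant pushes made by each node
    state[0] = "I"               # node 0 starts the rumor
    while "I" in state:
        for node in range(num_nodes):
            if state[node] != "I":
                continue
            peer = rng.choice([p for p in range(num_nodes) if p != node])
            if state[peer] == "S":
                state[peer] = "I"          # the update was new to the peer
            else:
                redundant[node] += 1       # the peer already had the update
                if redundant[node] >= max_redundant:
                    state[node] = "R"      # lose interest, stop sharing
    return state


if __name__ == "__main__":
    final = rumor_mongering(1000, max_redundant=2)
    print("never reached:", final.count("S"))
```

Unlike SI (anti-entropy), this model can terminate with a few nodes still susceptible; raising the threshold lowers that residue at the cost of more redundant messages.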

Gossip modeling
Push / Pull / Push-Pull●
Push■
Infected (I) nodes send updates to Susceptible (S) nodes, infecting them●
efficient when there are few updates.●
Pull■
all nodes are actively pulling for updates●
efficient when there are many updates.●
Push-Pull■
a node pushes when it has updates and also pulls for new updates●
the node and its selected peer exchange information●
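The three exchange styles above can be sketched over a versioned key-value state. The representation (a dict mapping each key to a `(version, value)` pair, with the higher version winning) and the function names are assumptions made for illustration only.

```python
def push(sender: dict, receiver: dict) -> None:
    """Sender sends its entries; receiver keeps whichever version is newer."""
    for key, (version, value) in sender.items():
        if key not in receiver or receiver[key][0] < version:
            receiver[key] = (version, value)


def pull(requester: dict, responder: dict) -> None:
    """Requester asks for updates; the responder's newer entries flow back."""
    push(responder, requester)


def push_pull(node: dict, peer: dict) -> None:
    """Both directions in one exchange, so the two states converge."""
    push(node, peer)
    pull(node, peer)


if __name__ == "__main__":
    a = {"x": (2, "new")}
    b = {"x": (1, "old"), "y": (1, "only-b")}
    push_pull(a, b)
    print(a == b)
```

After a push-pull exchange both participants hold the same state, which is why it tends to converge in fewer cycles than push or pull alone.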

https://flopezluis.github.io/gossip-simulator/

Applications
Cluster membership●
Information dissemination●
Failure detection●
Database replication●
Overlay network●
Aggregations●

Cluster Membership
Who are my live peers ?
Desired properties
Connectedness●
Balance●
Short path-length●
Reducing redundancy●
Scalability●
Accuracy●
Cluster Membership
Property            Full view    Partial view
Connectedness       👍           👌
Short path-length   👍           👌
Accuracy            👌           👌
Balance             👌           👌
Redundancy          👎 High      👍 Low
Scalability         👎 Low       👍 High
✅

Cluster Membership
SWIM - Cornell University 2002●
SCAMP - Microsoft Research 2003●
CYCLON - Vrije University, The Netherlands, 2005●
HYPARVIEW - University of Lisbon, 2007●

SWIM - Cornell University (2002)
Scalable Weakly-consistent Infection-style Process Group Membership
https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf
Properties
Scalable●
Weakly consistent●
Infection-style●
Membership protocol●

SWIM
Motivated by traditional heart-beating●
every interval T, each node notifies its peers that it is alive■
if no heartbeat is received from peer P within T * limit, mark P as dead■
heart-beat = membership + failure detection■
Heart-beating does well on:●
Completeness - yes!■
strong completeness - every crashed node is eventually detected by all correct nodes.●
Accuracy - high!■
Heart-beat problems●
Network load: O(N^2) messages per interval, since every node notifies every other node■
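The heart-beating rule above (mark a peer dead after `T * limit` without an update) can be sketched as follows. The class `HeartbeatDetector` and the helper `heartbeats_per_interval` are hypothetical names, and timestamps are passed in explicitly so the example does not depend on a real clock.

```python
class HeartbeatDetector:
    """Tracks the last heartbeat seen from each peer."""

    def __init__(self, interval: float, limit: int):
        self.timeout = interval * limit  # T * limit from the rule above
        self.last_seen = {}              # peer -> time of last heartbeat

    def on_heartbeat(self, peer: str, now: float) -> None:
        self.last_seen[peer] = now

    def dead_peers(self, now: float) -> list:
        """Peers whose last heartbeat is older than the timeout."""
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout)


def heartbeats_per_interval(num_nodes: int) -> int:
    """All-to-all heart-beating: each node notifies every other node."""
    return num_nodes * (num_nodes - 1)


if __name__ == "__main__":
    print(heartbeats_per_interval(1000))
```

The quadratic message count is what makes plain heart-beating hard to scale, even though its completeness and accuracy are good.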

SWIM is trying to ...
Separate two problems and solve them one-by-one●
Failure detection (👉 “live” peers)○
Membership protocol (👉 list of peers)○
Optimization●
Reduce network load○
Failure detection○
decrease processing time●
increase accuracy●

Failure Detection properties
One step back...●
The two properties of a distributed system□
Safety - nothing bad ever happens○
Liveness - something good eventually happens.○
Failure Detection properties●
Completeness (liveness) - every node that crashes is eventually detected by the failure detector.□
Accuracy (safety) - the failure detector does not suspect a node that has not crashed.□

Failure Detection properties
Degree of completeness●
depends on how many crashed nodes a failure detector suspects within a certain period□
Strong completeness - every faulty node is eventually permanently suspected by every non-faulty node○
Weak completeness - every faulty node is eventually permanently suspected by some non-faulty node○
Degree of accuracy●
depends on how many mistakes a failure detector makes within a certain period□
Strong accuracy - no node is suspected (by any node) before it crashes○
Weak accuracy - some non-faulty node is never suspected○
Eventual strong accuracy - after some time, the system satisfies strong accuracy.○
Eventual weak accuracy - after some time, the system satisfies weak accuracy.○

SWIM Failure Detection
Each node in a set of N nodes●
Choose a random peer○
Ping - ACK□
Indirect Ping (iff no ACK)○
Choose k random peers□
Ask each of them to ping the target (indirect Ping)○
Evaluation:
completeness: every node will eventually be pinged!●
accuracy: “high” (🔍)●
speed of detection: 1 * Interval●
network load: (4*k + 2) * N ~ O(N)●
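The probe described above, and where the `4*k + 2` message bound per node comes from, can be sketched like this. The function `probe` and the `can_reach(src, dst)` callback are hypothetical: the callback stands in for the network and returns whether a ping from `src` to `dst` would be acknowledged.

```python
import random


def probe(prober, target, members, can_reach, k, rng):
    """Run one SWIM-style probe; return (target_acked, messages_sent)."""
    messages = 1                              # direct ping
    if can_reach(prober, target):
        return True, messages + 1             # direct ACK
    helpers = [m for m in members if m not in (prober, target)]
    acked = False
    for helper in rng.sample(helpers, min(k, len(helpers))):
        messages += 1                         # ping request to the helper
        if not can_reach(prober, helper):
            continue
        messages += 1                         # helper pings the target
        if can_reach(helper, target):
            messages += 2                     # target ACKs, helper relays ACK
            acked = True
    return acked, messages


if __name__ == "__main__":
    members = ["a", "b", "c", "d", "e"]
    crashed = lambda src, dst: "c" not in (src, dst)
    print(probe("a", "c", members, crashed, 3, random.Random(0)))
```

Each indirect path costs at most four messages and the direct attempt at most two, so one probe never exceeds `4*k + 2` messages regardless of the cluster size.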

SWIM Membership Protocol
Aware of joining / leaving nodes●
Motivated by Gossip●
Piggy-back approach■
Infection-style○
ping is sent to random peer□
eventually (weakly) consistent□
updates are sent peer-to-peer□
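The piggy-back idea above can be sketched by letting each ping and ACK carry membership updates. Everything here is an illustrative assumption: the class `Member`, the `(version, status)` encoding, and the choice to send the whole view on every message (a real implementation would bound how much is piggybacked).

```python
class Member:
    """A node whose pings and ACKs carry membership updates."""

    def __init__(self, name: str):
        self.name = name
        self.view = {name: (0, "alive")}  # peer -> (version, status)

    def merge(self, updates: dict) -> None:
        """Keep the entry with the higher version for each peer."""
        for peer, (version, status) in updates.items():
            if peer not in self.view or self.view[peer][0] < version:
                self.view[peer] = (version, status)

    def ping(self, other: "Member") -> None:
        other.merge(self.view)   # updates piggybacked on the ping
        self.merge(other.view)   # updates piggybacked on the ACK

    def live_peers(self) -> list:
        return sorted(p for p, (_, s) in self.view.items()
                      if s == "alive" and p != self.name)


if __name__ == "__main__":
    a, b, c = Member("a"), Member("b"), Member("c")
    a.ping(b)
    b.ping(c)
    print(c.live_peers())
```

No dedicated membership messages are sent: knowledge of joins and leaves spreads as a side effect of the failure-detection traffic, which is why the views are only eventually consistent.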

SWIM - Optimization
Suspicion state - to improve accuracy
Trade-off between failure detection time and false positives.●
Introduce suspicion state.●
A 👉 B: Ping! Suspect C failed■
B 👉 A: ACK!■
A few moments later■
A, B 👉 C: Ping! Are you dead?□
C 👉 A,B: ACK! (i’m not 😋)□
[Figure: state FSM]
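The suspicion mechanism can be sketched as a small state machine. This follows the general idea described above (a suspected peer gets a grace period to answer before being declared dead), not the exact diagram from the slide; the class names, the `suspicion_timeout` parameter, and the rule that a dead peer stays dead are assumptions.

```python
from enum import Enum


class State(Enum):
    ALIVE = "alive"
    SUSPECT = "suspect"
    DEAD = "dead"


class PeerStatus:
    """Local view of one peer, with a suspicion grace period."""

    def __init__(self, suspicion_timeout: float):
        self.state = State.ALIVE
        self.suspicion_timeout = suspicion_timeout
        self.suspected_at = None

    def on_probe_failed(self, now: float) -> None:
        if self.state is State.ALIVE:
            self.state = State.SUSPECT   # do not declare dead right away
            self.suspected_at = now

    def on_ack(self) -> None:
        if self.state is State.SUSPECT:
            self.state = State.ALIVE     # the peer refuted the suspicion
            self.suspected_at = None

    def tick(self, now: float) -> None:
        if (self.state is State.SUSPECT
                and now - self.suspected_at >= self.suspicion_timeout):
            self.state = State.DEAD      # grace period expired


if __name__ == "__main__":
    peer = PeerStatus(suspicion_timeout=5.0)
    peer.on_probe_failed(now=0.0)
    peer.tick(now=6.0)
    print(peer.state)
```

A longer `suspicion_timeout` reduces false positives but delays detection of real failures, which is the trade-off named above.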

SWIM - Optimization
Round-robin probe peer selection
Randomly sort peer set■
Ping in round-robin order■
Evaluation:
Completeness: improved; the time until a failed peer is probed becomes bounded○
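Round-robin probe selection as described above can be sketched with a shuffled list that is re-shuffled after each full pass. The class name `RoundRobinProber` is hypothetical, and the sketch assumes a non-empty, fixed peer set.

```python
import random


class RoundRobinProber:
    """Yields probe targets in a random order, one full pass at a time."""

    def __init__(self, peers, seed: int = 0):
        self.rng = random.Random(seed)
        self.order = list(peers)
        self.rng.shuffle(self.order)  # randomly sort the peer set
        self.index = 0

    def next_target(self):
        if self.index == len(self.order):
            # Full pass done: re-shuffle and start over.
            self.rng.shuffle(self.order)
            self.index = 0
        target = self.order[self.index]
        self.index += 1
        return target


if __name__ == "__main__":
    prober = RoundRobinProber(["a", "b", "c", "d"])
    print([prober.next_target() for _ in range(8)])
```

With purely random selection a peer could go unprobed for an arbitrarily long time; here every peer is probed exactly once per pass, so a crashed peer is pinged by this node within a bounded number of intervals.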

SWIM - Limitations
Node leave vs fail●
Re-joining●
Event ordering●
Message encryption●
Peer metadata●
Custom payload●
Network participants●
More details: https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf

SWIM - Implementation
memberlist https://github.com/hashicorp/memberlist●
Serf and Consul rely on the SWIM-based memberlist for failure detection and group membership.●

Other “announced” applications
Cassandra internal - understand gossip https://www.youtube.com/watch?v=FuP1Fvrv6ZQ●
AWS S3 gossip http://status.aws.amazon.com/s3-20080720.html●
Slicing structured overlay network
T-MAN https://www.researchgate.net/publication/225403352_T-Man_Gossip-Based_Overlay_Topology_Management●
https://managementfromscratch.wordpress.com/2016/04/01/introduction-to-gossip●
