Gossip protocol and applications
Tu Nguyen
Staff Software Engineer - Axon
Gossip protocol
Gossip in computer science
● A peer-to-peer communication protocol
● Inspired by epidemics, human gossip, and social networks (the spreading of rumors)
  ■ Also called an epidemic protocol
  ■ Why? Rumors and epidemics travel through a society at great speed and reach almost every
    member of the community without needing a central coordinator.
● Gossip was originally devised to solve the multicast problem
● Multicast
  ■ We want to communicate a message to all the nodes in the network
  ■ Each node sends the message to only a few of the nodes
● Multicast problems
  ○ Fault tolerance: nodes might crash, packets might be dropped, etc.
  ○ Scalability: millions or hundreds of millions of nodes
  ○ Centralized multicast: a single sender sends TCP/UDP packets to every other node
  ○ Tree-based multicast: too much redundancy from ACK/NACK messages
  ○ Multicast was originally used heavily in network devices (e.g. routers); how can it be leveraged at the application layer?
Gossip basic
A node wants to share some information with the other nodes in the network. Periodically, it
selects a node at random from the set of nodes and exchanges the information with it. The node
that receives the information then does exactly the same thing.
● Cycle
  ■ The number of rounds needed to spread the information
● Fanout
  ■ The number of nodes that a node gossips with in each cycle
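The cycle and fanout parameters can be illustrated with a small simulation. This is a minimal sketch, not a production protocol: the function name `simulate_push_gossip`, the choice of node 0 as the origin, and the push-only exchange are assumptions made for illustration.

```python
import random


def simulate_push_gossip(num_nodes, fanout, seed=None):
    """Return the number of cycles until every node holds the information.

    Assumes fanout >= 1 and that every node can reach every other node.
    """
    rng = random.Random(seed)
    informed = {0}  # assumption: node 0 is the origin of the information
    cycles = 0
    while len(informed) < num_nodes:
        newly_informed = set()
        for node in informed:
            # Each informed node gossips with `fanout` random peers (never itself).
            peers = [p for p in range(num_nodes) if p != node]
            newly_informed.update(rng.sample(peers, min(fanout, len(peers))))
        informed |= newly_informed
        cycles += 1
    return cycles


if __name__ == "__main__":
    for fanout in (1, 2, 3):
        print(fanout, simulate_push_gossip(1000, fanout, seed=42))
```

Running it for several fanout values shows the trade-off: a larger fanout needs fewer cycles but sends more messages per cycle.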
Gossip properties
● Node selection must be random (or guarantee enough peer diversity).
● Each node stores only local information. There is no shared global state.
● Communication is round-based (periodic).
● Transmission and processing capacity per round is limited.
● All nodes run the same protocol.
● Not deterministic (because of random peer sampling).
Advantages of Gossip
● Scalable
● Fault-tolerant
● Robust
● Decentralized
● Convergent consistency
Gossip modeling
Consider a distributed network in which nodes communicate by passing messages to each
other.
● State of a node
  ■ Susceptible (S) - the node has not received the update yet (it is not infected).
  ■ Infected (I) - the node has the update and is willing to share it.
  ■ Removed (R) - the node has received the update but is no longer willing to share it.
● Two basic models
  ■ SI (anti-entropy)
  ■ SIR (rumor-mongering)
When does a node enter the R state?
👉 There are many algorithms. One of them is counting redundant messages.
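One way to picture the SIR model with a redundancy counter is the sketch below. It is only an illustration under stated assumptions: the name `rumor_mongering`, the threshold parameter `max_redundant`, and the rule "stop after that many pushes to peers that already knew the rumor" are choices made here, and real protocols use other stopping rules as well.

```python
import random


def rumor_mongering(num_nodes, max_redundant, seed=None):
    """Simulate SIR gossip; return (nodes still susceptible, rounds run).

    Assumes num_nodes >= 2 and max_redundant >= 1.
    """
    rng = random.Random(seed)
    state = ["S"] * num_nodes
    redundant = [0] * num_nodes  # redundant pushes counted per node
    state[0] = "I"               # assumption: node 0 starts the rumor
    rounds = 0
    while "I" in state:
        # Snapshot the infected nodes so newly infected ones start next round.
        for node in [n for n in range(num_nodes) if state[n] == "I"]:
            peer = rng.choice([p for p in range(num_nodes) if p != node])
            if state[peer] == "S":
                state[peer] = "I"          # the rumor was new to the peer
            else:
                redundant[node] += 1       # the peer already knew it
                if redundant[node] >= max_redundant:
                    state[node] = "R"      # lose interest, stop sharing
        rounds += 1
    return state.count("S"), rounds
```

Because infected nodes eventually stop, some nodes may never receive the rumor; the first return value counts them. Raising `max_redundant` lowers that residue at the cost of more messages.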
Gossip modeling
● Push / Pull / Push-Pull
  ■ Push
    ● Infected (I) nodes send the update to, and thereby infect, susceptible (S) nodes
    ● Efficient when there are few updates
  ■ Pull
    ● All nodes actively pull for updates
    ● Efficient when there are many updates
  ■ Push-Pull
    ● A node pushes when it has updates and also pulls for new updates
    ● The node and its selected peer exchange information
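The three exchange styles can be sketched on top of a versioned key-value state. The representation (a dict mapping each key to a `(value, version)` tuple) and the helper names `merge`, `push`, `pull`, and `push_pull` are assumptions for illustration, not part of any particular library.

```python
def merge(local, remote):
    """Return a copy of `local` updated with every newer entry from `remote`."""
    merged = dict(local)
    for key, (value, version) in remote.items():
        # Keep whichever side holds the higher version for this key.
        if key not in merged or merged[key][1] < version:
            merged[key] = (value, version)
    return merged


def push(initiator, peer):
    """The initiator sends its state; only the peer learns something."""
    peer.update(merge(peer, initiator))


def pull(initiator, peer):
    """The initiator asks for the peer's state; only the initiator learns."""
    initiator.update(merge(initiator, peer))


def push_pull(initiator, peer):
    """Both sides exchange state and end up with the same entries."""
    push(initiator, peer)
    pull(initiator, peer)


if __name__ == "__main__":
    a = {"x": ("new", 2)}
    b = {"x": ("old", 1), "y": ("only-b", 1)}
    push_pull(a, b)
    print(a == b)  # both now hold x at version 2 and y at version 1
```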
Gossip modeling
https://flopezluis.github.io/gossip-simulator/
Gossip Applications
Applications
● Cluster membership
● Information dissemination
● Failure detection
● Database replication
● Overlay network
● Aggregations
Cluster Membership
Who are my live peers?
Desired properties
● Connectedness
● Balance
● Short path length
● Reduced redundancy
● Scalability
● Accuracy
Full view vs. partial view
Full view:
👍 Connectedness
👍 Short path length
👌 Accuracy
👌 Balance
👎 High redundancy
👎 Low scalability
Partial view:
👌 Connectedness
👌 Short path length
👌 Accuracy
👌 Balance
👍 Low redundancy
👍 High scalability
Cluster Membership
● SWIM - Cornell University, 2002
● SCAMP - Microsoft Research, 2003
● CYCLON - Vrije Universiteit Amsterdam, The Netherlands, 2005
● HyParView - University of Lisbon, 2007
SWIM - Cornell University (2002)
Scalable Weakly-consistent Infection-style Process Group Membership
https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf
Properties
● Scalable
● Weakly consistent
● Infection-style
● Membership protocol
SWIM
● Motivated by traditional heartbeating
  ■ Every interval T, each node notifies its peers that it is alive
  ■ If no update is received from peer P after T * limit, mark P as dead
  ■ Heartbeat = membership + failure detection
● What heartbeating does well:
  ■ Completeness: strong. Every crashed node is eventually detected by all correct nodes.
  ■ Accuracy: high
● Heartbeat problems
  ■ Network load: O(N^2) messages per interval, since every node notifies every other node
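The heartbeat rule and its cost can be written down in a few lines. This is a minimal sketch under assumptions: the class name `HeartbeatDetector`, its methods, and the use of caller-supplied timestamps instead of a real clock are all hypothetical choices for illustration.

```python
class HeartbeatDetector:
    """Marks a peer dead when no heartbeat arrived for interval * limit."""

    def __init__(self, interval, limit):
        self.timeout = interval * limit
        self.last_seen = {}  # peer -> time of its most recent heartbeat

    def heartbeat(self, peer, now):
        """Record a heartbeat received from `peer` at time `now`."""
        self.last_seen[peer] = now

    def dead_peers(self, now):
        """Return the peers whose last heartbeat is older than the timeout."""
        return sorted(p for p, t in self.last_seen.items()
                      if now - t > self.timeout)


def heartbeat_messages_per_interval(n):
    """All-to-all heartbeating: each of n nodes notifies the other n - 1."""
    return n * (n - 1)


if __name__ == "__main__":
    detector = HeartbeatDetector(interval=1.0, limit=3)
    detector.heartbeat("a", now=0.0)
    detector.heartbeat("b", now=0.0)
    detector.heartbeat("a", now=2.5)
    print(detector.dead_peers(now=3.5))
    print(heartbeat_messages_per_interval(100))
```

The quadratic message count is what motivates looking for a cheaper failure detector.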
SWIM is trying to ...
● Separate the two problems and solve them one by one
  ○ Failure detection (👉 “live” peers)
  ○ Membership protocol (👉 list of peers)
● Optimization
  ○ Reduce network load
  ○ Failure detection
    ● Decrease processing time
    ● Increase accuracy
Failure Detection properties
● One step back...
  □ The two classes of properties of a distributed system
    ○ Safety - nothing bad ever happens
    ○ Liveness - something good eventually happens
● Failure detection properties
  □ Completeness (a liveness property) - the failure detector eventually detects every node
    that has crashed.
  □ Accuracy (a safety property) - the failure detector does not mistakenly report a live node
    as crashed.
Failure Detection properties
● Degree of completeness
  □ Depends on how many crashed nodes the failure detector suspects within a certain period
    ○ Strong completeness - every faulty node is eventually permanently suspected by every
      non-faulty node
    ○ Weak completeness - every faulty node is eventually permanently suspected by some
      non-faulty node
● Degree of accuracy
  □ Depends on how many mistakes the failure detector makes within a certain period
    ○ Strong accuracy - no node is suspected (by any node) before it crashes
    ○ Weak accuracy - some non-faulty node is never suspected
    ○ Eventual strong accuracy - after some point in time, strong accuracy holds
    ○ Eventual weak accuracy - after some point in time, weak accuracy holds
SWIM Failure Detection
● Each node in a set of N nodes, once per interval:
  ○ Chooses a random peer
    □ Ping - ACK
  ○ Indirect ping (only if no ACK arrives)
    □ Chooses k random peers
    □ Asks each of them to ping the target on its behalf
Evaluation:
● Completeness: every node will eventually be pinged
● Accuracy: “high” (🔍)
● Speed of detection: roughly 1 * interval, independent of N
● Network load: (4*k + 2) * N ~ O(N)
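The probe described above can be sketched as follows. The network is modelled by a caller-supplied predicate `can_reach(src, dst)` that stands for a successful ping plus ACK; that predicate, and the names `probe` and `worst_case_messages`, are assumptions made for this sketch rather than anything defined by SWIM itself.

```python
import random


def probe(prober, target, members, k, can_reach, rng=random):
    """Return True if `prober` gets a direct or indirect ACK from `target`."""
    # Step 1: direct ping.
    if can_reach(prober, target):
        return True
    # Step 2: no ACK, so ask up to k random other members to ping the target.
    candidates = [m for m in members if m not in (prober, target)]
    helpers = rng.sample(candidates, min(k, len(candidates)))
    # An indirect ACK needs a working prober-helper and helper-target path.
    return any(can_reach(prober, h) and can_reach(h, target) for h in helpers)


def worst_case_messages(n, k):
    """Per interval and node: ping + ACK, plus 4 messages per helper
    (request, helper's ping, target's ACK, forwarded ACK)."""
    return (4 * k + 2) * n


if __name__ == "__main__":
    members = ["a", "b", "c", "d"]
    # Only the direct link between a and c is broken; c itself is alive.
    flaky_link = lambda src, dst: {src, dst} != {"a", "c"}
    print(probe("a", "c", members, k=2, can_reach=flaky_link))
```

The indirect step is what keeps a single bad link from producing a false positive, while the load stays linear in N.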
SWIM Membership Protocol
● Aware of nodes joining and leaving
● Motivated by gossip
  ■ Piggyback approach: membership updates ride on the failure-detection messages
    ○ Infection-style
      □ Each ping is sent to a random peer
      □ Eventually (weakly) consistent
      □ Updates are sent peer-to-peer
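A piggyback buffer might look like the sketch below. It is an assumption-heavy illustration: the class name `UpdateBuffer`, the `max_sends` cap, and the policy of preferring updates that were sent the fewest times are choices made here to show the idea, and an implementation would tune them.

```python
class UpdateBuffer:
    """Holds membership updates waiting to be attached to outgoing pings."""

    def __init__(self, max_sends):
        self.max_sends = max_sends
        self._pending = {}  # (member, status) -> times piggybacked so far

    def add(self, member, status):
        """Queue an update such as ("n1", "joined") for dissemination."""
        self._pending[(member, status)] = 0

    def take(self, limit):
        """Pick up to `limit` updates to attach to the next message."""
        # Prefer updates that have been piggybacked the fewest times.
        chosen = sorted(self._pending, key=self._pending.get)[:limit]
        for update in chosen:
            self._pending[update] += 1
            if self._pending[update] >= self.max_sends:
                del self._pending[update]  # gossiped enough, drop it
        return chosen


if __name__ == "__main__":
    buf = UpdateBuffer(max_sends=2)
    buf.add("n1", "joined")
    buf.add("n2", "left")
    ping = {"type": "ping", "updates": buf.take(limit=1)}
    print(ping)
```

No extra messages are sent for membership changes; they spread at the pace of the failure-detection traffic.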
SWIM - Optimization
Suspicion state - to improve accuracy
● Trade-off between failure detection time and false positives.
● Introduce a suspicion state.
  ■ A 👉 B: Ping! I suspect C has failed
  ■ B 👉 A: ACK!
  ■ A few moments later
    □ A, B 👉 C: Ping! Are you dead?
    □ C 👉 A, B: ACK! (I’m not 😋)
State FSM
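A state machine for one tracked peer could be sketched like this. The state names `alive`, `suspect`, and `dead`, the class `MemberState`, and the fixed `suspicion_timeout` are assumptions for illustration; the sketch also leaves out how suspicions are gossiped and how conflicting reports are ordered.

```python
ALIVE, SUSPECT, DEAD = "alive", "suspect", "dead"


class MemberState:
    """Tracks one peer; a failed probe leads to suspicion, not death."""

    def __init__(self, suspicion_timeout):
        self.state = ALIVE
        self.suspicion_timeout = suspicion_timeout
        self.suspected_at = None

    def on_probe_failed(self, now):
        """Direct and indirect pings got no ACK: start suspecting."""
        if self.state == ALIVE:
            self.state = SUSPECT
            self.suspected_at = now

    def on_ack(self):
        """Any ACK from the peer refutes the suspicion."""
        if self.state == SUSPECT:
            self.state = ALIVE
            self.suspected_at = None

    def tick(self, now):
        """Declare the peer dead once the suspicion has lasted long enough."""
        if (self.state == SUSPECT
                and now - self.suspected_at >= self.suspicion_timeout):
            self.state = DEAD
        return self.state
```

A longer `suspicion_timeout` gives a slow peer more time to answer (fewer false positives) but delays the detection of real failures.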
SWIM - Optimization
Round-robin probe peer selection
■ Randomly shuffle the peer set
■ Ping in round-robin order
Evaluation:
○ Completeness: improved, detection becomes time-bounded
State FSM
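Round-robin selection over a shuffled peer list can be sketched as below. The class name `RoundRobinProber` and the choice to reshuffle after every full pass are assumptions for this illustration; the peer list is assumed to be non-empty and static.

```python
import random


class RoundRobinProber:
    """Yields probe targets so every peer is visited once per pass."""

    def __init__(self, peers, seed=None):
        self._rng = random.Random(seed)
        self._peers = list(peers)
        self._rng.shuffle(self._peers)  # random order for the first pass
        self._next = 0

    def next_target(self):
        """Return the next peer to ping, reshuffling after each full pass."""
        if self._next >= len(self._peers):
            self._rng.shuffle(self._peers)
            self._next = 0
        target = self._peers[self._next]
        self._next += 1
        return target


if __name__ == "__main__":
    prober = RoundRobinProber(["a", "b", "c", "d"], seed=1)
    print([prober.next_target() for _ in range(8)])
```

With purely random selection a peer can go unprobed for an arbitrarily long time; here every peer is probed at least once in any two consecutive passes, which bounds the detection time.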
SWIM - Limitations
● Node leaving vs. failing
● Re-joining
● Event ordering
● Message encryption
● Peer metadata
● Custom payload
● Network participants
More details:  https://www.cs.cornell.edu/projects/Quicksilver/public_pdfs/SWIM.pdf
SWIM - Implementation
● memberlist https://github.com/hashicorp/memberlist
● Serf and Consul rely on the SWIM-based memberlist library for failure detection and group
  membership.
Other “announced” applications
● Cassandra internal - understand gossip https://www.youtube.com/watch?v=FuP1Fvrv6ZQ
● AWS S3 gossip http://status.aws.amazon.com/s3-20080720.html
● Slicing structured overlay network: T-MAN https://www.researchgate.net/publication/225403352_T-Man_Gossip-Based_Overlay_Topology_Management
● https://managementfromscratch.wordpress.com/2016/04/01/introduction-to-gossip

Grokking Techtalk #39: Gossip protocol and applications
