- 1. Capacity of Agreement with Finite Link Capacity. Guanfeng Liang @ Infocom 2011, Electrical and Computer Engineering, University of Illinois at Urbana-Champaign. Joint work with Prof. Nitin Vaidya
- 2. MOTIVATION
- 3. Motivation: Distributed systems are emerging: cloud computing (e.g. Windows Azure), distributed file systems, data centers, multiplayer online games. They have a large number of distributed components, and those components need to be coordinated.
- 4. Motivation: Distributed primitives: clock synchronization, mutual exclusion, agreement, etc. There is a large body of literature in distributed algorithms.
- 5. Motivation: A networking guy asks: “How would constraints of the network affect the performance of these primitives?” An algorithm guy replies: “……” Network-aware distributed algorithm design.
- 6. BYZANTINE AGREEMENT IN P2P NETWORKS
- 7. Byzantine Agreement (BA): Broadcast. A sender wants to send a message to n-1 receivers. Fault-free receivers must agree. If the sender is fault-free, they must agree on its message. Any ≤ f nodes may fail.
- 8. Why agreement? Distributed systems are failure-prone. Non-malicious: crashed nodes, buggy code. Malicious: an attacker tries to crack the system. For a system that is robust against faults, it is important to maintain consistent state.
- 9. Impact of the Network: How does the capacity (rate region) of the network affect agreement performance? How to quantify the impact?
- 10. Rate Region: Defines the way “links” may share the channel. The interference that links pose to each other determines whether a set of transmissions can succeed together.
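- 11. “Ethernet” Rate Region: Rate S1 + Rate S2 ≤ C. [Figure: sender S with receivers 1 and 2; rate region plotted with axes Rate S1 and Rate S2]
- 12. Point-to-Point Network Rate Region: Each directed link is independent of other links: Rate ij ≤ Capacity ij. [Figure: sender S with receivers 1 and 2]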
- 13. Capacity of Agreement: b(t) = # bits agreed in [0,t]. Throughput = $\lim_{t \to \infty} b(t)/t$. Capacity of agreement: supremum of achievable throughput for a given rate region.
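
Written out as a display formula, the definition on this slide might read as follows. The symbols $T$ (throughput), $\mathcal{R}$ (rate region) and $C_{\mathrm{BA}}$ (capacity of agreement) are labels introduced here for illustration; only $b(t)$ comes from the slide.

```latex
\[
T \;=\; \lim_{t \to \infty} \frac{b(t)}{t},
\qquad
C_{\mathrm{BA}}(\mathcal{R}) \;=\; \sup \bigl\{\, T : T \text{ is achievable under rate region } \mathcal{R} \,\bigr\}
\]
```
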
- 14. Upper Bound of Capacity in P2P Networks: NC1: C ≤ min-cut(S,X | f receivers removed). [Figure: sender S and peers 1, 2, 3]
- 15. Upper Bound of Capacity in P2P Networks: NC2: C ≤ In(X | f nodes removed). [Figure: sender S and peers 1, 2, 3]
- 16. Upper Bound of Capacity in P2P Networks: NC1: C ≤ min-cut(S,X | f receivers removed). NC2: C ≤ In(X | f nodes removed). Upper bound = 1+ε. [Figure: sender S and peers 1, 2, 3; one link labeled ε]
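
As a rough illustration of how NC1 and NC2 could be evaluated on a small network, here is a minimal Python sketch. It rests on assumptions not stated on the slides: In(X) is read as the total incoming link capacity of peer X, "f nodes removed" is read as any f nodes other than X (possibly including S), and the link capacities are made-up numbers rather than those of the figure. The function names are hypothetical.

```python
from itertools import combinations

def min_cut(cap, nodes, s, t):
    """Brute-force s-t min cut over the given node set (fine for tiny graphs)."""
    others = [v for v in nodes if v not in (s, t)]
    best = float("inf")
    for r in range(len(others) + 1):
        for extra in combinations(others, r):
            side = {s, *extra}  # source side of the cut
            cut = sum(c for (u, v), c in cap.items()
                      if u in side and v in nodes and v not in side)
            best = min(best, cut)
    return best

def nc1_bound(cap, source, peers, f):
    """Min over removed sets of f peers, and remaining peers X, of min-cut(S, X)."""
    best = float("inf")
    for removed in combinations(peers, f):
        nodes = {source, *peers} - set(removed)
        for x in nodes - {source}:
            best = min(best, min_cut(cap, nodes, source, x))
    return best

def nc2_bound(cap, source, peers, f):
    """Min over peers X, and removed sets of f other nodes, of X's incoming capacity."""
    everyone = {source, *peers}
    best = float("inf")
    for x in peers:
        for removed in combinations(sorted(everyone - {x}), f):
            incoming = sum(c for (u, v), c in cap.items()
                           if v == x and u not in removed)
            best = min(best, incoming)
    return best

# Hypothetical 4-node network (assumed numbers, not the slide's figure):
# S -> peer links have capacity 1, peer -> peer links have capacity 0.25.
peers = ["1", "2", "3"]
cap = {("S", p): 1.0 for p in peers}
cap.update({(p, q): 0.25 for p in peers for q in peers if p != q})

upper_bound = min(nc1_bound(cap, "S", peers, 1), nc2_bound(cap, "S", peers, 1))
```

In this made-up network NC2 is the tighter of the two conditions, because removing S leaves each peer with only its two slow incoming links.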
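- 17. Classic Solution for Broadcast: value v. One peer is faulty. [Figure: S sends v to each of peers 1, 2, 3]
- 18. Classic Solution for Broadcast: value v. [Figure: S sends v to peers 1, 2, 3; values v, v are relayed between peers]
- 19. Classic Solution for Broadcast: value v. [Figure: S sends v to peers 1, 2, 3; values v, v, ?, ? are relayed between peers]
- 20. Classic Solution for Broadcast: value v. [Figure: S sends v to peers 1, 2, 3; values v, v, ?, v, ?, v are relayed between peers]
- 21. Classic Solution for Broadcast: value v. [Figure: after relaying, peers hold the vector [v,v,?]]
- 22. Classic Solution for Broadcast: value v. Majority vote results in the correct result at a good receiver. [Figure: S sends v to peers 1, 2, 3; relayed values v and ?]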
- 23. Classic Solution for Broadcast: Faulty source. [Figure: S sends v, x, w to peers 1, 2, 3]
- 24. Classic Solution for Broadcast: Faulty source. [Figure: S sends v, x, w; value w is relayed between peers]
- 25. Classic Solution for Broadcast: Faulty source. [Figure: S sends v, x, w; values w, v, x are relayed between peers]
- 26. Classic Solution for Broadcast: Faulty source. [Figure: after relaying, each peer holds the vector [v,w,x]]
- 27. Classic Solution for Broadcast: Faulty source. Vote result is identical at good receivers. [Figure: each peer holds the vector [v,w,x]]
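
The two cases walked through above (one faulty peer, then a faulty source) can be mimicked with a small Python sketch for 4 nodes and f = 1. This is an illustrative model, not the presenter's code: the function name, the `DEFAULT` fallback used when no value has a majority, and the simplification that a faulty peer relays the same forged value to everyone are all assumptions.

```python
from collections import Counter

DEFAULT = "default"  # assumed fallback when no value reaches a majority

def classic_broadcast(sent_by_source, faulty_peer=None, forged="?"):
    """Simulate one run of the relay-and-vote scheme.

    sent_by_source: dict peer -> value the source sends to that peer.
    faulty_peer:    peer that relays `forged` instead of what it received.
    Returns a dict of honest peer -> decided value.
    """
    peers = list(sent_by_source)
    decisions = {}
    for p in peers:
        if p == faulty_peer:
            continue  # we only care about what good receivers decide
        received = [sent_by_source[p]]  # value received directly from the source
        for q in peers:
            if q == p:
                continue
            # Honest peers relay what the source sent them; the faulty one lies.
            received.append(forged if q == faulty_peer else sent_by_source[q])
        value, count = Counter(received).most_common(1)[0]
        decisions[p] = value if count > len(received) // 2 else DEFAULT
    return decisions

# Fault-free source, peer 2 faulty: good receivers still decide on v.
good_source = classic_broadcast({1: "v", 2: "v", 3: "v"}, faulty_peer=2)

# Faulty source sending different values: all peers see the same vector
# and therefore reach the same decision.
bad_source = classic_broadcast({1: "v", 2: "x", 3: "w"})
```

Note that every link carries the whole value in this scheme, which is the inefficiency the later slides address.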
- 28. Classic Solution in P2P Networks: Whole message is sent on every link, so throughput ≤ slowest link. Throughput ≤ ε, but upper bound = 1+ε. [Figure: sender S and peers 1, 2, 3; one link labeled ε]
- 29. Improving Broadcast Throughput: Observation: the classic solution is in fact an “error correction code”. “Error detection codes” are more efficient.
- 30. Error Detection Code: Two-bit value a, b. [Figure: S sends a, a+b, b to peers 1, 2, 3]
- 31. Error Detection Code: Two-bit value a, b. [Figure: S sends a, a+b, b; peers relay their symbols to one another; each peer holds [a,b,a+b]]
- 32. Error Detection Code: Two-bit value a, b. Parity check passes at all nodes. Agree on (a,b). [Figure: each peer holds [a,b,a+b]]
- 33. Error Detection Code: Two-bit value a, b. Parity check fails at a node if 1 misbehaves. [Figure: S sends a, a+b, b; some peers hold [?,b,a+b]]
- 34. Error Detection Code: Two-bit value a, b. Only detection is not what we want. Check fails at a good node if S sends a bad codeword (a,b,z). [Figure: S sends a, z, b; each peer holds [a,b,z]]
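
A minimal Python sketch of the parity check in the example above, under stated assumptions: the three code symbols are (a, b, a XOR b), peer i receives symbol i from the source (the figure's exact assignment may differ), and a misbehaving node is modeled by a hypothetical `tamper` map of forged relays. The function names are illustrative only.

```python
def encode(a, b):
    """Two data bits plus one parity bit: (a, b, a+b) with + taken mod 2."""
    return (a, b, a ^ b)

def fast_round(symbols, tamper=None):
    """Each peer gets one symbol from the source, relays it to the other two,
    and then checks parity on the three symbols it has collected.

    symbols: tuple of the 3 symbols the source sends to peers 1, 2, 3.
    tamper:  optional dict (sender, receiver) -> forged bit relayed instead.
    Returns dict peer -> True if that peer's parity check passes.
    """
    tamper = tamper or {}
    passed = {}
    for p in (1, 2, 3):
        view = []
        for q in (1, 2, 3):
            if q == p:
                view.append(symbols[p - 1])  # own symbol, straight from S
            else:
                view.append(tamper.get((q, p), symbols[q - 1]))  # relayed by q
        passed[p] = (view[0] ^ view[1]) == view[2]
    return passed

honest = fast_round(encode(1, 0))                       # everyone behaves
peer1_lies = fast_round(encode(1, 0), tamper={(1, 2): 0})  # peer 1 lies to peer 2
bad_codeword = fast_round((1, 0, 0))                    # S sends (a, b, z), z != a+b
```

Each link carries one bit for a two-bit value here, whereas the classic scheme sends the whole value on every link. The check only detects a problem; it does not say who misbehaved.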
- 35. Modification: Agree on small pieces of data in each “round”. If X misbehaves with Y in a given round, avoid using the XY link in the next round (for the next piece of data). Repeat.
- 36. Algorithm Structure: Fast round (as in the example).
- 37. Algorithm Structure: Fast round (as in the example). [Figure: S sends a, a+b, b to peers 1, 2, 3; each peer holds [a,b,a+b]]
- 38. Algorithm Structure: Fast round (as in the example); fast round; …; fast round in which failure is detected; expensive round to learn new info about failure.
- 39. Algorithm Structure: Fast round (as in the example); fast round; …; fast round in which failure is detected; expensive round to learn new info about failure; fast round; fast round; …; expensive round to learn new info about failure.
- 40. Algorithm Structure: Fast round (as in the example); fast round; …; fast round in which failure is detected; expensive round to learn new info about failure; fast round; fast round; …; expensive round to learn new info about failure. After a small number of expensive rounds, failures are completely identified.
- 41. Algorithm Structure: Fast round (as in the example); fast round; …; fast round in which failure is detected; expensive round to learn new info about failure; fast round; fast round; …; expensive round to learn new info about failure. After a small number of rounds, failures are identified. Only fast rounds from here on.
- 42. Algorithm “Analysis”: Many fast rounds, few expensive rounds. When averaged over time, the cost of expensive rounds is negligible. Average usage of link capacity depends only on the fast round, which is very efficient. Achieves capacity for 4-node networks and symmetric networks.
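
The averaging argument can be checked with a few lines of arithmetic. The sketch below assumes a fixed per-round cost for each round type; the function name and all numbers are hypothetical and only illustrate that a bounded number of expensive rounds washes out as the number of fast rounds grows.

```python
def average_cost(fast_rounds, expensive_rounds, fast_cost, expensive_cost):
    """Average link usage per round over a run with both kinds of rounds."""
    total = fast_rounds * fast_cost + expensive_rounds * expensive_cost
    return total / (fast_rounds + expensive_rounds)

# Assumed numbers: an expensive round costs 1000x a fast one, but only 10 occur.
short_run = average_cost(100, 10, 1.0, 1000.0)
long_run = average_cost(10**6, 10, 1.0, 1000.0)
```

With the expensive-round count held fixed, `long_run` sits within about one percent of the fast-round cost, while `short_run` is dominated by the expensive rounds.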
- 43. OPEN PROBLEMS
- 44. Open Problems: Capacity of agreement for general rate regions.
- 45. Open Problems: Capacity of agreement for general rate regions. Even the multicast problem with Byzantine nodes is unsolved (for multicast, sources are fault-free).
- 46. Rich Problem Space: Wireless channel allows overhearing. Transmit to 2 at high rate, or low rate? Low rate allows reception at 1. [Figure: sender S and nodes 1, 2, 3]
- 47. Rich Problem Space: Similar questions are relevant for any multi-party computation. [Figure labels: Distributed Computation; Communication; Multi-party computing under Communication Constraints]
- 48. MIND TEASER
- 49. How many bits needed? N nodes each have a k-bit input. Check if all inputs are identical: at least 1 node “detects” if they are not identical. Intuitive guess: (N-1)k bits. Is it the best we can do? [Figure: nodes 1, 2, 3]
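
One way to arrive at the intuitive (N-1)k figure is a baseline in which every node except one ships its whole input to a single collector, which then compares. The sketch below illustrates only that baseline, under that assumption; it does not answer whether fewer bits suffice, and the function name is hypothetical.

```python
def naive_check(inputs, k):
    """Nodes 2..N each send their k-bit input to node 1, which compares them.

    Returns (all_identical, bits_sent). Node 1 is the node that "detects".
    """
    collector_input = inputs[0]
    bits_sent = 0
    identical = True
    for value in inputs[1:]:
        bits_sent += k  # one full k-bit input crosses a link
        if value != collector_input:
            identical = False
    return identical, bits_sent

same = naive_check([5, 5, 5], k=3)       # N = 3 nodes, 3-bit inputs
different = naive_check([5, 4, 5], k=3)
```
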
- 50. THANK YOU!
- 51. Improving Broadcast Throughput: Observation: the classic solution is in fact “error correction”. “Error detection” suffices: disseminate some data, then check if it is consistent or not. Consistent: decide. Inconsistent: diagnose and adapt. Repeat for new data.

