SlideShare a Scribd company logo
1 of 71
Distributed Systems
(3rd Edition)
Chapter 06: Coordination
Version: March 20, 2022
2 / 49
Coordination: Clock synchronization Physical clocks
Physical clocks
Problem
Sometimes we simply need the exact time, not just an ordering.
Solution: Universal Coordinated Time (UTC)
► Based on the number of transitions per second of the cesium 133
atom (pretty accurate).
► At present, the real time is taken as the average of some 50
cesium clocks around the world.
► Introduces a leap second from time to time to compensate that
days are getting longer.
Note
UTC is broadcast through short-wave radio and satellite. Satellites can
give an accuracy of about ±0.5 ms.
3 / 49
Coordination: Clock synchronization Clock synchronization algorithms
Clock synchronization
Precision
The goal is to keep the deviation between two clocks on any two
machines within a specified bound, known as the precision π:
∀
t,∀
p,q : |Cp(t) −Cq (t)|≤π
with Cp(t) the computed clock time of machine p at UTC time t.
Accuracy
In the case of accuracy, we aim to keep the clock bound to a value α:
∀
t,∀
p : |Cp (t) −t|≤α
Synchronization
► Internal synchronization: keep clocks precise
► External synchronization: keep clocks accurate
Coordination: Clock synchronization Clock synchronization algorithms
Clock drift
Clock specifications
► A clock comes specified with its maximum clock drift rate ρ.
► F(t) denotes oscillator frequency of the hardware clock at time t
► F is the clock’s ideal (constant) frequency ⇒ living up to
specifications:
∀
t : (1−ρ) ≤
F(t)
F
≤(1+ ρ)
Observation
By using hardware interrupts we
couple a software clock to the
hardware clock, and thus also its
clock drift rate:
1
Cp(t) =
F
t
0
F(t)dt ⇒ =
dCp(t) F(t)
dt F
⇒∀
t : 1−ρ ≤
dCp(t)
dt
≤1+ ρ
Fast, perfect, slow clocks
Clock time, C
dt
dCp(t)
= 1
dt
dCp(t)
> 1
dt
dCp(t)
< 1
4 / 49
Coordination: Clock synchronization Clock synchronization algorithms
Detecting and adjusting incorrect times
Getting the current time from a time server
B
T1
T2 T3
T4
A


Treq 

Tres
Network Time Protocol 5 / 49
Computing the relative offset θ and delay δ
Assumption: δTreq = T2 −T1 ≈T4 −T3 = δTres
θ = T3 + (T2 −T1) + (T4 −T3) / 2−T4 = (T2 −T1) + (T3 −T4) / 2
δ = (T4 −T1) −(T3 −T2) / 2
Coordination: Clock synchronization Clock synchronization algorithms
Detecting and adjusting incorrect times
Getting the current time from a time server
B
T1
T2 T3
T4
A


Treq 

Tres
Network Time Protocol 5 / 49
Computing the relative offset θ and delay δ
Assumption: δTreq = T2 −T1 ≈T4 −T3 = δTres
θ = T3 + (T2 −T1) + (T4 −T3) / 2−T4 = (T2 −T1) + (T3 −T4) / 2
δ = (T4 −T1) −(T3 −T2) / 2
Network Time Protocol
Collect eight (θ,δ) pairs and choose θ for which associated delay δ
was minimal.
Coordination: Clock synchronization Clock synchronization algorithms
Keeping time without UTC
Principle
Let the time server scan all machines periodically, calculate an
average, and inform each machine how it should adjust its time relative
to its present time.
Using a time server
Time daemon
3:00
3:00
3:25
2:50
Network
3:00 3:00
+25
3:25
2:50
3:05
3:00 0 +5
-10 +15
-20
3:05
3:05
The Berkeley algorithm 6 / 49
Coordination: Clock synchronization Clock synchronization algorithms
Keeping time without UTC
Principle
Let the time server scan all machines periodically, calculate an
average, and inform each machine how it should adjust its time relative
to its present time.
Using a time server
Time daemon
3:00
3:00
Network
3:00 3:00
+25
3:05
3:00 0 +5
-10 +15
-20
2:50 3:25 2:50 3:25 3:05 3:05
Fundamental
You’ll have to take into account that setting the time back is never
allowed ⇒ smooth adjustments (i.e., run faster or slower).
The Berkeley algorithm 6 / 49
Coordination: Clock synchronization Clock synchronization algorithms
Reference broadcast synchronization
Essence
► A node broadcasts a reference message m ⇒ each receiving
node p records the time Tp,m that it received m.
► Note: Tp,m is read from p’s local clock.
Problem: averaging will not
capture drift ⇒ use linear
regression
∑M
k= 1 p,k q,k
(T −T )
M
NO: Offset[p,q](t) =
YES: Offset[p,q](t) = αt + β
RBS minimizes critical
path
Message preparation
Time spent in NIC
A
B
C
Delivery time
to app.
Critical path RBS
Usual critical path
Clock synchronization in wireless networks 7 / 49
8 / 49
Coordination: Logical clocks Lamport’s logical clocks
The Happened-before relationship
Issue
What usually matters is not that all processes agree on exactly what
time it is, but that they agree on the order in which events occur.
Requires a notion of ordering.
8 / 49
Coordination: Logical clocks Lamport’s logical clocks
The Happened-before relationship
Issue
What usually matters is not that all processes agree on exactly what
time it is, but that they agree on the order in which events occur.
Requires a notion of ordering.
The happened-before relation
► If a and b are two events in the same process, and a comes
before b, then a → b.
► If a is the sending of a message, and b is the receipt of that
► message, then a → b
If a → b and b → c, then a → c
Note
This introduces a partial ordering of events in a system with
concurrently operating processes.
9 / 49
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks
Problem
How do we maintain a global view on the system’s behavior that is
consistent with the happened-before relation?
9 / 49
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks
Problem
How do we maintain a global view on the system’s behavior that is
consistent with the happened-before relation?
Attach a timestamp C(e) to each event e, satisfying the
following properties:
P1 If a and b are two events in the same process, and a → b, then
we demand that C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of
that message, then also C(a) < C(b).
9 / 49
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks
Problem
How do we maintain a global view on the system’s behavior that is
consistent with the happened-before relation?
Attach a timestamp C(e) to each event e, satisfying the
following properties:
P1 If a and b are two events in the same process, and a → b, then
we demand that C(a) < C(b).
P2 If a corresponds to sending a message m, and b to the receipt of
that message, then also C(a) < C(b).
Problem
How to attach a timestamp to an event when there’s no global clock ⇒
maintain a consistent set of logical clocks, one per process.
10 / 49
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks: solution
Each process Pi maintains a local counter Ci and adjusts
this counter
1. For each new event that takes place within Pi , Ci is incremented
by 1.
2. Each time a message m is sent by process Pi , the message
receives a timestamp ts(m) = Ci .
3. Whenever a message m is received by a process Pj , Pj adjusts
its local counter Cj to max{Cj,ts(m)}; then executes step 1 before
passing m to the application.
Notes
► Property P1 is satisfied by (1); Property P2 by (2) and (3).
► It can still occur that two events happen at the same time. Avoid
this by breaking ties through process IDs.
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks: example
Consider three processes with event counters operating at
different rates
0
6
12
18
24
30
36
42
48
54
60
0
8
16
24
32
40
48
56
64
72
80
0
10
20
30
40
50
60
70
80
90
100
m1
m2
m3
m4
P1
P2
P P P P
3 1 2 3
m1
m2
m3
m4
0
10
20
30
40
50
60
70
80
90
100
P2 adjusts
its clock
P adjusts
1
its clock
0
6
12
18
24
30
36
42
48
70
76
11 / 49
0
8
16
24
32
40
48
61
69
77
85
Coordination: Logical clocks Lamport’s logical clocks
Logical clocks: where implemented
Adjustments implemented in middleware
Adjust local clock
Message is received
Application sends message
Middleware sends message
Adjust local clock
and timestamp message
Middleware layer
Application layer
Network layer
Message is delivered
to application
12 / 49
Coordination: Logical clocks Lamport’s logical clocks
Example: Total-ordered multicast
Concurrent updates on a replicated database are seen in
the same order everywhere
► P1 adds $100 to an account (initial value: $1000)
► P2 increments account by 1%
► There are two replicas
Update 1 Update 2
Update 1 is
performed before
update 2
Update 2 is
performed before
update 1
Replicated database
Result
In absence of proper synchronization:
replica #1 ← $1111, while replica #2 ← $1110.
Example: Total-ordered multicasting 13 / 49
Example: Total-ordered multicasting 14 / 49
Coordination: Logical clocks Lamport’s logical clocks
Example: Total-ordered multicast
Solution
► Process Pi sends timestamped message mi to all others. The
message itself is put in a local queue queuei.
► Any incoming message at Pj is queued in queuej, according to its
timestamp, and acknowledged to every other process.
Example: Total-ordered multicasting 14 / 49
Coordination: Logical clocks Lamport’s logical clocks
Example: Total-ordered multicast
Solution
► Process Pi sends timestamped message mi to all others. The
message itself is put in a local queue queuei.
► Any incoming message at Pj is queued in queuej, according to its
timestamp, and acknowledged to every other process.
Pj passes a message mi to its application if:
(1) mi is at the head of queuej
(2) for each process Pk , there is a message mk in queuej with a
larger timestamp.
Example: Total-ordered multicasting 14 / 49
Coordination: Logical clocks Lamport’s logical clocks
Example: Total-ordered multicast
Solution
► Process Pi sends timestamped message mi to all others. The
message itself is put in a local queue queuei.
► Any incoming message at Pj is queued in queuej, according to its
timestamp, and acknowledged to every other process.
Pj passes a message mi to its application if:
(1) mi is at the head of queuej
(2) for each process Pk , there is a message mk in queuej with a
larger timestamp.
Note
We are assuming that communication is reliable and FIFO ordered.
Example: Total-ordered multicasting 15 / 49
Coordination: Logical clocks Lamport’s logical clocks
Lamport’s clocks for mutual exclusion
1 class Process:
# The request queue
# The current logical clock
self.queue.append((self.clock, self.procID, ENTER)) # Append request t o q
self.cleanupQ() # Sort the queue
self.chan.sendTo(self.otherProcs, (self.clock,self.procID,ENTER)) # Send requ
self.queue = tmp
self.clock = self.clock + 1
# and copy t o new queue
# Increment clock value
self.chan.sendTo(self.otherProcs, (self.clock,self.procID,RELEASE)) # Release
2 def i n i t ( s e l f , chan):
3 self.queue = [ ]
4 self.clock = 0
5
6 def requestToEnter(self):
7 self.clock = self.clock + 1 # Increment clock value
8
9
10
11
12 def allowToEnter(self, requester):
13 self.clock = self.clock + 1 # Increment clock value
14 self.chan.sendTo([requester], (self.clock,self.procID,ALLOW)) # Permit other
15
16 def release(self):
17 tmp = [ r for r i n self.queue[1:] i f r[ 2 ] == ENTER] # Remove a l l ALLOWs
18
19
20
21
22 def allowedToEnter(self):
23 commProcs = set([req[1] for req i n self.queue[1:]]) # See who has sent a mess
24 return (self.queue[0][1]==self.procID and len(self.otherProcs)==len(commProcs
Example: Total-ordered multicasting 16 / 49
Coordination: Logical clocks Lamport’s logical clocks
Lamport’s clocks for mutual exclusion
1 def receive(self):
2 msg = self.chan.recvFrom(self.otherProcs)[1]
3 self.clock = max(self.clock, msg[0])
4 self.clock = self.clock + 1
5 i f msg[2] == ENTER:
6 self.queue.append(msg)
7 self.allowToEnter(msg[1])
8 e l i f msg[2] == ALLOW:
9 self.queue.append(msg)
10 e l i f msg[2] == RELEASE:
11 del(self.queue[0])
12 self.cleanupQ()
# Pick up any message
# Adjust clock v a l u e .. .
# . . . and increment
# Append an ENTER request
# and unconditionally a l l
# Append an ALLOW
# Just remove f i r s t messa
# And so rt and cleanup
Example: Total-ordered multicasting 17 / 49
Coordination: Logical clocks Lamport’s logical clocks
Lamport’s clocks for mutual exclusion
Analogy with total-ordered multicast
► With total-ordered multicast, all processes build identical queues,
delivering messages in the same order
► Mutual exclusion is about agreeing in which order processes are
allowed to enter a critical section
Coordination: Logical clocks Vector clocks
Vector clocks
Observation
Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally
preceded b.
Concurrent message
transmission using logical
clocks
m1
m2
m3
0
10
20
30
40
50
60
70
80
90
100
0
6
12
18
24
30
36
42
48
70
76
0
8
16
24
32
40
48
61
69
77
85
P1
P2
P3
Observation
Event a: m1 is received at T = 16;
Event b: m2 is sent at T = 20.
m4
m5
18 / 49
Coordination: Logical clocks Vector clocks
Vector clocks
Observation
Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally
preceded b.
Concurrent message
transmission using logical
clocks
m1
m3
m2
0
10
20
30
40
50
60
70
80
90
100
0
6
12
18
24
30
36
42
48
70
76
0
8
16
24
32
40
48
61
69
77
85
P1
P2
P3
Observation
Event a: m1 is received at T = 16;
Event b: m2 is sent at T = 20.
Note
We cannot conclude that a causally
precedes b.
m4
m5
18 / 49
19 / 49
Coordination: Logical clocks Vector clocks
Causal dependency
Definition
We say that b may causally depend on a if ts(a) < ts(b), with:
► for all k, ts(a)[k] ≤ts(b)[k] and
► there exists at least one index k for which ts(a)[k ] < ts(b)[k ]
Precedence vs. dependency
► We say that a causally precedes b.
► b may causally depend on a, as there may be information from a
that is propagated into b.
20 / 49
Coordination: Logical clocks Vector clocks
Capturing causality
Solution: each Pi maintains a vector VCi
► VCi [i] is the local logical clock at process Pi .
► If VCi [j] = k then Pi knows that k events have occurred at Pj .
Maintaining vector clocks
1. Before executing an event Pi executes VCi [i] ← VCi [i]+ 1.
2. When process Pi sends a message m to Pj , it sets m’s (vector)
timestamp ts(m) equal to VCi after having executed step 1.
3. Upon the receipt of a message m, process Pj sets
VCj [k ] ← max{VCj[k ], ts(m)[k ]} for each k , after which it
executes step 1 and then delivers the message to the application.
Coordination: Logical clocks Vector clocks
Vector clocks: Example
Capturing potential causality when exchanging messages
P1
P2
P3
(0,1,0)
(4,3,0)
(4,3,2)
(2,1,1)
m1
m4
P2
P3
(1,1,0) (2,1,0) (3,1,0) (4,1,0)
P (1,1,0) (2,1,0) (3,1,0) (4,1,0)
1
(2,2,0)
(2,3,1) (4,3,2)
m2
(2,3,0)
21 / 49
m2 m3 m1 m3
(4,2,0) (0,1,0)
m4
(a) (b)
Analysis
Situation ts(m2) ts(m4) ts(m2)
<
ts(m4)
ts(m2)
>
ts(m4)
Conclusion
(a) (2,1,0) (4,3,0) Yes No m2 may causally precede m4
(b) (4,1,0) (2,3,0) No No m2 and m4 may conflict
22 / 49
Coordination: Logical clocks Vector clocks
Causally ordered multicasting
Observation
We can now ensure that a message is delivered only if all causally
preceding messages have already been delivered.
Adjustment
Pi increments VCi [i] only when sending a message, and Pj “adjusts”
VCj when receiving a message (i.e., effectively does not change
VCj [j]).
22 / 49
Coordination: Logical clocks Vector clocks
Causally ordered multicasting
Observation
We can now ensure that a message is delivered only if all causally
preceding messages have already been delivered.
Adjustment
Pi increments VCi [i] only when sending a message, and Pj “adjusts”
VCj when receiving a message (i.e., effectively does not change
VCj [j]).
Pj postpones delivery of m until:
1. ts(m)[i] = VCj [i]+ 1
2. ts(m)[k] ≤VCj [k] for all k I= i
Coordination: Logical clocks Vector clocks
Causally ordered multicasting
Enforcing causal communication
P1
P2
P3
(0,0,0)
(1,1,0)
(1,0,0) (1,1,0)
(1,0,0)
(1,0,0) (1,1,0)
m
23 / 49
m*
Coordination: Logical clocks Vector clocks
Causally ordered multicasting
Enforcing causal communication
P1
P2
P3
(0,0,0)
(1,1,0)
(1,0,0) (1,1,0)
(1,0,0)
(1,0,0) (1,1,0)
m
23 / 49
m*
Example
Take VC3 = [0,2,2],ts(m) = [1,3,0] from P1. What information does
P3 have, and what will it do when receiving m (from P1)?
24 / 49
Coordination: Mutual exclusion Overview
Mutual exclusion
Problem
A number of processes in a distributed system want exclusive access
to some resource.
Basic solutions
Permission-based: A process wanting to enter its critical section, or
access a resource, needs permission from other
processes.
Token-based: A token is passed between processes. The one who
has the token may proceed in its critical section, or pass
it on when not interested.
Coordination: Mutual exclusion A centralized algorithm
Permission-based, centralized
Simply use a coordinator
Queue is
empty
P0 P1
Request OK
P2
C
No reply
P0 P1
Request
P2
C
2
Release
OK
P0
25 / 49
P P
1 2
C
Coordinator
(a) (b) (c)
(a) Process P1 asks the coordinator for permission to access a
shared resource. Permission is granted.
(b) Process P2 then asks permission to access the same resource.
The coordinator does not reply.
(c) When P1 releases the resource, it tells the coordinator, which
then replies to P2 .
26 / 49
Coordination: Mutual exclusion A distributed algorithm
Mutual exclusion Ricart & Agrawala
The same as Lamport except that acknowledgments are
not sent
Return a response to a request only when:
► The receiving process has no interest in the shared resource; or
► The receiving process is waiting for the resource, but has lower
priority (known through comparison of timestamps).
In all other cases, reply is deferred, implying some more local
administration.
Coordination: Mutual exclusion A distributed algorithm
Mutual exclusion Ricart & Agrawala
Example with three processes
0
1 2
8
8
8 12
12
12
(a) (b) (c)
(a) Two processes want to access a shared resource at the same
moment.
(b) P0 has the lowest timestamp, so it wins.
(c) When process P0 is done, it sends an OK also, so P2 can now go
ahead.
0
1 2
OK OK
OK
Accesses
resource
0
27 / 49
1 2
OK
Accesses
resource
Coordination: Mutual exclusion A token-ring algorithm
Mutual exclusion: Token ring algorithm
4
5
6
7
Essence
Organize processes in a logical ring, and let a token be passed
between them. The one that holds the token is allowed to enter the
critical region (if it wants to).
An overlay network constructed as a logical ring with a
circulating token
Token
0 1 2 3
28 / 49
29 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
Principle
Assume every resource is replicated N times, with each replica having
its own coordinator ⇒ access requires a majority vote from m > N/2
coordinators. A coordinator always responds immediately to a request.
Assumption
When a coordinator crashes, it will recover quickly, but will have
forgotten about permissions it had granted.
30 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
How robust is this system?
► Let p = ∆t/T be the probability that a coordinator resets during a
time interval ∆t , while having a lifetime of T.
30 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
How robust is this system?
► Let p = ∆t/T be the probability that a coordinator resets during a
time interval ∆t , while having a lifetime of T.
► The probability P[k] that k out of m coordinators reset during the
same interval is
P[k] =
m
k
pk (1−p)m−k
30 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
How robust is this system?
► Let p = ∆t/T be the probability that a coordinator resets during a
time interval ∆t , while having a lifetime of T.
► The probability P[k] that k out of m coordinators reset during the
same interval is
P[k] =
m
k
pk (1−p)m−k
► f coordinators reset ⇒ correctness is violated when there is only
a minority of nonfaulty coordinators: when m −f ≤N/2, or,
f ≥m −N/ 2.
30 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
How robust is this system?
► Let p = ∆t/T be the probability that a coordinator resets during a
time interval ∆t , while having a lifetime of T.
► The probability P[k] that k out of m coordinators reset during the
same interval is
P[k] =
m
k
pk (1−p)m−k
► f coordinators reset ⇒ correctness is violated when there is only
a minority of nonfaulty coordinators: when m −f ≤N/2, or,
f ≥m −N/ 2.
k= m−N/ 2
►The probability of a violation is ∑N P[k].
31 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
Violation probabilities for various parameter values
N m p Violation N m p Violation
8 5 3 sec/hour < 10−15 8 5 30 sec/hour < 10−10
8 6 3 sec/hour < 10−18 8 6 30 sec/hour < 10−11
16 9 3 sec/hour < 10−27 16 9 30 sec/hour < 10−18
16 12 3 sec/hour < 10−36 16 12 30 sec/hour < 10−24
32 17 3 sec/hour < 10−52 32 17 30 sec/hour < 10−35
32 24 3 sec/hour < 10−73 32 24 30 sec/hour < 10−49
31 / 49
Coordination: Mutual exclusion A decentralized algorithm
Decentralized mutual exclusion
Violation probabilities for various parameter values
N m p Violation N m p Violation
8 5 3 sec/hour < 10−15 8 5 30 sec/hour < 10−10
8 6 3 sec/hour < 10−18 8 6 30 sec/hour < 10−11
16 9 3 sec/hour < 10−27 16 9 30 sec/hour < 10−18
16 12 3 sec/hour < 10−36 16 12 30 sec/hour < 10−24
32 17 3 sec/hour < 10−52 32 17 30 sec/hour < 10−35
32 24 3 sec/hour < 10−73 32 24 30 sec/hour < 10−49
So....
What can we conclude?
32 / 49
Coordination: Mutual exclusion A decentralized algorithm
Mutual exclusion: comparison
Algorithm
Messages per
entry/exit
Delay before entry
(in message times)
Centralized 3 2
Distributed 2·(N −1) 2·(N −1)
Token ring 1,...,∞ 0,...,N −1
Decentralized 2·m ·k +m,k = 1,2,... 2·m ·k
33 / 49
Coordination: Election algorithms
Election algorithms
Principle
An algorithm requires that some process acts as a coordinator. The
question is how to select this special process dynamically.
Note
In many systems the coordinator is chosen by hand (e.g. file servers).
This leads to centralized solutions ⇒ single point of failure.
33 / 49
Coordination: Election algorithms
Election algorithms
Principle
An algorithm requires that some process acts as a coordinator. The
question is how to select this special process dynamically.
Note
In many systems the coordinator is chosen by hand (e.g. file servers).
This leads to centralized solutions ⇒ single point of failure.
Teasers
1. If a coordinator is chosen dynamically, to what extent can we
speak about a centralized or distributed solution?
2. Is a fully distributed solution, i.e. one without a coordinator, always
more robust than any centralized/coordinated solution?
34 / 49
Coordination: Election algorithms
Basic assumptions
► All processes have unique id’s
► All processes know id’s of all processes in the system (but not if
they are up or down)
► Election means identifying the process with the highest id that is
up
35 / 49
Coordination: Election algorithms The bully algorithm
Election by bullying
Principle
Consider N processes {P0, . . . , PN−1} and let id (Pk ) = k . When a
process Pk notices that the coordinator is no longer responding to
requests, it initiates an election:
1. Pk sends an ELECTION message to all processes with higher
identifiers: Pk+1, Pk+2 ,...,PN−1.
2. If no one responds, Pk wins the election and becomes
coordinator.
3. If one of the higher-ups answers, it takes over and Pk ’s job is
done.
Coordination: Election algorithms The bully algorithm
Election
4
0
6
3
7
OK
1
Election by bullying
The bully election algorithm
1
2 5 2
4
0
5
6
3
7
Election
1
2
4
0
5
6
3
7
OK
1
2
4
0
5
6
3
7
Coordinator
1
2
4
0
5
6
3
7
36 / 49
37 / 49
Coordination: Election algorithms A ring algorithm
Election in a ring
Principle
Process priority is obtained by organizing processes into a (logical)
ring. Process with the highest priority should be elected as coordinator.
► Any process can start an election by sending an election
message to its successor. If a successor is down, the message is
passed on to the next successor.
► If a message is passed on, the sender adds itself to the list.
When it gets back to the initiator, everyone had a chance to make
its presence known.
► The initiator sends a coordinator message around the ring
containing a list of all living processes. The one with the highest
priority is elected as coordinator.
Coordination: Election algorithms A ring algorithm
Election in a ring
3 4
5
6
7
0
[3]
38 / 49
[3,4]
[3,4,5]
[3,4,5,6]
[3,4,5,6,0]
[3,4,5,6,0,1] [3,4,5,6,0,1,2]
[6]
Election algorithm using a ring
[6,0,1] [6,0,1,2]
1 2
[6,0]
[6,0,1,2,3]
[6,0,1,2,3,4]
[6,0,1,2,3,4,5]
► The solid line shows the election messages initiated by P6
► The dashed one the messages by P3
Coordination: Election algorithms Elections in wireless environments
A solution for wireless networks
A sample network
4
5
2
4 a
6 b
3
c
d
e
1
f
g 2
h
8
i
j
4
Capacity
4
5
2
3
c
d
e
1
f
g 2
h
8
i
j
4
Broadcasting
node
6 b
4 a
39 / 49
Coordination: Election algorithms Elections in wireless environments
A solution for wireless networks
A sample network
4
5
2
4 a
6 b
3
c
d
e
1
f
g 2
h 8
i
j
4
g receives
broadcast
from b first
4
40 / 49
5
2
4 a
6 b
3
c
d
e
1
f
g 2
j
4
e receives
broadcast
from g first
i
h
8
Coordination: Election algorithms Elections in wireless environments
A solution for wireless networks
A sample network
5
2
4 a
6 b
3
c
d
e
1
g 2
i
j
4
f
f receives 4
broadcast
from e first
h
8
4
41 / 49
2
2
4 a
3
c
d
f
g
h
8
i
j
4
[f,4]
[c,3]
[d,2]
[i,5] 5
[h,8]
[h,8]
[h,8] 6 b
[j,4]
e
[f,4] 1
42 / 49
Coordination: Location systems
Positioning nodes
Issue
In large-scale distributed systems in which nodes are dispersed across
a wide-area network, we often need to take some notion of proximity or
distance into account ⇒ it starts with determining a (relative) location
of a node.
Coordination: Location systems GPS: Global Positioning System
Computing position
Observation
A node P needs d + 1 landmarks to compute its own position in a
d-dimensional space. Consider two-dimensional case.
Computing a position in 2D
P
d2
d1
d3
(x3,y3)
(x2,y2)
(x1,y1)
Solution
P needs to solve three
equations in two unknowns
(xP ,yP):
di = (xi −xP )2 + (yi −yP )2
43 / 49
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
► xr , yr , zr : unknown coordinates of the receiver.
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
► xr , yr , zr : unknown coordinates of the receiver.
► Ti : timestamp on a message from satellite i
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
► xr , yr , zr : unknown coordinates of the receiver.
► Ti : timestamp on a message from satellite i
► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by
satellite i.
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
► xr , yr , zr : unknown coordinates of the receiver.
► Ti : timestamp on a message from satellite i
► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by
satellite i.
► Measured distance to satellite i: c ×∆i (c is speed of light)
Observation
4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them)
44 / 49
Coordination: Location systems GPS: Global Positioning System
Global positioning system
Assuming that the clocks of the satellites are accurate and
synchronized
► It takes a while before a signal reaches the receiver
► The receiver’s clock is definitely out of sync with the satellite
Basics
► ∆ r : unknown deviation of the receiver’s clock.
► xr , yr , zr : unknown coordinates of the receiver.
► Ti : timestamp on a message from satellite i
► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by
satellite i.
► Measured distance to satellite i: c ×∆i (c is speed of light)
► Real distance:
di = c∆ i −c∆ r =
✓
(xi −xr )2 + (yi −yr )2 + (zi −zr )2
Coordination: Location systems When GPS is not an option
WiFi-based location services
Basic idea
► Assume we have a database of known access points (APs) with
coordinates
► Assume we can estimate distance to an AP
► Then: with 3 detected access points, we can compute a position.
War driving: locating access points
► Use a WiFi-enabled device along with a GPS receiver, and move
through an area while recording observed access points.
► Compute the centroid: assume an access point AP has been
detected at N different locations {x
_1,x
_2,...,x
_
N }, with known GPS
location.
AP
► Compute location of AP as_
x =
∑N
i= 1
_
xi
N
► Number of sampled detection points N may be too low.
45 / 49
.
Problems
► Limited accuracy of each GPS detection point _xi
► An access point has a nonuniform transmission range
Coordination: Location systems Logical positioning of nodes
Computing position
Problems
► Measured latencies to
landmarks fluctuate
► Computed distances will
not even be consistent
Inconsistent distances in 1D
space
1
P
2 3
Q
4
R
2.8
1.0 2.0
Solution: minimize errors
► Use N special landmark nodes L1,...,LN .
► Landmarks measure their pairwise latencies d˜(Li,Lj )
► A central node computes the coordinates for each landmark,
minimizing:
∑ ∑
i= 1 j= i+ 1
N N ˜ i j
ˆ i j
d(L ,L ) −d(L ,L )
d̃(Li ,Lj )
2
where dˆ(Li,Lj ) is distance after nodes Li and Lj have been
Centralized positpionoinsgitioned. 46 / 49
Coordination: Location systems Logical positioning of nodes
Computing position
Choosing the dimension m
The hidden parameter is the dimension m with N > m. A node P
measures its distance to each of the N landmarks and computes its
coordinates by minimizing
N
∑
i= 1
i
˜ ˆ i
d(L ,P) −d(L ,P)
d̃(Li ,P)
2
Observation
Practice shows that m can be as small as 6 or 7 to achieve latency
estimations within a factor 2 of the actual value.
Centralized positioning 47 / 49
Coordination: Location systems Logical positioning of nodes
Vivaldi
Principle: network of springs exerting forces
Consider a collection of N nodes P1,..., PN , each Pi having
coordinates _xi. Two nodes exert a mutual force:
_
Fij = d̃(Pi ,Pj ) −d̂(Pi ,Pj ) ×u(_
xi −_
xj )
with u(_xi−_xj) is the unit vector in the direction of _xi−_xj
Node Pi repeatedly executes steps
1. Measure the latency d˜ijto node Pj , and also receive Pj ’s
coordinates _xj.
2. Compute the error e = d˜(Pi ,Pj ) −dˆ(Pi,Pj )
3. Compute the direction_
u = u(_
xi −_
xj ).
4. Compute the force vector Fij = e·_
u
5. Adjust own position by moving along the force vector:
_
xi ←_
xi + δ ·_
u.
Decentralized positioning 48 / 49
49 / 49
Coordination: Gossip-based coordination A peer-sampling service
Example applications
Typical apps
► Data dissemination: Perhaps the most important one. Note that
there are many variants of dissemination.
► Aggregation: Let every node Pi maintain a variable vi . When two
nodes gossip, they each reset their variable to
vi ,vj ← (vi + vj )/ 2
Result: in the end each node will have computed the average
v̄ = ∑i vi / N.
49 / 49
Coordination: Gossip-based coordination A peer-sampling service
Example applications
Typical apps
► Data dissemination: Perhaps the most important one. Note that
there are many variants of dissemination.
► Aggregation: Let every node Pi maintain a variable vi . When two
nodes gossip, they each reset their variable to
vi ,vj ← (vi + vj )/ 2
Result: in the end each node will have computed the average
v̄ = ∑i vi / N.
► What happens in the case that initially vi = 1 and vj = 0,j I= i?

More Related Content

Similar to slides.06.pptx

Clock Synchronization (Distributed computing)
Clock Synchronization (Distributed computing)Clock Synchronization (Distributed computing)
Clock Synchronization (Distributed computing)Sri Prasanna
 
Distributed computing time
Distributed computing timeDistributed computing time
Distributed computing timeDeepak John
 
Chapter 5-Synchronozation.ppt
Chapter 5-Synchronozation.pptChapter 5-Synchronozation.ppt
Chapter 5-Synchronozation.pptsirajmohammed35
 
Distributed system TimeNState-Tanenbaum.ppt
Distributed system TimeNState-Tanenbaum.pptDistributed system TimeNState-Tanenbaum.ppt
Distributed system TimeNState-Tanenbaum.pptTantraNathjha1
 
TCP congestion control
TCP congestion controlTCP congestion control
TCP congestion controlShubham Jain
 
Distributed System by Pratik Tambekar
Distributed System by Pratik TambekarDistributed System by Pratik Tambekar
Distributed System by Pratik TambekarPratik Tambekar
 
RTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffffRTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffffadugnanegero
 
Transaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesTransaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesGera Shegalov
 
Designing TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion ControlDesigning TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion Controlsoohyunc
 
Scheduling Fixed Priority Tasks with Preemption Threshold
Scheduling Fixed Priority Tasks with Preemption ThresholdScheduling Fixed Priority Tasks with Preemption Threshold
Scheduling Fixed Priority Tasks with Preemption ThresholdDeepak Raj
 
MathCAD - Synchronicity Algorithm.pdf
MathCAD - Synchronicity Algorithm.pdfMathCAD - Synchronicity Algorithm.pdf
MathCAD - Synchronicity Algorithm.pdfJulio Banks
 

Similar to slides.06.pptx (20)

Clock.pdf
Clock.pdfClock.pdf
Clock.pdf
 
Clock Synchronization (Distributed computing)
Clock Synchronization (Distributed computing)Clock Synchronization (Distributed computing)
Clock Synchronization (Distributed computing)
 
Distributed computing time
Distributed computing timeDistributed computing time
Distributed computing time
 
Chapter 5-Synchronozation.ppt
Chapter 5-Synchronozation.pptChapter 5-Synchronozation.ppt
Chapter 5-Synchronozation.ppt
 
Clocks
ClocksClocks
Clocks
 
14 queuing
14 queuing14 queuing
14 queuing
 
Distributed system TimeNState-Tanenbaum.ppt
Distributed system TimeNState-Tanenbaum.pptDistributed system TimeNState-Tanenbaum.ppt
Distributed system TimeNState-Tanenbaum.ppt
 
Chapter 6 synchronization
Chapter 6 synchronizationChapter 6 synchronization
Chapter 6 synchronization
 
Ds practical file
Ds practical fileDs practical file
Ds practical file
 
3. syncro. in distributed system
3. syncro. in distributed system3. syncro. in distributed system
3. syncro. in distributed system
 
Fairness in Transfer Control Protocol for Congestion Control in Multiplicativ...
Fairness in Transfer Control Protocol for Congestion Control in Multiplicativ...Fairness in Transfer Control Protocol for Congestion Control in Multiplicativ...
Fairness in Transfer Control Protocol for Congestion Control in Multiplicativ...
 
Time in distributed systmes
Time in distributed systmesTime in distributed systmes
Time in distributed systmes
 
TCP congestion control
TCP congestion controlTCP congestion control
TCP congestion control
 
Distributed System by Pratik Tambekar
Distributed System by Pratik TambekarDistributed System by Pratik Tambekar
Distributed System by Pratik Tambekar
 
RTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffffRTOS Material hfffffffffffffffffffffffffffffffffffff
RTOS Material hfffffffffffffffffffffffffffffffffffff
 
Transaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal DatabasesTransaction Timestamping in Temporal Databases
Transaction Timestamping in Temporal Databases
 
Designing TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion ControlDesigning TCP-Friendly Window-based Congestion Control
Designing TCP-Friendly Window-based Congestion Control
 
Tcp congestion avoidance
Tcp congestion avoidanceTcp congestion avoidance
Tcp congestion avoidance
 
Scheduling Fixed Priority Tasks with Preemption Threshold
Scheduling Fixed Priority Tasks with Preemption ThresholdScheduling Fixed Priority Tasks with Preemption Threshold
Scheduling Fixed Priority Tasks with Preemption Threshold
 
MathCAD - Synchronicity Algorithm.pdf
MathCAD - Synchronicity Algorithm.pdfMathCAD - Synchronicity Algorithm.pdf
MathCAD - Synchronicity Algorithm.pdf
 

More from balewayalew

Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.pptbalewayalew
 
Data Analytics.ppt
Data Analytics.pptData Analytics.ppt
Data Analytics.pptbalewayalew
 
PE1 Module 4.ppt
PE1 Module 4.pptPE1 Module 4.ppt
PE1 Module 4.pptbalewayalew
 
PE1 Module 3.ppt
PE1 Module 3.pptPE1 Module 3.ppt
PE1 Module 3.pptbalewayalew
 
PE1 Module 2.ppt
PE1 Module 2.pptPE1 Module 2.ppt
PE1 Module 2.pptbalewayalew
 
Chapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptxChapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptxbalewayalew
 
Chapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptxChapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptxbalewayalew
 
PE1 Module 1.ppt
PE1 Module 1.pptPE1 Module 1.ppt
PE1 Module 1.pptbalewayalew
 
Ch 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.pptCh 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.pptbalewayalew
 
Ch 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptxCh 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptxbalewayalew
 

More from balewayalew (20)

slides.07.pptx
slides.07.pptxslides.07.pptx
slides.07.pptx
 
slides.08.pptx
slides.08.pptxslides.08.pptx
slides.08.pptx
 
Chapter 1-Introduction.ppt
Chapter 1-Introduction.pptChapter 1-Introduction.ppt
Chapter 1-Introduction.ppt
 
Data Analytics.ppt
Data Analytics.pptData Analytics.ppt
Data Analytics.ppt
 
PE1 Module 4.ppt
PE1 Module 4.pptPE1 Module 4.ppt
PE1 Module 4.ppt
 
PE1 Module 3.ppt
PE1 Module 3.pptPE1 Module 3.ppt
PE1 Module 3.ppt
 
PE1 Module 2.ppt
PE1 Module 2.pptPE1 Module 2.ppt
PE1 Module 2.ppt
 
Chapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptxChapter -6- Ethics and Professionalism of ET (2).pptx
Chapter -6- Ethics and Professionalism of ET (2).pptx
 
Chapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptxChapter -5- Augumented Reality (AR).pptx
Chapter -5- Augumented Reality (AR).pptx
 
Chapter 8.ppt
Chapter 8.pptChapter 8.ppt
Chapter 8.ppt
 
PE1 Module 1.ppt
PE1 Module 1.pptPE1 Module 1.ppt
PE1 Module 1.ppt
 
chapter7.ppt
chapter7.pptchapter7.ppt
chapter7.ppt
 
chapter6.ppt
chapter6.pptchapter6.ppt
chapter6.ppt
 
chapter5.ppt
chapter5.pptchapter5.ppt
chapter5.ppt
 
chapter4.ppt
chapter4.pptchapter4.ppt
chapter4.ppt
 
chapter3.ppt
chapter3.pptchapter3.ppt
chapter3.ppt
 
chapter2.ppt
chapter2.pptchapter2.ppt
chapter2.ppt
 
chapter1.ppt
chapter1.pptchapter1.ppt
chapter1.ppt
 
Ch 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.pptCh 1-Non-functional Requirements.ppt
Ch 1-Non-functional Requirements.ppt
 
Ch 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptxCh 6 - Requirement Management.pptx
Ch 6 - Requirement Management.pptx
 

Recently uploaded

Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsPrecisely
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsRizwan Syed
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions
 

Recently uploaded (20)

Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Unlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power SystemsUnlocking the Potential of the Cloud for IBM Power Systems
Unlocking the Potential of the Cloud for IBM Power Systems
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Scanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL CertsScanning the Internet for External Cloud Exposures via SSL Certs
Scanning the Internet for External Cloud Exposures via SSL Certs
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Pigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping ElbowsPigging Solutions Piggable Sweeping Elbows
Pigging Solutions Piggable Sweeping Elbows
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 

slides.06.pptx

  • 1. Distributed Systems (3rd Edition) Chapter 06: Coordination Version: March 20, 2022
  • 2. 2 / 49 Coordination: Clock synchronization Physical clocks Physical clocks Problem Sometimes we simply need the exact time, not just an ordering. Solution: Universal Coordinated Time (UTC) ► Based on the number of transitions per second of the cesium 133 atom (pretty accurate). ► At present, the real time is taken as the average of some 50 cesium clocks around the world. ► Introduces a leap second from time to time to compensate that days are getting longer. Note UTC is broadcast through short-wave radio and satellite. Satellites can give an accuracy of about ±0.5 ms.
  • 3. 3 / 49 Coordination: Clock synchronization Clock synchronization algorithms Clock synchronization Precision The goal is to keep the deviation between two clocks on any two machines within a specified bound, known as the precision π: ∀ t,∀ p,q : |Cp(t) −Cq (t)|≤π with Cp(t) the computed clock time of machine p at UTC time t. Accuracy In the case of accuracy, we aim to keep the clock bound to a value α: ∀ t,∀ p : |Cp (t) −t|≤α Synchronization ► Internal synchronization: keep clocks precise ► External synchronization: keep clocks accurate
  • 4. Coordination: Clock synchronization Clock synchronization algorithms Clock drift Clock specifications ► A clock comes specified with its maximum clock drift rate ρ. ► F(t) denotes oscillator frequency of the hardware clock at time t ► F is the clock’s ideal (constant) frequency ⇒ living up to specifications: ∀ t : (1−ρ) ≤ F(t) F ≤(1+ ρ) Observation By using hardware interrupts we couple a software clock to the hardware clock, and thus also its clock drift rate: 1 Cp(t) = F t 0 F(t)dt ⇒ = dCp(t) F(t) dt F ⇒∀ t : 1−ρ ≤ dCp(t) dt ≤1+ ρ Fast, perfect, slow clocks Clock time, C dt dCp(t) = 1 dt dCp(t) > 1 dt dCp(t) < 1 4 / 49
  • 5. Coordination: Clock synchronization Clock synchronization algorithms Detecting and adjusting incorrect times Getting the current time from a time server B T1 T2 T3 T4 A   Treq   Tres Network Time Protocol 5 / 49 Computing the relative offset θ and delay δ Assumption: δTreq = T2 −T1 ≈T4 −T3 = δTres θ = T3 + (T2 −T1) + (T4 −T3) / 2−T4 = (T2 −T1) + (T3 −T4) / 2 δ = (T4 −T1) −(T3 −T2) / 2
  • 6. Coordination: Clock synchronization Clock synchronization algorithms Detecting and adjusting incorrect times Getting the current time from a time server B T1 T2 T3 T4 A   Treq   Tres Network Time Protocol 5 / 49 Computing the relative offset θ and delay δ Assumption: δTreq = T2 −T1 ≈T4 −T3 = δTres θ = T3 + (T2 −T1) + (T4 −T3) / 2−T4 = (T2 −T1) + (T3 −T4) / 2 δ = (T4 −T1) −(T3 −T2) / 2 Network Time Protocol Collect eight (θ,δ) pairs and choose θ for which associated delay δ was minimal.
  • 7. Coordination: Clock synchronization Clock synchronization algorithms Keeping time without UTC Principle Let the time server scan all machines periodically, calculate an average, and inform each machine how it should adjust its time relative to its present time. Using a time server Time daemon 3:00 3:00 3:25 2:50 Network 3:00 3:00 +25 3:25 2:50 3:05 3:00 0 +5 -10 +15 -20 3:05 3:05 The Berkeley algorithm 6 / 49
  • 8. Coordination: Clock synchronization Clock synchronization algorithms Keeping time without UTC Principle Let the time server scan all machines periodically, calculate an average, and inform each machine how it should adjust its time relative to its present time. Using a time server Time daemon 3:00 3:00 Network 3:00 3:00 +25 3:05 3:00 0 +5 -10 +15 -20 2:50 3:25 2:50 3:25 3:05 3:05 Fundamental You’ll have to take into account that setting the time back is never allowed ⇒ smooth adjustments (i.e., run faster or slower). The Berkeley algorithm 6 / 49
  • 9. Coordination: Clock synchronization Clock synchronization algorithms Reference broadcast synchronization Essence ► A node broadcasts a reference message m ⇒ each receiving node p records the time Tp,m that it received m. ► Note: Tp,m is read from p’s local clock. Problem: averaging will not capture drift ⇒ use linear regression ∑M k= 1 p,k q,k (T −T ) M NO: Offset[p,q](t) = YES: Offset[p,q](t) = αt + β RBS minimizes critical path Message preparation Time spent in NIC A B C Delivery time to app. Critical path RBS Usual critical path Clock synchronization in wireless networks 7 / 49
  • 10. 8 / 49 Coordination: Logical clocks Lamport’s logical clocks The Happened-before relationship Issue What usually matters is not that all processes agree on exactly what time it is, but that they agree on the order in which events occur. Requires a notion of ordering.
  • 11. 8 / 49 Coordination: Logical clocks Lamport’s logical clocks The Happened-before relationship Issue What usually matters is not that all processes agree on exactly what time it is, but that they agree on the order in which events occur. Requires a notion of ordering. The happened-before relation ► If a and b are two events in the same process, and a comes before b, then a → b. ► If a is the sending of a message, and b is the receipt of that ► message, then a → b If a → b and b → c, then a → c Note This introduces a partial ordering of events in a system with concurrently operating processes.
  • 12. 9 / 49 Coordination: Logical clocks Lamport’s logical clocks Logical clocks Problem How do we maintain a global view on the system’s behavior that is consistent with the happened-before relation?
  • 13. 9 / 49 Coordination: Logical clocks Lamport’s logical clocks Logical clocks Problem How do we maintain a global view on the system’s behavior that is consistent with the happened-before relation? Attach a timestamp C(e) to each event e, satisfying the following properties: P1 If a and b are two events in the same process, and a → b, then we demand that C(a) < C(b). P2 If a corresponds to sending a message m, and b to the receipt of that message, then also C(a) < C(b).
  • 14. 9 / 49 Coordination: Logical clocks Lamport’s logical clocks Logical clocks Problem How do we maintain a global view on the system’s behavior that is consistent with the happened-before relation? Attach a timestamp C(e) to each event e, satisfying the following properties: P1 If a and b are two events in the same process, and a → b, then we demand that C(a) < C(b). P2 If a corresponds to sending a message m, and b to the receipt of that message, then also C(a) < C(b). Problem How to attach a timestamp to an event when there’s no global clock ⇒ maintain a consistent set of logical clocks, one per process.
  • 15. 10 / 49 Coordination: Logical clocks Lamport’s logical clocks Logical clocks: solution Each process Pi maintains a local counter Ci and adjusts this counter 1. For each new event that takes place within Pi , Ci is incremented by 1. 2. Each time a message m is sent by process Pi , the message receives a timestamp ts(m) = Ci . 3. Whenever a message m is received by a process Pj , Pj adjusts its local counter Cj to max{Cj,ts(m)}; then executes step 1 before passing m to the application. Notes ► Property P1 is satisfied by (1); Property P2 by (2) and (3). ► It can still occur that two events happen at the same time. Avoid this by breaking ties through process IDs.
  • 16. Coordination: Logical clocks Lamport’s logical clocks Logical clocks: example Consider three processes with event counters operating at different rates 0 6 12 18 24 30 36 42 48 54 60 0 8 16 24 32 40 48 56 64 72 80 0 10 20 30 40 50 60 70 80 90 100 m1 m2 m3 m4 P1 P2 P P P P 3 1 2 3 m1 m2 m3 m4 0 10 20 30 40 50 60 70 80 90 100 P2 adjusts its clock P adjusts 1 its clock 0 6 12 18 24 30 36 42 48 70 76 11 / 49 0 8 16 24 32 40 48 61 69 77 85
  • 17. Coordination: Logical clocks Lamport’s logical clocks Logical clocks: where implemented Adjustments implemented in middleware Adjust local clock Message is received Application sends message Middleware sends message Adjust local clock and timestamp message Middleware layer Application layer Network layer Message is delivered to application 12 / 49
  • 18. Coordination: Logical clocks Lamport’s logical clocks Example: Total-ordered multicast Concurrent updates on a replicated database are seen in the same order everywhere ► P1 adds $100 to an account (initial value: $1000) ► P2 increments account by 1% ► There are two replicas Update 1 Update 2 Update 1 is performed before update 2 Update 2 is performed before update 1 Replicated database Result In absence of proper synchronization: replica #1 ← $1111, while replica #2 ← $1110. Example: Total-ordered multicasting 13 / 49
  • 19. Example: Total-ordered multicasting 14 / 49 Coordination: Logical clocks Lamport’s logical clocks Example: Total-ordered multicast Solution ► Process Pi sends timestamped message mi to all others. The message itself is put in a local queue queuei. ► Any incoming message at Pj is queued in queuej, according to its timestamp, and acknowledged to every other process.
  • 20. Example: Total-ordered multicasting 14 / 49 Coordination: Logical clocks Lamport’s logical clocks Example: Total-ordered multicast Solution ► Process Pi sends timestamped message mi to all others. The message itself is put in a local queue queuei. ► Any incoming message at Pj is queued in queuej, according to its timestamp, and acknowledged to every other process. Pj passes a message mi to its application if: (1) mi is at the head of queuej (2) for each process Pk , there is a message mk in queuej with a larger timestamp.
  • 21. Example: Total-ordered multicasting 14 / 49 Coordination: Logical clocks Lamport’s logical clocks Example: Total-ordered multicast Solution ► Process Pi sends timestamped message mi to all others. The message itself is put in a local queue queuei. ► Any incoming message at Pj is queued in queuej, according to its timestamp, and acknowledged to every other process. Pj passes a message mi to its application if: (1) mi is at the head of queuej (2) for each process Pk , there is a message mk in queuej with a larger timestamp. Note We are assuming that communication is reliable and FIFO ordered.
  • 22. Example: Total-ordered multicasting 15 / 49 Coordination: Logical clocks Lamport’s logical clocks Lamport’s clocks for mutual exclusion 1 class Process: # The request queue # The current logical clock self.queue.append((self.clock, self.procID, ENTER)) # Append request t o q self.cleanupQ() # Sort the queue self.chan.sendTo(self.otherProcs, (self.clock,self.procID,ENTER)) # Send requ self.queue = tmp self.clock = self.clock + 1 # and copy t o new queue # Increment clock value self.chan.sendTo(self.otherProcs, (self.clock,self.procID,RELEASE)) # Release 2 def i n i t ( s e l f , chan): 3 self.queue = [ ] 4 self.clock = 0 5 6 def requestToEnter(self): 7 self.clock = self.clock + 1 # Increment clock value 8 9 10 11 12 def allowToEnter(self, requester): 13 self.clock = self.clock + 1 # Increment clock value 14 self.chan.sendTo([requester], (self.clock,self.procID,ALLOW)) # Permit other 15 16 def release(self): 17 tmp = [ r for r i n self.queue[1:] i f r[ 2 ] == ENTER] # Remove a l l ALLOWs 18 19 20 21 22 def allowedToEnter(self): 23 commProcs = set([req[1] for req i n self.queue[1:]]) # See who has sent a mess 24 return (self.queue[0][1]==self.procID and len(self.otherProcs)==len(commProcs
  • 23. Example: Total-ordered multicasting 16 / 49 Coordination: Logical clocks Lamport’s logical clocks Lamport’s clocks for mutual exclusion 1 def receive(self): 2 msg = self.chan.recvFrom(self.otherProcs)[1] 3 self.clock = max(self.clock, msg[0]) 4 self.clock = self.clock + 1 5 i f msg[2] == ENTER: 6 self.queue.append(msg) 7 self.allowToEnter(msg[1]) 8 e l i f msg[2] == ALLOW: 9 self.queue.append(msg) 10 e l i f msg[2] == RELEASE: 11 del(self.queue[0]) 12 self.cleanupQ() # Pick up any message # Adjust clock v a l u e .. . # . . . and increment # Append an ENTER request # and unconditionally a l l # Append an ALLOW # Just remove f i r s t messa # And so rt and cleanup
  • 24. Example: Total-ordered multicasting 17 / 49 Coordination: Logical clocks Lamport’s logical clocks Lamport’s clocks for mutual exclusion Analogy with total-ordered multicast ► With total-ordered multicast, all processes build identical queues, delivering messages in the same order ► Mutual exclusion is about agreeing in which order processes are allowed to enter a critical section
  • 25. Coordination: Logical clocks Vector clocks Vector clocks Observation Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally preceded b. Concurrent message transmission using logical clocks m1 m2 m3 0 10 20 30 40 50 60 70 80 90 100 0 6 12 18 24 30 36 42 48 70 76 0 8 16 24 32 40 48 61 69 77 85 P1 P2 P3 Observation Event a: m1 is received at T = 16; Event b: m2 is sent at T = 20. m4 m5 18 / 49
  • 26. Coordination: Logical clocks Vector clocks Vector clocks Observation Lamport’s clocks do not guarantee that if C(a) < C(b) that a causally preceded b. Concurrent message transmission using logical clocks m1 m3 m2 0 10 20 30 40 50 60 70 80 90 100 0 6 12 18 24 30 36 42 48 70 76 0 8 16 24 32 40 48 61 69 77 85 P1 P2 P3 Observation Event a: m1 is received at T = 16; Event b: m2 is sent at T = 20. Note We cannot conclude that a causally precedes b. m4 m5 18 / 49
  • 27. 19 / 49 Coordination: Logical clocks Vector clocks Causal dependency Definition We say that b may causally depend on a if ts(a) < ts(b), with: ► for all k, ts(a)[k] ≤ts(b)[k] and ► there exists at least one index k for which ts(a)[k ] < ts(b)[k ] Precedence vs. dependency ► We say that a causally precedes b. ► b may causally depend on a, as there may be information from a that is propagated into b.
  • 28. 20 / 49 Coordination: Logical clocks Vector clocks Capturing causality Solution: each Pi maintains a vector VCi ► VCi [i] is the local logical clock at process Pi . ► If VCi [j] = k then Pi knows that k events have occurred at Pj . Maintaining vector clocks 1. Before executing an event Pi executes VCi [i] ← VCi [i]+ 1. 2. When process Pi sends a message m to Pj , it sets m’s (vector) timestamp ts(m) equal to VCi after having executed step 1. 3. Upon the receipt of a message m, process Pj sets VCj [k ] ← max{VCj[k ], ts(m)[k ]} for each k , after which it executes step 1 and then delivers the message to the application.
  • 29. Coordination: Logical clocks Vector clocks Vector clocks: Example Capturing potential causality when exchanging messages P1 P2 P3 (0,1,0) (4,3,0) (4,3,2) (2,1,1) m1 m4 P2 P3 (1,1,0) (2,1,0) (3,1,0) (4,1,0) P (1,1,0) (2,1,0) (3,1,0) (4,1,0) 1 (2,2,0) (2,3,1) (4,3,2) m2 (2,3,0) 21 / 49 m2 m3 m1 m3 (4,2,0) (0,1,0) m4 (a) (b) Analysis Situation ts(m2) ts(m4) ts(m2) < ts(m4) ts(m2) > ts(m4) Conclusion (a) (2,1,0) (4,3,0) Yes No m2 may causally precede m4 (b) (4,1,0) (2,3,0) No No m2 and m4 may conflict
  • 30. 22 / 49 Coordination: Logical clocks Vector clocks Causally ordered multicasting Observation We can now ensure that a message is delivered only if all causally preceding messages have already been delivered. Adjustment Pi increments VCi [i] only when sending a message, and Pj “adjusts” VCj when receiving a message (i.e., effectively does not change VCj [j]).
  • 31. 22 / 49 Coordination: Logical clocks Vector clocks Causally ordered multicasting Observation We can now ensure that a message is delivered only if all causally preceding messages have already been delivered. Adjustment Pi increments VCi [i] only when sending a message, and Pj “adjusts” VCj when receiving a message (i.e., effectively does not change VCj [j]). Pj postpones delivery of m until: 1. ts(m)[i] = VCj [i]+ 1 2. ts(m)[k] ≤VCj [k] for all k I= i
  • 32. Coordination: Logical clocks Vector clocks Causally ordered multicasting Enforcing causal communication P1 P2 P3 (0,0,0) (1,1,0) (1,0,0) (1,1,0) (1,0,0) (1,0,0) (1,1,0) m 23 / 49 m*
  • 33. Coordination: Logical clocks Vector clocks Causally ordered multicasting Enforcing causal communication P1 P2 P3 (0,0,0) (1,1,0) (1,0,0) (1,1,0) (1,0,0) (1,0,0) (1,1,0) m 23 / 49 m* Example Take VC3 = [0,2,2],ts(m) = [1,3,0] from P1. What information does P3 have, and what will it do when receiving m (from P1)?
  • 34. 24 / 49 Coordination: Mutual exclusion Overview Mutual exclusion Problem A number of processes in a distributed system want exclusive access to some resource. Basic solutions Permission-based: A process wanting to enter its critical section, or access a resource, needs permission from other processes. Token-based: A token is passed between processes. The one who has the token may proceed in its critical section, or pass it on when not interested.
  • 35. Coordination: Mutual exclusion A centralized algorithm Permission-based, centralized Simply use a coordinator Queue is empty P0 P1 Request OK P2 C No reply P0 P1 Request P2 C 2 Release OK P0 25 / 49 P P 1 2 C Coordinator (a) (b) (c) (a) Process P1 asks the coordinator for permission to access a shared resource. Permission is granted. (b) Process P2 then asks permission to access the same resource. The coordinator does not reply. (c) When P1 releases the resource, it tells the coordinator, which then replies to P2 .
  • 36. 26 / 49 Coordination: Mutual exclusion A distributed algorithm Mutual exclusion Ricart & Agrawala The same as Lamport except that acknowledgments are not sent Return a response to a request only when: ► The receiving process has no interest in the shared resource; or ► The receiving process is waiting for the resource, but has lower priority (known through comparison of timestamps). In all other cases, reply is deferred, implying some more local administration.
  • 37. Coordination: Mutual exclusion A distributed algorithm Mutual exclusion Ricart & Agrawala Example with three processes 0 1 2 8 8 8 12 12 12 (a) (b) (c) (a) Two processes want to access a shared resource at the same moment. (b) P0 has the lowest timestamp, so it wins. (c) When process P0 is done, it sends an OK also, so P2 can now go ahead. 0 1 2 OK OK OK Accesses resource 0 27 / 49 1 2 OK Accesses resource
  • 38. Coordination: Mutual exclusion A token-ring algorithm Mutual exclusion: Token ring algorithm 4 5 6 7 Essence Organize processes in a logical ring, and let a token be passed between them. The one that holds the token is allowed to enter the critical region (if it wants to). An overlay network constructed as a logical ring with a circulating token Token 0 1 2 3 28 / 49
  • 39. 29 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion Principle Assume every resource is replicated N times, with each replica having its own coordinator ⇒ access requires a majority vote from m > N/2 coordinators. A coordinator always responds immediately to a request. Assumption When a coordinator crashes, it will recover quickly, but will have forgotten about permissions it had granted.
  • 40. 30 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion How robust is this system? ► Let p = ∆t/T be the probability that a coordinator resets during a time interval ∆t , while having a lifetime of T.
  • 41. 30 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion How robust is this system? ► Let p = ∆t/T be the probability that a coordinator resets during a time interval ∆t , while having a lifetime of T. ► The probability P[k] that k out of m coordinators reset during the same interval is P[k] = m k pk (1−p)m−k
  • 42. 30 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion How robust is this system? ► Let p = ∆t/T be the probability that a coordinator resets during a time interval ∆t , while having a lifetime of T. ► The probability P[k] that k out of m coordinators reset during the same interval is P[k] = m k pk (1−p)m−k ► f coordinators reset ⇒ correctness is violated when there is only a minority of nonfaulty coordinators: when m −f ≤N/2, or, f ≥m −N/ 2.
  • 43. 30 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion How robust is this system? ► Let p = ∆t/T be the probability that a coordinator resets during a time interval ∆t , while having a lifetime of T. ► The probability P[k] that k out of m coordinators reset during the same interval is P[k] = m k pk (1−p)m−k ► f coordinators reset ⇒ correctness is violated when there is only a minority of nonfaulty coordinators: when m −f ≤N/2, or, f ≥m −N/ 2. k= m−N/ 2 ►The probability of a violation is ∑N P[k].
  • 44. 31 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion Violation probabilities for various parameter values N m p Violation N m p Violation 8 5 3 sec/hour < 10−15 8 5 30 sec/hour < 10−10 8 6 3 sec/hour < 10−18 8 6 30 sec/hour < 10−11 16 9 3 sec/hour < 10−27 16 9 30 sec/hour < 10−18 16 12 3 sec/hour < 10−36 16 12 30 sec/hour < 10−24 32 17 3 sec/hour < 10−52 32 17 30 sec/hour < 10−35 32 24 3 sec/hour < 10−73 32 24 30 sec/hour < 10−49
  • 45. 31 / 49 Coordination: Mutual exclusion A decentralized algorithm Decentralized mutual exclusion Violation probabilities for various parameter values N m p Violation N m p Violation 8 5 3 sec/hour < 10−15 8 5 30 sec/hour < 10−10 8 6 3 sec/hour < 10−18 8 6 30 sec/hour < 10−11 16 9 3 sec/hour < 10−27 16 9 30 sec/hour < 10−18 16 12 3 sec/hour < 10−36 16 12 30 sec/hour < 10−24 32 17 3 sec/hour < 10−52 32 17 30 sec/hour < 10−35 32 24 3 sec/hour < 10−73 32 24 30 sec/hour < 10−49 So.... What can we conclude?
  • 46. 32 / 49 Coordination: Mutual exclusion A decentralized algorithm Mutual exclusion: comparison Algorithm Messages per entry/exit Delay before entry (in message times) Centralized 3 2 Distributed 2·(N −1) 2·(N −1) Token ring 1,...,∞ 0,...,N −1 Decentralized 2·m ·k +m,k = 1,2,... 2·m ·k
  • 47. 33 / 49 Coordination: Election algorithms Election algorithms Principle An algorithm requires that some process acts as a coordinator. The question is how to select this special process dynamically. Note In many systems the coordinator is chosen by hand (e.g. file servers). This leads to centralized solutions ⇒ single point of failure.
  • 48. 33 / 49 Coordination: Election algorithms Election algorithms Principle An algorithm requires that some process acts as a coordinator. The question is how to select this special process dynamically. Note In many systems the coordinator is chosen by hand (e.g. file servers). This leads to centralized solutions ⇒ single point of failure. Teasers 1. If a coordinator is chosen dynamically, to what extent can we speak about a centralized or distributed solution? 2. Is a fully distributed solution, i.e. one without a coordinator, always more robust than any centralized/coordinated solution?
  • 49. 34 / 49 Coordination: Election algorithms Basic assumptions ► All processes have unique id’s ► All processes know id’s of all processes in the system (but not if they are up or down) ► Election means identifying the process with the highest id that is up
  • 50. 35 / 49 Coordination: Election algorithms The bully algorithm Election by bullying Principle Consider N processes {P0, . . . , PN−1} and let id (Pk ) = k . When a process Pk notices that the coordinator is no longer responding to requests, it initiates an election: 1. Pk sends an ELECTION message to all processes with higher identifiers: Pk+1, Pk+2 ,...,PN−1. 2. If no one responds, Pk wins the election and becomes coordinator. 3. If one of the higher-ups answers, it takes over and Pk ’s job is done.
  • 51. Coordination: Election algorithms The bully algorithm Election 4 0 6 3 7 OK 1 Election by bullying The bully election algorithm 1 2 5 2 4 0 5 6 3 7 Election 1 2 4 0 5 6 3 7 OK 1 2 4 0 5 6 3 7 Coordinator 1 2 4 0 5 6 3 7 36 / 49
  • 52. 37 / 49 Coordination: Election algorithms A ring algorithm Election in a ring Principle Process priority is obtained by organizing processes into a (logical) ring. Process with the highest priority should be elected as coordinator. ► Any process can start an election by sending an election message to its successor. If a successor is down, the message is passed on to the next successor. ► If a message is passed on, the sender adds itself to the list. When it gets back to the initiator, everyone had a chance to make its presence known. ► The initiator sends a coordinator message around the ring containing a list of all living processes. The one with the highest priority is elected as coordinator.
  • 53. Coordination: Election algorithms A ring algorithm Election in a ring 3 4 5 6 7 0 [3] 38 / 49 [3,4] [3,4,5] [3,4,5,6] [3,4,5,6,0] [3,4,5,6,0,1] [3,4,5,6,0,1,2] [6] Election algorithm using a ring [6,0,1] [6,0,1,2] 1 2 [6,0] [6,0,1,2,3] [6,0,1,2,3,4] [6,0,1,2,3,4,5] ► The solid line shows the election messages initiated by P6 ► The dashed one the messages by P3
  • 54. Coordination: Election algorithms Elections in wireless environments A solution for wireless networks A sample network 4 5 2 4 a 6 b 3 c d e 1 f g 2 h 8 i j 4 Capacity 4 5 2 3 c d e 1 f g 2 h 8 i j 4 Broadcasting node 6 b 4 a 39 / 49
  • 55. Coordination: Election algorithms Elections in wireless environments A solution for wireless networks A sample network 4 5 2 4 a 6 b 3 c d e 1 f g 2 h 8 i j 4 g receives broadcast from b first 4 40 / 49 5 2 4 a 6 b 3 c d e 1 f g 2 j 4 e receives broadcast from g first i h 8
  • 56. Coordination: Election algorithms Elections in wireless environments A solution for wireless networks A sample network 5 2 4 a 6 b 3 c d e 1 g 2 i j 4 f f receives 4 broadcast from e first h 8 4 41 / 49 2 2 4 a 3 c d f g h 8 i j 4 [f,4] [c,3] [d,2] [i,5] 5 [h,8] [h,8] [h,8] 6 b [j,4] e [f,4] 1
  • 57. 42 / 49 Coordination: Location systems Positioning nodes Issue In large-scale distributed systems in which nodes are dispersed across a wide-area network, we often need to take some notion of proximity or distance into account ⇒ it starts with determining a (relative) location of a node.
  • 58. Coordination: Location systems GPS: Global Positioning System Computing position Observation A node P needs d + 1 landmarks to compute its own position in a d-dimensional space. Consider two-dimensional case. Computing a position in 2D P d2 d1 d3 (x3,y3) (x2,y2) (x1,y1) Solution P needs to solve three equations in two unknowns (xP ,yP): di = (xi −xP )2 + (yi −yP )2 43 / 49
  • 59. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics
  • 60. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock.
  • 61. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock. ► xr , yr , zr : unknown coordinates of the receiver.
  • 62. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock. ► xr , yr , zr : unknown coordinates of the receiver. ► Ti : timestamp on a message from satellite i
  • 63. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock. ► xr , yr , zr : unknown coordinates of the receiver. ► Ti : timestamp on a message from satellite i ► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by satellite i.
  • 64. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock. ► xr , yr , zr : unknown coordinates of the receiver. ► Ti : timestamp on a message from satellite i ► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by satellite i. ► Measured distance to satellite i: c ×∆i (c is speed of light)
  • 65. Observation 4 satellites ⇒ 4 equations in 4 unknowns (with ∆r as one of them) 44 / 49 Coordination: Location systems GPS: Global Positioning System Global positioning system Assuming that the clocks of the satellites are accurate and synchronized ► It takes a while before a signal reaches the receiver ► The receiver’s clock is definitely out of sync with the satellite Basics ► ∆ r : unknown deviation of the receiver’s clock. ► xr , yr , zr : unknown coordinates of the receiver. ► Ti : timestamp on a message from satellite i ► ∆i = (Tnow −Ti ) + ∆ r : measured delay of the message sent by satellite i. ► Measured distance to satellite i: c ×∆i (c is speed of light) ► Real distance: di = c∆ i −c∆ r = ✓ (xi −xr )2 + (yi −yr )2 + (zi −zr )2
  • 66. Coordination: Location systems When GPS is not an option WiFi-based location services Basic idea ► Assume we have a database of known access points (APs) with coordinates ► Assume we can estimate distance to an AP ► Then: with 3 detected access points, we can compute a position. War driving: locating access points ► Use a WiFi-enabled device along with a GPS receiver, and move through an area while recording observed access points. ► Compute the centroid: assume an access point AP has been detected at N different locations {x _1,x _2,...,x _ N }, with known GPS location. AP ► Compute location of AP as_ x = ∑N i= 1 _ xi N ► Number of sampled detection points N may be too low. 45 / 49 . Problems ► Limited accuracy of each GPS detection point _xi ► An access point has a nonuniform transmission range
  • 67. Coordination: Location systems Logical positioning of nodes Computing position Problems ► Measured latencies to landmarks fluctuate ► Computed distances will not even be consistent Inconsistent distances in 1D space 1 P 2 3 Q 4 R 2.8 1.0 2.0 Solution: minimize errors ► Use N special landmark nodes L1,...,LN . ► Landmarks measure their pairwise latencies d˜(Li,Lj ) ► A central node computes the coordinates for each landmark, minimizing: ∑ ∑ i= 1 j= i+ 1 N N ˜ i j ˆ i j d(L ,L ) −d(L ,L ) d̃(Li ,Lj ) 2 where dˆ(Li,Lj ) is distance after nodes Li and Lj have been Centralized positpionoinsgitioned. 46 / 49
  • 68. Coordination: Location systems Logical positioning of nodes Computing position Choosing the dimension m The hidden parameter is the dimension m with N > m. A node P measures its distance to each of the N landmarks and computes its coordinates by minimizing N ∑ i= 1 i ˜ ˆ i d(L ,P) −d(L ,P) d̃(Li ,P) 2 Observation Practice shows that m can be as small as 6 or 7 to achieve latency estimations within a factor 2 of the actual value. Centralized positioning 47 / 49
  • 69. Coordination: Location systems Logical positioning of nodes Vivaldi Principle: network of springs exerting forces Consider a collection of N nodes P1,..., PN , each Pi having coordinates _xi. Two nodes exert a mutual force: _ Fij = d̃(Pi ,Pj ) −d̂(Pi ,Pj ) ×u(_ xi −_ xj ) with u(_xi−_xj) is the unit vector in the direction of _xi−_xj Node Pi repeatedly executes steps 1. Measure the latency d˜ijto node Pj , and also receive Pj ’s coordinates _xj. 2. Compute the error e = d˜(Pi ,Pj ) −dˆ(Pi,Pj ) 3. Compute the direction_ u = u(_ xi −_ xj ). 4. Compute the force vector Fij = e·_ u 5. Adjust own position by moving along the force vector: _ xi ←_ xi + δ ·_ u. Decentralized positioning 48 / 49
  • 70. 49 / 49 Coordination: Gossip-based coordination A peer-sampling service Example applications Typical apps ► Data dissemination: Perhaps the most important one. Note that there are many variants of dissemination. ► Aggregation: Let every node Pi maintain a variable vi . When two nodes gossip, they each reset their variable to vi ,vj ← (vi + vj )/ 2 Result: in the end each node will have computed the average v̄ = ∑i vi / N.
  • 71. 49 / 49 Coordination: Gossip-based coordination A peer-sampling service Example applications Typical apps ► Data dissemination: Perhaps the most important one. Note that there are many variants of dissemination. ► Aggregation: Let every node Pi maintain a variable vi . When two nodes gossip, they each reset their variable to vi ,vj ← (vi + vj )/ 2 Result: in the end each node will have computed the average v̄ = ∑i vi / N. ► What happens in the case that initially vi = 1 and vj = 0,j I= i?