Introduction: What is clock synchronization?
The challenges of clock synchronization.
Basic Concepts: Software and hardware clocks. Basic clock synchronization algorithm
Algorithms: Deep dive into landmark papers
NTP: Internet scale time synchronization
Clock Synchronization in Distributed Systems
1. Introduction Basic Concepts Algorithms NTP Summary 1 of 45 slides
Clock Synchronization in Distributed Systems
Scientific talk (Wissenschaftlicher Vortrag)
Zbigniew Jerzak
Technische Universität Dresden, Fakultät Informatik
Monday 28th September, 2009
Clock Synchronization in Distributed Systems Zbigniew Jerzak
Motivation
Outline
Introduction
What is clock synchronization?
The challenges of clock synchronization
Basic Concepts
Software and hardware clocks
Basic clock synchronization algorithm
Algorithms
Deep dive into landmark papers
NTP
Internet scale time synchronization
Summary
Problem Definition
based on: [LMS85]
1. At any time, the values of all the nonfaulty clocks must be approximately equal (within $\Delta_{\max}$).
2. There is a small bound on the amount by which a nonfaulty process's clock is changed.
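As a rough illustration of condition 1, the sketch below checks whether a set of clock readings taken at the same instant agree to within a given bound. The function name `within_precision` and the sample readings are assumptions made for this example; they do not come from [LMS85].

```python
def within_precision(readings, delta_max):
    """Return True if all clock readings differ pairwise by at most delta_max.

    The largest pairwise difference equals max(readings) - min(readings),
    so comparing the two extremes is sufficient.
    """
    return max(readings) - min(readings) <= delta_max


# Hypothetical readings (seconds) of three nonfaulty clocks at one instant.
readings = [100.000, 100.004, 99.998]
print(within_precision(readings, delta_max=0.010))  # spread is about 0.006 s
print(within_precision(readings, delta_max=0.005))
```

Condition 2 is not covered here; it constrains how far a single resynchronization may move a clock.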
System Definition
A set of N distributed processes
Every process has a local physical clock
No direct access to a shared global clock
Communication between processes is message-based
Clock Synchronization Application Areas
based on: [Lis93]
At most once message delivery [LSW90]
Cache consistency [GC89]
Active replication [HCZ08]
Medium access control [KG94]
Global Positioning System
Global System for Mobile communications (second generation)
Time Division Multiple Access
Requirement: real-time communication using a shared medium
Problem: collisions can arbitrarily delay messages
Solution: synchronize clocks to determine access slots [Joc07]
frame-based data flow
divide frames into slots
scheduler assigns processes to slots
clock synchronization: collision-free schedule execution
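The slot mechanism can be sketched in a few lines. This is a minimal illustration, assuming fixed-length slots, a static schedule, and a guard band at the slot edges; the function names, the guard band, and the sample schedule are assumptions and are not taken from [Joc07].

```python
def slot_owner(clock_us, slot_us, schedule):
    """Return the process that owns the slot containing clock_us.

    clock_us: synchronized clock reading in microseconds
    slot_us:  slot length in microseconds
    schedule: list of process ids, one per slot; its length is the frame size
    """
    slot_index = (clock_us // slot_us) % len(schedule)
    return schedule[slot_index]


def may_send(process, clock_us, slot_us, schedule, guard_us):
    """True if `process` owns the current slot and is outside the guard band.

    The guard band at both slot edges is an assumption of this sketch: it
    absorbs the residual offset between the clocks of different processes.
    """
    position = clock_us % slot_us
    inside = guard_us <= position < slot_us - guard_us
    return inside and slot_owner(clock_us, slot_us, schedule) == process


schedule = ["p0", "p1", "p2", "p1"]   # hypothetical 4-slot frame
print(slot_owner(2500, 1000, schedule))          # third slot of the frame
print(may_send("p2", 2500, 1000, schedule, 50))
print(may_send("p2", 2990, 1000, schedule, 50))  # inside the guard band
```

If the clocks of two processes disagree by more than the guard band, both may believe they own the medium at the same instant, which is why the schedule is only collision free when the clocks are synchronized.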
Problem: Unstable Clocks
[Figure: clock offset [us] over time [h], hours 5 to 11, with regions labeled "out of the box" and "in the box"]
[Figure: clock offset [us] and temperature [C] over time [h], hours 12 to 14]
Problem: Omissions and Crashes
The probability of failure increases with the increasing number of system elements
A system consisting of 280 nodes partitions on average once a day [MPHD06]
Byzantine [LSP82] errors
"Two-faced clocks" present different values to different processes [LMS85]
Clocks
The timekeeping element is an oscillator:
pendulum
quartz crystal
microwave (cesium-133, $^{133}$Cs)
In computer science
Hardware Clocks
Hardware clock: $H(t)$
Rate: $f(t) = \frac{dH(t)}{dt}$
Drift: $\rho(t) = f(t) - 1$
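The definitions above can be turned into a small measurement helper. The sketch below estimates the average rate and drift over an interval from two hardware clock readings and two reference timestamps; the function names and the numbers are illustrative assumptions.

```python
def mean_rate(h_start, h_end, t_start, t_end):
    """Average clock rate f over [t_start, t_end]: elapsed H divided by elapsed t."""
    return (h_end - h_start) / (t_end - t_start)


def mean_drift_ppm(h_start, h_end, t_start, t_end):
    """Average drift rho = f - 1, expressed in parts per million."""
    return (mean_rate(h_start, h_end, t_start, t_end) - 1.0) * 1e6


# Hypothetical measurement: over 1000 s of reference time the hardware
# clock advanced by 1000.05 s, i.e. it gained 50 ms.
print(round(mean_drift_ppm(0.0, 1000.05, 0.0, 1000.0), 3))
```

A perfect clock has rate 1 and drift 0; a positive drift means the clock runs fast relative to the reference.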
Clock Drift of Different PlanetLab Hosts
[Figure: drift rate [ppm] by date, March 18 to March 27, for the hosts uba.ar, ssvl.kth.se, and iit-tech.net]
Correctness of the Hardware Clock
$|\rho(t)| \le \rho_{\max}$
$\rho_{\max} \le 500$ ppm for HPET [Cor04]
$H(t) - H(s) \ge (t - s)(1 - \rho_{\max})$
$H(t) - H(s) \le (t - s)(1 + \rho_{\max})$
linear envelope of real time
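A worked check of the linear envelope may help. The sketch below tests whether the clock time elapsed between two real times stays inside the envelope; it assumes that the HPET figure of 500 is given in ppm, and the function name and readings are hypothetical.

```python
def within_envelope(h_s, h_t, s, t, rho_max):
    """Check H(t) - H(s) against the linear envelope for real times s <= t."""
    elapsed_real = t - s
    elapsed_clock = h_t - h_s
    lower = elapsed_real * (1.0 - rho_max)
    upper = elapsed_real * (1.0 + rho_max)
    return lower <= elapsed_clock <= upper


RHO_MAX = 500e-6  # assumed: 500 ppm

# Over 10 s of real time a correct clock may advance by 10 s plus or minus 5 ms.
print(within_envelope(0.0, 10.004, 0.0, 10.0, RHO_MAX))
print(within_envelope(0.0, 10.006, 0.0, 10.0, RHO_MAX))
```

A clock that leaves the envelope is considered faulty under this correctness condition.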
Software Clocks
Hardware clocks in general are not synchronized:
$H_p(t) - H_q(t)$ is not bounded
Software clocks are used instead:
$S_p(t) = H_p(t) + a_p(t)$
A process will have a physical clock that "ticks" continually and a logical clock whose value equals the value of the physical clock plus some offset [LMS85].
$a_p(t)$: Continuous and Discrete Software Clocks
Main Components of Clock Synchronization Algorithm
Basic Clock Synchronization Algorithm
ClockValue Ap;  // current adjustment
ClockValue T;   // end of current round
ClockValue P;   // re-sync period

void init() {
    Ap, T = initialAdj();
    schedule(synchronizationRound, P, T);
}

void synchronizationRound() {
    ClockValue clk[|N|];  // remote clock readings
    ClockValue err[|N|];  // remote reading errors

    readClocks(clk, err);
    Ap = adjust(Ap, T, clk, err);
    T = T + P;
}
Algorithms Classification
External vs Internal Clock Synchronization
External [Cri89, Mil91, CF95]:
time reference external to the system
maintain $\Delta_{\max}$ with respect to the external time reference
Internal [LMS85, WL88, CF95, FL06]:
maintain $\Delta_{\max}$ with respect to other system members
Externally synchronized clocks are also internally synchronized. The converse is not true. [Cri89]
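The first half of this claim follows from the triangle inequality. A short derivation, assuming every software clock stays within $\Delta_{\max}$ of the external reference, written here as $R(t)$ (a symbol introduced only for this illustration):

```latex
|S_p(t) - S_q(t)| \le |S_p(t) - R(t)| + |R(t) - S_q(t)| \le 2\Delta_{\max}
```

Under this assumption the resulting internal bound is $2\Delta_{\max}$ rather than $\Delta_{\max}$. The converse fails because a set of clocks can stay close to each other while drifting together away from the reference.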
Software vs Hardware Clock Synchronization
Hardware (assisted) clock synchronization [KSB85, SR88, KKMS95]
Very precise (e.g. phase locking)
Very expensive (additional hardware)
Software clock synchronization [WL88, Mil91, FL06]
Less precise
More flexible
Cheap
Deterministic vs Probabilistic Clock Synchronization
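Deterministic [WL88, FC95, WS07]:
an upper bound on the message delay exists: $\exists\, ub(t_d)$
$\Delta_{\max}$ holds
Probabilistic [Cri89, OS94]:
no upper bound on the message delay exists: $\nexists\, ub(t_d)$
$\Delta_{\max}$ is not guaranteed to hold
indication when $\Delta_{\max}$ is reached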
Fault Tolerant Clock Synchronization
based on: [LL84, WL88]
$t_d \in [\delta_{\min}, \delta_{\max}]$
$\forall p \in N: |\rho_p(t)| \le \rho_{\max}$
$S_p(t) = H_p(t) + a_p(t)$, where $a_p(t)$ is a discrete function of time
Initial synchronization: $\forall p, q \in N: |S_p(0) - S_q(0)| < \gamma$
$|N|^2$ messages per round
[CF94]: $|N| + 1$ for crash-stop failures
$|N| \ge 3|F| + 1$
FTCS – Algorithm Outline
1. Broadcast $S_p(T^i)$
2. Wait for other broadcasts for $\gamma + \delta_{\max}$
3. Use convergence function to calculate midpoint
4. $a_p^{i+1} = a_p^i + \text{midpoint}$
5. Use $a_p^{i+1}$ to "switch" to new software clock
6. Wait until $T^{i+1} = T^i + P$
7. Loop
FTCS – Fault Tolerant Convergence Function
ClockValue cfn(ClockValue clk[|N|], int |F|)
{
    ClockValue midpoint = 0;
    ClockValue tmp[|N|];

    tmp = sort(clk);  // ascending order

    // Skip the |F| lowest readings and sum tmp[|F|] .. tmp[2|F|];
    // with |N| = 3|F| + 1 this also skips the |F| highest readings.
    for (i = |F|; i < 2|F| + 1; ++i)
    {
        midpoint = midpoint + tmp[i];
    }
    midpoint = midpoint / (|F| + 1);  // divide by the number of summed values

    return midpoint;
}
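A runnable version of the convergence function might look as follows. `fault_tolerant_average` mirrors the loop in the slide; `fault_tolerant_midpoint` is a variant included as an assumption about what the name "midpoint" refers to, namely the midpoint of the range that remains after trimming. Function names and sample readings are hypothetical.

```python
def fault_tolerant_average(clk, f):
    """Average the sorted readings at indices f .. 2f.

    When len(clk) == 3f + 1 these are exactly the readings that remain
    after dropping the f lowest and the f highest values.
    """
    if len(clk) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    tmp = sorted(clk)
    middle = tmp[f:2 * f + 1]
    return sum(middle) / (f + 1)


def fault_tolerant_midpoint(clk, f):
    """Midpoint of the extremes that remain after dropping f values at each end."""
    if len(clk) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    tmp = sorted(clk)
    kept = tmp[f:len(tmp) - f]
    return (kept[0] + kept[-1]) / 2


# Hypothetical readings from |N| = 4 processes, at most |F| = 1 faulty.
readings = [10.0, 12.0, 11.0, 1000.0]   # 1000.0 comes from a faulty clock
print(fault_tolerant_average(readings, 1))
print(fault_tolerant_midpoint(readings, 1))
```

In both variants a faulty value can only influence the result if it lies between correct values, because up to |F| readings are discarded at each end of the sorted list.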
Probabilistic Clock Synchronization
based on: [Cri89]
$p(t_d \in [\delta_{\min}, \delta_{\max}]) \neq 1$ (no guaranteed upper bound on the message delay)
Remote clocks cannot be read with an a priori specified precision
A timeout delay divides messages into slow and fast
Processes suffer only timing failures
PCS – Remote Clock Reading I
$ub(m_2) = (D - A) - (C - B) - \delta_{\min}(m_1)$
PCS – Remote Clock Reading II
$C_p(T, q) = (T - D) + C + \frac{ub(m_2) + \delta_{\min}(m_1)}{2}$
$E_p(T, q) = \frac{ub(m_2) - \delta_{\min}(m_1)}{2}$
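The two reading formulas can be combined into one helper. The sketch below assumes the usual meaning of the four timestamps (A and D read on the local clock of p when the request m1 is sent and the reply m2 is received, B and C read on the remote clock of q when m1 is received and m2 is sent) and a single minimum delay for both directions; these assumptions, the function name, and the numbers are not taken from the slides.

```python
def read_remote_clock(a, b, c, d, delta_min):
    """Estimate a remote clock from one request/reply exchange.

    a: local clock when the request m1 is sent
    b: remote clock when m1 is received
    c: remote clock when the reply m2 is sent
    d: local clock when m2 is received
    delta_min: minimum one-way message delay (assumed equal for m1 and m2)

    Returns (estimate, error): the estimated remote clock value at local
    time d (the case T = D in the formulas) and the maximum reading error.
    """
    ub_m2 = (d - a) - (c - b) - delta_min      # upper bound on the delay of m2
    estimate = c + (ub_m2 + delta_min) / 2     # C_p(T, q) with T = D
    error = (ub_m2 - delta_min) / 2            # E_p(T, q)
    return estimate, error


# Hypothetical timestamps in milliseconds.
estimate, error = read_remote_clock(a=0.0, b=105.0, c=106.0, d=11.0, delta_min=2.0)
print(estimate, error)
```

The estimate places the delay of m2 in the middle of its possible range, from the minimum delay up to ub(m2), which is why the error is half the width of that range.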
PCS – Adjusting Local Clock
Recall: $S_q(t) = H_q(t) + a_q(t)$
$a_q(t) = \alpha H_q(t) + \beta$
$S_q(t) = H_q(t)(1 + \alpha) + \beta$
Local time: $S_q(T)$, remote time: $C_p(T, q)$
$S_q(T) = H_q(T)(1 + \alpha) + \beta$
Goal: after $P$, local time shows $C_p(T, q) + P$
$S_q(T + P) = C_p(T, q) + P = (H_q(T) + P)(1 + \alpha) + \beta$
Solution:
$\alpha = \frac{C_p(T, q) - S_q(T)}{P}$
$\beta = S_q(T) - H_q(T)(1 + \alpha)$
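The solution can be checked numerically. The sketch below computes the two parameters and verifies both conditions: the software clock does not jump at T, and it shows the remote time plus P once the hardware clock has advanced by P. Function names and values are illustrative assumptions.

```python
def amortization_params(h_T, s_T, c_T, period):
    """Return (alpha, beta) such that S(t) = H(t) * (1 + alpha) + beta.

    h_T: hardware clock H_q(T)
    s_T: current software clock S_q(T)
    c_T: remote clock estimate C_p(T, q)
    period: resynchronization period P, measured on the hardware clock
    """
    alpha = (c_T - s_T) / period
    beta = s_T - h_T * (1 + alpha)
    return alpha, beta


def software_clock(h, alpha, beta):
    """Software clock value for hardware clock reading h."""
    return h * (1 + alpha) + beta


# Hypothetical values: the local clock is 4 units behind the remote estimate.
h_T, s_T, c_T, P = 1000.0, 1000.0, 1004.0, 100.0
alpha, beta = amortization_params(h_T, s_T, c_T, P)
print(software_clock(h_T, alpha, beta))        # continuous: still S_q(T)
print(software_clock(h_T + P, alpha, beta))    # reaches C_p(T, q) + P
```

Spreading the correction over the whole period keeps the software clock continuous and, as long as alpha is greater than -1, monotonic.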
PCS – Specifying Precision
Lower $ub(m_2)$ implies lower error $E_p(T, q)$
Achieving a given error requires a bound $ub_{\max}$
Trade-off between $E_p(T, q)$ and the probability $p(ub(m) > ub_{\max})$
Using $k$ readings and knowing $p$:
$p(\text{success}) = 1 - p^k$
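The success formula can be inverted to ask how many readings are needed. The sketch below assumes that readings fail independently with the same probability p; the function names and the numbers are hypothetical.

```python
import math


def success_probability(p_fail, k):
    """Probability that at least one of k independent readings succeeds."""
    return 1 - p_fail ** k


def readings_needed(p_fail, target):
    """Smallest k with 1 - p_fail**k >= target, for 0 < p_fail < 1 and 0 < target < 1."""
    k = math.ceil(math.log(1 - target) / math.log(p_fail))
    return max(1, k)


# Hypothetical numbers: half of all readings exceed ub_max.
print(success_probability(0.5, 4))      # 1 - 0.5**4
print(readings_needed(0.5, 0.99))
```

A tighter bound on ub raises the failure probability of a single reading, so the same confidence costs more readings and therefore more messages.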
Gossip-based Synchronization
based on: [BPQS08]
Problem: scale to thousands of nodes
Solution: gossip-based algorithms (partial view)
Remote clock reading: Cristian approach [Cri89]
Digital signatures
Discrete clock adjustment
Gossip-based Synchronization – The Algorithm
1. Obtain a random list of neighbors
2. Use the remote clock reading to calculate offsets $O$
3. Sort the offsets
4. Adjustment: $\frac{1}{U - L} \sum_{i=L}^{U} O(i)$, where $L = \alpha |N|$, $U = |N| - L$, $0 \le \alpha < 0.5$
5. Update local clock
6. Loop
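Steps 3 and 4 amount to a trimmed mean of the offsets. The sketch below is one possible reading of the adjustment formula: it assumes that L is rounded down to an integer and that the sum runs over the half-open index range from L up to but excluding U, so that exactly U - L values are averaged. The function name and the sample offsets are hypothetical and not taken from [BPQS08].

```python
def trimmed_mean_adjustment(offsets, alpha):
    """Average the sorted offsets after discarding a fraction alpha at each end.

    L = floor(alpha * n) values are dropped from each end; the remaining
    U - L values (indices L .. U-1) are averaged.
    """
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    ordered = sorted(offsets)
    n = len(ordered)
    low = int(alpha * n)
    high = n - low
    kept = ordered[low:high]
    return sum(kept) / len(kept)


# Hypothetical offsets (ms) to 8 sampled neighbors; two are outliers.
offsets = [-900.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0, 750.0]
print(trimmed_mean_adjustment(offsets, alpha=0.25))
```

Discarding the extremes limits the influence of corrupted or badly delayed readings, at the cost of ignoring some correct ones.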
Network Time Protocol – Goal & Definitions
based on: [Mil91, Mil03]
Goal: accurate and precise time on a statistical basis with acceptable network overheads and instabilities in a large, diverse internet (interconnected) system. [Mil91]
Offset: $|H_p(t) - H_q(t)|$
Skew: $\frac{dH_p(t)}{dt} - \frac{dH_q(t)}{dt}$
Clock Synchronization:
time synchronization: bounding offset
frequency synchronization: bounding skew
NTP – Configuration
Servers ordered into strata
Redundant paths
tolerate link failures
SP algorithm
NTP – Reading Remote Clock
Round trip delay: $(D - A) - (C - B)$
Clock offset of q with respect to p: $\theta = \frac{C + B}{2} - \frac{D + A}{2}$
Error: $\frac{(D - A) - (C - B)}{2}$
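These three quantities can be computed from a single exchange. The sketch below assumes that A and D are read on the clock of p (request sent, reply received) and B and C on the clock of q (request received, reply sent); the function name and the timestamps are hypothetical.

```python
def ntp_sample(a, b, c, d):
    """Compute (offset, delay, error) from the four timestamps of one exchange.

    a: request sent (clock of p)      b: request received (clock of q)
    c: reply sent (clock of q)        d: reply received (clock of p)
    """
    delay = (d - a) - (c - b)
    offset = (c + b) / 2 - (d + a) / 2
    error = delay / 2
    return offset, delay, error


# Hypothetical timestamps in milliseconds.
print(ntp_sample(a=0.0, b=105.0, c=106.0, d=11.0))
```

The offset estimate is exact only when the delays in both directions are equal; in the worst case it is wrong by half the round trip delay, which is the error term.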
NTP – Data Filtering
Problem: accurate offset from a sample population
Solution: minimum filter
order m readings according to round trip delay
select the lowest round trip (first) reading
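The minimum filter can be sketched as a sort on the round trip delay. This is a simplified illustration with hypothetical sample values, assuming each reading is an (offset, delay) pair; it is not the NTP reference implementation.

```python
def minimum_filter(samples):
    """Order (offset, delay) samples by delay and return them, best first.

    The first element is the reading selected by the minimum filter.
    """
    return sorted(samples, key=lambda sample: sample[1])


# Hypothetical (offset, round trip delay) samples in milliseconds.
samples = [(103.0, 18.0), (100.2, 10.0), (108.5, 31.0), (99.1, 12.5)]
ordered = minimum_filter(samples)
print(ordered[0])
```

The reasoning behind the choice is that a low round trip delay leaves little room for asymmetric queueing, so the offset measured in that exchange has the smallest error bound.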
NTP – Peer Selection
Problem: select and combine best peers
Solution: calculate per peer statistics
1. order peers by stratum and round trip delay
2. filter dispersion: $\chi = \sum_{i=0}^{m-1} |\theta_i - \theta_0| \cdot 0.5^i$
3. peer dispersion: $\forall j \in \{0, \dots, |N|-1\}: \chi_j = \sum_{k=0}^{|N|-1} |\theta_{j,0} - \theta_{k,0}| \cdot 0.75^k$
4. eliminate the peer with the highest dispersion
5. terminate if one peer is left
6. terminate if the peer dispersion < minimum filter dispersion
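The two dispersion statistics can be written out as weighted sums. The sketch below assumes that the offsets of one peer are already ordered by round trip delay and that each peer contributes its first (lowest delay) offset to the peer dispersion; the function names and the sample values are hypothetical.

```python
def filter_dispersion(offsets):
    """Filter dispersion of one peer.

    offsets: the m offsets of that peer, already ordered by round trip
    delay, so offsets[0] is the sample chosen by the minimum filter.
    """
    theta0 = offsets[0]
    return sum(abs(theta - theta0) * 0.5 ** i for i, theta in enumerate(offsets))


def peer_dispersions(selected):
    """Peer dispersion of every peer relative to all peers.

    selected: selected offset of each peer, in the peer order of step 1.
    """
    return [
        sum(abs(theta_j - theta_k) * 0.75 ** k for k, theta_k in enumerate(selected))
        for theta_j in selected
    ]


print(filter_dispersion([100.0, 102.0, 96.0]))
print(peer_dispersions([100.0, 104.0, 60.0]))
```

In the second call the third peer disagrees with the other two and receives the highest dispersion, so it would be the one eliminated in step 4.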
NTP – Clock Correction
based on: [Mil92]
Only one peer: directly apply offset
ClockValue cfn(offset[|N|], stratum[|N|], distance[|N|])
{
    ClockValue tmp1;      // weight of the current peer
    ClockValue tmp2 = 0;  // sum of the weights
    ClockValue tmp3 = 0;  // weighted sum of the offsets

    for (i = 0; i < |N|; ++i) {
        tmp1 = 1 / (stratum[i] * MAXDISPERS + distance[i]);
        tmp2 += tmp1;
        tmp3 += tmp1 * offset[i];
    }
    return (tmp3 / tmp2);
}
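The combining step is a weighted average in which peers with a low stratum and a small distance count more. A runnable sketch follows; the value chosen for `MAXDISPERS`, the function name, and the sample peers are assumptions made for illustration.

```python
MAXDISPERS = 16.0  # assumed value for illustration


def combine_offsets(offsets, strata, distances, max_dispersion=MAXDISPERS):
    """Weighted average of peer offsets.

    The weight of peer i is 1 / (stratum[i] * max_dispersion + distance[i]).
    """
    weights = [1.0 / (s * max_dispersion + d) for s, d in zip(strata, distances)]
    weighted_sum = sum(w * o for w, o in zip(weights, offsets))
    return weighted_sum / sum(weights)


# Two hypothetical peers: a stratum 1 peer and a stratum 3 peer.
print(combine_offsets([10.0, 20.0], strata=[1, 3], distances=[0.0, 0.0]))
```

With these numbers the stratum 1 peer has three times the weight of the stratum 3 peer, so the result lies much closer to 10 than to 20.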
Summary
Clock synchronization is a difficult problem
External clock synchronization has lower overheads
Internal clock synchronization is more robust
Clock synchronization is an important problem
For hard real-time applications
For wireless networks
Clock synchronization is practical
GPS
GSM (2G)
Thank You!
References I
Roberto Baldoni, Marco Platania, Leonardo Querzoni, and Sirio Scipioni.
A peer-to-peer filter-based algorithm for internal clock synchronization in presence of corrupted processes.
In PRDC 2008: 14th IEEE Pacific Rim International Symposium on Dependable Computing, pages 64–72.
IEEE Computer Society, 2008.
Flaviu Cristian and Christof Fetzer.
Probabilistic internal clock synchronization.
In Proceedings of the Thirteenth Symposium on Reliable Distributed Systems (SRDS1994), pages 22–31,
October 1994.
F. Cristian and C. Fetzer.
Fault-tolerant external clock synchronization.
In ICDCS ’95: Proceedings of the 15th International Conference on Distributed Computing Systems,
page 70, Washington, DC, USA, 1995. IEEE Computer Society.
Intel Corporation.
IA-PC HPET (High Precision Event Timers) specification.
Online, October 2004.
Flaviu Cristian.
Probabilistic clock synchronization.
Distributed Computing, 3(3):146–158, September 1989.
Christof Fetzer and Flaviu Cristian.
An optimal internal clock synchronization algorithm.
In Proceedings of the 10th Annual IEEE Conference on Computer Assurance (COMPASS 1995), pages
187–196, June 1995.
References II
Rui Fan and Nancy A. Lynch.
Gradient clock synchronization.
Distributed Computing, 18(4):255–266, 2006.
Cary G. Gray and David R. Cheriton.
Leases: An efficient fault-tolerant mechanism for distributed file cache consistency.
In SOSP 1989: Proceedings of the twelfth ACM Symposium on Operating Systems Principles, pages
202–210, 1989.
Jeong-Hyon Hwang, Ugur Cetintemel, and Stan Zdonik.
Fast and highly-available stream processing over wide area networks.
In ICDE ’08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pages
804–813, Washington, DC, USA, 2008. IEEE Computer Society.
M. Jochim.
Zeitig steuern - sichere Datenübertragung im Automobil.
c’t Magazin für Computertechnik, 2(1):190–195, January 2007.
Hermann Kopetz and Günter Grünsteidl.
TTP - a protocol for fault-tolerant real-time systems.
Computer, 27(1):14–23, 1994.
H. Kopetz, A. Kruger, D. Millinger, and A. Schedl.
A synchronization strategy for a time-triggered multi-cluster real-time system.
In 14th Symposium on Reliable Distributed Systems, 1995. Proceedings, pages 154–161, Bad Neuenahr,
Germany, September 1995.
References III
C. M. Krishna, Kang G. Shin, and Ricky W. Butler.
Ensuring fault tolerance of phase-locked clocks.
IEEE Trans. Comput., 34(8):752–756, 1985.
Barbara Liskov.
Practical uses of synchronized clocks in distributed systems.
Distributed Computing, 6(4):211–219, 1993.
Jennifer Lundelius and Nancy A. Lynch.
An upper and lower bound for clock synchronization.
Information and Control, 62(2/3):190–204, 1984.
Leslie Lamport and P. M. Melliar-Smith.
Synchronizing clocks in the presence of faults.
J. ACM, 32(1):52–78, 1985.
Leslie Lamport, Robert Shostak, and Marshall Pease.
The Byzantine generals problem.
ACM Trans. Program. Lang. Syst., 4(3):382–401, 1982.
B. Liskov, L. Shrira, and J. Wroclawski.
Efficient at-most-once messages based on synchronized clocks.
In SIGCOMM ’90: Proceedings of the ACM symposium on Communications architectures & protocols,
pages 41–49, New York, NY, USA, 1990. ACM.
David L. Mills.
Internet time synchronization: the network time protocol.
IEEE Transactions on Communications, 39(10):1482–1493, October 1991.
References IV
David L. Mills.
Network time protocol (version 3) specification, implementation and analysis, March 1992.
David L. Mills.
A brief history of NTP time: memoirs of an Internet timekeeper.
SIGCOMM Comput. Commun. Rev., 33(2):9–21, 2003.
Alan Mislove, Ansley Post, Andreas Haeberlen, and Peter Druschel.
Experiences in building and operating a reliable peer-to-peer application.
In Yolande Berbers and Willy Zwaenepoel, editors, EuroSys, pages 147–159, Leuven, Belgium, April 2006.
ACM.
A. Olson and K.G. Shin.
Probabilistic clock synchronization in large distributed systems.
IEEE Transactions on Computers, 43(9):1106–1112, September 1994.
K. G. Shin and P. Ramanathan.
Transmission delays in hardware clock synchronization.
IEEE Trans. Comput., 37(11):1465–1467, 1988.
Jennifer Lundelius Welch and Nancy Lynch.
A new fault-tolerant algorithm for clock synchronization.
Information and Computation, 77(1):1–36, 1988.
Josef Widder and Ulrich Schmid.
Booting clock synchronization in partially synchronous systems with hybrid process and link failures.
Distributed Computing, 20(2):115–140, May 2007.