
# Clock Synchronization in Distributed Systems

Introduction: What is clock synchronization?
The challenges of clock synchronization.
Basic Concepts: Software and hardware clocks. Basic clock synchronization algorithm
Algorithms: Deep dive into landmark papers
NTP: Internet scale time synchronization

### Clock Synchronization in Distributed Systems

1. Clock Synchronization in Distributed Systems. Scientific talk (Wissenschaftlicher Vortrag) by Zbigniew Jerzak, Technische Universität Dresden, Fakultät Informatik, Monday 28th September, 2009. Deck sections: Introduction, Basic Concepts, Algorithms, NTP, Summary (45 slides).
2. Motivation
3. Outline
   - Introduction: What is clock synchronization? The challenges of clock synchronization
   - Basic Concepts: Software and hardware clocks. Basic clock synchronization algorithm
   - Algorithms: Deep dive into landmark papers
   - NTP: Internet scale time synchronization
   - Summary
4. Problem Definition (based on [LMS85])
   1. At any time, the values of all the nonfaulty clocks must be approximately equal (within $\Delta_{max}$).
   2. There is a small bound on the amount by which a nonfaulty process's clock is changed.
5. System Definition
   - A set of $N$ distributed processes
   - Every process has a local physical clock
   - No direct access to a shared global clock
   - Communication between processes is message-based
6. Clock Synchronization Application Areas (based on [Lis93])
   - At most once message delivery [LSW90]
   - Cache consistency [GC89]
   - Active replication [HCZ08]
   - Medium access control [KG94]
   - Global Positioning System
   - Global System for Mobile communications (second generation)
8. Time Division Multiple Access
   - Requirement: real-time communication using a shared medium
   - Problem: collisions can arbitrarily delay messages
   - Solution: synchronize clocks to determine access slots [Joc07]
     - frame-based data flow
     - divide frames into slots
     - scheduler assigns processes to slots
     - clock synchronization: collision-free schedule execution
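The slot idea above can be sketched in a few lines. This is only an illustration under stated assumptions: fixed-length slots, clock values in microseconds, and a guard band of $\Delta_{max}$ at both slot edges. The names `slot_index` and `may_transmit` and all numeric values are hypothetical and do not come from the slides or from [Joc07].

```python
def slot_index(clock_us, slot_len_us, slots_per_frame):
    """Index of the slot that contains the given synchronized clock value."""
    return (clock_us // slot_len_us) % slots_per_frame


def may_transmit(clock_us, my_slot, slot_len_us, slots_per_frame, delta_max_us):
    """True if the local clock is inside its own slot and outside the guard bands.

    Transmit windows of different slots are at least 2 * delta_max_us apart in
    clock value, so two clocks that differ by at most delta_max_us can never be
    inside two different transmit windows at the same moment.
    """
    if slot_index(clock_us, slot_len_us, slots_per_frame) != my_slot:
        return False
    position = clock_us % slot_len_us
    return delta_max_us <= position < slot_len_us - delta_max_us


# Assumed example: 4 slots of 1000 us each, clocks synchronized within 50 us.
print(may_transmit(2500, my_slot=2, slot_len_us=1000,
                   slots_per_frame=4, delta_max_us=50))
```

The guard band trades bandwidth for safety: a tighter $\Delta_{max}$ leaves a larger usable share of every slot.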
16. Problem: Unstable Clocks. [Figure: clock offset [µs] over time [h], with the annotations "(out of the box)" and "(in the box)".]
17. Problem: Unstable Clocks. [Figure: clock offset [µs] and temperature [°C] over time [h].]
18. Problem: Varying Delays. [Figure: number of messages (logarithmic scale) over round trip time [µs] for four links. LAN: [se09 - sedell06].inf.tu-dresden.de; MAN: sews11.inf.tu-dresden.de - mindfab.net; MAN: itias.homeip.net - rg4.polsl.pl; WAN: sedell06.inf.tu-dresden.de - rg4.polsl.pl.]
19. Problem: Omissions and Crashes
    - The probability of failure increases with the increasing number of system elements
    - A system consisting of 280 nodes partitions on average once a day [MPHD06]
    - Byzantine [LSP82] errors
20. Problem: Omissions and Crashes (continued)
    - The probability of failure increases with the increasing number of system elements
    - Byzantine [LSP82] errors: "two-faced clocks" present different values to different processes [LMS85]
21. Clocks
    - The timekeeping element is an oscillator: pendulum, quartz crystal, microwave ($^{133}$Cs)
    - In computer science
23. Hardware Clocks
    - Hardware clock: $H(t)$
    - Rate: $f(t) = \frac{dH(t)}{dt}$
    - Drift: $\rho(t) = f(t) - 1$
24. Clock Drift of Different PlanetLab Hosts. [Figure: drift rate [ppm] over date (Mar 18 to Mar 27) for the hosts uba.ar, ssvl.kth.se and iit-tech.net.]
25. Correctness of the Hardware Clock
    - $|\rho(t)| \le \rho_{max}$
    - $\rho_{max} \le 500$ ppm for HPET [Cor04]
    - $H(t) - H(s) \ge (t - s)(1 - \rho_{max})$
    - $H(t) - H(s) \le (t - s)(1 + \rho_{max})$
    - Together, the two bounds form a linear envelope of real time.
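To get a feeling for the linear envelope, the following sketch evaluates both bounds for a one-hour interval. The 500 ppm figure is the HPET bound quoted on the slide; the interval length and the function names `envelope` and `is_correct` are illustrative assumptions.

```python
def envelope(elapsed_s, rho_max):
    """Lower and upper bound on H(t) - H(s) for a correct clock, with elapsed_s = t - s."""
    return elapsed_s * (1.0 - rho_max), elapsed_s * (1.0 + rho_max)


def is_correct(measured_s, elapsed_s, rho_max):
    """True if the hardware clock advanced by an amount inside the envelope."""
    low, high = envelope(elapsed_s, rho_max)
    return low <= measured_s <= high


RHO_MAX = 500e-6  # 500 ppm, expressed as a dimensionless rate
low, high = envelope(3600.0, RHO_MAX)
print(f"after one hour a correct clock shows between {low:.1f} s and {high:.1f} s")
```

With these numbers an unsynchronized but correct clock may be off by up to 1.8 s per hour, which is why periodic resynchronization is needed.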
27. Software Clocks
    - Hardware clocks in general are not synchronized: $H_p(t) - H_q(t)$ is not bounded
    - Software clocks are used instead: $S_p(t) = H_p(t) + a_p(t)$
    - A process will have a physical clock that "ticks" continually and a logical clock whose value equals the value of the physical clock plus some offset [LMS85].
29. $a_p(t)$: Continuous and Discrete Software Clocks
30. Main Components of Clock Synchronization Algorithm
31. Basic Clock Synchronization Algorithm

        ClockValue Ap;  // current adjustment
        ClockValue T;   // end of current round
        ClockValue P;   // re-sync period

        void init() {
          Ap, T = initialAdj();
          schedule(synchronizationRound, P, T);
        }

        void synchronizationRound() {
          ClockValue clk[|N|];  // remote clock readings
          ClockValue err[|N|];  // remote reading errors

          readClocks(clk, err);
          Ap = adjust(Ap, T, clk, err);
          T = T + P;
        }
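A runnable sketch of one synchronization round may help to see how the pieces fit together. It mirrors the pseudocode's `Ap`, `T` and `P`, but the `SoftwareClock` class, the stubbed hardware clock, the stubbed remote readings and the averaging `mean_adjust` function are all assumptions made for illustration; the slides do not prescribe them.

```python
class SoftwareClock:
    """S_p(t) = H_p(t) + a_p(t), here with a discrete (step) adjustment."""

    def __init__(self, read_hardware, adjustment=0.0):
        self.read_hardware = read_hardware  # callable returning H_p(t)
        self.adjustment = adjustment        # Ap in the pseudocode

    def now(self):
        return self.read_hardware() + self.adjustment


def synchronization_round(clock, read_clocks, adjust, T, P):
    """Read remote clocks, update the adjustment, return the end of the next round."""
    clk, err = read_clocks()
    clock.adjustment = adjust(clock.adjustment, T, clk, err)
    return T + P


# Hypothetical stubs: a frozen hardware clock and two error-free remote readings.
hardware = lambda: 100.0
read_clocks = lambda: ([101.0, 103.0], [0.0, 0.0])


def mean_adjust(a_p, T, clk, err):
    """Assumed adjustment rule: move the software clock to the mean remote reading."""
    return a_p + (sum(clk) / len(clk) - (hardware() + a_p))


clock = SoftwareClock(hardware)
next_T = synchronization_round(clock, read_clocks, mean_adjust, T=10.0, P=5.0)
print(clock.now(), next_T)
```

A plain mean is not fault tolerant; it only stands in for the `adjust` step so that the round structure can be executed.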
32. Algorithms Classification
33. External vs Internal Clock Synchronization
    - External [Cri89, Mil91, CF95]:
      - time reference external to the system
      - maintain $\Delta_{max}$ with respect to the external time reference
    - Internal [LMS85, WL88, CF95, FL06]:
      - maintain $\Delta_{max}$ with respect to other system members
    - Externally synchronized clocks are also internally synchronized. The converse is not true. [Cri89]
36. Software vs Hardware Clock Synchronization
    - Hardware (assisted) clock synchronization [KSB85, SR88, KKMS95]
      - Very precise (e.g. phase locking)
      - Very expensive (additional hardware)
    - Software clock synchronization [WL88, Mil91, FL06]
      - Less precise
      - More flexible
      - Cheap
38. Deterministic vs Probabilistic Clock Synchronization
    - Deterministic [WL88, FC95, WS07]:
      - $\exists\, ub(t_d)$: an upper bound on the transmission delay exists
      - $\Delta_{max}$ holds
    - Probabilistic [Cri89, OS94]:
      - $\nexists\, ub(t_d)$: no upper bound on the transmission delay
      - $\Delta_{max}$ does not hold
      - indication when $\Delta_{max}$ is reached
40. Clock Synchronization Algorithms
    - Fault Tolerant Clock Synchronization (FTCS) [WL88]: software, internal, deterministic
    - Probabilistic Clock Synchronization (PCS) [Cri89]: software, external, probabilistic
    - Gossip-based Synchronization [BPQS08]: software, internal
43. Fault Tolerant Clock Synchronization (based on [LL84, WL88])
    - $t_d \in [\delta_{min}, \delta_{max}]$
    - $\forall p \in N: |\rho_p(t)| \le \rho_{max}$
    - $S_p(t) = H_p(t) + a_p(t)$, where $a_p(t)$ is a discrete function of time
    - Initial synchronization: $\forall p, q \in N: |S_p(0) - S_q(0)| < \gamma$
    - $|N|^2$ messages per round
    - [CF94]: $|N| + 1$ for crash-stop failures
    - $|N| \ge 3|F| + 1$
44. FTCS – Algorithm Outline
    1. Broadcast $S_p(T^i)$
    2. Wait for other broadcasts for $\gamma + \delta_{max}$
    3. Use convergence function to calculate midpoint
    4. $a_p^{i+1} = a_p^i + \text{midpoint}$
    5. Use $a_p^{i+1}$ to "switch" to new software clock
    6. Wait until $T^{i+1} = T^i + P$
    7. Loop
51. FTCS – Fault Tolerant Convergence Function

        ClockValue cfn(clk[|N|], |F|)
        {
          ClockValue midpoint;
          ClockValue tmp[|N|];

          midpoint = 0;
          tmp = sort(clk);  // ascending order

          // sum the |F|+1 readings at sorted positions |F| .. 2|F|
          for (i = |F|; i < 2|F| + 1; ++i)
          {
            midpoint = midpoint + tmp[i];
          }
          // parenthesized so that the sum is divided by the number of summed readings
          midpoint = midpoint / (|F| + 1);

          return midpoint;
        }
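The convergence function can be tried out directly. The sketch below follows the index range of the pseudocode (sorted positions $|F|$ to $2|F|$) and assumes that the division applies to the whole sum, that is, by $|F| + 1$. The function name and the sample readings are hypothetical.

```python
def fault_tolerant_average(clk, f):
    """Average of the f + 1 readings at sorted positions f .. 2f.

    For n = 3f + 1 readings this discards the f lowest and the f highest
    values, so up to f arbitrarily wrong clocks cannot pull the result
    outside the range spanned by the correct ones.
    """
    if len(clk) < 3 * f + 1:
        raise ValueError("need at least 3f + 1 readings to tolerate f faults")
    ordered = sorted(clk)
    middle = ordered[f:2 * f + 1]
    return sum(middle) / (f + 1)


# Four readings, one of them wildly wrong (for example a Byzantine clock).
readings = [10.0, 10.2, 9.9, 1000.0]
print(fault_tolerant_average(readings, f=1))
```

The faulty reading of 1000.0 is sorted to the end and never enters the average, which stays between the correct readings.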
52. Probabilistic Clock Synchronization (based on [Cri89])
    - $p(t_d \in [\delta_{min}, \delta_{max}]) = 1$
    - Remote clocks cannot be read with a priori specified precision
    - Timeout delay, which divides messages into slow and fast
    - Processes suffer only timing failures
53. PCS – Remote Clock Reading I
    - $ub(m_2) = (D - A) - (C - B) - \delta_{min}(m_1)$
54. PCS – Remote Clock Reading II
    - $C_p(T, q) = (T - D) + C + \frac{ub(m_2) + \delta_{min}(m_1)}{2}$
    - $E_p(T, q) = \frac{ub(m_2) - \delta_{min}(m_1)}{2}$
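A small numeric sketch of the two remote clock reading formulas follows. The slide's figure is not part of the transcript, so the meaning of the timestamps is an assumption here: `A` is the local send time of the request $m_1$, `B` and `C` are the remote receive and reply times, and `D` is the local receive time of the reply $m_2$. The function name and all numbers are illustrative.

```python
def read_remote_clock(A, B, C, D, T, delta_min):
    """Estimate of the remote clock at local time T, plus the maximum reading error."""
    ub_m2 = (D - A) - (C - B) - delta_min             # upper bound on the delay of m2
    estimate = (T - D) + C + (ub_m2 + delta_min) / 2  # C_p(T, q)
    error = (ub_m2 - delta_min) / 2                   # E_p(T, q)
    return estimate, error


# Assumed values in milliseconds: 10 ms on the wire in total, 1 ms remote processing.
estimate, error = read_remote_clock(A=0, B=105, C=106, D=11, T=11, delta_min=2)
print(estimate, error)
```

In this example the reply took between 2 ms and 8 ms, so the remote clock at local time 11 lies between 108 and 114; the estimate is the midpoint of that interval and the error is its half-width.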
55. PCS – Adjusting Local Clock
    - Recall: $S_q(t) = H_q(t) + a_q(t)$
    - $a_q(t) = \alpha H_q(t) + \beta$
    - $S_q(t) = H_q(t)(1 + \alpha) + \beta$
    - Local time: $S_q(T)$, remote time: $C_p(T, q)$
    - $S_q(T) = H_q(T)(1 + \alpha) + \beta$
    - Goal: after $P$, local time shows $C_p(T, q) + P$
    - $S_q(T + P) = C_p(T, q) + P = (H_q(T) + P)(1 + \alpha) + \beta$
    - Solution: $\alpha = \frac{C_p(T, q) - S_q(T)}{P}$, $\beta = S_q(T) - H_q(T)(1 + \alpha)$
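The solution for $\alpha$ and $\beta$ can be checked numerically. The sketch below computes both parameters and confirms that the software clock is continuous at $T$ and reaches $C_p(T, q) + P$ after $P$ hardware time units. Function names and values are illustrative assumptions.

```python
def amortization_parameters(S_T, H_T, C_T, P):
    """alpha and beta of the linear adjustment a_q(t) = alpha * H_q(t) + beta."""
    alpha = (C_T - S_T) / P
    beta = S_T - H_T * (1 + alpha)
    return alpha, beta


def software_clock(H, alpha, beta):
    """S_q(t) = H_q(t) * (1 + alpha) + beta."""
    return H * (1 + alpha) + beta


# Assumed situation: the local clock reads 1000, the remote estimate is 1002, P = 100.
H_T, S_T, C_T, P = 1000.0, 1000.0, 1002.0, 100.0
alpha, beta = amortization_parameters(S_T, H_T, C_T, P)
print(software_clock(H_T, alpha, beta))      # still S_q(T): no jump at time T
print(software_clock(H_T + P, alpha, beta))  # C_p(T, q) + P: the offset is amortized
```

Spreading the correction over $P$ keeps the software clock continuous and, for small corrections, monotonic, which a discrete jump cannot guarantee.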
59. PCS – Specifying Precision
    - Lower $ub(m_2)$ implies lower error $E_p(T, q)$
    - Achieving a given error requires a bound $ub_{max}$
    - Trade-off between $E_p(T, q)$ and probability $p(ub(m) > ub_{max})$
    - Using $k$ readings and knowing $p$: $p(\text{success}) = 1 - p^k$
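The relation $p(\text{success}) = 1 - p^k$ also tells how many reading attempts are needed for a desired success probability. A minimal sketch, assuming independent attempts and $0 < p < 1$; the function names and the numbers are illustrative.

```python
def success_probability(p, k):
    """Probability that at least one of k independent readings is fast enough."""
    return 1 - p ** k


def attempts_needed(p, target):
    """Smallest k with 1 - p**k >= target, for 0 < p < 1 and 0 < target < 1."""
    k = 1
    while success_probability(p, k) < target:
        k += 1
    return k


# Assumed: half of all readings exceed ub_max, and 99% success is wanted.
print(attempts_needed(0.5, 0.99))
```

Tightening $ub_{max}$ raises $p$ and therefore the number of attempts, which is the trade-off named on the slide.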
63. Gossip-based Synchronization (based on [BPQS08])
    - Problem: scale to thousands of nodes
    - Solution: gossip-based algorithms (partial view)
    - Remote clock reading: Cristian approach [Cri89]
    - Digital signatures
    - Discrete clock adjustment
68. Gossip-based Synchronization – The Algorithm
    1. Obtain a random list of neighbors
    2. Use the remote clock reading to calculate offsets $O$
    3. Sort the offsets
    4. Adjustment: $\frac{1}{U - L} \sum_{i=L}^{U} O(i)$, with $L = \alpha|N|$, $U = |N| - L$, $0 \le \alpha < 0.5$
    5. Update local clock
    6. Loop
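The adjustment in step 4 is a trimmed mean over the sorted offsets. The sketch below makes two assumptions that the slide leaves open: $L = \alpha|N|$ is rounded down, and the sum runs over the half-open index range from $L$ up to but excluding $U$, so that exactly $U - L$ offsets are averaged. The function name and the sample offsets are hypothetical.

```python
def gossip_adjustment(offsets, alpha):
    """Mean of the sorted offsets after dropping the L lowest and L highest values."""
    if not offsets:
        raise ValueError("at least one offset is required")
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must satisfy 0 <= alpha < 0.5")
    n = len(offsets)
    L = int(alpha * n)  # assumed rounding: floor
    U = n - L
    kept = sorted(offsets)[L:U]
    return sum(kept) / (U - L)


# Five neighbor offsets, two of them outliers (for example corrupted processes).
print(gossip_adjustment([-50.0, 1.0, 2.0, 3.0, 40.0], alpha=0.2))
```

A larger $\alpha$ discards more extreme offsets and tolerates more corrupted neighbors, at the price of using fewer readings.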
69. Network Time Protocol – Goal & Definitions (based on [Mil91, Mil03])
    - Goal: accurate and precise time on a statistical basis with acceptable network overheads and instabilities in a large, diverse internet (interconnected) system. [Mil91]
    - Offset: $|H_p(t) - H_q(t)|$
    - Skew: $\frac{dH_p(t)}{dt} - \frac{dH_q(t)}{dt}$
    - Clock synchronization:
      - time synchronization: bounding offset
      - frequency synchronization: bounding skew
73. NTP – Configuration
    - Servers ordered into strata
    - Redundant paths tolerate link failures
    - SP algorithm
77. NTP – Reading Remote Clock
    - Round trip delay: $(D - A) - (C - B)$
    - Clock offset of $q$ with respect to $p$: $\theta = \frac{C + B}{2} - \frac{D + A}{2}$
    - Error: $\frac{(D - A) - (C - B)}{2}$
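The three quantities can be computed from one request/reply exchange. As the slide's figure is missing from the transcript, the timestamp roles are an assumption: `A` and `D` are taken on $p$'s clock when the request leaves and the reply arrives, `B` and `C` on $q$'s clock when the request arrives and the reply leaves. The function name and the values are illustrative.

```python
def ntp_sample(A, B, C, D):
    """Offset of q with respect to p, round trip delay, and maximum offset error."""
    delay = (D - A) - (C - B)
    offset = (C + B) / 2 - (D + A) / 2
    error = delay / 2
    return offset, delay, error


# Assumed values in milliseconds; q's clock is roughly 100 ms ahead of p's.
offset, delay, error = ntp_sample(A=0, B=105, C=106, D=11)
print(offset, delay, error)
```

The true offset lies within `offset ± error`; the estimate is exact only when both directions take equally long.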
78. NTP – Data Filtering
    - Problem: accurate offset from a sample population
    - Solution: minimum filter
      - order $m$ readings according to round trip delay
      - select the lowest round trip (first) reading
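A minimum filter can be sketched as a small sliding window over the most recent readings. The class name, the `(offset, delay)` tuple layout and the default window size `m=8` are assumptions for illustration, not values taken from the slides.

```python
from collections import deque


class MinimumFilter:
    """Keeps the m most recent (offset, delay) readings and selects by lowest delay."""

    def __init__(self, m=8):
        self.samples = deque(maxlen=m)  # the oldest reading is dropped automatically

    def add(self, offset, delay):
        self.samples.append((offset, delay))

    def ordered(self):
        """Readings ordered by round trip delay, lowest first."""
        return sorted(self.samples, key=lambda sample: sample[1])

    def best(self):
        """The first reading of the ordering, i.e. the one with the lowest delay."""
        return self.ordered()[0]


flt = MinimumFilter(m=3)
for offset, delay in [(100.4, 12.0), (99.8, 9.5), (103.0, 40.0), (100.1, 10.0)]:
    flt.add(offset, delay)
print(flt.best())
```

The intuition is that a reading with a short round trip leaves little room for asymmetric delays, so its offset error bound is the smallest.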
80. NTP – Peer Selection
    - Problem: select and combine best peers
    - Solution: calculate per peer statistics
      1. order peers by stratum and round trip delay
      2. filter dispersion: $\chi = \sum_{i=0}^{m-1} |\theta_i - \theta_0| \, 0.5^i$
      3. peer dispersion: $\forall j \in \{0, \dots, |N|-1\}: \chi_j = \sum_{k=0}^{|N|-1} |\theta_j^0 - \theta_k^0| \, 0.75^k$
      4. eliminate the peer with highest dispersion
      5. terminate if one peer left
      6. terminate if peer dispersion < minimum filter dispersion
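The elimination loop can be sketched as follows. The two dispersion functions follow the formulas above; the loop reads step 6 as "stop once the highest peer dispersion falls below the minimum filter dispersion", which is an interpretation. Function names and sample offsets are hypothetical, and the peers are assumed to be already ordered by stratum and round trip delay.

```python
def filter_dispersion(offsets, weight=0.5):
    """Weighted spread of one peer's readings around its selected reading offsets[0]."""
    theta0 = offsets[0]
    return sum(abs(theta - theta0) * weight ** i for i, theta in enumerate(offsets))


def peer_dispersion(j, selected, weight=0.75):
    """Weighted distance of peer j's selected offset to all peers' selected offsets."""
    return sum(abs(selected[j] - theta_k) * weight ** k
               for k, theta_k in enumerate(selected))


def select_peers(selected, min_filter_dispersion):
    """Indices of the peers that survive the elimination (steps 4 to 6)."""
    survivors = list(range(len(selected)))
    while len(survivors) > 1:
        offsets = [selected[i] for i in survivors]
        dispersions = [peer_dispersion(j, offsets) for j in range(len(offsets))]
        worst = max(range(len(dispersions)), key=dispersions.__getitem__)
        if dispersions[worst] < min_filter_dispersion:
            break
        del survivors[worst]
    return survivors


print(filter_dispersion([1.0, 2.0, 3.0]))
print(select_peers([1.0, 1.5, 9.0], min_filter_dispersion=1.0))
```

In the example the third peer disagrees strongly with the other two and is eliminated first; the remaining pair is close enough that the loop stops.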
82. NTP – Clock Correction (based on [Mil92])
    - Only one peer: directly apply offset
    - Several peers: weighted average of the peer offsets

          ClockValue cfn(offset[|N|], stratum[|N|], distance[|N|])
          {
            ClockValue tmp1;      // weight of the current peer
            ClockValue tmp2 = 0;  // sum of weights
            ClockValue tmp3 = 0;  // weighted sum of offsets

            for (i = 0; i < |N|; ++i) {
              tmp1 = 1 / (stratum[i] * MAXDISPERS + distance[i]);
              tmp2 += tmp1;
              tmp3 += tmp1 * offset[i];
            }
            return (tmp3 / tmp2);
          }
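The weighted average is easy to reproduce. In this sketch the value of `MAXDISPERS` is an assumed constant chosen for illustration, and every peer is assumed to have a strictly positive `stratum * MAXDISPERS + distance` so that the weights are defined. The function name and sample values are hypothetical.

```python
MAXDISPERS = 16.0  # assumed value, used only to make the example concrete


def combine_offsets(offsets, strata, distances, max_dispersion=MAXDISPERS):
    """Average of the peer offsets, weighted by 1 / (stratum * MAXDISPERS + distance)."""
    weights = [1.0 / (stratum * max_dispersion + distance)
               for stratum, distance in zip(strata, distances)]
    weighted_sum = sum(w * offset for w, offset in zip(weights, offsets))
    return weighted_sum / sum(weights)


# Two stratum-1 peers; the second is farther away and therefore counts half as much.
print(combine_offsets(offsets=[2.0, 4.0], strata=[1, 1], distances=[0.0, 16.0]))
```

Peers with a low stratum and a small distance dominate the result, so a distant peer can refine the correction but not override a nearby one.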
84. 84. Summary
    - Clock synchronization is a difficult problem
      - External clock synchronization has lower overheads
      - Internal clock synchronization is more robust
    - Clock synchronization is an important problem
      - For hard real-time applications
      - For wireless networks
    - Clock synchronization is practical
      - GPS
      - GSM (2G)
85. 85. Thank You!
86. 86. References I
    - Roberto Baldoni, Marco Platania, Leonardo Querzoni, and Sirio Scipioni. A peer-to-peer filter-based algorithm for internal clock synchronization in presence of corrupted processes. In PRDC 2008: 14th IEEE Pacific Rim International Symposium on Dependable Computing, pages 64–72. IEEE Computer Society, 2008.
    - Flaviu Cristian and Christof Fetzer. Probabilistic internal clock synchronization. In Proceedings of the Thirteenth Symposium on Reliable Distributed Systems (SRDS 1994), pages 22–31, October 1994.
    - F. Cristian and C. Fetzer. Fault-tolerant external clock synchronization. In ICDCS ’95: Proceedings of the 15th International Conference on Distributed Computing Systems, page 70, Washington, DC, USA, 1995. IEEE Computer Society.
    - Intel Corporation. IA-PC HPET (High Precision Event Timers) specification. Online, October 2004.
    - Flaviu Cristian. Probabilistic clock synchronization. Distributed Computing, 3(3):146–158, September 1989.
    - Christof Fetzer and Flaviu Cristian. An optimal internal clock synchronization algorithm. In Proceedings of the 10th Annual IEEE Conference on Computer Assurance (COMPASS 1995), pages 187–196, June 1995.
87. 87. References II
    - Rui Fan and Nancy A. Lynch. Gradient clock synchronization. Distributed Computing, 18(4):255–266, 2006.
    - Cary G. Gray and David R. Cheriton. Leases: An efficient fault-tolerant mechanism for distributed file cache consistency. In SOSP 1989: Proceedings of the twelfth ACM Symposium on Operating Systems Principles, pages 202–210, 1989.
    - Jeong-Hyon Hwang, Ugur Cetintemel, and Stan Zdonik. Fast and highly-available stream processing over wide area networks. In ICDE ’08: Proceedings of the 2008 IEEE 24th International Conference on Data Engineering, pages 804–813, Washington, DC, USA, 2008. IEEE Computer Society.
    - M. Jochim. Zeitig steuern - sichere Datenübertragung im Automobil. c’t Magazin für Computertechnik, 2(1):190–195, January 2007.
    - Hermann Kopetz and Günter Grünsteidl. TTP - A protocol for fault-tolerant real-time systems. Computer, 27(1):14–23, 1994.
    - H. Kopetz, A. Kruger, D. Millinger, and A. Schedl. A synchronization strategy for a time-triggered multi-cluster real-time system. In 14th Symposium on Reliable Distributed Systems, 1995. Proceedings, pages 154–161, Bad Neuenahr, Germany, September 1995.
88. 88. References III
    - C. M. Krishna, Kang G. Shin, and Ricky W. Butler. Ensuring fault tolerance of phase-locked clocks. IEEE Trans. Comput., 34(8):752–756, 1985.
    - Barbara Liskov. Practical uses of synchronized clocks in distributed systems. Distributed Computing, 6(4):211–219, 1993.
    - Jennifer Lundelius and Nancy A. Lynch. An upper and lower bound for clock synchronization. Information and Control, 62(2/3):190–204, 1984.
    - Leslie Lamport and P. M. Melliar-Smith. Synchronizing clocks in the presence of faults. J. ACM, 32(1):52–78, 1985.
    - Leslie Lamport, Robert Shostak, and Marshall Pease. The Byzantine generals problem. ACM Trans. Program. Lang. Syst., 4(3):382–401, 1982.
    - B. Liskov, L. Shrira, and J. Wroclawski. Efficient at-most-once messages based on synchronized clocks. In SIGCOMM ’90: Proceedings of the ACM symposium on Communications architectures & protocols, pages 41–49, New York, NY, USA, 1990. ACM.
    - David L. Mills. Internet time synchronization: the network time protocol. IEEE Transactions on Communications, 39(10):1482–1493, October 1991.
89. 89. References IV
    - David L. Mills. Network time protocol (version 3) specification, implementation and analysis, March 1992.
    - David L. Mills. A brief history of NTP time: memoirs of an internet timekeeper. SIGCOMM Comput. Commun. Rev., 33(2):9–21, 2003.
    - Alan Mislove, Ansley Post, Andreas Haeberlen, and Peter Druschel. Experiences in building and operating a reliable peer-to-peer application. In Yolande Berbers and Willy Zwaenepoel, editors, EuroSys, pages 147–159, Leuven, Belgium, April 2006. ACM.
    - A. Olson and K. G. Shin. Probabilistic clock synchronization in large distributed systems. IEEE Transactions on Computers, 43(9):1106–1112, September 1994.
    - K. G. Shin and P. Ramanathan. Transmission delays in hardware clock synchronization. IEEE Trans. Comput., 37(11):1465–1467, 1988.
    - Jennifer Lundelius Welch and Nancy Lynch. A new fault-tolerant algorithm for clock synchronization. Information and Computation, 77(1):1–36, 1988.
    - Josef Widder and Ulrich Schmid. Booting clock synchronization in partially synchronous systems with hybrid process and link failures. Distributed Computing, 20(2):115–140, May 2007.