The leap second is coming by Tomonori TAKADA [APRICOT 2015]

A presentation given at the Lightning Talks session of APRICOT 2015 on Thursday, 5th March 2015.

  • My name is Tomonori Takada and I am a system engineer from QTNet.
    QTNet is a telecom carrier serving individual and corporate customers mainly in the Kyushu region of Japan.
  • The main topic of today’s presentation is how leap seconds are addressed in NTP.
  • The IERS has announced that a leap second will be inserted this year.
  • NTP has a field called the Leap Indicator for handling leap seconds. For the leap second insertions in 2012 and this year, a packet with a Leap Indicator value of binary “01” is distributed from the upstream NTP server. Once a node receives the Leap Indicator, it uses the OS or kernel functionality on its own to adjust the time when the leap second is inserted. The Leap Indicator only gives notice that a leap second will be inserted; it is the OS or kernel that actually corrects the time.
  • The last time a leap second was inserted, several problems were reported, particularly in Linux environments. One of the problems was that the kernel would hang and reboot when it received the Leap Indicator. Another problem was that CPU usage skyrocketed after a leap second was inserted in an environment where specific applications were running.
    The first problem occurred with older versions of the OS or kernel. In fact, various distributors had given warnings beforehand. We were therefore able to prevent this problem by upgrading our OS and kernel. The second problem occurred with the newer versions of the OS or kernel. In either case, bugs in the OS or kernel came to the surface with the leap second insertion and the Leap Indicator.
  • Before going over QTNet’s response in 2012, let me talk about the operation of NTP at QTNet.
    QTNet runs two time servers as Stratum 1. One is the JJY Server, which acquires time information from NICT via PSTN. The other is the GPS Server, which acquires time information from GPS. These two servers are appliance servers.
    We run three Linux time servers as Stratum 2. As clock references, these servers use our two Stratum 1 servers as well as the time server provided by NICT via the Internet. They have been configured as peers to one another to equalize the time among the three servers.
    Over 100 QTNet servers have been configured to use these Stratum 2 servers as clock references.
  • Now let me talk about how QTNet responded to the leap second insertion in 2012. QTNet had decided not to use the Leap Indicator for leap second adjustment for two reasons:
    The first reason is that, as I have mentioned before, we had received information beforehand that older versions of the OS or kernel could hang when receiving the Leap Indicator. The second reason is that QTNet’s Stratum 1 time servers had the ability to adjust for leap seconds without using the Leap Indicator. This is called “Gradually Adjust mode.” With Gradually Adjust mode, the time servers start setting the clock back gradually about two hours before the leap second insertion and complete the one-second adjustment by the time of the insertion.
  • Based on this policy, QTNet changed a part of the NTP architecture before the leap second insertion. Specifically, the NICT time server was excluded as a clock reference because it distributes the Leap Indicator. This meant that our Stratum 2 servers used only the time servers that have the “Gradually Adjust mode” and do not pass on the Leap Indicator.
  • Now, I would like to show you what happened when a leap second was inserted in 2012. This slide shows the status as of 9:00 a.m. on the day of the leap second insertion.
    Fortunately, QTNet did not encounter any problems related to the Leap Indicator. However, another problem occurred: the peer status of the Stratum 2 servers failed. The question is: why did the peer status fail?
  • There are two possible reasons. First, the JJY and GPS Servers set the clock back at two different rates. When we checked with our product vendor later, we found out that the JJY Server was designed to start the leap second adjustment 125 minutes before the insertion, while the GPS Server was designed to start 120 minutes before the insertion. This could widen the difference in clock time between Stratum 2 servers that use the JJY and GPS Servers as clock references.
    Second, the polling interval that the Stratum 2 servers used toward Stratum 1 differed among the Stratum 2 servers. The interval had been variably configured between 64 seconds and 1,024 seconds. This usually causes no problems, but the longer the polling interval is, the longer it takes to correct the time when the Stratum 1 Server sets the clock back gradually. As a result, there could have been a gap in clock time between Stratum 2 Servers even though they used the same time source.
  • Now, what are we going to do to address the leap second insertion this year? QTNet is currently considering two options:
    Plan A is to use the Leap Indicator. Plan A is based on the assumption that we will upgrade the OS and kernel of all servers to the latest version beforehand. This is how they should always be, but in reality we have some servers with older versions of the OS or kernel.
    Plan B is not to use the Leap Indicator but instead to use the “Gradually Adjust mode” of the Stratum 1 servers, just like we did in 2012. We believe that we need to take the following steps to remedy the problem that we experienced.
    Step 1 is to use only one Stratum 1 server, so that the clock is set back at a single rate at Stratum 1. Step 2 is to shorten and unify the polling intervals for using Stratum 1 as a clock reference so that the clocks of the Stratum 2 servers are in sync with one another.
  • In this presentation, I quickly looked back on the problems that occurred during the leap second insertion in 2012 and how QTNet responded. I also talked about QTNet’s plans for the leap second insertion scheduled for this year. Whatever option we choose, it is important to check the specifications and constraints of the time servers, ntpd, and the OS and kernel.
    Whether we like it or not, a leap second will be inserted this year. I would like to wish everyone good luck in getting through the leap second insertion this year.
  • Thank you for listening.
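The two start times mentioned in the notes (125 minutes for the JJY Server, 120 minutes for the GPS Server) can be put into rough numbers. The following is a minimal sketch, assuming that each server spreads the one-second adjustment linearly over its own window; the talk only says the clock is set back "gradually", so the linear ramp and the function names `smear_offset` and `divergence` are assumptions made for illustration, not the vendor's actual behavior.

```python
from fractions import Fraction


def smear_offset(minutes_before_leap, window_minutes):
    """Fraction of the one-second adjustment already applied.

    Assumes a linear ramp that starts `window_minutes` before the leap
    second and finishes exactly at the insertion (an assumption, see above).
    """
    if minutes_before_leap >= window_minutes:
        return Fraction(0)  # adjustment has not started yet
    if minutes_before_leap <= 0:
        return Fraction(1)  # adjustment is complete
    return Fraction(window_minutes - minutes_before_leap, window_minutes)


def divergence(minutes_before_leap, window_a=125, window_b=120):
    """Clock difference in seconds between two servers with different windows."""
    return (smear_offset(minutes_before_leap, window_a)
            - smear_offset(minutes_before_leap, window_b))


# Scan the whole window, minute by minute, for the largest difference.
worst = max(range(0, 126), key=divergence)
print(worst, float(divergence(worst)))  # prints: 120 0.04
```

Under this assumption the two Stratum 1 servers disagree by at most 40 milliseconds, at the moment the later server starts its ramp, and the gap then shrinks to zero at the insertion. Any extra lag from long polling intervals on the Stratum 2 side would come on top of this figure.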

    1. The leap second is coming. Kyushu Telecommunication Network Co., Inc. (QTNet), Tomonori TAKADA
    2. Agenda
       • A leap second will be inserted this year.
       • How to adjust NTP for the leap second
       • The insertion of the previous leap second (2012)
       • QTNet NTP servers overview
       • QTNet plan for the leap second of 2012
       • QTNet plan for the leap second of 2015
    3. A leap second will be inserted this year
       June 30, 2015 (UTC): 23:59:58 23:59:59 23:59:60 00:00:00 00:00:01
       July 1, 2015 +0900 (JST): 08:59:58 08:59:59 08:59:60 09:00:00 09:00:01
    4. How to adjust NTP for the leap second
       [Figure: a description of the NTP/SNTP Version 4 message format; value and meaning of the Leap Indicator. ※ adapted from http://www.rfc-base.org/txt/rfc-2030.txt]
       • The Leap Indicator (LI) is used to adjust for the leap second.
       • A node that receives “LI=01” adjusts its host clock by itself, using OS/kernel functionality, when the leap second is inserted.
       • LI only indicates that a leap second will happen.
    5. The insertion of the previous leap second (2012)
       • The previous leap second was inserted on July 1, 2012 (JST, +0900).
       • Several problems were reported, especially on Linux.
         1. OS rebooted after it received the Leap Indicator.
         2. CPU usage reached 100% on multiple hosts running some specific applications after the leap second was inserted.
    6. QTNet NTP Servers overview [diagram]
       Stratum 1: GPS time server; JJY (Dial-up) time server via PSTN
       Stratum 2: ntp server 1, ntp server 2, ntp server 3 (peers)
       Clients: QTNet Servers
       Internet
       ※NICT is an organization that determines and maintains Japan Standard Time (JST), and distributes JST in various ways.
    7. QTNet Plan for the leap second of 2012
       • We did not use the Leap Indicator.
       • We had received information about bugs in old OS/kernel versions that were related to the Leap Indicator.
       • Our Stratum 1 servers could adjust for a leap second without using the Leap Indicator (“Gradually adjust mode”). About two hours before the leap second, our Stratum 1 servers began to set the clock back slowly.
    8. QTNet Plan for the leap second of 2012 [diagram]
       Stratum 1: GPS time server (Gradually adjust mode); JJY (Dial-up) time server via PSTN (Gradually adjust mode)
       Stratum 2: ntp server 1, ntp server 2, ntp server 3 (peers)
       Clients: QTNet Servers
       Excluded from synchronizing: Leap Indicator distribution
    9. Status at the leap second insertion in 2012 [diagram]
       Stratum 1: GPS time server (Gradually adjust mode); JJY (Dial-up) time server via PSTN (Gradually adjust mode)
       Stratum 2: ntp server 1, ntp server 2, ntp server 3 (peers)
       Clients: QTNet Servers
       Status marks: ○ ○ ○ × ×
    10. The reasons for the peer status failure [diagram]
       Stratum 1: GPS time server; JJY (Dial-up) time server via PSTN
       Stratum 2: ntp server 1, ntp server 3, ntp server 2
       Start time for setting the clock back: 125 minutes before; 120 minutes before
       Polling interval: 1/64 sec, 1/128 sec, 1/128 sec
    11. Possible plans for the leap second of 2015
       • Plan A: Using the Leap Indicator
         • We have to update our servers’ OS/kernel.
       • Plan B: Not using the Leap Indicator (using “Gradually adjust mode”)
         • Single time source (Stratum 1).
         • We shorten the polling interval from Stratum 2 to Stratum 1 (to about every 16 seconds).
    12. The leap second is coming.
       June 30, 2015 (UTC): 23:59:58 23:59:59 23:59:60 00:00:00 00:00:01
       July 1, 2015 +0900 (JST): 08:59:58 08:59:59 08:59:60 09:00:00 09:00:01
    13. Let's do our best.
    14. Appendix
       Sites in JPN
       • http://ringeye.jawfish.org/~ori/misc/leapsecond-20120701.html
       • http://chicchaki.cocolog-nifty.com/kanetamas_memo/2005/11/ntp_2b1d.html
       • http://www.rfc-base.org/txt/rfc-2030.txt
       • https://www.seiko-sol.co.jp/support/download/LeapSecond20120701_2520_30_40.pdf
       • http://www-06.ibm.com/jp/linux/tech/doc/0019db89.html
       Sites in English
       • http://jjy.nict.go.jp/time/teljjy/teljjy_p1-e.html
       • http://googleblog.blogspot.jp/2011/09/time-technology-and-leaping-seconds.html
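Both the Leap Indicator from slide 4 and the polling interval from slides 10 and 11 are carried in the first bytes of every NTP/SNTP packet, so they can be checked on the wire. The following is a minimal sketch based on the RFC 2030 header layout (LI, version and mode packed into the first byte, followed by stratum, poll and precision). The function name `parse_ntp_header` and the sample bytes are hypothetical, made up for illustration, and are not taken from QTNet's servers.

```python
import struct

# Leap Indicator values as listed in RFC 2030.
LEAP_MEANINGS = {
    0: "no warning",
    1: "last minute of the day has 61 seconds",
    2: "last minute of the day has 59 seconds",
    3: "alarm condition (clock not synchronized)",
}


def parse_ntp_header(packet: bytes) -> dict:
    """Decode the first four bytes of an NTP/SNTP packet."""
    if len(packet) < 4:
        raise ValueError("need at least 4 bytes of NTP header")
    # Byte 0: LI (2 bits), version (3 bits), mode (3 bits).
    # Byte 1: stratum. Bytes 2 and 3: poll and precision, signed log2 seconds.
    flags, stratum, poll, _precision = struct.unpack("!BBbb", packet[:4])
    leap = flags >> 6
    return {
        "leap_indicator": leap,
        "leap_meaning": LEAP_MEANINGS[leap],
        "version": (flags >> 3) & 0x7,
        "mode": flags & 0x7,
        "stratum": stratum,
        "poll_seconds": 2 ** poll,
    }


# Hypothetical server reply: LI=01, version 4, mode 4 (server), stratum 1, poll 2^6 s.
sample = bytes([0b01_100_100, 1, 6, 0xEC])
print(parse_ntp_header(sample))
```

A poll value of 4 corresponds to the 16-second interval mentioned for Plan B, and values of 6 and 10 correspond to the 64-second and 1,024-second intervals seen in 2012.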
