There is an ominous rumbling across the internet about 768K Day. Some have termed it
internet doomsday; others call it the “Y2K” of the internet. The fear is justified given the
widespread outages on 512K Day, when the internet BGP table size exceeded 512,000
routes. 512K Day caused havoc: many routers simply exhausted their TCAM (ternary
content-addressable memory) and were unable to process certain routes, leaving parts
of the internet unreachable. The same issue seems possible again this year, when the
internet routing table exceeds 768K routes. Some predict that 768K Day will fall on
August 12, echoing 512K Day, which happened on August 12, 2014 [1].
Some Internet Outages are predicted
While the majority of tier 1 ISPs were caught off guard on 512K Day, this should not be
the case this time around. There are mechanisms within BGP route configuration to
protect routers from exhausting TCAM, and ISPs have presumably patched their legacy
routers. However, this is a temporary fix and may come with strings attached, for
example on reachability. A more permanent fix requires routers to accept the full
internet routing table in their forwarding tables. This requires a special memory known
as “TCAM”. Older routers are built with limited TCAM and hence cannot respond as
quickly; they may exhaust their resources and eventually fail or be unable to process
certain prefixes. A typical outage could look something like the following diagram.
Figure 1. [Courtesy ThousandEyes] This diagram appears on the ThousandEyes blog at
https://blog.thousandeyes.com/what-is-768k-day/ and depicts a recent outage in the Bay Area.
A blog post by ThousandEyes [4] reported that its author observed packet loss on
several interfaces in the Cogent (AS 174) network in the San Francisco Bay Area. As a
result, many peer networks such as Comcast, Qwest, Amazon and 8x8 were affected.
Recently, several media outlets, including CSO [2] and Computerworld [3], reported an
outage in Australia and claimed it was directly related to BGP prefixes reaching 768K.
There also seems to be an increase in internet outages reported on Twitter [5]. Proper
analysis could ascertain how many of the reported outages are due to BGP prefix
issues. Nonetheless, the BGP table keeps growing, and this will cause reachability
issues on routers that lack a larger forwarding table.
According to the CIDR Report (a website that tracks global BGP routes), the BGP
table has already exceeded 768K, with the current table size shown as 783K as of June
11th, 2019 [6]. However, this report is not official and may include duplicates.
Irrespective of the numbers in the CIDR Report, I can confirm that the majority of the
customers I have talked to are looking to replace or add edge routers with a table size
of 750K+ for IPv4 and around 65K for IPv6. Hence, it should go without saying that one
should be aware of the issue and take precautions before 768K Day arrives.
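The comparison behind these numbers is simple arithmetic. The Python sketch below checks a reported table size against a few forwarding-table capacities. The function name `fib_headroom` and the capacity labels are illustrative assumptions; the only figures taken from this article are the 783K table size and the 512K, 768K and 1 million limits, all read here as decimal thousands.

```python
def fib_headroom(table_size: int, capacity: int) -> int:
    """Return free table entries; a negative value means the table no longer fits."""
    return capacity - table_size


# Table size quoted from the CIDR Report in this article (783K, read as 783,000).
REPORTED_IPV4_ROUTES = 783_000

# Hypothetical labels; the values are the limits discussed in this article.
CAPACITIES = {
    "512K limit": 512_000,
    "768K limit": 768_000,
    "1M with external TCAM": 1_000_000,
}

for label, capacity in CAPACITIES.items():
    free = fib_headroom(REPORTED_IPV4_ROUTES, capacity)
    status = "fits" if free >= 0 else "overflows"
    print(f"{label}: {status} ({free:+,} entries)")
```

Running the same check against the table size your own routers actually receive, rather than a public report, gives a more reliable picture, since the count differs from one vantage point to another.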
Under the hood
Internet routers generally process routes using two tables in conjunction with routing
protocols: the RIB (routing information base) and the FIB (forwarding information
base). While the RIB is part of the control plane and is generally handled by the
network operating system (NOS), much of the table lookup and packet processing is
done at the FIB level, which is part of the routing hardware and resides within the
pipeline of the merchant silicon or ASIC.
Figure 2. Routing functions including RIB and FIB processing.
If the lookup engine (TCAM) within the ASIC pipeline cannot hold the required number
of IPv4 entries, the RIB may flood the table, causing an overflow. With patches, routers
may be able to control processing and lookups in the ASIC pipeline. However, such
patches are a temporary fix that only protects routers from failing. A more permanent
fix is to connect the ASIC packet processing pipeline to external TCAM over a
high-speed bus. Older ASICs lack this capability, which leaves routers more dependent
on software and limited in table size.
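The overflow problem can be illustrated with a toy model. The Python sketch below is an assumption-laden illustration, not how any particular router or ASIC behaves: the class `TinyFib`, its methods and the peer names are hypothetical, the lookup is a linear scan rather than a parallel TCAM match, and real devices react to a full table in vendor-specific ways. It shows only the core idea that a route present in the RIB but not installed in the FIB leaves its prefix unreachable.

```python
import ipaddress
from typing import Dict, List, Optional


class TinyFib:
    """Toy forwarding table with a fixed number of entries (stand-in for TCAM)."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self.entries: Dict[ipaddress.IPv4Network, str] = {}
        self.rejected: List[ipaddress.IPv4Network] = []

    def install(self, prefix: str, next_hop: str) -> bool:
        """Install a route from the RIB; return False if the table is full."""
        net = ipaddress.ip_network(prefix)
        if net not in self.entries and len(self.entries) >= self.capacity:
            self.rejected.append(net)  # route stays in the RIB but not in hardware
            return False
        self.entries[net] = next_hop
        return True

    def lookup(self, address: str) -> Optional[str]:
        """Longest-prefix match by linear scan over the installed entries."""
        addr = ipaddress.ip_address(address)
        matches = [n for n in self.entries if addr in n]
        if not matches:
            return None
        return self.entries[max(matches, key=lambda n: n.prefixlen)]


# The RIB holds three routes, but the toy FIB only has room for two.
rib = [
    ("10.0.0.0/8", "peer-a"),
    ("10.1.0.0/16", "peer-b"),
    ("192.0.2.0/24", "peer-c"),
]
fib = TinyFib(capacity=2)
for prefix, next_hop in rib:
    fib.install(prefix, next_hop)

print(fib.lookup("10.1.2.3"))    # most specific installed prefix wins
print(fib.lookup("192.0.2.10"))  # route was rejected, so there is no match
```

In this model the third route is simply dropped; the point is that whichever routes miss the hardware table become the unreachable parts of the internet described earlier.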
Solution: Can a Whitebox Switch Help?
Disaggregation, or whitebox, is the best solution to this problem. Buy the hardware of
your choice, here a router/switch, from your preferred vendor and select the software
or Network Operating System (NOS) separately. The benefit of whitebox, or
disaggregation for that matter, is that it lets you select best-of-breed merchant silicon
and buy it on your terms at a price point you can afford. As a result, you get the best of
both worlds: hardware and software. For a larger IPv4 table, Broadcom® Qumran-MX™
silicon with the BCM52311™ Knowledge-Based Processor (KBP) [TCAM] is an optimal
choice for up to 1 million IPv4 routes. There are cases where up to 1.2 million routes
are possible in such a system.
Figure 3. Edge Router Whitebox based on Broadcom® Qumran-MX.
A number of hardware vendors currently offer Qumran-MX based platforms with an
industry-proven NOS from companies such as IP Infusion. As depicted in the figure
above, the merchant silicon, here Broadcom® Qumran-MX™, is connected through an
internal bus (known as the ELK bus) to external TCAM, which provides additional
lookup capacity.
However, it is also important to select a software vendor that has optimized its NOS
for such boxes and supports more than 768K routes, to facilitate your upgrade or help
you prepare for 768K Day.
IP Infusion’s OcNOS™ has been tested with a number of hardware vendors, giving you
a wide selection. Please ensure you select appropriate Qumran-MX based hardware
with external TCAM to achieve up to 1 million routes. If you are interested in OcNOS
and how it can help you with 768K Day, you may visit their website at
https://www.ipinfusion.com.
However, please make sure you ask each vendor to provide a test report, or at least
enough data to make an educated decision.
References
[1] Some internet outages predicted for the coming month as '768k Day' approaches.
Available at https://www.zdnet.com/article/some-internet-outages-predicted-for-the-coming-month-as-768k-day-approaches/
[2] CSO. Australian Internet Users Face Looming ‘768k Day’. Available at
https://www.cso.com.au/mediareleases/34669/australian-internet-users-face-looming-768k-day/
[3] Computerworld. Australian Internet Users Face Looming ‘768k Day’. Available at
https://www.computerworld.com.au/mediareleases/34669/australian-internet-users-face-looming-768k-day/
[4] ThousandEyes. What is 768K Day, and Will It Cause Internet Outages? Available at
https://blog.thousandeyes.com/what-is-768k-day/
[5] Internet outage tag at Twitter. Available at
https://twitter.com/search?q=internet%20outage&src=tyah
[6] CIDR, 2019. CIDR Report for June 11, 2019. Available at
https://www.cidr-report.org/as2.0/