An Approach to
Routing in a Clos
Randy Bush <randy@psg.com>
IIJ Research & Arrcus
This Works
2019.02.25 Clos 1 Creative Commons: Attribution & Share Alike
Ethernet
This Might Work
Ethernet Ethernet Ethernet
This Won’t Work
This Works (Clos Network)
WAN
Spine
External
TORs
Clos is Not an Acronym
Clos, Charles (Mar 1953). “A study of non-blocking switching networks”.
Bell System Technical Journal. 32 (2): 406–424
For Example:
IIJ is Building a Second
Medium Scale Data
Center (MSDC)
in Shiroi/Chiba
Capacity of 6k Racks
How Do You Route
In Something of
This Scale?
OSPF OK to 500 Nodes
IS-IS good to 1,000
Limited Because They
Repeatedly Flood
Everything
Your Clos on IS-IS or OSPF
BGP Scales Because
It Signals
Only Changes
So BGP has become
common in MSDCs
BGP Is Great as
Updates are Infrequent
ECMP can be Very Wide
32, 64, even 128
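A wide ECMP set only helps if flows are spread across its members. The sketch below shows one common approach: hash the flow 5-tuple and take the result modulo the number of next hops. It is a minimal illustration, not how any particular switch does it; the `pick_next_hop` function, the `spine…` names, and the use of SHA-256 are assumptions made for the example.

```python
import hashlib

def pick_next_hop(flow, next_hops):
    """Map a flow 5-tuple onto one member of an ECMP set (illustrative only)."""
    # Build a stable byte key from the flow fields.
    key = "|".join(str(field) for field in flow).encode()
    digest = hashlib.sha256(key).digest()
    # Use the first 8 bytes of the digest as an integer, then reduce modulo the set size.
    index = int.from_bytes(digest[:8], "big") % len(next_hops)
    return next_hops[index]

# Hypothetical 64-wide ECMP set.
spines = [f"spine{i}" for i in range(64)]
# (source IP, destination IP, protocol, source port, destination port)
flow = ("10.0.0.1", "10.0.1.2", 6, 49152, 443)
chosen = pick_next_hop(flow, spines)
```

Because the hash depends only on the flow fields, every packet of one flow takes the same next hop, while different flows spread across the set.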
But What is the
Decision Process?
Do You Want to Write BGP
Policy for Massive ECMP?
Consult the Professor
Shortest Path First
Edsger W Dijkstra
1930-2002
BGP-SPF
The Path Calculation of IS-IS
With the Update Rate of BGP
SPF?
I thought BGP was path
vector, not link state!
s/Best Path/SPF/
• New SAFI
• NLRI format is exactly the same as the BGP-LS (RFC 7752) Address Family, to carry link-state information
• BGP runs Dijkstra instead of the Best Path Decision Process
• BGP MP (new SAFI) and BGP-LS Node attribute for compatibility
• Peering Models: eBGP, iBGP, RR
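To make "Dijkstra instead of Best Path" concrete, here is a minimal sketch of a shortest-path-first run that also keeps every equal-cost first hop, which is what ECMP needs. It is not taken from the BGP-SPF draft: the `spf` function, the dictionary-based graph, and the two-TOR, two-spine topology are assumptions for illustration, and all link metrics are assumed to be positive.

```python
import heapq

def spf(graph, root):
    """Dijkstra from root; returns {node: (cost, set of equal-cost first hops)}.

    graph maps node -> {neighbor: metric}; metrics are assumed positive.
    """
    dist = {root: 0}
    first_hops = {root: set()}
    heap = [(0, root)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue  # stale heap entry
        done.add(node)
        for neigh, metric in graph.get(node, {}).items():
            new_cost = cost + metric
            # A direct neighbor of the root is its own first hop;
            # otherwise inherit the first hops of the node we came through.
            hops = {neigh} if node == root else first_hops[node]
            if neigh not in dist or new_cost < dist[neigh]:
                dist[neigh] = new_cost
                first_hops[neigh] = set(hops)
                heapq.heappush(heap, (new_cost, neigh))
            elif new_cost == dist[neigh]:
                first_hops[neigh] |= hops  # another equal-cost path
    return {n: (dist[n], first_hops[n]) for n in dist}

# Hypothetical two-TOR, two-spine Clos fragment with unit metrics.
graph = {
    "tor1": {"spine1": 1, "spine2": 1},
    "tor2": {"spine1": 1, "spine2": 1},
    "spine1": {"tor1": 1, "tor2": 1},
    "spine2": {"tor1": 1, "tor2": 1},
}
routes = spf(graph, "tor1")
```

From `tor1`, the entry for `tor2` has cost 2 with both spines as first hops, so no policy is needed to obtain the ECMP set.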
Neighbor
Distribution
Route Reflection
Outbound Policy
Inbound Policy
Link State
AS-Path Length
EGP vs IGP
Arrival Order
Non-deterministic
MED
IGP metric
Tie Break
BGP4 Classic
Neighbor
Distribution
Route Reflection
Outbound Policy
Inbound Policy
Link State
AS-Path Length
EGP vs IGP
Arrival Order
Non-deterministic
MED
IGP metric
Tie Break
SPF
Removed!
BGP-SPF
BGP-SPF
• Next-Hop and Path Attributes come for free with the BGP Link-State Address Family
• Needed for RFC 4271 error handling
• Decision Process Phases 1 and 2 (best path) are replaced by the SPF algorithm (AKA Dijkstra)
• Decision Process Phase 3 (tie break) may be skipped, as NLRI is unique per BGP speaker
• Need to ensure the most recent version of an NLRI is always used and re-advertised
• Augmented with sequence numbers
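The "most recent version" rule can be sketched as a link-state database that accepts an update only when its sequence number is newer than the stored one. This is an illustration under assumptions, not the draft's encoding or comparison rule: the `accept_update` function, the string NLRI keys, and the plain integer sequence numbers are all hypothetical.

```python
def accept_update(lsdb, nlri_key, seq, attrs):
    """Store (seq, attrs) for nlri_key only if seq is newer than what is held.

    Returns True when the update was accepted (and so should be re-advertised),
    False when it is stale or a duplicate.
    """
    current = lsdb.get(nlri_key)
    if current is not None and seq <= current[0]:
        return False  # older or same version: ignore
    lsdb[nlri_key] = (seq, attrs)
    return True

# Hypothetical link NLRI key and attributes.
lsdb = {}
first = accept_update(lsdb, "link:tor1-spine1", 1, {"metric": 1})
stale = accept_update(lsdb, "link:tor1-spine1", 1, {"metric": 9})
newer = accept_update(lsdb, "link:tor1-spine1", 2, {"metric": 5})
```

Only accepted updates would be fed to the SPF run and passed on to peers, so a delayed copy of an old NLRI cannot overwrite newer state.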
BGP-SPF
• Starting with a greatly simplified SPF with P2P-only links in a single area (i.e., SPT)
• Should scale very well to many use cases
• Could support computation of LFAs, Segment Routing SIDs, and other IGP features
• BGP-LS format includes necessary Link-State
• Link-State AF is a dual-stack AF, since both IPv4 and IPv6 addresses/prefixes are advertised
• BGP-LS format also supports VPNs, but SPF behavior is not defined
• Work needed to define interaction with existing unicast AFs
• Matter of local implementation policy
Peering Model
• BGP sessions, optionally with a Route-Reflector or controller hierarchy
• Link discovery/liveness detection outside of BGP
• RR hierarchy can be less than fully connected but must provide redundancy
• Must not be dependent on SPF for connectivity
• Controller could learn the expected topology through some other means and inject it
• SPF computation is still distributed, though
• Similar to “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network”
BTW, Every Rack
is (often) an AS
Get Over It
How Does BGP-SPF
Learn Link State?
Motivation
• BGP-SPF needs link neighbor discovery, liveness, and addressability
• LLDP is an IEEE protocol, complex, and ‘hard’ (IPR) to extend past 1500 bytes
• We wanted something simple and saw no real need for the complexities of CLNP, …
• So we propose a new EtherType with TLVs
• We discuss Ether payloads, not framing
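As a rough picture of what "an EtherType payload made of TLVs" means, the sketch below packs and parses type-length-value records. The layout (1-byte type, 2-byte big-endian length) and the type codes are assumptions chosen for the example; the slides do not give the actual PDU format.

```python
import struct

# Hypothetical type codes, for illustration only.
TLV_HELLO = 1
TLV_IPV4_ADDR = 2

def encode_tlvs(items):
    """Serialize (type, value-bytes) pairs as type(1) | length(2) | value."""
    out = bytearray()
    for tlv_type, value in items:
        out += struct.pack("!BH", tlv_type, len(value)) + value
    return bytes(out)

def decode_tlvs(payload):
    """Parse a payload back into (type, value-bytes) pairs, checking bounds."""
    items = []
    offset = 0
    while offset < len(payload):
        if offset + 3 > len(payload):
            raise ValueError("truncated TLV header")
        tlv_type, length = struct.unpack_from("!BH", payload, offset)
        offset += 3
        if offset + length > len(payload):
            raise ValueError("truncated TLV value")
        items.append((tlv_type, payload[offset:offset + length]))
        offset += length
    return items

payload = encode_tlvs([
    (TLV_HELLO, b"tor1"),
    (TLV_IPV4_ADDR, bytes([10, 0, 0, 1])),
])
decoded = decode_tlvs(payload)
```

A receiver can skip TLV types it does not understand by using the length field, which is one reason TLV formats are easy to extend.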
Topology / Routing Stack
[Figure: three Devices, each with Ether PDUs, Link Check, AFI/SAFIs, and BGP-SPF layers; BGP-SPF over TCP]
MAC Link State exchanged over raw Ethernet and pushed up stack
Add the AFI/SAFI data
IP-Level Liveness Check
BGP-SPF uses link data to discover and build the topology database
East West Protocol
[Figure: three Devices, each with Ether PDUs, Link Check, AFI/SAFIs, and BGP-SPF layers; BGP-SPF over TCP]
BGP-LS for BGP-SPF
Link State / Topology
Repackage to New BGP NLRI
RFC 7752
Links / Nodes / Prefixes
BGP-SPF
North/South Protocol
[Figure: three Devices, each with Ether PDUs, 7752 Shim, Link Check, AFI/SAFIs, and BGP-SPF layers; BGP-SPF over TCP]
BTW, There is No IPR

An Approach to Routing in a Clos

  • 1. An Approach to Routing in a Clos Randy Bush <randy@psg.com> IIJ Research & Arrcus
  • 2. This Works 2019.02.25 Clos 1Creative Commons: Attribution & Share Alike Ethernet
  • 3. This Might Work 2019.02.25 Clos 2Creative Commons: Attribution & Share Alike Ethernet Ethernet Ethernet
  • 4. This Won’t Work 2019.02.25 Clos 3Creative Commons: Attribution & Share Alike
  • 5. This Works (Clos Network) 2019.02.25 Clos 4Creative Commons: Attribution & Share Alike WAN S p i n e External TORs
  • 6. Clos is Not an Acronym 2019.02.25 Clos 5Creative Commons: Attribution & Share Alike Clos, Charles (Mar 1953) “A study of non-blocking switching networks” Bell System Technical Journal. 32 (2): 406–424
  • 7. For Example: IIJ is Building a Second Medium Scale Data Center (MSDC) in Shiroi/Chiba. Capacity of 6k Racks
  • 8. How Do You Route In Something of This Scale?
  • 9. OSPF OK to 500 Nodes. IS-IS good to 1,000. Limited Because They Repeatedly Flood Everything
  • 10. Your Clos on IS-IS or OSPF
  • 11. BGP Scales Because It Signals Only Changes. So BGP has become common in MSDCs
  • 12. BGP Is Great as Updates are Infrequent (diagram: WAN)
  • 13. ECMP can be Very Wide: 32, 64, even 128 (diagram: WAN)
  • 14. But What is the Decision Process? (diagram: WAN)
  • 15. Do You Want to Write BGP Policy for Massive ECMP? (diagram: WAN)
  • 16. Consult the Professor: Shortest Path First. Edsger W Dijkstra, 1930-2002
  • 17. BGP-SPF: The Path Calculation of IS-IS With the Update Rate of BGP
  • 18. SPF? I thought BGP was path vector, not link state!
  • 19. s/Best Path/SPF/
      • New SAFI
      • NLRI format exactly same as BGP-LS (RFC 7752) Address Family to carry link state information
      • BGP runs Dijkstra instead of Best Path Decision process
      • BGP MP (new SAFI) and BGP-LS Node attribute for compatibility
      • Peering Models: eBGP, iBGP, RR
  • 20. BGP4 Classic (diagram labels: Neighbor Distribution Route Reflection Outbound Policy Inbound Policy Link State AS-Path Length EGP vs IGP Arrival Order Non-deterministic MED IGP metric Tie Break)
  • 21. BGP-SPF (diagram labels: Neighbor Distribution Route Reflection Outbound Policy Inbound Policy Link State AS-Path Length EGP vs IGP Arrival Order Non-deterministic MED IGP metric Tie Break SPF Removed!)
  • 22. BGP-SPF
      • Next-Hop and Path Attributes come for free with BGP Link-State Address Family
      • Needed for RFC 4271 error handling
      • Decision Process Phases 1 and 2 (best path) replaced by SPF algorithm (AKA Dijkstra)
      • Decision Process Phase 3 (tie break) may be skipped as NLRI is unique per BGP speaker
      • Need to assure the most recent version of NLRI is always used and re-advertised
      • Augmented with sequence numbers
  • 23. BGP-SPF
      • Starting with greatly simplified SPF with P2P only links in single area (i.e., SPT)
      • Should scale very well to many use cases
      • Could support computation of LFAs, Segment Routing SIDs, and other IGP features
      • BGP-LS format includes necessary Link-State
      • Link-State AF is dual-stack AF since both IPv4 and IPv6 addresses/prefixes advertised
      • BGP-LS format also supports VPNs but SPF behavior not defined
      • Work needed to define interaction with existing unicast AFs
      • Matter of local implementation policy
  • 24. Peering Model
      • BGP sessions, optionally with Route-Reflector or controller hierarchy
      • Link discovery/liveness detection outside of BGP
      • RR hierarchy can be less than fully connected but must provide redundancy
      • Must not be dependent on SPF for connectivity
      • Controller could learn the expected topology through some other means and inject it
      • SPF Computation is distributed though
      • Similar to “Jupiter Rising: A Decade of Clos Topologies and Centralized Control in Google’s Datacenter Network”
  • 25. BTW, Every Rack is (often) an AS. Get Over It
  • 26. How Does BGP-SPF Learn Link State?
  • 27. Motivation
      • BGP-SPF needs link neighbor discovery, liveness, and addressability
      • LLDP is an IEEE protocol, complex, and ‘hard’ (IPR) to extend past 1500 bytes
      • We wanted something simple and saw no real need for the complexities of CLNP, …
      • So we propose a new EtherType with TLVs
      • We discuss Ether payloads, not framing
  • 28. Topology / Routing Stack (diagram: three devices, each with layers labeled Ether PDUs, Link Check, AFI/SAFIs, BGP-SPF, TCP, and MAC). Link State exchanged over raw Ethernet and pushed up stack. Add the AFI/SAFI data. IP-Level Liveness Check. BGP-SPF uses link data to discover and build the topology database
  • 29. East West Protocol (diagram: three devices, each with layers labeled Ether PDUs, Link Check, AFI/SAFIs, BGP-SPF, and TCP)
  • 30. BGP-LS for BGP-SPF (diagram labels: Link State / Topology Repackage to New BGP NLRI RFC 7752 Links / Nodes / Prefixes BGP-SPF)
  • 31. North/South Protocol (diagram: three devices, each with layers labeled Ether PDUs, Link Check, AFI/SAFIs, 7752 Shim, BGP-SPF, and TCP)
  • 32. BTW, There is No IPR
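
The BGP-SPF slides (19, 22, 23) describe replacing best-path selection with Dijkstra over point-to-point links, keeping only the most recent advertisement per speaker by sequence number, and the ECMP slide (13) notes that the resulting fan-out can be very wide. A minimal Python sketch of that idea follows. It is an illustration under assumptions, not the BGP-SPF specification: the names `install` and `spf`, the dict-based link-state database, the positive integer metrics, and the two-way link check are all hypothetical choices made for this example, and no BGP-LS NLRI encoding is modeled.

```python
import heapq


def install(lsdb, origin, seq, links):
    """Store an advertisement only if it is newer than the one we hold.

    lsdb maps origin -> (sequence number, {neighbor: metric}).
    Hypothetical structure; real BGP-SPF carries this in BGP-LS NLRI.
    """
    current = lsdb.get(origin)
    if current is None or seq > current[0]:
        lsdb[origin] = (seq, dict(links))
        return True
    return False  # stale or duplicate advertisement is ignored


def spf(lsdb, root):
    """Dijkstra from root; returns {node: (cost, set of ECMP first hops)}.

    Assumes every metric is a positive integer.
    """
    dist = {root: 0}
    first_hops = {root: set()}
    heap = [(0, root)]
    done = set()
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue
        done.add(node)
        links = lsdb[node][1] if node in lsdb else {}
        for nbr, metric in links.items():
            # Assumed two-way check: use a link only if both ends advertise it.
            if nbr not in lsdb or node not in lsdb[nbr][1]:
                continue
            new_cost = cost + metric
            # Links leaving the root define the first hop; otherwise inherit.
            hops = {nbr} if node == root else first_hops[node]
            if nbr not in dist or new_cost < dist[nbr]:
                dist[nbr] = new_cost
                first_hops[nbr] = set(hops)
                heapq.heappush(heap, (new_cost, nbr))
            elif new_cost == dist[nbr]:
                first_hops[nbr] |= hops  # equal cost: widen the ECMP set
    return {n: (dist[n], first_hops[n]) for n in dist}


# A toy Clos: two spines, three TORs, every link with metric 1.
lsdb = {}
for spine in ("s1", "s2"):
    install(lsdb, spine, 1, {"t1": 1, "t2": 1, "t3": 1})
for tor in ("t1", "t2", "t3"):
    install(lsdb, tor, 1, {"s1": 1, "s2": 1})

routes = spf(lsdb, "t1")
print(routes["t2"])  # cost and ECMP first hops from t1 to t2
```

Re-running `spf` after `install(lsdb, "s1", 2, {"t1": 1, "t3": 1})` models spine s1 withdrawing its link to t2: the newer sequence number replaces the old advertisement, and t1 then reaches t2 through s2 only. Every node runs the same computation on the same database, which matches the slides' point that the SPF computation stays distributed even when a route reflector or controller distributes the link state.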