RIFT A New Approach to Building DC Fabrics

MyNOG
© 2018 Juniper Networks
RIFT
A new approach to building DC fabrics
Nitin Vig
Chief Architect, Juniper Networks
AGENDA
Datacenter Fabric Trends
Introduction to RIFT
RIFT key features
Industry status
Summary
DATACENTER FABRIC - TRENDS
Hybrid Clouds are here to stay
• Hybrid cloud for many reasons, one of them to keep real-estate from Hyper scalers
• Customers are hosting their content & critical business processes; Need to build own fabrics
• Impossible to sustain proprietary OPEX efforts
Fabrics are becoming Uniform, Local & Regular
• Vast amount of bandwidth close to the producer & consumer necessary
• Fabric architectures being adopted outside the conventional DC (Metro, PoP)
• WAN-style Traffic Engineering & protection replaced by Wide Fan-out & distributed systems redundancy
Fabric is the new “RAM chip”
• No one configures RAM banks manually in every laptop
• IP fabrics HW is largely commodity already
• IP fabrics will “OPEX commoditize” (consume bandwidth)
DATACENTER FABRIC – TECHNOLOGY EVOLUTION
Tree to CLOS topology
• Tree: core/aggregation/access layers
• Folded CLOS or Fat Trees: Spine & Leaf
Layer2 switching to Layer3 routing
• Layer 3 routing underlay with Layer2/3 overlay
Layer3 underlay routing options: IGP > eBGP
• For scaling, convergence & OPEX considerations
Figure: Original Fat Tree (based on CLOS) and Folded Fat Tree
DATACENTER FABRIC: ROUTING PROTOCOL CHALLENGES
• Routing protocols are complex (to deal with irregular topologies)
• Routing protocols are:
• EITHER: Fast, but not scalable to 100k nodes (link-state)
• OR: Slow, when scalable to 100k nodes (distance-vector)
CURRENT ROUTING PROTOCOLS
• Built for irregular network topologies
• Low degree of connectivity
DATACENTER FABRICS
• Uniform topology (CLOS, folded Fat-Tree)
• High degree of connectivity (Hyper-scale DCs)
NOT A PERFECT MATCH
DATACENTER FABRIC: KEY REQUIREMENTS
REQUIREMENT / BGP (modified for DC) / ISIS (modified for DC)
01 Close to Zero Touch Provisioning
02 Link discovery/Automatic forming of trees/preventing cabling violations ⚠ ⚠
03 Minimal amount of routes/information on ToRs (cost-optimized)
04 High degree of ECMP (BGP needs lots of knobs, memory, own-as-path violations) ⚠
05 Traffic engineering by Next-hops, Prefix modification
06 See all links in topology to support PCE/SR ⚠
07 Carry opaque configuration data (key-value) efficiently ⚠
08 Take a node out of production quickly and without disruption (overload)
09 Automatic disaggregation on failures to prevent black-holing
10 Minimal blast radius on failures
11 Fastest possible convergence on failures
LET’S TAKE A FRESH LOOK
Routing protocols in our network
Distance Vector (RIP)
• Vectors of destination and distance
• “Tell your neighbors about the rest of the network”
Link State (ISIS, OSPF)
• Router-announced LSDB, Dijkstra
• “Tell the rest of the network about your neighbors”
Path Vector (BGP)
• Full paths announced in BGP
• “Paths described by sequence of ASs”
LINK STATE v/s DISTANCE/PATH VECTOR
Link State
• Topology view -> TE enabler
• Fast propagation
Distance/Path Vector
• Granular policy control & traffic engineering
Figure: Link State Convergence vs. Distance/Path Vector Convergence (Node 0 to Node 5 over time, showing update transmission and computation)
Both protocol types (Link State and Distance/Path Vector) are frequently used in today's networks
RIFT: ROUTING IN FAT TREES
Clean slate approach to building DC Fabrics
Market Requirements
• CLOS optimized routing protocol
• Full BW Utilization
• Built in Fabric Provisioning
• Fast convergence
Juniper Invention
• Link-State (North) + Distance-Vector (South)
• Simplest leaf Implementation
• Failure Domain Containment
• Support all DC applications
RIFT AT A GLANCE
1. Topological sort
• Uses the concept of directionality
2. Link-State flood Up (North)
• Full topology and all prefixes @ top spine only
3. Distance Vector Down (South)
• 0/0 is sufficient to send traffic Up.
• More-specific prefixes advertised in specific scenarios (link failures, traffic engineering)
4. Bounce
• Flood reduction
• Automatic disaggregation
RIFT IN STEADY STATE – BASICS
Spine (Level 2): learns Pfx A, B, C, D from Spine (Level 1)
Spine (Level 1): learns 0/0 from Spine (Level 2); learns Pfx A, B, C, D from Leaf (Level 0)
Leaf (Level 0): learns 0/0 from Spine (Level 1)
Figure: three-level fabric with labels Aggregation and Localization, Pfx 0/0 advertised southbound, leaf prefixes Pfx W, X, Y, Z
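The steady-state behaviour can be illustrated with a small sketch. It is not RIFT code: the function name, the node names and the fully meshed three-level topology are assumptions chosen for illustration. It only shows that prefixes accumulate northbound while a single 0/0 route is enough southbound.

```python
def steady_state_routes(level, north, own_prefixes):
    """Return {node: {prefix: [next hops]}} for a fault-free fabric.

    level:        {node: level number}, 0 = leaf (hypothetical input format)
    north:        {node: list of northbound neighbors}
    own_prefixes: {leaf: list of locally attached prefixes}
    """
    # Build the southbound view from the northbound adjacency lists.
    south = {node: set() for node in level}
    for node, parents in north.items():
        for parent in parents:
            south[parent].add(node)

    reach = {}                       # prefixes reachable at or below a node
    routes = {node: {} for node in level}
    for node in sorted(level, key=level.get):        # leaves first
        reach[node] = set(own_prefixes.get(node, ()))
        for child in south[node]:
            # Northbound information: specific prefixes learned from below.
            for pfx in reach[child]:
                routes[node].setdefault(pfx, set()).add(child)
            reach[node] |= reach[child]
        if north.get(node):
            # Southbound information: a default route is sufficient.
            routes[node]["0/0"] = set(north[node])
    return {n: {p: sorted(h) for p, h in r.items()} for n, r in routes.items()}


level = {"leaf1": 0, "leaf2": 0, "spine1": 1, "spine2": 1, "top1": 2}
north = {
    "leaf1": ["spine1", "spine2"],
    "leaf2": ["spine1", "spine2"],
    "spine1": ["top1"],
    "spine2": ["top1"],
}
own_prefixes = {"leaf1": ["A", "B"], "leaf2": ["C", "D"]}
for node, table in steady_state_routes(level, north, own_prefixes).items():
    print(node, table)
```

In this sketch the leaves end up with only 0/0, the level 1 spines hold Pfx A to D plus 0/0, and the top spine holds all prefixes and no default route.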
RIFT FEATURES
DETECTING CABLING MIS-CONFIGURATION
Problem statement: Fabric should automatically detect and block wrong cabling.
Automatic rejection of adjacencies based on minimal configuration
• A1 to B1: Forbidden due to POD mismatch
• A0 to B1: Forbidden due to POD mismatch (A0 already formed A0-A1 even if POD not configured on A0)
• B0 to C0: Forbidden based on level mismatch
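The three rejected cases can be sketched as a simple acceptance check. This is a simplification under stated assumptions, not the rule set of the RIFT draft: the function name is hypothetical, a POD of None stands for "not configured or learned yet", and adjacent levels are assumed to differ by exactly one. The POD and level numbers in the usage lines are invented for illustration.

```python
def adjacency_allowed(level_a, pod_a, level_b, pod_b):
    """Return (accepted, reason) for a candidate adjacency.

    Assumptions of this sketch (not taken from the RIFT draft):
    - pod is None while a node has neither a configured nor a learned POD
    - a valid link always connects two adjacent levels
    """
    if pod_a is not None and pod_b is not None and pod_a != pod_b:
        return False, "POD mismatch"
    if abs(level_a - level_b) != 1:
        return False, "level mismatch"
    return True, "ok"


# A1 to B1: two nodes in different PODs (hypothetical POD numbers 1 and 2).
print(adjacency_allowed(1, 1, 1, 2))
# A0 to B1: A0 has already learned POD 1 through its adjacency with A1.
print(adjacency_allowed(0, 1, 1, 2))
# B0 to C0: assumed here to sit on the same level.
print(adjacency_allowed(0, 2, 0, 2))
```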
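Figure: example fabric with POD 0 and POD 1, nodes A0, A1, B0, B1, C0 across Spine (Level 2), Spine (Level 1) and Leaf (Level 0), leaf prefixes Pfx A, B, C, D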
RIFT FEATURES
(NEAR) ZERO TOUCH PROVISIONING
Problem statement: Fabric should auto-configure with close to zero-touch
Automatic SystemID derivation
• RIFT SystemID (64 bits) is automatically derived from node’s EUI-64
Top-level (superspine) switches must be manually configured
• Either: with flag=SUPERSPINE (default level 16)
• Or: explicit level (e.g.: level 3 in the example)
A node with a non-configured level derives its level from its neighbors’ levels (highest neighbor’s level – 1)
• E, F -> derived level 2
• I, J -> derived level 1
A node with flag=LEAF_ONLY always has derived level 0
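The level derivation rule can be sketched as a fixed-point computation. This is an illustrative sketch only: the function name and input format are hypothetical, and the link list is an assumption about how the nodes in the example are cabled (A on top, E and F below it, then I and J, then the LEAF_ONLY nodes M and N).

```python
def derive_levels(links, configured, leaf_only=()):
    """Derive levels as 'highest neighbor level - 1' until nothing changes.

    links:      iterable of (node, node) pairs (assumed cabling)
    configured: {node: level} for manually configured nodes
    leaf_only:  nodes with flag=LEAF_ONLY, pinned to level 0
    """
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)

    levels = dict(configured)
    for node in leaf_only:
        levels[node] = 0

    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in configured or node in leaf_only:
                continue  # never override configuration
            known = [levels[n] for n in nbrs if n in levels]
            if not known:
                continue
            candidate = max(known) - 1
            # Levels only ever move up, so the loop terminates.
            if candidate >= 0 and candidate > levels.get(node, -1):
                levels[node] = candidate
                changed = True
    return levels


links = [("A", "E"), ("A", "F"),
         ("E", "I"), ("E", "J"), ("F", "I"), ("F", "J"),
         ("I", "M"), ("I", "N"), ("J", "M"), ("J", "N")]
print(derive_levels(links, configured={"A": 3}, leaf_only=("M", "N")))
```

With the assumed cabling this reproduces the values on the slide: E and F at level 2, I and J at level 1.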
Figure: four-level example (Level 3 to Level 0) with nodes A, E, F, I, J, M, N; top level configured manually (level=3); two nodes marked Flag = LEAF_ONLY
RIFT FEATURES
ROUTING IN FAILURE: AUTOMATIC DISAGGREGATION
Problem statement: Avoid any traffic black-holing due to link failures
1) Link C2 – B1 breaks. C2 loses reachability to Pfx Y & Z
2) C2 sends updates with only one Nbr (A1)
3) D2 receives update from C2:
• Our neighbors don’t match (B1 is missing)
• C2 has no reachability to Pfx Y & Z
• Lower level nodes use 0/0 – risk of traffic black hole.
4) D2 originates new update w/ disaggregated prefixes (Y, Z)
Note:
• Nodes on lower level (A1, B1) get more specific route.
• Nodes further down [Level 0] can still use 0/0 only
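Steps 3 and 4 can be sketched from D2's point of view as a comparison of southbound neighbor sets. The function name and data layout are hypothetical, and placing Pfx W and X behind A1 is an assumption; the slide only states that Pfx Y and Z sit behind B1.

```python
def prefixes_to_disaggregate(my_south, peer_south, prefixes_via):
    """Prefixes this node should advertise southbound as more-specifics.

    my_south:     set of this node's southbound neighbors
    peer_south:   {same-level peer: set of its southbound neighbors}
    prefixes_via: {southbound neighbor: set of prefixes reachable through it}
    """
    result = set()
    for peer, nbrs in peer_south.items():
        # Neighbors I still reach but the peer has lost.
        for missing in my_south - nbrs:
            result |= prefixes_via.get(missing, set())
    return result


d2_south = {"A1", "B1"}
prefixes_via = {"A1": {"W", "X"}, "B1": {"Y", "Z"}}

# Before the failure C2 reports both neighbors: nothing to disaggregate.
print(prefixes_to_disaggregate(d2_south, {"C2": {"A1", "B1"}}, prefixes_via))
# After link C2 - B1 fails, C2 reports only A1.
print(sorted(prefixes_to_disaggregate(d2_south, {"C2": {"A1"}}, prefixes_via)))
```

After the failure the sketch yields Y and Z, matching the more-specific routes D2 originates in step 4.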
Figure: (1) C2 – B1 link fails; (2) C2 sends only Nbr A1 in update; (3) D2 learns C2 has lost Nbr B1; (4) D2 advertises specific route to Pfx Y & Z. Route labels: Pfx 0/0 -> C2, D2; Pfx Y,Z -> D2; Pfx 0/0 -> A1, A2. Nodes: A0, A1, B1, C2, D2; prefixes Pfx W, X, Y, Z
RIFT FEATURES
FLOODING REDUCTION: FOR HIGHLY MESHED DC TOPOLOGIES
Problem statement: Avoid redundant information in highly meshed topologies
N-port spine switch
Level 2 spine – all N ports are southbound
Level 1 spine
• N/2 ports are Southbound
• N/2 ports are Northbound
Link-State flooding becomes overkill (known problem in link-state protocols)
RIFT FEATURES
FLOODING REDUCTION: HAPPENS IN THE NORTH DIRECTION
Each ‘L’ node determines which ‘L+2’ nodes are reachable via particular ‘L+1’ nodes
A single ‘L+1’ node can flood updates from the ‘L’ node to a given set of ‘L+2’ nodes -> Flood Repeater (FR) node
For redundancy, in RIFT the ‘L’ node selects at least two ‘L+1’ nodes as FRs (using a selection algorithm)
Updates sent to non-FRs are marked with a ‘do-not-reflect’ flag
A similar algorithm is executed at each level.
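The slide does not spell out the selection algorithm. The sketch below uses a plain greedy cover as a stand-in to show the idea: pick ‘L+1’ nodes until every ‘L+2’ node is covered by two flood repeaters. It is not the algorithm from the RIFT draft, and the function name and node names are hypothetical.

```python
def select_flood_repeaters(reach, redundancy=2):
    """Greedy stand-in for flood repeater selection on one 'L' node.

    reach: {L+1 node: set of L+2 nodes reachable through it}
    Returns the chosen L+1 nodes in selection order.
    """
    # How many more repeaters each L+2 node still needs, capped by
    # the number of L+1 nodes that can reach it at all.
    need = {}
    for grandparents in reach.values():
        for g in grandparents:
            need[g] = need.get(g, 0) + 1
    need = {g: min(redundancy, count) for g, count in need.items()}

    chosen = []
    remaining = set(reach)
    while any(need.values()) and remaining:
        # Pick the L+1 node that serves the most still-uncovered L+2 nodes;
        # sorting makes tie-breaking deterministic.
        best = max(sorted(remaining),
                   key=lambda n: sum(1 for g in reach[n] if need[g] > 0))
        chosen.append(best)
        remaining.discard(best)
        for g in reach[best]:
            if need[g] > 0:
                need[g] -= 1
    return chosen


reach = {name: {"t1", "t2"} for name in ("s1", "s2", "s3", "s4")}
repeaters = select_flood_repeaters(reach)
non_repeaters = sorted(set(reach) - set(repeaters))
print("FRs:", repeaters)
print("sent with do-not-reflect:", non_repeaters)
```

In a fully meshed example with four ‘L+1’ nodes, two are enough as flood repeaters; updates to the other two would carry the ‘do-not-reflect’ flag.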
Figure: flood repeater selection between levels L, L+1 and L+2
RIFT FEATURES
WEIGHTED BANDWIDTH LOAD-BALANCING
Problem Statement: Load-balance traffic across links based on link capacity
Weighted Bandwidth load-balancing example:
1. Each upstream node gets a value based on available bandwidth
• Upstream node BW = BW to upstream node + uplink BW of upstream node
• On X, upstream node I & J -> 2 x 10G + 4 x 40G = 180G
• Upstream node BW is rounded up to the next power of 2 and represented by its exponent
• On X, upstream node I & J -> 180G -> 8 (Note: 2^7 < 180 < 2^8)
• Exponent for I & J = 8
2. Received route’s metric is adjusted based on above value (BAD – Bandwidth Adjusted Distance)
• BAD = original D * (1 + Max_Upstream_Exp – Current_Upstream_Exp)
• On X, upstream node I -> BAD = D * (1 + 8 - 8) = D
• On X, upstream node J -> BAD = D * (1 + 8 - 8) = D
• Equal BW load-balancing -> distance (metric) not adjusted
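The arithmetic above can be checked with a short sketch. The function names are hypothetical; the formula is the one on the slide. The second call uses an invented unequal case (J with only 60G) to show how the metric grows for the weaker upstream node.

```python
def bandwidth_exponent(bw_gbps):
    """Smallest exponent e such that 2**e >= bw_gbps."""
    e = 0
    while (1 << e) < bw_gbps:
        e += 1
    return e


def bandwidth_adjusted_distance(distance, upstream_bw):
    """BAD = D * (1 + Max_Upstream_Exp - Current_Upstream_Exp) per upstream node.

    upstream_bw: {upstream node: BW to it plus its own uplink BW, in Gbit/s}
    """
    exps = {node: bandwidth_exponent(bw) for node, bw in upstream_bw.items()}
    max_exp = max(exps.values())
    return {node: distance * (1 + max_exp - e) for node, e in exps.items()}


# Slide example on X: both I and J have 2 x 10G + 4 x 40G = 180G.
bw = 2 * 10 + 4 * 40
print(bandwidth_exponent(bw))
print(bandwidth_adjusted_distance(1, {"I": bw, "J": bw}))

# Hypothetical unequal case, not from the slide: J offers only 60G.
print(bandwidth_adjusted_distance(1, {"I": 180, "J": 60}))
```

In the unequal case J's exponent is 6 (2^5 < 60 < 2^6), so its distance is multiplied by 1 + 8 - 6 = 3 while I's stays unchanged.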
Figure: example topology with nodes A, E, F, I, J, X, Y and 10G, 40G and 100G links
RIFT FEATURES SUMMARY
DATACENTER FABRIC: KEY REQUIREMENTS
REQUIREMENT / BGP (modified for DC) / ISIS (modified for DC) / RIFT
01 Close to Zero Touch Provisioning
02 Link discovery/Automatic forming of trees/preventing cabling violations ⚠ ⚠
03 Minimal amount of routes/information on ToRs (cost-optimized)
04 High degree of ECMP (BGP needs lots of knobs, memory, own-as-path violations) ⚠
05 Traffic engineering by Next-hops, Prefix modification
06 See all links in topology to support PCE/SR ⚠
07 Carry opaque configuration data (key-value) efficiently ⚠
08 Take a node out of production quickly and without disruption (overload)
09 Automatic disaggregation on failures to prevent black-holing
10 Minimal blast radius on failures
11 Fastest possible convergence on failures
INDUSTRY STATUS
Standardization
• Initiated by Antoni Przygienda (Juniper Networks)
• Standards Track Working Group Draft (I-D)
• Base for further work toward RFC
• https://tools.ietf.org/html/draft-ietf-rift-rift-06
Co-operation
• Join work at IETF WG (JNPR, CSCO, Nokia, Comcast)
• Contact authors, share opinion
• The data structures for packet are public (GPB)
Figure: standardization track (individual draft, I-D, RFC, STD)
Availability
• RIFT on python: https://github.com/brunorijsman/rift-python
• RIFT trial code available from Juniper:
https://www.juniper.net/us/en/dm/free-rift-trial/
• Production-ready Juniper code: Q4’2019
Relevant drafts
• Policy-guided prefixes with RIFT:
https://tools.ietf.org/html/draft-atlas-rift-pgp-01
• RIFT YANG model:
https://tools.ietf.org/html/draft-ietf-rift-yang-00
• Segment Routing in Fat Trees (SRIFT):
https://tools.ietf.org/html/draft-zzhang-rift-sr-01
SUMMARY: RIFT PROTOCOL ADVANTAGES
Link-State and Distance Vector
Take ‘best of both’
• Fastest possible convergence
• Automatic topology detection
• Minimal routes on TORs
• High degree of ECMP
• Fast de-commissioning of Nodes
Leave ‘not-so-good’
• Excessive flooding
• Manual neighbor detection
Unique RIFT additions
• Zero-touch provisioning
• Automatic disaggregation on failure
• Minimal blast radius on failures
• Utilize all fabric paths without loops
• Support for non-ECMP paths
• Key-Value Store
THANK YOU
nitinvig@juniper.net
1 of 21

Recommended

RPKI and Me by
RPKI and MeRPKI and Me
RPKI and MeMyNOG
641 views40 slides
Engineering The New IP Transport by
Engineering The New IP TransportEngineering The New IP Transport
Engineering The New IP TransportMyNOG
411 views23 slides
Next Gen Monitoring with INT by
Next Gen Monitoring with INTNext Gen Monitoring with INT
Next Gen Monitoring with INTMyNOG
1.2K views34 slides
Routing In Fat Trees by
Routing In Fat TreesRouting In Fat Trees
Routing In Fat TreesAPNIC
1.2K views15 slides
Introduction to Segment Routing by
Introduction to Segment RoutingIntroduction to Segment Routing
Introduction to Segment RoutingMyNOG
2.6K views35 slides
IPLC Analytic Dashboard - Mohd Rizal bin Mohd Ramly by
IPLC Analytic Dashboard - Mohd Rizal bin Mohd RamlyIPLC Analytic Dashboard - Mohd Rizal bin Mohd Ramly
IPLC Analytic Dashboard - Mohd Rizal bin Mohd RamlyMyNOG
1K views24 slides

More Related Content

What's hot

Segment routing in ISO-XR 5.2.2 by
Segment routing in ISO-XR 5.2.2Segment routing in ISO-XR 5.2.2
Segment routing in ISO-XR 5.2.2Bertrand Duvivier
2.2K views8 slides
Segment Routing: Prepare Your Network For New Business Models by
Segment Routing:  Prepare Your Network For New Business ModelsSegment Routing:  Prepare Your Network For New Business Models
Segment Routing: Prepare Your Network For New Business ModelsCisco Service Provider
16K views16 slides
TechWiseTV Workshop: Segment Routing for the Datacenter by
TechWiseTV Workshop: Segment Routing for the DatacenterTechWiseTV Workshop: Segment Routing for the Datacenter
TechWiseTV Workshop: Segment Routing for the DatacenterRobb Boyd
1.3K views41 slides
05 (IDNOG02) Technology to reserve the redundancy on the layer2 network by Sa... by
05 (IDNOG02) Technology to reserve the redundancy on the layer2 network by Sa...05 (IDNOG02) Technology to reserve the redundancy on the layer2 network by Sa...
05 (IDNOG02) Technology to reserve the redundancy on the layer2 network by Sa...Indonesia Network Operators Group
728 views13 slides
SRv6 Network Programming: deployment use-cases by
SRv6 Network Programming: deployment use-cases SRv6 Network Programming: deployment use-cases
SRv6 Network Programming: deployment use-cases APNIC
2K views85 slides
Segment Routing by
Segment RoutingSegment Routing
Segment RoutingAPNIC
1.3K views13 slides

What's hot(20)

Segment Routing: Prepare Your Network For New Business Models by Cisco Service Provider
Segment Routing:  Prepare Your Network For New Business ModelsSegment Routing:  Prepare Your Network For New Business Models
Segment Routing: Prepare Your Network For New Business Models
TechWiseTV Workshop: Segment Routing for the Datacenter by Robb Boyd
TechWiseTV Workshop: Segment Routing for the DatacenterTechWiseTV Workshop: Segment Routing for the Datacenter
TechWiseTV Workshop: Segment Routing for the Datacenter
Robb Boyd1.3K views
SRv6 Network Programming: deployment use-cases by APNIC
SRv6 Network Programming: deployment use-cases SRv6 Network Programming: deployment use-cases
SRv6 Network Programming: deployment use-cases
APNIC2K views
Segment Routing by APNIC
Segment RoutingSegment Routing
Segment Routing
APNIC1.3K views
Traffic Engineering for CDNs by MyNOG
Traffic Engineering for CDNsTraffic Engineering for CDNs
Traffic Engineering for CDNs
MyNOG1.7K views
Segment Routing Lab by Cisco Canada
Segment Routing Lab Segment Routing Lab
Segment Routing Lab
Cisco Canada2.9K views
Segment Routing Technology Deep Dive and Advanced Use Cases by Cisco Canada
Segment Routing Technology Deep Dive and Advanced Use CasesSegment Routing Technology Deep Dive and Advanced Use Cases
Segment Routing Technology Deep Dive and Advanced Use Cases
Cisco Canada5.6K views
RPKI: An Operator’s Implementation by MyNOG
RPKI: An Operator’s ImplementationRPKI: An Operator’s Implementation
RPKI: An Operator’s Implementation
MyNOG924 views
Disaggregation in PON Network - SDN PON by Ravi Sharma
Disaggregation in PON Network - SDN PON  Disaggregation in PON Network - SDN PON
Disaggregation in PON Network - SDN PON
Ravi Sharma71 views
The Segment Routing Architecture (IEEE Globecom 2015) by nagendranainar
The Segment Routing Architecture (IEEE Globecom 2015)The Segment Routing Architecture (IEEE Globecom 2015)
The Segment Routing Architecture (IEEE Globecom 2015)
nagendranainar1.3K views
WAN SDN meet Segment Routing by APNIC
WAN SDN meet Segment RoutingWAN SDN meet Segment Routing
WAN SDN meet Segment Routing
APNIC5K views
MPLS SDN NFV WORLD'17 - SDN NFV deployment update by Stephane Litkowski
MPLS SDN NFV WORLD'17 - SDN NFV deployment updateMPLS SDN NFV WORLD'17 - SDN NFV deployment update
MPLS SDN NFV WORLD'17 - SDN NFV deployment update
Stephane Litkowski3.1K views
BGP Traffic Engineering with SDN Controller by APNIC
BGP Traffic Engineering with SDN ControllerBGP Traffic Engineering with SDN Controller
BGP Traffic Engineering with SDN Controller
APNIC3.2K views
Next Generation DDoS Services – can we do this with NFV? - CF Chui by MyNOG
Next Generation DDoS Services – can we do this with NFV? - CF ChuiNext Generation DDoS Services – can we do this with NFV? - CF Chui
Next Generation DDoS Services – can we do this with NFV? - CF Chui
MyNOG277 views
Open Connect Appliances - Jocelyn Ooi by MyNOG
Open Connect Appliances - Jocelyn OoiOpen Connect Appliances - Jocelyn Ooi
Open Connect Appliances - Jocelyn Ooi
MyNOG1.8K views

Similar to RIFT A New Approach to Building DC Fabrics

PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing by
PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routingPLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing
PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routingPROIDEA
29 views27 slides
Network_Layer.ppt by
Network_Layer.pptNetwork_Layer.ppt
Network_Layer.pptRajSingh52036
59 views43 slides
Mobile IoT Network :Current Status and Future Evolution by
Mobile IoT  Network :Current Status and Future EvolutionMobile IoT  Network :Current Status and Future Evolution
Mobile IoT Network :Current Status and Future EvolutionSivasothy Shanmugalingam
163 views17 slides
Chap.1 ethernet introduction by
Chap.1 ethernet introductionChap.1 ethernet introduction
Chap.1 ethernet introduction東原 李
2.8K views48 slides
SDI to IP 2110 Transition Part 2 by
SDI to IP 2110 Transition Part 2SDI to IP 2110 Transition Part 2
SDI to IP 2110 Transition Part 2Dr. Mohieddin Moradi
405 views200 slides
OpenNebula - Mellanox Considerations for Smart Cloud by
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula Project
1.1K views33 slides

Similar to RIFT A New Approach to Building DC Fabrics(20)

PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing by PROIDEA
PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routingPLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing
PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing
PROIDEA29 views
Chap.1 ethernet introduction by 東原 李
Chap.1 ethernet introductionChap.1 ethernet introduction
Chap.1 ethernet introduction
東原 李2.8K views
OpenNebula - Mellanox Considerations for Smart Cloud by OpenNebula Project
OpenNebula - Mellanox Considerations for Smart CloudOpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula - Mellanox Considerations for Smart Cloud
OpenNebula Project1.1K views
Services and applications’ infrastructure for agile optical networks by Tal Lavian Ph.D.
Services and applications’ infrastructure for agile optical networksServices and applications’ infrastructure for agile optical networks
Services and applications’ infrastructure for agile optical networks
Tal Lavian Ph.D.597 views
Pristine glif 2015 by ICT PRISTINE
Pristine glif 2015Pristine glif 2015
Pristine glif 2015
ICT PRISTINE1.4K views
Network basics 2 eng. moaath alshaikh by Moaath alshaikh
Network basics 2 eng. moaath alshaikhNetwork basics 2 eng. moaath alshaikh
Network basics 2 eng. moaath alshaikh
Moaath alshaikh55 views
Coherent DSP meets open transport SDN by HidekiNishizawa
Coherent DSP meets open transport SDNCoherent DSP meets open transport SDN
Coherent DSP meets open transport SDN
HidekiNishizawa517 views
Routing in Dense Topologies - What's all the Fuss? by APNIC
Routing in Dense Topologies - What's all the Fuss?Routing in Dense Topologies - What's all the Fuss?
Routing in Dense Topologies - What's all the Fuss?
APNIC433 views
6TiSCH @Telecom Bretagne 2015 by Pascal Thubert
6TiSCH @Telecom Bretagne 20156TiSCH @Telecom Bretagne 2015
6TiSCH @Telecom Bretagne 2015
Pascal Thubert2.2K views
Iec 62439 3.4-prp_kirrmann by Jörgen Gade
Iec 62439 3.4-prp_kirrmannIec 62439 3.4-prp_kirrmann
Iec 62439 3.4-prp_kirrmann
Jörgen Gade164 views
Logical_Routing_NSX_T_2.4.pptx.pptx by AnwarAnsari40
Logical_Routing_NSX_T_2.4.pptx.pptxLogical_Routing_NSX_T_2.4.pptx.pptx
Logical_Routing_NSX_T_2.4.pptx.pptx
AnwarAnsari4021 views
Study and Emulation of 10G-EPON with Triple Play by Satya Prakash Rout
Study and Emulation of 10G-EPON with Triple PlayStudy and Emulation of 10G-EPON with Triple Play
Study and Emulation of 10G-EPON with Triple Play
Satya Prakash Rout525 views

More from MyNOG

Peering Personal MyNOG-10 by
Peering Personal MyNOG-10Peering Personal MyNOG-10
Peering Personal MyNOG-10MyNOG
121 views32 slides
Embedded CDNs in 2023 by
Embedded CDNs in 2023Embedded CDNs in 2023
Embedded CDNs in 2023MyNOG
112 views22 slides
Edge virtualisation for Carrier Networks by
Edge virtualisation for Carrier NetworksEdge virtualisation for Carrier Networks
Edge virtualisation for Carrier NetworksMyNOG
98 views13 slides
Equinix: New Markets, New Frontiers by
Equinix: New Markets, New FrontiersEquinix: New Markets, New Frontiers
Equinix: New Markets, New FrontiersMyNOG
160 views26 slides
Securing the Onion: 5G Cloud Native Infrastructure by
Securing the Onion: 5G Cloud Native InfrastructureSecuring the Onion: 5G Cloud Native Infrastructure
Securing the Onion: 5G Cloud Native InfrastructureMyNOG
100 views22 slides
Hierarchical Network Controller by
Hierarchical Network ControllerHierarchical Network Controller
Hierarchical Network ControllerMyNOG
79 views25 slides

More from MyNOG(20)

Peering Personal MyNOG-10 by MyNOG
Peering Personal MyNOG-10Peering Personal MyNOG-10
Peering Personal MyNOG-10
MyNOG121 views
Embedded CDNs in 2023 by MyNOG
Embedded CDNs in 2023Embedded CDNs in 2023
Embedded CDNs in 2023
MyNOG112 views
Edge virtualisation for Carrier Networks by MyNOG
Edge virtualisation for Carrier NetworksEdge virtualisation for Carrier Networks
Edge virtualisation for Carrier Networks
MyNOG98 views
Equinix: New Markets, New Frontiers by MyNOG
Equinix: New Markets, New FrontiersEquinix: New Markets, New Frontiers
Equinix: New Markets, New Frontiers
MyNOG160 views
Securing the Onion: 5G Cloud Native Infrastructure by MyNOG
Securing the Onion: 5G Cloud Native InfrastructureSecuring the Onion: 5G Cloud Native Infrastructure
Securing the Onion: 5G Cloud Native Infrastructure
MyNOG100 views
Hierarchical Network Controller by MyNOG
Hierarchical Network ControllerHierarchical Network Controller
Hierarchical Network Controller
MyNOG79 views
Aether: The First Open Source 5G/LTE Connected Edge Cloud Platform by MyNOG
Aether: The First Open Source 5G/LTE Connected Edge Cloud PlatformAether: The First Open Source 5G/LTE Connected Edge Cloud Platform
Aether: The First Open Source 5G/LTE Connected Edge Cloud Platform
MyNOG102 views
Cleaning up your RPKI invalids by MyNOG
Cleaning up your RPKI invalidsCleaning up your RPKI invalids
Cleaning up your RPKI invalids
MyNOG30 views
Introducing Peering LAN 2.0 at DE-CIX by MyNOG
Introducing Peering LAN 2.0 at DE-CIXIntroducing Peering LAN 2.0 at DE-CIX
Introducing Peering LAN 2.0 at DE-CIX
MyNOG100 views
Load balancing and Service in Kubernetes by MyNOG
Load balancing and Service in KubernetesLoad balancing and Service in Kubernetes
Load balancing and Service in Kubernetes
MyNOG97 views
Cloud SDN: BGP Peering and RPKI by MyNOG
Cloud SDN: BGP Peering and RPKICloud SDN: BGP Peering and RPKI
Cloud SDN: BGP Peering and RPKI
MyNOG83 views
SDM – A New (Subsea) Cable Paradigm by MyNOG
SDM – A New (Subsea) Cable ParadigmSDM – A New (Subsea) Cable Paradigm
SDM – A New (Subsea) Cable Paradigm
MyNOG119 views
AI in Networking: Transforming Network Operations with Juniper Mist AIDE by MyNOG
AI in Networking: Transforming Network Operations with Juniper Mist AIDEAI in Networking: Transforming Network Operations with Juniper Mist AIDE
AI in Networking: Transforming Network Operations with Juniper Mist AIDE
MyNOG262 views
Malaysia Data Center Landscape, Where is the next hotspot to place your fiber... by MyNOG
Malaysia Data Center Landscape, Where is the next hotspot to place your fiber...Malaysia Data Center Landscape, Where is the next hotspot to place your fiber...
Malaysia Data Center Landscape, Where is the next hotspot to place your fiber...
MyNOG173 views
FUTURE-PROOFING DATA CENTRES from Connectivity Perspective by MyNOG
FUTURE-PROOFING DATA CENTRES from Connectivity PerspectiveFUTURE-PROOFING DATA CENTRES from Connectivity Perspective
FUTURE-PROOFING DATA CENTRES from Connectivity Perspective
MyNOG82 views
Keep Ukraine Connected: A project from the community – for the community by R... by MyNOG
Keep Ukraine Connected: A project from the community – for the community by R...Keep Ukraine Connected: A project from the community – for the community by R...
Keep Ukraine Connected: A project from the community – for the community by R...
MyNOG80 views
Solving Civilization’s Long Term Communication Needs by Dinesh Kummaran, Tran... by MyNOG
Solving Civilization’s Long Term Communication Needs by Dinesh Kummaran, Tran...Solving Civilization’s Long Term Communication Needs by Dinesh Kummaran, Tran...
Solving Civilization’s Long Term Communication Needs by Dinesh Kummaran, Tran...
MyNOG79 views
MyIX Updates by Raja Mohan Marappan, MyIX by MyNOG
MyIX Updates by Raja Mohan Marappan, MyIXMyIX Updates by Raja Mohan Marappan, MyIX
MyIX Updates by Raja Mohan Marappan, MyIX
MyNOG58 views
Exploring Quantum Engineering for Networking by Melchior Aelmans, Juniper Net... by MyNOG
Exploring Quantum Engineering for Networking by Melchior Aelmans, Juniper Net...Exploring Quantum Engineering for Networking by Melchior Aelmans, Juniper Net...
Exploring Quantum Engineering for Networking by Melchior Aelmans, Juniper Net...
MyNOG51 views
Quick wins in the NetOps Journey by Vincent Boon, Opengear by MyNOG
Quick wins in the NetOps Journey by Vincent Boon, OpengearQuick wins in the NetOps Journey by Vincent Boon, Opengear
Quick wins in the NetOps Journey by Vincent Boon, Opengear
MyNOG48 views

Recently uploaded

WEB 2.O TOOLS: Empowering education.pptx by
WEB 2.O TOOLS: Empowering education.pptxWEB 2.O TOOLS: Empowering education.pptx
WEB 2.O TOOLS: Empowering education.pptxnarmadhamanohar21
16 views16 slides
PORTFOLIO 1 (Bret Michael Pepito).pdf by
PORTFOLIO 1 (Bret Michael Pepito).pdfPORTFOLIO 1 (Bret Michael Pepito).pdf
PORTFOLIO 1 (Bret Michael Pepito).pdfbrejess0410
8 views6 slides
Marketing and Community Building in Web3 by
Marketing and Community Building in Web3Marketing and Community Building in Web3
Marketing and Community Building in Web3Federico Ast
12 views64 slides
information by
informationinformation
informationkhelgishekhar
9 views4 slides
DU Series - Day 4.pptx by
DU Series - Day 4.pptxDU Series - Day 4.pptx
DU Series - Day 4.pptxUiPathCommunity
106 views28 slides
IETF 118: Starlink Protocol Performance by
IETF 118: Starlink Protocol PerformanceIETF 118: Starlink Protocol Performance
IETF 118: Starlink Protocol PerformanceAPNIC
297 views22 slides

Recently uploaded(10)

PORTFOLIO 1 (Bret Michael Pepito).pdf by brejess0410
PORTFOLIO 1 (Bret Michael Pepito).pdfPORTFOLIO 1 (Bret Michael Pepito).pdf
PORTFOLIO 1 (Bret Michael Pepito).pdf
brejess04108 views
Marketing and Community Building in Web3 by Federico Ast
Marketing and Community Building in Web3Marketing and Community Building in Web3
Marketing and Community Building in Web3
Federico Ast12 views
IETF 118: Starlink Protocol Performance by APNIC
IETF 118: Starlink Protocol PerformanceIETF 118: Starlink Protocol Performance
IETF 118: Starlink Protocol Performance
APNIC297 views
UiPath Document Understanding_Day 3.pptx by UiPathCommunity
UiPath Document Understanding_Day 3.pptxUiPath Document Understanding_Day 3.pptx
UiPath Document Understanding_Day 3.pptx
UiPathCommunity105 views
How to think like a threat actor for Kubernetes.pptx by LibbySchulze1
How to think like a threat actor for Kubernetes.pptxHow to think like a threat actor for Kubernetes.pptx
How to think like a threat actor for Kubernetes.pptx
LibbySchulze15 views
Building trust in our information ecosystem: who do we trust in an emergency by Tina Purnat
Building trust in our information ecosystem: who do we trust in an emergencyBuilding trust in our information ecosystem: who do we trust in an emergency
Building trust in our information ecosystem: who do we trust in an emergency
Tina Purnat100 views

RIFT A New Approach to Building DC Fabrics

  • 1. © 2018 Juniper Networks RIFT A new approach to building DC fabrics Nitin Vig Chief Architect, Juniper Networks
  • 2. © 2018 Juniper Networks AGENDA 2 Datacenter Fabric Trends Introduction to RIFT RIFT key features Industry status Summary
  • 3. © 2018 Juniper Networks DATACENTER FABRIC - TRENDS Hybrid Clouds are here to stay • Hybrid cloud for many reasons, one of them to keep real-estate from Hyper scalers • Customers are hosting their content & critical business processes; Need to build own fabrics • Impossible to sustain proprietary OPEX efforts Fabrics are becoming Uniform, Local & Regular • Vast amount of bandwidth close to the producer & consumer necessary • Fabric architectures being adopted outside the conventional DC (Metro, PoP) • WAN-style Traffic Engineering & protection replaced by Wide Fan-out & distributed systems redundancy Fabric is the new “RAM chip” • No one configures RAM banks manually in every laptop • IP fabrics HW is largely commodity already • IP fabrics will “OPEX commoditize” (consume bandwidth) 3
  • 4. © 2018 Juniper Networks DATACENTER FABRIC – TECHNOLOGY EVOLUTION Tree to CLOS topology • Tree: core/aggregation/access layers • Folded CLOS or Fat Trees: Spine & Leaf Layer2 switching to Layer3 routing • Layer 3 routing underlay with Layer2/3 overlay Layer3 underlay routing options: IGP > eBGP • For scaling. Convergence & OPEX considerations 4 Folded Original Fat Tree (based on CLOS) Folder Fat Tree
  • 5. © 2018 Juniper Networks DATACENTER FABRIC: ROUTING PROTOCOL CHALLENGES • Routing protocols are complex (to deal with irregular topologies) • Routing protocols are: • EITHER: Fast, but not scalable to 100k nodes (link-state) • OR: Slow, when scalable to 100k nodes (distance-vector) CURRENT ROUTING PROTOCOLS DATACENTER FABRICS Built for irregular network topologies Low degree of connectivity Uniform topology (CLOS, folded Fat-Tree) High degree of connectivity (Hyper-scale DCs) NOT A PERFECT MATCH
  • 6. © 2018 Juniper Networks 6 REQUIREMENT BGP (modified for DC) ISIS (modified for DC) 01 Close to Zero Touch Provisioning 02 Link discovery/Automatic forming of trees/preventing cabling violations ⚠ ⚠ 03 Minimal amount of routes/information on ToRs (cost-optimized) 04 High degree of ECMP (BGP needs lots knobs, memory, own-as-path violations) ⚠ 05 Traffic engineering by Next-hops, Prefix modification 06 See all links in topology to support PCE/SR ⚠ 07 Carry opaque configuration data (key-value) efficiently ⚠ 08 Take a node out of production quickly and without disruption (overload) 09 Automatic disaggregation on failures to prevent black-holing 10 Minimal blast radius on failures 11 Fastest possible convergence on failures DATACENTER FABRIC: KEY REQUIREMENTS
  • 7. © 2018 Juniper Networks LET’S TAKE A FRESH LOOK Distance Vector (RIP) 7 Link State (ISIS, OSPF) Path Vector (BGP) Vectors of destination and distance “Tell you neighbors rest of the network” Router announced LSDB, Dijkstra “Tell rest of the network your neighbors” Full-paths announced in BGP “Paths described by sequence of ASs” Routing protocols in our network
  • 8. © 2018 Juniper Networks LINK STATE v/s DISTANCE/PATH VECTOR Link State • Topology view à TE enabler • Fast propagation Distance/Path Vector • Granular policy control & traffic engineering time time Node 1 Node 0 Node 3 Node 2 Node 5 Node 4 Node 1 Node 0 Node 3 Node 2 Node 5 Node 4 computation Update tx-mission Link State Convergence Distance/Path Vector Convergence Both protocols types (LS and Distance/Path Vector) are frequently used in todays networks
  • 9. © 2018 Juniper Networks RIFT: ROUTING IN FAT TREES • CLOS optimized routing protocol • Full BW Utilization • Built in Fabric Provisioning • Fast convergence 9 Clean slate approach to building DC Fabrics Market Requirements Juniper Invention • Link-State (North) + Distance-Vector (South) • Simplest leaf Implementation • Failure Domain Containment • Support all DC applications
  • 10. © 2018 Juniper Networks RIFT AT A GLANCE 1. Topological sort • Uses the concept of directionality 2. Link-State flood Up (North) • Full topology and all prefixes @ top spine only 3. Distance Vector Down (South) • 0/0 is sufficient to send traffic Up. • More-specific prefixes advertised in specific scenarios (link failures, traffic engineering) 4. Bounce • Flood reduction • Automatic dis-aggregation
  • 11. RIFT IN STEADY STATE - BASICS. Spine (Level 2): learns Pfx A, B, C, D from Spine (Level 1). Spine (Level 1): learns 0/0 from Spine (Level 2) and Pfx A, B, C, D from Leaf (Level 0). Leaf (Level 0): learns 0/0 from Spine (Level 1). [Figure: three-level fabric labeled "Aggregation" and "Localization", with Pfx 0/0 advertised southbound and leaf prefixes Pfx W, X, Y, Z.]
  • 12. RIFT FEATURES: DETECTING CABLING MIS-CONFIGURATION. Problem statement: the fabric should automatically detect and block wrong cabling. Automatic rejection of adjacencies based on minimal configuration: A1 to B1 is forbidden due to POD mismatch; A0 to B1 is forbidden due to POD mismatch (A0 already formed A0-A1, even if POD is not configured on A0); B0 to C0 is forbidden based on level mismatch. [Figure: POD 0 and POD 1 with nodes C0, A0, A1, B0, B1 across Leaf (Level 0), Spine (Level 1) and Spine (Level 2); leaf prefixes Pfx A, B, C, D.]
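The rejection rules on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the protocol's actual adjacency state machine: the `Node` class, the function names, the POD numbers assigned to A, B and C, and the rule that levels must differ by exactly one are all assumptions made for the example, because the slide's figure did not survive.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    level: int
    pod: Optional[int] = None  # None: POD not configured and not yet learned


def check_adjacency(a: Node, b: Node) -> str:
    # POD check applies only when both sides know their POD.
    if a.pod is not None and b.pod is not None and a.pod != b.pod:
        return "rejected: POD mismatch"
    # Assumed simplification: neighbors must sit on adjacent levels.
    if abs(a.level - b.level) != 1:
        return "rejected: level mismatch"
    return "accepted"


def form_adjacency(a: Node, b: Node) -> str:
    result = check_adjacency(a, b)
    if result == "accepted":
        # A node without a configured POD adopts its neighbor's POD.
        if a.pod is None:
            a.pod = b.pod
        if b.pod is None:
            b.pod = a.pod
    return result


# Hypothetical POD assignment: A nodes in POD 0, B and C nodes in POD 1.
a0 = Node("A0", level=0)            # POD not configured
a1 = Node("A1", level=1, pod=0)
b0 = Node("B0", level=0, pod=1)
b1 = Node("B1", level=1, pod=1)
c0 = Node("C0", level=0, pod=1)

results = {
    "A0-A1": form_adjacency(a0, a1),
    "A1-B1": form_adjacency(a1, b1),
    "A0-B1": form_adjacency(a0, b1),
    "B0-C0": form_adjacency(b0, c0),
}
```

Because A0 learns its POD from the A0-A1 adjacency, the later A0-B1 attempt is rejected even though A0 had no POD configured.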
  • 13. RIFT FEATURES: (NEAR) ZERO TOUCH PROVISIONING. Problem statement: the fabric should auto-configure with close to zero touch. Automatic SystemID derivation: the RIFT SystemID (64 bits) is automatically derived from the node's EUI-64. Top-level (superspine) switches must be manually configured, either with flag=SUPERSPINE (default level 16) or with an explicit level (e.g. level 3 in the example). A node with no configured level derives its level from its neighbors' levels (highest neighbor level - 1): E, F -> derived level 2; I, J -> derived level 1. A node with flag=LEAF_ONLY always has derived level 0. [Figure: nodes A, E, F, I, J, M, N across Level 0 to Level 3; the top node has level=3 configured manually; two nodes have flag=LEAF_ONLY.]
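The level derivation rule (highest neighbor level minus 1) can be sketched as a small fixed-point computation. The function name `derive_levels` and the link list are hypothetical: the slide names the nodes but the cabling shown in its figure is not recoverable, so the topology below is an assumption chosen to reproduce the slide's stated results. Real RIFT has additional rules that this sketch ignores.

```python
from typing import Dict, List, Set, Tuple


def derive_levels(
    links: List[Tuple[str, str]],
    configured: Dict[str, int],
    leaf_only: Set[str],
) -> Dict[str, int]:
    # Build an undirected adjacency map.
    adj: Dict[str, Set[str]] = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    levels: Dict[str, int] = dict(configured)
    for node in leaf_only:
        levels[node] = 0  # LEAF_ONLY nodes always derive level 0

    changed = True
    while changed:
        changed = False
        for node in sorted(adj):
            if node in configured or node in leaf_only:
                continue
            known = [levels[n] for n in adj[node] if n in levels]
            if not known:
                continue
            derived = max(known) - 1  # highest neighbor level minus 1
            if levels.get(node) != derived:
                levels[node] = derived
                changed = True
    return levels


# Assumed cabling: A on top, E/F below it, I/J below those, M/N as leaves.
links = [("A", "E"), ("A", "F"), ("E", "I"), ("E", "J"),
         ("F", "I"), ("F", "J"), ("I", "M"), ("J", "N")]
levels = derive_levels(links, configured={"A": 3}, leaf_only={"M", "N"})
```

Only A carries configuration here (level 3) and M and N carry the LEAF_ONLY flag; every other level follows from the neighbors.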
  • 14. RIFT FEATURES: ROUTING IN FAILURE - AUTOMATIC DISAGGREGATION. Problem statement: avoid any traffic black-holing due to link failures. 1) Link C2-B1 breaks; C2 loses reachability to Pfx Y & Z. 2) C2 sends updates listing only one neighbor (A1). 3) D2 receives the update from C2 and learns that C2 has lost neighbor B1: the neighbor sets no longer match (B1 is missing); C2 has no reachability to Pfx Y & Z; lower-level nodes use 0/0, so there is a risk of a traffic black hole. 4) D2 originates a new update with the disaggregated prefixes (Y, Z). Note: nodes on the lower level (A1, B1) get the more specific route; nodes further down (Level 0) can still use 0/0 only. [Figure: nodes A0, A1, B1, C2, D2 with Pfx W, X, Y, Z; routing entries Pfx 0/0 → C2, D2; Pfx Y,Z → D2; Pfx 0/0 → A1, A2.]
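Step 3 and step 4 amount to a set comparison, which can be sketched as follows. The function name and the `prefixes_via` mapping are hypothetical; the assumption that Pfx W and X sit behind A1 while Pfx Y and Z sit behind B1 is inferred from the slide's statement that C2 loses Y & Z when the C2-B1 link fails.

```python
from typing import Dict, Set


def prefixes_to_disaggregate(
    my_south_neighbors: Set[str],
    peer_south_neighbors: Set[str],
    prefixes_via: Dict[str, Set[str]],
) -> Set[str]:
    # Southbound neighbors that I still reach but my same-level peer does not.
    missing_on_peer = my_south_neighbors - peer_south_neighbors
    result: Set[str] = set()
    for neighbor in missing_on_peer:
        # Everything behind such a neighbor needs a more specific route,
        # otherwise traffic following 0/0 to the peer would be black-holed.
        result |= prefixes_via.get(neighbor, set())
    return result


# D2's view after receiving C2's update that lists only A1.
d2_neighbors = {"A1", "B1"}
c2_neighbors = {"A1"}
prefixes_via = {"A1": {"W", "X"}, "B1": {"Y", "Z"}}  # assumed placement

disaggregated = prefixes_to_disaggregate(d2_neighbors, c2_neighbors, prefixes_via)
```

When both spines report the same neighbor set, the result is empty and the default route alone remains sufficient.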
  • 15. RIFT FEATURES: FLOODING REDUCTION FOR HIGHLY MESHED DC TOPOLOGIES. Problem statement: avoid redundant information in highly meshed topologies. With N-port spine switches, a Level 2 spine has all N ports southbound, while a Level 1 spine has N/2 ports southbound and N/2 ports northbound. Link-state flooding becomes overkill (a known problem in link-state protocols).
  • 16. RIFT FEATURES: FLOODING REDUCTION HAPPENS IN THE NORTH DIRECTION. Each 'L' node knows which 'L+2' nodes are reachable via particular 'L+1' nodes. A single 'L+1' node can flood updates from an 'L' node to a given set of 'L+2' nodes -> Flood Repeater (FR) node. For redundancy, in RIFT an 'L' node selects at least two 'L+1' nodes as FRs (using a selection algorithm). Updates sent to non-FRs are marked with a 'do-not-reflect' flag. A similar algorithm is executed at each level. [Figure: levels L, L+1 and L+2, with suppressed flooding paths marked X.]
  • 17. RIFT FEATURES: WEIGHTED BANDWIDTH LOAD-BALANCING. Problem statement: load-balance traffic across links based on link capacity. Weighted bandwidth load-balancing example: 1. Each upstream node gets a value based on available bandwidth. Upstream node BW = BW to the upstream node + uplink BW of the upstream node. On X, upstream nodes I & J -> 2 x 10G + 4 x 40G = 180G. The upstream node BW is converted to the next exponent of 2. On X, upstream nodes I & J -> 180G -> 8 (note: 2^7 < 180 < 2^8), so the exponent for I & J = 8. 2. The received route's metric is adjusted based on the above value (BAD: Bandwidth Adjusted Distance). BAD = original D * (1 + Max_Upstream_Exp - Current_Upstream_Exp). On X, upstream node I -> BAD = D * (1 + 8 - 8) = D. On X, upstream node J -> BAD = D * (1 + 8 - 8) = D. Equal BW load-balancing -> distance (metric) not adjusted. [Figure: nodes A, E, F, I, J, X, Y connected by 10G, 40G and 100G links.]
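The arithmetic on this slide can be checked with a short Python sketch. The function names are hypothetical, bandwidth is taken in Gbit/s as on the slide, and "next exponent of 2" is interpreted as the smallest exponent e with 2^e >= BW, which is an assumption for exact powers of two. The 100G upstream node at the end is an invented contrast case, not taken from the slide.

```python
def bandwidth_exponent(bw_gbps: int) -> int:
    # Smallest e such that 2**e >= bw_gbps (assumed reading of "next exponent of 2").
    if bw_gbps < 1:
        raise ValueError("bandwidth must be at least 1 Gbit/s")
    return (bw_gbps - 1).bit_length()


def bandwidth_adjusted_distance(distance: int, upstream_exp: int, max_upstream_exp: int) -> int:
    # BAD = D * (1 + Max_Upstream_Exp - Current_Upstream_Exp)
    return distance * (1 + max_upstream_exp - upstream_exp)


# Slide example, seen from node X: both upstream nodes total 2 x 10G + 4 x 40G.
upstream_bw = {"I": 2 * 10 + 4 * 40, "J": 2 * 10 + 4 * 40}
exponents = {node: bandwidth_exponent(bw) for node, bw in upstream_bw.items()}
max_exp = max(exponents.values())

D = 10  # hypothetical original distance
bad = {node: bandwidth_adjusted_distance(D, exp, max_exp) for node, exp in exponents.items()}

# Invented contrast case: an upstream node with only 100G in total.
bad_100g = bandwidth_adjusted_distance(D, bandwidth_exponent(100), max_exp)
```

With equal upstream bandwidth both routes keep distance D; the hypothetical 100G node falls one exponent lower (2^6 < 100 < 2^7), so its distance doubles and it attracts less traffic.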
  • 18. RIFT FEATURES SUMMARY - DATACENTER FABRIC: KEY REQUIREMENTS. Columns: Requirement, BGP (modified for DC), ISIS (modified for DC), RIFT. 01 Close to zero-touch provisioning; 02 Link discovery / automatic forming of trees / preventing cabling violations (⚠ ⚠); 03 Minimal amount of routes/information on ToRs (cost-optimized); 04 High degree of ECMP (BGP needs lots of knobs, memory, own-AS-path violations) (⚠); 05 Traffic engineering by next-hops, prefix modification; 06 See all links in the topology to support PCE/SR (⚠); 07 Carry opaque configuration data (key-value) efficiently (⚠); 08 Take a node out of production quickly and without disruption (overload); 09 Automatic disaggregation on failures to prevent black-holing; 10 Minimal blast radius on failures; 11 Fastest possible convergence on failures.
  • 19. INDUSTRY STATUS. Standardization: initiated by Antoni Przygienda (Juniper Networks); Standards Track Working Group Draft (I-D); base for further work toward RFC; https://tools.ietf.org/html/draft-ietf-rift-rift-06. Co-operation: joint work at the IETF WG (JNPR, CSCO, Nokia, Comcast); contact the authors, share your opinion; the packet data structures are public (GPB). Availability: RIFT on Python: https://github.com/brunorijsman/rift-python; RIFT trial code available from Juniper: https://www.juniper.net/us/en/dm/free-rift-trial/; production-ready Juniper code: Q4 2019. Relevant drafts: Policy-guided prefixes with RIFT: https://tools.ietf.org/html/draft-atlas-rift-pgp-01; RIFT YANG model: https://tools.ietf.org/html/draft-ietf-rift-yang-00; Segment Routing in Fat Trees (SRIFT): https://tools.ietf.org/html/draft-zzhang-rift-sr-01. [Figure: standards-track stages: individual, I-D, RFC, STD.]
  • 20. SUMMARY: RIFT PROTOCOL ADVANTAGES. Link-State and Distance Vector, take 'best of both': fastest possible convergence; automatic topology detection; minimal routes on ToRs; high degree of ECMP; fast de-commissioning of nodes. Leave 'not-so-good': excessive flooding; manual neighbor detection. Unique RIFT additions: zero-touch provisioning; automatic disaggregation on failure; minimal blast radius on failures; utilize all fabric paths without loops; support for non-ECMP paths; key-value store.
  • 21. THANK YOU nitinvig@juniper.net