SlideShare a Scribd company logo
RIFT:
ROUTING IN FAT TREES
(draft-przygienda-rift)
Krzysztof Grzegorz Szarkowicz
Juniper Networks, Solution Architect
•Problem
•Generic routing protocols have been pressed into service in data center fabrics
underlay and they are doing a suboptimal job
•How Can We Address It
•Crate a novel routing protocol specialized towards fabric underlay. Protocol
that requires minimum configuration, prevents mis-cabling, optimizes
flooding, holds default route only on leafs and automatically disaggregates on
failures
•Standardization status
•Protocol specification published (draft-przygienda-rift)
•Anyone can built protocol stack based on the published specification
RIFT: Elevator Pitch of a “Self-Driving Fabric”
•Advantages
•Topology elements - nodes, links, prefixes
•Each node originates packets with its elements
•Packets are ”flooded” across the network
•”Newest” version wins
•Each node “sees” whole topology
•Each node “computes” reachability to everywhere
•Conversion is very fast
•Disadvantages
•Every link failure shakes whole network
•Flooding generates excessive load for large average connectivity
•Periodic re-flooding (refreshes)
Examples: OSPF, IS-IS, PNNI,
TRILL, RBridges
Link State and SPF = Distributed Computation
•Prefixes “gather” metric when passed along links
•Each node computes “single best” result and passes it on
(Add-Path added “multiple best” results )
•A node keeps all copies, otherwise it would have to
trigger “re-diffusion”
•Loop prevention is easy on strictly uniformly increasing
metric.
•Ideal for “policy” rather than “reachability”
•Scales when properly implemented to much higher # of
routes than Link-State
•Slow convergence
Examples: BGP, RIP, IGRP
Distance/Path Vector = Diffused Computation
•Link State
•Topology view  TE enabler
•Distance/Path Vector
•Every computation could enforce policy –
granular control – TE
•Both protocols types (LS and
Distance/Path Vector) are frequently
used in todays networks
time
time
Node 1
Node 0
Node 3
Node 2
Node 5
Node 4
Node 1
Node 0
Node 3
Node 2
Node 5
Node 4
computation
Update
tx-mission
Link State Convergence
Distance Vector Convergence
Link State vs Distance/Path Vector
•Clos and Fat-Tree topologies
•Current state of dynamic DC routing
•Dynamic DC routing requirements matrix
DC Fabric Routing: a Specialized Problem
•Clos offers well-understood blocking
probabilities
•Work done at AT&T (Bell Systems) in
1950s for crossbar scaling
•Fully connected CLOS is dense and
expensive
•Data centers today tend to be variations
of “folded Fat-Tree”:
•Input stages are same as output Stages
•CLOS w/ (m >= n)
FOLD
swap
2_Rx X 2_Tx
2 (2_Rx X 2_Tx)
Clos Topologies
•Several of large DC fabrics use E-BGP with IGP like features (RFC7938)
•”looping paths” (allow-as)
•“Relaxed Multi-Path ECMP”
•AS numbering schemes to control “path hunting” via policies
•AddPaths to support multi-homing, ECMP on EBGP
•Efforts to get around 65K ASes and limited private AS space
•Proprietary provisioning and configuration solutions, LLDP Extensions
•“Violations” of FSM like restart timers and minimum-route-advertisement timers
•Emerging Work for “Peer Auto-Discovery” and “SPF” Diametrically Opposite to BGP Design Principles
•Reliance on “Update Groups” ~ Peer Groups to Prevent Withdrawal and Path Hunting After Server Link
Failures
•Others run IGP (ISIS)
•Yet others run BGP over IGP (traditional routing architecture)
•Less than more successful attempts @ prefix summarization, micro- and black-holing
•Works better for single-tenant fabrics without LAN stretch or VM mobility
Current State of Affairs
Problem / Attempted Solution BGP modified for DC
(all kind of “mods”)
ISIS modified for DC
(RFC7356 + “mods”)
RIFT
Native DC
Link Discovery/Automatic Forming of Trees/Preventing Cabling Violations ⚠️ ⚠️
Minimal Amount of Routes/Information on ToRs
High Degree of ECMP (BGP needs lots knobs, memory, own-AS-path violations) and ideally
NEC and LFA
⚠️
Traffic Engineering by Next-Hops, Prefix Modifications
See All Links in Topology to Support PCE/SR ⚠️
Carry Opaque Configuration Data (Key-Value) Efficiently ⚠️
Take a Node out of Production Quickly and Without Disruption
Automatic Disaggregation on Failures to Prevent Black-Holing and Back-Hauling
Minimal Blast Radius on Failures (On Failure Smallest Possible Part of the Network
“Shakes”)
Fastest Possible Convergence on Failures
Simplest Initial Implementation
Dynamic DC Routing Requirements Breakdown (RFC7938+)
•General concept
•Automatic cabling constraints
•Automatic disaggregation on failures
•Automatic flooding reduction
•Other
“Just because the standard provides a cliff in front of you,
you are not necessarily required to jump off it.”
— Norman Diamond
RIFT: Novel Dynamic Routing Algorithm for Clos Underlay
•Link-State flood Up (North)
•Full topology and all prefixes @
top spine only
•Distance Vector down
•0/0 is sufficient to send traffic UP.
•More specific prefixes
•disaggregated in case of failure
•TE
•Flood reduction and automatic
dis-aggregation
In One Picture: Link-State Up, Distance Vector Down & Bounce
•Link Information Element
•POD #
•Level #
•Node ID
•Transported over well known m-cast address and port
•POD # == 0 “Any POD”
•Node derive POD from 1st Northbound neighbor it establish
adjacency.
•Auto-configuration
•Level # == 0 “Leaf”
Content
header
Major Ver.
Minor Ver.
System ID
Level
LIE
Name
Flood UDP port
Neighbor
POD#
Adjacency Formation (aka Hello)
Automatic rejection of adjacencies based
on minimum configuration
•A1 to B1 forbidden due to POD mismatch
•A0 to B1 forbidden due to POD mismatch
(A0 already formed A0-A1 even if POD
not configured on A0)
•B0 to C0 forbidden based on level
mismatch
Automatic Topology Constraints
•N-TIE := S-TIE + carries link-
state info
•attributes
•Detailed topology
•Not all fields always required
•Packing efficiency
•Contextual
•TIE Types
•Node TIE
•Prefix TIE
•PGPrefix TIE
•KeyValue TIE
TIE packet
TIE element
NodeTIEElement
Prefixes
KeyValueTIEElement
Level
capabilities
flags
Set of NodeNeighborsTIEElement
Neighbour System ID
Level
Metric
Set LinkID
TIE Element
TIE header
TIE ID
Seq. Number
Lifetime
Direction (N/S)
Originator System ID
TIE Type
TIE Nr. Type
header
Major Ver.
Minor Ver.
System ID
Level
Topology Information Element
•Leafs
•Only 0/0 to connected
level 1 spines.
•Spine 111 [112]
•0/0 via S31, S32 [S33,S34]
•Pfx A via L101
•Pfx B via L102
•Spine 211 [212]
•0/0 via S31, S32 [S33,S34]
•Pfx C via L201
•Pfx D via L202
•Spine 31, 32, 33, 34
•Pfx A via S111, S112
•Pfx B via S111, S112
•Pfx C via S211, S212
•Pfx D va S211, S212
S-TIE(111)
Neigh: 31,
32, 101, 102
Pfx: 0/0
111
N-TIE(201)
Pfx C
N-TIE(202)
Pfx D
N-TIE(212)
212
201 202
31
aggregation
localization
34
Pfx DPfx CPfx BPfx A
S-TIE(31)
Neigh:
111, 211
Pfx: 0/0
TIE(x) – TIE originated
at node X.
211112
101 102
32
Routing in steady state – basics (1)
33
1) Spine @ level X [S31] sent S-TIE to
node @ level (X-1) [S111]
2) Node @ level (X-1) [S111]send S-TIE
up to all neighbors [S32]
3) Spine that received bounced S-TIA
[S32] compares their neighbors w/
one in S-TIE
4) Discovered “Equal connectivity
group”
 Disaggregation
 Flood reduction
111 212
201 212
31 34
Pfx DPfx CPfx BPfx A
S-TIE(31)
Neigh:
111, 211
Pfx: 0/0 1
2
3
112 211
32
101 102
S-TIE reflection
33
101 102
1) Spine X [S32] receive bounced S-TIE(31)
2) Discovery
• Neighbor not matches – one [S211] is missing in S-
TIE(S31)
• Spine Y [S31] has no connectivity to some pfx (pfx: C, D).
• As node in lower level (Level 1) use 0/0 – risk of black
hole/losses.
3) Spine X [S32] originate new S-TIE(32) w/
disaggregated prefixes (C,D)
Note:
Nodes on lower level (Level 1) get more specific route.
Nodes further down [L101, L102] still can use 0/0 only
111 212
201 202
31
Pfx DPfx CPfx BPfx A
S-TIE(31)
Neigh:
111
Pfx: 0/0
2
1
3
32
Pfx A S111
Pfx B S111
Pfx C S211
Pfx D S211
S-TIE(33)
Neigh: 111,
211
Pfx: 0/0; C, D
Pfx 0/0  S31, S32
Pfx C S32
Pfx D S32
Pfx 0/0  S31,
S32
112 211
Routing in failure – automatic disaggregation
•N-port spine switch
•Level 2 spine – all N ports are
southbound
•Level 1 spine
•N/2 ports are Southbound
•N/2 ports are Nothbound
•Link-State Flooding become over-kill
Highly mesh topology
•A lot of redundant information
•Known problem in Link-State
protocols in Highly meshed
networks
Flooding w/o Reduction
•Each “B” node computes from reflected south
representation of other “B” nodes
•Set of South neighbors
•Set of North neighbors
•Nodes having both sets matching consider
themselves “Flood Reduction Group” and load-
balance flooding
•Fully distributed, unsynchronized election
•In this example case B1 & B2
•Each node chooses based on hash computation
which other nodes’ Information it forwards on first
flood attempt
•Similar to DF election in EVPN, or DR in PIM
Flooding with Reduction
Summary of RIFT Advantages
• Fastest possible convergence
• Automatic detection of topology
• Minimal routes on TORs
• High degree of ECMP
• Minimal blast radius on failures
• Fast De-comissioning of Nodes
• Reduced flooding
• Automatic neighbor detection
• Automatic disaggregation on failures
• Key-Value Store
• Advantages of Link-State and
Distance Vector
• No disadvantages of Link-State
or Distance Vector
• And some neither can do
•Standardization
•Individual contribution to IETF Routing WG
•Base for further work toward I-D
•Implementation
•Prototype reference code exist
•PoC Test runs, performance data collected
•Cooperation
•Join work at IETF WG
•Contact authors, share opinion
•The data structures for packet are public (GPB) –
see draft-przygienda-rift
I-D RFC STD
individual
Summary of RIFT Advantages
•Packaged Pre-Production Code
•Flooding/Hello/SPF/RIB/Flood Reduction
•Optional REDIS Analytics for Link-State DB, Routes and Statistics Implemented
•Very Heavily Multi-Threaded to Fully Utilize Modern CPU Architectures
•Memory and Thread Safe Unlike C
•Carries Its Own Topology Environment for Heavy-Duty Testing
•Easily Portable Across All Modern OS Platforms (OSX, Linux, FreeBSD)
Early Juniper Implementation
«Junos Proxy Daemon»
RIFT
«JET APIs»
RPD
«OS»
JunOS BSD
BSD Sockets
Implementation
External System
RIB
External System
Interface Manager
«Telemetry Store»
Redis
External System
Configuration
External System
Log Sink
External System
«CLI»
FILE«File»
YAML Config
Configurator
«RUST»
RIFT
RIFT System Integration
Linux & JunOS
Conceptual Design
2017 Przygienda
«deamon»
SYSLOG
MGD
«JunOS ODL/DDL/CLI»
«xor»
«JunOS ODL/DDL/CLI»
«JET»
Interface
Status
«xor»
«NETLINK»
«JET»
«NETLINK»
«deamon»
SYSLOG
«NETLINK»
Sockets
«JET»
Route &
Nexthop
API
«thread»
RIB
«thread»
Interface Manager
Key-Value
Statistics &
LSDB
syslog
«thread»
Node Instance
«thread»
Thrift Server
parses
CLI
Commands
parses
initial
topology
«testing only»
Topology
«thrift service»
Internal Protocol
Elements State
«NETLINK»
«JET»
Interface
Status
«thrift service»
Internal Protocol
Elements Configuration
«RUST Channel»
Configuration
•Implemented as
external application
•JET APIs
•BSD sockets
•Thrift service
•RIFT 0.1.3 binary
package
•MAC, Linux, FreeBSD
Early Juniper Implementation
Indicative Numbers for RIFT: Topology
•On MacPro Laptop (Low Power I7 with 4 Real Cores)
•21 Nodes, 60 links, 600 Prefixes
•Convergence From Cold Start
•~ 4x Faster Than Flat IGP (~ 300 Millisecs)
•~ 3x Less Flooding Than Flat IGP
•Single Link Flap at Super-Spine
•~ 2x Faster Than Flat IGP
•< 50 Millisecs Convergence (35 avg, 70 Max)
Indicative Numbers for RIFT Implementation
KRZYSZTOF GRZEGORZ SZARKOWICZ
JUNIPER NETWORKS, SOLUTION ARCHITECT
kszarkowicz@juniper.net

More Related Content

What's hot

Introduction to OpenFlow
Introduction to OpenFlowIntroduction to OpenFlow
Introduction to OpenFlow
Joel W. King
 
Scaling Networks with Segment Routing
Scaling Networks with Segment RoutingScaling Networks with Segment Routing
Scaling Networks with Segment Routing
APNIC
 
MPLS WC 2014 Segment Routing TI-LFA Fast ReRoute
MPLS WC 2014  Segment Routing TI-LFA Fast ReRouteMPLS WC 2014  Segment Routing TI-LFA Fast ReRoute
MPLS WC 2014 Segment Routing TI-LFA Fast ReRoute
Bruno Decraene
 
SDN Traffic Engineering, A Natural Evolution
SDN Traffic Engineering, A Natural EvolutionSDN Traffic Engineering, A Natural Evolution
SDN Traffic Engineering, A Natural Evolution
APNIC
 
Segment Routing for Dummies
Segment Routing for DummiesSegment Routing for Dummies
Segment Routing for Dummies
Gary Jan
 
Segment Routing Advanced Use Cases - Cisco Live 2016 USA
Segment Routing Advanced Use Cases - Cisco Live 2016 USASegment Routing Advanced Use Cases - Cisco Live 2016 USA
Segment Routing Advanced Use Cases - Cisco Live 2016 USA
Jose Liste
 
Onos summit roadmap dec 9
Onos summit  roadmap dec 9Onos summit  roadmap dec 9
Onos summit roadmap dec 9
ONOS Project
 
How LinkedIn used TCP Anycast to make the site faster
How LinkedIn used TCP Anycast to make the site fasterHow LinkedIn used TCP Anycast to make the site faster
How LinkedIn used TCP Anycast to make the site faster
Shawn Zandi
 
The Segment Routing Architecture (IEEE Globecom 2015)
The Segment Routing Architecture (IEEE Globecom 2015)The Segment Routing Architecture (IEEE Globecom 2015)
The Segment Routing Architecture (IEEE Globecom 2015)
nagendranainar
 
OTV PPT by NETWORKERS HOME
OTV PPT by NETWORKERS HOMEOTV PPT by NETWORKERS HOME
OTV PPT by NETWORKERS HOME
networkershome
 
Osdc2014 openstack networking yves_fauser
Osdc2014 openstack networking yves_fauserOsdc2014 openstack networking yves_fauser
Osdc2014 openstack networking yves_fauser
yfauser
 
Fabric Path PPT by NETWORKERS HOME
Fabric Path PPT by NETWORKERS HOMEFabric Path PPT by NETWORKERS HOME
Fabric Path PPT by NETWORKERS HOME
networkershome
 
Segment Routing Lab
Segment Routing Lab Segment Routing Lab
Segment Routing Lab
Cisco Canada
 
FEX -PPT By NETWORKERS HOME
FEX -PPT By NETWORKERS HOMEFEX -PPT By NETWORKERS HOME
FEX -PPT By NETWORKERS HOME
networkershome
 
SDN - OpenFlow protocol
SDN - OpenFlow protocolSDN - OpenFlow protocol
SDN - OpenFlow protocol
Ulf Marxen
 
Open Flow Tutorial Series - Set 1
Open Flow Tutorial Series - Set 1Open Flow Tutorial Series - Set 1
Open Flow Tutorial Series - Set 1
Radhika Hirannaiah
 
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use CasesLayer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
abhijit2511
 
Vla ns
Vla nsVla ns
Vla ns
UDLA
 
Segment Routing
Segment RoutingSegment Routing
Segment Routing
APNIC
 
OpenFlow Extensions
OpenFlow ExtensionsOpenFlow Extensions
OpenFlow Extensions
US-Ignite
 

What's hot (20)

Introduction to OpenFlow
Introduction to OpenFlowIntroduction to OpenFlow
Introduction to OpenFlow
 
Scaling Networks with Segment Routing
Scaling Networks with Segment RoutingScaling Networks with Segment Routing
Scaling Networks with Segment Routing
 
MPLS WC 2014 Segment Routing TI-LFA Fast ReRoute
MPLS WC 2014  Segment Routing TI-LFA Fast ReRouteMPLS WC 2014  Segment Routing TI-LFA Fast ReRoute
MPLS WC 2014 Segment Routing TI-LFA Fast ReRoute
 
SDN Traffic Engineering, A Natural Evolution
SDN Traffic Engineering, A Natural EvolutionSDN Traffic Engineering, A Natural Evolution
SDN Traffic Engineering, A Natural Evolution
 
Segment Routing for Dummies
Segment Routing for DummiesSegment Routing for Dummies
Segment Routing for Dummies
 
Segment Routing Advanced Use Cases - Cisco Live 2016 USA
Segment Routing Advanced Use Cases - Cisco Live 2016 USASegment Routing Advanced Use Cases - Cisco Live 2016 USA
Segment Routing Advanced Use Cases - Cisco Live 2016 USA
 
Onos summit roadmap dec 9
Onos summit  roadmap dec 9Onos summit  roadmap dec 9
Onos summit roadmap dec 9
 
How LinkedIn used TCP Anycast to make the site faster
How LinkedIn used TCP Anycast to make the site fasterHow LinkedIn used TCP Anycast to make the site faster
How LinkedIn used TCP Anycast to make the site faster
 
The Segment Routing Architecture (IEEE Globecom 2015)
The Segment Routing Architecture (IEEE Globecom 2015)The Segment Routing Architecture (IEEE Globecom 2015)
The Segment Routing Architecture (IEEE Globecom 2015)
 
OTV PPT by NETWORKERS HOME
OTV PPT by NETWORKERS HOMEOTV PPT by NETWORKERS HOME
OTV PPT by NETWORKERS HOME
 
Osdc2014 openstack networking yves_fauser
Osdc2014 openstack networking yves_fauserOsdc2014 openstack networking yves_fauser
Osdc2014 openstack networking yves_fauser
 
Fabric Path PPT by NETWORKERS HOME
Fabric Path PPT by NETWORKERS HOMEFabric Path PPT by NETWORKERS HOME
Fabric Path PPT by NETWORKERS HOME
 
Segment Routing Lab
Segment Routing Lab Segment Routing Lab
Segment Routing Lab
 
FEX -PPT By NETWORKERS HOME
FEX -PPT By NETWORKERS HOMEFEX -PPT By NETWORKERS HOME
FEX -PPT By NETWORKERS HOME
 
SDN - OpenFlow protocol
SDN - OpenFlow protocolSDN - OpenFlow protocol
SDN - OpenFlow protocol
 
Open Flow Tutorial Series - Set 1
Open Flow Tutorial Series - Set 1Open Flow Tutorial Series - Set 1
Open Flow Tutorial Series - Set 1
 
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use CasesLayer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
Layer 123 SDN World Congress OpenDaylight Service Function Chaining Use Cases
 
Vla ns
Vla nsVla ns
Vla ns
 
Segment Routing
Segment RoutingSegment Routing
Segment Routing
 
OpenFlow Extensions
OpenFlow ExtensionsOpenFlow Extensions
OpenFlow Extensions
 

Similar to PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing

Routing In Fat Trees
Routing In Fat TreesRouting In Fat Trees
Routing In Fat Trees
APNIC
 
Dynamic Routing All Algorithms, Working And Basics
Dynamic Routing All Algorithms, Working And BasicsDynamic Routing All Algorithms, Working And Basics
Dynamic Routing All Algorithms, Working And Basics
Harsh Mehta
 
Signal Integrity - A Crash Course [R Lott]
Signal Integrity - A Crash Course [R Lott]Signal Integrity - A Crash Course [R Lott]
Signal Integrity - A Crash Course [R Lott]
Ryan Lott
 
Multi protocol label switching (mpls)
Multi protocol label switching (mpls)Multi protocol label switching (mpls)
Multi protocol label switching (mpls)
Online
 
Networks-part17-Bridges-RP1.pptjwhwhsjshh
Networks-part17-Bridges-RP1.pptjwhwhsjshhNetworks-part17-Bridges-RP1.pptjwhwhsjshh
Networks-part17-Bridges-RP1.pptjwhwhsjshh
VijayKaran7
 
400-101 CCIE Routing and Switching IT Certification
400-101 CCIE Routing and Switching IT Certification400-101 CCIE Routing and Switching IT Certification
400-101 CCIE Routing and Switching IT Certification
wrouthae
 
RIFT A New Approach to Building DC Fabrics
RIFT A New Approach to Building DC FabricsRIFT A New Approach to Building DC Fabrics
RIFT A New Approach to Building DC Fabrics
MyNOG
 
OpenSDWN: Programmatic control over home and enterprise Wi-Fi
OpenSDWN: Programmatic control over home and enterprise Wi-FiOpenSDWN: Programmatic control over home and enterprise Wi-Fi
OpenSDWN: Programmatic control over home and enterprise Wi-Fi
Julius Schulz-Zander
 
Disaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High AvailabilityDisaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High Availability
Open Networking Summit
 
Tutorial: Network State Awareness Troubleshooting
Tutorial: Network State Awareness TroubleshootingTutorial: Network State Awareness Troubleshooting
Tutorial: Network State Awareness Troubleshooting
APNIC
 
Chap.1 ethernet introduction
Chap.1 ethernet introductionChap.1 ethernet introduction
Chap.1 ethernet introduction
東原 李
 
Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralel...
Tendencias de Uso y Diseño de Redes de Interconexión  en Computadores Paralel...Tendencias de Uso y Diseño de Redes de Interconexión  en Computadores Paralel...
Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralel...
Facultad de Informática UCM
 
Pilot protection
Pilot protectionPilot protection
Pilot protection
Suraj Shandilya
 
Lan overview
Lan overviewLan overview
Lan overview
Shyam Gupta
 
Dc ch08 : local area network overview
Dc ch08 : local area network overviewDc ch08 : local area network overview
Dc ch08 : local area network overview
Syaiful Ahdan
 
CCNA- part 8 switch
CCNA- part 8 switchCCNA- part 8 switch
Packet Switching Technique in Computer Network
Packet Switching Technique in Computer NetworkPacket Switching Technique in Computer Network
Packet Switching Technique in Computer Network
NiharikaDubey17
 
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTSvlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
Niranjan Reddy
 
To HDR and Beyond
To HDR and BeyondTo HDR and Beyond
To HDR and Beyond
inside-BigData.com
 
A Flexible Router Architecture for 3D Network-on-Chips
A Flexible Router Architecture for 3D Network-on-ChipsA Flexible Router Architecture for 3D Network-on-Chips
A Flexible Router Architecture for 3D Network-on-Chips
Mostafa Khamis
 

Similar to PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing (20)

Routing In Fat Trees
Routing In Fat TreesRouting In Fat Trees
Routing In Fat Trees
 
Dynamic Routing All Algorithms, Working And Basics
Dynamic Routing All Algorithms, Working And BasicsDynamic Routing All Algorithms, Working And Basics
Dynamic Routing All Algorithms, Working And Basics
 
Signal Integrity - A Crash Course [R Lott]
Signal Integrity - A Crash Course [R Lott]Signal Integrity - A Crash Course [R Lott]
Signal Integrity - A Crash Course [R Lott]
 
Multi protocol label switching (mpls)
Multi protocol label switching (mpls)Multi protocol label switching (mpls)
Multi protocol label switching (mpls)
 
Networks-part17-Bridges-RP1.pptjwhwhsjshh
Networks-part17-Bridges-RP1.pptjwhwhsjshhNetworks-part17-Bridges-RP1.pptjwhwhsjshh
Networks-part17-Bridges-RP1.pptjwhwhsjshh
 
400-101 CCIE Routing and Switching IT Certification
400-101 CCIE Routing and Switching IT Certification400-101 CCIE Routing and Switching IT Certification
400-101 CCIE Routing and Switching IT Certification
 
RIFT A New Approach to Building DC Fabrics
RIFT A New Approach to Building DC FabricsRIFT A New Approach to Building DC Fabrics
RIFT A New Approach to Building DC Fabrics
 
OpenSDWN: Programmatic control over home and enterprise Wi-Fi
OpenSDWN: Programmatic control over home and enterprise Wi-FiOpenSDWN: Programmatic control over home and enterprise Wi-Fi
OpenSDWN: Programmatic control over home and enterprise Wi-Fi
 
Disaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High AvailabilityDisaggregated Networking - The Drivers, the Software & The High Availability
Disaggregated Networking - The Drivers, the Software & The High Availability
 
Tutorial: Network State Awareness Troubleshooting
Tutorial: Network State Awareness TroubleshootingTutorial: Network State Awareness Troubleshooting
Tutorial: Network State Awareness Troubleshooting
 
Chap.1 ethernet introduction
Chap.1 ethernet introductionChap.1 ethernet introduction
Chap.1 ethernet introduction
 
Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralel...
Tendencias de Uso y Diseño de Redes de Interconexión  en Computadores Paralel...Tendencias de Uso y Diseño de Redes de Interconexión  en Computadores Paralel...
Tendencias de Uso y Diseño de Redes de Interconexión en Computadores Paralel...
 
Pilot protection
Pilot protectionPilot protection
Pilot protection
 
Lan overview
Lan overviewLan overview
Lan overview
 
Dc ch08 : local area network overview
Dc ch08 : local area network overviewDc ch08 : local area network overview
Dc ch08 : local area network overview
 
CCNA- part 8 switch
CCNA- part 8 switchCCNA- part 8 switch
CCNA- part 8 switch
 
Packet Switching Technique in Computer Network
Packet Switching Technique in Computer NetworkPacket Switching Technique in Computer Network
Packet Switching Technique in Computer Network
 
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTSvlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
vlsi dESIGN lETURE NOTES FOR ENGINEERING STUDENTS
 
To HDR and Beyond
To HDR and BeyondTo HDR and Beyond
To HDR and Beyond
 
A Flexible Router Architecture for 3D Network-on-Chips
A Flexible Router Architecture for 3D Network-on-ChipsA Flexible Router Architecture for 3D Network-on-Chips
A Flexible Router Architecture for 3D Network-on-Chips
 

Recently uploaded

Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
Nada Hikmah
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
Prakhyath Rai
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
RamonNovais6
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
Madan Karki
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
bijceesjournal
 
Welding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdfWelding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdf
AjmalKhan50578
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
GauravCar
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
VANDANAMOHANGOUDA
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
ElakkiaU
 
Data Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptxData Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptx
ramrag33
 
Rainfall intensity duration frequency curve statistical analysis and modeling...
Rainfall intensity duration frequency curve statistical analysis and modeling...Rainfall intensity duration frequency curve statistical analysis and modeling...
Rainfall intensity duration frequency curve statistical analysis and modeling...
bijceesjournal
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
IJECEIAES
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
architagupta876
 
An improved modulation technique suitable for a three level flying capacitor ...
An improved modulation technique suitable for a three level flying capacitor ...An improved modulation technique suitable for a three level flying capacitor ...
An improved modulation technique suitable for a three level flying capacitor ...
IJECEIAES
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
Atif Razi
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
gowrishankartb2005
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
IJECEIAES
 
Design and optimization of ion propulsion drone
Design and optimization of ion propulsion droneDesign and optimization of ion propulsion drone
Design and optimization of ion propulsion drone
bjmsejournal
 
integral complex analysis chapter 06 .pdf
integral complex analysis chapter 06 .pdfintegral complex analysis chapter 06 .pdf
integral complex analysis chapter 06 .pdf
gaafergoudaay7aga
 

Recently uploaded (20)

Curve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods RegressionCurve Fitting in Numerical Methods Regression
Curve Fitting in Numerical Methods Regression
 
Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...Software Engineering and Project Management - Introduction, Modeling Concepts...
Software Engineering and Project Management - Introduction, Modeling Concepts...
 
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURSCompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
CompEx~Manual~1210 (2).pdf COMPEX GAS AND VAPOURS
 
Manufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptxManufacturing Process of molasses based distillery ppt.pptx
Manufacturing Process of molasses based distillery ppt.pptx
 
Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...Comparative analysis between traditional aquaponics and reconstructed aquapon...
Comparative analysis between traditional aquaponics and reconstructed aquapon...
 
Welding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdfWelding Metallurgy Ferrous Materials.pdf
Welding Metallurgy Ferrous Materials.pdf
 
artificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptxartificial intelligence and data science contents.pptx
artificial intelligence and data science contents.pptx
 
ITSM Integration with MuleSoft.pptx
ITSM  Integration with MuleSoft.pptxITSM  Integration with MuleSoft.pptx
ITSM Integration with MuleSoft.pptx
 
An Introduction to the Compiler Designss
An Introduction to the Compiler DesignssAn Introduction to the Compiler Designss
An Introduction to the Compiler Designss
 
Data Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptxData Control Language.pptx Data Control Language.pptx
Data Control Language.pptx Data Control Language.pptx
 
Rainfall intensity duration frequency curve statistical analysis and modeling...
Rainfall intensity duration frequency curve statistical analysis and modeling...Rainfall intensity duration frequency curve statistical analysis and modeling...
Rainfall intensity duration frequency curve statistical analysis and modeling...
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
 
Embedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoringEmbedded machine learning-based road conditions and driving behavior monitoring
Embedded machine learning-based road conditions and driving behavior monitoring
 
AI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptxAI assisted telemedicine KIOSK for Rural India.pptx
AI assisted telemedicine KIOSK for Rural India.pptx
 
An improved modulation technique suitable for a three level flying capacitor ...
An improved modulation technique suitable for a three level flying capacitor ...An improved modulation technique suitable for a three level flying capacitor ...
An improved modulation technique suitable for a three level flying capacitor ...
 
Applications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdfApplications of artificial Intelligence in Mechanical Engineering.pdf
Applications of artificial Intelligence in Mechanical Engineering.pdf
 
Material for memory and display system h
Material for memory and display system hMaterial for memory and display system h
Material for memory and display system h
 
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
Redefining brain tumor segmentation: a cutting-edge convolutional neural netw...
 
Design and optimization of ion propulsion drone
Design and optimization of ion propulsion droneDesign and optimization of ion propulsion drone
Design and optimization of ion propulsion drone
 
integral complex analysis chapter 06 .pdf
integral complex analysis chapter 06 .pdfintegral complex analysis chapter 06 .pdf
integral complex analysis chapter 06 .pdf
 

PLNOG19 - Krzysztof Szarkowicz - RIFT i nowe pomysły na routing

  • 1. RIFT: ROUTING IN FAT TREES (draft-przygienda-rift) Krzysztof Grzegorz Szarkowicz Juniper Networks, Solution Architect
  • 2. •Problem •Generic routing protocols have been pressed into service in data center fabrics underlay and they are doing a suboptimal job •How Can We Address It •Crate a novel routing protocol specialized towards fabric underlay. Protocol that requires minimum configuration, prevents mis-cabling, optimizes flooding, holds default route only on leafs and automatically disaggregates on failures •Standardization status •Protocol specification published (draft-przygienda-rift) •Anyone can built protocol stack based on the published specification RIFT: Elevator Pitch of a “Self-Driving Fabric”
  • 3. •Advantages •Topology elements - nodes, links, prefixes •Each node originates packets with its elements •Packets are ”flooded” across the network •”Newest” version wins •Each node “sees” whole topology •Each node “computes” reachability to everywhere •Conversion is very fast •Disadvantages •Every link failure shakes whole network •Flooding generates excessive load for large average connectivity •Periodic re-flooding (refreshes) Examples: OSPF, IS-IS, PNNI, TRILL, RBridges Link State and SPF = Distributed Computation
  • 4. •Prefixes “gather” metric when passed along links •Each node computes “single best” result and passes it on (Add-Path added “multiple best” results ) •A node keeps all copies, otherwise it would have to trigger “re-diffusion” •Loop prevention is easy on strictly uniformly increasing metric. •Ideal for “policy” rather than “reachability” •Scales when properly implemented to much higher # of routes than Link-State •Slow convergence Examples: BGP, RIP, IGRP Distance/Path Vector = Diffused Computation
  • 5. •Link State •Topology view  TE enabler •Distance/Path Vector •Every computation could enforce policy – granular control – TE •Both protocols types (LS and Distance/Path Vector) are frequently used in todays networks time time Node 1 Node 0 Node 3 Node 2 Node 5 Node 4 Node 1 Node 0 Node 3 Node 2 Node 5 Node 4 computation Update tx-mission Link State Convergence Distance Vector Convergence Link State vs Distance/Path Vector
  • 6. •Clos and Fat-Tree topologies •Current state of dynamic DC routing •Dynamic DC routing requirements matrix DC Fabric Routing: a Specialized Problem
  • 7. •Clos offers well-understood blocking probabilities •Work done at AT&T (Bell Systems) in 1950s for crossbar scaling •Fully connected CLOS is dense and expensive •Data centers today tend to be variations of “folded Fat-Tree”: •Input stages are same as output Stages •CLOS w/ (m >= n) FOLD swap 2_Rx X 2_Tx 2 (2_Rx X 2_Tx) Clos Topologies
  • 8. •Several of large DC fabrics use E-BGP with IGP like features (RFC7938) •”looping paths” (allow-as) •“Relaxed Multi-Path ECMP” •AS numbering schemes to control “path hunting” via policies •AddPaths to support multi-homing, ECMP on EBGP •Efforts to get around 65K ASes and limited private AS space •Proprietary provisioning and configuration solutions, LLDP Extensions •“Violations” of FSM like restart timers and minimum-route-advertisement timers •Emerging Work for “Peer Auto-Discovery” and “SPF” Diametrically Opposite to BGP Design Principles •Reliance on “Update Groups” ~ Peer Groups to Prevent Withdrawal and Path Hunting After Server Link Failures •Others run IGP (ISIS) •Yet others run BGP over IGP (traditional routing architecture) •Less than more successful attempts @ prefix summarization, micro- and black-holing •Works better for single-tenant fabrics without LAN stretch or VM mobility Current State of Affairs
  • 9. Problem / Attempted Solution BGP modified for DC (all kind of “mods”) ISIS modified for DC (RFC7356 + “mods”) RIFT Native DC Link Discovery/Automatic Forming of Trees/Preventing Cabling Violations ⚠️ ⚠️ Minimal Amount of Routes/Information on ToRs High Degree of ECMP (BGP needs lots knobs, memory, own-AS-path violations) and ideally NEC and LFA ⚠️ Traffic Engineering by Next-Hops, Prefix Modifications See All Links in Topology to Support PCE/SR ⚠️ Carry Opaque Configuration Data (Key-Value) Efficiently ⚠️ Take a Node out of Production Quickly and Without Disruption Automatic Disaggregation on Failures to Prevent Black-Holing and Back-Hauling Minimal Blast Radius on Failures (On Failure Smallest Possible Part of the Network “Shakes”) Fastest Possible Convergence on Failures Simplest Initial Implementation Dynamic DC Routing Requirements Breakdown (RFC7938+)
  • 10. •General concept •Automatic cabling constraints •Automatic disaggregation on failures •Automatic flooding reduction •Other “Just because the standard provides a cliff in front of you, you are not necessarily required to jump off it.” — Norman Diamond RIFT: Novel Dynamic Routing Algorithm for Clos Underlay
  • 11. •Link-State flood Up (North) •Full topology and all prefixes @ top spine only •Distance Vector down •0/0 is sufficient to send traffic UP. •More specific prefixes •disaggregated in case of failure •TE •Flood reduction and automatic dis-aggregation In One Picture: Link-State Up, Distance Vector Down & Bounce
  • 12. •Link Information Element •POD # •Level # •Node ID •Transported over well known m-cast address and port •POD # == 0 “Any POD” •Node derive POD from 1st Northbound neighbor it establish adjacency. •Auto-configuration •Level # == 0 “Leaf” Content header Major Ver. Minor Ver. System ID Level LIE Name Flood UDP port Neighbor POD# Adjacency Formation (aka Hello)
  • 13. Automatic rejection of adjacencies based on minimum configuration •A1 to B1 forbidden due to POD mismatch •A0 to B1 forbidden due to POD mismatch (A0 already formed A0-A1 even if POD not configured on A0) •B0 to C0 forbidden based on level mismatch Automatic Topology Constraints
  • 14. •N-TIE := S-TIE + carries link- state info •attributes •Detailed topology •Not all fields always required •Packing efficiency •Contextual •TIE Types •Node TIE •Prefix TIE •PGPrefix TIE •KeyValue TIE TIE packet TIE element NodeTIEElement Prefixes KeyValueTIEElement Level capabilities flags Set of NodeNeighborsTIEElement Neighbour System ID Level Metric Set LinkID TIE Element TIE header TIE ID Seq. Number Lifetime Direction (N/S) Originator System ID TIE Type TIE Nr. Type header Major Ver. Minor Ver. System ID Level Topology Information Element
  • 15. •Leafs •Only 0/0 to connected level 1 spines. •Spine 111 [112] •0/0 via S31, S32 [S33,S34] •Pfx A via L101 •Pfx B via L102 •Spine 211 [212] •0/0 via S31, S32 [S33,S34] •Pfx C via L201 •Pfx D via L202 •Spine 31, 32, 33, 34 •Pfx A via S111, S112 •Pfx B via S111, S112 •Pfx C via S211, S212 •Pfx D va S211, S212 S-TIE(111) Neigh: 31, 32, 101, 102 Pfx: 0/0 111 N-TIE(201) Pfx C N-TIE(202) Pfx D N-TIE(212) 212 201 202 31 aggregation localization 34 Pfx DPfx CPfx BPfx A S-TIE(31) Neigh: 111, 211 Pfx: 0/0 TIE(x) – TIE originated at node X. 211112 101 102 32 Routing in steady state – basics (1) 33
  • 16. 1) Spine @ level X [S31] sent S-TIE to node @ level (X-1) [S111] 2) Node @ level (X-1) [S111]send S-TIE up to all neighbors [S32] 3) Spine that received bounced S-TIA [S32] compares their neighbors w/ one in S-TIE 4) Discovered “Equal connectivity group”  Disaggregation  Flood reduction 111 212 201 212 31 34 Pfx DPfx CPfx BPfx A S-TIE(31) Neigh: 111, 211 Pfx: 0/0 1 2 3 112 211 32 101 102 S-TIE reflection 33
  • 17. 101 102 1) Spine X [S32] receive bounced S-TIE(31) 2) Discovery • Neighbor not matches – one [S211] is missing in S- TIE(S31) • Spine Y [S31] has no connectivity to some pfx (pfx: C, D). • As node in lower level (Level 1) use 0/0 – risk of black hole/losses. 3) Spine X [S32] originate new S-TIE(32) w/ disaggregated prefixes (C,D) Note: Nodes on lower level (Level 1) get more specific route. Nodes further down [L101, L102] still can use 0/0 only 111 212 201 202 31 Pfx DPfx CPfx BPfx A S-TIE(31) Neigh: 111 Pfx: 0/0 2 1 3 32 Pfx A S111 Pfx B S111 Pfx C S211 Pfx D S211 S-TIE(33) Neigh: 111, 211 Pfx: 0/0; C, D Pfx 0/0  S31, S32 Pfx C S32 Pfx D S32 Pfx 0/0  S31, S32 112 211 Routing in failure – automatic disaggregation
  • 18. •N-port spine switch •Level 2 spine – all N ports are southbound •Level 1 spine •N/2 ports are Southbound •N/2 ports are Nothbound •Link-State Flooding become over-kill Highly mesh topology
  • 19. •A lot of redundant information •Known problem in Link-State protocols in Highly meshed networks Flooding w/o Reduction
  • 20. •Each “B” node computes from reflected south representation of other “B” nodes •Set of South neighbors •Set of North neighbors •Nodes having both sets matching consider themselves “Flood Reduction Group” and load- balance flooding •Fully distributed, unsynchronized election •In this example case B1 & B2 •Each node chooses based on hash computation which other nodes’ Information it forwards on first flood attempt •Similar to DF election in EVPN, or DR in PIM Flooding with Reduction
  • 21. Summary of RIFT Advantages • Fastest possible convergence • Automatic detection of topology • Minimal routes on TORs • High degree of ECMP • Minimal blast radius on failures • Fast De-comissioning of Nodes • Reduced flooding • Automatic neighbor detection • Automatic disaggregation on failures • Key-Value Store • Advantages of Link-State and Distance Vector • No disadvantages of Link-State or Distance Vector • And some neither can do
  • 22. •Standardization •Individual contribution to IETF Routing WG •Base for further work toward I-D •Implementation •Prototype reference code exist •PoC Test runs, performance data collected •Cooperation •Join work at IETF WG •Contact authors, share opinion •The data structures for packet are public (GPB) – see draft-przygienda-rift I-D RFC STD individual Summary of RIFT Advantages
  • 23. •Packaged Pre-Production Code •Flooding/Hello/SPF/RIB/Flood Reduction •Optional REDIS Analytics for Link-State DB, Routes and Statistics Implemented •Very Heavily Multi-Threaded to Fully Utilize Modern CPU Architectures •Memory and Thread Safe Unlike C •Carries Its Own Topology Environment for Heavy-Duty Testing •Easily Portable Across All Modern OS Platforms (OSX, Linux, FreeBSD) Early Juniper Implementation
  • 24. «Junos Proxy Daemon» RIFT «JET APIs» RPD «OS» JunOS BSD BSD Sockets Implementation External System RIB External System Interface Manager «Telemetry Store» Redis External System Configuration External System Log Sink External System «CLI» FILE«File» YAML Config Configurator «RUST» RIFT RIFT System Integration Linux & JunOS Conceptual Design 2017 Przygienda «deamon» SYSLOG MGD «JunOS ODL/DDL/CLI» «xor» «JunOS ODL/DDL/CLI» «JET» Interface Status «xor» «NETLINK» «JET» «NETLINK» «deamon» SYSLOG «NETLINK» Sockets «JET» Route & Nexthop API «thread» RIB «thread» Interface Manager Key-Value Statistics & LSDB syslog «thread» Node Instance «thread» Thrift Server parses CLI Commands parses initial topology «testing only» Topology «thrift service» Internal Protocol Elements State «NETLINK» «JET» Interface Status «thrift service» Internal Protocol Elements Configuration «RUST Channel» Configuration •Implemented as external application •JET APIs •BSD sockets •Thrift service •RIFT 0.1.3 binary package •MAC, Linux, FreeBSD Early Juniper Implementation
  • 25. Indicative Numbers for RIFT: Topology
  • 26. •On MacPro Laptop (Low Power I7 with 4 Real Cores) •21 Nodes, 60 links, 600 Prefixes •Convergence From Cold Start •~ 4x Faster Than Flat IGP (~ 300 Millisecs) •~ 3x Less Flooding Than Flat IGP •Single Link Flap at Super-Spine •~ 2x Faster Than Flat IGP •< 50 Millisecs Convergence (35 avg, 70 Max) Indicative Numbers for RIFT Implementation
  • 27. KRZYSZTOF GRZEGORZ SZARKOWICZ JUNIPER NETWORKS, SOLUTION ARCHITECT kszarkowicz@juniper.net