VXLAN Design and
Deployment
Marian Klas, Systems Engineer
mklas@cisco.com
Agenda
• Why VXLAN?
• VXLAN Fundamentals
• Underlay Deployment Considerations
• Overlay Deployment Considerations
• Summary and Conclusion
2
Trend: Flexible Data Center Fabrics
Hosts
V
M
O
S
V
M
O
S
Virtual
Physical
Create Virtual Networkson
top of an efficient IP
network
Mobility
Segmentation + Policy
Scale
Automated & Programmable
Full Cross Sectional BW
L2 + L3 Connectivity
Physical + Virtual
Use VXLAN to Create DC Fabrics
3
VXLAN Fundamentals
4
Why Overlays?
Flexible Overlay Virtual Network
• Mobility – Trackend-point attach at edges
• Segmentation
• Scale – Reduce core state
• Distribute and partition state to network edge
• Flexibility/Programmability
• Reduced number oftouch points
Robust Underlay/Fabric
• High Capacity Resilient Fabric
• Intelligent Packet Handling
• Programmable & Manageable
Seek well integrated best in class Overlays and Underlays
5
Overlay Taxonomy
Overlay Control Plane
Encapsulation
Service = Virtual Network Instance (VNI)
Identifier = VN Identifier (VNID)
NVE = Network Virtualization Edge
VTEP = VXLAN Tunnel End-Point
Underlay Control Plane
Underlay Network
Hosts
(end-points)
Edge Devices (NVE)
Edge Device (NVE)
VTEPs
6
VXLAN is an Overlay Encapsulation
Overlay Control Plane
Encapsulation
VXLAN
Data Plane Learning
Flood and Learn over a multidestination
distribution tree joined by all edge devices
Protocol Learning
Advertise hosts in a protocol
amongst edge devices
t
7
VXLAN Packet Structure
Ethernet in IP with a shim for scalable segmentation
Outer MAC Header Outer IP Header Outer UDP Header
FCS
Allows for 16M
possible segments
UDP 4789
Hash of the inner L2/L3/L4 headers
of the original frame.
Enables entropy for ECMP Load
balancing in the Network.
Src and Dst addresses of
the VTEPs
Src VTEP MAC Address
Next-Hop MAC Address
50 (54) Bytes of overhead
Reserved
VNI
Reserved
VXLAN
Flags
RRRRIRRR
8 Bytes
8
24
24
8
Checksum
0x0000
UDP
Length
VXLAN
Port
Source
Port
8 Bytes
16
16
16
16
Dest.
IP
Source
IP
Header
Checksum
Protocol
0x11
(UDP)
20 Bytes
IP
Header
Misc.
Data
32
32
16
8
72
Ether
Type
0x0800
VLAN
ID
Tag
VLAN
Type
0x8100
Src.
MAC
Address
14 Bytes
(4 Bytes Optional)
Dest.
MAC
Address
16
16
16
48
48
Original Layer 2 Frame
VXLAN Header
Large scale
segmentation
Tunnel Entropy
Ethernet Payload
8
Data Plane Learning
Dedicated Multicast Distribution Tree per VNI
PIM Join for Multicast
Group 239.1.1.1
PIM Join for Multicast
Group 239.2.2.2
Web
VM
Web
VM
DB
VM
DB
VM
V V V
V
V
9
Data Plane Learning
Dedicated Multicast Distribution Tree per VNI
VM1 VM3
VM2
V V V
V
V
10
VM1 VM3
VM2
V V V
V
V
Data Plane Learning
Learning on Broadcast Source - ARP Request Example
IP A è G
ARP Req
MAC IP Addr
VM 1 VTEP 1
MAC IP Addr
VM 1 VTEP 1
ARP Req
IP A è G
ARP Req
ARP Req ARP Req 11
VM1 VM3
VM2
V V V
V
V
Data Plane Learning
Learning on Unicast Source - ARP Response Example
ARP Resp
MAC IP Addr
VM 2 VTEP 2
VTEP 2 è VTEP 1
ARP Resp
ARP Resp
MAC IP Addr
VM 1 VTEP 1
12
VXLAN Evolution
• Head-end replication enables unicast-only mode
• Control Plane provides dynamic VTEP discovery
Multicast Independent
• Workload MAC addresses learnt by VXLAN NVEs
• Advertise L2/L3 address-to-VTEP association
information in a protocol
Protocol Learning
prevents floods
• VXLAN HW Gateways to other encaps/networks
• VXLAN HW Gateway redundancy
• Enable hybrid overlays
External Connectivity
• VXLAN Routing
• Distributed IP Gateways
IP Services
14
V V V
V
V
VXLAN Evolution: Using a Control Protocol
VTEP Discovery
TOR 2
TOR 3
TOR 1
BGP consolidates and
propagates VTEP list for VNI
2
4 VTEP can perform
Head-End Replication
Overlay Neighbors
TOR 3 , IP C
TOR 2 , IP B
3
VTEP obtains list of
VTEP neighbors for
each VNI
1
VTEPs advertise their VNI membership in BGP
IP A IP B
IP C
BGP Route
Reflector
1 1
15
V V V
V
V
TOR 2
TOR 3
TOR 1
IP A IP B
IP C
VXLAN Unicast Mode
Head-end replication
VXLAN Encap
4
3 VTEP performs Head-
End Replication
**Information statically configured or
dynamically retrieved via control plane
(VTEP discovery)
Overlay Neighbors
POD3 , IP C
POD2 , IP B
2
VTEP retrieves the list
of Overlay Neighbors**
BUM Frame
1
A host sends a L2 BUM* frame
IP A è IP B
BUM Frame
IP A è IP C
BUM Frame
*Broadcast, Unknown Unicast or Multicast
5 Frames are unicast to
the neighbors
§ Host Route Distribution decoupled from the Underlay protocol
§ Use MP-BGP on the leaf nodes to distribute internal host/subnet
routes and external reachability information
Route-Reflectors deployed for scaling purposes
iBGP Adjacencies
BGP EVPN Control Plane for VXLAN
Host and Subnet Route Distribution
V V V
V
V
RR RR
17
V V V
V
V
1. Host Attaches
2. Attachment NVE advertises host’s MAC (+IP) through BGP RR
3. Choice of encapsulation is also advertised
NLRI:
Host MAC1, IP1
NVE IP L1/MAC L1
VNI 5000
Ext.Community:
Encapsulation: VXLAN, NVGRE
Sequence 0
Host 1
VLAN 10
VNI 5000
MAC IP VNI Next-
Hop
Encap Seq
1 1 5000 IP L1
MAC
L1
VXLAN 0
BGP EVPN Control Plane
Host Advertisement
RR RR
18
V V V
V
V
1. Host Moves to NVE3
2. NVE3 detects Host1 and advertises H1 with seq#1
3. NVE1 sees more recent route and withdraws its advertisement
NLRI:
Host MAC1, IP1
NVE IP L3/MAC L3
VNI 5000
Ext.Community:
Encapsulation: VXLAN, NVGRE
Sequence 1
Host 1
VLAN 10
VNI 5000
MAC IP VNI Next-
Hop
Encap Seq
1 1 5000 IP L1
MAC L1
VXLAN 0
MAC IP VNI Next-
Hop
Encap Seq
1 1 5000 IP L3
MAC L3
VXLAN 1
BGP EVPN Control Plane
Host Moves
RR RR
19
Destination is in another segment.
Packet is routed to the new segment
VXLANORANGE VXLANBLUE
Ingress VXLAN packet on
Orange segment
VXLAN
Router
VXLAN L2 and L3 Gateways
Connecting VXLAN to the broader network
L2 Gateway: VXLAN to VLAN
Bridging
VXLANORANGE
Ingress VXLAN packet on
Orange segment
Egress interface chosen (bridge
may .1Q tag the packet)
VXLAN L2
Gateway
SVI
Egress interface chosen (bridge
may .1Q tag the packet)
L3 Gateway: VXLAN to X Routing
• VXLAN
• VLAN
VLAN100 VLAN200
N7K w/F3
LC
Nexus 3K N5600 N9K
ASR1K/ASR
9K
L2 Gateway Yes Yes Yes Yes Yes
L3 Gateway Yes No Yes Yes Yes
VXLAN BGP EVPN Nexus Programmable Fabric
Support
21
Platform Spine
Function
Leaf Function Border Leaf
Function
DCNM Fabric
Management
Nexus 5600 Yes
7.3(0)N1(1)
Yes
7.3(0)N1(1)
Yes
7.3(0)N1(1)
Yes
7.2(3)
Nexus
7000/7700
(F3 Only)
Yes
7.3(0)D1(1)
Yes
7.3(0)D1(1)
Yes
7.3(0)D1(1)
Yes
7.2(3)
Nexus 9300 Yes
7.0(3)I1(3)
Yes
7.0(3)I1(3)
Yes
7.0(3)I1(3)
Yes
7.2(3)
Nexus 9500 Yes
7.0(3)I1(3)
Yes
7.0(3)I1(3)
Yes
7.0(3)I1(3)
Yes
7.2(3)
Underlay Deployment
Considerations
22
Leaf/Spine Topology – Underlay Considerations
• High Bi-Sectional Bandwidth
• Wide ECMP: Unicast or
Multicast
• Uniform Reachability,
Deterministic Latency
• High Redundancy: Node/Link
Failure
• Line rate, low latency, for all
traffic
ECMP Links
(Layer-3)
• MTU and Overlays
• Unicast Routing Protocol and IP
Addressing
• Multicast for BUM* Traffic
Replication
Deployment Considerations
Underlay
24
MTU and VXLAN
Underlay
• VXLAN adds 50 Bytes to the
Original Ethernet Frame
• Avoid Fragmentation by adjusting
the IP Networks MTU
• Data Centers often require Jumbo
MTU; most Server NIC do support
up to 9000 Bytes
• Using a MTU of 9216* Bytes
accommodates VXLAN Overhead
plus Server max. MTU
25
Underlay
Outer IP Header
Outer MAC Header
UDP Header
VXLAN Header
Original Layer-2 Frame
Overlay
50
(54)
Bytes
of
Overhead
*Cisco Nexus 5600/6000switches only support 9192 Byte for Layer-3 Traffic
Building your IP Network – Interface Principles
• Know your IP addressing and IP
scale requirements
• Best to use single Aggregate for all
Underlay Links and Loopbacks
• IPv4 only
• For each Point-2-Point (P2P)
connection, minimum /31 required
• Loopback requires /32
• Routed Ports/Interfaces
• Layer-3 Interfaces between Spine and
Leaf (no switchport)
• VTEP uses Loopback as Source-
Interface
Underlay
26
V2
V1
V3
Building your IP Network – Routing Protocols; OSPF
• OSPF – watch your Network type
• Network Type Point-2-Point (P2P)
• Preferred (only LSA type-1)
• No DR/BDR election
• Suits well for routed interfaces/ports
(optimal from a LSA Database
perspective)
• Full SPF calculation on Link Change
• Network Type Broadcast
• Suboptimal from a LSA Database
perspective (LSA type-1 & 2)
• DR/BDR election
• Additional election and Database
Overhead
Underlay
29
V2
V1
V3
Building your IP Network – Routing Protocols; IS-IS
• IS-IS – what was this CLNS?
- Independent of IP (CLNS)
- Well suited for routed interfaces/ports
- No SPF calculation on Link change;
only if Topology changes
- Fast Re-convergence
- Not everyone is familiar with it
Underlay
31
V2
V1
V3
Building your IP Network – Routing Protocols; iBGP
• iBGP + IGP = The Routing Protocol
Combo
• IGP for underlay topology &
reachability (e.g. IS-IS, OSPF)
• iBGP for VTEP (loopback) reachability
• iBGP route-reflector for simplification
and scale
• Requires two routing protocols
• Separates Links (IGP) from VTEPs
(iBGP)
• End-Host information are still in iBGP but
different address-family
Underlay
33
V2
V1
V3
iBGP
RR RR RR RR
Multicast Enabled Underlay
• May use PIM-ASM or PIM-BiDir (Different hardware has different capabilities)
• Cisco Nexus 9000 Series Switches (supports ASM/SSM/Ingress-Replication for VXLAN VTEP support)
and Cisco Nexus 5600 Series Switches (supports only PIM BIDIR for VXLAN VTEP support) can be
part of the same VXLAN EVPN fabric but not share the same Layer-2 VNI.
• Spine and Aggregation switches make good RP locations in clos and traditional topologies respectively
• Reserve a range of multicast channels to service the overlay and optimize for diverse VNIs
• In clos topologies with lean spine, using multiple RPs across the multiple spines and mapping different
VNIs to different RPs will provide a simple load balancing measure
• Design a multicast underlay for a network overlay, host VTEPs will simply leverage this network.
N1KV
Nexus 7K
with F3 LC
Nexus 3K Nexus 5K/6K
Nexus 9K
Standalone
CSR 1000V
ASR1K
ASR9K
Mcast mode IGMP v2/v3
PIM-ASM &
Bidir-PIM
PIM-ASM
(Bidir –
Future)
Bidir-PIM
PIM-ASM
(Bidir –
Future)
Bidir-PIM
(ASM –Future)
PIM-ASM &
Bidir-PIM
35
Multicast Enabled Underlay
• Host Overlay VTEPs join multicast groups
as hosts using IGMP reports
• Host overlays will work over an L2
underlay, ensure IGMP snooping is in
place to scope the reach of multicast
• A multicast enabled L3 underlay is the
better option as it enables a hybrid overlay
(host and network VTEPs)
• Ensure that the first hop router for the host in
the underlay is configured to service the
IGMP reports from the host VTEP
Host Overlay to Hybrid Overlay
VNI 6000
IGMP
PIM
HW VTEP
SW VTEP
36
Multi-Pathing and Entropy
• Symmetric Underlay Network topologies facilitate ECMP routing:
• Multi-path load balancing
• Fast Re-convergence on link Failures
• Polarization: Encapsulated flows appear as a single flow which hashes to a single path
• Entropy in the encapsulation header to depolarize tunnels
• Variable UDP source port in VXLAN outer header
• Underlay must support ECMP hashing on L4 port numbers
NV-edge NV-edge
49
Overlay Deployment
Considerations
55
Type of Overlay Service
Layer 2 Overlays
• Emulate a LAN segment
• Transport Ethernet Frames (IP and non-IP)
• Single subnet mobility (L2 domain)
• Exposure to open L2 flooding
• Useful in emulating physical topologies
Layer 3 Overlays
• Abstract IP based connectivity
• Transport IP Packets
• Full mobility regardless of subnets
• Contain network related failures (floods)
• Useful in abstracting connectivity and policy
Hybrid L2/L3 Overlays offer the best of both domains
Building your VTEP (VXLAN Tunnel End-Point)
Overlay
58
Configuration Example
# Features & Globals
feature bgp
feature nv overlay
nv overlay evpn
# Spine (S1)
# Leaf (V1)
interface nve1
source-interface loopback0
host-reachability protocol bgp
*Simplified BGP configuration; would have 4 BGP peers (RR)
IGP not shown
V2
V1
V3
iBGP
RR RR RR RR
Enables VTEP (onlyrequired on Leaf or Border)
Enables EVPN Control-Plane in BGP
Configure the VTEP interface
Enable BGP for Host reachability
Use a Loopback for Source Interface
Building your Overlay Control-Plane
Overlay
59
Configuration Example
# Features & Globals
feature bgp
feature nv overlay
nv overlay evpn
# Spine (S1)
router bgp 65500
router-id 10.10.10.S1
address-family ipv4 unicast
address-family l2vpn evpn
neighbor 10.10.10.V1 remote-as 65500
update-source loopback0
address-family l2vpn evpn
send-community both
route-reflector-client
# Leaf (V1)
router bgp 65500
router-id 10.10.10.V1
address-family ipv4 unicast
neighbor 10.10.10.S1 remote-as 65500
update-source loopback0
address-family l2vpn evpn
send-community both
*
*Simplified BGP configuration; would have 4 BGP peers (RR)
IGP not shown
V2
V1
V3
iBGP
RR RR RR RR
Enables EVPN Control-Plane in BGP
Activate L2VPN EVPN under each BGP neighbor
Send Extended BGP Community
to distribute EVPN route attributes
Extend your VLAN to VXLAN
• Mapping a IEEE 802.1Q VLAN ID to a VXLAN
VNI
• VLAN to VNI configuration on a per-Switch based
• VLAN becomes “Switch Local Identifier”
• VNI becomes “Network Global Identifier”
• 4k VLAN limitation per-Switch does still apply
• 4k Network limitation has been removed
• Dependent on VLAN Space!
Overlay
Configuration Example
60
# Features
feature vn-segment-vlan-based
# VLAN to VNI mapping (MT-Lite)
vlan 43
vn-segment 30000
# Activate Layer-2 VNI for EVPN
evpn
vni 30000 l2
rd auto
route-target import auto
route-target export auto
# Activate Layer-2 VNI on VTEP
interface nve1
source-interface loopback0
host-reachability protocol bgp
member vni 30000
mcast-group 239.239.239.100
suppress-arp
VLAN to Layer-2 VNI mapping
Enables EVPN Control-
Plane for Layer-2
Services
Enables Layer-2 VNI
on VTEP and suppress
ARP
Alternative is to use
“ingress-replication
protocol bgp”
VLAN
VLAN VNI
Multi-Tenancy Lite (MT-Lite)
ethernet vxlan
Distributed Gateway Function in L3 Overlays
Traditional L2 - centralised L2/L3 boundary
• Always bridge, route only at an aggregation point
• Large amounts of state converge
• Scale problem for large# of L2 segments
• Traditional L2 and L2 overlays
L2/L3 fabric (or overlay)
• Always route (at the leaves), bridge when necessary
• Distribute and disaggregate necessary state
• Optimal scalability
• Enhanced forwarding and L3 overlays
App
OS
App
OS
Virtual Physical
L3 Boundary
L3 Boundary
App
OS
App
OS
Virtual Physical
L2/L3 Fabric
61
Distributed IP Anycast Gateway
VXLAN L3
Gateway
L3 Fabric
VXLAN L3
Gateway
VM
OS
VM
OS
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
The same “Anycast” SVI IP/MAC is used at all VTEPs/ToRs
A host will always find its SVI anywhere it moves
SVI IP Address
MAC: 0000.dead.beef
IP: 10.1.1.1
SVI IP Address
MAC: 0000.dead.beef
IP: 10.1.1.1
62
Distributed IP Anycast Gateway
Detailed View
L3 Fabric
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
SVI B
Underlay
/ IP Core
VLAN A'
VLAN B'
VTEP L2 GWY
L3 GWY
SVI A SVI B
Underlay
/ IP Core
VLAN A
VLAN B
VNI A
VNI B
VTEP
L2 GWY
L3 GWY
SVI A
ConsistentAnycast SVI IP / MAC
address at all leaves
VLAN-IDs are locally significant
63
Distributed IP Anycast Gateway*
Configuration example
64
Host Y
VLAN 55
Host A
VLAN 43
V1
V3
V2
*Requires EVPN Control-Plane.
Routing in VXLAN – define the resources
Overlay
65
Configuration Example for VRF-A
# Define VLAN for VRF routing instance
vlan 2500
vn-segment 50000
# Define SVI for VRF routing instance
interface Vlan2500
no shutdown
mtu 9216
vrf member VRF-A
ip forward
# VRF configuration for “customer” VRF
vrf context VRF-A
vni 50000
rd auto
address-family ipv4 unicast
route-target both auto
route-target both auto evpn
VRF-A
ethernet
ethernet
VNI
50000
vxlan
VLAN to Layer-3 VNI mapping
VLAN to Layer-3 VNI mapping
- ip forward required for prefix-
based routing
VRF context definition
- VNI
- Route-Distinguisher
- Route-Targets
- IPv4 and/or IPv6
Distributed IP Anycast Gateway*
Configuration Example for “BLUE” (V1 & V3)
Overlay
Configuration Example for “RED” (V1-3)
66
# Features & Globals
feature interface-vlan
fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF
# VLAN to VNI mapping (MT-Lite)
vlan 55
vn-segment 30001
# Anycast Gateway MAC, inherited by any interface
(SVI) using “fabric forwarding”
fabric forwarding anycast-gateway-mac 0002.0002.0002
# Distributed IP Anycast Gateway (SVI)
interface vlan 55
no shutdown
vrf member VRF-A
ip address 98.98.98.1/24 tag 12345
fabric forwarding mode anycast-gateway
# Features & Globals
feature interface-vlan
fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF
# VLAN to VNI mapping (MT-Lite)
vlan 43
vn-segment 30000
# Anycast Gateway MAC, inherited by any interface
(SVI) using “fabric forwarding”
fabric forwarding anycast-gateway-mac 0002.0002.0002
# Distributed IP Anycast Gateway (SVI)
interface vlan 43
no shutdown
vrf member VRF-A
ip address 11.11.11.1/24 tag 12345
fabric forwarding mode anycast-gateway
*Requires EVPN Control-Plane. VRF and BGP configuration not shown
Routing in VXLAN – configure the routing
Overlay
67
Configuration Example for VRF-A
# Activate Layer-3 VNI on VTEP
interface nve1
source-interface loopback0
host-reachability protocol bgp
member vni 30000
mcast-group 239.239.239.100
suppress-arp
member vni 50000 associate-vrf
# Route-Map for Redistribute Subnet
route-map REDIST-SUBNET permit 10
match tag 12345
# Control-Plane configuration for VRF (Tenant)
router bgp 65500
…
vrf VRF-A
address-family ipv4 unicast
advertise l2vpn evpn
redistribute direct route-map REDIST-SUBNET
maximum-paths ibgp 2
VRF-A
ethernet
ethernet
VNI
50000
vxlan
Enables Layer-3 VNI on VTEP
and associate it to VRF
VRF/Tenant definition
within Overlay Control-Plane
Host Subnet Redistribution
• Host “A” is a silent Host
• Not known via ARP/IP
• How can Host “Y” reach Host “A”
• Host “A” and “Y” are in different
VLAN/Subnet
• Route for Host “A”-Subnet will be
advertised by V1 and V2
• Host “Y” will reach either V1 or V2
based on ECMP
• From V1 or V2, Host “A” can be
reached via Layer-2 Segment.
Overlay
69
Host Y
VLAN 55
Host A
VLAN 43
V1
V3
V2
I know Subnet “A”
VXLAN Hardware VTEP Redundancy (vPC)
• VXLAN vPC Domain Configuration
Classic Ethernet
• Configure VXLAN specific vPC
Peer-Link Configuration
• Extend the IP Interface (Loopback)
configuration for the VTEP
• Secondary IP address (anycast) is
used as the anycast VTEP address
• Both vPC VTEP switches need to
have the identical secondary IP
address configured under the
loopback interface
Southbound Connectivity
70
Host D
VNI 30000
V4
V5
VXLAN Hardware Gateway Redundancy (vPC)
Southbound Connectivity
71
Host D
VNI 30000
V4
V5
vPC VTEP Configuration Example for (V4-5)
# VLAN to VNI mapping (MT-Lite)
vlan 55
vn-segment 30000
# VTEP IP Interface; Source/Destination for all
VXLAN Encapsulated Traffic.
§ Primary IP address is used for Orphan Hosts
§ Secondary IP is for vPC Hosts (same IP on both
vPC Peers)
interface loopback0
ip address 10.10.10.V/32
ip address 10.10.10.VAnycast/32 secondary
# VTEP configuration using Loopback as source.
Destination Group for VNI 30001 is “239.1.1.2”
interface nve1
source-interface loopback0
host-reachability protocol bgp
member vni 30000
mcast-group 239.239.239.100
suppress-arp
member vni 50000 associate-vrf
interface loopback0
ip address 10.10.10.5/32
ip address 10.10.10.99/32
secondary
interface loopback0
ip address 10.10.10.4/32
ip address 10.10.10.99/32
secondary
Add Secondary IP to VTEP Loopback
VXLAN Hardware Gateway Redundancy (vPC)
Not to Forget!
72
Host D
VNI 30000
V4
V5
vPC VTEP Configuration Example for (V4-5)
# VPC Domain Configuration
vpc domain 99
peer-switch
peer-keepalive destination V4-mgmt source v5-mgmt
peer-gateway
ip arp synchronize
# VPC Peer-Link
interface port-channelXX
switchport mode trunk
vpc peer-link
# VPC Domain Routing Adjacency
interface Vlan3999
no shutdown
ip address 10.254.254.1/30
ip router ospf 1 area 0.0.0.0
ip ospf network point-to-point
ip pim sparse-mode
interface loopback0
ip address 10.10.10.5/32
ip address 10.10.10.99/32
secondary
interface loopback0
ip address 10.10.10.4/32
ip address 10.10.10.99/32
secondary
Routed Interface (SVI) for routing
adjacencyacross VPC Peer-Link
peer-gatewayneeds to be
enabled so that vPC VTEP
switches can forward traffic
for each other’s router
MAC address
Folded Clos Topology
• Fully Symmetric, BW rich topology, Optimized for East-West traffic
• Lean Spine does not do any VXLAN termination/gateway
• Access to other networks through border leaf block
Providing Topology Symmetry
L3 Fabric
WAN/DCI
VXLAN L2/3
Gateway
VXLAN L2/3
Gateway
VXLAN L2/3
Gateway
VXLAN L2/3
Gateway
VXLAN L2/3
Gateway
VXLAN L2/3
Gateway
Spine
Border Leaf
Leaf
73
VXLAN/EVPN Fabric External Routing
• The Border Leaf/Spine provides
Layer-2 and Layer-3 connectivity to
external Network
• Flexible routing protocol options for
for external routing
• Today, VRF-lite allows to extend
VRF outside of the fabric
• With Nexus 7000/7700 and F3,
other options are coming
Overlay
VBL
WAN
V2
V1
V3
VXLAN/EVPN Fabric External Routing
Overlay
V2
V1
V3
VBL
WAN
VRF
A
VRF
B
VRF
C
VRF for External Routing
need to exist on Border Leaf
Interface-Type Options:
• Physical Routed Ports
• Sub-Interfaces
• VLAN SVIs over Trunk Ports Peering Interface can
be in Global or Tenant VRF
VXLAN/EVPN Fabric External Routing (eBGP)
Overlay
V2
V1
V3
VBL
WAN
AS# 65599
VRF
A
VRF
B
VRF
C
Border Leaf Configuration Example
# Sub-Interface Configuration
interface Ethernet1/1
no switchport
interface Ethernet1/1.10
mtu 9216
encapsulation dot1q 10
vrf member VRF-A
ip address 10.254.254.1/30
# eBGP Configuration
router bgp 65500
…
vrf VRF-A
address-family ipv4 unicast
advertise l2vpn evpn
aggregate-address 10.0.0.0/8 summary-only
neighbor 10.254.254.2 remote-as 65599
update-source Ethernet1/1.10
address-family ipv4 unicast
*
*Ensure that non-necessary routes are not advertised towards the External Network
Advertise external learned routes
into EVPN (Route-Type 5)
VXLAN/EVPN Fabric External Routing (eBGP)
Overlay
V2
V1
V3
VBL
WAN
AS# 65599
VRF
A
VRF
B
VRF
C
Edge Router Configuration Example
# Interface Configuration
interface Ethernet1/1
vrf member VRF-A
ip address 10.254.254.2/30
# eBGP Configuration
router bgp 65599
…
vrf VRF-A
address-family ipv4 unicast
neighbor 10.254.254.2 remote-as 65500
update-source Ethernet1/1
address-family ipv4 unicast
L3 Fabric
OTV Transport
Border
PE
Border
PE
MPLS/IP Core
VXLAN L3
Gateway
VXLAN L3
Gateway
P
P
P
N5600, N7K, N9K
N7K
MPLS IP-VPN & VXLAN – in NX-OS 7.3
L3 Handoff – Border PE
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
VXLAN L3
Gateway
79
Summary and Conclusion
80
Summary recommendations & takeaways
• Optimize the location of L2 and L3 GWYs to optimize routing and minimize failure
exposure
• Leverage L3 VXLAN services enabled by control protocols as the main service and L2
extensions as the exception
• Design the underlay with the VXLAN overlay in mind
• Design the network hierarchically: both the underlay as well as the overlay
• L3 Gateways are key to a sound overlay design
• A combination of pull protocols and push protocols may render optimal scale and
resiliency
• Link the provisioning of the overlay and scoping of VNIs to the host orchestration system
for optimal scale
81
Thank you
82
We’re ready. Are you?

VXLAN Design and Deployment.pdf

  • 1.
    VXLAN Design and Deployment MarianKlas, Systems Engineer mklas@cisco.com
  • 2.
    Agenda • Why VXLAN? •VXLAN Fundamentals • Underlay Deployment Considerations • Overlay Deployment Considerations • Summary and Conclusion 2
  • 3.
    Trend: Flexible DataCenter Fabrics Hosts V M O S V M O S Virtual Physical Create Virtual Networkson top of an efficient IP network Mobility Segmentation + Policy Scale Automated & Programmable Full Cross Sectional BW L2 + L3 Connectivity Physical + Virtual Use VXLAN to Create DC Fabrics 3
  • 4.
  • 5.
    Why Overlays? Flexible OverlayVirtual Network • Mobility – Trackend-point attach at edges • Segmentation • Scale – Reduce core state • Distribute and partition state to network edge • Flexibility/Programmability • Reduced number oftouch points Robust Underlay/Fabric • High Capacity Resilient Fabric • Intelligent Packet Handling • Programmable & Manageable Seek well integrated best in class Overlays and Underlays 5
  • 6.
    Overlay Taxonomy Overlay ControlPlane Encapsulation Service = Virtual Network Instance (VNI) Identifier = VN Identifier (VNID) NVE = Network Virtualization Edge VTEP = VXLAN Tunnel End-Point Underlay Control Plane Underlay Network Hosts (end-points) Edge Devices (NVE) Edge Device (NVE) VTEPs 6
  • 7.
    VXLAN is anOverlay Encapsulation Overlay Control Plane Encapsulation VXLAN Data Plane Learning Flood and Learn over a multidestination distribution tree joined by all edge devices Protocol Learning Advertise hosts in a protocol amongst edge devices t 7
  • 8.
    VXLAN Packet Structure Ethernetin IP with a shim for scalable segmentation Outer MAC Header Outer IP Header Outer UDP Header FCS Allows for 16M possible segments UDP 4789 Hash of the inner L2/L3/L4 headers of the original frame. Enables entropy for ECMP Load balancing in the Network. Src and Dst addresses of the VTEPs Src VTEP MAC Address Next-Hop MAC Address 50 (54) Bytes of overhead Reserved VNI Reserved VXLAN Flags RRRRIRRR 8 Bytes 8 24 24 8 Checksum 0x0000 UDP Length VXLAN Port Source Port 8 Bytes 16 16 16 16 Dest. IP Source IP Header Checksum Protocol 0x11 (UDP) 20 Bytes IP Header Misc. Data 32 32 16 8 72 Ether Type 0x0800 VLAN ID Tag VLAN Type 0x8100 Src. MAC Address 14 Bytes (4 Bytes Optional) Dest. MAC Address 16 16 16 48 48 Original Layer 2 Frame VXLAN Header Large scale segmentation Tunnel Entropy Ethernet Payload 8
  • 9.
    Data Plane Learning DedicatedMulticast Distribution Tree per VNI PIM Join for Multicast Group 239.1.1.1 PIM Join for Multicast Group 239.2.2.2 Web VM Web VM DB VM DB VM V V V V V 9
  • 10.
    Data Plane Learning DedicatedMulticast Distribution Tree per VNI VM1 VM3 VM2 V V V V V 10
  • 11.
    VM1 VM3 VM2 V VV V V Data Plane Learning Learning on Broadcast Source - ARP Request Example IP A è G ARP Req MAC IP Addr VM 1 VTEP 1 MAC IP Addr VM 1 VTEP 1 ARP Req IP A è G ARP Req ARP Req ARP Req 11
  • 12.
    VM1 VM3 VM2 V VV V V Data Plane Learning Learning on Unicast Source - ARP Response Example ARP Resp MAC IP Addr VM 2 VTEP 2 VTEP 2 è VTEP 1 ARP Resp ARP Resp MAC IP Addr VM 1 VTEP 1 12
  • 13.
    VXLAN Evolution • Head-endreplication enables unicast-only mode • Control Plane provides dynamic VTEP discovery Multicast Independent • Workload MAC addresses learnt by VXLAN NVEs • Advertise L2/L3 address-to-VTEP association information in a protocol Protocol Learning prevents floods • VXLAN HW Gateways to other encaps/networks • VXLAN HW Gateway redundancy • Enable hybrid overlays External Connectivity • VXLAN Routing • Distributed IP Gateways IP Services 14
  • 14.
    V V V V V VXLANEvolution: Using a Control Protocol VTEP Discovery TOR 2 TOR 3 TOR 1 BGP consolidates and propagates VTEP list for VNI 2 4 VTEP can perform Head-End Replication Overlay Neighbors TOR 3 , IP C TOR 2 , IP B 3 VTEP obtains list of VTEP neighbors for each VNI 1 VTEPs advertise their VNI membership in BGP IP A IP B IP C BGP Route Reflector 1 1 15
  • 15.
    V V V V V TOR2 TOR 3 TOR 1 IP A IP B IP C VXLAN Unicast Mode Head-end replication VXLAN Encap 4 3 VTEP performs Head- End Replication **Information statically configured or dynamically retrieved via control plane (VTEP discovery) Overlay Neighbors POD3 , IP C POD2 , IP B 2 VTEP retrieves the list of Overlay Neighbors** BUM Frame 1 A host sends a L2 BUM* frame IP A è IP B BUM Frame IP A è IP C BUM Frame *Broadcast, Unknown Unicast or Multicast 5 Frames are unicast to the neighbors
  • 16.
    § Host RouteDistribution decoupled from the Underlay protocol § Use MP-BGP on the leaf nodes to distribute internal host/subnet routes and external reachability information Route-Reflectors deployed for scaling purposes iBGP Adjacencies BGP EVPN Control Plane for VXLAN Host and Subnet Route Distribution V V V V V RR RR 17
  • 17.
    V V V V V 1.Host Attaches 2. Attachment NVE advertises host’s MAC (+IP) through BGP RR 3. Choice of encapsulation is also advertised NLRI: Host MAC1, IP1 NVE IP L1/MAC L1 VNI 5000 Ext.Community: Encapsulation: VXLAN, NVGRE Sequence 0 Host 1 VLAN 10 VNI 5000 MAC IP VNI Next- Hop Encap Seq 1 1 5000 IP L1 MAC L1 VXLAN 0 BGP EVPN Control Plane Host Advertisement RR RR 18
  • 18.
    V V V V V 1.Host Moves to NVE3 2. NVE3 detects Host1 and advertises H1 with seq#1 3. NVE1 sees more recent route and withdraws its advertisement NLRI: Host MAC1, IP1 NVE IP L3/MAC L3 VNI 5000 Ext.Community: Encapsulation: VXLAN, NVGRE Sequence 1 Host 1 VLAN 10 VNI 5000 MAC IP VNI Next- Hop Encap Seq 1 1 5000 IP L1 MAC L1 VXLAN 0 MAC IP VNI Next- Hop Encap Seq 1 1 5000 IP L3 MAC L3 VXLAN 1 BGP EVPN Control Plane Host Moves RR RR 19
  • 19.
    Destination is inanother segment. Packet is routed to the new segment VXLANORANGE VXLANBLUE Ingress VXLAN packet on Orange segment VXLAN Router VXLAN L2 and L3 Gateways Connecting VXLAN to the broader network L2 Gateway: VXLAN to VLAN Bridging VXLANORANGE Ingress VXLAN packet on Orange segment Egress interface chosen (bridge may .1Q tag the packet) VXLAN L2 Gateway SVI Egress interface chosen (bridge may .1Q tag the packet) L3 Gateway: VXLAN to X Routing • VXLAN • VLAN VLAN100 VLAN200 N7K w/F3 LC Nexus 3K N5600 N9K ASR1K/ASR 9K L2 Gateway Yes Yes Yes Yes Yes L3 Gateway Yes No Yes Yes Yes
  • 20.
    VXLAN BGP EVPNNexus Programmable Fabric Support 21 Platform Spine Function Leaf Function Border Leaf Function DCNM Fabric Management Nexus 5600 Yes 7.3(0)N1(1) Yes 7.3(0)N1(1) Yes 7.3(0)N1(1) Yes 7.2(3) Nexus 7000/7700 (F3 Only) Yes 7.3(0)D1(1) Yes 7.3(0)D1(1) Yes 7.3(0)D1(1) Yes 7.2(3) Nexus 9300 Yes 7.0(3)I1(3) Yes 7.0(3)I1(3) Yes 7.0(3)I1(3) Yes 7.2(3) Nexus 9500 Yes 7.0(3)I1(3) Yes 7.0(3)I1(3) Yes 7.0(3)I1(3) Yes 7.2(3)
  • 21.
  • 22.
    Leaf/Spine Topology –Underlay Considerations • High Bi-Sectional Bandwidth • Wide ECMP: Unicast or Multicast • Uniform Reachability, Deterministic Latency • High Redundancy: Node/Link Failure • Line rate, low latency, for all traffic ECMP Links (Layer-3)
  • 23.
    • MTU andOverlays • Unicast Routing Protocol and IP Addressing • Multicast for BUM* Traffic Replication Deployment Considerations Underlay 24
  • 24.
    MTU and VXLAN Underlay •VXLAN adds 50 Bytes to the Original Ethernet Frame • Avoid Fragmentation by adjusting the IP Networks MTU • Data Centers often require Jumbo MTU; most Server NIC do support up to 9000 Bytes • Using a MTU of 9216* Bytes accommodates VXLAN Overhead plus Server max. MTU 25 Underlay Outer IP Header Outer MAC Header UDP Header VXLAN Header Original Layer-2 Frame Overlay 50 (54) Bytes of Overhead *Cisco Nexus 5600/6000switches only support 9192 Byte for Layer-3 Traffic
  • 25.
    Building your IPNetwork – Interface Principles • Know your IP addressing and IP scale requirements • Best to use single Aggregate for all Underlay Links and Loopbacks • IPv4 only • For each Point-2-Point (P2P) connection, minimum /31 required • Loopback requires /32 • Routed Ports/Interfaces • Layer-3 Interfaces between Spine and Leaf (no switchport) • VTEP uses Loopback as Source- Interface Underlay 26 V2 V1 V3
  • 26.
    Building your IPNetwork – Routing Protocols; OSPF • OSPF – watch your Network type • Network Type Point-2-Point (P2P) • Preferred (only LSA type-1) • No DR/BDR election • Suits well for routed interfaces/ports (optimal from a LSA Database perspective) • Full SPF calculation on Link Change • Network Type Broadcast • Suboptimal from a LSA Database perspective (LSA type-1 & 2) • DR/BDR election • Additional election and Database Overhead Underlay 29 V2 V1 V3
  • 27.
    Building your IPNetwork – Routing Protocols; IS-IS • IS-IS – what was this CLNS? - Independent of IP (CLNS) - Well suited for routed interfaces/ports - No SPF calculation on Link change; only if Topology changes - Fast Re-convergence - Not everyone is familiar with it Underlay 31 V2 V1 V3
  • 28.
    Building your IPNetwork – Routing Protocols; iBGP • iBGP + IGP = The Routing Protocol Combo • IGP for underlay topology & reachability (e.g. IS-IS, OSPF) • iBGP for VTEP (loopback) reachability • iBGP route-reflector for simplification and scale • Requires two routing protocols • Separates Links (IGP) from VTEPs (iBGP) • End-Host information are still in iBGP but different address-family Underlay 33 V2 V1 V3 iBGP RR RR RR RR
  • 29.
    Multicast Enabled Underlay •May use PIM-ASM or PIM-BiDir (Different hardware has different capabilities) • Cisco Nexus 9000 Series Switches (supports ASM/SSM/Ingress-Replication for VXLAN VTEP support) and Cisco Nexus 5600 Series Switches (supports only PIM BIDIR for VXLAN VTEP support) can be part of the same VXLAN EVPN fabric but not share the same Layer-2 VNI. • Spine and Aggregation switches make good RP locations in clos and traditional topologies respectively • Reserve a range of multicast channels to service the overlay and optimize for diverse VNIs • In clos topologies with lean spine, using multiple RPs across the multiple spines and mapping different VNIs to different RPs will provide a simple load balancing measure • Design a multicast underlay for a network overlay, host VTEPs will simply leverage this network. N1KV Nexus 7K with F3 LC Nexus 3K Nexus 5K/6K Nexus 9K Standalone CSR 1000V ASR1K ASR9K Mcast mode IGMP v2/v3 PIM-ASM & Bidir-PIM PIM-ASM (Bidir – Future) Bidir-PIM PIM-ASM (Bidir – Future) Bidir-PIM (ASM –Future) PIM-ASM & Bidir-PIM 35
  • 30.
    Multicast Enabled Underlay •Host Overlay VTEPs join multicast groups as hosts using IGMP reports • Host overlays will work over an L2 underlay, ensure IGMP snooping is in place to scope the reach of multicast • A multicast enabled L3 underlay is the better option as it enables a hybrid overlay (host and network VTEPs) • Ensure that the first hop router for the host in the underlay is configured to service the IGMP reports from the host VTEP Host Overlay to Hybrid Overlay VNI 6000 IGMP PIM HW VTEP SW VTEP 36
  • 31.
    Multi-Pathing and Entropy •Symmetric Underlay Network topologies facilitate ECMP routing: • Multi-path load balancing • Fast Re-convergence on link Failures • Polarization: Encapsulated flows appear as a single flow which hashes to a single path • Entropy in the encapsulation header to depolarize tunnels • Variable UDP source port in VXLAN outer header • Underlay must support ECMP hashing on L4 port numbers NV-edge NV-edge 49
  • 32.
  • 33.
    Type of OverlayService Layer 2 Overlays • Emulate a LAN segment • Transport Ethernet Frames (IP and non-IP) • Single subnet mobility (L2 domain) • Exposure to open L2 flooding • Useful in emulating physical topologies Layer 3 Overlays • Abstract IP based connectivity • Transport IP Packets • Full mobility regardless of subnets • Contain network related failures (floods) • Useful in abstracting connectivity and policy Hybrid L2/L3 Overlays offer the best of both domains
  • 35.
    Building your VTEP(VXLAN Tunnel End-Point) Overlay 58 Configuration Example # Features & Globals feature bgp feature nv overlay nv overlay evpn # Spine (S1) # Leaf (V1) interface nve1 source-interface loopback0 host-reachability protocol bgp *Simplified BGP configuration; would have 4 BGP peers (RR) IGP not shown V2 V1 V3 iBGP RR RR RR RR Enables VTEP (onlyrequired on Leaf or Border) Enables EVPN Control-Plane in BGP Configure the VTEP interface Enable BGP for Host reachability Use a Loopback for Source Interface
  • 36.
    Building your OverlayControl-Plane Overlay 59 Configuration Example # Features & Globals feature bgp feature nv overlay nv overlay evpn # Spine (S1) router bgp 65500 router-id 10.10.10.S1 address-family ipv4 unicast address-family l2vpn evpn neighbor 10.10.10.V1 remote-as 65500 update-source loopback0 address-family l2vpn evpn send-community both route-reflector-client # Leaf (V1) router bgp 65500 router-id 10.10.10.V1 address-family ipv4 unicast neighbor 10.10.10.S1 remote-as 65500 update-source loopback0 address-family l2vpn evpn send-community both * *Simplified BGP configuration; would have 4 BGP peers (RR) IGP not shown V2 V1 V3 iBGP RR RR RR RR Enables EVPN Control-Plane in BGP Activate L2VPN EVPN under each BGP neighbor Send Extended BGP Community to distribute EVPN route attributes
  • 37.
    Extend your VLANto VXLAN • Mapping a IEEE 802.1Q VLAN ID to a VXLAN VNI • VLAN to VNI configuration on a per-Switch based • VLAN becomes “Switch Local Identifier” • VNI becomes “Network Global Identifier” • 4k VLAN limitation per-Switch does still apply • 4k Network limitation has been removed • Dependent on VLAN Space! Overlay Configuration Example 60 # Features feature vn-segment-vlan-based # VLAN to VNI mapping (MT-Lite) vlan 43 vn-segment 30000 # Activate Layer-2 VNI for EVPN evpn vni 30000 l2 rd auto route-target import auto route-target export auto # Activate Layer-2 VNI on VTEP interface nve1 source-interface loopback0 host-reachability protocol bgp member vni 30000 mcast-group 239.239.239.100 suppress-arp VLAN to Layer-2 VNI mapping Enables EVPN Control- Plane for Layer-2 Services Enables Layer-2 VNI on VTEP and suppress ARP Alternative is to use “ingress-replication protocol bgp” VLAN VLAN VNI Multi-Tenancy Lite (MT-Lite) ethernet vxlan
  • 38.
    Distributed Gateway Functionin L3 Overlays Traditional L2 - centralised L2/L3 boundary • Always bridge, route only at an aggregation point • Large amounts of state converge • Scale problem for large# of L2 segments • Traditional L2 and L2 overlays L2/L3 fabric (or overlay) • Always route (at the leaves), bridge when necessary • Distribute and disaggregate necessary state • Optimal scalability • Enhanced forwarding and L3 overlays App OS App OS Virtual Physical L3 Boundary L3 Boundary App OS App OS Virtual Physical L2/L3 Fabric 61
  • 39.
    Distributed IP AnycastGateway VXLAN L3 Gateway L3 Fabric VXLAN L3 Gateway VM OS VM OS VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway The same “Anycast” SVI IP/MAC is used at all VTEPs/ToRs A host will always find its SVI anywhere it moves SVI IP Address MAC: 0000.dead.beef IP: 10.1.1.1 SVI IP Address MAC: 0000.dead.beef IP: 10.1.1.1 62
  • 40.
    Distributed IP AnycastGateway Detailed View L3 Fabric VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway SVI B Underlay / IP Core VLAN A' VLAN B' VTEP L2 GWY L3 GWY SVI A SVI B Underlay / IP Core VLAN A VLAN B VNI A VNI B VTEP L2 GWY L3 GWY SVI A ConsistentAnycast SVI IP / MAC address at all leaves VLAN-IDs are locally significant 63
  • 41.
    Distributed IP AnycastGateway* Configuration example 64 Host Y VLAN 55 Host A VLAN 43 V1 V3 V2 *Requires EVPN Control-Plane.
  • 42.
    Routing in VXLAN– define the resources Overlay 65 Configuration Example for VRF-A # Define VLAN for VRF routing instance vlan 2500 vn-segment 50000 # Define SVI for VRF routing instance interface Vlan2500 no shutdown mtu 9216 vrf member VRF-A ip forward # VRF configuration for “customer” VRF vrf context VRF-A vni 50000 rd auto address-family ipv4 unicast route-target both auto route-target both auto evpn VRF-A ethernet ethernet VNI 50000 vxlan VLAN to Layer-3 VNI mapping VLAN to Layer-3 VNI mapping - ip forward required for prefix- based routing VRF context definition - VNI - Route-Distinguisher - Route-Targets - IPv4 and/or IPv6
  • 43.
    Distributed IP AnycastGateway* Configuration Example for “BLUE” (V1 & V3) Overlay Configuration Example for “RED” (V1-3) 66 # Features & Globals feature interface-vlan fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF # VLAN to VNI mapping (MT-Lite) vlan 55 vn-segment 30001 # Anycast Gateway MAC, inherited by any interface (SVI) using “fabric forwarding” fabric forwarding anycast-gateway-mac 0002.0002.0002 # Distributed IP Anycast Gateway (SVI) interface vlan 55 no shutdown vrf member VRF-A ip address 98.98.98.1/24 tag 12345 fabric forwarding mode anycast-gateway # Features & Globals feature interface-vlan fabric forwarding anycast-gateway-mac 2020.DEAD.BEEF # VLAN to VNI mapping (MT-Lite) vlan 43 vn-segment 30000 # Anycast Gateway MAC, inherited by any interface (SVI) using “fabric forwarding” fabric forwarding anycast-gateway-mac 0002.0002.0002 # Distributed IP Anycast Gateway (SVI) interface vlan 43 no shutdown vrf member VRF-A ip address 11.11.11.1/24 tag 12345 fabric forwarding mode anycast-gateway *Requires EVPN Control-Plane. VRF and BGP configuration not shown
  • 44.
    Routing in VXLAN– configure the routing Overlay 67 Configuration Example for VRF-A # Activate Layer-3 VNI on VTEP interface nve1 source-interface loopback0 host-reachability protocol bgp member vni 30000 mcast-group 239.239.239.100 suppress-arp member vni 50000 associate-vrf # Route-Map for Redistribute Subnet route-map REDIST-SUBNET permit 10 match tag 12345 # Control-Plane configuration for VRF (Tenant) router bgp 65500 … vrf VRF-A address-family ipv4 unicast advertise l2vpn evpn redistribute direct route-map REDIST-SUBNET maximum-paths ibgp 2 VRF-A ethernet ethernet VNI 50000 vxlan Enables Layer-3 VNI on VTEP and associate it to VRF VRF/Tenant definition within Overlay Control-Plane
  • 46.
    Host Subnet Redistribution •Host “A” is a silent Host • Not known via ARP/IP • How can Host “Y” reach Host “A” • Host “A” and “Y” are in different VLAN/Subnet • Route for Host “A”-Subnet will be advertised by V1 and V2 • Host “Y” will reach either V1 or V2 based on ECMP • From V1 or V2, Host “A” can be reached via Layer-2 Segment. Overlay 69 Host Y VLAN 55 Host A VLAN 43 V1 V3 V2 I know Subnet “A”
  • 47.
    VXLAN Hardware VTEPRedundancy (vPC) • VXLAN vPC Domain Configuration Classic Ethernet • Configure VXLAN specific vPC Peer-Link Configuration • Extend the IP Interface (Loopback) configuration for the VTEP • Secondary IP address (anycast) is used as the anycast VTEP address • Both vPC VTEP switches need to have the identical secondary IP address configured under the loopback interface Southbound Connectivity 70 Host D VNI 30000 V4 V5
  • 48.
    VXLAN Hardware GatewayRedundancy (vPC) Southbound Connectivity 71 Host D VNI 30000 V4 V5 vPC VTEP Configuration Example for (V4-5) # VLAN to VNI mapping (MT-Lite) vlan 55 vn-segment 30000 # VTEP IP Interface; Source/Destination for all VXLAN Encapsulated Traffic. § Primary IP address is used for Orphan Hosts § Secondary IP is for vPC Hosts (same IP on both vPC Peers) interface loopback0 ip address 10.10.10.V/32 ip address 10.10.10.VAnycast/32 secondary # VTEP configuration using Loopback as source. Destination Group for VNI 30001 is “239.1.1.2” interface nve1 source-interface loopback0 host-reachability protocol bgp member vni 30000 mcast-group 239.239.239.100 suppress-arp member vni 50000 associate-vrf interface loopback0 ip address 10.10.10.5/32 ip address 10.10.10.99/32 secondary interface loopback0 ip address 10.10.10.4/32 ip address 10.10.10.99/32 secondary Add Secondary IP to VTEP Loopback
  • 49.
    VXLAN Hardware GatewayRedundancy (vPC) Not to Forget! 72 Host D VNI 30000 V4 V5 vPC VTEP Configuration Example for (V4-5) # VPC Domain Configuration vpc domain 99 peer-switch peer-keepalive destination V4-mgmt source v5-mgmt peer-gateway ip arp synchronize # VPC Peer-Link interface port-channelXX switchport mode trunk vpc peer-link # VPC Domain Routing Adjacency interface Vlan3999 no shutdown ip address 10.254.254.1/30 ip router ospf 1 area 0.0.0.0 ip ospf network point-to-point ip pim sparse-mode interface loopback0 ip address 10.10.10.5/32 ip address 10.10.10.99/32 secondary interface loopback0 ip address 10.10.10.4/32 ip address 10.10.10.99/32 secondary Routed Interface (SVI) for routing adjacencyacross VPC Peer-Link peer-gatewayneeds to be enabled so that vPC VTEP switches can forward traffic for each other’s router MAC address
  • 50.
    Folded Clos Topology •Fully Symmetric, BW rich topology, Optimized for East-West traffic • Lean Spine does not do any VXLAN termination/gateway • Access to other networks through border leaf block Providing Topology Symmetry L3 Fabric WAN/DCI VXLAN L2/3 Gateway VXLAN L2/3 Gateway VXLAN L2/3 Gateway VXLAN L2/3 Gateway VXLAN L2/3 Gateway VXLAN L2/3 Gateway Spine Border Leaf Leaf 73
  • 51.
    VXLAN/EVPN Fabric ExternalRouting • The Border Leaf/Spine provides Layer-2 and Layer-3 connectivity to external Network • Flexible routing protocol options for for external routing • Today, VRF-lite allows to extend VRF outside of the fabric • With Nexus 7000/7700 and F3, other options are coming Overlay VBL WAN V2 V1 V3
  • 52.
    VXLAN/EVPN Fabric ExternalRouting Overlay V2 V1 V3 VBL WAN VRF A VRF B VRF C VRF for External Routing need to exist on Border Leaf Interface-Type Options: • Physical Routed Ports • Sub-Interfaces • VLAN SVIs over Trunk Ports Peering Interface can be in Global or Tenant VRF
  • 53.
    VXLAN/EVPN Fabric ExternalRouting (eBGP) Overlay V2 V1 V3 VBL WAN AS# 65599 VRF A VRF B VRF C Border Leaf Configuration Example # Sub-Interface Configuration interface Ethernet1/1 no switchport interface Ethernet1/1.10 mtu 9216 encapsulation dot1q 10 vrf member VRF-A ip address 10.254.254.1/30 # eBGP Configuration router bgp 65500 … vrf VRF-A address-family ipv4 unicast advertise l2vpn evpn aggregate-address 10.0.0.0/8 summary-only neighbor 10.254.254.2 remote-as 65599 update-source Ethernet1/1.10 address-family ipv4 unicast * *Ensure that non-necessary routes are not advertised towards the External Network Advertise external learned routes into EVPN (Route-Type 5)
  • 54.
    VXLAN/EVPN Fabric ExternalRouting (eBGP) Overlay V2 V1 V3 VBL WAN AS# 65599 VRF A VRF B VRF C Edge Router Configuration Example # Interface Configuration interface Ethernet1/1 vrf member VRF-A ip address 10.254.254.2/30 # eBGP Configuration router bgp 65599 … vrf VRF-A address-family ipv4 unicast neighbor 10.254.254.2 remote-as 65500 update-source Ethernet1/1 address-family ipv4 unicast
  • 55.
    L3 Fabric OTV Transport Border PE Border PE MPLS/IPCore VXLAN L3 Gateway VXLAN L3 Gateway P P P N5600, N7K, N9K N7K MPLS IP-VPN & VXLAN – in NX-OS 7.3 L3 Handoff – Border PE VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway VXLAN L3 Gateway 79
  • 56.
  • 57.
    Summary recommendations &takeaways • Optimize the location of L2 and L3 GWYs to optimize routing and minimize failure exposure • Leverage L3 VXLAN services enabled by control protocols as the main service and L2 extensions as the exception • Design the underlay with the VXLAN overlay in mind • Design the network hierarchically: both the underlay as well as the overlay • L3 Gateways are key to a sound overlay design • A combination of pull protocols and push protocols may render optimal scale and resiliency • Link the provisioning of the overlay and scoping of VNIs to the host orchestration system for optimal scale 81
  • 58.
  • 59.