Demystifying EVPN in the data center: Part 1 in 2 episode series


Network operators are slowly but surely embracing L3-based leaf-spine designs. However, either due to legacy applications or certain multi-tenancy requirements, the need for L2 across racks is still present. How do you solve the problem of providing L2 across multiple racks? EVPN is quickly emerging as the best answer to this question.

In this episode of our 2-part series on EVPN, we start with a discussion of the use cases, a review of the technologies EVPN competes with, and dive into an evaluation of the pros and cons of each.

For a recording of the live event, go to

Published in: Technology

  1. Operationalizing EVPN in the DC, Part 1: Technology, Use Cases, Bridging
     Dinesh G Dutt | Cumulus Networks, Oct 12, 2017
  2. Agenda
     • What is EVPN?
     • Why should you care?
     • Use cases and requirements
     • BGP models for EVPN in FRR
     • EVPN for bridging
     • Configuring EVPN in FRR
  3. What is EVPN
     • Ethernet VPN, i.e. another form of L2 VPN
       ▪ Different from VPLS
     • Original EVPN RFC: RFC 7432
       ▪ BGP MPLS-based Ethernet VPN
       ▪ Requirements defined in RFC 7209
  4. Primary Goals of EVPN
     • Overcome the limitations of VPLS
       ▪ Support for multihoming and redundancy
       ▪ No data plane learning => no flooding
       ▪ Multicast optimization
       ▪ Allows supporting multiple encapsulation types (signaled via control protocol)
       ▪ Less configuration
  5. Wait! This all sounds like service provider stuff. Why should I care?
  6. The Story In the Data Center So Far
     [Figure: spine-leaf topology with SPINE and LEAF layers]
     • CLOS is the new network architecture
     • IP-based fabrics are in, VLAN/L2-based fabrics are out
     • Scale out wins over scale in
     • Fixed form factor boxes largely win over modular chassis solutions
     • Cloud-native apps rule!
  7. Except...
     • Many enterprise DCs still have plenty of legacy applications, designed with old-world network assumptions
       ▪ See Ivan Pepelnjak’s blog post for these assumptions: html
       ▪ Solutions such as VM Mobility are still steeped in the assumptions of an L2 segment, even though the IP address can be maintained without requiring L2
  8. VxLAN To The Rescue
     • VxLAN has become quite popular as the model for running L2 over a pure L3 network
       ▪ Primarily introduced as a multi-tenant, private cloud story
     • Original script was for a controller-based play
     • But the controller-based play has had a limited run
  9. EVPN in the DC: BGP VxLAN-based Ethernet VPN
  10. Meet The New EVPN
      • A new set of IETF drafts defining the adaptation of EVPN in the data center
      • Base draft is: draft-ietf-bess-evpn-overlay-08
        ▪ A Network Virtualization Overlay Solution Using EVPN
        ▪ VNI (virtual network identifier) replaces VPN in terminology
      • Replaces MPLS-based fabrics with IP-based fabrics:
        ▪ VxLAN, NVGRE, and MPLS over GRE
      • Controller-less VxLAN
  11. EVPN in the DC: Summary
      • Supports extending L2 segments over an IP fabric
      • Supports routing between L2 segments
      • L3 multicast in the overlay is a work in progress
      • BGP is the control plane
      • Multi-vendor support
      • Mainstream introduction of VxLAN routing in merchant silicon
  12. Use Cases & Requirements
  13. Three Primary Use Cases
      • Replace VLAN-based access-agg-core enterprise architecture with EVPN-CLOS based architecture
      • Multi-tenant hosting
      • Data Center Interconnect (DCI)
  14. Replacing L2 Core With L3 Core in Traditional Enterprises
      • Don’t require > 4K VLANs
        ▪ Typically tens to hundreds, maybe a couple of thousand
      • No other orchestrator usually available
        ▪ Orchestrating across compute and network
      • Routing between L2 VNIs mandatory
      • L3 multicast between L2 VNIs may be required
  15. Multi-Tenant DC a.k.a. Private Cloud
      • Require > 4K VNIs in the fabric
      • Routing across VNIs at well-defined points in the network only
        ▪ Routing will be VRF-aware
      • Orchestrator may be present to simplify deployment
        ▪ Example: OpenStack
      • L3 multicast across tenants not common
  16. Datacenter Interconnect (DCI)
      • Stretch L2 segment across DCs
      • Support for isolating control plane chatter across DCs
      • Support for some form of aggregation/summary of MACs to scale out
      • Optimize replication to avoid replicating from local VTEP to every remote VTEP
      • Support multi-homing and redundancy of border routers
      • Translating VNIs
  17. Why Focus on Use Cases?
      • Modern DC networks are built on the KISS principle
        ▪ Keep it simple, stupid
      • Immutable infrastructure is the growing mantra
        ▪ Network doesn’t change dynamically in tune with the app
      • EVPN has the potential to re-introduce all the complexity of old networks into the modern DC network
      • Focusing on use cases and deployment models can put a check on complexity
        ▪ More as we go through the webinar
  18. BGP Deployment Model
  19. What’s iBGP Got To Do With It?
      [Figure: spine-leaf topology with SPINE and LEAF layers]
      • eBGP is the deployment model in the modern DC
      • EVPN is typically deployed as an iBGP model with peering between VTEPs
        ▪ Holdover from the SP world
        ▪ Assumes a separate IGP to set up fabric connectivity
        ▪ Spines become iBGP route reflectors (RR) to avoid an iBGP full mesh
  20. Simplify BGP Deployment Model
      [Figure: spine-leaf topology with SPINE and LEAF layers]
      • Make EVPN BGP peering work over eBGP
      • Leaves peer with spines as usual
      • Spines transport the EVPN AFI/SAFI without pushing state into the data plane (similar to an iBGP RR)
      • Modification: for the EVPN AFI/SAFI, don’t automatically do next-hop-self
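
The effect of not doing next-hop-self on the spine can be sketched in a few lines of Python. This is an illustrative model only, not FRR code: the `EvpnRoute` class, the `spine_readvertise` function, the MAC address and the 192.0.2.x addresses are all hypothetical. Only the two AS numbers are taken from the configuration slides.

```python
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class EvpnRoute:
    """Toy model of an EVPN MAC route (hypothetical, not a wire format)."""
    mac: str
    vni: int
    next_hop: str                 # VTEP IP of the originating leaf
    as_path: Tuple[int, ...]


def spine_readvertise(route, spine_asn, spine_ip, next_hop_self=False):
    """Model an eBGP spine passing an EVPN route on to other leaves.

    The spine prepends its own AS as usual. For EVPN the next hop must stay
    the originating VTEP, so next_hop_self defaults to False here.
    """
    next_hop = spine_ip if next_hop_self else route.next_hop
    return replace(route, next_hop=next_hop, as_path=(spine_asn,) + route.as_path)


# Hypothetical addresses; AS numbers as in the leaf and spine config slides.
leaf_route = EvpnRoute(mac="00:00:5e:00:53:01", vni=33,
                       next_hop="192.0.2.11", as_path=(65456,))
via_spine = spine_readvertise(leaf_route, spine_asn=65535, spine_ip="192.0.2.1")
print(via_spine.next_hop, via_spine.as_path)
```

If the spine rewrote the next hop to itself, remote leaves would build VxLAN tunnels to the spine rather than to the leaf that actually owns the MAC, which is why the EVPN address family needs the exception.
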
  21. EVPN Bridging
  22. BGP and EVPN Basics
      • EVPN uses the l2vpn AFI and the evpn SAFI
      • Multiple different pieces of information to exchange:
        ▪ MAC and MAC/IP along with associated VNI and remote VTEP (VxLAN Tunnel Endpoint) binding
        ▪ List of VNIs each VTEP is interested in
        ▪ Route prefixes (subnet routes)
        ▪ Multicast routes
        ▪ etc.
      • Encoding these different types of information is done by defining route types
        ▪ There are ~12 route types defined today
  23. Basic Bridging in EVPN
      • Forward packets based on MAC address lookup
        ▪ Learn where destination MAC is
        ▪ Learn the source-MAC to port binding
      • Handle BUM (broadcast, unknown unicast, multicast)
        ▪ Send BUM traffic only where desired
      • Optimize L2 multicast
        ▪ Send multicast packets where there are interested listeners
      Slide callouts, in order of appearance:
      • Exchanged via BGP (Type 2 Routes)
      • Traditional Learning
      • Exchanged via BGP (Type 3 Routes)
      • IGMP/MLD Proxy to BGP Type 6 Route
      • Ingress Replication or L3 Multicast
  24. Type 3 Routes Illustrated
      [Figure: leaf-spine topology with leaves L1-L4, spines S1 and S2, and hosts A, B, C, W, X, Y, Z]
      • When the EVPN family is activated, L1 sends a Type 3 route advertisement to its BGP peers indicating it’s interested in the Brown and Blue VNIs
      • S1 and S2 send this information to L2, L3 and L4
      • L2, L3 and L4 learn of L1’s VNI list
      • Similarly, L2, L3 and L4 send their own Type 3 routes
      • At the end, each VTEP has a list of other VTEPs and the list of VNIs they’re interested in
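
The end state of the Type 3 exchange can be modeled as a small Python sketch. It is a toy model under stated assumptions: the function name `exchange_type3` is hypothetical, the spines are treated as transparent relays, and L2's VNI membership ("blue") is assumed for illustration. The slides only state that L1 has brown and blue and that L3 and L4 have brown.

```python
def exchange_type3(local_vnis):
    """Return, for every VTEP, the VNI list it has learned for each other VTEP.

    local_vnis maps a VTEP name to the set of VNIs configured on it. Each VTEP
    originates one advertisement per VNI; the spines simply relay them, so
    every VTEP ends up knowing every other VTEP's VNI list.
    """
    return {
        receiver: {
            sender: set(vnis)
            for sender, vnis in local_vnis.items()
            if sender != receiver
        }
        for receiver in local_vnis
    }


# L1, L3 and L4 follow the slides; L2's membership is an assumption.
vnis = {
    "L1": {"brown", "blue"},
    "L2": {"blue"},
    "L3": {"brown"},
    "L4": {"brown"},
}
views = exchange_type3(vnis)
print(sorted(views["L2"]["L1"]))
```

Note that L2 learns L1's full VNI list even though it has no brown VNI itself; what a VTEP does with that information is a separate, local decision.
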
  25. Illustrating Unknown Unicast Data Plane
      [Figure: leaf-spine topology with leaves L1-L4, spines S1 and S2, and hosts A, B, C, W, X, Y, Z]
      • X sends a packet to Z
      • L1 associates X’s MAC/VNI with the ingress port. Since Z is unknown, L1 does ingress replication to L3 and L4
      • L3 and L4 decapsulate the packet and flood it out all known brown VNI ports, since they don’t know Z’s location either
      • Ingress replication is done only to L3/L4, which have the brown VNI
      • Some switching chips support ECMP after ingress replication; where the chip doesn’t, a static, predefined spreading of traffic is used
      • No egress VTEP learns off of VxLAN packets (implicitly disabled with EVPN)
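
The ingress VTEP's decision on this slide can be sketched as follows. This is a simplified illustration, not switch code: `egress_vteps` is a hypothetical name, the tables are plain dictionaries, and L2's VNI membership is assumed.

```python
def egress_vteps(mac_table, remote_vnis, vni, dst_mac):
    """Pick the remote VTEPs a frame is sent to from the ingress VTEP.

    mac_table:   (vni, mac) -> remote VTEP, filled from the control plane
    remote_vnis: remote VTEP -> set of VNIs it advertised interest in
    A known destination goes to exactly one VTEP. An unknown one is
    ingress-replicated, but only to VTEPs that have the same VNI.
    """
    known = mac_table.get((vni, dst_mac))
    if known is not None:
        return [known]
    return sorted(vtep for vtep, vnis in remote_vnis.items() if vni in vnis)


# L1's view of the fabric; L2's membership is an assumption.
l1_remote_vnis = {"L2": {"blue"}, "L3": {"brown"}, "L4": {"brown"}}

print(egress_vteps({}, l1_remote_vnis, "brown", "mac-of-Z"))
print(egress_vteps({("brown", "mac-of-Z"): "L3"}, l1_remote_vnis, "brown", "mac-of-Z"))
```

With an empty MAC table the frame is replicated to L3 and L4 only; once Z's location is known, a single copy goes to the owning VTEP.
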
  26. Illustrating The Control Plane
      [Figure: leaf-spine topology with leaves L1-L4, spines S1 and S2, and hosts A, B, C, W, X, Y, Z]
      • X sends a packet to Z
      • L1 learns X’s ingress port and sends a Type 2 route with the MAC of X, the VNI, and the VTEP of X to its BGP peers, S1 and S2
      • The spines send the received Type 2 route to their peers, L2-L4. Nothing is installed on the spines themselves
      • L3 and L4 install a MAC table entry with the MAC of X pointing to the VTEP of L1. L2 merely stores this info in the BGP VNI RIB since it has no brown VNI
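
The difference between storing a Type 2 route and installing it can be shown with a minimal Python sketch. The `Vtep` class and its attribute names are hypothetical and only mirror the behavior described on this slide; L2's "blue" membership is an assumption.

```python
class Vtep:
    """Toy VTEP that keeps a BGP RIB and a forwarding MAC table separately."""

    def __init__(self, name, local_vnis):
        self.name = name
        self.local_vnis = set(local_vnis)
        self.bgp_rib = {}    # (vni, mac) -> remote VTEP, every route received
        self.mac_table = {}  # (vni, mac) -> remote VTEP, local VNIs only

    def receive_type2(self, vni, mac, remote_vtep):
        # Every received route is kept in the BGP RIB.
        self.bgp_rib[(vni, mac)] = remote_vtep
        # Only routes for locally configured VNIs reach the MAC table.
        if vni in self.local_vnis:
            self.mac_table[(vni, mac)] = remote_vtep


l2 = Vtep("L2", {"blue"})    # assumed membership; the slide only says "no brown"
l3 = Vtep("L3", {"brown"})
for vtep in (l2, l3):
    vtep.receive_type2("brown", "mac-of-X", "L1")

print(l3.mac_table, l2.mac_table, l2.bgp_rib)
```
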
  27. Handling BUMs
  28. Three Choices For Handling BUMs
      • Head end or ingress replication
      • L3 multicast, i.e. underlay uses multicast
      • Drop unknown unicast and unknown multicast silently
  29. Ingress Replication
      • Keeps the underlay simple
        ▪ No need to set up/debug L3 multicast
      • The default model on Cumulus Linux
      • The most popular when I speak to customers (potential or otherwise)
        ▪ Maybe biased info, since Cumulus only supports this today
  30. L3 Multicast in Underlay
      • Map each VNI’s traffic to an L3 multicast group
      • Ideal is that each VNI is mapped to a separate L3 multicast group
      • Control and data plane efficiency limit the ideal goals
      • More complex because of the additional configuration:
        ▪ Configuring PIM
        ▪ Mapping VNI to L3 multicast group
        ▪ Additional checking if VNI received in group is of interest
      • Only benefit is the ability to handle lots of BUM traffic (or even L2 multicast)
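
Why the extra check is needed when VNIs share a multicast group can be seen in a small sketch. The modulo mapping, the base address 239.1.1.0 and the group count are assumptions chosen for illustration; they are not taken from the slides or from any particular implementation.

```python
import ipaddress


def vni_to_group(vni, base="239.1.1.0", num_groups=4):
    """Map a VNI onto one of a limited number of underlay multicast groups."""
    return str(ipaddress.IPv4Address(base) + (vni % num_groups))


def is_of_interest(vni, local_vnis):
    """A VTEP joined to a shared group must still filter on the VNI itself."""
    return vni in local_vnis


# VNIs 33 and 37 land in the same group under this hypothetical mapping.
print(vni_to_group(33), vni_to_group(37))

# A VTEP that only has VNI 33 still receives VNI 37 traffic and must drop it.
print(is_of_interest(37, {33}))
```

With one group per VNI the filter would never fire, but that ideal runs into the control and data plane limits mentioned on the slide.
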
  31. Drop BUM Traffic
      • Many network admins consider BUM traffic a potential DDoS attack vector
      • A primary goal of EVPN was to eliminate BUM via control plane support
      • Useful mostly if used in conjunction with ARP suppression
      • Primary drawbacks:
        ▪ Inability to handle silent servers (speak only when spoken to). Do these even exist anymore?
        ▪ Slower convergence due to control plane distributing information rather than learning via data plane
  32. Dual-Attached Hosts
  33. Dual-Attached Hosts Deployment Model
      • The two switches a dual-attached host connects to behave no differently w.r.t. BGP for EVPN than for regular BGP
        ▪ Each of the two switches has its own ASN
      • MLAG typically used to provide a single logical bonded interface to the host
      • Peer-link/MLAG is sometimes debated
        ▪ Alternate proposal is to use the L3 core and BGP to exchange relevant information between the switches
        ▪ Type 1 and Type 4 route types defined for this purpose
        ▪ Not commonly deployed or popular
        ▪ May be of interest for data center interconnect switches
  34. VxLAN Configuration for Dual-Attached Hosts
      • Many switching ASICs do not support multiple VTEP IP addresses associated with a MAC/VNI in the MAC table
      • So both switches that a dual-attached host connects to MUST use an anycast IP address as the VTEP IP address
        ▪ Ensure that this anycast VTEP IP is advertised in the BGP underlay
  35. ARP Suppression in EVPN
  36. ARP Suppression
      • Eliminate or reduce ARP broadcasts by providing a local ARP proxy
        ▪ Not a traditional L3 ARP proxy, just an L2 ARP local response
      • Announce MAC/IP binding along with MAC/VNI to VTEP association
        ▪ This is also a Type 2 route
      • Can be enabled on a per-VNI basis
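
A local ARP response of this kind can be sketched as a lookup in a table populated from Type 2 MAC/IP routes. The function name, the addresses and the "flood" fallback are assumptions for illustration; a deployment that drops BUM traffic would behave differently on a miss.

```python
def handle_arp_request(neighbor_table, vni, target_ip):
    """Answer an ARP request locally when the binding is already known.

    neighbor_table maps (vni, ip) -> mac and is assumed to be filled from
    Type 2 MAC/IP routes. On a miss this sketch falls back to flooding.
    """
    mac = neighbor_table.get((vni, target_ip))
    if mac is not None:
        return ("reply", mac)
    return ("flood", None)


# Hypothetical binding learned via BGP for VNI 33.
table = {(33, "10.1.1.20"): "00:00:5e:00:53:20"}

print(handle_arp_request(table, 33, "10.1.1.20"))
print(handle_arp_request(table, 33, "10.1.1.99"))
```

Because the table is keyed by VNI as well as IP, the behavior can be switched on or off per VNI, matching the last bullet on the slide.
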
  37. ARP Suppression: Vendor Notes
      • ARP suppression can be enabled on Cumulus independent of the VTEP being the gateway for that VNI
      • Some other vendors enable this feature only if the VTEP is also the gateway for that VNI
      • Cumulus supports only ARP suppression today; ND support is coming soon
  38. Modifications to Linux Kernel for EVPN Support
  39. Three Primary Modifications to Support EVPN
      • The Linux kernel had three primary modifications:
        ▪ Support for ARP suppression
        ▪ Adding a flag to indicate a MAC table entry was learnt via an external source
        ▪ Adding a flag to indicate an IP/IPv6 neighbor entry was learnt via an external source
      • The first has been upstreamed and accepted into the mainline Linux kernel
      • The two flags are being upstreamed
  40. Configuration Example
  41. Configuration Steps (Cumulus Linux/FRR specific)
      • Configure VxLAN VNI
        ▪ Map the VNI to the VLAN it belongs to
      • Configure BGP
        ▪ eBGP
        ▪ Advertise the IPv4 unicast underlay; announce loopback and VTEP IP address at a minimum
        ▪ Activate the l2vpn/evpn AFI/SAFI and advertise all VNIs
  42. Configure VxLAN VNI (for VNI 33)
        net add interface lo ip address
        net add vxlan vx-33 vxlan id 33
        net add vxlan vx-33 vxlan local-tunnelip
        net add interface vx-33 bridge access 1000
  43. Configure BGP for EVPN (for leaf and spine)
      LEAF CONFIG
        router bgp 65456
         bgp router-id
         neighbor fabric peer-group
         neighbor fabric remote-as external
         neighbor uplink-1 interface peer-group fabric
         neighbor uplink-2 interface peer-group fabric
         address-family ipv4 unicast
          neighbor fabric activate
          redistribute connected
         address-family l2vpn evpn
          neighbor fabric activate
          advertise-all-vni
      SPINE CONFIG
        router bgp 65535
         bgp router-id
         neighbor fabric peer-group
         neighbor fabric remote-as external
         neighbor swp1 interface peer-group fabric
         neighbor swp2 interface peer-group fabric
         address-family ipv4 unicast
          neighbor fabric activate
          redistribute connected
         address-family l2vpn evpn
          neighbor fabric activate
  44. Wait! What? That’s the entire BGP config?
  45. Cisco BGP Config (in comparison, just a leaf)
        router bgp 200
          router-id
          neighbor remote-as 100
            update-source loopback0
            ebgp-multihop 3
            allowas-in
            send-community extended
            address-family l2vpn evpn
              allowas-in
              send-community extended
          neighbor remote-as 100
            update-source loopback0
            ebgp-multihop 3
            allowas-in
            send-community extended
            address-family l2vpn evpn
              allowas-in
              send-community extended
          vrf vxlan-900001
            advertise l2vpn evpn
        evpn
          vni 2001001 l2
            rd auto
            route-target import auto
            route-target export auto
          vni 2001002 l2
            rd auto
            route-target import auto
            route-target export auto
  46. Cisco BGP Config (contd.)
        vrf context vxlan-900001
          vni 900001
          rd auto
          address-family ipv4 unicast
            route-target import 65535:101 evpn
            route-target export 65535:101 evpn
            route-target import 65535:101
            route-target export 65535:101
          address-family ipv6 unicast
            route-target import 65535:101 evpn
            route-target export 65535:101 evpn
            route-target import 65535:101 evpn
            route-target export 65535:101 evpn
  47. FRR’s Simplified Configuration
      • Assume sane defaults
      • Simplify the common case
      • Take out all the stuff that’s inconsequential
      • Those who want all the knobs and warts still have them
      GOAL: Simplify configuration to reduce human error
  48. Summary
      • EVPN is a standards-based technology that allows enterprise networks to run traditional applications over an L3 core
      • EVPN uses VxLAN as its base data plane encapsulation
      • EVPN uses BGP as the control plane
      • FRR/Cumulus Linux use sane defaults to simplify the EVPN configuration and operations
  49. Next Webinar
      Operationalizing EVPN in the DC: Part 2
      Routing with EVPN & Putting It All Together
      Nov 2, 10 AM PDT
  50. Thank you!
      Visit us at or follow us @cumulusnetworks or
      © 2017 Cumulus Networks. Cumulus Networks, the Cumulus Networks Logo, and Cumulus Linux are trademarks or registered trademarks of Cumulus Networks, Inc. or its affiliates in the U.S. and other countries. Other names may be trademarks of their respective owners. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide basis.