Challenges of L2 NID Based Architecture for vCPE and NFV Deployment

Published in: Internet

  1. Challenges of Layer 2 NID Based Architecture For vCPE/NFV Deployments
     Santanu Dasgupta, Sr. Consulting Engineer – Global Service Provider Network Architecture
     BDNOG-3, May, 2015
  2. Background and Context
     © 2013-2014 Cisco and/or its affiliates. All rights reserved.
  3. “Network Functions” in SP Network Architecture Landscape
     [Diagram: network functions placed across the SP architecture: Access, Metro Network Infrastructure, Gateways / Service Edge, Core and Data Center Network Infrastructure, Services, and OSS/BSS Subsystems and Control]
  4. Virtualization of “Network Functions”
     • Existing hardware / appliance based Network Functions (NFs) become virtualized NFs running as VMs on an x86 server platform
     • Step 1: Decouple software from underlying hardware
     • Step 2: Port it as a VM / container on an x86 server platform, running as a Network Function
     [Diagram: Data Center Switching Infrastructure connecting hypervisors that host vFW, vCPE, vDPI and vLB]
  5. NFV, SDN & Orchestration Together
     Orchestration and SDN Control Function (partial list, just a few main ones are mentioned here):
     • VM / VNF lifecycle management in an end-to-end manner
     • Network plumbing to orchestrate dynamic topologies
     • Configuration management of the VNFs
     • Integration with other DC/POD and the WAN
     • OAM, Assurance, Analytics
     • Standard APIs
     [Diagram: Ethernet Switching Network Underlay connecting Storage and Servers 1 to 3, whose hypervisors host NAT, Firewall and DPI VNFs]
  6. Traditional Managed CPE with IP/MPLS L3VPN
     • There are multiple genuine and perceived issues in the traditional service delivery model:
         • CPE provisioning and servicing often require a truck roll (sending engineers) → high OPEX
         • The number of feature sets enabled on the on-premises CPE makes the solution complex to operate
         • Service delivery is not agile and lacks automation; service turn-up and changes take a lot of time
         • On-site CPEs are often expensive and not an open platform
     • The industry is expecting something that is more open, agile, fully automated, flexible enough to address different market segments, and able to help operators reduce their TCO
     • This is where the discussion of an on-premises L2 NID + vCPE architecture for business VPN started in the industry almost 2 years back
     [Diagram: Branch CE – Carrier Ethernet / Backhaul – PE – MPLS Core (P, MP-BGP) – PE – Carrier Ethernet / Backhaul – Branch CE, with Static / IGP / eBGP between CE and PE]
  7. What is “L2 NID”?
     • Layer 2 NID – Layer 2 Network Interface Device (example: Cisco ME 1200)
     • Some call it a Layer 2 Network Termination Device (NTD) → we will call it L2 NID for this presentation
     • The L2 NID is the device that the Carrier Ethernet operator drops at the customer premises to terminate the Ethernet last mile
     • It is managed by the operator
     • It has user-facing interfaces (UNI) and network-facing interfaces (NNI) – typically all Ethernet
     • It marks the demarcation point between the operator and the customer network
     • The L2 NID is typically a 4 to 6 port FE/GE/10GE L2 switch with some other capabilities, such as:
         • Ethernet OAM (CFM, Y.1731) for fault & performance management, Service Activation (Y.1564), timing support (mobile b/h) etc.
     [Diagram: Customer Premises – L2 NID (demarcation) – Carrier Ethernet (AGG, NPE; typical Carrier Ethernet operation domain) – PE – MPLS Core]
  8. L2 NID + vCPE Architecture Proposal for Managed CPE/VPN (animated slide)
     • No need to have a Layer 3 CPE at the customer premises anymore
     • Virtualize the L3 CPE and put it at the SP’s POP or Cloud / NFV Data Center
     • Simplify the branch to only one device, with the complex features running in the SP’s cloud, making it easier to operate → may also help to reduce cost
     [Diagram: one device on customer prem (L2 NID, a very simple on-premises CPE) – Carrier Ethernet (AGG, NPE) carrying Layer 2 backhaul of branch traffic to the vCPE – vCPE (L3 CPE virtualized, complex features running at the SP’s premises) – PE-CE routing – PE – MPLS Core]
  9. L2 NID + vCPE Architecture Proposal for Managed CPE/VPN (animation build of the previous slide; same bullets and diagram)
  10. Why The L2 NID Based Alternate Looked Promising?
     • Traditional model: two devices on customer prem (L2 NID and CPE); PE-CE routing over Carrier Ethernet transport
     • L2 NID + vCPE model: one device on customer prem (L2 NID); Layer 2 backhaul of branch traffic to the vCPE; PE-CE routing between vCPE and PE
     • Reduction of customer premises devices from two to one was promising to reduce cost and complexity
     [Diagram: the two models side by side, each with Carrier Ethernet (AGG, NPE), PE and MPLS Core]
  11. This Would Enable Agile Service Creation too
     • NFV and orchestration also enable agile service creation and turn-up
     • The vCPE can be chained with a rich set of NFV, Cloud IaaS, PaaS and SaaS services (SP hosted or 3rd party)
     • This is true irrespective of the on-premises CPE type, be it L2 NID or L3 CPE or whatever else
     [Diagram: one device on customer prem (L2 NID) – Carrier Ethernet (AGG, NPE) – NFV DC (vCPE, vFW, vDPI) under NFV and Cloud Orchestration – PE – MPLS Core – PE – Cloud DC (SP / 3rd party) offering IaaS / PaaS / SaaS]
  12. Challenges with the L2 NID Based Architecture, where there is no L3 CPE at the Branch
  13. MAC Address Scale Issues for the NFV DC
     • NFV DCs are built with a network switching underlay → servers aren’t directly connected to the NPE
     • With Layer 2 backhaul of traffic from customer branches, the NFV DC switching layer will learn all customer MAC addresses
     • An example with 4000 customer sites and 250 MAC addresses per site means 1 Million MACs
     • The switching underlay / TOR switches will now need to support and learn 1 Million MAC addresses
         • This impacts the cost of the network, service scale, and convergence time upon failure due to the large table size
     • This can be technically solved with an end-to-end overlay (like GRE or MPLS PW) from branch to vCPE or NPE to vCPE → defeating the original simplicity of the proposed architecture to a major extent
     [Diagram: many L2 NIDs at 250 MAC addresses each – Carrier Ethernet (AGG, NPE) – L2 Trunk / QinQ – TOR Switch – vCPEs – L3 links – L3 VPN PE; callout at the TOR switch: 1 Million MAC Address!!! (4000 customers @250 MAC)]
  14. Security Exposure Due to Extension of the Broadcast Domain
     • Having no L3 CPE at the branch, and the vCPE at the NFV DC, extends the customer’s Layer 2 domain all the way to the NFV DC
     • For a POP with 4000 customers, it means 4000 Layer 2 domains extended into the NFV DC → the SP typically has no control over what assets are at these branches and how secure they are
     • This poses a significant security risk to the SP’s infrastructure from various DDoS/other attacks
     [Diagram: L2 NIDs – Carrier Ethernet (AGG, NPE) – TOR Switch – vCPEs in the NFV DC; the customer’s L2 domain is extended to the NFV DC delivering the vCPE]
  15. Potential Risks Due to Layer 2 Loops
     • In an architecture based only on L2 NIDs, it is critical to demarcate the customer’s STP domains at the L2 NID
     • There could be dual homing situations where the L2 NID may have to participate in the customer’s Spanning Tree domain, and which may also require some form of loop prevention mechanism on the NNI side
     • Such dual-homed connectivity requirements pose risks. Operational errors may cause the SP infrastructure to be impacted by Layer 2 loops originating from a customer branch
     [Diagram: customer LAN’s STP domain – L2 NIDs (the operator’s domain starts here onwards) – Carrier Ethernet (AGG, NPE) – TOR Switch – vCPEs]
  16. High Availability Design Challenges
     • For situations with two vCPEs for HA, the two vCPEs need to run HSRP / VRRP (let’s consider VRRP)
     • There are multiple ways to run the VRRP traffic between the two vCPEs, with different levels of complexity and different degrees of reliability
     • Tracking the “L2 NID ↔ vCPE” connectivity becomes key for reliable bi-directional packet forwarding
     [Diagram: L2 NID – Carrier Ethernet (AGG, NPE) carrying Layer 2 backhaul of branch traffic to vCPE1 and vCPE2 – PE-CE routing – PE – MPLS Core]
  17. High Availability Design Challenges: VRRP on Directly Connected Links
     • Simplest way to run VRRP → but requires an L2 segment between two NFV DCs (typically two sites)
     • Less reliable, since VRRP operation is blind to connectivity failures from the vCPE to the L2 NID
     • If vCPE1 is active on the VRRP segment and the vCPE1 to L2 NID connectivity fails, vCPE1 may continue to remain active → this will cause a service outage
     [Diagram: same topology, with VRRP running directly between vCPE1 and vCPE2]
  18. High Availability Design Challenges: VRRP via the Carrier Ethernet Network
     • To address the reliability issue, VRRP may be carried across the Carrier Ethernet network. Potential alternate paths for the VRRP sessions:
         • vCPE1 – NPE1 – NPE2 – vCPE2 → needs an L2 path, with an FRR-enabled explicit path between the NPEs to force the path
         • vCPE1 – NPE1 – AGG … – AGG – NPE2 – vCPE2
     • More reliable now, since VRRP traffic will be dropped if the L2 NID to vCPE connectivity fails, but:
         • The Carrier Ethernet network has to ensure VRRP packets aren’t dropped during congestion, since that will trigger a false failover
         • VRRP delay timers may not be very aggressive; the Carrier Ethernet network has to ensure minimum delay
         • This is more complex to provision and operate
     [Diagram: L2 NID – Carrier Ethernet (AGG, NPE1, NPE2) – vCPE1 and vCPE2 – PE – MPLS Core, with the VRRP session paths marked]
  19. High Availability Design Challenges: VRRP via the Carrier Ethernet Network + IEEE 802.1ag (CFM)
     • The previous solution is still not end-to-end: it does not cover the AGG to L2 NID connectivity
     • For reliable end-to-end operation, that is a key requirement
     • A way to address this challenge is to use VRRP and CFM (802.1ag) together
     • CFM runs end-to-end from the L2 NID to the vCPE. When the CFM session expires due to any failure on the path → the vCPE interface goes to line protocol “down” state → VRRP traffic can no longer go out of the interface → VRRP switchover to the standby takes place
     • This solves the HA issue, but brings a lot of complexity back into the network
     • We’re trying to remove complexity by removing the L3 CPE from the branch → but we have now introduced different complexities
     [Diagram: same topology, with VRRP between vCPE1 and vCPE2 and 802.1ag CFM sessions from the L2 NID to each vCPE]
  20. High Availability Design Challenges: Upstream Routing from vCPE to the L3VPN PE
     • The vCPEs need to run PE-CE routing with eBGP / IGP or static routing
     • The vCPEs now need to perform conditional route advertisement to the L3 VPN PEs depending on the reachability of the L2 NID from the vCPE and on VRRP status
     • If vCPE1 is the preferred path for the downstream traffic but has lost connectivity to the L2 NID, vCPE1 needs to make the L3VPN PE aware by advertising routes with a less preferred attribute than vCPE2
     • This typically restricts the PE-CE routing protocol to eBGP and may add more complexity to the design
     [Diagram: same topology with VRRP and 802.1ag CFM sessions; PE-CE routing with conditional advertisement depending on vCPE to L2 NID connectivity and VRRP status (typically will require eBGP)]
  21. Lack of L3 Capability @Branch Will Limit Available Services
     • Many customers may require capabilities at the branch that need Layer 3 devices
         • Such as IPSec VPN or WAN Acceleration
     • Many Service Providers are looking to use 3G or 4G LTE as backup connectivity
         • Typical L2 NIDs do not have those interfaces → forcing another CPE for the backup
     • If hierarchical & granular QoS is a requirement at the branch, this could be challenging with an L2 NID too
     [Diagram: Branch – L2 NID – Carrier Ethernet (AGG, NPE) – vCPE1 and vCPE2 running VRRP – PE – MPLS Core]
  22. Conclusion of L2 NID + vCPE Architecture
     • We were attempting to simplify the architecture by removing the L3 CPE from the branch in the managed CPE / VPN architecture
     • But in that process, complexities of other types got injected back into the network:
         • MAC address scaling issue impacting service scale, convergence, cost
         • Security exposure of the vCPE / NFV DC and the SP infrastructure due to extension of L2 domains
         • Possible chances of Layer 2 loops due to operational errors
         • Complex design requirements to satisfy high availability → more difficult to operate
     • It may create further limitations when it comes to service availability at the branch:
         • Layer 3 services such as IPSec, WAN Acceleration etc. are not possible from the branch anymore
         • 3G / 4G LTE on the same device
         • Hierarchical and granular QoS from the branch
  23. So What May We Do To Approach the Problem? This is My Recommendation :)
     • Keep an L3 CPE at the branch, which may be physical or virtual → call it an L3 NID or L3 CPE or whatever you like
     • We need to demarcate the customer’s L2 network at the branch itself; that’s how the networks scaled
         • This helps avoid the MAC address scale, security & L2 loop issues. It also avoids the additional issues with the VRRP design
     • Make the L3 CPE at the branch Zero Touch Provisioning (ZTP) capable, to achieve automation and agility
     • If required, try to simplify the L3 CPE at the branch by reducing the footprint of “enabled features”
         • Provision complex CPE features in the NFV DC (may include advanced routing on a vCPE)
     • Have the ability to service chain the vCPE with a rich set of other functions using the NFV orchestration system → make it agile!
     [Diagram: L3 NID or L3 CPE at the branch (demarcate the customer L2 network here) – Carrier Ethernet (AGG, NPE) – NFV DC (vFW, vCPE*, vDPI) under NFV and Cloud Orchestration – PE – MPLS Core – PE – Cloud DC (SP / 3rd party) offering IaaS / PaaS / SaaS; simple routing like Static / IGP with the vCPE, or PE-CE routing from the branch. vCPE* – may not be required most of the time]
  24. Thank you.
  25. NFV – How to build / Augment Operations skillsets
     • Most existing technologies, protocols and associated skills are equally required
     • On top of that, there is a need to acquire new skills:
         • x86 server virtualization
         • Virtualization on Linux (and KVM/QEMU) environment
         • Cloud orchestration systems, such as OpenStack
         • Virtual switches – OVS, Netmap/VALE, Snabbswitch, vendor specific etc.
         • SDN controllers – OpenDayLight, vendor specific
         • Device programmability and APIs – NETCONF, Yang, RESTCONF, REST APIs, OF….
         • Service Function Chaining – especially NSH (Network Service Header)
         • Network based virtual overlay transport – VXLAN, MPLSoGRE/UDP, LISP, L2TPv3…..
         • Automation tools – puppet / chef etc.
         • Management, Orchestration, OSS fundamentals
         • …..
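
The failure behaviour described on the High Availability slides (17 to 20) can be checked with a small state model. This is an added example, not part of the original deck: the priorities, the split of the vCPE to L2 NID path into a "dc" segment (vCPE to NPE) and an "access" segment (AGG to L2 NID), and the preference labels are assumptions made for illustration.

```python
# Added illustration (not from the slides): a simplified state model of the
# three VRRP designs on slides 17-19 plus the conditional advertisement on 20.
PRIORITY = {"vCPE1": 110, "vCPE2": 100}   # assumed VRRP priorities
SEGMENTS = ("dc", "access")               # assumed split: vCPE<->NPE, AGG<->L2 NID

def path_ok(vcpe, faults):
    """True when every segment between this vCPE and the L2 NID is healthy."""
    return not any((vcpe, seg) in faults for seg in SEGMENTS)

def vrrp_if_up(vcpe, design, faults):
    """Line-protocol state of the vCPE interface that carries VRRP."""
    if design == "cfm":                   # CFM session expiry forces it down
        return path_ok(vcpe, faults)
    return True                           # no CFM: the interface stays up

def hellos_heard(sender, design, faults):
    """Does the peer still receive the sender's VRRP advertisements?"""
    if not vrrp_if_up(sender, design, faults):
        return False
    if design == "direct":                # dedicated L2 segment between the DCs
        return True
    # via_ce / cfm: adverts ride vCPE-NPE-NPE-vCPE, so only "dc" faults drop them
    return not any((v, "dc") in faults for v in PRIORITY)

def masters(design, faults):
    """vCPEs that end up in VRRP master state."""
    result = set()
    for me in PRIORITY:
        peer = next(v for v in PRIORITY if v != me)
        if not vrrp_if_up(me, design, faults):
            continue
        peer_wins = (hellos_heard(peer, design, faults)
                     and PRIORITY[peer] > PRIORITY[me])
        if not peer_wins:
            result.add(me)
    return result

def service_up(design, faults):
    """Branch traffic is forwarded only if some master can reach the L2 NID."""
    return any(path_ok(m, faults) for m in masters(design, faults))

def route_preference(vcpe, faults):
    """Slide 20 (CFM design): what the vCPE signals to the L3VPN PE."""
    healthy = path_ok(vcpe, faults) and vcpe in masters("cfm", faults)
    return "preferred" if healthy else "less-preferred"

if __name__ == "__main__":
    for fault in [set(), {("vCPE1", "dc")}, {("vCPE1", "access")}]:
        for design in ("direct", "via_ce", "cfm"):
            print(sorted(fault), design, sorted(masters(design, fault)),
                  "up" if service_up(design, fault) else "OUTAGE")
```

Only the CFM design survives a fault on either segment. The "via_ce" design recovers from a "dc" fault but still blackholes traffic on an "access" fault, which is the gap slide 19 calls out.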
