Migrating to OpenFlow SDNs
 


Migrating to OpenFlow SDNs: a presentation by Justin Dustzadeh, Huawei, at the US Ignite ONF GENI workshop on October 8, 2013.

Speaker Notes

  • The Migration WG's task is to examine existing examples of SDN deployment, ideally with the goal of full transition. Roughly, the idea is to examine the cases where this has been done and gather best practices from those experiences.
  • The Charter specifies two migration approaches, depicted in Figure 1. The first approach is the more direct method of upgrading existing networking equipment with OpenFlow Agents and decommissioning the Control Machine in favor of OpenFlow Controllers and Configurators.
  • The second is a phased approach, illustrated in Figure 2, in which OpenFlow devices are deployed in conjunction with existing devices. Network operations are maintained by both the existing Control Machine and by OpenFlow Controllers and Configurators. Once services have been migrated to the OpenFlow target network, the starting network is decommissioned.
  • Legacy devices are traditional switches/routers with integrated control and forwarding planes. OpenFlow devices are switches with only an OpenFlow forwarding plane, the control plane residing external to the device. Hybrid OpenFlow Switches are devices with both legacy control and data planes and OpenFlow capabilities.
  • Campus Networks are typically composed of multiple buildings, interconnected with a central operations center. Components of the campus network would include a campus-wide backbone. An egress point to the Wide Area Network is typically associated with a datacenter of some description. Each building will typically have a wiring closet and, in many cases, additional networking/datacenter facilities, be they for different academic departments, administration facilities, or campus-wide IT resources. Enterprise Datacenters can range in size, but are typically composed of networking resources used to interconnect various sub-networks of servers (physical or virtual) together with associated storage (e.g. NAS or SAN), security, and networking functions (e.g. WAN acceleration, load balancing, etc.). Requirements for software-defined networking can vary, but application-driven services rank high on the list. Multi-Tenant Datacenters have benefitted greatly from software-defined networking. These datacenters share many aspects of the typical Enterprise Datacenter; however, multiple tenants must typically share the physical resources. Virtualization of computing resources is almost a necessity, with robust features such as Virtual Machine migration facilitating a variety of capabilities, including resource balancing, maintenance, and disaster recovery. Soft switches within the computing resources themselves are a dominant component of the architecture. The net effect is that portions of the datacenter move and change, demanding that the overlay network move and change to echo those changes. Increasingly, software-defined networking devices help address these requirements. Service Provider/Wide Area Networks introduce significant diversity; service providers' network architectures and requirements vary. For example, a mobile cellular service provider will have a radio network, along with a mobile backhaul network which hands off to an access network and ultimately a core network. Different applications of OpenFlow and SDN are being developed and deployed today. Service providers, such as Google, are using OpenFlow to manage their inter-exchange resources and to ensure appropriate bandwidth is available at appropriate times. Many use cases are being developed by the industry, with software-defined solutions addressing Layer 0 through 7 network domains.
  • Goal was to create a new environment (co-existence model) and let experimenters use it. Gradual migration of users to OpenFlow over a 2-year period (Jun 2009 to Jun 2011). Use of a variety of switches and controllers, including HP, NEC, Nicira, and Big Switch. 3 types of networks: wireline, experimental, and wireless (ofwifi with 30 APs). Emphasis on VLAN configuration: make a new VLAN, migrate users to it, then introduce OpenFlow. Even so, some problems on a VLAN did take down the whole network. 25 wireline users, 77 wireless users, about 30 APs, on the order of 100 subnets. Flow setup time less than 100 ms. Experimental work included traffic engineering and scalability exercises. Use of many existing/custom-built tools, including probing tools and VM-based tools (list can be shared). No major issues with loops. 200-300 flows/second on the wireline network and about 700 flows/second on the wireless network. Traffic engineering algorithms were key to deployment (throughput and rate limits). 3 major types of tools: additional probes on switches (dummy machines), user-installed software, collection on the controller, and a VM circulated to different campuses (further info can be shared). The same switch had OpenFlow and non-OpenFlow VLANs; users were moved from one to the other on the same switch.
  • Manage the risk in deploying. Eventual goal: expand OpenFlow support to several other L2 VLANs and then interconnect them at an L3 router. Tool requirements: oftrace, Wireshark dissector for OpenFlow, Mininet (see the sketch below), OFRewind, Hassel and NetPlumber, ATPG. Gap analysis: add safeguards within switch firmware or the OpenFlow controller to automatically revert configurations; stronger interoperability between the OpenFlow network and the non-OpenFlow network.
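A Mininet-based rehearsal, as mentioned in the tool list above, is one low-risk way to exercise an external OpenFlow controller before any production VLAN is cut over. The sketch below is illustrative only and not part of the Stanford deployment: the controller address (127.0.0.1:6633), the two-switch topology, and the sample subnet are placeholders.

```python
#!/usr/bin/env python
"""Minimal sketch: emulate a small OpenFlow island and verify reachability
before migrating real users. Assumes an external controller on 127.0.0.1:6633."""
from mininet.net import Mininet
from mininet.node import RemoteController, OVSSwitch
from mininet.log import setLogLevel

def rehearse_migration():
    net = Mininet(switch=OVSSwitch, controller=None)
    # Point the emulated switches at the external (to-be-production) controller
    net.addController('c0', controller=RemoteController,
                      ip='127.0.0.1', port=6633)
    s1 = net.addSwitch('s1')
    s2 = net.addSwitch('s2')
    h1 = net.addHost('h1', ip='10.0.98.1/24')  # hypothetical test subnet
    h2 = net.addHost('h2', ip='10.0.98.2/24')
    net.addLink(h1, s1)
    net.addLink(s1, s2)
    net.addLink(h2, s2)
    net.start()
    loss = net.pingAll()   # basic reachability check under OpenFlow control
    print('packet loss: %s%%' % loss)
    net.stop()

if __name__ == '__main__':
    setLogLevel('info')
    rehearse_migration()
```

Running it with a controller already listening gives a quick pass/fail reachability signal before real users are moved onto an OpenFlow-controlled VLAN.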
  • Data plane and BGP control plane are tightly coupled. It is hard to keep up with BGP control plane changes or additional features on vendor-specific OSes and platforms. This puts extra load on the edge router's control plane, which can lead to failures. BGP scale is limited by the CPU/memory resources available on the edge router. It makes BGP configuration, management, monitoring and troubleshooting difficult and complex, especially for large-scale deployments. The network operator spends a significant amount of time creating/maintaining BGP peering sessions and policies manually.
  • In the traditional BGP deployment models, the edge router maintains numerous BGP adjacencies as well as a large number of BGP routes/paths for multiple address families such as IPv4, IPv6, VPNv4, and VPNv6. In addition, to meet customer SLAs, the edge router may be configured with aggressive BGP session or Bidirectional Forwarding Detection (BFD) timers. Handling the BGP state machine, processing BGP updates per configured policies, and calculating best paths for each address family puts a heavy load on the router. Additionally, by definition, service changes are quite frequent on edge routers to provision new customers or update customer policies. Because of limited resources, including CPU and memory, as well as the proprietary nature of the OS, service acceleration and innovation is dependent on vendor implementation. In the traditional deployment model, the Provider Edge (PE) router runs BGP with external BGP-speaking peers. In a typical service provider environment, it is not uncommon for an edge router to maintain 500K+ Internet and/or L3VPN routes. Besides external peerings, the edge router also maintains internal peering sessions, typically with dual Route Reflectors (RR), as depicted in Figure 19. All BGP sessions and policies are typically configured manually using vendor-specific CLI. A BGP-free core is becoming popular among network operators who run some form of encapsulation in the core. Motivations: simplified core architecture, lower cost of core infrastructure, increase in core speed, simplified core management, better control of traffic patterns in the core, and direct preparation for optical switching.
  • Lessons learnt and deployment practices. These are high level and not comprehensive, but can provide some guidelines for others who are planning a similar journey. For example, the lack of fault-tolerant OpenFlow controllers can be mitigated by provisioning multiple OpenFlow controllers to provide redundancy. Similarly, the lack of a BGP relay agent on the OF-enabled device to replicate BGP sessions (resiliency for the BGP-Free Edge use case), and resiliency for the BGP route controller, can be addressed by deploying the controller across multiple VMs and multiple physical servers, similar to cloud infrastructure and NFV. More work is needed on requirements such as resiliency and redundancy for fault-tolerant OpenFlow controllers. Alternative options are available to mitigate the resiliency concerns: deploy multiple OpenFlow controllers to provide redundancy; deploy the BGP controller across multiple VMs/physical servers to avoid a single point of failure (a minimal configuration sketch follows below).
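On Open vSwitch-based equipment, the controller-redundancy recommendation above can be expressed by configuring more than one controller target on a bridge; the switch keeps a connection to each, and controllers can coordinate roles through the OpenFlow role-request mechanism. The following is a minimal sketch wrapping ovs-vsctl, with the bridge name and controller addresses as placeholders; it is not taken from any of the deployments discussed here.

```python
import subprocess

def set_redundant_controllers(bridge="br0",
                              controllers=("tcp:192.0.2.10:6653",
                                           "tcp:192.0.2.11:6653")):
    """Point an Open vSwitch bridge at two OpenFlow controllers so that the
    loss of one controller does not leave the switch unmanaged."""
    # ovs-vsctl set-controller accepts multiple targets for the same bridge
    subprocess.run(["ovs-vsctl", "set-controller", bridge, *controllers],
                   check=True)
    # In 'secure' fail mode the switch keeps its installed flows and does not
    # fall back to standalone L2 learning if every controller is unreachable
    subprocess.run(["ovs-vsctl", "set-fail-mode", bridge, "secure"],
                   check=True)

if __name__ == "__main__":
    set_redundant_controllers()
```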

Migrating to OpenFlow SDNs: Presentation Transcript

  • Justin Dustzadeh Huawei 1 Migrating to OpenFlow SDNs
  • © 2013 Open Networking Foundation Outline 2 • Overview • Migration Use Cases • Conclusion
  • © 2013 Open Networking Foundation Migration Working Group Overview 3 Objective • Accelerate adoption of open SDN; assist network operators with recommendations on SDN migration Timeline • Formed in April 2013, 1st milestone deliverable ready, 3 other milestones through 2Q2014 Focus • Examine real-world migration use-cases, gather best practices and make recommendations on migration methods, tools and systems Who • Team of industry experts and practitioners who have carried out or have interest in carrying out SDN migrations
  • © 2013 Open Networking Foundation ONF Migration Working Group Charter, Goals & Migration Steps 4 1. Identify core requirements of the Target Network 2. Prepare the Starting Network for migration 3. Phased migration of service 4. Validate the result [Diagram: Starting Network devices undergoing phased migration to a Target Network controlled by an OpenFlow Controller]
  • © 2013 Open Networking Foundation What Are We Producing? Migration WG Deliverables 5 1st milestone: • Submit document on use cases and migration methods, leveraging the experience of prior work by network operators 2nd milestone: • Submit document describing the goals and metrics for the migration 3rd milestone: • Publish prototype working code for migration, and validate the metrics 4th milestone: • Demonstration of prototype migration tool chain
  • © 2013 Open Networking Foundation ONF SDN Architecture 6
  • © 2013 Open Networking Foundation SDN Migration Approaches 1. Direct Upgrade 7 Upgrading existing equipment with OpenFlow Agents [Diagram: Starting Network (Operational Support Systems; devices with integrated control) upgraded to Target Network (Operational Support Systems; OpenFlow Controller & Configurator; devices running OpenFlow Agents)]
  • © 2013 Open Networking Foundation SDN Migration Approaches 2. Phased (Parallel) Upgrade 8 [Diagram: Starting Network (Operational Support Systems; devices with integrated control) running in parallel with a phased deployment of the Target Network (Operational Support Systems; OpenFlow Controller & Configurator; OpenFlow devices)]
  • © 2013 Open Networking Foundation SDN Migration Approaches A Closer Look at Device Types 9 • Legacy Switch – Traditional switch/router with integrated control and forwarding plane • OpenFlow Switch – OpenFlow forwarding only, control plane residing external to device • Hybrid Switch – OpenFlow forwarding as well as legacy control and data planes [Diagram: traditional RIB/FIB pipelines alongside an OpenFlow pipeline]
  • © 2013 Open Networking Foundation SDN Migration Approaches A Closer Look at Device Types 10 Three approaches for migration to OpenFlow-based SDN: 1. Legacy to Greenfield 2. Legacy to Mixed 3. Legacy to Hybrid [Diagram: Legacy Switch (traditional RIB/FIB), Hybrid Switch (traditional RIB/FIB plus OpenFlow), and OpenFlow Switch]
  • © 2013 Open Networking Foundation Real-World Migration Approaches Deployment Scenarios 11 1. Legacy to Greenfield – Either no existing deployment, or – Legacy network upgraded to become OpenFlow-enabled and the Control Machine is replaced with an OpenFlow controller [Diagram: legacy network of legacy switches replaced by a greenfield OpenFlow network of OpenFlow switches under an OpenFlow Controller]
  • © 2013 Open Networking Foundation Real-World Migration Approaches Deployment Scenarios 12 2. Legacy to Mixed (or “Ships-in-the-Night”) – New OpenFlow devices are deployed and co-exist with traditional switches/routers and interface with legacy Control Machines – OpenFlow controller and traditional devices need to exchange routing information via the legacy Control Machine [Diagram: legacy network of legacy switches evolving into a mixed network of legacy and OpenFlow switches under an OpenFlow Controller]
  • © 2013 Open Networking Foundation Real-World Migration Approaches Deployment Scenarios 13 3. Hybrid Network Deployment – Mixed Network deployments and Hybrid devices (with both legacy and OpenFlow functionality) coexist – Hybrid devices interface with OpenFlow Controller and legacy Control Machine [Diagram: legacy network of legacy switches evolving into a hybrid OpenFlow network of hybrid and OpenFlow switches under an OpenFlow Controller]
  • © 2013 Open Networking Foundation Real-World Considerations Network Domains and Layers 14 • Service enablement is often the motivation for SDN migration • Services can be end-to-end – Overlay on conceptual (virtual) networks – Spanning several network segments – Several layers of technologies some/all of which addressable by OpenFlow • OpenFlow could address layer 0, 1, 2, 2.5, 3, 4-7 applications – Different use cases requiring specific migration recommendations • Examples – Application-specific capacity scheduling at lower layers, DPI- based service chaining at IP Edge, etc.
  • © 2013 Open Networking Foundation Outline 15 • Overview • Migration Use Cases • Conclusion
  • © 2013 Open Networking Foundation Types of Networks 16 Campus: multiple buildings, campus backbone, groups of users, BYOD, heterogeneous IT. Enterprise DC: various sizes; sub-networks, storage; security; WAN optimization, LB. DC (multi-tenant): multi-tenant, virtualization; mid-size to hyperscale; VM mobility; disaster recovery. WAN: significant diversity, multiple domains, carriers (access, transport), many customers.
  • © 2013 Open Networking Foundation Migration Use Cases 17 1. Campus Network: Stanford OpenFlow deployment 2. Network Edge: NTT’s BGP-Free Edge field trial 3. Inter-Data Center WAN: Google’s SDN-powered WAN (B4) [Diagram: Campus, DC, WAN]
  • © 2013 Open Networking Foundation Campus Network Use Case Stanford OpenFlow Deployment 18 Motivation: • Understand and verify the new SDN technology • Motivate the need for SDN through innovative experiments • Contribute back to OpenFlow specification and community
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Objectives 19 Overview: • Part of Stanford campus network migrated to OpenFlow in 2010 • Migration initially focused on wireless users • Later expanded to selected wired users • Multiple islands across William Gates CS building and Paul Allen CIS building • Eventual goal was to expand OpenFlow support to several other L2 VLANs and then interconnect them at an L3 router
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Topology 20 William Gates CS Building OpenFlow-Enabled Network (McKeown Group): • Production Network in 3A Wing OpenFlow-enabled • 6 48-port 1GE OpenFlow switches from 4 vendors • 30 WiFi APs based on ALIX PCEngine boxes with dual 802.11g interfaces running Linux-based software reference switch from OpenFlow website • 1 WiMAX Base-Station. Paul Allen CIS/CIX Building OpenFlow-Enabled Network: • VLAN 98 was OpenFlow-enabled • 6 48-port 1GE OpenFlow switches from 1 vendor • 14 WiFi APs based on ALIX PCEngine boxes with dual 802.11g interfaces
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Migration Requirements 21 Target Network Requirements: • Network availability > 99.9% • Fail-safe scheme to revert the network back to legacy mode • Network performance close to the legacy network’s performance • No effect on user experience in any way Phased Migration: • Migration planned to provide better visibility into network traffic and allow network experimentation for select users (opt-in) • Migrate select VLANs and users to OpenFlow control, allowing for a clear path of staged deployment within the existing campus network
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Migration Approach 22 Four phases for gradual move of individual users then VLANs to OpenFlow: 1. Add OpenFlow support on hardware (a 1-time firmware update) 2. Verify OpenFlow support on switch: – Add experimental VLAN / test hosts managed by external controller. Once verified, move to next phase 3. Migrate users to new network: – Create new non-OpenFlow network, safely migrate users to new network before using OpenFlow for production traffic (minimizing risk). Main steps: • Add new Production sub-network; gradually add/move users to new subnet; verify reachability within new network 4. Enable OpenFlow for new subnet: – Once the new subnetwork was functional, enable OpenFlow control for that network by configuring the controller – Again, verify correctness, reachability, performance, and stability using standard monitoring tools, and user experience info collected in surveys
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Tools, Monitoring and Statistics 23 [Figures: monitoring infrastructure for the OpenFlow network; data plane statistics to verify stability; control plane statistics (SNAC controller); traffic volume and CPU usage]
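SNAC, used for the control-plane statistics above, is no longer maintained; as an illustrative present-day stand-in, the sketch below collects per-flow counters with the Ryu controller framework. It is not the Stanford tooling, only a minimal example of the kind of control-plane statistics collection the slide describes (the 10-second polling interval is arbitrary).

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import MAIN_DISPATCHER, set_ev_cls
from ryu.lib import hub
from ryu.ofproto import ofproto_v1_3

class FlowStatsMonitor(app_manager.RyuApp):
    """Periodically request flow statistics from every connected switch."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def __init__(self, *args, **kwargs):
        super(FlowStatsMonitor, self).__init__(*args, **kwargs)
        self.datapaths = {}
        self.monitor_thread = hub.spawn(self._monitor)

    @set_ev_cls(ofp_event.EventOFPStateChange, [MAIN_DISPATCHER])
    def _state_change_handler(self, ev):
        # Track switches as they connect
        self.datapaths[ev.datapath.id] = ev.datapath

    def _monitor(self):
        while True:
            for dp in self.datapaths.values():
                parser = dp.ofproto_parser
                dp.send_msg(parser.OFPFlowStatsRequest(dp))
            hub.sleep(10)  # poll every 10 seconds

    @set_ev_cls(ofp_event.EventOFPFlowStatsReply, MAIN_DISPATCHER)
    def _flow_stats_reply_handler(self, ev):
        for stat in ev.msg.body:
            self.logger.info('dpid=%016x packets=%d bytes=%d',
                             ev.msg.datapath.id,
                             stat.packet_count, stat.byte_count)
```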
  • © 2013 Open Networking Foundation Stanford OpenFlow Deployment Migration Acceptance 24 Correctness and Reachability: • Reachability verified using user/probe-generated traffic; completion of requests made confirmed correctness and reachability Performance: • Correlating monitored statistics in the data plane and control plane made it possible to identify anomalies and incorrect behaviors Stability: • Statistics monitored for a long period of time. Progression plots made frequently to verify stability and health of the network Service Acceptance: • Network stability gradually improved as switches, controllers and understanding matured. User surveys stopped providing relevant data as users started seeing consistently acceptable service.
  • © 2013 Open Networking Foundation Network Edge Use Case Problem Statement 25 Challenges with Traditional Models: • Heavy load on edge routers in traditional BGP deployment models – BGP adjacencies, routes/paths for address families: IPv4/6, VPNv4/6… – BGP state machine, policy-based BGP updates, best path calculation – Frequent service changes (provision new customers or update policies) • Limited resources (CPU, memory) and proprietary OS • Service agility & innovation dependent on vendor implementation [Diagram: current BGP deployment model with CEs, PE1/PE2, RR1/RR2, and P routers over an IP/MPLS backbone, with eBGP toward customers and iBGP to the route reflectors]
  • © 2013 Open Networking Foundation Network Edge Use Case NTT’s BGP-Free Edge Field Trial 26 Motivation: • Extend notion of BGP-free core to the edge of the network • Simplified, low-cost routing edge architecture with centralized BGP policy management, leveraging OpenFlow/SDN • Accelerated deployment of edge services [Diagram: same current BGP deployment model as above]
  • © 2013 Open Networking Foundation BGP-Free Edge SDN Architecture 27 Overview: • Move BGP control plane to commodity x86 server and use OpenFlow-enabled switches for the forwarding plane • Simplification of eBGP routing (control plane load) on edge router • Flexibility to calculate customized BGP best paths not only for each ingress point, but also on a per-customer basis [Diagram: BGP-Free Edge SDN architecture with a control layer (BGP Route Controller and OpenFlow Controller) above the PE devices; eBGP sessions from CEs are redirected to the controller over OpenFlow sessions]
  • © 2013 Open Networking Foundation BGP-Free Edge SDN Architecture 28 • Remote BGP peers (e.g. CEs) connected to edge device as before • BGP sessions not handled/terminated by OF-enabled edge device • OpenFlow controller pre-programs default flows on edge device • Edge device sends all BGP control plane traffic from internal and external peers to BGP route controller [Diagram: same BGP-Free Edge SDN architecture as above]
  • © 2013 Open Networking Foundation BGP-Free Edge Pre-Migration Assessment 29 • Support of required scale and future growth • Consistency of OpenFlow versions between controller and switch • A BGP route controller capable of handling BGP process • Ensure that appropriate APIs, scripting, and other operational tools are compatible with the SDN-based deployment • Ensure BGP peer creation and activation can be automated (optional) • Ensure proper training is provided to NOC staff
  • © 2013 Open Networking Foundation BGP-Free Edge Migration Procedures (Ships-in-the-Night) 30 1. Configure an iBGP session between the RR and the BGP route controller so that the BGP route controller can learn routes from the entire network 2. Configure BGP between the OpenFlow controller and the BGP route controller 3. Program a default flow entry via the OpenFlow controller to initially forward matching traffic to the BGP route controller – Alternatively, TCP port 179 can be matched so that only BGP traffic is forwarded to the BGP route controller (a minimal flow-rule sketch follows this slide) 4. Before the migration, capture BGP path information for a random sample of prefixes. This will help validate accurate BGP path information after migration. 5. Configure a VLAN per customer and configure a corresponding BGP session on the BGP route controller 6. Once the session is established, decommission the session on the legacy router 7. The BGP route controller runs the BGP best path selection algorithm and passes the best paths to the OpenFlow controller, which in turn programs the OpenFlow switch 8. Once the forwarding table is programmed, control traffic continues to be forwarded to the BGP route controller while data traffic now follows the path through the OF- and non-OF-enabled switches/routers along the way to the destination 9. Repeat the above steps to migrate the rest of the BGP sessions and additional edge routers
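Step 3 above (pre-programming a default flow, optionally matching TCP port 179 so that only BGP traffic is punted to the BGP route controller) might look roughly like the following Ryu application. The switch port facing the route controller, the OpenFlow version, and the priority value are assumptions for illustration, not NTT's actual implementation.

```python
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3

# Hypothetical port on the edge switch that faces the BGP route controller
ROUTE_CONTROLLER_PORT = 10

class BgpOffload(app_manager.RyuApp):
    """Pre-program default flows that send BGP (TCP/179) traffic to the
    external BGP route controller instead of terminating it on the device."""
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def install_bgp_redirect(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser
        actions = [parser.OFPActionOutput(ROUTE_CONTROLLER_PORT)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        # Match BGP sessions in both directions (destination and source port 179)
        for match in (parser.OFPMatch(eth_type=0x0800, ip_proto=6, tcp_dst=179),
                      parser.OFPMatch(eth_type=0x0800, ip_proto=6, tcp_src=179)):
            dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                          match=match, instructions=inst))
```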
  • © 2013 Open Networking Foundation BGP-Free Edge Migration Procedures 31 [Diagram: CEs (CE1–CE4) attached to PEs (PE1–PE4) over a traditional IP/MPLS core (P1, P2), with a control layer of BGP Route Controller, OpenFlow Controller, and BGP Route Reflector; shows post-migration BGP sessions, BGP sessions not yet migrated, and OpenFlow sessions] • Remove the CE BGP session on the old PE and configure a new BGP session on the BGP controller for the corresponding CE • Remove the BGP session on the RR for the corresponding PE • Establish a BGP session between the BGP route controller and the BGP RR • Establish a routing session between the OF controller and the BGP controller • The OF controller programs the forwarding table on PE1 once it has all the routing information • Repeat the same steps for the rest of the PEs in the network that need to be migrated
  • © 2013 Open Networking Foundation BGP-Free Edge Migration Approach 32 From traditional BGP-speaking edge router to BGP-free paradigm: • Greenfield Deployment – All the edge devices are OpenFlow capable (BGP-free) with BGP terminated at the route controller – Perhaps the easiest migration model • Mixed (“Ships-in-the-Night”) Deployment – A new BGP-free edge router is deployed and will co-exist with other traditional BGP-speaking routers – The new BGP-free edge devices and the traditional devices need to exchange routing information via the BGP route controller • Hybrid Network Deployment – Legacy and OpenFlow devices coexist. The edge switch runs BGP and OpenFlow. The edge router continues to run BGP while BGP sessions and corresponding policies are offloaded to the BGP route controller gradually. – Requires careful planning and a lot more resources during the transition stage, especially since the edge device has to maintain the regular forwarding and OpenFlow forwarding tables along with the BGP table
  • © 2013 Open Networking Foundation BGP-Free Edge Post-Migration 33 Post-Migration Acceptance: • All BGP sessions on the BGP route controller should be up • Ensure the BGP route controller receives and sends all expected BGP routes with proper next hops from customers and the RR, and selects the correct BGP best paths • To ensure BGP routes are learned accurately, compare BGP output of select prefixes with the sample output captured in step 4 of the migration procedure (a minimal comparison sketch follows this slide) Services Acceptance: • Any existing Internet or VPN services should function normally • A random sample of prefixes in the Internet as well as for select customers can be used to validate service continuity • Appropriate troubleshooting steps such as ping and traceroute can be employed to check connectivity.
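The prefix-comparison and ping checks described in this acceptance step can be scripted. The sketch below assumes hypothetical pre.json/post.json snapshots of a {prefix: next-hop} mapping exported from the BGP tables before and after migration, plus a small sample of reachability targets; file names, formats, and addresses are placeholders, not part of the field trial.

```python
import json
import subprocess

def load_snapshot(path):
    """Load a {prefix: next_hop} mapping captured from the BGP table,
    e.g. exported before (pre.json) and after (post.json) migration."""
    with open(path) as f:
        return json.load(f)

def compare_best_paths(pre, post):
    """Report prefixes whose best-path next hop changed or disappeared."""
    for prefix, nh in pre.items():
        if prefix not in post:
            print('MISSING after migration: %s' % prefix)
        elif post[prefix] != nh:
            print('CHANGED %s: %s -> %s' % (prefix, nh, post[prefix]))

def ping_check(targets):
    """Simple service-continuity probe for a sample of customer addresses."""
    for host in targets:
        ok = subprocess.call(['ping', '-c', '3', '-W', '2', host],
                             stdout=subprocess.DEVNULL) == 0
        print('%s %s' % ('OK  ' if ok else 'FAIL', host))

if __name__ == '__main__':
    compare_best_paths(load_snapshot('pre.json'), load_snapshot('post.json'))
    ping_check(['192.0.2.1', '198.51.100.1'])  # sample customer endpoints
```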
  • © 2013 Open Networking Foundation Inter-Data Center WAN Google's OpenFlow-Powered WAN (B4) 34 Overview: • Google’s WAN organized as 2 backbones • Internet-facing (I-scale) network carrying user traffic • Internal (G-scale) network carrying traffic between datacenters – B4: OpenFlow-powered SDN • Use SDN to manage WAN as a fabric versus a collection of boxes – Delivery of Google’s global user-based services (Google Web Search, Google+, Gmail, YouTube, Google Maps, etc.) would not be scalable with the traditional technologies due to their non-linear complexity in management and configuration.
  • © 2013 Open Networking Foundation Google’s WAN (B4) Highlights 35 • 1000s of individual applications, different traffic volumes, different latency sensitivities and different overall priorities – user data copies (e.g., email, documents, audio/video files) to remote data centers for availability/durability – remote storage access for computation over inherently distributed data sources – large-scale data push synchronizing state across multiple data centers • Example: the user-data represents the lowest volume on B4, is the most latency sensitive, and is of the highest priority. • B4 was built with a 3-layer architecture: – Switch hardware layer – Site controller layer – Global control layer
  • © 2013 Open Networking Foundation Google’s WAN (B4) Architecture 36 • Switch hardware layer – Switch hardware custom built from multiple merchant networking chips – Forwards traffic and does not run complex control software • Site controller layer – Network control systems hosting OpenFlow controllers and network control applications • Global layer – Logically-centralized applications (e.g. an SDN Gateway and a central TE server) that enable the central control of the entire network Rather than building one integrated, centralized service combining routing and traffic engineering, Google chose to deploy routing and traffic engineering as independent services, with the standard routing service deployed initially and central TE subsequently deployed as an overlay. • Focus initial work on SDN infrastructure • Able to fall back to shortest path routing in case of issues with the TE service
  • © 2013 Open Networking Foundation Google’s WAN (B4) Pre-Migration Assessment 37 A number of B4’s characteristics led to the design approach: • Elastic bandwidth demands: – The majority of Google's data center traffic involves synchronizing large data sets across sites. These applications benefit from as much bandwidth as they can get but can tolerate periodic failures with temporary bandwidth reductions. • Moderate number of sites: – While B4 must scale among multiple dimensions, targeting the data center deployments meant that the total number of WAN sites would be a few dozen.
  • © 2013 Open Networking Foundation Google’s WAN (B4) Pre-Migration Assessment 38 • End application control: – Google controls both the applications and the site networks connected to B4. Hence, it can enforce relative application priorities and control bursts at the network edge, rather than through over provisioning or complex functionality in B4. • Cost sensitivity: – B4’s capacity targets and growth rate led to unsustainable cost projections. – The traditional approach of provisioning WAN links at 30-40% (or 2-3x the cost of a fully utilized WAN) to protect against failures and packet loss, combined with prevailing per-port router cost, would make the network prohibitively expensive.
  • © 2013 Open Networking Foundation Google’s WAN (B4) Migration Approach 39 • Integration of the target network with the legacy routing – Provide a gradual path for enabling OpenFlow in the production network • BGP integration as a step toward deploying new protocols customized to the requirements of, for instance, a private WAN setting • Migration path moved in stages from a fully distributed monolithic control and data plane hardware architecture to a physically decentralized (though logically centralized) control plane architecture • The hybrid migration for the Google B4 network proceeded in 3 general stages (see next slides)
  • © 2013 Open Networking Foundation Google’s WAN (B4) Migration Approach (Step 1: Legacy) 40 1. Legacy: In the initial stage, the network connects Data Centers through legacy nodes using E/IBGP and ISIS routing. Cluster Border routers interface the Data Centers to the network. [Figure: Legacy Hybrid SDN Deployment]
  • © 2013 Open Networking Foundation Google’s WAN (B4) Migration Approach (Step 2: Mixed) 41 2. Mixed: In this phase, a subset of nodes in the network are OpenFlow-enabled and controlled by the logically-centralized controller utilizing Paxos, an OpenFlow controller, and Quagga. [Figure: Mixed Hybrid SDN Deployment]
  • © 2013 Open Networking Foundation Google’s WAN (B4) Migration Approach (Step 3: Final) 42 3. Final: All nodes are OpenFlow-enabled and the controller controls the entire network. There is no direct correspondence between the Data Center and the network. The controller also has a TE server that guides the Traffic Engineering in the network. [Figure: Hybrid SDN Deployment]
  • © 2013 Open Networking Foundation Google’s WAN (B4) Post-Migration Acceptance 43 • Google’s WAN (B4) has been in deployment for 3 years • Carries more traffic than Google’s public-facing WAN, and has a higher growth rate • Among the first and largest SDN/OpenFlow deployments • Scales to meet application bandwidth demands more efficiently than would otherwise be possible • Supports rapid deployment and iteration of novel control functionality such as TE • Enables tight integration with end applications for adaptive behavior in response to failures or changing communication patterns
  • © 2013 Open Networking Foundation Outline 44 • Overview • Migration Use Cases • Conclusion
  • © 2013 Open Networking Foundation Example Guidelines Recommendations, Best Practices, etc. 45 Example Recommendations and Best Practices: • Focus on service continuity with minimal disruption • Analysis of OpenFlow features and desired capabilities on the controller and OpenFlow switch • Detailed gap analysis to understand impact on existing services • Availability of alternate options to mitigate risk during migration • Consistency of OpenFlow versions between controller and switch • OpenFlow switch must be upgraded to run appropriate code and hardware firmware before migration can be initiated • Provisioning of necessary network management tools for migrated network for proper management and monitoring of traffic & devices
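The "consistency of OpenFlow versions between controller and switch" check above can be automated on Open vSwitch-based gear. A minimal sketch, with the bridge name as a placeholder and not tied to any particular vendor's equipment:

```python
import subprocess

def enabled_openflow_versions(bridge="br0"):
    """Return the OpenFlow protocol versions enabled on an Open vSwitch bridge,
    so they can be compared with the versions the controller supports."""
    out = subprocess.run(["ovs-vsctl", "get", "bridge", bridge, "protocols"],
                         capture_output=True, text=True, check=True).stdout
    # Output looks like: ["OpenFlow10", "OpenFlow13"]
    return [v.strip(' "') for v in out.strip("[]\n ").split(",") if v.strip()]

def pin_openflow_version(bridge="br0", version="OpenFlow13"):
    """Pin the bridge to a single OpenFlow version agreed with the controller."""
    subprocess.run(["ovs-vsctl", "set", "bridge", bridge,
                    "protocols=%s" % version], check=True)

if __name__ == "__main__":
    print(enabled_openflow_versions())
```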
  • © 2013 Open Networking Foundation Example Guidelines Recommendations, Best Practices, etc. 46 Example Recommendations and Best Practices: (cont’d) • Detailed method of procedure for step-by-step migration with back-out procedures clearly documented in case of unexpected results • Investigate if reverting the configuration can be automated to minimize disruption in case of deteriorated performance • Create pre- and post-migration checklists with specific samples of applications and/or source/destination prefixes which will be used for connectivity and service continuity checks • Appropriate troubleshooting steps such as ping, trace or accessing an application can be employed to check the connectivity • In a mixed environment, a dummy service such as a customer VPN can be created to verify service availability
  • © 2013 Open Networking Foundation Summary 47 • OpenFlow still evolving as new use cases and deployment models emerge • Legacy networks can successfully migrate to OpenFlow-based SDN – The 3 use cases illustrate diverse migration scenarios for WAN, campus/LAN and service provider/Internet edge – Google and Stanford use cases (both in production) illustrate good examples of successful migration to OpenFlow – Alternative options available today to address any gaps with OpenFlow • More work ahead – Share your real-world SDN migration experience with the community
  • © 2013 Open Networking Foundation How Can You Get Involved? Migration Working Group Charter 48 1st milestone: • Submit document on use cases and migration methods, leveraging the experience of prior work by network operators 2nd milestone: • Submit document describing the goals and metrics for the migration. 3rd milestone: • Publish prototype working code for migration, and validate the metrics. 4th milestone: • Demonstration of prototype migration tool chain.