Network Control and Management in the 100x100 Architecture
Supporting Network Expansion <ul><li>Networks are constantly growing </li></ul><ul><ul><li>New routers/switches/links adde...
Supporting Network Expansion Today <ul><li>Routers run a link-state routing protocol </li></ul><ul><ul><li>Size of link-st...
Supporting Remote Devices <ul><li>Maintaining communication with all network devices is critical for network management </...
Supporting Remote Devices Today <ul><li>Today’s “Solution” </li></ul><ul><ul><li>Use PSTN as management network of last re...
Network Control and Management Today <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding...
Network Control and Management Today <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding...
A Study of Operational Production Networks <ul><li>How complicated/simple are real control planes? </li></ul><ul><li>What ...
Learning from Ethernet Evolution Experience Current Implementations:   Everything Changed Except Name and Framing  Etherne...
Ethernet: Re-inventing the Wheel <ul><li>Becoming as service-rich and complex as IP </li></ul><ul><ul><li>Traffic engineer...
Control/Management Needs of 100x100 Network Architecture <ul><li>Control/Management creates logical network from physical ...
  • Great Need: current control plane promotes ossification and fragility
  • Fix up the graphics on this slide
  • Fix up the graphics on this slide
  • Focus on what routers do, shouldn’t do. Don’t overreach the paper. Focus on architectural issues. Reviewers like the example
  • Routes flow like water through the graph, gated by policy on the links
  • Break up into two slides? Redraw size dist graphs with thicker lines and larger fonts.
  • Transcript of "Network-Control-and-.."

    1. 1. Network Control and Management in the 100x100 Architecture
    2. 2. The Role of Network Control and Management <ul><li>Many different network environments </li></ul><ul><ul><li>Data center networks, enterprise/campus </li></ul></ul><ul><ul><li>Access, backbone networks </li></ul></ul><ul><li>Many different technologies </li></ul><ul><ul><li>Longest-prefix routing, label switching, switching </li></ul></ul><ul><ul><li>IP, MPLS, ATM, optical circuits </li></ul></ul><ul><li>Many different policies </li></ul><ul><ul><li>Routing, reachability, transit, traffic engineering, robustness </li></ul></ul><ul><li>The control plane software binds these elements together and defines the network </li></ul>
    3. 3. Control Plane: The Key Leverage Point <ul><li>Great Potential: control plane determines the behavior of the network </li></ul><ul><ul><li>Reaction to events, reachability, services </li></ul></ul><ul><li>Great Opportunities </li></ul><ul><ul><li>A radical clean-slate control plane can be deployed </li></ul></ul><ul><ul><ul><li>Agnostic to packet format: IPv4/v6, ethernet </li></ul></ul></ul><ul><ul><ul><li>No changes to end-system software </li></ul></ul></ul><ul><ul><li>Control plane is the nexus of network evolution </li></ul></ul><ul><ul><ul><li>Changing the control plane logic can smooth transitions in network technologies and architectures </li></ul></ul></ul>
    4. 4. 100x100 Project Themes ?? Incentives vs. structure Information hiding Economics Time/space correlation, end-system infrastructure coordination ?? Exploit structure to achieve efficiency Fundamental primitives Security Network-wide abstractions Explicitly modeled Re-factoring Control/ management Holistic Design Structure Clean Slate
    5. 5. A Clean-slate Design <ul><li>What are the fundamental causes of outages? </li></ul><ul><li>How to reduce/simplify the software in networks? </li></ul><ul><ul><li>Control logic is software – no reason it should be hard to update, but how to avoid complexity pitfalls? </li></ul></ul><ul><li>What functionality needs to be distributed – what can be centralized? </li></ul><ul><ul><li>What would a “RISC” router look like? </li></ul></ul><ul><li>Leverage technology trends </li></ul><ul><ul><li>CPU and link-speed growing faster than # of switches </li></ul></ul>
    6. 6. Three Principles for Network Control & Management <ul><li>Network-level Objectives: </li></ul><ul><li>Express goals explicitly </li></ul><ul><ul><li>Security policies, QoS, egress point selection </li></ul></ul><ul><li>Do not bury goals in box-specific configuration </li></ul>Management Logic Reachability matrix Traffic engineering rules
    7. 7. Three Principles for Network Control & Management <ul><li>Network-wide Views: </li></ul><ul><li>Design network to provide timely, accurate info </li></ul><ul><ul><li>Topology, traffic, resource limitations </li></ul></ul><ul><li>Give logic the inputs it needs </li></ul>Management Logic Reachability matrix Traffic engineering rules Read state info
    8. 8. Three Principles for Network Control & Management <ul><li>Direct Control: </li></ul><ul><li>Allow logic to directly set forwarding state </li></ul><ul><ul><li>FIB entries, packet filters, queuing parameters </li></ul></ul><ul><li>Logic computes desired network state, let it implement it </li></ul>Management Logic Reachability matrix Traffic engineering rules Read state info Write state
    9. 9. Overview of the 4D Architecture <ul><li>Decision Plane: </li></ul><ul><li>All management logic implemented on centralized servers making all decisions </li></ul><ul><li>Decision Elements use views to compute data plane state that meets objectives, then directly write this state to routers </li></ul>Decision Dissemination Discovery Data Network-level objectives Direct control Network-wide views
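The three principles and the decision plane described above can be illustrated with a small sketch. It is not the prototype's code: the function names, the dictionary-based network-wide view, and the choice to install drop filters at the router where the source subnet attaches are all assumptions made for illustration. The decision element reads a view (topology and subnet attachments), takes a network-level objective (a set of denied subnet pairs standing in for the reachability matrix), and computes the FIB entries and packet filters it would write directly to each router.

```python
from collections import deque


def shortest_path_next_hops(topology, dest):
    """BFS outward from dest; returns {router: next hop toward dest}."""
    next_hop = {}
    seen = {dest}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        for nbr in sorted(topology[node]):
            if nbr not in seen:
                seen.add(nbr)
                next_hop[nbr] = node  # nbr forwards to node to move closer to dest
                queue.append(nbr)
    return next_hop


def compute_state(topology, attached, denied):
    """Compute data plane state from a network-wide view and an objective.

    topology: {router: set of neighbor routers}   (hypothetical view format)
    attached: {subnet: router the subnet hangs off}
    denied:   set of (src_subnet, dst_subnet) pairs that must not communicate
    """
    fibs = {router: {} for router in topology}
    filters = {router: [] for router in topology}
    for subnet, router in attached.items():
        for other, hop in shortest_path_next_hops(topology, router).items():
            fibs[other][subnet] = hop
        fibs[router][subnet] = "local"
    # Assumption: enforce each denied pair where the source subnet attaches,
    # so the filter holds no matter which path the routing later selects.
    for src, dst in sorted(denied):
        filters[attached[src]].append(("drop", src, dst))
    return fibs, filters
```

Because the filters are derived from the objective every time the state is recomputed, a topology change triggers a fresh computation instead of leaving stale per-box configuration behind.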
    10. 10. Overview of the 4D Architecture <ul><li>Dissemination Plane: </li></ul><ul><li>Provides a robust communication channel to each router – and robustness is the only goal! </li></ul><ul><li>May run over same links as user data, but logically separate and independently controlled </li></ul>Decision Dissemination Discovery Data Network-level objectives Direct control Network-wide views
    11. 11. Overview of the 4D Architecture <ul><li>Discovery Plane: </li></ul><ul><li>Each router discovers its own resources and its local environment </li></ul><ul><li>E.g., the identity of its immediate neighbors </li></ul>Decision Dissemination Discovery Data Network-level objectives Direct control Network-wide views
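As a rough illustration of how per-router discovery could feed a network-wide view, the sketch below merges neighbor reports into a topology. The report format and the rule that a link counts only when both ends report each other are assumptions for this example, not details taken from the slides.

```python
def build_view(reports):
    """Merge per-router neighbor reports into a topology view.

    reports: {router: iterable of neighbor ids it discovered}  (hypothetical format)
    A link is kept only if both endpoints report each other (an assumed
    sanity check against one-way or stale adjacencies).
    """
    view = {router: set() for router in reports}
    for router, neighbors in reports.items():
        for nbr in neighbors:
            if router in reports.get(nbr, ()):
                view[router].add(nbr)
    return view
```

For example, if R1 reports R2 and R3 but R3 reports no neighbors, only the R1 to R2 link appears in the resulting view.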
    12. 12. Overview of the 4D Architecture <ul><li>Data Plane: </li></ul><ul><li>Spatially distributed routers/switches </li></ul><ul><li>Ideally exposes interface to tables in hardware </li></ul><ul><li>Can deploy with today’s technology </li></ul>Decision Dissemination Discovery Data Network-level objectives Direct control Network-wide views
    13. 13. Concerns and Challenges <ul><li>How does the 4D simplify the problem? </li></ul><ul><li>How will communication between routers and DEs survive failures in the network? </li></ul><ul><ul><li>Can a robust dissemination plane be built? </li></ul></ul><ul><li>Latency means DE’s view of network is behind reality. Will the control loop be stable? </li></ul><ul><li>What is the overhead to/from the DEs? </li></ul><ul><li>What happens in a network partition? </li></ul>
    14. 14. Fundamental Problem: Wrong Abstractions <ul><li>Management Plane </li></ul><ul><li>Figure out what is happening in network </li></ul><ul><li>Decide how to change it </li></ul>Shell scripts Traffic Eng Databases Planning tools OSPF SNMP netflow modems Configs Link metrics <ul><li>Control Plane </li></ul><ul><li>Multiple routing processes on each router </li></ul><ul><li>Each router with different configuration program </li></ul><ul><li>Huge number of control knobs: metrics, ACLs, policy </li></ul>FIB FIB FIB Routing policies Packet filters <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding, filtering, queueing </li></ul><ul><li>Based on FIB or labels </li></ul>OSPF BGP OSPF BGP OSPF BGP
    15. 15. Good Abstractions Reduce Complexity <ul><li>All decision making logic lifted out of control plane </li></ul><ul><li>Eliminates duplicate logic in management plane </li></ul><ul><li>Dissemination plane provides robust communication to/from data plane switches </li></ul>Management Plane Control Plane Data Plane Decision Plane Dissemination Data Plane Configs FIBs, ACLs FIBs, ACLs
    16. 16. 4D Separates Distributed Computing Issues from Networking Issues <ul><li>Distributed computing issues → protocols and network architecture </li></ul><ul><ul><li>Overhead </li></ul></ul><ul><ul><li>Resiliency </li></ul></ul><ul><ul><li>Scalability </li></ul></ul><ul><li>Networking issues → management logic </li></ul><ul><ul><li>Traffic engineering and service provisioning </li></ul></ul><ul><ul><li>Egress point selection </li></ul></ul><ul><ul><li>Reachability control (VPNs) </li></ul></ul><ul><ul><li>Precomputation of backup paths </li></ul></ul>
    17. 17. 4D Can Leverage Network Structure <ul><li>Decision plane logic can be specialized for structure of each physical network </li></ul><ul><ul><li>Distributed protocols must be prepared for arbitrary topology graphs </li></ul></ul><ul><ul><li>4D enables network logic specialized differently for access and for backbone </li></ul></ul><ul><li>Advantages </li></ul><ul><ul><li>Faster route computations </li></ul></ul><ul><ul><li>Retain flexibility to evolve network as needed </li></ul></ul><ul><ul><li>Support transition to 100x100 architecture </li></ul></ul>
    18. 18. The Feasibility of the 4D Architecture <ul><li>We designed and built a prototype of the 4D Architecture </li></ul><ul><li>4D Architecture permits many designs – prototype is a single, simple design point </li></ul><ul><li>Decision plane </li></ul><ul><ul><li>Contains logic to simultaneously compute routes and enforce reachability matrix </li></ul></ul><ul><ul><li>Multiple Decision Elements per network, using simple election protocol to pick master </li></ul></ul><ul><li>Dissemination plane </li></ul><ul><ul><li>Uses source routes to direct control messages </li></ul></ul><ul><ul><li>Extremely simple, but can route around failed data links </li></ul></ul>
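The slide above names two mechanisms without giving details: a simple election among Decision Elements and source-routed control messages. The sketch below shows one plausible shape for each. The lowest-ID-among-live-DEs rule, the heartbeat timeout, and the breadth-first route search are assumptions chosen for brevity, not the prototype's actual protocol.

```python
from collections import deque


def elect_master(last_heartbeat, now, timeout=1.0):
    """Pick the lowest-id DE heard from within `timeout` seconds (assumed rule)."""
    alive = [de for de, seen_at in last_heartbeat.items() if now - seen_at <= timeout]
    return min(alive) if alive else None


def source_route(up_links, src, dst):
    """Return a hop list from src to dst using only links that are up, or None.

    up_links: set of frozenset({a, b}) for each working link (hypothetical format).
    The returned list is what a control message would carry as its source route.
    """
    adj = {}
    for link in up_links:
        a, b = sorted(link)
        adj.setdefault(a, []).append(b)
        adj.setdefault(b, []).append(a)
    prev = {src: None}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            path = []
            while node is not None:
                path.append(node)
                node = prev[node]
            return path[::-1]
        for nbr in sorted(adj.get(node, [])):
            if nbr not in prev:
                prev[nbr] = node
                queue.append(nbr)
    return None
```

Recomputing `source_route` with a failed link removed from `up_links` is how this sketch routes around failed data links; a stale heartbeat from the master makes `elect_master` return the next DE.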
    19. 19. Evaluation of the 4D Prototype <ul><li>Evaluated using Emulab (www.emulab.net) </li></ul><ul><ul><li>Linux PCs used as routers (650–800 MHz) </li></ul></ul><ul><ul><li>Tested on 9 enterprise network topologies (10–100 routers each) </li></ul></ul>Example network with 49 switches and 5 DEs
    20. 20. Performance of the 4D Prototype <ul><li>Trivial prototype has performance comparable to well-tuned production networks </li></ul><ul><li>Recovers from single link failure in < 300 ms </li></ul><ul><ul><li>< 1 s response considered “excellent” </li></ul></ul><ul><li>Survives failure of master Decision Element </li></ul><ul><ul><li>New DE takes control within 1 s </li></ul></ul><ul><ul><li>No disruption unless second fault occurs </li></ul></ul><ul><li>Gracefully handles complete network partitions </li></ul><ul><ul><li>Less than 1.5 s of outage </li></ul></ul>
    21. 21. Future Work <ul><li>Scalability </li></ul><ul><ul><li>Evaluate over 1-10K switches, 10-100K routes </li></ul></ul><ul><ul><li>Networks with backbone-like propagation delays </li></ul></ul><ul><li>Structuring decision logic </li></ul><ul><ul><li>Arbitrate among multiple, potentially competing objectives </li></ul></ul><ul><ul><li>Unify control when some logic takes longer than others </li></ul></ul><ul><li>Protocol improvements </li></ul><ul><ul><li>Better dissemination and discovery planes </li></ul></ul><ul><li>Deployment in today’s networks </li></ul><ul><ul><li>Data center, enterprise, campus, backbone (RCP) </li></ul></ul>
    22. 22. Themes of Network Control & Management <ul><li>Holistic Design </li></ul><ul><li>Many different technologies – a few common problems </li></ul><ul><li>Find the right abstractions: exploit commonality </li></ul><ul><li>Clean Slate </li></ul><ul><li>How much autonomy do routers/switches need? </li></ul><ul><li>New principles for controlling networks </li></ul><ul><li>Eliminate duplicate logic </li></ul><ul><li>Leverage Network Structure </li></ul><ul><li>Many different types of networks exist - each with different objectives and topologies </li></ul><ul><li>Separate networking issues from distributed system issues </li></ul>
    23. 23. Recent Results <ul><li>G. Xie, J. Zhan, D. A. Maltz, H. Zhang, A. Greenberg, G. Hjalmtysson, J. Rexford, “On Static Reachability Analysis of IP Networks,” IEEE INFOCOM 2005, Orlando, FL, March 2005. </li></ul><ul><li>J. Rexford, A. Greenberg, G. Hjalmtysson, D. A. Maltz, A. Myers, G. Xie, J. Zhan, H. Zhang, “Network-Wide Decision Making: Toward a Wafer-Thin Control Plane,” Proceedings of ACM HotNets-III, San Diego, CA, November 2004. </li></ul><ul><li>D. A. Maltz, J. Zhan, G. Xie, G. Hjalmtysson, A. Greenberg, H. Zhang, “Routing Design in Operational Networks: A Look from the Inside,” Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communications (ACM SIGCOMM 2004), Portland, Oregon, 2004. </li></ul><ul><li>D. A. Maltz, J. Zhan, G. Xie, H. Zhang, G. Hjalmtysson, A. Greenberg, J. Rexford, “Structure Preserving Anonymization of Router Configuration Data,” Proceedings of ACM/Usenix Internet Measurement Conference (IMC 2004), Sicily, Italy, 2004. </li></ul>
    24. 24. Questions?
    25. 25. Fundamental Problem: Wrong Abstractions <ul><li>interface Ethernet0 </li></ul><ul><li>ip address 6.2.5.14 255.255.255.128 </li></ul><ul><li>interface Serial1/0.5 point-to-point </li></ul><ul><li>ip address 6.2.2.85 255.255.255.252 </li></ul><ul><li>ip access-group 143 in </li></ul><ul><li>frame-relay interface-dlci 28 </li></ul><ul><li>router ospf 64 </li></ul><ul><li>redistribute connected subnets </li></ul><ul><li>redistribute bgp 64780 metric 1 subnets </li></ul><ul><li>network 66.251.75.128 0.0.0.127 area 0 </li></ul><ul><li>router bgp 64780 </li></ul><ul><li>redistribute ospf 64 match route-map 8aTzlvBrbaW </li></ul><ul><li>neighbor 66.253.160.68 remote-as 12762 </li></ul><ul><li>neighbor 66.253.160.68 distribute-list 4 in </li></ul><ul><li>access-list 143 deny 1.1.0.0/16 </li></ul><ul><li>access-list 143 permit any </li></ul><ul><li>route-map 8aTzlvBrbaW deny 10 </li></ul><ul><li>match ip address 4 </li></ul><ul><li>route-map 8aTzlvBrbaW permit 20 </li></ul><ul><li>match ip address 7 </li></ul><ul><li>ip route 10.2.2.1/16 10.2.1.7 </li></ul>
    26. 26. Fundamental Problem: Wrong Abstractions [Figure: Size of configuration files in a single enterprise network (881 routers). X-axis: Router ID (sorted by file size), 0 to 881. Y-axis: Lines in config file, 0 to 2000.]
    27. 27. Fundamental Problem: Wrong Abstractions <ul><li>Management Plane </li></ul><ul><li>Figure out what is happening in network </li></ul><ul><li>Decide how to change it </li></ul>Shell scripts Traffic Eng Databases Planning tools OSPF SNMP netflow modems Configs Link metrics <ul><li>Control Plane </li></ul><ul><li>Multiple routing processes on each router </li></ul><ul><li>Each router with different configuration program </li></ul><ul><li>Huge number of control knobs: metrics, ACLs, policy </li></ul>FIB FIB FIB Routing policies Packet filters <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding, filtering, queueing </li></ul><ul><li>Based on FIB or labels </li></ul>OSPF BGP OSPF BGP OSPF BGP
    28. 28. Good Abstractions Reduce Complexity <ul><li>All decision making logic lifted out of control plane </li></ul><ul><li>Eliminates duplicate logic in management plane </li></ul><ul><li>Dissemination plane provides robust communication to/from data plane switches </li></ul>Management Plane Control Plane Data Plane Decision Plane Dissemination Data Plane Configs FIBs, ACLs FIBs, ACLs
    29. 29. Fundamental Problem: Conflating Distributed Systems Issues with Networking Issues <ul><li>Distributed Systems Concern: resiliency to link failures </li></ul><ul><ul><li>Solution: multiple paths through routing process graph </li></ul></ul>D D left Routing Process D left Routing Process D left Routing Process D D D
    30. 30. <ul><li>Distributed Systems Concern: resiliency to link failures </li></ul><ul><ul><li>Solution: multiple paths through routing process graph </li></ul></ul>D right Routing Process D left Routing Process D left Routing Process D D D Fundamental Problem: Conflating Distributed Systems Issues with Networking Issues
    31. 31. Fundamental Problem: Conflating Distributed Systems Issues with Networking Issues <ul><li>Networking Concern: implement resource or security policy </li></ul><ul><ul><li>Solution: restrict flow of routing information, filter routes, summarize/aggregate routes </li></ul></ul>D D left Routing Process D left Routing Process D left Routing Process D D D
    32. 32. 4D Separates Distributed Computing Issues from Networking Issues <ul><li>Distributed computing issues → protocols and network architecture </li></ul><ul><ul><li>Overhead </li></ul></ul><ul><ul><li>Resiliency </li></ul></ul><ul><ul><li>Scalability </li></ul></ul><ul><li>Networking issues → management logic </li></ul><ul><ul><li>Traffic engineering and service provisioning </li></ul></ul><ul><ul><li>Egress point selection </li></ul></ul><ul><ul><li>Reachability control (VPNs) </li></ul></ul><ul><ul><li>Precomputation of backup paths </li></ul></ul>
    33. 33. 4D Can Leverage Network Structure <ul><li>Decision plane logic can be specialized for structure of each physical network </li></ul><ul><ul><li>Distributed protocols must be prepared for arbitrary topology graphs </li></ul></ul><ul><ul><li>4D enables network logic specialized differently for access and for backbone </li></ul></ul><ul><li>Advantages </li></ul><ul><ul><li>Faster route computations </li></ul></ul><ul><ul><li>Retain flexibility to evolve network as needed </li></ul></ul><ul><ul><li>Support transition to 100x100 architecture </li></ul></ul>
    34. 34. Fundamental Problem: Computing Configurations is Intractable <ul><li>Computing configuration files that cause control plane to compute desired forwarding states is intractable </li></ul><ul><ul><li>NP-hard in many cases </li></ul></ul><ul><ul><li>Requires predictive model of control plane behavior </li></ul></ul><ul><li>Configuration files form a program that defines a set of forwarding states </li></ul><ul><ul><li>Very hard to create program that permits only desired states, and doesn’t transit through bad ones </li></ul></ul>Forwarding states allowed by configs Auto-adaptation leads to/thru bad states Planned responses avoid bad states
    35. 35. Direct Control Provides Complete Control <ul><li>Zero device-specific configuration </li></ul><ul><li>Supports many models for “pushing” routes </li></ul><ul><ul><li>Trivial push – convergence requires time for all updates to be received and applied – same as today </li></ul></ul><ul><ul><li>Synchronized update – updates propagated, but not applied until an agreed time in the future – clock skew defines convergence time </li></ul></ul><ul><ul><li>Controlled state trajectory – DE serializes updates to avoid all incorrect transient states </li></ul></ul>
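The claim that clock skew defines convergence time under the synchronized-update model can be made concrete with a little arithmetic. In the sketch below, the offset representation and function names are assumptions: each router applies the update when its own clock reads the agreed time, so the window during which routers disagree equals the spread of their clock offsets.

```python
def apply_times(agreed_time, clock_offsets):
    """True time at which each router applies the update.

    clock_offsets: {router: local_clock - true_time} in seconds (hypothetical input).
    A router whose clock runs ahead by `off` applies the update `off` seconds early.
    """
    return {router: agreed_time - off for router, off in clock_offsets.items()}


def inconsistency_window(agreed_time, clock_offsets):
    """Length of the interval during which routers hold mixed old/new state."""
    times = apply_times(agreed_time, clock_offsets).values()
    return max(times) - min(times)
```

With offsets of 0.0, 0.5, and -0.25 seconds, the three routers switch at true times 100.0, 99.5, and 100.25 for an agreed time of 100.0, so the mixed-state window is 0.75 seconds regardless of how long the updates took to propagate beforehand.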
    36. 36. 4D and Today’s Networks <ul><li>4D architecture and principles apply to today’s networks as well as 100x100 </li></ul><ul><ul><li>Enterprise/campus/university networks </li></ul></ul><ul><ul><li>Data center networks </li></ul></ul><ul><ul><li>Access/backbone networks </li></ul></ul><ul><li>Greater expressivity in determining behavior </li></ul><ul><ul><li>Behavior of butterfly graph gadgets under failure </li></ul></ul><ul><ul><li>Selection of traffic egress points </li></ul></ul>
    37. 37. 4D Supports Network Evolution & Expansion <ul><li>Decision logic can be upgraded as needed </li></ul><ul><ul><li>No need for update of distributed protocols implemented in software distributed on every switch </li></ul></ul><ul><li>Decision Elements can be upgraded as needed </li></ul><ul><ul><li>Network expansion requires upgrades only to DEs, not every switch </li></ul></ul>
    38. 38. Three Key Questions <ul><li>Is there any transition path to deploy the 4D architecture? </li></ul><ul><li>Is the 4D architecture feasible? </li></ul><ul><li>Does the 4D architecture have more expressive power than today’s approaches to network control and management? </li></ul>
    39. 39. Deployment of the 4D Architecture <ul><li>Pre-existing industry trend towards separating router hardware from software </li></ul><ul><ul><li>IETF: FORCES, GSMP, GMPLS </li></ul></ul><ul><ul><li>SoftRouter [Lakshman, HotNets’04] </li></ul></ul><ul><li>Incremental deployment path exists </li></ul><ul><ul><li>Individual networks can upgrade to 4D and gain benefits </li></ul></ul><ul><ul><li>Small enterprise networks have most to gain </li></ul></ul><ul><ul><li>No changes to end-systems required </li></ul></ul>
    40. 40. Reachability Example <ul><li>Two locations, each with data center & front office </li></ul><ul><li>All routers exchange routes over all links </li></ul>R1 R2 R5 R4 R3 Chicago (chi) New York (nyc) Data Center Front Office
    41. 41. Reachability Example R1 R2 R5 R4 R3 Chicago (chi) New York (nyc) Data Center chi-DC chi-FO nyc-DC nyc-FO chi-DC chi-FO nyc-DC nyc-FO Front Office
    42. 42. Reachability Example R1 R2 R5 R4 R3 Data Center Packet filter: Drop nyc-FO -> * Permit * Packet filter: Drop chi-FO -> * Permit * Front Office chi nyc chi-DC chi-FO nyc-DC nyc-FO chi-DC chi-FO nyc-DC nyc-FO
    43. 43. Reachability Example <ul><li>A new short-cut link added between data centers </li></ul><ul><li>Intended for backup traffic between centers </li></ul>R1 R2 R5 R4 R3 Data Center Packet filter: Drop nyc-FO -> * Permit * Packet filter: Drop chi-FO -> * Permit * Front Office chi nyc
    44. 44. Reachability Example <ul><li>Oops – new link lets packets violate security policy ! </li></ul><ul><li>Routing changed, but </li></ul><ul><li>Packet filters don’t update automatically </li></ul>R1 R2 R5 R4 R3 Data Center Packet filter: Drop nyc-FO -> * Permit * Packet filter: Drop chi-FO -> * Permit * Front Office chi nyc
    45. 45. Prohibiting Packets from chi-FO to nyc-DC
    46. 46. Reachability Example <ul><li>Typical response – add more packet filters to plug the holes in security policy </li></ul>R1 R2 R5 R4 R3 Data Center Front Office chi nyc Packet filter: Drop nyc-FO -> * Permit * Packet filter: Drop chi-FO -> * Permit *
    47. 47. Reachability Example <ul><li>Packet filters have surprising consequences </li></ul><ul><li>Consider a link failure </li></ul><ul><li>chi-FO and nyc-FO still connected </li></ul>R1 R2 R5 R4 R3 Data Center Drop nyc-FO -> * Front Office chi nyc Drop chi-FO -> *
    48. 48. Reachability Example <ul><li>Network has less survivability than topology suggests </li></ul><ul><li>chi-FO and nyc-FO still connected </li></ul><ul><li>But packet filter means no data can flow! </li></ul><ul><li>Probing the network won’t predict this problem </li></ul>R1 R2 R5 R4 R3 Data Center Drop nyc-FO -> * Front Office chi nyc Drop chi-FO -> *
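The reachability example above can be replayed in a few lines. The slides do not give the exact wiring of R1 to R5, so the sketch collapses each site to one front-office node and one data-center node and places filters on directed links; that topology, the filter placement, and the function name are all assumptions for illustration.

```python
from collections import deque


def can_reach(links, filters, src, dst):
    """True if packets sourced at src can reach dst.

    links:   iterable of (a, b) undirected links
    filters: {(from_node, to_node): set of source names dropped on that hop}
    """
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    seen = {src}
    queue = deque([src])
    while queue:
        node = queue.popleft()
        if node == dst:
            return True
        for nbr in adj.get(node, ()):
            if nbr not in seen and src not in filters.get((node, nbr), ()):
                seen.add(nbr)
                queue.append(nbr)
    return False


# Assumed starting point: offices linked to each other and to their own data center.
base_links = [("chi-FO", "chi-DC"), ("nyc-FO", "nyc-DC"), ("chi-FO", "nyc-FO")]
base_filters = {
    ("chi-FO", "chi-DC"): {"nyc-FO"},  # Drop nyc-FO -> * on the way into chi-DC
    ("nyc-FO", "nyc-DC"): {"chi-FO"},  # Drop chi-FO -> * on the way into nyc-DC
}
# Step 2: the short-cut link between data centers is added; filters are unchanged.
shortcut_links = base_links + [("chi-DC", "nyc-DC")]
# Step 3: the "typical response" adds filters on the new link as well.
patched_filters = dict(base_filters)
patched_filters[("chi-DC", "nyc-DC")] = {"chi-FO"}
patched_filters[("nyc-DC", "chi-DC")] = {"nyc-FO"}
# Step 4: the office-to-office link fails.
failed_links = [link for link in shortcut_links if link != ("chi-FO", "nyc-FO")]
```

Calling `can_reach(shortcut_links, base_filters, "chi-FO", "nyc-DC")` exposes the policy hole, while `can_reach(failed_links, patched_filters, "chi-FO", "nyc-FO")` shows the lost survivability: the offices remain physically connected through the data centers, yet no data can flow.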
    49. 49. Allowing Packets from chi-FO to nyc-FO
    50. 52. Packet Filters Implement Policy <ul><li>Packet filters used extensively throughout networks </li></ul><ul><li>Protect routers from attack </li></ul><ul><li>Implement reachability matrix </li></ul><ul><ul><li>Define which hosts can communicate </li></ul></ul><ul><ul><li>Localize traffic, particularly multicast </li></ul></ul>
    51. 53. Mechanisms for Action at a Distance <ul><li>Policy often implemented by tagging routes on one router … </li></ul><ul><li>… And testing for tag at another router </li></ul>FIB Routing Process FIB Routing Process FIB Routing Process A:tag=12 A R1 R2 R3 A:tag=12 Tag?
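A toy model of the tag-then-test mechanism above: one router attaches a tag to a route, and a different router later decides what to do based on that tag. The route representation and function names are hypothetical; real routers express this through vendor-specific route maps and communities.

```python
def tag_route(routes, prefix, tag):
    """Return a copy of routes with `tag` attached to the route for `prefix` (as done at R1)."""
    return [dict(route, tag=tag) if route["prefix"] == prefix else dict(route)
            for route in routes]


def drop_tagged(routes, denied_tag):
    """Return only the routes that do not carry `denied_tag` (the test done at R3)."""
    return [route for route in routes if route.get("tag") != denied_tag]
```

The policy is split across two boxes: nothing at the filtering router says why tag 12 is rejected, which is the action-at-a-distance problem the slide points out.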
    52. 54. Multiple Interacting Routing Processes Client Server OSPF BGP OSPF FIB FIB OSPF FIB OSPF FIB OSPF FIB OSPF EBGP Policy1 Policy2 Internet
    53. 55. The Routing Instance Graph of an 881-Router Network
    54. 56. Reconvergence Time Under Single Link Failure
    55. 57. Reconvergence Time When Master DE Crashes
    56. 58. Reconvergence Time When Network Partitions
    57. 59. Reconvergence Time When Network Partitions
    58. 60. Systems of Systems <ul><li>Systems are designed as components to be used in larger systems in different contexts, for different purposes, interacting with different components </li></ul><ul><ul><li>Example: OSPF and BGP are complex systems in its own right, they are components in a routing system of a network, interacting with each other and packet filters, interacting with management tools … </li></ul></ul><ul><li>Complex configuration to enable flexibility </li></ul><ul><ul><li>The glue has tremendous impact on network performance </li></ul></ul><ul><ul><li>State of art: multiple interactive distributed programs written in assembly language </li></ul></ul><ul><li>Lack of intellectual framework to understand global behavior </li></ul>
    59. 61. Many Implementations Possible <ul><li>Multiple decision engines </li></ul><ul><li>Hot stand-by </li></ul><ul><li>Divide network & load share </li></ul><ul><li>Distributed decision engines </li></ul><ul><li>Up to one per router </li></ul><ul><li>Choice can be based on reliability requirements </li></ul><ul><li>Dissemination plane can be in-band, or leverage out-of-band (OOB) links </li></ul><ul><li>Less need for distributed solutions (harder to reason about) </li></ul><ul><li>More focus on network issues, less on distributed protocols </li></ul>[Figure: single redundant decision engine]
    60. 62. Direct Expression Enables New Algorithms <ul><li>OSPF normally calculates a single path to each destination D </li></ul><ul><li>OSPF allows load-balancing only for equal-cost paths to avoid loops </li></ul><ul><li>Using ECMP requires careful engineering of link weights </li></ul>[Figure: paths toward destination D] <ul><li>Decision Plane with network-wide view can compute multiple paths </li></ul><ul><li>“Backup paths” installed for free! </li></ul><ul><li>Bounded stretch, bounded fan-in </li></ul>
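One way to picture what a network-wide view makes possible is a primary path plus a link-disjoint backup over unequal-cost routes. The sketch below is an assumption-laden illustration, not the algorithm from the deck: it uses a naive "remove the primary's links and rerun Dijkstra" heuristic, does not enforce the bounded stretch or bounded fan-in the slide mentions, and all names (`make_graph`, `shortest_path`, `primary_and_backup`) and the four-node topology are hypothetical.

```python
import heapq


def make_graph(links):
    """Build a symmetric weighted adjacency map from (a, b, weight) triples."""
    graph = {}
    for a, b, w in links:
        graph.setdefault(a, {})[b] = w
        graph.setdefault(b, {})[a] = w
    return graph


def shortest_path(graph, src, dst, banned=frozenset()):
    """Dijkstra's algorithm, skipping any link listed in `banned`."""
    dist = {src: 0}
    prev = {}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == dst:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            if frozenset((u, v)) in banned:
                continue
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return None
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]


def primary_and_backup(graph, src, dst):
    """Return (primary, backup); backup avoids every link on the primary."""
    primary = shortest_path(graph, src, dst)
    if primary is None:
        return None, None
    used = frozenset(frozenset(pair) for pair in zip(primary, primary[1:]))
    return primary, shortest_path(graph, src, dst, banned=used)


# Assumed topology: two routes from S to D with unequal costs (2 vs. 4).
GRAPH = make_graph([("S", "A", 1), ("A", "D", 1), ("S", "B", 2), ("B", "D", 2)])
primary, backup = primary_and_backup(GRAPH, "S", "D")
```

Because the two routes have different costs, equal-cost multipath would use only the cheaper one; logic with the full topology in hand can install the second as a backup without tuning link weights. A production design would likely prefer a proper disjoint-paths algorithm, since this heuristic can miss a backup that exists.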
    61. 63. Slides under Development
    62. 64. Supporting Network Evolution <ul><li>Logic for controlling the network needs to change over time </li></ul><ul><ul><li>Traffic engineering rules </li></ul></ul><ul><ul><li>Interactions with other networks </li></ul></ul><ul><ul><li>Service characteristics </li></ul></ul><ul><li>Upgrades to field-deployed network equipment must be avoided </li></ul><ul><ul><li>Very high cost </li></ul></ul><ul><ul><li>Software upgrades often require hardware upgrades (more CPU or memory) </li></ul></ul>
    63. 65. Supporting Network Evolution Today <ul><li>Today’s “Solution” </li></ul><ul><ul><li>Vendors stuff their routers with software implementing all possible “features” </li></ul></ul><ul><ul><ul><li>Multiple routing protocols </li></ul></ul></ul><ul><ul><ul><li>Multiple signaling protocols (RSVP, CR-LDP) </li></ul></ul></ul><ul><ul><ul><li>Each feature controlled by parameters set at configuration time to achieve late binding </li></ul></ul></ul><ul><ul><li>Feature-creep creates configuration nightmare </li></ul></ul><ul><ul><ul><li>Tremendous complexity for syntax & semantics </li></ul></ul></ul><ul><ul><ul><li>Mis-interactions between features are common </li></ul></ul></ul><ul><li>Our Goal: Separate decision making logic from the field-deployed devices </li></ul>
    64. 66. Supporting Network Expansion <ul><li>Networks are constantly growing </li></ul><ul><ul><li>New routers/switches/links added </li></ul></ul><ul><ul><li>Old equipment rarely removed </li></ul></ul><ul><li>Adding a new switch can cause old equipment to become overloaded </li></ul><ul><ul><li>CPU/Memory demands on each device should not scale up with network size </li></ul></ul>
    65. 67. Supporting Network Expansion Today <ul><li>Routers run a link-state routing protocol </li></ul><ul><ul><li>Size of link-state database scales with # of routers </li></ul></ul><ul><ul><li>Expanding network can exceed memory limits of old routers </li></ul></ul><ul><li>Today’s “Solution” </li></ul><ul><ul><li>Monitor resources on all routers </li></ul></ul><ul><ul><li>Predict approach of exhaustion and then: </li></ul></ul><ul><ul><ul><li>Global upgrade </li></ul></ul></ul><ul><ul><ul><li>Rearchitecture of routing design to add summarization, route aggregation, information hiding </li></ul></ul></ul><ul><li>Our Goal: make demands scale with hardware (e.g., # of interfaces) </li></ul>
    66. 68. Supporting Remote Devices <ul><li>Maintaining communication with all network devices is critical for network management </li></ul><ul><ul><li>Diagnosis of problems </li></ul></ul><ul><ul><li>Monitoring status and network health </li></ul></ul><ul><ul><li>Updating configuration or software </li></ul></ul><ul><li>“the chicken or the egg….” </li></ul><ul><ul><li>Cannot send device configuration/management information until it can communicate </li></ul></ul><ul><ul><li>Device cannot communicate until it is correctly configured </li></ul></ul>
    67. 69. Supporting Remote Devices Today <ul><li>Today’s “Solution” </li></ul><ul><ul><li>Use PSTN as management network of last resort </li></ul></ul><ul><ul><li>Connect console of remote routers to phone modem </li></ul></ul><ul><ul><li>Can’t be used for customer premises equipment (CPE): DSL/cable modems, integrated access devices (IADs) </li></ul></ul><ul><ul><li>In a converged network, PSTN is decommissioned </li></ul></ul><ul><li>Our Goal: Preserve management communication to any device that is not physically partitioned, regardless of configuration state </li></ul>
    68. 70. Network Control and Management Today <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding, filtering, queueing </li></ul><ul><li>Based on FIB or labels </li></ul><ul><li>State everywhere! </li></ul><ul><li>Dynamic state in FIBs </li></ul><ul><li>Configured state in settings, policies, packet filters </li></ul><ul><li>Programmed state in magic constants, timers </li></ul><ul><li>Many dependencies between bits of state </li></ul><ul><li>State updated in uncoordinated, decentralized way! </li></ul><ul><li>Management Plane </li></ul><ul><li>Figure out what is happening in network </li></ul><ul><li>Decide how to change it </li></ul><ul><li>Control Plane </li></ul><ul><li>Multiple routing processes on each router </li></ul><ul><li>Each router with different configuration program </li></ul><ul><li>Huge number of control knobs: metrics, ACLs, policy </li></ul>[Figure labels: Packet filters; FIBs; Routing policies; OSPF; BGP; Link metrics; Configs; Shell scripts; Traffic Eng; Databases; Planning tools; SNMP; netflow; modems]
    69. 71. Network Control and Management Today <ul><li>Data Plane </li></ul><ul><li>Distributed routers </li></ul><ul><li>Forwarding, filtering, queueing </li></ul><ul><li>Based on FIB or labels </li></ul><ul><li>State everywhere! </li></ul><ul><li>Dynamic state in FIBs </li></ul><ul><li>Configured state in settings, policies, packet filters </li></ul><ul><li>Programmed state in magic constants, timers </li></ul><ul><li>Many dependencies between bits of state </li></ul><ul><li>State updated in uncoordinated, decentralized way! </li></ul><ul><li>Logic everywhere! </li></ul><ul><li>Path Computation built into routing protocols </li></ul><ul><li>Routing Policy distributed across the routers </li></ul><ul><li>Packet Filters placed by tools in Management Plane </li></ul><ul><li>No way to arbitrate inconsistencies between logic! </li></ul><ul><li>Management Plane </li></ul><ul><li>Figure out what is happening in network </li></ul><ul><li>Decide how to change it </li></ul><ul><li>Control Plane </li></ul><ul><li>Multiple routing processes on each router </li></ul><ul><li>Each router with different configuration program </li></ul><ul><li>Huge number of control knobs: metrics, ACLs, policy </li></ul>[Figure labels: Packet filters; FIBs; Routing policies; OSPF; BGP; Link metrics; Configs; Shell scripts; Traffic Eng; Databases; Planning tools; SNMP; netflow; modems]
    70. 72. A Study of Operational Production Networks <ul><li>How complicated/simple are real control planes? </li></ul><ul><li>What is the structure of the distributed system? </li></ul><ul><li>Use reverse-engineering methodology </li></ul><ul><li>There are few or no documents </li></ul><ul><li>The ones that exist are out-of-date </li></ul><ul><li>Anonymized configuration files for 31 active networks (>8,000 configuration files) </li></ul><ul><li>6 Tier-1 and Tier-2 Internet backbone networks </li></ul><ul><li>25 enterprise networks </li></ul><ul><li>Sizes between 10 and 1,200 routers </li></ul><ul><li>4 enterprise networks significantly larger than the backbone networks </li></ul>
    71. 73. Learning from Ethernet Evolution Experience. Early Implementations (Ethernet or 802.3): <ul><li>Bus-based Local Area Network </li></ul><ul><li>Collision Domain, CSMA/CD </li></ul><ul><li>Bridges and Repeaters (B/R) for distance/capacity extension </li></ul><ul><li>1-10 Mbps: coax, twisted pair (10BaseT) </li></ul>Current Implementations (everything changed except name and framing): <ul><li>Switched solution </li></ul><ul><li>Little use for collision domains </li></ul><ul><li>Servers, routers at 10x station speed </li></ul><ul><li>10/100/1000 Mbps, 10 Gbps coming: copper, fiber </li></ul>[Figure labels: Ethernet Conc.; Router; Server; HUB; Switch; LAN; WAN]
    72. 74. Ethernet: Re-inventing the Wheel <ul><li>Becoming as service-rich and complex as IP </li></ul><ul><ul><li>Traffic engineering </li></ul></ul><ul><ul><li>Reachability control and traffic isolation (VLANs) </li></ul></ul><ul><ul><li>QoS (802.1q) </li></ul></ul><ul><li>Ethernet networks rediscovering the problems and solutions faced by IP networks </li></ul><ul><li>Is there commonality to exploit? </li></ul><ul><ul><li>Switch/routers are all fundamentally table-driven </li></ul></ul><ul><ul><li>Destination addr, MPLS labels, VLANs, Circuit IDs </li></ul></ul>
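The "fundamentally table-driven" observation can be made concrete with one generic element that differs only in which header field it uses as the lookup key. This is a toy sketch under stated assumptions: the class name `TableDrivenElement`, the packet-as-dict representation, the field names, and the port names are all hypothetical, and real devices use longest-prefix or ternary matching rather than the exact-match lookup shown here.

```python
from typing import Callable, Dict, Hashable


class TableDrivenElement:
    """A forwarding element defined by a key extractor and a lookup table."""

    def __init__(self, key_of: Callable[[dict], Hashable],
                 table: Dict[Hashable, str]):
        self.key_of = key_of
        self.table = dict(table)

    def forward(self, packet: dict) -> str:
        """Return the output port for the packet, or "drop" on a table miss."""
        return self.table.get(self.key_of(packet), "drop")


# Same mechanism, three different keys (all table contents are made up).
ip_router = TableDrivenElement(lambda p: p["dst_ip"], {"10.0.0.1": "port1"})
mpls_switch = TableDrivenElement(lambda p: p["label"], {17: "port2"})
vlan_switch = TableDrivenElement(
    lambda p: (p["vlan"], p["dst_mac"]),
    {(100, "aa:bb:cc:dd:ee:ff"): "port3"},
)

out_ip = ip_router.forward({"dst_ip": "10.0.0.1"})
out_mpls = mpls_switch.forward({"label": 17})
out_vlan = vlan_switch.forward({"vlan": 100, "dst_mac": "aa:bb:cc:dd:ee:ff"})
out_miss = mpls_switch.forward({"label": 99})
```

If every device reduces to "extract key, look up, act", then logic that writes the tables directly does not need to care whether the key is a destination address, a label, a VLAN, or a circuit ID.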
    73. 75. Control/Management Needs of 100x100 Network Architecture <ul><li>Control/Management creates logical network from physical network </li></ul><ul><ul><li>Supports architecture and end-to-end view of 100x100 network </li></ul></ul><ul><li>Access Network </li></ul><ul><ul><li>Logical level: aggregation tree between CPE and Regional Node </li></ul></ul><ul><ul><li>Physical level: network with redundant links and multiple Regional Nodes </li></ul></ul><ul><li>Backbone Network </li></ul><ul><ul><li>Logical level: full mesh of links among Regional Nodes </li></ul></ul><ul><ul><li>Physical level: sparse graph of fiber routes constrained by geography </li></ul></ul>