Advertisement
Advertisement

More Related Content

Advertisement
Advertisement

SDN Traffic Engineering, A Natural Evolution

  1. SDN Traffic Engineering A Natural Evolution Cengiz Alaettinoglu
  2. What is Traffic Engineering (TE)? Minimizes the worst link utilization • Alleviates traffic congestion • Better/longer use of capital expenditure Routing traffic around congested links • Can use shortest as well as non-shortest paths • IGP metric tuning, RSVP-TE, segment routing Confidential. Copyright © 2014 Packet Design 2
  3. Evolution of Traffic Engineering Offline traffic engineering • Optimal but not adaptive On device traffic engineering • Adaptive but not optimal Software defined networking • Best of both worlds, yet simpler; simplicity enabled by • Segment routing • Push based telemetry • SDN Traffic Engineering App Confidential. Copyright © 2014 Packet Design 3
  4. Offline Traffic Engineering Topology model Traffic demand matrix Optimization algorithm computes routes so that the worst link utilization is minimized • Linear-programming Confidential. Copyright © 2014 Packet Design 4
  5. Pros and Cons of Offline Traffic Engineering Confidential. Copyright © 2014 Packet Design 5 Very good link utilization values Network model is hard to keep accurate Traffic demand matrix is hard to compute Optimization algorithm is very slow • Hours to days Can not adopt to failures Too many tunnels (N^2) Some paths may be surprisingly long
  6. On Device Traffic Engineering All routers redistribute available bandwidth of its links Each router • Sets up 1-N tunnels to other routers • Monitors the utilization of these tunnels (auto-bandwidth) • Triggers a re-optimization when utilization changes • Uses CSPF (constraint based shortest path first) to compute the paths • Signals the path and reserves bandwidth using RSVP Confidential. Copyright © 2014 Packet Design 6
  7. Pros and Cons of On-Device Traffic Engineering Confidential. Copyright © 2014 Packet Design 7 Very good link utilization values • Each router is selfish in optimization • No network wide optimization Network model is hard to keep accurate Traffic demand matrix is hard to compute Optimization algorithm is very slow • Hours to days Can not adopt to failures Too many tunnels (N^2) Some paths may be surprisingly long Impact on IGP, particularly convergence times Race conditions after failures • Long-lived FRR RSVP-TE high-state and high-refresh overhead
  8. Example Deployments Small Medium Large Routers 75 450 1900 Links 300 2000 8000 Tunnels 1600 20500 132000 Confidential. Copyright © 2014 Packet Design 8 • Majority of the tunnels have very small amount of traffic • No TE needed
  9. Link / Tunnel Distribution Confidential. Copyright © 2014 Packet Design 9 Most links carry small number of tunnels Small number of links carry a lot of tunnels
  10. Lets See What Happens When a Link Fails NNOV-CR1-B → M10-CR1-B fails at 2015-12-27 12:04:05.200490 • Carries 327 tunnels • 22 head-end routers Head routers get a signal via RSVP-TE and re-optimize • Race to available bandwidth • Each optimizes for itself • They donot know what the other 21 routers need Confidential. Copyright © 2014 Packet Design 10
  11. 5 Routers for 9 tunnels Fail to Find a Path Time Head-end Operation Tunnel Next State 2015-12-27 12:06:19.831701 VLGD-RGR3 Change Tunnel State RT-VLGD_RGR3-to-M10_CR2A Oper Status: Down 2015-12-27 12:06:19.831701 VLGD-RGR3 Change Tunnel State nonRT-VLGD_RGR3-to-M10_CR2A Oper Status: Down 2015-12-27 12:07:48.764798 VLGD-RGR2 Change Tunnel State RT-VLGD_RGR2-to-M10_CR1A Oper Status: Down 2015-12-27 12:07:48.764798 VLGD-RGR2 Change Tunnel State nonRT-VLGD_RGR2-to-M10_CR1A Oper Status: Down 2015-12-27 12:11:23.214201 NNOV-RGR1 Change Tunnel State xyz:12389:3456:mvpn:v3456-VGTRK Oper Status: Down 2015-12-27 12:23:55.980877 ARKH-RGR1 Change Tunnel State RT-ARKH_RGR1-to-M10_CR1A Oper Status: Down 2015-12-27 12:23:55.980877 ARKH-RGR1 Change Tunnel State nonRT-ARKH_RGR1-to-M10_CR1A Oper Status: Down 2015-12-27 12:24:17.097178 SKTV-RGR2 Change Tunnel State RT-SKTV_RGR2-to-M10_CR1A Oper Status: Down 2015-12-27 12:24:17.097178 SKTV-RGR2 Change Tunnel State nonRT-SKTV_RGR2-to-M10_CR1A Oper Status: Down Confidential. Copyright © 2014 Packet Design 11 This can be avoided with a centralized optimization solution!
  12. An Example Failed Tunnel This tunnel only requests 34Mbps • Its shortest IGP path does not have this bandwidth Confidential. Copyright © 2014 Packet Design 12
  13. Stuck FRR Tunnels What happens when a tunnel fails to optimize and it is FRR protected? • FRR is stuck • Usually no reservations are made on these paths Confidential. Copyright © 2014 Packet Design 13
  14. AN SDN APPROACH Confidential. Copyright © 2014 Packet Design 14
  15. What do We Really Want Real-time model • Alleviate congestion especially after a link failure Create as few tunnels as necessary • Very small signaling overhead • Very small IGP overhead • No dynamics due to utilization changes Network wide centralized optimization Confidential. Copyright © 2014 Packet Design 15
  16. SDN Promises a Solution Segment routing (SR) replaces RSVP • Provides uncompromised functionality • Simple control plane Push based telemetry for traffic matrices • YANG model based SDN Controller is part of the network control plane • Has real-time topology • Enables manipulating paths on the devices using a standard south bound protocol SDN Traffic Engineering App optimizes paths network wide • As few or as many tunnels as policy allows Confidential. Copyright © 2014 Packet Design 16
  17. Segment Routing Segment routing simplifies IP/MPLS control plane • No need to run LDP or RSVP-TE Functionality is not compromised • Can forward traffic on none-shortest paths for traffic engineering • Detour, bypass FRR (fast re-route) and IP LFA protection • Secondary paths • SLA conforming service specific paths (e.g. L2/L3 VPNs) • SDN programmability Confidential. Copyright © 2014 Packet Design 17
  18. TE Needs Shortest and Non-shortest Paths SR can Encode Any Path Confidential. Copyright © 2014 Packet Design 18 A B Z DC V W YX 1 Segment (shortest IGP path) • Go to Z on shortest path (node segment) A B Z DC V W YX 5 Segments • Go to B on shortest path • Go to W on shortest path • Go to Y on shortest path • Go to D on shortest path • Go to Z on shortest path A B Z DC V W YX 3 Segments • Go to C on shortest path • Go to X on link 3 (adjacency segment) • Go to Z on shortest path
  19. Push Based Telemetry We Still Need Traffic Demand Matrices How much customer traffic enters the network in Vietnam and Destined to Tokyo Demand does not change based on IGP routing (less ignore BGP for now) Traditionally netflow is used for this and can still be used Push and model based telemetry is has very promising features, including much more real-time view Confidential. Copyright © 2014 Packet Design 19
  20. YANG Model Pushed by Ingress Routers Similar content to netflow for traffic matrix generation • No port/proto level detail OpenConfig standardizes the model • Vendor extensions Pushed from the routers • Few seconds to minutes Efficient transfer of data • Binary encoded using Google ProtoBuf Confidential. Copyright © 2014 Packet Design 20 +--ro traffic-collector +--ro afs +--ro af* [af-name] +--ro counters +--ro prefixes +--ro prefix* +--ro ipaddr? string +--ro mask? string +--ro label? Tc-oper-local-label +--ro base-counter-statistics | +--ro transmit-packets-per-second-switched? uint64 | +--ro transmit-bytes-per-second-switched? uint64 | +--ro count-history* | +--ro event-start-timestamp? uint64 | +--ro event-end-timestamp? uint64 | +--ro transmit-number-of-packets-switched? uint64 | +--ro transmit-number-of-bytes-switched? uint64 | +--ro is-valid? boolean +--ro traffic-matrix-counter-statistics | +--ro transmit-packets-per-second-switched? uint64 | +--ro transmit-bytes-per-second-switched? uint64 | +--ro count-history* | +--ro event-start-timestamp? uint64 | +--ro event-end-timestamp? uint64 | +--ro transmit-number-of-packets-switched? uint64 | +--ro transmit-number-of-bytes-switched? uint64 | +--ro is-valid? boolean +--ro prefix? string
  21. Traffic Engineering as an SDN App SDN App manages traffic demand • Current as well as future • IGP does not have to signal available bandwidth • Push based telemetry or netflow based matrices can be generated SDN App computes paths and allocates bandwidth • Centralization yields better resource optimization • Only creates as many tunnels as necessary SDN App adapts to failures and repairs • No race conditions after failures and repairs Segment routing can be used as a simplification • SDN controller makes this an abstraction for the APP 21
  22. Reducing the N^2 Tunnels Only generated tunnels for traffic going over congested links • Tunnels no longer needs to be configured apriori at the routers • Only create them if they will have a positive impact • Special case: • Under normal conditions don’t generate any tunnels • Under failure conditions generate as necessary Do not create tunnels when IGP path satisfies the constrains These are easy to implement in software but hard to do in each device without a global picture Confidential. Copyright © 2014 Packet Design 22
  23. Illustration 1 Gbps links Two elephant flows • 850Mbps west to east • 500Mbps north to south Lots of mice flows Confidential. Copyright © 2014 Packet Design 23
  24. Concluding Remarks SDN simplifies running a traffic engineered network Magic is in the SDN App on the top This App needs enablers from the infrastructure • SDN controllers • Segment routing • Push based telemetry • NetConf/YANG, PCEP and other south bound protocols Confidential. Copyright © 2014 Packet Design 24
Advertisement