Cardigan: SDN Distributed Routing Fabric Going Live at an Internet Exchange
Jonathan Stringer
Dean Pemberton
Qiang Fu
Christopher Lorier
Richard Nelson
Josh Bailey
Carlos N. A. Corrêa
Christian Esteve Rothenberg
Raphael Vicente Rosa
ISCC 2014
Cardigan: Executive Summary
Build experience with SDN
– Directly address operational comfort
– Practical migration path (non-SDN networks)
Design and implementation of an SDN-based distributed routing fabric
– Routing-as-a-Service to an SDN-based IP network
Rethink IXP architecture
– Unleash innovation in inter-domain routing
– Tackle hard security and economics issues
Pilot deployment:
– Viable migration path to SDN
– Identification of incompatibilities and issues in production
RouteFlow Primer
Fig. 1 - RouteFlow block diagram
Cardigan 101
• An IXP is just somewhere you:
– Advertise packets you accept
– Send packets that you want to get rid of
Fig. 2 – Cardigan initial 2-switch deployment
Limitations and Easy Fixes
• Extensible message formats
– RFProtocol flexibility
– Set of matches, actions, options...
– IPv6, MPLS on Ethernet
• Inefficient gateway resolution
– RFClient: Reception of its Netlink announcement and discovery of the
associated gateway MAC address
– Cache IPv4 and IPv6 routes until ARP resolutions
• Scalable router abstraction
– Router abstraction dependent on the physical topology
– No arbitrary paths inside the ISP network
– Traffic classification using VLAN tags per static inter-switch links (ISLs)
configuration
– Fine-grained traffic control by introducing MPLS paths
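
The gateway-resolution fix (cache routes until the gateway MAC address is known) can be illustrated with a minimal sketch. This is not RFClient code: the class name `PendingRouteCache`, the `install` callback, and the method names are hypothetical, and the sketch only assumes that a route can be installed once its gateway IP has been resolved to a MAC address.

```python
from collections import defaultdict


class PendingRouteCache:
    """Holds routes whose gateway MAC address is not yet known (hypothetical helper)."""

    def __init__(self, install):
        # install: callback taking (prefix, gateway_ip, gateway_mac)
        self._install = install
        self._mac_by_gateway = {}            # resolved gateway IP -> MAC
        self._pending = defaultdict(list)    # unresolved gateway IP -> prefixes

    def add_route(self, prefix, gateway_ip):
        """Install the route now if the gateway is resolved, else cache it."""
        mac = self._mac_by_gateway.get(gateway_ip)
        if mac is None:
            self._pending[gateway_ip].append(prefix)
        else:
            self._install(prefix, gateway_ip, mac)

    def gateway_resolved(self, gateway_ip, mac):
        """Record the resolution and flush every route waiting on this gateway."""
        self._mac_by_gateway[gateway_ip] = mac
        for prefix in self._pending.pop(gateway_ip, []):
            self._install(prefix, gateway_ip, mac)

    def pending_count(self):
        return sum(len(prefixes) for prefixes in self._pending.values())


installed = []
cache = PendingRouteCache(lambda p, g, m: installed.append((p, g, m)))
cache.add_route("198.51.100.0/24", "192.0.2.1")    # cached, gateway unknown
cache.gateway_resolved("192.0.2.1", "00:11:22:33:44:55")
cache.add_route("203.0.113.0/24", "192.0.2.1")     # installed immediately
```

Keeping resolution state separate from the route list means a burst of routes through one gateway triggers a single resolution rather than one per route.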
Cardigan 2.0
• MPLS Label Switched Path (LSP)
– A prefix to a path (set of paths)
– Ingress node through a set of transit nodes
– Operator freely defines packet circuits over the
network (arbitrary level of detail)
– Forwarding Path Manager (FPM) component on the
RouteFlow architecture
• Feeds all calculated routes (even if initially discarded)
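
As a rough illustration of binding a prefix to a label switched path, the sketch below expands an operator-defined node sequence into per-node label operations: push at the ingress, swap at each transit node, pop at the egress. It is not the FPM implementation; the names `Hop` and `build_lsp`, the switch names, and the sequential label allocation are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Hop:
    node: str
    action: str              # "push", "swap", or "pop"
    label: Optional[int]     # outgoing label; None when the label is popped


def build_lsp(path: List[str], first_label: int) -> List[Hop]:
    """Return the label operation each node on `path` performs.

    Labels are allocated sequentially here purely for illustration.
    """
    if len(path) < 2:
        raise ValueError("an LSP needs at least an ingress and an egress node")
    hops = [Hop(path[0], "push", first_label)]      # ingress pushes the first label
    label = first_label
    for node in path[1:-1]:                         # transit nodes swap labels
        label += 1
        hops.append(Hop(node, "swap", label))
    hops.append(Hop(path[-1], "pop", None))         # egress pops the label
    return hops


# A prefix is bound to a path (here a single LSP) chosen by the operator.
lsp_by_prefix = {
    "203.0.113.0/24": build_lsp(["sw1", "sw2", "sw3"], first_label=100),
}
```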
Cardigan 2.0
Fig. 3 – Cardigan with MPLS design
Deployment
• Pronto Switches (PicOs) - 1G SFPs
• Out-of-band VM controller by layer 2 VLANs
• Traffic forwarded directly by OpenFlow switches
• In production for 9 months
• 90 organizations - forwarding customer traffic and sharing routes
• 1134 flows on each switch (1028 layer 3 routes)
Discussion Items (1/2)
• Protocol compliance
– OF 1.0 TTL decrement
• MAC addressing
– Scalability of flow tables (1 route = n-1 flow rules)
• OF agent implementation
– Vendor switches memory leaks and flow counters
• Encapsulation Hazards
– MTU size for Ethernet, VLAN, MPLS, etc
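
The encapsulation hazard comes down to simple arithmetic, sketched below using the standard header sizes: a 14-byte Ethernet header, a 4-byte frame check sequence, 4 bytes per 802.1Q VLAN tag, and 4 bytes per MPLS label stack entry. The function names are hypothetical, and the 1518-byte frame limit in the usage lines is the classic untagged Ethernet maximum, not a figure from the deployment.

```python
ETH_HEADER = 14   # destination MAC + source MAC + EtherType
ETH_FCS = 4       # frame check sequence
VLAN_TAG = 4      # one 802.1Q tag
MPLS_ENTRY = 4    # one MPLS label stack entry


def frame_size(ip_packet_len, vlan_tags=0, mpls_labels=0):
    """Bytes on the wire (excluding preamble) for an IP packet with the given encapsulation."""
    return (ETH_HEADER
            + vlan_tags * VLAN_TAG
            + mpls_labels * MPLS_ENTRY
            + ip_packet_len
            + ETH_FCS)


def max_ip_packet(max_frame, vlan_tags=0, mpls_labels=0):
    """Largest IP packet that still fits in a frame of `max_frame` bytes."""
    return max_frame - frame_size(0, vlan_tags, mpls_labels)


print(frame_size(1500))                                   # 1518
print(frame_size(1500, vlan_tags=1, mpls_labels=1))       # 1526
print(max_ip_packet(1518, vlan_tags=1, mpls_labels=1))    # 1492
```

A device limited to 1518-byte frames would therefore drop or need to fragment full-size 1500-byte IP packets once a VLAN tag and an MPLS label are added.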
Discussion Items (2/2)
• Gateway Address Resolution - Increased performance
– Separation of gateway resolution and route processing
• Scalability
– Distribution of the FIB across multiple devices, different
data plane technologies (NPU/FPGA)
• Resilience
– Highly available non-stop forwarding solution and
systematic SDN troubleshooting
• Policy enforcement at IXPs – tedious tasks
– Manual time-of-day routing, dynamic traffic
engineering, route preferences, etc.
Related Work
• Many related efforts: RCP, SoftRouter...
• IXP: an interesting networking landscape
• SDX: A Software Defined Internet Exchange
– Arpit Gupta (Georgia Institute of Technology), Laurent Vanbever
(Princeton University), Muhammad Shahbaz (Georgia Institute
of Technology), Sean P. Donovan (Georgia Institute of
Technology), Brandon Schlinker (University of Southern
California), Nick Feamster (Georgia Institute of Technology),
Jennifer Rexford (Princeton University), Scott Shenker (UC
Berkeley), Russ Clark (Georgia Institute of Technology), Ethan
Katz-Bassett (University of Southern California)
– ACM SIGCOMM, Chicago, IL. August 2014.
Future Work
• Performance, scalability, resilience, security, etc.
• Rethinking peering between SDN domains
Conclusion
• SDN-based distributed router in a live IXP
• Reduces operational complexity
• Hybrid SDN-IP network side-by-side
• New approach to the router abstraction model
• Scale to large networks
• Implementation of policies
– Load balancing, closest exit usage, complex setups
