Applying Hyper-scale Design Patterns to Routing

Hannes Gredler's CiscoLive! 2016 presentation on how to design a 3rd-generation routing system


  1. Applying Hyper-scale Design Patterns to Routing
     Hannes Gredler, CTO RtBrick Inc.
     DEVNET-2064
  2. Who am I?
     • CTO at RtBrick, Inc.
     • Past stint: Distinguished Engineer with the “other router-vendor”
     • 18 years working experience, developing, deploying and supporting routing software
     • Expertise
       • BGP, IS-IS, MPLS
     • 20+ Patents
     • 20+ Proposed Standards http://www.arkko.com/tools/allstats/hannesgredler.html
     • IETF WG co-chair (IS-IS)
     © 2016 Cisco and/or its affiliates. All rights reserved. Cisco Public
  3. >2013 exposure to Data Center Networks & SR
     New large-scale data-center network model emerging
     • [draft-ietf-rtgwg-bgp-routing-large-dc]
     • End-to-End Layer-3 routing
       • Fixes issues with L2 switching data plane
     • Hierarchical Topology
       • Clos-based
       • Max 5 stages
       • Use of aggregation at ToRs
  4. Got a couple of inconvenient insights …
     • Networks have become Anti-Moore
     • Direct sourcing from OEM manufacturers in Taiwan
     • Hardware is a Commodity
     • Cost per Bit dropping sharply (USD 400 / 100GbE port)
     • Boutique ASICs viable in 5 years from now?
     • Curated Software Release models approaching EOL
     • Modularization or Custom package selection desired (no PIM, no RSVP, etc.)
     • Pay per use
     • Different model (node vs. system) for Resiliency
     • Open sourcing of components the new normal
     • Integration of components becomes core competency
  5. 1) “The network is the computer”
     John Gage, Sun Microsystems
  6. 2) “Is it possible to construct a router based on the web 2.0 mindset?”
     Hannes Gredler, 2015
  7. Agenda
     • Introduction
     • Multi-Level Architecture
     • Micro-services & APIs
     • Commoditization & Unit Economics
     • Resiliency, system coupling and state recovery
     • Open Source Development & Test
     • Conclusion
  8. Multi-Level Architecture
  9. Hyper-scale Multi-level Architecture
  10. Forwarding node
      • Translates RIB Objects to local OS representation
        • Tables
        • Routes
        • Nexthops
      • Hardware Prefix Caching
      • Aggregate FIB table (filter specifics)
      • Localize fwd table
      • VPNs
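One way to read the "Aggregate FIB table (filter specifics)" bullet is sketched below in Python. The assumption is that a more-specific route is redundant, and can be left out of the forwarding table, when the closest covering prefix that stays installed points at the same next hop. The function names and the dict-based route table are hypothetical; the slides do not describe the algorithm RtBrick actually uses.

```python
import ipaddress


def closest_cover(net, table):
    """Return the longest prefix in `table` that strictly covers `net`, or None."""
    for length in range(net.prefixlen - 1, -1, -1):
        candidate = net.supernet(new_prefix=length)
        if candidate in table:
            return candidate
    return None


def filter_specifics(routes):
    """Drop more-specific routes whose closest installed cover has the same next hop.

    `routes` maps a prefix string to a next-hop label (hypothetical layout).
    """
    kept = {}
    # Visit shorter prefixes first, so every covering route has been decided
    # before the routes it covers are examined.
    ordered = sorted(
        ((ipaddress.ip_network(prefix), nexthop) for prefix, nexthop in routes.items()),
        key=lambda item: (item[0].version, item[0].prefixlen),
    )
    for net, nexthop in ordered:
        cover = closest_cover(net, kept)
        if cover is not None and kept[cover] == nexthop:
            continue  # forwarding is unchanged without this specific
        kept[net] = nexthop
    return {str(net): nexthop for net, nexthop in kept.items()}


rib = {
    "10.0.0.0/8": "A",
    "10.1.0.0/16": "A",   # same next hop as the /8: redundant
    "10.2.0.0/16": "B",
    "10.2.1.0/24": "A",   # differs from its closest cover (the /16 via B): kept
    "10.2.2.0/24": "B",   # same next hop as the /16: redundant
}
fib = filter_specifics(rib)
```

Comparing against the closest cover that is still installed, rather than any cover, is what keeps `10.2.1.0/24` in the table even though a shorter prefix with the same next hop exists further up.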
  11. Protocol I/O node
      • Schema driven protocol serializer/de-serializer
      • Keepalive delegation/absorption
      • Terminal communication point for sockets, stdio & file I/O
      • Pre-processing protocol stream (filter BGP PA128)
      • Queuing machinery & routing protocol update generation
      • Interface state handling
  12. Application (route computation) node
      [Figure labels: APPd; input_proc: add: bgp_filter_input, chg: bgp_filter_input, del: -; IB: pre-filter Nbr; Neighbor 193.203.0.40 pre-filter Ipv4 RIB; Data Structure; Schema]
      • Schema driven Data Structure Server
      • Stores Application Objects
        • Routes, Nexthops, Tables
      • Triggered execution (Add, Chg, Del) of internal/external Application code
        • Python functions
        • C/C++ library calls
        • Executables vfork()
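The "triggered execution (Add, Chg, Del)" idea can be sketched as a table that calls registered Python functions whenever an object is added, changed or deleted. This is only an illustration under assumptions: the `Table` class, its methods and the callback signature are hypothetical, and only the event names and the `bgp_filter_input` hook name are taken from the slide.

```python
from collections import defaultdict


class Table:
    """Toy object table that runs callbacks on add/chg/del events."""

    EVENTS = ("add", "chg", "del")

    def __init__(self, name):
        self.name = name
        self.objects = {}
        self.triggers = defaultdict(list)  # event name -> list of callbacks

    def on(self, event, callback):
        if event not in self.EVENTS:
            raise ValueError(f"unknown event: {event}")
        self.triggers[event].append(callback)

    def _fire(self, event, key, old, new):
        for callback in self.triggers[event]:
            callback(self.name, event, key, old, new)

    def put(self, key, value):
        exists = key in self.objects
        old = self.objects.get(key)
        if exists and old == value:
            return  # unchanged object, nothing to trigger
        self.objects[key] = value
        self._fire("chg" if exists else "add", key, old, value)

    def delete(self, key):
        if key in self.objects:
            old = self.objects.pop(key)
            self._fire("del", key, old, None)


seen = []


def bgp_filter_input(table, event, key, old, new):
    # Placeholder for real filtering logic: just record what was triggered.
    seen.append((event, key, new))


# As on the slide: add and chg run bgp_filter_input, del has no trigger.
rib_in = Table("pre-filter ipv4 rib")
rib_in.on("add", bgp_filter_input)
rib_in.on("chg", bgp_filter_input)

rib_in.put("10.0.0.0/8", {"nexthop": "193.203.0.40"})
rib_in.put("10.0.0.0/8", {"nexthop": "193.203.0.40"})  # no-op
rib_in.put("10.0.0.0/8", {"nexthop": "193.203.0.41"})
rib_in.delete("10.0.0.0/8")
```

The C/C++ library calls and `vfork()`-ed executables listed on the slide would plug into the same dispatch point; they are left out to keep the sketch short.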
  13. Putting it all together
  14. Micro-Services & APIs
  15. Build a system of little components
      • Micro-service architecture is like a UNIX pipeline model
      • Small pieces of software, each serving a unique purpose
      • Easy transfer of state from one brick to the next
      [Figure: Source → Filter → Sort → Filter → Sink]
      curl http://192.168.1.1/bds/object | grep "Received-From:" | sort | grep "foo" > out
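The pipeline analogy on this slide can be mirrored in Python with generators, where each stage is a small function that consumes the previous stage's output. This is a sketch, not RtBrick code: the stage names and the sample records are made up for illustration, and `grep` here does plain substring matching rather than regular expressions.

```python
def source(records):
    """Stand-in for `curl http://192.168.1.1/bds/object`: emit records one by one."""
    yield from records


def grep(pattern, lines):
    """Filter stage: pass through only the lines that contain `pattern`."""
    return (line for line in lines if pattern in line)


def sort_lines(lines):
    """Sort stage: needs the whole input before it can emit anything."""
    yield from sorted(lines)


def sink(lines):
    """Stand-in for `> out`: collect the final result."""
    return list(lines)


# Hypothetical sample data; the real object format is not shown on the slide.
sample = [
    "Received-From: foo-peer-2",
    "Prefix: 10.0.0.0/8",
    "Received-From: bar-peer-1",
    "Received-From: foo-peer-1",
]

out = sink(grep("foo", sort_lines(grep("Received-From:", source(sample)))))
```

Each stage only knows about the stream it is handed, which is the property the slide attributes to "bricks": a stage can be swapped or inserted without touching its neighbours.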
  16. REST/JSON based APIs
  17. Database centric / Distributed Data Store
      • bds://local/bgp.neighbor
      • bds://local/isis.adj
      • bds://local/isis.lsdb.l2
      • bds://217.160.181.216/bgp.rib-in
      • PUBSUB
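The `bds://` names on this slide look like ordinary URLs, so a client could split them with the standard library and use the result as a subscription key. The sketch below assumes that the authority part names the node (`local` or a remote address) and the path names the table; that reading, along with `TableRef`, `parse_bds_url` and `PubSub`, is an assumption for illustration and not a documented RtBrick API.

```python
from collections import defaultdict
from typing import NamedTuple
from urllib.parse import urlsplit


class TableRef(NamedTuple):
    node: str   # "local" or the address of a remote node (assumed meaning)
    table: str  # dotted table name, e.g. "bgp.rib-in"


def parse_bds_url(url):
    """Split a bds:// URL into (node, table)."""
    parts = urlsplit(url)
    table = parts.path.strip("/")
    if parts.scheme != "bds" or not parts.netloc or not table:
        raise ValueError(f"not a bds table URL: {url!r}")
    return TableRef(node=parts.netloc, table=table)


class PubSub:
    """Minimal in-process publish/subscribe keyed by table reference."""

    def __init__(self):
        self.subscribers = defaultdict(list)

    def subscribe(self, url, callback):
        self.subscribers[parse_bds_url(url)].append(callback)

    def publish(self, url, obj):
        ref = parse_bds_url(url)
        for callback in self.subscribers.get(ref, ()):
            callback(ref, obj)


received = []
bus = PubSub()
bus.subscribe("bds://217.160.181.216/bgp.rib-in", lambda ref, obj: received.append((ref.table, obj)))
bus.publish("bds://217.160.181.216/bgp.rib-in", {"prefix": "10.0.0.0/8"})
bus.publish("bds://local/bgp.neighbor", {"peer": "193.203.0.40"})  # no subscriber
```

A real distributed store would carry the published objects over the network; here both sides live in one process to keep the addressing idea visible.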
  18. Open IPC format = BSON/JSON
      • Binary JSON for memory and I/O efficiency
      • JSON conversion on the fly possible
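To make the BSON/JSON relationship concrete, here is a deliberately tiny encoder and decoder written against the public BSON specification. It covers only flat documents with UTF-8 string and 32-bit integer values, which is an assumption made to keep the sketch short; a production system would use a complete BSON library. The field names in the example are hypothetical.

```python
import json
import struct


def encode_bson(doc):
    """Encode a flat dict of str/int32 values as a BSON document."""
    body = b""
    for name, value in doc.items():
        key = name.encode("utf-8") + b"\x00"          # element name as cstring
        if isinstance(value, str):
            data = value.encode("utf-8") + b"\x00"
            # type 0x02: int32 byte length (including the trailing NUL) + bytes
            body += b"\x02" + key + struct.pack("<i", len(data)) + data
        elif isinstance(value, int) and not isinstance(value, bool):
            body += b"\x10" + key + struct.pack("<i", value)  # type 0x10: int32
        else:
            raise TypeError(f"unsupported value for {name!r}: {value!r}")
    # document = int32 total size + elements + terminating NUL
    return struct.pack("<i", 4 + len(body) + 1) + body + b"\x00"


def decode_bson(data):
    """Decode the subset produced by encode_bson back into a dict."""
    (size,) = struct.unpack_from("<i", data, 0)
    if size != len(data) or data[-1] != 0:
        raise ValueError("malformed BSON document")
    doc, pos = {}, 4
    while pos < size - 1:
        kind = data[pos]
        end = data.index(b"\x00", pos + 1)
        name = data[pos + 1:end].decode("utf-8")
        pos = end + 1
        if kind == 0x02:
            (length,) = struct.unpack_from("<i", data, pos)
            doc[name] = data[pos + 4:pos + 4 + length - 1].decode("utf-8")
            pos += 4 + length
        elif kind == 0x10:
            (number,) = struct.unpack_from("<i", data, pos)
            doc[name] = number
            pos += 4
        else:
            raise ValueError(f"unsupported element type 0x{kind:02x}")
    return doc


wire = encode_bson({"peer": "193.203.0.40", "asn": 65000})
as_json = json.dumps(decode_bson(wire))  # JSON conversion on the fly
```

Because every document and string carries its length up front, a reader can skip over fields it does not need without scanning text, which is one plausible reading of the slide's "memory and I/O efficiency" claim.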
  19. Table replication & state flow within a system
  20. Commoditization & Unit Economics
  21. Compute Strategy: Yahoo vs. Google
      Few Big vs. Many Small
  22. Commodity data plane = White-boxes
      • Economy of scale will ultimately render custom ASICs obsolete
      • FY2016 systems shipping: 100GB, > 128K FIB entries
      • Disintegration happening
        • soon to enter the Edge Router Business …
      • For ease of integration: no hardware, no locality, no OS assumptions
      • Unbounded configuration possibilities:
        • Single switch, cluster of switches, co-located x86 rack servers …
        • Large FIBs, small FIBs, SW-based forwarders & combos thereof
  23. Commodity control plane = 1RU Rack Servers
      • 16-32 CPU cores, 64 GB RAM, solid state disks
      • approx. USD 3000
      • Runs stock Ubuntu / CentOS
      • Linux Containers (LXC)
      • Dependency management
      • Para-virtualization
  24. Resiliency, system coupling and state recovery
  25. Hyper-scale Multi-level Architecture
  26. Resiliency
  27. Resiliency – snapshot DB to disk
  28. Resiliency (2) – restart based on disk snapshot
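The two resiliency slides name a pattern (snapshot the database to disk, then restart from that snapshot) without showing how it is done. A minimal Python sketch of the pattern follows, assuming the state fits in a JSON-serializable dict. The file layout, the function names and the choice of JSON are all assumptions for illustration; the slides do not describe RtBrick's snapshot format.

```python
import json
import os
import tempfile


def snapshot(tables, path):
    """Write `tables` to `path` atomically, so a crash never leaves a torn file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(tables, handle)
            handle.flush()
            os.fsync(handle.fileno())
        os.replace(tmp_path, path)  # atomic rename over the previous snapshot
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def restore(path):
    """Load the last snapshot; an empty result means a full resync is needed."""
    if not os.path.exists(path):
        return {}
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


# Usage: a restarted process loads the snapshot first and only then
# asks its peers for whatever changed since the snapshot was taken.
with tempfile.TemporaryDirectory() as workdir:
    snapshot_file = os.path.join(workdir, "appd.snapshot")
    cold_start = restore(snapshot_file)
    snapshot({"bgp.rib-in": {"10.0.0.0/8": "193.203.0.40"}}, snapshot_file)
    warm_start = restore(snapshot_file)
```

Writing to a temporary file and renaming it is a design choice that keeps the previous snapshot usable if the process dies mid-write.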
  29. Open Source Development & Test
  30. Open Development
      • Open Source
        • 100 eyes better than 4 eyes, network effects
        • Long-term maintenance
      • Open Source means sharing of not just Code:
        • Code
        • Test
        • Build
        • Documentation
  31. Open development (1)
      • Use what is usable
        • No need to re-invent Linux, event loops, memory managers
      • Kernel-based networking stacks are not usable for a router
        • Debugging is hard (GDB live attachment)
        • Experimental forwarding code with no fault domains in your kernel, really?
        • TCP snapshots / restart
      • In 2016Q1 we did not have a packet forwarding core
      • Cisco released fd.io / VPP
        • User-space DPDK design aligned with our (religious) beliefs
        • Most feature-complete open-source L3 forwarder
        • Engineered for performance and maintainability
  32. Open development (2)
      • Kick-ass VPP crew
        • Helped us to implement necessary core features (indirect next-hop) within two weeks
      • Good balance between stability and feature velocity
      • Excellent Continuous Integration & Test Automation (atypical for FLOSS projects)
  33. Open Development (3)
      VPP Internet stream generator
  34. Conclusion
  35. In Conclusion
      • Network equipment design has got to be
        • Distributed, multi-level architecture
        • Micro-service based
        • Running on commodity hardware
        • “System” resilient
        • Open development / open test
      • Cisco Vector Packet Processing (VPP)
        • Best code in the industry (why is this free?)
        • Good code governance
        • Establishment of an innovative ecosystem around VPP underway
  36. rtbrick demo at fd.io booth
  37. Demo hosted on an EC2 instance
  38. Demo SCALE
      • 39 million IPv4 / 3 million IPv6 route entries
        • BGP table snapshots from RIPE RIS server
      • Route-processing / update / restart performance
        • 20x compared to JUNOS | IOS-XR
        • Full bring-up time 180 s
        • Resync time 26 s
      • Full fault domain isolation
        • Blast radius within a protocol process of an address family
      • Process restart
        • Preservation of TCP session
        • Fast, robust re-sync of state
      • Everything versioned
  39. Process restart using snapshots (3)
      • Normal appd restart: resync time for 2.5M RIB entries: 28 s
      • Snapshot appd restart: resync time for 2.5M RIB entries: 6 s
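The two resync figures on this slide can be turned into rates and a speedup with a few lines of arithmetic. The inputs come straight from the slide; the variable names are only for illustration.

```python
rib_entries = 2_500_000
normal_restart_s = 28    # resync time after a normal appd restart
snapshot_restart_s = 6   # resync time after a snapshot-based appd restart

# Entries resynchronized per second in each mode.
normal_rate = rib_entries / normal_restart_s
snapshot_rate = rib_entries / snapshot_restart_s

# How many times faster the snapshot-based restart is.
speedup = normal_restart_s / snapshot_restart_s

print(f"normal:   {normal_rate:,.0f} entries/s")
print(f"snapshot: {snapshot_rate:,.0f} entries/s")
print(f"speedup:  {speedup:.2f}x")
```

In other words, the snapshot-based restart resynchronizes roughly 417,000 entries per second against roughly 89,000 for a normal restart, a speedup of about 4.7x.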
  40. Thank you
