Linux Networking: Why Should You Care
Shrijeet Mukherjee & Dinesh G Dutt
May 23, 2016
Agenda
Linux vs Linux-based
Evolution of Some of the Basic Components
The New Stuff
May 23, 2016 Cumulus Networks Confidential 2
Key Takeaways
Linux networking is mature
Linux networking is growing
Kernel as the source of truth => no vendor lock-in
• Same operating model as applications
• Stable API for applications to develop to
• Most freedom of choice for customers
Linux As a NOS: Version 1
cumulusnetworks.com 4
[Diagram labels: CPU, RAM, Flash, etc.; Switch Silicon; Front Panel Ports; User Space; Linux Kernel; ASIC Driver; Routing Tables; ARP Table; Bridge Table; Ethernet Interfaces; Vendor Blob (Holds Master State); SAI]
Linux networking is not used at all
Linux As a NOS: Version 2
[Diagram labels: CPU, RAM, Flash, etc.; Switch Silicon; Front Panel Ports; User Space; Linux Kernel; ASIC Driver; Partial Kernel sync; Apps; Vendor Blob (Holds Master State); Routing Tables; ARP Table; Bridge Table; Ethernet Interfaces; SAI]
Linux kernel networking is used partially
Linux As a NOS: Version 3
[Diagram labels: CPU, RAM, Flash, etc.; Switch Silicon; Front Panel Ports; Linux Kernel; ASIC Driver; Routing Tables; ARP Table; Bridge Table; Ethernet Interfaces; Silicon Switch Driver; SAI]
Cumulus® Linux® Architecture
[Diagram labels: CPU, RAM, Flash, etc.; Switch Silicon; Front Panel Ports; User Space; Linux Kernel; ASIC Driver; Routing Tables; ARP Table; Bridge Table; Ethernet Interfaces; Automation; Monitoring; Third Party/Customer Applications; Network Orchestration; Routing Suite; Bridging; VXLAN; Quagga; switchd; Switch HAL]
Perceptions
• Infrastructure or Application
• Model or API
Linux Networking: Application to Switches
The Linux universe
APP
• Hadoop/ZooKeeper
Hypervisor
• KVM/Xen
Docker
• Mesos/Kubernetes
Kernel
• Bridges/Routers/OVS
Each layer builds on the others and is silently interchangeable, i.e. infrastructure
• Even Microsoft picked Linux for its cloud OS
Linux is everywhere: virtual machines, virtual switches, physical servers, physical switches, routers
A General Note On Innovation & Platforms
• Application interface: portable and maintainable apps can use this layer. Innovation here leads to hair pulling.
• Low-level device interface: simplifies the work of system software developers, but is only useful in the context of the full system around it. Innovation and change here is good.
API vs Model
• The accompanying picture is the standard kernel ARP handling flow (request or solicitation)
• This flow must be executed under the covers, or IP exchanges break
• What API captures this behavior? E.g.
  • Caching ARP differently
  • Filtering points are modeled differently
• Incorrect ARP behavior will wreak havoc in your network, unless you have already funded the handling of that havoc
[Figure: standard kernel ARP handling flow. (a) Reception of an ARP packet: decision points "Is the buffer shared?" (yes: make a local copy), "Is the ARP frame fragmented in memory?" (yes: linearize it), and a sanity check, e.g. "Should we process it?" (failed: drop it; passed: continue); the Netfilter hook NF_ARP_IN leads to arp_process. (b) Transmission of an ARP packet: arp_create fills in the header and payload; arp_xmit passes the Netfilter hook NF_ARP_OUT and reaches dev_queue_xmit. The bridging code reaches Netfilter at NF_ARP_FORWARD.]
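The receive half of the ARP flow can be written down as a small decision function. This is a minimal Python sketch for illustration only, not kernel code: the `ArpBuffer` class, its fields, and `receive_arp` are hypothetical names, and the order of the three checks is an assumption. Only `NF_ARP_IN` and `arp_process` come from the figure.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ArpBuffer:
    """Hypothetical stand-in for a received ARP frame (not a kernel type)."""
    shared: bool      # is the buffer shared with another user?
    fragmented: bool  # is the frame fragmented in memory?
    sane: bool        # does the frame pass the sanity check?


def receive_arp(buf: ArpBuffer, netfilter_accepts: bool = True) -> list[str]:
    """Return the ordered steps taken when an ARP frame is received."""
    steps = []
    if buf.shared:
        steps.append("make a local copy")
    if buf.fragmented:
        steps.append("linearize it")
    if not buf.sane:
        steps.append("drop it")
        return steps
    # The Netfilter hook sits between the checks and the protocol handler.
    steps.append("NF_ARP_IN")
    steps.append("arp_process" if netfilter_accepts else "drop it")
    return steps


if __name__ == "__main__":
    print(receive_arp(ArpBuffer(shared=True, fragmented=False, sane=True)))
```

Every branch here is behavior that runs implicitly for each frame, which is why an API that only exposes "add ARP entry" and "delete ARP entry" does not capture the model.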
Is Kernel Development Fast Enough?
[Figure legend: grey = traditional stack; red = special pathways]
What is the ratio?
So the Linux kernel is getting in your way?
https://en.wikipedia.org/wiki/Troy_McClure
The New Stuff: The Big Stuff
[Diagram labels: CPU, RAM, Flash, etc.; Switch Silicon; Front Panel Ports; Linux Kernel; switchd; Switch HAL; ASIC Driver; Routing Tables; ARP Table; Bridge Table; Ethernet Interfaces; MPLS; VRF; VXLAN]
Linux Networking State of the Union: End 2015
• nftables
• eBPF
• TC integration
• New bridge driver
• VXLAN driver enhancements
• VRF
• LWT (infrastructure for MPLS)
• Link state management
• Optimized IPv4 FIB lookup, route-driven congestion algorithm selection
• Switchdev support for Mellanox switch and DSA devices
• NetCP (network coprocessor) driver support
• TCP fingerprinting
Linux Infrastructure: Enabling Technology Adoption
Infrastructure components
• VXLAN & Geneve
• Foo over UDP
• DCTCP
• OVS
Userland upgrades
• Quagga
• iproute2
• ethtool
• lldpd
• libnl
• ifupdown2
So what are new initiatives like SONiC, OpenSwitch, etc. buying you?
• A Deeper Dive Into Some of the Components
Packet flow in Netfilter and General Networking
[Figure: packet flow in Netfilter and general networking (Jan Engelhardt's network PNG). Columns: INPUT PATH, FORWARD PATH, OUTPUT PATH. Rows: Application Layer, Protocol Layer, Network Layer, Link Layer. Legend: Xtables (ip, ip6); ebtables; other NF parts; other networking. Annotations: clone packet; no clone to AF_PACKET.]
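The three paths named in the figure map onto the standard Netfilter IPv4 hooks. The following is a minimal Python sketch of that mapping; the hook names are the standard ones, while `HOOKS_BY_PATH` and `hooks_for` are hypothetical names introduced here, and the sketch ignores the bridge (ebtables) layer entirely.

```python
from __future__ import annotations

# Standard Netfilter IPv4 hook names, grouped by the three paths in the figure.
HOOKS_BY_PATH = {
    "input": ["PREROUTING", "INPUT"],
    "forward": ["PREROUTING", "FORWARD", "POSTROUTING"],
    "output": ["OUTPUT", "POSTROUTING"],
}


def hooks_for(locally_generated: bool, destination_is_local: bool = False) -> list[str]:
    """Return the hooks a packet traverses, in order.

    locally_generated: the packet was sent by a local application.
    destination_is_local: the routing decision says the packet is for this host
    (only consulted for packets that arrived from the wire).
    """
    if locally_generated:
        return HOOKS_BY_PATH["output"]
    if destination_is_local:
        return HOOKS_BY_PATH["input"]
    return HOOKS_BY_PATH["forward"]


if __name__ == "__main__":
    # A packet routed through the box sees three hooks.
    print(hooks_for(locally_generated=False, destination_is_local=False))
```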
Bridging: A Model for Each Deployment Model
Old bridge driver
• Simplified environments
• Multiple STP domains if needed
New bridge driver
• Perfect for high scale
• 803.ad compliant
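One way to see the scale difference between the two models is to count the netdevices each one needs. The sketch below assumes, and the slide does not say this explicitly, that the old model means one bridge per VLAN built from VLAN subinterfaces and the new model means a single VLAN-aware bridge (`vlan_filtering 1`). The helper names, port names, and VLAN IDs are hypothetical; the functions only build iproute2 command strings and do not run them.

```python
from __future__ import annotations


def per_vlan_bridges(ports: list[str], vlans: list[int]) -> list[str]:
    """Assumed 'old' model: one bridge per VLAN, one subinterface per port and VLAN."""
    cmds = []
    for vid in vlans:
        bridge = f"br{vid}"
        cmds.append(f"ip link add name {bridge} type bridge")
        for port in ports:
            sub = f"{port}.{vid}"
            cmds.append(f"ip link add link {port} name {sub} type vlan id {vid}")
            cmds.append(f"ip link set dev {sub} master {bridge}")
    return cmds


def vlan_aware_bridge(ports: list[str], vlans: list[int], bridge: str = "br0") -> list[str]:
    """Assumed 'new' model: a single bridge with per-port VLAN membership."""
    cmds = [f"ip link add name {bridge} type bridge vlan_filtering 1"]
    for port in ports:
        cmds.append(f"ip link set dev {port} master {bridge}")
        for vid in vlans:
            cmds.append(f"bridge vlan add dev {port} vid {vid}")
    return cmds


def netdevice_count(cmds: list[str]) -> int:
    """Count how many new netdevices a command list creates."""
    return sum(1 for cmd in cmds if cmd.startswith("ip link add"))


if __name__ == "__main__":
    ports, vlans = ["swp1", "swp2"], [100, 200, 300]
    print(netdevice_count(per_vlan_bridges(ports, vlans)))
    print(netdevice_count(vlan_aware_bridge(ports, vlans)))
```

Under these assumptions the per-VLAN model creates a bridge plus one subinterface per port for every VLAN, while the VLAN-aware model creates one bridge regardless of the VLAN count.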
Routing
Complete datapath model
• Leave protocol handling to user space
• Well-defined constructs of
  • Isolation via tables
  • Policy via rules
Light Weight Tunnels (LWT)
• Enable simple tunnel termination
• No device overhead
• Preserve the networking stack behavior
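"Isolation via tables, policy via rules" can be sketched as a two-stage lookup: rules are walked in priority order to pick a table, and the chosen table is searched by longest prefix match. The Python sketch below is a simplified model, not the kernel's FIB code: rules match only on source prefix, and the `lookup` function, table names, and addresses are hypothetical.

```python
from __future__ import annotations

import ipaddress


def lookup(tables: dict, rules: list, src: str, dst: str):
    """Return (table_name, nexthop) for a packet, or None if nothing matches.

    tables: {table_name: {prefix: nexthop}}
    rules:  [(priority, source_prefix, table_name)]; lower priority is tried first.
    """
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for _priority, from_prefix, table_name in sorted(rules):
        if src_ip not in ipaddress.ip_network(from_prefix):
            continue  # policy: this rule does not apply to the packet
        best = None
        for prefix, nexthop in tables.get(table_name, {}).items():
            net = ipaddress.ip_network(prefix)
            # Longest prefix match inside the selected table only.
            if dst_ip in net and (best is None or net.prefixlen > best[0]):
                best = (net.prefixlen, nexthop)
        if best is not None:
            return table_name, best[1]
        # No route in this table: fall through to the next rule.
    return None


if __name__ == "__main__":
    tables = {
        "tenant_a": {"10.1.0.0/16": "192.0.2.1", "10.1.2.0/24": "192.0.2.2"},
        "main": {"0.0.0.0/0": "198.51.100.1"},
    }
    rules = [(100, "172.16.1.0/24", "tenant_a"), (32766, "0.0.0.0/0", "main")]
    print(lookup(tables, rules, "172.16.1.5", "10.1.2.9"))
```

The routes of `tenant_a` are invisible to any source that no rule steers into that table, which is the isolation property the slide refers to.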
Routing: Multi-Tenancy and Isolation
Namespaces
• Well-understood containerization of the network
• Relatively heavyweight
  • All network objects need namespace management
  • Needs a new userspace API
Virtual Routing and Forwarding (VRF)
• Enables Layer 3 multi-tenancy
• Utilizes existing constructs
  • Interface for VRF binding
  • Tables for route table isolation
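The two VRF constructs, an interface for binding and a table for route isolation, show up directly in the iproute2 commands used to set one up. The Python sketch below only builds the command strings; it assumes a kernel and iproute2 with VRF device support, and the `vrf_commands` helper, the VRF name, table number, port names, and addresses are hypothetical.

```python
from __future__ import annotations


def vrf_commands(vrf: str, table: int, interfaces: list[str], routes=()) -> list[str]:
    """Build iproute2 commands that create a VRF and bind interfaces to it."""
    cmds = [
        # The VRF is itself a netdevice associated with one routing table.
        f"ip link add {vrf} type vrf table {table}",
        f"ip link set dev {vrf} up",
    ]
    for ifname in interfaces:
        # Binding: enslaving an interface to the VRF device.
        cmds.append(f"ip link set dev {ifname} master {vrf}")
    for prefix, nexthop in routes:
        # Isolation: routes live in the VRF's own table.
        cmds.append(f"ip route add table {table} {prefix} via {nexthop}")
    return cmds


if __name__ == "__main__":
    for cmd in vrf_commands("vrf-blue", 10, ["swp1", "swp2"], [("10.0.0.0/8", "192.0.2.1")]):
        print(cmd)
```

No new userspace API is involved: everything is expressed with links, masters, and tables that existing tools already understand.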
Packet Processing: Solutions Galore
Current
• Xtables
• TC & qdisc: the reigning standard
New
• nftables
• eBPF
  • Both may get TC front ends
  • New classifiers in TC are easier to work with
Evolving Initiatives To Watch For
Light Weight Tunnels
• Expand use cases for device-independent tunneling
• Implement VPN support with MPLS
• Handle early demux
Switchdev
• Coming into its own, policy implementations in full swing
• BOF at every netdev/Plumbers conference
• Your VM and switch operate the same way
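Device-independent tunneling means the encapsulation is attached to the route itself rather than to a tunnel netdevice. The Python sketch below contrasts the two styles by building command strings only. It assumes an iproute2 build with MPLS encap support; the helper names, labels, interface names, and addresses are hypothetical, and GRE is used purely as a familiar example of a device-based tunnel.

```python
from __future__ import annotations


def lwt_mpls_route(prefix: str, label: int, nexthop: str, dev: str) -> list[str]:
    """Lightweight tunnel: one route carrying its own MPLS encapsulation."""
    return [f"ip route add {prefix} encap mpls {label} via inet {nexthop} dev {dev}"]


def device_tunnel_route(prefix: str, name: str, local: str, remote: str) -> list[str]:
    """Device-based tunnel: create a netdevice first, then route through it."""
    return [
        f"ip link add name {name} type gre local {local} remote {remote}",
        f"ip link set dev {name} up",
        f"ip route add {prefix} dev {name}",
    ]


def netdevices_created(cmds: list[str]) -> int:
    """Count the netdevices a command list creates."""
    return sum(1 for cmd in cmds if cmd.startswith("ip link add"))


if __name__ == "__main__":
    print(lwt_mpls_route("10.1.1.0/24", 200, "192.0.2.1", "swp1")[0])
    print(netdevices_created(device_tunnel_route("10.1.1.0/24", "gre1", "192.0.2.10", "192.0.2.20")))
```

With many tunnel endpoints the device-based style needs one netdevice per tunnel, while the lightweight style adds only routes.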
Evolving Initiatives To Watch For
Netdevices
• Netlink filtering
• Lighter-weight netdev
• Needed for multi-tenancy solutions (VRF, VLANs, etc.)
L1 info tool (ethtool++)
• Async operations
• Handle expanded SFP EEPROMs
Evolving Initiatives To Watch For
eBPF
• Gathering steam
• Offloading is the next frontier
Tunnels
• Checksums: yeah, they're back!
• Header alignment (inconsiderate to CPU alignment)
Evolving Initiatives To Watch For
nftables
• Parse graph may become the default representation
SR-IOV
• Is it a NIC, is it a switch? It is a plane-bird
For Further Information
• Participate in netdev: http://www.netdevconf.org/
• Netconf article from LWN: https://lwn.net/Articles/674943/
Summary
Evolution of the network OS
Monolithic OS
• No real OS
• while loop
• Proprietary routing and switching stack
• Examples: IOS, CatOS
Third-party real-time OS
• Embedded OS with process and memory management
• Proprietary routing and switching stack
• Examples: ION, iCOS/Fastpath
Linux-based OS
• Linux as embedded OS with process and memory management
• Proprietary routing and switching stack
• Examples: NX-OS, EOS
Linux OS
• Linux as network OS
• Native routing and switching
• Open and proven
• Example: Cumulus Linux
Next Talk
• L3 To The Host and Container Networking
• Guest Speaker: Kelsey Hightower, Google
• April 19, 2016
© 2016 Cumulus Networks. Cumulus Networks, the Cumulus Networks Logo, and Cumulus Linux are trademarks or registered trademarks of Cumulus Networks, Inc. or its affiliates in
the U.S. and other countries. Other names may be trademarks of their respective owners. The registered trademark Linux® is used pursuant to a sublicense from LMI, the exclusive
licensee of Linus Torvalds, owner of the mark on a world-wide basis.
Thank You!

Webinar-Linux Networking is Awesome
