DevConf 2014 Kernel Networking Walkthrough
This presentation features a walk through the Linux kernel networking stack covering the essentials and recent developments a developer needs to know. Our starting point is the network card driver as it feeds a packet into the stack. We will follow the packet as it traverses through various subsystems such as packet filtering, routing, protocol stacks, and the socket layer. We will pause here and there to look into concepts such as segmentation offloading, TCP small queues, and low latency polling. We will cover APIs exposed by the kernel that go beyond use of write()/read() on sockets and will look into how they are implemented on the kernel side.

Published in: Technology
Transcript

  • 1. Kernel Networking Walkthrough. Thomas Graf, Principal Software Engineer, Networking Services, Red Hat. Feb 7, 2014.
  • 2. Agenda
      ● How does a packet get in and out of the net stack? NAPI, Busy Polling, RSS, RPS, XPS, GRO, TSO
      ● How does a packet get through the net stack? RX Handler, IP Processing, TCP Processing, TCP Fast Open
      ● How to account for memory and do flow control? Socket Buffers, Flow Control, TCP Small Queues
      ● Q&A
  • 3. Touring the Network Stack: Expectation vs. Reality
  • 4. How does a packet get in and out of the Network Stack?
  • 5. Receive & Transmit Process (diagram spanning NIC, Network Stack (Kernel Space), and Process (User Space))
      ● Receive: NIC → DMA → Ring Buffer → Parse IP → Local? → Parse TCP/UDP → Socket Buffer → read() → Task
      ● Forward: Local? (no) → Forward → Device? → Ring Buffer → NIC
      ● Transmit: Task → write() → Socket Buffer → Construct TCP/UDP → Construct IP → Device? → Ring Buffer → DMA → NIC
  • 6. The 3 ways into the Network Stack
      ● Interrupt driven: Ring Buffer → Network Stack
      ● NAPI based polling: the Network Stack calls poll() on the Ring Buffer
      ● Busy polling: the Task calls busy_poll(), which polls the Ring Buffer through the Network Stack
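The NAPI polling mode on slide 6 can be illustrated with a small user-space simulation. This is only a sketch of the budget idea, not kernel code: `napi_poll`, the `deque` standing in for the ring buffer, and the budget of 8 are all assumptions made for illustration.

```python
from collections import deque

def napi_poll(ring, budget):
    """Take at most `budget` packets off the ring in one poll round."""
    processed = []
    while ring and len(processed) < budget:
        processed.append(ring.popleft())
    # If the whole budget was used, more packets may be waiting, so polling
    # continues. Otherwise the ring is drained and the driver would
    # re-enable interrupts.
    keep_polling = len(processed) == budget
    return processed, keep_polling

ring = deque(range(10))                 # ten pending packets (hypothetical)
first, again = napi_poll(ring, budget=8)
second, again2 = napi_poll(ring, budget=8)
```

The first round consumes its full budget and stays in polling mode; the second drains the remaining packets and would fall back to interrupts.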
  • 7. RSS – Receive Side Scaling
      ● The NIC distributes packets across multiple RX queues, allowing for parallel processing.
      ● Each RX queue has its own IRQ, which selects the CPU that runs the hardware interrupt handler.
      ● Diagram: filter → RX-queue-1 → CPU 1, RX-queue-2 → CPU 3, RX-queue-3 → CPU 1, RX-queue-4 → CPU 5
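The RSS filter on slide 7 can be sketched as a hash over the flow tuple that indexes an indirection table of RX queues. The sketch below is an assumption-laden model: real NICs typically use a Toeplitz hash, and `zlib.crc32` is swapped in here purely as a stand-in. The function name, the 128-entry table, and the addresses are hypothetical.

```python
import zlib

def rss_queue(src_ip, dst_ip, src_port, dst_port, indirection):
    """Pick an RX queue for a flow (toy model of the NIC filter)."""
    key = f"{src_ip}:{src_port}->{dst_ip}:{dst_port}".encode()
    flow_hash = zlib.crc32(key)          # stand-in for the NIC's hash
    # The hash selects a slot in the indirection table; the slot names a queue.
    return indirection[flow_hash % len(indirection)]

# Hypothetical table spreading 128 slots evenly over 4 RX queues.
indirection_table = [slot % 4 for slot in range(128)]
queue = rss_queue("10.0.0.1", "10.0.0.2", 40000, 80, indirection_table)
same = rss_queue("10.0.0.1", "10.0.0.2", 40000, 80, indirection_table)
```

Because the hash depends only on the flow tuple, all packets of one flow land on the same queue, which keeps per-flow ordering while different flows are processed in parallel.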
  • 8. RPS – Receive Packet Steering
      ● Software filter to select CPU # for processing
      ● Use it to ...
          ... distribute a single queue to multiple CPUs
          ... redo the queue-to-CPU mapping
      ● Diagram: RX-queue-1 to RX-queue-4 steered onto CPU 1, CPU 2, and CPU 3
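RPS is configured per RX queue by writing a hexadecimal CPU bitmask to `/sys/class/net/<dev>/queues/rx-<n>/rps_cpus`. The helper below is a minimal sketch for building that mask; the function name is hypothetical, and it assumes fewer than 32 CPUs (larger systems use comma-separated 32-bit groups, which this sketch does not produce). Note that the kernel numbers CPUs from 0, whereas the slide's diagram starts at CPU 1.

```python
def cpus_to_rps_mask(cpus):
    """Return the hex bitmask string for a list of CPU numbers (0-based)."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu                 # one bit per allowed CPU
    return format(mask, "x")

mask_first_three = cpus_to_rps_mask([0, 1, 2])   # CPUs 0, 1 and 2
mask_one_and_three = cpus_to_rps_mask([1, 3])    # CPUs 1 and 3
```

The resulting string is what would be written to the `rps_cpus` file of the queue in question.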
  • 9. Hardware Offload
      ● RX/TX Checksumming
          ● Perform CPU intensive checksumming in hardware.
      ● Virtual LAN filtering and tag stripping
          ● Strip the 802.1Q header and store the VLAN ID in the network packet metadata.
          ● Filter out unsubscribed VLANs.
  • 10. Generic Receive Offload
      ● NAPI based GRO: Ring Buffer (MTU-sized packets) → poll() → GRO → Network Stack (up to 64K)
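The effect of GRO on slide 10 (many MTU-sized packets in, fewer large packets out) can be modelled as follows. This is a simplified sketch: it only merges adjacent packets of the same flow and ignores the sequence-number and header checks the kernel performs. `gro_merge`, the `(flow, length)` tuples, and the 1448-byte payload size are assumptions for illustration.

```python
GRO_MAX = 65535  # upper bound for a merged packet ("up to 64K")

def gro_merge(packets):
    """Coalesce adjacent same-flow packets, given as (flow_id, payload_len)."""
    merged = []
    for flow, length in packets:
        can_append = (
            merged
            and merged[-1][0] == flow
            and merged[-1][1] + length <= GRO_MAX
        )
        if can_append:
            merged[-1] = (flow, merged[-1][1] + length)
        else:
            merged.append((flow, length))
    return merged

received = [("a", 1448), ("a", 1448), ("a", 1448), ("b", 1448)]
coalesced = gro_merge(received)
```

Three segments of flow "a" reach the stack as one larger packet, so the per-packet processing cost above the driver is paid once instead of three times.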
  • 11. Segmentation Offload
      ● Generic Segmentation Offload (GSO): Network Stack (up to 64K) → GSO → MTU-sized packets → Ring Buffer
      ● TCP Segmentation Offload (TSO): Network Stack (up to 64K) → Ring Buffer → TSO → MTU-sized packets
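Both GSO and TSO on slide 11 perform the same basic operation, splitting one large packet into MTU-sized pieces; they differ in where it happens (in software just before the driver, or in the NIC). A minimal sketch of the split itself, ignoring header replication and checksums; the function name and the sizes are hypothetical:

```python
def segment(payload, mss):
    """Split a large payload into chunks of at most `mss` bytes."""
    return [payload[i:i + mss] for i in range(0, len(payload), mss)]

big_payload = bytes(4000)               # stand-in for a large TCP payload
segments = segment(big_payload, 1448)
segment_sizes = [len(s) for s in segments]
```

The stack handles one large buffer for most of the transmit path, and only the last step pays the per-segment cost.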
  • 12. How does a packet get through the Network Stack? (image (c) Karen Sagovac)
  • 13. Packet Processing
      ● Link Layer
      ● Packet Socket (ETH_P_ALL): tcpdump
      ● Ingress QoS
      ● RX Handler: Bridge, Open vSwitch, Team, Bonding, macvlan, macvtap
      ● Proto Handler: IPv4, IPv6, ARP, IPX, ...
      ● Drop
      ● Image caption: "Feast of the hungry chicks"
  • 14. IP Processing
      ● Receive: Link Layer → IP Handler → PREROUTING → Route Lookup → Local Delivery → INPUT → L4 (TCP, ...) → User Space
      ● Forward: Route Lookup → Forwarding → FORWARD → POSTROUTING → Link Layer
      ● Send: User Space → L4 (TCP, ...) → IPv4 Construction → Route Lookup → Local Output → OUTPUT → POSTROUTING → Link Layer
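The netfilter hooks named on slide 14 are traversed in a fixed order that depends on where a packet comes from and where it is going. The sketch below encodes that order; the function and parameter names are hypothetical, while the hook names are the ones from the slide.

```python
def netfilter_hooks(locally_generated, destined_local):
    """Return the hooks a packet passes, in order."""
    if locally_generated:
        # Packets created by a local socket leave via OUTPUT.
        return ["OUTPUT", "POSTROUTING"]
    if destined_local:
        # Received packets addressed to this host are delivered via INPUT.
        return ["PREROUTING", "INPUT"]
    # Everything else received is forwarded.
    return ["PREROUTING", "FORWARD", "POSTROUTING"]

received_for_us = netfilter_hooks(locally_generated=False, destined_local=True)
forwarded = netfilter_hooks(locally_generated=False, destined_local=False)
sent_by_us = netfilter_hooks(locally_generated=True, destined_local=False)
```

The route lookup after PREROUTING is what decides between the INPUT and FORWARD branches.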
  • 15. TCP Processing
      ● IP → Parse TCP → Lookup Socket → Socket Filter
      ● Socket locked: packet goes to the Backlog
      ● Task exists: packet goes to the TCP Prequeue (processed in process context instead of softirq)
      ● Otherwise: Receive → Socket Buffer
      ● Socket Buffer → read() / poll() → Task
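The three destinations on slide 15 can be read as a decision made for each incoming segment. The sketch below is one reading of that diagram, not the kernel's code; the function name, the parameter names, and the returned labels are hypothetical.

```python
def tcp_enqueue_target(socket_locked, reader_waiting):
    """Decide where an incoming TCP segment is queued (toy model)."""
    if socket_locked:
        # A task currently owns the socket; defer processing to the backlog,
        # which is worked off when the lock is released.
        return "backlog"
    if reader_waiting:
        # A task is blocked in read(); hand processing to its context.
        return "prequeue"
    # Nobody is waiting; process in softirq and queue on the socket buffer.
    return "receive_queue"

target_locked = tcp_enqueue_target(socket_locked=True, reader_waiting=True)
target_waiting = tcp_enqueue_target(socket_locked=False, reader_waiting=True)
target_idle = tcp_enqueue_target(socket_locked=False, reader_waiting=False)
```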
  • 16. TCP Fast Open (net.ipv4.tcp_fastopen)
      ● Regular, 1st request (2x RTT): SYN → SYN+ACK → ACK + HTTP GET → Data
      ● Regular, 2nd request (2x RTT): SYN → SYN+ACK → ACK + HTTP GET → Data
      ● Fast Open, 1st request (2x RTT): SYN → SYN+ACK+Cookie → ACK + HTTP GET → Data
      ● Fast Open, 2nd request (1x RTT): SYN+Cookie + HTTP GET → SYN+ACK + Data
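The RTT counts on slide 16 translate directly into time until the first response data arrives. A back-of-the-envelope sketch, assuming server processing time is negligible; the function names and the 100 ms RTT are hypothetical.

```python
def round_trips_to_first_data(fast_open, has_cookie):
    """Round trips from connect until response data arrives at the client."""
    if fast_open and has_cookie:
        # The request rides on the SYN, so data returns with the SYN+ACK.
        return 1
    # Handshake first (1 RTT), then request and response (1 RTT).
    return 2

def time_to_first_data_ms(rtt_ms, fast_open, has_cookie):
    return rtt_ms * round_trips_to_first_data(fast_open, has_cookie)

regular = time_to_first_data_ms(100, fast_open=False, has_cookie=False)
tfo_first = time_to_first_data_ms(100, fast_open=True, has_cookie=False)
tfo_repeat = time_to_first_data_ms(100, fast_open=True, has_cookie=True)
```

The saving only applies from the second connection onwards, because the first one is needed to obtain the cookie.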
  • 17. Memory Accounting & Flow Control
  • 18. Socket Buffers & Flow Control (net.ipv4.tcp_{r|w}mem)
      ● Sender (ssh): write() → wmem overlimit? If yes, block or return EWOULDBLOCK. Otherwise wmem += packet-size → Socket Buffer → TCP/IP → TX Ring Buffer, then wmem -= packet-size.
      ● Receiver (ssh): RX Ring Buffer → TCP/IP → rmem overlimit? If yes, reduce the TCP window. Otherwise rmem += packet-size → Socket Buffer; on read, rmem -= packet-size.
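The send-side accounting on slide 18 can be sketched as a counter that grows on write() and shrinks when the packet has left the TX ring. This is a toy model under stated assumptions: the class name, its methods, and the 3000-byte limit are hypothetical, and the check here is "would this write exceed the limit", which is a simplification of the kernel's logic.

```python
class SendBuffer:
    """Toy model of wmem accounting for a non-blocking socket."""

    def __init__(self, limit):
        self.limit = limit
        self.wmem = 0

    def write(self, size):
        if self.wmem + size > self.limit:
            # A blocking socket would sleep here instead.
            raise BlockingIOError("EWOULDBLOCK")
        self.wmem += size                # wmem += packet-size

    def tx_complete(self, size):
        self.wmem -= size                # wmem -= packet-size

buf = SendBuffer(limit=3000)
buf.write(1448)
buf.write(1448)
try:
    buf.write(1448)
    third_write_ok = True
except BlockingIOError:
    third_write_ok = False
buf.tx_complete(1448)                    # one packet left the TX ring
buf.write(1448)                          # now there is room again
```

The receive side mirrors this with rmem, except that running over the limit shrinks the advertised TCP window instead of blocking a writer.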
  • 19. TCP Small Queues (net.ipv4.tcp_limit_output_bytes)
      ● Two senders (ssh and torrent): write() → Socket Buffer → TCP/IP → Queuing Discipline → Driver → TX Ring Buffer
      ● TSQ: max 128 KB in flight per socket below the TCP/IP layer
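The TSQ limit on slide 19 caps how many bytes a single socket may have queued below TCP (queuing discipline, driver, TX ring) at any time. The sketch below models that cap; `tsq_release`, the list used as the socket's send queue, and the 64 KB packet sizes are assumptions for illustration, and the kernel's real check is somewhat more lenient than this strict comparison.

```python
TSQ_LIMIT = 128 * 1024  # bytes in flight per socket, as stated on the slide

def tsq_release(send_queue, in_flight, limit=TSQ_LIMIT):
    """Pass packets down to the qdisc while the socket stays under the limit."""
    released = []
    while send_queue and in_flight + send_queue[0] <= limit:
        size = send_queue.pop(0)
        in_flight += size
        released.append(size)
    return released, in_flight

pending = [65536, 65536, 65536, 65536]   # packet sizes waiting in the socket
released, in_flight = tsq_release(pending, in_flight=0)
# One packet finishes transmission, freeing 65536 bytes of the allowance.
released_later, in_flight_later = tsq_release(pending, in_flight - 65536)
```

Because the bulk sender (torrent) cannot fill the shared queues on its own, packets from the interactive sender (ssh) wait behind far less data.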
  • 20. Q&A
      ● Feedback page: http://devconf.cz/f/1
      ● Coming up next: NetworkManager for Enterprise, Dan Williams
