Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Software segmentation offloading for FreeBSD
Stefano Garzarella, Luigi Rizzo !
stefano.garzarella@gmail.com
!
!
!
!
!
Unive...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• Introduction
• Hardware offloading (TSO, RSC)
• Softw...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• The use of large frames makes network communication
...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Sender side!
• TCP Segmentation Offload (TSO)
• Data se...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 5
TCP Segmentation Offload
Physical (Hardware)
Data-Lin...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Receiver side!
• Receive Side Coalescing (RSC)
or hard...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 7
Receive Side Coalescing
Physical (Hardware)
Data-Lin...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 8
Software offloading
• Why software implementation?
• ...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Receiver side!
• Large Receive Offload (LRO)
• Software...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 10
Large Receive Offload
App Data (64KB)
TCP 1 (64KB)
I...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Sender side!
• Generic Segmentation Offload (GSO)
• Sof...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 12
Physical (Hardware)
Data-Link layer
(Ethernet)
IP
T...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• Network stack changes
• tcp_output()
• ip_output()
•...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
1. Checks if GSO is enabled:
• sysctl net.inet.tcp.gso...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• If GSO is enabled and required:
• avoids checksum (I...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• If GSO is enabled and required:
• calls gso_dispatch...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 17
gso_dispatch()
!
int!
gso_dispatch(struct ifnet *if...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 18
gso_functions[]
• gso_functions[GSO_TCP4]!
• gso_ip...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 19
GSO: m_seg()
not used
…
m_flags
m_ext.ext_buf
m_next...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 20
GSO: fix TCP/IPv4 headers
Checksum Urgent pointer
Se...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 21
GSO: fix TCP/IPv6 headers
Source IP address (128 bit...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 22
GSO on UDP flow
Physical (Hardware)
Data-Link layer
...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 23
GSO: fix UDP/IPv4 headers
UDP header (only in the fir...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 24
GSO: fix UDP/IPv6 headers
Source IP address (128 bit...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
To manage the GSO parameters there are some sysctl:
• ...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• Kernel patches for FreeBSD-current, FreeBSD 10-
stab...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 27
GSO patch
diff gso-current
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• Sender: CPU i7-870 at 2.93 GHz + Turboboost,
Intel 1...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Freq. [GHz] TSO GSO none Speedup GSO - none
2.93 9347 ...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Freq. [GHz] TSO GSO none Speedup GSO - none
2.93 9097 ...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 31
Results UDP/IPv4
Freq. [GHz] GSO none Speedup GSO -...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
Freq. [GHz] GSO none Speedup GSO - none
2.93 7281 6197...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com
• More performance measurements
• Optimize code paths
...
EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 34
Thank you!
Upcoming SlideShare
Loading in …5
×

Software segmentation offloading for FreeBSD by Stefano Garzarella

2,705 views

Published on

Abstract
The use of large frames makes network communication much less demanding for the CPU. Yet, backward compatibility and slow links requires the use of 1500 byte or smaller frames.
Modern NICs with hardware TCP segmentation offloading (TSO) address this problem. However, a generic software version (GSO) provided by the OS has reason to exist, for use on paths with no suitable hardware, such as between virtual machines or with older or buggy NICs.

In this paper we present our work to add GSO to FreeBSD. Our preliminary implementation, depending on CPU speed, shows up to 90% speedup compared to segmentation done in the TCP stack, saturating a 10 Gbit link at 2 GHz with checksum offloading

Published in: Technology
  • Be the first to comment

Software segmentation offloading for FreeBSD by Stefano Garzarella

  1. 1. Software segmentation offloading for FreeBSD Stefano Garzarella, Luigi Rizzo ! stefano.garzarella@gmail.com ! ! ! ! ! Università di Pisa September 28, 2014!
  2. 2. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • Introduction • Hardware offloading (TSO, RSC) • Software offloading (LRO) • Generic Segmentation Offload (GSO) • Results 2 Outline
  3. 3. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • The use of large frames makes network communication much less demanding for the CPU. • Backward compatibility and slow links requires the use of 1500 byte or smaller frames. • Modern NICs with hardware TCP segmentation offloading (TSO) address this problem. • Generic software version (GSO) provided by the OS has reason to exist, for use on paths with no suitable hardware. 3 Introduction
  4. 4. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Sender side! • TCP Segmentation Offload (TSO) • Data segmentation is offloaded to the NIC, that divides the data into the maximum transmission unit (MTU) size of the outgoing interface. • The network stack is crossed only once per (large) segment instead of once per 1500-byte frames. • Issues: • Only TCP traffic • Early version supported only IPv4 4 Hardware Offloading
  5. 5. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 5 TCP Segmentation Offload Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) App Data (64KB) Cable ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) Interface App Data (64KB) TCP 1 (64KB) IP TCP 1 (64KB) ETH IP TCP 1 (64KB) Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) Cable TCP Segmentation Offload (TSO) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) Interface a) TSO b) no TSO
  6. 6. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Receiver side! • Receive Side Coalescing (RSC) or hardware LRO • Allows a NIC to combine incoming TCP/IP packets that belong to the same connection into one large receive segment before passing it to the operating system. • Reduces CPU use because the TCP/IP stack is executed only once for a set of received Ethernet packets. 6 Hardware Offloading
  7. 7. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 7 Receive Side Coalescing Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware) Data-Link layer (Device Driver) App Data (64KB) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) Cable Interface ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) App Data (64KB) TCP 1 (64KB) IP TCP 1 (64KB) ETH IP TCP 1 (64KB) Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware) Data-Link layer (Device Driver) Cable Interface ETH IP TCP 1 (64KB) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) Receive Side Coalescing (RSC) ETH IP TCP 1a (1460 bytes) a) RSC b) no RSC
  8. 8. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 8 Software offloading • Why software implementation? • Older NICs (IPv4 / IPv6) • Buggy NICs • Communication between Virtual Machines • Easy to extend for new protocols
  9. 9. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Receiver side! • Large Receive Offload (LRO) • Software implementation of RSC • Available since FreeBSD 7.1 • Requires changes in each device drivers 9 Software offloading
  10. 10. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 10 Large Receive Offload App Data (64KB) TCP 1 (64KB) IP TCP 1 (64KB) ETH IP TCP 1 (64KB) Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware) Data-Link layer (Device Driver) Cable Interface ETH IP TCP 1 (64KB) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) Receive Side Coalescing (RSC) ETH IP TCP 1a (1460 bytes) Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware) Data-Link layer (Device Driver) App Data (64KB) ETH IP TCP 1a (1460 bytes) Cable Interface Large Receive Offload (LRO) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1 (64KB) IP TCP 1 (64KB) ETH IP TCP 1 (64KB) a) RSC b) LRO
  11. 11. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Sender side! • Generic Segmentation Offload (GSO) • Software implementation of TSO • Supported TCP, UDP and IPv4, IPv6 • Available for FreeBSD: • -current, 10-stable, 9-stable • Segmentation just before the device driver 11 Software offloading
  12. 12. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 12 Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) App Data (64KB) Cable ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) TCP 1a (1460 bytes) IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes)Interface App Data (64KB) TCP 1 (64KB) IP TCP 1 (64KB) ETH IP TCP 1 (64KB) Physical (Hardware) Data-Link layer (Ethernet) IP TCP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) Cable Generic Segmentation Offload (GSO) Interface ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) a) GSO b) no GSO GSO on TCP flow
  13. 13. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • Network stack changes • tcp_output() • ip_output() • ether_output() • gso_dispatch() 13 GSO: data flow sbappendstream() User Application Socket Layer Network Stack send(DATA) tcp_usr_send(mbuf_chain) socket send buffer DATA DATATCPPIP tcp_output() ip_output() ether_output() Network Interface IF output queue ifp->if_transmit() so_proto->pr_usrreqs->pru_send(mbuf_chain) mbuf_chain sosend_generic() m_uiotombuf() gso_dispatch() DATATCPPIP DATATCPPIPETH ifp->if_output() ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes)
  14. 14. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 1. Checks if GSO is enabled: • sysctl net.inet.tcp.gso = 1! • sysctl net.gso.”ifname”.enable_gso = 1! 2. Checks if the packet length exceeds the MTU ! • If 1 and 2 are true, sets GSO flag • m->m_pkthdr.csum_flags |= GSO_TO_CSUM(GSO_TCP4); 14 tcp_output() Network Stack socket send buffer DATATCPPIP tcp_output() ip_output() ether_output() Network Interface IF output queue ifp->if_transmit() gso_dispatch() DATATCPPIP DATATCPPIPETH ifp->if_output() mbuf … m_pkthdr.csum_flags GSO_TCP4 ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes)
  15. 15. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • If GSO is enabled and required: • avoids checksum (IP & TCP) • avoids IP Fragmentation 15 ip_output() Network Stack socket send buffer DATATCPPIP tcp_output() ip_output() ether_output() Network Interface IF output queue ifp->if_transmit() gso_dispatch() DATATCPPIP DATATCPPIPETH ifp->if_output() mbuf … m_pkthdr.csum_flags GSO_TCP4 ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes)
  16. 16. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • If GSO is enabled and required: • calls gso_dispatch() instead of ifp->transmit() 16 ether_output() Network Stack socket send buffer DATATCPPIP tcp_output() ip_output() ether_output() Network Interface IF output queue ifp->if_transmit() gso_dispatch() DATATCPPIP DATATCPPIPETH ifp->if_output() mbuf … m_pkthdr.csum_flags GSO_TCP4 ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes)
  17. 17. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 17 gso_dispatch() ! int! gso_dispatch(struct ifnet *ifp, ! ! ! ! ! struct mbuf *m, u_int mac_hlen)! {! ! …! ! gso_flags = CSUM_TO_GSO(m->m_pkthdr.csum_flags);! ! …! ! ! ! error = gso_functions[gso_flags](ifp, m, mac_hlen);! ! return error;! } enum gso_type {! GSO_NONE,! GSO_TCP4,! GSO_TCP6,! GSO_UDP4,! GSO_UDP6,! /*! * GSO_SCTP4, TODO! * GSO_SCTP6,! */! GSO_END_OF_TYPE! }; Network Stack socket send buffer DATATCPPIP tcp_output() ip_output() ether_output() Network Interface IF output queue ifp->if_transmit() gso_dispatch() DATATCPPIP DATATCPPIPETH ifp->if_output() mbuf … m_pkthdr.csum_flags GSO_TCP4 ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) ETH IP TCP 1a (1460 bytes) int (*gso_functions[GSO_END_OF_TYPE]) ! ! ! (struct ifnet*, struct mbuf*, u_int);! ! #define CSUM_TO_GSO(x) ((x & CSUM_GSO_MASK)! ! ! ! ! ! ! >> CSUM_GSO_OFFSET)
  18. 18. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 18 gso_functions[] • gso_functions[GSO_TCP4]! • gso_ip4_tcp(…) - GSO on TCP/IPv4 packet 1. m_seg(struct mbuf *m0, int hdr_len, int mss, …)! !! returns the mbuf queue that contains the segments of the original packet (m0). • hdr_len - first bytes of m0 that are copied in each new segments • mss - maximum segment size 2. fixes TCP and IP headers in each new segments 3. sends new segments to the device driver [ifp->if_transmit()]
  19. 19. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 19 GSO: m_seg() not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header 54 NULL MT_DATA 64KB IP Header Ethern Header CSUM_GSO 1500 not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL NULL MT_DATA not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header 54 MT_DATA 1514 IP Header Ethern Header not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA NULL m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header MT_DATA 1514 IP Header Ethern Header not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA NULL not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header 54 MT_DATA 1514 IP Header Ethern Header not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA NULL m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header 54 MT_DATA 1514 IP Header Ethern Header not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA NULL m_dat m_nextpkt m_pkthdr.tags m_pkthdr.recvif mbuf m_pkthdr.tso_segsz m_next m_pkthdr.csum_data m_pkthdr.csum_flags … m_type m_pkthdr.len m_len m_data m_flags M_PKTHDR TCP Header 54 NULL MT_DATA 520 IP Header Ethern Header not used … m_flags m_ext.ext_buf m_next m_len m_ext.ref_cnt m_ext.ext_free m_ext.ext_type m_ext.ext_size m_nextpkt m_data mbuf m_type M_EXT NULL MT_DATA NULL hdr_len *m0 hdr_len mss m_seg(struct mbuf *m0, int hdr_len, int mss, …
  20. 20. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 20 GSO: fix TCP/IPv4 headers Checksum Urgent pointer Sequence number Ack number Window size Source port Destination port N S S Y N A C K E C E C W R U R G P S H F I N R S T Data offset Reserved DATA Source IP address Destination IP address Version Total Length (header + data)IHL DSCP ECN Identification Time to live Protocol Header checksum Fragment Offset D F M F Ethernet header IPv4 TCP
  21. 21. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 21 GSO: fix TCP/IPv6 headers Source IP address (128 bit) Destination IP address (128 bit) Version Flow labelTraffic Class Payload Length Next Header Hop Limit Ethernet header Checksum Urgent pointer Sequence number Ack number Window size Source port Destination port N S S Y N A C K E C E C W R U R G P S H F I N R S T Data offset Reserved DATA IPv6 TCP
  22. 22. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 22 GSO on UDP flow Physical (Hardware) Data-Link layer (Ethernet) IP UDP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) App Data (64KB) Cable Interface UDP 1 (16KB) IP Fragmentation ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) IP 1b (1472 bytes) ETH IP UDP 1a (1460 bytes) ETH IP UDP 1a (1460 bytes) IP UDP 1a (1460 bytes) App Data (16KB) UDP 1 (16KB) IP UDP 1 (16KB) ETH IP UDP 1 (16KB) Physical (Hardware) Data-Link layer (Ethernet) IP UDP Application layerUsers Space Kernel Space NIC (Hardware - MTU=1500 bytes) Data-Link layer (Device Driver) Cable Generic Segmentation Offload (GSO) Interface ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP 1b (1472 bytes) ETH IP UDP 1a (1460 bytes) ETH IP UDP 1a (1460 bytes) a) GSO b) no GSO
  23. 23. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 23 GSO: fix UDP/IPv4 headers UDP header (only in the first frag) Checksum DATA Source IP address Destination IP address Version Total Length (header + data)IHL DSCP ECN Identification Time to live Protocol Header checksum Fragment Offset D F M F Ethernet header IPv4
  24. 24. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 24 GSO: fix UDP/IPv6 headers Source IP address (128 bit) Destination IP address (128 bit) Version Flow labelTraffic Class Payload Length Next Header Hop Limit Ethernet header DATA Identification ReservedNext Header Res M FFragment Offset UDP header (only in the first frag) Checksum IPv6 Frag. Hdr
  25. 25. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com To manage the GSO parameters there are some sysctl: • net.inet.tcp.gso! ! ! GSO enable on TCP communications (!=0) • net.inet.udp.gso! ! ! GSO enable on UDP communications (!=0) • for each interface: • net.gso.dev.”ifname”.max_burst GSO burst length limit [default: IP_MAXPACKET=65535] • net.gso.dev.”ifname”.enable_gso! !! GSO enable on “ifname” interface (!=0) 25 sysctl
  26. 26. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • Kernel patches for FreeBSD-current, FreeBSD 10- stable and FreeBSD 9-stable available at: https://github.com/stefano-garzarella/freebsd-gso • FreeBSD source with GSO available at: https://github.com/stefano-garzarella/freebsd-gso-src ! ! • To compile kernel with GSO support: • “options GSO” in kernel config 26 GSO code
  27. 27. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 27 GSO patch diff gso-current
  28. 28. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • Sender: CPU i7-870 at 2.93 GHz + Turboboost, Intel 10 Gbit NIC. • Receiver: CPU i7-3770K at 3.50GHz + Turboboost, Intel 10 Gbit NIC. • (RSC/LRO)-enabled (otherwise TSO/GSO are ineffective) • Benchmark tool: netperf 2.6.0 28 Experiments
  29. 29. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Freq. [GHz] TSO GSO none Speedup GSO - none 2.93 9347 9298 8308 12% 2.53 9266 9401 6771 39% 2.00 9408 9294 5499 69% 1.46 9408 8087 4075 98% 1.05 9408 5673 2884 97% 0.45 6760 2206 1244 77% 29 Results TCP/IPv4 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 0.5 1 1.5 2 2.5 3 Throughput[10^6bit/sec] Frequency [GHz] Experiments with TCP/IPv4 (checksum offload enabled) TSO GSO none
  30. 30. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Freq. [GHz] TSO GSO none Speedup GSO - none 2.93 9097 8861 4966 78% 2.53 9113 8290 4008 107% 2.00 9066 6599 3152 109% 1.46 7357 5180 2348 121% 1.05 6125 3607 1732 108% 0.45 2005 1505 651 131% 30 Results TCP/IPv6 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 0.5 1 1.5 2 2.5 3 Throughput[10^6bit/sec] Frequency [GHz] Experiments with TCP/IPv6 (checksum offload enabled) TSO GSO none
  31. 31. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 31 Results UDP/IPv4 Freq. [GHz] GSO none Speedup GSO - none 2.93 9440 8084 17% 2.53 7772 6649 17% 2.00 6336 5338 19% 1.46 4748 4014 18% 1.05 3359 2831 19% 0.45 1312 1120 17% 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 0.5 1 1.5 2 2.5 3 Throughput[10^6bit/sec] Frequency [GHz] Experiments with UDP/IPv4 GSO none
  32. 32. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com Freq. [GHz] GSO none Speedup GSO - none 2.93 7281 6197 17% 2.53 5953 5020 19% 2.00 4804 4048 19% 1.46 3582 3004 19% 1.05 2512 2092 20% 0.45 998 826 21% 32 Results UDP/IPv6 0 1000 2000 3000 4000 5000 6000 7000 8000 9000 10000 0 0.5 1 1.5 2 2.5 3 Throughput[10^6bit/sec] Frequency [GHz] Experiments with UDP/IPv6 GSO none
  33. 33. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com • More performance measurements • Optimize code paths • Add support to new protocols (SCTP, …) 33 Future works
  34. 34. EuroBSDcon 2014 - Stefano Garzarella - stefano.garzarella@gmail.com 34 Thank you!

×