DPDK Summit - San Jose - 2017
#DPDKSummit
Accelerating Packet Processing with GRO and GSO in DPDK
Packet Processing Overheads
• Most of the work in packet processing applications (e.g. a firewall or a TCP/IPv4 stack) is performed on a per-packet basis
• Per-packet routines (e.g. header processing) dominate packet processing overheads
• Reducing the number of packets to be processed mitigates the per-packet overhead
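To put a rough number on the last point, the sketch below counts how many packets a given payload needs. The function name `demo_packet_count` and the figures are illustrative assumptions, not taken from the talk: with 1460 payload bytes per packet (a 1500-byte MTU minus 20-byte IPv4 and 20-byte TCP headers without options), 64000 bytes of TCP payload take 44 packets, while a single merged packet would trigger the per-packet routines only once.

```c
#include <stdint.h>

/* Number of packets needed to carry payload_len bytes when each packet
 * holds at most per_pkt_payload bytes (ceiling division).
 * Hypothetical helper for illustration only. */
uint32_t demo_packet_count(uint32_t payload_len, uint32_t per_pkt_payload)
{
    if (per_pkt_payload == 0)
        return 0; /* invalid input: no payload can be carried */
    return payload_len / per_pkt_payload +
           (payload_len % per_pkt_payload != 0);
}
```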
Methods to Reduce the Packet Number
• Use a large Maximum Transmission Unit (MTU)
  • The MTU size depends on the physical links
• Use network interface card (NIC) features: Large Receive Offload (LRO), TCP Segmentation Offload (TSO) and UDP Fragmentation Offload (UFO)
  • Ingress direction: the NIC uses LRO to hand fewer, larger packets to the application
  • Egress direction: the application sends large packets and the NIC splits them with TSO/UFO
Drawbacks:
• Only TCP and UDP are supported
• Not all NICs support LRO/TSO/UFO
GRO and GSO
• Software methods:
  • Generic Receive Offload (GRO) merges small packets into larger ones
  • Generic Segmentation Offload (GSO) splits large packets into smaller ones
• Advantages:
  • Don’t rely on physical link support
  • Don’t rely on NIC hardware support
  • Able to support various protocols, such as TCP, VxLAN and GRE
GRO and GSO in DPDK
• In DPDK, GRO and GSO are two standalone libraries
• Ingress direction: NIC -> DPDK PMD -> DPDK GRO Library -> DPDK Application
• Egress direction: DPDK Application -> DPDK GSO Library -> DPDK PMD -> NIC
Library Framework
GRO API: rte_gro_ctx_create(), rte_gro_ctx_destroy(), rte_gro_reassemble…(), rte_gro_timeout_flush(), rte_gro_get_pkt_count()
• One GRO type handles one kind of packet
• The GRO type is indicated by MBUF->packet_type
  • TCP/IPv4 GRO: packet type IPv4-TCP
  • IPv4-GRE-TCP/IPv4 GRO: packet type IPv4-GRE-IPv4-TCP
  • …
GSO API: rte_gso_segment(pkt)
• One GSO type handles one kind of packet
• The GSO type is indicated by MBUF->ol_flags
  • TCP/IPv4 GSO: ol_flags PKT_TX_TCP_SEG | PKT_TX_IPV4
  • IPv4-GRE-TCP/IPv4 GSO: ol_flags PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE
  • ...
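The idea of selecting a GSO type from the offload flags can be sketched as below. This is a standalone illustration, not the library's code: the `DEMO_*` constants and `demo_gso_classify` are hypothetical stand-ins, and the real PKT_TX_* flag values and layout are defined in DPDK's mbuf headers and differ from these.

```c
#include <stdint.h>

/* Stand-in flag bits for illustration only; they are NOT the real
 * PKT_TX_* values from DPDK. */
#define DEMO_TX_TCP_SEG    (1ULL << 0)
#define DEMO_TX_IPV4       (1ULL << 1)
#define DEMO_TX_OUTER_IPV4 (1ULL << 2)
#define DEMO_TX_TUNNEL_GRE (1ULL << 3)

enum demo_gso_type { DEMO_GSO_NONE, DEMO_GSO_TCP4, DEMO_GSO_GRE_TCP4 };

/* Map a packet's offload flags to the GSO type that should handle it. */
enum demo_gso_type demo_gso_classify(uint64_t ol_flags)
{
    const uint64_t tcp4 = DEMO_TX_TCP_SEG | DEMO_TX_IPV4;
    const uint64_t gre_tcp4 = tcp4 | DEMO_TX_OUTER_IPV4 | DEMO_TX_TUNNEL_GRE;

    /* Check the more specific (tunnelled) combination first, because
     * it is a superset of the plain TCP/IPv4 flags. */
    if ((ol_flags & gre_tcp4) == gre_tcp4)
        return DEMO_GSO_GRE_TCP4;
    if ((ol_flags & tcp4) == tcp4)
        return DEMO_GSO_TCP4;
    return DEMO_GSO_NONE;
}
```

Checking the tunnelled combination first matters in this sketch: a GRE-encapsulated packet also carries the plain TCP/IPv4 flags and would otherwise be classified as non-tunnelled.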
API: How Do Applications Use It?
GRO sample:
nb_pkts = rte_eth_rx_burst(…, pkts, …);
nb_segs = rte_gro_reassemble_burst(pkts, nb_pkts, …);
rte_eth_tx_burst(…, pkts, nb_segs);
GSO sample:
struct rte_mbuf *gso_segs[N];
nb_segs = rte_gso_segment(…, pkt, gso_segs, …);
rte_eth_tx_burst(…, gso_segs, nb_segs);
API: Two Sets in GRO
Lightweight mode API: rte_gro_reassemble_burst()
• Merges a small number of packets rapidly
Heavyweight mode API: ctx = rte_gro_ctx_create(), rte_gro_reassemble(pkts, ctx), rte_gro_timeout_flush(ctx)
• Merges a large number of packets with fine-grained control
How to Merge and Split Packets?
• The GRO reassembly algorithm merges packets
• The GSO segmentation scheme splits packets
GRO Algorithm: Challenges
• A high-cost algorithm or implementation would cause packet drops in a high-speed network
• Packet reordering makes it hard to merge packets
  • Linux GRO fails to merge packets when it encounters packet reordering
The algorithm is lightweight, to scale with fast networking speeds
The algorithm is capable of handling packet reordering
GRO Algorithm: Key-based Approach
For each input packet:
• Categorize the packet into an existing “flow”
  • If no “flow” is found: insert a new “flow” and store the packet
• If a “flow” is found: search for a “neighbor” in the “flow”
  • If a neighbor is found: merge the packets
  • If no neighbor is found: store the packet in the “flow”
TCP/IPv4 packets:
• Header fields representing a “flow” (same value): src/dst MAC, IP, port; ACK number
• Header fields deciding a “neighbor” (incremental): IP ID; sequence number
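The two checks for TCP/IPv4 can be sketched as a pair of predicates. This is a minimal illustration under stated assumptions, not the DPDK GRO implementation: the struct and function names are hypothetical, and "incremental" is read here as "the sequence number continues where the previous payload ended and the IP ID is the next one".

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Header fields that must have the same value within one "flow". */
struct demo_flow_key {
    uint8_t  src_mac[6], dst_mac[6];
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t ack;
};

/* Header fields that decide whether two packets are "neighbors". */
struct demo_seg {
    uint16_t ip_id;
    uint32_t seq;
    uint16_t payload_len;
};

/* Compare field by field so struct padding never affects the result. */
bool demo_same_flow(const struct demo_flow_key *a,
                    const struct demo_flow_key *b)
{
    return memcmp(a->src_mac, b->src_mac, 6) == 0 &&
           memcmp(a->dst_mac, b->dst_mac, 6) == 0 &&
           a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->ack == b->ack;
}

/* True when b directly follows a: b starts where a's payload ends and
 * carries the next IP ID (both comparisons wrap around naturally). */
bool demo_is_next(const struct demo_seg *a, const struct demo_seg *b)
{
    return b->seq == a->seq + a->payload_len &&
           b->ip_id == (uint16_t)(a->ip_id + 1);
}
```

With the values used in the TCP/IPv4 example of this talk, a packet with IP ID 3, sequence number 301 and payload length 100 is followed by one with IP ID 4 and sequence number 401, so `demo_is_next` holds for that pair.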
GRO Algorithm: Key-based Approach
Two characteristics, which address challenges 1 and 2:
• Lightweight: classifying packets into “flows” to accelerate packet aggregation is simple
• In addition: storing out-of-order packets makes it possible to merge them later
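A minimal sketch of how storing packets inside a flow lets a late packet still be merged. It is an illustration under assumptions, not the library's data structure: the names are hypothetical, a flow is a small fixed-size array, and only sequence number, IP ID and payload length are tracked.

```c
#include <stdint.h>

#define DEMO_MAX_ITEMS 8

/* One stored, possibly already merged, packet of a flow. */
struct demo_item {
    uint16_t first_ip_id; /* IP ID of the first merged packet */
    uint16_t last_ip_id;  /* IP ID of the last merged packet */
    uint32_t seq;         /* sequence number of the first payload byte */
    uint32_t len;         /* total merged payload length */
};

struct demo_flow {
    struct demo_item items[DEMO_MAX_ITEMS];
    int nb_items;
};

/* Process one packet of this flow.
 * Returns 1 if it was merged with a neighbor, 0 if it was stored for
 * later, -1 if the flow has no room left. */
int demo_flow_input(struct demo_flow *f, uint16_t ip_id,
                    uint32_t seq, uint32_t len)
{
    for (int i = 0; i < f->nb_items; i++) {
        struct demo_item *it = &f->items[i];

        /* The packet directly follows the item: append it. */
        if (seq == it->seq + it->len &&
            ip_id == (uint16_t)(it->last_ip_id + 1)) {
            it->len += len;
            it->last_ip_id = ip_id;
            return 1;
        }
        /* The packet directly precedes the item: prepend it. */
        if (seq + len == it->seq &&
            (uint16_t)(ip_id + 1) == it->first_ip_id) {
            it->seq = seq;
            it->len += len;
            it->first_ip_id = ip_id;
            return 1;
        }
    }
    if (f->nb_items == DEMO_MAX_ITEMS)
        return -1;
    /* No neighbor yet: keep the packet so a later arrival can merge. */
    f->items[f->nb_items++] = (struct demo_item){ ip_id, ip_id, seq, len };
    return 0;
}
```

The sketch merges a new packet with at most one stored item and does not join two stored items that become adjacent afterwards; a fuller implementation would also need a flush path for items that never find a neighbor.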
GRO Algorithm: TCP/IPv4 Example
Flow table:
Flow | MAC                            | IP               | Port       | ACK number
F1   | 1:2:3:4:5:6, 11:22:33:44:55:66 | 1.1.1.1, 1.1.1.2 | 5041, 5043 | 1
F2   | 1:2:3:4:5:6, 11:22:33:44:55:66 | 1.1.1.1, 1.1.1.2 | 5001, 5002 | 1
Input packet:
Packet | Flow | IP ID | Sequence number | Payload length
P3     | F2   | 4     | 401             | 100
Stored packets (one table per flow):
Packet | IP ID | Sequence number | Payload length
P0     | 1     | 1               | 100
P2     | 3     | 301             | 100
Packet | IP ID | Sequence number | Payload length
P1     | 7     | 701             | 1500
(1) P3 is categorized into its flow, F2
(2) A neighbor is searched for in that flow: the IP ID and sequence number of P2 and P3 are incremental -> neighbors
(3) P3 is merged into P2; the stored packets of that flow become:
Packet | IP ID | Sequence number | Payload length
P0     | 1     | 1               | 100
P2     | 4     | 301             | 200
GSO Scheme: Overview
• Segmentation workflow (an input packet -> GSO segments):
  • Split the payload into smaller parts
  • Add the packet header to each payload part
  • Update the packet headers for all GSO segments
• How to organize a GSO segment?
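The workflow can be sketched as a function that plans the segments of one TCP/IPv4 packet. This is an illustration, not the DPDK GSO code: the names are hypothetical, and the set of per-segment header updates shown (TCP sequence number, IP ID, IP total length) is an assumption based on how TCP segmentation generally works; checksums and TCP flags are left out.

```c
#include <stdint.h>

/* Per-segment values that differ from the original packet's header. */
struct demo_gso_seg {
    uint32_t payload_off;  /* offset of this part in the original payload */
    uint16_t payload_len;  /* payload bytes carried by this segment */
    uint32_t tcp_seq;      /* updated TCP sequence number */
    uint16_t ip_id;        /* updated IPv4 identification */
    uint16_t ip_total_len; /* updated IPv4 total length */
};

/* Plan the segments for payload_len bytes, at most seg_payload bytes
 * each. hdr_len is the IPv4 + TCP header length. Returns the number of
 * segments written to out, or -1 on invalid input or if out is too
 * small. */
int demo_gso_plan(uint32_t payload_len, uint16_t seg_payload,
                  uint16_t hdr_len, uint32_t first_seq,
                  uint16_t first_ip_id,
                  struct demo_gso_seg *out, int max_out)
{
    int n = 0;

    if (seg_payload == 0)
        return -1;
    for (uint32_t off = 0; off < payload_len; off += seg_payload) {
        uint32_t left = payload_len - off;
        uint16_t len = left < seg_payload ? (uint16_t)left : seg_payload;

        if (n == max_out)
            return -1;
        out[n].payload_off = off;                     /* split the payload */
        out[n].payload_len = len;
        out[n].tcp_seq = first_seq + off;             /* update headers */
        out[n].ip_id = (uint16_t)(first_ip_id + n);
        out[n].ip_total_len = (uint16_t)(hdr_len + len);
        n++;
    }
    return n;
}
```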
GSO Scheme: Zero-copy Based Approach
• A “two-part” MBUF is used to organize a GSO segment
  • Direct MBUF: holds the packet header
  • Indirect MBUF: a “pointer” that points to a payload part of the packet to segment
• Layout of a GSO segment: direct MBUF (MBUF metadata + data room) -> next -> indirect MBUF 1 (MBUF metadata) -> next -> … -> indirect MBUF N (MBUF metadata)
• Forming a GSO segment just needs to copy the header!
GSO Scheme: Example
Input packet PKT: MBUF metadata, HDR, PYLD 0, PYLD 1
(1) Copy HDR to a direct MBUF
(2) “Attach” an indirect MBUF to PKT and make it point to the correct payload part
(3) Reduce the reference counter of PKT’s MBUF by 1
  • PKT will be freed automatically when all indirect MBUFs are freed
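The three steps can be imitated with a small reference-counted buffer. This is a standalone sketch, not DPDK code: `struct demo_buf` is a simplified stand-in for an MBUF, and all function names are hypothetical. Only the header is copied; the payload bytes stay in the original packet.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for a packet buffer; this is not struct rte_mbuf. */
struct demo_buf {
    uint8_t *data;          /* first byte this buffer exposes */
    uint16_t len;           /* number of bytes exposed */
    int refcnt;             /* references to owned storage */
    struct demo_buf *owner; /* set for indirect buffers only */
    struct demo_buf *next;  /* next buffer of the same segment */
};

/* Allocate a buffer that owns a copy of len bytes from src. */
struct demo_buf *demo_buf_new(const uint8_t *src, uint16_t len)
{
    struct demo_buf *b = calloc(1, sizeof(*b));
    if (b == NULL)
        return NULL;
    b->data = malloc(len ? len : 1);
    if (b->data == NULL) {
        free(b);
        return NULL;
    }
    memcpy(b->data, src, len);
    b->len = len;
    b->refcnt = 1;
    return b;
}

/* Drop one reference; returns 1 if the storage was actually freed. */
int demo_buf_put(struct demo_buf *b)
{
    if (--b->refcnt > 0)
        return 0;
    free(b->data);
    free(b);
    return 1;
}

/* Build one segment: a copied header followed by an indirect buffer
 * that points into pkt's payload at offset off. */
struct demo_buf *demo_seg_new(struct demo_buf *pkt, uint16_t hdr_len,
                              uint16_t off, uint16_t len)
{
    struct demo_buf *hdr = demo_buf_new(pkt->data, hdr_len); /* step 1 */
    struct demo_buf *ind = calloc(1, sizeof(*ind));
    if (hdr == NULL || ind == NULL) {
        if (hdr != NULL)
            demo_buf_put(hdr);
        free(ind);
        return NULL;
    }
    ind->data = pkt->data + hdr_len + off;                   /* step 2 */
    ind->len = len;
    ind->owner = pkt;
    pkt->refcnt++; /* the payload must outlive this segment */
    hdr->next = ind;
    return hdr;
}

/* Free one segment; returns 1 if this also freed the original packet. */
int demo_seg_free(struct demo_buf *seg)
{
    struct demo_buf *ind = seg->next;
    int freed = demo_buf_put(ind->owner);
    free(ind);
    demo_buf_put(seg);
    return freed;
}
```

In this sketch, step 3 corresponds to the caller invoking `demo_buf_put(pkt)` once all segments are built: the packet's storage then survives until the last segment that references it is freed.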
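Experiment: Physical Topology
[Diagram: Server 0 runs iperf on the host kernel and iperf in a VM, with testpmd acting as the switch between them. Components shown: NIC 0, NIC 1, vhost-user, virtio-net. The diagram marks which links are physically connected and which are logically connected; TCP/IPv4 packets flow between the two iperf instances.]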
Experiment: Setup
[Diagram, GRO test: the iperf client on the host kernel sends small TCP/IPv4 packets; testpmd performs GRO and forwards them over vhost-user/virtio-net to the iperf server in the VM.]
[Diagram, GSO test: the iperf client in the VM sends large TCP/IPv4 packets over virtio-net/vhost-user; testpmd performs GSO and forwards them to the iperf server on the host kernel.]
• CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
• NIC: Ethernet Controller XL710 for 40GbE QSFP+
• Compared: DPDK GRO and Linux GRO; DPDK GSO and Linux GSO
Experiment: GRO Performance
[Figure 1. Iperf throughput improvement of DPDK-GRO over No-GRO. Bar chart; x-axis: 1, 2 and 4 TCP connections; y-axis: speedup.]
[Figure 2. Iperf throughput improvement of DPDK-GRO over Linux-GRO. Bar chart; x-axis: 1, 2 and 4 TCP connections; y-axis: speedup.]
Highlighted speedups: 2.7x, 1.9x
Experiment: GSO Performance
[Figure 1. Iperf throughput improvement of DPDK-GSO over No-GSO. Bar chart; x-axis: 1, 2 and 4 TCP connections; y-axis: speedup.]
[Figure 2. Iperf throughput improvement of DPDK-GSO over Linux-GSO. Bar chart; x-axis: 1, 2 and 4 TCP connections; y-axis: speedup.]
Highlighted speedups: 2.2x, 1.5x
Thanks!
Jiayu Hu
jiayu.hu@intel.com

LF_DPDK17_GRO/GSO Libraries: Bring Significant Performance Gains to DPDK-based Applications

  • 1. DPDK Summit - San Jose – 2017 #DPDKSummit Accelerating Packet Processing with GRO and GSO in DPDK
  • 2. 2#DPDKSummit Packet Processing Overheads u A major part of packet processing “applications” are performed on a per- packet basis (e.g. Firewall, TCP/IPv4 stack) u Per-packet routines (e.g. header processing) dominate packet processing overheads u Reduce the packet number to be processed can mitigate the per-packet overhead
  • 3. 3 Methods to Reduce the Packet Number u Use large Maximum Transmission Unit (MTU) u MTU size depends on physical links L u Use network interface card (NIC) features: Large Receive Offload (LRO), TCP Segmentation Offload (TSO) and UDP Fragmentation Offload (UFO) Application ... fewer large packets Egress direction: large packets LRO NIC Ingress direction: ... TSO/UFO Drawbacks: • Only support TCP and UDP • Not all NICs support LRO/TSO/UFO
  • 4. 4#DPDKSummit GRO and GSO u Software Method: • Generic Receive Offload (GRO) merges small packets into larger ones • Generic Segmentation Offload (GSO) splits large packets into smaller ones u Advantages: • Don’t reply on physical link support • Don’t reply on NIC hardwaresupport • Able to support various kinds of protocols,like TCP, VxLAN and GRE
  • 5. 5 u In DPDK, GRO and GSO are two standalone libraries GRO and GSO in DPDK DPDK Application #DPDKSummit NIC DPDK PMD DPDK GRO Library NIC DPDK PMD DPDK GSO Library Ingress direction Egress direction
  • 6. 6 Library Framework #DPDKSummit … GRO API IPv4-GRE-TCP/IPv4 GRO Packet type: IPv4-GRE-IPv4-TCP Packet type: IPv4-TCP rte_gro_ctx_create() rte_gro_ctx_destroy() rte_gro_reassemble…() rte_gro_timeout_flush() rte_gro_get_pkt_count() TCP/IPv4 GRO GSO API ...IPv4-GRE-TCP/IPv4 GSO TCP/IPv4 GSO ol_flags: PKT_TX_TCP_SEG| PKT_TX_IPV4 ol_flags: PKT_TX_TCP_SEG | PKT_TX_IPV4 | PKT_TX_OUTER_IPV4 | PKT_TX_TUNNEL_GRE rte_gso_segment(pkt) • One GRO type, one kind of packets • Indicated by MBUF->packet_type • One GSO type, one kind of packets • Indicated by MBUF->ol_flags
  • 7. 7 API: How Applications Use it? nb_pkts = rte_eth_rx_burst(…, pkts, …); nb_segs = rte_gro_reassemble_burst(pkts, nb_pkts, …); rte_eth_tx_burst(…, pkts, nb_segs); struct rte_mbuf *gso_segs[N]; nb_segs = rte_gso_segment(…, pkt, gso_segs, …); rte_eth_tx_burst(…, gso_segs, nb_segs); GRO Sample GSO Sample
  • 8. 8 API: Two Sets in GRO #DPDKSummit ctx = rte_gro_ctx_create() rte_gro_reassemble(pkts, ctx) rte_gro_timeout_flush(ctx) Heavyweight Mode API • Supports to merge a large number of packets with fine- grained control rte_gro_reassemble_burst() • Support to merge a small number of packets rapidly Lightweight Mode API
  • 9. 9 How to Merge and Split Packets? u GRO Reassembly Algorithm merges packets u GSO Segmentation Scheme splits packets
  • 10. 10 GRO Algorithm: Challenges u A high cost algorithm/implementation would cause packet dropping in a high speed network u Packet reordering makes it hard to merge packets u Linux GRO fails to merge packets when encounters packet reordering #DPDKSummit Algorithm is lightweight to scale fast networking speed Algorithm is capable of handling packet reordering
  • 11. 11 GRO Algorithm: Key-based Approach #DPDKSummit Categorize into an existed “flow” Search for a “neighbor” in the “flow” Merge the packets • Insert a new “flow” • Store the packet Store the packet in the “flow” packet find a “flow” not find not find find a neighbor TCP/IPv4 Packets Header fields representing a “flow”: • src/dst: mac, IP, port • ACK number Header fields deciding “neighbor”: • IP id • Sequence number same value incremental
  • 12. GRO Algorithm: Key-based Approach
    Two characteristics, which address challenges 1 and 2:
    • Lightweight: classifying packets into "flows" to accelerate packet aggregation is simple
    • More: storing out-of-order packets makes it possible to merge them later
  • 13. GRO Algorithm: TCP/IPv4 Example
    Flow table:
      Flow  MAC                             IP                Port       ACK number
      F1    1:2:3:4:5:6 11:22:33:44:55:66   1.1.1.1 1.1.1.2   5041 5043  1
      F2    1:2:3:4:5:6 11:22:33:44:55:66   1.1.1.1 1.1.1.2   5001 5002  1
    Stored packets (two separate lists on the slide):
      Packet  IP ID  Sequence number  Payload length
      P0      1      1                100
      P2      3      301              100

      Packet  IP ID  Sequence number  Payload length
      P1      7      701              1500
    Incoming packet P3 (steps ① and ②):
      Flow  IP ID  Sequence number  Payload length
      F2    4      401              100
  • 14. GRO Algorithm: TCP/IPv4 Example
    The flow and packet tables are the same as on slide 13.
    The IP ID and sequence number of P3 are incremental with respect to P2 (IP ID 3 + 1 = 4; sequence number 301 + payload length 100 = 401) → P2 and P3 are neighbors
  • 15. GRO Algorithm: TCP/IPv4 Example
    Step ③: P3 is merged into P2. The flow table is unchanged (this slide labels the two flows f0 and f1).
    Stored packets after the merge:
      Packet  IP ID  Sequence number  Payload length
      P0      1      1                100
      P2      4      301              200

      Packet  IP ID  Sequence number  Payload length
      P1      7      701              1500
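The key-based approach and the TCP/IPv4 example above can be sketched in plain C without any DPDK dependency. This is a minimal illustration, not the DPDK GRO implementation: `flow_key`, `item`, `gro_table` and `gro_process()` are hypothetical names, the table sizes are arbitrary, and only the "append" case (the new packet directly follows a stored one) is handled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MAX_FLOWS 8   /* arbitrary sizes chosen for the sketch */
#define MAX_ITEMS 8

/* Hypothetical flow key: the fields the slides list as identifying a flow. */
struct flow_key {
    uint8_t  src_mac[6], dst_mac[6];
    uint32_t src_ip, dst_ip;
    uint16_t src_port, dst_port;
    uint32_t ack;
};

/* A stored (possibly already merged) packet inside a flow. */
struct item {
    uint16_t ip_id;  /* IP ID of the last packet merged into this item */
    uint32_t seq;    /* TCP sequence number of the first payload byte */
    uint32_t len;    /* total payload length held by this item */
};

struct flow {
    struct flow_key key;
    struct item items[MAX_ITEMS];
    int nb_items;
};

struct gro_table {
    struct flow flows[MAX_FLOWS];
    int nb_flows;
};

static int key_equal(const struct flow_key *a, const struct flow_key *b)
{
    return memcmp(a->src_mac, b->src_mac, 6) == 0 &&
           memcmp(a->dst_mac, b->dst_mac, 6) == 0 &&
           a->src_ip == b->src_ip && a->dst_ip == b->dst_ip &&
           a->src_port == b->src_port && a->dst_port == b->dst_port &&
           a->ack == b->ack;
}

/* Returns 1 if the packet was merged into a neighbor, 0 if it was stored,
 * and -1 if the table has no room left. */
static int gro_process(struct gro_table *t, const struct flow_key *k,
                       struct item pkt)
{
    struct flow *f = NULL;

    assert(t != NULL && k != NULL);

    /* Step 1: categorize the packet into an existing flow. */
    for (int i = 0; i < t->nb_flows; i++) {
        if (key_equal(&t->flows[i].key, k)) {
            f = &t->flows[i];
            break;
        }
    }
    if (f == NULL) {               /* no flow found: insert a new one */
        if (t->nb_flows == MAX_FLOWS)
            return -1;
        f = &t->flows[t->nb_flows++];
        f->key = *k;
        f->nb_items = 0;
    }

    /* Step 2: search for a neighbor, i.e. an item that this packet
     * directly follows (incremental sequence number and IP ID). */
    for (int i = 0; i < f->nb_items; i++) {
        struct item *it = &f->items[i];
        if (pkt.seq == it->seq + it->len &&
            pkt.ip_id == (uint16_t)(it->ip_id + 1)) {
            /* Step 3: merge by extending the stored item. */
            it->len += pkt.len;
            it->ip_id = pkt.ip_id;
            return 1;
        }
    }

    /* No neighbor: keep the packet so a later packet can still merge. */
    if (f->nb_items == MAX_ITEMS)
        return -1;
    f->items[f->nb_items++] = pkt;
    return 0;
}
```

Feeding P0, P2 and then P3 with the same key stores the first two packets and merges P3 into P2, leaving an item with IP ID 4, sequence number 301 and payload length 200. Which stored packets belong to which flow is an assumption here, since the slide does not spell it out.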
  • 16. GSO Scheme: Overview
    Segmentation workflow (an input packet → GSO segments):
      1. Split the payload into smaller parts
      2. Add the packet header to each payload part
      3. Update the packet headers of all GSO segments
    How to organize a GSO segment?
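The slide does not say which header fields are updated per segment. A minimal sketch for TCP/IPv4, assuming each segment gets an IP ID one higher than the previous segment and a sequence number advanced by the payload bytes already sent, could look as follows. `seg_hdr` and `gso_plan()` are hypothetical names, and checksums, IP total length and TCP flags are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Per-segment header values that differ from the input packet
 * (assumed set of fields; not taken from the slides). */
struct seg_hdr {
    uint16_t ip_id;        /* IPv4 identification */
    uint32_t seq;          /* TCP sequence number */
    uint16_t payload_len;  /* payload bytes carried by this segment */
};

/* Plans the segmentation of a payload of payload_len bytes into parts
 * of at most seg_payload bytes. Returns the number of segments written
 * to out, or -1 on invalid input or if out is too small. */
static int gso_plan(uint16_t ip_id, uint32_t seq, uint32_t payload_len,
                    uint16_t seg_payload, struct seg_hdr *out, int max_out)
{
    uint32_t off = 0;
    int n = 0;

    assert(out != NULL);
    if (seg_payload == 0)
        return -1;

    while (off < payload_len) {
        uint32_t left = payload_len - off;

        if (n == max_out)
            return -1;
        /* Step 1: split the payload into a part of at most seg_payload. */
        out[n].payload_len =
            (uint16_t)(left < seg_payload ? left : seg_payload);
        /* Steps 2 and 3: the header is reused, with these fields updated. */
        out[n].seq = seq + off;
        out[n].ip_id = (uint16_t)(ip_id + n);
        off += out[n].payload_len;
        n++;
    }
    return n;
}
```

For example, a 3000-byte payload with a 1460-byte segment size yields three segments of 1460, 1460 and 80 bytes.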
  • 17. GSO Scheme: Zero-copy Based Approach
    • A "two-part" MBUF is used to organize a GSO segment
      • Direct MBUF: holds the packet header
      • Indirect MBUF: a "pointer" that points to a payload part of the packet to segment
    Diagram: a direct MBUF (MBUF metadata + data room) linked through next to indirect MBUF 1 … indirect MBUF N (MBUF metadata)
    Forming a GSO segment only requires copying the header!
  • 18. GSO Scheme: Example
    Diagram: the MBUF of PKT holds HDR, PYLD 0 and PYLD 1; each GSO segment is a direct MBUF (copy of HDR) linked through next to an indirect MBUF
    ① Copy HDR to a direct MBUF
    ② "Attach" an indirect MBUF to PKT and make it point to the correct payload part
    ③ Reduce the reference counter of PKT's MBUF by 1
      • PKT is freed automatically when all indirect MBUFs are freed
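The three steps above can be mimicked with a small reference-counted buffer in plain C. This is only a model of the idea, not DPDK's MBUF API: `pkt_buf`, `gso_seg`, `gso_split()` and `seg_free()` are hypothetical names, and the "indirect" part is reduced to a pointer, an offset and a length.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define HDR_MAX 64  /* arbitrary header room for the sketch */

/* Stand-in for the MBUF of the packet to segment. */
struct pkt_buf {
    uint8_t *data;
    uint32_t len;
    int refcnt;
};

/* Stand-in for a "two-part" GSO segment. */
struct gso_seg {
    uint8_t hdr[HDR_MAX];   /* "direct" part: private copy of the header */
    uint16_t hdr_len;
    struct pkt_buf *src;    /* "indirect" part: points into the packet */
    uint32_t off, len;
};

static struct pkt_buf *pkt_alloc(const uint8_t *bytes, uint32_t len)
{
    struct pkt_buf *p = malloc(sizeof(*p));
    if (p == NULL)
        return NULL;
    p->data = malloc(len);
    if (p->data == NULL) {
        free(p);
        return NULL;
    }
    memcpy(p->data, bytes, len);
    p->len = len;
    p->refcnt = 1;          /* the caller's reference */
    return p;
}

/* Drops one reference; returns 1 if the packet was freed. */
static int pkt_release(struct pkt_buf *p)
{
    if (--p->refcnt > 0)
        return 0;
    free(p->data);
    free(p);
    return 1;
}

/* Splits pkt into segments of at most seg_payload payload bytes without
 * copying payload. Returns the segment count, or -1 on invalid input. */
static int gso_split(struct pkt_buf *pkt, uint16_t hdr_len,
                     uint16_t seg_payload, struct gso_seg *segs, int max_segs)
{
    assert(pkt != NULL && segs != NULL);
    if (seg_payload == 0 || hdr_len > HDR_MAX || hdr_len >= pkt->len)
        return -1;

    uint32_t payload = pkt->len - hdr_len;
    uint32_t needed = (payload + seg_payload - 1) / seg_payload;
    if (max_segs < 0 || needed > (uint32_t)max_segs)
        return -1;

    uint32_t off = hdr_len;
    for (uint32_t i = 0; i < needed; i++) {
        uint32_t left = pkt->len - off;
        struct gso_seg *s = &segs[i];

        memcpy(s->hdr, pkt->data, hdr_len);  /* step 1: copy the header */
        s->hdr_len = hdr_len;
        s->src = pkt;                        /* step 2: attach to PKT */
        s->off = off;
        s->len = left < seg_payload ? left : seg_payload;
        pkt->refcnt++;
        off += s->len;
    }
    pkt_release(pkt);  /* step 3: drop the original reference */
    return (int)needed;
}

/* Frees one segment; returns 1 if this also freed the packet. */
static int seg_free(struct gso_seg *s)
{
    int freed = pkt_release(s->src);
    s->src = NULL;
    return freed;
}
```

After `gso_split()` only the segments keep the packet alive, so the payload buffer is released when the last segment is freed, which mirrors the note under step ③.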
  • 19. Experiment: Physical Topology
    Diagram labels: Server 0; host kernel; switch (testpmd); VM; NIC 0; NIC 1; vhost-user; virtio-net; iperf (two instances); TCP/IPv4 packets
    Legend: logically connected; physically connected
  • 20. Experiment: Setup
    GRO test (diagram labels): iperf client in the host kernel; iperf server in the VM; testpmd with GRO; NIC 0; NIC 1; vhost-user; virtio-net; small TCP/IPv4 packets
    GSO test (diagram labels): iperf server in the host kernel; iperf client in the VM; testpmd with GSO; NIC 0; NIC 1; vhost-user; virtio-net; large TCP/IPv4 packets
    CPU: Intel(R) Xeon(R) CPU E5-2699 v4 @ 2.20GHz
    NIC: Ethernet Controller XL710 for 40GbE QSFP+
    Compared: DPDK GRO, DPDK GSO, Linux GRO, Linux GSO
  • 21. Experiment: GRO Performance
    Figure 1. Iperf throughput improvement of DPDK-GRO over No-GRO (y-axis: speedup; x-axis: 1, 2 and 4 TCP connections)
    Figure 2. Iperf throughput improvement of DPDK-GRO over Linux-GRO (y-axis: speedup; x-axis: 1, 2 and 4 TCP connections)
    Highlighted speedups: 2.7x, 1.9x
  • 22. Experiment: GSO Performance
    Figure 1. Iperf throughput improvement of DPDK-GSO over No-GSO (y-axis: speedup; x-axis: 1, 2 and 4 TCP connections)
    Figure 2. Iperf throughput improvement of DPDK-GSO over Linux-GSO (y-axis: speedup; x-axis: 1, 2 and 4 TCP connections)
    Highlighted speedups: 2.2x, 1.5x