Understanding DPDK
Description of techniques used to achieve high throughput on commodity hardware
How fast does SW have to work?
14.88 million 64-byte packets per second on a 10G interface
1.8 GHz -> 1 cycle = 0.55 ns
1 packet -> 67.2 ns = about 120 clock cycles
Minimum-size frame on the wire: 84 bytes
IFG (12) | Preamble (8) | DST MAC, SRC MAC, Type, Payload (60) | CRC (4)
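The figures on this slide can be re-derived with a few lines of arithmetic. This is a minimal sketch, assuming the 84 bytes on the wire break down as 12 (IFG) + 8 (preamble) + 60 (headers and payload) + 4 (CRC); the function names are illustrative only.

```python
# Re-derive the line-rate budget for minimum-size frames on a 10G link.
LINK_BPS = 10_000_000_000          # 10 Gbit/s
WIRE_BYTES = 12 + 8 + 60 + 4       # assumed breakdown of the 84 bytes per frame
CPU_HZ = 1_800_000_000             # 1.8 GHz core, as on the slide


def packets_per_second(link_bps, wire_bytes):
    """Frames per second the link can carry back to back."""
    return link_bps / (wire_bytes * 8)


def ns_per_packet(link_bps, wire_bytes):
    """Time one frame occupies the wire, in nanoseconds."""
    return wire_bytes * 8 / link_bps * 1e9


def cycles_per_packet(link_bps, wire_bytes, cpu_hz):
    """CPU cycles available per frame at line rate."""
    return wire_bytes * 8 / link_bps * cpu_hz


if __name__ == "__main__":
    print(f"{packets_per_second(LINK_BPS, WIRE_BYTES) / 1e6:.2f} Mpps")
    print(f"{ns_per_packet(LINK_BPS, WIRE_BYTES):.1f} ns per packet")
    print(f"{cycles_per_packet(LINK_BPS, WIRE_BYTES, CPU_HZ):.0f} cycles per packet")
```

The cycle budget is tight enough that a single RAM access (200 ns on the next slide) already exceeds it.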
Comparative speed values
CPU to memory speed = 6-8 GBytes/s
PCI-Express x16 speed = 5 GBytes/s
Access to RAM = 200 ns
Access to L3 cache = 4 ns
Context switch ~= 1000 ns (3.2 GHz)
Packet processing in Linux
User space: App
Kernel space: Socket, ring buffers, driver
NIC: RX/TX queues
Linux kernel overhead
System calls
Context switching on blocking I/O
Data copying from kernel to user space
Interrupt handling in kernel
Expense of sendto
Function       Activity                               Time (ns)
sendto         system call                            96
sosend_dgram   lock sock_buff, alloc mbuf, copy in    137
udp_output     UDP header setup                       57
ip_output      route lookup, IP header setup          198
ether_output   MAC lookup, MAC header setup           162
ixgbe_xmit     device programming                     220
Total                                                 950
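To put the 950 ns total in perspective, the sketch below compares it with the 67.2 ns per-packet budget quoted earlier for a 10G link. It assumes one core sends packets back to back with no batching; the names are illustrative.

```python
# Compare the per-packet cost of sendto with the 10G line-rate budget.
SENDTO_TOTAL_NS = 950      # total from the table above
LINE_RATE_BUDGET_NS = 67.2 # per-packet budget at 14.88 Mpps


def max_pps(ns_per_packet):
    """Upper bound on packets per second if each packet costs ns_per_packet."""
    return 1e9 / ns_per_packet


def slowdown(cost_ns, budget_ns):
    """How many times the cost exceeds the budget."""
    return cost_ns / budget_ns


if __name__ == "__main__":
    print(f"{max_pps(SENDTO_TOTAL_NS) / 1e6:.2f} Mpps per core")
    print(f"{slowdown(SENDTO_TOTAL_NS, LINE_RATE_BUDGET_NS):.1f}x over budget")
```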
Packet processing with DPDK
User space: App, DPDK, ring buffers
Kernel space: UIO driver
NIC: RX/TX queues
Updating a register in Linux
User space: ioctl()
Kernel space: syscall -> VFS -> copy_from_user() -> iowrite()
HW: Register
Updating a register with DPDK
User space: assign (a plain store to the memory-mapped register)
HW: Register
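A rough illustration of the "assign" path: once a device region is mapped into the process, a register update is an ordinary memory store with no system call. In this sketch an anonymous mapping stands in for the device memory, and REG_OFFSET is a hypothetical register offset, not one taken from any real NIC.

```python
import mmap
import struct

REG_OFFSET = 0x18  # hypothetical register offset, for illustration only


def write_reg32(region, offset, value):
    """Store a 32-bit little-endian value directly into the mapped region."""
    struct.pack_into("<I", region, offset, value)


def read_reg32(region, offset):
    """Load a 32-bit little-endian value from the mapped region."""
    return struct.unpack_from("<I", region, offset)[0]


if __name__ == "__main__":
    # Anonymous memory stands in for a mapped device BAR (assumption).
    region = mmap.mmap(-1, mmap.PAGESIZE)
    write_reg32(region, REG_OFFSET, 5)
    print(read_reg32(region, REG_OFFSET))
    region.close()
```

In C the same store would go through a volatile pointer into the mapped region; Python is used here only to show that no kernel call sits between the application and the write.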
What is used inside DPDK?
Processor affinity (separate cores)
Huge pages (no swap, TLB)
UIO (no copying from kernel)
Polling (no interrupt overhead)
Lockless synchronization (avoid waiting)
Batch packet handling
SSE, NUMA awareness
Linux default scheduling
[Diagram: threads t1, t2, t3, t4 scheduled across Core 0, Core 1, Core 2, Core 3]
How to isolate a core for a process
To diagnose, use top: run "top", press "f", then press "j"
Before boot, use the isolcpus kernel parameter: "isolcpus=2,4,6"
After boot, use cpuset: "cset shield -c 1-3", "cset shield -k on"
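Isolating cores keeps other tasks away; the process itself still has to be placed on an isolated core. A minimal sketch of that second half, assuming Linux and using the standard library's scheduler-affinity calls; pin_to_cpu is an illustrative helper, not a DPDK function.

```python
import os


def pin_to_cpu(cpu):
    """Restrict the calling process to a single CPU (Linux only).

    Returns the resulting affinity set so the caller can verify it.
    """
    os.sched_setaffinity(0, {cpu})   # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    allowed = sorted(os.sched_getaffinity(0))
    print("allowed CPUs before:", allowed)
    print("allowed CPUs after: ", sorted(pin_to_cpu(allowed[-1])))
```

With "isolcpus=2,4,6" on the kernel command line, passing one of those CPU numbers would give the process a core that the default scheduler leaves alone.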
Run-to-completion model
[Diagram: Core 1 and Core 2 each run one RX/TX thread; Port 1, Port 2]
Pipeline model
[Diagram: an RX thread and a TX thread run on separate cores (Core 1, Core 2) and are connected by a ring; Port 1, Port 2]
Linux paging model
[Diagram: page tables tree. cr3 -> Page Global Directory -> Page Middle Directory -> Page Table -> Page]
[Diagram: TLB. (Virtual page, Offset) -> TLB or Page Table -> (Physical page, Offset) -> RAM]
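The tree walk can be made concrete by splitting a virtual address into per-level indices and a page offset. This sketch assumes the common x86-64 layout with 4 KiB pages and four levels of 9 bits each; the slide's diagram shows three levels, so the "pud" level here is an assumption added for that layout.

```python
PAGE_SHIFT = 12                            # 4 KiB pages: low 12 bits are the offset
INDEX_BITS = 9                             # 512 entries per table
LEVELS = ("pgd", "pud", "pmd", "pte")      # top level first (assumed 4-level layout)


def split_virtual_address(vaddr):
    """Return ({level: index}, offset) for a 48-bit virtual address."""
    offset = vaddr & ((1 << PAGE_SHIFT) - 1)
    vpn = vaddr >> PAGE_SHIFT              # virtual page number
    index_mask = (1 << INDEX_BITS) - 1
    indices = {}
    # The lowest level uses the lowest 9 bits of the page number.
    for i, name in enumerate(reversed(LEVELS)):
        indices[name] = (vpn >> (i * INDEX_BITS)) & index_mask
    return indices, offset


if __name__ == "__main__":
    idx, off = split_virtual_address(0x00007F1234567ABC)
    print(idx, hex(off))
```

A TLB hit skips all of these table lookups, which is why the miss penalty on the next slide matters.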
TLB characteristics
$ cpuid | grep -i tlb
size: 12–4,096 entries
hit time: 0.5–1 clock cycle
miss penalty: 10–100 clock cycles
miss rate: 0.01–1%
It is a very expensive resource!
Solution - Hugepages
Benefit: optimized TLB usage, no swap
Hugepage size = 2 MB
Usage:
mount -t hugetlbfs nodev /mnt/huge
mmap
Library - libhugetlbfs
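The TLB benefit is easy to quantify: each entry covers one page, so larger pages cover more memory with the same number of entries. A small sketch, assuming a hypothetical 64-entry TLB and a 1 GiB packet buffer region; both numbers are illustrative, not taken from the slides.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach_bytes(entries, page_size):
    """Memory that can be addressed without a TLB miss."""
    return entries * page_size


def pages_needed(region_bytes, page_size):
    """Number of pages (and hence translations) needed to map a region."""
    return -(-region_bytes // page_size)   # ceiling division


if __name__ == "__main__":
    entries = 64                            # hypothetical TLB size
    for name, size in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB)):
        reach = tlb_reach_bytes(entries, size)
        pages = pages_needed(1 * GIB, size)
        print(f"{name} pages: reach {reach // KIB} KiB, {pages} pages for 1 GiB")
```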
Lockless ring design
Writer can preempt writer and reader
Reader cannot preempt writer
Reader and writer can work simultaneously on different cores
Barrier
CAS operation
Bulk queue/dequeue
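The head/tail bookkeeping behind such a ring can be sketched as follows. This is a single-producer, single-consumer model only, with illustrative class and method names: the multi-producer case would replace the plain update of prod_head with a compare-and-swap loop, and a real implementation needs memory barriers that Python does not expose. The usable capacity of size - 1 is an assumption of this sketch.

```python
U32 = 0xFFFFFFFF  # indices are free-running 32-bit counters


class Ring:
    def __init__(self, size):
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.slots = [None] * size
        self.prod_head = self.prod_tail = 0
        self.cons_head = self.cons_tail = 0

    def enqueue_bulk(self, objs):
        """Enqueue all objects or none; returns the number enqueued."""
        n = len(objs)
        free = (self.mask + self.cons_tail - self.prod_head) & U32
        if n > free:
            return 0
        head = self.prod_head
        prod_next = (head + n) & U32
        self.prod_head = prod_next          # reserve slots (CAS for multi-producer)
        for i, obj in enumerate(objs):
            self.slots[(head + i) & self.mask] = obj
        self.prod_tail = prod_next          # publish to the consumer
        return n

    def dequeue_bulk(self, n):
        """Dequeue exactly n objects, or return an empty list."""
        avail = (self.prod_tail - self.cons_head) & U32
        if n > avail:
            return []
        head = self.cons_head
        cons_next = (head + n) & U32
        self.cons_head = cons_next          # reserve entries
        out = [self.slots[(head + i) & self.mask] for i in range(n)]
        self.cons_tail = cons_next          # release slots to the producer
        return out


if __name__ == "__main__":
    ring = Ring(4)
    print(ring.enqueue_bulk(["a", "b", "c"]))
    print(ring.dequeue_bulk(2))
```

The producer only reads cons_tail and the consumer only reads prod_tail, which is what lets the two sides run on different cores without a lock.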
Lockless ring (Single Producer)
[Diagram in 3 steps: positions of cons_head, cons_tail, prod_head, prod_tail and the local prod_next while one producer enqueues]
Lockless ring (Single Consumer)
[Diagram in 3 steps: positions of cons_head, cons_tail, prod_head, prod_tail and the local cons_next while one consumer dequeues]
Lockless ring (Multiple Producers)
[Diagram in 5 steps: positions of cons_head, cons_tail, prod_head, prod_tail and the per-producer locals prod_next1 and prod_next2 while two producers enqueue concurrently]
Kernel space network driver
[Diagram: App (user space); IP stack and Driver (kernel space); NIC. Labelled flows: Data, Desc, Config, Interrupts]
UIO
"The most important devices can't be handled in user space, including, but not limited to, network interfaces and block devices." - LDD3
UIO
User space: App, user-space (US) driver
Interface: sysfs, /dev/uioX (epoll(), mmap())
Kernel space: UIO framework, driver
NIC
Access to device from user space
Configuration registers: Vendor Id, Device Id, Command, Status, Revision Id, ..., BAR0 (Mem), BAR1, BAR2 (IO), BAR3, BAR4, BAR5
I/O and memory regions:
/sys/class/uio/uioX/maps/mapX
/sys/class/uio/uioX/portio/portX
/dev/uioX -> mmap (offset)
/sys/bus/pci/devices
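A sketch of how the sysfs entries and the mmap offset fit together, assuming the convention from the Userspace I/O HOWTO that mapping N of a device is selected by passing N times the page size as the mmap offset. The helper names are illustrative, and map_region only works on a Linux host that actually has a UIO device bound.

```python
import mmap
import os


def uio_map_size_path(uio_index, map_index):
    """sysfs file holding the size of one memory mapping."""
    return f"/sys/class/uio/uio{uio_index}/maps/map{map_index}/size"


def uio_map_offset(map_index, page_size=mmap.PAGESIZE):
    """mmap offset that selects mapping N on /dev/uioX (N * page size)."""
    return map_index * page_size


def map_region(uio_index, map_index):
    """Map one device memory region into this process (needs a real UIO device)."""
    with open(uio_map_size_path(uio_index, map_index)) as f:
        size = int(f.read().strip(), 0)     # sysfs reports the size as 0x... hex
    fd = os.open(f"/dev/uio{uio_index}", os.O_RDWR | os.O_SYNC)
    try:
        return mmap.mmap(fd, size, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=uio_map_offset(map_index))
    finally:
        os.close(fd)                        # the mapping keeps its own reference


if __name__ == "__main__":
    print(uio_map_size_path(0, 0), uio_map_offset(1))
```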
DMA RX
[Diagram: host memory (descriptor ring, packet memory) and NIC memory (RX queue, RX FIFO). Steps shown: Update RDT, DMA descriptor(s), DMA packet, DMA descriptors]
DMA TX
[Diagram: host memory (descriptor ring, packet memory) and NIC memory (TX queue, TX FIFO). Steps shown: Update TDT, DMA descriptor(s), DMA packet, DMA descriptors]
Receive from SW side
[Diagram: RX descriptor ring; each descriptor carries a DD flag and an addr pointing to an mbuf (mbuf1, mbuf2); RDH and RDT mark head and tail]
RDBA = 0
RDLEN = 6
RDH = 1
RDT = 5
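The software side of receive amounts to polling the DD flag, taking the packet, recycling the descriptor and moving the tail. The sketch below models that loop with a hypothetical RxRing class; hw_receive stands in for the NIC, and the rule that the tail points at the most recently recycled descriptor is an assumption modelled on common Intel-style drivers.

```python
class RxRing:
    def __init__(self, size):
        self.size = size                    # plays the role of RDLEN
        self.desc = [{"addr": f"mbuf{i}", "dd": False, "data": None}
                     for i in range(size)]
        self.rdh = 0                        # next descriptor the NIC fills
        self.rdt = size - 1                 # NIC may use descriptors up to, not including, rdt
        self.next_to_clean = 0              # software's read position

    def hw_receive(self, packet):
        """Stand-in for the NIC: DMA a packet and set DD on the descriptor."""
        if self.rdh == self.rdt:
            return False                    # no free descriptors, packet dropped
        d = self.desc[self.rdh]
        d["data"], d["dd"] = packet, True
        self.rdh = (self.rdh + 1) % self.size
        return True

    def sw_poll(self, max_burst):
        """Collect up to max_burst completed packets and hand descriptors back."""
        out = []
        while len(out) < max_burst and self.desc[self.next_to_clean]["dd"]:
            d = self.desc[self.next_to_clean]
            out.append(d["data"])
            d["data"], d["dd"] = None, False   # recycle the buffer in place
            self.rdt = self.next_to_clean      # tail update (a register write on real HW)
            self.next_to_clean = (self.next_to_clean + 1) % self.size
        return out


if __name__ == "__main__":
    ring = RxRing(6)
    ring.hw_receive("pkt0")
    print(ring.rdh, ring.rdt, ring.sw_poll(32))
```

With six descriptors and one packet received, the model reaches RDH = 1 and RDT = 5, the same register values the slide shows.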
Transmit from SW side
[Diagram: TX descriptor ring; each descriptor carries a DD flag and an addr pointing to an mbuf (mbuf1, mbuf2); TDH and TDT mark head and tail]
TDBA = 0
TDLEN = 6
TDH = 1
TDT = 5
NUMA
[Diagram: Socket 0 (CPU 0) and Socket 1 (CPU 1), each with cores, a memory controller with its own memory, and an I/O controller with PCI-E links; the two sockets are connected by QPI]
RSS (Receive Side Scaling)
[Diagram: Incoming traffic -> Hash function -> Indirection table -> Queue 0 ... Queue N -> CPU N]
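The RSS path can be sketched as a hash over the flow tuple followed by a table lookup. Everything specific here is an assumption for illustration: CRC32 stands in for the NIC's real hash function, and the 128-entry table with 4 queues is a made-up configuration.

```python
import socket
import struct
import zlib

NUM_QUEUES = 4                                           # hypothetical queue count
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]  # round-robin fill (assumed size)


def flow_hash(src_ip, dst_ip, src_port, dst_port):
    """Hash the IPv4 flow tuple. CRC32 is a stand-in, not the NIC's hash."""
    key = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
           + struct.pack("!HH", src_port, dst_port))
    return zlib.crc32(key)


def select_queue(hash_value, table=INDIRECTION_TABLE):
    """Use the low bits of the hash as an index into the indirection table."""
    return table[hash_value & (len(table) - 1)]


if __name__ == "__main__":
    h = flow_hash("10.0.0.1", "10.0.0.2", 12345, 80)
    print("queue", select_queue(h))
```

Because the hash depends only on the flow tuple, every packet of a flow lands on the same queue and therefore on the same core, while rewriting table entries rebalances load without touching the hash.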
Flow director
[Diagram: Incoming traffic -> Hash function -> Filter table -> Queue 0 ... Queue N -> CPU N, or Drop / Route; Outgoing traffic also feeds the filter table]
Virtualization - SR-IOV
[Diagram: VM1 and VM2 each run a VF driver; the VMM runs the PF driver; the NIC exposes two VFs and a PF joined by a virtual bridge]
Slow path using bifurcated driver
[Diagram: Kernel and DPDK share one NIC; the NIC contains a VF, a PF, a virtual bridge and a filter table]
Slow path using TAP
User space: App, DPDK, ring buffers
Kernel space: TAP device, TCP/IP stack
NIC: RX/TX queues
Slow path using KNI
User space: App, DPDK, ring buffers
Kernel space: KNI device, TCP/IP stack
NIC: RX/TX queues
Application 1 - Traffic generator
[Diagram: x86 HW; user space runs a streams generator and a traffic analyzer, both connected to the DUT]
Application 2 - Router
[Diagram: x86 HW between DUT1 and DUT2; routing table in the kernel, routing table cache in user space]
Application 3 - Middlebox
[Diagram: x86 HW between DUT1 and DUT2; DPI runs in user space]

Building infrastructure with code_ A deep dive into CDK for IaC in Java.pdf
 
Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …Prada Group Reports Strong Growth in First Quarter …
Prada Group Reports Strong Growth in First Quarter …
 

Understanding DPDK

  • 1. Understanding DPDK: a description of the techniques used to achieve high throughput on commodity hardware
  • 2. How fast does SW have to work? 14.88 million 64-byte packets per second on a 10G interface. At 1.8 GHz, 1 cycle = 0.55 ns, so 1 packet -> 67.2 ns ≈ 120 clock cycles. Frame on the wire: IFG (12 bytes), Preamble (8 bytes), DST MAC + SRC MAC + Type + Payload (60 bytes), CRC (4 bytes) = 84 bytes
  • 3. Comparative speed values: CPU to memory speed = 6-8 GBytes/s; PCI-Express x16 speed = 5 GBytes/s; access to RAM = 200 ns; access to L3 cache = 4 ns; context switch ~= 1000 ns (3.2 GHz)
  • 4. Packet processing in Linux [Diagram: App in user space; Socket, Ring buffers and Driver in kernel space; RX/TX queues on the NIC]
  • 5. Linux kernel overhead: system calls; context switching on blocking I/O; data copying from kernel to user space; interrupt handling in kernel
  • 6. Expense of sendto
      Function       Activity                                Time (ns)
      sendto         system call                             96
      sosend_dgram   lock sock_buff, alloc mbuf, copy in     137
      udp_output     UDP header setup                        57
      ip_output      route lookup, IP header setup           198
      ether_output   MAC lookup, MAC header setup            162
      ixgbe_xmit     device programming                      220
      Total                                                  950
  • 7. Packet processing with DPDK [Diagram: App with DPDK and Ring buffers in user space; UIO driver in kernel space; RX/TX queues on the NIC]
  • 8. Updating a register in Linux [Diagram: user space ioctl() -> syscall -> kernel space VFS -> copy_from_user() -> iowrite() -> HW Register]
  • 9. Updating a register with DPDK [Diagram: user space assign -> HW Register]
  • 10. What is used inside DPDK? Processor affinity (separate cores); huge pages (no swap, TLB); UIO (no copying from kernel); polling (no interrupt overhead); lockless synchronization (avoid waiting); batch packet handling; SSE, NUMA awareness
  • 11. Linux default scheduling [Diagram: threads t1, t2, t3 and t4 spread across Core 0, Core 1, Core 2 and Core 3]
  • 12. How to isolate a core for a process. To diagnose, use top: run “top”, press “f”, press “j”. Before boot, use isolcpus: “isolcpus=2,4,6”. After boot, use cpuset: “cset shield -c 1-3”, “cset shield -k on”
  • 13. Run-to-completion model [Diagram: Core 1 and Core 2 each run an RX/TX thread; ports: Port 1, Port 2]
  • 14. Pipeline model [Diagram: an RX thread and a TX thread on separate cores (Core 1, Core 2), connected by a Ring; ports: Port 1, Port 2]
  • 15. Page tables tree: Linux paging model [Diagram: cr3 -> Page Global Directory -> Page Middle Directory -> Page Table -> Page]
  • 17. TLB characteristics ($ cpuid | grep -i tlb): size: 12–4,096 entries; hit time: 0.5–1 clock cycle; miss penalty: 10–100 clock cycles; miss rate: 0.01–1%. It is a very expensive resource!
  • 18. Solution - Hugepages. Benefit: optimized TLB usage, no swap. Hugepage size = 2 MB. Usage: mount -t hugetlbfs nodev /mnt/huge, then mmap. Library: libhugetlbfs
  • 19. Lockless ring design: a writer can preempt a writer and a reader; a reader cannot preempt a writer; reader and writer can work simultaneously on different cores. Uses: barrier, CAS operation, bulk enqueue/dequeue
  • 20. Lockless ring (Single Producer) [Diagram in 3 steps: positions of cons_head, cons_tail, prod_head, prod_tail and prod_next as an enqueue proceeds]
  • 21. Lockless ring (Single Consumer) [Diagram in 3 steps: positions of cons_head, cons_tail, cons_next, prod_head and prod_tail as a dequeue proceeds]
  • 22. Lockless ring (Multiple Producers) [Diagram in 5 steps: positions of cons_head, cons_tail, prod_head, prod_tail, prod_next1 and prod_next2 as two producers enqueue concurrently]
  • 23. Kernel space network driver [Diagram: App in user space; IP stack and Driver in kernel space; NIC; flows labelled Data, Desc, Config and Interrupts]
  • 24. UIO “The most important devices can’t be handled in user space, including, but not limited to, network interfaces and block devices.” - LDD3
  • 25. UIO [Diagram: App and US driver in user space; interface through sysfs and /dev/uioX using epoll() and mmap(); UIO framework and driver in kernel space]
  • 26. Access to device from user space [Diagram: NIC configuration registers (Vendor Id, Device Id, Command, Status, Revision Id, ..., BAR0 (Mem), BAR1, BAR2 (IO), BAR3, BAR4, BAR5) and I/O and memory regions; user-space access paths: /sys/bus/pci/devices, /sys/class/uio/uioX/maps/mapX, /sys/class/uio/uioX/portio/portX, /dev/uioX -> mmap (offset)]
  • 27. DMA RX [Diagram: host memory (descriptor ring, packet memory) and NIC memory (RX queue, RX FIFO); steps labelled Update RDT, DMA descriptor(s), DMA packet, DMA descriptors]
  • 28. DMA TX [Diagram: host memory (descriptor ring, packet memory) and NIC memory (TX queue, TX FIFO); steps labelled Update TDT, DMA descriptor(s), DMA packet, DMA descriptors]
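  • 29. Receive from SW side [Diagram: descriptor ring with DD bits; entries "mbuf1 addr" and "mbuf2 addr" pointing to mbuf1 and mbuf2; RDH and RDT pointers. Register values: RDBA = 0, RDLEN = 6, RDH = 1, RDT = 5]
  • 30. Transmit from SW side [Diagram: descriptor ring with DD bits; entries "mbuf1 addr" and "mbuf2 addr" pointing to mbuf1 and mbuf2; TDH and TDT pointers. Register values: TDBA = 0, TDLEN = 6, TDH = 1, TDT = 5]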
  • 31. NUMA [Diagram: Socket 0 (CPU 0) and Socket 1 (CPU 1), each with Cores, a Memory controller with its Memory, and an I/O controller with PCI-E links; the sockets are connected by QPI]
  • 32. RSS (Receive Side Scaling) [Diagram: Incoming traffic -> Hash function -> Indirection table -> Queue 0 ... Queue N -> CPU N]
  • 33. Flow director [Diagram: Incoming traffic, Hash function, Filter table, Queue 0 ... Queue N, CPU N, Outgoing traffic; filter actions: Drop, Route]
  • 34. Virtualization - SR-IOV [Diagram: NIC with PF, two VFs and a Virtual bridge; VMM with PF driver; VM1 and VM2, each with a VF driver]
  • 35. Slow path using bifurcated driver [Diagram: NIC with Filter table, Virtual bridge, PF and VF; Kernel; DPDK]
  • 36. Slow path using TAP [Diagram: App with DPDK and Ring buffers in user space; TAP device and TCP/IP stack in kernel space; RX/TX queues on the NIC]
  • 37. Slow path using KNI [Diagram: App with DPDK and Ring buffers in user space; KNI device and TCP/IP stack in kernel space; RX/TX queues on the NIC]
  • 38. Application 1 - Traffic generator [Diagram: x86 HW; user space with Streams generator and Traffic analyzer; DUT]
  • 39. Application 2 - Router [Diagram: x86 HW; Routing table in kernel; Routing table cache in user space; DUT1, DUT2]
  • 40. Application 3 - Middlebox [Diagram: x86 HW; user space with DPI; DUT1, DUT2]
  • 41. References: "Device Drivers in User Space"; "Userspace I/O drivers in a realtime context"; "The Userspace I/O HOWTO"; "The anatomy of a PCI/PCI Express kernel driver"; "From Intel® Data Plane Development Kit to Wind River Network Acceleration Platform"; "DPDK Design Tips (Part 1 - RSS)"; "Getting the Best of Both Worlds with Queue Splitting (Bifurcated Driver)"; "Design considerations for efficient network applications with Intel® multi-core processor-based systems on Linux"; "Introduction to Intel Ethernet Flow Director"
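
The per-packet budget on slide 2 follows from the 84 bytes that a minimum-size frame occupies on the wire. The sketch below reproduces that arithmetic in Python. The function name `packet_budget` and its parameters are illustrative only and are not part of DPDK.

```python
def packet_budget(link_bps: float, wire_bytes: int, cpu_hz: float):
    """Return (packets per second, ns per packet, CPU cycles per packet)."""
    pps = link_bps / (wire_bytes * 8)    # every frame occupies wire_bytes on the link
    ns_per_packet = 1e9 / pps
    cycles_per_packet = cpu_hz / pps
    return pps, ns_per_packet, cycles_per_packet


# Inter-frame gap (12) + preamble (8) + 64-byte frame (60 + 4-byte CRC)
WIRE_BYTES = 12 + 8 + 60 + 4

pps, ns, cycles = packet_budget(10e9, WIRE_BYTES, 1.8e9)
print(f"{pps / 1e6:.2f} Mpps, {ns:.1f} ns per packet, about {cycles:.0f} cycles")
```

This gives roughly 14.88 Mpps and 67.2 ns per packet, and about 121 cycles at 1.8 GHz, in line with the slide's rounded figure of 120.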
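
Slides 10 to 12 rely on processor affinity: a packet-processing thread stays on one core instead of being moved around by the scheduler. As a minimal sketch, assuming Linux and Python's standard `os.sched_setaffinity`, a process can pin itself as follows. The helper name `pin_to_core` is hypothetical, and this only sets affinity; it does not keep other tasks off the core the way `isolcpus` or `cset shield` do.

```python
import os


def pin_to_core(core: int) -> set:
    """Restrict the calling process to a single CPU core (Linux only)."""
    allowed = os.sched_getaffinity(0)    # cores the process may currently use
    if core not in allowed:
        raise ValueError(f"core {core} is not in the allowed set {sorted(allowed)}")
    os.sched_setaffinity(0, {core})      # pid 0 means the calling process
    return os.sched_getaffinity(0)


if __name__ == "__main__":
    first = min(os.sched_getaffinity(0))
    print("now restricted to cores:", pin_to_core(first))
```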
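
The hugepage argument on slides 17 and 18 can be made concrete by computing how much memory a TLB can cover. The sketch below assumes a base page size of 4 KiB (the usual x86 default, not stated on the slides) and uses the upper end of the slide's 12–4,096 entry range. The helper names are illustrative.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB


def tlb_reach(entries: int, page_size: int) -> int:
    """Bytes of memory that can be translated without a TLB miss."""
    return entries * page_size


def pages_needed(buffer_size: int, page_size: int) -> int:
    """Number of pages (and so translations) needed to map a buffer."""
    return -(-buffer_size // page_size)  # ceiling division


for name, page in (("4 KiB", 4 * KIB), ("2 MiB", 2 * MIB)):
    print(name, "pages:", tlb_reach(4096, page) // MIB, "MiB reach,",
          pages_needed(1 * GIB, page), "pages per GiB")
```

With 4,096 entries the reach grows from 16 MiB to 8 GiB, and a 1 GiB packet buffer pool needs 512 translations instead of 262,144.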
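
The three-step diagrams on slides 20 and 21 track four indexes: `prod_head`, `prod_tail`, `cons_head` and `cons_tail`. The following is a single-threaded Python model of that bookkeeping, not a concurrent implementation: in a real multi-producer ring the head update would be a CAS loop and a memory barrier would precede the tail update. The class name `RingModel` is hypothetical, and keeping one slot unused is an assumption of this sketch.

```python
class RingModel:
    """Single-threaded model of head/tail index updates in a lockless ring."""

    def __init__(self, size: int):
        if size < 2 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.mask = size - 1
        self.slots = [None] * size
        self.prod_head = self.prod_tail = 0
        self.cons_head = self.cons_tail = 0

    def enqueue_bulk(self, items) -> bool:
        free = self.mask + self.cons_tail - self.prod_head  # one slot stays unused
        if len(items) > free:
            return False                     # bulk semantics: all or nothing
        old_head = self.prod_head
        prod_next = old_head + len(items)
        self.prod_head = prod_next           # step 1: reserve (CAS in multi-producer mode)
        for i, item in enumerate(items):     # step 2: copy objects into reserved slots
            self.slots[(old_head + i) & self.mask] = item
        self.prod_tail = prod_next           # step 3: publish to consumers
        return True

    def dequeue_bulk(self, n: int):
        if n > self.prod_tail - self.cons_head:
            return None                      # not enough published entries
        old_head = self.cons_head
        cons_next = old_head + n
        self.cons_head = cons_next           # step 1: reserve entries to read
        out = [self.slots[(old_head + i) & self.mask] for i in range(n)]  # step 2: copy out
        self.cons_tail = cons_next           # step 3: release slots to producers
        return out


ring = RingModel(8)
ring.enqueue_bulk(["pkt0", "pkt1", "pkt2"])
print(ring.dequeue_bulk(2), ring.prod_tail, ring.cons_tail)
```

Consumers only read up to `prod_tail` and producers only write up to `cons_tail`, which is why a reader and a writer can work on different cores at the same time without a lock.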
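
Slide 32 describes RSS as a hash function feeding an indirection table that selects a receive queue. The sketch below models that path in Python. It assumes a Toeplitz hash over the IPv4/TCP 4-tuple, which is a common choice for RSS but is not named on the slide. The 40-byte key is only an example, and the 128-entry table and four queues are assumptions.

```python
import socket
import struct

# Example 40-byte hash key; any key of this length works for the sketch.
RSS_KEY = bytes.fromhex(
    "6d5a56da255b0ec24167253d43a38fb0d0ca2bcbae7b30b4"
    "77cb2da38030f20c6a42b73bbeac01fa"
)


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """32-bit Toeplitz hash: XOR a sliding 32-bit key window for every set input bit."""
    key_bits = len(key) * 8
    if key_bits < len(data) * 8 + 32:
        raise ValueError("key too short for this input")
    key_int = int.from_bytes(key, "big")
    result = 0
    for bit in range(len(data) * 8):
        if data[bit // 8] & (0x80 >> (bit % 8)):   # input bits, most significant first
            result ^= (key_int >> (key_bits - 32 - bit)) & 0xFFFFFFFF
    return result


def rss_queue(src_ip, dst_ip, src_port, dst_port, table) -> int:
    """Map a flow to a queue: hash the 4-tuple, then index the indirection table."""
    flow = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
            + struct.pack("!HH", src_port, dst_port))
    return table[toeplitz_hash(RSS_KEY, flow) % len(table)]


NUM_QUEUES = 4                                            # assumed queue count
INDIRECTION_TABLE = [i % NUM_QUEUES for i in range(128)]  # assumed table size

print(rss_queue("10.0.0.1", "10.0.0.2", 1234, 80, INDIRECTION_TABLE))
```

Because the hash depends only on the flow tuple, every packet of a flow lands on the same queue and therefore on the same core, while rewriting the indirection table rebalances flows without touching the hash.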