11. ...
[Diagram labels: Ring Buffer; RX Q 1 ... RX Q n; napi_poll(); CPU 1 ... CPU n; enqueue_to_backlog(); CPU Q 1 ... CPU Q n]
27. Page Ref Count
● Tracking the number of users of the page
● The possible page ref count (ideally):
– 1: The whole page is available.
– 2: One half of the page is in use.
– 3: The whole page is in use.
28. Page Count Operations
static inline void set_page_count(struct page *page, int v)
{
	atomic_set(&page->_refcount, v);
	if (page_ref_tracepoint_active(__tracepoint_page_ref_set))
		__page_ref_set(page, v);
}

static inline void page_ref_add(struct page *page, int nr)
{
	atomic_add(nr, &page->_refcount);
	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
		__page_ref_mod(page, nr);
}

static inline void page_ref_sub(struct page *page, int nr)
{
	atomic_sub(nr, &page->_refcount);
	if (page_ref_tracepoint_active(__tracepoint_page_ref_mod))
		__page_ref_mod(page, -nr);
}
30. Adjusted Ref Count (1/3)
● A locally maintained pagecnt_bias for the RX page
● Initial value of pagecnt_bias: 1
● adj_pagecnt = pagecnt - pagecnt_bias
– 0: The whole page is available.
– 1: One half of the page is in use.
– 2: The whole page is in use.
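
The adjusted count above can be sketched as a small userspace model. This is only an illustration: `struct rx_page_model`, `adj_pagecnt()`, and `adj_pagecnt_state()` are hypothetical names, not ixgbe or kernel symbols, and a plain `int` stands in for the atomic page ref count.

```c
#include <assert.h>

/* Hypothetical userspace model of one RX page; not the ixgbe structures. */
struct rx_page_model {
	int pagecnt;                 /* stands in for the page's ref count */
	unsigned short pagecnt_bias; /* references the driver holds locally */
};

/* adj_pagecnt = pagecnt - pagecnt_bias */
static inline int adj_pagecnt(const struct rx_page_model *p)
{
	/* In this model the driver never holds more than the total count. */
	assert(p->pagecnt >= (int)p->pagecnt_bias);
	return p->pagecnt - (int)p->pagecnt_bias;
}

/* Describe the ideal adjusted counts listed on the slide. */
static inline const char *adj_pagecnt_state(int adj)
{
	switch (adj) {
	case 0:
		return "whole page available";
	case 1:
		return "one half in use";
	case 2:
		return "whole page in use";
	default:
		return "unexpected";
	}
}
```

With the initial values `pagecnt = 1` and `pagecnt_bias = 1`, the model reports an adjusted count of 0, i.e. the whole page is available.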
31. Adjusted Ref Count (2/3)
● Harvesting a packet: pagecnt_bias--
● Recycling the page: pagecnt_bias++
– The XDP program returns XDP_DROP or an error.
– The packet is small enough to be copied into the allocated skb.
– The packet fails to be packed into an skb.
● If pagecnt_bias == 0, set pagecnt_bias to USHRT_MAX and add USHRT_MAX to pagecnt.
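
The three rules above can be modelled in userspace as follows. This is a sketch under stated assumptions, not the driver code: `struct rx_buf_model` and the `rx_buf_*()` helpers are hypothetical names, and a plain `int` replaces the atomic page ref count.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical userspace model of an RX buffer's counters. */
struct rx_buf_model {
	int pagecnt;                 /* stands in for the page ref count */
	unsigned short pagecnt_bias; /* references still held by the driver */
};

static inline void rx_buf_init(struct rx_buf_model *b)
{
	b->pagecnt = 1;
	b->pagecnt_bias = 1; /* initial value from the previous slide */
}

/* Harvesting a packet: one locally held reference goes with the packet. */
static inline void rx_buf_harvest(struct rx_buf_model *b)
{
	assert(b->pagecnt_bias > 0);
	b->pagecnt_bias--;
}

/* Recycling the page: the driver takes that reference back. */
static inline void rx_buf_recycle(struct rx_buf_model *b)
{
	b->pagecnt_bias++;
}

/* Restock when the local references are exhausted. */
static inline void rx_buf_restock(struct rx_buf_model *b)
{
	if (b->pagecnt_bias == 0) {
		b->pagecnt += USHRT_MAX;
		b->pagecnt_bias = USHRT_MAX;
	}
}

static inline int rx_buf_adj_pagecnt(const struct rx_buf_model *b)
{
	return b->pagecnt - (int)b->pagecnt_bias;
}
```

Note that restocking adds the same amount to both counters, so the adjusted count is unchanged by it; only harvesting and recycling move it.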
32. Adjusted Ref Count (3/3)
● Packet consumed: pagecnt--
– The packet is consumed by the network stack.
– XDP_TX or XDP_REDIRECT is completed.
● Releasing the page: pagecnt -= pagecnt_bias
– With the help of __page_frag_cache_drain()
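
A minimal userspace sketch of these two operations follows. The names `struct rx_page_refs`, `rx_page_packet_consumed()`, and `rx_page_release()` are hypothetical; in the driver the release step goes through __page_frag_cache_drain(), which this model only approximates by returning the remaining count.

```c
#include <assert.h>

/* Hypothetical userspace model; a plain int replaces the atomic ref count. */
struct rx_page_refs {
	int pagecnt;                 /* stands in for the page ref count */
	unsigned short pagecnt_bias; /* references still held by the driver */
};

/* Packet consumed: the consumer drops its single reference. */
static inline void rx_page_packet_consumed(struct rx_page_refs *r)
{
	assert(r->pagecnt > 0);
	r->pagecnt--;
}

/*
 * Releasing the page: the driver drops all of its locally held
 * references in one step. Returns the remaining count; 0 means no
 * user is left and the page could be freed.
 */
static inline int rx_page_release(struct rx_page_refs *r)
{
	assert(r->pagecnt >= (int)r->pagecnt_bias);
	r->pagecnt -= r->pagecnt_bias;
	r->pagecnt_bias = 0;
	return r->pagecnt;
}
```

If packets are still outstanding when the driver releases the page, the count stays above zero until each of them has been consumed.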
38. Incoming New Features
● XDP for ixgbevf (linux-next)
– ixgbe blocks XDP if SR-IOV is enabled.
● XDP redirect memory return API (net-next)
– Managing pages across drivers
– Adopted by ixgbe, i40e, mlx5, tuntap, and virtio_net
– Preparing for the AF_XDP zero-copy patch set
– ixgbe tweaked the page ref counting scheme for the new API.
42. References
Linux kernel v4.15
https://github.com/torvalds/linux/tree/v4.15/drivers/net/ethernet/intel/ixgbe
[0/5] Enable XDP for ixgbevf
http://patchwork.ozlabs.org/cover/887197/
[net-next V11 PATCH 00/17] XDP redirect memory return API
https://www.spinics.net/lists/netdev/msg495995.html
ixgbe: tweak page counting for XDP_REDIRECT
https://patchwork.ozlabs.org/patch/889261/
Monitoring and Tuning the Linux Networking Stack: Sending Data
https://blog.packagecloud.io/eng/2017/02/06/monitoring-tuning-linux-networking-stack-sending-data/
Monitoring and Tuning the Linux Networking Stack: Receiving Data
https://blog.packagecloud.io/eng/2016/06/22/monitoring-tuning-linux-networking-stack-receiving-data/