This talk will introduce XDP (eXpress Data Path), and explain how this is essentially a new (programmable) network layer in-front of the existing network stack. Then it will dive into the details of the new XDP redirect feature, which goes beyond forwarding packets out other NIC devices.
The eXpress Data Path (XDP) has been gradually integrated into the Linux kernel over several releases. XDP offers fast and programmable packet processing in kernel context. The operating system kernel itself provides a safe execution environment for custom packet processing applications, in form of eBPF programs, executed in device driver context. XDP provides a fully integrated solution working in concert with the kernel’s networking stack. Applications are written in higher level languages such as C and compiled via LLVM into eBPF bytecode which the kernel statically analyses for safety, and JIT translates into native instructions. This is an alternative approach, compared to kernel bypass mechanisms (like DPDK and netmap).
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long term Linux kernel contributor, who has been leading the kernel team at Netronome for the last two years. Jakub’s major contributions include the creation of BPF hardware offload mechanisms in the kernel and bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background of computer networks including academic research with DDoS. David has expertise in the areas of Linux architecture and computer programming. David has a Masters Degree in Electrical, Electronic Engineering at Queen’s University Belfast and continues as a PhD student studying Emerging Application Layer DDoS threats.
The first version of eBPF hardware offload was merged into the Linux kernel in October 2016 and became part of Linux v4.9. For the last two years the project has been growing and evolving to integrate more closely with the core kernel infrastructure and enable more advanced use cases. This talk will explain the internals of the kernel architecture of the offload and how it allows seamless execution of unmodified eBPF datapaths in HW.
Comprehensive XDP Offload-handling the Edge CasesNetronome
While XDP is less complex to offload than other forms of kernel functionality due to the fact that it sits at the bottom of the stack, there are a number of items that lead to complexity when dealing with XDP offload. This talk will explore some of the ideas around how to implement these concepts as well as share some of the results we have seen while implementing offload on a 32 bit architecture.
In this talk we discuss the mechanisms of utilizing the eBPF language to perform hardware accelerated network packet manipulation and filtering. P4 programs can be compiled into eBPF scripts for offload in the Linux kernel using the Traffic Classifier (TC) subsystem. We demonstrate how, using eBPF as an intermediate language, it has been possible to extend the TC to either Just In Time (JIT) compile eBPF code to x86 assembler for software offload or to IXP byte code for execution in a trusted hardware environment within the Netronome Agilio intelligent server adapter. We finish by encouraging the audience to experiment with their own eBPF applications within the TC hardware accelerated system. The TC kernel patches are available on the Linux Kernel Networking mailing list as a Request For Comment (RFC) contribution.
Dinan Gunawardena, Director, Software Engineering, Netronome
Dinan Gunawardena is a Software Director focusing on running the driver team at Netronome. Previously, Dinan founded a software startup and was a Senior Research Engineer within the Operating Systems and Networking Group at Microsoft Research for 12 years, shipping technology in several versions of Microsoft Windows and the Bing Search Engine. Dinan has received over 20 patents and is a Chartered Software Engineer. Dinan has a Masters in Computer Science from University of Cambridge and a M.B.A. from WBS.
Jakub Kicinski, Software Engineering, Netronome
Jakub Kicinski is a Software Engineer specializing in the Linux Kernel drivers for Netronome SmartNICs. Jakub has previously worked as an intern for Intel Corporation. Jakub is also a researcher with expertise in Linux kernel. Experience in application development on complex multi-CPU and FPGA platforms. He is interested in high-performance software exploiting hardware capabilities and is passionate about networking. Jakub has a Masters in Computer Science from Gdansk University of Technology.
A Kernel of Truth: Intrusion Detection and Attestation with eBPFoholiab
"Attestation is hard" is something you might hear from security researchers tracking nation states and APTs, but it's actually pretty true for most network-connected systems!
Modern deployment methodologies mean that disparate teams create workloads for shared worker-hosts (ranging from Jenkins to Kubernetes and all the other orchestrators and CI tools in-between), meaning that at any given moment your hosts could be running any one of a number of services, connecting to who-knows-what on the internet.
So when your network-based intrusion detection system (IDS) opaquely declares that one of these machines has made an "anomalous" network connection, how do you even determine if it's business as usual? Sure you can log on to the host to try and figure it out, but (in case you hadn't noticed) computers are pretty fast these days, and once the connection is closed it might as well not have happened... Assuming it wasn't actually a reverse shell...
At Yelp we turned to the Linux kernel to tell us whodunit! Utilizing the Linux kernel's eBPF subsystem - an in-kernel VM with syscall hooking capabilities - we're able to aggregate metadata about the calling process tree for any internet-bound TCP connection by filtering IPs and ports in-kernel and enriching with process tree information in userland. The result is "pidtree-bcc": a supplementary IDS. Now whenever there's an alert for a suspicious connection, we just search for it in our SIEM (spoiler alert: it's nearly always an engineer doing something "innovative")! And the cherry on top? It's stupid fast with negligible overhead, creating a much higher signal-to-noise ratio than the kernels firehose-like audit subsystems.
This talk will look at how you can tune the signal-to-noise ratio of your IDS by making it reflect your business logic and common usage patterns, get more work done by reducing MTTR for false positives, use eBPF and the kernel to do all the hard work for you, accidentally load test your new IDS by not filtering all RFC-1918 addresses, and abuse Docker to get to production ASAP!
As well as looking at some of the technologies that the kernel puts at your disposal, this talk will also tell pidtree-bcc's road from hackathon project to production system and how focus on demonstrating business value early on allowed the organization to give us buy-in to build and deploy a brand new project from scratch.
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long term Linux kernel contributor, who has been leading the kernel team at Netronome for the last two years. Jakub’s major contributions include the creation of BPF hardware offload mechanisms in the kernel and bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background of computer networks including academic research with DDoS. David has expertise in the areas of Linux architecture and computer programming. David has a Masters Degree in Electrical, Electronic Engineering at Queen’s University Belfast and continues as a PhD student studying Emerging Application Layer DDoS threats.
The first version of eBPF hardware offload was merged into the Linux kernel in October 2016 and became part of Linux v4.9. For the last two years the project has been growing and evolving to integrate more closely with the core kernel infrastructure and enable more advanced use cases. This talk will explain the internals of the kernel architecture of the offload and how it allows seamless execution of unmodified eBPF datapaths in HW.
Comprehensive XDP Offload-handling the Edge CasesNetronome
While XDP is less complex to offload than other forms of kernel functionality due to the fact that it sits at the bottom of the stack, there are a number of items that lead to complexity when dealing with XDP offload. This talk will explore some of the ideas around how to implement these concepts as well as share some of the results we have seen while implementing offload on a 32 bit architecture.
In this talk we discuss the mechanisms of utilizing the eBPF language to perform hardware accelerated network packet manipulation and filtering. P4 programs can be compiled into eBPF scripts for offload in the Linux kernel using the Traffic Classifier (TC) subsystem. We demonstrate how, using eBPF as an intermediate language, it has been possible to extend the TC to either Just In Time (JIT) compile eBPF code to x86 assembler for software offload or to IXP byte code for execution in a trusted hardware environment within the Netronome Agilio intelligent server adapter. We finish by encouraging the audience to experiment with their own eBPF applications within the TC hardware accelerated system. The TC kernel patches are available on the Linux Kernel Networking mailing list as a Request For Comment (RFC) contribution.
Dinan Gunawardena, Director, Software Engineering, Netronome
Dinan Gunawardena is a Software Director focusing on running the driver team at Netronome. Previously, Dinan founded a software startup and was a Senior Research Engineer within the Operating Systems and Networking Group at Microsoft Research for 12 years, shipping technology in several versions of Microsoft Windows and the Bing Search Engine. Dinan has received over 20 patents and is a Chartered Software Engineer. Dinan has a Masters in Computer Science from University of Cambridge and a M.B.A. from WBS.
Jakub Kicinski, Software Engineering, Netronome
Jakub Kicinski is a Software Engineer specializing in the Linux Kernel drivers for Netronome SmartNICs. Jakub has previously worked as an intern for Intel Corporation. Jakub is also a researcher with expertise in Linux kernel. Experience in application development on complex multi-CPU and FPGA platforms. He is interested in high-performance software exploiting hardware capabilities and is passionate about networking. Jakub has a Masters in Computer Science from Gdansk University of Technology.
A Kernel of Truth: Intrusion Detection and Attestation with eBPFoholiab
"Attestation is hard" is something you might hear from security researchers tracking nation states and APTs, but it's actually pretty true for most network-connected systems!
Modern deployment methodologies mean that disparate teams create workloads for shared worker-hosts (ranging from Jenkins to Kubernetes and all the other orchestrators and CI tools in-between), meaning that at any given moment your hosts could be running any one of a number of services, connecting to who-knows-what on the internet.
So when your network-based intrusion detection system (IDS) opaquely declares that one of these machines has made an "anomalous" network connection, how do you even determine if it's business as usual? Sure you can log on to the host to try and figure it out, but (in case you hadn't noticed) computers are pretty fast these days, and once the connection is closed it might as well not have happened... Assuming it wasn't actually a reverse shell...
At Yelp we turned to the Linux kernel to tell us whodunit! Utilizing the Linux kernel's eBPF subsystem - an in-kernel VM with syscall hooking capabilities - we're able to aggregate metadata about the calling process tree for any internet-bound TCP connection by filtering IPs and ports in-kernel and enriching with process tree information in userland. The result is "pidtree-bcc": a supplementary IDS. Now whenever there's an alert for a suspicious connection, we just search for it in our SIEM (spoiler alert: it's nearly always an engineer doing something "innovative")! And the cherry on top? It's stupid fast with negligible overhead, creating a much higher signal-to-noise ratio than the kernels firehose-like audit subsystems.
This talk will look at how you can tune the signal-to-noise ratio of your IDS by making it reflect your business logic and common usage patterns, get more work done by reducing MTTR for false positives, use eBPF and the kernel to do all the hard work for you, accidentally load test your new IDS by not filtering all RFC-1918 addresses, and abuse Docker to get to production ASAP!
As well as looking at some of the technologies that the kernel puts at your disposal, this talk will also tell pidtree-bcc's road from hackathon project to production system and how focus on demonstrating business value early on allowed the organization to give us buy-in to build and deploy a brand new project from scratch.
This presentation will cover the basics of performance testing. Configuring systems correctly is essential to characterizing the performance of SmartNICs. The configuration of BIOS, CPU allocation, OS and VM parameters will be covered. Also, choices of traffic generators and typical test topologies will be described.
Linux Kernel Cryptographic API and Use CasesKernel TLV
The Linux kernel has a rich and modular cryptographic API that is used extensively by familiar user facing software such as Android. It's also cryptic, badly documented, subject to change and can easily bite you in unexpected and painful ways.
This talk will describe the crypto API, provide some usage example and discuss some of the more interesting in-kernel users, such as DM-Crypt, DM-Verity and the new fie system encryption code.
Gilad Ben-Yossef is a principal software engineer at ARM. He works on the kernel security sub-system and the ARM CryptCell engine. Open source work done by Gilad includes an experiment in integration of network processors in the networking stack, a patch set for reducing the interference caused to user space processes in large multi-core systems by Linux kernel “maintenance” work and on SMP support for the Synopsys Arc processor among others.
Gilad has co-authored O’Reilly’s “Building Embedded Linux Systems” 2nd edition and presented at such venues as Embedded Linux Conference Europe and the Ottawa Linux Symposium, as well as co-founded Hamakor, an Israeli NGO for the advancement for Open Source and Free Software in Israel. When not hacking on kernel code you can find Gilad meditating and making dad jokes on Twitter.
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
This talk will start with a deep dive and hands on examples of BPF, possibly the most promising low level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language to filter raw sockets for tcpdump to the a JITable virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium with the help of BPF can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
This presentation introduces Data Plane Development Kit overview and basics. It is a part of a Network Programming Series.
First, the presentation focuses on the network performance challenges on the modern systems by comparing modern CPUs with modern 10 Gbps ethernet links. Then it touches memory hierarchy and kernel bottlenecks.
The following part explains the main DPDK techniques, like polling, bursts, hugepages and multicore processing.
DPDK overview explains how is the DPDK application is being initialized and run, touches lockless queues (rte_ring), memory pools (rte_mempool), memory buffers (rte_mbuf), hashes (rte_hash), cuckoo hashing, longest prefix match library (rte_lpm), poll mode drivers (PMDs) and kernel NIC interface (KNI).
At the end, there are few DPDK performance tips.
Tags: access time, burst, cache, dpdk, driver, ethernet, hub, hugepage, ip, kernel, lcore, linux, memory, pmd, polling, rss, softswitch, switch, userspace, xeon
DPDK (Data Plane Development Kit) Overview by Rami Rosen
* Background and short history
* Advantages and disadvantages
- Very High speed networking acceleration in L2
- How this acceleration is achieved (hugepages, optimizations)
- rte_kni (and KCP)
- VPP (and FD.io project) , providing routing and switching.
- TLDK (Transport Layer Development Kit, TCP/UDP)
* Anatomy of a simple DPDK application.
* Development and governance model
* Testpmd: DPDK CLI tool
* DDP - Dynamic Device Profiles
Rami Rosen is a Linux Kernel expert, the author of "Linux Kernel Networking", Apress, 2014.
Rami had published two articles about DPDK in the last year:
"Network acceleration with DPDK"
https://lwn.net/Articles/725254/
"Userspace Networking with DPDK"
https://www.linuxjournal.com/content/userspace-networking-dpdk
netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers.
iptables is a user-space application program that allows a system administrator to configure the tables provided by the Linux kernel firewall (implemented as different netfilter modules) and the chains and rules it stores.
Many systems use iptables/netfilter, Linux's native packet filtering/mangling framework since Linux 2.4, be it home routers or sophisticated cloud network stacks.
In this session, we will talk about the netfilter framework and its facilities, explain how basic filtering and mangling use-cases are implemented using iptables, and introduce some less common but powerful extensions of iptables.
Shmulik Ladkani, Chief Architect at Nsof Networks.
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Nsof Networks, where he's been busy architecting network infrastructure as a cloud-based service, gazing at internet routes in astonishment, and playing the chkuku.
Kernel Recipes 2019 - XDP closer integration with network stackAnne Nicolas
XDP (eXpress Data Path) is the new programmable in-kernel fast-path, which is placed as a layer before the existing Linux kernel network stack (netstack).
We claim XDP is not kernel-bypass, as it is a layer before and it can easily fall-through to netstack. Reality is that it can easily be (ab)used to create a kernel-bypass situation, where non of the kernel facilities are used (in form of BPF-helpers and in-kernel tables). The main disadvantage with kernel-bypass, is the need to re-implement everything, even basic building blocks, like routing tables and ARP protocol handling.
It is part of the concept and speed gain, that XDP allows users to avoid calling part of the kernel code. Users have the freedom to do kernel-bypass and re-implement everything, but the kernel should provide access to more in-kernel tables, via BPF-helpers, such that users can leverage other parts of the Open Source ecosystem, like router daemons etc.
This talk is about how XDP can work in-concert with netstack, and proposal on how we can take this even-further. Crazy ideas like using XDP frames to move SKB allocation out of driver code, will also be proposed.
This talk is all about the Berkeley Packet Filters (BPF) and their uses in Linux.
Agenda:
* What is a BPF and why do we need it?
* Writing custom BPFs
* Notes on BPF implementation in the kernel
* Usage examples: SOCKET_FILTER & seccomp
Speaker:
Kfir Gollan, senior embedded software developer, Linux kernel hacker and software team leader.
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects existing Linux tracing technologies, and provides a powerful aplatform to solve any Linux performance problem.
Building Network Functions with eBPF & BCCKernel TLV
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks,
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
Transparent eBPF Offload: Playing Nice with the Linux KernelOpen-NFP
Netronome is developing transparent kernel acceleration based on XDP and BPF. This allows bytecode intended for kernel applications to be directly offloaded using the upstreamed nfp_bpf_jit. The talk will focus on the basics of getting an XDP environment up and running, a couple of use cases and a simple demo of Netronome’s BPF offload in action.
This presentation will cover the basics of performance testing. Configuring systems correctly is essential to characterizing the performance of SmartNICs. The configuration of BIOS, CPU allocation, OS and VM parameters will be covered. Also, choices of traffic generators and typical test topologies will be described.
Linux Kernel Cryptographic API and Use CasesKernel TLV
The Linux kernel has a rich and modular cryptographic API that is used extensively by familiar user facing software such as Android. It's also cryptic, badly documented, subject to change and can easily bite you in unexpected and painful ways.
This talk will describe the crypto API, provide some usage example and discuss some of the more interesting in-kernel users, such as DM-Crypt, DM-Verity and the new fie system encryption code.
Gilad Ben-Yossef is a principal software engineer at ARM. He works on the kernel security sub-system and the ARM CryptCell engine. Open source work done by Gilad includes an experiment in integration of network processors in the networking stack, a patch set for reducing the interference caused to user space processes in large multi-core systems by Linux kernel “maintenance” work and on SMP support for the Synopsys Arc processor among others.
Gilad has co-authored O’Reilly’s “Building Embedded Linux Systems” 2nd edition and presented at such venues as Embedded Linux Conference Europe and the Ottawa Linux Symposium, as well as co-founded Hamakor, an Israeli NGO for the advancement for Open Source and Free Software in Israel. When not hacking on kernel code you can find Gilad meditating and making dad jokes on Twitter.
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
This talk will start with a deep dive and hands on examples of BPF, possibly the most promising low level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language to filter raw sockets for tcpdump to the a JITable virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium with the help of BPF can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
This presentation introduces Data Plane Development Kit overview and basics. It is a part of a Network Programming Series.
First, the presentation focuses on the network performance challenges on the modern systems by comparing modern CPUs with modern 10 Gbps ethernet links. Then it touches memory hierarchy and kernel bottlenecks.
The following part explains the main DPDK techniques, like polling, bursts, hugepages and multicore processing.
DPDK overview explains how is the DPDK application is being initialized and run, touches lockless queues (rte_ring), memory pools (rte_mempool), memory buffers (rte_mbuf), hashes (rte_hash), cuckoo hashing, longest prefix match library (rte_lpm), poll mode drivers (PMDs) and kernel NIC interface (KNI).
At the end, there are few DPDK performance tips.
Tags: access time, burst, cache, dpdk, driver, ethernet, hub, hugepage, ip, kernel, lcore, linux, memory, pmd, polling, rss, softswitch, switch, userspace, xeon
DPDK (Data Plane Development Kit) Overview by Rami Rosen
* Background and short history
* Advantages and disadvantages
- Very High speed networking acceleration in L2
- How this acceleration is achieved (hugepages, optimizations)
- rte_kni (and KCP)
- VPP (and FD.io project) , providing routing and switching.
- TLDK (Transport Layer Development Kit, TCP/UDP)
* Anatomy of a simple DPDK application.
* Development and governance model
* Testpmd: DPDK CLI tool
* DDP - Dynamic Device Profiles
Rami Rosen is a Linux Kernel expert, the author of "Linux Kernel Networking", Apress, 2014.
Rami had published two articles about DPDK in the last year:
"Network acceleration with DPDK"
https://lwn.net/Articles/725254/
"Userspace Networking with DPDK"
https://www.linuxjournal.com/content/userspace-networking-dpdk
netfilter is a framework provided by the Linux kernel that allows various networking-related operations to be implemented in the form of customized handlers.
iptables is a user-space application program that allows a system administrator to configure the tables provided by the Linux kernel firewall (implemented as different netfilter modules) and the chains and rules it stores.
Many systems use iptables/netfilter, Linux's native packet filtering/mangling framework since Linux 2.4, be it home routers or sophisticated cloud network stacks.
In this session, we will talk about the netfilter framework and its facilities, explain how basic filtering and mangling use-cases are implemented using iptables, and introduce some less common but powerful extensions of iptables.
Shmulik Ladkani, Chief Architect at Nsof Networks.
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Nsof Networks, where he's been busy architecting network infrastructure as a cloud-based service, gazing at internet routes in astonishment, and playing the chkuku.
Kernel Recipes 2019 - XDP closer integration with network stackAnne Nicolas
XDP (eXpress Data Path) is the new programmable in-kernel fast-path, which is placed as a layer before the existing Linux kernel network stack (netstack).
We claim XDP is not kernel-bypass, as it is a layer before and it can easily fall-through to netstack. Reality is that it can easily be (ab)used to create a kernel-bypass situation, where non of the kernel facilities are used (in form of BPF-helpers and in-kernel tables). The main disadvantage with kernel-bypass, is the need to re-implement everything, even basic building blocks, like routing tables and ARP protocol handling.
It is part of the concept and speed gain, that XDP allows users to avoid calling part of the kernel code. Users have the freedom to do kernel-bypass and re-implement everything, but the kernel should provide access to more in-kernel tables, via BPF-helpers, such that users can leverage other parts of the Open Source ecosystem, like router daemons etc.
This talk is about how XDP can work in-concert with netstack, and proposal on how we can take this even-further. Crazy ideas like using XDP frames to move SKB allocation out of driver code, will also be proposed.
This talk is all about the Berkeley Packet Filters (BPF) and their uses in Linux.
Agenda:
* What is a BPF and why do we need it?
* Writing custom BPFs
* Notes on BPF implementation in the kernel
* Usage examples: SOCKET_FILTER & seccomp
Speaker:
Kfir Gollan, senior embedded software developer, Linux kernel hacker and software team leader.
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects existing Linux tracing technologies, and provides a powerful aplatform to solve any Linux performance problem.
Building Network Functions with eBPF & BCCKernel TLV
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks,
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
Transparent eBPF Offload: Playing Nice with the Linux KernelOpen-NFP
Netronome is developing transparent kernel acceleration based on XDP and BPF. This allows bytecode intended for kernel applications to be directly offloaded using the upstreamed nfp_bpf_jit. The talk will focus on the basics of getting an XDP environment up and running, a couple of use cases and a simple demo of Netronome’s BPF offload in action.
Ariel Waizel discusses the Data Plane Development Kit (DPDK), an API for developing fast packet processing code in user space.
* Who needs this library? Why bypass the kernel?
* How does it work?
* How good is it? What are the benchmarks?
* Pros and cons
Ariel worked on kernel development at the IDF, Ben Gurion University, and several companies. He is interested in networking, security, machine learning, and basically everything except UI development. Currently a Solution Architect at ConteXtream (an HPE company), which specializes in SDN solutions for the telecom industry.
01 high bandwidth acquisitioncomputing compressionall in a boxYutaka Kawai
This was presented by Alexandre CASTELLANE, Bruno MESNET, and Fabrice MOYEN (IBM France) at OpenPOWER summit EU 2019. The original one is uploaded at:
https://static.sched.com/hosted_files/opeu19/2c/OPF_Lyon_31-09-2019_all_in_a_box_acquisition_computing.pdf
"Session ID: BUD17-300
Session Name: Journey of a packet - BUD17-300
Speaker: Maxim Uvarov
Track: LNG
★ Session Summary ★
Describe step by step what components a packet goes through and details cases when components are implemented in hardware or in software. Attendees will have the definite presentation to understand fundamental differences with DPDK and how ODP solves low end and high end networking issues.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/bud17/bud17-300/
Presentation: https://www.slideshare.net/linaroorg/bud17300-journey-of-a-packet
Video: https://youtu.be/wRZXw_xBT20
---------------------------------------------------
★ Event Details ★
Linaro Connect Budapest 2017 (BUD17)
6-10 March 2017
Corinthia Hotel, Budapest,
Erzsébet krt. 43-49,
1073 Hungary
---------------------------------------------------
Keyword: packet, LNG
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
RAPIDS: GPU-Accelerated ETL and Feature EngineeringKeith Kraus
The RAPIDS suite of open source software libraries gives you the freedom to execute end-to-end data science and analytics pipelines entirely on GPUs. It relies on NVIDIA® CUDA® primitives for low-level compute optimization, but exposes that GPU parallelism and high-bandwidth memory speed through user-friendly Python interfaces.
Accelerating Apache Spark by Several Orders of Magnitude with GPUs and RAPIDS...Databricks
GPU acceleration has been at the heart of scientific computing and artificial intelligence for many years now. GPUs provide the computational power needed for the most demanding applications such as Deep Neural Networks, nuclear or weather simulation. Since the launch of RAPIDS in mid-2018, this vast computational resource has become available for Data Science workloads too. The RAPIDS toolkit, which is now available on the Databricks Unified Analytics Platform, is a GPU-accelerated drop-in replacement for utilities such as Pandas/NumPy/ScikitLearn/XGboost. Through its use of Dask wrappers the platform allows for true, large scale computation with minimal, if any, code changes.
The goal of this talk is to discuss RAPIDS, its functionality, architecture as well as the way it integrates with Spark providing on many occasions several orders of magnitude acceleration versus its CPU-only counterparts.
Production high-performance networking with Snabb and LuaJIT (Linux.conf.au 2...Igalia
By Andy Wingo.
It used to be that to set up a serious network, you needed to stock racks and racks with specialized proprietary single-purpose boxes. This was because only specialized hardware could handle the hundreds of gigabits per second that might flow through any given box.
Things have changed. With the rise of cheap commodity Xeon-based servers and widespread availability of 10 gigabit network cards, an off-the-shelf server with a few NICs can now handle the workload. The age of open source software-driven routers is fully here -- but it doesn't look like what we thought it would, 10 years ago.
We thought it would be Linux everywhere, but it turns out that Linux's networking stack is just too slow. To get around this problem, modern high-speed software switches bypass the kernel entirely, instead booting network cards and handling traffic entirely from user-space. The up-side of this is that now we have the possibility of using pleasant, hackable, open source, standalone software stacks to deliver network applications that are tailored to specific needs.
This talk presents Snabb, a toolkit for building user-space network functions. Snabb is entirely written in the expressive Lua language, minimizing the amount of code that you have to write to get stuff done. Snabb specifically uses the LuaJIT implementation of Lua, giving us excellent code generation as
well as efficient access to low-level binary data and AVX2 assembly generation.
Snabb's goal is to be "rewritable software": software that's so simple that you could explain it to someone and they could write their own. By the end of the presentation, you too should have this feeling.
We will also describe how Snabb is used in practice in major telecoms and ISPs to provide IPv6 transition technologies to entire countries. Using Snabb allowed a small team of open-source hackers to ship a product that competed favorably
against offerings from traditional network vendors.
(c) linux.conf.au 2017, CC-BY-SA
Hobart, 16-20 January 2017
https://linux.conf.au
Design Considerations, Installation, and Commissioning of the RedRaider Cluster at the Texas Tech University
High Performance Computing Center
Outline of this talk
HPCC Staff and Students
Previous clusters
• History, Performance, usage Patterns, and Experience
Motivation for Upgrades
• Compute Capacity Goals
• Related Considerations
Installation and Benchmarks Conclusions and Q&A
Session ID: SFO17-509
Session Name: Deep Learning on ARM Platforms
- SFO17-509
Speaker: Jammy Zhou
Track:
★ Session Summary ★
A new era of deep learning is coming with algorithm evolvement, powerful computing platforms and large dataset availability. This session will focus on existing and potential heterogeneous accelerator solutions (GPU, FPGA, DSP, and etc) for ARM platforms and the work ahead from platform perspective.
---------------------------------------------------
★ Resources ★
Event Page: http://connect.linaro.org/resource/sfo17/sfo17-509/
Presentation:
Video:
---------------------------------------------------
★ Event Details ★
Linaro Connect San Francisco 2017 (SFO17)
25-29 September 2017
Hyatt Regency San Francisco Airport
---------------------------------------------------
Keyword:
http://www.linaro.org
http://connect.linaro.org
---------------------------------------------------
Follow us on Social Media
https://www.facebook.com/LinaroOrg
https://twitter.com/linaroorg
https://www.youtube.com/user/linaroorg?sub_confirmation=1
https://www.linkedin.com/company/1026961
RAPIDS – Open GPU-accelerated Data ScienceData Works MD
RAPIDS – Open GPU-accelerated Data Science
RAPIDS is an initiative driven by NVIDIA to accelerate the complete end-to-end data science ecosystem with GPUs. It consists of several open source projects that expose familiar interfaces making it easy to accelerate the entire data science pipeline- from the ETL and data wrangling to feature engineering, statistical modeling, machine learning, and graph analysis.
Corey J. Nolet
Corey has a passion for understanding the world through the analysis of data. He is a developer on the RAPIDS open source project focused on accelerating machine learning algorithms with GPUs.
Adam Thompson
Adam Thompson is a Senior Solutions Architect at NVIDIA. With a background in signal processing, he has spent his career participating in and leading programs focused on deep learning for RF classification, data compression, high-performance computing, and managing and designing applications targeting large collection frameworks. His research interests include deep learning, high-performance computing, systems engineering, cloud architecture/integration, and statistical signal processing. He holds a Masters degree in Electrical & Computer Engineering from Georgia Tech and a Bachelors from Clemson University.
For the full video of this presentation, please visit:
http://www.embedded-vision.com/platinum-members/luxoft/embedded-vision-training/videos/pages/may-2016-embedded-vision-summit
For more information about embedded vision, please visit:
http://www.embedded-vision.com
Alexey Rybakov, Senior Director at LUXOFT, presents the "Making Computer Vision Software Run Fast on Your Embedded Platform" tutorial at the May 2016 Embedded Vision Summit.
Many computer vision algorithms perform well on desktop class systems, but struggle on resource constrained embedded platforms. This how-to talk provides a comprehensive overview of various optimization methods that make vision software run fast on low power, small footprint hardware that is widely used in automotive, surveillance, and mobile devices. The presentation explores practical aspects of deep algorithm and software optimization such as thinning of input data, using dynamic regions of interest, mastering data pipelines and memory access, overcoming compiler inefficiencies, and more.
Kernel Recipes 2019 - Driving the industry toward upstream firstAnne Nicolas
Wanting to avoid the Android experience, Google developers always aimed to make their Chrome OS Linux kernels as close to mainline as possible. However, when Chromebooks were first created, Google was left with no choice, the mainline kernel, in some subsystems, still did not have all the functionalities needed by Chromebooks. Hence, similarly to Android, Chrome OS had to develop their own out-of-tree code for the kernel and maintain that for a few different kernel versions.
Luckily, over the last few years a strong and consistent effort has been happening to bring Chromebook devices closer to mainline. It has led to significant improvements that now make it possible to run mainline on Chrome OS devices. And not only Chromebooks, as these significant strides are also improving Arm-based SOCs and other key components of the rich Chromebook hardware ecosystem. In this talk, we will look at how and why upstream support for Chromebooks improved, the current status of various models, and what we expect in the future.
Enric Balletbò i Serra
Kernel Recipes 2019 - No NMI? No Problem! – Implementing Arm64 Pseudo-NMIAnne Nicolas
As the name would suggest, a Non-Maskable Interrupt (NMI) is an interrupt-like feature that is unaffected by the disabling of classic interrupts. In Linux, NMIs are involved in some features such as performance event monitoring, hard-lockup detector, on demand state dumping, etc… Their potential to fire when least expected can fill the most seasoned kernel hackers with dread.
AArch64 (aka arm64 in the Linux tree) does not provide architected NMIs, a consequence being that features benefiting from NMIs see their use limited on AArch64. However, the Arm Generic Interrupt Controller (GIC) supports interrupt prioritization and masking, which, among other things, provides a way to control whether or not a set of interrupts can be signaled to a CPU.
This talk will cover how, using the GIC interrupt priorities, we provide a way to configure some interrupts to behave in an NMI-like manner on AArch64. We’ll discuss the implementation, some of the complications that ensued and also some of the benefits obtained from it.
Julien Thierry
Kernel Recipes 2019 - Hunting and fixing bugs all over the Linux kernelAnne Nicolas
At a rate of almost 9 changes per hour (24/7), the Linux kernel is definitely a scary beast. Bugs are introduced on a daily basis and, through the use of multiple code analyzers, *some* of them are detected and fixed before they hit mainline. Over the course of the last few years, Gustavo has been fixing such bugs and many different issues in every corner of the Linux kernel. Recently, he was in charge of leading the efforts to globally enable -Wimplicit-fallthrough; which appears by default in Linux v5.3. This presentation is a report on all the stuff Gustavo has found and fixed in the kernel with the support of the Core Infrastructure Initiative.
Gustavo A.R. Silva
Kernel Recipes 2019 - Metrics are moneyAnne Nicolas
In I.T. we all use all kinds of metrics. Operations teams rely heavily on these, especially when things go south. These metrics are sometimes overrated. Let’s dive into a few real life stories together.
Aurélien Rougemont
Kernel Recipes 2019 - Kernel documentation: past, present, and futureAnne Nicolas
The Linux kernel project includes a huge amount of documentation, but that information has seen little in the way of care over the
years. The amount of care has increased significantly recently, though, and things are improving quickly. Listen as the kernel’s documentation maintainer discusses the current state of the kernel’s docs, how we got here, where we’re trying to go, and how you can help.
Jonathan Corbet
Embedded Recipes 2019 - Knowing your ARM from your ARSE: wading through the t...Anne Nicolas
Modern SoC designs incorporate technologies from numerous vendors, each with their own inconsistent, confusing, undocumented and even contradictory terminology. The result is a mess of acronyms and product names which have a surprising impact on the ability to develop reusable, modular code thanks to the nature of the underlying IP being obscured.
This presentation will dive into some of the misnomers plaguing the Arm ecosystem, with the aim of explaining why things are like they are, how they fit together under the architectural umbrella and how you, as a developer, can decipher the baffling ingredients list of your next SoC design!
Will Deacon
Kernel Recipes 2019 - GNU poke, an extensible editor for structured binary dataAnne Nicolas
GNU poke is a new interactive editor for binary data. Not limited to editing basic ntities such as bits and bytes, it provides a full-fledged procedural, interactive programming language designed to describe data structures and to operate on them. Once a user has defined a structure for binary data (usually matching some file format) she can search, inspect, create, shuffle and modify abstract entities such as ELF relocations, MP3 tags, DWARF expressions, partition table entries, and so on, with primitives resembling simple editing of bits and bytes. The program comes with a library of already written descriptions (or “pickles” in poke parlance) for many binary formats.
GNU poke is useful in many domains. It is very well suited to aid in the development of programs that operate on binary files, such as assemblers and linkers. This was in fact the primary inspiration that brought me to write it: easily injecting flaws into ELF files in order to reproduce toolchain bugs. Also, due to its flexibility, poke is also very useful for reverse engineering, where the real structure of the data being edited is discovered by experiment, interactively. It is also good for the fast development of prototypes for programs like linkers, compressors or filters, and it provides a convenient foundation to write other utilities such as diff and patch tools for binary files.
This talk (unlike Gaul) is divided into four parts. First I will introduce the program and show what it does: from simple bits/bytes editing to user-defined structures. Then I will show some of the internals, and how poke is implemented. The third block will cover the way of using Poke to describe user data, which is to say the art of writing “pickles”. The presentation ends with a status of the project, a call for hackers, and a hint at future works.
Jose E. Marchesi
Kernel Recipes 2019 - Analyzing changes to the binary interface exposed by th...Anne Nicolas
Operating system distributors often face challenges that are somewhat different from that of upstream kernel developers. For instance, some kernel updates often need to stay at least binary compatible with modules that might be “out of tree” for some time.
In that context, being able to automatically detect and analyze changes to the binary interface exposed by the kernel to its module does have some noticeable value.
The Libabigail framework is capable of analyzing ELF binaries along with their accompanying debug info in the DWARF format, detect and report changes in types, functions, variables and ELF symbols. It has historically supported that for user space shared libraries and application so we worked to make it understand the Linux kernel
binaries.
In this presentation, we are going to present the current support of ABI analysis for Linux Kernel binaries, the challenges we face, how we address them and the plans we have for the future.
Dodji Seketeli, Jessica Yu, Matthias Männich
Embedded Recipes 2019 - Remote update adventures with RAUC, Yocto and BareboxAnne Nicolas
Different upgrade and update strategies exist when it comes to embedded Linux system. If at development time none of these strategies have been chosen, adding them afterwards can be tedious task.
Even harder it gets when the system is already deployed in the field and only accessible via a 3G connection.
This talk is a developer experience of putting in place exactly that. Giving a return of experience on one way of doing it on a system running Barebox and a Yocto-based distribution.
Patrick Boettcher
Embedded Recipes 2019 - Making embedded graphics less specialAnne Nicolas
Traditionally graphics drivers were one of the last hold-outs of proprietary software in an embedded Linux system. This situation is changing with open-source graphics drivers showing up for almost all of the graphics acceleration peripherals on the market right now. This talk will show how open-source graphics drivers are making embedded systems less special, as well as trying to provide an overview of the Linux graphics stack, de-mystifying what is often seen as black magic GPU stuff from outside observers.
Lucas Stach
Embedded Recipes 2019 - Linux on Open Source Hardware and Libre SiliconAnne Nicolas
This talk will explore Open Source Hardware projects relevant to Linux, including boards like BeagleBone, Olimex OLinuXino, Giant board and more. Looking at the benefits and challenges of designing Open Source Hardware for a Linux system, along with BeagleBoard.org’s experience of working with community, manufacturers, and distributors to create an Open Source Hardware platform. In closing also looking at the future, Libre Silicon like RISC-V designs, and where this might take Linux.
Drew Fustini
Embedded Recipes 2019 - From maintaining I2C to the big (embedded) pictureAnne Nicolas
The I2C subsystem is not the shiniest part of the Linux Kernel. For embedded devices, though, it is one of the many puzzle pieces which just have to work. Wolfram Sang has the experience of maintaining this subsystem for nearly 7 years now. This talk gives a short overview of how maintaining works in general and specifically in this subsystem. But mainly, it will highlight noteworthy points in the timeline and lessons learnt from that. It will present trends, not so much regarding I2C but more the Linux Kernel and the embedded ecosystem in general. And of course, there will be plenty of anecdotes and bits from behind the scenes for your entertainment.
Wolfram Sang
Embedded Recipes 2019 - Testing firmware the devops wayAnne Nicolas
ITRenew is selling recertified OCP servers under the Sesame brand, those servers come either with their original UEFI BIOS or with LinuxBoot. The LinuxBoot project is pushing the Linux kernel inside bios flash and using userland programs as bootloader.
To achieve quality on our software stack, as any project, we need to test it. Traditional BIOS are tested by hand, this is 2019 we need to do it automatically! We already presented the hardware setup behind the LinuxBoot CI, this talk will focus on the software.
We use u-root for our userland bootloader; this software is written in Go so we naturally choose to use Go for our testing too. We will present how we are using and extending the Go native test framework `go test` for testing embedded systems (serial console) and improving the report format for integration to a CI.
Julien Viard de Galbert
Embedded Recipes 2019 - Herd your socs become a matchmakerAnne Nicolas
About 60% of the Linux kernel source tree is devoted to drivers for a large variety of supported hardware components. Especially in the embedded world, the number of different SoC families, versions, and revisions, integrating a myriad of “IP cores”, keeps on growing.
In this presentation, Geert will explain how to match drivers against hardware, and how to support a wide variety of (dis)similar devices, without turning platform and driver code into an entangled bowl of spaghetti.tra
Starting with a brief history of driver matching in Linux, he will fast-forward to device-tree based matching. He will discuss ways to handle slight variations of the same hardware devices, and different SoC revisions, each with their own quirks and bugs. Finally, Geert will show best practices for evolving device drivers in a maintainable way, based on his experiences as an embedded Linux kernel developer and maintainer.
Geert Uytterhoeven
Embedded Recipes 2019 - LLVM / Clang integrationAnne Nicolas
Buildroot is a popular and easy to use embedded Linux build system. It generates, in few minutes, lightweight and customized Linux systems, including the cross-compilation toolchain, kernel and bootloader images, as well as a wide variety of userspace libraries and programs.
This talk is about the integration of LLVM/clang into Buildroot.
In 2018, Valentin Korenblit, supervised by Romain Naour, worked on this topic during his internship at Smile ECS. After a short introduction about llvm/clang and Buildroot, this talk will go through the numerous issues discovered while adding llvm/clang componants and how these issues were fixed. Romain will also detail the work in progress and the work to be done based on llvm/clang libraries (OpenCL, Compiler-rt, BCC. Chromium, ldd).
Romain Naour
Embedded Recipes 2019 - Introduction to JTAG debuggingAnne Nicolas
This talk introduces JTAG debugging capabilities, both for debugging hardware and software. Marek first explains what the JTAG stands for and explains the operation of the JTAG state machine. This is followed by an introduction to free software JTAG tools, OpenOCD and urJTAG. Marek shortly explains how to debug software using those tools and how that ties into the JTAG state machine. However, JTAG was designed for testing hardware. Marek explains what boundary scan testing (BST) is, what are BSDL files and their format, and practically demonstrates how to blink an LED using BST and only free software tools.
Marek Vasut
Embedded Recipes 2019 - Pipewire a new foundation for embedded multimediaAnne Nicolas
PipeWire is an open source project that aims to greatly improve audio and video handling under Linux. Utilising a fresh design, it bridges use cases that have been previously addressed by different tools – or not addressed at all -, providing ground for building complex, yet secure and efficient, multimedia systems.
In this talk, Julien is going to present the PipeWire project and the concepts that make up its design. In addition, he is going to give an update of the current and future work going on around PipeWire, both upstream and in Automotive Grade Linux, an early adopter that Julien is actively working on.
Julian Bouzas
Kernel Recipes 2019 - ftrace: Where modifying a running kernel all startedAnne Nicolas
Ftrace’s most powerful feature is the function tracer (and function graph tracer which is built from it). But to have this enabled on production systems, it had to have its overhead be negligible when disabled. As the function tracer uses gcc’s profiling mechanism, which adds a call to “mcount” (or more recently fentry, don’t worry if you don’t know what this is, it will all be explained) at the start of almost all functions, it had to do something about the overhead that causes. The solution was to turn those calls into “nops” (an instruction that the CPU simply ignores). But this was no easy feat. It took a lot to come up with a solution (and also turning a few network cards into bricks). This talk will explain the history of how ftrace came about implementing the function tracer, and brought with it the possibility of static branches and soon static calls!
Steven Rostedt
Kernel Recipes 2019 - Suricata and XDPAnne Nicolas
Suricata is a network threat detection engine using network packets capture to reconstruct the traffic till the application layer and find threats on the network using rules that define behavior to detect. This task is really CPU intensive and discarding non interesting traffic is a solution to enable a scaling of Suricata to 40gbps and other.
This talk will present the latest evolution of Suricata that knows uses eBPF and XDP to bypass traffic. Suricata 5.0 is supporting the hardware XDP to provide ypass with network card such as Netronome. It also takes advantage of pinned maps to get persistance of the bypassed flows. This talk will cover the different usage of XDP and eBPF in Suricata and shows how it impact performance and usability. If development time permit, the talk will also cover AF_XDP and the impact on this new capture method on Suricata.
Eric Leblond
Kernel Recipes 2019 - Marvels of Memory Auto-configuration (SPD)Anne Nicolas
System memory configuration is a transparent operation nowadays, something that we all came to expect to just work out of the box. Still, it does happen behind the scenes every single time we boot our computers. This requires the cooperation of hardware components on the mainboard and on memory modules themselves, as well as firmware code to drive these. While it is possible to just let it happen, having a deeper understanding of how it works makes it possible to access valuable information from the operating system at run-time.
I will take you through the history of system memory configuration from the mid 70s to now. We will explore the different types of memory modules, how their configuration data is stored and how the firmware can access them. We will see which problems had to be solved along the way and how they were solved. Lastly we will see how Linux supports reading the memory configuration information and what you can do with that information.
Jean Delvare
Cyaniclab : Software Development Agency Portfolio.pdfCyanic lab
CyanicLab, an offshore custom software development company based in Sweden,India, Finland, is your go-to partner for startup development and innovative web design solutions. Our expert team specializes in crafting cutting-edge software tailored to meet the unique needs of startups and established enterprises alike. From conceptualization to execution, we offer comprehensive services including web and mobile app development, UI/UX design, and ongoing software maintenance. Ready to elevate your business? Contact CyanicLab today and let us propel your vision to success with our top-notch IT solutions.
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTier1 app
Even though at surface level ‘java.lang.OutOfMemoryError’ appears as one single error; underlyingly there are 9 types of OutOfMemoryError. Each type of OutOfMemoryError has different causes, diagnosis approaches and solutions. This session equips you with the knowledge, tools, and techniques needed to troubleshoot and conquer OutOfMemoryError in all its forms, ensuring smoother, more efficient Java applications.
Prosigns: Transforming Business with Tailored Technology SolutionsProsigns
Unlocking Business Potential: Tailored Technology Solutions by Prosigns
Discover how Prosigns, a leading technology solutions provider, partners with businesses to drive innovation and success. Our presentation showcases our comprehensive range of services, including custom software development, web and mobile app development, AI & ML solutions, blockchain integration, DevOps services, and Microsoft Dynamics 365 support.
Custom Software Development: Prosigns specializes in creating bespoke software solutions that cater to your unique business needs. Our team of experts works closely with you to understand your requirements and deliver tailor-made software that enhances efficiency and drives growth.
Web and Mobile App Development: From responsive websites to intuitive mobile applications, Prosigns develops cutting-edge solutions that engage users and deliver seamless experiences across devices.
AI & ML Solutions: Harnessing the power of Artificial Intelligence and Machine Learning, Prosigns provides smart solutions that automate processes, provide valuable insights, and drive informed decision-making.
Blockchain Integration: Prosigns offers comprehensive blockchain solutions, including development, integration, and consulting services, enabling businesses to leverage blockchain technology for enhanced security, transparency, and efficiency.
DevOps Services: Prosigns' DevOps services streamline development and operations processes, ensuring faster and more reliable software delivery through automation and continuous integration.
Microsoft Dynamics 365 Support: Prosigns provides comprehensive support and maintenance services for Microsoft Dynamics 365, ensuring your system is always up-to-date, secure, and running smoothly.
Learn how our collaborative approach and dedication to excellence help businesses achieve their goals and stay ahead in today's digital landscape. From concept to deployment, Prosigns is your trusted partner for transforming ideas into reality and unlocking the full potential of your business.
Join us on a journey of innovation and growth. Let's partner for success with Prosigns.
Globus Connect Server Deep Dive - GlobusWorld 2024Globus
We explore the Globus Connect Server (GCS) architecture and experiment with advanced configuration options and use cases. This content is targeted at system administrators who are familiar with GCS and currently operate—or are planning to operate—broader deployments at their institution.
How Recreation Management Software Can Streamline Your Operations.pptxwottaspaceseo
Recreation management software streamlines operations by automating key tasks such as scheduling, registration, and payment processing, reducing manual workload and errors. It provides centralized management of facilities, classes, and events, ensuring efficient resource allocation and facility usage. The software offers user-friendly online portals for easy access to bookings and program information, enhancing customer experience. Real-time reporting and data analytics deliver insights into attendance and preferences, aiding in strategic decision-making. Additionally, effective communication tools keep participants and staff informed with timely updates. Overall, recreation management software enhances efficiency, improves service delivery, and boosts customer satisfaction.
Accelerate Enterprise Software Engineering with PlatformlessWSO2
Key takeaways:
Challenges of building platforms and the benefits of platformless.
Key principles of platformless, including API-first, cloud-native middleware, platform engineering, and developer experience.
How Choreo enables the platformless experience.
How key concepts like application architecture, domain-driven design, zero trust, and cell-based architecture are inherently a part of Choreo.
Demo of an end-to-end app built and deployed on Choreo.
Listen to the keynote address and hear about the latest developments from Rachana Ananthakrishnan and Ian Foster who review the updates to the Globus Platform and Service, and the relevance of Globus to the scientific community as an automation platform to accelerate scientific discovery.
Quarkus Hidden and Forbidden ExtensionsMax Andersen
Quarkus has a vast extension ecosystem and is known for its subsonic and subatomic feature set. Some of these features are not as well known, and some extensions are less talked about, but that does not make them less interesting - quite the opposite.
Come join this talk to see some tips and tricks for using Quarkus and some of the lesser known features, extensions and development techniques.
In software engineering, the right architecture is essential for robust, scalable platforms. Wix has undergone a pivotal shift from event sourcing to a CRUD-based model for its microservices. This talk will chart the course of this pivotal journey.
Event sourcing, which records state changes as immutable events, provided robust auditing and "time travel" debugging for Wix Stores' microservices. Despite its benefits, the complexity it introduced in state management slowed development. Wix responded by adopting a simpler, unified CRUD model. This talk will explore the challenges of event sourcing and the advantages of Wix's new "CRUD on steroids" approach, which streamlines API integration and domain event management while preserving data integrity and system resilience.
Participants will gain valuable insights into Wix's strategies for ensuring atomicity in database updates and event production, as well as caching, materialization, and performance optimization techniques within a distributed system.
Join us to discover how Wix has mastered the art of balancing simplicity and extensibility, and learn how the re-adoption of the modest CRUD has turbocharged their development velocity, resilience, and scalability in a high-growth environment.
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisGlobus
JASMIN is the UK’s high-performance data analysis platform for environmental science, operated by STFC on behalf of the UK Natural Environment Research Council (NERC). In addition to its role in hosting the CEDA Archive (NERC’s long-term repository for climate, atmospheric science & Earth observation data in the UK), JASMIN provides a collaborative platform to a community of around 2,000 scientists in the UK and beyond, providing nearly 400 environmental science projects with working space, compute resources and tools to facilitate their work. High-performance data transfer into and out of JASMIN has always been a key feature, with many scientists bringing model outputs from supercomputers elsewhere in the UK, to analyse against observational or other model data in the CEDA Archive. A growing number of JASMIN users are now realising the benefits of using the Globus service to provide reliable and efficient data movement and other tasks in this and other contexts. Further use cases involve long-distance (intercontinental) transfers to and from JASMIN, and collecting results from a mobile atmospheric radar system, pushing data to JASMIN via a lightweight Globus deployment. We provide details of how Globus fits into our current infrastructure, our experience of the recent migration to GCSv5.4, and of our interest in developing use of the wider ecosystem of Globus services for the benefit of our user community.
Check out the webinar slides to learn more about how XfilesPro transforms Salesforce document management by leveraging its world-class applications. For more details, please connect with sales@xfilespro.com
If you want to watch the on-demand webinar, please click here: https://www.xfilespro.com/webinars/salesforce-document-management-2-0-smarter-faster-better/
top nidhi software solution freedownloadvrstrong314
This presentation emphasizes the importance of data security and legal compliance for Nidhi companies in India. It highlights how online Nidhi software solutions, like Vector Nidhi Software, offer advanced features tailored to these needs. Key aspects include encryption, access controls, and audit trails to ensure data security. The software complies with regulatory guidelines from the MCA and RBI and adheres to Nidhi Rules, 2014. With customizable, user-friendly interfaces and real-time features, these Nidhi software solutions enhance efficiency, support growth, and provide exceptional member services. The presentation concludes with contact information for further inquiries.
How to Position Your Globus Data Portal for Success Ten Good PracticesGlobus
Science gateways allow science and engineering communities to access shared data, software, computing services, and instruments. Science gateways have gained a lot of traction in the last twenty years, as evidenced by projects such as the Science Gateways Community Institute (SGCI) and the Center of Excellence on Science Gateways (SGX3) in the US, The Australian Research Data Commons (ARDC) and its platforms in Australia, and the projects around Virtual Research Environments in Europe. A few mature frameworks have evolved with their different strengths and foci and have been taken up by a larger community such as the Globus Data Portal, Hubzero, Tapis, and Galaxy. However, even when gateways are built on successful frameworks, they continue to face the challenges of ongoing maintenance costs and how to meet the ever-expanding needs of the community they serve with enhanced features. It is not uncommon that gateways with compelling use cases are nonetheless unable to get past the prototype phase and become a full production service, or if they do, they don't survive more than a couple of years. While there is no guaranteed pathway to success, it seems likely that for any gateway there is a need for a strong community and/or solid funding streams to create and sustain its success. With over twenty years of examples to draw from, this presentation goes into detail for ten factors common to successful and enduring gateways that effectively serve as best practices for any new or developing gateway.
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...informapgpstrackings
Keep tabs on your field staff effortlessly with Informap Technology Centre LLC. Real-time tracking, task assignment, and smart features for efficient management. Request a live demo today!
For more details, visit us : https://informapuae.com/field-staff-tracking/
Innovating Inference - Remote Triggering of Large Language Models on HPC Clus...Globus
Large Language Models (LLMs) are currently the center of attention in the tech world, particularly for their potential to advance research. In this presentation, we'll explore a straightforward and effective method for quickly initiating inference runs on supercomputers using the vLLM tool with Globus Compute, specifically on the Polaris system at ALCF. We'll begin by briefly discussing the popularity and applications of LLMs in various fields. Following this, we will introduce the vLLM tool, and explain how it integrates with Globus Compute to efficiently manage LLM operations on Polaris. Attendees will learn the practical aspects of setting up and remotely triggering LLMs from local machines, focusing on ease of use and efficiency. This talk is ideal for researchers and practitioners looking to leverage the power of LLMs in their work, offering a clear guide to harnessing supercomputing resources for quick and effective LLM inference.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...Mind IT Systems
Healthcare providers often struggle with the complexities of chronic conditions and remote patient monitoring, as each patient requires personalized care and ongoing monitoring. Off-the-shelf solutions may not meet these diverse needs, leading to inefficiencies and gaps in care. It’s here, custom healthcare software offers a tailored solution, ensuring improved care and effectiveness.
Custom Healthcare Software for Managing Chronic Conditions and Remote Patient...
Kernel Recipes 2018 - XDP: a new fast and programmable network layer - Jesper Dangaard Brouer
1. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
XDP: a new fast and
programmable network layer
XDP - eXpress Data Path
Jesper Dangaard Brouer
Principal Engineer, Red Hat Inc.
Kernel Recipes
Paris, France, Sep 2018
2. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Learn that:
● Linux got a new fast and programmable network layer
○ And touch upon the basic building blocks
● How this XDP technology got motivated upstream
● Why eBPF is such a perfect match for XDP
● Demystify: Why this XDP technology is so fast!
● How XDP achieves "hidden" RX bulking
● Why XDP-redirect is a novel approach
What will you learn?
What will you actually get out of listening to this talk?
3. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
XDP is about providing eBPF programmable superpowers to end-users
● But in this talk you will not learn how to code eBPF
Motivated and dedicated to: Making eBPF programming more easily consumable
● But NOT in this talk…
● … go watch many of my other talks! :-)
○ Like the tutorial: XDP for the Rest of Us
Anti slide: What you will NOT learn
We cannot cover everything today...
4. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 20184
XDP basically: New layer in the kernel network stack
● Before allocating the SKB
○ Driver level hook at DMA level
● Means: Competing at the same “layer” as DPDK / netmap
● Super fast, due to
○ Take action/decision earlier (e.g. skip some network layers)
○ No-memory allocations
● Not kernel bypass, data-plane is kept inside kernel
○ via BPF: makes early network stack run-time programmable
○ Cooperates with kernel
Single slide intro: What is XDP?
Compressed slide about XDP … cover the details later
5. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
What! - New layer in kernel networking?!
How did you motivate this upstream!?!
5
6. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Kernel bypass solutions: Like DPDK, netmap, PF_RING
● Show they are 10x faster than Linux kernel netstack
● This is true: FOR THEIR SPECIFIC USE-CASES
This lead to WRONG conclusions and quotes:
● Wrong: "Kernel is inherently too slow to handle networking"
Got motivated by showing this is WRONG
● Acknowledge competition, XDP was late to the game (2016 initial idea)
○ Netmap 2011 (SIGCOMM), DPDK 2013 (goes open-source?), PF_RING 2010 (earliest ref found)
Motivation through competition
Competition was helpful to force us to find an upstream kernel alternative
7. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
It is all about the use-case
● Kernel bypass benchmarks show L2/L3 use-cases
● Linux netstack assumes socket delivery use-case
Linux network stack was build with socket-delivery use-case in focus
● Naming of container for packet data structure says is all:
○ SKB → sk_buff → (originates from) socket buffer
○ Thus, netstack takes this cost upfront (at driver level alloc SKB)
Unfair comparison
Comparing apples and bananas
8. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
XDP basically change the picture: Operate at same level
● Introduce layer before allocating the SKB
● Hook at driver level (running eBPF)
XDP goals
● Close the performance gap to kernel-bypass solutions
○ Not a goal to be faster
● Provide in-kernel alternative, that is more flexible
○ Don’t steal the entire NIC
● Work in concert with existing network stack
XDP design goals
Operate at same L2/L3 level, allows for apple to apple comparison
9. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Show me some benchmarks!
9
10. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Comparing dropping packet:
DPDK vs. XDP vs Linux
● XDP follow DPDK line,
offset is due to driver indirect calls,
calculated to 16 nanosec offset
● Linux scales perfectly,
conntrack overhead is significant
● HW/driver does PCIe compression to
avoid PCIe limitations
Benchmark: Drop packet
Driver: mlx5,
HW: 100Gbit/s ConnectX-5 Ex.
CPU: E5-1650v4
11. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Comparing L2 forward packets
● XDP-same-NIC uses XDP_TX, which
bounce packet inside NIC driver
● XDP-different-NIC use
XDP_REDIRECT, which is more
advanced (further optimization possible)
● Shows XDP can be faster than
DPDK, in certain cases/modes
Benchmark: L2 forwarding
Driver: mlx5,
HW: 100Gbit/s ConnectX-5 Ex.
CPU: E5-1650v4
12. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Sounds cool!
Tell me more about: Basic design and building blocks
12
13. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
BPF program return an action or verdict
● XDP_DROP, XDP_PASS, XDP_TX, XDP_ABORTED, XDP_REDIRECT
How to cooperate with network stack
● Pop/push or modify headers: Change RX-handler kernel use
○ e.g. handle protocol unknown to running kernel
● Can propagate 32Bytes meta-data from XDP stage to network stack
○ TC (clsbpf) hook can use meta-data, e.g. set SKB mark
XDP actions and cooperation
What are the basic building blocks I can use?
13
14. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Data-plane: inside kernel, split into:
● Kernel-core: Fabric in charge of moving packets quickly
● In-kernel BPF program:
○ Policy logic decide action
○ Read/write access to packet
Control-plane: Userspace
● Userspace load BPF program
● Can control program via changing BPF maps
● Everything goes through bpf system call
Design: XDP: data-plane and control-plane
Overall design
14
15. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
BPF is sandboxed code running inside kernel (XDP only loaded by root)
● A given kernel BPF hook just define:
○ possible actions and limit helpers (that can lookup or change kernel state)
Users get programmable policies (within these limits)
● Userspace "control-plane" API tied to userspace app (not kernel API)
○ likely via modifying a BPF-map
● No longer need a kernel ABI
○ like sysctl/procfs/ioctls etc.
Why kernel developers should love BPF
How BPF avoids creating a new kernel ABI for every new user-invented policy decision
15
16. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Why XDP_REDIRECT is so interesting?!
16
17. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
XDP got action code XDP_REDIRECT (that drivers must implement)
● In basic form: Redirecting RAW frames out another net_device/ifindex
● Egress driver: implement ndo_xdp_xmit (and ndo_xdp_flush)
Performance low without using a map for redirect (single CPU core numbers, ixgbe):
● Using helper: bpf_redirect = 7.5 Mpps
● Using helper: bpf_redirect_map = 13.0 Mpps
What is going on?
● Using redirect maps is a HUGE performance boost, why!?
XDP action XDP_REDIRECT
First lets cover the basics, net_device redirect (... later, more flexibility in maps)
Avail since kernel v4.14, but drivers slower to adapt
17
18. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201818
Basic design: Simplify changes needed in drivers
● Name “redirect” is more generic, than “forwarding”
First trick: Hide RX bulking from driver code via MAP
● Driver still processes packets one at a time - calling xdp_do_redirect
● End of driver NAPI poll routine “flush” (max 64 packets) - call xdp_do_flush_map
● Thus, bulking via e.g. delaying expensive NIC tailptr/doorbell
Second trick: invent new types of redirects easy
● Without changing any driver code!
● Hopefully last XDP action code(?)
Novel: redirect using BPF maps
Why is it so brilliant to use BPF maps for redirecting?
19. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Gist: Driver XDP RX-processing (NAPI) loop
Demonstrate NAPI-poll call BPF prog per packet and end with MAP-flush
Basic code needed in Driver for supporting XDP:
1 while (desc_in_rx_ring && budget_left--) {
2 action = bpf_prog_run_xdp(xdp_prog, xdp_buff);
3 /* helper bpf_redirect_map have set map (and index)via this_cpu_ptr */
4 switch (action) {
5 case XDP_PASS: break;
6 case XDP_TX: res = driver_local_xmit_xdp_ring(adapter, xdp_buff); break;
7 case XDP_REDIRECT: res = xdp_do_redirect(netdev, xdp_buff, xdp_prog); break;
8 default: bpf_warn_invalid_xdp_action(action); /* fallthrough */
9 case XDP_ABORTED: trace_xdp_exception(netdev, xdp_prog, action); /* fallthrough */
10 case XDP_DROP: res = DRV_XDP_CONSUMED; break;
11 } /* left out acting on res */
12 } /* End of NAPI-poll */
13 xdp_do_flush_map(); /* Bulk chosen by map, can store xdf_frame for flushing */
14 driver_local_XDP_TX_flush();
20. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201820
The “devmap”: BPF_MAP_TYPE_DEVMAP
● Contains net_devices, userspace adds them via ifindex to map-index
The “cpumap”: BPF_MAP_TYPE_CPUMAP
● Allow redirecting RAW xdp frames to remote CPU
○ SKB is created on remote CPU, and normal network stack invoked
● The map-index is the CPU number (the value is queue size)
AF_XDP - “xskmap”: BPF_MAP_TYPE_XSKMAP
● Allow redirecting RAW xdp frames into userspace
○ via new Address Family socket type: AF_XDP
○ Very close to ‘netmap’ design, keeping drivers in kernel-space
○ Ask Björn Töpel who implemented it… in audience
Redirect map types
What kind of redirects are people inventing?!
21. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Spectre V2 killed XDP performance
21
22. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201822
Hey, you killed my XDP performance! (Retpoline tricks for indirect calls)
● Still processing 6 Mpps per CPU core
● But could do approx 13 Mpps before!
Initial through it was net_device->ndo_xdp_xmit call
● Implemented redirect bulking, but only helped a little
Real pitfall: DMA API use indirect function call pointers
● Christoph Hellwig PoC patch show perf return to approx 10 Mpps
Thus, solutions in the pipeline...
Performance issue: Spectre (variant 2)
CONFIG_RETPOLINE and newer GCC compiler
- for stopping Spectre (variant 2) CPU side-channel attacks
23. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Questions asked by NDIV (HAproxy) in:
“Challenges migrating from NDIV to XDP”
NetDev Conf 0x12 (slides and video)
23
24. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
NDIV callback rx_done() per packet batch, at end of RX loop
● XDP do have this, but part of BPF map abstraction
● For XDP, correspond to: xdp_do_flush_map
In theory, NDIV could be implemented with BPF as
● Move match/filter part into BPF prog, that return action XDP_REDIRECT
● New redirect BPF map type for NDIV
○ (Ugly) could do mem alloc for NDIV “out-packet(s)”
○ (Ugly) could invoke ndiv->handle_rx(xdp_frame)
● This map type, can receive event RX-done via xdp_do_flush_map
NDIV need callback rx_done()
NDIV have callback rx_done() per packet batch, at end of RX loop
25. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
There is no TX hook in XDP, why?
● For packets arriving from netstack
○ makes no-sense performance-wise (too late, SKB already alloc)
● BPF hook for TX is available
○ Use TC (clsbpf) egress BPF-prog hook (takes SKB like your handler)
(Disclaimer only idea:) Maybe do XDP bpf_prog invoke if ndo_xdp_xmit() fails
● Use-case: react to XDP-redirect TX overflow
○ E.g. allow new XDP action, like another redirect target (load-balance)
XDP hook at TX?
Could not find the XDP hook at TX
NDIV have ndiv->handle_tx(ndiv, skb)
26. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
End slide
… Questions?
plus.google.com/+JesperDangaardBrouer
linkedin.com/in/brouer
youtube.com/channel/UCSypIUCgtI42z63soRMONng
facebook.com/brouer
twitter.com/JesperBrouer
27. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201827
XDP + BPF combined effort of many people
Thanks to all contributors
● Alexei Starovoitov
● Daniel Borkmann
● Brenden Blanco
● Tom Herbert
● John Fastabend
● Martin KaFai Lau
● Jakub Kicinski
● Jason Wang
● Andy Gospodarek
● Thomas Graf
● Michael Chan (bnxt_en)
● Saeed Mahameed (mlx5)
● Tariq Toukan (mlx4)
● Björn Töpel (i40e + AF_XDP)
● Magnus Karlsson (AF_XDP)
● Yuval Mintz (qede)
● Sunil Goutham (thunderx)
● Jason Wang (VM)
● Michael S. Tsirkin (ptr_ring)
● Edward Cree
● Toke Høiland-Jørgensen
28. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Extra slides if time permits...
28
29. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
What is this CPUMAP redirect?
30. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201830
Basic cpumap properties
● Enables redirection of XDP frames to remote CPUs
● Moved SKB allocation outside driver (could help simplify drivers)
Scalability and isolation mechanism
● Allows isolating/decouple driver XDP layer from network stack
○ Don't delay XDP by deep call into network stack
● Enables DDoS protection on end-hosts (that run services)
○ XDP fast-enough to avoid packet drops happen in HW NICs
Another use-case: Fix NIC-HW RSS/RX-hash broken/uneven CPU distribution
● Proto unknown to HW: e.g. VXLAN and double-tagged VLANs
● Suricata use this feature
XDP_REDIRECT + cpumap
What is cpumap redirect?
31. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201831
Cpumap architecture: Every slot in array-map: dest-CPU
● MPSC (Multi Producer Single Consumer) model: per dest-CPU
○ Multiple RX-queue CPUs can enqueue to single dest-CPU
● Fast per CPU enqueue store (for now) 8 packets
○ Amortized enqueue cost to shared ptr_ring queue via bulk-enq
● Lockless dequeue, via pinning kthread CPU and disallow ptr_ring resize
Important properties from main shared queue ptr_ring (cyclic array based)
● Enqueue+dequeue don't share cache-line for synchronization
○ Synchronization happens based on elements
○ In queue almost full case, avoid cache-line bouncing
○ In queue almost empty case, reduce cache-line bouncing via bulk-enq
Cpumap redirect: CPU scaling
Tricky part getting cross CPU delivery fast-enough
32. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201832
XDP_REDIRECT
enqueue into
cpumap
Kthread dequeue
Start normal
netstack
App can run on
another CPU via
socket queue
CPU scheduling via cpumap
Queuing and scheduling in cpumap
CPU#2
kthread
CPU#1
NAPI RX
CPU#3
Userspace
sched
Hint: Same CPU sched possible
● But adjust /proc/sys/kernel/sched_wakeup_granularity_ns
33. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 2018
Next slides
Recent changes to XDP core
33
34. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201834
Long standing request: separate BPF programs per RX queue
● This is not likely to happen… because
Solution instead: provide info per RX queue (xdp_rxq_info)
● Info: ingress net_device (Exposed as: ctx->ingress_ifindex)
● Info: ingress RX-queue number (Exposed as: ctx->rx_queue_index)
Thus, NIC level XDP/bpf program can instead filter on rx_queue_index
Recent change: Information per RX-queue
Recent change in: kernel v4.16
35. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201835
XDP_REDIRECT needs to queue XDP frames e.g. for bulking
● Queuing open-coded for both cpumap and tun-driver
● Generalize/standardize into struct xdp_frame
● Store info in top of XDP frame headroom (reserved)
○ Avoids allocating memory
Recent change: queuing via xdp_frame
Very recent changes: only accepted in net-next (to appear in v4.18)
36. XDP: new fast and programmable network layer - Kernel Recipes, Paris, Sep 201836
API for how redirected frames are freed or "returned"
● XDP frames are returned to originating RX driver
● Furthermore: this happens per RX-queue level (extended xdp_rxq_info)
This allows driver to implement different memory models per RX-queue
● E.g. needed for AF_XDP zero-copy mode
Recent change: Memory return API
Very recent changes: only accepted in net-next (to appear in v4.18)