This document provides an overview of cBPF and eBPF. It discusses the history and implementation of cBPF, including how it was originally used for packet filtering. It then covers eBPF in more depth, explaining what it is, its history, implementation including different program types and maps. It also discusses several uses of eBPF including networking, firewalls, DDoS mitigation, profiling, security, and chaos engineering. Finally, it introduces XDP and DPDK, comparing XDP's benefits over DPDK.
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects existing Linux tracing technologies, and provides a powerful aplatform to solve any Linux performance problem.
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long term Linux kernel contributor, who has been leading the kernel team at Netronome for the last two years. Jakub’s major contributions include the creation of BPF hardware offload mechanisms in the kernel and bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background of computer networks including academic research with DDoS. David has expertise in the areas of Linux architecture and computer programming. David has a Masters Degree in Electrical, Electronic Engineering at Queen’s University Belfast and continues as a PhD student studying Emerging Application Layer DDoS threats.
The Linux kernel is undergoing the most fundamental architecture evolution in history and is becoming a microkernel. Why is the Linux kernel evolving into a microkernel? The potentially biggest fundamental change ever happening to the Linux kernel. This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new type of application deployment method for the Linux kernel.
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
This talk will start with a deep dive and hands on examples of BPF, possibly the most promising low level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language to filter raw sockets for tcpdump to the a JITable virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium with the help of BPF can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
eBPF is an exciting new technology that is poised to transform Linux performance engineering. eBPF enables users to dynamically and programatically trace any kernel or user space code path, safely and efficiently. However, understanding eBPF is not so simple. The goal of this talk is to give audiences a fundamental understanding of eBPF, how it interconnects existing Linux tracing technologies, and provides a powerful aplatform to solve any Linux performance problem.
Netronome's half-day tutorial on host data plane acceleration at ACM SIGCOMM 2018 introduced attendees to models for host data plane acceleration and provided an in-depth understanding of SmartNIC deployment models at hyperscale cloud vendors and telecom service providers.
Presenter Bios
Jakub Kicinski is a long term Linux kernel contributor, who has been leading the kernel team at Netronome for the last two years. Jakub’s major contributions include the creation of BPF hardware offload mechanisms in the kernel and bpftool user space utility, as well as work on the Linux kernel side of OVS offload.
David Beckett is a Software Engineer at Netronome with a strong technical background of computer networks including academic research with DDoS. David has expertise in the areas of Linux architecture and computer programming. David has a Masters Degree in Electrical, Electronic Engineering at Queen’s University Belfast and continues as a PhD student studying Emerging Application Layer DDoS threats.
The Linux kernel is undergoing the most fundamental architecture evolution in history and is becoming a microkernel. Why is the Linux kernel evolving into a microkernel? The potentially biggest fundamental change ever happening to the Linux kernel. This talk covers how companies like Facebook and Google use BPF to patch 0-day exploits, how BPF will change the way features are added to the kernel forever, and how BPF is introducing a new type of application deployment method for the Linux kernel.
DockerCon 2017 - Cilium - Network and Application Security with BPF and XDPThomas Graf
This talk will start with a deep dive and hands on examples of BPF, possibly the most promising low level technology to address challenges in application and network security, tracing, and visibility. We will discuss how BPF evolved from a simple bytecode language to filter raw sockets for tcpdump to the a JITable virtual machine capable of universally extending and instrumenting both the Linux kernel and user space applications. The introduction is followed by a concrete example of how the Cilium open source project applies BPF to solve networking, security, and load balancing for highly distributed applications. We will discuss and demonstrate how Cilium with the help of BPF can be combined with distributed system orchestration such as Docker to simplify security, operations, and troubleshooting of distributed applications.
BPF of Berkeley Packet Filter mechanism was first introduced in linux in 1997 in version 2.1.75. It has seen a number of extensions of the years. Recently in versions 3.15 - 3.19 it received a major overhaul which drastically expanded it's applicability. This talk will cover how the instruction set looks today and why. It's architecture, capabilities, interface, just-in-time compilers. We will also talk about how it's being used in different areas of the kernel like tracing and networking and future plans.
SOSCON 2019.10.17
What are the methods for packet processing on Linux? And how fast are each packet processing methods? In this presentation, we will learn how to handle packets on Linux (User space, socket filter, netfilter, tc), and compare performance with analysis of where each packet processing is done in the network stack (hook point). Also, we will discuss packet processing using XDP, an in-kernel fast-path recently added to the Linux kernel. eXpress Data Path (XDP) is a high-performance programmable network data-path within the Linux kernel. The XDP is located at the lowest level of access through SW in the network stack, the point at which driver receives the packet. By using the eBPF infrastructure at this hook point, the network stack can be expanded without modifying the kernel.
Daniel T. Lee (Hoyeon Lee)
@danieltimlee
Daniel T. Lee currently works as Software Engineer at Kosslab and contributing to Linux kernel BPF project. He has interest in cloud, Linux networking, and tracing technologies, and likes to analyze the kernel's internal using BPF technology.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Using the new extended Berkley Packet Filter capabilities in Linux to the improve performance of auditing security relevant kernel events around network, file and process actions.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
In this session, we’ll review how previous efforts, including Netfilter, Berkley Packet Filter (BPF), Open vSwitch (OVS), and TC, approached the problem of extensibility. We’ll show you an open source solution available within the Red Hat Enterprise Linux kernel, where extending and merging some of the existing concepts leads to an extensible framework that satisfies the networking needs of datacenter and cloud virtualization.
Cilium - Container Networking with BPF & XDPThomas Graf
This talk demonstrates that programmability and performance does not require user space networking, it can be achieved in the kernel by generating BPF programs and leveraging the existing kernel subsystems. We will demo an early prototype which provides fast IPv6 & IPv4 connectivity to containers, container labels based security policy with avg cost O(1), and debugging and monitoring based on the per-cpu perf ring buffer. We encourage a lively discussion on the approach taken and next steps.
Using eBPF for High-Performance Networking in CiliumScyllaDB
The Cilium project is a popular networking solution for Kubernetes, based on eBPF. This talk uses eBPF code and demos to explore the basics of how Cilium makes network connections, and manipulates packets so that they can avoid traversing the kernel's built-in networking stack. You'll see how eBPF enables high-performance networking as well as deep network observability and security.
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
Container runtimes cause Linux to return to its original purpose: to serve applications interacting directly with the kernel. At the same time, the Linux kernel is traditionally difficult to change and its development process is full of myths. A new efficient in-kernel programming language called eBPF is changing this and allows everyone to extend existing kernel components or glue them together in new forms without requiring to change the kernel itself.
Building Network Functions with eBPF & BCCKernel TLV
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks,
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
eBPF (extended Berkeley Packet Filters) is a modern kernel technology that can be used to introduce dynamic tracing into a system that wasn't prepared or instrumented in any way. The tracing programs run in the kernel, are guaranteed to never crash or hang your system, and can probe every module and function -- from the kernel to user-space frameworks such as Node and Ruby.
In this workshop, you will experiment with Linux dynamic tracing first-hand. First, you will explore BCC, the BPF Compiler Collection, which is a set of tools and libraries for dynamic tracing. Many of your tracing needs will be answered by BCC, and you will experiment with memory leak analysis, generic function tracing, kernel tracepoints, static tracepoints in user-space programs, and the "baked" tools for file I/O, network, and CPU analysis. You'll be able to choose between working on a set of hands-on labs prepared by the instructors, or trying the tools out on your own test system.
Next, you will hack on some of the bleeding edge tools in the BCC toolkit, and build a couple of simple tools of your own. You'll be able to pick from a curated list of GitHub issues for the BCC project, a set of hands-on labs with known "school solutions", and an open-ended list of problems that need tools for effective analysis. At the end of this workshop, you will be equipped with a toolbox for diagnosing issues in the field, as well as a framework for building your own tools when the generic ones do not suffice.
Container Network Interface: Network Plugins for Kubernetes and beyondKubeAcademy
With the rise of modern containers comes new problems to solve – especially in networking. Numerous container SDN solutions have recently entered the market, each best suited for a particular environment. Combined with multiple container runtimes and orchestrators available today, there exists a need for a common layer to allow interoperability between them and the network solutions.
As different environments demand different networking solutions, multiple vendors and viewpoints look to a specification to help guide interoperability. Container Network Interface (CNI) is a specification started by CoreOS with the input from the wider open source community aimed to make network plugins interoperable between container execution engines. It aims to be as common and vendor-neutral as possible to support a wide variety of networking options — from MACVLAN to modern SDNs such as Weave and flannel.
CNI is growing in popularity. It got its start as a network plugin layer for rkt, a container runtime from CoreOS. Today rkt ships with multiple CNI plugins allowing users to take advantage of virtual switching, MACVLAN and IPVLAN as well as multiple IP management strategies, including DHCP. CNI is getting even wider adoption with Kubernetes adding support for it. Kubernetes accelerates development cycles while simplifying operations, and with support for CNI is taking the next step toward a common ground for networking. For continued success toward interoperability, Kubernetes users can come to this session to learn the CNI basics.
This talk will cover the CNI interface, including an example of how to build a simple plugin. It will also show Kubernetes users how CNI can be used to solve their networking challenges and how they can get involved.
KubeCon schedule link: http://sched.co/4VAo
Dataplane programming with eBPF: architecture and toolsStefano Salsano
eBPF is definitely a complex technology. Developing complex systems based on eBPF is challenging due to the intrinsic limitations of the model and the known shortcomings of the tool chain.
The learning curve of this technology is very steep and needs continuous coaching from experts. This tutorial will investigate:
What is eBPF and why it has gained a prominent position among the solutions to improve the packet processing performance in Linux/x86 nodes. We will shortly present some important use case scenarios for eBPF, like Kubernetes’ Cilium
The architecture of eBPF and its programming toolchain (e.g. bcc
What are the frameworks for eBPF programming, such as Polycube and InKeV.
How to make eBPF programming easier, more flexible and modular with HIKe/eCLAT
How to implement a custom application logic in eBPF with eCLAT using a python-like script
How to extend the framework and develop new modules
BPF of Berkeley Packet Filter mechanism was first introduced in linux in 1997 in version 2.1.75. It has seen a number of extensions of the years. Recently in versions 3.15 - 3.19 it received a major overhaul which drastically expanded it's applicability. This talk will cover how the instruction set looks today and why. It's architecture, capabilities, interface, just-in-time compilers. We will also talk about how it's being used in different areas of the kernel like tracing and networking and future plans.
SOSCON 2019.10.17
What are the methods for packet processing on Linux? And how fast are each packet processing methods? In this presentation, we will learn how to handle packets on Linux (User space, socket filter, netfilter, tc), and compare performance with analysis of where each packet processing is done in the network stack (hook point). Also, we will discuss packet processing using XDP, an in-kernel fast-path recently added to the Linux kernel. eXpress Data Path (XDP) is a high-performance programmable network data-path within the Linux kernel. The XDP is located at the lowest level of access through SW in the network stack, the point at which driver receives the packet. By using the eBPF infrastructure at this hook point, the network stack can be expanded without modifying the kernel.
Daniel T. Lee (Hoyeon Lee)
@danieltimlee
Daniel T. Lee currently works as Software Engineer at Kosslab and contributing to Linux kernel BPF project. He has interest in cloud, Linux networking, and tracing technologies, and likes to analyze the kernel's internal using BPF technology.
Video: https://www.youtube.com/watch?v=JRFNIKUROPE . Talk for linux.conf.au 2017 (LCA2017) by Brendan Gregg, about Linux enhanced BPF (eBPF). Abstract:
A world of new capabilities is emerging for the Linux 4.x series, thanks to enhancements that have been included in Linux for to Berkeley Packet Filter (BPF): an in-kernel virtual machine that can execute user space-defined programs. It is finding uses for security auditing and enforcement, enhancing networking (including eXpress Data Path), and performance observability and troubleshooting. Many new open source tools that have been written in the past 12 months for performance analysis that use BPF. Tracing superpowers have finally arrived for Linux!
For its use with tracing, BPF provides the programmable capabilities to the existing tracing frameworks: kprobes, uprobes, and tracepoints. In particular, BPF allows timestamps to be recorded and compared from custom events, allowing latency to be studied in many new places: kernel and application internals. It also allows data to be efficiently summarized in-kernel, including as histograms. This has allowed dozens of new observability tools to be developed so far, including measuring latency distributions for file system I/O and run queue latency, printing details of storage device I/O and TCP retransmits, investigating blocked stack traces and memory leaks, and a whole lot more.
This talk will summarize BPF capabilities and use cases so far, and then focus on its use to enhance Linux tracing, especially with the open source bcc collection. bcc includes BPF versions of old classics, and many new tools, including execsnoop, opensnoop, funcccount, ext4slower, and more (many of which I developed). Perhaps you'd like to develop new tools, or use the existing tools to find performance wins large and small, especially when instrumenting areas that previously had zero visibility. I'll also summarize how we intend to use these new capabilities to enhance systems analysis at Netflix.
Using the new extended Berkley Packet Filter capabilities in Linux to the improve performance of auditing security relevant kernel events around network, file and process actions.
USENIX LISA2021 talk by Brendan Gregg (https://www.youtube.com/watch?v=_5Z2AU7QTH4). This talk is a deep dive that describes how BPF (eBPF) works internally on Linux, and dissects some modern performance observability tools. Details covered include the kernel BPF implementation: the verifier, JIT compilation, and the BPF execution environment; the BPF instruction set; different event sources; and how BPF is used by user space, using bpftrace programs as an example. This includes showing how bpftrace is compiled to LLVM IR and then BPF bytecode, and how per-event data and aggregated map data are fetched from the kernel.
In this session, we’ll review how previous efforts, including Netfilter, Berkley Packet Filter (BPF), Open vSwitch (OVS), and TC, approached the problem of extensibility. We’ll show you an open source solution available within the Red Hat Enterprise Linux kernel, where extending and merging some of the existing concepts leads to an extensible framework that satisfies the networking needs of datacenter and cloud virtualization.
Cilium - Container Networking with BPF & XDPThomas Graf
This talk demonstrates that programmability and performance does not require user space networking, it can be achieved in the kernel by generating BPF programs and leveraging the existing kernel subsystems. We will demo an early prototype which provides fast IPv6 & IPv4 connectivity to containers, container labels based security policy with avg cost O(1), and debugging and monitoring based on the per-cpu perf ring buffer. We encourage a lively discussion on the approach taken and next steps.
Using eBPF for High-Performance Networking in CiliumScyllaDB
The Cilium project is a popular networking solution for Kubernetes, based on eBPF. This talk uses eBPF code and demos to explore the basics of how Cilium makes network connections, and manipulates packets so that they can avoid traversing the kernel's built-in networking stack. You'll see how eBPF enables high-performance networking as well as deep network observability and security.
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
Keynote for Ubuntu Masters 2019 by Brendan Gregg, Netflix. Video https://www.youtube.com/watch?v=7pmXdG8-7WU&feature=youtu.be . "Extended BPF is a new type of software, and the first fundamental change to how kernels are used in 50 years. This new type of software is already in use by major companies: Netflix has 14 BPF programs running by default on all of its cloud servers, which run Ubuntu Linux. Facebook has 40 BPF programs running by default. Extended BPF is composed of an in-kernel runtime for executing a virtual BPF instruction set through a safety verifier and with JIT compilation. So far it has been used for software defined networking, performance tools, security policies, and device drivers, with more uses planned and more we have yet to think of. It is changing how we use and think about systems. This talk explores the past, present, and future of BPF, with BPF performance tools as a use case."
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
Container runtimes cause Linux to return to its original purpose: to serve applications interacting directly with the kernel. At the same time, the Linux kernel is traditionally difficult to change and its development process is full of myths. A new efficient in-kernel programming language called eBPF is changing this and allows everyone to extend existing kernel components or glue them together in new forms without requiring to change the kernel itself.
Building Network Functions with eBPF & BCCKernel TLV
eBPF (Extended Berkeley Packet Filter) is an in-kernel virtual machine that allows running user-supplied sandboxed programs inside of the kernel. It is especially well-suited to network programs and it's possible to write programs that filter traffic, classify traffic and perform high-performance custom packet processing.
BCC (BPF Compiler Collection) is a toolkit for creating efficient kernel tracing and manipulation programs. It makes use of eBPF.
BCC provides an end-to-end workflow for developing eBPF programs and supplies Python bindings, making eBPF programs much easier to write.
Together, eBPF and BCC allow you to develop and deploy network functions safely and easily, focusing on your application logic (instead of kernel datapath integration).
In this session, we will introduce eBPF and BCC, explain how to implement a network function using BCC, discuss some real-life use-cases and show a live demonstration of the technology.
About the speaker
Shmulik Ladkani, Chief Technology Officer at Meta Networks,
Long time network veteran and kernel geek.
Shmulik started his career at Jungo (acquired by NDS/Cisco) implementing residential gateway software, focusing on embedded Linux, Linux kernel, networking and hardware/software integration.
Some billions of forwarded packets later, Shmulik left his position as Jungo's lead architect and joined Ravello Systems (acquired by Oracle) as tech lead, developing a virtual data center as a cloud-based service, focusing around virtualization systems, network virtualization and SDN.
Recently he co-founded Meta Networks where he's been busy architecting secure, multi-tenant, large-scale network infrastructure as a cloud-based service.
eBPF (extended Berkeley Packet Filters) is a modern kernel technology that can be used to introduce dynamic tracing into a system that wasn't prepared or instrumented in any way. The tracing programs run in the kernel, are guaranteed to never crash or hang your system, and can probe every module and function -- from the kernel to user-space frameworks such as Node and Ruby.
In this workshop, you will experiment with Linux dynamic tracing first-hand. First, you will explore BCC, the BPF Compiler Collection, which is a set of tools and libraries for dynamic tracing. Many of your tracing needs will be answered by BCC, and you will experiment with memory leak analysis, generic function tracing, kernel tracepoints, static tracepoints in user-space programs, and the "baked" tools for file I/O, network, and CPU analysis. You'll be able to choose between working on a set of hands-on labs prepared by the instructors, or trying the tools out on your own test system.
Next, you will hack on some of the bleeding edge tools in the BCC toolkit, and build a couple of simple tools of your own. You'll be able to pick from a curated list of GitHub issues for the BCC project, a set of hands-on labs with known "school solutions", and an open-ended list of problems that need tools for effective analysis. At the end of this workshop, you will be equipped with a toolbox for diagnosing issues in the field, as well as a framework for building your own tools when the generic ones do not suffice.
Container Network Interface: Network Plugins for Kubernetes and beyondKubeAcademy
With the rise of modern containers comes new problems to solve – especially in networking. Numerous container SDN solutions have recently entered the market, each best suited for a particular environment. Combined with multiple container runtimes and orchestrators available today, there exists a need for a common layer to allow interoperability between them and the network solutions.
As different environments demand different networking solutions, multiple vendors and viewpoints look to a specification to help guide interoperability. Container Network Interface (CNI) is a specification started by CoreOS with the input from the wider open source community aimed to make network plugins interoperable between container execution engines. It aims to be as common and vendor-neutral as possible to support a wide variety of networking options — from MACVLAN to modern SDNs such as Weave and flannel.
CNI is growing in popularity. It got its start as a network plugin layer for rkt, a container runtime from CoreOS. Today rkt ships with multiple CNI plugins allowing users to take advantage of virtual switching, MACVLAN and IPVLAN as well as multiple IP management strategies, including DHCP. CNI is getting even wider adoption with Kubernetes adding support for it. Kubernetes accelerates development cycles while simplifying operations, and with support for CNI is taking the next step toward a common ground for networking. For continued success toward interoperability, Kubernetes users can come to this session to learn the CNI basics.
This talk will cover the CNI interface, including an example of how to build a simple plugin. It will also show Kubernetes users how CNI can be used to solve their networking challenges and how they can get involved.
KubeCon schedule link: http://sched.co/4VAo
Dataplane programming with eBPF: architecture and toolsStefano Salsano
eBPF is definitely a complex technology. Developing complex systems based on eBPF is challenging due to the intrinsic limitations of the model and the known shortcomings of the tool chain.
The learning curve of this technology is very steep and needs continuous coaching from experts. This tutorial will investigate:
What is eBPF and why it has gained a prominent position among the solutions to improve the packet processing performance in Linux/x86 nodes. We will shortly present some important use case scenarios for eBPF, like Kubernetes’ Cilium
The architecture of eBPF and its programming toolchain (e.g. bcc
What are the frameworks for eBPF programming, such as Polycube and InKeV.
How to make eBPF programming easier, more flexible and modular with HIKe/eCLAT
How to implement a custom application logic in eBPF with eCLAT using a python-like script
How to extend the framework and develop new modules
The OpenCSD library for decoding CoreSight traces has reached the point where it is ready to be integrated into applications. This session will present an overview of the state of the library, its interfaces and explore and demonstrate a sample integration with perf.
Install FD.IO VPP On Intel(r) Architecture & Test with Trex*Michelle Holley
This demo/lab will guide you to install and configure FD.io Vector Packet Processing (VPP) on Intel® Architecture (AI) Server. You will also learn to install TRex* on another AI Server to send packets to the VPP, and use some VPP commands to forward packets back to the TRex*.
Speaker: Loc Nguyen. Loc is a Software Application Engineer in Data Center Scale Engineering Team. Loc joined Intel in 2005, and has worked in various projects. Before joining the network group, Loc worked in High-Performance Computing area and supported Intel® Xeon Phi™ Product Family. His interest includes computer graphics, parallel computing, and computer networking.
Debugging is an essential part of Linux kernel development. In
user-space we have the support of the kernel and many debugging tools, tracking down a kernel bug, instead, can be very difficult if you don't know the proper methodologies. This talk will cover some techniques to understand how the kernel works, hunt down and fix kernel bugs in order to become a better kernel developer.
This work presents a P4 compiler backend targeting XDP, the eXpress Data Path. P4 is a domain-specific language describing how packets are processed by the data plane of a programmable network elements. XDP is designed for users who want programmability as well as performance.
https://github.com/williamtu/p4c-xdp/
Make Your Containers Faster: Linux Container Performance ToolsKernel TLV
If you look under the hood, Linux containers are just processes with some isolation features and resource quotas sprinkled on top. In this talk, we will apply modern Linux performance tools to container analysis: get high-level resource utilization on running containers with docker stats, htop, and nsenter; dig into high-CPU issues with perf; detect slow filesystem latency with BPF-based tools; and generate flame graphs of interesting event call stacks.
Sasha Goldshtein is the CTO of Sela Group, a Microsoft MVP and Regional Director, Pluralsight and O'Reilly author, and international consultant and trainer. Sasha is the author of two books and multiple online courses, and a prolific blogger. He is also an active open source contributor to projects focused on system diagnostics, performance monitoring, and tracing -- across multiple operating systems and runtimes. Sasha authored and delivered training courses on Linux performance optimization, event tracing, production debugging, mobile application development, and modern C++. Between his consulting engagements, Sasha speaks at international conferences world-wide.
You can find more details on the meetup page - https://www.meetup.com/Tel-Aviv-Yafo-Linux-Kernel-Meetup/events/245319189/
Comprehensive XDP Offload-handling the Edge CasesNetronome
While XDP is less complex to offload than other forms of kernel functionality due to the fact that it sits at the bottom of the stack, there are a number of items that lead to complexity when dealing with XDP offload. This talk will explore some of the ideas around how to implement these concepts as well as share some of the results we have seen while implementing offload on a 32 bit architecture.
DPDK Summit 2015 in San Francisco.
Intel's presentation by Keith Wiles.
For additional details and the video recording please visit www.dpdksummit.com.
The first version of eBPF hardware offload was merged into the Linux kernel in October 2016 and became part of Linux v4.9. For the last two years the project has been growing and evolving to integrate more closely with the core kernel infrastructure and enable more advanced use cases. This talk will explain the internals of the kernel architecture of the offload and how it allows seamless execution of unmodified eBPF datapaths in HW.
Code Yellow: Helping operations top-heavy teams the smart wayMichael Kehoe
We will look at the process for Code Yellow, the term we use for this process of "righting the ship," and discuss how to identify teams that are struggling. Through a look at three separate experiences, we will examine some of the root causes, what steps were taken, and how the engineering organization as a whole supports the process.
QConSF 2018: Building Production-Ready ApplicationsMichael Kehoe
In 2016, Susan Fowler released the 'Production Ready Microservices' book. This book sets an industry benchmark on explaining how microservices should be conceived, all the way through to documentation. So how does this translate actionable items? This session will explore how to expertly deploy your microservice to production. The audience will learn best practice for designing, deploying, monitoring & documenting application. By the end of the session, attendees should feel confident that they have the knowledge to deploy a service that will be reliable and scalable.
Helping operations top-heavy teams the smart wayMichael Kehoe
All engineering teams run into trouble from time to time. Alert fatigue, caused by technical debt or a failure to plan for growth, can quickly burn out SREs, overloading both development and operations with reactive work. Layer in the potential for communication problems between teams, and we can find ourselves in a place so troublesome we cannot easily see a path out. At times like this, our natural instinct as reliability engineers is to double down and fight through the issues. Often, however, we need to step back, assess the situation, and ask for help to put the team back on the road to success.
We will look at the process for Code Yellow, the term we use for this process of “righting the ship”, and discuss how to identify teams that are struggling. Through a look at three separate experiences, we will examine some of the root causes, what steps were taken, and how the engineering organization as a whole supports the process.
AllDayDevops: What the NTSB teaches us about incident management & postmortemsMichael Kehoe
The National Transport Safety Bureau is one of the most widely known Government bodies in the world. It’s their role to run into an incident, secure the scene and understand everything that happened. Given the important and unpredictable nature of their work, they have an extensive manual that sets out how incidents should be attended to and how the investigation should progress.
This session will detail how the NTSB’s approach to its work and the procedure that drives it, is transferable to us as incident responders. We’ll talk about the NTSB’s pre-incident preparation, incident notification, attending it, collecting information from the field and writing up a report and holding hearings. We’ll consistently draw parallels to IT incident management and how to create applicable process and procedures that mimic those of the NTSB.
Papers We Love Sept. 2018: 007: Democratically Finding The Cause of Packet DropsMichael Kehoe
Network failures continue to plague datacenter
operators as their symptoms may not have
direct correlation with where or why they occur. We
introduce 007, a lightweight, always-on diagnosis application
that can find problematic links and also
pinpoint problems for each TCP connection. 007 is
completely contained within the end host. During
its two month deployment in a tier-1 datacenter, it
detected every problem found by previously deployed
monitoring tools while also finding the sources of
other problems previously undetected.
What the NTSB teaches us about incident management & postmortemsMichael Kehoe
The National Transport Safety Bureau is one of the most widely known Government bodies in the world. It’s their role to run into an incident, secure the scene and understand everything that happened. Given the important and unpredictable nature of their work, they have an extensive manual that sets out how incidents should be attended to and how the investigation should progress.
This session will detail how the NTSB’s approach to its work and the procedure that drives it, is transferable to us as incident responders. We’ll talk about the NTSB’s pre-incident preparation, incident notification, attending it, collecting information from the field and writing up a report and holding hearings. We’ll consistently draw parallels to IT incident management and how to create applicable process and procedures that mimic those of the NTSB.
In 2016, Susan Fowler released the 'Production Ready Microservices' book. This book sets an industry benchmark on explaining how microservices should be conceived, all the way through to documentation. So how does this translate for Python applications? This session will explore how to expertly deploy your Python micro-service to production.
Helping operations top-heavy teams the smart wayMichael Kehoe
SRE teams can sometimes run into periods of time where they have staff burnout, technical debt or poor reliability. As SRE’s, we’re programmed to keep fighting through the issues, when sometimes it’s best to step back, assess the situation; and ask for help to put the team back on a successful pathway. This talk will discuss three separate experiences where teams needed some extra help to stabilize their services and oncall. We’ll discuss how to identify struggling teams; get the right assistance; and build a strategy for the team to succeed.
The Next Wave of Reliability EngineeringMichael Kehoe
In 2018, Site Reliability Engineering (SRE) will turn 15 years old. Since Google's inception of the term SRE, companies across the world have adopted a new operations mindset along with automation, deployment and monitoring principals. Most of what SRE does now is well established throughout the industry, so what is the next-wave of reliability principals and automation frameworks?
This session will dive into what the future holds for reliability engineering as a field and what will be the next areas of investment and improvement for reliability teams.
SF Chaos Engineering Meetup: Building Disaster Recovery via Resilience Engine...Michael Kehoe
How often have you heard stories where someone thought they had a disaster strategy, never tested it and it fails when you need it the most? LinkedIn has evolved from serving live traffic out of one data center to four data centers spread geographically. Serving live traffic from four data centers at the same time has taken the company from a disaster recovery model to a disaster avoidance model, where an unhealthy data center can be taken out of rotation and its traffic redistributed to the healthy data centers within minutes, with virtually no visible impact to users.
As LinkedIn transitioned from big monolithic applications to microservices, it was difficult to determine capacity constraints of individual services to handle extra load during disaster scenarios. Stress testing individual services using artificial load in a complex microservices architecture wasn’t sufficient to provide enough confidence in data center’s capacity. To solve this problem, LinkedIn moves live traffic to services site-wide by shifting traffic between datacenters to simulate a disaster every business day!
SRECon-Europe-2017: Reducing MTTR and False Escalations: Event Correlation at...Michael Kehoe
LinkedIn’s production stack is made up of over 900 applications, 2200 internal API’s and hundreds of databases. With any given application having many interconnected pieces, it is difficult to escalate to the right person in a timely manner.
In order to combat this, LinkedIn built an Event Correlation Engine that monitors service health and maps dependencies between services to correctly escalate to the SRE’s who own the unhealthy service.
We’ll discuss the approach we used in building a correlation engine and how it has been used at LinkedIn to reduce incident impact and provide better quality of life to LinkedIn’s oncall engineers.
All of us depend on the underlying network to be stable whether in the datacenter or in the cloud. We all have a basic knowledge of how traditional networks run, however in the past 10 years, we’ve moved to building redundant physical topologies in our networks, optimized the routing methodologies accordingly, moved into the cloud and gotten greater visibility and tuneables in the Linux kernel network stack. A lot has changed!
However, the way we troubleshoot the network in relation to the applications we support hasn’t adapted. In this session, we’ll review the progress that network infrastructure has made look at specific examples where traditional troubleshooting responses fail us and demonstrate our need to rethink our approach to making applications and the network interact harmoniously.
Velocity San Jose 2017: Traffic shifts: Avoiding disasters at scaleMichael Kehoe
LinkedIn has evolved from serving live traffic out of one data center to four data centers spread geographically. Serving live traffic from four data centers at the same time has taken the company from a disaster recovery model to a disaster avoidance model, where an unhealthy data center can be taken out of rotation and its traffic redistributed to the healthy data centers within minutes, with virtually no visible impact to users.
As LinkedIn transitioned from big monolithic applications to microservices, it was difficult to determine capacity constraints of individual services to handle extra load during disaster scenarios. Stress testing individual services using artificial load in a complex microservices architecture wasn’t sufficient to provide enough confidence in data center’s capacity. To solve this problem, LinkedIn leverages live traffic to stress services site-wide by shifting traffic to simulate a disaster load.
Michael Kehoe and Anil Mallapur discuss how LinkedIn uses traffic shifts to mitigate user impact by migrating live traffic between its data centers and stress test site-wide services for improved capacity handling and member experience.
Reducing MTTR and False Escalations: Event Correlation at LinkedInMichael Kehoe
LinkedIn’s production stack is made up of over 900 applications and over 2200 internal API’s. With any given application having many interconnected pieces, it is difficult to escalate to the right person in a timely manner.
In order to combat this, LinkedIn built an Event Correlation Engine that monitors service health and maps dependencies between services to correctly escalate to the SRE’s who own the unhealthy service.
We’ll discuss the approach we used in building a correlation engine and how it has been used at LinkedIn to reduce incident impact and provide better quality of life to LinkedIn’s oncall engineers.
LinkedIn serves traffic for its 467 million members from four data centers and multiple PoPs spread geographically around the world. Serving live traffic from from many places at the same time has taken us from a disaster recovery model to a disaster avoidance model where we can take an unhealthy data center or PoP out of rotation and redistribute its traffic to a healthy one within minutes, with virtually no visible impact to users. The geographical distribution of our infrastructure also allows us to optimize the end-user's experience by geo routing users to the best possible PoP and datacenter.
This talk provide details on how LinkedIn shifts traffic between its PoPs and data centers to provide the best possible performance and availability for its members. We will also touch on the complexities of performance in APAC, how IPv6 is helping our members and how LinkedIn stress tests data centers verify its disaster recovery capabilities.
Couchbase Connect 2016: Monitoring Production Deployments The Tools – LinkedInMichael Kehoe
Good monitoring can be the difference between a great night's sleep or hearing your phone go off at 2:37 a.m. because of a production outage. Couchbase Server provides a large number of metrics which can be overwhelming if you do not know the critical things to focus on or how to expose that information to your monitoring system. In this talk we will look at example production incidents, going in depth around specific things to monitor, and how this information can be used to find issues, work out root cause, and discover trends.
Using SaltStack to Auto Triage and Remediate Production SystemsMichael Kehoe
LinkedIn created an auto-remediation system named Nurse which leverages SaltStack and the CherryPy API to auto-triage and remediate issues with production systems. See how LinkedIn uses SaltStack with Nurse in its production environment and learn how to architect your own auto-triage and remediation system.
Student information management system project report ii.pdfKamal Acharya
Our project explains about the student management. This project mainly explains the various actions related to student details. This project shows some ease in adding, editing and deleting the student details. It also provides a less time consuming process for viewing, adding, editing and deleting the marks of the students.
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
Immunizing Image Classifiers Against Localized Adversary Attacksgerogepatton
This paper addresses the vulnerability of deep learning models, particularly convolutional neural networks
(CNN)s, to adversarial attacks and presents a proactive training technique designed to counter them. We
introduce a novel volumization algorithm, which transforms 2D images into 3D volumetric representations.
When combined with 3D convolution and deep curriculum learning optimization (CLO), itsignificantly improves
the immunity of models against localized universal attacks by up to 40%. We evaluate our proposed approach
using contemporary CNN architectures and the modified Canadian Institute for Advanced Research (CIFAR-10
and CIFAR-100) and ImageNet Large Scale Visual Recognition Challenge (ILSVRC12) datasets, showcasing
accuracy improvements over previous techniques. The results indicate that the combination of the volumetric
input and curriculum learning holds significant promise for mitigating adversarial attacks without necessitating
adversary training.
Hierarchical Digital Twin of a Naval Power SystemKerry Sado
A hierarchical digital twin of a Naval DC power system has been developed and experimentally verified. Similar to other state-of-the-art digital twins, this technology creates a digital replica of the physical system executed in real-time or faster, which can modify hardware controls. However, its advantage stems from distributing computational efforts by utilizing a hierarchical structure composed of lower-level digital twin blocks and a higher-level system digital twin. Each digital twin block is associated with a physical subsystem of the hardware and communicates with a singular system digital twin, which creates a system-level response. By extracting information from each level of the hierarchy, power system controls of the hardware were reconfigured autonomously. This hierarchical digital twin development offers several advantages over other digital twins, particularly in the field of naval power systems. The hierarchical structure allows for greater computational efficiency and scalability while the ability to autonomously reconfigure hardware controls offers increased flexibility and responsiveness. The hierarchical decomposition and models utilized were well aligned with the physical twin, as indicated by the maximum deviations between the developed digital twin hierarchy and the hardware.
5. Michael Kehoe
$ WHOAMI
• Sr Staff Site Reliability Engineer @
LinkedIn
• Production-SRE Team
• What I do:
• Disaster Recovery
• (Organizational) Visibility Engineering
• Incident Management
• Reliability Research
7. “BPF is a highly flexible and efficient virtual
machine-like construct in the Linux kernel
allowing to execute bytecode at various hook
points in a safe manner. It is used in a number
of Linux kernel subsystems, most prominently
networking, tracing and security (e.g.
sandboxing).”
C i l i u m
8. What is cBPF?
• cBPF – Classic BPF
• Also known as “Linux Packet Filtering”
• BPF was first introduced in 1992 by
Steven McCanne and Van Jacobson in
BSD
• Better known as the packet filter
language in tcpdump
9. What is cBPF?
• Network packet filtering, Seccomp
• Filter Expressions Bytecode
Interpret
• Small, in-kernel VM, Register based,
switch dispatch interpreter, few
instructions
• BPF uses a simple, non-shared buffer
model made possible by today’s larger
address space
11. History of BPF
• Before BPF, each OS (Sun, DEC, SGI
etc) had its own packet filtering API
• In 1993: Steven McCanne & Van
Jacobsen released a paper titled the
BSD Packet Filter (BPF)
• Implemented as “Linux Socket Filter” in
kernel 2.2
• While maintaining the BPF language (for
describing filters), uses a different
internal architecture
13. BPF (original) implementation
• Open a special-purpose
character-device, namely
/dev/bpfn, for dealing with
raw packets.
• Associate the previous
device with a network
interface by using the
ioctl(2) system call
https://www.tcpdump.org/papers/bpf-usenix93.pdf
14. BPF (original) implementation
• Set various BPF
parameters, (e.g. buffer
size, attach some BPF
filters ) This is done using
the ioctl(2) system call
• Read packets from the
kernel, or send raw packets,
by reading/writing to the
corresponding file descriptor
of /dev/bpf using
read(2)/write(2) system callshttps://www.tcpdump.org/papers/bpf-usenix93.pdf
15. BPF (LSF) implementation
• Utilizes sockets for
passing/receiving packets
to/from the kernel-space
• Filters are attached with the
setsockopt(2) system call
https://www.tcpdump.org/papers/bpf-usenix93.pdf
16. BPF (LSF) implementation
• Create a special-purpose
socket (i.e., PF_PACKET) 2
• Attach a BPF program to
the socket using the
setsockopt(2) system call
https://www.tcpdump.org/papers/bpf-usenix93.pdf
17. BPF (LSF) implementation
• Set the network interface to
promiscuous mode with
ioctl(2) (optionally)
• Read packets from the
kernel, or send raw
packets, by reading/writing
to the file descriptor of the
socket using
recvfrom(2)/sendto(2)
system calls
https://www.tcpdump.org/papers/bpf-usenix93.pdf
18. BPF (LSF) implementation
TCPDUMP EXAMPLE
https://static.sched.com/hosted_files/kccnceu19/b8/KubeCon-Europe-2019-Beatriz_Martinez_eBPF.pdf
25. What is eBPF?
• eBPF – extended Berkeley Packet Filter
• User-defined, sandboxed bytecode
executed by the kernel
• VM that implements a RISC-like
assembly language in kernel space
• All interactions between kernel/ user
space are done through eBPF “maps”
• eBPF does not allow loops
26. What is eBPF?
• Similar to LSF, but with the following
improvements:
• More registers, JIT compiler (flexible/ faster),
verifier
• Attach on Tracepoint, Kprobe, Uprobe, USDT
• In-kernel trace aggregation & filtering
• Control via bpf()
• Designed for general event processing within
the kernel
• All interactions between kernel/ user space
are done through eBPF “maps”
28. History of BPF
• 3.15: Optimization of BPF Interpreter’s instruction
set
• 3.18: Linux eBPF was released (bpf() syscall)
• 3.19: Socket supports, BPF Maps
• 4.1: Kprobe support
• 4.4: Perf events
• 4.7: Attach to tracepoints
• 4.8: XDP core
• 4.10: cgroups support
• 4.18: bpfilter released
http://hsdm.dorsal.polymtl.ca/system/files/eBPF-5May2017%20%281%29.pdf
32. (e)BPF Program Types
• prog_type determines the
subset of kernel helper
functions that the program
may call
• Determines the program
input (bpf_context)
https://www.tcpdump.org/papers/bpf-usenix93.pdf
33. (e)BPF Program Types
SOCKET-RELATED
• SOCKET_FILTER: Filtering actions (e.g. drop packets)
• SK_SKB: Access SKB and docket details with a view to redirect
SKB’s
• SOCK_OPS – Catch socket operations
• XDP: Allows access to packet data as early as possible (DDoS
mitigation/ Load-balancing)
https://www.tcpdump.org/papers/bpf-usenix93.pdf
34. (e)BPF Program Types
XDP
• XDP: Allows access to packet data as early as possible (DDoS
mitigation/ Load-balancing)
https://www.tcpdump.org/papers/bpf-usenix93.pdf
35. (e)BPF Program Types
KPROBES, TRACEPOINTS & PERF
• KPROBE – Instrument code in any kernel function
• TRACEPOINT – Instrument tracepoints in kernel code
• PERF_EVENT: Instrument software and hardware perf events
https://www.tcpdump.org/papers/bpf-usenix93.pdf
36. (e)BPF Program Types
CGROUPS
• CGROUP_SKB – Allow or deny network access on IP egress/
ingress
• CGROUP_SOCK – Allow or deny network access at various
socket-lreated events
• CGROUP_DEVICE – Determine if a device operation should be
permitted
https://www.tcpdump.org/papers/bpf-usenix93.pdf
37. (e)BPF Program Types
LIGHTWEIGHT TUNNELS
• LWT_IN – Examine inbound packets for lightweight tunnel de-
encapsulation
• LWT_OUT – Implement encapsulation tunnels for specific
destination routes
• LWT_XMIT – Allowed to modify content and prepend a L2 header
https://www.tcpdump.org/papers/bpf-usenix93.pdf
38. (e)BPF Program Types
TRAFFIC CONTROL
• SCHED_CLS: A network traffic-control classifier
• SCHED_ACT: A network traffic-control action
https://www.tcpdump.org/papers/bpf-usenix93.pdf
40. (e)BPF Maps
• Generic structure for
storage of different types of
data
• Allow sharing of data
between:
• eBPF kernel program
• Kernel and user-space
https://www.tcpdump.org/papers/bpf-usenix93.pdf
41. (e)BPF Maps
• Each map has the following
attributes:
• Type
• Max number of elements
• Key Size (bytes)
• Value Size (bytes)
http://man7.org/linux/man-pages/man2/bpf.2.html
42. (e)BPF Maps
• HASH - A hash table
• ARRAY- An array map, optimized for fast lookup speeds
• PROG_ARRAY - An array of FD’s corresponding to eBPF
programs
• PERCPU_ARRAY - A per-CPU array, used to implement
histograms
• PERF_EVENT_ARRAY - Stores pointers to struct perf_event
• CGROUP_ARRAY – Stores pointers to control groups
https://lwn.net/Articles/740157/
43. (e)BPF Maps
• LRU_HASH - A hash table that only retains the most recently
used items
• LRU_PER_CPU_HASH - A per-CPU hash table that only retains
the most recently used items
• LPM_TRIE - A longest-prefix match true, good for matching IP
addresses
• STACK_TRACE - Stores stack traces
• ARRAY_OF_MAPS - A map-in-map data structure
• HASH_OF_MAPS – A map-in-map data structurehttps://lwn.net/Articles/740157/
44. (e)BPF Maps
• DEVICE_MAP - For storing and looking up network device
references
• SOCKET_MAP – Stores and looks up sockets and allows
redirection
https://lwn.net/Articles/740157/
46. What
can BPF
be used
for?
1 Networking (e.g. load balancing)
2 Firewalls
3 DDOS mitigation
4 Profiling & Tracing
5 Container Security
6 Device Drivers
7 Chaos Engineering
47. What can BPF be used for
NETWORKING
• Load-balancing
• Katran (Facebook)
• General networking
• Cilium
• Extending the TCP stack
• Network Monitoring
• Flowmill
• Weaveworks
48. What can BPF be used for
FIREWALLS
• Bpfilter (Linux 4.18)
49. What can BPF be used for
DDOS MITIGATION
• Use of eBPF & XDP to perform infra-wide
DDoS mitigation
• Facebook
• Cloudflare
50. What can BPF be used for
PROFILE & TRACING
• Sysdig
• bpftrace
51. What can BPF be used for
SECURITY
• Cilium
• Seccomp BPF
52. What can BPF be used for
DEVICE DRIVERS
• eBPF provides a pseudo device driver
possible to extend this in multiple ways
53. What can BPF be used for
CHAOS ENGINEERING
• Use Cilium to inject latency, packet-loss,
L7 HTTP errors (via a Go extension)
55. Introduction to XDP
• XDP – eXpress Data Path
• High performance, programmable
network data path (IO Visor Project)
• Linux Kernels answer for DPDK
(Released in 4.8)
56. Introduction to XDP
• Features:
• Does not require specialized hardware
• Does not require kernel bypass
• Does not replace TCP/ IP stack
• Works with TCP/ IP stack with eBPF
57. Introduction to XDP
• XDP program runs as soon as the packet
gets to the network driver
• XDP program needs to edit with an
action:
• XDP_TX
• XDP_DROP
• XDP_PASS
59. Introduction to DPDK
• DPDK – Data Plane Development Kit
• Created in 2010 by Intel
• Collection of data plane libraries & NIC
drivers for fast packet processing
• Open-Source under Linux Foundation
• Support for multiple CPU architectures
62. XDP & DPDK
BENEFITS OF XDP
• No 3rd party code
• Option of busy polling or interrupt driven
networking
• Removes the need to:
• Allocate large pages
• Dedicated CPU’s
• Inject packets into the kernel from 3rd
party user space
• Define a new security model
https://www.iovisor.org/technology/xdp