eBPF Basics

(c|e)BPF Basics
Michael Kehoe
Sr Staff Site Reliability Engineer

Today’s
agenda
1 Introduction
2 cBPF Introduction, History & Implementation
3 eBPF Introduction, History & Implementation
5 eBPF Uses
6 XDP
7 DPDK

Michael Kehoe
$ WHOAMI
• Sr Staff Site Reliability Engineer @
LinkedIn
• Production-SRE Team
• What I do:
• Disaster Recovery
• (Organizational) Visibility Engineering
• Incident Management
• Reliability Research

(c)BPF Introduction,
History & Implementation

“BPF is a highly flexible and efficient virtual
machine-like construct in the Linux kernel
allowing to execute bytecode at various hook
points in a safe manner. It is used in a number
of Linux kernel subsystems, most prominently
networking, tracing and security (e.g.
sandboxing).”
Cilium

What is cBPF?
• cBPF – Classic BPF
• Also known as “Linux Socket Filtering” (LSF)
• BPF was first introduced in 1992 by
Steven McCanne and Van Jacobson in
BSD
• Better known as the packet filter
language in tcpdump

What is cBPF?
• Network packet filtering, Seccomp
• Filter Expressions → Bytecode →
Interpret
• Small, in-kernel VM, Register based,
switch dispatch interpreter, few
instructions
• BPF uses a simple, non-shared buffer
model made possible by today’s larger
address space
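
The expression → bytecode → interpret pipeline can be sketched in user space. The following is a minimal Python model, not the kernel interpreter: it supports only the three opcodes needed for the filter expression "ip", with opcode values taken from linux/filter.h. The names run_filter and ethernet_frame are hypothetical helpers introduced for this illustration.

```python
import struct

# Opcode values from the classic BPF instruction set (linux/filter.h).
BPF_LD_H_ABS = 0x28  # A = 16-bit big-endian load at absolute offset k
BPF_JEQ_K = 0x15     # if A == k skip jt instructions, else skip jf
BPF_RET_K = 0x06     # return constant k (bytes to accept, 0 = drop)

# Bytecode for the filter expression "ip" as (code, jt, jf, k) tuples.
IPV4_ONLY = [
    (BPF_LD_H_ABS, 0, 0, 12),    # load the EtherType field
    (BPF_JEQ_K, 0, 1, 0x0800),   # IPv4? fall through, otherwise skip one
    (BPF_RET_K, 0, 0, 0x40000),  # accept up to 262144 bytes
    (BPF_RET_K, 0, 0, 0),        # drop
]


def run_filter(program, packet):
    """Toy dispatch loop (hypothetical helper): returns bytes to accept."""
    acc = 0  # the accumulator register "A"
    pc = 0
    while pc < len(program):
        code, jt, jf, k = program[pc]
        pc += 1
        if code == BPF_LD_H_ABS:
            if k + 2 > len(packet):
                return 0  # an out-of-bounds load rejects the packet
            (acc,) = struct.unpack_from("!H", packet, k)
        elif code == BPF_JEQ_K:
            pc += jt if acc == k else jf
        elif code == BPF_RET_K:
            return k
        else:
            raise ValueError(f"opcode {code:#x} not modelled in this sketch")
    return 0


def ethernet_frame(ethertype, payload=b""):
    """Build a frame with zeroed MAC addresses (hypothetical helper)."""
    return bytes(12) + struct.pack("!H", ethertype) + payload
```

Calling run_filter(IPV4_ONLY, ethernet_frame(0x0800)) accepts the frame, while an ARP frame (EtherType 0x0806) falls through to the final "ret #0" and is dropped.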

History of BPF
• Before BPF, each OS (Sun, DEC, SGI
etc) had its own packet filtering API
• In 1993: Steven McCanne & Van
Jacobson released a paper titled “The
BSD Packet Filter” (BPF)
• Implemented as “Linux Socket Filter” in
kernel 2.2
• LSF keeps the BPF language (for
describing filters) but uses a different
internal architecture

BPF (original) implementation
• Open a special-purpose
character-device, namely
/dev/bpfn, for dealing with
raw packets.
• Associate the previous
device with a network
interface by using the
ioctl(2) system call
https://www.tcpdump.org/papers/bpf-usenix93.pdf

BPF (original) implementation
• Set various BPF
parameters (e.g. buffer
size, attach some BPF
filters). This is done using
the ioctl(2) system call
• Read packets from the
kernel, or send raw packets,
by reading/writing to the
corresponding file descriptor
of /dev/bpf using
read(2)/write(2) system calls
https://www.tcpdump.org/papers/bpf-usenix93.pdf

BPF (LSF) implementation
• Utilizes sockets for
passing/receiving packets
to/from the kernel-space
• Filters are attached with the
setsockopt(2) system call

• Create a special-purpose
socket (i.e., PF_PACKET)
• Attach a BPF program to
the socket using the
setsockopt(2) system call

• Set the network interface to
promiscuous mode with
ioctl(2) (optionally)
• Read packets from the
kernel, or send raw
packets, by reading/writing
to the file descriptor of the
socket using
recvfrom(2)/sendto(2)
system calls
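
The LSF steps above can be sketched in Python. This is a hedged illustration, not a reference implementation: SO_ATTACH_FILTER is not exported by Python's socket module, so the value 26 (from asm-generic/socket.h on Linux) is an assumption, and pack_filter and open_filtered_socket are hypothetical helper names. Opening the socket needs Linux and CAP_NET_RAW, so only the packing step can be exercised without privileges.

```python
import ctypes
import socket
import struct

SO_ATTACH_FILTER = 26  # assumption: asm-generic value on Linux
ETH_P_ALL = 0x0003     # receive every protocol

# Bytecode for the filter expression "ip" as (code, jt, jf, k) tuples.
IPV4_ONLY = [
    (0x28, 0, 0, 12),       # ldh [12]
    (0x15, 0, 1, 0x0800),   # jeq #0x800
    (0x06, 0, 0, 0x40000),  # ret #262144
    (0x06, 0, 0, 0),        # ret #0
]


def pack_filter(program):
    """Pack instructions as an array of struct sock_filter (8 bytes each)."""
    return b"".join(struct.pack("HBBI", *insn) for insn in program)


def open_filtered_socket(program):
    """Create a PF_PACKET socket and attach the filter (needs CAP_NET_RAW)."""
    sock = socket.socket(socket.AF_PACKET, socket.SOCK_RAW,
                         socket.htons(ETH_P_ALL))
    insns = ctypes.create_string_buffer(pack_filter(program))
    # struct sock_fprog { unsigned short len; struct sock_filter *filter; }
    fprog = struct.pack("HP", len(program), ctypes.addressof(insns))
    sock.setsockopt(socket.SOL_SOCKET, SO_ATTACH_FILTER, fprog)
    return sock
```

With sufficient privileges, sock = open_filtered_socket(IPV4_ONLY) followed by sock.recvfrom(65535) should return only frames that the filter accepted.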

TCPDUMP EXAMPLE
https://static.sched.com/hosted_files/kccnceu19/b8/KubeCon-Europe-2019-Beatriz_Martinez_eBPF.pdf

(e)BPF Introduction,
History & Implementation

(e)BPF
1 Introduction
2 History
3 Implementation
5 Program Types
6 Maps

“eBPF is Linux’s new superpower”
Gaurav Gupta

“eBPF does to Linux what JavaScript does to
HTML”
Brendan Gregg

“Run code in the kernel without having to write
a kernel module”
Liz Rice

“Stateful, programmable in-kernel decisions for
networking, tracing and security”
Suchakrapani Datt Sharma

What is eBPF?
• eBPF – extended Berkeley Packet Filter
• User-defined, sandboxed bytecode
executed by the kernel
• VM that implements a RISC-like
assembly language in kernel space
• All interactions between kernel/user
space are done through eBPF “maps”
• The verifier rejects unbounded loops
(bounded loops are accepted since Linux 5.3)
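
To make the "RISC-like assembly language" concrete, here is a small Python sketch that encodes two eBPF instructions by hand. Each instruction is a fixed 8-byte struct bpf_insn (opcode, two 4-bit register fields, a 16-bit offset and a 32-bit immediate). The sketch assumes a little-endian host, and the encode function is a hypothetical helper, not part of any library.

```python
import struct

# Opcode values from the eBPF instruction set.
BPF_ALU64_MOV_K = 0xB7  # dst = imm (64-bit move of an immediate)
BPF_JMP_EXIT = 0x95     # return from the program; r0 holds the result


def encode(opcode, dst=0, src=0, off=0, imm=0):
    """Encode one struct bpf_insn for a little-endian host (assumption)."""
    # dst_reg occupies the low nibble and src_reg the high nibble.
    return struct.pack("<BBhi", opcode, (src << 4) | dst, off, imm)


# Smallest useful program: "r0 = 0; exit".
program = encode(BPF_ALU64_MOV_K, dst=0, imm=0) + encode(BPF_JMP_EXIT)
```

The resulting 16 bytes are the kind of bytecode that a loader would hand to the kernel, where the verifier checks it before it is interpreted or JIT-compiled.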

What is eBPF?
• Similar to LSF, but with the following
improvements:
• More registers, JIT compiler (flexible/faster),
verifier
• Attach on Tracepoint, Kprobe, Uprobe, USDT
• In-kernel trace aggregation & filtering
• Control via bpf()
• Designed for general event processing within
the kernel
• All interactions between kernel/user space
are done through eBPF “maps”

History of BPF
• 3.15: Optimization of BPF Interpreter’s instruction
set
• 3.18: Linux eBPF was released (bpf() syscall)
• 3.19: Socket support, BPF Maps
• 4.1: Kprobe support
• 4.4: Perf events
• 4.7: Attach to tracepoints
• 4.8: XDP core
• 4.10: cgroups support
• 4.18: bpfilter released
http://hsdm.dorsal.polymtl.ca/system/files/eBPF-5May2017%20%281%29.pdf

What is eBPF?
http://hsdm.dorsal.polymtl.ca/system/files/eBPF-5May2017%20%281%29.pdf

(e)BPF Program Types
• prog_type determines the
subset of kernel helper
functions that the program
may call
• Determines the program
input (bpf_context)

SOCKET-RELATED
• SOCKET_FILTER: Filtering actions (e.g. drop packets)
• SK_SKB: Access SKB and socket details with a view to redirect
SKBs
• SOCK_OPS – Catch socket operations

XDP
• XDP: Allows access to packet data as early as possible (DDoS
mitigation/ Load-balancing)

KPROBES, TRACEPOINTS & PERF
• KPROBE – Instrument code in any kernel function
• TRACEPOINT – Instrument tracepoints in kernel code
• PERF_EVENT: Instrument software and hardware perf events

CGROUPS
• CGROUP_SKB – Allow or deny network access on IP egress/
ingress
• CGROUP_SOCK – Allow or deny network access at various
socket-related events
• CGROUP_DEVICE – Determine if a device operation should be
permitted

LIGHTWEIGHT TUNNELS
• LWT_IN – Examine inbound packets for lightweight tunnel
decapsulation
• LWT_OUT – Implement encapsulation tunnels for specific
destination routes
• LWT_XMIT – Allowed to modify content and prepend an L2 header

TRAFFIC CONTROL
• SCHED_CLS: A network traffic-control classifier
• SCHED_ACT: A network traffic-control action

(e)BPF Maps
• Generic structure for
storage of different types of
data
• Allow sharing of data
between:
• eBPF kernel program
• Kernel and user-space

(e)BPF Maps
• Each map has the following
attributes:
• Type
• Max number of elements
• Key Size (bytes)
• Value Size (bytes)
http://man7.org/linux/man-pages/man2/bpf.2.html
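
The four attributes can be illustrated with a toy user-space model. The class below is not the kernel API: ToyHashMap is a hypothetical name, and the size check raising ValueError is an artefact of this sketch (the kernel simply reads key_size and value_size bytes). The E2BIG error for a full map follows the bpf(2) man page.

```python
import errno


class ToyHashMap:
    """User-space model of a fixed-geometry hash map (illustration only)."""

    def __init__(self, key_size, value_size, max_entries):
        self.key_size = key_size        # key size in bytes
        self.value_size = value_size    # value size in bytes
        self.max_entries = max_entries  # maximum number of elements
        self._data = {}

    def update(self, key, value):
        # Sketch-only check: keys and values are fixed-size byte strings.
        if len(key) != self.key_size or len(value) != self.value_size:
            raise ValueError("key or value does not match the map geometry")
        if key not in self._data and len(self._data) >= self.max_entries:
            raise OSError(errno.E2BIG, "map reached max_entries")
        self._data[bytes(key)] = bytes(value)

    def lookup(self, key):
        # Returns None when the key is absent.
        return self._data.get(bytes(key))

    def delete(self, key):
        if bytes(key) not in self._data:
            raise OSError(errno.ENOENT, "no such element")
        del self._data[bytes(key)]
```

A typical use would be a 4-byte key (for example a PID packed with struct.pack("I", pid)) and an 8-byte counter as the value, with max_entries bounding memory use up front.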

(e)BPF Maps
• HASH - A hash table
• ARRAY - An array map, optimized for fast lookup speeds
• PROG_ARRAY - An array of FDs corresponding to eBPF
programs
• PERCPU_ARRAY - A per-CPU array, used to implement
histograms
• PERF_EVENT_ARRAY - Stores pointers to struct perf_event
• CGROUP_ARRAY – Stores pointers to control groups
https://lwn.net/Articles/740157/

(e)BPF Maps
• LRU_HASH - A hash table that only retains the most recently
used items
• LRU_PERCPU_HASH - A per-CPU hash table that only retains
the most recently used items
• LPM_TRIE - A longest-prefix match trie, good for matching IP
addresses
• STACK_TRACE - Stores stack traces
• ARRAY_OF_MAPS - A map-in-map data structure
• HASH_OF_MAPS – A map-in-map data structure
https://lwn.net/Articles/740157/
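
Longest-prefix matching, the lookup that LPM_TRIE provides, can be sketched with Python's ipaddress module. This toy table uses a linear scan rather than a trie, so it only illustrates the lookup semantics and not the kernel data structure. ToyLpmTable is a hypothetical name.

```python
import ipaddress


class ToyLpmTable:
    """Longest-prefix-match lookup via linear scan (illustration only)."""

    def __init__(self):
        self._routes = {}

    def update(self, cidr, value):
        self._routes[ipaddress.ip_network(cidr)] = value

    def lookup(self, address):
        addr = ipaddress.ip_address(address)
        best = None
        for net, value in self._routes.items():
            # Only compare addresses and prefixes of the same IP version.
            if addr.version == net.version and addr in net:
                # Keep the matching prefix with the longest mask.
                if best is None or net.prefixlen > best[0].prefixlen:
                    best = (net, value)
        return None if best is None else best[1]
```

With prefixes 10.0.0.0/8 and 10.1.0.0/16 both present, a lookup of 10.1.2.3 returns the value stored for the /16 because it is the more specific match.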

(e)BPF Maps
• DEVMAP - For storing and looking up network device
references
• SOCKMAP – Stores and looks up sockets and allows
redirection
https://lwn.net/Articles/740157/

What
can BPF
be used
for?
1 Networking (e.g. load balancing)
2 Firewalls
3 DDoS mitigation
4 Profiling & Tracing
5 Container Security
6 Device Drivers
7 Chaos Engineering

What can BPF be used for
NETWORKING
• Load-balancing
• Katran (Facebook)
• General networking
• Cilium
• Extending the TCP stack
• Network Monitoring
• Flowmill
• Weaveworks

FIREWALLS
• bpfilter (Linux 4.18)

DDOS MITIGATION
• Use of eBPF & XDP to perform infra-wide
DDoS mitigation
• Facebook
• Cloudflare

PROFILING & TRACING
• Sysdig
• bpftrace

SECURITY
• Cilium
• Seccomp BPF

DEVICE DRIVERS
• eBPF provides a pseudo device driver →
possible to extend this in multiple ways

CHAOS ENGINEERING
• Use Cilium to inject latency, packet-loss,
L7 HTTP errors (via a Go extension)

Introduction to XDP
• XDP – eXpress Data Path
• High performance, programmable
network data path (IO Visor Project)
• The Linux kernel’s answer to DPDK
(released in 4.8)

Introduction to XDP
• Features:
• Does not require specialized hardware
• Does not require kernel bypass
• Does not replace TCP/IP stack
• Works with TCP/IP stack with eBPF

Introduction to XDP
• XDP program runs as soon as the packet
gets to the network driver
• XDP program needs to exit with an
action:
• XDP_TX
• XDP_DROP
• XDP_PASS
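
The "inspect the packet, then exit with an action" pattern can be simulated in user space. Real XDP programs are written in restricted C, compiled to eBPF and run in the driver. The Python sketch below only mirrors the decision logic. The action values come from enum xdp_action, while BLOCKED_UDP_PORT, xdp_decide and udp_frame are hypothetical names chosen for this example.

```python
import struct

# Action codes from enum xdp_action in linux/bpf.h.
XDP_DROP, XDP_PASS, XDP_TX = 1, 2, 3

ETH_P_IP = 0x0800
IPPROTO_UDP = 17
BLOCKED_UDP_PORT = 9999  # hypothetical port used for this illustration


def xdp_decide(frame):
    """Return an XDP action for a raw Ethernet frame (simulation only)."""
    # Bounds checks come first, as the eBPF verifier would demand.
    if len(frame) < 14 + 20:
        return XDP_PASS
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    if ethertype != ETH_P_IP:
        return XDP_PASS
    ihl = (frame[14] & 0x0F) * 4  # IPv4 header length in bytes
    if ihl < 20 or frame[14 + 9] != IPPROTO_UDP:
        return XDP_PASS
    udp = 14 + ihl
    if len(frame) < udp + 8:
        return XDP_PASS
    (dport,) = struct.unpack_from("!H", frame, udp + 2)
    return XDP_DROP if dport == BLOCKED_UDP_PORT else XDP_PASS


def udp_frame(dport):
    """Build a minimal Ethernet/IPv4/UDP frame (hypothetical test helper)."""
    eth = bytes(12) + struct.pack("!H", ETH_P_IP)
    ip = bytes([0x45]) + bytes(8) + bytes([IPPROTO_UDP]) + bytes(10)
    udp = struct.pack("!HHHH", 12345, dport, 8, 0)
    return eth + ip + udp
```

Frames to the blocked port are dropped before the rest of the stack sees them, and everything else is passed up. XDP_TX, unused here, would send the frame back out of the interface it arrived on.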

Introduction to DPDK
• DPDK – Data Plane Development Kit
• Created in 2010 by Intel
• Collection of data plane libraries & NIC
drivers for fast packet processing
• Open-Source under Linux Foundation
• Support for multiple CPU architectures

DPDK Architecture
https://core.dpdk.org/

XDP & DPDK
BENEFITS OF XDP
• No 3rd party code
• Option of busy polling or interrupt driven
networking
• Removes the need to:
• Allocate large pages
• Dedicate CPUs
• Inject packets into the kernel from 3rd
party user space
• Define a new security model
https://www.iovisor.org/technology/xdp
