BPF at Facebook
Alexei Starovoitov
Agenda
1) kernel upgrades
2) BPF in the datacenter
3) BPF evolution
4) where you can help
Kernel upgrades in FB
- "Upstream first" philosophy.
- Close to zero private patches.
- As soon as practical, the kernel team:
  . takes the latest upstream kernel
  . stabilizes it
  . rolls it across the fleet
  . backports relevant features until the cycle repeats
- It used to take months to upgrade. Now it takes a few weeks, or days when necessary.
Move fast.
Kernel version by count
- As of September 2019.
- It will be different tomorrow.
- One kernel version on most servers.
- Many 4.16.x flavors due to the long tail.
- Challenging environment for user space.
- Even more challenging for BPF-based tracing.
Do not break user space
- Must not change kernel ABI.
- Must not cause performance regressions.
- Must not change user space behavior.
- Investigate all differences.
  . Either unexpected improvement or regression.
  . Teamwork is necessary to find the root cause.
"The first rule" of kernel programming... multiplied by FB scale.
Do you use BPF?
- Run this command on your laptop:

    sudo bpftool prog show | grep name | wc -l

- What number does it print?
- Don't have bpftool? Run this:

    ls -la /proc/*/fd | grep bpf-prog | wc -l
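The /proc fallback above can also be scripted. The following is a minimal sketch, assuming a Linux /proc layout in which each open BPF program shows up as a file descriptor whose link target contains "bpf-prog"; the function name count_bpf_prog_fds and its proc_root parameter are hypothetical and exist only for this illustration. It counts file descriptors, not unique programs, so the number can differ from what bpftool reports.

```python
import os


def count_bpf_prog_fds(proc_root="/proc"):
    """Count fds under proc_root/<pid>/fd whose link target mentions bpf-prog."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():  # only per-process directories such as /proc/1234
            continue
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:  # process exited, or we lack permission to look
            continue
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:  # fd was closed between listdir and readlink
                continue
            if "bpf-prog" in target:
                count += 1
    return count
```

Calling count_bpf_prog_fds() as root covers all processes; without root it only sees the fd directories the caller is allowed to read.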
BPF at Facebook
- ~40 BPF programs active on every server.
- ~100 BPF programs loaded on demand for short periods of time.
- Mainly used by daemons that run on every server.
- Many teams are writing and deploying them.
BPF program distribution by type
Kernel team is involved in lots of investigations.
BPF? BPF? BPF? BPF? BPF?
- It's not true, but I often feel this way :)
Example 1: packet capture daemon
- This daemon uses a SCHED_CLS BPF program.
- The program is attached to TC ingress and runs on every packet.
- With a probability of 1 in a million it calls bpf_perf_event_output(skb).
- On the new kernel this daemon causes a 1% CPU regression.
- Disabling the daemon makes the regression go away.
- Is it BPF?
Example 1: packet capture daemon (resolved)
- It turned out the daemon also loads a KPROBE BPF program for unrelated logic.
- The kprobe-d function doesn't exist in the new kernel.
- The daemon decides that BPF is unusable and falls back to NFLOG-based packet capture.
- NFLOG loads iptables modules, and that causes the 1% CPU regression.
Takeaway for developers
- kprobe is not a stable ABI.
- Every time kernel developers change the code some kernel developers pay the price.
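One way to make a vanished kprobe target visible is to check the symbol list before attaching. The sketch below is an illustration under stated assumptions, not what the daemon in the talk does: it takes text in the /proc/kallsyms line format ("address type symbol [module]") and reports which wanted symbols are present. The function name kprobe_targets_present and the sample symbol names are hypothetical. A symbol being listed does not guarantee that a kprobe can attach to it.

```python
def kprobe_targets_present(kallsyms_text, wanted):
    """Return {symbol: bool} for each wanted symbol, given kallsyms-style text."""
    present = set()
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # Each line looks like: "<address> <type> <symbol> [module]"
        if len(parts) >= 3:
            present.add(parts[2])
    return {name: name in present for name in wanted}


# Hypothetical sample: one symbol survived the kernel upgrade, one did not.
SAMPLE = """\
ffffffff81000000 T _text
ffffffff81234560 T vfs_read
ffffffffc0001000 t helper_fn\t[some_module]
"""
status = kprobe_targets_present(SAMPLE, ["vfs_read", "removed_function"])
missing = [name for name, found in status.items() if not found]
```

Reporting the missing names lets the caller disable only the logic that depends on them, instead of treating BPF as a whole as unusable.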
Example 2: performance profiling daemon
- The daemon uses BPF tracepoints and kprobes in the scheduler and task execution.
- It collects kernel and user stack traces, walks Python user stacks inside the BPF program, and aggregates across the fleet.
- This daemon is the #1 tool for performance analysis.
- On the new kernel it causes a 2% CPU regression.
- Higher softirq times. Slower user apps.
- Disabling the daemon makes the regression go away.
- Is it BPF?
Example 2: performance profiling daemon (resolved)
- It turned out that simply installing a kprobe makes the 5.2 kernel remap kernel .text from 2M huge pages into 4k pages.
- That caused more I-TLB misses.
- That made BPF execution in the kernel slower, and user space as well.
Takeaway
- kprobe is an essential part of kernel functionality.
Example 3: security monitoring daemon
- The daemon uses 3 kprobes and 1 kretprobe.
- Its BPF program code is just over 200 lines of C.
- It runs with low priority.
- It wakes up every few seconds, consumes 0.01% of one CPU and 0.01% of memory.
- Yet it causes a large P99 latency regression for a database server that runs on all other CPUs and consumes many GB of memory.
- Throughput of the database is not affected.
- Disabling the daemon makes the regression go away.
- Is it BPF?
Investigation
Facts:
- Occasionally memcpy() in the database gets stuck for 1/4 of a second.
- The daemon reads /proc/pid/environ only rarely.
Guesses:
- Is the database waiting for the kernel to handle a page fault?
- While the kernel is blocked on mmap_sem?
- But "top" and others read /proc far more often. Why is that daemon special?
- Dive into kernel code:
    fs/proc/base.c
      environ_read()
        access_remote_vm()
          down_read(&mm->mmap_sem)
funclatency.py - Time functions and print latency as a histogram

# funclatency.py -d100 -m __access_remote_vm
Tracing 1 functions for "__access_remote_vm"... Hit Ctrl-C to end.
     msecs               : count     distribution
         0 -> 1          : 21938    |****************************************|
         2 -> 3          : 0        |                                        |
         4 -> 7          : 0        |                                        |
         8 -> 15         : 0        |                                        |
        16 -> 31         : 0        |                                        |
        32 -> 63         : 0        |                                        |
        64 -> 127        : 0        |                                        |
       128 -> 255        : 7        |                                        |
       256 -> 511        : 3        |                                        |
Detaching...

This histogram shows that over the last 100 seconds there were 3 events where reading /proc took more than 256 ms.
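The buckets in the histogram above are powers of two. As a minimal sketch of how to read it, the helper below maps a latency in milliseconds to its bucket and sums the counts at or above a threshold. The names log2_bucket, events_at_or_above and HISTOGRAM are hypothetical; the counts are copied from the output above.

```python
def log2_bucket(value_ms):
    """Return the (low, high) bounds of the power-of-two bucket holding value_ms."""
    value = int(value_ms)
    if value < 2:
        return (0, 1)
    low = 1 << (value.bit_length() - 1)  # largest power of two <= value
    return (low, 2 * low - 1)


# (low, high, count) rows copied from the funclatency output above.
HISTOGRAM = [
    (0, 1, 21938), (2, 3, 0), (4, 7, 0), (8, 15, 0), (16, 31, 0),
    (32, 63, 0), (64, 127, 0), (128, 255, 7), (256, 511, 3),
]


def events_at_or_above(histogram, threshold_ms):
    """Sum the counts of all buckets whose lower bound is at least threshold_ms."""
    return sum(count for low, _high, count in histogram if low >= threshold_ms)


slow_reads = events_at_or_above(HISTOGRAM, 256)  # the 3 events called out above
```

For instance, a 399 ms call lands in the 256 -> 511 bucket, and a 200 ms call lands in 128 -> 255.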
funcslower.py - Dump kernel and user stack when a given kernel function was slower than a threshold

# funcslower.py -m 200 -KU __access_remote_vm
Tracing function calls slower than 200 ms... Ctrl+C to quit.
COMM            PID     LAT(ms)  RVAL  FUNC
security_daemon 1720415 399.02   605   __access_remote_vm
    kretprobe_trampoline
    read
    facebook::...::readBytes(folly::File const&)
    ...

This was the kernel+user stack trace when our security daemon was stuck in sys_read() for 399 ms.
Yes. It's that daemon causing database latency spikes.
Collect more stack traces with offwaketime.py ...
finish_task_switch
__schedule
preempt_schedule_common
_cond_resched
__get_user_pages
get_user_pages_remote
__access_remote_vm
proc_pid_cmdline_read
__vfs_read
vfs_read
sys_read
do_syscall_64
read
facebook::...::readBytes(folly::File const&)
The task reading /proc/pid/cmdline can go to sleep without releasing the mmap_sem of that pid's mm.
A page fault in that pid will be blocked until this task finishes reading /proc.
Root cause
- The daemon uses 3 kprobes and 1 kretprobe.
- Its BPF program code is just over 200 lines of C.
- It runs with low priority.
- It wakes up every few seconds, consumes 0.01% of one CPU and 0.01% of memory.
A low CPU quota for the daemon, coupled with aggressive sysctl kernel.sched_* tweaks, was responsible.
Takeaway
- BPF tracing tools are the best way to tackle BPF regressions.
Another kind of BPF investigations
- Many kernels run in the datacenter.
- Daemons (and their BPF programs) need to work on all of them.
- A BPF program works on the developer server, but fails in production.
On developer server
On production server
- Embedded LLVM is safer than standalone LLVM.
- LLVM takes 70 MB on disk and 20 MB of memory at steady state. More at peak.
- Dependency on system kernel headers. Subsystem internal headers are missing.
- Compilation errors are captured at runtime.
- Compilation on a production server disturbs the main workload.
- And the other way around: the main workload disturbs compilation. LLVM may take minutes to compile 100 lines of C.
BPF CO-RE (Compile Once Run Everywhere)
- Compile the BPF program into a "Run Everywhere" .o file (BPF assembly + extra).
- Test it on the developer server against many "kernels".
- libbpf adjusts the .o file on the production server.
- No compilation on the production server.
BTF (BPF Type Format)
- BTF describes types, relocations, source code.
- LLVM compiles BPF program C code into BPF assembly and BTF.
- gcc+pahole compiles kernel C code into the vmlinux binary and BTF.
- libbpf compares the program's BTF with vmlinux's BTF and adjusts the BPF assembly before loading it into the kernel.
- Developers can compile and test for kprobe and kernel data structure compatibility on a single server at build time instead of on N servers at run time.
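The "compare and adjust" step can be pictured with a toy model. The sketch below is not libbpf's API and not the BTF format; it only illustrates the idea that each memory access records which struct field it refers to, so a loader can patch the offset to match the layout of the kernel it runs on. The function name relocate_field_accesses, the instruction dictionaries, and the offsets 104 and 112 are all invented for this illustration.

```python
def relocate_field_accesses(instructions, target_layout):
    """Return a copy of instructions with each recorded field access patched
    to the offset that the field has in target_layout."""
    patched = []
    for insn in instructions:
        insn = dict(insn)  # leave the compiled program untouched
        field = insn.get("field")  # e.g. ("sk_buff", "len"), or None
        if field is not None:
            struct_name, member = field
            try:
                insn["offset"] = target_layout[struct_name][member]
            except KeyError:
                raise LookupError(
                    f"{struct_name}.{member} not found in target kernel layout"
                )
        patched.append(insn)
    return patched


# Program compiled against a layout where the field sits at a made-up offset 104.
program = [
    {"op": "ldx", "offset": 104, "field": ("sk_buff", "len")},
    {"op": "exit"},
]
# Made-up layout of the production kernel: the same field moved to offset 112.
production_layout = {"sk_buff": {"len": 112}}
adjusted = relocate_field_accesses(program, production_layout)
```

A missing field raises an error at load time, which mirrors the point above: incompatibilities surface when the program is matched against a kernel's type information, and a developer can run that check against many kernels' type information on one machine.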
trace_kfree_skb today
- PARM2 typo will "work" too.
- Six bpf_probe_read() calls.
- Any type cast is allowed.
- clang -I/path_to_kernel_headers/ -I/path_to_user/
trace_kfree_skb with CO-RE
- Works with any raw tracepoint.
- Same kernel helper as in networking programs.
- If skb and location are accidentally swapped, the verifier will catch it.
- Define kernel structs by hand instead of including vmlinux.h.
BPF verifier giant leaps in 2019
- Bounded loops
- bpf_spin_lock
- Dead code elimination
- Scalar precision tracking
BPF verifier is smarter than llvm
- The verifier removes dead code after it was optimized by llvm -O2.
- Developers cannot cheat by type casting an integer to a pointer or removing 'const'.
- LLVM's goal -> optimize the code.
- The verifier's goal -> analyze the code.
- Different takes on data flow analysis.
- The verifier's data flow analysis must be precise.
BPF verifier 2.0
- The verifier cannot tell what the "r2 = *(u64*)(r1 + 8)" assembly instruction is doing.
- Unless r1 is a builtin type and +8 is checked by is_valid_access().
- The verifier cannot trust user space hints to verify BPF program assembly code.
- In-kernel BTF is trusted.
- With BTF the verifier's data flow analysis enters a new realm of possibilities.
- Every program type implements its own is_valid_access() and convert_ctx_access().
- #1 cause of code bloat.
- Bug-prone code.
- None of it is needed with BTF.
- Will be able to remove 1000s of lines.*
* when BTF kconfig is on.
How you can help
We need you
- to hack.
- to talk.
- to invent.
BPF development is 100% use case driven.
Your requests, complaints, and success stories are shaping the future kernel.
JUST DO BPF.

Corporate Management | Session 3 of 3 | Tendenci AMS
Tendenci - The Open Source AMS (Association Management Software)
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Globus
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
Philip Schwarz
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Globus
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
Globus
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
Ortus Solutions, Corp
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
Tier1 app
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
Globus
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Anthony Dahanne
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
rickgrimesss22
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
AMB-Review
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
Globus
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
vrstrong314
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Shahin Sheidaei
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
Juraj Vysvader
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Jay Das
 

Recently uploaded (20)

Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...Developing Distributed High-performance Computing Capabilities of an Open Sci...
Developing Distributed High-performance Computing Capabilities of an Open Sci...
 
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
Field Employee Tracking System| MiTrack App| Best Employee Tracking Solution|...
 
Corporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMSCorporate Management | Session 3 of 3 | Tendenci AMS
Corporate Management | Session 3 of 3 | Tendenci AMS
 
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
Exploring Innovations in Data Repository Solutions - Insights from the U.S. G...
 
A Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of PassageA Sighting of filterA in Typelevel Rite of Passage
A Sighting of filterA in Typelevel Rite of Passage
 
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data AnalysisProviding Globus Services to Users of JASMIN for Environmental Data Analysis
Providing Globus Services to Users of JASMIN for Environmental Data Analysis
 
Enhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdfEnhancing Research Orchestration Capabilities at ORNL.pdf
Enhancing Research Orchestration Capabilities at ORNL.pdf
 
Into the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdfInto the Box 2024 - Keynote Day 2 Slides.pdf
Into the Box 2024 - Keynote Day 2 Slides.pdf
 
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERRORTROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
TROUBLESHOOTING 9 TYPES OF OUTOFMEMORYERROR
 
First Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User EndpointsFirst Steps with Globus Compute Multi-User Endpoints
First Steps with Globus Compute Multi-User Endpoints
 
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
Paketo Buildpacks : la meilleure façon de construire des images OCI? DevopsDa...
 
Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024Globus Connect Server Deep Dive - GlobusWorld 2024
Globus Connect Server Deep Dive - GlobusWorld 2024
 
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptxTop Features to Include in Your Winzo Clone App for Business Growth (4).pptx
Top Features to Include in Your Winzo Clone App for Business Growth (4).pptx
 
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdfDominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
Dominate Social Media with TubeTrivia AI’s Addictive Quiz Videos.pdf
 
How to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good PracticesHow to Position Your Globus Data Portal for Success Ten Good Practices
How to Position Your Globus Data Portal for Success Ten Good Practices
 
top nidhi software solution freedownload
top nidhi software solution freedownloadtop nidhi software solution freedownload
top nidhi software solution freedownload
 
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
Gamify Your Mind; The Secret Sauce to Delivering Success, Continuously Improv...
 
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
In 2015, I used to write extensions for Joomla, WordPress, phpBB3, etc and I ...
 
Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024Globus Compute wth IRI Workflows - GlobusWorld 2024
Globus Compute wth IRI Workflows - GlobusWorld 2024
 
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdfEnhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
Enhancing Project Management Efficiency_ Leveraging AI Tools like ChatGPT.pdf
 

Kernel Recipes 2019 - BPF at Facebook

  • 3. Agenda
      1) kernel upgrades
      2) BPF in the datacenter
      3) BPF evolution
      4) where you can help
  • 4. Kernel upgrades in FB
      - "Upstream first" philosophy.
      - Close to zero private patches.
      - As soon as practical, the kernel team:
          . takes the latest upstream kernel
          . stabilizes it
          . rolls it across the fleet
          . backports relevant features until the cycle repeats
      - It used to take months to upgrade. Now a few weeks. Days when necessary.
      - Move fast.
  • 5. Kernel version by count
      - As of September 2019.
      - It will be different tomorrow.
      - One kernel version on most servers.
      - Many 4.16.x flavors due to long tail.
      - Challenging environment for user space.
      - Even more challenging for BPF-based tracing.
  • 6. Do not break user space
      - Must not change kernel ABI.
      - Must not cause performance regressions.
      - Must not change user space behavior.
      - Investigate all differences.
          . Either unexpected improvement or regression.
          . Teamwork is necessary to find the root cause.
      "The first rule" of kernel programming... multiplied by FB scale.
  • 7. Do you use BPF?
      - Run this command on your laptop:
          sudo bpftool prog show | grep name | wc -l
      - What number does it print?
      - Don't have bpftool? Run this:
          ls -la /proc/*/fd | grep bpf-prog | wc -l
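The fallback `ls | grep` pipeline on slide 7 can also be approximated in a few lines of Python. This is a minimal sketch, not part of the talk: the function name `count_bpf_prog_fds` and the `proc_root` parameter are assumptions made for illustration and testing. It counts file descriptor links whose target contains `bpf-prog`, and it only visits numeric `/proc/<pid>` directories, so it may report a lower number than the shell glob, which also matches entries such as `/proc/self`.

```python
import os

def count_bpf_prog_fds(proc_root="/proc"):
    """Count fd symlinks whose target mentions 'bpf-prog' (hypothetical helper)."""
    count = 0
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only per-process directories such as /proc/1234
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we lack permission to look
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed while we were scanning
            if "bpf-prog" in target:
                count += 1
    return count

if __name__ == "__main__":
    print(count_bpf_prog_fds())
```

Run it as root to see descriptors held by other users' processes; without privileges the count covers only processes you own.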
  • 8. BPF at Facebook
      - ~40 BPF programs active on every server.
      - ~100 BPF programs loaded on demand for short periods of time.
      - Mainly used by daemons that run on every server.
      - Many teams are writing and deploying them.
  • 10. Kernel team is involved in lots of investigations.
      - BPF? BPF? BPF? BPF? BPF?
      - It's not true, but I often feel this way :)
  • 11. Example 1: packet capture daemon
      - This daemon uses a SCHED_CLS BPF program.
      - The program is attached to TC ingress and runs on every packet.
      - With a one-in-a-million probability it calls bpf_perf_event_output(skb).
      - On the new kernel this daemon causes a 1% CPU regression.
      - Disabling the daemon makes the regression go away.
      - Is it BPF?
  • 12. Example 1: packet capture daemon (resolved)
      - It turned out the daemon also loads a KPROBE BPF program for unrelated logic.
      - The kprobe-d function doesn't exist in the new kernel.
      - The daemon decides that BPF is unusable and falls back to NFLOG-based packet capture.
      - nflog loads iptables modules and causes the 1% CPU regression.
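One way a daemon could surface this kind of failure, rather than silently deciding that BPF is unusable, is to check its kprobe targets against the kernel symbol table before attaching. The sketch below is an assumption about how that might look and is not the daemon's actual logic: `kprobe_target_exists` and `missing_kprobe_targets` are hypothetical names, and the input is text in the `/proc/kallsyms` format (address, type, name). Presence in kallsyms is only a necessary condition; a function can still be inlined or otherwise not probeable.

```python
def kprobe_target_exists(symbol, kallsyms_text):
    """Return True if `symbol` is listed as a function-like symbol (hypothetical helper)."""
    for line in kallsyms_text.splitlines():
        parts = line.split()
        # kallsyms line format: <address> <type> <name> [module]
        if len(parts) >= 3 and parts[2] == symbol and parts[1] in ("t", "T", "w", "W"):
            return True
    return False

def missing_kprobe_targets(symbols, kallsyms_text):
    """List the requested symbols that the running kernel does not provide."""
    return [s for s in symbols if not kprobe_target_exists(s, kallsyms_text)]

if __name__ == "__main__":
    with open("/proc/kallsyms") as f:
        text = f.read()
    # 'function_removed_upstream' is a made-up name used only for illustration.
    print(missing_kprobe_targets(["vfs_read", "function_removed_upstream"], text))
```

Logging the missing names separately for each feature would make it visible that only the unrelated kprobe logic is affected, not the packet capture path.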
  • 13. Takeaway for developers
      - kprobe is not a stable ABI.
      - Every time kernel developers change the code, some kernel developers pay the price.
  • 14. Example 2: performance profiling daemon
      - The daemon uses BPF tracepoints and kprobes in the scheduler and task execution.
      - It collects kernel and user stack traces, walks Python user stacks inside the BPF program, and aggregates across the fleet.
      - This daemon is the #1 tool for performance analysis.
      - On the new kernel it causes a 2% CPU regression.
      - Higher softirq times. Slower user apps.
      - Disabling the daemon makes the regression go away.
      - Is it BPF?
  • 15. Example 2: performance profiling daemon (resolved)
      - It turned out that simply installing a kprobe makes the 5.2 kernel remap kernel .text from 2M huge pages into 4k pages.
      - That caused more I-TLB misses,
      - making BPF execution in the kernel slower, and user space as well.
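The I-TLB effect follows from simple arithmetic: each TLB entry maps one page, so the same code region needs many more entries once it is mapped with 4k pages. The sketch below works this out for a hypothetical 16 MiB text region; that size is an assumption chosen for illustration, and the slides do not say how much of .text was actually remapped.

```python
HUGE_PAGE = 2 * 1024 * 1024   # 2 MiB huge page
SMALL_PAGE = 4 * 1024         # 4 KiB base page

def pages_needed(region_bytes, page_size):
    """Pages (and therefore TLB entries) needed to map a region of the given size."""
    return -(-region_bytes // page_size)  # ceiling division

text_size = 16 * 1024 * 1024  # hypothetical kernel .text size, not from the talk

print(pages_needed(text_size, HUGE_PAGE))   # 8
print(pages_needed(text_size, SMALL_PAGE))  # 4096
```

Each 2 MiB page splits into 512 pages of 4 KiB, so the same code competes for far more instruction TLB entries.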
  • 16. Takeaway
      - kprobe is an essential part of kernel functionality.
  • 17. Example 3: security monitoring daemon
      - The daemon uses 3 kprobes and 1 kretprobe.
      - Its BPF program code is just over 200 lines of C.
      - It runs with low priority.
      - It wakes up every few seconds, consumes 0.01% of one CPU and 0.01% of memory.
      - Yet it causes a large P99 latency regression for a database server that runs on all other CPUs and consumes many gigabytes of memory.
      - Throughput of the database is not affected.
      - Disabling the daemon makes the regression go away.
      - Is it BPF?
  • 18. Investigation
      Facts:
      - Occasionally memcpy() in the database gets stuck for 1/4 of a second.
      - The daemon reads /proc/pid/environ, but rarely.
      Guesses:
      - Is the database waiting on the kernel to handle a page fault?
      - While the kernel is blocked on mmap_sem?
      - But "top" and others read /proc far more often. Why is that daemon special?
      - Dive into the kernel code:
          fs/proc/base.c
          environ_read()
            access_remote_vm()
              down_read(&mm->mmap_sem)
  • 19. funclatency.py - Time functions and print latency as a histogram
      # funclatency.py -d100 -m __access_remote_vm
      Tracing 1 functions for "__access_remote_vm"... Hit Ctrl-C to end.
           msecs          : count    distribution
             0 -> 1       : 21938   |****************************************|
             2 -> 3       : 0       |                                        |
             4 -> 7       : 0       |                                        |
             8 -> 15      : 0       |                                        |
            16 -> 31      : 0       |                                        |
            32 -> 63      : 0       |                                        |
            64 -> 127     : 0       |                                        |
           128 -> 255     : 7       |                                        |
           256 -> 511     : 3       |                                        |
      Detaching...
      This histogram shows that over the last 100 seconds there were 3 events where reading /proc took more than 256 ms.
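The bucket boundaries in the histogram above are powers of two. A minimal sketch of that bucketing in plain Python follows; it is not the tool's own code, which aggregates inside a BPF program, and the function names are assumptions. Samples are non-negative latencies in milliseconds.

```python
from collections import Counter

def bucket_index(value_ms):
    """Power-of-two bucket: 0 covers 0..1, 1 covers 2..3, 2 covers 4..7, and so on."""
    value = int(value_ms)
    return 0 if value <= 1 else value.bit_length() - 1

def bucket_range(index):
    """Inclusive (low, high) bounds of a bucket, matching the '2 -> 3' style labels."""
    if index == 0:
        return (0, 1)
    return (1 << index, (1 << (index + 1)) - 1)

def log2_histogram(samples_ms):
    """Return [((low, high), count), ...] from bucket 0 up to the highest bucket hit."""
    counts = Counter(bucket_index(v) for v in samples_ms)
    top = max(counts) if counts else 0
    return [(bucket_range(i), counts.get(i, 0)) for i in range(top + 1)]

if __name__ == "__main__":
    for (low, high), count in log2_histogram([0, 1, 1, 200, 399.02]):
        print(f"{low:>6} -> {high:<6} : {count}")
```

A 399 ms sample lands in the 256 -> 511 bucket, which is why the slow reads stand out as a separate cluster far from the bulk of sub-millisecond calls.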
  • 20. funcslower.py - Dump kernel and user stack when a given kernel function was slower than the threshold
      # funcslower.py -m 200 -KU __access_remote_vm
      Tracing function calls slower than 200 ms... Ctrl+C to quit.
      COMM            PID      LAT(ms)  RVAL  FUNC
      security_daemon 1720415  399.02   605   __access_remote_vm
          kretprobe_trampoline
          read
          facebook::...::readBytes(folly::File const&)
          ...
      This was the kernel+user stack trace when our security daemon was stuck in sys_read() for 399 ms.
      Yes. It's that daemon causing database latency spikes.
  • 21. Collect more stack traces with offwaketime.py ...
          finish_task_switch
          __schedule
          preempt_schedule_common
          _cond_resched
          __get_user_pages
          get_user_pages_remote
          __access_remote_vm
          proc_pid_cmdline_read
          __vfs_read
          vfs_read
          sys_read
          do_syscall_64
          read
          facebook::...::readBytes(folly::File const&)
      The task reading from /proc/pid/cmdline can go to sleep without releasing mmap_sem of the mm of that pid.
      The page fault in that pid will be blocked until this task finishes reading /proc.
  • 22. Root cause
      - The daemon uses 3 kprobes and 1 kretprobe.
      - Its BPF program code is just over 200 lines of C.
      - It runs with low priority.
      - It wakes up every few seconds, consumes 0.01% of one CPU and 0.01% of memory.
      The low CPU quota for the daemon, coupled with aggressive sysctl kernel.sched_* tweaks, was responsible.
  • 23. Takeaway
      - BPF tracing tools are the best way to tackle a BPF regression.
  • 24. Another kind of BPF investigation
      - Many kernels run in the datacenter.
      - Daemons (and their BPF programs) need to work on all of them.
      - A BPF program works on the developer server, but fails in production.
  • 26. On production server
      - Embedded LLVM is safer than standalone LLVM.
      - LLVM takes 70 MB on disk and 20 MB of memory at steady state. More at peak.
      - Dependency on system kernel headers. Subsystem internal headers are missing.
      - Compilation errors are only caught at runtime.
      - Compilation on a production server disturbs the main workload.
      - And the other way around: LLVM may take minutes to compile 100 lines of C.
  • 27. BPF CO-RE (Compile Once Run Everywhere)
      - Compile the BPF program into a "Run Everywhere" .o file (BPF assembly + extra).
      - Test it on a developer server against many "kernels".
      - libbpf adjusts the .o file on the production server.
      - No compilation on the production server.
  • 28. BTF (BPF Type Format)
      - BTF describes types, relocations, source code.
      - LLVM compiles BPF program C code into BPF assembly and BTF.
      - gcc+pahole compiles kernel C code into the vmlinux binary and BTF.
      - libbpf compares the program's BTF with vmlinux's BTF and adjusts the BPF assembly before loading it into the kernel.
      - Developers can compile and test for kprobe and kernel data structure compatibility on a single server at build time instead of on N servers at run time.
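The "compare BTF and adjust" step can be pictured with a toy model. The sketch below is an illustration under loud assumptions, not libbpf's algorithm: type information is reduced to a dict of field offsets, a relocation record is a hypothetical `(instruction index, struct name, field name)` tuple, and the struct `example_task` with its fields is made up. Real relocation handling also deals with type compatibility, nested accessors, and field sizes.

```python
def relocate(insns, relocs, target_btf):
    """Return a copy of `insns` with field offsets patched for the target kernel.

    insns:      list of dicts, each with an 'off' key (toy stand-in for BPF instructions)
    relocs:     list of (insn_index, struct_name, field_name) records (hypothetical format)
    target_btf: {struct_name: {field_name: byte_offset}} for the running kernel
    """
    patched = [dict(insn) for insn in insns]  # leave the compiled program untouched
    for insn_index, struct_name, field_name in relocs:
        fields = target_btf.get(struct_name, {})
        if field_name not in fields:
            raise LookupError(f"{struct_name}.{field_name} not found in target kernel")
        patched[insn_index]["off"] = fields[field_name]
    return patched

if __name__ == "__main__":
    # Program compiled against a layout where example_task.pid sits at offset 8.
    program = [{"op": "load", "off": 8}]
    relocs = [(0, "example_task", "pid")]
    # On the production kernel the same field moved to offset 16.
    production = {"example_task": {"pid": 16, "comm": 24}}
    print(relocate(program, relocs, production))
```

The lookup failing with an error is the interesting case for testing: a field missing from a target kernel's type information can be detected by running the same check against many kernels' BTF on one build server.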
  • 29. trace_kfree_skb today
      - A PARM2 typo will "work" too.
      - Six bpf_probe_read() calls.
      - Any type cast is allowed.
      - clang -I/path_to_kernel_headers/ -I/path_to_user/
  • 31.
      - Works with any raw tracepoint.
      - Same kernel helper as in networking programs.
      - If skb and location are accidentally swapped, the verifier will catch it.
      - Define kernel structs by hand instead of including vmlinux.h.
  • 32. BPF verifier giant leaps in 2019
      - Bounded loops
      - bpf_spin_lock
      - Dead code elimination
      - Scalar precision tracking
  • 33. BPF verifier is smarter than LLVM
      - The verifier removes dead code after it was optimized by llvm -O2.
      - Developers cannot cheat by type casting an integer to a pointer or by removing 'const'.
      - LLVM's goal -> optimize the code.
      - The verifier's goal -> analyze the code.
      - Different takes on data flow analysis.
      - The verifier's data flow analysis must be precise.
  • 34. BPF verifier 2.0
      - The verifier cannot tell what the "r2 = *(u64*)(r1 + 8)" assembly instruction is doing,
      - unless r1 is a builtin type and +8 is checked by is_valid_access().
      - The verifier cannot trust user space hints to verify BPF program assembly code.
      - In-kernel BTF is trusted.
      - With BTF the verifier's data flow analysis enters a new realm of possibilities.
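To make the idea concrete, here is a toy model of what type information adds when checking a load such as `r2 = *(u64*)(r1 + 8)`. It is a deliberate simplification and not the kernel verifier's logic: the layout dict, the struct fields, and the function name `resolve_load` are all assumptions, and the rule used (a load must match one field's offset and size exactly) is stricter than what a real verifier needs to enforce.

```python
def resolve_load(layout, offset, size):
    """Return the field read by a load of `size` bytes at `offset`, or None if no field matches.

    layout: {field_name: (byte_offset, byte_size)} for the type r1 points to (toy BTF stand-in)
    """
    for name, (field_offset, field_size) in layout.items():
        if offset == field_offset and size == field_size:
            return name
    return None

if __name__ == "__main__":
    # Hypothetical struct: u64 a at offset 0, u64 b at offset 8, u32 c at offset 16.
    example_layout = {"a": (0, 8), "b": (8, 8), "c": (16, 4)}
    print(resolve_load(example_layout, 8, 8))   # the u64 load at +8 reads field 'b'
    print(resolve_load(example_layout, 12, 8))  # straddles 'b' and 'c': no field, so None
```

Without a layout to consult, both loads look identical to an analyzer; with one, the first resolves to a named, typed field and the second can be rejected.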
  • 35.
      - Every program type implements its own is_valid_access() and convert_ctx_access().
      - #1 cause of code bloat. Bug-prone code.
      - None of it is needed with BTF.
      - Will be able to remove 1000s of lines.*
      * when the BTF kconfig is on.
  • 36. How you can help
      - We need you to hack, to talk, to invent.
      - BPF development is 100% use case driven.
      - Your requests, complaints, and success stories are shaping the future kernel.