SlideShare a Scribd company logo
1 of 31
Download to read offline
Kernel bug hunting
Modena, 27 November 2019
Andrea Righi
andrea.righi@canonical.com
www.canonical.com
twitter: @arighi
Linux kernel is complex
●
25.590.567 lines of code right now
find -type f -name '*.[chS]' -exec wc -l {} ; | awk 'BEGIN{sum=0}{sum+=$1}END{print sum}'
●
229 patches last week week
git log --oneline v5.4-rc7..v5.4-rc8 | wc -l
●
195 files changed, 3398 insertions(+), 4081 deletions(-)
git diff --stat v5.4-rc7..v5.4-rc8 | tail -1
https://www.linuxcounter.net/statistics/kernel
Kernel bugs
●
Kernel panic
●
Fatal error, system becomes unusable
●
Kernel oops
●
Non-fatal error, some functionalities can be compromised
●
Wrong result
●
Fatal error from user’s perspective
●
Security vulnerability
●
Side-channel attack, data leakage, …
●
Performance regression
●
Everything is correct, but slower...
Debugging techniques
●
blinking LED
●
printk() / dump_stack()
●
procfs
●
SysReq key (Documentation/admin-guide/sysrq.rst)
●
debugger (i.e., kgdb, …)
●
virtualization
●
profiling / tracing
Kernel debugging hands-on
●
Virtualization can help to track down kernel bugs
●
virtme
●
Run the kernel inside a qemu/kvm instance, virtualizing the running
system
●
Generate crash dump
●
Analyze system data offline (after the crash)
●
crash test kernel module
●
https://github.com/arighi/crashtest
●
Simple scripts to speed up kernel development
(wrappers around virtme/qemu/crash):
●
https://github.com/arighi/kernel-crash-tools
Profiling vs tracing
Profiling vs tracing
●
Profiling
●
Create a periodic timed interrupt that collects the current
program counter, function address and the entire stack
back trace
●
Tracing
●
Record times and invocations of specific events
Profiling example: perf top
Tracing example: strace
●
strace(1): system call tracer in Linux
●
It uses the ptrace() system call that pauses the target
process for each syscall so that the debugger can read
the state
●
And it’s doing this twice: when the syscall begins and when
it ends!
strace overhead
### Regular execution ###
$ dd if=/dev/zero of=/dev/null bs=1 count=500k
512000+0 records in
512000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0,501455 s, 1.0 MB/s
### Strace execution (tracing a syscall that is never called) ###
$ strace -e trace=accept dd if=/dev/zero of=/dev/null bs=1 count=500k
512000+0 records in
512000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 44.0216 s, 11,6 kB/s
+++ exited with 0 +++
Tracing kernel functions: kprobe
eBPF
eBPF history
●
Initially it was BPF: Berkeley Packet Filter
●
It has its roots in BSD in the very early 1990’s
●
Originally designed as a mechanism for fast filtering network packets
●
3.15: Linux introduced eBPF: extended Berkeley Packet Filter
●
More efficient / more generic than the original BPF
●
3.18: eBPF VM exposed to user-space
●
4.9: eBPF programs can be attached to perf_events
●
4.10: eBPF programs can be attached to cgroups
●
4.15: eBPF LSM hooks
eBPF features
●
Highly efficient VM that lives in the kernel
●
Inject safe sanboxed bytecode into the kernel
●
Attach code to kernel functions / events
●
In-kernel JIT compiler
●
Dynamically translate eBPF bytecode into native opcodes
●
eBPF makes kernel programmable without having to
cross kernel/user-space boundaries
●
Access in-kernel data structures directly without the risk
of crashing, hanging or breaking the kernel in any way
eBPF as a VM
●
Example assembly of a simple
eBPF filter
●
Load 16-bit quantity from offset
12 in the packet to the
accumulator (ethernet type)
●
Compare the value to see if the
packet is an IP packet
●
If the packet is IP, return TRUE
(packet is accepted)
●
otherwise return 0 (packet is
rejected)
●
Only 4 VM instructions to filter
IP packets!
ldh [12]
jeq #ETHERTYPE_IP, l1, l2
l1: ret #TRUE
l2: ret #0
eBPF use cases
How many eBPF programs are running in your laptop?
●
Run this:
ls -la /proc/*/fd | grep bpf-prog | wc -l
Flame graphs
●
CPU flame graphs
●
x-axis
sample population
●
y-axis
●
stack depth
●
Wider boxes =
More samples =
More CPU time =
More overhead!
Flame Graph Search
s..
sun/nio/ch/SocketChannel..
org/mozi..
org..
io..
d..
tcp_v4_rcv
i..
org..
vfs_write
io/netty/channel/AbstractChannelHandlerContext:.fireChannelRead
JavaCalls::call_virtual
o..
ip_q..
org/mozi..
tcp_sen..
cpu..
org/..
[unknown]
io/netty/channel/AbstractCha..
JavaCalls::call_virtual
org/mozilla/javascript/gen/file__root_vert_x_2_1_..
ip..
io/netty/channel/nio/AbstractNioByteChannel$NioByteUnsafe:.read
io/netty/channel/nio/Abstr..
t..
s..
o..
JavaCalls::call_helper
Interpreter
ip_..
__do_softirq
ip_local_out
ep_p..
org/mozilla/javas..
s..
org/mozilla/javascript/gen/file__root_vert_x_2_1_5_sys_mods_io..
system_ca..
s..
_..
[unkn..
__..
sta..
sun..
_..
tcp_transmit_skb
do_softirq
org/m..
__..
JavaThread::thread_main_inner
io/netty/channel/ChannelDupl..
io/netty/channel/DefaultCha..
java
ip_rcv_fi..
t..
G..
o..
tcp_v4_..
__tcp..
do_sync_write
v..
x..
call_stub
ip_finish_out..
net_rx_act..
io/netty/channel/ChannelOut..
wrk
tcp_write_xmit
ip_..
loc..
syste..
i..
aeProcessEvents
system_call_fastpath
org..
do..
local_bh_en..
or..
swapper
org/mozilla/javascript/gen/file__root_vert_x_2_1_5..
JavaThread::run
sun/nio/ch/FileDispatch..
[..
__tcp_push_pendi..
tc..
sun/re..
tcp_..
tcp_w..
process_ba..
io/ne..
[..
inet_se..
ip..or..
thread_entry
Interpreter
org/vertx/java/core/http/impl/DefaultHttpServer$ServerHandler:.doM..
tcp_rcv..
vfs_write
ip_queue_xmit
sock_aio..
aeMain
_..
org/vertx/java/core/net/impl/VertxHandler:.channelRead
org/moz..
__netif_r..
__netif_r..
io/netty/channel/AbstractCh..
do_softirq_..
org/mozilla/javascript/gen/file__root_vert_x_2_1_5_sys_mods_io_..
io/netty/channel/nio/NioEventLoop:.processSelectedKeys
org/mozilla/javas..
Interpreter
so..
_..
[unknown]
io/netty/channel/AbstractCh..
or..
io/netty/channel/AbstractChannelHandlerContext:.fireChannelRead
ip_local_..
__..
org/vertx/java/core/net/impl..
io/..
sock_aio_write
ip_rcv
tcp_sendmsg
e..
do..
thread_main
ip_..
io..
ip_output
io/netty/channel/nio/NioEventLoop:.processSelectedKeysOptimized
java_start
org/vertx/java/core/http/impl/ServerConnection:.handleRequests..
pr..
socke..
sys_e..
org/moz..
or..
io/netty/channel/AbstractCha..
io/netty/handler/codec/ByteToMessageDecoder:.channelRead
x..
io/netty/channel/AbstractCha..
h..
start_thread
ne..
inet_sendmsg
start_thread
r..
ip_local_..
org/mozilla/javas..
o..
sys_write
socket_wri..
i..
io/netty/handler/codec/ByteT..
org/..
do_sync_..
sys_write
BCC tracing tools
●
BPF Compiler Collection https://github.com/iovisor/bcc
●
Front-end to eBPF
●
BCC makes eBPF programs easier to write
●
Include C wrapper around LLVM
●
Python
●
Lua
●
C++
●
C helper libs
●
Lots of pre-defined tools available
Example #1: trace exec()
●
Intercept all the processes executed in the system
Example #2: keylogger
●
Identify where and how keyboard characters are received
and processed by the kernel
Example #3: ping
●
Identify where ICMP packets (ECHO_REQUEST /
ECHO_REPLY) are received and processed by the kernel
Example #4: task wait / wakeup
●
Determine the stack trace
of a sleeping process and
the stack trace of the
process that wakes up a
sleeping process
Is tracing safe?
eBPF: performance overhead - use case #1
●
user-space deamon using an eBPF program attached to a
function (via kprobe)
●
kernel is updated, function doesn’t exist anymore
●
daemon starts to use an older/slower non-BPF method
●
5% performance regression
eBPF: performance overhead - use case #2
●
kernel function mapped to a 2MB huge page
●
eBPF program attached to that function (via kprobe)
●
setting the kprobe causes the function to be remapped to a
regular 4KB page
●
increased TLB misses
●
2% performance regression
eBPF: compile once, run everywhere?
●
… not exactly! :-(
●
eBPF programs are compiled on the target system
immediately before they are loaded
●
Linux headers are needed to understand kernel data
structures
●
structure randomization is a problem
●
BTF (BPF type format) has been created
●
kernel data description embedded in the kernel (no longer
any need to ship kernel headers around!)
Conclusion
●
Virtualization is your friend to speed up kernel
development
●
Real-time tracing can be an effective way to study and
understand how the kernel works
●
Kernel development can be challenging... but fun! :)
References
●
Brendan Gregg blog
●
http://brendangregg.com/blog/
●
BCC tools
●
https://github.com/iovisor/bcc
●
virtme
●
https://github.com/amluto/virtme
●
crashtest
●
https://github.com/arighi/crashtest
●
kernel-crash-tools
●
https://github.com/arighi/kernel-crash-tools
Thank you
Andrea Righi
andrea.righi@canonical.com
www.canonical.com
twitter: @arighi

More Related Content

What's hot

The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageKernel TLV
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsBrendan Gregg
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)Kirill Tsym
 
Spying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitSpying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitAndrea Righi
 
Mirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in GoMirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in Golinuxlab_conf
 
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingKernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingAnne Nicolas
 
Senior Design: Raspberry Pi Cluster Computing
Senior Design: Raspberry Pi Cluster ComputingSenior Design: Raspberry Pi Cluster Computing
Senior Design: Raspberry Pi Cluster ComputingRalph Walker II
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookAnne Nicolas
 
Twisted Introduction
Twisted IntroductionTwisted Introduction
Twisted Introductioncyli
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringScyllaDB
 
Linux Container Basics
Linux Container BasicsLinux Container Basics
Linux Container BasicsMichael Kehoe
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversAngelo Failla
 
OpenZFS Channel programs
OpenZFS Channel programsOpenZFS Channel programs
OpenZFS Channel programsMatthew Ahrens
 
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Kernel TLV
 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Igalia
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftestSeongJae Park
 
An Introduction to Twisted
An Introduction to TwistedAn Introduction to Twisted
An Introduction to Twistedsdsern
 
All of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) WrongAll of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) Wrongice799
 

What's hot (20)

The Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast StorageThe Linux Block Layer - Built for Fast Storage
The Linux Block Layer - Built for Fast Storage
 
RxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance ResultsRxNetty vs Tomcat Performance Results
RxNetty vs Tomcat Performance Results
 
FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)FD.io Vector Packet Processing (VPP)
FD.io Vector Packet Processing (VPP)
 
DPDK In Depth
DPDK In DepthDPDK In Depth
DPDK In Depth
 
Spying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profitSpying on the Linux kernel for fun and profit
Spying on the Linux kernel for fun and profit
 
Mirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in GoMirko Damiani - An Embedded soft real time distributed system in Go
Mirko Damiani - An Embedded soft real time distributed system in Go
 
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxingKernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
Kernel Recipes 2016 - Landlock LSM: Unprivileged sandboxing
 
Senior Design: Raspberry Pi Cluster Computing
Senior Design: Raspberry Pi Cluster ComputingSenior Design: Raspberry Pi Cluster Computing
Senior Design: Raspberry Pi Cluster Computing
 
Kernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at Facebook
 
Twisted Introduction
Twisted IntroductionTwisted Introduction
Twisted Introduction
 
High-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
 
Linux Container Basics
Linux Container BasicsLinux Container Basics
Linux Container Basics
 
FBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp serversFBTFTP: an opensource framework to build dynamic tftp servers
FBTFTP: an opensource framework to build dynamic tftp servers
 
OpenZFS Channel programs
OpenZFS Channel programsOpenZFS Channel programs
OpenZFS Channel programs
 
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
Emerging Persistent Memory Hardware and ZUFS - PM-based File Systems in User ...
 
How we use Twisted in Launchpad
How we use Twisted in LaunchpadHow we use Twisted in Launchpad
How we use Twisted in Launchpad
 
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
Snabb Switch: Riding the HPC wave to simpler, better network appliances (FOSD...
 
Brief introduction to kselftest
Brief introduction to kselftestBrief introduction to kselftest
Brief introduction to kselftest
 
An Introduction to Twisted
An Introduction to TwistedAn Introduction to Twisted
An Introduction to Twisted
 
All of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) WrongAll of Your Network Monitoring is (probably) Wrong
All of Your Network Monitoring is (probably) Wrong
 

Similar to Kernel bug hunting

Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsStefano Salsano
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPFAlex Maestretti
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABIAlison Chaiken
 
Linux Kernel Platform Development: Challenges and Insights
 Linux Kernel Platform Development: Challenges and Insights Linux Kernel Platform Development: Challenges and Insights
Linux Kernel Platform Development: Challenges and InsightsGlobalLogic Ukraine
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Jace Liang
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!Affan Syed
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating SystemThomas Graf
 
Lec 10-linux-review
Lec 10-linux-reviewLec 10-linux-review
Lec 10-linux-reviewabinaya m
 
story_of_bpf-1.pdf
story_of_bpf-1.pdfstory_of_bpf-1.pdf
story_of_bpf-1.pdfhegikip775
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopQuey-Liang Kao
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsGergely Szabó
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KernelThomas Graf
 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthFelipe Prado
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureAngelo Failla
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPFRogerColl2
 
DCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDocker, Inc.
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kevin Lynch
 
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfCalico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfyingxinwang4
 

Similar to Kernel bug hunting (20)

Dataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and toolsDataplane programming with eBPF: architecture and tools
Dataplane programming with eBPF: architecture and tools
 
Security Monitoring with eBPF
Security Monitoring with eBPFSecurity Monitoring with eBPF
Security Monitoring with eBPF
 
Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
 
eBPF Basics
eBPF BasicseBPF Basics
eBPF Basics
 
Not breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABINot breaking userspace: the evolving Linux ABI
Not breaking userspace: the evolving Linux ABI
 
Linux Kernel Platform Development: Challenges and Insights
 Linux Kernel Platform Development: Challenges and Insights Linux Kernel Platform Development: Challenges and Insights
Linux Kernel Platform Development: Challenges and Insights
 
Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology Introduction of eBPF - 時下最夯的Linux Technology
Introduction of eBPF - 時下最夯的Linux Technology
 
ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!ebpf and IO Visor: The What, how, and what next!
ebpf and IO Visor: The What, how, and what next!
 
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
BPF  & Cilium - Turning Linux into a Microservices-aware Operating SystemBPF  & Cilium - Turning Linux into a Microservices-aware Operating System
BPF & Cilium - Turning Linux into a Microservices-aware Operating System
 
Lec 10-linux-review
Lec 10-linux-reviewLec 10-linux-review
Lec 10-linux-review
 
story_of_bpf-1.pdf
story_of_bpf-1.pdfstory_of_bpf-1.pdf
story_of_bpf-1.pdf
 
Talk 160920 @ Cat System Workshop
Talk 160920 @ Cat System WorkshopTalk 160920 @ Cat System Workshop
Talk 160920 @ Cat System Workshop
 
Efficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native EnvironmentsEfficient System Monitoring in Cloud Native Environments
Efficient System Monitoring in Cloud Native Environments
 
eBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux Kernel
 
DEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depthDEF CON 27 - JEFF DILEO - evil e bpf in depth
DEF CON 27 - JEFF DILEO - evil e bpf in depth
 
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructureDevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
DevopsItalia2015 - DHCP at Facebook - Evolution of an infrastructure
 
Introduction to eBPF
Introduction to eBPFIntroduction to eBPF
Introduction to eBPF
 
DCSF 19 eBPF Superpowers
DCSF 19 eBPF SuperpowersDCSF 19 eBPF Superpowers
DCSF 19 eBPF Superpowers
 
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
Kubernetes @ Squarespace (SRE Portland Meetup October 2017)
 
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdfCalico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
Calico-eBPF-Dataplane-CNCF-Webinar-Slides.pdf
 

More from Andrea Righi

Eco-friendly Linux kernel development
Eco-friendly Linux kernel developmentEco-friendly Linux kernel development
Eco-friendly Linux kernel developmentAndrea Righi
 
Linux kernel bug hunting
Linux kernel bug huntingLinux kernel bug hunting
Linux kernel bug huntingAndrea Righi
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudAndrea Righi
 
Understand and optimize Linux I/O
Understand and optimize Linux I/OUnderstand and optimize Linux I/O
Understand and optimize Linux I/OAndrea Righi
 

More from Andrea Righi (6)

Eco-friendly Linux kernel development
Eco-friendly Linux kernel developmentEco-friendly Linux kernel development
Eco-friendly Linux kernel development
 
Linux kernel bug hunting
Linux kernel bug huntingLinux kernel bug hunting
Linux kernel bug hunting
 
Linux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloudLinux kernel tracing superpowers in the cloud
Linux kernel tracing superpowers in the cloud
 
Understand and optimize Linux I/O
Understand and optimize Linux I/OUnderstand and optimize Linux I/O
Understand and optimize Linux I/O
 
Debugging linux
Debugging linuxDebugging linux
Debugging linux
 
Linux boot-time
Linux boot-timeLinux boot-time
Linux boot-time
 

Recently uploaded

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfryanfarris8
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension AidPhilip Schwarz
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerThousandEyes
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionOnePlan Solutions
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsJhone kinadey
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...kalichargn70th171
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfkalichargn70th171
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfVishalKumarJha10
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfkalichargn70th171
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech studentsHimanshiGarg82
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionSolGuruz
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providermohitmore19
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfproinshot.com
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdfPearlKirahMaeRagusta1
 

Recently uploaded (20)

Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdfAzure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
Azure_Native_Qumulo_High_Performance_Compute_Benchmarks.pdf
 
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
Direct Style Effect Systems -The Print[A] Example- A Comprehension AidDirect Style Effect Systems -The Print[A] Example- A Comprehension Aid
Direct Style Effect Systems - The Print[A] Example - A Comprehension Aid
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdfThe Ultimate Test Automation Guide_ Best Practices and Tips.pdf
The Ultimate Test Automation Guide_ Best Practices and Tips.pdf
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Microsoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdfMicrosoft AI Transformation Partner Playbook.pdf
Microsoft AI Transformation Partner Playbook.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS LiveVip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
Vip Call Girls Noida ➡️ Delhi ➡️ 9999965857 No Advance 24HRS Live
 
Diamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with PrecisionDiamond Application Development Crafting Solutions with Precision
Diamond Application Development Crafting Solutions with Precision
 
TECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service providerTECUNIQUE: Success Stories: IT Service provider
TECUNIQUE: Success Stories: IT Service provider
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Define the academic and professional writing..pdf
Define the academic and professional writing..pdfDefine the academic and professional writing..pdf
Define the academic and professional writing..pdf
 

Kernel bug hunting

  • 1. Kernel bug hunting Modena, 27 November 2019 Andrea Righi andrea.righi@canonical.com www.canonical.com twitter: @arighi
  • 2. Linux kernel is complex ● 25.590.567 lines of code right now find -type f -name '*.[chS]' -exec wc -l {} ; | awk 'BEGIN{sum=0}{sum+=$1}END{print sum}' ● 229 patches last week week git log --oneline v5.4-rc7..v5.4-rc8 | wc -l ● 195 files changed, 3398 insertions(+), 4081 deletions(-) git diff --stat v5.4-rc7..v5.4-rc8 | tail -1
  • 4. Kernel bugs ● Kernel panic ● Fatal error, system becomes unusable ● Kernel oops ● Non-fatal error, some functionalities can be compromised ● Wrong result ● Fatal error from user’s perspective ● Security vulnerability ● Side-channel attack, data leakage, … ● Performance regression ● Everything is correct, but slower...
  • 5. Debugging techniques ● blinking LED ● printk() / dump_stack() ● procfs ● SysReq key (Documentation/admin-guide/sysrq.rst) ● debugger (i.e., kgdb, …) ● virtualization ● profiling / tracing
  • 6. Kernel debugging hands-on ● Virtualization can help to track down kernel bugs ● virtme ● Run the kernel inside a qemu/kvm instance, virtualizing the running system ● Generate crash dump ● Analyze system data offline (after the crash) ● crash test kernel module ● https://github.com/arighi/crashtest ● Simple scripts to speed up kernel development (wrappers around virtme/qemu/crash): ● https://github.com/arighi/kernel-crash-tools
  • 8. Profiling vs tracing ● Profiling ● Create a periodic timed interrupt that collects the current program counter, function address and the entire stack back trace ● Tracing ● Record times and invocations of specific events
  • 10. Tracing example: strace ● strace(1): system call tracer in Linux ● It uses the ptrace() system call that pauses the target process for each syscall so that the debugger can read the state ● And it’s doing this twice: when the syscall begins and when it ends!
  • 11. strace overhead ### Regular execution ### $ dd if=/dev/zero of=/dev/null bs=1 count=500k 512000+0 records in 512000+0 records out 512000 bytes (512 kB, 500 KiB) copied, 0,501455 s, 1.0 MB/s ### Strace execution (tracing a syscall that is never called) ### $ strace -e trace=accept dd if=/dev/zero of=/dev/null bs=1 count=500k 512000+0 records in 512000+0 records out 512000 bytes (512 kB, 500 KiB) copied, 44.0216 s, 11,6 kB/s +++ exited with 0 +++
  • 13. eBPF
  • 14. eBPF history ● Initially it was BPF: Berkeley Packet Filter ● It has its roots in BSD in the very early 1990’s ● Originally designed as a mechanism for fast filtering network packets ● 3.15: Linux introduced eBPF: extended Berkeley Packet Filter ● More efficient / more generic than the original BPF ● 3.18: eBPF VM exposed to user-space ● 4.9: eBPF programs can be attached to perf_events ● 4.10: eBPF programs can be attached to cgroups ● 4.15: eBPF LSM hooks
  • 15. eBPF features ● Highly efficient VM that lives in the kernel ● Inject safe sanboxed bytecode into the kernel ● Attach code to kernel functions / events ● In-kernel JIT compiler ● Dynamically translate eBPF bytecode into native opcodes ● eBPF makes kernel programmable without having to cross kernel/user-space boundaries ● Access in-kernel data structures directly without the risk of crashing, hanging or breaking the kernel in any way
  • 16. eBPF as a VM ● Example assembly of a simple eBPF filter ● Load 16-bit quantity from offset 12 in the packet to the accumulator (ethernet type) ● Compare the value to see if the packet is an IP packet ● If the packet is IP, return TRUE (packet is accepted) ● otherwise return 0 (packet is rejected) ● Only 4 VM instructions to filter IP packets! ldh [12] jeq #ETHERTYPE_IP, l1, l2 l1: ret #TRUE l2: ret #0
  • 18. How many eBPF programs are running in your laptop? ● Run this: ls -la /proc/*/fd | grep bpf-prog | wc -l
  • 19. Flame graphs ● CPU flame graphs ● x-axis sample population ● y-axis ● stack depth ● Wider boxes = More samples = More CPU time = More overhead! Flame Graph Search s.. sun/nio/ch/SocketChannel.. org/mozi.. org.. io.. d.. tcp_v4_rcv i.. org.. vfs_write io/netty/channel/AbstractChannelHandlerContext:.fireChannelRead JavaCalls::call_virtual o.. ip_q.. org/mozi.. tcp_sen.. cpu.. org/.. [unknown] io/netty/channel/AbstractCha.. JavaCalls::call_virtual org/mozilla/javascript/gen/file__root_vert_x_2_1_.. ip.. io/netty/channel/nio/AbstractNioByteChannel$NioByteUnsafe:.read io/netty/channel/nio/Abstr.. t.. s.. o.. JavaCalls::call_helper Interpreter ip_.. __do_softirq ip_local_out ep_p.. org/mozilla/javas.. s.. org/mozilla/javascript/gen/file__root_vert_x_2_1_5_sys_mods_io.. system_ca.. s.. _.. [unkn.. __.. sta.. sun.. _.. tcp_transmit_skb do_softirq org/m.. __.. JavaThread::thread_main_inner io/netty/channel/ChannelDupl.. io/netty/channel/DefaultCha.. java ip_rcv_fi.. t.. G.. o.. tcp_v4_.. __tcp.. do_sync_write v.. x.. call_stub ip_finish_out.. net_rx_act.. io/netty/channel/ChannelOut.. wrk tcp_write_xmit ip_.. loc.. syste.. i.. aeProcessEvents system_call_fastpath org.. do.. local_bh_en.. or.. swapper org/mozilla/javascript/gen/file__root_vert_x_2_1_5.. JavaThread::run sun/nio/ch/FileDispatch.. [.. __tcp_push_pendi.. tc.. sun/re.. tcp_.. tcp_w.. process_ba.. io/ne.. [.. inet_se.. ip..or.. thread_entry Interpreter org/vertx/java/core/http/impl/DefaultHttpServer$ServerHandler:.doM.. tcp_rcv.. vfs_write ip_queue_xmit sock_aio.. aeMain _.. org/vertx/java/core/net/impl/VertxHandler:.channelRead org/moz.. __netif_r.. __netif_r.. io/netty/channel/AbstractCh.. do_softirq_.. org/mozilla/javascript/gen/file__root_vert_x_2_1_5_sys_mods_io_.. io/netty/channel/nio/NioEventLoop:.processSelectedKeys org/mozilla/javas.. Interpreter so.. _.. [unknown] io/netty/channel/AbstractCh.. or.. io/netty/channel/AbstractChannelHandlerContext:.fireChannelRead ip_local_.. __.. org/vertx/java/core/net/impl.. io/.. sock_aio_write ip_rcv tcp_sendmsg e.. do.. thread_main ip_.. io.. ip_output io/netty/channel/nio/NioEventLoop:.processSelectedKeysOptimized java_start org/vertx/java/core/http/impl/ServerConnection:.handleRequests.. pr.. socke.. sys_e.. org/moz.. or.. io/netty/channel/AbstractCha.. io/netty/handler/codec/ByteToMessageDecoder:.channelRead x.. io/netty/channel/AbstractCha.. h.. start_thread ne.. inet_sendmsg start_thread r.. ip_local_.. org/mozilla/javas.. o.. sys_write socket_wri.. i.. io/netty/handler/codec/ByteT.. org/.. do_sync_.. sys_write
  • 20. BCC tracing tools ● BPF Compiler Collection https://github.com/iovisor/bcc ● Front-end to eBPF ● BCC makes eBPF programs easier to write ● Include C wrapper around LLVM ● Python ● Lua ● C++ ● C helper libs ● Lots of pre-defined tools available
  • 21. Example #1: trace exec() ● Intercept all the processes executed in the system
  • 22. Example #2: keylogger ● Identify where and how keyboard characters are received and processed by the kernel
  • 23. Example #3: ping ● Identify where ICMP packets (ECHO_REQUEST / ECHO_REPLY) are received and processed by the kernel
  • 24. Example #4: task wait / wakeup ● Determine the stack trace of a sleeping process and the stack trace of the process that wakes up a sleeping process
  • 26. eBPF: performance overhead - use case #1 ● user-space deamon using an eBPF program attached to a function (via kprobe) ● kernel is updated, function doesn’t exist anymore ● daemon starts to use an older/slower non-BPF method ● 5% performance regression
  • 27. eBPF: performance overhead - use case #2 ● kernel function mapped to a 2MB huge page ● eBPF program attached to that function (via kprobe) ● setting the kprobe causes the function to be remapped to a regular 4KB page ● increased TLB misses ● 2% performance regression
  • 28. eBPF: compile once, run everywhere? ● … not exactly! :-( ● eBPF programs are compiled on the target system immediately before they are loaded ● Linux headers are needed to understand kernel data structures ● structure randomization is a problem ● BTF (BPF type format) has been created ● kernel data description embedded in the kernel (no longer any need to ship kernel headers around!)
  • 29. Conclusion ● Virtualization is your friend to speed up kernel development ● Real-time tracing can be an effective way to study and understand how the kernel works ● Kernel development can be challenging... but fun! :)
  • 30. References ● Brendan Gregg blog ● http://brendangregg.com/blog/ ● BCC tools ● https://github.com/iovisor/bcc ● virtme ● https://github.com/amluto/virtme ● crashtest ● https://github.com/arighi/crashtest ● kernel-crash-tools ● https://github.com/arighi/kernel-crash-tools