Kernel bug hunting
Modena, 27 November 2019
Andrea Righi
andrea.righi@canonical.com
www.canonical.com
twitter: @arighi
Linux kernel is complex
●
25,590,567 lines of code right now
find -type f -name '*.[chS]' -exec wc -l {} \; | awk 'BEGIN{sum=0}{sum+=$1}END{print sum}'
●
229 patches last week
git log --oneline v5.4-rc7..v5.4-rc8 | wc -l
●
195 files changed, 3398 insertions(+), 4081 deletions(-)
git diff --stat v5.4-rc7..v5.4-rc8 | tail -1
https://www.linuxcounter.net/statistics/kernel
Kernel bugs
●
Kernel panic
●
Fatal error, system becomes unusable
●
Kernel oops
●
Non-fatal error; some functionality may be compromised
●
Wrong result
●
Fatal error from user’s perspective
●
Security vulnerability
●
Side-channel attack, data leakage, …
●
Performance regression
●
Everything is correct, but slower...
Debugging techniques
●
blinking LED
●
printk() / dump_stack()
●
procfs
●
SysRq key (Documentation/admin-guide/sysrq.rst); see the example below
●
debugger (e.g., kgdb, …)
●
virtualization
●
profiling / tracing
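For example, the SysRq interface can also be triggered from software through /proc/sysrq-trigger (as root); 't' dumps the state and stack trace of every task to the kernel log:
# echo t > /proc/sysrq-trigger
# dmesg | tail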
Kernel debugging hands-on
●
Virtualization can help to track down kernel bugs
●
virtme
●
Run the kernel inside a qemu/kvm instance, virtualizing the running
system (see the example below)
●
Generate crash dump
●
Analyze system data offline (after the crash)
●
crash test kernel module
●
https://github.com/arighi/crashtest
●
Simple scripts to speed up kernel development
(wrappers around virtme/qemu/crash):
●
https://github.com/arighi/kernel-crash-tools
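A minimal virtme workflow might look like this (the kernel tree path is illustrative); it boots the kernel you just built inside qemu/kvm on top of the running system:
$ cd ~/linux && make -j$(nproc)
$ virtme-run --kdir ~/linux --mods=auto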
Profiling vs tracing
●
Profiling
●
Set up a periodic timer interrupt that samples the current
program counter, function address and the entire stack
back trace
●
Tracing
●
Record times and invocations of specific events
Profiling example: perf top
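A typical sampling session with perf might look like this (99 Hz, all CPUs, with call graphs; these are standard perf options):
$ sudo perf top                              # live, per-function CPU profile
$ sudo perf record -F 99 -a -g -- sleep 10   # sample for 10 seconds
$ sudo perf report                           # browse the collected profile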
Tracing example: strace
●
strace(1): system call tracer in Linux
●
It uses the ptrace() system call, which pauses the target
process at each syscall so that the tracer can read
its state
●
And it’s doing this twice: when the syscall begins and when
it ends!
strace overhead
### Regular execution ###
$ dd if=/dev/zero of=/dev/null bs=1 count=500k
512000+0 records in
512000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 0.501455 s, 1.0 MB/s
### Strace execution (tracing a syscall that is never called) ###
$ strace -e trace=accept dd if=/dev/zero of=/dev/null bs=1 count=500k
512000+0 records in
512000+0 records out
512000 bytes (512 kB, 500 KiB) copied, 44.0216 s, 11.6 kB/s
+++ exited with 0 +++
Tracing kernel functions: kprobe
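Without writing any code, a kprobe can be placed through the ftrace kprobe_events interface (see Documentation/trace/kprobetrace.rst); do_sys_open below is just an example target, run as root:
# echo 'p:myopen do_sys_open' > /sys/kernel/debug/tracing/kprobe_events
# echo 1 > /sys/kernel/debug/tracing/events/kprobes/myopen/enable
# cat /sys/kernel/debug/tracing/trace_pipe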
eBPF
eBPF history
●
Initially it was BPF: Berkeley Packet Filter
●
It has its roots in BSD in the very early 1990s
●
Originally designed as a mechanism for fast filtering network packets
●
3.15: Linux introduced eBPF: extended Berkeley Packet Filter
●
More efficient / more generic than the original BPF
●
3.18: eBPF VM exposed to user-space
●
4.9: eBPF programs can be attached to perf_events
●
4.10: eBPF programs can be attached to cgroups
●
4.15: eBPF LSM hooks
eBPF features
●
Highly efficient VM that lives in the kernel
●
Inject safe, sandboxed bytecode into the kernel
●
Attach code to kernel functions / events
●
In-kernel JIT compiler
●
Dynamically translate eBPF bytecode into native opcodes
●
eBPF makes kernel programmable without having to
cross kernel/user-space boundaries
●
Access in-kernel data structures directly without the risk
of crashing, hanging or breaking the kernel in any way
eBPF as a VM
●
Example assembly of a simple
eBPF filter
●
Load 16-bit quantity from offset
12 in the packet to the
accumulator (ethernet type)
●
Compare the value to see if the
packet is an IP packet
●
If the packet is IP, return TRUE
(packet is accepted)
●
otherwise return 0 (packet is
rejected)
●
Only 4 VM instructions to filter
IP packets!
ldh [12]
jeq #ETHERTYPE_IP, l1, l2
l1: ret #TRUE
l2: ret #0
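Classic BPF like this is easy to see in practice: tcpdump compiles its filter expressions to BPF and can dump the resulting instructions with -d, producing a listing very similar to the example above:
$ sudo tcpdump -d ip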
eBPF use cases
How many eBPF programs are running on your laptop?
●
Run this:
ls -la /proc/*/fd | grep bpf-prog | wc -l
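On recent kernels, bpftool (if installed) can also list the loaded programs directly:
$ sudo bpftool prog list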
Flame graphs
●
CPU flame graphs
●
x-axis: sample population
●
y-axis: stack depth
●
Wider boxes =
More samples =
More CPU time =
More overhead!
[Figure: interactive CPU flame graph of a Java (vert.x/netty) web server under load; user-space Java frames (org.mozilla.javascript, io.netty, org.vertx) sit on top of kernel network paths such as tcp_sendmsg, ip_queue_xmit, tcp_write_xmit and tcp_v4_rcv]
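A graph like this can be built from perf samples with Brendan Gregg's FlameGraph scripts (https://github.com/brendangregg/FlameGraph, assumed here to be cloned into ./FlameGraph):
$ sudo perf record -F 99 -a -g -- sleep 30
$ sudo perf script | ./FlameGraph/stackcollapse-perf.pl | ./FlameGraph/flamegraph.pl > flame.svg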
BCC tracing tools
●
BPF Compiler Collection https://github.com/iovisor/bcc
●
Front-end to eBPF
●
BCC makes eBPF programs easier to write
●
Includes a C wrapper around LLVM
●
Python
●
Lua
●
C++
●
C helper libs
●
Lots of pre-defined tools available
Example #1: trace exec()
●
Intercept all the processes executed in the system
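This demo can be reproduced with the pre-built BCC tool execsnoop (the install path varies by distribution; on Ubuntu the bpfcc-tools package ships it as execsnoop-bpfcc):
$ sudo /usr/share/bcc/tools/execsnoop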
Example #2: keylogger
●
Identify where and how keyboard characters are received
and processed by the kernel
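A rough sketch of the idea with BCC's generic trace tool: hook input_event(), which input events (including keystrokes) pass through; the choice of probe point is an assumption, since the exact path depends on the input driver:
$ sudo /usr/share/bcc/tools/trace 'input_event "type=%d code=%d value=%d", arg2, arg3, arg4'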
Example #3: ping
●
Identify where ICMP packets (ECHO_REQUEST /
ECHO_REPLY) are received and processed by the kernel
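Similarly, incoming ICMP traffic can be observed by probing icmp_rcv() and printing the kernel stack that led to it (-K); ping the machine while this runs:
$ sudo /usr/share/bcc/tools/trace -K 'icmp_rcv'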
Example #4: task wait / wakeup
●
Determine the stack trace
of a sleeping process and
the stack trace of the
process that wakes up a
sleeping process
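BCC ships a ready-made tool for exactly this, offwaketime, which pairs the off-CPU (sleeping) stack of a process with the stack of its waker; the trailing argument is the trace duration in seconds:
$ sudo /usr/share/bcc/tools/offwaketime 5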
Is tracing safe?
eBPF: performance overhead - use case #1
●
user-space daemon using an eBPF program attached to a
function (via kprobe)
●
kernel is updated, function doesn’t exist anymore
●
daemon starts to use an older/slower non-BPF method
●
5% performance regression
eBPF: performance overhead - use case #2
●
kernel function mapped to a 2MB huge page
●
eBPF program attached to that function (via kprobe)
●
setting the kprobe causes the function to be remapped to a
regular 4KB page
●
increased TLB misses
●
2% performance regression
eBPF: compile once, run everywhere?
●
… not exactly! :-(
●
eBPF programs are compiled on the target system
immediately before they are loaded
●
Linux headers are needed to understand kernel data
structures
●
structure randomization is a problem
●
BTF (BPF Type Format) was created to address this
●
kernel data description embedded in the kernel (no longer
any need to ship kernel headers around!)
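On kernels built with CONFIG_DEBUG_INFO_BTF (exposed around v5.4), the type information lives at /sys/kernel/btf/vmlinux and can be inspected with bpftool:
$ sudo bpftool btf dump file /sys/kernel/btf/vmlinux | head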
Conclusion
●
Virtualization is your friend to speed up kernel
development
●
Real-time tracing can be an effective way to study and
understand how the kernel works
●
Kernel development can be challenging... but fun! :)
References
●
Brendan Gregg blog
●
http://brendangregg.com/blog/
●
BCC tools
●
https://github.com/iovisor/bcc
●
virtme
●
https://github.com/amluto/virtme
●
crashtest
●
https://github.com/arighi/crashtest
●
kernel-crash-tools
●
https://github.com/arighi/kernel-crash-tools
Thank you
Andrea Righi
andrea.righi@canonical.com
www.canonical.com
twitter: @arighi
