Practical SystemTAP
Basics

Lubomir Rintel <lkundrak@v3.sk>
BTC: 18PhCYhP1FxZvWjJgUF57J2DuHwHnr4eLa
Outline
● What can it do
● How does it work
● Language
● Practical example
“A dynamic tracing tool”
● GDB?
● DTrace?
● AWK?
● C?
Powerful language
● Probe points
● Translated into C
● Compiled into kernel module
● Can access user and kernel memory
● Communicate to the user space
#!/usr/bin/stap
# Hello SystemTAP!
probe begin {
    println ("Hello world!");
}
probe syscall.open {
    println (execname (), " opened ",
             user_string ($filename))
}
# yum -y install kernel-devel \
      systemtap-devel systemtap-runtime
# debuginfo-install -y kernel
# stap -v example1.stp
...
Hello world!
hald opened /sys/devices/...
python opened /usr/share/...
^C
#
So intense!
Probe points
● Static (think DTrace)
● Dynamic
● Kernel and User space
● Predefined (system calls, etc.)
● Can access variables in scope
begin
end
syscall.open
process ("/usr/sbin/httpd").function ("ap_invoke_handler")
module ("fuse").function ("fuse_*").return
kernel.function ("*setxattr*@fs/xattr.c")
kernel.statement ("kfree@mm/slub.c+3")
timer.ms (100)
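A minimal sketch of how several of the probe points above can be combined in one script. What gets printed is an assumption made for illustration; execname(), pid(), probefunc() and the $$parms pretty-printer are standard SystemTap facilities, but which functions the wildcard matches depends on the running kernel and its debuginfo.

```systemtap
#!/usr/bin/stap
# Sketch only: combines several probe point kinds from the list above.

global hits;   # matched calls since the last report

probe begin {
    println ("Tracing xattr setters, press ^C to stop");
}

# Dynamic kernel probe; $$parms expands to the probed function's parameters
probe kernel.function ("*setxattr*@fs/xattr.c") {
    hits++;
    printf ("%s[%d] %s (%s)\n", execname (), pid (),
            probefunc (), $$parms);
}

# Timer probe: periodic summary
probe timer.ms (1000) {
    if (hits)
        printf ("%d calls in the last second\n", hits);
    hits = 0;
}

probe end {
    println ("Done");
}
```

Run it the same way as the hello-world script, for example with stap -v and the script file name.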
Scripting
● Resembles AWK
● Can inline C!
● Functions
● Variables
● Loops
● Statistical aggregates
● I/O to userspace
● Safety limits apply
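The language features listed above can be seen together in a short sketch. The script below is illustrative and not part of the talk's example: the names reqs and kib are hypothetical, and it assumes the syscall.read tapset probe exposes the requested size as count.

```systemtap
#!/usr/bin/stap
# Sketch only: a function, a global array, a statistical aggregate and a loop.

global reqs;   # per-executable statistics of requested read sizes

# Hypothetical helper: bytes to KiB
function kib (bytes) {
    return bytes / 1024;
}

probe syscall.read {
    # <<< adds a sample to the aggregate stored under this key
    reqs[execname ()] <<< count;
}

probe timer.s (5) {
    # Statistics arrays sort by @count; "-" means descending order
    foreach (name in reqs- limit 5)
        printf ("%-16s calls=%d avg=%d B max=%d KiB\n", name,
                @count (reqs[name]), @avg (reqs[name]),
                kib (@max (reqs[name])));
    exit ();
}
```

Aggregates are only read when the timer fires, so the per-call cost in the syscall probe stays small.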
Library
● Probes
● Tapsets
● Functions
● Documentation

stapex      (3stap) - systemtap examples
stapfuncs   (3stap) - systemtap functions
stapprobes  (3stap) - systemtap probe points
stapvars    (3stap) - systemtap variables
Practical example
● Simple Perl physical memory profiler
● Kernel probe
● User probe
● Simple and fast
Perl part: the virtual machine
● Interpreter instance (my_perl) holds current OP pointer
● OP points to a package it was compiled from
● Each OP is implemented by Perl_pp_*() function
● runops() loop consumes OP tree, calling Perl_pp_*()
/* The package a process is
 * executing */
global package;
/* A Perl instruction */
probe process ("/usr/lib/perl5/CORE/libperl.so")
      .function ("Perl_pp_*")
{
    pid = pid ();
    package[pid] = user_string (
        $my_perl->Icurcop->cop_stashpv);
}
Kernel part: memory management
● User calls malloc()
● libc calls mmap2() to map anonymous memory
● kernel maps read-only shared zero-page
● write causes a page fault
● do_wp_page() copies on write
/* Allocations per package and pid */
global allocs;
/* Account for COW */
probe kernel.function ("do_wp_page")
{
    pid = pid ();
    pkg = package[pid];
    if (pkg != "")
        allocs[pkg, pid] <<<
            mem_page_size ();
}
Output
● Do it as fast as we can
● We are resource-constrained for safety
● Process it in userspace later on
● Also, flush it upon end
/* Dump frequently, so that we
 * won't overflow */
probe timer.ms (100), end
{
    foreach ([pkg, pid] in allocs) {
        printf ("\"%d\",\"%s\"\n",
            @sum (allocs[pkg, pid]),
            pkg)
    }
    delete allocs;
}
$ stap -v perlmem.stp |tee profile.csv
"8192","main"
"20480","Exporter"
"45056","main"
"36864","constant"
"118784","strict"
"8192","warnings"
"16384","vars"
"8192","Getopt::Long"
"32768","warnings::register"
"36864","main"
"36864","Exporter"
"24576","warnings"
...
What's missing
● eval
● Multiple interpreters
● execve()
● Memory reclamation
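As an illustration of the execve() gap only, one possible approach is to forget the recorded package when a process replaces its image, so that later page faults are not charged to a stale Perl package. This is a hypothetical sketch, not part of the original script; it assumes the package global from the Perl probe and uses the syscall.execve tapset probe.

```systemtap
# Sketch only: drop the recorded package when a process calls execve().
# "package" mirrors the global of the Perl probe; omit this declaration
# when appending the probe to a script that already declares it.
global package;

probe syscall.execve {
    # If the exec fails, the next Perl_pp_*() probe hit repopulates the entry
    delete package[pid ()];
}
```

The other items (eval, multiple interpreters, memory reclamation) are not addressed by this fragment.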
Questions?

Found this useful? My Bitcoin address is:
18PhCYhP1FxZvWjJgUF57J2DuHwHnr4eLa
Safety
● Native code vs. DTrace's virtual machine
● Running unprivileged
● Guru mode
● Inline C
● Timeouts
● Memory
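To make the guru mode and inline C items concrete, here is a minimal sketch of an embedded-C function. The function name current_prio is hypothetical, and the STAP_RETVALUE macro assumes SystemTap 1.8 or later (older releases used THIS->__retvalue). Embedded C bypasses the translator's safety checks, which is why it is accepted only in guru mode.

```systemtap
#!/usr/bin/stap -g
# Sketch only: embedded C requires guru mode (-g).

%{
#include <linux/sched.h>
%}

# Hypothetical helper: scheduling priority of the current task.
# STAP_RETVALUE assumes SystemTap 1.8+; older releases used THIS->__retvalue.
function current_prio:long () %{ /* pure */
    STAP_RETVALUE = current->prio;
%}

probe begin {
    printf ("%s runs at priority %d\n", execname (), current_prio ());
    exit ();
}
```

Without -g the translator is expected to reject the script, since embedded C is not permitted outside guru mode.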
