This document discusses Linux micro-benchmarks, which measure small but important portions of a system's performance rather than modeling a real workload. While micro-benchmarks can be misused, they are valid tools when used correctly. Several case studies show how micro-benchmarks can behave unexpectedly when system calls are measured, such as a fork() benchmark that actually measured page faults in addition to forks. Micro-benchmarks are most useful after profiling has identified a bottleneck, when they are kept very simple, and when they have been thoroughly tested.
1. Tales of Linux Micro-benchmarks
Matt Fleming
@fleming_matt
2. Agenda
- Background: What is a micro-benchmark?
- Why all the hate?
- Case Studies
- When are they useful?
@fleming_matt · Tales of Linux micro-benchmarks
3. Micro-what?
Benchmark: A program to measure the
performance of a system, usually for comparison.
Models a real-life workload.
Micro-benchmark: A program to measure a (small
but important!) portion of a system.
Artificial/synthetic.
4. Why all the hate?
You need some understanding of OS and
runtime/toolchain.
Many do not actually test what the author
intended.
But that is simply a bug or user error; it doesn't
invalidate the concept.
5. Why all the hate?
C: simple loop compiled with -O0 and -O2
for (i = 0; i < 1000000000; i++)
val = val * 2;
Time with -O0: 2.238s    Time with -O2: 0.001s
6. Why all the hate?
C: simple loop compiled with -O0 and -O2
for (i = 0; i < 1000000000; i++)
val = val * 2;
movl $0x0,-0xc(%rbp)
jmp 2
1:
shlq -0x8(%rbp)
addl $0x1,-0xc(%rbp)
2:
cmpl $999999999,-0xc(%rbp)
jle 1
Time with -O0: 2.238s (the code above)    Time with -O2: 0.001s
(at -O2 the compiler can discard the loop entirely, since val is never read afterwards)
8. Case study 1 - Siege
9. Case study 1 - Siege
$ perf top
 5.62% [kernel] [k] task_cputime
 3.33% [kernel] [k] osq_lock
 2.58% [kernel] [k] thread_group_cputime

- task_cputime
  - 97.35% thread_group_cputime
           thread_group_cputime_adjusted
           do_sys_times
           sys_times
           entry_SYSCALL_64_fastpath
10. Case study 2 - lmbench
Measures fork() + exit()
11. Case study 2 - lmbench
Measures fork() + exit()
Actually measures fork() + page fault + exit()
12. Case study 2 - lmbench
Measures fork() + exit()
Actually measures fork() + page fault + exit()
[Chart: page faults (#PF) by faulting address, varying fault_around_bytes]
13. Case study 3 - hackbench
Message-passing micro-benchmark
Processes or threads
Pipes or sockets
14. Case study 3 - hackbench
Message-passing micro-benchmark
Processes or threads
Pipes or sockets
70%
15. Case study 4 - pipetest
17. When are they useful?
- After profiling your workload and identifying bottlenecks
- When they're super simple
- When they've been tested