Linux BPF Superpowers

Linux 4.x Performance
Using BPF Superpowers

Brendan Gregg
Senior Performance Architect at Netflix
Feb 2016
Ten years ago, I gave a talk here about DTrace tools…
Superpowers are coming to Linux
Solve performance issues that were previously impossible
For example, full off-CPU analysis…
Ideal Thread States
A starting point for deeper analysis
Linux Thread States
Based on:
TASK_RUNNING
TASK_INTERRUPTIBLE
TASK_UNINTERRUPTIBLE
Still a useful starting point
Linux On-CPU Analysis
(Figure: CPU flame graph)
•  I'll start with on-CPU analysis:
•  Split into user/kernel states using /proc, mpstat(1), ...
•  perf_events ("perf") to analyze further:
–  User & kernel stack sampling (as a CPU flame graph)
–  CPI
–  Should be easy, but…
Broken Stacks
Missing Java stacks
Missing Symbols
"[unknown]"
Java Mixed-Mode CPU Flame Graph
Legend: Java, JVM, Kernel, GC
•  Fixed!
–  Java -XX:+PreserveFramePointer
–  Java perf-map-agent
–  Linux perf_events
Axes: stack depth (y), samples (x, alphabetical sort)
Also, CPI Flame Graph
Cycles Per Instruction
-  red == instruction heavy
-  blue == cycle heavy (likely mem stalls)
(Figure: CPI flame graph, zoomed)
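As a rough illustration of the metric behind the CPI flame graph, cycles per instruction is the ratio of two hardware counter totals. The sketch below uses invented counter values and hypothetical function names; it only shows the arithmetic, not how the flame graph itself is produced.

```python
# Hypothetical per-function counter totals: (cycles, instructions).
# The names and numbers are invented for illustration only.
counters = {
    "memcpy_heavy": (9_000_000, 3_000_000),
    "tight_loop": (2_000_000, 4_000_000),
}


def cpi(cycles, instructions):
    """Cycles per instruction: a higher value means more cycles per unit of work."""
    return cycles / instructions


for name, (cycles, instructions) in counters.items():
    print(f"{name}: CPI = {cpi(cycles, instructions):.2f}")
```

A high CPI means many cycles were spent per instruction retired, which is what the slide labels cycle heavy (likely memory stalls).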
Linux Off-CPU Analysis
On Linux, the state isn't helpful, but the code path is
Off-CPU analysis by measuring blocked time with stack traces
Off-CPU Time Flame Graph
From http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
Axes: stack depth (y), off-CPU time (x)
Off-CPU Time (zoomed): tar(1)
file read from disk
directory read from disk
Currently kernel stacks only; user stacks will add more context
pipe write
path read from disk
fstat from disk
Off-CPU Time: more states
lock contention
sleep
run queue latency
Flame graph quantifies total time spent in states
CPU + Off-CPU == See Everything?
Off-CPU Time (zoomed): gzip(1)
Off-CPU doesn't always make sense:
what is gzip blocked on?
Wakeup Time Flame Graph
Wakeup Time (zoomed): gzip(1)
gzip(1) is blocked on tar(1)!
tar cf - * | gzip > out.tar.gz
Can't we associate off-CPU with wakeup stacks?
Off-Wake Time Flame Graph
Wakeup stacks are associated and merged in-kernel using BPF
We couldn't do this before
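The idea of associating a blocked (off-CPU) stack with the stack of the thread that woke it can be sketched in plain Python. This is a user-space simulation under stated assumptions, not the in-kernel BPF program: the frame names, the "--" delimiter, and the helper names `offwake_key` and `record` are all hypothetical.

```python
from collections import defaultdict

# Merged stack key -> total blocked time in microseconds.
totals = defaultdict(int)


def offwake_key(blocked_stack, waker_stack):
    """Join the blocked stack (root first) with the waker stack (inverted)."""
    return ";".join(blocked_stack + ["--"] + list(reversed(waker_stack)))


def record(blocked_stack, waker_stack, blocked_us):
    """Add one blocking event to the running total for its merged stack."""
    totals[offwake_key(blocked_stack, waker_stack)] += blocked_us


# Hypothetical events: a reader blocked on a pipe, woken by a writer.
reader = ["sys_read", "vfs_read", "pipe_read", "pipe_wait"]
writer = ["sys_write", "vfs_write", "pipe_write", "__wake_up_sync_key"]
record(reader, writer, 1500)
record(reader, writer, 500)

for key, us in totals.items():
    print(key, us)
```

Keeping one summed value per merged stack is what lets the totals be read out once at the end rather than emitting every event.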
Haven't Solved Everything Yet…
•  One wakeup stack is often not enough…
•  Who woke the waker?
Chain Graphs
Merging multiple wakeup stacks in kernel using BPF
With enough stacks, all paths lead to metal
Solve Everything
CPU + off-CPU analysis can solve most issues
Flame graph (profiling) types:
1.  CPU
2.  CPI
3.  Off-CPU time
4.  Wakeup time
5.  Off-wake time
6.  Chain
BPF makes this all more practical
Different off-CPU analysis views, with more context and increasing measurement cost
2. BPF
"One of the more interesting features in this cycle is the ability to attach
eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to
kprobes. This allows user-defined instrumentation on a live kernel image that
can never crash, hang or interfere with the kernel negatively."
– Ingo Molnár (Linux developer)
Source: https://lkml.org/lkml/2015/4/14/232
"crazy stuff"
– Alexei Starovoitov (eBPF lead)
Source: http://www.slideshare.net/AlexeiStarovoitov/bpf-inkernel-virtual-machine
BPF
•  eBPF == enhanced Berkeley Packet Filter; now just BPF
•  Integrated into Linux (in stages: 3.15, 3.19, 4.1, 4.5, …)
•  Uses
–  virtual networking
–  tracing
–  "crazy stuff"
•  Front-ends
–  samples/bpf (raw)
–  bcc: Python, C
–  Linux perf_events
(Figure: BPF mascot)
BPF for Tracing
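•  Can do per-event output and in-kernel summary statistics (histograms, etc).
(Diagram: a user program (1) generates BPF bytecode and (2) loads it into the kernel, where BPF attaches to kprobes, uprobes, and tracepoints. Per-event data is returned via perf_output and statistics via maps, which the user program (3) reads asynchronously.)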
Old way: TCP Retransmits
•  tcpdump of all send & receive, dump to FS, post-process
•  Overheads add up on 10GbE+
(Diagram: tcpdump (1) reads all send and receive packets from a kernel buffer and (2) dumps them to the file system on disk. An analyzer then (1) reads the dump, (2) runs a state machine, and (3) prints.)
New way: BPF TCP Retransmits
•  Just trace the retransmit functions
•  Negligible overhead
(Diagram: tcpretrans (bcc) (1) configures BPF and a kprobe on tcp_retransmit_skb(), then (2) reads and prints events. Send/receive are left as-is.)
BPF: TCP Retransmits
# ./tcpretrans
TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED
[…]
includes kernel state
Old: Off-CPU Time Stack Profiling
•  perf_events tracing of sched events, post-process
•  Despite buffering, usually high cost (>1M events/sec)
(Diagram: perf record (1) asynchronously reads scheduler events from a kernel buffer and (2) dumps them to the file system (or a pipe) on disk. perf inject (1) reads and (2) rewrites the data. perf report/script then reads, processes, and prints.)
New: BPF Off-CPU Time Stacks
•  Measure off-CPU time, add to map with key = stack, value = total time. Async read map.
(Diagram: offcputime (bcc) (1) configures BPF and a kprobe on the scheduler's finish_task_switch(). BPF stores stacks in maps. The tool then (2) asynchronously reads the stacks, (3) translates symbols, and (4) prints.)
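The "key = stack, value = total time" aggregation can be sketched as a user-space simulation. This is not the offcputime tool or a BPF program: the event tuple layout, the thread IDs, and the folded stack strings are assumptions made for illustration, and the stack is recorded here at switch-out for simplicity.

```python
from collections import defaultdict


def offcpu_totals(switches):
    """Sum off-CPU time per stack.

    switches: iterable of (ts_us, prev_tid, prev_stack, next_tid) tuples,
    one per simulated context switch, in timestamp order.
    """
    went_off = {}              # tid -> (timestamp, stack) when it left the CPU
    totals = defaultdict(int)  # stack -> total off-CPU microseconds
    for ts, prev_tid, prev_stack, next_tid in switches:
        went_off[prev_tid] = (ts, prev_stack)
        if next_tid in went_off:
            start, stack = went_off.pop(next_tid)
            totals[stack] += ts - start
    return dict(totals)


# Hypothetical context switches between two threads.
events = [
    (100, 1, "sys_read;pipe_wait", 2),
    (250, 2, "sys_nanosleep;do_nanosleep", 1),
    (400, 1, "sys_read;pipe_wait", 2),
    (1000, 2, "sys_nanosleep;do_nanosleep", 1),
]

for stack, us in sorted(offcpu_totals(events).items()):
    print(stack, us)
```

Only the per-stack totals need to leave the kernel in this design, which is why the cost stays low compared with dumping every scheduler event.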
  
Stack Trace Hack
•  For my offcputime tool, I wrote a BPF stack walker:
"Crazy Stuff"
•  … using unrolled loops & goto:
BPF Stack Traces
•  Proper BPF stack support just landed in net-next:
•  Allows more than just chain graphs
Date Sat, 20 Feb 2016 00:25:05 -0500 (EST)
Subject Re: [PATCH net-next 0/3] bpf_get_stackid() and stack_trace map
From David Miller <>
From: Alexei Starovoitov <ast@fb.com>
Date: Wed, 17 Feb 2016 19:58:56 -0800
> This patch set introduces new map type to store stack traces and
> corresponding bpf_get_stackid() helper.
...
Series applied, thanks Alexei.
memleak	
  
•  Real-time memory growth and leak analysis:
•  Uses my stack hack, but will switch to BPF stacks soon
•  By Sasha Goldshtein. Another bcc tool.
# ./memleak.py -o 10 60 1
Attaching to kmalloc and kfree, Ctrl+C to quit.
[01:27:34] Top 10 stacks with outstanding allocations:
        72 bytes in 1 allocations from stack
                 alloc_fdtable [kernel] (ffffffff8121960f)
                 expand_files [kernel] (ffffffff8121986b)
                 sys_dup2 [kernel] (ffffffff8121a68d)
[…]
        2048 bytes in 1 allocations from stack
                 alloc_fdtable [kernel] (ffffffff812195da)
                 expand_files [kernel] (ffffffff8121986b)
                 sys_dup2 [kernel] (ffffffff8121a68d)
Trace for 60s
Show kernel allocations older than 10s that were not freed
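The bookkeeping behind "allocations older than 10s that were not freed" can be sketched as follows. This is a plain Python simulation, not the memleak tool's code: the `AllocTracker` class, its method names, the addresses, and the folded stack strings are hypothetical.

```python
from collections import defaultdict


class AllocTracker:
    """Track live allocations and report old ones grouped by stack."""

    def __init__(self):
        self.live = {}  # address -> (size, timestamp, stack)

    def on_alloc(self, addr, size, ts, stack):
        self.live[addr] = (size, ts, stack)

    def on_free(self, addr):
        # Frees of unknown addresses are ignored (allocated before tracing).
        self.live.pop(addr, None)

    def outstanding(self, now, min_age):
        """Return {stack: (total_bytes, count)} for allocations >= min_age old."""
        by_stack = defaultdict(lambda: [0, 0])
        for size, ts, stack in self.live.values():
            if now - ts >= min_age:
                by_stack[stack][0] += size
                by_stack[stack][1] += 1
        return {stack: tuple(v) for stack, v in by_stack.items()}


tracker = AllocTracker()
tracker.on_alloc(0x1000, 72, 0, "sys_dup2;expand_files;alloc_fdtable")
tracker.on_alloc(0x2000, 2048, 5, "sys_dup2;expand_files;alloc_fdtable")
tracker.on_alloc(0x3000, 64, 55, "hypothetical_caller")
tracker.on_free(0x2000)

for stack, (nbytes, count) in tracker.outstanding(now=60, min_age=10).items():
    print(f"{nbytes} bytes in {count} allocations from stack {stack}")
```

The age filter skips allocations that are too recent to judge, since short-lived objects that have not yet been freed would otherwise look like leaks.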
  
3. bcc
•  BPF Compiler Collection
–  https://github.com/iovisor/bcc
•  Python front-end, C instrumentation
•  Currently beta – in development!
•  Some example tracing tools…
execsnoop	
  
•  Trace new processes:
# ./execsnoop
PCOMM            PID    RET ARGS
bash             15887    0 /usr/bin/man ls
preconv          15894    0 /usr/bin/preconv -e UTF-8
man              15896    0 /usr/bin/tbl
man              15897    0 /usr/bin/nroff -mandoc -rLL=169n -rLT=169n -Tutf8
man              15898    0 /usr/bin/pager -s
nroff            15900    0 /usr/bin/locale charmap
nroff            15901    0 /usr/bin/groff -mtty-char -Tutf8 -mandoc -rLL=169n …
groff            15902    0 /usr/bin/troff -mtty-char -mandoc -rLL=169n -rLT=169 …
groff            15903    0 /usr/bin/grotty
biolatency	
  
•  Block device (disk) I/O latency distribution:
# ./biolatency -mT 1 5
Tracing block device I/O... Hit Ctrl-C to end.
06:20:16
     msecs           : count     distribution
         0 -> 1      : 36       |**************************************|
         2 -> 3      : 1        |*                                     |
         4 -> 7      : 3        |***                                   |
         8 -> 15     : 17       |*****************                     |
        16 -> 31     : 33       |**********************************    |
        32 -> 63     : 7        |*******                               |
        64 -> 127    : 6        |******                                |
[…]
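The bucket ranges in this output (0 -> 1, 2 -> 3, 4 -> 7, ...) are powers of two. A minimal sketch of that kind of bucketing in plain Python follows; it is an illustration of the idea, not biolatency's code, and the function names and sample latencies are hypothetical.

```python
def log2_bucket(value):
    """Bucket index for a non-negative integer: 0 -> 0-1, 1 -> 2-3, 2 -> 4-7, ..."""
    return max(value.bit_length() - 1, 0)


def bucket_range(index):
    """Inclusive (low, high) range covered by a bucket index."""
    if index == 0:
        return (0, 1)
    return (1 << index, (1 << (index + 1)) - 1)


def histogram(values):
    """Count values per power-of-two bucket."""
    counts = {}
    for v in values:
        b = log2_bucket(v)
        counts[b] = counts.get(b, 0) + 1
    return counts


# Hypothetical latencies in milliseconds.
latencies_ms = [0, 1, 2, 3, 5, 9, 15, 16]
for index, count in sorted(histogram(latencies_ms).items()):
    low, high = bucket_range(index)
    print(f"{low:>6} -> {high:<6} : {count}")
```

Storing one counter per bucket keeps the summary small and fixed in size regardless of how many I/O events are traced.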
ext4slower	
  
•  ext4 file system I/O, slower than a threshold:
# ./ext4slower 1
Tracing ext4 operations slower than 1 ms
TIME     COMM           PID    T BYTES   OFF_KB   LAT(ms) FILENAME
06:49:17 bash           3616   R 128     0           7.75 cksum
06:49:17 cksum          3616   R 39552   0           1.34 [
06:49:17 cksum          3616   R 96      0           5.36 2to3-2.7
06:49:17 cksum          3616   R 96      0          14.94 2to3-3.4
06:49:17 cksum          3616   R 10320   0           6.82 411toppm
06:49:17 cksum          3616   R 65536   0           4.01 a2p
06:49:17 cksum          3616   R 55400   0           8.77 ab
06:49:17 cksum          3616   R 36792   0          16.34 aclocal-1.14
06:49:17 cksum          3616   R 15008   0          19.31 acpi_listen
06:49:17 cksum          3616   R 6123    0          17.23 add-apt-repository
06:49:17 cksum          3616   R 6280    0          18.40 addpart
06:49:17 cksum          3616   R 27696   0           2.16 addr2line
06:49:17 cksum          3616   R 58080   0          10.11 ag
06:49:17 cksum          3616   R 906     0           6.30 ec2-meta-data
[…]
bashreadline	
  
•  Trace bash interactive commands system-wide:
# ./bashreadline
TIME      PID    COMMAND
05:28:25  21176  ls -l
05:28:28  21176  date
05:28:35  21176  echo hello world
05:28:43  21176  foo this command failed
05:28:45  21176  df -h
05:29:04  3059   echo another shell
05:29:13  21176  echo first shell again
gethostlatency	
  
•  Show latency for getaddrinfo/gethostbyname[2] calls:
# ./gethostlatency
TIME      PID    COMM          LATms HOST
06:10:24  28011  wget          90.00 www.iovisor.org
06:10:28  28127  wget           0.00 www.iovisor.org
06:10:41  28404  wget           9.00 www.netflix.com
06:10:48  28544  curl          35.00 www.netflix.com.au
06:11:10  29054  curl          31.00 www.plumgrid.com
06:11:16  29195  curl           3.00 www.facebook.com
06:11:25  29404  curl          72.00 foo
06:11:28  29475  curl           1.00 foo
trace	
  
•  Trace custom events. Ad hoc analysis multitool:
# trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
TIME     PID    COMM         FUNC             -
05:18:23 4490   dd           sys_read         read 1048576 bytes
05:18:23 4490   dd           sys_read         read 1048576 bytes
05:18:23 4490   dd           sys_read         read 1048576 bytes
05:18:23 4490   dd           sys_read         read 1048576 bytes
^C
Linux bcc/BPF Tracing Tools
4. Future Work
•  All event sources
•  Language improvements
•  More tools: eg, TCP
•  GUI support
Linux Event Sources
(Figure: Linux event sources diagram, with sources annotated "done" or "XXX: todo")
BPF/bcc Language Improvements
More Tools
•  eg, netstat(8)…
$ netstat -s
Ip:
7962754 total packets received
8 with invalid addresses
0 forwarded
0 incoming packets discarded
7962746 incoming packets delivered
8019427 requests sent out
Icmp:
382 ICMP messages received
0 input ICMP message failed.
ICMP input histogram:
destination unreachable: 125
timeout in transit: 257
3410 ICMP messages sent
0 ICMP messages failed
ICMP output histogram:
destination unreachable: 3410
IcmpMsg:
InType3: 125
InType11: 257
OutType3: 3410
Tcp:
17337 active connections openings
395515 passive connection openings
8953 failed connection attempts
240214 connection resets received
3 connections established
7198375 segments received
7504939 segments send out
62696 segments retransmited
10 bad segments received.
1072 resets sent
InCsumErrors: 5
Udp:
759925 packets received
3412 packets to unknown port received.
0 packet receive errors
784370 packets sent
UdpLite:
TcpExt:
858 invalid SYN cookies received
8951 resets received for embryonic SYN_RECV sockets
14 packets pruned from receive queue because of socket buffer overrun
6177 TCP sockets finished time wait in fast timer
293 packets rejects in established connections because of timestamp
733028 delayed acks sent
89 delayed acks further delayed because of locked socket
Quick ack mode was activated 13214 times
336520 packets directly queued to recvmsg prequeue.
43964 packets directly received from backlog
11406012 packets directly received from prequeue
1039165 packets header predicted
7066 packets header predicted and directly queued to user
1428960 acknowledgments not containing data received
1004791 predicted acknowledgments
1 times recovered from packet loss due to fast retransmit
5044 times recovered from packet loss due to SACK data
2 bad SACKs received
Detected reordering 4 times using SACK
Detected reordering 11 times using time stamp
13 congestion windows fully recovered
11 congestion windows partially recovered using Hoe heuristic
TCPDSACKUndo: 39
2384 congestion windows recovered after partial ack
228 timeouts after SACK recovery
100 timeouts in loss state
5018 fast retransmits
39 forward retransmits
783 retransmits in slow start
32455 other TCP timeouts
TCPLossProbes: 30233
TCPLossProbeRecovery: 19070
992 sack retransmits failed
18 times receiver scheduled too late for direct processing
705 packets collapsed in receive queue due to low socket buffer
13658 DSACKs sent for old packets
8 DSACKs sent for out of order packets
13595 DSACKs received
33 DSACKs for out of order packets received
32 connections reset due to unexpected data
108 connections reset due to early user close
1608 connections aborted due to timeout
TCPSACKDiscard: 4
TCPDSACKIgnoredOld: 1
TCPDSACKIgnoredNoUndo: 8649
TCPSpuriousRTOs: 445
TCPSackShiftFallback: 8588
TCPRcvCoalesce: 95854
TCPOFOQueue: 24741
TCPOFOMerge: 8
TCPChallengeACK: 1441
TCPSYNChallenge: 5
TCPSpuriousRtxHostQueues: 1
TCPAutoCorking: 4823
IpExt:
InOctets: 1561561375
OutOctets: 1509416943
InNoECTPkts: 8201572
InECT1Pkts: 2
InECT0Pkts: 3844
InCEPkts: 306
Better TCP Tools
•  TCP retransmit by type and time
•  Congestion algorithm metrics
•  etc.
GUI	
  Support	
  
•  eg, Netflix Vector: open source instance analyzer:
Summary	
  
•  BPF in Linux 4.x makes many new things possible
–  Stack-based thread state analysis (solve all issues!)
–  Real-time memory growth/leak detection
–  Better TCP metrics
–  etc...
•  Get involved: see iovisor/bcc
•  So far just a preview of things to come
Links	
  
•  iovisor bcc:
•  https://github.com/iovisor/bcc
•  http://www.brendangregg.com/blog/2015-09-22/bcc-linux-4.3-tracing.html
•  http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
•  BPF Off-CPU, Wakeup, Off-Wake & Chain Graphs:
•  http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
•  http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
•  http://www.brendangregg.com/blog/2016-02-05/ebpf-chaingraph-prototype.html
•  Linux Performance:
•  http://www.brendangregg.com/linuxperf.html
•  Linux perf_events:
•  https://perf.wiki.kernel.org/index.php/Main_Page
•  http://www.brendangregg.com/perf.html
•  Flame Graphs:
•  http://techblog.netflix.com/2015/07/java-in-flames.html
•  http://www.brendangregg.com/flamegraphs.html
•  Netflix Tech Blog on Vector:
•  http://techblog.netflix.com/2015/04/introducing-vector-netflixs-on-host.html
•  Wordcloud: https://www.jasondavies.com/wordcloud/
Feb 2016
•  Questions?
•  http://slideshare.net/brendangregg
•  http://www.brendangregg.com
•  bgregg@netflix.com
•  @brendangregg
Thanks to Alexei Starovoitov (Facebook), Brenden
Blanco (PLUMgrid), Daniel Borkmann (Cisco), Wang
Nan (Huawei), Sasha Goldshtein (Sela), and other
BPF and bcc contributors!
1 of 60

Recommended

BPF: Tracing and more by
BPF: Tracing and moreBPF: Tracing and more
BPF: Tracing and moreBrendan Gregg
200.3K views72 slides
BPF - in-kernel virtual machine by
BPF - in-kernel virtual machineBPF - in-kernel virtual machine
BPF - in-kernel virtual machineAlexei Starovoitov
11.7K views41 slides
Performance Wins with eBPF: Getting Started (2021) by
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)Brendan Gregg
1.4K views30 slides
eBPF maps 101 by
eBPF maps 101eBPF maps 101
eBPF maps 101SUSE Labs Taipei
4.2K views64 slides
Building Network Functions with eBPF & BCC by
Building Network Functions with eBPF & BCCBuilding Network Functions with eBPF & BCC
Building Network Functions with eBPF & BCCKernel TLV
3K views32 slides
eBPF - Rethinking the Linux Kernel by
eBPF - Rethinking the Linux KerneleBPF - Rethinking the Linux Kernel
eBPF - Rethinking the Linux KernelThomas Graf
1.2K views24 slides

More Related Content

What's hot

Linux kernel tracing by
Linux kernel tracingLinux kernel tracing
Linux kernel tracingViller Hsiao
16.9K views70 slides
UM2019 Extended BPF: A New Type of Software by
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
33.1K views48 slides
eBPF Basics by
eBPF BasicseBPF Basics
eBPF BasicsMichael Kehoe
2.7K views63 slides
Understanding eBPF in a Hurry! by
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
1.5K views77 slides
Linux Kernel Crashdump by
Linux Kernel CrashdumpLinux Kernel Crashdump
Linux Kernel CrashdumpMarian Marinov
2.4K views41 slides
eBPF Trace from Kernel to Userspace by
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to UserspaceSUSE Labs Taipei
8.5K views74 slides

What's hot(20)

Linux kernel tracing by Viller Hsiao
Linux kernel tracingLinux kernel tracing
Linux kernel tracing
Viller Hsiao16.9K views
UM2019 Extended BPF: A New Type of Software by Brendan Gregg
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
Brendan Gregg33.1K views
Understanding eBPF in a Hurry! by Ray Jenkins
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
Ray Jenkins1.5K views
eBPF Trace from Kernel to Userspace by SUSE Labs Taipei
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
SUSE Labs Taipei8.5K views
BPF Internals (eBPF) by Brendan Gregg
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
Brendan Gregg15.3K views
Introduction to eBPF and XDP by lcplcp1
Introduction to eBPF and XDPIntroduction to eBPF and XDP
Introduction to eBPF and XDP
lcplcp15.5K views
Container Performance Analysis by Brendan Gregg
Container Performance AnalysisContainer Performance Analysis
Container Performance Analysis
Brendan Gregg448.6K views
Xdp and ebpf_maps by lcplcp1
Xdp and ebpf_mapsXdp and ebpf_maps
Xdp and ebpf_maps
lcplcp11.7K views
LinuxCon 2015 Linux Kernel Networking Walkthrough by Thomas Graf
LinuxCon 2015 Linux Kernel Networking WalkthroughLinuxCon 2015 Linux Kernel Networking Walkthrough
LinuxCon 2015 Linux Kernel Networking Walkthrough
Thomas Graf26.3K views
LISA2019 Linux Systems Performance by Brendan Gregg
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
Brendan Gregg374.9K views
Linux Performance Analysis: New Tools and Old Secrets by Brendan Gregg
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old Secrets
Brendan Gregg603.9K views
Linux Performance Profiling and Monitoring by Georg Schönberger
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
Georg Schönberger8.5K views
Cfgmgmtcamp 2023 — eBPF Superpowers by Raphaël PINSON
Cfgmgmtcamp 2023 — eBPF SuperpowersCfgmgmtcamp 2023 — eBPF Superpowers
Cfgmgmtcamp 2023 — eBPF Superpowers
Raphaël PINSON110 views
Faster packet processing in Linux: XDP by Daniel T. Lee
Faster packet processing in Linux: XDPFaster packet processing in Linux: XDP
Faster packet processing in Linux: XDP
Daniel T. Lee1.4K views
DPDK In Depth by Kernel TLV
DPDK In DepthDPDK In Depth
DPDK In Depth
Kernel TLV5.2K views
EBPF and Linux Networking by PLUMgrid
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
PLUMgrid14.6K views
USENIX ATC 2017: Visualizing Performance with Flame Graphs by Brendan Gregg
USENIX ATC 2017: Visualizing Performance with Flame GraphsUSENIX ATC 2017: Visualizing Performance with Flame Graphs
USENIX ATC 2017: Visualizing Performance with Flame Graphs
Brendan Gregg674.1K views

Viewers also liked

Velocity 2015 linux perf tools by
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf toolsBrendan Gregg
1.1M views142 slides
Velocity 2017 Performance analysis superpowers with Linux eBPF by
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPFBrendan Gregg
736.2K views54 slides
ACM Applicative System Methodology 2016 by
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016Brendan Gregg
158.2K views57 slides
Stop the Guessing: Performance Methodologies for Production Systems by
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production SystemsBrendan Gregg
32.1K views58 slides
Netflix: From Clouds to Roots by
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to RootsBrendan Gregg
67.8K views97 slides
SREcon 2016 Performance Checklists for SREs by
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREsBrendan Gregg
206.8K views79 slides

Viewers also liked(20)

Velocity 2015 linux perf tools by Brendan Gregg
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
Brendan Gregg1.1M views
Velocity 2017 Performance analysis superpowers with Linux eBPF by Brendan Gregg
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPF
Brendan Gregg736.2K views
ACM Applicative System Methodology 2016 by Brendan Gregg
ACM Applicative System Methodology 2016ACM Applicative System Methodology 2016
ACM Applicative System Methodology 2016
Brendan Gregg158.2K views
Stop the Guessing: Performance Methodologies for Production Systems by Brendan Gregg
Stop the Guessing: Performance Methodologies for Production SystemsStop the Guessing: Performance Methodologies for Production Systems
Stop the Guessing: Performance Methodologies for Production Systems
Brendan Gregg32.1K views
Netflix: From Clouds to Roots by Brendan Gregg
Netflix: From Clouds to RootsNetflix: From Clouds to Roots
Netflix: From Clouds to Roots
Brendan Gregg67.8K views
SREcon 2016 Performance Checklists for SREs by Brendan Gregg
SREcon 2016 Performance Checklists for SREsSREcon 2016 Performance Checklists for SREs
SREcon 2016 Performance Checklists for SREs
Brendan Gregg206.8K views
Performance Tuning EC2 Instances by Brendan Gregg
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 Instances
Brendan Gregg171.7K views
Blazing Performance with Flame Graphs by Brendan Gregg
Blazing Performance with Flame GraphsBlazing Performance with Flame Graphs
Blazing Performance with Flame Graphs
Brendan Gregg323.6K views
Linux Systems Performance 2016 by Brendan Gregg
Linux Systems Performance 2016Linux Systems Performance 2016
Linux Systems Performance 2016
Brendan Gregg504.5K views
Broken Linux Performance Tools 2016 by Brendan Gregg
Broken Linux Performance Tools 2016Broken Linux Performance Tools 2016
Broken Linux Performance Tools 2016
Brendan Gregg822.9K views
Kernel Recipes 2017: Using Linux perf at Netflix by Brendan Gregg
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at Netflix
Brendan Gregg1.5M views
Berkeley Packet Filters by Kernel TLV
Berkeley Packet FiltersBerkeley Packet Filters
Berkeley Packet Filters
Kernel TLV6.4K views
Running Hadoop as Service in AltiScale Platform by InMobi Technology
Running Hadoop as Service in AltiScale PlatformRunning Hadoop as Service in AltiScale Platform
Running Hadoop as Service in AltiScale Platform
InMobi Technology1.2K views
ACM DEBS 2015: Realtime Streaming Analytics Patterns by Srinath Perera
ACM DEBS 2015: Realtime Streaming Analytics PatternsACM DEBS 2015: Realtime Streaming Analytics Patterns
ACM DEBS 2015: Realtime Streaming Analytics Patterns
Srinath Perera41.1K views
Мониторь, автоматизируй Docker by Badoo Development
Мониторь, автоматизируй DockerМониторь, автоматизируй Docker
Мониторь, автоматизируй Docker
Badoo Development16.6K views
Linux 4.x Tracing: Performance Analysis with bcc/BPF by Brendan Gregg
Linux 4.x Tracing: Performance Analysis with bcc/BPFLinux 4.x Tracing: Performance Analysis with bcc/BPF
Linux 4.x Tracing: Performance Analysis with bcc/BPF
Brendan Gregg10.7K views
Docker в Badoo: ПМЖ или временная регистрация by Badoo Development
Docker в Badoo: ПМЖ или временная регистрацияDocker в Badoo: ПМЖ или временная регистрация
Docker в Badoo: ПМЖ или временная регистрация
Badoo Development21.8K views

Similar to Linux BPF Superpowers

OSSNA 2017 Performance Analysis Superpowers with Linux BPF by
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPFBrendan Gregg
5.1K views68 slides
Modern Linux Tracing Landscape by
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing LandscapeSasha Goldshtein
1.9K views30 slides
USENIX ATC 2017 Performance Superpowers with Enhanced BPF by
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPFBrendan Gregg
7.1K views60 slides
Container Performance Analysis Brendan Gregg, Netflix by
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, NetflixDocker, Inc.
12.6K views75 slides
Linux 4.x Tracing Tools: Using BPF Superpowers by
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
210.2K views68 slides
Linux Perf Tools by
Linux Perf ToolsLinux Perf Tools
Linux Perf ToolsRaj Pandey
53 views75 slides

Similar to Linux BPF Superpowers(20)

OSSNA 2017 Performance Analysis Superpowers with Linux BPF by Brendan Gregg
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
Brendan Gregg5.1K views
USENIX ATC 2017 Performance Superpowers with Enhanced BPF by Brendan Gregg
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
Brendan Gregg7.1K views
Container Performance Analysis Brendan Gregg, Netflix by Docker, Inc.
Container Performance Analysis Brendan Gregg, NetflixContainer Performance Analysis Brendan Gregg, Netflix
Container Performance Analysis Brendan Gregg, Netflix
Docker, Inc.12.6K views
Linux 4.x Tracing Tools: Using BPF Superpowers by Brendan Gregg
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF Superpowers
Brendan Gregg210.2K views
Linux Perf Tools by Raj Pandey
Linux Perf ToolsLinux Perf Tools
Linux Perf Tools
Raj Pandey53 views
Modern Linux Tracing Landscape by Kernel TLV
Modern Linux Tracing LandscapeModern Linux Tracing Landscape
Modern Linux Tracing Landscape
Kernel TLV954 views
Designing Tracing Tools by Brendan Gregg
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
Brendan Gregg5.7K views
Kernel Recipes 2019 - BPF at Facebook by Anne Nicolas
Kernel Recipes 2019 - BPF at FacebookKernel Recipes 2019 - BPF at Facebook
Kernel Recipes 2019 - BPF at Facebook
Anne Nicolas5.1K views
Designing Tracing Tools by Sysdig
Designing Tracing ToolsDesigning Tracing Tools
Designing Tracing Tools
Sysdig 522 views
Playing BBR with a userspace network stack by Hajime Tazaki
Playing BBR with a userspace network stackPlaying BBR with a userspace network stack
Playing BBR with a userspace network stack
Hajime Tazaki2.1K views
Network Programming: Data Plane Development Kit (DPDK) by Andriy Berestovskyy
Network Programming: Data Plane Development Kit (DPDK)Network Programming: Data Plane Development Kit (DPDK)
Network Programming: Data Plane Development Kit (DPDK)
Andriy Berestovskyy2.3K views
re:Invent 2019 BPF Performance Analysis at Netflix by Brendan Gregg
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
Brendan Gregg5.5K views
Kernel Recipes 2015 - Kernel dump analysis by Anne Nicolas
Kernel Recipes 2015 - Kernel dump analysisKernel Recipes 2015 - Kernel dump analysis
Kernel Recipes 2015 - Kernel dump analysis
Anne Nicolas2.5K views
Fast boot by SZ Lin
Fast bootFast boot
Fast boot
SZ Lin3.9K views
Linux Kernel Platform Development: Challenges and Insights by GlobalLogic Ukraine
 Linux Kernel Platform Development: Challenges and Insights Linux Kernel Platform Development: Challenges and Insights
Linux Kernel Platform Development: Challenges and Insights
GlobalLogic Ukraine1.3K views
Linux Capabilities - eng - v2.1.5, compact by Alessandro Selli
Linux Capabilities - eng - v2.1.5, compactLinux Capabilities - eng - v2.1.5, compact
Linux Capabilities - eng - v2.1.5, compact
Alessandro Selli1.1K views
Accelerating microbiome research with OpenACC by Igor Sfiligoi
Accelerating microbiome research with OpenACCAccelerating microbiome research with OpenACC
Accelerating microbiome research with OpenACC
Igor Sfiligoi109 views
JavaOne 2015 Java Mixed-Mode Flame Graphs by Brendan Gregg
JavaOne 2015 Java Mixed-Mode Flame GraphsJavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame Graphs
Brendan Gregg23.1K views

More from Brendan Gregg

YOW2021 Computing Performance by
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
2K views108 slides
IntelON 2021 Processor Benchmarking by
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingBrendan Gregg
1K views17 slides
Systems@Scale 2021 BPF Performance Getting Started by
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedBrendan Gregg
1.5K views30 slides
Computing Performance: On the Horizon (2021) by
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Brendan Gregg
92.9K views113 slides
Performance Wins with BPF: Getting Started by
Performance Wins with BPF: Getting StartedPerformance Wins with BPF: Getting Started
Performance Wins with BPF: Getting StartedBrendan Gregg
2K views24 slides
YOW2020 Linux Systems Performance by
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems PerformanceBrendan Gregg
1.9K views64 slides


Linux BPF Superpowers

  • 1. Linux 4.x Performance: Using BPF Superpowers. Brendan Gregg, Senior Performance Architect. Feb 2016
  • 2. Ten years ago, I gave a talk here about DTrace tools…
  • 4. Superpowers are coming to Linux. Solve performance issues that were previously impossible. For example, full off-CPU analysis…
  • 7. Ideal Thread States: a starting point for deeper analysis
  • 8. Linux Thread States. Based on: TASK_RUNNING, TASK_INTERRUPTIBLE, TASK_UNINTERRUPTIBLE. Still a useful starting point
  • 9. Linux On-CPU Analysis (CPU Flame Graph)
      •  I'll start with on-CPU analysis:
      •  Split into user/kernel states using /proc, mpstat(1), ...
      •  perf_events ("perf") to analyze further:
          –  User & kernel stack sampling (as a CPU flame graph)
          –  CPI
          –  Should be easy, but…
  • 12. Java Mixed-Mode CPU Flame Graph (layers: Java, JVM, Kernel, GC)
      •  Fixed!
          –  Java -XX:+PreserveFramePointer
          –  Java perf-map-agent
          –  Linux perf_events
  • 14. Also, CPI Flame Graph (shown zoomed). Cycles Per Instruction:
      -  red == instruction heavy
      -  blue == cycle heavy (likely mem stalls)
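The CPI metric on slide 14 is the ratio of CPU cycles to instructions retired. A minimal sketch of that arithmetic follows; the function names and counter values are hypothetical and chosen only for illustration.

```python
def cpi(cycles, instructions):
    """Cycles per instruction: higher values suggest more stalled cycles."""
    if instructions == 0:
        raise ValueError("no instructions retired")
    return cycles / instructions


# Hypothetical (cycles, instructions) counter pairs, for illustration only.
samples = {
    "compute_loop": (1_000_000, 2_000_000),
    "pointer_chase": (3_000_000, 500_000),
}

for name, (cyc, ins) in samples.items():
    print(f"{name}: CPI = {cpi(cyc, ins):.2f}")
```

In the slide's colour scheme, a low-CPI function such as the first would lean red (instruction heavy) and a high-CPI function such as the second would lean blue (cycle heavy).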
  • 15. Linux Off-CPU Analysis. On Linux, the state isn't helpful, but the code path is. Off-CPU analysis by measuring blocked time with stack traces
  • 16. Off-CPU Time Flame Graph. From http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html (axis labels: stack depth, off-CPU time)
  • 17. Off-CPU Time (zoomed): tar(1). Labeled paths: file read from disk; directory read from disk; pipe write; path read from disk; fstat from disk. Currently kernel stacks only; user stacks will add more context
  • 18. Off-CPU Time: more states. Labeled: lock contention, sleep, run queue latency. Flame graph quantifies total time spent in states
  • 19. CPU + Off-CPU == See Everything?
  • 20. Off-CPU Time (zoomed): gzip(1). Off-CPU doesn't always make sense: what is gzip blocked on?
  • 21. Wakeup Time Flame Graph
  • 22. Wakeup Time (zoomed): gzip(1). gzip(1) is blocked on tar(1)!
      tar cf - * | gzip > out.tar.gz
      Can't we associate off-CPU with wakeup stacks?
  • 24. Wakeup stacks are associated and merged in-kernel using BPF. We couldn't do this before
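The association described on slide 24 can be modelled in user space. The sketch below is not BPF and not the author's tool: it is a plain-Python simulation in which the event tuples, the function name `offwake_totals`, and the frame names are all assumptions made for illustration. It joins each blocked stack with the stack of the thread that woke it, so the blocked time is attributed to the combined path.

```python
from collections import defaultdict


def offwake_totals(events):
    """Sum blocked time per (blocked stack + waker stack) key.

    Events are hypothetical tuples:
      ("block", tid, time, stack)     thread leaves the CPU
      ("wakeup", target_tid, stack)   another thread wakes target_tid
      ("run", tid, time)              thread is back on-CPU
    """
    blocked = {}   # tid -> (switch-out time, blocked stack)
    wakers = {}    # tid -> stack of the thread that woke it
    totals = defaultdict(int)
    for ev in events:
        kind = ev[0]
        if kind == "block":
            _, tid, ts, stack = ev
            blocked[tid] = (ts, stack)
        elif kind == "wakeup":
            _, target, waker_stack = ev
            wakers[target] = waker_stack
        elif kind == "run":
            _, tid, ts = ev
            if tid in blocked:
                start, stack = blocked.pop(tid)
                waker = wakers.pop(tid, ())
                # Reverse the waker frames so both stacks meet at the wakeup.
                key = stack + ("--",) + tuple(reversed(waker))
                totals[key] += ts - start
    return dict(totals)


# Illustrative events modelled on the tar | gzip pipeline from slide 22.
events = [
    ("block", 2, 100, ("gzip", "sys_read", "pipe_read")),
    ("wakeup", 2, ("tar", "sys_write", "pipe_write")),
    ("run", 2, 900),
]
result = offwake_totals(events)
for key, total in result.items():
    print(";".join(key), total)
```

With these events the single key reads from gzip's blocked path through the `--` separator into tar's write path, which is the kind of answer to "what is gzip blocked on?" that the wakeup stack provides.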
  • 26. Haven't Solved Everything Yet…
      •  One wakeup stack is often not enough…
      •  Who woke the waker?
  • 28. Merging multiple wakeup stacks in kernel using BPF. With enough stacks, all paths lead to metal
  • 29. Solve Everything. CPU + off-CPU analysis can solve most issues. Flame graph (profiling) types:
      1.  CPU
      2.  CPI
      3.  Off-CPU time
      4.  Wakeup time
      5.  Off-wake time
      6.  Chain
      Annotation on the off-CPU types: different off-CPU analysis views, with more context and increasing measurement cost. BPF makes this all more practical
  • 30. 2. BPF. "One of the more interesting features in this cycle is the ability to attach eBPF programs (user-defined, sandboxed bytecode executed by the kernel) to kprobes. This allows user-defined instrumentation on a live kernel image that can never crash, hang or interfere with the kernel negatively." – Ingo Molnár (Linux developer). Source: https://lkml.org/lkml/2015/4/14/232
  • 31. 2. BPF. "crazy stuff" – Alexei Starovoitov (eBPF lead). Source: http://www.slideshare.net/AlexeiStarovoitov/bpf-inkernel-virtual-machine
  • 32. BPF (image: BPF mascot)
      •  eBPF == enhanced Berkeley Packet Filter; now just BPF
      •  Integrated into Linux (in stages: 3.15, 3.19, 4.1, 4.5, …)
      •  Uses
          –  virtual networking
          –  tracing
          –  "crazy stuff"
      •  Front-ends
          –  samples/bpf (raw)
          –  bcc: Python, C
          –  Linux perf_events
  • 33. BPF for Tracing
      •  Can do per-event output and in-kernel summary statistics (histograms, etc).
      Diagram: the user program (1) generates BPF bytecode and (2) loads it into the kernel, where BPF attaches to kprobes, uprobes and tracepoints. Per-event data is returned via perf_output and statistics via maps, which the user program (3) reads asynchronously.
  • 34. Old way: TCP Retransmits
      •  tcpdump of all send & receive, dump to FS, post-process
      •  Overhead adds up on 10GbE+
      Diagram: kernel send/receive feeds a buffer; tcpdump (1. read, 2. dump) writes to the file system and disks; an analyzer then does 1. read, 2. state machine, 3. print.
  • 35. New way: BPF TCP Retransmits
      •  Just trace the retransmit functions
      •  Negligible overhead
      Diagram: tcpretrans (bcc) does 1. config BPF & kprobe on tcp_retransmit_skb(), 2. read, print. Kernel send/receive is left as-is.
  • 36. BPF: TCP Retransmits (output includes kernel state)
      # ./tcpretrans
      TIME     PID    IP LADDR:LPORT          T> RADDR:RPORT          STATE
      01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
      01:55:05 0      4  10.153.223.157:22    R> 69.53.245.40:34619   ESTABLISHED
      01:55:17 0      4  10.153.223.157:22    R> 69.53.245.40:22957   ESTABLISHED
      […]
  • 37. Old: Off-CPU Time Stack Profiling
      •  perf_events tracing of sched events, post-process
      •  Despite buffering, usually high cost (>1M events/sec)
      Diagram: the kernel scheduler feeds a buffer; perf record (1. async read, 2. dump) writes to the file system (or pipe) and disks; perf inject (1. read, 2. rewrite); perf report/script (read, process, print).
  • 38. New: BPF Off-CPU Time Stacks
      •  Measure off-CPU time, add to map with key = stack, value = total time. Async read map.
      Diagram: offcputime (bcc) does 1. config BPF & kprobe, 2. async read stacks, 3. symbol translate, 4. print. In the kernel, BPF instruments the scheduler at finish_task_switch() and stores results in maps.
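The "key = stack, value = total time" summary from slide 38 can be illustrated with a user-space model. This is a sketch in plain Python, not BPF and not the offcputime source: the event format, the function names, and the sample frames are assumptions. It keeps one running total per stack, which is the property that lets the real tool read a small map asynchronously instead of dumping every scheduler event.

```python
from collections import defaultdict


def offcpu_totals(events):
    """Sum blocked time per stack from (tid, kind, time_us, stack) events.

    kind is "off" when a thread leaves the CPU and "on" when it returns.
    """
    blocked_at = {}             # tid -> (timestamp, stack) at switch-out
    totals = defaultdict(int)   # stack -> total off-CPU microseconds
    for tid, kind, ts, stack in events:
        if kind == "off":
            blocked_at[tid] = (ts, stack)
        elif kind == "on" and tid in blocked_at:
            start, off_stack = blocked_at.pop(tid)
            totals[off_stack] += ts - start
    return dict(totals)


def folded(totals):
    """Render totals as 'frame;frame;frame value' lines for a flame graph."""
    return [f"{';'.join(stack)} {us}" for stack, us in sorted(totals.items())]


# Hypothetical scheduler events, for illustration only.
events = [
    (101, "off", 1000, ("sys_read", "vfs_read", "io_schedule")),
    (102, "off", 1200, ("sys_write", "pipe_write", "schedule")),
    (101, "on", 4000, None),
    (102, "on", 1700, None),
    (101, "off", 5000, ("sys_read", "vfs_read", "io_schedule")),
    (101, "on", 5500, None),
]
lines = folded(offcpu_totals(events))
print("\n".join(lines))
```

Six events collapse to two map entries here; the folded lines are in the semicolon-separated form that flame graph tooling commonly consumes.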
  • 39. Stack Trace Hack
      •  For my offcputime tool, I wrote a BPF stack walker:
  • 40. "Crazy Stuff"
      •  … using unrolled loops & goto:
  • 41. BPF Stack Traces
      •  Proper BPF stack support just landed in net-next:
          Date     Sat, 20 Feb 2016 00:25:05 -0500 (EST)
          Subject  Re: [PATCH net-next 0/3] bpf_get_stackid() and stack_trace map
          From     David Miller <>

          From: Alexei Starovoitov <ast@fb.com>
          Date: Wed, 17 Feb 2016 19:58:56 -0800

          > This patch set introduces new map type to store stack traces and
          > corresponding bpf_get_stackid() helper.
          ...
          Series applied, thanks Alexei.
      •  Allows more than just chain graphs
  • 42. memleak
      •  Real-time memory growth and leak analysis:
          # ./memleak.py -o 10 60 1
          Attaching to kmalloc and kfree, Ctrl+C to quit.
          [01:27:34] Top 10 stacks with outstanding allocations:
              72 bytes in 1 allocations from stack
                  alloc_fdtable [kernel] (ffffffff8121960f)
                  expand_files [kernel] (ffffffff8121986b)
                  sys_dup2 [kernel] (ffffffff8121a68d)
                  […]
              2048 bytes in 1 allocations from stack
                  alloc_fdtable [kernel] (ffffffff812195da)
                  expand_files [kernel] (ffffffff8121986b)
                  sys_dup2 [kernel] (ffffffff8121a68d)
                  […]
          Annotations: trace for 60s; show kernel allocations older than 10s that were not freed
      •  Uses my stack hack, but will switch to BPF stacks soon
      •  By Sasha Goldshtein. Another bcc tool.
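The bookkeeping behind "outstanding allocations older than a threshold" can be sketched as follows. This is a plain-Python model rather than memleak's actual implementation: the event tuples, addresses, timestamps, and the `outstanding` function are hypothetical, and the stack frames are borrowed from the slide output only as labels.

```python
def outstanding(events, now, min_age):
    """Return {stack: (bytes, count)} for allocations never freed and older than min_age.

    Events are hypothetical tuples:
      ("alloc", address, size, time, stack)
      ("free", address)
    """
    live = {}   # address -> (size, allocation time, stack)
    for ev in events:
        if ev[0] == "alloc":
            _, addr, size, ts, stack = ev
            live[addr] = (size, ts, stack)
        elif ev[0] == "free":
            live.pop(ev[1], None)   # a matching free clears the record
    report = {}
    for size, ts, stack in live.values():
        if now - ts > min_age:      # skip young allocations that may still be freed
            total, count = report.get(stack, (0, 0))
            report[stack] = (total + size, count + 1)
    return report


fd_stack = ("alloc_fdtable", "expand_files", "sys_dup2")
events = [
    ("alloc", 0x1000, 72, 0, fd_stack),
    ("alloc", 0x2000, 2048, 5, fd_stack),
    ("alloc", 0x3000, 64, 55, ("some_other_path",)),
    ("free", 0x2000),
]
report = outstanding(events, now=60, min_age=10)
for stack, (nbytes, count) in report.items():
    print(f"{nbytes} bytes in {count} allocations from stack {' <- '.join(stack)}")
```

The age filter matters because a recently allocated object that has not been freed yet is not necessarily a leak; only allocations that have survived past the threshold are reported.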
  • 43. 3. bcc
      •  BPF Compiler Collection
          –  https://github.com/iovisor/bcc
      •  Python front-end, C instrumentation
      •  Currently beta – in development!
      •  Some example tracing tools…
  • 44. execsnoop
      •  Trace new processes:
          # ./execsnoop
          PCOMM    PID   RET ARGS
          bash     15887 0   /usr/bin/man ls
          preconv  15894 0   /usr/bin/preconv -e UTF-8
          man      15896 0   /usr/bin/tbl
          man      15897 0   /usr/bin/nroff -mandoc -rLL=169n -rLT=169n -Tutf8
          man      15898 0   /usr/bin/pager -s
          nroff    15900 0   /usr/bin/locale charmap
          nroff    15901 0   /usr/bin/groff -mtty-char -Tutf8 -mandoc -rLL=169n …
          groff    15902 0   /usr/bin/troff -mtty-char -mandoc -rLL=169n -rLT=169 …
          groff    15903 0   /usr/bin/grotty
  • 45. biolatency
      •  Block device (disk) I/O latency distribution:
          # ./biolatency -mT 1 5
          Tracing block device I/O... Hit Ctrl-C to end.

          06:20:16
               msecs      : count    distribution
               0 -> 1     : 36       |**************************************|
               2 -> 3     : 1        |*                                     |
               4 -> 7     : 3        |***                                   |
               8 -> 15    : 17       |*****************                     |
              16 -> 31    : 33       |**********************************    |
              32 -> 63    : 7        |*******                               |
              64 -> 127   : 6        |******                                |
          […]
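The bucket boundaries in the biolatency output (0 -> 1, 2 -> 3, 4 -> 7, …) are power-of-two ranges. The sketch below reproduces that bucketing and a text histogram in plain Python; it is an illustration of the idea, not the code bcc runs in the kernel, and the function names and sample latencies are assumptions.

```python
from collections import Counter


def bucket(value):
    """Power-of-two bucket index: 0 covers 0..1, 1 covers 2..3, 2 covers 4..7, ..."""
    return max(int(value).bit_length(), 1) - 1


def bucket_range(index):
    """Inclusive (low, high) bounds of a bucket index."""
    low = 0 if index == 0 else 1 << index
    return low, (1 << (index + 1)) - 1


def histogram(latencies_ms, width=20):
    """Return text lines with one row per bucket up to the largest one seen."""
    counts = Counter(bucket(v) for v in latencies_ms)
    if not counts:
        return []
    peak = max(counts.values())
    lines = []
    for i in range(max(counts) + 1):
        low, high = bucket_range(i)
        n = counts.get(i, 0)
        bar = "*" * (n * width // peak)   # scale bars to the busiest bucket
        lines.append(f"{low:>5} -> {high:<5} : {n:<5} |{bar:<{width}}|")
    return lines


# Hypothetical latencies in milliseconds, for illustration only.
for line in histogram([0, 1, 1, 2, 5, 9, 12, 20, 100]):
    print(line)
```

Storing only one counter per bucket is what makes this kind of summary cheap to keep in the kernel: the number of buckets grows with the logarithm of the largest value, not with the number of I/O events.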
  • 46. ext4slower
      •  ext4 file system I/O, slower than a threshold:
          # ./ext4slower 1
          Tracing ext4 operations slower than 1 ms
          TIME     COMM   PID  T BYTES OFF_KB LAT(ms) FILENAME
          06:49:17 bash   3616 R 128   0         7.75 cksum
          06:49:17 cksum  3616 R 39552 0         1.34 [
          06:49:17 cksum  3616 R 96    0         5.36 2to3-2.7
          06:49:17 cksum  3616 R 96    0        14.94 2to3-3.4
          06:49:17 cksum  3616 R 10320 0         6.82 411toppm
          06:49:17 cksum  3616 R 65536 0         4.01 a2p
          06:49:17 cksum  3616 R 55400 0         8.77 ab
          06:49:17 cksum  3616 R 36792 0        16.34 aclocal-1.14
          06:49:17 cksum  3616 R 15008 0        19.31 acpi_listen
          06:49:17 cksum  3616 R 6123  0        17.23 add-apt-repository
          06:49:17 cksum  3616 R 6280  0        18.40 addpart
          06:49:17 cksum  3616 R 27696 0         2.16 addr2line
          06:49:17 cksum  3616 R 58080 0        10.11 ag
          06:49:17 cksum  3616 R 906   0         6.30 ec2-meta-data
          […]
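Tools that report operations "slower than a threshold" generally pair an entry timestamp with a return timestamp and emit only the slow cases. The following plain-Python model shows that pattern; it is a sketch under assumed event tuples and names (`slower_than`, the `"entry"`/`"return"` kinds), not the ext4slower source.

```python
def slower_than(events, threshold_ms):
    """Return (tid, op, latency_ms) for operations slower than threshold_ms.

    Events are hypothetical (kind, tid, op, time_ms) tuples, where kind is
    "entry" when the operation starts and "return" when it completes.
    """
    started = {}   # (tid, op) -> entry timestamp in ms
    slow = []
    for kind, tid, op, ts_ms in events:
        if kind == "entry":
            started[(tid, op)] = ts_ms
        elif kind == "return":
            begin = started.pop((tid, op), None)
            if begin is not None and ts_ms - begin > threshold_ms:
                slow.append((tid, op, ts_ms - begin))
    return slow


# Illustrative events: one slow read and one fast read by the same thread.
events = [
    ("entry", 3616, "read", 0.0),
    ("return", 3616, "read", 7.75),
    ("entry", 3616, "read", 10.0),
    ("return", 3616, "read", 10.5),
]
slow = slower_than(events, threshold_ms=1)
print(slow)
```

Filtering at the point of measurement means fast operations produce no output at all, which keeps the cost proportional to the number of slow events rather than to total I/O.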
  • 47. bashreadline
      •  Trace bash interactive commands system-wide:
          # ./bashreadline
          TIME      PID    COMMAND
          05:28:25  21176  ls -l
          05:28:28  21176  date
          05:28:35  21176  echo hello world
          05:28:43  21176  foo this command failed
          05:28:45  21176  df -h
          05:29:04  3059   echo another shell
          05:29:13  21176  echo first shell again
  • 48. gethostlatency
      •  Show latency for getaddrinfo/gethostbyname[2] calls:
          # ./gethostlatency
          TIME      PID    COMM   LATms HOST
          06:10:24  28011  wget   90.00 www.iovisor.org
          06:10:28  28127  wget    0.00 www.iovisor.org
          06:10:41  28404  wget    9.00 www.netflix.com
          06:10:48  28544  curl   35.00 www.netflix.com.au
          06:11:10  29054  curl   31.00 www.plumgrid.com
          06:11:16  29195  curl    3.00 www.facebook.com
          06:11:25  29404  curl   72.00 foo
          06:11:28  29475  curl    1.00 foo
  • 49. trace
      •  Trace custom events. Ad hoc analysis multitool:
          # trace 'sys_read (arg3 > 20000) "read %d bytes", arg3'
          TIME     PID    COMM  FUNC     -
          05:18:23 4490   dd    sys_read read 1048576 bytes
          05:18:23 4490   dd    sys_read read 1048576 bytes
          05:18:23 4490   dd    sys_read read 1048576 bytes
          05:18:23 4490   dd    sys_read read 1048576 bytes
          ^C
  • 51. 4. Future Work
      •  All event sources
      •  Language improvements
      •  More tools: eg, TCP
      •  GUI support
  • 52. Linux Event Sources (diagram: some event sources are marked "done", others "XXX: todo")
  • 54. More Tools
      •  eg, netstat(8)…
          $ netstat -s
          Ip:
              7962754 total packets received
              8 with invalid addresses
              0 forwarded
              0 incoming packets discarded
              7962746 incoming packets delivered
              8019427 requests sent out
          Icmp:
              382 ICMP messages received
              0 input ICMP message failed.
              ICMP input histogram:
                  destination unreachable: 125
                  timeout in transit: 257
              3410 ICMP messages sent
              0 ICMP messages failed
              ICMP output histogram:
                  destination unreachable: 3410
          IcmpMsg:
                  InType3: 125
                  InType11: 257
                  OutType3: 3410
          Tcp:
              17337 active connections openings
              395515 passive connection openings
              8953 failed connection attempts
              240214 connection resets received
              3 connections established
              7198375 segments received
              7504939 segments send out
              62696 segments retransmited
              10 bad segments received.
              1072 resets sent
              InCsumErrors: 5
          Udp:
              759925 packets received
              3412 packets to unknown port received.
              0 packet receive errors
              784370 packets sent
          UdpLite:
          TcpExt:
              858 invalid SYN cookies received
              8951 resets received for embryonic SYN_RECV sockets
              14 packets pruned from receive queue because of socket buffer overrun
              6177 TCP sockets finished time wait in fast timer
              293 packets rejects in established connections because of timestamp
              733028 delayed acks sent
              89 delayed acks further delayed because of locked socket
              Quick ack mode was activated 13214 times
              336520 packets directly queued to recvmsg prequeue.
              43964 packets directly received from backlog
              11406012 packets directly received from prequeue
              1039165 packets header predicted
              7066 packets header predicted and directly queued to user
              1428960 acknowledgments not containing data received
              1004791 predicted acknowledgments
              1 times recovered from packet loss due to fast retransmit
              5044 times recovered from packet loss due to SACK data
              2 bad SACKs received
              Detected reordering 4 times using SACK
              Detected reordering 11 times using time stamp
              13 congestion windows fully recovered
              11 congestion windows partially recovered using Hoe heuristic
              TCPDSACKUndo: 39
              2384 congestion windows recovered after partial ack
              228 timeouts after SACK recovery
              100 timeouts in loss state
              5018 fast retransmits
              39 forward retransmits
              783 retransmits in slow start
              32455 other TCP timeouts
              TCPLossProbes: 30233
              TCPLossProbeRecovery: 19070
              992 sack retransmits failed
              18 times receiver scheduled too late for direct processing
              705 packets collapsed in receive queue due to low socket buffer
              13658 DSACKs sent for old packets
              8 DSACKs sent for out of order packets
              13595 DSACKs received
              33 DSACKs for out of order packets received
              32 connections reset due to unexpected data
              108 connections reset due to early user close
              1608 connections aborted due to timeout
              TCPSACKDiscard: 4
              TCPDSACKIgnoredOld: 1
              TCPDSACKIgnoredNoUndo: 8649
              TCPSpuriousRTOs: 445
              TCPSackShiftFallback: 8588
              TCPRcvCoalesce: 95854
              TCPOFOQueue: 24741
              TCPOFOMerge: 8
              TCPChallengeACK: 1441
              TCPSYNChallenge: 5
              TCPSpuriousRtxHostQueues: 1
              TCPAutoCorking: 4823
          IpExt:
              InOctets: 1561561375
              OutOctets: 1509416943
              InNoECTPkts: 8201572
              InECT1Pkts: 2
              InECT0Pkts: 3844
              InCEPkts: 306
  • 56. Better TCP Tools
      •  TCP retransmit by type and time
      •  Congestion algorithm metrics
      •  etc.
  • 57. GUI Support
      •  eg, Netflix Vector: open source instance analyzer:
  • 58. Summary
      •  BPF in Linux 4.x makes many new things possible
          –  Stack-based thread state analysis (solve all issues!)
          –  Real-time memory growth/leak detection
          –  Better TCP metrics
          –  etc...
      •  Get involved: see iovisor/bcc
      •  So far just a preview of things to come
  • 59. Links
      •  iovisor bcc:
          •  https://github.com/iovisor/bcc
          •  http://www.brendangregg.com/blog/2015-09-22/bcc-linux-4.3-tracing.html
          •  http://blogs.microsoft.co.il/sasha/2016/02/14/two-new-ebpf-tools-memleak-and-argdist/
      •  BPF Off-CPU, Wakeup, Off-Wake & Chain Graphs:
          •  http://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html
          •  http://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html
          •  http://www.brendangregg.com/blog/2016-02-05/ebpf-chaingraph-prototype.html
      •  Linux Performance:
          •  http://www.brendangregg.com/linuxperf.html
      •  Linux perf_events:
          •  https://perf.wiki.kernel.org/index.php/Main_Page
          •  http://www.brendangregg.com/perf.html
      •  Flame Graphs:
          •  http://techblog.netflix.com/2015/07/java-in-flames.html
          •  http://www.brendangregg.com/flamegraphs.html
      •  Netflix Tech Blog on Vector:
          •  http://techblog.netflix.com/2015/04/introducing-vector-netflixs-on-host.html
      •  Wordcloud: https://www.jasondavies.com/wordcloud/
  • 60. Feb 2016
      •  Questions?
      •  http://slideshare.net/brendangregg
      •  http://www.brendangregg.com
      •  bgregg@netflix.com
      •  @brendangregg
      Thanks to Alexei Starovoitov (Facebook), Brenden Blanco (PLUMgrid), Daniel Borkmann (Cisco), Wang Nan (Huawei), Sasha Goldshtein (Sela), and other BPF and bcc contributors!