USENIX ATC 2017: Visualizing Performance with Flame Graphs

Brendan Gregg
Brendan GreggSenior Performance Architect at Netflix
Visualizing Performance with Flame Graphs
Brendan Gregg
Senior Performance Architect
Jul 2017
2017 USENIX Annual Technical Conference
Visualize	CPU	-me	consumed	by	
all	so5ware	
Kernel	
Java	
User-level
Agenda	
1.	CPU	Flame	graphs	
2.	Fixing	Stacks	&	Symbols	 3.	Advanced	flame	graphs
Take	aways	
1.  Interpret	CPU	flame	graphs	
2.  Understand	piHalls	with	stack	traces	and	symbols	
3.  Discover	opportuniKes	for	future	development
USENIX ATC 2017: Visualizing Performance with Flame Graphs
Case	Study	
Exception handling consuming CPU
CPU	PROFILING	
Summary
CPU	Profiling	
A
B
block interrupt
on-CPU off-CPU
A
B
A A
B
A
syscall
time
•  Record stacks at a timed interval: simple and effective
–  Pros: Low (deterministic) overhead
–  Cons: Coarse accuracy, but usually sufficient
stack
samples: A
Stack	Traces	
•  A	code	path	snapshot.	e.g.,	from	jstack(1):	
$ jstack 1819
[…]
"main" prio=10 tid=0x00007ff304009000 nid=0x7361
runnable [0x00007ff30d4f9000]
java.lang.Thread.State: RUNNABLE
at Func_abc.func_c(Func_abc.java:6)
at Func_abc.func_b(Func_abc.java:16)
at Func_abc.func_a(Func_abc.java:23)
at Func_abc.main(Func_abc.java:27)
running
parent
g.parent
g.g.parent
System	Profilers	
•  Linux	
–  perf_events	(aka	"perf")	
•  Oracle	Solaris	
–  DTrace	
•  OS	X	
–  Instruments	
•  Windows	
–  XPerf,	WPA	(which	now	has	flame	graphs!)	
•  And	many	others…
Linux	perf_events	
•  Standard	Linux	profiler	
–  Provides	the	perf	command	(mulK-tool)	
–  Usually	pkg	added	by	linux-tools-common,	etc.	
•  Many	event	sources:	
–  Timer-based	sampling	
–  Hardware	events	
–  Tracepoints	
–  Dynamic	tracing	
•  Can	sample	stacks	of	(almost)	everything	on	CPU	
–  Can	miss	hard	interrupt	ISRs,	but	these	should	be	near-zero.	They	can	be	measured	if	needed	(I	wrote	
my	own	tools).
perf	Profiling	
# perf record -F 99 -ag -- sleep 30
[ perf record: Woken up 9 times to write data ]
[ perf record: Captured and wrote 2.745 MB perf.data (~119930 samples) ]
# perf report -n -stdio
[…]
# Overhead Samples Command Shared Object Symbol
# ........ ............ ....... ................. .............................
#
20.42% 605 bash [kernel.kallsyms] [k] xen_hypercall_xen_version
|
--- xen_hypercall_xen_version
check_events
|
|--44.13%-- syscall_trace_enter
| tracesys
| |
| |--35.58%-- __GI___libc_fcntl
| | |
| | |--65.26%-- do_redirection_internal
| | | do_redirections
| | | execute_builtin_or_function
| | | execute_simple_command
[… ~13,000 lines truncated …]
call tree
summary
Full	perf	report	Output
…	as	a	Flame	Graph
Flame	Graph	Summary	
•  Visualizes	a	collecKon	of	stack	traces	
–  x-axis:	alphabeKcal	stack	sort,	to	maximize	merging	
–  y-axis:	stack	depth	
–  color:	random	(default),	or	a	dimension	
•  Currently	made	from	Perl	+	SVG	+	JavaScript	
–  hBps://github.com/brendangregg/FlameGraph	
–  Takes	input	from	many	different	profilers	
–  MulKple	d3	versions	are	being	developed	
•  References:	
–  hcp://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html		
–  hcp://queue.acm.org/detail.cfm?id=2927301	
–  "The	Flame	Graph"	CACM,	June	2016
Flame	Graph	InterpretaKon	
a()
b() h()
c()
d()
e() f()
g()
i()
Flame	Graph	InterpretaKon	(1/3)	
Top	edge	shows	who	is	running	on-CPU,	
and	how	much	(width)	
a()
b() h()
c()
d()
e() f()
g()
i()
Flame	Graph	InterpretaKon	(2/3)	
h()
d()
e()
i()
a()
b()
c()
f()
g()
Top-down	shows	ancestry	
e.g.,	from	g():
Flame	Graph	InterpretaKon	(3/3)	
a()
b() h()
c()
d()
e() f()
g()
i()
Widths	are	proporKonal	to	presence	in	samples	
e.g.,	comparing	b()	to	h()	(incl.	children)
Mixed-Mode	Flame	Graphs	
•  Hues:	
–  green	==	JIT	(eg,	Java)	
–  aqua	==	inlined	
•  if	included	
–  red	==	user-level*	
–  orange	==	kernel	
–  yellow	==	C++	
•  Intensity:	
–  Randomized	to	
differenKate	frames	
–  Or	hashed	on	
funcKon	name	
Java JVM
(C++)
Kernel
Mixed-Mode
C
*	new	palece	uses	red	for	kernel	modules	too
DifferenKal	Flame	Graphs	
•  Hues:	
–  red	==	more	samples	
–  blue	==	less	samples	
•  Intensity:	
–  Degree	of	difference	
•  Compares	two	profiles	
•  Can	show	other	
metrics:	e.g.,	CPI	
•  Other	types	exist	
–  flamegraphdiff	
Differential
more less
Icicle	Graph	
top (leaf) merge
Flame	Graph	Search	
•  Color:	magenta	
to	show	
matched	
frames	
search
button
Flame	Charts	
•  Flame charts: x-axis is time
•  Flame graphs: x-axis is population (maximize merging)
•  Final note: these are useful, but are not flame graphs
from	
Chrome	
dev	tools
STACK	TRACING	
PiHalls	and	fixes
Broken	Stack	Traces	are	Common	
Because:	
A.  Profilers	use	frame	pointer	walking	by	default	
B.  Compilers	reuse	the	frame	pointer	register	as	a	general	purpose	register:	a	
(usually	very	small)	performance	opKmizaKon.	
	 # perf record –F 99 –a –g – sleep 30
# perf script
[…]
java 4579 cpu-clock:
7f417908c10b [unknown] (/tmp/perf-4458.map)
java 4579 cpu-clock:
7f41792fc65f [unknown] (/tmp/perf-4458.map)
a2d53351ff7da603 [unknown] ([unknown])
[…]
…	as	a	Flame	Graph	
Broken Java stacks
(missing frame pointer)
Fixing	Stack	Walking	
A.  Frame	pointer-based	
–  Fix	by	disabling	that	compiler	opKmizaKon:	gcc's	-fno-omit-frame-pointer	
–  Pros:	simple,	supported	by	many	tools	
–  Cons:	might	cost	a	licle	extra	CPU	
B.  Debug	info	(DWARF)	walking	
–  Cons:	costs	disk	space,	and	not	supported	by	all	profilers.	Even	possible	with	JIT?	
C.  JIT	runKme	walkers	
–  Pros:	include	more	internals,	such	as	inlined	frames	
–  Cons:	limited	to	applicaKon	internals	-	no	kernel	
D. Last	branch	record
Fixing	Java	Stack	Traces	
# perf script
[…]
java 8131 cpu-clock:
7fff76f2dce1 [unknown] ([vdso])
7fd3173f7a93 os::javaTimeMillis() (/usr/lib/jvm…
7fd301861e46 [unknown] (/tmp/perf-8131.map)
7fd30184def8 [unknown] (/tmp/perf-8131.map)
7fd30174f544 [unknown] (/tmp/perf-8131.map)
7fd30175d3a8 [unknown] (/tmp/perf-8131.map)
7fd30166d51c [unknown] (/tmp/perf-8131.map)
7fd301750f34 [unknown] (/tmp/perf-8131.map)
7fd3016c2280 [unknown] (/tmp/perf-8131.map)
7fd301b02ec0 [unknown] (/tmp/perf-8131.map)
7fd3016f9888 [unknown] (/tmp/perf-8131.map)
7fd3016ece04 [unknown] (/tmp/perf-8131.map)
7fd30177783c [unknown] (/tmp/perf-8131.map)
7fd301600aa8 [unknown] (/tmp/perf-8131.map)
7fd301a4484c [unknown] (/tmp/perf-8131.map)
7fd3010072e0 [unknown] (/tmp/perf-8131.map)
7fd301007325 [unknown] (/tmp/perf-8131.map)
7fd301007325 [unknown] (/tmp/perf-8131.map)
7fd3010004e7 [unknown] (/tmp/perf-8131.map)
7fd3171df76a JavaCalls::call_helper(JavaValue*,…
7fd3171dce44 JavaCalls::call_virtual(JavaValue*…
7fd3171dd43a JavaCalls::call_virtual(JavaValue*…
7fd31721b6ce thread_entry(JavaThread*, Thread*)…
7fd3175389e0 JavaThread::thread_main_inner() (/…
7fd317538cb2 JavaThread::run() (/usr/lib/jvm/nf…
7fd3173f6f52 java_start(Thread*) (/usr/lib/jvm/…
7fd317a7e182 start_thread (/lib/x86_64-linux-gn…
# perf script
[…]
java 4579 cpu-clock:
7f417908c10b [unknown] (/tmp/…
java 4579 cpu-clock:
7f41792fc65f [unknown] (/tmp/…
a2d53351ff7da603 [unknown] ([unkn…
[…]
I	prototyped	JVM	frame	
pointers.	Oracle	rewrote	it	
and	added	it	to	Java	as	
-XX:+PreserveFramePointer	
(JDK	8	u60b19)
Fixed	Stacks	Flame	Graph	
Java stacks
(but no symbols, yet)
Inlining	
•  Many	frames	may	be	missing	(inlined)	
–  Flame	graph	may	sKll	make	enough	sense	
•  Inlining	can	oqen	be	be	tuned	
–  e.g.	Java's	-XX:-Inline	to	disable,	but	can	be	80%	slower	
–  Java's	-XX:MaxInlineSize	and	-XX:InlineSmallCode	can	be	tuned	
a	licle	to	reveal	more	frames:	can	even	improve	performance!	
•  RunKmes	can	un-inline	on	demand	
–  So	that	excepKon	stack	traces	make	sense	
–  e.g.	Java's	perf-map-agent	can	un-inline	(unfoldall	opKon)
Stack	Depth	
•  perf	had	a	127	frame	limit	
•  Now	tunable	in	Linux	4.8	
–  sysctl	-w	kernel.perf_event_max_stack=512	
–  Thanks	Arnaldo	Carvalho	de	Melo!	
A	Java	microservice	
with	a	stack	depth	
of	>	900	broken	stacks	
perf_event_max_stack=1024
SYMBOLS	
Fixing
Fixing	NaKve	Symbols	
A.  Add	a	-dbgsym	package,	if	available	
B.  Recompile	from	source
Fixing	JIT	Symbols	(Java,	Node.js,	…)	
•  Just-in-Kme	runKmes	don't	have	a	pre-compiled	symbol	table	
•  So	Linux	perf	looks	for	an	externally	provided	JIT	symbol	
file:	/tmp/perf-PID.map	
	
	
•  This	can	be	created	by	runKmes;	eg,	Java's	perf-map-agent	
# perf script
Failed to open /tmp/perf-8131.map, continuing without symbols
[…]
java 8131 cpu-clock:
7fff76f2dce1 [unknown] ([vdso])
7fd3173f7a93 os::javaTimeMillis() (/usr/lib/jvm…
7fd301861e46 [unknown] (/tmp/perf-8131.map)
[…]
Java Mixed-Mode Flame Graph
Fixed	Stacks	&	Symbols	
Java JVM
Kernel
GC
Stacks	&	Symbols	(zoom)
Symbol	Churn	
•  For	JIT	runKmes,	symbols	can	change	during	a	profile	
•  Symbols	may	be	mistranslated	by	perf's	map	snapshot	
•  SoluKons:	
A.  Take	a	before	&	aqer	snapshot,	and	compare	
B.  perf's	new	support	for	Kmestamped	symbol	logs
Containers	
•  perf	can't	find	any	symbol	sources	
–  Unless	you	copy	them	into	the	host	
•  I'm	tesKng	Krister	Johansen's	fix,	hopefully	for	Linux	4.13	
–  lkml:	"[PATCH	Kp/perf/core	0/7]	namespace	tracing	improvements"
INSTRUCTIONS	
For	Linux
Linux	CPU	Flame	Graphs	
Linux	2.6+,	via	perf.data	and	perf	script:	
	
Linux	4.5+	can	use	folded	output	
–  Skips	the	CPU-costly	stackcollapse-perf.pl	step;	see:	
hcp://www.brendangregg.com/blog/2016-04-30/linux-perf-folded.html	
Linux	4.9+,	via	BPF:	
–  Most	efficient:	no	perf.data	file,	summarizes	in-kernel	
git clone --depth 1 https://github.com/brendangregg/FlameGraph
cd FlameGraph
perf record -F 99 -a –g -- sleep 30
perf script | ./stackcollapse-perf.pl |./flamegraph.pl > perf.svg
git clone --depth 1 https://github.com/brendangregg/FlameGraph
git clone --depth 1 https://github.com/iovisor/bcc
./bcc/tools/profile.py -dF 99 30 | ./FlameGraph/flamegraph.pl > perf.svg
perf record
perf script
capture	stacks	
write	text	
stackcollapse-perf.pl
flamegraph.pl
perf.data	
write	samples	
reads	samples	
folded	output	
perf record
perf report –g
folded
capture	stacks	
folded	report	
awk
flamegraph.pl
perf.data	
write	samples	
reads	samples	
folded	output	
Linux	4.5	
count	stacks	(BPF)	
folded	
output	
flamegraph.pl
profile.py
Linux	4.9	
Linux	Profiling	OpKmizaKons	
Linux	2.6
Language/RunKme	InstrucKons	
•  Each	may	have	special	stack/symbol	instrucKons	
–  Java,	Node.js,	Python,	Ruby,	C++,	Go,	…	
•  I'm	documenKng	some	on:	
–  hcp://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html		
–  Also	try	an	Internet	search
GUI	AutomaKon	
Flame Graphs
Eg,	NeHlix	Vector	(self-service	UI):	
Should	be	open	sourced;	you	may	also	build/buy	your	own
ADVANCED	FLAME	GRAPHS	
Future	Work
Flame	graphs	can	be	generated	for	stack	traces	from	any	Linux	event	source
Page	Faults	
•  Show	what	triggered	main	memory	(resident)	to	grow:	
•  "fault"	as	(physical)	main	memory	is	allocated	on-demand,	
when	a	virtual	page	is	first	populated	
•  Low	overhead	tool	to	solve	some	types	of	memory	leak	
# perf record -e page-faults -p PID -g -- sleep 120
RES column in top(1) grows
because
Other	Memory	Sources	
hcp://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html
Context	Switches	
•  Show	why	Java	blocked	and	stopped	running	on-CPU:	
•  IdenKfies	locks,	I/O,	sleeps	
–  If	code	path	shouldn't	block	and	looks	random,	it's	an	involuntary	context	switch.	I	could	filter	these,	but	you	should	
have	solved	them	beforehand	(CPU	load).	
•  e.g.,	was	used	to	understand	framework	differences:	
# perf record -e context-switches -p PID -g -- sleep 5
vs
rxNetty Tomcat
futex
sys_poll
epoll
futex
Disk	I/O	Requests	
•  Shows	who	issued	disk	I/O	(sync	reads	&	writes):	
•  e.g.:	page	faults	in	GC?	This	JVM	has	swapped	out!:	
# perf record -e block:block_rq_insert -a -g -- sleep 60
GC
TCP	Events	
•  TCP	transmit,	using	dynamic	tracing:	
•  Note:	can	be	high	overhead	for	high	packet	rates	
–  For	the	current	perf	trace,	dump,	post-process	cycle	
•  Can	also	trace	TCP	connect	&	accept	
–  Lower	frequency,	therefore	lower	overhead	
•  TCP	receive	is	async	
–  Could	trace	via	socket	read	
# perf probe tcp_sendmsg
# perf record -e probe:tcp_sendmsg -a -g -- sleep 1; jmaps
# perf script -f comm,pid,tid,cpu,time,event,ip,sym,dso,trace > out.stacks
# perf probe --del tcp_sendmsg
TCP sends
CPU	Cache	Misses	
•  In	this	example,	sampling	via	Last	Level	Cache	loads:	
	
	
•  -c	is	the	count	(samples	
once	per	count)	
•  Use	other	CPU	counters	to	
sample	hits,	misses,	stalls	
	
# perf record -e LLC-loads -c 10000 -a -g -- sleep 5; jmaps
# perf script -f comm,pid,tid,cpu,time,event,ip,sym,dso > out.stacks
CPI	Flame	Graph	
•  Cycles	Per	InstrucKon	
–  red	==	instrucKon	heavy	
–  blue	==	cycle	heavy	
(likely	memory	stall	cycles)	
zoomed:
Off-CPU	Analysis	
Off-CPU	analysis	is	the	study	of	
blocking	states,	or	the	code-path	
(stack	trace)	that	led	to	these	states
Off-CPU	Time	Flame	Graph	
More	info	hcp://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html	
Stack depth
Off-CPU time
Off-CPU	Time	(zoomed):	tar(1)	
file read
from disk
directory read
from disk
Only	showing	kernel	stacks	in	this	example	
pipe write
path read from disk
fstat from disk
CPU	+	Off-CPU	Flame	Graphs:	See	Everything	
hcp://www.brendangregg.com/flamegraphs.html	
CPU	
Off-CPU
Off-CPU	Time	(zoomed):	gzip(1)	
The off-CPU stack trace often doesn't show the root cause of latency.
What is gzip blocked on?
Off-Wake	Time	Flame	Graph	
Uses	Linux	enhanced	BPF	to	merge	off-CPU	and	waker	stack	in	kernel	context
Off-Wake	Time	Flame	Graph	(zoomed)	
Waker	task	
Waker	stack	
Blocked	stack	
Blocked	task	
Stack	
DirecKon	
Wokeup
Chain	Graphs	
Walking	the	chain	of	wakeup	stacks	to	reach	root	cause
Hot	Cold	Flame	Graphs	
Includes	both	CPU	&	
Off-CPU	(or	chain)	stacks	
in	one	flame	graph	
•  However,	Off-CPU	Kme	oqen	
dominates:	threads	waiKng	or	
polling	
hcp://www.brendangregg.com/FlameGraphs/hotcoldflamegraphs.html
Flame	Graph	Diff	
hcps://github.com/corpaul/flamegraphdiff
Take	aways	
1.  Interpret	CPU	flame	graphs	
2.  Understand	piHalls	with	stack	traces	and	symbols	
3.  Discover	opportuniKes	for	future	development
Links	&	References	
•  Flame	Graphs	
–  "The	Flame	Graph"	Communica-ons	of	the	ACM,	Vol.	56,	No.	6	(June	2016)	
–  hcp://queue.acm.org/detail.cfm?id=2927301		
–  hcp://www.brendangregg.com/flamegraphs.html	->	hBp://www.brendangregg.com/flamegraphs.html#Updates		
–  hcp://www.brendangregg.com/FlameGraphs/cpuflamegraphs.html		
–  hcp://www.brendangregg.com/FlameGraphs/memoryflamegraphs.html		
–  hcp://www.brendangregg.com/FlameGraphs/offcpuflamegraphs.html		
–  hcp://techblog.neHlix.com/2015/07/java-in-flames.html		
–  hcp://techblog.neHlix.com/2016/04/saving-13-million-computaKonal-minutes.html		
–  hcp://techblog.neHlix.com/2014/11/nodejs-in-flames.html	
–  hcp://www.brendangregg.com/blog/2014-11-09/differenKal-flame-graphs.html		
–  hcp://www.brendangregg.com/blog/2016-01-20/ebpf-offcpu-flame-graph.html		
–  hcp://www.brendangregg.com/blog/2016-02-01/linux-wakeup-offwake-profiling.html		
–  hcp://www.brendangregg.com/blog/2016-02-05/ebpf-chaingraph-prototype.html				
–  hcp://corpaul.github.io/flamegraphdiff/		
•  Linux	perf_events	
–  hcps://perf.wiki.kernel.org/index.php/Main_Page		
–  hcp://www.brendangregg.com/perf.html		
•  NeHlix	Vector	
–  hcps://github.com/neHlix/vector	
–  hcp://techblog.neHlix.com/2015/04/introducing-vector-neHlixs-on-host.html
Thank You
–  QuesKons?	
–  hcp://www.brendangregg.com	
–  hcp://slideshare.net/brendangregg		
–  bgregg@neHlix.com	
–  @brendangregg	
	
	
	
Next	topic:	Performance	Superpowers	with	Enhanced	BPF	
2017 USENIX Annual Technical Conference
1 of 66

Recommended

Linux Performance Analysis: New Tools and Old Secrets by
Linux Performance Analysis: New Tools and Old SecretsLinux Performance Analysis: New Tools and Old Secrets
Linux Performance Analysis: New Tools and Old SecretsBrendan Gregg
604K views75 slides
Kernel Recipes 2017: Using Linux perf at Netflix by
Kernel Recipes 2017: Using Linux perf at NetflixKernel Recipes 2017: Using Linux perf at Netflix
Kernel Recipes 2017: Using Linux perf at NetflixBrendan Gregg
1.5M views79 slides
Performance Tuning EC2 Instances by
Performance Tuning EC2 InstancesPerformance Tuning EC2 Instances
Performance Tuning EC2 InstancesBrendan Gregg
171.7K views81 slides
Linux 4.x Tracing Tools: Using BPF Superpowers by
Linux 4.x Tracing Tools: Using BPF SuperpowersLinux 4.x Tracing Tools: Using BPF Superpowers
Linux 4.x Tracing Tools: Using BPF SuperpowersBrendan Gregg
210.2K views68 slides
BPF Internals (eBPF) by
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)Brendan Gregg
15.3K views122 slides
[232] 성능어디까지쥐어짜봤니 송태웅 by
[232] 성능어디까지쥐어짜봤니 송태웅[232] 성능어디까지쥐어짜봤니 송태웅
[232] 성능어디까지쥐어짜봤니 송태웅NAVER D2
18.2K views140 slides

More Related Content

What's hot

YOW2021 Computing Performance by
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing PerformanceBrendan Gregg
2K views108 slides
Java Performance Analysis on Linux with Flame Graphs by
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame GraphsBrendan Gregg
24.7K views71 slides
LISA2019 Linux Systems Performance by
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems PerformanceBrendan Gregg
375.3K views64 slides
re:Invent 2019 BPF Performance Analysis at Netflix by
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at NetflixBrendan Gregg
5.5K views65 slides
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt by
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtAnne Nicolas
5.6K views123 slides
Understanding eBPF in a Hurry! by
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!Ray Jenkins
1.5K views77 slides

What's hot(20)

YOW2021 Computing Performance by Brendan Gregg
YOW2021 Computing PerformanceYOW2021 Computing Performance
YOW2021 Computing Performance
Brendan Gregg2K views
Java Performance Analysis on Linux with Flame Graphs by Brendan Gregg
Java Performance Analysis on Linux with Flame GraphsJava Performance Analysis on Linux with Flame Graphs
Java Performance Analysis on Linux with Flame Graphs
Brendan Gregg24.7K views
LISA2019 Linux Systems Performance by Brendan Gregg
LISA2019 Linux Systems PerformanceLISA2019 Linux Systems Performance
LISA2019 Linux Systems Performance
Brendan Gregg375.3K views
re:Invent 2019 BPF Performance Analysis at Netflix by Brendan Gregg
re:Invent 2019 BPF Performance Analysis at Netflixre:Invent 2019 BPF Performance Analysis at Netflix
re:Invent 2019 BPF Performance Analysis at Netflix
Brendan Gregg5.5K views
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt by Anne Nicolas
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven RostedtKernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Kernel Recipes 2017 - Understanding the Linux kernel via ftrace - Steven Rostedt
Anne Nicolas5.6K views
Understanding eBPF in a Hurry! by Ray Jenkins
Understanding eBPF in a Hurry!Understanding eBPF in a Hurry!
Understanding eBPF in a Hurry!
Ray Jenkins1.5K views
Velocity 2015 linux perf tools by Brendan Gregg
Velocity 2015 linux perf toolsVelocity 2015 linux perf tools
Velocity 2015 linux perf tools
Brendan Gregg1.1M views
eBPF Trace from Kernel to Userspace by SUSE Labs Taipei
eBPF Trace from Kernel to UserspaceeBPF Trace from Kernel to Userspace
eBPF Trace from Kernel to Userspace
SUSE Labs Taipei8.5K views
Linux Performance Profiling and Monitoring by Georg Schönberger
Linux Performance Profiling and MonitoringLinux Performance Profiling and Monitoring
Linux Performance Profiling and Monitoring
Georg Schönberger8.5K views
YOW2020 Linux Systems Performance by Brendan Gregg
YOW2020 Linux Systems PerformanceYOW2020 Linux Systems Performance
YOW2020 Linux Systems Performance
Brendan Gregg1.9K views
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021 by Valeriy Kravchuk
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Valeriy Kravchuk363 views
Deep dive into PostgreSQL statistics. by Alexey Lesovsky
Deep dive into PostgreSQL statistics.Deep dive into PostgreSQL statistics.
Deep dive into PostgreSQL statistics.
Alexey Lesovsky3.7K views
Linux Instrumentation by DarkStarSword
Linux InstrumentationLinux Instrumentation
Linux Instrumentation
DarkStarSword9.7K views
LISA17 Container Performance Analysis by Brendan Gregg
LISA17 Container Performance AnalysisLISA17 Container Performance Analysis
LISA17 Container Performance Analysis
Brendan Gregg9.3K views
EBPF and Linux Networking by PLUMgrid
EBPF and Linux NetworkingEBPF and Linux Networking
EBPF and Linux Networking
PLUMgrid14.6K views
(발표자료) CentOS EOL에 따른 대응 OS 검토 및 적용 방안.pdf by ssuserf8b8bd1
(발표자료) CentOS EOL에 따른 대응 OS 검토 및 적용 방안.pdf(발표자료) CentOS EOL에 따른 대응 OS 검토 및 적용 방안.pdf
(발표자료) CentOS EOL에 따른 대응 OS 검토 및 적용 방안.pdf
ssuserf8b8bd12.8K views
High-Performance Networking Using eBPF, XDP, and io_uring by ScyllaDB
High-Performance Networking Using eBPF, XDP, and io_uringHigh-Performance Networking Using eBPF, XDP, and io_uring
High-Performance Networking Using eBPF, XDP, and io_uring
ScyllaDB3.3K views
Performance Wins with eBPF: Getting Started (2021) by Brendan Gregg
Performance Wins with eBPF: Getting Started (2021)Performance Wins with eBPF: Getting Started (2021)
Performance Wins with eBPF: Getting Started (2021)
Brendan Gregg1.4K views
Linux Performance Analysis and Tools by Brendan Gregg
Linux Performance Analysis and ToolsLinux Performance Analysis and Tools
Linux Performance Analysis and Tools
Brendan Gregg531.1K views

Similar to USENIX ATC 2017: Visualizing Performance with Flame Graphs

JavaOne 2015 Java Mixed-Mode Flame Graphs by
JavaOne 2015 Java Mixed-Mode Flame GraphsJavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame GraphsBrendan Gregg
23.1K views92 slides
May2010 hex-core-opt by
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-optJeff Larkin
459 views101 slides
Cray XT Porting, Scaling, and Optimization Best Practices by
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best PracticesJeff Larkin
782 views122 slides
Can FPGAs Compete with GPUs? by
Can FPGAs Compete with GPUs?Can FPGAs Compete with GPUs?
Can FPGAs Compete with GPUs?inside-BigData.com
1.2K views35 slides
Best Practices for performance evaluation and diagnosis of Java Applications ... by
Best Practices for performance evaluation and diagnosis of Java Applications ...Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...IndicThreads
2.7K views42 slides
Profiling your Applications using the Linux Perf Tools by
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf ToolsemBO_Conference
9.7K views37 slides

Similar to USENIX ATC 2017: Visualizing Performance with Flame Graphs(20)

JavaOne 2015 Java Mixed-Mode Flame Graphs by Brendan Gregg
JavaOne 2015 Java Mixed-Mode Flame GraphsJavaOne 2015 Java Mixed-Mode Flame Graphs
JavaOne 2015 Java Mixed-Mode Flame Graphs
Brendan Gregg23.1K views
May2010 hex-core-opt by Jeff Larkin
May2010 hex-core-optMay2010 hex-core-opt
May2010 hex-core-opt
Jeff Larkin459 views
Cray XT Porting, Scaling, and Optimization Best Practices by Jeff Larkin
Cray XT Porting, Scaling, and Optimization Best PracticesCray XT Porting, Scaling, and Optimization Best Practices
Cray XT Porting, Scaling, and Optimization Best Practices
Jeff Larkin782 views
Best Practices for performance evaluation and diagnosis of Java Applications ... by IndicThreads
Best Practices for performance evaluation and diagnosis of Java Applications ...Best Practices for performance evaluation and diagnosis of Java Applications ...
Best Practices for performance evaluation and diagnosis of Java Applications ...
IndicThreads2.7K views
Profiling your Applications using the Linux Perf Tools by emBO_Conference
Profiling your Applications using the Linux Perf ToolsProfiling your Applications using the Linux Perf Tools
Profiling your Applications using the Linux Perf Tools
emBO_Conference9.7K views
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan by Jimin Hsieh
Scala & Spark(1.6) in Performance Aspect for Scala TaiwanScala & Spark(1.6) in Performance Aspect for Scala Taiwan
Scala & Spark(1.6) in Performance Aspect for Scala Taiwan
Jimin Hsieh665 views
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson... by HostedbyConfluent
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
What’s Slowing Down Your Kafka Pipeline? With Ruizhe Cheng and Pete Stevenson...
HostedbyConfluent351 views
Continuous Go Profiling & Observability by ScyllaDB
Continuous Go Profiling & ObservabilityContinuous Go Profiling & Observability
Continuous Go Profiling & Observability
ScyllaDB855 views
Introduction to FPGA acceleration by Marco77328
Introduction to FPGA accelerationIntroduction to FPGA acceleration
Introduction to FPGA acceleration
Marco77328101 views
Iurii Antykhovych "Java and performance tools and toys" by LogeekNightUkraine
Iurii Antykhovych "Java and performance tools and toys"Iurii Antykhovych "Java and performance tools and toys"
Iurii Antykhovych "Java and performance tools and toys"
(PFC306) Performance Tuning Amazon EC2 Instances | AWS re:Invent 2014 by Amazon Web Services
(PFC306) Performance Tuning Amazon EC2 Instances | AWS re:Invent 2014(PFC306) Performance Tuning Amazon EC2 Instances | AWS re:Invent 2014
(PFC306) Performance Tuning Amazon EC2 Instances | AWS re:Invent 2014
Amazon Web Services36.8K views
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ... by AMD Developer Central
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
PL-4044, OpenACC on AMD APUs and GPUs with the PGI Accelerator Compilers, by ...
NovaProva, a new generation unit test framework for C programs by Greg Banks
NovaProva, a new generation unit test framework for C programsNovaProva, a new generation unit test framework for C programs
NovaProva, a new generation unit test framework for C programs
Greg Banks1.1K views
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019 by corehard_by
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
The Hitchhiker's Guide to Faster Builds. Viktor Kirilov. CoreHard Spring 2019
corehard_by5.1K views
Easy and High Performance GPU Programming for Java Programmers by Kazuaki Ishizaki
Easy and High Performance GPU Programming for Java ProgrammersEasy and High Performance GPU Programming for Java Programmers
Easy and High Performance GPU Programming for Java Programmers
Kazuaki Ishizaki6.9K views
HKG18-TR14 - Postmortem Debugging with Coresight by Linaro
HKG18-TR14 - Postmortem Debugging with CoresightHKG18-TR14 - Postmortem Debugging with Coresight
HKG18-TR14 - Postmortem Debugging with Coresight
Linaro558 views

More from Brendan Gregg

IntelON 2021 Processor Benchmarking by
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor BenchmarkingBrendan Gregg
1K views17 slides
Systems@Scale 2021 BPF Performance Getting Started by
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting StartedBrendan Gregg
1.5K views30 slides
Computing Performance: On the Horizon (2021) by
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)Brendan Gregg
93.1K views113 slides
UM2019 Extended BPF: A New Type of Software by
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of SoftwareBrendan Gregg
33.1K views48 slides
LPC2019 BPF Tracing Tools by
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing ToolsBrendan Gregg
1.8K views9 slides
LSFMM 2019 BPF Observability by
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF ObservabilityBrendan Gregg
8.3K views67 slides

More from Brendan Gregg(20)

IntelON 2021 Processor Benchmarking by Brendan Gregg
IntelON 2021 Processor BenchmarkingIntelON 2021 Processor Benchmarking
IntelON 2021 Processor Benchmarking
Brendan Gregg1K views
Systems@Scale 2021 BPF Performance Getting Started by Brendan Gregg
Systems@Scale 2021 BPF Performance Getting StartedSystems@Scale 2021 BPF Performance Getting Started
Systems@Scale 2021 BPF Performance Getting Started
Brendan Gregg1.5K views
Computing Performance: On the Horizon (2021) by Brendan Gregg
Computing Performance: On the Horizon (2021)Computing Performance: On the Horizon (2021)
Computing Performance: On the Horizon (2021)
Brendan Gregg93.1K views
UM2019 Extended BPF: A New Type of Software by Brendan Gregg
UM2019 Extended BPF: A New Type of SoftwareUM2019 Extended BPF: A New Type of Software
UM2019 Extended BPF: A New Type of Software
Brendan Gregg33.1K views
LPC2019 BPF Tracing Tools by Brendan Gregg
LPC2019 BPF Tracing ToolsLPC2019 BPF Tracing Tools
LPC2019 BPF Tracing Tools
Brendan Gregg1.8K views
LSFMM 2019 BPF Observability by Brendan Gregg
LSFMM 2019 BPF ObservabilityLSFMM 2019 BPF Observability
LSFMM 2019 BPF Observability
Brendan Gregg8.3K views
YOW2018 CTO Summit: Working at netflix by Brendan Gregg
YOW2018 CTO Summit: Working at netflixYOW2018 CTO Summit: Working at netflix
YOW2018 CTO Summit: Working at netflix
Brendan Gregg3.5K views
YOW2018 Cloud Performance Root Cause Analysis at Netflix by Brendan Gregg
YOW2018 Cloud Performance Root Cause Analysis at NetflixYOW2018 Cloud Performance Root Cause Analysis at Netflix
YOW2018 Cloud Performance Root Cause Analysis at Netflix
Brendan Gregg142.9K views
NetConf 2018 BPF Observability by Brendan Gregg
NetConf 2018 BPF ObservabilityNetConf 2018 BPF Observability
NetConf 2018 BPF Observability
Brendan Gregg2.7K views
ATO Linux Performance 2018 by Brendan Gregg
ATO Linux Performance 2018ATO Linux Performance 2018
ATO Linux Performance 2018
Brendan Gregg3.3K views
Linux Performance 2018 (PerconaLive keynote) by Brendan Gregg
Linux Performance 2018 (PerconaLive keynote)Linux Performance 2018 (PerconaLive keynote)
Linux Performance 2018 (PerconaLive keynote)
Brendan Gregg427.1K views
How Netflix Tunes EC2 Instances for Performance by Brendan Gregg
How Netflix Tunes EC2 Instances for PerformanceHow Netflix Tunes EC2 Instances for Performance
How Netflix Tunes EC2 Instances for Performance
Brendan Gregg524.7K views
Kernel Recipes 2017: Performance Analysis with BPF by Brendan Gregg
Kernel Recipes 2017: Performance Analysis with BPFKernel Recipes 2017: Performance Analysis with BPF
Kernel Recipes 2017: Performance Analysis with BPF
Brendan Gregg5.3K views
EuroBSDcon 2017 System Performance Analysis Methodologies by Brendan Gregg
EuroBSDcon 2017 System Performance Analysis MethodologiesEuroBSDcon 2017 System Performance Analysis Methodologies
EuroBSDcon 2017 System Performance Analysis Methodologies
Brendan Gregg15.8K views
OSSNA 2017 Performance Analysis Superpowers with Linux BPF by Brendan Gregg
OSSNA 2017 Performance Analysis Superpowers with Linux BPFOSSNA 2017 Performance Analysis Superpowers with Linux BPF
OSSNA 2017 Performance Analysis Superpowers with Linux BPF
Brendan Gregg5.1K views
USENIX ATC 2017 Performance Superpowers with Enhanced BPF by Brendan Gregg
USENIX ATC 2017 Performance Superpowers with Enhanced BPFUSENIX ATC 2017 Performance Superpowers with Enhanced BPF
USENIX ATC 2017 Performance Superpowers with Enhanced BPF
Brendan Gregg7.1K views
Velocity 2017 Performance analysis superpowers with Linux eBPF by Brendan Gregg
Velocity 2017 Performance analysis superpowers with Linux eBPFVelocity 2017 Performance analysis superpowers with Linux eBPF
Velocity 2017 Performance analysis superpowers with Linux eBPF
Brendan Gregg736.6K views

Recently uploaded

LLMs in Production: Tooling, Process, and Team Structure by
LLMs in Production: Tooling, Process, and Team StructureLLMs in Production: Tooling, Process, and Team Structure
LLMs in Production: Tooling, Process, and Team StructureAggregage
65 views77 slides
The Power of Heat Decarbonisation Plans in the Built Environment by
The Power of Heat Decarbonisation Plans in the Built EnvironmentThe Power of Heat Decarbonisation Plans in the Built Environment
The Power of Heat Decarbonisation Plans in the Built EnvironmentIES VE
85 views20 slides
CryptoBotsAI by
CryptoBotsAICryptoBotsAI
CryptoBotsAIchandureddyvadala199
42 views5 slides
Qualifying SaaS, IaaS.pptx by
Qualifying SaaS, IaaS.pptxQualifying SaaS, IaaS.pptx
Qualifying SaaS, IaaS.pptxSachin Bhandari
1.1K views8 slides
The Power of Generative AI in Accelerating No Code Adoption.pdf by
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdfSaeed Al Dhaheri
44 views18 slides
"Package management in monorepos", Zoltan Kochan by
"Package management in monorepos", Zoltan Kochan"Package management in monorepos", Zoltan Kochan
"Package management in monorepos", Zoltan KochanFwdays
37 views18 slides

Recently uploaded(20)

LLMs in Production: Tooling, Process, and Team Structure by Aggregage
LLMs in Production: Tooling, Process, and Team StructureLLMs in Production: Tooling, Process, and Team Structure
LLMs in Production: Tooling, Process, and Team Structure
Aggregage65 views
The Power of Heat Decarbonisation Plans in the Built Environment by IES VE
The Power of Heat Decarbonisation Plans in the Built EnvironmentThe Power of Heat Decarbonisation Plans in the Built Environment
The Power of Heat Decarbonisation Plans in the Built Environment
IES VE85 views
The Power of Generative AI in Accelerating No Code Adoption.pdf by Saeed Al Dhaheri
The Power of Generative AI in Accelerating No Code Adoption.pdfThe Power of Generative AI in Accelerating No Code Adoption.pdf
The Power of Generative AI in Accelerating No Code Adoption.pdf
Saeed Al Dhaheri44 views
"Package management in monorepos", Zoltan Kochan by Fwdays
"Package management in monorepos", Zoltan Kochan"Package management in monorepos", Zoltan Kochan
"Package management in monorepos", Zoltan Kochan
Fwdays37 views
Future of AR - Facebook Presentation by Rob McCarty
Future of AR - Facebook PresentationFuture of AR - Facebook Presentation
Future of AR - Facebook Presentation
Rob McCarty66 views
The Role of Patterns in the Era of Large Language Models by Yunyao Li
The Role of Patterns in the Era of Large Language ModelsThe Role of Patterns in the Era of Large Language Models
The Role of Patterns in the Era of Large Language Models
Yunyao Li104 views
Mobile Core Solutions & Successful Cases.pdf by IPLOOK Networks
Mobile Core Solutions & Successful Cases.pdfMobile Core Solutions & Successful Cases.pdf
Mobile Core Solutions & Successful Cases.pdf
IPLOOK Networks16 views
Deep Tech and the Amplified Organisation: Core Concepts by Holonomics
Deep Tech and the Amplified Organisation: Core ConceptsDeep Tech and the Amplified Organisation: Core Concepts
Deep Tech and the Amplified Organisation: Core Concepts
Holonomics17 views
NTGapps NTG LowCode Platform by Mustafa Kuğu
NTGapps NTG LowCode Platform NTGapps NTG LowCode Platform
NTGapps NTG LowCode Platform
Mustafa Kuğu474 views
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De... by Moses Kemibaro
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...
Don’t Make A Human Do A Robot’s Job! : 6 Reasons Why AI Will Save Us & Not De...
Moses Kemibaro38 views
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or... by ShapeBlue
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
Zero to Cloud Hero: Crafting a Private Cloud from Scratch with XCP-ng, Xen Or...
ShapeBlue209 views
This talk was not generated with ChatGPT: how AI is changing science by Elena Simperl
This talk was not generated with ChatGPT: how AI is changing scienceThis talk was not generated with ChatGPT: how AI is changing science
This talk was not generated with ChatGPT: how AI is changing science
Elena Simperl34 views
"Node.js Development in 2024: trends and tools", Nikita Galkin by Fwdays
"Node.js Development in 2024: trends and tools", Nikita Galkin "Node.js Development in 2024: trends and tools", Nikita Galkin
"Node.js Development in 2024: trends and tools", Nikita Galkin
Fwdays37 views
Business Analyst Series 2023 - Week 4 Session 7 by DianaGray10
Business Analyst Series 2023 -  Week 4 Session 7Business Analyst Series 2023 -  Week 4 Session 7
Business Analyst Series 2023 - Week 4 Session 7
DianaGray10152 views
GDSC GLAU Info Session.pptx by gauriverrma4
GDSC GLAU Info Session.pptxGDSC GLAU Info Session.pptx
GDSC GLAU Info Session.pptx
gauriverrma415 views
What is Authentication Active Directory_.pptx by HeenaMehta35
What is Authentication Active Directory_.pptxWhat is Authentication Active Directory_.pptx
What is Authentication Active Directory_.pptx
HeenaMehta3515 views
"Surviving highload with Node.js", Andrii Shumada by Fwdays
"Surviving highload with Node.js", Andrii Shumada "Surviving highload with Node.js", Andrii Shumada
"Surviving highload with Node.js", Andrii Shumada
Fwdays59 views

USENIX ATC 2017: Visualizing Performance with Flame Graphs