Intro to Linux Performance Analysis

Chris McEniry
LOPSA-SD
March 27, 2014

Me
• Systems Architect
• Sony Network Entertainment
• 18 years running stuff
• Majority of the last 14 years: medium-large Internet services

Read this book…
And look here:
http://www.brendangregg.com/
http://www.brendangregg.com/methodology.html
http://www.brendangregg.com/Slides/LISA2012_methodologies.pdf
http://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098

The website is down!!!
It’s just too slow!
The DB is too slow!
The disk is too slow!
SLOW!!!
http://farm4.staticflickr.com/3190/2976755407_6a6a574596_o.jpg

SLOW!!!!
• What does slow mean, anyway?
• Is it not transferring fast enough?
• Is it handling (or not handling) too many requests?
http://commons.wikimedia.org/wiki/File:United_States_sign_-_Slow_Traffic_Ahead.svg

Slow can mean…
• Latency: how long it takes
• ms, s, request time, etc.
• Throughput: how much gets done per unit of time
• bandwidth, IOPS, rps, tps, etc.
http://upload.wikimedia.org/wikipedia/commons/2/2e/Miniature_DNF_Dictionary_055_ubt.JPG
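To make the two meanings of "slow" concrete, here is a minimal sketch that computes both from the same data. It assumes you have start and end timestamps (in seconds) for completed requests. The `summarize` function and its field names are hypothetical, not taken from any particular tool.

```python
from statistics import mean


def summarize(requests):
    """Summarize latency and throughput.

    requests: list of (start_seconds, end_seconds) tuples, one per
    completed request. This input shape is an assumption for the sketch.
    """
    latencies = [end - start for start, end in requests]
    # Observation window: first request start to last request end.
    window = max(end for _, end in requests) - min(start for start, _ in requests)
    return {
        "mean_latency_s": mean(latencies),           # latency: how long each takes
        "max_latency_s": max(latencies),
        "throughput_rps": len(requests) / window,    # throughput: work per second
    }
```

A service can look fine on one number and bad on the other, so it helps to state which one "slow" refers to before measuring.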

Slowness comes from…
• Full utilization of a resource
• Waiting in a saturated queue
• Generated errors!
• The USE Method
http://farm6.staticflickr.com/5181/5614813544_a30d693a50_o.jpg
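The three USE questions can be sketched as a small check applied to each resource in turn. Everything here is illustrative: the `ResourceSample` type, its fields, and the 90% utilization threshold are assumptions for the example, not part of the method's definition.

```python
from dataclasses import dataclass


@dataclass
class ResourceSample:
    name: str
    utilization_pct: float  # share of capacity in use, 0-100
    queue_length: int       # work waiting for the resource
    error_count: int        # errors seen during the interval


def use_findings(sample, util_threshold_pct=90.0):
    """Return which of the USE checks flag this resource.

    The threshold is an assumed value; pick one that suits the resource.
    """
    findings = []
    if sample.error_count > 0:
        findings.append("errors")
    if sample.utilization_pct >= util_threshold_pct:
        findings.append("utilization")
    if sample.queue_length > 0:
        findings.append("saturation")
    return findings
```

For example, `use_findings(ResourceSample("cpu", 97.0, 4, 0))` flags utilization and saturation but not errors.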

Utilization
• You have fully used up what’s been allocated
• aka 5 lb bag
http://farm3.staticflickr.com/2524/4000641774_3331fe06fb_o.jpg

Saturation
• Waiting for someone else to finish so you can do your work
• Typically because a resource is fully utilized, but not necessarily directly
http://www.fotocommunity.com/pc/pc/display/30396619

Errors
• Dropped packets
• Incorrect responses
• Deadlocks
• Timeouts
• Not all failures fail fast
http://farm8.staticflickr.com/7001/6509400855_aaaf915871_b.jpg

How do we determine?
• Different types of tools for different examinations
• Depends on what you’re looking for (which can be a problem in and of itself)
http://farm5.staticflickr.com/4083/5086955738_61f6455ace_b.jpg

Resource vs Transaction
• Do you care if…
• a CPU is maxed out?
• processes are blocked?
• packets are lost?
• or if…
• a user’s request fails?
• a user gives up on waiting for a response?

Maturity
• Tracing tools, especially when used in production, require a level of maturity
• I’m not that mature… ;)
• No, really: just focusing on the basics first
http://upload.wikimedia.org/wikipedia/commons/b/bd/OFLC_large_R18%2B.svg

http://image.slidesharecdn.com/scalelinuxperformance-130224171331-phpapp01/95/slide-15-638.jpg?cb=1362166290

http://image.slidesharecdn.com/scalelinuxperformance-130224171331-phpapp01/95/slide-16-638.jpg?cb=1362166290

Errors
(mostly - sometimes stats go here)
/var/log/messages

Saturation of the scheduler
uptime
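The load averages that `uptime` prints are also exposed in `/proc/loadavg`. A minimal sketch of reading them as a scheduler-saturation signal follows. The rule of thumb "1-minute load above the CPU count" is an assumption, and on Linux the load average also counts tasks in uninterruptible sleep, so treat it as a hint rather than a verdict. Both function names are hypothetical.

```python
def parse_loadavg(text):
    """Parse the contents of /proc/loadavg.

    The first three whitespace-separated fields are the 1-, 5- and
    15-minute load averages.
    """
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen


def scheduler_saturated(load_1min, cpu_count):
    """Assumed rule of thumb: more runnable work than CPUs to run it."""
    return load_1min > cpu_count
```

On a live system the input could come from `open("/proc/loadavg").read()` and the CPU count from `os.cpu_count()`.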

vmstat
Saturation
Utilization
Counts
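As a rough sketch of how the `vmstat` columns can be grouped under those three labels, the code below parses plain `vmstat` text output and sorts a few columns into saturation, utilization, and counts. The grouping is my own assumption, since the annotated screenshot is not part of this text. The function names are hypothetical, and the parser assumes the default column header that starts with `r`.

```python
def parse_vmstat(output):
    """Return one dict per sample row of `vmstat` text output."""
    lines = [ln.split() for ln in output.strip().splitlines()]
    # The column-name line is the one whose first field is "r".
    header = next((f for f in lines if f and f[0] == "r"), None)
    if header is None:
        raise ValueError("no vmstat column header found")
    rows = []
    for fields in lines:
        if len(fields) == len(header) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(header, map(int, fields))))
    return rows


def use_view(row):
    """Group selected columns by USE category (an assumed mapping)."""
    return {
        "saturation": {"runnable": row["r"], "blocked": row["b"],
                       "swap_in": row["si"], "swap_out": row["so"]},
        "utilization": {"cpu_not_idle_pct": 100 - row["id"]},
        "counts": {"interrupts": row["in"], "context_switches": row["cs"]},
    }
```

Note that the first sample row `vmstat` prints reports averages since boot, so later rows are usually the ones worth reading.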

You may be able to estimate utilization further if you know the
device's maximum r/s or w/s, but that maximum is not clear-cut
because it depends on different properties of the I/O.
iostat -x
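The estimate described on this slide can be written as a one-line calculation. This is only a sketch: `max_iops` is a number you would have to measure or take from a vendor specification yourself, and it is an assumption that a single ceiling applies to the current mix of reads and writes.

```python
def estimated_utilization_pct(reads_per_s, writes_per_s, max_iops):
    """Estimate device utilization from observed r/s and w/s.

    max_iops is an assumed, externally supplied ceiling; the real
    ceiling shifts with I/O size and access pattern.
    """
    if max_iops <= 0:
        raise ValueError("max_iops must be positive")
    return 100.0 * (reads_per_s + writes_per_s) / max_iops
```

The r/s and w/s inputs correspond to the per-device read and write rates reported by `iostat -x`.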

ifconfig
Saturation
Utilization
Errors
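The per-interface counters are also available in `/proc/net/dev`, which is easier to parse than `ifconfig` output. The sketch below assumes the usual layout of that file (two header lines, then one line per interface with eight receive and eight transmit counters). The function names are hypothetical.

```python
RX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "frame",
             "compressed", "multicast"]
TX_FIELDS = ["bytes", "packets", "errs", "drop", "fifo", "colls",
             "carrier", "compressed"]


def parse_net_dev(text):
    """Parse /proc/net/dev text into {interface: {"rx": {...}, "tx": {...}}}."""
    stats = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # the two header lines contain no colon
        name, data = line.split(":", 1)
        values = [int(v) for v in data.split()]
        if len(values) != 16:
            continue  # unexpected layout, skip rather than guess
        stats[name.strip()] = {
            "rx": dict(zip(RX_FIELDS, values[:8])),
            "tx": dict(zip(TX_FIELDS, values[8:])),
        }
    return stats


def errors_and_drops(iface_stats):
    """Total error and drop counters for one interface (both directions)."""
    rx, tx = iface_stats["rx"], iface_stats["tx"]
    return {"errors": rx["errs"] + tx["errs"], "drops": rx["drop"] + tx["drop"]}
```

These counters are cumulative since the interface came up, so comparing two readings taken some seconds apart shows whether errors or drops are happening now.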

What are your examples?
http://upload.wikimedia.org/wikipedia/commons/f/f3/Uncle_Sam_(pointing_finger).jpg

Running out of Apache Threads
• Lots of incoming requests
• Apache hits ServerLimit of threads (Utilization!)
• Requests start to get stuck in TCP backlog (Saturation!)
• Apache endpoints are removed from load balancers (Error!)
• Fail!
http://upload.wikimedia.org/wikipedia/commons/9/96/Colorful_Threads_(3965274345).jpg

Cold DB Start
• DBs like to be in memory, but can’t start that way
• All data requests go to disk (which is SAN backed)
• SAN controller CPU gets maxed out (Utilization!)
• HBA queues get deep (Saturation!)
• Requests time out (Error!)
• Fail!

Methods > Tools
• Don’t let tools get in the way of solutions
• It’s easy to think that all you’re missing is a tool.
• But are you actually following a method to your performance madness?
http://upload.wikimedia.org/wikipedia/commons/6/6d/Three_Card_Monte.jpg

Anti-Methods
• Blame Someone Else
• Streetlight
• Drunk Man
• Random Change
• Passive Benchmark
• Don’t do these…
http://www.brendangregg.com/methodology.html http://upload.wikimedia.org/wikipedia/commons/a/af/Villainc.svg

Methods
• Ad Hoc Checklist
• Problem Statement
• Scientific
• Workload Characterization
• Drill-down Analysis
• By-layer
• Latency Analysis
• Tools
• Stack Profile
• Off-CPU Analysis
• Thread State Analysis
• Active Benchmark
http://www.brendangregg.com/methodology.html http://memegenerator.net/instance/9192015

Linux Performance
Tools
Chris McEniry
LOPSA-SD
March 27, 2014
