Intro to Linux Performance Analysis

LOPSA SD 2014.03.27 Presentation on Linux Performance Analysis

An introduction using the USE method and showing how several tools fit into those resource evaluations.

Presentation Transcript

  • Intro to Linux Performance Analysis Chris McEniry LOPSA-SD March 27, 2014
  • Me • Systems Architect • Sony Network Entertainment • 18 years running stuff • Majority of the last 14 years: medium-large Internet services
  • Read this book… And look here: http://www.brendangregg.com/ http://www.brendangregg.com/methodology.html http://www.brendangregg.com/Slides/LISA2012_methodologies.pdf http://www.amazon.com/Systems-Performance-Enterprise-Brendan-Gregg/dp/0133390098
  • The website is down!!! It’s just too slow! The DB is too slow! The disk is too slow! SLOW!!! http://farm4.staticflickr.com/3190/2976755407_6a6a574596_o.jpg
  • SLOW!!!! • What does slow mean anyway? • Is it not transferring fast enough? • Is it handling (not) too many requests? http://commons.wikimedia.org/wiki/File:United_States_sign_-_Slow_Traffic_Ahead.svg
  • Slow can mean… • Latency: How long it takes • ms, s, request time, etc • Throughput: How much can happen in a given amount of time • bandwidth, IOPS, rps, tps, etc http://upload.wikimedia.org/wikipedia/commons/2/2e/Miniature_DNF_Dictionary_055_ubt.JPG
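The two meanings of slow can be measured separately. Below is a minimal Python sketch, assuming each request has been recorded as a (start, end) pair of timestamps in seconds. The `summarize` function and the sample data are hypothetical and not part of the talk.

```python
from statistics import mean

def summarize(requests):
    """Return (mean latency in seconds, throughput in requests per second).

    `requests` is a hypothetical list of (start, end) timestamps in seconds.
    """
    latencies = [end - start for start, end in requests]
    # Observation window: from the first start to the last end.
    window = max(end for _, end in requests) - min(start for start, _ in requests)
    return mean(latencies), len(requests) / window

# Four requests of 0.5 s each, served back to back over a 2 s window.
sample = [(0.0, 0.5), (0.5, 1.0), (1.0, 1.5), (1.5, 2.0)]
latency, throughput = summarize(sample)
print(f"latency={latency:.2f}s throughput={throughput:.1f} rps")
```

A service can look fine on one number and bad on the other, so it helps to say which one "slow" refers to before reaching for a tool.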
  • Slowness comes from… • Full utilization of a resource • Waiting in a saturated queue • Generated errors! • The USE Method http://farm6.staticflickr.com/5181/5614813544_a30d693a50_o.jpg
  • Utilization • You have fully used up what’s been allocated • aka 5 lb bag http://farm3.staticflickr.com/2524/4000641774_3331fe06fb_o.jpg
  • Saturation • Waiting for someone else to get done so you can do yours • Typically because a resource is fully utilized, but not necessarily directly http://www.fotocommunity.com/pc/pc/display/30396619
  • Errors • Dropped packets • Incorrect responses • Deadlocks • Timeouts • Not all failures fail fast http://farm8.staticflickr.com/7001/6509400855_aaaf915871_b.jpg
  • How do we determine? • Different types of tools for different examinations • Depends on what you’re looking for (which can be a problem in and of itself) http://farm5.staticflickr.com/4083/5086955738_61f6455ace_b.jpg
  • Resource vs Transaction • Do you care if… • a CPU is maxed out? • processes are blocked? • packets are lost? • or if… • a user’s request fails? • a user gives up on waiting for a response?
  • Maturity • Tracing tools, especially when used in production, require a level of maturity • I’m not that mature… ;) • No, really: just focusing on the basics first http://upload.wikimedia.org/wikipedia/commons/b/bd/OFLC_large_R18%2B.svg
  • http://image.slidesharecdn.com/scalelinuxperformance-130224171331-phpapp01/95/slide-15-638.jpg?cb=1362166290
  • http://image.slidesharecdn.com/scalelinuxperformance-130224171331-phpapp01/95/slide-16-638.jpg?cb=1362166290
  • General
  • ? /var/log/messages
  • /var/log/messages: Errors (mostly; sometimes stats go here)
  • CPU
  • ? uptime
  • uptime: Saturation of the scheduler
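To make the uptime reading concrete, here is a small Python sketch that compares the 1-minute load average with the CPU count. It assumes a Linux host where `/proc/loadavg` holds the same figures that `uptime` prints. The `load_per_cpu` helper and the sample text are illustrative only.

```python
import os

def load_per_cpu(loadavg_text, ncpus):
    """Parse /proc/loadavg-style text and return the 1-minute load per CPU."""
    one_minute = float(loadavg_text.split()[0])
    return one_minute / ncpus

# Sample text in /proc/loadavg format: the 1, 5 and 15 minute averages come first.
sample = "6.00 3.10 1.20 7/512 12345"
ratio = load_per_cpu(sample, ncpus=4)
print(f"1-minute load per CPU: {ratio:.2f}")

# On a live Linux host (guarded, since the path is Linux-specific):
if os.path.exists("/proc/loadavg"):
    with open("/proc/loadavg") as f:
        print(load_per_cpu(f.read(), os.cpu_count() or 1))
```

A ratio persistently above 1.0 suggests tasks are queuing. On Linux the load average also counts tasks in uninterruptible sleep (often waiting on I/O), so it is a hint about saturation rather than a pure CPU measure.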
  • ? top
  • top: Saturation, Utilization
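The CPU utilization that top reports can be approximated from two samples of the aggregate `cpu` line in `/proc/stat`. The sketch below is an assumption-laden illustration, not how top itself is implemented. The function names and sample counters are made up, and iowait is treated as idle time here, which is a choice rather than a rule.

```python
def cpu_times(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line of /proc/stat text."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            # Fields: user nice system idle iowait irq softirq steal.
            # The trailing guest fields are skipped because they are already
            # included in user and nice.
            fields = [int(v) for v in line.split()[1:9]]
            idle = fields[3] + fields[4]  # idle + iowait (assumed non-busy)
            total = sum(fields)
            return total - idle, total
    raise ValueError("no aggregate cpu line found")

def cpu_utilization(before, after):
    """Fraction of CPU time spent busy between two /proc/stat snapshots."""
    busy0, total0 = cpu_times(before)
    busy1, total1 = cpu_times(after)
    return (busy1 - busy0) / (total1 - total0)

# Two hypothetical snapshots taken some interval apart.
before = "cpu  100 0 50 800 50 0 0 0 0 0\ncpu0 100 0 50 800 50 0 0 0 0 0\n"
after = "cpu  400 0 150 1300 150 0 0 0 0 0\ncpu0 400 0 150 1300 150 0 0 0 0 0\n"
util = cpu_utilization(before, after)
print(f"CPU utilization: {util:.0%}")
```

The counters are cumulative since boot, which is why a single snapshot says little and a delta between two snapshots is needed.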
  • Memory
  • ? free
  • free: Utilization
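As a rough illustration of memory utilization, the following Python sketch derives a used fraction from `/proc/meminfo`-style text. It assumes the `MemTotal`, `MemFree`, `Buffers`, `Cached` and (on newer kernels) `MemAvailable` fields. The `memory_utilization` helper and the sample numbers are hypothetical.

```python
def memory_utilization(meminfo_text):
    """Return the used fraction of RAM from /proc/meminfo-style text."""
    values = {}
    for line in meminfo_text.splitlines():
        key, _, rest = line.partition(":")
        parts = rest.split()
        if parts:
            values[key.strip()] = int(parts[0])  # values are in kB
    total = values["MemTotal"]
    if "MemAvailable" in values:
        available = values["MemAvailable"]
    else:
        # Older kernels lack MemAvailable; this fallback is only a rough estimate.
        available = values["MemFree"] + values.get("Buffers", 0) + values.get("Cached", 0)
    return (total - available) / total

sample = "\n".join([
    "MemTotal:       8000000 kB",
    "MemFree:         500000 kB",
    "MemAvailable:   2000000 kB",
    "Buffers:         100000 kB",
    "Cached:         1500000 kB",
])
util = memory_utilization(sample)
print(f"memory utilization: {util:.0%}")
```

Counting only `MemFree` as unused would overstate utilization, because page cache can usually be reclaimed when applications need the memory.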
  • ? vmstat
  • vmstat: Saturation, Utilization, Counts
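One way to read vmstat through the USE lens is to pull a few columns out of its output. The sketch below parses a captured sample. The column names (`r`, `si`, `so`, `us`, `sy`) are the usual procps ones, while the sample values, the `parse_vmstat` helper and the CPU count are assumptions for illustration.

```python
def parse_vmstat(output):
    """Return the last sample of `vmstat` output as a dict of column -> int."""
    rows = [line.split() for line in output.strip().splitlines()]
    # The column-name row is the one whose first token is "r" (run queue).
    header = next(row for row in rows if row and row[0] == "r")
    return dict(zip(header, map(int, rows[-1])))

sample = "\n".join([
    "procs -----------memory---------- ---swap-- -----io---- -system-- ------cpu-----",
    " r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st",
    " 5  1  20480 102400  51200 409600   12   30   100   200  500  900 60 20 15  5  0",
])
stats = parse_vmstat(sample)
ncpus = 4  # hypothetical CPU count for this sample

cpu_utilization_pct = stats["us"] + stats["sy"]   # utilization: user + system time
run_queue_saturated = stats["r"] > ncpus          # saturation: more runnable tasks than CPUs
swapping = stats["si"] > 0 or stats["so"] > 0     # saturation: memory is being swapped
print(cpu_utilization_pct, run_queue_saturated, swapping)
```

The first line of real vmstat output reports averages since boot, so in practice the later samples (for example from `vmstat 1`) are the ones worth reading.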
  • ? slabtop
  • slabtop: Utilization
  • Disk
  • ? df
  • df: Utilization
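The same capacity figure can be read programmatically. A minimal sketch using Python's standard `shutil.disk_usage` follows. The `disk_utilization` helper name is made up, and the result can differ slightly from df's Use% column, which on most filesystems is computed against space available to unprivileged users.

```python
import shutil

def disk_utilization(path="/"):
    """Return the used fraction of the filesystem that contains `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.used / usage.total

# Check the filesystem holding the current directory.
fraction = disk_utilization(".")
print(f"disk utilization: {fraction:.0%}")
```

This covers capacity only. A filesystem can also run out of inodes, which `df -i` reports separately.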
  • ? iostat -x
  • iostat -x: You may be able to derive an additional utilization figure if you know the device's max r/s or w/s, but it is not as clear-cut, because that maximum depends on different properties.
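The arithmetic behind that idea is simple, with the caveat the slide already raises. In this sketch, `max_iops` is an assumed rating for a hypothetical device, and the r/s and w/s values stand in for numbers read off `iostat -x`.

```python
def iops_utilization(reads_per_s, writes_per_s, max_iops):
    """Estimate device utilization as observed IOPS over an assumed maximum."""
    return (reads_per_s + writes_per_s) / max_iops

# Hypothetical reading: 300 r/s and 150 w/s on a device assumed to top out at 1500 IOPS.
estimate = iops_utilization(300.0, 150.0, max_iops=1500.0)
print(f"estimated utilization: {estimate:.0%}")
```

The estimate is only as good as the assumed maximum, which shifts with I/O size, the read/write mix and whether access is random or sequential.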
  • IO (Network)
  • ? ping
  • ping: Errors
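The error signal from ping is mostly its packet loss summary. Below is a small Python sketch that extracts it from captured output, assuming the common Linux summary line format. The sample text and the `packet_loss_percent` helper are illustrative.

```python
import re

def packet_loss_percent(ping_output):
    """Extract the packet loss percentage from ping's summary line."""
    match = re.search(r"(\d+(?:\.\d+)?)% packet loss", ping_output)
    if match is None:
        raise ValueError("no packet loss summary found")
    return float(match.group(1))

sample = "5 packets transmitted, 4 received, 20% packet loss, time 4005ms"
loss = packet_loss_percent(sample)
print(f"packet loss: {loss}%")
```

Loss seen by ping does not say where along the path packets were dropped, so it is a starting point rather than a diagnosis.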
  • ? netstat
  • netstat: Saturation
  • ? netstat -s
  • netstat -s: Errors
  • ? ifconfig
  • ifconfig: Saturation, Utilization, Errors
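The per-interface error and drop counters that ifconfig displays are also exposed in `/proc/net/dev` on Linux. The sketch below parses text in that format. The `interface_errors` helper and the sample counters are hypothetical, and reading drops as a saturation hint is an interpretation rather than a guarantee.

```python
def interface_errors(net_dev_text):
    """Map interface name -> RX/TX error and drop counters from /proc/net/dev text."""
    counters = {}
    for line in net_dev_text.splitlines()[2:]:  # skip the two header lines
        name, _, data = line.partition(":")
        fields = data.split()
        if len(fields) < 16:
            continue
        # Receive: bytes packets errs drop ... (first 8 fields); transmit follows.
        counters[name.strip()] = {
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_errs": int(fields[10]),
            "tx_drop": int(fields[11]),
        }
    return counters

sample = "\n".join([
    "Inter-|   Receive                                                |  Transmit",
    " face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed",
    "  eth0: 1000 10 2 3 0 0 0 0 2000 20 1 4 0 0 0 0",
    "    lo: 500 5 0 0 0 0 0 0 500 5 0 0 0 0 0 0",
])
stats = interface_errors(sample)
for name, c in stats.items():
    print(name, c)
```

Like the CPU counters, these are cumulative since the interface came up, so comparing two readings taken some time apart shows whether errors are still occurring.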
  • What are your examples? http://upload.wikimedia.org/wikipedia/commons/f/f3/Uncle_Sam_(pointing_finger).jpg
  • Applications
  • Running out of Apache Threads • Lots of incoming requests • Apache hits ServerLimit of threads (Utilization!) • Requests start to get stuck in TCP backlog (Saturation!) • Apache endpoints are removed from load balancers (Error!) • Fail! http://upload.wikimedia.org/wikipedia/commons/9/96/Colorful_Threads_(3965274345).jpg
  • Cold DB Start • DBs like to be in memory, but can’t start that way • All data requests go to disk (which is SAN backed) • SAN controller CPU gets maxed out (Utilization!) • HBA queues get deep (Saturation!) • Requests time out (Error!) • Fail!
  • Summary
  • Methods > Tools • Don’t let tools get in the way of solutions • It’s easy to think that all you’re missing is a tool. • But are you actually following a method to your performance madness? http://upload.wikimedia.org/wikipedia/commons/6/6d/Three_Card_Monte.jpg
  • Anti-Methods • Blame Someone Else • Streetlight • Drunk Man • Random Change • Passive Benchmark • Don’t do these… http://www.brendangregg.com/methodology.html http://upload.wikimedia.org/wikipedia/commons/a/af/Villainc.svg
  • Methods • Ad Hoc Checklist • Problem Statement • Scientific • Workload Characterization • Drill-down Analysis • By-layer • Latency Analysis • Tools • Stack Profile • Off-CPU Analysis • Thread State Analysis • Active Benchmark http://www.brendangregg.com/methodology.html http://memegenerator.net/instance/9192015
  • Linux Performance Tools Chris McEniry LOPSA-SD March 27, 2014