performance tweaks and     tools for linux            joe damato      twitter: @joedamato      blog: timetobleed.com
~12 hour flight+11 hour time change
and
side effects are..
nasty bugsejpphoto (flickr)
fatboyke (flickr)                    code
memory bloat37prime (flickr)
?
TOOLS
LSOF   list open fileslsof -nPp <pid>
lsof -nPp <pid>-nInhibits the conversion of network numbers to host names.-PInhibits the conversion of port numbers to nam...
STRACE    trace system calls and signals      strace -cp <pid>strace -ttTp <pid> -o <file>
strace -cp <pid>-cCount time, calls, and errors for each system call and report asummary on program exit.-p pidAttach to t...
strace -ttTp <pid> -o <file>-ttIf given twice, the time printed will include the microseconds.-TShow the time spent in sys...
http client connection                                    read 772 bytes read(22, "GET / HTTP/1.1r"..., 16384) = 772 <0.00...
mysql connection                    write sql query to dbwrite(5, "SELECT * FROM `table`"..., 56) = 56 <0.0023>read(5, "25...
strace ruby to see some interesting things...
stracing ruby: SIGVTALRM    --- SIGVTALRM (Virtual   timer expired) @ 0 (0) ---    rt_sigreturn(0x1a)        = 2207807 <0....
stracing ruby: sigprocmask% time     seconds usecs/call      calls    errors syscall------ ----------- ----------- -------...
TCPDUMP   dump traffic on a networktcpdump -i eth0 -s 0 -nqA    tcp dst port 3306
tcpdump -i <eth> -s <len> -nqA <expr>  tcpdump -i <eth> -w <file> <expr>-i <eth>Network interface.-s <len>Snarf len bytes ...
tcp dst port 8019:52:20.216294 IP 24.203.197.27.40105 >174.37.48.236.80: tcp 438E...*.@.l.%&.....%0....POx..%s.oP.......GE...
tcp dst port 330619:51:06.501632 IP 10.8.85.66.50443 >10.8.85.68.3306: tcp 98E..."K@.@.Yy.UB.UD.....z....L............GZ.y...
tcpdump -w <file>
PERFTOOLS     googles performance toolsCPUPROFILE=/tmp/myprof ./myapp  pprof ./myapp /tmp/myprof
wget http://google-perftools.googlecode.com/files/google-perftools-1.6.tar.gz                                 downloadtar ...
pprof ruby                pprof ruby ruby.prof --text           ruby.prof --gifTotal: 103 samples    95 92.2% rb_yield_0  ...
Profiling MRI                  • 10% of production                     VM time spent in                     rb_str_sub_ban...
Profiling EM + threads           Total: 3763 samples            2764 73.5% catch_timer             989 26.3% memcpy       ...
PERFTOOLS.RB   perftools for ruby codepprof.rb /tmp/myrbprof                 github.com/tmm1/perftools.rb
gem install perftools.rbRUBYOPT="-r`gem which perftools | tail -1`"CPUPROFILE=/tmp/myrbprofruby myapp.rbpprof.rb /tmp/myrb...
require sinatra                       $ ab -c 1 -n 50 http://127.0.0.1:4567/compute                       $ ab -c 1 -n 50 ...
CPUPROFILE_REALTIME=1                         CPUPROFILE=app.profCPUPROFILE=app-rt.prof
redis-rb bottleneck
why is rubygems slow?
faster     bundle     install• 23% spent in  Gem::Version#<=>• simple patch to rubygems  improved overall install  perform...
CPUPROFILE_OBJECTS=1 CPUPROFILE=app-objs.prof• object allocation profiler  mode built-in• 1 sample = 1 object  created• Tim...
LTRACE        trace library calls      ltrace -cp <pid>ltrace -ttTp <pid> -o <file>
ltrace -c ruby threaded_em.rb         % time     seconds usecs/call      calls       function         ------ ----------- -...
LTRACE/LIBDL  trace dlopen’d library callsltrace -F <conf> -bg -x   <symbol> -p <pid>             github.com/ice799/ltrace...
ltrace -F <conf> -b -g -x <sym>-bIgnore signals.-gIgnore libraries linked at compile time.-F <conf>Read prototypes from co...
ltrace -x garbage_collect19:08:06.436926   garbage_collect()   =   <void>   <0.221679>19:08:15.329311   garbage_collect() ...
ltrace -x mysql_real_querymysql_real_query(0x1c9e0500,   "SET NAMES UTF8", 16)         =   0   <0.000324>mysql_real_query(...
ltrace -x memcached_setmemcached_set(0x15d46b80,   "Status:33",   21,   "004b",   366)   =   0   <0.01116>memcached_set(0x...
GDB the GNU debuggergdb <executable>gdb attach <pid>
Debugging Ruby Segfaults test_segv.rb:4: [BUG] Segmentation fault ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9.7.0]...
1. Attach to running process $ ps aux | grep ruby joe 23611 0.0 0.1      25424   7540 S Dec01 0:00 ruby test_segv.rb $ sud...
def test  require segv  4.times do    Dir.chdir /tmp do       Hash.new{ segv }[0]    end  end     (gdb) whereend       #0 ...
GDB.RB  gdb with MRI hooksgem install gdb.rb   gdb.rb <pid>                  github.com/tmm1/gdb.rb
(gdb) ruby eval 1+23(gdb) ruby eval Thread.current#<Thread:0x1d630 run>(gdb) ruby eval Thread.list.size8
(gdb) ruby threads list0x15890 main thread THREAD_STOPPED    WAIT_JOIN(0x19ef4)0x19ef4      thread THREAD_STOPPED    WAIT_...
(gdb) ruby objects  HEAPS            8  SLOTS      1686252  LIVE        893327 (52.98%)  FREE        792925 (47.02%)  scop...
(gdb) ruby objects strings      140 ulib      158 u0      294 un      619 u    30503 unique strings  3187435 bytes
def test  require segv  4.times do    Dir.chdir /tmp do       Hash.new{ segv }[0]    end  endend            (gdb) ruby thr...
rails_warden leak(gdb) ruby objects classes    1197 MIME::Type    2657 NewRelic::MetricSpec    2719 TZInfo::TimezoneTransi...
mongrel sleeper thread0x16814c00          thread THREAD_STOPPED    WAIT_TIME(0.47) 1522 bytes      node_fcall    sleep in ...
god memory leaks(gdb) ruby objects arrays    43   God::Process elements instances          43   God::Watch    94310 3     ...
BLEAK_HOUSE     ruby memory leak detector  ruby-bleak-house myapp.rbbleak /tmp/bleak.<PID>.*.dump                    githu...
• BleakHouse    • installs a patched version of ruby: ruby-bleak-         house     •   unlike gdb.rb, see where objects w...
IOPROFILE             summarizes strace and lsofwget http://aspersa.googlecode.com/svn/trunk/ioprofile
strace -cp <pid> ioprofile is a script that captures one sample of lsof output then starts strace for a specified amount o...
strace -ttTp <pid> -o <file>  -c CELL  specify what to put in the cells of the output. ‘times’, ‘count’,  or ‘sizes‘.  bel...
/proclots of interesting things hiding in proc
/proc/[pid]/pagemap•   (as of linux 2.6.25)•   /proc/[pid]/pagemap - find out which physical    frame each virtual page is ...
/proc/[pid]/stack • get a process’s stack trace
/proc/[pid]/status• lots of small bits of information
/proc/sys/kernel/core_pattern   • tell system where to output core dumps   • can also launch user program when     core du...
and lots more  /proc/meminfo    /proc/scsi/*    /proc/net/*/proc/sys/net/ipv4/*         ...
MONITOR RAID STATUS
• how do you know when a hard drive in  your RAID array fails?• turns out there some command line tools  that most RAID ve...
snooki:/# /usr/StorMan/arcconf getconfig 1 ALControllers found: 1----------------------------------------------------------...
• write a script to parse that• run it with cron• the ugly ruby script i use for adaptec:  http://gist.github.com/643666
more well known tools:iostat, vmstat, top, and free
LINUX TWEAKS
ADJUST TIMER FREQUENCY
CONFIG_HZ_100=y CONFIG_HZ=100• Set the timer interrupt frequency.• Fewer timer interrupts means processes  run with fewer ...
CONNECTOR
CONFIG_CONNECTOR=yCONFIG_PROC_EVENTS=y• connector kernel module is useful for  process monitoring.• build or use a system ...
TCP SEGMENTATION OFFLOADING
sudo ethtool -K eth1 tso on  • Allows kernel to offload large packet    segmentation to the network adapter.  • Frees the C...
http://kerneltrap.org/node/397   Tx/Rx TCP file send long (bi-directional Rx/Tx):   w/o TSO: 1500Mbps, 82% CPU   w/ TSO: 16...
INTEL I/OAT DMA ENGINE
CONFIG_DMADEVICES=y          CONFIG_INTEL_IOATDMA=y           CONFIG_DMA_ENGINE=y            CONFIG_NET_DMA=y• these optio...
check if I/OAT is enabled [joe@timetobleed]% dmesg | grep ioat ioatdma 0000:00:08.0: setting latency timer to 64 ioatdma 0...
DIRECT CACHE ACCESS
CONFIG_DCA=y• I/OAT includes Direct Cache Access (DCA)• DCA allows a driver to warm CPU cache.• Requires driver and device...
DCA enable hackget a hack to enable DCA on some vendorlocked systems:http://github.com/ice799/dca_forceread about the hack...
THROTTLE NIC INTERRUPTS
insmod e1000e.koInterruptThrottleRate=1•   SOME drivers allow you to specify the interrupt throttling    algorithm.•   e10...
PROCESS AFFINITY
Process Affinity• Linux allows you to set which CPU(s) a  process may run on.• For example, set PID 123 to CPUs 4-6:       ...
IRQ AFFINITY
IRQ Affinity• NIC, disk, etc IRQ handlers can be set to  execute on specific processors.• The IRQ to CPU map can be found at...
IRQ affinity• Can pin IRQ handlers for devices to specific  CPUs.• Can then use taskset to pin important  processes to other...
irqbalance• http://www.irqbalance.org• “irqbalance is a Linux* daemon that  distributes interrupts over the processors  an...
OPROFILE
CONFIG_OPROFILE=yCONFIG_HAVE_OPROFILE=y•   oprofile is a system wide profiler that can    profile both kernel and application...
LATENCYTOP
CONFIG_LATENCYTOP=y • LatencyTOP helps you understand what   caused the kernel to block a process.
nasty bugsejpphoto (flickr)
fatboyke (flickr)                    code
memory bloat37prime (flickr)
TOOLS
TWEAKS
Thanks.   (spasiba ?)  @joedamatotimetobleed.com
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Performance tweaks and tools for Linux (Joe Damato)
Upcoming SlideShare
Loading in …5
×

Performance tweaks and tools for Linux (Joe Damato)

2,314 views

Published on

Published in: Technology

Performance tweaks and tools for Linux (Joe Damato)

  1. 1. performance tweaks and tools for linux joe damato twitter: @joedamato blog: timetobleed.com
  2. 2. ~12 hour flight+11 hour time change
  3. 3. and
  4. 4. side effects are..
  5. 5. nasty bugsejpphoto (flickr)
  6. 6. fatboyke (flickr) code
  7. 7. memory bloat37prime (flickr)
  8. 8. ?
  9. 9. TOOLS
  10. 10. LSOF list open fileslsof -nPp <pid>
  11. 11. lsof -nPp <pid>-nInhibits the conversion of network numbers to host names.-PInhibits the conversion of port numbers to names for network files FD TYPE NAME json cwd DIR /var/www/myapp memcached txt REG /usr/bin/ruby mysql mem REG /json-1.1.9/ext/json/ext/generator.so http mem REG /json-1.1.9/ext/json/ext/parser.so mem REG /memcached-0.17.4/lib/rlibmemcached.so mem REG /mysql-2.8.1/lib/mysql_api.so 0u CHR /dev/null 1w REG /usr/local/nginx/logs/error.log 2w REG /usr/local/nginx/logs/error.log 3u IPv4 10.8.85.66:33326->10.8.85.68:3306 (ESTABLISHED) 10u IPv4 10.8.85.66:33327->10.8.85.68:3306 (ESTABLISHED) 11u IPv4 127.0.0.1:58273->127.0.0.1:11211 (ESTABLISHED) 12u REG /tmp/RackMultipart.28957.0 33u IPv4 174.36.83.42:37466->69.63.180.21:80 (ESTABLISHED)
  12. 12. STRACE trace system calls and signals strace -cp <pid>strace -ttTp <pid> -o <file>
  13. 13. strace -cp <pid>-cCount time, calls, and errors for each system call and report asummary on program exit.-p pidAttach to the process with the process ID pid and begin tracing.% time seconds usecs/call calls errors syscall------ ----------- ----------- --------- --------- ---------------- 50.39 0.000064 0 1197 592 read 34.65 0.000044 0 609 writev 14.96 0.000019 0 1226 epoll_ctl 0.00 0.000000 0 4 close 0.00 0.000000 0 1 select 0.00 0.000000 0 4 socket 0.00 0.000000 0 4 4 connect 0.00 0.000000 0 1057 epoll_wait------ ----------- ----------- --------- --------- ----------------100.00 0.000127 4134 596 total
  14. 14. strace -ttTp <pid> -o <file>-ttIf given twice, the time printed will include the microseconds.-TShow the time spent in system calls.-o filenameWrite the trace output to the file filename rather than to stderr.epoll_wait(9, {{EPOLLIN, {u32=68841296, u64=68841296}}}, 4096, 50) = 1 <0.033109>accept(10, {sin_port=38313, sin_addr="127.0.0.1"}, [1226]) = 22 <0.000014>fcntl(22, F_GETFL) = 0x2 (flags O_RDWR) <0.000007>fcntl(22, F_SETFL, O_RDWR|O_NONBLOCK) = 0 <0.000008>setsockopt(22, SOL_TCP, TCP_NODELAY, [1], 4) = 0 <0.000008>accept(10, 0x7fff5d9c07d0, [1226]) = -1 EAGAIN <0.000014>epoll_ctl(9, EPOLL_CTL_ADD, 22, {EPOLLIN, {u32=108750368, u64=108750368}}) = 0 <0.000009>epoll_wait(9, {{EPOLLIN, {u32=108750368, u64=108750368}}}, 4096, 50) = 1 <0.000007>read(22, "GET / HTTP/1.1r"..., 16384) = 772 <0.000012>rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0 <0.000007>poll([{fd=5, events=POLLIN|POLLPRI}], 1, 0) = 0 (Timeout) <0.000008>write(5, "1000000-0003SELECT * FROM `table`"..., 56) = 56 <0.000023>read(5, "25101,20x234m"..., 16384) = 284 <1.300897>
  15. 15. http client connection read 772 bytes read(22, "GET / HTTP/1.1r"..., 16384) = 772 <0.0012> incoming http request took 0.0012s
  16. 16. mysql connection write sql query to dbwrite(5, "SELECT * FROM `table`"..., 56) = 56 <0.0023>read(5, "25101,20x234m"..., 16384) = 284 <1.30> read query response slow query
  17. 17. strace ruby to see some interesting things...
  18. 18. stracing ruby: SIGVTALRM --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 2207807 <0.000009> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 0 <0.000009> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 140734552062624 <0.000009> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 140734552066688 <0.000009> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 11333952 <0.000008> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 0 <0.000009> --- SIGVTALRM (Virtual timer expired) @ 0 (0) --- rt_sigreturn(0x1a) = 1 <0.000010> --- SIGVTALRM (Virtual timer expired) @ 0 (0) ---• ruby 1.8 uses signals to schedule its green threads• process receives a SIGVTALRM signal every 10ms
  19. 19. stracing ruby: sigprocmask% time seconds usecs/call calls errors syscall------ ----------- ----------- --------- --------- ----------------100.00 0.326334 0 3568567 rt_sigprocmask 0.00 0.000000 0 9 read 0.00 0.000000 0 10 open 0.00 0.000000 0 10 close 0.00 0.000000 0 9 fstat 0.00 0.000000 0 25 mmap------ ----------- ----------- --------- --------- ----------------100.00 0.326334 3568685 0 total • debian/redhat compile ruby with --enable-pthread • uses a native thread timer for SIGVTALRM • causes excessive calls to sigprocmask: 30% slowdown!
  20. 20. TCPDUMP dump traffic on a networktcpdump -i eth0 -s 0 -nqA tcp dst port 3306
  21. 21. tcpdump -i <eth> -s <len> -nqA <expr> tcpdump -i <eth> -w <file> <expr>-i <eth>Network interface.-s <len>Snarf len bytes of data from each packet.-nDont convert addresses (host addresses, port numbers) to names.-qQuiet output. Print less protocol information.-APrint each packet (minus its link level header) in ASCII.-w <file>Write the raw packets to file rather than printing them out.<expr>libpcap expression, for example: tcp src port 80 tcp dst port 3306
  22. 22. tcp dst port 8019:52:20.216294 IP 24.203.197.27.40105 >174.37.48.236.80: tcp 438E...*.@.l.%&.....%0....POx..%s.oP.......GET /poll_images/cld99erh0/logo.png HTTP/1.1Accept: */*Referer: http://apps.facebook.com/realpolls/?_fb_q=1
  23. 23. tcp dst port 330619:51:06.501632 IP 10.8.85.66.50443 >10.8.85.68.3306: tcp 98E..."K@.@.Yy.UB.UD.....z....L............GZ.y3b..[......W....SELECT * FROM `votes` WHERE (`poll_id` =72621) LIMIT 1
  24. 24. tcpdump -w <file>
  25. 25. PERFTOOLS googles performance toolsCPUPROFILE=/tmp/myprof ./myapp pprof ./myapp /tmp/myprof
  26. 26. wget http://google-perftools.googlecode.com/files/google-perftools-1.6.tar.gz downloadtar zxvf google-perftools-1.6.tar.gzcd google-perftools-1.6./configure --prefix=/optmake compilesudo make install# for linuxexport LD_PRELOAD=/opt/lib/libprofiler.so setup# for osxexport DYLD_INSERT_LIBRARIES=/opt/lib/libprofiler.dylibCPUPROFILE=/tmp/ruby.prof ruby -e profile 5_000_000.times{ "hello world" }pprof `which ruby` --text /tmp/ruby.prof report
  27. 27. pprof ruby pprof ruby ruby.prof --text ruby.prof --gifTotal: 103 samples 95 92.2% rb_yield_0 103 100.0% rb_eval 12 11.7% gc_sweep 52 50.5% rb_str_new3 3 2.9% obj_free 103 100.0% int_dotimes 12 11.7% gc_mark
  28. 28. Profiling MRI • 10% of production VM time spent in rb_str_sub_bang • String#sub! • called from Time.parsereturn unless str.sub!(/A(d{1,2})/, )return unless str.sub!(/A( d|d{1,2})/, )return unless str.sub!(/A( d|d{1,2})/, )return unless str.sub!(/A(d{1,3})/, )return unless str.sub!(/A(d{1,2})/, )return unless str.sub!(/A(d{1,2})/, )
  29. 29. Profiling EM + threads Total: 3763 samples 2764 73.5% catch_timer 989 26.3% memcpy 3 0.1% st_lookup 2 0.1% rb_thread_schedule 1 0.0% rb_eval 1 0.0% rb_newobj 1 0.0% rb_gc_force_recycle • known issue: EM+threads = slow • memcpy?? • thread context switches copy the stack w/ memcpy • EM allocates huge buffer on the stack • solution: move buffer to the heap
  30. 30. PERFTOOLS.RB perftools for ruby codepprof.rb /tmp/myrbprof github.com/tmm1/perftools.rb
  31. 31. gem install perftools.rbRUBYOPT="-r`gem which perftools | tail -1`"CPUPROFILE=/tmp/myrbprofruby myapp.rbpprof.rb /tmp/myrbprof --textpprof.rb /tmp/myrbprof --gif > /tmp/myrbprof.gif
  32. 32. require sinatra $ ab -c 1 -n 50 http://127.0.0.1:4567/compute $ ab -c 1 -n 50 http://127.0.0.1:4567/sleepget /sleep do sleep 0.25 done • Sampling profiler:end • 232 samples totalget /compute do • 83 samples were in /compute proc{ |n| a,b=0,1 • 118 samples had /compute on the stack but were in n.times{ a,b = b,a+b } another function b }.call(10_000) • /compute accounts for 50% done of process, but only 35% of time was in /compute itselfend== Sinatra has ended his set (crowd applauds)PROFILE: interrupts/evictions/bytes = 232/0/2152Total: 232 samples 83 35.8% 35.8% 118 50.9% Sinatra::Application#GET /compute 56 24.1% 59.9% 56 24.1% garbage_collector 35 15.1% 75.0% 113 48.7% Integer#times
  33. 33. CPUPROFILE_REALTIME=1 CPUPROFILE=app.profCPUPROFILE=app-rt.prof
  34. 34. redis-rb bottleneck
  35. 35. why is rubygems slow?
  36. 36. faster bundle install• 23% spent in Gem::Version#<=>• simple patch to rubygems improved overall install performance by 15%• http://gist.github.com/ 458185
  37. 37. CPUPROFILE_OBJECTS=1 CPUPROFILE=app-objs.prof• object allocation profiler mode built-in• 1 sample = 1 object created• Time parsing is both CPU and object allocation intensive• using mysql2 moves this to C
  38. 38. LTRACE trace library calls ltrace -cp <pid>ltrace -ttTp <pid> -o <file>
  39. 39. ltrace -c ruby threaded_em.rb % time seconds usecs/call calls function ------ ----------- ----------- --------- -------------------- 48.65 11.741295 617 19009 memcpy 30.16 7.279634 831 8751 longjmp 9.78 2.359889 135 17357 _setjmp 8.91 2.150565 285 7540 malloc 1.10 0.265946 20 13021 memset 0.81 0.195272 19 10105 __ctype_b_loc 0.35 0.084575 19 4361 strcmp 0.19 0.046163 19 2377 strlen 0.03 0.006272 23 265 realloc ------ ----------- ----------- --------- -------------------- 100.00 24.134999 82999 total ltrace -ttT -e memcpy ruby threaded_em.rb01:24:48.769408 --- SIGVTALRM (Virtual timer expired) ---01:24:48.769616 memcpy(0x1216000, "", 1086328) = 0x1216000 <0.000578>01:24:48.770555 memcpy(0x6e32670, "240&343v", 1086328) = 0x6e32670 <0.000418>01:24:49.899414 --- SIGVTALRM (Virtual timer expired) ---01:24:49.899490 memcpy(0x1320000, "", 1082584) = 0x1320000 <0.000628>01:24:49.900474 memcpy(0x6e32670, "", 1086328) = 0x6e32670 <0.000479>
  40. 40. LTRACE/LIBDL trace dlopen’d library callsltrace -F <conf> -bg -x <symbol> -p <pid> github.com/ice799/ltrace/tree/libdl
  41. 41. ltrace -F <conf> -b -g -x <sym>-bIgnore signals.-gIgnore libraries linked at compile time.-F <conf>Read prototypes from config file.-x <sym>Trace calls to the function sym.-F ltrace.confint mysql_real_query(addr,string,ulong);void garbage_collect(void);int memcached_set(addr,string,ulong,string,ulong);
  42. 42. ltrace -x garbage_collect19:08:06.436926 garbage_collect() = <void> <0.221679>19:08:15.329311 garbage_collect() = <void> <0.187546>19:08:17.662149 garbage_collect() = <void> <0.199200>19:08:20.486655 garbage_collect() = <void> <0.205864>19:08:25.102302 garbage_collect() = <void> <0.214295>19:08:35.552337 garbage_collect() = <void> <0.189172>
  43. 43. ltrace -x mysql_real_querymysql_real_query(0x1c9e0500, "SET NAMES UTF8", 16) = 0 <0.000324>mysql_real_query(0x1c9e0500, "SET SQL_AUTO_IS_NULL=0", 22) = 0 <0.000322>mysql_real_query(0x19c7a500, "SELECT * FROM `users`", 21) = 0 <1.206506>mysql_real_query(0x1c9e0500, "COMMIT", 6) = 0 <0.000181>
  44. 44. ltrace -x memcached_setmemcached_set(0x15d46b80, "Status:33", 21, "004b", 366) = 0 <0.01116>memcached_set(0x15d46b80, "Status:96", 21, "004b", 333) = 0 <0.00224>memcached_set(0x15d46b80, "Status:57", 21, "004b", 298) = 0 <0.01850>memcached_set(0x15d46b80, "Status:10", 21, "004b", 302) = 0 <0.00530>memcached_set(0x15d46b80, "Status:67", 21, "004b", 318) = 0 <0.00291>memcached_set(0x15d46b80, "Status:02", 21, "004b", 299) = 0 <0.00658>memcached_set(0x15d46b80, "Status:34", 21, "004b", 264) = 0 <0.00243>
  45. 45. GDB the GNU debuggergdb <executable>gdb attach <pid>
  46. 46. Debugging Ruby Segfaults test_segv.rb:4: [BUG] Segmentation fault ruby 1.8.7 (2008-08-11 patchlevel 72) [i686-darwin9.7.0] def test #include "ruby.h" require segv 4.times do VALUE Dir.chdir /tmp do segv() Hash.new{ segv }[0] { end VALUE array[1]; end array[1000000] = NULL; end return Qnil; } sleep 10 test() void Init_segv() { rb_define_method(rb_cObject, "segv", segv, 0); }
  47. 47. 1. Attach to running process $ ps aux | grep ruby joe 23611 0.0 0.1 25424 7540 S Dec01 0:00 ruby test_segv.rb $ sudo gdb ruby 23611 Attaching to program: ruby, process 23611 0x00007fa5113c0c93 in nanosleep () from /lib/libc.so.6 (gdb) c Continuing. Program received signal SIGBUS, Bus error. segv () at segv.c:7 7 array[1000000] = NULL;2. Use a coredump Process.setrlimit Process::RLIMIT_CORE, 300*1024*1024 $ sudo mkdir /cores $ sudo chmod 777 /cores $ sudo sysctl kernel.core_pattern=/cores/%e.core.%s.%p.%t $ sudo gdb ruby /cores/ruby.core.6.23611.1259781224
  48. 48. def test require segv 4.times do Dir.chdir /tmp do Hash.new{ segv }[0] end end (gdb) whereend #0 segv () at segv.c:7 #1 0x000000000041f2be in call_cfunc () at eval.c:5727test() ... #13 0x000000000043ba8c in rb_hash_default () at hash.c:521 ... #19 0x000000000043b92a in rb_hash_aref () at hash.c:429 ... #26 0x00000000004bb7bc in chdir_yield () at dir.c:728 #27 0x000000000041d8d7 in rb_ensure () at eval.c:5528 #28 0x00000000004bb93a in dir_s_chdir () at dir.c:816 ... #35 0x000000000041c444 in rb_yield () at eval.c:5142 #36 0x0000000000450690 in int_dotimes () at numeric.c:2834 ... #48 0x0000000000412a90 in ruby_run () at eval.c:1678 #49 0x000000000041014e in main () at main.c:48
  49. 49. GDB.RB gdb with MRI hooksgem install gdb.rb gdb.rb <pid> github.com/tmm1/gdb.rb
  50. 50. (gdb) ruby eval 1+23(gdb) ruby eval Thread.current#<Thread:0x1d630 run>(gdb) ruby eval Thread.list.size8
  51. 51. (gdb) ruby threads list0x15890 main thread THREAD_STOPPED WAIT_JOIN(0x19ef4)0x19ef4 thread THREAD_STOPPED WAIT_TIME(57.10s)0x19e34 thread THREAD_STOPPED WAIT_FD(5)0x19dc4 thread THREAD_STOPPED WAIT_NONE0x19dc8 thread THREAD_STOPPED WAIT_NONE0x19dcc thread THREAD_STOPPED WAIT_NONE0x22668 thread THREAD_STOPPED WAIT_NONE0x1d630 curr thread THREAD_RUNNABLE WAIT_NONE
  52. 52. (gdb) ruby objects HEAPS 8 SLOTS 1686252 LIVE 893327 (52.98%) FREE 792925 (47.02%) scope 1641 (0.18%) regexp 2255 (0.25%) data 3539 (0.40%) class 3680 (0.41%) hash 6196 (0.69%) object 8785 (0.98%) array 13850 (1.55%) string 105350 (11.79%) node 742346 (83.10%)
  53. 53. (gdb) ruby objects strings 140 ulib 158 u0 294 un 619 u 30503 unique strings 3187435 bytes
  54. 54. def test require segv 4.times do Dir.chdir /tmp do Hash.new{ segv }[0] end endend (gdb) ruby threadstest() 0xa3e000 main curr thread THREAD_RUNNABLE WAIT_NONE node_vcall segv in test_segv.rb:5 node_call test in test_segv.rb:5 node_call call in test_segv.rb:5 node_call default in test_segv.rb:5 node_call [] in test_segv.rb:5 node_call test in test_segv.rb:4 node_call chdir in test_segv.rb:4 node_call test in test_segv.rb:3 node_call times in test_segv.rb:3 node_vcall test in test_segv.rb:9
  55. 55. rails_warden leak(gdb) ruby objects classes 1197 MIME::Type 2657 NewRelic::MetricSpec 2719 TZInfo::TimezoneTransitionInfo 4124 Warden::Manager 4124 MethodOverrideForAll 4124 AccountMiddleware 4124 Rack::Cookies 4125 ActiveRecord::ConnectionAdapters::ConnectionManagement 4125 ActionController::Session::CookieStore 4125 ActionController::Failsafe 4125 ActionController::ParamsParser 4125 Rack::Lock 4125 ActionController::Dispatcher 4125 ActiveRecord::QueryCache 4125 ActiveSupport::MessageVerifier 4125 Rack::Headmiddleware chain leaking per request
  56. 56. mongrel sleeper thread0x16814c00 thread THREAD_STOPPED WAIT_TIME(0.47) 1522 bytes node_fcall sleep in lib/mongrel/configurator.rb:285 node_fcall run in lib/mongrel/configurator.rb:285 node_fcall loop in lib/mongrel/configurator.rb:285 node_call run in lib/mongrel/configurator.rb:285 node_call initialize in lib/mongrel/configurator.rb:285 node_call new in lib/mongrel/configurator.rb:285 node_call run in bin/mongrel_rails:128 node_call run in lib/mongrel/command.rb:212 node_call run in bin/mongrel_rails:281 node_fcall (unknown) in bin/mongrel_rails:19def run @listeners.each {|name,s| s.run } $mongrel_sleeper_thread = Thread.new { loop { sleep 1 } }end
  57. 57. god memory leaks(gdb) ruby objects arrays 43 God::Process elements instances 43 God::Watch 94310 3 43 God::Driver 94311 3 43 God::DriverEventQueue 94314 2 43 God::Conditions::MemoryUsage 94316 1 43 God::Conditions::ProcessRunning 43 God::Behaviors::CleanPidFile 5369 arrays 45 Process::Status 2863364 member elements 86 God::Metric 327 God::System::SlashProcPollermany arrays with 327 God::System::Process 90k+ elements! 406 God::DriverEvent 5 separate god leaks fixed by Eric Lindvall with the help of gdb.rb!
  58. 58. BLEAK_HOUSE ruby memory leak detector ruby-bleak-house myapp.rbbleak /tmp/bleak.<PID>.*.dump github.com/fauna/bleak_house
  59. 59. • BleakHouse • installs a patched version of ruby: ruby-bleak- house • unlike gdb.rb, see where objects were created (file:line) • create multiple dumps over time with `kill -USR2 <pid>` and compare to find leaks191691 total objectsFinal heap size 191691 filled, 220961 freeDisplaying top 20 most common line/class pairs 89513 __null__:__null__:__node__ 41438 __null__:__null__:String 2348 ruby/site_ruby/1.8/rubygems/specification.rb:557:Array 1508 ruby/gems/1.8/specifications/gettext-1.9.gemspec:14:String 1021 ruby/gems/1.8/specifications/heel-0.2.0.gemspec:14:String 951 ruby/site_ruby/1.8/rubygems/version.rb:111:String 935 ruby/site_ruby/1.8/rubygems/specification.rb:557:String 834 ruby/site_ruby/1.8/rubygems/version.rb:146:Array
  60. 60. IOPROFILE summarizes strace and lsofwget http://aspersa.googlecode.com/svn/trunk/ioprofile
  61. 61. strace -cp <pid> ioprofile is a script that captures one sample of lsof output then starts strace for a specified amount of time. after strace finishes, the results are processed. below is an example that comes with ioprofile$ ioprofile t/samples/ioprofile-001.txt total pread read pwrite write filename 10.094264 10.094264 0.000000 0.000000 0.000000 /data/data/abd_2dia/aia_227_228.ibd 8.356632 8.356632 0.000000 0.000000 0.000000 /data/data/abd_2dia/aia_227_223.ibd 0.048850 0.046989 0.000000 0.001861 0.000000 /data/data/abd/aia_instances.ibd 0.035016 0.031001 0.000000 0.004015 0.000000 /data/data/abd/vo_difuus.ibd 0.013360 0.000000 0.001723 0.000000 0.011637 /var/log/mysql/mysql-relay.002113 0.008676 0.000000 0.000000 0.000000 0.008676 /data/data/master.info 0.002060 0.000000 0.000000 0.002060 0.000000 /data/data/ibdata1 0.001490 0.000000 0.000000 0.001490 0.000000 /data/data/ib_logfile1 0.000555 0.000000 0.000000 0.000000 0.000555 /var/log/mysql/mysql-relay-log.info 0.000141 0.000000 0.000000 0.000141 0.000000 /data/data/ib_logfile0 0.000100 0.000000 0.000000 0.000100 0.000000 /data/data/abd/9fvus.ibd
  62. 62. strace -ttTp <pid> -o <file> -c CELL specify what to put in the cells of the output. ‘times’, ‘count’, or ‘sizes‘. below is an example of -c sizes:$ ioprofile -c sizes t/samples/ioprofile-001.txt total pread read pwrite write filename 90800128 90800128 0 0 0 /data/data/abd_2dia/aia_227_223.ibd 52150272 52150272 0 0 0 /data/data/abd_2dia/aia_227_228.ibd 999424 0 0 999424 0 /data/data/ibdata1 638976 131072 0 507904 0 /data/data/abd/vo_difuus.ibd 327680 114688 0 212992 0 /data/data/abd/aia_instances.ibd 305263 0 149662 0 155601 /var/log/mysql/mysql-relay.002113 217088 0 0 217088 0 /data/data/ib_logfile1 22638 0 0 0 22638 /data/data/master.info 16384 0 0 16384 0 /data/data/abd/9fvus.ibd 1088 0 0 0 1088 /var/log/mysql/mysql-relay-log.info 512 0 0 512 0 /data/data/ib_logfile0
  63. 63. /proclots of interesting things hiding in proc
  64. 64. /proc/[pid]/pagemap• (as of linux 2.6.25)• /proc/[pid]/pagemap - find out which physical frame each virtual page is mapped to and swap flag.• /proc/kpagecount - stores number of times a particular physical page is mapped.• /proc/kpageflags - flags about each page (locked, slab, dirty, writeback, ...)• read more: Documentation/vm/pagemap.txt
  65. 65. /proc/[pid]/stack • get a process’s stack trace
  66. 66. /proc/[pid]/status• lots of small bits of information
  67. 67. /proc/sys/kernel/core_pattern • tell system where to output core dumps • can also launch user program when core dumps happen. • echo "|/path/to/core_helper.rb %p %s %u %g" > /proc/sys/kernel/core_pattern • /proc/[pid]/ files are kept in place until your handler exits, so full state of the process at death may be inspected. • http://gist.github.com/587443
  68. 68. and lots more /proc/meminfo /proc/scsi/* /proc/net/*/proc/sys/net/ipv4/* ...
  69. 69. MONITOR RAID STATUS
  70. 70. • how do you know when a hard drive in your RAID array fails?• turns out there some command line tools that most RAID vendors provide.• adaptec - /usr/StorMan/arcconf getconfig 1 AL• 3ware - /usr/bin/tw_cli
  71. 71. snooki:/# /usr/StorMan/arcconf getconfig 1 ALControllers found: 1----------------------------------------------------------------------Controller information---------------------------------------------------------------------- Controller Status : Optimal Channel description : SAS/SATA Controller Model : Adaptec 3405 Controller Serial Number : 7C4911519E3 Physical Slot :2 Temperature : 42 C/ 107 F (Normal) Installed memory : 128 MB Copyback : Disabled Background consistency check : Enabled Automatic Failover : Enabled Global task priority : High Stayawake period : Disabled Spinup limit internal drives :0 Spinup limit external drives :0 Defunct disk drive count :0 Logical devices/Failed/Degraded : 1/0/0 -------------------------------------------------------- Controller Version Information -------------------------------------------------------- BIOS : 5.2-0 (17304) Firmware : 5.2-0 (17304) Driver : 1.1-5 (2461)
  72. 72. • write a script to parse that• run it with cron• the ugly ruby script i use for adaptec: http://gist.github.com/643666
  73. 73. more well known tools:iostat, vmstat, top, and free
  74. 74. LINUX TWEAKS
  75. 75. ADJUST TIMER FREQUENCY
  76. 76. CONFIG_HZ_100=y CONFIG_HZ=100• Set the timer interrupt frequency.• Fewer timer interrupts means processes run with fewer interruptions.• Servers (without interactive software) should have lower timer frequency.
  77. 77. CONNECTOR
  78. 78. CONFIG_CONNECTOR=yCONFIG_PROC_EVENTS=y• connector kernel module is useful for process monitoring.• build or use a system like god to watch processes.• when processes die the kernel notifies you• you can restart/recover/etc.
  79. 79. TCP SEGMENTATION OFFLOADING
  80. 80. sudo ethtool -K eth1 tso on • Allows kernel to offload large packet segmentation to the network adapter. • Frees the CPU to do more useful work. • After running the command above, verify with: [joe@timetobleed]% dmesg | tail -1 [892528.450378] 0000:04:00.1: eth1: TSO is Enabled
  81. 81. http://kerneltrap.org/node/397 Tx/Rx TCP file send long (bi-directional Rx/Tx): w/o TSO: 1500Mbps, 82% CPU w/ TSO: 1633Mbps, 75% CPU Tx TCP file send long (Tx only): w/o TSO: 940Mbps, 40% CPU w/ TSO: 940Mbps, 19% CPU
  82. 82. INTEL I/OAT DMA ENGINE
  83. 83. CONFIG_DMADEVICES=y CONFIG_INTEL_IOATDMA=y CONFIG_DMA_ENGINE=y CONFIG_NET_DMA=y• these options enable Intel I/OAT DMA engine present in recent Xeon CPUs.• increases throughput because kernel can offload network data copying to the DMA engine.• CPU can do more useful work.• statistics about savings can be found in sysfs: /sys/class/dma/
  84. 84. check if I/OAT is enabled [joe@timetobleed]% dmesg | grep ioat ioatdma 0000:00:08.0: setting latency timer to 64 ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine found, 4 channels, device version 0x12, driver version 3.64 ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X
  85. 85. DIRECT CACHE ACCESS
  86. 86. CONFIG_DCA=y• I/OAT includes Direct Cache Access (DCA)• DCA allows a driver to warm CPU cache.• Requires driver and device support.• Intel 10GbE driver (ixgbe) supports this feature.• Must enable this feature in the BIOS.• Some vendors hide BIOS option so you will need a hack to enable DCA.
  87. 87. DCA enable hackget a hack to enable DCA on some vendorlocked systems:http://github.com/ice799/dca_forceread about the hack here:http://timetobleed.com/enabling-bios-options-on-a-live-server-with-no-rebooting/
  88. 88. THROTTLE NIC INTERRUPTS
  89. 89. insmod e1000e.koInterruptThrottleRate=1• SOME drivers allow you to specify the interrupt throttling algorithm.• e1000e is one of these drivers.• Two dynamic throttling algorithms: dynamic (1) and dynamic conservative (3).• The difference is the interrupt rate for “Lowest Latency” traffic.• Algorithm 1 is more aggressive for this traffic class.• Read driver documentation for more information.• Be careful to avoid receive livelock.
  90. 90. PROCESS AFFINITY
  91. 91. Process Affinity• Linux allows you to set which CPU(s) a process may run on.• For example, set PID 123 to CPUs 4-6: # taskset -c 4,5,6 123
  92. 92. IRQ AFFINITY
  93. 93. IRQ Affinity• NIC, disk, etc IRQ handlers can be set to execute on specific processors.• The IRQ to CPU map can be found at: /proc/interrupts• Individual IRQs may be set in the file: /proc/irq/[IRQ NUMBER]/smp_affinity
  94. 94. IRQ affinity• Can pin IRQ handlers for devices to specific CPUs.• Can then use taskset to pin important processes to other CPUs.• The result is NIC and disk will not interrupt important processes running elsewhere.• Can also help preserve CPU caches.
  95. 95. irqbalance• http://www.irqbalance.org• “irqbalance is a Linux* daemon that distributes interrupts over the processors and cores you have in your computer system.”
  96. 96. OPROFILE
  97. 97. CONFIG_OPROFILE=yCONFIG_HAVE_OPROFILE=y• oprofile is a system wide profiler that can profile both kernel and application level code.• oprofile has a kernel driver which collects data from CPU registers (on x86 these are MSRs).• oprofile can also annote source code with performance information.• this can help you find and fix performance problems.
  98. 98. LATENCYTOP
  99. 99. CONFIG_LATENCYTOP=y • LatencyTOP helps you understand what caused the kernel to block a process.
  100. 100. nasty bugsejpphoto (flickr)
  101. 101. fatboyke (flickr) code
  102. 102. memory bloat37prime (flickr)
  103. 103. TOOLS
  104. 104. TWEAKS
  105. 105. Thanks. (spasiba ?) @joedamatotimetobleed.com

×