Ruby performance profiling
Minsk, SaM Solutions, 2013
Presented by Alexey Tulia
@AlexeyTulia, github.com/crible
Ruby. A programmer’s best friend
Ruby implementations
Reasons?
Garbage collection
Slow method calls
Garbage collection
GC runs on every 8 MB of allocated memory
One GC run takes ~100 ms
Need one GB? Expect 128 GC calls!
You lose 128 * 0.1 = 12.8 sec
Allocated memory is never returned to the system!
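The arithmetic on this slide can be sketched in Ruby. The 8 MB trigger and the ~100 ms pause are the slide's own figures; the constant and method names are made up for this illustration, and real GC behaviour depends on the Ruby version and its settings.

```ruby
# Back-of-the-envelope GC cost, using the figures from the slide.
GC_TRIGGER_BYTES = 8 * 1024 * 1024 # assumption: one GC run per 8 MB allocated
GC_PAUSE_SECONDS = 0.1             # assumption: ~100 ms per GC run

# Returns [number_of_gc_runs, seconds_spent_in_gc] for a given allocation volume.
def estimated_gc_cost(allocated_bytes)
  runs = allocated_bytes / GC_TRIGGER_BYTES
  [runs, runs * GC_PAUSE_SECONDS]
end

runs, seconds = estimated_gc_cost(1024**3) # 1 GB of allocations
puts "#{runs} GC runs, about #{seconds.round(1)} s spent in GC"
```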
Garbage collection
More object allocations -> more GC calls -> slower code
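A rough way to see this chain in practice is to compare GC.count before and after a piece of code. The helper name gc_runs is hypothetical, and the actual numbers vary with the Ruby version and GC settings, so the sketch only prints what it observes.

```ruby
# Counts how many GC runs happen while the given block executes.
def gc_runs
  before = GC.count
  yield
  GC.count - before
end

# Allocation-heavy: every += builds a brand-new, ever longer String.
heavy = gc_runs do
  s = String.new
  20_000.times { s += "x" }
end

# Lighter: << appends in place; only the small "x" literal is allocated.
light = gc_runs do
  s = String.new
  20_000.times { s << "x" }
end

puts "GC runs with +=: #{heavy}, with <<: #{light}"
```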
Garbage collection tuning
RUBY_HEAP_MIN_SLOTS=8000000
RUBY_GC_MALLOC_LIMIT=10000
Performance patches
Ruby String performance
require 'benchmark'

ITERATIONS = 1000000

def run(str, bench)
  bench.report("#{str.length + 1} chars") do
    ITERATIONS.times do
      new_string = str + 'x'
    end
  end
end

Benchmark.bm do |bench|
  run("12345678901234567890", bench)
  run("123456789012345678901", bench)
  run("1234567890123456789012", bench)
  run("12345678901234567890123", bench)
  run("123456789012345678901234", bench)
  run("1234567890123456789012345", bench)
  run("12345678901234567890123456", bench)
end
               user     system      total        real
21 chars   0.250000   0.000000   0.250000 (  0.247459)
22 chars   0.250000   0.000000   0.250000 (  0.246954)
23 chars   0.250000   0.000000   0.250000 (  0.248440)
24 chars   0.480000   0.000000   0.480000 (  0.478391)
25 chars   0.480000   0.000000   0.480000 (  0.479662)
26 chars   0.480000   0.000000   0.480000 (  0.481211)
27 chars   0.490000   0.000000   0.490000 (  0.490404)
Ruby String performance
Heap strings
Shared strings
Embedded strings
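Ruby String performance
/* Heap string: the data lives in a separately allocated buffer */
struct RString {
    long len;
    char *ptr;
};
/* Shared string: ptr points into another string's buffer */
struct RString {
    long len;
    char *ptr;
    VALUE shared;
};
/* Embedded string: the data is stored inside the object itself */
struct RString {
    char ary[RSTRING_EMBED_LEN_MAX + 1];
};
RSTRING_EMBED_LEN_MAX = 23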
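The jump between 23 and 24 characters in the benchmark lines up with RSTRING_EMBED_LEN_MAX = 23: a longer result no longer fits into the object and needs a separate heap buffer. One way to observe this from Ruby is ObjectSpace.memsize_of from the objspace standard library. The method name string_footprints is made up for this sketch, and the embedding limit differs between Ruby versions and platforms (23 is the slide's figure), so the code prints the sizes instead of assuming them.

```ruby
require 'objspace'

# Returns [length, reported_bytes] pairs for strings of the given lengths.
# A jump in the reported size suggests the string data no longer fits into
# the object slot and was moved to a separately allocated heap buffer.
def string_footprints(lengths)
  lengths.map { |len| [len, ObjectSpace.memsize_of("x" * len)] }
end

string_footprints([10, 23, 24, 100, 1000]).each do |len, bytes|
  puts format("%5d chars -> %d bytes", len, bytes)
end
```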
Slow method calls
What can I do to improve performance?
Use C extensions or gems
Use plain SQL instead of frameworks
Use CPU and memory profiling
Use Rubinius or JRuby
STRACE
trace system calls and signals
strace -cp <pid>
strace -ttTp <pid> -o <file>
strace -cp <pid>
% time     seconds  usecs/call     calls    errors syscall
 50,39    0,000064           0      1197       592 read
 34,56    0,000044           0       609           writev
 14,96    0,000019           0      1226           epoll_ctl
  0,00    0,000000           0         4           close
  0,00    0,000000           0         1           select
  0,00    0,000000           0         4           socket
  0,00    0,000000           0         4         4 connect
  0,00    0,000000           0      1057           epoll_wait
100,0     0,000127                  4134       596 total
LTRACE
trace dynamic library calls
ltrace -F <conf_file> -bg -x <symbol> -p <pid>
Function prototypes in <conf_file> (passed with -F):
int mysql_real_query(addr,string,ulong);
void garbage_collect(void);
int memcached_set(addr,string,ulong,string,ulong);
ltrace -x garbage_collect
ltrace -x mysql_real_query
mysql_real_query(0x1c9e0500, "SET NAMES 'UTF8'", 16) = 0 <0.000324>
mysql_real_query(0x1c9e0500, "SET SQL_AUTO_IS_NULL=0", 22) = 0 <0.000322>
mysql_real_query(0x19c7a500, "SELECT * FROM `users`", 21) = 0 <1.206506>
mysql_real_query(0x1c9e0500, "COMMIT", 6) = 0 <0.000181>
RUBY-PROF
fast code profiler for Ruby
 %self     total      self     child     calls  name
------------------------------------------------------------------------
  8.39      0.54      0.23      0.31       602  Array#each_index
  7.30      0.41      0.20      0.21      1227  Integer#gcd
  6.20      0.49      0.17      0.32      5760  Timecell#date
  5.11      0.15      0.14      0.01         1  Magick::Image#to_blob
KCachegrind
a tool for visualisation
http://kcachegrind.sourceforge.net
PERFTOOLS.RB
Google’s performance tools for Ruby code
gem install perftools.rb
CPUPROFILE=/tmp/myprof \
RUBYOPT="-r`gem which perftools | tail -1`" \
ruby my_app.rb
pprof --callgrind ./myapp /tmp/myprof
PERFTOOLS.RB
CPUPROFILE_REALTIME=1 (profile wall time instead of CPU time)
CPUPROFILE_OBJECTS=1 (profile object allocations)
CPUPROFILE_METHODS=1 (profile method calls)
When should I stop performance optimisation?
Premature optimization is the root of all evil
(c) Donald Knuth
Make it work
Make it right
Make it fast
What should I remember before profiling?
Turn GC off (GC.disable)
class Foo
  def do_smth
    "x" * 1024 # takes 1 KB of memory
  end
end
smth = "x" * 7999999 # allocates almost 8 MB
Foo.new.do_smth # GC is triggered here
do_smth looks so slow, but the time is really spent in GC!
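A minimal sketch of the advice above: time a block with GC switched off, then restore the previous GC state. The helper name measure_without_gc is hypothetical; GC.start, GC.disable, GC.enable and Benchmark.realtime are standard Ruby. Memory use can grow quickly while GC is disabled, so this is only meant for short measurements.

```ruby
require 'benchmark'

# Runs the block with GC disabled and returns the elapsed wall time in seconds.
def measure_without_gc
  GC.start                   # start from a freshly collected heap
  was_disabled = GC.disable  # returns true if GC was already disabled
  Benchmark.realtime { yield }
ensure
  GC.enable unless was_disabled # restore the previous GC state
end

class Foo
  def do_smth
    "x" * 1024
  end
end

elapsed = measure_without_gc { Foo.new.do_smth }
puts format("do_smth took %.6f s with GC disabled", elapsed)
```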
Questions?

Profiling and optimizing the performance of Ruby code
