High Performance Ruby
Tips, Techniques, and Futures
Monday, July 1, 13
Me
• Charles Oliver Nutter
• @headius
• Java developer since 1996
• JRuby developer since 2006
• Red Hat / JBoss polyglot ...
Is Ruby fast?
Is Ruby fast enough?
How fast do you need Ruby to be?
What Should We Optimize?
• Overall execution time?
• Memory use?
• Developer time?
• Developer happiness? :-)
Ruby can be fast... if you know how.
Strategies
• Use a better runtime
• Use more cores
• Write better code
Use a Better VM
Many Options
• Ruby 2.0
• Significant execution improvements
• JRuby
• Leveraging JVM more and more
• Rubinius
• Optimizing...
[Chart: Go Java Go! JRuby 1.0.3 running bm_red_black_tree.rb on Java 1.4, Java 5, Java 6, and Java 7 (axis 0 to 30). 300% for free.]
[Chart: Go JRuby Go! JRuby 1.0.3, 1.1.6, 1.4.0, 1.5.6, 1.6.8, and 1.7.0 running bm_red_black_tree.rb on OpenJDK 8 (axis 0 to 8). 8.2x improvement.]
rbtree Extension
• Pure Ruby version works everywhere
• C or Java extension FOR SPEED
• Oh really? ;-)
[Chart: red/black tree, pure Ruby versus native, runtime per iteration (axis 0 to 4). Bars: ruby-1.9.3 + Ruby, ruby-2.0.0 + Ruby, maglev + Ruby, macruby-0.12 + Ruby, rbx-2.0.0rc1 + Ruby, ruby-1.9.3 + C ext, ruby-2.0.0 + C ext, jruby + Ruby, jruby + Java ext.]
But How?
Dynamic Optimization
• Target method/value discovered at runtime
• Lookup is expensive
• We can cache it
• Cache has to be...
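The caching idea above can be sketched in plain Ruby. This is only a toy model, not how any real VM implements its call sites: CallSite is a hypothetical class, and it validates the cache by comparing the receiver's class only, whereas a real VM would also have to notice changes to the method table.

```ruby
# Toy model of a call-site method cache (hypothetical, for illustration only).
class CallSite
  attr_reader :lookups

  def initialize(method_name)
    @method_name = method_name
    @cached_class = nil    # type seen on the previous call
    @cached_method = nil   # method found for that type
    @lookups = 0           # number of slow-path lookups performed
  end

  def call(receiver, *args)
    klass = receiver.class
    # Validate the cache: only do the expensive lookup when the type changed.
    unless klass.equal?(@cached_class)
      @cached_method = klass.instance_method(@method_name)
      @cached_class = klass
      @lookups += 1
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = CallSite.new(:upcase)
site.call("a")     # String: slow path, fills the cache
site.call("b")     # String again: cache hit
site.call(:c)      # Symbol: the type changed, so the cache is refilled
puts site.lookups  # 2
```

Feeding the same call site a different type on every call would force a lookup each time, which is the cache-busting case discussed later in the deck.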
Method Caching
[Diagram: the call site obj.foo() in the VM and its target object, which is associated with FooClass and a method table holding def foo and def bar. VM operations shown in sequence: method lookup, branch, method cache. The looked-up def foo ends up at the call site.]
Constant Lookup
[Diagram: the access site for MY_CONST in the VM and the constant table. VM operations shown in sequence: locate value, bind permanently. The value ends up at the access site.]
Inlining
def foo; 1; end
def invoker; foo; end
i = 0
while i < 10000
  invoker
  i+=1
end
Inline foo into invoker
def invoker; 1; end
i = 0
while i < 10000
  invoker
  i+=1
end
Inline invoker into loop
i = 0
while i < 10000
  1
  i+=1
end
Value is transient
i = 0
while i < 10000
  i+=1
end
Loop does nothing
i = 10000
Variable i is never read
Use More Cores
It's a multi-core world
• Scaling today is horizontal, not vertical
• N processes does not cut it
• N users * X MB process...
True Parallelism
[Diagram: Ruby threads mapped onto native threads and CPU cores in use. Ruby 1.8.7: green threading, single thread. Ruby 2.0.0: native threads under a global lock. JRuby: real threading.]
Multicore in MRI
[Diagram: a separate 200MB MRI instance per core]
Ten instances * 200MB = 2GB
Multicore in JRuby
300MB JRuby
Instance
One instance across 10 threads = 300MB
# Single-threaded: walk the array ten times in sequence
require 'benchmark'
ary = (1..1000000).to_a
loop {
  puts Benchmark.measure {
    10.times {
      ary.each {|i|}
    }
  }
}
# Threaded: walk the array in ten threads at once
require 'benchmark'
ary = (1..1000000).to_a
loop {
  puts Benchmark.measure {
    (1..10).map {
      Thread.new {
        ary.each {|i|}
      }
    }.map(&:join)
  }
}
[Figure: Ruby 1.9 single thread, Ruby 1.9 multiple threads, JRuby single thread, JRuby multiple threads]
[Chart: Per-iteration time versus thread count for threaded_reverse, from one thread to four threads (axis 0.2s to 0.8s)]
Doing It Right
• Lock-free persistent data structures
• hamster et al
• Thread-safety utilities
• Mutex, Queue, thread_saf...
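As a rough illustration of the thread-safety utilities named above, the sketch below hands work to threads through a Queue and guards the one piece of shared state with a Mutex. The method name parallel_sum, the :done marker, and the worker count are made up for this example; on MRI the threads will still take turns under the global lock.

```ruby
require 'thread'  # Queue and Mutex; needed on older Rubies, harmless on current ones

# Sums the numbers using several worker threads (hypothetical helper).
def parallel_sum(numbers, worker_count = 4)
  queue = Queue.new
  numbers.each { |n| queue << n }
  worker_count.times { queue << :done }  # one stop marker per worker

  total = 0
  lock = Mutex.new

  workers = Array.new(worker_count) do
    Thread.new do
      local = 0                           # thread-local, needs no locking
      while (item = queue.pop) != :done
        local += item
      end
      lock.synchronize { total += local } # the only shared mutation
    end
  end

  workers.each(&:join)
  total
end

puts parallel_sum((1..100).to_a)
```

Each worker accumulates privately and takes the lock once, which keeps the synchronized section coarse and short.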
Finding Problems
• JRuby
• VM flags (heap/thread dumps, debug)
• Some of the best tools in the world
• Rubinius
• gdb, OS-l...
Write Better Code
Usual Suspects
• eval
• Exceptions as flow control
• Excessive allocation
• Defeating optimizations
• IO, DB, bad libraries
• VM flaw*
*I usually assume it's JRuby's fault until proven otherwise
eval
• Code never stays the same
• VM can't cache, can't see patterns
• No optimization is possible*
*Specific cases can so...
Fixing eval
• Evaluate code into a method and leave it
• Methods are stable, optimizable
• Pass dynamic state, rather than...
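A minimal sketch of the advice above, with hypothetical names (scale_with_eval, Scaler): the first method builds and evaluates fresh source text on every call, while the second evaluates its code once at load time and receives the dynamic values as ordinary arguments.

```ruby
# Slow pattern: new source text is built and evaluated on every call.
def scale_with_eval(value, factor)
  eval("#{value} * #{factor}")
end

# Suggested pattern: evaluate the code into a method once, up front,
# then pass the dynamic state in as arguments.
class Scaler
  class_eval <<~RUBY, __FILE__, __LINE__ + 1
    def scale(value, factor)
      value * factor
    end
  RUBY
end

scaler = Scaler.new
puts scale_with_eval(3, 4)
puts scaler.scale(3, 4)
```

Both calls compute the same product; only the second leaves the VM a stable method to cache and optimize.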
Exceptions
• Act like a special return value
• Construct object with information
• Capture call stack at raise point
• Unr...
def foo(a); raise; rescue; return a + 1; end
Shallow stack, 100k calls:
JRuby w/ exception: 7.7s
JRuby w/o exception: 0.00...
def foo(a); raise; rescue; return a + 1; end
Deep stack, 100k calls:
JRuby w/ exception: 200s
Ruby 2 w/ exception: 1.25s
R...
Exception Alternatives
• Pre-allocated exception object
• Empty backtrace passed to raise()
• Special return value
• Check...
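Two of the alternatives above can be sketched as follows. The names NOT_FOUND, find_even, and first_negative are hypothetical; the sketch shows a special return value checked by the caller and a catch/throw pair that leaves nested iteration without raising an exception.

```ruby
# Special return value: a sentinel the caller checks instead of rescuing.
NOT_FOUND = Object.new.freeze

def find_even(numbers)
  numbers.each { |n| return n if n.even? }
  NOT_FOUND
end

# catch/throw: unwinds to the matching catch block and returns the thrown value.
def first_negative(rows)
  catch(:found) do
    rows.each do |row|
      row.each { |n| throw :found, n if n < 0 }
    end
    nil  # value of the catch block when nothing was thrown
  end
end

result = find_even([1, 3, 5])
puts "no even number" if result.equal?(NOT_FOUND)
puts first_negative([[1, 2], [3, -4]])
```

The sentinel approach pushes a check onto each caller, which is the cost the slide points out.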
Allocation
• Literals
• "foo" creates object every time
• String + String, Array + Array
• Creates intermediate objects
• +...
Fixing Literals
• Constants are your friends
• Optimizes well on most impls
• Avoids literal churn
• Cache common interpol...
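One way to read the points above, as a sketch only: SEPARATOR, KEY_CACHE, cache_key, and join_names are hypothetical names. A frozen constant is allocated once, and a small Hash keeps interpolated strings that recur instead of rebuilding them. The Hash here is not synchronized, so threaded code would need a Mutex or a thread-safe map around it.

```ruby
# Allocated once, when the file loads.
SEPARATOR = ", ".freeze

# Cache of interpolated strings that come up repeatedly (hypothetical helper).
KEY_CACHE = Hash.new { |cache, id| cache[id] = "user:#{id}".freeze }

def cache_key(id)
  KEY_CACHE[id]
end

def join_names(names)
  names.join(SEPARATOR)
end

puts cache_key(7)
puts cache_key(7).equal?(cache_key(7))  # same object both times
puts join_names(%w[alice bob])
```

Whether this pays off depends on how often the same keys recur, which is where the memory profiles mentioned above come in.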
Fixing Concat/Copy
• Modify in place
• Thread-safety trade-offs...
• Use persistent structures
• "hamster" gem
• Google "i...
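The difference between copying and modifying in place can be sketched with two hypothetical helpers, build_with_plus and build_in_place. The first creates a new String on every iteration through +=; the second appends to a single mutable buffer with <<.

```ruby
# Copying: each += builds a brand new String and discards the old one.
def build_with_plus(parts)
  out = ""
  parts.each { |part| out += part }
  out
end

# In place: every << appends to the same String object.
def build_in_place(parts)
  out = String.new  # explicitly mutable buffer
  parts.each { |part| out << part }
  out
end

parts = %w[a b c]
puts build_with_plus(parts)
puts build_in_place(parts)
```

The in-place version is the one with thread-safety trade-offs: a buffer shared between threads would need synchronization, while the copying version never mutates shared data.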
Fixing Enum Chaining
• Condense into fewer steps
• Lazy Enumerator in 2.0
• Just use a loop :-)
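The three options above might look like this. The method names are made up for the example; each one doubles the numbers, keeps the multiples of three, and stops after two results.

```ruby
# Chained: map and select each allocate an intermediate array.
def chained(numbers)
  numbers.map { |n| n * 2 }.select { |n| n % 3 == 0 }.first(2)
end

# Lazy (Ruby 2.0+): elements flow through the chain one at a time.
def lazily(numbers)
  numbers.lazy.map { |n| n * 2 }.select { |n| n % 3 == 0 }.first(2)
end

# Plain loop: one pass, no intermediate collections.
def looped(numbers)
  result = []
  numbers.each do |n|
    doubled = n * 2
    next unless doubled % 3 == 0
    result << doubled
    break if result.size == 2
  end
  result
end

numbers = (1..20).to_a
p chained(numbers)
p lazily(numbers)
p looped(numbers)
```

All three return the same array; they differ in how many temporary objects are created along the way.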
Defeating Optimization
• Caching and inlining are key to perf
• If we can't cache...
• Methods won't inline, won't optimiz...
Method Cache Busting
• VM must ensure cache is correct
• Check type
• Ensure method table is the same
• New type every tim...
Dynamic Invocation
[Diagram, repeated over ten builds: the call site obj.foo() in the VM, its target object associated with FooClass, and a method table holding def foo and def bar. VM operations: method lookup, branch, method cache.]
Singletons
• Creates new types at runtime
• Impossible to cache based on type
• Usually defines new methods
• Method table ...
Object#extend
• Includes module into single object
• New one-off type every time
• Class hierarchy keeps changing
foo.exte...
static VALUE
io_getpartial(int argc, VALUE *argv, VALUE io, int nonblock)
{
...
n = rb_read_internal(fptr->fd, RSTRING_PTR...
void
rb_mod_sys_fail(VALUE mod, const char *mesg)
{
    VALUE exc = make_errno_exc(mesg);
    rb_extend_object(exc, mod);  /* extends each new exception object with mod */
    rb_exc_raise(exc);
}
Fixing Singletons/#extend
• Functional patterns
• FooLibrary.process(obj) rather than obj.extend FooLibrary; obj.process
...
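The functional pattern above can be sketched like this. FooLibrary comes from the slide, but its body and the two wrapper methods are invented for illustration: one wrapper extends the object before calling process, the other passes the object to a module-level function and leaves its type alone.

```ruby
module FooLibrary
  # Functional entry point: takes the object as an argument.
  def self.process(obj)
    obj.to_s.upcase
  end

  # Mixin entry point, used only by the extend variant below.
  def process
    FooLibrary.process(self)
  end
end

# Extend variant: gives this one object a one-off type before the call.
def process_with_extend(obj)
  obj.extend(FooLibrary)
  obj.process
end

# Functional variant: the object's type is left untouched.
def process_functionally(obj)
  FooLibrary.process(obj)
end

extended = String.new("abc")
plain = String.new("abc")
puts process_with_extend(extended)
puts process_functionally(plain)
puts extended.is_a?(FooLibrary)  # true: this object now has its own type
puts plain.is_a?(FooLibrary)     # false: still an ordinary String
```

Both variants produce the same result, but only the functional one keeps every receiver an ordinary String that call sites can cache against.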
Constant Lookup
• Constants in tables on classes/modules
• Usually assigned only once, at load time
• Lookup is expensive,...
Constant Cache
• Constant search proceeds two ways
• First, lexical scoping
• Second, class hierarchy
• Invalidation happe...
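The two-step search above can be observed from plain Ruby. In this sketch the module and class names are made up: one class is defined lexically inside Outer, the other is defined from the top level with a qualified name, so only the first has Outer in its lexical scope.

```ruby
module Outer
  NAME = "lexical"

  class Base
    NAME = "inherited"
  end

  # Lexical scopes here are Outer::Nested, then Outer.
  class Nested < Base
    def label
      NAME  # found in Outer before the class hierarchy is consulted
    end
  end
end

# Defined from the top level: Outer is not a lexical scope here.
class Outer::Reopened < Outer::Base
  def label
    NAME  # not found lexically, so the search falls back to Base
  end
end

puts Outer::Nested.new.label    # lexical
puts Outer::Reopened.new.label  # inherited
```

The same constant name resolves differently depending on where the code was written, which is part of why a VM has to treat constant caches with care.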
Constant Cache Busting
• Redefining constants
• Introducing new lexical scopes
• Classes created at runtime
• Evaluated cod...
Fixing Constants
• Don't modify them
• i.e. CONSTANT
• Avoid runtime class hierarchy changes
How to Get Help
Performance Issues
• Assume nothing...most can be fixed
• Isolate bad code, as small a case as possible
• Use VM tools to monit...
Concurrency Issues
• Avoid mutable state
• Synchronize mutations
• Start coarse-grained, get finer over time
• VM tooling t...
Ruby can be fast... and we want to help you.
Thank You!
• Charles Oliver Nutter
• @headius
• http://jruby.org
• http://blog.headius.com
• Book: "Using JRuby"
• Book: "D...
High Performance Ruby - E4E Conference 2013

A presentation on how JRuby is making Ruby faster, along with some tricks for all Rubyists to speed up their code.

Published in: Technology, Education

High Performance Ruby - E4E Conference 2013

  1. 1. High Performance Ruby Tips, Techniques, and Futures Monday, July 1, 13
  2. 2. Me • Charles Oliver Nutter • @headius • Java developer since 1996 • JRuby developer since 2006 • Red Hat / JBoss polyglot group Monday, July 1, 13
  3. 3. Is Ruby fast? Monday, July 1, 13
  4. 4. Is Ruby fast enough? Monday, July 1, 13
  5. 5. How fast do you need Ruby to be? Monday, July 1, 13
  6. 6. What Should We Optimize? • Overall execution time? • Memory use? • Developer time? • Developer happiness? :-) Monday, July 1, 13
  7. 7. Ruby can be fast... if you know how. Monday, July 1, 13
  8. 8. Strategies • Use a better runtime • Use more cores • Write better code Monday, July 1, 13
  9. 9. Use a Better VM Monday, July 1, 13
  10. 10. Many Options • Ruby 2.0 • Significant execution improvements • JRuby • Leveraging JVM more and more • Rubinius • Optimizing VM built for Ruby Monday, July 1, 13
  11. 11. 0 7.5 15 22.5 30 Java 1.4 Java 5 Java 6 Java 7 Go Java Go! JRuby 1.0.3 (bm_red_black_tree.rb) 300% for free Monday, July 1, 13
  12. 12. 0 2 4 6 8 1.0.3 1.1.6 1.4.0 1.5.6 1.6.8 1.7.0 OpenJDK 8 (bm_red_black_tree.rb) Go JRuby Go! 8.2x Improvement Monday, July 1, 13
  13. 13. rbtree Extension • Pure Ruby version works everywhere • C or Java extension FOR SPEED • Oh really? ;-) Monday, July 1, 13
  14. 14. Monday, July 1, 13
  15. 15. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  16. 16. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  17. 17. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  18. 18. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  19. 19. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  20. 20. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 0.51 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  21. 21. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 0.51 0.51 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  22. 22. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 0.51 0.51 0.51 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  23. 23. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 0.51 0.51 0.51 0.29 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  24. 24. 0 1 2 3 4 ruby-1.9.3 + Ruby ruby-2.0.0 + Ruby maglev + Ruby macruby-0.12 + Ruby rbx-2.0.0rc1 + Ruby ruby-1.9.3 + C ext ruby-2.0.0 + C ext jruby + Ruby jruby + Java ext 3.96 2.48 1.39 1.19 0.51 0.51 0.51 0.29 0.1 red/black tree, pure Ruby versus native Runtime per iteration Monday, July 1, 13
  25. 25. But How? Monday, July 1, 13
  26. 26. Dynamic Optimization • Target method/value discovered at runtime • Lookup is expensive • We can cache it • Cache has to be validated • Indirection hurts pipeline • Inline methods/values at access point Monday, July 1, 13
  27. 27. Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM method table Monday, July 1, 13
  28. 28. VM Operations Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  29. 29. VM Operations Method Lookup Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  30. 30. VM Operations Method Lookup Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  31. 31. VM Operations Method Lookup Branch Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  32. 32. VM Operations Method Lookup Branch Method Cache Method Caching Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  33. 33. Constant Lookup Constant Table MY_CONST VM Monday, July 1, 13
  34. 34. VM Operations Constant Lookup Constant Table MY_CONST VM Access Site Monday, July 1, 13
  35. 35. VM Operations Locate Value Constant Lookup Constant Table MY_CONST VM Access Site value Monday, July 1, 13
  36. 36. VM Operations Locate Value Bind Permanently Constant Lookup Constant Table MY_CONST VM Access Site value Monday, July 1, 13
  37. 37. def foo; 1; end def invoker; foo; end i = 0 while i < 10000   invoker   i+=1 end Inlining Monday, July 1, 13
  38. 38. def invoker; 1; end i = 0 while i < 10000   invoker   i+=1 end Inline foo into invoker Monday, July 1, 13
  39. 39. i = 0 while i < 10000   1   i+=1 end Inline invoker into loop Monday, July 1, 13
  40. 40. i = 0 while i < 10000   i+=1 end Value is transient Monday, July 1, 13
  41. 41. i = 10000 Loop does nothing Monday, July 1, 13
  42. 42. Variable i is never read Monday, July 1, 13
  43. 43. Use More Cores Monday, July 1, 13
  44. 44. It's a multi-core world • Scaling today is horizontal, not vertical • N processes does not cut it • N users * X MB process = $$$ • CoW is only a partial band-aid • Non-parallel impls are falling behind • JRuby, Rubinius your only real options Monday, July 1, 13
  45. 45. True Parallelism Ruby Threads Native Threads CPU Cores in Use Monday, July 1, 13
  46. 46. True Parallelism Ruby Threads Native Threads Ruby 1.8.7 Green Threading CPU Cores in Use Single Thread Monday, July 1, 13
  47. 47. True Parallelism Ruby Threads Native Threads Ruby 1.8.7 Ruby 2.0.0 Green Threading CPU Cores in Use Global Lock Single Thread Monday, July 1, 13
  48. 48. True Parallelism Ruby Threads Native Threads Ruby 1.8.7 Ruby 2.0.0 Green Threading CPU Cores in Use JRuby Global Lock Single Thread Real Threading Monday, July 1, 13
  49. 49. Multicore in MRI 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance 200MB MRI Instance Ten instances * 200MB = 2GB Monday, July 1, 13
  50. 50. Multicore in JRuby 300MB JRuby Instance One instance across 10 threads = 300MB Monday, July 1, 13
  51. 51. require 'benchmark' ary = (1..1000000).to_a loop {   puts Benchmark.measure {     10.times {       ary.each {|i|}     }   } } Monday, July 1, 13
  52. 52. require 'benchmark' ary = (1..1000000).to_a loop {   puts Benchmark.measure {     (1..10).map {       Thread.new {         ary.each {|i|}       }     }.map(&:join)   } } Monday, July 1, 13
  53. 53. Monday, July 1, 13
  54. 54. Ruby 1.9 single thread JRuby single thread Monday, July 1, 13
  55. 55. Ruby 1.9 single thread Ruby 1.9 multiple threads JRuby single thread JRuby multiple threads Monday, July 1, 13
  56. 56. 0.2s 0.35s 0.5s 0.65s 0.8s one thread two threads three threads four threads Per-iteration time versus thread count threaded_reverse Monday, July 1, 13
  57. 57. Doing It Right • Lock-free persistent data structures • hamster et al • Thread-safety utilities • Mutex, Queue, thread_safe + atomic gems • Threaded servers • puma, trinidad, torquebox, JVM servers Monday, July 1, 13
  58. 58. Finding Problems • JRuby • VM flags (heap/thread dumps, debug) • Some of the best tools in the world • Rubinius • gdb, OS-level tools • #rubinius Monday, July 1, 13
  59. 59. Write Better Code Monday, July 1, 13
  60. 60. • eval • Exceptions as flow control • Excessive allocation • Defeating optimizations • IO, DB, bad libraries • VM flaw* Usual Suspects *I usually assume it's JRuby's fault until proven otherwise Monday, July 1, 13
  61. 61. eval • Code never stays the same • VM can't cache, can't see patterns • No optimization is possible* *Specific cases can sometimes be cached and optimized Monday, July 1, 13
  62. 62. Fixing eval • Evaluate code into a method and leave it • Methods are stable, optimizable • Pass dynamic state, rather than interpolate • Branches are cheaper than new code • Do all evaluation up front • ...not during your app's hot path Monday, July 1, 13
  63. 63. Exceptions • Act like a special return value • Construct object with information • Capture call stack at raise point • Unroll call stack until rescued • Overhead ranges from big to huge • Especially costly on optimizing VMs Monday, July 1, 13
  64. 64. def foo(a); raise; rescue; return a + 1; end Shallow stack, 100k calls: JRuby w/ exception: 7.7s JRuby w/o exception: 0.004s Ruby 2 w/ exception: 0.25s Ruby 2 w/o exception: 0.009s Rubinius w/ exception: 0.1s Rubinius w/o exception: 0.002s Monday, July 1, 13
  65. 65. def foo(a); raise; rescue; return a + 1; end Deep stack, 100k calls: JRuby w/ exception: 200s Ruby 2 w/ exception: 1.25s Rubinius w/ exception: 7.7s Monday, July 1, 13
  66. 66. Exception Alternatives • Pre-allocated exception object • Empty backtrace passed to raise() • Special return value • Check at each caller • catch/throw • Avoids most overhead Monday, July 1, 13
  67. 67. Allocation • Literals • "foo" creates object every time • String + String, Array + Array • Creates intermediate objects • += is especially wasteful • Slicing and enumerating • ary.map{}.select{}.inject{}.find = 3 arrays Monday, July 1, 13
  68. 68. Fixing Literals • Constants are your friends • Optimizes well on most impls • Avoids literal churn • Cache common interpolated values • Study memory profiles Monday, July 1, 13
  69. 69. Fixing Concat/Copy • Modify in place • Thread-safety trade-offs... • Use persistent structures • "hamster" gem • Google "immutable ruby" Monday, July 1, 13
  70. 70. Fixing Enum Chaining • Condense into fewer steps • Lazy Enumerator in 2.0 • Just use a loop :-) Monday, July 1, 13
  71. 71. Defeating Optimization • Caching and inlining are key to perf • If we can't cache... • Methods won't inline, won't optimize • Constants must be looked up every time • We have less time for real work Monday, July 1, 13
  72. 72. Method Cache Busting • VM must ensure cache is correct • Check type • Ensure method table is the same • New type every time? No caching. • Modify method table? No caching. Monday, July 1, 13
  73. 73. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  74. 74. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  75. 75. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  76. 76. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  77. 77. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  78. 78. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  79. 79. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  80. 80. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  81. 81. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM Call Site method table Monday, July 1, 13
  82. 82. VM Operations Method Lookup Branch Method Cache Dynamic Invocation Target Object FooClass def foo ... def bar ... associated with obj.foo() VM def foo ... Call Site method table Monday, July 1, 13
  83. 83. Singletons • Creates new types at runtime • Impossible to cache based on type • Usually defines new methods • Method table is always different class << foo ... def foo.bar ... Monday, July 1, 13
  84. 84. Object#extend • Includes module into single object • New one-off type every time • Class hierarchy keeps changing foo.extend Enumerable Monday, July 1, 13
  85. 85. static VALUE io_getpartial(int argc, VALUE *argv, VALUE io, int nonblock) { ... n = rb_read_internal(fptr->fd, RSTRING_PTR(str), len); rb_str_unlocktmp(str); if (n < 0) { if (!nonblock && rb_io_wait_readable(fptr->fd)) goto again; if (nonblock && (errno == EWOULDBLOCK || errno == EAGAIN)) rb_mod_sys_fail(rb_mWaitReadable, "read would block"); rb_sys_fail_path(fptr->pathv); } ... } Monday, July 1, 13
  86. 86. static VALUE io_getpartial(int argc, VALUE *argv, VALUE io, int nonblock) { ... n = rb_read_internal(fptr->fd, RSTRING_PTR(str), len); rb_str_unlocktmp(str); if (n < 0) { if (!nonblock && rb_io_wait_readable(fptr->fd)) goto again; if (nonblock && (errno == EWOULDBLOCK || errno == EAGAIN)) rb_mod_sys_fail(rb_mWaitReadable, "read would block"); rb_sys_fail_path(fptr->pathv); } ... } Monday, July 1, 13
  87. 87. void rb_mod_sys_fail(VALUE mod, const char *mesg) { VALUE exc = make_errno_exc(mesg); rb_extend_object(exc, mod); rb_exc_raise(exc); } Monday, July 1, 13
  88. 88. void rb_mod_sys_fail(VALUE mod, const char *mesg) { VALUE exc = make_errno_exc(mesg); rb_extend_object(exc, mod); rb_exc_raise(exc); } Monday, July 1, 13
  89. 89. Fixing Singletons/ #extend • Functional patterns • FooLibrary.process(obj) rather than obj.extend FooLibrary; obj.process • Create types up front (programmatically?) • 1000 predefined types beats infinite types Monday, July 1, 13
  90. 90. Monday, July 1, 13
  91. 91. Monday, July 1, 13
  92. 92. Constant Lookup • Constants in tables on classes/modules • Usually assigned only once, at load time • Lookup is expensive, like methods • Values can be cached Monday, July 1, 13
  93. 93. Constant Cache • Constant search proceeds two ways • First, lexical scoping • Second, class hierarchy • Invalidation happens globally Monday, July 1, 13
  94. 94. Constant Cache Busting • Redefining constants • Introducing new lexical scopes • Classes created at runtime • Evaluated code • Altering class hierarchies • Lookup results may change...no caching Monday, July 1, 13
  95. 95. Fixing Constants • Don't modify them • i.e. CONSTANT • Avoid runtime class hierarchy changes Monday, July 1, 13
  96. 96. How to Get Help Monday, July 1, 13
  97. 97. Performance Issues • Assume nothing...most can be fixed • Isolate bad code, as small a case as possible • Use VM tools to monitor caches • Fix if it's your bug, PR if it's a library • Come to us for help or if it's a VM bug • Repeat... Monday, July 1, 13
  98. 98. Concurrency Issues • Avoid mutable state • Synchronize mutations • Start coarse-grained, get finer over time • VM tooling to monitor locks, contention • Contact VM authors for help Monday, July 1, 13
  99. 99. Monday, July 1, 13
  100. 100. Ruby can be fast...and we want to help you. Monday, July 1, 13
  101. 101. Thank You! • Charles Oliver Nutter • @headius • http://jruby.org • http://blog.headius.com • Book: "Using JRuby" • Book: "Deploying JRuby" Monday, July 1, 13
