High Performance Ruby - E4E Conference 2013
A presentation on how JRuby is making Ruby faster, along with some tricks for all Rubyists to speed up their code.


Usage Rights

CC Attribution-NonCommercial-ShareAlike License

  • @gandralf Now where did you get that? The presentation quite clearly indicates the opposite. Did you even watch the presentation?
  • Do you want to speed up ruby?
    Use something else.
    Simple like that.
High Performance Ruby - E4E Conference 2013 Presentation Transcript

  • 1. High Performance Ruby • Tips, Techniques, and Futures • Monday, July 1, 13
  • 2. Me • Charles Oliver Nutter • @headius • Java developer since 1996 • JRuby developer since 2006 • Red Hat / JBoss polyglot group
  • 3. Is Ruby fast?
  • 4. Is Ruby fast enough?
  • 5. How fast do you need Ruby to be?
  • 6. What Should We Optimize? • Overall execution time? • Memory use? • Developer time? • Developer happiness? :-)
  • 7. Ruby can be fast... if you know how.
  • 8. Strategies • Use a better runtime • Use more cores • Write better code
  • 9. Use a Better VM
  • 10. Many Options • Ruby 2.0: significant execution improvements • JRuby: leveraging the JVM more and more • Rubinius: optimizing VM built for Ruby
  • 11. Go Java Go! • Chart: JRuby 1.0.3 running bm_red_black_tree.rb on Java 1.4, Java 5, Java 6, and Java 7 (axis 0 to 30) • 300% for free
  • 12. Go JRuby Go! • Chart: bm_red_black_tree.rb on OpenJDK 8 with JRuby 1.0.3, 1.1.6, 1.4.0, 1.5.6, 1.6.8, and 1.7.0 (axis 0 to 8) • 8.2x improvement
  • 13. rbtree Extension • Pure Ruby version works everywhere • C or Java extension FOR SPEED • Oh really? ;-)
  • 15-24. Chart, built up one bar per slide: red/black tree, pure Ruby versus native • Runtime per iteration (axis 0 to 4); values paired with labels in the order they appear on the slides • ruby-1.9.3 + Ruby: 3.96 • ruby-2.0.0 + Ruby: 2.48 • maglev + Ruby: 1.39 • macruby-0.12 + Ruby: 1.19 • rbx-2.0.0rc1 + Ruby: 0.51 • ruby-1.9.3 + C ext: 0.51 • ruby-2.0.0 + C ext: 0.51 • jruby + Ruby: 0.29 • jruby + Java ext: 0.1
  • 25. But How?
  • 26. Dynamic Optimization • Target method/value discovered at runtime • Lookup is expensive • We can cache it • Cache has to be validated • Indirection hurts pipeline • Inline methods/values at access point
  • 27-32. Method Caching (diagram, built up over six slides) • Call site: obj.foo() • Target object associated with FooClass • FooClass method table: def foo ..., def bar ... • VM operations: method lookup, branch, method cache • Result: def foo ... cached at the call site
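
The call-site caching shown in the Method Caching diagram can be sketched in plain Ruby. This is only a toy model under stated assumptions, not how JRuby or any other VM implements it: `CallSite` and its `lookups` counter are hypothetical names, and a real VM would also check that the method table has not changed.

```ruby
# Toy model of a call-site method cache (CallSite is a hypothetical class).
class CallSite
  attr_reader :lookups

  def initialize(name)
    @name = name
    @cached_class = nil
    @cached_method = nil
    @lookups = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass.equal?(@cached_class)
      # Slow path: full method lookup, then remember the result.
      @cached_method = klass.instance_method(@name)
      @cached_class = klass
      @lookups += 1
    end
    # Fast path: a cheap type check guards the cached method.
    @cached_method.bind(receiver).call(*args)
  end
end

class FooClass
  def foo; 1; end
end

site = CallSite.new(:foo)
obj = FooClass.new
results = Array.new(1000) { site.call(obj) }
```

After 1000 calls with the same receiver type, the lookup has run only once; passing a receiver of a different class would force another lookup.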
  • 33-36. Constant Lookup (diagram, built up over four slides) • Access site: MY_CONST • Constant table • VM operations: locate value, bind permanently • Result: value bound at the access site
  • 37. Inlining
        def foo; 1; end
        def invoker; foo; end
        i = 0
        while i < 10000
          invoker
          i += 1
        end
  • 38. Inline foo into invoker
        def invoker; 1; end
        i = 0
        while i < 10000
          invoker
          i += 1
        end
  • 39. Inline invoker into loop
        i = 0
        while i < 10000
          1
          i += 1
        end
  • 40. Value is transient
        i = 0
        while i < 10000
          i += 1
        end
  • 41. Loop does nothing
        i = 10000
  • 42. Variable i is never read
  • 43. Use More Cores
  • 44. It's a multi-core world • Scaling today is horizontal, not vertical • N processes does not cut it • N users * X MB process = $$$ • CoW is only a partial band-aid • Non-parallel impls are falling behind • JRuby, Rubinius your only real options
  • 45-48. True Parallelism (diagram, built up over four slides) • Layers: Ruby threads, native threads, CPU cores in use • Ruby 1.8.7: green threading, single thread • Ruby 2.0.0: global lock • JRuby: real threading
  • 49. Multicore in MRI • One 200MB MRI instance per process • Ten instances * 200MB = 2GB
  • 50. Multicore in JRuby • 300MB JRuby instance • One instance across 10 threads = 300MB
  • 51.
        require 'benchmark'
        ary = (1..1000000).to_a
        loop {
          puts Benchmark.measure {
            10.times {
              ary.each {|i|}
            }
          }
        }
  • 52.
        require 'benchmark'
        ary = (1..1000000).to_a
        loop {
          puts Benchmark.measure {
            (1..10).map {
              Thread.new {
                ary.each {|i|}
              }
            }.map(&:join)
          }
        }
  • 54-55. Chart labels: Ruby 1.9 single thread • Ruby 1.9 multiple threads • JRuby single thread • JRuby multiple threads
  • 56. Chart: threaded_reverse, per-iteration time versus thread count • one thread, two threads, three threads, four threads (axis 0.2s to 0.8s)
  • 57. Doing It Right • Lock-free persistent data structures • hamster et al • Thread-safety utilities • Mutex, Queue, thread_safe + atomic gems • Threaded servers • puma, trinidad, torquebox, JVM servers
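
As a minimal sketch of the thread-safety utilities named on this slide, the example below uses only the standard library's Mutex and Queue. The variable names and the `:done` sentinel are illustrative assumptions, not something from the talk.

```ruby
require 'thread' # Mutex and Queue; built in on modern Rubies, harmless to require

# Mutex: guard a shared counter so no increments are lost.
counter = 0
lock = Mutex.new

threads = 4.times.map do
  Thread.new do
    1000.times do
      # Synchronize the read-modify-write on the shared variable.
      lock.synchronize { counter += 1 }
    end
  end
end
threads.each(&:join)

# Queue: thread-safe hand-off between producer and workers.
jobs = Queue.new
results = Queue.new

workers = 2.times.map do
  Thread.new do
    # :done is a sentinel (an assumption of this sketch) that stops a worker.
    while (n = jobs.pop) != :done
      results << n * n
    end
  end
end

10.times { |n| jobs << n }
2.times { jobs << :done }
workers.each(&:join)

squares = []
squares << results.pop until results.empty?
```

On an implementation with a global lock the threads still produce correct results; they just do not run Ruby code in parallel.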
  • 58. Finding Problems • JRuby • VM flags (heap/thread dumps, debug) • Some of the best tools in the world • Rubinius • gdb, OS-level tools • #rubinius
  • 59. Write Better Code
  • 60. Usual Suspects • eval • Exceptions as flow control • Excessive allocation • Defeating optimizations • IO, DB, bad libraries • VM flaw* • *I usually assume it's JRuby's fault until proven otherwise
  • 61. eval • Code never stays the same • VM can't cache, can't see patterns • No optimization is possible* • *Specific cases can sometimes be cached and optimized
  • 62. Fixing eval • Evaluate code into a method and leave it • Methods are stable, optimizable • Pass dynamic state, rather than interpolate • Branches are cheaper than new code • Do all evaluation up front • ...not during your app's hot path
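
The "evaluate into a method and leave it" advice might look like the sketch below. `slow_scale` and `Scaler` are hypothetical names invented for the comparison.

```ruby
# Slow pattern: build and eval a new code string on every call.
def slow_scale(value, factor)
  eval("#{value} * #{factor}")
end

# Better: evaluate once, up front, into a real method, and pass the
# dynamic state as arguments instead of interpolating it into code.
class Scaler
  class_eval <<-RUBY, __FILE__, __LINE__ + 1
    def scale(value, factor)
      value * factor
    end
  RUBY
end

scaler = Scaler.new
fast = (1..5).map { |v| scaler.scale(v, 3) }
slow = (1..5).map { |v| slow_scale(v, 3) }
```

Both versions compute the same values, but only `Scaler#scale` gives the VM a stable method body to cache and optimize.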
  • 63. Exceptions • Act like a special return value • Construct object with information • Capture call stack at raise point • Unroll call stack until rescued • Overhead ranges from big to huge • Especially costly on optimizing VMs
  • 64. def foo(a); raise; rescue; return a + 1; end • Shallow stack, 100k calls: • JRuby w/ exception: 7.7s • JRuby w/o exception: 0.004s • Ruby 2 w/ exception: 0.25s • Ruby 2 w/o exception: 0.009s • Rubinius w/ exception: 0.1s • Rubinius w/o exception: 0.002s
  • 65. def foo(a); raise; rescue; return a + 1; end • Deep stack, 100k calls: • JRuby w/ exception: 200s • Ruby 2 w/ exception: 1.25s • Rubinius w/ exception: 7.7s
  • 66. Exception Alternatives • Pre-allocated exception object • Empty backtrace passed to raise() • Special return value • Check at each caller • catch/throw • Avoids most overhead
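
Two of the alternatives listed on this slide can be sketched as follows. The method names, the `NotReady` class, and the sample data are assumptions made for illustration; how much each approach saves depends on the implementation.

```ruby
# catch/throw: a non-local exit that does not build an exception object.
def find_first_negative(rows)
  catch(:found) do
    rows.each do |row|
      row.each { |n| throw :found, n if n < 0 }
    end
    nil # nothing found
  end
end

# raise with an explicit empty backtrace as its third argument.
class NotReady < StandardError; end

def poll(ready)
  raise NotReady, "not ready", [] unless ready
  :ok
end

first_negative = find_first_negative([[1, 2], [3, -4, -5]])
none = find_first_negative([[1], [2]])

outcome =
  begin
    poll(false)
  rescue NotReady => e
    e.class
  end
```

A special return value checked by each caller, the third option on the slide, needs no extra machinery at all.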
  • 67. Allocation • Literals • "foo" creates object every time • String + String, Array + Array • Creates intermediate objects • += is especially wasteful • Slicing and enumerating • ary.map{}.select{}.inject{}.find = 3 arrays
  • 68. Fixing Literals • Constants are your friends • Optimizes well on most impls • Avoids literal churn • Cache common interpolated values • Study memory profiles
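
A small sketch of the constant and caching advice above; `GREETING`, `LABELS`, and the method names are hypothetical.

```ruby
# A frozen constant is allocated once and reused by every caller.
GREETING = "hello".freeze

def greeting
  GREETING
end

# Cache common interpolated values instead of rebuilding them per call.
LABELS = Hash.new { |cache, id| cache[id] = "user-#{id}".freeze }

def label_for(id)
  LABELS[id]
end

first = greeting
second = greeting
label_a = label_for(42)
label_b = label_for(42)
```

Each pair of calls returns the very same object, so the hot path allocates nothing. The hash cache in this sketch is not synchronized, so under real threading it would need a lock or a thread-safe map.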
  • 69. Fixing Concat/Copy • Modify in place • Thread-safety trade-offs... • Use persistent structures • "hamster" gem • Google "immutable ruby"
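
The difference between copying and modifying in place can be sketched with core String and Array methods; the variable names are illustrative.

```ruby
words = %w[a b c d]

# += builds a brand-new String on every iteration.
copied = ""
words.each { |w| copied += w }

# << appends to the same String in place.
appended = "".dup
words.each { |w| appended << w }

# Same idea for arrays: concat mutates the receiver, + would copy.
list = [1, 2]
list_id = list.object_id
list.concat([3, 4])
```

In-place mutation is where the thread-safety trade-off from the slide appears: a shared string or array that is modified in place needs synchronization.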
  • 70. Fixing Enum Chaining • Condense into fewer steps • Lazy Enumerator in 2.0 • Just use a loop :-)
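
The three options on this slide, sketched side by side on hypothetical data (the first three even squares of 1..20):

```ruby
numbers = (1..20).to_a

# Chained: map and select each build an intermediate array.
chained = numbers.map { |n| n * n }.select(&:even?).first(3)

# Lazy enumerator (Ruby 2.0+): elements flow through one at a time.
lazy = numbers.lazy.map { |n| n * n }.select(&:even?).first(3)

# Plain loop: one pass, one result array, early exit.
looped = []
numbers.each do |n|
  square = n * n
  next unless square.even?
  looped << square
  break if looped.size == 3
end
```

All three produce the same result; they differ in how many intermediate objects they create along the way.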
  • 71. Defeating Optimization • Caching and inlining are key to perf • If we can't cache... • Methods won't inline, won't optimize • Constants must be looked up every time • We have less time for real work
  • 72. Method Cache Busting • VM must ensure cache is correct • Check type • Ensure method table is the same • New type every time? No caching. • Modify method table? No caching.
  • 73-82. Dynamic Invocation (animated diagram, same elements as Method Caching) • Call site: obj.foo() • Target object associated with FooClass • FooClass method table: def foo ..., def bar ... • VM operations: method lookup, branch, method cache • The cached def foo ... at the call site appears and disappears on alternating slides
  • 83. Singletons • Creates new types at runtime • Impossible to cache based on type • Usually defines new methods • Method table is always different • class << foo ... • def foo.bar ...
  • 84. Object#extend • Includes module into single object • New one-off type every time • Class hierarchy keeps changing • foo.extend Enumerable
  • 85-86.
        static VALUE
        io_getpartial(int argc, VALUE *argv, VALUE io, int nonblock)
        {
            ...
            n = rb_read_internal(fptr->fd, RSTRING_PTR(str), len);
            rb_str_unlocktmp(str);
            if (n < 0) {
                if (!nonblock && rb_io_wait_readable(fptr->fd))
                    goto again;
                if (nonblock && (errno == EWOULDBLOCK || errno == EAGAIN))
                    rb_mod_sys_fail(rb_mWaitReadable, "read would block");
                rb_sys_fail_path(fptr->pathv);
            }
            ...
        }
  • 87-88.
        void
        rb_mod_sys_fail(VALUE mod, const char *mesg)
        {
            VALUE exc = make_errno_exc(mesg);
            rb_extend_object(exc, mod);
            rb_exc_raise(exc);
        }
  • 89. Fixing Singletons/#extend • Functional patterns • FooLibrary.process(obj) rather than obj.extend FooLibrary; obj.process • Create types up front (programmatically?) • 1000 predefined types beats infinite types
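
The slide's `FooLibrary.process(obj)` suggestion, sketched with a made-up `process` body (upcasing a string) and a hypothetical `ProcessableString` type defined up front:

```ruby
module FooLibrary
  # Functional style: operate on the object without extending it.
  def self.process(obj)
    obj.to_s.upcase
  end

  # Mixin style, shown only for comparison with obj.extend FooLibrary.
  def process
    to_s.upcase
  end
end

# A type created once, up front, instead of a one-off type per object.
class ProcessableString < String
  include FooLibrary
end

plain = "abc"
extended = "abc".dup
extended.extend(FooLibrary) # gives this one object its own singleton type

functional = FooLibrary.process(plain)
via_extend = extended.process
via_type = ProcessableString.new("abc").process
```

All three calls return the same value, but only the `extend` version creates a new type at runtime, which is what defeats type-based method caching.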
  • 92. Constant Lookup • Constants in tables on classes/modules • Usually assigned only once, at load time • Lookup is expensive, like methods • Values can be cached
  • 93. Constant Cache • Constant search proceeds two ways • First, lexical scoping • Second, class hierarchy • Invalidation happens globally
  • 94. Constant Cache Busting • Redefining constants • Introducing new lexical scopes • Classes created at runtime • Evaluated code • Altering class hierarchies • Lookup results may change...no caching
  • 95. Fixing Constants • Don't modify them • i.e. CONSTANT • Avoid runtime class hierarchy changes
  • 96. How to Get Help
  • 97. Performance Issues • Assume nothing...most can be fixed • Isolate bad code, as small a case as possible • Use VM tools to monitor caches • Fix if it's your bug, PR if it's a library • Come to us for help or if it's a VM bug • Repeat...
  • 98. Concurrency Issues • Avoid mutable state • Synchronize mutations • Start coarse-grained, get finer over time • VM tooling to monitor locks, contention • Contact VM authors for help
  • 100. Ruby can be fast...and we want to help you.
  • 101. Thank You! • Charles Oliver Nutter • @headius • http://jruby.org • http://blog.headius.com • Book: "Using JRuby" • Book: "Deploying JRuby"