Applying Compiler
                                Technology to Ruby
                                      Evan Phoenix
  ...
What makes Ruby great
                                 can make Ruby slow.


Wednesday, September 16, 2009
‣ Highly Dynamic




Wednesday, September 16, 2009
‣ Highly Dynamic
                                • Very high level operations
                                • New code c...
Haven’t other languages
                                had these same features/
                                      wea...
‣Prior Work




Wednesday, September 16, 2009
‣Prior Work
                                • Smalltalk
                                 • 1980-1994: Extensive work to ma...
‣What Can We Learn From Them?




Wednesday, September 16, 2009
‣What Can We Learn From Them?
                                • Complied code is faster than interpreted code
            ...
‣Code Generation (JIT)
                                • Eliminating overhead of interpreter instantly increases
         ...
‣Type Profile
                                • As the program executes, it’s possible to see how one method
              ...
1: 25245
                                          2: 275
                                  2
                            ...
‣Type Profiling (Cont.)
                                • 98% of all method calls are to the same method
                  ...
‣Type Feedback
                                • Optimize a semi-static relationship to generate faster code
             ...
‣Method Inlining
                                • Rather than emit a call to a target method, copy it’s body at the
     ...
Implementation



Wednesday, September 16, 2009
‣Code Generation (JIT)
                                • Early experimentation with custom JIT
                           ...
‣LLVM
                                • Provides an internal AST (LLVM IR) for describing work to
                        ...
‣Type Profiling
                                • All call sites use a class called InlineCache, one per call site
        ...
‣When To Compile
                                • It takes time for a method’s type information to settle down
          ...
‣How to Compile
                                • To impact runtime as little as possible, all JIT compilation happens
   ...
‣Benchmarks                                   Seconds

                                         9

                       ...
‣Benchmarks                                     Seconds

                                          30


                  ...
‣Benchmarks                                  Seconds

                                        7

                         ...
‣Benchmarks                                           Seconds

                                                8

        ...
‣Conclusion
                                • Ruby is a wonderful language because it is organized for humans
            ...
Upcoming SlideShare
Loading in...5
×

Ruby World

1,926

Published on

Evan Phoenixs presentation at the Ruby World conference, Sept 7th, 2009.

Published in: Technology, Education
1 Comment
5 Likes
Statistics
Notes
  • Here's the video: http://www.ustream.tv/recorded/2394634
    <br /><object type="application/x-shockwave-flash" data="http://www.ustream.tv/flash/video/2394634" width="350" height="288"><param name="movie" value="http://www.ustream.tv/flash/video/2394634"></param><embed src="http://www.ustream.tv/flash/video/2394634" width="350" height="288" type="application/x-shockwave-flash"></embed></object>
       Reply 
    Are you sure you want to  Yes  No
    Your message goes here
No Downloads
Views
Total Views
1,926
On Slideshare
0
From Embeds
0
Number of Embeds
0
Actions
Shares
0
Downloads
26
Comments
1
Likes
5
Embeds 0
No embeds

No notes for slide

Ruby World

  1. 1. Applying Compiler Technology to Ruby Evan Phoenix Sept 8, 2009 Wednesday, September 16, 2009
  2. 2. What makes Ruby great can make Ruby slow. Wednesday, September 16, 2009
  3. 3. ‣ Highly Dynamic Wednesday, September 16, 2009
  4. 4. ‣ Highly Dynamic • Very high level operations • New code can be introduced at anytime • Dynamic typing • Exclusively late bound method calls • Easier to implement as an interpreter Wednesday, September 16, 2009
  5. 5. Haven’t other languages had these same features/ weaknesses? Wednesday, September 16, 2009
  6. 6. ‣Prior Work Wednesday, September 16, 2009
  7. 7. ‣Prior Work • Smalltalk • 1980-1994: Extensive work to make it fast • Self • 1992-1996: A primary research vehicle for making dynamic languages fast • Java / Hotspot • 1996-present: A battle hardened engine for (limited) dynamic dispatch Wednesday, September 16, 2009
  8. 8. ‣What Can We Learn From Them? Wednesday, September 16, 2009
  9. 9. ‣What Can We Learn From Them? • Complied code is faster than interpreted code • It’s very hard (almost impossible) to figure things out staticly • The type profile of a program is stable over time • Therefore: • Learn what a program does and optimize based on that • This is called Type Feedback Wednesday, September 16, 2009
  10. 10. ‣Code Generation (JIT) • Eliminating overhead of interpreter instantly increases performance a fixed percentage • Naive code generation results in small improvement over interpreter • Method calling continues to dominate time • Need a way to generate better code • Combine with program type information! Wednesday, September 16, 2009
  11. 11. ‣Type Profile • As the program executes, it’s possible to see how one method calls another methods • The relationship of one method and all the methods it calls is the type profile of the method • Just because you CAN use dynamic dispatch, doesn’t mean you always do. • It’s common that a call site always calls the same method every time it’s run Wednesday, September 16, 2009
  12. 12. 1: 25245 2: 275 2 3: 86 1% 4: 50 5: 35 6: 6 7: 10 8: 5 9: 5 10: 2 10+: 34 1 class 98% Call sites running Array specs Wednesday, September 16, 2009
  13. 13. ‣Type Profiling (Cont.) • 98% of all method calls are to the same method every time • In other words, 98% of all method calls are statically bound Wednesday, September 16, 2009
  14. 14. ‣Type Feedback • Optimize a semi-static relationship to generate faster code • Semi-static relationships are found by profiling all call sites • Allow JIT to make vastly better decisions • Most common optimization: Method Inlining Wednesday, September 16, 2009
  15. 15. ‣Method Inlining • Rather than emit a call to a target method, copy it’s body at the call site • Eliminates code to lookup and begin execution of target method • Simplifies (or eliminates) setup for target method • Allows for type propagation, as well as providing a wider horizon for optimization. • A wider horizon means better generated code, which means less work to do per method == faster execution. Wednesday, September 16, 2009
  16. 16. Implementation Wednesday, September 16, 2009
  17. 17. ‣Code Generation (JIT) • Early experimentation with custom JIT • Realized we weren’t experts • Would take years to get good code being generated • Switched to LLVM Wednesday, September 16, 2009
  18. 18. ‣LLVM • Provides an internal AST (LLVM IR) for describing work to be done • Text representation of AST allows for easy debugging • Provides ability to compile AST to machine code in memory • Contains thousands of optimizations • Competitive with GCC Wednesday, September 16, 2009
  19. 19. ‣Type Profiling • All call sites use a class called InlineCache, one per call site • InlineCache accelerates method dispatch by caching previous method used • In addition, tracks a fixed number of receiver classes seen when there is a cache miss • When compiling a method using LLVM, all InlineCaches for a method can be read • InlineCaches with good information can be used to accurately find a method to inline Wednesday, September 16, 2009
  20. 20. ‣When To Compile • It takes time for a method’s type information to settle down • Compiling too early means not having enough type info • Compiling too late means lost performance • Use simple call counters to allow a method to “heat up” • Each invocation of a method increments counter • When counter reaches a certain value, method is queued for compilation. • Threshold value is tunable: -Xjit.call_til_compile • Still experimenting with good default values Wednesday, September 16, 2009
  21. 21. ‣How to Compile • To impact runtime as little as possible, all JIT compilation happens in a background OS thread • Methods are queued, and background thread reads queue to find methods to compile • After compiling, function pointers to JIT generated code are installed in methods • All future invocations of method use JIT code Wednesday, September 16, 2009
  22. 22. ‣Benchmarks Seconds 9 8.02 6.75 def foo() 5.30 5.90 ary = [] 4.5 100.times { |i| ary << i } 3.60 end 2.25 2.59 300,000 times 0 1.8 1.9 rbx rbx jit rbx jit +blocks Wednesday, September 16, 2009
  23. 23. ‣Benchmarks Seconds 30 25.36 22.5 def foo() hsh = {} 15 100.times { |i| hsh[i] = 0 } 12.54 12.01 end 7.5 100,000 times 5.26 4.85 0 1.8 1.9 rbx rbx jit rbx jit +blocks Wednesday, September 16, 2009
  24. 24. ‣Benchmarks Seconds 7 6.26 5.25 def foo() hsh = { 47 => true } 3.5 3.64 100.times { |i| hsh[i] } end 2.68 2.66 1.75 2.09 100,000 times 0 1.8 1.9 rbx rbx jit rbx jit +blocks Wednesday, September 16, 2009
  25. 25. ‣Benchmarks Seconds 8 7.36 7.27 6 4 tak(18, 9, 0) 2 1.89 1.58 1.53 1.53 0 1.8 1.9 jruby rbx rbx jit rbx jit +blocks Wednesday, September 16, 2009
  26. 26. ‣Conclusion • Ruby is a wonderful language because it is organized for humans • By gather and using information about a running program, it’s possible to make that program much faster without impacting flexibility • Thank You! Wednesday, September 16, 2009
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×