Evan Phoenix's presentation at the Ruby World conference, Sept 7th, 2009.
  • Here's the video: http://www.ustream.tv/recorded/2394634
Transcript

  • 1. Applying Compiler Technology to Ruby Evan Phoenix Sept 8, 2009 Wednesday, September 16, 2009
  • 2. What makes Ruby great can make Ruby slow.
  • 3. ‣ Highly Dynamic
  • 4. ‣ Highly Dynamic
      • Very high level operations
      • New code can be introduced at any time
      • Dynamic typing
      • Exclusively late-bound method calls
      • Easier to implement as an interpreter
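A small sketch of the dynamism the slide lists; the class and method names here are made up for illustration. New code can arrive at any time, and every call site must cope with any receiver, which is why calls are late bound:

```ruby
# Hypothetical example: dynamism that forces late-bound dispatch.
class Greeter
end

# New code can be introduced at any time -- even a brand-new method:
Greeter.class_eval do
  define_method(:hello) { "hi" }
end

puts Greeter.new.hello        # the method now exists and dispatches

# Dynamic typing: this one call site must handle any receiver,
# so the target of #inspect is resolved at run time, call by call.
[1, "a", :sym].each { |obj| puts obj.inspect }
```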
  • 5. Haven’t other languages had these same features/weaknesses?
  • 6. ‣ Prior Work
  • 7. ‣ Prior Work
      • Smalltalk
          • 1980–1994: Extensive work to make it fast
      • Self
          • 1992–1996: A primary research vehicle for making dynamic languages fast
      • Java / HotSpot
          • 1996–present: A battle-hardened engine for (limited) dynamic dispatch
  • 8. ‣ What Can We Learn From Them?
  • 9. ‣ What Can We Learn From Them?
      • Compiled code is faster than interpreted code
      • It’s very hard (almost impossible) to figure things out statically
      • The type profile of a program is stable over time
      • Therefore:
          • Learn what a program does and optimize based on that
          • This is called Type Feedback
  • 10. ‣ Code Generation (JIT)
      • Eliminating the overhead of the interpreter instantly increases performance by a fixed percentage
      • Naive code generation results in only a small improvement over the interpreter
      • Method calling continues to dominate time
      • Need a way to generate better code
      • Combine with program type information!
  • 11. ‣ Type Profile
      • As the program executes, it’s possible to see how one method calls other methods
      • The relationship of one method and all the methods it calls is the type profile of the method
      • Just because you CAN use dynamic dispatch doesn’t mean you always do
      • It’s common that a call site calls the same method every time it’s run
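The profiling idea above can be sketched in a few lines: record which receiver class shows up at a single call site. This is a toy stand-in, not Rubinius internals, but it shows why the set of classes is usually a singleton:

```ruby
# Hypothetical profiler: one hash per call site, counting receiver classes.
site_profile = Hash.new(0)

call_site = lambda do |receiver|
  site_profile[receiver.class] += 1   # what the profiler records
  receiver.to_s                       # the late-bound call itself
end

1000.times { call_site.call(42) }     # same receiver class every time

p site_profile                        # a single entry: the site is monomorphic
```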
  • 12. Call sites running Array specs, by number of receiver classes seen per call site:
        1 class:  25,245  (98%)
        2 classes:   275
        3 classes:    86
        4 classes:    50
        5 classes:    35
        6 classes:     6
        7 classes:    10
        8 classes:     5
        9 classes:     5
        10 classes:    2
        10+ classes:  34
        (1 class: 98% of all call sites; the rest together: ~2%)
  • 13. ‣ Type Profiling (Cont.)
      • 98% of all method calls are to the same method every time
      • In other words, 98% of all method calls are statically bound
  • 14. ‣ Type Feedback
      • Optimize a semi-static relationship to generate faster code
      • Semi-static relationships are found by profiling all call sites
      • Allows the JIT to make vastly better decisions
      • Most common optimization: Method Inlining
  • 15. ‣ Method Inlining
      • Rather than emit a call to a target method, copy its body to the call site
      • Eliminates the code to look up and begin execution of the target method
      • Simplifies (or eliminates) setup for the target method
      • Allows for type propagation, as well as providing a wider horizon for optimization
      • A wider horizon means better generated code, which means less work to do per method == faster execution
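A hand-written before/after picture of what inlining buys, using a trivial accessor. The "after" line is what the JIT conceptually produces (a direct instance-variable read with no dispatch); this is an illustration, not Rubinius output:

```ruby
# Hypothetical illustration of inlining a trivial accessor.
class Point
  attr_reader :x, :y
  def initialize(x, y)
    @x, @y = x, y
  end
end

p = Point.new(3, 4)

# Before inlining: two late-bound sends (p.x and p.y) at this call site.
sum_with_calls = p.x + p.y

# After inlining: the accessor bodies are copied into the caller --
# conceptually just instance-variable reads, no lookup, no dispatch.
sum_inlined = p.instance_variable_get(:@x) + p.instance_variable_get(:@y)

raise "mismatch" unless sum_with_calls == sum_inlined  # same answer, less work
```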
  • 16. Implementation
  • 17. ‣ Code Generation (JIT)
      • Early experimentation with a custom JIT
      • Realized we weren’t experts
      • Would take years to get good code being generated
      • Switched to LLVM
  • 18. ‣ LLVM
      • Provides an internal AST (LLVM IR) for describing work to be done
      • Text representation of the AST allows for easy debugging
      • Provides the ability to compile the AST to machine code in memory
      • Contains thousands of optimizations
      • Competitive with GCC
  • 19. ‣ Type Profiling
      • All call sites use a class called InlineCache, one per call site
      • InlineCache accelerates method dispatch by caching the previous method used
      • In addition, it tracks a fixed number of receiver classes seen when there is a cache miss
      • When compiling a method using LLVM, all InlineCaches for the method can be read
      • InlineCaches with good information can be used to accurately find a method to inline
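A toy version of the InlineCache described above, loosely modeled on the slide: cache the last method looked up, and record receiver classes on misses for the JIT to read later. The class is named after the real one, but everything here is a simplified sketch, not Rubinius source:

```ruby
# Simplified sketch of a per-call-site inline cache.
class InlineCache
  attr_reader :seen_classes

  def initialize(method_name)
    @name          = method_name
    @cached_class  = nil
    @cached_method = nil
    @seen_classes  = Hash.new(0)   # receiver classes recorded on misses
  end

  def call(receiver, *args)
    klass = receiver.class
    unless klass == @cached_class                 # cache miss
      @seen_classes[klass] += 1                   # profile data for the JIT
      @cached_class  = klass
      @cached_method = klass.instance_method(@name)
    end
    @cached_method.bind(receiver).call(*args)     # hit: skip the full lookup
  end
end

cache = InlineCache.new(:to_s)
3.times { cache.call(7) }     # one miss, then two hits on the cached method
p cache.seen_classes          # one class seen: a perfect inlining candidate
```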
  • 20. ‣ When To Compile
      • It takes time for a method’s type information to settle down
      • Compiling too early means not having enough type info
      • Compiling too late means lost performance
      • Use simple call counters to allow a method to “heat up”
      • Each invocation of a method increments a counter
      • When the counter reaches a certain value, the method is queued for compilation
      • The threshold value is tunable: -Xjit.call_til_compile
      • Still experimenting with good default values
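The counter scheme can be sketched directly. The threshold constant mirrors the real -Xjit.call_til_compile knob, but the value and all names below are made up for the example:

```ruby
# Hypothetical "heat up" sketch: queue a method for compilation
# once its call counter crosses a threshold.
CALL_TIL_COMPILE = 5               # stand-in for -Xjit.call_til_compile
counters      = Hash.new(0)
compile_queue = []

record_call = lambda do |method_name|
  counters[method_name] += 1
  if counters[method_name] == CALL_TIL_COMPILE
    compile_queue << method_name   # hot enough: hand it to the JIT
  end
end

10.times { record_call.call(:foo) }   # invoked 10 times, queued exactly once
p compile_queue                       # the hot method, queued at call #5
```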
  • 21. ‣ How To Compile
      • To impact the runtime as little as possible, all JIT compilation happens in a background OS thread
      • Methods are queued, and the background thread reads the queue to find methods to compile
      • After compiling, function pointers to the JIT-generated code are installed in the methods
      • All future invocations of the method use the JIT code
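The producer/consumer handoff above maps onto Ruby's standard Queue and Thread; the "compiler" here is a stand-in string, and the nil sentinel is just this sketch's shutdown convention:

```ruby
require 'thread'   # Queue

# Sketch of background JIT compilation: the main thread queues hot
# methods and keeps running; a worker thread "compiles" them.
jit_queue = Queue.new
compiled  = {}

jit_thread = Thread.new do
  while (name = jit_queue.pop)                   # blocks until work arrives
    compiled[name] = "native code for #{name}"   # pretend to compile
  end                                            # nil sentinel ends the loop
end

jit_queue << :foo     # main thread is never blocked by compilation
jit_queue << :bar
jit_queue << nil      # shut the worker down
jit_thread.join

p compiled.keys       # both methods now have "installed" JIT code
```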
  • 22. ‣ Benchmarks
        def foo()
          ary = []
          100.times { |i| ary << i }
        end
        Run 300,000 times. Seconds: 1.8: 8.02, 1.9: 5.30, rbx: 5.90, rbx jit: 3.60, rbx jit +blocks: 2.59
  • 23. ‣ Benchmarks
        def foo()
          hsh = {}
          100.times { |i| hsh[i] = 0 }
        end
        Run 100,000 times. Seconds: 1.8: 25.36, 1.9: 12.54, rbx: 12.01, rbx jit: 5.26, rbx jit +blocks: 4.85
  • 24. ‣ Benchmarks
        def foo()
          hsh = { 47 => true }
          100.times { |i| hsh[i] }
        end
        Run 100,000 times. Seconds: 1.8: 6.26, 1.9: 3.64, rbx: 2.68, rbx jit: 2.66, rbx jit +blocks: 2.09
  • 25. ‣ Benchmarks
        tak(18, 9, 0)
        Seconds: 1.8: 7.36, 1.9: 7.27, jruby: 1.89, rbx: 1.58, rbx jit: 1.53, rbx jit +blocks: 1.53
  • 26. ‣ Conclusion
      • Ruby is a wonderful language because it is organized for humans
      • By gathering and using information about a running program, it’s possible to make that program much faster without impacting flexibility
      • Thank You!