RubyKaigi2014: Just in Time compiler for CRuby

A Just In Time Compiler for CRuby
(A JIT compiler for the CRuby language implementation)
Masahiro Ide
Yokohama National University
RubyKaigi 2014
Friday, September 19, 2014

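Outline
• Ruby
• Just-In-Time (JIT) compiler
• RuJIT: a JIT compiler for Ruby
Note: this talk does not go deep into implementation details, and it is meant to be understandable even to people who are not familiar with CRuby's implementation.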

Ruby Language and Implementation
• Ruby is a dynamically typed scripting language
– Supports exceptions, garbage collection, and continuations
– CRuby is the Ruby interpreter written in C (the de facto standard implementation of Ruby)
– Many implementations: JRuby (Java), IronRuby (.NET), Rubinius (Ruby), mruby (C), Topaz (PyPy)
– Numeric benchmarks run 10-100 times slower than C
• Need to improve!!!

CRuby Internal
Ruby code:
  100.times do |i|
    puts i
  end
YARV bytecode:
  #toplevel
  09 putobject 100
  11 send :times, block
  17 leave
  #block
  02 putself
  03 getdynamic i
  06 send :puts, nil
  14 leave
[Diagram: Ruby code → Parser → AST → Bytecode Compiler → YARV bytecode → Interpreter; the Interpreter is shown together with the Memory Manager and the C library, on top of the CPU]
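To inspect the bytecode CRuby produces for a snippet like the one on this slide, the interpreter's built-in RubyVM::InstructionSequence class can be used. This is a minimal sketch: the helper name yarv_disasm is made up for illustration, and the exact instruction names and offsets depend on the Ruby version (newer versions print getlocal-family instructions rather than getdynamic), so the listing will not match the slide exactly.

```ruby
# Compile a snippet to YARV bytecode and return its disassembly as a String.
# RubyVM::InstructionSequence is specific to CRuby.
def yarv_disasm(source)
  RubyVM::InstructionSequence.compile(source).disasm
end

# The Ruby code shown on the slide.
SLIDE_SNIPPET = <<~RUBY
  100.times do |i|
    puts i
  end
RUBY

puts yarv_disasm(SLIDE_SNIPPET)
```

The output contains one section for the top level and one for the block, mirroring the #toplevel and #block sections on the slide.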

HOW TO SPEED UP RUBY
1. Translate/compile to a low-level language
– Compile while the program is running → Just-In-Time (JIT)
– Compile before the program runs → Ahead-Of-Time (AOT)
– Target
• Java bytecode (JRuby), .NET CLR (IronRuby), LLVM IR (Rubinius)
2. Speed up the interpreter
– Smarter dispatch techniques
– Method caching
– …
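As an illustration of the method caching idea in item 2, here is a toy inline cache for a single call site. It is only a sketch and not CRuby's actual cache: the CallSite class is hypothetical, and it ignores method redefinition and singleton methods, both of which a real cache has to handle.

```ruby
# Toy inline method cache for one call site (illustrative only).
class CallSite
  attr_reader :hits, :misses

  def initialize(method_name)
    @method_name = method_name
    @cached_class = nil
    @cached_method = nil
    @hits = 0
    @misses = 0
  end

  def call(receiver, *args)
    klass = receiver.class
    if klass.equal?(@cached_class)
      @hits += 1
    else
      # Slow path: do the full method lookup, then remember the result.
      @misses += 1
      @cached_class = klass
      @cached_method = klass.instance_method(@method_name)
    end
    @cached_method.bind(receiver).call(*args)
  end
end

site = CallSite.new(:+)
sum = 0
10.times { |i| sum = site.call(sum, i) }
puts "sum=#{sum} hits=#{site.hits} misses=#{site.misses}"
```

Only the first call pays for the lookup; later calls with a receiver of the same class reuse the cached method.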

Outline
• Ruby
• Just-In-Time (JIT) compiler
• RuJIT: a JIT compiler for Ruby
– Design
– Usage
– Current status

RuJIT
• A trace-based JIT compiler for CRuby
– Similar to the approach of Firefox's JavaScript VM
• Based on the current version of CRuby
• Started in April 2014
• More speed, speed, speed
– … but maintain compatibility with current Ruby

Implementation Strategies
• Objective
– Improve Ruby's performance
• Development policy
– Low development cost
• 1 person, 2-3 months
– High extensibility
• Easy to develop new optimization techniques
– Practical software
• Someday, into the Ruby mainline
→ The design follows these policies

RuJIT overview
[Diagram: the same CRuby pipeline, Ruby code, and YARV bytecode as on the "CRuby Internal" slide, with two additions: RuJIT and the native code it produces]

RuJIT Design
• Trace-based JIT compiler
– Detects hot paths (e.g. loops)
• Intermediate Representation (IR) for optimization
– Low level
– Contains runtime information
• Native code translator
– Uses a C compiler (GCC, clang) as the backend
[Diagram boxes: YARV, Trace Selection Engine, Trace, Trace Cache, IR Generator, Optimizer, Code Generator, C Compiler, Native code, RuJIT Runtime, CRuby Runtime; arrow labels: invoke, compile]
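Hot-path detection in trace-based JITs is commonly done by counting how often a loop header is reached through a backward jump. The sketch below shows that general idea only. The HotLoopDetector class and the threshold value are assumptions for illustration, not RuJIT's actual trace selection engine.

```ruby
# Toy hot-path detector: counts how often each backward-jump target (loop
# header) is reached and reports it as hot once a threshold is crossed.
# HOT_THRESHOLD is a hypothetical value, not RuJIT's actual setting.
class HotLoopDetector
  HOT_THRESHOLD = 3

  def initialize
    @counters = Hash.new(0)
  end

  # Called by the interpreter on every jump from `from_pc` to `to_pc`.
  # Returns true exactly once, when the loop header becomes hot.
  def record_jump(from_pc, to_pc)
    return false unless to_pc <= from_pc # only backward jumps close a loop
    @counters[to_pc] += 1
    @counters[to_pc] == HOT_THRESHOLD
  end
end

detector = HotLoopDetector.new
5.times do |iteration|
  if detector.record_jump(20, 4)
    puts "loop at pc=4 became hot on iteration #{iteration}"
  end
end
```

Once a loop header is reported as hot, a trace-based compiler would start recording the instructions executed from that point.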

Execution flow of a trace-based compiler
  def square(x)
    return x * x
  end

  i = 0; y = 0
  while y < 100000
    y += square(i)
    i = i + 1
  end

Construct trace
Trace (for the square loop shown earlier):
BB0:
  guard_method_redefine(Fixnum.*)
  guard_method_redefine(Fixnum.+)
  guard_method_redefine(Fixnum.<)
BB1:
  guard_is_fixnum(i)
  push_frame  # invoke square
  tmp = fixnum_mul(i, i)
  pop_frame  # leave square
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1
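To make the role of the guards concrete, the BB1 loop of the trace above can be written out by hand in Ruby. This is a sketch under assumptions: SideExit, guard, and run_trace are hypothetical names, and a real trace runs as native code rather than Ruby. A failed guard leaves the trace and hands the live values back so that an interpreter could resume from them.

```ruby
# Side exit: raised when a guard fails, carrying the live values so an
# interpreter could resume. (Hypothetical structure for illustration.)
class SideExit < StandardError
  attr_reader :state

  def initialize(reason, state)
    super(reason)
    @state = state
  end
end

def guard(condition, reason, state)
  raise SideExit.new(reason, state) unless condition
end

# Hand-written Ruby equivalent of the BB1 loop, with square inlined.
def run_trace(i, y)
  loop do
    guard(i.is_a?(Integer), "guard_is_fixnum(i)", { i: i, y: y })
    tmp = i * i        # fixnum_mul(i, i)
    y += tmp           # fixnum_plus(y, tmp)
    i += 1             # fixnum_plus(i, 1)
    tmp2 = y < 100000  # the loop condition from the Ruby source
    guard(tmp2, "loop condition false", { i: i, y: y })
  end
end

begin
  run_trace(0, 0)
rescue SideExit => e
  puts "side exit (#{e.message}): #{e.state}"
end
```

The trace never returns normally: it either keeps looping or leaves through a guard, which is why frequent guard failures are costly.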

Optimize trace
Trace (for the square loop shown earlier):
BB0:
BB1:
  guard_is_fixnum(i)
  push_frame  # invoke square
  tmp = fixnum_mul(i, i)
  pop_frame  # leave square
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1
RuJIT can hoist guard_method_redefine

Optimize trace
Trace (for the square loop shown earlier):
BB0:
BB1:
  guard_is_fixnum(i)
  push_frame
  tmp = fixnum_mul(i, i)
  pop_frame
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1
RuJIT can remove the method frame

Optimize trace
Trace (for the square loop shown earlier):
BB0:
  guard_is_fixnum(i)
BB1:
  guard_is_fixnum(i)
  push_frame
  tmp = fixnum_mul(i, i)
  pop_frame
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1
RuJIT can hoist guard_is_fixnum, because variable 'i' is always a Fixnum
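The guard hoisting on this slide can be sketched as a small pass over a toy IR. This is an illustration under stated assumptions, not RuJIT's optimizer: Insn, FIXNUM_RESULT_OPS, and hoist_fixnum_guards are hypothetical names, and the pass assumes fixnum_plus and fixnum_mul always yield a Fixnum, which ignores overflow into Bignum.

```ruby
# One IR instruction: destination variable (or nil), operation, operands.
Insn = Struct.new(:dest, :op, :args)

# Ops assumed to always produce a Fixnum (overflow into Bignum is ignored
# in this sketch; a real JIT needs an overflow check).
FIXNUM_RESULT_OPS = %i[fixnum_plus fixnum_mul].freeze

# Moves guard_is_fixnum(v) from the loop body to the preheader when every
# write to v inside the loop is known to produce a Fixnum.
def hoist_fixnum_guards(preheader, body)
  hoisted, kept = body.partition do |insn|
    next false unless insn.op == :guard_is_fixnum
    var = insn.args.first
    writes = body.select { |other| other.dest == var }
    writes.all? { |w| FIXNUM_RESULT_OPS.include?(w.op) }
  end
  [preheader + hoisted, kept]
end

# The BB1 body from the slide, after the method frame has been removed.
TRACE_BODY = [
  Insn.new(nil,   :guard_is_fixnum, [:i]),
  Insn.new(:tmp,  :fixnum_mul,      [:i, :i]),
  Insn.new(:y,    :fixnum_plus,     [:y, :tmp]),
  Insn.new(:i,    :fixnum_plus,     [:i, 1]),
  Insn.new(:tmp2, :fixnum_lt,       [:y, 100000]),
  Insn.new(nil,   :guard_nil,       [:tmp2]),
].freeze

bb0, bb1 = hoist_fixnum_guards([], TRACE_BODY)
puts "BB0: #{bb0.map(&:op)}"
puts "BB1: #{bb1.map(&:op)}"
```

The guard on i ends up in BB0 and runs once before the loop, because the only write to i inside the loop is a fixnum_plus.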

Optimize trace
Trace (→ Native code):
BB0:
  guard_is_fixnum(i)
BB1:
  tmp = fixnum_mul(i, i)
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1

Transition from trace to interpreter
Fixnum.* is redefined! Invalidate the compiled code and fall back to the YARV interpreter.
[Diagram: CRuby (Parser → AST → Bytecode Compiler → Interpreter) with its YARV bytecode, next to the compiled trace below]
Trace:
BB0:
  guard_is_fixnum(i)
BB1:
  tmp = fixnum_mul(i, i)
  y = fixnum_plus(y, tmp)
  i = fixnum_plus(i, 1)
  tmp2 = fixnum_lt(y, 100000)
  guard_nil(tmp2)
  jump BB1
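The invalidate-and-fall-back behavior can be modeled in plain Ruby with the Module#method_added hook. Everything here is a toy model: Num stands in for Fixnum, TraceCache and the "compiled" block are hypothetical, and RuJIT's real mechanism lives inside the VM.

```ruby
# Holds "compiled" fast paths together with the methods they depend on.
module TraceCache
  @traces = {} # name => { code: Proc, depends_on: [method names] }

  def self.install(name, depends_on:, &code)
    @traces[name] = { code: code, depends_on: depends_on }
  end

  def self.invalidate_dependents(method_name)
    @traces.delete_if { |_, trace| trace[:depends_on].include?(method_name) }
  end

  def self.lookup(name)
    trace = @traces[name]
    trace && trace[:code]
  end
end

# Num stands in for Fixnum in this model.
class Num
  attr_reader :value

  def initialize(value)
    @value = value
  end

  def *(other)
    Num.new(value * other.value)
  end

  # Ruby calls this hook whenever an instance method is (re)defined.
  def self.method_added(name)
    TraceCache.invalidate_dependents(name)
  end
end

def interpret_square(x)
  x * x # normal dynamic dispatch to Num#*
end

def square(x)
  compiled = TraceCache.lookup(:square)
  if compiled
    [:native, compiled.call(x)]
  else
    [:interpreter, interpret_square(x)]
  end
end

# "Compiled" fast path that bakes in the original meaning of Num#*.
TraceCache.install(:square, depends_on: [:*]) { |x| Num.new(x.value * x.value) }

mode, result = square(Num.new(3))
puts "#{mode}: #{result.value}"

class Num
  def *(other) # this redefinition triggers Num.method_added(:*)
    Num.new(0)
  end
end

mode, result = square(Num.new(3))
puts "#{mode}: #{result.value}"
```

After the redefinition the cached fast path is gone, so the second call goes through ordinary dispatch and observes the new definition of Num#*.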

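Summary so far
Trace-based JIT compiler
• Characteristics
– Compiles only frequently executed paths
• Compiles them together with the assumptions and preconditions collected at run time
– Allows aggressive optimization
• When the JIT-generated machine code reaches a path that is not part of the trace, execution falls back to the interpreter
• Drawbacks
– Frequent trace ↔ interpreter transitions degrade performance
• Writing back the stack and registers, etc.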

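RuJIT's compilation approach
• YARV bytecode → RuJIT IR → C code → Native code
– Most internal (IR) instructions correspond to bytecode instructions
– No assembler is needed inside the JIT
• Low development cost
Ruby program:
  x + y
Bytecode instructions:
  getlocal x
  getlocal y
  plus x, y
Internal instructions:
  guard_int x
  guard_int y
  iadd x, y
C code (generated from the C code templates defined for each internal instruction):
  if (is_int(x)) {
    if (is_int(y)) {
      ret = x + y;
    }
    else {…}
  }
  else {…}
Machine code (compiled by gcc / clang):
  movl %edi, %ecx
  andl %esi, %ecx
  xorl %eax, %eax
  testb $1, %cl
  je LBB0_2
  addl %edi, %esi
  movl %esi, %eax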
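The template-based step from internal instructions to C code can be sketched as a lookup table of format strings. This is a hedged illustration: the template texts, the side_exit() helper they mention, and the emit_c function are assumptions modeled on the slide's is_int example, not RuJIT's real templates.

```ruby
# C templates per internal instruction; %{...} placeholders are filled
# from the instruction operands. The template text and the side_exit()
# helper name are hypothetical.
C_TEMPLATES = {
  guard_int: "if (!is_int(%{a})) { return side_exit(); }",
  iadd:      "%{dest} = %{a} + %{b};",
}.freeze

# Translates a list of [op, operands] pairs into the body of a C function.
def emit_c(ir)
  ir.map do |op, operands|
    template = C_TEMPLATES.fetch(op) { raise ArgumentError, "no template for #{op}" }
    "  " + format(template, operands)
  end.join("\n")
end

# Internal instructions for `x + y`, as on the slide.
IR_FOR_X_PLUS_Y = [
  [:guard_int, { a: "x" }],
  [:guard_int, { a: "y" }],
  [:iadd,      { dest: "ret", a: "x", b: "y" }],
].freeze

puts emit_c(IR_FOR_X_PLUS_Y)
```

Because the output is ordinary C text, the JIT can hand it to gcc or clang and needs no assembler of its own, which is the low development cost the slide refers to.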

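Translating bytecode into RuJIT internal instructions
• Writing the translation by hand is tedious
– Every bytecode needs stack manipulation, pc manipulation, and updates to internal data
– Internal instructions keep being added to make optimization easier
• Problem: how to keep development cost down while keeping the translation free of bugs
→ Generate the translation code automatically from translation rules and internal instruction specifications
• Common processing is not written by hand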

Internal instruction description and generated code
• Translation rule for an internal instruction
  define FixnumAdd
  :def :user
  (LHS:lir_t, RHS:lir_t)

  emit FixnumAdd if
    opcode is opt_plus and
    method_name is "+" and
    LHS is Fixnum and
    RHS is Fixnum
• Generated code
  if ( // rule0 FixnumAdd
      (opcode == BIN(opt_plus)) &&
      (ci->mid == idPLUS) &&
      (ci->argc == 2) &&
      (Fixnum."+" has not been redefined) &&  // pseudo-condition
      (FIXNUM_P(params[0])) &&
      (FIXNUM_P(params[1]))
  ) {
      Emit_GuardMethod(idPLUS);
      Emit_GuardTypeFixnum(regs[0]);
      Emit_GuardTypeFixnum(regs[1]);
      Emit_FixnumAdd(regs[0], regs[1]);
  }
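Generating the matcher from a rule can be sketched with a small Ruby generator. The rule format below is a simplified, hypothetical stand-in for the DSL on the slide, and the generator omits the argc and redefinition checks that the slide's generated code contains. The C identifiers in the output are taken from the slide.

```ruby
# Hypothetical, simplified rule description mirroring the FixnumAdd rule
# on the slide; the field names are assumptions for illustration.
FIXNUM_ADD_RULE = {
  name: "FixnumAdd",
  opcode: "opt_plus",
  method_id: "idPLUS",
  operand_types: %w[Fixnum Fixnum],
}.freeze

# Per-type C check macro and guard emitter (names taken from the slide).
TYPE_INFO = {
  "Fixnum" => { check: "FIXNUM_P", guard: "Emit_GuardTypeFixnum" },
}.freeze

# Generates the C matcher and the emitter calls for one rule.
def generate_rule_code(rule)
  types = rule[:operand_types]
  conditions = [
    "opcode == BIN(#{rule[:opcode]})",
    "ci->mid == #{rule[:method_id]}",
  ]
  guards = ["Emit_GuardMethod(#{rule[:method_id]});"]
  types.each_with_index do |type, n|
    info = TYPE_INFO.fetch(type)
    conditions << "#{info[:check]}(params[#{n}])"
    guards << "#{info[:guard]}(regs[#{n}]);"
  end
  regs = types.each_index.map { |n| "regs[#{n}]" }.join(", ")
  emit = "Emit_#{rule[:name]}(#{regs});"

  lines = ["if ( // #{rule[:name]}"]
  lines << conditions.map { |c| "    (#{c})" }.join(" &&\n")
  lines << ") {"
  lines.concat((guards + [emit]).map { |stmt| "    #{stmt}" })
  lines << "}"
  lines.join("\n")
end

puts generate_rule_code(FIXNUM_ADD_RULE)
```

Adding a new internal instruction then means adding one rule entry rather than hand-writing another matcher.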

Optimization
• Optimizations that are cheap to run and to implement come first
– Peephole optimization
– Constant propagation
– Common subexpression elimination
– Dead code elimination
– Escape analysis + stack allocation
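Two of the listed optimizations, constant propagation and dead code elimination, can be sketched on a toy straight-line IR. This is a minimal illustration with hypothetical names (Op, propagate_constants, eliminate_dead_code). It assumes no branches or loops, and it treats only return as having side effects.

```ruby
# One IR instruction: destination variable (or nil), operation, operands.
Op = Struct.new(:dest, :op, :args)

# Operations that can be evaluated at compile time.
FOLDABLE = {
  fixnum_plus: ->(a, b) { a + b },
  fixnum_mul:  ->(a, b) { a * b },
}.freeze

SIDE_EFFECT_OPS = %i[return].freeze

# Constant propagation: substitutes known constants into operands and
# folds operations whose operands are all constants.
def propagate_constants(ops)
  known = {}
  ops.map do |insn|
    args = insn.args.map { |a| known.fetch(a, a) }
    if insn.op == :const
      known[insn.dest] = args.first
      insn
    elsif FOLDABLE.key?(insn.op) && args.all? { |a| a.is_a?(Integer) }
      known[insn.dest] = FOLDABLE[insn.op].call(*args)
      Op.new(insn.dest, :const, [known[insn.dest]])
    else
      known.delete(insn.dest) # the value is no longer a known constant
      Op.new(insn.dest, insn.op, args)
    end
  end
end

# Dead code elimination: walks backwards and keeps an instruction only if
# it has a side effect or its result is used later.
def eliminate_dead_code(ops)
  live = {}
  ops.reverse.select do |insn|
    needed = SIDE_EFFECT_OPS.include?(insn.op) || live[insn.dest]
    if needed
      live.delete(insn.dest)
      insn.args.each { |a| live[a] = true if a.is_a?(Symbol) }
    end
    needed
  end.reverse
end

PROGRAM = [
  Op.new(:a, :const, [6]),
  Op.new(:b, :const, [7]),
  Op.new(:c, :fixnum_mul, [:a, :b]),
  Op.new(:unused, :fixnum_plus, [:a, 1]),
  Op.new(nil, :return, [:c]),
].freeze

optimized = eliminate_dead_code(propagate_constants(PROGRAM))
optimized.each { |insn| puts insn.to_a.inspect }
```

Running the two passes in this order matters: propagation turns c into a constant, which leaves every earlier definition unused and lets dead code elimination remove them.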

Evaluation
• Benchmarks used
– (A subset of) the benchmarks bundled with CRuby
• Baseline
– CRuby (trunk)
• Environment
– OS X 10.9
– Clang 3.5

Evaluation
• 2-5x speedup on many benchmarks
• More than 100x on some benchmarks
[Chart: y-axis ticks from 0 to 10, with seven break marks (〜〜), presumably on bars that exceed the axis]

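Evaluation
• Cases with no change or a slowdown
– Changes to trace detection
– Code written in C cannot be traced
[Chart: y-axis ticks from 0 to 1.2; benchmarks io_file_create, io_file_read, io_file_write, io_select, io_select2, io_select3, app_factorial, app_fib; annotations: "code defined in C" and "cases where trace detection is not yet supported"]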

Conclusion
• RuJIT: a trace-based JIT compiler for CRuby
– The trace-based JIT compiler translates the frequently executed parts of a Ruby script into machine code
– How to use RuJIT
1. git clone git@github.com:imasahiro/rujit.git
2. cd /path/to/rujit
3. ./configure && make && make install
4. ruby hello_world.rb
• Future work
– Compatibility (Trace API, exceptions, thread support)
– Need more tests

FAQ
• Does Rails run? (Or other Ruby programs?)
– Not yet. Support is in progress.
• RuJIT does not work for me
– Filing an issue on GitHub would be a great help
– https://github.com/imasahiro/rujit/issues
• How does it compare with ◯◯◯?
– JRuby, Rubinius, Topaz, mruby's JIT, yarv2llvm, CastOff…
– Currently being evaluated
