Just-In-Time Compiler in PHP 8
Nikita Popov @ betterCode PHP 8
About Me
●
Dmitry Stogov works on JIT
●
I work on everything else :)
●
My JIT involvement mostly QA
Just-In-Time (JIT) Compiler
[Diagram, without JIT: PHP Code -> Opcodes -> Virtual Machine -> CPU]
[Diagram, with JIT: PHP Code -> Opcodes -> JIT -> Machine Code -> CPU, alongside the Virtual Machine path]
History
●
Old project, started by Zend back in the PHP 5 days
●
Mainly implemented by Dmitry Stogov
History
●
Early prototypes: the rest of PHP was too slow for the JIT to matter
– Too many allocations
– Too much memory usage
– Too much pointer chasing
– Cache locality is key
●
PHPNG (later: PHP 7) project started to optimize PHP
●
Large performance improvements (2x), no JIT needed!
History
●
SSA and type inference from JIT integrated into opcache
●
Used for opcode optimizations
– Constant Propagation
– Dead Code Elimination
– Refcount Optimization
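
As a rough source-level illustration of the first two optimizations (the optimizer itself works on opcodes in SSA form, not on PHP source), constant propagation and dead code elimination could reduce the first function below to something equivalent to the second. Both function names are made up for this sketch.

```php
<?php
// Hypothetical input: $unit and $taxed are known constants.
function priceBefore(int $qty): int {
    $unit = 5;
    $taxed = false;
    $total = $qty * $unit;   // $unit can be replaced by 5
    if ($taxed) {            // condition is always false: dead branch
        $total += 100;
    }
    return $total;
}

// What remains after propagating the constants and dropping the dead branch.
function priceAfter(int $qty): int {
    return $qty * 5;
}
```

Both functions return the same value for every input; only the amount of work differs.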
Configuration
●
Enable opcache
●
opcache.jit_buffer_size=128M
●
Done!
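
To check whether the settings above took effect, a script can ask opcache for its status. The sketch below assumes that `opcache_get_status()` reports a `jit` entry with `enabled` and `buffer_size` fields, which is my understanding of PHP 8; the helper name `describeJit` is hypothetical, and `opcache.enable_cli=1` is an extra assumption that only matters when testing from the command line.

```php
<?php
// Assumed php.ini settings:
//   opcache.enable=1
//   opcache.enable_cli=1          ; assumption: only needed for CLI runs
//   opcache.jit_buffer_size=128M

/** Hypothetical helper: returns a short description of the JIT state. */
function describeJit(): string {
    if (!function_exists('opcache_get_status')) {
        return 'opcache extension not loaded';
    }
    $status = opcache_get_status(false);   // false: skip per-script details
    if (!is_array($status)) {
        return 'opcache disabled';
    }
    $jit = $status['jit'] ?? null;
    if (!is_array($jit)) {
        return 'no JIT information available';
    }
    return sprintf(
        'JIT enabled=%s, buffer_size=%d bytes',
        !empty($jit['enabled']) ? 'yes' : 'no',
        $jit['buffer_size'] ?? 0
    );
}

echo describeJit(), "\n";
```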
Configuration
●
Advanced configuration:
– opcache.jit (CRTO)
– opcache.jit_debug, opcache.jit_bisect_limit
– opcache.jit_max_root_traces, opcache.jit_max_side_traces,
opcache.jit_max_exit_counters
– opcache.jit_hot_loop, opcache.jit_hot_func, opcache.jit_hot_return,
opcache.jit_hot_side_exit
– opcache.jit_blacklist_root_trace, opcache.jit_blacklist_side_trace
– opcache.jit_max_loop_unrolls, opcache.jit_max_recursive_calls,
opcache.jit_max_recursive_returns, opcache.jit_max_polymorphic_calls
– https://www.php.net/manual/en/opcache.configuration.php
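
The four letters in CRTO stand for four decimal digits: CPU-specific flags, Register allocation, Trigger, and Optimization level. As a minimal sketch, the hypothetical helper below splits such a value into its digits. The trigger descriptions for 0 to 3 follow the Function JIT slide; treating 5 as the tracing JIT, and `1254` / `1205` as the numeric forms of `tracing` / `function`, is based on my reading of the PHP manual.

```php
<?php
/**
 * Hypothetical helper: decode a four-digit opcache.jit value (CRTO).
 */
function decodeCrto(string $value): array {
    if (!preg_match('/^[0-9]{4}$/', $value)) {
        throw new InvalidArgumentException("expected four digits, got '$value'");
    }
    $triggers = [
        0 => 'all functions, on script load',
        1 => 'all functions, on first execution',
        2 => 'profile first request, JIT hot functions',
        3 => 'profile on the fly, JIT hot functions',
        5 => 'tracing JIT',   // assumption taken from the PHP manual
    ];
    [$c, $r, $t, $o] = array_map('intval', str_split($value));
    return [
        'cpu_flags'           => $c,
        'register_allocation' => $r,
        'trigger'             => $t,
        'trigger_description' => $triggers[$t] ?? 'unknown',
        'optimization_level'  => $o,
    ];
}
```

For example, `decodeCrto('1205')` yields trigger 0 (JIT everything on script load) with optimization level 5.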
Performance
[Bar chart, values not recoverable: bench.php, micro_bench.php, PHP-Parser, amphp, Symfony Demo, and "With Preloading", plotted on an axis from 0 to 3.5 against the baseline "Opcache + No JIT"]
●
Heavily depends on workload
●
Larger impact the more time is spent executing
PHP code (rather than e.g. DB queries)
●
More useful for "non-standard" applications
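
Since the effect depends on the workload, the most reliable check is to time your own code with and without the JIT. The sketch below is a minimal timing harness for a CPU-bound loop; `sumTo` and `bestTimeMs` are hypothetical names. One way to compare would be to run it once with `-d opcache.jit_buffer_size=0` and once with `-d opcache.jit_buffer_size=128M` (plus `-d opcache.enable_cli=1` on the command line).

```php
<?php
// CPU-bound workload: all time is spent executing PHP code.
function sumTo(int $n): int {
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}

/** Returns the best wall-clock time of $repeat runs, in milliseconds. */
function bestTimeMs(callable $fn, int $repeat = 5): float {
    $best = INF;
    for ($k = 0; $k < $repeat; $k++) {
        $start = hrtime(true);                       // nanoseconds
        $fn();
        $elapsedMs = (hrtime(true) - $start) / 1e6;
        $best = min($best, $elapsedMs);
    }
    return $best;
}

printf("%.3f ms\n", bestTimeMs(fn() => sumTo(1000000)));
```

A request dominated by database or network waits would show little difference in such a measurement, which is the point of the bullets above.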
Function JIT
●
opcache.jit=function
●
Always JITs a whole function
Function JIT
[Diagram: PHP Code -> Opcodes -> Virtual Machine -> CPU, plus Opcodes -> JIT -> Machine Code -> CPU, with a "Trigger" deciding when the JIT runs]
Function JIT
●
Trigger: When to JIT
– 0: All functions, on script load
– 1: All functions, on first execution
– 2: Profile first request, JIT hot functions
– 3: Profile on the fly, JIT hot functions
<?php
function sum(int $n) {
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}

<?php
function sum(int $n) {
entry:
    $sum = 0;
    $i = 0;
    goto cond;
loop:
    $sum += $i;
    $i++;
cond:
    if ($i < $n) goto loop;
finish:
    return $sum;
}

<?php
function sum(int $n) {
entry:
    $sum_0 = 0;
    $i_0 = 0;
    goto cond;
loop:
    $sum_2 = $sum_1 + $i_1;
    $i_2 = $i_1 + 1;
cond:
    $sum_1 = phi(entry: $sum_0, loop: $sum_2);
    $i_1 = phi(entry: $i_0, loop: $i_2);
    if ($i_1 < $n) goto loop;
finish:
    return $sum_1;
}

<?php
function sum(int $n) {
entry:
    $sum_0 = 0; # int
    $i_0 = 0; # int
    goto cond;
loop:
    $sum_2 = $sum_1 + $i_1; # int|float
    $i_2 = $i_1 + 1; # int
cond:
    $sum_1 = phi(entry: $sum_0, loop: $sum_2); # int|float
    $i_1 = phi(entry: $i_0, loop: $i_2); # int
    if ($i_1 < $n) goto loop;
finish:
    return $sum_1;
}
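
The `int|float` annotation on `$sum` comes from PHP's integer overflow rule: adding two ints yields a float when the result does not fit. `$i + 1` can stay `int`, presumably because the loop condition keeps `$i` below the int `$n`. A minimal demonstration, with `addLikePhp` as a hypothetical helper name:

```php
<?php
// Plain PHP addition on two ints: the result type depends on the values.
function addLikePhp(int $a, int $b): int|float {
    return $a + $b;
}

$small = addLikePhp(1, 2);             // fits: stays an int
$big   = addLikePhp(PHP_INT_MAX, 1);   // overflows: becomes a float

var_dump(is_int($small), is_float($big));
```

This is why the inferred type of `$sum_2` has to include `float`, even though the float case will rarely occur in practice.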
...
.L2:
    mov $0x0, 0x60(%r14)      # Assign int(0) to $sum (%r14 is the frame pointer)
    mov $0x4, 0x68(%r14)
    xor %rdx, %rdx            # Assign 0 to $i (in register)
    jmp .L5
.L3:
    mov %rsi, 0x50(%r14)
    mov $0x4, 0x58(%r14)
    cmp $0x4, 0x68(%r14)      # Check whether $sum is int
    jnz .L10
    mov 0x60(%r14), %rax      # Load $sum to register
    add %rdx, %rax            # Add $sum and $i
    jo .L9                    # Check if addition overflowed
    mov %rax, 0x60(%r14)      # Write result back
.L4:
    add $0x1, %rdx            # Increment $i (in register)
.L5:
...
...
.L9:
    vxorps %xmm0, %xmm0, %xmm0
    vcvtsi2sd 0x60(%r14), %xmm0, %xmm0    # Convert $sum to float
    vxorps %xmm1, %xmm1, %xmm1
    vcvtsi2sd %rdx, %xmm1, %xmm1          # Convert $i to float
    vaddsd %xmm1, %xmm0, %xmm0            # Add $sum and $i as floats
    vmovsd %xmm0, 0x60(%r14)
    mov $0x5, 0x68(%r14)                  # Mark $sum slot as float
    jmp .L4
.L10:
    vxorps %xmm0, %xmm0, %xmm0
    vcvtsi2sd %rdx, %xmm0, %xmm0          # Convert $i to float
    vaddsd 0x60(%r14), %xmm0, %xmm0       # Add (float)$i to $sum
    vmovsd %xmm0, 0x60(%r14)
    jmp .L4
.L11:
...
This code is almost certainly unused!
Can't store $sum in register, because it might turn float
Tracing JIT
[Diagram: VM Execution + Profiling -> (hot) -> Trace Collection -> Trace Compilation -> Trace Execution; deoptimization leads from Trace Execution back to VM Execution]
<?php
function sum(int $n) {
entry:
    $sum = 0;
    $i = 0;
    goto cond;
loop:
    $sum += $i;
    $i++;
cond:
    if ($i < $n) goto loop;
finish:
    return $sum;
}

<?php
trace:
    if ($i < $n)
    $sum += $i;
    $i++;
    goto trace;
<?php
$sum_0 = ...;
$i_0 = ...;
trace:
$sum_1 = phi($sum_0, $sum_2);
$i_1 = phi($i_0, $i_2);
if ($i_1 < $n)
$sum_2 = $sum_1 + $i_1;
$i_2 = $i_1 + 1;
goto trace;
<?php
$sum_0 = ...; # int
$i_0 = ...;
trace:
$sum_1 = phi($sum_0, $sum_2);
$i_1 = phi($i_0, $i_2);
if ($i_1 < $n) # does not exit
$sum_2 = $sum_1 + $i_1; # int
$i_2 = $i_1 + 1;
goto trace;
    sub $0x10, %rsp
    mov $EG(jit_trace_num), %rax
    mov $0x1, (%rax)
    cmp $0x4, 0x68(%r14)           # Check if $sum is int (exit 0)
    jnz jit$$trace_exit_0
    mov 0x50(%r14), %rcx           # Load $n, $sum, $i into registers
    mov 0x60(%r14), %rdx
    mov 0x70(%r14), %rsi
.L1:
    cmp %rcx, %rsi                 # Check $i < $n (exit 1)
    jge jit$$trace_exit_1
    mov %rdx, %rax                 # $sum += $i, check overflow (exit 2)
    add %rsi, %rax
    jo jit$$trace_exit_2
    mov %rax, %rdx
    add $0x1, %rsi                 # $i++
    mov $EG(vm_interrupt), %rax    # Check VM interrupt, like timeout (exit 3)
    cmp $0x0, (%rax)
    jz .L1
    jmp jit$$trace_exit_3
Exits go to VM or side traces
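
To make the guard-and-exit structure concrete, here is a source-level model of the compiled trace, written as a sketch rather than a description of the real implementation. The function `runTrace` is hypothetical; it returns the exit number together with the values of `$sum` and `$i` at the moment the trace is left. Exit 3 (VM interrupt) is not modelled.

```php
<?php
/**
 * Hypothetical model of the trace for sum().
 * Returns [exit number, $sum, $i] at the point where the trace is left.
 */
function runTrace(int $n, int|float $sum, int $i): array {
    if (!is_int($sum)) {
        return [0, $sum, $i];        // exit 0: $sum is not an int
    }
    while (true) {
        if ($i >= $n) {
            return [1, $sum, $i];    // exit 1: loop condition failed
        }
        $next = $sum + $i;
        if (!is_int($next)) {
            return [2, $sum, $i];    // exit 2: overflow, $sum left unchanged
        }
        $sum = $next;
        $i++;
    }
}
```

Whoever handles an exit (the VM or a side trace) resumes from the returned state, so the overflow exit hands back `$sum` as it was before the failed addition, mirroring the `jo` before the write-back in the assembly.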
Trace 2
TRACE-2$sum$5:
    mov $EG(jit_trace_num), %rax
    mov $0x2, (%rax)
    mov 0x70(%r14), %rax
    cmp 0x50(%r14), %rax
    jge jit$$trace_exit_0
    cmp $0x5, 0x68(%r14)               # Check if $sum is float
    jnz jit$$trace_exit_1
    vxorps %xmm0, %xmm0, %xmm0         # $sum += (float) $i
    vcvtsi2sd %rax, %xmm0, %xmm0
    vaddsd 0x60(%r14), %xmm0, %xmm0
    vmovsd %xmm0, 0x60(%r14)
    add $0x1, 0x70(%r14)
    mov $EG(vm_interrupt), %rax
    cmp $0x0, (%rax)
    jz TRACE-1$sum$5+4
    jmp jit$$trace_exit_2
Interception
●
Each opcode stores a "VM handler" pointer
●
Replace handler at function entry, loop headers,
returns
●
Handler counts executions and invokes JIT
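
The interception idea can be sketched in plain PHP as a toy model: each "opline" holds a handler, a profiling wrapper replaces it, counts executions, and swaps in a compiled handler once a threshold is reached. Everything here (`Opline`, `installProfiler`, `HOT_THRESHOLD` and its value 3) is hypothetical and chosen for illustration only; the real thresholds are the `opcache.jit_hot_*` settings.

```php
<?php
/** Toy stand-in for an opcode with a replaceable handler. */
final class Opline {
    /** @var callable */
    public $handler;
    public int $counter = 0;

    public function __construct(callable $handler) {
        $this->handler = $handler;
    }
}

const HOT_THRESHOLD = 3;   // hypothetical value for this sketch

/** Replace the handler with a counting wrapper that triggers $compile when hot. */
function installProfiler(Opline $op, callable $compile): void {
    $original = $op->handler;
    $op->handler = function (...$args) use ($op, $original, $compile) {
        if (++$op->counter >= HOT_THRESHOLD) {
            // Hot: install the "compiled" handler for all later executions.
            $op->handler = $compile($original);
            return ($op->handler)(...$args);
        }
        return $original(...$args);
    };
}
```

After the swap the counting wrapper is no longer on the execution path, so profiling overhead is only paid until the code becomes hot.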
Trace Collection
●
Separate VM that collects type info while
executing
●
Traces can span different loops and functions
– Calls effectively get "inlined"
Code Generation
●
Early prototypes used LLVM
– Architecture agnostic
– Supports many sophisticated optimizations
– But: Extremely slow compile-times
●
Now using DynASM from the LuaJIT project
– Very fast
– But: Architecture specific
|.macro LONG_MATH_REG, opcode, dst_reg, src_reg
|| switch (opcode) {
|| case ZEND_ADD:
| add dst_reg, src_reg
|| break;
|| case ZEND_SUB:
| sub dst_reg, src_reg
|| break;
|| case ZEND_MUL:
| imul dst_reg, src_reg
|| break;
|| case ZEND_BW_OR:
| or dst_reg, src_reg
|| break;
|| case ZEND_BW_AND:
| and dst_reg, src_reg
|| break;
...
|| }
|.endmacro
Lines starting with "||": C code
Lines starting with "|": X86 Assembly with placeholders
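
As a loose analogy in PHP, the macro picks an instruction template by opcode and fills in the register placeholders. The sketch below only builds a text line, whereas DynASM emits machine code; the function name `longMathReg` and the use of strings for opcodes and registers are assumptions made for illustration.

```php
<?php
/**
 * Hypothetical text-only analogue of LONG_MATH_REG:
 * select a mnemonic by opcode, then substitute the registers.
 */
function longMathReg(string $opcode, string $dst, string $src): string {
    $mnemonic = match ($opcode) {
        'ZEND_ADD'    => 'add',
        'ZEND_SUB'    => 'sub',
        'ZEND_MUL'    => 'imul',
        'ZEND_BW_OR'  => 'or',
        'ZEND_BW_AND' => 'and',
        default       => throw new InvalidArgumentException("unsupported opcode $opcode"),
    };
    return "$mnemonic $dst, $src";
}

echo longMathReg('ZEND_ADD', 'rax', 'rdx'), "\n";
```

The switch runs when the JIT generates code, not when the generated code executes, which is the split the C and assembly annotations describe.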
Code Generation
●
DynASM itself supports many architectures
●
But JIT code has to be written for each
●
No support for M1 at this time, sorry!
Closing Thoughts
●
Performance benefit is workload dependent
– Try it!
●
Room for improvement
– E.g. optimizations (loop invariant code motion, etc.)
Closing Thoughts
●
Concern: Stability
– Increased potential for hard-to-debug, hard-to-reproduce bugs
●
Concern: Maintenance
– Only one person really understands the JIT
Thank You!

More Related Content

What's hot

Sigreturn Oriented Programming
Sigreturn Oriented ProgrammingSigreturn Oriented Programming
Sigreturn Oriented Programming
Angel Boy
 
Google V8 engine
Google V8 engineGoogle V8 engine
Google V8 engine
Xuân Thu Nguyễn
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
Angel Boy
 
traitを使って楽したい話
traitを使って楽したい話traitを使って楽したい話
traitを使って楽したい話
infinite_loop
 
ELFの動的リンク
ELFの動的リンクELFの動的リンク
ELFの動的リンク
7shi
 
ゆるバグ
ゆるバグゆるバグ
ゆるバグ
MITSUNARI Shigeo
 
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Valeriy Kravchuk
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
InfluxData
 
Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)
Kris Mok
 
php-src の歩き方
php-src の歩き方php-src の歩き方
php-src の歩き方
do_aki
 
Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5Takuya Akiba
 
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
Y Watanabe
 
How A Compiler Works: GNU Toolchain
How A Compiler Works: GNU ToolchainHow A Compiler Works: GNU Toolchain
How A Compiler Works: GNU Toolchain
National Cheng Kung University
 
Visual C++で使えるC++11
Visual C++で使えるC++11Visual C++で使えるC++11
Visual C++で使えるC++11
nekko1119
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
Brendan Gregg
 
Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
National Cheng Kung University
 
The Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEPThe Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEPGregor Schmidt
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
Adrian Huang
 

What's hot (20)

Sigreturn Oriented Programming
Sigreturn Oriented ProgrammingSigreturn Oriented Programming
Sigreturn Oriented Programming
 
Google V8 engine
Google V8 engineGoogle V8 engine
Google V8 engine
 
Pwning in c++ (basic)
Pwning in c++ (basic)Pwning in c++ (basic)
Pwning in c++ (basic)
 
traitを使って楽したい話
traitを使って楽したい話traitを使って楽したい話
traitを使って楽したい話
 
ELFの動的リンク
ELFの動的リンクELFの動的リンク
ELFの動的リンク
 
initramfsについて
initramfsについてinitramfsについて
initramfsについて
 
ゆるバグ
ゆるバグゆるバグ
ゆるバグ
 
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
Tracing MariaDB server with bpftrace - MariaDB Server Fest 2021
 
Performance Profiling in Rust
Performance Profiling in RustPerformance Profiling in Rust
Performance Profiling in Rust
 
Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)Java Crash分析(2012-05-10)
Java Crash分析(2012-05-10)
 
php-src の歩き方
php-src の歩き方php-src の歩き方
php-src の歩き方
 
Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5Cache-Oblivious データ構造入門 @DSIRNLP#5
Cache-Oblivious データ構造入門 @DSIRNLP#5
 
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
サーバーサイドな人がフロントエンド技術と仲良くするはじめの一歩
 
How A Compiler Works: GNU Toolchain
How A Compiler Works: GNU ToolchainHow A Compiler Works: GNU Toolchain
How A Compiler Works: GNU Toolchain
 
Visual C++で使えるC++11
Visual C++で使えるC++11Visual C++で使えるC++11
Visual C++で使えるC++11
 
BPF Internals (eBPF)
BPF Internals (eBPF)BPF Internals (eBPF)
BPF Internals (eBPF)
 
Qemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System EmulationQemu JIT Code Generator and System Emulation
Qemu JIT Code Generator and System Emulation
 
The Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEPThe Microkernel Mach Under NeXTSTEP
The Microkernel Mach Under NeXTSTEP
 
淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道 淺談探索 Linux 系統設計之道
淺談探索 Linux 系統設計之道
 
Slab Allocator in Linux Kernel
Slab Allocator in Linux KernelSlab Allocator in Linux Kernel
Slab Allocator in Linux Kernel
 

Similar to Just-In-Time Compiler in PHP 8

Php 7 hhvm and co
Php 7 hhvm and coPhp 7 hhvm and co
Php 7 hhvm and co
Pierre Joye
 
PHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP LimogesPHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP Limoges
✅ William Pinaud
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
Steffen Wenz
 
掀起 Swift 的面紗
掀起 Swift 的面紗掀起 Swift 的面紗
掀起 Swift 的面紗
Pofat Tseng
 
Anatomy of a PHP Request ( UTOSC 2010 )
Anatomy of a PHP Request ( UTOSC 2010 )Anatomy of a PHP Request ( UTOSC 2010 )
Anatomy of a PHP Request ( UTOSC 2010 )Joseph Scott
 
The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6
Wim Godden
 
The bytecode hocus pocus - JavaOne 2016
The bytecode hocus pocus - JavaOne 2016The bytecode hocus pocus - JavaOne 2016
The bytecode hocus pocus - JavaOne 2016
Raimon Ràfols
 
The bytecode gobbledygook
The bytecode gobbledygookThe bytecode gobbledygook
The bytecode gobbledygook
Raimon Ràfols
 
Diving into HHVM Extensions (PHPNW Conference 2015)
Diving into HHVM Extensions (PHPNW Conference 2015)Diving into HHVM Extensions (PHPNW Conference 2015)
Diving into HHVM Extensions (PHPNW Conference 2015)
James Titcumb
 
Living With Legacy Code
Living With Legacy CodeLiving With Legacy Code
Living With Legacy Code
Rowan Merewood
 
Diving into HHVM Extensions (Brno PHP Conference 2015)
Diving into HHVM Extensions (Brno PHP Conference 2015)Diving into HHVM Extensions (Brno PHP Conference 2015)
Diving into HHVM Extensions (Brno PHP Conference 2015)
James Titcumb
 
The why and how of moving to php 8
The why and how of moving to php 8The why and how of moving to php 8
The why and how of moving to php 8
Wim Godden
 
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
Ontico
 
The bytecode mumbo-jumbo
The bytecode mumbo-jumboThe bytecode mumbo-jumbo
The bytecode mumbo-jumbo
Raimon Ràfols
 
Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
Samsung Open Source Group
 
CodePolitan Webinar: The Rise of PHP
CodePolitan Webinar: The Rise of PHPCodePolitan Webinar: The Rise of PHP
CodePolitan Webinar: The Rise of PHP
Steeven Salim
 
JIT compilation for CPython
JIT compilation for CPythonJIT compilation for CPython
JIT compilation for CPython
delimitry
 
Exakat for PHP : smart code reviewing engine
Exakat for PHP : smart code reviewing engineExakat for PHP : smart code reviewing engine
Exakat for PHP : smart code reviewing engine
Damien Seguy
 
Php7 HHVM and co
Php7 HHVM and coPhp7 HHVM and co
Php7 HHVM and co
weltling
 
What To Expect From PHP7
What To Expect From PHP7What To Expect From PHP7
What To Expect From PHP7
Codemotion
 

Similar to Just-In-Time Compiler in PHP 8 (20)

Php 7 hhvm and co
Php 7 hhvm and coPhp 7 hhvm and co
Php 7 hhvm and co
 
PHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP LimogesPHP in 2018 - Q4 - AFUP Limoges
PHP in 2018 - Q4 - AFUP Limoges
 
Cluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in CCluj.py Meetup: Extending Python in C
Cluj.py Meetup: Extending Python in C
 
掀起 Swift 的面紗
掀起 Swift 的面紗掀起 Swift 的面紗
掀起 Swift 的面紗
 
Anatomy of a PHP Request ( UTOSC 2010 )
Anatomy of a PHP Request ( UTOSC 2010 )Anatomy of a PHP Request ( UTOSC 2010 )
Anatomy of a PHP Request ( UTOSC 2010 )
 
The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6The why and how of moving to PHP 5.5/5.6
The why and how of moving to PHP 5.5/5.6
 
The bytecode hocus pocus - JavaOne 2016
The bytecode hocus pocus - JavaOne 2016The bytecode hocus pocus - JavaOne 2016
The bytecode hocus pocus - JavaOne 2016
 
The bytecode gobbledygook
The bytecode gobbledygookThe bytecode gobbledygook
The bytecode gobbledygook
 
Diving into HHVM Extensions (PHPNW Conference 2015)
Diving into HHVM Extensions (PHPNW Conference 2015)Diving into HHVM Extensions (PHPNW Conference 2015)
Diving into HHVM Extensions (PHPNW Conference 2015)
 
Living With Legacy Code
Living With Legacy CodeLiving With Legacy Code
Living With Legacy Code
 
Diving into HHVM Extensions (Brno PHP Conference 2015)
Diving into HHVM Extensions (Brno PHP Conference 2015)Diving into HHVM Extensions (Brno PHP Conference 2015)
Diving into HHVM Extensions (Brno PHP Conference 2015)
 
The why and how of moving to php 8
The why and how of moving to php 8The why and how of moving to php 8
The why and how of moving to php 8
 
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
HHVM: Efficient and Scalable PHP/Hack Execution / Guilherme Ottoni (Facebook)
 
The bytecode mumbo-jumbo
The bytecode mumbo-jumboThe bytecode mumbo-jumbo
The bytecode mumbo-jumbo
 
Boosting Developer Productivity with Clang
Boosting Developer Productivity with ClangBoosting Developer Productivity with Clang
Boosting Developer Productivity with Clang
 
CodePolitan Webinar: The Rise of PHP
CodePolitan Webinar: The Rise of PHPCodePolitan Webinar: The Rise of PHP
CodePolitan Webinar: The Rise of PHP
 
JIT compilation for CPython
JIT compilation for CPythonJIT compilation for CPython
JIT compilation for CPython
 
Exakat for PHP : smart code reviewing engine
Exakat for PHP : smart code reviewing engineExakat for PHP : smart code reviewing engine
Exakat for PHP : smart code reviewing engine
 
Php7 HHVM and co
Php7 HHVM and coPhp7 HHVM and co
Php7 HHVM and co
 
What To Expect From PHP7
What To Expect From PHP7What To Expect From PHP7
What To Expect From PHP7
 

More from Nikita Popov

A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
Nikita Popov
 
Opaque Pointers Are Coming
Opaque Pointers Are ComingOpaque Pointers Are Coming
Opaque Pointers Are Coming
Nikita Popov
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
Nikita Popov
 
PHP Performance Trivia
PHP Performance TriviaPHP Performance Trivia
PHP Performance Trivia
Nikita Popov
 
Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?
Nikita Popov
 
Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)
Nikita Popov
 
PHP Language Trivia
PHP Language TriviaPHP Language Trivia
PHP Language Trivia
Nikita Popov
 
PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)
Nikita Popov
 
PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)
Nikita Popov
 
PHP 7 – What changed internally?
PHP 7 – What changed internally?PHP 7 – What changed internally?
PHP 7 – What changed internally?
Nikita Popov
 

More from Nikita Popov (10)

A whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizerA whirlwind tour of the LLVM optimizer
A whirlwind tour of the LLVM optimizer
 
Opaque Pointers Are Coming
Opaque Pointers Are ComingOpaque Pointers Are Coming
Opaque Pointers Are Coming
 
What's new in PHP 8.0?
What's new in PHP 8.0?What's new in PHP 8.0?
What's new in PHP 8.0?
 
PHP Performance Trivia
PHP Performance TriviaPHP Performance Trivia
PHP Performance Trivia
 
Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?Typed Properties and more: What's coming in PHP 7.4?
Typed Properties and more: What's coming in PHP 7.4?
 
Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)Static Optimization of PHP bytecode (PHPSC 2017)
Static Optimization of PHP bytecode (PHPSC 2017)
 
PHP Language Trivia
PHP Language TriviaPHP Language Trivia
PHP Language Trivia
 
PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)PHP 7 – What changed internally? (Forum PHP 2015)
PHP 7 – What changed internally? (Forum PHP 2015)
 
PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)PHP 7 – What changed internally? (PHP Barcelona 2015)
PHP 7 – What changed internally? (PHP Barcelona 2015)
 
PHP 7 – What changed internally?
PHP 7 – What changed internally?PHP 7 – What changed internally?
PHP 7 – What changed internally?
 

Just-In-Time Compiler in PHP 8

  • 1. Just-In-Time Compiler in PHP 8 Nikita Popov @ betterCode PHP 8
  • 4. About Me ● Dmitry Stogov works on JIT ● I work on everything else :) ● My JIT involvement mostly QA
  • 6. Just-In-Time (JIT) Compiler PHP Code Opcodes Virtual Machine CPU Machine Code JIT
  • 7. History ● Old project started by Zend in PHP 5 times ● Mainly implemented by Dmitry Stogov
  • 9. History ● Early prototypes: The rest of PHP is too slow for it to matter – Too many allocations – Too much memory usage – Too much pointer chasing – Cache locality is key
  • 10. History ● Early prototypes: The rest of PHP is too slow for it to matter ● PHPNG (later: PHP 7) project started to optimize PHP ● Large performance improvements (2x), no JIT needed!
  • 12. History ● SSA and type inference from JIT integrated into opcache ● Used for opcode optimizations – Constant Propagation – Dead Code Elimination – Refcount Optimization
  • 14. Configuration ● Advanced configuration: – opcache.jit (CRTO) – opcache.jit_debug, opcache.jit_bisect_limit – opcache.jit_max_root_traces, opcache.jit_max_side_traces, opcache.jit_max_exit_counters – opcache.jit_hot_loop, opcache.jit_hot_func, opcache.jit_hot_return, opcache.jit_hot_side_exit – opcache.jit_blacklist_root_trace, opcache.jit_blacklist_side_trace – opcache.jit_max_loop_unrolls, opcache.jit_max_recursive_calls, opcache.jit_max_recursive_returns, opcache.jit_max_polymorphic_calls – https://www.php.net/manual/en/opcache.configuration.php
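To see which of these settings are actually in effect, a script can inspect OPcache's status at run time. The following is a minimal sketch, assuming the `jit` entry (with `enabled`, `on` and `buffer_size` keys) that PHP 8's `opcache_get_status()` reports; `describeJit()` is a hypothetical helper written for this illustration, not part of PHP.

```php
<?php
// Hypothetical helper: summarises the 'jit' section of an OPcache status array.
// Accepts false as well, because opcache_get_status() returns false when
// OPcache is not active for the current SAPI.
function describeJit($status): string
{
    if (!is_array($status) || !isset($status['jit'])) {
        return 'JIT status unavailable';
    }
    $jit = $status['jit'];
    if (empty($jit['enabled'])) {
        return 'JIT disabled';
    }
    $state = empty($jit['on']) ? 'off' : 'on';
    $size  = $jit['buffer_size'] ?? 0;
    return sprintf('JIT enabled (%s), buffer: %d bytes', $state, $size);
}

// Guard the call: the function only exists when the OPcache extension is loaded.
$status = function_exists('opcache_get_status') ? opcache_get_status(false) : false;
echo describeJit($status), PHP_EOL;
```

On the command line OPcache is usually off unless `opcache.enable_cli=1` is set, so the script may report that the status is unavailable there.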
  • 16-19. Performance (bar chart, x-axis 0 to 3.5, baseline: Opcache + No JIT). Benchmarks: bench.php, micro_bench.php, PHP-Parser, amphp, Symfony Demo, With Preloading
  • 20. Performance ● Heavily depends on workload ● Larger impact the more time is spent executing PHP code (rather than e.g. DB queries) ● More useful for "non-standard" applications
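Because the benefit depends so much on the workload, measuring your own code is the only reliable guide. The following is a rough sketch of a timing harness, not a rigorous benchmark; `cpuBoundWork()` and `bestTimeMs()` are hypothetical names chosen for this example.

```php
<?php
// Hypothetical CPU-bound workload: pure PHP arithmetic, no I/O.
function cpuBoundWork(int $n): int
{
    $acc = 0;
    for ($i = 0; $i < $n; $i++) {
        $acc = ($acc + $i * 7) % 1000003;
    }
    return $acc;
}

// Runs $work several times and returns the best wall-clock time in milliseconds.
function bestTimeMs(callable $work, int $runs = 5): float
{
    $best = INF;
    for ($r = 0; $r < $runs; $r++) {
        $start = hrtime(true);                       // nanoseconds, monotonic
        $work();
        $elapsed = (hrtime(true) - $start) / 1e6;    // convert to milliseconds
        $best = min($best, $elapsed);
    }
    return $best;
}

printf("best of 5: %.3f ms\n", bestTimeMs(fn() => cpuBoundWork(1000000)));
```

Running the same script once with the JIT buffer configured and once without gives a first impression; a workload dominated by database queries or other I/O should show far less difference than a loop like this one.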
  • 23. Function JIT ● Trigger: When to JIT – 0: All functions, on script load – 1: All functions, on first execution – 2: Profile first request, JIT hot functions – 3: Profile on the fly, JIT hot functions
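The trigger is one digit of the four-digit `opcache.jit` value (the "CRTO" form mentioned under Configuration). The sketch below assumes that T is the third digit of that string and decodes only the four trigger values listed on the slide; `describeTrigger()` and `JIT_TRIGGERS` are hypothetical names for this illustration.

```php
<?php
// Trigger descriptions as listed on the slide (values 0-3 only).
const JIT_TRIGGERS = [
    0 => 'All functions, on script load',
    1 => 'All functions, on first execution',
    2 => 'Profile first request, JIT hot functions',
    3 => 'Profile on the fly, JIT hot functions',
];

// Assumption: $crto is a four-digit string laid out as C, R, T, O.
function describeTrigger(string $crto): string
{
    if (!preg_match('/^\d{4}$/', $crto)) {
        throw new InvalidArgumentException('Expected a four-digit CRTO value');
    }
    $trigger = (int) $crto[2];   // third digit is T
    return JIT_TRIGGERS[$trigger] ?? 'Trigger not listed on the slide';
}

echo describeTrigger('1205'), PHP_EOL;   // T = 0
echo describeTrigger('1235'), PHP_EOL;   // T = 3
```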
  • 24. <?php
        function sum(int $n) {
            $sum = 0;
            for ($i = 0; $i < $n; $i++) {
                $sum += $i;
            }
            return $sum;
        }
  • 25. <?php
        function sum(int $n) {
        entry:
            $sum = 0;
            $i = 0;
            goto cond;
        loop:
            $sum += $i;
            $i++;
        cond:
            if ($i < $n) goto loop;
        finish:
            return $sum;
        }
  • 26. <?php
        function sum(int $n) {
        entry:
            $sum_0 = 0;
            $i_0 = 0;
            goto cond;
        loop:
            $sum_2 = $sum_1 + $i_1;
            $i_2 = $i_1 + 1;
        cond:
            $sum_1 = phi(entry: $sum_0, loop: $sum_2);
            $i_1 = phi(entry: $i_0, loop: $i_2);
            if ($i_1 < $n) goto loop;
        finish:
            return $sum_1;
        }
  • 27. <?php
        function sum(int $n) {
        entry:
            $sum_0 = 0;                                  # int
            $i_0 = 0;                                    # int
            goto cond;
        loop:
            $sum_2 = $sum_1 + $i_1;                      # int|float
            $i_2 = $i_1 + 1;                             # int
        cond:
            $sum_1 = phi(entry: $sum_0, loop: $sum_2);   # int|float
            $i_1 = phi(entry: $i_0, loop: $i_2);         # int
            if ($i_1 < $n) goto loop;
        finish:
            return $sum_1;
        }
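The `int|float` annotation on `$sum` follows from PHP's integer semantics: an `int` addition that overflows yields a `float` instead of wrapping around. A small sketch that shows both cases, reusing the `sum()` function from the slides:

```php
<?php
function sum(int $n) {
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}

// In the common case the result stays an integer.
var_dump(sum(10));                      // int(45)

// Adding past PHP_INT_MAX silently switches the result to float,
// which is why static type inference cannot rule out float for $sum.
var_dump(is_int(PHP_INT_MAX));          // bool(true)
var_dump(is_float(PHP_INT_MAX + 1));    // bool(true)
```

`$i`, by contrast, is inferred as plain `int`: it is only incremented while `$i < $n` holds and `$n` is itself an `int`, so the increment cannot overflow.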
  • 28-34. Generated code (annotations from slides 29-34; %r14 is the frame pointer):
            ...
        .L2:
            mov $0x0, 0x60(%r14)     # Assign int(0) to $sum
            mov $0x4, 0x68(%r14)
            xor %rdx, %rdx           # Assign 0 to $i (in register)
            jmp .L5
        .L3:
            mov %rsi, 0x50(%r14)
            mov $0x4, 0x58(%r14)
            cmp $0x4, 0x68(%r14)     # Check whether $sum is int
            jnz .L10
            mov 0x60(%r14), %rax     # Load $sum to register
            add %rdx, %rax           # Add $sum and $i
            jo .L9                   # Check if addition overflowed
            mov %rax, 0x60(%r14)     # Write result back
        .L4:
            add $0x1, %rdx           # Increment $i (in register)
        .L5:
            ...
  • 35-39. Generated code, continued (annotations from the slides):
            ...
        .L9:
            vxorps %xmm0, %xmm0, %xmm0
            vcvtsi2sd 0x60(%r14), %xmm0, %xmm0   # Convert $sum to float
            vxorps %xmm1, %xmm1, %xmm1
            vcvtsi2sd %rdx, %xmm1, %xmm1         # Convert $i to float
            vaddsd %xmm1, %xmm0, %xmm0           # Add $sum and $i as floats
            vmovsd %xmm0, 0x60(%r14)
            mov $0x5, 0x68(%r14)                 # Mark $sum slot as float
            jmp .L4
        .L10:
            vxorps %xmm0, %xmm0, %xmm0
            vcvtsi2sd %rdx, %xmm0, %xmm0         # Convert $i to float
            vaddsd 0x60(%r14), %xmm0, %xmm0      # Add (float)$i to $sum
            vmovsd %xmm0, 0x60(%r14)
            jmp .L4
        .L11:
            ...
        This code is almost certainly unused! Can't store $sum in register, because it might turn float
  • 41-44. Tracing JIT (diagram): VM Execution + Profiling; when hot: Trace Collection; Trace Compilation; Trace Execution; Deoptimization
  • 45-49. sum() in goto form (as on slide 25), with the trace built up step by step:
        trace:
            if ($i < $n)
            $sum += $i;
            $i++;
            goto trace;
  • 50. The trace in SSA form:
            $sum_0 = ...;
            $i_0 = ...;
        trace:
            $sum_1 = phi($sum_0, $sum_2);
            $i_1 = phi($i_0, $i_2);
            if ($i_1 < $n)
            $sum_2 = $sum_1 + $i_1;
            $i_2 = $i_1 + 1;
            goto trace;
  • 51. The trace with the annotations from the slide:
            $sum_0 = ...;                  # int
            $i_0 = ...;
        trace:
            $sum_1 = phi($sum_0, $sum_2);
            $i_1 = phi($i_0, $i_2);
            if ($i_1 < $n)                 # does not exit
            $sum_2 = $sum_1 + $i_1;        # int
            $i_2 = $i_1 + 1;
            goto trace;
  • 52-60. Trace 1 (annotations from slides 53-60):
            sub $0x10, %rsp
            mov $EG(jit_trace_num), %rax
            mov $0x1, (%rax)
            cmp $0x4, 0x68(%r14)            # Check if $sum is int (exit 0)
            jnz jit$$trace_exit_0
            mov 0x50(%r14), %rcx            # Load $n, $sum, $i into registers
            mov 0x60(%r14), %rdx
            mov 0x70(%r14), %rsi
        .L1:
            cmp %rcx, %rsi                  # Check $i < $n (exit 1)
            jge jit$$trace_exit_1
            mov %rdx, %rax                  # $sum += $i, check overflow (exit 2)
            add %rsi, %rax
            jo jit$$trace_exit_2
            mov %rax, %rdx
            add $0x1, %rsi                  # $i++
            mov $EG(vm_interrupt), %rax     # Check VM interrupt, like timeout (exit 3)
            cmp $0x0, (%rax)
            jz .L1
            jmp jit$$trace_exit_3
        Exits go to VM or side traces
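The guard-and-exit structure of this trace can be mimicked in plain PHP. This is only a simulation under stated assumptions, not how the JIT is implemented: `runTrace()` and the `EXIT_*` constants are hypothetical names that mirror the four `jit$$trace_exit_N` labels in the listing.

```php
<?php
// Hypothetical exit codes mirroring jit$$trace_exit_0..3 in the listing.
const EXIT_SUM_NOT_INT = 0;
const EXIT_LOOP_DONE   = 1;
const EXIT_OVERFLOW    = 2;
const EXIT_INTERRUPT   = 3;

// Simulates the compiled loop trace: it assumes an int $sum and leaves
// through an exit as soon as one of its guards fails.
// Returns [exit code, $sum, $i] so a caller could resume elsewhere.
function runTrace(int $n, $sum, int $i, callable $interrupted): array
{
    if (!is_int($sum)) {                     // guard: $sum is int (exit 0)
        return [EXIT_SUM_NOT_INT, $sum, $i];
    }
    while (true) {
        if ($i >= $n) {                      // guard: $i < $n (exit 1)
            return [EXIT_LOOP_DONE, $sum, $i];
        }
        $next = $sum + $i;
        if (!is_int($next)) {                // overflow turned the result into float (exit 2)
            return [EXIT_OVERFLOW, $sum, $i];
        }
        $sum = $next;
        $i++;
        if ($interrupted()) {                // VM interrupt, e.g. timeout (exit 3)
            return [EXIT_INTERRUPT, $sum, $i];
        }
    }
}

var_dump(runTrace(5, 0, 0, fn() => false) === [EXIT_LOOP_DONE, 10, 5]);   // bool(true)
```

In this sketch the overflow exit hands back the state from before the failed addition, so whatever takes over (the interpreter, or a side trace specialised for a float `$sum`) can redo that step with the right types.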
  • 61-64. Trace 2 (annotations from slides 62-63):
        TRACE-2$sum$5:
            mov $EG(jit_trace_num), %rax
            mov $0x2, (%rax)
            mov 0x70(%r14), %rax
            cmp 0x50(%r14), %rax
            jge jit$$trace_exit_0
            cmp $0x5, 0x68(%r14)                 # Check if $sum is float
            jnz jit$$trace_exit_1
            vxorps %xmm0, %xmm0, %xmm0           # $sum += (float) $i
            vcvtsi2sd %rax, %xmm0, %xmm0
            vaddsd 0x60(%r14), %xmm0, %xmm0
            vmovsd %xmm0, 0x60(%r14)
            add $0x1, 0x70(%r14)
            mov $EG(vm_interrupt), %rax
            cmp $0x0, (%rax)
            jz TRACE-1$sum$5+4
            jmp jit$$trace_exit_2
  • 65. Trace 1 listing (as on slide 52), annotated: Trace 2
  • 67. Interception ● Each opcode stores a "VM handler" pointer ● Replace handler at function entry, loop headers, returns ● Handler counts executions and invokes JIT
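The counting step can be pictured with a small sketch. It is an illustration only: `HotCounter`, its threshold and the location strings are hypothetical, and the real handlers live in the VM rather than in PHP code.

```php
<?php
// Hypothetical model of a counting handler: each instrumented location
// (function entry, loop header, return) bumps a counter, and the JIT is
// invoked once when the counter reaches the threshold.
final class HotCounter
{
    /** @var array<string,int> executions seen per location */
    private array $counts = [];

    public function __construct(private int $threshold)
    {
    }

    // Returns true exactly once per location: on the hit that makes it "hot".
    public function hit(string $location): bool
    {
        $this->counts[$location] = ($this->counts[$location] ?? 0) + 1;
        return $this->counts[$location] === $this->threshold;
    }
}

$counter = new HotCounter(3);   // threshold chosen arbitrarily for the demo
foreach (range(1, 4) as $iteration) {
    if ($counter->hit('sum():loop')) {
        echo "iteration $iteration: location is hot, start trace collection", PHP_EOL;
    }
}
```

The `opcache.jit_hot_loop`, `opcache.jit_hot_func` and `opcache.jit_hot_return` settings listed under Configuration presumably play the role of the threshold here, one per kind of location.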
  • 69. Trace Collection ● Separate VM that collects type info while executing ● Traces can span different loops and functions – Calls effectively get "inlined"
  • 72. Code Generation ● Early prototypes used LLVM – Architecture agnostic – Supports many sophisticated optimizations – But: Extremely slow compile-times ● Now using DynASM from the LuaJIT project – Very fast – But: Architecture specific
  • 73-74. DynASM source (lines starting with "||" are C code, lines starting with "|" are x86 assembly with placeholders):
        |.macro LONG_MATH_REG, opcode, dst_reg, src_reg
        ||  switch (opcode) {
        ||      case ZEND_ADD:
        |           add dst_reg, src_reg
        ||          break;
        ||      case ZEND_SUB:
        |           sub dst_reg, src_reg
        ||          break;
        ||      case ZEND_MUL:
        |           imul dst_reg, src_reg
        ||          break;
        ||      case ZEND_BW_OR:
        |           or dst_reg, src_reg
        ||          break;
        ||      case ZEND_BW_AND:
        |           and dst_reg, src_reg
        ||          break;
                ...
        ||  }
        |.endmacro
  • 75. Code Generation ● DynASM itself supports many architectures ● But JIT code has to be written for each ● No support for M1 at this time, sorry!
  • 77. Closing Thoughts ● Performance benefit workload dependent – Try it! ● Room for improvement – E.g. optimizations (loop invariant code motion, etc.)
  • 79. Closing Thoughts ● Concern: Stability – Increased potential for hard to debug, hard to reproduce bugs ● Concern: Maintenance – Only one person really understands the JIT