Just-In-Time Compiler in PHP 8
Nikita Popov @ betterCode PHP 8
About Me
● Dmitry Stogov works on JIT
● I work on everything else :)
● My JIT involvement mostly QA
Just-In-Time (JIT) Compiler
[Diagram: PHP Code → Opcodes → Virtual Machine → CPU; with JIT: Opcodes → JIT → Machine Code → CPU]
History
● Old project started by Zend in PHP 5 times
● Mainly implemented by Dmitry Stogov
History
● Early prototypes: The rest of PHP is too slow for it to matter
– Too many allocations
– Too much memory usage
– Too much pointer chasing
– Cache locality is key
● PHPNG (later: PHP 7) project started to optimize PHP
● Large performance improvements (2x), no JIT needed!
History
● SSA and type inference from JIT integrated into opcache
● Used for opcode optimizations
– Constant Propagation
– Dead Code Elimination
– Refcount Optimization
Configuration
● Enable opcache
● opcache.jit_buffer_size=128M
● Done!
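To confirm that the JIT is actually active after changing the configuration, the opcache status can be inspected at runtime. The following is a minimal sketch, assuming the `jit` entry that `opcache_get_status()` reports in PHP 8; the helper name `jitStatus()` and the shape of its return value are made up for this example.

```php
<?php
// Hypothetical helper: summarizes whether the JIT is available and enabled.
function jitStatus(): array
{
    $unavailable = ['available' => false, 'enabled' => false, 'buffer_size' => 0];

    // The opcache extension may not be loaded at all.
    if (!function_exists('opcache_get_status')) {
        return $unavailable;
    }
    // false skips per-script statistics. The call returns false when opcache
    // is disabled (for example on the CLI without opcache.enable_cli=1).
    $status = @opcache_get_status(false);
    $jit = is_array($status) ? ($status['jit'] ?? null) : null;
    if (!is_array($jit)) {
        return $unavailable;
    }
    return [
        'available'   => true,
        'enabled'     => (bool) ($jit['enabled'] ?? false),
        'buffer_size' => (int) ($jit['buffer_size'] ?? 0),
    ];
}

print_r(jitStatus());
```

If `enabled` is false even though opcache is loaded, a likely cause is that `opcache.jit_buffer_size` is still at its default of 0.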
Configuration
● Advanced configuration:
– opcache.jit (CRTO)
– opcache.jit_debug, opcache.jit_bisect_limit
– opcache.jit_max_root_traces, opcache.jit_max_side_traces, opcache.jit_max_exit_counters
– opcache.jit_hot_loop, opcache.jit_hot_func, opcache.jit_hot_return, opcache.jit_hot_side_exit
– opcache.jit_blacklist_root_trace, opcache.jit_blacklist_side_trace
– opcache.jit_max_loop_unrolls, opcache.jit_max_recursive_calls, opcache.jit_max_recursive_returns, opcache.jit_max_polymorphic_calls
– https://www.php.net/manual/en/opcache.configuration.php
Performance
[Bar chart: bench.php, micro_bench.php, PHP-Parser, amphp, Symfony Demo, With Preloading; axis from 0 to 3.5; baseline: Opcache + No JIT]
Performance
● Heavily depends on workload
● Larger impact the more time is spent executing PHP code (rather than e.g. DB queries)
● More useful for "non-standard" applications
Function JIT
● opcache.jit=function
● Always JITs a whole function
Function JIT
[Diagram: PHP Code → Opcodes → Virtual Machine → CPU; a Trigger decides when Opcodes → JIT → Machine Code → CPU]
Function JIT
● Trigger: When to JIT
– 0: All functions, on script load
– 1: All functions, on first execution
– 2: Profile first request, JIT hot functions
– 3: Profile on the fly, JIT hot functions
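The trigger is the third digit of the four-digit `opcache.jit` value (CRTO). As a rough sketch, the digits can be split apart as shown here. The reading of C, R and O as CPU-specific flags, register allocation and optimization level follows the opcache configuration manual as far as I recall it, so treat those names as assumptions; the function name `decodeJitSetting()` and the sample value `1205` are purely illustrative.

```php
<?php
// Hypothetical helper: splits a four-digit opcache.jit value into C, R, T, O.
function decodeJitSetting(string $crto): array
{
    if (!preg_match('/^\d{4}$/', $crto)) {
        throw new InvalidArgumentException("Expected four digits, got '$crto'");
    }
    // Trigger descriptions as listed on the slide (values 0 to 3 only).
    $triggers = [
        0 => 'All functions, on script load',
        1 => 'All functions, on first execution',
        2 => 'Profile first request, JIT hot functions',
        3 => 'Profile on the fly, JIT hot functions',
    ];
    $t = (int) $crto[2];
    return [
        'cpu_flags'    => (int) $crto[0], // C (assumed meaning)
        'reg_alloc'    => (int) $crto[1], // R (assumed meaning)
        'trigger'      => $t,             // T
        'opt_level'    => (int) $crto[3], // O (assumed meaning)
        'trigger_desc' => $triggers[$t] ?? 'not covered by the slide',
    ];
}

print_r(decodeJitSetting('1205'));
```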
<?php
function sum(int $n) {
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}
<?php
function sum(int $n) {
entry:
    $sum = 0;
    $i = 0;
    goto cond;
loop:
    $sum += $i;
    $i++;
cond:
    if ($i < $n) goto loop;
finish:
    return $sum;
}
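Because PHP supports `goto` and labels, the lowered control-flow form is itself valid PHP and can be checked against the structured loop. A small sketch follows; the names `sumLoop()` and `sumGoto()` are introduced here only to keep both variants side by side.

```php
<?php
// Structured version of the function.
function sumLoop(int $n) {
    $sum = 0;
    for ($i = 0; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}

// Lowered version: the same control flow expressed with labels and jumps.
function sumGoto(int $n) {
entry:
    $sum = 0;
    $i = 0;
    goto cond;
loop:
    $sum += $i;
    $i++;
cond:
    if ($i < $n) goto loop;
finish:
    return $sum;
}

// Both variants should agree for every input.
foreach ([0, 1, 5, 100] as $n) {
    var_dump(sumLoop($n) === sumGoto($n));
}
```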
<?php
function sum(int $n) {
entry:
    $sum_0 = 0;
    $i_0 = 0;
    goto cond;
loop:
    $sum_2 = $sum_1 + $i_1;
    $i_2 = $i_1 + 1;
cond:
    $sum_1 = phi(entry: $sum_0, loop: $sum_2);
    $i_1 = phi(entry: $i_0, loop: $i_2);
    if ($i_1 < $n) goto loop;
finish:
    return $sum_1;
}
<?php
function sum(int $n) {
entry:
    $sum_0 = 0; # int
    $i_0 = 0; # int
    goto cond;
loop:
    $sum_2 = $sum_1 + $i_1; # int|float
    $i_2 = $i_1 + 1; # int
cond:
    $sum_1 = phi(entry: $sum_0, loop: $sum_2); # int|float
    $i_1 = phi(entry: $i_0, loop: $i_2); # int
    if ($i_1 < $n) goto loop;
finish:
    return $sum_1;
}
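The `int|float` annotation on `$sum_2` reflects PHP's arithmetic semantics: adding two integers yields a float when the result no longer fits into an integer. The snippet below demonstrates just that rule. That `$i_2` stays `int` is presumably because the loop condition keeps it below `$n`; the slide does not spell this out, so read it as an assumption.

```php
<?php
// Integer addition stays int while the result fits the native integer range...
$small = 1 + 2;
var_dump(is_int($small));

// ...but silently becomes float on overflow, hence int|float for $sum.
$overflowed = PHP_INT_MAX + 1;
var_dump(is_float($overflowed));
```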
# %r14: Frame pointer
...
.L2:
    mov $0x0, 0x60(%r14)      # Assign int(0) to $sum
    mov $0x4, 0x68(%r14)
    xor %rdx, %rdx            # Assign 0 to $i (in register)
    jmp .L5
.L3:
    mov %rsi, 0x50(%r14)
    mov $0x4, 0x58(%r14)
    cmp $0x4, 0x68(%r14)      # Check whether $sum is int
    jnz .L10
    mov 0x60(%r14), %rax      # Load $sum to register
    add %rdx, %rax            # Add $sum and $i
    jo .L9                    # Check if addition overflowed
    mov %rax, 0x60(%r14)      # Write result back
.L4:
    add $0x1, %rdx            # Increment $i (in register)
.L5:
...
...
.L9:
    vxorps %xmm0, %xmm0, %xmm0
    vcvtsi2sd 0x60(%r14), %xmm0, %xmm0   # Convert $sum to float
    vxorps %xmm1, %xmm1, %xmm1
    vcvtsi2sd %rdx, %xmm1, %xmm1         # Convert $i to float
    vaddsd %xmm1, %xmm0, %xmm0           # Add $sum and $i as floats
    vmovsd %xmm0, 0x60(%r14)
    mov $0x5, 0x68(%r14)                 # Mark $sum slot as float
    jmp .L4
.L10:
    vxorps %xmm0, %xmm0, %xmm0
    vcvtsi2sd %rdx, %xmm0, %xmm0         # Convert $i to float
    vaddsd 0x60(%r14), %xmm0, %xmm0      # Add (float)$i to $sum
    vmovsd %xmm0, 0x60(%r14)
    jmp .L4
.L11:
...
This code is almost certainly unused!
Can't store $sum in register, because it might turn float
Tracing JIT
[Diagram: VM Execution + Profiling → (Hot) → Trace Collection → Trace Compilation → Trace Execution → (Deoptimization) → VM Execution + Profiling]
<?php
function sum(int $n) {
entry:
    $sum = 0;
    $i = 0;
    goto cond;
loop:
    $sum += $i;
    $i++;
cond:
    if ($i < $n) goto loop;
finish:
    return $sum;
}
<?php
trace:
    if ($i < $n)
    $sum += $i;
    $i++;
    goto trace;
<?php
$sum_0 = ...;
$i_0 = ...;
trace:
    $sum_1 = phi($sum_0, $sum_2);
    $i_1 = phi($i_0, $i_2);
    if ($i_1 < $n)
    $sum_2 = $sum_1 + $i_1;
    $i_2 = $i_1 + 1;
    goto trace;
<?php
$sum_0 = ...; # int
$i_0 = ...;
trace:
    $sum_1 = phi($sum_0, $sum_2);
    $i_1 = phi($i_0, $i_2);
    if ($i_1 < $n) # does not exit
    $sum_2 = $sum_1 + $i_1; # int
    $i_2 = $i_1 + 1;
    goto trace;
    sub $0x10, %rsp
    mov $EG(jit_trace_num), %rax
    mov $0x1, (%rax)
    cmp $0x4, 0x68(%r14)          # Check if $sum is int (exit 0)
    jnz jit$$trace_exit_0
    mov 0x50(%r14), %rcx          # Load $n, $sum, $i into registers
    mov 0x60(%r14), %rdx
    mov 0x70(%r14), %rsi
.L1:
    cmp %rcx, %rsi                # Check $i < $n (exit 1)
    jge jit$$trace_exit_1
    mov %rdx, %rax                # $sum += $i, check overflow (exit 2)
    add %rsi, %rax
    jo jit$$trace_exit_2
    mov %rax, %rdx
    add $0x1, %rsi                # $i++
    mov $EG(vm_interrupt), %rax   # Check VM interrupt, like timeout (exit 3)
    cmp $0x0, (%rax)
    jz .L1
    jmp jit$$trace_exit_3
Exits go to VM or side traces
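The guard-and-exit structure of the trace can be modeled in plain PHP. This is only a toy sketch of the idea, not how the JIT is implemented: the function name `runTrace()` is hypothetical, the exit numbers mirror the annotations on the listing, and the VM interrupt check (exit 3) is left out.

```php
<?php
// Toy model of the compiled trace: runs the loop body while all guards hold
// and returns the number of the exit that was taken.
function runTrace(&$sum, int &$i, int $n): int
{
    if (!is_int($sum)) {
        return 0;              // exit 0: $sum is not an int on entry
    }
    while (true) {
        if ($i >= $n) {
            return 1;          // exit 1: loop condition failed
        }
        $result = $sum + $i;
        if (!is_int($result)) {
            return 2;          // exit 2: addition overflowed, $sum left untouched
        }
        $sum = $result;
        $i++;
    }
}

// Normal run: leaves through exit 1 once $i reaches $n.
$sum = 0; $i = 0;
var_dump(runTrace($sum, $i, 5), $sum);

// Overflow on the first addition: leaves through exit 2.
$sum = PHP_INT_MAX; $i = 1;
var_dump(runTrace($sum, $i, 5));
```

After an exit, the caller still holds a consistent `$sum` and `$i`, which is what lets execution continue elsewhere (in the VM or in a side trace) from the same state.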
Trace 2
TRACE-2$sum$5:
    mov $EG(jit_trace_num), %rax
    mov $0x2, (%rax)
    mov 0x70(%r14), %rax
    cmp 0x50(%r14), %rax
    jge jit$$trace_exit_0
    cmp $0x5, 0x68(%r14)              # Check if $sum is float
    jnz jit$$trace_exit_1
    vxorps %xmm0, %xmm0, %xmm0        # $sum += (float) $i
    vcvtsi2sd %rax, %xmm0, %xmm0
    vaddsd 0x60(%r14), %xmm0, %xmm0
    vmovsd %xmm0, 0x60(%r14)
    add $0x1, 0x70(%r14)
    mov $EG(vm_interrupt), %rax
    cmp $0x0, (%rax)
    jz TRACE-1$sum$5+4
    jmp jit$$trace_exit_2
Interception
● Each opcode stores a "VM handler" pointer
● Replace handler at function entry, loop headers, returns
● Handler counts executions and invokes JIT
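The counting idea can be sketched in userland PHP. This is only an analogy under stated assumptions: the class `HotCounter`, its threshold and the two closures are invented for illustration, and the real JIT swaps opcode handlers inside the engine rather than wrapping callables.

```php
<?php
// Toy model of a counting handler; class name and threshold are hypothetical.
final class HotCounter
{
    private int $count = 0;
    private $compiled = null;

    public function __construct(
        private \Closure $interpret,   // slow path, stands in for the VM handler
        private \Closure $compile,     // returns a faster callable, stands in for the JIT
        private int $threshold = 3
    ) {}

    public function __invoke(...$args)
    {
        // Once compiled, always take the fast path.
        if ($this->compiled !== null) {
            return ($this->compiled)(...$args);
        }
        // Count executions; compile when the code becomes hot.
        if (++$this->count >= $this->threshold) {
            $this->compiled = ($this->compile)();
            return ($this->compiled)(...$args);
        }
        return ($this->interpret)(...$args);
    }

    public function isCompiled(): bool
    {
        return $this->compiled !== null;
    }
}

$double = new HotCounter(
    fn(int $n) => $n * 2,            // "interpreted" version
    fn() => fn(int $n) => $n << 1    // "compiled" version with the same result
);

for ($call = 1; $call <= 4; $call++) {
    $double(21);
    var_dump($double->isCompiled());
}
```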
Trace Collection
● Separate VM that collects type info while executing
● Traces can span different loops and functions
– Calls effectively get "inlined"
Code Generation
● Early prototypes used LLVM
– Architecture agnostic
– Supports many sophisticated optimizations
– But: Extremely slow compile-times
● Now using DynASM from the LuaJIT project
– Very fast
– But: Architecture specific
|.macro LONG_MATH_REG, opcode, dst_reg, src_reg
||  switch (opcode) {
||      case ZEND_ADD:
|           add dst_reg, src_reg
||          break;
||      case ZEND_SUB:
|           sub dst_reg, src_reg
||          break;
||      case ZEND_MUL:
|           imul dst_reg, src_reg
||          break;
||      case ZEND_BW_OR:
|           or dst_reg, src_reg
||          break;
||      case ZEND_BW_AND:
|           and dst_reg, src_reg
||          break;
...
||  }
|.endmacro
Lines starting with "||": C code
Lines starting with "|": X86 Assembly with placeholders
Code Generation
● DynASM itself supports many architectures
● But JIT code has to be written for each
● No support for M1 at this time, sorry!
Closing Thoughts
● Performance benefit workload dependent
– Try it!
● Room for improvement
– E.g. optimizations (loop invariant code motion, etc.)
Closing Thoughts
● Concern: Stability
– Increased potential for hard to debug, hard to reproduce bugs
● Concern: Maintenance
– Only one person really understands the JIT
Thank You!

 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 

Recently uploaded (20)

Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 

Just-In-Time Compiler in PHP 8

  • 1. Just-In-Time Compiler in PHP 8 Nikita Popov @ betterCode PHP 8
  • 3-4. About Me
      ● Dmitry Stogov works on JIT
      ● I work on everything else :)
      ● My JIT involvement mostly QA
  • 5-6. Just-In-Time (JIT) Compiler
      [Diagram: PHP Code → Opcodes → Virtual Machine → CPU; with the JIT: Opcodes → JIT → Machine Code → CPU]
  • 7. History
      ● Old project started by Zend in PHP 5 times
      ● Mainly implemented by Dmitry Stogov
  • 8-10. History
      ● Early prototypes: The rest of PHP is too slow for it to matter
          – Too many allocations
          – Too much memory usage
          – Too much pointer chasing
          – Cache locality is key
      ● PHPNG (later: PHP 7) project started to optimize PHP
      ● Large performance improvements (2x), no JIT needed!
  • 11-12. History
      ● SSA and type inference from JIT integrated into opcache
      ● Used for opcode optimizations
          – Constant Propagation
          – Dead Code Elimination
          – Refcount Optimization
  • 14. Configuration
      ● Advanced configuration:
          – opcache.jit (CRTO)
          – opcache.jit_debug, opcache.jit_bisect_limit
          – opcache.jit_max_root_traces, opcache.jit_max_side_traces, opcache.jit_max_exit_counters
          – opcache.jit_hot_loop, opcache.jit_hot_func, opcache.jit_hot_return, opcache.jit_hot_side_exit
          – opcache.jit_blacklist_root_trace, opcache.jit_blacklist_side_trace
          – opcache.jit_max_loop_unrolls, opcache.jit_max_recursive_calls, opcache.jit_max_recursive_returns, opcache.jit_max_polymorphic_calls
          – https://www.php.net/manual/en/opcache.configuration.php
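To see which of these settings are in effect on a given installation, something like the following could be used. This is only a sketch: the function name `jitSettings` is made up for illustration, and it reads just a handful of the directives listed above through the standard `ini_get()` call, which returns `false` when a directive is unknown (for example when opcache is not loaded).

```php
<?php
declare(strict_types=1);

/**
 * Collect a few JIT-related ini values.
 * Hypothetical helper; the list of directives is a small, arbitrary selection.
 */
function jitSettings(): array
{
    $names = [
        'opcache.enable',
        'opcache.enable_cli',
        'opcache.jit',
        'opcache.jit_buffer_size',
    ];

    $settings = [];
    foreach ($names as $name) {
        // ini_get() yields a string, or false if the directive does not exist.
        $settings[$name] = ini_get($name);
    }
    return $settings;
}

foreach (jitSettings() as $name => $value) {
    echo $name, ' = ', $value === false ? '(not available)' : var_export($value, true), PHP_EOL;
}
```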
  • 16-19. Performance
      [Bar chart, values not recoverable. Baseline: Opcache + No JIT. Horizontal axis from 0 to 3.5. Bars: bench.php, micro_bench.php, PHP-Parser, amphp, Symfony Demo, With Preloading]
  • 20. Performance
      ● Heavily depends on workload
      ● Larger impact the more time is spent executing PHP code (rather than e.g. DB queries)
      ● More useful for "non-standard" applications
  • 23. Function JIT
      ● Trigger: When to JIT
          – 0: All functions, on script load
          – 1: All functions, on first execution
          – 2: Profile first request, JIT hot functions
          – 3: Profile on the fly, JIT hot functions
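The trigger is one digit of the four-digit numeric `opcache.jit` value (the "CRTO" from the configuration slide). As a rough sketch, assuming the digit order described in the PHP manual (C = CPU-specific optimization flags, R = register allocation, T = trigger, O = optimization level), such a value can be taken apart as follows. The helper `decodeJitFlags` and its array keys are invented for this example and are not part of opcache.

```php
<?php
declare(strict_types=1);

/**
 * Split a numeric opcache.jit value (CRTO) into its four digits.
 * Hypothetical helper for illustration only.
 */
function decodeJitFlags(int $crto): array
{
    if ($crto < 0 || $crto > 9999) {
        throw new InvalidArgumentException('Expected a value with at most four digits');
    }

    return [
        'cpu_flags'      => intdiv($crto, 1000) % 10, // C
        'register_alloc' => intdiv($crto, 100) % 10,  // R
        'trigger'        => intdiv($crto, 10) % 10,   // T
        'opt_level'      => $crto % 10,               // O
    ];
}

// Example value whose trigger digit is 0 ("all functions, on script load").
print_r(decodeJitFlags(1205));
```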
  • 24.
      <?php
      function sum(int $n) {
          $sum = 0;
          for ($i = 0; $i < $n; $i++) {
              $sum += $i;
          }
          return $sum;
      }
  • 25.
      <?php
      function sum(int $n) {
      entry:
          $sum = 0;
          $i = 0;
          goto cond;
      loop:
          $sum += $i;
          $i++;
      cond:
          if ($i < $n) goto loop;
      finish:
          return $sum;
      }
  • 26-27. (SSA form; slide 27 adds the inferred types as comments)
      <?php
      function sum(int $n) {
      entry:
          $sum_0 = 0;                                   # int
          $i_0 = 0;                                     # int
          goto cond;
      loop:
          $sum_2 = $sum_1 + $i_1;                       # int|float
          $i_2 = $i_1 + 1;                              # int
      cond:
          $sum_1 = phi(entry: $sum_0, loop: $sum_2);    # int|float
          $i_1 = phi(entry: $i_0, loop: $i_2);          # int
          if ($i_1 < $n) goto loop;
      finish:
          return $sum_1;
      }
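The `int|float` type inferred for `$sum` comes from PHP's integer overflow behaviour: adding two ints yields a float when the result does not fit. A small sketch of that behaviour (the function name `addLikeLoop` is made up for the example):

```php
<?php
declare(strict_types=1);

/** Same operation as "$sum + $i" in the loop body; the result may be int or float. */
function addLikeLoop(int $sum, int $i): int|float
{
    return $sum + $i;
}

var_dump(addLikeLoop(1, 2));            // stays an int
var_dump(addLikeLoop(PHP_INT_MAX, 1));  // overflows, so PHP returns a float
```

This is why type inference alone cannot treat `$sum` as a plain integer.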
  • 28-34. (Function JIT output; the annotations from the individual slides are shown as comments)
      ...
      .L2:
          mov $0x0, 0x60(%r14)       # Assign int(0) to $sum (%r14: frame pointer)
          mov $0x4, 0x68(%r14)
          xor %rdx, %rdx             # Assign 0 to $i (in register)
          jmp .L5
      .L3:
          mov %rsi, 0x50(%r14)
          mov $0x4, 0x58(%r14)
          cmp $0x4, 0x68(%r14)       # Check whether $sum is int
          jnz .L10
          mov 0x60(%r14), %rax       # Load $sum to register
          add %rdx, %rax             # Add $sum and $i
          jo .L9                     # Check if addition overflowed
          mov %rax, 0x60(%r14)       # Write result back
      .L4:
          add $0x1, %rdx             # Increment $i (in register)
      .L5:
          ...
  • 35-39. (Function JIT output, slow paths; the annotations from the individual slides are shown as comments)
      ...
      .L9:
          vxorps %xmm0, %xmm0, %xmm0
          vcvtsi2sd 0x60(%r14), %xmm0, %xmm0    # Convert $sum to float
          vxorps %xmm1, %xmm1, %xmm1
          vcvtsi2sd %rdx, %xmm1, %xmm1          # Convert $i to float
          vaddsd %xmm1, %xmm0, %xmm0            # Add $sum and $i as floats
          vmovsd %xmm0, 0x60(%r14)
          mov $0x5, 0x68(%r14)                  # Mark $sum slot as float
          jmp .L4
      .L10:
          vxorps %xmm0, %xmm0, %xmm0
          vcvtsi2sd %rdx, %xmm0, %xmm0          # Convert $i to float
          vaddsd 0x60(%r14), %xmm0, %xmm0       # Add (float)$i to $sum
          vmovsd %xmm0, 0x60(%r14)
          jmp .L4
      .L11:
          ...
      ● This code is almost certainly unused!
      ● Can't store $sum in register, because it might turn float
  • 41-44. Tracing JIT
      [Diagram: VM Execution + Profiling →(Hot)→ Trace Collection → Trace Compilation → Trace Execution →(Deoptimization)→ VM Execution]
  • 45-49. (Left column repeats the goto form of sum() from slide 25; right column builds up the trace)
      <?php
      trace:
          if ($i < $n)
          $sum += $i;
          $i++;
          goto trace;
  • 50-51. (The same trace in SSA form; slide 51 adds the comments)
      <?php
      $sum_0 = ...;    # int
      $i_0 = ...;
      trace:
          $sum_1 = phi($sum_0, $sum_2);
          $i_1 = phi($i_0, $i_2);
          if ($i_1 < $n)               # does not exit
          $sum_2 = $sum_1 + $i_1;      # int
          $i_2 = $i_1 + 1;
          goto trace;
  • 52-60. (Compiled root trace; the annotations from the individual slides are shown as comments)
          sub $0x10, %rsp
          mov $EG(jit_trace_num), %rax
          mov $0x1, (%rax)
          cmp $0x4, 0x68(%r14)           # Check if $sum is int (exit 0)
          jnz jit$$trace_exit_0
          mov 0x50(%r14), %rcx           # Load $n, $sum, $i into registers
          mov 0x60(%r14), %rdx
          mov 0x70(%r14), %rsi
      .L1:
          cmp %rcx, %rsi                 # Check $i < $n (exit 1)
          jge jit$$trace_exit_1
          mov %rdx, %rax                 # $sum += $i, check overflow (exit 2)
          add %rsi, %rax
          jo jit$$trace_exit_2
          mov %rax, %rdx
          add $0x1, %rsi                 # $i++
          mov $EG(vm_interrupt), %rax    # Check VM interrupt, like timeout (exit 3)
          cmp $0x0, (%rax)
          jz .L1
          jmp jit$$trace_exit_3
      ● Exits go to VM or side traces
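The guard-and-exit structure of the trace can be imitated in plain PHP. The following is only an analogy, not how the JIT is implemented: `sumTrace` stands in for the compiled trace that assumes `$sum` is an int, and `sumGeneric` stands in for the VM (or a side trace) that takes over when a guard fails. Both function names are invented for this sketch.

```php
<?php
declare(strict_types=1);

/** Generic path, standing in for the interpreter: handles int and float sums. */
function sumGeneric(int|float $sum, int $i, int $n): int|float
{
    for (; $i < $n; $i++) {
        $sum += $i;
    }
    return $sum;
}

/** Specialised path, standing in for a compiled trace that assumes $sum is int. */
function sumTrace(int|float $sum, int $i, int $n): int|float
{
    if (!is_int($sum)) {                     // type guard, like exit 0
        return sumGeneric($sum, $i, $n);
    }
    while ($i < $n) {                        // loop condition, like exit 1
        $next = $sum + $i;
        if (!is_int($next)) {                // overflow guard, like exit 2
            // $sum and $i have not been modified yet, so the generic
            // path can resume from exactly this iteration.
            return sumGeneric($sum, $i, $n);
        }
        $sum = $next;
        $i++;
    }
    return $sum;
}

var_dump(sumTrace(0, 0, 5));           // fast path only
var_dump(sumTrace(PHP_INT_MAX, 1, 3)); // leaves the fast path on overflow
```

The point of the design is that the fast path only contains the cases that were actually observed; everything else is handled by leaving it.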
  • 61-64. (Compiled side trace; the annotations from the individual slides are shown as comments)
      TRACE-2$sum$5:
          mov $EG(jit_trace_num), %rax
          mov $0x2, (%rax)
          mov 0x70(%r14), %rax
          cmp 0x50(%r14), %rax
          jge jit$$trace_exit_0
          cmp $0x5, 0x68(%r14)               # Check if $sum is float
          jnz jit$$trace_exit_1
          vxorps %xmm0, %xmm0, %xmm0         # $sum += (float) $i
          vcvtsi2sd %rax, %xmm0, %xmm0
          vaddsd 0x60(%r14), %xmm0, %xmm0
          vmovsd %xmm0, 0x60(%r14)
          add $0x1, 0x70(%r14)
          mov $EG(vm_interrupt), %rax
          cmp $0x0, (%rax)
          jz TRACE-1$sum$5+4
          jmp jit$$trace_exit_2
  • 65. (Repeats the root trace from slides 52-60, with the annotation "Trace 2")
  • 66-67. Interception
      ● Each opcode stores a "VM handler" pointer
      ● Replace handler at function entry, loop headers, returns
      ● Handler counts executions and invokes JIT
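The counting-handler idea can be modelled in a few lines of PHP. This is a toy model under stated assumptions, not the engine's mechanism: the class `CountingHandler`, its closures and the threshold of 3 are all hypothetical. The handler runs the slow path while counting calls, and once the count reaches the threshold it obtains a faster replacement and uses that from then on.

```php
<?php
declare(strict_types=1);

/** Toy model of a handler that counts executions and then swaps in faster code. */
final class CountingHandler
{
    private int $calls = 0;
    private ?Closure $active = null; // set once the code is considered "hot"

    public function __construct(
        private Closure $interpret,    // stand-in for the normal VM handler
        private Closure $compile,      // stand-in for the JIT: returns a faster closure
        private int $hotThreshold = 3  // made-up value
    ) {
    }

    public function __invoke(int $n): int
    {
        if ($this->active !== null) {
            return ($this->active)($n);
        }
        if (++$this->calls >= $this->hotThreshold) {
            $this->active = ($this->compile)();
        }
        return ($this->interpret)($n);
    }

    public function isCompiled(): bool
    {
        return $this->active !== null;
    }
}

$handler = new CountingHandler(
    interpret: fn(int $n): int => array_sum(range(0, max(0, $n - 1))),
    compile: fn(): Closure => fn(int $n): int => intdiv($n * ($n - 1), 2),
);

for ($call = 1; $call <= 4; $call++) {
    echo "call $call: ", $handler(5), $handler->isCompiled() ? ' (compiled)' : '', PHP_EOL;
}
```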
  • 68-69. Trace Collection
      ● Separate VM that collects type info while executing
      ● Traces can span different loops and functions
          – Calls effectively get "inlined"
  • 70-72. Code Generation
      ● Early prototypes used LLVM
          – Architecture agnostic
          – Supports many sophisticated optimizations
          – But: Extremely slow compile-times
      ● Now using DynASM from the LuaJIT project
          – Very fast
          – But: Architecture specific
  • 73-74. (DynASM source; lines starting with "||" are C code, lines starting with "|" are x86 assembly with placeholders)
      |.macro LONG_MATH_REG, opcode, dst_reg, src_reg
      ||  switch (opcode) {
      ||      case ZEND_ADD:
      |           add dst_reg, src_reg
      ||          break;
      ||      case ZEND_SUB:
      |           sub dst_reg, src_reg
      ||          break;
      ||      case ZEND_MUL:
      |           imul dst_reg, src_reg
      ||          break;
      ||      case ZEND_BW_OR:
      |           or dst_reg, src_reg
      ||          break;
      ||      case ZEND_BW_AND:
      |           and dst_reg, src_reg
      ||          break;
              ...
      ||  }
      |.endmacro
  • 75. Code Generation
      ● DynASM itself supports many architectures
      ● But JIT code has to be written for each
      ● No support for M1 at this time, sorry!
  • 76-77. Closing Thoughts
      ● Performance benefit workload dependent
          – Try it!
      ● Room for improvement
          – E.g. optimizations (loop invariant code motion, etc.)
  • 78-79. Closing Thoughts
      ● Concern: Stability
          – Increased potential for hard to debug, hard to reproduce bugs
      ● Concern: Maintenance
          – Only one person really understands the JIT