HHVM on AArch64
Max Wang
Software Engineer, HHVM
Agenda
Four questions:
1. What is HHVM?
2. How did we get running on AArch64?
3. Will the demo work?
4. Where do we go from here?
Agenda
1. What is HHVM?
i.e., Why should you care?
What is HHVM?
In a nutshell
• Just-in-time compiler for PHP and Hack
"1" + "2" == 3
"1" + "2wo" == 3
"15" + "0xf" == 15
("15" == "0xf") == true
$str = "foo";
list($a, $b) = $str;
var_dump($a); // "f"
list($a, $b) = "foo";
var_dump($a); // NULL
<?hh
class Foo<T> {
  public async function getBar(
    dict<string,T> $ts
  ): Awaitable<Bar> {
    return await fetch_bar($this->priv, $ts);
  }
}
What is HHVM?
In a nutshell
• Serves production web traffic for Facebook
• HHVM is fast!
  • Orders of magnitude improvement
• Not just for FB!
• Open source (https://github.com/facebook/hhvm)
• Used by 3 of the Alexa Top 5
  • Facebook, Baidu, Wikipedia
• Also: Box, Slack, Etsy, WordPress, ...
In short:
• Just-in-time compiler for PHP and Hack
• Fast!
• Open source!
Compilation pipeline
HipHop for PHP (HPHPc)
PHP -> [hphpc] -> C++ -> [g++] -> binary (ahead of time)
• Major performance improvement over PHP5
• But:
  • slow ahead-of-time compilation
  • massive binary size
  • static type inference on a dynamic language
Compilation pipeline
High-level PHP bytecode
PHP -> HHBC -> x64 (just in time)
$elem = ...;
if ($elem > 0) {
  ...
}
63 SetL L:4
65 PopC
66 Int 0
75 CGetL2 L:4
77 Gt
78 JmpZ 13 (91)
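To make the stack behaviour of this bytecode concrete, here is a toy Python walk-through of the six instructions shown. It is a sketch under assumptions, not HHVM's interpreter: the `run` function, the dictionary of locals, and the simplified semantics (for example, `CGetL2` placing a local's value just beneath the top of the stack) are illustrative only, and the jump is reported rather than executed.

```python
def run(elem):
    """Evaluate the `if ($elem > 0)` bytecode for a value already on the stack.

    Returns True when execution would fall through into the `if` body and
    False when JmpZ would skip it. Semantics are simplified assumptions.
    """
    stack = [elem]   # the result of `$elem = ...` starts on the stack
    local = {}       # local variable slots, keyed by slot number

    # 63 SetL L:4    store the top of the stack into local 4, leaving it on the stack
    local[4] = stack[-1]
    # 65 PopC        discard the top of the stack
    stack.pop()
    # 66 Int 0       push the integer constant 0
    stack.append(0)
    # 75 CGetL2 L:4  push local 4 just beneath the current top of the stack
    stack.insert(len(stack) - 1, local[4])
    # 77 Gt          pop the right then the left operand, push (left > right)
    right = stack.pop()
    left = stack.pop()
    stack.append(left > right)
    # 78 JmpZ 13 (91)  pop the condition; jump to offset 91 when it is false
    jump_taken = not stack.pop()
    return not jump_taken


for value in (5, 0, -3, 2.5):
    print(value, "enters the if body:", run(value))
```

The `CGetL2` step is why the comparison comes out as `$elem > 0` rather than `0 > $elem`: the constant is pushed first, and the local is slotted in underneath it.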
Compilation pipeline
HHVM intermediate representation
PHP -> HHBC -> HHIR -> x64 (just in time)
HHBC:
63 SetL L:4
65 PopC
66 Int 0
75 CGetL2 L:4
77 Gt
78 JmpZ 13 (91)
HHIR:
63: SetL L:4
  (12) t3:Int = LdStk<Int,IRSPOff 0> t1:StkPtr
  (14) StLoc<4> t0:FramePtr, t3:Int
66: Int 0
  (21) StStk<IRSPOff 0> t1:StkPtr, 0
75: CGetL2 L:4
  (24) StStk<IRSPOff 0> t1:StkPtr, t3:Int
  (25) StStk<IRSPOff -1> t1:StkPtr, 0
Compilation pipeline
Bytecode-to-bytecode transformation
PHP -> HHBC -> HHIR -> x64
(PHP -> HHBC optionally ahead of time; the rest just in time)
Compilation pipeline
PHP -> HHBC -> HHIR -> x64
(PHP -> HHBC optionally ahead of time; the rest just in time)
• This is where the magic happens
JIT optimizations
Dynamic type specialization
function addPositive($arr, $n) {
  $sum = 0;
  for ($i = 0; $i < $n; $i++) {
    $elem = $arr[$i];
    if ($elem > 0) {
      $sum = $sum + $elem;
    }
  }
  return $sum;
}
Bytecode blocks, each translated once per combination of input types:
$n:Int ; $i:Int
  43 CGetL L:1
  45 CGetL2 L:3
  47 Lt
  48 JmpZ 57 (105)
$arr:Arr ; $i:Int
  53 BaseL L:0 Warn
  57 QueryM 0 CGet EL:3
$elem:Unc ; Stk{0}:Int
$elem:Unc ; Stk{0}:Dbl
  63 SetL L:4
  65 PopC
  66 Int 0
  75 CGetL2 L:4
  77 Gt
  78 JmpZ 13 (91)
$sum:Int ; $elem:Int
$sum:Int ; $elem:Dbl
$sum:Dbl ; $elem:Int
$sum:Dbl ; $elem:Dbl
  83 CGetL L:4
  85 CGetL2 L:2
  87 Add
  88 SetL L:2
  90 PopC
$n:Int ; $i:Int
  91 IncDecL L:3 PostInc
  94 PopC
  95 CGetL L:1
  97 CGetL2 L:3
  99 Lt
  100 JmpNZ -47 (53)
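The type-annotated blocks on this slide can be modelled as a cache of translations keyed by the runtime types of their inputs. The following is a simplified Python model, not HHVM's implementation: `TranslationCache`, `translate_add`, and `add_positive` are hypothetical names, and a real JIT emits machine code rather than Python closures.

```python
class TranslationCache:
    """Toy model: one specialized translation per observed type combination."""

    def __init__(self):
        self.translations = {}  # (type, type) -> specialized function
        self.compiles = 0       # number of translations produced so far

    def translate_add(self, sum_type, elem_type):
        # Hypothetical "compiler": returns code specialized for these types.
        self.compiles += 1
        if sum_type is int and elem_type is int:
            return lambda s, e: s + e                # integer add
        return lambda s, e: float(s) + float(e)      # double add

    def add(self, s, e):
        key = (type(s), type(e))                     # plays the role of a type guard
        code = self.translations.get(key)
        if code is None:                             # guard miss: translate for these types
            code = self.translate_add(*key)
            self.translations[key] = code
        return code(s, e)


def add_positive(cache, arr, n):
    """Python rendering of addPositive, routing `$sum + $elem` through the cache."""
    total = 0
    for i in range(n):
        elem = arr[i]
        if elem > 0:
            total = cache.add(total, elem)
    return total


cache = TranslationCache()
print(add_positive(cache, [1, 2, -3, 4], 4))     # integers only
print(add_positive(cache, [1, 2.5, -1, 3], 4))   # integers and doubles mixed
print(sorted((a.__name__, b.__name__) for a, b in cache.translations))
```

The integer-only call needs a single translation; the mixed call adds two more, one for each new pair of input types it encounters.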
JIT optimizations
Profile-guided optimization
The first build of this slide shows the same translations of addPositive as the previous slide. The final build keeps one translation per block:
$n:Int ; $i:Int
  43 CGetL L:1
  45 CGetL2 L:3
  47 Lt
  48 JmpZ 57 (105)
$arr:Arr ; $i:Int
  53 BaseL L:0 Warn
  57 QueryM 0 CGet EL:3
$elem:Unc ; Stk{0}:Dbl
  63 SetL L:4
  65 PopC
  66 Int 0
  75 CGetL2 L:4
  77 Gt
  78 JmpZ 13 (91)
$sum:Dbl ; $elem:Dbl
  83 CGetL L:4
  85 CGetL2 L:2
  87 Add
  88 SetL L:2
  90 PopC
91 IncDecL L:3 PostInc
94 PopC
95 CGetL L:1
97 CGetL2 L:3
99 Lt
100 JmpNZ -47 (53)
JIT optimizations
HHIR optimization passes
$c = $a . $b;
$len = strlen($c);
Unoptimized HHIR:
69: CGetL L:1
  (12) t3:Str = LdLoc<Str,1> t0:FramePtr
  (13) IncRef t3:Str
71: CGetL2 L:0
  (16) t4:Str = LdLoc<Str,0> t0:FramePtr
  (17) IncRef t4:Str
73: Concat
  (22) t5:Str = ConcatStrStr t4:Str, t3:Str
  (24) DecRef<-> t3:Str
74: SetL L:2
  (27) StLoc<2> t0:FramePtr, t5:Str
  (28) IncRef t5:Str
76: PopC
  (31) DecRef<-> t5:Str
77: CGetL L:2
  (33) IncRef t5:Str
79: FCallBuiltin 1 1 "strlen"
  (35) t7:Int = LdStrLen t5:Str
  (36) DecRef<-> t5:Str
Redundant IncRef/DecRef replaced with Nop:
69: CGetL L:1
  (12) t3:Str = LdLoc<Str,1> t0:FramePtr
  (13) Nop
71: CGetL2 L:0
  (16) t4:Str = LdLoc<Str,0> t0:FramePtr
  (17) IncRef t4:Str
73: Concat
  (22) t5:Str = ConcatStrStr t4:Str, t3:Str
  (24) Nop
74: SetL L:2
  (27) StLoc<2> t0:FramePtr, t5:Str
  (28) Nop
76: PopC
  (31) Nop
77: CGetL L:2
  (33) Nop
79: FCallBuiltin 1 1 "strlen"
  (35) t7:Int = LdStrLen t5:Str
  (36) Nop
Nops removed:
69: CGetL L:1
  (12) t3:Str = LdLoc<Str,1> t0:FramePtr
71: CGetL2 L:0
  (16) t4:Str = LdLoc<Str,0> t0:FramePtr
  (17) IncRef t4:Str
73: Concat
  (22) t5:Str = ConcatStrStr t4:Str, t3:Str
74: SetL L:2
  (27) StLoc<2> t0:FramePtr, t5:Str
79: FCallBuiltin 1 1 "strlen"
  (35) t7:Int = LdStrLen t5:Str
/* ... */
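The effect of the refcount cleanup in these listings can be reproduced with a toy pass. This is only a sketch and not HHVM's actual algorithm: `cancel_refcount_pairs` is a hypothetical helper, each instruction is reduced to an opcode and one value name, and the rule (drop each IncRef together with the next DecRef of the same value) skips the safety checks a real pass needs.

```python
def cancel_refcount_pairs(instrs):
    """Drop each IncRef together with the next DecRef of the same value.

    `instrs` is a list of (opcode, value) tuples. Toy rule only: a real pass
    must also prove that nothing in between observes or consumes the count.
    """
    dead = set()
    for i, (op, val) in enumerate(instrs):
        if op != "IncRef":
            continue
        for j in range(i + 1, len(instrs)):
            if j not in dead and instrs[j] == ("DecRef", val):
                dead.update((i, j))   # the pair cancels out
                break
    return [ins for k, ins in enumerate(instrs) if k not in dead]


# The HHIR for `$c = $a . $b; $len = strlen($c);`, simplified to
# (opcode, value) pairs in the order they appear on the slide.
hhir = [
    ("LdLoc", "t3"), ("IncRef", "t3"),
    ("LdLoc", "t4"), ("IncRef", "t4"),
    ("ConcatStrStr", "t5"), ("DecRef", "t3"),
    ("StLoc", "t5"), ("IncRef", "t5"),
    ("DecRef", "t5"),
    ("IncRef", "t5"),
    ("LdStrLen", "t5"), ("DecRef", "t5"),
]

for op, val in cancel_refcount_pairs(hhir):
    print(op, val)
```

The IncRef on t4 survives because no later DecRef pairs with it, which matches the final listing on the slide.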
Agenda
2. How did we get running on AArch64?
i.e., Who did all the hard work?
Compilation pipeline
ARM simulator
PHP -> HHBC -> HHIR -> { x64, vixl }
(PHP -> HHBC optionally ahead of time; the rest just in time)
• Maintenance nightmare
• > 600 HHIR ops
• We aren't ARM experts
Compilation pipeline
Virtual assembly
PHP -> HHBC -> HHIR -> vasm -> x64
(PHP -> HHBC optionally ahead of time; the rest just in time)
HHIR:
63: SetL L:4
  (12) t3:Int = LdStk<Int,IRSPOff 0> t1:StkPtr
  (14) StLoc<4> t0:FramePtr, t3:Int
66: Int 0
  (21) StStk<IRSPOff 0> t1:StkPtr, 0
75: CGetL2 L:4
  (24) StStk<IRSPOff 0> t1:StkPtr, t3:Int
  (25) StStk<IRSPOff -1> t1:StkPtr, 0
vasm:
load [%128] => %129
storeb %136(17b), [%rbp - 0x48]
store %129, [%rbp - 0x50]
storeb %136(17b), [%128 + 0x8]
store %129, [%128]
• Uncanny resemblance to x64
• Spiritual sibling of WebKit's Bare Bones Backend
Compilation pipeline
LLVM? Have you heard of it?
PHP -> HHBC -> HHIR -> vasm -> LLIR?
(PHP -> HHBC optionally ahead of time; the rest just in time)
• "Why don't you just use LLVM?" 🤔
• We tried it:
  • No noticeable performance gains
  • LLVM's MCJIT is too heavyweight
• Experimental LLVM backend stress-tested vasm
  • Calling conventions
  • Register widths
  • ...
Compilation pipeline
ARM backend
PHP -> HHBC -> HHIR -> vasm -> { x64, arm }
(PHP -> HHBC optionally ahead of time; the rest just in time)
Compilation pipeline
Backends for everyone!
PHP -> HHBC -> HHIR -> vasm -> { x64, ppc64, arm }
(PHP -> HHBC optionally ahead of time; the rest just in time)
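The point of a shared virtual assembly layer is that each architecture only needs one final lowering step. The sketch below illustrates that shape in Python; it is not vasm itself. The `Load` and `Store` classes, the `v0`/`v1` virtual register names, the register maps, and the emitted assembly text are all hypothetical stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class Load:    # hypothetical virtual instruction: dst = [base + offset]
    dst: str
    base: str
    offset: int


@dataclass
class Store:   # hypothetical virtual instruction: [base + offset] = src
    src: str
    base: str
    offset: int


# Assumed register assignments; a real backend runs a register allocator.
X64_REGS = {"v0": "rax", "v1": "rbx"}
ARM_REGS = {"v0": "x0", "v1": "x1"}


def lower_x64(ins):
    """Lower one virtual instruction to x64 text (Intel syntax)."""
    mem = f"[{X64_REGS[ins.base]} + {ins.offset:#x}]"
    if isinstance(ins, Load):
        return f"mov {X64_REGS[ins.dst]}, {mem}"
    return f"mov {mem}, {X64_REGS[ins.src]}"


def lower_arm(ins):
    """Lower one virtual instruction to AArch64 text."""
    mem = f"[{ARM_REGS[ins.base]}, #{ins.offset}]"
    if isinstance(ins, Load):
        return f"ldr {ARM_REGS[ins.dst]}, {mem}"
    return f"str {ARM_REGS[ins.src]}, {mem}"


program = [Load("v0", "v1", 0), Store("v0", "v1", 8)]
for backend in (lower_x64, lower_arm):
    print(backend.__name__)
    for ins in program:
        print("  " + backend(ins))
```

Everything above the two `lower_*` functions is shared, so adding another target in this model means writing one more lowering function and register map.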
Contributors
The most important slide in this talk
• Lakshmi Pathy (@lpathy)
• Dave Estes (@dave-estes)
• Jim Saxman (@jim-saxman)
• Christoph Müllner (@cmuellner)
• Steve Walk (@swalk-cavium)
• Andrew Pinski (@apinski-cavium)
• ...
ARM backend
Baseline functionality
• Final vasm-to-AArch64 lowering pass
• Code smashing
• Boundary-crossing between C++ and jitted code
• Bonus: Continuous integration testing!
ARM backend
Optimizations
• Strength reduction on flag-setting instructions
• 64-bit immediate lifting
• Branch offset optimizations
• ...
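For background on why 64-bit immediates get attention on AArch64: instructions are a fixed 32 bits wide, so a full 64-bit constant is typically built 16 bits at a time with `movz` followed by `movk`. The Python sketch below shows only that baseline encoding cost; it is not HHVM's immediate-lifting pass, `materialize` is a hypothetical helper, and it ignores alternatives such as `movn` and logical immediates.

```python
def materialize(reg, imm):
    """Return movz/movk instructions that load a 64-bit immediate into reg."""
    imm &= 0xFFFFFFFFFFFFFFFF
    chunks = [(imm >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]
    out = []
    for index, chunk in enumerate(chunks):
        if chunk == 0:
            continue  # an all-zero halfword needs no instruction
        # The first instruction (movz) clears the other halfwords;
        # later ones (movk) keep them and patch in 16 more bits.
        op = "movz" if not out else "movk"
        out.append(f"{op} {reg}, #{chunk:#x}, lsl #{16 * index}")
    return out or [f"movz {reg}, #0x0, lsl #0"]


for value in (0x2A, 0x10000, 0x123456789ABCDEF0):
    print(hex(value))
    for line in materialize("x0", value):
        print("  " + line)
```

A small constant costs one instruction while an arbitrary 64-bit value costs four, which ties in with the later observation that instruction sequences tend to be larger on AArch64.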
Agenda
3. Will the demo work?
Hold onto your backends
Agenda
4. Where do we go from here?
i.e., How can you get involved?
Future work
Code size and layout
• Cache is king!
• Instruction sequences generally larger on AArch64
• HHVM is sensitive to code layout
• Huge pages
• Indirect branch rewriting
• Locality tuning
• ...
Future work
More ARM-specific optimizations
• Profile OSS workloads on ARM using perf
• https://github.com/hhvm/oss-performance
• Make some measurements
Resources
Feel free to contribute!
• Website: http://hhvm.com/
• GitHub: https://github.com/facebook/hhvm
• IRC: #hphp-dev on Freenode
• Mailing list: https://groups.google.com/d/forum/hhvm-arm
• My email: mwang@fb.com
Quick recap
HHVM on AArch64
1. HHVM
2. It runs on AArch64 (thanks to the community)
3. Seriously, the demo worked and everything
4. Any questions?
HHVM on AArch64 - BUD17-400K1
