SlideShare a Scribd company logo
1 of 30
Download to read offline
The DFG JIT, Inside
& Out
JavaScriptCore’s Optimizing Compiler
JSConf.EU 2012
Andy Wingo
wingo@igalia.com
Compiler hacker at Igalia
Contract work on language implementations
V8, JavaScriptCore
Schemer
Hubris
“Now that JavaScriptCore is as fast as V8 on its
own benchmark, it’s well past time to take a
look inside JSC’s optimizing compiler, the DFG
JIT.”
DFG
Optimizing compiler for JSC
LLInt -> Baseline JIT -> DFG JIT
Makes hot code run fast
But how good is it?
An empirical approach
Getting good code
What: V8 benchmarks
When: Hacked V8 benchmarks
How: Code dive
The V8 benchmarks
The best performance possible from an
optimizing compiler
❧ full second of warmup
❧ full second of runtime
❧ long run amortizes GC pauses
Baseline JIT vs DFG
Abusing the V8 benchmarks
When does the DFG kick in?
What does it do?
Idea: V8 benchmarks with variable warmup
❧ after 0 ms of warmup
❧ after 5 ms of warmup
❧ after n ms of warmup
Small fixed runtime (5 ms)
Caveats
Very sensitive
❧ GC
❧ optimization pauses
❧ timer precision
... but then, so is real code
Keep close eye on distribution of measurements
Richards

Speedup: 3.7X
Bit ops, properties, prototypes
TaskControlBlock.prototype.isHeldOrSuspended = function () {
return (this.state & STATE_HELD) != 0
|| (this.state == STATE_SUSPENDED); };
GetLocal
0x7f4d028abbf4:
CheckStructure
0x7f4d028abbf8:
0x7f4d028abc02:
0x7f4d028abc05:
GetByOffset
0x7f4d028abc0b:
GetGlobalVar
0x7f4d028abc0f:
0x7f4d028abc19:
BitAnd
0x7f4d028abc1c:
0x7f4d028abc1f:
0x7f4d028abc25:
0x7f4d028abc28:
0x7f4d028abc2e:

mov -0x38(%r13), %rax
mov $0x7f4d00109c80, %r11
cmp %r11, (%rax)
jnz 0x7f4d028abd15
mov 0x38(%rax), %rax
mov $0x7f4d479cdca8, %rdx
mov (%rdx), %rdx
cmp %r14, %rax
jb 0x7f4d028abd2b
cmp %r14, %rdx
jb 0x7f4d028abd41
and %edx, %eax
CompareEq
0x7f4d028abc30:
0x7f4d028abc32:
0x7f4d028abc34:
0x7f4d028abc37:
0x7f4d028abc3a:

xor %ecx, %ecx
cmp %ecx, %eax
setz %al
movzx %al, %eax
or $0x6, %eax

LogicalNot
0x7f4d028abc3d: xor $0x1, %rax
SetLocal
0x7f4d028abc41: mov %rax, 0x0(%r13)
Branch
0x7f4d028abc45: test $0x1, %eax
0x7f4d028abc4b: jnz 0x7f4d028abc87
...
0x7f4d028abc97: ret
(End Of Main Path)
...
DeltaBlue

Speedup: 4.4X
Prototypes, inlining
Inlining
At 20ms:
Delaying optimization for
Constraint.prototype.satisfy (in loop)
because of insufficient profiling.

Eventually succeeds after 4 more times and 20
more ms; see --maximumOptimizationDelay.
1000 cuts
One function optimized about 20ms in:

Planner.prototype.addConstraintsConsumingTo =
function (v, coll) {
var determining = v.determinedBy;
var cc = v.constraints;
for (var i = 0; i < cc.size(); i++) {
var c = cc.at(i);
if (c != determining && c.isSatisfied()
coll.add(c);
}
}

Many small marginal gains
Crypto

Speedup: 4.1X
Integers, arrays
function am3(i,x,w,j,c,n) {
var this_array = this.array;
var w_array
= w.array;
var xl = x&0x3fff, xh = x>>14;
while(--n >= 0) {
var l = this_array[i]&0x3fff;
var h = this_array[i++]>>14;
var m = xh*l+h*xl;
l = xl*l+((m&0x3fff)<<14)+w_array[j]+c;
c = (l>>28)+(m>>14)+xh*h;
w_array[j++] = l&0xfffffff;
}
return c;
}
var l = this_array[i]&0x3fff
GetLocal: this_array
0x7f4d02909bf6: mov 0x0(%r13), %r10
GetLocal: i (int32; type check hoisted)
0x7f4d02909bfa: mov -0x40(%r13), %eax
GetButterfly: this_array
0x7f4d02909bfe: mov 0x8(%r10), %rdx
GetByVal: this_array[i] (array check hoisted)
0x7f4d02909c02: cmp -0x4(%rdx), %eax
0x7f4d02909c05: jae 0x7f4d02909ed2
0x7f4d02909c0b: mov 0x10(%rdx,%rax,8), %rcx
0x7f4d02909c10: test %rcx, %rcx
0x7f4d02909c13: jz 0x7f4d02909ee8
BitAnd:
0x7f4d02909c19: cmp %r14, %rcx
0x7f4d02909c1c: jb 0x7f4d02909efe
0x7f4d02909c22: mov %rcx, %rbx
0x7f4d02909c25: and $0x3fff, %ebx
RayTrace

Speedup: 2.5X
Floating point, objects with floating-point fields
normalize()
normalize : function() {
var m = this.magnitude();
return new Flog.RayTracer.Vector(this.x / m,
this.y / m,
this.z / m); },

DFG inlines as it compiles: inlines
this.magnitude()
ArithDiv:
0x7f4d0298164b: divsd %xmm1, %xmm0
SetLocal:
0x7f4d0298164f: movd %xmm0, %rdx
0x7f4d02981654: sub %r14, %rdx
0x7f4d02981657: mov %rdx, 0x20(%r13)

No typed fields (yet)
EarleyBoyer

Speedup: 2.0X
Function calls, small short-lived allocations
EarleyBoyer
“Performance is a distribution, not a value”
Wide distribution indicates nonuniform
performance
Cause in this case: nonincremental mark GC
RegExp

Speedup: 1.2X
Regexp compiler test; DFG of no help
Splay

Speedup: 1.4X
GC test, huge variance
NavierStokes

Speedup: 3.0X
Floating point arrays, large floating-point
functions
No automagic double arrays
GetByVal:
0x7f4d02acec1f:
0x7f4d02acec23:
0x7f4d02acec29:
0x7f4d02acec2e:
0x7f4d02acec31:

cmp -0x4(%rcx), %r9d
jae 0x7f4d02acee0b
mov 0x10(%rcx,%r9,8), %rbx
test %rbx, %rbx
jz 0x7f4d02acee21

GetLocal:
0x7f4d02acec37:
Int32ToDouble:
0x7f4d02acec3b:
0x7f4d02acec3e:
0x7f4d02acec44:
0x7f4d02acec47:
0x7f4d02acec4d:
0x7f4d02acec50:
0x7f4d02acec53:
0x7f4d02acec58:
0x7f4d02acec5d:

mov -0x50(%r13), %rdi
cmp %r14, %rbx
jae 0x7f4d02acec5d
test %rbx, %r14
jz 0x7f4d02acee37
mov %rbx, %rsi
add %r14, %rsi
movd %rsi, %xmm0
jmp 0x7f4d02acec61
cvtsi2sd %ebx, %xmm0
Getting data out of JSC
jsc --options
jsc -d
jsc --showDFGDisassembly=true
-DJIT_ENABLE_VERBOSE=1, DJIT_ENABLE_VERBOSE_OSR=1 and timestamping
hacks on dataLog
Comparative Literature
V8 vs JSC: fight!
Does JSC beat V8?
Does JSC meet V8?
Does V8 beat JSC?
Yes
Questions?
❧ igalia.com/compilers
❧ wingolog.org
❧ @andywingo
❧ wingolog.org/pub/jsconf-eu-2012slides.pdf

More Related Content

What's hot

Intro2 Cuda Moayad
Intro2 Cuda MoayadIntro2 Cuda Moayad
Intro2 Cuda MoayadMoayadhn
 
Dive into CPython Bytecode
Dive into CPython BytecodeDive into CPython Bytecode
Dive into CPython BytecodeAlex Gaynor
 
Java performance jit
Java performance jitJava performance jit
Java performance jitSuken Shah
 
NIR on the Mesa i965 backend (FOSDEM 2016)
NIR on the Mesa i965 backend (FOSDEM 2016)NIR on the Mesa i965 backend (FOSDEM 2016)
NIR on the Mesa i965 backend (FOSDEM 2016)Igalia
 
Probability of finding a single qubit in a state
Probability of finding a single qubit in a stateProbability of finding a single qubit in a state
Probability of finding a single qubit in a stateVijayananda Mohire
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexesDaniel Lemire
 
Autovectorization in llvm
Autovectorization in llvmAutovectorization in llvm
Autovectorization in llvmChangWoo Min
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKitMathias Soeken
 
Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...Stanfy
 
Gameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyGameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyYodalee
 
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)Daniel Lemire
 
Reversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKitReversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKitMathias Soeken
 
Quantum circuit example in Qiskit
Quantum circuit example in QiskitQuantum circuit example in Qiskit
Quantum circuit example in QiskitVijayananda Mohire
 
The Object Repository - Pulling Objects out of the Ecosystem
The Object Repository - Pulling Objects out of the EcosystemThe Object Repository - Pulling Objects out of the Ecosystem
The Object Repository - Pulling Objects out of the EcosystemESUG
 
maXbox Starter 39 GEO Maps Tutorial
maXbox Starter 39 GEO Maps TutorialmaXbox Starter 39 GEO Maps Tutorial
maXbox Starter 39 GEO Maps TutorialMax Kleiner
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetrisYodalee
 

What's hot (20)

Intro2 Cuda Moayad
Intro2 Cuda MoayadIntro2 Cuda Moayad
Intro2 Cuda Moayad
 
Dive into CPython Bytecode
Dive into CPython BytecodeDive into CPython Bytecode
Dive into CPython Bytecode
 
Java performance jit
Java performance jitJava performance jit
Java performance jit
 
NIR on the Mesa i965 backend (FOSDEM 2016)
NIR on the Mesa i965 backend (FOSDEM 2016)NIR on the Mesa i965 backend (FOSDEM 2016)
NIR on the Mesa i965 backend (FOSDEM 2016)
 
Probability of finding a single qubit in a state
Probability of finding a single qubit in a stateProbability of finding a single qubit in a state
Probability of finding a single qubit in a state
 
Engineering fast indexes
Engineering fast indexesEngineering fast indexes
Engineering fast indexes
 
Autovectorization in llvm
Autovectorization in llvmAutovectorization in llvm
Autovectorization in llvm
 
Introduction to RevKit
Introduction to RevKitIntroduction to RevKit
Introduction to RevKit
 
Tensor Core
Tensor CoreTensor Core
Tensor Core
 
OpenCL Heterogeneous Parallel Computing
OpenCL Heterogeneous Parallel ComputingOpenCL Heterogeneous Parallel Computing
OpenCL Heterogeneous Parallel Computing
 
Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...Android Developer Days: Increasing performance of big arrays processing on An...
Android Developer Days: Increasing performance of big arrays processing on An...
 
Gameboy emulator in rust and web assembly
Gameboy emulator in rust and web assemblyGameboy emulator in rust and web assembly
Gameboy emulator in rust and web assembly
 
tokyotalk
tokyotalktokyotalk
tokyotalk
 
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)Next Generation Indexes For Big Data Engineering (ODSC East 2018)
Next Generation Indexes For Big Data Engineering (ODSC East 2018)
 
Reversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKitReversible Logic Synthesis and RevKit
Reversible Logic Synthesis and RevKit
 
Quantum circuit example in Qiskit
Quantum circuit example in QiskitQuantum circuit example in Qiskit
Quantum circuit example in Qiskit
 
The Object Repository - Pulling Objects out of the Ecosystem
The Object Repository - Pulling Objects out of the EcosystemThe Object Repository - Pulling Objects out of the Ecosystem
The Object Repository - Pulling Objects out of the Ecosystem
 
maXbox Starter 39 GEO Maps Tutorial
maXbox Starter 39 GEO Maps TutorialmaXbox Starter 39 GEO Maps Tutorial
maXbox Starter 39 GEO Maps Tutorial
 
Introduction to FPGAs
Introduction to FPGAsIntroduction to FPGAs
Introduction to FPGAs
 
Introduction to nand2 tetris
Introduction to nand2 tetrisIntroduction to nand2 tetris
Introduction to nand2 tetris
 

Similar to DFG JIT, Inside & Out: JavaScriptCore's Optimizing Compiler

Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013
Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013
Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013Gregg Donovan
 
The CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGitThe CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGitAndrey Karpov
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoValeriia Maliarenko
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects Andrey Karpov
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningCarol McDonald
 
Building Web Apps Sanely - EclipseCon 2010
Building Web Apps Sanely - EclipseCon 2010Building Web Apps Sanely - EclipseCon 2010
Building Web Apps Sanely - EclipseCon 2010Chris Ramsdale
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVMJohn Lee
 
Google Developer Fest 2010
Google Developer Fest 2010Google Developer Fest 2010
Google Developer Fest 2010Chris Ramsdale
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...Codemotion
 
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...OWASP
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewLinaro
 
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017Andrey Karpov
 
C++ amp on linux
C++ amp on linuxC++ amp on linux
C++ amp on linuxMiller Lee
 
Planet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance EnhancementPlanet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance Enhancementup2soul
 

Similar to DFG JIT, Inside & Out: JavaScriptCore's Optimizing Compiler (20)

Living with garbage
Living with garbageLiving with garbage
Living with garbage
 
Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013
Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013
Living with Garbage by Gregg Donovan at LuceneSolr Revolution 2013
 
Living With Garbage
Living With GarbageLiving With Garbage
Living With Garbage
 
The CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGitThe CppCat Analyzer Checks TortoiseGit
The CppCat Analyzer Checks TortoiseGit
 
Java Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey KovalenkoJava Jit. Compilation and optimization by Andrey Kovalenko
Java Jit. Compilation and optimization by Andrey Kovalenko
 
Price of an Error
Price of an ErrorPrice of an Error
Price of an Error
 
100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects 100 bugs in Open Source C/C++ projects
100 bugs in Open Source C/C++ projects
 
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, TuningJava 5 6 Generics, Concurrency, Garbage Collection, Tuning
Java 5 6 Generics, Concurrency, Garbage Collection, Tuning
 
Building Web Apps Sanely - EclipseCon 2010
Building Web Apps Sanely - EclipseCon 2010Building Web Apps Sanely - EclipseCon 2010
Building Web Apps Sanely - EclipseCon 2010
 
JBoss World 2010
JBoss World 2010JBoss World 2010
JBoss World 2010
 
Android RenderScript on LLVM
Android RenderScript on LLVMAndroid RenderScript on LLVM
Android RenderScript on LLVM
 
Google Developer Fest 2010
Google Developer Fest 2010Google Developer Fest 2010
Google Developer Fest 2010
 
How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...How much performance can you get out of Javascript? - Massimiliano Mantione -...
How much performance can you get out of Javascript? - Massimiliano Mantione -...
 
Exploiting vectorization with ISPC
Exploiting vectorization with ISPCExploiting vectorization with ISPC
Exploiting vectorization with ISPC
 
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...
OWASP Poland Day 2018 - Pedro Fortuna - Are your Java Script based protection...
 
HKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overviewHKG15-300: Art's Quick Compiler: An unofficial overview
HKG15-300: Art's Quick Compiler: An unofficial overview
 
Vectorization in ATLAS
Vectorization in ATLASVectorization in ATLAS
Vectorization in ATLAS
 
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
PVS-Studio. Static code analyzer. Windows/Linux, C/C++/C#. 2017
 
C++ amp on linux
C++ amp on linuxC++ amp on linux
C++ amp on linux
 
Planet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance EnhancementPlanet-HTML5-Game-Engine Javascript Performance Enhancement
Planet-HTML5-Game-Engine Javascript Performance Enhancement
 

More from Igalia

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?Igalia
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEIgalia
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesIgalia
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceIgalia
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfIgalia
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JITIgalia
 
To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!Igalia
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerIgalia
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in MesaIgalia
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIgalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera LinuxIgalia
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVMIgalia
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsIgalia
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesIgalia
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSIgalia
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webIgalia
 
Replacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersReplacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersIgalia
 
I'm not an AMD expert, but...
I'm not an AMD expert, but...I'm not an AMD expert, but...
I'm not an AMD expert, but...Igalia
 
Status of Vulkan on Raspberry
Status of Vulkan on RaspberryStatus of Vulkan on Raspberry
Status of Vulkan on RaspberryIgalia
 

More from Igalia (20)

A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Building End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPEBuilding End-user Applications on Embedded Devices with WPE
Building End-user Applications on Embedded Devices with WPE
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded DevicesAutomated Testing for Web-based Systems on Embedded Devices
Automated Testing for Web-based Systems on Embedded Devices
 
Embedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to MaintenanceEmbedding WPE WebKit - from Bring-up to Maintenance
Embedding WPE WebKit - from Bring-up to Maintenance
 
Optimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdfOptimizing Scheduler for Linux Gaming.pdf
Optimizing Scheduler for Linux Gaming.pdf
 
Running JS via WASM faster with JIT
Running JS via WASM      faster with JITRunning JS via WASM      faster with JIT
Running JS via WASM faster with JIT
 
To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!To crash or not to crash: if you do, at least recover fast!
To crash or not to crash: if you do, at least recover fast!
 
Implementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamerImplementing a Vulkan Video Encoder From Mesa to GStreamer
Implementing a Vulkan Video Encoder From Mesa to GStreamer
 
8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa8 Years of Open Drivers, including the State of Vulkan in Mesa
8 Years of Open Drivers, including the State of Vulkan in Mesa
 
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por IgaliaIntroducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
Introducción a Mesa. Caso específico dos dispositivos Raspberry Pi por Igalia
 
2023 in Chimera Linux
2023 in Chimera                    Linux2023 in Chimera                    Linux
2023 in Chimera Linux
 
Building a Linux distro with LLVM
Building a Linux distro        with LLVMBuilding a Linux distro        with LLVM
Building a Linux distro with LLVM
 
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUsturnip: Update on Open Source Vulkan Driver for Adreno GPUs
turnip: Update on Open Source Vulkan Driver for Adreno GPUs
 
Graphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devicesGraphics stack updates for Raspberry Pi devices
Graphics stack updates for Raspberry Pi devices
 
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOSDelegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
Delegated Compositing - Utilizing Wayland Protocols for Chromium on ChromeOS
 
MessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the webMessageFormat: The future of i18n on the web
MessageFormat: The future of i18n on the web
 
Replacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shadersReplacing the geometry pipeline with mesh shaders
Replacing the geometry pipeline with mesh shaders
 
I'm not an AMD expert, but...
I'm not an AMD expert, but...I'm not an AMD expert, but...
I'm not an AMD expert, but...
 
Status of Vulkan on Raspberry
Status of Vulkan on RaspberryStatus of Vulkan on Raspberry
Status of Vulkan on Raspberry
 

Recently uploaded

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAndikSusilo4
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraDeakin University
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksSoftradix Technologies
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Hyundai Motor Group
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...shyamraj55
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitecturePixlogix Infotech
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphNeo4j
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...HostedbyConfluent
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxMalak Abu Hammad
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?XfilesPro
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 

Recently uploaded (20)

08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Azure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & ApplicationAzure Monitor & Application Insight to monitor Infrastructure & Application
Azure Monitor & Application Insight to monitor Infrastructure & Application
 
Artificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning eraArtificial intelligence in the post-deep learning era
Artificial intelligence in the post-deep learning era
 
Benefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other FrameworksBenefits Of Flutter Compared To Other Frameworks
Benefits Of Flutter Compared To Other Frameworks
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2Next-generation AAM aircraft unveiled by Supernal, S-A2
Next-generation AAM aircraft unveiled by Supernal, S-A2
 
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
Automating Business Process via MuleSoft Composer | Bangalore MuleSoft Meetup...
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
Understanding the Laravel MVC Architecture
Understanding the Laravel MVC ArchitectureUnderstanding the Laravel MVC Architecture
Understanding the Laravel MVC Architecture
 
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge GraphSIEMENS: RAPUNZEL – A Tale About Knowledge Graph
SIEMENS: RAPUNZEL – A Tale About Knowledge Graph
 
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
 
The Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptxThe Codex of Business Writing Software for Real-World Solutions 2.pptx
The Codex of Business Writing Software for Real-World Solutions 2.pptx
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?How to Remove Document Management Hurdles with X-Docs?
How to Remove Document Management Hurdles with X-Docs?
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 

DFG JIT, Inside & Out: JavaScriptCore's Optimizing Compiler

  • 1. The DFG JIT, Inside & Out JavaScriptCore’s Optimizing Compiler JSConf.EU 2012 Andy Wingo
  • 2. wingo@igalia.com Compiler hacker at Igalia Contract work on language implementations V8, JavaScriptCore Schemer
  • 3. Hubris “Now that JavaScriptCore is as fast as V8 on its own benchmark, it’s well past time to take a look inside JSC’s optimizing compiler, the DFG JIT.”
  • 4. DFG Optimizing compiler for JSC LLInt -> Baseline JIT -> DFG JIT Makes hot code run fast But how good is it?
  • 5. An empirical approach Getting good code What: V8 benchmarks When: Hacked V8 benchmarks How: Code dive
  • 6. The V8 benchmarks The best performance possible from an optimizing compiler ❧ full second of warmup ❧ full second of runtime ❧ long run amortizes GC pauses
  • 8. Abusing the V8 benchmarks When does the DFG kick in? What does it do? Idea: V8 benchmarks with variable warmup ❧ after 0 ms of warmup ❧ after 5 ms of warmup ❧ after n ms of warmup Small fixed runtime (5 ms)
  • 9. Caveats Very sensitive ❧ GC ❧ optimization pauses ❧ timer precision ... but then, so is real code Keep close eye on distribution of measurements
  • 10. Richards Speedup: 3.7X Bit ops, properties, prototypes
  • 11. TaskControlBlock.prototype.isHeldOrSuspended = function () { return (this.state & STATE_HELD) != 0 || (this.state == STATE_SUSPENDED); }; GetLocal 0x7f4d028abbf4: CheckStructure 0x7f4d028abbf8: 0x7f4d028abc02: 0x7f4d028abc05: GetByOffset 0x7f4d028abc0b: GetGlobalVar 0x7f4d028abc0f: 0x7f4d028abc19: BitAnd 0x7f4d028abc1c: 0x7f4d028abc1f: 0x7f4d028abc25: 0x7f4d028abc28: 0x7f4d028abc2e: mov -0x38(%r13), %rax mov $0x7f4d00109c80, %r11 cmp %r11, (%rax) jnz 0x7f4d028abd15 mov 0x38(%rax), %rax mov $0x7f4d479cdca8, %rdx mov (%rdx), %rdx cmp %r14, %rax jb 0x7f4d028abd2b cmp %r14, %rdx jb 0x7f4d028abd41 and %edx, %eax
  • 12. CompareEq 0x7f4d028abc30: 0x7f4d028abc32: 0x7f4d028abc34: 0x7f4d028abc37: 0x7f4d028abc3a: xor %ecx, %ecx cmp %ecx, %eax setz %al movzx %al, %eax or $0x6, %eax LogicalNot 0x7f4d028abc3d: xor $0x1, %rax SetLocal 0x7f4d028abc41: mov %rax, 0x0(%r13) Branch 0x7f4d028abc45: test $0x1, %eax 0x7f4d028abc4b: jnz 0x7f4d028abc87 ... 0x7f4d028abc97: ret (End Of Main Path) ...
  • 14. Inlining At 20ms: Delaying optimization for Constraint.prototype.satisfy (in loop) because of insufficient profiling. Eventually succeeds after 4 more times and 20 more ms; see --maximumOptimizationDelay.
  • 15. 1000 cuts One function optimized about 20ms in: Planner.prototype.addConstraintsConsumingTo = function (v, coll) { var determining = v.determinedBy; var cc = v.constraints; for (var i = 0; i < cc.size(); i++) { var c = cc.at(i); if (c != determining && c.isSatisfied() coll.add(c); } } Many small marginal gains
  • 17. function am3(i,x,w,j,c,n) { var this_array = this.array; var w_array = w.array; var xl = x&0x3fff, xh = x>>14; while(--n >= 0) { var l = this_array[i]&0x3fff; var h = this_array[i++]>>14; var m = xh*l+h*xl; l = xl*l+((m&0x3fff)<<14)+w_array[j]+c; c = (l>>28)+(m>>14)+xh*h; w_array[j++] = l&0xfffffff; } return c; }
  • 18. var l = this_array[i]&0x3fff GetLocal: this_array 0x7f4d02909bf6: mov 0x0(%r13), %r10 GetLocal: i (int32; type check hoisted) 0x7f4d02909bfa: mov -0x40(%r13), %eax GetButterfly: this_array 0x7f4d02909bfe: mov 0x8(%r10), %rdx GetByVal: this_array[i] (array check hoisted) 0x7f4d02909c02: cmp -0x4(%rdx), %eax 0x7f4d02909c05: jae 0x7f4d02909ed2 0x7f4d02909c0b: mov 0x10(%rdx,%rax,8), %rcx 0x7f4d02909c10: test %rcx, %rcx 0x7f4d02909c13: jz 0x7f4d02909ee8 BitAnd: 0x7f4d02909c19: cmp %r14, %rcx 0x7f4d02909c1c: jb 0x7f4d02909efe 0x7f4d02909c22: mov %rcx, %rbx 0x7f4d02909c25: and $0x3fff, %ebx
  • 19. RayTrace Speedup: 2.5X Floating point, objects with floating-point fields
  • 20. normalize() normalize : function() { var m = this.magnitude(); return new Flog.RayTracer.Vector(this.x / m, this.y / m, this.z / m); }, DFG inlines as it compiles: inlines this.magnitude() ArithDiv: 0x7f4d0298164b: divsd %xmm1, %xmm0 SetLocal: 0x7f4d0298164f: movd %xmm0, %rdx 0x7f4d02981654: sub %r14, %rdx 0x7f4d02981657: mov %rdx, 0x20(%r13) No typed fields (yet)
  • 21. EarleyBoyer Speedup: 2.0X Function calls, small short-lived allocations
  • 22. EarleyBoyer “Performance is a distribution, not a value” Wide distribution indicates nonuniform performance Cause in this case: nonincremental mark GC
  • 23. RegExp Speedup: 1.2X Regexp compiler test; DFG of no help
  • 25. NavierStokes Speedup: 3.0X Floating point arrays, large floating-point functions
  • 26. No automagic double arrays GetByVal: 0x7f4d02acec1f: 0x7f4d02acec23: 0x7f4d02acec29: 0x7f4d02acec2e: 0x7f4d02acec31: cmp -0x4(%rcx), %r9d jae 0x7f4d02acee0b mov 0x10(%rcx,%r9,8), %rbx test %rbx, %rbx jz 0x7f4d02acee21 GetLocal: 0x7f4d02acec37: Int32ToDouble: 0x7f4d02acec3b: 0x7f4d02acec3e: 0x7f4d02acec44: 0x7f4d02acec47: 0x7f4d02acec4d: 0x7f4d02acec50: 0x7f4d02acec53: 0x7f4d02acec58: 0x7f4d02acec5d: mov -0x50(%r13), %rdi cmp %r14, %rbx jae 0x7f4d02acec5d test %rbx, %r14 jz 0x7f4d02acee37 mov %rbx, %rsi add %r14, %rsi movd %rsi, %xmm0 jmp 0x7f4d02acec61 cvtsi2sd %ebx, %xmm0
  • 27. Getting data out of JSC jsc --options jsc -d jsc --showDFGDisassembly=true -DJIT_ENABLE_VERBOSE=1, DJIT_ENABLE_VERBOSE_OSR=1 and timestamping hacks on dataLog
  • 28. Comparative Literature V8 vs JSC: fight! Does JSC beat V8? Does JSC meet V8? Does V8 beat JSC?
  • 29. Yes
  • 30. Questions? ❧ igalia.com/compilers ❧ wingolog.org ❧ @andywingo ❧ wingolog.org/pub/jsconf-eu-2012slides.pdf