SlideShare a Scribd company logo
1 of 24
Download to read offline
A Faster CRuby interpreter with dynamically
specialized IR
Vladimir Makarov
RedHat
Sep 10, 2022
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 1 / 1
Presentation outline
Project motivation and initial expectations
Dynamically specialized VM insns (IR)
Current state of the project
Microbenchmark performance comparison of
I the base interpreter
I the interpreter with specialized IR (SIR)
I YJIT and MJIT
I and early stage CRuby MIR JIT based on SIR
Future plans
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 2 / 1
The project motivation
MIR project to address shortcomings of MJIT
MIR is an universal light-weight JIT compiler
I already good for JITting static programming languages
I still a lot of work to do to make it good for dynamic programming
languages
F a generalized lazy basic block versioning based on dynamic code properties
F trace generation and optimization based on basic block cloning
F ultimate goal is meta-tracing MIR C compiler
F more details in my blog post ”Code specialization for the MIR lightweight JIT
compiler”
Introduction of YJIT was a major disruption
I need to use more pragmatic approach to use MIR in the current state
I CRuby VM insn specialization instead of one in MIR itself
Expectation register transfer language (RTL) with BB versioning
can achieve YJIT performance for some benchmarks
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 3 / 1
Code specialization
One Merriam-Webster definition of ”specialization” – ”design, train,
or fit for one particular purpose”
Code specialization is a common approach for faster code
generation
Specialized code already exists in CRuby,
I VM insns for calling methods with particular name like opt plus
Statically and dynamically specialized code
Speculatively specialized code and deoptimization
I the more dynamic language is the more (speculative) specialization you
need to be closer to static language performance
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 4 / 1
Dynamically specialized CRuby insns
Dynamic specialization in a lazy way on VM insn BB level
I usually a lot of versions of executed BB exists
Specialization currently implemented:
I hybrid stack-RTL insn specialization
I type specialization based on lazy basic block versioning
F Maxime Chevalier-Boisvert invention and the most important optimization
technique of YJIT
I different specialization based on profile info
F type specialization based on profile info
F specialized calls
F specialized instance variable and attribute access
F iterator specialization
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 5 / 1
RTL
Stack insns RTL insns
getlocal v1 # push v1
getlocal v2 # push v2
opt plus # pop v1 and v2; push v1+v2 sir plusvvv res, v1, v2 # assign v1+v2 to res
setlocal res # pop stack value and assign it to res
RTL advantages
I less insns, less insn dispatch code
I less memory traffic
RTL disadvantages
I longer insn, more time in operand decoding
I worse behaviour for working with values in stack ways (calls)
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 6 / 1
RTL (disjoint method frames)
temporaries
local
variables
ep sp
stack growth
1 2 3
-3 -2 -1
indexes:
temporaries
local
variables
ep sp
stack growth
3 2 1
-3 -2 -1
indexes:
unpredictable
distance
only ep can be used for addressing
local variables (negative offset)
and stack values (positive offset)
ep should be used for addressing local
variables and sp for stack values
I a lot of ifs on offset values – a big
performance impact on addressing
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 7 / 1
Hybrid stack-RTL insns
Hybrid stack-RTL insns to overcome pure RTL disadvantages:
sir

























plus
minus
mult
div
mod
or
and
eq
neq
lt
gt
le
ge
aref
aset

























s

s
v
i

s
v
i

s means value on stack
v means value in a local variable
i means immediate value
insns sir aset..i and insns with suffixes sss, sii are absent
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 8 / 1
Type specialized insns
sir i

























plus
minus
mult
div
mod
or
and
eq
neq
lt
gt
le
ge
aref
aset

























n
s
v
o
s
v
i

s
v
i

sir f















plus
minus
mult
div
mod
eq
neq
lt
gt
le
ge















n
s
v
o
s
v
i

s
v
i

Fixnum (prefix sir i) and FP (prefix sir f) type specialized insns:
I Difference with non-type RTL insns: result can be also a local variable
I Otherwise, strict correspondence between type-specialized insns and non-type ones is for
safe deoptimization
Many type specialized insns are generated by lazy BB versions
I See numerous Maxime Maxime Chevalier-Boisvert presentations about BBV for details
Rest of type specialized insns are generated from profile info
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 9 / 1
Profile-based specialization
sir_inspect_type
sir_inspect_fixtype
sir_inspect_flotype
sir_check_fixtype
sir_check_flotype
Profiling Stage
After profiling and further
BBV
nop
Insns to inspect value types can not be deducted from BBV
I After profiling, if inspect insns are transformed into type guards, further
type specialization is done by BBV
Specialized insns for calls and instance variable access also can be
generated from profile info
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 10 / 1
Iterators
sir_iter_start start_func
sir_cfunc_send = Lcont: sir_iter_body Lexit, block_bbv, cond_func
sir_iter_cont Lcont, arg_func
Lexit:
A lot of Ruby standard methods are implemented on C, accept iseq blocks, and
behave as iterators
Calling the interpreter from C code is very expensive
Such iterator method calls are changed by specialized insns avoiding the
interpreter exits and enters
I sir iter start start finc, where start func checks receiver type and setup block
args
I sir iter body exit label, block bbv, cond func, where cond func finishes
iterator or calls block BBV
I sir iter cont cont label, arg func, where arg func updates block args and goto
to given label
Iterators currently implemented only for fixnum times, range each, and array
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 11 / 1
Dynamic flow of specialized insns
VM stack insn iseq
iseq entry point stubs
(sir_make_bbv, sir_make_catch_bbv)
non-specialized hybrid
stack-RTL BBV insns
type-specialized BBV insns
with profiling (self modifying insns)
type- and profile-specialized
BBV insns
SIR
To future MIR-based JIT
compile
execution
execution
execution
(invalid
assumption)
execution
(normal path)
Normal IR execution flow:
I Start execution of BB with a stub
I Stub execution generates hybrid
stack-based RTL insns and
type-specialized insns with profiling
insns
I Several executions of type-specialized
insns results in type- and profile-
specialized insns
I Type- and profile-specialized insns are a
source of the MIR-based JIT
Exception IR execution flow:
I Switching to non-type specialized
stack-based RTL
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 12 / 1
Implementation and the current state
The code can be found in my github repository
https://github.com/vnmakarov/ruby
In the current state, SIR interpreter and MIR JIT are good only to run the
micro-benchmarks
Code for specialized IR generation and execution is about 3.5K lines of C
Generator of C code for MIR is about 2.5K lines of C
MIR-based JIT needs MIR library (from bbv branch) about 900KB of machine
code
Options to use the SIR interpreter: --sir, --sir-debug,
--sir-max-bb-versions=N
Option to use SIR interpreter and MIR JIT: --mirjit, -mirjit-debug
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 13 / 1
Benchmarking
Intel i7-9700K with 16GB memory under Linux FC 32
I Base interpreter
I Interpreter with SIR: –sir
I YJIT: –yjit-call-threshold=1
I MJIT: –jit-min-calls=1
I SIR+MIR: –mirjit
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 14 / 1
Micro-benchmarks (Wall time)
93% Geomean performance improvement for SIR
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
Micro-benchmarks (Wall time)
SIR is faster YJIT on while benchmark because of RTL
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
Micro-benchmarks (Wall time)
SIR is faster YJIT on nested times benchmark because of iterator
specialization
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
Micro-benchmarks (Wall time)
YJIT is faster SIR on call benchmark (an example when YJIT
specialization is better)
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
Micro-benchmarks (CPU time)
CPU times are practically the same
as wall times
MJIT Exception: GCC run in
parallel adds a lot of CPU time
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 16 / 1
Micro-benchmarks (Memory use)
Maximal resident memory size
YJIT has the biggest memory
consumption
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 17 / 1
OptCarrot
The faster interpreter provides a
modest 39% improvement
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 18 / 1
Optimized OptCarrot
Huge method generation during
execution (analog aggressive
method inlining)
YJIT behaviour is the worst
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 19 / 1
The future plans
The faster interpreter is not ready yet
I bug fixing and more optimization work
F SIR is not API and there are no compatibility problems to change it
I plans to finish it at the end of 2022
MIR-based JIT is at the very early stage of the development
I even more bug fixing and a lot of optimization work
I plans for finish implementation in the next year
Right now the faster interpreter and MIR-based JIT are more a
research project
I no commitment to submit it to CRuby
I commitment only to support MIR project itself
Adopting project ideas and/or code by Ruby developers is welcomed
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 20 / 1
Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 21 / 1

More Related Content

Similar to A Faster CRuby interpreter with dynamically specialized IR

Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, Keynote
Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, KeynoteTectonic Summit 2016: Brandon Philips, CTO of CoreOS, Keynote
Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, KeynoteCoreOS
 
IronRuby for the .NET Developer
IronRuby for the .NET DeveloperIronRuby for the .NET Developer
IronRuby for the .NET DeveloperCory Foy
 
XS Boston 2008 Paravirt Ops in Linux IA64
XS Boston 2008 Paravirt Ops in Linux IA64XS Boston 2008 Paravirt Ops in Linux IA64
XS Boston 2008 Paravirt Ops in Linux IA64The Linux Foundation
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VRISC-V International
 
CS6270 Virtual Machines - Retargetable Binary Translators
CS6270 Virtual Machines - Retargetable Binary TranslatorsCS6270 Virtual Machines - Retargetable Binary Translators
CS6270 Virtual Machines - Retargetable Binary TranslatorsKwangshin Oh
 
Readactor-Practical Code Randomization Resilient to Memory Disclosure
Readactor-Practical Code Randomization Resilient to Memory DisclosureReadactor-Practical Code Randomization Resilient to Memory Disclosure
Readactor-Practical Code Randomization Resilient to Memory Disclosurech0psticks
 
Kernel Recipes 2013 - Overview display in the Linux kernel
Kernel Recipes 2013 - Overview display in the Linux kernelKernel Recipes 2013 - Overview display in the Linux kernel
Kernel Recipes 2013 - Overview display in the Linux kernelAnne Nicolas
 
Talk: The Present and Future of Pharo
Talk: The Present and Future of PharoTalk: The Present and Future of Pharo
Talk: The Present and Future of PharoMarcus Denker
 
Experiments in Sharing Java VM Technology with CRuby
Experiments in Sharing Java VM Technology with CRubyExperiments in Sharing Java VM Technology with CRuby
Experiments in Sharing Java VM Technology with CRubyMatthew Gaudet
 
"Black Clouds and Silver Linings in Node.js Security" Liran Tal
"Black Clouds and Silver Linings in Node.js Security" Liran Tal"Black Clouds and Silver Linings in Node.js Security" Liran Tal
"Black Clouds and Silver Linings in Node.js Security" Liran TalJulia Cherniak
 
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...“Seamless Deployment of Multimedia and Machine Learning Applications at the E...
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...Edge AI and Vision Alliance
 
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetes
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetesSpryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetes
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetesBernd Alter
 
Denker - Pharo: Present and Future - 2009-07-14
Denker - Pharo: Present and Future - 2009-07-14Denker - Pharo: Present and Future - 2009-07-14
Denker - Pharo: Present and Future - 2009-07-14CHOOSE
 
ソーシャルゲームの課金認証共通基盤をどう設計したか
ソーシャルゲームの課金認証共通基盤をどう設計したかソーシャルゲームの課金認証共通基盤をどう設計したか
ソーシャルゲームの課金認証共通基盤をどう設計したかYugo Shimizu
 

Similar to A Faster CRuby interpreter with dynamically specialized IR (20)

BRKCRT-2601.pdf
BRKCRT-2601.pdfBRKCRT-2601.pdf
BRKCRT-2601.pdf
 
Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, Keynote
Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, KeynoteTectonic Summit 2016: Brandon Philips, CTO of CoreOS, Keynote
Tectonic Summit 2016: Brandon Philips, CTO of CoreOS, Keynote
 
Advanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast DeploymentAdvanced Topics in IP Multicast Deployment
Advanced Topics in IP Multicast Deployment
 
IronRuby for the .NET Developer
IronRuby for the .NET DeveloperIronRuby for the .NET Developer
IronRuby for the .NET Developer
 
Build Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVMBuild Programming Language Runtime with LLVM
Build Programming Language Runtime with LLVM
 
XS Boston 2008 Paravirt Ops in Linux IA64
XS Boston 2008 Paravirt Ops in Linux IA64XS Boston 2008 Paravirt Ops in Linux IA64
XS Boston 2008 Paravirt Ops in Linux IA64
 
Static partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-VStatic partitioning virtualization on RISC-V
Static partitioning virtualization on RISC-V
 
CS6270 Virtual Machines - Retargetable Binary Translators
CS6270 Virtual Machines - Retargetable Binary TranslatorsCS6270 Virtual Machines - Retargetable Binary Translators
CS6270 Virtual Machines - Retargetable Binary Translators
 
Readactor-Practical Code Randomization Resilient to Memory Disclosure
Readactor-Practical Code Randomization Resilient to Memory DisclosureReadactor-Practical Code Randomization Resilient to Memory Disclosure
Readactor-Practical Code Randomization Resilient to Memory Disclosure
 
vb script
vb scriptvb script
vb script
 
Kernel Recipes 2013 - Overview display in the Linux kernel
Kernel Recipes 2013 - Overview display in the Linux kernelKernel Recipes 2013 - Overview display in the Linux kernel
Kernel Recipes 2013 - Overview display in the Linux kernel
 
Talk: The Present and Future of Pharo
Talk: The Present and Future of PharoTalk: The Present and Future of Pharo
Talk: The Present and Future of Pharo
 
Experiments in Sharing Java VM Technology with CRuby
Experiments in Sharing Java VM Technology with CRubyExperiments in Sharing Java VM Technology with CRuby
Experiments in Sharing Java VM Technology with CRuby
 
"Black Clouds and Silver Linings in Node.js Security" Liran Tal
"Black Clouds and Silver Linings in Node.js Security" Liran Tal"Black Clouds and Silver Linings in Node.js Security" Liran Tal
"Black Clouds and Silver Linings in Node.js Security" Liran Tal
 
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...“Seamless Deployment of Multimedia and Machine Learning Applications at the E...
“Seamless Deployment of Multimedia and Machine Learning Applications at the E...
 
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetes
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetesSpryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetes
Spryker meetup-distribute-your-spryker-deployment-with-docker-and-kubernetes
 
Denker - Pharo: Present and Future - 2009-07-14
Denker - Pharo: Present and Future - 2009-07-14Denker - Pharo: Present and Future - 2009-07-14
Denker - Pharo: Present and Future - 2009-07-14
 
ソーシャルゲームの課金認証共通基盤をどう設計したか
ソーシャルゲームの課金認証共通基盤をどう設計したかソーシャルゲームの課金認証共通基盤をどう設計したか
ソーシャルゲームの課金認証共通基盤をどう設計したか
 
Jax keynote
Jax keynoteJax keynote
Jax keynote
 
Turbo charging v8 engine
Turbo charging v8 engineTurbo charging v8 engine
Turbo charging v8 engine
 

Recently uploaded

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024The Digital Insurer
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024The Digital Insurer
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 

Recently uploaded (20)

Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024Tata AIG General Insurance Company - Insurer Innovation Award 2024
Tata AIG General Insurance Company - Insurer Innovation Award 2024
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 

A Faster CRuby interpreter with dynamically specialized IR

  • 1. A Faster CRuby interpreter with dynamically specialized IR Vladimir Makarov RedHat Sep 10, 2022 Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 1 / 1
  • 2. Presentation outline Project motivation and initial expectations Dynamically specialized VM insns (IR) Current state of the project Microbenchmark performance comparison of I the base interpreter I the interpreter with specialized IR (SIR) I YJIT and MJIT I and early stage CRuby MIR JIT based on SIR Future plans Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 2 / 1
  • 3. The project motivation MIR project to address shortcomings of MJIT MIR is an universal light-weight JIT compiler I already good for JITting static programming languages I still a lot of work to do to make it good for dynamic programming languages F a generalized lazy basic block versioning based on dynamic code properties F trace generation and optimization based on basic block cloning F ultimate goal is meta-tracing MIR C compiler F more details in my blog post ”Code specialization for the MIR lightweight JIT compiler” Introduction of YJIT was a major disruption I need to use more pragmatic approach to use MIR in the current state I CRuby VM insn specialization instead of one in MIR itself Expectation register transfer language (RTL) with BB versioning can achieve YJIT performance for some benchmarks Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 3 / 1
  • 4. Code specialization One Merriam-Webster definition of ”specialization” – ”design, train, or fit for one particular purpose” Code specialization is a common approach for faster code generation Specialized code already exists in CRuby, I VM insns for calling methods with particular name like opt plus Statically and dynamically specialized code Speculatively specialized code and deoptimization I the more dynamic language is the more (speculative) specialization you need to be closer to static language performance Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 4 / 1
  • 5. Dynamically specialized CRuby insns Dynamic specialization in a lazy way on VM insn BB level I usually a lot of versions of executed BB exists Specialization currently implemented: I hybrid stack-RTL insn specialization I type specialization based on lazy basic block versioning F Maxime Chevalier-Boisvert invention and the most important optimization technique of YJIT I different specialization based on profile info F type specialization based on profile info F specialized calls F specialized instance variable and attribute access F iterator specialization Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 5 / 1
  • 6. RTL Stack insns RTL insns getlocal v1 # push v1 getlocal v2 # push v2 opt plus # pop v1 and v2; push v1+v2 sir plusvvv res, v1, v2 # assign v1+v2 to res setlocal res # pop stack value and assign it to res RTL advantages I less insns, less insn dispatch code I less memory traffic RTL disadvantages I longer insn, more time in operand decoding I worse behaviour for working with values in stack ways (calls) Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 6 / 1
  • 7. RTL (disjoint method frames) temporaries local variables ep sp stack growth 1 2 3 -3 -2 -1 indexes: temporaries local variables ep sp stack growth 3 2 1 -3 -2 -1 indexes: unpredictable distance only ep can be used for addressing local variables (negative offset) and stack values (positive offset) ep should be used for addressing local variables and sp for stack values I a lot of ifs on offset values – a big performance impact on addressing Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 7 / 1
  • 8. Hybrid stack-RTL insns Hybrid stack-RTL insns to overcome pure RTL disadvantages: sir                          plus minus mult div mod or and eq neq lt gt le ge aref aset                          s s v i s v i s means value on stack v means value in a local variable i means immediate value insns sir aset..i and insns with suffixes sss, sii are absent Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 8 / 1
  • 9. Type specialized insns sir i                          plus minus mult div mod or and eq neq lt gt le ge aref aset                          n s v o s v i s v i sir f                plus minus mult div mod eq neq lt gt le ge                n s v o s v i s v i Fixnum (prefix sir i) and FP (prefix sir f) type specialized insns: I Difference with non-type RTL insns: result can be also a local variable I Otherwise, strict correspondence between type-specialized insns and non-type ones is for safe deoptimization Many type specialized insns are generated by lazy BB versions I See numerous Maxime Maxime Chevalier-Boisvert presentations about BBV for details Rest of type specialized insns are generated from profile info Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 9 / 1
  • 10. Profile-based specialization sir_inspect_type sir_inspect_fixtype sir_inspect_flotype sir_check_fixtype sir_check_flotype Profiling Stage After profiling and further BBV nop Insns to inspect value types can not be deducted from BBV I After profiling, if inspect insns are transformed into type guards, further type specialization is done by BBV Specialized insns for calls and instance variable access also can be generated from profile info Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 10 / 1
  • 11. Iterators sir_iter_start start_func sir_cfunc_send = Lcont: sir_iter_body Lexit, block_bbv, cond_func sir_iter_cont Lcont, arg_func Lexit: A lot of Ruby standard methods are implemented on C, accept iseq blocks, and behave as iterators Calling the interpreter from C code is very expensive Such iterator method calls are changed by specialized insns avoiding the interpreter exits and enters I sir iter start start finc, where start func checks receiver type and setup block args I sir iter body exit label, block bbv, cond func, where cond func finishes iterator or calls block BBV I sir iter cont cont label, arg func, where arg func updates block args and goto to given label Iterators currently implemented only for fixnum times, range each, and array Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 11 / 1
  • 12. Dynamic flow of specialized insns VM stack insn iseq iseq entry point stubs (sir_make_bbv, sir_make_catch_bbv) non-specialized hybrid stack-RTL BBV insns type-specialized BBV insns with profiling (self modifying insns) type- and profile-specialized BBV insns SIR To future MIR-based JIT compile execution execution execution (invalid assumption) execution (normal path) Normal IR execution flow: I Start execution of BB with a stub I Stub execution generates hybrid stack-based RTL insns and type-specialized insns with profiling insns I Several executions of type-specialized insns results in type- and profile- specialized insns I Type- and profile-specialized insns are a source of the MIR-based JIT Exception IR execution flow: I Switching to non-type specialized stack-based RTL Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 12 / 1
  • 13. Implementation and the current state The code can be found in my github repository https://github.com/vnmakarov/ruby In the current state, SIR interpreter and MIR JIT are good only to run the micro-benchmarks Code for specialized IR generation and execution is about 3.5K lines of C Generator of C code for MIR is about 2.5K lines of C MIR-based JIT needs MIR library (from bbv branch) about 900KB of machine code Options to use the SIR interpreter: --sir, --sir-debug, --sir-max-bb-versions=N Option to use SIR interpreter and MIR JIT: --mirjit, -mirjit-debug Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 13 / 1
  • 14. Benchmarking Intel i7-9700K with 16GB memory under Linux FC 32 I Base interpreter I Interpreter with SIR: –sir I YJIT: –yjit-call-threshold=1 I MJIT: –jit-min-calls=1 I SIR+MIR: –mirjit Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 14 / 1
  • 15. Micro-benchmarks (Wall time) 93% Geomean performance improvement for SIR Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
  • 16. Micro-benchmarks (Wall time) SIR is faster YJIT on while benchmark because of RTL Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
  • 17. Micro-benchmarks (Wall time) SIR is faster YJIT on nested times benchmark because of iterator specialization Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
  • 18. Micro-benchmarks (Wall time) YJIT is faster SIR on call benchmark (an example when YJIT specialization is better) Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 15 / 1
  • 19. Micro-benchmarks (CPU time) CPU times are practically the same as wall times MJIT Exception: GCC run in parallel adds a lot of CPU time Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 16 / 1
  • 20. Micro-benchmarks (Memory use) Maximal resident memory size YJIT has the biggest memory consumption Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 17 / 1
  • 21. OptCarrot The faster interpreter provides a modest 39% improvement Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 18 / 1
  • 22. Optimized OptCarrot Huge method generation during execution (analog aggressive method inlining) YJIT behaviour is the worst Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 19 / 1
  • 23. The future plans The faster interpreter is not ready yet I bug fixing and more optimization work F SIR is not API and there are no compatibility problems to change it I plans to finish it at the end of 2022 MIR-based JIT is at the very early stage of the development I even more bug fixing and a lot of optimization work I plans for finish implementation in the next year Right now the faster interpreter and MIR-based JIT are more a research project I no commitment to submit it to CRuby I commitment only to support MIR project itself Adopting project ideas and/or code by Ruby developers is welcomed Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 20 / 1
  • 24. Vladimir Makarov (RedHat) A Faster CRuby interpreter with dynamically specialized IR Sep 10, 2022 21 / 1