CONFIDENTIAL
ML also helps generic compilers?
Ryo Takahashi
2. Domain-Specific or Generic Compiler?
Target of CompilerGym: the generic compiler
Generic Compiler
• Optimizes arbitrary programs
• Generates executable machine code
Domain-Specific Compiler
• Pursues performance/productivity of programs in a specific domain
• Example domains: stencil computation, neural networks, statistical models (MCMC)
3. How is ML involved in generic compilation?
● Difficulty in compiler development
  • The middle-end consists of hundreds of heuristics and parameters invented by experts (e.g. JF Bastien-san)
  • This architecture does not scale for development, so it tends to fall into sub-optimal performance (even in LLVM!)
● Idea: find better optimization heuristics and parameters with ML! (a minimal sketch follows the diagram below)
[Diagram: LLVM's three-stage architecture. Front-ends (clang for C/C++/Objective-C, swiftc for Swift, rustc for Rust) lower source code to LLVM IR; the middle-end optimizes via IR transformations (e.g. loop unrolling, tiling, vectorization, thread coarsening, device mapping, …); back-end code generators target x86-64, ARM, PowerPC.]
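To make the "find better heuristics and parameters" idea concrete, below is a minimal sketch in Python, not any production tuner: a plain random search over LLVM pass pipelines, scoring each candidate by the size of the emitted bitcode as a crude codesize proxy. It assumes LLVM's opt is on PATH and an input.ll file exists; the pass list and the scoring function are illustrative assumptions.

import os
import random
import subprocess
import tempfile

# Candidate middle-end passes (new pass manager names); illustrative only.
PASSES = ["mem2reg", "sroa", "instcombine", "gvn", "simplifycfg", "loop-unroll"]

def bitcode_size(input_ll: str, pipeline: list) -> int:
    """Run opt with the given pass pipeline and return the output bitcode size
    in bytes, standing in for a real reward such as measured speedup."""
    with tempfile.NamedTemporaryFile(suffix=".bc", delete=False) as out:
        out_path = out.name
    try:
        subprocess.run(
            ["opt", "-passes=" + ",".join(pipeline), input_ll, "-o", out_path],
            check=True,
        )
        return os.path.getsize(out_path)
    finally:
        os.remove(out_path)

def random_search(input_ll: str, trials: int = 20):
    """Try random pass orderings and keep the one producing the smallest bitcode."""
    best_pipeline, best_size = None, float("inf")
    for _ in range(trials):
        pipeline = random.sample(PASSES, k=random.randint(1, len(PASSES)))
        size = bitcode_size(input_ll, pipeline)
        if size < best_size:
            best_pipeline, best_size = pipeline, size
    return best_pipeline, best_size

if __name__ == "__main__":
    pipeline, size = random_search("input.ll")
    print("best pipeline:", pipeline, "->", size, "bytes")

A real autotuner would replace the random proposals with a learned model or policy and the codesize proxy with a measured reward; that is exactly the loop CompilerGym packages up (slide 7).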
4. Early works
● Since [Calder+ TOPLAS ’97], many academic researchers have tried ML
  • Step 1: Feature engineering (e.g. hand-crafted loop features for choosing unrolling factors)
  • Step 2: Predict the best decision with traditional ML models (e.g. a decision tree; a sketch follows below)
  • Problem: feature engineering for every (optimization decision) x (target) combination is infeasible; the engineering cost is still unacceptable
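As a minimal sketch of this "hand-crafted features + traditional ML" recipe (Python/scikit-learn), the snippet below trains a decision tree to pick an unroll factor from a few illustrative loop features. The features, labels, and data are made up for illustration; they are not the features or datasets used in the early papers.

import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hand-engineered loop features: [trip_count, instructions_in_body, nest_depth].
# Labels are the unroll factor that performed best in (hypothetical) measurements.
X = np.array([
    [1000,  5, 1],
    [  16, 40, 2],
    [ 100, 10, 1],
    [   8, 80, 3],
    [ 500,  8, 1],
    [  32, 25, 2],
])
y = np.array([8, 2, 4, 1, 8, 2])  # best unroll factor per training loop

model = DecisionTreeClassifier(max_depth=3, random_state=0)
model.fit(X, y)

# Predict an unroll factor for an unseen loop.
new_loop = np.array([[256, 12, 1]])
print("predicted unroll factor:", model.predict(new_loop)[0])

The catch highlighted above: someone has to invent features like these for every optimization decision and every target, which is exactly what DeepTune (next slide) tries to avoid.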
5. DeepTune [Cummins+ PACT ’17, Best Paper!]
● Uses Embedding + LSTM layers for feature extraction
  • Feeds (almost) raw OpenCL code
  • Handles the encoded token sequence like an NLP task
  • Makes the optimization decision with a dense (affine) output layer
  • Trains a model for each target (a sketch follows below)
Chris Cummins-san (Facebook researcher)
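The sketch below shows a DeepTune-style architecture in PyTorch: token embedding, then an LSTM, then a dense classifier over the decision space (e.g. CPU vs. GPU mapping). The vocabulary size, layer dimensions, and two-way output are illustrative assumptions, not the exact hyperparameters of the paper.

import torch
import torch.nn as nn

class DeepTuneStyleModel(nn.Module):
    """Embedding -> LSTM -> dense classifier over code tokens (DeepTune-style)."""

    def __init__(self, vocab_size=128, embed_dim=64, hidden_dim=64, num_decisions=2):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.classifier = nn.Linear(hidden_dim, num_decisions)

    def forward(self, token_ids):
        # token_ids: (batch, seq_len) integer-encoded source tokens
        embedded = self.embedding(token_ids)
        _, (hidden, _) = self.lstm(embedded)
        return self.classifier(hidden[-1])  # logits over decisions, e.g. CPU vs. GPU

# Toy usage: a batch of 4 "programs", each encoded as 50 token ids.
model = DeepTuneStyleModel()
tokens = torch.randint(0, 128, (4, 50))
print(model(tokens).shape)  # torch.Size([4, 2])

The original model also consumes a couple of auxiliary scalar inputs for the device-mapping task; those are omitted here for brevity.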
6. Evaluation results
● Applied DeepTune to the device mapping decision (CPU vs. GPU)
● Outperformed the baselines on both platforms (AMD and NVIDIA)
  • Baselines: expert feature choice + decision tree, and static mapping
7. So, what is CompilerGym?
● Background
  • Compiler researchers view compiler optimization tasks as environments for RL [Leather+ FDL ’20]
● CompilerGym
  • Uses the OpenAI Gym API to expose the “agent-environment loop” (a usage sketch follows below)
  • Lets users implement their own optimizers as RL agents
  • Provides feature extractors and benchmark datasets
  • Exposes compiler APIs
● The RL formulation
  • state: the IR
  • action: an IR transformation or a context change
  • reward: speedup or codesize reduction
(DeepTune corresponds to just the feature-extraction module in this picture)
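As a usage sketch based on the CompilerGym documentation (environment and space names may differ between versions), the snippet below drives the LLVM environment through one step of the agent-environment loop, observing Autophase features and receiving a codesize-based reward.

import compiler_gym  # registers the CompilerGym environments

# LLVM environment: the state is the program's IR, actions are optimization passes,
# and the reward here is instruction-count reduction relative to -Oz.
env = compiler_gym.make(
    "llvm-v0",
    benchmark="cbench-v1/qsort",
    observation_space="Autophase",
    reward_space="IrInstructionCountOz",
)
observation = env.reset()

# One step of the agent-environment loop with a random action (an LLVM pass).
action = env.action_space.sample()
observation, reward, done, info = env.step(action)
print("reward after one random pass:", reward)

env.close()

A learned policy would replace env.action_space.sample(); because the loop is the standard Gym interface, off-the-shelf RL libraries plug in directly.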
8. Summary
● ML-based generic compiler optimization is finally becoming less of a niche academic discipline!?
● However, full-scale adoption involving industry and OSS has only just begun
● Let’s keep an eye on the activities of Chris-san and his ex-supervisor’s lab!
  (they collaborate with ARM, Microsoft Research, and so on)
