AUTOMATIC CONSTRUCTION OF
INLINING HEURISTICS USING
MACHINE LEARNING
(S. KULKARNI, J. CAVAZOS, C. WIMMER,
D. SIMON, 2013)
NATALLIE BAIKEVICH
@LU-A-JALLA
PAPERS WE LOVE: ZÜRICH
WHY THIS PAPER?
Research papers in the .NET source (post by Matt Warren, @matthewwarren): http://mattwarren.org/2016/12/12/Research-papers-in-the-.NET-source/
Inspired by Java papers: https://lowlevelbits.org/java-papers/
MOTIVATION
• Inlining can improve performance A LOT
• Things can go REALLY WRONG
• The only thing worse than no optimization is a harmful optimization
INLINING:
FUNCTION CALL -> BODY
+ eliminates the overhead of a call
+ broadens the scope for other
optimizations
- increases register pressure
- increases memory footprint
- slower compilation
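As a minimal, hypothetical Java illustration of the transformation (the JIT performs it on its intermediate representation, not on source code), the two calls to square below disappear and the callee body is substituted at each call site:

// Before inlining: two calls, each with call/return overhead.
class Before {
    static int square(int x) { return x * x; }
    static int sumOfSquares(int a, int b) {
        return square(a) + square(b);
    }
}

// After inlining: callee bodies substituted at the call sites; the call overhead is gone
// and the enlarged scope is available to later optimizations.
class After {
    static int sumOfSquares(int a, int b) {
        return (a * a) + (b * b);
    }
}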
FACTORS
• calling context of the method
• target platform*
• other optimizations
• effect of previous inlining decisions
*Java HotSpot & Maxine VM C1X compiler: static, hard-coded heuristics, same for all platforms
APPROACH AND
IMPLEMENTATION
DEFAULT HEURISTICS AND NEAT
FEATURES
Caller and Callee
- Simple instructions count
- Method calls
- Conditional branches
- Unconditional branches
- Memory load/store
- New objects
- Other instructions
- Size (all instructions)
Calling Context
- In loop or not
- Depth of recursive call
- Depth of non-recursive call
- Current graph size
- Synchronized or not
HotSpot Only
- Loop depth
- Block weight
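A sketch of how these per-call-site features could be packaged for the learner; the class and field names below are illustrative, not the ones used in the paper or in the VMs:

// Hypothetical container for the features listed above.
final class InliningFeatures {
    // Caller/callee instruction counts (one instance per method in practice)
    int simpleInstructions;
    int methodCalls;
    int conditionalBranches;
    int unconditionalBranches;
    int memoryLoadsStores;
    int newObjects;
    int otherInstructions;
    int totalSize;
    // Calling context
    boolean inLoop;
    int recursiveCallDepth;
    int nonRecursiveCallDepth;
    int currentGraphSize;
    boolean synchronizedCall;
    // HotSpot only
    int loopDepth;
    double blockWeight;

    // Flattens the features into the numeric vector fed to the neural network.
    double[] toVector() {
        return new double[] {
            simpleInstructions, methodCalls, conditionalBranches, unconditionalBranches,
            memoryLoadsStores, newObjects, otherInstructions, totalSize,
            inLoop ? 1 : 0, recursiveCallDepth, nonRecursiveCallDepth, currentGraphSize,
            synchronizedCall ? 1 : 0, loopDepth, blockWeight
        };
    }
}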
SENSITIVITY (SPECJVM98)
SEARCH SPACE (SPECJVM98)
METHODS
Traditional supervised learning: requires a labeled training set
Each decision affects the next: difficult to study its effect in isolation
Unsupervised learning: generate a solution, measure its effectiveness
METRIC: average speedup
Large space of possible solutions -> ML
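The fitness signal used to compare candidate heuristics is the average speedup over the default heuristic across the benchmark suite. A minimal sketch of that metric, with the benchmark runs themselves left out:

// Average speedup of a candidate heuristic over the default one, per benchmark.
import java.util.Map;

final class Fitness {
    // speedup = defaultTime / candidateTime, averaged over all benchmarks
    static double averageSpeedup(Map<String, Double> defaultTimes,
                                 Map<String, Double> candidateTimes) {
        double sum = 0;
        for (Map.Entry<String, Double> e : defaultTimes.entrySet()) {
            sum += e.getValue() / candidateTimes.get(e.getKey());
        }
        return sum / defaultTimes.size();
    }
}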
INLINING HEURISTIC
(HOTSPOT)
if (calleeSize > ALWAYS_INLINE_SIZE)
    if (calleeSize > CALLER_MAX_SIZE)
        return NO;
if (inlineDepth > MAX_INLINE_DEPTH)
    return NO;
if (callWarmth > CALL_WARMTH_THRESHOLD*)
    return NO;
// Passed all tests, so we inline
return YES;
* callWarmth is estimated from the number of call invocations, the estimated profit from inlining, and the estimated amount of work
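A direct transliteration of the pseudocode above into plain Java, with placeholder threshold values just to make the control flow concrete; the real constants (and the exact warmth test) live inside HotSpot:

// Same decision procedure as above; thresholds are placeholders, not HotSpot's real values.
final class DefaultHeuristic {
    static final int ALWAYS_INLINE_SIZE = 35;        // placeholder
    static final int CALLER_MAX_SIZE = 150;          // placeholder
    static final int MAX_INLINE_DEPTH = 9;           // placeholder
    static final double CALL_WARMTH_THRESHOLD = 1.0; // placeholder

    static boolean shouldInline(int calleeSize, int inlineDepth, double callWarmth) {
        if (calleeSize > ALWAYS_INLINE_SIZE) {
            if (calleeSize > CALLER_MAX_SIZE) return false;  // callee too large for this caller
        }
        if (inlineDepth > MAX_INLINE_DEPTH) return false;    // inlining chain too deep
        if (callWarmth > CALL_WARMTH_THRESHOLD) return false; // warmth test as stated above
        return true;                                          // passed all tests, so inline
    }
}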
NEAT: NEURO-EVOLUTION OF AUGMENTING TOPOLOGIES
• Randomly generate an initial generation of neural networks
• Evaluate the performance of each NN
• Use the best NNs to produce the next generation
• Repeat
Interpretation?
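NEAT also evolves the network topology itself; the sketch below shows only the outer generate -> evaluate -> select loop, with the NEAT-specific parts (speciation, topology mutation, crossover) and the benchmark evaluation left as stubs:

// Outer evolutionary loop only; NEAT details and benchmarking are stubbed out.
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Random;

final class EvolveHeuristics {
    interface Network { boolean inline(double[] features); }

    static Network randomNetwork(Random rnd) { throw new UnsupportedOperationException("stub"); }
    static Network offspring(Network a, Network b, Random rnd) { throw new UnsupportedOperationException("stub"); }
    static double averageSpeedup(Network candidate) { throw new UnsupportedOperationException("stub"); }

    static Network evolve(int populationSize, int generations, Random rnd) {
        List<Network> population = new ArrayList<>();
        for (int i = 0; i < populationSize; i++) population.add(randomNetwork(rnd));
        Network best = population.get(0);
        for (int g = 0; g < generations; g++) {
            // Evaluate: fitness of a network is the average speedup it yields (cache this in practice).
            population.sort(Comparator.comparingDouble(EvolveHeuristics::averageSpeedup).reversed());
            best = population.get(0);
            // Select the fittest half and breed the next generation from it.
            List<Network> parents = new ArrayList<>(population.subList(0, populationSize / 2));
            List<Network> next = new ArrayList<>(parents);
            while (next.size() < populationSize) {
                Network a = parents.get(rnd.nextInt(parents.size()));
                Network b = parents.get(rnd.nextInt(parents.size()));
                next.add(offspring(a, b, rnd));
            }
            population = next;
        }
        return best;
    }
}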
CREATING HEURISTICS
Input: features
Multi-layer NN
Output: a number between 0 and 1 (>0.5 means inline)
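A sketch of how the evolved network is consulted at each call site: feature vector in, a score between 0 and 1 out, inline iff the score exceeds 0.5. A fixed one-hidden-layer network stands in here for the evolved NEAT topology:

// Learned heuristic as used at a call site; the topology here is a stand-in for the evolved one.
final class NeuralHeuristic {
    private final double[][] hiddenWeights; // [hiddenUnits][inputs]
    private final double[] outputWeights;   // [hiddenUnits]

    NeuralHeuristic(double[][] hiddenWeights, double[] outputWeights) {
        this.hiddenWeights = hiddenWeights;
        this.outputWeights = outputWeights;
    }

    private static double sigmoid(double x) { return 1.0 / (1.0 + Math.exp(-x)); }

    double score(double[] features) {
        double out = 0;
        for (int h = 0; h < hiddenWeights.length; h++) {
            double sum = 0;
            for (int i = 0; i < features.length; i++) sum += hiddenWeights[h][i] * features[i];
            out += outputWeights[h] * sigmoid(sum);
        }
        return sigmoid(out); // squashed into (0, 1)
    }

    boolean shouldInline(double[] features) {
        return score(features) > 0.5;
    }
}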
RESULTS
BENCHMARKING INLINERS
DECISION TREE FOR READABLE HEURISTICS
• The best NEAT NN -> training dataset
• Construct decision tree
• Prune
Average performance gain over the default heuristic:
NEAT 11%
Decision Tree 10%
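The readable heuristic is obtained by querying the best evolved network on many recorded call sites and fitting a decision tree to its answers. A sketch of that labelling step (the tree induction and pruning themselves are not shown):

// Build a labeled training set from the best network's decisions; names are illustrative.
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class TreeTrainingSet {
    record Example(double[] features, boolean inline) {}

    // bestNetwork is the decision function of the best evolved network (e.g. score > 0.5).
    static List<Example> label(List<double[]> callSiteFeatures, Predicate<double[]> bestNetwork) {
        List<Example> examples = new ArrayList<>();
        for (double[] features : callSiteFeatures) {
            examples.add(new Example(features, bestNetwork.test(features)));
        }
        return examples;
    }
}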
WHAT IS USEFUL?
Probability of execution
Close to the size threshold BUT many memory operations => INLINE
Factors depend on other factors
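An invented fragment in the shape of an extracted decision-tree heuristic, illustrating the interaction described above: near the size limit, many memory operations still favour inlining. All thresholds are illustrative, not learned values from the paper:

// Tree-shaped heuristic fragment with made-up thresholds, for illustration only.
final class TreeShapedHeuristic {
    static boolean shouldInline(int calleeSize, int memoryOps, double executionProbability) {
        if (executionProbability < 0.01) return false;       // rarely executed call site
        if (calleeSize <= 30) return true;                    // clearly small: always worth inlining
        if (calleeSize <= 40 && memoryOps >= 8) return true;  // close to the size threshold,
                                                              // but many loads/stores => inline
        return false;
    }
}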
BENCHMARKS: ALGORITHMS
BENCHMARKS: FEATURES
INLINING:
GOOD OR EVIL?
MANUAL INLINING EXAMPLES IN F#
OUTSMARTING COMPILERS: TO BE OR NOT TO BE?
Let's take the .NET world as an example:
• JIT inlining: automatic + manual hints (e.g. MethodImplAttribute)
• Manual inlining (e.g. the inline keyword in F#)
What could possibly go wrong?
CONCLUSIONS
• Inlining is important
• NEAT and decision trees make it possible to construct good heuristics
• Average performance improvement over the Java HotSpot VM and Maxine VM: 11%
• Decision trees for readable heuristics
REFERENCES: PAPERS
• Automatic Construction of Inlining Heuristics using Machine Learning: https://www.eecis.udel.edu/~skulkarn/papers/cgo-2013.pdf
• Evolving Neural Networks through Augmenting Topologies: http://nn.cs.utexas.edu/downloads/papers/stanley.ec02.pdf
• Meta Optimization: Improving Compiler Heuristics with Machine Learning: http://groups.csail.mit.edu/commit/papers/03/metaopt-pldi.pdf
• Adaptive Online Context-Sensitive Inlining: http://www.research.ibm.com/people/d/dgrove/papers/cgo03.pdf
REFERENCES: .NET & JAVA
 Research papers in the .NET
source:http://mattwarren.org/2016/12/12/Research-
papers-in-the-.NET-source/
 CoreCLR
inlining:https://github.com/dotnet/coreclr/blob/maste
r/src/jit/inline.cpp
 CoreCLR
intrinsics:https://github.com/dotnet/coreclr/blob/mas
ter/src/inc/corinfo.h#L908-L965
 Java papers: https://lowlevelbits.org/java-papers/
 Scala Benchmarking project: http://scalabench.org/
