Graal and Truffle:
One VM to Rule Them All

Thomas Wuerthinger
Oracle Labs
@thomaswue
12-December-2013,
at ETH Zurich
Disclaimer
The following is intended to provide some insight into a line of
research in Oracle Labs. It is intended for information purposes
only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and
should not be relied upon in making purchasing decisions. The
development, release, and timing of any features or
functionality described in connection with any Oracle product or
service remains at the sole discretion of Oracle. Any views
expressed in this presentation are my own and do not
necessarily reflect the views of Oracle.

Copyright © 2013, Oracle and/or its affiliates. All rights reserved.
Agenda

§  One VM to Rule Them All?
§  Dynamic Compilation
§  Graal Compiler
§  Truffle System
§  Q&A

One Language to Rule Them All?
Let’s ask a search engine…

One Language to Rule Them All?
Let’s ask Stack Overflow…

Relative Speed of Programming Languages
(as measured by the Computer Language Benchmarks Game, ~1y ago)

Goal: One VM for all languages means
interoperability and being able to
choose the best language for the task!

Static versus Dynamic Compilation (1)
§  Static (or ahead-of-time) Compilation
–  Compilation happens before program is run.
–  Can include profiling feedback from sample application runs.

§  Dynamic (or just-in-time) Compilation
–  Compilation happens while the program is running.
–  Baseline execution (interpreter or simple compiler) gathers profiling feedback.
–  Optimization => Deoptimization => Reoptimization cycles.
–  On-stack replacement (OSR) to switch between the tiers (two or more execution modes).

Static versus Dynamic Compilation (2)
§  Static (or ahead-of-time) Compilation
–  Fast start-up, because compilation and profiling are not part of application execution time.
–  Predictable performance, as only the source program affects the generated machine code.

§  Dynamic (or just-in-time) Compilation
–  Can exploit exact target platform properties when generating machine code.
–  Profiling feedback captures part of the application behavior and increases code quality.
–  The deoptimization capabilities allow the optimized code to be incomplete and/or use aggressive speculation.
–  Can use assumptions about the current state of the system (e.g., loaded classes) in the generated code.

Profiling Feedback for Java
§  Branch probabilities
–  Never-taken branches can be omitted.
–  Exact probabilities allow if-cascade reordering.

§  Loop frequencies
–  Guide loop unrolling and loop-invariant code motion.

§  Type profile
–  Optimize instanceof and checkcast type checks (i.e., speculate that only a specific set of types occurs).
–  Optimize virtual calls or interface calls.

Profiling feedback only helps when the program behavior during
the observed period matches the overall program behavior.
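
To make the type-profile point concrete, here is a minimal sketch in plain Java of what speculating on a receiver type amounts to. The classes Shape, Circle, Square, and TypeProfileDemo are hypothetical and come from neither HotSpot nor Graal. A real VM would deoptimize on a failed guard; the counter below only stands in for that.

```java
// Hypothetical stand-ins; none of these names come from HotSpot or Graal.
interface Shape {
    double area();
}

final class Circle implements Shape {
    final double r;
    Circle(double r) { this.r = r; }
    public double area() { return Math.PI * r * r; }
}

final class Square implements Shape {
    final double side;
    Square(double side) { this.side = side; }
    public double area() { return side * side; }
}

final class TypeProfileDemo {
    /** Counts failed speculations (a real VM would deoptimize instead). */
    static int slowPathHits = 0;

    /** What a baseline tier executes: a plain interface call. */
    static double areaGeneric(Shape s) {
        return s.area();
    }

    /** What an optimizing tier could emit if the profile only ever saw Circle. */
    static double areaSpeculated(Shape s) {
        if (s.getClass() == Circle.class) {   // cheap exact-type guard
            Circle c = (Circle) s;
            return Math.PI * c.r * c.r;       // callee body inlined by hand
        }
        slowPathHits++;                       // speculation failed
        return areaGeneric(s);                // fall back to the generic call
    }

    public static void main(String[] args) {
        System.out.println(areaSpeculated(new Circle(1.0)));
        System.out.println(areaSpeculated(new Square(2.0)));
        System.out.println("slow path hits: " + slowPathHits);
    }
}
```

If the observed period never contained a Square, the guarded version is faster for the common case but pays the slow path whenever the later behavior differs from the profile.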

Static Single Assignment (SSA) Form
§  Every variable is assigned only once.
§  Phis capture values coming from different control flow branches.
§  Commonly used in compilers as it simplifies optimizations and traversal along the def-use and use-def chain.

Original code:

    ...
    if (condition) {
        x = value1 + value2;
    } else {
        x = value2;
    }
    return x;

In SSA form:

    ...
    if (condition) {
        x1 = value1 + value2;
    } else {
        x2 = value2;
    }
    x3 = phi(x1, x2);
    return x3;

Graal is an …

... extensible,
dynamic compiler using
object-oriented Java programming,
a graph intermediate representation,
and Java snippets.

HotSpotVM versus GraalVM
(Figure: two compiler stacks side by side.)
HotSpotVM, compilers written in C++: Client (30k LOC) and Server (120k LOC), on top of Compiler Interface, Compilation Queue, and HotSpot.
GraalVM, compiler written in Java: Graal (60k LOC), on top of Compilation Queue, Compiler Interface, and HotSpot.

Why Java?
Robustness: Runtime exceptions not fatal.
Reflection: Annotations instead of macros.
Meta-Evaluation: IR subgraph expressible in Java code.
Extensibility: No language barrier to the application.
Tooling: Java IDEs speed up the development process.

Snippets for Graph Construction
Manual construction (the graph built as data):
Node max(ValueNode a, ValueNode b) {
    // b < a is the same condition as a > b in the snippet below.
    IfNode ifNode = new IfNode(new IntegerLessThanNode(b, a));
    ifNode.trueSuccessor().setNext(new ReturnNode(a));
    ifNode.falseSuccessor().setNext(new ReturnNode(b));
    return ifNode;
}

Expression as snippet (the graph written as code):
int max(int a, int b) {
    if (a > b) return a;
    else return b;
}

Lowering
§  Replace one node with multiple other nodes.
–  New nodes provide more detailed description of semantics.
–  New nodes can be optimized and moved separately.

§  General Java lowerings
–  Example: Replace an array store with null check, bounds check, store check, and write operation.
if (array != null && index >= 0 && index < array.length &&
        canAssign(array.getClass().getComponentType(), value)) {
    *(array + 16 + index*8) = value;
} else { deoptimize; }

§  VM-specific lowerings
–  Example: Replace a monitorenter with code that depends on the locking scheme used by the VM.
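
The array-store lowering above can be mimicked in plain Java. This is only a sketch under assumptions: LoweringDemo and its Deoptimize exception are hypothetical names, and the exception merely stands in for a transfer to the interpreter. A real compiler performs this rewrite on IR nodes, not on source code.

```java
// Hypothetical sketch: plain Java standing in for compiler IR nodes.
final class LoweringDemo {
    /** Stand-in for "deoptimize"; a real VM would transfer to the interpreter. */
    static final class Deoptimize extends RuntimeException {
        Deoptimize(String reason) { super(reason); }
    }

    /** High-level operation: one array store with all checks implicit. */
    static void storeHighLevel(Object[] array, int index, Object value) {
        array[index] = value;
    }

    /** Lowered form: every implicit check is now a separate, movable step. */
    static void storeLowered(Object[] array, int index, Object value) {
        if (array == null) {
            throw new Deoptimize("null check");
        }
        if (index < 0 || index >= array.length) {
            throw new Deoptimize("bounds check");
        }
        Class<?> component = array.getClass().getComponentType();
        if (value != null && !component.isInstance(value)) {
            throw new Deoptimize("store check");
        }
        // The raw write. Java re-checks here; compiled code would not.
        array[index] = value;
    }

    public static void main(String[] args) {
        Object[] strings = new String[2];
        storeLowered(strings, 0, "ok");
        System.out.println(strings[0]);
        try {
            storeLowered(strings, 1, Integer.valueOf(1));
        } catch (Deoptimize e) {
            System.out.println("deoptimize: " + e.getMessage());
        }
    }
}
```

Once the checks are separate steps, an optimizer can remove or hoist each one independently, for example dropping the null check when the array was just allocated.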

Gradual Lowering
(Figure: chart of nodes per bytecode, on a scale from 0 to 3, for Graal, Client, and Server at four stages: after parsing, after optimizations, after lowering, and before code emission.)

Numbers obtained while running the DaCapo benchmark suite.

Extensibility
•  Multiple Target Platforms (AMD64, SPARC, PTX, HSAIL)
•  Multiple Runtimes (HotSpot and Maxine)
•  Adding new types of Nodes
•  Adding new compiler Phases
abstract class Phase { abstract void run(Graph g); }
for (IfNode n : graph.getNodes(IfNode.class)) { ... }

Compiler has about 100 different individual modules.
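
As an illustration of how a phase and typed node iteration fit together, here is a self-contained toy. All class names (ToyNode, ToyGraph, ToyPhase, FoldConstantIfPhase, and so on) are hypothetical; the real Graal classes are richer and differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy IR; not the Graal classes.
abstract class ToyNode { }

final class ToyParameter extends ToyNode { }

final class ToyConstant extends ToyNode {
    final boolean value;
    ToyConstant(boolean value) { this.value = value; }
}

final class ToyIf extends ToyNode {
    final ToyNode condition;
    boolean folded;     // set by the phase below
    boolean foldedTo;   // meaningful only when folded is true
    ToyIf(ToyNode condition) { this.condition = condition; }
}

final class ToyGraph {
    private final List<ToyNode> nodes = new ArrayList<>();

    <T extends ToyNode> T add(T node) {
        nodes.add(node);
        return node;
    }

    /** Typed iteration, mirroring graph.getNodes(IfNode.class) on the slide. */
    <T extends ToyNode> List<T> getNodes(Class<T> type) {
        List<T> result = new ArrayList<>();
        for (ToyNode n : nodes) {
            if (type.isInstance(n)) {
                result.add(type.cast(n));
            }
        }
        return result;
    }
}

abstract class ToyPhase {
    abstract void run(ToyGraph g);
}

/** Example phase: marks every If whose condition is a constant as folded. */
final class FoldConstantIfPhase extends ToyPhase {
    @Override
    void run(ToyGraph g) {
        for (ToyIf n : g.getNodes(ToyIf.class)) {
            if (n.condition instanceof ToyConstant) {
                n.folded = true;
                n.foldedTo = ((ToyConstant) n.condition).value;
            }
        }
    }
}

final class PhaseDemo {
    public static void main(String[] args) {
        ToyGraph g = new ToyGraph();
        ToyIf constantIf = g.add(new ToyIf(g.add(new ToyConstant(true))));
        ToyIf dynamicIf = g.add(new ToyIf(g.add(new ToyParameter())));
        new FoldConstantIfPhase().run(g);
        System.out.println(constantIf.folded + " " + dynamicIf.folded);
    }
}
```

A new optimization then only needs another ToyPhase subclass; nothing else in the toy graph has to change.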

Graph IR
•  Static single assignment (SSA) form with def-use and use-def edges.
•  Program dependence graph (sea of nodes), but with explicit distinction between control flow and data flow edges.
•  Graph visualization tools: IdealGraphVisualizer and c1visualizer.

    ...
    if (condition) {
        result = value1 + value2;
    } else {
        result = value2;
    }
    return result;

(Figure: the graph for this code, with the nodes condition, If, Begin, Begin, End, End, Merge, Add, Phi, Return, value1, and value2.)

Guards
int get(x) {
    return x.field;
}

Guards
int get(x) {
    if (cond) return x.field;
    else return 0;
}

Eliding Exception Edges
Operation          Potential    Actual    Actual/Potential
Invoke               1296646     14454    1.11%
BoundsCheck           166770       498    0.30%
NullCheck            1525061       686    0.04%
OutOfMemory           110078         0    0.00%
CheckCast              99192         0    0.00%
DivRem                  6082         0    0.00%
MonitorNullCheck       33631         0    0.00%
TOTAL                3237460     15638    0.48%

Numbers obtained while running the DaCapo benchmark suite.

Graal GPU Backends
(Figure: compilation paths. Boxes: JavaScript, Ruby, Python, …; Java bytecodes; Truffle AST; Graal IR; PTX; HSAIL.)

Java Peak Performance
§  SPECjvm2008
(Figure: bar chart for Client, Graal, and Server; bar labels 114, 100, and 76; y-axis from 0 to 120.)
Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM
Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013

Scala Peak Performance
§  Scala-Dacapo Benchmark Suite
(Figure: bar chart for Client, Graal, and Server; bar labels 100, 106, and 61; y-axis from 0 to 120.)
Configuration: Intel Core i7-3770 @ 3.4 GHz, 4 cores, 8 threads, 16 GB RAM
Comparison against HotSpot changeset tag hs25-b37 from June 13, 2013

Your Compiler Extension?
http://openjdk.java.net/projects/graal/
graal-dev@openjdk.java.net
$ hg clone http://hg.openjdk.java.net/graal/graal
$ cd graal
$ ./mx.sh --vm graal build
$ ./mx.sh ideinit
$ ./mx.sh --vm graal vm

§  Graal Resources

https://wiki.openjdk.java.net/display/Graal/Main
§  Graal License: GPLv2

“Write Your Own Language”
Current situation
–  Prototype a new language: parser and language work to build syntax tree (AST), AST interpreter.
–  Write a “real” VM: in C/C++, still using AST interpreter; spend a lot of time implementing runtime system, GC, …
–  People start using it.
–  People complain about performance.
–  Define a bytecode format and write bytecode interpreter.
–  Performance is still bad.
–  Write a JIT compiler; improve the garbage collector.

How it should be
–  Prototype a new language in Java: parser and language work to build syntax tree (AST); execute using AST interpreter.
–  People start using it.
–  And it is already fast.

Truffle: System Structure

Layer                            Written by               Written in
Guest Language Application       Application Developer    Guest Language
Guest Language Implementation    Language Developer       Managed Host Language
Host Services                    VM Expert                Managed Host Language or Unmanaged Language
OS                               OS Expert                Unmanaged Language (typically C or C++)

Speculate and Optimize …

(Figure: three stages. An AST interpreter with uninitialized nodes becomes, through node rewriting for profiling feedback, an AST interpreter with rewritten nodes, which becomes compiled code through compilation using partial evaluation.)
Node transitions: U (Uninitialized), I (Integer), S (String), D (Double), G (Generic).

Partial Evaluation
§  Example function:
–  f(x, y) = x + y + 1

§  Partial evaluation of example function:
–  g(y) = f(1, y) = 1 + y + 1 = y + 2

§  Interpreter function:
–  f(program, arguments) = calculations to interpret the program

§  Partial evaluation of interpreter function (first Futamura projection):
–  g(arguments) = f(#specificProgram, arguments) = compiled version of #specificProgram that takes arguments as parameters
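
The two examples above can be played through in a few lines of Java. This is a sketch under assumptions: PartialEvalDemo, its toy "+/-" program format, and the hand-written specialize method are hypothetical illustrations of the idea, not how Truffle performs partial evaluation.

```java
import java.util.function.IntUnaryOperator;

final class PartialEvalDemo {
    /** The example function from the slide: f(x, y) = x + y + 1. */
    static int f(int x, int y) {
        return x + y + 1;
    }

    /** g(y) = f(1, y), simplified by hand to y + 2. */
    static int g(int y) {
        return y + 2;
    }

    /** A tiny interpreter function: the program is a string of '+' and '-' steps. */
    static int interpret(String program, int argument) {
        int acc = argument;
        for (char op : program.toCharArray()) {
            switch (op) {
                case '+': acc += 1; break;
                case '-': acc -= 1; break;
                default: throw new IllegalArgumentException("unknown op: " + op);
            }
        }
        return acc;
    }

    /**
     * Hand-written specialization of interpret for one fixed program.
     * All dispatch on the program text happens once, here; the returned
     * function only performs a single addition on its argument.
     */
    static IntUnaryOperator specialize(String program) {
        final int delta = interpret(program, 0);  // valid because every op adds a constant
        return argument -> argument + delta;
    }

    public static void main(String[] args) {
        System.out.println(f(1, 5) + " == " + g(5));
        IntUnaryOperator compiled = specialize("++-+");
        System.out.println(interpret("++-+", 10) + " == " + compiled.applyAsInt(10));
    }
}
```

The specialized function gives the same answers as the interpreter for that program, but no longer looks at the program at run time, which is the essence of the first Futamura projection.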

… and Deoptimize and Reoptimize!

(Figure: three stages. Compiled code undergoes deoptimization to the AST interpreter, followed by node rewriting to update profiling feedback and recompilation using partial evaluation. Node labels: I, D, G.)

Object add(Object a, Object b) {
    if (a instanceof Integer && b instanceof Integer) {
        return (int)a + (int)b;
    } else if (a instanceof String && b instanceof String) {
        return (String)a + (String)b;
    } else {
        return genericAdd(a, b);
    }
}

int add(int a, int b) {
    return a + b;
}

String add(String a, String b) {
    return a + b;
}

Object add(Object a, Object b) {
    return genericAdd(a, b);
}

Node Implementation
class IAddNode extends BinaryNode {
    int executeInt(Frame f) throws UnexpectedResult {
        int a;
        try {
            a = left.executeInt(f);
        } catch (UnexpectedResult ex) {
            // Left operand is not an int: rewrite this node.
            throw rewrite(f, ex.result, right.execute(f));
        }
        int b;
        try {
            b = right.executeInt(f);
        } catch (UnexpectedResult ex) {
            // Right operand is not an int: rewrite this node.
            throw rewrite(f, a, ex.result);
        }
        try {
            return Math.addExact(a, b);
        } catch (ArithmeticException ex) {
            // The int addition overflowed: rewrite this node.
            throw rewrite(f, a, b);
        }
    }
}

Specializing FSA
(Figure: state machine of node specializations, with the states Uninitialized, Double, String, and Generic.)
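
A node that walks through such a state machine can be sketched in plain Java. Everything here is an assumption made for illustration: AddNodeDemo, its State enum (which includes an INTEGER state matching the IAddNode example), and the rule that any failed specialization goes straight to GENERIC are not taken from the Truffle API.

```java
// Hypothetical miniature of a self-specializing node; not the Truffle API.
final class AddNodeDemo {
    enum State { UNINITIALIZED, INTEGER, DOUBLE, STRING, GENERIC }

    /** One add operation in an AST; its state only ever moves toward GENERIC. */
    static final class AddNode {
        State state = State.UNINITIALIZED;

        Object execute(Object a, Object b) {
            switch (state) {
                case INTEGER:
                    if (a instanceof Integer && b instanceof Integer) {
                        try {
                            return Math.addExact((Integer) a, (Integer) b);
                        } catch (ArithmeticException overflow) {
                            // Overflow: leave the switch and rewrite below.
                        }
                    }
                    break;
                case DOUBLE:
                    if (a instanceof Double && b instanceof Double) {
                        return (Double) a + (Double) b;
                    }
                    break;
                case STRING:
                    if (a instanceof String && b instanceof String) {
                        return (String) a + (String) b;
                    }
                    break;
                case GENERIC:
                    return genericAdd(a, b);
                default:
                    break;
            }
            state = nextState(a, b);   // "rewrite" the node
            return execute(a, b);      // re-execute in the new state
        }

        private State nextState(Object a, Object b) {
            if (state != State.UNINITIALIZED) {
                return State.GENERIC;  // a specialization failed: give up speculating
            }
            if (a instanceof Integer && b instanceof Integer) return State.INTEGER;
            if (a instanceof Double && b instanceof Double) return State.DOUBLE;
            if (a instanceof String && b instanceof String) return State.STRING;
            return State.GENERIC;
        }

        private static Object genericAdd(Object a, Object b) {
            if (a instanceof String || b instanceof String) {
                return String.valueOf(a) + String.valueOf(b);
            }
            if (a instanceof Number && b instanceof Number) {
                return ((Number) a).doubleValue() + ((Number) b).doubleValue();
            }
            throw new IllegalArgumentException("cannot add " + a + " and " + b);
        }
    }

    public static void main(String[] args) {
        AddNode node = new AddNode();
        System.out.println(node.execute(1, 2) + " in state " + node.state);
        System.out.println(node.execute(Integer.MAX_VALUE, 1) + " in state " + node.state);
    }
}
```

Because the state never moves back, the number of rewrites per node is bounded, which is what makes it reasonable to compile the tree once it has stabilized.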
Truffle DSL
@Specialization(rewriteOn=ArithmeticException.class)
int addInt(int a, int b) {
    return Math.addExact(a, b);
}

@Specialization
double addDouble(double a, double b) {
    return a + b;
}

@Generic
Object addGeneric(Frame f, Object a, Object b) {
    // Handling of String omitted for simplicity.
    Number aNum = Runtime.toNumber(f, a);
    Number bNum = Runtime.toNumber(f, b);
    return Double.valueOf(aNum.doubleValue() +
                          bNum.doubleValue());
}

Inline Caching
(Figure: inline cache states uninitialized, monomorphic, polymorphic, and megamorphic, drawn as chains of U, S, and G nodes.)
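
The four inline-cache states can be sketched as a small call-site class. This is a hypothetical illustration: InlineCacheDemo, the MAX_ENTRIES limit of 2, and the string-producing targets are assumptions chosen to keep the example short, not details from the talk.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical sketch of a call-site inline cache; names are illustrative only.
final class InlineCacheDemo {
    static final int MAX_ENTRIES = 2;  // assumed limit before going megamorphic

    static final class CallSite {
        private final List<Class<?>> cachedTypes = new ArrayList<>();
        private final List<Function<Object, String>> cachedTargets = new ArrayList<>();
        private boolean megamorphic = false;
        int lookups = 0;  // number of slow-path lookups so far

        String call(Object receiver) {
            Class<?> type = receiver.getClass();
            if (!megamorphic) {
                for (int i = 0; i < cachedTypes.size(); i++) {
                    if (cachedTypes.get(i) == type) {
                        return cachedTargets.get(i).apply(receiver);  // cache hit
                    }
                }
                Function<Object, String> target = lookup(type);       // cache miss
                if (cachedTypes.size() < MAX_ENTRIES) {
                    cachedTypes.add(type);
                    cachedTargets.add(target);
                } else {
                    megamorphic = true;   // too many types: stop caching
                    cachedTypes.clear();
                    cachedTargets.clear();
                }
                return target.apply(receiver);
            }
            return lookup(type).apply(receiver);  // generic path, always looks up
        }

        String state() {
            if (megamorphic) return "megamorphic";
            switch (cachedTypes.size()) {
                case 0: return "uninitialized";
                case 1: return "monomorphic";
                default: return "polymorphic";
            }
        }

        /** Stand-in for an expensive method lookup. */
        private Function<Object, String> lookup(Class<?> type) {
            lookups++;
            String name = type.getSimpleName();
            return receiver -> name + ":" + receiver;
        }
    }

    public static void main(String[] args) {
        CallSite site = new CallSite();
        Object[] receivers = {1, 2, "a", 1.5, 1};
        for (Object r : receivers) {
            System.out.println(site.call(r) + " -> " + site.state());
        }
    }
}
```

As long as a call site stays monomorphic or polymorphic, repeated receiver types skip the lookup entirely; only the megamorphic state pays for it on every call.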
Method Inlining

Truffle API Compiler Directives
§  Guards
if (condition) {
    // some code that is only valid if condition is true
} else {
    CompilerDirectives.transferToInterpreter();
}

§  Assumptions
Assumption assumption = Truffle.getRuntime().createAssumption();

assumption.check();
// some code that is only valid if assumption is true

assumption.invalidate();

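
To show how check and invalidate interact, here is a self-contained stand-in that does not depend on Truffle. ToyAssumption, InvalidAssumption, Global, and CachedRead are hypothetical names; only the check/invalidate pattern itself is taken from the slide.

```java
// Hypothetical stand-in for the assumption idea; not the Truffle classes.
final class AssumptionDemo {
    static final class InvalidAssumption extends Exception { }

    static final class ToyAssumption {
        private volatile boolean valid = true;

        void check() throws InvalidAssumption {
            if (!valid) throw new InvalidAssumption();
        }

        void invalidate() {
            valid = false;
        }
    }

    /** A global value that is assumed unchanged until someone redefines it. */
    static final class Global {
        private int value;
        private ToyAssumption unchanged = new ToyAssumption();

        Global(int value) { this.value = value; }

        void set(int newValue) {
            unchanged.invalidate();          // everything relying on the old value is now stale
            unchanged = new ToyAssumption();
            value = newValue;
        }
    }

    /** A reader that caches the value and is only valid while the assumption holds. */
    static final class CachedRead {
        private final Global global;
        private ToyAssumption assumption;
        private int cached;
        int slowPaths = 0;

        CachedRead(Global global) {
            this.global = global;
            refresh();
        }

        int read() {
            try {
                assumption.check();
                return cached;               // fast path: no access to the global
            } catch (InvalidAssumption e) {
                slowPaths++;                 // a real VM would deoptimize here
                refresh();
                return cached;
            }
        }

        private void refresh() {
            assumption = global.unchanged;
            cached = global.value;
        }
    }

    public static void main(String[] args) {
        Global g = new Global(1);
        CachedRead r = new CachedRead(g);
        System.out.println(r.read());
        g.set(5);
        System.out.println(r.read() + " after " + r.slowPaths + " slow path(s)");
    }
}
```

The writer pays the cost of invalidation once, so that every reader can skip the check of the actual value on its fast path.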
Performance Number Disclaimers
§  All Truffle numbers reflect the current development snapshot.
–  Subject to change at any time (hopefully improve)
–  You have to know a benchmark to understand why it is slow or fast

§  We are not claiming to have complete language implementations.
–  JavaScript: quite complete, passing 99.8% of ECMAScript262 tests
–  Ruby: passing >45% of RubySpec language tests
–  R: early prototype

§  We measure against latest versions of competitors.
§  We measure peak performance (i.e., giving each benchmark enough iterations to warm up before starting measurement).

§  Benchmarks that are not shown
–  may not run at all, or
–  may not run fast

Peak Performance: JavaScript
Speedup relative to V8
(Figure: bar chart comparing Truffle and SpiderMonkey on richards, deltablue, crypto, raytrace, navier-stokes, splay, earley-boyer, box2d, gbemu, and the composite score; y-axis from 0.0 to 3.0.)

Selection of benchmarks from Google's Octane benchmark suite v1.0

Peak Performance: Ruby
Speedup relative to JRuby 1.7.5

(Figure: bar chart comparing MRI 2.0.0, Topaz, and Truffle; y-axis from 0 to 16.)

Peak Performance: R
Speedup relative to GNU R
(Figure: bar chart with bar labels 0.8, 2.0, 2.1, 2.7, 14, 22, 23, 24, 38, 39, and 94; y-axis from 0.0 to 100.0.)

Language Implementations

Simple Language, Ruby, C, R, JavaScript, Python, Smalltalk, Java, your language?

Your Language?
http://openjdk.java.net/projects/graal/
graal-dev@openjdk.java.net
$ hg clone http://hg.openjdk.java.net/graal/graal
$ cd graal
$ ./mx.sh --vm server build
$ ./mx.sh ideinit
$ ./mx.sh --vm server unittest SumTest

§  Truffle API Resources

https://wiki.openjdk.java.net/display/Graal/Truffle+FAQ+and+Guidelines
§  Truffle API License: GPLv2 with Classpath Exception

Acknowledgements
Oracle Labs
Laurent Daynès
Erik Eckstein
Michael Haupt
Peter Kessler
Christos Kotselidis
David Leibs
Roland Schatz
Chris Seaton
Doug Simon
Michael Van De Vanter
Christian Wimmer
Christian Wirth
Mario Wolczko
Thomas Würthinger
Laura Hill (Manager)
Interns
Danilo Ansaloni
Daniele Bonetta
Shams Imam
Stephen Kell
Gregor Richards
Rifat Shariyar

JKU Linz
Prof. Hanspeter Mössenböck
Gilles Duboscq
Matthias Grimmer
Christian Häubl
Josef Haider
Christian Humer
Christian Huber
Manuel Rigger
Lukas Stadler
Bernhard Urban
Andreas Wöß
University of Edinburgh
Christophe Dubach
Juan José Fumero Alfonso
Ranjeet Singh
Toomas Remmelg
LaBRI
Floréal Morandat

University of California, Irvine
Prof. Michael Franz
Codrut Stancu
Gulfem Savrun Yeniceri
Wei Zhang
Purdue University
Prof. Jan Vitek
Tomas Kalibera
Petr Maj

Lei Zhao
T. U. Dortmund
Prof. Peter Marwedel
Helena Kotthaus
Ingo Korb
University of California, Davis
Prof. Duncan Temple Lang
Nicholas Ulle
http://openjdk.java.net/projects/graal/
graal-dev@openjdk.java.net
@thomaswue

Q/A

