It's always sunny with OpenJ9

Dan Heidinga,
Eclipse OpenJ9 Project Lead
Interpreter Lead, IBM Runtimes
@danheidinga
DanHeidinga

Java in the Cloud requirements
§ Fast startup
–Faster scaling for increased demand
§ Small footprint
–Improves density for providers
–Improves cost for applications
§ Quick / immediate ramp-up
–GB/hr is the key cost metric: if you run for less time, you pay less money

Eclipse OpenJ9
Created Sept 2017
http://www.eclipse.org/openj9
https://github.com/eclipse/openj9
Dual license: Eclipse Public License v2.0, Apache 2.0
Users and contributors very welcome
https://github.com/eclipse/openj9/blob/master/CONTRIBUTING.md

OpenJ9 helps… containers out of the box

https://twitter.com/paulbiggar/status/1228395386378678272

Don’t hard-code -Xmx for containers: avoid rebuilds when the memory limit changes
§ -XX:InitialRAMPercentage=N
– Set initial heap size as a percentage of total memory
§ -XX:MaxRAMPercentage=N
– Set maximum heap size as a percentage of total memory
§ Running in containers with set memory limits?
– OpenJ9 will base the default heap size on that limit
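
To make the percentage options concrete, here is a minimal sketch of the arithmetic they imply, plus a way to see what the running JVM actually picked. The class name HeapBudget, the helper heapBytes and the 512 MiB limit are illustrative assumptions, not OpenJ9 APIs or defaults; only Runtime.maxMemory() is a real JDK call.

```java
// Illustrative only: HeapBudget and heapBytes are hypothetical names, not OpenJ9 APIs.
final class HeapBudget {

    /** Heap size implied by a percentage of total memory, as in -XX:MaxRAMPercentage=N. */
    static long heapBytes(long totalMemoryBytes, double percent) {
        if (percent <= 0.0 || percent > 100.0) {
            throw new IllegalArgumentException("percent must be in (0, 100]");
        }
        return (long) (totalMemoryBytes * (percent / 100.0));
    }

    public static void main(String[] args) {
        long limit = 512L * 1024 * 1024; // assumed container memory limit: 512 MiB
        System.out.println("75% of the limit = " + heapBytes(limit, 75.0) + " bytes");
        // What the running JVM actually chose as its maximum heap:
        System.out.println("maxMemory()      = " + Runtime.getRuntime().maxMemory() + " bytes");
    }
}
```

Running the same class under different container memory limits, with and without -XX:MaxRAMPercentage, is a quick way to check that the heap follows the limit without touching the image.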

Out of the box idle tuning
§ When’s the best time to schedule a GC? When your app is idle!
§ OpenJ9 enables -XX:+IdleTuningGcOnIdle when running in a container
– When the VM detects your app has been idle for a configurable length of time, it can GC
– Both scavenge and global collections can occur, based on heuristics
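
The idea of "GC once the app has been idle long enough" can be sketched as a tiny state machine. This is a conceptual model under stated assumptions, not OpenJ9's implementation: the class IdleGcTrigger and its methods are hypothetical, and the real detection and heuristics live inside the VM.

```java
// Conceptual model only; OpenJ9's real idle detection and GC heuristics are internal to the VM.
final class IdleGcTrigger {
    private final long idleThresholdMillis; // the "configurable length of time"
    private long lastActivityMillis;
    private boolean collectedThisIdlePeriod;

    IdleGcTrigger(long idleThresholdMillis, long nowMillis) {
        this.idleThresholdMillis = idleThresholdMillis;
        this.lastActivityMillis = nowMillis;
    }

    /** The application did some work: the idle clock restarts. */
    void recordActivity(long nowMillis) {
        lastActivityMillis = nowMillis;
        collectedThisIdlePeriod = false;
    }

    /** Returns true at most once per idle period, once the app has been idle long enough. */
    boolean shouldCollect(long nowMillis) {
        boolean idle = nowMillis - lastActivityMillis >= idleThresholdMillis;
        if (idle && !collectedThisIdlePeriod) {
            collectedThisIdlePeriod = true;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        IdleGcTrigger trigger = new IdleGcTrigger(1000, 0);
        System.out.println("collect at t=500?  " + trigger.shouldCollect(500));
        System.out.println("collect at t=1000? " + trigger.shouldCollect(1000));
    }
}
```

Triggering only once per idle period reflects the point of the slide: the collection is scheduled while nobody is waiting on the application, rather than repeatedly while it stays idle.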

OpenJ9 helps… startup

-Xverify:none never again!
§ Use -XX:[+|-]ClassRelationshipVerifier instead for fast and safe startup
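public class A {
    public static void main(String[] args) {
        B b = new B();
        acceptC(b); // the verifier must prove that B is assignable to C
    }
    // static, so that main can call it without an instance
    public static void acceptC(C c) {}
}
// Assumed for illustration: B is a subclass of C
class C {}
class B extends C {}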

ShareClasses cache
[Diagram: Classfile, ROMClass, J9RAMClass]

ShareClasses: ROM pays off
[Diagram: JVM 1, JVM 2 and JVM 3 sharing one Shared Classes Cache]
Faster startup, smaller footprint

“Dynamic” AOT through ShareClasses
[Diagram: the Shared Classes Cache holds ROM classes and AOT code]
$ java -Xshareclasses ...

SharedClasses cache
§ -Xshareclasses
– Enables the shared classes cache
§ -Xscmx50M
– Sets the size of the cache (here 50 MB)

ShareClasses and AOT
§ Distinction between ‘cold’ and ‘warm’ runs
§ Dynamic AOT compilation
–Relocatable format
–AOT loads are ~100 times faster than JIT compilations
–More generic code → slightly less optimized
§ Generate AOT code only during start-up
§ Recompilation helps bridge the gap

Problem 1: Modifying pre-existing files creates copies.
FROM scratch
RUN echo "Class A" >myscc

FROM scratch
FROM Image_1
RUN echo "Class B" >>myscc

[Diagram: the scratch-based image holds myscc with Class A; the image built FROM Image_1 holds a full copy of myscc with Class A and Class B]

FROM scratch
FROM Image_1
FROM Image_2
RUN echo "Class C" >>myscc
[Diagram: each image carries its own copy of myscc: Class A; then Class A, Class B; then Class A, Class B, Class C]

Problem 2: We need to leave space in the cache for other images.
FROM scratch
-Xscmx? How big?

Solution: Create new cache layers in new files.
FROM scratch
RUN echo "Class A" >myscc.L0
FROM Image_1
RUN echo "Class B" >myscc.L1
FROM Image_2
RUN echo "Class C" >myscc.L2

[Diagram: with one cache file per layer, the three images hold only Class A, Class B and Class C respectively]
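
The storage difference between the two approaches is simple arithmetic, sketched below. It assumes, as the slides describe, that appending to a file from a lower image layer makes the new layer store a full copy of that file. The class and method names are hypothetical, and sizes are in arbitrary units.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: compares total image storage for one appended cache file vs. layered cache files.
final class SccLayerMath {

    /** One cache file appended in every image: each image layer stores the whole file again. */
    static long copyOnWriteTotal(List<Long> addedPerImage) {
        long total = 0;
        long fileSize = 0;
        for (long added : addedPerImage) {
            fileSize += added; // the file grows by this image's data
            total += fileSize; // and the full file is stored in this image's layer
        }
        return total;
    }

    /** One new cache file per image: each image layer stores only what it adds. */
    static long layeredTotal(List<Long> addedPerImage) {
        long total = 0;
        for (long added : addedPerImage) {
            total += added;
        }
        return total;
    }

    public static void main(String[] args) {
        List<Long> added = Arrays.asList(1L, 1L, 1L); // Class A, Class B, Class C
        System.out.println("single appended file: " + copyOnWriteTotal(added));
        System.out.println("layered cache files:  " + layeredTotal(added));
    }
}
```

With three images that each add one unit, the appended file costs 1 + 2 + 3 units across the image layers, while the layered files cost one unit each.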

Bonus: We can specify -Xscmx per layer. We don’t need to leave any extra space.
FROM scratch
FROM Image_1
FROM Image_2
How big? Doesn’t matter.
-XscmxA
-XscmxB
-XscmxC

MULTI-LAYER SHARED CLASSES CACHE
OpenJ9 Implementation Details

Example of Docker layers
Container Layer (Read-Write): DayTrader
Image Layer (Read-only): Open Liberty
Image Layer (Read-only): OpenJ9
Image Layer (Read-only): Ubuntu 16.04

Read-write OpenJ9: SCC (OpenJ9 layer data)
Read-only Ubuntu 16.04

Read-write Open Liberty: SCC (OpenJ9 layer data + Liberty layer data)
Read-only OpenJ9: SCC (OpenJ9 layer data)
SCC turned on: load OpenJ9 layer data, add Open Liberty layer data (Docker CoW)

Read-write DayTrader: SCC (OpenJ9 layer + Liberty layer + DayTrader layer data)
Read-only Open Liberty: SCC (OpenJ9 layer + Liberty layer data)
Read-only OpenJ9: SCC (OpenJ9 layer data)

Multi-layer SCC
Read-only OpenJ9: SCC (OpenJ9 layer data); cannot write to a lower layer
Top layer: SCC (Liberty layer data); write SCC in top layer

Multi-layer SCC
Read-only OpenJ9: SCC_L0 (OpenJ9 layer data); cannot write to a lower layer
Top layer: SCC_L1 (Liberty layer data); write SCC in top layer
SCC becomes layered

Multi-layer SCC
Read-write DayTrader: SCC_L2 (DayTrader layer data)
Read-only Open Liberty: SCC_L1 (Liberty layer data)
Read-only OpenJ9: SCC_L0 (OpenJ9 layer data)

Example
• Create a shared cache named "demo"
• Use -Xshareclasses:listAllCaches to find the shared cache. By default, the cache is layer 0

Example
• Traditionally, if you are running with a single-layer cache, simply use the same -Xshareclasses option to start the JVM again.
• A traditional single cache is equivalent to the multi-layer SCC case with layer number 0.

Example
• If you want to create a new layer for cache named "demo":

Example
• Find the new layer cache. The cache named "demo" now has layer 0 and layer 1 (two files).
• Future runs with -Xshareclasses:cacheDir=/tmp/,name=demo start up all existing layers.

Command line options on Multi-layer SCC
§ -Xshareclasses:createLayer
– Create a new shared cache layer
§ -Xshareclasses:layer=<number>
– Specify the top layer number. Create a new shared cache layer if the specified layer does not exist.
§ Some applications launch multiple JVMs simultaneously
– If each is running with -Xshareclasses:createLayer, N new layers are created
– Use layer=<number> to ensure only one layer is created.
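
As a rough illustration of the multi-JVM case, a launcher could give every child JVM the same explicit layer number, so they all attach to one top layer instead of each creating its own. The sketch below only builds and prints the command lines. The class LayeredLaunch, the cache name "demo" and the Worker main classes are assumptions for illustration; the name= and layer= sub-options are the ones shown in this deck.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical launcher helper: every child JVM is pinned to the same top cache layer.
final class LayeredLaunch {

    /** Builds a java command line that selects the top layer with layer=<number>. */
    static List<String> command(String cacheName, int layer, String mainClass) {
        List<String> cmd = new ArrayList<>();
        cmd.add("java");
        cmd.add("-Xshareclasses:name=" + cacheName + ",layer=" + layer);
        cmd.add(mainClass);
        return cmd;
    }

    public static void main(String[] args) {
        // Three workers started together all name layer 1, so at most one new layer is created.
        for (int i = 0; i < 3; i++) {
            System.out.println(String.join(" ", command("demo", 1, "Worker" + i)));
        }
    }
}
```

The returned list can be handed to java.lang.ProcessBuilder to actually start the child processes.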

OpenJ9 helps… footprint (and performance!)

Out of the Box: OpenJ9 uses roughly half the memory
Footprint is 60% smaller with OpenJ9
[Charts comparing HotSpot, OpenJ9, and OpenJ9 -Xshareclasses -Xquickstart]

Sidebar: Life of a running Java application
[Timeline diagram; x-axis: Time; y-axis: Size and Complexity of Class Hierarchy]
• "Big bang" (java process created)
• Ready to run main: ~750 classes, ~3 class loaders
• Startup: until the application is ready to do work; can be 1000s of classes, 100s of class loaders
• Rampup: until code paths & profile stabilize
• Steady state
• Labels on the timeline: AOT, JIT

Challenges with AOT compilation
• Native code is not platform neutral
  • Different AOT code needed for each deployment platform (Linux, Mac, Windows)
• Other usability issues
  • Some deployment options decided at build time, e.g. GC policy, ability to re-JIT, etc.
  • Different platforms: different classes loaded and methods compiled?
  • Curate lists of classes/modules, methods to compile as your application and its dependencies evolve
  • What about classes that aren’t available until the run starts?
  • What methods to compile?
• Performance: AOT compilers can only reason about what happens at runtime
  • Unlike a JIT compiler, which sees it happening

Profile Directed Feedback (PDF) may help?
• BUT: AOT code must run all possible user executions
• No longer compiling for “this” user on “this” run
• Really important to use representative input data when collecting profile for AOT
• Risk: can be misleading to use only a few input data sets
• AOT compiler can specialize to one data set and then run well on it
• But PDF can lead compiler astray if data isn’t properly representative
• Monomorphic in one runtime instance ≠ Monomorphic across all runtime instances
• Benchmarks may not stress AOT compilers properly (not many input sets)
• Cross training critically important
• Input data sets need to be curated and maintained as application and users evolve
• Profile data collection and curation responsibility is on the application provider
• Observation: PDF has not really been a huge success for static languages

Strengths and Weaknesses
[Table comparing JIT and AOT on:]
Code Performance (steady state)
Runtime: adapt to changes
Ease of use
Platform neutral deployment
Start up (ready to handle load)
Ramp up (until steady state)
Runtime: CPU & Memory

[Table comparing JIT, AOT, and AOT+JIT on:]
Ease of use
Start up (ready to handle load)
Ramp up (until steady state)
Runtime: CPU & Memory

Is that as good as it gets?

Caching JIT Compiles
• Basic idea:
  • Store JIT compiled code (JIT) in a cache for loading by other JVMs (“AOT”)
  • Goal: reach JIT compiled code performance levels earlier
  • Also: reduce the JIT compiler’s transient CPU and memory overheads
• Really different from AOT? No and yes
  • From the perspective of the second and later JVMs: code loads as if it was AOT compiled
  • First JVM: JIT compiles while the app runs, but generates code that can be cached
  • Needs metadata to validate that later runs match the first (i.e. same classes loaded the same way)
  • If invalid, don’t use cached code: instead do JIT or even more AOT recompilations
• Return to platform neutrality!
  • Different users still get compiled code tailored for their environment
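
The "validate before reuse" step can be sketched as a cache lookup that only succeeds when the stored metadata matches the current run. This is a conceptual model, not OpenJ9's shared-cache format: ValidatedCodeCache, the string fingerprint standing in for "same classes loaded the same way", and the byte array standing in for compiled code are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Conceptual model: cached compiled code is reused only if its validation metadata still matches.
final class ValidatedCodeCache {

    private static final class Entry {
        final String classFingerprint; // stand-in for "same classes loaded the same way"
        final byte[] code;             // stand-in for relocatable compiled code

        Entry(String classFingerprint, byte[] code) {
            this.classFingerprint = classFingerprint;
            this.code = code;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** First JVM: store the compiled body together with the metadata needed to validate it later. */
    void store(String method, String classFingerprint, byte[] code) {
        entries.put(method, new Entry(classFingerprint, code));
    }

    /** Later JVMs: an empty result means "not cached or no longer valid", so fall back to the JIT. */
    Optional<byte[]> load(String method, String currentFingerprint) {
        Entry entry = entries.get(method);
        if (entry == null || !entry.classFingerprint.equals(currentFingerprint)) {
            return Optional.empty();
        }
        return Optional.of(entry.code);
    }

    public static void main(String[] args) {
        ValidatedCodeCache cache = new ValidatedCodeCache();
        cache.store("A.acceptC", "classes-v1", new byte[] {1, 2, 3});
        System.out.println("same classes:    " + cache.load("A.acceptC", "classes-v1").isPresent());
        System.out.println("changed classes: " + cache.load("A.acceptC", "classes-v2").isPresent());
    }
}
```

Returning an empty Optional instead of stale code keeps the fallback explicit: the caller compiles the method again rather than running code built for a different class hierarchy.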

OpenJ9: Caching JIT code accelerates start-up
• OpenJ9 Shared Class Cache (SCC)
• Memory mapped file for caching:
• Class files*
• AOT compiled code
• Profile data, hints
• Population of the cache happens naturally and transparently at runtime
• Also -Xtune:virtualized
• Caches JIT code even more aggressively to accelerate ramp-up (under load)
• Maybe slight (5-10%) performance drop
• SCC for JCL bootstrap classes enabled by default
• Use -Xshareclasses option for full sharing
* Technically an internal format that can load faster than a .class file
[Chart: Apache Tomcat 8 Start-up Time; y-axis: normalized start-up time; bars: OpenJ9 no SCC, OpenJ9 default SCC, OpenJ9 full SCC, HotSpot; labels: 28%, 43%, 19%]

[Table comparing JIT, AOT, AOT+JIT, and Cache JIT on:]
Ease of use
Start up (ready to handle load) *
Ramp up (until steady state) *
Runtime: CPU & Memory *
* After first run

Still some “not green” boxes there
…even for caching JITs…
☹

What if the JIT became a JIT Server
[Diagram: several JVMs, each with its own JIT, connected to JIT Servers; an orchestrator provides load balancing, affinity, scaling, reliability]
• JVM client identifies methods to compile, but asks server to do the actual compilation
• JIT server asks questions to the client JVM (about classes, environment, etc.)
• Sends generated code & meta data back to be installed in client’s code cache

Benefits of an independent JIT server
• Move much of JIT induced CPU and memory spikes away from client
• Client CPU and memory consumption dictated by application
• JIT server connected to client JVM at runtime, so:
• Theoretically no loss in performance using same profile and class hierarchy info
• Still adaptable to changing conditions
• JVM client still platform neutral

Could that work?
AcmeAir rampup with JIT Server using -Xshareclasses
All JVMs run in containers, client and server on different machines with direct cable connection
Note: HotSpot takes twice as long as OpenJ9 to ramp up to about the same performance level
[Chart: AcmeAir with -Xshareclasses (Cold Run); container limits: 1P, 150M; throughput (pages/sec) vs. time (sec); series: JITServer-cold, OpenJ9-cold]
[Chart: AcmeAir with -Xshareclasses (Warm Run); container limits: 1P, 150M; throughput (pages/sec) vs. time (sec); series: JITServer-warm, OpenJ9-warm]

Could that work?
JITServer Performance: DayTrader 7 Throughput
Throughput benefits grow in constrained environments
[Charts: throughput vs. time (sec), JITServer vs. OpenJ9, at --cpus=1 with progressively smaller memory limits: -m=300m, -m=256m, -m=200m]

What about network latency?
Won’t that hurt start up and ramp up?
Will it be practical in the cloud?

JIT Server works well on Amazon AWS!
* JITaaS == JIT Server

[Table comparing JIT, AOT, AOT+JIT, Cache JIT, and JIT Server on:]
Ease of use
Platform Neutral deployment
Start up (ready to handle load) * **
Ramp up (until steady state) * **
Runtime: CPU & Memory *
* After first run
** After first run across cluster

JIT Server Current Status
• Code is fully open source at Eclipse OpenJ9
  • Has now been merged into our master branch
• Now available in AdoptOpenJDK January 2020 update releases for JDK8 and 11 on the Linux x86-64 platform
• Simple options lend themselves well to all kinds of Java workload deployments
  • Server: jitserver -XX:JITServerPort=<port> -XX:JITServerAddress=<host>
  • Client: java -XX:+UseJITServer -XX:JITServerPort=<port> -XX:JITServerAddress=<host> YourJavaApp
• We are seeking feedback on how well it works in real user environments!
• Try it now:
  • E.g. https://adoptopenjdk.net/releases.html?variant=openjdk8&jvmVariant=openj9
  • E.g. docker pull adoptopenjdk:8-jdk-openj9 (on the Linux x86-64 platform)

We are really just at the beginning…
• Primary focus has been on mechanics to move JIT compilation to a server
• Once compilation work is redirected to the server:
  • Do that work more efficiently across a cluster of JVMs (think microservices)
  • Classify and categorize JVM clients using machine learning
  • Optimize groups of microservices together
  • …

https://adoptopenjdk.net
Select the “OpenJ9” button!
