Monoliths, macroservices, microservices, cloud-native and serverless... Where do we even start? If you are a Java developer, you will likely have to work with one, some, or even all of these deployment approaches. Does this mean learning multiple frameworks, tools and methods? It certainly looks that way, based on the many deployment-specific solutions being proposed to the Java development community.
In this presentation, we will look into these solutions, weighing their strengths and weaknesses. We will also contrast them with the one-size-fits-all approach offered by modern open-source, cloud-native Java runtimes like Open Liberty. Does Open Liberty have the right technology to compete in microservice and serverless environments - can one runtime really do it all?
Cloud Performance
• Modular runtime
o Low memory, high throughput
o Low operating cost
• CI/CD optimized
o Availability on all Cloud platforms
o Cloud-native tool integration
o Optimized for Kubernetes
• Simple developer experience
o Integration with all popular IDEs and build tools
o API support – Jakarta EE and MicroProfile
Zero Migration
✓ No configuration behavior changes
✓ No runtime feature behavior changes
✓ No removals
• Never apply an iFix again
• Skipping a release does not introduce additional migration work
• Stay current with a rebuild (no app or config changes necessary)
Architecture Styles vs Typical Java Workloads

Architecture          Characteristics                                  Compatibility
Monoliths             Long-lived, efficient                            ✅
Macroservices         Long-lived                                       ✅
Microservices         Fast start-up/ramp-up for scaling up and down    ❌
                      (to zero) to meet demand; small footprint
Serverless/Functions  Fast start-up (in ms), short-lived               ❌
What is the JVM?
The Java Virtual Machine (JVM) is the runtime environment that executes Java code. The JVM is responsible for taking your application's bytecode and converting it to machine language for execution.
Most other languages compile code for a specific system; in Java, the compiler produces bytecode for a Java Virtual Machine, which is available on any system - hence the phrase "Write once, run anywhere".
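The compile-once model described above can be seen with the smallest possible program; the class name is illustrative. Compiling it once with `javac Hello.java` produces `Hello.class` bytecode that runs unchanged on any operating system with a JVM:

```java
// Hello.java - compile once with `javac Hello.java`; the resulting bytecode
// (Hello.class) runs on Linux, macOS, Windows, or any other platform with a JVM.
public class Hello {
    // Returning the message from a method keeps the example easy to verify.
    public static String greeting() {
        return "Write once, run anywhere";
    }

    public static void main(String[] args) {
        System.out.println(greeting());
    }
}
```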
How the JVM works
[Diagram: at compile time, Java code is compiled by javac into Java bytecode; at run time, the JVM (class loader → bytecode verifier → interpreter/JIT) executes that bytecode on top of the OS and hardware.]
JVM Interpreter
• Machine-independent bytecodes are interpreted by the JVM at runtime
• This ensures portability of Java programs across different architectures
• But it affects performance, because interpretation is relatively slow
Just-in-Time Compiler
• The answer to the performance issue is the JIT compiler, which transforms sequences of bytecodes into machine code that is cached by the JVM
• The unit of compilation is a method; to save overhead, only "hot" methods are compiled
• Compiled native machine code executes ~10x faster than bytecode-by-bytecode interpretation
• Generated code is saved in a "code cache" for reuse over the lifetime of the JVM
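A minimal sketch of how a method becomes "hot": the class below invokes a small method thousands of times so the JVM marks it hot and JIT-compiles it. The class name and iteration counts are illustrative; on HotSpot you can watch the compilation happen with `-XX:+PrintCompilation`, and on Eclipse OpenJ9 with `-Xjit:verbose`.

```java
// HotLoop.java - sum() quickly becomes "hot" and is JIT-compiled.
// Observe it with:  java -XX:+PrintCompilation HotLoop   (HotSpot)
//             or:  java -Xjit:verbose HotLoop            (Eclipse OpenJ9)
public class HotLoop {
    // A small, frequently invoked method: a typical JIT compilation unit.
    static long sum(int n) {
        long total = 0;
        for (int i = 1; i <= n; i++) {
            total += i;
        }
        return total;
    }

    public static void main(String[] args) {
        long result = 0;
        // Call the method many times so the JVM marks it hot and compiles it;
        // later iterations run from the code cache, not the interpreter.
        for (int iter = 0; iter < 20_000; iter++) {
            result = sum(1_000);
        }
        System.out.println("sum(1..1000) = " + result);
    }
}
```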
JVM – The Good
• Device independent – write once, run anywhere
• More than 25 years of improvements
• The JIT produces optimized machine code through profiling
• Efficient garbage collection
• The longer it runs, the better it runs (the JVM collects more profile data and the JIT compiles more methods)
JVM – The Bad
• The initial execution run is interpreted, which is relatively slow
• "Hot spot" methods compiled by the JIT can create CPU and memory spikes
• CPU spikes cause lower QoS
• Memory spikes cause OOM issues, including crashes
• Slow start-up time
• Slow ramp-up time
JVM Spikes
[Charts: Daytrader7 CPU consumption (CPU utilization, %) and memory footprint (resident set size, KB) over the first 90 seconds – both show spikes caused by JIT compilation.]
Container Size
Main issues:
• Need to over-provision to avoid OOM
• Very hard to do – JVMs have non-deterministic behavior
[Chart: Daytrader7 memory footprint (resident set size, KB) over 90 seconds – footprint spikes caused by JIT compilation.]
Overview of Eclipse OpenJ9
• Designed from the start to span all the operating systems needed by IBM products
• Scales from small to large: handles constrained environments as well as memory-rich ones
• Renowned for its small footprint and fast start-up and ramp-up times
• Used by the largest enterprises on the planet
IBM Semeru Runtimes
"The part of Java that's really in the clouds"
IBM-built OpenJDK runtimes powered by the Eclipse OpenJ9 JVM. No cost, stable, secure, high performance, cloud optimized, multi-platform, ready for development and production use.
Open Edition
• Open-source license (GPLv2+CE)
• Available for Java 8, 11, 17, 18 (soon 19)
Certified Edition
• IBM license
• Java SE TCK certified
• Available for Java 11, 17
JIT-as-a-Service
Decouple the JIT compiler from the JVM and let it run as an independent process: offload JIT compilation to a remote process, and treat JIT compilation as a cloud service, with remote JITServer instances managed alongside client JVMs by the Kubernetes control plane.
• Auto-managed by the orchestrator
• A mono-to-micro solution
• Local JIT still available
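The decoupling described above can be tried with two processes. This is a minimal launch sketch using the documented Eclipse OpenJ9 JITServer options; `myapp.jar` and the port number are placeholders:

```shell
# Launch sketch. Both binaries ship with IBM Semeru Runtimes / Eclipse OpenJ9 builds.

# 1) Start the JITServer as its own process:
jitserver -XX:JITServerPort=38400 &

# 2) Point a client JVM at it. If the server is unreachable or crashes,
#    the JVM falls back to its local JIT compiler:
java -XX:+UseJITServer \
     -XX:JITServerAddress=localhost \
     -XX:JITServerPort=38400 \
     -jar myapp.jar
```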
Eclipse OpenJ9 JITServer
• The JITServer feature is available in the Eclipse OpenJ9 JVM
• Known as the "Semeru Cloud Compiler" when used with Semeru Runtimes
JITServer advantages for JVM clients
• Provisioning: easier to size – only consider the needs of the application
• Performance: improved ramp-up time, because the JITServer supplies extra CPU power when the JVM needs it the most; reduced CPU consumption with the JITServer AOT cache
• Cost: reduced memory consumption means increased application density and reduced operational cost; efficient auto-scaling – only pay for what you need/use
• Resiliency: if the JITServer crashes, the JVM can continue to run and compile with its local JIT
JITServer value in Kubernetes
• https://blog.openj9.org/2021/10/20/save-money-with-jitserver-on-the-cloud-an-aws-experiment/
• Experimental test bed
  o ROSA (Red Hat OpenShift Service on AWS) – demonstrates that JITServer is not tied to IBM HW or SW
  o OCP cluster: 3 master nodes, 2 infra nodes, 3 worker nodes; worker nodes have 8 vCPUs and 16 GB RAM (only ~12.3 GB available)
  o Four different applications: AcmeAir Microservices, AcmeAir Monolithic, Petclinic (Spring Boot framework), Quarkus
  o Low amount of load, to simulate conditions seen in practice
  o OpenShift Scheduler manages pod and node deployments/placement
JITServer improves container density and cost
[Figure: pod-packing diagrams for the default configuration (three worker nodes totaling 8250 MB, 8550 MB and 8600 MB) versus the JITServer configuration (two worker nodes totaling 9250 MB and 9850 MB) – 6.3 GB less memory overall.
Legend: AM = AcmeAir monolithic, A = Auth service, B = Booking service, C = Customer service, D = Database (mongo/postgres), F = Flight service, J = JITServer, M = Main service, P = Petclinic, Q = Quarkus; numbers are pod memory sizes in MB.]
Throughput comparison
[Charts: throughput (pages/sec) over 1920 seconds for AcmeAir Microservices, AcmeAir Monolithic, Petclinic and Quarkus, comparing JITServer against the baseline. Machine load: 17.5% from apps, 7% from OpenShift.]
→ JITServer and the default configuration achieve the same level of throughput at steady state
Conclusions from high-density experiments
• JITServer can improve container density and reduce the operational costs of Java applications running in the cloud by 20-30%
• Steady-state throughput is the same despite using fewer nodes
Horizontal Pod Autoscaling in Kubernetes
• Better autoscaling behavior with JITServer due to faster ramp-up
• Less risk of tricking the HPA with transient JIT compilation overhead
Setup:
• Single-node MicroK8s cluster (16 vCPUs, 16 GB RAM)
• JVMs limited to 1 CPU, 500 MB
• JITServer limited to 8 CPUs, with the AOT cache enabled
• Load applied with JMeter: 100 threads, 10 ms think time, 60 s ramp-up time
• Autoscaler scales up when average CPU utilization exceeds 0.5P; up to 15 AcmeAir instances
[Chart: AcmeAir throughput (pages/sec) over 480 seconds when using Kubernetes autoscaling – baseline vs JITServer+AOT cache.]
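The autoscaler policy described above can be sketched with a standard kubectl command. The deployment name `acmeair` is a placeholder, and "0.5P" is read here as 50% average CPU utilization relative to the pod's CPU request:

```shell
# Scale between 1 and 15 replicas whenever average CPU utilization
# across the pods exceeds 50% of the requested CPU:
kubectl autoscale deployment acmeair --cpu-percent=50 --min=1 --max=15
```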
Final thoughts
• The JIT provides an advantage, but compilation adds overhead
• Disaggregate the JIT from the JVM → JIT compilation as a service
• Eclipse OpenJ9 JITServer (a.k.a. Semeru Cloud Compiler)
  o Available now on Linux for Java 8, Java 11 and Java 17 (IBM Semeru Runtimes)
  o Especially good for constrained environments (micro-containers)
  o Kubernetes-ready (Helm chart available, Prometheus integration)
  o Can improve ramp-up, autoscaling and the performance of short-lived applications
  o Can reduce peak memory footprint, increase app density and reduce costs
  o A Java solution to a Java problem, with no compromise
What is Liberty InstantOn?
• Uses the Linux CRIU capability to snapshot/restore the Liberty process
• Linux CRIU snapshots process state, including memory pages, file handles and network connections
• Liberty and OpenJ9 run pre-snapshot and post-restore hooks to enable a seamless user experience
Goal: "instant on" without substantially limiting the programming model or capabilities
Build time: Liberty JVM starts up → reaches the snapshot point → runs pre-snapshot hooks → initiates the snapshot → exits after the snapshot
Run time: Liberty JVM is restored from the snapshot → runs post-restore hooks → continues running
Linux CRIU is capable of reducing startup time dramatically
• ~10X reduction in first response time using the Liberty InstantOn prototype with CRIU
• Liberty InstantOn is an effort to explore productization of CRIU use with Liberty
• Main design goals
  o Usability: make it easy for an end user to consume the feature
  o Functionality: make it possible for end users to run their applications without changes

First response time (ms, lower is better):
• PingPerf: Liberty baseline 1470 → InstantOn prototype 128
• REST CRUD: Liberty baseline 3082 → InstantOn prototype 213
• Daytrader7: Liberty baseline 5462 → InstantOn prototype 310

System configuration: SUT: LinTel – SLES 15 SP3, Intel(R) Xeon(R) Gold 5118 CPU @ 2.30GHz, 2 physical cores (4 CPUs), 32 GB RAM.
Where to checkpoint?
Liberty InstantOn leverages Semeru to provide a seamless checkpoint/restore solution for developers. With checkpoint/restore, there is a tradeoff between startup time and the complexity of the restore.
Startup stages and typical durations: kernel start and feature runtime processing (300 – 2200ms) → process application (100 – 3000ms) → start application (0 – ???ms) → accept requests
Checkpoint phases:
• features
• deployment
• applications
A later checkpoint means a faster restore time, but also more complexity.
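A sketch of taking a checkpoint at one of the phases above, assuming Liberty's `server checkpoint` action accepts the phase name via `--at`; `defaultServer` is a placeholder server name:

```shell
# Take the snapshot at the latest phase listed above (fastest restore):
wlp/bin/server checkpoint defaultServer --at=applications

# Starting the server later restores it from the snapshot instead of
# performing a cold start:
wlp/bin/server start defaultServer
```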