4. We design payments technology
that powers the growth of millions
of businesses around the world.
Who are we?
7000+ engineers
in over 40 countries
Managing 28+ billion
transactions per year
€250M spent in R&D
every year
Handling 150+
payment methods
5. How does your application server handle requests?
So, what’s the problem with our Java apps?
Application Server
Request 1
Request 2
Request n
Thread 1
Thread 2
Thread n
…
… Thread pool
Images: Flaticon.com/Freepik
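The thread-per-request model pictured above can be sketched in plain Java SE. This is only an illustration: the `handle` method, the 50 ms sleep and the pool size are assumptions standing in for a real application server.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class ThreadPoolServer {

    // Hypothetical request handler: it blocks its thread for the whole request.
    static String handle(int requestId) throws InterruptedException {
        Thread.sleep(50); // stands in for a database or web service call
        return "response-" + requestId;
    }

    // Serves `requests` requests on a fixed pool of `poolSize` platform threads.
    static List<String> serve(int requests, int poolSize) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (int i = 1; i <= requests; i++) {
                final int id = i;
                futures.add(pool.submit(() -> handle(id)));
            }
            List<String> responses = new ArrayList<>();
            for (Future<String> future : futures) {
                responses.add(future.get()); // waits for each request in turn
            }
            return responses;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        long start = System.nanoTime();
        List<String> responses = serve(8, 2);
        long elapsedMs = (System.nanoTime() - start) / 1_000_000;
        System.out.println(responses.size() + " responses in " + elapsedMs + " ms");
    }
}
```

With 8 requests and only 2 pool threads, requests queue up behind threads that are merely waiting, which is the problem the following slides address.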
6. Blocking, waiting, blocking, waiting, blocking, waiting …
So, what’s the problem with our Java apps?
Wait for WebService response Wait for database
7. May Reactive Be With You
Reactive vs Imperative
Request 1
Request 2
Request 3
Request 4
Request 5
…
Event
Loop
Imperative Reactive
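To make the contrast concrete, here is a minimal sketch of the two styles. `CompletableFuture` is used as a stand-in for a reactive library, and `fetchPrice` is a hypothetical remote call; a real event loop would rely on non-blocking I/O rather than on a sleeping pool thread.

```java
import java.util.concurrent.CompletableFuture;

public class ImperativeVsReactive {

    // Hypothetical remote call that takes a while to answer.
    static String fetchPrice(String item) {
        try {
            Thread.sleep(20);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return item + ":42";
    }

    // Imperative style: the calling thread blocks until the result is there.
    static String imperative(String item) {
        String price = fetchPrice(item);
        return price.toUpperCase();
    }

    // Reactive style: describe a pipeline and react when the value arrives.
    static CompletableFuture<String> reactive(String item) {
        return CompletableFuture
                .supplyAsync(() -> fetchPrice(item))
                .thenApply(String::toUpperCase);
    }

    public static void main(String[] args) {
        System.out.println(imperative("book"));
        reactive("book").thenAccept(System.out::println).join();
    }
}
```

Both variants produce the same value; the difference is who waits, and how readable the code stays once several calls are chained.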
10. One project, 3 JEPs
Virtual
Threads
Lightweight threads
JEP 444
Stable with
Java 21 LTS
Structured
Concurrency
Eases development of tasks
running on top of Virtual
Threads
JEP 453
Preview with
Java 21 LTS
Scoped
Values
Modernization,
optimization of
Thread Locals
JEP 446
Preview with
Java 21 LTS
12. From a few Platform Threads …
Platform Threads
OS Threads
OS Scheduling
Virtual Threads
OS
JVM
Kernel mode
User mode
13. … to millions of Virtual Threads
Platform (Carrier) Threads
OS Threads
OS Scheduling
JVM Scheduling (ForkJoinPool)
Virtual Threads
OS
JVM
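As a rough illustration of how cheap Virtual Threads are to create, the sketch below starts many of them with the Java 21 `Thread.ofVirtual()` builder. The class name and the thread count are arbitrary choices for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;

public class ManyVirtualThreads {

    // Starts `count` virtual threads and returns how many ran to completion.
    static int runAll(int count) throws InterruptedException {
        AtomicInteger done = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            threads.add(Thread.ofVirtual().start(done::incrementAndGet));
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return done.get();
    }

    public static void main(String[] args) throws InterruptedException {
        int count = 100_000;
        System.out.println(runAll(count) + " virtual threads completed");
    }
}
```

Trying the same count with `Thread.ofPlatform()` is a simple way to observe the difference on a given machine.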
14. Inside the JVM: the magic of Continuation
Heap
Continuation.yield(scope) continuation.run()
Virtual Thread 1
Carrier Thread a
Carrier Thread b
Carrier Thread c
…
unmount
mount
ForkJoin
Pool
wait
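The effect of unmounting can be observed indirectly. In this sketch (names and numbers are illustrative), many Virtual Threads block in `Thread.sleep`; because a blocked Virtual Thread is unmounted from its carrier, the whole run takes far less than the sum of the sleeps.

```java
import java.util.ArrayList;
import java.util.List;

public class BlockingIsCheap {

    // Runs `count` virtual threads that each sleep `sleepMs`; returns elapsed milliseconds.
    static long sleepAll(int count, long sleepMs) throws InterruptedException {
        long start = System.nanoTime();
        List<Thread> threads = new ArrayList<>(count);
        for (int i = 0; i < count; i++) {
            threads.add(Thread.startVirtualThread(() -> {
                try {
                    Thread.sleep(sleepMs); // blocking call: the virtual thread is unmounted
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws InterruptedException {
        // 1000 threads x 100 ms would take 100 s if each sleep held a carrier thread.
        System.out.println("elapsed: " + sleepAll(1000, 100) + " ms");
    }
}
```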
15. Many changes in the JVM
No impact on user code (vs Kotlin’s coroutines)
Heap and GC activity to be further monitored and tuned
Not suitable for CPU-hungry processing
Loom promises
Key points
16. Payments to grow your world
TechSquad PPT Toolbox
Demo time!
Round 1
Java SE
17. • Basic Java SE tests
• Inspired by Eclipse Jetty loom-trial (Webtide)
• Code on GitHub
• Measured on macOS (ARM M1, 8 CPUs, 16 GB RAM) with Java 20 (Temurin distro)
Methodology
21. • Up to 64 million Virtual Threads in a single JVM!
• Just 8 Carrier Threads used in the background
• Same test with Platform Threads: 4k+ on macOS and 30k+ on Linux
• About 200 Virtual Threads created/msec, 1 MB Heap/thread
• No hard limit with Virtual Threads: the limit is the Garbage Collector efficiency!
How long to start Virtual Threads?
22. One million Threads running a recursive genetic
mutation algorithm: is it possible?
DeepStack
Test conditions:
• Stack depth: 1000
• Timeout if Thread creation > 3s
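A simplified sketch of such a deep-stack test follows. It is not the demo's code: a plain recursive function stands in for the genetic mutation algorithm, the 3 s creation timeout is left out, and all names are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;

public class DeepStackSketch {

    // Recurses `depth` frames down, then waits on the latch before unwinding.
    static int recurse(int depth, CountDownLatch release) throws InterruptedException {
        if (depth == 0) {
            release.await();
            return 0;
        }
        return 1 + recurse(depth - 1, release);
    }

    // Starts `threadCount` virtual threads and returns how many reached the full depth.
    static int run(int threadCount, int depth) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger completed = new AtomicInteger();
        List<Thread> threads = new ArrayList<>(threadCount);
        for (int i = 0; i < threadCount; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    if (recurse(depth, release) == depth) {
                        completed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        release.countDown(); // once every thread is started, let them all unwind
        for (Thread thread : threads) {
            thread.join();
        }
        return completed.get();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(10_000, 1000) + " threads reached depth 1000");
    }
}
```

Threads that reach the bottom before the latch opens keep their 1000-frame stack alive on the heap, which is what makes this kind of test memory-bound.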
25. First results
How to increase the number of Virtual Threads? …
DeepStack
Platform Threads Virtual Threads
4k+ 30k+ 30k+
27. Platform and Virtual Threads: Takeaways
Platform Threads
Java’s traditional unit of concurrency
Trivial wrappers around OS Threads
Scarce resources
Expensive to create & block
Countermeasures: thread pools, reactive
programming, cooperative scheduling
(async/await construct)
Virtual Threads
A new lightweight construct
Not an OS resource
Cheap to create & block
No countermeasures needed
Price to pay: heap memory footprint
28. Risk of thread pinning
Continuation.yield(scope)
Virtual Thread 1
Carrier Thread a
Carrier Thread b
Carrier Thread c
…
unmount
ForkJoin
Pool
wait
{native code/JNI}
May reference addresses on the stack
(not supported by unmount/mount mechanism)
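Besides native code, blocking while holding a `synchronized` monitor also pins the carrier thread on JDK 21. A minimal sketch of the usual workaround, guarding the blocking section with a `ReentrantLock`, could look like this (class and method names are hypothetical, and the 1 ms sleep stands in for real I/O):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

public class LoomFriendlyCounter {

    private final ReentrantLock lock = new ReentrantLock();
    private int value;

    // A `synchronized` version of this method could pin the carrier while sleeping.
    void incrementSlowly() throws InterruptedException {
        lock.lock();
        try {
            Thread.sleep(1); // stands in for blocking I/O done while holding the lock
            value++;
        } finally {
            lock.unlock();
        }
    }

    int value() {
        lock.lock();
        try {
            return value;
        } finally {
            lock.unlock();
        }
    }

    // Increments one shared counter from `threadCount` virtual threads.
    static int demo(int threadCount) throws InterruptedException {
        LoomFriendlyCounter counter = new LoomFriendlyCounter();
        List<Thread> threads = new ArrayList<>(threadCount);
        for (int i = 0; i < threadCount; i++) {
            threads.add(Thread.ofVirtual().start(() -> {
                try {
                    counter.incrementSlowly();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (Thread thread : threads) {
            thread.join();
        }
        return counter.value();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("counter = " + demo(100));
    }
}
```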
29. Avoid long synchronized blocks/methods
• Replace with ReentrantLock
• Check your dependencies
Make your code Loom-friendly
Thread pool not needed with VT
• Use Semaphore to limit access to resources
Use Thread Locals with caution
• Virtual Threads support Thread Locals but amplify an existing issue
• Plenty of useless Thread Locals, and Thread Locals holding critical data: transactional context, user id, permissions …
• Thread Locals should be removed!
• To be replaced with Scoped Values in the mid-term: immutable, lifecycle limited to a task
DZone: Pitfalls to avoid when switching to virtual threads
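The Semaphore advice above can be sketched as follows: every task gets its own Virtual Thread, and a `Semaphore` (instead of a bounded pool) caps how many of them use a scarce resource at once. The resource, the permit count and the 5 ms sleep are assumptions for the example.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

public class SemaphoreLimiter {

    // Runs `tasks` virtual threads but lets at most `permits` use the resource at once.
    // Returns the highest number of concurrent users observed.
    static int maxConcurrent(int tasks, int permits) {
        Semaphore semaphore = new Semaphore(permits);
        AtomicInteger inUse = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < tasks; i++) {
                executor.submit(() -> {
                    semaphore.acquire();
                    try {
                        int now = inUse.incrementAndGet();
                        peak.accumulateAndGet(now, Math::max);
                        Thread.sleep(5); // stands in for a call to the limited resource
                    } finally {
                        inUse.decrementAndGet();
                        semaphore.release();
                    }
                    return null;
                });
            }
        } // close() waits for all submitted tasks to finish
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrency: " + maxConcurrent(200, 10));
    }
}
```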
[Chart: Max. Requests Per Second (Virtual Threads), PostgreSQL Driver 42.5.4 (non-loom friendly) vs PostgreSQL Driver 42.6.0 (loom friendly), axis from 0 to 1600]
34. Observability & monitoring
Virtual Threads Scheduling/ForkJoinPool
Java jdk.virtualThreadScheduler options:
• parallelism: number of Carrier Threads, defaults to the number of available processors
• maxPoolSize: the number of Carrier Threads may be temporarily expanded beyond parallelism, at most 256
• minRunnable: minimum number of Carrier Threads that are not blocked, half the pool size
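These options are passed as JVM system properties. The small sketch below only prints what was set on the command line; the class name is hypothetical, and the fallback texts simply repeat the defaults quoted above.

```java
public class SchedulerSettings {

    // Reads a jdk.virtualThreadScheduler.* system property, or returns the fallback text.
    static String setting(String name, String fallback) {
        return System.getProperty("jdk.virtualThreadScheduler." + name, fallback);
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        System.out.println("parallelism = " + setting("parallelism", "unset (default: " + cpus + ")"));
        System.out.println("maxPoolSize = " + setting("maxPoolSize", "unset (default: 256)"));
        System.out.println("minRunnable = " + setting("minRunnable", "unset"));
    }
}
```

It can be launched, for instance, with `java -Djdk.virtualThreadScheduler.parallelism=4 -Djdk.virtualThreadScheduler.maxPoolSize=16 SchedulerSettings.java`.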
35. Payments to grow your world
Back to demo!
Round Two
Helidon, Quarkus, Spring
36. • Inspired by Oracle Helidon demo
• Code on GitHub
• Comparing imperative programming with Virtual & Platform Threads, reactive programming
• Measuring throughput, memory footprint, quick vs slow request collision
• Same benchmark mixing REST API, dependency injection and JPA persistence
• REST/HTTP injector sending 10 threads * 6000 requests in parallel
• 10 worker threads configured on the server side
Methodology
37. Helidon 3
Reactive Web Server (Netty)
Pluggable ExecutorService
Platform or Virtual Threads
Support reactive programming
Helidon 4
Aka NIMA
Opinionated approach
Natively based on Virtual Threads
Built in tight collaboration with the Java team
Web Server mixing Platform & Virtual Threads
Business Logic run on Virtual Threads
Open-Source from Oracle
Java libs for Cloud Native, Microservices
Focus on simplicity and performance
2 flavors: SE and MP
Native Image support
40. • Basic adoption of VirtualThreads with Helidon 3 Executors.newVirtualThreadPerTaskExecutor()
• Not optimal: more memory, less performance!
• Nima refactored to be natively based on Virtual Threads
• Nima Performance and memory footprint comparable to Reactive Programming
• With standard blocking Jakarta EE & MicroProfile code!
Helidon & Virtual Threads
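For reference, this is what `Executors.newVirtualThreadPerTaskExecutor()` does when used on its own in plain Java SE, outside Helidon. The class name and task are illustrative only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerTaskExecutorSketch {

    // Submits `n` tasks and counts how many of them ran on a virtual thread.
    static int countVirtual(int n) throws Exception {
        Callable<Boolean> task = () -> Thread.currentThread().isVirtual();
        List<Future<Boolean>> results = new ArrayList<>();
        try (ExecutorService executor = Executors.newVirtualThreadPerTaskExecutor()) {
            for (int i = 0; i < n; i++) {
                results.add(executor.submit(task)); // one new virtual thread per task
            }
        } // close() waits for the tasks to complete
        int virtual = 0;
        for (Future<Boolean> result : results) {
            if (result.get()) {
                virtual++;
            }
        }
        return virtual;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(countVirtual(1000) + " tasks ran on virtual threads");
    }
}
```

There is no pooling here: the executor is only an adapter that lets existing `ExecutorService`-based code create Virtual Threads.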
41. Quarkus, aka the "pragmatic"
Open-Source from Red Hat
Java libs for Cloud Native, Microservices
« Supersonic Subatomic Java »
Native Image support
IO Threads
Aka Event Loop
Can’t block
Worker Threads
Can block
Platform
Threads
Default
Virtual
Threads
@RunOnVirtualThread
42. • Reactive programming with Mutiny delivers the best performance
• Quarkus not (yet) natively based on Virtual Threads, based on IO & Worker Threads
• Optionally, Worker threads can be Virtual
• Performance & memory footprint comparable with Platform & Virtual Threads
• So … Is it worth using Virtual Threads with Quarkus?
• Let’s see what happens when quick and slow endpoints are called concurrently
Quarkus and Virtual Threads
44. Code!
Platform Threads:

@Path("/uuid")
public class UUIDResource {

    @Inject
    UUIDRepository uuid;

    @Inject @RestClient
    ThirdPartySleepServiceClient client;

    // Default: runs on a platform worker thread
    @GET
    @Path("/platform")
    @Produces(MediaType.APPLICATION_JSON)
    public UUID uuidPlatform() {
        client.timed(300);
        return uuid.generateRandom(0.025);
    }
}

Virtual Threads:

@Path("/uuid")
public class UUIDResource {

    @Inject
    UUIDRepository uuid;

    @Inject @RestClient
    ThirdPartySleepServiceClient client;

    // Same blocking code, run on a virtual thread
    @GET
    @Path("/virtual")
    @Produces(MediaType.APPLICATION_JSON)
    @RunOnVirtualThread
    public UUID uuidVirtual() {
        client.timed(300);
        return uuid.generateRandom(0.025);
    }
}
45. Code!
Reactive Model
@Path("/uuid")
public class UUIDResource {

    @Inject @RestClient
    ThirdPartySleepServiceClient client;

    @Inject
    UUIDRepository repository;

    @GET
    @Produces(MediaType.APPLICATION_JSON)
    public Uni<UUID> uuid() {
        return client.timed(300)
                .onItem().transformToUni(x -> repository.generateRandomUUID(0.025));
    }
}
46. Requests Per Second and Average Response Time per Thread Model
[Chart: Max RPS (requests per second) and Average Response Time (ms) per thread model, including Thread per Request]
Max RPS: 141, 283, 564, 843, 1121, 1390, 1493, 1388, 1334
Average Response Time (ms): 3520, 1760, 885, 592, 446, 360, 334, 360, 374
47. Process Max Memory Usage (Resident Set Size) per Thread Model
Thread model                   Memory Usage (MB)
Thread per Request (500 PT)    645
Reactive                       325
Virtual                        497
Virtual (native)               423
49. Requests Per Second and Average Response Time per Thread Model
Thread model        Max RPS    Average Response Time (ms)
50 PT               141        3500
100 PT              280        1760
200 PT              562        885
300 PT              837        596
400 PT              1109       450
500 PT              1369       365
Reactive            1390       359
Virtual Threads     1338       373
(PT: Thread per Request on Platform Threads)
50. Process Max Memory Usage (Resident Set Size) per Thread Model
Thread model                Memory Usage (MB)
Thread per Request (500)    727
Reactive                    609
Virtual                     804
51. 1. Understand the principles of Virtual Threads
2. Check how your favorite frameworks and libs have adopted them
3. Do some realistic benchmarks before going to production, monitor in production
4. Bench and monitor in JVM and in GraalVM/Mandrel Native Image mode
5. Things are improving fast: re-bench regularly!
6. Reminder: no benefit for CPU-hungry processing, …
Getting ready for Virtual Threads
55. • Abstracts away the use of Virtual Threads
• Coordinates tasks running on Virtual Threads within the context of a “scope”
• Built-in coordination strategy: any, all
• Extensible coordination strategy
• No back pressure
• Still in incubator (Java 20); preview in Java 21
Structured Concurrency
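Because the Structured Concurrency API is not final and has changed between releases, the sketch below does not use it. It only approximates the "all" and "any" coordination strategies with stable APIs (`invokeAll` and `invokeAny` on a virtual-thread-per-task executor); the task names and delays are made up for the example.

```java
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class AnyAllSketch {

    // Hypothetical subtask: returns its name after `millis` milliseconds.
    static Callable<String> slowTask(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name;
        };
    }

    // "all": wait for every subtask, then combine the results in submission order.
    static String all(List<Callable<String>> tasks) throws Exception {
        try (ExecutorService scope = Executors.newVirtualThreadPerTaskExecutor()) {
            StringBuilder joined = new StringBuilder();
            for (Future<String> future : scope.invokeAll(tasks)) {
                joined.append(future.get());
            }
            return joined.toString();
        }
    }

    // "any": return the first subtask that succeeds; the remaining ones are cancelled.
    static String any(List<Callable<String>> tasks) throws Exception {
        try (ExecutorService scope = Executors.newVirtualThreadPerTaskExecutor()) {
            return scope.invokeAny(tasks);
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(all(List.of(slowTask("a", 30), slowTask("b", 10))));
        System.out.println(any(List.of(slowTask("slow", 5000), slowTask("fast", 10))));
    }
}
```

In both methods the try-with-resources block plays the role of the scope: no subtask outlives it.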
56. • Thread locals not designed to be shared by “millions” of threads
• Unclear lifecycle: not always cleaned up
• Uncontrolled mutability: can be changed at any time
• Inheritance: risk of high memory footprint
• Bound to a callable (not a Thread)
Scoped Values
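The lifecycle and mutability issues listed above are easy to reproduce with stable APIs. This sketch shows only the Thread Local problem, not the Scoped Values API itself (still in preview in Java 21); `CURRENT_USER` and the method names are hypothetical.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class ThreadLocalLeak {

    // Hypothetical per-request context, as often kept in a ThreadLocal.
    static final ThreadLocal<String> CURRENT_USER = new ThreadLocal<>();

    // Runs two "requests" on the same pooled thread; returns what the second one sees.
    static String secondRequestSees(boolean cleanUp) throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            Runnable firstRequest = () -> {
                CURRENT_USER.set("alice"); // mutable: any code can change it at any time
                if (cleanUp) {
                    CURRENT_USER.remove(); // easy to forget
                }
            };
            Callable<String> secondRequest = () -> String.valueOf(CURRENT_USER.get());
            pool.submit(firstRequest).get();
            return pool.submit(secondRequest).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("without cleanup: " + secondRequestSees(false));
        System.out.println("with cleanup:    " + secondRequestSees(true));
    }
}
```

Without the `remove()` call, the second request observes the first request's user, which is the kind of leak a value bound to a limited scope is meant to rule out.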
Editor's Notes
JF
Slide: replace with big figures (30k+ and 60k+) and a more visual listing (icons & co)
CPU-bound -> virtual threads are less efficient
TODO:
Example with pinned carrier thread ?
Jcmd to dump virtual threads ?