Quasar Fibers provide a fiber-based concurrency framework that can be used to refactor a synchronous Java application to an asynchronous model without requiring a complete code rewrite. The document describes an auction platform that was refactored from using threads to using Quasar Fibers. This allowed the platform to scale to handle more concurrent auctions and demand partners without degradation.
The key steps taken were:
1. Refactoring core components like queues and timers to use Quasar equivalents instead of standard Java utilities
2. Wrapping blocking third party libraries like Apache HttpClient using Comsat to make them fiber-aware
3. Adding semaphores to limit concurrent fiber execution and avoid overloading the server
4. Our Auction Platform
A classic synchronous web application
Written with Java 8 & Spring 4
Runs on top of Tomcat 8 with NginX serving as a web frontend
Each request gets its own dedicated thread
Service designation
A programmatic VAST auction service
Supports multiple auction protocols - Direct VAST, OpenRTB and more
Built on top of a proprietary modular auction engine
We are using 8 core, 15 GB RAM instances
NginX has 1 allocated core, JVM has the remaining 7
5. Diving Deeper Into the Application Flow
Each request is parsed and validated
Relevant VAST demand deals are targeted
Data is transformed into auction DTOs and passed to the auction engine
The auction engine allocates a dedicated thread for each demand partner
Demand processing logic is executed (consisting mostly of HTTP/S calls)
Processing results are collected, and eligible responses are augmented
Elected results are formatted into a VAST HTTP response
Auction level analytics events are also collected and enqueued
Analytics events are batched and dispatched asynchronously
A thread is allocated for processing demand level analytics events
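The thread-per-demand-partner step of the flow above can be sketched roughly as follows. This is a minimal illustration using only standard JDK utilities, not the platform's actual engine: the class name, the `demandCall` helper, the partner names and the latencies are all hypothetical, and a sleep stands in for the blocking HTTP/S call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class ThreadPerPartnerAuction {

    // Hypothetical stand-in for a demand partner call; the sleep simulates blocking on an HTTP/S response.
    public static Callable<String> demandCall(String partner, long latencyMs) {
        return () -> {
            Thread.sleep(latencyMs);
            return partner + ":bid";
        };
    }

    // Runs one auction: one task (and therefore one pooled thread) per demand partner, bounded by a timeout.
    public static List<String> runAuction(ExecutorService pool, List<Callable<String>> calls, long timeoutMs)
            throws InterruptedException {
        List<String> responses = new ArrayList<>();
        // invokeAll blocks until every task finishes or the timeout expires; unfinished tasks are cancelled.
        for (Future<String> future : pool.invokeAll(calls, timeoutMs, TimeUnit.MILLISECONDS)) {
            if (future.isCancelled()) {
                continue; // this partner did not answer within the auction timeout
            }
            try {
                responses.add(future.get());
            } catch (ExecutionException e) {
                // a failed partner call is simply skipped in this sketch
            }
        }
        return responses;
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newCachedThreadPool();
        List<Callable<String>> calls = new ArrayList<>();
        calls.add(demandCall("partnerA", 20));
        calls.add(demandCall("partnerB", 50));
        calls.add(demandCall("slowPartner", 5_000)); // exceeds the auction timeout below
        System.out.println(runAuction(pool, calls, 300));
        pool.shutdownNow();
    }
}
```

Every in-flight partner call pins a thread for its whole duration, which is the cost the following slides focus on.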
8. It’s All About the Quantity
A synchronous model works as long as concurrency doesn’t out-scale the hardware
We started with 3-4 demand deals/partners per auction
Business needs dictated that this number grow to 30-40
This means 60-80 threads allocated per auction
Request processing duration greatly affects the achievable concurrency
We started with auction timeouts of 200-300 ms
Business needs dictated that this number grow to 2000 ms
In synchronous web applications, scaling the inbound connector is trickier
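A rough back-of-the-envelope sketch of why both numbers matter. It assumes Little's law (requests in flight = arrival rate x time in system) and two threads per demand partner, which is inferred from the 30-40 partners / 60-80 threads figures above. The arrival rate of 100 auctions/second is a hypothetical value chosen only for illustration, and the durations assume every auction runs to its timeout.

```java
public class ThreadDemandEstimate {

    // Little's law: average auctions in flight = arrival rate * time each auction stays in the system.
    public static long inFlightAuctions(long auctionsPerSecond, long auctionDurationMs) {
        return auctionsPerSecond * auctionDurationMs / 1000;
    }

    // Assumption: one demand processing thread plus one demand analytics thread per partner.
    public static long threadsPerAuction(int demandPartners) {
        return 2L * demandPartners;
    }

    public static void main(String[] args) {
        long rate = 100; // hypothetical auctions per second

        long before = inFlightAuctions(rate, 300) * threadsPerAuction(4);   // 300 ms timeout, 4 partners
        long after = inFlightAuctions(rate, 2000) * threadsPerAuction(40);  // 2000 ms timeout, 40 partners

        System.out.println("demand threads before: " + before); // 30 auctions * 8 threads = 240
        System.out.println("demand threads after:  " + after);  // 200 auctions * 80 threads = 16000
    }
}
```

Under these assumptions the same traffic needs roughly two orders of magnitude more threads, before counting the request threads themselves.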
10. Turning the Knobs Up to 11
We can always tweak whatever pool we can lay our hands on
Java ThreadPoolExecutor instances
Tomcat connectors threads and processor caches
Apache HttpClient connection manager
Demand thread processing time is mostly spent blocking on HTTP calls
Threads are waiting on native-method-level socket reads
Though threads are RUNNABLE, they should be consuming next to no CPU
JVM CPU time should be mostly allocated to threads that actually need it
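As an illustration of the kind of knob-turning meant here, a tuned `ThreadPoolExecutor` might look like the sketch below. The pool sizes, queue capacity and keep-alive time are hypothetical values, not the platform's real settings.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class TunedDemandPool {

    // Builds a bounded pool for demand processing; all sizing values are supplied by the caller.
    public static ThreadPoolExecutor newDemandPool(int coreThreads, int maxThreads, int queueCapacity) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                coreThreads,
                maxThreads,
                30, TimeUnit.SECONDS,                      // idle threads are reclaimed after 30 seconds
                new ArrayBlockingQueue<>(queueCapacity),   // extra threads are created only once this queue is full
                new ThreadPoolExecutor.AbortPolicy());     // reject new tasks when both queue and pool are saturated
        pool.allowCoreThreadTimeOut(true);                 // let core threads time out too when traffic drops
        return pool;
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newDemandPool(50, 400, 1000);
        System.out.println("core=" + pool.getCorePoolSize() + ", max=" + pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```

Raising these limits only moves the ceiling; each blocked HTTP call still occupies a whole thread.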
12. 100 is a Company, 10000 is a Crowd
Tweaking various pools seems like a nice fix, but it doesn’t come close to a solution
Remember: each auction allocates 60-80 threads
You want to handle (much) more than 100 auctions concurrently per server
Well, it adds up to a whole lot of threads
Having the JVM juggle so many threads in such short durations is a bad idea
CPU core count is very limited
Context switches are costly operations
RUNNABLE threads handling native methods still compete with other threads for CPU resource allocation
13. Feed Me Seymour
In any given duration, each thread gets a very small amount of CPU resources
So we wound up with a JVM packed with so many threads
So basically beyond some point, the application completely breaks
This also means that logical and functional tasks (like time-bombs) don’t execute on time
That’s what you’d generally call “Thread Starvation”
14. Back to the Beginning
Why did we build our code using a synchronous/blocking thread model to begin with?
Readability is crucial in a massive & modular codebase
Needs and requirements change
It’s much easier to implement advanced synchronization and parallelization patterns
16. Vert.x
A complete end-to-end solution for writing asynchronous applications
Offers a complete asynchronous stack (HTTP Client and Server, Event Bus )
Built using standard JVM utilities and OS interfaces
However
Will require a complete code rewrite of the auction engine
Easier to fall into messy code and “callback hell”
Harder to implement advanced synchronization features/request contexts
Provides built-in integrations with asynchronous frameworks/SDKs
Not all frameworks/SDKs are asynchronous
Supports both direct callback & verticle (actor-like) coding paradigms
17. Akka
An actor based concurrency framework/tool-kit
Aims to be a fully asynchronous stack
Supports only the actor programming paradigm
And yet
It’d still require a complete code rewrite of the auction engine
It’d still be hard/cumbersome to implement advanced synchronization features
Frameworks/SDKs that are not asynchronous still pose a potential bottleneck
Supports N/N or N/1 Request/Actor models, allowing for easier request context management
18. Quasar
A fiber based concurrency framework
Fibers are ‘lightweight threads’. They’re used and (mostly) behave like standard Java threads
Concurrency is achieved by performing stack manipulation and bytecode instrumentation under the hood
But what about 3rd party SDKs/Frameworks?
That’s what you have Comsat for...
Most importantly, it means a complete code rewrite is not required
19. Comsat
A fiber-based complementary framework for web development
Maintained by the same company as Quasar
Provides drop-in replacements that ‘imbue’ the APIs of common 3rd party libraries like JDBC, jOOQ, Jedis (still in development) and Apache HttpClient with fiber awareness
On the downside, the development is not as active as Quasar
21. Change for the Better
Standard BlockingQueue has been replaced with Quasar Channel
Standard CountDownLatch has been replaced with Quasar drop-in replacement
Demand processing threads (Callable) are now fiber based (SuspendableCallable)
Demand analytics collection threads (Runnable) are now fiber based (SuspendableRunnable)
Demand processing interrupt timers: the ScheduledExecutorService-based implementation has been replaced with Quasar drop-in replacement
Apache HttpClient has been replaced with Comsat’s FiberHttpClient (fiber-blocking wrapper around Apache HttpAsyncClient)
23. Counting the Hours
Refactoring core components to work with fibers instead of threads: ~5 hours
Code structure/flow did not change one bit
Debugging the first (instrumentation) issue: ~5 hours
Found one bug (fixed within an hour by Ron)
You don’t really know what to expect
Additional refactoring and code clean-up: ~10 hours
Some of our modules need to support fiber/standard operation modes
24. The Operation Succeeded, the Patient is Still not Well
At first glance, we were able to scale to 500 auctions/second on the same hardware
Some frameworks are not meant to operate in extremely high concurrency
Log4j2 (even when set to async mode, tweaked and optimized) should be used with great care
Concurrent fiber execution is basically unbounded
Underlying threads schedule fibers using the ForkJoinPool work-stealing mechanism
And sometimes, it’s not even up to your application...
25. A Can of Worms
Apache HttpAsyncClient utilizes a reactor pattern to handle HTTP request/response related events using a fixed number of “IO threads”
Targeted demand access is mostly done over HTTPS
Opening SSL connections in the JVM is a really expensive and inefficient operation
To top it all off, demand partners close connections unexpectedly, so connection pools and the SSL client session cache cannot be utilized
Handling thousands of SSL connection initialization operations/second creates backlogs; if the backlog becomes too big, the application completely breaks
Proxying SSL traffic via NginX also proved to be inefficient
26. A Happy Ending
Fiber concurrent execution is virtually unbounded
It’s very easy to overload the server
A single server’s throughput is limited due to reasons beyond our control
Now we just need to find the sweet spot
How many fibers can co-execute without degrading the app’s functionality?
Acquiring Semaphore permits prior to entering the auction is the answer
Number of fibers spawned per auction matches the number of targeted deals
Permits are acquired per auction based on that number
When the Semaphore runs out of permits, auctions report a specific error
We can later auto-scale our infrastructure based on the error above.
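The permit scheme described above could be sketched along these lines. The source does not say which Semaphore implementation was used, so this sketch assumes the standard `java.util.concurrent.Semaphore` with a non-blocking `tryAcquire`; the class and method names and the capacity of 100 are hypothetical.

```java
import java.util.concurrent.Semaphore;

public class AuctionAdmission {

    private final Semaphore permits;

    // maxConcurrentFibers is the "sweet spot": how many fibers may co-execute on one server.
    public AuctionAdmission(int maxConcurrentFibers) {
        this.permits = new Semaphore(maxConcurrentFibers);
    }

    // Reserves one permit per targeted deal. Returns false immediately (no blocking)
    // when capacity is exhausted, so the caller can report the "out of permits" error.
    public boolean tryEnter(int targetedDeals) {
        return permits.tryAcquire(targetedDeals);
    }

    // Must be called once the auction completes, with the same number of deals.
    public void leave(int targetedDeals) {
        permits.release(targetedDeals);
    }

    public int available() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        AuctionAdmission admission = new AuctionAdmission(100);
        System.out.println(admission.tryEnter(40)); // true, 60 permits left
        System.out.println(admission.tryEnter(40)); // true, 20 permits left
        System.out.println(admission.tryEnter(40)); // false, only 20 permits left
        admission.leave(40);                        // first auction finished
        System.out.println(admission.tryEnter(40)); // true again
    }
}
```

Counting the rejected auctions then gives a direct signal for the auto-scaling mentioned above.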
28. Synchronized/Blocking Code Inside Fibers
A single thread handles many fibers executing concurrently.
A fiber can run on several different threads.
Code running inside the context of the fiber should never be “synchronized”
A thread-level monitor/lock is acquired for “synchronized” blocks/methods
So thread-level locks make no sense
The problem: Synchronized/Blocking code is everywhere
Use of synchronous/blocking 3rd party libraries should be avoided.
Example: Formatting Throwable stack traces should be avoided.
Synchronization facilities are available through Quasar supplements
FiberAsync allows us to transform async APIs to blocking fiber-aware APIs
Use Comsat’s drop-in replacements where available
29. Suspendables
Fiber concurrency is achieved by manipulating thread stacks in real-time
The Quasar framework uses SuspendExecution exceptions to initiate transfer of control (fiber “context switch”)
These exceptions should be propagated throughout the entire method stack running on top of the fiber
@Suspendable annotation is available for scenarios where an exception cannot be thrown (e.g. overriding predefined interfaces)
The catch: You need to remember to do it
When accessing Comsat libraries you need to propagate SuspendExecution
Set co.paralleluniverse.fibers.verifyInstrumentation to true (only!) while testing
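35. Dispatching (Standalone)
Fibers
import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.fibers.SuspendExecution;

public class MyFiber extends Fiber<Void> {
    @Override
    public Void run() throws SuspendExecution, InterruptedException {
        // your code here (may call other suspendable methods)
        return null;
    }

    public static void main(String[] args) {
        // start() schedules the fiber on the default fiber scheduler
        new MyFiber().start();
    }
}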
Threads
public class MyThread extends Thread {
    @Override
    public void run() {
        // your code here
    }

    public static void main(String[] args) {
        new MyThread().start();
    }
}
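36. Dispatching (Runnables/Dedicated Pool)
Fibers
import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.fibers.FiberForkJoinScheduler;
import co.paralleluniverse.fibers.FiberScheduler;
import co.paralleluniverse.fibers.SuspendExecution;
import co.paralleluniverse.strands.SuspendableRunnable;

public class MySuspendableRunnable implements SuspendableRunnable {
    // dedicated fiber scheduler with one worker thread per available core
    private final static FiberScheduler FIBER_SCHEDULER =
            new FiberForkJoinScheduler("my-fibers", Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) {
        new Fiber<>(FIBER_SCHEDULER, new MySuspendableRunnable()).start();
    }

    @Override
    public void run() throws SuspendExecution, InterruptedException {
        // your code here
    }
}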
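Threads
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class MyRunnable implements Runnable {
    // dedicated thread pool with one thread per available core
    private final static ExecutorService THREAD_POOL =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) {
        THREAD_POOL.execute(new MyRunnable());
        THREAD_POOL.shutdown(); // lets the JVM exit once the submitted task completes
    }

    @Override
    public void run() {
        // your code here
    }
}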
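37. Dispatching (Callables/Response Extraction)
Fibers
import co.paralleluniverse.fibers.Fiber;
import co.paralleluniverse.fibers.FiberForkJoinScheduler;
import co.paralleluniverse.fibers.FiberScheduler;
import co.paralleluniverse.fibers.SuspendExecution;
import co.paralleluniverse.strands.SuspendableCallable;
import java.util.Random;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
// StrandLocalRandom is kept as named on the original slide; its import is omitted here

public class MySuspendableCallable implements SuspendableCallable<Long> {
    private final static FiberScheduler FIBER_SCHEDULER =
            new FiberForkJoinScheduler("my-fibers", Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        // a started Fiber is itself a Future, so the result is extracted with get()
        Future<Long> fiberFuture = new Fiber<>(FIBER_SCHEDULER, new MySuspendableCallable()).start();
        System.out.println(fiberFuture.get());
    }

    @Override
    public Long run() throws SuspendExecution, InterruptedException {
        Random random = StrandLocalRandom.current();
        return random.nextLong();
    }
}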
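Threads
import java.util.Random;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadLocalRandom;

public class MyCallable implements Callable<Long> {
    private final static ExecutorService THREAD_POOL =
            Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    public static void main(String[] args) throws InterruptedException, ExecutionException {
        Future<Long> future = THREAD_POOL.submit(new MyCallable());
        System.out.println(future.get());
        THREAD_POOL.shutdown(); // lets the JVM exit after the result is printed
    }

    @Override
    public Long call() {
        Random random = ThreadLocalRandom.current();
        return random.nextLong();
    }
}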