Problem Statement
● Many applications run on computers with multiple CPU cores, and the number of cores is likely to increase in the future.
● To access the full power of the computer, applications need to make good use of multiple threads (vertical scalability).
● The traditional approach to employing multiple threads, threads and locks, is error-prone and extremely difficult to maintain. It also often does not run much faster than a single thread.
The Heart of the Problem: Shared Mutable State
● Traditionally, mutable (changeable) state is shared by multiple threads, using locks to prevent corruption when partially updated state is accessed by another thread.
● Threads block when attempting to access locked state, which increases context switching. (When a thread is blocked, the underlying hardware thread tries to find a thread to run that is not blocked.)
● As the complexity of an application increases, so does the chance that unguarded state will be shared, which gives rise to (often infrequent) race conditions.
● Complex applications usually require a locking hierarchy to prevent deadlocks, and chances are good that the locking hierarchy will be violated as the application code ages.
Alternatives to Locks
● Concurrent and Atomic Data Structures
● Functional Programming with Immutable State
● Actors
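As a small illustration of the first alternative, the sketch below has several threads update one counter through java.util.concurrent.atomic.AtomicLong, with no lock. The class name AtomicCounterDemo and its run method are hypothetical names chosen for this example; they do not come from the slides.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical demo: several threads increment one counter without taking a lock. */
final class AtomicCounterDemo {

    /** Starts the given number of threads, each incrementing the shared counter. */
    static long run(int threads, int incrementsPerThread) {
        AtomicLong counter = new AtomicLong();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    counter.incrementAndGet(); // atomic read-modify-write, no lock held
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait for every worker before reading the result
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(run(4, 100_000)); // prints 400000
    }
}
```

An atomic variable covers only a single value. Once several fields must change together, this approach alone is no longer enough, which is where the other two alternatives come in.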
What is an Actor?
● An actor is like an object, except that it sends messages to other actors rather than calling their methods.
● When an actor has messages to process, a thread is assigned to that actor. When there are no more messages to process, the thread is released.
● Unlike other objects, an actor's state is not shared. The state can only be accessed by a single thread at any given time, so there is no need for locks.
● In many ways, actors look like the ideal alternative to locks.
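The points above can be sketched in a few lines of Java. This is a generic illustration, not JActor code: the class CounterActor, its pending counter, and the demo method are all hypothetical. A thread is borrowed from a pool only while messages are waiting, and the plain field total is touched by at most one thread at a time.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical minimal actor: messages are queued and drained by one thread at a time. */
final class CounterActor {
    private final ConcurrentLinkedQueue<Integer> inbox = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger(); // messages not yet processed
    private final ExecutorService pool;
    private long total; // actor state: a plain field, only touched inside drain()

    CounterActor(ExecutorService pool) {
        this.pool = pool;
    }

    /** Sending only enqueues; the sender that finds the actor idle assigns it a thread. */
    void send(int amount) {
        inbox.add(amount);
        if (pending.getAndIncrement() == 0) {
            pool.execute(this::drain);
        }
    }

    /** Runs on a pool thread; returns (releasing the thread) once nothing is pending. */
    private void drain() {
        do {
            total += inbox.poll(); // never null: every pending count has a queued message
        } while (pending.decrementAndGet() > 0);
    }

    long total() {
        return total;
    }

    /** Several sender threads each send messagesPerSender messages of value 1. */
    static long demo(int senders, int messagesPerSender) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        CounterActor actor = new CounterActor(pool);
        Thread[] threads = new Thread[senders];
        for (int s = 0; s < senders; s++) {
            threads[s] = new Thread(() -> {
                for (int i = 0; i < messagesPerSender; i++) {
                    actor.send(1);
                }
            });
            threads[s].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            pool.shutdown(); // already submitted drain tasks still run to completion
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return actor.total();
    }
}
```

Calling CounterActor.demo(4, 1000) returns 4000: no update is lost even though total is neither volatile nor guarded by a lock, because ownership of the actor passes from one thread to the next through the atomic pending counter.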
Actors need to be Large
● When passing messages, the throughput is not especially high: about a million messages per second per hardware thread on a good actor implementation. So you need to avoid having actors that pass a large volume of messages, and message passing needs to be kept out of an application's inner loops. You end up with relatively large actors that do a fair amount of work for each message received.
● Applications then tend not to be very modular, and the actors often need to process a number of different message types and to process them differently depending on the actor's state.
Large Actors tend to get Complicated
● Large actors often need to address a variety of concerns and tend to turn into bowls of spaghetti code. Using a state machine does help, but even then it is not always easy to maintain the state transition model as the application ages.
● In practice, monitors are often used to check that an actor continues to function and to restart it when it does not. This of course makes it more difficult to ensure that messages are not lost or processed more than once.
Flow Control is left to the Application Developer
● Messages are mostly one-way, so there is no inherent flow control and it is easy for actors to flood the system with messages.
● Programmers, unless they have some experience with communication protocols, do not normally concern themselves with flow control. Method calls (an object's equivalent of a message) provide implicit flow control, as the calling object resumes only on completion of the call.
● Indeed, the developer may not even realize the need to implement flow control until load testing is performed.
● Adding flow control to the application logic will, of course, further complicate the code of the actors.
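One common way to add flow control, shown here only as an assumption and not as something the slides prescribe, is to give the receiver a bounded mailbox so that a fast sender blocks once the receiver falls behind. The class BoundedMailboxDemo and its constants are hypothetical.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/** Hypothetical sketch: a bounded mailbox keeps a fast sender from flooding a receiver. */
final class BoundedMailboxDemo {
    static final int CAPACITY = 8;   // at most this many messages may be waiting
    static final int MESSAGES = 200; // messages sent in one run

    /** Returns {messages processed, largest mailbox depth the receiver observed}. */
    static int[] run() {
        BlockingQueue<Integer> mailbox = new ArrayBlockingQueue<>(CAPACITY);
        int[] result = new int[2];
        Thread receiver = new Thread(() -> {
            try {
                for (int i = 0; i < MESSAGES; i++) {
                    result[1] = Math.max(result[1], mailbox.size());
                    mailbox.take(); // waits while the mailbox is empty
                    result[0]++;
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        receiver.start();
        try {
            for (int i = 0; i < MESSAGES; i++) {
                mailbox.put(i); // blocks while the mailbox is full: this is the flow control
            }
            receiver.join(); // also makes the receiver's writes to result visible here
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result;
    }

    public static void main(String[] args) {
        int[] r = run();
        System.out.println("processed=" + r[0] + " maxDepth=" + r[1]);
    }
}
```

The cost is that the sender now blocks, which is exactly what a pure one-way message model tries to avoid. That trade-off is why flow control tends to end up in the application logic.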
JActor: A High-Performance Actor Framework
● Messages are sent at up to 150 million per second, fast enough that actors can be used ubiquitously.
● The actors are lightweight: a billion actors a second can be created on a single thread.
● Large tables can be deserialized, updated and reserialized at a rate of 400 nanoseconds per unchanged entry, virtually independent of the size and complexity of those entries.
● A transaction pipeline is provided that durably (with fsync) logs and processes up to 900,000 transactions per second.
● (Tests were run on an i7-3770 CPU @ 3.40GHz with a Vertex 3 SATA III SSD and 1600 MHz DDR3 RAM.)
Mailboxes
● Actors can share a common queue of received messages. These message queues are called mailboxes.
● Actors that share a mailbox always operate on the same thread, allowing them to call each other's methods directly.
● Messages sent to an actor with the same mailbox are processed immediately without being enqueued.
Commandeering
● When a message is sent to an actor with an empty mailbox, the sending thread commandeers that mailbox and immediately processes the request.
● Commandeering prevents another thread from being assigned to the mailbox, so the actor's state will still only be accessible from a single thread at any given time.
● With commandeering we avoid having to enqueue the message and assign a thread to dequeue and process it, which gives JActor a significant boost in performance.
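The idea can be sketched as follows. This is not JActor's implementation: the class CommandeerableMailbox, its pending counter, and the demo method are hypothetical, and messages are modeled as plain Runnable objects. An idle mailbox is claimed with a single compare-and-set and the message runs on the sender's own thread; a busy mailbox falls back to enqueueing.

```java
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch of commandeering; error handling is omitted for brevity. */
final class CommandeerableMailbox {
    private final ConcurrentLinkedQueue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private final AtomicInteger pending = new AtomicInteger(); // 0 means the mailbox is idle
    private final Executor pool;

    CommandeerableMailbox(Executor pool) {
        this.pool = pool;
    }

    /** Returns true if the message was processed on the calling thread. */
    boolean send(Runnable message) {
        if (pending.compareAndSet(0, 1)) {
            // Idle mailbox: the sender commandeers it. No enqueue, no thread hand-off.
            message.run();
            if (pending.decrementAndGet() > 0) {
                pool.execute(this::drain); // others queued messages while we held the mailbox
            }
            return true;
        }
        // Busy mailbox: enqueue, and assign a thread if it became idle in the meantime.
        queue.add(message);
        if (pending.getAndIncrement() == 0) {
            pool.execute(this::drain);
        }
        return false;
    }

    /** Runs on a pool thread while it owns the mailbox. */
    private void drain() {
        do {
            queue.poll().run(); // never null: every pending count has a queued message
        } while (pending.decrementAndGet() > 0);
    }

    /** Several threads increment a plain, unlocked counter through one mailbox. */
    static long demo(int senders, int messagesPerSender) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CommandeerableMailbox mailbox = new CommandeerableMailbox(pool);
        long[] counter = new long[1]; // protected only by mailbox ownership
        Thread[] threads = new Thread[senders];
        for (int s = 0; s < senders; s++) {
            threads[s] = new Thread(() -> {
                for (int i = 0; i < messagesPerSender; i++) {
                    mailbox.send(() -> counter[0]++);
                }
            });
            threads[s].start();
        }
        try {
            for (Thread t : threads) {
                t.join();
            }
            pool.shutdown();
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counter[0];
    }
}
```

While a sender holds the mailbox, pending is non-zero, so no other thread can commandeer it or start draining it. This preserves the single-thread guarantee that the second bullet describes.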
Asynchronous Mailboxes
● Sometimes we need to prevent commandeering, e.g. when an actor performs blocking I/O or long computations. Otherwise all the actors of the mailbox whose thread did the commandeering will also be blocked.
● Asynchronous mailboxes differ from the default type of mailbox in that they do not allow commandeering.
● Messages sent to an actor with an asynchronous mailbox are therefore always processed on a different thread.
Message Buffering
● Message buffering is a technique borrowed from flow-based programming to increase message throughput at a small cost to latency.
● A message buffer is simply an ArrayList, and all the messages in a message buffer are destined for the same mailbox. The message buffers themselves are held by the sending mailbox.
● When an actor sends a message and the destination actor uses the same mailbox, or the destination actor has an idle mailbox, the message is processed immediately on the current thread. Otherwise the message is placed in a message buffer.
● When a mailbox has no more incoming messages to process, the last thing it does before releasing its assigned task is to send all the message buffers.
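A single-threaded sketch of the buffering step, under stated assumptions: the class BufferingMailbox and its fields are hypothetical, messages are plain strings, and the immediate-processing path for same-mailbox or idle destinations is left out. The point it shows is that a whole ArrayList of messages reaches the destination as a single enqueue.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, single-threaded sketch of message buffering between mailboxes. */
final class BufferingMailbox {
    // Each inbox entry is a whole buffer of messages. A real multi-threaded
    // implementation would need a concurrent queue here.
    final ArrayDeque<List<String>> inbox = new ArrayDeque<>();
    // One outgoing buffer per destination mailbox, held by the sending mailbox.
    private final Map<BufferingMailbox, ArrayList<String>> outgoing = new LinkedHashMap<>();
    final List<String> processed = new ArrayList<>();
    int enqueues; // how many times another mailbox enqueued something here

    /** Places a message in the buffer for a destination that is a different, busy mailbox. */
    void buffer(BufferingMailbox destination, String message) {
        outgoing.computeIfAbsent(destination, d -> new ArrayList<>()).add(message);
    }

    /** Processes all incoming buffers, then sends the outgoing buffers as the last step. */
    void drain() {
        List<String> batch;
        while ((batch = inbox.poll()) != null) {
            processed.addAll(batch);
        }
        flush();
    }

    /** Delivers each buffer to its destination with one enqueue, however many messages it holds. */
    private void flush() {
        for (Map.Entry<BufferingMailbox, ArrayList<String>> entry : outgoing.entrySet()) {
            BufferingMailbox destination = entry.getKey();
            destination.inbox.add(entry.getValue());
            destination.enqueues++;
        }
        outgoing.clear();
    }
}
```

For example, if mailbox a buffers three messages for mailbox b and then runs a.drain(), b receives one inbox entry holding all three. The messages are delivered a little later than they would be if sent individually, which is the latency cost mentioned above.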