Aeron
What, Why, and What Next?
Todd L. Montgomery
@toddlmontgomery
1. Why build another Product?
2. What Features are really needed?
3. How does one Design for this?
4. How did things Evolve along the way?
5. What’s Next?
1. Why build another
product?
Feature Bloat & Complexity
Not Fast Enough
Low & Predictable Latency is key
We are in a new world
UDP, IPC, InfiniBand,

RDMA, PCI-e
Multi-core, Multi-socket,
Cloud...
Aeron is trying a new approach
The Team
Todd Montgomery
Richard Warburton
Martin Thompson
2. What features

are really needed?
Messaging
[Diagram: Publishers and Subscribers joined by a Channel that carries a Stream.]
A library, not a framework, on
which other abstractions and
applications can be built
Composable Design
OSI layer 4 Transport for
message oriented streams
OSI Layer 4 (Transport) Services
1. Connection Oriented Communication
2. Reliability
3. Flow Control
4. Congestion Avoidance/Control
5. Multiplexing
Multi-Everything World!
[Diagram labels: Publishers, Subscribers, Channel, Stream.]
3. How does one

design for this?
Design Principles
1. Garbage free in steady state running
2. Smart Batching in the message path
3. Wait-free algos in the message path
4. Non-blocking IO in the message path
5. No exceptional cases in message path
6. Apply the Single Writer Principle
7. Prefer unshared state
8. Avoid unnecessary data copies
It’s all about 3 things
1. System Architecture
2. Data Structures
3. Protocols of Interaction
Architecture
[Diagram, built up over four slides, showing two Clients and two Media Drivers.
Client: Publisher, Subscriber, Conductor.
Media Driver: Conductor, Sender, Receiver.
Admin and Events are labelled between the Conductors.
Legend: IPC Log Buffer; IPC Ring/Broadcast Buffer; Media (UDP, InfiniBand, PCI-e 3.0); Function/Method Call; Volatile Fields & Queues.]
Is the Media Driver a broker?
Broker Relationship Status: Complicated…
1. Media Driver can be standalone
2. Media Driver can be embedded
3. Core-to-Core memory handoff
4. Isolation of protocol logic
Data Structures
• Maps
• IPC Ring Buffers
• IPC Broadcast Buffers
• ITC Queues
• Dynamic Arrays
• Log Buffers
What Aeron does
Creates a replicated persistent log of messages
[Diagram sequence: a File to which frames are appended at the Tail. Header + Message 1, then Header + Message 2, are added in turn. Message 3 first appears without a Header; its Header is added in the final step.]
Persistent data structures can be
safe to read without locks
One big file that

goes on forever?
No!!!
Page faults, page cache churn,
VM pressure, ...
[Diagram: the log shown as three parts labelled Active, Dirty, and Clean, holding Header + Message frames, with a Tail marker.]
How do we stay “wait-free”?
[Diagram sequence: a File holding Header + Message 1, Header + Message 2, and Header + Message 3. Message X and Message Y are then written at the Tail with no Headers yet. A Header is added, followed by a Padding frame, and Message X is shown in a second File, which then receives a Header.]
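The sequence above can be sketched in Java as follows. This is a minimal illustration, assuming the tail is advanced with one atomic fetch-and-add per message so that concurrent writers never retry or lock. The class name `TermClaim`, the sentinel `TERM_FULL`, and the omission of headers and frame alignment are simplifications made for this sketch, not Aeron's actual code.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical sketch of wait-free space claiming in a fixed-length term. */
final class TermClaim {
    static final int TERM_FULL = -1; // sentinel: caller must move to the next term

    private final int termLength;
    private final AtomicLong tail = new AtomicLong();
    private volatile int paddingOffset = -1; // where padding begins, once written

    TermClaim(int termLength) {
        this.termLength = termLength;
    }

    /**
     * Claims length bytes with a single fetch-and-add, so every caller
     * completes in a bounded number of steps (no compare-and-set retry loop).
     */
    int claim(int length) {
        long offset = tail.getAndAdd(length);
        if (offset + length <= termLength) {
            return (int) offset; // fits: caller owns [offset, offset + length)
        }
        if (offset < termLength) {
            // Exactly one caller straddles the end of the term; it alone
            // records the padding that fills the remainder.
            paddingOffset = (int) offset;
        }
        return TERM_FULL; // the message is written to the next term instead
    }

    int paddingOffset() {
        return paddingOffset;
    }
}
```

With a 128-byte term, claims of 64 and 32 bytes return offsets 0 and 64; a further 64-byte claim does not fit, so padding starts at offset 96 and the writer is told to use the next term.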
What’s in a header?
Data Message Header
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    Version    |B|E|   Flags   |             Type              |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-------------------------------+
|R|                        Frame Length                         |
+-+-------------------------------------------------------------+
|R|                         Term Offset                         |
+-+-------------------------------------------------------------+
|                          Session ID                           |
+---------------------------------------------------------------+
|                           Stream ID                           |
+---------------------------------------------------------------+
|                            Term ID                            |
+---------------------------------------------------------------+
|                       Encoded Message                       ...
...                                                             |
+---------------------------------------------------------------+
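The layout above can be read as a 24-byte fixed header. The Java sketch below encodes it into a `ByteBuffer`. The byte offsets follow from the field widths in the diagram; the little-endian byte order, the reading of B and E as "begin" and "end" fragment flags, and the name `DataHeader` are assumptions made for illustration, since the slide does not state them.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical codec for the 24-byte data header shown in the diagram. */
final class DataHeader {
    static final int LENGTH = 24;
    static final int BEGIN_FLAG = 0x80; // B: first bit of the flags byte (assumed "begin")
    static final int END_FLAG = 0x40;   // E: second bit of the flags byte (assumed "end")

    // Byte offsets derived from the diagram: 8 + 8 + 16 bits, then 32-bit words.
    static final int VERSION = 0;
    static final int FLAGS = 1;
    static final int TYPE = 2;
    static final int FRAME_LENGTH = 4;
    static final int TERM_OFFSET = 8;
    static final int SESSION_ID = 12;
    static final int STREAM_ID = 16;
    static final int TERM_ID = 20;

    static ByteBuffer encode(int version, int flags, int type, int frameLength,
                             int termOffset, int sessionId, int streamId, int termId) {
        // Byte order is an assumption; the slide does not specify it.
        ByteBuffer b = ByteBuffer.allocate(LENGTH).order(ByteOrder.LITTLE_ENDIAN);
        b.put(VERSION, (byte) version);
        b.put(FLAGS, (byte) flags);
        b.putShort(TYPE, (short) type);
        b.putInt(FRAME_LENGTH, frameLength);
        b.putInt(TERM_OFFSET, termOffset);
        b.putInt(SESSION_ID, sessionId);
        b.putInt(STREAM_ID, streamId);
        b.putInt(TERM_ID, termId);
        return b;
    }

    /** True when both fragment bits are set, i.e. the frame is a whole message. */
    static boolean isUnfragmented(ByteBuffer header) {
        int flags = header.get(FLAGS) & 0xFF;
        return (flags & (BEGIN_FLAG | END_FLAG)) == (BEGIN_FLAG | END_FLAG);
    }

    static int termOffset(ByteBuffer header) {
        return header.getInt(TERM_OFFSET);
    }
}
```

Because the same header is used on the wire and in memory, a sketch like this would serve both the log buffer writer and the network sender without a translation step.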
Header Evolution
1. Same header on wire & in memory
2. Frame Length originally 16-bits
3. IPv6-style Header Chains? Pfft!
4. Frame Alignment & Padding
5. Fragmentation? 2-bits
6. Position! Position! Position!
What is a position?
Unique identification of a byte
within each stream across time
(streamId, sessionId,
termId, termOffset)
Position
count = currentTerm - initialTerm
position = (count * termLength) + termOffset
// termLength is a power of 2
bts = ntz(termLength)
position = (count << bts) + termOffset
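The position arithmetic above can be written directly in Java. This is a small sketch in which `ntz` is taken to mean the number of trailing zero bits, which `Integer.numberOfTrailingZeros` provides; the class and method names are hypothetical.

```java
/** Hypothetical helper computing a stream position from term coordinates. */
final class Position {
    /** Multiplication form: works for any term length. */
    static long compute(int initialTermId, int currentTermId, int termOffset, int termLength) {
        // int subtraction still gives the right count if term IDs wrap around
        long count = currentTermId - initialTermId;
        return (count * termLength) + termOffset;
    }

    /** Shift form: valid only when termLength is a power of 2. */
    static long computeWithShift(int initialTermId, int currentTermId, int termOffset, int termLength) {
        int bts = Integer.numberOfTrailingZeros(termLength);
        long count = currentTermId - initialTermId;
        return (count << bts) + termOffset;
    }
}
```

For example, with a term length of 65536, an initial term of 5, a current term of 7, and a term offset of 100, both forms give 2 * 65536 + 100 = 131172.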
How do we replicate a log?
We need a Protocol of
messages
[Diagram sequence of frames exchanged between Sender and Receiver:
1. Setup
2. Status
3. Data, Data
4. Data, Data, then Status
5. Data, Data, then Heartbeat
6. Data, Data, then NAK
7. Data]
Protocol Evolution
1. 0-length Data Frames (SETUP & HB)
2. Ranged NAKs vs NAKing a Range
3. Send Padding & Frame Alignment!
4. Eliminating Special Cases
How are message streams
reassembled?
[Diagram sequence: an empty File with Completed and High Water Mark markers. Header + Message 1 arrives; then Header + Message 3 arrives; finally Header + Message 2 arrives and the Completed and High Water Mark labels are shown together.]
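The reassembly sequence above can be sketched as a small tracker. This is an illustrative Java sketch, assuming "Completed" is the end of the contiguous run of received frames and "High Water Mark" is the furthest byte received so far; the class name `Rebuilder` and the per-offset length array are hypothetical simplifications, not Aeron's actual data structure.

```java
/** Hypothetical receiver-side tracker for one term of the log. */
final class Rebuilder {
    private final int[] frameLengthAt; // frame length recorded at each frame's start offset
    private int completed;             // end of the contiguous prefix of frames
    private int highWaterMark;         // furthest byte received so far

    Rebuilder(int termLength) {
        frameLengthAt = new int[termLength];
    }

    /** Records a frame placed at termOffset, in whatever order frames arrive. */
    void onFrame(int termOffset, int length) {
        frameLengthAt[termOffset] = length;
        highWaterMark = Math.max(highWaterMark, termOffset + length);
        // Advance Completed over every contiguous frame now present.
        while (completed < frameLengthAt.length && frameLengthAt[completed] > 0) {
            completed += frameLengthAt[completed];
        }
    }

    int completed() {
        return completed;
    }

    int highWaterMark() {
        return highWaterMark;
    }

    /** A gap between the two markers is what a receiver would NAK. */
    boolean hasGap() {
        return completed < highWaterMark;
    }
}
```

Feeding it Message 1 at offset 0, then Message 3 at offset 64, leaves Completed at 32 and the High Water Mark at 96; once Message 2 arrives at offset 32, both markers sit at 96.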
How do we know what is
consumed?
Publishers, Senders,

Receivers, and Subscribers

all keep position counters
Counters are the key to

flow control and monitoring
Flow Control
Three Flow Control Windows
1. Publisher to Sender – Counter
2. Sender to Receiver – Status Messages
3. Receiver to Subscriber – Counter
Status Messages
Completed Position of Subscriber
+
Receiver Window
Status Message Generation
1. Term Rollover
2. Every ¼ Term Progression
3. On Timeout
Clocked by Sender Data Rate
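The three triggers listed above can be combined into one predicate. The Java sketch below is illustrative only: the parameter names, the nanosecond clock, and the idea of remembering the term and position of the last Status Message are assumptions, not details given on the slide.

```java
/** Hypothetical check for when a receiver should emit a Status Message. */
final class StatusMessageTrigger {
    static boolean shouldSend(int termLength,
                              int lastSmTermId, long lastSmPosition, long lastSmTimeNs,
                              int currentTermId, long completedPosition,
                              long nowNs, long timeoutNs) {
        boolean termRollover = currentTermId != lastSmTermId;
        boolean quarterTerm = completedPosition - lastSmPosition >= termLength / 4;
        boolean timedOut = nowNs - lastSmTimeNs >= timeoutNs;
        return termRollover || quarterTerm || timedOut;
    }
}
```

Because the first two conditions only change as data arrives, the rate of Status Messages follows the sender's data rate, with the timeout as a backstop when the stream is idle.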
Flow control strategies handle
composition of status from
multiple receivers
Safety
No Status Messages,
No Completed Progression,
No Sending
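The sender-side rule above can be sketched as follows: the sender may send up to the subscriber's completed position plus the receiver window, and sends nothing until a Status Message has arrived. This is a single-receiver Java sketch with hypothetical names; composing status from several receivers is left to a strategy that is not shown.

```java
/** Hypothetical sender-side flow control limit driven by Status Messages. */
final class SenderLimit {
    private long limit;         // highest position the sender may send up to
    private boolean haveStatus; // false until the first Status Message arrives

    /** Applies a Status Message: completed position of subscriber + receiver window. */
    void onStatusMessage(long completedPosition, int receiverWindow) {
        limit = Math.max(limit, completedPosition + receiverWindow);
        haveStatus = true;
    }

    /** Safety rule: with no Status Message, or no progress, nothing may be sent. */
    boolean maySend(long senderPosition, int frameLength) {
        return haveStatus && senderPosition + frameLength <= limit;
    }
}
```

If the subscriber stops consuming, the completed position stops advancing, the limit stops moving, and the sender stalls once it reaches the limit.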
Congestion Control
Status Messages
Completed Position of Subscriber
+
Receiver Window
Dynamically adjust
Receiver Window based on loss
Loss, throughput, and buffer size
are all strongly related!!!
Pro Tip:
Know your OS network
parameters and how
to tune them
4. What else did we
discover?
Some parts of Java really suck!
Unsigned Types?
Off-heap, PAUSE, Signals, etc.
Selectors - GC
String Encoding
NIO (most of) - Locks
Managing External Resources
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Garbage Collection!!!
Profiling – Flight Recorder
Unsafe!!! + Java 8
The Optimiser – Love/Hate
5. What’s Next?
Finished a few passes of

Profiling and Tuning
20+ million 40-byte

messages per second!!!
Replication
Services
Aeron Core
Persistence
Queueing
Performance
C/C++ Port
Batch Send/Recv
IPC
InfiniBand
Multi Unicast Send
Stream Query
In closing…
Aeron: https://github.com/real-logic/Aeron
Twitter: @toddlmontgomery
Thank You!
Questions?

GOTO Night with Todd Montgomery: Aeron: What, why and what next?