Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.
Use$promo$code!“ilovegoto”!!
for$$100$off$registration!
Aeron
What, Why, and What Next?
Todd L. Montgomery
@toddlmontgomery
THANK YOU!
1. Why build another Product?
2. What Features are really needed?
3. How does one Design for this?
4. How did things Evolv...
1. Why build another
product?
Feature Bloat & Complexity
Not Fast Enough
Low & Predictable Latency is key
We are in a new world
Multi-core, Multi-socket,
Cloud...
We are in a new world
UDP, IPC, InfiniBand,

RDMA, PCI-e
Multi-core, Multi-socket,
Cloud...
Aeron is trying a new approach
The Team
Todd Montgomery Richard Warburton
Martin Thompson
2. What features

are really needed?
Publishers SubscribersChannel
Stream
Messaging
Channel
A library, not a framework, on
which other abstractions and
applications can be built
Composable Design
OSI layer 4 Transport for
message oriented streams
OSI Layer 4 (Transport) Services
1. Connection Oriented Communication
2. Reliability
3. Flow Control
4. Congestion Avoidan...
Connection Oriented Communication
Reliability
Flow Control
Congestion Avoidance/Control
Multiplexing
Multi-Everything World!
Publishers Subscribers
Channel
Stream
Multi-Everything World
3. How does one

design for this?
Design Principles
1. Garbage free in steady state running
2. Smart Batching in the message path
3. Wait-free algos in the ...
It’s all about 3 things
It’s all about 3 things
1. System Architecture
It’s all about 3 things
1. System Architecture
2. Data Structures
It’s all about 3 things
1. System Architecture
2. Data Structures
3. Protocols of Interaction
Publisher
Subscriber
Subscriber
Publisher
Architecture
IPC Log Buffer
Sender
Receiver
Receiver
Sender
Publisher
Subscriber
Subscriber
Publisher
Architecture
Media
IPC Log Buffer
Media (UDP, In...
Conductor
Sender
Receiver
Conductor
Receiver
Sender
Publisher
Subscriber
Subscriber
Publisher
Admin
Events
Architecture
Ad...
ClientMedia DriverMedia Driver
Conductor
Sender
Receiver
Conductor
Receiver
Sender
Client
Publisher
Conductor Conductor
Su...
Is the Media Driver a broker?
Broker Relationship Status: Complicated…
Broker Relationship Status: Complicated…
1. Media Driver can be standalone
Broker Relationship Status: Complicated…
1. Media Driver can be standalone
2. Media Driver can be embedded
Broker Relationship Status: Complicated…
1. Media Driver can be standalone
2. Media Driver can be embedded
3. Core-to-Core...
Broker Relationship Status: Complicated…
1. Media Driver can be standalone
2. Media Driver can be embedded
3. Core-to-Core...
Data Structures
• Maps
• IPC Ring Buffers
• IPC Broadcast Buffers
• ITC Queues
• Dynamic Arrays
• Log Buffers
Creates a

replicated persistent log

of messages
What Aeron does
Tail
File
Tail
File
Message 1
Header
Tail
File
Message 1
Header
Message 2
Header
Tail
File
Message 1
Header
Message 2
Header
Tail
File
Message 1
Header
Message 2
Header
Message 3
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Persistent data structures can be
safe to read without locks
One big file that

goes on forever?
No!!!
Page faults, page cache churn,
VM pressure, ...
ActiveDirtyClean
Tail
Message
Header
Message
Header
Message
Header
Message
Header
Message
Header
Message
Header
Message
He...
How do we stay “wait-free”?
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Header
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Header
Padding
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message Y
Header
Padding
File
Message X
Tail
File
Message 1
Header
Message 2
Header
Message 3
Header
Message X
Message Y
Header
Padding
File
Header
What’s in a header?
Data Message Header
0 1 2 3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-...
Header Evolution
1. Same header on wire & in memory
2. Frame Length originally 16-bits
3. IPv6-style Header Chains? Pfft!
...
What is a position?
Unique identification of a byte
within each stream across time
(streamId, sessionId,
termId, termOffset)
count = currentTerm - initialTerm
(count * termLength) + termOffset
// termLength power of 2
bts = ntz(termLength)
(count ...
How do we replicate a log?
We need a Protocol of
messages
Sender Receiver
Receiver
Setup
Sender
Sender
Status
Receiver
DataData
Sender Receiver
DataData
Status
Sender Receiver
DataDataHeartbeat
Sender Receiver
DataData
NAK
Sender Receiver
Data
Sender Receiver
Protocol Evolution
1. 0-length Data Frames (SETUP & HB)
2. Ranged NAKs vs NAKing a Range
3. Send Padding & Frame Alignment...
How are message streams
reassembled?
High Water Mark
File
Completed
High Water Mark
File
Message 1
Header
Completed
High Water Mark
File
Message 1
Header
Message 3
Header
Completed
File
Message 1
Header
Message 2
Header
Message 3
Header
Completed High Water Mark
How do we know what is
consumed?
Publishers, Senders,

Receivers, and Subscribers

all keep position counters
Counters are the key to

flow control and monitoring
Flow Control
Three Flow Control Windows
1. Publisher to Sender – Counter
2. Sender to Receiver – Status Messages
3. Receiver to Subscri...
Status Messages
Completed Position of Subscriber
+
Receiver Window
Status Message Generation
1. Term Rollover
2. Every ¼ Term Progression
3. On Timeout
Clocked by Sender Data Rate
Flow control strategies handle
composition of status from
multiple receivers
Safety
No Status Messages,
No Completed Progression,
No Sending
Congestion Control
Status Messages
Completed Position of Subscriber
+
Receiver Window
Dynamically adjust
Receiver Window based on loss
Loss, throughput, and buffer size
are all strongly related!!!
Pro Tip:
Know your OS network
parameters and how
to tune them
4. What else did we
discover?
Some parts of Java really suck!
Some parts of Java really suck!
Unsigned Types?
Some parts of Java really suck!
Unsigned Types?
NIO (most of) - Locks
Some parts of Java really suck!
Unsigned Types?
NIO (most of) - Locks
Off-heap, PAUSE, Signals, etc.
Some parts of Java really suck!
Unsigned Types?
String Encoding
NIO (most of) - Locks
Off-heap, PAUSE, Signals, etc.
Some parts of Java really suck!
Unsigned Types?
String Encoding
NIO (most of) - Locks
Off-heap, PAUSE, Signals, etc.
Manag...
Some parts of Java really suck!
Unsigned Types?
Off-heap, PAUSE, Signals, etc.
Selectors - GC
String Encoding
NIO (most of...
Some parts of Java are really nice!
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Profiling – Flight Recorder
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Profiling – Flight Recor...
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Unsafe!!! + Java 8
Profi...
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Profiling – Flight Recor...
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Profiling – Flight Recor...
Some parts of Java are really nice!
Tooling – IDEs, Gradle, HdrHistogram
Bytecode Instrumentation
Garbage Collection!!!
Pr...
5. What’s Next?
Finished a few passes of

Profiling and Tuning
20+ Million 40 byte

messages per second!!!
Replication
Services
Aeron Core
Persistence
Queueing
Performance
C/C++ Port
Batch Send/Recv
IPC
Infiniband
Multi Unicast S...
In closing…
Aeron: https://github.com/real-logic/Aeron
Twitter: @toddlmontgomery
Thank You!
Questions?
GOTO Night with Todd Montgomery: Aeron: What, why and what next?
GOTO Night with Todd Montgomery: Aeron: What, why and what next?
GOTO Night with Todd Montgomery: Aeron: What, why and what next?
GOTO Night with Todd Montgomery: Aeron: What, why and what next?
Upcoming SlideShare
Loading in …5
×

GOTO Night with Todd Montgomery: Aeron: What, why and what next?

671 views

Published on

"Aeron: What, Why, and What Next"

Aeron is a new protocol and implementation that is specifically designed to be a blazingly fast and predictable transport protocol for unicast and multicast. In this session, we will explore the design evolution of the Aeron protocol and implementation. What have we learned and been learning as Aeron has been evolving and changing? What current and future problems does Aeron address? And what does the approaches taken by Aeron open up in the future?

Published in: Internet
  • Be the first to comment

GOTO Night with Todd Montgomery: Aeron: What, why and what next?

  1. 1. Use$promo$code!“ilovegoto”!! for$$100$off$registration!
  2. 2. Aeron What, Why, and What Next? Todd L. Montgomery @toddlmontgomery
  3. 3. THANK YOU!
  4. 4. 1. Why build another Product? 2. What Features are really needed? 3. How does one Design for this? 4. How did things Evolve along the way? 5. What’s Next?
  5. 5. 1. Why build another product?
  6. 6. Feature Bloat & Complexity
  7. 7. Not Fast Enough
  8. 8. Low & Predictable Latency is key
  9. 9. We are in a new world Multi-core, Multi-socket, Cloud...
  10. 10. We are in a new world UDP, IPC, InfiniBand,
 RDMA, PCI-e Multi-core, Multi-socket, Cloud...
  11. 11. Aeron is trying a new approach
  12. 12. The Team Todd Montgomery Richard Warburton Martin Thompson
  13. 13. 2. What features
 are really needed?
  14. 14. Publishers SubscribersChannel Stream Messaging Channel
  15. 15. A library, not a framework, on which other abstractions and applications can be built
  16. 16. Composable Design
  17. 17. OSI layer 4 Transport for message oriented streams
  18. 18. OSI Layer 4 (Transport) Services 1. Connection Oriented Communication 2. Reliability 3. Flow Control 4. Congestion Avoidance/Control 5. Multiplexing
  19. 19. Connection Oriented Communication
  20. 20. Reliability
  21. 21. Flow Control
  22. 22. Congestion Avoidance/Control
  23. 23. Multiplexing
  24. 24. Multi-Everything World!
  25. 25. Publishers Subscribers Channel Stream Multi-Everything World
  26. 26. 3. How does one
 design for this?
  27. 27. Design Principles 1. Garbage free in steady state running 2. Smart Batching in the message path 3. Wait-free algos in the message path 4. Non-blocking IO in the message path 5. No exceptional cases in message path 6. Apply the Single Writer Principle 7. Prefer unshared state 8. Avoid unnecessary data copies
  28. 28. It’s all about 3 things
  29. 29. It’s all about 3 things 1. System Architecture
  30. 30. It’s all about 3 things 1. System Architecture 2. Data Structures
  31. 31. It’s all about 3 things 1. System Architecture 2. Data Structures 3. Protocols of Interaction
  32. 32. Publisher Subscriber Subscriber Publisher Architecture IPC Log Buffer
  33. 33. Sender Receiver Receiver Sender Publisher Subscriber Subscriber Publisher Architecture Media IPC Log Buffer Media (UDP, InfiniBand, PCI-e 3.0)
  34. 34. Conductor Sender Receiver Conductor Receiver Sender Publisher Subscriber Subscriber Publisher Admin Events Architecture Admin EventsMedia IPC Log Buffer Media (UDP, InfiniBand, PCI-e 3.0) Function/Method Call Volatile Fields & Queues
  35. 35. ClientMedia DriverMedia Driver Conductor Sender Receiver Conductor Receiver Sender Client Publisher Conductor Conductor Subscriber Subscriber Publisher Admin Events Architecture Admin EventsMedia IPC Log Buffer IPC Ring/Broadcast Buffer Media (UDP, InfiniBand, PCI-e 3.0) Function/Method Call Volatile Fields & Queues
  36. 36. Is the Media Driver a broker?
  37. 37. Broker Relationship Status: Complicated…
  38. 38. Broker Relationship Status: Complicated… 1. Media Driver can be standalone
  39. 39. Broker Relationship Status: Complicated… 1. Media Driver can be standalone 2. Media Driver can be embedded
  40. 40. Broker Relationship Status: Complicated… 1. Media Driver can be standalone 2. Media Driver can be embedded 3. Core-to-Core memory handoff
  41. 41. Broker Relationship Status: Complicated… 1. Media Driver can be standalone 2. Media Driver can be embedded 3. Core-to-Core memory handoff 4. Isolation of protocol logic
  42. 42. Data Structures • Maps • IPC Ring Buffers • IPC Broadcast Buffers • ITC Queues • Dynamic Arrays • Log Buffers
  43. 43. Creates a
 replicated persistent log
 of messages What Aeron does
  44. 44. Tail File
  45. 45. Tail File Message 1 Header
  46. 46. Tail File Message 1 Header Message 2 Header
  47. 47. Tail File Message 1 Header Message 2 Header
  48. 48. Tail File Message 1 Header Message 2 Header Message 3
  49. 49. Tail File Message 1 Header Message 2 Header Message 3 Header
  50. 50. Persistent data structures can be safe to read without locks
  51. 51. One big file that
 goes on forever?
  52. 52. No!!! Page faults, page cache churn, VM pressure, ...
  53. 53. ActiveDirtyClean Tail Message Header Message Header Message Header Message Header Message Header Message Header Message Header Message Header
  54. 54. How do we stay “wait-free”?
  55. 55. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y
  56. 56. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y
  57. 57. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y
  58. 58. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y Header
  59. 59. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y Header Padding
  60. 60. Tail File Message 1 Header Message 2 Header Message 3 Header Message Y Header Padding File Message X
  61. 61. Tail File Message 1 Header Message 2 Header Message 3 Header Message X Message Y Header Padding File Header
  62. 62. What’s in a header?
  63. 63. Data Message Header 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Version |B|E| Flags | Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-------------------------------+ |R| Frame Length | +-+-------------------------------------------------------------+ |R| Term Offset | +-+-------------------------------------------------------------+ | Session ID | +---------------------------------------------------------------+ | Stream ID | +---------------------------------------------------------------+ | Term ID | +---------------------------------------------------------------+ | Encoded Message ... ... | +---------------------------------------------------------------+
  64. 64. Header Evolution 1. Same header on wire & in memory 2. Frame Length originally 16-bits 3. IPv6-style Header Chains? Pfft! 4. Frame Alignment & Padding 5. Fragmentation? 2-bits 6. Position! Position! Position!
  65. 65. What is a position?
  66. 66. Unique identification of a byte within each stream across time (streamId, sessionId, termId, termOffset)
  67. 67. count = currentTerm - initialTerm (count * termLength) + termOffset // termLength power of 2 bts = ntz(termLength) (count << bts) + termOffset Position
  68. 68. How do we replicate a log?
  69. 69. We need a Protocol of messages
  70. 70. Sender Receiver
  71. 71. Receiver Setup Sender
  72. 72. Sender Status Receiver
  73. 73. DataData Sender Receiver
  74. 74. DataData Status Sender Receiver
  75. 75. DataDataHeartbeat Sender Receiver
  76. 76. DataData NAK Sender Receiver
  77. 77. Data Sender Receiver
  78. 78. Protocol Evolution 1. 0-length Data Frames (SETUP & HB) 2. Ranged NAKs vs NAKing a Range 3. Send Padding & Frame Alignment! 4. Eliminating Special Cases
  79. 79. How are message streams reassembled?
  80. 80. High Water Mark File Completed
  81. 81. High Water Mark File Message 1 Header Completed
  82. 82. High Water Mark File Message 1 Header Message 3 Header Completed
  83. 83. File Message 1 Header Message 2 Header Message 3 Header Completed High Water Mark
  84. 84. How do we know what is consumed?
  85. 85. Publishers, Senders,
 Receivers, and Subscribers
 all keep position counters
  86. 86. Counters are the key to
 flow control and monitoring
  87. 87. Flow Control
  88. 88. Three Flow Control Windows 1. Publisher to Sender – Counter 2. Sender to Receiver – Status Messages 3. Receiver to Subscriber – Counter
  89. 89. Status Messages Completed Position of Subscriber + Receiver Window
  90. 90. Status Message Generation 1. Term Rollover 2. Every ¼ Term Progression 3. On Timeout Clocked by Sender Data Rate
  91. 91. Flow control strategies handle composition of status from multiple receivers
  92. 92. Safety No Status Messages, No Completed Progression, No Sending
  93. 93. Congestion Control
  94. 94. Status Messages Completed Position of Subscriber + Receiver Window
  95. 95. Dynamically adjust Receiver Window based on loss
  96. 96. Loss, throughput, and buffer size are all strongly related!!!
  97. 97. Pro Tip: Know your OS network parameters and how to tune them
  98. 98. 4. What else did we discover?
  99. 99. Some parts of Java really suck!
  100. 100. Some parts of Java really suck! Unsigned Types?
  101. 101. Some parts of Java really suck! Unsigned Types? NIO (most of) - Locks
  102. 102. Some parts of Java really suck! Unsigned Types? NIO (most of) - Locks Off-heap, PAUSE, Signals, etc.
  103. 103. Some parts of Java really suck! Unsigned Types? String Encoding NIO (most of) - Locks Off-heap, PAUSE, Signals, etc.
  104. 104. Some parts of Java really suck! Unsigned Types? String Encoding NIO (most of) - Locks Off-heap, PAUSE, Signals, etc. Managing External Resources
  105. 105. Some parts of Java really suck! Unsigned Types? Off-heap, PAUSE, Signals, etc. Selectors - GC String Encoding NIO (most of) - Locks Managing External Resources
  106. 106. Some parts of Java are really nice!
  107. 107. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram
  108. 108. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Profiling – Flight Recorder
  109. 109. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Bytecode Instrumentation Profiling – Flight Recorder
  110. 110. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Bytecode Instrumentation Unsafe!!! + Java 8 Profiling – Flight Recorder
  111. 111. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Bytecode Instrumentation Profiling – Flight Recorder The Optimiser Unsafe!!! + Java 8
  112. 112. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Bytecode Instrumentation Profiling – Flight Recorder The Optimiser – Love/Hate Unsafe!!! + Java 8
  113. 113. Some parts of Java are really nice! Tooling – IDEs, Gradle, HdrHistogram Bytecode Instrumentation Garbage Collection!!! Profiling – Flight Recorder Unsafe!!! + Java 8 The Optimiser – Love/Hate
  114. 114. 5. What’s Next?
  115. 115. Finished a few passes of
 Profiling and Tuning
  116. 116. 20+ Million 40 byte
 messages per second!!!
  117. 117. Replication Services Aeron Core Persistence Queueing Performance C/C++ Port Batch Send/Recv IPC Infiniband Multi Unicast Send Stream Query
  118. 118. In closing…
  119. 119. Aeron: https://github.com/real-logic/Aeron Twitter: @toddlmontgomery Thank You! Questions?

×