Solace Systems The Evolution of Messaging The Rise of the Appliance
Clive Andrews
Mat Hobbis
Obninsk, 2 March, 2013
LSE The focus beyond Low Latency
EXTENT Trading Technology Trends & Quality Assurance
13. Applying the Appliance Advantage to Middleware
o Easier Operation
o Lower TCO
o Higher Performance
[Diagram: Enterprise Appliances vs. Software on Servers, across IP Routing, Web Infrastructure, Storage, Database, Messaging Middleware]
14. Don’t Make Headlines
“India stock exchange flash crash erases US$58 Billion” October 2012
“IT leaders face pay cut after TSE outage” August 2012
“Facebook crashes the Nasdaq” December 2012
16. Event Driven Architecture (i)
• Need to be agile.
• Increased regulation: audit, “real-time” global risk and P&L
• EDA adopted for scale and resilience, which drives message bus requirements
• Bus latency and throughput are important
[Diagram (two panels), labels: Trade Bus; OMS, Post Trade, DB; Sub, Dist Svc, Persist; Monitor / Staging Bus; Risk, P&L]
17. Event Driven Architecture (ii)
[Diagram, labels: Physical Host with Shm Q, Crossing Engine, SOR; Trade Bus; OMS, Post Trade, DB; Sub, Dist Svc, Persist]
• Co-Locate Processes where Latency is key
• Shared Memory IPC within host (Same API)
• Non “on host” components also need Low Latency Connections.
• Lower Latency requirements of Staging area allow message batching
– Turn Message Rate problem into a Bandwidth Problem
• Need High Availability and recovery options
• Need Disaster Recovery options
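The batching bullet above (turning a message rate problem into a bandwidth problem) can be illustrated with a minimal sketch. It assumes a hypothetical `send` callback standing in for the transport, a 4-byte length prefix per message, and an arbitrary flush threshold; none of these come from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Batcher:
    """Packs many small messages into fewer, larger frames."""
    send: Callable[[bytes], None]      # hypothetical transport callback
    max_batch_bytes: int = 8192        # assumed flush threshold
    _buf: List[bytes] = field(default_factory=list)
    _size: int = 0

    def publish(self, msg: bytes) -> None:
        # Length-prefix each message so the receiver can split the frame.
        framed = len(msg).to_bytes(4, "big") + msg
        # Flush first if adding this message would exceed the threshold.
        if self._buf and self._size + len(framed) > self.max_batch_bytes:
            self.flush()
        self._buf.append(framed)
        self._size += len(framed)

    def flush(self) -> None:
        # One send call carries every buffered message.
        if self._buf:
            self.send(b"".join(self._buf))
            self._buf.clear()
            self._size = 0


def unbatch(frame: bytes) -> List[bytes]:
    """Split a frame produced by Batcher back into individual messages."""
    out, i = [], 0
    while i < len(frame):
        n = int.from_bytes(frame[i:i + 4], "big")
        out.append(frame[i + 4:i + 4 + n])
        i += 4 + n
    return out
```

The receiver sees fewer, larger frames, so per-message overhead drops at the cost of added delay while a batch fills. That trade-off only suits paths such as the staging area, where latency requirements are looser.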
CONFIDENTIAL
19. Networked Architecture
• Hardware-based middleware overlay for IP networks
• All Message QoS in one Appliance – Reliable/Persistent/Web Streaming
• WAN Optimisation and Compression
• Comprehensive Statistics and Monitoring
20. Modular Addition of Functionality
Control Plane: Administration, subscriptions and stats collection never impact performance
Data Plane: Capabilities embedded in FPGAs and network processors, added via modular architecture
- Build to suit
- Scale within footprint
- Easy upgrades
High-Speed Interconnect
Solace Blades (PCIe Cards): 10 blades in 3260, 5 blades in 3230
21. Reliable Messaging
• Pure hardware solution
– No operating system
– No context switching
– No interrupts
– No data copies
• 10 million messages/second
– Can be any combination, e.g. 5M in & 5M out, 2M in & 8M out
[Chart: Microseconds of Latency (Avg and 99.9th) vs. Messages per Second at 500K/500K, 1M/1M, 2M/2M, 3M/3M, 4M/4M and 5M/5M; data labels range from 22 to 54]

Bulk Message Rate (10GigE Line Rate is the Limit)
Message Size (bytes)   Message Rate (msgs/sec)   User Payload Bandwidth (Mbps)
100                    5,930,000                 4,744
500                    2,080,000                 8,320
1,000                  1,080,000                 8,640
12,000                 92,000                    8,832
30,000                 34,000                    8,160
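The bandwidth column of the bulk message rate table follows from size times rate. A short check, assuming 1 Mbps = 10^6 bits per second and counting user payload only (the function name is illustrative):

```python
# (message size in bytes, message rate in msgs/sec) as listed in the table
BULK_RATES = [
    (100, 5_930_000),
    (500, 2_080_000),
    (1_000, 1_080_000),
    (12_000, 92_000),
    (30_000, 34_000),
]


def payload_mbps(size_bytes: int, msgs_per_sec: int) -> float:
    """User payload bandwidth in Mbps: bytes * 8 bits * rate / 10^6."""
    return size_bytes * 8 * msgs_per_sec / 1e6


for size, rate in BULK_RATES:
    mbps = payload_mbps(size, rate)
    print(f"{size:>6} B x {rate:>9,} msgs/s = {mbps:>7,.0f} Mbps")
```

At 100 bytes the figure is well below 10 Gbps, so that row is bounded by message rate. From 500 bytes upward the payload bandwidth stays above 8 Gbps, consistent with the slide's note that the 10GigE line rate is the limit.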
22. Guaranteed Messaging; Store & Forward Performance
Failsafe w/o overhead of persisting every message to disk
200K msgs/sec ingress and 200K msgs/sec egress
Latency steady even while recovering when disconnected subscribers reconnect
[Chart: Microseconds of Latency (Avg and 99.9th) vs. Messages per Second at 2,000, 10,000, 25,000, 50,000, 100,000, 125,000 and 150,000; data labels range from 69 to 154]
[Chart: ADB Message Rates; Msg Rate (Msg/sec) at 500, 1,000, 2,000 and 4,000; legend: ADB-3, Software Broker; data labels 206,400, 202,000, 157,500 and 124,400]
23. Guaranteed Messaging; Cut Through Persistence Latency
Low, consistent latency for low latency trading applications
Can also have store & forward clients for same published message
Queues can have low and high priority limits set. During congestion:
- Reject new orders
- Process changes to existing orders
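The congestion policy on this slide could look roughly like the sketch below. It assumes new orders are admitted against the low priority limit and changes to existing orders against the high priority limit. The class and field names are hypothetical, and this is not the appliance's API.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, Optional


@dataclass
class Order:
    order_id: str
    is_new: bool  # True: new order (low priority); False: change to an existing order


class PriorityLimitedQueue:
    """Bounded queue with separate depth limits per priority (illustrative only)."""

    def __init__(self, low_limit: int, high_limit: int) -> None:
        if low_limit > high_limit:
            raise ValueError("low_limit must not exceed high_limit")
        self.low_limit = low_limit
        self.high_limit = high_limit
        self._items: Deque[Order] = deque()

    def offer(self, order: Order) -> bool:
        """Return True if accepted, False if rejected because of congestion."""
        limit = self.low_limit if order.is_new else self.high_limit
        if len(self._items) >= limit:
            return False
        self._items.append(order)
        return True

    def poll(self) -> Optional[Order]:
        """Remove and return the oldest queued order, or None if empty."""
        return self._items.popleft() if self._items else None
```

Once depth reaches the low limit, new orders are rejected while amendments and cancels still get through until the high limit. That keeps existing positions manageable during a burst.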
24. Steady in Face of Slow Consumers
o Latency stays consistent even through disconnection and re-connection of clients
o Re-connected subscribers “catch up” without impacting other clients
[Chart: Microseconds of Latency (Avg and 99.9th) by Period of Test: Pre-Failure, Spooling, Catchup/Recovery, Post-Recovery; data labels range from 74 to 170]
25. IPC Shared Memory Messaging
• Single API session for:
– Communications between processes on one OS instance
– Topic-based pub/sub and request/reply
– Any-to-any messaging
– Reliable delivery
• Applications can block or busy-wait
• C API for Linux, Solaris and Windows
• Move apps to IPC with no application changes
[Diagram: Core 1, Core 2, Core 3 and Core 4 around Shared Memory]
1 publisher -> 1 subscriber
• 2.91 million msgs/sec; 128 byte messages
• Average latency 431 nanoseconds; 99th percentile 480 nanoseconds
6x6 mesh simulation of fanout/fanin
• 46.8 million messages per second
• 154.5 gigabits per second
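The "block or busy-wait" choice on this slide can be sketched generically. The example uses Python's in-process `queue.Queue` as a stand-in for the shared memory transport. It is not the C API named above, and the function names are hypothetical.

```python
import queue


def receive_blocking(q: queue.Queue, timeout: float = 1.0):
    """Wait inside the queue until a message arrives; raises queue.Empty on timeout.

    The waiting thread does not spin, so the core stays free for other work.
    """
    return q.get(timeout=timeout)


def receive_busy_wait(q: queue.Queue, max_spins: int = 1_000_000):
    """Poll in a tight loop without sleeping; raises TimeoutError when spins run out.

    The polling thread keeps a core busy for the whole wait.
    """
    for _ in range(max_spins):
        try:
            return q.get_nowait()
        except queue.Empty:
            continue
    raise TimeoutError("no message within the spin budget")
```

Busy-waiting is generally chosen when a dedicated core is available and wake-up delay matters. Blocking is the usual default when cores are shared.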