Replacing Traditional Technologies with MongoDB
A Single Platform for all Financial Market Data
June 2014
James Blackburn ...
Opinions expressed are those of the author and may not be shared by all personnel of Man Group plc
(‘Man’). These opinions...
© Man 2014 3
Introductions
Gary Collier James Blackburn
© Man 2014 4
Agenda
The Story of MongoDB at AHL
1. What is a Systematic Fund Manager?
2. Low Frequency Futures and FX Data...
Prologue
AHL – A Systematic Fund Manager
© Man 2014 5
© Man 2014 6
Systematic Fund Management
Removing the first impedance mismatch…
© Man 2014 7
Quants and Techies Speak the Same Language
© Man 2014 8
Disparate Data Sources
DataAPI
But…
© Man 2014 9
All Data is Behind an API
Performance
User Experience
Cluster Compute
Onboarding
New Data
Impedance Mism...
© Man 2013 10
Chapter 1
Starting Small: Low Frequency Data
© Man 2014 11
The Data
8000 rows x 200 markets
100 MB
5000000 rows x 250 markets
500 GB
Parallel Filesystem
© Man 2014 12
Previous Solution
[Diagram: HDF5 (x5), Prop (x5), RDBMS (x3)]
© Man 2014
13
The Challenge
Fast?
Reliable?
Versionable?
Easy to extend?
© Man 2014 14
MongoDB Solution
[Diagram: compute nodes, node 1 … node 12, node 73 … node 84, node 85 … node 96]
...
© Man 2014 15
Performance: 200 Future Markets
Previous Solution MongoDB
100x faster to retrieve data
Consistent retrieval ...
© Man 2014 16
Performance: EURUSD 1-Minute Data
Previous Solution MongoDB
2-5x faster to retrieve data
Consistent retrieva...
© Man 2014 17
Low Frequency Data - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• ALL data sizes and A...
© Man 2013 18
Chapter 2
Getting Bigger: Single Stock Equities
© Man 2014 19
Single Stock Data - Scale
Thousands of Stocks
Many years of Time-series Data
Tens of different Data Item for...
[Diagram: Trading Signal; Derived Data Item (x5); Raw Data Items...]
© Man 2014 21
Single Stock Trading - Conclusions
MongoDB faster than previous RDBMS/File Solution at…
• Fast interactive r...
© Man 2013 22
Chapter 3
MongoDB as a Tick Store
Almost, but not quite
© Man 2014 23
Big Data?
30TB Historic Data
Ticks/1000 per second
Sparse Data
© Man 2014 24
Third-Party Tick Stores
Typically…
• Expensive
• Proprietary query languages
• Database-centric architecture...
© Man 2014 25
Architecture
Reuters
RMDSMessageBus
Bloomberg
Banks
Kafka Queue
Kafka Queue
Kafka Queue
16 shard cluster
Mas...
Parallel Access
© Man 2014 26
Tick Store Performance
Infiniband saturated
25x greater tick throughput
With just 2 machines!
© Man 2014 27
Tick Store: System Load
[Chart: Other Tick vs. Mongo (x2); N Tasks = 32]
© Man 2014 28
Tick Store - Conclusions
Happy Quants!
• 25x improvement in tick throughput
• So fit models 25x as fast
Happ...
© Man 2014 29
Epilogue
Where are we now and where next?
Performance
Low Frequency Data: 100x faster
Equities Models: Hours → Seconds
Tick Data: 25x faster
© Man 2014 30
Key Facts...
© Man 2014 31
Where Next?
1. Extend the data ecosystem further
2. Broader application across the company as a whole
3. Ope...
© Man 2014 32
Questions
Gary Collier
gcollier@ahl.com
James Blackburn
jblackburn@ahl.com
Replacing Traditional Technologies with MongoDB: A Single Platform for All Financial Data at AHL

Published in: Technology
  • Everything running orders of magnitude faster

    Moving from proprietary tech to commodity hardware and MongoDB has realised significant cost savings

    Complexity down, and getting more out of what we have, both in hardware and people

    Including onboarding new data.

    Peopleware: often overlooked, but really the most important factor in our sorts of industries
    “The reason I love working here so much is because the technology is soooo good”

  • Transcript of "Replacing Traditional Technologies with MongoDB: A Single Platform for All Financial Data at AHL"

    1. Replacing Traditional Technologies with MongoDB: A Single Platform for all Financial Market Data. June 2014. James Blackburn & Gary Collier
    2. Legal Stuff: Opinions expressed are those of the author and may not be shared by all personnel of Man Group plc (‘Man’). These opinions are subject to change without notice, and are for information purposes only and do not constitute an offer or invitation to make an investment in any financial instrument or in any product to which any member of Man’s group of companies provides investment advisory or any other services. Any forward-looking statements speak only as of the date on which they are made and are subject to risks and uncertainties that may cause actual results to differ materially from those contained in the statements. Unless stated otherwise this information is communicated by Man Investments Limited and AHL Partners LLP which are both authorised and regulated in the UK by the Financial Conduct Authority.
    3. Introductions: Gary Collier, James Blackburn
    4. Agenda: The Story of MongoDB at AHL
        1. What is a Systematic Fund Manager?
        2. Low Frequency Futures and FX Data
        3. Single Stock Equity Trading
        4. Building a Tick Store
        5. Now and the Future?
    5. Prologue: AHL – A Systematic Fund Manager
    6. Systematic Fund Management
    7. Quants and Techies Speak the Same Language: Removing the first impedance mismatch…
    8. Disparate Data Sources: DataAPI
    9. All Data is Behind an API. But…
        • Performance
        • User Experience
        • Cluster Compute
        • Onboarding New Data
        • Impedance Mismatch
        • Mix of Technologies
        • Many Moving Parts
        • Reliability
        Is there one Technology which could address?
    10. Chapter 1. Starting Small: Low Frequency Data
    11. The Data
        • 8000 rows x 200 markets, 100 MB
        • 5000000 rows x 250 markets, 500 GB
    12. Previous Solution: Parallel Filesystem [Diagram: HDF5 (x5), Prop (x5), RDBMS (x3)]
    13. The Challenge: Fast? Reliable? Versionable? Easy to extend?
    14. MongoDB Solution [Diagram: compute nodes (node 1, node 2, node 3 … node 12; node 73, node 74, node 75 … node 84; node 85, node 86, node 87 … node 96); SSD; MongoDB Cluster with shard 1 to shard 4, shown three times; Linux, 24 cores, 96 GB RAM; Bloomberg Adapter, JPM Adapter, Markit Adapter, GS Adapter]
    15. Performance: 200 Future Markets [Chart: Previous Solution vs. MongoDB]
        • 100x faster to retrieve data
        • Consistent retrieval times
    16. Performance: EURUSD 1-Minute Data [Chart: Previous Solution vs. MongoDB]
        • 2-5x faster to retrieve data
        • Consistent retrieval times
    17. Low Frequency Data - Conclusions
        MongoDB faster than previous RDBMS/File Solution at…
        • ALL data sizes and ALL client load levels
        • …consistently
        Game changing new features:
        • No impedance mismatch: onboard new data in minutes
        • Version Store: can ask “What did the data look like?”
        Cost Savings:
        • Proprietary parallel filesystem replaced by commodity SSDs
    18. Chapter 2. Getting Bigger: Single Stock Equities
    19. Single Stock Data - Scale
        • Thousands of Stocks
        • Many years of Time-series Data
        • Tens of different Data Items for each Stock
        • Complex trading models with many Quants sharing the Data
    20. Single Stock Data: Multi-user, versioned, interactive graph-based computation [Diagram: Source Data (Managed RDBMS); Raw Data Items; Derived Data Items; Trading Signal; MongoDB Cluster with shard 1 to shard 4, shown three times]
        • ~1TB Data
        • ~10,000 Stocks
        • ~20 Years
        • 250 Data Items
        • Each Item is 600 MB
        • Single model ~150GB
        • Many Quants and models
        • Hours → Minutes
    21. Single Stock Trading - Conclusions
        MongoDB faster than previous RDBMS/File Solution at…
        • Fast interactive research
        • Read/write a 600MB Data item in < 1 second
        • Rebuild complex model: hours → minutes
    22. Chapter 3. MongoDB as a Tick Store
    23. Big Data? Almost, but not quite
        • 30TB Historic Data
        • Ticks/1000 per second
        • Sparse Data
    24. Third-Party Tick Stores
        Typically…
        • Expensive
        • Proprietary query languages
        • Database-centric architectures, so…
        • Not ideal for cluster compute
        • Unless you pay for lots of cores…
        • Expensive!
        So…
        • A real $$$ saving opportunity!
    25. Architecture [Diagram: Reuters RMDSMessageBus, Bloomberg, Banks; Kafka Queue (x3); MongoDB Cluster]
        • 16 shard cluster
        • Master + 1 replica
        • Linux, 12 cores, 256 GB RAM
        • 96TB Disk
        • Infiniband network
        • LZ4 compressed data
    26. Tick Store Performance: Parallel Access
        • Infiniband saturated
        • 25x greater tick throughput
        • With just 2 machines!
    27. Tick Store: System Load [Chart: Other Tick vs. Mongo (x2); N Tasks = 32]
    28. Tick Store - Conclusions
        Happy Quants!
        • 25x improvement in tick throughput
        • So fit models 25x as fast
        Happy Accountants!
        • >40x cost saving of MongoDB Support compared to previous Tick Store licensing.
    29. Epilogue: Where are we now and where next?
    30. Key Facts
        Performance
        • Low Frequency Data: 100x faster
        • Equities Models: Hours → Seconds
        • Tick Data: 25x faster
        Cost Savings
        • Parallel File System → Commodity SSDs
        • Proprietary Tick Store → MongoDB
        • Orders of magnitude $$$ savings…
        Efficiencies
        • 4 storage technologies → 1
        • Fully utilise expensive HPC resources
        • Support load on team down > 50%
        Game Changers
        • Onboard Data: Days → Minutes
        • Data Versioning
        • The technology is no longer the bottleneck
        “Peopleware”
        • Attract and retain great Quants
        • Attract and retain great Techies
        • And attend a great conference
    31. Where Next?
        1. Extend the data ecosystem further
        2. Broader application across the company as a whole
        3. Open Source?
    32. Questions: Gary Collier (gcollier@ahl.com), James Blackburn (jblackburn@ahl.com)
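The "Version Store" feature listed in the Low Frequency Data conclusions (being able to ask "What did the data look like?") can be illustrated with a small sketch. This is not AHL's implementation, which the slides do not describe: it is a toy, in-memory, append-only store written for illustration only. The class name `VersionStore`, the methods `write` and `read`, the `as_of` parameter and the sample prices are all hypothetical.

```python
from bisect import bisect_right
from datetime import datetime, timezone


class VersionStore:
    """Toy append-only store (hypothetical): each write adds a new version."""

    def __init__(self):
        # symbol -> list of (written_at, version_number, data), oldest first
        self._versions = {}

    def write(self, symbol, data, written_at):
        """Store a new version of `symbol` and return its version number."""
        history = self._versions.setdefault(symbol, [])
        if history and written_at < history[-1][0]:
            raise ValueError("writes must arrive in time order")
        version = len(history) + 1
        # Copy the data so later changes by the caller cannot alter history.
        history.append((written_at, version, dict(data)))
        return version

    def read(self, symbol, as_of=None):
        """Return the latest data, or the data as it looked at `as_of`."""
        history = self._versions[symbol]
        if as_of is None:
            return dict(history[-1][2])
        timestamps = [written_at for written_at, _, _ in history]
        # Index of the first version written strictly after `as_of`.
        idx = bisect_right(timestamps, as_of)
        if idx == 0:
            raise KeyError(f"no version of {symbol!r} at or before {as_of}")
        return dict(history[idx - 1][2])


if __name__ == "__main__":
    store = VersionStore()
    t1 = datetime(2014, 6, 1, tzinfo=timezone.utc)
    t2 = datetime(2014, 6, 2, tzinfo=timezone.utc)
    store.write("EURUSD", {"2014-05-30": 1.3635}, t1)  # made-up price
    store.write("EURUSD", {"2014-05-30": 1.3640}, t2)  # made-up correction
    print(store.read("EURUSD"))            # latest version
    print(store.read("EURUSD", as_of=t1))  # the data as it looked at t1
```

Because old versions are never overwritten, a model fitted against the data as of a given date can be re-run later against the same inputs even after corrections have arrived. A database-backed version would presumably persist each version as its own record rather than keep them in memory.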