This document discusses, using a simple example, what a robo-advisor can achieve using Deep Reinforcement Learning when creating trading strategies for financial markets. This system is already in use and generating added value for asset managers, traders and financial market analysts. The system will be upgraded with a Business Intelligence layer and be partially made available via the web in the course of 2018.
Trading Engine/Robo-Advisor using Deep Reinforcement Learning
1. Trading Strategy Generator using Deep Reinforcement Learning
Covering an End-to-End Trading Strategy Life Cycle
• Define Meta-Parameters
• Loading Historical Data Time Series
• Processing Historical Data Time Series
• Building Prediction Models
• Deriving Risk-Minimizing Portfolio
• Generating Trading Strategies
• Putting Optimal Trading Strategy into Production
3. Trading Strategy Generator
Generating Trading Strategies using Deep Reinforcement Learning
• As an example, the goal is to generate an equity-based portfolio for the FTSE100,
by detecting and executing optimized asset selection and allocation strategies.
• The engine puts an ‘end-of-day’ trading strategy into production by initializing a
portfolio on 14/9/2016 and rebalancing it, if needed, over the following 5 trading days,
which implies a weekly realignment if necessary.
• After extensive ‘paper trading’ using Machine Learning (ML), the engine decides to
combine a portfolio based on an ML-based trading strategy (Wijnen) with a portfolio
based on MPT (Modern Portfolio Theory, Markowitz), allocating weights of 62.8% and
37.2% to the two strategies respectively. Using ML, the engine also determines how many
FTSE100 components to include in the portfolio (5) and which ones: POLY, SGE, RB,
CPG and FRES, all traded in Pound Sterling (GBP).
• The table with the labels ‘Prices at’, ‘Prediction at +1’ and ‘Prediction at’ shows actual
prices (open) and predicted prices (close) for each of the selected shares. ‘Prediction at’
is the predicted closing price for the same trading day, while ‘Prediction at +1’ is the
estimate of the next day’s closing price. Prices are quoted per bundle of a given asset
(e.g. 100 shares instead of 1) so that the pricing of the different assets can be
compared.
• The engine examines both ‘The tendency to go Long’ and ‘The tendency to go
Short’ and expresses the results as a long and a short position for each of the
assets (e.g. for POLY: go long 840 shares, go short 1936 shares).
Finally, the ‘netted position’ is determined (e.g. for POLY: S1096, where S = Short,
L = Long, and 1096 is the number of shares to go short, since 1936 - 840 = 1096).
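The netting step described above can be illustrated with a minimal sketch. The function name `net_position` and the `FLAT` label are assumptions made for illustration, and the slides do not show how the engine implements this step. Only the POLY figures (long 840, short 1936, netted S1096) come from the example.

```python
def net_position(long_shares: int, short_shares: int) -> str:
    """Net a long leg and a short leg into a label such as 'S1096' or 'L250'.

    Hypothetical helper: 'S' marks a net short position and 'L' a net long
    position, followed by the number of shares, as in the slide's notation.
    """
    net = long_shares - short_shares
    if net == 0:
        # Assumption: the slides do not show a label for a flat position.
        return "FLAT"
    side = "L" if net > 0 else "S"
    return f"{side}{abs(net)}"


# POLY on the initialization day: long 840 shares, short 1936 shares.
poly_label = net_position(840, 1936)
print(poly_label)  # S1096
```

The same arithmetic would apply to daily rebalancing tendencies, where the long and short legs are changes in position rather than absolute positions.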
4. Trading Strategy Generator
Generating Trading Strategies using Deep Reinforcement Learning
• On the next trading day (15/9/2016), actual prices are retrieved and predictions generated.
• On 15/9/2016, the portfolio is rebalanced based on market behavior. Rebalancing is analyzed for
‘going long’ and ‘going short’ simultaneously (e.g. for POLY the tendency is to increase the long
position by 47 and the short position by 108, resulting in a netted tendency to go short by a further 61 shares).
• The engine keeps track of the actual cumulative ‘Profit and Loss’ (P&L) and of the worst and best
cumulative P&L so far. This information is used to decide whether or not to liquidate the
portfolio.
• The engine rebalances the portfolio for 16/9/2016. Every trading day the engine evaluates whether to
rebalance or liquidate the portfolio.
• Graph showing the cumulative P&L over time (in trading days) for the strategy that has been put into
production.
• The part of the graph to the left of the ‘red dashed line’ shows the ‘learning’ done by the engine. The
‘red dashed line’ marks when the strategy was first put into ‘production’, so the part to the right shows
actual trading results.
• As the trading strategy above seems to become less profitable after a given moment in time, we
decide to switch to a new trading strategy.
• Graph showing the amount of cash needed to execute the strategy over time (positive values reflect cash
consumption while going long, negative values reflect the cash impact of going short).
• Graph showing the ‘Long Ratio’, which expresses the engine's estimated probability in favor of going long
(0 implies 100% short, 1 implies 100% long).
• Graph showing the profitability of the strategy as a % per annum (e.g. 0.30 implies a 30% return over the
year).
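The P&L bookkeeping described in the bullets above (cumulative P&L plus the best and worst cumulative values so far) could be sketched as follows. This is an illustration under stated assumptions, not the engine's actual logic. The class name `PnLTracker`, the starting values of zero, the `max_giveback` threshold rule and the daily P&L figures are all hypothetical, and the slides do not state how the liquidation decision is made.

```python
from dataclasses import dataclass


@dataclass
class PnLTracker:
    """Tracks cumulative P&L and its best and worst values so far."""
    cumulative: float = 0.0
    best: float = 0.0   # assumption: tracking starts from a P&L of zero
    worst: float = 0.0

    def update(self, daily_pnl: float) -> None:
        """Add one trading day's P&L and refresh the running extremes."""
        self.cumulative += daily_pnl
        self.best = max(self.best, self.cumulative)
        self.worst = min(self.worst, self.cumulative)

    def should_liquidate(self, max_giveback: float) -> bool:
        """Hypothetical rule: liquidate once cumulative P&L has fallen
        more than max_giveback below the best level reached so far."""
        return self.best - self.cumulative > max_giveback


# Made-up daily P&L figures in GBP, for illustration only.
tracker = PnLTracker()
for daily_pnl in [120.0, -50.0, -200.0, 30.0]:
    tracker.update(daily_pnl)

print(tracker.cumulative, tracker.best, tracker.worst)  # -100.0 120.0 -130.0
print(tracker.should_liquidate(max_giveback=200.0))     # True
```

Keeping only the running extremes lets the daily rebalance-or-liquidate check run without storing the full P&L history.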
ADDED VALUE FOR ASSET MANAGERS:
=> More accurate estimates of future expected returns and asset pricing (How: Other pptx).
=> Powerful ML-driven trading engine/portfolio optimization that exploits this increase in
accuracy by creating trading strategies in real time that reflect current changes in market
behavior, thus allowing for significantly ‘outperforming the market’.