QUANT TRADING WITH
ARTIFICIAL INTELLIGENCE
THE PRACTICAL APPROACH AND COMMON PITFALLS
By Roger Lee
Develop your own trading
strategy with Python
Avoid common pitfalls
ENHANCE VISIBILITY OF THE REALITY
FOSTER MORE INTELLIGENT USE OF AI IN TRADING
Challenges in AI + Trading
Quant Trading Process
• Data Curation
• Feature Analysis (Black box)
• Strategy (White box)
• Backtesting
• Deployment
[Diagram with labels: DATA, FUNCTION, PORTFOLIO MANAGEMENT, OP, OP, ALPHA]
MACRO
• GDP
• INDUSTRY GROWTH RATE
• INDUSTRY STATISTICS
FUNDAMENTAL
• ASSET / LIABILITIES
• REVENUE / PROFIT
• OPERATIONAL STATS
MARKET DATA
• STOCK PRICE / YIELDS / IMPLIED VOL.
• MARKET CAP.
• VOLUME
• DIVIDEND / COUPONS
• MICROSTRUCTURES (BID / ASK / DEPTH / ORDER TYPE)
Axis: Macro (Low Freq.) to Micro (High Freq.)
EVENTS
• ANALYST RECOMMENDATION
• CREDIT RATINGS
• EARNINGS CALL / GUIDANCE
• NEWS
ALTERNATIVE DATA
• NEWS SENTIMENT
• TWITTER
• SATELLITE IMAGES
• GOOGLE TRENDS
DATA
FUNCTION
MACHINE LEARNING
• SUPERVISED LEARNING
• UNSUPERVISED LEARNING
• REINFORCEMENT LEARNING
• DEEP LEARNING
NLP
• NAMED ENTITY RECOGNITION
• SENTIMENT
• TRANSLATION
SPEECH
• SPEECH-TO-TEXT
KNOWLEDGE ENGINEERING
• BAYESIAN LEARNING
• MARKOV
PORTFOLIO MANAGEMENT
• In trading, there are 3 parameters for decision making
- Asset
- Time (many people analyze only this one dimension)
- Position
• Function picks the (Asset, Timing) pair
• Portfolio Management picks (Position)
- Very important for risk management
- Recent news: Optionsellers.com
EXAMPLE -
SUPERVISED LEARNING
Apply supervised learning on stock ranking
We use only OHLCV data on GOOGLE (Alphabet) and APPLE stocks
STEP 1: IMPORT ALL RELEVANT LIBRARIES
STEP 2: READ DATA FROM PRE-PREPARED CSV (DATA FROM 2014 –
2018, ROUGHLY 1000 SAMPLES)
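Steps 1 and 2 might look roughly like the sketch below. The column layout (Date, Open, High, Low, Close, Volume), the helper name `load_ohlcv`, and the sample rows are assumptions for illustration; the numbers are made up and only stand in for a pre-prepared CSV file.

```python
import io

import pandas as pd


def load_ohlcv(source):
    """Read a daily OHLCV CSV indexed by date (hypothetical column layout)."""
    frame = pd.read_csv(source, parse_dates=["Date"], index_col="Date")
    # Sort chronologically so later return calculations are well defined.
    return frame.sort_index()


# Tiny in-memory stand-in for a pre-prepared file (values are made up).
sample_csv = io.StringIO(
    "Date,Open,High,Low,Close,Volume\n"
    "2014-01-03,552.1,556.4,550.0,551.9,3345800\n"
    "2014-01-02,555.6,556.8,552.1,554.5,3656400\n"
)
goog = load_ohlcv(sample_csv)
print(goog)
```

Passing a file path instead of the `StringIO` object works the same way, since `pandas.read_csv` accepts both.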
STEP 3: CALCULATE RETURN AND NORMALIZE DATA
STEP 4: PREPARE TARGET DATA (LABEL GOOG > AAPL AS 1, ELSE 0)
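A minimal sketch of steps 3 and 4, assuming synthetic closing prices in place of the real GOOG and AAPL data. Two choices here are assumptions: returns are z-score normalised, and the label compares next-day returns. The slide's own code is not preserved.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
dates = pd.bdate_range("2014-01-01", periods=1000)

# Synthetic closing prices standing in for the GOOG and AAPL CSV data.
close = pd.DataFrame(
    {
        "GOOG": 500 * np.exp(np.cumsum(rng.normal(0, 0.015, len(dates)))),
        "AAPL": 80 * np.exp(np.cumsum(rng.normal(0, 0.015, len(dates)))),
    },
    index=dates,
)

# Step 3: daily simple returns, then z-score normalisation per column.
# Note: full-sample statistics are used here for brevity; fitting them on
# the training period only avoids leaking future information.
returns = close.pct_change().dropna()
features = (returns - returns.mean()) / returns.std()

# Step 4: label is 1 when GOOG outperforms AAPL on the NEXT day, else 0.
next_day = returns.shift(-1)
target = (next_day["GOOG"] > next_day["AAPL"]).astype(int)

# Drop the final row, whose next-day return is unknown.
features, target = features.iloc[:-1], target.iloc[:-1]
print(features.shape, target.mean())
```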
STEP 5: SEPARATE TRAINING AND TEST SAMPLES
STEP 6: APPLY LOGISTIC REGRESSION AND CALCULATE ACCURACY
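Steps 5 and 6 can be sketched as follows, again on synthetic returns rather than the original data. The 80/20 chronological split and the use of scikit-learn's `LogisticRegression` with default settings are assumptions. On random data like this the accuracy should hover around chance, which is the point of the baseline.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

# Synthetic stand-in: 1000 days of daily returns for two stocks.
returns = rng.normal(0, 0.015, size=(1000, 2))  # columns: GOOG, AAPL

X = returns[:-1]                                   # today's returns
y = (returns[1:, 0] > returns[1:, 1]).astype(int)  # GOOG beats AAPL tomorrow

# Step 5: chronological split (no shuffling) to avoid look-ahead bias.
split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Normalise with training-set statistics only.
mu, sigma = X_train.mean(axis=0), X_train.std(axis=0)
X_train, X_test = (X_train - mu) / sigma, (X_test - mu) / sigma

# Step 6: fit logistic regression and score on the held-out period.
model = LogisticRegression()
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"Test accuracy: {accuracy:.2%}")
```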
EXAMPLE -
SUPERVISED LEARNING
Result:
• 44% accuracy in predicting the return ranking of GOOG and AAPL on ~200 test data points
(with ~800 data points as the training set), in 40 lines of code
Not satisfactory
EXAMPLE -
SUPERVISED LEARNING
Hypothesis:
• People will exit when they achieve a certain return…
• Combine OHLC data (OHLC/4)
• Create a new feature called volume-weighted average return
Result:
• With just 4 variables, achieved 65% accuracy
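The slides do not define the volume-weighted average return, so the sketch below is one possible reading. It takes the return of the OHLC/4 price and averages it over a rolling window with volume as the weight. The function name, the 20-day window, and the synthetic bars are all assumptions.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n_days = 250
dates = pd.bdate_range("2014-01-01", periods=n_days)

# Synthetic OHLCV bars (illustrative only).
close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n_days)))
open_ = close * (1 + rng.normal(0, 0.002, n_days))
bars = pd.DataFrame(
    {
        "Open": open_,
        "High": np.maximum(open_, close) * 1.005,
        "Low": np.minimum(open_, close) * 0.995,
        "Close": close,
        "Volume": rng.integers(1_000_000, 5_000_000, n_days),
    },
    index=dates,
)


def volume_weighted_return(bars, window=20):
    """Rolling volume-weighted average of OHLC/4 returns (assumed definition)."""
    typical = bars[["Open", "High", "Low", "Close"]].mean(axis=1)  # OHLC/4
    ret = typical.pct_change()
    weighted = (ret * bars["Volume"]).rolling(window).sum()
    return weighted / bars["Volume"].rolling(window).sum()


vwar = volume_weighted_return(bars)
print(vwar.dropna().tail())
```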
Can we do more?
• Goal is to find features that can identify / predict prices
• We can do it with both human knowledge and artificial intelligence algorithms
EXAMPLE -
UNSUPERVISED LEARNING
“Days Like Today”
• Can we cluster similar days historically?
• How do we define similar days?
• How can we cluster high-dimensional data?
LIST OF 20 FEATURES TO TEST:
• FOCUS ON TECHNICAL ANALYSIS ONLY
• ICHIMOKU
• RSI
• STOCHASTIC
• MACD
• BOLLINGER BAND
• WILLIAMS %R
• PARABOLIC SAR
• CCI
• ETC.
1. Normalize Data
2. SVD (20 -> 10)
3. t-SNE (10 -> 2D Points)
4. Clustering (Affinity Propagation)
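The four steps above can be sketched with scikit-learn as follows. The feature matrix is random noise standing in for 20 technical indicators, and the parameter choices (`perplexity=30`, `damping=0.9`, fixed random seeds) are assumptions rather than the presenter's settings.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)

# Synthetic stand-in: 300 trading days x 20 technical-indicator features.
indicators = rng.normal(size=(300, 20))

# 1. Normalise each feature to zero mean and unit variance.
scaled = StandardScaler().fit_transform(indicators)

# 2. SVD: reduce 20 features to 10 components.
reduced = TruncatedSVD(n_components=10, random_state=0).fit_transform(scaled)

# 3. t-SNE: embed the 10 components as 2-D points.
embedded = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(reduced)

# 4. Affinity Propagation: cluster the 2-D points into groups of similar days.
labels = AffinityPropagation(damping=0.9, random_state=0).fit_predict(embedded)

# Past days sharing the last day's cluster label are candidate "days like today".
today = labels[-1]
similar_days = np.flatnonzero(labels == today)[:-1]
print(f"{len(set(labels))} clusters; {len(similar_days)} past days resemble the last day")
```

Affinity Propagation chooses the number of clusters itself, which suits this question because the number of distinct "day types" is not known in advance.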
Useful Python Libraries
LIBRARY: USES
NUMPY: Optimized numerical and matrix operations
PANDAS: Pandas dataframe for easy manipulation of time-series data; download financial data from various sources
MATPLOTLIB: Plotting time-series data on a chart to observe patterns
SCIKIT-LEARN: Most ML / AI algorithms (excluding PGM)
SCIPY: Scientific / quantitative computing
MULTIPROCESSING: Parallel calculations
TENSORFLOW: Optimized for neural networks and Google’s TPU (tensor processing unit)
PGMPY: Basic PGM (probabilistic graphical model) algorithms
QUANDL / ALPHA-VANTAGE: Data sources (free / freemium / premium)
PITFALLS OF USING AI
#1: DATA CURATION
• Biased data (e.g. look-ahead bias / survivorship bias)
• Low-quality data
• Throwing away too much data
• Common curation steps
- Stock prices: should be adjusted for splits and dividends
- Futures / option prices: handle rollover situations
- FX prices: traded OTC, so volume from one broker may not be representative
- Timezone adjustments / lunch-hour adjustments
- Holidays / no trading: fill with the previous value or leave blank
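Two of the curation steps above, split adjustment and filling non-trading days, can be sketched with pandas. The prices, the 2-for-1 split, and the dates are made up for illustration. Real pipelines usually take adjustment factors from the data vendor.

```python
import pandas as pd

# Illustrative raw closes with a 2-for-1 split effective 2018-01-04 and a
# missing session on 2018-01-05 (all values are made up).
raw = pd.Series(
    [100.0, 102.0, 51.5, 52.0],
    index=pd.to_datetime(["2018-01-02", "2018-01-03", "2018-01-04", "2018-01-08"]),
)
split_date, split_ratio = pd.Timestamp("2018-01-04"), 2.0

# Back-adjust: divide prices before the split by the split ratio.
adjusted = raw.copy()
before_split = adjusted.index < split_date
adjusted.loc[before_split] = adjusted.loc[before_split] / split_ratio

# Align to a business-day calendar and fill gaps with the previous close.
calendar = pd.bdate_range(adjusted.index.min(), adjusted.index.max())
clean = adjusted.reindex(calendar).ffill()
print(clean)
```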
#1: EXAMPLE – LOW QUALITY DATA
• Daily data / Monthly data / Quarterly data / Yearly Data
- Too few data points to train deep networks (or even shallow networks).
- For daily data, 10 years (already the maximum if market regime changes are to be avoided) gives only about 260 × 10 = 2,600 data points
- High bias
• Tick data
- Contains too much noise.
- During peak hours, when the market opens / closes, a lot of trading and information flow takes place.
- Outside those hours, the bars carry little information
- High variance
Solution: use turnover bars / volume bars
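A minimal sketch of volume bars, assuming a synthetic tick stream with `price` and `size` columns (both names are hypothetical). Each bar closes after roughly a fixed amount of traded volume, so busy periods produce more bars and quiet periods fewer.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(7)

# Synthetic tick stream: one price and one traded size per tick.
n_ticks = 10_000
ticks = pd.DataFrame(
    {
        "price": 100 + np.cumsum(rng.normal(0, 0.01, n_ticks)),
        "size": rng.integers(1, 500, n_ticks),
    }
)


def volume_bars(ticks, bar_volume):
    """Group ticks so each bar holds roughly `bar_volume` traded units."""
    # Ticks whose cumulative volume falls in the same multiple share a bar id.
    bar_id = (ticks["size"].cumsum() - 1) // bar_volume
    grouped = ticks.groupby(bar_id)
    bars = grouped["price"].agg(["first", "max", "min", "last"])
    bars.columns = ["open", "high", "low", "close"]
    bars["volume"] = grouped["size"].sum()
    return bars


bars = volume_bars(ticks, bar_volume=50_000)
print(bars.head())
```

Turnover bars follow the same pattern with `price * size` accumulated in place of `size`.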
#1: EXAMPLE – THROW AWAY TOO MUCH DATA
• Return $r_n = \ln(P_n / P_{n-1}) = \ln P_n - \ln P_{n-1} = y_n - y_{n-1}$ --- (*), where $P_n$ is the stock price at $t = n$ and $y_n = \ln P_n$
• When you take returns from a price series, you completely throw away the price-level information ($y_t$ for any $t$ in $(1, n)$), which might be
an important input, and you assume $y_n \sim N(\mu, \sigma^2)$
• Solution: apply fractional differentiation so that the resulting time series contains information on both price and return
• Fractional differentiation formula (easy to prove):
• When $\alpha = 1$, it is equivalent to the first-order difference (which is formula (*))
• When $\alpha = 0$, it is equivalent to $y_n$ (the original log-price series)
• When $\alpha$ takes any value between 0 and 1, the series contains information on both return and price.
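The formula image is not preserved here, so the sketch below assumes the standard binomial expansion of $(1 - B)^\alpha$, where $B$ is the lag operator: $\sum_k w_k y_{n-k}$ with $w_0 = 1$ and $w_k = -w_{k-1}(\alpha - k + 1)/k$. The fixed window of 50 weights and the function names are illustrative choices. The two limiting cases stated above ($\alpha = 1$ and $\alpha = 0$) fall out of the same code.

```python
import numpy as np


def frac_diff_weights(alpha, n_weights):
    """Binomial-series weights of (1 - B)^alpha, where B is the lag operator."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(-weights[-1] * (alpha - k + 1) / k)
    return np.array(weights)


def frac_diff(log_prices, alpha, n_weights=50):
    """Fractionally difference a log-price series with a fixed-width window."""
    y = np.asarray(log_prices, dtype=float)
    w = frac_diff_weights(alpha, n_weights)
    out = np.full(len(y), np.nan)
    for n in range(n_weights - 1, len(y)):
        # window holds y[n], y[n-1], ..., y[n - n_weights + 1]
        window = y[n - n_weights + 1 : n + 1][::-1]
        out[n] = np.dot(w, window)
    return out


rng = np.random.default_rng(3)
log_prices = np.log(100) + np.cumsum(rng.normal(0, 0.01, 500))

as_returns = frac_diff(log_prices, alpha=1.0)  # equals y_n - y_{n-1}
as_prices = frac_diff(log_prices, alpha=0.0)   # equals y_n
blended = frac_diff(log_prices, alpha=0.4)     # keeps some price memory
print(blended[-5:])
```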
#2: RISK MANAGEMENT
• Neural networks forget historical black swan events, and most AI cannot provide good
predictions in situations it has not encountered before
• Solution 1: LSTM / Neural network with attention / Memory network
• Solution 2: Knowledge Engineering & Generalisation
• Solution 3: A good risk management system
Reference
LSTM: Each repeating module exports information to the next module, which decides whether to keep it or discard it when passing it on.
Neural Network With Attention: The attention mechanism distributes attention over data at several time steps to avoid information being “diluted” after many time steps.
Memory Network: Controller, reader, and writer are trained consistently to store important events into the memory.
Photo source: skymind.ai, DeepMind
Challenges and Opportunities
Short-term focus
Problem
• Not enough structured data
• Too much useless data (noise)
Solution
• Turn unstructured data into structured data (using AI)
• Extract value from information (domain expert)
Long-term focus
Problem
• Making decisions based on incomplete information and under uncertainty
• General AI algorithms cannot deal with causation
• Most popular / commercialized / scalable AI is black-box; white-box approaches are lacking
Solution
• Deal with causation & uncertainty (using AI)
• White-box knowledge engineering (domain expert)
Thank you!
email: cylee@live.hk
linkedin: https://www.linkedin.com/in/roger-lee-cfa-85710048/

Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 

Quant trading with artificial intelligence

  • 1. QUANT TRADING WITH ARTIFICIAL INTELLIGENCE: THE PRACTICAL APPROACH AND COMMON PITFALLS By Roger Lee
  • 2. Develop your own trading strategy with Python Avoid common pitfalls ENHANCE VISIBILITY OF THE REALITY FOSTER MORE INTELLIGENT USE OF AI IN TRADING Challenges in AI + Trading
  • 3. Quant Trading Process • Data Curation • Feature Analysis (Black box) • Strategy (White box) • Backtesting • Deployment (Diagram labels: FUNCTION, DATA, PORTFOLIO MANAGEMENT, OP, OP, ALPHA)
  • 4. DATA MACRO: • GDP • INDUSTRY GROWTH RATE • INDUSTRY STATISTICS FUNDAMENTAL: • ASSET / LIABILITIES • REVENUE / PROFIT • OPERATIONAL STATS MARKET DATA: • STOCK PRICE / YIELDS / IMPLIED VOL. • MARKET CAP. • VOLUME • DIVIDEND / COUPONS • MICROSTRUCTURES (BID / ASK / DEPTH / ORDER TYPE) EVENTS: • ANALYST RECOMMENDATION • CREDIT RATINGS • EARNINGS CALL / GUIDANCE • NEWS ALTERNATIVE DATA: • NEWS SENTIMENT • TWITTER • SATELLITE IMAGES • GOOGLE TRENDS (Axis: Macro, Low Freq. to Micro, High Freq.)
  • 5. FUNCTION MACHINE LEARNING: • SUPERVISED LEARNING • UNSUPERVISED LEARNING • REINFORCEMENT LEARNING • DEEP LEARNING NLP: • NAMED ENTITY RECOGNITION • SENTIMENT • TRANSLATION SPEECH: • SPEECH-TO-TEXT KNOWLEDGE ENGINEERING: • BAYESIAN LEARNING • MARKOV
  • 6. PORTFOLIO MANAGEMENT • In trading, there are 3 parameters for decision making: - Asset - Time (many people focus only on time, a single dimension, in their analysis) - Position • Function picks the (Asset, Timing) pair • Portfolio Management picks (Position) - Very important for risk management - Recent news: Optionsellers.com
  • 7. EXAMPLE - SUPERVISED LEARNING Apply supervised learning on stock ranking We use only OHLCV data on GOOGLE (Alphabet) and APPLE stocks
  • 8. EXAMPLE | SUPERVISED LEARNING STEP 1: IMPORT ALL RELEVANT LIBRARIES STEP 2: READ DATA FROM PRE-PREPARED CSV (DATA FROM 2014 – 2018, ROUGHLY 1000 SAMPLES)
  • 9. EXAMPLE | SUPERVISED LEARNING STEP 3: CALCULATE RETURN AND NORMALIZE DATA STEP 4: PREPARE TARGET DATA (LABEL GOOG > AAPL AS 1, ELSE 0)
  • 10. EXAMPLE | SUPERVISED LEARNING STEP 5: SEPARATE TRAINING AND TEST SAMPLES STEP 6: APPLY LOGISTIC REGRESSION AND CALCULATE ACCURACY
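Steps 1 to 6 can be sketched roughly as follows. This is a minimal illustration, not the code from the slides: the pre-prepared CSV is not available, so `make_synthetic_ohlcv` (a hypothetical helper) generates random-walk OHLCV data instead, the label is assumed to compare next-day close returns, and the scaler is fitted on the training part only to avoid leaking test information. Accuracy on random data will hover around 50% and says nothing about the figures quoted in the deck.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.preprocessing import StandardScaler


def make_synthetic_ohlcv(n_days, seed):
    """Hypothetical stand-in for the CSV: a random-walk OHLCV frame (not real data)."""
    rng = np.random.default_rng(seed)
    close = 100 * np.exp(np.cumsum(rng.normal(0, 0.01, n_days)))
    open_ = close * (1 + rng.normal(0, 0.002, n_days))
    high = np.maximum(open_, close) * (1 + np.abs(rng.normal(0, 0.003, n_days)))
    low = np.minimum(open_, close) * (1 - np.abs(rng.normal(0, 0.003, n_days)))
    volume = rng.integers(1_000_000, 5_000_000, n_days)
    index = pd.bdate_range("2014-01-01", periods=n_days)
    return pd.DataFrame(
        {"open": open_, "high": high, "low": low, "close": close, "volume": volume},
        index=index,
    )


def build_dataset(goog, aapl):
    # Step 3: daily log changes of every OHLCV column for both stocks.
    feats = pd.concat(
        [np.log(goog).diff().add_prefix("goog_"), np.log(aapl).diff().add_prefix("aapl_")],
        axis=1,
    ).dropna()
    # Step 4 (assumption): label 1 when GOOG's NEXT-day close return beats AAPL's.
    nxt = feats[["goog_close", "aapl_close"]].shift(-1)
    target = (nxt["goog_close"] > nxt["aapl_close"]).astype(int)
    # The last row has no next day, so it is dropped from features and labels.
    return feats.iloc[:-1], target.iloc[:-1]


def run_example(n_days=1000):
    goog = make_synthetic_ohlcv(n_days, seed=1)
    aapl = make_synthetic_ohlcv(n_days, seed=2)
    X, y = build_dataset(goog, aapl)
    # Step 5: chronological 80/20 split (no shuffling for time series).
    split = int(len(X) * 0.8)
    X_train, X_test = X.iloc[:split], X.iloc[split:]
    y_train, y_test = y.iloc[:split], y.iloc[split:]
    # Normalization statistics come from the training part only.
    scaler = StandardScaler().fit(X_train)
    # Step 6: logistic regression and out-of-sample accuracy.
    model = LogisticRegression(max_iter=1000)
    model.fit(scaler.transform(X_train), y_train)
    pred = model.predict(scaler.transform(X_test))
    return accuracy_score(y_test, pred), len(X_train), len(X_test)


if __name__ == "__main__":
    accuracy, n_train, n_test = run_example()
    print(f"train={n_train} test={n_test} accuracy={accuracy:.2f}")
```

Splitting by date rather than at random keeps every test sample strictly after the training period, which is the simplest guard against look-ahead bias in this kind of experiment.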
  • 11. EXAMPLE - SUPERVISED LEARNING Result: • 44% accuracy in predicting the return ranking of GOOG and AAPL on ~200 test data points (with ~800 data points as the training set), using 40 lines of code • Not satisfactory
  • 12. EXAMPLE - SUPERVISED LEARNING Hypothesis: • People will exit when they achieve a certain return… • Combine OHLC data (OHLC/4) • Create a new feature called volume-weighted average return Result: • With just 4 variables, achieved 65% accuracy Can we do more? • The goal is to find features that can identify / predict prices • We can do this with both human knowledge and artificial intelligence algorithms
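The two engineered features can be sketched as below. The slide does not give exact definitions, so both are assumptions: `ohlc4` averages open, high, low and close, and `volume_weighted_return` takes the log return of that average and weights it by volume over a rolling window (the window length of 5 is arbitrary).

```python
import numpy as np
import pandas as pd


def ohlc4(df):
    """Average of open, high, low and close for each bar (the slide's OHLC/4)."""
    return df[["open", "high", "low", "close"]].mean(axis=1)


def volume_weighted_return(df, window=5):
    """Rolling volume-weighted average of OHLC/4 log returns (assumed definition)."""
    ret = np.log(ohlc4(df)).diff()
    weighted_sum = (ret * df["volume"]).rolling(window).sum()
    total_volume = df["volume"].rolling(window).sum()
    return weighted_sum / total_volume


if __name__ == "__main__":
    bars = pd.DataFrame(
        {
            "open": [100.0, 110.0, 121.0],
            "high": [100.0, 110.0, 121.0],
            "low": [100.0, 110.0, 121.0],
            "close": [100.0, 110.0, 121.0],
            "volume": [1, 2, 3],
        }
    )
    print(volume_weighted_return(bars, window=2))
```

Days with heavy volume dominate the average, which fits the hypothesis that the return at which most participants actually traded matters more than the return on quiet days.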
  • 13. EXAMPLE - UNSUPERVISED LEARNING “Days Like today” • Can we cluster similar days historically? • How do we define similar days? • How can we cluster high-dimensional data?
  • 14. EXAMPLE | UNSUPERVISED LEARNING LIST OF 20 FEATURES TO TEST: • FOCUS ON TECHNICAL ANALYSIS ONLY • ICHIMOKU • RSI • STOCHASTIC • MACD • BOLLINGER BANDS • WILLIAMS %R • PARABOLIC SAR • CCI • ETC.
  • 16. EXAMPLE | UNSUPERVISED LEARNING 1. Normalize data 2. SVD (20 -> 10 dimensions) 3. t-SNE (10 -> 2D points) 4. Clustering (Affinity Propagation)
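The four steps above can be sketched with scikit-learn as follows. This is only an illustration under assumptions: the input is a (days x 20) matrix of technical-indicator values, here replaced by random numbers, `cluster_days` and `days_like_today` are hypothetical names, and all estimators use default hyperparameters apart from the dimensions named on the slide.

```python
import numpy as np
from sklearn.cluster import AffinityPropagation
from sklearn.decomposition import TruncatedSVD
from sklearn.manifold import TSNE
from sklearn.preprocessing import StandardScaler


def cluster_days(features, random_state=0):
    """Normalize -> SVD (to 10 dims) -> t-SNE (to 2D) -> Affinity Propagation."""
    scaled = StandardScaler().fit_transform(features)
    reduced = TruncatedSVD(n_components=10, random_state=random_state).fit_transform(scaled)
    points = TSNE(n_components=2, random_state=random_state).fit_transform(reduced)
    # Affinity Propagation picks the number of clusters itself; if it fails to
    # converge, scikit-learn warns and labels every sample -1.
    labels = AffinityPropagation(random_state=random_state).fit_predict(points)
    return points, labels


def days_like_today(labels):
    """Indices of earlier days that share a cluster with the most recent day."""
    same_cluster = np.flatnonzero(labels == labels[-1])
    return same_cluster[:-1]  # drop the last index, which is today itself


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    indicators = rng.normal(size=(150, 20))  # placeholder for 20 technical indicators
    points, labels = cluster_days(indicators)
    print(points.shape, len(set(labels)), days_like_today(labels)[:5])
```

t-SNE needs more samples than its perplexity (30 by default), so very short histories would require a smaller perplexity value.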
  • 18. Useful Python Libraries LIBRARY: USES • NUMPY: Optimize numerical and matrix operations • PANDAS: DataFrames for easy manipulation of time-series data; download financial data from various sources • MATPLOTLIB: Plot time-series data on a chart to observe patterns • SCIKIT-LEARN: Most ML / AI algorithms (excluding PGM) • SCIPY: Scientific / quantitative computing • MULTIPROCESSING: Optimize parallel calculations • TENSORFLOW: Optimized for neural networks and Google’s TPU (tensor processing unit) • PGMPY: Basic PGM algorithms • QUANDL / ALPHA-VANTAGE: Data sources (free / freemium / premium)
  • 19. PITFALLS OF USING AI
  • 20. #1: DATA CURATION • Biased data (e.g. look-ahead bias / survivorship bias) • Low-quality data • Throwing away too much data • Common curation steps - Stock prices: should be adjusted for splits and dividends - Futures / option prices: handle rollover situations - FX prices: traded OTC, so volume from one broker may not be representative - Timezone adjustments / lunch-hour adjustments - Holidays / no trading: fill with the previous value or leave blank
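Two of the curation steps listed above, split adjustment and filling non-trading days, might look like this in pandas. It is a simplified sketch: `adjust_for_splits` and `fill_non_trading_days` are hypothetical helpers, splits are passed in as a plain date-to-ratio mapping, and dividend adjustment and futures rollover are not handled.

```python
import pandas as pd


def adjust_for_splits(close, splits):
    """Back-adjust closes: divide every price before a split date by the split ratio."""
    adjusted = close.astype(float).copy()
    for date, ratio in splits.items():
        before_split = adjusted.index < pd.Timestamp(date)
        adjusted.loc[before_split] = adjusted.loc[before_split] / ratio
    return adjusted


def fill_non_trading_days(close, calendar):
    """Align to a full calendar and carry the previous close over missing days."""
    return close.reindex(calendar).ffill()


if __name__ == "__main__":
    close = pd.Series(
        [100, 102, 51, 52],
        index=pd.to_datetime(["2024-01-01", "2024-01-02", "2024-01-04", "2024-01-05"]),
    )
    adjusted = adjust_for_splits(close, {"2024-01-04": 2.0})  # assumed 2-for-1 split
    print(fill_non_trading_days(adjusted, pd.bdate_range("2024-01-01", "2024-01-05")))
```

Without the adjustment, the drop from 102 to 51 would look like a 50% loss to any model trained on returns.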
  • 21. #1: EXAMPLE – LOW QUALITY DATA • Daily / monthly / quarterly / yearly data - Too few data points to train deep networks (or even shallow networks) - For daily data, 10 years (already the maximum if market regime changes are to be avoided) gives only roughly 260*10 = 2,600 data points - High bias • Tick data - Contains too much noise - During peak hours, when the market opens / closes, a lot of trading and information flow takes place - Outside those hours, the bars contain little significant information - High variance Solution: use turnover bars / volume bars
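A volume bar closes after a fixed amount of volume has traded instead of after a fixed amount of time, so busy periods produce more bars than quiet ones. The sketch below assumes a tick DataFrame with `price` and `volume` columns; `volume_bars` is a hypothetical helper, and ticks are never split across bars, so bar sizes only approximate the threshold. Setting `use_turnover=True` switches to turnover (price x volume) bars.

```python
import pandas as pd


def volume_bars(ticks, threshold, use_turnover=False):
    """Aggregate ticks into bars holding roughly `threshold` volume (or turnover) each."""
    size = ticks["price"] * ticks["volume"] if use_turnover else ticks["volume"]
    # Amount traded BEFORE each tick decides which bar the tick belongs to.
    traded_before = size.cumsum() - size
    bar_id = (traded_before // threshold).rename("bar")
    return ticks.groupby(bar_id).agg(
        open=("price", "first"),
        high=("price", "max"),
        low=("price", "min"),
        close=("price", "last"),
        volume=("volume", "sum"),
    )


if __name__ == "__main__":
    ticks = pd.DataFrame(
        {"price": [10, 11, 9, 12, 13, 12], "volume": [50, 50, 50, 50, 50, 50]}
    )
    print(volume_bars(ticks, threshold=100))
```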
  • 22. #1: EXAMPLE – THROWING AWAY TOO MUCH DATA • Return = ln(P_n / P_{n-1}) = ln P_n - ln P_{n-1} = y_n - y_{n-1} --- (*), where P_n is the stock price at t = n and y_n = ln P_n • When you take returns from a series of price data, you completely throw away the price information (y_t) for every t in (1, n), which might be an important input, and assume y_n ~ N(μ, σ^2) • Solution: apply fractional differentiation so that the resulting time series contains information on both price and return • Fractional differentiation of order α (the formula is easy to prove) behaves as follows: • When α = 1, it is equivalent to the first-order difference (which is formula (*)) • When α = 0, it is equivalent to y_n (the original log-price series) • When α takes any value between 0 and 1, the series contains information on both return and price
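The formula itself did not survive in this transcript. A common way to write fractional differentiation, assumed here rather than taken from the slide, is the binomial expansion of (1 - B)^α, where B is the lag operator: d_n = Σ_{k≥0} w_k · y_{n-k}, with w_0 = 1 and w_k = -w_{k-1} · (α - k + 1) / k. The sketch below truncates the sum to a fixed window, which is a further simplification; `frac_diff_weights` and `frac_diff` are illustrative names.

```python
import numpy as np


def frac_diff_weights(alpha, n_weights):
    """First n_weights coefficients of (1 - B)^alpha: w_0 = 1, w_k = -w_{k-1}(alpha-k+1)/k."""
    weights = [1.0]
    for k in range(1, n_weights):
        weights.append(-weights[-1] * (alpha - k + 1) / k)
    return np.array(weights)


def frac_diff(log_price, alpha, window):
    """Fixed-window fractional difference of a log-price series (NaN until the window fills)."""
    w = frac_diff_weights(alpha, window)
    y = np.asarray(log_price, dtype=float)
    out = np.full(len(y), np.nan)
    for n in range(window - 1, len(y)):
        # Reverse the slice so w[0] multiplies y[n], w[1] multiplies y[n-1], and so on.
        out[n] = np.dot(w, y[n - window + 1 : n + 1][::-1])
    return out


if __name__ == "__main__":
    y = np.log([100.0, 101.0, 103.0, 102.0, 105.0])
    print(frac_diff(y, alpha=1.0, window=3))  # matches ordinary log returns
    print(frac_diff(y, alpha=0.0, window=3))  # matches the log prices themselves
    print(frac_diff(y, alpha=0.5, window=3))  # a blend of the two
```

With α = 1 the weights are (1, -1, 0, ...), reproducing formula (*); with α = 0 only w_0 is non-zero, so the series is returned unchanged. Intermediate values of α keep slowly decaying weights on past prices, which is how part of the price level is retained.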
  • 23. #2: RISK MANAGEMENT • Neural networks forget historical black swan events, and most AI cannot provide good predictions for situations it has not encountered before • Solution 1: LSTM / neural network with attention / memory network • Solution 2: Knowledge engineering & generalisation • Solution 3: A good risk management system
  • 24. Reference • LSTM: Each repeating module exports information to the next module, which decides whether to keep it or discard it when passing it on. • Neural Network With Attention: The attention mechanism distributes attention over data at several time steps, so that information is not “diluted” after many time steps. • Memory Network: Controller, reader and writer are trained together to store important events in the memory. Photo source: skymind.ai, DeepMind
  • 25. Challenges and Opportunities Short-term focus, Problem: • Not enough structured data • Too much useless data (noise) Short-term focus, Solution: • Turn unstructured data into structured data (using AI) • Extract value from information (domain expert) Long-term focus, Problem: • Making decisions based on incomplete information and under uncertainty • General AI algorithms cannot deal with causation • The most popular / commercialized / scalable AI is black-box; white-box approaches are lacking Long-term focus, Solution: • Deal with causation & uncertainty (using AI) • White-box knowledge engineering (domain expert)
  • 26. Thank you! email: cylee@live.hk linkedin: https://www.linkedin.com/in/roger-lee-cfa-85710048/