Future stock performance presentation

STOCK ANALYSIS
- Financial Indicators of a stock price -

GOAL
Can a novice investor buy a stock using machine learning algorithms?

AGENDA
01 Basics
• What is a stock?
• What does it mean to own stock in a company?
02 Dataset
• What sample?
• What features?
• What size?
03 Preprocessing
• Zero dominance
• Null dominance
• Outlier detection
04 Modeling
• Feature selection
• Modeling
• Comparison
05 Conclusion
• Important variables
• Future work

BASICS
• What is a stock?
• What does it mean to own stock in a company?

DATASET
US-BASED STOCKS (from Kaggle)
Sample size
• 4392 companies
• 225 financial indicators
• Year 2018
Target variable
• Binary class
• 1 class
• 0 class
Each company is part of a sector that classifies it in a macro-area.
The financial indicators are those commonly found in the 10-K filings that each publicly traded company releases yearly.

DATASET – TARGET
Target Variable
• Binary class
• 1 indicates should Buy (70%)
• 0 indicates should NOT Buy (30%)
• Year 2019
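With a 70/30 class split, accuracy needs a reference point. The sketch below computes the accuracy of a trivial model that always predicts the majority class. The label list is synthetic and only mirrors the 70/30 split on the slide; it is not the Kaggle data.

```python
from collections import Counter


def majority_baseline_accuracy(labels):
    """Accuracy obtained by always predicting the most frequent class."""
    counts = Counter(labels)
    return max(counts.values()) / len(labels)


# Synthetic labels reproducing the 70% / 30% split (assumption for illustration).
labels = [1] * 70 + [0] * 30
print(majority_baseline_accuracy(labels))
```

Any model accuracy can then be read against this baseline rather than against 0.50.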

Data Preprocessing
NULL DOMINANCE
• Removed feature-wise
• 13 features have more than 40% missing values
ZERO DOMINANCE
• Removed feature-wise
• 10 features have more than 65% zero values
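A minimal sketch of the two dominance filters, assuming the data is held as a mapping from feature name to a list of values. The thresholds come from the slide; the feature names, the toy values, and the choice to measure the zero share against all rows are assumptions.

```python
import math

NULL_THRESHOLD = 0.40  # from the slide: drop features with more than 40% missing values
ZERO_THRESHOLD = 0.65  # from the slide: drop features with more than 65% zero values


def is_missing(value):
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def dominated_features(columns):
    """Return names of features dominated by missing values or by zeros.

    `columns` maps a feature name to its non-empty list of values (one per company).
    """
    dropped = []
    for name, values in columns.items():
        n = len(values)
        null_share = sum(is_missing(v) for v in values) / n
        zero_share = sum(1 for v in values if not is_missing(v) and v == 0) / n
        if null_share > NULL_THRESHOLD or zero_share > ZERO_THRESHOLD:
            dropped.append(name)
    return dropped


# Toy data with hypothetical feature names (not taken from the Kaggle file).
toy = {
    "revenue": [10.0, 12.0, None, 9.0, 11.0],     # 20% missing -> kept
    "rd_expenses": [None, None, None, 5.0, 1.0],  # 60% missing -> dropped
    "dividends": [0.0, 0.0, 0.0, 0.0, 2.0],       # 80% zeros   -> dropped
}
print(dominated_features(toy))
```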

IMPUTING NULLS
• Remaining missing values are replaced with the mean of that feature within the company's sector
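Sector-wise mean imputation can be sketched as follows. This is an illustration under assumptions: rows are plain dictionaries, missing values are None, and the company names, sectors, and the `eps` field are hypothetical.

```python
from collections import defaultdict


def impute_by_sector_mean(rows, feature, sector_key="sector"):
    """Fill missing `feature` values with the mean of that feature within the same sector.

    `rows` is a list of dicts; a missing value is represented as None.
    Returns a new list and leaves the input untouched.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in rows:
        value = row[feature]
        if value is not None:
            totals[row[sector_key]] += value
            counts[row[sector_key]] += 1

    imputed = []
    for row in rows:
        new_row = dict(row)
        sector = row[sector_key]
        # A sector with no observed values has no mean, so its gaps stay None.
        if new_row[feature] is None and counts[sector] > 0:
            new_row[feature] = totals[sector] / counts[sector]
        imputed.append(new_row)
    return imputed


# Hypothetical companies and values, for illustration only.
companies = [
    {"name": "A", "sector": "Energy", "eps": 2.0},
    {"name": "B", "sector": "Energy", "eps": 4.0},
    {"name": "C", "sector": "Energy", "eps": None},
    {"name": "D", "sector": "Technology", "eps": 10.0},
    {"name": "E", "sector": "Technology", "eps": None},
]
filled = impute_by_sector_mean(companies, "eps")
print([row["eps"] for row in filled])
```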
INORGANIC GROWTH
• Price variation above 300%
• 19 companies
Data Preprocessing Cont’d:
OUTLIER DETECTION
• Z-scores
• IQR
• Coerce
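The sketch below illustrates the three items with the standard library. Two points are assumptions: "Coerce" is read here as clipping values back onto the outlier bounds, and the 1.5 multiplier for the IQR fences is the common convention, not a value stated on the slide. The sample values are hypothetical.

```python
import statistics


def iqr_bounds(values, k=1.5):
    """Return (low, high) fences at k * IQR beyond the first and third quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def z_scores(values):
    """Return the z-score of each value using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]


def clip(values, low, high):
    """Pull values outside [low, high] back onto the nearest bound."""
    return [min(max(v, low), high) for v in values]


# Hypothetical feature values with one extreme entry.
values = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 100.0]
low, high = iqr_bounds(values)
print(clip(values, low, high))
print([round(z, 2) for z in z_scores(values)])
```

Clipping keeps the company in the sample while limiting the influence of the extreme value; dropping the row is the alternative when the value is believed to be an error.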

200+ financial indicators at RANDOM!

Feature Selection methods
• Univariate Feature Selection - Chi-squared
• Wrapper Select via model
• Mutual Info Classification
• Stepwise Recursive Backwards Feature removal
• L2 Regularization
• Exhaustive search
• Genetic search
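To make one of these methods concrete, here is a simplified sketch of mutual-information scoring for discrete features. It is not the implementation used for the results in this deck: continuous financial indicators would first need binning, and library implementations such as scikit-learn's `mutual_info_classif` use a nearest-neighbor estimator instead. The feature names and values are hypothetical.

```python
import math
from collections import Counter


def mutual_information(feature, target):
    """Mutual information (in nats) between two discrete sequences of equal length."""
    n = len(feature)
    joint = Counter(zip(feature, target))
    px = Counter(feature)
    py = Counter(target)
    mi = 0.0
    for (x, y), count in joint.items():
        p_xy = count / n
        # p(x, y) / (p(x) * p(y)) simplifies to count * n / (px[x] * py[y]).
        mi += p_xy * math.log(count * n / (px[x] * py[y]))
    return mi


# Hypothetical toy features scored against a binary Buy / NOT Buy target.
target = [1, 1, 0, 0]
features = {
    "informative": ["high", "high", "low", "low"],  # mirrors the target exactly
    "noise": ["a", "b", "a", "b"],                  # independent of the target
}
scores = {name: mutual_information(col, target) for name, col in features.items()}
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked, scores)
```

Keeping the top-k features by score gives a filter-style selection that does not depend on any particular classifier.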
Modeling methods
• Decision Tree Classifier
• Random Forest Classifier
• Gradient Descent Classifier
• AdaBoost Classifier
• Neural Network Classifier
• Logistic Regression
• Support Vector Machine

Modeling – Decision Tree, Random Forest

Decision Tree
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.64 (+/- 0.06)  0.59 (+/- 0.03)  10
Stepwise Backward Removal    0.62 (+/- 0.07)  0.57 (+/- 0.03)  5
Mutual Info Classification   0.63 (+/- 0.05)  0.57 (+/- 0.04)  10
Wrapper Select               0.65 (+/- 0.04)  0.60 (+/- 0.04)  68

Random Forest
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.71 (+/- 0.06)  0.77 (+/- 0.02)  151
Stepwise Backward Removal    0.69 (+/- 0.06)  0.70 (+/- 0.03)  5
Mutual Info Classification   0.72 (+/- 0.06)  0.77 (+/- 0.01)  10
Wrapper Select               0.72 (+/- 0.06)  0.77 (+/- 0.01)  62
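The sketch below shows how Accuracy and AUC figures of this form can be computed and summarized. It assumes the scores come from k-fold cross-validation and that the "+/-" value is two standard deviations across folds; the deck does not state either. All numbers in the example are hypothetical.

```python
import statistics


def accuracy(y_true, y_pred):
    """Share of predictions that match the true class."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def roc_auc(y_true, scores):
    """AUC as the probability that a random class-1 example outscores a random class-0 one.

    Ties count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def summarize(fold_scores):
    """Format per-fold scores as 'mean (+/- 2 * std)'; the factor 2 is an assumption."""
    mean = statistics.fmean(fold_scores)
    spread = 2 * statistics.pstdev(fold_scores)
    return f"{mean:.2f} (+/- {spread:.2f})"


# Hypothetical predictions for five companies.
y_true = [1, 1, 1, 0, 0]
y_score = [0.9, 0.8, 0.4, 0.5, 0.2]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]
print(accuracy(y_true, y_pred), roc_auc(y_true, y_score))

# Hypothetical per-fold accuracies.
print(summarize([0.70, 0.74, 0.72]))
```

Accuracy depends on the 0.5 cut-off chosen here, while AUC only depends on how the scores rank the companies, which is why the two columns can move independently.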

Decision Boundaries – Random Forest vs Decision Tree
• A decision boundary plot helps build intuition for how a model works.
• The boundaries separate the data points (companies) into regions signifying the different classes (1, 0).
[Figure: decision boundary plots with panels "Actual 0 Class" and "Actual 1 Class"]

Modeling – Boosting Methods
Max Depth vs CV Performance (in percentages)
Depth of a tree   3    5    10   15   20
Accuracy          71   71   70   68   66
AUC               76   76   72   70   69

Gradient Boost
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.72 (+/- 0.05)  0.77 (+/- 0.03)  151
Stepwise Backwards Removal   0.71 (+/- 0.05)  0.75 (+/- 0.03)  5
Mutual Info Classification   0.72 (+/- 0.02)  0.73 (+/- 0.02)  10
Wrapper Select               0.72 (+/- 0.04)  0.78 (+/- 0.03)  51

AdaBoost
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.72 (+/- 0.05)  0.76 (+/- 0.02)  151
Stepwise Backwards Removal   0.71 (+/- 0.06)  0.73 (+/- 0.04)  5
Mutual Info Classification   0.72 (+/- 0.03)  0.73 (+/- 0.02)  10
Wrapper Select               0.72 (+/- 0.05)  0.77 (+/- 0.02)  51

Decision Boundaries – Gradient Boost vs AdaBoost
• The blue region is classified as the "NOT Buy" class, while the orange region is classified as the "Buy" class.
[Figure: decision boundary plots with panels "Actual 0 Class" and "Actual 1 Class"]

Modeling – Neural Network
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.68 (+/- 0.05)  0.64 (+/- 0.05)  151
Stepwise Backwards Removal   0.69 (+/- 0.01)  0.52 (+/- 0.07)  5
Mutual Info Classification   0.71 (+/- 0.02)  0.69 (+/- 0.03)  10
Wrapper Select               0.70 (+/- 0.03)  0.59 (+/- 0.07)  51

Modeling – SVM
Selection method             Accuracy         AUC              # Features selected
Low Variance filter < 20%    0.72 (+/- 0.02)  0.72 (+/- 0.03)  88
Stepwise Backwards Removal   0.69 (+/- 0.00)  0.67 (+/- 0.03)  5
Mutual Info Classification   0.70 (+/- 0.00)  0.60 (+/- 0.03)  5
Wrapper Select               0.71 (+/- 0.02)  0.73 (+/- 0.03)  62

Important Features
• EPS Diluted: A calculation used to gauge the quality of a company's earnings per share (EPS).
• Weighted Average Shares: If a company has been buying back shares, the change in this number will be negative.
• Shareholders Equity: The amount of money that would be returned to shareholders if all of the assets were liquidated and all debts were paid off.
• Net Income per share: The portion of a company's profit that is allocated to each outstanding share of its common stock.
• Earnings Yield: Earnings per share divided by the share price.
• Total non-current assets: Long-term assets that have a useful life of more than one year.
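Two of these definitions are simple ratios and can be written out directly. The sketch below follows the wording on the slide; the company figures are hypothetical, and real filings may use a weighted average share count in the denominator.

```python
def net_income_per_share(net_income, shares_outstanding):
    """Profit allocated to each outstanding share of common stock."""
    return net_income / shares_outstanding


def earnings_yield(earnings_per_share, share_price):
    """Earnings per share divided by the share price."""
    return earnings_per_share / share_price


# Hypothetical company: 50M net income, 10M shares, share price of 100.
eps = net_income_per_share(50_000_000, 10_000_000)
print(eps, earnings_yield(eps, 100.0))
```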

Future work
• Compare model results to the 2020 financial year.
• Compute gains, losses and ROI for each stock.
• Plot decision boundaries using 2 components.
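The gain, loss and ROI computation mentioned above could look like the following sketch. It assumes one buy price and one sell price per stock and ignores dividends and fees; the tickers and prices are hypothetical.

```python
def roi(buy_price, sell_price):
    """Return on investment as a fraction of the amount invested."""
    return (sell_price - buy_price) / buy_price


def summarize_positions(positions):
    """Map each ticker to its per-share gain (negative for a loss) and its ROI.

    `positions` maps a ticker to a (buy_price, sell_price) pair.
    """
    return {
        ticker: {"gain": sell - buy, "roi": roi(buy, sell)}
        for ticker, (buy, sell) in positions.items()
    }


# Hypothetical tickers and prices.
positions = {"AAA": (50.0, 60.0), "BBB": (40.0, 30.0)}
for ticker, result in summarize_positions(positions).items():
    print(ticker, result)
```

Restricting `positions` to the stocks a model labeled as class 1 would show what following its Buy signals would have returned.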
