Stock Market Forecasting using Decision Tree and
GARCH Method
¹R. Gnanavel, ²P. Anjana, ³K. S. Nappinnai, ⁴N. Pavithra Sahari
¹Assistant Professor, Department of Computer Science and Engineering, Rajalakshmi Institute of Technology, Chennai, India.
²,³,⁴UG Students, Department of Computer Science and Engineering, Rajalakshmi Institute of Technology, Chennai, India.
¹rgvelu22@gmail.com, ²anjuprasan@gmail.com, ³nappinnaikalaivanan@gmail.com, ⁴pavithrasahari@gmail.com
Abstract—Stock market trading is an important economic
activity that helps people earn money. If shares are traded
after careful observation of share price fluctuations, a stable
financial income can be obtained through the share market.
Stock market prediction involves predicting the future value
of a company's or organization's stock by processing the
available information using data mining. Future trading of the
shares is then based on the predicted values. The problem with
the stock market is that it can be very erratic at times, which
might lead to a crash. For this reason, the proposed system
presents a modified GARCH model based on the decision tree
algorithm, consisting of two separate sections: the prediction
section and the optimization section. The prediction section
uses the Generalized AutoRegressive Conditional
Heteroskedasticity (GARCH) algorithm to forecast the stock
price. The decision tree algorithm helps to reduce the number
of false positives and false negatives in the system by
allocating weights to a set of distinct parameters in order to
detect and reduce the error rate. The model improves the
accuracy of the system by 13% when compared with the existing
model.
Keywords—Stock market prediction, data mining, GARCH,
decision tree
I. INTRODUCTION
Data mining has become an attractive field for research as
well as for scientific and technological developments. It
involves the computational process of discovering patterns in
large data sets. Data mining gathers these large data sets in
order to analyze, process and manipulate them to obtain more
useful results. It draws on interdisciplinary activities such
as machine learning, database systems, statistics, artificial
intelligence, and the storage and retrieval of large amounts of
data in data warehouses, which is also known as big data
processing. The applications of data mining range from market
analysis, engineering manufacturing, customer relationship
management, intrusion detection, fraud detection, lie detection
and financial banking to corporate surveillance and
bioinformatics. In this paper we mainly concentrate on the
statistical part of data mining by analyzing the index values
of the shares in the stock market.
Forecasting the stock market is a tedious process. Many
algorithms are readily available that predict stock prices. In
this work, the GARCH method is used to forecast the share
values. The GARCH model is one of the well-known economic
models used in current stock forecasting to predict the prices
and rates of a particular financial instrument. It is an
approach for estimating time series with time-varying
volatility and volatility clusters in the financial market.
Several forms of GARCH can be implemented, such as EGARCH,
GARCH-M, NGARCH, IGARCH, QGARCH, GJR-GARCH, COGARCH, TGARCH
and FGARCH. All the GARCH models are derived from the
AutoRegressive Conditional Heteroskedasticity (ARCH) model.
The ARCH model deals with computing the gradual increase in
the variance term. This value can later be used to estimate the
mean level of the system. The estimated mean level helps when
transforming the variable. The GARCH model uses the past
squared observations of the ARCH model along with the past
variances for its prediction. A number of tests are available
to check the validity of the ARCH and GARCH models. Some of
these are the Lagrange multiplier test, the White test and
other null hypothesis tests. The GARCH model can be applied in
fields that involve trading, dealing, hedging and investing.
The decision tree algorithm is used as a predictive model
built from the recorded observations. Decision trees are
commonly used in decision analysis to identify the strategy
most likely to reach a goal, but they are also a popular tool
in machine learning. The algorithm in this paper uses the
decision tree method, which is mostly used to make decisions
based on outcomes. In data mining, a decision tree describes
the data rather than the decisions. The main purpose of the
decision tree here is to optimize the preexisting algorithm.
Many parameters can be considered to enhance the way the
results are calculated. In general, the decision tree
technique is used to improve the performance of the system,
reduce the existing noise, enhance the safety protocols,
increase the accuracy, reduce the corruption of data during
processing and improve the overall data quality. It is
commonly used in learning algorithms, operations research,
decision analysis and machine learning. The decision tree is a
flowchart-like structure that has at least three different
types of nodes. The decision node makes the decision on how
the outcome will occur. This node is illustrated with a square
shape. The chance node does not process a decision parameter;
instead, it states the probability of occurrence of a certain
event or parameter. The chance node is illustrated with a
circle. The end node represents a result or states the final
outcome after processing the decision tree. It cannot contain
any child nodes. The end node is illustrated with a triangle.
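The three node types described above can be illustrated with a minimal Python sketch. The class names, the example payoffs and the "roll back by expected value" rule are assumptions made for illustration; the paper itself does not specify how its tree is evaluated.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class EndNode:
    """Leaf (drawn as a triangle): holds a final outcome, has no children."""
    payoff: float


@dataclass
class ChanceNode:
    """Drawn as a circle: (probability, child) pairs for uncertain events."""
    branches: List[Tuple[float, "Node"]]


@dataclass
class DecisionNode:
    """Drawn as a square: (label, child) pairs the decision maker can choose."""
    options: List[Tuple[str, "Node"]]


Node = Union[EndNode, ChanceNode, DecisionNode]


def expected_value(node: Node) -> float:
    """Roll the tree back from the leaves to the given node."""
    if isinstance(node, EndNode):
        return node.payoff
    if isinstance(node, ChanceNode):
        # Probability-weighted average over the possible events.
        return sum(p * expected_value(child) for p, child in node.branches)
    # Decision node: take the option with the highest expected value.
    return max(expected_value(child) for _, child in node.options)


def best_option(node: DecisionNode) -> str:
    """Label of the option with the highest expected value."""
    return max(node.options, key=lambda opt: expected_value(opt[1]))[0]


# Hypothetical example: buy (uncertain outcome) versus hold (fixed outcome).
tree = DecisionNode(options=[
    ("buy", ChanceNode(branches=[(0.5, EndNode(20.0)), (0.5, EndNode(-10.0))])),
    ("hold", EndNode(2.0)),
])
```

Here `best_option(tree)` compares the expected value of the chance node under "buy" with the fixed payoff under "hold".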
II. LITERATURE REVIEW
The AutoRegressive Integrated Moving Average (ARIMA) time
series forecasting model is used in [1] to predict the
occurrence of the stink bug T. papillosa, which is a major
pest of the lychee plant. This prediction helped in providing
guidelines and time to plan prevention and control measures.
The daily load patterns were predicted in [2] using the ARIMA
model. That paper served as an initial step in investigating
the impact of weather change and climate extremes on
electricity demand patterns. For short time scales it uses the
GARCH model to evaluate the maximum load demand level. The
daily share volume series from the Dhaka Stock Exchange was
predicted in [3] using the ML-ARCH method. That paper also
used the ARIMA and EGARCH models, which resulted in low
forecasting error, low mean absolute percentage error, and low
variance and covariance proportions. Both ARIMA and GARCH are
tested on gram prices in the Delhi market in [4]. That paper
uses the Augmented Dickey-Fuller (ADF) test to check the
stationarity of the series and the ARCH-LM test to check the
volatility. The GARCH model was found to be more effective in
analyzing the volatility, with a lower error rate. It was
concluded that the GARCH model was a better fit for predicting
gram prices in the market. Univariate models such as AR(1),
ARIMA, ARIMA-GARCH and ARIMA-EGARCH were used in [5] to
predict short-term interest rates such as the paper rate, the
yield on the treasury bill, the MIBOR rate and the call money
rate. Finally, it was concluded that the ARIMA-GARCH model was
the most appropriate for forecasting the interest rates.
The model discussed in [9] examines three equities listed
on the Ghana Stock Exchange (GSE) by applying a volatility
model to the financial returns. The analysis was performed on
data from 25th June 2007 to 31st October 2014. The KPSS test
was performed on all three equities to check for stationarity.
The study used the GARCH(p,q) model for the volatility. The
residual series were fitted using GARCH(1,1), GARCH(1,2),
GARCH(2,1) and GARCH(2,2). The results showed that the
volatility present in the three equities was not always
persistent. The models were compared using the AIC, and the
GARCH(1,1) model showed the best performance. For future work,
the study recommends comparing other variants of the GARCH
model.
III. PROPOSED SYSTEM
The proposed system aims at forecasting stock market rates
using the Generalized AutoRegressive Conditional
Heteroskedasticity (GARCH) model. The model analyzes the stock
price values given to it as input in order to estimate the
error variance of the time series. It can be used to estimate
the increasing variance over a short as well as a long period
of time. The gradually increasing mean level of the system can
be evaluated using the proportionally increasing variance of
the system. The GARCH model is used to predict future Sensex
values. Like any algorithm, GARCH is not perfect, and its
predictions are not always correct. To reduce the error rate in
the results produced by the GARCH algorithm, a decision tree
is introduced in the proposed system. The decision tree
enhances the safety and performance of the program by reducing
the noise present in it. The main parameters used in the
decision tree to classify the quality of the data are Earnings
Per Share (EPS), sales revenue, daily volume of shares traded,
new orders acquired, etc. The parameters are used to reduce
the number of false positives and false negatives in the
system. These parameters are assigned specific weights based
on their priority with regard to the system. Fig 1 shows the
block diagram of the proposed system.
Fig 1: Block Diagram of the proposed system
PREDICTION MODULE
Time series were initially characterized using the
AutoRegressive Conditional Heteroskedasticity (ARCH) model.
The main concept of this model is that the error terms are
considered to have a characteristic variance. The model was
designed for volatile time series in order to solve financial
and econometric problems. The increasing variance calculated
in this model is used to estimate its proportional mean level
by means of a transformed variable. It takes into account the
amount of increase in the stock price and the investments made
per time period. The ARCH model can be described through the
following equation:
$$\mathrm{Var}(y_t \mid y_{t-1}, \ldots, y_{t-m}) = \sigma_t^2 = \alpha_0 + \alpha_1 y_{t-1}^2 + \cdots + \alpha_m y_{t-m}^2$$
where t is the time at which the variance is calculated, based
on the observations at the previous m times. The GARCH model,
on the other hand, uses both the past variances and the past
squared observations at time t. Estimation of the GARCH(p,q)
model is performed in three steps. The first subscript p
refers to the order of the y term, whereas the second
subscript q refers to the order of the σ term.
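As a minimal sketch, the ARCH(m) conditional variance can be evaluated directly from the past squared observations. The second function is an assumption that goes beyond the text: it uses the standard GARCH(1,1) recursion σ_t² = α_0 + α_1 y_{t-1}² + β_1 σ_{t-1}², which the paper does not write out. The function names, coefficient values and starting variance are illustrative only.

```python
from typing import List, Sequence


def arch_variance(y: Sequence[float], alpha: Sequence[float]) -> List[float]:
    """Conditional variance of an ARCH(m) process, with m = len(alpha) - 1.

    alpha[0] is the constant term; alpha[j] multiplies y_{t-j}^2.
    Returns sigma_t^2 for every t that has m previous observations.
    """
    m = len(alpha) - 1
    variances = []
    for t in range(m, len(y)):
        var_t = alpha[0] + sum(alpha[j] * y[t - j] ** 2 for j in range(1, m + 1))
        variances.append(var_t)
    return variances


def garch11_variance(y: Sequence[float], alpha0: float, alpha1: float,
                     beta1: float, sigma2_init: float) -> List[float]:
    """Standard GARCH(1,1) recursion (assumed form, not given in the paper).

    sigma2_init is the assumed variance at the first time step.
    """
    sigma2 = [sigma2_init]
    for t in range(1, len(y)):
        sigma2.append(alpha0 + alpha1 * y[t - 1] ** 2 + beta1 * sigma2[t - 1])
    return sigma2


# Illustrative inputs only; these are not data or estimates from the paper.
observations = [1.0, -2.0, 0.5, 3.0]
arch2 = arch_variance(observations, alpha=[0.5, 0.25, 0.25])
garch = garch11_variance(observations, alpha0=0.5, alpha1=0.25,
                         beta1=0.5, sigma2_init=1.0)
```

In practice the coefficients would be estimated from data, typically by maximum likelihood, rather than fixed by hand as they are here.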
The first step involves figuring out the best fitting
AutoRegressive model for the given inputs.
$$y_t = a_0 + a_1 y_{t-1} + \cdots + a_q y_{t-q} + \epsilon_t = a_0 + \sum_{i=1}^{q} a_i y_{t-i} + \epsilon_t$$
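The first step can be sketched for the simplest case, q = 1, where the least-squares fit has a closed form. Restricting the model to AR(1), the function name `fit_ar1` and the example series are assumptions for illustration; a general AR(q) fit and the choice of the best q are not shown.

```python
from typing import List, Sequence, Tuple


def fit_ar1(y: Sequence[float]) -> Tuple[float, float, List[float]]:
    """Ordinary least squares fit of y_t = a0 + a1 * y_{t-1} + eps_t.

    Returns (a0, a1, residuals), where residuals are the fitted eps_t.
    """
    lagged = list(y[:-1])   # y_{t-1}
    current = list(y[1:])   # y_t
    n = len(lagged)
    if n < 2:
        raise ValueError("need at least three observations")
    mean_lag = sum(lagged) / n
    mean_cur = sum(current) / n
    sxx = sum((x - mean_lag) ** 2 for x in lagged)
    if sxx == 0:
        raise ValueError("lagged series is constant; slope is undefined")
    sxy = sum((x - mean_lag) * (z - mean_cur) for x, z in zip(lagged, current))
    a1 = sxy / sxx
    a0 = mean_cur - a1 * mean_lag
    residuals = [z - (a0 + a1 * x) for x, z in zip(lagged, current)]
    return a0, a1, residuals


# Illustrative series generated exactly by y_t = 1 + 0.5 * y_{t-1}.
series = [0.0, 1.0, 1.5, 1.75, 1.875]
a0, a1, residuals = fit_ar1(series)
```

The residuals returned here are the ε terms whose squares are examined in the next step.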
Next, the autocorrelations ρ(i) of the squared residuals are
computed and plotted using the formula:
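$$\rho(i) = \frac{\sum_{t=i+1}^{T} (\hat{\epsilon}_t^2 - \hat{\sigma}_t^2)(\hat{\epsilon}_{t-i}^2 - \hat{\sigma}_{t-i}^2)}{\sum_{t=1}^{T} (\hat{\epsilon}_t^2 - \hat{\sigma}_t^2)^2}$$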
The GARCH errors are then identified using the standard
deviation of the autocorrelations, which for large samples is
one divided by the square root of T. Individual autocorrelation
values that are greater than this standard deviation indicate
GARCH errors.
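The autocorrelation check can be sketched as follows. Two simplifying assumptions are made that the text does not state: the variance estimate is taken as constant (the mean of the squared residuals), and the comparison with 1/√T is made on the absolute value of each autocorrelation. The function names and the example residuals are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def squared_residual_autocorr(eps: Sequence[float], max_lag: int) -> Dict[int, float]:
    """Autocorrelation of squared residuals for lags 1..max_lag.

    Assumption: sigma_t^2 is replaced by one constant estimate, the mean
    of the squared residuals.
    """
    squared = [e * e for e in eps]
    T = len(squared)
    sigma2 = sum(squared) / T
    dev = [s - sigma2 for s in squared]
    denom = sum(d * d for d in dev)
    if denom == 0:
        raise ValueError("squared residuals are constant")
    rho = {}
    for i in range(1, max_lag + 1):
        # 0-based t = i..T-1 corresponds to t = i+1..T in the formula.
        rho[i] = sum(dev[t] * dev[t - i] for t in range(i, T)) / denom
    return rho


def flag_garch_lags(eps: Sequence[float], max_lag: int) -> List[int]:
    """Lags whose autocorrelation magnitude exceeds 1 / sqrt(T)."""
    threshold = 1.0 / math.sqrt(len(eps))
    rho = squared_residual_autocorr(eps, max_lag)
    return [lag for lag, value in rho.items() if abs(value) > threshold]


# Illustrative residuals only; not taken from the paper.
example_residuals = [1.0, 2.0, 1.0, 2.0]
example_rho = squared_residual_autocorr(example_residuals, max_lag=2)
flagged = flag_garch_lags(example_residuals, max_lag=2)
```

With only four residuals the 1/√T bound is very loose; the large-sample approximation is meaningful only for much longer series.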
OPTIMIZATION MODULE
A decision tree uses a tree or graph structure to analyze the
data and take a decision based on the analyzed values. The
index values predicted by the GARCH model are given as input
to the optimization module, which comprises the decision tree.
The price of stock market shares can vary due to many
ambiguous factors such as the compounded annual growth rate,
the earnings per share, the net profit earned by the company,
the average volume of shares traded on a daily basis, the
market capital invested in the organization, the amount of new
business or orders acquired by the firm, the price-to-book
ratio, the company's debt position, the annual dividend
percentage paid, the net worth, and the capital and other
expenditures. Other factors include the global scenario of the
field in which the company operates, the news published about
the company and the revenue that the company is making. In the
proposed system, the parameters considered for the decision
tree are the Earnings Per Share (EPS), sales revenue, daily
volume of shares traded and the amount of new orders acquired
by the organization. These factors help to analyze the
predicted output of the GARCH module and to refine it by
improving the accuracy of the results.
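One way such a refinement step might look is sketched below as a small hand-written decision tree over the four parameters. Everything in it is hypothetical: the paper gives neither the tree structure, the weights nor the thresholds, so the split order (standing in for parameter priority), the field names and the fallback to the last observed value are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Fundamentals:
    """Hypothetical inputs for one company and one forecast period."""
    eps_change: float         # fractional change in earnings per share
    revenue_change: float     # fractional change in sales revenue
    volume_ratio: float       # daily volume divided by its recent average
    new_orders_change: float  # fractional change in new orders acquired


def forecast_is_supported(f: Fundamentals, predicted_up: bool) -> bool:
    """Hand-written decision tree; the split order is an assumed priority."""
    sign = 1.0 if predicted_up else -1.0
    if f.eps_change * sign > 0:               # root split: EPS agrees
        if f.revenue_change * sign > 0:       # revenue agrees as well
            return True
        return f.volume_ratio >= 1.0          # else require active trading
    if f.new_orders_change * sign > 0:        # EPS disagrees, orders agree
        return f.revenue_change * sign > 0 and f.volume_ratio >= 1.0
    return False


def refine_forecast(last_value: float, garch_forecast: float,
                    f: Fundamentals) -> float:
    """Keep the GARCH forecast if the tree supports its direction.

    Otherwise fall back to the last observed value (an assumed policy).
    """
    predicted_up = garch_forecast > last_value
    if forecast_is_supported(f, predicted_up):
        return garch_forecast
    return last_value


# Illustrative inputs only.
strong = Fundamentals(eps_change=0.1, revenue_change=0.05,
                      volume_ratio=1.2, new_orders_change=0.0)
weak = Fundamentals(eps_change=-0.1, revenue_change=-0.05,
                    volume_ratio=0.8, new_orders_change=-0.2)
```

A tree learned from labeled historical data would replace the fixed splits used here.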
IV. FUTURE WORKS
To improve on this idea further, a different economic model
can be chosen as the basis. The proposed system uses a fixed
set of parameters to judge the results presented to it, so a
different set of parameters can be used in the decision tree
to check whether a greater reduction of the error rate is
possible for the given input. Mechanisms other than a decision
tree can also be implemented to improve performance.
V. CONCLUSION
The proposed system was used to project the future values of
the shares of a particular company in the stock market, taking
as input the past values of the shares of the same company.
The GARCH algorithm was used to forecast the values by taking
into account the volatility and the variance present in the
market. The decision tree algorithm was implemented to reduce
the error rate, thereby decreasing the number of false
positives and false negatives in the system. The results show
that the proposed algorithm improves the accuracy of the
system by 13%.
REFERENCES
[1] Boopathi, T., et al. "Development of temporal modeling for forecasting
and prediction of the incidence of lychee, Tessaratoma papillosa
(Hemiptera: Tessaratomidae), using time-series (ARIMA)
analysis." Journal of Insect Science 15.1 (2015): 55.
[2] Hor, Ching-Lai, Simon J. Watson, and Shanti Majithia. "Daily load
forecasting and maximum demand estimation using ARIMA and
GARCH." Probabilistic Methods Applied to Power Systems, 2006.
PMAPS 2006. International Conference on. IEEE, 2006.
[3] Hossain, Ahammad, Md Kamruzzaman, and Md Ayub Ali. "ARIMA
With GARCH Family Modeling and Projection on Share Volume of
DSE." Economics 3.7-8 (2015): 171-184.
[4] Bhardwaj, S. P., et al. "An empirical investigation of ARIMA and
GARCH models in Agricultural Price Forecasting." Economic
Affairs 59.3 (2014): 415-428.
[5] Radha, S., and M. Thenmozhi. "Forecasting short term interest rates
using ARMA, ARMA-GARCH and ARMA-EGARCH models." Indian
Institute of Capital Markets 9th Capital Markets Conference Paper.
2006.
[6] Adegboyega, Abiola. "A Dynamic Bandwidth Prediction & Provisioning
Scheme in Cloud Networks."
[7] Roy, Partha. "A novel fuzzy document-based information retrieval
scheme (FDIRS)." Applied Informatics. Vol. 3. No. 1. Springer Berlin
Heidelberg, 2016.
[8] Das, Dhruba, et al. "Construction of normal fuzzy numbers: case studies
with stock exchange data." Annals of Fuzzy Mathematics and
Informatics 5.3 (2013): 505-514.
[9] Omari-Sasu, Akoto Yaw, et al. "Modeling Stock Market Volatility
Using GARCH Approach on the Ghana Stock Exchange." International
Journal of Business and Management 10.11 (2015): 169.
[10] Amirmazlaghani, Maryam, Hamidreza Amindavar, and Alireza
Moghaddamjoo. "Speckle suppression in SAR images using the 2-D
GARCH model." Image Processing, IEEE Transactions on 18.2 (2009):
250-259.
[11] Ince, Huseyin, and Theodore B. Trafalis. "Kernel principal component
analysis and support vector machines for stock price prediction." IIE
Transactions 39.6 (2007): 629-637.
[12] Turiel, Antonio, and Conrad J. Pérez-Vicente. "Multifractal geometry in
stock market time series." Physica A: Statistical Mechanics and its
Applications 322 (2003): 629-649.
[13] Li, Xiaodong, et al. "News impact on stock price return via sentiment
analysis." Knowledge-Based Systems 69 (2014): 14-23.
[14] Seker, Sadi Evren, et al. "Ensemble classification over stock market time
series and economy news." Intelligence and Security Informatics (ISI),
2013 IEEE International Conference on. IEEE, 2013.