
International Journal of Computer Engineering and Technology (IJCET), ISSN 0976 – 6367 (Print), ISSN 0976 – 6375 (Online), Volume 1, Number 1, May - June (2010), pp. 82-91, © IAEME, http://www.iaeme.com/ijcet.html

REGRESSION, THEIL'S AND MLP FORECASTING MODELS OF STOCK INDEX

K. V. Sujatha, Research Scholar, Sathyabama University, Chennai. E-mail: sujathacenthil@gmail.com
S. Meenakshi Sundaram, Department of Mathematics, Sathyabama University, Chennai. E-mail: sundarambhu@rediffmail.com

ABSTRACT

Financial forecasting, and specifically stock market prediction, has lately become one of the hottest fields of research because of its commercial applications, the high stakes involved, and the attractive benefits it has to offer. Financial time series are among the 'noisiest' and most 'non-stationary' signals encountered and are hence very difficult to forecast. In this paper we attempt to forecast the daily prices of a stock index using Regression, Theil's and MLP models, and the predictive abilities of these models are compared using standard error measures.

Keywords: Forecasting, Regression, Principal Component, Perceptron, MAPE.

1. INTRODUCTION

Trading in stock market indices has gained unprecedented popularity in major financial markets around the world. However, the prediction of a stock price index is a very difficult problem because of the complexity of stock market data, which is affected by many factors including political events, general economic conditions, and investors' expectations. Modeling the behavior of a market index is a challenging task for several reasons. There are two major approaches, fundamental and technical, to analyzing stock price prediction [1].
Because the internal rules governing a nonlinear system such as the stock market are not well understood, it is hard to know in advance which variables are influential and important and which are not. Input variables are therefore selected only from open, objective historical data of the stock market. To avoid omitting important predictive information contained in the historical data, Principal Component Analysis (PCA) is usually used. A functional principal component technique for the statistical analysis of a set of financial time series highlights some relevant statistical features of such related datasets [3]. The method replaces the original variables with new ones that are fewer in number, mutually uncorrelated, and contain most of the information of the original variables [6]. Xiaoping Yang [4] used PCA to find the principal components that are taken as inputs for predicting stock prices with a neural network. The variables high, low, open, volume and adjusted closing were considered for predicting closing prices using a hybrid Kohonen Self-Organizing Map [5]. Liu et al. [7] used back-propagation neural networks with the moving average, deviation from the moving average, turnover moving average, and relative index as inputs. In the work of Versace et al. [8], the values used are the open, high, low, close and volume of a specific stock, while Baba [9] used the change of index, PBR, changes of the turnover by foreign traders, changes of current rates, and turnover in the local stock market. An MLP outperformed an RBF network in predicting weekly closing prices from the variables open, high, low and volume [10]. In recent years, Artificial Neural Networks (ANNs) have been applied to many areas of statistics, one of which is time series forecasting [11-19].

The variables considered in this article for predicting the daily closing prices are the historic prices and the daily opening, low and high prices of the BSE Sensex from 1st January 2009 to 31st March 2010. Principal component analysis reduced these to a single predictor variable.

The closing prices are predicted by fitting a parametric model, Simple Linear Regression, and also by the classical non-parametric Theil's Incomplete Method. The Multilayer Perceptron is a further non-parametric model used to forecast the daily closing prices, taking the principal component as the predictor variable. The forecast error, the difference between the actual value and the forecast value for the corresponding period, is measured for all three models. The error measures MAPE, SMAPE and MAE quantify how close the forecast values are to the target values: the lower the error values, the better the forecaster.
2. MODEL DESCRIPTION

2.1 PRINCIPAL COMPONENT ANALYSIS

Principal component analysis is appropriate when there are a number of observed variables and one wishes to derive a smaller number of artificial variables (called principal components) that account for most of the variance in the observed variables. The principal components may then be used as predictor or criterion variables in subsequent analyses. Principal component analysis is a variable reduction procedure. It is useful when there is redundancy among the observed variables; here redundancy means that some of the variables are correlated with one another, possibly because they measure the same construct. Because of this redundancy it is possible to reduce the observed variables to a smaller number of principal components that account for most of the variance in the observed variables.

Technically, a principal component can be defined as a linear combination of optimally weighted observed variables. The general form of the first component extracted in a principal component analysis is

C_1 = b_{11} X_1 + b_{12} X_2 + \dots + b_{1p} X_p

where C_1 is the first component extracted, b_{1p} is the weight (regression coefficient) for observed variable p, and X_p is the value of that observed variable.
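As an illustration of this step, here is a minimal sketch that extracts a first principal component with scikit-learn. The synthetic price matrix is an assumption for illustration, not the paper's actual BSE Sensex data.

```python
import numpy as np
from sklearn.decomposition import PCA

# Illustrative stand-in data: rows are trading days, columns are
# correlated price variables (e.g. open, low, high), synthetic values.
rng = np.random.default_rng(0)
base = np.cumsum(rng.normal(0, 50, size=300)) + 15000
prices = np.column_stack([base + rng.normal(0, 20, size=300) for _ in range(3)])

# First principal component: C1 = b11*X1 + b12*X2 + b13*X3.
pca = PCA(n_components=1)
component = pca.fit_transform(prices).ravel()

print("loadings:", pca.components_[0])
print("explained variance ratio:", pca.explained_variance_ratio_[0])
```

With strongly correlated price series, the first component typically captures well over 80% of the variance, which is what justifies replacing the original variables with a single predictor.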
2.2 SIMPLE LINEAR REGRESSION

Simple linear regression fits a straight line through a set of n points in such a way that the sum of squared residuals of the model is as small as possible. Regression rests on the following assumptions:

- The dependent variable is linearly related to the independent variable.
- The residuals follow a normal distribution.
- The residuals have uniform variance.

The regression parameters for the straight-line model y = a + bx are calculated by the least squares method (minimization of the sum of squares of the deviations from the straight line). Differentiating the sum of squares leads to the standard formulae for the slope (b) and the intercept (a) of the line:

b = \frac{n \sum x_i y_i - \sum x_i \sum y_i}{n \sum x_i^2 - \left( \sum x_i \right)^2}, \qquad a = \bar{y} - b \bar{x}
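These closed-form estimates translate directly into code; a minimal sketch (variable names are illustrative; in the paper's setting x would be the principal component and y the daily closing price):

```python
import numpy as np

def fit_simple_linear_regression(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Fit y = a + b*x by ordinary least squares; return (a, b)."""
    n = len(x)
    # Slope from the closed-form least squares expression above.
    b = (n * np.sum(x * y) - np.sum(x) * np.sum(y)) / (n * np.sum(x ** 2) - np.sum(x) ** 2)
    # Intercept from a = mean(y) - b * mean(x).
    a = np.mean(y) - b * np.mean(x)
    return a, b
```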
2.3 THEIL'S INCOMPLETE METHOD

A simple, non-parametric approach to fitting a straight line to a set of (x, y) points is Theil's incomplete method, which assumes that the points (x_1, y_1), (x_2, y_2), ..., (x_N, y_N) are described by the equation y = a + bx. The calculation of a and b proceeds as follows (a sketch implementing these steps is given after the list):

- All N data points are ranked in ascending order of their x-values.
- The data are separated into two groups of equal size m, the low (L) group and the high (H) group. If N is odd, the middle data point is not included in either group.
- A slope is calculated for each pair of points, one from each group: b_i = (y_{H,i} - y_{L,i}) / (x_{H,i} - x_{L,i}) for i = 1, 2, ..., m.
- The median of the m slope values b_1, b_2, ..., b_m is taken as the best estimate of the slope of the line, i.e. b = median(b_1, b_2, ..., b_m).
- For each data point (x_i, y_i), an intercept a_i = y_i - b x_i is calculated using the previously estimated slope b, for i = 1, 2, ..., N.
- The median of the N intercept values a_1, a_2, ..., a_N is taken as the best estimate of the intercept of the line, i.e. a = median(a_1, a_2, ..., a_N).
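A minimal sketch of these steps, assuming 1-D NumPy arrays of equal length:

```python
import numpy as np

def theil_incomplete(x: np.ndarray, y: np.ndarray) -> tuple[float, float]:
    """Fit y = a + b*x by Theil's incomplete method; return (a, b)."""
    order = np.argsort(x)                    # rank points in ascending order of x
    xs, ys = x[order], y[order]
    n = len(xs)
    m = n // 2
    lo, hi = slice(0, m), slice(n - m, n)    # middle point dropped when n is odd
    slopes = (ys[hi] - ys[lo]) / (xs[hi] - xs[lo])
    b = float(np.median(slopes))             # median of the m pairwise slopes
    a = float(np.median(y - b * x))          # median of the N intercepts y_i - b*x_i
    return a, b
```

Because it uses medians rather than means, this estimator is far less sensitive to outliers than least squares, which is what makes it a natural non-parametric benchmark here.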
2.4 MULTILAYER PERCEPTRON

A multilayer perceptron (MLP) is a feed-forward network model that maps sets of input data onto a set of appropriate outputs. It is a modification of the standard linear perceptron in that it uses three or more layers of neurons (nodes) with nonlinear activation functions, and it is more powerful than the perceptron in that it can distinguish data that are not linearly separable. The MLP divides the data set into three parts: training, testing and holdout.

- Training: this segment of the data is used only to train the network.
- Testing: this segment, a part of the training data, is used to prevent over-training.
- Holdout: this segment is used to assess the final neural network; it gives an honest estimate of the predictive ability of the model.

The multilayer perceptron offers a rescaling option, applied to improve network training. There are three rescaling options: standardization, normalization, and adjusted normalization. All rescaling is performed on the basis of the training data, even if a testing or holdout sample is defined. The activation function of the hidden layer can be the hyperbolic tangent or the sigmoid. The units in the output layer can use any one of the following activation functions: identity, sigmoid, softmax or hyperbolic tangent.
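The description above appears to follow the SPSS Multilayer Perceptron procedure; as a rough stand-in, the sketch below builds a comparable one-hidden-layer network with scikit-learn, using a tanh hidden layer, a linear (identity) output, normalization of the covariate, and a holdout split. The layer size and split proportion are illustrative assumptions, not the paper's settings.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import MinMaxScaler

def fit_mlp(x: np.ndarray, y: np.ndarray):
    """Train a one-hidden-layer MLP on a single covariate; return (net, scaler)."""
    scaler = MinMaxScaler()                       # "normalized" rescaling to [0, 1]
    X = scaler.fit_transform(x.reshape(-1, 1))
    X_train, X_hold, y_train, y_hold = train_test_split(
        X, y, test_size=0.2, random_state=0)      # holdout for an honest estimate
    net = MLPRegressor(hidden_layer_sizes=(4,), activation="tanh",
                       early_stopping=True,       # internal validation ~ "testing" segment
                       max_iter=5000, random_state=0)
    net.fit(X_train, y_train)                     # MLPRegressor's output layer is identity
    print("holdout R^2:", net.score(X_hold, y_hold))
    return net, scaler
```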
2.5 ERROR MEASURES

The error functions used are the sum of squares error and the relative error. The sum of squares error is defined as the sum of the squared deviations between the observed and the model-predicted values. The relative error is the ratio of an absolute error to the true, specified, or theoretically correct value of the quantity that is in error. The comparison measures are

Mean Average Error: \mathrm{MAE} = \frac{1}{n} \sum_{t=1}^{n} |A_t - P_t|

Mean Average Percent Error: \mathrm{MAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|A_t - P_t|}{A_t}

Symmetric Mean Average Percent Error: \mathrm{SMAPE} = \frac{1}{n} \sum_{t=1}^{n} \frac{|A_t - P_t|}{A_t + P_t}

where A_t is the actual value and P_t is the predicted value.
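These three measures translate directly into code; a minimal sketch:

```python
import numpy as np

def error_measures(actual: np.ndarray, predicted: np.ndarray):
    """Return (MAE, MAPE, SMAPE) exactly as defined above."""
    abs_err = np.abs(actual - predicted)
    mae = abs_err.mean()
    mape = (abs_err / actual).mean()
    # Note: the paper's SMAPE divides by A_t + P_t, without the factor of 2
    # that appears in some other definitions of SMAPE.
    smape = (abs_err / (actual + predicted)).mean()
    return mae, mape, smape
```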
3. FINDINGS AND RESULTS

Principal component analysis of the daily high, low and opening prices of the BSE Sensex data resulted in a single principal component, which is then used to predict the closing prices by the methods discussed above. The criterion determining the number of principal components, the eigenvalues, and the factor loadings of the principal components are given in Table 1.

Table 1: Principal Component Analysis (criterion: Kaiser)

Component      1         2         3         4
Eigenvalue     3.2787    0.7194    0.0011    0.0008
Difference     2.5593    0.7183    0.0003
Proportion     81.97%    17.98%    0.03%     0.02%
Cumulative     81.97%    99.95%    99.98%    100.00%

Factor loadings (F1) and principal component weights (PCA1):

Variable    F1        PCA1
V1          0.9855    0.5442
V2          0.9855    0.5442
V3          0.9883    0.5458
V4         -0.5998   -0.3312
Exp. Var.   3.2787

An initial descriptive analysis of the daily closing prices and the predictor variable (the principal component) is given in Table 2. The assumptions of simple linear regression are checked, and the line of regression is then fitted to this set of observations.

Table 2: Descriptive Statistics

Variable              Mean         Std. Deviation   Skewness   Kurtosis
Daily Closing         14337.1182   3041.62375       -0.788     -0.909
Principal Component   23419.2108   4974.61634       -0.785     -0.918

Table 3: Tests of Normality

           Kolmogorov-Smirnov          Shapiro-Wilk
           Statistic   df    Sig.      Statistic   df    Sig.
Closing    0.170       300   0.000     0.838       300   0.000
PCA        0.171       300   0.000     0.838       300   0.000

The Durbin-Watson value of 2.11 clearly indicates the absence of autocorrelation. The Kolmogorov-Smirnov and Shapiro-Wilk normality tests were performed, and the outcomes are displayed in Table 3; both tests imply that the condition of normality is not met.

Using the method of least squares, the simple linear regression model for the data is Y = 34.312 + 0.611X, where X is the principal component variable and Y represents the daily closing price of the BSE. By the classical non-parametric Theil's method, the model is Y = 42.15384 + 0.610456X, with X and Y as above.

For modeling the data with the multilayer perceptron, the principal component variable is taken as the covariate and the daily closing price of the BSE as the target variable. Rescaling (standardization, normalization and adjusted normalization) of both the dependent variable and the covariate is applied in turn. For every combination of the hidden-layer activation function (hyperbolic tangent or sigmoid) and the output-layer activation function (identity, hyperbolic tangent or sigmoid), the sum of squares error and relative error are measured under the different rescaling options.

The different combinations of the output- and hidden-layer activation functions with the three rescaling options of the input and target variables resulted in 30 models. The architecture with the minimum sum of squares and relative error is the one in which both the dependent variable and the covariate are normalized, with the hyperbolic tangent as the activation function of the hidden layer and the identity function for the output layer. Table 4 gives the MAE, MAPE, SMAPE and R-square values for the models discussed above.

Table 4: MAE, MAPE, SMAPE and R-square values

Model                   MAE          MAPE        SMAPE       R-square
Linear Regression       110.695401   0.0081926   0.0040934   0.9977142
Theil's Incomplete      110.6996     0.008198    0.004095    0.9977138
Multilayer Perceptron   118.5105     0.008839    0.004424    0.9974605
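Putting the pieces together, the following sketch shows how the three fitted models could be compared with the error measures above. It assumes the helper sketches from Sections 2.2-2.5 are in scope and uses synthetic stand-in data, so the numbers will not match Table 4.

```python
import numpy as np

# Synthetic stand-ins for the principal component and closing prices
# (the actual BSE Sensex series is not reproduced here).
rng = np.random.default_rng(1)
component = np.cumsum(rng.normal(0, 50, size=300)) + 23000.0
closing = 34.0 + 0.61 * component + rng.normal(0, 100, size=300)

a_ols, b_ols = fit_simple_linear_regression(component, closing)
a_th, b_th = theil_incomplete(component, closing)
net, scaler = fit_mlp(component, closing)

predictions = {
    "Linear Regression": a_ols + b_ols * component,
    "Theil's Incomplete": a_th + b_th * component,
    "Multilayer Perceptron": net.predict(scaler.transform(component.reshape(-1, 1))),
}
for name, pred in predictions.items():
    mae, mape, smape = error_measures(closing, pred)
    print(f"{name}: MAE={mae:.4f}  MAPE={mape:.6f}  SMAPE={smape:.6f}")
```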
Figure 1: Actual and model-predicted closing prices for the last 50 data points.

4. CONCLUSION

The best model for forecasting the daily closing prices was found to be linear regression. The model yielded the least error: 0.0081926 on average as measured by MAPE, 0.0040934 on average as measured by SMAPE, and 110.695401 as the MAE value. The R-square value of 0.997714272 indicates that the model is appropriate for predicting the daily closing prices when the daily opening, high and low prices are used as predictors. This model outperformed the non-parametric Theil's method and the MLP model. It would be interesting to conduct further studies comparing these results with additional variables.

5. REFERENCES

1. Kai Keng Ang and Chai Quek (2006), "Stock Trading Using RSPOP: A Novel Rough Set-Based Neuro-Fuzzy Approach", IEEE Transactions on Neural Networks, 17(5): 1301-1315.
2. Brabazon, T. (2000), "A connectivist approach to index modelling in financial markets", Proceedings, CoIL/EvoNet Summer School, University of Limerick.
3. Salvatore Ingrassia and G. Damiana Costanzo (2005), "Functional principal component analysis of financial time series", in Vichi M., Monari P., Mignani S., Montanari A. (Eds.), New Developments in Classification and Data Analysis, pp. 351-358, Springer-Verlag, Berlin.
4. Xiaoping Yang (2005), "The Prediction of Stock Prices Based on PCA and BP Neural Networks", Chinese Business Review, ISSN 1537-1506, USA, Vol. 4, No. 5 (Serial No. 23), pp. 64-68.
5. Mark O. Afolabi and Olatoyosi Olude (2007), "Predicting Stock Prices Using a Hybrid Kohonen Self Organizing Map (SOM)", Proceedings of the 40th Hawaii International Conference on System Sciences, IEEE.
6. Huixin Ke, Jinghua Huang and Hao Shen (2007), "Statistic Analysis in Investigation and Research", Beijing: Beijing Broadcast University Press, pp. 465-484.
7. Qiong Liu, Xin Lu, Fuji Ren and Shingo Kuroiwa (2004), "Automatic Estimation of Stock Market Forecasting and Generating the Corresponding Natural Language Expression", IEEE Proceedings of the International Conference on Information Technology: Coding and Computing.
8. Versace M., Bhatt R., Hinds O. and Shiffer M. (2004), "Predicting the exchange traded fund DIA with a combination of genetic algorithms and neural networks", Expert Systems with Applications, Elsevier.
9. Baba N., Naoyuki I. and Hiroyuki A. (2000), "Utilization of Neural Networks & GAs for Constructing Reliable Decision Support Systems to Deal Stocks", Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks.
10. Sujatha K. V. and S. Meenakshi Sundaram (2010), "A MLP, RBF Neural Network Model for Prediction in BSE SENSEX Data Set", Proceedings of the National Conference on Applied Mathematics.
11. Kajitani, Y., W. K. Hipel and A. I. McLeod (2005), "Forecasting Nonlinear Time Series with Feedforward Neural Networks: A Case Study of Canadian Lynx Data", Journal of Forecasting, 24: 105-117.
12. Yao, J., Y. Li and C. L. Tan (2000), "Option Price Forecasting Using Neural Networks", Omega, 28: 455-466.
13. Chakraborty, K., Mehrotra K., Mohan C. K. and Ranka S. (1992), "Forecasting the Behavior of Multivariate Time-Series Using Neural Networks", Neural Networks, 5: 461-470.