IC2IT 2013 Presentation

9th International Conference on Computing & Information Technology (IC2IT 2013). Held in Bangkok. Springer Proceedings.


  1. The 9th International Conference on Computing and Information Technology (IC2IT 2013), KMUTNB
     Dhaka Stock Exchange Trend Analysis Using Support Vector Regression
     Authors: Phayung Meesad & Risul Islam Rasel
     Faculty of Information Technology, King Mongkut's University of Technology North Bangkok
     Email: pym@kmutnb.ac.th; rasel.kmutnb@gmail.com
  2. Outline
     Introduction · Related work · Motivation & goal · Experiment design · Result analysis · Conclusion
     9th May 2013, IC2IT 2013
  3. 1. Introduction
     • The stock exchange is an emerging business sector that has become popular; many people and organizations are involved in this business, so gaining insight into market trends has become an important factor.
     • Stock trend or price prediction is regarded as a challenging task because the market is essentially non-linear, non-parametric, noisy, and a deterministically chaotic system.
     • Why a deterministically chaotic system? Liquid money and stock adequacy; human behavior and news related to the stock market; share gambling; money exchange rates; etc.
  4. 2. Related Work
     2.1 Support Vector Regression (SVR)
     • The support vector machine (SVM) is an artificial-intelligence-based method developed from statistical learning theory.
     • SVM has two major features: classification (SVC) and regression (SVR).
     • In SVM regression, the input is first mapped onto an m-dimensional feature space using some fixed (nonlinear) mapping, and a linear model is then constructed in this feature space.
     • A margin of tolerance (epsilon) is set in the approximation; this type of function is often called an epsilon-insensitive loss function.
     • Slack variables are used to overcome noise in the data and non-separability.
  5. Related Work (cont.)
     • The support vector regression (SVR) model with parameters w and b can be expressed as

       f(x) = w · φ(x) + b

       where y = f(x) is the model output and the input x is mapped into a feature space by a nonlinear function φ(x).
     Image courtesy: Pao-Shan Yu, Shien-Tsung Chen and I-Fan Chang
  6. Related Work (cont.)
     • The regression problem of SVM can be expressed as the following optimization problem:

       min_{w, b, ξ, ξ*}  (1/2) ||w||² + C Σ_{i=1}^{l} (ξ_i + ξ_i*)

       subject to  y_i − (w · φ(x_i) + b) ≤ ε + ξ_i
                   (w · φ(x_i) + b) − y_i ≤ ε + ξ_i*
                   ξ_i, ξ_i* ≥ 0,   i = 1, 2, …, l

     • where ξ_i and ξ_i* are slack variables that specify the upper and lower training errors subject to an error tolerance ε, and C is a positive constant that determines the degree of penalized loss when a training error occurs.
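The ε-insensitive loss behind the optimization above can be sketched in a few lines of Python (a minimal illustration of the loss only, not the solver; the function name and `eps` parameter are hypothetical):

```python
def eps_insensitive_loss(y_true, y_pred, eps):
    """Epsilon-insensitive loss: errors within the eps margin cost nothing;
    larger errors are penalized linearly by the amount they exceed eps."""
    return max(0.0, abs(y_true - y_pred) - eps)

# Errors inside the epsilon tube are ignored; outside, cost grows linearly.
print(eps_insensitive_loss(10.0, 10.5, eps=1.0))  # -> 0.0
print(eps_insensitive_loss(10.0, 12.5, eps=1.0))  # -> 1.5
```

The slack variables ξ_i and ξ_i* in the constraints correspond exactly to the positive part returned by this function, split by the sign of the error.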
  7. Related Work (cont.)
     2.2 Windowing operator
     • The problem of forecasting the class attribute x, N steps ahead of time, can be cast as learning a target function which uses a fixed number M of past values:

       x(t+N) = f([x(t−0), x(t−1), …, x(t−M)]) …… (1)

     • This equation can be written as x-0 = f([x-0, x-1, …, x-M] − [x-0, x-1, …, x-N]), or as:

       x-0 = f([x-0, x-N, …, x-M]) …… (2)

     • or, in the multivariate case, as:

       x-0 = f([x-0, y-0, x-1, y-1, …, x-M, y-M] − [x-0, x-1, y-1, …, x-N, y-N]) …… (3)

     • Since windowed examples are of the form [x-0, x-N, y-N, …, x-M, y-M], we have to remove all horizon attributes [x-1, y-1, …, x-N, y-N]. The result is a dataset of windowed examples which can be fed to any machine learning algorithm.
  8. Notations:
     • 0 — timestep 0, the timestep we wish to predict
     • N — the number of timesteps between now and 0
     • M — the size of the window
     • attribute-[0-9] — a windowed attribute, measured at timestep [0-9]
     • x-0 — attribute x measured at timestep 0
     • x(t+N) — equivalent to x(0), if t+N is the timestep we wish to predict
  9. Related Work (cont.)
     2.3 Windowing operator:
     • Transforms the time series data into a generic data set.
     • Converts the last row of a window within the time series into a label or target variable.
     • Feeds the cross-sectional values as inputs to a machine learning technique such as linear regression, a neural network, a support vector machine, and so on.
     • Parameters: horizon (h), window size, step size, training window width, testing window width.
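The windowing transformation described above can be sketched in plain Python (a hypothetical helper; the parameter names `window_size`, `horizon`, and `step` mirror the operator parameters listed on the slide):

```python
def window_series(series, window_size, horizon, step=1):
    """Transform a univariate time series into (features, label) examples:
    each example uses `window_size` past values as inputs and the value
    `horizon` steps after the end of the window as the target label."""
    examples = []
    last_start = len(series) - window_size - horizon
    for start in range(0, last_start + 1, step):
        features = series[start:start + window_size]
        label = series[start + window_size + horizon - 1]
        examples.append((features, label))
    return examples

prices = [100, 102, 101, 105, 107, 110, 108]
# Window of 3 past values, predicting 1 step ahead:
for x, y in window_series(prices, window_size=3, horizon=1):
    print(x, "->", y)   # e.g. [100, 102, 101] -> 105
```

The resulting cross-sectional rows can then be fed to any regressor; larger `horizon` values correspond to the 5-day and 22-day-ahead models used later.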
  10. Related Work (cont.)
      (figure only)
  11. Related Work (cont.)
      2.4 Some recent research works
      1. "Stock Forecasting Using Support Vector Machine"
         • Authors: Lai, Lucas K. C.; Liu, James N. K.
         • Applied techniques: SVM and NN
         • Data preprocessing: exponential moving average (EMA15) and relative difference in percentage of price (RDP)
         • Domain: Hong Kong Stock Exchange
      2. "Stock Index Prediction: A Comparison of MARS, BPN and SVR in an Emerging Market"
         • Authors: Lu, Chi-Jie; Chang, Chih-Hsiang; Chen, Chien-Yu; Chiu, Chih-Chou; Lee, Tian-Shyug
         • Applied techniques: multivariate adaptive regression splines (MARS), back-propagation neural network (BPN), support vector regression (SVR), and multiple linear regression (MLR)
         • Domain: Shanghai B-share stock index
  12. Related Work (cont.)
      3. "An Improved Support Vector Regression Modeling for Taiwan Stock Exchange Market Weighted Index Forecasting"
         • Authors: Chen, Kuan-Yu; Ho, Chia-Hui
         • Applied techniques: SVR, GA, auto-regression (AR)
         • Domain: Taiwan Stock Exchange
      • Much research has been done using support vector machines (SVM) to predict stock market trends, with GA, EMA, RDP, and other techniques used for input selection or optimization. So there is still scope to apply different input selection or optimization techniques to feed inputs to machine learning algorithms such as support vector machines and neural networks.
  13. 3. Motivation & Goal
      • Motivation:
        SVR is a powerful machine learning technique for pattern recognition.
        Using different kinds of windowing functions for data preprocessing is a new idea.
        Combining a windowing function with support vector regression can make a good model for time series prediction.
      • Goal: propose a good Win-SVR model to predict stock prices.
  14. 4. Experiment Design
      4.1 Data collection
      • The experiment dataset was collected from the Dhaka Stock Exchange (DSE), Bangladesh: historical data covering January 2009 – June 2012.
      • Almost 522 companies are listed on the DSE, but for convenience of the experiment only one well-known company's data were selected.
      • The dataset had 6 attributes: date, open price, high price, low price, close price, and volume. Five attributes (all except volume) were used in the experiment.
      • 822 days of data in total: 700 for training and 122 for testing.
  15. Experiment Design (cont.)
      4.2 Model workflow
      Training phase:
      • Step 1: Read the training dataset from the local repository.
      • Step 2: Apply the windowing operator to transform the time series into a generic dataset. This converts the last row of each window into a label (target) variable.
      • Step 3: Run a cross-validation process on the labeled examples produced by the windowing operator in order to feed them as inputs into the SVR model.
      • Step 4: Select the kernel type and the SVR parameters (C, ε, g, etc.).
      • Step 5: Run the model and observe the performance (accuracy).
      • Step 6: If the accuracy is good, go to step 7; otherwise go back to step 4.
      • Step 7: Exit the training phase and apply the trained model to the testing dataset.
      Testing phase:
      • Step 1: Read the testing dataset from the local repository.
      • Step 2: Apply the trained model to the out-of-sample dataset for price prediction.
      • Step 3: Produce the predicted trends and stock prices.
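The sliding training/testing windows used in this workflow (training window width 30, testing window width 30) can be sketched as a walk-forward split generator. This is a simplified illustration under the assumption that both windows slide forward by the test width; the function name and the model fitting itself are left out:

```python
def rolling_splits(n, train_width, test_width):
    """Yield (train_indices, test_indices) pairs for walk-forward
    evaluation: each training window is immediately followed by a
    testing window, then both slide forward by the test width."""
    splits = []
    start = 0
    while start + train_width + test_width <= n:
        train = list(range(start, start + train_width))
        test = list(range(start + train_width, start + train_width + test_width))
        splits.append((train, test))
        start += test_width
    return splits

# 822 days of data with 30-day training and testing windows:
splits = rolling_splits(822, train_width=30, test_width=30)
print(len(splits))  # number of train/test windows over the series
```

Each `(train, test)` pair drives one fit-then-predict cycle of steps 1–7 above; predictions from successive test windows are concatenated to form the out-of-sample forecast.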
  16. Experiment Design (cont.)
      4.3 Optimal window settings
      • Three types of windowing operator were used for data preprocessing: normal rectangular window, flatten window, and de-flatten window.
      • Optimal settings of the windowing components for the SVR models:

        Table 1: Window settings
        Model    Windowing operator  Window size  Attributes  Step size  Training window width  Test window width
        1 day    Flatten window      3            All         1          30                     30
        5 days   Flatten window      8            All         1          30                     30
        22 days  Rectangular window  3            Close       1          30                     30
  17. Experiment Design (cont.)
      4.4 SVR kernel parameter settings
      • Model 1: 1-day-ahead prediction model
      • Model 2: 5-days-ahead prediction model
      • Model 3: 22-days-ahead prediction model
      • Kernel function: radial basis function (RBF)

        Table 2: SVR kernel parameter settings
        SVR model  Kernel  C      g  ε  ε+  ε−
        Model 1    RBF     10000  1  2  1   1
        Model 2    RBF     10000  1  2  1   1
        Model 3    RBF     10000  1  2  1   1
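The RBF kernel with the g (gamma) parameter from Table 2 can be written directly, together with the dual-form SVR prediction it feeds into. This is a minimal sketch, not the trained model; the function names, and the toy support vectors and coefficients in the usage line, are illustrative assumptions:

```python
import math

def rbf_kernel(x1, x2, g=1.0):
    """Radial basis function kernel: K(x1, x2) = exp(-g * ||x1 - x2||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-g * sq_dist)

def svr_predict(support_vectors, coefs, bias, x, g=1.0):
    """Dual-form SVR prediction: f(x) = sum_i coef_i * K(sv_i, x) + bias."""
    return sum(c * rbf_kernel(sv, x, g)
               for sv, c in zip(support_vectors, coefs)) + bias

# Identical inputs give the kernel's maximum value:
print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # -> 1.0
```

With an RBF kernel the prediction is a weighted sum of similarities to the support vectors, which is why Tables 3 and 4 report support-vector counts and bias terms for each trained model.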
  18. Experiment Design (cont.)
      4.5 Proposed Win-SVR models
      • 1-day and 5-days-ahead models
      • Window type: flatten window; window size: 3 (1-day model), 8 (5-days model)
      • Attribute selection: all; step size: 1
      • Training window width: 30; testing window width: 30; kernel type: RBF

        Table 3: SVR models for the flatten window
        Model    SV   Bias (b)  Weights (w)
        1 day    687  3.335     w[Open-2], w[High-2], w[Low-2], w[Close-2]:
                                −746.516, −1074.989, −1087.763, −546.558
        5 days   696  −4.658    w[Open-7], w[High-7], w[Low-7], w[Close-7], w[Open-6], w[High-6], w[Low-6], w[Close-6]:
                                1792.63, 1716.616, 2231.12, 2447.79, 2587.202, 2219.727, 2762.02, 2187.662
  19. Experiment Design (cont.)
      • 22-days-ahead model
      • Window type: normal rectangular window; window size: 3
      • Attribute selection: single attribute (close); step size: 1
      • Training window width: 30; testing window width: 30; kernel type: RBF

        Table 4: SVR model for the normal rectangular window
        Model    SV   Bias (b)  w[close-2]  w[close-1]  w[close-0]
        22 days  675  421.3     1719.6      1631.5      805.1
  20. 5. Result Analysis
      • Result evaluation technique: error calculation using MAPE.
      • MAPE: the Mean Absolute Percentage Error was used to calculate the error rate between actual and predicted prices:

        MAPE = (100 / n) · Σ_{i=1}^{n} |A_i − P_i| / A_i

        where A = actual price, P = predicted price, and n = the number of data points counted.
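The MAPE formula above translates directly to code (a minimal sketch; actual prices must be nonzero, and the function name is illustrative):

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error:
    MAPE = (100 / n) * sum(|A_i - P_i| / A_i)."""
    n = len(actual)
    return (100.0 / n) * sum(abs(a - p) / a
                             for a, p in zip(actual, predicted))

# A 1% error on each of two prices gives a MAPE of 1.0:
print(mape([100.0, 200.0], [99.0, 202.0]))  # -> 1.0
```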
  21. Result Analysis (cont.)
      • Fig. 1: actual vs. predicted price, 1-day-ahead model (flatten window), MAPE: 0.04
      • Fig. 2: actual vs. predicted price, 5-days-ahead model (flatten window), MAPE: 0.15
      • Fig. 3: actual vs. predicted close price, 22-days-ahead model (rectangular window), MAPE: 0.22
      (Each chart plots close price in BDT against days, Jan–Jun 2012.)
  22. Result Analysis (cont.)
      • Fig. 4: error rate, normal rectangular window
      • Fig. 5: error rate, flatten window
      • Fig. 6: error rate, de-flatten window
      (Each chart plots monthly MAPE, Jan–May, for the 1-day, 5-days, and 22-days-ahead models.)

        Table: Average MAPE (error) for test data (Jan '12 – May '12)
        Model            Horizon  Rectangular window  Flatten window  De-flatten window
        1 day a-head     1        0.42                0.04            7.79
        5 days a-head    5        0.26                0.15            7.16
        22 days a-head   22       0.22                0.22            7.61
  23. 6. Conclusion
      6.1 Discussion:
      • Different windowing functions produce different prediction results. We used 3 types of windowing operators: normal rectangular window, flatten window, and de-flatten window.
      • The rectangular and flatten windows are able to produce good prediction results for time series data; the de-flatten window cannot.
      6.2 Limitations & future work:
      • Only 3 types of windowing operators were used, only one stock exchange dataset was used for the experiments, and no comparison was made with other machine learning techniques.
      • In the future, we will apply our model to other stock market datasets and compare our results with other types of data mining techniques.
  24. The 9th International Conference on Computing and Information Technology (IC2IT 2013), KMUTNB
      THANK YOU FOR YOUR ATTENTION
