Comparison of the Forecasting Techniques – ARIMA, ANN and SVM - A Review



International Journal of Computer Engineering and Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 3, May – June (2013), © IAEME

COMPARISON OF THE FORECASTING TECHNIQUES – ARIMA, ANN AND SVM - A REVIEW

V. Anandhi (1), Dr. R. Manicka Chezian (2)
(1) Assistant Professor (Computer Science), Department of Forest Resource Management, Forest College and Research Institute, Mettupalayam 641 301, Tamil Nadu, India
(2) Associate Professor, Department of Computer Science, NGM College, Pollachi-642001, Tamil Nadu, India

ABSTRACT

Wood pulp is the most common raw material in paper making. Forecasting is a systematic effort to anticipate future events or conditions. Forecasting is usually carried out using statistical methods like ARIMA, and nowadays Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are widely used in forecasting for their accuracy. In ANN, a Levenberg-Marquardt Back Propagation (LMBP) algorithm has been used to develop the ANN models. In developing the ANN models, different networks with different numbers of hidden-layer neurons were evaluated. The forecast is done using the feed-forward Back Propagation Network (BPN). Support Vector Regression (SVR), a category of support vector machine, attempts to minimize the generalization error bound so as to achieve generalized performance. Regression is the problem of finding a function which approximates a mapping from an input domain to the real numbers on the basis of a training sample.

Keywords: Forecasting, ARIMA, Artificial Neural Networks (ANN) and Support Vector Machines (SVM), Levenberg-Marquardt Back Propagation (LMBP), Back Propagation Network (BPN)

I. INTRODUCTION

Forecasting is a systematic effort to anticipate future events or conditions. Forecasts are more accurate for larger groups of items and for shorter time periods. Many forecasters depend heavily on models to help in forecasting.
A model consists of mathematical expressions or equations which describe relationships among variables. A forecaster's choice of forecasting model is of key importance. Applications of forecasting include rainfall, stock market prices, cash forecasting in banks, etc. Forecasting is usually carried out using statistical methods like ARIMA, and nowadays Artificial Neural Networks (ANN) and Support Vector Machines (SVM) are widely used in forecasting for their accuracy.

International Journal of Computer Engineering & Technology (IJCET), ISSN 0976-6367 (Print), ISSN 0976-6375 (Online), Volume 4, Issue 3, May-June (2013), pp. 370-376, © IAEME. Journal Impact Factor (2013): 6.1302 (Calculated by GISI).

II. FORECASTING SYSTEM

A good forecasting system has qualities that distinguish it from other systems. These qualities provide a useful basis for understanding why good forecasting systems outperform others over time. If users understand the rationale of a forecast, they can appraise the uncertainty of the forecast, and they will know when to revise the forecast in light of changing circumstances. The more accurate a forecast is, the better are the decisions that depend upon it. Inaccurate forecasts lead to too much or too little capacity and can be very costly. Forecasts cost money, time and effort. Added expense must purchase added accuracy, flexibility, or insight. A sophisticated forecasting system requires ample staff resources and technical skills for maintenance. The choice of a forecasting system must include a commitment to the resources necessary to maintain it, and should avoid systems that the utility will be unable or unwilling to maintain.

Earlier, statistical methods were used for forecasting. A time series model was constructed solely from the past values of the variables to be forecast. The single-equation time series models known as 'autoregressive models' include the ARIMA (Auto Regressive Integrated Moving Average) model. 'Auto Regressive' refers to a model in which a variable is a function of only its past values, except for deviations introduced by an error term. 'Integrated' means that the period-to-period changes in the level of the original variable are employed in the estimation procedure rather than the level of the variable itself. The 'Moving Average' procedure is used to eliminate any intercorrelation of the error term with its own past or future values.
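
The 'Integrated' and 'Auto Regressive' components described above can be illustrated with a minimal sketch. It assumes NumPy, a synthetic series, and a first-order autoregression fitted by ordinary least squares; the function names `difference` and `fit_ar1_least_squares` and all numeric values are hypothetical, and the moving-average component is left out for brevity.

```python
import numpy as np


def difference(series, d=1):
    """Replace levels by period-to-period changes, d times (the 'Integrated' step)."""
    values = np.asarray(series, dtype=float)
    for _ in range(d):
        values = np.diff(values)
    return values


def fit_ar1_least_squares(series):
    """Least-squares fit of x[t] = c + phi * x[t-1] + error (the 'Auto Regressive' step)."""
    x = np.asarray(series, dtype=float)
    # Design matrix: a column of ones for the constant and the lagged values.
    design = np.column_stack([np.ones(len(x) - 1), x[:-1]])
    coefficients, *_ = np.linalg.lstsq(design, x[1:], rcond=None)
    c, phi = coefficients
    return float(c), float(phi)


# Synthetic data (an assumption for illustration): the changes follow an AR(1)
# process with c = 1.0 and phi = 0.6, and the observed level is their running sum.
rng = np.random.default_rng(0)
n = 500
changes = np.zeros(n)
for t in range(1, n):
    changes[t] = 1.0 + 0.6 * changes[t - 1] + rng.normal()
level = np.cumsum(changes)

recovered_changes = difference(level, d=1)
c_hat, phi_hat = fit_ar1_least_squares(recovered_changes)
print(f"estimated c = {c_hat:.2f}, estimated phi = {phi_hat:.2f}")
```

Differencing the level recovers the changes, and the regression on the lagged changes gives estimates near the values used to generate the data. A full ARIMA fit would also estimate moving-average terms, which this sketch does not attempt.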
ARIMA (Autoregressive Integrated Moving Average) models can represent a wide range of time series. They are a "stochastic" modeling approach that can be used to calculate the probability of a future value lying between two specified limits. For more than two decades, the Box and Jenkins Auto-Regressive Integrated Moving Average (ARIMA) technique has been widely used for time series forecasting. However, ARIMA is a general univariate model, and it is developed on the assumption that the time series being forecast are linear and stationary. Because of its popularity, the ARIMA model has been used as a benchmark to evaluate many new modeling approaches [1]. The method of least squares was used to estimate the parameters in ARIMA.

Artificial Neural Networks (ANN), widely used in forecasting, support multivariate analysis. Multivariate models can rely on greater information, where not only the lagged time series being forecast but also other indicators (such as technical, fundamental, inter-market, etc. for financial markets) are combined to act as predictors. In addition, ANN is more effective in describing the dynamics of nonstationary time series due to its unique non-parametric, non-assumable, noise-tolerant and adaptive properties. ANNs are universal function approximators that can map any nonlinear function without a priori assumptions about the data [2].

AUTOREGRESSIVE (AR) MODEL

An Autoregressive (AR) model is a representation of a type of random process; as such, it describes certain time-varying processes in nature, economics, etc. The autoregressive model specifies that the output variable depends linearly on its own previous values. The notation AR(p) indicates an autoregressive model of order p. The AR(p) model is defined as

$$X_t = c + \sum_{i=1}^{p} \varphi_i X_{t-i} + \varepsilon_t$$

where $\varphi_1, \ldots, \varphi_p$ are the parameters of the model, $c$ is a constant, and $\varepsilon_t$ is white noise. This can be equivalently written using the backshift operator $B$ as

$$X_t = c + \sum_{i=1}^{p} \varphi_i B^i X_t + \varepsilon_t$$

so that, moving the summation term to the left side and using polynomial notation, we have

$$\phi(B) X_t = c + \varepsilon_t, \qquad \phi(B) = 1 - \sum_{i=1}^{p} \varphi_i B^i.$$

An autoregressive model can thus be viewed as the output of an all-pole infinite impulse response filter whose input is white noise. Some constraints are necessary on the values of the parameters of this model in order that the model remains wide-sense stationary. For example, processes in the AR(1) model with $|\varphi_1| \ge 1$ are not stationary. More generally, for an AR(p) model to be wide-sense stationary, the roots of the polynomial $z^p - \sum_{i=1}^{p} \varphi_i z^{p-i}$ must lie within the unit circle, i.e., each root $z_i$ must satisfy $|z_i| < 1$.

Machine learning techniques

Machine learning, a branch of artificial intelligence, was originally employed to develop techniques to enable computers to learn. It includes a number of advanced statistical methods for regression and classification. In certain applications it is sufficient to directly predict the dependent variable without focusing on the underlying relationships between variables. In other cases, the underlying relationships can be very complex and the mathematical form of the dependencies unknown. For such cases, machine learning techniques emulate human cognition and learn from training examples to predict future events.

Artificial Neural Network

Artificial Neural Networks (ANNs) are models based on the neural structure of the brain. The brain learns from experience, and artificial neural networks try to mimic the functioning of the brain. A neural network is a massively parallel distributed processor made up of simple processing units. It has a natural propensity for storing experiential knowledge and making it available for use.
Fig. 1. Artificial Neural Network Architecture

The architecture of an ANN is defined by the number of layers, the number of neurons in each layer, the weights between neurons, and a transfer function that controls the generation of output in a neuron. Supervised training is accomplished by presenting a sequence of training vectors, or patterns, each with an associated target output vector [3]. Neural networks are a class of nonlinear models that can approximate any nonlinear function to an arbitrary degree of accuracy and have the potential to be used as forecasting tools in many different areas [4].

The most commonly used neural network architecture is the multilayer feedforward network. It consists of an input layer, an output layer and one or more intermediate layers called hidden layers. Every node in a layer is connected to every node in the next layer by interconnection strengths called weights. A training algorithm is used to attain a set of weights that minimizes the difference between the target and the actual output produced by the network. There are many different neural net learning algorithms in the literature.

The ANN model is a mathematical model inspired by the function of the human brain, and its use is mainly motivated by its capability of approximating a function to any degree of accuracy. First, ANN is a method which has few limiting hypotheses and can be easily adapted to different types of data. Secondly, ANN can be generalized. Thirdly, ANN has a general functional structure. Furthermore, ANNs can be classified as non-linear models. One critical decision is to determine the appropriate architecture, that is, the number of layers, the number of nodes in each layer, and the number of arcs which interconnect the nodes.
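
The multilayer feedforward structure and the weight-adjustment idea described above can be sketched as follows. This is a minimal illustration, assuming NumPy, a single hidden layer with a tanh transfer function, and plain gradient descent in place of the Levenberg-Marquardt algorithm mentioned in the abstract. The function names, layer sizes, learning rate and toy data are all hypothetical.

```python
import numpy as np


def init_network(n_inputs, n_hidden, seed=0):
    """Create weights for an input -> hidden -> single-output network."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(scale=0.5, size=(n_inputs, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(scale=0.5, size=(n_hidden, 1)),
        "b2": np.zeros(1),
    }


def forward(params, X):
    """Propagate inputs through the hidden layer to the output layer."""
    hidden = np.tanh(X @ params["W1"] + params["b1"])  # transfer function
    output = hidden @ params["W2"] + params["b2"]      # linear output node
    return hidden, output


def train_step(params, X, y, learning_rate=0.05):
    """One gradient-descent update; returns the mean squared error before the update."""
    hidden, output = forward(params, X)
    error = output - y  # difference between actual and target output
    n = len(X)
    # Gradients of half the mean squared error, propagated backwards layer by layer.
    grad_W2 = hidden.T @ error / n
    grad_b2 = error.mean(axis=0)
    delta_hidden = (error @ params["W2"].T) * (1.0 - hidden ** 2)
    grad_W1 = X.T @ delta_hidden / n
    grad_b1 = delta_hidden.mean(axis=0)
    for name, grad in (("W1", grad_W1), ("b1", grad_b1),
                       ("W2", grad_W2), ("b2", grad_b2)):
        params[name] -= learning_rate * grad
    return float(np.mean(error ** 2))


# Toy regression problem (an assumption for illustration): learn y = sin(pi * x).
X = np.linspace(-1.0, 1.0, 40).reshape(-1, 1)
y = np.sin(np.pi * X)
params = init_network(n_inputs=1, n_hidden=5)
losses = [train_step(params, X, y) for _ in range(2000)]
print(f"error before training: {losses[0]:.3f}, after: {losses[-1]:.3f}")
```

The number of hidden nodes (`n_hidden`) is the architectural choice the text calls critical; rerunning the sketch with different values is a simple way to see its effect on the fitted error.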
Feedforward neural networks (FNN) have been used in many studies for forecasting series [5]. The most popular supervised training algorithm is BPN. Back propagation is a form of supervised learning for multi-layer nets, also known as the generalized delta rule. Error data at the output layer is "back propagated" to earlier layers, allowing the incoming weights to these layers to be updated [6]. It is the training algorithm most often used in current neural network applications. Determining the architecture depends on the basic problem. ANN gives good accuracy and is therefore often preferred for forecasting; in [7], ANN had a significantly lower error than the other methods compared.

Support Vector Machines

Support Vector Machines (SVM) are learning machines implementing the structural risk minimization inductive principle to obtain good generalization on a limited number of learning patterns. The theory was originally developed by Vapnik [1] and his co-workers at AT&T Bell Laboratories on the basis of a separable bipartition problem. SVM implements a learning algorithm useful for recognizing subtle patterns in complex data sets. Instead of minimizing the observed training error, Support Vector Regression (SVR) attempts to minimize the generalization error bound so as to achieve generalized performance. There are two main categories of support vector machines: support vector classification (SVC) and support vector regression (SVR). SVM is a learning system using a high dimensional feature space. It yields prediction functions that are expanded on a subset of support vectors. SVM can generalize complicated gray level structures with only a very few support vectors and thus provides a new mechanism for image compression. A version of SVM for regression was proposed in 1997 by Vapnik, Steven Golowich, and Alex Smola [8].
This method is called support vector regression (SVR). The model produced by SVR depends only on a subset of the training data, because the cost function for building the model ignores any training data that is close (within a threshold ε) to the model prediction [9]. Support Vector Regression (SVR) is the most common application form of SVMs. Support vector machines project the
data into a higher dimensional space and maximize the margins between classes or minimize the error margin for regression [10]. Support Vector Machines (SVMs) are a popular machine learning method for classification, regression, and other learning tasks. The basic principle of SVM is that, given a set of points which need to be classified into two classes, one finds a separating hyperplane which maximizes the margin between the two classes. This ensures better classification of unseen points, i.e. better generalization.

In SVR, the goal is to find a function f(x) that has at most ε deviation from the actually obtained targets yi for all the training data. Support vector regression is the natural extension of the large margin kernel methods [11] used for classification to regression analysis. The problem of regression is that of finding a function which approximates a mapping from an input domain to the real numbers on the basis of a training sample. The difference between the hypothesis output and its training value [12] is referred to as the residual of the output, an indication of the accuracy of the fit at this point. One must decide how to measure the importance of this accuracy, as small residuals may be inevitable even while we need to avoid large ones. The loss function determines this measure. Each choice of loss function will result in a different overall strategy for performing regression.

Fig. 2. ε-insensitive Loss Function for regression

Fig. 3. ε-insensitive zone for non-linear support vector regression

Support vector regression performs linear regression in the feature space using the ε-insensitive loss function and, at the same time, tries to reduce model complexity by minimizing $\|w\|^2$. Support vector machines (SVM) are used as they reduce the time and
expertise needed to construct and train price forecasting models. SVM also has fewer tunable parameters, and the choice of parameter values is less critical for good forecasting results. SVM can optimize its structure (tune its parameter settings) on the input training data provided. SVM training involves solving a quadratic optimization problem, which has a unique solution and does not involve random initialization of weights as training a NN does. So an SVM with the same parameter settings and trained on identical data provides identical results. This increases the repeatability of SVM forecasts while reducing the number of training runs needed to locate optimum SVM parameter settings [13]. SVMs can be used for regression analysis even when the data are irregular, for example when data is not regularly distributed or has an unknown distribution [14]. Information to be transformed is evaluated prior to entering classification techniques score.

The advantages of SVM techniques are as follows. By introducing a kernel, SVMs gain flexibility in the choice of the form of the threshold separating instances, which does not need to be linear or to have the same functional form for all data. This is because its function is non-parametric and operates locally. No assumptions about the functional form of the transformation are necessary, as the kernel implicitly contains a non-linear transformation that makes the data linearly separable. The transformation is implicit, on a robust theoretical basis, without the need for human judgment. When the parameters C and r (in the case of a Gaussian kernel) are chosen appropriately, SVMs provide good out-of-sample generalization; by selecting an appropriate generalization grade, SVMs can be robust even with a biased training sample.
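
The ε-insensitive loss and the trade-off between fit and model complexity discussed in this section can be written down in a few lines. The sketch below assumes NumPy and a linear model f(x) = w·x + b; the function names and the sample numbers are hypothetical, and the objective shown is the standard primal form of linear SVR, not necessarily the exact formulation used in the works cited here.

```python
import numpy as np


def epsilon_insensitive_loss(y_true, y_pred, epsilon=0.1):
    """Per-sample loss: zero inside the epsilon tube, linear outside it."""
    residual = np.abs(np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float))
    return np.maximum(0.0, residual - epsilon)


def svr_objective(w, b, X, y, C=1.0, epsilon=0.1):
    """Model-complexity term 0.5 * ||w||^2 plus C times the summed epsilon-insensitive loss."""
    w = np.asarray(w, dtype=float)
    predictions = np.asarray(X, dtype=float) @ w + b
    data_term = epsilon_insensitive_loss(y, predictions, epsilon).sum()
    return 0.5 * float(w @ w) + C * float(data_term)


# Hypothetical targets and predictions: the first residual (0.05) is smaller
# than epsilon, so that training point contributes nothing to the loss.
targets = np.array([1.0, 2.0, 3.0])
predictions = np.array([1.05, 2.5, 2.0])
per_sample = epsilon_insensitive_loss(targets, predictions, epsilon=0.1)
print(per_sample)
```

Training points whose loss is zero do not influence the fitted model, which is why the SVR solution depends only on a subset of the training data. The parameter `C` controls how strongly deviations larger than ε are penalized relative to model complexity.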
As the optimization problem is convex, SVMs deliver a unique solution. This is an advantage over Neural Networks, which have multiple solutions associated with local minima and so might not be robust over different samples.

V. CONCLUSION

An appropriately tuned Support Vector Regression based prediction model can outperform other, more complex models. Support vector regression is a statistical method for creating regression functions of arbitrary type from a set of training data. Testing the quality of regression on the training set shows good prediction accuracy. A neural network model with an improved learning technique is also promising for forecasting. Artificial Neural Networks, well-known function approximators in prediction and system modelling, have recently shown great applicability in time-series analysis and forecasting.

ACKNOWLEDGMENTS

Sincere thanks to Mr. R. Sreenivasan, Assistant General Manager, Tamil Nadu Newsprint and Papers Limited (TNPL), Karur, Dr. K. T. Parthiban, Professor and Head, Tree Breeding, Dr. P. Durairasu, Dean, FC&RI, and Dr. M. Anjugam, Professor and Head, Forest College and Research Institute, for their guidance and support.

REFERENCES

[1] H. B. Hwarng and H. T. Ang, “A simple neural network for ARMA(p,q) time series,” OMEGA: Int. Journal of Management Science, vol. 29, pp. 319-333, 2002.
[2] L. Cao and F. Tay, “Financial forecasting using support vector machines,” Neural Comput. & Applic., vol. 10, pp. 184-192, 2001.
[3] V. Anandhi, R. Manicka Chezian and K. T. Parthiban, “Forecast of Demand and Supply of Pulpwood using Artificial Neural Network,” International Journal of Computer Science and Telecommunications, pp. 35-38, 2012.
[4] Joarder Kamruzzaman and Ruhul A. Sarker, “ANN-Based Forecasting of Foreign Currency Exchange Rates,” Neural Information Processing - Letters and Reviews, vol. 3, no. 2, May 2004.
[5] Cem Kadilar, Muammer Simsek and Cagdas Hakan Aladag, “Forecasting the exchange rate series with ANN: the case of Turkey,” Istanbul University Econometrics and Statistics e-Journal, pp. 17-29, 2009.
[6] V. Anandhi and R. Manicka Chezian, “Backpropagation Algorithm for Forecasting the Price of Pulpwood – Eucalyptus,” International Journal of Advanced Research in Computer Science, pp. 355-357, 2012.
[7] K. Mohammadi, H. R. Eslami and Sh. Dayyani Dardashti, “Comparison of Regression, ARIMA and ANN Models for Reservoir Inflow Forecasting using Snowmelt Equivalent (a Case study of Karaj),” J. Agric. Sci. Technol., pp. 17-30, 2005.
[8] V. Vapnik, S. Golowich, and A. Smola, “Support vector method for function approximation, regression estimation, and signal processing,” Neural Information Processing Systems, vol. 9, MIT Press, Cambridge, MA, 1997.
[9] D. Basak, S. Pal, and D. C. Patranabis, “Support Vector Regression,” Neural Information Processing - Letters and Reviews, vol. 11, no. 10, pp. 203-224, October 2007.
[10] “A Comparison of Machine Learning Techniques and Traditional Methods,” Journal of Applied Sciences, vol. 9, pp. 521-527.
[11] K. P. Soman, R. Loganathan, and V. Ajay, “Support vector machines and other kernel methods,” Centre for Excellence in Computational Engineering and Networking, Amrita Vishwa Vidyapeetham.
[12] H. Drucker, C. J. C. Burges, L. Kaufman, A. Smola, and V. Vapnik, “Support vector regression machines,” Advances in Neural Information Processing Systems, The MIT Press, vol. 9, pp. 155, 1997.
[13] D. C. Sansom, T. Downs, and T. K. Saha, “Evaluation of support vector machine based forecasting tool in electricity price forecasting for Australian national electricity market participants,” Journal of Electrical and Electronics Engineering, Australia, vol. 22, no. 3, pp. 227-234, 2003.
[14] L. Zhang, F. Lin, and B. Zhang, “Support vector machine learning for image retrieval,” in Proceedings of the International Conference on Image Processing, vol. 2, pp. 721-724, IEEE, October 2001.
[15] M. Nirmala and S. M. Sundaram, “Modeling and Predicting the Monthly Rainfall in Tamilnadu as a Seasonal Multivariate Arima Process,” International Journal of Computer Engineering & Technology (IJCET), Volume 1, Issue 1, 2010, pp. 103-111, ISSN Print: 0976-6367, ISSN Online: 0976-6375.
[16] Vilas Naik and Raghavendra Havin, “Entropy Features Trained Support Vector Machine Based Logo Detection Method for Replay Detection and Extraction from Sports Videos,” International Journal of Graphics and Multimedia (IJGM), Volume 4, Issue 1, 2013, pp. 20-30, ISSN Print: 0976-6448, ISSN Online: 0976-6456.
[17] Ankush Gupta, Ameesh Kumar Sharma and Umesh Sharma, “To Forecast the Future Demand of Electrical Energy in India: by Arima & Exponential Methods,” International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 4, Issue 2, 2013, pp. 197-205, ISSN Print: 0976-6480, ISSN Online: 0976-6499.