
Time Series Forecasting using Neural Nets (GRNNs)


Paper review: Toward Automatic Time-Series Forecasting Using Neural Networks (GRNNs)



  1. Toward Automatic Time-Series Forecasting Using Neural Networks, by Weizhong Yan. Presenter: Sean Golliher.
  2. Relationship to Research. Currently analyzing the performance of NEAT for Time Series Forecasting (TSF). The paper summarizes common approaches to, and issues with, using ANNs for TSF.
  3. Claims of the Paper. Develops an automatic TSF model using a Generalized Regression Neural Network (GRNN). Shows promising results by winning the NN3 time-series competition against 60 other models.
  4. General Problems with ANNs. Most approaches are ad hoc, meaning they apply some type of preprocessing to the data and typically try different ANN architectures to see which one performs better. Nelson et al.: ANN inconsistency on TSF is the result of different preprocessing strategies. Balkin et al.: ANNs require a larger number of samples to be trained, while real-world series (financial, etc.) offer only short training samples.
  5. RBF. An RBF network can be viewed as a local linear regression model. A Gaussian kernel is applied to the input data; all inputs go to nodes of the form G(x) = exp(−‖x − c‖² / σ²) (1). Centers are found by assigning a center point c to each point in the data set (measuring each input's distance to the center). This is equivalent to doing a local regression (σ controls the smoothing of the approximation). The output layer (the weights) is trained using least-squares regression.
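The RBF scheme on this slide can be sketched in a few lines of NumPy: one Gaussian node per training point (Eq. 1 with cᵢ = xᵢ) and an output layer fit by least squares. The function name `rbf_fit_predict` is mine, not the paper's, and this is a 1-D illustration, not the author's implementation.

```python
import numpy as np

def rbf_fit_predict(x_train, y_train, x_query, sigma):
    """RBF network sketch: one Gaussian node per training point
    (center c_i = x_i), output weights fit by least squares."""
    x_train = np.asarray(x_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)
    # Hidden layer: G[j, i] = exp(-(x_j - c_i)^2 / sigma^2), Eq. (1)
    G = np.exp(-(x_train[:, None] - x_train[None, :]) ** 2 / sigma ** 2)
    # Output layer: least-squares fit of the weights
    w, *_ = np.linalg.lstsq(G, y_train, rcond=None)
    G_q = np.exp(-(x_query[:, None] - x_train[None, :]) ** 2 / sigma ** 2)
    return G_q @ w
```

Because every training point is a center, the Gaussian design matrix is square and positive definite, so the network interpolates the training data exactly; σ then controls how smoothly it behaves between the points.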
  6. Generalized Definition for Regression. Computation of the most probable value of Y for each value of X, based on a finite number of possibly noisy measurements of X. The conditional mean of y given X (the regression of y on X) is E[y|X] = ∫₋∞^∞ y f(X, y) dy / ∫₋∞^∞ f(X, y) dy (2). Since the density function f(X, y) is typically unknown, it can be estimated using a Parzen window density estimator.
  7. Generalized Definition for Regression (cont'd). The generalized definition yields the following regression function: Ŷ(X) = Σ_{i=1}^{n} Y_i exp(−D_i² / (2σ²)) / Σ_{i=1}^{n} exp(−D_i² / (2σ²)) (3), where D_i² = (X − X_i)ᵀ(X − X_i). In the case of the GRNN, X is the input data and the X_i are the centers.
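Eq. (3) is simple enough to state directly in code: the prediction is a kernel-weighted average of the training targets. A minimal NumPy sketch (the name `grnn_predict` is my own; the paper does not publish code):

```python
import numpy as np

def grnn_predict(X_train, y_train, X_query, sigma):
    """GRNN regression, Eq. (3): a Gaussian-kernel-weighted
    average of the training targets Y_i."""
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    X_query = np.asarray(X_query, dtype=float)
    preds = []
    for x in X_query:
        d2 = np.sum((X_train - x) ** 2, axis=1)        # D_i^2 = (x - X_i)^T (x - X_i)
        w = np.exp(-d2 / (2.0 * sigma ** 2))           # kernel weights
        preds.append(np.sum(w * y_train) / np.sum(w))  # normalized weighted average
    return np.array(preds)
```

Note there is no iterative training at all, which is where the "easy to train" claim on slide 9 comes from: as σ → 0 the model reproduces the nearest training target, and as σ → ∞ it collapses to the global mean of the targets.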
  8. GRNN. The G(x, x_i) are the standard radial basis functions, the w_i come from the generalized regression equation, and the spread factor σ dictates the performance.
  9. Claimed Benefits of GRNN. Easy to train; can accurately approximate functions from sparse and noisy data. Note: a recent paper, Ahmed et al., claims GRNN is inferior to MLP for TSF.
  10. Methodology Requirements. Minimal human intervention; computationally efficient for a large number of series; good forecasting over a range of data sets.
  11. Preprocessing: Outliers. Real-world time series have outliers. Outliers are identified by |x| ≥ 4 · max(|ma|, |mb|) (4), where ma = median(x_{i−3}, x_{i−2}, x_{i−1}) and mb = median(x_{i+1}, x_{i+2}, x_{i+3}). If x is an outlier, its value is replaced with the average of the points before and after x.
  12. Preprocessing: Trends. Real-world time series have trends, which could be due to seasonality or other factors. Common approaches are curve fitting, filtering, and differencing, but identifying trends is difficult to do algorithmically. The paper proposes a detrending scheme: split the series into segments (12 segments if monthly, 4 if quarterly), then subtract the mean of the historical observations within each segment from every historical observation in that segment.
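The detrending scheme above amounts to grouping observations by their position within the period and removing each group's mean. A minimal sketch, assuming the series starts on a segment boundary (e.g. January for monthly data); the name `detrend_by_segment` is mine:

```python
import numpy as np

def detrend_by_segment(x, period=12):
    """Detrending sketch from the paper: group observations by position
    within the period (12 segments for monthly data, 4 for quarterly)
    and subtract each segment's historical mean from its members.
    Assumes x starts at a segment boundary."""
    x = np.asarray(x, dtype=float)
    out = x.copy()
    for s in range(period):
        idx = np.arange(s, len(x), period)   # all observations in segment s
        out[idx] = x[idx] - x[idx].mean()
    return out
```

A series that repeats the same period-long pattern exactly is mapped to all zeros, which shows this scheme removes a stable seasonal level as well as any trend carried by the segment means.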
  13. Preprocessing: Seasonality. Identifying seasonality is typically a manual process. The author uses a simple approach, defining short series as n ≤ 60 and long series as n > 60. Autocorrelation coefficients at one and two seasonal lags are used to decide whether a series is seasonal, and a standard method is used to subtract the seasonality out of the series.
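The seasonal-lag test can be sketched as follows. The paper's exact decision rule and threshold are not given on the slide, so the threshold below is an assumption, and both function names are mine:

```python
import numpy as np

def autocorr(x, lag):
    """Sample autocorrelation coefficient at a given lag."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    return np.sum(x[lag:] * x[:-lag]) / np.sum(x * x)

def looks_seasonal(x, period=12, threshold=0.3):
    """Heuristic in the spirit of the paper: call the series seasonal
    when the autocorrelations at one and two seasonal lags are both
    clearly positive. The threshold value is an assumption."""
    return autocorr(x, period) > threshold and autocorr(x, 2 * period) > threshold
```

Checking two seasonal lags rather than one guards against a single spurious correlation spike being mistaken for a repeating seasonal pattern.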
  14. ANN Modeling. Spread factor: typically found empirically, since no good analytic approach has been found. Some guidance is given by Haykin: σ = d_max / √(2n), where d_max is the maximum distance between the training points. The paper proposes setting the spread factor to d50, d75, and d95, the percentiles of the nearest distance of each training sample to the rest of the points. Three GRNNs, all taking the same input, are combined to give the final output; the choice of combining three GRNNs is based on previous success in the literature.
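The d50/d75/d95 candidates are just percentiles of the nearest-neighbor distance distribution over the training set. A brute-force sketch (fine for the short series in NN3; `spread_candidates` is my name for it):

```python
import numpy as np

def spread_candidates(X, percentiles=(50, 75, 95)):
    """Spread-factor candidates per the paper: percentiles of each
    training sample's distance to its nearest neighbor."""
    X = np.asarray(X, dtype=float)
    # Pairwise Euclidean distances, O(n^2) but adequate for small n
    d = np.sqrt(((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)       # exclude each point's self-distance
    nearest = d.min(axis=1)           # distance to nearest neighbor
    return [np.percentile(nearest, p) for p in percentiles]
```

Each of the three GRNNs in the ensemble would then use one of the returned values as its σ, so the combination spans a tight, a medium, and a loose smoothing of the same training data.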
  15. ANN Modeling (cont'd). Input selection is considered one of the most important aspects of TSF. There are two general approaches: filter and wrapper. Filter methods select features based on the data itself (independent of the learning algorithm); wrapper methods use the learning algorithm itself and typically perform better. The author uses contiguous lags, limited to one full season for 12-month data.
  16. Experimental Results. Uses the NN3 time-series competition dataset, which is composed of Dataset A and Dataset B. Dataset A is 111 monthly time series drawn from empirical business time series; Dataset B is a small subset of Dataset A consisting of 11 series. Error is measured using sMAPE.
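For reference, the symmetric MAPE used to score the NN3 competition averages the absolute error scaled by the mean magnitude of actual and forecast. A minimal sketch (the function name is mine):

```python
import numpy as np

def smape(actual, forecast):
    """Symmetric mean absolute percentage error, in percent:
    mean of |F_t - A_t| / ((|A_t| + |F_t|) / 2)."""
    a = np.asarray(actual, dtype=float)
    f = np.asarray(forecast, dtype=float)
    return 100.0 * np.mean(np.abs(f - a) / ((np.abs(a) + np.abs(f)) / 2.0))
```

Unlike plain MAPE, the symmetric denominator keeps the score bounded (at most 200%) even when the actual value is near zero.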
  17. Experimental Results (cont'd). In the results table, B indicates a statistical model and C indicates a computational intelligence model.
  18. Ablation Studies. SP: spread; MSA: multiple step ahead.
  19. Discussion. Are TSF competitions just a demonstration of the no-free-lunch theorem? Why is the theorem not mentioned? Did he prove his approach was "better", or did it just outperform on a particular contest? Why doesn't the training of the GRNN factor out outliers and seasonality on its own; isn't that what training is for? Why did he choose a GRNN, when previous papers said they perform poorly? What kind of bias does the detrending scheme introduce? The paper was "rule of thumb" oriented; is there a way to make an automatic approach more rigorous?