Visualizing, Modeling and Forecasting of Functional Time Series

Published in: Education, Technology, Business

  1. Visualizing and forecasting functional time series. Han Lin Shang, Department of Econometrics and Business Statistics, HanLin.Shang@monash.edu. Sections: Visualizing functional data; Forecasting functional data; Forecasting seasonal univariate time series; Conclusion.
  2. Outline: (1) Visualizing functional time series. (2) Modeling and forecasting functional time series. (3) Modeling and forecasting seasonal univariate time series via a functional approach. (4) Empirical analysis of estimation, modeling and forecasting techniques, with no theoretical proofs.
  3. Aim of the first paper: Introduce three visualization methods: (1) the rainbow plot, (2) the functional bagplot, and (3) the functional highest density region (HDR) boxplot. The functional bagplot and the functional HDR boxplot can also detect outliers.
  4. Overview of functional data: (1) A collection of functions, represented by curves, surfaces, shapes or images. (2) Applications include age-specific mortality and fertility rates (Hyndman and Ullah, 2007), term-structure yield curves (Kargin and Onatski, 2008), spectrometry data (Reiss and Ogden, 2007), and El Niño data (Ferraty and Vieu, 2006).
  5. Visualizing functional data: Visualization helps to discover characteristics that might not be apparent from mathematical models and summary statistics. So far, visualization has played only a minor role in functional data analysis.
  6. Some visualization methods: (1) phase-plane plot, (2) rug plot, (3) singular value decomposition plot.
  7. Rainbow plot: (1) A simple plot of all the data, with the added feature of a rainbow color palette based on an ordering of the functional data. (2) Functional data can be ordered by depth or by density.
  8. Example of rainbow plot: Annual age-specific mortality curves for French males between 1899 and 2005. [Figure: France, male log mortality rate (1899-2005); x-axis: Age, y-axis: Log mortality rate.]
  9. Multivariate principal component analysis: (1) PC1 is calculated by maximizing the variance of $X\phi_1$, that is, $\arg\max_{\|\phi_1\|=1} \mathrm{var}(X\phi_1) = \arg\max_{\|\phi_1\|=1} \phi_1^\top X^\top X \phi_1$. (2) Successive PCs are obtained iteratively by subtracting the first $k$ PCs from $X$: $X_k = X_{k-1} - X_{k-1}\phi_k\phi_k^\top$. (3) $X_k$ is then treated as the new data matrix to find $\phi_{k+1}$ by maximizing the variance of $X_k\phi_{k+1}$, subject to $\|\phi_{k+1}\| = \big(\sum_{j=1}^{p}\phi_{k+1,j}^2\big)^{1/2} = 1$ and $\phi_{k+1} \perp \phi_j$, $j = 1, \dots, k$.
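The iterative construction on slide 9 can be sketched in NumPy as follows. This is only a minimal illustration, assuming the data matrix is column-centered first; the function name `pca_by_deflation` and the simulated data are hypothetical and not taken from the slides.

```python
import numpy as np

def pca_by_deflation(X, n_components):
    """Find principal components one at a time, deflating X after each step."""
    Xk = X - X.mean(axis=0)          # assumption: center the columns first
    components = []
    for _ in range(n_components):
        # The unit vector maximizing phi' Xk' Xk phi is the leading eigenvector.
        _, eigvecs = np.linalg.eigh(Xk.T @ Xk)   # eigenvalues in ascending order
        phi = eigvecs[:, -1]
        components.append(phi)
        # Deflation: X_k = X_{k-1} - X_{k-1} phi phi'
        Xk = Xk - Xk @ np.outer(phi, phi)
    return np.column_stack(components)

# Simulated data with clearly separated variances (illustrative only).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 6)) * np.array([5.0, 3.0, 2.0, 1.0, 0.5, 0.1])
Phi = pca_by_deflation(X, 3)
```

Up to sign, the columns of `Phi` should agree with the leading right singular vectors of the centered matrix, which is how PCA is usually computed in practice.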
  10. Properties of functional principal component analysis:

      |                | PCA | FPCA |
      |----------------|-----|------|
      | Variables      | $X = [x_1, \dots, x_p]$, $x_i = [x_{1i}, \dots, x_{ni}]^\top$, $i = 1, \dots, p$ | $f(x) = [f_1(x), \dots, f_n(x)]$, $x \in [x_1, x_p]$ |
      | Data           | Vectors $\in R^p$ | Curves $\in L_2[x_1, x_p]$ |
      | Covariance     | Matrix $V = \mathrm{Cov}(X) \in R^{p \times p}$ | Operator $T$ bounded between $x_1$ and $x_p$, $T: L_2[x_1, x_p] \to L_2[x_1, x_p]$ |
      | Eigenstructure | Vector $\xi_k \in R^p$, $V\xi_k = \lambda_k\xi_k$, for $1 \le k < \min(n, p)$ | Function $\xi_k(x) \in L_2[x_1, x_p]$, $\int_{x_1}^{x_p} T\xi_k(x)\,dx = \lambda_k\xi_k(x)$, for $1 \le k < n$ |
      | Components     | Random variables in $R^p$ | Random variables in $L_2[x_1, x_p]$ |
  13. Bivariate and functional bagplots: (1) Apply robust functional principal component analysis (FPCA) to $\{y_t(x)\}$ and obtain the first two PC scores. (2) Order the bivariate PC scores by Tukey's halfspace location depth and plot them in a bivariate bagplot. (3) Map the features of the bivariate bagplot into the functional space.
  16. Bivariate and functional HDR boxplots: (1) Compute a bivariate kernel density estimate on the first two robust PC scores. (2) Apply the bivariate HDR boxplot. (3) Map the features of the bivariate HDR boxplot into the functional space.
  17. Example of El Niño data: Average monthly sea surface temperatures (in degrees Celsius) from January 1951 to December 2007. [Figure: x-axis: Month, y-axis: Sea surface temperature.]
  18. Rainbow plots ordered by depth and density. [Figure: two panels; x-axis: Month, y-axis: Sea surface temperature.]
  19. Outlier detection by bagplots. [Figure: bivariate bagplots (PC score 1 against PC score 2) and the corresponding functional bagplots for the log mortality rates (x-axis: Age) and the sea surface temperatures (x-axis: Month). Labelled points: 1914-1919, 1940, 1943 and 1944 for mortality; 1982, 1983, 1997 and 1998 for sea surface temperature.]
  20. Outlier detection by HDR boxplots. [Figure: bivariate HDR boxplots (PC score 1 against PC score 2) and the corresponding functional HDR boxplots for the log mortality rates (x-axis: Age) and the sea surface temperatures (x-axis: Month). Labelled points: 1914-1919, 1940, 1943 and 1944 for mortality; 1982, 1983, 1997 and 1998 for sea surface temperature.]
  21. Other outlier detection methods: (1) The functional depth method uses the notion of functional depth and calculates a likelihood ratio test statistic for each curve. (2) A curve is an outlier if the maximum of the test statistics exceeds a given critical value. (3) The outlier is removed and the remaining data are tested again.
  22. Integrated squared error: (1) This method uses robust FPCA. The integrated squared error for each curve is $\int_{x_1}^{x_p} \hat{e}_t^2(x)\,dx = \int_{x_1}^{x_p} \big[y_t(x) - \hat{\mu}(x) - \sum_{k=1}^{K}\hat{\beta}_{t,k}\hat{\phi}_k(x)\big]^2 dx$. (2) A high integrated squared error indicates a high likelihood that the curve is an outlier.
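The integrated squared error on slide 22 can be approximated on a grid as in the sketch below. Note the assumptions: ordinary (non-robust) PCA via the SVD is swapped in for the robust FPCA of the slides, the integral is approximated with the trapezoidal rule, and the function name and simulated curves are hypothetical.

```python
import numpy as np

def integrated_squared_error(Y, x, K):
    """ISE of each row of Y (n curves on grid x) after removing K components."""
    mu = Y.mean(axis=0)
    Yc = Y - mu
    # Assumption: plain SVD-based PCA stands in for robust FPCA.
    _, _, Vt = np.linalg.svd(Yc, full_matrices=False)
    Phi = Vt[:K].T                              # grid values of the K components
    resid = Yc - (Yc @ Phi) @ Phi.T             # e_t(x) = y_t - mu - sum_k beta_tk phi_k
    sq = resid ** 2
    # Trapezoidal rule over the grid x.
    return np.sum((sq[:, 1:] + sq[:, :-1]) / 2 * np.diff(x), axis=1)

# Simulated example: 20 curves sharing one shape, plus one aberrant curve.
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
Y = np.outer(5.0 * rng.normal(size=20), np.sin(2 * np.pi * x))
Y[7] += 3.0 * np.cos(6 * np.pi * x)
ise = integrated_squared_error(Y, x, K=1)
```

In this simulated setting the aberrant curve (row 7) has by far the largest integrated squared error.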
  23. Robust Mahalanobis distance method: (1) Discretize the functional data on an equally spaced dense grid. (2) The squared robust Mahalanobis distance is defined by $r_t = [y_t(x_i) - \hat{\mu}(x_i)]^\top \hat{\Sigma}^{-1} [y_t(x_i) - \hat{\mu}(x_i)]$, $i = 1, \dots, p$, $t = 1, \dots, n$. (3) Outliers have squared robust Mahalanobis distances greater than $\chi^2_{0.99,p}$.
  24. Outlier detection comparison of mortality data:

      | Method                      | Outliers detected          |
      |-----------------------------|----------------------------|
      | Functional depth            | None                       |
      | Integrated squared error    | 1914–1918, 1940, 1943–1945 |
      | Functional bagplot          | 1914–1919, 1940, 1943–1944 |
      | Functional HDR boxplot      | 1914–1919, 1940, 1943–1944 |
      | Robust Mahalanobis distance | 1914–1918, 1940, 1944      |

      Table: The outliers are 1914–1919, 1940, 1943–1944.
  25. Outlier detection comparison of El Niño data:

      | Method                      | Outliers detected          |
      |-----------------------------|----------------------------|
      | Functional depth            | 1983, 1997                 |
      | Integrated squared error    | 1973, 1982–1983, 1997–1998 |
      | Functional bagplot          | 1982–1983, 1997–1998       |
      | Functional HDR boxplot      | 1982–1983, 1997–1998       |
      | Robust Mahalanobis distance | 1982–1983, 1997–1998       |

      Table: The outliers are 1982–1983, 1997–1998.
  26. Conclusion of the first paper: (1) Three graphical methods for visualizing functional data. (2) Functional bagplots and HDR boxplots can detect outliers. (3) One limitation is that only the first two principal component scores are considered. (4) The probability of outliers needs to be chosen in advance.
  27. Possible extensions: (1) FPCA can be replaced by other dimension reduction techniques. (2) Other ways of ordering functional data, or of determining the functional median or mode, can be used. (3) Tukey's location depth can be replaced by other depth measures. (4) The methods can be extended from two-dimensional curves to three-dimensional images.
  28. Aim of the second paper: (1) A new functional data analytic tool for forecasting age-specific mortality and fertility rates. (2) Mortality rate forecasting is vital for planning insurance and pension policies. (3) Fertility rate forecasting is important for planning child care policy.
  29. Australian fertility data set: Annual Australian fertility rates (1921-2006) for age groups from 15 to 49. These are defined as the number of live births during the calendar year, according to the age of the mother, per 1000 of the female resident population of the same age at 30 June. [Figure: Australia fertility rate (1921-2006); x-axis: Age, y-axis: Fertility rate.]
  30. French female mortality data set: Annual French female mortality rates (1899-2005) for single years of age. These are simply the ratio of death counts to population exposure in the relevant interval of age and time. [Figure: France, female log mortality rate (1899-2005); x-axis: Age, y-axis: Log mortality rate.]
  31. Modeling step: (1) Smooth the data for each year using a nonparametric smoothing method to obtain the estimate $\hat{f}_t(x)$ for $x \in [x_1, x_p]$ from $\{x_i, y_t(x_i)\}$, $i = 1, 2, \dots, p$. (2) Decompose the realized curves via FPCA: $y_t(x) = \hat{\mu}(x) + \sum_{k=1}^{K}\hat{\beta}_{t,k}\hat{\phi}_k(x) + \hat{e}_t(x) + \sigma_t(x)\eta_t$ (1), where $\hat{\mu}(x)$ is the mean function; $\{\hat{\phi}_1(x), \dots, \hat{\phi}_K(x)\}$ are the functional principal components, which are assumed to be fixed; $\{\hat{\beta}_{t,1}, \dots, \hat{\beta}_{t,K}\}$ are the uncorrelated principal component scores satisfying $\sum_{k=1}^{K}\hat{\beta}_{t,k}^2 < \infty$; $\hat{e}_t(x)$ is the estimated model residual function; $\sigma_t(x)\eta_t$ takes heterogeneity into account, with $\eta_t \sim N(0, 1)$; and $K$ is the number of functional principal components.
  32. Forecasting step: (1) Model and forecast the coefficients $\{\hat{\beta}_{1,k}, \dots, \hat{\beta}_{n,k}\}$, $k = 1, \dots, K$, using univariate time series models. (2) Use the forecast coefficients with (1) to obtain forecasts of $f_{n+h}(x)$, where $h$ is the forecast horizon. (3) The estimated variances of the error terms in (1) are used to compute prediction intervals.
  33. Weighted mean function: (1) The mean function $\mu(x)$ is estimated by a weighted average $\hat{\mu}^*(x) = \sum_{t=1}^{n} w_t\hat{f}_t(x)$, where $\hat{f}_t(x)$ is the smoothed curve estimated from $y_t(x)$, and $w_t = \kappa(1-\kappa)^{n-t}$ is a geometrically decreasing weight with $0 < \kappa < 1$. (2) $\hat{f}_t^*(x) = \hat{f}_t(x) - \hat{\mu}^*(x)$ are the de-centralized functional curves; let $G = W\hat{f}^*(x)$, where $W = \mathrm{diag}(w_1, \dots, w_n)$ is a diagonal weight matrix. (3) Apply the singular value decomposition $G = UDV^\top$, where $\hat{\phi}_k(x_i^*)$ is the $(i, k)$th element of $V$.
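The three steps on slide 33 can be sketched in NumPy as below. This is a minimal illustration on already-smoothed curves stored row-wise on a common grid; the function name `weighted_fpca`, the simulated input, and the choice to compute scores by projecting the centered curves onto the components are assumptions, not details given in the slides.

```python
import numpy as np

def weighted_fpca(F, kappa, K):
    """Weighted mean and weighted principal components of curves F (n x p)."""
    n = F.shape[0]
    # w_t = kappa * (1 - kappa)^(n - t) for t = 1..n: the latest curve weighs most.
    w = kappa * (1.0 - kappa) ** (n - 1 - np.arange(n))
    mu_star = w @ F                     # weighted mean function on the grid
    F_star = F - mu_star                # de-centralized curves
    G = w[:, None] * F_star             # G = W f*, with W = diag(w_1, ..., w_n)
    _, _, Vt = np.linalg.svd(G, full_matrices=False)
    Phi = Vt[:K].T                      # (i, k)th element: component k at grid point i
    scores = F_star @ Phi               # assumption: scores by projection
    return w, mu_star, Phi, scores

rng = np.random.default_rng(0)
F = rng.normal(size=(30, 10))           # simulated "smoothed" curves
w, mu_star, Phi, scores = weighted_fpca(F, kappa=0.1, K=2)
```

As written on the slide, the weights sum to $1 - (1-\kappa)^n$ rather than exactly one, so for small $n$ a renormalization may be worth considering.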
  34. Weighted functional principal components: (1) The weighted functional principal component decomposition is $y_t(x) = \hat{\mu}^*(x) + \sum_{k=1}^{K}\hat{\beta}_{t,k}\hat{\phi}_k^*(x) + \hat{e}_t(x) + \sigma_t(x)\eta_t$. (2) Since the scores $\{\hat{\beta}_{t,1}, \dots, \hat{\beta}_{t,K}\}$ are uncorrelated, they can be forecast using a univariate time series model. (3) Conditioning on the observations $I$ and the set of fixed weighted functional principal components $\Phi^* = \{\hat{\phi}_1^*(x), \dots, \hat{\phi}_K^*(x)\}$, the $h$-step-ahead forecast of $y_{n+h}(x)$ is $\hat{y}_{n+h|n}(x) = E[y_{n+h}(x) \mid I, \Phi^*] = \hat{\mu}^*(x) + \sum_{k=1}^{K}\hat{\beta}_{n+h|n,k}\hat{\phi}_k^*(x)$, where $\hat{\beta}_{n+h|n,k}$ denotes the $h$-step-ahead forecast of $\beta_{n+h,k}$.
  35. Selection of weight parameter: $\kappa$ can be determined by minimizing the mean integrated squared forecast error (MISFE), $\mathrm{MISFE}(h) = \int_{x_1}^{x_p}\big[y_{n+h}(x) - \hat{y}_{n+h|n}(x)\big]^2 dx$, over a grid of candidate values of $\kappa$.
  36. Selection of number of components: The optimal number of components is determined by minimizing the MISFE.
  37. Australian fertility rates:

      | K | FPCA    | FPCAw   |
      |---|---------|---------|
      | 1 | 99.0611 | 16.7304 |
      | 2 | 56.3095 | 3.3019  |
      | 3 | 24.9330 | 3.2580  |
      | 4 | 15.6845 | 3.1995  |
      | 5 | 4.4495  | 3.2132  |
      | 6 | 3.4310  | 3.2123  |

      RW (a single value, not indexed by K): 4.9800.
      Table: MSE: Australian fertility rates.
  38. French female mortality rates:

      | K | FPCA   | FPCAw  |
      |---|--------|--------|
      | 1 | 0.5956 | 0.0293 |
      | 2 | 0.0537 | 0.0310 |
      | 3 | 0.0316 | 0.0310 |
      | 4 | 0.0296 | 0.0311 |
      | 5 | 0.0287 | 0.0311 |
      | 6 | 0.0425 | 0.0311 |

      RW (a single value, not indexed by K): 0.0437.
      Table: MSE (×1000): French female log mortality rates.
  39. Conclusion of the second paper: (1) Proposed a weighted FPCA to forecast age-specific fertility and mortality rates. (2) Compared point forecast accuracy between the unweighted and weighted FPCA. (3) The weighting idea can be extended to other dimension reduction techniques, such as functional partial least squares regression.
  40. Aim of the third paper: (1) Sea surface temperature (SST) is rising. (2) Rising sea surface temperatures increase the intensity of natural disasters, such as hurricanes and storms. (3) Provide a better way, one that is both multivariate and nonparametric, of modeling and predicting sea surface temperature.
  41. El Niño data set: (1) Average monthly sea surface temperatures from January 1950 to December 2008, available online at www.cpc.noaa.gov/data/indices/sstoi.indices. (2) Sea surface temperatures are measured by moored buoys in the "Niño region" defined by the coordinates 0-10° South and 90-80° West.
  42. Univariate graphical display. [Figure: sea surface temperature as a single time series, 1950 to 2010; x-axis: Month, y-axis: Sea surface temperature.]
  43. Functional graphical display. [Figure: one curve per year; x-axis: Month, y-axis: Sea surface temperature.]
  44. Functional time series analysis: Let $\{Z_w, w \in [1, N]\}$ be a seasonal time series observed at $N$ equispaced times. For unequally spaced data, smoothing methods may be applied. The observed time series $\{Z_1, \dots, Z_{708}\}$ is divided into 59 successive paths of length 12, $y_t(x) = \{Z_w, w \in (p(t-1), pt]\}$, for $t = 1, \dots, 59$, with path length $p = 12$ and $x = 1, \dots, 12$. The aim is to forecast future curves $y_{n+h}(x)$, $h > 0$, from the observed data.
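The slicing on slide 44 amounts to reshaping the monthly series into one row per year. The sketch below uses a placeholder series 1, ..., 708 instead of the real sea surface temperatures, purely to make the indexing visible; the variable names are illustrative.

```python
import numpy as np

p = 12                                   # path length (months per year)
Z = np.arange(1, 709, dtype=float)       # placeholder for Z_1, ..., Z_708
Y = Z.reshape(-1, p)                     # row t-1 holds Z_w for w in (p(t-1), pt]
n_years = Y.shape[0]
```

Row `Y[t - 1]` then plays the role of the curve $y_t(x)$ observed at $x = 1, \dots, 12$.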
  45. FPCA: (1) Decompose a complete (12 × 59) data matrix, $y(x) = [y_1(x), \dots, y_n(x)]$, into a number of functional principal components and their uncorrelated scores. (2) The FPCA decomposition can be written as $y_t(x) = \hat{\mu}(x) + \sum_{k=1}^{K}\hat{\beta}_{t,k}\hat{\phi}_k(x) + \hat{\varepsilon}_t(x)$ (2).
  46. Functional principal component regression: Conditioning on the historical curves $I$ and the fixed functional principal components $\Phi = \{\hat{\phi}_1(x), \dots, \hat{\phi}_K(x)\}$, the forecast curves are $\hat{y}^{TS}_{n+h|n}(x) = E[y_{n+h}(x) \mid I, \Phi] = \hat{\mu}(x) + \sum_{k=1}^{K}\hat{\beta}_{n+h|n,k}\hat{\phi}_k(x)$ (3), where $\hat{\beta}_{n+h|n,k}$ denotes the $h$-step-ahead forecast of $\beta_{n+h,k}$. Hereafter, we refer to this method as the time series (TS) method.
  47. Problem statement: (1) When we observe the most recent data points, consisting of the first $m_0$ time periods of $y_{n+1}(x)$ and denoted by $y_{n+1}(x_e) = [y_{n+1}(x_1), \dots, y_{n+1}(x_{m_0})]^\top$, we want to update the forecasts for the remaining time periods of year $n+1$, denoted by $y_{n+1}(x_l) = [y_{n+1}(x_{m_0+1}), \dots, y_{n+1}(x_{12})]^\top$. (2) Using (3), the TS forecast of $y_{n+1}(x_l)$ is $\hat{y}^{TS}_{n+1|n}(x_l) = E[y_{n+1}(x_l) \mid I_l, \Phi_l] = \hat{\mu}(x_l) + \sum_{k=1}^{K}\hat{\beta}^{TS}_{n+1|n,k}\hat{\phi}_k(x_l)$. (3) The TS method does not take any new observations into account. (4) We introduce four dynamic updating methods and compare their point forecast performance.
  48. Block moving (BM): (1) The BM method treats the most recent data as the last observation in a complete data matrix. (2) Because time is a continuous variable, a complete data matrix can be observed over any given time interval. (3) The TS method can then be applied, at the cost of sacrificing a number of data points in the first year.
  53. Ordinary least squares (OLS) regression: (1) Denote by $F_e$ the $m_0 \times K$ matrix whose $(j, k)$th entry is $\hat{\phi}_{j,k}$ for $1 \le j \le m_0$, $1 \le k \le K$. (2) Let $\hat{\beta}_{n+1} = [\hat{\beta}_{n+1,1}, \dots, \hat{\beta}_{n+1,K}]^\top$ be a $K \times 1$ vector, and $\hat{\varepsilon}_{n+1}(x_e) = [\hat{\varepsilon}_{n+1}(x_1), \dots, \hat{\varepsilon}_{n+1}(x_{m_0})]^\top$ be an $m_0 \times 1$ vector. (3) As the mean-adjusted $\hat{y}^*_{n+1}(x_e) = y_{n+1}(x_e) - \hat{\mu}(x_e)$ becomes available, we have the OLS regression $\hat{y}^*_{n+1}(x_e) = F_e\hat{\beta}_{n+1} + \hat{\varepsilon}_{n+1}(x_e)$. (4) Via OLS, $\hat{\beta}^{OLS}_{n+1} = (F_e^\top F_e)^{-1}F_e^\top\hat{y}^*_{n+1}(x_e)$. (5) The OLS forecast of $y_{n+1}(x_l)$ is given by $\hat{y}^{OLS}_{n+1|n}(x_l) = E[y_{n+1}(x_l) \mid I_l, \Phi_l] = \hat{\mu}(x_l) + \sum_{k=1}^{K}\hat{\beta}^{OLS}_{n+1,k}\hat{\phi}_k(x_l)$.
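The OLS update can be sketched as follows, assuming the mean function and the principal components have already been estimated and are stored on the 12-point monthly grid. The function name `ols_update` and the simulated inputs are hypothetical; `np.linalg.lstsq` is used in place of the explicit inverse for numerical stability.

```python
import numpy as np

def ols_update(y_early, mu, Phi):
    """Update the forecast of the remaining months from the first m0 observations."""
    m0 = len(y_early)
    Fe = Phi[:m0]                                   # m0 x K block of the components
    y_star = y_early - mu[:m0]                      # mean-adjusted partial curve
    beta, *_ = np.linalg.lstsq(Fe, y_star, rcond=None)   # solves the normal equations
    forecast = mu[m0:] + Phi[m0:] @ beta            # mu(x_l) + sum_k beta_k phi_k(x_l)
    return beta, forecast

# Simulated check: a noise-free curve built from K = 3 components.
rng = np.random.default_rng(1)
Phi, _ = np.linalg.qr(rng.normal(size=(12, 3)))     # orthonormal columns
mu = np.linspace(20.0, 26.0, 12)
beta_true = np.array([1.5, -0.7, 0.3])
y_new = mu + Phi @ beta_true
beta_ols, forecast_ols = ols_update(y_new[:6], mu, Phi)
```

With noise-free data and $m_0 \ge K$, the update recovers the true scores exactly; with noisy data and $m_0$ close to $K$, the estimates can become unstable, which motivates the penalized variants.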
  56. Ridge regression (RR): (1) RR penalizes the OLS coefficients that deviate from 0. The RR coefficients minimize a penalized residual sum of squares, $\arg\min_{\hat{\beta}_{n+1}}\{(\hat{y}^*_{n+1}(x_e) - F_e\hat{\beta}_{n+1})^\top(\hat{y}^*_{n+1}(x_e) - F_e\hat{\beta}_{n+1}) + \lambda\hat{\beta}_{n+1}^\top\hat{\beta}_{n+1}\}$. (2) Taking the derivative with respect to $\hat{\beta}_{n+1}$ and setting it to zero gives $\hat{\beta}^{RR}_{n+1} = (F_e^\top F_e + \lambda I)^{-1}F_e^\top\hat{y}^*_{n+1}(x_e)$. (3) The RR forecast of $y_{n+1}(x_l)$ is $\hat{y}^{RR}_{n+1}(x_l) = E[y_{n+1}(x_l) \mid I_l, \Phi_l] = \hat{\mu}(x_l) + \sum_{k=1}^{K}\hat{\beta}^{RR}_{n+1,k}\hat{\phi}_k(x_l)$.
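A minimal sketch of the ridge update follows, under the same assumptions as before: the mean and components are given on the 12-point grid, and `ridge_update` and the simulated inputs are hypothetical names introduced for illustration.

```python
import numpy as np

def ridge_update(y_early, mu, Phi, lam):
    """Ridge estimate of the scores from the first m0 observations."""
    m0 = len(y_early)
    Fe = Phi[:m0]
    y_star = y_early - mu[:m0]
    K = Phi.shape[1]
    # beta_RR = (Fe' Fe + lam I)^{-1} Fe' y*
    beta = np.linalg.solve(Fe.T @ Fe + lam * np.eye(K), Fe.T @ y_star)
    return beta, mu[m0:] + Phi[m0:] @ beta

rng = np.random.default_rng(1)
Phi, _ = np.linalg.qr(rng.normal(size=(12, 3)))
mu = np.linspace(20.0, 26.0, 12)
beta_true = np.array([1.5, -0.7, 0.3])
y_new = mu + Phi @ beta_true
beta_rr0, _ = ridge_update(y_new[:6], mu, Phi, lam=0.0)   # reduces to OLS
beta_rr1, _ = ridge_update(y_new[:6], mu, Phi, lam=1.0)   # shrunk towards zero
```

Setting `lam=0.0` gives back the OLS solution, while larger values shrink the scores towards zero.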
  61. Penalized least squares (PLS) regression: (1) The OLS method needs a sufficient number of observations ($\ge K$) in order for $\hat{\beta}^{OLS}_{n+1}$ to be numerically stable. (2) The $\hat{\beta}_{n+1}$ obtained from the PLS method minimizes $(\hat{y}^*_{n+1}(x_e) - F_e\hat{\beta}_{n+1})^\top(\hat{y}^*_{n+1}(x_e) - F_e\hat{\beta}_{n+1}) + \lambda(\hat{\beta}_{n+1} - \hat{\beta}^{TS}_{n+1|n})^\top(\hat{\beta}_{n+1} - \hat{\beta}^{TS}_{n+1|n})$. (3) Taking the first derivative with respect to $\hat{\beta}_{n+1}$ and setting it to zero gives $\hat{\beta}^{PLS}_{n+1} = (F_e^\top F_e + \lambda I)^{-1}(F_e^\top\hat{y}^*_{n+1}(x_e) + \lambda\hat{\beta}^{TS}_{n+1|n})$ (4). (4) The PLS forecast is a weighted average of the TS and OLS forecasts, with the weighting controlled by the penalty parameter $\lambda$. (5) The PLS forecast of $y_{n+1}(x_l)$ is given by $\hat{y}^{PLS}_{n+1}(x_l) = E[y_{n+1}(x_l) \mid I_l, \Phi_l] = \hat{\mu}(x_l) + \sum_{k=1}^{K}\hat{\beta}^{PLS}_{n+1,k}\hat{\phi}_k(x_l)$.
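Equation (4) can be sketched in NumPy as below, which also illustrates the "weighted average" remark through the two limiting cases of $\lambda$. The function name `pls_update`, the simulated curve, and the stand-in TS score forecast `beta_ts` are assumptions made for illustration.

```python
import numpy as np

def pls_update(y_early, mu, Phi, beta_ts, lam):
    """Penalized least squares update, shrinking towards the TS score forecast."""
    m0 = len(y_early)
    Fe = Phi[:m0]
    y_star = y_early - mu[:m0]
    K = Phi.shape[1]
    # beta_PLS = (Fe' Fe + lam I)^{-1} (Fe' y* + lam beta_TS), equation (4)
    beta = np.linalg.solve(Fe.T @ Fe + lam * np.eye(K), Fe.T @ y_star + lam * beta_ts)
    return beta, mu[m0:] + Phi[m0:] @ beta

rng = np.random.default_rng(1)
Phi, _ = np.linalg.qr(rng.normal(size=(12, 3)))
mu = np.linspace(20.0, 26.0, 12)
beta_true = np.array([1.5, -0.7, 0.3])
y_new = mu + Phi @ beta_true
beta_ts = np.zeros(3)                                  # hypothetical TS forecast of the scores
beta_small, _ = pls_update(y_new[:6], mu, Phi, beta_ts, lam=1e-10)  # close to OLS
beta_large, _ = pls_update(y_new[:6], mu, Phi, beta_ts, lam=1e10)   # close to TS
```

A very small penalty reproduces the OLS scores and a very large penalty returns the TS scores, so intermediate values of $\lambda$ interpolate between the two forecasts.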
  62. Penalty parameter selection: Split the data into a training set, consisting of (1) a training sample (SST from 1950 to 1970) and (2) a validation sample (SST from 1971 to 1992), and a testing set (SST from 1993 to 2007). The optimal penalty parameters $\lambda$ for different updating periods are determined by minimizing the mean absolute error (MAE), $\mathrm{MAE} = \frac{1}{hp}\sum_{j=1}^{h}\sum_{i=1}^{p}|y_{n+j}(x_i) - \hat{y}_{n+j}(x_i)|$, over a grid of candidates (from $10^{-6}$ to $10^{6}$ in steps of 0.0001).
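The selection rule can be sketched as a plain grid search over candidate penalties. The helper names `mae` and `select_penalty`, the toy forecast function, and the short candidate grid are hypothetical; a real application would plug in the validation-sample forecasts of the chosen updating method.

```python
import numpy as np

def mae(actual, forecast):
    """Mean absolute error over all h forecast years and p grid points."""
    return float(np.mean(np.abs(np.asarray(actual) - np.asarray(forecast))))

def select_penalty(candidates, forecast_fn, actual):
    """Return the candidate lambda whose forecasts minimize the MAE."""
    errors = [mae(actual, forecast_fn(lam)) for lam in candidates]
    return candidates[int(np.argmin(errors))]

# Toy example: forecasts are exact at lam = 1.0 and biased otherwise.
actual = np.array([[24.0, 25.0, 26.0], [23.0, 24.5, 25.5]])
candidates = [0.01, 0.1, 1.0, 10.0]
best_lam = select_penalty(candidates, lambda lam: actual + (lam - 1.0), actual)
```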
  63. Component selection: Using the data in the training set, select the number of components by minimizing the MAE within the validation sample. The optimal number of components is $K = 5$.
  64. Some benchmark forecasting methods: (1) The mean predictor (MP) method predicts the values in year $n+1$ by the empirical mean from the first year to the $n$th year. (2) The random walk (RW) method predicts the values in year $n+1$ by the observations in year $n$. (3) The seasonal autoregressive integrated moving average (SARIMA) model is a benchmark method for forecasting seasonal univariate time series. It requires specifying the orders of the seasonal and non-seasonal components of an ARIMA model; the automatic algorithm of Hyndman and Khandakar (2008) is used to select the optimal orders.
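The two simplest benchmarks can be written directly on the year-by-month data matrix. The function names and the tiny example matrix below are illustrative only.

```python
import numpy as np

def mean_predictor(Y):
    """MP: forecast year n+1 by the month-wise mean of years 1..n."""
    return Y.mean(axis=0)

def random_walk(Y):
    """RW: forecast year n+1 by the observations of year n."""
    return Y[-1].copy()

# Three "years" of three "months" each, for illustration.
Y = np.array([[20.0, 22.0, 24.0],
              [21.0, 23.0, 25.0],
              [25.0, 24.0, 26.0]])
mp_forecast = mean_predictor(Y)
rw_forecast = random_walk(Y)
```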
  65. Point forecast comparison (MP, RW, SARIMA and TS are non-dynamic updating methods; OLS, Block, PLS and RR are dynamic updating methods):

      | Update  | MP   | RW   | SARIMA | TS   | OLS  | Block | PLS  | RR   |
      |---------|------|------|--------|------|------|-------|------|------|
      | Mar-Dec | 0.72 | 0.86 | 0.96   | 0.73 | 0.72 | 0.70  | 0.67 | 0.76 |
      | Apr-Dec | 0.73 | 0.87 | 0.98   | 0.74 | 0.69 | 0.73  | 0.68 | 0.65 |
      | May-Dec | 0.71 | 0.86 | 0.88   | 0.71 | 0.94 | 0.71  | 0.68 | 0.62 |
      | Jun-Dec | 0.71 | 0.84 | 0.86   | 0.71 | 1.07 | 0.70  | 0.66 | 0.58 |
      | Jul-Dec | 0.72 | 0.87 | 0.86   | 0.73 | 0.94 | 0.68  | 0.60 | 0.57 |
      | Aug-Dec | 0.71 | 0.91 | 0.84   | 0.74 | 0.94 | 0.69  | 0.63 | 0.62 |
      | Sep-Dec | 0.71 | 0.93 | 0.84   | 0.74 | 1.03 | 0.70  | 0.65 | 0.64 |
      | Oct-Dec | 0.72 | 0.96 | 0.57   | 0.78 | 0.69 | 0.74  | 0.71 | 0.64 |
      | Nov-Dec | 0.72 | 0.92 | 0.52   | 0.79 | 0.25 | 0.75  | 0.58 | 0.24 |
      | Dec     | 0.64 | 0.83 | 0.21   | 0.71 | 0.29 | 0.59  | 0.23 | 0.29 |
      | Mean    | 0.71 | 0.88 | 0.75   | 0.74 | 0.76 | 0.70  | 0.61 | 0.56 |

      Table: MAE of the point forecasts using different methods.
  66. Parametric prediction intervals: (1) Based on orthogonality and linear additivity, the total forecast variance is approximated by the sum of the individual variances, $\hat{\xi}_{n+h|n} = \mathrm{Var}[y_{n+h} \mid I, \Phi] \approx \sum_{k=1}^{K}\hat{\eta}_{n+h|n,k}\hat{\phi}_k^2(x) + \hat{v}_{n+h}$, where $\hat{\eta}_{n+h|n,k} = \mathrm{Var}(\beta_{n+h,k} \mid \hat{\beta}_{1,k}, \dots, \hat{\beta}_{n,k})$ is obtained from a time series model, and $\hat{v}_{n+h}$ is estimated by averaging the squared model residuals $\hat{\varepsilon}_t^2(x)$ in (2) for each $x$ variable. (2) Under normality, the $(1-\alpha)$ prediction intervals for $y_{n+h}(x)$ are $\hat{y}_{n+h|n}(x) \pm z_\alpha(\hat{\xi}_{n+h|n})^{1/2}$, where $z_\alpha$ is the $(1-\alpha/2)$ standard normal quantile.
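Given the score forecast variances and a residual variance, the interval on slide 66 is a direct computation. In the sketch below, the function name `parametric_interval` and all numerical inputs are made up for illustration; the normal quantile comes from Python's standard `statistics.NormalDist`.

```python
import numpy as np
from statistics import NormalDist

def parametric_interval(y_hat, Phi, eta, v, alpha=0.05):
    """(1 - alpha) interval: y_hat +/- z * sqrt(sum_k eta_k phi_k^2(x) + v)."""
    xi = (Phi ** 2) @ eta + v                  # total forecast variance per grid point
    z = NormalDist().inv_cdf(1 - alpha / 2)    # (1 - alpha/2) standard normal quantile
    half_width = z * np.sqrt(xi)
    return y_hat - half_width, y_hat + half_width

# Illustrative inputs: 3 grid points, K = 2 components.
Phi = np.ones((3, 2))
eta = np.array([1.0, 3.0])                     # forecast variances of the two scores
y_hat = np.array([24.0, 25.0, 26.0])
lower, upper = parametric_interval(y_hat, Phi, eta, v=0.0, alpha=0.05)
```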
  71. 71. Nonparametric prediction intervals
      1. The h-step-ahead forecast errors of the principal component scores are
         $$\hat{\pi}_{t,h,k} = \hat{\beta}_{t,k} - \hat{\beta}_{t|t-h,k}, \quad t = h+1, \dots, n,$$
         where $h < n - 1$.
      2. By sampling with replacement, obtain bootstrap samples of $\beta_{n+h,k}$:
         $$\hat{\beta}^{b,TS}_{n+h|n,k} = \hat{\beta}^{TS}_{n+h|n,k} + \hat{\pi}^{b}_{*,h,k}, \quad b = 1, \dots, B.$$
      3. Since the residuals $\{\hat{\varepsilon}_1(x), \dots, \hat{\varepsilon}_n(x)\}$ are uncorrelated with the principal components, bootstrap the model residual term $\hat{\varepsilon}^{b}_{n+h|n}(x)$ by iid sampling.
      4. Based on orthogonality and linear additivity, obtain B forecast variants of $y_{n+h|n}(x)$:
         $$\hat{y}^{b}_{n+h|n}(x) = \hat{\mu}(x) + \sum_{k=1}^{K} \hat{\beta}^{b,TS}_{n+h|n,k} \hat{\phi}_k(x) + \hat{\varepsilon}^{b}_{n+h|n}(x).$$
      5. The $(1 - \alpha)$ prediction intervals are quantiles of $\hat{y}^{b}_{n+h|n}(x)$.
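The five steps above can be sketched in a few lines of Python for a single grid point x. This is a minimal illustration, not the authors' implementation: the function and argument names are hypothetical, the inputs (mean, basis values, score forecasts, score forecast errors, residuals) are assumed to have been estimated already, and the interval is assumed to be equal-tailed, using the α/2 and 1 − α/2 quantiles.

```python
import random


def quantile(sorted_vals, q):
    """Empirical quantile of an already sorted list (linear interpolation)."""
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def bootstrap_interval(mu_x, phi_x, beta_fc, score_errors, residuals_x,
                       B=1000, alpha=0.05, seed=0):
    """Bootstrap prediction interval at one grid point x (hypothetical helper).

    mu_x         : estimated mean function at x
    phi_x        : K estimated principal component values at x
    beta_fc      : K h-step-ahead score forecasts
    score_errors : K lists of in-sample h-step-ahead score forecast errors
    residuals_x  : in-sample model residuals at x
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(B):
        y_b = mu_x
        for k in range(len(phi_x)):
            # Step 2: perturb the score forecast with a resampled forecast error.
            beta_b = beta_fc[k] + rng.choice(score_errors[k])
            y_b += beta_b * phi_x[k]
        # Step 3: add an iid resampled model residual.
        y_b += rng.choice(residuals_x)
        # Step 4: y_b is one bootstrap forecast variant.
        draws.append(y_b)
    draws.sort()
    # Step 5: equal-tailed quantiles (an assumption of this sketch).
    return quantile(draws, alpha / 2), quantile(draws, 1 - alpha / 2)
```

Calling `bootstrap_interval` once per grid point gives pointwise lower and upper bounds for the forecast curve. A fixed seed is used only to make the sketch reproducible.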
  72. 72. Distributional forecast updating
      1. By sampling with replacement, obtain bootstrap samples of $\beta_{n+1,k}$ for year n + 1:
         $$\hat{\beta}^{b,TS}_{n+1|n,k} = \hat{\beta}^{TS}_{n+1|n,k} + \hat{\pi}^{b}_{*,1,k}, \quad b = 1, \dots, B.$$
      2. The bootstrapped samples $\hat{\beta}^{b,TS}_{n+1|n,k}$ lead to bootstrapped samples $\hat{\beta}^{b,PLS}_{n+1}$ by (4).
      3. From $\hat{\beta}^{b,PLS}_{n+1}$, obtain B replications of
         $$\hat{y}^{b,PLS}_{n+1}(x_l) = \hat{\mu}(x_l) + \sum_{k=1}^{K} \hat{\beta}^{b,PLS}_{n+1,k} \hat{\phi}_k(x_l) + \hat{\varepsilon}_{n+1}(x_l).$$
      4. The $(1 - \alpha)$ prediction intervals are quantiles of $\hat{y}^{b,PLS}_{n+1}(x_l)$.
  73. 73. Distributional forecast measure
      1. The empirical conditional coverage probability is the ratio of the number of ‘future’ samples falling inside the calculated prediction intervals to the number of testing samples:
         $$\text{coverage} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h} I\left(\hat{y}^{lb}_{n+j|n}(x_i) < y_{n+j}(x_i) < \hat{y}^{ub}_{n+j|n}(x_i)\right),$$
         where $I(\cdot)$ is the indicator function.
         Mean coverage probability deviance = average(|empirical coverage − nominal coverage|).
      2. To assess which approach gives narrower prediction intervals, calculate the mean width of the prediction intervals:
         $$\text{Width} = \frac{1}{hp} \sum_{i=1}^{p} \sum_{j=1}^{h} \left|\hat{y}^{ub}_{n+j|n}(x_i) - \hat{y}^{lb}_{n+j|n}(x_i)\right|.$$
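The coverage and width measures can be computed directly from the interval bounds and the held-out observations. The sketch below is illustrative only: the function names and the nested-list layout (one inner list per horizon j, one value per grid point x_i) are assumptions. The deviance is taken as a mean absolute difference, which is also an assumption here, chosen because it is the reading consistent with the MCD values reported in the coverage table.

```python
def coverage_and_width(actual, lower, upper):
    """Empirical coverage and mean interval width (hypothetical helper).

    actual, lower, upper: h lists (one per horizon j), each holding
    p values (one per grid point x_i).
    """
    count = 0
    hits = 0
    total_width = 0.0
    for y_row, lb_row, ub_row in zip(actual, lower, upper):
        for y, lb, ub in zip(y_row, lb_row, ub_row):
            # Indicator I(lb < y < ub): strict inequalities, as in the formula.
            hits += 1 if lb < y < ub else 0
            total_width += abs(ub - lb)
            count += 1
    return hits / count, total_width / count


def mean_coverage_deviance(coverages, nominal):
    """Mean absolute gap between empirical and nominal coverage."""
    return sum(abs(c - nominal) for c in coverages) / len(coverages)
```

For example, passing the empirical coverages of one method across all forecast periods, together with a nominal level of 0.95, to `mean_coverage_deviance` yields a single summary figure for that method.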
  74. 74. Distributional forecast comparison
                   Parametric         Nonparametric
      Period       TS       BM        TS       BM       PLS
      Mar-Dec      97%      98%       97%      97%      95%
      Apr-Dec      97%      98%       97%      97%      95%
      May-Dec      96%      96%       96%      96%      96%
      Jun-Dec      96%      96%       96%      95%      95%
      Jul-Dec      95%      96%       95%      94%      94%
      Aug-Dec      94%      94%       94%      94%      93%
      Sep-Dec      93%      95%       93%      95%      93%
      Oct-Dec      93%      93%       93%      93%      90%
      Nov-Dec      93%      96%       93%      93%      93%
      Dec          93%      100%      93%      93%      93%
      MCD          1.58%    1.88%     1.58%    1.40%    1.49%
      Table: Empirical coverage at nominal = 95%. The smaller the mean coverage probability deviance (MCD), the better the method.
  75. 75. Distributional forecast comparison
                   Parametric         Nonparametric
      Period       TS       BM        TS       BM       PLS
      Mar-Dec      3.65     3.64      3.55     3.51     3.15
      Apr-Dec      3.73     3.73      3.62     3.66     3.21
      May-Dec      3.69     3.69      3.57     3.61     3.21
      Jun-Dec      3.58     3.58      3.47     3.50     3.05
      Jul-Dec      3.47     3.46      3.38     3.41     2.90
      Aug-Dec      3.34     3.33      3.26     3.37     2.61
      Sep-Dec      3.26     3.26      3.19     3.25     2.82
      Oct-Dec      3.27     3.28      3.20     3.23     2.78
      Nov-Dec      3.23     3.24      3.16     3.26     2.69
      Dec          3.19     3.18      3.12     3.30     2.48
      Mean width   3.44     3.44      3.35     3.41     2.89
      Table: Prediction interval width comparison at nominal = 95%.
  76. 76. Conclusion of the third paper
      1. Presented a nonparametric method to forecast univariate seasonal time series.
      2. Showed the importance of dynamic updating for improving point forecast accuracy.
      3. Among all dynamic updating methods, RR turned out to be the best.
      4. Other penalty functions could be examined in both the PLS and RR methods.
  79. 79. Summary of the paper
      1. Proposed three graphical tools for visualizing functional data and identifying functional outliers.
      2. Proposed a weighted functional principal component analysis to model and forecast mortality and fertility.
      3. Applied the functional data analytic approach to model and forecast seasonal univariate time series.
  80. 80. References of three papers
      Hyndman, R. J. and Shang, H. L. (2010) Rainbow plot, bagplot and boxplot for functional data, Journal of Computational and Graphical Statistics, 19(1), 29-45.
      Hyndman, R. J. and Shang, H. L. (2009) Forecasting functional time series (with discussion), Journal of the Korean Statistical Society, 38(3), 199-221.
      Shang, H. L. and Hyndman, R. J. (2011) Nonparametric time series forecasting with dynamic updating, Mathematics and Computers in Simulation, 81(7), 1310-1324.
  81. 81. References of three R packages
      Shang, H. L. and Hyndman, R. J. (2011) rainbow: Rainbow plots, bagplots and boxplots for functional data, R package version 2.3.4, http://CRAN.R-project.org/package=rainbow.
      Shang, H. L. and Hyndman, R. J. (2011) fds: Functional data sets, R package version 1.6, http://CRAN.R-project.org/package=fds.
      Hyndman, R. J. and Shang, H. L. (2011) ftsa: Functional time series analysis, R package version 2.6, http://CRAN.R-project.org/package=ftsa.
  82. 82. Contact details
      Thank you for your attention.
      Contact: HanLin.Shang@monash.edu
