The document provides an overview of Box-Jenkins (ARIMA) methodology for time series modeling and forecasting. It discusses autoregressive (AR), moving average (MA), and autoregressive moving average (ARMA) models. It also covers model identification using autocorrelation (ACF) and partial autocorrelation (PACF) functions, as well as model estimation, checking, selection and forecasting. Examples are provided to illustrate the methodology.
2. ARIMA Models
• The Box-Jenkins methodology refers to a set of
procedures for identifying, fitting and checking
ARIMA models with time series.
• The AR in ARIMA refers to Autoregressive models
• The MA in ARIMA refers to Moving Average
models
• The I in ARIMA stands for Integrated and refers to the
number of times the data are differenced
3. Autoregressive Models
$Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + e_t$,
• where $\phi_0, \phi_1, \dots, \phi_p$ = coefficients to be estimated, $e_t$ = error term, and
p = number of lags
• The number of lags (p) used in the model is a
parameter and its value must be determined
by the user. An autoregressive model with a
lag of two will be denoted as AR(2).
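A minimal sketch of the AR(p) recursion, assuming Gaussian error terms. The function name `simulate_ar`, the `seed` and `burn_in` arguments and the coefficient values are illustrative assumptions, not part of the slides.

```python
import random

def simulate_ar(phi0, phis, n, sigma=1.0, seed=0, burn_in=200):
    """Simulate Y_t = phi0 + sum_i phis[i-1] * Y_{t-i} + e_t with Gaussian e_t."""
    rng = random.Random(seed)
    p = len(phis)
    y = [0.0] * p  # arbitrary start-up values, discarded with the burn-in
    for _ in range(n + burn_in):
        e_t = rng.gauss(0.0, sigma)
        lagged = sum(phis[i] * y[-1 - i] for i in range(p))
        y.append(phi0 + lagged + e_t)
    return y[-n:]

# Hypothetical AR(2) with phi0 = 1.0, phi1 = 0.5, phi2 = 0.3
series = simulate_ar(phi0=1.0, phis=[0.5, 0.3], n=5000)
sample_mean = sum(series) / len(series)
theoretical_mean = 1.0 / (1 - 0.5 - 0.3)  # phi0 / (1 - phi1 - phi2) for a stationary AR(2)
print(sample_mean, theoretical_mean)
```

For a stationary AR model the sample mean should settle near phi0 / (1 - phi1 - ... - phip), which gives a quick sanity check on the simulation.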
4. Moving Average Models
• $Y_t = \mu + e_t - w_1 e_{t-1} - w_2 e_{t-2} - \dots - w_q e_{t-q}$,
• where $\mu$ = constant, $w_1, \dots, w_q$ = coefficients to be estimated, and $e_t$
are the error terms
• The number of error terms used in the model, q,
is a parameter and its value must be determined
by the user. A moving average model with two
error terms will be denoted as MA(2).
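The MA(q) equation can be traced by hand on a few error terms. The sketch below assumes the minus-sign convention of the equation above; the helper name `ma_series` and the numbers are hypothetical.

```python
def ma_series(mu, w, errors):
    """Build Y_t = mu + e_t - w[0]*e_{t-1} - ... - w[q-1]*e_{t-q} from given errors.

    The first q errors only serve as start-up values, so the result has
    len(errors) - q observations.
    """
    q = len(w)
    y = []
    for t in range(q, len(errors)):
        lagged = sum(w[j] * errors[t - 1 - j] for j in range(q))
        y.append(mu + errors[t] - lagged)
    return y

# Hypothetical MA(1) with mu = 10 and w1 = 0.5
errors = [1.0, 2.0, -1.0, 0.5]
ma1 = ma_series(mu=10.0, w=[0.5], errors=errors)
print(ma1)  # [11.5, 8.0, 11.0]
```

Each observation depends only on the current error and the previous q errors, which is why the memory of an MA(q) process ends after q lags.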
5. ARMA models
• Combining AR and MA models, an ARMA(p,q)
model is as follows:
• $Y_t = \phi_0 + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \dots + \phi_p Y_{t-p} + e_t - w_1 e_{t-1} - w_2 e_{t-2} - \dots - w_q e_{t-q}$
6. Differences
• Differences of the time series may be used if it is not
stationary. In some cases, a difference of the
differences may be necessary before a stationary series
is obtained. We use the notation “d” to indicate the
number of times the time series is differenced to
obtain a stationary series.
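As a small illustration of differencing d times, the sketch below uses a hypothetical helper `difference` on made-up data whose second differences are constant.

```python
def difference(series, d=1):
    """Apply first differencing d times: each pass replaces Y_t by Y_t - Y_{t-1}."""
    out = list(series)
    for _ in range(d):
        out = [out[t] - out[t - 1] for t in range(1, len(out))]
    return out

y = [3, 5, 8, 12, 17]          # made-up series with a curved trend
first = difference(y, d=1)     # [2, 3, 4, 5]  still trending
second = difference(y, d=2)    # [1, 1, 1]     constant level
print(first, second)
```

Each pass of differencing shortens the series by one observation, so a series of length n differenced d times has n - d values.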
• ARIMA Notation: ARIMA(p,d,q) = An ARIMA model
with the time series differenced d times as the
response variable with a p-order autoregressive model
mixed with q-order moving average model.
7. Model Identification
• We use the ACF (autocorrelation function) and
PACF (partial autocorrelation function). The ACF
measures the correlation between a time
series and its past values at different time
lags, i.e. corr(Yt, Yt−k), k = 1, 2, ...
• PACF measures the autocorrelation between
Yt and Yt-k, when the effects of other time lags,
1, 2, .., k-1, are removed.
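The two functions can be computed directly from a series. The sketch below is one possible implementation, not the routine used by any particular package: the sample ACF uses the usual sum-of-products estimator, and the PACF is obtained from it with the Durbin-Levinson recursion. The function names are assumptions.

```python
def sample_acf(y, max_lag):
    """Sample autocorrelations r_0..r_max_lag of the series y."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(max_lag + 1)]

def sample_pacf(y, max_lag):
    """Sample partial autocorrelations via the Durbin-Levinson recursion."""
    r = sample_acf(y, max_lag)
    pacf = [1.0]
    phi_prev = []  # coefficients of the AR(k-1) fit from the previous step
    for k in range(1, max_lag + 1):
        num = r[k] - sum(phi_prev[j] * r[k - 1 - j] for j in range(k - 1))
        den = 1.0 - sum(phi_prev[j] * r[j + 1] for j in range(k - 1))
        phi_kk = num / den
        phi = [phi_prev[j] - phi_kk * phi_prev[k - 2 - j] for j in range(k - 1)]
        phi.append(phi_kk)
        pacf.append(phi_kk)
        phi_prev = phi
    return pacf

y = [1.0, 2.0, 3.0, 4.0, 5.0]  # tiny made-up series, small enough to check by hand
acf = sample_acf(y, 2)
pacf = sample_pacf(y, 2)
print(acf, pacf)
```

For this series the lag-1 autocorrelation is 0.4 and the lag-2 autocorrelation is -0.1, so the lag-2 partial autocorrelation is (-0.1 - 0.4^2) / (1 - 0.4^2), the lag-2 correlation after removing the effect of lag 1.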
11. MA(2): $Y_t = \mu + e_t - w_1 e_{t-1} - w_2 e_{t-2}$
18.05.2023
[Figure: ACF and PACF plots against lag k, vertical axis from -1 to 1]
12. ARMA(1,1): $Y_t = \phi_0 + \phi_1 Y_{t-1} + e_t - w_1 e_{t-1}$
[Figure: autocorrelation and partial autocorrelation plots against lag k, vertical axis from -1 to 1]
13. ARMA(1,1): $Y_t = \phi_0 + \phi_1 Y_{t-1} + e_t - w_1 e_{t-1}$
[Figure: autocorrelation and partial autocorrelation plots against lag k, vertical axis from -1 to 1]
14. AR, MA or ARMA?
Model       Autocorrelations                      Partial autocorrelations
MA(q)       Cut off after the order q             Die out
AR(p)       Die out                               Cut off after the order p
ARMA(p,q)   Die out                               Die out
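The "cut off" versus "die out" patterns can be checked against theoretical autocorrelations. The sketch below assumes the MA sign convention used earlier in the slides and uses hypothetical coefficient values; the function names are illustrative.

```python
def ma_acf(w, max_lag):
    """Theoretical ACF of Y_t = mu + e_t - w[0]*e_{t-1} - ... - w[q-1]*e_{t-q}."""
    psi = [1.0] + [-wj for wj in w]          # weights on e_t, e_{t-1}, ..., e_{t-q}
    gamma0 = sum(c * c for c in psi)
    rho = []
    for k in range(max_lag + 1):
        # The sum is empty (zero) once k exceeds q, which is the cut-off.
        gamma_k = sum(psi[j] * psi[j + k] for j in range(max(0, len(psi) - k)))
        rho.append(gamma_k / gamma0)
    return rho

def ar1_acf(phi, max_lag):
    """Theoretical ACF of a stationary AR(1): rho_k = phi**k."""
    return [phi ** k for k in range(max_lag + 1)]

rho_ma2 = ma_acf([0.5, 0.3], 5)   # nonzero at lags 1 and 2, exactly zero afterwards
rho_ar1 = ar1_acf(0.7, 5)         # shrinks geometrically but never reaches zero
print(rho_ma2)
print(rho_ar1)
```

In sample data the cut-off is never exact, so in practice "zero" means "inside the significance limits".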
15. Model Building Strategy
• Step 1: Model identification
Plot the time series and its ACF and examine whether the series is
stationary. If not, try a transformation and/or
differencing until the data appear stationary. Compare the ACF
and PACF of the time series data and identify the ARIMA
model to be used. To judge the significance of an
autocorrelation or partial autocorrelation, the
corresponding sample value may be compared with ±2/√n, where n is the number of observations.
Use the principle of parsimony: prefer the simplest model that fits adequately.
• Step 2: Model estimation
Use SPSS or another package to estimate the model
parameters. A t-test may be used to judge whether a
parameter may be dropped from the model.
16. Model Building Strategy
• Step 3: Model checking
• The model is considered adequate if the residuals
are random. The following three procedures may be used:
• Residual plots, as in regression, may be used,
• The residual autocorrelations rk(e) must be within ±2/√n of zero, and
• The Ljung-Box Q test may be used to test whether the group of
residual autocorrelations at lags 1, 2, ..., m is significant.
• Step 4: Model forecasting
• SPSS generates forecasts for a given number of future
periods.
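The residual checks in Step 3 can be sketched as follows. This is a minimal illustration using the standard Ljung-Box formula Q = n(n+2) Σ r_k² / (n − k); the function names and the residual values are made up, and the p-value lookup is left to a chi-square table (with m − p − q degrees of freedom for a fitted ARMA(p,q) model).

```python
import math

def residual_acf(residuals, max_lag):
    """Autocorrelations r_1..r_max_lag of the residuals."""
    n = len(residuals)
    mean = sum(residuals) / n
    dev = [e - mean for e in residuals]
    c0 = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / c0
            for k in range(1, max_lag + 1)]

def ljung_box_q(residuals, m):
    """Ljung-Box Q statistic for lags 1..m."""
    n = len(residuals)
    r = residual_acf(residuals, m)
    return n * (n + 2) * sum(r[k - 1] ** 2 / (n - k) for k in range(1, m + 1))

# Made-up residuals that alternate in sign, i.e. clearly not random
residuals = [1.0, -1.0, 1.0, -1.0, 1.0, -1.0, 1.0, -1.0]
bound = 2 / math.sqrt(len(residuals))        # the ±2/√n limit
r1 = residual_acf(residuals, 1)[0]
q1 = ljung_box_q(residuals, 1)
print(r1, bound, q1)
```

Here the lag-1 residual autocorrelation lies well outside ±2/√n, so a model leaving such residuals would be judged inadequate.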
17. Model Selection Criteria
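• Akaike Information Criterion (AIC) selects the best
model from a group of candidate models as the
one that minimizes
$\mathrm{AIC} = \ln \sigma^2 + \frac{2}{n} r$
• Bayesian Information Criterion (BIC) selects the
best model as the one that minimizes
$\mathrm{BIC} = \ln \sigma^2 + \frac{\ln n}{n} r$
where $\sigma^2$ = residual variance, r = number of estimated parameters, and n = number of observations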
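A short sketch of how the two criteria might be compared across candidate models. The residual variances and parameter counts below are hypothetical, chosen only to show that BIC penalizes extra parameters more heavily than AIC.

```python
import math

def aic(sigma2, r, n):
    """AIC = ln(sigma^2) + 2r/n, with r estimated parameters and n observations."""
    return math.log(sigma2) + 2.0 * r / n

def bic(sigma2, r, n):
    """BIC = ln(sigma^2) + r*ln(n)/n."""
    return math.log(sigma2) + r * math.log(n) / n

n = 100
# Hypothetical candidates: name -> (residual variance, number of parameters)
candidates = {"AR(1)": (1.10, 2), "AR(2)": (1.06, 3), "ARMA(2,1)": (1.04, 4)}

best_aic = min(candidates, key=lambda m: aic(candidates[m][0], candidates[m][1], n))
best_bic = min(candidates, key=lambda m: bic(candidates[m][0], candidates[m][1], n))
print(best_aic, best_bic)
```

With these made-up numbers AIC prefers the AR(2) model while BIC prefers the smaller AR(1) model, since ln(n)/n exceeds 2/n whenever n is larger than about 7.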
18. Example 1
[Figure: sample ACF and PACF for Y, lags 0 to 16, vertical axis from -1 to 1, with ±1.96/√T significance limits]
24. Forecasting ATR Company Production Targets
MA(1): $Y_t = \mu + e_t - w_1 e_{t-1}$
$\hat{Y}_t = 0.1513 - 0.5875\, e_{t-1}$
25. Forecasting ATR Company Production Targets
Relative change in each estimate less than 0.0010
Final Estimates of Parameters
Type        Coef      SE Coef   T      P
MA 1        0.5875    0.0864    6.80   0.000
Constant    0.15129   0.04022   3.76   0.000
Mean        0.15129   0.04022
Number of observations: 90
Residuals: SS = 74.4933 (backforecasts excluded)
MS = 0.8465 DF = 88
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 9.1 10.8 17.3 31.5
DF 10 22 34 46
P-Value 0.524 0.977 0.992 0.950
Forecasts from period 90
95% Limits
Period Forecast Lower Upper Actual
91 0.43350 -1.37018 2.23719
92 0.15129 -1.94064 2.24322
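The forecast limits in the output can be approximately reproduced from the reported estimates. This is a sketch under stated assumptions: the MA term enters with a minus sign as in the MA equation of slide 4, the limits are forecast ± 1.96 standard errors, and the forecast error variance of an MA(1) is MS one step ahead and MS·(1 + w1²) two or more steps ahead. The variable names are illustrative.

```python
import math

mean, w1 = 0.15129, 0.5875                 # Mean and MA 1 estimates from the output
ms = 0.8465                                # residual mean square from the output
forecast_91, forecast_92 = 0.43350, 0.15129

# Last residual implied by forecast_91 = mean - w1 * e_90 (sign convention assumed)
e_90 = (mean - forecast_91) / w1

se_1 = math.sqrt(ms)                       # one-step-ahead standard error
se_2 = math.sqrt(ms * (1 + w1 ** 2))       # two-step-ahead standard error

limits_91 = (forecast_91 - 1.96 * se_1, forecast_91 + 1.96 * se_1)
limits_92 = (forecast_92 - 1.96 * se_2, forecast_92 + 1.96 * se_2)
print(e_90, limits_91, limits_92)
```

Beyond one step ahead an MA(1) forecast reverts to the mean, which is why the period 92 forecast equals 0.15129 and only the limits widen.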
28. Example
• The analyst for the ISC Corporation was asked to
develop forecasts for the closing prices of ISC stock.
• The stock has been languishing for some time with
little growth, and senior management wanted some
projections to discuss with the board of directors.
• The ISC stock prices are plotted in the following
slide.
Dr. Mohammed Alahmed
31. Example 4
• The plot of the stock prices suggests the series is
stationary.
• The stock prices vary about a fixed level of
approximately 250.
• Is the Box-Jenkins methodology appropriate for this
data series?
• The ACF and PACF for the stock price series are
reported in the following two slides.
33. Example 4
• The sample autocorrelations alternate in sign and decline to zero
after lag 2.
• The sample partial autocorrelations are similarly close to zero after
time lag 2.
• These are consistent with an AR(2) or ARIMA(2,0,0)
model.
• An AR(2) model is fit to the data.
• We include a constant term to allow for a nonzero
level.
35. Example 4
• The estimated coefficient $\phi_2$ is not significant (t = 1.75) at
the 5% level but is significant at the 10% level.
• The residual ACF and PACF are given in the following
two slides.
• The ACF and PACF are well within their two standard
error limits.
Final Estimates of Parameters
Type Coef SE T P
AR 1 -0.3243 0.1246 -2.60 0.012
AR 2 0.2192 0.1251 1.75 0.085
Constant 284.903 6.573 43.34 0.000
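The table of estimates can be cross-checked with a few lines of arithmetic. The sketch below recomputes the t-ratios as Coef / SE and the long-run level implied by the AR(2) fit, constant / (1 − φ1 − φ2). The `forecast_next` helper and the two prices passed to it are hypothetical.

```python
const, phi1, phi2 = 284.903, -0.3243, 0.2192       # estimates from the table
se = {"AR 1": 0.1246, "AR 2": 0.1251, "Constant": 6.573}
coef = {"AR 1": phi1, "AR 2": phi2, "Constant": const}

# t-ratio = coefficient / standard error
t_ratios = {name: coef[name] / se[name] for name in coef}

# Level about which a stationary AR(2) series varies
implied_mean = const / (1 - phi1 - phi2)

def forecast_next(y_t, y_tm1):
    """One-step-ahead AR(2) forecast from the two most recent observations."""
    return const + phi1 * y_t + phi2 * y_tm1

print(t_ratios, implied_mean, forecast_next(250.0, 260.0))  # prices are made up
```

The implied mean comes out near 258, in line with the slide's observation that the prices vary about a fixed level of approximately 250.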
41. Example 5
• The gradual decline of the ACF values indicates a
non-stationary series.
• The first partial autocorrelation is very dominant and
close to 1, indicating non-stationarity.
• The time series plot clearly indicates non-stationarity.
• We take the first differences of the data and
reanalyze.
45. Seasonality and ARIMA models
• The ARIMA models can be extended to handle seasonal
components of a data series.
• The general shorthand notation is
ARIMA (p, d, q)(P, D, Q)s
• where (P, D, Q) are the seasonal autoregressive, differencing and moving average orders, and s is the number of periods per season.
46. Seasonality and ARIMA models
• The seasonal lags of the ACF and PACF plots show the
seasonal parts of an AR or MA model.
• Examples:
1. Seasonal MA model:
• ARIMA(0,0,0)(0,0,1)12
– will show a spike at lag 12 in the ACF but no other significant
spikes.
– The PACF will show exponential decay in the seasonal lags
i.e. at lags 12, 24, 36,…
2. Seasonal AR model:
• ARIMA(0,0,0)(1,0,0)12
– will show exponential decay in seasonal lags of the ACF.
– Single significant spike at lag 12 in the PACF.
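The two seasonal ACF patterns can be written down in closed form for the pure seasonal models. The sketch below assumes zero-mean models with the MA sign convention used earlier in the slides; the function names and the coefficient values 0.8 and 0.6 are hypothetical.

```python
def seasonal_ma1_acf(theta, s, max_lag):
    """ACF of Y_t = e_t - theta * e_{t-s}: a single spike at lag s."""
    rho = [0.0] * (max_lag + 1)
    rho[0] = 1.0
    if s <= max_lag:
        rho[s] = -theta / (1 + theta ** 2)
    return rho

def seasonal_ar1_acf(phi, s, max_lag):
    """ACF of Y_t = phi * Y_{t-s} + e_t: phi**k at lag k*s, zero at other lags."""
    return [phi ** (k // s) if k % s == 0 else 0.0 for k in range(max_lag + 1)]

sma = seasonal_ma1_acf(theta=0.8, s=12, max_lag=36)
sar = seasonal_ar1_acf(phi=0.6, s=12, max_lag=36)
print(sma[12], sma[24], sma[36])   # spike at lag 12 only
print(sar[12], sar[24], sar[36])   # exponential decay over the seasonal lags
```

The same duality holds as in the non-seasonal case: the seasonal MA model cuts off in the ACF, and the seasonal AR model decays over lags 12, 24, 36.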
48. The autocorrelation dies out after the first lag,
but it differs from 0 at lags 12, 24 and 36.
The series may not be stationary, so we take a seasonal difference.
12.12.2018
Dr. M. Hanifi VAN
54. Model for ARIMA(p,d,q)(P,D,Q)12
• The model will be ARIMA(0,0,0)(0,1,1)12.
• We have D=1 because we take one seasonal
difference.
• $Y_t - Y_{t-12} = \mu + e_t - W_1 e_{t-12}$
• $Y_t - Y_{t-12} = 85.457 + e_t - 0.8180\, e_{t-12}$
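A one-step-ahead forecast from this seasonal model can be sketched as below. It assumes the fitted equation reads Y_t − Y_{t−12} = 85.457 + e_t − 0.8180 e_{t−12}, with the seasonal MA term entering with a minus sign as in the MA equation of slide 4. The function name and the input values are hypothetical.

```python
CONSTANT = 85.457   # estimated constant from the fitted model
W1 = 0.8180         # estimated seasonal MA coefficient

def forecast_one_step(y_lag12, e_lag12):
    """Forecast Y_t from the value and the residual observed 12 periods earlier.

    The future error e_t is replaced by its expected value of zero.
    """
    return y_lag12 + CONSTANT - W1 * e_lag12

# Hypothetical inputs: value and residual from the same month one year earlier
example = forecast_one_step(y_lag12=1000.0, e_lag12=50.0)
print(example)
```

For forecasts more than 12 periods ahead the residual term is unknown and drops out, so the forecast reduces to the forecast 12 periods earlier plus the constant.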