FORECASTING ECONOMIC TIME SERIES DATA USING ARMA
Nagendra Belvadi V
Black Belt
and
Dr. Tansen Chaudhari
Head of Process Asia Pacific
Xchanging, Xchanging Towers, SJR IPark,
EPIP Area, Whitefield, Bangalore - 560 066. India.
Why is forecasting necessary?
The subject matter of forecasting is uncertainty: the future is not known with clarity. In a state of uncertainty, organizations make decisions based on historical experience or even gut feeling, and decisions taken purely on gut feeling can be detrimental. This necessitates a scientific approach to decision making. Forecasting is one such scientific technique that helps organizations and processes make decisions under uncertainty. In this article I make an earnest effort to take you through one of the more sophisticated forecasting techniques, ARMA (Autoregressive Moving Average).
Introduction: This article covers the application of ARMA models in forecasting economic variables, their merits and demerits, and their advantages over conventional time series models. ARMA is the acronym for "Autoregressive Moving Average". The systematic methodology for building these models was developed by two great statisticians, George Box and Gwilym Jenkins, and hence ARMA models are also known as Box-Jenkins models. ARMA models are suitable for high-frequency data.
Since most economic time series are non-stationary, a method called differencing is employed to convert non-stationary data into stationary data. The ARMA model is then fitted to the differenced series, and the differencing is reversed (integrated) to forecast the original series; the combined model is called ARIMA (Autoregressive Integrated Moving Average).
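To make differencing concrete, here is a minimal sketch in Python with NumPy, using a hypothetical trending series rather than the article's data. A first difference removes a linear trend, leaving a series that varies about a fixed value:

```python
import numpy as np

# Hypothetical non-stationary series: a linear trend (slope 5) plus noise.
rng = np.random.default_rng(0)
t = np.arange(100)
series = 5.0 * t + rng.normal(0, 1, size=100)

# First differencing: y'_t = y_t - y_{t-1}. The trend drops out, and the
# differenced series fluctuates around the slope (a fixed value), i.e. it
# is now a stationary candidate.
diff1 = np.diff(series)

print(len(diff1), round(float(diff1.mean()), 1))
```

The differenced series has one fewer observation than the original, and its mean sits near the trend slope rather than drifting with time.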
Box-Jenkins methodology: ARIMA models produce forecasts based on the historical patterns in the time series data. ARIMA belongs to the class of linear models and can represent both stationary and non-stationary data. ARIMA models do not require external explanatory variables; instead they use the information in the series's own past to forecast the series itself. A stationary series varies about a fixed value; a non-stationary series does not. The seasonal ARIMA model is represented as:
ARIMA (p, d, q) (P, D, Q)
where p is the autoregressive order, d the order of differencing and q the moving average order of the regular model, and (P, D, Q) are the corresponding seasonal elements, applied at the season length (for example, five working days for daily business data).
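What d and D mean can be illustrated with a small NumPy sketch on a hypothetical daily series (not the article's data) that has both a linear trend and a repeating 5-day working-week pattern. A regular difference (d = 1) handles the trend; a seasonal difference at season length 5 (D = 1) handles the weekly pattern:

```python
import numpy as np

# Hypothetical daily series: linear trend + 5-day seasonal pattern + noise.
rng = np.random.default_rng(1)
t = np.arange(105)
weekly = np.tile([10.0, -5.0, 0.0, 5.0, -10.0], 21)  # period-5 pattern
series = 2.0 * t + weekly + rng.normal(0, 0.5, size=105)

d1 = np.diff(series)       # d = 1: regular difference removes the trend
d1_D1 = d1[5:] - d1[:-5]   # D = 1 at season length 5: removes the weekly pattern

# After both differences the series varies about zero, a stationary candidate.
print(len(d1_D1), round(float(np.mean(d1_D1)), 2))
```

The order of the two differences does not matter mathematically; what matters is that the doubly differenced series no longer trends or repeats.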
The model building process involves the following steps:
Model identification
The first step in model identification is to determine whether the time series is stationary or non-stationary. Stationarity can be assessed using either the Dickey-Fuller test or run sequence plots. If the series has a growing or declining trend, the data are said to be "non-stationary"; a series with no trend is termed "stationary". If the original series has no trend, it is an ideal candidate for ARIMA. If it has a trend, it can be converted to a stationary series by differencing. For a non-stationary series the autocorrelations fail to die out rapidly, whereas for a stationary series they die out rapidly. The order of differencing is zero for a stationary series and greater than zero for a non-stationary series. The regular and seasonal parameters are then determined from the sample autocorrelations and partial autocorrelations.
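The "autocorrelations die out rapidly" criterion can be checked directly. The sketch below computes the sample ACF by hand in NumPy (a simplified illustration, not the article's Minitab output) and contrasts simulated white noise, whose correlations stay inside the significance band, with a random walk, whose correlations persist at every lag:

```python
import numpy as np

def sample_acf(x, nlags):
    """Sample autocorrelations r_1..r_nlags of a series x."""
    x = np.asarray(x, dtype=float)
    xd = x - x.mean()
    denom = np.sum(xd ** 2)
    return np.array([np.sum(xd[k:] * xd[:-k]) / denom
                     for k in range(1, nlags + 1)])

rng = np.random.default_rng(2)
white_noise = rng.normal(size=500)        # stationary
random_walk = np.cumsum(white_noise)      # non-stationary

acf_wn = sample_acf(white_noise, 10)
acf_rw = sample_acf(random_walk, 10)

# Approximate 5% significance limit for a stationary series: +/- 2/sqrt(n).
print("max |ACF|, white noise:", round(float(np.max(np.abs(acf_wn))), 3))
print("min ACF, random walk:  ", round(float(np.min(acf_rw)), 3))
```

The white-noise ACF dies out immediately, while the random-walk ACF stays large, which is exactly the pattern the identification step uses to decide whether differencing is needed.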
Model parameter estimation
The estimation of parameters is of paramount importance in the model building exercise. The parameters are estimated statistically by the method of least squares, and a t-statistic is employed to test each parameter's significance.
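As a minimal sketch of this step (simulated AR(1) data, not the article's series; real software handles general ARIMA orders), the AR coefficient can be estimated by least squares and tested with a t-statistic:

```python
import numpy as np

# Simulate an AR(1) series y_t = phi * y_{t-1} + e_t with known phi.
rng = np.random.default_rng(3)
n, phi_true = 300, 0.6
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi_true * y[t - 1] + rng.normal()

# Least squares: regress y_t on y_{t-1} (no intercept; the series is
# mean-zero by construction).
x, z = y[:-1], y[1:]
phi_hat = float(np.sum(x * z) / np.sum(x ** 2))

resid = z - phi_hat * x
sigma2 = np.sum(resid ** 2) / (len(z) - 1)   # residual variance
se = float(np.sqrt(sigma2 / np.sum(x ** 2))) # standard error of phi_hat
t_stat = phi_hat / se                        # roughly |t| > 2 => significant

print(round(phi_hat, 2), round(t_stat, 1))
```

A t-statistic well above 2 in absolute value indicates the parameter should be kept in the model; insignificant parameters are dropped and the model re-identified.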
Model Diagnostics:
Once the parameters are estimated, and before forecasting the series, it is necessary to check the adequacy of the tentatively identified model. The model is declared adequate if the residuals cannot improve the forecast any further; in other words, the residuals are random. To check overall model adequacy, the Ljung-Box statistic is employed, which follows a Chi-Square distribution. The null hypothesis of random residuals is rejected when the p-value associated with the statistic is low (conventionally below 0.05) and not rejected otherwise.
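The Ljung-Box statistic itself is simple to compute; the sketch below implements the standard formula Q = n(n+2) Σ r_k²/(n−k) in NumPy and applies it to simulated white-noise residuals (a hypothetical example, not the article's model):

```python
import numpy as np

def ljung_box_q(resid, m):
    """Ljung-Box Q statistic over lags 1..m. Under random residuals, Q
    follows a Chi-Square distribution with m minus the number of fitted
    parameters degrees of freedom."""
    resid = np.asarray(resid, dtype=float)
    n = len(resid)
    rd = resid - resid.mean()
    denom = np.sum(rd ** 2)
    r = np.array([np.sum(rd[k:] * rd[:-k]) / denom for k in range(1, m + 1)])
    return float(n * (n + 2) * np.sum(r ** 2 / (n - np.arange(1, m + 1))))

# White-noise residuals should give a small Q relative to the Chi-Square
# critical value (about 21 at 12 degrees of freedom, 5% level).
rng = np.random.default_rng(4)
q = ljung_box_q(rng.normal(size=200), 12)
print(round(q, 1))
```

A Q value below the critical value (equivalently, a high p-value) means the residual autocorrelations are jointly consistent with randomness, so the model is judged adequate.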
Forecasting:
Once model adequacy is established, the series in question can be forecast for the specified period. It is always advisable to keep track of the forecast errors and, depending on their magnitude, re-evaluate the model.
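For a flavour of how forecasts are generated and tracked, here is a toy sketch for a fitted AR(1) with hypothetical coefficient and data (a full seasonal ARIMA forecast recursion is more involved, but the principle is the same):

```python
import numpy as np

def ar1_forecast(last_value, phi, steps):
    """h-step-ahead forecasts from a fitted mean-zero AR(1):
    y_hat(h) = phi**h * y_n, decaying toward the mean as h grows."""
    return np.array([last_value * phi ** h for h in range(1, steps + 1)])

# Hypothetical fitted coefficient and last observed (mean-centred) value.
fc = ar1_forecast(last_value=10.0, phi=0.5, steps=4)
print(fc.tolist())   # [5.0, 2.5, 1.25, 0.625]

# Track forecast errors against actuals; a growing error magnitude is the
# signal to re-evaluate the model.
actuals = np.array([6.0, 2.0, 1.0, 0.5])
mae = float(np.mean(np.abs(actuals - fc)))
print(round(mae, 3))   # 0.469
```

Monitoring a simple error summary such as the mean absolute error against incoming actuals is what "keeping track of forecast errors" amounts to in practice.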
Compared with other conventional models and methods, ARIMA is more robust in terms of forecast accuracy because it takes seasonality into consideration. If the original series does not exhibit seasonality, a non-seasonal ARIMA is fitted. The disadvantage of ARIMA building is that it is tedious to build the model manually without the aid of statistical software. There are also situations in which the final model does not fit the requirement because the error terms have non-constant variance, known as heteroskedasticity; such cases are treated separately using ARCH (Autoregressive Conditional Heteroskedasticity) and GARCH (Generalized Autoregressive Conditional Heteroskedasticity) techniques.
Demonstration Problem: Suppose an analyst in a service industry wants to fit an ARIMA model to daily work-volume data, and he collects daily volumes for 103 days. (ARIMA forecasts from series with fewer than 50 data points are considered unreliable.) The very first step is to understand whether the data are stationary. He studies time series plots and applies the augmented Dickey-Fuller test, as shown below:
[Figure: Time Series Plot of Volume, index 1-103, volume 0-900]

Augmented Dickey-Fuller output:
Test statistic: -4.62799
Lag order: 4
p-value: 0.01

The data in the figure show no trend-like behavior, and the Dickey-Fuller test agrees, with a large negative statistic and a low p-value.
[Figure: Partial Autocorrelation Function for Volume, with 5% significance limits for the partial autocorrelations]

[Figure: Autocorrelation Function for Volume, with 5% significance limits for the autocorrelations]
The ACF and PACF shown above are not significant at any lag. The analyst therefore went ahead with first differencing to study the regular parameters.
[Figure: Autocorrelation Function for the first-differenced series, with 5% significance limits for the autocorrelations]

[Figure: Partial Autocorrelation Function for the first-differenced series, with 5% significance limits for the partial autocorrelations]
After first differencing, the analyst finds the PACF significant through lag 2 and the ACF significant at lag 1, so he identifies the regular model as (2, 1, 1): the AR order from the PACF and the MA order from the ACF. Next he moves on to identifying the parameters of the seasonal model. Since the data are daily, he suspects a season of five working days and takes a fifth (seasonal) difference of the first-differenced series. The ACF and PACF then help him identify the seasonal elements (P, D, Q).
[Figure: Autocorrelation Function for the fifth-differenced series, with 5% significance limits for the autocorrelations]

[Figure: Partial Autocorrelation Function for the fifth-differenced series, with 5% significance limits for the partial autocorrelations]
From the PACF it is evident that the partial autocorrelations are significant at lags 5 and 10, i.e., at two seasonal periods (one season equals five days), so he takes the seasonal autoregressive parameter as 2. From the ACF, the seasonal moving average parameter is 1. The seasonal part of the model therefore has parameters (2, 1, 1). With the seasonal ARIMA (2,1,1)(2,1,1), the analyst now forecasts the series under study for the specified periods; the Minitab output is given below:
Period Forecast Lower Upper Actual
86 140.522 -91.517 372.56 129
87 112.872 -119.167 344.911 94
88 161.841 -70.197 393.88 127
89 202.259 -29.78 434.298 175
90 138.448 -93.59 370.487 239
91 99.972 -144.197 344.141 161
92 57.691 -186.478 301.861 88
93 135.685 -108.484 379.854 70
94 140.839 -103.331 385.008 62
95 88.764 -155.405 332.934 46
96 113.271 -145.719 372.261 66
97 123.47 -135.52 382.46 88
98 171.281 -87.709 430.271 155
99 210.133 -48.857 469.122 247
100 131.533 -127.457 390.522 62
101 116.755 -176.54 410.051 203
102 74.362 -218.933 367.658 102
He also looks at the ACF of the residuals and the Ljung-Box statistic.
[Figure: ACF of Residuals for Volume, with 5% significance limits for the autocorrelations]
Modified Box-Pierce (Ljung-Box) Chi-Square statistic
Lag 12 24 36 48
Chi-Square 10.5 15.0 19.3 27.9
DF 5 17 29 41
P-Value 0.063 0.595 0.914 0.941
Observations: The residual ACF is not significant at any lag, and the p-values greater than 0.05 for the Q-statistics indicate model adequacy. If adequacy is not established, the model needs to be diagnosed for heteroskedasticity using ARCH and GARCH techniques, which is beyond the scope of this article.
Summary: Though ARIMA models are robust enough for accurate forecasts, historically they have not enjoyed wide usage in the corporate world because of the complexities involved in model identification and the considerable statistical knowledge required. With the advent of sophisticated statistical software such as SAS, SPSS, R and Minitab, which automates much of this work, ARIMA is gaining prominence in service industries such as BPOs and call centers for capacity planning and scheduling. ARIMA models outperform conventional time series models, such as moving averages and other smoothing models, in forecast accuracy, treatment of seasonality and long-term forecasts. The disadvantages of ARIMA compared with conventional models are that forecasts are unreliable for series with fewer than 50 data points and that the models are more sensitive to outliers in the original series. However, if a high degree of accuracy is not a great concern, conventional time series models may be employed, as they are simpler and less time consuming.
(Note: Authors have not directly referred to any other existing papers/articles while writing this paper. If
it matches with the views expressed in any existing articles, it is purely coincidental)