Regression with Time Series Data

Many business and economic applications of forecasting involve time series data. Regression models can be fit to monthly, quarterly, or yearly data using the techniques described in previous chapters. However, because data collected over time tend to exhibit trends, seasonal patterns, and so forth, observations in different time periods are related, or autocorrelated. That is, for time series data, the sample of observations cannot be regarded as a random sample. Problems of interpretation can arise when standard regression methods are applied to observations that are related to one another over time. Fitting regression models to time series data must be done with considerable care.

  1. Group Assignment, Business Projection Techniques. Summary: “Regression with Time Series Data”. Lecturer: Sigit Indrawijaya, SE. M.Si. Prepared by: Rizano Ahdiat Rash’ada (C1B011047), Muchlas Pratama Roby Harianto (C1B011005). Management Study Program, Faculty of Economics, Universitas Jambi, 2013.
  2. Time Series Data and the Problem of Autocorrelation

     With time series data, the assumption of independence rarely holds. Consider the annual base price for a particular model of a new car. Can you imagine the chaos that would exist if new car prices from one year to the next were indeed unrelated to (independent of) one another? In such a world, prices would be determined like numbers drawn from a random number table. Knowledge of the price in one year would tell you nothing about the price in the next year. In the real world, the price in the current year is related to (correlated with) the price in the previous year, and maybe the price two years ago, and so forth. That is, the prices in different years are autocorrelated; they are not independent.

     Autocorrelation exists when successive observations over time are related to one another.
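
To make the idea of related successive observations concrete, the sketch below computes the usual lag-k sample autocorrelation coefficient of a series. The price figures are hypothetical values invented for illustration, and the function name `autocorrelation` is an assumption of this example, not something taken from the slides.

```python
def autocorrelation(series, lag=1):
    """Lag-k sample autocorrelation: the sum of products of deviations
    from the mean, k periods apart, divided by the sum of squared deviations."""
    n = len(series)
    mean = sum(series) / n
    dev = [value - mean for value in series]
    numerator = sum(dev[t] * dev[t - lag] for t in range(lag, n))
    denominator = sum(d * d for d in dev)
    return numerator / denominator


# Hypothetical annual base prices for one car model (illustrative only).
prices = [18500, 19100, 19800, 20300, 21000, 21900, 22500, 23200]

print("lag 1 autocorrelation:", round(autocorrelation(prices, 1), 3))
print("lag 2 autocorrelation:", round(autocorrelation(prices, 2), 3))
```

A steadily rising series such as this one gives a positive lag 1 coefficient, which is the situation the car-price example describes: this year's price carries information about next year's.
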
  3. Autocorrelation can occur because the effect of a predictor variable on the response is distributed over time. For example, an increase in salary may affect your consumption (or saving) not only in the current period but also in several future periods. A current labor contract may affect the cost of production for some time to come. Over time, relationships tend to be dynamic (evolving), not static.

     From a forecasting perspective, autocorrelation is not all bad. If values of a response Y in one time period are related to Y values in previous time periods, then previous Y's can be used to predict future Y's. In a regression framework, autocorrelation is handled by "fixing up" the standard regression model. To accommodate autocorrelation, sometimes it is necessary to change the mix of predictor variables and/or the form of the regression function. More typically, however, autocorrelation is handled by changing the nature of the error term.
  4. A common kind of autocorrelation, often called first-order serial correlation, is one in which the error term in the current time period is directly related to the error term in the previous time period. In this case, with the subscript t representing time, the simple linear regression model takes the form

     $$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t \qquad (1)$$

     with

     $$\varepsilon_t = \rho\,\varepsilon_{t-1} + v_t \qquad (2)$$

     where
     $\varepsilon_t$ = the error at time t
     $\rho$ = the parameter (lag 1 autocorrelation coefficient) that measures correlation between adjacent error terms
     $v_t$ = normally distributed independent error term with mean 0 and variance $\sigma^2$

     Equation 2 says that the level of one error term ($\varepsilon_{t-1}$) directly affects the level of the next error term ($\varepsilon_t$). The magnitude of the autocorrelation coefficient $\rho$, where $-1 < \rho < 1$, indicates the strength of the serial correlation. If $\rho$ is zero, then there is no serial correlation, and the error terms are independent ($\varepsilon_t = v_t$).

     Durbin-Watson Test for Serial Correlation

     One approach that is used frequently to determine whether serial correlation is present is the Durbin-Watson test. The test involves determining whether the autocorrelation parameter $\rho$ shown in Equation 2 is zero. Consider

     $$\varepsilon_t = \rho\,\varepsilon_{t-1} + v_t$$
  5. The hypotheses to be tested are

     $$H_0: \rho = 0 \qquad H_1: \rho > 0$$

     The alternative hypothesis is $\rho > 0$ since business and economic time series tend to show positive autocorrelation.

     If a regression model does not properly account for autocorrelation, the residuals will be autocorrelated. So, the Durbin-Watson test is carried out using the residuals from the regression analysis. The Durbin-Watson statistic is defined as

     $$DW = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}$$

     where
     $e_t = Y_t - \hat{Y}_t$ = the residual for time period t
     $e_{t-1} = Y_{t-1} - \hat{Y}_{t-1}$ = the residual for time period t − 1

     For positive serial correlation, successive residuals tend to be alike, and the sum of squared differences in the numerator of the Durbin-Watson statistic will be relatively small. Small values of the Durbin-Watson statistic are consistent with positive serial correlation.

     The autocorrelation coefficient $\rho$ can be estimated by the lag 1 residual autocorrelation $r_1(e)$, and with a little bit of mathematical maneuvering, the Durbin-Watson statistic can be related to $r_1(e)$. For moderate to large samples,

     $$DW \approx 2\,(1 - r_1(e))$$
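
As an illustration of the Durbin-Watson statistic and its relation to the lag 1 residual autocorrelation, here is a minimal Python sketch. It simulates a simple regression whose errors follow the first-order model in Equation 2, fits a line by least squares, and computes DW and $r_1(e)$ from the residuals. The simulated data, the parameter values (ρ = 0.8, 200 observations, intercept 2.0, slope 0.5), and all function names are assumptions made for this example.

```python
import random


def ols_residuals(x, y):
    """Residuals e_t = Y_t - Yhat_t from a least-squares line of y on x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]


def durbin_watson(e):
    """Sum of squared successive differences divided by sum of squared residuals."""
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    denominator = sum(et ** 2 for et in e)
    return numerator / denominator


def lag1_autocorr(e):
    """Lag 1 residual autocorrelation r1(e); residuals are assumed to have mean zero."""
    numerator = sum(e[t] * e[t - 1] for t in range(1, len(e)))
    denominator = sum(et ** 2 for et in e)
    return numerator / denominator


# Simulate Y_t = 2.0 + 0.5 * X_t + eps_t with eps_t = rho * eps_{t-1} + v_t.
# All numeric values here are assumed purely for illustration.
random.seed(0)
rho, n = 0.8, 200
x = [float(t) for t in range(n)]
y = []
eps = 0.0
for t in range(n):
    eps = rho * eps + random.gauss(0.0, 1.0)
    y.append(2.0 + 0.5 * x[t] + eps)

e = ols_residuals(x, y)
dw = durbin_watson(e)
r1 = lag1_autocorr(e)
print("DW          =", round(dw, 3))
print("2 * (1 - r1) =", round(2 * (1 - r1), 3))
```

The two printed numbers differ only by the end-point term $(e_1^2 + e_n^2) / \sum e_t^2$, which shrinks as the sample grows; that is why the relation is stated for moderate to large samples.
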
  6. Since $-1 < r_1(e) < 1$, the relation $DW \approx 2\,(1 - r_1(e))$ shows that 0 < DW < 4. For $r_1(e)$ close to 0, the DW statistic will be close to 2. Positive lag 1 residual autocorrelation is associated with DW values less than 2, and negative lag 1 residual autocorrelation is associated with DW values above 2.

     A useful, but sometimes not definitive, test for serial correlation can be performed by comparing the calculated value of the Durbin-Watson statistic with lower (L) and upper (U) bounds. The decision rules are:

     1. When the Durbin-Watson statistic is larger than the upper (U) bound, conclude that the autocorrelation coefficient $\rho$ is equal to zero (there is no positive autocorrelation).
     2. When the Durbin-Watson statistic is smaller than the lower (L) bound, conclude that the autocorrelation coefficient $\rho$ is greater than zero (there is positive autocorrelation).
     3. When the Durbin-Watson statistic lies within the lower and upper bounds, the test is inconclusive (we don't know whether there is positive autocorrelation).

     The Durbin-Watson test is used to determine whether positive autocorrelation is present. If DW > U, conclude $H_0: \rho = 0$. If DW < L, conclude $H_1: \rho > 0$. If DW lies within the lower and upper bounds (L ≤ DW ≤ U), the test is inconclusive.

     Solutions to Autocorrelation Problems

     Once autocorrelation has been discovered in a regression of time series data, it is necessary to remove it, or model it, before the regression function can be evaluated for its effectiveness. The solution to the problem of serial correlation begins with an evaluation of the model specification. Is the functional form correct? Were any important variables omitted? Are there effects that might have some pattern over time that could have introduced autocorrelation into the errors?
  7. Since a major cause of autocorrelated errors in the regression model is the omission of one or more key variables, the best approach to solving the problem is to find them. This effort is sometimes referred to as improving the model specification. Model specification involves not only finding the important predictor variables but also entering these variables in the regression function in the right way. Unfortunately, it is not always possible to improve the model specification, because an important missing variable may not be quantifiable or, if it is quantifiable, the data may not be available. For example, one may suspect that business investment in future periods is related to the attitude of potential investors. However, it is difficult to quantify the variable "attitude." Nevertheless, whenever possible, the model should be specified in accordance with theoretically sound insight.

     Only after the specification of the equation has been carefully reviewed should the possibility of an adjustment be considered. Several techniques for eliminating autocorrelation will be discussed. One approach to eliminating autocorrelation is to add an omitted variable to the regression function that explains the association in the response from one period to the next.

     Regression with Differences

     For highly autocorrelated data, modeling changes rather than levels can often eliminate the serial correlation. That is, instead of formulating the regression equation in terms of $Y$ and $X_1, X_2, \ldots, X_k$, the regression equation is written in terms of the differences, $Y'_t = Y_t - Y_{t-1}$, $X'_{t1} = X_{t1} - X_{t-1,1}$, $X'_{t2} = X_{t2} - X_{t-1,2}$, and so forth. Differences should be considered when the Durbin-Watson statistic associated with the regression involving the original variables is close to 0.
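
The effect of differencing described above can be sketched in a few lines of Python. The example simulates a regression whose errors are very highly autocorrelated (ρ = 1), fits it once in levels and once in first differences, and compares the two Durbin-Watson statistics. The simulated data, the coefficient values, the choice to fit the differenced regression without an intercept, and the function names are all assumptions of this illustration.

```python
import random


def difference(series):
    """First differences: series[t] - series[t-1]."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]


def durbin_watson(e):
    """Durbin-Watson statistic computed from a list of residuals."""
    numerator = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    return numerator / sum(et ** 2 for et in e)


def fit_with_intercept(x, y):
    """Least-squares intercept and slope for y = b0 + b1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    return y_bar - b1 * x_bar, b1


def fit_through_origin(x, y):
    """Least-squares slope for y = b1 * x (no intercept)."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)


# Simulated data (assumed values): Y_t = 5 + 2 * X_t + eps_t, eps_t = eps_{t-1} + v_t.
random.seed(1)
n = 200
x, y = [], []
x_t, eps = 10.0, 0.0
for _ in range(n):
    x_t += random.gauss(1.0, 0.5)
    eps += random.gauss(0.0, 1.0)  # rho = 1: each error carries the previous one forward
    x.append(x_t)
    y.append(5.0 + 2.0 * x_t + eps)

# Regression in levels.
b0, b1 = fit_with_intercept(x, y)
dw_levels = durbin_watson([yt - (b0 + b1 * xt) for xt, yt in zip(x, y)])

# Regression in first differences.
dx, dy = difference(x), difference(y)
b1_diff = fit_through_origin(dx, dy)
dw_diff = durbin_watson([dyt - b1_diff * dxt for dxt, dyt in zip(dx, dy)])

print("DW, levels     :", round(dw_levels, 3))
print("DW, differences:", round(dw_diff, 3))
print("slope from differences:", round(b1_diff, 3))
```

With errors this persistent, the levels regression should show a Durbin-Watson value far below 2, while the differenced regression should land much closer to 2.
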
  8. One rationale for differencing comes from the following argument. Suppose

     $$Y_t = \beta_0 + \beta_1 X_t + \varepsilon_t$$

     with

     $$\varepsilon_t = \rho\,\varepsilon_{t-1} + v_t$$

     where
     $\rho$ = correlation between consecutive errors
     $v_t$ = random error ($v_t = \varepsilon_t$ when $\rho = 0$)

     The model holds for any time period, so

     $$Y_{t-1} = \beta_0 + \beta_1 X_{t-1} + \varepsilon_{t-1}$$

     Subtracting this equation from the one for period t, and taking $\rho = 1$ so that $\varepsilon_t - \varepsilon_{t-1} = v_t$, gives $Y_t - Y_{t-1} = \beta_1 (X_t - X_{t-1}) + v_t$: a regression in differences whose errors are independent.

     Time Series Data and the Problem of Heteroscedasticity

     Variability can increase if a variable is growing at a constant rate rather than by a constant amount over time. Nonconstant variability is called heteroscedasticity. In a regression framework, heteroscedasticity occurs if the variance of the error term, $\varepsilon$, is not constant. If the variability for recent time periods is larger than it was for past time periods, then the standard error of the estimate underestimates the current standard deviation of the error term. If the standard error of the estimate is then used to set forecast limits for future observations, these limits can be too narrow for the stated confidence level.

     Using Regression to Forecast Seasonal Data

     In a regression model for seasonal data, the seasonality is handled by using dummy variables in the regression function. A seasonal model for quarterly data with a time trend is

     $$Y_t = \beta_0 + \beta_1 t + \beta_2 S_2 + \beta_3 S_3 + \beta_4 S_4 + \varepsilon_t$$
  9. where
     $Y_t$ = the variable to be forecast
     $t$ = the time index
     $S_2$ = a dummy variable that is 1 for the second quarter of the year; 0 otherwise
     $S_3$ = a dummy variable that is 1 for the third quarter of the year; 0 otherwise
     $S_4$ = a dummy variable that is 1 for the fourth quarter of the year; 0 otherwise
     $\varepsilon_t$ = errors assumed to be independent and normally distributed with mean zero and constant variance
     $\beta_0, \beta_1, \beta_2, \beta_3, \beta_4$ = coefficients to be estimated

     Econometric Forecasting

     When regression analysis is applied to economic data, the predictions developed from such models are referred to as economic forecasts. However, since economic theory frequently suggests that the values taken by the quantities of interest are determined through the simultaneous interaction of different economic forces, it may be necessary to model this interaction with a set of simultaneous equations. This idea leads to the construction of simultaneous equation econometric models. These models involve individual equations that look like regression equations. However, in a simultaneous system the individual equations are related, and the econometric model allows the joint determination of a set of dependent variables in terms of several independent variables. This contrasts with the usual regression situation, in which a single equation determines the expected value of one dependent variable in terms of the independent variables.
  10. A simultaneous equation econometric model determines jointly the values of a set of dependent variables, called endogenous variables by econometricians, in terms of the values of independent variables, called exogenous variables. The values of the exogenous variables are assumed to influence the endogenous variables but not the other way around. A complete simultaneous equation model will involve the same number of equations as endogenous variables.

     Economic theory holds that, in equilibrium, the quantity supplied is equal to the quantity demanded at a particular price. That is, the quantity demanded, the quantity supplied, and price are determined simultaneously. In one study of the price elasticity of demand, the model was specified as

     $$Q_t = \alpha_0 + \alpha_1 P_t + \alpha_2 I_t + \alpha_3 T_t + \varepsilon_t$$

     $$P_t = \beta_0 + \beta_1 Q_t + \beta_2 L_t + v_t$$

     where
     $Q_t$ = a measure of the demand (quantity sold)
     $P_t$ = a measure of price (deflated dollars)
     $I_t$ = a measure of income per capita
     $T_t$ = a measure of temperature
     $L_t$ = a measure of labor cost
     $\varepsilon_t, v_t$ = independent error terms that are uncorrelated with each other

     Large-scale econometric models are being used today to model the behavior of specific firms within an industry, selected industries within the economy, and the total economy. Econometric models can include any number of simultaneous multiple regression-like equations. Econometric models are used to understand how the economy works and to generate forecasts of key economic variables. Econometric models are important aids in policy formulation.
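
Returning to the quarterly seasonal model with a time trend from the section on forecasting seasonal data, the following sketch shows one way such a model could be fitted and used to forecast. It builds the design row $[1, t, S_2, S_3, S_4]$ for each period and solves the least-squares normal equations directly. The data are a hypothetical noise-free series generated from assumed coefficients, the series is assumed to start in the first quarter, and every function name is an invention of this example.

```python
def solve_linear_system(a, b):
    """Solve a * x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b_i] for row, b_i in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def design_row(t):
    """Row [1, t, S2, S3, S4] for time index t = 1, 2, ...
    Assumes t = 1 is a first quarter; quarter 1 is the baseline."""
    quarter = (t - 1) % 4 + 1
    return [1.0, float(t), float(quarter == 2), float(quarter == 3), float(quarter == 4)]


def fit_seasonal_trend(y):
    """Least-squares estimates of beta_0 .. beta_4 via the normal equations."""
    rows = [design_row(t) for t in range(1, len(y) + 1)]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yt for r, yt in zip(rows, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def forecast(beta, t):
    """Fitted or forecast value for time index t."""
    return sum(b * v for b, v in zip(beta, design_row(t)))


# Hypothetical coefficients and a noise-free three-year quarterly series.
true_beta = [10.0, 2.0, 5.0, -3.0, 4.0]
y = [forecast(true_beta, t) for t in range(1, 13)]

beta_hat = fit_seasonal_trend(y)
print("estimated coefficients:", [round(b, 4) for b in beta_hat])
print("forecast for period 13:", round(forecast(beta_hat, 13), 4))
```

Only three dummy variables are used for four quarters; the first quarter is absorbed into the intercept, so each dummy coefficient measures that quarter's shift relative to the first.
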
