Econometric Modeling
Research Methods
Professor Lawrence W. Lan
Email: firstname.lastname@example.org@mdu.edu.tw
http://126.96.36.199/mdu/
Institute of Management

Overview
• Objectives
• Model building
• Types of models
• Criteria of a good model
• Data
• Desirable properties of estimators
• Methods of estimation
• Software packages and books

Objectives
• Empirical verification of theories in business, economics, management, and related disciplines is becoming increasingly quantitative.
• Econometrics, or economic measurement, is a social science in which the tools of economic theory and mathematical statistics are applied to the analysis of economic phenomena.
• The focus is on models that can be expressed in equation form and that relate variables quantitatively.
• Data are used to estimate the parameters of the equations, and the theoretical relationships are tested statistically.
• The resulting models are used for policy analysis and forecasting.

Model Building
• Model building is both a science and an art that serves policy analysis and forecasting.
– science: a set of quantitative tools used to construct and test mathematical representations of real-world problems.
– art: the intuitive judgments made during the modeling process. There are no clear-cut rules for making these judgments.

Types of Models (1/4)
• Time-series models
– Examine the past behavior of a time series in order to infer something about its future behavior, without knowing about the causal relationships that affect the variable we are trying to forecast.
– Deterministic models (e.g. linear extrapolation) vs. stochastic models (e.g. ARIMA, SARIMA).

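As an illustration of the deterministic case, linear extrapolation can be sketched by fitting a straight line to the time index and extending it forward. The function name `linear_trend_forecast` and the sample series are hypothetical, chosen only for this example.

```python
def linear_trend_forecast(series, steps_ahead):
    """Fit y_t = a + b*t by least squares (t = 1..n) and extrapolate."""
    n = len(series)
    t = list(range(1, n + 1))
    t_bar = sum(t) / n
    y_bar = sum(series) / n
    sxx = sum((ti - t_bar) ** 2 for ti in t)
    sxy = sum((ti - t_bar) * (yi - y_bar) for ti, yi in zip(t, series))
    slope = sxy / sxx
    intercept = y_bar - slope * t_bar
    # Extend the fitted line beyond the last observed period.
    return [intercept + slope * (n + h) for h in range(1, steps_ahead + 1)]

# Hypothetical series that grows by exactly 2 units per period
print(linear_trend_forecast([10.0, 12.0, 14.0, 16.0], steps_ahead=2))
```

The forecast uses no explanatory variables at all, which is the defining feature of this model type; a stochastic model such as ARIMA would additionally describe the random component of the series.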
Types of Models (2/4)
• Single-equation models
– With causal relationships (based on underlying theory) in which the variable (Y) under study is explained by a single function (linear or nonlinear) of a number of variables (Xs)
– Y: explained or dependent variable
– Xs: explanatory or independent variables

Types of Models (3/4)
• Simultaneous-equation models (or multi-equation simulation models)
– With causal relationships (based on underlying theory) in which the dependent variables (Ys) under study are related to each other through a set of equations (linear or nonlinear) that also involve a number of explanatory variables (Xs)

Types of Models (4/4)
• Combination of time-series and regression models
– Single-input vs. multiple-input transfer function models
– Linear vs. rational transfer functions
– Simultaneous-equation transfer functions
– Transfer functions with interventions or outliers

Criteria of a Good Model
• Parsimony
• Identifiability
• Goodness of fit
• Theoretical consistency
• Predictive power

Data
• Sample data: the set of observations from the measurement of variables, which may come from any number of sources and in a variety of forms.
• Time-series data: describe the movement of a variable over time.
• Cross-section data: describe the activities of individuals or groups at a given point in time.
• Pooled data: a combination of time-series and cross-section data; when the same units are followed over time, it is also known as panel, longitudinal, or micropanel data.

Desirable Properties of Estimators
• Unbiased: the mean or expected value of the estimator is equal to the true value.
• Efficient (best): the variance of the estimator is smaller than that of any other unbiased estimator.
• Minimum mean square error (MSE): trades off bias against variance. MSE is equal to the sum of the squared bias and the variance of the estimator.
• Consistent: the probability limit of the estimator equals the true value, i.e. the estimator gets arbitrarily close to the true value as the sample size grows. It is a large-sample or asymptotic property.

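The relationship MSE = bias² + variance can be checked numerically. The sketch below assumes we have a handful of estimates obtained from repeated samples; the function name `bias_variance_mse` and the numbers are hypothetical and serve only as an illustration.

```python
def bias_variance_mse(estimates, true_value):
    """Return (bias, variance, mse) computed from repeated-sample estimates."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    bias = mean_est - true_value
    # Divide by n (not n - 1) so that the decomposition holds exactly.
    variance = sum((e - mean_est) ** 2 for e in estimates) / n
    mse = sum((e - true_value) ** 2 for e in estimates) / n
    return bias, variance, mse

# Hypothetical estimates of a parameter whose true value is 2.0
estimates = [1.8, 2.1, 2.4, 2.2, 2.0]
bias, variance, mse = bias_variance_mse(estimates, true_value=2.0)
print(bias, variance, mse)
print(abs(mse - (bias ** 2 + variance)) < 1e-12)
```

A slightly biased estimator with a small variance can therefore have a lower MSE than an unbiased estimator with a large variance.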
Methods of Estimation
• Ordinary least squares (OLS)
• Maximum likelihood (ML)
• Weighted least squares (WLS)
• Generalized least squares (GLS)
• Instrumental variable (IV)
• Two-stage least squares (2SLS)
• Indirect least squares (ILS)
• Three-stage least squares (3SLS)

Software Packages and Books
• LIMDEP: single-equation and simultaneous-equation regression models
• SCA: time series models
• Textbooks
– (1) Damodar Gujarati, Essentials of Econometrics, 2nd ed. McGraw-Hill, 1999.
– (2) Robert S. Pindyck and Daniel L. Rubinfeld, Econometric Models and Economic Forecasts, 4th ed. McGraw-Hill, 1997.

Single-equation Regression Models
• Assumptions
• Best Linear Unbiased Estimation (BLUE)
• Hypothesis testing
• Violations of assumptions 1 to 5
• Forecasting

Assumptions
• A1: (i) The relationship between Y and the Xs truly exists and is correctly specified. (ii) The Xs are nonstochastic variables whose values are fixed. (iii) The Xs are not perfectly linearly related to one another.
• A2: The error term has zero expected value for all observations.
• A3: The error term has constant variance for all observations.
• A4: The error terms are statistically independent.
• A5: The error term is normally distributed.

Best Linear Unbiased Estimation
• Gauss-Markov (GM) Theorem: Given assumptions 1, 2, 3, and 4, the least squares (OLS) estimators of the regression parameters are the best (most efficient) linear unbiased estimators (BLUE).
• The GM theorem applies only to linear estimators, i.e. estimators that can be written as a weighted average of the individual observations on Y.

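To make the "weighted average of the observations on Y" concrete, the sketch below computes the OLS slope of a one-regressor model as a weighted sum of the Y values, with weights that depend only on X. The function name `ols_simple` and the data are hypothetical.

```python
def ols_simple(x, y):
    """OLS for y = a + b*x; returns (intercept, slope, weights)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    # The weights depend only on X, so the slope is linear in Y.
    weights = [(xi - x_bar) / sxx for xi in x]
    slope = sum(w * yi for w, yi in zip(weights, y))
    intercept = y_bar - slope * x_bar
    return intercept, slope, weights

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
intercept, slope, weights = ols_simple(x, y)
print(intercept, slope)
print(weights)  # the weights sum to zero
```

Because the weights are fixed once the Xs are fixed (assumption A1(ii)), the estimator is a linear function of Y, which is the class of estimators the GM theorem compares.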
Hypothesis Testing
• Normal, Chi-square, t, and F distributions
• Goodness of fit
• Testing the regression coefficients (single equation)
• Testing the regression equation (joint equations)
• Testing for structural stability or transferability of regression models

A1(ii) Violation – Xs Correlated with Error
• OLS leads to biased and inconsistent estimators
• Criteria of good instrumental (proxy) variables
• Instrumental-variables estimation → consistent, but no guarantee of unbiased or unique estimators
• Two-stage least squares (2SLS) estimation → optimal instrumental variable, unique consistent estimators

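A minimal sketch of the two estimators for the simplest case, one regressor X and one instrument Z. With a single instrument the 2SLS slope coincides with the simple IV ratio cov(Z, Y) / cov(Z, X). All function names and data below are hypothetical and for illustration only.

```python
def mean(v):
    return sum(v) / len(v)

def cov(a, b):
    a_bar, b_bar = mean(a), mean(b)
    return sum((ai - a_bar) * (bi - b_bar) for ai, bi in zip(a, b)) / len(a)

def ols_simple(x, y):
    """OLS for y = a + b*x; returns (intercept, slope)."""
    slope = cov(x, y) / cov(x, x)
    return mean(y) - slope * mean(x), slope

def iv_slope(y, x, z):
    """Simple instrumental-variables slope using instrument z."""
    return cov(z, y) / cov(z, x)

def two_stage_least_squares(y, x, z):
    # Stage 1: regress X on the instrument Z and form fitted values.
    a1, b1 = ols_simple(z, x)
    x_hat = [a1 + b1 * zi for zi in z]
    # Stage 2: regress Y on the fitted values of X.
    return ols_simple(x_hat, y)

# Hypothetical data
z = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
x = [1.2, 1.9, 3.4, 3.8, 5.3, 5.9]
y = [2.0, 3.1, 4.9, 5.2, 7.4, 7.9]
print(iv_slope(y, x, z))
print(two_stage_least_squares(y, x, z)[1])  # same slope as the IV ratio
```

With several candidate instruments, stage 1 would regress X on all of them, which is how 2SLS combines them into a single instrument.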
A1(iii) Violation -- Multicollinearity
• Perfect collinearity among any of the Xs → no unique OLS solution exists
• Near or imperfect multicollinearity → large standard errors of the OLS estimators and wider confidence intervals; high R² but few significant t values; wrong signs for regression coefficients; difficulty in explaining or assessing the individual contribution of each X to Y.

Detection of Multicollinearity
• Test the significance of $R_{-i}^2$ from the various auxiliary regressions (each $X_i$ regressed on the remaining regressors $X_{-i}$):
$F = \dfrac{R_{-i}^2/(k-1)}{(1-R_{-i}^2)/(n-k)}$,
where n = number of observations and k = number of explanatory variables including the intercept. Check whether the F-value is significantly different from zero. If yes (F-value > F-table), $X_{-i}$ and $X_i$ are significantly collinear with each other.
• Variance inflation factor: $VIF = 1/(1-R_{-i}^2)$. VIF = 1 represents no collinearity; VIF > 10 indicates a high degree of multicollinearity.

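Both diagnostics are simple functions of the auxiliary R². The sketch below follows the two formulas on this slide; the function names and the input values (R² = 0.9, n = 30, k = 4) are hypothetical.

```python
def vif(r2_aux):
    """Variance inflation factor from an auxiliary-regression R-squared."""
    return 1.0 / (1.0 - r2_aux)

def aux_f_statistic(r2_aux, n, k):
    """F statistic for the auxiliary regression, as given on the slide.

    n: number of observations
    k: number of explanatory variables including the intercept
    """
    return (r2_aux / (k - 1)) / ((1.0 - r2_aux) / (n - k))

# Hypothetical auxiliary regression
r2 = 0.9
print(vif(r2))                          # about 10: borderline by the VIF > 10 rule
print(aux_f_statistic(r2, n=30, k=4))   # compare with the tabulated F(k-1, n-k)
```

The F statistic still has to be compared with the critical value from an F table at the chosen significance level; the sketch does not perform that lookup.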
A2 Violation – Measurement Error in Y
• OLS will result in a biased intercept; however, the estimated slope parameters are still unbiased and consistent.
• Correction for the dependent variable

A3 Violation -- Heteroscedasticity
• It occurs mostly in cross-sectional data, and sometimes in time-series data.
• OLS estimators remain unbiased but are inefficient.
• Can be corrected by the weighted least squares (WLS) method
• Detection: Goldfeld-Quandt test, Breusch-Pagan test, White test, Park-Glejser test, Bartlett test, Peak test, Spearman’s rank correlation test, etc.

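A minimal sketch of the WLS correction for a one-regressor model: each observation is weighted by the inverse of its error variance. The assumption that the error variance is proportional to x² is made purely for illustration, and the function name and data are hypothetical.

```python
def wls_simple(x, y, weights):
    """Weighted least squares for y = a + b*x, minimising sum(w_i * e_i^2)."""
    sw = sum(weights)
    x_w = sum(w * xi for w, xi in zip(weights, x)) / sw   # weighted mean of x
    y_w = sum(w * yi for w, yi in zip(weights, y)) / sw   # weighted mean of y
    sxy = sum(w * (xi - x_w) * (yi - y_w) for w, xi, yi in zip(weights, x, y))
    sxx = sum(w * (xi - x_w) ** 2 for w, xi in zip(weights, x))
    slope = sxy / sxx
    intercept = y_w - slope * x_w
    return intercept, slope

# Hypothetical data
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.9, 4.3, 5.6, 8.9, 9.2]
# Assumption for illustration: error variance proportional to x_i^2
weights = [1.0 / xi ** 2 for xi in x]
print(wls_simple(x, y, weights))
print(wls_simple(x, y, [1.0] * len(x)))  # equal weights reproduce OLS
```

Observations with a larger error variance receive less weight, which is what restores efficiency when the assumed variance pattern is correct.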
A4 Violation -- Autocorrelation
• It occurs mostly in time-series data, and sometimes in cross-sectional data.
• OLS estimators remain unbiased but are inefficient.
• Can be corrected by the generalized least squares (GLS) method
• Detection: Durbin-Watson (DW) test, runs test. (With a lagged dependent variable as a regressor, DW tends toward 2 even when serial correlation is present; do not use the DW test, use the h test or t test instead.)

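The Durbin-Watson statistic is computed from the OLS residuals as the sum of squared successive differences divided by the sum of squared residuals. The sketch below uses hypothetical residuals; values near 2 suggest no first-order serial correlation, values near 0 positive correlation, and values near 4 negative correlation.

```python
def durbin_watson(residuals):
    """Durbin-Watson d = sum((e_t - e_{t-1})^2) / sum(e_t^2)."""
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e * e for e in residuals)
    return num / den

# Hypothetical residuals with long runs of the same sign
residuals = [0.5, 0.6, 0.4, 0.3, -0.2, -0.4, -0.5, -0.1]
print(durbin_watson(residuals))  # well below 2, hinting at positive autocorrelation
```

A formal decision still requires the tabulated lower and upper DW bounds for the given sample size and number of regressors, which this sketch does not include.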
A5 Violation – Non-normality
• Chi-square, t, and F tests are not exactly valid in small samples; however, they remain approximately valid in large samples.
• Detection: Shapiro-Wilk test, Anderson-Darling test, Jarque-Bera (JB) test.
$JB = \dfrac{n}{6}\left[S^2 + \dfrac{(K-3)^2}{4}\right]$, where n = sample size, K = kurtosis, S = skewness (for a normal distribution, K = 3 and S = 0). Under normality, JB asymptotically follows a chi-square distribution with 2 d.f.

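The JB formula can be evaluated directly from the sample moments. In the sketch below, skewness and kurtosis are computed from central moments divided by n; the function name `jarque_bera` and the sample values are hypothetical.

```python
def jarque_bera(sample):
    """Return (skewness, kurtosis, JB) using moment-based estimates."""
    n = len(sample)
    mean = sum(sample) / n
    m2 = sum((v - mean) ** 2 for v in sample) / n
    m3 = sum((v - mean) ** 3 for v in sample) / n
    m4 = sum((v - mean) ** 4 for v in sample) / n
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2
    jb = (n / 6.0) * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return skew, kurt, jb

# Hypothetical residuals
residuals = [-1.2, 0.3, 0.8, -0.5, 1.9, -0.7, 0.1, -0.4, 0.6, -0.9]
skew, kurt, jb = jarque_bera(residuals)
print(skew, kurt, jb)
# Compare jb with the chi-square critical value for 2 d.f. (5.99 at the 5% level).
```

Because the chi-square approximation is asymptotic, the test is best reserved for reasonably large samples; the ten-point series here only demonstrates the arithmetic.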
Forecasting
• Ex post vs. ex ante forecast
• Unconditional forecasting
• Conditional forecasting
• Evaluation of ex post forecast errors
– means: root-mean-square error, root-mean-square percent error, mean error, mean percent error, mean absolute error, mean absolute percent error, Theil’s inequality coefficient
– variances: Akaike information criterion (AIC), Schwarz information criterion (SIC)

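The error measures listed above can be sketched as follows. Theil's inequality coefficient is computed here in the form RMSE divided by the sum of the root mean squares of the forecast and actual series, which lies between 0 and 1; the function name and the data are hypothetical, and the percent measures assume that no actual value is zero.

```python
from math import sqrt

def forecast_error_measures(actual, forecast):
    """Ex post forecast error measures (percent measures in percent units)."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    pct = [e / a for e, a in zip(errors, actual)]  # assumes actual != 0
    rmse = sqrt(sum(e * e for e in errors) / n)
    rms_forecast = sqrt(sum(f * f for f in forecast) / n)
    rms_actual = sqrt(sum(a * a for a in actual) / n)
    return {
        "mean_error": sum(errors) / n,
        "mean_abs_error": sum(abs(e) for e in errors) / n,
        "rmse": rmse,
        "mean_pct_error": 100.0 * sum(pct) / n,
        "mean_abs_pct_error": 100.0 * sum(abs(p) for p in pct) / n,
        "rms_pct_error": 100.0 * sqrt(sum(p * p for p in pct) / n),
        "theil_u": rmse / (rms_forecast + rms_actual),
    }

# Hypothetical ex post comparison
actual = [100.0, 110.0, 120.0, 130.0]
forecast = [102.0, 108.0, 123.0, 127.0]
for name, value in forecast_error_measures(actual, forecast).items():
    print(name, value)
```

In this example the mean error is zero even though every forecast is off, which is why the mean error is usually read together with the RMSE or the mean absolute error.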
Simultaneous-equations Models
• Endogenous variables exist on both sides of the equations
• Structural model vs. reduced form model
• OLS will lead to biased and inconsistent estimation; the indirect least squares (ILS) method can be used to obtain consistent estimation
• The three-stage least squares (3SLS) method will result in consistent estimation
• 3SLS often performs better than 2SLS in terms of estimation efficiency

Seemingly Unrelated Equation Models
• Endogenous variables appear only on the left-hand side of the equations
• OLS usually results in unbiased but inefficient estimation
• The generalized least squares (GLS) method is used to improve the efficiency → Zellner method

Identification Problem
• Unidentified vs. identified (overidentified and exactly identified)
• Order condition
• Rank condition

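The slide names the order condition without stating it. The sketch below assumes the common textbook form: an equation can be identified only if the number of predetermined variables excluded from it is at least the number of endogenous variables it includes, minus one. The function name and arguments are hypothetical.

```python
def order_condition(total_predetermined, included_predetermined,
                    included_endogenous):
    """Classify one equation by the order condition (necessary, not sufficient).

    total_predetermined:    predetermined variables in the whole system (K)
    included_predetermined: predetermined variables in this equation (k)
    included_endogenous:    endogenous variables in this equation (m)
    """
    excluded = total_predetermined - included_predetermined  # K - k
    required = included_endogenous - 1                       # m - 1
    if excluded < required:
        return "unidentified"
    if excluded == required:
        return "exactly identified"
    return "overidentified"

# Hypothetical system with 3 predetermined variables in total
print(order_condition(3, 3, 2))  # unidentified
print(order_condition(3, 2, 2))  # exactly identified
print(order_condition(3, 1, 2))  # overidentified
```

Passing the order condition does not by itself guarantee identification; the rank condition still has to be checked on the matrix of coefficients of the excluded variables.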
Time-series Models
• Time-series data
• Univariate time series models
• Box-Jenkins modeling approach
• Transfer function models

Time-series Data
• Yt: a sequence of data observed at equally spaced time intervals
• Stationary vs. non-stationary time series
• Homogeneous vs. non-homogeneous time series
• Seasonal vs. non-seasonal time series

Univariate Time Series Models
• Types of models: white noise model, autoregressive (AR) models, moving-average (MA) models, autoregressive moving-average (ARMA) models, autoregressive integrated moving-average (ARIMA) models, seasonal ARIMA models
• Model identification: MA(q) → the sample autocorrelation function (ACF) cuts off after lag q; AR(p) → the sample partial autocorrelation function (PACF) cuts off after lag p; ARMA(p,q) → both ACF and PACF die out

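The identification step relies on the sample ACF, which can be sketched as below. Only the ACF is computed here (the PACF would need an extra recursion), and the function name and series are hypothetical.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_1..r_max_lag of a time series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    # r_k = sum over t of dev_t * dev_{t-k}, divided by the total sum of squares
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

# Hypothetical series
series = [2.0, 2.6, 3.1, 2.8, 2.2, 1.9, 2.4, 3.0, 3.3, 2.7, 2.1, 1.8]
acf = sample_acf(series, max_lag=4)
band = 2.0 / len(series) ** 0.5  # rough two-standard-error band for white noise
print(acf)
print(band)
```

Autocorrelations that fall inside the band after some lag q are treated as having cut off, which is the pattern that points to an MA(q) specification.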
Transfer Function Models
• Single input (X) vs. multiple input (Xs) models
• Linear transfer function (LTF) vs. rational transfer function (RTF) models
• Model identification (variables to be used; b, s, r for each input variable using the corner table method; ARMA model for the noise)
• Model estimation: maximum likelihood estimation (conditional or exact)
• Diagnostic checking: cross correlation function (CCF)
• Forecasting: simultaneous forecasting

Simultaneous Transfer Function (STF) Models
• Purposes (to facilitate forecasting and structural analysis of a system, and to improve forecast accuracy)
• Yt and Xt can both be endogenous variables in the system
• Use the LTF method for model identification, FIML for estimation, CCM (cross correlation matrices) for diagnostic checking, and simultaneous forecasting

Transfer Function Models with Interventions or Outliers
• Additive Outlier (AO)
• Level Shift (LS)
• Temporary Change (TC)
• Innovational Outlier (IO)
• Intervention models