Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

multiple linear reggression model

658 views

Published on

Published in: Technology, Economy & Finance
  • Be the first to comment

multiple linear reggression model

  1. 1. Short Guides to Microeconometrics Kurt Schmidheiny The Multiple Linear Regression Model 2Fall 2011 Unversit¨t Basel a 2 The Econometric Model The multiple linear regression model assumes a linear (in parameters) The Multiple Linear Regression Model relationship between a dependent variable yi and a set of explanatory variables xi =(xi0 , xi1 , ..., xiK ). xik is also called an independent variable, a covariate or a regressor. The first regressor xi0 = 1 is a constant unless otherwise specified.1 Introduction Consider a sample of N observations i = 1, ... , N . Every single obser- vation i followsThe multiple linear regression model and its estimation using ordinary yi = xi β + uileast squares (OLS) is doubtless the most widely used tool in econometrics.It allows to estimate the relation between a dependent variable and a set where β is a (K + 1)-dimensional column vector of parameters, xi is aof explanatory variables. Prototypical examples in econometrics are: (K + 1)-dimensional row vector and ui is a scalar called the error term. The whole sample of N observations can be expressed in matrix nota- • Wage of an employee as a function of her education and her work tion, experience (the so-called Mincer equation). y = Xβ + u • Price of a house as a function of its number of bedrooms and its age where y is a N -dimensional column vector, X is a N × (K + 1) matrix (an example of hedonic price regressions). and u is a N -dimensional column vector of error terms, i.e. The dependent variable is an interval variable, i.e. its values repre-       y1 1 x11 · · · x1K u1sent a natural order and differences of two values are meaningful. The   β0  y   2   1 x21 · · · x2K   u   2 dependent variable can, in principle, take any real value between −∞ and    β1    y3  1 x31 · · · x3K  u3        +∞. In practice, this means that the variable needs to be observed with =    .  .  +  .  . . .  .         ..  .  .  . . .  .    . some precision and that all observed values are far from ranges which .  . . .  . βKare theoretically excluded. Wages, for example, do strictly speaking not yN 1 xN 1 · · · xN K uNqualify as they cannot take values beyond two digits (cents) and values N ×1 N × (K + 1) (K + 1) × 1 N ×1which are negative. In practice, monthly wages in dollars in a sample The data generation process (dgp) is fully described by a set of as-of full time workers is perfectly fine with OLS whereas wages measured sumptions. Several of the following assumptions are formulated in dif-in three wage categories (low, middle, high) for a sample that includes ferent alternatives. Different sets of assumptions will lead to differentunemployed (with zero wages) ask for other estimation tools. properties of the OLS estimator.Version: 29-11-2011, 18:29
  2. 2. 3 Short Guides to Microeconometrics The Multiple Linear Regression Model 4OLS1: Linearity OLS4: Identifiability yi = xi β + ui and E(ui ) = 0 rank(X) = K + 1 < N and E(xi xi ) = QXX is positive definite and finiteOLS1 assumes that the functional relationship between dependent andexplanatory variables is linear in parameters, that the error term enters The OLS4 assumes that the regressors are not perfectly collinear, i.e. noadditively and that the parameters are constant across individuals i. variable is a linear combination of the others. For example, there can only be one constant. Intuitively, OLS4 means that every explanatory variableOLS2: Independence adds additional information. OLS4 also assumes that all regressors (but {xi , yi }N i.i.d. (independent and identically distributed) i=1 the constant) have non-zero variance and not too many extreme values.OLS2 means that the observations are independently and identically dis- OLS5: Error Structuretributed. This assumption is in practice guaranteed by random sampling. a) V (ui |xi ) = σ 2 < ∞ (homoscedasticity)OLS3: Exogeneity 2 b) V (ui |xi ) = σi = g(xi ) < ∞ (conditional heteroscedasticity) 2 a) ui |xi ∼ N (0, σ ) OLS5a (homoscedasticity) means that the variance of the error term is b) ui ⊥ xi (independent) a constant. OLS5b (conditional heteroscedasticity) allows the variance of the error term to depend on the explanatory variables. c) E(ui |xi ) = 0 (mean independent) d) cov(xi , ui ) = 0 (uncorrelated) 3 Estimation with OLSOLS3a assumes that the error term is independent of the explanatoryvariables and normally distributed. OLS3b means that the error term is Ordinary least squares (OLS) minimizes the squared distances betweenindependent of the explanatory variables. OLS3c states that the mean of the observed and the predicted dependent variable y:the error term is independent of the explanatory variables. OLS3d means Nthat the error term and the explanatory variables are uncorrelated. OLS3a S (β) = (yi − xi β)2 = (y − Xβ) (y − Xβ) → min βis the strongest assumption and implies OLS3b to OLS3d. OLS3b implies i=1OLS3c and OLS3d. OLS3c implies OLS3d which is the weakest assump- The resulting OLS estimator of β is:tion. Intuitively, OLS3 means that the explanatory variables contain noinformation about the error term. −1 β = (X X) Xy Given the OLS estimator, we can predict the dependent variable by yi = xi β and the error term by ui = yi − xi β. ui is called the residual.
  3. 3. 5 Short Guides to Microeconometrics The Multiple Linear Regression Model 6 Note: R2 is in general not meaningful in a regression without a con- 4 stant. R2 increases by construction with every (also irrelevant) additional E(y|x) OLS regressors and is therefore not a good criterium for the selection of regres- 3 data sors. 2 5 Small Sample Properties 1 Assuming OLS1, OLS2, OLS3a, OLS4, and OLS5, the following proper- 0 y ties can be established for finite, i.e. even small, samples. −1 • The OLS estimator of β is unbiased : −2 E(β|X) = β −3 • The OLS estimator is (multivariate) normally distributed: −4 β|X ∼ N β, V (β|X) 0 2 4 6 8 10 x −1 with variance V (β|X) = σ 2 (X X) under homoscedasticity (OLS5a) −1 −1 and V (β|X) = σ 2 (X X) [X ΩX] (X X) under known heteroscedas-Figure 1: The linear regression model with one regressor. β0 = −2, ticity (OLS5b). Under homoscedasticity (OLS5a) the variance Vβ1 = 0.5, σ 2 = 1, x ∼ uniform(0, 10), u ∼ N (0, σ 2 ). can be unbiasedly estimated as −1 V (β|X) = σ 2 (X X)4 Goodness-of-fit with uuThe goodness-of-fit of an OLS regression can be measured as σ2 = . N −K −1 SSR SSE R2 = 1 − = SST SST • Gauß-Markov-Theorem: under homoscedasticity (OLS5a), N Nwhere SST = ¯2 i=1 (yi − y ) is the total sum of squares, SSE = i=1 (yi − N β is BLUE (best linear unbiased estimator)y )2 the explained sum of squares and SSR = i=1 u2 the residual sum¯ iof squares. R2 lies by definition between 0 and 1 and reports the fraction Table 1 provides a systematic overview of assumptions and their impliedof the sample variation in y that is explained by the xs. properties.
  4. 4. 7 Short Guides to Microeconometrics The Multiple Linear Regression Model 8 Table 1: Properties of the OLS estimator. (k+1)−th column of V (β|X). For example, to perform a two-sided test of Case [1] [2] [3] [4] [5] [6] H0 against the alternative hypotheses Ha : βk = q on the 5% significance Assumptions level, we calculate the t-statistic and compare its absolute value to the OLS1: linearity OLS2: independence 0.975-quantile of the t-distribution. With N = 30 and K = 2, H0 is OLS3: exogeneity rejected if |t| > 2.052. - OLS3a: normality × × × × A null hypotheses of the form H0 : Rβ = q with J linear restrictions - OLS3b: independent × × × × is jointly tested with the F -test. If the null hypotheses is true, the F - - OLS3c: mean indep. × - OLS3d: uncorrelated statistic OLS4: identifiability −1 OLS5: error Structure Rβ − q RV (β|X)R Rβ − q - OLS5a: homoscedastic F = ∼ FJ,N −K−1 J - OLS5b: heteroscedastic Small sample properties of β ˆ follows an F distribution with J numerator degrees of freedom and N − unbiased × K − 1 denominator degrees of freedom. For example, to perform a two- normally distributed × × × × sided test of H0 against the alternative hypotheses Ha : Rβ = q at the efficient × × × 5% significance level, we calculate the F -statistic and compare it to the t-test, F -test × × × × × Large sample properties of β ˆ 0.95-quantile of the F -distribution. With N = 30, K = 2 and J = 2, H0 consistent is rejected if F > 3.35. We cannot perform one-sided F -tests. approx. normal Only under homoscedasticity (OLS5a), the F -statistic can also be asymptotically efficient × × × ∗ ∗ ∗ computed as z-test, Wald test Notes: = fullfiled, × = violated, ∗ = corrected standard errors (SSRrestricted − SSR)/J (R2 − Rrestricted )/J 2 F = = ∼ FJ,N −K−1 SSR/(N − K − 1) (1 − R2 )/(N − K − 1) 26 Tests in Small Samples where SSRrestricted and Rrestricted are, respectively, estimated by re- striced least squares which minimizes S(β) s.t. Rβ = q. ExclusionaryAssume OLS1, OLS2, OLS3a, OLS4, and OLS5a. restrictions of the form H0 : βk = 0, βm = 0, ... are a special case of A simple null hypotheses of the form H0 : βk = q is tested with the H0 : Rβ = q. In this case, restricted least squares is simply estimated ast-test. If the null hypotheses is true, the t-statistic a regression were the explanatory variables k, m, ... are excluded. βk − q t= ∼ tN −K−1 se(βk ) 7 Confidence Intervals in Small Samplesfollows a t-distribution with N − K − 1 degrees of freedom. The standard Assuming OLS1, OLS2, OLS3a, OLS4, and OLS5a, we can constructerror se(βk ) is the square root of the element in the (k + 1)−th row and confidence intervals for a particular coefficient βk . The (1 − α) confidence
  5. 5. 9 Short Guides to Microeconometrics The Multiple Linear Regression Model 10interval is given by robust or Eicker-Huber-White estimator (see handout on “Heteroscedas- ticity in the linear Model”) βk − t(1−α/2),(N −K−1) se(βk ) , βk + t(1−α/2),(N −K−1) se(βk ) N −1 −1where t(1−α/2),(N −K−1) is the (1 − α/2) quantile of the t-distribution with Avar(β) = (X X) u2 xi xi (X X) i . i=1N − K − 1 degrees of freedom. For example, the 95 % confidence intervalwith N = 30 and K = 2 is βk − 2.052se(βk ) , βk + 2.052se(βk ) . Note: In practice we can almost never be sure that the errors are8 Asymptotic Properties of the OLS Estimator homoscedastic and should therefore always use robust standard errors. Table 1 provides a systematic overview of assumptions and their im-Assuming OLS1, OLS2, OLS3d, OLS4, and OLS5a or OLS5b the follow- plied properties.ing properties can be established for large samples. • The OLS estimator is consistent: 9 Asymptotic Tests plim β = β Assume OLS1, OLS2, OLS3d, OLS4, and OLS5a or OLS5b. A simple null hypotheses of the form H0 : βk = q is tested with the • The OLS estimator is asymptotically normally distributed: z-test. If the null hypotheses is true, the z-statistic √ d N (β − β) −→ N (0, Σ) βk − q A z= ∼ N (0, 1) se(βk ) where Σ = σ 2 [E(xi xi )]−1 = σ 2 Q−1 = under OLS5a and Σ = XX [E(xi xi )]−1 [E(u2 xi xi )][E(xi xi )]−1 = QXX [E(u2 xi xi )]Q−1 under −1 follows approximately the standard normal distribution. The standard i i XX OLS5b. error se(βk ) is the square root of the element in the (k + 1)−th row and (k + 1)−th column of Avar(β). For example, to perform a two sided • The OLS estimator is approximately normally distributed test of H0 against the alternative hypotheses Ha : βk = q on the 5% A significance level, we calculate the z-statistic and compare its absolute β ∼ N β, Avar(β) value to the 0.975-quantile of the standard normal distribution. H0 is where the asymptotic variance Avar(β) = N −1 Σ can be consistently rejected if |z| > 1.96. estimated under OLS5a (homoscedasticity) as A null hypotheses of the form H0 : Rβ = q with J linear restrictions is jointly tested with the Wald test. If the null hypotheses is true, the Wald −1 Avar(β) = σ 2 (X X) statistic −1 A with σ 2 = u u/N and under OLS5b (heteroscedasticity) as the W = Rβ − q RAvar(β)R Rβ − q ∼ χ2 J
  6. 6. 11 Short Guides to Microeconometrics The Multiple Linear Regression Model 12follows approximately an χ2 distribution with J degrees of freedom. For 11 Small Sample vs. Asymptotic Propertiesexample, to perform a test of H0 against the alternative hypotheses Ha :Rβ = q on the 5% significance level, we calculate the Wald statistic and The t-test, F -test and confidence interval for small samples depend on thecompare it to the 0.95-quantile of the χ2 -distribution. With J = 2, H0 is normality assumption OLS3d (see Table 1). This assumption is strong andrejected if W > 5.99. We cannot perform one-sided Wald tests. unlikely to be satisfied. The asymptotic z-test, Wald test and the con- Under OLS5a (homoscedasticity) only, the Wald statistic can also be fidence interval for large samples rely on much weaker assumptions. Al-computed as though most statistical software packages report the small sample results by default, we would typically prefer the large sample approximations. In (SSRrestricted − SSR) (R2 − Rrestricted ) A 2 2 W = = ∼ χJ practice, small sample and asymptotic tests and confidence intervals are SSR/N (1 − R2 )/N very similar already for relatively small samples, i.e. for (N − K) > 30. 2where SSRrestricted and Rrestricted are, respectively, estimated by re- Large sample tests also have the advantage that they can be based onstricted least squares which minimizes S(β) s.t. Rβ = q. Exclusionary heteroscedasticity robust standard errors.restrictions of the form H0 : βk = 0, βm = 0, ... are a special case ofH0 : Rβ = q. In this case, restricted least squares is simply estimated asa regression were the explanatory variables k, m, ... are excluded. Note: the Wald statistic can also be calculated as A W = J · F ∼ χ2 Jwhere F is the small sample F -statistic. This formulation differs by afactor (N − K − 1)/N but has the same asymptotic distribution.10 Confidence Intervals in Large SamplesAssuming OLS1, OLS2, OLS3d, OLS4, and OLS5a or OLS5b, we canconstruct confidence intervals for a particular coefficient βk . The (1 − α)confidence interval is given by βk − z(1−α/2) se(βk ) , βk + z(1−α/2) se(βk )where z(1−α/2) is the (1 − α/2) quantile of the standard normal distribu-tion. For example, the 95 % confidence interval is βk − 1.96se(βk ) , βk +1.96se(βk ) .
  7. 7. 13 Short Guides to Microeconometrics The Multiple Linear Regression Model 1412 Implementation in Stata 11 13 Implementation in R 2.13The multiple linear regression model is estimated by OLS with the regress The multiple linear regression model is estimated by OLS with the lmcommand For example, function. For example, webuse auto.dta > library(foreign) regress mpg weight displacement > auto <- read.dta("http://www.stata-press.com/data/r11/auto.dta") > fm <- lm(mpg~weight+displacement, data=auto)regresses the mileage of a car (mpg) on weight and displacement. A > summary(fm)constant is automatically added if not suppressed by the option noconst regresses the mileage of a car (mpg) on weight and displacement. A regress mpg weight displacement, noconst constant is automatically added if not suppressed by -1Estimation based on a subsample is performed as > lm(mpg~weight+displacement-1, data=auto) regress mpg weight displacement if weight>3000 Estimation based on a subsample is performed aswhere only cars heavier than 3000 lb are considered. Transformations of > lm(mpg~weight+displacement, subset=(weight>3000), data=auto)variables are included with new variables where only cars heavier than 3000 lb are considered. Tranformations of generate logmpg = log(mpg) variables are directly included with the I() function generate weight2 = weight^2 regress logmpg weight weight2 displacement > lm(I(log(mpg))~weight+I(weight^2)+ displacement, data=auto)The Eicker-Huber-White covariance is reported with the option robust The Eicker-Huber-White covariance is reported after estimation with regress mpg weight displacement, vce(robust) > library(sandwich) > library(lmtest)F -tests for one or more restrictions are calculated with the post-estimation > coeftest(fm, vcov=sandwich)command test. For example F -tests for one or more restrictions are calculated with the command test weight waldtest which also uses the two packages sandwich and lmtesttests H0 : β1 = 0 against HA : β1 = 0, and > waldtest(fm, "weigth", vcov=sandwich) test weight displacement tests H0 : β1 = 0 against HA : β1 = 0 with Eicker-Huber-White, andtests H0 : β1 = 0 and β2 = 0 against HA : β1 = 0 or β2 = 0 > waldtest(fm, .~.-weight-displacement, vcov=sandwich) New variables with residuals and fitted values are generated by tests H0 : β1 = 0 and β2 = 0 against HA : β1 = 0 or β2 = 0. predict uhat if e(sample), resid New variables with residuals and fitted values are generated by predict pricehat if e(sample) > auto$uhat <- resid(fm) > auto$mpghat <- fitted(fm)
  8. 8. 15 Short Guides to MicroeconometricsReferencesIntroductory textbooksStock, James H. and Mark W. Watson (2007), Introduction to Economet- rics, 2nd ed., Pearson Addison-Wesley. Chapters 4 - 9.Wooldridge, Jeffrey M. (2009), Introductory Econometrics: A Modern Approach, 4th ed., South-Western Cengage Learning. Chapters 2 - 8.Advanced textbooksCameron, A. Colin and Pravin K. Trivedi (2005), Microeconometrics: Methods and Applications, Cambridge University Press. Sections 4.1- 4.4.Wooldridge, Jeffrey M. (2002), Econometric Analysis of Cross Section and Panel Data, MIT Press. Chapters 4.1 - 4.23.Companion textbooksAngrist, Joshua D. and J¨rn-Steffen Pischke (2009), Mostly Harmless o Econometrics: An Empiricist’s Companion, Princeton University Press. Chapter 3.Kennedy, Peter (2008), A Guide to Econometrics, 6th ed., Blackwell Pub- lishing. Chapters 3 - 11, 14.

×