# PLSC 503, Spring 2013: Lecture 10

Posted on Apr 09, 2013


## Presentation Transcript

### The bootstrap; time-series cross-section models

PLSC 503: Quantitative Methods, Week 10. Thad Dunning, Department of Political Science, Yale University. Lecture Notes, Week 10.
### The bootstrap

Here we introduce the bootstrap, a form of Monte Carlo simulation.

- The bootstrap is often used to approximate the bias and standard error of an estimator in a complex statistical model.
- It is useful when statistical theory is unclear about the extent of bias, or when results about bias and variance hold only asymptotically (that is, as the sample size gets very large).
- To get a handle on the mechanics, let's start with a simple example where theory does provide a reliable guide: the sample mean.
### Bootstrapping the sample mean

Let $X_i$ be IID for $i = 1, \ldots, n$, with mean $\mu$ and variance $\sigma^2$. We use the sample mean $\bar{X}$ to estimate $\mu$. Is this estimator biased? What is its standard error?

- The bootstrap idea: treat the data, the $n$ observed values of $X_i$, as a small population. The mean of this population is $\bar{X}$.
- Using the computer, simulate $n$ draws made at random with replacement from this small population, yielding $X_1^*, X_2^*, \ldots, X_n^*$.
### The bootstrap sample

Now $X_1^*, \ldots, X_n^*$ is the bootstrap sample. Some elements of the small population will enter the sample more than once, some will enter once, and some will not enter at all. The average of the bootstrap sample estimates the average of the small population (that is, of the original, real sample):

$$\bar{X}^* = \frac{1}{n} \sum_{i=1}^{n} X_i^* \qquad (1)$$

The key idea: we can draw many bootstrap samples to get the sampling distribution of $\bar{X}^*$.
### Bootstrap replicates

Suppose we have $k = 1, \ldots, N$ bootstrap replicates. Denote the mean of each bootstrap sample as

$$\bar{X}^{(1)}, \ldots, \bar{X}^{(k)}, \ldots, \bar{X}^{(N)}. \qquad (2)$$

Now approximate the expected value of each bootstrap replicate by the mean of all the bootstrap replicates:

$$\bar{X}_{\text{ave}} = \frac{1}{N} \sum_{k=1}^{N} \bar{X}^{(k)}. \qquad (3)$$

We should find that

$$\bar{X}_{\text{ave}} \approx \bar{X}, \qquad (4)$$

because $\bar{X}_{\text{ave}}$ is an unbiased and consistent estimator for $\bar{X}$.
### Bootstrapping the standard error

We can also use the bootstrap to estimate the sampling variance of $\bar{X}$. The variance of the $N$ bootstrap replicates is

$$\mathrm{var}(\bar{X}^{(k)}) = \frac{1}{N} \sum_{k=1}^{N} [\bar{X}^{(k)} - \bar{X}_{\text{ave}}]^2. \qquad (5)$$

The square root of (5) is the bootstrap SE: the SD of the bootstrap replicates. The bootstrap SE estimates the true SE; that is, it tells us how good the original $\bar{X}$ was as an estimate of $\mu$.
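The procedure for the sample mean can be sketched in a few lines of Python. This is an illustrative sketch, not code from the lecture: the data are made up (exponential draws), and all variable names are my own.

```python
import numpy as np

rng = np.random.default_rng(0)

# The "real" sample: n IID draws (here from an exponential, purely for illustration)
n = 200
x = rng.exponential(scale=2.0, size=n)

# Draw N bootstrap samples of size n, at random with replacement, from the data
N = 2000
boot_means = np.empty(N)
for k in range(N):
    resample = rng.choice(x, size=n, replace=True)  # one bootstrap sample
    boot_means[k] = resample.mean()                 # one bootstrap replicate

x_ave = boot_means.mean()   # approximates E(X*-bar); should be close to x.mean()
boot_se = boot_means.std()  # bootstrap SE = SD of the replicates

print(x.mean(), x_ave, boot_se)
# For comparison: the textbook SE estimate for the sample mean, s / sqrt(n)
print(x.std(ddof=1) / np.sqrt(n))
```

With $n$ and $N$ this large, `boot_se` should land very close to the usual $s/\sqrt{n}$ estimate, which is the point of starting with an example where theory already gives the answer.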
### The theory of the bootstrap

The key idea is that the bootstrap mimics the sampling procedure under consideration.

For example, in bootstrapping the sample mean, we drew at random with replacement from the small population constructed from the real sample, because this is how the real sample was drawn from the true population.

Bootstrap principle for the sample mean: the distribution of $\bar{X}^* - \bar{X}$ should be a good approximation to the distribution of $\bar{X} - \mu$.

For this to work, $n$ should be reasonably large; that's what makes the empirical distribution of $\{X_1, \ldots, X_n\}$ a good approximation to the distribution that the real sample came from.
### Bootstrapping a regression model

Suppose we have the OLS regression model $Y = X\beta + \epsilon$. Here $X_{n \times p}$ has full rank and its first column is all 1s; the $\epsilon_i$ are IID with mean 0 and variance $\sigma^2$, independent of $X$.

Under these assumptions, given fixed $X$ and data on $Y$, we can bootstrap the regression model. The key is to use the computer to simulate the data-generating process assumed by the model.
### Bootstrapping a regression model (continued)

Let $\hat{\beta} = (X'X)^{-1}X'Y$ be the OLS estimator, and let $e = Y - X\hat{\beta}$ be the residuals.

Note that $\bar{e} = 0$ (why?). So the residuals are a small population with mean zero.

To bootstrap the regression model, draw $n$ times at random with replacement from this small population, to get IID bootstrap errors $\epsilon_1^*, \ldots, \epsilon_n^*$.
### The first bootstrap replicate

Using $\epsilon_1^*, \ldots, \epsilon_n^*$ and the fixed design matrix, we can now generate a new dataset:

$$Y_i^* = X_i\hat{\beta} + \epsilon_i^*, \quad i = 1, \ldots, n,$$

where $X_i$ is the $i$th row of $X$. Here we are sampling with replacement from the small population of residuals, so $\epsilon_i^*$ could be any one of the $n$ residuals from the original data set.
### The bootstrap estimator

Now we can construct the OLS estimator for this single bootstrapped data set. This is

$$\hat{\beta}^* = (X'X)^{-1}X'Y^*,$$

where $Y^*$ is the simulated dependent variable for this bootstrap replicate. What is the distribution of $\hat{\beta}^* - \hat{\beta}$? This is where the bootstrap replicates come in.
### The bootstrap replicates

We repeat the entire process $N$ times. That is, for each of the $k = 1, \ldots, N$ replicates, we:

- draw $n$ bootstrap errors at random with replacement from the original box of residuals;
- generate $Y^* = X\hat{\beta} + \epsilon^*$; and
- calculate $\hat{\beta}^*$.

Now we have $N$ little datasets, each with $n$ observations. We can make $N$ as big as we like (computing power is cheap), so we can observe many realizations of the random variable $\hat{\beta}^*$.
### The bootstrap principle for regression

For each of the $k = 1, \ldots, N$ bootstrap replicates, denote the bootstrap estimator $\hat{\beta}^{(k)}$. (That is, $\hat{\beta}^{(k)}$ is $\hat{\beta}^*$ for the $k$th bootstrap replicate.)

The average of the OLS estimator across the bootstrap replicates,

$$\hat{\beta}_{\text{ave}} = \frac{1}{N} \sum_{k=1}^{N} \hat{\beta}^{(k)},$$

gives us the expected value, up to sampling error, of the bootstrap estimator for $\hat{\beta}$, the original estimated vector that we told the computer to take as the truth. The role of sampling error in the bootstrap is minimal, because we can make $N$ very large.
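The whole residual-bootstrap loop for regression can be sketched in Python. This is a hypothetical illustration with simulated data; the variable names and the comparison to the usual OLS formula are my own additions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated "real" data for a small OLS model (intercept plus one regressor)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])  # first column is 1s
y = X @ np.array([1.0, 0.5]) + rng.normal(scale=1.5, size=n)

# Fit OLS on the real data; freeze beta_hat and the residuals
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat          # mean-zero "small population" of errors

# Residual bootstrap: resample errors, regenerate Y*, refit
N = 2000
boot_betas = np.empty((N, p))
for k in range(N):
    eps_star = rng.choice(resid, size=n, replace=True)
    y_star = X @ beta_hat + eps_star
    boot_betas[k], *_ = np.linalg.lstsq(X, y_star, rcond=None)

beta_ave = boot_betas.mean(axis=0)  # approximates E(beta*); should be near beta_hat
boot_se = boot_betas.std(axis=0)    # bootstrap SEs, one per coefficient

# For comparison: the usual OLS standard errors on the original data
sigma2_hat = resid @ resid / (n - p)
nominal_se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))
print(beta_ave, boot_se, nominal_se)
```

Because the simulated errors here really are IID, the bootstrap SEs should agree closely with the textbook formula; the payoff of the bootstrap comes in models where no such formula is trustworthy.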
### The theory of the bootstrap for regression

The key idea: with a reasonably large $n$, the sampling distribution of $\hat{\beta}^* - \hat{\beta}$ is a good approximation to the sampling distribution of $\hat{\beta} - \beta$.

In particular, the empirical covariance matrix of the $\hat{\beta}^*$ is a good approximation to the theoretical covariance matrix of $\hat{\beta}$.

This is because we constructed a data-generating process, using the computer, that mirrors the assumptions of the model and uses the residuals from the original regression to approximate the box of errors. This works as long as the distribution of the residuals is a good estimate of the distribution of the true errors; that's why $n$ needs to be at least moderately large.
### The bootstrap (empirical) standard errors

The "empirical" covariance matrix of the bootstrap estimator is

$$\frac{1}{N} \sum_{k=1}^{N} [\hat{\beta}^{(k)} - \hat{\beta}_{\text{ave}}][\hat{\beta}^{(k)} - \hat{\beta}_{\text{ave}}]'.$$

The square roots of the diagonal elements of this matrix are the bootstrap SEs. The matrix is "empirical" because the $N$ actual replicates trace out the sampling distribution of the bootstrap estimator.

By the bootstrap principle, the empirical matrix approximates the theoretical covariance matrix,

$$E\{[\hat{\beta} - E(\hat{\beta})][\hat{\beta} - E(\hat{\beta})]'\} = \sigma^2 (X'X)^{-1}.$$
### The nominal standard errors

Note that we can also calculate the "nominal" standard errors, that is, the standard errors given by the usual regression formulas, for any bootstrap replicate.

The tricky part is that we need to use the residuals from the least-squares fit to each bootstrapped data set. For any particular data set, $e^* = Y^* - X\hat{\beta}^*$, so the nominal covariance matrix is

$$\widehat{\mathrm{cov}}(\hat{\beta}^* \mid X) = \frac{\|e^*\|^2}{n - p} (X'X)^{-1}.$$

The square root of each diagonal element gives us the nominal standard error for the associated coefficient.
### Bias in the nominal standard errors

Let the nominal SE of, say, $\hat{\beta}_2^{(k)}$ (that is, of $\hat{\beta}_2^*$ for the $k$th bootstrap replicate) be $\widehat{SE}(\hat{\beta}_2^{(k)})$. The average of the nominal SE across all replicates is

$$\frac{1}{N} \sum_{k=1}^{N} \widehat{SE}(\hat{\beta}_2^{(k)}).$$

A difference between this quantity and the standard deviation of the bootstrap replicates indicates bias in the nominal standard errors. If we ran the simulations discussed in this subsection, we shouldn't see any bias. (Why not?)
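This comparison is easy to carry out numerically. Below is a hypothetical sketch (my own simulated data and names) that tracks, for the slope coefficient, both the bootstrap replicates and the per-replicate nominal SE:

```python
import numpy as np

rng = np.random.default_rng(2)

# Simulated data satisfying the OLS model's assumptions (IID errors)
n, p = 100, 2
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta_hat
XtX_inv = np.linalg.inv(X.T @ X)

N = 2000
boot_b2 = np.empty(N)        # bootstrap replicates of the slope
nominal_se_b2 = np.empty(N)  # nominal SE of the slope, one per replicate
for k in range(N):
    y_star = X @ beta_hat + rng.choice(resid, size=n, replace=True)
    b_star, *_ = np.linalg.lstsq(X, y_star, rcond=None)
    e_star = y_star - X @ b_star                 # residuals from THIS replicate's fit
    sigma2_star = e_star @ e_star / (n - p)
    boot_b2[k] = b_star[1]
    nominal_se_b2[k] = np.sqrt(sigma2_star * XtX_inv[1, 1])

# With IID errors, the average nominal SE should match the SD of the replicates
print(nominal_se_b2.mean(), boot_b2.std())
```

Since the simulated DGP satisfies every assumption of the model, the two numbers printed at the end should be nearly identical, which is the "no bias" answer the slide's closing question points to.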
### Mean-squared error

In settings where estimators, or their estimated standard errors, may be biased, and we are comparing alternative estimators, the concept of mean-squared error (MSE) is useful. For the bootstrap, the measure is

$$\frac{1}{N} \sum_{k=1}^{N} (\hat{\beta}^{(k)} - \hat{\beta})^2. \qquad (6)$$

This gives the average squared difference, across the $N$ replicates, between the bootstrap estimator and what we told the computer to take as the "truth", namely $\hat{\beta}$. Thus (6) reflects both the bias and the precision of $\hat{\beta}^{(k)}$ as an estimator for $\hat{\beta}$.
### Decomposition of the mean-squared error

In general, the mean-squared error of an estimator is the variance of the estimator plus the square of its bias. Why? The variance of any random variable $Z$ is

$$\mathrm{Var}(Z) = E[Z^2] - [E(Z)]^2. \qquad (7)$$

Let $Z = \hat{\beta}^{(k)} - \hat{\beta}$, where $\hat{\beta}^{(k)}$ is a random variable and $\hat{\beta}$ is fixed (as in the bootstrap). Then

$$\mathrm{Var}(\hat{\beta}^{(k)} - \hat{\beta}) = E[(\hat{\beta}^{(k)} - \hat{\beta})^2] - [E(\hat{\beta}^{(k)} - \hat{\beta})]^2.$$

Since $\mathrm{Var}(\hat{\beta}^{(k)} - \hat{\beta}) = \mathrm{Var}(\hat{\beta}^{(k)})$ and $E[(\hat{\beta}^{(k)} - \hat{\beta})^2]$ is the MSE, we have

$$\mathrm{MSE} = \mathrm{Variance} + \mathrm{Bias}^2. \qquad (8)$$
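The decomposition can be checked numerically with a deliberately biased estimator. A minimal sketch (the target value, bias, and spread below are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(3)

# Replicates of a deliberately biased estimator of a fixed target
target = 2.0
est = rng.normal(loc=2.3, scale=0.7, size=100_000)  # bias 0.3, SD 0.7

mse = np.mean((est - target) ** 2)  # average squared error, as in (6)
var = est.var()                     # variance of the replicates
bias = est.mean() - target          # average error
print(mse, var + bias ** 2)         # the two numbers agree, per (8)
```

The agreement here is exact up to floating-point rounding, because (8) is an algebraic identity over the replicates, not an approximation.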
### Autoregression

Here's another example, one in which you might actually want to use the bootstrap. Suppose that for all time periods $t = 1, 2, \ldots, n$, we have

$$Y_t = a + bY_{t-1} + \epsilon_t. \qquad (9)$$

Here $Y_0$ is a fixed number, and the $\epsilon_t$ are IID with mean 0 and variance $\sigma^2$. By assumption, $|b| < 1$. (This is a stationary time series; if $|b| = 1$, it has a unit root.)

This is an example of an autoregression, in which a dependent variable is regressed on its one-period lag.
• Bias in autoregression. There are n equations. In matrix form, (Y_1, Y_2, ..., Y_n)′ = X(a, b)′ + (ε_1, ε_2, ..., ε_n)′, where row t of the design matrix X is (1, Y_{t−1}). (10) Notice that under the assumptions of the model, we cannot have the errors independent of X. (Why not?) Indeed, Y_t is a function of ε_t, so the column of errors cannot be independent of the design matrix. Lecture Notes, Week 10 21/48
• The bootstrap sample. Use the bootstrap procedure to assess the degree of bias: (1) Estimate the model on the original data and freeze Y_0, β̂ = (â, b̂)′, and e = Y − Xβ̂. (2) Resample the e's to get the bootstrap errors ε*_1, ..., ε*_n. (3) Generate the Y*_t one at a time: Y*_1 = â + b̂Y_0 + ε*_1; Y*_2 = â + b̂Y*_1 + ε*_2; ...; Y*_n = â + b̂Y*_{n−1} + ε*_n. Here, the design matrix is random, as are the Y*_t. However, Y_0 is fixed (by assumption), so you can build up the Y*_t one at a time. Lecture Notes, Week 10 22/48
• The bootstrap replicates. As usual, we replicate this procedure N times, each time computing the bootstrap estimator β̂* = (X*′X*)⁻¹X*′Y*. The usual bootstrap principle applies. In particular, with large enough n, the average of β̂* − β̂ is a good approximation to the bias in β̂. Actual simulations would indeed show bias, due to the presence of the lagged dependent variables. Lecture Notes, Week 10 23/48
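The full residual-bootstrap recipe from the preceding slides can be sketched as follows; the data-generating values and replicate counts are illustrative assumptions, not the lecture's:

```python
import numpy as np

# Residual bootstrap for the AR(1) model: fit on the observed data,
# freeze (a_hat, b_hat) and the residuals, then rebuild bootstrap
# series one observation at a time with Y_0 held fixed.
rng = np.random.default_rng(1)

def fit_ar1(y):
    """OLS of Y_t on (1, Y_{t-1}); returns (a_hat, b_hat, residuals)."""
    X = np.column_stack([np.ones(len(y) - 1), y[:-1]])
    coef, *_ = np.linalg.lstsq(X, y[1:], rcond=None)
    return coef[0], coef[1], y[1:] - X @ coef

# "Observed" series (simulated here for the sketch)
a, b, n = 1.0, 0.6, 50
y = np.empty(n + 1)
y[0] = 0.0
for t in range(1, n + 1):
    y[t] = a + b * y[t - 1] + rng.normal()

a_hat, b_hat, resid = fit_ar1(y)

# Bootstrap replicates
b_star = []
for _ in range(2000):
    e_star = rng.choice(resid, size=n, replace=True)   # resample residuals
    y_star = np.empty(n + 1)
    y_star[0] = y[0]                                   # Y_0 stays fixed
    for t in range(1, n + 1):
        y_star[t] = a_hat + b_hat * y_star[t - 1] + e_star[t - 1]
    b_star.append(fit_ar1(y_star)[1])

bias_estimate = np.mean(b_star) - b_hat   # approximates the bias in b_hat
print(b_hat, bias_estimate)               # bias is typically negative here
```

In small samples the OLS slope of an autoregression is biased downward, and the bootstrap bias estimate picks this up.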
• A model with time-series cross-section data. For a final example of the bootstrap, consider a model for time-series cross-sectional data: Y_{t,j} = α_j + bY_{t−1,j} + cW_{t,j} + ε_{t,j}. Here, t is a time index and j is a unit index (e.g., countries). We need to get the model into the matrix framework. Read Freedman (2009), 8.1 (Example 4) and 8.2. Lecture Notes, Week 10 24/48
• Comparisons across cases over time. Let's take a step back to consider the conditions under which time-series cross-sectional data may help us make causal inferences, and the conditions under which they don't. A classic example: the Connecticut Crackdown (Campbell and Ross 1968). At the start of 1956, the Governor of Connecticut signed a new speeding law. Traffic fatalities dropped sharply in the next year. Question: can the drop in traffic fatalities be attributed to the causal effect of the law? Lecture Notes, Week 10 25/48
• The Connecticut Crackdown. [Figure 1: Connecticut Traffic Fatalities, 1955–1956, before vs. after the crackdown.] This one-group before/after comparison fails to control for the six common threats to the validity of experiments specified below. Lecture Notes, Week 10 26/48
• The "One Group Pretest-Posttest Design": Campbell's checklist of threats to internal validity. History (events other than the law occurring between pre-test and post-test); maturation (changes/processes due to the passage of time); testing (change due to publication/dissemination of the pre-test); instrumentation (shift of the measurement tool from pre-test to post-test); (instability: variation/flux in year-on-year death rates); regression to the mean ("regression effects": the tendency of extreme values to revert towards average values). Lecture Notes, Week 10 27/48
• Failures of symmetry. Notice that for each threat to internal validity, the basic issue is a lack of symmetry between pre-test and post-test groups. History and maturation: the post-test and pre-test groups differ in ways other than the intervention. Testing and instrumentation: the post-test group is exposed to publicity/testing or measurement tools distinct from the pre-test group. Regression: exposure to the intervention (post-test) is related to previous extreme values. These differences may also be related to differences in outcomes (e.g., measured death rates): thus the threats to validity/causal attribution. The pre-test and post-test groups are not plausible counterfactuals: looking at the pre-test group does not reveal the outcomes that would have materialized in the post-test group, had the intervention never been applied. Lecture Notes, Week 10 28/48
• Selection. The hallmark of an observational study is self-selection. People/units sort themselves into comparison groups (CT decides to pass a traffic law while other states may not). Policy-makers apply interventions to some groups and not others, in ways related to outcomes (governors implement speeding laws after unusually high traffic fatalities). Self-selection leads to confounding and can produce failures of symmetry. Lecture Notes, Week 10 29/48
• "Interrupted Time Series". [Figure 2: Connecticut Traffic Fatalities, 1951–1959.] A coincident reform of record keeping can rule out valid inferences as to effects; the Chicago police reform cited above is a case in point. Lecture Notes, Week 10 30/48
• "Multiple Time Series" (treatment and comparison groups). [Figure 3: Connecticut and Control States Traffic Fatalities, 1951–1959 (per 100,000 population).] Data for Connecticut and the control states are expressed per 100,000 population to bring the figures into proximity. The control data are much smoother, due to the much larger base, i.e., the canceling out of chance deviations in the annual figures for particular states. Lecture Notes, Week 10 31/48
• Difference-in-differences analysis. One strategy for studying the effect of the traffic law is thus to compare the change in fatalities in Connecticut to the change in fatalities among states that didn't pass the traffic law. This is an example of difference-in-differences analysis. For each state i, we want to know how fatalities after the passage of the law would differ from fatalities before the law if the state passes a law, and compare that to the change in fatalities if the state didn't pass the law. This is the unit causal effect for state i. In the simplest case, we have two time periods and two groups. Lecture Notes, Week 10 32/48
• The importance of "parallel trends". [Figure: the treatment group's outcome rises from Y1 (pre) to Y2 (post) and the control group's from Y3 (pre) to Y4 (post); treatment happens between the two periods, and the average treatment effect is the gap between Y2 and the treatment group's counterfactual parallel trend. Source: Berry (2011).] Lecture Notes, Week 10 33/48
• The difference, without the difference. [Figure: without the pre-treatment measurement, the measured effect is just the post-treatment gap between treatment and control groups, Y2 − Y4. Source: Berry (2011).] Lecture Notes, Week 10 34/48
• Difference-in-differences with randomized experiments. What does randomization to treatment imply about the difference Y1 − Y3? Because these variables are measured before treatment, these pre-treatment covariates (in this case, prior measures of the outcome variable) should not be affected by treatment assignment. Randomization implies E[Y1] = E[Y3]. (Here we are treating the Yi's as random variables, thus the expectations.) Note that E[Y2 − Y1 − (Y4 − Y3)] = E[Y2] − E[Y1] − E[Y4] + E[Y3] = E[Y2] − E[Y4], (11) so with randomized treatment assignment the expected value of the difference-in-differences is the expected value of the (post-treatment) difference. Lecture Notes, Week 10 35/48
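Equation (11) can be illustrated by simulation: with randomized assignment, the difference-in-differences and the simple post-treatment difference are both centered on the true effect. All numbers below are made up for illustration:

```python
import numpy as np

# With randomized assignment, E[Y1] = E[Y3], so the DiD and the
# post-treatment difference Y2 - Y4 agree in expectation.
rng = np.random.default_rng(2)
n, effect = 200_000, 2.0

baseline = rng.normal(10.0, 3.0, size=n)        # pre-treatment outcome
treated = rng.random(n) < 0.5                   # randomized assignment
post = baseline + 1.0 + effect * treated + rng.normal(0.0, 1.0, n)

y1, y3 = baseline[treated].mean(), baseline[~treated].mean()
y2, y4 = post[treated].mean(), post[~treated].mean()

did = (y2 - y1) - (y4 - y3)
post_diff = y2 - y4
print(did, post_diff)    # both close to the true effect of 2.0
```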
• Difference of means versus difference-in-differences. With randomized assignment to treatment, the difference-of-means and difference-in-differences estimators are both unbiased, e.g., for the average causal effect. However, subtracting the pre-score (or other pre-treatment covariate) from the post-score can reduce the variability of the effect estimators. The key is the correlation between the pre-treatment covariates and potential outcomes (Gerber and Green 2012, 4.1). When the pre-treatment covariate is highly correlated with the outcome, the difference-in-differences estimator may be much more precise (have less variance), compared to the difference of means. Lecture Notes, Week 10 36/48
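A quick simulation along the lines of this precision comparison (parameters here are illustrative, not Gerber and Green's):

```python
import numpy as np

# Over many simulated randomized experiments, both estimators are
# centered on the true effect, but when the pre-score is highly
# correlated with the outcome, DiD has a much smaller variance.
rng = np.random.default_rng(3)
n, effect, n_sims = 100, 1.0, 4000

dom, did = [], []
for _ in range(n_sims):
    pre = rng.normal(0.0, 1.0, n)
    treated = rng.random(n) < 0.5
    post = pre + effect * treated + rng.normal(0.0, 0.3, n)  # high corr(pre, post)
    dom.append(post[treated].mean() - post[~treated].mean())
    did.append((post[treated] - pre[treated]).mean()
               - (post[~treated] - pre[~treated]).mean())

print(np.std(dom), np.std(did))   # DiD is noticeably more precise here
```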
• One simulation: sampling distribution of difference-of-means and difference-in-differences estimators (Gerber and Green 2012, Figure 4.1). Lecture Notes, Week 10 37/48
• Implementing difference-in-differences analysis. With observational data, there are two issues. The first is the usual problem of missing counterfactuals: we only see states that pass traffic laws or don't. In particular, it may be hard to validate the parallel-trends assumption. In some contexts, a second issue is that we may not have panel data: that is, we may not observe the same units pre- and post-reform. Example: what is the effect of attending a reform high school, versus a regular school? We may have data on cohorts of 9th and 12th graders, not repeated observations on the same students. We can deal with the second issue by comparing grade averages within schools and then comparing this difference across types of schools. Lecture Notes, Week 10 38/48
• Implementing difference-in-differences analysis. First, subtract average scores for all ninth-graders in reform schools from average scores for all twelfth-graders in the reform schools: Difference_treated = Ȳ_12(T_i = 1) − Ȳ_9(T_i = 1). (12) Then do the same for the regular schools: Difference_untreated = Ȳ_12(T_i = 0) − Ȳ_9(T_i = 0). (13) The difference-in-differences estimator is just the difference between these quantities: Diff-in-Diff = Difference_treated − Difference_untreated. (14) Lecture Notes, Week 10 39/48
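Equations (12)-(14) amount to simple arithmetic on four group means. A worked toy example with made-up average scores:

```python
# Hypothetical average test scores by school type and grade,
# purely for illustration of equations (12)-(14).
scores = {
    ("reform", 9): 60.0, ("reform", 12): 75.0,
    ("regular", 9): 62.0, ("regular", 12): 70.0,
}

diff_treated = scores[("reform", 12)] - scores[("reform", 9)]      # eq. (12): 15.0
diff_untreated = scores[("regular", 12)] - scores[("regular", 9)]  # eq. (13): 8.0
diff_in_diff = diff_treated - diff_untreated                       # eq. (14)
print(diff_in_diff)  # 7.0
```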
• Dummy-variable regression with interaction terms. How else may we implement this estimator of the average causal effect in practice? One option is dummy-variable regression with interaction terms. First, create two dummy variables: grade = 1 if student i is in 12th grade and 0 otherwise; treated = 1 if student i is in a reform school and 0 otherwise. Now, suppose that E(Y_i) = a + b1·grade_i + b2·treated_i + b3·grade_i·treated_i, (15) where the expectation is formed over the population of all students in the study group. Here, a, b1, b2, and b3 are parameters to be estimated from the data. Lecture Notes, Week 10 40/48
Dummy-variable regression with interaction terms

According to equation (15), for a 9th grader who is in a regular school (and thus for this student treated = 0 and grade = 0),

E(Y_{i,0,0}) = a.   (16)

(Here, Y_{i,0,0} denotes the outcome for a student i for whom treated = 0 and grade = 0.) Similarly, a 12th grader in a regular school has

E(Y_{i,0,1}) = a + b1,   (17)

while a 10th grader in a reform school has

E(Y_{i,1,0}) = a + b2,   (18)

and a 12th grader in a reform school has

E(Y_{i,1,1}) = a + b1 + b2 + b3.   (19)

Lecture Notes, Week 10 41/48
Dummy-variable regression with interaction terms

The quantity we want to estimate is

E(Y_{i,1,1} − Y_{i,1,0}) − E(Y_{i,0,1} − Y_{i,0,0}).   (20)

Distributing expectations, the difference-in-differences estimand in equation (20) is just

(a + b1 + b2 + b3 − a − b2) − (a + b1 − a) = b3.

So the parameter b3 is the difference-in-differences estimand. You can estimate b3 by OLS regression; the resulting OLS estimate of b3 will in fact be identical to that produced by the difference-of-means analysis above. This doesn't necessarily mean, however, that standard regression analysis is the best option: e.g., nominal standard errors may differ from those of a difference-of-means analysis, sometimes by a lot.

Lecture Notes, Week 10 42/48
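The equivalence claimed above can be checked numerically. The following is my own sketch, not code from the lecture; the outcomes are made-up numbers for a handful of hypothetical students. It computes the difference-in-differences of group means directly, then fits the saturated interaction model (15) by OLS and compares the interaction coefficient b3.

```python
# Check that the OLS interaction coefficient b3 equals the
# difference-in-differences of group means (illustrative data only).

def ols(X, y):
    """Solve the normal equations (X'X) b = X'y by Gaussian elimination."""
    k = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    b = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    # Forward elimination with partial pivoting.
    for col in range(k):
        piv = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, k):
            f = A[r][col] / A[col][col]
            for c in range(col, k):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    # Back substitution.
    coef = [0.0] * k
    for r in range(k - 1, -1, -1):
        coef[r] = (b[r] - sum(A[r][c] * coef[c] for c in range(r + 1, k))) / A[r][r]
    return coef

def mean(xs):
    return sum(xs) / len(xs)

# (grade, treated, outcome) triples for hypothetical students.
data = [(0, 0, 2.0), (0, 0, 4.0), (1, 0, 5.0), (1, 0, 7.0),
        (0, 1, 3.0), (0, 1, 5.0), (1, 1, 9.0), (1, 1, 11.0)]

# Difference-in-differences of the four cell means.
m = {(g, t): mean([y for gg, tt, y in data if (gg, tt) == (g, t)])
     for g in (0, 1) for t in (0, 1)}
did = (m[(1, 1)] - m[(0, 1)]) - (m[(1, 0)] - m[(0, 0)])

# OLS on the saturated model: intercept, grade, treated, grade*treated.
X = [[1.0, g, t, g * t] for g, t, _ in data]
y = [yy for _, _, yy in data]
a, b1, b2, b3 = ols(X, y)
print(did, b3)  # the two estimates coincide
```

Because the interaction model is saturated (four parameters, four cells), OLS simply reproduces the cell means, which is why the two computations must agree.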
Validating parallel trends: placebo tests

A key assumption in difference-in-differences analysis is "parallel trends." This assumption is not fully testable, but there may be ways to partially validate it. If the intervention occurs between time t and t + 1, does the difference at time t − 1 predict the difference at time t? This is an example of using a setting in which a non-effect is "known" to validate a design. Such checks are now sometimes called placebo tests in the design literature. (N.B. This is not quite what is meant by a placebo test in the medical literature!) However, these tests do not validate the claim that nothing relevant to the outcome changed across units, other than the intervention, between t and t + 1.

Lecture Notes, Week 10 43/48
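A toy version of such a placebo test can be sketched as follows (my own illustration, not from the lecture; the group-average outcomes per period are hypothetical). The same difference-in-differences formula is applied once to two pre-intervention periods, where the "effect" should be zero if pre-trends are parallel, and once across the intervention.

```python
# Placebo difference-in-differences: apply the DiD formula to two
# pre-intervention periods, where any nonzero "effect" signals a
# violation of parallel trends. Hypothetical group means by period.
control = {-1: 1.0, 0: 2.0, 1: 3.0}  # periods t-1 and t are pre-intervention
treated = {-1: 2.5, 0: 3.5, 1: 6.0}  # parallel pre-trend, then a jump at t+1

def did(group_a, group_b, t0, t1):
    """Difference-in-differences of group means between periods t0 and t1."""
    return (group_a[t1] - group_a[t0]) - (group_b[t1] - group_b[t0])

placebo = did(treated, control, -1, 0)  # should be ~0 under parallel trends
estimate = did(treated, control, 0, 1)  # the actual DiD estimate
print(placebo, estimate)
```

In this fabricated example the placebo comes out exactly zero; in real data one would examine whether the placebo estimate is small relative to its standard error.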
Unit fixed effects

A generalization of difference-in-differences analysis to the case of multiple time periods is "fixed-effects regression." This involves adding a dummy variable (a unit-specific intercept) for each unit. (If the units are countries, the intercepts are called "country fixed effects.") Suppose there are some attributes of the units that don't change over time, Z_i. Theory tells us that the Z_i's should be in the model (excluding them leads to omitted-variables bias). Suppose further that we cannot observe or measure all of the relevant Z_i's.

Lecture Notes, Week 10 44/48
Demeaning the data

Consider the data-generating process

Y_it = β D_it + γ Z_i + ε_it,   (21)

where D_it is an indicator for treatment and the Z_i's are potentially unobservable unit-specific attributes correlated with treatment assignment. One way to remove the Z_i's (observable and unobservable) is to transform the equation by subtracting unit-level averages from both sides:

Y_it − Ȳ_i = β(D_it − D̄_i) + γ(Z_i − Z̄_i) + (ε_it − ε̄_i) = β(D_it − D̄_i) + (ε_it − ε̄_i).

Note that the Z_i's (whether observed or unobserved) drop out of the equation because they do not vary over time, so Z_i − Z̄_i = 0.

Lecture Notes, Week 10 45/48
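To see concretely that demeaning eliminates any time-invariant Z_i, here is a small numerical sketch (my own, not from the slides; all values are made up, with β = 2 and γ = 3 playing the roles in equation (21) and no noise term).

```python
# Demeaning within units removes time-invariant attributes Z_i exactly:
# after the transformation, the outcome depends only on demeaned treatment.
beta, gamma = 2.0, 3.0
Z = {"A": 1.0, "B": 4.0}  # unit-specific, time-invariant attributes
D = {("A", 1): 0, ("A", 2): 1, ("B", 1): 0, ("B", 2): 0}
Y = {k: beta * D[k] + gamma * Z[k[0]] for k in D}  # equation (21), no noise

def demean(series, unit):
    """Subtract the unit-level average from each observation of one unit."""
    keys = [k for k in series if k[0] == unit]
    m = sum(series[k] for k in keys) / len(keys)
    return {k: series[k] - m for k in keys}

for unit in ("A", "B"):
    dY, dD = demean(Y, unit), demean(D, unit)
    for k in dY:
        # gamma * Z_i has dropped out: demeaned Y is exactly beta * demeaned D
        assert abs(dY[k] - beta * dD[k]) < 1e-12
print("Z_i eliminated by demeaning")
```

The assertion holds even though γZ_i differs sharply across units, because Z_i equals its own unit-level average.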
Unit fixed effects

In fact, estimating a model that is transformed in terms of deviations from unit-level averages is equivalent to estimating the following model:

Y_it = α_i + β D_it + ε_it,   (22)

where there is one dummy variable ("fixed effect") α_i for each unit i. Why are these two equations equivalent? (We leave this as a possible exercise for a problem set.) Fixed-effects models help to isolate the effects of within-unit variation, though coefficients are pooled across units.

Lecture Notes, Week 10 46/48
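The payoff of the within transformation can be illustrated numerically. This is a hedged sketch of my own (not the lecture's code, and the data are fabricated): with unit effects correlated with treatment, the pooled bivariate OLS slope is badly biased, while the within (demeaned) estimator recovers β = 2 exactly.

```python
# Pooled OLS vs. the within estimator when unit effects are confounders.
beta = 2.0
alpha = {"A": 0.0, "B": 10.0}  # unit fixed effects; B is high-outcome
D = {("A", 1): 0, ("A", 2): 0, ("B", 1): 0, ("B", 2): 1}  # only B is treated
Y = {k: alpha[k[0]] + beta * D[k] for k in D}  # equation (22), no noise

def slope(x, y):
    """Bivariate OLS slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    num = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    den = sum((xi - mx) ** 2 for xi in x)
    return num / den

keys = sorted(D)
pooled = slope([D[k] for k in keys], [Y[k] for k in keys])

# Within estimator: demean D and Y by unit, then take the bivariate slope.
dD, dY = {}, {}
for u in ("A", "B"):
    ks = [k for k in keys if k[0] == u]
    mD = sum(D[k] for k in ks) / len(ks)
    mY = sum(Y[k] for k in ks) / len(ks)
    for k in ks:
        dD[k], dY[k] = D[k] - mD, Y[k] - mY
within = slope([dD[k] for k in keys], [dY[k] for k in keys])
print(pooled, within)
```

Here the pooled slope picks up unit B's high baseline (α_B = 10) as if it were a treatment effect, while the within estimator, which uses only variation over time inside each unit, returns β exactly.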
Assumptions of fixed-effect analysis

The key idea is that time-invariant confounders may be absorbed by the fixed effect: an omitted variable that does not change over time does no harm. However, the key assumption is no time-varying omitted-variables bias: that is, there are no time-varying omitted variables that are correlated with changes in treatment and changes in outcome. Stated less technically: except for the change in policy, the groups should not otherwise have had different changes over time. Unit fixed effects can be useful when units are heterogeneous and have unequal probabilities of assignment to treatment: if omitted, such unit-specific effects are confounders.

Lecture Notes, Week 10 47/48
Fixed-effect analysis is no panacea

However, fixed effects are no panacea. Think about some of the items on Campbell's checklist of threats to internal validity: history, maturation, testing, instrumentation, and regression effects. These factors pose threats to validity within units: they are potentially time-varying confounders. As always, the most important assumption is the model itself. With fixed-effects models, why are the coefficients constant? Why are we pooling across units? Etc.

Lecture Notes, Week 10 48/48