Panel data

Panel Data Regression Models
Dr. T.Sampathkumar
Assistant Professor in Economics
Govt.Arts College (Autonomous)
Coimbatore.641 018.
tsampath_136@yahoo.com

Time series data
(data collected on one individual/unit over several time periods)
or
Cross-sectional data
(data collected on several individuals/units at one point in time)

Time Series Data
Sl.No Year Y X1 X2
1 1991 2985 125 35
2 1992 4517 214 50
3 1993 4925 284 54
. . . . .
. . . . .
19 2009 3105 568 235
20 2010 4258 715 368

Cross Section
Observation Year Y X1 X2
1 1991 2985 125 35
2 1991 4517 214 50
. . . . .
. . . . .
. . . . .
49 1991 3105 568 235
50 1991 4258 715 368

Independently Pooled Data
Observation  Year  Y     X1    X2
1            1991  125   2985  35
2            1991  214   4517  50
.            1991  400   2358  114
.            .     .     .     .
.            .     .     .     .
49           1991  568   3105  235
50           1991  715   4258  368
1            2001  891   4925  1431
2            2001  1304  6242  1777
.            .     .     .     .
.            .     .     .     .
49           2001  548   4286  1258
50           2001  478   1214  3217

Panel Data
A longitudinal, or panel, data set is one that follows a given sample
of individuals over time, and thus provides multiple observations on
each individual in the sample.
Panel data are repeated observations on the same cross-section units
over time, so they have both a space and a time dimension.

A micro-panel (short panel) data set is a panel in which the time dimension T
is much smaller than the individual dimension N:
T << N
A macro-panel (long panel) data set is a panel in which the time dimension T
is similar to the individual dimension N:
T ≈ N
A panel is said to be balanced if we have the same time periods,
t = 1, ..., T,
for each cross-section unit.
For an unbalanced panel, the time dimension, denoted T_i, is specific to each
individual.

Balanced Panel
Observation  Year  Y     X1   X2
1            1991  2985  125  35
2            1991  4517  214  50
.            .     .     .    .
49           1991  5148  128  68
50           1991  3625  458  24
1            2001  3105  568  235
2            2001  4258  715  368
.            .     .     .    .
49           2001  2598  587  125
50           2001  3547  127  196

Unbalanced Panel
Observation  Year  Y     X1   X2
1            1991  2985  125  35
2            1991  4517  214  50
.            .     .     .    .
49           1991  5148  128  68
50           1991  3625  458  24
1            2001  3105  568  235
2            2001  4258  715  368
.            .     .     .    .
34           2001  2598  587  125
35           2001  3547  127  196
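The balanced/unbalanced distinction can be checked mechanically. The sketch below is only an illustration: the function name `is_balanced` and the two toy panels are assumptions made here, shaped like the tables above (50 units in 1991, then either all 50 or only the first 35 again in 2001).

```python
from collections import defaultdict


def is_balanced(observations):
    """Return True if every unit is observed in the same set of time periods.

    `observations` is an iterable of (unit_id, period) pairs.
    """
    periods_by_unit = defaultdict(set)
    for unit, period in observations:
        periods_by_unit[unit].add(period)
    # Union of all periods seen anywhere in the panel.
    all_periods = set().union(*periods_by_unit.values())
    return all(periods == all_periods for periods in periods_by_unit.values())


# Hypothetical panels (identifiers only, no Y or X values).
balanced = [(i, 1991) for i in range(1, 51)] + [(i, 2001) for i in range(1, 51)]
unbalanced = [(i, 1991) for i in range(1, 51)] + [(i, 2001) for i in range(1, 36)]

print(is_balanced(balanced))    # True
print(is_balanced(unbalanced))  # False
```

In the unbalanced case, units 36 to 50 have T_i = 1 while units 1 to 35 have T_i = 2.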

Terminology and notation:
Individual or cross-section unit: country, region, state, firm, consumer,
individual, etc.
Double index: i (for cross-section unit) and t (for time)
$Y_{it}$, $X_{it}$ for i = 1, ..., N and t = 1, ..., T

Potential gains
• Panel data usually give a large number of data points (N * T),
increasing the degrees of freedom and reducing the collinearity
among explanatory variables
• Improves the efficiency of econometric estimates
• Especially suitable to study dynamics of change
• Panel data involve two dimensions: a cross-sectional dimension N
and a time-series dimension T
• Minimize bias due to aggregation

Fixed Effects Model
$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + a_i + u_{it}$

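Demeaning Method (within estimation)
Start from the fixed effects model:
$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + a_i + u_{it}$
Now, for each i, average this equation over time, i.e., compute
$\bar{Y}_i = \frac{1}{T} \sum_{t=1}^{T} Y_{it}$
and
$\bar{X}_{ki} = \frac{1}{T} \sum_{t=1}^{T} X_{kit}$ for k = 1, 2,
which gives
$\bar{Y}_i = \beta_0 + \beta_1 \bar{X}_{1i} + \beta_2 \bar{X}_{2i} + a_i + \bar{u}_i$
Take the deviation of Y and X from their mean values, i.e.,
$\ddot{Y}_{it} = Y_{it} - \bar{Y}_i$ and $\ddot{X}_{kit} = X_{kit} - \bar{X}_{ki}$
Subtracting the averaged equation from the original one removes the time-invariant terms $\beta_0$ and $a_i$:
$\ddot{Y}_{it} = \beta_1 \ddot{X}_{1it} + \beta_2 \ddot{X}_{2it} + \ddot{u}_{it}$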
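To see why demeaning removes the individual effect, here is a minimal sketch with made-up numbers. Everything in it is an assumption for illustration: three units, four periods, a single regressor instead of two, and no error term, so the within estimate should recover the slope up to floating-point error.

```python
def demean(values):
    """Subtract the time average of one unit's series from each observation."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


# Hypothetical data: Y_it = a_i + beta1 * X_it, with very different a_i per unit.
beta1 = 2.0
a = {1: 10.0, 2: -5.0, 3: 40.0}
x = {
    1: [1.0, 2.0, 3.0, 4.0],
    2: [2.0, 2.5, 4.0, 7.0],
    3: [0.0, 1.0, 1.5, 6.0],
}
y = {i: [a[i] + beta1 * x_it for x_it in x[i]] for i in x}

# Within transformation: demean X and Y unit by unit, then stack the units.
x_dd, y_dd = [], []
for i in x:
    x_dd += demean(x[i])
    y_dd += demean(y[i])

# OLS slope on the demeaned data (no intercept is needed after demeaning).
beta1_within = sum(xd * yd for xd, yd in zip(x_dd, y_dd)) / sum(xd * xd for xd in x_dd)
print(beta1_within)
```

The individual effects a_i never enter the final regression, which is the point of the within transformation.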

Firmid Year Yearid Y Ÿ
1 2001 1 5.612 -2.3716
1 2002 2 6.881 -1.1026
1 2003 3 5.689 -2.2946
1 2004 4 5.292 -2.6916
1 2005 5 5.551 -2.4326
1 2006 6 6.429 -1.5546
1 2007 7 7.559 -0.4246
1 2008 8 8.912 0.9284
1 2009 9 13.044 5.0604
1 2010 10 14.867 6.8834
Mean of Y ($\bar{Y}_i$): 7.9836
$\ddot{Y}_{it} = Y_{it} - \bar{Y}_i$
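The demeaned column in the table can be reproduced in a few lines. This sketch uses the ten Y values of firm 1 exactly as listed above; the variable names are chosen here for illustration.

```python
# Y values for firm 1, 2001 to 2010, copied from the table above.
y = [5.612, 6.881, 5.689, 5.292, 5.551, 6.429, 7.559, 8.912, 13.044, 14.867]

T = len(y)                 # number of time periods for this firm
y_bar = sum(y) / T         # time average of Y for firm 1
y_demeaned = [y_t - y_bar for y_t in y]

print("Mean:", round(y_bar, 4))
for year, y_t, dev in zip(range(2001, 2011), y, y_demeaned):
    print(year, y_t, round(dev, 4))
```

Note that the average is taken over the T = 10 years of one firm, not over firms, and that the demeaned values sum to zero.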

To capture individual or time effects, there are several possibilities:
1. Intercept and slopes are constant, and the error term captures differences across space and time
2. Intercept varies across cross-section units (constant slopes)
3. Intercept varies across time periods (constant slopes)
4. Both intercept and slopes vary

$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + u_{it}$
Both intercept and slopes are constant

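Least Squares Dummy Variable (LSDV) Method
$Y_{it} = \alpha_1 + \alpha_2 d_{2i} + \alpha_3 d_{3i} + \alpha_4 d_{4i} + \beta_1 X_{1it} + \beta_2 X_{2it} + u_{it}$
The d's are dummy variables. If there are four cross sections, three dummies are needed:
$d_{2i} = 1$ for cross section 2 and 0 otherwise, and likewise for $d_{3i}$ and $d_{4i}$.
Cross section 1 is the base category, with intercept $\alpha_1$.
The intercept varies across cross sections, but the slopes are constant.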
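As a rough sketch of how the dummy columns for this regression could be built, assuming four cross sections observed in two periods each (the helper `unit_dummies` and the toy identifiers are hypothetical):

```python
def unit_dummies(unit_ids, base):
    """Build one 0/1 dummy column per unit, leaving out the base unit."""
    units = sorted(set(unit_ids))
    return {
        f"d{u}": [1 if uid == u else 0 for uid in unit_ids]
        for u in units
        if u != base
    }


# Hypothetical stacked panel: unit identifier of each row.
unit_ids = [1, 1, 2, 2, 3, 3, 4, 4]
dummies = unit_dummies(unit_ids, base=1)

for name, column in dummies.items():
    print(name, column)
```

Only three dummies are created for four units. Including a fourth dummy together with the common intercept would make the regressors perfectly collinear (the dummy variable trap).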

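$Y_{it} = \lambda_0 + \lambda_1 t_{1t} + \dots + \lambda_9 t_{9t} + \beta_1 X_{1it} + \beta_2 X_{2it} + u_{it}$
The t's are time dummy variables. If there are 10 time periods, nine dummies are needed:
$t_{1t} = 1$ for time period 1 and 0 otherwise, and so on; the tenth period is the base.
The intercept varies across time periods (time dummies), but the slopes are
constant.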

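Both intercept and slopes change across individuals
$Y_{it} = \alpha_1 + \alpha_2 d_{2i} + \alpha_3 d_{3i} + \alpha_4 d_{4i} + \beta_1 X_{1it} + \beta_2 X_{2it}$
$\quad + \gamma_1 (d_{2i} X_{1it}) + \gamma_2 (d_{2i} X_{2it}) + \gamma_3 (d_{3i} X_{1it}) + \gamma_4 (d_{3i} X_{2it}) + \gamma_5 (d_{4i} X_{1it}) + \gamma_6 (d_{4i} X_{2it}) + u_{it}$
There are four cross sections and two explanatory variables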

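Random Effects Model (Error Components Model, ECM)
$Y_{it} = \beta_0 + \beta_1 X_{1it} + \beta_2 X_{2it} + \omega_{it}$
$\omega_{it} = \varepsilon_i + u_{it}$
$\varepsilon_i$ = individual-specific error term
$u_{it}$ = combined time-series and cross-section error term
$\varepsilon_i \sim N(0, \sigma_\varepsilon^2)$
$u_{it} \sim N(0, \sigma_u^2)$
$E(\varepsilon_i u_{it}) = 0$
$E(\varepsilon_i \varepsilon_j) = 0$ for $i \neq j$
$E(u_{it} u_{is}) = 0$ for $t \neq s$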
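These assumptions pin down the variance structure of the composite error. The short derivation below uses only the assumptions listed for the error components; the symbol $\rho$ for the within-unit correlation is introduced here and is not from the slides.

```latex
\begin{align*}
\operatorname{Var}(\omega_{it})
  &= \operatorname{Var}(\varepsilon_i) + \operatorname{Var}(u_{it})
   = \sigma_\varepsilon^2 + \sigma_u^2, \\
\operatorname{Cov}(\omega_{it}, \omega_{is})
  &= E\big[(\varepsilon_i + u_{it})(\varepsilon_i + u_{is})\big]
   = \sigma_\varepsilon^2 \qquad (t \neq s), \\
\rho = \operatorname{Corr}(\omega_{it}, \omega_{is})
  &= \frac{\sigma_\varepsilon^2}{\sigma_\varepsilon^2 + \sigma_u^2} \qquad (t \neq s).
\end{align*}
```

So the errors of the same unit in two different periods are correlated whenever $\sigma_\varepsilon^2 > 0$, which is why the random effects model is usually estimated by generalized least squares rather than by ordinary least squares.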

Choice between Random and Fixed Effects Model
The choice depends on whether the individuals can be viewed as a random sample from
a large population, so that the individual effect is uncorrelated with the regressors:
$E(\varepsilon_i X_{it}) = 0$
If yes: random effects; if no: fixed effects.
It also depends on the relation between T and N:
for large T and small N, there is not a big difference between the two estimators;
for small T and large N, random effects estimators are more efficient
than fixed effects estimators.

Hausman Test
To decide between fixed and random effects you can run a Hausman
test, where the null hypothesis is that the preferred model is random
effects and the alternative is fixed effects.
It basically tests whether the individual-specific errors ($\varepsilon_i$) are correlated with the
regressors; the null hypothesis is that they are not.
Decision: if the p-value of the test statistic (chi-square) is less than 0.05, reject the null and use the fixed effects model;
otherwise use random effects, which is more efficient.
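The decision rule can be sketched for the simplest case of a single slope coefficient, where the Hausman statistic is the squared difference between the fixed and random effects estimates divided by the difference of their variances, and is compared with a chi-square distribution with one degree of freedom. The function name and all numbers below are hypothetical, not results from any data set.

```python
import math


def hausman_one_coef(b_fe, b_re, var_fe, var_re):
    """Hausman statistic and p-value for a single coefficient (1 degree of freedom)."""
    var_diff = var_fe - var_re
    if var_diff <= 0:
        raise ValueError("var_fe must exceed var_re for the statistic to be defined")
    h = (b_fe - b_re) ** 2 / var_diff
    # Survival function of chi-square with 1 df: P(Z^2 > h) = erfc(sqrt(h / 2)).
    p_value = math.erfc(math.sqrt(h / 2.0))
    return h, p_value


# Hypothetical estimates and variances from the two models.
h, p = hausman_one_coef(b_fe=0.80, b_re=0.50, var_fe=0.020, var_re=0.010)
print("H =", round(h, 3), "p-value =", round(p, 4))
print("Use fixed effects" if p < 0.05 else "Use random effects")
```

With several regressors the statistic becomes a quadratic form in the vector of coefficient differences, and the degrees of freedom equal the number of coefficients compared.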
