Panel data (also known as longitudinal or cross-sectional time-series data) is a dataset in which
the behavior of entities is observed across time. These entities could be states, companies,
individuals, countries, etc. Panel data also goes by other names, such as pooled data and
micropanel data.
Panel data allows you to control for variables you cannot observe or measure, like cultural
factors or differences in business practices across companies, or for variables that change over time
but not across entities (e.g. national policies, federal regulations, international agreements,
etc.). That is, it accounts for individual heterogeneity.
BALANCED PANEL:
A panel is said to be balanced if each subject (firm, individual, etc.) has the same number of
observations.
UNBALANCED PANEL:
A panel is said to be unbalanced if entities have different numbers of observations.
In the panel data literature, two further terms are used:
SHORT PANEL: In a short panel the number of cross-sectional subjects, N, is greater than
the number of time periods, T.
LONG PANEL: In a long panel the number of cross-sectional subjects, N, is less than the
number of time periods, T.
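These definitions are easy to check mechanically: count the observations per entity and compare. A minimal Python sketch (the entity/year pairs are hypothetical, not from the examples below):

```python
from collections import Counter

def panel_shape(observations):
    """Classify a panel given stacked (entity, time) observation pairs.

    Returns (balanced, N, T_max) where `balanced` is True when every
    entity has the same number of time observations.
    """
    counts = Counter(entity for entity, _ in observations)
    n_entities = len(counts)           # N: number of cross-sectional units
    per_entity = set(counts.values())  # distinct observation counts per entity
    balanced = len(per_entity) == 1
    return balanced, n_entities, max(counts.values())

# A balanced panel: 3 firms, each observed in 2 years (N=3 > T=2, a short panel)
balanced_panel = [("A", 2020), ("A", 2021), ("B", 2020),
                  ("B", 2021), ("C", 2020), ("C", 2021)]
# An unbalanced panel: firm C is missing 2021
unbalanced_panel = balanced_panel[:-1]

print(panel_shape(balanced_panel))    # (True, 3, 2)
print(panel_shape(unbalanced_panel))  # (False, 3, 2)
```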
Labour economics, welfare economics and several other fields rely heavily on
household panel studies.
Panels are more informative than simple time series of aggregates, as they allow
tracking individual histories. A 10% unemployment rate is less informative than a
panel showing whether every individual is unemployed 10% of the time or the same 10%
of individuals are always unemployed. Panels are also more informative than
cross-sections, as they reflect dynamics and Granger causality across variables.
Panel data provides a means of resolving or reducing the magnitude of econometric
problems that often arise in empirical studies, namely the often-heard assertion that
the real reason one finds (or does not find) certain effects is the presence of omitted
(mismeasured or unobserved) variables that are correlated with the explanatory variables.
Panels allow us to study individual dynamics (e.g. separating age and cohort effects).
They give information on the time-ordering of events.
They allow us to control for individual unobserved heterogeneity.
y_it = a + b*x_it + e_it,   where i = 1, ..., N and t = 1, ..., T
Note that the double-subscripted notation indicates that we are dealing with a panel data set.
This is a pooled regression model, since we pool all the observations into a single OLS
regression. The model implicitly assumes that the coefficients (including the
intercepts) are the same for all individuals.
In order for the OLS estimates to be unbiased and consistent, the regressors must
satisfy the exogeneity assumption.
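Since pooled OLS simply stacks all (i, t) observations into one regression, with a single regressor it reduces to the textbook least-squares formulas. A pure-Python sketch on toy stacked data (not the working-example data below):

```python
def pooled_ols(xs, ys):
    """Pooled OLS with one regressor: stack all (i, t) observations
    and fit y = a + b*x by the usual least-squares formulas."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    b = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / \
        sum((x - xbar) ** 2 for x in xs)
    a = ybar - b * xbar
    return a, b

# Toy stacked observations lying exactly on y = 1 + 2x,
# so OLS recovers intercept 1 and slope 2.
a, b = pooled_ols([0, 1, 2, 3], [1, 3, 5, 7])
print(a, b)  # 1.0 2.0
```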
Suppose each individual i has time-invariant but unique effects on the dependent
variable. Since the pooled regression model neglects the heterogeneity across
individuals and assumes the same coefficients for all individuals, those effects unique to
each individual are all subsumed in the error term e_it.
If this is the case, the explanatory variables will no longer be uncorrelated with the
error terms, and the estimates from the pooled OLS regression will be biased and
inconsistent.
y_it = b*x_it + g_1*D_1i + g_2*D_2i + ... + g_N*D_Ni + e_it
Note that we need to drop the dummy variable for one individual (or drop the common
intercept) to avoid perfect multicollinearity.
This dummy technique is called least-squares dummy variables (LSDV) because it is
simply the OLS estimator with a full set of dummy variables.
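Constructing the LSDV dummy columns is mechanical. A sketch that one-hot encodes a stacked entity index, dropping the first entity's dummy to avoid the perfect-multicollinearity trap (the entity labels are hypothetical):

```python
def lsdv_dummies(entities, drop_first=True):
    """Build 0/1 dummy columns, one per entity, from a stacked entity index.

    Dropping one dummy (or the common intercept) avoids perfect
    multicollinearity: the full set of dummies sums to the constant column.
    """
    levels = sorted(set(entities))
    kept = levels[1:] if drop_first else levels
    # One row per observation, one column per kept entity
    D = [[1 if e == lvl else 0 for lvl in kept] for e in entities]
    return kept, D

entities = [1, 1, 2, 2, 3, 3]   # stacked firm index, T = 2 per firm
cols, D = lsdv_dummies(entities)
print(cols)  # [2, 3]
print(D)     # [[0, 0], [0, 0], [1, 0], [1, 0], [0, 1], [0, 1]]
```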
Note that consistent estimates from the LSDV model are obtained only when the error
terms are independent across both dimensions of the panel (across time and across
individuals).
In many cases the prime interest of researchers is not in obtaining the impact of the
unobserved variables (or heterogeneity). For this reason, the parameters of the dummy
variables for the fixed effects are called nuisance parameters.
We can test whether the fixed effects model gives different estimates than the
pooled OLS regression using an F-test.
The null hypothesis associated with this F-test is
H0: g_1 = g_2 = ... = g_N = 0
The caveat of using fixed effects is that if you introduce too many dummy variables
(that is, if N is too large), you will not have enough observations to do a meaningful
statistical analysis. For example, suppose we have N = 2,000 and T = 3; then the fixed
effect of each individual must be estimated from the variation in only 3 observations.
Each entity has its own individual characteristics that may or may not influence the
predictor variables (for example, the political system of a country could have some
effect on trade or GDP, or the business practices of a company may influence its stock
price).
When using FE we assume that something within the individual may impact or bias the
predictor or outcome variables, and we need to control for this. This is the rationale
behind the assumption of correlation between the entity's error term and the predictor
variables. FE removes the effect of those time-invariant characteristics from the
predictor variables so we can assess the predictors' net effect.
Another important assumption of the FE model is that those time-invariant
characteristics are unique to the individual and should not be correlated with other
individual characteristics.
Each entity is different, therefore the entity's error term and the constant (which
captures individual characteristics) should not be correlated with those of the others.
If the error terms are correlated, then FE is not suitable, since inferences may not be
correct, and you need to model that relationship (probably using random effects); this
is the main rationale for the Hausman test.
The equation for the fixed effects model becomes:
Y_it = b1*X_it + a_i + u_it
where a_i (i = 1, ..., n) is the unknown intercept for each entity, Y_it is the
dependent variable (i = entity, t = time), X_it is an independent variable, b1 is its
coefficient, and u_it is the error term.
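The entity-specific intercepts can also be eliminated without dummies by demeaning each entity's data (the "within" transformation). The toy numbers below are constructed so that the true within-entity slope is 3, while pooled OLS on the same stacked data, ignoring the entity intercepts, would find a slope of 0:

```python
def fe_within_slope(groups):
    """Fixed-effects slope via the within transformation: demean x and y
    inside each entity, then run OLS through the origin on the demeaned data.
    The entity intercepts drop out of the demeaned equation."""
    num = den = 0.0
    for xs, ys in groups:
        xbar = sum(xs) / len(xs)
        ybar = sum(ys) / len(ys)
        for x, y in zip(xs, ys):
            num += (x - xbar) * (y - ybar)
            den += (x - xbar) ** 2
    return num / den

# Two entities generated from y = 3*x + a_i, with a_1 = 10 and a_2 = 0
groups = [([1, 2], [13, 16]), ([4, 5], [12, 15])]
print(fe_within_slope(groups))  # 3.0
# Pooled OLS on the stacked data ([1,2,4,5], [13,16,12,15]) gives slope 0:
# the cross-entity variation exactly cancels the within-entity variation.
```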
WORKING EXAMPLE 1:
Greene (1997) provides a small panel data set with information on costs and output of 6
different firms, in 4 different periods of time (1955, 1960, 1965, and 1970). Your job is to
estimate a cost function using basic panel data techniques.
PANEL DATA:
Year  Firm  Cost     Output  D1  D2  D3  D4  D5  D6
1955  1     3.154    214     1   0   0   0   0   0
1960  1     4.271    419     1   0   0   0   0   0
1965  1     4.584    588     1   0   0   0   0   0
1970  1     5.849    1025    1   0   0   0   0   0
1955  2     3.859    696     0   1   0   0   0   0
1960  2     5.535    811     0   1   0   0   0   0
1965  2     8.127    1640    0   1   0   0   0   0
1970  2     10.966   2506    0   1   0   0   0   0
1955  3     19.035   3202    0   0   1   0   0   0
1960  3     26.041   4802    0   0   1   0   0   0
1965  3     32.444   5821    0   0   1   0   0   0
1970  3     41.18    9275    0   0   1   0   0   0
1955  4     35.229   5668    0   0   0   1   0   0
1960  4     51.111   7612    0   0   0   1   0   0
1965  4     61.045   10206   0   0   0   1   0   0
1970  4     77.885   13702   0   0   0   1   0   0
1955  5     33.154   6000    0   0   0   0   1   0
1960  5     40.044   8222    0   0   0   0   1   0
1965  5     43.125   8484    0   0   0   0   1   0
1970  5     57.727   10004   0   0   0   0   1   0
1955  6     73.05    11796   0   0   0   0   0   1
1960  6     98.846   15551   0   0   0   0   0   1
1965  6     138.88   27218   0   0   0   0   0   1
1970  6     191.56   30958   0   0   0   0   0   1
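The table above can be entered directly. A sketch that reconstructs the 24 observations from the per-firm cost and output series and confirms the panel is balanced (6 firms x 4 years):

```python
years = [1955, 1960, 1965, 1970]
cost = {1: [3.154, 4.271, 4.584, 5.849],
        2: [3.859, 5.535, 8.127, 10.966],
        3: [19.035, 26.041, 32.444, 41.18],
        4: [35.229, 51.111, 61.045, 77.885],
        5: [33.154, 40.044, 43.125, 57.727],
        6: [73.05, 98.846, 138.88, 191.56]}
output = {1: [214, 419, 588, 1025],
          2: [696, 811, 1640, 2506],
          3: [3202, 4802, 5821, 9275],
          4: [5668, 7612, 10206, 13702],
          5: [6000, 8222, 8484, 10004],
          6: [11796, 15551, 27218, 30958]}

# Stack into long format: one row per (firm, year); the firm dummies
# D1..D6 are implicit in the firm index.
rows = [{"firm": f, "year": y, "cost": c, "output": o}
        for f in cost
        for y, c, o in zip(years, cost[f], output[f])]

print(len(rows))  # 24 observations: a balanced short panel (N=6 > T=4)
```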
POOLED OLS:
The most basic estimator for panel data sets is pooled OLS (POLS). Johnston & DiNardo (1997)
recall that the POLS estimator ignores the panel structure of the data, treating observations as
serially uncorrelated for a given individual, with homoscedastic errors across individuals and time
periods:
(2) b_POLS = (X'X)^(-1) X'y
gen lnc=log(cost)
gen lny=log(output)
regress lnc lny
Number of obs = 24;  F(1, 22) = 728.51;  Prob > F = 0.0000
R-squared = 0.9707;  Root MSE = .21482;  Model SS = 33.617333 (df = 1)
lny 95% conf. interval: [.8197573, .9562164]
scalar R2OLS=_result(7)
FIXED EFFECTS (WITHIN-GROUPS) ESTIMATOR:
(3) b_W = [X'QX]^(-1) X'Qy
xtreg lnc lny, fe
Number of obs = 24;  Number of groups = 6;  obs per group: avg = 4.0, max = 4
R-sq: between = 0.9833, overall = 0.9707
F(1,17) = 121.66;  Prob > F = 0.0000
lny 95% conf. interval: [.5453044, .8032534]
F test that all u_i=0: F(5, 17) = 9.67
matrix bW=get(_b)
matrix VW=get(VCE)
BETWEEN-GROUPS ESTIMATORS:
(4) b_B = [X'PX]^(-1) X'Py
xtreg lnc lny, be
Between regression (regression on group means)
Number of obs = 24;  Number of groups = 6;  obs per group: avg = 4.0, max = 4
R-sq: between = 0.9833, overall = 0.9707
F(1,4) = 236.23;  Prob > F = 0.0001;  sd(u_i + avg(e_i.)) = .1838474
lny 95% conf. interval: [.7464935, 1.075653]
. matrix bB=get(_b)
. matrix VB=get(VCE)
RANDOM EFFECTS:
(5) b_GLS = [X'Omega^(-1)X]^(-1) X'Omega^(-1)y
where Omega = sigma_u^2 * I_NT + T * sigma_a^2 * P
xtreg lnc lny, re
Random-effects GLS regression
Number of obs = 24;  Number of groups = 6;  obs per group: avg = 4.0, max = 4
R-sq: between = 0.9833, overall = 0.9707
Random effects u_i ~ Gaussian;  corr(u_i, X) = 0 (assumed)
Wald chi2(1) = 268.10;  Prob > chi2 = 0.0000
lny:   z = 16.37, P>|z| = 0.000, 95% conf. interval [.7010002, .8916404]
_cons: z = -8.26, P>|z| = 0.000, 95% conf. interval [-4.222788, -2.6034]
sigma_u = .17296414;  sigma_e = .12463167;  rho = .65823599 (fraction of variance due to u_i)
HAUSMAN TEST:
- Under the null hypothesis: orthogonality, i.e., no correlation between the individual effects and
the explanatory variables. Both the random effects and fixed effects estimators are consistent, but
the random effects estimator is efficient, while fixed effects is not.
- Under the alternative hypothesis: the individual effects are correlated with the X's. In this case,
the random effects estimator is inconsistent, while the fixed effects estimator remains consistent.
Greene (1997) recalls that, under the null, the estimates should not differ systematically. Thus, the
test is based on the contrast vector H:
(6) H = (b_W - b_GLS)' [Var(b_W) - Var(b_GLS)]^(-1) (b_W - b_GLS) ~ chi2(k)
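With a single regressor the contrast collapses to a scalar. A sketch with hypothetical estimates and variances (the point estimates in the output above were lost in extraction, so these numbers are illustrative only):

```python
def hausman_scalar(b_fe, var_fe, b_re, var_re):
    """Hausman statistic for a single coefficient:
    H = (b_FE - b_RE)^2 / (Var(b_FE) - Var(b_RE)),
    distributed chi2(1) under the null of no correlation between
    the individual effects and the regressor."""
    return (b_fe - b_re) ** 2 / (var_fe - var_re)

# Hypothetical numbers: FE is less efficient (larger variance) than RE
H = hausman_scalar(b_fe=0.50, var_fe=0.004, b_re=0.60, var_re=0.002)
print(round(H, 6))  # 5.0
# Compare H with the chi2(1) 5% critical value, 3.84: here H > 3.84,
# so the null would be rejected and fixed effects preferred.
```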
Fixed-effects (within) regression
Number of obs = 24;  Number of groups = 6;  obs per group: avg = 4.0, max = 4
R-sq: between = 0.9833, overall = 0.9707
F(1,17) = 121.66;  Prob > F = 0.0000
lny:   t = 11.03, P>|t| = 0.000, 95% conf. interval [.5453044, .8032534]
_cons: t = -4.72, P>|t| = 0.000, 95% conf. interval [-3.472046, -1.325972]
sigma_u = .36730483;  sigma_e = .12463167;  rho = .89675322 (fraction of variance due to u_i)
F test that all u_i=0: F(5, 17) = 9.67

Random-effects GLS regression
Number of obs = 24;  Number of groups = 6;  obs per group: avg = 4.0, max = 4
R-sq: between = 0.9833, overall = 0.9707
corr(u_i, X) = 0 (assumed);  Wald chi2(1) = 268.10;  Prob > chi2 = 0.0000
lny:   z = 16.37, P>|z| = 0.000, 95% conf. interval [.7010002, .8916404]
_cons: z = -8.26, P>|z| = 0.000, 95% conf. interval [-4.222788, -2.6034]
sigma_u = .17296414;  sigma_e = .12463167;  rho = .65823599 (fraction of variance due to u_i)
So, based on the test above, we can see that the test statistic (10.86) is greater than the critical
value of a chi-squared with 1 df at the 5% level (3.84). Therefore, we reject the null hypothesis,
and the preferred model is the fixed effects model.
Fixed effects (LSDV) regression of lnc on lny and the six firm dummies (no common intercept):
Source   | SS          df  MS
Model    | 280.714267   7  40.1020382
Residual | .264061918  17  .015533054
Total    | 280.978329  24  11.7074304
Number of obs = 24;  F(7, 17) = 2581.72;  Prob > F = 0.0000;  R-squared = 0.9991;  Root MSE = .12463
lny: t = 11.03, P>|t| = 0.000, 95% conf. interval [.5453044, .8032534]
d1 | -2.693527  .3827874  t = -7.04  P = 0.000  [-3.501137, -1.885916]
d2 | -2.911731  .4395755  t = -6.62  P = 0.000  [-3.839154, -1.984308]
d3 | -2.439957  .5286852  t = -4.62  P = 0.000  [-3.555386, -1.324529]
d4 | -2.134488  .5587981  t = -3.82  P = 0.001  [-3.313449, -.955527]
d5 | -2.310839  .55325    t = -4.18  P = 0.001  [-3.478094, -1.143583]
d6 | -1.903512  .6080806  t = -3.13  P = 0.006  [-3.18645, -.6205737]
The slope is obviously the same. The only change is the substitution of the common intercept by 6
dummies, each of them representing a cross-sectional unit. Now suppose we would like to know
whether the difference in the firms' effects is statistically significant.
- Run the fixed effects regression above, including both the intercept and the dummies:
regress lnc lny d1 d2 d3 d4 d5 d6
note: d1 omitted because of collinearity
Source   | SS          df  MS
Model    | 34.368475    6  5.72807917
Residual | .264061918  17  .015533054
Total    | 34.6325369  23  1.50576248
Number of obs = 24;  F(6, 17) = 368.77;  Prob > F = 0.0000;  R-squared = 0.9924;  Root MSE = .12463
lny: t = 11.03, P>|t| = 0.000, 95% conf. interval [.5453044, .8032534]
d1 | (omitted)
d2 | -.2182041  .1052027  t = -2.07  P = 0.054  [-.4401624, .0037542]
d3 | .2535693   .1716665  t = 1.48   P = 0.158  [-.1086153, .6157539]
d4 | .5590387   .1982915  t = 2.82   P = 0.012  [.1406801, .9773973]
d5 | .3826881   .1933058  t = 1.98   P = 0.064  [-.0251516, .7905277]
d6 | .7900151   .2436915  t = 3.24   P = 0.005  [.275871, 1.304159]
_cons | t = -7.04  P = 0.000  [-3.501137, -1.885916]
Note that one of the dummies is dropped (due to perfect collinearity with the constant), and all
other dummies are represented as the difference between their original value and the constant. (The
value of the constant in this second regression equals the value of the dropped dummy in the
previous regression; the dropped dummy serves as the benchmark.)
Obtain the R-squared from restricted (POLS) and unrestricted (fixed effects with dummies) models
. scalar R2LSDV=_result(7)
. scalar list
R2LSDV = .99237532
R2OLS = .97068641
Perform the traditional F-test, comparing the unrestricted regression with the restricted regression:
(7) F = [(R2_u - R2_p)/(N - 1)] / [(1 - R2_u)/(NT - N - k)]
where the subscript "u" refers to the unrestricted regression (fixed effects with dummies), and the
subscript "p" to the restricted regression (POLS). Under the null hypothesis, the restrictions are
valid and POLS is the appropriate model.
. scalar
F=((R2LSDV-R2OLS)/(6-1))/((1-R2LSDV)/(24-6-1))
. scalar list F
F = 9.6715307
The result above can be compared with the critical value of F(5,17), which equals 4.34 at 1% level.
Therefore, we reject the null hypothesis of common intercept for all firms.
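The scalar computation above can be reproduced directly from the two stored R-squared values:

```python
def fe_vs_pols_f(r2_u, r2_p, n, t, k):
    """F-test of the N-1 intercept restrictions: pooled OLS (restricted)
    versus LSDV (unrestricted); F ~ F(N-1, NT-N-k) under the null."""
    return ((r2_u - r2_p) / (n - 1)) / ((1 - r2_u) / (n * t - n - k))

# R-squared values from the scalar list above; N = 6 firms, T = 4 years, k = 1
F = fe_vs_pols_f(r2_u=0.99237532, r2_p=0.97068641, n=6, t=4, k=1)
print(round(F, 4))  # 9.6715 -- matches the scalar F reported above
```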
WORKING EXAMPLE 2:
The following panel data set contains investment data for four companies over 20
years, 1935-1954, where:
Grossinv = gross investment = y
Valuefirm = value of the firm = x2
capstock = capital (stock of plant and equipment) = x3
First we regress gross investment on the value of the firm and the capital stock.
y      x2      x3
33.1   1170.6  97.8
45     2015.8  104.4
77.2   2803.3  118
.      .       .
.      .       .
71.78  864.1   145.5
90.08  1193.5  174.8
68.6   1188.9  213.5
RESULTS:
Dependent Variable: Y
Method: Least Squares
Sample: 1 80
Included observations: 80

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -63.30414     29.61420     -2.137628     0.0357
X2         0.110096      0.013730     8.018809      0.0000
X3         0.303393      0.049296     6.154553      0.0000

R-squared 0.756528;  Adjusted R-squared 0.750204;  S.E. of regression 142.3682
Sum squared resid 1560690.;  Log likelihood -508.6596
F-statistic 119.6292;  Prob(F-statistic) 0.000000
Mean dependent var 290.9154;  S.D. dependent var 284.8528
Akaike info criterion 12.79149;  Schwarz criterion 12.88081;  Hannan-Quinn criter. 12.82730
Durbin-Watson stat 0.309795
Heteroskedasticity Test:
F-statistic 4.146525;  Prob. F(2,77) 0.0195
Obs*R-squared 7.778406;  Prob. Chi-Square(2) 0.0205
Scaled explained SS 6.247011;  Prob. Chi-Square(2) 0.0440

Test equation (squared residuals on the regressors):
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          15418.30      5174.959     2.979406      0.0039
X2         -1.935848     2.399202     -0.806872     0.4222
X3         23.44746      8.614225     2.721947      0.0080

R-squared 0.097230;  Adjusted R-squared 0.073782;  S.E. of regression 24878.25
Sum squared resid 4.77E+10;  Log likelihood -921.7262
F-statistic 4.146525;  Prob(F-statistic) 0.019486
Mean dependent var 19508.62;  S.D. dependent var 25850.15
Akaike info criterion 23.11815;  Schwarz criterion 23.20748;  Hannan-Quinn criter. 23.15397
Durbin-Watson stat 0.729271
Heteroskedasticity Test (squared regressors):
F-statistic 2.719894;  Prob. F(2,77) 0.0722
Obs*R-squared 5.278799;  Prob. Chi-Square(2) 0.0714
Scaled explained SS 4.239521;  Prob. Chi-Square(2) 0.1201

Test equation (squared residuals on the squared regressors):
Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          20262.24      3695.723     5.482620      0.0000
X2^2       -0.000584     0.000423     -1.379930     0.1716
X3^2       0.011659      0.005000     2.331917      0.0223

R-squared 0.065985;  Adjusted R-squared 0.041725;  S.E. of regression 25305.11
Sum squared resid 4.93E+10;  Log likelihood -923.0872
F-statistic 2.719894;  Prob(F-statistic) 0.072214
Mean dependent var 19508.62;  S.D. dependent var 25850.15
Akaike info criterion 23.15218;  Schwarz criterion 23.24151;  Hannan-Quinn criter. 23.18799
Durbin-Watson stat 0.745009
CONCLUSION:
Hence we can conclude that the results above show a large impact of the stock of plant and
equipment and of the value of the firm on gross investment. Both regressors have positive
estimated effects, and the fitted equation can be written as:
GROSSINV = 0.110096*(VALUE OF FIRM) + 0.303393*(STOCK OF PLANT & EQUIPMENT)
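Using the least-squares coefficients from the RESULTS table (including the intercept, which the equation above omits), a fitted value can be computed for any row of the data. The example row is the first observation from the table above:

```python
def predict_grossinv(valuefirm, capstock):
    """Fitted gross investment from the pooled OLS estimates:
    y-hat = -63.30414 + 0.110096*x2 + 0.303393*x3"""
    return -63.30414 + 0.110096 * valuefirm + 0.303393 * capstock

# First row of the working-example data: x2 = 1170.6, x3 = 97.8
print(round(predict_grossinv(1170.6, 97.8), 3))  # 95.246
```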