Advanced Econometrics
Panel data econometrics and GMM estimation

Alban Thomas
MF 102, thomas@toulouse.inra.fr

Methods:
- Fixed Effects Least Squares
- Generalized Least Squares
- Instrumental Variables
- Maximum Likelihood estimation for Panel Data models
Contents

Part I. Panel Data Models

1  Introduction
   1.1.2  Examples
   1.2  Analysis of variance
   1.3  Some definitions

2  The linear model
   2.1  Notation
      2.1.1  Model notation
   2.2.3  Comments
   2.2.4  Poolability

3  Extensions
   3.1  The two-way panel data model
   3.2.2  Typical heteroskedasticity
   3.3.1  Introduction

4  Augmented panel data models
   4.1  Introduction
   4.4  GLS estimator
      4.4.2  IV in a panel-data context
   4.6  Model specification

5  Dynamic panel data models
   5.1  Motivation
   5.2.2  Instrumental-variable estimation
   5.3.2  An equivalent representation

Part II. Generalized Method of Moments estimation

6  The GMM estimator
   6.1.1  Moment conditions
   6.1.6  Comments
   6.2.1  Introduction
   6.2.3  A definition
   6.3.1  Consistency
   6.3.2  Asymptotic normality

7
   7.1.2  GMM estimation
   7.2.1  A simple estimator
   7.3.2  IV estimation

8
   8.1  Introduction
   8.2.1  Model assumptions
   8.3.1  Additional assumptions
   8.5.2  Mixed structure

Part III

9
   9.1.2  Logit model
   9.1.3  Probit model
   9.2.1  Sufficient statistics
   9.2.2  Conditional probabilities
   9.2.3  Example: T = 2
   9.3  Probit models
   9.4.2  The IV estimator

Appendix 8. A crash course in Gauss
Appendix 9. Example: The Gauss software

References
Part I
Panel Data Models
Chapter 1
Introduction
Panel data: sequential observations on a number of units (individuals, firms, ...); also called longitudinal or pooled data.

A general model is

F(Y, X, Z; θ) = 0,

where Y is the dependent variable, X and Z are vectors of explanatory variables, and θ is the vector of parameters.

Linear model:

Y = β_0 + β_x X + β_z Z + u.

1.1.2 Examples
Firms react to higher wages imposed by unions by hiring higher-quality workers.
1.1.3

- Time-series: observations on a single unit are serially related;
- Cross-sections: no information on adjustment dynamics; estimates may reflect inter-individual differences inherent in comparisons across units;
- With panel data, variations across individuals and across time periods are both accounted for.

1.1.4
Q̃_it = a_0 + a_1 p̃_it + u_it,   i = 1, 2, ..., N,  t = 1, 2, ..., T,

where Q̃_it = log Q_it, p̃_it = log p_it, a_1 = 1/(ε − 1), a_0 = (−A − E log η_i)/(ε − 1), and E u_it = 0.

The model is identified if E log η_i = 0, i.e., E η_i = 1; otherwise the estimate of A is biased if η_i is overlooked and E log η_i ≠ 0.

Empirical issue: possible correlation between the output price p_it and the efficiency term η_i.
Suppose now x_it is scalar, and consider

y_it = α_i + β_i x_it + u_it,   i = 1, 2, ..., N,  t = 1, 2, ..., T_i,

where T_i is the number of observations on individual i.

1.2 Analysis of variance
Define the individual means and sums of squares

ȳ_i = (1/T_i) Σ_{t=1..T_i} y_it,   x̄_i = (1/T_i) Σ_{t=1..T_i} x_it,

S_xxi = Σ_{t=1..T_i} (x_it − x̄_i)²,   S_xyi = Σ_{t=1..T_i} (x_it − x̄_i)(y_it − ȳ_i),

S_yyi = Σ_{t=1..T_i} (y_it − ȳ_i)²,   i = 1, 2, ..., N.

Unit-by-unit OLS gives

β̂_i = S_xyi / S_xxi   and   α̂_i = ȳ_i − x̄_i β̂_i,

with residual sum of squares

RSS_i = S_yyi − S_xyi² / S_xxi,

which has (T_i − 2) degrees of freedom.
Consider now a restricted model with constant slopes and constant intercepts:

α_1 = α_2 = ... = α_N (= α),   β_1 = β_2 = ... = β_N (= β).

Pooled OLS gives

β̂ = [Σ_i Σ_{t=1..T_i} (x_it − x̄)(y_it − ȳ)] / [Σ_i Σ_{t=1..T_i} (x_it − x̄)²]
and α̂ = ȳ − x̄ β̂, where

ȳ = (1/Σ_i T_i) Σ_i Σ_t y_it,   x̄ = (1/Σ_i T_i) Σ_i Σ_t x_it.

The residual sum of squares is

RSS = Σ_i Σ_t (y_it − ȳ)² − [Σ_i Σ_t (y_it − ȳ)(x_it − x̄)]² / [Σ_i Σ_t (x_it − x̄)²],

with Σ_i T_i − 2 degrees of freedom.
An intermediate model allows individual intercepts but a common slope. Minimizing Σ_i Σ_t (y_it − α_i − β x_it)², we have

Σ_t (y_it − α_i − β x_it) = 0   for each i,
Σ_i Σ_t x_it (y_it − α_i − β x_it) = 0,

so that

α̂_i = ȳ_i − β̂ x̄_i   and   β̂ = [Σ_i Σ_t x_it (y_it − ȳ_i)] / [Σ_i Σ_t x_it (x_it − x̄_i)].

The Residual Sum of Squares now has Σ_i T_i − (N + 1) degrees of freedom.
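The three regressions above (unit-by-unit, common slope with individual intercepts, fully pooled) can be sketched numerically. A minimal illustration on simulated data; the data-generating values (N, T, slope 0.5) are assumptions for the example only:

```python
import numpy as np

rng = np.random.default_rng(0)
N, T = 5, 20
alpha = rng.normal(0.0, 1.0, N)              # unit-specific intercepts
x = rng.normal(size=(N, T))
y = alpha[:, None] + 0.5 * x + 0.1 * rng.normal(size=(N, T))

# Unit-by-unit OLS: beta_i = Sxy_i / Sxx_i, alpha_i = ybar_i - xbar_i beta_i
xbar, ybar = x.mean(axis=1), y.mean(axis=1)
Sxx = ((x - xbar[:, None]) ** 2).sum(axis=1)
Sxy = ((x - xbar[:, None]) * (y - ybar[:, None])).sum(axis=1)
beta_i = Sxy / Sxx

# Common slope, individual intercepts (pooled within-unit variation)
beta_w = Sxy.sum() / Sxx.sum()
a_i = ybar - xbar * beta_w

# Fully pooled regression: common slope and intercept
xg, yg = x.mean(), y.mean()
beta_p = ((x - xg) * (y - yg)).sum() / ((x - xg) ** 2).sum()
```

The common-slope estimator pools the within-unit cross-products, exactly as in the formula above.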
1.3 Some definitions

In typical micro panels, the number of periods (T) is small and the number of units (N) is large.

Pseudo panel: repeated cross-sections in which cohorts of individuals, rather than the same individuals, are followed over time.
Chapter 2
The linear model

2.1 Notation

y_it = x_it β + u_it,   i = 1, 2, ..., N,  t = 1, 2, ..., T,

where x_it is a K-vector of explanatory variables. The disturbance is decomposed as:

- error-component model: u_it = α_i + ε_it, where α_i is the time-invariant individual effect and ε_it is the i.i.d. component;
- two-way error-component model: u_it = α_i + λ_t + ε_it, where λ_t is the time effect.
2.1.1 Model notation

In stacked form, sorting the observations first by individual and then by period,

Y = Xβ + α + λ + ε,

where

Y = (y_11, ..., y_1T, y_21, ..., y_2T, ..., y_N1, ..., y_NT)'   (NT × 1),

X is the NT × K matrix whose row (i, t) is (X_it^(1), ..., X_it^(K)), and β = (β_1, β_2, ..., β_K)'.
Individual by individual,

y_i = X_i β + α_i + λ + ε_i,   i = 1, 2, ..., N,

where y_i is T × 1, X_i is T × K. Note: λ = (λ_1, λ_2, ..., λ_T)' and α_i = (α_i, α_i, ..., α_i)' are (T × 1).
2.1.2

Transformation operators:

B = I_N ⊗ (1/T) e_T e_T'          (Between-individual operator);
B_λ = (e_N e_N'/N) ⊗ I_T          (Between-period operator);
Q = I_NT − I_N ⊗ (1/T) e_T e_T'   (Within-individual operator);
Q_λ = I_NT − (e_N e_N'/N) ⊗ I_T   (Within-period operator).

All four matrices are NT × NT; B replaces each observation by its individual mean, and Q takes deviations from individual means.
2.1.3

Useful properties:

Q' = Q,  B' = B,  Q² = Q,  B² = B,  BQ = QB = 0.

Example with N = 2, T = 2:

B (y_11, y_12, y_21, y_22)'
  = (1/2) [ 1 1 0 0 ; 1 1 0 0 ; 0 0 1 1 ; 0 0 1 1 ] (y_11, y_12, y_21, y_22)'
  = (1/2) (y_11 + y_12, y_11 + y_12, y_21 + y_22, y_21 + y_22)'.

2.2

Inference here is conditional on the α_i's: they are treated as fixed parameters rather than as random draws.

2.2.1

Consider regressing Y on X and a full set of individual dummy variables.
Let E denote the NT × N matrix of individual dummies,

E = I_N ⊗ e_T =
[ e_T  0   ...  0
  0    e_T ...  0
  ...
  0    0   ...  e_T ]   (columns i = 1, i = 2, ..., i = N),

so that

Y = Xβ + Eα + ε = Wδ + ε,

where W = [X, E] and δ = (β', α')'.
Frisch–Waugh–Lovell theorem: the parameter estimates of β are numerically identical whether obtained from OLS of Y on W = [X, E], or from OLS of (I − P_E)Y on (I − P_E)X, where P_E = E(E'E)⁻¹E'.

But E = I_N ⊗ e_T, so E'E = (I_N ⊗ e_T')(I_N ⊗ e_T) = T I_N, and

I − E(E'E)⁻¹E' = I − (1/T)(I_N ⊗ e_T)(I_N ⊗ e_T)' = I − I_N ⊗ (1/T) e_T e_T' = Q.

In scalar form, the transformations are

Between:  ȳ_i = x̄_i β + ū_i,   i.e.  BY = BXβ + Bu,

Within:  y_it − (1/T) Σ_t y_it = (x_it − (1/T) Σ_t x_it) β + u_it − (1/T) Σ_t u_it,

i.e.

QY = QXβ + Qu.
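The operator identities and the Frisch–Waugh–Lovell equivalence above can be verified numerically. A minimal sketch on a small simulated panel (dimensions and parameter values are illustrative assumptions):

```python
import numpy as np

N, T, K = 4, 3, 2
rng = np.random.default_rng(1)

B = np.kron(np.eye(N), np.ones((T, T)) / T)   # between operator
Q = np.eye(N * T) - B                         # within operator
assert np.allclose(Q @ B, 0) and np.allclose(Q @ Q, Q)   # BQ = 0, Q idempotent

X = rng.normal(size=(N * T, K))
alpha = np.repeat(rng.normal(size=N), T)      # individual effects
y = X @ np.array([1.0, -2.0]) + alpha + 0.1 * rng.normal(size=N * T)

# Within estimator: OLS of QY on QX
beta_within = np.linalg.lstsq(Q @ X, Q @ y, rcond=None)[0]

# LSDV: OLS of y on [X, E], with E the matrix of individual dummies
E = np.kron(np.eye(N), np.ones((T, 1)))
coef = np.linalg.lstsq(np.hstack([X, E]), y, rcond=None)[0]
```

By Frisch–Waugh–Lovell, `beta_within` and `coef[:K]` coincide to machine precision.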
2.2.2

Stacking by individual, the model is

y_i = x_i β + α_i e_T + ε_i,   i = 1, 2, ..., N,

with the usual assumptions on ε. Minimizing

Σ_{i=1..N} ε_i'ε_i = Σ_{i=1..N} (y_i − α_i e_T − x_i β)'(y_i − α_i e_T − x_i β)

gives

α̂_i = ȳ_i − x̄_i β̂,   i = 1, 2, ..., N,

and, substituting into the partial derivative with respect to β,

β̂ = [Σ_{i,t} (x_it − x̄_i)(x_it − x̄_i)']⁻¹ [Σ_{i,t} (x_it − x̄_i)(y_it − ȳ_i)],

with

Var(β̂) = σ̂_ε² [Σ_{i=1..N} x_i Q_T x_i']⁻¹,

where Q_T = I_T − (1/T) e_T e_T'.
2.2.3 Comments

The Between regression

BY = BXβ + Bα + Bε

uses only the individual means. Any regressor that is constant over time for each unit is eliminated by the Within transformation. When the error variance is computed from Within residuals, the degrees-of-freedom correction is (NT − K)/[N(T − 1) − K], since the N individual means are implicitly estimated.
[Figure: decomposition of the variation of Y against X into Within (deviations from individual means) and Between (individual means) components.]
2.2.4 Poolability

As before, x_it is a K-vector, but now we test

H0: β_1 = β_2 = ... = β_N (= β)   (K(N − 1) constraints),

comparing the restricted residual sum of squares with the unrestricted one, Σ_{i=1..N} RSS_i, where RSS_i comes from the separate regression for unit i; the statistic is distributed as F(K(N − 1), N(T − K − 1)). A similar F test compares the pooled (OLS) model with the fixed-effects (Within) model, F(N − 1, N(T − 1) − K), under the null of equal intercepts; both tests are exact for fixed N and consistent as NT → ∞.
2.3

The α_i's are now treated as random. Assumptions:

E(α_i α_j) = σ_α²  if i = j,  0 otherwise;
E(ε_it ε_js) = σ_ε²  if i = j and t = s,  0 otherwise.

Hence cov(u_it, u_js) = σ_α² + σ_ε² if i = j and t = s, σ_α² if i = j and t ≠ s, and 0 otherwise.
Let

Σ_T = E(u_i u_i') =
[ σ_α² + σ_ε²   σ_α²          ...  σ_α²
  σ_α²          σ_α² + σ_ε²   ...  σ_α²
  ...
  σ_α²          σ_α²          ...  σ_α² + σ_ε² ]   (a T × T matrix),

and

Ω = E(uu') = I_N ⊗ Σ_T = I_N ⊗ [σ_α² (e_T e_T') + σ_ε² I_T].

Since Q_T = I_T − B_T and B_T = (1/T) e_T e_T', we have

Ω = I_N ⊗ [σ_α² T B_T + σ_ε² (Q_T + B_T)] = T σ_α² B + σ_ε² I_NT,

or equivalently

Ω = σ_ε² Q + (T σ_α² + σ_ε²) B.
2.3.2

Consider Y = Xβ + U with E(UU') = Ω. If Ω — that is, σ_α² and σ_ε² — were known, the GLS estimator would be

β̂_GLS = (X'Ω⁻¹X)⁻¹ X'Ω⁻¹Y,   Var(β̂_GLS) = (X'Ω⁻¹X)⁻¹.

Computation of Ω⁻¹: based on the properties of Q and B (idempotent, mutually orthogonal),

Ω^r = (σ_ε²)^r Q + (T σ_α² + σ_ε²)^r B   for an arbitrary scalar r.
In particular,

Ω⁻¹ = (1/σ_ε²) Q + [1/(T σ_α² + σ_ε²)] B

and

Ω^(−1/2) = (1/σ_ε) Q + [1/(T σ_α² + σ_ε²)^(1/2)] B.

We have

β̂_GLS = (X'Ω⁻¹X)⁻¹ X'Ω⁻¹Y = [X'(Ω/σ_ε²)⁻¹X]⁻¹ [X'(Ω/σ_ε²)⁻¹Y]
       = [X'(Q + θ⁻¹B)X]⁻¹ [X'(Q + θ⁻¹B)Y],

where θ = (T σ_α² + σ_ε²)/σ_ε² = 1 + T σ_α²/σ_ε².

Equivalently, premultiply by σ_ε Ω^(−1/2) and use OLS: Y* = X*β + u*, where

Y* = σ_ε Ω^(−1/2) Y = [Q + (σ_ε/(σ_ε² + T σ_α²)^(1/2)) B] Y = (Q + θ^(−1/2) B) Y,
X* = (Q + θ^(−1/2) B) X,

and in scalar form

y*_it = y_it − (1 − θ^(−1/2)) ȳ_i,   x*_it = x_it − (1 − θ^(−1/2)) x̄_i.
2.3.3

β̂_GLS = [X'QX + θ⁻¹ X'BX]⁻¹ [X'QY + θ⁻¹ X'BY],

while

β̂_Within = (X'QX)⁻¹ X'QY,   β̂_Between = (X'BX)⁻¹ X'BY.

The GLS estimator is a weighted average of the Within and Between estimators, where the weight of each is the inverse of the corresponding variance.
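The quasi-demeaning (scalar) form of GLS above can be checked against the matrix formula. A minimal sketch on simulated data, treating the variance components as known (their values, and the slope 0.7, are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 200, 6
sig_a2, sig_e2 = 0.5, 1.0
x = rng.normal(size=(N, T))
y = (0.7 * x
     + rng.normal(0, np.sqrt(sig_a2), (N, 1))     # individual effect
     + rng.normal(0, np.sqrt(sig_e2), (N, T)))    # idiosyncratic error

# Scalar form: y*_it = y_it - (1 - theta^{-1/2}) ybar_i
theta = 1 + T * sig_a2 / sig_e2
lam = 1 - theta ** -0.5
ys = y - lam * y.mean(axis=1, keepdims=True)
xs = x - lam * x.mean(axis=1, keepdims=True)
beta_gls = (xs * ys).sum() / (xs ** 2).sum()

# Matrix form: (sum_i x_i' Omega_T^{-1} x_i)^{-1} (sum_i x_i' Omega_T^{-1} y_i)
Omega_T = sig_a2 * np.ones((T, T)) + sig_e2 * np.eye(T)
Oi = np.linalg.inv(Omega_T)
num = sum(x[i] @ Oi @ y[i] for i in range(N))
den = sum(x[i] @ Oi @ x[i] for i in range(N))
```

The two computations agree exactly, since OLS on the transformed data reproduces X'Ω⁻¹Y / X'Ω⁻¹X.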
Example: wage data, N = 629, T = 6.

Table 2.1: Within and GLS estimates

Variable          Within     GLS
Constant             —        0.8499
Age in [20,35]     0.0557    0.0393
Age in [35,45]     0.0351    0.0092
Age in [45,55]     0.0209   -0.0007
Age in [55,65]     0.0209   -0.0097
Age 65 over       -0.0171   -0.0423
—                 -0.0042   -0.0277
—                 -0.0204   -0.0250
Self-employed     -0.2190   -0.2670
South             -0.1569   -0.0324
Rural             -0.0101   -0.1215
2.3.6

To make GLS feasible, we use

σ̂_ε² = u'Qu / tr(Q) = Σ_{i=1..N} Σ_{t=1..T} (u_it − ū_i)² / [N(T − 1)]

and

σ̂_ε² + T σ̂_α² = u'Bu / tr(B) = T Σ_{i=1..N} ū_i² / N,

because tr(Q) = N(T − 1) and tr(B) = N. The u_it's are not observed, so residuals û_it are used instead.
Several choices of residuals û are available:

1/ OLS residuals of the pooled regression;

2/ Amemiya (1971): use Within residuals. Then

√(NT)(σ̂_ε² − σ_ε²)  and  √N(σ̂_1² − σ_1²)  →d  N(0, diag(2σ_ε⁴, 2σ_1⁴)),

where σ_1² = σ_ε² + T σ_α², and σ̂_α² = (σ̂_1² − σ̂_ε²)/T;

3/ Use the residual sums of squares of the Within and Between regressions:

σ̂_ε² = [Y'QY − Y'QX(X'QX)⁻¹X'QY] / [N(T − 1) − K],
σ̂_ε² + T σ̂_α² = [Y'BY − Y'BX(X'BX)⁻¹X'BY] / [N − K − 1],

the latter based on regressors not in the Within regression;

4/ Nerlove (1971): compute σ̂_α² = [1/(N − 1)] Σ_{i=1..N} (α̂_i − ᾱ̂)², where the α̂_i are the Within-estimated individual effects.

Then apply Feasible GLS.
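The two basic variance-component estimators above (u'Qu/tr(Q) and u'Bu/tr(B)) can be sketched directly. For clarity this illustration uses the true disturbances rather than residuals — an assumption made only to isolate the formulas:

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 500, 5
sig_a2, sig_e2 = 0.8, 1.5
u = (rng.normal(0, np.sqrt(sig_a2), (N, 1))      # alpha_i, repeated over t
     + rng.normal(0, np.sqrt(sig_e2), (N, T)))   # eps_it

ubar = u.mean(axis=1)
# u'Qu / tr(Q), tr(Q) = N(T-1): estimates sigma_eps^2
sig_e2_hat = ((u - ubar[:, None]) ** 2).sum() / (N * (T - 1))
# u'Bu / tr(B), tr(B) = N: estimates sigma_eps^2 + T sigma_alpha^2
sig_1_hat = T * (ubar ** 2).sum() / N
sig_a2_hat = (sig_1_hat - sig_e2_hat) / T
```

In practice u is replaced by OLS or Within residuals, as listed in points 1/–4/ above.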
Chapter 3
Extensions
3.1 The Two-way panel data model
Error component structure of the form:
u_it = α_i + λ_t + ε_it,   i = 1, 2, ..., N,  t = 1, 2, ..., T,

or in matrix form

U = (I_N ⊗ e_T) α + (e_N ⊗ I_T) λ + ε,

where α = (α_1, ..., α_N)' and λ = (λ_1, ..., λ_T)'.

3.1.1

3.1.1.1 Notation

Fixed-effect estimates of β use the two-way Within operator

Q = I_N ⊗ I_T − I_N ⊗ (e_T e_T'/T) − (e_N e_N'/N) ⊗ I_T,
so that Qu = {u_it − ū_i − ū_t}_it, with restrictions

Σ_{i=1..N} α_i = 0   and   Σ_{t=1..T} λ_t = 0.

The Within estimates are

β̂ = (X'QX)⁻¹ X'QY,   α̂_i = ȳ_i − x̄_i β̂,   λ̂_t = ȳ_t − x̄_t β̂.

If the model contains an intercept, the operator Q becomes

Q = I_N ⊗ I_T − I_N ⊗ (e_T e_T'/T) − (e_N e_N'/N) ⊗ I_T + (e_N e_N'/N) ⊗ (e_T e_T'/T),

so that Qu = {u_it − ū_i − ū_t + ū}_it, and the Within estimates are

β̂ = (X'QX)⁻¹ X'QY,
α̂_i = (ȳ_i − ȳ) − (x̄_i − x̄) β̂,
λ̂_t = (ȳ_t − ȳ) − (x̄_t − x̄) β̂.
F tests:

1/ H0: α_1 = ... = α_N = λ_1 = ... = λ_T = 0, with

k_1 = N + T − 2,   k_2 = (N − 1)(T − 1) − K;

2/ H0: α_1 = ... = α_N = 0 given λ_t ≠ 0, t ≤ T − 1, with

k_1 = N − 1,   k_2 = (N − 1)(T − 1) − K,

based on the regression

(y_it − ȳ_t) = (x_it − x̄_t) β + (u_it − ū_t);

3/ H0: λ_1 = ... = λ_{T−1} = 0 given α_i ≠ 0, i ≤ N − 1, with

k_1 = T − 1,   k_2 = (N − 1)(T − 1) − K,

based on

(y_it − ȳ_i) = (x_it − x̄_i) β + (u_it − ū_i).
3.1.2

Motivation for adding specific effects (into u_it): climatic conditions, identical across farms, are captured by the time effects (λ_t). Production-function estimates under three alternative assumptions:

Estimate           (I)      (II)     (III)
β_1 (Labor)        0.256    0.166    0.043
β_2 (Real estate)  0.135    0.230    0.199
β_3 (Machinery)    0.163    0.261    0.194
β_4 (Fertilizer)   0.349    0.311    0.289
Sum of β's         0.904    0.967    0.726
R²                 0.721    0.813    0.884

Assumption: (I) α_i = λ_t = 0;  (II) α_i = 0;  (III) λ_t = 0.
3.2

So far the variances σ_α² and σ_ε² are assumed constant. Possible departures:

- Var(α_i) = σ_αi²: individual-specific heteroskedasticity;
- Var(ε_it) = σ_i²: typical heteroskedasticity;
- E(ε_it ε_is) ≠ 0 for t ≠ s: serial correlation.
3.2.1

Assume individual-specific heteroskedasticity of the individual effect: Var(α_i) = σ_αi², i = 1, 2, ..., N. Then

Ω = E(UU') = diag[σ_αi²] ⊗ (e_T e_T') + diag[σ_ε²] ⊗ I_T,

where diag[σ_αi²] is N × N. We have

Ω = diag[(T σ_αi² + σ_ε²)] ⊗ (e_T e_T'/T) + diag[σ_ε²] ⊗ (I_T − e_T e_T'/T).

Transformation of the heteroskedastic model: multiply both sides by

σ_ε Ω^(−1/2) = diag[σ_ε / (T σ_αi² + σ_ε²)^(1/2)] ⊗ (e_T e_T'/T) + I_N ⊗ (I_T − e_T e_T'/T),

which gives, in scalar form,

y*_it = y_it − [1 − σ_ε / (T σ_αi² + σ_ε²)^(1/2)] ȳ_i.

The transformation parameter is now individual-specific: with θ_i = (T σ_αi² + σ_ε²)/σ_ε²,

y*_it = y_it − (1 − θ_i^(−1/2)) ȳ_i.

Feasible GLS replaces σ_ε² and the σ_αi² by estimates computed unit by unit (i = 1, 2, ..., N) from the residuals; this requires T ≫ N so that each individual variance is estimated with enough precision.

3.2.2 Typical heteroskedasticity
Assumptions: Var(ε_it) = σ_i². Then

Ω = E(UU') = diag[σ_α²] ⊗ (e_T e_T') + diag[σ_i²] ⊗ I_T
  = diag[T σ_α² + σ_i²] ⊗ (e_T e_T'/T) + diag[σ_i²] ⊗ (I_T − e_T e_T'/T).

The transformed model uses

Ω^(−1/2) = diag[1/(T σ_α² + σ_i²)^(1/2)] ⊗ (e_T e_T'/T) + diag[1/σ_i] ⊗ (I_T − e_T e_T'/T),

and Y* = Ω^(−1/2) Y has typical element

y*_it = (y_it − ȳ_i)/σ_i + ȳ_i/(T σ_α² + σ_i²)^(1/2) = (y_it − θ_i ȳ_i)/σ_i,

where θ_i = 1 − σ_i/(T σ_α² + σ_i²)^(1/2).

Since E(u_it²) = w_i² = σ_α² + σ_i² for all i, OLS residuals û_it can be used to estimate w_i²: ŵ_i² = [1/(T − 1)] Σ_t (û_it − ū̂_i)². Within residuals ũ_it are then used to compute σ̂_i² = [1/(T − 1)] Σ_t (ũ_it − ū̃_i)², so that a consistent estimate of σ_α² is σ̂_α² = (1/N) Σ_i (ŵ_i² − σ̂_i²).
3.3

3.3.1 Introduction

In an unbalanced (incomplete) panel, individual i is observed over T_i periods, and the total number of observations is now Σ_{i=1..N} T_i (instead of NT previously); the number of periods may differ from one unit (individual) to another.

3.3.2
With N = 2, T_1 = 3 and T_2 = 2, the stacked model is

(y_11, y_12, y_13, y_21, y_22)' = (x_11, x_12, x_13, x_21, x_22)' β
  + (α_1, α_1, α_1, α_2, α_2)' + (ε_11, ε_12, ε_13, ε_21, ε_22)'.

To eliminate the α_i, use the block-diagonal Within operator

Q = diag(I_{T_i} − e_{T_i} e_{T_i}'/T_i)
  = [ I_3 − e_3 e_3'/3        0
             0          I_2 − e_2 e_2'/2 ]
  = [  2/3  −1/3  −1/3    0     0
      −1/3   2/3  −1/3    0     0
      −1/3  −1/3   2/3    0     0
        0     0     0    1/2  −1/2
        0     0     0   −1/2   1/2 ].
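The block-diagonal operator Q = diag(I_{T_i} − e_{T_i} e_{T_i}'/T_i) is straightforward to build; a minimal sketch reproducing the T_1 = 3, T_2 = 2 example above:

```python
import numpy as np

def within_operator(T_list):
    """Block-diagonal Q: each block I_{T_i} - ones/T_i demeans one unit."""
    n = sum(T_list)
    Q = np.zeros((n, n))
    pos = 0
    for Ti in T_list:
        Q[pos:pos + Ti, pos:pos + Ti] = np.eye(Ti) - np.ones((Ti, Ti)) / Ti
        pos += Ti
    return Q

Q = within_operator([3, 2])
y = np.array([1.0, 2.0, 3.0, 4.0, 6.0])
# Qy subtracts each unit's own mean (unit 1 mean = 2, unit 2 mean = 5):
print(Q @ y)   # [-1.  0.  1. -1.  1.]
```

The same function handles any list of unit lengths, which is all the unbalanced Within transformation requires.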
Consider now a panel that is incomplete by period: let N_t be the number of individuals observed at time t, and n = Σ_{t=1..T} N_t. Consider the N_t × N selection matrix D_t at time t.

Example with N = 3, T = 3:

D_1 = [ 1 0 0 ; 0 1 0 ; 0 0 1 ],   D_2 = [ 1 0 0 ; 0 0 1 ],   D_3 = [ 1 0 0 ; 0 1 0 ].

We have 3 (N_t × N) matrices D_t, t = 1, 2, 3, constructed from I_3 above by deleting the rows of the units not observed in period t.
The matrix Δ of individual and time dummies is n × (N + T): Δ = [Δ_1, Δ_2], where Δ_1 stacks the D_t's (individual dummies) and Δ_2 contains the time dummies. In the balanced-panel case we would have Δ_1 = (e_T ⊗ I_N), Δ_2 = (I_T ⊗ e_N), and Δ would be NT × (N + T).

In the example above, n = 3 + 2 + 2 = 7 and N = 3. With observations sorted by period, Y = (y_11, y_21, y_31, y_12, y_32, y_13, y_23)', and

Δ_1 = [ 1 0 0          Δ_2 = [ 1 0 0
        0 1 0                  1 0 0
        0 0 1                  1 0 0
        1 0 0                  0 1 0
        0 0 1                  0 1 0
        1 0 0                  0 0 1
        0 1 0 ],               0 0 1 ],

so that Δ'Y collects the individual and period sums:

Δ_1'Y = (y_11 + y_12 + y_13,  y_21 + y_23,  y_31 + y_32)',
Δ_2'Y = (y_11 + y_21 + y_31,  y_12 + y_32,  y_13 + y_23)'.
An easier method when N and T are large: let

Δ_N = Δ_1'Δ_1   (N × N);
Δ_T = Δ_2'Δ_2   (T × T);
Δ_NT = Δ_2'Δ_1  (T × N);
Δ̄ = Δ_2 − Δ_1 Δ_N⁻¹ Δ_NT'   (n × T);
P = Δ_T − Δ_NT Δ_N⁻¹ Δ_NT'   (T × T).

Then the two-way Within operator is

Q = I_n − Δ_1 Δ_N⁻¹ Δ_1' − Δ̄ P⁻ Δ̄',

where P⁻ is a generalized inverse of P.
Example. For the incomplete panel above,

Δ_N = Δ_T = [ 3 0 0 ; 0 2 0 ; 0 0 2 ],   Δ_NT = [ 1 1 1 ; 1 0 1 ; 1 1 0 ],

so that

P = [  1.6666  −0.8333  −0.8333
      −0.8333   1.1666  −0.3333
      −0.8333  −0.3333   1.1666 ].

For the data vector Y used in the notes, applying Q element by element yields, for example, Qy_11 = 0.4582 and Qy_31 = 0.5.
Chapter 4
Augmented panel data models

What are augmented panel models? What are the implications for estimation? Special estimation techniques are needed when GLS is not feasible.

4.1 Introduction

Consider the model augmented with time-invariant regressors z_i:

y_it = x_it β + z_i γ + α_i + ε_it.

Applying the Within operator,

QY = QXβ + QZγ + Qα + Qε = QXβ + Qε,

since BZ = Z implies QZ = (I − B)Z = 0. Only β is identifiable from the Within regression. A two-step procedure is feasible: obtain β̂ from the Within regression, then estimate

ȳ_i − x̄_i β̂ = z_i γ + α_i + ε̄_i,   i = 1, 2, ..., N,

to estimate the γ's.
Recall: GLS is a consistent and efficient estimator provided the regressors are exogenous:

E(α_i z_i) = 0   and   E(α_i x_it) = 0,   ∀ i, t.

Consider the non-augmented model y_it = x_it β + α_i + ε_it. If x_it is endogenous in the sense E(α_i x_it) ≠ 0, then GLS is not consistent:

β̂_GLS = β + (X'Ω⁻¹X)⁻¹ X'Ω⁻¹U = β + [X'(Q + θ⁻¹B)X]⁻¹ [X'(Q + θ⁻¹B)U],

where θ = 1 + T σ_α²/σ_ε², so that

X'(Q + θ⁻¹B)U = X'Qε + X'(Bα + Bε)/θ = 0 + X'Bα/θ + 0 = X'α/θ ≠ 0,

because E(X'ε) = 0 and Bα = α. The Within estimator remains consistent because α is filtered out: it uses only the within variation of the x_it's.

This suggests testing H0: E(X'α) = E(Z'α) = 0 (exogeneity).
Behavior of the two estimators:

               Under H0                      Under the alternative
β̂_GLS         Consistent, efficient         Not consistent
β̂_Within      Consistent, not efficient     Consistent

Therefore, the Hausman statistic is

HT = (β̂_Within − β̂_GLS)' [V(β̂_Within) − V(β̂_GLS)]⁻¹ (β̂_Within − β̂_GLS) ~ χ²(K) under H0.

Notes. Recall that V(β̂_Within) = σ_ε² (X'QX)⁻¹. The test bears on the coefficients of X only (the time-invariant regressors are discussed later).
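The statistic above is mechanical once both estimates and their covariance matrices are available. A minimal sketch — the numeric inputs below are purely illustrative, not taken from the text, and b_W, b_GLS, V_W, V_GLS are assumed to come from the Within and feasible-GLS steps of the previous sections:

```python
import numpy as np

def hausman(b_w, V_w, b_gls, V_gls):
    """HT = d' [V_w - V_gls]^{-1} d, d = b_w - b_gls; compare to chi2(K)."""
    d = b_w - b_gls
    return float(d @ np.linalg.inv(V_w - V_gls) @ d)

# Illustrative numbers (hypothetical):
b_w, b_gls = np.array([0.52, -1.10]), np.array([0.50, -1.00])
V_w = np.array([[0.010, 0.001], [0.001, 0.020]])
V_gls = np.array([[0.004, 0.000], [0.000, 0.008]])
HT = hausman(b_w, V_w, b_gls, V_gls)
```

Under H0 the statistic is χ²(K) with K = 2 here; a large value leads to rejecting exogeneity.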
Alternative method: Instrumental-variable estimation.

In the just-identified case, with a matrix of instruments W (L = K columns), the moment conditions and estimator are

[W'(Y − Xβ)] = 0   ⇒   β̂ = (W'X)⁻¹ W'Y   (IV estimator).

If L > K (L conditions on K parameters), minimize

(Y − Xβ)' W(W'W)⁻¹W' (Y − Xβ) = (Y − Xβ)' P_W (Y − Xβ),   where P_W = W(W'W)⁻¹W',

⇒   β̂ = (X'P_W X)⁻¹ (X'P_W Y).

Note: in general, instruments may originate from inside or outside the equation.
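Both formulas above are one-liners in matrix form. A minimal sketch on simulated data (the first-stage coefficients and the endogeneity structure are assumptions chosen for the illustration):

```python
import numpy as np

rng = np.random.default_rng(4)
n, L = 500, 3
W = rng.normal(size=(n, L))                              # instruments
x = W @ np.array([0.6, 0.4, 0.3]) + rng.normal(size=n)   # first stage
e = rng.normal(size=n)                                   # structural error
X = (x + 0.8 * e).reshape(n, 1)                          # endogenous regressor
Y = X @ np.array([2.0]) + e

# Just-identified: one instrument, beta = (W1'X)^{-1} W1'Y
W1 = W[:, :1]
b_iv = np.linalg.solve(W1.T @ X, W1.T @ Y)

# Over-identified: beta = (X' P_W X)^{-1} X' P_W Y
PW = W @ np.linalg.solve(W.T @ W, W.T)
b_2sls = np.linalg.solve(X.T @ PW @ X, X.T @ PW @ Y)
```

Plain OLS of Y on X would be badly biased here, while both IV variants recover the true coefficient (2.0) up to sampling error; the over-identified version is more precise.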
4.4.2 IV in a panel-data context

Consider

Y = X_1 β_1 + X_2 β_2 + Z_1 γ_1 + Z_2 γ_2 + α + ε,

where

- X_1 (NT × K_1): exogenous, varying across i and t;
- X_2 (NT × K_2): endogenous, varying across i and t;
- Z_1 (NT × G_1): exogenous, varying across i only;
- Z_2 (NT × G_2): endogenous, varying across i only;

and let Y* = Ω^(−1/2) Y, X* = Ω^(−1/2) X. We have

δ̂_IV = [X*' P_W X*]⁻¹ [X*' P_W Y*]
      = [X' Ω^(−1/2) P_W Ω^(−1/2) X]⁻¹ [X' Ω^(−1/2) P_W Ω^(−1/2) Y].
4.4.3

Computation of Ω and Ω^(−1/2) proceeds as before. Exogeneity assumptions:

E(X_1'α) = E(Z_1'α) = 0.

⇒ Obvious instruments are X_1 and Z_1, but they are not sufficient, because K_1 + G_1 < K_1 + K_2 + G_1 + G_2.

Additional instruments must not be correlated with α. Because α is the source of endogeneity, every variable not correlated with α is a valid instrument; the best valid instruments are those highly correlated with X_2 and Z_2.

QX_1 and QX_2 are valid instruments: E[(QX_1)'α] = E[X_1'Qα] = 0 since Qα = 0; moreover

E[X_1'(Q + θ⁻¹B)U] = 0   and   E[(BX_1)'(Q + θ⁻¹B)U] = θ⁻¹ E[X_1'BU],

since BQ = 0 and BB = B.
4.4.4

Identification condition: K_1 ≥ G_2.

If x_it is exogenous at every date, we can use the conditions E(x_it α_i) = 0 ∀ i, ∀ t instead of E(x̄_i' α_i) = 0. Each period of X_1 then provides a separate instrument, through the NT × TK_1 matrix

X_1* = [ x_11  x_12  ...  x_1T    (i = 1, t = 1)
         x_11  x_12  ...  x_1T    (i = 1, t = 2)
         ...
         x_21  x_22  ...  x_2T    (i = 2, t = 1)
         x_21  x_22  ...  x_2T    (i = 2, t = 2)
         ...
         x_N1  x_N2  ...  x_NT ]  (i = N, t = T),

in which each unit's full time path is repeated T times. The AM instrument set W_AM uses (QX_1), (QX_2) and X_1*; we add instruments originating from X_2, constructed in the same way, to obtain the BMS set.
Let β̂_W denote the Within estimator and consider the Between regression

BY − BX β̂_W = Zγ + Bα + Bε − BX(X'QX)⁻¹X'Qε,

where X = (X_1 | X_2), Z = (Z_1 | Z_2), and γ = (γ_1, γ_2). This last equation is estimated by IV, instrumenting Z_2, in order to estimate γ.
The residuals

û_W = QY − QX β̂_W   and   û_B = BY − BX β̂_W − Z γ̂_B

are then used to compute the variance components, as in standard Feasible GLS.

4.5.1

The feasible estimator applies the quasi-demeaning transformation y_it − (1 − θ̂^(−1/2)) ȳ_i to the augmented model, and runs IV with the instrument set W.
4.6 Model specification

Example: a wage equation

w = X_1 δ + β ED + α + ε,

where w is the wage rate, X_1 are additional variables (industry, occupation status, etc.), and ED is the educational level. The individual effect α proxies the worker's (unobserved) ability, and the parameter of interest is the return to education, ∂w/∂ED.

If ability also determines educational attainment, ED is correlated with α. Two problems arise when estimating the first equation while overlooking the second one: an endogeneity bias (ability affects both w and ED), and a measurement-error bias.
4.7.1

Individual-specific variables:

FEM: dummy, 1 if female;
BLK: dummy, 1 if head is black;
ED: number of years of education attained.

ED, BLK and FEM are treated a priori as exogenous. Among the time-varying variables, some (such as marital status MS and industry IND) are a priori correlated with the individual effects; the remaining Z_i's are assumed a priori exogenous. The augmented model is then estimated.
Table 4.1: Descriptive statistics

Variable   Mean      Std. Dev.  Minimum   Maximum
LWAGE       6.6763    0.4615     4.6052    8.5370
EXP        19.8538   10.9664     1.0000   51.0000
WKS        46.8115    5.1291     5.0000   52.0000
OCC         0.5112    0.4999     0.0000    1.0000
IND         0.3954    0.4890     0.0000    1.0000
UNION       0.3640    0.4812     0.0000    1.0000
SOUTH       0.2903    0.4539     0.0000    1.0000
SMSA        0.6538    0.4758     0.0000    1.0000
MS          0.8144    0.3888     0.0000    1.0000
ED         12.8454    2.7880     4.0000   17.0000
FEM         0.1126    0.3161     0.0000    1.0000
BLK         0.0723    0.2590     0.0000    1.0000
Table 4.2: Exogenous regressors only

Variable   Within               GLS
Constant      —                 0.0976 (0.0040)
OCC        -0.0696 (0.02323)   -0.0701 (0.02322)
SOUTH      -0.0052 (0.05833)   -0.0072 (0.05807)
SMSA       -0.1287 (0.03295)   -0.1275 (0.03290)
IND         0.0317 (0.02626)    0.0317 (0.02624)

Hausman test: χ²(4) = 0.551

Table 4.3: Time-varying regressors only

Variable   Within                GLS
Constant      —                  0.0561 (0.0024)
EXPE        0.1136 (0.002467)    0.1133 (0.002466)
EXPE2      -0.0004 (0.000054)   -0.0004 (0.000054)
WKS         0.0008 (0.0005994)   0.0008 (0.0005994)
MS         -0.0322 (0.01893)    -0.0325 (0.01892)
UNION       0.0301 (0.01480)     0.0300 (0.01479)

Hausman test: χ²(5) = 24.94
Table 4.4: Augmented model

Variable   Within              GLS
Constant      —                0.1866 (0.01189)
OCC        -0.0214 (0.01378)  -0.0243 (0.01367)
SOUTH      -0.0018 (0.03429)   0.0048 (0.03188)
SMSA       -0.0424 (0.01942)  -0.0468 (0.01891)
IND         0.0192 (0.01544)   0.0148 (0.01521)
EXPE        0.1132 (0.00247)   0.1084 (0.00243)
EXPE2      -0.0004 (0.00005)  -0.0004 (0.00005)
WKS         0.0008 (0.00059)   0.0008 (0.00059)
MS         -0.0297 (0.01898)  -0.0391 (0.01884)
UNION       0.0327 (0.01492)   0.0375 (0.01472)
FEM           —               -0.1666 (0.12646)
BLK           —               -0.2639 (0.15413)
ED            —                0.1373 (0.01415)

Hausman test: χ²(9) = 495.3

Table 4.5: IV estimates of the augmented model

Variable   HT                AM                BMS
Constant    0.1772 (0.017)    0.1781 (0.016)    0.1748 (0.016)
OCC        -0.0207 (0.013)   -0.0208 (0.013)   -0.0204 (0.013)
SOUTH       0.0074 (0.031)    0.0072 (0.031)    0.0077 (0.031)
SMSA       -0.0418 (0.018)   -0.0419 (0.018)   -0.0423 (0.018)
IND         0.0135 (0.015)    0.0136 (0.015)    0.0138 (0.015)
EXPE        0.1131 (0.002)    0.1129 (0.002)    0.1127 (0.002)
EXPE2      -0.0004 (0.005)   -0.0004 (0.000)   -0.0004 (0.000)
WKS         0.0008 (0.000)    0.0008 (0.000)    0.0008 (0.000)
MS         -0.0298 (0.018)   -0.0300 (0.018)   -0.0303 (0.018)
UNION       0.0327 (0.014)    0.0324 (0.014)    0.0326 (0.014)
FEM        -0.1309 (0.126)   -0.1320 (0.126)   -0.1337 (0.126)
BLK        -0.2857 (0.155)   -0.2859 (0.155)   -0.2793 (0.155)
ED          0.1379 (0.021)    0.1372 (0.020)    0.1417 (0.020)
Chapter 5
Dynamic panel data models

5.1 Motivation

Usefulness of dynamic panel data models: in practice, they allow one to estimate long-run elasticities and structural parameters from Euler equations.

5.1.1

Consider the dynamic optimization problem

max_{q(0),...,q(T)} E ∫ e^(−rt) π(t) dt,
π(t) = p(t) q(t) − c[q(t), b(t)],
ḃ = G[b(t), q(t)],

where b(t) is the state variable (stock, capital, ...), q(t) is the control variable, and r is the discount rate; G(·) describes the evolution path of the state. In discrete time:

max_{q_0,...,q_T} E { Σ_{t=0..T} (1 + r)^(−t) π_t },

with value function

V_t(b_t) = max { π_t + (1 + r)^(−1) V_{t+1}(b_{t+1}) },   b_{t+1} = f(b_t, q_t).
We use (a) the envelope theorem (the evolution path at the optimum depends only on the state variable, as the control variable is already optimized), and (b) the first-order condition with respect to the control variable:

∂π_t/∂q_t + (1 + r)^(−1) (∂V_{t+1}/∂b_{t+1}) (∂f(b_t, q_t)/∂q_t) = 0,   (FOC)

∂V_t/∂b_t = ∂π_t/∂b_t + (1 + r)^(−1) (∂V_{t+1}/∂b_{t+1}) (∂f(b_t, q_t)/∂b_t).   (envelope)

Now lag the FOC one period, and assume a linear transition: ∂f(b_t, q_t)/∂q_t = a_1 and ∂f(b_t, q_t)/∂b_t = a_2. Combining the lagged FOC with the envelope condition to eliminate the value function, we have

∂π_t/∂q_t = [(1 + r)/a_2] ∂π_{t−1}/∂q_{t−1} + (a_1/a_2) ∂π_t/∂b_t.

This is the Euler equation relating current and past marginal profits.
If, for instance, profit is linear-quadratic in q and b, so that ∂π_t/∂q_t = b_0 + b_1 q_t + b_2 b_t and ∂π_t/∂b_t = c_0 + c_1 q_t + c_2 b_t, the Euler equation becomes

b_0 + b_1 q_t + b_2 b_t = [(1 + r)/a_2] (b_0 + b_1 q_{t−1} + b_2 b_{t−1}) + (a_1/a_2)(c_0 + c_1 q_t + c_2 b_t),

which solves into the linear dynamic equation q_t = π_0 + π_1 q_{t−1} + π_2 b_{t−1} + π_3 b_t, where

π_0 = (a_2 b_1 − a_1 c_1)⁻¹ [b_0 ((1 + r) − a_2) + a_1 c_0],
π_1 = (a_2 b_1 − a_1 c_1)⁻¹ [(1 + r) b_1],
π_2 = (a_2 b_1 − a_1 c_1)⁻¹ [(1 + r) b_2],
π_3 = (a_2 b_1 − a_1 c_1)⁻¹ [a_1 c_2 − a_2 b_2].
72
CHAPTER 5.
ct + At = yt + At 1(1 + rt); t = 1; 2;
where
ct
is consumption at time
income, and
rt is interest rate.
t, At
is total assets,
yt
is wage
U = u(c1) +
where
1
u(c );
1+ 2
U
where
= c1 +
1
c2 ;
1+
At the optimum (by replacing budget constraints in utility function and optimizing wrt.
A1):
@u @c1
1 @u @c2
@U
=
+
=0
@A1 @c1 @A1 1 + @c2 @A1
@u 1 + r @u
, @c
=
:
1 1 + @c2
This is the
c1 1= =
1+r
c2 1= :
1+
5.1.
73
MOTIVATION
c1 =
1+r
(
1+
u(X ) = 1=2(
Ec2)
X )2 :
c1 = Ec2
if
r = :
ct+1 = ct + "t+1;
where
"t+1 is i.i.d.;
ct 1) + "t:
ct = 0 + 1yt + (ct 1
5.1.3
1yt 1) + 2(yt 1
ct 1) + "t:
Consider a dynamic demand equation relating consumption C̃_it and price P̃_it, with adjustment parameter λ and short-run elasticity γ. The long-run elasticity cumulates the responses over time:

lim_{j→∞} Σ_{s=0..j} ∂C̃_{i,t+j}/∂P̃_{i,t+s} = lim_{j→∞} (λ^j + λ^(j−1) + ... + 1) γ = γ/(1 − λ).

Another example is production, where Q_it is the output of firm i at time t, N_it is labor input, K_it is capital, v_it is a serially correlated productivity term, and ε_it is an i.i.d. disturbance. The static production function with serially correlated productivity shocks can be rewritten as

log Q_it = β_1 log N_it + β_2 log N_{i,t−1} + β_3 log K_it + β_4 log K_{i,t−1}
         + β_5 log Q_{i,t−1} + λ_t + (α_i + ω_it),

subject to the restrictions β_2 = −β_1 β_5 and β_4 = −β_3 β_5. Hence the equivalence between a static (short-run) model with serially-correlated productivity shocks, and a dynamic representation of production output.

Consider now the simple dynamic model y_it = γ y_{i,t−1} + α_i + ε_it. By continuous substitution:

y_it = ε_it + γ ε_{i,t−1} + γ² ε_{i,t−2} + ... + γ^(t−1) ε_{i1} + γ^t y_{i0} + [(1 − γ^t)/(1 − γ)] α_i.
5.2.1

The Within estimator is

γ̂ = [Σ_{i=1..N} Σ_{t=1..T} (y_it − ȳ_i)(y_{i,t−1} − ȳ_{i,−1})] / [Σ_{i=1..N} Σ_{t=1..T} (y_{i,t−1} − ȳ_{i,−1})²],
α̂_i = ȳ_i − γ̂ ȳ_{i,−1},

where

ȳ_i = (1/T) Σ_t y_it,   ȳ_{i,−1} = (1/T) Σ_t y_{i,t−1},   ε̄_i = (1/T) Σ_t ε_it.

The estimator is biased because the numerator does not converge to 0:

plim_{N→∞} (1/NT) Σ_{i,t} (y_{i,t−1} − ȳ_{i,−1})(ε_it − ε̄_i) = −plim (1/N) Σ_i ȳ_{i,−1} ε̄_i ≠ 0.

We use

ȳ_{i,−1} = (1/T) Σ_{t=1..T} y_{i,t−1}
  = [(1 − γ^T)/(T(1 − γ))] y_{i0} + [(T − 1) − Tγ + γ^T]/[T(1 − γ)²] α_i
  + [(1 − γ^(T−1))/(T(1 − γ))] ε_{i1} + [(1 − γ^(T−2))/(T(1 − γ))] ε_{i2} + ... + (1/T) ε_{i,T−1}.
We have

plim_{N→∞} (1/N) Σ_{i=1..N} ȳ_{i,−1} ε̄_i
  = plim (1/N) Σ_i [(1/T) Σ_{t=1..T} ε_it] ȳ_{i,−1}
  = (σ_ε²/T²) [(T − 1) − Tγ + γ^T]/(1 − γ)².

In a similar manner, one shows that plim (1/NT) Σ_{i,t} (y_{i,t−1} − ȳ_{i,−1})² is a finite function of γ, T and σ_ε², which yields

plim_{N→∞} (γ̂ − γ)
  = −[(1 + γ)/(T − 1)] · [1 − (1/T)(1 − γ^T)/(1 − γ)]
    / { 1 − [2γ/((1 − γ)(T − 1))] [1 − (1 − γ^T)/(T(1 − γ))] }
  = O(1/T).

The bias does not vanish as N → ∞: applied to the Within-transformed equation (y_it − ȳ_i) = γ (y_{i,t−1} − ȳ_{i,−1}) + (ε_it − ε̄_i), it is of order 1/T, and it is negligible only when T is large and γ is small.
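The O(1/T) nature of the Within bias is easy to see by Monte Carlo. A minimal sketch (simulation design is an assumption chosen to illustrate the point): growing N does not remove the bias, only growing T does.

```python
import numpy as np

def within_gamma(N, T, gamma, rng, burn=50):
    """Within (LSDV) estimate of gamma in y_it = gamma*y_{i,t-1} + a_i + e_it."""
    a = rng.normal(size=N)
    y = np.zeros((N, T + burn + 1))
    for t in range(1, T + burn + 1):
        y[:, t] = gamma * y[:, t - 1] + a + rng.normal(size=N)
    y = y[:, burn:]                       # keep T + 1 observations per unit
    ylag, ycur = y[:, :-1], y[:, 1:]
    ylag_d = ylag - ylag.mean(axis=1, keepdims=True)   # within transformation
    ycur_d = ycur - ycur.mean(axis=1, keepdims=True)
    return (ylag_d * ycur_d).sum() / (ylag_d ** 2).sum()

rng = np.random.default_rng(5)
g_small_T = within_gamma(N=2000, T=5, gamma=0.5, rng=rng)    # large bias
g_large_T = within_gamma(N=200, T=200, gamma=0.5, rng=rng)   # small bias
```

With T = 5 the estimate is far below the true value 0.5 despite N = 2000, matching the asymptotic bias formula above; with T = 200 it is nearly unbiased.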
Table 5.1: Asymptotic bias of the Within estimator (N → ∞)

γ       T     Bias       Percent
0.2     6    -0.2063   -103.1693
        8    -0.1539    -76.9597
       10    -0.1226    -61.3139
       20    -0.0607    -30.3541
       40    -0.0302    -15.0913
0.5     6    -0.2756    -55.1282
        8    -0.2049    -40.9769
       10    -0.1622    -32.4421
       20    -0.0785    -15.6977
       40    -0.0384     -7.6819
0.7     6    -0.3307    -47.2392
        8    -0.2479    -35.4084
       10    -0.1966    -28.0912
       20    -0.0938    -13.3955
       40    -0.0449     -6.4114
0.9     6    -0.3939    -43.7633
        8    -0.3017    -33.5179
       10    -0.2432    -27.0248
       20    -0.1196    -13.2934
       40    -0.0563     -6.2561
5.2.2 Instrumental-variable estimation when T is fixed

First-difference the model to eliminate α_i:

(y_it − y_{i,t−1}) = γ (y_{i,t−1} − y_{i,t−2}) + (ε_it − ε_{i,t−1}).

The differenced lag is correlated with (ε_it − ε_{i,t−1}), but y_{i,t−2} is not (it depends only on past values of ε), and can serve as an instrument:

γ̂ = [Σ_{i=1..N} Σ_{t=3..T} (y_it − y_{i,t−1}) y_{i,t−2}] / [Σ_{i=1..N} Σ_{t=3..T} (y_{i,t−1} − y_{i,t−2}) y_{i,t−2}].

Conclusion: the IV estimator is consistent even though T is fixed, because the instrument is uncorrelated with the differenced disturbance.

For the augmented model y_it = γ y_{i,t−1} + x_it β + z_i δ + α_i + ε_it:

Step 1. Estimate the differenced equation

(y_it − y_{i,t−1}) = γ (y_{i,t−1} − y_{i,t−2}) + (x_it − x_{i,t−1}) β + (ε_it − ε_{i,t−1})

by IV.

Step 2. Substitute γ̂ and β̂ to form the individual means ȳ_i − γ̂ ȳ_{i,−1} − x̄_i β̂.

Step 3. Estimate

ȳ_i − γ̂ ȳ_{i,−1} − x̄_i β̂ = z_i δ + α_i + ε̄_i,   i = 1, 2, ..., N,

by OLS.
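The differenced-IV estimator above (Anderson–Hsiao type) is a single ratio of cross-products. A minimal sketch on simulated data (design values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(6)
N, T, gamma = 2000, 6, 0.5
a = rng.normal(size=N)
y = np.zeros((N, T + 1))
for t in range(1, T + 1):
    y[:, t] = gamma * y[:, t - 1] + a + rng.normal(size=N)

# Differenced equation, t = 3..T:
dy = y[:, 3:] - y[:, 2:-1]        # y_it - y_{i,t-1}
dylag = y[:, 2:-1] - y[:, 1:-2]   # y_{i,t-1} - y_{i,t-2}
z = y[:, 1:-2]                    # instrument: y_{i,t-2}

g_ah = (z * dy).sum() / (z * dylag).sum()
```

Unlike the Within estimator with T = 6 (heavily biased, cf. Table 5.1), this IV estimate is close to the true γ = 0.5 even with small T, because consistency here only requires N → ∞.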
5.3

Consider the pooled OLS estimator of γ, regressing y_it on y_{i,t−1}:

γ̂ = [Σ_{i,t} y_it y_{i,t−1}] / [Σ_{i,t} y_{i,t−1}²]
   = γ + [Σ_{i,t} (α_i + ε_it) y_{i,t−1}] / [Σ_{i,t} y_{i,t−1}²].

We show that

plim_{N→∞} (1/NT) Σ_{i=1..N} Σ_{t=1..T} (α_i + ε_it) y_{i,t−1}
  = [(1 − γ^T)/(T(1 − γ))] Cov(y_{i0}, α_i) + σ_α² [(T − 1) − Tγ + γ^T] / [T(1 − γ)²],

which is non-zero in general, and that

plim_{N→∞} (1/NT) Σ_{i,t} y_{i,t−1}²

is a function of plim (1/N) Σ_i y_{i0}², Cov(y_{i0}, α_i), σ_α² and σ_ε². As a consequence, the asymptotic behavior of the estimator depends on the assumptions made about the initial conditions y_{i0}.
5.3.2 An equivalent representation

The model can be written either for the observed y_it directly (model A, with regressors x_it and z_i), or in terms of a latent variable w_it (model B), where y_it is observed and w_it is unobserved. Assumptions (or knowledge) on the initial conditions may help to distinguish between both processes.

5.3.3

Different cases:

1/ y_{i0} fixed;
2/ y_{i0} random:
   2.a/ y_{i0} independent of α_i, with E(y_{i0}) = μ_{y0} and Var(y_{i0}) = σ_{y0}²;
   2.b/ y_{i0} correlated with α_i (stationarity assumption);
3/ w_{i0} fixed;
4/ w_{i0} random, e.g.:
   4.d/ w_{i0} random with mean μ_{i0} and arbitrary variance σ_{w0}².

See Appendix 4 for a derivation of the Maximum Likelihood estimators in each case.
5.3.4

Other cases involve restrictions on σ_α² and V_T.

5.3.5
Table 5.2: Properties of the MLE for dynamic panel data models

Case                            Parameters             N → ∞, T fixed   T → ∞, N fixed
1: y_i0 fixed                   γ, β, σ_ε²             Consistent        Consistent
                                α_i, σ_α²              Inconsistent      Consistent
2.a: y_i0 random,               γ, β, σ_ε²             Consistent        Consistent
     independent of α_i         μ_y0, σ_α², σ_y0²      Consistent        Inconsistent
2.b: y_i0 random,               γ, β, σ_ε²             Consistent        Consistent
     correlated with α_i        μ_y0, σ_α², σ_y0²      Consistent        Inconsistent
3: w_i0 fixed                   γ, β, σ_ε²             Inconsistent      Consistent
                                w_i0, σ_α²             Inconsistent      Inconsistent

Case 4.a (w_i0 random) is treated in Appendix 4.
Demand system example, with prices P_it, N_it, N_{i,t−1}, I_it, I_{i,t−1} and G_{i,t−1} as regressors; in theory the coefficient on G_{i,t−1} satisfies β_6 = 1 + r.

Coefficient        OLS            Within          GLS
β_0 (Intercept)   -3.650             —           -4.091
                  (3.316)                        (11.544)
β_1 (P_it)        -0.0451 (*)    -0.2026         -0.0879 (*)
                  (0.027)        (0.0532)        (0.0468)
β_2 (N_it)         0.0174 (*)    -0.0135         -0.00122
                  (0.0093)       (0.0215)        (0.0190)
β_3 (N_i,t-1)      0.00111 (**)   0.0327 (**)     0.00360 (**)
                  (0.00041)      (0.0046)        (0.00129)
β_4 (I_it)         0.0183 (**)    0.0131          0.0170 (**)
                  (0.0080)       (0.0084)        (0.0080)
β_5 (I_i,t-1)      0.00326        0.0044          0.00354
                  (0.00197)      (0.0101)        (0.00622)
β_6 (G_i,t-1)      1.010 (**)     0.6799 (**)     0.9546 (**)
                  (0.014)        (0.0633)        (0.0372)

Notes. N = 36, T = 11. Standard errors are in parentheses. (*) and (**): parameter significant at the 10% and 5% level respectively.
Part II
Generalized Method of Moments estimation
Chapter 6
The GMM estimator
Generalized Method of Moments: an efficient way to obtain consistent parameter estimates under mild conditions on the model. It is very popular for estimating structural economic models, as it requires much weaker conditions on the model disturbances than Maximum Likelihood. Another important advantage: it is easy to obtain parameter estimates that are robust to heteroskedasticity of unknown form.
6.1.1 Moment conditions

Let f(x_i, θ) be a q-vector of functions of the data x_i and of a p-vector of parameters θ, with true value θ_0 such that

E[f(x_i, θ_0)] = 0.

6.1.2

Consider the linear model

y_i = x_i' β + u_i,   i = 1, 2, ..., N,

where β_0 is the true parameter value and u_i is the error term. A common assumption is E(u_i | x_i) = 0, and hence

E(x_i u_i) = E[x_i (y_i − x_i' β_0)] = 0.

Note that here p = q: there are as many moment conditions as parameters to estimate. With a q-vector of instruments z_i such that E(z_i u_i) = 0, one may instead have q ≥ p.
6.1.3

A sample x_1, ..., x_N is drawn from a distribution with mean a and variance b², with true values a_0 and b_0. The relationship between the data and the parameters is

E(x_i) − a_0 = 0   and   E[x_i − E(x_i)]² − b_0² = 0.

In the notation of the definition above, θ = (a, b) and

f(x_i, θ) = ( x_i − a,  (x_i − a)² − b² ),

so that E[f(x_i, θ_0)] = 0.
6.1.4

How to estimate θ_0? Replace population moments by sample moments:

f_N(θ) = (1/N) Σ_{i=1..N} f(x_i, θ).

If E(f) is adequately approximated by f_N (population moments close to empirical moments), then θ̂_N solving f_N(θ̂_N) = 0 is a convenient estimate for θ_0:

0 = E[f(θ_0)] ≈ f_N(θ̂_N)   ⇒   θ_0 ≈ θ̂_N.

In the linear model, setting

(1/N) Σ_{i=1..N} x_i û_i = (1/N) Σ_{i=1..N} x_i (y_i − x_i' β̂_N) = 0

and solving for β̂_N yields

β̂_N = (Σ_{i=1..N} x_i x_i')⁻¹ Σ_{i=1..N} x_i y_i.
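In the just-identified linear case, solving the sample moment condition reproduces OLS exactly, as the display above shows. A minimal sketch (data-generating values are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(7)
N, p = 300, 2
X = rng.normal(size=(N, p))
y = X @ np.array([1.0, -0.5]) + rng.normal(size=N)

# Solve (1/N) sum_i x_i (y_i - x_i' b) = 0  <=>  (X'X) b = X'y
b_mm = np.linalg.solve(X.T @ X, X.T @ y)

# The sample moments hold exactly at the solution:
moments = X.T @ (y - X @ b_mm) / N
```

The fitted `b_mm` is the Method of Moments (= OLS) estimate, and `moments` is zero to machine precision.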
6.1.5

The dependent variables y_1, y_2, ..., y_N are distributed as Poisson with parameters λ_1, λ_2, ..., λ_N respectively:

Pr(y_i = r) = e^(−λ_i) λ_i^r / r!.

The λ_i's are linked to covariates through the log-linear relationship

log λ_i = β_0 + Σ_{j=1..p} β_j x_ij.

The likelihood is

L = Π_{i=1..N} e^(−λ_i) λ_i^{y_i} / y_i!
  = exp( −Σ_{i=1..N} λ_i + β_0 Σ_{i=1..N} y_i + Σ_{j=1..p} β_j Σ_{i=1..N} x_ij y_i ) / Π_{i=1..N} y_i!.

The sufficient statistics are

T_0 = Σ_{i=1..N} y_i   and   T_j = Σ_{i=1..N} x_ij y_i,   j = 1, ..., p,

and, since

∂λ_i/∂β_0 = λ_i   and   ∂λ_i/∂β_j = x_ij λ_i,

setting the derivatives of the log-likelihood to zero gives

T_0 = Σ_{i=1..N} λ̂_i   and   T_j = Σ_{i=1..N} x_ij λ̂_i,   j = 1, ..., p,

where λ̂_i = exp(β̂_0 + Σ_{j=1..p} β̂_j x_ij). Hence, we match the sample moments T_0 and T_j to the theoretical moments Σ_{i=1..N} exp(β̂_0 + Σ_j β̂_j x_ij) and Σ_{i=1..N} x_ij exp(β̂_0 + Σ_j β̂_j x_ij) respectively. We have p + 1 such matching conditions for p + 1 parameters.
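The p + 1 matching conditions above are the score equations of the Poisson regression; they can be solved by Newton iteration. A minimal sketch on simulated data (the design values 0.3 and 0.5 are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(8)
N = 400
x = rng.normal(size=N)
y = rng.poisson(np.exp(0.3 + 0.5 * x))

X = np.column_stack([np.ones(N), x])   # intercept + one covariate (p = 1)
b = np.zeros(2)
for _ in range(50):                     # Newton-Raphson on the score
    mu = np.exp(X @ b)                  # fitted lambda_i
    score = X.T @ (y - mu)              # T_j - sum_i x_ij * lambda_i
    H = X.T @ (mu[:, None] * X)         # Fisher information
    b = b + np.linalg.solve(H, score)

mu = np.exp(X @ b)
```

At convergence, X'y = X'mu: the sample moments T_0, T_j equal the fitted theoretical moments, exactly as stated above.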
6.1.6 Comments

Least Squares and Instrumental Variables can both be cast as Method of Moments estimators. For IV,

β̂ = arg min_β (Y − Xβ)' Z (Z'Z)⁻¹ Z' (Y − Xβ),

or, in the just-identified case, solving Σ_{i=1..N} z_i' (y_i − x_i β̂) = 0:

β̂ = (Σ_{i=1..N} z_i' x_i)⁻¹ Σ_{i=1..N} z_i' y_i = (Z'X)⁻¹ Z'Y.

One can also start from the likelihood score:

(1/N) Σ_{i=1..N} ∂ log L(θ)/∂θ |_{θ=θ̂} = 0.

We must ensure that we can validly replace population moments by sample moments, for the Method of Moments to work.
6.2.1 Introduction

Define the GMM criterion by

Q_N(θ) = f_N(θ)' A_N f_N(θ),

where A_N is a q × q positive-definite weighting matrix, O(1). Important note: in the just-identified case Q_N(θ̂) = 0, because f_N(θ̂) = 0; in the over-identified case, Q_N(θ̂) > 0. This fact is important for model checking (we will come to this point later in the course).
6.2.2

Consider the just-identified case, with instruments W (as many instruments as parameters) and rank(W'X) = p. Solving the moment conditions, we have

θ̂ = (W'X)⁻¹ (W'Y),

and substituting into the criterion, with P_W = W(W'W)⁻¹W',

u(θ̂)' P_W u(θ̂) = [Y − X(W'X)⁻¹(W'Y)]' P_W [Y − X(W'X)⁻¹(W'Y)]
               = Y'P_W Y − 2(W'Y)'(W'W)⁻¹(W'Y) + (W'Y)'(W'W)⁻¹(W'Y) = 0,

as expected in the just-identified case.
A denition
6.2.4
q > p instruments
E (ziui) = E (zi(yi
xi0)) = 0
N
1X
fN ( ) =
z (y
N i=1 i i
xi ) =
1 0
(Z Y
N
Z 0X ):
6.3.
99
AN =
Assume that
1 0Z
NZ
N
X
1
zi0 zi
N i=1
! 1
= N (Z 0Z ) 1:
! 1), to a
^ N = X 0Z (Z 0Z ) 1Z 0X 1 X 0Z (Z 0Z ) 1Z 0Y:
This expression is the IV formulation for the case where there are
more instruments than parameters.
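A small simulated check of the over-identified IV formula, with one endogenous regressor and two instruments (the data-generating process is illustrative, not from the notes):

```python
import numpy as np

# 2SLS: beta = [X'Z(Z'Z)^{-1}Z'X]^{-1} X'Z(Z'Z)^{-1}Z'y, with q = 2 > p = 1.
rng = np.random.default_rng(1)
N = 2000
Z = rng.normal(size=(N, 2))                 # instruments
v = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5]) + v            # regressor, endogenous through v
u = 0.8 * v + rng.normal(size=N)            # error correlated with x
y = 2.0 * x + u
X = x[:, None]

A = X.T @ Z @ np.linalg.inv(Z.T @ Z)        # X'Z(Z'Z)^{-1}
beta_2sls = np.linalg.solve(A @ Z.T @ X, A @ Z.T @ y)
```

The same estimate obtains by regressing y on the fitted values from the first-stage regression of x on Z, which is the two-stage interpretation.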
6.3.1 Consistency

Assumption set 1

(i) There exists a unique \theta_0 such that E[f(x_i,\theta_0)] = 0 (identification).
(ii) Let g_N(\theta) = E[f_N(\theta)].
(iii) Let A_N be a sequence of weighting matrices such that A_N - A \to^p 0 for some positive-definite A.
(iv) f_{Nj}(\theta_N) - g_{Nj}(\theta_N) \to^p 0 for j = 1,2,\dots,q, where \theta_N is any sequence in \Theta as N increases.

Define \bar Q_N(\theta) = g_N(\theta)'A\,g_N(\theta), such that Q_N(\theta) - \bar Q_N(\theta) \to^p 0 uniformly for \theta \in \Theta. From (i) and (ii), we have that \bar Q_N(\theta) = 0 \Leftrightarrow \theta = \theta_0, and \bar Q_N(\theta) > 0 otherwise. Therefore:

- \hat\theta_N minimizes Q_N(\theta);
- \theta_0 minimizes \bar Q_N(\theta);
- Q_N(\theta) - \bar Q_N(\theta) \to 0.

But this implies that \hat\theta_N \to^p \theta_0.
6.3.2 Asymptotic normality

Assumption set 2

(v) Function f_N(\theta) is continuously differentiable, with derivative matrix F_N(\theta) = \partial f_N(\theta)/\partial\theta'.
(vi) F_N(\theta_N) - \bar F_N \to^p 0 for any sequence \theta_N \to^p \theta_0, where \bar F_N is a sequence of q \times p full-column-rank matrices depending on \theta_0.
(vii) \sqrt{N}\,V_N^{-1/2} f_N(\theta_0) \to^d N(0, I_q), where V_N = N\,Var[f_N(\theta_0)] is a sequence of q \times q non-random, positive-definite matrices.

Then

\big[ F_N(\hat\theta_N)'A_N V_N A_N F_N(\hat\theta_N) \big]^{-1/2} F_N(\hat\theta_N)'A_N F_N(\hat\theta_N)\,\sqrt{N}(\hat\theta_N - \theta_0) \to^d N(0, I_p).

Proof sketch: we know that a first-order expansion of f_N(\hat\theta_N) around \theta_0 gives

\sqrt{N}(\hat\theta_N - \theta_0) = -\big[ F_N(\hat\theta_N)'A_N F_N(\theta_N^*) \big]^{-1} F_N(\hat\theta_N)'A_N V_N^{1/2}\,V_N^{-1/2}\sqrt{N} f_N(\theta_0),

where V_N^{-1/2}\sqrt{N} f_N(\theta_0) is N(0, I_q). Therefore E[\sqrt{N}(\hat\theta_N - \theta_0)] = 0 and Var[\sqrt{N}(\hat\theta_N - \theta_0)] = \Sigma, where

\Sigma = \big[ F_N(\hat\theta_N)'A_N F_N(\hat\theta_N) \big]^{-1} F_N(\hat\theta_N)'A_N V_N A_N F_N(\hat\theta_N) \big[ F_N(\hat\theta_N)'A_N F_N(\hat\theta_N) \big]^{-1}.

The optimal choice of A_N minimizes this sandwich:

A_N^{opt} = \arg\min_{A_N} (F_N'A_N F_N)^{-1} F_N'A_N V_N A_N F_N (F_N'A_N F_N)^{-1}.

Lemma 3. The matrix (F_N'V_N^{-1}F_N)^{-1} is the lower bound: if we select A_N = V_N^{-1}, we get \Sigma = (F_N'A_N F_N)^{-1}, and the difference with the bound is 0.

Hence the best weighting matrix for GMM is the inverse of the variance-covariance matrix of the moment conditions.
For this choice, the variance of the GMM estimator is simply

\Big[ \Big( \frac{1}{N}\frac{\partial f(x,\hat\theta_N)}{\partial\theta} \Big)' \Big( \frac{1}{N}Var\,f(x,\hat\theta_N) \Big)^{-1} \Big( \frac{1}{N}\frac{\partial f(x,\hat\theta_N)}{\partial\theta} \Big) \Big]^{-1},

and this defines the optimal GMM. But in general no condition is imposed on the distribution of the data that produces a known V_N, hence a two-step procedure:

Step 1. Choose an arbitrary weighting matrix A_N (denoted A_N^1) and compute a first-step estimate \hat\theta_N^1 using it.

Step 2. Compute \hat V_N from u(\hat\theta_N^1) and find \hat\theta_N^2 such that

\hat\theta_N^2 = \arg\min_\theta u(\theta)'Z(\hat V_N)^{-1}Z'u(\theta).
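The two-step procedure can be sketched for the linear IV case with heteroskedastic errors (simulated data, illustrative names):

```python
import numpy as np

# Two-step GMM for a linear IV model:
# Step 1: weight A1 = (Z'Z/N)^{-1}  -> 2SLS estimate.
# Step 2: Vhat = (1/N) sum_i u_i^2 z_i z_i', reweight with Vhat^{-1}.
rng = np.random.default_rng(2)
N = 3000
Z = rng.normal(size=(N, 3))
v = rng.normal(size=N)
x = Z @ np.array([1.0, 0.5, -0.5]) + v
u = (0.5 + 0.5 * Z[:, 0] ** 2) * rng.normal(size=N) + 0.7 * v  # heteroskedastic
y = 1.5 * x + u
X = x[:, None]

def gmm_step(W):
    """beta = (X'Z W Z'X)^{-1} X'Z W Z'y for weighting matrix W."""
    G = X.T @ Z @ W @ Z.T
    return np.linalg.solve(G @ X, G @ y)

b1 = gmm_step(np.linalg.inv(Z.T @ Z / N))      # step 1 (2SLS)
uhat = y - X @ b1
Vhat = (Z * uhat[:, None] ** 2).T @ Z / N      # robust moment variance
b2 = gmm_step(np.linalg.inv(Vhat))             # step 2 (efficient)
```

Both steps are consistent; the second step is asymptotically efficient within this class of estimators.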
6.5 Iterated GMM and a projection interpretation

Method 1. Iterate, successively replacing \hat\theta_N and A_N, and solve

\hat\theta_N = \arg\min_\theta Q_N(\theta) = f_N(\theta)'A_N(\theta)f_N(\theta).

In practice, construction of the variance-covariance matrix depends on the nature of the data: cross-sections, time series, or panel data (see dedicated sections below).

The first-order condition of the optimal GMM problem is

\frac{\partial Q_N(\hat\theta_N)}{\partial\theta} = F_N(\hat\theta_N)'V_N^{-1}f_N(\hat\theta_N) = 0,

where F_N(\hat\theta_N) = \partial f_N(\hat\theta_N)/\partial\theta. If \hat\theta_N satisfies the FOC above, it must also satisfy

\hat P\,V_N^{-1/2}f_N(\hat\theta_N) = 0,

where \hat M = V_N^{-1/2}F_N(\hat\theta_N) and \hat P = \hat M(\hat M'\hat M)^{-1}\hat M', so that in the limit

P\,V^{-1/2}E[f(\theta_0)] = 0, \qquad P = M(M'M)^{-1}M', \quad M = V^{-1/2}E[F_i(\theta_0)],

and F_i(\theta) = \partial f(x_i,\theta)/\partial\theta. The projection matrix P sets only p linear combinations of the q \times 1 vector E[f(x_i,\theta_0)] to 0. If M is of rank p, an expansion around \theta_0 for \hat\theta_N gives

\sqrt{N}(\hat\theta_N - \theta_0) = -(M'M)^{-1}M'V^{-1/2}\sqrt{N}f_N(\theta_0) + o_p(1).
The basic way of testing for model validity is to use the q - p over-identifying restrictions. Interpretation: we test whether the data satisfy the over-identifying restrictions. The asymptotic distribution of the sample moments is determined by the function of the data in the over-identifying restrictions:

V_N^{-1/2}\sqrt{N}f_N(\hat\theta_N) \to^d N(0, I_q - P),

because

Cov\big[ \sqrt{N}(\hat\theta_N - \theta_0),\; \sqrt{N}f_N(\hat\theta_N) \big] = -(M'M)^{-1}M'(I_q - P) = 0.

Under H_0: E[f(x_i,\theta_0)] = 0,

J_N = N\,Q_N(\hat\theta_N) \to^d \chi^2(q - p),

because J_N is asymptotically equivalent to

z_q'(I_q - P)'(I_q - P)z_q = z_q'(I - P)z_q, \qquad z_q \sim N(0, I_q),

and I - P is idempotent with rank q - p.
6.6 Conditional moment restrictions

Based on Newey (1993), "Efficient estimation of models with conditional moment restrictions."

6.6.1 Optimal instruments

Conditional moment restrictions E[\rho(z,\theta_0)|x] = 0 imply unconditional restrictions

E[A(x)\rho(z,\theta_0)] = 0,

where x is a vector of conditioning variables, A(x) is an r \times s matrix of functions of x, and \theta_0 the true value of the parameters. Focus of the analysis here: choose A(x) to minimize the asymptotic variance of the GMM estimator. Let

D(x) = E\Big[ \frac{\partial\rho(z,\theta_0)}{\partial\theta} \Big| x \Big], \qquad \Omega(x) = Var[\rho(z,\theta_0)|x].

The optimal choice is

B(x) = C \cdot D(x)'\Omega(x)^{-1},

where C is any nonsingular matrix, and the resulting asymptotic variance bound is

\Lambda = \big( E[D(x)'\Omega(x)^{-1}D(x)] \big)^{-1}.

Example: linear model with heteroskedasticity. In the model y = x'\theta_0 + \varepsilon, E(\varepsilon|x) = 0, we have D(x) = x' and \Omega(x) = E(\varepsilon^2|x). Analogy with the weighted linear model: \Omega(x)^{-1} corrects for heteroskedasticity, the derivatives \partial\rho(z,\theta_0)/\partial\theta correspond to regressors, and the matrix D(x) is a function of x closely correlated with those derivatives. A comparison of the asymptotic variances obtained with alternative instrument choices shows that B(x) attains the smallest asymptotic variance in this class.
6.6.2 Feasible optimal instruments

Suppose D(x) = D(x,\gamma_0) and \Omega(x) = \Omega(x,\gamma_0), where the functions D(\cdot) and \Omega(\cdot) are known and \gamma is a real vector. Because D(x) and \Omega(x) depend on x through these known functions, we could estimate \gamma_0 by running a linear regression of \partial\rho(z,\hat\theta)/\partial\theta and \rho(z,\hat\theta)\rho(z,\hat\theta)' on x. This gives \hat B(x) = D(x,\hat\gamma)'\Omega(x,\hat\gamma)^{-1}, and the resulting GMM estimator would be

\hat\theta = \arg\min_\theta \Big\{ \Big[ \sum_{i=1}^n \hat B(x_i)\rho(z_i,\theta) \Big]' \Big[ \sum_{i=1}^n \hat B(x_i)\hat B(x_i)' \Big]^{-1} \Big[ \sum_{i=1}^n \hat B(x_i)\rho(z_i,\theta) \Big] \Big\}.

This estimator does not attain the bound if D(x,\gamma) and \Omega(x,\gamma) are misspecified.
Example. Take

\rho(z,\theta) = \big( y - f(x,\theta),\; [y - f(x,\theta)]^2 - h(x,\theta,\tau) \big)',

where h(\cdot) is known. Then

D(x) = D(x,\gamma_0), \quad D(x,\gamma) = \begin{pmatrix} -\partial f(x,\theta)/\partial\theta' & 0 \\ -\partial h(x,\theta,\tau)/\partial\theta' & -\partial h(x,\theta,\tau)/\partial\tau' \end{pmatrix}, \quad B(x) = D(x)'\Omega(x)^{-1}.

Empirical issue: when does incorporating the additional moment condition yield a more efficient estimator?

The asymptotic variance of the heteroskedasticity-corrected least squares estimator is

\Big( E\Big[ \frac{1}{E(\varepsilon^2|x)} \frac{\partial f(x,\theta_0)}{\partial\theta} \frac{\partial f(x,\theta_0)}{\partial\theta}' \Big] \Big)^{-1}.

It coincides with the conditional moment bound E[D(x)'\Omega(x)^{-1}D(x)]^{-1} when E(\varepsilon^3|x) = 0, or when h(x,\theta_0,\tau_0) = h(x,\tau_0). Otherwise, the asymptotic variance of the heteroskedasticity-corrected least squares estimator will be larger than the conditional moment bound.

Corollary: no efficiency gain obtains when h(x,\theta,\tau) and \Omega(x) do not depend on x or \theta.
Feasible estimation needs a specification of \Omega(x); under E(\varepsilon^3|x) = 0, a diagonal specification can be used, built from h(x,\hat\theta,\hat\tau) and \hat\kappa \cdot h(x,\hat\theta,\hat\tau)^2 (with \hat\kappa an estimated kurtosis-type constant). The estimated optimal instruments are then

\hat D(x) = D(x,\hat\gamma), \qquad \hat B(x) = \hat D(x)'\hat\Omega(x)^{-1}.

6.6.3 Nonparametric estimation of optimal instruments

Advantage: avoid misspecification in D(x,\gamma_0) and \Omega(x,\gamma_0), by estimating both as functions of x with a nearest-neighbor (NN) estimator.
113
xl denote a measure of scale of lth component of x (standard deviation). x being of rank r , dene
Let
jjxi
xj jjn =
r
X
(xil
l=1
^ l
xjl )2
)1=2
K; K n, and
8
<
:
Integer
vation
i.
!kK 0
!kK = 0
x.
i and j , account-
1 k K;
for
k > K;
PK
k=1 !kK = 1:
for
and
j 6= i according to distance
th
above. Then assign the weight Wij = !jK to observation with j
smallest distance jjxi
xj jjn.
Let
Wii = 0
!kK = 1=K; k K .
To compute conditional expectation of y given x:
Select the set of the K (out of n) xi's closest to point x;
Compute the mean of the yi values corresponding to the xi's
Example: uniform weights
chosen above:
K
1X
E (yjx) =
!kK yk (x) =
yk (x);
K k=1
k=1
K
X
where
yk(x)
measure dened above (y1
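The uniform-weight nearest-neighbor rule is a few lines of code; a sketch on simulated data (all names illustrative):

```python
import numpy as np

# k-NN estimate of E(y|x) with uniform weights 1/K and the
# scale-standardized distance used in the notes.
rng = np.random.default_rng(4)
n, K = 400, 25
x = rng.uniform(-2, 2, size=(n, 2))
y = np.sin(x[:, 0]) + 0.1 * rng.normal(size=n)

def knn_mean(x0, x, y, K):
    scale = x.std(axis=0)                       # sigma_l for each component
    d = np.sqrt((((x - x0) / scale) ** 2).sum(axis=1))
    idx = np.argsort(d)[:K]                     # K closest observations
    return y[idx].mean()                        # uniform weights 1/K

m0 = knn_mean(np.array([0.5, 0.0]), x, y, K)
```

At the point (0.5, 0), the estimate should be close to sin(0.5), the true conditional mean.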
Other possibilities:

\hat E(y|x) = \sum_{j=1}^n \omega_j\,y_j(x),

with triangular weights

\omega_j^T = \begin{cases} 2(K - j + 1)/[K(K+1)] & \text{for } j \le K, \\ 0 & \text{for } j > K, \end{cases}

or quadratic weights

\omega_j^Q = \begin{cases} 6[K^2 - (j-1)^2]/[K(K+1)(4K-1)] & \text{for } j \le K, \\ 0 & \text{for } j > K. \end{cases}

The conditional variance at x_i is estimated by the same weighting procedure applied to the outer products of residuals, yielding \hat\Omega(x_i), and D(x) is accordingly estimated by

\hat D(x_i) = \sum_{j=1}^n W_{ij}\,\frac{\partial\rho(z_j,\hat\theta)}{\partial\theta},

possibly combined with a parametric component through terms of the form \sum_{j=1}^n W_{ij}\big[ \partial\rho(z_j,\hat\theta)/\partial\theta - D(x_j,\hat\gamma) \big]. The feasible optimal instruments and variance bound are then

\hat B(x_i) = \hat D(x_i)'\hat\Omega(x_i)^{-1}, \qquad \hat\Lambda = \Big[ \frac{1}{n}\sum_{i=1}^n \hat D(x_i)'\hat\Omega(x_i)^{-1}\hat D(x_i) \Big]^{-1}.

6.6.4 Kernel estimation
For a random variable X with marginal density f_1(x), the conditional mean of Y given X = x is E(Y|X = x) = m(x), with

m(x) = \int_{-\infty}^{\infty} y\,\frac{f(y,x)}{f_1(x)}\,dy,

where f(y,x) is the joint density. The density function is

f(x) = \frac{d}{dx}F(x) = \lim_{h\to 0}\frac{F(x + h/2) - F(x - h/2)}{h}.

The probability above is then estimated by the proportion of observations falling in the interval (x - h/2,\; x + h/2):

\hat f(x) = \frac{1}{nh}\big[ \text{number of } x_1,\dots,x_n \text{ in } (x - h/2, x + h/2) \big] = \frac{1}{nh}\sum_{i=1}^n 1\!I\Big( \frac{x_i - x}{h} \in (-1/2, 1/2) \Big).

All x_i's in an interval around x receive the same weight. To obtain a smoother set of weights, one can replace the indicator function by a positive kernel function denoted K(\cdot). The kernel density estimator is

\hat f(x) = \frac{1}{nh}\sum_{i=1}^n K\Big( \frac{x_i - x}{h} \Big) = \frac{1}{nh}\sum_{i=1}^n K(\psi_i),

where the kernel function has the following properties:

\int_{-\infty}^{\infty} K(\psi)\,d\psi = 1, \qquad K(-1) = K(1) = 0.

For a joint density with z = (y,x) of dimension q + 1,

\hat f(y,x) = \hat f(z) = \frac{1}{nh^{q+1}}\sum_{i=1}^n K_1\Big( \frac{z_i - z}{h} \Big),

where z is a fixed point and h is the bandwidth.
(A1) Observations x_1,\dots,x_n are i.i.d.
(A2) Kernel K satisfies
 (i) \int K(\psi)\,d\psi = 1,
 (ii) \int \psi^2 K(\psi)\,d\psi = \mu_2 \ne 0,
 (iii) \int K^2(\psi)\,d\psi < \infty.
(A3) f is twice continuously differentiable at x.
(A4) h = h_n \to 0 as n \to \infty.
(A5) nh_n \to \infty as n \to \infty.

Bias and variance of \hat f:

Bias[\hat f(x)] = \frac{h^2}{2}\,\mu_2\,f''(x), \qquad var[\hat f(x)] = \frac{1}{nh}\,f(x)\int K^2(\psi)\,d\psi.

Strategy for choosing h: minimize the Mean Integrated Squared Error (MISE), E\int [\hat f(x) - f(x)]^2\,dx, or its asymptotic version

AMISE = \frac{1}{4}\gamma_1 h^4 + \gamma_2 (nh)^{-1}, \qquad \gamma_1 = \mu_2^2 \int [f''(x)]^2\,dx, \quad \gamma_2 = \int K^2(\psi)\,d\psi.

Since the bias increases and the variance decreases with h, the optimal rate is h \propto n^{-1/5}.
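A compact implementation using a Gaussian kernel and a bandwidth of order n^{-1/5} (the 1.06 constant is Silverman's rule of thumb, an illustrative choice):

```python
import numpy as np

# Kernel density estimate with Gaussian kernel and h ~ n^{-1/5}.
rng = np.random.default_rng(5)
x = rng.normal(size=1000)
h = 1.06 * x.std() * len(x) ** (-1 / 5)

def kde(points, x, h):
    z = (x[None, :] - points[:, None]) / h          # (x_i - x) / h
    k = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)  # Gaussian kernel
    return k.mean(axis=1) / h                       # (1/nh) sum K(.)

grid = np.linspace(-4, 4, 201)
fhat = kde(grid, x, h)
```

The estimated density should integrate to roughly one and peak near the standard-normal mode 1/sqrt(2*pi) at zero.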
The kernel regression (Nadaraya-Watson) estimator is

\hat m(x) = \frac{(nh^p)^{-1}\sum_{i=1}^n \int_{-\infty}^{\infty} y\,K_1\big( \frac{y_i - y}{h}, \frac{x_i - x}{h} \big)\,dy}{(nh^q)^{-1}\sum_{i=1}^n K\big( \frac{x_i - x}{h} \big)},

where K(\cdot) and K_1(\cdot,\cdot) are q-variate and p-variate kernels respectively, and p = q + 1 (recall x has rank q). Define \psi_i = h^{-1}(y_i - y), so that y = y_i - h\psi_i. The numerator above becomes

(nh^p)^{-1}\sum_{i=1}^n \int (y_i - h\psi)\,K_1\Big( \psi, \frac{x_i - x}{h} \Big)\,h\,d\psi
= \frac{1}{n}\sum_{i=1}^n y_i\,h^{-q}\int K_1\Big( \psi, \frac{x_i - x}{h} \Big)\,d\psi - \frac{1}{n}\sum_{i=1}^n h^{-q+1}\int \psi\,K_1\Big( \psi, \frac{x_i - x}{h} \Big)\,d\psi,

and since the last term is zero for symmetric kernels, we finally have

= \frac{1}{n}\sum_{i=1}^n y_i\,h^{-q} K\Big( \frac{x_i - x}{h} \Big).

Hence

\hat m(x) = \Big[ \sum_{i=1}^n K\Big( \frac{x_i - x}{h} \Big) \Big]^{-1} \sum_{i=1}^n K\Big( \frac{x_i - x}{h} \Big)\,y_i.

A general form covers both kernel and NN estimators:

\hat m(x) = \hat E(Y|X = x) = \sum_{i=1}^n \omega_{is}(x)\,y_i, \qquad \omega_{is}(x) = \frac{K\big( \frac{x_i - x}{d} \big)}{\sum_{i=1}^n K\big( \frac{x_i - x}{d} \big)},

where d is the distance between x and its K-th nearest neighbor. For NN estimation, the equivalent smoothing parameter is K = n h^{4/(4+q)}.
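The final Nadaraya-Watson formula is a weighted average of the y_i; a one-dimensional sketch with a Gaussian kernel (bandwidth and data are illustrative):

```python
import numpy as np

# Nadaraya-Watson: mhat(x) = sum_i K((x_i - x)/h) y_i / sum_i K((x_i - x)/h)
rng = np.random.default_rng(6)
n = 800
x = rng.uniform(-2, 2, size=n)
y = np.cos(x) + 0.1 * rng.normal(size=n)
h = 0.25                                        # illustrative bandwidth

def nw(x0, x, y, h):
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)      # Gaussian kernel weights
    return (w * y).sum() / w.sum()

m0 = nw(0.0, x, y, h)
```

At x = 0 the estimate should be close to cos(0) = 1, up to smoothing bias of order h^2.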
Chapter 7
GMM estimators for time series
models
7.1 GMM and Euler equation models
Lucas critique (1976): evaluations based on traditional dynamic simultaneous-equation models are flawed because parameters are assumed invariant across different policy regimes. An unchanged marginal response to a change in policy instruments is not to be expected from rational agents who take policy changes into account in their decision making.

Standard estimation procedures (MLE) are computationally burdensome when one introduces taste and technology parameters.
7.1.1 The consumption-based asset-pricing model

A representative agent maximizes expected discounted utility

\max E_0\Big[ \sum_{t=0}^{\infty} \beta^t U(C_t) \Big],

subject to the budget constraint

C_t + P_t Q_t \le R_t Q_{t-1} + W_t,

where C_t is consumption, P_t the asset price, Q_t asset holdings, R_t the gross payoff and W_t labor income. First-order condition:

E_t\Big[ \beta\,\frac{R_{t+1}}{P_t}\,\frac{U'(C_{t+1})}{U'(C_t)} \Big] - 1 = 0,

where U'(\cdot) = \partial U/\partial C. With the isoelastic utility function

U(C_t) = \frac{C_t^\alpha}{\alpha}, \qquad \alpha < 1,

we obtain

E_t\Big[ \beta\,\frac{R_{t+1}}{P_t}\Big( \frac{C_{t+1}}{C_t} \Big)^{\alpha - 1} \Big] - 1 = 0. \qquad (7.1)
7.1.2 GMM estimation

Define LW_{1,t+1} = \log(R_{t+1}/P_t) and LW_{2,t+1} = \log(C_{t+1}/C_t). The Euler equation implies, given the information set \Omega_t at time t,

E\Big[ \beta\,\frac{R_{t+1}}{P_t}\Big( \frac{C_{t+1}}{C_t} \Big)^{\alpha - 1} - 1 \,\Big|\, \Omega_t \Big] = 0.

If y_{t+1} \notin \Omega_t but z_t \in \Omega_t, then E_t(y_{t+1}z_t) = [E_t(y_{t+1})]z_t. If E_t(y_{t+1}) = 0, by the Law of Iterated Expectations we have E(y_{t+1}z_t) = 0, and the Euler equation implies

E[\varepsilon_{t+1}(\alpha,\beta)\,z_t] = 0, \qquad \varepsilon_{t+1} = \beta\,\frac{R_{t+1}}{P_t}\Big( \frac{C_{t+1}}{C_t} \Big)^{\alpha - 1} - 1,

where z_t is any variable in the information set at time t, e.g., C_{t-i}, R_{t-i}, P_{t-i}, i \ge 0.
7.2 GMM estimation of MA models

Consider the MA(1) process

y_t = \varepsilon_t + \theta_0\varepsilon_{t-1}, \qquad (7.2)

where \varepsilon_t is an i.i.d. sequence.

7.2.1 A simple estimator

The first-order autocorrelation is

\rho_0 = \frac{E(y_t y_{t-1})}{E(y_t^2)} = \frac{\theta_0}{1 + \theta_0^2}.

Replacing \rho_0 by the sample estimator

\hat\rho_T = \frac{\sum_{t=2}^T y_t y_{t-1}}{\sum_{t=2}^T y_t^2},

we obtain an estimator \hat\theta_T by solving

\hat\rho_T\hat\theta_T^2 - \hat\theta_T + \hat\rho_T = 0.

When |\hat\rho_T| \le 0.5, one may define the invertible root

\tilde\theta_T = \frac{1 - \sqrt{1 - 4\hat\rho_T^2}}{2\hat\rho_T}.

The pair (\theta_0, \sigma_0^2) can also be estimated jointly from the moments E(y_t^2) = \sigma_0^2(1 + \theta_0^2) and E(y_t y_{t-1}) = \sigma_0^2\theta_0, defining f_T(\theta) = \frac{1}{T}\sum_{t} f(y_t,\theta); solving f_T(\hat\theta_T) = 0 yields \hat\theta_T = (\tilde\theta_T, \tilde\sigma_T^2).

Theorem 4. Estimators \hat\theta_T and \tilde\theta_T are consistent and asymptotically normal with distribution

\sqrt{T}(\hat\theta_T - \theta_0) \rightsquigarrow N(0, \Sigma), \qquad \Sigma = \frac{1}{(1 - \theta_0^2)^2}\,[\,\cdots\,],

where the bracketed term depends on \theta_0, \sigma_0^2 and \kappa_4, the fourth-order cumulant of \varepsilon_t.

Under the normality assumption, the asymptotic variance of the MLE of \theta_0 is (1 - \theta_0^2), which is smaller in general.
7.2.2 Estimation from the AR representation (Durbin, 1959)

The MA(1) defined by (7.2) is invertible, therefore it admits an AR(\infty) representation:

y_t = \sum_{j=1}^{\infty} \pi_j(\theta_0)\,y_{t-j} + \varepsilon_t, \qquad \pi_j(\theta) = -(-\theta)^j, \; j = 1,2,\dots

Truncating at lag K gives

y_t = \sum_{j=1}^K \pi_j(\theta_0)\,y_{t-j} + \varepsilon_{Kt}, \qquad (7.3)

where

\varepsilon_{Kt} = \varepsilon_t + \sum_{j=K+1}^{\infty} \pi_j(\theta_0)\,y_{t-j}.

Define the K-vector

A_K(\theta) = \big( \pi_1(\theta), \dots, \pi_K(\theta) \big)',

and let \hat A_K denote the K-vector of OLS estimators (\hat\pi_1,\dots,\hat\pi_K) in (7.3). For any given K, we define

\hat\theta_T = \arg\min_{\theta\in\Theta}\,\big[ \hat A_K - A_K(\theta) \big]' V_{TK} \big[ \hat A_K - A_K(\theta) \big],

where \Theta = (-1,+1) and V_{TK} is a K \times K weighting matrix.

7.2.3 A linearized estimator

We can write

\pi_j(\theta) = -\theta\,\pi_{j-1}(\theta), \qquad j = 1,2,\dots, \qquad (7.4)

with \pi_0(\theta) = -1. Regressing \hat\pi_j on -\hat\pi_{j-1} gives Durbin's estimator

\hat\theta_D = -\frac{\sum_{j=1}^K \hat\pi_j\hat\pi_{j-1}}{\sum_{j=1}^K \hat\pi_j^2}, \qquad \hat\pi_0 = -1,

which corresponds to the weighting matrix

V_{TK} = B_K(\theta)'B_K(\theta), \qquad B_K(\theta) = I_K + \theta L_K,

where L_K is the K \times K lag (shift) matrix.
7.3.1 The model

The model is

y_t = \gamma_0 y_{t-1} + u_t, \qquad u_t = \varepsilon_t + \theta_0\varepsilon_{t-1}, \qquad (7.5)

where we assume \gamma_0 + \theta_0 \ne 0. By continuous substitution,

y_t = \sum_{j=0}^{\infty} \gamma_0^j u_{t-j}. \qquad (7.6)

7.3.2 IV estimation

Since u_t is correlated with y_{t-1}, use y_{t-2} as an instrument: Ef(y_t,\gamma_0) = 0, where

f(y_t,\gamma) = (y_t - \gamma y_{t-1})\,y_{t-2},

so that

f_T(\gamma) = \frac{1}{T}\sum_{t=3}^T (y_t - \gamma y_{t-1})\,y_{t-2},

and solving f_T(\hat\gamma_T) = 0 gives

\hat\gamma_T = \Big( \sum_{t=3}^T y_{t-2}y_{t-1} \Big)^{-1} \sum_{t=3}^T y_{t-2}y_t.

Theorem 5.

\sqrt{T}(\hat\gamma_T - \gamma_0) \rightsquigarrow N\Big( 0,\; \frac{(1 + \gamma_0\theta_0)^2(1 - \gamma_0^2)}{(\gamma_0 + \theta_0)^2} \Big).
This variance exceeds that of the MLE: the MLE is more efficient than GMM, especially for large values of \gamma_0 and \theta_0.

Alternative instruments y_{t-j}, j = 2,3,\dots, can be used, yielding

\hat\gamma_{Tj} = \Big( \sum_{t=j+1}^T y_{t-j}y_{t-1} \Big)^{-1} \sum_{t=j+1}^T y_{t-j}y_t, \qquad j \ge 2,

but the correlation between y_{t-j} and y_{t-1} decreases (rapidly) with j. Since u_t is uncorrelated with y_{t-j} for all j \ge 2, we can stack the q-vector of conditions

E(u_t y_{t-j}) = 0, \qquad j \ge 2.

Let Y_{q,t-2} = (y_{t-2},\dots,y_{t-q-1})'. The GMM estimator is

\hat\gamma_{Tq} = \Big( \sum_{t=q+2}^T y_{t-1}Y_{q,t-2}'\,A_{Tq} \sum_{t=q+2}^T Y_{q,t-2}y_{t-1} \Big)^{-1} \sum_{t=q+2}^T y_{t-1}Y_{q,t-2}'\,A_{Tq} \sum_{t=q+2}^T Y_{q,t-2}y_t,

where A_{Tq} is a positive-definite q \times q weighting matrix. The asymptotic distribution of \hat\gamma_{Tq} is

\sqrt{T}(\hat\gamma_{Tq} - \gamma_0) \to^d N\big( 0,\; \sigma_\varepsilon^2 (R_q'A_qR_q)^{-1}R_q'A_qV_qA_qR_q(R_q'A_qR_q)^{-1} \big),

with the j-th element of R_q given by

\sigma_\varepsilon^2\,\gamma_0^{j-1}\,\frac{(1 + \gamma_0\theta_0)(\gamma_0 + \theta_0)}{1 - \gamma_0^2}.

The optimal choice for the weighting matrix being A_{Tq} = V_q^{-1}, we have

\sqrt{T}(\hat\gamma_{Tq} - \gamma_0) \to^d N\big( 0,\; \sigma_\varepsilon^2 (R_q'V_q^{-1}R_q)^{-1} \big).
7.4 Estimating the variance of sample moments

Given moment conditions E[f(x_t,\theta_0)] = 0, the variance to be estimated is

V_T = T\,var[f_T(\theta_0)] = \frac{1}{T}\sum_{t=1}^T\sum_{s=1}^T E[f(x_t,\theta_0)f(x_s,\theta_0)'].

This is the average of autocovariances for the process f(x_t,\theta_0). Let f_t = f(x_t,\theta_0) and rewrite V_T as a general autocovariance function:

V_T = \sum_{j=-(T-1)}^{T-1} \Gamma_T(j), \qquad \Gamma_T(j) = \frac{1}{T}\sum_t E(f_t f_{t-j}').
7.4.1 Serially uncorrelated, homoskedastic errors

Assume y_t = x_t'\beta + u_t, with E(f_t) = E(x_tu_t) = 0 and E(u_t|u_{t-1},x_t,u_{t-2},x_{t-1},\dots) = 0, so that f_t is serially uncorrelated. Under conditional homoskedasticity we have

V_T = \Gamma_T(0) = \frac{1}{T}\sum_{t=1}^T E(x_tu_tu_tx_t') = \sigma_u^2\,E(x_tx_t'),

the standard OLS variance-covariance matrix. The estimator of V_T is

\hat V_T = \frac{\hat\sigma_u^2}{T}\sum_{t=1}^T x_tx_t', \qquad \hat\sigma_u^2 = \frac{1}{T}\sum_{t=1}^T \hat u_t^2, \quad \hat u_t = y_t - x_t'\hat\beta.

7.4.2 Heteroskedastic errors

Under heteroskedasticity,

V_T = \Gamma_T(0) = \frac{1}{T}\sum_{t=1}^T E(x_tu_tu_tx_t'),

estimated by

\hat V_T = \frac{1}{T}\sum_{t=1}^T x_t\hat u_t\hat u_tx_t'.

This is White's heteroskedasticity-consistent estimator.
In a typical IV setup, where f_t = w_t(y_t - x_t'\beta) and the w_t are instruments,

V_T = \frac{1}{T}\sum_{t=1}^T E(u_t^2)\,w_tw_t',

and the asymptotic covariance matrix of the IV estimator takes the sandwich form

\Big( \frac{X'W}{T}A\frac{W'X}{T} \Big)^{-1} \frac{X'W}{T}A\,\hat V_T\,A\frac{W'X}{T} \Big( \frac{X'W}{T}A\frac{W'X}{T} \Big)^{-1},

where A is the weighting matrix used.
7.4.3
Assume
VT =
m
X
j= m
T (j ):
V^T =
^ T (j ) =
m
X
^ T (j );
where
j= m
(
P
(1=T ) Tt=j +1 xtu^tx0t j u^t j ;
P
(1=T ) Tt= (j 1) xt+j u^t+j x0tu^t;
j 0;
j < 0:
134
V^MM
V^MM =
^ T (j ) =
where
T 1
X
j = (T
1)
^ T (j );
where
P
(1=T ) Tt=j +1 f^tf^t0 j ; j 0;
P
(1=T ) Tt= (j 1) f^t+j f^t0; j < 0;
But:
Although V^MM may be asymptotically unbiased, it is not consistent in the mean squared error sense;
^ T (j )
j, T + 1 j T 1 ?
Suppose j = T
2; then
^ T (j ) tends to 0 as T
arbitrary
7.4.4
!1!
! 1.
This is the
mixing property.
Definition 2. In a truncated sum, autocovariances beyond lag p are eliminated. Since \Gamma_T(-j) = \Gamma_T(j)', we consider

\hat V_T = \hat\Gamma_T(0) + \sum_{j=1}^p \big[ \hat\Gamma_T(j) + \hat\Gamma_T(j)' \big]. \qquad (7.7)

This truncated estimator may fail to be positive semi-definite in finite samples. The Newey-West estimator multiplies \hat\Gamma_T(j) by Bartlett weights:

\hat V_T = \hat\Gamma_T(0) + \sum_{j=1}^p \Big( 1 - \frac{j}{p+1} \Big) \big[ \hat\Gamma_T(j) + \hat\Gamma_T(j)' \big],

linearly downweighting the autocovariances from \hat\Gamma_T(0) down to 0.
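The Bartlett-weighted sum is direct to code; a sketch on a simulated MA(1) moment series (illustrative data), with the positive semi-definiteness of the result checked numerically:

```python
import numpy as np

# Newey-West: V = Gamma(0) + sum_{j=1}^p (1 - j/(p+1)) [Gamma(j) + Gamma(j)'].
rng = np.random.default_rng(7)
T, k = 500, 2
e = rng.normal(size=(T + 1, k))
f = e[1:] + 0.5 * e[:-1]          # serially correlated moment series
f = f - f.mean(axis=0)

def newey_west(f, p):
    T = f.shape[0]
    V = f.T @ f / T               # Gamma(0)
    for j in range(1, p + 1):
        G = f[j:].T @ f[:-j] / T  # Gamma(j)
        V += (1 - j / (p + 1)) * (G + G.T)
    return V

V = newey_west(f, p=4)
```

Unlike the plain truncated sum, this estimator is positive semi-definite by construction for any sample.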
7.4.5 Kernel (spectral) estimators

In general,

\hat V_T = \sum_{s=-(T-1)}^{T-1} \omega_s\,\hat\Gamma_T(s), \qquad \omega_s = k\Big( \frac{s}{m_T} \Big),

where k(\cdot) is a kernel function and m_T > 0 a bandwidth parameter. We assume

k(0) = 1; \qquad k(z) = k(-z)\;\forall z \in \mathbb{R}; \qquad \int_{-\infty}^{\infty} |k(z)|\,dz < \infty,

and k(\cdot) is continuous at 0 and "everywhere else" except at a finite number of points.

Note: when k(z) = 0 for |z| > 1, m_T reduces to p, the lag truncation parameter.

Let k_r = \lim_{z\to 0} \frac{1 - k(z)}{|z|^r} characterize the smoothness of k(\cdot) at 0. Consider finally the following measure of smoothness of the spectral density function in the neighborhood of 0:

S^{(r)} = (2\pi)^{-1}\sum_{j=-\infty}^{\infty} |j|^r\,\Gamma(j),

related to the spectral density function S(\lambda) = (2\pi)^{-1}\sum_j \Gamma(j)e^{-ij\lambda}; when r = 0 it equals the spectral density at frequency zero.

Define the asymptotic truncated Mean Squared Error:

MSE_h = E\min\big\{ |vec(\hat V_T - V_T)|,\; h \big\}.

Theorem 6. (i) If m_T^2/T \to 0, then \hat V_T - V_T \to^p 0. (ii) If in addition m_T \to \infty and B_T \to B, the normalized bias converges.

(i) establishes consistency of kernel covariance estimators for bandwidth sequences that grow at rate o(\sqrt{T}). According to the asymptotic MSE, the bias of these estimators is governed by the smoothness index r and the variance by m_T/T; the optimal bandwidth grows like m_T \propto T^{1/(2r+1)}.
Define the spectral window

W(\lambda; m_T) = \frac{1}{2\pi}\sum_{s=-(T-1)}^{T-1} \omega_s e^{-is\lambda}.

The kernel estimator can be written as

\hat V_T = 2\pi\int W(\lambda; m_T)\,\hat I_T(\lambda)\,d\lambda,

where \hat I_T(\lambda) is the periodogram and W(\cdot,\cdot) is the averaging kernel (hence the name spectral window). Spectral estimators were once computationally burdensome, before FFT (Fast Fourier Transform) algorithms became popular. Define the Fourier transform of \hat f_t as

\phi(\lambda_p) = \frac{1}{\sqrt{2\pi T}}\sum_{t=1}^T \hat f_t e^{i\lambda_p t},

with frequencies \lambda_p = \frac{2\pi p}{T}, p = 1,2,\dots,T; the estimator is then computed as

\hat V_T = \frac{2\pi}{2T - 1}\sum_{p=-(T-1)}^{T-1} \hat I_T(\lambda_p)\,W(\lambda_p; m_T).
Chapter 8
GMM estimators for dynamic
panel data
8.1 Introduction
GMM estimation was introduced as an interesting alternative to Fixed-effects, Maximum-Likelihood or GLS estimation procedures.
But its advantages are most obvious for estimating dynamic panel-data models, where simple IV procedures do not exploit all of the available information.

Two drawbacks of the simple IV approach:

a) In the IV procedure, the variance-covariance matrix is restricted;

b) Only one instrument is used (either y_{i,t-2} or y_{i,t-2} - y_{i,t-3}).
8.2 The first-difference GMM estimator

8.2.1 Model assumptions

In the first-differenced model, the orthogonality conditions are

E(y_{is}\,\Delta u_{it}) = 0, \qquad t = 2,3,\dots,T, \; s = 0,1,\dots,t-2,

where \Delta u_{it} = \Delta\varepsilon_{it} = \varepsilon_{it} - \varepsilon_{i,t-1}. This is a set of T(T-1)/2 conditions. It requires that the \varepsilon are not serially correlated, i.e., we must have E(\varepsilon_{it}\varepsilon_{i,t+s}) = 0 for s = -1, 1. Otherwise, only instruments lagged one more period remain valid:

E(y_{is}\,\Delta u_{it}) = 0, \qquad t = 3,\dots,T, \; s = 0,1,\dots,t-3,

which gives (T-1)(T-2)/2 conditions.

By continuous substitution, as seen before:

y_{it} = \varepsilon_{it} + \gamma\varepsilon_{i,t-1} + \gamma^2\varepsilon_{i,t-2} + \cdots + \gamma^{t-1}\varepsilon_{i1} + \frac{1 - \gamma^t}{1 - \gamma}\mu_i + \gamma^t y_{i0},
8.2.
143
so that
i is of the form:
yi0 0 0
6 0 yi0 yi1 0
0 0
6
60 0 0 y
i0 yi1 yi2 0
Wi = 6
0
0
0
yi;T 2
6
4
..
.
..
.
..
.
0
0
so that Wi ui =
0
ui2 yi0
B ui3 yi0
B
B ui3 yi1
B
B ui4 yi0
B
B u y
i4 i1
B
B u y
i4 i2
B
B
B
B
B
B
@
..
.
uiT yi0
..
.
uiT yi;T 2
0
and E (Wi ui ) = 0.
..
.
..
.
..
.
..
.
..
.
yi0
C
C
C
C
C
C
C
C
C=
C
C
C
C
C
C
A
B
B
B
B
B
B
B
B
B
B
B
B
B
B
B
@
(yi2
(yi3
(yi3
(yi4
(yi4
(yi4
(yiT
(yiT
..
.
..
.
yi1) yi0
yi2) yi0
yi2) yi1
yi3) yi0
yi3) yi1
yi3) yi2
..
.
yi;T 1) yi0
..
.
yi;T 1) yi;T 2
7
7
7
7
7
5
1
C
C
C
C
C
C
C
C
C
C
C
C
C
C
C
A
The initial weighting matrix for the first-step estimator involves the variance-covariance of \Delta\varepsilon (in the transformed model). If \varepsilon_{it} is homoskedastic, we have E(\Delta u_i\,\Delta u_i') = \sigma_\varepsilon^2 H, where

H = \begin{bmatrix}
2 & -1 & 0 & \cdots & 0 \\
-1 & 2 & -1 & \cdots & 0 \\
0 & -1 & 2 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & -1 \\
0 & 0 & \cdots & -1 & 2
\end{bmatrix}

is a (T-2)\times(T-2) matrix. We can use the first-step weighting matrix

A_1 = \sum_{i=1}^N W_i'HW_i,

giving

\hat\gamma_1 = \big[ \Delta y_{-1}'W A_1^{-1}W'\Delta y_{-1} \big]^{-1} \Delta y_{-1}'W A_1^{-1}W'\Delta y,

and we can compute the second-stage weighting matrix as

A_2 = \sum_{i=1}^N W_i'\,\Delta\hat u_i\Delta\hat u_i'\,W_i, \qquad \Delta\hat u_i = \Delta y_i - \hat\gamma\,\Delta y_{i,-1}.
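The tridiagonal matrix H is exactly the covariance of first differences of i.i.d. errors, which can be verified directly (a minimal sketch):

```python
import numpy as np

# H = E(D e (D e)') / sigma^2 for first-differenced i.i.d. errors:
# 2 on the diagonal, -1 on the first off-diagonals.
def ab_H(m):
    return 2 * np.eye(m) - np.eye(m, k=1) - np.eye(m, k=-1)

H = ab_H(4)

# D is the first-difference operator mapping 5 levels to 4 differences;
# cov of D e (with e i.i.d., unit variance) is D D' = H.
D = np.eye(4, 5, k=1) - np.eye(4, 5)
```

D @ D.T reproduces H, which is what justifies A_1 = sum_i W_i' H W_i as the homoskedastic first-step weight.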
8.3 Additional moment conditions

Adding the conditions

E(u_{iT}\,\Delta u_{it}) = 0, \qquad t = 2,3,\dots,T-1,

gives (T-2) extra orthogonality conditions, for a total of T(T-1)/2 + (T-2).

8.3.1 Additional assumptions

8.3.1.1 Homoskedasticity. If E(u_{it}^2) is constant over i and t, the set of conditions becomes

E(y_{is}\,\Delta u_{it}) = 0, \quad t = 2,\dots,T, \; s = 0,\dots,t-2;
E(y_{it}u_{i,t+1} - y_{i,t+1}u_{i,t+2}) = 0, \quad t = 1,\dots,T-2;
E(\bar u_i\,u_{i,t+1}) = 0, \quad t = 1,\dots,T-1,

where \bar u_i = \frac{1}{T}\sum_{t=1}^T u_{it}.

8.3.1.2 Stationarity. The set of conditions is now

E(y_{is}\,\Delta u_{it}) = 0, \quad t = 2,\dots,T, \; s = 0,\dots,t-2;
E(u_{iT}\,y_{it}) = 0, \quad t = 1,\dots,T-1;
E(u_{it}y_{it} - u_{i,t-1}y_{i,t-1}) = 0, \quad t = 2,\dots,T.
The corresponding instrument matrix \bar W_i is block-diagonal, the t-th diagonal block containing the variable entering the t-th additional condition for individual i.

Let W = (W^0, \bar W^1) stack the first-difference instruments and the additional instruments. The level equations can also be instrumented by lagged differences:

E(u_{it}\,\Delta y_{i,t-1}) = 0, \qquad t = 3,4,\dots,T,

with the addition of

E(u_{i3}\,\Delta y_{i2}) = 0.

This last condition combined with the ones above implies the Ahn-Schmidt (1995) nonlinear restrictions

E(u_{it}u_{i,t-1}) = 0, \qquad t = 3,\dots,T.
8.4 The Blundell-Bond system estimator

Write the initial condition as

y_{i0} = \frac{\mu_i}{1 - \gamma} + \varepsilon_{i0}:

the deviation of y_{i0} from \mu_i/(1-\gamma) must not be correlated with \mu_i/(1-\gamma) itself. The GMM estimator of Blundell and Bond combines the Ahn-Schmidt conditions with the level conditions E(\Delta y_{i,t-1}\,u_{it}) = 0, using the stacked instrument matrix

W_i^+ = \begin{bmatrix}
W_i & 0 & 0 & \cdots & 0 \\
0 & \Delta y_{i2} & 0 & \cdots & 0 \\
0 & 0 & \Delta y_{i3} & \cdots & 0 \\
\vdots & & & \ddots & \vdots \\
0 & 0 & 0 & \cdots & \Delta y_{i,T-1}
\end{bmatrix}.
8.5 Time-varying individual effects

8.5.1 Quasi-differencing (Holtz-Eakin et al.)

Consider y_{it} = \gamma y_{i,t-1} + \phi_t\mu_i + \varepsilon_{it}, where \phi_t is a time-varying loading; the parameters are \gamma, \beta and r_t = \phi_t/\phi_{t-1}, t = 2,3,\dots,T, and \mu_i is removed by quasi-differencing.

GMM estimation is applicable as before (Arellano-Bond, Ahn-Schmidt or Blundell-Bond), but the initial weighting matrix cannot be used anymore. Let \Delta^r\varepsilon_{it} = \varepsilon_{it} - r_t\varepsilon_{i,t-1}. We have, under homoskedasticity,

E(\Delta^r\varepsilon_i\,\Delta^r\varepsilon_i') = \sigma_\varepsilon^2\begin{bmatrix}
1 + r_1^2 & -r_2 & 0 & \cdots & 0 \\
-r_2 & 1 + r_2^2 & -r_3 & \cdots & 0 \\
0 & -r_3 & 1 + r_3^2 & \ddots & \vdots \\
\vdots & & \ddots & \ddots & -r_{T-1} \\
0 & \cdots & \cdots & -r_{T-1} & 1 + r_T^2
\end{bmatrix}.

When the r_t's are known, this matrix can serve to construct the first-step weighting matrix.
8.5.2 Mixed structure

Consider

u_{it} = \mu_i + \phi_t v_i + \varepsilon_{it}, \qquad i = 1,2,\dots,N, \; t = 1,2,\dots,T, \qquad (8.1)

where \phi_t v_i captures a second, time-varying individual component. Under the condition

\phi_t = \phi_s \quad \forall t,s = 1,2,\dots,T, \qquad (8.2)

let \phi_t = \phi\;\forall t; then u_{it} = \alpha_i + \varepsilon_{it} with \alpha_i = \mu_i + \phi v_i, where

E(u_{it}^2) = \sigma_\alpha^2 + \sigma_\varepsilon^2 \quad \text{and} \quad E(u_{it}u_{is}) = \sigma_\alpha^2 \;\text{if}\; t \ne s,

so the model reduces to the standard one-way structure (a special case also obtains with \phi_t = 1 + \phi_t^*, v_i = \mu_i).

In general, first-differencing yields

u_{it} - u_{i,t-1} = (\phi_t - \phi_{t-1})v_i + \varepsilon_{it} - \varepsilon_{i,t-1},

which still depends on v_i, so lagged levels y_{is}, s \le t-2, are not valid instruments. Quasi-differencing, u_{it} - r_t u_{i,t-1}, removes \phi_t v_i but leaves a term in \mu_i. To remove both \mu_i and \phi_t v_i, it is necessary to use a double-difference transformation:

\Delta y_{it} - \tilde r_t\Delta y_{i,t-1} = \gamma(\Delta y_{i,t-1} - \tilde r_t\Delta y_{i,t-2}) + \Delta\varepsilon_{it} - \tilde r_t\Delta\varepsilon_{i,t-1},

i = 1,2,\dots,N, \; t = 3,4,\dots,T, where

\tilde r_t = \Delta\phi_t/\Delta\phi_{t-1} = (\phi_t - \phi_{t-1})/(\phi_{t-1} - \phi_{t-2}).

GMM estimators of the double-difference model based on quasi-differencing first and then first-differencing residuals are not consistent when instruments include lagged dependent variables. We would have in that case a transformed residual

\Delta[(\varepsilon_{it} - r_t\varepsilon_{i,t-1})] - \mu_i\,\Delta r_t,

which depends on \mu_i. GMM procedures using instrument matrices from lagged dependent variables would yield consistent estimates only when the correct model transformation is performed.
8.6 Application

Variables: w_{it}, the wage rate; OCC_{it}, an occupation dummy; WKS_{it}, weeks worked. Instruments include lagged values of (WKS, OCC).
Table 8.1: First-difference GMM

Parameter   Estimate   Std. error   t-stat.
gamma        0.9465     0.0126       74.83
beta_1       0.0022     0.0022        0.98
beta_2      -0.0848     0.0423       -2.00

Table 8.2: Quasi-difference GMM

Parameter   Estimate   Std. error   t-stat.
gamma        0.9121     0.0218       41.72
beta_1       0.0150     0.0038        3.87
beta_2      -0.1014     0.1007       -1.00
r_1         -0.5838     0.3856       -1.51
r_2         -0.0871     0.0974       -0.89
r_3          0.3294     0.0621        5.29
r_4         -0.1842     0.1074       -1.71
r_5          1.0401     0.5947        1.75

Table 8.3: Double-difference GMM

Parameter   Estimate   Std. error   t-stat.
gamma        0.9211     0.0460       19.98
beta_1       0.0082     0.0014        5.79
beta_2      -0.0394     0.0322       -1.22
r~_1        -0.5272     0.2250       -2.34
r~_2        -0.1188     0.1029       -1.15
r~_3         0.2931     0.1009        2.90
r~_4        -0.0863     0.0399       -2.16
Part III
Discrete choice models
155
Chapter 9
Nonlinear panel data models
9.1 Brief review of binary discrete-choice models
Models with qualitative variables: binary choice and multinomial
models. Brief survey of these models, for cross-section data and
the binary case :
y_i^* = x_i\beta + u_i, \qquad i = 1,2,\dots,N,
y_i = 1 \;\text{if}\; y_i^* > 0, \qquad y_i = 0 \;\text{if}\; y_i^* \le 0.

With F_i = F(x_i\beta) = Prob(y_i = 1), the conditional variance of y_i is

Var(y_i) = (1 - F_i)(0 - F_i)^2 + F_i(1 - F_i)^2 = (1 - F_i)\big[ F_i^2 + F_i(1 - F_i) \big] = F_i(1 - F_i).
9.1.2 Logit model

Prob(y_i = 1) = \Lambda(x_i\beta) = \frac{\exp(x_i\beta)}{1 + \exp(x_i\beta)}, \qquad Prob(y_i = 0) = 1 - \Lambda(x_i\beta) = \frac{1}{1 + \exp(x_i\beta)}.

Density: \lambda(x_i\beta) = \frac{\exp(x_i\beta)}{[1 + \exp(x_i\beta)]^2}.

In this case, Var(u_i) = \pi^2/3.

9.1.3 Probit model

u_i is N(0,\sigma^2):

Prob(y_i = 1) = \Phi\Big( \frac{x_i\beta}{\sigma} \Big) = \int_{-\infty}^{x_i\beta/\sigma} \frac{1}{\sqrt{2\pi}}\exp\Big( -\frac{u^2}{2} \Big)\,du,

Prob(y_i = 0) = 1 - \Phi\Big( \frac{x_i\beta}{\sigma} \Big).

Density: \phi\big( \frac{x_i\beta}{\sigma} \big) = \frac{1}{\sqrt{2\pi}}\exp\big( -\frac{(x_i\beta)^2}{2\sigma^2} \big). Parameter \sigma is not separately identified (only \beta/\sigma is) and is normalized to 1.

Estimation method: Maximum Likelihood:

\hat\beta = \arg\max_\beta \prod_{i=1}^N [Prob(y_i = 1)]^{y_i}\,[Prob(y_i = 0)]^{1 - y_i}
or, equivalently for a symmetric c.d.f. F,

\hat\beta = \arg\max_\beta \prod_{i=1}^N F(q_i x_i\beta), \qquad q_i = 2y_i - 1.

Objects of interest: a) the parameters \beta; b) the marginal effects \partial Prob(y_i = 1)/\partial x_i.

9.2 Fixed-effects logit: sufficient statistics

With panel data, the likelihood depends on \beta and on the individual effects \alpha_i, i = 1,\dots,N, but \alpha_i and \beta are not independent for qualitative-choice models. When T is fixed, MLE estimates of \alpha_i are not consistent and consequently, the MLE of \beta is not consistent either. Individual effects \alpha_i are denoted incidental parameters (their number increases with N).

Solution: Neyman-Scott (1948) principle of estimation in the presence of incidental parameters. If there exists a sufficient statistic \tau_i for \alpha_i, i = 1,2,\dots,N, then

f(y_i|x_i,\tau_i,\beta) = \frac{f(y_i|x_i,\alpha_i,\beta)}{g(\tau_i|x_i,\alpha_i,\beta)}, \qquad \text{for } g(\tau_i|x_i,\alpha_i,\beta) > 0,
^ = arg max
Joint probability of
yi:
h
P rob(yi) =
exp i
N
Y
i=1
P
f (yijxi; i; ):
T
t=1 yit
P
QT
t=1 [1 + exp(xit
T
t=1 yit xit
i
+ i)]
N X
T
@ log L X
=
@
i=1 t=1
and wrt.
exp(xit + i)
+ y x = 0;
1 + exp(xit + i ) it it
i:
T
@ log L X
=
@i
t=1
T
X
t=1
yit =
exp(xit + i)
+ y = 0; i = 1; 2; : : : ; N;
1 + exp(xit + i) it
T
X
t=1
exp(xit + i)
1 + exp(xit + i )
i is: i =
PT
The probability that
t yit = s is
Hence, a sucient statistic for
exp(is)
T!
Q
s!(T s)!
[1
+
exp(
x
+
)]
it
i
t
i = 1; 2; : : : ; N:
PT
t=1 yit .
X
d2Bi
exp
T
X
t=1
! )
ditxit
9.2.2 Conditional probabilities

The conditional probability of y_i given \tau_i is

Prob(y_i|\tau_i) = \frac{\exp\big( \sum_{t=1}^T y_{it}x_{it}\beta \big)}{\sum_{d\in B_i}\exp\big( \sum_{t=1}^T d_{it}x_{it}\beta \big)},

where B_i is a set of indices for individual i:

B_i = \Big\{ d = (d_{i1},\dots,d_{iT}),\; d_{it}\in\{0,1\}: \sum_{t=1}^T d_{it} = \sum_{t=1}^T y_{it} \Big\}.

This probability no longer involves \alpha_i: it conditions the sequence of y_{it} for individual i on \sum_t y_{it}. Groups for which \sum_t y_{it} = 0 or \sum_t y_{it} = T do not contribute to the conditional likelihood. For \sum_t y_{it} = s \in\, ]0;T[, there are \binom{T}{s} = T!/[s!(T-s)!] such elements, corresponding to distinct T-sequences with value s.
9.2.3 Example: T = 2

Condition on y_{i1} + y_{i2} = 1, and let \omega_i = 1 if (y_{i1},y_{i2}) = (0,1), \omega_i = 0 if (y_{i1},y_{i2}) = (1,0). Then

Prob(\omega_i = 1) = \frac{\exp(\alpha_i + x_{i2}\beta)}{[1 + \exp(\alpha_i + x_{i1}\beta)][1 + \exp(\alpha_i + x_{i2}\beta)]},

so that

\frac{Prob(\omega_i = 1)}{Prob(\omega_i = 0) + Prob(\omega_i = 1)} = \frac{\exp(x_{i2}\beta)}{\exp(x_{i1}\beta) + \exp(x_{i2}\beta)},

which is free of \alpha_i: a standard logit in the within-pair difference x_{i2} - x_{i1}, from which the conditional log-likelihood follows.

In practice, when T > 2, we have to consider alternative sets of T observations for which \sum_t y_{it} is the same. Note that this formulation is a conditional Logit specification: regressors x depend on the alternative.
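The T = 2 case reduces to a one-parameter logistic problem in the difference dx; a simulation sketch (all data-generating choices illustrative, and a crude grid search stands in for a proper optimizer):

```python
import numpy as np

# Conditional (fixed-effects) logit, T = 2: among pairs with y_i1 + y_i2 = 1,
# P(y_i2 = 1 | sum = 1) = Lambda((x_i2 - x_i1) * beta).
rng = np.random.default_rng(8)
N = 4000
a = rng.normal(size=N)                           # individual effects
x = rng.normal(size=(N, 2))                      # x_{i1}, x_{i2}
beta = 1.0
p = 1 / (1 + np.exp(-(a[:, None] + beta * x)))
y = (rng.uniform(size=(N, 2)) < p).astype(float)

keep = y.sum(axis=1) == 1                        # informative pairs only
dx = (x[:, 1] - x[:, 0])[keep]
w = y[keep, 1]                                   # 1 if the "1" occurred at t = 2

def negloglik(b):
    z = b * dx
    return np.sum(np.log1p(np.exp(z)) - w * z)   # -sum[w z - log(1 + e^z)]

grid = np.linspace(0.0, 2.0, 201)
b_hat = grid[np.argmin([negloglik(b) for b in grid])]
```

The estimate recovers beta despite the N incidental parameters, which is the point of conditioning on the sufficient statistic.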
9.3 Random-effects probit models

Assume u_{it} = \alpha_i + \varepsilon_{it}, where \alpha_i is drawn from a normal distribution, with

Var(\alpha) = \sigma_\alpha^2, \qquad Var(\varepsilon_{it}) = 1, \qquad Corr(u_{it},u_{is}) = \rho = \frac{\sigma_\alpha^2}{1 + \sigma_\alpha^2}.

The contribution to the likelihood of unit i is L_i = Prob(y_i):

L_i = \int_{-\infty}^{\eta_{i1}x_{i1}\beta}\cdots\int_{-\infty}^{\eta_{iT}x_{iT}\beta} f(u_{i1},\dots,u_{iT})\,du_{i1}\cdots du_{iT},

where \eta_{it} = 2y_{it} - 1 signs the elements in u_i. Factoring the joint density conditionally on \alpha_i,

L_i = \int_{-\infty}^{+\infty}\Big[ \prod_{t=1}^T f(u_{it}|\alpha_i) \Big] f(\alpha_i)\,d\alpha_i,

so that

L_i(y_i) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{+\infty}\Big[ \prod_{t=1}^T \Phi\big( \eta_{it}(x_{it}\beta + \sigma_\alpha\alpha) \big) \Big] e^{-\alpha^2/2}\,d\alpha,

which is now a one-dimensional integral that can be evaluated numerically (Gauss-Hermite integration procedure).

Disadvantage of the method: it assumes a constant correlation (\rho) across periods.
9.4 Binary choice with a special regressor

Consider y_{it} = 1\!I(x_{it}\beta + \nu_{it} + \alpha_i + \varepsilon_{it} > 0), i = 1,\dots,N, t = 1,\dots,T, where \nu_{it} is a continuous "special regressor". Two cases for x:

- x is strictly exogenous: E[\varepsilon_{it}|x_{i1},\dots,x_{iT}] = 0;
- x is predetermined only: E[\varepsilon_{it}|x_{i1},\dots,x_{it}] = 0, and in this case we have to use an IV estimation strategy, e.g., fitting instruments z_i for x.

9.4.1 Assumptions

A.1. The conditional distribution of \nu_{it} is f_t(\nu_{it}|x_{it},z_i), and \nu_{it} is independent of (\alpha_i,\varepsilon_{it}) conditional on (x_{it},z_i).

A.2. The conditional distribution of \varepsilon_{it} given (\nu_{it},x_{it},z_i) does not depend on \nu_{it}.

A.3. For the 2 periods t = r and t = s, the conditional distribution of \nu_{it} given x_{it} and z_i has support [L_t, K_t] with -\infty \le L_t < 0 < K_t \le \infty, and the support of -x_{it}\beta - e_{it} is a subset of [L_t, K_t].

A.4. (ii) The moments E(\alpha_i z_i), \Sigma_{zz}, \Sigma_{x_rz} and \Sigma_{x_sz} exist; (iii) \Sigma_{zz} and (\Sigma_{x_rz} - \Sigma_{x_sz})\Sigma_{zz}^{-1}(\Sigma_{x_rz} - \Sigma_{x_sz})' are nonsingular.

Remarks: \alpha_i can be correlated with x_{it} or z_i, but (\nu_{it},\alpha_i) must be independent given (x_{it},z_i); \varepsilon_{it} is uncorrelated with the instruments z_i; according to (A.3), \nu_{it} can take on any value that -x_{it}\beta - e_{it} takes.
Theorem 7. Let e = \alpha + \varepsilon and define the transformed variable

\tilde y = \frac{y - 1\!I(\nu > 0)}{f(\nu|x,z)}.

Then

E(\tilde y|x,z) = x\beta + E(e|x,z).

Sketch of proof: write s = -x\beta - e, so that y = 1\!I(\nu > s). Note that

1\!I(\nu > s) - 1\!I(\nu > 0) = 1\!I(s \le 0)\,1\!I(0 \ge \nu > s) - 1\!I(s > 0)\,1\!I(s \ge \nu > 0).

Integrating over \nu (the density cancels with the denominator of \tilde y) and then over e,

E(\tilde y|x,z) = \int_L^K \Big[ 1\!I(s \le 0)\int_s^0 d\nu - 1\!I(s > 0)\int_0^s d\nu \Big] dF_e(e|x,z) = \int_L^K (-s)\,dF_e(e|x,z) = x\beta + E(e|x,z). \quad QED
9.4.2 The IV estimator

For t = r,s, let \Psi_t = E(z_i\tilde y_{it}) and

\beta = \big[ (\Sigma_{x_rz} - \Sigma_{x_sz})\Sigma_{zz}^{-1}(\Sigma_{x_rz} - \Sigma_{x_sz})' \big]^{-1}(\Sigma_{x_rz} - \Sigma_{x_sz})\Sigma_{zz}^{-1}(\Psi_r - \Psi_s).

Since E(\tilde y|x,z) = x\beta + E(\alpha + \varepsilon|x,z), differencing across the two periods removes the individual effect, and \beta is consistently estimated by a linear IV regression of \tilde y_{ir} - \tilde y_{is} on x_{ir} - x_{is} with instruments z_i. Let \Delta x = x_r - x_s and \Delta\tilde y = \tilde y_r - \tilde y_s; the estimator will be

\hat\beta = \big[ (\Delta x z')(z'z)^{-1}(z\Delta x') \big]^{-1}(\Delta x z')(z'z)^{-1}z\Delta\tilde y.

Lewbel and Honore show that

\sqrt{N}(\hat\beta - \beta) \rightsquigarrow N\big( 0,\; \Lambda\,Var(\hat Q_i)\,\Lambda' \big),

where \Lambda can be replaced by its sample counterpart \hat\Lambda and

\hat Q_i = (z_i\tilde y_{ir} - z_i\tilde y_{is}) - z_i(x_{ir} - x_{is})'\hat\beta.
ft .
sity of
NT
X
K +L+1 1
f^(it; wit) = NT h
= NT h
NT Z
X
1
K +L+1
uit
uj
f^(it; wit)dit
Km
it
NT
1X
j =1
Km
j wit
;
wit
wj
h
wj
dit
where
j =1
NT hK +L
Km
j =1
f^(wit) =
is the window,
2
f^(it; wit)
^
ft(itjxit; zi) = ^
:
f (wit)
9.5 Simulation-based estimation

9.5.1 The GHK simulator

Consider a likelihood contribution of the form

L = \int_{\{\eta:\,r\}} g(\varepsilon|\eta)\,f(\eta)\,d\eta, \qquad (9.1)

where \eta = (\eta_1,\eta_2,\dots,\eta_K)' and \varepsilon are a K-vector and an M-vector respectively, the integration region \{\eta: r\} is defined by the threshold vector r, and g(\cdot|\cdot) is the conditional density of \varepsilon given \eta.

Notes: in this model formulation, \varepsilon is an implicit function of parameters and observed variables.

Let \Sigma = var(\eta), and define D satisfying DD' = \Sigma (Cholesky decomposition):

D = \begin{pmatrix}
d_{11} & 0 & \cdots & 0 \\
d_{21} & d_{22} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
d_{K1} & d_{K2} & \cdots & d_{KK}
\end{pmatrix},

so that \eta = D\mu, where \mu = (\mu_1,\mu_2,\dots,\mu_K) is a standard normal variate. We have

L = \int_{\{\mu\}} g(\varepsilon|D\mu)\prod_{i=1}^K \phi(\mu_i)\,d\mu. \qquad (9.2)
The restriction set \{\eta: r\} can be written in terms of \mu as the recursive constraints \mu_i > A_i, where

A_1 = \frac{r_1}{d_{11}}, \quad A_i = \frac{1}{d_{ii}}(r_i - d_{i1}\mu_1 - \cdots - d_{i,i-1}\mu_{i-1}), \; i = 2,\dots,K,

so that

L = \int_{A_1}^{\infty}\cdots\int_{A_K}^{\infty} g(\varepsilon|D\mu)\prod_{i=1}^K \phi(\mu_i)\,d\mu, \qquad (9.3)

where \Phi(\cdot) is the standard normal cumulative density function (CDF). Transform each \mu_i into a uniform variable:

u_i = \frac{\Phi(\mu_i) - \Phi(A_i)}{1 - \Phi(A_i)}, \qquad i = 1,2,\dots,K.

For example:

u_1 = \frac{\Phi(\mu_1) - \Phi(r_1/d_{11})}{1 - \Phi(r_1/d_{11})} \;\Longleftrightarrow\; \mu_1 = \Phi^{-1}\big[ u_1\big( 1 - \Phi(r_1/d_{11}) \big) + \Phi(r_1/d_{11}) \big],

u_2 = \frac{\Phi(\mu_2) - \Phi\big( \frac{1}{d_{22}}(r_2 - d_{21}\mu_1) \big)}{1 - \Phi\big( \frac{1}{d_{22}}(r_2 - d_{21}\mu_1) \big)} \;\Longleftrightarrow\; \mu_2 = \Phi^{-1}\Big[ u_2\Big( 1 - \Phi\big( \tfrac{1}{d_{22}}(r_2 - d_{21}\mu_1) \big) \Big) + \Phi\big( \tfrac{1}{d_{22}}(r_2 - d_{21}\mu_1) \big) \Big],

where \mu_1 is defined above. For any i, we have the recursive formula:

\mu_i = \Phi^{-1}\big[ u_i\big( 1 - \Phi(A_i) \big) + \Phi(A_i) \big],
which depends on the sequence of uniform random variables (u_1,\dots,u_K). The likelihood function now involves the variables u_i, i = 1,\dots,K, and K integrals with constant bounds:

L = \int_0^1\cdots\int_0^1 \Big[ \prod_{i=1}^K \big( 1 - \Phi(A_i) \big) \Big] g(\varepsilon|D\mu)\,du_1du_2\cdots du_K.

Since the u_i's are independent uniforms, the integral can be approximated by simulation:

L_S = \frac{1}{S}\sum_{s=1}^S \Big[ \prod_{i=1}^K \big( 1 - \Phi(A_i^s) \big) \Big] g(\varepsilon|D\mu^s),

where \mu^s and A_i^s are obtained from the recursions above with draws (u_1^s,\dots,u_K^s).

Note: it is easy to generalize to a restriction set of the form a < \eta < b. We then use

\mu_i = \Phi^{-1}\big[ u_i\big( \Phi(b_i^*) - \Phi(a_i^*) \big) + \Phi(a_i^*) \big],

where a_i^* = \frac{1}{d_{ii}}(a_i - d_{i1}\mu_1 - \cdots - d_{i,i-1}\mu_{i-1}) and b_i^* is defined similarly from b_i.
9.5.2 Example

Two discrete indicators S_t and E_t are generated by latent variables y_{1t}^* and y_{2t}^* respectively:

S_t = \begin{cases} 0 & \text{if } y_{1t}^* \le 0, \\ 1 & \text{if } y_{1t}^* > 0, \end{cases}
\qquad
E_t = \begin{cases} -1 & \text{if } y_{2t}^* < -\delta, \\ 0 & \text{if } -\delta \le y_{2t}^* < +\delta, \\ 1 & \text{if } y_{2t}^* \ge +\delta. \end{cases}

Each observed pair (S,E) corresponds to a region for the errors (v_1,v_2):

S = 0, E = -1: \mu_{11} + x_1\beta_1 + v_1 < 0 and x_2\beta_2 + v_2 < -\delta;
S = 0, E = 0: x_1\beta_1 + v_1 < 0 and -\delta < x_2\beta_2 + v_2 < +\delta;
S = 0, E = 1: \mu_{12} + x_1\beta_1 + v_1 < 0 and +\delta < x_2\beta_2 + v_2;
S = 1, E = -1: \mu_{11} + x_1\beta_1 + v_1 > 0 and \gamma_2 + x_2\beta_2 + v_2 < -\delta;
S = 1, E = 0: x_1\beta_1 + v_1 > 0 and -\delta < \gamma_2 + x_2\beta_2 + v_2 < +\delta;
S = 1, E = 1: \mu_{12} + x_1\beta_1 + v_1 > 0 and +\delta < \gamma_2 + x_2\beta_2 + v_2.

These inequalities translate directly into bounds a_1 < v_1 < b_1 and a_2 < v_2 < b_2 for each (S,E) cell, to which the GHK recursion with bounds (a,b) applies. This allows for multivariate distributions for individual effects, possibly correlated across equations.
The log-likelihood of the one-way error-components model is

\log L = -\frac{NT}{2}\log(2\pi) - \frac{1}{2}\log|\Omega| - \frac{1}{2}u'\Omega^{-1}u,

where \Omega/\sigma_\varepsilon^2 = Q + \psi B and

|\Omega| = (\sigma_\varepsilon^2)^{N(T-1)}(\sigma_\varepsilon^2 + T\sigma_\mu^2)^N = (\sigma_\varepsilon^2)^{NT}\psi^N, \qquad \psi = \frac{\sigma_\varepsilon^2 + T\sigma_\mu^2}{\sigma_\varepsilon^2}.

Concentrating in \sigma_\varepsilon^2:

\log L = -\frac{NT}{2}\log(2\pi) - \frac{NT}{2}\log\Big[ d'\Big( Q + \frac{B}{\psi} \Big)d \Big] - \frac{N}{2}\log\psi,

where d = Y - X\hat\beta. Estimate of 1/\psi conditional on \beta:

\frac{1}{\psi} = \frac{d'Qd}{(T-1)\,d'Bd} = \frac{\sum_i\sum_t (d_{it} - \bar d_i)^2}{T(T-1)\sum_i (\bar d_i - \bar d)^2}.

Estimate of \beta conditional on 1/\psi:

\hat\beta = \Big[ X'\Big( Q + \frac{B}{\psi} \Big)X \Big]^{-1} X'\Big( Q + \frac{B}{\psi} \Big)Y.

Iterate between \hat\sigma_\varepsilon^2 and 1/\psi until convergence.
A2.1 The two-way error-components model

Let u_{it} = \mu_i + \lambda_t + \varepsilon_{it}, where \lambda_t is independent of \mu_i and \varepsilon_{it}. We have

E(u_{it}u_{js}) = \begin{cases} \sigma_\mu^2 + \sigma_\lambda^2 + \sigma_\varepsilon^2 & \text{if } i = j,\, t = s, \\ \sigma_\mu^2 & \text{if } i = j,\, t \ne s, \\ \sigma_\lambda^2 & \text{if } i \ne j,\, t = s, \end{cases}

so that

\Omega = \sigma_\mu^2(I_N\otimes e_Te_T') + \sigma_\lambda^2(e_Ne_N'\otimes I_T) + \sigma_\varepsilon^2(I_N\otimes I_T) = T\sigma_\mu^2\bar B_\mu + N\sigma_\lambda^2\bar B_\lambda + \sigma_\varepsilon^2 I_{NT}.
A2.2 Feasible GLS estimation

We can write \Omega = \sum_{j=1}^4 \xi_j M_j, with

\xi_1 = \sigma_\varepsilon^2, \quad \xi_2 = T\sigma_\mu^2 + \sigma_\varepsilon^2, \quad \xi_3 = N\sigma_\lambda^2 + \sigma_\varepsilon^2, \quad \xi_4 = T\sigma_\mu^2 + N\sigma_\lambda^2 + \sigma_\varepsilon^2,

and

M_1 = \Big( I_N - \frac{e_Ne_N'}{N} \Big)\otimes\Big( I_T - \frac{e_Te_T'}{T} \Big), \qquad M_2 = \Big( I_N - \frac{e_Ne_N'}{N} \Big)\otimes\frac{e_Te_T'}{T},

M_3 = \frac{e_Ne_N'}{N}\otimes\Big( I_T - \frac{e_Te_T'}{T} \Big), \qquad M_4 = \frac{e_Ne_N'}{N}\otimes\frac{e_Te_T'}{T}.

We have \Omega^r = \sum_{j=1}^4 \xi_j^r M_j, so that

\sigma_\varepsilon\,\Omega^{-1/2} = \sum_{j=1}^4 \frac{\sigma_\varepsilon}{\sqrt{\xi_j}}\,M_j,

and the typical element of Y^* = \sigma_\varepsilon\Omega^{-1/2}Y is

y_{it}^* = y_{it} - \theta_1\bar y_{i\cdot} - \theta_2\bar y_{\cdot t} + \theta_3\bar y,

with

\theta_1 = 1 - \frac{\sigma_\varepsilon}{\sqrt{\xi_2}}, \qquad \theta_2 = 1 - \frac{\sigma_\varepsilon}{\sqrt{\xi_3}}, \qquad \theta_3 = \theta_1 + \theta_2 + \frac{\sigma_\varepsilon}{\sqrt{\xi_4}} - 1.

Feasible GLS then runs OLS of Y^* on X^*.
Asymptotic distribution of the variance estimates:

\begin{pmatrix} \sqrt{NT}(\hat\sigma_\varepsilon^2 - \sigma_\varepsilon^2) \\ \sqrt{N}(\hat\xi_2 - \xi_2) \\ \sqrt{T}(\hat\xi_3 - \xi_3) \end{pmatrix} \rightsquigarrow N\left( 0,\; \begin{pmatrix} 2\sigma_\varepsilon^4 & 0 & 0 \\ 0 & 2\xi_2^2 & 0 \\ 0 & 0 & 2\xi_3^2 \end{pmatrix} \right).
Estimate of \xi_1, from the Within-transformed model:

\hat\xi_1 = \hat\sigma_\varepsilon^2 = \frac{Y'M_1Y - Y'M_1X(X'M_1X)^{-1}X'M_1Y}{\text{d.f.}}.

Estimate of \xi_2, from the model transformed by M_2:

\hat\xi_2 = \frac{Y'M_2Y - Y'M_2X(X'M_2X)^{-1}X'M_2Y}{\text{d.f.}},

and we compute \hat\sigma_\mu^2 = (1/T)(\hat\xi_2 - \hat\sigma_\varepsilon^2). Estimate of \xi_3, from the model transformed by M_3:

\hat\xi_3 = \frac{Y'M_3Y - Y'M_3X(X'M_3X)^{-1}X'M_3Y}{\text{d.f.}},

and we compute \hat\sigma_\lambda^2 = (1/N)(\hat\xi_3 - \hat\sigma_\varepsilon^2). The GLS estimator can then be written

\hat\beta_{GLS} = \Big[ \frac{X'M_1X}{\sigma_\varepsilon^2} + \frac{X'M_2X}{\xi_2} + \frac{X'M_3X}{\xi_3} \Big]^{-1}\Big[ \frac{X'M_1Y}{\sigma_\varepsilon^2} + \frac{X'M_2Y}{\xi_2} + \frac{X'M_3Y}{\xi_3} \Big],

with

Var(\hat\beta_{GLS}) = \Big[ \frac{X'M_1X}{\sigma_\varepsilon^2} + \frac{X'M_2X}{\xi_2} + \frac{X'M_3X}{\xi_3} \Big]^{-1}.
The GLS estimator is a matrix-weighted average of the Within, between-individual and between-period estimators:

\hat\beta_{GLS} = W_1\hat\beta_{Within} + W_2\hat\beta_{BI} + W_3\hat\beta_{BP},

with

W_1 = \Big[ X'M_1X + \frac{\sigma_\varepsilon^2}{\xi_2}X'M_2X + \frac{\sigma_\varepsilon^2}{\xi_3}X'M_3X \Big]^{-1}(X'M_1X),

W_2 = \Big[ X'M_1X + \frac{\sigma_\varepsilon^2}{\xi_2}X'M_2X + \frac{\sigma_\varepsilon^2}{\xi_3}X'M_3X \Big]^{-1}\frac{\sigma_\varepsilon^2}{\xi_2}(X'M_2X),

W_3 = \Big[ X'M_1X + \frac{\sigma_\varepsilon^2}{\xi_2}X'M_2X + \frac{\sigma_\varepsilon^2}{\xi_3}X'M_3X \Big]^{-1}\frac{\sigma_\varepsilon^2}{\xi_3}(X'M_3X).
A2.3 LM test for H_0: \sigma_\mu^2 = \sigma_\lambda^2 = 0

The LM statistic only requires estimation under the null:

LM = \frac{\partial\log L(\theta)}{\partial\theta}'\Big[ -E\frac{\partial^2\log L(\theta)}{\partial\theta\partial\theta'} \Big]^{-1}\frac{\partial\log L(\theta)}{\partial\theta},

where \theta = (\sigma_\mu^2, \sigma_\lambda^2, \sigma_\varepsilon^2) and

\log L(\theta) = -\frac{NT}{2}\log(2\pi) - \frac{1}{2}\log|\Omega| - \frac{1}{2}U'\Omega^{-1}U.
Gradient of the log-likelihood:

\frac{\partial\log L(\theta)}{\partial\theta_i} = -\frac{1}{2}tr\Big( \Omega^{-1}\frac{\partial\Omega}{\partial\theta_i} \Big) + \frac{1}{2}U'\Omega^{-1}\frac{\partial\Omega}{\partial\theta_i}\Omega^{-1}U, \qquad i = 1,2,3.

Because \Omega = \sigma_\mu^2(I_N\otimes e_Te_T') + \sigma_\lambda^2(e_Ne_N'\otimes I_T) + \sigma_\varepsilon^2(I_N\otimes I_T), we have

\frac{\partial\Omega}{\partial\theta_i} = \begin{cases} I_N\otimes e_Te_T' & i = 1 \; (\sigma_\mu^2), \\ e_Ne_N'\otimes I_T & i = 2 \; (\sigma_\lambda^2), \\ I_{NT} & i = 3 \; (\sigma_\varepsilon^2). \end{cases}

Hence, evaluated under H_0 (\Omega = \hat\sigma_\varepsilon^2 I_{NT}),

\frac{\partial\log L(\theta)}{\partial\theta} = -\frac{NT}{2\hat\sigma_\varepsilon^2}\begin{pmatrix} 1 - U'(I_N\otimes e_Te_T')U/U'U \\ 1 - U'(e_Ne_N'\otimes I_T)U/U'U \\ 0 \end{pmatrix},

and the information matrix under H_0 is a 3\times 3 matrix whose entries involve (N-1), (T-1), (1-N), (1-T) and (NT-1), scaled by NT/(2\hat\sigma_\varepsilon^4). Combining the two gives

LM = \frac{NT}{2(T-1)}\Big[ \frac{U'(I_N\otimes e_Te_T')U}{U'U} - 1 \Big]^2 + \frac{NT}{2(N-1)}\Big[ \frac{U'(e_Ne_N'\otimes I_T)U}{U'U} - 1 \Big]^2,

and it is distributed as a \chi^2(2) under H_0. Important note: the LM statistic only requires the OLS residuals U.
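The two quadratic forms reduce to sums of squared individual and period totals of the residuals, so the statistic is a few lines of code (a sketch on simulated residuals with strong individual effects, all names illustrative):

```python
import numpy as np

# Breusch-Pagan-type LM statistic for H0: sigma_mu^2 = sigma_lambda^2 = 0,
# computed from (here simulated) OLS residuals only.
rng = np.random.default_rng(10)
N, T = 100, 8
mu = rng.normal(size=N)                               # individual effects (H1)
u = (mu[:, None] + rng.normal(size=(N, T))).ravel()

U = u - u.mean()
Um = U.reshape(N, T)
s1 = (Um.sum(axis=1) ** 2).sum() / (U @ U)            # U'(I_N x e e')U / U'U
s2 = (Um.sum(axis=0) ** 2).sum() / (U @ U)            # U'(e e' x I_T)U / U'U
LM = N * T / (2 * (T - 1)) * (s1 - 1) ** 2 + N * T / (2 * (N - 1)) * (s2 - 1) ** 2
# compare LM with a chi-squared(2) critical value (5%: about 5.99)
```

With individual effects present, the first term is large and the test rejects H0 decisively.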
Appendix 3: Unbalanced panels

Consider two groups of individuals observed D_1 and D_1 + D_2 periods respectively:

\begin{pmatrix} Y_1 \\ Y_2 \end{pmatrix} = \begin{pmatrix} X_1 \\ X_2 \end{pmatrix}\beta + \begin{pmatrix} U_1 \\ U_2 \end{pmatrix},

where X_1 and X_2 are resp. D_1\times K and (D_1+D_2)\times K. The variance-covariance matrix of U is block-diagonal:

\Omega = \begin{pmatrix} \Omega_1 & 0 \\ 0 & \Omega_2 \end{pmatrix}.

Now, let T_j = \sum_{i=1}^j D_i, so that T_1 = D_1 and T_2 = D_1 + D_2. We have

\Omega_j^r = (T_j\sigma_\mu^2 + \sigma_\varepsilon^2)^r\,\frac{e_{T_j}e_{T_j}'}{T_j} + (\sigma_\varepsilon^2)^r\Big( I_{T_j} - \frac{e_{T_j}e_{T_j}'}{T_j} \Big).

If we denote w_j = T_j\sigma_\mu^2 + \sigma_\varepsilon^2, the transformation matrix for the unbalanced panel is

\sigma_\varepsilon\,\Omega_j^{-1/2} = I_{T_j} - \Big( 1 - \frac{\sigma_\varepsilon}{\sqrt{w_j}} \Big)\frac{e_{T_j}e_{T_j}'}{T_j}.

The typical element of \sigma_\varepsilon\Omega^{-1/2}Y_j is y_{jt} - \theta_j\bar y_j, where \theta_j = 1 - \sigma_\varepsilon/\sqrt{w_j} and \bar y_j = T_j^{-1}\sum_{t=1}^{T_j} y_{jt}. Then

\hat\beta_{GLS} = (X^{*\prime}X^*)^{-1}X^{*\prime}Y^*, \qquad X^* = \sigma_\varepsilon\Omega^{-1/2}X, \; Y^* = \sigma_\varepsilon\Omega^{-1/2}Y,

which generalizes to N > 2 groups because \Omega is block-diagonal:

\sigma_\varepsilon\,\Omega^{-1/2} = diag\Big( I_{T_i} - \frac{e_{T_i}e_{T_i}'}{T_i} \Big) + diag\Big( \frac{\sigma_\varepsilon}{\sqrt{w_i}}\,\frac{e_{T_i}e_{T_i}'}{T_i} \Big).
Estimates of \sigma_\varepsilon^2 and \sigma_\mu^2:

\hat\sigma_\varepsilon^2 = \frac{\hat U'Q\hat U}{\sum_i T_i - N - K},

\hat\sigma_\mu^2 = \frac{\hat U'B\hat U - \big[ N + tr\big( (X'QX)^{-1}X'BX \big) \big]\hat\sigma_\varepsilon^2 + tr\big( (X'QX)^{-1}X'(J_n/N)X \big)\hat\sigma_\varepsilon^2}{\sum_i T_i - \sum_i T_i^2/\sum_i T_i},

where J_n is a matrix of ones, of dimension (\sum_i T_i)\times(\sum_i T_i), and

B = diag\Big( \frac{e_{T_i}e_{T_i}'}{T_i} \Big)_{i=1,\dots,N}, \qquad Q = diag\Big( I_{T_i} - \frac{e_{T_i}e_{T_i}'}{T_i} \Big)_{i=1,\dots,N}.
187
L1 = (2)
NT
2
"
N
(det V )
exp
N
1X
2 i=1
u0iVT 1ui ;
where
i):
"
N
2
N
X
1
(y
2y2 i=1 i0
exp
y )2 :
0
L2b = (2)
(
exp
("2)
N (T
1)
("2 + T a)
N
2
(y2 )
0
N
2
" T
N X
X
N X
T
X
1
a
2+
u
it
2"2 i=1 t=1
2"2("2 + T a) i=1
(2)
i):
"
N
2
exp
N
X
1
(y
2y2 i=1 i0
0
t=1
u2it
y )2 ;
0
#)
188
where
y ).
a = 2 2 y2
and
Case 3:

L_3 = (2\pi)^{-NT/2}(\sigma_\varepsilon^2)^{-NT/2}\exp\Big\{ -\frac{1}{2\sigma_\varepsilon^2}\sum_{i=1}^N\sum_{t=1}^T \big[ y_{it} - \gamma y_{i,t-1} - x_{it}\beta - z_i\delta \big]^2 \Big\} \times \big( \text{marginal density of } y_{i0} \big).
Case 4.a (w_{i0} random with mean \bar w and variance \sigma_\varepsilon^2/(1-\gamma^2)):

L_{4a} = (2\pi)^{-N(T+1)/2}|\Sigma_{T+1}|^{-N/2}\exp\Big[ -\frac{1}{2}\sum_{i=1}^N v_i'\Sigma_{T+1}^{-1}v_i \Big],

where v_i is the (T+1)-vector

v_i = \big( y_{i0} - \bar w,\; y_{i1} - \gamma y_{i0} - x_{i1}\beta - z_i\delta,\; \dots,\; y_{iT} - \gamma y_{i,T-1} - x_{iT}\beta - z_i\delta \big)'

and \Sigma_{T+1} is the (T+1)\times(T+1) matrix

\Sigma_{T+1} = \sigma_\varepsilon^2\begin{pmatrix} \frac{1}{1-\gamma^2} & 0_T' \\ 0_T & I_T \end{pmatrix} + \sigma_\mu^2\,cc', \qquad c = \Big( \frac{1}{1-\gamma},\; e_T' \Big)'.

Useful expressions for |\Sigma_{T+1}| and \Sigma_{T+1}^{-1} follow from the rank-one structure of the \sigma_\mu^2 component (matrix inversion lemma).

Case 4.b replaces the variance of w_{i0} by an arbitrary \sigma_{w_0}^2:

V_{T+1} = \sigma_\varepsilon^2\begin{pmatrix} \sigma_{w_0}^2/\sigma_\varepsilon^2 & 0_T' \\ 0_T & I_T \end{pmatrix} + \sigma_\mu^2\,cc'.

Cases 1-4 thus differ in whether y_{i0} (or w_{i0}) is treated as fixed or random, and in the variance attributed to the initial condition.
190
as a
Case 4.a
wi0
wi0
w
and variance
"2=(1 2)
H0: matrix
T +1 as dened in likelihood for Case 4.a, vs. alternative: unrestricted variance-covariance with (T + 1)(T + 2)=2
0
components, with log-likelihoods L4a and L4a respectively. Under
H0, 2(L4a L04a) is distributed as a 2((T + 1)(T + 2)=2 2)
(note only two free parameters in restricted VT +1, as already
estimated).
Case 4.b
variance
2
w0 .
Let
L04b
w
and arbitrary
VT +1 for Case 4.b, and L4a the unrestricted log-likelihood for Case
L0 ) admits
4.a (as above). Under H0 : True model is 4.b, 2(L4a
4b
2
a ((T +1)(T +2)=2
3) distribution (3 free parameters in Case
2
2
2
4.b: " ; ; w ).
0
H0:
2
as a (1).
Under
191
So far, homoskedasticity was assumed, with Ω = σ_ε² I_NT + σ_α² (I_N ⊗ e_T e_T'). Several cases:
1. Random or fixed effects (instruments correlated with α);
2. Heteroskedasticity of ε.

If heteroskedasticity of ε is across individuals only, with E(u_it u_is) = 0, t ≠ s, we have

V_N = N Var f̄(x, θ) = N var( (1/N) Z'u ) = (1/N) E[ Z'uu'Z ]
    = (1/N) Z'[ diag{σ_i²} ⊗ I_T ]Z,

where σ_i² can be estimated by σ̂_i² = (1/T) Σ_{t=1}^T û_it². Hence an optimal second-step estimate for V_N would be

V̂_N = (1/N) Σ_{i=1}^N Z_i' Ĥ_i Z_i,   where Ĥ_i = σ̂_i² I_T.

A similar estimate obtains when heteroskedasticity is of the general form E(u_it²) = σ_it².

With q orthogonality conditions based on the within-transformed instruments QW:

V_N = N E[ (QW/N)' uu' (QW/N) ]
    = (1/N) (QW)'[ σ_ε² I_NT + T σ_α² B ](QW)
    = (1/N) (QW)'[ σ_ε² I_NT ](QW),

because BQ = 0, and the optimal GMM estimator is

θ̂_N = argmin_θ ( u(θ)'QW / N ) ( W'QW / N )⁻¹ ( W'Q u(θ) / N ).
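The second-step weight estimate V̂_N can be sketched as follows (function name and the stacked-by-individual layout are assumptions for illustration):

```python
import numpy as np

def second_step_weight(Z, uhat, N, T):
    """V_N-hat = (1/N) sum_i sigma_i^2 * Z_i'Z_i, with sigma_i^2 = (1/T) sum_t u_it^2.
    Z is (N*T, q) and uhat is (N*T,), observations stacked by individual."""
    q = Z.shape[1]
    V = np.zeros((q, q))
    for i in range(N):
        Zi = Z[i * T:(i + 1) * T]
        ui = uhat[i * T:(i + 1) * T]
        sig2_i = (ui @ ui) / T          # individual-specific variance estimate
        V += sig2_i * (Zi.T @ Zi)
    return V / N
```

The inverse of this matrix is then used as the GMM weighting matrix in the second step.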
Strict exogeneity: the 1 × q instruments w_it are uncorrelated with ε at all dates,

E(w_is' u_it) = 0   for   s, t = 1, 2, …, T,

which gives the instrument matrix W_SE,i = I_T ⊗ w_i', where w_i' = (w_i1, …, w_iT) stacks all instruments for individual i. Filtering these instruments is just a nonsingular linear transformation:

Ω^{−1/2} W_SE,i = ( I_T ⊗ w_i' )( Ω^{−1/2} ⊗ I_qT ) = W_SE,i B,   where B = Ω^{−1/2} ⊗ I_qT,

so the space spanned by strictly exogenous instruments is unchanged by filtering. To remove the individual effect, introduce the T × (T − 1) matrix

L_T = [ −1   0  …   0
         1  −1  …   0
         0   1  …   0
         ⋮    ⋮  ⋱   ⋮
         0   0  …  −1
         0   0  …   1 ].

Note that L_T' e_T = 0 and that L_T (L_T' L_T)⁻¹ L_T' = Q_T, the Within operator. If instruments are strictly exogenous,

E( Z_SE,i' L_T' u_i ) = E( Z_SE,i' L_T' ε_i ) = 0,   where   Z_SE,i = I_{T−1} ⊗ w_i',

so the model can be first-differenced and estimated with Z_SE,i as instruments.

Weak exogeneity: with a 1 × q vector of instruments w_it such that E(w_is' u_it) = 0 for s ≤ t, the moment conditions can be written on the differenced disturbances Δu_is = u_is − u_{i,s−1}, for t = 1, 2, …, T − 1 and s ≤ t, using only instruments dated no later than the differenced error.
The filtered IV estimator is

β̂_FF = [ X'F' H (H'H)⁻¹ H'F X ]⁻¹ X'F' H (H'H)⁻¹ H'F Y,

where F = I_N ⊗ F_T for some filter F_T, and H stacks the instrument matrices W_i built from the w_it. When N is large and the errors are conditionally heteroskedastic, this weighting is no longer optimal, because

plim (1/N) Σ_{i=1}^N H_i' F u_i u_i' F' H_i ≠ σ² plim (1/N) Σ_{i=1}^N H_i' F F' H_i,

and both differ from σ² plim (1/N) Σ_{i=1}^N H_i' H_i.
The variance-covariance matrix of u_i is Ω = σ_ε² I_T + σ_α² e_T e_T'. Consider the model

y_i = R_i β + (e_T ⊗ z_i') γ + u_i = X_i δ + u_i,

where u_i = e_T α_i + ε_i, R_i = (r_i1', r_i2', …, r_iT')' (a T × k matrix of time-varying regressors), and e_T ⊗ z_i' = [z_i', z_i', …, z_i']' (a T × g matrix of time-invariant regressors).

Assume the regressors are strictly exogenous with respect to ε_i:

E( d_i ⊗ ε_i ) = 0,

where d_i stacks the exogenous variables for individual i: d_i = (r_i1, r_i2, …, r_iT, z_i').

If the condition E(W_i' u_i u_i' W_i) = E(W_i' Ω W_i) holds, the efficient GMM estimator can be computed using the same instruments W_i. Moreover,

E[ (L_T ⊗ d_i)' u_i ] = E( L_T' u_i ⊗ d_i )
  = E[ L_T'(e_T α_i + ε_i) ⊗ d_i ] = E( L_T' ε_i ⊗ d_i ) = 0,

since L_T' e_T = 0; here L_T ⊗ d_i is a T × [(T−1)(kT + g)] matrix of instruments. This suggests using

W_B,i = ( L_T ⊗ d_i, e_T ⊗ s_i )   instead of   W_A,i = ( Q_T R_i, e_T ⊗ s_i ).

Number of additional instruments of W_B,i with respect to W_A,i:

rank(Z_B,i) − rank(Z_A,i) = (T−1)(kT + g) − k.
A.5.6 GMM with unrestricted variance-covariance matrix

We assume the instruments Z_B,i satisfy the no conditional heteroskedasticity assumption, but the variance-covariance matrix of u is unrestricted. Using instruments Ω⁻¹Z_A,i is not valid here, because E( R_i' Q_T Ω⁻¹ e_T α_i ) ≠ 0, and we can show that Q_T remains valid for removing α_i: Q_T e_T = 0.
We assume u to be (a) homoskedastic and (b) with block-diagonal variance:

V = E(uu') = I_N ⊗ Ω_T,   where Ω = σ_ε² I_NT + σ_α² (I_N ⊗ e_T e_T')  and  Ω_T = σ_ε² I_T + σ_α² e_T e_T'.

With instruments Z_i for the equation y_i = X_i β + u_i, the panel 2SLS estimator is given by

β̂_2SLS = [ X'Ω^{−1/2}Z (Z'Z)⁻¹ Z'Ω^{−1/2}X ]⁻¹ X'Ω^{−1/2}Z (Z'Z)⁻¹ Z'Ω^{−1/2}Y.

An equivalent 2SLS estimator obtains by using Ω⁻¹Z_i as instruments:

β̂_2SLS = [ X'Ω⁻¹Z (Z'Ω⁻¹Z)⁻¹ Z'Ω⁻¹X ]⁻¹ X'Ω⁻¹Z (Z'Ω⁻¹Z)⁻¹ Z'Ω⁻¹Y.

The 3SLS estimator is

β̂_3SLS = [ X'Z (Z'ΩZ)⁻¹ Z'X ]⁻¹ X'Z (Z'ΩZ)⁻¹ Z'Y.

GMM and 3SLS are equivalent if the following condition holds:

E( Z_i' u_i u_i' Z_i ) = E( Z_i' Ω_T Z_i )   ∀ i = 1, 2, …, N,

because, as N → ∞,

plim (1/N) Z'ΩZ = plim (1/N) Σ_{i=1}^N Z_i' û_i û_i' Z_i = E( Z_i' u_i u_i' Z_i ) = V.

This condition is denoted NCH: No conditional heteroskedasticity. When it holds, the optimal GMM weight V̂_N can be estimated from first-stage residuals; Theorem 8 states that under this condition, filtering (premultiplying instruments by Ω^{−1/2}) does not change the asymptotic properties of the GMM estimator.
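A compact sketch of the 2SLS formula above, applied to data assumed to be already premultiplied by Ω^{−1/2} upstream (function name illustrative):

```python
import numpy as np

def panel_2sls(Y, X, Z):
    """IV/2SLS estimator b = (X'Pz X)^{-1} X'Pz Y with Pz = Z (Z'Z)^{-1} Z'.
    Y is (n,), X is (n, k), Z is (n, q) with q >= k."""
    Pz_X = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)   # projection of X on span(Z)
    return np.linalg.solve(Pz_X.T @ X, Pz_X.T @ Y)
```

When Z = X the estimator collapses to OLS, which provides a quick sanity check.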
The error term combines an individual effect α_i and an i.i.d. error term ε_it:

u_it = α_i + ε_it.

OLS (or, equivalently, ML) yields consistent but not efficient estimates if unobserved heterogeneity is omitted (the composite error, and hence y_it, is not i.i.d.).

In a dynamic model with heterogeneous autoregressive parameters ρ_i, the OLS estimate of ρ satisfies

plim ρ̂ = (1/N) Σ_{i=1}^N ρ_i + Cov_i( ρ_i, Var(y_i,t−1) ) / [ (1/N) Σ_i Var(y_i,t−1) ]
        = (1/N) Σ_{i=1}^N ρ_i + Cov_i( ρ_i, σ_ε²/(1−ρ_i²) ) / [ (1/N) Σ_i σ_ε²/(1−ρ_i²) ].

If all ρ_i > 0, the covariance term is positive and ρ̂ overestimates the mean of the ρ_i. This is true even if the α_i are i.i.d. with E(α_i) = 0.

For the MLE of the misspecified (homogeneous) model, we have

ρ̂ → (T → ∞)  [ (1/N) Σ_{i=1}^N 1/ρ_i ]⁻¹  <  (1/N) Σ_{i=1}^N ρ_i.

Hence, the MLE of the misspecified model underestimates the average of individual parameters ρ_i.
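The harmonic-vs-arithmetic mean inequality behind this result can be checked numerically; a small illustration with hypothetical ρ_i values:

```python
import numpy as np

rho = np.array([0.2, 0.5, 0.8])        # hypothetical heterogeneous AR(1) coefficients
arith = rho.mean()                      # (1/N) sum_i rho_i
harm = 1.0 / (1.0 / rho).mean()         # [ (1/N) sum_i 1/rho_i ]^{-1}
assert harm < arith                     # the pooled-MLE limit understates the mean
```

For strictly positive ρ_i the harmonic mean is always below the arithmetic mean, matching the underestimation result above.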
In many cases, it is not possible to filter out the individual effect without very restrictive assumptions (e.g., fixed-effect Logit, which requires a conditional likelihood approach). Another possibility is to integrate out the individual effect and maximize the resulting marginal likelihood with respect to the parameters of interest.

Poisson example. The distribution of y_it is

f(y_it | x_it; β, α_i) = f̃(y_it; x_it β + α_i) = [ exp(x_it β + α_i)^{y_it} / y_it! ] exp[ −exp(x_it β + α_i) ].

Change of variable: θ_i = exp(α_i), with probability distribution

γ(θ; δ) = θ^{1/δ − 1} exp(−θ/δ) / [ δ^{1/δ} Γ(1/δ) ],

where Γ(·) is the Gamma function and δ > 0 (a Gamma distribution). Then it can be shown that the marginal distribution of y_it has a closed form (the Negative Binomial model).

Probit example. Assume α_i ~ N(0, σ_α²). Then

Prob[ y_it = 1 | x_it ] = ∫ Φ(x_it β + α) (1/σ_α) φ(α/σ_α) dα,

where φ(·) is the density function of N(0,1) and Φ(·) its cdf. Note that

Prob[ y_i1 = 1, …, y_iT = 1 ] = ∫ Π_{t=1}^T Φ(x_it β + α) (1/σ_α) φ(α/σ_α) dα
                              ≠ Π_{t=1}^T Prob[ y_it = 1 ].
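The integral above can be approximated by plain Monte Carlo; a sketch (function name, defaults, and draw count are illustrative assumptions):

```python
import numpy as np
from math import erf, sqrt

def Phi(x):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def re_probit_prob(xb, sigma_alpha, S=10000, seed=0):
    """Prob[y_it = 1 | x_it] = integral of Phi(x_it*b + a) over a ~ N(0, sigma_alpha^2),
    approximated by averaging over S Monte Carlo draws of alpha."""
    rng = np.random.default_rng(seed)
    alpha = rng.normal(0.0, sigma_alpha, S)
    return float(np.mean([Phi(xb + a) for a in alpha]))
```

As sigma_alpha goes to 0 the probability collapses to Phi(xb); for xb = 0 symmetry gives 0.5 regardless of sigma_alpha.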
In more complex cases, the marginal distribution

M(y_it | x_it; β, δ) = ∫ m(y_it; x_it β + α) γ(α; δ) dα

has no closed form. We can write

M(y_it | x_it; β, δ) = ∫ m(y_it; x_it β + α) [ γ(α; δ) / γ0(α; δ0) ] γ0(α; δ0) dα.

If we can find a density γ0 from which values α_i^s, s = 1, …, S, can be drawn, we can approximate the expectation by

(1/S) Σ_{s=1}^S m(y_it; x_it β + α_i^s) γ(α_i^s; δ) / γ0(α_i^s; δ0).

Under (mild) regularity assumptions, the simulated expression converges to the above expectation, using a weak Law of Large Numbers. Two issues in practice: the choice of the importance density γ0, and the number of draws S.

Gouriéroux and Monfort (J. of Econometrics, 1993): Simulated GMM (SGMM) and Simulated Maximum Likelihood (SML).

For SGMM, when population moments are impossible to compute, we replace

E[ f(y_it, x_it, α_i; θ) ] = 0   by   (1/S) Σ_{s=1}^S f(y_it, x_it, α_i^s; θ) ≈ 0,

or by the importance-sampling version

(1/S) Σ_{s=1}^S f(y_it, x_it, α_i^s; θ) γ(α_i^s; δ) / γ0(α_i^s; δ0) ≈ 0.

The SGMM estimator is

θ̂_SGMM = argmin_θ { Σ_{i=1}^N [ (1/S) Σ_s f(y_i, x_i, α_i^s; θ) ]' Z_i } W_N { Σ_{i=1}^N Z_i' [ (1/S) Σ_s f(y_i, x_i, α_i^s; θ) ] },

where Z_i is a T × L matrix of instruments and W_N a weighting matrix. The SGMM estimator is consistent and asymptotically normal when N tends to infinity and S is fixed. This is because we can use the weak Law of Large Numbers for consistency of the simulator (1/S) Σ_s f^s towards E f, and a Central Limit Theorem for asymptotic normality.

For SML, the log-likelihood is

log L(θ) = Σ_{i=1}^N log f(y_i | x_i; θ),

where heterogeneity has been integrated out of f(y_i | x_i; θ). The density is approximated by

(1/S) Σ_{s=1}^S f̃(y_i, x_i, α_i^s; θ),

where the draws α_i^s, s = 1, 2, …, S, are independent across individuals i, and

L^S(θ) = (1/N) Σ_{i=1}^N log [ (1/S) Σ_{s=1}^S f̃(y_i, x_i, α_i^s; θ) ].

Because the log is nonlinear, the simulation error does not average out for fixed S, and a condition of the type N/S → 0 may be necessary.
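The simulated log-likelihood can be sketched generically (f_tilde is any user-supplied conditional density; names and signatures are illustrative, not from the notes):

```python
import numpy as np

def simulated_loglik(f_tilde, y, x, alpha_draws, theta):
    """L^S(theta) = (1/N) sum_i log[ (1/S) sum_s f_tilde(y_i, x_i, alpha_i^s, theta) ].
    alpha_draws has shape (N, S): independent draws for each individual."""
    N, S = alpha_draws.shape
    total = 0.0
    for i in range(N):
        fs = np.array([f_tilde(y[i], x[i], alpha_draws[i, s], theta) for s in range(S)])
        total += np.log(fs.mean())   # log of the simulated density for individual i
    return total / N
```

Because the log of an average is not the average of logs, this criterion is biased for fixed S, which is exactly the point made above.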
Consider a criterion of the form

G_N(θ) = (1/N) Σ_{i=1}^N Ψ( y_i, x_i, E[ φ(y_i, x_i, α; θ) ] ),

to be maximized (SML) or minimized (SGMM).

Case 1. S fixed and N → ∞. If the same draws are used for each individual:

G_N^I(θ) = (1/N) Σ_i Ψ( y_i, x_i, (1/S) Σ_s φ(y_i, x_i, α^s; θ) );

with different draws across individuals:

G_N^D(θ) = (1/N) Σ_i Ψ( y_i, x_i, (1/S) Σ_s φ(y_i, x_i, α_i^s; θ) ).

G_N^I(θ) does not converge to G(θ). Therefore the θ̂ that maximizes (SML) or minimizes (SGMM) G_N^I(θ) is inconsistent. G_N^D(θ) converges to the non-random scalar

E Ψ( y_i, x_i, (1/S) Σ_s φ(y_i, x_i, α^s; θ) ),

which is in general different from G(θ). But if the function Ψ is linear wrt. E φ(·), G_N^D(θ) converges to G(θ) and θ̂^D is consistent.

Case 2. S and N → ∞. Both θ̂^I and θ̂^D are consistent.
With α_i given, the density of y_i given x_i and α_i is a T-fold product. For the Probit model:

f(y_i | x_i, α_i; θ) = Π_{y_it = 1} Φ( x_it β + α_i ) × Π_{y_it = 0} [ 1 − Φ( x_it β + α_i ) ];

for the Tobit (censored regression) model:

f(y_i | x_i, α_i; θ) = Π_{y_it = 0} Φ( −(x_it β + α_i)/σ_ε )
                     × Π_{y_it > 0} (1/σ_ε) φ( (y_it − x_it β − α_i)/σ_ε ).
SOFTWARE

* DYNTAB.SAS ;
* Uses datafile DYNTAB3.DAT ;
* Create library and file names ;
* Change directory information below ;
model lconso= lprice lrevenue ;
by year;
run;
* Compute Within and Between estimates ;
* using the MEANS procedure ;
proc sort data=wat;
by id;
proc means data=wat noprint;
var lconso lprice lrevenue ;
by id;
output out=out1 mean=mconso mprice mrevenue ;
data out1;set out1;
keep id mconso mprice mrevenue ;
data wat;
merge wat out1;
by id;
data wat;set wat;
qconso=lconso-mconso; qprice=lprice-mprice;
qrevenue=lrevenue-mrevenue;
* Within regression ;
proc reg data=wat;
model qconso = qprice qrevenue ;
run;
* Between regression ;
proc reg data=wat;
model mconso = mprice mrevenue;
run;
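For comparison (not part of the original SAS session), the same within/between split of a balanced panel can be sketched in Python:

```python
import numpy as np

def within_between(y, N, T):
    """Split a balanced panel series into within (deviation from individual mean)
    and between (individual mean) components, as the SAS steps above do."""
    Y = y.reshape(N, T)                       # one row per individual
    means = Y.mean(axis=1, keepdims=True)     # individual means (MEANS step)
    within = (Y - means).ravel()              # q-variables
    between = np.repeat(means.ravel(), T)     # m-variables, repeated over t
    return within, between
```

Regressing the within components on each other reproduces the Within estimates; regressing the means reproduces the Between estimates.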
MODEL 1. ONE-WAY FIXED EFFECTS

Model Description
Estimation Method            FIXONE
Number of Cross Sections     116
Time Series Length           6

SSE  2.578099    DFE       578
MSE  0.00446     Root MSE  0.066786
RSQ  0.9344

Parameter Estimates
Variable   DF   Parameter    Standard    T for H0:      Variable
                Estimate     Error       Parameter=0    Label
CS 1        1   -0.455773    0.039463    -11.549433     Cross Sec
CS 2        1   -0.222476    0.039923     -5.572620     Cross Sec
CS 3        1    0.153338    0.038900      3.941882     Cross Sec
CS 4        1   -0.131488    0.039174     -3.356518     Cross Sec
CS 5        1    0.027422    0.038890      0.705132     Cross Sec
...        ...   ...          ...          ...
CS 112      1    0.420843    0.040309     10.440506     Cross Sec
CS 113      1   -0.322888    0.039376     -8.200102     Cross Sec
CS 114      1   -0.259767    0.038678     -6.716134     Cross Sec
CS 115      1   -0.240823    0.039379     -6.115479     Cross Sec
INTERCEP    1    5.099257    0.366957     13.896065     Intercept
LPRICE      1   -0.134245    0.018447     -7.277506
LREVENUE    1    0.024386    0.033223      0.734009
MODEL 2. TWO-WAY FIXED EFFECTS
Dependent variable: LCONSO

Model Description
Estimation Method            FIXTWO
Number of Cross Sections     116
Time Series Length           6

SSE  2.205671    DFE       573
MSE  0.003849    Root MSE  0.062043
RSQ  0.9439

Parameter Estimates
Variable   DF   Parameter    Standard    T for H0:      Variable
                Estimate     Error       Parameter=0    Label
CS 1        1   -0.535192    0.040793    -13.119702     Cross Sec
CS 2        1   -0.302435    0.041809     -7.233670     Cross Sec
CS 3        1    0.120803    0.037066      3.259125     Cross Sec
...        ...   ...          ...          ...
CS 114      1   -0.288486    0.036463     -7.911820     Cross Sec
CS 115      1   -0.256215    0.036669     -6.987209     Cross Sec
TS 1        1   -0.102087    0.017883     -5.708681     Time Seri
TS 2        1   -0.047565    0.016463     -2.889216     Time Seri
TS 3        1   -0.030524    0.014486     -2.107135     Time Seri
TS 4        1   -0.007359    0.012507     -0.588378     Time Seri
TS 5        1   -0.025528    0.009992     -2.554900     Time Seri
INTERCEP    1    6.316873    0.396540     15.929983     Intercept
LPRICE      1   -0.251061    0.034210     -7.338896
LREVENUE    1   -0.053316    0.033244     -1.603773
MODEL 3. ONE-WAY RANDOM EFFECTS

Model Description
Estimation Method            RANONE
Number of Cross Sections     116
Time Series Length           6

Variance Component Estimates
SSE  3.12498     DFE       693
MSE  0.004509    Root MSE  0.067152
RSQ  0.1087
Variance Component for Cross Sections   0.043243
Variance Component for Error            0.004460

Parameter Estimates
Variable   DF   Parameter    Standard    T for H0:      Variable
                Estimate     Error       Parameter=0    Label
INTERCEP    1    4.692305    0.354917     13.220844     Intercept
LPRICE      1   -0.149074    0.017611     -8.465039
LREVENUE    1    0.053077    0.032306      1.642977
MODEL 4. TWO-WAY RANDOM EFFECTS
Dependent variable: LCONSO

Model Description
Estimation Method            RANTWO
Number of Cross Sections     116
Time Series Length           6

Variance Component Estimates
SSE  2.707154    DFE       693
MSE  0.003906    Root MSE  0.062501
RSQ  0.0907
Variance Component for Cross Sections   0.043638
Variance Component for Time Series      0.000746
Variance Component for Error            0.003849

Parameter Estimates
Variable   DF   Parameter    Standard    T for H0:      Variable
                Estimate     Error       Parameter=0    Label
INTERCEP    1    5.674742    0.371984     15.255323     Intercept
LPRICE      1   -0.225151    0.027604     -8.156464
LREVENUE    1   -0.018251    0.032401     -0.563297
WITHIN REGRESSION (PROC REG ON TRANSFORMED DATA)

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model        2        0.31252        0.15626     42.003   0.0001
Error      693        2.57810        0.00372
C Total    695        2.89062

Root MSE    0.06099          R-square   0.1081
Dep Mean   -0.00000          Adj R-sq   0.1055
C.V.       -1.291786E17

Parameter Estimates
Variable   DF   Parameter      Standard     T for H0:
                Estimate       Error        Parameter=0
INTERCEP    1   -5.28092E-17   0.00231195   -0.000
QPRICE      1   -0.134245      0.01684666   -7.969
QREVENUE    1    0.024386      0.03034107    0.804
BETWEEN REGRESSION (PROC REG ON INDIVIDUAL MEANS)

Analysis of Variance
Source     DF   Sum of Squares   Mean Square   F Value   Prob>F
Model        2        7.13103        3.56551     84.369   0.0001
Error      693       29.28684        0.04226
C Total    695       36.41786

Root MSE    0.20557          R-square   0.1958
Dep Mean    4.99481          Adj R-sq   0.1935
C.V.        4.11576

Parameter Estimates
Variable   DF   Parameter    Standard     T for H0:
                Estimate     Error        Parameter=0
INTERCEP    1   -0.176444    0.68091356    -0.259
MPRICE      1   -0.259461    0.02278084   -11.389
MREVENUE    1    0.494483    0.05958703     8.298
Gauss is an interpreted computer language that is most conveniently run in interactive mode (global variables are kept in memory until one quits Gauss). It has a small built-in editor useful for long jobs, or it can be used in command mode.

You can switch between command mode and edit mode using either tool bar (Windows bar at the bottom, Gauss bar on top). In command mode, you can edit any file (for example myprog.prg) by typing edit myprog.prg.

You can load an ASCII data file into a matrix with
load x[1000,5]=mydata.dat
or
n=100;t=10;nvar=5;load x[n*t,nvar]=mydata.dat;.
To save a matrix x as a Gauss dataset ("mydata") with variable names stored in varnames, use
call saved(x,"mydata",varnames).

Basic operators
In Gauss, most operators return a value that may be stored in a variable, or printed to screen. If no assignment command is given, the program will simply output the result to the screen. Example:
x={1 2 3};
(a 1 x 3 row vector)1.

1 Note: every statement ends with ;. A vector of variable names is written vnames={"a","b","c"}.
Here is a list of useful operators:

cols(x)          Returns the number of columns of x;
rows(x)          Returns the number of rows of x;
meanc(x)         Returns the means of the columns of x;
stdc(x)          Returns the standard deviations of the columns of x;
sqrt(x)          Returns the element-by-element square root of x;
sumc(x)          Returns the sums of the columns of x;
cumsumc(x)       Returns the cumulative sums of the columns of x;
cdfn(x)          Returns the cumulative normal distribution Phi(x);
cdfchic(x,y)     Returns the complement to 1 of the chi2 cumulative distribution with y degrees of freedom, useful for computing p-values of chi2 tests;
x'               Transposes matrix or vector x;
y=x1~x2, y=x1|x2 Concatenates horizontally or vertically;
y=x[.,1]         Selects column 1 and all rows of matrix x;
y=x[1:10,.]      Selects rows 1 to 10 and all columns;
y=x[1:10,1:20]   Selects columns 1 to 20 and rows 1 to 10;
vec(x)           Creates a vector from a matrix, by stacking all columns one after the other. vec(x) is NT x 1 if x is N x T;
diag(x)          Returns the first diagonal of matrix x (must be square);
reshape(x,n,t)   Reshapes matrix x into a n x t matrix;
a*b*c            Performs matrix multiplication (check number of rows and columns!);
a.*b, a./b       Performs element-by-element multiplication or division;
inv(x)           Returns the inverse of matrix x (for a positive definite matrix inverse, use invpd(x));
zeros(n,m)       Returns a n x m matrix of zeros;
ones(n,m)        Returns a n x m matrix of ones;
eye(n)           Returns a n x n identity matrix;
a.*.b            Computes the Kronecker product a ⊗ b;
Conditional operators and loops
Useful for testing and creating dummy variables. Operators: .eq, .lt, .le, .gt, .ge (equal to, strictly less than, less than or equal to, strictly greater than, greater than or equal to).
Example: suppose you want to create an indicator variable equal to y if z > 0, and equal to x if z < 0:
z = y.*(z .gt 0) + x.*(z .lt 0);
Loops are not recommended because they produce lengthy processes, and vector operators should always be preferred. But in some cases, they are necessary. Examples of loops are:
y=zeros(n,1);
i=1;
do until i>n;
...
i=i+1;
endo;

Other useful commands:
y=sortc(x,1)           Sorts matrix x using the variable in column 1 as key;
y=selif(x, x .eq 1)    Creates matrix y from the rows of x for which the condition (equal to 1) holds;
y=delif(x, x .lt 0)    Creates matrix y by deleting rows with negative values from x;

Creating procedures
Very useful to speed up repetitive tasks. The general syntax is
proc func(a);
local toto;
...
retp(toto);
endp;.
A procedure may also return several outputs:
...
retp(toto1,toto2,toto3);
endp;.
This code declares 3 inputs and 3 outputs; the procedure is called with
{b1,b2,b3}=func(a1,a2,a3);
Beware of the use of local variables: any variable used in the procedure must either be declared as local (its value is lost when one quits the procedure) or elsewhere in the program (this will be a global variable). A possibility to avoid problems is to declare all variables as global at the start of the program.
Example: a procedure computing the Within transformation of a vector x:
proc within(x);
local toto;
toto=reshape(x,n,t);
toto=toto-meanc(toto');
toto=reshape(toto,n*t,1);
retp(toto);
endp;
Note that in this case, variables n and t must exist as globals. A cleaner version passes them as arguments:
proc within(x,n,t);
local toto;
toto=reshape(x,n,t);
retp(reshape(toto-meanc(toto'),n*t,1));
endp;
And if we wished to return both Between and Within:
proc (2)=bw(x,n,t);
local toto;
toto=meanc(reshape(x,n,t)').*.ones(t,1);
retp(toto,x-toto);
endp;
Some useful built-in procedures
call dstat(0,x)           Computes descriptive statistics of the variables in x;
call dstat("mydata",1|3)  Computes descriptive statistics of variables 1 and 3 in Gauss dataset mydata;
call ols(0,y,x);          Computes the OLS regression of y on x;
Nonlinear optimization is performed with the optmum library, loaded as follows:
library optmum;optmum;
The main command is
{x,f,g,ret}=optmum(&func,x0);
where x0 is the vector of starting values, x the vector of estimated parameters, f the value of the criterion at the optimum, g the gradient, and ret is a return code. The criterion function (here func) is declared as
proc func(z);
:::;
retp(crit);
endp;
where the vector of parameters is the argument (z).
Example: To estimate a nonlinear model by minimizing the residual sum of squares, where the model is
y_i = β0 + β1β2 x_i + ln(β1) w_i:
library optmum;optmum;
x0={0.1 , 0.1 , 0.5};
{x, f, g, ret} = optmum(&func,x0);
proc func(z);
local err,crit;
err=y-z[1]-z[2]*z[3]*x-ln(z[2])*w;
crit=err'err;
retp(crit);
endp;
where β0 is z[1], β1 is z[2] and β2 is z[3]. Vectors y, x and w must be defined as globals before the procedure is used.
n=116; t=6;
load x[n*t,6]=d:/dea/panel/dyntab3.dat;
id=x[.,1];
year=x[.,2];
conso=ln(x[.,3]);
price=ln(x[.,4]);
revenue=ln(x[.,5]);
precip=ln(x[.,6]);
vnames={"year","conso","price","revenue","precip","id"};
call saved(year~conso~price~revenue~precip~id,"watfile",vnames);
y={"conso"};
x={"price","revenue"};
grp={"id"};
__title="Water demand equation";
call tscs("watfile",y,x,grp);
=====================================================================
TSCS Version 3.1.2                                  1/17/01 3:51 pm
=====================================================================
Data Set: watfile

OLS DUMMY VARIABLE RESULTS
Dependent variable: conso

Observations          :   696
Number of Groups      :   116
Degrees of freedom    :   578
Residual SS           :   2.578
Std error of est      :   0.067
Total SS (corrected)  :   2.891
F = 35.033 with 2,578 degrees of freedom      P-value = 0.000

Var        Coef.       Std. Error   t-Stat       P-Value   Std. Coef.
price      -0.134245   0.018447     -7.277506    0.000     -0.347461
revenue     0.024386   0.033223      0.734009    0.463      0.035045

Group Number   Dummy Variable   Standard Error
1              4.643484         0.365639
2              4.876781         0.370063
3              5.252595         0.369474
...            ...              ...
114            4.839490         0.365496
115            4.858434         0.359065
116            5.099257         0.366957
OLS ESTIMATE OF CONSTRAINED MODEL
Dependent variable: conso

Observations          :   696
Number of Groups      :   116
Degrees of freedom    :   693
R-squared             :   0.172
Rbar-squared          :   0.170
Residual SS           :   32.532
Std error of est      :   0.217
Total SS (corrected)  :   39.308
F = 72.175 with 3,693 degrees of freedom      P-value = 0.000

Var        Coef.       Std. Error   t-Stat        P-Value   Std. Coef.
CONSTANT    1.164761   0.598014      1.947715     0.052
price      -0.249873   0.022153    -11.279345     0.000     -0.406149
revenue     0.376643   0.052746      7.140637     0.000      0.257121
RANDOM EFFECTS ESTIMATES

Var        Coef.       Std. Error   t-Stat       P-Value   Std. Coef.
CONSTANT    4.687235   0.355285     13.192903    0.000
price      -0.149316   0.017623     -8.472974    0.000     -0.363264
revenue     0.053560   0.032338      1.656247    0.098      0.071009

Group Number   Random Components
1              -0.346522
2              -0.121608
3               0.250638
4              -0.020350
5               0.128761
...             ...
112             0.512636
113            -0.216224
114            -0.151243
115            -0.125587
116             0.104064
occ=x[.,4];
ind=x[.,5];
south=x[.,6];
smsa=x[.,7];
ms=x[.,8];
fem=x[.,9];
unioni=x[.,10];
edu=x[.,11];
blk=x[.,12];
lwage=x[.,13];
/* Define matrices X, Z and vector Y */
x1=occ~south~smsa~ind;
x2=expe~expe2~wks~ms~unioni;
z1=fem~blk;
z2=edu;
y=lwage;
x=x1~x2;
z=z1~z2;
/* You don't need to change anything after this */
/* Compute Between (BXZ) and Within (QX) transformations:
Caution: keep that order for BXZ: X,Z,Y */
qx=with(x~y);
bxz=bet(x~z~y);
by=bxz[.,cols(bxz)];
bxz=bxz[.,1:cols(bxz)-1];
qy=qx[.,cols(qx)];
qx=qx[.,1:cols(qx)-1];
/* Within regression and error term (uw) */
betaw=inv(qx'qx)*qx'qy;
uw=qy-qx*betaw;
/* Compute variance with instruments */
exob=un~bxz;
gamb=inv(exob'exob)*(exob'by);
ub=by-exob*gamb;
sigep=uw'uw/(n*(t-1)-kq);
sigq=sqrt(sigep*diag(inv(qx'qx)));
a=x1~z1;
di=by-bxz[.,1:kq]*betaw;
zz=un~z1~z2;
gamhatw=inv(zz'*a*inv(a'*a)*a'*zz)*zz'*a*inv(a'*a)*a'*di;
s2=(1/(n*t))*(by-bxz[.,1:kq]*betaw
-zz*gamhatw)'*(by-bxz[.,1:kq]*betaw-zz*gamhatw);
sigal=s2-(1/t)*sigep;
theta=sqrt(sigep/(sigep+t*sigal));
/* GLS transformation and estimate
Caution: keep the order 1,X1,X2,Z1,Z2 in matrix EXOG */
exog=gls(un~x1~x2~z1~z2~y);
yg=exog[.,cols(exog)];
exog=exog[.,1:cols(exog)-1];
betagls=inv(exog'exog)*(exog'yg);
siggls=sqrt(sigep*diag(inv(exog'exog)));
/* HT */
aht=un~qx~bet(x1)~z1;
betaht=inv(exog'*aht*inv(aht'*aht)*aht'*exog)*exog'*aht*inv(aht'*aht)
*aht'*yg;
sight=sqrt(sigep*diag(inv(exog'*aht*inv(aht'*aht)*aht'*exog)));
/* AM */
x1s=tam(x1);
aam=un~qx~x1s~z1;
betaam=inv(exog'*aam*inv(aam'*aam)*aam'*exog);
betaam=betaam*exog'*aam*inv(aam'*aam)*aam'*yg;
sigam=sqrt(sigep*diag(inv(exog'*aam*inv(aam'*aam)*aam'*exog)));
/* BMS */
abms1=aam~tbms(with(x2));
/* This is the general form for BMS instrument, it should work in most
cases. But with the application to PSID data, we must drop some variables,
see below. This means you have to delete ABMS1 below for your application
*/
/* Remove abms1 just below: */
abms1=un~qx~bet(x1)~tbms(with(occ~south~smsa~ind~ms~wks~unioni))~z1;
betabms1=inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog)
*exog'*abms1*inv(abms1'*abms1)*abms1'*yg;
sigbms1=sqrt(sigep*diag(inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog)));
/* Compute variance-covariance matrices */
varq=sigep*inv(qx'qx); varg=sigep*inv(exog'*exog);
varht=sigep*inv(exog'*aht*inv(aht'*aht)*aht'*exog);
varam=sigep*inv(exog'*aam*inv(aam'*aam)*aam'*exog);
varbms1=sigep*inv(exog'*abms1*inv(abms1'*abms1)*abms1'*exog);
test1=(betagls[2:kq+1]-betaw)'*inv(varq-varg[2:kq+1,2:kq+1]);
test1=test1*(betagls[2:kq+1]-betaw);
test2=(betaht[2:kq+1]-betaw)'*inv(varq-varht[2:kq+1,2:kq+1])
*(betaht[2:kq+1]-betaw);
test3=(betaht-betaam)'*inv(varht-varam)*(betaht-betaam);
test4=(betaam-betabms1)'*inv(varam-varbms1)*(betaam-betabms1);
output file=iv1.out reset;
output on;
"Within estimates ";
" Estimate standard error t-stat ";
betaw~sigq~betaw./sigq;
"GLS estimates";
"sigma(alpha),sigma(epsilon),theta(=(sig(ep)/(sig(ep)+t*sig(al)))**(1/2))";
sigal~sigep~theta;
" Estimate standard error t-stat ";
betagls~siggls~betagls./siggls;
" Estimate standard error t-stat ";
b2~se2~b2./se2;
"Hansen test and p-value ";
sar~cdfchic(sar,cols(abms1)-rows(b2));
output off;
proc bet(w);
/* Compute BX from matrix w */
local i,term,betx;
term=reshape(w[.,1],n,t);
term=meanc(term').*.et;
term=reshape(term,n*t,1);
betx=term;
i=2;
do until i>cols(w);
term=reshape(w[.,i],n,t);
term=reshape(meanc(term').*.et,n*t,1);
betx=betx~term;
i=i+1;
endo;
retp(betx);
endp;
proc with(w);
/* Compute Within transformation for matrix W */
retp(w-bet(w));
endp;
proc gls(w);
/* GLS transformation */
local term; term=w-(1-theta)*bet(w);
retp(term);
endp;
proc tam(w);
/* AM transformation, stacking time observations */
local i,term,xstar;
term=reshape(w[.,1],n,t).*.et;
xstar=term;
i=2;
do until i>cols(w);
term=reshape(w[.,i],n,t).*.et;
xstar=xstar~term;
i=i+1;
endo;
retp(xstar);
endp;
proc tbms(w);
/* BMS transformation, stacking time observations but deleting last column
*/
local i,term,xstar;
term=reshape(w[.,1],n,t).*.et;
xstar=term[.,1:cols(term)-1];
i=2;
do until i>cols(w);
term=reshape(w[.,i],n,t).*.et;
xstar=xstar~term[.,1:cols(term)-1];
i=i+1;
endo;
retp(xstar);
endp;
proc (5)=gmm(y,x,z,d);
local zx,w,w2,b,e,e2,b2,se,se2,sar2;
zx = z'x;
if d==1;
w = invpd(inw(z));
else;
w = invpd(z'z);
endif;
b = invpd(zx'w*zx)*zx'w*z'y;
e = y-x*b;
w2 = ezw(e,z);
se = invpd(zx'w*zx)*zx'w*w2*w*zx*invpd(zx'w*zx);
w = invpd(w2);
se2 = invpd(zx'w*zx);
b2 = se2*zx'w*z'y;
e2 = y-x*b2;
sar2 = e2'z*w*z'e2;
retp(b,sqrt(diag(se)),b2,sqrt(diag(se2)),sar2);
endp;
proc ezw(e,z);
local k,ez,T;
T = rows(e)/N;
k = cols(z);
ez = reshape(e.*z,N,K*T)*(ones(T,1).*.eye(K));
retp(ez'ez);
endp;
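For reference, the quantity accumulated by the ezw procedure above (the "meat" of the cluster-robust GMM weight) can be sketched in Python; the stacked-by-individual layout is an assumption carried over from the Gauss code:

```python
import numpy as np

def clustered_meat(e, Z, N, T):
    """Sum over individuals of (Z_i'e_i)(Z_i'e_i)'.
    e is (N*T,), Z is (N*T, q), observations stacked by individual."""
    q = Z.shape[1]
    S = np.zeros((q, q))
    for i in range(N):
        g = Z[i * T:(i + 1) * T].T @ e[i * T:(i + 1) * T]   # Z_i' e_i
        S += np.outer(g, g)
    return S
```

The Gauss version avoids the explicit loop with a reshape trick; the loop form makes the per-individual moment structure explicit.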
proc inw(z);
local a,i,zi,zaz,T;
t = rows(z)/N;
a = eye(T);
zaz = 0;
i = 1;
do until i>N;
zi = z[(i-1)*T+1:i*T,.];
zaz = zaz + zi'a*zi;
i = i+1;
endo;
retp(zaz);
endp;
/* First component matrix: lagged Y's.
Recall: if AR1=1, restriction when epsilon's are serially correlated
of order 1 */
z = (y[.,1]).*.ddif[.,1];
j = 2;
do until j>cols(ddif);
z = z~((y[.,1:j]).*.ddif[.,j]);
j = j+1;
endo;
if ar1==1;
z = (y[.,1]).*.ddif[.,1];
j = 2;
do until j>cols(ddif);
z = z~((y[.,1:j-1]).*.ddif[.,j]);
j = j+1;
endo;
z=z[.,2:cols(z)];
endif;
/* Second component matrix: Instruments from X */
/* Delete this block if you want only instruments from y's */
if top==1;
/* Weakly exogenous X's, in level */
toto=shapent(x[.,1]);
z2 = (toto[.,1]).*.ddif[.,1];
j = 2;
do until j>cols(ddif);
z2 = z2~((toto[.,1:j]).*.ddif[.,j]);
j = j+1;
endo;
i=2;
do until i>cols(x);
toto=shapent(x[.,i]);
z2 = z2~((toto[.,1]).*.ddif[.,1]);
j = 2;
do until j>cols(ddif);
z2 = z2~((toto[.,1:j]).*.ddif[.,j]);
240
j = j+1;
endo;
i=i+1;
endo;
z=z~z2;
endif;
if top==2;
/* Strongly exogenous X's, in first-difference form */
toto=shapent(x[.,1]);
z2 = (toto[.,3]-toto[.,2]).*.ddif[.,1];
j = 2;
do until j>cols(ddif);
z2 = z2~((toto[.,j]-toto[.,j-1]).*.ddif[.,j]);
j = j+1;
endo;
i=2;do until i>cols(x);
toto=shapent(x[.,i]);
z2 = z2~((toto[.,3]-toto[.,2]).*.ddif[.,1]);
j = 2;
do until j>cols(ddif);
z2 = z2~((toto[.,j]-toto[.,j-1]).*.ddif[.,j]);
j = j+1;
endo;
i=i+1;
endo;
z=z~z2;
endif;
{b1,se1,b2,se2,sar} = gmm(vec((y[.,3:T]-y[.,2:T-1])'),
vec((y[.,2:T-1]-y[.,1:T-2])')~trans(x),z,1);
output file = dpd1.out on;
"Arellano-Bond GMM estimates";
if top ==0;
"Instruments from lagged Y's only (TOP=0)";
endif;
if top==1;
241
"Instruments from X are weakly exogenous and in level (TOP=1)";
endif;
if top==2;
"Instruments from X are strongly exogenous and first-differenced (TOP=2)";
endif;
if ar1==1;
"Restricted estimates: epsilon are serially correlated of order 1 (AR1=1)";
endif;
" Estimate standard error t-stat";
b2~se2~b2./se2;
"Nb. of conditions (instruments) " cols(z);
"Nb. of parameters " rows(b2);
"Hansen specification test and p-value ";
sar~cdfchic(sar,cols(z)-rows(b2));
output off;
proc shapent(w);
/* Reshapes vector in NxT form */
retp(reshape(w,n,t));
endp;
proc trans(w);
/* Transforms matrix X in First Difference */
local toto,i,xfd;
toto=reshape(w[.,1],n,t);
toto=vec((toto[.,3:T]-toto[.,2:T-1])');
xfd=toto;
i=2;
do until i>cols(w);
toto=reshape(w[.,i],n,t);
toto=vec((toto[.,3:T]-toto[.,2:T-1])');
xfd=xfd toto;
i=i+1;
endo;
retp(xfd);
endp;
proc (2)=ls(y,x);
/* Computes OLS, returns White var-covar matrix */
local ixx,b,e,v;
ixx = invpd(x'x);
b = ixx*x'y;
e = y-x*b;
v = ixx*(ezw(e,x))*ixx;
retp(b,v);
endp;
proc ezw(e,z);
local k,ez,T;
T = rows(e)/N;
k = cols(z);
ez = reshape(e.*z,N,K*T)*(ones(T,1).*.eye(K));
retp(ez'ez);
endp;
proc inw(z);
local d,a,i,zi,zaz,T;
T = rows(z)/N;
d = zeros(T,1)~(eye(T-1)|zeros(1,T-1));
a = 2*eye(T) - (d + d');
zaz = 0;
i = 1;
do until i>N;
zi = z[(i-1)*T+1:i*T,.];
zaz = zaz + zi'a*zi;
i = i+1;
endo;
retp(zaz);
endp;
proc (5)=gmm(y,x,z,d);
local zx,w,w2,b,e,e2,b2,se,se2,sar2;
zx = z'x;
if d==1;
w = invpd(inw(z));
else;
w = invpd(z'z);
endif;
b = invpd(zx'w*zx)*zx'w*z'y;
e = y-x*b;
w2 = ezw(e,z);
se = invpd(zx'w*zx)*zx'w*w2*w*zx*invpd(zx'w*zx);
w = invpd(w2);
se2 = invpd(zx'w*zx);
b2 = se2*zx'w*z'y;
e2 = y-x*b2;
sar2 = e2'z*w*z'e2;
retp(b,sqrt(diag(se)),b2,sqrt(diag(se2)),sar2);
endp;
References
S.C. Ahn and P. Schmidt, Efficient Estimation of Models for Dynamic Panel
Data, Journal of Econometrics, 68, 5-27, 1995.
S.C. Ahn and P. Schmidt, A Separability Result for GMM Estimation, with
Applications to GLS Prediction and Conditional Moment Tests, Econometric Reviews, 14(1), 19-34, 1995.
S.C. Ahn and P. Schmidt, Efficient Estimation of Dynamic Panel Data Models:
Alternative Assumptions and Simplified Estimation, Journal of Econometrics, 76,
309-321, 1997.
S.C. Ahn, Y.H. Lee and P. Schmidt, GMM Estimation of Linear Panel Data
Models with Time-varying Individual Effects, Journal of Econometrics, 101, 219-255, 2001.
T. Amemiya, The estimation of the variances in a variance-components model,
International Economic Review, 12, 1-13, 1971.
T. Amemiya and T.E. MaCurdy, Instrumental-Variable Estimation of an Error-Components Model, Econometrica, 54(4), 869-880, 1986.
E.B. Andersen, Conditional Inference and Models for Measuring (Mentalhygiejnisk Forlag, Copenhagen), 1973.
T.W. Anderson and C. Hsiao, Formulation and Estimation of Dynamic Models
Using Panel Data, Journal of Econometrics, 18, 47-82, 1982.
D.W.K. Andrews, Heteroskedasticity and autocorrelation consistent covariance
matrix estimation, Econometrica, 59, 817-858, 1991.
D.W.K. Andrews and J.C. Monahan, An improved heteroskedasticity and autocorrelation consistent covariance matrix estimator, Econometrica, 60, 953-966,
1992.
W. Antweiler, Nested Random Effects Estimation in Unbalanced Panel Data,
Journal of Econometrics, 101, 295-313, 2001.
M. Arellano, Discrete choices with panel data, working paper 0101, CEMFI,
2001.
M. Arellano and S. Bond, Some Tests of Specification for Panel Data: Monte
Carlo Evidence and an Application to Employment Equations, Review of Economic
Studies, 58, 277-297, 1991.
M. Arellano and O. Bover, Another Look at the Instrumental Variable Estimation of Error-Components Models, Journal of Econometrics, 68, 29-51, 1995.
J. Alvarez and M. Arellano, The Time Series and Cross Section Asymptotics
of Dynamic Panel Data Estimators, CEMFI Working Paper No. 9808, 1998.
P. Balestra and M. Nerlove, Pooling cross-section and time-series data in the
estimation of a dynamic model: the demand for natural gas, Econometrica, 34,
585-612,1966.
B.H. Baltagi, Econometric Analysis of Panel Data, J. Wiley, 1995.
B.H. Baltagi and S. Khanti-Akom, On efficient estimation with panel data: an
empirical comparison of instrumental variables estimators, Journal of Applied Econometrics, 5, 401-406, 1990.
B.H. Baltagi, Simultaneous equations with error components, Journal of Econometrics, 17, 189-200, 1981.
B.H. Baltagi, Specification issues, in The Econometrics of Panel Data: Handbook of Theory and Applications, chap. 9, L. Matyas and P. Sevestre eds., Kluwer
Academic Publishers, Dordrecht, 196-205, 1992.
B.H. Baltagi, Panel data, Journal of Econometrics, 68, 1-268, 1995.
B.H. Baltagi, S.H. Song and B.C. Jung, The Unbalanced Nested Error Component Regression Model, Journal of Econometrics, 101, 357-381, 2001.
R. Blundell and S. Bond, GMM estimation with persistent panel data: An
application to production functions, IFS working paper W99/4, 1999.
R. Blundell and S. Bond, Initial Conditions and Moment Restrictions in Dynamic Panel Data Models, Journal of Econometrics, 87, 115-143, 1998.
A. Börsch-Supan and V. Hajivassiliou, Smooth unbiased multivariate probability simulators for maximum likelihood estimation of limited dependent variables
models, Cowles Foundation paper 960, Yale University, 1990.
T.S. Breusch, G.E. Mizon and P. Schmidt, Efficient Estimation Using Panel
Data, Econometrica, 57(3), 695-700, 1989.
G. Chamberlain, Asymptotic Efficiency in Estimation with Conditional Moment Restrictions, Journal of Econometrics, 34, 305-334, 1987.
G. Chamberlain, Panel data, in Handbook of Econometrics, pp. 1247-1318, Z.
Griliches and M. Intriligator eds., North-Holland, Amsterdam, 1984.
G. Chamberlain, Comment: Sequential Moment Restrictions in Panel Data,
Journal of Business and Economic Statistics, 10, 20-26, 1992.
G. Chamberlain, Multivariate regression models for panel data, Journal of
Econometrics, 18, 5-46, 1982.
E. Charlier, B. Melenberg and A. van Soest, Estimation of a censored regression panel data model using conditional moment restrictions efficiently, Journal of
Econometrics, 95, 25-56, 2000.
C. Cornwell and P. Rupert, Efficient Estimation with Panel Data: An Empirical Comparison of Instrumental Variables Estimators, Journal of Applied Econometrics, 3, 149-155, 1988.
B. Crépon, F. Kramarz and A. Trognon, Parameters of Interest, Nuisance Parameters and Orthogonality Conditions. An Application to Autoregressive Error
Component Models, Journal of Econometrics, 82, 135-156, 1997.
C. Cornwell, P. Schmidt and D. Wyhowski, Simultaneous equations and panel
data, Journal of Econometrics, 51, 151-181, 1992.
G. Dionne, R. Gagné and C. Vanasse, Inferring technological parameters from
incomplete panel data, Journal of Econometrics, 87, 303-327, 1998.
J. Dolado, Optimal instrumental variable estimator of the AR parameter of an
ARMA(1,1) process, Econometric Theory, 6, 117-119.
B. Dormont, Introduction à l'Économétrie des Données de Panel, Éditions du
Centre National de la Recherche Scientifique, Paris, 1989.
E. Fix and J.L. Hodges, Discriminatory analysis, nonparametric estimation:
consistent properties, Report No 4, USAF School of Aviation Medicine, Randolph
Field, Texas, 1951.
J. Geweke, Bayesian inference in econometric models using Monte Carlo integration, Econometrica, 57, 1317-1339, 1989.
S. Girma, A quasi-differencing approach to dynamic modelling from a time series of independent cross-sections, Journal of Econometrics, 365-383, 2000.
R. Hall, Stochastic implications of the life cycle-permanent income hypothesis,
Journal of Political Economy, 86, 971-987, 1978.
B.E. Hansen, Threshold Effects in Non-Dynamic Panels: Estimation, Testing, and Inference, Journal of Econometrics, 93, 345-368, 1999.
L.P. Hansen, Large sample properties of generalized method of moments estimators, Econometrica, 50, 1029-1054, 1982.
L.P. Hansen, A method of calculating bounds on the asymptotic covariance
matrices of generalized method of moments estimators, Journal of Econometrics,
30, 203-238, 1985.
L.P. Hansen and T.J. Sargent, Instrumental variables procedures for estimating
linear rational expectations models, Journal of Monetary Economics, 9, 263-296,
1982.
L.P. Hansen and K.J. Singleton, Generalized instrumental variable estimation
of nonlinear rational expectations models, Econometrica, 50, 1269-1286, 1982.
L.P. Hansen, J.C. Heaton and A. Yaron, Finite-sample properties of some alternative GMM estimators, Journal of Business and Economic Statistics, 14, 262-280, 1996.
W. Härdle and J.S. Marron, Optimal bandwidth selection in nonparametric regression function estimation, Annals of Statistics, 13, 1465-1481, 1985.
R.D.F. Harris and E. Tzavalis, Inference for unit roots in dynamic panels where the time dimension is fixed, Journal of Econometrics, 91, 201-226, 1999.
Intertemporal Factor Structure, unpublished manuscript, Cornell University, 1980.
E. Kyriazidou, Estimation of a panel data sample selection model, Econometrica, 65, 1335-1364, 1997.
Y.H. Lee and P. Schmidt, A Production Frontier Model with Flexible Temporal Variation in Technical Inefficiency, in The Measurement of Productive Efficiency: Techniques and Applications, Oxford University Press, 1993.
L.A. Lillard and Y. Weiss, Components of Variation in Panel Earnings Data: American Scientists 1960-1970, Econometrica, 47, 437-454, 1979.
R. Lucas, Econometric policy evaluation: A critique, in The Phillips curve and
labor markets, K. Brunner (Ed.), Vol. 1, North-Holland, 1976.
Y.P. Mack, Local properties of k-NN regression estimates, SIAM Journal on Algebraic and Discrete Methods, 2, 311-323, 1981.
L. Mátyás and P. Sevestre, The Econometrics of Panel Data. Handbook of
Theory and Applications, Kluwer Academic Publishers, 1992.
P. Mazodier and A. Trognon, Heteroskedasticity and stratification in error components models, Annales de l'INSEE, 30-31, 451-482, 1978.
C. Meghir and F. Windmeijer, Moment Conditions for Dynamic Panel Data Models with Multiplicative Individual Effects in the Conditional Variance, IFS Working Paper Series No. W97/21, 1997.
R. Moffitt, Identification and estimation of dynamic models with a time series of repeated cross-sections, Journal of Econometrics, 59, 99-123, 1993.
M. Nerlove, A note on error components models, Econometrica, 39, 383-396,
1971.
W.K. Newey, Efficient estimation of models with conditional moment restrictions, in Handbook of Statistics, C.R. Rao and H.D. Vinod (Eds.), Vol. 11, Elsevier
Science Publishers, 1993.
W.K. Newey, Efficient instrumental variables estimation of nonlinear models,
Econometrica, 58, 809-837, 1990.
W.K. Newey and K.D. West, Automatic lag selection in covariance matrix estimation, Review of Economic Studies, 61, 631-653, 1994.
W.K. Newey and K.D. West, Hypothesis testing with efficient method of moments estimation, International Economic Review, 28, 777-787, 1987.
W.K. Newey and K.D. West, A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix, Econometrica, 55, 703-708, 1987.
P. Schmidt, S.C. Ahn and D. Wyhowski, Comment: Sequential Moment Restrictions in Panel Data, Journal of Business and Economic Statistics, 10, 10-14, 1992.
C.J. Stone, Consistent nonparametric regression, Annals of Statistics, 5, 595-645, 1977.
P.A.V.B. Swamy and S.S. Arora, The exact finite sample properties of the estimators of coefficients in the error components regression models, Econometrica, 40, 261-275, 1972.
M. Verbeek and T.E. Nijman, Testing for selectivity bias in panel data models,
International Economic Review, 33, 681-703, 1992.
M. Verbeek and T.E. Nijman, Minimum MSE estimation of a regression model with fixed effects from a series of cross-sections, Journal of Econometrics, 59, 125-136, 1993.
T.D. Wallace and A. Hussain, The use of error components models in combining cross-section and time-series data, Econometrica, 37, 55-72, 1969.
T.J. Wansbeek and A. Kapteyn, Estimation of the error components model
with incomplete panels, Journal of Econometrics, 41, 341-361, 1989.
H. White, A heteroscedasticity consistent covariance matrix estimator and a
direct test for heteroscedasticity, Econometrica, 48, 817-838, 1980.
H. White, Asymptotic theory for econometricians, Academic Press, Orlando,
1984.
J.M. Wooldridge, A framework for estimating dynamic, unobserved effects panel data models with possible feedback to future explanatory variables, Economics Letters, 68, 245-250, 2000.