
Solutions (mostly for odd-numbered exercises)

© 2005 A. Colin Cameron and Pravin K. Trivedi


"Microeconometrics: Methods and Applications"

1. Chapter 1: Introduction

No exercises.

2. Chapter 2: Causal and Noncausal Models

No exercises.

3. Chapter 3: Microeconomic Data Structures

No exercises.

4. Chapter 4: Linear Models

4-1 (a) For the diagonal entries $i = j$ and $\mathrm{E}[u_i^2] = \sigma^2$.

For the first off-diagonal $j = i-1$ or $j = i+1$, so $|i-j| = 1$ and $\mathrm{E}[u_i u_j] = \rho\sigma^2$.

Otherwise $|i-j| > 1$ and $\mathrm{E}[u_i u_j] = 0$.

(b) $\hat\beta_{OLS}$ is asymptotically normal with mean $\beta_0$ and asymptotic variance matrix
$$\mathrm{V}[\hat\beta_{OLS}] = (\mathbf{X}'\mathbf{X})^{-1}\,\mathbf{X}'\Omega\mathbf{X}\,(\mathbf{X}'\mathbf{X})^{-1},$$
where
$$\Omega = \begin{bmatrix}
\sigma^2 & \rho\sigma^2 & 0 & \cdots & 0 \\
\rho\sigma^2 & \sigma^2 & \ddots & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \sigma^2 & \rho\sigma^2 \\
0 & \cdots & 0 & \rho\sigma^2 & \sigma^2
\end{bmatrix}.$$

(c) This example is a simple departure from the simplest case of $\Omega = \sigma^2\mathbf{I}$. Here $\Omega$ depends on just two parameters and hence can be consistently estimated as $N \to \infty$. So we use
$$\hat{\mathrm{V}}[\hat\beta_{OLS}] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\hat\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1},$$
where
$$\hat\Omega = \begin{bmatrix}
\hat\sigma^2 & \widehat{\rho\sigma^2} & 0 & \cdots & 0 \\
\widehat{\rho\sigma^2} & \hat\sigma^2 & \ddots & & \vdots \\
0 & \ddots & \ddots & \ddots & 0 \\
\vdots & & \ddots & \hat\sigma^2 & \widehat{\rho\sigma^2} \\
0 & \cdots & 0 & \widehat{\rho\sigma^2} & \hat\sigma^2
\end{bmatrix}$$
and $\hat\Omega \stackrel{p}{\to} \Omega$ if $\hat\sigma^2 \stackrel{p}{\to} \sigma^2$ and $\widehat{\rho\sigma^2} \stackrel{p}{\to} \rho\sigma^2$ (or $\hat\rho \stackrel{p}{\to} \rho$).

For $\sigma^2 = \mathrm{E}[u_i^2]$ the obvious estimate is $\hat\sigma^2 = N^{-1}\sum_{i=1}^N \hat u_i^2$, where $\hat u_i = y_i - \mathbf{x}_i'\hat\beta$.

For $\rho\sigma^2 = \mathrm{E}[u_i u_{i-1}]$ we can directly use $\widehat{\rho\sigma^2} = N^{-1}\sum_{i=2}^N \hat u_i \hat u_{i-1}$, which is consistent.

Or use $\rho = \mathrm{E}[u_i u_{i-1}]/\mathrm{E}[u_i^2]$, consistently estimated by $\hat\rho = \big(N^{-1}\sum_{i=2}^N \hat u_i \hat u_{i-1}\big)\big/\big(N^{-1}\sum_{i=1}^N \hat u_i^2\big)$, and hence $\widehat{\rho\sigma^2} = \hat\rho\hat\sigma^2$.
To answer (d) and (e) it is helpful to use summation notation:
$$\hat{\mathrm{V}}[\hat\beta_{OLS}] = \Big[\sum_{i=1}^N \mathbf{x}_i\mathbf{x}_i'\Big]^{-1}\Big[\hat\sigma^2\sum_{i=1}^N \mathbf{x}_i\mathbf{x}_i' + 2\,\widehat{\rho\sigma^2}\sum_{i=2}^N \mathbf{x}_i\mathbf{x}_{i-1}'\Big]\Big[\sum_{i=1}^N \mathbf{x}_i\mathbf{x}_i'\Big]^{-1}.$$

(d) No. The usual OLS output estimate $\hat\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$ is inconsistent as it ignores the off-diagonal terms and hence the second term above.
(e) No. The White heteroskedasticity-robust estimate is inconsistent as it also ignores the off-diagonal terms and hence the second term above.
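The estimator built in (c)-(e) can be sketched numerically. The data below are simulated stand-ins (nothing from the text); the point is only the construction of $\mathbf{X}'\hat\Omega\mathbf{X}$ from the diagonal and first off-diagonal residual moments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed illustrative data: intercept plus one regressor, MA(1)-type errors
# so that only the first off-diagonal of Omega is nonzero.
N = 200
X = np.column_stack([np.ones(N), rng.normal(size=N)])
e = rng.normal(size=N + 1)
u = e[1:] + 0.5 * e[:-1]                 # Cov[u_i, u_{i-1}] != 0
y = X @ np.array([1.0, 2.0]) + u

b = np.linalg.solve(X.T @ X, X.T @ y)    # OLS
uhat = y - X @ b

sig2_hat = np.mean(uhat**2)              # estimate of sigma^2
rhosig2_hat = np.mean(uhat[1:] * uhat[:-1])  # estimate of rho*sigma^2

# X' Omega_hat X using only the diagonal and first off-diagonal bands
XOX = sig2_hat * (X.T @ X)
XOX += rhosig2_hat * (X[1:].T @ X[:-1] + X[:-1].T @ X[1:])

XXinv = np.linalg.inv(X.T @ X)
V_hat = XXinv @ XOX @ XXinv              # sandwich estimate of V[b_OLS]
```

The two band terms are added in both orders so that the resulting matrix is symmetric, exactly as $\mathbf{X}'\hat\Omega\mathbf{X}$ is for a tridiagonal $\hat\Omega$.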

4-3 (a) The error $u$ is conditionally heteroskedastic, since $\mathrm{V}[u|x] = \mathrm{V}[x\varepsilon|x] = x^2\mathrm{V}[\varepsilon|x] = x^2\mathrm{V}[\varepsilon] = x^2 \times 1 = x^2$, which depends on the regressor $x$.

(b) For scalar regressor $N^{-1}\mathbf{X}'\mathbf{X} = N^{-1}\sum_i x_i^2$.
Here the $x_i^2$ are iid with mean 1 (since $\mathrm{E}[x_i^2] = \mathrm{E}[(x_i - \mathrm{E}[x_i])^2] = \mathrm{V}[x_i] = 1$ using $\mathrm{E}[x_i] = 0$).
Applying a LLN (here Kolmogorov), $N^{-1}\mathbf{X}'\mathbf{X} = N^{-1}\sum_i x_i^2 \stackrel{p}{\to} \mathrm{E}[x_i^2] = 1$, so $M_{xx} = 1$.
(c) $\mathrm{V}[u] = \mathrm{V}[x\varepsilon] = \mathrm{E}[(x\varepsilon)^2] - (\mathrm{E}[x\varepsilon])^2 = \mathrm{E}[x^2]\mathrm{E}[\varepsilon^2] - (\mathrm{E}[x]\mathrm{E}[\varepsilon])^2 = \mathrm{V}[x]\mathrm{V}[\varepsilon] - 0 \times 0 = 1 \times 1 = 1$, where we use independence of $x$ and $\varepsilon$ and the fact that here $\mathrm{E}[x] = 0$ and $\mathrm{E}[\varepsilon] = 0$.
(d) For scalar regressor and diagonal $\Omega$,
$$N^{-1}\mathbf{X}'\Omega\mathbf{X} = \frac{1}{N}\sum_{i=1}^N \sigma_i^2 x_i^2 = \frac{1}{N}\sum_{i=1}^N x_i^2 x_i^2 = \frac{1}{N}\sum_{i=1}^N x_i^4,$$
using $\sigma_i^2 = x_i^2$ from (a).

Here the $x_i^4$ are iid with mean 3 (since $\mathrm{E}[x_i^4] = \mathrm{E}[(x_i - \mathrm{E}[x_i])^4] = 3$ using $\mathrm{E}[x_i] = 0$ and the fact that the fourth central moment of the normal is $3\sigma^4 = 3 \times 1 = 3$).

Applying a LLN (here Kolmogorov), $N^{-1}\mathbf{X}'\Omega\mathbf{X} = N^{-1}\sum_i x_i^4 \stackrel{p}{\to} \mathrm{E}[x_i^4] = 3$, so $M_{x\Omega x} = 3$.
(e) Default OLS result:
$$\sqrt{N}(\hat\beta_{OLS} - \beta) \stackrel{d}{\to} \mathcal{N}\big[0,\ \sigma^2 M_{xx}^{-1}\big] = \mathcal{N}\big[0,\ 1 \times (1)^{-1}\big] = \mathcal{N}[0, 1].$$

(f) White OLS result:
$$\sqrt{N}(\hat\beta_{OLS} - \beta) \stackrel{d}{\to} \mathcal{N}\big[0,\ M_{xx}^{-1} M_{x\Omega x} M_{xx}^{-1}\big] = \mathcal{N}\big[0,\ (1)^{-1} \times 3 \times (1)^{-1}\big] = \mathcal{N}[0, 3].$$

(g) Yes. We expect that failure to control for conditional heteroskedasticity, when one should control for it, will lead to inconsistent standard errors, though a priori the direction of the inconsistency is not known. That is the case here.
What is unusual compared to many applications is that there is a big difference in this example: the true variance is three times the default estimate and the true standard errors are $\sqrt{3}$ times larger.
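A small Monte Carlo, with assumed simulation settings ($\beta = 0$, $N = 500$, 2000 replications), illustrates the factor of three: the simulated variance of $\sqrt{N}(\hat\beta_{OLS} - \beta)$ is close to 3, not to the default value of 1.

```python
import numpy as np

rng = np.random.default_rng(42)
beta, N, reps = 0.0, 500, 2000          # assumed settings, not from the text

stats = np.empty(reps)
for r in range(reps):
    x = rng.normal(size=N)
    u = x * rng.normal(size=N)          # u = x*eps: conditionally heteroskedastic
    y = beta * x + u
    b = (x @ y) / (x @ x)               # OLS without intercept
    stats[r] = np.sqrt(N) * (b - beta)

# Default theory: variance sigma^2/Mxx = 1; White theory: 3.
print(stats.var())                      # close to 3
```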

4-5 (a) Differentiate:
$$\frac{\partial Q(\beta)}{\partial\beta} = \frac{\partial \mathbf{u}'\mathbf{W}\mathbf{u}}{\partial\beta}
= \frac{\partial \mathbf{u}'}{\partial\beta}\,\frac{\partial \mathbf{u}'\mathbf{W}\mathbf{u}}{\partial\mathbf{u}} \quad\text{by the chain rule for matrix differentiation}$$
$$= -\mathbf{X}' \times 2\mathbf{W}\mathbf{u} \quad\text{assuming } \mathbf{W} \text{ is symmetric}$$
$$= -2\mathbf{X}'\mathbf{W}\mathbf{u}.$$
Set to zero:
$$-2\mathbf{X}'\mathbf{W}\mathbf{u} = \mathbf{0} \ \Rightarrow\ \mathbf{X}'\mathbf{W}(\mathbf{y} - \mathbf{X}\beta) = \mathbf{0} \ \Rightarrow\ \mathbf{X}'\mathbf{W}\mathbf{y} = \mathbf{X}'\mathbf{W}\mathbf{X}\beta \ \Rightarrow\ \hat\beta = (\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}\mathbf{X}'\mathbf{W}\mathbf{y},$$
where we need to assume the inverse exists.
Here, for $\mathbf{W} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ with $\mathbf{Z}$ of rank $r \geq K = \mathrm{rank}(\mathbf{X})$, $\mathbf{X}'\mathbf{Z}$ and $\mathbf{Z}'\mathbf{Z}$ are of rank $K$, so $\mathbf{X}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{X}$ is of full rank $K$.
(b) For $\mathbf{W} = \mathbf{I}$ we have $\hat\beta = (\mathbf{X}'\mathbf{I}\mathbf{X})^{-1}\mathbf{X}'\mathbf{I}\mathbf{y} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{y}$, which is OLS.
Note that $(\mathbf{X}'\mathbf{X})^{-1}$ exists if the $N \times K$ matrix $\mathbf{X}$ is of full rank $K$.
(c) For $\mathbf{W} = \Omega^{-1}$ we have $\hat\beta = (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}\mathbf{X}'\Omega^{-1}\mathbf{y}$, which is GLS (see (4.28)).
(d) For $\mathbf{W} = \mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'$ we have $\hat\beta = (\mathbf{X}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{Z}(\mathbf{Z}'\mathbf{Z})^{-1}\mathbf{Z}'\mathbf{y}$, which is 2SLS (see (4.53)).
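The reductions in (b) and (d) are easy to check numerically; the matrices below are arbitrary stand-ins, chosen only to exercise the algebra of $\hat\beta = (\mathbf{X}'\mathbf{W}\mathbf{X})^{-1}\mathbf{X}'\mathbf{W}\mathbf{y}$:

```python
import numpy as np

rng = np.random.default_rng(1)
N, K, r = 100, 2, 3

# Hypothetical X, Z, y (not from the text)
X = rng.normal(size=(N, K))
Z = rng.normal(size=(N, r))
y = rng.normal(size=N)

def b_W(W):
    """General weighted LS estimator (X'WX)^{-1} X'Wy."""
    return np.linalg.solve(X.T @ W @ X, X.T @ W @ y)

# W = I  ->  OLS
b_ols = np.linalg.solve(X.T @ X, X.T @ y)
assert np.allclose(b_W(np.eye(N)), b_ols)

# W = Z(Z'Z)^{-1}Z'  ->  2SLS
Pz = Z @ np.linalg.solve(Z.T @ Z, Z.T)
b_2sls = np.linalg.solve(X.T @ Pz @ X, X.T @ Pz @ y)
assert np.allclose(b_W(Pz), b_2sls)
```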

4-7 Given the information, $\mathrm{E}[x] = 0$ and $\mathrm{E}[z] = 0$ and
$$\mathrm{V}[x] = \mathrm{E}[x^2] = \mathrm{E}[(\gamma u + \varepsilon)^2] = \gamma^2\sigma_u^2 + \sigma_\varepsilon^2$$
$$\mathrm{V}[z] = \mathrm{E}[z^2] = \mathrm{E}[(\delta\varepsilon + v)^2] = \delta^2\sigma_\varepsilon^2 + \sigma_v^2$$
$$\mathrm{Cov}[x, z] = \mathrm{E}[xz] = \mathrm{E}[(\gamma u + \varepsilon)(\delta\varepsilon + v)] = \delta\sigma_\varepsilon^2$$
$$\mathrm{Cov}[x, u] = \mathrm{E}[xu] = \mathrm{E}[(\gamma u + \varepsilon)u] = \gamma\sigma_u^2.$$

(a) For regression of $y$ on $x$ we have $\hat\beta_{OLS} = \big(\sum_i x_i^2\big)^{-1}\sum_i x_i y_i$ and as usual
$$\mathrm{plim}(\hat\beta_{OLS} - \beta) = \Big(\mathrm{plim}\,N^{-1}\sum_i x_i^2\Big)^{-1}\mathrm{plim}\,N^{-1}\sum_i x_i u_i = \big(\mathrm{E}[x^2]\big)^{-1}\mathrm{E}[xu] \quad\text{as here the data are iid}$$
$$= (\gamma^2\sigma_u^2 + \sigma_\varepsilon^2)^{-1}\gamma\sigma_u^2.$$

(b) The squared correlation coefficient is
$$r_{XZ}^2 = \frac{\mathrm{Cov}[x,z]^2}{\mathrm{V}[x]\mathrm{V}[z]} = \frac{[\delta\sigma_\varepsilon^2]^2}{(\gamma^2\sigma_u^2 + \sigma_\varepsilon^2)(\delta^2\sigma_\varepsilon^2 + \sigma_v^2)}.$$

(c) For single regressor and single instrument
$$\hat\beta_{IV} = \Big(\sum_i z_i x_i\Big)^{-1}\sum_i z_i y_i = \Big(\sum_i z_i x_i\Big)^{-1}\sum_i z_i(\beta x_i + u_i) = \beta + \Big(\sum_i z_i x_i\Big)^{-1}\sum_i z_i u_i$$
$$= \beta + \Big(\sum_i z_i(\gamma u_i + \varepsilon_i)\Big)^{-1}\sum_i z_i u_i = \beta + \Big(\gamma\sum_i z_i u_i + \sum_i z_i\varepsilon_i\Big)^{-1}\sum_i z_i u_i,$$
so
$$\hat\beta_{IV} - \beta = m_{zu}/(\gamma m_{zu} + m_{z\varepsilon}),$$
where $m_{zu} = N^{-1}\sum_i z_i u_i$ and $m_{z\varepsilon} = N^{-1}\sum_i z_i\varepsilon_i$.

By a LLN, $m_{zu} \stackrel{p}{\to} \mathrm{E}[z_i u_i] = \mathrm{E}[(\delta\varepsilon_i + v_i)u_i] = 0$, since $\varepsilon$, $u$ and $v$ are independent with zero means.

By a LLN, $m_{z\varepsilon} \stackrel{p}{\to} \mathrm{E}[z_i\varepsilon_i] = \mathrm{E}[(\delta\varepsilon_i + v_i)\varepsilon_i] = \delta\mathrm{E}[\varepsilon_i^2] = \delta\sigma_\varepsilon^2.$

Hence
$$\hat\beta_{IV} - \beta \stackrel{p}{\to} 0/(\gamma \times 0 + \delta\sigma_\varepsilon^2) = 0.$$

(d) If $m_{zu} = -m_{z\varepsilon}/\gamma$ then $\gamma m_{zu} = -m_{z\varepsilon}$, so $\gamma m_{zu} + m_{z\varepsilon} = 0$ and $\hat\beta_{IV} - \beta = m_{zu}/0$, which is not defined.
(e) First
$$\hat\beta_{IV} - \beta = m_{zu}/(\gamma m_{zu} + m_{z\varepsilon}) = 1/(\gamma + m_{z\varepsilon}/m_{zu}).$$
If $m_{zu}$ is large relative to $m_{z\varepsilon}/\gamma$ then $\gamma$ is large relative to $m_{z\varepsilon}/m_{zu}$, so $\gamma + m_{z\varepsilon}/m_{zu}$ is close to $\gamma$ and $1/(\gamma + m_{z\varepsilon}/m_{zu})$ is close to $1/\gamma$.
(f) Given the definition of $r_{XZ}^2$ in part (b), $r_{XZ}^2$ is smaller the smaller is $\delta$, the smaller is $\sigma_\varepsilon^2$, and the larger is $\gamma$. So in the weak instruments case with small correlation between $x$ and $z$ (so $r_{XZ}^2$ is small), $\hat\beta_{IV} - \beta$ is likely to concentrate around $1/\gamma$ rather than 0, and there is "finite sample bias" in $\hat\beta_{IV}$.
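A sketch of this Nelson-Startz effect, using the parameter values of Exercise 4-17 ($\beta = \gamma = 1$, $\delta = 0.01$, unit variances, $N = 100$): the OLS average sits near 1.5, while the IV estimates, despite a probability limit of $\beta = 1$, pile up between $\beta$ and $\beta + 1/\gamma = 2$.

```python
import numpy as np

rng = np.random.default_rng(7)
beta, gamma, delta = 1.0, 1.0, 0.01   # values from Exercise 4-17
N, reps = 100, 1000

b_ols = np.empty(reps)
b_iv = np.empty(reps)
for r in range(reps):
    u = rng.normal(size=N)
    eps = rng.normal(size=N)
    v = rng.normal(size=N)
    x = gamma * u + eps
    z = delta * eps + v               # very weak instrument
    y = beta * x + u
    b_ols[r] = (x @ y) / (x @ x)      # OLS, no intercept
    b_iv[r] = (z @ y) / (z @ x)       # IV with instrument z

print(b_ols.mean())                   # near beta + gamma*s_u^2/(gamma^2*s_u^2 + s_e^2) = 1.5
print(np.median(b_iv))                # pulled toward beta + 1/gamma = 2, not at beta = 1
```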

4-11 (a) The true variance matrix of OLS is
$$\mathrm{V}[\hat\beta_{OLS}] = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\Omega\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1} = (\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\sigma^2(\mathbf{I}_N + \mathbf{A}\mathbf{A}')\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$$
$$= \sigma^2(\mathbf{X}'\mathbf{X})^{-1} + \sigma^2(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{A}\mathbf{A}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}.$$

(b) This equals or exceeds $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$ since $(\mathbf{X}'\mathbf{X})^{-1}\mathbf{X}'\mathbf{A}\mathbf{A}'\mathbf{X}(\mathbf{X}'\mathbf{X})^{-1}$ is positive semidefinite. So the default OLS variance matrix, and hence standard errors, will generally understate the true standard errors (the exception being if $\mathbf{X}'\mathbf{A}\mathbf{A}'\mathbf{X} = \mathbf{0}$).
(c) For GLS
$$\mathrm{V}[\hat\beta_{GLS}] = (\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1} = \big(\mathbf{X}'[\sigma^2(\mathbf{I} + \mathbf{A}\mathbf{A}')]^{-1}\mathbf{X}\big)^{-1} = \sigma^2\big(\mathbf{X}'[\mathbf{I} + \mathbf{A}\mathbf{A}']^{-1}\mathbf{X}\big)^{-1}$$
$$= \sigma^2\big(\mathbf{X}'[\mathbf{I}_N - \mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}']\mathbf{X}\big)^{-1} = \sigma^2\big(\mathbf{X}'\mathbf{X} - \mathbf{X}'\mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'\mathbf{X}\big)^{-1}.$$
(d) $\sigma^2(\mathbf{X}'\mathbf{X})^{-1} \leq \mathrm{V}[\hat\beta_{GLS}]$ since
$$\mathbf{X}'\mathbf{X} \geq \mathbf{X}'\mathbf{X} - \mathbf{X}'\mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'\mathbf{X} \quad\text{in the matrix sense}$$
$$\Rightarrow\ (\mathbf{X}'\mathbf{X})^{-1} \leq \big(\mathbf{X}'\mathbf{X} - \mathbf{X}'\mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'\mathbf{X}\big)^{-1} \quad\text{in the matrix sense}.$$
If we ran OLS and GLS and used the incorrect default OLS standard errors, we would obtain the puzzling result that OLS was more efficient than GLS. But this is just an artifact of using the wrong estimated standard errors for OLS.
(e) GLS requires $(\mathbf{X}'\Omega^{-1}\mathbf{X})^{-1}$, which from (c) requires $(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}$, the inverse of an $m \times m$ matrix.
[We also need $(\mathbf{X}'\mathbf{X} - \mathbf{X}'\mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'\mathbf{X})^{-1}$, but this is a smaller $k \times k$ matrix given $k < m < N$.]
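The matrix identity used in (c), $(\mathbf{I}_N + \mathbf{A}\mathbf{A}')^{-1} = \mathbf{I}_N - \mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'$, can be verified numerically for a small assumed $\mathbf{A}$:

```python
import numpy as np

rng = np.random.default_rng(3)
N, m = 8, 3
A = rng.normal(size=(N, m))   # arbitrary N x m matrix for the check

lhs = np.linalg.inv(np.eye(N) + A @ A.T)   # direct N x N inverse
# Right-hand side needs only an m x m inverse -- the point of part (e)
rhs = np.eye(N) - A @ np.linalg.inv(np.eye(m) + A.T @ A) @ A.T
assert np.allclose(lhs, rhs)
```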

4-13 (a) Here $\beta = [1\ \ 1]'$ and $\lambda = [1\ \ 0]'$.

From the bottom of page 86, the intercept will be $\beta_1 + \lambda_1 F_\varepsilon^{-1}(q) = 1 + 1 \times F_\varepsilon^{-1}(q) = 1 + F_\varepsilon^{-1}(q)$.
The slope will be $\beta_2 + \lambda_2 F_\varepsilon^{-1}(q) = 1 + 0 \times F_\varepsilon^{-1}(q) = 1$.
The slope should be 1 at all quantiles.
The intercept varies with $F_\varepsilon^{-1}(q)$. Here $F_\varepsilon^{-1}(q)$ takes values $-2.56$, $-1.68$, $-1.05$, $-0.51$, $0.0$, $0.51$, $1.05$, $1.68$ and $2.56$ for $q = 0.1, 0.2, \ldots, 0.9$. It follows that the intercept takes values $-1.56$, $-0.68$, $-0.05$, $0.49$, $1.0$, $1.51$, $2.05$, $2.68$ and $3.56$.
[For example, $F_\varepsilon^{-1}(0.9)$ is the $\varepsilon^*$ such that $\Pr[\varepsilon \leq \varepsilon^*] = 0.9$ for $\varepsilon \sim N[0, 4]$, or equivalently such that $\Pr[z \leq \varepsilon^*/2] = 0.9$ for $z \sim N[0, 1]$. Then $\varepsilon^*/2 = 1.28$, so $\varepsilon^* = 2.56$.]
(b) The answers accord quite closely with theory, as the slope and intercepts are quite precisely estimated, with slope coefficient standard errors less than 0.01 and intercept coefficient standard errors less than 0.04.
(c) Now both the intercept and slope coefficients vary with the quantile. Both intercept and slope coefficients increase with the quantile, and for $q = 0.5$ are within two standard errors of the true values of 1 and 1.
(d) Compared to (b), it is now the intercept that is constant and the slope that varies across quantiles.
This is predicted from theory similar to that in part (a). Now $\beta = [1\ \ 1]'$ and $\lambda = [0\ \ 1]'$.
From the bottom of page 86, the intercept will be $\beta_1 + \lambda_1 F_\varepsilon^{-1}(q) = 1 + 0 \times F_\varepsilon^{-1}(q) = 1$ and the slope will be $\beta_2 + \lambda_2 F_\varepsilon^{-1}(q) = 1 + 1 \times F_\varepsilon^{-1}(q) = 1 + F_\varepsilon^{-1}(q)$.
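The theoretical intercepts in (a) can be computed directly; the standard library's `NormalDist` supplies $F_\varepsilon^{-1}$ for $\varepsilon \sim N[0, 4]$ (standard deviation 2):

```python
from statistics import NormalDist

F_inv = NormalDist(mu=0.0, sigma=2.0).inv_cdf   # eps ~ N[0, 4], so sd = 2

qs = [round(0.1 * k, 1) for k in range(1, 10)]
intercepts = [1 + F_inv(q) for q in qs]          # slope is 1 at every quantile
for q, a in zip(qs, intercepts):
    print(f"q={q}: intercept {a:.2f}")
```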

4-15 (a) The OLS slope estimate and standard error are 0.05209 and 0.00291, and the IV estimates are 0.18806 and 0.02614. The IV slope estimate is much larger and indicates a very large return to schooling. There is a loss in precision, with the IV standard error ten times larger, but the coefficient is still statistically significant.
(b) OLS of wage76 on an intercept and col4 gives slope coefficient 0.1559089 and OLS regression of grade76 on an intercept and col4 gives slope coefficient 0.829019. From (4.46), $dy/dx = (dy/dz)/(dx/dz) = 0.1559089/0.829019 = 0.18806$. This is the same as the IV estimate in part (a).
(c) We obtain Wald $= (1.706234 - 1.550325)/(13.52703 - 12.69801) = 0.18806$. This is the same as the IV estimate in part (a).
(d) From OLS regression of grade76 on col4, $R^2 = 0.0208$ and $F = 60.37$. This does not suggest a weak instruments problem, except that the precision of IV will be much lower than that of OLS due to the relatively low $R^2$.
(e) Including the additional regressors, the OLS slope estimate and standard error are 0.03304 and 0.00311, and the IV estimates are 0.09521 and 0.04932. The IV slope estimate is again much larger and indicates a very large return to schooling. There is a loss in precision, with the IV standard error now roughly sixteen times larger, but the coefficient is still statistically significant using a one-tail test at five percent.
Now OLS of wage76 on an intercept, col4 and the other regressors gives slope coefficient 0.1559089, and OLS regression of grade76 on an intercept, col4 and the other regressors gives slope coefficient 0.829019. From (4.46), $dy/dx = (dy/dz)/(dx/dz) = 0.1559089/0.829019 = 0.18806$. This is the same as the IV estimate in part (a).

4-17 (a) The average of $\hat\beta_{OLS}$ over 1000 simulations was 1.502518.
This is close to the theoretical value of 1.5: $\mathrm{plim}(\hat\beta_{OLS} - \beta) = \gamma\sigma_u^2/(\gamma^2\sigma_u^2 + \sigma_\varepsilon^2) = (1 \times 1)/(1 \times 1 + 1) = 1/2$ and here $\beta = 1$.
(b) The average of $\hat\beta_{IV}$ over 1000 simulations was 1.08551.
This is close to the theoretical value of 1: $\mathrm{plim}(\hat\beta_{IV} - \beta) = 0$ and here $\beta = 1$.
(c) The observed values of $\hat\beta_{IV}$ over 1000 simulations were skewed to the right of $\beta = 1$, with lower quartile 0.964185, median 1.424028 and upper quartile 1.7802471. Exercise 4-7 part (e) suggested concentration of $\hat\beta_{IV} - \beta$ around $1/\gamma = 1$, or concentration of $\hat\beta_{IV}$ around $\beta + 1 = 2$ since here $\beta = 1$.
(d) The $R^2$ and $F$ statistics across simulations from OLS regression (with intercept) of $z$ on $x$ do indicate a likely weak instruments problem.
Over 1000 simulations, the average $R^2$ was 0.0148093 and the average $F$ was 1.531256.
[Aside: From Exercise 4-7 (b), $r_{XZ}^2 = [\delta\sigma_\varepsilon^2]^2/[(\gamma^2\sigma_u^2 + \sigma_\varepsilon^2)(\delta^2\sigma_\varepsilon^2 + \sigma_v^2)] = [0.01]^2/[(1+1)(0.01^2+1)] = 0.00005$.]
5. Chapter 5: Extremum, ML, NLS

5-1 First note that
$$\frac{\partial \hat{\mathrm{E}}[y|x]}{\partial x} = \frac{\partial}{\partial x}\,\exp(1 + 0.01x)[1 + \exp(1 + 0.01x)]^{-1}$$
$$= 0.01\exp(1 + 0.01x)[1 + \exp(1 + 0.01x)]^{-1} - \exp(1 + 0.01x) \times 0.01\exp(1 + 0.01x)[1 + \exp(1 + 0.01x)]^{-2}$$
$$= 0.01\,\frac{\exp(1 + 0.01x)}{[1 + \exp(1 + 0.01x)]^2} \quad\text{upon simplification.}$$

(a) The average marginal effect over all observations:
$$\frac{1}{100}\sum_{i=1}^{100}\frac{\partial \hat{\mathrm{E}}[y|x]}{\partial x}\bigg|_{x=i} = 0.01 \times \frac{1}{100}\sum_{i=1}^{100}\frac{\exp(1 + 0.01i)}{[1 + \exp(1 + 0.01i)]^2} = 0.0014928.$$

(b) The sample mean $\bar x = \frac{1}{100}\sum_{i=1}^{100} i = 50.5$. Then
$$\frac{\partial \hat{\mathrm{E}}[y|x]}{\partial x}\bigg|_{\bar x} = 0.01\,\frac{\exp(1 + 0.01 \times 50.5)}{[1 + \exp(1 + 0.01 \times 50.5)]^2} = 0.0014867.$$

(c) Evaluating at $x = 90$:
$$\frac{\partial \hat{\mathrm{E}}[y|x]}{\partial x}\bigg|_{90} = 0.01\,\frac{\exp(1 + 0.01 \times 90)}{[1 + \exp(1 + 0.01 \times 90)]^2} = 0.0011318.$$

(d) Using the finite-difference method:
$$\frac{\Delta \hat{\mathrm{E}}[y|x]}{\Delta x}\bigg|_{90} = \frac{\exp(1 + 0.01 \times 91)}{1 + \exp(1 + 0.01 \times 91)} - \frac{\exp(1 + 0.01 \times 90)}{1 + \exp(1 + 0.01 \times 90)} = 0.0011276.$$

Comment: This example is quite linear, leading to the answers in (a) and (b) being close, and similarly for (c) and (d). A more nonlinear function, with greater variation, is obtained using $\hat{\mathrm{E}}[y|x] = \exp(0 + 0.04x)/[1 + \exp(0 + 0.04x)]$ for $x = 1, \ldots, 100$. Then the answers are 0.0026163, 0.0013895, 0.00020268, and 0.00019773.
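The four calculations can be reproduced directly (the helper names below are mine, not from the text):

```python
import math

def mean(x, b0=1.0, b1=0.01):
    """Logistic mean E[y|x] = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    e = math.exp(b0 + b1 * x)
    return e / (1 + e)

def dmean(x, b0=1.0, b1=0.01):
    """Analytical derivative b1 * exp(.) / (1 + exp(.))^2."""
    e = math.exp(b0 + b1 * x)
    return b1 * e / (1 + e) ** 2

ame = sum(dmean(x) for x in range(1, 101)) / 100  # (a) average marginal effect
mem = dmean(50.5)                                 # (b) effect at the sample mean
me90 = dmean(90)                                  # (c) effect at x = 90
fd90 = mean(91) - mean(90)                        # (d) finite difference, 90 -> 91
print(ame, mem, me90, fd90)
```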
5-2 (a) Here
$$\ln f(y) = \ln y - 2\ln\lambda - y/\lambda \quad\text{with } \lambda = \exp(\mathbf{x}'\beta)/2 \text{ and } \ln\lambda = \mathbf{x}'\beta - \ln 2$$
$$= \ln y - 2(\mathbf{x}'\beta - \ln 2) - y/[\exp(\mathbf{x}'\beta)/2] = \ln y - 2\mathbf{x}'\beta + 2\ln 2 - 2y\exp(-\mathbf{x}'\beta),$$
so
$$Q_N(\beta) = \frac{1}{N}\sum_i \ln f(y_i) = \frac{1}{N}\sum_i\{\ln y_i - 2\mathbf{x}_i'\beta + 2\ln 2 - 2y_i\exp(-\mathbf{x}_i'\beta)\}.$$

(b) Now, using $\mathbf{x}$ nonstochastic, we need only take expectations with respect to $y$:
$$Q_0(\beta) = \mathrm{plim}\,Q_N(\beta)$$
$$= \mathrm{plim}\frac{1}{N}\sum_i \ln y_i - \mathrm{plim}\frac{1}{N}\sum_i 2\mathbf{x}_i'\beta + \mathrm{plim}\frac{1}{N}\sum_i 2\ln 2 - \mathrm{plim}\frac{1}{N}\sum_i 2y_i\exp(-\mathbf{x}_i'\beta)$$
$$= \lim\frac{1}{N}\sum_i \mathrm{E}[\ln y_i] - 2\lim\frac{1}{N}\sum_i \mathbf{x}_i'\beta + 2\ln 2 - 2\lim\frac{1}{N}\sum_i \mathrm{E}[y_i]\exp(-\mathbf{x}_i'\beta)$$
$$= \lim\frac{1}{N}\sum_i \mathrm{E}[\ln y_i] - 2\lim\frac{1}{N}\sum_i \mathbf{x}_i'\beta + 2\ln 2 - 2\lim\frac{1}{N}\sum_i \exp(\mathbf{x}_i'\beta_0)\exp(-\mathbf{x}_i'\beta),$$
where the last line uses $\mathrm{E}[y_i] = \exp(\mathbf{x}_i'\beta_0)$ in the dgp and we do not need to evaluate $\mathrm{E}[\ln y_i]$ as the first sum does not involve $\beta$ and will therefore have derivative 0 with respect to $\beta$.
(c) Differentiate with respect to $\beta$ (not $\beta_0$):
$$\frac{\partial Q_0(\beta)}{\partial\beta} = -2\lim\frac{1}{N}\sum_i \mathbf{x}_i + 2\lim\frac{1}{N}\sum_i \exp(\mathbf{x}_i'\beta_0)\exp(-\mathbf{x}_i'\beta)\mathbf{x}_i = \mathbf{0} \quad\text{when } \beta = \beta_0.$$
[Also $\partial^2 Q_0(\beta)/\partial\beta\partial\beta' = -2\lim N^{-1}\sum_i \exp(\mathbf{x}_i'\beta_0)\exp(-\mathbf{x}_i'\beta)\mathbf{x}_i\mathbf{x}_i'$ is negative definite at $\beta_0$, so this is a local maximum.]
Since $\mathrm{plim}\,Q_N(\beta)$ attains a local maximum at $\beta = \beta_0$, conclude that $\hat\beta = \arg\max Q_N(\beta)$ is consistent for $\beta_0$.
(d) Consider the last term. Since $y_i\exp(-\mathbf{x}_i'\beta)$ is not iid, we need to use the Markov SLLN. This requires existence of second moments of $y_i$, which we have assumed.

5-3 (a) Differentiating $Q_N(\beta)$ with respect to $\beta$:
$$\frac{\partial Q_N}{\partial\beta} = \frac{1}{N}\sum_i\{-2\mathbf{x}_i + 2y_i\exp(-\mathbf{x}_i'\beta)\mathbf{x}_i\}$$
$$= \frac{1}{N}\sum_i 2\{y_i\exp(-\mathbf{x}_i'\beta) - 1\}\mathbf{x}_i \quad\text{rearranging}$$
$$= \frac{1}{N}\sum_i 2\,\frac{y_i - \exp(\mathbf{x}_i'\beta)}{\exp(\mathbf{x}_i'\beta)}\,\mathbf{x}_i \quad\text{multiplying by } \frac{\exp(\mathbf{x}_i'\beta)}{\exp(\mathbf{x}_i'\beta)}.$$
(b) Then
$$\lim \mathrm{E}\bigg[\frac{\partial Q_N}{\partial\beta}\bigg|_{\beta_0}\bigg] = \lim\frac{1}{N}\sum_i 2\,\mathrm{E}\bigg[\frac{y_i - \exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\bigg]\mathbf{x}_i = \mathbf{0} \quad\text{if } \mathrm{E}[y_i|\mathbf{x}_i] = \exp(\mathbf{x}_i'\beta_0).$$

So the essential condition is correct specification of $\mathrm{E}[y_i|\mathbf{x}_i]$.


(c) From (a),
$$\sqrt{N}\,\frac{\partial Q_N}{\partial\beta}\bigg|_{\beta_0} = \frac{1}{\sqrt N}\sum_i 2\,\frac{y_i - \exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\,\mathbf{x}_i.$$
Apply a CLT to the average of the term in the sum.
Now $y_i|\mathbf{x}_i$ has mean $\exp(\mathbf{x}_i'\beta_0)$ and variance $(\exp(\mathbf{x}_i'\beta_0))^2/2$.
So $X_i \equiv 2\,\dfrac{y_i - \exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\,\mathbf{x}_i$ has mean $\mathbf{0}$ and variance $4\,\dfrac{(\exp(\mathbf{x}_i'\beta_0))^2/2}{(\exp(\mathbf{x}_i'\beta_0))^2}\,\mathbf{x}_i\mathbf{x}_i' = 2\mathbf{x}_i\mathbf{x}_i'$.

Thus for $Z_N = (\mathrm{V}[\sqrt N\bar X])^{-1/2}(\sqrt N\bar X - \sqrt N\,\mathrm{E}[\bar X]) = \big(\frac{1}{N}\sum_i \mathrm{V}[X_i]\big)^{-1/2}\big(\frac{1}{\sqrt N}\sum_i X_i\big)$,
$$Z_N = \Big(\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1/2}\frac{1}{\sqrt N}\sum_i 2\,\frac{y_i - \exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\,\mathbf{x}_i \stackrel{d}{\to} \mathcal{N}[\mathbf{0}, \mathbf{I}]$$
$$\Rightarrow\ \frac{1}{\sqrt N}\sum_i 2\,\frac{y_i - \exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\,\mathbf{x}_i \stackrel{d}{\to} \mathcal{N}\Big[\mathbf{0},\ \lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big].$$

(d) Here $y_i$ is not iid, so use the Liapounov CLT. This will need a $(2+\delta)$th absolute moment of $y_i$, e.g. a fourth moment of $y_i$.
(e) Differentiating (a) with respect to $\beta'$ yields
$$\frac{\partial^2 Q_N}{\partial\beta\partial\beta'}\bigg|_{\beta_0} = -\frac{1}{N}\sum_i 2\,\frac{\exp(\mathbf{x}_i'\beta_0)}{\exp(\mathbf{x}_i'\beta_0)}\,\mathbf{x}_i\mathbf{x}_i' \stackrel{p}{\to} -\lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'.$$

(f) Combining,
$$\sqrt N(\hat\beta - \beta_0) \stackrel{d}{\to} \mathcal{N}[\mathbf{0},\ \mathbf{A}(\beta_0)^{-1}\mathbf{B}(\beta_0)\mathbf{A}(\beta_0)^{-1}]$$
$$= \mathcal{N}\Big[\mathbf{0},\ \Big(\lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1}\Big(\lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)\Big(\lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1}\Big]$$
$$= \mathcal{N}\Big[\mathbf{0},\ \Big(\lim\frac{1}{N}\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1}\Big].$$
(g) Test $H_0: \beta_j \geq \beta_j^*$ against $H_a: \beta_j < \beta_j^*$ at level 0.05. Here
$$\hat\beta \stackrel{a}{\sim} \mathcal{N}\Big[\beta,\ \Big(\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1}\Big]$$
$$\Rightarrow\ z_j = \frac{\hat\beta_j - \beta_j^*}{s_j} \stackrel{a}{\sim} \mathcal{N}[0, 1], \quad\text{where } s_j \text{ is the square root of the } j\text{th diagonal entry of } \Big(\sum_i 2\mathbf{x}_i\mathbf{x}_i'\Big)^{-1}.$$
Reject $H_0$ at level 0.05 if $z_j < -z_{.05} = -1.645$.

5-5 (a) $t = b_1/\mathrm{se}[b_1] = 5/2 = 2.5$. Since $|2.5| > z_{.05} = 1.645$ we reject $H_0$.
(b) Rewrite as $H_0: \beta_1 - 2\beta_2 = 0$ versus $H_a: \beta_1 - 2\beta_2 \neq 0$.
Use (5.32). Test $H_0: \mathbf{R}\beta = \mathbf{r}$ where $\mathbf{R} = [1\ \ {-2}]$ and $\mathbf{r} = 0$ and $\beta = [\beta_1\ \ \beta_2]'$.
Here $\mathbf{b} = \begin{bmatrix}5\\2\end{bmatrix}$, so $\mathbf{R}\mathbf{b} - \mathbf{r} = [1\ \ {-2}]\begin{bmatrix}5\\2\end{bmatrix} = 1$.
Also $\hat{\mathrm{V}}[\mathbf{b}] = N^{-1}\hat{\mathbf{C}} = \begin{bmatrix}4 & 1\\1 & 1\end{bmatrix}$, using $\mathrm{Cov}[b_1, b_2] = \mathrm{Cor}[b_1, b_2]\sqrt{\mathrm{V}[b_1]}\sqrt{\mathrm{V}[b_2]} = 0.5 \times 2 \times 1 = 1$.
Then $\mathbf{R}(N^{-1}\hat{\mathbf{C}})\mathbf{R}' = [1\ \ {-2}]\begin{bmatrix}4 & 1\\1 & 1\end{bmatrix}\begin{bmatrix}1\\-2\end{bmatrix} = 4$,
so $W = (\mathbf{R}\mathbf{b} - \mathbf{r})'[\mathbf{R}(N^{-1}\hat{\mathbf{C}})\mathbf{R}']^{-1}(\mathbf{R}\mathbf{b} - \mathbf{r}) = 1 \times 4^{-1} \times 1 = 0.25$.
Since $W = 0.25 < \chi^2_{1,.05} = 3.84$, do not reject $H_0$.

[Alternatively, as there is only one restriction here, note that $b_1 - 2b_2$ has variance $\mathrm{V}[b_1] + 4\mathrm{V}[b_2] - 4\mathrm{Cov}[b_1, b_2] = 4 + 4 \times 1 - 4 \times 1 = 4$, leading to
$$t = \frac{b_1 - 2b_2}{\mathrm{se}[b_1 - 2b_2]} = \frac{5 - 4}{\sqrt 4} = 0.5,$$
and do not reject as $|0.5| < z_{.025} = 1.96$. Note that $t^2 = W$.]
(c) Use (5.32). Test $H_0: \mathbf{R}\beta = \mathbf{r}$ where $\mathbf{R} = \begin{bmatrix}1 & 0\\0 & 1\end{bmatrix}$ and $\mathbf{r} = \begin{bmatrix}0\\0\end{bmatrix}$ and $\beta = \begin{bmatrix}\beta_1\\\beta_2\end{bmatrix}$.
Then $\mathbf{R}\mathbf{b} - \mathbf{r} = \begin{bmatrix}5\\2\end{bmatrix}$ and $\mathbf{R}(N^{-1}\hat{\mathbf{C}})\mathbf{R}' = \begin{bmatrix}4 & 1\\1 & 1\end{bmatrix}$,
so
$$W = (\mathbf{R}\mathbf{b} - \mathbf{r})'[\mathbf{R}(N^{-1}\hat{\mathbf{C}})\mathbf{R}']^{-1}(\mathbf{R}\mathbf{b} - \mathbf{r}) = [5\ \ 2]\begin{bmatrix}4 & 1\\1 & 1\end{bmatrix}^{-1}\begin{bmatrix}5\\2\end{bmatrix} = [5\ \ 2] \times \frac{1}{3}\begin{bmatrix}1 & -1\\-1 & 4\end{bmatrix}\begin{bmatrix}5\\2\end{bmatrix} = 7.$$
Since $W = 7 > \chi^2_{2,.05} = 5.99$, reject $H_0$.
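A quick numerical check of the two Wald statistics, using $\mathbf{b} = (5, 2)$ and the estimated variance matrix assembled in (b):

```python
import numpy as np

b = np.array([5.0, 2.0])
V = np.array([[4.0, 1.0],        # V[b1] = 4, V[b2] = 1,
              [1.0, 1.0]])       # Cov[b1, b2] = 0.5 * 2 * 1 = 1

def wald(R, r):
    """W = (Rb - r)' [R V R']^{-1} (Rb - r)."""
    d = R @ b - r
    return float(d @ np.linalg.solve(R @ V @ R.T, d))

W_b = wald(np.array([[1.0, -2.0]]), np.array([0.0]))   # part (b): one restriction
W_c = wald(np.eye(2), np.zeros(2))                     # part (c): both coefficients zero
print(W_b, W_c)                                        # 0.25 and 7.0
```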
5-7 Results will vary as this uses generated data. Expect $\hat\beta_1 \simeq 1$ and $\hat\beta_2 \simeq 1$ and standard errors similar to those below.
(a) For NLS we got $\hat\beta_1 = 1.1162$ and $\hat\beta_2 = 1.1098$ with standard errors 0.0551 and 0.0256.
(b) Yes, we will need to use sandwich errors due to heteroskedasticity, as $\mathrm{V}[y|x] = \exp(\beta_1 + \beta_2 x)^2/2$. Note that the standard errors given in (a) do not correct for heteroskedasticity.
(c) For the MLE we got $\hat\beta_1 = 1.0088$ and $\hat\beta_2 = 1.0262$ with standard errors 0.0224 and 0.0215.
(d) Sandwich errors can be used but are not necessary, since the ML simplification that $\mathbf{A} = \mathbf{B}$ is appropriate here.
Additional Exercises
© 2005 A. Colin Cameron and Pravin K. Trivedi
"Microeconometrics: Methods and Applications"

1. Chapter 1: Introduction

No exercises.

2. Chapter 2: Causal and Noncausal Models

No exercises.

3. Chapter 3: Microeconomic Data Structures

No exercises.

4. Chapter 4: Linear Models

4-7 THIS QUESTION HAD SEVERAL ERRORS (notably (d)-(f)).

USE THE FOLLOWING REVISED QUESTION INSTEAD.
(Adapted from Nelson and Startz, 1990). Consider the three-equation model $y = \beta x + u$, $x = \gamma u + \varepsilon$, $z = \delta\varepsilon + v$, where the mutually independent errors $u$, $\varepsilon$ and $v$ are iid normal with mean 0 and variances, respectively, $\sigma_u^2$, $\sigma_\varepsilon^2$ and $\sigma_v^2$.
(a) Show that $\mathrm{plim}(\hat\beta_{OLS} - \beta) = \gamma\sigma_u^2/(\gamma^2\sigma_u^2 + \sigma_\varepsilon^2)$.
(b) Show that $r_{XZ}^2 = [\delta\sigma_\varepsilon^2]^2/[(\gamma^2\sigma_u^2 + \sigma_\varepsilon^2)(\delta^2\sigma_\varepsilon^2 + \sigma_v^2)]$.
(c) Show that $\hat\beta_{IV} - \beta = m_{zu}/(\gamma m_{zu} + m_{z\varepsilon}) \stackrel{p}{\to} 0$, where, for example, $m_{zu} = N^{-1}\sum_i z_i u_i$.
(d) Show that $\hat\beta_{IV}$ is not defined if $m_{zu} = -m_{z\varepsilon}/\gamma$. Nelson and Startz (1990) argue that this region is visited often enough that the mean of $\hat\beta_{IV}$ does not exist.
(e) Show that $\hat\beta_{IV} - \beta = 1/(\gamma + m_{z\varepsilon}/m_{zu})$ equals approximately $1/\gamma$ if $m_{zu}$ is large relative to $m_{z\varepsilon}/\gamma$. Nelson and Startz (1990) conclude that if $m_{zu}$ is large relative to $m_{z\varepsilon}/\gamma$ then $\hat\beta_{IV} - \beta$ is concentrated around $1/\gamma$, rather than the probability limit of zero from part (c).
(f) Nelson and Startz (1990) argue that $\hat\beta_{IV} - \beta$ concentrates on $1/\gamma$ more rapidly the smaller is $\delta$, the smaller is $\sigma_\varepsilon^2$, and the larger is $\gamma$. Given your answer in part (b), what do you conclude about the small sample distribution of $\hat\beta_{IV}$ when $r_{XZ}^2$ is small?

4-10 Consider weighted least squares estimation of household medical expenditures on


household total expenditure using the data of section 4.6.4 where again only those with
positive medical expenditures are included, but in this question the regression is in
levels and not logs. [Use program mma04p2qreg and generate new variables med =
exp(lnmed) and total = exp(lntotal).]
(a) Perform OLS regression of med on total. You should obtain a slope coefficient of 0.0938.
(b) Do you think that the errors in the regression of med on total are likely to be heteroskedastic? Explain.
(c) Compare the default OLS standard errors for the OLS slope coefficient estimate with heteroskedasticity-robust standard errors. Comment.
(d) OLS packages have an option for weighting. By appropriate use of the weights option in your package, perform GLS regression of med on total under the assumption that the error has variance $\sigma^2(\text{total})^2$.
(e) Compare the default standard errors for the weighted LS slope coefficient estimate with heteroskedasticity-robust standard errors for the weighted LS slope coefficient estimate. Comment.
(f) Compare the default standard errors for the weighted LS slope coefficient estimate with heteroskedasticity-robust standard errors for the weighted LS slope coefficient estimate. Comment.
(g) Obtain the LS estimates in part (d) manually by (unweighted) OLS regression by first appropriately transforming med, total and the intercept.
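For part (g), a sketch of the transformation with hypothetical stand-in data (the real med/total series come from program mma04p2qreg): dividing everything through by total turns the $\mathrm{V}[u] = \sigma^2(\text{total})^2$ model into a homoskedastic regression whose OLS coefficients are exactly the WLS estimates.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical stand-in for med and total (assumed, not the book's data)
N = 500
total = np.exp(rng.normal(loc=5.0, scale=0.5, size=N))
med = 10 + 0.09 * total + total * rng.normal(size=N)   # error sd proportional to total

# Transform: med/total = a*(1/total) + b*1 + (u/total), homoskedastic error
ys = med / total
Xs = np.column_stack([1.0 / total, np.ones(N)])        # transformed intercept, slope
b_wls = np.linalg.solve(Xs.T @ Xs, Xs.T @ ys)          # [intercept a, slope b]
print(b_wls)
```

The same numbers come from the package's weights option with weights $1/\text{total}^2$, which is the check used in the test below.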

4-11 Consider least squares estimation of the model $\mathbf{y} = \mathbf{X}\beta + \mathbf{u}$ where $\Omega = \sigma^2(\mathbf{I} + \mathbf{A}\mathbf{A}')$, where $\mathbf{A}$ is an $N \times m$ matrix with $k < m < N$, and for simplicity we assume that $\sigma^2$ and $\mathbf{A}$ are known.
(a) Obtain the variance of the OLS estimator using (4.19).
(b) Compare your answer in (a) to the default variance estimate $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$. Will default OLS standard errors be biased / inconsistent in any particular direction?
(c) Give the variance of the GLS estimator of $\beta$, using the result that $(\mathbf{I} + \mathbf{A}\mathbf{A}')^{-1} = \mathbf{I}_N - \mathbf{A}(\mathbf{I}_m + \mathbf{A}'\mathbf{A})^{-1}\mathbf{A}'$.
(d) In general GLS is more efficient than OLS. But what if we instead compare the (incorrect) default variance $\sigma^2(\mathbf{X}'\mathbf{X})^{-1}$ of OLS with the true variance of GLS? Comment.
(e) GLS requires matrix inversion. For this problem, what is the largest size matrix that needs to be inverted to perform GLS?

4-12 Consider regression of $y$ on $x$ when the data $(y, x)$ take values $(-1, -2)$, $(0, -1)$, $(0, 0)$, $(0, 1)$ and $(1, 2)$.
(a) Using an appropriate statistical package, obtain the OLS estimate of the slope coefficient and the least absolute deviations regression estimate of the slope coefficient. Compare the two estimates of the slope coefficient and the precision of the estimates.
(b) From part (a) you should find that the intercept is 0. We will find the least absolute deviations regression estimate of the slope coefficient by grid search. For the given data compute (4.34) with $q = 0.5$ and $\mathbf{x}_i'\beta = \beta x_i$ (one regressor and no intercept) for $\beta = 0.3, 0.35, 0.4, 0.45, 0.5, 0.55$ and $0.6$. Which value of $\beta$ minimizes (4.34)? Compare your answer to that in part (a).

4-13 Consider the dgp $y = 1 + 1 \times x + u$ and $u = \varepsilon$, where $x \sim N[0, 25]$ and $\varepsilon \sim N[0, 4]$. This is the same dgp as in Section 4.5.3, page 84, except that $u$ is homoskedastic.
(a) Using the general result at the bottom of page 86, give the true slope and intercept coefficients for the $q$th quantile of $y$ conditional on $x$ for quantiles $q = 0.1, 0.2, \ldots, 0.9$. [Hint: $F_\varepsilon^{-1}(0.1)$ is that value $\varepsilon^*$ such that $\Pr[\varepsilon \leq \varepsilon^*] = 0.1$ for $\varepsilon \sim N[0, 4]$.]
(b) Generate a sample of size 10,000 from this dgp. This requires minor modification of the Section 4.5.3 program given on the book website. Estimate quantile regressions of $y$ on $x$ for $q = 0.1, 0.2, \ldots, 0.9$. Compare your answers to the theoretical answers in part (a).
(c) Redo part (b) for the dgp with the modification that $u = \sqrt{x}\,\varepsilon$ rather than $\varepsilon$. Comment on any changes compared to (b).
(d) Redo part (b) for the dgp with the modifications that $x$ is the square of a $N[0, 25]$ variate and $u = x\varepsilon$. [This setup ensures that $x > 0$, as assumed on page 86.] Comment on any changes compared to (b).

4-14 If $y$ is exponential distributed with mean $\exp(\mathbf{x}'\beta)$ then the variance is $[\exp(\mathbf{x}'\beta)]^2$. This can be written as the regression model $y = \exp(\mathbf{x}'\beta) + u$ where $u = \exp(\mathbf{x}'\beta)\varepsilon$ and $\varepsilon \sim \text{iid}[0, 1]$.
(a) Show by an argument similar to that in Section 4.6.1 that the population $q$th quantile of $y$ conditional on $\mathbf{x}$ is $\mu_q(\mathbf{x}, \beta) = \exp(\mathbf{x}'\beta) + \exp(\mathbf{x}'\beta)F_\varepsilon^{-1}(q)$.
(b) Hence state what happens for this model to the intercept and slope coefficients of the population $q$th quantile of $y$ conditional on $\mathbf{x}$ as $q$ varies.

4-15 Consider the same data set as that in example 4.9.6 and use the program given at the book website. Note that wage76 is log hourly wage, grade76 is years of schooling and col4 is proximity to college.
(a) Regress wage76 on an intercept and grade76, with estimation by OLS and by IV where col4 is the instrument for grade76. Compare the size and precision of the OLS and IV slope coefficients, where heteroskedasticity-robust standard errors are used.
(b) Perform OLS regression of wage76 on an intercept and col4 and perform OLS regression of grade76 on an intercept and col4. Obtain the ratio of the two slope coefficient estimates as in (4.46) and compare to your answer in (a).
(c) The instrument here is a binary variable and there is one regressor. Compute the Wald estimate (see (4.48)) and compare it to the IV estimate from part (a).
(d) Do you think there might be a weak instrument problem here? Provide appropriate statistical measures.
(e) Redo parts (a) and (b), with all regressions now including as additional (exogenous) regressors age76, agesq76, black, south76, smsa76, reg2-reg9, smsa66, momdad14, sinmom14, nodaded, nomomed, daded, momed, and famed1-famed8. Does result (4.46) still hold?

4-16 Consider the linear regression model $\mathbf{y} = \mathbf{X}\beta + \mathbf{u}$ and the IV estimator
$$\hat\beta = (\mathbf{Z}'\mathbf{X})^{-1}\mathbf{Z}'\mathbf{y},$$
where $\mathbf{Z}$ and $\mathbf{X}$ are $N \times k$ full rank matrices of constants (i.e. are nonstochastic).
(a) Suppose $\mathbf{u} \sim N[\mathbf{0}, \Omega]$. Obtain the finite sample distribution of $\hat\beta$.
(b) Suppose $\mathbf{u} \sim [\mathbf{0}, \Omega]$, the probability limits of $N^{-1}\mathbf{X}'\mathbf{X}$, $N^{-1}\mathbf{Z}'\mathbf{Z}$ and $N^{-1}\mathbf{Z}'\mathbf{X}$ all exist and are finite nonsingular, and $N^{-1/2}\mathbf{Z}'\mathbf{u} \stackrel{d}{\to} N[\mathbf{0}, \mathrm{plim}\,N^{-1}\mathbf{Z}'\Omega\mathbf{Z}]$. Obtain the limit distribution of $\sqrt N(\hat\beta - \beta)$.
(c) Hence obtain the asymptotic distribution of $\hat\beta$ and compare to your answer in (a).
4-17 Consider the same three-equation model as in Exercise 4-7, with $y = \beta x + u$, $x = \gamma u + \varepsilon$, $z = \delta\varepsilon + v$, where the mutually independent errors $u$, $\varepsilon$ and $v$ are iid normal with mean 0 and variances 1, and $\beta = 1$, $\gamma = 1$ and $\delta = 0.01$. Perform 1000 simulations with $N = 100$ where in each simulation data $(y, x, z)$ are generated and we obtain (1) $\hat\beta_{OLS}$ from OLS regression of $y$ on $x$ without intercept; (2) $\hat\beta_{IV}$ from IV regression of $y$ on $x$ without intercept with instrument $z$; (3) $R^2$ and $F$ from OLS regression (with intercept) of $z$ on $x$.
(a) Compare the average across simulations of $\hat\beta_{OLS}$ with the probability limit given in Exercise 4-7 part (a). Comment.
(b) Compare the average across simulations of $\hat\beta_{IV}$ with the probability limit given in Exercise 4-7 part (c). Comment.
(c) Obtain percentiles and quartiles of the observed values across simulations of $\hat\beta_{IV}$. Comment in the light of Exercise 4-7 part (e).
(d) Do the $R^2$ and $F$ statistics across simulations from OLS regression (with intercept) of $z$ on $x$ indicate a likely weak instruments problem?
Exercises: Difficulty and Topics Covered
© 2005 A. Colin Cameron and Pravin K. Trivedi
"Microeconometrics: Methods and Applications"

Difficulty: 1 is easiest; 2 is harder; 3 is hardest.

1. Chapter 1: Introduction

No exercises.

2. Chapter 2: Causal and Noncausal Models

No exercises.

3. Chapter 3: Microeconomic Data Structures

No exercises.

4. Chapter 4: Linear Models

Ques   Sol. given   Diff   Section   Topic
4-1    Yes          2      4.4       OLS with $\Omega \neq \sigma^2 \mathbf{I}$
4-2                 1      4.4       OLS with heteroskedasticity
4-3    Yes          3      4.4       OLS with heteroskedasticity
4-4                 3      4.4       OLS limit distribution
4-5    Yes          1      4.5       LS estimators minimize $\mathbf{u}'\mathbf{W}\mathbf{u}$
4-6                 1      4.8       IV estimation theory
4-7    Yes          3      4.8       IV estimation weak instruments theory
4-8                 2      4.6       Quantile regression data application
4-9    Yes          3      4.9       IV with weak instruments data application
4-10                2      4.5       Weighted LS application
4-11   Yes          2      4.5       GLS theory
4-12                1      4.6       Quantile regression with artificial data
4-13   Yes          2      4.6       Quantile regression with generated data
4-14                2      4.6       Quantile regression theory
4-15   Yes          2      4.8       IV data application
4-16                2      4.8       IV theory
4-17   Yes          2      4.9       IV with weak instruments simulation

5. Chapter 5: ML and NLS Estimation

Ques   Sol. given   Diff   Section   Topic
5-1    Yes          1      5.2       Marginal effects
5-2    Yes          2      5.3       m-estimator example asymptotic theory
5-3    Yes          2      5.3       m-estimator example asymptotic theory
5-4                 3      5.3       m-estimator example asymptotic theory
5-5    Yes          2      5.5       Wald test of linear restrictions
5-6                 2      5.7       NLS estimation theory
5-7    Yes          2      5.7-5.8   NLS and ML with generated data
5-8                 1      5.2       Marginal effects