9/11/2010
In this homework we'll use the notation used by Hayashi [1] in his book. Specifically, we have:
$$
x_i = \begin{pmatrix} x_{i1} \\ \vdots \\ x_{iK} \end{pmatrix},\quad
\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_K \end{pmatrix},\quad
y = \begin{pmatrix} y_1 \\ \vdots \\ y_n \end{pmatrix},\quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix},\quad
X = \begin{pmatrix} x_1' \\ \vdots \\ x_n' \end{pmatrix}
  = \begin{pmatrix} x_{11} & \dots & x_{1K} \\ \vdots & & \vdots \\ x_{n1} & \dots & x_{nK} \end{pmatrix}
$$
1 Theoretical Exercises
1.1 Chapter 1 : Small Sample
Question 1
$$\left(\sum_{i=1}^{N} |e_i|^p\right)^{1/p} = \left(\sum_{i=1}^{N} \left|y_i - x_i'\tilde\beta\right|^p\right)^{1/p}$$

$$\min_{\tilde\beta}\; \sum_{i=1}^{N} |e_i|^p = \min_{\tilde\beta}\; \sum_{i=1}^{N} \left|y_i - x_i'\tilde\beta\right|^p$$
For $p = 2$ (writing $b$ for the minimizer):

$$\begin{aligned}
\min_{b}\; \sum_{i=1}^{N} |e_i|^2 &= \min_{b}\; \sum_{i=1}^{N} (y_i - x_i'b)^2 \\
&= \min_{b}\; e'e \\
&= \min_{b}\; (y - Xb)'(y - Xb)
\end{aligned}$$
We prefer the quadratic form, i.e. OLS estimation, because: first, it leads to the efficient unbiased estimator (in the Gauss-Markov sense, i.e. the OLS estimator b is BLUE) among all the linear estimation methods.
From this last expression we see that $DX\beta = 0$, because we have assumed that $\hat\beta$ is also unbiased. Therefore, from (1):

$$\begin{aligned}
\hat\beta &= D\varepsilon + b \\
\hat\beta - \beta &= D\varepsilon + (b - \beta) \qquad \text{(subtracting } \beta\text{)} \\
&= D\varepsilon + \left(A(X\beta + \varepsilon) - \beta\right) \\
&= D\varepsilon + (X'X)^{-1}X'X\beta + A\varepsilon - \beta \\
&= D\varepsilon + A\varepsilon \\
&= (D + A)\varepsilon
\end{aligned}$$
We want to determine whether $\operatorname{var}(\hat\beta - \beta|X)$ is larger than $\operatorname{var}(b - \beta|X)$ (so that the OLS estimator is indeed efficient) or not; therefore we calculate:
Before continuing we point out that:
$$DA' = D\left((X'X)^{-1}X'\right)' = DX(X'X)^{-1} = 0$$
since $DX = 0$.
$$\begin{aligned}
b - \beta &= Ay - \beta \\
&= (X'X)^{-1}X'(X\beta + \varepsilon) - \beta \\
&= \beta + A\varepsilon - \beta \\
&= A\varepsilon
\end{aligned}$$
Therefore,
$$\sigma^2(X'X)^{-1} = \operatorname{var}(b - \beta|X) \le \operatorname{var}(\hat\beta - \beta|X) = \sigma^2\left(DD' + (X'X)^{-1}\right),$$
since $DD'$ is positive semidefinite. This concludes the proof: the OLS estimator b is more efficient than the infinity-norm estimator $\hat\beta$ (if it exists).
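The Gauss-Markov comparison above can be illustrated numerically. The following sketch (simulated data; the auxiliary matrix W used to build D is our own device, not from the homework) constructs an arbitrary linear unbiased estimator with DX = 0 and checks that its conditional variance exceeds the OLS variance by the positive semidefinite matrix σ²DD':

```python
import numpy as np

# Sketch under assumed data: compare var(b|X) for OLS with var of an
# alternative linear unbiased estimator (A + D)y where DX = 0.
rng = np.random.default_rng(0)
n, K, sigma2 = 50, 3, 2.0
X = rng.normal(size=(n, K))

XtX_inv = np.linalg.inv(X.T @ X)
A = XtX_inv @ X.T                      # OLS map: b = Ay
M = np.eye(n) - X @ A                  # annihilator, MX = 0
D = rng.normal(size=(K, n)) @ M        # any W M satisfies DX = 0

var_ols = sigma2 * XtX_inv                 # var(b - beta | X)
var_alt = sigma2 * (XtX_inv + D @ D.T)     # var(beta_hat - beta | X)

# The difference is sigma^2 DD', which is positive semidefinite:
eigvals = np.linalg.eigvalsh(var_alt - var_ols)
print(np.all(eigvals >= -1e-10))   # True: OLS is (weakly) more efficient
```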
Question 2
$$y = X\beta + \varepsilon$$
$$y = Xb + e$$
We want to demonstrate the equality
$$e'e = \varepsilon'M\varepsilon$$
where $M = I_n - X(X'X)^{-1}X'$.
We know that $e = y - Xb$; substituting $b = (X'X)^{-1}X'y$:
$$\begin{aligned}
e = y - Xb &= y - X(X'X)^{-1}X'y \\
&= \left(I_n - X(X'X)^{-1}X'\right)y \\
&= My
\end{aligned}$$
$$\begin{aligned}
My &= y - Xb \\
&= y - X(X'X)^{-1}X'(X\beta + \varepsilon) \\
&= y - X\beta - X(X'X)^{-1}X'\varepsilon
\end{aligned}$$
Here $X(X'X)^{-1}X'X = X$, because $(X'X)^{-1}(X'X) = I_K$. Also, $X(X'X)^{-1}X'$ is the so-called projection matrix $P$, which has the property that $PX = X$. Then we substitute $\varepsilon$ for $y - X\beta$ and get:
$$\begin{aligned}
&= \varepsilon - X(X'X)^{-1}X'\varepsilon \\
&= \left(I_n - X(X'X)^{-1}X'\right)\varepsilon \\
&= M\varepsilon
\end{aligned}$$
$$\begin{aligned}
e'e &= (M\varepsilon)'M\varepsilon \\
&= \varepsilon'M'M\varepsilon \\
&= \varepsilon'M\varepsilon
\end{aligned}$$
because $M$ is symmetric, $\left(X(X'X)^{-1}X'\right)' = X(X'X)^{-1}X'$, and idempotent: in the product $MM$ the factor $(X'X)^{-1}X'X$ cancels out, so $M'M = M$. We are left with $e'e = \varepsilon'M\varepsilon$, as required.
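The identity $e'e = \varepsilon'M\varepsilon$ and the two properties of M used above can be verified numerically. A minimal sketch on simulated data:

```python
import numpy as np

# Numerical check of e'e = eps' M eps (data simulated for illustration).
rng = np.random.default_rng(1)
n, K = 40, 4
X = rng.normal(size=(n, K))
beta = rng.normal(size=K)
eps = rng.normal(size=n)
y = X @ beta + eps

M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # residual maker
b = np.linalg.solve(X.T @ X, X.T @ y)              # OLS coefficients
e = y - X @ b                                      # residuals

print(np.isclose(e @ e, eps @ M @ eps))            # True
# M is symmetric and idempotent:
print(np.allclose(M, M.T), np.allclose(M @ M, M))  # True True
```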
Question 3
• First, we will recall why q follows a chi-square distribution and link it with its components, i.e. the error terms.
$$E(\varepsilon) = E\left(E(\varepsilon|X)\right) = E(0) = 0$$
Therefore $\varepsilon \sim N(0, \sigma^2 I_n)$, which means that $\varepsilon/\sigma \sim N(0, I_n)$. And because $M$ is idempotent and its rank is equal to $n - K$, we have that $q \sim \chi^2(n - K)$, i.e. the marginal distribution of q is also chi-squared.
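The degrees-of-freedom count can be checked directly: because M is idempotent, its rank equals its trace, and $\operatorname{tr}(M) = n - \operatorname{tr}\left(X(X'X)^{-1}X'\right) = n - K$. A small sketch on a simulated design matrix:

```python
import numpy as np

# Check that rank(M) = trace(M) = n - K for a simulated design matrix.
rng = np.random.default_rng(2)
n, K = 30, 5
X = rng.normal(size=(n, K))
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T

print(round(np.trace(M)))           # 25, i.e. n - K
print(np.linalg.matrix_rank(M))     # 25
```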
Question 4
$$s^2 = \frac{e'e}{n - K}$$
We want to show that $\operatorname{Var}(s^2|X) = \frac{2\sigma^4}{n-K}$. We have already shown that $q|X \sim \chi^2(n-K)$. Observe that $q = \frac{e'e}{\sigma^2} = (n-K)s^2/\sigma^2$. Therefore, $(n-K)s^2/\sigma^2$ has a $\chi^2$ distribution with $(n-K)$ degrees of freedom.
The variance of the $\chi^2$ distribution with $(n-K)$ degrees of freedom is $2(n-K)$, so
$$\operatorname{Var}\left(\frac{n-K}{\sigma^2}\, s^2 \,\Big|\, X\right) = 2(n-K)$$
$$\operatorname{Var}(s^2|X) = \left(\frac{n-K}{\sigma^2}\right)^{-2} 2(n-K) = \frac{2\sigma^4}{n-K}$$
The Cramér-Rao bound for the variance of an estimator of $\sigma^2$ is $2\sigma^4/n$. The estimator $s^2$ does not reach this bound, but no unbiased estimator of $\sigma^2$ reaches it.
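The result $\operatorname{Var}(s^2|X) = 2\sigma^4/(n-K)$ can be checked by Monte Carlo. A sketch on simulated errors (we condition on one fixed design matrix; all numbers are our own illustration):

```python
import numpy as np

# Monte Carlo sketch: across many simulated error vectors, the sample
# variance of s^2 should approach 2*sigma^4/(n - K).
rng = np.random.default_rng(3)
n, K, sigma = 60, 3, 1.5
X = rng.normal(size=(n, K))
M = np.eye(n) - X @ np.linalg.inv(X.T @ X) @ X.T   # residual maker

eps = rng.normal(scale=sigma, size=(20_000, n))    # 20k error draws
e = eps @ M                                        # residuals e = M eps
s2 = (e * e).sum(axis=1) / (n - K)

theory = 2 * sigma ** 4 / (n - K)
print(abs(s2.var() - theory) / theory < 0.05)      # True: matches theory
```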
1.2 Chapter 2 : Large Sample
Question 1
We'll first point out that the definition of this random experiment satisfies one important axiom: the probability of the whole sample space, i.e. the joint probability of all the events, is equal to 1. This is possible because these three events are mutually exclusive.
As convergence in mean square implies convergence in probability, we will show that $z_n$ converges in mean square neither to $\alpha$ nor to 0, concluding that $z_n$ doesn't converge in probability either.
As a reminder, if $\lim_{n\to\infty} E\left[(z_n - \alpha)^2\right] = 0$, then $z_n \to_{m.s.} \alpha$.
This last result holds because the expectation is a linear operator and $\alpha$ is not a random variable. Next we have:
$$E(z_n) = \alpha\,\frac{n-c}{n} + n\,\frac{c-1}{n} = \frac{\alpha n - \alpha c + nc - n}{n}$$
and
$$E(z_n^2) = \alpha^2\,\frac{n-c}{n} + n^2\,\frac{c-1}{n} = \frac{\alpha^2 n - \alpha^2 c + n^2 c - n^2}{n}$$
therefore
$$\begin{aligned}
E\left[(z_n - \alpha)^2\right] &= E(z_n^2) - 2\alpha E(z_n) + \alpha^2 \\
&= \frac{\alpha^2 n - \alpha^2 c + n^2 c - n^2}{n} - 2\alpha\,\frac{\alpha n - \alpha c + nc - n}{n} + \alpha^2 \\
&= \frac{\alpha^2 n - \alpha^2 c + n^2 c - n^2 - 2\alpha^2 n + 2\alpha^2 c - 2\alpha nc + 2\alpha n + \alpha^2 n}{n} \\
&= \frac{\alpha^2 c}{n} + n(c-1) - 2\alpha(c-1)
\end{aligned}$$
So the limit of this expectation as n goes to infinity is:
$$\lim_{n\to\infty} E\left[(z_n - \alpha)^2\right] = \lim_{n\to\infty}\left[\frac{\alpha^2 c}{n} + n(c-1) - 2\alpha(c-1)\right] = \infty \qquad (\text{for } c > 1)$$
Then we can conclude that the {sn } doesn't converge in mean squares to
α, nor to 0, and because convergence in MS implies convergence in probability,
{sn } doesn't converge neither to α nor to 0 in probability.
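The mean-square divergence can be illustrated by simulation. A sketch under our reconstruction of the experiment (the three mutually exclusive events are taken to be $z_n = \alpha$ with probability $(n-c)/n$, $z_n = 0$ with probability $1/n$, and $z_n = n$ with probability $(c-1)/n$, which is an assumption on our part; the probabilities sum to 1):

```python
import numpy as np

# Empirical E[(z_n - alpha)^2] grows with n, so z_n does not converge
# to alpha in mean square.  (The third event z_n = 0 is our assumption.)
rng = np.random.default_rng(4)
alpha, c = 0.5, 3
reps = 200_000

def mse(n):
    draws = rng.choice([alpha, 0.0, float(n)],
                       size=reps,
                       p=[(n - c) / n, 1 / n, (c - 1) / n])
    return np.mean((draws - alpha) ** 2)

print(mse(100) < mse(10_000))   # True: the mean squared error diverges
```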
Question 2
What we know:
$$\sqrt{n}(\hat\theta - \theta) \to_d N(0, \sigma^2), \qquad \alpha_n = \frac{4n - c}{n}$$
$$P(\beta_n = 0) = \frac{n-1}{n}, \qquad P(\beta_n = c) = \frac{1}{n}$$
where c is a constant and $\beta_n$ a random sequence.
Limit in distribution of $\alpha_n\sqrt{n}(\hat\theta - \theta)$?
By applying Lemma 2.4(c) in Hayashi [1, p. 92]: if $\sqrt{n}(\hat\theta - \theta) \to_d \Theta$ (where $\Theta$ is the limit in distribution) and $\alpha_n \to_p \alpha$, then $\alpha_n\sqrt{n}(\hat\theta - \theta) \to_d \alpha\Theta$. And as pointed out on the same page, if $\Theta \sim N(0, \sigma^2)$ then $\alpha_n\sqrt{n}(\hat\theta - \theta) \to_d N_r(0, \alpha\sigma^2\alpha')$, where r is the number of rows of $\alpha$.
We therefore first have to show that $\alpha_n \to_p \alpha$; for that, we will show that $\alpha_n \to_{m.s.} \alpha$, which implies convergence in probability:
$$\alpha_n \to_{m.s.} \alpha \;\Rightarrow\; \alpha_n \to_p \alpha$$
Since
$$E(\alpha\Theta) = 0 \qquad\text{and}\qquad E\left[(\alpha\Theta)^2\right] = \alpha\sigma^2\alpha'$$
(here $\alpha$ is a scalar, so $\alpha\sigma^2\alpha' = \alpha^2\sigma^2$), we conclude that
$$\alpha_n\sqrt{n}(\hat\theta - \theta) \to_d \alpha\Theta \sim N(0, \alpha^2\sigma^2)$$
Limit in distribution of $\sqrt{n}(\hat\theta - \theta) + \beta_n$?
$$\sqrt{n}(\hat\theta - \theta) + \beta_n \to_d \Theta + \hat\beta$$
Therefore we need to know the limit in probability of $\beta_n$:
$$P(\beta_n = 0) = \frac{n-1}{n} \;\Rightarrow\; P(|\beta_n - 0| > \varepsilon) = 1 - \frac{n-1}{n} = \frac{1}{n} \to 0$$
therefore
$$\beta_n \to_p \hat\beta \qquad (\text{where } \hat\beta = 0)$$
and so
$$\sqrt{n}(\hat\theta - \theta) + \beta_n \to_d \Theta + \hat\beta = \Theta \sim N(0, \sigma^2)$$
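This Slutsky-type result can be illustrated by simulation. In the sketch below $\hat\theta$ is taken to be a sample mean (our choice; the homework leaves the estimator abstract), and all parameter values are illustrative:

```python
import numpy as np

# sqrt(n)(theta_hat - theta) is approximately N(0, sigma^2) and
# beta_n ->p 0, so the sum's spread stays close to sigma.
rng = np.random.default_rng(5)
sigma, theta, c = 2.0, 1.0, 1.0
n, reps = 500, 20_000

samples = rng.normal(loc=theta, scale=sigma, size=(reps, n))
root_n_dev = np.sqrt(n) * (samples.mean(axis=1) - theta)  # ~ N(0, sigma^2)
beta_n = rng.choice([0.0, c], size=reps, p=[(n - 1) / n, 1 / n])

z = root_n_dev + beta_n
print(abs(z.std() - sigma) < 0.05)   # True: the limit std is sigma
```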
Limit in probability of $\hat\theta - \theta$?
We know that $\hat\theta_n$ is consistent: since $\sqrt{n}(\hat\theta_n - \theta) \to_d N(0, \sigma^2)$ and $n^{-1/2} \to 0$, we have $\hat\theta_n - \theta = n^{-1/2}\cdot\sqrt{n}(\hat\theta_n - \theta) \to_p 0$, that is:
$$\hat\theta_n \to_p \theta$$
Limit in probability of $\beta_n\alpha_n$?
We already know that $\alpha_n \to_p \alpha$ and $\beta_n \to_p 0$; therefore, by applying Lemma 2.2, we have:
$$\alpha_n\beta_n \to_p \alpha \cdot 0 = 0$$
Limit in mean square of $\beta_n$?
The expected value of $\beta_n$ is:
$$E(\beta_n) = 0\times\frac{n-1}{n} + c\times\frac{1}{n} = \frac{c}{n}$$
and
$$E(\beta_n^2) = 0^2\times\frac{n-1}{n} + c^2\times\frac{1}{n} = \frac{c^2}{n}$$
Therefore:
$$\begin{aligned}
E\left[(\beta_n - \hat\beta)^2\right] &= E(\beta_n^2 - 2\beta_n\hat\beta + \hat\beta^2) \\
&= E(\beta_n^2) - 2\hat\beta E(\beta_n) + \hat\beta^2 \\
&= \frac{c^2}{n} - 2\hat\beta\,\frac{c}{n} + \hat\beta^2
\end{aligned}$$
By taking the limit:
$$\lim_{n\to\infty} E\left[(\beta_n - \hat\beta)^2\right] = \lim_{n\to\infty}\left(\frac{c^2}{n} - 2\hat\beta\,\frac{c}{n} + \hat\beta^2\right) = 0 - 0 + \hat\beta^2$$
so the limit is zero if and only if $\hat\beta^2 = 0 \Leftrightarrow \hat\beta = 0$.
Therefore, $\beta_n$ converges in mean square to $\hat\beta$ only if $\hat\beta = 0$, i.e.:
$$\beta_n \to_{m.s.} 0$$
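The computation $E(\beta_n^2) = c^2/n$ can be checked empirically (parameter values are our own illustration):

```python
import numpy as np

# Empirical mean square of beta_n shrinks like c^2/n.
rng = np.random.default_rng(6)
c = 5.0

def mean_square(n, reps=400_000):
    draws = rng.choice([0.0, c], size=reps, p=[(n - 1) / n, 1 / n])
    return np.mean(draws ** 2)

m10, m100, m1000 = mean_square(10), mean_square(100), mean_square(1000)
print(m10 > m100 > m1000)                 # True: shrinking toward 0
print(abs(m1000 - c ** 2 / 1000) < 0.01)  # True: matches c^2/n
```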
Question 3
$$\bar z_n \to_p \mu$$
Limit in mean square of $\bar z_n$?
We decompose:
$$E\left[(\bar z_n - \tilde z)^2\right] = E\left[\bar z_n - E(\bar z_n)\right]^2 + 2\left[E(\bar z_n) - \tilde z\right]E\left[\bar z_n - E(\bar z_n)\right] + \left[E(\bar z_n) - \tilde z\right]^2$$
The middle term vanishes since $E\left[\bar z_n - E(\bar z_n)\right] = 0$, so the mean squared error is $\operatorname{Var}(\bar z_n) + \left[E(\bar z_n) - \tilde z\right]^2$; with $\tilde z = \mu$, $E(\bar z_n) \to \mu$ and $\operatorname{Var}(\bar z_n) \to 0$, hence
$$\bar z_n \to_{m.s.} \mu$$
11
Question 4
We know that:
$$E(y_i^2) = E(\sigma_i^2\varepsilon_i^2) = E\left(c + \alpha y_{i-1}^2 + \gamma y_{i-2}^2\right)$$
We immediately see that the expected value of $y_i^2$ depends on its past values and a constant; therefore it is not equal to zero.
Is $y_i$ a white noise process?
We will first show that $\varepsilon_i$ is independent of $(y_1, \dots, y_{i-2})$, then we will verify that the definition of a martingale difference sequence is satisfied. We can rewrite $y_i = \sigma_i\varepsilon_i$ as:
$$y_i = \left(c + \alpha y_{i-1}^2 + \gamma y_{i-2}^2\right)^{1/2}\varepsilon_i$$
We could substitute $y_{i-1}$ and $y_{i-2}$ by their past values down to the first observation $y_1$; in that case, $y_i$ would become a function of $y_1$ and $\{\varepsilon_3, \varepsilon_4, \dots, \varepsilon_i\}$ (because $i \ge 3$); therefore $\varepsilon_i$ is independent of $\{y_1, \dots, y_{i-2}\}$.
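The recursion just described can be simulated directly. A sketch (parameter values and starting values are our own illustration): because $\varepsilon_i$ has mean zero and is independent of the past, the sample mean of $y$ stays near 0 even though the conditional variance moves with past values:

```python
import numpy as np

# Simulate y_i = sqrt(c + alpha*y_{i-1}^2 + gamma*y_{i-2}^2) * eps_i.
rng = np.random.default_rng(7)
c, alpha, gamma = 0.2, 0.3, 0.2   # alpha + gamma < 1 keeps the variance finite
N = 10_000

eps = rng.normal(size=N)
y = np.zeros(N)
for i in range(2, N):
    sigma_i = np.sqrt(c + alpha * y[i - 1] ** 2 + gamma * y[i - 2] ** 2)
    y[i] = sigma_i * eps[i]

print(abs(y.mean()) < 0.05)   # True: martingale-difference behaviour
```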
With that in mind, we have:
Question 5
The random walk process $y_i = y_{i-1} + \varepsilon_i$ is covariance stationary if $E(y_i)$ doesn't depend on $i$ and $\operatorname{cov}(y_i, y_{i-j})$ exists for $j = 0, \pm 1, \pm 2, \dots$ and depends only on $j$ but not on $i$.
$$\begin{aligned}
E(y_i) &= E(y_{i-1} + \varepsilon_i) \\
&= E(y_{i-1}) + E(\varepsilon_i) \\
&= E(y_{i-2} + \varepsilon_{i-1}) \qquad (\text{because } \varepsilon \text{ is independent white noise, } E(\varepsilon_i) = 0) \\
&= E(y_{i-2}) + E(\varepsilon_{i-1}) \\
&\;\;\vdots \\
&= E(y_0) \\
&= y_0 \qquad (\text{it doesn't depend on } i \text{ because it's a constant})
\end{aligned}$$
$$\begin{aligned}
\operatorname{cov}(y_i, y_{i-j}) &= E\left[\Big(y_0 + \sum_{l=1}^{i}\varepsilon_l\Big)\Big(y_0 + \sum_{k=1}^{i-j}\varepsilon_k\Big)\right] - y_0^2 \\
&= E\left[y_0^2 + y_0\sum_{l=1}^{i}\varepsilon_l + y_0\sum_{k=1}^{i-j}\varepsilon_k + \sum_{k=1}^{i-j}\sum_{l=1}^{i}\varepsilon_k\varepsilon_l\right] - y_0^2 \\
&= y_0^2 + y_0\sum_{l=1}^{i}E(\varepsilon_l) + y_0\sum_{k=1}^{i-j}E(\varepsilon_k) + \sum_{k=1}^{i-j}\sum_{l=1}^{i}E(\varepsilon_k\varepsilon_l) - y_0^2 \\
&= (i-j)\sigma^2
\end{aligned}$$
since $E(\varepsilon_k\varepsilon_l) = \sigma^2$ when $k = l$ and 0 otherwise (taking $j \ge 0$). The covariance depends on $i$ and not only on $j$, so the random walk is not covariance stationary.
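The growing covariance can be seen directly in simulation: $\operatorname{var}(y_i) = i\sigma^2$ (with $y_0 = 0$, a simplification we adopt here) increases with $i$:

```python
import numpy as np

# Sketch: for the random walk (started at y_0 = 0), var(y_i) = i*sigma^2
# grows with i, so the process cannot be covariance stationary.
rng = np.random.default_rng(8)
sigma, i_max, reps = 1.0, 200, 50_000

paths = np.cumsum(rng.normal(scale=sigma, size=(reps, i_max)), axis=1)
v50, v200 = paths[:, 49].var(), paths[:, 199].var()

print(abs(v50 - 50) / 50 < 0.05, abs(v200 - 200) / 200 < 0.05)  # True True
```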
2 Empirical exercise
We consider the regression
We note that family income is, not surprisingly, correlated with the education of both the mother and the father, with correlation coefficients of 0.448 and 0.427 respectively. Furthermore, we note that the correlation between the mother's and the father's education is even stronger, with a correlation coefficient of 0.643.
Now we turn to our regression model, which we estimate using OLS. We remove the incomplete observations, and from an initial number of 1388 observations we are left with 1191 complete observations. The results from OxMetrics are reported below.
---- PcGive 13.10 session started at 18:53:08 on 8-11-2010 ----
We see that neither faminc nor fatheduc is significant at the 10% critical value. motheduc is just below this limit, and not significant at the 5% critical value.
In order to construct the 95% confidence interval, we note that
$$-t_{\alpha/2}(n - K) < \frac{b_k - \bar\beta_k}{SE(b_k)} < t_{\alpha/2}(n - K)$$
where $1 - \alpha$ is the confidence level you wish to use, in our case 95%, and $\bar\beta_k$ is the value of the null hypothesis you wish to test, in our case $\bar\beta_k = 0$. I will use the critical values from the normal distribution, considering this a good approximation of the t-distribution with 1185 degrees of freedom. Then we have the following 95% confidence intervals. First, the constant:
$$-1.96 < \frac{7.1578}{0.2330} < 1.96$$
Here, with a t-value of 30.7, clearly we reject the null. Then cigs,
$$-1.96 < \frac{-0.0372}{0.0067} < 1.96$$
which gives a t-value of -5.40, and again we reject the null by a wide margin. Then for parity:
$$-1.96 < \frac{0.1127}{0.0412} < 1.96$$
which gives a t-value of 2.71 > 1.96, and again we reject the null. Then for faminc:
$$-1.96 < \frac{0.0035}{0.0023} < 1.96$$
This time we have a t-value of 1.53, and we cannot reject the null. Next, motheduc:
$$-1.96 < \frac{0.0295}{0.0177} < 1.96$$
The t-value this time is 1.67, and we do not reject the null.
$$-1.96 < \frac{-0.0232}{0.0200} < 1.96$$
This is for f atheduc, and gives a t-value of -1.16, which lies inside the
interval, and we do not reject the null.
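The six t-tests above can be recomputed in one pass from the reported coefficients and standard errors (small discrepancies with the quoted t-values come from rounding of the printed figures):

```python
# Recompute the t-values from the coefficients and standard errors
# reported in the text above.
coefs = {
    "Constant": (7.1578, 0.2330),
    "cigs":     (-0.0372, 0.0067),
    "parity":   (0.1127, 0.0412),
    "faminc":   (0.0035, 0.0023),
    "motheduc": (0.0295, 0.0177),
    "fatheduc": (-0.0232, 0.0200),
}

for name, (b, se) in coefs.items():
    t = b / se
    verdict = "reject" if abs(t) > 1.96 else "do not reject"
    print(f"{name:9s} t = {t:6.2f}  ->  {verdict} H0 at the 5% level")
```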
The rank of the matrix X is 6, so there is no perfect multicollinearity. However, when excluding motheduc from the regression, we see that the partial R-squared of faminc, that is, the correlation of faminc with the dependent variable given the other explanatory variables, increases (see below). This suggests that there might be a collinearity problem between these two variables.
$$R = \begin{pmatrix} 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{pmatrix}, \qquad r = \begin{pmatrix} 0 \\ 0 \end{pmatrix}$$
This gives us the following test statistic for the F-test:

r vector
  0.00000    0.00000
LinRes F(2,1185)  =   1.4373 [0.2380]

With a p-value of 0.2380 we cannot reject the joint null hypothesis that the coefficients on motheduc and fatheduc are both zero.
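The restriction test that PcGive performs can be sketched by hand. The block below runs the Wald/F formula $F = (Rb - r)'\left[R\,s^2(X'X)^{-1}R'\right]^{-1}(Rb - r)/q$ on simulated data (the birth-weight data set itself is not reproduced here, so the numeric value will not match 1.4373):

```python
import numpy as np

# Sketch of the linear-restriction F-test on simulated data.
rng = np.random.default_rng(9)
n, K, q = 200, 6, 2
X = np.column_stack([np.ones(n), rng.normal(size=(n, K - 1))])
beta = np.array([1.0, 0.5, -0.3, 0.2, 0.0, 0.0])   # last two truly zero
y = X @ beta + rng.normal(size=n)

b = np.linalg.solve(X.T @ X, X.T @ y)
e = y - X @ b
s2 = e @ e / (n - K)

R = np.zeros((q, K)); R[0, 4] = 1.0; R[1, 5] = 1.0   # H0: beta_5 = beta_6 = 0
r = np.zeros(q)
V = s2 * np.linalg.inv(X.T @ X)
F = (R @ b - r) @ np.linalg.solve(R @ V @ R.T, R @ b - r) / q
print(F > 0)   # compare F against the F(q, n-K) critical value
```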
Heteroscedasticity coefficients:
Coefficient Std.Error t-value
cigs -0.032041 0.049584 -0.64620
parity 0.0045450 0.44633 0.010183
faminc 0.015506 0.026471 0.58575
motheduc 0.15605 0.23825 0.65501
fatheduc 0.29249 0.33937 0.86185
cigs^2 0.0010846 0.0019191 0.56514
parity^2 -0.0071292 0.091492 -0.077921
faminc^2 -0.00020918 0.00033770 -0.61941
motheduc^2 -0.0059738 0.0093890 -0.63626
fatheduc^2 -0.010163 0.012990 -0.78240
References
[1] Hayashi, F. (2000). Econometrics. Princeton, NJ: Princeton University Press.