Lesson 2: MRA: Small Sample Properties

(Wooldridge Ch. 3.3,3.4,3.5)

Course: Econometrics For Finance


Professor: Ricardo Mora

Master in Finance, UC3M

Econometrics for Finance, week 2

Introduction

Assumptions and Properties

Simple and Multiple Regression

Standard Errors

Introduction

The model with k regressors

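$Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + \dots + \beta_k X_{ki} + u_i, \quad i = 1, \dots, n$

OLS with k regressors:

$\{\hat\beta_0, \hat\beta_1, \dots, \hat\beta_k\} = \arg\min_{b_0, b_1, \dots, b_k} \sum_{i=1}^{n} \left[ Y_i - (b_0 + b_1 X_{1i} + \dots + b_k X_{ki}) \right]^2$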

First Order Conditions


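$E_n\left[ Y_i - (\hat\beta_0 + \hat\beta_1 X_{1i} + \dots + \hat\beta_k X_{ki}) \right] = 0$

$E_n\left[ X_{1i} \left( Y_i - (\hat\beta_0 + \hat\beta_1 X_{1i} + \dots + \hat\beta_k X_{ki}) \right) \right] = 0$

$\vdots$

$E_n\left[ X_{ki} \left( Y_i - (\hat\beta_0 + \hat\beta_1 X_{1i} + \dots + \hat\beta_k X_{ki}) \right) \right] = 0$

In matrix form:

$$
\begin{pmatrix}
1 & E_n(X_{1i}) & \dots & E_n(X_{ki}) \\
E_n(X_{1i}) & E_n(X_{1i}^2) & \dots & E_n(X_{1i} X_{ki}) \\
\vdots & \vdots & \ddots & \vdots \\
E_n(X_{ki}) & E_n(X_{1i} X_{ki}) & \dots & E_n(X_{ki}^2)
\end{pmatrix}
\begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \vdots \\ \hat\beta_k \end{pmatrix}
=
\begin{pmatrix} E_n(Y_i) \\ E_n(Y_i X_{1i}) \\ \vdots \\ E_n(Y_i X_{ki}) \end{pmatrix}
$$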
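The first order conditions can be checked numerically. The following is a minimal sketch on simulated data, reading $E_n$ as the sample mean; the data-generating values and the variable names (`Z`, `M`, `m`, `foc`) are assumptions made for illustration, not part of the course material. It solves the sample-moment system, compares the result with a direct least-squares fit, and evaluates the orthogonality conditions at the solution.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 200, 2
X = rng.normal(size=(n, k))                  # simulated regressors (assumed data)
u = rng.normal(size=n)
y = 1.0 + 0.5 * X[:, 0] - 0.3 * X[:, 1] + u  # hypothetical true coefficients

Z = np.column_stack([np.ones(n), X])         # regressors plus the constant

# Sample-moment form of the normal equations: E_n(Z Z') b = E_n(Z Y)
M = Z.T @ Z / n
m = Z.T @ y / n
b_moments = np.linalg.solve(M, m)

# Direct minimisation of the sum of squared residuals
b_lstsq = np.linalg.lstsq(Z, y, rcond=None)[0]

# First order conditions: E_n(residual) and E_n(X_j * residual) for each j
resid = y - Z @ b_moments
foc = Z.T @ resid / n

print("moment solution:", b_moments)
print("least squares  :", b_lstsq)
print("FOC values     :", foc)
```

Both routes should give the same coefficients, and every entry of `foc` should be zero up to floating-point error.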

Example: GPA data

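$\widehat{colGPA} = 1.29 + 0.009\, ACT + 0.453\, hsGPA, \quad n = 141$

Sample means:

$E_n(colGPA) = 3.06, \quad E_n(ACT) = 24.16, \quad E_n(hsGPA) = 3.40$

Sample covariance matrix (lower triangle):

           colGPA    ACT      hsGPA
colGPA     0.139
ACT        0.219     8.090
hsGPA      0.049     0.315    0.102

then, since $E_n(W_i Z_i) = Cov_n(W_i, Z_i) + E_n(W_i) E_n(Z_i)$,

$$
\begin{pmatrix}
1 & 24.16 & 3.40 \\
24.16 & 591.55 & 82.49 \\
3.40 & 82.49 & 11.67
\end{pmatrix}
\begin{pmatrix} \hat\beta_0 \\ \hat\beta_1 \\ \hat\beta_2 \end{pmatrix}
=
\begin{pmatrix} 3.06 \\ 74.06 \\ 10.45 \end{pmatrix}
$$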
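As a rough check of this example, the system can be rebuilt from the rounded means and covariances shown on the slide and solved numerically. This is only a sketch: the inputs are the rounded slide values, so the solution should land close to, but not exactly on, the reported coefficients. The variable names are chosen here for illustration.

```python
import numpy as np

# Rounded sample moments from the slide (order of regressors: ACT, hsGPA)
mean_y = 3.06
mean_x = np.array([24.16, 3.40])
cov_xx = np.array([[8.090, 0.315],
                   [0.315, 0.102]])
cov_xy = np.array([0.219, 0.049])            # Cov_n(colGPA, ACT), Cov_n(colGPA, hsGPA)

# Build second moments with E_n(W Z) = Cov_n(W, Z) + E_n(W) E_n(Z)
M = np.empty((3, 3))
M[0, 0] = 1.0
M[0, 1:] = mean_x
M[1:, 0] = mean_x
M[1:, 1:] = cov_xx + np.outer(mean_x, mean_x)
rhs = np.concatenate([[mean_y], cov_xy + mean_x * mean_y])

b = np.linalg.solve(M, rhs)                  # (intercept, ACT slope, hsGPA slope)
print(b)
```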


Assumptions and Properties

Assumptions

Assumption 1: $Y = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \dots + \beta_k X_k + u$

Assumption 2: i.i.d. sample

Assumption 3: $E(u_i | X_{1i} = x_1, \dots, X_{ki} = x_k) = 0$

Assumption 4: $\operatorname{rank}(X'X) = k + 1$ (no perfect multicollinearity)

Assumption 2: Random sampling

This arises if the observational entities are selected at random: from the same population and independently of each other
The most important example of non-i.i.d. sampling is when data are recorded over time for the same entity (time series)
We will deal with this issue when we look at dynamic models

Assumption 3: Zero Conditional Expectation

In an ideal experiment, the values of $X_1, X_2, \dots, X_k$ are randomly assigned
As the regressors are randomly assigned, all other factors affecting Y must be uncorrelated with the regressors
Hence, on average $u_i$ does not depend on $X_i$

With observational data we need to think about whether $E(u_i | X_i = x) = 0$ holds

Assumption 4: No perfect multicollinearity

With only one regressor, this assumption is very easy to check: the regressor must show some variation in the sample.
With more regressors, the assumption implies that no regressor can be expressed as a linear combination of the others (this rule includes the constant)

Intuitively, each regressor must have variation that cannot be attributed to any other regressor
It is still acceptable for regressors to be correlated!

OLS properties

For any sample of size n, if Assumptions 1-4 hold, the OLS estimators are unbiased: $E(\hat\beta_j) = \beta_j$, $j = 0, 1, \dots, k$
(proofs of these results are beyond the scope of this course)
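Although the proof is omitted, unbiasedness can be illustrated by simulation. The sketch below draws many small samples from a data-generating process that satisfies Assumptions 1-4 by construction and averages the OLS estimates. The true coefficients, the sample size and the number of replications are hypothetical choices.

```python
import numpy as np

rng = np.random.default_rng(1)
beta = np.array([1.0, 2.0, -1.0])            # hypothetical true coefficients
n, reps = 50, 5000                           # small sample, many replications

estimates = np.empty((reps, 3))
for r in range(reps):
    x1 = rng.normal(size=n)
    x2 = 0.5 * x1 + rng.normal(size=n)       # correlated regressors are allowed
    u = rng.normal(size=n)                   # drawn independently of x1, x2: E(u | x) = 0
    Z = np.column_stack([np.ones(n), x1, x2])
    y = Z @ beta + u
    estimates[r] = np.linalg.lstsq(Z, y, rcond=None)[0]

mean_estimates = estimates.mean(axis=0)
print("true coefficients :", beta)
print("average estimates :", mean_estimates)
```

Each individual estimate differs from the truth, but the average across replications should sit very close to `beta` even though n is only 50.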

Discussion on the validity of the Assumptions


Assumption OLS#2 is satisfied under random sampling
  this assumption is usually violated with time series. We will see that under other assumptions, OLS still has good properties

Assumption OLS#3 may fail for different reasons:
  if the model specification is wrong (for example, if we should include $X^2$ but we do not)
  if an omitted variable is correlated with a regressor (omitted variable bias)
  if a regressor is measured with error or is simultaneously determined together with the dependent variable

Assumption OLS#4 usually fails because of bad model specification

Perfect Multicollinearity
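Suppose one regressor is a linear combination of the others:

$X_{ki} = \alpha_0 + \alpha_1 X_{1i} + \dots + \alpha_{k-1} X_{k-1,i}$

FOC for $\hat\beta_k$:

$0 = \sum_{i=1}^{n} \hat u_i X_{ki} = \sum_{i=1}^{n} \hat u_i \left( \alpha_0 + \alpha_1 X_{1i} + \dots + \alpha_{k-1} X_{k-1,i} \right) = \alpha_0 \sum_{i=1}^{n} \hat u_i + \alpha_1 \sum_{i=1}^{n} \hat u_i X_{1i} + \dots + \alpha_{k-1} \sum_{i=1}^{n} \hat u_i X_{k-1,i}$

The FOC for $\hat\beta_k$ is a linear combination of the other FOCs: the system cannot have a unique solution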
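The rank condition can be inspected directly. In this sketch on simulated data, a third regressor is built as an exact linear combination of the constant and the other two, so the regressor matrix has four columns but only rank three. The rank of the regressor matrix is checked, which equals the rank of $X'X$. The coefficients of the linear combination are arbitrary illustrative values.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 100
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
x3 = 2.0 + 3.0 * x1 - 1.0 * x2               # exact linear combination (illustrative weights)

Z = np.column_stack([np.ones(n), x1, x2, x3])

rank_full = np.linalg.matrix_rank(Z)         # all four columns
rank_ok = np.linalg.matrix_rank(Z[:, :3])    # after dropping x3

print("columns:", Z.shape[1], "rank:", rank_full)
print("columns:", 3, "rank:", rank_ok)
```

With `x3` included the rank falls short of the number of columns, so the normal equations have infinitely many solutions. Dropping `x3` restores full column rank.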

Perfect multicollinearity is usually a problem with the definition of the regressors
Econometric packages report an error message
Typical beginner problems:
1. you include the same regressor twice
2. you include a constant and dummy variables that are mutually exclusive and exhaustive (there are several categories and each observation belongs to only one category). The solution is to omit either the constant or the dummy for one of the categories.
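The second beginner problem, often called the dummy variable trap, can be sketched with simulated category labels. The three categories and the variable names are hypothetical. The code compares the column rank of the regressor matrix with a constant plus all dummies against the two fixes mentioned above.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 90
category = rng.integers(0, 3, size=n)        # each observation belongs to one of 3 categories
D = np.eye(3)[category]                      # one 0/1 dummy per category
const = np.ones(n)

Z_trap = np.column_stack([const, D])                 # constant + all dummies: 4 columns
Z_drop_dummy = np.column_stack([const, D[:, 1:]])    # fix 1: omit one category's dummy
Z_drop_const = D                                     # fix 2: omit the constant

ranks = [int(np.linalg.matrix_rank(Z)) for Z in (Z_trap, Z_drop_dummy, Z_drop_const)]
print("columns:", [Z.shape[1] for Z in (Z_trap, Z_drop_dummy, Z_drop_const)])
print("ranks  :", ranks)
```

The dummies sum to the constant in every row, so the first matrix has four columns but rank three. Either fix leaves three columns of rank three.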

Example of Multicollinearity Problems

With cross-section country data, we want to study the effect on GDP of various infrastructure investment categories: roads, ports, airports
Roads, ports and airports are all highly correlated: countries with many roads are also likely to have many ports and airports
OLS will have trouble isolating the effect of each type of infrastructure with aggregated country data
It is probably better to use a single indicator

Imperfect Multicollinearity

This happens when two or more regressors are highly correlated
The result of this problem is that one or more coefficients are going to be estimated very imprecisely
The intuition here is clear: with two regressors, $\beta_1$ is the effect of $X_1$ keeping $X_2$ constant
  if $X_1$ and $X_2$ are highly correlated in the sample, there will be little variation of $X_1$ once $X_2$ is kept constant
  hence, there is very little information to learn how $X_1$ affects Y ceteris paribus
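The loss of precision can be illustrated with a small Monte Carlo sketch. The helper `sd_of_b1` is a hypothetical function written for this illustration: it simulates two regressors with correlation `rho`, estimates the model repeatedly, and returns the standard deviation of the estimated coefficient on $X_1$ across replications. All numerical settings are assumptions.

```python
import numpy as np

def sd_of_b1(rho, n=100, reps=2000, seed=4):
    """Monte Carlo standard deviation of the OLS slope on X1 when corr(X1, X2) = rho."""
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    b1 = np.empty(reps)
    for r in range(reps):
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y = 1.0 + 1.0 * X[:, 0] + 1.0 * X[:, 1] + rng.normal(size=n)
        Z = np.column_stack([np.ones(n), X])
        b1[r] = np.linalg.lstsq(Z, y, rcond=None)[0][1]
    return b1.std()

sd_low = sd_of_b1(0.0)      # uncorrelated regressors
sd_high = sd_of_b1(0.95)    # highly correlated regressors
print("sd of b1, rho = 0.00:", sd_low)
print("sd of b1, rho = 0.95:", sd_high)
```

The estimator remains unbiased in both designs, but its dispersion should be several times larger when the regressors are highly correlated.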


Simple and Multiple Regression

The issue revisited

Structural (long) model:

$wages_i = \beta_0 + \beta_1 educ_i + \beta_2 IQ_i + \varepsilon_i, \quad cov(educ_i, \varepsilon_i) = cov(IQ_i, \varepsilon_i) = 0$

You do not have information on IQ (short model):

$wages_i = \gamma_0 + \gamma_1 educ_i + u_i, \quad cov(educ_i, u_i) = 0$
this is the best linear predictor having only information on education

Is there any relation between $\hat\gamma_1$ and $\hat\beta_1$?

FOC in the long model

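$cov_n(educ_i, \hat u_i) = 0$
$cov_n(IQ_i, \hat u_i) = 0$
where $\hat u_i$ denotes the OLS residual of the long regression. Then:

$cov_n(educ_i, wages_i) - \hat\beta_1 var_n(educ_i) - \hat\beta_2 cov_n(educ_i, IQ_i) = 0$
$cov_n(IQ_i, wages_i) - \hat\beta_1 cov_n(IQ_i, educ_i) - \hat\beta_2 var_n(IQ_i) = 0$

This is a system of two linear equations and two unknowns ($\hat\beta_1$ and $\hat\beta_2$)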

Omitted Variable Bias (I)

$cov_n(educ_i, wages_i) - \hat\beta_1 var_n(educ_i) - \hat\beta_2 cov_n(educ_i, IQ_i) = 0$
$cov_n(IQ_i, wages_i) - \hat\beta_1 cov_n(IQ_i, educ_i) - \hat\beta_2 var_n(IQ_i) = 0$

Dividing the first condition by $var_n(educ_i)$:

$\frac{cov_n(educ_i, wages_i)}{var_n(educ_i)} = \hat\beta_1 + \hat\beta_2 \frac{cov_n(educ_i, IQ_i)}{var_n(educ_i)}$

But the short-model estimate is $\hat\gamma_1 = \frac{cov_n(educ_i, wages_i)}{var_n(educ_i)}$, so that

$\hat\gamma_1 = \hat\beta_1 + \hat\beta_2 \frac{cov_n(educ_i, IQ_i)}{var_n(educ_i)}$

Omitted Variable Bias (II)


$\hat\gamma_1 = \hat\beta_1 + \hat\beta_2 \frac{cov_n(educ_i, IQ_i)}{var_n(educ_i)}$

The estimate $\hat\gamma_1$ in the short model captures two effects on wages:

1. the estimate of the ceteris paribus effect of $educ_i$ for a given IQ level: $\hat\beta_1$
2. the estimate of the effects of changes in $IQ_i$ which are simultaneous to changes in $educ_i$: $\hat\beta_2 \frac{cov_n(educ_i, IQ_i)}{var_n(educ_i)}$

Note that $\frac{cov_n(educ_i, IQ_i)}{var_n(educ_i)}$ is the slope of the sample regression of IQ on educ: it captures the change in IQ that accompanies a one-unit change in educ!
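This decomposition is an exact in-sample identity, which can be verified on simulated data. The sketch below uses hypothetical data-generating values for wages, educ and IQ. It fits the long and the short regression and compares the short-regression slope with the long-regression slope on educ plus the long-regression slope on IQ times $cov_n(educ, IQ) / var_n(educ)$.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 500
iq = rng.normal(100, 15, size=n)                         # hypothetical IQ scores
educ = 12 + 0.05 * (iq - 100) + rng.normal(size=n)       # education correlated with IQ
wages = 5 + 1.0 * educ + 0.1 * iq + rng.normal(size=n)   # hypothetical long model

# Long regression: wages on constant, educ, IQ
Z_long = np.column_stack([np.ones(n), educ, iq])
b = np.linalg.lstsq(Z_long, wages, rcond=None)[0]

# Short regression: wages on constant, educ
Z_short = np.column_stack([np.ones(n), educ])
g = np.linalg.lstsq(Z_short, wages, rcond=None)[0]

# Right-hand side of the omitted variable formula
cov_educ_iq = np.cov(educ, iq)[0, 1]
var_educ = np.var(educ, ddof=1)
rhs = b[1] + b[2] * cov_educ_iq / var_educ

print("short-regression slope:", g[1])
print("long slope + bias term:", rhs)
```

The two printed numbers should agree up to floating-point error, whatever the data-generating values.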

Determinants of the Grade Point Average revisited

Simple Regression: $\widehat{colGPA} = 2.40 + 0.027\, ACT$
Adding hsGPA: $\widehat{colGPA} = 1.29 + 0.009\, ACT + 0.453\, hsGPA$
What happens to the estimate of the coefficient on ACT?
  the estimate in the long regression is much smaller than in the short regression because ACT and hsGPA work in the same direction (both slopes are positive) and they are positively correlated
  the short regression is still useful if we want to predict colGPA but we do not have information on hsGPA
  the short-regression estimate, however, cannot be interpreted as the effect of ACT only
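The gap between the two ACT coefficients can be reproduced with the omitted variable formula, using the rounded sample moments from the GPA data slide, $cov_n(ACT, hsGPA) = 0.315$ and $var_n(ACT) = 8.090$. The labels $\hat\gamma_1$ (short-regression slope) and $\hat\beta_1$, $\hat\beta_2$ (long-regression slopes) are notation assumed here, and the result is approximate because all inputs are rounded.

```latex
\hat\gamma_1
  = \hat\beta_1 + \hat\beta_2 \, \frac{cov_n(ACT, hsGPA)}{var_n(ACT)}
  \approx 0.009 + 0.453 \times \frac{0.315}{8.090}
  \approx 0.009 + 0.018
  = 0.027
```

With these rounded moments, the sample correlation between ACT and hsGPA is $0.315 / \sqrt{8.090 \times 0.102} \approx 0.35$.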

Omitted Variable Bias in more general setups


When one regressor is correlated with an omitted variable, the estimated coefficients of all the regressors may be biased

This contagion depends on the correlation structure of all the regressors at the same time

It is also very difficult to predict the sign of the bias, even for the regressor which is correlated with the omitted variable

To discuss potential biases, it is nevertheless usual practice to argue as if the regressors were not correlated with each other

Standard Errors

Standard Deviations of OLS Coefficients


 
We know that under A1-A4, $E(\hat\beta_j) = \beta_j$
How dispersed is $\hat\beta_j$ around $\beta_j$ from sample to sample?
We need to know $\text{St.Dev.}(\hat\beta_j | x_1, \dots, x_k)$ to construct standard errors of OLS estimators
Additional Assumption:
A5: Homoskedasticity: $Var(u | x_1, \dots, x_k) = \sigma^2$
Under A1-A5 (Gauss-Markov assumptions), for each slope $j = 1, \dots, k$:

$Var(\hat\beta_j | x_1, \dots, x_k) = \frac{\sigma^2}{TSS_{x_j} (1 - R^2_{x_j})}$

Intuition

$Var(\hat\beta_j | x_1, \dots, x_k) = \frac{\sigma^2}{TSS_{x_j} (1 - R^2_{x_j})}$

For a given sample, estimates are less precise:
  the larger the population variance of the error term: $\sigma^2$
  the smaller the total sample variation of the regressor: $TSS_{x_j} = \sum_{i=1}^{n} (x_{ji} - \bar x_j)^2$
  the larger the proportion of variation in each control explained by the other controls: $R^2_{x_j}$, the $R^2$ from regressing $x_j$ on the other regressors

Summing up: the informative changes in a control are those independent from the other regressors
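The variance formula can be cross-checked against the standard matrix expression $\sigma^2 (X'X)^{-1}$, which is not shown in the slides but yields the same conditional variance. This is a sketch on simulated regressors with a hypothetical error variance `sigma2`. It computes $TSS_{x_1}$ and $R^2_{x_1}$ explicitly and compares the two expressions for $j = 1$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, sigma2 = 200, 4.0                         # sigma2: hypothetical error variance
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(size=n)           # x2 is correlated with x1
Z = np.column_stack([np.ones(n), x1, x2])

# Matrix expression: sigma^2 times the diagonal element of (Z'Z)^{-1} for x1
var_matrix = sigma2 * np.linalg.inv(Z.T @ Z)[1, 1]

# Slide formula: sigma^2 / (TSS_x1 * (1 - R^2_x1))
tss_x1 = np.sum((x1 - x1.mean()) ** 2)
W = np.column_stack([np.ones(n), x2])        # the other regressors, including the constant
fitted = W @ np.linalg.lstsq(W, x1, rcond=None)[0]
r2_x1 = 1.0 - np.sum((x1 - fitted) ** 2) / tss_x1
var_formula = sigma2 / (tss_x1 * (1.0 - r2_x1))

print("R^2 of x1 on the other regressors:", r2_x1)
print("variance, matrix expression:", var_matrix)
print("variance, slide formula    :", var_formula)
```

The two variances should coincide. Raising the 0.8 weight in the construction of `x2` pushes $R^2_{x_1}$ toward one and inflates both.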

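An Unbiased Estimator for $\sigma^2$

$\hat\sigma^2 = \frac{RSS}{n - k - 1}$

$(n - k - 1)$ is the degrees of freedom (the working sample size)
The square root of $\hat\sigma^2$, $\hat\sigma$, is the estimator for the standard deviation of the error term
For inference, we need to estimate:

$\text{St.Dev.}(\hat\beta_j | x_1, \dots, x_k) = \frac{\sigma}{\sqrt{TSS_{x_j} (1 - R^2_{x_j})}}$

Simply replace $\sigma$ with its estimator $\hat\sigma$:

$\text{St.Err.}(\hat\beta_j | x_1, \dots, x_k) = \frac{\hat\sigma}{\sqrt{TSS_{x_j} (1 - R^2_{x_j})}}$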
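The steps above can be sketched in code on simulated data. The data-generating values and the helper `std_err` are hypothetical, introduced only for this illustration. The code estimates $\hat\sigma^2$ with the degrees-of-freedom correction, builds each slope's standard error from the formula, and compares the result with the square roots of the diagonal of $\hat\sigma^2 (X'X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(7)
n, k = 120, 2
X = rng.normal(size=(n, k))
y = 1.0 + 0.5 * X[:, 0] + 0.25 * X[:, 1] + rng.normal(scale=2.0, size=n)
Z = np.column_stack([np.ones(n), X])

b = np.linalg.lstsq(Z, y, rcond=None)[0]
rss = np.sum((y - Z @ b) ** 2)
sigma2_hat = rss / (n - k - 1)               # unbiased estimator of sigma^2
sigma_hat = np.sqrt(sigma2_hat)

def std_err(j):
    """Standard error of slope j (1-based): sigma_hat / sqrt(TSS_j * (1 - R_j^2))."""
    xj = X[:, j - 1]
    others = np.column_stack([np.ones(n), np.delete(X, j - 1, axis=1)])
    resid_j = xj - others @ np.linalg.lstsq(others, xj, rcond=None)[0]
    # TSS_j * (1 - R_j^2) equals the residual sum of squares of x_j on the other regressors
    return sigma_hat / np.sqrt(np.sum(resid_j ** 2))

se = np.array([std_err(j) for j in range(1, k + 1)])
se_matrix = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(Z.T @ Z)))[1:]

print("sigma_hat:", sigma_hat)
print("standard errors (formula):", se)
print("standard errors (matrix) :", se_matrix)
```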

Gauss-Markov Theorem

Under A1-A5, OLS is BLUE:
  Best (smallest variance, among linear unbiased estimators, for any linear combination of the $\beta_j$)
  Linear
  Unbiased
  Estimator

Thus, if the assumptions hold, use OLS
(Although some biased estimators may trade a little bias for a lower variance)


Summary

Sometimes, the omission of variables can lead to bias in the OLS estimator
The best solution to avoid omitted variable (OV) bias is to include the omitted variable in the regression
Slope coefficients in the multiple regression model capture ceteris paribus effects
They are identified by the orthogonality conditions
