
Multiple Regression Analysis

y = b0 + b1x1 + b2x2 + ... + bkxk + u

1. Estimation

1
Parallels with Simple Regression
b0 is still the intercept
b1 to bk all called slope parameters
u is still the error term (or disturbance)
Still need to make a zero conditional mean
assumption, so now assume that
E(u|x1, x2, ..., xk) = 0
Still minimizing the sum of squared residuals, so
have k+1 first order conditions
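
As a minimal sketch (simulated data and illustrative names, not the slides' example), the k+1 first order conditions are the normal equations X'Xb = X'y, which can be solved directly:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=(n, 2))                  # two explanatory variables (k = 2)
u = rng.normal(size=n)                       # error term
y = 1.0 + 0.5 * x[:, 0] - 0.3 * x[:, 1] + u  # true b0 = 1.0, b1 = 0.5, b2 = -0.3

X = np.column_stack([np.ones(n), x])         # column of ones for the intercept
b_hat = np.linalg.solve(X.T @ X, X.T @ y)    # solves the k + 1 normal equations
print(b_hat)                                 # estimates of [b0, b1, b2]
```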

2
Interpreting Multiple Regression

$\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2 + \ldots + \hat{b}_k x_k$, so

$\Delta\hat{y} = \hat{b}_1 \Delta x_1 + \hat{b}_2 \Delta x_2 + \ldots + \hat{b}_k \Delta x_k$,

so holding $x_2, \ldots, x_k$ fixed implies that

$\Delta\hat{y} = \hat{b}_1 \Delta x_1$, that is, each $\hat{b}$ has
a ceteris paribus interpretation
3
A Partialling Out Interpretation

Consider the case where k = 2, i.e.
$\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2$. Then

$\hat{b}_1 = \dfrac{\sum \hat{r}_{i1} y_i}{\sum \hat{r}_{i1}^2}$,

where the $\hat{r}_{i1}$ are the residuals from the estimated
regression $\hat{x}_1 = \hat{\gamma}_0 + \hat{\gamma}_2 x_2$

4
Partialling Out continued
The previous equation implies that regressing y on x1
and x2 gives the same effect of x1 as regressing y on the
residuals from a regression of x1 on x2
This means only the part of xi1 that is uncorrelated
with xi2 is being related to yi, so we're estimating
the effect of x1 on y after x2 has been partialled
out
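
A minimal numerical check of this partialling-out result, on simulated data with illustrative names (not an example from the slides):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
x2 = rng.normal(size=n)
x1 = 0.6 * x2 + rng.normal(size=n)           # x1 correlated with x2
y = 1.0 + 2.0 * x1 - 1.5 * x2 + rng.normal(size=n)

# multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b_hat = np.linalg.solve(X.T @ X, X.T @ y)

# residuals from regressing x1 on (1, x2), then slope of y on those residuals
Z = np.column_stack([np.ones(n), x2])
r1 = x1 - Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
b1_partial = (r1 @ y) / (r1 @ r1)            # sum(r_i1 * y_i) / sum(r_i1^2)

print(b_hat[1], b1_partial)                  # the two slope estimates coincide
```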

5
Simple vs Multiple Reg Estimate

Compare the simple regression $\tilde{y} = \tilde{b}_0 + \tilde{b}_1 x_1$
with the multiple regression $\hat{y} = \hat{b}_0 + \hat{b}_1 x_1 + \hat{b}_2 x_2$

Generally, $\tilde{b}_1 \neq \hat{b}_1$ unless:
$\hat{b}_2 = 0$ (i.e. no partial effect of $x_2$), OR
$x_1$ and $x_2$ are uncorrelated in the sample

6
Goodness-of-Fit
We can think of each observation as being made up of an
explained part and an unexplained part, $y_i = \hat{y}_i + \hat{u}_i$.
We then define the following:

$\sum (y_i - \bar{y})^2$ is the total sum of squares (SST)

$\sum (\hat{y}_i - \bar{y})^2$ is the explained sum of squares (SSE)

$\sum \hat{u}_i^2$ is the residual sum of squares (SSR)

Then SST = SSE + SSR


7
Goodness-of-Fit (continued)
How do we think about how well our
sample regression line fits our sample data?

Can compute the fraction of the total sum of squares (SST)
that is explained by the model; call this the R-squared of
the regression:

R2 = SSE/SST = 1 − SSR/SST

8
Goodness-of-Fit (continued)

We can also think of $R^2$ as being equal to the squared
correlation coefficient between the actual $y_i$ and the
fitted values $\hat{y}_i$:

$R^2 = \dfrac{\left(\sum (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right)^2}{\left(\sum (y_i - \bar{y})^2\right)\left(\sum (\hat{y}_i - \bar{\hat{y}})^2\right)}$
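
A short sketch tying the last three slides together, using simulated data and illustrative names: it computes SST, SSE and SSR and confirms that SSE/SST, 1 − SSR/SST, and the squared correlation between the actual and fitted values all give the same R².

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
x = rng.normal(size=(n, 2))
y = 1.0 + 0.8 * x[:, 0] - 0.4 * x[:, 1] + rng.normal(size=n)

X = np.column_stack([np.ones(n), x])
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ b_hat
u_hat = y - y_hat

sst = np.sum((y - y.mean()) ** 2)        # total sum of squares
sse = np.sum((y_hat - y.mean()) ** 2)    # explained sum of squares
ssr = np.sum(u_hat ** 2)                 # residual sum of squares

print(sse / sst, 1 - ssr / sst, np.corrcoef(y, y_hat)[0, 1] ** 2)  # all equal
```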

9
More about R-squared
R2 can never decrease when another independent
variable is added to a regression, and usually will
increase

Because R2 will usually increase with the number of
independent variables, it is not a good way to
compare models
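
A quick illustration of this point on simulated data (illustrative names): adding a regressor that is pure noise still does not lower R².

```python
import numpy as np

rng = np.random.default_rng(3)
n = 200
x1 = rng.normal(size=n)
y = 1.0 + 0.5 * x1 + rng.normal(size=n)

def r_squared(X, y):
    b = np.linalg.solve(X.T @ X, X.T @ y)
    resid = y - X @ b
    return 1 - (resid @ resid) / ((y - y.mean()) @ (y - y.mean()))

X_small = np.column_stack([np.ones(n), x1])
X_big = np.column_stack([X_small, rng.normal(size=n)])  # add an irrelevant regressor
print(r_squared(X_small, y), r_squared(X_big, y))       # second is never smaller
```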

10
Assumptions for Unbiasedness
Population model is linear in parameters:
y = b0 + b1x1 + b2x2 + ... + bkxk + u
We can use a random sample of size n,
{(xi1, xi2, ..., xik, yi): i = 1, 2, ..., n}, from the
population model, so that the sample model
is yi = b0 + b1xi1 + b2xi2 + ... + bkxik + ui
E(u|x1, x2, ..., xk) = 0, implying that all of the
explanatory variables are exogenous
None of the x's is constant, and there are no
exact linear relationships among them
11
Too Many or Too Few Variables
What happens if we include variables in our
specification that don't belong?
There is no effect on our parameter estimates, and
OLS remains unbiased

What if we exclude a variable from our
specification that does belong?
OLS will usually be biased

12
Omitted Variable Bias

Suppose the true model is given as
$y = b_0 + b_1 x_1 + b_2 x_2 + u$, but we
estimate $\tilde{y} = \tilde{b}_0 + \tilde{b}_1 x_1$. Then

$\tilde{b}_1 = \dfrac{\sum (x_{i1} - \bar{x}_1)\, y_i}{\sum (x_{i1} - \bar{x}_1)^2}$

13
Omitted Variable Bias (cont)
Recall the true model, so that
$y_i = b_0 + b_1 x_{i1} + b_2 x_{i2} + u_i$, so the
numerator becomes

$\sum (x_{i1} - \bar{x}_1)(b_0 + b_1 x_{i1} + b_2 x_{i2} + u_i)$
$= b_1 \sum (x_{i1} - \bar{x}_1)^2 + b_2 \sum (x_{i1} - \bar{x}_1) x_{i2} + \sum (x_{i1} - \bar{x}_1) u_i$

14
Omitted Variable Bias (cont)

$\tilde{b}_1 = b_1 + b_2 \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2} + \dfrac{\sum (x_{i1} - \bar{x}_1)\, u_i}{\sum (x_{i1} - \bar{x}_1)^2}$

Since $E(u_i) = 0$, taking expectations we have

$E(\tilde{b}_1) = b_1 + b_2 \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$

15
Omitted Variable Bias (cont)

Consider the regression of $x_2$ on $x_1$:
$\tilde{x}_2 = \tilde{\delta}_0 + \tilde{\delta}_1 x_1$, where
$\tilde{\delta}_1 = \dfrac{\sum (x_{i1} - \bar{x}_1)\, x_{i2}}{\sum (x_{i1} - \bar{x}_1)^2}$

so $E(\tilde{b}_1) = b_1 + b_2 \tilde{\delta}_1$
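
The sample analogue of this result is the exact identity $\tilde{b}_1 = \hat{b}_1 + \hat{b}_2 \tilde{\delta}_1$ (a standard algebraic fact, not stated on the slide), which can be checked numerically on simulated data with illustrative names:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 300
x1 = rng.normal(size=n)
x2 = 0.7 * x1 + rng.normal(size=n)           # x1 and x2 correlated
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# multiple regression of y on (1, x1, x2)
X = np.column_stack([np.ones(n), x1, x2])
b0_hat, b1_hat, b2_hat = np.linalg.solve(X.T @ X, X.T @ y)

# simple regression of y on x1, and auxiliary regression of x2 on x1
b1_tilde = np.cov(x1, y, bias=True)[0, 1] / np.var(x1)
delta1_tilde = np.cov(x1, x2, bias=True)[0, 1] / np.var(x1)

print(b1_tilde, b1_hat + b2_hat * delta1_tilde)   # identical up to rounding
```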

16
Summary of Direction of Bias

              Corr(x1, x2) > 0    Corr(x1, x2) < 0

b2 > 0        Positive bias       Negative bias

b2 < 0        Negative bias       Positive bias

17
Omitted Variable Bias Summary
Two cases where bias is equal to zero
b2 = 0, that is, x2 doesn't really belong in the model
x1 and x2 are uncorrelated in the sample

If the correlation between x2 and x1 and the correlation
between x2 and y go in the same direction, the bias will be
positive
If they go in opposite directions, the bias will be negative

18
The More General Case
Technically, can only sign the bias for the more
general case if all of the included x's are
uncorrelated

Typically, then, we work through the bias assuming
the x's are uncorrelated, as a useful guide even if
this assumption is not strictly true

19
Variance of the OLS Estimators
Now we know that the sampling
distribution of our estimate is centered
around the true parameter
Want to think about how spread out this
distribution is
Much easier to think about this variance
under an additional assumption, so
Assume Var(u|x1, x2, ..., xk) = σ²
(Homoskedasticity)
20
Variance of OLS (cont)
Let x stand for (x1, x2, ..., xk)
Assuming that Var(u|x) = σ² also implies that
Var(y|x) = σ²

The 4 assumptions for unbiasedness, plus this
homoskedasticity assumption, are known as the
Gauss-Markov assumptions

21
Variance of OLS (cont)

Given the Gauss-Markov assumptions,

$\mathrm{Var}(\hat{b}_j) = \dfrac{\sigma^2}{\mathrm{SST}_j \,(1 - R_j^2)}$, where

$\mathrm{SST}_j = \sum (x_{ij} - \bar{x}_j)^2$ and $R_j^2$ is the $R^2$
from regressing $x_j$ on all other x's
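
As a numerical sketch (simulated regressors, σ² treated as known, illustrative names), this formula for a slope coefficient matches the corresponding diagonal element of σ²(X'X)⁻¹:

```python
import numpy as np

rng = np.random.default_rng(5)
n, sigma2 = 400, 2.0
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
X = np.column_stack([np.ones(n), x1, x2])

# R_1^2 from regressing x1 on a constant and the other regressor x2
Z = np.column_stack([np.ones(n), x2])
x1_fit = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r1_sq = 1 - np.sum((x1 - x1_fit) ** 2) / np.sum((x1 - x1.mean()) ** 2)
sst1 = np.sum((x1 - x1.mean()) ** 2)

var_b1_formula = sigma2 / (sst1 * (1 - r1_sq))          # the slide's expression
var_b1_matrix = sigma2 * np.linalg.inv(X.T @ X)[1, 1]   # matrix form
print(var_b1_formula, var_b1_matrix)                    # agree up to rounding
```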

22
Components of OLS Variances
The error variance: a larger σ² implies a larger
variance for the OLS estimators
The total sample variation: a larger SSTj implies a
smaller variance for the estimators
Linear relationships among the independent
variables: a larger Rj² implies a larger variance for
the estimators

23
Misspecified Models

Consider again the misspecified model
$\tilde{y} = \tilde{b}_0 + \tilde{b}_1 x_1$, so that
$\mathrm{Var}(\tilde{b}_1) = \dfrac{\sigma^2}{\mathrm{SST}_1}$

Thus, $\mathrm{Var}(\tilde{b}_1) < \mathrm{Var}(\hat{b}_1)$ unless $x_1$ and
$x_2$ are uncorrelated; then they're the same

24
Misspecified Models (cont)
While the variance of the estimator is smaller for
the misspecified model, unless b2 = 0 the
misspecified model is biased

As the sample size grows, the variance of each
estimator shrinks to zero, making the variance
difference less important
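
A small Monte Carlo sketch of this bias/variance trade-off, holding the regressors fixed across replications (as the conditional variance formulas assume); all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(7)
n, reps, b1, b2 = 100, 5000, 1.0, 1.0
x1 = rng.normal(size=n)                        # regressors held fixed across draws
x2 = 0.5 * x1 + rng.normal(size=n)             # x1 and x2 correlated
X_long = np.column_stack([np.ones(n), x1, x2])
X_short = np.column_stack([np.ones(n), x1])

short, full = [], []
for _ in range(reps):
    y = b1 * x1 + b2 * x2 + rng.normal(size=n)     # only the error is redrawn
    full.append(np.linalg.solve(X_long.T @ X_long, X_long.T @ y)[1])
    short.append(np.linalg.solve(X_short.T @ X_short, X_short.T @ y)[1])

print(np.mean(short), np.var(short))  # biased away from b1 = 1.0, smaller variance
print(np.mean(full), np.var(full))    # centered on 1.0, larger variance
```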

25
Estimating the Error Variance
We don't know what the error variance, σ², is,
because we don't observe the errors, ui

What we observe are the residuals, ûi

We can use the residuals to form an estimate
of the error variance

26
Error Variance Estimate (cont)

$\hat{\sigma}^2 = \dfrac{\sum \hat{u}_i^2}{n - k - 1} = \mathrm{SSR}/\mathrm{df}$

thus, $\mathrm{se}(\hat{b}_j) = \hat{\sigma} \big/ \left[\mathrm{SST}_j (1 - R_j^2)\right]^{1/2}$

df = n − (k + 1), or df = n − k − 1
df (i.e. degrees of freedom) is the (number of
observations) − (number of estimated parameters)
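
A sketch of these formulas on simulated data (illustrative names): estimate σ² by SSR/(n − k − 1) and form se(b̂1) from SST1 and R1²:

```python
import numpy as np

rng = np.random.default_rng(6)
n, k = 250, 2
x1 = rng.normal(size=n)
x2 = 0.4 * x1 + rng.normal(size=n)
y = 1.0 + 0.7 * x1 - 0.2 * x2 + rng.normal(size=n)

X = np.column_stack([np.ones(n), x1, x2])
b_hat = np.linalg.solve(X.T @ X, X.T @ y)
u_hat = y - X @ b_hat

sigma2_hat = (u_hat @ u_hat) / (n - k - 1)     # SSR / df

# standard error of b1 via SST_1 and R_1^2
Z = np.column_stack([np.ones(n), x2])
x1_fit = Z @ np.linalg.solve(Z.T @ Z, Z.T @ x1)
r1_sq = 1 - np.sum((x1 - x1_fit) ** 2) / np.sum((x1 - x1.mean()) ** 2)
sst1 = np.sum((x1 - x1.mean()) ** 2)
se_b1 = np.sqrt(sigma2_hat / (sst1 * (1 - r1_sq)))

print(b_hat[1], se_b1)                         # slope estimate and its standard error
```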

27
The Gauss-Markov Theorem
Given our 5 Gauss-Markov Assumptions it can be
shown that OLS is BLUE
Best
Linear
Unbiased
Estimator
Thus, if the assumptions hold, use OLS

28
