
2007.01.31 - SECTION #3 (WITH ANSWERS)
ANGIE CHEN & KEITH V. LUCAS
PP240B, SPRING 2007
January 31, 2007: Section #3
SECTION AGENDA
Questions and digressions are encouraged throughout.
1. This week in quant
2. Practice problems
3. Freeform Q&A
THIS WEEK IN QUANT
PART I: LINEAR REGRESSION MODELS, FROM POPULATION TO PREDICTION

Population Model
The goal of regression analysis is to model the true population-level relationship between two variables
We are interested in the unknown population model $y = \beta_0 + \beta_1 x + \varepsilon$
o y is the dependent variable
o x is the independent variable
o $\beta_0$ is the unknown y-intercept for the population
o $\beta_1$ is the unknown slope of the line for the population
o $\varepsilon$ represents all of the unobserved factors that impact y

Sample Model
We do not know the true population parameters ($\beta_0$ and $\beta_1$), so we approximate their values
by taking a sample of n randomly selected observations
The sample model is written as $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$
o $y_i$ is the value of the ith observation of the dependent variable
o $x_i$ is the value of the ith observation of the independent variable
o $\beta_0$ is the unknown y-intercept, which we will estimate using observations of $x_i$ and $y_i$
o $\beta_1$ is the unknown slope of the line, which we will estimate using observations of $x_i$ and $y_i$
o $\varepsilon_i$ represents all of the unobserved factors that impact the ith observation of y

Predicted Model
We use the observations from our random sample to predict the true relationship between x
and y by fitting a line to the observed data
The predicted model is the fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$
o $\hat{y}$ is the fitted value of the dependent variable
o x is the independent variable
o $\hat{\beta}_0$ is the fitted value of the y-intercept
o $\hat{\beta}_1$ is the fitted slope of the line
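The population/sample/predicted distinction can be made concrete with a short simulation. This is a minimal sketch (not part of the original handout), assuming NumPy is available; the true parameter values, sample size, and seed are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Population model: y = beta0 + beta1*x + epsilon (in practice the betas are unknown)
beta0, beta1 = 1.0, 0.5

# Sample model: n randomly selected observations (x_i, y_i)
n = 100
x = rng.uniform(0, 10, size=n)
eps = rng.normal(0, 1, size=n)          # unobserved factors
y = beta0 + beta1 * x + eps

# Predicted model: fit the line yhat = b0_hat + b1_hat * x
b1_hat = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0_hat = y.mean() - b1_hat * x.mean()
print(b0_hat, b1_hat)   # estimates should be close to (1.0, 0.5)
```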

PART II: ORDINARY LEAST SQUARES (OLS) ASSUMPTIONS
Ordinary Least Squares is a method for estimating the parameters of a linear regression (e.g., $\beta_0$ and
$\beta_1$). The OLS estimates are calculated by minimizing the sum of squared residuals. Before
undertaking this process, we should understand the assumptions that underlie OLS.

Rucker and the book (Wooldridge, chapter 2) present two different (but equivalent) sets of OLS
assumptions. Although the specific assumptions are different, both sets lead to the same equations
for calculating OLS estimates. The purpose of understanding these assumptions is to identify when
OLS is a good estimator and when it is not. We will focus on Rucker's set of five assumptions for
estimating a linear regression model using OLS.

$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
1. Exogeneity: $E(y \mid x = x_i) = \beta_0 + \beta_1 x_i$; $\mathrm{cov}(x_i, \varepsilon_i) = 0$ and $E(\varepsilon \mid x) = 0$
   Meaning: The expected value of y is the regression line; x is independent of the unobserved determinants ($\varepsilon$) of y
   Violation: Violated if x is not independent of the unobserved determinants ($\varepsilon$) of y (Omitted Variables Bias)

2. Zero Mean Error: $E(\varepsilon_i) = 0$
   Meaning: The mean of the error terms for all $x_i$ equals 0
   Violation: Violated if the mean of the error terms does not equal zero (biased intercept $\beta_0$)

3. Homoskedasticity: $\mathrm{var}(\varepsilon_i) = \sigma^2$
   Meaning: Variance in the error term is constant for all values of x (note that homoskedasticity does not affect bias, but it does impact efficiency)
   Violation: Violated if the variance in the error term is not constant (Heteroskedasticity)

4. Errors Are Independent: $\mathrm{cov}(\varepsilon_i, \varepsilon_j) = 0$ for $i \neq j$
   Meaning: Error terms are independent across observations
   Violation: Violated if the error terms of observations are related (Measurement Error)

5. Errors Normally Distributed: $\varepsilon \sim N(0, \sigma^2)$
   Meaning: Error terms are normally distributed around the regression line with mean 0 and variance $\sigma^2$

Assumptions 2-5 can be summarized as the NIID assumptions: the error terms of the regression model are
independently and identically distributed as a normal distribution with mean zero and constant variance
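The note under assumption 3 (heteroskedasticity affects efficiency, not bias) can be seen in a small repeated-sampling experiment. A minimal sketch (not from the handout), assuming NumPy; the parameter values and error structure are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
beta0, beta1, n, reps = 1.0, 0.5, 50, 2000

slopes = []
for _ in range(reps):
    x = rng.uniform(1, 10, size=n)
    # Assumption 3 violated: the error standard deviation grows with x
    y = beta0 + beta1 * x + rng.normal(0, 0.5 * x, size=n)
    slopes.append(np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))

# The average estimate is still close to the true slope of 0.5:
# heteroskedasticity does not bias OLS, but it does change the sampling
# variance of the estimator (efficiency and standard errors are affected).
print(np.mean(slopes), np.std(slopes))
```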

PART III: CALCULATING OLS ESTIMATES
Based on the assumptions above, we use OLS to estimate the relationship between x and y. The
following is a derivation of the OLS estimators for $\beta_0$ and $\beta_1$.

We do not know the true population parameters ($\beta_0$ and $\beta_1$), so we approximate their values by taking
a sample of n randomly selected observations and specify a sample model
$$y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$$
Using the observed data, we create an estimated model
$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i$$
We define the residual as $e_i = y_i - \hat{y}_i$ = the observed value of y minus the predicted value of y
Substituting for $\hat{y}_i$ we get
$$e_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$$
For each observation $(x_i, y_i)$ there is a residual ($e_i$)
We need to minimize the Sum of Squared Residuals (SSR)
$$SSR = \sum_{i=1}^{n} e_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2$$

First, we take the partial derivative with respect to $\hat{\beta}_0$ and set it equal to zero
$$\frac{\partial SSR}{\partial \hat{\beta}_0} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i) = 0$$
We divide by -2 and substitute $e_i$ for $y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$ to get $\sum_{i=1}^{n} e_i = 0$, which means $\bar{e} = 0$
The residuals sum to zero and their mean is zero
Next, we take the partial derivative with respect to $\hat{\beta}_1$ and set it equal to zero
$$\frac{\partial SSR}{\partial \hat{\beta}_1} = -2 \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)\, x_i = 0$$
Again, we divide by -2 and substitute $e_i$ for $y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i$ to get $\sum_{i=1}^{n} e_i x_i = 0$
Therefore x and e do not covary
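Both first-order conditions can be checked numerically on any OLS fit. A minimal sketch (not from the handout), assuming NumPy; the simulated data values are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.uniform(0, 10, size=30)
y = 2.0 + 0.7 * x + rng.normal(0, 1, size=30)   # arbitrary "true" model

# OLS estimates
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()

# Residuals e_i = y_i - b0 - b1*x_i
e = y - (b0 + b1 * x)

# First-order conditions: residuals sum to zero and are uncorrelated with x
print(e.sum())        # ~0 (up to floating-point error)
print((e * x).sum())  # ~0 (up to floating-point error)
```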
Using these first-order conditions (and a lot more math, which Rucker will derive in lecture), we can
derive equations to calculate estimates of $\beta_0$ (the y-intercept) and $\beta_1$ (the slope of the regression line)

$\hat{\beta}_1$ (the estimated slope) is the change in y with a one-unit change in x
$$\hat{\beta}_1 = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) / (n - 1)}{\sum_{i=1}^{n} (x_i - \bar{x})^2 / (n - 1)} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}$$

$\hat{\beta}_0$ (the estimated y-intercept) is the value of y when x is zero
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$
The predicted regression line will always pass through the mean values of x and y
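These formulas translate directly into code. A minimal sketch (not from the handout), assuming NumPy, with simulated data used only for illustration; it checks that the deviation-sum formula equals cov(x, y)/var(x) and that the fitted line passes through the point of means:

```python
import numpy as np

rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=40)
y = 1.5 + 0.8 * x + rng.normal(0, 1, size=40)

# Slope: sum of (x_i - xbar)(y_i - ybar) over sum of (x_i - xbar)^2
xbar, ybar = x.mean(), y.mean()
b1 = ((x - xbar) * (y - ybar)).sum() / ((x - xbar) ** 2).sum()

# Intercept: ybar - b1 * xbar
b0 = ybar - b1 * xbar

# Equivalent to cov(x, y) / var(x), since the (n - 1) factors cancel
assert np.isclose(b1, np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1))

# The fitted line passes through the point of means (xbar, ybar)
assert np.isclose(b0 + b1 * xbar, ybar)
print(b0, b1)
```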

PART IV: OLS GOODNESS OF FIT
Now that we understand the algebra associated with fitting a regression line to observed data, we
can consider the goodness of fit, or the amount of total variation in y that is explained by our model

Total variation from mean = variation explained by x + unexplained variation
SST = SSE + SSR
SST = $\sum_{i=1}^{n} (y_i - \bar{y})^2$
o SST = total sum of squares
o The sum of the total variation between observed $y_i$ and the mean ($\bar{y}$)
o SST = SSE + SSR

SSE = $\sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 = \hat{\beta}_1^2 \sum_{i=1}^{n} (x_i - \bar{x})^2$
o SSE = explained sum of squares
o The sum of all the explained variation between predicted $\hat{y}_i$ and the mean ($\bar{y}$)
o SSE = SST - SSR

SSR = $\sum_{i=1}^{n} (y_i - \hat{y}_i)^2 = \sum_{i=1}^{n} e_i^2$
o SSR = sum of squared residuals
o The sum of all the unexplained variation between observed $y_i$ and predicted $\hat{y}_i$
o SSR = SST - SSE

Coefficient of Determination
$R^2$ is the proportion of the variation in y that is determined by x in our predicted model
$$R^2 = \frac{SSE}{SST} = 1 - \frac{SSR}{SST}$$
o This is the ratio of the explained variation to the total variation
o $R^2$ ranges between 0 and 1
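The decomposition and both expressions for $R^2$ can be verified numerically. A minimal sketch (not from the handout), assuming NumPy, with simulated data:

```python
import numpy as np

rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=50)
y = 2.0 + 0.6 * x + rng.normal(0, 1.5, size=50)

# Fit by OLS
b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()
y_hat = b0 + b1 * x

# Sums of squares
sst = ((y - y.mean()) ** 2).sum()       # total variation
sse = ((y_hat - y.mean()) ** 2).sum()   # explained variation
ssr = ((y - y_hat) ** 2).sum()          # unexplained variation (residuals)

print(np.isclose(sst, sse + ssr))       # SST = SSE + SSR
print(sse / sst, 1 - ssr / sst)         # both equal R^2
```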

PART V: THE Y'S OF LINEAR REGRESSION

[Figure: scatter plot of the sample data with the fitted regression line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ and the sample average point ($\bar{x}$, $\bar{y}$). For one observation, the total deviation $y_i - \bar{y}$ is split into the explained part $\hat{y}_i - \bar{y}$ and the residual $y_i - \hat{y}_i$.]
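A figure along these lines can be reproduced with a short plotting sketch (not from the handout), assuming NumPy and matplotlib; the data are simulated and the annotated observation is chosen arbitrarily:

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(5)
x = rng.uniform(0, 10, size=20)
y = 1.0 + 0.5 * x + rng.normal(0, 1, size=20)

b1 = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
b0 = y.mean() - b1 * x.mean()

fig, ax = plt.subplots()
ax.scatter(x, y, label="sample data")
xs = np.linspace(x.min(), x.max(), 100)
ax.plot(xs, b0 + b1 * xs, label="regression line")
ax.scatter([x.mean()], [y.mean()], marker="s", label="sample average (xbar, ybar)")
ax.axhline(y.mean(), linewidth=0.5)

# Decompose y_i - ybar = (yhat_i - ybar) + (y_i - yhat_i) for one observation
i = int(np.argmax(x))
yhat_i = b0 + b1 * x[i]
ax.vlines(x[i], y.mean(), yhat_i, linestyles="dashed", label="explained: yhat_i - ybar")
ax.vlines(x[i], yhat_i, y[i], linestyles="dotted", label="residual: y_i - yhat_i")

ax.set_xlabel("x")
ax.set_ylabel("y")
ax.legend()
plt.show()
```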
PRACTICE PROBLEM
1. The following table contains the ACT scores and the GPA for eight college students. GPA
is based on a four-point scale and has been rounded to one digit after the decimal.

Student GPA ACT
1 2.8 21
2 3.4 24
3 3.0 26
4 3.5 27
5 3.6 29
6 3.0 25
7 2.7 25
8 3.7 30

Student | GPA ($y$) | $y - \bar{y}$ | ACT ($x$) | $x - \bar{x}$ | $(y-\bar{y})(x-\bar{x})$ | $(x-\bar{x})^2$ | $\hat{y}$ | $y - \hat{y}$ | $(y-\bar{y})^2$ | $(\hat{y}-\bar{y})^2$ | $(y-\hat{y})^2$
1 | 2.8 | -0.41 | 21 | -4.88 | 2.01 | 23.77 | 2.71 | 0.09 | 0.17 | 0.25 | 7.35E-03
2 | 3.4 | 0.19 | 24 | -1.88 | -0.35 | 3.52 | 3.02 | 0.38 | 0.04 | 0.04 | 1.44E-01
3 | 3.0 | -0.21 | 26 | 0.13 | -0.03 | 0.02 | 3.23 | -0.23 | 0.05 | 0.00 | 5.07E-02
4 | 3.5 | 0.29 | 27 | 1.13 | 0.32 | 1.27 | 3.33 | 0.17 | 0.08 | 0.01 | 2.98E-02
5 | 3.6 | 0.39 | 29 | 3.13 | 1.21 | 9.77 | 3.53 | 0.07 | 0.15 | 0.10 | 4.64E-03
6 | 3.0 | -0.21 | 25 | -0.88 | 0.19 | 0.77 | 3.12 | -0.12 | 0.05 | 0.01 | 1.51E-02
7 | 2.7 | -0.51 | 25 | -0.88 | 0.45 | 0.77 | 3.12 | -0.42 | 0.26 | 0.01 | 1.79E-01
8 | 3.7 | 0.49 | 30 | 4.13 | 2.01 | 17.02 | 3.63 | 0.07 | 0.24 | 0.18 | 4.35E-03
mean | 3.21 |  | 25.9 |  |  |  |  |  |  |  |
sum |  |  |  |  | 5.81 | 56.88 |  | 0.00 | 1.03 | 0.59 | 4.35E-01
 |  |  |  |  | = cov(x,y) | = var(x) |  | = $\sum e$ | = SST | = SSE | = SSR

a. Estimate the relationship between GPA and ACT using OLS; that is, obtain the
intercept and slope estimates in the equation
$$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 ACT$$
Comment on the direction of the relationship. Does the intercept have a useful
interpretation here? Explain. How much higher is the GPA predicted to be if the ACT
score is increased by five points?



$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2} = \frac{\mathrm{cov}(x, y)}{\mathrm{var}(x)} = \frac{5.81}{56.88} = 0.102$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 3.21 - 0.102(25.9) = 0.568$$

The estimated relationship is positive: students with higher ACT scores are predicted to have higher GPAs.
The intercept does not have a useful interpretation here, because it corresponds to a student with
an ACT score of 0, well below any score in the sample; the implied GPA of 0.568 for such a student is not meaningful.

If a student's ACT score increased by five points, his or her GPA would be predicted to
increase by 0.51 points (5 * 0.102 = 0.51).

b. Compute the fitted values and residuals for each observation, and verify that the
residuals (approximately) sum to zero.

See table above.

c. What is the predicted value of GPA when ACT = 20?

$$\widehat{GPA} = \hat{\beta}_0 + \hat{\beta}_1 ACT = 0.568 + (0.102 \times 20) = 2.612$$

d. How much of the variation in GPA for these eight students is explained by ACT?
Explain.

$$R^2 = \frac{SSE}{SST} = \frac{0.59}{1.03} = 0.573$$

ACT score explains 57.3% of the variation in these students' GPAs.
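The answers to (a) through (d) can be verified with a short NumPy sketch (not part of the handout) using the eight observations from the table above. The unrounded $R^2$ is about 0.577; the 0.573 above comes from dividing the rounded sums 0.59/1.03.

```python
import numpy as np

# ACT scores and GPAs for the eight students in the table above
act = np.array([21, 24, 26, 27, 29, 25, 25, 30], dtype=float)
gpa = np.array([2.8, 3.4, 3.0, 3.5, 3.6, 3.0, 2.7, 3.7])

# (a) OLS slope and intercept
b1 = ((act - act.mean()) * (gpa - gpa.mean())).sum() / ((act - act.mean()) ** 2).sum()
b0 = gpa.mean() - b1 * act.mean()
print(b1, b0)            # roughly 0.102 and 0.568

# (b) fitted values and residuals; residuals sum to (approximately) zero
gpa_hat = b0 + b1 * act
e = gpa - gpa_hat
print(e.sum())           # ~0

# (c) predicted GPA when ACT = 20
print(b0 + b1 * 20)      # roughly 2.61

# (d) R^2 = SSE / SST
sst = ((gpa - gpa.mean()) ** 2).sum()
sse = ((gpa_hat - gpa.mean()) ** 2).sum()
print(sse / sst)         # roughly 0.577 (0.573 above uses rounded sums)
```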
