MIT 18.S096, Fall 2013
Regression Analysis
Outline

Regression Analysis
Linear Regression: Overview
Ordinary Least Squares (OLS)
Gauss-Markov Theorem
Generalized Least Squares (GLS)
Distribution Theory: Normal Regression Models
Maximum Likelihood Estimation
Generalized M Estimation
Linear Regression: Overview
Collect the $n$ responses, the $n \times p$ matrix of explanatory variables, and the regression coefficients as

$$
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}, \qquad
X = \begin{pmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\
x_{2,1} & x_{2,2} & \cdots & x_{2,p} \\
\vdots  &         &        & \vdots  \\
x_{n,1} & x_{n,2} & \cdots & x_{n,p}
\end{pmatrix}, \qquad
\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_p \end{pmatrix}
$$
Ordinary Least Squares (OLS)

With fitted values $\hat{y} = X\beta$, the least-squares criterion is

$$
Q(\beta) = \sum_{i=1}^n (y_i - \hat{y}_i)^2
= (y - \hat{y})^T (y - \hat{y})
= (y - X\beta)^T (y - X\beta)
$$

OLS solves $\dfrac{\partial Q(\beta)}{\partial \beta_j} = 0$, $j = 1, 2, \ldots, p$:

$$
\frac{\partial Q(\beta)}{\partial \beta_j}
= \sum_{i=1}^n 2\,(-x_{i,j})\,\bigl[\,y_i - (x_{i,1}\beta_1 + x_{i,2}\beta_2 + \cdots + x_{i,p}\beta_p)\,\bigr]
= -2\,(X_{[j]})^T (y - X\beta)
$$

where $X_{[j]}$ is the $j$th column of $X$.
Stacking the partial derivatives,

$$
\begin{pmatrix}
\frac{\partial Q}{\partial \beta_1} \\
\frac{\partial Q}{\partial \beta_2} \\
\vdots \\
\frac{\partial Q}{\partial \beta_p}
\end{pmatrix}
= -2
\begin{pmatrix}
X_{[1]}^T (y - X\beta) \\
X_{[2]}^T (y - X\beta) \\
\vdots \\
X_{[p]}^T (y - X\beta)
\end{pmatrix}
= -2\, X^T (y - X\beta)
$$

So the OLS estimate $\hat\beta$ solves the Normal Equations:

$$
X^T (y - X\hat\beta) = 0
\quad\Longleftrightarrow\quad
X^T X \hat\beta = X^T y
\quad\Longleftrightarrow\quad
\hat\beta = (X^T X)^{-1} X^T y
$$

N.B. For $\hat\beta$ to exist (uniquely), $(X^T X)$ must be invertible.
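A minimal numerical sketch of solving the Normal Equations (the data below are synthetic; sizes, seed, and coefficients are illustrative, not from the slides):

```python
import numpy as np

# Synthetic data: any full-column-rank X works here.
rng = np.random.default_rng(0)
n, p = 50, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + 0.1 * rng.normal(size=n)

# Solve X^T X beta = X^T y directly; np.linalg.solve is preferred
# over forming the explicit inverse (X^T X)^{-1}.
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Cross-check against the least-squares solver, which solves the same problem.
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
```

Both routes give the same $\hat\beta$; `solve` on the normal equations is the direct transcription of the formula above, while `lstsq` is the numerically preferred black box.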
The fitted values are

$$
\hat{y} =
\begin{pmatrix} \hat y_1 \\ \hat y_2 \\ \vdots \\ \hat y_n \end{pmatrix}
=
\begin{pmatrix}
x_{1,1}\hat\beta_1 + \cdots + x_{1,p}\hat\beta_p \\
x_{2,1}\hat\beta_1 + \cdots + x_{2,p}\hat\beta_p \\
\vdots \\
x_{n,1}\hat\beta_1 + \cdots + x_{n,p}\hat\beta_p
\end{pmatrix}
= X\hat\beta = X (X^T X)^{-1} X^T y = H y
$$

where $H = X (X^T X)^{-1} X^T$ is the "hat" matrix projecting $y$ onto the column space of $X$.
The residuals are

$$
\hat\varepsilon =
\begin{pmatrix} \hat\varepsilon_1 \\ \hat\varepsilon_2 \\ \vdots \\ \hat\varepsilon_n \end{pmatrix}
= y - \hat{y} = (I_n - H)\, y
$$

and the Normal Equations give

$$
X^T (y - X\hat\beta) = X^T \hat\varepsilon = 0_p =
\begin{pmatrix} 0 \\ \vdots \\ 0 \end{pmatrix}
$$

N.B. The Least-Squares Residuals vector is orthogonal to the column space of $X$.
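The projection and orthogonality claims above can be checked numerically (synthetic data; sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 30, 4
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

H = X @ np.linalg.solve(X.T @ X, X.T)    # hat matrix H = X (X^T X)^{-1} X^T
y_hat = H @ y                            # fitted values
resid = y - y_hat                        # residuals (I_n - H) y

# H is a symmetric, idempotent projection onto col(X),
# so the residuals are orthogonal to every column of X.
```

Symmetry ($H = H^T$), idempotence ($H^2 = H$), and $X^T \hat\varepsilon = 0$ all hold to machine precision.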
Gauss-Markov Assumptions

Data

$$
y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_n \end{pmatrix}
\quad\text{and}\quad
X = \begin{pmatrix}
x_{1,1} & x_{1,2} & \cdots & x_{1,p} \\
\vdots  &         &        & \vdots  \\
x_{n,1} & x_{n,2} & \cdots & x_{n,p}
\end{pmatrix}
$$

follow a linear model satisfying the Gauss-Markov Assumptions if $y$ is an observation of the random vector $Y = (Y_1, Y_2, \ldots, Y_n)^T$ and

$E(Y \mid X, \beta) = X\beta$, where $\beta = (\beta_1, \beta_2, \ldots, \beta_p)^T$ is the $p$-vector of regression parameters, and

$\operatorname{Cov}(Y \mid X, \beta) = \sigma^2 I_n$, i.e., the cases are uncorrelated with common variance $\sigma^2 > 0$.
Gauss-Markov Theorem

For known constants $c_1, c_2, \ldots, c_p, c_{p+1}$, consider the problem of estimating

$$
\theta = c_1 \beta_1 + c_2 \beta_2 + \cdots + c_p \beta_p + c_{p+1}.
$$

Under the Gauss-Markov assumptions, the estimator

$$
\hat\theta = c_1 \hat\beta_1 + c_2 \hat\beta_2 + \cdots + c_p \hat\beta_p + c_{p+1},
$$

built from the least-squares estimates $\hat\beta_j$, has the smallest variance among all linear unbiased estimators of $\theta$: it is the best linear unbiased estimator (BLUE).
Sketch of the variance comparison: write a competing linear unbiased estimator as $\tilde\theta = b^T Y + c_{p+1}$ and decompose $b = f + d$, where $f$ lies in the column space of $X$ and $d$ is orthogonal to it. If $\tilde\theta$ is unbiased, then unbiasedness pins down the component $f$, and the orthogonality of $f$ to $d$ implies

$$
\operatorname{Var}(\tilde\theta)
= \operatorname{Var}(f^T Y + d^T Y)
= \operatorname{Var}(f^T Y) + \operatorname{Var}(d^T Y)
= \operatorname{Var}(\hat\theta) + \sigma^2 d^T d
\ \ge\ \operatorname{Var}(\hat\theta),
$$

where the covariance cross term $\sigma^2 f^T d$ vanishes by orthogonality.
Generalized Least Squares (GLS)
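As a hedged sketch of the standard GLS construction (the formula and data below are the textbook form, not taken verbatim from these slides): when $\operatorname{Cov}(Y \mid X) = \Sigma$ with $\Sigma$ known, the GLS estimate is $\hat\beta_{GLS} = (X^T \Sigma^{-1} X)^{-1} X^T \Sigma^{-1} y$, equivalently OLS after "whitening" the data by $\Sigma^{-1/2}$:

```python
import numpy as np

# Synthetic heteroskedastic example with a known diagonal Sigma.
rng = np.random.default_rng(2)
n, p = 40, 2
X = rng.normal(size=(n, p))
beta_true = np.array([2.0, -1.0])

weights = rng.uniform(0.5, 3.0, size=n)      # known per-case error variances
Sigma = np.diag(weights)
y = X @ beta_true + rng.multivariate_normal(np.zeros(n), Sigma)

# GLS: solve (X^T Sigma^{-1} X) beta = X^T Sigma^{-1} y.
Sigma_inv = np.diag(1.0 / weights)
beta_gls = np.linalg.solve(X.T @ Sigma_inv @ X, X.T @ Sigma_inv @ y)

# Whitening view: scale each row by 1/sqrt(variance), then run OLS.
Xw = X / np.sqrt(weights)[:, None]
yw = y / np.sqrt(weights)
beta_white, *_ = np.linalg.lstsq(Xw, yw, rcond=None)
```

The two routes agree exactly, since OLS on the whitened data solves the same weighted normal equations.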
Distribution Theory: Normal Regression Models
Consider the normal linear model

$$
Y = X\beta + \varepsilon, \quad\text{where}\quad
\varepsilon = \begin{pmatrix} \varepsilon_1 \\ \varepsilon_2 \\ \vdots \\ \varepsilon_n \end{pmatrix}
\sim N_n(0_n, \sigma^2 I_n).
$$
Distribution Theory

The conditional mean vector is

$$
\mu = \begin{pmatrix} \mu_1 \\ \vdots \\ \mu_n \end{pmatrix}
= E(Y \mid X, \beta, \sigma^2) = X\beta
$$
The conditional covariance matrix is

$$
\Sigma = \operatorname{Cov}(Y \mid X, \beta, \sigma^2) =
\begin{pmatrix}
\sigma^2 & 0 & \cdots & 0 \\
0 & \sigma^2 & \cdots & 0 \\
\vdots & & \ddots & \vdots \\
0 & 0 & \cdots & \sigma^2
\end{pmatrix}
= \sigma^2 I_n
$$
MGF of Y

For the $n$-variate r.v. $Y$, and constant $n$-vector $t = (t_1, \ldots, t_n)^T$,

$$
\begin{aligned}
M_Y(t) &= E(e^{t^T Y}) = E(e^{t_1 Y_1 + t_2 Y_2 + \cdots + t_n Y_n}) \\
&= E(e^{t_1 Y_1}) \cdot E(e^{t_2 Y_2}) \cdots E(e^{t_n Y_n}) \quad\text{(by independence of the } Y_i\text{)} \\
&= M_{Y_1}(t_1) \cdot M_{Y_2}(t_2) \cdots M_{Y_n}(t_n) \\
&= \prod_{i=1}^n e^{t_i \mu_i + \frac{1}{2} t_i^2 \sigma^2} \\
&= e^{\sum_{i=1}^n t_i \mu_i + \frac{1}{2} \sum_{i,k=1}^n t_i \Sigma_{i,k} t_k}
= e^{t^T \mu + \frac{1}{2} t^T \Sigma t}
\end{aligned}
$$

$\Longrightarrow\ Y \sim N_n(\mu, \Sigma)$, Multivariate Normal with mean $\mu$ and covariance $\Sigma$.
MGF of $\hat\beta$

For the $p$-variate r.v. $\hat\beta$, and constant $p$-vector $\tau = (\tau_1, \ldots, \tau_p)^T$,

$$
M_{\hat\beta}(\tau) = E(e^{\tau^T \hat\beta}) = E(e^{\tau_1 \hat\beta_1 + \tau_2 \hat\beta_2 + \cdots + \tau_p \hat\beta_p})
$$

and, writing $\hat\beta = A Y$ with $A = (X^T X)^{-1} X^T$,

$$
\begin{aligned}
M_{\hat\beta}(\tau) &= E(e^{\tau^T \hat\beta}) \\
&= E(e^{\tau^T A Y}) \\
&= E(e^{t^T Y}), \ \text{with } t = A^T \tau \\
&= M_Y(t) \\
&= e^{t^T \mu + \frac{1}{2} t^T \Sigma t}
\end{aligned}
$$
MGF of $\hat\beta$ (continued)

Plug in:

$$
t = A^T \tau = X (X^T X)^{-1} \tau, \qquad \mu = X\beta, \qquad \Sigma = \sigma^2 I_n
$$

Gives:

$$
\begin{aligned}
t^T \mu &= \tau^T (X^T X)^{-1} X^T X \beta = \tau^T \beta \\
t^T \Sigma t &= \tau^T (X^T X)^{-1} X^T [\sigma^2 I_n] X (X^T X)^{-1} \tau
= \tau^T [\sigma^2 (X^T X)^{-1}] \tau
\end{aligned}
$$

So the MGF of $\hat\beta$ is

$$
M_{\hat\beta}(\tau) = e^{\tau^T \beta + \frac{1}{2} \tau^T [\sigma^2 (X^T X)^{-1}] \tau}
\quad\Longrightarrow\quad
\hat\beta \sim N_p\!\left(\beta,\ \sigma^2 (X^T X)^{-1}\right)
$$
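The sampling distribution $\hat\beta \sim N_p(\beta, \sigma^2 (X^T X)^{-1})$ can be checked by simulation (synthetic design, sizes, and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, p, sigma = 25, 2, 0.5
X = rng.normal(size=(n, p))
beta = np.array([1.0, 2.0])
XtX_inv = np.linalg.inv(X.T @ X)

# Draw many response vectors from the normal model and re-estimate beta.
reps = 20000
Ys = X @ beta + sigma * rng.normal(size=(reps, n))   # one sample y per row
beta_hats = Ys @ X @ XtX_inv                         # row r is beta_hat for sample r

emp_mean = beta_hats.mean(axis=0)                    # should approach beta
emp_cov = np.cov(beta_hats, rowvar=False)            # should approach theory_cov
theory_cov = sigma**2 * XtX_inv
```

The empirical mean and covariance of the replicated estimates match $\beta$ and $\sigma^2 (X^T X)^{-1}$ up to Monte Carlo error.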
QR Decomposition of X

If $X = QR$ with $Q^T Q = I_p$ and

$$
R = \begin{pmatrix}
r_{1,1} & r_{1,2} & \cdots & r_{1,p-1} & r_{1,p} \\
0 & r_{2,2} & \cdots & r_{2,p-1} & r_{2,p} \\
0 & 0 & \ddots & \vdots & \vdots \\
0 & 0 & \cdots & r_{p-1,p-1} & r_{p-1,p} \\
0 & 0 & \cdots & 0 & r_{p,p}
\end{pmatrix},
$$

then $X_{[1]} = Q_{[1]}\, r_{1,1}$, so

$$
r_{1,1}^2 = X_{[1]}^T X_{[1]}
\quad\text{and}\quad
Q_{[1]} = X_{[1]} / r_{1,1}.
$$
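With $X = QR$, the normal equations $X^T X \beta = X^T y$ reduce to the triangular system $R\beta = Q^T y$; a quick numerical sketch (synthetic data, illustrative sizes):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 20, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

Q, R = np.linalg.qr(X)                        # reduced QR: Q is n x p, R is p x p
beta_qr = np.linalg.solve(R, Q.T @ y)         # back-substitution on R beta = Q^T y
beta_ne = np.linalg.solve(X.T @ X, X.T @ y)   # direct normal equations
```

Both solutions agree; the QR route avoids forming $X^T X$, which is why it is the numerically preferred way to compute $\hat\beta$.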
For each $j$, the standardized coefficient estimate

$$
t_j = \frac{\hat\beta_j - \beta_j}{\hat\sigma \sqrt{C_{j,j}}},
\quad\text{where}\quad
\hat\sigma^2 = \frac{\sum_{i=1}^n \hat\varepsilon_i^2}{n - p}
\quad\text{and}\quad
C_{j,j} = [(X^T X)^{-1}]_{j,j},
$$

follows a $t$ distribution with $(n - p)$ degrees of freedom.
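Computing $\hat\sigma^2$, the standard errors, and the $t$ statistics is direct (synthetic data; seed and parameter values are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, p = 60, 3
X = rng.normal(size=(n, p))
beta_true = np.array([1.0, 0.0, -1.0])
sigma = 0.3
y = X @ beta_true + sigma * rng.normal(size=n)

C = np.linalg.inv(X.T @ X)            # C = (X^T X)^{-1}
beta_hat = C @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (n - p)          # unbiased estimate of sigma^2
se = np.sqrt(s2 * np.diag(C))         # standard error of each beta_hat_j
t_stats = beta_hat / se               # t statistic for H0: beta_j = 0
```

Under $H_0: \beta_j = 0$, each `t_stats[j]` is a draw from the $t_{n-p}$ distribution.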
Proof: Note that (d) follows immediately from (a), (b), (c).

Define

$$
A = \begin{pmatrix} Q^T \\ W^T \end{pmatrix},
$$

where $Q$ comes from the QR decomposition $X = QR$ and $W$ is an $n \times (n-p)$ matrix whose columns are orthonormal and orthogonal to the columns of $Q$, so that $A$ is an orthogonal matrix ($A^T = A^{-1}$).
The distribution of $z = Ay$ is $N_n(\mu_z, \Sigma_z)$, where

$$
\mu_z = A\,X\beta
= \begin{pmatrix} Q^T \\ W^T \end{pmatrix} (QR)\,\beta
= \begin{pmatrix} Q^T Q \, R\beta \\ W^T Q \, R\beta \end{pmatrix}
= \begin{pmatrix} I_p \, R\beta \\ 0_{(n-p)\times p} \, R\beta \end{pmatrix}
= \begin{pmatrix} R\beta \\ 0_{n-p} \end{pmatrix}
$$

and

$$
\Sigma_z = A\,[\sigma^2 I_n]\,A^T = \sigma^2 [A A^T] = \sigma^2 I_n,
\quad\text{since } A^T = A^{-1}.
$$
Thus

$$
z = \begin{pmatrix} z_Q \\ z_W \end{pmatrix}
\sim N_n\!\left( \begin{pmatrix} R\beta \\ 0_{n-p} \end{pmatrix},\ \sigma^2 I_n \right)
$$

so

$$
z_Q \sim N_p[\,(R\beta),\ \sigma^2 I_p\,], \qquad
z_W \sim N_{(n-p)}[\,0_{(n-p)},\ \sigma^2 I_{(n-p)}\,],
$$

and $z_Q$ and $z_W$ are independent.

The Theorem follows by showing

(a*) $\hat\beta = R^{-1} z_Q$ and $\hat\varepsilon = W z_W$ (i.e., $\hat\beta$ and $\hat\varepsilon$ are functions of different independent vectors).

(b*) Deducing the distribution of $\hat\beta = R^{-1} z_Q$ by applying Theorem* with $A = R^{-1}$ and $y = z_Q$.

(c*) $\hat\varepsilon^T \hat\varepsilon = z_W^T z_W$ = sum of $(n-p)$ squared r.v.s which are i.i.d. $N(0, \sigma^2)$, i.e., $\hat\varepsilon^T \hat\varepsilon \sim \sigma^2 \chi^2_{(n-p)}$, a scaled Chi-Squared r.v.
Proof of (a*)

$\hat\beta = R^{-1} z_Q$ follows from $X = QR$ with $Q$: $Q^T Q = I_p$:

$$
\begin{aligned}
\hat\beta &= (X^T X)^{-1} X^T y \\
&= (R^T Q^T Q R)^{-1} R^T Q^T y \\
&= (R^T R)^{-1} R^T Q^T y \\
&= R^{-1} (R^T)^{-1} R^T Q^T y \\
&= R^{-1} Q^T y = R^{-1} z_Q
\end{aligned}
$$

and $\hat\varepsilon = W z_W$ follows from

$$
\begin{aligned}
\hat\varepsilon = y - \hat{y} &= y - X\hat\beta \\
&= y - (QR)(R^{-1} z_Q) \\
&= y - Q z_Q \\
&= y - Q Q^T y = (I_n - Q Q^T) y \\
&= W W^T y \quad (\text{since } I_n = A^T A = Q Q^T + W W^T) \\
&= W z_W
\end{aligned}
$$
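The identities used in this proof can be verified numerically via a complete QR decomposition, in which the extra columns play the role of $W$ (synthetic data; sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(8)
n, p = 12, 3
X = rng.normal(size=(n, p))
y = rng.normal(size=n)

Qfull, Rfull = np.linalg.qr(X, mode="complete")  # Qfull is n x n orthogonal
Q, W = Qfull[:, :p], Qfull[:, p:]                # split into Q (n x p), W (n x (n-p))
R = Rfull[:p, :]                                 # p x p upper-triangular block

beta_hat = np.linalg.solve(R, Q.T @ y)           # beta_hat = R^{-1} z_Q, z_Q = Q^T y
resid = y - X @ beta_hat                         # should equal W W^T y = W z_W
```

Both $I_n = Q Q^T + W W^T$ and $\hat\varepsilon = W W^T y$ hold to machine precision.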
Maximum-Likelihood Estimation

Consider the normal linear regression model:

$$
y = X\beta + \varepsilon, \ \text{where } \{\varepsilon_i\} \text{ are i.i.d. } N(0, \sigma^2),
\ \text{i.e., } \varepsilon \sim N_n(0_n, \sigma^2 I_n),
\ \text{or } y \sim N_n(X\beta, \sigma^2 I_n).
$$

Definitions:

The likelihood function is

$$
L(\beta, \sigma^2) = p(y \mid X, \beta, \sigma^2)
$$

where $p(y \mid X, \beta, \sigma^2)$ is the joint probability density function (pdf) of the conditional distribution of $y$ given data $X$ (known) and parameters $(\beta, \sigma^2)$ (unknown).

The maximum likelihood estimates of $(\beta, \sigma^2)$ are the values maximizing $L(\beta, \sigma^2)$, i.e., those which make the observed data $y$ most likely in terms of its pdf.
Because the $y_i$ are independent r.v.s with $y_i \sim N(\mu_i, \sigma^2)$ where $\mu_i = \sum_{j=1}^p \beta_j x_{i,j}$,

$$
L(\beta, \sigma^2) = \prod_{i=1}^n p(y_i \mid \beta, \sigma^2)
= \prod_{i=1}^n \left[ \frac{1}{\sqrt{2\pi\sigma^2}}\,
e^{-\frac{1}{2\sigma^2}\left(y_i - \sum_{j=1}^p \beta_j x_{i,j}\right)^2} \right]
= \frac{1}{(2\pi\sigma^2)^{n/2}}\,
e^{-\frac{1}{2}(y - X\beta)^T (\sigma^2 I_n)^{-1} (y - X\beta)}
$$

The maximum likelihood estimates $(\hat\beta, \hat\sigma^2)$ maximize the log-likelihood function (dropping constant terms)

$$
\log L(\beta, \sigma^2)
= -\tfrac{n}{2} \log(\sigma^2) - \tfrac{1}{2} (y - X\beta)^T (\sigma^2 I_n)^{-1} (y - X\beta)
= -\tfrac{n}{2} \log(\sigma^2) - \tfrac{1}{2\sigma^2}\, Q(\beta)
$$

where $Q(\beta) = (y - X\beta)^T (y - X\beta)$ (the Least-Squares Criterion!).

The OLS estimate $\hat\beta$ is also the ML-estimate.

The ML estimate $\hat\sigma^2_{ML}$ of $\sigma^2$ solves

$$
\left. \frac{\partial \log L(\hat\beta, \sigma^2)}{\partial (\sigma^2)} \right|_{\sigma^2 = \hat\sigma^2_{ML}} = 0,
\ \text{i.e., } -\tfrac{n}{2}\,\tfrac{1}{\sigma^2} - \tfrac{1}{2}\,(-1)\,(\sigma^2)^{-2}\, Q(\hat\beta) = 0
$$

$$
\Longrightarrow\quad \hat\sigma^2_{ML} = Q(\hat\beta)/n = \left( \sum_{i=1}^n \hat\varepsilon_i^2 \right) / n \quad \text{(biased!)}
$$
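A simulation (illustrative sizes and seed) makes the bias of $\hat\sigma^2_{ML}$ concrete: since $E[Q(\hat\beta)] = (n-p)\sigma^2$, the ML estimate has mean $\sigma^2 (n-p)/n < \sigma^2$.

```python
import numpy as np

rng = np.random.default_rng(6)
n, p, sigma = 15, 2, 1.0
X = rng.normal(size=(n, p))
beta = np.array([0.5, -0.5])

# Residual-maker matrix M = I - H, so resid = M y and Q(beta_hat) = ||M y||^2.
M = np.eye(n) - X @ np.linalg.solve(X.T @ X, X.T)

reps = 30000
Ys = X @ beta + sigma * rng.normal(size=(reps, n))   # one sample y per row
resids = Ys @ M                                       # M is symmetric, so row r is (M y_r)^T
s2_ml = (resids**2).sum(axis=1) / n                   # ML estimate per replication

# Expected mean of the ML estimate: sigma^2 (n - p)/n, below sigma^2.
expected_ml_mean = sigma**2 * (n - p) / n
```

Dividing $Q(\hat\beta)$ by $(n-p)$ instead of $n$ removes this bias, which is exactly the unbiased estimator $\hat\sigma^2$ used earlier.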
Generalized M Estimation

For data $y, X$, fit the linear regression model

$$
y_i = x_i^T \beta + \varepsilon_i, \quad i = 1, 2, \ldots, n,
$$

by specifying the estimate $\hat\beta$ to minimize a criterion of the form

$$
Q(\beta) = \sum_{i=1}^n h(y_i, x_i, \beta)
$$

for a chosen function $h$; taking $h = (y_i - x_i^T \beta)^2$ recovers OLS, while other choices of $h$ downweight outlying cases.
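As a sketch of one common choice of $h$ (the Huber loss is my illustrative pick here, not necessarily the one covered in the lost slides), the M-estimate can be computed by iteratively reweighted least squares (IRLS); all data, the tuning constant, and the helper function below are illustrative:

```python
import numpy as np

def huber_m_estimate(X, y, k=1.345, tol=1e-8, max_iter=100):
    """M-estimate of beta under the Huber loss, via iteratively
    reweighted least squares (IRLS); k is the usual Huber tuning constant."""
    beta = np.linalg.lstsq(X, y, rcond=None)[0]          # OLS starting point
    for _ in range(max_iter):
        r = y - X @ beta
        s = max(np.median(np.abs(r)) / 0.6745, 1e-12)    # robust (MAD) scale
        u = np.abs(r / s)
        w = np.minimum(1.0, k / np.maximum(u, 1e-12))    # Huber weights psi(u)/u
        beta_new = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        if np.max(np.abs(beta_new - beta)) < tol:
            return beta_new
        beta = beta_new
    return beta

# Synthetic illustration: a few gross outliers pull OLS but not the M-estimate.
rng = np.random.default_rng(7)
n = 100
X = np.column_stack([np.ones(n), rng.normal(size=n)])
y = X @ np.array([1.0, 2.0]) + rng.normal(size=n)
y[:5] += 20.0                                            # contaminate 5 cases
beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_rob = huber_m_estimate(X, y)
```

Each IRLS step solves a weighted normal-equations system, with weights that shrink the influence of cases whose scaled residuals exceed $k$; the outliers move the OLS intercept by roughly a full unit while the M-estimate stays near the true coefficients.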
MIT OpenCourseWare
http://ocw.mit.edu
For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.