Chapter Two

The Simple Regression Model

The simple regression model can be used to study the relationship between two variables. For reasons we will see, the simple regression model has limitations as a general tool for empirical analysis. Nevertheless, it is sometimes appropriate as an empirical tool. Learning how to interpret the simple regression model is good practice for studying multiple regression, which we will do in subsequent chapters.
When related by (2.1), the variables $y$ and $x$ have several different names used interchangeably, as follows. $y$ is called the dependent variable, the explained variable, the response variable, the predicted variable, or the regressand. $x$ is called the independent variable, the explanatory variable, the control variable, the predictor variable, or the regressor. (The term covariate is also used for $x$.) The terms "dependent variable" and "independent variable" are frequently used in econometrics. But be aware that the label "independent" here does not refer to the statistical notion of independence between random variables (see Appendix B).

The terms "explained" and "explanatory" variables are probably the most descriptive. "Response" and "control" are used mostly in the experimental sciences, where the variable $x$ is under the experimenter's control. We will not use the terms "predicted variable" and "predictor," although you sometimes see these. Our terminology for simple regression is summarized in Table 2.1.
Table 2.1
Terminology for Simple Regression

y                      x
Dependent Variable     Independent Variable
Explained Variable     Explanatory Variable
Predicted Variable     Predictor Variable
Regressand             Regressor
The variable $u$, called the error term or disturbance in the relationship, represents factors other than $x$ that affect $y$. A simple regression analysis effectively treats all factors affecting $y$ other than $x$ as being unobserved. You can usefully think of $u$ as standing for "unobserved."

Equation (2.1) also addresses the issue of the functional relationship between $y$ and $x$. If the other factors in $u$ are held fixed, so that the change in $u$ is zero, $\Delta u = 0$, then $x$ has a linear effect on $y$:

$$\Delta y = \beta_1 \Delta x \quad \text{if } \Delta u = 0. \tag{2.2}$$

Thus, the change in $y$ is simply $\beta_1$ multiplied by the change in $x$. This means that $\beta_1$ is the slope parameter in the relationship between $y$ and $x$ holding the other factors in $u$ fixed; it is of primary interest in applied economics. The intercept parameter $\beta_0$ also has its uses, although it is rarely central to an analysis.
Part 1: Regression Analysis with Cross-Sectional Data
EXAMPLE 2.1
(Soybean Yield and Fertilizer)

Suppose that soybean yield is determined by the model

$$\mathit{yield} = \beta_0 + \beta_1 \mathit{fertilizer} + u. \tag{2.3}$$

EXAMPLE 2.2
(A Simple Wage Equation)

A model relating a person's wage to observed education and other unobserved factors is

$$\mathit{wage} = \beta_0 + \beta_1 \mathit{educ} + u. \tag{2.4}$$
The linearity of (2.1) implies that a one-unit change in $x$ has the same effect on $y$, regardless of the initial value of $x$. This is unrealistic for many economic applications. For example, in the wage-education example, we might want to allow for increasing returns: the next year of education has a larger effect on wages than did the previous year. We will see how to allow for such possibilities in Section 2.4.

The most difficult issue to address is whether model (2.1) really allows us to draw ceteris paribus conclusions about how $x$ affects $y$. We just saw in equation (2.2) that $\beta_1$ does measure the effect of $x$ on $y$, holding all other factors (in $u$) fixed. Is this the end of the causality issue? Unfortunately, no. How can we hope to learn in general about the ceteris paribus effect of $x$ on $y$, holding other factors fixed, when we are ignoring all those other factors?

As we will see in Section 2.5, we are only able to get reliable estimators of $\beta_0$ and $\beta_1$ from a random sample of data when we make an assumption restricting how the unobservable $u$ is related to the explanatory variable $x$. Without such a restriction, we will not be able to estimate the ceteris paribus effect, $\beta_1$. Because $u$ and $x$ are random variables, we need a concept grounded in probability.

Before we state the key assumption about how $x$ and $u$ are related, there is one assumption about $u$ that we can always make. As long as the intercept $\beta_0$ is included in the equation, nothing is lost by assuming that the average value of $u$ in the population is zero.
Mathematically,

$$E(u) = 0. \tag{2.5}$$
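To see why this normalization costs nothing, a short worked step may help. It is only a sketch: $\alpha_0$ is a hypothetical symbol (not from the text) for a possibly nonzero population mean of $u$, and model (2.1) is taken to be $y = \beta_0 + \beta_1 x + u$.

```latex
% Assumption for illustration only: E(u) = \alpha_0, possibly nonzero.
\begin{align*}
y &= \beta_0 + \beta_1 x + u \\
  &= (\beta_0 + \alpha_0) + \beta_1 x + (u - \alpha_0),
  \qquad E(u - \alpha_0) = 0 .
\end{align*}
```

The redefined error has mean zero and the slope $\beta_1$ is unchanged; only the intercept absorbs $\alpha_0$.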
...same for all education levels. But this is an issue that we must address before applying simple regression analysis.

In the fertilizer example, if fertilizer amounts are chosen independently of other features of the plots, then (2.6) will hold: the average land quality will not depend on the amount of fertilizer. However, if more fertilizer is put on the higher quality plots of land, then the expected value of $u$ changes with the level of fertilizer, and (2.6) fails.

QUESTION 2.1
Suppose that a score on a final exam, score, depends on classes attended (attend) and unobserved factors that affect exam performance (such as student ability):

$$\mathit{score} = \beta_0 + \beta_1 \mathit{attend} + u. \tag{2.7}$$

When would you expect this model to satisfy (2.6)?

Assumption (2.6) gives $\beta_1$ another interpretation that is often useful. Taking the expected value of (2.1) conditional on $x$ and using $E(u|x) = 0$ gives

$$E(y|x) = \beta_0 + \beta_1 x. \tag{2.8}$$

[Figure 2.1: $E(y|x)$ as a linear function of $x$.]
...the expected value of $y$ by the amount $\beta_1$. For any given value of $x$, the distribution of $y$ is centered about $E(y|x)$, as illustrated in Figure 2.1.

When (2.6) is true, it is useful to break $y$ into two components. The piece $\beta_0 + \beta_1 x$ is sometimes called the systematic part of $y$ (that is, the part of $y$ explained by $x$), and $u$ is called the unsystematic part, or the part of $y$ not explained by $x$. We will use assumption (2.6) in the next section for motivating estimates of $\beta_0$ and $\beta_1$. This assumption is also crucial for the statistical analysis in Section 2.5.
2.2 DERIVING THE ORDINARY LEAST SQUARES ESTIMATES

Now that we have discussed the basic ingredients of the simple regression model, we will address the important issue of how to estimate the parameters $\beta_0$ and $\beta_1$ in equation (2.1). To do this, we need a sample from the population. Let $\{(x_i, y_i): i = 1, \ldots, n\}$ denote a random sample of size $n$ from the population. Since these data come from (2.1), we can write

$$y_i = \beta_0 + \beta_1 x_i + u_i \tag{2.9}$$

for each $i$. Here, $u_i$ is the error term for observation $i$ since it contains all factors affecting $y_i$ other than $x_i$.

As an example, $x_i$ might be the annual income and $y_i$ the annual savings for family $i$ during a particular year. If we have collected data on 15 families, then $n = 15$. A scatter plot of such a data set is given in Figure 2.2, along with the (necessarily fictitious) population regression function.

We must decide how to use these data to obtain estimates of the intercept and slope in the population regression of savings on income.
There are several ways to motivate the following estimation procedure. We will use (2.5) and an important implication of assumption (2.6): in the population, $u$ has a zero mean and is uncorrelated with $x$. Therefore, we see that $u$ has zero expected value and that the covariance between $x$ and $u$ is zero:

$$E(u) = 0 \tag{2.10}$$

$$\mathrm{Cov}(x,u) = E(xu) = 0, \tag{2.11}$$

where the first equality in (2.11) follows from (2.10). (See Section B.4 for the definition and properties of covariance.) In terms of the observable variables $x$ and $y$ and the unknown parameters $\beta_0$ and $\beta_1$, equations (2.10) and (2.11) can be written as

$$E(y - \beta_0 - \beta_1 x) = 0 \tag{2.12}$$

and

$$E[x(y - \beta_0 - \beta_1 x)] = 0, \tag{2.13}$$

respectively. Equations (2.12) and (2.13) imply two restrictions on the joint probability distribution of $(x,y)$ in the population. Since there are two unknown parameters to estimate, we might hope that equations (2.12) and (2.13) can be used to obtain good estimators of $\beta_0$ and $\beta_1$.
[Figure 2.2: scatter plot of savings against income, with the population regression function $E(\mathit{savings}|\mathit{income}) = \beta_0 + \beta_1 \mathit{income}$.]
This is an example of the method of moments approach to estimation. (See Section C.4 for a discussion of different estimation approaches.) These equations can be solved for $\hat{\beta}_0$ and $\hat{\beta}_1$.
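The sample counterparts of (2.12) and (2.13), which the surrounding text cites as (2.14) and (2.15), can be sketched as follows. This reconstruction assumes the usual method of moments step of replacing population expectations with sample averages; $\hat{\beta}_0$ and $\hat{\beta}_1$ denote the estimates.

```latex
% Sketch: sample analogues of the population moment conditions (2.12) and (2.13).
\begin{align}
n^{-1}\sum_{i=1}^{n} \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) &= 0, \tag{2.14}\\
n^{-1}\sum_{i=1}^{n} x_i \bigl(y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i\bigr) &= 0. \tag{2.15}
\end{align}
```

These are two equations in the two unknowns $\hat{\beta}_0$ and $\hat{\beta}_1$, which is why a solution can be found whenever the $x_i$ are not all equal.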
Using the basic properties of the summation operator from Appendix A, equation (2.14) can be rewritten as

$$\bar{y} = \hat{\beta}_0 + \hat{\beta}_1 \bar{x}, \tag{2.16}$$

so that

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}. \tag{2.17}$$

Dropping the $n^{-1}$ in (2.15) (since it does not affect the solution) and plugging (2.17) into (2.15) yields

$$\sum_{i=1}^{n} x_i \bigl(y_i - (\bar{y} - \hat{\beta}_1 \bar{x}) - \hat{\beta}_1 x_i\bigr) = 0,$$

which, upon rearrangement, gives

$$\sum_{i=1}^{n} x_i (y_i - \bar{y}) = \hat{\beta}_1 \sum_{i=1}^{n} x_i (x_i - \bar{x}).$$

From basic properties of the summation operator [see (A.7) and (A.8)],

$$\sum_{i=1}^{n} x_i (x_i - \bar{x}) = \sum_{i=1}^{n} (x_i - \bar{x})^2 \quad \text{and} \quad \sum_{i=1}^{n} x_i (y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}).$$

Therefore, provided that

$$\sum_{i=1}^{n} (x_i - \bar{x})^2 > 0, \tag{2.18}$$

the estimated slope is

$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}. \tag{2.19}$$
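The closed-form estimates can be computed directly from the formulas. The following is a minimal sketch in plain Python; the function name `ols_simple` and the four data points are illustrative assumptions, not something taken from the text.

```python
def ols_simple(x, y):
    """Return (b0, b1) using the closed-form formulas (2.17) and (2.19)."""
    n = len(x)
    if n == 0 or n != len(y):
        raise ValueError("x and y must be non-empty and of equal length")
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    # Denominator of (2.19); condition (2.18) requires it to be positive.
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("no sample variation in x; condition (2.18) fails")
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = sxy / sxx            # slope, equation (2.19)
    b0 = y_bar - b1 * x_bar   # intercept, equation (2.17)
    return b0, b1


# Hypothetical data lying exactly on the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
b0, b1 = ols_simple(x, y)
```

Raising an error when every $x_i$ is identical mirrors condition (2.18): without variation in $x$ the slope formula divides by zero.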
[Figure 2.3: A scatterplot of wage against education when $\mathit{educ}_i = 12$ for all $i$.]
The estimates given in (2.17) and (2.19) are called the ordinary least squares (OLS) estimates of $\beta_0$ and $\beta_1$. To justify this name, for any $\hat{\beta}_0$ and $\hat{\beta}_1$, define a fitted value for $y$ when $x = x_i$ such as

$$\hat{y}_i = \hat{\beta}_0 + \hat{\beta}_1 x_i, \tag{2.20}$$

for the given intercept and slope. This is the value we predict for $y$ when $x = x_i$. There is a fitted value for each observation in the sample. The residual for observation $i$ is the difference between the actual $y_i$ and its fitted value:

$$\hat{u}_i = y_i - \hat{y}_i = y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i. \tag{2.21}$$

Suppose we choose $\hat{\beta}_0$ and $\hat{\beta}_1$ to make the sum of squared residuals,

$$\sum_{i=1}^{n} \hat{u}_i^2 = \sum_{i=1}^{n} (y_i - \hat{\beta}_0 - \hat{\beta}_1 x_i)^2, \tag{2.22}$$

as small as possible. The appendix to this chapter shows that the conditions necessary for $(\hat{\beta}_0, \hat{\beta}_1)$ to minimize (2.22) are given exactly by equations (2.14) and (2.15), without $n^{-1}$. Equations (2.14) and (2.15) are often called the first order conditions for the OLS estimates, a term that comes from optimization using calculus (see Appendix A). From our previous calculations, we know that the solutions to the OLS first order conditions are given by (2.17) and (2.19). The name "ordinary least squares" comes from the fact that these estimates minimize the sum of squared residuals.

[Figure 2.4: the fitted line $\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x$ and a residual $\hat{u}_i$.]

Once we have determined the OLS intercept and slope estimates, we form the OLS regression line:

$$\hat{y} = \hat{\beta}_0 + \hat{\beta}_1 x, \tag{2.23}$$
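The least squares property can be checked numerically: moving the intercept or slope away from the OLS values should only increase the sum of squared residuals. The sketch below uses made-up data and hypothetical helper names (`ols_simple`, `ssr`); it illustrates the claim and is not a proof.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


def ssr(b0, b1, x, y):
    """Sum of squared residuals for a candidate intercept and slope."""
    return sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))


# Hypothetical sample; any data with variation in x would do.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]

b0, b1 = ols_simple(x, y)
best = ssr(b0, b1, x, y)

# Evaluate the criterion at eight nearby candidate lines.
steps = (-0.1, 0.0, 0.1)
nearby = [
    ssr(b0 + d0, b1 + d1, x, y)
    for d0 in steps
    for d1 in steps
    if (d0, d1) != (0.0, 0.0)
]
```

Every entry of `nearby` exceeds `best`, which is what minimizing the sum of squared residuals means in practice.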
where it is understood that $\hat{\beta}_0$ and $\hat{\beta}_1$ have been obtained using equations (2.17) and (2.19). The notation $\hat{y}$, read as "y hat," emphasizes that the predicted values from equation (2.23) are estimates. The intercept, $\hat{\beta}_0$, is the predicted value of $y$ when $x = 0$, although in some cases it will not make sense to set $x = 0$. In those situations, $\hat{\beta}_0$ is not, in itself, very interesting. When using (2.23) to compute predicted values of $y$ for various values of $x$, we must account for the intercept in the calculations. Equation (2.23) is also called the sample regression function (SRF) because it is the estimated version of the population regression function $E(y|x) = \beta_0 + \beta_1 x$. It is important to remember that the PRF is something fixed, but unknown, in the population. Since the SRF is obtained for a given sample of data, a new sample will generate a different slope and intercept in equation (2.23).

In most cases the slope estimate, which we can write as

$$\hat{\beta}_1 = \Delta\hat{y}/\Delta x, \tag{2.24}$$

is of primary interest. It tells us the amount by which $\hat{y}$ changes when $x$ increases by one unit. Equivalently,

$$\Delta\hat{y} = \hat{\beta}_1 \Delta x. \tag{2.25}$$
EXAMPLE 2.3
(CEO Salary and Return on Equity)

$$\widehat{\mathit{salary}} = 963.191 + 18.501\,\mathit{roe}, \tag{2.26}$$

where the intercept and slope estimates have been rounded to three decimal places; we use "salary hat" to indicate that this is an estimated equation. How do we interpret the equation? First, if the return on equity is zero, $\mathit{roe} = 0$, then the predicted salary is the intercept, 963.191, which equals $963,191 since salary is measured in thousands. Next, we can write the predicted change in salary as a function of the change in roe: $\Delta\widehat{\mathit{salary}} = 18.501\,(\Delta \mathit{roe})$. This means that if the return on equity increases by one percentage point, $\Delta \mathit{roe} = 1$, then salary is predicted to change by about 18.5, or $18,500. Because (2.26) is a linear equation, this is the estimated change regardless of the initial salary.

We can easily use (2.26) to compare predicted salaries at different values of roe. Suppose $\mathit{roe} = 30$. Then $\widehat{\mathit{salary}} = 963.191 + 18.501(30) = 1518.221$, which is just over $1.5 million. However, this does not mean that a particular CEO whose firm had an $\mathit{roe} = 30$ earns $1,518,221. There are many other factors that affect salary. This is just our prediction from the OLS regression line (2.26). The estimated line is graphed in Figure 2.5, along with the population regression function $E(\mathit{salary}|\mathit{roe})$. We will never know the PRF, so we cannot tell how close the SRF is to the PRF. Another sample of data will give a different regression line, which may or may not be closer to the population regression line.
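The arithmetic in this example is easy to reproduce. The sketch below hard-codes the rounded estimates reported in the example (intercept 963.191, slope 18.501, salary in thousands of dollars); the function name `predicted_salary` is an illustrative choice.

```python
# Rounded OLS estimates as reported in the example (salary in $1000s).
INTERCEPT = 963.191
SLOPE = 18.501


def predicted_salary(roe):
    """Predicted CEO salary, in thousands of dollars, at a given roe (percent)."""
    return INTERCEPT + SLOPE * roe


pred_at_zero = predicted_salary(0)    # the intercept
pred_at_30 = predicted_salary(30)     # 963.191 + 18.501 * 30
# A one percentage point increase in roe changes the prediction by the slope,
# no matter where we start, because the fitted line is linear.
change_low = predicted_salary(11) - predicted_salary(10)
change_high = predicted_salary(31) - predicted_salary(30)
```

These are predictions from the fitted line only; as the example stresses, an individual CEO's salary also depends on factors left in the error term.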
[Figure 2.5: The OLS regression line $\widehat{\mathit{salary}} = 963.191 + 18.50\,\mathit{roe}$ and the (unknown) population regression function.]
EXAMPLE 2.4
(Wage and Education)

For the population of people in the work force in 1976, let $y = \mathit{wage}$, where wage is measured in dollars per hour. Thus, for a particular person, if $\mathit{wage} = 6.75$, the hourly wage is $6.75. Let $x = \mathit{educ}$ denote years of schooling; for example, $\mathit{educ} = 12$ corresponds to a complete high school education. Since the average wage in the sample is $5.90, the consumer price index indicates that this amount is equivalent to $16.64 in 1997 dollars.

Using the data in WAGE1.RAW where $n = 526$ individuals, we obtain the following OLS regression line (or sample regression function):

$$\widehat{\mathit{wage}} = -0.90 + 0.54\,\mathit{educ}. \tag{2.27}$$
EXAMPLE 2.5
(Voting Outcomes and Campaign Expenditures)
In some cases, regression analysis is not used to determine causality but to simply look at whether two variables are positively or negatively related, much like a standard correlation analysis. An example of this occurs in Problem 2.12, where you are asked to use data from Biddle and Hamermesh (1990) on time spent sleeping and working to investigate the tradeoff between these two factors.

QUESTION 2.3
In Example 2.5, what is the predicted vote for Candidate A if shareA = 60 (which means 60 percent)? Does this answer seem reasonable?
A Note on Terminology

In most cases, we will indicate the estimation of a relationship through OLS by writing an equation such as (2.26), (2.27), or (2.28). Sometimes, for the sake of brevity, it is useful to indicate that an OLS regression has been run without actually writing out the equation. We will often indicate that equation (2.23) has been obtained by OLS in saying that we run the regression of

$$y \text{ on } x, \tag{2.29}$$

or simply that we regress $y$ on $x$. The positions of $y$ and $x$ in (2.29) indicate which is the dependent variable and which is the independent variable: we always regress the dependent variable on the independent variable. For specific applications, we replace $y$ and $x$ with their names. Thus, to obtain (2.26), we regress salary on roe, or to obtain (2.28), we regress voteA on shareA.

When we use such terminology in (2.29), we will always mean that we plan to estimate the intercept, $\hat{\beta}_0$, along with the slope, $\hat{\beta}_1$. This case is appropriate for the vast majority of applications. Occasionally, we may want to estimate the relationship between $y$ and $x$ assuming that the intercept is zero (so that $x = 0$ implies that $\hat{y} = 0$); we cover this case briefly in Section 2.6. Unless explicitly stated otherwise, we always estimate an intercept along with a slope.
2.3 MECHANICS OF OLS

In this section, we cover some algebraic properties of the fitted OLS regression line. Perhaps the best way to think about these properties is to realize that they are features of OLS for a particular sample of data. They can be contrasted with the statistical properties of OLS, which requires deriving features of the sampling distributions of the estimators. We will discuss statistical properties in Section 2.5.

Several of the algebraic properties we are going to derive will appear mundane.
EXAMPLE 2.6
(CEO Salary and Return on Equity)

Table 2.2 contains a listing of the first 15 observations in the CEO data set, along with the fitted values, called salaryhat, and the residuals, called uhat.

Table 2.2
Fitted Values and Residuals for the First 15 CEOs

obs    roe     salary    salaryhat    uhat
...
4      5.9     578       1072.348     -494.3484
...
8      16.3    1094      1264.761     -170.7606
9      10.5    1237      1157.454     79.54626
...
12     26.8    933       1459.023     -526.0231
...
(continued)
(1) The sum, and therefore the sample average, of the OLS residuals is zero:

$$\sum_{i=1}^{n} \hat{u}_i = 0. \tag{2.30}$$

(2) The sample covariance between the regressors and the OLS residuals is zero. In terms of the residuals,

$$\sum_{i=1}^{n} x_i \hat{u}_i = 0. \tag{2.31}$$

The sample average of the OLS residuals is zero, so the left hand side of (2.31) is proportional to the sample covariance between $x_i$ and $\hat{u}_i$.

(3) The point $(\bar{x}, \bar{y})$ is always on the OLS regression line. In other words, if we take equation (2.23) and plug in $\bar{x}$ for $x$, then the predicted value is $\bar{y}$. This is exactly what equation (2.16) shows us.
EXAMPLE 2.7
(Wage and Education)

For the data in WAGE1.RAW, the average hourly wage in the sample is 5.90, rounded to two decimal places, and the average education is 12.56. If we plug $\mathit{educ} = 12.56$ into the OLS regression line (2.27), we get $\widehat{\mathit{wage}} = -0.90 + 0.54(12.56) = 5.8824$, which equals 5.9 when rounded to the first decimal place. The reason these figures do not exactly agree is that we have rounded the average wage and education, as well as the intercept and slope estimates. If we did not initially round any of the values, we would get the answers to agree more closely, but this practice has little useful effect.
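The three algebraic properties of OLS residuals and the point of means can be verified on any sample. The sketch below uses made-up data and a hypothetical helper `ols_simple`; the last lines redo the arithmetic of the wage example with the rounded figures quoted there.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical sample (not from the text).
x = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]
y = [3.2, 4.1, 5.5, 4.8, 6.9, 8.4]

b0, b1 = ols_simple(x, y)
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

sum_resid = sum(resid)                                   # property (1)
sum_x_resid = sum(xi * ui for xi, ui in zip(x, resid))   # property (2)
x_bar, y_bar = sum(x) / len(x), sum(y) / len(y)
line_at_x_bar = b0 + b1 * x_bar                          # property (3)

# Arithmetic from the wage example, using the rounded estimates and mean.
wage_at_mean_educ = -0.90 + 0.54 * 12.56
```

Up to floating point error, `sum_resid` and `sum_x_resid` are zero and `line_at_x_bar` equals `y_bar`; the wage calculation gives 5.8824, close to but not exactly the rounded sample mean of 5.90.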
Define the total sum of squares (SST), the explained sum of squares (SSE), and the residual sum of squares (SSR) as follows:

$$SST = \sum_{i=1}^{n} (y_i - \bar{y})^2 \tag{2.33}$$

$$SSE = \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \tag{2.34}$$

$$SSR = \sum_{i=1}^{n} \hat{u}_i^2. \tag{2.35}$$

SST is a measure of the total sample variation in the $y_i$; that is, it measures how spread out the $y_i$ are in the sample. If we divide SST by $n - 1$, we obtain the sample variance of $y$, as discussed in Appendix C. Similarly, SSE measures the sample variation in the $\hat{y}_i$ (where we use the fact that the sample average of the $\hat{y}_i$ equals $\bar{y}$), and SSR measures the sample variation in the $\hat{u}_i$. The total variation in $y$ can always be expressed as the sum of the explained variation and the unexplained variation SSR. Thus,

$$SST = SSE + SSR. \tag{2.36}$$

Proving (2.36) is not difficult, but it requires us to use all of the properties of the summation operator covered in Appendix A. Write

$$
\begin{aligned}
\sum_{i=1}^{n} (y_i - \bar{y})^2 &= \sum_{i=1}^{n} [(y_i - \hat{y}_i) + (\hat{y}_i - \bar{y})]^2 \\
&= \sum_{i=1}^{n} [\hat{u}_i + (\hat{y}_i - \bar{y})]^2 \\
&= \sum_{i=1}^{n} \hat{u}_i^2 + 2\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) + \sum_{i=1}^{n} (\hat{y}_i - \bar{y})^2 \\
&= SSR + 2\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) + SSE.
\end{aligned}
$$

Now (2.36) holds if we show that

$$\sum_{i=1}^{n} \hat{u}_i(\hat{y}_i - \bar{y}) = 0. \tag{2.37}$$

But we have already claimed that the sample covariance between the residuals and the fitted values is zero, and this covariance is just (2.37) divided by $n - 1$. Thus, we have established (2.36).
Goodness-of-Fit

So far, we have no way of measuring how well the explanatory or independent variable, $x$, explains the dependent variable, $y$. It is often useful to compute a number that summarizes how well the OLS regression line fits the data. In the following discussion, be sure to remember that we assume that an intercept is estimated along with the slope.

Assuming that the total sum of squares, SST, is not equal to zero (which is true except in the very unlikely event that all the $y_i$ equal the same value), we can divide (2.36) by SST to get $1 = SSE/SST + SSR/SST$. The R-squared of the regression, sometimes called the coefficient of determination, is defined as

$$R^2 = SSE/SST = 1 - SSR/SST. \tag{2.38}$$
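The decomposition of the total sum of squares and the two equivalent expressions for R-squared can be confirmed numerically. This is a minimal sketch on made-up data, with a hypothetical helper `ols_simple`; it assumes an intercept is estimated, as the text requires.

```python
def ols_simple(x, y):
    """Closed-form OLS intercept and slope for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


# Hypothetical sample (not from the text).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.3, 9.7]

b0, b1 = ols_simple(x, y)
y_bar = sum(y) / len(y)
fitted = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_bar) ** 2 for yi in y)               # total variation
sse = sum((fi - y_bar) ** 2 for fi in fitted)          # explained variation
ssr = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted)) # residual variation

r2_from_sse = sse / sst
r2_from_ssr = 1 - ssr / sst
```

Both expressions give the same number because SST = SSE + SSR; the equality relies on the intercept being part of the regression.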
EXAMPLE 2.8
(CEO Salary and Return on Equity)

In the CEO salary regression, we obtain the following:

$$\widehat{\mathit{salary}} = 963.191 + 18.501\,\mathit{roe} \tag{2.39}$$

$$n = 209, \quad R^2 = 0.0132.$$

We have reproduced the OLS regression line and the number of observations for clarity. Using the R-squared (rounded to four decimal places) reported for this equation, we can see how much of the variation in salary is actually explained by the return on equity. The answer is: not much. The firm's return on equity explains only about 1.3% of the variation in salaries for this sample of 209 CEOs. That means that 98.7% of the salary variations for these CEOs is left unexplained! This lack of explanatory power may not be too surprising since there are many other characteristics of both the firm and the individual CEO that should influence salary; these factors are necessarily included in the errors in a simple regression analysis.

In the social sciences, low R-squareds in regression equations are not uncommon, especially for cross-sectional analysis. We will discuss this issue more generally under multiple regression analysis, but it is worth emphasizing now that a seemingly low R-squared does not necessarily mean that an OLS regression equation is useless. It is still possible that (2.39) is a good estimate of the ceteris paribus relationship between salary and roe; whether or not this is true does not depend directly on the size of R-squared. Students who are first learning econometrics tend to put too much weight on the size of the R-squared in evaluating regression equations. For now, be aware that using R-squared as the main gauge of success for an econometric analysis can lead to trouble.
EXAMPLE 2.9
(Voting Outcomes and Campaign Expenditures)

In the voting outcome equation (2.28), $R^2 = 0.505$. Thus, the share of campaign expenditures explains just over 50 percent of the variation in the election outcomes for this sample. This is a fairly sizable portion.
2.4 UNITS OF MEASUREMENT AND FUNCTIONAL FORM

Two important issues in applied economics are (1) understanding how changing the units of measurement of the dependent and/or independent variables affects OLS estimates and (2) knowing how to incorporate popular functional forms used in economics into regression analysis. The mathematics needed for a full understanding of functional form issues is reviewed in Appendix A.

The Effects of Changing Units of Measurement on OLS Statistics
In Example 2.3, we chose to measure annual salary in thousands of dollars, and the return on equity was measured as a percent (rather than as a decimal). It is crucial to know how salary and roe are measured in this example in order to make sense of the estimates in equation (2.39).

We must also know that OLS estimates change in entirely expected ways when the units of measurement of the dependent and independent variables change. In Example 2.3, suppose that, rather than measuring salary in thousands of dollars, we measure it in dollars. Let salardol be salary in dollars (salardol = 845,761 would be interpreted as $845,761). Of course, salardol has a simple relationship to the salary measured in thousands of dollars: $\mathit{salardol} = 1{,}000 \cdot \mathit{salary}$. We do not need to actually run the regression of salardol on roe to know that the estimated equation is:

$$\widehat{\mathit{salardol}} = 963{,}191 + 18{,}501\,\mathit{roe}. \tag{2.40}$$

We obtain the intercept and slope in (2.40) simply by multiplying the intercept and the slope in (2.39) by 1,000. This gives equations (2.39) and (2.40) the same interpretation. Looking at (2.40), if $\mathit{roe} = 0$, then $\widehat{\mathit{salardol}} = 963{,}191$, so the predicted salary is $963,191 [the same value we obtained from equation (2.39)]. Furthermore, if roe increases by one, then the predicted salary increases by $18,501; again, this is what we concluded from our earlier analysis of equation (2.39).

Generally, it is easy to figure out what happens to the intercept and slope estimates when the dependent variable changes units of measurement. If the dependent variable is multiplied by the constant $c$ (which means each value in the sample is multiplied by $c$), then the OLS intercept and slope estimates are also multiplied by $c$. (This assumes nothing has changed about the independent variable.) In the CEO salary example, $c = 1{,}000$ in moving from salary to salardol.
QUESTION 2.4
Suppose that salary is measured in hundreds of dollars, rather than in thousands of dollars, say salarhun. What will be the OLS intercept and slope estimates in the regression of salarhun on roe?

We can also use the CEO salary example to see what happens when we change the units of measurement of the independent variable. Define $\mathit{roedec} = \mathit{roe}/100$ to be the decimal equivalent of roe; thus, $\mathit{roedec} = 0.23$ means a return on equity of 23 percent. To focus on changing the units of measurement of the independent variable, we return to our original dependent variable, salary, which is measured in thousands of dollars. When we regress salary on roedec, we obtain

$$\widehat{\mathit{salary}} = 963.191 + 1850.1\,\mathit{roedec}. \tag{2.41}$$

The coefficient on roedec is 100 times the coefficient on roe in (2.39). This is as it should be. Changing roe by one percentage point is equivalent to $\Delta \mathit{roedec} = 0.01$. From (2.41), if $\Delta \mathit{roedec} = 0.01$, then $\Delta\widehat{\mathit{salary}} = 1850.1(0.01) = 18.501$, which is what is obtained by using (2.39). Note that, in moving from (2.39) to (2.41), the independent variable was divided by 100, and so the OLS slope estimate was multiplied by 100, preserving the interpretation of the equation. Generally, if the independent variable is divided or multiplied by some nonzero constant, $c$, then the OLS slope coefficient is also multiplied or divided by $c$ respectively.

The intercept has not changed in (2.41) because $\mathit{roedec} = 0$ still corresponds to a zero return on equity. In general, changing the units of measurement of only the independent variable does not affect the intercept.
In the previous section, we defined R-squared as a goodness-of-fit measure for OLS regression. We can also ask what happens to $R^2$ when the unit of measurement of either the independent or the dependent variable changes. Without doing any algebra, we should know the result: the goodness-of-fit of the model should not depend on the units of measurement of our variables. For example, the amount of variation in salary, explained by the return on equity, should not depend on whether salary is measured in dollars or in thousands of dollars or on whether return on equity is a percent or a decimal. This intuition can be verified mathematically: using the definition of $R^2$, it can be shown that $R^2$ is, in fact, invariant to changes in the units of $y$ or $x$.
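The unit-change rules for the intercept, the slope, and R-squared can be checked on a small sample. The sketch below uses made-up roe and salary values (not the CEO data set) and a hypothetical helper `fit` that returns the intercept, slope, and R-squared.

```python
import math


def fit(x, y):
    """Return (intercept, slope, R-squared) from a simple OLS regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b0 = y_bar - b1 * x_bar
    sst = sum((yi - y_bar) ** 2 for yi in y)
    ssr = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return b0, b1, 1 - ssr / sst


# Hypothetical values: roe in percent, salary in thousands of dollars.
roe = [5.0, 10.0, 15.0, 20.0, 25.0]
salary = [900.0, 1300.0, 1100.0, 1500.0, 1400.0]

base = fit(roe, salary)
# Dependent variable multiplied by c = 1000 (thousands of dollars -> dollars).
in_dollars = fit(roe, [1000 * s for s in salary])
# Independent variable divided by 100 (percent -> decimal).
in_decimal = fit([r / 100 for r in roe], salary)
```

Comparing the three results shows the pattern described above: rescaling $y$ rescales both estimates, rescaling $x$ changes only the slope (in the opposite direction), and R-squared is the same in all three regressions.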
...regression equations where the dependent variable appears in logarithmic form. Why is this done? Recall the wage-education example, where we regressed hourly wage on years of education. We obtained a slope estimate of 0.54 [see equation (2.27)], which means that each additional year of education is predicted to increase hourly wage by 54 cents. Because of the linear nature of (2.27), 54 cents is the increase for either the first year of education or the twentieth year; this may not be reasonable.

Suppose, instead, that the percentage increase in wage is the same given one more year of education. Model (2.27) does not imply a constant percentage increase: the percentage increases depend on the initial wage. A model that gives (approximately) a constant percentage effect is

$$\log(\mathit{wage}) = \beta_0 + \beta_1 \mathit{educ} + u, \tag{2.42}$$

where $\log(\cdot)$ denotes the natural logarithm. In particular, if $\Delta u = 0$, then

$$\%\Delta \mathit{wage} \approx (100 \cdot \beta_1)\Delta \mathit{educ}. \tag{2.43}$$

Notice how we multiply $\beta_1$ by 100 to get the percentage change in wage given one additional year of education. Since the percentage change in wage is the same for each additional year of education, the change in wage for an extra year of education increases as education increases; in other words, (2.42) implies an increasing return to education. By exponentiating (2.42), we can write $\mathit{wage} = \exp(\beta_0 + \beta_1 \mathit{educ} + u)$. This equation is graphed in Figure 2.6, with $u = 0$.
[Figure 2.6: $\mathit{wage} = \exp(\beta_0 + \beta_1 \mathit{educ})$, with $\beta_1 > 0$.]
Estimating a model such as (2.42) is straightforward when using simple regression. Just define the dependent variable, $y$, to be $y = \log(\mathit{wage})$. The independent variable is represented by $x = \mathit{educ}$. The mechanics of OLS are the same as before: the intercept and slope estimates are given by the formulas (2.17) and (2.19). In other words, we obtain $\hat{\beta}_0$ and $\hat{\beta}_1$ from the OLS regression of $\log(\mathit{wage})$ on educ.
ilJ; A-;
ffix&fiftff}Ftuffi & $$
(A Lo9 Wagc tquation)
l o g ( w a q ea)s t h ed e p e n d e n t v a r i a bwi ee,
U s i n gt h e s a m ed a t aa s i n E x a m p l2e. 4 , b u t u s i n g
obtainthe followingrelationshrp:
ir;1, - o'584+ o'083edrc
log(wage) li.++l
iir,i:]
n:526,R2:0.186'
interpretation whenit is multiplied by 100:wage
Thecoefficient on educhasa percentage
yearof education. This is what economists
increases by g.3 percent for everyadJitional
yearof education"'
m.un*f,.n theyreferto the "returnto another
ltisimportanttorememberthatthemainreasonforusingthelogotwagein(2.42)is
t o i m p o s e a c o n s t a n t p e r c e n t a g e e f f e c t o f e dparticu|ar,
u c a t i o nitois
n npt
W acorrect
g e . o ntoc say
eequation(2.42)is
the naturallo9of wage|s rare|ymentioned, ln
obtained,
yearof education increases log(wage) by 8 3%'
ihut unothur. log(wage)'
as it givesthe predicted
The lnterceptin (2.42)is not verymeaniniful, percent of thevari-
explains about 18.6
wheneduc: 0. Thenirquui.oshowsthat educ all of the non-
(not wage). Frnally, equation (2.44) mlght not capture
ationin log(wage) "diploma effects"'
u.*..n wageandschooling' lf there are
linearity in the relatronti-]ip
t h e n t h e t w e l f t h y e a r o f e d u c a t i o n _ g r a d u a t i o n f r o m h ikind
g h sof
c hnonlinearity
o o | _ c o u | in
dbeworthmucn
allowfor thls
morethanthe eleventh v.r|. w. wiii learnhow to
Chaoter 7.
Another important use of the natural log is in obtaining a constant elasticity model.

EXAMPLE 2.11
(CEO Salary and Firm Sales)

We can estimate a constant elasticity model relating CEO salary to firm sales. The data set is the same one used in Example 2.3, except we now relate salary to sales. Let sales be annual firm sales, measured in millions of dollars. A constant elasticity model is

$$\log(\mathit{salary}) = \beta_0 + \beta_1 \log(\mathit{sales}) + u, \tag{2.45}$$

where $\beta_1$ is the elasticity of salary with respect to sales. This model falls under the simple regression model by defining the dependent variable to be $y = \log(\mathit{salary})$ and the independent variable to be $x = \log(\mathit{sales})$. Estimating this equation by OLS gives

$$\widehat{\log(\mathit{salary})} = 4.822 + 0.257\,\log(\mathit{sales}) \tag{2.46}$$

$$n = 209, \quad R^2 = 0.211.$$

The coefficient of $\log(\mathit{sales})$ is the estimated elasticity of salary with respect to sales. It implies that a 1 percent increase in firm sales increases CEO salary by about 0.257 percent, the usual interpretation of an elasticity.
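A constant elasticity model is estimated the same way, with logs on both sides. The sketch below generates salary from assumed values (an intercept of 4.8 and an elasticity of 0.25, both chosen only for illustration, with no error term) and recovers the elasticity; the sales figures and `ols_simple` are hypothetical.

```python
import math


def ols_simple(x, y):
    """Closed-form OLS intercept and slope for a simple regression."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    return y_bar - b1 * x_bar, b1


# Assumed parameters for the illustration; the error u is set to zero.
B0_TRUE, ELASTICITY_TRUE = 4.8, 0.25
sales = [500.0, 1000.0, 2500.0, 5000.0, 10000.0, 40000.0]   # made-up values
salary = [math.exp(B0_TRUE) * s ** ELASTICITY_TRUE for s in sales]

# Regress log(salary) on log(sales).
b0, b1 = ols_simple([math.log(s) for s in sales],
                    [math.log(w) for w in salary])

# Percent change in salary implied by a 1 percent increase in sales.
pct_change_salary = 100 * (1.01 ** b1 - 1)
```

The slope equals the assumed elasticity, and a 1 percent rise in sales raises salary by very nearly `b1` percent at every level of sales.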
Table 2.3
Summary of Functional Forms Involving Logarithms
Unbiasedness of OLS

We begin by establishing the unbiasedness of OLS under a simple set of assumptions. For future reference, it is useful to number these assumptions using the prefix "SLR" for simple linear regression. The first assumption defines the population model.
ASSUMPTION SLR.1 (LINEAR IN PARAMETERS)

In the population model, the dependent variable $y$ is related to the independent variable $x$ and the error (or disturbance) $u$ as

$$y = \beta_0 + \beta_1 x + u, \tag{2.47}$$

where $\beta_0$ and $\beta_1$ are the population intercept and slope parameters, respectively.
$$y_i = \beta_0 + \beta_1 x_i + u_i, \quad i = 1, 2, \ldots, n, \tag{2.48}$$
where $u_i$ is the error or disturbance for observation $i$ (for example, person $i$, firm $i$, city $i$, etc.). Thus, $u_i$ contains the unobservables for observation $i$ which affect $y_i$. The $u_i$ should not be confused with the residuals, $\hat{u}_i$, that we defined in Section 2.3. Later on, we will explore the relationship between the errors and the residuals. For interpreting $\beta_0$ and $\beta_1$ in a particular application, (2.47) is most informative, but (2.48) is also needed for some of the statistical derivations.
The relationship (2.48) can be plotted for a particular outcome of data as shown in Figure 2.7.
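The distinction between errors and residuals can be made concrete with simulated data, where the errors are known because we generate them ourselves. The following is a minimal sketch under assumed values: the parameters `beta0` and `beta1`, the sample size, and the distributions of `x` and `u` are all hypothetical and chosen only for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population parameters and sample size (not from the text).
beta0, beta1, n = 1.0, 0.5, 200

x = rng.uniform(0.0, 10.0, size=n)
u = rng.normal(0.0, 1.0, size=n)   # errors: never observed with real data
y = beta0 + beta1 * x + u          # sample generated as in (2.48)

# OLS estimates from the usual simple regression formulas.
d = x - x.mean()
b1_hat = np.sum(d * y) / np.sum(d ** 2)
b0_hat = y.mean() - b1_hat * x.mean()

resid = y - b0_hat - b1_hat * x    # residuals: computable from the data

# Residuals sum to zero by construction; errors generally do not.
print("sum of residuals:", resid.sum())
print("sum of errors:   ", u.sum())

# Each residual differs from its error by the estimation error in the line.
implied = u - (b0_hat - beta0) - (b1_hat - beta1) * x
print("identity holds:", np.allclose(resid, implied))
```

The residuals depend on the estimates, while the errors depend on the unknown population parameters, so the two coincide only if the estimated line happens to equal the population line.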
In order to obtain unbiased estimators of $\beta_0$ and $\beta_1$, we need to impose the zero conditional mean assumption that we discussed in some detail in Section 2.1. We now explicitly add it to our list of assumptions.
Figure 2.7
Graph of $y_i = \beta_0 + \beta_1 x_i + u_i$.
(Line shown in the figure: $E(y|x) = \beta_0 + \beta_1 x$.)
simple regression analysis is going to produce unbiased estimators, it is critical to think in terms of Assumption SLR.3.
Once we have agreed to condition on the $x_i$, we need one final assumption for unbiasedness.
least important because it essentially never fails in interesting applications. If Assumption SLR.4 does fail, we cannot compute the OLS estimators, which means statistical analysis is irrelevant.
Using the fact that $\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) = \sum_{i=1}^{n} (x_i - \bar{x}) y_i$ (see Appendix A), we can write the OLS slope estimator $\hat{\beta}_1$ in equation (2.19) as
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{\sum_{i=1}^{n} (x_i - \bar{x})^2}. \tag{2.49}$$
Substituting the right-hand side of (2.48) for $y_i$ gives
$$\hat{\beta}_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x}) y_i}{s_x^2} = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(\beta_0 + \beta_1 x_i + u_i)}{s_x^2}, \tag{2.50}$$
where $s_x^2 = \sum_{i=1}^{n} (x_i - \bar{x})^2$ is the total variation in the $x_i$, defined in order to simplify the notation. (This is not quite the sample variance of the $x_i$ because we do not divide by $n - 1$.) Using the algebra of the summation operator, write the numerator of $\hat{\beta}_1$ as
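$$\sum_{i=1}^{n} (x_i - \bar{x})\beta_0 + \sum_{i=1}^{n} (x_i - \bar{x})\beta_1 x_i + \sum_{i=1}^{n} (x_i - \bar{x}) u_i = \beta_0 \sum_{i=1}^{n} (x_i - \bar{x}) + \beta_1 \sum_{i=1}^{n} (x_i - \bar{x}) x_i + \sum_{i=1}^{n} (x_i - \bar{x}) u_i. \tag{2.51}$$
Because $\sum_{i=1}^{n} (x_i - \bar{x}) = 0$ and $\sum_{i=1}^{n} (x_i - \bar{x}) x_i = \sum_{i=1}^{n} (x_i - \bar{x})^2 = s_x^2$, the numerator equals $\beta_1 s_x^2 + \sum_{i=1}^{n} (x_i - \bar{x}) u_i$. Dividing by the denominator $s_x^2$ gives
$$\hat{\beta}_1 = \beta_1 + \frac{\sum_{i=1}^{n} (x_i - \bar{x}) u_i}{s_x^2} = \beta_1 + (1/s_x^2) \sum_{i=1}^{n} d_i u_i, \tag{2.52}$$
where $d_i = x_i - \bar{x}$.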
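The algebra above can be checked numerically. The sketch below generates one hypothetical sample (the parameter values, sample size, and distributions are assumptions made only for illustration) and confirms that the three ways of writing the slope estimator give the same number, using $d_i = x_i - \bar{x}$ and $s_x^2 = \sum d_i^2$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population values, used only to generate a test sample.
beta0, beta1, n = 2.0, 1.5, 50
x = rng.normal(5.0, 2.0, size=n)
u = rng.normal(0.0, 1.0, size=n)
y = beta0 + beta1 * x + u

d = x - x.mean()         # d_i = x_i - xbar
s2x = np.sum(d ** 2)     # total variation in x (no division by n - 1)

# Slope with both x and y centered (the usual OLS formula).
slope_centered = np.sum(d * (y - y.mean())) / s2x
# Slope with only x centered; valid because sum(d_i) = 0.
slope_uncentered_y = np.sum(d * y) / s2x
# Slope written as the population slope plus a term in the errors.
slope_decomposed = beta1 + np.sum(d * u) / s2x

print(slope_centered, slope_uncentered_y, slope_decomposed)
```

The last form cannot be computed with real data because the errors are unobserved; it is useful only for studying how the estimator behaves across samples.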
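$$E(\hat{\beta}_0) = \beta_0, \text{ and } E(\hat{\beta}_1) = \beta_1 \tag{2.53}$$
for any values of $\beta_0$ and $\beta_1$. In other words, $\hat{\beta}_0$ is unbiased for $\beta_0$, and $\hat{\beta}_1$ is unbiased for $\beta_1$.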
PROOF: In this proof, the expected values are conditional on the sample values of the independent variable. Since $s_x^2$ and $d_i$ are functions only of the $x_i$, they are nonrandom in the conditioning. Therefore, from (2.52),
$$E(\hat{\beta}_1) = \beta_1 + E\Big[(1/s_x^2) \sum_{i=1}^{n} d_i u_i\Big] = \beta_1 + (1/s_x^2) \sum_{i=1}^{n} E(d_i u_i) = \beta_1 + (1/s_x^2) \sum_{i=1}^{n} d_i E(u_i) = \beta_1 + (1/s_x^2) \sum_{i=1}^{n} d_i \cdot 0 = \beta_1,$$
where we have used the fact that the expected value of each $u_i$ (conditional on $\{x_1, x_2, \ldots, x_n\}$) is zero under Assumptions SLR.2 and SLR.3.
The proof for $\hat{\beta}_0$ is now straightforward. Average (2.48) across $i$ to get $\bar{y} = \beta_0 + \beta_1 \bar{x} + \bar{u}$, and plug this into the formula for $\hat{\beta}_0$:
$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = \beta_0 + \beta_1 \bar{x} + \bar{u} - \hat{\beta}_1 \bar{x} = \beta_0 + (\beta_1 - \hat{\beta}_1)\bar{x} + \bar{u}.$$
Then, conditional on the values of the $x_i$,
$$E(\hat{\beta}_0) = \beta_0 + E[(\beta_1 - \hat{\beta}_1)\bar{x}] + E(\bar{u}) = \beta_0 + E[(\beta_1 - \hat{\beta}_1)]\bar{x},$$
since $E(\bar{u}) = 0$ by Assumptions SLR.2 and SLR.3. But, we showed that $E(\hat{\beta}_1) = \beta_1$, which implies that $E[(\hat{\beta}_1 - \beta_1)] = 0$. Thus, $E(\hat{\beta}_0) = \beta_0$. Both of these arguments are valid for any values of $\beta_0$ and $\beta_1$, and so we have established unbiasedness.
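Unbiasedness is a statement about the average of the estimators over repeated samples, not about any single estimate. A small Monte Carlo experiment can illustrate this. The sketch below is only an illustration under assumed values: the parameters, the error distribution, the sample size, and the number of replications are hypothetical. The $x_i$ are drawn once and then held fixed, which mirrors the conditioning on the $x_i$ used in the proof.

```python
import numpy as np

rng = np.random.default_rng(123)

# Hypothetical population parameters and simulation settings.
beta0, beta1, sigma = 1.0, 2.0, 1.0
n, reps = 100, 5000

x = rng.normal(5.0, 2.0, size=n)   # drawn once, then held fixed
d = x - x.mean()
s2x = np.sum(d ** 2)

# Each row of U is one sample of errors with mean zero given x.
U = rng.normal(0.0, sigma, size=(reps, n))
Y = beta0 + beta1 * x + U          # x is broadcast across the rows

# OLS slope and intercept for every simulated sample.
b1_hats = (Y @ d) / s2x
b0_hats = Y.mean(axis=1) - b1_hats * x.mean()

mean_b1 = b1_hats.mean()
mean_b0 = b0_hats.mean()
print("average slope estimate:    ", mean_b1)
print("average intercept estimate:", mean_b0)
print("spread of slope estimates: ", b1_hats.std())
```

Individual estimates vary from sample to sample, but their averages should sit close to the assumed values of 2.0 and 1.0.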
EXAMPLE
(Student Math Performance and the School Lunch Program)
Let math10 denote the percentage of tenth graders at a high school receiving a passing score on a standardized mathematics exam. Suppose we wish to estimate the effect of the federally funded school lunch program on student performance. If anything, we expect the lunch program to have a positive ceteris paribus effect on performance: all other factors being equal, if a student who is too poor to eat regular meals becomes eligible for the school lunch program, his or her performance should improve. Let lnchprg denote the percentage of students who are eligible for the lunch program. Then a simple regression model is
$$\mathit{math10} = \beta_0 + \beta_1 \mathit{lnchprg} + u, \tag{2.54}$$
where $u$ contains school and student characteristics that affect overall school performance. Using the data in MEAP93.RAW on 408 Michigan high schools for the 1992-93 school year, we obtain
$$\widehat{\mathit{math10}} = 32.14 - 0.319\, \mathit{lnchprg}$$
$n = 408$, $R^2 = 0.171$.
This equation predicts that if student eligibility in the lunch program increases by 10 percentage points, the percentage of students passing the math exam falls by about 3.2 percentage points. Do we really believe that higher participation in the lunch program actually causes worse performance? Almost certainly not. A better explanation is that the error term $u$ in equation (2.54) is correlated with lnchprg. In fact, $u$ contains factors such as the poverty rate of children attending school, which affects student performance and is highly correlated with eligibility in the lunch program. Variables such as school quality and resources are also contained in $u$, and these are likely correlated with lnchprg. It is important to remember that the estimate -0.319 is only for this particular sample, but its sign and magnitude make us suspect that $u$ and $x$ are correlated, so that simple regression is biased.
In addition to omitted variables, there are other reasons for $x$ to be correlated with $u$ in the simple regression model. Since the same issues arise in multiple regression analysis, we will postpone a systematic treatment of the problem until then.
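The mechanism described in the lunch program example can be mimicked with simulated data. The sketch below does not use MEAP93.RAW and does not reproduce the estimates reported above: every number in it is a hypothetical assumption chosen for illustration, including the positive "true" effect of 0.1 and the way an unobserved poverty index drives both eligibility and the error term. No attempt is made to keep the simulated percentages inside the range 0 to 100.

```python
import numpy as np

rng = np.random.default_rng(7)

n, reps = 408, 500
beta0, beta1 = 30.0, 0.1   # hypothetical; the true effect is positive

slopes = np.empty(reps)
for r in range(reps):
    # Unobserved poverty index: raises eligibility, lowers performance.
    poverty = rng.normal(0.0, 1.0, size=n)
    lnchprg = 30.0 + 10.0 * poverty + rng.normal(0.0, 5.0, size=n)
    u = -8.0 * poverty + rng.normal(0.0, 5.0, size=n)  # error contains poverty
    math10 = beta0 + beta1 * lnchprg + u

    # Simple regression of math10 on lnchprg, ignoring poverty.
    d = lnchprg - lnchprg.mean()
    slopes[r] = np.sum(d * math10) / np.sum(d ** 2)

mean_slope = slopes.mean()
# Large-sample value of the slope: beta1 + Cov(x, u) / Var(x).
implied_limit = beta1 + (10.0 * -8.0) / (10.0 ** 2 + 5.0 ** 2)

print("true effect:           ", beta1)
print("average simple slope:  ", mean_slope)
print("beta1 + Cov(x,u)/Var(x):", implied_limit)
```

In this setup the average simple regression slope is negative even though the assumed effect is positive, because the zero conditional mean assumption fails: the error is correlated with the explanatory variable.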
$\hat{\beta}_0$ and $\hat{\beta}_1$ and because it implies that ordinary least squares has certain efficiency properties, which we will see in Chapter 3. If we were to assume that $u$ and $x$ are independent, then the distribution of $u$ given $x$ does not depend on $x$, and so $E(u|x) = E(u) = 0$ and $\mathrm{Var}(u|x) = \sigma^2$. But independence is sometimes too strong of an assumption.
Because $\mathrm{Var}(u|x) = E(u^2|x) - [E(u|x)]^2$ and $E(u|x) = 0$, $\sigma^2 = E(u^2|x)$, which means $\sigma^2$ is also the unconditional expectation of $u^2$. Therefore, $\sigma^2 = E(u^2) = \mathrm{Var}(u)$, because $E(u) = 0$. In other words, $\sigma^2$ is the unconditional variance of $u$, and so $\sigma^2$ is often called the error variance or disturbance variance. The square root of $\sigma^2$, $\sigma$, is the standard deviation of the error. A larger $\sigma$ means that the distribution of the unobservables affecting $y$ is more spread out.
It is often useful to write Assumptions SLR.3 and SLR.5 in terms of the conditional mean and conditional variance of $y$:
$$E(y|x) = \beta_0 + \beta_1 x. \tag{2.55}$$
$$\mathrm{Var}(y|x) = \sigma^2. \tag{2.56}$$
In other words, the conditional expectation of $y$ given $x$ is linear in $x$, but the variance of $y$ given $x$ is constant. This situation is graphed in Figure 2.8 where $\beta_0 > 0$ and $\beta_1 > 0$.
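Equations (2.55) and (2.56) can be illustrated by simulation: if the errors are drawn independently of $x$ with a common variance, then the spread of $y$ around the regression line should look the same at low and high values of $x$. The sketch below assumes hypothetical parameter values, a uniform distribution for $x$, and an arbitrary split of the $x$ range into four bins; none of these come from the text.

```python
import numpy as np

rng = np.random.default_rng(2024)

# Hypothetical values with beta0 > 0 and beta1 > 0.
beta0, beta1, sigma = 1.0, 0.5, 2.0
n = 200_000

x = rng.uniform(0.0, 10.0, size=n)
u = rng.normal(0.0, sigma, size=n)   # independent of x: Var(u|x) = sigma^2
y = beta0 + beta1 * x + u

# Split the x range into four bins; bin_index takes the values 0, 1, 2, 3.
bin_index = np.digitize(x, [2.5, 5.0, 7.5])

# Within each bin: variance of y around the line, and the gap between the
# average of y and the line evaluated at the average x in that bin.
cond_var = np.array([u[bin_index == k].var() for k in range(4)])
cond_mean_gap = np.array([
    y[bin_index == k].mean() - (beta0 + beta1 * x[bin_index == k].mean())
    for k in range(4)
])

print("variance around the line by bin:", cond_var.round(3))
print("mean gap from the line by bin:  ", cond_mean_gap.round(3))
print("E(u^2) vs Var(u):", np.mean(u ** 2), u.var())
```

The conditional variances should all be close to the assumed $\sigma^2 = 4$, and the mean gaps close to zero, in line with a constant variance around a linear conditional mean.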
Figure 2.8
The simple regression model under homoskedasticity.