
Estimation of Error Variance

A common estimate of $\sigma^2$ is
\[
S_e^2 = \frac{1}{n-2}\sum_{i=1}^n \left(y_i - (\hat\beta_0 + \hat\beta_1 x_i)\right)^2
      = \frac{1}{n-2}\sum_{i=1}^n e_i^2
      = \frac{SSE}{DF_{\text{error}}} = MS_{\text{error}}
\]
Dividing by $n-2$ makes $S_e^2$ an unbiased estimator for $\sigma^2$.
$(n-2)$ follows the general d.f. rule:
We estimate 2 parameters in the model.
The residuals satisfy two constraints imposed by the least-squares method.
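The estimator above can be sketched in plain Python. This is a minimal illustration with made-up data (the x and y values are not from the slides):

```python
import math

# Sketch: unbiased estimate of the error variance sigma^2 in simple
# linear regression, S_e^2 = SSE / (n - 2).  Toy data, illustrative only.

def fit_ls(x, y):
    """Least-squares estimates (b0, b1) for y = b0 + b1*x + e."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1

def error_variance(x, y):
    """S_e^2 = SSE / (n - 2), i.e. the mean squared error MS_error."""
    b0, b1 = fit_ls(x, y)
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    return sse / (len(x) - 2)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
s_e2 = error_variance(x, y)
```

Note the divisor is $n-2$, not $n$: two degrees of freedom are spent estimating $\beta_0$ and $\beta_1$.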

Inference for $\beta_1$

We now discuss $\hat\beta_1$ in detail.
\[
\hat\beta_1 = \frac{\sum_{i=1}^n (x_i - \bar x)(y_i - \bar y)}{\sum_{i=1}^n (x_i - \bar x)^2}
            = \frac{\sum_{i=1}^n (x_i - \bar x)\,y_i}{\sum_{i=1}^n (x_i - \bar x)^2}
\]
$\hat\beta_1$ is a linear combination of normal random variables (the $y_i$'s), so $\hat\beta_1$ is normally distributed with
\[
E(\hat\beta_1) = \beta_1, \qquad
\sigma_{\hat\beta_1}^2 = \mathrm{Var}(\hat\beta_1) = \frac{\sigma^2}{\sum_{i=1}^n (x_i - \bar x)^2} = \frac{\sigma^2}{(n-1)S_X^2}
\]

Inference for $\beta_1$

$\sigma^2$ is unknown; plug in the estimate $S_e^2$.
The sample standard error of $\hat\beta_1$ is
\[
S_{\hat\beta_1} = \frac{S_e}{S_X\sqrt{n-1}}
\]
$(\hat\beta_1 - \beta_1)/S_{\hat\beta_1}$ follows a $t$-distribution with $(n-2)$ d.f.
To test $H_0: \beta_1 = 0$ vs. $H_1: \beta_1 \neq 0$ at level $\alpha$, evaluate
\[
T = \frac{\hat\beta_1 - 0}{S_{\hat\beta_1}} \sim t_{n-2},
\]
and reject $H_0$ if $|T| > t_{n-2,\,1-\alpha/2}$.
A $100(1-\alpha)\%$ C.I. for $\beta_1$ is $\hat\beta_1 \pm t_{n-2,\,1-\alpha/2}\, S_{\hat\beta_1}$.
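The slope test and interval can be sketched as follows. This uses made-up data, and the critical value $t_{3,0.975} = 3.182$ is hard-coded from a $t$-table since the Python standard library has no $t$ quantile function:

```python
import math

# Sketch: t-test and C.I. for the slope beta1.  Toy data, illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)            # = (n-1) * S_X^2
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = math.sqrt(sse / (n - 2))                     # S_e
se_b1 = s_e / math.sqrt(sxx)                       # = S_e / (S_X * sqrt(n-1))
t_stat = b1 / se_b1                                # test statistic for H0: beta1 = 0
t_crit = 3.182                                     # t_{n-2, 0.975} for n-2 = 3 d.f.
ci = (b1 - t_crit * se_b1, b1 + t_crit * se_b1)    # 95% C.I. for beta1
```

Here `t_stat` far exceeds `t_crit`, so $H_0: \beta_1 = 0$ is rejected, and the interval `ci` covers the slope value (about 2) used to generate the toy data.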

Inference for $\beta_0$

The point estimate of $\beta_0$ is $\hat\beta_0 = \bar y - \hat\beta_1 \bar x$.
\[
E(\hat\beta_0) = \beta_0, \qquad
\sigma_{\hat\beta_0}^2 = \mathrm{Var}(\hat\beta_0) = \sigma^2\left(\frac{1}{n} + \frac{\bar x^2}{(n-1)S_X^2}\right)
\]
\[
\hat\beta_0 \sim N(\beta_0,\, \sigma_{\hat\beta_0}^2)
\]
$\hat\beta_0$ has sample standard error $S_{\hat\beta_0} = S_e\sqrt{\dfrac{1}{n} + \dfrac{\bar x^2}{(n-1)S_X^2}}$.
To test $H_0: \beta_0 = 0$ vs. $H_1: \beta_0 \neq 0$ at level $\alpha$, evaluate
\[
T = \frac{\hat\beta_0 - 0}{S_{\hat\beta_0}} \sim t_{n-2},
\]
and reject $H_0$ if $|T| > t_{n-2,\,1-\alpha/2}$.
A $100(1-\alpha)\%$ C.I. for $\beta_0$ is $\hat\beta_0 \pm t_{n-2,\,1-\alpha/2}\, S_{\hat\beta_0}$.
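The intercept inference follows the same pattern; only the standard-error formula changes. A minimal sketch with the same made-up data and a hard-coded $t_{3,0.975} = 3.182$:

```python
import math

# Sketch: standard error, t statistic, and C.I. for the intercept beta0,
# using S_{beta0-hat} = S_e * sqrt(1/n + xbar^2 / ((n-1) S_X^2)).
# Toy data, illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)            # = (n-1) * S_X^2
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2
                    for xi, yi in zip(x, y)) / (n - 2))
se_b0 = s_e * math.sqrt(1 / n + xbar ** 2 / sxx)   # sample S.E. of beta0-hat
t_stat = b0 / se_b0                                # test statistic for H0: beta0 = 0
t_crit = 3.182                                     # t_{3, 0.975}
ci = (b0 - t_crit * se_b0, b0 + t_crit * se_b0)
```

With this toy data $|T| < t_{3,0.975}$, so the intercept is not significantly different from zero and the C.I. covers 0.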

Inference for Regression Line (or Conditional Means)
Inference for $E(Y \mid X = x) = \beta_0 + \beta_1 x$

For a chosen $x_0$, the estimate is $\hat y_0 = \hat\beta_0 + \hat\beta_1 x_0 = \bar y + \hat\beta_1 (x_0 - \bar x)$.
\[
E(\hat y_0) = \mu_{Y|X=x_0} = \beta_0 + \beta_1 x_0
\]
\[
\mathrm{Var}(\hat y_0) = \sigma^2\left(\frac{1}{n} + \frac{(x_0 - \bar x)^2}{(n-1)S_X^2}\right)
\]
The sample standard error is $S_{\hat y_0} = S_e\sqrt{\dfrac{1}{n} + \dfrac{(x_0 - \bar x)^2}{(n-1)S_X^2}}$.
To test $H_0: \mu_{Y|X=x_0} = \mu_0$ vs. $H_1: \mu_{Y|X=x_0} \neq \mu_0$ at level $\alpha$, evaluate
\[
T = \frac{\hat y_0 - \mu_0}{S_{\hat y_0}} \sim t_{n-2},
\]
and reject $H_0$ if $|T| > t_{n-2,\,1-\alpha/2}$.
A $100(1-\alpha)\%$ C.I. for $\mu_{Y|X=x_0}$ (i.e. $\beta_0 + \beta_1 x_0$) is
$(\hat\beta_0 + \hat\beta_1 x_0) \pm t_{n-2,\,1-\alpha/2}\, S_{\hat y_0}$.
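A sketch of the conditional-mean interval at a chosen $x_0$, again with made-up data and a hard-coded $t_{3,0.975} = 3.182$:

```python
import math

# Sketch: C.I. for the conditional mean E(Y | X = x0) at a chosen x0.
# Toy data, illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)            # = (n-1) * S_X^2
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e = math.sqrt(sum((yi - (b0 + b1 * xi)) ** 2
                    for xi, yi in zip(x, y)) / (n - 2))

x0 = 4.0
y0_hat = b0 + b1 * x0                              # estimated conditional mean
# S_{y0-hat} = S_e * sqrt(1/n + (x0 - xbar)^2 / ((n-1) S_X^2))
se_y0 = s_e * math.sqrt(1 / n + (x0 - xbar) ** 2 / sxx)
t_crit = 3.182                                     # t_{3, 0.975}
ci = (y0_hat - t_crit * se_y0, y0_hat + t_crit * se_y0)
```

Note that `se_y0` grows with $(x_0 - \bar x)^2$: the interval is narrowest at $x_0 = \bar x$ and widens as $x_0$ moves away from the center of the data.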

Prediction

Predict the value of $Y$ at a given $x_0$:
\[
Y_{\text{new}} = \beta_0 + \beta_1 x_0 + e
\]
The estimate is still $\hat y_{\text{new}} = \hat\beta_0 + \hat\beta_1 x_0$.
The standard error is
\[
S_{y,\text{pred}} = \sqrt{S_e^2 + S_{\hat y_0}^2}
                  = S_e\sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar x)^2}{(n-1)S_X^2}}
\]
The $100(1-\alpha)\%$ prediction interval is
$(\hat\beta_0 + \hat\beta_1 x_0) \pm t_{n-2,\,1-\alpha/2}\, S_{y,\text{pred}}$.
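A sketch of the prediction interval, including a check of the identity $S_{y,\text{pred}}^2 = S_e^2 + S_{\hat y_0}^2$. Toy data again, with $t_{3,0.975} = 3.182$ hard-coded:

```python
import math

# Sketch: prediction interval for a new observation at x0, and the
# decomposition S_pred^2 = S_e^2 + S_{y0-hat}^2.  Toy data, illustrative only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
b0 = ybar - b1 * xbar
s_e2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y)) / (n - 2)

x0 = 4.0
y_new = b0 + b1 * x0                                 # point prediction
se_mean2 = s_e2 * (1 / n + (x0 - xbar) ** 2 / sxx)   # estimated Var of fitted mean
se_pred = math.sqrt(s_e2 + se_mean2)                 # = S_e * sqrt(1 + 1/n + ...)
t_crit = 3.182                                       # t_{3, 0.975}
pi = (y_new - t_crit * se_pred, y_new + t_crit * se_pred)
```

The extra $S_e^2$ term accounts for the new observation's own error, so the prediction interval is always wider than the confidence interval for the mean at the same $x_0$.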

Example

Forbes Data

James D. Forbes collected data in the Scottish Alps in the 1840s and 1850s.

$n = 17$ locations (at different altitudes).

Objective: Predict barometric pressure (in inches of mercury) from the boiling point of water (X) in °F.

Use $Y = \log(\text{barometric pressure})$.

Motivation: Fragile barometers of the 1840s were difficult to transport.

      BOILING POINT   BAROMETRIC    NATURAL LOG OF
      OF WATER        PRESSURE      BAROMETRIC
Obs   (degrees F)     (inches Hg)   PRESSURE
1 194.3 20.79 3.034472
2 194.5 20.79 3.034472
3 197.9 22.40 3.109061
4 198.4 22.67 3.121042
5 199.4 23.15 3.141995
6 199.9 23.35 3.150597
7 200.9 23.89 3.173460
8 201.1 23.89 3.173460
9 201.3 24.01 3.178470
10 201.4 24.02 3.178887
11 203.6 25.14 3.224460
12 204.6 26.57 3.279783
13 208.6 27.76 3.323596
14 209.5 28.49 3.349553
15 210.7 29.04 3.368674
16 211.9 29.88 3.397189
17 212.2 30.06 3.403195

Forbes Data

[Scatterplot of the Forbes data: Log Pressure (3.0 to 3.5) versus Boiling point of water (190 to 215 degrees F).]

Analysis of Forbes Data
The proposed regression model is
\[
y_i = \beta_0 + \beta_1 x_i + e_i, \qquad e_i \overset{\text{i.i.d.}}{\sim} N(0, \sigma^2), \quad i = 1, \ldots, 17,
\]
where

$Y_i = \log(\text{pressure})$

$X_i$ = boiling point (°F)

$\beta_1$ is the increase in mean log(pressure) when the boiling point of water increases by 1°F.

$\beta_0$ is the mean log(pressure) when the boiling point of water is 0°F. (Is this extrapolation realistic?)

Analysis of Forbes Data

The estimated regression model is
\[
\hat y = \hat\beta_0 + \hat\beta_1 x = -0.970866 + 0.020622\,x
\]
Residuals: $e_i = y_i - \hat y_i$, $i = 1, \ldots, 17$.

The estimated mean log(pressure) at 212°F is
$\hat y_{212} = \hat\beta_0 + \hat\beta_1 \cdot 212 = 3.401074$.
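As a check, the least-squares fit can be reproduced in plain Python from the 17 observations tabulated earlier. This is a sketch; the computed coefficients agree with the slide's reported values to roughly three significant figures (small discrepancies can arise from rounding in the printed table):

```python
# Forbes data from the table above: boiling point (deg F) and
# natural log of barometric pressure.
bp = [194.3, 194.5, 197.9, 198.4, 199.4, 199.9, 200.9, 201.1, 201.3,
      201.4, 203.6, 204.6, 208.6, 209.5, 210.7, 211.9, 212.2]
logp = [3.034472, 3.034472, 3.109061, 3.121042, 3.141995, 3.150597,
        3.173460, 3.173460, 3.178470, 3.178887, 3.224460, 3.279783,
        3.323596, 3.349553, 3.368674, 3.397189, 3.403195]
n = len(bp)
xbar = sum(bp) / n
ybar = sum(logp) / n
sxx = sum((xi - xbar) ** 2 for xi in bp)           # = (n-1) * S_X^2
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(bp, logp)) / sxx
b0 = ybar - b1 * xbar                              # intercept (negative)
y212 = b0 + b1 * 212                               # fitted mean log(pressure) at 212 F
```

The quantities `xbar` (about 202.953) and `sxx` (about 530.78) reappear in the interval calculations on the later slides.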

Analysis of Forbes Data
Inference on $\beta_1$:

Test $H_0: \beta_1 = 0$ ($Y_i = \beta_0 + e_i$) versus $H_1: \beta_1 \neq 0$ ($Y_i = \beta_0 + \beta_1 x_i + e_i$).
Evaluate
\[
T = \frac{\hat\beta_1 - 0}{S_{\hat\beta_1}} = \frac{0.0206220}{0.000379} = 54.42.
\]
The p-value is $\ll 0.0001$. Reject $H_0$ and conclude that the slope is positive.

A 95% C.I. for the slope indicates that the slope is very well estimated from these data:
\[
\hat\beta_1 \pm t_{15,0.975}\, S_{\hat\beta_1}
= 0.020622 \pm (2.131)(0.00037895)
= (0.0198,\ 0.0214)
\]

Analysis of Forbes Data
Inference on $\beta_0$:

Test $H_0: \beta_0 = 0$ ($Y_i = \beta_1 x_i + e_i$) versus $H_1: \beta_0 \neq 0$ ($Y_i = \beta_0 + \beta_1 x_i + e_i$).
Evaluate
\[
T = \frac{\hat\beta_0 - 0}{S_{\hat\beta_0}} = \frac{-0.9710}{0.0769} = -12.6.
\]
The p-value is $\ll 0.0001$. Reject $H_0$ and conclude that the intercept is negative. (Is there a practical motivation to do this test?)

A 95% C.I. for the intercept is
\[
\hat\beta_0 \pm t_{15,0.975}\, S_{\hat\beta_0}
= -0.971 \pm (2.131)(0.0769)
= (-1.135,\ -0.807)
\]

Analysis of Forbes Data

Construct a 95% C.I. for the mean of log-pressure measurements when the boiling point of water is $x = 209$°F.

The estimated mean is
\[
\hat y = \hat\beta_0 + \hat\beta_1 x = -0.9710 + (0.0206)(209) = 3.339
\]
Evaluate the sample standard error of this estimate:
\[
S_{\hat y} = \sqrt{0.0000762\left(\frac{1}{17} + \frac{(209 - 202.953)^2}{530.78}\right)} = 0.00312
\]
A 95% C.I. is $\hat y \pm t_{15,0.975}\, S_{\hat y} = (3.333,\ 3.346)$.

Analysis of Forbes Data
Computing a 95% C.I. at every point $x$ gives a confidence band for the regression line.

[Plot: fitted regression line with 95 percent C.I. band; Log Pressure (3.0 to 3.5) versus Boiling point of water (190 to 215 degrees F).]

Analysis of Forbes Data
Inference for prediction:

Construct a 95% prediction interval for a log-pressure value when the boiling point of water is $x = 209$°F.

The prediction is the estimated mean (because the estimate of the error is zero):
\[
\hat y = \hat\beta_0 + \hat\beta_1 x + \text{error} = -0.9710 + (0.0206)(209) + 0 = 3.339
\]
Evaluate the standard error of the prediction:
\[
S_{y,\text{pred}} = \sqrt{0.0000762\left(1 + \frac{1}{17} + \frac{(209 - 202.953)^2}{530.78}\right)} = 0.00927
\]

Analysis of Forbes Data

A 95% prediction interval is
\[
\hat y \pm t_{15,0.975}\, S_{y,\text{pred}}
= 3.339 \pm (2.131)(0.00927)
= (3.319,\ 3.359)
\]
The above inferences (estimation, testing, prediction) appear in the output of a SAS program. We will introduce SAS coding after we introduce one more thing: the ANOVA table for simple linear regression. Hold on.
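The interval calculations at $x_0 = 209$°F can likewise be reproduced from the tabulated data. A sketch, with $t_{15,0.975} = 2.131$ taken from the slides; agreement with the printed values is to rounding accuracy:

```python
import math

# Sketch: C.I. for the conditional mean and prediction interval at
# x0 = 209 deg F, from the Forbes data tabulated earlier.
bp = [194.3, 194.5, 197.9, 198.4, 199.4, 199.9, 200.9, 201.1, 201.3,
      201.4, 203.6, 204.6, 208.6, 209.5, 210.7, 211.9, 212.2]
logp = [3.034472, 3.034472, 3.109061, 3.121042, 3.141995, 3.150597,
        3.173460, 3.173460, 3.178470, 3.178887, 3.224460, 3.279783,
        3.323596, 3.349553, 3.368674, 3.397189, 3.403195]
n = len(bp)
xbar = sum(bp) / n
ybar = sum(logp) / n
sxx = sum((xi - xbar) ** 2 for xi in bp)
b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(bp, logp)) / sxx
b0 = ybar - b1 * xbar
s_e2 = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(bp, logp)) / (n - 2)

x0 = 209.0
y0 = b0 + b1 * x0                                # estimated mean / prediction
leverage = 1 / n + (x0 - xbar) ** 2 / sxx        # 1/n + (x0-xbar)^2 / ((n-1)S_X^2)
se_mean = math.sqrt(s_e2 * leverage)             # S.E. of the fitted mean
se_pred = math.sqrt(s_e2 * (1 + leverage))       # S.E. of a new prediction
t_crit = 2.131                                   # t_{15, 0.975}
ci = (y0 - t_crit * se_mean, y0 + t_crit * se_mean)   # C.I. for the mean
pi = (y0 - t_crit * se_pred, y0 + t_crit * se_pred)   # prediction interval
```

As the slides show, the prediction interval (3.319, 3.359) strictly contains the confidence interval (3.333, 3.346), reflecting the extra $S_e^2$ term in the prediction variance.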

