
QMDS 202 Data Analysis and Modeling

Chapter 12 Inference About A Population


The Chi-Squared Distribution
Let $Z_1, Z_2, \ldots, Z_n$ be n independent random variables, each with a standard normal
distribution. Then the distribution of

$$Z_1^2 + Z_2^2 + \cdots + Z_n^2 = \sum_{i=1}^{n} Z_i^2$$

is called the $\chi^2$ distribution with n degrees of freedom, denoted by $\chi^2_n$.


Properties of the $\chi^2_n$ distribution:

Since $\chi^2$ values are obtained by using squared numbers, all $\chi^2$ values are zero or
positive, a property that is not found with the z distribution. Thus, the scale of
possible $\chi^2$ values extends from zero indefinitely to the right in a positive
direction.
A $\chi^2$ distribution is not symmetrical like the z distribution.
There is a different $\chi^2$ distribution for each number of degrees of freedom. If we change
the value of n, there is a different df (degrees of freedom), and so the shape of
each distribution of interest to us now depends on the value of n.
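
The properties above can be checked informally by simulation. The following is a minimal sketch, not part of the original notes: the seed, the choice n = 5, the number of draws, and the function name chi_squared_draw are all arbitrary assumptions. It builds $\chi^2_n$ draws directly from the definition (a sum of n squared standard normals) and reports the smallest draw, the mean, and the median. The mean should land near n (a standard property of the distribution that the notes do not state), and a mean above the median reflects the lack of symmetry.

```python
import random


def chi_squared_draw(n, rng):
    """Return one draw of Z_1^2 + ... + Z_n^2 for independent standard normals."""
    return sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))


rng = random.Random(12)  # arbitrary fixed seed so the run is repeatable
n = 5                    # degrees of freedom (illustrative choice)
draws = [chi_squared_draw(n, rng) for _ in range(20000)]

smallest = min(draws)
mean = sum(draws) / len(draws)
median = sorted(draws)[len(draws) // 2]

print(f"smallest draw: {smallest:.4f}")            # a sum of squares is never negative
print(f"mean: {mean:.3f}, median: {median:.3f}")   # mean above median: not symmetrical
```

Changing n and rerunning gives a differently shaped set of draws, which is the sense in which there is one distribution per number of degrees of freedom.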

Now, it can be shown that

$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}.$$

Thus the random variable formed by dividing (n - 1) times the sample variance $s^2$
(a random variable) by the population variance $\sigma^2$ follows a chi-squared
distribution. While the full proof of this proposition is difficult, you might note
that if $X_1, X_2, \ldots, X_n$ are all normally distributed, then

$$\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 = \sum_{i=1}^{n} Z_i^2 \sim \chi^2_n$$

Then, we have

$$\sum_{i=1}^{n} \left( \frac{X_i - \bar{X}}{\sigma} \right)^2 = \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$

Note. Replacing $\mu$ with $\bar{X}$ will reduce the degrees of freedom for the chi-squared
distribution to n - 1.
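
As a rough numerical illustration of this result (a sketch under stated assumptions, not a proof): the population values mu = 10 and sigma = 2, the sample size n = 5, the seed, and the helper name scaled_sample_variance are hypothetical choices made only for this example. The code repeatedly draws normal samples, forms $(n-1)s^2/\sigma^2$, and averages the results. Because a chi-squared variable has a mean equal to its degrees of freedom, the average should be close to n - 1 rather than n.

```python
import random
import statistics

rng = random.Random(7)        # arbitrary fixed seed
mu, sigma, n = 10.0, 2.0, 5   # hypothetical population parameters and sample size


def scaled_sample_variance(rng):
    """Draw one normal sample of size n and return (n - 1) * s^2 / sigma^2."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)  # sample variance, computed with divisor n - 1
    return (n - 1) * s2 / sigma ** 2


values = [scaled_sample_variance(rng) for _ in range(20000)]
average = sum(values) / len(values)

print(f"average of (n-1)s^2/sigma^2: {average:.3f}  (n - 1 = {n - 1})")
```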
The t Distribution
Consider two independent random variables A and B. Suppose that A follows a
standard normal distribution with a mean of 0 and a variance of 1. That is, $A \sim N(0,1)$.
Suppose B follows a chi-squared distribution with v degrees of freedom. That is,
$B \sim \chi^2_v$. Then, if we form a new random variable

$$\frac{A}{\sqrt{B/v}}$$

this random variable follows the t distribution with v degrees of freedom. That is,

$$\frac{A}{\sqrt{B/v}} \sim t_v$$

Properties of the t distribution:


It is similar to a z distribution with a zero mean and a symmetrical (bell) shape
about the mean.
But its shape depends on the degrees of freedom (the t distribution is really a
family of distributions, and there is a different one for a different degree of
freedom).
With a small degree of freedom, the shape of the corresponding t distribution is
less peaked than the z distribution, but as the degree of freedom increases and
approaches 30, the shapes of the t distributions lose their flatness and approximate
the shape of the z distribution.
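To see why the t distribution is useful to us, remember that we have been using the
fact that

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}} \sim N(0,1)$$

Usually, however, we would not know the population standard deviation $\sigma$. If we
replace $\sigma$ by the sample standard deviation s, we form the random variable

$$\frac{\bar{X} - \mu}{s / \sqrt{n}}$$

This new random variable is not standard normal but rather follows the t distribution
with n - 1 degrees of freedom. That is,

$$\frac{\bar{X} - \mu}{s / \sqrt{n}} \sim t_{n-1}$$

To see this, notice that

$$\frac{\bar{X} - \mu}{s / \sqrt{n}} = \frac{\dfrac{\bar{X} - \mu}{\sigma / \sqrt{n}}}{\sqrt{\dfrac{(n-1)s^2}{(n-1)\sigma^2}}} \sim t_{n-1}$$

The numerator is N(0,1), and the quantity under the square root is a $\chi^2_{n-1}$ random
variable divided by its degrees of freedom. For a normal sample, $\bar{X}$ and $s^2$ are
independent, so the ratio has exactly the form $A / \sqrt{B/v}$ with v = n - 1.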
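
The practical effect of replacing the population standard deviation with s can be seen in a small simulation. This is a hedged sketch only: the population values (mean 50, standard deviation 8), the sample size n = 5, the seed, and the function name standardized_means are assumptions introduced for illustration. For each simulated sample it computes both standardized means and counts how often each exceeds 1.96 in absolute value. The version that uses the known standard deviation should do so about 5% of the time, while the version that uses s should do so noticeably more often, which is why normal critical values are too small for small samples.

```python
import math
import random
import statistics

rng = random.Random(3)        # arbitrary fixed seed
mu, sigma, n = 50.0, 8.0, 5   # hypothetical population parameters and sample size


def standardized_means(rng):
    """Return the sample mean standardized with sigma (z) and with s (t)."""
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.fmean(sample)
    s = statistics.stdev(sample)                 # sample standard deviation
    z = (xbar - mu) / (sigma / math.sqrt(n))     # uses the known sigma
    t = (xbar - mu) / (s / math.sqrt(n))         # uses the estimate s
    return z, t


pairs = [standardized_means(rng) for _ in range(20000)]
z_tail = sum(abs(z) > 1.96 for z, _ in pairs) / len(pairs)
t_tail = sum(abs(t) > 1.96 for _, t in pairs) / len(pairs)

print(f"share beyond +/-1.96 using sigma: {z_tail:.3f}")
print(f"share beyond +/-1.96 using s:     {t_tail:.3f}")
```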

Inference About A Population Mean When The Standard Deviation Is Unknown


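Replace $\sigma$ by s; then we have $\hat{\sigma}_{\bar{x}}$ (an estimate of $\sigma_{\bar{x}}$):

For a finite population, $\hat{\sigma}_{\bar{x}} = \sqrt{\dfrac{N-n}{N-1}} \cdot \dfrac{s}{\sqrt{n}}$

For an infinite population, $\hat{\sigma}_{\bar{x}} = \dfrac{s}{\sqrt{n}}$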

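When $\bar{X}$ is normally distributed ($n \ge 30$ or the population is normally distributed) and
$\sigma$ is unknown, the confidence interval (c.i.) of $\mu$ is found by:

$$\bar{x} - t_{\alpha/2}\,\hat{\sigma}_{\bar{x}} \le \mu \le \bar{x} + t_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$$

where $t_{\alpha/2}$ is a score obtained from a t-distribution with n - 1 degrees of freedom.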


Example 1

Over 14 days, there is a total of 154 minor traffic accidents with a daily
standard deviation of 4. Find a 95% confidence interval for the mean
daily number of accidents.

Solution:

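n = 14

$\sum x = 154$

s = 4

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{154}{14} = 11$

N is unknown ⇒ infinite population

$\hat{\sigma}_{\bar{x}} = \dfrac{s}{\sqrt{n}} = \dfrac{4}{\sqrt{14}} = 1.069$ ($\sigma$ is unknown)

Assume the population is normally distributed ⇒ $\bar{X}$ is normal.
$\alpha$ = 1 - 0.95 = 0.05
v = degrees of freedom = n - 1 = 14 - 1 = 13 ⇒ $t_{\alpha/2} = t_{0.025} = 2.16$
The 95% c.i. of the population mean is:

$\bar{x} - t_{\alpha/2}\,\hat{\sigma}_{\bar{x}} \le \mu \le \bar{x} + t_{\alpha/2}\,\hat{\sigma}_{\bar{x}}$

$11 - 2.16 \times 1.069 \le \mu \le 11 + 2.16 \times 1.069$

$8.69 \le \mu \le 13.31$ (ans.)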
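
The arithmetic of Example 1 can be reproduced with a few lines of Python. This is a minimal sketch for checking the numbers: the function name t_interval_for_mean is hypothetical, the infinite-population standard error is assumed, and the critical value 2.16 is typed in from the t table exactly as in the example rather than computed.

```python
import math


def t_interval_for_mean(xbar, s, n, t_crit):
    """Confidence interval for the mean using a t-table value (infinite population)."""
    standard_error = s / math.sqrt(n)
    margin = t_crit * standard_error
    return xbar - margin, xbar + margin


n = 14
xbar = 154 / n   # 154 accidents over 14 days
lower, upper = t_interval_for_mean(xbar, s=4, n=n, t_crit=2.16)  # t_0.025 with 13 df

print(f"{lower:.2f} <= mu <= {upper:.2f}")
```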

Example 2

A machine has been producing rods cut off at 10.5 inches. The
machine is considered out of control if the rods cut by it are either too
long or too short. A random sample of 10 items shows a mean of 10.82
inches with a standard deviation of 0.25 inches. Is the machine out of
control? Test the hypothesis at 0.01 significance level.

Solution:

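H0: $\mu = 10.5$
H1: $\mu \ne 10.5$
$\alpha$ = 0.01
n = 10 < 30 and $\sigma$ unknown
Assume the population is normally distributed ⇒ $\bar{X}$ is normal.
⇒ t-distribution will be used as the testing distribution.
v = degrees of freedom = n - 1 = 10 - 1 = 9 ⇒ $t_{\alpha/2} = t_{0.005} = 3.25$

Reject H0 if TS < -3.25 or TS > 3.25.

$$TS = \frac{\bar{x} - \mu_0}{\hat{\sigma}_{\bar{x}}} = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{10.82 - 10.5}{0.25 / \sqrt{10}} = 4.05$$

TS = 4.05 > 3.25 ⇒ Reject H0.

Conclusion: It is likely that the machine is out of control.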
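
The test statistic and decision in Example 2 can be verified as follows. This is an illustrative sketch: the function name one_sample_t_statistic is an assumption, and the critical value 3.25 is taken from the rejection rule stated in the example, not computed.

```python
import math


def one_sample_t_statistic(xbar, mu0, s, n):
    """Test statistic (xbar - mu0) / (s / sqrt(n)) for a mean with sigma unknown."""
    return (xbar - mu0) / (s / math.sqrt(n))


ts = one_sample_t_statistic(xbar=10.82, mu0=10.5, s=0.25, n=10)
t_crit = 3.25                # table value used in the example's rejection rule
reject = abs(ts) > t_crit    # two-tailed test: reject in either tail

print(f"TS = {ts:.2f}, reject H0: {reject}")
```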

Inference About A Population Variance


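If the population is normally distributed, the confidence interval of $\sigma^2$ can be found
by:

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

where $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$ are scores obtained from a Chi-square ($\chi^2$) distribution
with n - 1 degrees of freedom.
The corresponding confidence interval of $\sigma$ can be found by:

$$\sqrt{\frac{(n-1)s^2}{\chi^2_{\alpha/2}}} \le \sigma \le \sqrt{\frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}}$$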

Example 3

In a sample of 15 items, the sample variance was 0.008. What is
the 95% confidence interval for (a) the population variance? (b) the
population standard deviation?

Solution:

(a)

n = 15
$s^2$ = 0.008
Assume that the population is normal.
$\alpha$ = 1 - 0.95 = 0.05
df = n - 1 = 15 - 1 = 14

$\chi^2_{\alpha/2} = \chi^2_{0.025} = 26.1$

$\chi^2_{1-\alpha/2} = \chi^2_{0.975} = 5.63$
The 95% c.i. of the population variance is:

$$\frac{(n-1)s^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)s^2}{\chi^2_{1-\alpha/2}}$$

$$\frac{14 \times 0.008}{26.1} \le \sigma^2 \le \frac{14 \times 0.008}{5.63}$$

$0.0043 \le \sigma^2 \le 0.0199$ (ans.)

(b)

The 95% c.i. of the population standard deviation is:

$\sqrt{0.0043} \le \sigma \le \sqrt{0.0199}$

$0.066 \le \sigma \le 0.141$ (ans.)
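
Both parts of Example 3 can be checked with a short script. This is a sketch only: the function name variance_interval and its parameter names are hypothetical, and the two chi-square scores (26.1 and 5.63) are the table values quoted in the example, entered by hand.

```python
import math


def variance_interval(s2, n, chi2_upper, chi2_lower):
    """Interval for sigma^2; the larger chi-square score gives the lower limit."""
    lower = (n - 1) * s2 / chi2_upper
    upper = (n - 1) * s2 / chi2_lower
    return lower, upper


var_lo, var_hi = variance_interval(s2=0.008, n=15, chi2_upper=26.1, chi2_lower=5.63)
sd_lo, sd_hi = math.sqrt(var_lo), math.sqrt(var_hi)  # limits for sigma itself

print(f"{var_lo:.4f} <= sigma^2 <= {var_hi:.4f}")
print(f"{sd_lo:.3f} <= sigma <= {sd_hi:.3f}")
```

Note that the larger chi-square score sits in the denominator of the lower limit, which is easy to get backwards when working by hand.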

Example 4

A drug company makes tablets to help control a certain disorder, and
the process that produces these tablets is considered out of control if
the standard deviation of the tablet weights exceeds 0.0125 milligrams.
A random sample of 20 tablets taken during a routine periodic check
produced a sample standard deviation of 0.0190 milligrams. At the
0.05 level, is the tablet production process out of control?

Solution:

H0: $\sigma \le 0.0125$
($\sigma_0$ = the claimed value of $\sigma$ stated in H0 = 0.0125)
H1: $\sigma > 0.0125$
$\alpha$ = 0.05
Assume the population is normal ⇒ $\chi^2$-distribution will be used as the testing
distribution
v = degrees of freedom = n - 1 = 20 - 1 = 19 ⇒ $\chi^2_{\alpha} = \chi^2_{0.05} = 30.1$

Reject H0 if TS > 30.1.

$$TS = \frac{(n-1)s^2}{\sigma_0^2} = \frac{(20-1)(0.019)^2}{(0.0125)^2} = 43.9$$

TS = 43.9 > 30.1 ⇒ Reject H0.

Conclusion: It is likely that the tablet production process is out of control.

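If H0: $\sigma \ge \sigma_0$
H1: $\sigma < \sigma_0$
then reject H0 if TS < $\chi^2_{1-\alpha}$
If H0: $\sigma = \sigma_0$
H1: $\sigma \ne \sigma_0$
then reject H0 if TS < $\chi^2_{1-\alpha/2}$ or TS > $\chi^2_{\alpha/2}$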
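
The right-tailed test in Example 4 can be reproduced as a quick check. This sketch makes two assumptions: the function name variance_test_statistic is invented for illustration, and the critical value 30.1 is the table score quoted in the example's rejection rule. Only the right-tailed rule is coded, because the example supplies a critical value for that case alone.

```python
def variance_test_statistic(s, sigma0, n):
    """Chi-square test statistic (n - 1) * s^2 / sigma0^2 for a population variance."""
    return (n - 1) * s ** 2 / sigma0 ** 2


ts = variance_test_statistic(s=0.0190, sigma0=0.0125, n=20)
chi2_crit = 30.1          # table score with 19 df at the 0.05 level, from the example
reject = ts > chi2_crit   # right-tailed test: H1 says sigma exceeds 0.0125

print(f"TS = {ts:.1f}, reject H0: {reject}")
```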

Inference About A Population Proportion


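Normal Approximation of Binomial Distribution when p is unknown:
If $n \ge 30$, $n\hat{p} \ge 5$, and $n(1-\hat{p}) \ge 5$, then the sample proportions are approximately
normally distributed.
Thus, when all these three conditions are satisfied, the confidence interval of p can be
found by:

$$\hat{p} - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}} \le p \le \hat{p} + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}}$$

where

$\hat{\sigma}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} \cdot \sqrt{\dfrac{N-n}{N-1}}$ for finite population

$\hat{\sigma}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}}$ for infinite population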

Example 5

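Suppose you wish to know what proportion of the population drink
beer. You sample 200 people and find that 90 drink beer. Find a 95%
confidence interval for the proportion of the population who drink
beer.

Solution:

n = 200

$\hat{p} = \dfrac{90}{200} = 0.45$

n > 30, $n\hat{p} = 90 \ge 5$, and $n(1-\hat{p}) = 110 \ge 5$ ⇒ $\hat{p}$ is approximately normally
distributed.

$\hat{\sigma}_{\hat{p}} = \sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = \sqrt{\dfrac{0.45 \times 0.55}{200}} = 0.0352$

The 95% confidence interval of p is:

$\hat{p} - z_{\alpha/2}\,\hat{\sigma}_{\hat{p}} \le p \le \hat{p} + z_{\alpha/2}\,\hat{\sigma}_{\hat{p}}$
$0.45 - 1.96 \times 0.0352 \le p \le 0.45 + 1.96 \times 0.0352$
$0.381 \le p \le 0.519$ (ans.)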
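
Example 5 can be recomputed with Python's standard library. This is a sketch under stated assumptions: the function name proportion_interval is hypothetical, an infinite population is assumed, and the z score is obtained from statistics.NormalDist instead of a table, so it is approximately 1.96 rather than exactly the table value. The limits are printed to three decimal places.

```python
import math
from statistics import NormalDist


def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p_hat = successes / n
    # The three conditions for the normal approximation stated in the notes.
    if not (n >= 30 and n * p_hat >= 5 and n * (1 - p_hat) >= 5):
        raise ValueError("normal approximation conditions are not met")
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error


lower, upper = proportion_interval(successes=90, n=200, confidence=0.95)

print(f"{lower:.3f} <= p <= {upper:.3f}")
```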

Example 6

The manager of a firm has advertised that 90% of the firm's customers
are satisfied with the company's services. A customer activist feels that
this is an exaggerated statement that might require legal action. In a
random sample of 150 of the company's clients, 132 said they were
satisfied. What should be concluded if a test is conducted at the 0.05
level of significance?

Solution:

H0: $p \ge 0.9$
($p_0$ = the claimed value of p stated in H0 = 0.9)
H1: p < 0.9
$\alpha$ = 0.05
n = 150 > 30, $np_0 = 150 \times 0.9 = 135 \ge 5$, and $n(1-p_0) = 150 \times 0.1 = 15 \ge 5$
⇒ z-distribution will be used as the testing distribution.

Reject H0 if TS < -1.645.

$\hat{p} = \dfrac{132}{150} = 0.88$

$$TS = \frac{\hat{p} - p_0}{\sqrt{\dfrac{p_0(1-p_0)}{n}}} = \frac{0.88 - 0.9}{\sqrt{\dfrac{0.9 \times 0.1}{150}}} = -0.8$$

TS = -0.8 is not less than -1.645 ⇒ Cannot reject H0.

Conclusion: There is not sufficient evidence to show the statement made by the
manager is exaggerated.
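
The left-tailed test in Example 6 can be checked in the same way. This is an illustrative sketch: the function name proportion_z_statistic is an assumption, and the critical value is computed with statistics.NormalDist, which gives roughly the -1.645 used in the example.

```python
import math
from statistics import NormalDist


def proportion_z_statistic(successes, n, p0):
    """z test statistic for a proportion, using p0 in the standard error."""
    p_hat = successes / n
    return (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)


ts = proportion_z_statistic(successes=132, n=150, p0=0.9)
z_crit = NormalDist().inv_cdf(0.05)   # left-tail critical value at the 0.05 level
reject = ts < z_crit                  # left-tailed test: H1 says p < 0.9

print(f"TS = {ts:.2f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

The standard error uses the hypothesized value 0.9, not the sample proportion, because the test statistic is evaluated under H0.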

Selecting The Sample Size To Estimate The Proportion


Example 7

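A plant manager wants to form a 99% confidence interval to estimate
the proportion of defective products from the production line. He
wants the estimate to be accurate within 0.05. What is the minimum
number for his sample if
a. past experience has shown an estimate of 0.3 for the population
proportion?
b. he has no estimate for the true proportion?

Solution:

a.

$\alpha$ = 1 - 0.99 = 0.01
Required: $\hat{p} - 0.05 \le p \le \hat{p} + 0.05$
The interval estimate of p is to be of the form: $\hat{p} \pm 0.05$
(W = Margin of error = the quantity following the $\pm$ sign)
From sampling concept, we know $p = \hat{p} \pm z_{\alpha/2}\,\hat{\sigma}_{\hat{p}}$

Set $z_{\alpha/2}\,\hat{\sigma}_{\hat{p}} = W$ ⇒ $z_{\alpha/2}\sqrt{\dfrac{\hat{p}(1-\hat{p})}{n}} = W$ ⇒ $n = \left(\dfrac{z_{\alpha/2}}{W}\right)^2 \hat{p}(1-\hat{p})$

$\hat{p}$ from previous study = 0.3

$n = \left(\dfrac{2.575}{0.05}\right)^2 \times 0.3 \times (1 - 0.3) = 556.97$ ⇒ n = 557 (ans.)

b.

$n = \left(\dfrac{2.575}{0.05}\right)^2 \hat{p}(1-\hat{p})$

Set $\hat{p}$ = 0.5 if no estimate can be taken from a previous study. This value maximizes
$\hat{p}(1-\hat{p})$, so it gives the largest (most conservative) sample size.

$n = \left(\dfrac{2.575}{0.05}\right)^2 \times 0.5 \times (1 - 0.5) = 663.06$ ⇒ n = 664, rounding up (ans.)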
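
The sample-size calculation in Example 7 can be written as a small helper. This is a minimal sketch: the function name sample_size_for_proportion and the parameter name p_guess are hypothetical, and the z score 2.575 is the table value used in the example. The result is always rounded up, since a fractional observation is impossible and rounding down would exceed the allowed margin of error.

```python
import math


def sample_size_for_proportion(z, margin, p_guess=0.5):
    """Smallest n with z * sqrt(p(1 - p) / n) <= margin; p_guess defaults to 0.5."""
    return math.ceil((z / margin) ** 2 * p_guess * (1 - p_guess))


n_with_estimate = sample_size_for_proportion(z=2.575, margin=0.05, p_guess=0.3)
n_without_estimate = sample_size_for_proportion(z=2.575, margin=0.05)  # no prior estimate

print(n_with_estimate, n_without_estimate)
```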

Review Problems: 12.12, 12.19, 12.24, 12.26, 12.56, 12.60, 12.62, 12.70, 12.72,
12.74, 12.75, 12.82.
