
Inference Concerning Variances

Chi-square Variate
The square of a standard normal variate is known as a chi-square variate with 1 degree of freedom (df).
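If X is normally distributed with mean $\mu$ and variance $\sigma^2$, i.e. if $X \sim N(\mu, \sigma^2)$, then

$$Z = \frac{X - \mu}{\sigma} \sim N(0, 1)$$

and then

$$Z^2 = \left(\frac{X - \mu}{\sigma}\right)^2$$

is a chi-square variate with 1 degree of freedom.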


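In general, if $X_1, X_2, X_3, \ldots, X_n$ are n independent normal variates with means $\mu_1, \mu_2, \mu_3, \ldots, \mu_n$ and variances $\sigma_1^2, \sigma_2^2, \sigma_3^2, \ldots, \sigma_n^2$ respectively, then

$$\chi^2 = \left(\frac{X_1 - \mu_1}{\sigma_1}\right)^2 + \left(\frac{X_2 - \mu_2}{\sigma_2}\right)^2 + \left(\frac{X_3 - \mu_3}{\sigma_3}\right)^2 + \cdots + \left(\frac{X_n - \mu_n}{\sigma_n}\right)^2$$

i.e.

$$\chi^2 = \sum_{i=1}^{n} \left(\frac{X_i - \mu_i}{\sigma_i}\right)^2$$

is a chi-square ($\chi^2$) variate with n degrees of freedom.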
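The definition above can be illustrated with a small simulation. The following is a minimal Python sketch, not part of the original notes: the function name `simulate_chi_square`, the seed, and the number of trials are all assumptions chosen for illustration. It relies on the standard fact (not stated in these notes) that a chi-square variate with n degrees of freedom has mean n.

```python
import random
import statistics


def simulate_chi_square(n, trials, seed=0):
    """Draw `trials` values of Z_1^2 + ... + Z_n^2, where each Z_i ~ N(0, 1)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(n))
        for _ in range(trials)
    ]


# Hypothetical settings: 5 degrees of freedom, 20000 simulated values.
draws = simulate_chi_square(n=5, trials=20000)
sample_mean = statistics.mean(draws)

# The average of the simulated values should be close to n = 5.
print(round(sample_mean, 2))
```

Each simulated value is a sum of squares, so it can never be negative, which matches the right-skewed shape of the chi-square distribution.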

Theorem
If $S^2$ is the variance of a random sample of size n taken from a normal population having the variance $\sigma^2$, then

$$\chi^2 = \frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma^2}$$

is a random variable having the chi-square distribution with n-1 df.
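The statistic in this theorem can be computed in two equivalent ways, as a short Python sketch shows. The function names and the five data values are hypothetical and serve only as an illustration. `statistics.variance` uses the divisor n-1, which is the definition of $S^2$ assumed here.

```python
import statistics


def chi_square_statistic(sample, sigma_sq):
    """Compute (n - 1) * S^2 / sigma^2 from the sample variance S^2."""
    n = len(sample)
    s_sq = statistics.variance(sample)  # sample variance, divisor n - 1
    return (n - 1) * s_sq / sigma_sq


def chi_square_from_deviations(sample, sigma_sq):
    """Compute sum((x_i - x_bar)^2) / sigma^2 directly from the deviations."""
    x_bar = statistics.mean(sample)
    return sum((x - x_bar) ** 2 for x in sample) / sigma_sq


# Hypothetical sample and population variance, for illustration only.
data = [4.0, 7.0, 5.0, 9.0, 5.0]
sigma_sq = 4.0

a = chi_square_statistic(data, sigma_sq)
b = chi_square_from_deviations(data, sigma_sq)
print(a, b)  # both forms give the same value, with n - 1 = 4 df
```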

Confidence Intervals for the Population Variance

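In most practical applications, interval estimates of $\sigma$ or $\sigma^2$ are based on the sample standard deviation or the sample variance.

We can use the above theorem for an interval estimate of $\sigma^2$. Here $\chi^2_{p}$ denotes the value of the chi-square distribution with n-1 df that has area p to its right.

$$\Pr\left[\chi^2_{1-\alpha/2} \le \chi^2 \le \chi^2_{\alpha/2}\right] = 1 - \alpha$$

$$\Pr\left[\chi^2_{1-\alpha/2} \le \frac{(n-1)S^2}{\sigma^2} \le \chi^2_{\alpha/2}\right] = 1 - \alpha$$

$$\Pr\left[\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}\right] = 1 - \alpha$$

Thus,

$$\frac{(n-1)S^2}{\chi^2_{\alpha/2}} \le \sigma^2 \le \frac{(n-1)S^2}{\chi^2_{1-\alpha/2}}$$

is the required $(1-\alpha)100\%$ confidence interval for $\sigma^2$.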

Suppose we want a 95% confidence interval for the variance. For instance, let the degrees of freedom be 8. We locate two points on the chi-square distribution with the given degrees of freedom: $\chi^2_U$ cuts off 0.025 of the area in the upper tail of the distribution and $\chi^2_L$ cuts off 0.025 of the area in the lower tail of the distribution.

The values $\chi^2_U = \chi^2_{0.025} = 17.535$ and $\chi^2_L = \chi^2_{1-0.025} = \chi^2_{0.975} = 2.180$ can be found from the table.
The following expressions give the confidence interval for $\sigma^2$:

Lower Confidence Limit: $\sigma^2_L = \dfrac{(n-1)S^2}{\chi^2_U}$

Upper Confidence Limit: $\sigma^2_U = \dfrac{(n-1)S^2}{\chi^2_L}$

Example: A sample of 20 observations from a normal distribution has mean 37 and variance of 12.2. Construct a 90 percent
confidence interval for the true population variance.

Lower Confidence Limit: $\sigma^2_L = \dfrac{(n-1)S^2}{\chi^2_U} = \dfrac{(20-1)\,12.2}{30.144} = 7.69$

Upper Confidence Limit: $\sigma^2_U = \dfrac{(n-1)S^2}{\chi^2_L} = \dfrac{(20-1)\,12.2}{10.117} = 22.91$

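The arithmetic of this example can be checked with a few lines of Python. This is a sketch only: the function name `variance_confidence_interval` is hypothetical, and the two chi-square table values (30.144 and 10.117 for 19 df) are passed in exactly as quoted in the example rather than computed.

```python
def variance_confidence_interval(n, s_sq, chi2_upper, chi2_lower):
    """Return (lower, upper) confidence limits for the population variance.

    chi2_upper: table value cutting off alpha/2 in the upper tail
    chi2_lower: table value cutting off alpha/2 in the lower tail
    """
    sum_of_squares = (n - 1) * s_sq
    return sum_of_squares / chi2_upper, sum_of_squares / chi2_lower


# Values from the example: n = 20, S^2 = 12.2, 90 percent confidence.
lower, upper = variance_confidence_interval(
    n=20, s_sq=12.2, chi2_upper=30.144, chi2_lower=10.117
)
print(round(lower, 2), round(upper, 2))  # 7.69 22.91
```

Note that the larger table value gives the lower limit and the smaller table value gives the upper limit, because both appear in the denominator.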
Test for a Specified Population Variance
Let a random sample $x_1, x_2, x_3, \ldots, x_n$ of size n be drawn from a normal population with mean $\mu$ and variance $\sigma^2$. To test the hypothesis that the population variance has a specified value $\sigma_0^2$:
Let the null hypothesis be $H_0: \sigma^2 = \sigma_0^2$
Then the alternative hypothesis is $H_1: \sigma^2 \ne \sigma_0^2$

Assuming that $H_0$ is true, the test statistic is

$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2}$$

where $S^2$ is the sample variance.

The test statistic follows the chi-square distribution with (n-1) df. If the calculated value is greater than or equal to the tabulated $\chi^2_{n-1}$ value, then reject the null hypothesis.


Rejection Rule

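Critical regions for testing $H_0: \sigma^2 = \sigma_0^2$

Alternative hypothesis          Reject null hypothesis if
$\sigma^2 < \sigma_0^2$         $\chi^2 < \chi^2_{1-\alpha}$
$\sigma^2 > \sigma_0^2$         $\chi^2 > \chi^2_{\alpha}$
$\sigma^2 \ne \sigma_0^2$       $\chi^2 < \chi^2_{1-\alpha/2}$ or $\chi^2 > \chi^2_{\alpha/2}$
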

Example: A random sample of size 20 from a normal population gives a mean of 42 and a variance of 25, test the hypothesis
that the population variance is 64 at 5% level of significance.
Solution:

Let the null hypothesis be $H_0: \sigma^2 = \sigma_0^2 = 64$

Then the alternative hypothesis is $H_1: \sigma^2 \ne 64$

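Assuming that $H_0$ is true, the test statistic is

$$\chi^2 = \frac{(n-1)S^2}{\sigma_0^2} = \frac{19 \times 25}{64} = \frac{475}{64} = 7.42$$

df = 20 - 1 = 19, and the tabulated values for the two-sided test are $\chi^2_{19}(0.025) = 32.852$ and $\chi^2_{19}(0.975) = 8.907$.

Since the calculated value of chi-square (7.42) is less than the lower tabulated value (8.907), it falls in the critical region. Hence we reject the null hypothesis. Thus, we may conclude that the population variance is not 64.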
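The calculation in this example can be checked with a short Python sketch. The function name is hypothetical. The upper critical value 32.852 is the one quoted in the example, while the lower critical value 8.907 (for $\chi^2_{0.975}$ with 19 df) is taken from a standard chi-square table and should be treated as an assumption of this sketch. For a two-sided alternative the statistic is compared with both critical values.

```python
def chi_square_variance_test(n, s_sq, sigma0_sq, chi2_lower, chi2_upper):
    """Two-sided test of H0: sigma^2 = sigma0^2.

    Returns the test statistic and True if H0 is rejected, i.e. if the
    statistic falls below chi2_lower or above chi2_upper.
    """
    statistic = (n - 1) * s_sq / sigma0_sq
    reject = statistic < chi2_lower or statistic > chi2_upper
    return statistic, reject


# Example values: n = 20, S^2 = 25, sigma0^2 = 64, 5% level (2.5% per tail).
statistic, reject = chi_square_variance_test(
    n=20, s_sq=25, sigma0_sq=64,
    chi2_lower=8.907,   # assumed table value, chi-square 0.975 with 19 df
    chi2_upper=32.852,  # table value quoted in the example
)
print(round(statistic, 2), reject)  # 7.42 True
```

The statistic is far below the upper critical value but also below the lower one, so it lies in the lower-tail critical region of the two-sided test.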

Hypothesis Concerning Two Variances


F-statistic: If $\chi_1^2$ and $\chi_2^2$ are two independent chi-square variates with $\nu_1$ and $\nu_2$ df respectively, then the F-statistic is defined by

$$F = \frac{\chi_1^2 / \nu_1}{\chi_2^2 / \nu_2}$$

In other words, F is defined as the ratio of two independent chi-square variates, each divided by its respective degrees of freedom, and it follows the F-distribution with $\nu_1$ and $\nu_2$ df.

The F-Distribution
The F-distribution is a skewed distribution.
Generally it is skewed to the right and tends to become more symmetrical as the number of degrees of freedom in the numerator and denominator increase.
The F-distribution has a single mode.
The shape of the distribution depends on the number of degrees of freedom in both the numerator and the denominator of the F-ratio.

In the notation $F(\nu_1, \nu_2)$, the first number is the number of degrees of freedom in the numerator of the F-ratio; the second is the number of degrees of freedom in the denominator.
(Figure: Fig. 11.8, p. 597, Rubin.)
Confidence interval

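If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$ respectively, taken from two normal populations with variances $\sigma_1^2$ and $\sigma_2^2$, then

$$F = \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \sim F(n_1 - 1, n_2 - 1)$$

Writing $\nu_1 = n_1 - 1$ and $\nu_2 = n_2 - 1$,

$$\Pr\left[F_{1-\alpha/2}(\nu_1, \nu_2) \le F \le F_{\alpha/2}(\nu_1, \nu_2)\right] = 1 - \alpha$$

$$\Pr\left[F_{1-\alpha/2}(\nu_1, \nu_2) \le \frac{S_1^2 / \sigma_1^2}{S_2^2 / \sigma_2^2} \le F_{\alpha/2}(\nu_1, \nu_2)\right] = 1 - \alpha$$

$$\Pr\left[F_{1-\alpha/2}(\nu_1, \nu_2) \le \frac{S_1^2}{S_2^2} \cdot \frac{\sigma_2^2}{\sigma_1^2} \le F_{\alpha/2}(\nu_1, \nu_2)\right] = 1 - \alpha$$

$$\Pr\left[\frac{S_2^2}{S_1^2} F_{1-\alpha/2}(\nu_1, \nu_2) \le \frac{\sigma_2^2}{\sigma_1^2} \le \frac{S_2^2}{S_1^2} F_{\alpha/2}(\nu_1, \nu_2)\right] = 1 - \alpha$$

$$\Pr\left[\frac{S_1^2}{S_2^2} \cdot \frac{1}{F_{\alpha/2}(\nu_1, \nu_2)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} \cdot \frac{1}{F_{1-\alpha/2}(\nu_1, \nu_2)}\right] = 1 - \alpha$$

As $F_{1-\alpha/2}(\nu_1, \nu_2) = \dfrac{1}{F_{\alpha/2}(\nu_2, \nu_1)}$,

$$\Pr\left[\frac{S_1^2}{S_2^2} \cdot \frac{1}{F_{\alpha/2}(\nu_1, \nu_2)} \le \frac{\sigma_1^2}{\sigma_2^2} \le \frac{S_1^2}{S_2^2} \cdot F_{\alpha/2}(\nu_2, \nu_1)\right] = 1 - \alpha$$

is the required confidence interval.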
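The final interval for $\sigma_1^2 / \sigma_2^2$ can be sketched in Python as follows. Everything in the example call is an assumption made for illustration: the function name, the sample variances 16 and 9, the sample sizes $n_1 = n_2 = 10$, and the table value $F_{0.05}(9, 9) = 3.18$, which comes from a standard F-table and not from these notes. With equal degrees of freedom the two table values coincide.

```python
def variance_ratio_interval(s1_sq, s2_sq, f_upper_12, f_upper_21):
    """Confidence limits for sigma1^2 / sigma2^2.

    f_upper_12: F_{alpha/2}(nu1, nu2), upper-tail table value
    f_upper_21: F_{alpha/2}(nu2, nu1), upper-tail table value
    """
    ratio = s1_sq / s2_sq
    lower = ratio / f_upper_12   # (S1^2 / S2^2) * 1 / F_{alpha/2}(nu1, nu2)
    upper = ratio * f_upper_21   # (S1^2 / S2^2) * F_{alpha/2}(nu2, nu1)
    return lower, upper


# Hypothetical data: S1^2 = 16, S2^2 = 9, n1 = n2 = 10, 90% confidence.
# Assumed table value: F_0.05(9, 9) = 3.18.
ratio_lower, ratio_upper = variance_ratio_interval(
    s1_sq=16.0, s2_sq=9.0, f_upper_12=3.18, f_upper_21=3.18
)
print(ratio_lower, ratio_upper)
```

If the resulting interval contains 1, the data are consistent with equal population variances at the chosen confidence level.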


Theorem

If $S_1^2$ and $S_2^2$ are the variances of independent random samples of size $n_1$ and $n_2$, respectively, taken from two normal populations having the same variance, then

$$F = \frac{S_1^2}{S_2^2}$$

is a random variable having the F-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

Remark: The greater of the two variances $S_1^2$ and $S_2^2$ is to be taken in the numerator, and $n_1$ corresponds to the greater variance.
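The remark about placing the greater variance in the numerator can be sketched in Python. The function name `f_ratio` and both data sets are hypothetical and serve only as an illustration. `statistics.variance` uses the divisor n-1.

```python
import statistics


def f_ratio(sample_a, sample_b):
    """Return (F, df_numerator, df_denominator) with the larger sample
    variance in the numerator, as the remark requires."""
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    if var_a >= var_b:
        return var_a / var_b, len(sample_a) - 1, len(sample_b) - 1
    # Second sample has the larger variance, so it goes in the numerator.
    return var_b / var_a, len(sample_b) - 1, len(sample_a) - 1


# Hypothetical samples, for illustration only.
first = [10, 12, 11, 13, 14]       # sample variance 2.5
second = [8, 15, 11, 20, 6, 12]    # sample variance 25.2

f_value, df_num, df_den = f_ratio(first, second)
print(f_value, df_num, df_den)  # F is at least 1; df are (5, 4) here
```

Because the second sample has the larger variance, its degrees of freedom (6 - 1 = 5) become the numerator degrees of freedom.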

The F-Table
In the F-table, the columns represent the number of degrees of freedom for the numerator and the rows represent the degrees of freedom for the denominator.

Suppose we are testing a hypothesis at the 0.05 level of significance using the F-distribution, with 2 degrees of freedom for the numerator and 13 for the denominator. The value we find in the F-table is 3.81 (first look in column 2 and then in row 13).
Critical Value of F-distribution
Usually F-tables give the critical value of F for the right-tailed test, i.e. the critical region is determined by the right-tail area. Thus, the significant value $F_\alpha(n_1, n_2)$ at the level of significance $\alpha$ and $(n_1, n_2)$ df, where $n_1$ is the number of degrees of freedom in the numerator and $n_2$ the number of degrees of freedom in the denominator, is determined by

$$P[F > F_\alpha(n_1, n_2)] = \alpha$$

as shown in the figure (p. 877, Gupta and Kapoor).

Rejection rule: If the calculated F-ratio value is greater than the table value $F_\alpha(n_1, n_2)$ at the given level of significance, then we reject the null hypothesis; otherwise we accept it.


To test whether two independent samples of size $n_1$ and $n_2$ have been drawn from normal populations with the same variance or not:

The null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$

The alternative hypothesis $H_1: \sigma_1^2 > \sigma_2^2$

Level of significance: $\alpha$

Test Statistic: $F = \dfrac{S_1^2}{S_2^2}$, having the F-distribution with $n_1 - 1$ and $n_2 - 1$ degrees of freedom.

Criterion: $P[F > F_\alpha(n_1 - 1, n_2 - 1)] = \alpha$

If the calculated F-ratio value is greater than the table value $F_\alpha(n_1 - 1, n_2 - 1)$ at the given level of significance, then we reject the null hypothesis.
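The test procedure with $F = S_1^2 / S_2^2$ can be sketched in Python. The function name and the sample variances 20 and 4 are hypothetical. The sample sizes 3 and 14 are chosen so that the degrees of freedom are (2, 13), which lets the sketch reuse the table value 3.81 quoted in the F-table section of these notes.

```python
def f_test_greater(s1_sq, n1, s2_sq, n2, f_critical):
    """Test H0: sigma1^2 = sigma2^2 against H1: sigma1^2 > sigma2^2.

    f_critical is the table value F_alpha(n1 - 1, n2 - 1).
    Returns the F statistic, its degrees of freedom, and the decision.
    """
    f_stat = s1_sq / s2_sq
    df = (n1 - 1, n2 - 1)
    reject = f_stat > f_critical  # right-tailed rejection rule
    return f_stat, df, reject


# Hypothetical summary data; 3.81 is F_0.05(2, 13) from the F-table section.
f_stat, df, reject_h0 = f_test_greater(
    s1_sq=20.0, n1=3, s2_sq=4.0, n2=14, f_critical=3.81
)
print(f_stat, df, reject_h0)  # 5.0 (2, 13) True
```

For the opposite one-sided alternative, the roles of the two samples are swapped, so the same function can be called with the arguments exchanged and the table value for the swapped degrees of freedom.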

The null hypothesis $H_0: \sigma_1^2 = \sigma_2^2$

The alternative hypothesis $H_1: \sigma_1^2 < \sigma_2^2$

Level of significance: $\alpha$

Test Statistic: $F = \dfrac{S_2^2}{S_1^2}$, having the F-distribution with $n_2 - 1$ and $n_1 - 1$ degrees of freedom.

Criterion: $P[F > F_\alpha(n_2 - 1, n_1 - 1)] = \alpha$

If the calculated F-ratio value is greater than the table value $F_\alpha(n_2 - 1, n_1 - 1)$ at the given level of significance, then we reject the null hypothesis.

Assignment I
Subject: Statistics (216)
Level: CS II Year.
Year: 2002

1. A simple random sample of size 66 was drawn in the process of estimating the mean annual income of 950 families of a certain township. The mean and the SD of the sample were found to be Rs. 4,730 and Rs. 7.65 respectively. Find a 95% confidence interval for the population mean.

2. In a large consignment of oranges, a random sample of 500 oranges revealed that 65 oranges were bad. Show that, with 99.73% confidence, the percentage of bad oranges in the consignment lies between 8.5% and 17.5%.

3. The Department of Roads has ordered an investigation of the large number of accidents that have occurred in recent summers in the capital of the Himalayan Kingdom. Acting upon instruction, an investigator has randomly selected 9 summer months within the last few years and has compiled data on the number of accidents that occurred during each of these months. The mean number of accidents in these 9 months was 31 and the standard deviation in this sample was 9 accidents per month. The investigator was told to construct a 90 percent confidence interval for the true mean number of accidents per month, but he was in such an accident himself recently, so you will have to do this for him.

4. The mean breaking strength of the cables supplied by a manufacturer is 1800 with SD 100. By a new technique in the manufacturing process, it is claimed that the breaking strength of the cables has increased. In order to test this claim, a sample of 50 cables is tested. It is found that the mean breaking strength is 1850. Can we support the claim at the 0.01 level of significance?

5. A manufacturer claimed that at least 95% of the equipment which he supplied to a factory conformed to specifications. An examination of a sample of 200 pieces of equipment revealed that 18 were faulty. Test his claim at a significance level of 0.05.

6. A manufacturer of high-speed trains has developed a new engine, which is designed to have a speed equal to 300 km/hour. Eight engines are tested and their speeds measured. The resulting speeds are shown in the following table.

Speed in km/hour
300.5   292.5   293.5   296.5
299.5   300.5   293.5   290.5

Do the data present sufficient evidence to indicate that the average speed differs from 300 km/hour? Take the level of significance as 0.05.

7. In a certain factory there are two different processes of manufacturing the same item. The average weight in a sample of 250 items produced from one process is found to be 120 grams with an SD of 12 grams; the corresponding figures in a sample of 400 items from the other process are 124 and 14. Compute the SE of the difference between the two sample means. Is this difference significant at the 1% level?

8. A group of 5 patients treated with medicine A weigh 42, 39, 48, 60 and 41 kg; a second group of 7 patients from the same hospital treated with medicine B weigh 38, 42, 56, 64, 68, 69 and 62 kg. Do you agree with the claim that medicine B increases the weight significantly? Use the 5% level of significance.

9. A company has its head office at Kathmandu and a branch in Pokhara. The personnel director wanted to know if the workers at the two places would like the introduction of a new plan of work, and a survey was conducted for this purpose. Out of a sample of 500 workers at Kathmandu, 62% favored the new plan. At Pokhara, out of a sample of 400 workers, 41% were against the new plan. Is there any significant difference between the two groups in their attitude towards the new plan at the 5% level of significance?

10. A company manufactures gold weighing balances. It maintains strict quality control over its products and does not release a balance for sale unless the balance shows variability significantly below one microgram (at alpha = 0.01) when weighing quantities of about 500 grams. A new balance has just been delivered to the quality control division from the production line. This new balance is tested by using it to weigh the same 500-gram standard weight 30 different times; the standard deviation turns out to be 0.73 microgram. Should this balance be released for sale?

11. American Theaters knows that a certain hit movie ran an average of 84 days in each city, and the corresponding standard deviation was 10 days. The manager of the southeastern district was interested in comparing the movie's popularity in his region with that in all of American's other theaters. He randomly chose 75 theaters in his region and found that they ran the movie an average of 81.5 days.
State appropriate hypotheses for testing whether there was a significant difference in the length of the picture's run between theaters in the southeastern district and all of American's other theaters, and test these hypotheses at a 1 percent significance level.