
BITS Pilani

Pilani Campus

Course No: MATH F432

Applied Statistical Methods


Hypothesis Testing
(Lecture 7)
Sumanta Pasari
Assistant Professor, Department of Mathematics,
BITS Pilani, Pilani Campus
Revise: Confidence Interval on p

Interval Estimate = Point Estimate ± Margin of Error

Recall that if \(np \ge 5\) and \(n(1-p) \ge 5\), we get (using the CLT)

\[ \hat{p} = \frac{X}{n} \sim N\!\left(p, \frac{p(1-p)}{n}\right) \;\Rightarrow\; \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \sim N(0, 1). \]

Taking two points \(\pm z_{\alpha/2}\) symmetrically about the origin, we get

\[ P\!\left(-z_{\alpha/2} \le \frac{\hat{p} - p}{\sqrt{p(1-p)/n}} \le z_{\alpha/2}\right) = 1 - \alpha. \]

Here \((1 - \alpha)\) is known as the confidence level.
Confidence Interval on p

\[ P\!\left(\hat{p} - z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}} \le p \le \hat{p} + z_{\alpha/2}\sqrt{\frac{p(1-p)}{n}}\right) = 1 - \alpha \]

As \(p\) is unknown, the above confidence bounds are not statistics. So replace \(p\) by its unbiased estimator \(\hat{p}\); the CI on \(p\) with confidence level \((1 - \alpha)\) is then

\[ \left(\hat{p} - z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}},\; \hat{p} + z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}\right). \]

The endpoints of the confidence interval are called confidence limits.
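The interval on p can be sketched in a few lines of Python using only the standard library. This is an illustration rather than part of the lecture: the function name `proportion_ci` and the sample counts (100 successes in 400 trials) are assumptions chosen for the example.

```python
from math import sqrt
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95) -> tuple[float, float]:
    """Large-sample (CLT-based) confidence interval for a proportion p."""
    p_hat = successes / n
    alpha = 1.0 - confidence
    # z_{alpha/2}: the point with upper-tail area alpha/2 under N(0, 1)
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    margin = z * sqrt(p_hat * (1.0 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical data: 100 successes in a sample of 400
lower, upper = proportion_ci(100, 400, confidence=0.95)
print(f"95% CI for p: ({lower:.4f}, {upper:.4f})")
```

Both np and n(1-p) should be at least 5 for the normal approximation behind this sketch to be reasonable.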


Sample Size for Estimating p
We can be \(100(1-\alpha)\%\) sure that \(\hat{p}\) and \(p\) differ by at most \(E\), where \(E\) is given by

\[ E = z_{\alpha/2}\sqrt{\frac{\hat{p}(1-\hat{p})}{n}}. \]

Thus, the sample size for estimating \(p\), when a prior estimate \(\hat{p}\) is available, is

\[ n = \frac{z_{\alpha/2}^{2}\,\hat{p}(1-\hat{p})}{E^{2}}. \]
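As a rough sketch, the sample-size formula can be evaluated as follows. The function name `sample_size_for_proportion` and the inputs (prior estimate 0.25, margin of error 0.03) are hypothetical; the result is rounded up because n must be a whole number.

```python
from math import ceil
from statistics import NormalDist


def sample_size_for_proportion(p_prior: float, margin: float, confidence: float = 0.95) -> int:
    """Smallest n with z_{alpha/2} * sqrt(p(1-p)/n) <= margin, given a prior estimate of p."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    return ceil(z ** 2 * p_prior * (1.0 - p_prior) / margin ** 2)


# Hypothetical inputs: prior estimate 0.25, desired margin of error 0.03
n_needed = sample_size_for_proportion(0.25, 0.03)
print(n_needed)
```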

Testing of Hypothesis

Chapter 9
Objectives
• Understanding hypothesis testing
• Constructing null and alternative hypotheses
• Type I and Type II errors
• Power of a test
• Test for population mean and population proportion
Testing of Hypothesis
• We often have to make decisions based on samples; such a decision may be correct or incorrect.
• Testing of hypothesis is used to decide whether a statement about the value of a population parameter should be rejected or not.
• The statement is assessed using the information available from random samples.
• Either the statement is rejected, or it cannot be rejected (loosely, "accepted"), based on the information available from the samples.
• There are two types of statement: the null hypothesis and the alternative hypothesis.


Testing of Hypothesis
• The null hypothesis, denoted by H0, is a tentative, preconceived assumption about a population parameter. It always includes the ‘statement of equality’; that is, the equality part always appears in H0.
• The alternative hypothesis, denoted by Ha or H1, is the opposite of what is stated in the null hypothesis. It is what the test is attempting to establish.
• If the information available from the sample data contradicts the null hypothesis, we reject it (which amounts to accepting the alternative hypothesis); otherwise, we say “we fail to reject” the null hypothesis.
Examples: Testing of Hypothesis

A criminal trial:
In a trial, the jury must decide between two hypotheses. The null hypothesis is
H0: The defendant is innocent
The alternative hypothesis, or research hypothesis, is
H1: The defendant is guilty

The jury does not know which hypothesis is true. It must make a decision on the basis of the evidence presented.


Examples: Testing of Hypothesis

In the language of statistics, convicting the defendant is called rejecting the null hypothesis in favor of the alternative hypothesis. That is, the jury is saying that there is enough evidence to conclude that the defendant is guilty (i.e., there is enough evidence to support the alternative hypothesis).

If the jury acquits, it is stating that there is not enough evidence to support the alternative hypothesis. Notice that the jury is not saying that the defendant is innocent, only that there is not enough evidence to support the alternative hypothesis. By the same logic, we do not say that we accept the null hypothesis; rather, we say that “we fail to reject the null hypothesis” based on the information available from the sample.


Types of Errors

Reality     | Decision: H0 accepted             | Decision: H0 rejected
H0 true     | No error                          | Type I error (probability = α)
H0 false    | Type II error (probability = β)   | No error

• H0: the null hypothesis; H1: the alternative hypothesis
• Type-I error: rejecting the null hypothesis when it is actually true; Prob(type-I) = α
• Type-II error: failing to reject the null hypothesis when it is false; Prob(type-II) = β
• Power of a test (1-β): probability of rejecting the null hypothesis when it is false
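One way to get a feel for the type-I error probability is to simulate many samples with H0 true and count how often a test rejects. The sketch below does this for a right-tailed z-test. The function name, the parameter values, and the number of trials are all illustrative assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def type_i_error_rate(mu0, sigma, n, alpha, trials, seed=0):
    """Fraction of samples drawn under H0 (mu = mu0) that a right-tailed z-test rejects."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1.0 - alpha)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        z = (fmean(sample) - mu0) / (sigma / sqrt(n))
        if z > z_crit:
            rejections += 1  # H0 is true here, so every rejection is a type-I error
    return rejections / trials


# Hypothetical setting: mu0 = 500, sigma = 96, n = 50, alpha = 0.05
rate = type_i_error_rate(mu0=500, sigma=96, n=50, alpha=0.05, trials=4000)
print(f"estimated P(type I error) = {rate:.3f}")
```

The estimate should land close to the chosen α, with some simulation noise.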
Types of Errors

[Figure: the critical value separates the "Accept H0" region from the "Reject H0" region.]

For a fixed sample size, it is not possible to reduce both type-I and type-II errors together. However, one can try to make either type of error reasonably small!
Level of Significance
• The decision depends on the value of the test statistic computed from a sample, and hence has randomness in it.
• There is a chance that the null hypothesis is rejected when it is true, that is, that we commit a type I error.
• The probability of a type I error is P[H0 is rejected | H0 is true].
• This is also called the level of significance and is denoted by α.
Type-I Error
A type I error is the error made when the null hypothesis is rejected in spite of being true. The probability of committing a type I error is called the ‘level of significance’ of the test and is denoted by ‘α’.

The set of values of the test statistic that leads us to reject the null hypothesis is termed the ‘Critical Region’.


Type-II Error
We design the test so that the probability of committing a type I error is approximately the value we desire.
Sometimes, the observed value of the test statistic does not fall in the rejection region even though the null hypothesis is not true and should be rejected. This is a type-II error. The probability of its occurrence is denoted by beta (β).


Definition 8.3.3: Power of a Test

Consider a test of hypothesis. The probability that the null hypothesis will be rejected when, in fact, the research theory is true is called the power of the test (1-β).
Note: When the research theory is true, we either fail to reject the null hypothesis with probability β or reject it with probability equal to the power, so
β + power = 1
Note: Our objective is always to keep α and β as small as possible and the power of the test as high as possible. This is usually achieved by choosing an appropriate sample size.
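For a test on the mean with σ known, β and the power can be computed directly once a specific alternative value of the mean is fixed. The following is a minimal sketch for the right-tailed case. The function name and the numbers (μ0 = 500, alternative mean 535, σ = 96, n = 50) are assumptions for illustration.

```python
from math import sqrt
from statistics import NormalDist


def right_tailed_z_power(mu0, mu1, sigma, n, alpha):
    """Return (beta, power) of a right-tailed z-test of H0: mu <= mu0 when the true mean is mu1."""
    se = sigma / sqrt(n)
    # H0 is rejected when the sample mean exceeds this cutoff
    cutoff = mu0 + NormalDist().inv_cdf(1.0 - alpha) * se
    beta = NormalDist(mu1, se).cdf(cutoff)  # P(fail to reject H0 | mu = mu1)
    return beta, 1.0 - beta


# Hypothetical setting: H0: mu <= 500 tested at alpha = 0.05, true mean 535
beta, power = right_tailed_z_power(mu0=500, mu1=535, sigma=96, n=50, alpha=0.05)
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Calling the function with a larger n (all else fixed) shows β shrinking and the power growing, which is the role of the sample size mentioned in the note.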
Constructing null and alternative hypotheses
• One-tailed and two-tailed tests:

One-tailed (lower-tail, or left-tailed):    \(H_0: \mu \ge \mu_0\),   \(H_1: \mu < \mu_0\)
One-tailed (upper-tail, or right-tailed):   \(H_0: \mu \le \mu_0\),   \(H_1: \mu > \mu_0\)
Two-tailed:                                 \(H_0: \mu = \mu_0\),   \(H_1: \mu \ne \mu_0\)

• The probability of a Type I error is α = P(H0 is rejected | H0 is true). This is also called the level of significance.
• The probability of a Type II error is β = P(H0 is accepted | H0 is false).
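The choice of tail determines where the rejection region sits. As an illustrative sketch for a z-test, the helper below (a hypothetical name, not from the lecture) returns the rejection region for each of the three forms at a given α.

```python
from statistics import NormalDist


def z_rejection_region(alpha: float, tail: str) -> str:
    """Describe the rejection region of a z-test for the given tail type."""
    std_normal = NormalDist()
    if tail == "lower":   # H1: mu < mu0
        return f"z < {std_normal.inv_cdf(alpha):.3f}"
    if tail == "upper":   # H1: mu > mu0
        return f"z > {std_normal.inv_cdf(1.0 - alpha):.3f}"
    if tail == "two":     # H1: mu != mu0, alpha split equally between both tails
        z = std_normal.inv_cdf(1.0 - alpha / 2.0)
        return f"z < {-z:.3f} or z > {z:.3f}"
    raise ValueError("tail must be 'lower', 'upper', or 'two'")


for tail in ("lower", "upper", "two"):
    print(tail, "->", z_rejection_region(0.05, tail))
```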


One-tailed or two-tailed?
Ex. From the long experience of the Coca-Cola company, it is known that the yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At the 5% significance level, does the modified process increase the yield?
Sol. Here \(H_0: \mu = 500\) (this specifies a single value for the parameter \(\mu\)).
Actually, we shall assume \(H_0: \mu \le 500\)
\(H_1: \mu > 500\) (this is what we want to test)
⇒ one-tailed test; test for \(\mu\) and \(\sigma\) known.
Question to decide: is the calculated value greater than the critical value?


One-tailed or two-tailed?
Ex. A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170. A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. It is known that the accounts are approximately normally distributed with an s.d. of $65. At \(\alpha = 5\%\), can we conclude that the new system will be cost-effective?
Sol.
The system is cost-effective if the mean account balance for all customers (the population) is greater than $170, that is, if \(\mu > 170\).
Our null hypothesis is thus \(H_0: \mu \le 170\)
\(H_1: \mu > 170\) (this is what we want to test)
⇒ one-tailed test; test for \(\mu\) and \(\sigma\) known.


One-tailed or two-tailed?
Ex. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, -2, 4, -4, 1, -6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on the mean change in blood pressure? Test at the 5% significance level, assuming that the population is normal with variance 1.
Sol. Formulate the hypotheses: \(H_0: \mu = 0\), \(H_1: \mu \ne 0\)
⇒ Two-tailed test; test for \(\mu\) and \(\sigma\) known.
Question to decide: does the calculated value fall in the rejection region of H0 (that is, beyond the critical values), or in the acceptance region?
One-tailed or two-tailed?
Ex. The mean weekly sales of a magazine was 146 units. After an advertisement campaign, the mean weekly sales in 22 stores for a typical week increased to 154, with a standard deviation of 15 units. Was the advertisement successful at the 5% significance level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol. Formulate the hypotheses: \(H_0: \mu \le 146\), \(H_1: \mu > 146\)
⇒ One-tailed test; test for \(\mu\) and \(\sigma\) unknown.

Ex. A state highway patrol periodically samples vehicle speeds at various locations on a particular highway. The sample of vehicle speeds is used to test the hypothesis \(H_0: \mu \le 65\). A sample of 64 vehicles shows a mean speed of 66.2 kmph with an s.d. of 4.2 kmph. Use \(\alpha = 0.05\) to test \(H_0\). Assume normality of the population.
Sol. Formulate the hypotheses: \(H_0: \mu \le 65\), \(H_1: \mu > 65\) ⇒ One-tailed test; test for \(\mu\), \(\sigma\) unknown.
One-tailed or two-tailed?
Ex. A marketing research firm conducted a survey 10 years ago and found that the average household income of Pilani is Rs. 12000. Mr. Agrawal, who has recently joined the firm, wants to verify the accuracy of the data. For this, the firm decides to take a random sample of 200 households. The sample mean and sample s.d. are Rs. 13000 and Rs. 100. Verify Mr. Agrawal's doubt at \(\alpha = 0.05\), assuming normality of the population.
Sol. Formulate the hypotheses: \(H_0: \mu = 12000\), \(H_1: \mu \ne 12000\)
⇒ Two-tailed test; test for \(\mu\) and \(\sigma\) unknown (sample size \(n = 200\)).

Ex. A CFL manufacturing company supplies its products to various retailers. The company has received complaints from retailers that the average life of its CFL is not 24 months, as the company claims. To verify this, the company collected a random sample of 150 CFLs and found that the average life is 23 months. Assuming \(\sigma = 5\) months, test the average population life of CFLs at \(\alpha = 0.08\).
Sol. Formulate the hypotheses: \(H_0: \mu = 24\), \(H_1: \mu \ne 24\) ⇒ Two-tailed test; test for \(\mu\), \(\sigma\) known.
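The slides stop at formulating the hypotheses for the CFL example. As a hedged sketch of how the two-tailed test could then be carried out, the Python below uses the numbers stated in the example. The variable names are illustrative, and the critical value comes from the standard library rather than from a printed Z-table.

```python
from math import sqrt
from statistics import NormalDist

# CFL example: H0: mu = 24, H1: mu != 24, sigma known
mu0, x_bar, sigma, n, alpha = 24, 23, 5, 150, 0.08

z_calculated = (x_bar - mu0) / (sigma / sqrt(n))
z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # z_{alpha/2} = z_{0.04}
reject_h0 = abs(z_calculated) > z_critical

print(f"z = {z_calculated:.3f}, critical values = +/-{z_critical:.3f}, reject H0: {reject_h0}")
```

Under these inputs the calculated value falls beyond the lower critical value, so this sketch rejects H0 at α = 0.08.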
One-tailed or two-tailed?
Ex. At a golf course, over the past years, 20% of the players were women. In an effort to increase the proportion of women players, a special promotion was implemented. Now the manager would like to see whether the promotion helped to increase the proportion of women players. A random sample of 400 players was selected, and 100 of the players were women. Test the hypothesis at the 5% significance level.
Sol. Formulate the hypotheses: \(H_0: p \le 0.20\), \(H_1: p > 0.20\)
⇒ One-tailed test; test for \(p\) (sample size \(n = 400\)).
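The golf-course example is likewise left at the hypothesis-formulation stage. A minimal sketch of the right-tailed test on a proportion, using the figures given in the example, might look like this (the variable names are assumptions):

```python
from math import sqrt
from statistics import NormalDist

# Golf-course example: H0: p <= 0.20, H1: p > 0.20
p0, n, women, alpha = 0.20, 400, 100, 0.05

p_hat = women / n
# The standard error uses p0, the value of p assumed under H0
z_calculated = (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)
z_critical = NormalDist().inv_cdf(1.0 - alpha)  # right-tailed critical value
reject_h0 = z_calculated > z_critical

print(f"p_hat = {p_hat:.2f}, z = {z_calculated:.2f}, reject H0: {reject_h0}")
```

Here the calculated value of about 2.5 exceeds the critical value, so this sketch rejects H0 at the 5% level.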


Steps of Hypothesis Testing

Step 1. Develop the null and alternative hypotheses; determine the appropriate statistical test.

Step 2. Specify the level of significance α.

Step 3. Collect the sample data and compute the test statistic.

Step 4. Based on α, identify the critical values.

Step 5. Reject H0 if the calculated test statistic value falls in the rejection region.
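These steps can be sketched as one small function for the case of a test on the mean with σ known. The function name `z_test_mean`, its `tail` argument, and the example numbers are hypothetical choices made for this illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_mean(x_bar, mu0, sigma, n, alpha, tail):
    """Test on the mean with sigma known; returns (test statistic, reject_h0).

    tail ('lower', 'upper', or 'two') encodes the hypotheses chosen in Step 1,
    and alpha is the level of significance fixed in Step 2.
    """
    z = (x_bar - mu0) / (sigma / sqrt(n))  # Step 3: compute the test statistic
    std_normal = NormalDist()
    # Steps 4 and 5: find the critical value(s) and check the rejection region
    if tail == "lower":
        reject = z < std_normal.inv_cdf(alpha)
    elif tail == "upper":
        reject = z > std_normal.inv_cdf(1.0 - alpha)
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1.0 - alpha / 2.0)
    else:
        raise ValueError("tail must be 'lower', 'upper', or 'two'")
    return z, reject


# Hypothetical numbers, for illustration only
z, reject = z_test_mean(x_bar=52.0, mu0=50.0, sigma=6.0, n=36, alpha=0.05, tail="two")
print(f"z = {z:.2f}, reject H0: {reject}")
```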


Test Statistics

1. Test statistic for population mean:

(a) when the population variance is known:
\[ Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} \]

(b) when the population variance is unknown (requires normality of the population):
\[ T_{n-1} = \frac{\bar{X} - \mu_0}{S/\sqrt{n}} \]

2. Test statistic for population proportion:
\[ Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} \]
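For reference, the three test statistics translate almost line for line into Python. The function names and the sample numbers below are illustrative assumptions.

```python
from math import sqrt


def z_stat_mean(x_bar, mu0, sigma, n):
    """Z statistic for the mean when the population s.d. sigma is known."""
    return (x_bar - mu0) / (sigma / sqrt(n))


def t_stat_mean(x_bar, mu0, s, n):
    """T statistic (n - 1 degrees of freedom) for the mean, using the sample s.d. s."""
    return (x_bar - mu0) / (s / sqrt(n))


def z_stat_proportion(p_hat, p0, n):
    """Z statistic for a proportion; the standard error is computed under H0 (p = p0)."""
    return (p_hat - p0) / sqrt(p0 * (1.0 - p0) / n)


# Hypothetical inputs chosen so that each statistic is easy to check by hand
print(z_stat_mean(103, 100, 15, 25))
print(t_stat_mean(103, 100, 12, 16))
print(z_stat_proportion(0.55, 0.50, 100))
```

The Z and T formulas have the same form; they differ in whether σ or the sample value S is used, and therefore in which table supplies the critical value.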


Lower-tailed test for population mean (σ known)

\(H_0: \mu \ge \mu_0\)
\(H_1: \mu < \mu_0\)

Upper-tailed test for population mean (σ known)

\(H_0: \mu \le \mu_0\)
\(H_1: \mu > \mu_0\)

Two-tailed test for population mean (σ known)

\(H_0: \mu = \mu_0\)
\(H_1: \mu \ne \mu_0\)
[Figure: the region between the two critical values is the "Do Not Reject H0" (acceptance) region.]
Examples: Hypothesis Testing

Ex. From the long experience of the Coca-Cola company, it is known that the yield is normally distributed with a mean of 500 units and a standard deviation of 96 units. For a modified process, the mean yield is 535 units for a sample of size 50. At the 5% significance level, does the modified process increase the yield?
Sol.
Step 1: Here \(H_0: \mu \le 500\), \(H_1: \mu > 500\) ⇒ one-tailed (right-tailed) test; test for \(\mu\), \(\sigma\) known.
Step 2: From the sample data, we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{535 - 500}{96/\sqrt{50}} \approx 2.58 \]
Step 3: At the 5% significance level, \(z_{0.05} = 1.645\) (from the one-tailed Z-table).
Step 4: As \(z_{\text{calculated}} > z_{0.05}\), reject the null hypothesis (i.e., there is enough evidence to accept the alternative hypothesis).
Examples: Hypothesis Testing

Ex. A department store manager determines that a new billing system will be cost-effective only if the mean monthly account is more than $170. A random sample of 400 monthly accounts is drawn, for which the sample mean is $178. It is known that the accounts are approximately normally distributed with an s.d. of $65. At \(\alpha = 5\%\), can we conclude that the new system will be cost-effective?
Sol.
Step 1: Here \(H_0: \mu \le 170\), \(H_1: \mu > 170\) ⇒ one-tailed (right-tailed) test; test for \(\mu\) and \(\sigma\) known.
Step 2: From the sample data, we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{178 - 170}{65/\sqrt{400}} = 2.46 \]
Step 3: At the 5% significance level, \(z_{0.05} = 1.645\) (from the one-tailed Z-table).
Step 4: As \(z_{\text{calculated}} > z_{0.05}\), reject the null hypothesis (i.e., accept \(H_1\)).
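The arithmetic in the two right-tailed examples (the Coca-Cola yield and the billing system) can be re-checked with a short script. This is only a verification sketch. The labels and variable names are assumptions, and the critical value is computed from the standard normal distribution instead of being read from a table.

```python
from math import sqrt
from statistics import NormalDist

z_critical = NormalDist().inv_cdf(0.95)  # upper 5% point of N(0, 1)

# (label, sample mean, mu0, sigma, n) as stated in the two examples
examples = [
    ("Coca-Cola yield", 535, 500, 96, 50),
    ("billing system", 178, 170, 65, 400),
]

results = {}
for label, x_bar, mu0, sigma, n in examples:
    z = (x_bar - mu0) / (sigma / sqrt(n))
    results[label] = (z, z > z_critical)  # (test statistic, reject H0?)
    print(f"{label}: z = {z:.3f}, reject H0: {z > z_critical}")
```

Both statistics exceed the critical value, in line with the decisions reached in the worked solutions.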
Examples: Hypothesis Testing

Ex. A drug is given to 10 patients, and the increments in their blood pressure were recorded as 3, 6, -2, 4, -4, 1, -6, 0, 0, 2. Is it reasonable to believe that the drug has no effect on the mean change in blood pressure? Test at the 5% significance level, assuming that the population is normal with variance 1.
Sol.
Step 1: Formulate the hypotheses: \(H_0: \mu = 0\), \(H_1: \mu \ne 0\)
⇒ Two-tailed test for \(\mu\); \(\sigma\) is known.
Step 2: From the sample data (\(\bar{x} = 0.4\)), we compute
\[ z_{\text{calculated}} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \frac{0.4 - 0}{1/\sqrt{10}} = 1.265 \]
Step 3: At the 5% significance level, \(z_{0.025} = 1.96\) and \(-z_{0.025} = -1.96\) (from the two-tailed Z-table, we find \(\pm z_{\alpha/2}\)).
Step 4: As \(z_{\text{calculated}}\) does not fall in the rejection region, we fail to reject \(H_0\).
∴ It is reasonable to believe that the drug has no effect on the mean change in blood pressure.
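The drug example can also be worked from the raw data. In the sketch below, the negative signs on three of the observations are an assumption inferred from the sample mean of 0.4 used in the solution. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, fmean

# Increments in blood pressure; the signs are assumed so that the mean equals 0.4
increments = [3, 6, -2, 4, -4, 1, -6, 0, 0, 2]
mu0, sigma, alpha = 0, 1, 0.05  # population variance 1, so sigma = 1

x_bar = fmean(increments)
z_calculated = (x_bar - mu0) / (sigma / sqrt(len(increments)))
z_critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # two-tailed: z_{alpha/2}
reject_h0 = abs(z_calculated) > z_critical

print(f"x_bar = {x_bar:.1f}, z = {z_calculated:.3f}, reject H0: {reject_h0}")
```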
Examples: Hypothesis Testing

Ex. The mean weekly sales of a magazine was 146 units. After an advertisement campaign, the mean weekly sales in 22 stores for a typical week increased to 154, with a standard deviation of 15 units. Was the advertisement successful at the 5% significance level? It is given that the weekly sales of the magazine follow a normal distribution.
Sol.
Step 1: Formulate the hypotheses: \(H_0: \mu \le 146\), \(H_1: \mu > 146\)
⇒ One-tailed test; test for \(\mu\) and \(\sigma\) unknown.
Step 2: From the sample data, we compute
\[ t_{\text{calculated}} = \frac{\bar{x} - \mu_0}{S/\sqrt{n}} = \frac{154 - 146}{15/\sqrt{22}} = 2.501 \]
Step 3: For \(\alpha = 0.05\) and 21 dof, \(t_{21,\,0.05} = 1.721\) (from the one-tailed T-table).
Step 4: As \(t_{\text{calculated}} > t_{21,\,0.05}\), reject the null hypothesis (i.e., accept \(H_1\)).
∴ We can conclude that the advertisement was successful.
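A quick check of the t-test arithmetic is sketched below. Python's standard library has no t-distribution quantile function, so the critical value 1.721 is taken from the example itself and not computed. The variable names are illustrative.

```python
from math import sqrt

# Magazine example: H0: mu <= 146, H1: mu > 146, sigma unknown, n = 22
mu0, x_bar, s, n = 146, 154, 15, 22

t_calculated = (x_bar - mu0) / (s / sqrt(n))
t_critical = 1.721  # t_{21, 0.05}, the table value quoted in the example
reject_h0 = t_calculated > t_critical

print(f"t = {t_calculated:.3f} with {n - 1} dof, reject H0: {reject_h0}")
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.95, 21)` should reproduce the table value.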
Examples: Hypothesis Testing

HW. To test the null hypothesis that the population mean is 4 (\(H_0: \mu = 4\)) against the alternative hypothesis \(\mu = 5\), a test is designed based on a random sample of size 49. It is decided that the null hypothesis will be rejected if the observed sample mean \(\bar{x} > 4.3\). If the population variance is 9, find (a) the distribution of \(\bar{X}\), assuming \(H_0\) true, (b) the distribution of \(\bar{X}\), assuming \(H_1\) true, (c) the probabilities of type-I and type-II errors, and (d) the power of the test.
