Introduction to Hypothesis Testing
Course 3, Econometrics

Introduction

• The purpose of hypothesis testing is to determine whether there is enough statistical evidence in favor of a certain belief about a parameter.
• Examples
  – Is there statistical evidence in a random sample of potential customers that supports the hypothesis that more than p% of the potential customers will purchase a new product?
  – Is a new drug effective in curing a certain disease? A sample of patients is randomly selected. Half of them are given the drug, while the other half are given a placebo. The improvement in the patients' conditions is then measured and compared.

Concept of hypothesis testing
• The critical concepts of hypothesis testing:
  – There are two hypotheses about a population parameter (or parameters):
    H0, the null hypothesis (for example, $\mu = 5$)
    H1, the alternative hypothesis ($\mu > 5$). This is what you want to prove.
  – Assume the null hypothesis is true.
  – Build a statistic related to the hypothesized parameter.
  – Pose the question: how probable is it to obtain a statistic value at least as extreme as the one observed from the sample?
[Figure: sampling distribution centered at $\mu = 5$, with the observed $\bar{x}$ marked.]

• Continued
  – Make one of the following two decisions (based on the test):
    Reject the null hypothesis in favor of the alternative hypothesis.
    Do not reject the null hypothesis in favor of the alternative hypothesis.
  – Two types of errors are possible when deciding whether to reject H0:
    Type I error: reject H0 when it is true.
    Type II error: do not reject H0 when it is false.

Testing the Population Mean When the Population Standard Deviation is Known
• Example 1
  – A new billing system for a department store will be cost-effective only if the mean monthly account is more than $170.
  – A sample of 400 monthly accounts has a mean of $178.
  – If the accounts are approximately normally distributed with a standard deviation of $\sigma = 65$ dollars, can we conclude that the new system will be cost-effective?

• Solution
  – The population of interest is the credit accounts at the store.
  – We want to show that the mean account for all customers is greater than $170:
    $H_1: \mu > 170$
  – The null hypothesis must specify a single value of the parameter $\mu$:
    $H_0: \mu = 170$

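Is a sample mean of 178 sufficiently greater than 170 to infer that the population mean is greater than 170?

If $\mu$ is really equal to 170, then $\mu_{\bar{x}} = 170$, and the sampling distribution of the sample mean is centered at 170.
[Figure: sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}} = 170$, with the observed value $\bar{x} = 178$ marked.]

Is it likely to have $\bar{x} = 178$ under the null hypothesis ($\mu = 170$)?
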
The rejection region method
The rejection region is a range of values such that if
the test statistic falls into that range, the null
hypothesis is rejected in favor of the alternative
hypothesis.

Define the value of $\bar{x}$ that is just large enough to reject the null hypothesis as $\bar{x}_L$. The rejection region is
$\bar{x} \ge \bar{x}_L$

The rejection region is: $\bar{x} \ge \bar{x}_L$
• If $\bar{x} < \bar{x}_L$, do not reject the null hypothesis.
• If $\bar{x} \ge \bar{x}_L$, reject the null hypothesis.

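The critical value $\bar{x}_L$ is linked to the significance level $\alpha$ through
$z_\alpha = \frac{\bar{x}_L - 170}{65 / \sqrt{400}}$
[Figure: sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}} = 170$; the area $\alpha$ to the right of $\bar{x}_L$ is where the null hypothesis is rejected.]
$\alpha = P(\text{commit a Type I error}) = P(\text{reject } H_0 \text{ given that } H_0 \text{ is true})$
$= P(\bar{x} \ge \bar{x}_L \text{ given that } H_0 \text{ is true})$
$= P(Z \ge z_\alpha)$
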
Solving for the critical value:
$\bar{x}_L = 170 + z_\alpha \frac{65}{\sqrt{400}}$
If we select $\alpha = 0.05$, then $z_{.05} = 1.645$, and
$\bar{x}_L = 170 + 1.645 \cdot \frac{65}{\sqrt{400}} = 175.34$

The rejection region is: $\bar{x} \ge \bar{x}_L$. With $\alpha = 0.05$, reject the null hypothesis if $\bar{x} \ge 175.34$.

Conclusion
Since the sample mean (178) is greater than the critical value of 175.34, there is sufficient evidence in the sample to reject H0 in favor of H1 at the 5% significance level.

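The rejection region calculation for Example 1 can be checked numerically. The following is a minimal Python sketch, assuming the standard-library `statistics.NormalDist` class for the normal quantile; the variable names are illustrative and not part of the original slides. Because it uses an unrounded value of $z_{.05}$, the critical value comes out near 175.35 rather than the 175.34 shown above.

```python
from math import sqrt
from statistics import NormalDist

# Values taken from Example 1
mu0 = 170.0     # mean under the null hypothesis
sigma = 65.0    # known population standard deviation
n = 400         # sample size
alpha = 0.05    # significance level
x_bar = 178.0   # observed sample mean

# z_alpha: standard normal value with upper-tail area alpha
z_alpha = NormalDist().inv_cdf(1 - alpha)

# Critical value: x_L = mu0 + z_alpha * sigma / sqrt(n)
x_L = mu0 + z_alpha * sigma / sqrt(n)

# Reject H0 when the sample mean falls in the rejection region
reject_h0 = x_bar >= x_L

print(f"z_alpha = {z_alpha:.3f}")
print(f"critical value x_L = {x_L:.2f}")
print(f"reject H0: {reject_h0}")
```
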
The standardized test statistic
• Instead of using the statistic $\bar{x}$, we can use the standardized value $z$:
  $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
• Then, the rejection region becomes (one-tail test)
  $z \ge z_\alpha$

• Example 1 (continued)
  – We redo this example using the standardized test statistic.
    $H_0: \mu = 170$
    $H_1: \mu > 170$
  – Test statistic:
    $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{178 - 170}{65 / \sqrt{400}} = 2.46$
  – Rejection region: $z > z_{.05} = 1.645$.
  – Conclusion: since 2.46 > 1.645, reject the null hypothesis in favor of the alternative hypothesis.

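The standardized version of the test can be sketched the same way. The helper name `z_statistic` below is hypothetical, introduced only for illustration, and `statistics.NormalDist` is assumed for the critical value.

```python
from math import sqrt
from statistics import NormalDist


def z_statistic(x_bar, mu0, sigma, n):
    """Standardize a sample mean: (x_bar - mu0) / (sigma / sqrt(n))."""
    return (x_bar - mu0) / (sigma / sqrt(n))


# Example 1: H0: mu = 170 versus H1: mu > 170
z = z_statistic(x_bar=178, mu0=170, sigma=65, n=400)

# One-tail critical value at the 5% significance level
z_alpha = NormalDist().inv_cdf(0.95)

reject_h0 = z > z_alpha
print(f"z = {z:.2f}, z_alpha = {z_alpha:.3f}, reject H0: {reject_h0}")
```
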
P-value method
• The p-value provides information about the amount of statistical evidence that supports the alternative hypothesis.
  – The p-value of a test is the probability of observing a test statistic at least as extreme as the one computed, given that the null hypothesis is true.
  – Let us demonstrate the concept on the previous example.

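The probability of observing a test statistic at least as extreme as 178, given that the null hypothesis is true, is the p-value:
$P(\bar{x} \ge 178) = P\left(z \ge \frac{178 - 170}{65 / \sqrt{400}}\right) = P(z \ge 2.4615) = .0069$
[Figure: sampling distribution of $\bar{x}$ centered at $\mu_{\bar{x}} = 170$; the p-value is the area to the right of $\bar{x} = 178$.]
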
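As a rough check of the p-value for Example 1, the upper-tail area can be computed from the standard normal cumulative distribution function. This sketch assumes `statistics.NormalDist` from the Python standard library; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 1 values
x_bar, mu0, sigma, n = 178.0, 170.0, 65.0, 400

# Standardized test statistic
z = (x_bar - mu0) / (sigma / sqrt(n))

# Upper-tail p-value: P(Z >= z) when H0 is true
p_value = 1 - NormalDist().cdf(z)

print(f"z = {z:.4f}, p-value = {p_value:.4f}")
```
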
• Interpreting the p-value
  – Because the probability that the sample mean will assume a value of more than 178 when $\mu = 170$ is so small (.0069), there are reasons to believe that $\mu > 170$.
  – We can conclude that the smaller the p-value, the more statistical evidence exists to support the alternative hypothesis.
  – Note how the event $\bar{x} \ge 178$ is rare under H0, when $\mu_{\bar{x}} = 170$, but it becomes more probable under H1, when $\mu_{\bar{x}} > 170$.
[Figure: two sampling distributions, one under $H_0: \mu_{\bar{x}} = 170$ and one under $H_1: \mu_{\bar{x}} > 170$, with $\bar{x} = 178$ marked.]

• Describing the p-value
  – If the p-value is less than 1%, there is overwhelming evidence that supports the alternative hypothesis.
  – If the p-value is between 1% and 5%, there is strong evidence that supports the alternative hypothesis.
  – If the p-value is between 5% and 10%, there is weak evidence that supports the alternative hypothesis.
  – If the p-value exceeds 10%, there is no evidence that supports the alternative hypothesis.

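The verbal scale above can be written as a small lookup. The function name `describe_p_value` is hypothetical, and the handling of the exact boundary values (1%, 5%, 10%) is an assumption, since the scale does not say which side they belong to.

```python
def describe_p_value(p):
    """Map a p-value to the verbal evidence scale used above.

    Assumption: a boundary value (exactly 0.01, 0.05 or 0.10)
    is assigned to the weaker category.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("a p-value must lie between 0 and 1")
    if p < 0.01:
        return "overwhelming evidence"
    if p < 0.05:
        return "strong evidence"
    if p < 0.10:
        return "weak evidence"
    return "no evidence"


# p-values reported for Examples 1, 2 and 3
for p in (0.0069, 0.1056, 0.0332):
    print(f"p = {p:.4f}: {describe_p_value(p)} for the alternative hypothesis")
```
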
• The p-value and rejection region methods
  – The p-value can be used when making decisions based on rejection region methods as follows:
    • Define the hypotheses to test and the required significance level $\alpha$.
    • Perform the sampling procedure, then calculate the test statistic and the p-value associated with it.
    • Compare the p-value to $\alpha$. Reject the null hypothesis only if $p < \alpha$; otherwise, do not reject the null hypothesis.
[Figure: sampling distribution centered at $\mu_{\bar{x}} = 170$, with the area $\alpha = 0.05$ to the right of $\bar{x}_L = 175.34$ and the p-value to the right of $\bar{x} = 178$.]

Conclusions of a test of hypothesis
• If we reject the null hypothesis, we conclude that there is enough evidence to infer that the alternative hypothesis is true.
• If we do not reject the null hypothesis, we conclude that there is not enough statistical evidence to infer that the alternative hypothesis is true.
• The alternative hypothesis is the more important one. It represents what we are investigating.

• Example 2
  – A government inspector samples 25 bottles of catsup labeled "Net weight: 16 ounces" and records their weights.
  – From previous experience it is known that the weights are normally distributed with a standard deviation of 0.4 ounces.
  – Can the inspector conclude that the product label is unacceptable?
Catsup weights: 15.8, 16.0, 16.2, 15.7, ...

• Solution
  – We need to draw a conclusion about the mean weight of all the catsup bottles.
  – We investigate whether the mean weight is less than 16 ounces (bottle label is unacceptable).
    $H_0: \mu = 16$
    $H_1: \mu < 16$
  – Select a significance level: $\alpha = 0.05$.
  – The test statistic is
    $z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$
  – Define the rejection region (one-tail test):
    $z < -z_\alpha = -1.645$

So, if in reality $\mu = 16$ but we reject this hypothesis in favor of $\mu < 16$ because $\bar{x}$ was very small, we want this mistake to happen no more than 5% of the time. A sample mean far below 16 should be a rare event if $\mu = 16$.
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{15.90 - 16}{0.4 / \sqrt{25}} = -1.25$
[Figure: standard normal curve with the rejection region ($\alpha = 0.05$) to the left of $-z_\alpha = -1.645$; the test statistic $-1.25$ lies outside it.]

Since the value of the test statistic does not fall in the rejection region, we do not reject the null hypothesis in favor of the alternative hypothesis. There is insufficient evidence to infer that the mean is less than 16 ounces.

The p-value = $P(Z < -1.25) = .1056 > .05$

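Example 2 is a lower-tail test, so both the rejection region and the p-value sit in the left tail. A minimal sketch, assuming `statistics.NormalDist` and using the sample mean of 15.90 that appears in the test statistic; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 2: H0: mu = 16 versus H1: mu < 16
x_bar, mu0, sigma, n = 15.90, 16.0, 0.4, 25
alpha = 0.05

std_normal = NormalDist()

# Standardized test statistic
z = (x_bar - mu0) / (sigma / sqrt(n))

# Lower-tail critical value, -z_alpha
critical = std_normal.inv_cdf(alpha)

# Lower-tail p-value: P(Z < z) when H0 is true
p_value = std_normal.cdf(z)

reject_h0 = z < critical
print(f"z = {z:.2f}, critical value = {critical:.3f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```
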
• Example 3
  – The amount of time required to complete a critical part of a production process on an assembly line is normally distributed. The mean was believed to be 130 seconds.
  – To test if this belief is correct, a sample of 100 randomly selected assemblies was drawn, and the processing time recorded. The sample mean was 126.8 seconds.
  – If the process time is really normal with a standard deviation of 15 seconds, can we conclude that the belief regarding the mean is incorrect?

• Solution
  – Is the mean different from 130?
    $H_0: \mu = 130$
    $H_1: \mu \ne 130$
  – Define the rejection region:
    $z < -z_{\alpha/2}$ or $z > z_{\alpha/2}$

So, if in reality $\mu = 130$ but we mistakenly reject this hypothesis in favor of $\mu \ne 130$ because $\bar{x}$ was very small or very large, we want this mistake to happen no more than 5% of the time. A sample mean far below 130 or far above 130 should be a rare event if $\mu = 130$.
$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}} = \frac{126.8 - 130}{15 / \sqrt{100}} = -2.13$
[Figure: standard normal curve with rejection regions of area $\alpha/2 = 0.025$ in each tail, to the left of $-z_{\alpha/2} = -1.96$ and to the right of $z_{\alpha/2} = 1.96$.]

Since the value of the test statistic falls in the rejection region, we reject the null hypothesis in favor of the alternative hypothesis. There is sufficient evidence to infer that the mean is not 130.

The p-value = $P(Z < -2.13) + P(Z > 2.13) = 2(.0166) = .0332 < .05$

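The two-tail test of Example 3 can be sketched in the same style, again assuming `statistics.NormalDist`. Since the code keeps the unrounded statistic (about -2.133 instead of -2.13), its p-value is close to .0329 rather than exactly the .0332 shown above; the conclusion is the same.

```python
from math import sqrt
from statistics import NormalDist

# Example 3: H0: mu = 130 versus H1: mu != 130
x_bar, mu0, sigma, n = 126.8, 130.0, 15.0, 100
alpha = 0.05

std_normal = NormalDist()

# Standardized test statistic
z = (x_bar - mu0) / (sigma / sqrt(n))

# Two-tail critical value, z_{alpha/2}
z_half = std_normal.inv_cdf(1 - alpha / 2)

# Two-tail p-value: P(Z < -|z|) + P(Z > |z|)
p_value = 2 * std_normal.cdf(-abs(z))

reject_h0 = abs(z) > z_half
print(f"z = {z:.2f}, z_alpha/2 = {z_half:.2f}")
print(f"p-value = {p_value:.4f}, reject H0: {reject_h0}")
```
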
Testing hypotheses and interval estimators
• Interval estimators can be used to test hypotheses.
• Calculate the $1 - \alpha$ confidence level interval estimator, then
  – if the hypothesized parameter value falls within the interval, do not reject the null hypothesis, while
  – if the hypothesized parameter value falls outside the interval, conclude that the null hypothesis can be rejected ($\mu$ is not equal to the hypothesized value).

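To illustrate, the interval approach can be applied to the data of Example 3 (sample mean 126.8, standard deviation 15, n = 100). This is a sketch under the assumption that the usual z-interval, sample mean plus or minus $z_{\alpha/2} \sigma / \sqrt{n}$, is the interval estimator meant here; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Example 3 data and hypothesized mean
x_bar, sigma, n = 126.8, 15.0, 100
mu0 = 130.0
alpha = 0.05

# Half-width of the (1 - alpha) confidence interval
z_half = NormalDist().inv_cdf(1 - alpha / 2)
margin = z_half * sigma / sqrt(n)

lower, upper = x_bar - margin, x_bar + margin

# Reject H0 when the hypothesized value lies outside the interval
reject_h0 = not (lower <= mu0 <= upper)

print(f"95% interval: [{lower:.2f}, {upper:.2f}]")
print(f"mu0 = {mu0} inside interval: {not reject_h0}")
```

The hypothesized value 130 falls outside the interval, which agrees with the two-tail test of Example 3.
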
• Drawbacks
  – Two-tail interval estimators may not provide the right answer to the question posed in one-tail hypothesis tests.
  – The interval estimator does not yield a p-value.

There are cases where only tests produce the information needed to make decisions.

Calculating the Probability of a Type II Error
• To properly interpret the results of a test of hypothesis, we need to
  – specify an appropriate significance level or judge the p-value of a test;
  – understand the relationship between Type I and Type II errors.
• How do we compute the probability of a Type II error?

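• Calculating the probability of a Type II error requires that
  – the rejection region be expressed directly in terms of the hypothesized parameter (not standardized);
  – the alternative value (under H1) be specified.

$H_0: \mu = \mu_0$
$H_1: \mu = \mu_1$ ($\mu_0$ is not equal to $\mu_1$)
[Figure: two sampling distributions, centered at $\mu = \mu_0$ and $\mu = \mu_1$, with the area $\alpha$ to the right of $\bar{x}_L$ under the first.]
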
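• Let us revisit Example 1.
  – The rejection region was $\bar{x} \ge 175.34$ with $\alpha = .05$.
  – A Type II error occurs when a false H0 is not rejected, that is, when $\bar{x} \le 175.34$ but H0 is false.

$\beta = P(\bar{x} \le 175.34 \text{ given that } H_0 \text{ is false})$
$= P(\bar{x} \le 175.34 \text{ given that } \mu = 180)$
$= P\left(z \le \frac{175.34 - 180}{65 / \sqrt{400}}\right) = .0764$
[Figure: sampling distributions centered at $\mu_0 = 170$ and $\mu_1 = 180$; $\alpha = .05$ is the area to the right of $\bar{x}_L = 175.34$ under the first, and $\beta$ is the area to the left of 175.34 under the second.]
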
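The Type II error probability for Example 1 can be reproduced with a few lines of Python. This sketch assumes `statistics.NormalDist` and takes the critical value 175.34 and the alternative mean 180 from the slides. The unrounded result is about .0758; the .0764 above comes from rounding the z value to -1.43 before the table lookup.

```python
from math import sqrt
from statistics import NormalDist

# Example 1 revisited
x_L = 175.34   # critical value of the rejection region (alpha = .05)
mu1 = 180.0    # alternative value of the mean under H1
sigma = 65.0   # known population standard deviation
n = 400        # sample size

# Standardize the critical value under the alternative mean
z = (x_L - mu1) / (sigma / sqrt(n))

# beta = P(x_bar <= x_L given that mu = mu1)
beta = NormalDist().cdf(z)

print(f"z = {z:.2f}, beta = {beta:.4f}")
```
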
• Effects on $\beta$ of changing $\alpha$
  – Decreasing the significance level $\alpha$ increases the value of $\beta$, and vice versa.
  – If $\alpha_1 > \alpha_2$, then $\beta_1 < \beta_2$.

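The trade-off between $\alpha$ and $\beta$ can be seen numerically with the setting of Example 1 ($\mu_0 = 170$, $\mu_1 = 180$, $\sigma = 65$, n = 400). The function name `type_ii_error` is hypothetical, and the second significance level (.01) is chosen here only for comparison; it does not appear in the slides.

```python
from math import sqrt
from statistics import NormalDist


def type_ii_error(alpha, mu0=170.0, mu1=180.0, sigma=65.0, n=400):
    """beta for the upper-tail z-test of H0: mu = mu0 when the true mean is mu1."""
    std_error = sigma / sqrt(n)
    # Critical value implied by the significance level
    x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    # Probability of not rejecting H0 when mu = mu1
    return NormalDist().cdf((x_L - mu1) / std_error)


beta_05 = type_ii_error(alpha=0.05)
beta_01 = type_ii_error(alpha=0.01)  # illustrative smaller alpha

print(f"alpha = 0.05 -> beta = {beta_05:.4f}")
print(f"alpha = 0.01 -> beta = {beta_01:.4f}")
```
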
• Judging the test
  – A hypothesis test is effectively defined by the significance level $\alpha$ and by the sample size n.
  – If the probability of a Type II error, $\beta$, is judged to be too large, we can reduce it by
    • increasing $\alpha$, and/or
    • increasing the sample size.

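By increasing the sample size, the standard deviation of the sampling distribution of the mean decreases. Thus, $\bar{x}_L$ decreases:
$z_\alpha = \frac{\bar{x}_L - \mu}{\sigma / \sqrt{n}}$, thus $\bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}}$
As a result, $\beta$ decreases ($\beta_1 > \beta_2$).
[Figure: as n grows, the sampling distributions narrow and $\bar{x}_L$ moves toward $\mu$.]

• In Example 1, suppose n increases from 400 to 1000.
$\bar{x}_L = \mu + z_\alpha \frac{\sigma}{\sqrt{n}} = 170 + 1.645 \cdot \frac{65}{\sqrt{1000}} = 173.38$
$\beta = P\left(Z \le \frac{173.38 - 180}{65 / \sqrt{1000}}\right) = P(Z \le -3.22) \approx 0$
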
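The effect of the sample size can be checked by recomputing the critical value and $\beta$ for n = 400 and n = 1000. This is a minimal sketch assuming `statistics.NormalDist`; the helper name `critical_value_and_beta` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def critical_value_and_beta(n, mu0=170.0, mu1=180.0, sigma=65.0, alpha=0.05):
    """Return (x_L, beta) for the upper-tail z-test with sample size n."""
    std_error = sigma / sqrt(n)
    x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    beta = NormalDist().cdf((x_L - mu1) / std_error)
    return x_L, beta


x_L_400, beta_400 = critical_value_and_beta(400)
x_L_1000, beta_1000 = critical_value_and_beta(1000)

print(f"n = 400:  x_L = {x_L_400:.2f}, beta = {beta_400:.4f}")
print(f"n = 1000: x_L = {x_L_1000:.2f}, beta = {beta_1000:.4f}")
```
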
• In summary,
  – By increasing the sample size, we reduce the probability of a Type II error.
  – Hence, we will fail to reject the null hypothesis when it is false less frequently.
• Power of a test
  – The power of a test is defined as $1 - \beta$.
  – It represents the probability of rejecting the null hypothesis when it is false.

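Since the power depends on the alternative value of the mean, it can help to evaluate $1 - \beta$ at several alternatives. The sketch below uses the setting of Example 1 and assumes `statistics.NormalDist`; the function name `power` and the alternative means other than 180 are illustrative choices, not values from the slides.

```python
from math import sqrt
from statistics import NormalDist


def power(mu1, mu0=170.0, sigma=65.0, n=400, alpha=0.05):
    """Power (1 - beta) of the upper-tail z-test when the true mean is mu1."""
    std_error = sigma / sqrt(n)
    x_L = mu0 + NormalDist().inv_cdf(1 - alpha) * std_error
    beta = NormalDist().cdf((x_L - mu1) / std_error)
    return 1 - beta


# Illustrative alternative means; only 180 is used in the slides
for mu1 in (170, 175, 180, 185):
    print(f"mu1 = {mu1}: power = {power(mu1):.4f}")
```

When the true mean equals the null value (170), the rejection probability is just $\alpha$; it rises toward 1 as the true mean moves further above 170.
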
