
Hypothesis Testing: Z-Test, T-Test, F-Test


What is a Hypothesis?

• A hypothesis is a predictive statement, capable of being tested by scientific methods, that relates an independent variable to some dependent variable.
• A hypothesis states what we are looking for; it is a proposition which can be put to a test to determine its validity.

e.g. Students who receive good coaching will show a greater increase in creativity than students not receiving coaching.

Null and Alternative Hypothesis

Null hypothesis (H0)

The null hypothesis states that a population parameter (such as the mean, the standard deviation, and so on) is equal to a hypothesized value. The null hypothesis is often an initial claim that is based on previous analyses or specialized knowledge.

Alternative hypothesis (H1, also written Ha)

The alternative hypothesis states that a population parameter is smaller than, greater than, or different from the hypothesized value in the null hypothesis. The alternative hypothesis is what you might believe to be true or hope to prove true.

Null Hypothesis

• It is an assertion that we hold as true unless we have sufficient statistical evidence to conclude otherwise.
• The null hypothesis is denoted by \(H_0\).
• If a population mean is equal to a hypothesised mean, then the null hypothesis can be written as

\(H_0: \mu = \mu_0\)

Alternative Hypothesis

• The alternative hypothesis is the negation of the null hypothesis and is denoted by \(H_a\).

If the null hypothesis is given as \(H_0: \mu = \mu_0\), then the alternative hypothesis can be written as one of

\(H_a: \mu \neq \mu_0\)
\(H_a: \mu > \mu_0\)
\(H_a: \mu < \mu_0\)

Level of significance and confidence

• The significance level is the percentage risk of rejecting a null hypothesis when it is true, and it is denoted by \(\alpha\). It is generally taken as 1%, 5% or 10%.
• \((1 - \alpha)\) is the confidence level: the probability of not rejecting the null hypothesis when it is true.

Risk of rejecting a Null Hypothesis when it is true

Designation     Risk (α)        Confidence (1 − α)   Description
Supercritical   0.001 (0.1%)    0.999 (99.9%)        More than $100 million (large loss of life, e.g. nuclear disaster)
Critical        0.01 (1%)       0.99 (99%)           Less than $100 million (a few lives lost)
Important       0.05 (5%)       0.95 (95%)           Less than $100 thousand (no lives lost, injuries occur)
Moderate        0.10 (10%)      0.90 (90%)           Less than $500 (no injuries occur)

Type I and Type II Error

Situation        Decision: Accept Null       Decision: Reject Null
Null is true     Correct                     Type I error (α error)
Null is false    Type II error (β error)     Correct
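
To make the two error types concrete, here is a minimal sketch that computes α and β for a right-tailed z-test. Every number in it (hypothesized mean 50, assumed true mean 53, population standard deviation 10, sample size 25) is hypothetical and chosen only for illustration; none of them comes from the slides.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical setting (not from the slides): H0: mu = 50 vs Ha: mu > 50.
mu0, mu_true = 50.0, 53.0   # hypothesized mean and an assumed true mean
sigma, n = 10.0, 25         # assumed known population sd and sample size
alpha = 0.05                # chosen risk of a Type I error

se = sigma / sqrt(n)        # standard error of the sample mean
# Reject H0 when the sample mean exceeds this cutoff.
cutoff = mu0 + NormalDist().inv_cdf(1 - alpha) * se

# Type I error: H0 is true (mean = mu0) but the sample mean lands above the cutoff.
type_1 = 1 - NormalDist(mu0, se).cdf(cutoff)
# Type II error: H0 is false (mean = mu_true) but the sample mean lands below the cutoff.
type_2 = NormalDist(mu_true, se).cdf(cutoff)

print(f"cutoff = {cutoff:.2f}, alpha = {type_1:.3f}, beta = {type_2:.3f}")
```

With these assumed inputs the Type II risk is large because the assumed true mean lies close to the cutoff; a larger sample would shrink the standard error and reduce β.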

Two-tailed test at 5% significance level

Acceptance and rejection regions in case of a two-tailed test. Suitable when \(H_0: \mu = \mu_0\) and \(H_a: \mu \neq \mu_0\).

[Figure: sampling distribution centred on \(H_0: \mu = \mu_0\). A rejection region of area \(\alpha/2 = 0.025\) (2.5%) lies in each tail, giving a total significance level of \(\alpha = 0.05\); the total acceptance region between them has area equal to the confidence level \((1 - \alpha) = 95\%\).]
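
Left-tailed test at 5% significance level

Acceptance and rejection regions in case of a left-tailed test. Suitable when \(H_0: \mu = \mu_0\) and \(H_a: \mu < \mu_0\).

[Figure: sampling distribution centred on \(H_0: \mu = \mu_0\). The rejection region (significance level \(\alpha = 0.05\) or 5%) lies in the left tail; the remainder is the total acceptance region (confidence level \((1 - \alpha) = 95\%\)).]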
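
Right-tailed test at 5% significance level

Acceptance and rejection regions in case of a right-tailed test. Suitable when \(H_0: \mu = \mu_0\) and \(H_a: \mu > \mu_0\).

[Figure: sampling distribution centred on \(H_0: \mu = \mu_0\). The rejection region (significance level \(\alpha = 0.05\) or 5%) lies in the right tail; the remainder is the total acceptance region (confidence level \((1 - \alpha) = 95\%\)).]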
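
The boundaries of the acceptance region in the three cases above can be computed rather than read from a table. The sketch below assumes a standard normal test statistic (a z-test); the function name `critical_values` is made up for this illustration.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return (lower, upper) z bounds of the acceptance region; None means unbounded."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    if tail == "two":
        z = std_normal.inv_cdf(1 - alpha / 2)  # alpha is split between both tails
        return (-z, z)
    if tail == "left":
        return (std_normal.inv_cdf(alpha), None)       # rejection region on the left
    if tail == "right":
        return (None, std_normal.inv_cdf(1 - alpha))   # rejection region on the right
    raise ValueError("tail must be 'two', 'left' or 'right'")

two = critical_values(0.05, "two")
left = critical_values(0.05, "left")
right = critical_values(0.05, "right")
print(two, left, right)
```

At the 5% level this gives roughly ±1.96 for the two-tailed test and ∓1.645 for the left- and right-tailed tests.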
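
Procedure for Hypothesis Testing

1. State the null (\(H_0\)) and alternative (\(H_a\)) hypotheses.
2. State a significance level: 1%, 5%, 10%, etc.
3. Decide on a test statistic: z-test, t-test or F-test.
4. Calculate the value of the test statistic.
5. Look up the critical (table) value at the given significance level.
6. Compare the table value with the calculated value (in magnitude):
   • Table value > calculated value: accept \(H_0\).
   • Table value < calculated value: reject \(H_0\).

Equivalently, compute the P-value of the calculated statistic and reject \(H_0\) when the P-value is less than the significance level.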
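
The procedure can be sketched end to end for a two-tailed z-test on a mean. The inputs below (hypothesized mean 50, known population standard deviation 10, a sample of 100 with mean 52) are hypothetical and serve only to show the order of the steps.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 50, Ha: mu != 50 (hypothetical example values).
mu0, sigma, n, xbar = 50.0, 10.0, 100, 52.0
# Step 2: significance level.
alpha = 0.05
# Steps 3-4: z-test, because sigma is assumed known; compute the statistic.
z = (xbar - mu0) / (sigma / sqrt(n))
# Step 5: critical (table) value for a two-tailed test.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
# Step 6: compare the calculated value with the table value.
reject = abs(z) > z_crit

# Equivalent check using the P-value of the calculated statistic.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, critical = {z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject}")
```

Both comparisons lead to the same decision, since the P-value falls below α exactly when the statistic falls outside the acceptance region.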

Hypothesis Testing of Means: Z-TEST AND T-TEST

Z test

A z-test is used for testing the mean of a population versus a standard, or comparing the means of two populations, with large (n ≥ 30) samples, whether you know the population standard deviation or not.

It is also used for testing the proportion of some characteristic versus a standard proportion, or comparing the proportions of two populations.

Example: Comparing the average engineering salaries of men versus women.
Example: Comparing the fraction defectives from 2 production lines.

Z-Test for testing means

Test conditions:
• Population normal and infinite
• Sample size large or small
• Population variance is known
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X} - \mu_{H_0}}{\sigma_p / \sqrt{n}} \]

Z-Test for testing means

Test conditions:
• Population normal and finite
• Sample size large or small
• Population variance is known
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X} - \mu_{H_0}}{\dfrac{\sigma_p}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}} \]

Z-Test for testing means

Test conditions:
• Population is infinite and may not be normal
• Sample size is large
• Population variance is unknown
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}} \]

Z-Test for testing means

Test conditions:
• Population is finite and may not be normal
• Sample size is large
• Population variance is unknown
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X} - \mu_{H_0}}{\dfrac{\sigma_s}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}} \]
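
The four z-test cases for a mean differ only in which standard deviation is used (population or sample) and in whether the finite population multiplier applies. A minimal sketch, with a made-up function name `z_statistic` and hypothetical example numbers:

```python
from math import sqrt

def z_statistic(xbar, mu0, sd, n, N=None):
    """z for a sample mean; pass the population size N to apply the finite population multiplier."""
    se = sd / sqrt(n)                    # standard error for an infinite population
    if N is not None:
        se *= sqrt((N - n) / (N - 1))    # finite population multiplier
    return (xbar - mu0) / se

# Hypothetical numbers: sample mean 52, hypothesized mean 50, sd 10, n = 100.
z_infinite = z_statistic(52.0, 50.0, 10.0, 100)
z_finite = z_statistic(52.0, 50.0, 10.0, 100, N=1000)
print(z_infinite, z_finite)
```

The multiplier is smaller than 1 whenever n > 1, so the finite-population statistic is larger in magnitude than the infinite-population one for the same data.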

T Test

A t-test is used for testing the mean of one population against a standard, or comparing the means of two populations, if you do not know the populations' standard deviation and when you have a limited sample (n < 30).

If you know the populations' standard deviation, you may use a z-test.

A t-test is also used to compare two related samples.

Example: Measuring the average diameter of shafts from a certain machine when you have a small sample.

T-Test for testing means

Test conditions:
• Population is infinite and normal
• Sample size is small
• Population variance is unknown
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ t = \frac{\bar{X} - \mu_{H_0}}{\sigma_s / \sqrt{n}} \quad \text{with d.f.} = n - 1 \]
\[ \sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \]

T-Test for testing means

Test conditions:
• Population is finite and normal
• Sample size is small
• Population variance is unknown
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ t = \frac{\bar{X} - \mu_{H_0}}{\dfrac{\sigma_s}{\sqrt{n}} \times \sqrt{\dfrac{N - n}{N - 1}}} \quad \text{with d.f.} = n - 1 \]
\[ \sigma_s = \sqrt{\frac{\sum (X_i - \bar{X})^2}{n - 1}} \]
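
A small worked sketch of the one-sample t statistic follows. The five observations and the hypothesized mean of 10 are invented for illustration, and the critical value 2.776 is the standard two-tailed 5% t-table entry for 4 degrees of freedom, entered by hand because the Python standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(sample, mu0, N=None):
    """Return (t, degrees of freedom); pass N to apply the finite population multiplier."""
    n = len(sample)
    se = stdev(sample) / sqrt(n)         # stdev divides by (n - 1), matching sigma_s
    if N is not None:
        se *= sqrt((N - n) / (N - 1))
    return (mean(sample) - mu0) / se, n - 1

sample = [10, 12, 9, 11, 13]             # hypothetical measurements
t, df = one_sample_t(sample, mu0=10)
t_table = 2.776                          # two-tailed 5% critical value for 4 d.f. (from a t-table)
reject = abs(t) > t_table
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject}")
```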

F Test

An F-test is used to compare 2 populations' variances. The samples can be any size. It is the basis of ANOVA.

An F-test is used to test the equality of the variances of two populations.

Example: Comparing the variability of subject marks of students from two Institutes of RM.

Matched pair test

A matched pair test is used to compare the means before and after something is done to the samples.

A t-test is often used because the samples are often small. However, a z-test is used when the samples are large.

The variable is the difference between the before and after measurements.

Example: The average weight of subjects before and after following a diet.

Hypothesis testing for difference between means: Z-TEST, T-TEST

Z-Test for testing difference between means

Test conditions:
• Populations are normal
• Samples happen to be large
• Population variances are known
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{\sigma_{p1}^2}{n_1} + \dfrac{\sigma_{p2}^2}{n_2}}} \]

Z-Test for testing difference between means

Test conditions:
• Populations are normal
• Samples happen to be large
• Presumed to have been drawn from the same population
• Population variances are known
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\sigma_p^2 \left( \dfrac{1}{n_1} + \dfrac{1}{n_2} \right)}} \]
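
Both z statistics for a difference between means can be sketched as follows. The function names and all numeric inputs are hypothetical; the second function is simply the first with a single common variance.

```python
from math import sqrt

def z_two_means(xbar1, xbar2, var1, var2, n1, n2):
    """z for two samples from populations with known variances var1 and var2."""
    return (xbar1 - xbar2) / sqrt(var1 / n1 + var2 / n2)

def z_same_population(xbar1, xbar2, var, n1, n2):
    """z when both samples are presumed drawn from one population with variance var."""
    return (xbar1 - xbar2) / sqrt(var * (1 / n1 + 1 / n2))

# Hypothetical inputs: means 72 and 70, 100 observations per sample.
z_a = z_two_means(72.0, 70.0, 36.0, 64.0, 100, 100)
z_b = z_same_population(72.0, 70.0, 50.0, 100, 100)
print(z_a, z_b)
```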

T-Test for testing difference between means

Test conditions:
• Samples happen to be small
• Presumed to have been drawn from the same population
• Population variances are unknown but assumed to be equal
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{(n_1 - 1)\sigma_{s1}^2 + (n_2 - 1)\sigma_{s2}^2}{n_1 + n_2 - 2}} \times \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}} \quad \text{with d.f.} = (n_1 + n_2 - 2) \]
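
The pooled-variance t statistic can be sketched as below. The two samples are invented for illustration, and the critical value 2.306 is the standard two-tailed 5% t-table entry for 8 degrees of freedom, entered by hand.

```python
from math import sqrt
from statistics import mean, variance

def pooled_t(sample1, sample2):
    """Return (t, degrees of freedom) assuming equal but unknown population variances."""
    n1, n2 = len(sample1), len(sample2)
    df = n1 + n2 - 2
    # variance() divides by (n - 1), so it matches the squared sample standard deviations.
    pooled_var = ((n1 - 1) * variance(sample1) + (n2 - 1) * variance(sample2)) / df
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    return (mean(sample1) - mean(sample2)) / se, df

group_a = [10, 12, 9, 11, 13]   # hypothetical data
group_b = [8, 9, 7, 10, 6]      # hypothetical data
t, df = pooled_t(group_a, group_b)
t_table = 2.306                 # two-tailed 5% critical value for 8 d.f. (from a t-table)
reject = abs(t) > t_table
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject}")
```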

Hypothesis Testing for comparing two related samples: PAIRED T-TEST

Paired T-Test for comparing two related samples

Test conditions:
• Samples happen to be small
• Variances of the two populations need not be equal
• Populations are normal
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ t = \frac{\bar{D} - 0}{\sigma_{\text{diff}} / \sqrt{n}} \quad \text{with } (n - 1) \text{ d.f.} \]

\(\bar{D}\) = mean of differences
\(\sigma_{\text{diff}}\) = standard deviation of differences
\(n\) = number of matched pairs
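
A sketch of the paired t statistic, using made-up before and after measurements for five matched pairs. The critical value 2.776 is the standard two-tailed 5% t-table entry for 4 degrees of freedom, entered by hand.

```python
from math import sqrt
from statistics import mean, stdev

def paired_t(before, after):
    """Return (t, degrees of freedom) for matched pairs, testing a mean difference of 0."""
    diffs = [b - a for b, a in zip(before, after)]
    n = len(diffs)
    # stdev() uses the (n - 1) divisor, giving the standard deviation of differences.
    return (mean(diffs) - 0) / (stdev(diffs) / sqrt(n)), n - 1

before = [80, 75, 90, 85, 70]   # hypothetical weights before a diet
after = [78, 74, 86, 84, 68]    # hypothetical weights after the diet
t, df = paired_t(before, after)
t_table = 2.776                 # two-tailed 5% critical value for 4 d.f. (from a t-table)
reject = abs(t) > t_table
print(f"t = {t:.3f}, d.f. = {df}, reject H0: {reject}")
```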

Hypothesis Testing of proportions: Z-TEST
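
Z-test for testing of proportions

Test conditions:
• Use in case of qualitative data
• Sampling distribution may take the form of a binomial probability distribution, with mean \(= n \cdot p\) and standard deviation \(= \sqrt{n \cdot p \cdot q}\)
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\hat{p} - p}{\sqrt{\dfrac{p \cdot q}{n}}} \]

\(\hat{p}\) = proportion of success in the sample
\(p\) = proportion of success under \(H_0\), and \(q = 1 - p\)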
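
A minimal sketch of the z statistic for a single proportion. The counts (60 successes in 100 trials) and the hypothesized proportion 0.5 are hypothetical, and `proportion_z` is an illustrative name.

```python
from math import sqrt

def proportion_z(successes, n, p0):
    """z for an observed proportion against the hypothesized proportion p0."""
    p_hat = successes / n     # proportion of success in the sample
    q0 = 1 - p0               # proportion of failure under H0
    return (p_hat - p0) / sqrt(p0 * q0 / n)

z = proportion_z(successes=60, n=100, p0=0.5)
print(f"z = {z:.3f}")
```

The standard error here uses the hypothesized p rather than the sample proportion, because the sampling distribution is evaluated under the null hypothesis.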

Hypothesis Testing for difference between proportions: Z-TEST

Z-test for testing difference between proportions

Test conditions:
• Samples drawn from two different populations
• The test confirms whether the difference between the proportions of success is significant
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{\hat{p}_1 \hat{q}_1}{n_1} + \dfrac{\hat{p}_2 \hat{q}_2}{n_2}}} \]

\(\hat{p}_1\) = proportion of success in sample one
\(\hat{p}_2\) = proportion of success in sample two
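
The same idea extends to two samples. In this sketch the counts (60 of 100 versus 50 of 100) are hypothetical, each q is taken as one minus the corresponding sample proportion, and the function name is illustrative.

```python
from math import sqrt

def two_proportion_z(successes1, n1, successes2, n2):
    """z for the difference between two sample proportions."""
    p1, p2 = successes1 / n1, successes2 / n2
    q1, q2 = 1 - p1, 1 - p2
    return (p1 - p2) / sqrt(p1 * q1 / n1 + p2 * q2 / n2)

z = two_proportion_z(60, 100, 50, 100)
significant = abs(z) > 1.96   # two-tailed test at the 5% level
print(f"z = {z:.3f}, significant: {significant}")
```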

Hypothesis testing of equality of variances of two normal populations: F-TEST

F-Test for testing equality of variances of two normal populations

Test conditions:
• The populations are normal
• Samples have been drawn randomly
• Observations are independent; and
• There is no measurement error
• \(H_a\) may be one-sided or two-sided

Test statistic:
\[ F = \frac{\sigma_{s1}^2}{\sigma_{s2}^2} \quad \text{with } (n_1 - 1) \text{ and } (n_2 - 1) \text{ d.f.} \]

\(\sigma_{s1}^2\) is the sample estimate for \(\sigma_{p1}^2\)
\(\sigma_{s2}^2\) is the sample estimate for \(\sigma_{p2}^2\)
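
A short sketch of the F ratio follows. The two samples are invented, and the sample with the larger variance is placed first here as a convenience for a one-sided comparison. The critical value 6.39 is the standard upper 5% F-table entry for (4, 4) degrees of freedom, entered by hand because the Python standard library has no F distribution.

```python
from statistics import variance

def f_ratio(sample1, sample2):
    """Return (F, (df1, df2)) as the ratio of the two sample variances."""
    # variance() divides by (n - 1), giving the sample estimate of the population variance.
    f = variance(sample1) / variance(sample2)
    return f, (len(sample1) - 1, len(sample2) - 1)

marks_a = [8, 12, 6, 14, 10]    # hypothetical marks, larger spread
marks_b = [10, 12, 9, 11, 13]   # hypothetical marks, smaller spread
F, (df1, df2) = f_ratio(marks_a, marks_b)
f_table = 6.39                  # upper 5% critical value for (4, 4) d.f. (from an F-table)
reject = F > f_table
print(f"F = {F:.2f}, d.f. = ({df1}, {df2}), reject H0: {reject}")
```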

Difference between T Test and F Test

T-test
• The T-test is a univariate hypothesis test that is applied when the standard deviation is not known and the sample size is small.
• Comparing the means of two populations.

F-test
• The F-test is a statistical test that determines the equality of the variances of two normal populations.
• Comparing two population variances.

One-Tailed and Two-Tailed Tests

A test of a statistical hypothesis, where the region of rejection is on only one side of the sampling
distribution, is called a one-tailed test. For example, suppose the null hypothesis
states that the mean is less than or equal to 10. The alternative hypothesis would be that the
mean is greater than 10. The region of rejection would consist of a range of numbers
located on the right side of sampling distribution; that is, a set of numbers greater than 10.

A test of a statistical hypothesis, where the region of rejection is on both sides of the sampling
distribution, is called a two-tailed test. For example, suppose the null hypothesis states
that the mean is equal to 10. The alternative hypothesis would be that the mean is less than 10 or
greater than 10. The region of rejection would consist of a range of numbers located
on both sides of sampling distribution; that is, the region of rejection would consist partly of
numbers that were less than 10 and partly of numbers that were greater than 10.
Decision Rules

P-value. The strength of evidence in support of a null hypothesis is measured by the P-value.
Suppose the test statistic is equal to S. The P-value is the probability of observing a
test statistic as extreme as S, assuming the null hypothesis is true. If the P-value is less than the
significance level, we reject the null hypothesis.

Region of acceptance. The region of acceptance is a range of values. If the test statistic falls
within the region of acceptance, the null hypothesis is not rejected. The region of
acceptance is defined so that the chance of making a Type I error is equal to the significance level.

The set of values outside the region of acceptance is called the region of rejection. If the test statistic falls within the region of rejection, the null hypothesis is rejected. In such cases, we say that the hypothesis has been rejected at the α level of significance.

These approaches are equivalent. Some statistics texts use the P-value approach; others use the region of acceptance approach.
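
The equivalence of the two decision rules can be checked numerically. This sketch assumes a two-tailed z-test at the 5% level; the test statistic values in the loop are arbitrary illustrations.

```python
from statistics import NormalDist

def decide_two_tailed(z, alpha=0.05):
    """Return (p_value, reject_by_p_value, reject_by_region) for a two-tailed z-test."""
    std_normal = NormalDist()
    # P-value: probability of a statistic at least as extreme as z when H0 is true.
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    # Region of acceptance: the interval (-z_crit, +z_crit).
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return p_value, p_value < alpha, abs(z) > z_crit

results = [decide_two_tailed(z) for z in (-3.0, -1.5, 0.0, 1.0, 2.5)]
for p_value, by_p, by_region in results:
    print(f"p = {p_value:.4f}, reject by P-value: {by_p}, reject by region: {by_region}")
```

For every statistic tried, the two columns agree, which is what the equivalence claim predicts.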
