
A2-Level Maths: Statistics 2 for Edexcel

S2.4 Hypothesis tests

© Boardworks Ltd 2006
Contents

Introduction to sampling
Introduction to hypothesis testing
Chocolate tasting practical
One-sided hypothesis tests
One-sided versus two-sided tests
Critical regions
Hypothesis tests and critical regions

National census

The British government carries out a census of the entire
population of the United Kingdom every 10 years (most
recently in April 2001).
The first census in the United Kingdom was carried out in
1086 with the compilation of the Domesday Book.
However, censuses have only been conducted on a regular
basis since 1801.
The census provides the government with a detailed
picture of the population living in each part of the country
(town, city or countryside). The results are used to help
plan public services (health, housing, transport and
education) for the future.


Introduction to sampling

In statistics we often want to obtain information from a group
of individuals or about a group of objects.

The population is the set of all individuals or
objects that we wish to study.

A census is an investigation in which information
is obtained from every member of the population.

A sampling frame is a list of all members of the
population.


Introduction to sampling

Examples:
1. A head teacher is interested in finding out how long
her sixth form students spend in part-time
employment each week.
The population is the set of all sixth form students in
her school. A possible sampling frame would be the
registers of sixth form tutor groups.
2. A newspaper is interested in obtaining the views of
residents living close to the site of a proposed new
airport.
The population might be all adults living within a 10
mile radius of the site. A possible sampling frame
could be the local electoral roll.
Introduction to sampling

Examples:
3. A car company has discovered a fault that affects
one of their models of car. The company may wish
to know how widespread the problem might be.
The population would be all cars produced of this
particular model.
A possible sampling frame would be a list of all
registered cars of this model provided by the DVLA.



Introduction to sampling

Carrying out a census of the entire population is usually not
feasible or sensible.
A census is usually costly in terms of money, time and
resources.
In addition, some investigations could result in the
destruction of the entire population!
For example, if a light bulb manufacturer wished to
investigate the lifetime of its bulbs, a census would result in
the destruction of all the bulbs it produced.


Introduction to sampling

Instead of surveying the whole population, information can
be obtained from a sample.
The sampling process should be undertaken carefully to ensure
that the sample is representative of the entire population.
Bias can occur if one section of the population is over- or
under-represented.
Question: A local council wishes to know the views of local
people on public transport. Criticize each of the following
sampling regimes:
1. Ask the people waiting at the town centre bus stop.
2. Leave questionnaires in local libraries for people to fill in.
3. Ask people at the shopping centre on a Thursday morning.


Sampling methods

One way to obtain a fair sample is to use random sampling.
This method gives every member of the population an equal
chance of being chosen for the sample.
A more formal definition of a random sample is as follows:

A sample of size n is called a random sample if
every possible selection of size n has the same
probability of being chosen.

There are a number of ways in which a random sample can be
chosen. One commonly used technique is to use random
number tables.
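As a hedged illustration of the definition of a random sample, the sketch below draws one in Python instead of using printed tables. The frame of 300 numbered members, the function name simple_random_sample and the seed are assumptions made for the example. random.sample selects without replacement, so every possible selection of size n is equally likely.

```python
import random


def simple_random_sample(frame, n, seed=None):
    """Draw n distinct members from the sampling frame.

    random.Random.sample chooses without replacement, so every
    possible selection of size n has the same probability.
    """
    rng = random.Random(seed)  # the seed only makes the run repeatable
    return rng.sample(frame, n)


# Hypothetical sampling frame: members numbered 1 to 300.
frame = list(range(1, 301))
sample = simple_random_sample(frame, 15, seed=2006)
print(sorted(sample))
```

Changing or omitting the seed gives a different sample each time, which is what would be wanted in a real survey.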


Random number tables

The table below gives a list of random digits:


793 259 976 452 401 234 393 053 225 197 549 628 444 212 885 355 169 905
834 193 439 102 356 206 753 335 713 416 584 438 085 966 235 418 626 411
469 807 561 925 290 692 923 229 288 631 523 040 940 642 775 838 281 475

Here is how to use random digits to obtain a sample:


Example: A sample of size 15 is required from a population
of size 300.
One possible approach would be to obtain a sampling frame
for the population and number every member from 001 to
300. You could then obtain chains of 3 random digits from
tables. If the chain corresponds to a number between 001
and 300 you could select that member of the population;
otherwise you could discard that chain and choose another.



Random number tables

Example (continued): This method is wasteful of random
digits since most chains of 3 digits will be discarded.
A more efficient strategy would be to assign each member of
the population to several chains of random digits:

Population member    Random digits
1                    001  301  601
2                    002  302  602
3                    003  303  603
…                    …
300                  300  600  900

With this approach, the only chains of digits discarded are
those from 901 to 999, together with 000.


Random number tables

Example (continued): Suppose that we use the 2nd line of
random digits in the table above. The sample chosen
would then be:
834 → 234        713 → 113
193 → 193        416 → 116
439 → 139        584 → 284
102 → 102        438 → 138
356 → 56         085 → 85
206 → 206        966 (cannot be used)
753 → 153        235 → 235
335 → 35         418 → 118
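The mapping from chains of random digits to population members can be sketched in Python. This is only an illustration: the function names are made up for the example, and skipping a member that has already been chosen is an added assumption (it does not arise in this line of digits).

```python
def chain_to_member(chain, population_size=300, copies=3):
    """Map a 3-digit chain to a member number, or None if it is discarded.

    Chains 001-900 are used (three chains per member); 000 and
    901-999 are discarded.
    """
    value = int(chain)
    if value == 0 or value > population_size * copies:
        return None
    return (value - 1) % population_size + 1


def draw_sample(chains, size):
    """Work along the chains until `size` distinct members are chosen."""
    sample = []
    for chain in chains:
        member = chain_to_member(chain)
        # Assumption: a repeated member would be skipped.
        if member is not None and member not in sample:
            sample.append(member)
        if len(sample) == size:
            break
    return sample


# Second line of the random number table on the slide.
second_line = ("834 193 439 102 356 206 753 335 713 "
               "416 584 438 085 966 235 418 626 411").split()
sample = draw_sample(second_line, 15)
print(sample)
```

Only the chain 966 is discarded, so the first 16 chains are enough to give the 15 members listed on the slide.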


Introduction to hypothesis testing

Is a new cancer drug more effective than an
existing treatment?


Introduction to hypothesis testing

Has the installation of a new speed camera
led to a reduction in the traffic speed?


Introduction to hypothesis testing

A candidate in an election claims 60% support.
Is the candidate exaggerating their level of support?


Introduction to hypothesis testing

Hypothesis testing is concerned with trying to answer
questions like these.

Hypothesis tests are crucial in many subject areas
including medicine, psychology, biology and geography.

In S1, we only deal with situations where we are testing
a probability or a proportion.


A simple introductory example

Consider the following simple situation.
You suspect that a die is biased towards the number six.
In order to test this suspicion, you could perform an
experiment in which the die is thrown 20 times.
If the die were fair, you would expect about 3 sixes.
If you obtained a lot more than 3 sixes then you might
decide that there is evidence to support your suspicions.
But how do you decide on what
a suspicious number of sixes is?


A simple introductory example

Consider throwing a fair die 20 times. The probability of
obtaining different numbers of sixes is shown in the graph:

[Graph: probability of each number of sixes in 20 throws of a fair die]


A simple introductory example

We saw on the previous slide that, with 20 throws
of a fair die, the probability of getting 7 or more sixes is
about 0.0371.
This means that if a fair die were thrown 20 times over and
over again, then you would obtain 7 or more sixes less than
once in every 20 experiments.
The figure of 1 in 20 (or 5%) is often taken as a cut-off point
– results with probabilities below this level are sometimes
regarded as being unlikely to have occurred by chance.
However, in situations where more evidence is required,
cut-off values of 1% or 0.1% are typically used.
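The figure of 0.0371 can be checked with a short calculation. The sketch below is an illustration rather than part of the original slides. It assumes the number of sixes X follows B(20, 1/6) for a fair die, lists the first few probabilities (the values the graph would show) and adds up the upper tail.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n, p = 20, 1 / 6  # 20 throws of a fair die
distribution = [binomial_pmf(k, n, p) for k in range(n + 1)]

# Probability of 7 or more sixes: add up the upper tail.
p_seven_or_more = sum(distribution[7:])

for k, prob in enumerate(distribution[:10]):
    print(f"P(X = {k}) = {prob:.4f}")
print(f"P(X >= 7) = {p_seven_or_more:.4f}")
```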


A formal introduction to hypothesis tests

In hypothesis testing we are essentially presented with two rival
hypotheses.
Examples might include:
“The coin is fair” or “the coin is biased”;
“The proportion of local people in favour of a by-pass
is 80%” or “the proportion is smaller than 80%”;
“The drug has the same effectiveness as an existing
treatment” or “the drug is more effective”.
These rival hypotheses are referred to as the null
and the alternative hypotheses.


A formal introduction to hypothesis tests

The null hypothesis (H0) is often thought of as the
cautious hypothesis – it represents the usual state of affairs.
The alternative hypothesis (H1) is usually the one that we
suspect or hope to be true.
Hypothesis testing is concerned with examining the data
collected in experiments, and deciding how likely the result
is to have occurred if the null hypothesis is true.
The significance level of the test is the chosen cut-off
value between the results that might plausibly have been
obtained by chance if H0 is true, and the results that are
unlikely to have occurred.


A formal introduction to hypothesis tests

Significance levels that are typically used are 10%, 5%, 1%
and 0.1%.
These significance levels correspond to different rigours of
test – the lower the significance level, the stronger the
evidence the test will provide.

Note: It is important to appreciate that it is not possible
to prove that a hypothesis is definitely true in statistics.
Hypothesis tests can only provide different degrees of
evidence in support of a hypothesis. A 10%
significance level can only provide weak evidence in
support of a hypothesis. A 0.1% test is much more
stringent and can provide very strong evidence.


Chocolate tasting practical

Do you think you can taste the difference between branded
chocolate and supermarket own-label chocolate?
You are going to perform an experiment to find out.

There will be 2 pieces of chocolate to try: one will be a
branded make of chocolate, the other will be a supermarket’s
own-brand. Try to identify the branded make.


One-sided hypothesis tests

Example: Mr Jones, a candidate in a local election,
claims to have the support of 40% of the electorate.
A rival candidate, Miss Smith, believes that Mr Jones
is exaggerating his level of support.
She asks a random sample of 12 local people and
discovers that 3 of them support Mr Jones.
Carry out a test at the 5% significance level to see
whether there is evidence that Mr Jones is
exaggerating his level of support.


One-sided hypothesis tests

Solution: We begin by writing down the 2 rival hypotheses.
Let p represent the proportion of the electorate who support
Mr Jones.
H0: p = 0.4 (this hypothesis represents our cautious belief,
i.e. that Mr Jones is not lying about his support)
H1: p < 0.4 (this hypothesis represents what is suspected
to be true, i.e. that Mr Jones is exaggerating)
Notice that the hypotheses have been written mathematically,
in terms of a parameter, p.
Significance level = 5%
Let X be the number of people in the sample who support
Mr Jones.
If the null hypothesis is true, then X ~ B(12, 0.4).
The observed data was x = 3. This is less than we would
expect if H0 were true, but is this result so extreme that it
is implausible?


One-sided hypothesis tests

We calculate P(X ≤ 3), the probability of results at least
as extreme as those obtained:
P(X = 3) = 12C3 × 0.4^3 × 0.6^9 = 0.1419
P(X = 2) = 12C2 × 0.4^2 × 0.6^10 = 0.0639
P(X = 1) = 12C1 × 0.4^1 × 0.6^11 = 0.0174
P(X = 0) = 0.6^12 = 0.0022

So, P(X ≤ 3) = 0.225.

The significance level in this test was chosen to be 5% – the
probability calculated was much higher than this.
We conclude: the evidence is not strong enough to reject H0
at the 5% significance level. The data does not provide
evidence that Mr Jones was exaggerating his support.
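The four probabilities in the Mr Jones calculation can be reproduced term by term. The following is a minimal sketch, with variable names chosen only for illustration, that evaluates each binomial term under H0 (X ~ B(12, 0.4)) and compares their sum with the 5% significance level.

```python
from math import comb

n, p0, observed = 12, 0.4, 3  # sample size, claimed support, supporters found
alpha = 0.05                  # 5% significance level

# P(X = k) under H0 for each outcome at least as extreme as x = 3.
terms = {k: comb(n, k) * p0**k * (1 - p0) ** (n - k)
         for k in range(observed + 1)}
p_value = sum(terms.values())
reject_h0 = p_value < alpha

for k, prob in terms.items():
    print(f"P(X = {k}) = {prob:.4f}")
print(f"P(X <= {observed}) = {p_value:.3f}, reject H0: {reject_h0}")
```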


One-sided hypothesis tests

The steps required to answer a hypothesis test
question in an S1 examination are:
Step 1: Write out H0 and H1 in mathematical terms.

Step 2: State the significance level – if none is mentioned
in the question, it is usual to choose 5%.

Step 3: State the distribution, assuming the
null hypothesis to be true.

Step 4: Calculate the probability (under H0) of obtaining
results at least as extreme as those collected.

Step 5: Compare the probability with the significance level
and make conclusions – can H0 be rejected or not?
Interpret your results in context.
One-sided hypothesis tests

Examination style question: The standard treatment for a
particular medical condition has a success rate of 70%. A
new drug is launched which, it is claimed, treats a greater
proportion of patients successfully.
A doctor tries the new drug on 20 patients and finds that it
successfully treats 19 of them.
Test at the 1% significance level whether there is evidence
to suggest that the new drug treatment is more successful
than the standard treatment.


One-sided hypothesis tests

Solution: Let p represent the proportion of patients that
are treated successfully.
H0: p = 0.7 (the new treatment is no more successful
than the existing treatment)
H1: p > 0.7 (the new treatment is better than the
standard treatment)
Significance level = 1%
Let X be the number of people successfully treated by the
new drug.
If the null hypothesis is true, then X ~ B(20, 0.7).
The observed data is x = 19.
Using tables, P(X ≥ 19) = 0.0076 < 1%.
We reject the null hypothesis at the 1% level – there is quite
strong evidence that the new treatment is more successful.
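Both one-sided examples follow the same steps, differing only in which tail is summed. The function below is an illustrative sketch, not an examination method. Its name and the "less"/"greater" labels are assumptions made for the example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_binomial_test(x, n, p0, alpha, alternative):
    """Return (tail probability, reject H0?) for H0: p = p0.

    alternative is "less" for H1: p < p0 or "greater" for H1: p > p0.
    """
    if alternative == "less":
        outcomes = range(0, x + 1)   # results as low as x, or lower
    elif alternative == "greater":
        outcomes = range(x, n + 1)   # results as high as x, or higher
    else:
        raise ValueError("alternative must be 'less' or 'greater'")
    tail = sum(binomial_pmf(k, n, p0) for k in outcomes)
    return tail, tail < alpha


# New drug: 19 successes out of 20, H0: p = 0.7, H1: p > 0.7, 1% level.
drug_tail, drug_reject = one_sided_binomial_test(19, 20, 0.7, 0.01, "greater")
# Mr Jones: 3 supporters out of 12, H0: p = 0.4, H1: p < 0.4, 5% level.
jones_tail, jones_reject = one_sided_binomial_test(3, 12, 0.4, 0.05, "less")

print(f"Drug:  P(X >= 19) = {drug_tail:.4f}, reject H0: {drug_reject}")
print(f"Jones: P(X <= 3)  = {jones_tail:.4f}, reject H0: {jones_reject}")
```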


One-sided versus two-sided tests

The examples considered so far can all be classified as
one-sided tests – we have been testing for either an increase or
a decrease in the value of the parameter, p.
Sometimes we are not looking specifically for an increase (or
decrease) in p, but instead we may want to examine whether
the value of p has changed. In these situations we use a
two-sided (or a two-tailed) test.
A two-sided hypothesis test carried out at the α%
significance level is in a sense two separate one-sided
tests. The significance level is therefore shared between
these two tests, ½α% for each tail.


One-sided versus two-sided tests

Example: A restaurant has traditionally found that 60% of
its customers have been pleased or very pleased with the
quality of the food served.
A new chef is appointed and the restaurant management
wish to find out whether this has changed the proportion of
customers who are happy with their food.
The management question 16 diners and discover that 14
of them are pleased or very pleased with their food.
Test at the 5% significance level whether there has been a
change in the proportion of contented customers.


One-sided versus two-sided tests

Solution: Let p represent the proportion of customers pleased
or very pleased with the quality of the food served.
The hypotheses can be stated as follows:
H0: p = 0.6 (i.e. no change)
H1: p ≠ 0.6 (i.e. a change in the proportion).
5% significance level (2.5% for each tail).

Let X represent the number of customers that are pleased or
very pleased with their food. Then under the null hypothesis,
X ~ B(16, 0.6).


One-sided versus two-sided tests

If H0 were true, we would expect 16 × 0.6 = 9.6 customers to
be pleased with the food quality. The observed number, 14,
is on the high side.
We calculate P(X ≥ 14):
P(X = 14) = 16C14 × 0.6^14 × 0.4^2 = 0.0150
P(X = 15) = 16C15 × 0.6^15 × 0.4^1 = 0.0030
P(X = 16) = 0.6^16 = 0.0003

So P(X ≥ 14) = 0.0183 < 2.5%.

Conclusion: We can reject the null hypothesis at the 5%
significance level. There is some evidence that the proportion
pleased or very pleased with their food has changed.


One-sided versus two-sided tests

Examination style question: A driving instructor knows from
past experience that 2 out of 3 of his students pass their
driving test first time.
A new driving examiner is employed at the test centre.
The instructor wants to know whether this has changed the
proportion of his students passing their test at the first attempt.
He monitors the next 12 of his students taking their tests and
finds that 6 pass their test first time round.
a) Write down a suitable null and alternative hypothesis for
this test. Explain why your alternative hypothesis has the
form it has.
b) Carry out the test at a 10% significance level.


One-sided versus two-sided tests

Solution:
a) Let p represent the proportion of candidates now
passing on the first attempt.
H0: p = 2/3
H1: p ≠ 2/3
The alternative hypothesis is two-sided since the instructor
is looking for a change in the proportion of his students
passing first time.
b) 10% significance level (5% for each tail).
Let X = number of students passing on first try.
Then under H0, X ~ B(12, 2/3)



One-sided versus two-sided tests

We would expect 8 candidates to pass on the first attempt
if the null hypothesis were true. The observed number, 6,
is on the low side.
We need to calculate P(X ≤ 6).
Using tables, this probability is 0.1777 > 5%.
Conclusion: We are unable to reject the null hypothesis.
The data does not provide enough evidence to suggest
that the proportion of candidates passing their driving test
at the first attempt has altered.
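The two-sided procedure used in the restaurant and driving-test examples can be sketched as follows. This is an illustration under the slides' convention of comparing the tail on the observed side with half the significance level. The function name and the rule for choosing the side (compare x with the expected value n × p0) are assumptions made for the example.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_binomial_test(x, n, p0, alpha):
    """Return (side, tail probability, reject H0?) for H1: p != p0.

    The tail on the same side as the observation is compared
    with alpha / 2, i.e. half the significance level per tail.
    """
    if x >= n * p0:   # observation is on the high side
        side = "upper"
        tail = sum(binomial_pmf(k, n, p0) for k in range(x, n + 1))
    else:             # observation is on the low side
        side = "lower"
        tail = sum(binomial_pmf(k, n, p0) for k in range(0, x + 1))
    return side, tail, tail < alpha / 2


# Restaurant: 14 of 16 diners pleased, H0: p = 0.6, 5% level.
restaurant = two_sided_binomial_test(14, 16, 0.6, 0.05)
# Driving instructor: 6 of 12 pass first time, H0: p = 2/3, 10% level.
driving = two_sided_binomial_test(6, 12, 2 / 3, 0.10)

print(restaurant)
print(driving)
```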


Critical regions

The critical (or rejection) region for a hypothesis test is the
range of values for which the null hypothesis could be rejected.

Example 1: Police records show that 25% of the vehicles
using a stretch of road exceed the speed limit. A new speed
camera is installed. The police wish to find out whether this
has led to a reduction in the proportion of drivers speeding.
The police sample 20 cars driving along the stretch of road.
a) Find the critical region for a test carried out at the 5%
significance level.
b) Comment on the implications of the test if the police
find 2 speeding drivers.


Critical regions

a) H0: p = 0.25 where p = proportion of drivers who speed.
H1: p < 0.25
Significance level = 5%
Let X = number of cars that exceed the speed limit.
Under H0, X ~ B(20, 0.25).
From tables, P(X ≤ 2) = 0.0913 > 5% (so the critical region
does not contain 2)
P(X ≤ 1) = 0.0243 < 5% (so x = 1 is contained
in the critical region).
Thus, the critical region for the test is x ≤ 1.
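The search through the tables for the speed camera example can be mimicked in code. The sketch below, with an illustrative function name, accumulates P(X ≤ k) under H0 and keeps the largest k for which the cumulative probability is still below the significance level.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_critical_value(n, p0, alpha):
    """Largest c with P(X <= c) < alpha under H0, or None if there is none."""
    cumulative = 0.0
    critical = None
    for k in range(n + 1):
        cumulative += binomial_pmf(k, n, p0)
        if cumulative >= alpha:
            break
        critical = k
    return critical


# Speed camera example: X ~ B(20, 0.25) under H0, 5% significance level.
c = lower_critical_value(20, 0.25, 0.05)
observed = 2
print(f"Critical region: x <= {c}")
print(f"x = {observed} in critical region: {observed <= c}")
```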


Critical regions

b) The actual number of speeding motorists is 2.
This number is not contained within the critical region.
Therefore we cannot reject the null hypothesis at the
5% level. The evidence does not support the theory
that the proportion of motorists that speed has
fallen.


Critical regions

Examination style question: A gardener knows from past
experience that 80% of the runner bean seeds that he
plants will germinate. He is forced to switch to a different
brand of seed. He wants to find out whether this has led to
a change in the germination rate of his runner beans.
He plants 25 seeds. Let X represent the number of seeds
that germinate.
Find the critical region for a hypothesis test carried out at
the 10% significance level.


Critical regions

Solution:
H0: p = 0.8 (p = proportion of seeds that germinate).
H1: p ≠ 0.8
Significance level = 10% (5% for each tail).
Under H0, X ~ B(25, 0.8).
There will be two parts to the critical region, one corresponding
to each tail of the test.
Lower tail: Using tables, P(X ≤ 17) = 0.1091 > 5%
P(X ≤ 16) = 0.0468 < 5%
Therefore the lower part of the critical region is x ≤ 16.



Critical regions

Upper tail: P(X ≥ 23) = 1 – P(X ≤ 22)
= 1 – 0.9018 = 0.0982 > 5%
P(X ≥ 24) = 1 – P(X ≤ 23)
= 1 – 0.9726 = 0.0274 < 5%
Therefore the upper part of the critical region is x ≥ 24.

Combining these two parts, the critical region for the whole
test is x ≤ 16 or x ≥ 24.
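For a two-sided test, the critical region has one boundary in each tail. The following sketch is an illustration only (the function name is an assumption). It finds both boundaries for the runner bean example by requiring each tail probability to be below half the significance level.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def two_sided_critical_region(n, p0, alpha):
    """Return (lower, upper): reject H0 if x <= lower or x >= upper.

    Each tail must have probability below alpha / 2 under H0.
    Either boundary is None if no value qualifies on that side.
    """
    pmf = [binomial_pmf(k, n, p0) for k in range(n + 1)]
    half = alpha / 2
    lower = max((c for c in range(n + 1) if sum(pmf[: c + 1]) < half),
                default=None)
    upper = min((c for c in range(n + 1) if sum(pmf[c:]) < half),
                default=None)
    return lower, upper


# Runner beans: X ~ B(25, 0.8) under H0, 10% level (5% in each tail).
lower, upper = two_sided_critical_region(25, 0.8, 0.10)
print(f"Critical region: x <= {lower} or x >= {upper}")
```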


Hypothesis tests and critical regions

Hypothesis tests on a Poisson mean

The steps for performing a hypothesis test on the value of a
Poisson mean are the same as for a binomial probability:
Step 1: Write down the null and alternative hypotheses
and state the significance level of the test.
Step 2: Write down the distribution of the random variable
assuming that the null hypothesis holds.
Step 3: Find the probability of obtaining results at least as
extreme as those actually recorded – this probability
is called the p-value.
Step 4: Compare the p-value with the significance level and
decide whether to reject the null hypothesis or not.
Step 5: Make a conclusion in the context of the problem.


Hypothesis tests on a Poisson mean

The number of accidents each year on a dangerous stretch of
road historically follows a Poisson distribution with mean 18.
The police install a speed camera and the local council is
interested in knowing whether this will lead to a reduction in
the number of accidents.
In the year after the speed camera was installed, 10 accidents
were recorded.
Use a 5% significance level to test whether there seems to
have been a reduction in the number of accidents.


Hypothesis tests on a Poisson mean

Let λ represent the mean number of accidents per year.
Null hypothesis H0: λ = 18 (no reduction in the number
of accidents)
Alternative hypothesis H1: λ < 18 (there has been a
reduction in the number of accidents)
Let X be the number of accidents in a year.
Under H0, X ~ Po(18).
The number of accidents fell to 10.
Using tables, P(X ≤ 10) = 0.0304
0.0304 < 5%
Therefore we can reject the null hypothesis. There is some
evidence that there has been a reduction in the number of
accidents.
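The table value P(X ≤ 10) = 0.0304 can be checked directly from the Poisson formula P(X = k) = e^(−λ) λ^k / k!. The sketch below is a hedged illustration with made-up function names. It sums the lower tail under H0 and compares it with the 5% level.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Po(lam)."""
    return exp(-lam) * lam**k / factorial(k)


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))


lam0, observed, alpha = 18, 10, 0.05  # H0 mean, accidents recorded, 5% level

p_value = poisson_cdf(observed, lam0)  # lower tail, since H1: lambda < 18
reject_h0 = p_value < alpha
print(f"P(X <= {observed}) = {p_value:.4f}, reject H0: {reject_h0}")
```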


Examination-style question

Examination-style question:
A company has a notoriously unreliable computer system with
a mean of 4.25 breakdowns each week.
The company installs a new operating system and the
management are keen to know whether this will have an effect
on the number of breakdowns.
Over the next two weeks the computer system breaks down
on 11 occasions.
Stating your hypotheses clearly, carry out a hypothesis test
using a 2% significance level.



Examination-style question

Let λ represent the mean number of breakdowns per week.
Null hypothesis H0: λ = 4.25
Alternative hypothesis H1: λ ≠ 4.25
This is a two-sided test.

Let X be the number of breakdowns in a two week period.
Under H0, X ~ Po(8.5).
P(X ≥ 11) = 1 – P(X ≤ 10) = 1 – 0.7634 (using tables) = 0.2366
The test is 2-sided, so we compare the p-value with half the
significance level.
0.2366 > 1%
Therefore we cannot reject the null hypothesis.
There is no evidence that there has been a change in the
mean number of breakdowns.
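Two details of the computer breakdown solution are easy to miss: the weekly mean is doubled because the data cover two weeks, and the upper tail is compared with half the significance level. A minimal sketch of the calculation, with illustrative variable names, is given below.

```python
from math import exp


def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam), built up term by term."""
    term = exp(-lam)          # P(X = 0)
    total = term
    for k in range(1, x + 1):
        term *= lam / k       # P(X = k) from P(X = k - 1)
        total += term
    return total


weekly_rate, weeks = 4.25, 2
observed, alpha = 11, 0.02    # breakdowns seen, 2% significance level

lam0 = weekly_rate * weeks    # mean for a two week period under H0
p_upper = 1 - poisson_cdf(observed - 1, lam0)  # P(X >= 11)
reject_h0 = p_upper < alpha / 2                # two-sided: 1% in each tail

print(f"X ~ Po({lam0}), P(X >= {observed}) = {p_upper:.4f}")
print(f"Reject H0: {reject_h0}")
```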


Critical regions

Remember that the critical region for a hypothesis test is the
set of values that would lead to the rejection of the null
hypothesis.

A car salesman sells on average 2 new cars every day.
His company asks him to change his sales strategy.
The salesman records how many cars he sells over the next 7
days so that he can test whether there has been any change
in how successfully he sells new cars.
Find the critical region for a hypothesis test using a nominal
5% significance level. The probability of rejection in each tail
should be as close as possible to 2.5%.


Critical regions

A two-sided hypothesis test would be appropriate.
Let λ represent the mean number of cars sold per day.

H0: λ = 2
H1: λ ≠ 2.

If X is the number of cars sold in 7 days then under H0,
X ~ Po(14).

Lower tail: P(X ≤ 7) = 0.0316 (closest to 2.5%)
P(X ≤ 6) = 0.0142


Critical regions

Upper tail: P(X ≥ 22) = 1 – P(X ≤ 21)
= 1 – 0.9712 = 0.0288 (closest to 2.5%)
P(X ≥ 23) = 1 – P(X ≤ 22)
= 1 – 0.9833 = 0.0167

Therefore the critical region is X ≤ 7 or X ≥ 22.
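In the car sales example, the rule is that each tail probability should be as close as possible to 2.5%, not necessarily below it. The sketch below illustrates one way to apply that rule. The function names and the search limit of 60 are assumptions made for the example.

```python
from math import exp


def poisson_cdf_table(lam, max_k):
    """Return [P(X <= 0), ..., P(X <= max_k)] for X ~ Po(lam)."""
    term = exp(-lam)
    total = term
    table = [total]
    for k in range(1, max_k + 1):
        term *= lam / k
        total += term
        table.append(total)
    return table


def nearest_critical_region(lam, tail_target, max_k=60):
    """Return (lower, upper): reject H0 if X <= lower or X >= upper.

    Each boundary is chosen so that its tail probability is as
    close as possible to tail_target (it may be above or below it).
    """
    cdf = poisson_cdf_table(lam, max_k)
    lower = min(range(max_k + 1),
                key=lambda c: abs(cdf[c] - tail_target))
    upper = min(range(1, max_k + 1),
                key=lambda c: abs((1 - cdf[c - 1]) - tail_target))
    return lower, upper


# 7 days at 2 cars per day gives X ~ Po(14) under H0; 2.5% per tail.
lower, upper = nearest_critical_region(14, 0.025)
print(f"Critical region: X <= {lower} or X >= {upper}")
```

With these boundaries the actual significance level is 0.0316 + 0.0288 = 0.0604, which is why the slide calls 5% a nominal level.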
