
Bio Factsheet

September 1997

Number 3

Which Stats test should I use?


"Which statistical test should we use?" is a common question from Biology students. This Factsheet provides simple guidelines on when each type of statistical test should be used.
The choice of statistical test is all-important: use the wrong test and the conclusions will be invalid. Marks are only awarded for an appropriate (i.e. correct) use of statistics. The flowchart below can be used to identify the appropriate test. Table 1 gives examples of investigations and the appropriate tests.

Figure 1. Deciding which test to use

Are you:

A. Comparing frequencies (numbers of things) in various categories?
   (e.g. are the numbers of seeds germinating in various trays significantly different?)
   Do you want to test:
     Whether the frequencies are the same in each of two or more categories?
     (e.g. are seed germination rates the same at different pHs?)
       -> Chi-squared Goodness-of-Fit
     Whether observed frequencies are the same as those theoretically expected?
     (e.g. are the predictions of genetics correct?)
       -> Chi-squared Goodness-of-Fit
     Whether two factors are related?
     (e.g. does pollution affect the number of sites at which clinging mayfly are found?)
       -> Chi-squared Contingency table

B. Finding if there is a difference between two averages?
   (e.g. is there, on average, a higher species diversity in unpolluted water than in polluted water?)
   Are the data:
     Calculated (such as a diversity index) or counted (such as the number of organisms)?
       -> Mann-Whitney U test
     Measured (such as length, width, height, velocity)?
       Do the data occur in natural pairs (e.g. the same organism reacting to two different stimuli)?
         Yes -> paired t-test
         No -> unpaired t-test

C. Investigating the relationship between two variables?
   (e.g. is there a relationship between pollution level and distance from road?)
   Do you want to:
     Find whether the two variables are correlated?
     (i.e. does one tend to increase or decrease as the other increases?)
       -> Spearman's rank correlation coefficient
     Use one variable to predict the value of the other?
     (e.g. predict the pollutant levels at 10 m from the road)
       -> Regression
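
The decisions in Figure 1 can also be written as a short function. The following Python is only a sketch of one reading of the flowchart: the function name, argument names and string labels are all hypothetical, and it assumes that both of the first two frequency questions lead to the goodness-of-fit test.

```python
def choose_test(aim, *, question=None, data_kind=None, paired=None, goal=None):
    """Return the test suggested by one reading of the Figure 1 flowchart.

    Every argument name and string label here is a hypothetical choice
    made for this sketch; none of them comes from the Factsheet itself.
    """
    if aim == "frequencies":
        # Assumption: both of these questions map to goodness-of-fit.
        if question in ("same_in_each_category", "match_expected"):
            return "Chi-squared goodness-of-fit"
        if question == "factors_related":
            return "Chi-squared contingency table"
    elif aim == "averages":
        # Calculated or counted values are not assumed to be normal.
        if data_kind in ("calculated", "counted"):
            return "Mann-Whitney U test"
        # Measured values: the choice depends on natural pairing.
        if data_kind == "measured" and paired is not None:
            return "paired t-test" if paired else "unpaired t-test"
    elif aim == "relationship":
        if goal == "correlation":
            return "Spearman's rank correlation coefficient"
        if goal == "prediction":
            return "Regression"
    raise ValueError("this combination is not covered by the flowchart")


# Vegetation height measured at matched sites either side of a hedge.
print(choose_test("averages", data_kind="measured", paired=True))
# Diversity index calculated at mown and unmown sites.
print(choose_test("averages", data_kind="calculated"))
```

Raising an error for any combination the flowchart does not cover is a deliberate choice here: it avoids silently suggesting a test that the flowchart never recommends.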

Table 1: Statistical tests for various investigations


1. INVESTIGATION: Effect of pH on seed germination
   WHAT IS MEASURED: Number of seeds (out of 20, say) germinating in each of several trays which contain solutions of different pH
   NULL HYPOTHESIS: Ho: Number of seeds germinating is not dependent on pH
   STATISTICAL TEST: Chi-squared
   EXPLANATION: We are comparing observed frequencies with expected frequencies (i.e. that the same number germinate in each tray)

2. INVESTIGATION: Effect of differing environmental conditions (e.g. two soil types) on the yield of a crop plant
   WHAT IS MEASURED: The weight of usable crop produced from a given area at a minimum of 4 sites for each soil type
   NULL HYPOTHESIS: Ho: Mean yield is the same for both soil types
   STATISTICAL TEST: t-test (unpaired)

3. INVESTIGATION: Effect of pollution on vegetation
   WHAT IS MEASURED: Measurements of vegetation height from at least 4 polluted and 4 unpolluted sites
   NULL HYPOTHESIS: Ho: Mean vegetation height is unaffected by pollution
   STATISTICAL TEST: t-test (unpaired)

4. INVESTIGATION: Comparison of leaf length for the same tree species in two different sites
   WHAT IS MEASURED: Measurements of length of at least 20 leaves from each site
   NULL HYPOTHESIS: Ho: Mean leaf lengths are the same for both sites
   STATISTICAL TEST: t-test (unpaired)

   EXPLANATION (investigations 2 to 4): The variable measured is continuous and would be expected to follow a normal (bell-shaped) distribution, and we are interested in comparing mean (or average) values. Most "natural" measurements (length, weight etc.) follow a normal distribution.

5. INVESTIGATION: Comparison of plant growth on two sides of a hedgerow
   WHAT IS MEASURED: Measurement of vegetation height at matched sites, i.e. equivalent points at opposite sides of the hedge
   NULL HYPOTHESIS: Ho: Mean vegetation height is the same on both sides
   STATISTICAL TEST: t-test (paired)
   EXPLANATION: The paired test is used when the conditions for the unpaired t-test apply and we have a natural matching between sites

6. INVESTIGATION: Comparison of species diversity in mown and unmown turf
   WHAT IS MEASURED: Simpson's Diversity Index at a minimum of 5 mown and 5 unmown sites
   NULL HYPOTHESIS: Ho: Species diversity does not differ significantly between mown and unmown sites
   STATISTICAL TEST: Mann-Whitney U-test

7. INVESTIGATION: Lichen distribution related to direction faced (North or South)
   WHAT IS MEASURED: The number of quadrats in which lichen occurs at each of a minimum of 5 sites facing in each direction
   NULL HYPOTHESIS: Ho: Lichen distribution does not differ significantly between North and South facing areas
   STATISTICAL TEST: Mann-Whitney U-test

8. INVESTIGATION: Comparison of wildlife in coppiced and uncoppiced woods
   WHAT IS MEASURED: Either: the incidence of specified species at each of at least 5 sites in each type of woodland. Or: Simpson's Diversity Index at each of at least 5 sites in each type of woodland
   NULL HYPOTHESIS: Ho: Incidence of specified species / species diversity does not differ significantly between coppiced and uncoppiced woodland
   STATISTICAL TEST: Mann-Whitney U-test

   EXPLANATION (investigations 6 to 8): This test is used when we want to compare averages, but cannot assume our figures come from a normal distribution. This will certainly be the case when we are comparing something we have calculated or counted at different sites. It can be used instead of a t-test as well, but is less powerful and less likely to give significant results.

9. INVESTIGATION: Relationship between species diversity and concentration of zinc in a stream
   WHAT IS MEASURED: Simpson's Diversity Index and zinc concentration at a minimum of 5 sites (but preferably more)
   NULL HYPOTHESIS: Ho: There is no correlation between species diversity and zinc concentration in a stream
   STATISTICAL TEST: Spearman's Rank
   EXPLANATION: We are looking for a relationship where species diversity decreases with zinc concentration

10. INVESTIGATION: Effect of soil type on incidence of a particular plant species
    WHAT IS MEASURED: Number of quadrats in which the species is present or absent for at least 20 samples taken for each soil type (there must be enough samples to guarantee that there will be at least 5 cases where it is present and 5 where it is absent for each soil type)
    NULL HYPOTHESIS: Ho: Incidence of the species is not dependent on soil type
    STATISTICAL TEST: Chi-squared (Contingency Table)
    EXPLANATION: We are trying to test whether two factors (i.e. type of soil and absence or presence of a plant species) are independent

11. INVESTIGATION: Establishment of the exact relationship between width and height of limpets
    WHAT IS MEASURED: Measurements of base width and height of at least 10 limpets
    NULL HYPOTHESIS: N/A
    STATISTICAL TEST: Regression
    EXPLANATION: Since we are trying to find a relationship between the two variables, we want to be able to use one to predict the other

Acknowledgements: This Bio Factsheet was researched and written by Cath Brown.
Curriculum Press, Unit 305B, The Big Peg, 120 Vyse Street, Birmingham. B18 6NF
Biopress Factsheets may be copied free of charge by teaching staff or students, provided that their school is a registered subscriber.
No part of these Factsheets may be reproduced, stored in a retrieval system, or transmitted, in any other form or by any other means, without the prior permission of the publisher. ISSN 1351-5136
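
As a rough illustration of what two of the tests in Table 1 actually calculate, the Python sketch below works out a chi-squared statistic for an investigation like number 1 and Spearman's rank coefficient for one like number 9. All of the data values are invented for the example, the function names are hypothetical, and the Spearman formula used, 1 - 6 x (sum of d squared) / (n(n squared - 1)), assumes there are no tied values. In both cases the result would still have to be compared with a critical value from tables before drawing a conclusion.

```python
def chi_squared_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must be the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def ranks(values):
    """Rank values from 1 (smallest) upwards. Assumes no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, index in enumerate(order, start=1):
        result[index] = rank
    return result


def spearman_rank(x, y):
    """Spearman's coefficient from rank differences. Assumes no ties."""
    if len(x) != len(y):
        raise ValueError("x and y must be the same length")
    n = len(x)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))


# Invented data: seeds germinating (out of 20) in four trays at different pH.
germinated = [12, 15, 9, 4]
# Under Ho the same number is expected to germinate in every tray.
expected = [sum(germinated) / len(germinated)] * len(germinated)
chi2 = chi_squared_statistic(germinated, expected)

# Invented data: zinc concentration and Simpson's Diversity Index at six sites.
zinc = [0.1, 0.4, 0.9, 1.5, 2.2, 3.0]
diversity = [5.1, 4.8, 4.9, 3.2, 2.7, 1.9]
rho = spearman_rank(zinc, diversity)

print(f"chi-squared = {chi2:.2f}")
print(f"Spearman's rank coefficient = {rho:.3f}")
```

A negative coefficient close to -1 would correspond to the relationship described for investigation 9, where species diversity falls as zinc concentration rises.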
