
Using the Mann-Whitney U test

The Mann-Whitney U test will tell you whether the medians of two sets of data are
significantly different from one another. It works on unmatched, interval or ordinal data.
It does not require that the data are normally distributed, but it does require that both
datasets have the same shape. If you've got normally distributed data with 25 samples in
each dataset use a t-test.
An example
Let's say you are investigating the effects of lifestyle on human body size. You notice that
a high-fat, fast-food diet seems to be linked to larger body size. You state the null
hypothesis: there is no significant difference between the medians of the two sets of data,
the body sizes of people who eat high-fat, fast food and of those who eat low-fat, fast food.
You feel it would be too much of an intrusion to measure people's weight or girth
directly, so you invent a way of assessing their size remotely.
You select two sites: Site one is outside the well-known fast-food chain Bloaters Land of
Lard. Site two is outside the well-known health-food chain Smugs Lettuce Bar.
You stand outside both establishments and simply assess the body size of the first eight
punters leaving each restaurant using the following scale:
1 = skeletal, 2 = thin, 3 = medium, 4 = rounded, 5 = fat, 6 = very fat
You obtain the results in Table 1.
1: Work out the median for each set of data
Rank the data and identify the median, the middle value:
Median body size score for Bloaters = 2.5.
Median body size score for Smugs = 4.5.
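The two medians can be checked with a few lines of Python. This is only a sketch using the scores from Table 1; the variable names are chosen for illustration and are not part of the original sheet.

```python
from statistics import median

# Body size scores from Table 1 (1 = skeletal ... 6 = very fat).
bloaters = [3, 2, 2, 1, 4, 4, 5, 2]
smugs = [5, 4, 6, 3, 4, 6, 3, 6]

# With eight values, the median is the mean of the 4th and 5th ordered values.
median_bloaters = median(bloaters)
median_smugs = median(smugs)

print(sorted(bloaters), median_bloaters)
print(sorted(smugs), median_smugs)
```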
2: Put the data in order
Next, put all the data in order from smallest to largest and assign a rank to the values as
shown in Table 2. Notice the lowest value is 1 (Bloaters dataset), so this receives a rank
of 1. Next we have three tied values (three values of 2 from the Bloaters dataset). These
three items of data occupy three ranks but they all have the same value, so we share
out the ranks thus: rank 2 + rank 3 + rank 4 = 9. Divide by three and we end up with
a rank of 3 for each piece of data.
Maths/stats support 14 Mann-Whitney U test
Student skills
M0.14S
Salters-Nuffield Advanced Biology, Harcourt Education Ltd 2006. University of York Science Education Group.
This sheet may have been altered from the original. 1 of 3
Body size scores (as recorded)
Person      1  2  3  4  5  6  7  8
Bloaters    3  2  2  1  4  4  5  2
Smugs       5  4  6  3  4  6  3  6

Body size scores (in order, smallest to largest)
Bloaters    1  2  2  2  3  4  4  5
Smugs       3  3  4  4  5  6  6  6

Table 1 Body size scores of people patronising Bloaters and Smugs.
The next values are also tied, a value of 3 from Bloaters and two values of 3 from
Smugs. We deal with these in the same way. We have used ranks up to 4 so the next
three ranks that are available are rank 5, rank 6 and rank 7. Add these together and
share them out equally: 5 + 6 + 7 = 18, and 18/3 = 6. Continue doing this to complete the
table and the only moderately hard part of doing this test is over.
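The sharing-out of tied ranks described above can be sketched in Python as follows. The function name average_ranks is hypothetical, and the approach (give every tied value the mean of the ranks it occupies) is one straightforward way of doing it, assuming the ordered scores from Table 1.

```python
def average_ranks(values):
    """Return the rank of each value, sharing tied ranks equally."""
    ordered = sorted(values)
    rank_of = {}
    position = 0  # number of values already ranked
    while position < len(ordered):
        value = ordered[position]
        count = ordered.count(value)  # how many values are tied here
        # The tied values occupy ranks position+1 .. position+count;
        # the mean of those ranks is position + (count + 1) / 2.
        rank_of[value] = position + (count + 1) / 2
        position += count
    return [rank_of[v] for v in values]


bloaters = [1, 2, 2, 2, 3, 4, 4, 5]
smugs = [3, 3, 4, 4, 5, 6, 6, 6]

# Rank both datasets together, then split the ranks back out.
ranks = average_ranks(bloaters + smugs)
bloater_ranks = ranks[:len(bloaters)]
smug_ranks = ranks[len(bloaters):]

print(bloater_ranks)
print(smug_ranks)
```

The three values of 2 each receive rank 3, and the three values of 3 each receive rank 6, matching the hand calculation.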
3: Add up the ranks for each set of data
$R_1$ (Bloater) = 47.5
$R_2$ (Smug) = 88.5
4: Calculate the value U for each sample using the formula:
$U_1 = (n_1 \times n_2) + (0.5 n_1)(n_1 + 1) - R_1$
$U_2 = (n_1 \times n_2) + (0.5 n_2)(n_2 + 1) - R_2$
where: $U$ = the test statistic
$n_1$ = number of items in dataset one
$n_2$ = number of items in dataset two
$R_1$ = sum of ranks of dataset one
$R_2$ = sum of ranks of dataset two
In our example:
$U_1 = (8 \times 8) + (4 \times 9) - 47.5 = 52.5$
$U_2 = (8 \times 8) + (4 \times 9) - 88.5 = 11.5$
We can now check our calculations because $U_1 + U_2$ should equal $n_1 \times n_2$. Happily, in
our case this is indeed true: 52.5 + 11.5 = 64 = 8 × 8.
We now take the smaller of the two values as our calculated test statistic and compare
this value with the critical value obtained from a table of critical values of U, Table 3.
The smaller of our calculated values of U is 11.5.
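As a rough check on the arithmetic, the formula for U can be written as a small Python function. The name u_statistic is hypothetical; the rank sums are the ones worked out in step 3.

```python
def u_statistic(n_own, n_other, rank_sum):
    """U for one dataset: (n_own x n_other) + 0.5 x n_own x (n_own + 1) - rank sum."""
    return n_own * n_other + 0.5 * n_own * (n_own + 1) - rank_sum


n1, n2 = 8, 8          # number of items in each dataset
r1, r2 = 47.5, 88.5    # rank sums for Bloaters and Smugs

u1 = u_statistic(n1, n2, r1)
u2 = u_statistic(n2, n1, r2)

# Check: the two U values should add up to n1 x n2.
assert u1 + u2 == n1 * n2

# The smaller value is the calculated test statistic.
u = min(u1, u2)
print(u1, u2, u)
```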
Bloaters     1    2    2    2    3    4    4    5
Rank         1    3    3    3    6  9.5  9.5 12.5

Smugs        3    3    4    4    5    6    6    6
Rank         6    6  9.5  9.5 12.5   15   15   15

Table 2 Ranking of the data.
Look down the left side of Table 3 until you find the correct number of data items in
dataset 1 and then move along the row until you reach the column for the number of
data items in dataset 2. This gives the critical value at the 5% significance level (p = 0.05).
For our example, we need to look at row 8 and column 8, giving a critical value for U
of 13.
In a Mann-Whitney U test, if the calculated value of U is less than (or equal to) the
critical value we reject the null hypothesis.
Our calculated value of U is 11.5, which is less than the critical value of U, 13, so we
reject the null hypothesis at the 5% significance level (p = 0.05). In rejecting our null
hypothesis we are saying that there is a significant difference between the median body
size scores of the Bloater and Smug customers. In doing this at the 5% significance level
we accept that, if the null hypothesis were actually true, a difference at least this large
would arise by chance no more than 5% of the time.
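The decision rule can be sketched in Python like this. The dictionary CRITICAL_U_5_PERCENT and the function reject_null are hypothetical names, and only three entries from Table 3 are copied in for illustration; a fuller version would hold the whole table.

```python
# A few critical values of U at the 5% significance level, copied from
# Table 3 and keyed by (n1, n2). Illustrative subset only.
CRITICAL_U_5_PERCENT = {
    (7, 8): 10,
    (8, 8): 13,
    (8, 9): 15,
}


def reject_null(u_calculated, n1, n2):
    """Reject the null hypothesis if U is less than or equal to the critical value."""
    critical = CRITICAL_U_5_PERCENT[(n1, n2)]
    return u_calculated <= critical


# Our example: U = 11.5 with eight items in each dataset.
decision = reject_null(11.5, 8, 8)
print(decision)
```

Note that the comparison runs the opposite way to many other tests: here a small calculated value, not a large one, leads to rejecting the null hypothesis.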
Don't jump to conclusions
We might also learn from this example about the dangers of jumping to conclusions. I
would bet that most people expected Bloaters to have the fatter people using it, yet
Smugs had the higher median body size score (4.5 against 2.5).
Sheets on the statistical test written by Steve Morrell of Field Studies Council Centre, Dale Fort.
Rows are n1 (number of items in dataset 1); columns are n2 (number of items in dataset 2).
A dash marks a cell with no value given.

n1\n2   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20
    1   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -   -
    2   -   -   -   -   -   -   -   0   0   0   0   1   1   1   1   1   2   2   2   2
    3   -   -   -   -   0   1   1   2   2   3   3   4   4   5   5   6   6   7   7   8
    4   -   -   -   0   1   2   3   4   4   5   6   7   8   9  10  11  11  12  13  13
    5   -   -   0   1   2   3   5   6   7   8   9  11  12  13  14  15  17  18  19  20
    6   -   -   1   2   3   5   6   8  10  11  13  14  16  17  19  21  22  24  25  27
    7   -   -   1   3   5   6   8  10  12  14  16  18  20  22  24  26  28  30  32  34
    8   -   0   2   4   6   8  10  13  15  17  19  22  24  26  29  31  34  36  38  41
    9   -   0   2   4   7  10  12  15  17  20  23  26  28  31  34  37  39  42  45  48
   10   -   0   3   5   8  11  14  17  20  23  26  29  33  36  39  42  45  48  52  55
   11   -   0   3   6   9  13  16  19  23  26  30  33  37  40  44  47  51  55  58  62
   12   -   1   4   7  11  14  18  22  26  29  33  37  41  45  49  53  57  61  65  69
   13   -   1   4   8  12  16  20  24  28  33  37  41  45  50  54  59  63  67  72  76
   14   -   1   5   9  13  17  22  26  31  36  40  45  50  55  59  64  67  74  78  83
   15   -   1   5  10  14  19  24  29  34  39  44  49  54  59  64  70  75  80  85  90
   16   -   1   6  11  15  21  26  31  37  42  47  53  59  64  70  75  81  86  92  98
   17   -   2   6  11  17  22  28  34  39  45  51  57  63  67  75  81  87  93  99 105
   18   -   2   7  12  18  24  30  36  42  48  55  61  67  74  80  86  93  99 106 112
   19   -   2   7  13  19  25  32  38  45  52  58  65  72  78  85  92  99 106 113 119
   20   -   2   8  13  20  27  34  41  48  55  62  69  76  83  90  98 105 112 119 127

Table 3 Critical values of U (5% significance).
