CATEGORICAL DATA:

Jenny V Freeman and Michael J Campbell analyse categorical data in small samples

IN THE PREVIOUS TUTORIAL we have outlined some simple methods for analysing binary data, including the comparison of two proportions using the Normal approximation to the binomial and the Chi-squared test. However, these methods are only approximations, although they are good when the sample size is large. When the sample size is small we can evaluate all possible combinations of the data and compute what are known as exact P-values.

FISHER’S EXACT TEST
When one of the expected values (note: not the observed values) in a 2 × 2 table is less than 5, and especially when it is less than 1, then Yates’ correction can be improved upon. In this case Fisher’s Exact test, proposed in the mid-1930s almost simultaneously by Fisher, Irwin and Yates, can be applied. The null hypothesis for the test is that there is no association between the rows and columns of the 2 × 2 table, such that the probability of a subject being in a particular row is not influenced by being in a particular column. If the columns represent the study group and the rows represent the outcome, then the null hypothesis could be interpreted as the probability of having a particular outcome not being influenced by the study group, and the test evaluates whether the two study groups differ in the proportions with each outcome.

An important assumption for all of the methods outlined, including Fisher’s Exact test, is that the binary data are independent. If the proportions are correlated then more advanced techniques should be applied. For instance, in the leg ulcer example of the previous tutorial,¹ if there were more than one leg ulcer per patient, we could not treat the outcomes as independent.

The test is based upon calculating directly the probability of obtaining the results that we have shown (or results more extreme) if the null hypothesis is actually true, using all possible 2 × 2 tables that could have been observed, for the same row and column totals as the observed data. These row and column totals are also known as marginal totals. What we are trying to establish is how extreme our particular table (combination of cell frequencies) is in relation to all the possible ones that could have occurred given the marginal totals.

This is best explained by a simple worked example. The data in table 1 come from an RCT comparing intra-muscular magnesium injections with placebo for the treatment of chronic fatigue syndrome.³ Of the 15 patients who had the intra-muscular magnesium injections 12 felt better (80 per cent) whereas, of the 17 on placebo, only three felt better (18 per cent).

TABLE 1

                      Magnesium   Placebo   Total
Felt better                  12         3      15
Did not feel better           3        14      17
Total                        15        17      32

Results of the study to examine whether intra-muscular magnesium is better than placebo for the treatment of chronic fatigue syndrome.†

There are 16 different ways of rearranging the cell frequencies for the table whilst keeping the marginal totals the same, as illustrated in figure 1. The result that corresponds to our observed cell frequencies is (xiii).

FIGURE 1

(i)          (ii)         (iii)
 0  15        1  14        2  13
15   2       14   3       13   4

(iv)         (v)          (vi)
 3  12        4  11        5  10
12   5       11   6       10   7

(vii)        (viii)       (ix)
 6   9        7   8        8   7
 9   8        8   9        7  10

(x)          (xi)         (xii)
 9   6       10   5       11   4
 6  11        5  12        4  13

(xiii)       (xiv)        (xv)
12   3       13   2       14   1
 3  14        2  15        1  16

(xvi)
15   0
 0  17

Illustration of all the different ways of rearranging cell frequencies in table 1, but with the marginal totals remaining the same.

The general form of table 1 is given in table 2, and under the null hypothesis of no association Fisher showed that the probability of obtaining the frequencies a, b, c and d in table 2 is

$$\frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{(a+b+c+d)!\,a!\,b!\,c!\,d!} \qquad (1)$$

where x! is the product of all the integers between 1 and x, e.g. 5! = 1 × 2 × 3 × 4 × 5 = 120 (note that for the purpose of this calculation, we define 0! as 1). Thus for each of the results (i) to (xvi) the exact probability of obtaining that result can be calculated (table 3). For example, the probability of obtaining (i) in figure 1 is

$$\frac{15!\,17!\,15!\,17!}{32!\,0!\,15!\,15!\,2!} = 0.0000002.$$

From table 3 we can see that the probability of obtaining the observed frequencies for our data is that which corresponds with (xiii), which gives P = 0.0005469 and the probability of obtaining our results or results more extreme (a difference that is at least as large) is the sum of the

SCOPE | JUNE 07 | 11

probabilities for (xiii) to (xvi) = 0.000573. This gives the one-sided P-value for obtaining our results or results more extreme, and in order to obtain the two-sided P-value there are several approaches. The first is to simply double this value, which gives P = 0.001146. A second approach is to add together all the probabilities that are the same size or smaller than the one for our particular result; in this case, all probabilities that are less than or equal to 0.0005469, which are (i), (ii), (iii), (xiii), (xiv), (xv) and (xvi). This gives a two-sided value of P = 0.001033. Generally the difference is not great, though the first approach will always give a value greater than the second. A third approach, which is recommended by Swinscow and Campbell,⁴ is a compromise and is known as the mid-P method. All the values more extreme than the observed P-value are added up and these are added to one half of the observed value. This gives P = 0.000759.

TABLE 2

         Column 1   Column 2   Total
Row 1    a          b          a+b
Row 2    c          d          c+d
Total    a+c        b+d        a+b+c+d

General form of table 1.

TABLE 3

Table     a    b    c    d    P-value
i         0   15   15    2    0.0000002
ii        1   14   14    3    0.0000180
iii       2   13   13    4    0.0004417
iv        3   12   12    5    0.0049769
v         4   11   11    6    0.0298613
vi        5   10   10    7    0.1032349
vii       6    9    9    8    0.2150728
viii      7    8    8    9    0.2765221
ix        8    7    7   10    0.2212177
x         9    6    6   11    0.1094916
xi       10    5    5   12    0.0328475
xii      11    4    4   13    0.0057426
xiii     12    3    3   14    0.0005469
xiv      13    2    2   15    0.0000252
xv       14    1    1   16    0.0000005
xvi      15    0    0   17    0.0000000

Probabilities of each of the frequency tables above, calculated using formula 1.

COMPARISON OF TESTS
The criticism of the first two methods is that they are too conservative, i.e. if the null hypothesis were true, over repeated studies they would reject the null hypothesis less often than 5 per cent of the time. They are conditional on both sets of marginal totals being fixed, i.e. exactly 15 people being treated with magnesium and 15 feeling better. However, if the study were repeated, even with 15 and 17 in the magnesium and placebo groups respectively, we would not necessarily expect exactly 15 to feel better. The mid-P value method is less conservative, and gives approximately the correct rate of type I errors (false positives).

In either case, for our example, the P-value is less than 0.05, the nominal level for statistical significance, and we can conclude that there is evidence of a statistically significant difference in the proportions feeling better between the two treatment groups. However, in common with other non-parametric tests, Fisher’s Exact test is simply a hypothesis test. It will merely tell you whether a difference is likely, given the null hypothesis (of no difference). It gives you no information about the likely size of the difference, and so whilst we can conclude that there is a significant difference between the two treatments with respect to feeling better or not, we can draw no conclusions about the possible size of the difference.

EXAMPLE DATA FROM LAST WEEK
Table 4 shows the data from the previous tutorial. It is from a randomised controlled trial of community leg ulcer clinics,⁵ comparing the cost effectiveness of community leg ulcer clinics with standard nursing care. The columns represent the two treatment groups, specialist leg ulcer clinic (clinic) and standard care (home), and the rows represent the outcome variable, in this case whether the leg ulcer has healed or not.

TABLE 4

                      Treatment
Outcome        Clinic        Home          Total
Healed         22 (18%)      17 (15%)      39
Not healed     98 (82%)      96 (85%)      194
Total          120 (100%)    113 (100%)    233

2 × 2 contingency table of treatment (clinic/home) by outcome (ulcer healed/not healed) for the leg ulcer study.

For this example the two-sided P-value from Fisher’s Exact test is 0.599 and in this case we cannot reject the null hypothesis and would decide that there is insufficient evidence of a difference between the two groups.

SUMMARY
This tutorial has described in detail Fisher’s Exact test, for analysing simple 2 × 2 contingency tables when the assumptions for the Chi-squared test are not met. It is tedious to do by hand, but nowadays is easily computed by most statistical packages.

† When organising data such as this it is good practice to arrange the table with the grouping variable forming the columns and the outcome variable forming the rows.

1 Freeman JV, Julious SA. The analysis of categorical data. Scope 2007; 16(1): 18–21.
2 Armitage P, Berry PJ, Matthews JNS. Statistical methods in medical research. 4th ed. Oxford: Blackwell Publishing, 2002.
3 Cox IM, Campbell MJ, Dowson D. Red blood cell magnesium and chronic fatigue syndrome. Lancet 1991; 337: 757–60.
4 Swinscow TDV, Campbell MJ. Statistics at square one. 10th ed. London: BMJ Books, 2002.
5 Morrell CJ, Walters SJ, Dixon S, Collins K, Brereton LML, Peters J et al. Cost effectiveness of community leg ulcer clinic: randomised controlled trial. Brit Med J 1998; 316: 1487–91.
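
For readers who would like to check the magnesium example by computer, the following is a minimal sketch in Python, not the output of any particular statistical package. The function names `table_probability` and `fisher_p_values` are hypothetical, chosen only for this illustration. It assumes that the one-sided P-value is taken from the tail the observed table lies in, and that the mid-P value is the two-sided sum with half of the observed table's probability removed, which is how the figures quoted in the text appear to have been obtained.

```python
from math import comb


def table_probability(a, b, c, d):
    """Exact probability of one 2 x 2 table under the null hypothesis (formula 1)."""
    # Binomial coefficients give the same value as the factorial form
    # (a+b)!(c+d)!(a+c)!(b+d)! / ((a+b+c+d)! a! b! c! d!)
    return comb(a + b, a) * comb(c + d, c) / comb(a + b + c + d, a + c)


def fisher_p_values(a, b, c, d, tol=1e-7):
    """One-sided, doubled, two-sided and mid-P values for an observed 2 x 2 table."""
    row1, col1, n = a + b, a + c, a + b + c + d
    # Every value of the top-left cell that is compatible with the marginal totals
    lowest, highest = max(0, row1 + col1 - n), min(row1, col1)
    probs = {
        k: table_probability(k, row1 - k, col1 - k, n - row1 - col1 + k)
        for k in range(lowest, highest + 1)
    }
    observed = probs[a]
    upper = sum(p for k, p in probs.items() if k >= a)
    lower = sum(p for k, p in probs.items() if k <= a)
    one_sided = min(upper, lower)  # assumption: use the tail the data lie in
    # Tables no more probable than the observed one (tol guards against float ties)
    two_sided = sum(p for p in probs.values() if p <= observed * (1 + tol))
    return {
        "observed": observed,
        "one_sided": one_sided,
        "doubled": min(1.0, 2 * one_sided),
        "two_sided": two_sided,
        "mid_p": two_sided - observed / 2,  # assumption: two-sided mid-P
    }


if __name__ == "__main__":
    # Observed magnesium table: a = 12, b = 3, c = 3, d = 14
    for name, value in fisher_p_values(12, 3, 3, 14).items():
        print(f"{name}: {value:.7f}")
```

Working with binomial coefficients rather than raw factorials keeps the intermediate numbers smaller but is algebraically identical to formula 1. Doubling the unrounded one-sided value gives a figure that differs in the last decimal place from doubling the rounded 0.000573.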
