
PSYCHOLOGICAL BULLETIN

Vol. 54, No. 2, 1957

A GENERAL METHOD OF ANALYSIS OF FREQUENCY DATA FOR MULTIPLE CLASSIFICATION DESIGNS
J. P. SUTCLIFFE
University of Sydney

Fisher (3) has shown the advantages of the factorial experiment over the classical method of "one variable." The following gains accrue from consideration of the effects of independent variables (treatments) upon the dependent variable in the context of other independent variables: (a) With a sample of size n, and k treatment classifications which do not interact, "hidden replication" enables estimation of all k main effects with the same precision as would be achieved for one in a single factor experiment of the same size. The economy of the factorial design is indicated by the fact that to obtain the same amount of information by the "rule of one variable," one would need k sets of n replicates. (b) If there is interaction among the treatments, the factorial arrangement enables its isolation and evaluation and thereby sets the limits of generalization. One can specify the effect of the independent upon the dependent variable in a variety of contexts; and conversely, if interaction is zero one may conclude that the relationship is constant through all contexts considered. (c) A further virtue of the factorial design lies in the information it may provide about the relative efficacy of different combinations of conditions for the production of given effects. Most use has been made of this in agriculture and industry, but it has its scientific as well as its technological applications, such as in sorting out necessary and sufficient conditions.

In practice the factorial design has most often been used where it is possible to obtain measurements on the dependent variable, so that statistical analysis of the outcomes is by way of the "analysis of variance." In many research areas, however, phenomena are not yet amenable to scaling so that one has counts or frequencies within given categories rather than measures, e.g., male versus female rather than degrees of sexuality. There is no logical hindrance to the use of factorial experimentation with these phenomena, and such is to be recommended in light of the advantages to be gained. The problem is to find a method of statistical analysis appropriate to this type of data. $\chi^2$ methods are available for the comparison of sampled frequencies and for assessing association in simple contingency tables. These cases are in effect instances of single and double classification designs, and if contingency association is the analogue of interaction in analysis of variance, then a method of assessing multiple contingency is needed for the analysis of frequency data from higher order designs. Pearson (6) described a procedure for assessing multiple contingency but failed to consider the question of additivity of $\chi^2$ components. Bartlett (1) offered a method for the 2 × 2 × 2 case which involves the solution of a cubic equation and is difficult to apply in practice. Recently, Lancaster (5), following proofs by Irwin (2) and Lancaster (4), has devised a general method of partitioning a total $\chi^2$ and degrees of freedom into independent additive components due to given sources of variation. This completes
the parallel with the analysis of variance in which the total sum of squares and degrees of freedom are partitioned into sums of squares and df for all main effects, interactions and error. This paper presents for psychologists a general form of multiple contingency analysis based on Lancaster's work, provides an illustration, and comments on the generality of application of the method.

MULTIPLE CONTINGENCY ANALYSIS

Complex contingency tables of frequency data from multiple classification designs may be of several forms according as the sampling of main effects is random or restricted. (a) The random case imposes no sampling restrictions. For example, after a random sample has been taken, it may be classified in various ways and the frequencies within classes will be due only (within sampling limits) to the population proportions. (b) The mixed case involves restrictions upon the proportions within categories of given classifications and freedom with respect to others, e.g., arranging in advance that a total sample will involve equal proportions of the sexes. Parameters for a classification are defined by its restriction. Whichever case is involved, for each observed frequency in the table there will be an expected value, and hence divergence of the total table from expectation may be tested through $\chi^2$. Within a total table, however, there will be a number of sources of variation comprising main effects and interactions, and to isolate them one would need to partition the total $\chi^2$ and df into independent additive components due to such sources. The problem is to specify the expected frequencies which will meet this requirement. The method will be developed through a notation which, while perfectly general, will for simplicity be set out for an A × B × C design.

Let the classifications be symbolized as A, B, C, ..., L. Let A be subdivided into a categories represented generally by the subscript i, which thus takes values from 1 to a. Similarly B is represented by (j = 1, ..., b), C by (k = 1, ..., c), etc. Let $p_{ijk}$ = the probability of an observation falling in the ijkth cell; $o_{ijk}$ = the observed frequency in the ijkth cell; and $e_{ijk}$ = the expected frequency in the ijkth cell. Let a dot in place of a subscript represent summation across the values represented by the subscript, e.g., $o_{i..} = \sum_{j}\sum_{k} o_{ijk}$. Let the total sample size $o_{...} = N$; and finally $p_{...} = 1.0$. On the hypothesis¹ of zero interaction, $p_{ijk} = p_{i..} \times p_{.j.} \times p_{..k}$, $p_{ij.} = p_{i..} \times p_{.j.}$, etc. These parameters are used to find the expected frequencies, e.g., $e_{ijk} = p_{ijk} \times N$, and hence $\chi^2$ may be calculated as $\sum (o - e)^2 / e$. Now some or all of the values of the parameters $p_{i..}$, $p_{.j.}$, $p_{..k}$ may be (a) known from the population; or (b) estimated from the sample, e.g., $p_{i..} = o_{i..}/N$. These situations taken with the random and mixed designs provide four cases each requiring separate consideration. Case (1a) will be presented in full for the A × B × C design.

¹ Ordinarily one works with the null hypothesis, but population hypotheses of non-zero interactions may be entertained, e.g., in a test of goodness of fit with case 1a, and again in determining the power of the test of significance for a given situation.

(1a) Random sampling, known parameters

The partition of total $\chi^2$ and df into component values for this case,
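TABLE 1
PARTITION OF $\chi^2$ AND DF FOR CASE (1a)

Number  Source  $\chi^2$ and computing formula                                                                          df
1       A       $\chi^2_A = \sum_i (o_{i..} - e_{i..})^2 / e_{i..}$                                                     $(a-1)$
2       B       $\chi^2_B = \sum_j (o_{.j.} - e_{.j.})^2 / e_{.j.}$                                                     $(b-1)$
3       C       $\chi^2_C = \sum_k (o_{..k} - e_{..k})^2 / e_{..k}$                                                     $(c-1)$
4       AB      $\chi^2_{AB} = \sum_i \sum_j (o_{ij.} - e_{ij.})^2 / e_{ij.} - (1+2)$                                   $(a-1)(b-1)$
5       AC      $\chi^2_{AC} = \sum_i \sum_k (o_{i.k} - e_{i.k})^2 / e_{i.k} - (1+3)$                                   $(a-1)(c-1)$
6       BC      $\chi^2_{BC} = \sum_j \sum_k (o_{.jk} - e_{.jk})^2 / e_{.jk} - (2+3)$                                   $(b-1)(c-1)$
7       ABC     $\chi^2_{ABC} = \sum_i \sum_j \sum_k (o_{ijk} - e_{ijk})^2 / e_{ijk} - (1+2+3+4+5+6)$                   $(a-1)(b-1)(c-1)$
8       Total   $\chi^2_T = \sum_i \sum_j \sum_k (o_{ijk} - e_{ijk})^2 / e_{ijk}$                                       $(abc-1)$

Note. In the computing formulas, a bracketed sum such as (1+2) denotes the sum of the $\chi^2$ values in the rows so numbered.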
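The computing formulas of Table 1 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not part of the original paper: the function name `partition_chi_square`, the helper `raw`, and the nested-list layout `obs[i][j][k]` are hypothetical choices made here, and the sketch covers case (1a) only, where the proportions for A, B, and C are known and each set sums to 1.

```python
from itertools import product


def partition_chi_square(obs, p_a, p_b, p_c):
    """Partition total chi-square for an A x B x C table, case (1a).

    obs[i][j][k] is the observed frequency o_ijk; p_a, p_b, p_c are the
    known population proportions p_i.., p_.j., p_..k (each summing to 1).
    Returns two dicts keyed by source: chi-square components and df.
    """
    a, b, c = len(p_a), len(p_b), len(p_c)
    cells = list(product(range(a), range(b), range(c)))
    n = sum(obs[i][j][k] for i, j, k in cells)  # total sample size N

    def raw(keep):
        # Sum of (o - e)^2 / e over the marginal table that keeps the axes
        # listed in `keep` (0 = A, 1 = B, 2 = C), with e = p * N and p
        # formed on the hypothesis of zero interaction.
        o_m, p_m = {}, {}
        for cell in cells:
            i, j, k = cell
            key = tuple(cell[axis] for axis in keep)
            o_m[key] = o_m.get(key, 0) + obs[i][j][k]
            p_m[key] = p_m.get(key, 0.0) + p_a[i] * p_b[j] * p_c[k]
        return sum((o_m[key] - p_m[key] * n) ** 2 / (p_m[key] * n) for key in o_m)

    # Rows 1-3 of Table 1: main effects.
    chi = {"A": raw((0,)), "B": raw((1,)), "C": raw((2,))}
    # Rows 4-6: two-way tables less the main effects they contain.
    chi["AB"] = raw((0, 1)) - (chi["A"] + chi["B"])
    chi["AC"] = raw((0, 2)) - (chi["A"] + chi["C"])
    chi["BC"] = raw((1, 2)) - (chi["B"] + chi["C"])
    # Row 8 (total), then row 7 as the remainder after rows 1-6.
    chi["Total"] = raw((0, 1, 2))
    lower = ("A", "B", "C", "AB", "AC", "BC")
    chi["ABC"] = chi["Total"] - sum(chi[s] for s in lower)

    df = {
        "A": a - 1, "B": b - 1, "C": c - 1,
        "AB": (a - 1) * (b - 1), "AC": (a - 1) * (c - 1), "BC": (b - 1) * (c - 1),
        "ABC": (a - 1) * (b - 1) * (c - 1),
        "Total": a * b * c - 1,
    }
    return chi, df
```

Calling the function on a 2 × 2 × 2 table with all proportions equal to 0.5 returns eight $\chi^2$ values whose first seven sum to the total, with component df summing to $abc - 1 = 7$.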

together with computing formulas, are set out in Table 1. As the population values are known, the significance of all main effects and interactions may be assessed.

(1b) Random sampling, parameters estimated from the data

In this case one estimates population proportions from the sample data, e.g., $p_{i..} = o_{i..}/N$, and as $e_{i..} = p_{i..} \times N$, then $e_{i..} = o_{i..}$. Accordingly for this case the values of $\chi^2$ and df for the main effects are zero. One may assess all interactions and their df are unchanged, but the total df is reduced by the number lost with the main effects.

(2a) Mixed case, known parameters

Here restriction specifies the parameters for a classification, so that the main effects and df for the restricted classifications are zero. Furthermore, if several classifications and their subclasses are restricted, within that set the interactions and df are also zero. For instance, if in an A × B × C design the proportions within the A and B classes and subclasses are prearranged and sampling is random only with respect to C, then one has set $e_{i..} = o_{i..}$, $e_{.j.} = o_{.j.}$, $e_{ij.} = o_{ij.} = (o_{i..} \times o_{.j.})/N$. In this case one would obtain $\chi^2_A = \chi^2_B = \chi^2_{AB} = 0$, and the total df would be reduced by the number lost with the main effects and interactions.

(2b) Mixed case, parameters estimated from the data

Here one loses all the main effects and such interactions as involve only the restricted classifications. For the case with A and B restricted and C random, $\chi^2_A = \chi^2_B = \chi^2_C = \chi^2_{AB} = 0$, and as before the total df has to be adjusted for the number lost with the main effects and interactions.

ILLUSTRATION OF THE METHOD

An experiment is reported (7) in which the manner of resolving conflict (A, i = 1, 2) is observed under four conditions constituting the factorial arrangement of two conditions of social distance (B, j = 1, 2) and two conditions of publicity (C, k = 1, 2). Four independent random samples of 100 cases were assigned to the four conditions, and the whole experiment was replicated for eighteen conditions of social sanction (D, l = 1, 2, ..., 18). In this way equal numbers were subjected to all treatments and the only main effect frequencies free to vary were those pertaining to type of conflict resolution. That is, A is random and B, C, and D and their subclasses are restricted in an A × B × C × D design. As the population proportions for type of conflict resolution were unknown, they were estimated from the sample data. Hence the analysis follows the (2b) type, where $\chi^2$ and df are zero for all main effects and for the interactions among B, C, and D, and

$df_T = (bcd - 1)(a - 1)$.

GENERALITY OF APPLICATION

As presented the method has application to factorial experiments in which information on the dependent variable is in frequency form. Equally the method may be applied to surveys where sampling units are classified in a variety of ways. A further application of this type of method to measurement data has recently been suggested by Wilson² (8), who uses it as a "distribution-free" substitute for the analysis of variance. He does not justify this substitution and some comment is warranted. Wilson dichotomizes his measures at the median and, in effect, introduces the dependent variable as an additional classification with two levels. Information is lost in categorizing and to that extent a test of significance with frequencies is less sensitive than one applied to measures. Hence one would use multiple contingency analysis with measurement data as a substitute for analysis of variance only when the latter method was not applicable. This would be so when certain assumptions required for a valid F test could not be met (normality of parent population, homogeneity of variance) and a suitable transformation was not available. Here in the absence of the more sensitive test, the less sensitive test would certainly be preferable to none at all.

² Wilson's procedures are based upon a particular hypothesis about the expected values, viz., irrespective of treatment effects a score on the dependent variable is equally likely to occur above or below the median. It should be noted that this is not the only population hypothesis which may be entertained. The method presented in this paper, being more general than Wilson's, is to be preferred on that score.
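The df bookkeeping for the four cases described under MULTIPLE CONTINGENCY ANALYSIS can be sketched in Python. This is an illustrative reading of the rules stated in the text, not code from the paper: the function name `remaining_df` and its arguments are hypothetical, and classification names are assumed to be single letters. An effect is dropped when it involves only restricted classifications, and every main effect is dropped when proportions are estimated from the sample.

```python
from itertools import combinations
from math import prod


def remaining_df(levels, restricted=(), estimated=False):
    """Return (df per assessable effect, adjusted total df).

    levels     -- dict mapping classification name to number of categories
    restricted -- names of classifications whose proportions are prearranged
    estimated  -- True if population proportions are estimated from the data
    """
    restricted = set(restricted)
    names = sorted(levels)
    kept = {}
    for size in range(1, len(names) + 1):
        for effect in combinations(names, size):
            only_restricted = set(effect) <= restricted  # cases (2a), (2b)
            lost_main = size == 1 and estimated          # cases (1b), (2b)
            if only_restricted or lost_main:
                continue  # chi-square and df are zero for this effect
            kept["".join(effect)] = prod(levels[f] - 1 for f in effect)
    return kept, sum(kept.values())


if __name__ == "__main__":
    # Design of the illustration: A random; B, C, D restricted; estimated.
    design = {"A": 2, "B": 2, "C": 2, "D": 18}
    effects, total = remaining_df(design, restricted="BCD", estimated=True)
    print(sorted(effects), total)
```

For the design of the illustration, the adjusted total returned by this sketch agrees with $df_T = (bcd - 1)(a - 1)$; with no restrictions and known parameters it reduces to the total df of case (1a).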
REFERENCES

1. BARTLETT, M. S. Contingency table interactions. J. roy. statist. Soc. Suppl., 1935, 2, 248-252.
2. IRWIN, J. O. A note on the subdivision of $\chi^2$ into components. Biometrika, 1949, 36, 130-134.
3. FISHER, R. A. The design of experiments. (5th Ed.) Edinburgh: Oliver & Boyd, 1949.
4. LANCASTER, H. O. The derivation and partition of $\chi^2$ in certain discrete distributions. Biometrika, 1949, 36, 117-129.
5. LANCASTER, H. O. Complex contingency tables treated by the partition of $\chi^2$. J. roy. statist. Soc., Series B, 1951, 13, 242-249.
6. PEARSON, K. On the theory of multiple contingency with special reference to partial contingency. Biometrika, 1915-17, 11, 145-158.
7. SUTCLIFFE, J. P., & HABERMAN, M. Factors influencing choice in role conflict situations. Amer. sociol. Rev., December, 1956, 21.
8. WILSON, K. V. A distribution-free test of analysis of variance hypotheses. Psychol. Bull., 1956, 53, 96-101.

Received May 2, 1956.
