
NON-PARAMETRIC METHODS IN ANALYSIS OF

EXPERIMENTAL DATA

Rajender Parsad
I.A.S.R.I., Library Avenue, New Delhi 110 012


1. Introduction
In the conventional setup, the analysis of experimental data is based on assumptions such as normality, independence and constant variance of the observations. However, experimental situations may arise where these assumptions, particularly the assumption of normality, are not satisfied. In such situations non-parametric test procedures can be quite useful. Considerable attention has been devoted to developing non-parametric tests for the analysis of experimental data. Most of these non-parametric test procedures are based on rank statistics. Rank statistics have been used in the development of these tests because a statistic based on ranks is
1. distribution free,
2. easy to calculate, and
3. simple to explain and understand.

Another reason for using rank statistics is the well-known result that the average rank approaches normality quickly as n (the number of observations) increases, under rather general conditions, while the same might not be true for the original data {see e.g. Conover and Iman (1976)}. The non-parametric test procedures available in the literature cover completely randomized designs, randomized complete block designs, balanced incomplete block designs, designs for bioassays, split plot designs, cross-over designs and so on. For excellent and elaborate discussions on non-parametric tests in the analysis of experimental data, one may refer to Sen (1996), Deshpande, Gore and Shanubhogue (1995) and Hollander and Wolfe (1999).

In the present talk we shall concentrate on non-parametric test procedures for the analysis of one-way and two-way classified data. Some of these procedures are:
1. Kruskal-Wallis Test
2. Friedman Test
3. Durbin Test
4. Skillings and Mack Test
5. Gore test for Multiple Observations per Plot

In the sequel we describe each of these tests in brief.

2. Non-parametric Procedures for the Analysis of Experimental Data
In this section, we shall describe the non-parametric tests for the analysis of experimental data generated through completely randomized designs (CRD) and block designs. We begin with the test procedure for experimental data generated through a CRD.




2.1. Kruskal-Wallis Test
This test is used for testing the equality of several independent samples. Hence, it is quite useful for the analysis of experimental data generated through a completely randomized design. Consider v treatments, with the i-th treatment replicated r_i times, i = 1, 2, ..., v, such that N = \sum_{i=1}^{v} r_i. The data generated through a CRD can be represented by the usual one-way classified linear model

y_{ij} = \mu + \tau_i + \epsilon_{ij},   i = 1, 2, ..., v;  j = 1, 2, ..., r_i,

where y_{ij} is the yield or response of the j-th replication of the i-th treatment, \mu is the general mean, \tau_i is the i-th treatment effect, and \epsilon_{ij} is the error associated with the j-th observation on the i-th treatment.

The experimenter is interested in testing the equality of the treatment effects against the alternative that at least two of the treatment effects are not equal. In other words, we want to test the null hypothesis H_0: \tau_1 = \tau_2 = ... = \tau_v = \tau (say) against the alternative H_1: at least two of the \tau_i's are different.

To test the above null hypothesis using the Kruskal-Wallis statistic, we rank all N observations, giving rank 1 to the smallest observation and rank N to the largest. Once this is done, we obtain the sum and the average of the ranks of the observations pertaining to each treatment. Now, if the treatment effects are equal, the average ranks are expected to be about the same; the differences, if any, are due to sampling fluctuations. The Kruskal-Wallis statistic is based on an assessment of the differences among the average ranks. This may be explained as below:
Let R_{ij} be the rank of y_{ij}, i = 1, 2, ..., v; j = 1, 2, ..., r_i. Let R_i = \sum_{j=1}^{r_i} R_{ij} (the sum of the ranks of the observations pertaining to the i-th treatment) and \bar{R}_i = R_i / r_i (the average of the ranks of the observations pertaining to the i-th treatment), and let \bar{R} = (N+1)/2 be the overall average rank. The Kruskal-Wallis statistic is then given by

T = \frac{12}{N(N+1)} \sum_{i=1}^{v} r_i \left( \bar{R}_i - \bar{R} \right)^2
  = \frac{12}{N(N+1)} \sum_{i=1}^{v} r_i \left( \frac{R_i}{r_i} - \frac{N+1}{2} \right)^2
  = \frac{12}{N(N+1)} \sum_{i=1}^{v} \frac{R_i^2}{r_i} - 3(N+1).

The method of determining the significance of an observed value of T depends on the number of treatments (v) and their replications (r_i). The null distribution (the distribution under the null hypothesis) of T for v = 3 and r_i \le 5 is extensively tabulated and available in several texts. For ready reference, the table has been included in Appendix-I of the present text. In other cases, under the null hypothesis, T may be approximated by the \chi^2 distribution with (v-1) degrees of freedom.

Remark 2.1: When ties occur between two or more observations, each observation is given the mean of the ranks for which it is tied. The T-statistic corrected for the effect of ties is computed by the formula

T = \frac{ \dfrac{12}{N(N+1)} \sum_{i=1}^{v} \dfrac{R_i^2}{r_i} - 3(N+1) }{ 1 - \sum_{s=1}^{g} (t_s^3 - t_s) / (N^3 - N) }
where g is the number of groups of tied ranks and t_s is the number of tied observations in the s-th group. The effect of correcting for ties is to increase the value of T and thus to make the result more significant than it would have been had no correction been made. Therefore, if one is able to reject the null hypothesis without making the correction, one will be able to reject H_0 at an even more stringent level of significance if the correction is used.
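As a small numerical sketch of this remark (the data below are hypothetical illustration values, not from the text):

```python
# Tie-corrected Kruskal-Wallis statistic (Remark 2.1).
# The three samples below are hypothetical illustration values.
samples = [[3.1, 3.4, 3.4], [3.6, 3.9, 3.4], [4.2, 4.2, 4.5]]

obs = sorted(x for s in samples for x in s)
N = len(obs)
# mid-rank of a value: average of the 1-based positions where it occurs
rank = {v: sum(i + 1 for i, o in enumerate(obs) if o == v) / obs.count(v)
        for v in set(obs)}

# uncorrected statistic: T = 12/(N(N+1)) * sum R_i^2/r_i - 3(N+1)
T = 12 / (N * (N + 1)) * sum(
    sum(rank[x] for x in s) ** 2 / len(s) for s in samples) - 3 * (N + 1)

# correction divisor 1 - sum(t_s^3 - t_s)/(N^3 - N) over groups of ties
ties = [obs.count(v) for v in set(obs) if obs.count(v) > 1]
T_corrected = T / (1 - sum(t ** 3 - t for t in ties) / (N ** 3 - N))
print(round(T, 3), round(T_corrected, 3))   # 6.489 6.771
```

As the remark states, the corrected value is always at least as large as the uncorrected one.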

The Kruskal-Wallis statistic may be derived with the help of the Mann-Whitney-Wilcoxon test for two independent samples, corresponding to 2 treatments replicated r_1 and r_2 times respectively. Some steps of the derivation are listed below:

Suppose y_{11}, y_{12}, ..., y_{1r_1} and y_{21}, y_{22}, ..., y_{2r_2} are two random samples of sizes r_1 and r_2 respectively, such that r_1 + r_2 = N. Let us order the combined sample from smallest (rank 1) to largest (rank r_1 + r_2) and let

R(y_{1k}) = rank of y_{1k} in the combined sample,   k = 1, 2, ..., r_1.

Then 1 \le R(y_{1k}) \le r_1 + r_2 for k = 1, 2, ..., r_1. Under H_0 we expect the r_1 ranks of the y_1's, namely R(y_{11}), R(y_{12}), ..., R(y_{1r_1}), to be randomly spread out among the ranks 1, 2, ..., r_1 + r_2. The test is based on the rank sum statistic

T_1 = \sum_{k=1}^{r_1} R(y_{1k}).
Now assume that the sample with fewer observations is the first sample (y_1), so that r_1 \le r_2. Next note that T_1 attains its least value when the y_1's occupy the ranks 1, 2, ..., r_1 and its maximum value when they occupy the ranks r_2 + 1, r_2 + 2, ..., r_2 + r_1. It follows that

\frac{r_1(r_1 + 1)}{2} \le T_1 \le \sum_{l=1}^{r_1} (r_2 + l) = \frac{r_1(2r_2 + r_1 + 1)}{2}.
Finally, under H_0, T_1 has a symmetric distribution about its mean r_1(r_1 + r_2 + 1)/2. It follows that a left-tail cumulative probability equals the corresponding right-tail cumulative probability. For larger values of r_1 or r_2 we use the fact that

E(T_1) = \frac{r_1(r_1 + r_2 + 1)}{2}   and   Var(T_1) = \frac{r_1 r_2 (r_1 + r_2 + 1)}{12}.

Then, for large r_1 + r_2,

Z = \frac{T_1 - r_1(r_1 + r_2 + 1)/2}{\sqrt{r_1 r_2 (r_1 + r_2 + 1)/12}}

has approximately a standard normal distribution, so that
Z^2 = \frac{12}{r_1 r_2 (r_1 + r_2 + 1)} \left( T_1 - \frac{r_1(r_1 + r_2 + 1)}{2} \right)^2

has approximately a \chi^2 distribution with one degree of freedom. Writing R_1 = T_1, note that R_2 = N(N+1)/2 - R_1; after simplification we can then write

Z^2 = \frac{12}{N(N+1)} \left[ r_1 \left( \frac{R_1}{r_1} - \frac{N+1}{2} \right)^2 + r_2 \left( \frac{R_2}{r_2} - \frac{N+1}{2} \right)^2 \right].

Generalizing this equation to v (> 2) treatments, with the i-th treatment replicated r_i times such that \sum_{i=1}^{v} r_i = N, we get the T statistic mentioned above.
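The mean and variance formulas for T_1 can be verified exactly by enumerating its null distribution for small samples:

```python
from itertools import combinations

# Exact null distribution of the rank sum T1 for small r1, r2: under H0,
# every subset of r1 ranks out of {1, ..., N} is equally likely.
r1, r2 = 3, 4
N = r1 + r2
sums = [sum(c) for c in combinations(range(1, N + 1), r1)]

mean = sum(sums) / len(sums)
var = sum((s - mean) ** 2 for s in sums) / len(sums)

# agrees with E(T1) = r1(r1+r2+1)/2 and Var(T1) = r1*r2*(r1+r2+1)/12
print(mean, var)   # 12.0 8.0
```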
Pair-wise Comparisons
If the Kruskal-Wallis test rejects the null hypothesis of equality of the v treatment effects, it indicates that at least two of the treatment effects are unequal. It does not tell the researcher which ones differ from each other. Therefore, a test procedure for making pair-wise comparisons is needed. For this, the null hypothesis H_0: \tau_i = \tau_{i'} against H_1: \tau_i \ne \tau_{i'}, i \ne i' = 1, 2, ..., v, can be tested at the \alpha% level of significance by using the inequality

|\bar{R}_i - \bar{R}_{i'}| \ge z_p \sqrt{ \frac{N(N+1)}{12} \left( \frac{1}{r_i} + \frac{1}{r_{i'}} \right) },   i \ne i' = 1, 2, ..., v,

where p = \alpha / [v(v-1)] and z_p is the quantile of order 1-p of the standard normal distribution. From the above, the least significant difference between treatments i and i' is

c_{ii'} = z_p \sqrt{ \frac{N(N+1)}{12} \left( \frac{1}{r_i} + \frac{1}{r_{i'}} \right) }.

Therefore, if |\bar{R}_i - \bar{R}_{i'}| > c_{ii'}, the difference between the effects of treatments i and i' is considered significant at the \alpha% level of significance. The above procedure is illustrated with the help of the following example.

Example 2.1: An experiment was conducted with 21 animals to determine whether four different feeds give the same distribution of weight gains in experimental animals. Feeds 1, 3 and 4 were each given to 5 randomly selected animals and feed 2 was given to 6 randomly selected animals. The data obtained are presented in the following table.
Feeds Weight gains (kg)
1 3.35 3.8 3.55 3.36 3.81
2 3.79 4.1 4.11 3.95 4.25 4.4
3 4 4.5 4.51 4.75 5
4 3.57 3.82 4.09 3.96 3.82

We use the Kruskal-Wallis test to analyze the above data. We arrange the data in ascending order and assign the ranks 1 to 21 to the observations. The ranks are then arranged feed-wise as given below:

Feeds   Ranks of weight gains    Sum of ranks (R_i)   Average of ranks
1        1  6  3  2  7                  19                  3.80
2        5 14 15 10 16 17               77                 12.83
3       12 18 19 20 21                  90                 18.00
4        4  9 13 11  8                  45                  9.00

Then the Kruskal-Wallis test statistic is obtained as:

T = \frac{12}{21 \times 22} \left[ \frac{(19)^2}{5} + \frac{(77)^2}{6} + \frac{(90)^2}{5} + \frac{(45)^2}{5} \right] - 3 \times 22 = 80.14 - 66 = 14.14.

The tabulated value of \chi^2 with 3 degrees of freedom at the 5% level of significance is 7.815 and the calculated value is 14.14, so we infer that the feed effects differ significantly.
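The Example 2.1 computation is easy to reproduce in a few lines; the sketch below ranks the pooled observations and applies the rank-sum form of T. (Like the hand computation above, it ranks the two tied 3.82 values consecutively; since both fall in feed 4, the rank sums are unaffected.)

```python
# Kruskal-Wallis statistic for the feed data of Example 2.1.
feeds = [
    [3.35, 3.80, 3.55, 3.36, 3.81],
    [3.79, 4.10, 4.11, 3.95, 4.25, 4.40],
    [4.00, 4.50, 4.51, 4.75, 5.00],
    [3.57, 3.82, 4.09, 3.96, 3.82],
]

# pool the observations, sort, and read ranks off the sorted order
pooled = sorted((x, i) for i, feed in enumerate(feeds) for x in feed)
N = len(pooled)
R = [0.0] * len(feeds)                       # per-feed rank sums
for rank, (_, i) in enumerate(pooled, start=1):
    R[i] += rank

# T = 12/(N(N+1)) * sum R_i^2/r_i - 3(N+1)
T = 12 * sum(R[i] ** 2 / len(feeds[i])
             for i in range(len(feeds))) / (N * (N + 1)) - 3 * (N + 1)
print(R, round(T, 2))   # [19.0, 77.0, 90.0, 45.0] 14.14
```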

Pair-wise Comparisons for the Feeds
Here r_1 = r_3 = r_4 = 5; r_2 = 6, N = 21, v = 4, and v(v-1) = 12. Let \alpha = 0.05; then p = 0.05/12 = 0.0041667. For this value of p, z_p = 2.6 (approximately). We compute the c_{ii'} as

c_{12} = c_{32} = c_{42} = 2.6 \sqrt{ \frac{21 \times 22}{12} \left( \frac{1}{5} + \frac{1}{6} \right) } = 9.77,

c_{13} = c_{14} = c_{34} = 2.6 \sqrt{ \frac{21 \times 22}{12} \left( \frac{1}{5} + \frac{1}{5} \right) } = 10.2.

Thus |\bar{R}_1 - \bar{R}_2| = 9.03; |\bar{R}_1 - \bar{R}_3| = 14.2; |\bar{R}_1 - \bar{R}_4| = 5.2; |\bar{R}_2 - \bar{R}_3| = 5.17; |\bar{R}_2 - \bar{R}_4| = 3.83; |\bar{R}_3 - \bar{R}_4| = 9.

Here we see that |\bar{R}_1 - \bar{R}_3| > c_{13}, so the effects of feeds 1 and 3 are significantly different at the 5% level of significance, while all other pairs of feed effects do not differ significantly.
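The least-significant-difference step can be scripted as well; here the quantile z_p is computed exactly from the standard normal inverse CDF rather than read from a table (it comes to about 2.64 instead of the rounded 2.6 used above), and the conclusion is unchanged:

```python
from itertools import combinations
from math import sqrt
from statistics import NormalDist

# Pair-wise least significant differences for Example 2.1.
r = [5, 6, 5, 5]                     # replications of feeds 1..4
Rbar = [3.80, 12.83, 18.00, 9.00]    # average ranks from the table above
N, v, alpha = 21, 4, 0.05

p = alpha / (v * (v - 1))            # 0.05/12
z = NormalDist().inv_cdf(1 - p)      # exact quantile, about 2.64

significant = []
for i, j in combinations(range(v), 2):
    c = z * sqrt(N * (N + 1) / 12 * (1 / r[i] + 1 / r[j]))
    if abs(Rbar[i] - Rbar[j]) > c:
        significant.append((i + 1, j + 1))
print(significant)   # only the pair (1, 3) exceeds its critical difference
```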

2.2 Friedman Test
The Kruskal-Wallis test is useful for data generated through completely randomized designs. A completely randomized design is used when the experimental units are homogeneous. However, there do occur experimental situations where one can find a factor (called a nuisance factor) which, though not of interest to the experimenter, contributes significantly to the variability in the experimental material. The various levels of this factor are used for blocking. For experimental situations where there is only one nuisance factor, block designs are used. The simplest and most commonly used block design among agricultural research workers is the randomized complete block (RCB) design. The problem of non-normality of data may occur in an RCB design as well. The Friedman test is useful for such situations (see Friedman, 1937). Let there be v treatments arranged in N = vb experimental units grouped into b blocks of size v each. Each treatment appears exactly once in each block. The data generated through an RCB design can be analyzed by the following linear model
y_{ij} = \mu + \tau_i + \beta_j + \epsilon_{ij},   i = 1, 2, ..., v;  j = 1, 2, ..., b,

where y_{ij} is the yield (response) of the experimental unit receiving the i-th treatment in the j-th block, \mu is the general mean, \tau_i is the effect of the i-th treatment, \beta_j is the effect of the j-th block, and \epsilon_{ij} is the random error in the response. Now we are interested in testing the equality of the treatment effects. In other words, we want to test the null hypothesis H_0: \tau_1 = \tau_2 = ... = \tau_v = \tau (say) against the alternative H_1: at least two of the \tau_i's are different. For using the Friedman test, we proceed as follows.

Arrange the observations in v rows (treatments) and b columns (blocks). Observations in different columns (blocks) are independent, while the ranks within a column are dependent. Rank all the observations within a column (block), i.e. ranks are assigned separately for each block. Let R_{ij} be the rank of the observation pertaining to the i-th treatment in the j-th block. Then 1 \le R_{ij} \le v. As the ranking has been done within blocks from 1 to v, the sum of the ranks in the j-th block is

R_{.j} = \sum_{i=1}^{v} R_{ij} = \frac{v(v+1)}{2},

so that the mean rank in a block is \bar{R}_{.j} = (v+1)/2 and the variance of a rank is (v^2 - 1)/12. The sum of the ranks for each treatment is

R_i = \sum_{j=1}^{b} R_{ij}.
If the treatment effects are all the same, then we expect each R_i to be close to b(v+1)/2; that is, under H_0,

E(R_i) = \frac{1}{v} \cdot \frac{bv(v+1)}{2} = \frac{b(v+1)}{2}.

The sum of squared deviations of the R_i's from E(R_i) is therefore a measure of the differences in the treatment effects. Let

S = \sum_{i=1}^{v} \left( R_i - \frac{b(v+1)}{2} \right)^2.

The Friedman test statistic is then defined as

T = \frac{12 S}{bv(v+1)} = \frac{12}{bv(v+1)} \sum_{i=1}^{v} \left( R_i - \frac{b(v+1)}{2} \right)^2 = \frac{12}{bv(v+1)} \sum_{i=1}^{v} R_i^2 - 3b(v+1).


The method of determining the probability, when H_0 is true, of an observed value of T depends upon the sizes of v and b. For small values of b and v, the null distribution of T has been tabulated. For ready reference, the table has been included in Appendix-II. For large b and v, the associated probability may be approximated by the \chi^2 distribution with v-1 degrees of freedom.

Remark 2.2: When there are ties among the ranks within any given block, the statistic T must be corrected to account for the change in its sampling distribution. If ties occur, we use the statistic

T = \frac{ 12 \sum_{i=1}^{v} R_i^2 - 3 b^2 v (v+1)^2 }{ bv(v+1) + \dfrac{ bv - \sum_{j=1}^{b} \sum_{s=1}^{g_j} t_{js}^3 }{ v - 1 } } \sim \chi^2_{(v-1)},


where g_j is the number of sets of tied ranks in the j-th block and t_{js} is the size of the s-th set of tied ranks in the j-th block.

Pair-wise Comparisons
When the Friedman test rejects the null hypothesis that all the treatment effects are the same, it is of interest to identify significant differences between pairs of treatments. Therefore, a test procedure for making pair-wise comparisons is needed. The null hypothesis H_0: \tau_i = \tau_{i'} against H_1: \tau_i \ne \tau_{i'}, i \ne i' = 1, 2, ..., v, can be tested at the \alpha% level of significance using the inequality

|R_i - R_{i'}| \ge z_p \sqrt{ \frac{bv(v+1)}{6} }   for all   i \ne i' = 1, 2, ..., v,

where p = \alpha / [v(v-1)] and z_p is the quantile of order 1-p of the standard normal distribution. From the above, the least significant difference between treatments i and i' is

c = z_p \sqrt{ \frac{bv(v+1)}{6} }.

If |R_i - R_{i'}| > c, then the difference between treatments i and i' is considered significant at the \alpha% level of significance. The above procedure is illustrated with the help of the following example:

Example 2.2: An animal feeding experiment involving 8 different rations was laid out in an RCB design using 24 animals in 3 groups of 8 each. The grouping was done on the basis of initial body weight.
Block
Rations 1 2 3
1 138.46 400 461.5
2 387.6 369.2 384.6
3 436.9 430.8 467.7
4 424.6 461.5 421.5
5 430.7 455.3 406.1
6 424.61 384.6 384.5
7 424.59 350.8 307.6
8 360 400.1 400

First we check the normality of the observations, using the Shapiro-Wilk test and the Kolmogorov-Smirnov test. Here H_0: the observations come from a normal population, against H_1: the observations do not come from a normal population.

            Kolmogorov-Smirnov              Shapiro-Wilk
            Statistic   df   Sig.           Statistic   df   Sig.
Residual      .200      24   .014             .873      24   .010

We can see that the observations are non-normal at the 5% level of significance. If we nevertheless carry out the usual ANOVA, we get

Source         DF    Sum of Squares    Mean Square    F Value    Pr > F
Replication     2       3889.9561        1944.9780       0.40     0.6747
Treatment       7      32075.2424        4582.1774       0.95     0.4988
Error          14      67273.4204        4805.2443
Total          23     103238.6190

From this analysis the treatments do not differ significantly even at the 10% level of significance. Now we use the Friedman test for the analysis of the same data. Ranking the data within each block, we get

Rations   Block 1   Block 2   Block 3   Sum of ranks (R_i)
1            1         4         7            12
2            3         2         3             8
3            8         6         8            22
4            5         8         6            19
5            7         7         5            19
6            6         3         2            11
7            4         1         1             6
8            2         5         4            11

Here b = 3 and v = 8, so the Friedman statistic is

T = \frac{12}{3 \times 8 \times 9} \left[ (12)^2 + (8)^2 + (22)^2 + (19)^2 + (19)^2 + (11)^2 + (6)^2 + (11)^2 \right] - 3 \times 3 \times 9 = 94 - 81 = 13.

The tabulated value of \chi^2 with 7 degrees of freedom at the 10% level is 12.02 and the calculated value is 13, so the rations differ significantly at the 10% level of significance. The probability of exceeding the observed value under \chi^2_7 is 0.0721.
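A short script reproduces the within-block ranking and the Friedman statistic for this example:

```python
# Friedman statistic for the ration data of Example 2.2.
# rows = rations, columns = blocks (3 blocks of 8 animals each)
y = [
    [138.46, 400.0, 461.5],
    [387.6, 369.2, 384.6],
    [436.9, 430.8, 467.7],
    [424.6, 461.5, 421.5],
    [430.7, 455.3, 406.1],
    [424.61, 384.6, 384.5],
    [424.59, 350.8, 307.6],
    [360.0, 400.1, 400.0],
]
v, b = len(y), len(y[0])

# rank within each block (column); no ties occur within any block here
R = [0] * v                                   # per-ration rank sums
for j in range(b):
    order = sorted(range(v), key=lambda i: y[i][j])
    for rank, i in enumerate(order, start=1):
        R[i] += rank

# T = 12/(bv(v+1)) * sum R_i^2 - 3b(v+1)
T = 12 * sum(Ri ** 2 for Ri in R) / (b * v * (v + 1)) - 3 * b * (v + 1)
print(R, T)   # [12, 8, 22, 19, 19, 11, 6, 11] 13.0
```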

Pair-wise Comparisons
Here b = 3, v = 8 and v(v-1) = 56. Let \alpha = 0.1; then p = 0.1/56 = 0.0017857, for which z_p = 2.91. We calculate

c = 2.91 \sqrt{ \frac{3 \times 8 \times 9}{6} } = 2.91 \times 6 = 17.5 (approximately).

So the critical difference for the rank sums is about 17.5; if |R_i - R_{i'}| > c, treatments i and i' differ significantly. Here even the largest difference, |R_3 - R_7| = 22 - 6 = 16, does not exceed c, so although the overall Friedman test is significant at the 10% level, this conservative pair-wise procedure does not declare any individual pair of rations significantly different.

2.3 Durbin's Test
In Section 2.2, we discussed the non-parametric analysis of experimental data from an RCB design. In an RCB design, the number of experimental units required in each block is the same as the number of treatments. However, when the number of treatments increases, the blocks become large and it is not possible to maintain homogeneity within blocks. If an experimenter persists with an RCB design, this results in large intra-block variances and hence reduced precision of the treatment comparisons. To circumvent this problem, recourse is made to incomplete block designs. Many a time an experimenter may also have to use an incomplete block design because of the nature of the experimental units. The simplest of the incomplete block designs is the balanced incomplete block (BIB) design. The standard notation for describing a BIB design is given below:
v = the number of treatments
b = the number of blocks
r = the number of replications of each treatment
k = the number of experimental units per block (k < v)
\lambda = the number of blocks in which a given treatment pair occurs together

The model for a BIB design is the same as given in Section 2.2. For non-normal data pertaining to a BIB design, Durbin (1951) proposed a test statistic. We want to test the equality of the treatment effects, i.e. the null hypothesis H_0: \tau_1 = \tau_2 = ... = \tau_v = \tau (say) against the alternative H_1: at least two of the \tau_i's are different.

To test the above hypothesis, rank the observations y_{ij} from 1 to k within each block. Let R_{ij} be the rank of y_{ij}. Then, following the lines of the Friedman test, the Durbin test statistic is given by

T = \frac{12(v-1)}{rv(k-1)(k+1)} \sum_{i=1}^{v} \left( R_i - \frac{r(k+1)}{2} \right)^2 = \frac{12(v-1)}{rv(k-1)(k+1)} \sum_{i=1}^{v} R_i^2 - \frac{3r(v-1)(k+1)}{k-1} \sim \chi^2_{(v-1)}.

The test rejects H_0 if T exceeds the cut-off point, which is obtained by referring to the chi-square distribution with (v-1) degrees of freedom. An exact test can be obtained by rejecting H_0 when T \ge m_\alpha, where some values of m_\alpha are given in Skillings and Mack (1981).

Pair-wise Comparisons
When the Durbin test rejects the null hypothesis that all the treatment effects are the same, it is of interest to identify the significantly different treatments. Therefore, a test procedure for making pair-wise comparisons is needed. The null hypothesis H_0: \tau_i = \tau_{i'} against H_1: \tau_i \ne \tau_{i'}, i \ne i' = 1, 2, ..., v, can be tested at the \alpha% level of significance using

|R_i - R_{i'}| \ge z_p \sqrt{ \frac{vr(k+1)(k-1)}{6(v-1)} },   i \ne i' = 1, 2, ..., v,

where p = \alpha / [v(v-1)] and z_p is the quantile of order 1-p of the standard normal distribution. From the above, the least significant difference between treatments i and i' is

c = z_p \sqrt{ \frac{vr(k+1)(k-1)}{6(v-1)} }.

If |R_i - R_{i'}| > c, then the difference between treatments i and i' is considered significant at the \alpha% level of significance. The above procedure is illustrated with the help of the following example:

Example 2.3: In an experiment to compare the palatability of four varieties of cooked rice, four judges were asked to rank three varieties each. The results are as follows.

Rice variety   Judge 1   Judge 2   Judge 3   Judge 4   Sum of ranks (R_i)
1                 1         2         1         -              4
2                 3         1         -         2              6
3                 2         -         2         1              5
4                 -         3         3         3              9

Here v = 4, b = 4, r = 3, k = 3, so that r(k+1)/2 = 6. Using the Durbin test,

T = \frac{12(4-1)}{3 \times 4 \times (3+1)(3-1)} \left[ (4-6)^2 + (6-6)^2 + (5-6)^2 + (9-6)^2 \right] = \frac{36}{96} \times 14 = 5.25.

The tabulated value of \chi^2 with 3 degrees of freedom at the 5% level of significance is 7.815 and the calculated value is 5.25, so the treatments are not significantly different at the 5% level of significance. The probability of exceeding the observed value under \chi^2_3 is about 0.15.
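The Durbin computation for this example can be sketched as:

```python
# Durbin statistic for the rice palatability data of Example 2.3.
# Ranks given by 4 judges (blocks of size k = 3); None = variety not tasted.
ranks = [
    [1, 2, 1, None],
    [3, 1, None, 2],
    [2, None, 2, 1],
    [None, 3, 3, 3],
]
v, b, r, k = 4, 4, 3, 3

R = [sum(x for x in row if x is not None) for row in ranks]   # rank sums

# T = 12(v-1)/(rv(k-1)(k+1)) * sum (R_i - r(k+1)/2)^2
T = (12 * (v - 1)) / (r * v * (k - 1) * (k + 1)) * sum(
    (Ri - r * (k + 1) / 2) ** 2 for Ri in R)
print(R, T)   # [4, 6, 5, 9] 5.25
```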

2.4 Skillings and Mack Test
In some experimental situations the use of a BIB design may not be feasible, and one may have to use a partially balanced incomplete block (PBIB) design; in some situations even a non-proper block design may be useful. Skillings and Mack (1981) proposed a Friedman-type test statistic that is useful for any binary block design. Consider a binary block design in which v treatments are arranged in b blocks such that the j-th block contains k_j \ge 2 distinct treatments and the i-th treatment is replicated r_i times, i = 1, 2, ..., v; j = 1, 2, ..., b. For the analysis of experimental data generated through a binary block design, we make use of the statistic given by Skillings and Mack (1981). To compute the test statistic, we find adjusted treatment sums for the ranked data. For this we proceed as follows:

1. Within each block, rank the observations from 1 to k_j, where k_j is the number of treatments present in the j-th block.
2. Let R_{ij} be the rank of y_{ij} if the observation is present; otherwise, let R_{ij} = (k_j + 1)/2.
3. Compute an adjusted treatment sum for the i-th treatment, namely

A_i = \sum_{j=1}^{b} \left[ \frac{12}{k_j + 1} \right]^{1/2} \left[ R_{ij} - \frac{k_j + 1}{2} \right].
Let \Sigma_0 denote the covariance matrix of the random vector A' = (A_1, ..., A_v). The covariance structure of the R_{ij}'s is well known, and here only minor modifications are required because of the missing cells. In block j, under H_0: \tau_1 = \tau_2 = ... = \tau_v = \tau (say), we have

Var(R_{ij}) = (k_j - 1)(k_j + 1)/12 if treatment i is present in block j, and 0 otherwise;
Cov(R_{ij}, R_{i'j'}) = -(k_j + 1)/12 if j = j', i \ne i', n_{ij} = 1 and n_{i'j} = 1, and 0 otherwise,

where n_{ij} equals one if treatment i appears in block j and zero otherwise. Thus

Var(A_i) = \sum_{j=1}^{b} (k_j - 1) n_{ij},   i = 1, 2, ..., v,

Cov(A_i, A_{i'}) = -\sum_{j=1}^{b} n_{ij} n_{i'j},   1 \le i \ne i' \le v.

Defining \lambda_{ii'} to be the number of blocks containing observations for both treatments i and i', it can be seen by inspection that under H_0 the elements ((\sigma_{ii'})) of \Sigma_0 can be written as

\sigma_{ii'} = -\lambda_{ii'},   1 \le i \ne i' \le v,
\sigma_{ii} = \sum_{i' \ne i} \lambda_{ii'},   i = 1, 2, ..., v.
Thus the elements of \Sigma_0 are simple to obtain: the off-diagonal elements are -\lambda_{ii'}, and each diagonal element is the negative of the sum of the off-diagonal elements in its row. Note that the covariance matrix \Sigma_0 is singular, because its rows (and columns) always sum to zero. For any connected block design, the rank of \Sigma_0 is v - 1. The test statistic is of the form

T = A' \Sigma_0^- A,

where \Sigma_0^- is a generalized inverse of \Sigma_0. T follows an approximate \chi^2 distribution with degrees of freedom equal to the rank of \Sigma_0. The above procedure is illustrated with the help of the following example:


Example 2.4: An engineer is studying the mileage performance characteristics of 4 types of gasoline additives. In the road test he wishes to use cars as blocks (9 cars). The results are as follows ('-' denotes a missing observation).

Gasoline                       Car (Block)
Additives    1     2     3     4     5     6     7     8     9
A           3.2   3.1   4.3   3.5   3.6   4.5    -    4.3   3.5
B           4.1   3.9   3.5   3.6   4.2   4.7   4.2   4.6    -
C           3.8   3.4   4.6   3.9   3.7   3.7   3.4   4.4   3.7
D           4.2   4.0   4.8   4.0   3.9    -     -    4.9   3.9

Now we rank the data within each block. If an observation is not present, we use R_{ij} = (k_j + 1)/2. We get

Gasoline                       Car (Block)
Additives    1     2     3     4     5     6     7     8     9
A            1     1     2     1     1     2    1.5*   1     1
B            3     3     1     2     4     3     2     3     2*
C            2     2     3     3     2     1     1     2     2
D            4     4     4     4     3     2*   1.5*   4     3
* denotes the mid-rank (k_j + 1)/2 substituted for a missing observation.

Now we calculate the adjusted treatment sums (A_i):

         Block
       1         2         3         4         5         6        7        8         9        A_i
A   -2.3238   -2.3238   -0.7746   -2.3238   -2.3238    0.0000   0.0000  -2.3238   -1.7321   -14.1256
B    0.7746    0.7746   -2.3238   -0.7746    2.3238    1.7321   1.0000   0.7746    0.0000     4.2812
C   -0.7746   -0.7746    0.7746    0.7746   -0.7746   -1.7321  -1.0000  -0.7746    0.0000    -4.2812
D    2.3238    2.3238    2.3238    2.3238    0.7746    0.0000   0.0000   2.3238    1.7321    14.1256

So A = (-14.1256, 4.2812, -4.2812, 14.1256)'.

The covariance matrix \Sigma_0 is obtained by counting the number of times each treatment pair occurs together (the off-diagonal entries are -\lambda_{ii'}). With the treatments ordered (A, B, C, D),

\Sigma_0 =
[  22   -7   -8   -7 ]
[  -7   21   -8   -6 ]
[  -8   -8   23   -7 ]
[  -7   -6   -7   20 ]

A generalized inverse is obtained by inverting the 3 x 3 principal submatrix for (A, B, C) and bordering it with zeros:

\Sigma_0^- = (1/5851)
[ 419  225  224   0 ]
[ 225  442  232   0 ]
[ 224  232  413   0 ]
[   0    0    0   0 ]

Now,

T = A' \Sigma_0^- A = 15.49.

The tabulated value of \chi^2 with 3 degrees of freedom at the 5% level of significance is 7.815 and the calculated value is 15.49, so the treatments are significantly different at the 5% level of significance. The probability of exceeding the observed value under \chi^2_3 is 0.0014.
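The whole Skillings-Mack calculation, including the generalized inverse, can be scripted with numpy, using the Moore-Penrose pseudoinverse as the generalized inverse (the quadratic form A' \Sigma_0^- A is the same for any choice of generalized inverse when A lies in the row space of \Sigma_0):

```python
import numpy as np

# Skillings-Mack statistic for the gasoline-additive data of Example 2.4.
# Within-block ranks; missing cells already hold the mid-rank (k_j + 1)/2.
ranks = np.array([
    [1, 1, 2, 1, 1, 2, 1.5, 1, 1],     # A (missing in block 7)
    [3, 3, 1, 2, 4, 3, 2,   3, 2],     # B (missing in block 9)
    [2, 2, 3, 3, 2, 1, 1,   2, 2],     # C
    [4, 4, 4, 4, 3, 2, 1.5, 4, 3],     # D (missing in blocks 6 and 7)
])
present = np.ones_like(ranks, dtype=bool)
present[0, 6] = present[1, 8] = present[3, 5] = present[3, 6] = False

k = present.sum(axis=0)                          # treatments per block
A = (np.sqrt(12 / (k + 1)) * (ranks - (k + 1) / 2)).sum(axis=1)

# Sigma_0: off-diagonal -lambda_ii' (co-occurrence counts),
# diagonal sum of (k_j - 1) over the blocks containing treatment i
n = present.astype(int)
Sigma = -(n @ n.T).astype(float)
np.fill_diagonal(Sigma, (n * (k - 1)).sum(axis=1))

T = A @ np.linalg.pinv(Sigma) @ A                # about 15.49
```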

The statistic is quite general, and the commonly used Friedman test statistic and Durbin test statistic discussed in Sections 2.2 and 2.3 respectively are particular cases of this test. For example, in an RCB design all n_{ij} = 1, all \lambda_{ii'} = b and k_j = v; therefore, for an RCB design

\Sigma_0 = vb \left( I_v - v^{-1} 1_v 1_v' \right)   and   A_i = \left[ \frac{12}{v+1} \right]^{1/2} \sum_{j=1}^{b} \left( R_{ij} - \frac{v+1}{2} \right).

Substituting these in T, we get

T = (bv)^{-1} \sum_{i=1}^{v} A_i^2 = \frac{12}{bv(v+1)} \sum_{i=1}^{v} R_i^2 - 3b(v+1).

Now, in the case of a BIB design, all blocks have k < v observations and the number of blocks in which any pair of treatments occurs together is \lambda. Therefore, for a BIB design \Sigma_0 = (r(k-1) + \lambda) I_v - \lambda 1_v 1_v' = \lambda v (I_v - v^{-1} 1_v 1_v'), and the T statistic reduces to

T = (\lambda v)^{-1} \sum_{i=1}^{v} A_i^2 = \frac{12(v-1)}{rv(k-1)(k+1)} \sum_{i=1}^{v} R_i^2 - \frac{3r(v-1)(k+1)}{k-1},

since \lambda(v-1) = r(k-1).
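The RCB reduction can be checked numerically against Example 2.2: computing T as (bv)^{-1} \sum A_i^2 from the within-block ranks reproduces the Friedman value of 13.

```python
from math import sqrt

# Check that the Skillings-Mack form reduces to the Friedman statistic
# for the complete-block ranks of Example 2.2 (v = 8 rations, b = 3 blocks).
ranks = [
    [1, 4, 7], [3, 2, 3], [8, 6, 8], [5, 8, 6],
    [7, 7, 5], [6, 3, 2], [4, 1, 1], [2, 5, 4],
]
v, b = len(ranks), 3

# A_i = sqrt(12/(v+1)) * sum_j (R_ij - (v+1)/2)
A = [sqrt(12 / (v + 1)) * sum(r - (v + 1) / 2 for r in row) for row in ranks]
T = sum(a * a for a in A) / (b * v)
print(round(T, 6))   # 13.0, the Friedman statistic of Example 2.2
```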

2.5 Gore Test for Multiple Observations per Plot
The tests discussed above can be used for general block designs where only one observation per plot is available. However, sometimes more than one observation is available from each cell. The appropriate model for such a situation is

y_{ijk} = \mu + \tau_i + \beta_j + \epsilon_{ijk},   i = 1, 2, ..., v;  j = 1, 2, ..., b;  k = 1, 2, ..., n_{ij},

where y_{ijk} is the k-th observation in the (i, j)-th cell, n_{ij} is the number of observations in the (i, j)-th cell, that is, the number of experimental units receiving the i-th treatment in the j-th block, and the \epsilon_{ijk} are independent errors that follow a continuous distribution with zero median. We are interested in testing the equality of the treatment effects, i.e. H_0: \tau_1 = \tau_2 = ... = \tau_v = \tau (say) against the alternative that at least two of the \tau_i's are different.
Now let N = \sum_{i=1}^{v} \sum_{j=1}^{b} n_{ij} denote the total number of observations, and define

p_{ij} = \frac{n_{ij}}{N},   q_{ij} = \frac{1}{p_{ij}},   q_{i.} = \sum_{j=1}^{b} q_{ij},   \frac{1}{q_{..}^*} = \sum_{i=1}^{v} \frac{1}{q_{i.}}.
Consider a pair of plots (i, j) and (i', j); they are in the same block j. We can form n_{ij} n_{i'j} pairs of observations by taking one observation from each of these cells. Let u_{i,i',j} be the proportion of these pairs in which the observation from plot (i, j) is the larger of the two. Then define

u_i = \sum_{i' \ne i}^{v} \sum_{j=1}^{b} u_{i,i',j}.

Using these, the test statistic is

T = \frac{12N}{v^2} \left[ \sum_{i=1}^{v} \frac{ \left( u_i - b(v-1)/2 \right)^2 }{q_{i.}} - q_{..}^* \left( \sum_{i=1}^{v} \frac{ u_i - b(v-1)/2 }{q_{i.}} \right)^2 \right].
The test rejects H_0 if T is greater than the upper cut-off point of the chi-square distribution with (v-1) degrees of freedom. The above procedure is illustrated with the help of the following example:

Example 2.5: The following table gives the number of days to maturity for three varieties of a cereal crop grown under two soil conditions.




Soil type
Variety Light Heavy
A 130, 115, 123, 142 117, 125, 139
B 108, 114, 124, 106 91, 111, 110
C 155, 146, 151, 165 97, 108

In this data set n_{11} = 4, n_{12} = 3, n_{21} = 4, n_{22} = 3, n_{31} = 4, n_{32} = 2, and v = 3, b = 2, N = 20. Now

p_{11} = p_{21} = p_{31} = 4/20 = 0.2;   p_{12} = p_{22} = 3/20 = 0.15;   p_{32} = 2/20 = 0.1.

Then

q_{11} = q_{21} = q_{31} = 1/0.2 = 5;   q_{12} = q_{22} = 1/0.15 = 6.67;   q_{32} = 1/0.1 = 10,
q_{1.} = 5 + 6.67 = 11.67;   q_{2.} = 5 + 6.67 = 11.67;   q_{3.} = 5 + 10 = 15,

\frac{1}{q_{..}^*} = \frac{1}{11.67} + \frac{1}{11.67} + \frac{1}{15} = 0.2381,   so   q_{..}^* = 4.2.

Computation of u_{i,i',j}: Let us first compute u_{1,2,1}. Here there are 16 (4 x 4) comparisons altogether. Take each observation in cell (1,1) and compare it with the 4 observations in cell (2,1). The observation 130 is bigger than all 4 values in cell (2,1), hence it contributes 4. Similarly, 115 contributes 3, 123 contributes 3 and 142 contributes 4. The total is 14, so u_{1,2,1} = 14/16.

Similarly u_{1,2,2} = 9/9; u_{1,3,1} = 0; u_{1,3,2} = 1; u_{2,3,1} = 0; u_{2,3,2} = 4/6; u_{2,1,1} = 2/16; u_{2,1,2} = 0; u_{3,1,1} = 1; u_{3,1,2} = 0; u_{3,2,1} = 1; u_{3,2,2} = 2/6.
Hence

u_1 = 14/16 + 9/9 + 0 + 1 = 46/16;   u_2 = 2/16 + 0 + 0 + 4/6 = 76/96;   u_3 = 1 + 0 + 1 + 2/6 = 14/6.

With b(v-1)/2 = 2, the test statistic becomes

T = \frac{12 \times 20}{3^2} \left[ (0.0656 + 0.1251 + 0.0074) - 4.2 \times (0.075 - 0.1036 + 0.0222)^2 \right] = 5.28.

The tabulated value of \chi^2 with 2 degrees of freedom at the 5% level of significance is 5.991 and the calculated value is 5.28, so the varieties of the cereal crop are not significantly different at the 5% level of significance. The probability of exceeding the observed value under \chi^2_2 is 0.0714.
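The bookkeeping for the u_{i,i',j} quantities is error-prone by hand; the short script below follows the definitions above and reproduces T = 5.28.

```python
# Gore statistic for the days-to-maturity data of Example 2.5.
# cells[i][j] = observations for variety i in soil type j
cells = [
    [[130, 115, 123, 142], [117, 125, 139]],   # variety A
    [[108, 114, 124, 106], [91, 111, 110]],    # variety B
    [[155, 146, 151, 165], [97, 108]],         # variety C
]
v, b = 3, 2
N = sum(len(c) for row in cells for c in row)

# u_i: over all blocks and rival varieties, the proportion of cross-cell
# pairs in which the observation from cell (i, j) is the larger one
u = [sum(sum(x > y for x in cells[i][j] for y in cells[i2][j])
         / (len(cells[i][j]) * len(cells[i2][j]))
         for j in range(b) for i2 in range(v) if i2 != i)
     for i in range(v)]

q = [sum(N / len(cells[i][j]) for j in range(b)) for i in range(v)]  # q_i.
qstar = 1 / sum(1 / qi for qi in q)                                   # q_..*

d = [ui - b * (v - 1) / 2 for ui in u]
T = 12 * N / v ** 2 * (sum(di ** 2 / qi for di, qi in zip(d, q))
                       - qstar * sum(di / qi for di, qi in zip(d, q)) ** 2)
print(round(T, 2))   # 5.28
```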

Considerable effort has also been made at IASRI to develop non-parametric test procedures for the analysis of groups of experiments conducted in randomized block designs and split plot designs. For details one may refer to Rai and Rao (1980, 1984).

Acknowledgements: The help received from Sh. Ajeet Kumar, M.Sc. (Agricultural
Statistics) student during the preparation of this lecture note is duly acknowledged.

References
Conover, W.J. and Iman, R.L. (1976). On Some Alternative Procedures Using Ranks for the Analysis of Experimental Designs. Commun. Statist. Theor. Meth., A5(14), 1349-1368.
Deshpande, J.V., Gore, A.P. and Shanubhogue, A. (1995). Statistical Analysis of Non-normal Data. Wiley Eastern Limited, New Delhi.
Durbin, J. (1951). Incomplete Blocks in Ranking Experiments. Brit. J. Statist. Psych., 4, 84-90.
Friedman, M. (1937). The Use of Ranks to Avoid the Assumption of Normality Implicit in the Analysis of Variance. J. Amer. Statist. Assoc., 32, 675-701.
Gore, A.P. (1975). Some Non-parametric Tests and Selection Procedures for Main Effects in Two-way Layouts. Annals of the Institute of Statistical Mathematics, 27, 487-500.
Gore, A.P. and Shanubhogue, A. (1988). A Note on Rank Analysis of Split Plot Experiments. Journal of the Indian Society of Agricultural Statistics, XL(3), 178-184.
Hollander, M. and Wolfe, D.A. (1999). Nonparametric Statistical Methods, 2nd Ed. New York: John Wiley.
Kruskal, W.H. (1952). A Nonparametric Test for the Several Sample Problem. Ann. Math. Statist., 23, 525-540.
Kumar, Ajeet (2002). Non-parametric Methods in Analysis of Experimental Data. Course seminar delivered during January-April, 2002.
Rai, S.C. and Rao, P.P. (1980). Use of Ranks in Groups of Experiments. J. Indian Soc. Agril. Statist., 32, 25-32.
Rai, S.C. and Rao, P.P. (1984). Rank Analysis of Groups of Split-Plot Experiments. J. Indian Soc. Agril. Statist., 36, 105-113.
Sen, P.K. (1996). Design and Analysis of Experiments: Nonparametric Methods with Applications to Clinical Trials. In Handbook of Statistics, Vol. 13, eds. S. Ghosh and C.R. Rao, 91-150.
Siegel, S. and Castellan, N.J. Jr. (1988). Nonparametric Statistics for the Behavioural Sciences, 2nd Ed. McGraw Hill, New York.
Skillings, J.H. and Mack, G.A. (1981). On the Use of a Friedman-Type Statistic in Balanced and Unbalanced Block Designs. Technometrics, 23(2), 171-177.

Appendix-I

Kruskal-Wallis Test Statistic

Each table entry is the smallest value of the Kruskal-Wallis T whose right-tail probability is less than or equal to the value given in the top row, for v = 3 and each sample size less than or equal to five (a dash indicates that the level cannot be attained).

r1   r2   r3    0.100   0.050   0.020   0.010   0.001
2 2 2 4.571 - - - -
3 2 1 4.286 - - - -
3 2 2 4.500 4.714 - - -
3 3 1 4.571 5.143 - - -
3 3 2 4.556 5.361 6.250 - -
3 3 3 4.622 5.600 6.489 7.200 -
4 2 1 4.500 - - - -
4 2 2 4.458 5.333 6.000 - -
4 3 1 4.056 5.208 - - -
4 3 2 4.511 5.444 6.144 6.444 -
4 3 3 4.709 5.791 6.564 6.745 -
4 4 1 4.167 4.967 6.667 6.667 -
4 4 2 4.555 5.455 6.600 7.036 -
4 4 3 4.545 5.598 6.712 7.144 8.909
4 4 4 4.654 5.692 6.962 7.654 9.269
5 2 1 4.200 5.000 - - -
5 2 2 4.373 5.160 6.000 6.533 -
5 3 1 4.018 4.960 6.044 - -
5 3 2 4.651 5.250 6.124 6.909 -
5 3 3 4.533 5.648 6.533 7.079 8.727
5 4 1 3.987 4.985 6.431 6.955 -
5 4 2 4.541 5.273 6.505 7.205 8.591
5 4 3 4.549 5.656 6.676 7.445 8.795
5 4 4 4.668 5.657 6.953 7.760 9.168
5 5 1 4.109 5.127 6.145 7.309 -
5 5 2 4.623 5.338 6.446 7.338 8.938
5 5 3 4.545 5.705 6.866 7.578 9.284
5 5 4 4.523 5.666 7.000 7.823 9.606
5 5 5 4.560 5.780 7.220 8.000 9.920

Appendix-II

Friedman Test Statistic

Critical values for the Friedman two-way analysis of variance by ranks statistic T (the rows labelled \infty give the limiting \chi^2_{v-1} values)

v    b       .10     .05     .01
3    3       6.00    6.00    ---
3    4       6.00    6.50    8.00
3    5       5.20    6.40    8.40
3    6       5.33    7.00    9.00
3    7       5.43    7.14    8.86
3    8       5.25    6.25    9.00
3    9       5.56    6.22    8.67
3    10      5.00    6.20    9.60
3    11      4.91    6.54    8.91
3    12      5.71    6.17    8.67
3    13      4.77    6.00    9.39
3    \infty  4.61    5.99    9.21

4    2       6.00    6.00    ---
4    3       6.60    7.40    8.60
4    4       6.30    7.80    9.60
4    5       6.36    7.80    9.96
4    6       6.40    7.60   10.00
4    7       6.26    7.80   10.37
4    8       6.30    7.50   10.35
4    \infty  6.25    7.82   11.34

5    3       7.47    8.53   10.13
5    4       7.60    8.80   11.00
5    5       7.68    8.96   11.52
5    \infty  7.78    9.49   13.28
