
Statistical Method and Advanced Application

SPEARMAN RANK CORRELATION COEFFICIENT

SMALL SAMPLE

LARGE SAMPLE APPROXIMATION

SMALL SAMPLE

Spearman Rank Correlation Coefficient

Learning Outcome

Students should be able to assess the relationship between two variables by using the Spearman rank correlation coefficient.

Assumptions

a) The data consist of a random sample of n pairs of numeric or non-numeric observations.
b) Each pair of observations represents two measurements taken on the same object or individual, called the unit of association.

Procedures
1) If the data consist of observations from a bivariate population, we designate the n pairs of observations (X1, Y1), (X2, Y2), ..., (Xn, Yn).
2) Each X is ranked relative to all other observed values of X, from smallest to largest in order of magnitude. The rank of the ith value of X is denoted by R(Xi), and R(Xi) = 1 if Xi is the smallest observed value of X.
3) Each Y is ranked relative to all other observed values of Y, from smallest to largest in order of magnitude. The rank of the ith value of Y is denoted by R(Yi), and R(Yi) = 1 if Yi is the smallest observed value of Y.
4) If ties occur among the Xs or among the Ys, each tied value is assigned the mean of the rank positions for which it is tied.
5) If the data consist of nonnumeric observations, they must be capable of being ranked as described.
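The ranking steps, including the mean-rank rule for ties, can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation; the function name `average_ranks` and the sample values are made up for the example.

```python
def average_ranks(values):
    """Rank values from smallest (rank 1) to largest.

    Tied values each receive the mean of the rank positions they occupy.
    """
    # Indices of the observations, ordered by increasing value.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        # Extend 'end' over the run of observations tied with the one at 'pos'.
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        # Rank positions are 1-based: the run occupies pos+1 .. end+1.
        mean_rank = (pos + 1 + end + 1) / 2
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


# Hypothetical data: no ties, then one tie between the two largest values.
ranks_no_ties = average_ranks([0, 8, 2, 5])
ranks_with_tie = average_ranks([140.2, 131.8, 140.2, 135.7])
print(ranks_no_ties)   # [1.0, 4.0, 2.0, 3.0]
print(ranks_with_tie)  # [3.5, 1.0, 3.5, 2.0]
```

In the second call the two values 140.2 occupy rank positions 3 and 4, so each is assigned (3 + 4)/2 = 3.5.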

Hypotheses
Case A (two-sided)
H0: X and Y are independent ($\rho_s = 0$).
H1: X and Y are either directly or inversely related ($\rho_s \neq 0$).

Case B (one-sided)
H0: X and Y are independent ($\rho_s = 0$).
H1: There is a direct relationship between X and Y ($\rho_s > 0$).

Case C (one-sided)
H0: X and Y are independent ($\rho_s = 0$).
H1: There is an inverse relationship between X and Y ($\rho_s < 0$).

Test Statistic
1) Rank all sample observations from smallest to largest, ranking the X values and the Y values separately.
2) Find di and di^2, where di = R(Xi) - R(Yi) is the difference between the ranks given to Xi and Yi.
3) Sum the squared rank differences over all n pairs of observations to obtain $\sum d_i^2$.
4) The test statistic is

$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

5) rs measures the strength of the relationship between the sample X and Y values and serves as an estimate of the strength of the relationship between X and Y in the sampled population.
6) When the rank of X is the same as the rank of Y for every pair of observations (perfect direct relationship), all the differences di are equal to zero, and rs is equal to +1.
7) Kendall (T3)* has shown that in general rs = -1 when the rank of one variable within each pair of observations (Xi, Yi) is the reverse of the other (perfect inverse relationship).
*Kendall, M.G., Rank Correlation Methods, fourth edition, London: Griffin, 1970.

Thus if [R(X) = 1, R(Y) = n], [R(X) = 2, R(Y) = n-1], ..., [R(X) = n, R(Y) = 1] for n pairs of observations, rs = -1. This may be illustrated by means of a simple example. Suppose we have the following pairs of observations (Xi, Yi): (0, 10), (8, 3), (2, 9), (5, 6). The ranks are

Xi    R(Xi)    Yi    R(Yi)
0     1        10    4
8     4        3     1
2     2        9     3
5     3        6     2

So $\sum d_i^2 = (-3)^2 + (3)^2 + (-1)^2 + (1)^2 = 20$. Then $r_s = 1 - \frac{6(20)}{4(16 - 1)} = -1$.
8) Kendall also shows that rs can never be greater than +1 or less than -1.
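The small example can be checked numerically. The sketch below computes rs from the rank-difference formula for the four pairs (0, 10), (8, 3), (2, 9), (5, 6); the helper names `average_ranks` and `spearman_rs` are chosen for illustration only.

```python
def average_ranks(values):
    """Ranks from smallest (1) to largest; ties get the mean of their positions."""
    ordered = sorted(values)
    # ordered.index(v) is the 0-based first position of v, ordered.count(v) the
    # number of ties, so the mean 1-based position is index + (count + 1) / 2.
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


def spearman_rs(x, y):
    """Spearman rs = 1 - 6 * sum(d_i^2) / (n * (n^2 - 1))."""
    n = len(x)
    rank_x, rank_y = average_ranks(x), average_ranks(y)
    sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
    return 1 - 6 * sum_d2 / (n * (n ** 2 - 1))


x = [0, 8, 2, 5]
y = [10, 3, 9, 6]
rs_inverse = spearman_rs(x, y)  # ranks of Y are the reverse of the ranks of X
rs_direct = spearman_rs(x, x)   # identical ranks, so every d_i is zero
print(rs_inverse)  # -1.0
print(rs_direct)   # 1.0
```

The two calls reproduce the limiting cases described in the text: reversed ranks give rs = -1 and identical ranks give rs = +1.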

Decision Rule
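Case A (two-sided): Reject H0 if $r_s > r_{\alpha/2}$ or $r_s < -r_{\alpha/2}$.
Case B (one-sided, positive): Reject H0 if $r_s > r_{\alpha}$.
Case C (one-sided, negative): Reject H0 if $r_s < -r_{\alpha}$.

Here $r_{\alpha/2}$ and $r_{\alpha}$ denote the tabulated critical values of rs for the given sample size n.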
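As a rough sketch, the three decision rules can be written as one small function. It assumes the caller has already looked up the positive critical value in the table for the chosen n and significance level (the α/2 entry for Case A); the function name `reject_h0` and the numbers in the usage lines are hypothetical.

```python
def reject_h0(rs, critical_value, case):
    """Apply the decision rule for case 'A', 'B' or 'C'.

    critical_value is the positive tabulated value for the sample size and
    significance level in use (alpha/2 for the two-sided case A).
    """
    if case == "A":   # two-sided: reject for large positive or large negative rs
        return rs > critical_value or rs < -critical_value
    if case == "B":   # one-sided, direct relationship
        return rs > critical_value
    if case == "C":   # one-sided, inverse relationship
        return rs < -critical_value
    raise ValueError("case must be 'A', 'B' or 'C'")


# Hypothetical values, for illustration only.
print(reject_h0(0.60, 0.50, "B"))   # True
print(reject_h0(-0.60, 0.50, "B"))  # False
print(reject_h0(-0.60, 0.50, "C"))  # True
print(reject_h0(0.30, 0.50, "A"))   # False
```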

Example 9.1(pg 360)

Pincherle and Robinson (E1)* note a marked interobserver variation in blood pressure. They found that doctors who read high on systolic tended to read high on diastolic. Table 9.1 shows mean systolic and diastolic blood pressure readings by 14 doctors. We wish to compute a measure of the strength of the relationship between the two variables. Under the assumption that these 14 doctors constitute a random sample from a population of doctors, we wish to know whether we may conclude from the data that there is a direct relationship between systolic and diastolic readings. Suppose we let α = 0.05.
*G. Pincherle and D. Robinson, Mean Blood Pressure and its Relation to Other Factors Determined at a Routine Executive Health Examination, J. Chronic Dis., 27 (1974), 245-260; used with permission of Pergamon Press.

Table 9.1 Mean blood pressure readings, millimeters of mercury, by doctor

Doctor    Systolic    Diastolic
1         141.8       89.7
2         140.2       74.4
3         131.8       83.5
4         132.5       77.8
5         135.7       85.8
6         141.2       86.5
7         143.9       89.4
8         140.2       89.3
9         140.8       88.0
10        131.7       82.2
11        130.8       84.6
12        135.6       84.4
13        143.6       86.3
14        133.2       85.9

1) HYPOTHESES
H0: Systolic and diastolic blood pressure readings by doctors are independent.
H1: There is a direct relationship between systolic and diastolic blood pressure readings by doctors (claim).

2) TEST STATISTIC
Systolic (Xi)    Diastolic (Yi)    R(Xi)    R(Yi)    di = R(Xi) - R(Yi)    di^2
141.8            89.7              12       14       -2                    4
140.2            74.4              8.5      1        7.5                   56.25
131.8            83.5              3        4        -1                    1
132.5            77.8              4        2        2                     4
135.7            85.8              7        7        0                     0
141.2            86.5              11       10       1                     1
143.9            89.4              14       13       1                     1
140.2            89.3              8.5      12       -3.5                  12.25
140.8            88.0              10       11       -1                    1
131.7            82.2              2        3        -1                    1
130.8            84.6              1        6        -5                    25
135.6            84.4              6        5        1                     1
143.6            86.3              13       9        4                     16
133.2            85.9              5        8        -3                    9

$\sum d_i^2 = 132.50$

Using the formula for the test statistic:

$$r_s = 1 - \frac{6\sum d_i^2}{n(n^2 - 1)}$$

From the table, we get $\sum d_i^2 = 132.50$ and n = 14. Substituting these values into the equation above gives

$$r_s = 1 - \frac{6(132.50)}{14(14^2 - 1)} = 0.7088$$

3) CRITICAL VALUE
Table A.21 reveals that, for n = 14 and α(1) = 0.05, the critical value of rs is 0.464.

4) DECISION
Since 0.7088 > 0.464, we reject H0.

5) CONCLUSION
There is enough evidence to support the claim that the doctors who read high on systolic tend to read high on diastolic blood pressure.
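The arithmetic in Example 9.1 can be reproduced with a short script. This is only a sketch: the helper `average_ranks` is an illustrative name, the systolic reading for doctor 12 is taken as 135.6 as listed in Table 9.1 (the value consistent with the ranks shown), and the critical value 0.464 is the one quoted above from Table A.21.

```python
def average_ranks(values):
    """Ranks from smallest (1) to largest; ties get the mean of their positions."""
    ordered = sorted(values)
    return [ordered.index(v) + (ordered.count(v) + 1) / 2 for v in values]


# Mean readings by the 14 doctors (Table 9.1).
systolic = [141.8, 140.2, 131.8, 132.5, 135.7, 141.2, 143.9,
            140.2, 140.8, 131.7, 130.8, 135.6, 143.6, 133.2]
diastolic = [89.7, 74.4, 83.5, 77.8, 85.8, 86.5, 89.4,
             89.3, 88.0, 82.2, 84.6, 84.4, 86.3, 85.9]

n = len(systolic)
rank_x = average_ranks(systolic)
rank_y = average_ranks(diastolic)

# Sum of squared rank differences, then the Spearman formula.
sum_d2 = sum((rx - ry) ** 2 for rx, ry in zip(rank_x, rank_y))
rs = 1 - 6 * sum_d2 / (n * (n ** 2 - 1))

critical_value = 0.464  # quoted from Table A.21 for n = 14, one-sided alpha = 0.05
reject = rs > critical_value  # Case B: one-sided test for a direct relationship

print(f"sum of d^2 = {sum_d2}, rs = {rs:.4f}, reject H0: {reject}")
# sum of d^2 = 132.5, rs = 0.7088, reject H0: True
```

The two systolic readings of 140.2 are tied for rank positions 8 and 9, which is why each receives rank 8.5.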

Exercise 1
Table 9.4 shows the serum and bone magnesium levels of 14 patients, as reported by Alfrey et al.* Can we conclude from these data that a relationship exists between serum magnesium and bone magnesium in the sampled population?

Serum Mg (mEq./L.)    Bone Mg (mEq./kg ash)
3.60                  672
2.85                  610
2.80                  621
2.70                  567
2.60                  570
2.55                  638
2.55                  612
2.45                  552
2.25                  524
1.80                  400
1.45                  277
1.35                  294
1.40                  338
0.90                  230

*Alfrey, Allen C., Nancy L. Miller, and Donald Butkus, Evaluation of Body Magnesium Stores, J. Lab. Clin. Med., 84 (1974), 153-1

Exercise 2
Ten seventh-grade children randomly selected from a certain public school system were ranked according to the quality of their home environment and the quality of their performance in school. The results are shown in Table 9.45. Compute rs and determine whether one can conclude that the two variables are directly related.

Table 9.45 Ten seventh-grade children ranked according to quality of home environment and quality of performance in school

Child    Home environment    Performance in school
1        3                   1
2        7                   9
3        10                  8
4        9                   10
5        2                   3
6        1                   4
7        6                   5
8        4                   2
9        8                   6
10       5                   7

Exercise 3
Spearman's rank correlation coefficient is used to discover the strength of a link between two sets of data. This example looks at the strength of the link between the price of a convenience item (a 50cl bottle of water) and distance from the Contemporary Art Museum (CAM) in El Raval, Barcelona. Compute rs and determine whether one can conclude that the two variables are inversely related.

Convenience store    Distance from CAM (m)    Price of 50cl bottle (€)
1                    50                       1.80
2                    175                      1.20
3                    270                      2.00
4                    375                      1.00
5                    425                      1.00
6                    580                      1.20
7                    710                      0.80
8                    790                      0.60
9                    890                      1.00
10                   980                      0.85

Large-Sample Approximation
When the sample size is greater than 100, we cannot use Table A.21 to test the significance of rs. In that case we may compute

$$z = r_s\sqrt{n - 1}$$

which is distributed approximately as the standard normal.
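A minimal sketch of the large-sample calculation follows. The values rs = 0.25 and n = 120 are hypothetical, and the one-sided p-value is obtained from the standard normal upper tail via `math.erfc`, which is one convenient way to evaluate it rather than anything prescribed by the text.

```python
import math


def large_sample_z(rs, n):
    """Large-sample test statistic z = rs * sqrt(n - 1)."""
    return rs * math.sqrt(n - 1)


def upper_tail_p(z):
    """P(Z > z) for a standard normal Z, using the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


# Hypothetical sample: rs = 0.25 computed from n = 120 pairs.
z = large_sample_z(0.25, 120)
p_one_sided = upper_tail_p(z)          # Case B: direct relationship
p_two_sided = 2 * upper_tail_p(abs(z))  # Case A: either direction

print(f"z = {z:.3f}")
print(f"one-sided p = {p_one_sided:.4f}, two-sided p = {p_two_sided:.4f}")
```

For Case C (inverse relationship) the relevant tail is the lower one, which by symmetry is `upper_tail_p(-z)`.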

LINK OF YOUTUBE
http://www.youtube.com/watch?v=Eu_XOoF NfR4&feature=mfu_in_order&list=UL

Thank You