Term Paper

ALTERNATIVE METHOD OF COMPUTING
CORRELATION COEFFICIENT USING THE
COMPUTATIONAL VERSION OF THE
PEARSON PRODUCT-MOMENT CORRELATION COEFFICIENT
FORMULA
A Term Paper
Presented to:
Dr. Lucila Fineza-Tibigar
(Professor)
In Partial Fulfillment
of the Requirements of the Course
Statistics Applied to Educational Research II
(EdAd 600)
Tryon R. Gabriel
April, 2005
Background of the Study
The Pearson Product Moment Correlation Coefficient is the most widely
used measure of correlation or association. It is named after Karl Pearson who developed
the correlational method to do agricultural research. The product moment part of the
name comes from the way in which it is calculated, by summing up the products of the
deviations of the scores from the mean.
The symbol for the correlation coefficient is lowercase r, and it is
described in textbooks as the sum of the products of the Z-scores for the two variables
divided by the number of scores:
$$r = \frac{\sum z_X z_Y}{N}$$
If we substitute the formulas for the Z-scores into this formula, we get the following
formula for the Pearson Product-Moment Correlation Coefficient, which we will use as a
definitional formula:
$$r = \frac{\sum (X - \bar{X})(Y - \bar{Y})}{N s_X s_Y}$$
The numerator of this formula says that we sum up the products of the deviations of a
subject's X score from the mean of the X’s and the deviation of the subject's Y score from
the mean of the Y’s. This summation of the product of the deviation scores is divided by
the number of subjects times the standard deviation of the X variable times the standard
deviation of the Y variable.
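The definitional formula described above can be sketched in Python as follows. This is only an illustration: the function name pearson_r_definitional and the five-subject data set are assumptions made for the example and do not come from the paper, and the standard deviations are taken with N (not N - 1) in the denominator so that they match the division by N in the formula.

```python
from math import sqrt

def pearson_r_definitional(xs, ys):
    """Pearson r from the definitional (deviation-score) formula."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Standard deviations with N in the denominator (population form).
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    # Sum of the products of the paired deviations from the means.
    sum_products = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum_products / (n * sd_x * sd_y)

# Hypothetical five-subject data set, not taken from the paper.
toy_x = [1, 2, 3, 4, 5]
toy_y = [2, 4, 5, 4, 5]
r_toy = pearson_r_definitional(toy_x, toy_y)
print(round(r_toy, 4))
```

Even for five subjects the function needs two means, two standard deviations, and five deviation products, which is why the definitional formula is tedious by hand.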

As you can see, the correlation coefficient is fairly difficult to calculate
using the definitional formula. In practice we use another formula that is
mathematically identical but much easier to use: the computational, or raw score,
formula for the correlation coefficient. The computational formula for the Pearson r is
$$r = \frac{N\sum XY - \sum X \sum Y}{\sqrt{\left[N\sum X^2 - (\sum X)^2\right]\left[N\sum Y^2 - (\sum Y)^2\right]}}$$
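The raw score computation can be sketched in Python as follows. The formula used is the standard computational form, N ΣXY - ΣX ΣY divided by the square root of [N ΣX² - (ΣX)²][N ΣY² - (ΣY)²]. The function name pearson_r_computational is an assumption made for this example, and the scores are the twelve X and Y pairs tabulated in the Findings section.

```python
from math import sqrt

def pearson_r_computational(xs, ys):
    """Pearson r from the computational (raw score) formula."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# X and Y scores from the first table in the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]
r = pearson_r_computational(X, Y)
print(f"r = {r:.6f}")
```

Only five running totals are needed, so no mean or standard deviation has to be computed first.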
To interpret the correlation coefficient properly, one must understand the
basic properties of r:
• The value of r measures the strength of the linear relationship between X and Y and
always lies between -1 and +1.
• The closer r is to either -1 or +1, the stronger the linear relationship between X
and Y. In fact, points that fall exactly on a straight line have a correlation of +1 if
the line has positive slope and -1 if the line has negative slope.
• If r is zero, then X and Y are not linearly related. They may be related, but the
relationship is not a straight line.
• The value of r does not change when the units of measurement are changed.
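The properties listed above can be checked numerically with a short Python sketch. All data below are hypothetical values chosen for illustration (they are not from the paper), and pearson_r is an illustrative helper that applies the raw score formula.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the raw score formula (illustrative helper)."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Points exactly on a straight line: r is +1 or -1 depending on the slope.
line_x = [1, 2, 3, 4, 5]
r_up = pearson_r(line_x, [2 * x + 1 for x in line_x])
r_down = pearson_r(line_x, [-3 * x + 10 for x in line_x])

# A perfect but non-linear (U-shaped) relationship: r is zero.
sym_x = [-2, -1, 0, 1, 2]
r_curve = pearson_r(sym_x, [x * x for x in sym_x])

# Change of units: hypothetical heights in inches converted to centimetres.
inches = [60, 62, 65, 70, 72]
weights = [110, 120, 140, 155, 170]
r_inches = pearson_r(inches, weights)
r_cm = pearson_r([2.54 * h for h in inches], weights)

print(r_up, r_down, r_curve)
print(abs(r_inches - r_cm) < 1e-9)
```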
Even with the computational formula, it is still laborious to find the correlation
coefficient by hand, especially when we are dealing with a large number of subjects. In
practice we would probably use a computer to calculate it. The aim of this paper is
to present a modified method of computing the correlation coefficient using the
computational version of the Pearson Product-Moment Correlation Coefficient formula.
As mentioned above, the data obtained for each variable may be too large to
handle in manual computation. In the absence of a computer, this difficulty can lead
to computational errors whose results seriously affect decision making. In this paper,
the author presents a method of reducing this difficulty by subtracting an assumed mean
from the values of each variable.
Statement of the Problem
The purpose of this paper is to present and determine the validity of an
alternative method of computing the correlation coefficient using the computational version
of the Pearson Product-Moment Correlation Coefficient formula. Specifically, this paper
seeks to answer the question: Is there a difference in the computed
correlation coefficient when an assumed mean for a given variable is subtracted from its
values?
Procedure
To determine the validity of the alternative method, the author
presents all the possible cases in which the assumed mean for a given variable (say, X or
Y) is subtracted from its values. The cases are the following: (i) an assumed mean
subtracted from the values of X alone; (ii) an assumed mean subtracted from the values of Y
alone; and (iii) the corresponding assumed means subtracted from the values of both X and Y.
For each case, the correlation coefficient is computed using the computational version of the
Pearson Product-Moment Correlation Coefficient formula.
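The procedure can be sketched in Python as follows. The function and variable names are assumptions made for this illustration; the scores are those tabulated in the Findings, and the assumed means 40 and 80 are the ones the paper uses there.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the computational (raw score) formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Scores from the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]

ASSUMED_MEAN_X = 40  # assumed mean the paper subtracts from X
ASSUMED_MEAN_Y = 80  # assumed mean the paper subtracts from Y

x_shifted = [x - ASSUMED_MEAN_X for x in X]
y_shifted = [y - ASSUMED_MEAN_Y for y in Y]

results = {
    "usual method": pearson_r(X, Y),
    "case (i): X shifted": pearson_r(x_shifted, Y),
    "case (ii): Y shifted": pearson_r(X, y_shifted),
    "case (iii): both shifted": pearson_r(x_shifted, y_shifted),
}
for label, value in results.items():
    print(f"{label:<26} r = {value:.6f}")
```

An assumed mean does not have to equal the true mean; any convenient round number near the centre of the scores shrinks the values.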

Findings
The following is the result of the usual method of computing the
correlation coefficient between the variables X and Y using the computational version of
the Pearson Product-Moment Correlation Coefficient Formula.
    X     Y      X²      Y²      XY
   26    37     676    1369     962
   42    90    1764    8100    3780
   37    48    1369    2304    1776
   82    90    6724    8100    7380
   66    88    4356    7744    5808
   44   100    1936   10000    4400
   24    95     576    9025    2280
   39   120    1521   14400    4680
   55    95    3025    9025    5225
   61    76    3721    5776    4636
   77    89    5929    7921    6853
   58   100    3364   10000    5800
  Σ =         34961   93764   53580
  r = 0.264201335
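The value of r in the table can be reproduced step by step from the column totals. The table does not list ΣX and ΣY; the values ΣX = 611 and ΣY = 1028 used below were obtained by adding the X and Y columns, with N = 12 subjects.

```latex
\begin{align*}
r &= \frac{N\sum XY - \sum X \sum Y}
          {\sqrt{\bigl[N\sum X^2 - (\sum X)^2\bigr]\bigl[N\sum Y^2 - (\sum Y)^2\bigr]}} \\
  &= \frac{12(53580) - (611)(1028)}
          {\sqrt{\bigl[12(34961) - 611^2\bigr]\bigl[12(93764) - 1028^2\bigr]}} \\
  &= \frac{642960 - 628108}
          {\sqrt{(419532 - 373321)(1125168 - 1056784)}} \\
  &= \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642
\end{align*}
```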
The table shows that the correlation coefficient is r = 0.264201335. The sums involved
are very large and difficult to handle in manual computation. This table is
presented as the baseline against which the following results, obtained for the
cases listed in the Procedure, are compared:

1. Assumed mean subtracted from the values of X alone:
  X-40     Y   (X-40)²      Y²   (X-40)Y
   -14    37       196    1369      -518
     2    90         4    8100       180
    -3    48         9    2304      -144
    42    90      1764    8100      3780
    26    88       676    7744      2288
     4   100        16   10000       400
   -16    95       256    9025     -1520
    -1   120         1   14400      -120
    15    95       225    9025      1425
    21    76       441    5776      1596
    37    89      1369    7921      3293
    18   100       324   10000      1800
   Σ =            5281   93764     12460
   r = 0.264201335

The above result shows that subtracting the assumed mean (40) from the values of X
still yields the same correlation coefficient. Notice also that the shifted values of X
and their squares are smaller than the original values shown in the first table and are
easier to handle in manual computation.
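The arithmetic for this case can be written out from the totals of the case 1 table. The sums of the shifted X column (131) and of the Y column (1028) are not listed in the table; they were obtained here by adding the columns, with N = 12.

```latex
\begin{align*}
r &= \frac{12(12460) - (131)(1028)}
          {\sqrt{\bigl[12(5281) - 131^2\bigr]\bigl[12(93764) - 1028^2\bigr]}} \\
  &= \frac{149520 - 134668}
          {\sqrt{(63372 - 17161)(1125168 - 1056784)}} \\
  &= \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642
\end{align*}
```

The numerator 14852 and the X factor 46211 are exactly the values obtained from the unshifted scores, although every intermediate product is much smaller.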
2. Assumed mean subtracted from the values of Y alone:
    X   Y-80      X²   (Y-80)²   X(Y-80)
   26    -43     676      1849     -1118
   42     10    1764       100       420
   37    -32    1369      1024     -1184
   82     10    6724       100       820
   66      8    4356        64       528
   44     20    1936       400       880
   24     15     576       225       360
   39     40    1521      1600      1560
   55     15    3025       225       825
   61     -4    3721        16      -244
   77      9    5929        81       693
   58     20    3364       400      1160
  Σ =          34961      6084      4700
  r = 0.264201335
The above result shows that subtracting the assumed mean (80) from the values of Y
still yields the same correlation coefficient. Notice also that the shifted values of Y
and their squares are smaller than the original values shown in the first table and are
easier to handle in manual computation.
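The arithmetic for this case can be written out from the totals of the case 2 table. The sums of the X column (611) and of the shifted Y column (68) are not listed in the table; they were obtained here by adding the columns, with N = 12.

```latex
\begin{align*}
r &= \frac{12(4700) - (611)(68)}
          {\sqrt{\bigl[12(34961) - 611^2\bigr]\bigl[12(6084) - 68^2\bigr]}} \\
  &= \frac{56400 - 41548}
          {\sqrt{(419532 - 373321)(73008 - 4624)}} \\
  &= \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642
\end{align*}
```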
3. Corresponding assumed means for X and Y subtracted from their values:
  X-40   Y-80   (X-40)²   (Y-80)²  (X-40)(Y-80)
   -14    -43       196      1849           602
     2     10         4       100            20
    -3    -32         9      1024            96
    42     10      1764       100           420
    26      8       676        64           208
     4     20        16       400            80
   -16     15       256       225          -240
    -1     40         1      1600           -40
    15     15       225       225           225
    21     -4       441        16           -84
    37      9      1369        81           333
    18     20       324       400           360
   Σ =             5281      6084          1980
   r = 0.264201335
The above result shows that subtracting the corresponding assumed means from the
values of both X and Y still yields the same correlation coefficient. Notice also that the
values of X, X², Y, and Y² are smaller than the original values shown in
the first table and are again easier to handle in manual computation.
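The arithmetic for this case can be written out from the totals of the case 3 table. The sums of the shifted X column (131) and of the shifted Y column (68) are not listed in the table; they were obtained here by adding the columns, with N = 12.

```latex
\begin{align*}
r &= \frac{12(1980) - (131)(68)}
          {\sqrt{\bigl[12(5281) - 131^2\bigr]\bigl[12(6084) - 68^2\bigr]}} \\
  &= \frac{23760 - 8908}
          {\sqrt{(63372 - 17161)(73008 - 4624)}} \\
  &= \frac{14852}{\sqrt{(46211)(68384)}} \approx 0.2642
\end{align*}
```

The largest product in this version is 73008, compared with 1125168 in the computation on the original scores.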
Conclusion
On the basis of the above results, the author of this paper infers that
subtracting an assumed mean from the values of the variables X and Y does not alter the
correlation coefficient obtained with the computational version of
the Pearson Product-Moment Correlation Coefficient formula.
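The numerical results agree with a short algebraic argument, sketched here as a supplement to the paper's examples. The symbols a and b are introduced only for this sketch and stand for any assumed means subtracted from X and Y; X' and Y' denote the shifted scores.

```latex
\begin{align*}
X' &= X - a, \qquad Y' = Y - b \\
N\sum X'Y' - \sum X' \sum Y'
  &= N\sum (X-a)(Y-b) - \Bigl(\sum X - Na\Bigr)\Bigl(\sum Y - Nb\Bigr) \\
  &= N\sum XY - Nb\sum X - Na\sum Y + N^2ab \\
  &\quad - \sum X \sum Y + Nb\sum X + Na\sum Y - N^2ab \\
  &= N\sum XY - \sum X \sum Y \\
N\sum X'^2 - \Bigl(\sum X'\Bigr)^2
  &= N\sum X^2 - 2Na\sum X + N^2a^2 - \Bigl(\sum X\Bigr)^2 + 2Na\sum X - N^2a^2 \\
  &= N\sum X^2 - \Bigl(\sum X\Bigr)^2
\end{align*}
```

The same cancellation holds for Y, so the numerator and both factors under the square root are unchanged, and r is the same for every choice of a and b.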
Recommendation
1. In view of the above satisfactory result, the author of this paper
recommends subtracting an assumed mean from the
values of each variable when computing the correlation coefficient
with the computational version of the Pearson Product-Moment
Correlation Coefficient formula. Doing so greatly reduces the magnitude
of the numbers involved, making them easier to handle in manual
computation.
2. If the assumed mean does not sufficiently reduce the size of the
numbers, the author also recommends dividing all the values of the variable by the
same multiple of ten (for example, 10 or 100) before computing the correlation
coefficient using the computational version of the Pearson Product-Moment
Correlation Coefficient formula.
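The second recommendation can be checked with a short Python sketch. The divisor 10 and all names below are assumptions chosen for illustration; the scores are those from the Findings. The check relies on every value of a variable being divided by the same positive constant.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson r from the computational (raw score) formula."""
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    syy = sum(y * y for y in ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    return (n * sxy - sx * sy) / sqrt((n * sxx - sx ** 2) * (n * syy - sy ** 2))

# Scores from the Findings section.
X = [26, 42, 37, 82, 66, 44, 24, 39, 55, 61, 77, 58]
Y = [37, 90, 48, 90, 88, 100, 95, 120, 95, 76, 89, 100]

DIVISOR = 10  # illustrative choice; any positive constant behaves the same way

x_scaled = [x / DIVISOR for x in X]
y_scaled = [y / DIVISOR for y in Y]

r_original = pearson_r(X, Y)
r_scaled = pearson_r(x_scaled, y_scaled)
print(f"original r = {r_original:.6f}")
print(f"scaled r   = {r_scaled:.6f}")
```

Dividing by a constant multiplies the numerator and the denominator of the formula by the same factor, so the ratio is unchanged apart from floating-point rounding.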
References
Kitchens, L. J. (1998). Exploring Statistics: A Modern Introduction to Data Analysis and
Inference, 2nd ed. Ca. 93950: Brooks/Cole Publishing Co.
Bernstein, S. & Bernstein, R. (1999). Schaum’s Outline of Theory and Problems of
Elements of Statistics I: Descriptive Statistics and Probability, International ed.
Singapore: McGraw-Hill Book Co.
Bernstein, S. & Bernstein, R. (1999). Schaum’s Outline of Theory and Problems of
Elements of Statistics II: Inferential Statistics, International ed. Singapore: McGraw-Hill
Book Co.
Dougherty, E. R. (1990). Probability and Statistics for the Engineering, Computing, and
Physical Sciences. New Jersey 07632: Prentice-Hall, Inc.
