
The Statistical Imagination

Chapter 15:
Correlation and Regression
Part 2: Hypothesis Testing and
Aspects of a Relationship

2008 McGraw-Hill

When to Test a Hypothesis Using
Correlation and Regression
1) There is one representative sample from
a single population
2) There are two interval/ratio variables
3) There are no restrictions on sample size,
but generally, the larger the n, the better
4) A scatterplot of the coordinates of the two
variables fits a linear pattern

Test Preparation
Before proceeding with the hypothesis test,
check the scatterplot for a linear pattern
Calculate Pearson's r correlation
coefficient and the regression coefficient, b
Compute the means of X and Y and use them
and b to compute a
Specify the regression equation, insert values
of X, solve for Ŷ (the predicted value of Y), and
plot the line on the scatterplot
Provide a conceptual diagram
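
The preparation steps above can be sketched in Python. This is a minimal illustration rather than the textbook's own procedure: the data values and the function name fit_line are hypothetical, and the formulas are the standard least-squares ones (b is the sum of cross-products divided by the sum of squares of X, and a is the mean of Y minus b times the mean of X).

```python
import math

def fit_line(xs, ys):
    """Return (r, b, a) for paired interval/ratio scores.

    r: Pearson's correlation coefficient
    b: regression coefficient (slope)
    a: Y-intercept, computed from the means of X and Y and from b
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squares of the deviation scores
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)
    r = sp / math.sqrt(ss_x * ss_y)
    b = sp / ss_x
    a = mean_y - b * mean_x
    return r, b, a

# Hypothetical scores, invented for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r, b, a = fit_line(xs, ys)
print(f"r = {r:.3f}, b = {b:.3f}, a = {a:.3f}")
print(f"Regression equation: Y-hat = {a:.3f} + {b:.3f} * X")
```

The scatterplot check still has to be done by eye; the arithmetic will produce an r and a line even when the pattern is not linear.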

Features of the
Hypothesis Test
Step 1. H0: ρ = 0
That is, there is no relationship between
X and Y
The Greek letter rho (ρ) is the correlation
coefficient obtained if Pearson's correlation
coefficient were computed for the population
A ρ of zero asserts that there is no
correlation in the population and that the
regression line has no slope

Features of the
Hypothesis Test (cont.)
Step 2. The sampling distribution is the
t-distribution with df = n - 2
When the H0 is true, sample Pearson's r's
will center around zero
This test does not require a direct
calculation of a standard error


Features of the
Hypothesis Test (cont.)
Step 4. The test effect is the value of
Pearson's r
The test statistic is t_r
The p-value is estimated from the
t-distribution table, Statistical Table C
in Appendix B
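
As a rough sketch of how the test statistic and its p-value can be obtained without a printed table, the code below uses the commonly cited formula t_r = r * sqrt((n - 2) / (1 - r^2)) with df = n - 2. The slides do not spell this formula out, so treat it as an assumption about what is intended. The function names, the sample values of r and n, and the numerical integration (Simpson's rule on the t density) are illustrative choices, not the book's method.

```python
import math

def t_for_r(r, n):
    """Return (t_r, df) for testing H0: rho = 0. Assumes -1 < r < 1 and n > 2."""
    df = n - 2
    return r * math.sqrt(df / (1 - r ** 2)), df

def t_pdf(t, df):
    """Density of Student's t-distribution, computed in logs for stability."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log(1 + t * t / df))

def two_tailed_p(t, df, steps=2000):
    """Approximate two-tailed p-value with Simpson's rule on [0, |t|].

    steps must be even.
    """
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps
    total = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3          # area under the curve between 0 and |t|
    return max(0.0, 1 - 2 * area)

# Hypothetical sample result, for illustration only
t_r, df = t_for_r(r=0.5, n=27)
p = two_tailed_p(t_r, df)
print(f"t_r = {t_r:.3f}, df = {df}, two-tailed p = {p:.4f}")
```

A table lookup gives only a range for p (for example, p < .01); the integration simply fills in the value between the table's columns.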

Four Aspects of a
Relationship
With correlation and regression
analysis, because both variables are
of interval/ratio level, the analysis is
mathematically rich
All four aspects of a relationship
apply

Existence of a Relationship
Test the H0 that ρ = 0, that there is no
relationship between X and Y
If the H0 is rejected, a relationship
exists


Direction of a Relationship
Direction is indicated by the sign of r and b,
and by observing the slope of the pattern
of coordinates in a scatterplot
A positive relationship is revealed with an
upward slope, and r and b will be positive
A negative relationship is revealed with a
downward slope, and r and b will be
negative

Strength of a Relationship
Strength is determined by the
proportion of the total variation in Y
explained by X
This proportion is quickly obtained by
squaring Pearson's r correlation
coefficient
Focus on r², not r
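
To see why r² can be read as a proportion of variation explained, the following sketch computes it two ways for a least-squares line: by squaring r, and as 1 minus the ratio of unexplained (error) variation to total variation in Y. The data and the function name variation_explained are hypothetical; for a simple linear regression the two results should agree.

```python
def variation_explained(xs, ys):
    """Return (r_squared, explained_share) for a simple least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)   # total variation in Y
    b = sp / ss_x
    a = mean_y - b * mean_x
    # Variation left unexplained by the regression line
    ss_error = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    r_squared = sp ** 2 / (ss_x * ss_y)
    explained_share = 1 - ss_error / ss_y
    return r_squared, explained_share

# Hypothetical scores, invented for illustration only
r_squared, explained_share = variation_explained([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(f"r squared = {r_squared:.3f}, explained share = {explained_share:.3f}")
```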

Nature of a Relationship
1) Interpret the regression coefficient, b, the
slope of the regression line. State the
effect on Y of a one-unit change in X
2) Provide best estimates using the
regression line equation. Insert chosen
values of X, compute the Ŷ's, and interpret
them in everyday language
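
Both points can be illustrated with a few lines of Python. The intercept and slope below are hypothetical values chosen for the example, and predict is an illustrative helper name. The sketch shows that moving one unit along X changes the best estimate of Y by exactly b.

```python
def predict(a, b, x):
    """Best estimate of Y (Y-hat) for a given X from the line Y-hat = a + bX."""
    return a + b * x

# Hypothetical regression results, for illustration only
a, b = 2.2, 0.6

for x in (3, 4):
    print(f"X = {x}: Y-hat = {predict(a, b, x):.1f}")

# The effect on Y-hat of a one-unit change in X equals the slope b
one_unit_effect = predict(a, b, 4) - predict(a, b, 3)
print(f"A one-unit increase in X changes Y-hat by {one_unit_effect:.1f}")
```

In everyday language, with these made-up numbers: each additional unit of X is associated with an estimated 0.6-unit increase in Y.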


Careful Interpretation
of Findings
A correlation applies to a population,
not to an individual
E.g., predictions of Y for a value of X
provide the best estimate of the mean
of Y for all subjects with that X-score


Careful Interpretation
of Findings (cont.)
A statistical relationship may exist but
not mean much. Be wary of
statistically significant but small
Pearson's r's
Distinguish statistical significance
(i.e., the existence of a relationship)
from practical significance (i.e., the
strength of the relationship)
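
A short numerical sketch makes the distinction concrete. The values of r and n are hypothetical, and the test statistic uses the commonly cited formula t_r = r * sqrt((n - 2) / (1 - r^2)), which is assumed here rather than taken from the slides. With a very large sample, even a tiny r produces a large t, while r² shows that almost none of the variation in Y is explained.

```python
import math

# Hypothetical result from a very large sample, for illustration only
r, n = 0.05, 10_000

t_r = r * math.sqrt((n - 2) / (1 - r ** 2))   # test statistic, df = n - 2
r_squared = r ** 2                            # proportion of variation explained

print(f"t_r = {t_r:.2f} with df = {n - 2}")
print(f"r squared = {r_squared:.4f} ({r_squared:.2%} of the variation in Y)")
```

With this many degrees of freedom the two-tailed critical value at the .05 level is about 1.96, so the H0 would be rejected even though X accounts for only a quarter of one percent of the variation in Y.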

Statistical Follies:
Spurious Correlation
A spurious correlation is one that is
conceptually false, nonsensical, or
theoretically meaningless
E.g., for the period of the 1990s, there is a
positive correlation between the amount of
carbon dioxide released into the atmosphere
and the level of the Dow Jones stock index