Vous êtes sur la page 1sur 3

Linear Regression Analysis using SPSS

Statistics
Introduction
Linear regression is the next step up after correlation. It is used when we want to predict the
value of a variable based on the value of another variable. The variable we want to predict is
called the dependent variable (or sometimes, the outcome variable). The variable we are using to
predict the other variable's value is called the independent variable (or sometimes, the predictor
variable). For example, you could use linear regression to understand whether exam performance
can be predicted based on revision time; whether cigarette consumption can be predicted based
on smoking duration; and so forth. If you have two or more independent variables, rather than
just one, you need to use multiple regression.

This "quick start" guide shows you how to carry out linear regression using SPSS Statistics, as
well as interpret and report the results from this test. However, before we introduce you to this
procedure, you need to understand the different assumptions that your data must meet in order
for linear regression to give you a valid result. We discuss these assumptions next.

Output of Linear Regression Analysis


SPSS Statistics will generate quite a few tables of output for a linear regression. In this section,
we show you only the three main tables required to understand your results from the linear
regression procedure, assuming that no assumptions have been violated. A complete explanation
of the output you have to interpret when checking your data for the six assumptions required to
carry out linear regression is provided in our enhanced guide. This includes relevant scatterplots,
histogram (with superimposed normal curve), Normal P-P Plot, casewise diagnostics and the
Durbin-Watson statistic. Below, we focus on the results for the linear regression analysis only.

The first table of interest is the Model Summary table, as shown below:

Published with written permission from SPSS Statistics, IBM Corporation.

This table provides the R and R2 values. The R value represents the simple correlation and is
0.873 (the "R" Column), which indicates a high degree of correlation. The R2 value (the "R
Square" column) indicates how much of the total variation in the dependent variable, Price, can
be explained by the independent variable, Income. In this case, 76.2% can be explained, which is
very large.

The next table is the ANOVA table, which reports how well the regression equation fits the data
(i.e., predicts the dependent variable) and is shown below:

Published with written permission from SPSS Statistics, IBM Corporation.

This table indicates that the regression model predicts the dependent variable significantly well.
How do we know this? Look at the "Regression" row and go to the "Sig." column. This indicates
the statistical significance of the regression model that was run. Here, p < 0.0005, which is less
than 0.05, and indicates that, overall, the regression model statistically significantly predicts the
outcome variable (i.e., it is a good fit for the data).

The Coefficients table provides us with the necessary information to predict price from income,
as well as determine whether income contributes statistically significantly to the model (by
looking at the "Sig." column). Furthermore, we can use the values in the "B" column under the
"Unstandardized Coefficients" column, as shown below:

Published with written permission from SPSS Statistics, IBM Corporation.

to present the regression equation as:

Price = 8287 + 0.564(Income)

If you are unsure how to interpret regression equations or how to use them to make predictions,
we discuss this in our enhanced linear regression guide. We also show you how to write up the
results from your assumptions tests and linear regression output if you need to report this in a
dissertation/thesis, assignment or research report. We do this using the Harvard and APA styles.
You can learn more about our enhanced content here.

Vous aimerez peut-être aussi