
Regression

Explaining Variation
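I. Explaining Variation: $R^2$
 A. Breaking Down the Distances
[Figure: scatterplot of money spent on health care (Y, vertical axis, roughly 4 to 8) against income (X, horizontal axis, 10 to 70), showing the data points, the fitted regression line $\hat{Y} = a + bX$, and a horizontal line at the mean $\bar{Y}$.]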
 • How well does the predicted line explain the variation in
the dependent variable, money spent?
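 • Total Variation
[Figure: the same scatterplot of $ spent on health care (Y) against income (X), with the regression line $\hat{Y} = a + bX$ and the mean $\bar{Y}$ (marked at 5.9 on the vertical axis). For one data point (x, y), three vertical distances are labeled:
 $Y - \hat{Y}$ = deviation unexplained by regression
 $\hat{Y} - \bar{Y}$ = deviation explained by regression
 $Y - \bar{Y}$ = total deviation around $\bar{Y}$]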
 • Total Deviation
$$Y - \bar{Y} = (\hat{Y} - \bar{Y}) + (Y - \hat{Y})$$
Total Deviation = Explained Deviation + Unexplained Deviation

 • The total distance from any point to $\bar{Y}$ is the sum of
the distance from Y to the regression line plus the
distance from the regression line to $\bar{Y}$.
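The decomposition above can be checked point by point. The following is a minimal sketch, not the slides' own calculation: the `income` and `spent` values are hypothetical numbers chosen only to resemble the plot, and the line is fitted with the usual least squares formulas for the slope `b` and intercept `a`.

```python
# Hypothetical data: income (X) and money spent on health care (Y).
income = [10, 20, 30, 40, 50, 60, 70]
spent = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(spent) / n

# Least squares slope b and intercept a for the line Y-hat = a + b*X.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, spent)) / sum(
    (x - x_bar) ** 2 for x in income
)
a = y_bar - b * x_bar

# Split each point's total deviation into explained and unexplained parts.
rows = []
for x, y in zip(income, spent):
    y_hat = a + b * x
    total = y - y_bar          # Y - Y-bar
    explained = y_hat - y_bar  # Y-hat - Y-bar
    unexplained = y - y_hat    # Y - Y-hat
    rows.append((total, explained, unexplained))
    print(f"x={x:2d}  total={total:+.3f}  "
          f"explained={explained:+.3f}  unexplained={unexplained:+.3f}")
```

For every row, the explained and unexplained deviations add up to the total deviation, which is the identity stated above applied to a single observation.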
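 B. Sums of Squares
 • We can square both sides of this equation and sum
across all the observations to get:
$$
\begin{aligned}
\sum (Y - \bar{Y})^2 &= \sum \left[(\hat{Y} - \bar{Y}) + (Y - \hat{Y})\right]^2 \\
&= \sum (\hat{Y} - \bar{Y})^2 + 2\sum (\hat{Y} - \bar{Y})(Y - \hat{Y}) + \sum (Y - \hat{Y})^2 \\
&= \sum (\hat{Y} - \bar{Y})^2 + \sum (Y - \hat{Y})^2 .
\end{aligned}
$$
 • The cross-product term drops out because the least squares
residuals $Y - \hat{Y}$ sum to zero and are uncorrelated with the
predicted values $\hat{Y}$.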
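The step from the three-term expansion to the two-term result relies on the cross-product term being zero for the least squares line. Here is a small numerical sketch of that fact. The data are hypothetical, `fit_line` and `cross_term` are illustrative helper names rather than anything from the slides, and the comparison line with intercept 3 and slope 0.1 is an arbitrary choice.

```python
def fit_line(xs, ys):
    """Return the least squares intercept a and slope b for Y-hat = a + b*X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    b = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return y_bar - b * x_bar, b


def cross_term(xs, ys, a, b):
    """Sum of (Y-hat - Y-bar) * (Y - Y-hat) for the line Y-hat = a + b*X."""
    y_bar = sum(ys) / len(ys)
    return sum((a + b * x - y_bar) * (y - (a + b * x)) for x, y in zip(xs, ys))


# Hypothetical data: income (X) and money spent on health care (Y).
income = [10, 20, 30, 40, 50, 60, 70]
spent = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0]

a, b = fit_line(income, spent)
cross_ls = cross_term(income, spent, a, b)         # least squares line
cross_other = cross_term(income, spent, 3.0, 0.1)  # some other line

print(f"cross term, least squares line: {cross_ls:.10f}")
print(f"cross term, arbitrary line:     {cross_other:.4f}")
```

The cross term is zero (up to floating point rounding) only for the least squares line, so the two-term form of the equation should not be expected to hold for a line fitted some other way.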
 1. Total Sum of Squares (SST)
 • The term on the left-hand side of this equation is the
sum of the squared distances from all points to $\bar{Y}$.
We call this the total variation in the Y's, or the
Total Sum of Squares (SST).
 2. Regression Sum of Squares (SSR)
 • The first term on the right-hand side is the sum of
the squared distances from the regression line to $\bar{Y}$.
We call it the Regression Sum of Squares, or
SSR.
 3. Error Sum of Squares (SSE)
 • Finally, the last term is the sum of the squared
distances from the points to the regression line.
Remember, this is the quantity that least squares
minimizes. We call it the Error Sum of Squares, or
SSE.
 • We can rewrite the previous equation as:
SST = SSR + SSE.
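The three sums of squares can be computed directly to confirm the identity. This is a sketch under stated assumptions: the data are hypothetical, the variable names are illustrative, and the final ratio uses the standard definition $R^2 = \text{SSR}/\text{SST}$, which this part of the slides names in its title but does not spell out.

```python
# Hypothetical data: income (X) and money spent on health care (Y).
income = [10, 20, 30, 40, 50, 60, 70]
spent = [4.0, 5.0, 5.0, 6.0, 7.0, 7.0, 8.0]

n = len(income)
x_bar = sum(income) / n
y_bar = sum(spent) / n

# Least squares fit of Y-hat = a + b*X.
b = sum((x - x_bar) * (y - y_bar) for x, y in zip(income, spent)) / sum(
    (x - x_bar) ** 2 for x in income
)
a = y_bar - b * x_bar
predicted = [a + b * x for x in income]

# Total, regression, and error sums of squares.
sst = sum((y - y_bar) ** 2 for y in spent)
ssr = sum((y_hat - y_bar) ** 2 for y_hat in predicted)
sse = sum((y - y_hat) ** 2 for y, y_hat in zip(spent, predicted))

# Assumed standard definition: share of total variation explained by the line.
r_squared = ssr / sst

print(f"SST = {sst:.4f}")
print(f"SSR = {ssr:.4f}")
print(f"SSE = {sse:.4f}")
print(f"SSR + SSE = {ssr + sse:.4f}")
print(f"R^2 = {r_squared:.4f}")
```

With these made-up numbers, SSR accounts for most of SST, so the line explains most of the variation in spending. A different data set would give a different split between SSR and SSE, but the two always add up to SST for a least squares line with an intercept.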
