
MULTIPLE REGRESSION ANALYSIS AND MODELING

SANTOSH KOIRALA
• To understand regression analysis
• Finding the multiple regression equation
• Making inferences about the population parameters
• Assumptions of the classical linear regression model
Regression analysis examines associative relationships between a metric dependent variable and one or more independent variables in the following ways:
• Determine whether the independent variables explain a significant variation in the dependent variable: whether a relationship exists.
• Determine how much of the variation in the dependent variable can be explained by the independent variables: the strength of the relationship.
• Determine the structure or form of the relationship: the mathematical equation relating the independent and dependent variables.
• Predict the values of the dependent variable.
• Control for other independent variables when evaluating the contributions of a specific variable or set of variables.
• Regression analysis is concerned with the nature and degree of association between variables and does not imply or assume any causality.
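As a concrete illustration, here is a minimal sketch in Python (the data, variable names, and the use of statsmodels are illustrative assumptions, not part of the original slides) that fits a multiple regression and reports the significance tests, the strength of the relationship (R²), and the estimated equation:

import numpy as np
import statsmodels.api as sm

# Hypothetical data: y depends on two predictors plus noise
rng = np.random.default_rng(42)
n = 50
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 2.0 + 1.5 * x1 - 0.8 * x2 + rng.normal(scale=0.5, size=n)

X = sm.add_constant(np.column_stack([x1, x2]))  # add intercept column
model = sm.OLS(y, X).fit()                      # ordinary least squares

print(model.summary())  # coefficients, t-tests, F-test, R-squared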
[Figure: decomposition of the deviation of an observed yi from the mean ȳ into an explained part (ŷi − ȳ, giving SSR) and an unexplained part (yi − ŷi, giving SSE): SST = Σ(yi − ȳ)², SSR = Σ(ŷi − ȳ)², SSE = Σ(yi − ŷi)².]
• SST = total sum of squares
  ◦ Measures the variation of the y values around their mean ȳ
• SSE = error sum of squares
  ◦ Variation attributable to factors other than the relationship between x and y
• SSR = regression sum of squares
  ◦ Explained variation attributable to the relationship between x and y
SS_y = SS_reg + SS_res

where

SS_y   = Σ (Yi − Ȳ)²    (sum over i = 1, …, n)
SS_reg = Σ (Ŷi − Ȳ)²
SS_res = Σ (Yi − Ŷi)²
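The decomposition can be verified numerically. A minimal sketch with illustrative data (the equality holds because the fitted values come from an OLS fit with an intercept):

import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([3.0, 5.0, 7.0, 6.0, 9.0])

b1, b0 = np.polyfit(x, y, 1)   # OLS slope and intercept
y_hat = b0 + b1 * x            # fitted values

ss_y = np.sum((y - y.mean()) ** 2)        # total variation (20.0)
ss_reg = np.sum((y_hat - y.mean()) ** 2)  # explained variation (16.9)
ss_res = np.sum((y - y_hat) ** 2)         # unexplained variation (3.1)

print(ss_y, ss_reg + ss_res)  # both 20.0: SS_y = SS_reg + SS_res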
The strength of association is measured by the square of the multiple correlation coefficient, R², which is also called the coefficient of multiple determination.

R² = SS_reg / SS_y

R² is adjusted for the number of independent variables and the sample size by using the following formula:

Adjusted R² = R² − k(1 − R²) / (n − k − 1)

where k is the number of independent variables and n is the sample size.
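Both formulas are easy to compute directly. A sketch using the illustrative sums of squares from the previous example (n = 5 observations, k = 1 predictor):

def r_squared(ss_reg, ss_y):
    return ss_reg / ss_y

def adjusted_r_squared(r2, n, k):
    # Penalizes R² for the number of predictors relative to the sample size
    return r2 - k * (1 - r2) / (n - k - 1)

r2 = r_squared(16.9, 20.0)               # 0.845
print(adjusted_r_squared(r2, n=5, k=1))  # about 0.793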
A partial correlation coefficient measures the association between two variables after controlling for, or adjusting for, the effects of one or more additional variables.

r_xy.z = (r_xy − r_xz · r_yz) / (√(1 − r_xz²) · √(1 − r_yz²))
• Partial correlations have an order associated with them. The order indicates how many variables are being adjusted or controlled.
• The simple correlation coefficient, r, has a zero order, as it does not control for any additional variables while measuring the association between two variables.
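A sketch of the first-order partial correlation formula above; the three zero-order correlations are illustrative values:

import math

def partial_corr(r_xy, r_xz, r_yz):
    # Remove the part of the x-y association that runs through z
    return (r_xy - r_xz * r_yz) / (
        math.sqrt(1 - r_xz**2) * math.sqrt(1 - r_yz**2)
    )

# x and y look correlated (0.60), but part of it is explained by z
print(partial_corr(r_xy=0.60, r_xz=0.50, r_yz=0.70))  # about 0.40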
• A dummy variable is a categorical explanatory variable with two levels:
  ◦ yes or no, on or off, male or female
  ◦ coded as 0 or 1
• Regression intercepts differ if the dummy variable is significant
• Assumes equal slopes for the other variables
• If there are more than two levels, the number of dummy variables needed is (number of levels − 1)
• Different intercepts, same slope:

Ŷ = b0 + b1X1 + b2(1) = (b0 + b2) + b1X1   (male)
Ŷ = b0 + b1X1 + b2(0) = b0 + b1X1          (female)

[Figure: salary (Y) plotted against months of employment (X1), with two parallel regression lines: intercept b0 for female employees and b0 + b2 for male employees.]

If H0: β2 = 0 is rejected, then the salaries of male and female employees are significantly different.
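A sketch of this dummy-variable model on hypothetical salary data (statsmodels again; all names and coefficient values are made up). The t-test on b2 is the test of H0: β2 = 0 pictured above:

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 40
months = rng.uniform(1, 60, size=n)  # X1: months of employment
male = rng.integers(0, 2, size=n)    # dummy: 1 = male, 0 = female
salary = 30 + 0.5 * months + 5 * male + rng.normal(scale=2, size=n)

X = sm.add_constant(np.column_stack([months, male]))
fit = sm.OLS(salary, X).fit()

print(fit.params)   # [b0, b1, b2]: parallel lines, intercepts b0 and b0 + b2
print(fit.pvalues)  # p-value on b2 tests whether the intercepts differ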


• The model is linear in parameters.
• The independent variables are assumed to be non-random.
• For given values of the Xi's, the expected value of the disturbance term is 0, i.e. E(ui | Xi) = 0.
• For given values of the Xi's, the variance of the disturbance term is constant (homoscedasticity), i.e. Var(ui | Xi) = σ².
• There is no exact linear relationship between the independent variables (no perfect multicollinearity).
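The last assumption can be checked numerically: the design matrix must have full column rank. A minimal sketch with illustrative data, where one column is an exact linear combination of two others:

import numpy as np

x1 = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
x2 = np.array([2.0, 1.0, 4.0, 3.0, 5.0])
x3 = 2 * x1 + x2   # exact linear combination of x1 and x2

X_ok = np.column_stack([np.ones(5), x1, x2])
X_bad = np.column_stack([np.ones(5), x1, x2, x3])

print(np.linalg.matrix_rank(X_ok))   # 3 = number of columns: assumption holds
print(np.linalg.matrix_rank(X_bad))  # 3 < 4 columns: perfect multicollinearity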
