[Figure from an earlier slide: academic, professional, and cultural documents, with values 0.45 and 0.21 (one entry marked "???"); published in Agricultural Economics, best article of the year, 2008]
Session 3 Topics
- Multiple regression analysis
  - What does it mean?
  - Why is it important?
  - How is it done and how are results interpreted?
  - What are the hazards?
What does it mean?
- Multivariate analysis/statistics
- Ceteris paribus
- All else equal
- "Controlling for"
Why does it matter?

  y = β0 + β1*x1 + u
  u = β2*x2 + ε

- If x2 is correlated with x1, the OLS estimate of β1 is biased.
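The bias from leaving a correlated regressor in the error term can be illustrated with a small simulation (a minimal sketch; all variable names and parameter values are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# True model: y = 1 + 2*x1 + 3*x2 + e, with x1 and x2 correlated.
x2 = rng.normal(size=n)
x1 = 0.8 * x2 + rng.normal(size=n)   # Corr(x1, x2) > 0
y = 1.0 + 2.0 * x1 + 3.0 * x2 + rng.normal(size=n)

# Short regression of y on x1 alone: x2 is absorbed into the error term.
X_short = np.column_stack([np.ones(n), x1])
b_short = np.linalg.lstsq(X_short, y, rcond=None)[0]

# Long regression that also controls for x2 recovers the true slope.
X_long = np.column_stack([np.ones(n), x1, x2])
b_long = np.linalg.lstsq(X_long, y, rcond=None)[0]

print(b_short[1])   # well above the true value of 2 (upward bias)
print(b_long[1])    # close to 2
```

With a positive β2 and positive correlation between x1 and x2, the short-regression slope is pulled upward, exactly the pattern summarized later in the omitted-variable-bias table.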
      Source |       SS       df       MS              Number of obs =    8648
-------------+------------------------------           F(  1,  8646) = 1526.38
       Model |  2.1590e+09     1  2.1590e+09           Prob > F      =  0.0000
    Residual |  1.2229e+10  8646  1414446.51           R-squared     =  0.1501
-------------+------------------------------           Adj R-squared =  0.1500
       Total |  1.4388e+10  8647  1663962.69           Root MSE      =  1189.3

------------------------------------------------------------------------------
     mzyield |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   basaprate |   5.254685   .1344979    39.07   0.000     4.991037    5.518333
       _cons |    1335.84   14.57861    91.63   0.000     1307.262    1364.417
------------------------------------------------------------------------------
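The fitted line can be read straight off the Coef. column of the output above. A minimal sketch (the application rate of 50 is an arbitrary illustrative value, and `predict_mzyield` is a hypothetical helper name):

```python
# Fitted line from the Stata output: mzyield_hat = _cons + coef * basaprate
b0, b1 = 1335.84, 5.254685

def predict_mzyield(basaprate):
    """Predicted maize yield at a given basal application rate."""
    return b0 + b1 * basaprate

# Each one-unit increase in basaprate raises predicted yield by about 5.25.
print(predict_mzyield(0))    # 1335.84 (the intercept)
print(predict_mzyield(50))   # 1598.57425
```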
      Source |       SS       df       MS              Number of obs =    8647
-------------+------------------------------           F(  2,  8644) =  840.22
       Model |  2.3418e+09     2  1.1709e+09           Prob > F      =  0.0000
    Residual |  1.2046e+10  8644  1393535.34           R-squared     =  0.1628
-------------+------------------------------           Adj R-squared =  0.1626
       Total |  1.4387e+10  8646  1664061.58           Root MSE      =  1180.5

------------------------------------------------------------------------------
     mzyield |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
   basaprate |   1.897807    .321747     5.90   0.000     1.267106    2.528508
   topaprate |    3.62044   .3157663    11.47   0.000     3.001463    4.239418
       _cons |    1314.93   14.58701    90.14   0.000     1286.336    1343.524
------------------------------------------------------------------------------
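As a sanity check on reading the output, each t statistic is just the coefficient divided by its standard error (values copied from the table above):

```python
# t statistic = coef / std. err., row by row from the multiple regression.
coef_basaprate, se_basaprate = 1.897807, 0.321747
coef_topaprate, se_topaprate = 3.62044, 0.3157663

t_basaprate = coef_basaprate / se_basaprate
t_topaprate = coef_topaprate / se_topaprate

print(round(t_basaprate, 2))   # 5.9   (table shows 5.90)
print(round(t_topaprate, 2))   # 11.47 (matches the table)
```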
- β0 is the intercept
- β1 is the slope coefficient on x1, holding the other regressors fixed
How is it done?
- OLS: choose the β̂'s to minimize Σ_{i=1}^{n} (y_i − β̂0 − β̂1*x1_i − β̂2*x2_i)²
- Minimize the noise
- Squared, so residuals don't offset
- Gives us the β̂'s and predicted values ŷ = β̂0 + β̂1*x1 + β̂2*x2
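The minimization above can be carried out directly by solving the normal equations X'Xβ = X'y. A minimal sketch with invented data (names and true coefficients are for illustration only):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Illustrative data: y = 1 + 2*x1 + 3*x2 + noise
x1 = rng.normal(size=n)
x2 = rng.normal(size=n)
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

# OLS: the beta minimizing sum_i (y_i - X_i @ beta)^2 solves X'X beta = X'y.
X = np.column_stack([np.ones(n), x1, x2])
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

y_hat = X @ beta_hat            # predicted values
resid = y - y_hat               # residuals
ssr = np.sum(resid ** 2)        # the minimized sum of squared residuals

# Any other coefficient vector leaves a larger sum of squares:
ssr_other = np.sum((y - X @ (beta_hat + 0.1)) ** 2)
print(beta_hat, ssr < ssr_other)
```

The comparison at the end illustrates the "minimize" part: perturbing the OLS estimates in any direction can only increase the sum of squared residuals.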
Key Assumptions
- Linear in parameters
- Random sample
- Zero conditional mean
- No perfect collinearity (variation in data)
- Homoskedastic errors
Perfect Collinearity
- One variable is an exact linear function of the others
- No variation in one variable (collinear w/ the intercept)
- Perfect correlation between 2 binary variables
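Each of these cases makes the design matrix rank-deficient, so X'X is singular and OLS has no unique solution (software such as Stata responds by dropping a column). A minimal sketch with invented variables:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50

const = np.ones(n)
x1 = rng.normal(size=n)
x2 = 3 * x1 + 1            # exact linear function of x1 and the constant

# Rank-deficient design matrix: only 2 independent columns out of 3.
X = np.column_stack([const, x1, x2])
print(np.linalg.matrix_rank(X))    # 2 instead of 3

# Dummy-variable trap: two binary indicators that always sum to one.
female = rng.integers(0, 2, size=n)
male = 1 - female
X2 = np.column_stack([const, female, male])
print(np.linalg.matrix_rank(X2))   # 2 again
```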
Other hazards
- Multi-collinearity
- Including irrelevant variables
- Omitting relevant variables
Multi-Collinearity
- Highly correlated variables
- Variable is a nonlinear function of others
- What's the problem?
  - Efficiency losses
- Schmidt rule of thumb
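One common diagnostic for the efficiency loss (not necessarily the Schmidt rule mentioned above) is the variance inflation factor, VIF_j = 1/(1 − R²_j), where R²_j comes from regressing regressor j on the others; values above 10 are a widely used rule-of-thumb warning. A sketch with invented, deliberately collinear data:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 1000

x1 = rng.normal(size=n)
x2 = x1 + 0.1 * rng.normal(size=n)   # highly (but not perfectly) correlated

# Auxiliary regression of x1 on the other regressor (plus a constant).
X_aux = np.column_stack([np.ones(n), x2])
coef = np.linalg.lstsq(X_aux, x1, rcond=None)[0]
resid = x1 - X_aux @ coef
r2_aux = 1 - resid.var() / x1.var()

vif = 1 / (1 - r2_aux)
print(vif)   # far above the common cutoff of 10
```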
Omitted Variable Bias
- Suppose we omit x2 (underspecifying)
- OLS is generally biased
- Let x̃2 = δ̃0 + δ̃1*x1 (the auxiliary regression of x2 on x1)
Omitted Variable Bias

  E[β̃1] = β1 + β2*δ̃1

where δ̃1 is the slope from regressing x2 on x1.

             | Corr(x1,x2) > 0 | Corr(x1,x2) < 0
  β2 > 0     | Positive bias   | Negative bias
  β2 < 0     | Negative bias   | Positive bias
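The bias formula can be checked numerically: with invented parameters b1 = 2, b2 = 3 and x2 built so that δ1 ≈ 0.5, the short-regression slope lands near b1 + b2*δ1 = 3.5 rather than 2 (a sketch, all values illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
n = 200_000

# True model: y = 1 + b1*x1 + b2*x2 + e, with x2 = 0.5*x1 + v.
b1, b2 = 2.0, 3.0
x1 = rng.normal(size=n)
x2 = 0.5 * x1 + rng.normal(size=n)
y = 1.0 + b1 * x1 + b2 * x2 + rng.normal(size=n)

# delta1: slope of the auxiliary regression of x2 on x1.
delta1 = np.cov(x1, x2)[0, 1] / np.var(x1, ddof=1)

# Short regression that omits x2.
X = np.column_stack([np.ones(n), x1])
b1_tilde = np.linalg.lstsq(X, y, rcond=None)[0][1]

print(b1_tilde)              # approx b1 + b2*delta1 ~ 3.5, not 2.0
print(b1 + b2 * delta1)
```

Since b2 > 0 and Corr(x1, x2) > 0 here, the simulation lands in the "positive bias" cell of the table.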
Goodness of fit
- R²

Binary regressors
- Other categorical regressors
- Categorical regressors as a series of binary regressors
- Quadratic terms
- Other interactions
- Average Partial Effects
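The R-squared and adjusted R-squared shown in the multiple regression output earlier can be recomputed from its ANOVA block: R² is the model share of the total sum of squares, and adjusted R² replaces sums of squares with mean squares, which penalizes extra regressors. A quick check using the printed values:

```python
# SS and MS values copied from the multiple regression output above.
ss_model, ss_total = 2.3418e+09, 1.4387e+10
ms_resid, ms_total = 1393535.34, 1664061.58

r2 = ss_model / ss_total            # explained share of total variation
adj_r2 = 1 - ms_resid / ms_total    # degrees-of-freedom adjusted

print(round(r2, 4))      # 0.1628, as reported
print(round(adj_r2, 4))  # 0.1626, as reported
```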