
FINAL EXAM DANIKA LI

QUESTION #1

1. Does the estimator from regressing Y on X1 suffer from omitted variable bias?
Explain.
a. No. Omitted variable bias requires two conditions to hold together: the
omitted variable X2 must be a determinant of Y, and X2 must be correlated
with the included regressor X1. X2 may be a determinant of Y, which
satisfies the first condition, but X1 and X2 are given to be
uncorrelated, so the second condition fails and there's no omitted
variable bias.
2. Calculate the variance of Bhat1 when X1i and X2i are uncorrelated.
a. Var(Bhat1) = [1/(n-1)]*[(sigmau^2)/(varX1)]*[1/(1-R1^2)]
=(1/400)*(4/6)*[1/(1-R1^2)]
=0.0025*0.6667*[1/(1-R1^2)]
=0.00167*[1/(1-0)]
=0.00167*1
=0.00167
3. Assume that cor(X1, X2)=0.5 and R1^2=0.25. Compute the variance again.
a. Var(Bhat1) = [1/(n-1)]*[(sigmau^2)/(varX1)]*[1/(1-R1^2)]
=(1/400)*(4/6)*[1/(1-R1^2)]
=0.0025*0.6667*[1/(1-R1^2)]
=0.00167*[1/(1-0.25)]
=0.00167*(1/0.75)
=0.00222
4. When X1 and X2 are correlated, the variance of Bhat1 is larger than it would
be if X1 and X2 were uncorrelated. Thus if you are interested in Bhat1, it is
best to leave X2 out of the regression if it's correlated with X1.
a. The first part of the statement is true: when X1 and X2 are correlated,
the variance of Bhat1 rises from 0.00167 to 0.00222. The second part
is false. If X1 and X2 are correlated, the second condition for
omitted variable bias is also satisfied, so leaving X2 out trades a
somewhat larger variance for a biased estimate of B1 (assuming X2 is a
determinant of Y).
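
The variance calculations in parts 2 and 3 can be reproduced with a small helper. This is a minimal sketch: the function name var_bhat1 and its argument names are illustrative and not part of the exam, and the inputs (n-1 = 400, sigmau^2 = 4, varX1 = 6) are the values used in the answers above.

```r
# Variance of Bhat1 under homoskedasticity with two regressors:
# Var(Bhat1) = [1/(n-1)] * [sigma_u^2 / var(X1)] * [1 / (1 - R1^2)]
# var_bhat1 is a hypothetical helper name, not from the exam.
var_bhat1 <- function(n_minus_1, sigma_u2, var_x1, r1_sq) {
  (1 / n_minus_1) * (sigma_u2 / var_x1) * (1 / (1 - r1_sq))
}

v_uncorrelated <- var_bhat1(400, 4, 6, r1_sq = 0)     # part 2
v_correlated   <- var_bhat1(400, 4, 6, r1_sq = 0.25)  # part 3

print(round(c(v_uncorrelated, v_correlated), 5))

# The ratio of the two variances is the inflation factor 1/(1 - R1^2)
print(v_correlated / v_uncorrelated)
```

The ratio depends only on R1^2, which is why correlation between X1 and X2 inflates the variance regardless of the other inputs.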

QUESTION 2

1. Is educ significant at the 5% level for model 1?


> wage<-read.csv("wage2.csv", header=T)
> model1<-lm(log(wage)~educ, data=wage)
> library(zoo)
> library(lmtest)
> library(sandwich)
> coeftest(model1, vcov=vcovHC(model1, type="HC3"))
t test of coefficients:

Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.9730625 0.0824651 72.431 < 2.2e-16 ***
educ 0.0598392 0.0060949 9.818 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Yes, educ is significant at the 5% level: the p-value is far below 0.05
and the robust t-statistic (9.818) is higher than the 1.96 critical value.

a. Confidence interval:
> .0598392-1.96*0.006
[1] 0.0480792
> lower_bound<-.0598392-1.96*0.006
> upper_bound<-.0598392+1.96*0.006
> print(c(lower_bound, upper_bound))
[1] 0.0480792 0.0715992

The 95% confidence interval, [0.048, 0.072], does not contain zero,
which agrees with the t-test above: educ is statistically
significant at the 5% level. (The standard error is rounded to 0.006
in this calculation.)

b. Interpretation of educ variable: Each additional year of education
is associated with an approximately 6.0% (0.0598 log points) increase
in monthly wages.
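
The interval above rounds the standard error to 0.006. A sketch of the same calculation with the unrounded HC3 standard error from the coeftest output (0.0060949); the variable names are illustrative.

```r
# 95% confidence interval for the educ coefficient in model 1,
# using the unrounded HC3 standard error from the coeftest output.
b_educ  <- 0.0598392
se_educ <- 0.0060949
z_crit  <- 1.96   # large-sample normal critical value used in the exam

ci_educ <- c(lower = b_educ - z_crit * se_educ,
             upper = b_educ + z_crit * se_educ)
print(round(ci_educ, 4))

# The interval excludes zero, matching the t-test conclusion.
excludes_zero <- ci_educ[["lower"]] > 0 || ci_educ[["upper"]] < 0
print(excludes_zero)
```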
2. Estimation including experience and tenure
a. Model 2:
> model2<-lm(log(wage)~educ+exper, data=wage)
> summary(model2)

Call:
lm(formula = log(wage) ~ educ + exper, data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.86915 -0.24001 0.03564 0.26132 1.30062

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.502710 0.112037 49.115 < 2e-16 ***
educ 0.077782 0.006577 11.827 < 2e-16 ***
exper 0.019777 0.003303 5.988 3.02e-09 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.393 on 932 degrees of freedom


Multiple R-squared: 0.1309, Adjusted R-squared: 0.129
F-statistic: 70.16 on 2 and 932 DF, p-value: < 2.2e-16

i. Coefficient interpretation: Holding experience constant, each
additional year of education is associated with an approximately
7.78% increase in monthly wages.
b. Model 3:
> model3<-lm(log(wage)~educ+exper+tenure, data=wage)
> summary(model3)

Call:
lm(formula = log(wage) ~ educ + exper + tenure, data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.8282 -0.2401 0.0203 0.2569 1.3400

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.496696 0.110528 49.731 < 2e-16 ***
educ 0.074864 0.006512 11.495 < 2e-16 ***
exper 0.015328 0.003370 4.549 6.10e-06 ***
tenure 0.013375 0.002587 5.170 2.87e-07 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3877 on 931 degrees of freedom


Multiple R-squared: 0.1551, Adjusted R-squared: 0.1524
F-statistic: 56.97 on 3 and 931 DF, p-value: < 2.2e-16

i. Coefficient interpretation: Holding experience and tenure constant,
each additional year of education is associated with an
approximately 7.49% increase in monthly wages.
c. Does model 1 suffer from omitted variable bias? Is bhat1
overestimated or underestimated?
i. Yes. Adding experience and tenure as additional independent
variables changes the coefficient on education (bhat1), and both
added variables are significant at the 0.001 (0.1%) level, so they
are determinants of log(wage) that are correlated with educ.
ii. Bhat1 in model 1 (0.0598) is therefore underestimated
compared to bhat1 in model 2 (0.0778) and model 3 (0.0749). A
downward bias with a positive coefficient on exper implies that
educ and exper are negatively correlated in this sample.
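
The direction of the bias follows from the standard omitted variable bias identity. The block below writes it out for model 1 (short) versus model 2 (long). The symbol \hat{\delta}_1, the slope from regressing exper on educ, is introduced here for illustration; its value is only what the reported coefficients imply, assuming both models are estimated on the same observations.

```latex
\hat{\beta}_1^{\text{short}}
  = \hat{\beta}_1^{\text{long}} + \hat{\beta}_2^{\text{long}}\,\hat{\delta}_1,
\qquad
\hat{\delta}_1
  = \frac{\widehat{\operatorname{Cov}}(educ,\, exper)}{\widehat{\operatorname{Var}}(educ)}

0.0598 = 0.0778 + 0.0198\,\hat{\delta}_1
\quad\Longrightarrow\quad
\hat{\delta}_1 \approx -0.91
```

A negative \hat{\delta}_1 is what produces the downward bias in model 1: people with more schooling tend to have fewer years of experience in this sample.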
3. Use the results of model 3, compute predicted monthly earnings at the
averages of educ, exper, and tenure.
a. > mean_educ<-mean(wage$educ)
> mean_exper<-mean(wage$exper)
> mean_tenure<-mean(wage$tenure)
> 5.496696+(0.074864*mean_educ)+(0.015328*mean_exper)+(0.013375*mean_tenure)
[1] 6.779003
> exp(6.779003)
[1] 879.1917

Predicted monthly earnings are about $879.19.


4. Which model would you select for the return to education? Why?
a. I would use model 3. Bhat1 changes noticeably once experience and
tenure are added, and both variables are statistically significant at
the 5% level, which indicates that they address omitted variable bias
present in model 1 and, to a lesser extent, model 2. Model 3 also has
the highest R-squared (0.1551) and adjusted R-squared (0.1524), so it
explains the most variation in log(wage).

QUESTION #3

1. log(wage)=B0+B1educ+B2educ*pareduc+B3tenure+u
Holding all else constant and only changing educ:
Δlog(wage)=B1*Δeduc+B2*Δeduc*pareduc
Δlog(wage)=Δeduc*(B1+B2pareduc)
Δlog(wage)/Δeduc = B1+B2pareduc
2. Estimate model 4
a. > model4<-lm(log(wage)~educ+educ:pareduc+tenure, data=wage)
> summary(model4)

Call:
lm(formula = log(wage) ~ educ + educ:pareduc + tenure, data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.90863 -0.24051 0.02678 0.26726 1.28671

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 6.0315779 0.1030887 58.509 < 2e-16 ***
educ 0.0325911 0.0102024 3.194 0.001462 **
tenure 0.0146925 0.0028870 5.089 4.6e-07 ***
educ:pareduc 0.0007413 0.0002138 3.467 0.000557 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3892 on 718 degrees of freedom


(213 observations deleted due to missingness)
Multiple R-squared: 0.1424, Adjusted R-squared: 0.1388
F-statistic: 39.74 on 3 and 718 DF, p-value: < 2.2e-16

b. Estimated return to educ when pareduc=32


i. > 0.0325911+0.0007413*32
[1] 0.0563127
When pareduc=32, for every additional year of education,
monthly wage goes up by 5.63%.

c. Estimated return to educ when pareduc=24


i. > 0.0325911+0.0007413*24
[1] 0.0503823
When pareduc=24, for every additional year of education,
monthly wage goes up by 5.04%.
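
Both calculations apply the expression B1+B2pareduc from part 1 to the model 4 estimates. A minimal sketch, with return_to_educ as a hypothetical helper name:

```r
# Estimated return to education in model 4 as a function of pareduc:
# d log(wage) / d educ = B1 + B2 * pareduc
# return_to_educ is an illustrative name, not part of the exam.
return_to_educ <- function(pareduc, b1 = 0.0325911, b2 = 0.0007413) {
  b1 + b2 * pareduc
}

returns <- return_to_educ(c(32, 24))
print(returns)

# Difference between the two returns, in percentage points
print(100 * (returns[1] - returns[2]))
```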
3. Add pareduc into the model as a separate independent variable (Model 5)
and estimate it. Does the estimated return to education now depend on
parent education?
a. >model5<-lm(log(wage)~educ+pareduc+educ:pareduc+tenure,
data=wage)
> summary(model5)

Call:
lm(formula = log(wage) ~ educ + pareduc + educ:pareduc + tenure,
data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.91704 -0.23329 0.02131 0.26594 1.29484

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.487671 0.368315 14.899 < 2e-16 ***
educ 0.071609 0.027338 2.619 0.009 **
pareduc 0.025999 0.016903 1.538 0.124
tenure 0.014891 0.002887 5.158 3.24e-07 ***
educ:pareduc -0.001094 0.001212 -0.902 0.367
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3888 on 717 degrees of freedom


(213 observations deleted due to missingness)
Multiple R-squared: 0.1452, Adjusted R-squared: 0.1405
F-statistic: 30.45 on 4 and 717 DF, p-value: < 2.2e-16

> coeftest(model5, vcov=vcovHC(model5, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 5.4876712 0.3658890 14.9982 < 2.2e-16 ***
educ 0.0716086 0.0271934 2.6333 0.008638 **
pareduc 0.0259987 0.0172617 1.5062 0.132468
tenure 0.0148915 0.0028721 5.1849 2.815e-07 ***
educ:pareduc -0.0010938 0.0012420 -0.8807 0.378786
---

Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

The estimated return to education does not depend on parental
education at a statistically significant level. Although model 5
reports a coefficient on educ:pareduc (-0.0011), the robust t-test
shows it's not statistically significant even at the 10% level: the
p-value (0.379) is higher than 0.05 and the absolute t-stat (0.88) is
lower than 1.96. The linear hypothesis test shown below addresses a
different question. It rejects the joint null that pareduc and
educ:pareduc are both zero (F = 7.532, p = 0.0006), so parental
education matters for wages overall even though neither term is
individually significant, a pattern consistent with pareduc and
educ:pareduc being highly correlated.

> library(car)
>linearHypothesis(model5,c("pareduc=0","educ:pareduc=0"),
vcov=vcovHC(model5, type="HC3"))
Linear hypothesis test

Hypothesis:
pareduc = 0
educ:pareduc = 0

Model 1: restricted model


Model 2: log(wage) ~ educ + pareduc + educ:pareduc + tenure

Note: Coefficient covariance matrix supplied.

Res.Df Df F Pr(>F)
1 719
2 717 2 7.532 0.0005791 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

4. By omitting pareduc as a separate variable, model 4 forces parental
education to affect wages only through its interaction with educ, which
can bias the interaction coefficient. Including pareduc in model 5 removes
that restriction and changes the estimate: the coefficient on educ:pareduc
moves from 0.00074 (significant) to -0.00109 (not significant). This is a
good example of what happens when you don't include all the necessary terms
in a regression, as the estimate wasn't just off by a few decimals but
changed sign. Because the interaction in model 5 is imprecisely estimated,
the sign change itself should be read with caution.

QUESTION #4

1. What is the approximate difference in monthly salary between blacks and
nonblacks? Is this difference statistically significant at the 5% significance level?
a. >model6<-
lm(log(wage)~educ+exper+tenure+married+black+south+urban,
data=wage)
> summary(model6)

Call:
lm(formula = log(wage) ~ educ + exper + tenure + married + black +
south + urban, data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.98069 -0.21996 0.00707 0.24288 1.22822

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.395497 0.113225 47.653 < 2e-16 ***
educ 0.065431 0.006250 10.468 < 2e-16 ***
exper 0.014043 0.003185 4.409 1.16e-05 ***
tenure 0.011747 0.002453 4.789 1.95e-06 ***
married 0.199417 0.039050 5.107 3.98e-07 ***
black -0.188350 0.037667 -5.000 6.84e-07 ***
south -0.090904 0.026249 -3.463 0.000558 ***
urban 0.183912 0.026958 6.822 1.62e-11 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3655 on 927 degrees of freedom


Multiple R-squared: 0.2526, Adjusted R-squared: 0.2469
F-statistic: 44.75 on 7 and 927 DF, p-value: < 2.2e-16

According to the model, holding the other regressors constant, monthly
salary is approximately 18.83% lower for blacks than for nonblacks.
According to the t-test below, this difference is statistically
significant at the 0.001 level (and therefore the 0.05 level as well).
The p-value is minuscule, much lower than 0.05, and the absolute value
of the t-stat (5.09) is higher than the critical value of 1.96.

> coeftest(model6, vcov=vcovHC(model6, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 5.3954970 0.1137966 47.4135 < 2.2e-16 ***
educ 0.0654307 0.0064452 10.1519 < 2.2e-16 ***
exper 0.0140430 0.0032611 4.3062 1.838e-05 ***
tenure 0.0117473 0.0025532 4.6010 4.789e-06 ***
married 0.1994171 0.0401269 4.9697 7.986e-07 ***
black -0.1883499 0.0370303 -5.0864 4.417e-07 ***
south -0.0909037 0.0275051 -3.3050 0.0009863 ***
urban 0.1839121 0.0272624 6.7460 2.673e-11 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
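
The 18.83% figure reads the coefficient on black directly as a percentage, which is the usual log-point approximation. For a dummy variable in a log-linear model the exact percentage difference is 100*(exp(B)-1). The sketch below computes both, with illustrative variable names.

```r
# Approximate vs. exact percentage difference implied by a dummy
# coefficient in a log(wage) regression (model 6, coefficient on black).
b_black <- -0.188350

approx_pct <- 100 * b_black             # log-point approximation
exact_pct  <- 100 * (exp(b_black) - 1)  # exact percentage difference

print(round(c(approximate = approx_pct, exact = exact_pct), 2))
```

The gap between the two grows with the size of the coefficient, so the approximation is best for small effects.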

2. Add the interaction term educ:black. Is this coefficient statistically significant
at the 5% significance level?
a. > model7<-lm(log(wage)~educ+exper+tenure+married+black+south+urban+
educ:black, data=wage)
> summary(model7)

Call:
lm(formula = log(wage) ~ educ + exper + tenure + married + black +
south + urban + educ:black, data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.97782 -0.21832 0.00475 0.24136 1.23226

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.374817 0.114703 46.859 < 2e-16 ***
educ 0.067115 0.006428 10.442 < 2e-16 ***
exper 0.013826 0.003191 4.333 1.63e-05 ***
tenure 0.011787 0.002453 4.805 1.80e-06 ***
married 0.198908 0.039047 5.094 4.25e-07 ***
black 0.094809 0.255399 0.371 0.710561
south -0.089450 0.026277 -3.404 0.000692 ***
urban 0.183852 0.026955 6.821 1.63e-11 ***
educ:black -0.022624 0.020183 -1.121 0.262603
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3654 on 926 degrees of freedom


Multiple R-squared: 0.2536, Adjusted R-squared: 0.2471
F-statistic: 39.32 on 8 and 926 DF, p-value: < 2.2e-16

> coeftest(model7, vcov=vcovHC(model7, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 5.3748170 0.1161704 46.2666 < 2.2e-16 ***
educ 0.0671153 0.0067213 9.9854 < 2.2e-16 ***
exper 0.0138259 0.0032655 4.2339 2.526e-05 ***
tenure 0.0117870 0.0025532 4.6165 4.453e-06 ***
married 0.1989077 0.0400959 4.9608 8.351e-07 ***
black 0.0948087 0.2163599 0.4382 0.661344
south -0.0894495 0.0274961 -3.2532 0.001183 **
urban 0.1838523 0.0272556 6.7455 2.684e-11 ***
educ:black -0.0226236 0.0169537 -1.3344 0.182390
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

After running a robust t-test on model 7, you see that the interaction
term educ:black is not statistically significant, even at the 10%
level. Because the p-value (0.182) is higher than 0.05 and the
absolute value of the t-stat (1.33) is smaller than the critical
value of 1.96, educ:black is not statistically significant at the 5%
level: the return to education does not depend on race at a
statistically significant level.

3. Test the null hypothesis that the effects of all dummy variables are equal to
zero with a heteroskedasticity-robust F-test
a. > linearHypothesis(model6, c("exper=0", "tenure=0","married=0",
"black=0", "south=0", "urban=0"), vcov=vcovHC(model6, type =
"HC3"))
Linear hypothesis test

Hypothesis:
exper = 0
tenure = 0
married = 0
black = 0
south = 0
urban = 0

Model 1: restricted model


Model 2: log(wage) ~ educ + exper + tenure + married + black + south
+
urban

Note: Coefficient covariance matrix supplied.

Res.Df Df F Pr(>F)
1 933
2 927 6 37.058 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

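Because the p-value is smaller than the 0.05 alpha level and the
F-stat (37.058) is far above the 5% critical value of the F
distribution with 6 and 927 degrees of freedom (about 2.1), we
reject the null hypothesis that the coefficients are jointly equal
to zero. Note that the hypothesis as specified also restricts exper
and tenure, which are not dummy variables; the dummy variables in
model 6 are married, black, south, and urban. In effect, the tested
variables have a jointly statistically significant effect.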
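
The critical value for a joint test comes from the F distribution, not from the 1.96 used for two-sided t-tests. A short sketch of the comparison using base R's qf, with the statistic and degrees of freedom taken from the test output above:

```r
# 5% critical value for the robust F-test: 6 restrictions, 927 residual df
f_stat <- 37.058
f_crit <- qf(0.95, df1 = 6, df2 = 927)
print(f_crit)

# Reject the joint null when the F statistic exceeds the critical value
reject_null <- f_stat > f_crit
print(reject_null)
```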

4. Extend the original model so that the return to education depends on the
amount of work experience (model 8 with interaction term educ:exper). Obtain
thetahat1 and a 95% confidence interval for theta1.
a. Holding all else equal:
log(wage)=B0+B1educ+B2exper+B3educ*exper+B4tenure+B5married
+B6black+B7south+B8urban+u

log(wage)=B0+B1educ+B2exper+B3educ*exper+u
theta1 = B1+10B3 is the return to education at exper=10.
Plug in B1 = theta1-10B3

log(wage)=B0+(theta1-10B3)educ+B2exper+B3educ*exper+u
=B0+theta1educ+B2exper-10B3educ+B3educ*exper+u
=B0+theta1educ+B2exper+B3educ(exper-10)+u

Therefore, regress log(wage) on educ, exper, and educ(exper-10); the
coefficient on educ is thetahat1.
>model8<-lm(log(wage)~educ+exper+I(educ*(exper-10)),
data=wage)
> summary(model8)

Call:
lm(formula = log(wage) ~ educ + exper + I(educ * (exper - 10)),
data = wage)

Residuals:
Min 1Q Median 3Q Max
-1.88558 -0.24553 0.03558 0.26171 1.28836

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 5.949455 0.240826 24.704 <2e-16 ***
educ 0.076080 0.006615 11.501 <2e-16 ***
exper -0.021496 0.019978 -1.076 0.2822
I(educ * (exper - 10)) 0.003203 0.001529 2.095 0.0365 *
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3923 on 931 degrees of freedom


Multiple R-squared: 0.1349, Adjusted R-squared: 0.1321
F-statistic: 48.41 on 3 and 931 DF, p-value: < 2.2e-16

Theta hat in this case is the return to education at 10 years of
experience, or 0.0761. The code for the confidence interval is shown
below; the 95% confidence interval for theta1 is [0.063, 0.089],
using the standard error from summary(model8). Since the interval
does not contain zero, theta hat is statistically significant at
the 5% level.

> lower_bound<-0.076080-1.96*0.006615
> upper_bound<-0.076080+1.96*0.006615
> print(c(lower_bound, upper_bound))
[1] 0.0631146 0.0890454

QUESTION #5

1. Estimate the model with only 1988 data. What are the estimated effects of
education and union membership? Are they significant at the 5% level?
a. > nls_panel<-read.csv("nls_panel.csv", header=TRUE)
> nls88<-subset(nls_panel, year==88)
> model88<-lm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+
south+union, data=nls88)
> summary(model88)

Call:
lm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls88)

Residuals:
Min 1Q Median 3Q Max
-1.55873 -0.23842 -0.00052 0.23490 1.82679

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2237350 0.2240258 0.999 0.318281
educ 0.0776627 0.0063978 12.139 < 2e-16 ***
exper 0.0787905 0.0307279 2.564 0.010549 *
I(exper^2) -0.0016709 0.0010510 -1.590 0.112327
tenure 0.0076095 0.0098219 0.775 0.438745
I(tenure^2) -0.0002872 0.0005024 -0.572 0.567701
black -0.1310958 0.0372783 -3.517 0.000465 ***
south -0.1370122 0.0336513 -4.072 5.2e-05 ***
union 0.1300245 0.0356196 3.650 0.000281 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.4044 on 707 degrees of freedom
Multiple R-squared: 0.3132, Adjusted R-squared: 0.3054
F-statistic: 40.3 on 8 and 707 DF, p-value: < 2.2e-16

> coeftest(model88, vcov=vcovHC(model88, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 0.22373504 0.20407101 1.0964 0.2732951
educ 0.07766268 0.00683453 11.3633 < 2.2e-16 ***
exper 0.07879051 0.02890784 2.7256 0.0065778 **
I(exper^2) -0.00167093 0.00102090 -1.6367 0.1021328
tenure 0.00760953 0.01128314 0.6744 0.5002670
I(tenure^2) -0.00028724 0.00055969 -0.5132 0.6079599
black -0.13109578 0.03383677 -3.8744 0.0001168 ***
south -0.13701219 0.03338922 -4.1035 4.544e-05 ***
union 0.13002448 0.03474406 3.7424 0.0001971 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Education: In 1988, each additional year of education was associated
with an approximately 7.77% increase in hourly wage. This effect is
statistically significant at the 0.05 level, because its p-value is
smaller than 0.05 and its t-stat is higher than the 1.96 critical value.
Union membership: In 1988, membership in a union was associated with an
approximately 13% increase in hourly wage. This effect is statistically
significant at the 0.05 level, because its p-value is smaller than 0.05
and its t-stat is higher than the 1.96 critical value.

2. Estimate the model above with 1987 data. What are the estimated effects of
education and union membership? Are they similar to the results of 1988
data? Explain.
a. > nls87<-subset(nls_panel, year==87)
> model87<-lm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+
south+union, data=nls87)
> summary(model87)

Call:
lm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls87)

Residuals:
Min 1Q Median 3Q Max
-1.52585 -0.25020 -0.01483 0.21843 2.61713

Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.2242087 0.1890339 1.186 0.23599
educ 0.0759663 0.0062708 12.114 < 2e-16 ***
exper 0.0854817 0.0280038 3.053 0.00235 **
I(exper^2) -0.0020485 0.0010488 -1.953 0.05119 .
tenure 0.0068705 0.0097102 0.708 0.47945
I(tenure^2) -0.0001893 0.0005442 -0.348 0.72801
black -0.1574320 0.0366493 -4.296 1.99e-05 ***
south -0.1014177 0.0328986 -3.083 0.00213 **
union 0.1662697 0.0352498 4.717 2.89e-06 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Residual standard error: 0.3956 on 707 degrees of freedom


Multiple R-squared: 0.3289, Adjusted R-squared: 0.3213
F-statistic: 43.32 on 8 and 707 DF, p-value: < 2.2e-16

> coeftest(model87, vcov=vcovHC(model87, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 0.22420875 0.16043638 1.3975 0.1627037
educ 0.07596626 0.00764732 9.9337 < 2.2e-16 ***
exper 0.08548166 0.02559507 3.3398 0.0008825 ***
I(exper^2) -0.00204848 0.00099797 -2.0526 0.0404751 *
tenure 0.00687051 0.01026421 0.6694 0.5034808
I(tenure^2) -0.00018933 0.00056956 -0.3324 0.7396668
black -0.15743205 0.03350536 -4.6987 3.147e-06 ***
south -0.10141769 0.03165266 -3.2041 0.0014158 **
union 0.16626968 0.03784696 4.3932 1.288e-05 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Education: In 1987, an additional year of education was associated with
an approximately 7.60% increase in hourly wage. This effect is
statistically significant at the 0.05 level, as its p-value is smaller
than 0.05 and its t-value is higher than the 1.96 critical value.
Union membership: In 1987, membership in a union was associated with an
approximately 16.63% increase in hourly wage. This effect is
statistically significant at the 0.05 level, as its p-value is smaller
than 0.05 and its t-value is higher than the 1.96 critical value.

Similarities to 88: The effect of education in 1987 (0.0760) was very
similar to the 1988 effect (0.0777), about 0.17 percentage points
smaller. The union membership effect was about 3.62 percentage points
larger in 1987 (0.1663 versus 0.1300). Neither is a massive change,
and the similarity can be attributed to the fact that the two
regressions use data only one year apart.

3. Using the original dataset, estimate the pooled OLS model. What's the
estimated effect of being in a union? Using robust standard errors, did you
find any insignificant variables at the 5% significance level?
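
The plm calls in the answer that follows use a panel data frame named nls.plm, whose construction is not shown in the transcript. A minimal sketch of how such an object is typically built with the plm package's pdata.frame. The index column name id is an assumption (only year appears in the transcript), and the toy data frame stands in for nls_panel.csv.

```r
library(plm)

# Toy stand-in for nls_panel.csv: 3 individuals observed in 2 years.
# The column name `id` is an assumption; `year` appears in the exam code.
toy_panel <- data.frame(
  id   = rep(1:3, each = 2),
  year = rep(c(87, 88), times = 3),
  wage = c(5.1, 5.4, 7.0, 7.2, 4.3, 4.9)
)

# Declare the individual and time indexes so plm() knows the panel structure
toy.plm <- pdata.frame(toy_panel, index = c("id", "year"))
print(pdim(toy.plm))
```

With the real data, the same call on nls_panel would produce the object passed as data=nls.plm, provided the individual identifier column is named as assumed here.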
a. > library(plm)
> pooledmodel<-plm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+
south+union, data=nls.plm, model="pooling")
> summary(pooledmodel)
Pooling Model

Call:
plm(formula = log(wage) ~ educ + exper + I(exper^2) + tenure +
I(tenure^2) + black + south + union, data = nls.plm, model =
"pooling")

Balanced Panel: n=716, T=5, N=3580

Residuals :
Min. 1st Qu. Median 3rd Qu. Max.
-1.70000 -0.23300 -0.00438 0.21500 2.58000

Coefficients :
Estimate Std. Error t-value Pr(>|t|)
(Intercept) 0.47660008 0.05615585 8.4871 < 2.2e-16 ***
educ 0.07144879 0.00268939 26.5669 < 2.2e-16 ***
exper 0.05568504 0.00860716 6.4696 1.116e-10 ***
I(exper^2) -0.00114754 0.00036129 -3.1762 0.0015046 **
tenure 0.01496002 0.00440728 3.3944 0.0006953 ***
I(tenure^2) -0.00048604 0.00025770 -1.8860 0.0593697 .
black -0.11671387 0.01571590 -7.4265 1.387e-13 ***
south -0.10600256 0.01420083 -7.4645 1.045e-13 ***
union 0.13224321 0.01496161 8.8388 < 2.2e-16 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Total Sum of Squares: 772.56


Residual Sum of Squares: 521.03
R-Squared: 0.32559
Adj. R-Squared: 0.32408
F-statistic: 215.496 on 8 and 3571 DF, p-value: < 2.22e-16
> coeftest(pooledmodel, vcov=vcovHC(pooledmodel, type="HC3"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


(Intercept) 0.47660008 0.08480039 5.6203 2.053e-08 ***
educ 0.07144879 0.00550493 12.9790 < 2.2e-16 ***
exper 0.05568504 0.01139134 4.8884 1.062e-06 ***
I(exper^2) -0.00114754 0.00049662 -2.3107 0.02091 *
tenure 0.01496002 0.00714088 2.0950 0.03624 *
I(tenure^2) -0.00048604 0.00041159 -1.1809 0.23772
black -0.11671387 0.02816113 -4.1445 3.485e-05 ***
south -0.10600256 0.02708239 -3.9141 9.244e-05 ***
union 0.13224321 0.02710551 4.8788 1.114e-06 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

Union member: Being a member of a union is associated with an
approximately 13.22% higher hourly wage. Based on the robust t-test,
I(tenure^2) (p = 0.238) is the only statistically insignificant
variable. All other variables are significant at the 5% level or
better.

4. Estimate the individual fixed effects model. What is the sample size? Can you
find the coefficient on educ? Why was it dropped? Explain.
a. > fixed_id<-plm(log(wage)~educ+exper+I(exper^2)+tenure+I(tenure^2)+black+
south+union, data=nls.plm, model="within")
> coeftest(fixed_id, vcov=vcovHC(fixed_id, type="HC3", cluster="group"))

t test of coefficients:

Estimate Std. Error t value Pr(>|t|)


exper 0.04108314 0.00825483 4.9769 6.846e-07 ***
I(exper^2) -0.00040905 0.00033058 -1.2374 0.2160467
tenure 0.01390895 0.00422300 3.2936 0.0010011 **
I(tenure^2) -0.00089623 0.00024992 -3.5860 0.0003414 ***
south -0.01632239 0.05921706 -0.2756 0.7828471
union 0.06369724 0.01690438 3.7681 0.0001678 ***
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1

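The sample size is shown by the R command below: 716 individuals
observed over 5 years, for 3,580 observations. There is no
coefficient on educ because the fixed effects (within) estimator
uses only variation over time within each individual. educ does
not change over the sample period for any individual, so it is
perfectly collinear with the individual fixed effects and R
drops it automatically. The same applies to black.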

Sample size:
> pdim(fixed_id)
Balanced Panel: n=716, T=5, N=3580
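
Why a time-invariant regressor drops out can be seen from the within transformation, which subtracts each individual's mean from every variable. A small sketch with made-up data (the values and the column names id, educ, and union in the toy frame are illustrative only):

```r
# Within (fixed effects) transformation on a toy panel:
# 3 individuals, 2 years each. educ is constant within individual.
toy <- data.frame(
  id    = rep(1:3, each = 2),
  educ  = rep(c(12, 16, 14), each = 2),
  union = c(0, 1, 1, 1, 0, 0)
)

# ave() returns each individual's mean; subtracting it demeans by id
toy$educ_within  <- toy$educ  - ave(toy$educ,  toy$id)
toy$union_within <- toy$union - ave(toy$union, toy$id)
print(toy)

# educ has no within-individual variation left, so its coefficient
# cannot be estimated; union still varies for individual 1.
print(all(toy$educ_within == 0))
print(any(toy$union_within != 0))
```

The union coefficient in a fixed effects model is identified only from individuals whose union status changes over time, such as individual 1 in this toy panel.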

5. What is the estimated effect of union membership from the fixed_id model?
Compare it with the result from the pooled OLS model. Which model is
statistically reliable between the pooled OLS and fixed_id model?
a. Union membership in the fixed_id model is shown to yield a
6.37% increase in hourly wage. This is much smaller than the
estimated effect of union membership in the pooled OLS model
(13.22%).

> pFtest(fixed_id, pooledmodel)

F test for individual effects

data: log(wage) ~ educ + exper + I(exper^2) + tenure + I(tenure^2) +


...
F = 15.188, df1 = 713, df2 = 2858, p-value < 2.2e-16
alternative hypothesis: significant effects

The results of this pFtest show that the null hypothesis (no
significant individual fixed effects) is rejected in favor of the
alternative hypothesis of significant individual effects.
Statistically, this test shows that you should use the individual
fixed effects model rather than pooled OLS.

During this examination, all work has been my own. I give my word that I have not
resorted to any ethically questionable means of improving my grade or anyone else's
on this examination and that I have not discussed this exam with anyone other than
my instructor.
