A Guide to Modern Econometrics / 2ed

Answers to selected exercises - Chapter 4

Exercise 4.1
a. The OLS results (Eviews 5.0 output) are as follows:
Dependent Variable: AIRQ
Method: Least Squares
Date: 12/02/05
Time: 14:04
Sample: 1 30
Included observations: 30
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              111.9347     15.33179      7.300823   0.0000
VALA           0.000883     0.002256      0.391543   0.6989
RAIN           0.250699     0.343518      0.729798   0.4726
COAS          -33.39830     10.45752     -3.193711   0.0039
DENS          -0.001073     0.001623     -0.661225   0.5148
MEDI           0.000554     0.000850      0.652151   0.5205

R-squared             0.382922    Mean dependent var      104.7000
Adjusted R-squared    0.254364    S.D. dependent var      28.02850
S.E. of regression    24.20266    Akaike info criterion   9.387659
Sum squared resid     14058.45    Schwarz criterion       9.667898
Log likelihood       -134.8149    F-statistic             2.978597
Durbin-Watson stat    1.829941    Prob(F-statistic)       0.031334

In this sample, air quality (airq) varies from 59 (best) to 165 (worst).
Taking the estimated coefficients as given, the results indicate that coastal
regions, ceteris paribus, have a better air quality. Regions with a higher
value added, with more rain, or a higher household income tend to have
somewhat worse air quality. A higher population density is associated
with a somewhat better air quality, other things being equal. Statistical
significance is ignored here (see b).
b. We test the null hypothesis that the (true) coefficient for medi is zero by
means of a t-test. The test statistic can be computed as
$$\frac{0.000554 - 0}{0.000850} = 0.652.$$
This value does not allow us to reject the null hypothesis at any reasonable
level of significance. (The associated p-value is 0.52.) To test the joint
hypothesis that all coefficients, except the intercept, are equal to zero, we
use the F-test that is routinely provided. The F-statistic is 2.979 with a
p-value of 0.031. This indicates a marginal rejection at the 95% confidence
level.
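The two statistics in b. can be reproduced from the figures in the output of a. The sketch below is only an illustration: the function names are made up here, and the F-statistic is recovered from the R-squared through the usual identity for a model with an intercept, which is an assumption about how the reported value arises rather than something stated in the answer.

```python
# Figures copied from the OLS output in part a.
coef_medi = 0.000554
se_medi = 0.000850
r_squared = 0.382922
n_obs = 30
n_params = 6  # intercept plus five slope coefficients


def t_statistic(coef, std_err, null_value=0.0):
    """t-statistic for H0: coefficient equals null_value."""
    return (coef - null_value) / std_err


def overall_f_statistic(r2, n, k):
    """F-statistic for H0: all k-1 slopes are zero (model with intercept)."""
    return (r2 / (k - 1)) / ((1.0 - r2) / (n - k))


t_medi = t_statistic(coef_medi, se_medi)
f_all = overall_f_statistic(r_squared, n_obs, n_params)
print(round(t_medi, 3), round(f_all, 3))
```

The p-values themselves require the t- and F-distribution functions, which are not part of the Python standard library, so they are left out of this sketch.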
c. To perform the Goldfeld-Quandt test, we estimate the linear model for
the subsamples of coastal and non-coastal regions (omitting the coastal
dummy from the model). We find $s_A^2 = 808.77$ for the coastal sample
($N_A = 21$) and $s_B^2 = 23.75$ for the non-coastal sample ($N_B = 9$). The
resulting test statistic is $\lambda_1 = 34.05$. The appropriate distribution under
the null hypothesis is an F-distribution with $21 - 5$ and $9 - 5$ degrees of
freedom. The 1% one-sided critical value is 14.1. Accordingly, the null
hypothesis of homoskedastic errors is strongly rejected. Note that we can
also change the role of the two subsamples, resulting in a test statistic of
$\lambda_2 = 1/\lambda_1 = 0.0294$ and a 1% one-sided critical value of 0.07. The exact
(one-sided) p-value can be computed with additional software and equals
0.0018. The strong rejection of this test indicates that the assumption of
homoskedastic errors, which underlies computation of the standard errors
and test statistics of a. and b., is invalid. Very likely, the standard errors
are wrong and the tests of b. are potentially misleading.
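The arithmetic behind the Goldfeld-Quandt statistic in c. can be checked with a few lines. This is a minimal sketch using only the numbers quoted in the answer; the variable names are illustrative, and the critical value is taken from the text rather than computed.

```python
# Subsample error variance estimates and sample sizes quoted in part c.
s2_coastal = 808.77
s2_noncoastal = 23.75
n_coastal, n_noncoastal = 21, 9
k = 5  # parameters per subsample regression (coastal dummy omitted)

# Ratio of the two variance estimates, and the same test with roles swapped.
lambda_1 = s2_coastal / s2_noncoastal
lambda_2 = 1.0 / lambda_1

# Degrees of freedom of the F-distribution under the null.
df_num, df_den = n_coastal - k, n_noncoastal - k

critical_1pct = 14.1  # 1% one-sided critical value as quoted in the text
reject_homoskedasticity = lambda_1 > critical_1pct

print(round(lambda_1, 2), round(lambda_2, 4), (df_num, df_den),
      reject_homoskedasticity)
```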
d. We perform a Breusch-Pagan test by saving the residuals from the OLS
regression of a. and computing their squares $e_i^2$. Next, we regress them
upon the full set of explanatory variables, including a constant. This
produces the following results.
Dependent Variable: RESID2
Method: Least Squares
Date: 12/02/05
Time: 14:54
Sample: 1 30
Included observations: 30
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C             -2086.739     905.7886     -2.303782   0.0306
AIRQ           20.61042     6.719527      3.067242   0.0055
VALA          -0.018095     0.074508     -0.242855   0.8103
RAIN          -6.168581     11.43300     -0.539542   0.5947
COAS           1258.489     410.9406      3.062459   0.0055
DENS          -0.018498     0.053921     -0.343060   0.7347
MEDI          -0.016164     0.028236     -0.572467   0.5726

R-squared             0.364619    Mean dependent var      468.6151
Adjusted R-squared    0.198867    S.D. dependent var      890.1337
S.E. of regression    796.7232    Akaike info criterion   16.39986
Sum squared resid     14599661    Schwarz criterion       16.72680
Log likelihood       -238.9978    F-statistic             2.199792
Durbin-Watson stat    2.230432    Prob(F-statistic)       0.080103

The simple variant of the Breusch-Pagan test is based on computing N = 30
times the $R^2$ of this regression and produces $N R^2 = 10.94$ (p = 0.09). Note
that the p-value for the F-test of this auxiliary regression is 0.08. At the
95% confidence level, this does not allow us to reject the null hypothesis.
Note, however, that the non-rejection may be due to a lack of power due to
the small number of observations and the general nature of the alternative
hypothesis. (Note the significance of coas in the auxiliary regression.)
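The Breusch-Pagan statistic and its p-value in d. can be verified as follows. The sketch assumes 6 degrees of freedom, the number of non-constant regressors listed in the auxiliary regression output, and uses the closed-form upper-tail probability of a Chi-squared distribution, which holds for an even number of degrees of freedom only. The helper name `chi2_sf_even_df` is hypothetical.

```python
import math


def chi2_sf_even_df(x, df):
    """P(X > x) for X ~ Chi-squared(df); closed form valid for even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j  # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


n_obs = 30
r2_aux = 0.364619  # R-squared of the auxiliary regression
df = 6             # assumption: non-constant regressors in that regression

lm_stat = n_obs * r2_aux
p_value = chi2_sf_even_df(lm_stat, df)
print(round(lm_stat, 2), round(p_value, 2))
```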
e. The White test is based on including all squares and cross-products of the
regressors in the auxiliary regression of d. However, this leads to more
than 30 regressors in the model and generates perfect multicollinearity.
By adding additional regressors one by one, you will see the $R^2$ converging
to its maximum of 1. As a result, the maximum value for the White test
statistic based on $N R^2$ is 30. Critical values for any relevant Chi-squared
distribution would be much larger than 30, indicating that a rejection
will never be found. The results of this exercise indicate that the use
of the White test is inappropriate in this case, as the sample size is too
small (given the number of regressors). As an aside, Eviews provides the
option of computing a version of the White test without cross-terms. This
produces an $R^2$ of 0.90 and a test statistic of 27.02. With 11 degrees of
freedom this results in a rejection at the 95% confidence level.
f. The auxiliary regression produces the following results.
Dependent Variable: LOGRES2
Method: Least Squares
Date: 12/02/05
Time: 16:03
Sample: 1 30
Included observations: 30
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              4.436858     0.527094      8.417587   0.0000
COAS           1.571601     0.614876      2.555965   0.0165
MEDI          -5.29E-05     2.29E-05     -2.306350   0.0290

R-squared             0.273084    Mean dependent var      5.035834
Adjusted R-squared    0.219238    S.D. dependent var      1.721275
S.E. of regression    1.520931    Akaike info criterion   3.771161
Sum squared resid     62.45721    Schwarz criterion       3.911281
Log likelihood       -53.56742    F-statistic             5.071608
Durbin-Watson stat    2.853740    Prob(F-statistic)       0.013491

The F-statistic of this regression is 5.07 with a p-value of 0.013, indicating a
rejection of the null hypothesis at the 95% confidence level.
g. We compute $\hat{h}_i^2 = \exp(\text{logres2f}_i - 4.4368)$, where logres2f$_i$ denotes the
fitted value from f. Subsequently, we divide all variables by $\hat{h}_i$ (including
the intercept) and re-estimate the model of a. The results are as follows.
Dependent Variable: AIRQ/H
Method: Least Squares
Date: 12/02/05
Time: 16:26
Sample: 1 30
Included observations: 30
Variable    Coefficient   Std. Error   t-Statistic   Prob.
ONE/H          115.7941     11.71455      9.884637   0.0000
VALA/H         0.000115     0.000868      0.132787   0.8955
RAIN/H         0.165046     0.300249      0.549698   0.5876
COAS/H        -32.64689     7.743770     -4.215891   0.0003
DENS/H        -0.000503     0.001292     -0.389019   0.7007
MEDI/H         0.000700     0.000357      1.958386   0.0619

R-squared             0.965137    Mean dependent var      93.67218
Adjusted R-squared    0.957873    S.D. dependent var      72.91367
S.E. of regression    14.96537    Akaike info criterion   8.426211
Sum squared resid     5375.096    Schwarz criterion       8.706451
Log likelihood       -120.3932    Durbin-Watson stat      1.796558

While the standard errors reported above are smaller than those reported
under a., note that the latter standard errors are likely to be incorrect
(given the test results of c.). If we now test whether average income
affects air quality, we find a t-statistic of 1.96, which is on the margin
of being significant at the 95% confidence level. The F-test for the null
hypothesis that all five slope coefficients are zero produces a value of 7.262
(based on additional calculations) and a corresponding p-value of 0.0003,
indicating a strong rejection. (The Chi-squared version of this Wald test
equals 36.31.) Note that these results are quite different from those based
on OLS (part b.).
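The weighting step in g. can be sketched as follows. The slope estimates are copied from the auxiliary regression in f.; subtracting the estimated intercept from the fitted value leaves only the coas and medi terms in the exponent. The function names and the single observation used here are hypothetical and serve only to show the transformation, not the actual data.

```python
import math

# Slope estimates copied from the auxiliary regression in part f.
B_COAS = 1.571601
B_MEDI = -5.29e-05


def h_hat(coas, medi):
    """h_i: square root of exp(fitted log e_i^2 minus the estimated intercept)."""
    return math.sqrt(math.exp(B_COAS * coas + B_MEDI * medi))


def weight_row(row, h):
    """Divide every variable, including the constant 'one', by h."""
    return {name: value / h for name, value in row.items()}


# Hypothetical observation (values invented for illustration only).
row = {"one": 1.0, "airq": 100.0, "vala": 2000.0, "rain": 30.0,
       "coas": 1.0, "dens": 1500.0, "medi": 9000.0}

h = h_hat(row["coas"], row["medi"])
weighted = weight_row(row, h)
print(round(h, 4), round(weighted["one"], 4))
```

Running OLS on the weighted variables, with the weighted constant `one` replacing the intercept, corresponds to the regression of AIRQ/H on ONE/H, VALA/H and so on reported above.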
h. The $R^2$ is inflated for two reasons. First, it expresses the amount
of variation in airq/h that is explained by the model, not the variation
in air quality itself. Because observations with large values for $h_i$ are less
accurately described by the model, this increases the value of the reported
$R^2$. Second, because the intercept term is suppressed in estimation, Eviews
adjusts the definition of the $R^2$ to its uncentred version (see eq.
(2.43)).

Exercise 4.2
We extend the specification of Table 4.9 by including $cons_{t-1}$, lagged
consumption. The results are as follows.

Dependent Variable: CONS
Method: Least Squares
Date: 01/02/06
Time: 14:36
Sample(adjusted): 2 30
Included observations: 29 after adjusting endpoints
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              0.029105     0.270071      0.107767   0.9151
PRICE         -0.707504     0.810356     -0.873078   0.3913
INCOME         0.003835     0.001238      3.098268   0.0049
TEMP           0.003323     0.001057      3.145391   0.0044
CONS(-1)       0.098788     0.298172      0.331312   0.7433

R-squared             0.764517    Mean dependent var      0.358517
Adjusted R-squared    0.725270    S.D. dependent var      0.066760
S.E. of regression    0.034992    Akaike info criterion  -3.711804
Sum squared resid     0.029387    Schwarz criterion      -3.476063
Log likelihood        58.82116    F-statistic             19.47952
Durbin-Watson stat    1.176422    Prob(F-statistic)       0.000000

The additional variable is statistically insignificant. Note, however, that
because of the lagged dependent variable, the Durbin-Watson test statistic is no
longer appropriate (even though Eviews reports it). One can test for first-order
serial correlation using one of the asymptotic tests discussed in Subsection 4.7.1.
This requires an auxiliary regression of the OLS residual upon its lag and all
explanatory variables from the model. The following results are obtained.
Dependent Variable: RES
Method: Least Squares
Date: 01/02/06
Time: 14:45
Sample(adjusted): 3 30
Included observations: 28 after adjusting endpoints
Variable    Coefficient   Std. Error   t-Statistic   Prob.
C              0.079185     0.260370      0.304124   0.7639
RES(-1)        0.506950     0.248772      2.037808   0.0538
INCOME         0.000581     0.001224      0.474802   0.6396
PRICE         -0.207927     0.784520     -0.265038   0.7934
TEMP           0.000906     0.001096      0.826613   0.4173
CONS(-1)      -0.327099     0.324038     -1.009448   0.3237

R-squared             0.159628    Mean dependent var     -0.000613
Adjusted R-squared   -0.031365    S.D. dependent var      0.032819
S.E. of regression    0.033330    Akaike info criterion  -3.777333
Sum squared resid     0.024439    Schwarz criterion      -3.491860
Log likelihood        58.88266    F-statistic             0.835779
Durbin-Watson stat    1.697923    Prob(F-statistic)       0.538434

The implied estimate for the first-order serial correlation coefficient is
0.51. The t-test on its significance rejects marginally at the 95% confidence
level. The test based on $(T-1)R^2$ produces a statistic of $28 \times 0.1596 = 4.47$,
which implies a rejection (again at 95% confidence). Apparently, the inclusion
of lagged consumption is insufficient to solve the autocorrelation problem.
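Both serial correlation tests above can be recomputed from the auxiliary regression output. The sketch below assumes the $(T-1)R^2$ statistic is compared with a Chi-squared distribution with one degree of freedom, for which the upper-tail probability can be written with the complementary error function. The helper name `chi2_sf_1df` is hypothetical.

```python
import math


def chi2_sf_1df(x):
    """P(X > x) for X ~ Chi-squared(1), i.e. P(|Z| > sqrt(x)) for standard normal Z."""
    return math.erfc(math.sqrt(x / 2.0))


# Figures copied from the auxiliary regression of RES on RES(-1) and regressors.
rho_hat, se_rho = 0.506950, 0.248772
r2_aux = 0.159628
n_aux = 28  # T - 1 observations used in the auxiliary regression

t_rho = rho_hat / se_rho
lm_stat = n_aux * r2_aux
p_lm = chi2_sf_1df(lm_stat)

print(round(t_rho, 2), round(lm_stat, 2), p_lm < 0.05)
```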
Exercise 4.3
a. The inconclusive region of the Durbin-Watson test is an interval of outcomes
for the dw test statistic for which the test is inconclusive (see page 103).
Because exact critical values are unknown, we work with upper and lower
bounds, and outcomes may be such that we can neither reject nor fail to
reject the null hypothesis. The inconclusive region is
typically small if the sample size is large.
b. In general, the finding of autocorrelation means that the unobservables
contained in the error term are related from one period to the next. If the
functional form of the model is incorrect, the unobservables pick up this
misspecification and may exhibit patterns of serial correlation. See Figure
4.4 for an illustration.
c. The answer here is the general version of the answer under b. For example,
excluding a variable that describes seasonal patterns in monthly or
quarterly data may lead to findings of autocorrelation.
d. In case of first-order autocorrelation, including the lagged error term as
a (hypothetically observable) regressor would eliminate the autocorrelation
problem. Because the lagged error term is a function of the lagged
dependent variable and the lagged regressors, including the latter set of
variables also eliminates the serial correlation problem (see Subsection
4.10.1). Doing so may not be preferable. First, we may be interested in
a static model rather than a dynamic one. That is, we may be interested
in $E\{y_t \mid x_t\}$, where the lagged dependent variable (and lagged regressors)
are not included in the conditioning set. Second, if we are interested in
a dynamic model, including lagged regressors and the lagged dependent
variable without any restrictions is not parsimonious. If first-order
autocorrelation is appropriate, imposing coefficient restrictions as in
(4.67) is preferred. (However, not imposing these restrictions generates
more flexibility and robustness, and this may be preferred if the number
of observations is not too small.)
e. An overlapping samples problem arises if the horizon over which the
dependent variable in a model is defined exceeds the interval at which
it is observed. An example is annual stock market returns observed every
month. In this case, returns are observed over (partly) overlapping periods
of twelve months. The problem is that this generates a moving
average autocorrelation pattern. Unexpected events in one month show
up in different twelve-month periods and therefore lead to serial correlation.
Routinely computed standard errors and tests will be incorrect and
misleading (see Subsection 4.11.3 for an illustration).
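The moving average pattern described in e. can be made concrete under a simplifying assumption that is not part of the answer itself: monthly returns are serially uncorrelated with constant variance, and the annual return is the sum of twelve monthly returns. The annual return observed each month is then a moving average with twelve equal weights, and its autocorrelations follow directly from those weights. The function name is hypothetical.

```python
def ma_autocorrelation(weights, lag):
    """Autocorrelation at `lag` of sum_j weights[j] * e_{t-j}, e white noise."""
    lag = abs(lag)
    q = len(weights)
    if lag >= q:
        return 0.0  # no shared shocks once the windows stop overlapping
    gamma_k = sum(weights[j] * weights[j + lag] for j in range(q - lag))
    gamma_0 = sum(w * w for w in weights)
    return gamma_k / gamma_0


horizon = 12                # twelve-month returns
weights = [1.0] * horizon   # each monthly shock enters with weight one

acf = [ma_autocorrelation(weights, k) for k in range(14)]
for k, rho in enumerate(acf):
    print(k, round(rho, 4))
```

Under these assumptions the autocorrelation declines linearly with the lag and is exactly zero from lag twelve onwards, which is why the order of autocorrelation is limited in this setting.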
f. See Subsection 5.2.1.
g. The use of Newey-West standard errors is appropriate if autocorrelation is
suspected, for example in case of an overlapping samples problem. However, these standard errors are only asymptotically appropriate and require
that the order of autocorrelation is limited. Reasonable sample sizes are
thus required.
h. This extends the procedure discussed in Subsection 4.6.1. First, estimate the
model by OLS. Second, estimate the first and second order autocorrelation
coefficients by regressing the OLS residual upon its first two lags. Denoting
the true coefficients by $\rho_1$ and $\rho_2$, the transformed model is given by
$$y_t - \rho_1 y_{t-1} - \rho_2 y_{t-2} = (x_t - \rho_1 x_{t-1} - \rho_2 x_{t-2})'\beta + v_t.$$
Finally, replacing $\rho_1$ and $\rho_2$ by their estimates, and estimating the
resulting transformed model by OLS, produces the feasible GLS estimator.
This estimator does not exploit information contained in the first two
observations.
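The transformation step in h. can be sketched in a few lines. The function names and the small data set are hypothetical, and the values used for the two autocorrelation coefficients are placeholders for the estimates obtained from the residual regression. The final OLS step on the transformed data is not shown.

```python
def quasi_difference(series, rho1, rho2):
    """Return z_t - rho1*z_{t-1} - rho2*z_{t-2} for t = 3, ..., T (first two obs lost)."""
    return [series[t] - rho1 * series[t - 1] - rho2 * series[t - 2]
            for t in range(2, len(series))]


def transform_model(y, x_columns, rho1, rho2):
    """Apply the same AR(2) transformation to y and to every regressor column."""
    y_star = quasi_difference(y, rho1, rho2)
    x_star = {name: quasi_difference(col, rho1, rho2)
              for name, col in x_columns.items()}
    return y_star, x_star


# Hypothetical data and placeholder estimates, for illustration only.
y = [1.0, 2.0, 3.0, 4.0, 5.0]
x_columns = {"const": [1.0] * 5, "x1": [0.5, 1.5, 1.0, 2.0, 2.5]}
rho1_hat, rho2_hat = 0.5, 0.2

y_star, x_star = transform_model(y, x_columns, rho1_hat, rho2_hat)
print(len(y_star), x_star["const"])
```

Note that the constant is transformed as well, so the intercept column in the transformed model equals one minus the sum of the two coefficients, and the sample shrinks by two observations.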
© 2006, John Wiley and Sons