
Copyright 2011. All rights reserved.

Page 1

Week 8

AUTOCORRELATION
AHEAD IN THIS LECTURE
Readings: Hill et al.
What is Autocorrelation? Sections 9.1, 9.2
Properties of the OLS Estimator Section 9.3
GLS Estimation Section 9.3
Testing for Autocorrelation Section 9.4
Autoregressive Models Section 9.5
Finite Distributed Lag Models Section 9.6
WHAT IS AUTOCORRELATION?
In a time-series context, the multiple linear regression model is:

MR1: y_t = β1 + β2 x_t2 + ... + βK x_tK + e_t,  t = 1, ..., T
MR2: E(e_t) = 0
MR3: var(e_t) = σ²
MR4: cov(e_t, e_s) = 0 for t ≠ s
MR5: the x_tk are not collinear, and E(e_t x_tk) = 0 for all t and k
MR6: the values of e_t are normally distributed (optional)

If MR4 is violated then the errors are said to be autocorrelated.
WHAT IS AUTOCORRELATION? cont.
In this lecture we allow for some correlation between the errors.

In particular, we assume the errors follow what is referred to as
an autoregressive process of order one (denoted as AR(1) process):

e_t = ρ e_{t-1} + v_t,   -1 < ρ < 1

where the v_t are independent random error terms with mean zero
and constant variance σ_v².

To allow for AR(1) errors, assumptions MR3 and MR4 are modified:

MR3′: var(e_t) = σ_e² = σ_v² / (1 - ρ²)
MR4′: cov(e_t, e_{t-s}) = ρ^s σ_e² ≠ 0 for s > 0
Example sugar cane

[Figure: scatter plot of LOG(A) against LOG(P) for the sugar cane data.]
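The implied error variance σ_v²/(1 - ρ²) can be checked by simulation. The following is a minimal sketch; the parameter values (ρ = 0.5, σ_v = 1) are made up for illustration and are not estimated from the lecture data.

```python
import random

# Simulate an AR(1) error process e_t = rho*e_{t-1} + v_t and check that
# its sample variance settles at sigma_v^2 / (1 - rho^2).
random.seed(42)
rho, sigma_v, T = 0.5, 1.0, 100_000

e = [0.0]
for _ in range(T):
    e.append(rho * e[-1] + random.gauss(0.0, sigma_v))
e = e[1:]                      # drop the start-up value

mean_e = sum(e) / T
var_e = sum((x - mean_e) ** 2 for x in e) / T
theoretical = sigma_v ** 2 / (1 - rho ** 2)   # = 4/3 here
print(round(var_e, 2), round(theoretical, 2))
```

With ρ closer to 1 the same simulation shows the error variance blowing up, which is why the AR(1) process requires -1 < ρ < 1.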
PROPERTIES OF THE OLS ESTIMATOR
If the errors are autocorrelated then

OLS is unbiased and consistent

the variances of the OLS estimators are no longer given by the
standard formulas.
So, the confidence intervals and hypothesis tests based on
these formulas may be misleading!
We can compute correct heteroskedasticity- and
autocorrelation-consistent (HAC) standard errors using an
estimator suggested by Newey and West
The Newey-West standard errors are analogous to the White
standard errors that are used when the errors are
heteroskedastic.

OLS is inefficient (i.e., it is no longer the BLUE).
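A minimal sketch of the Newey-West (HAC) idea follows. The full estimator corrects the variance of every OLS coefficient; here we take the simplest possible case, y_t = μ + e_t, where OLS on a constant is just the sample mean. Bartlett weights shrink the higher-order autocovariances, with a lag truncation analogous to the "lag truncation=3" in the EViews output below. All data here are simulated for illustration.

```python
import random

# HAC (Newey-West style) variance of the sample mean with Bartlett weights.
def hac_variance_of_mean(y, max_lag):
    T = len(y)
    ybar = sum(y) / T
    d = [v - ybar for v in y]
    gamma0 = sum(x * x for x in d) / T           # lag-0 autocovariance
    s = gamma0
    for j in range(1, max_lag + 1):
        gj = sum(d[t] * d[t - j] for t in range(j, T)) / T
        s += 2 * (1 - j / (max_lag + 1)) * gj    # Bartlett weight on lag j
    return s / T                                 # HAC variance of the mean

# Simulated data with positively autocorrelated (AR(1)) errors.
random.seed(1)
rho, T = 0.5, 5000
e = [0.0]
for _ in range(T):
    e.append(rho * e[-1] + random.gauss(0.0, 1.0))
y = [2.0 + v for v in e[1:]]

ybar = sum(y) / T
naive = sum((v - ybar) ** 2 for v in y) / T / T  # ignores autocorrelation
hac = hac_variance_of_mean(y, max_lag=4)
print(hac > naive)
```

With positive autocorrelation the naive variance understates the truth, so the HAC variance comes out larger: exactly why the Newey-West standard errors in the output below differ from the ordinary OLS ones.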

Example sugar cane cont.
Dependent Variable: LOG(A)
Method: Least Squares
Date: 04/27/08 Time: 12:12
Sample: 1 34
Included observations: 34
Newey-West HAC Standard Errors & Covariance (lag truncation=3)


Variable Coefficient Std. Error t-Statistic Prob.


C 3.893256 0.062444 62.34761 0.0000
LOG(P) 0.776119 0.378207 2.052102 0.0484


R-squared 0.196466 Mean dependent var 3.980680
Adjusted R-squared 0.171355 S.D. dependent var 0.338123
S.E. of regression 0.307793 Akaike info criterion 0.538245
Sum squared resid 3.031571 Schwarz criterion 0.628031
Log likelihood -7.150159 F-statistic 7.824072
Durbin-Watson stat 1.168987 Prob(F-statistic) 0.008653



GENERALISED LEAST SQUARES ESTIMATION
If the errors follow an AR(1) process and if ρ is known then we
can obtain unbiased, consistent and efficient estimates by applying
OLS to a properly transformed model:

y*_t = β1 x*_t1 + β2 x*_t2 + ... + βK x*_tK + v_t

where

y*_t = y_t - ρ y_{t-1}

and

x*_tk = 1 - ρ                  for k = 1
x*_tk = x_tk - ρ x_{t-1,k}     for k > 1

Note: Only T - 1 observations are used for estimation (one
observation is lost through lagging).

This is known as the Cochrane-Orcutt transformation.

If ρ is unknown then we can use the first-order sample correlation
coefficient of the OLS residuals, corr(ê_t, ê_{t-1}), as an estimator of this ρ.
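The transformation is easy to code. Below is a sketch for a model with one regressor plus an intercept; ρ = 0.395 echoes the value used in the EViews example that follows, while the y and x series are made up.

```python
# Cochrane-Orcutt transformation for y_t = b1 + b2*x_t + e_t with AR(1) errors.
def cochrane_orcutt_transform(y, x, rho):
    """Return (y*, x1*, x2*) using only the T-1 usable observations."""
    y_star = [y[t] - rho * y[t - 1] for t in range(1, len(y))]
    x1_star = [1 - rho] * (len(y) - 1)                    # transformed intercept
    x2_star = [x[t] - rho * x[t - 1] for t in range(1, len(x))]
    return y_star, x1_star, x2_star

y = [3.9, 4.1, 4.4, 4.0, 4.6]    # illustrative data only
x = [0.1, 0.2, 0.4, 0.3, 0.5]
ys, x1s, x2s = cochrane_orcutt_transform(y, x, rho=0.395)
print(len(ys))  # one observation lost through lagging
```

Running OLS on (ys, x1s, x2s) is exactly what the "LOG(A)-0.395*LOG(A(-1))" regression in the next output does.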
Example sugar cane cont.
Dependent Variable: LOG(A)-0.395*LOG(A(-1))
Method: Least Squares
Date: 04/29/08 Time: 06:53
Sample (adjusted): 2 34
Included observations: 33 after adjustments


Variable Coefficient Std. Error t-Statistic Prob.


1-0.395 3.899243 0.087209 44.71165 0.0000
LOG(P)-0.395*LOG(P(-1)) 0.876123 0.255584 3.427925 0.0017


R-squared 0.274865 Mean dependent var 2.427009
Adjusted R-squared 0.251474 S.D. dependent var 0.324645
S.E. of regression 0.280875 Akaike info criterion 0.356874
Sum squared resid 2.445605 Schwarz criterion 0.447572
Log likelihood -3.888426 Durbin-Watson stat 1.773865



Nonlinear Least Squares Estimation
It is also possible to obtain unbiased and efficient estimates by
estimating the model:

y_t = β1 + β2 x_t2 + ... + βK x_tK + ρ(y_{t-1} - β1 - β2 x_{t-1,2} - ... - βK x_{t-1,K}) + v_t

This model is nonlinear in the parameters, which makes it difficult
to find the values of the parameters that minimise the sum of squared
residuals.

EViews finds the so-called nonlinear least squares (NLS) estimates
numerically (by systematically evaluating the sum-of-squares
function at different values of the parameters until the least squares
estimates are found).

NLS estimation is equivalent to iterative GLS estimation using the
Cochrane-Orcutt transformation.
Example sugar cane cont.
Dependent Variable: LOG(A)
Method: Least Squares
Date: 04/27/08 Time: 12:21
Sample (adjusted): 2 34
Included observations: 33 after adjustments
Convergence achieved after 1 iteration
LOG(A) = C(1)*(1-C(3)) + C(2)*LOG(P) + C(3)*LOG(A(-1)) - C(3)*C(2)
*LOG(P(-1))


Coefficient Std. Error t-Statistic Prob.


C(1) 3.898771 0.092166 42.30171 0.0000
C(3) 0.422139 0.166047 2.542282 0.0164
C(2) 0.888371 0.259298 3.426056 0.0018


R-squared 0.277777 Mean dependent var 3.999309
Adjusted R-squared 0.229629 S.D. dependent var 0.325164
S.E. of regression 0.285399 Akaike info criterion 0.416650
Sum squared resid 2.443575 Schwarz criterion 0.552696
Log likelihood -3.874725 Durbin-Watson stat 1.820559



TESTING FOR AUTOCORRELATION
Methods for detecting the presence of autocorrelation:

Residual Plots

Residual correlograms

Lagrange Multiplier test

Durbin-Watson test


Residual Plots
Positive autocorrelation is likely to be present if residual plots
reveal runs of positive residuals followed by runs of negative
residuals.

Negative autocorrelation is likely to be present if positive
residuals tend to be followed by negative residuals, and negative
residuals tend to be followed by positive residuals.
Example sugar cane cont.
Dependent Variable: LOG(A)
Method: Least Squares
Date: 04/26/08 Time: 04:56
Sample: 1 34
Included observations: 34


Variable Coefficient Std. Error t-Statistic Prob.


C 3.893256 0.061345 63.46486 0.0000
LOG(P) 0.776119 0.277467 2.797154 0.0087


R-squared 0.196466 Mean dependent var 3.980680
Adjusted R-squared 0.171355 S.D. dependent var 0.338123
S.E. of regression 0.307793 Akaike info criterion 0.538245
Sum squared resid 3.031571 Schwarz criterion 0.628031
Log likelihood -7.150159 F-statistic 7.824072
Durbin-Watson stat 1.168987 Prob(F-statistic) 0.008653



Example sugar cane cont.
[Figure: time plot of the LOG(A) regression residuals over the sample.]
Residual Correlograms
The correlation between errors that are k periods apart is

ρ_k = corr(e_t, e_{t-k}) = cov(e_t, e_{t-k}) / √(var(e_t) var(e_{t-k})) = cov(e_t, e_{t-k}) / var(e_t)

It is unknown, but it can be estimated using the OLS residuals as

r_k = ( Σ_{t=k+1}^{T} ê_t ê_{t-k} ) / ( Σ_{t=1}^{T} ê_t² )

The sequence r_1, r_2, ... is known as the sample autocorrelation
function or correlogram.

If the null hypothesis H0: corr(e_t, e_{t-k}) = 0 is true then

√T · r_k ~ N(0, 1), approximately
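The sample autocorrelations are straightforward to compute directly from the residuals. A minimal sketch, using a made-up residual series rather than the sugar cane residuals:

```python
# Sample autocorrelation function (correlogram) of a residual series.
def correlogram(resid, max_lag):
    T = len(resid)
    denom = sum(e * e for e in resid)        # sum of squared residuals
    return [sum(resid[t] * resid[t - k] for t in range(k, T)) / denom
            for k in range(1, max_lag + 1)]

e_hat = [1.0, -1.0, 1.0, -1.0]               # perfectly alternating residuals
r = correlogram(e_hat, max_lag=2)
print(r)  # [-0.75, 0.5]
```

Comparing √T·r_k against the standard normal critical value ±1.96 gives the approximate 5% significance test stated above.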
Lagrange Multiplier Test
To test H0: ρ = 0 against H1: ρ ≠ 0
we can use the LM test statistic (distributed as Chi-square):

LM = T × R² ~ χ²(1) (approximately)

where R² is the coefficient of determination in the regression of the
residuals ê_t on 1, x_t2, ..., x_tK and ê_{t-1}.

This test is also known as a Breusch-Godfrey test.
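The mechanics of the test can be sketched in pure Python: run the auxiliary regression of the residuals on an intercept, the original regressor, and the lagged residual (with the pre-sample lagged residual set to zero, as in the EViews output below), then compute T·R². The residual and regressor series here are made up; in practice a package runs this test for you.

```python
# Breusch-Godfrey LM test sketch: LM = T * R^2 from the auxiliary regression
# of e_hat_t on (1, x_t, e_hat_{t-1}); compare with chi-square(1) 5% value 3.841.

def solve(a, b):
    """Solve the small linear system a @ beta = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [m[r][j] - f * m[col][j] for j in range(n + 1)]
    return [m[i][n] / m[i][i] for i in range(n)]

def lm_statistic(e_hat, x):
    T = len(e_hat)
    lagged = [0.0] + e_hat[:-1]              # e_hat_{t-1}, presample value 0
    X = [[1.0, x[t], lagged[t]] for t in range(T)]
    xtx = [[sum(X[t][i] * X[t][j] for t in range(T)) for j in range(3)]
           for i in range(3)]
    xty = [sum(X[t][i] * e_hat[t] for t in range(T)) for i in range(3)]
    beta = solve(xtx, xty)                   # OLS via normal equations
    fitted = [sum(b * v for b, v in zip(beta, X[t])) for t in range(T)]
    ebar = sum(e_hat) / T
    r2 = (sum((f - ebar) ** 2 for f in fitted)
          / sum((e - ebar) ** 2 for e in e_hat))
    return T * r2

# Smooth, strongly autocorrelated "residuals" should give a large LM statistic.
e_hat = [0.9, 0.7, 0.5, 0.2, -0.2, -0.5, -0.7, -0.9,
         -0.7, -0.5, -0.2, 0.2, 0.5, 0.7, 0.9, 0.7]
x = [0.1, 0.2, 0.1, 0.3, 0.2, 0.1, 0.3, 0.2,
     0.1, 0.2, 0.3, 0.1, 0.2, 0.3, 0.1, 0.2]
lm = lm_statistic(e_hat, x)
print(lm > 3.841)
```

Since the residuals trend smoothly, the LM statistic lands well above the χ²(1) critical value, so H0: ρ = 0 would be rejected, just as in the sugar cane output that follows.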




Example sugar cane cont.
Breusch-Godfrey Serial Correlation LM Test:


F-statistic 5.949152 Prob. F(1,31) 0.020646
Obs*R-squared 5.474312 Prob. Chi-Square(1) 0.019298



Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 04/27/08 Time: 12:04
Sample: 1 34
Included observations: 34
Presample missing value lagged residuals set to zero.


Variable Coefficient Std. Error t-Statistic Prob.


C -0.008116 0.057186 -0.141927 0.8881
LOG(P) 0.091601 0.260934 0.351052 0.7279
RESID(-1) 0.407821 0.167202 2.439088 0.0206


R-squared 0.161009 Mean dependent var 2.45E-17
Adjusted R-squared 0.106881 S.D. dependent var 0.303094



Durbin-Watson Test
If the explanatory variables do not include any lagged values of the
endogenous variable then we can test
H0: ρ = 0 vs. H1: ρ ≠ 0 using the d-test statistic

d = ( Σ_{t=2}^{T} (ê_t - ê_{t-1})² ) / ( Σ_{t=1}^{T} ê_t² ) ≈ 2(1 - r_1)

Note: Critical values of this test depend on the data!
Upper and lower bounds on the critical values (dL and dU) can be
obtained from Durbin-Watson tables.

Decision rule (d lies between 0 and 4; values near 2 indicate no autocorrelation):
0 ≤ d < dL: reject H0 (positive autocorrelation)
dL ≤ d < dU: inconclusive
dU ≤ d ≤ 4 - dU: do not reject H0 (no autocorrelation)
4 - dU < d ≤ 4 - dL: inconclusive
4 - dL < d ≤ 4: reject H0 (negative autocorrelation)
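The statistic and its approximation d ≈ 2(1 - r_1) are easy to verify numerically. A sketch with a made-up residual series:

```python
# Durbin-Watson statistic computed from OLS residuals.
def durbin_watson(resid):
    num = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, len(resid)))
    den = sum(e * e for e in resid)
    return num / den

e_hat = [(-1.0) ** t for t in range(100)]    # perfectly alternating residuals
d = durbin_watson(e_hat)

# First-order sample autocorrelation, for the d ~ 2(1 - r1) approximation.
r1 = (sum(e_hat[t] * e_hat[t - 1] for t in range(1, 100))
      / sum(e * e for e in e_hat))
print(round(d, 2), round(2 * (1 - r1), 2))   # both near 4
```

A d near 4 signals strong negative autocorrelation, a d near 0 strong positive autocorrelation, matching the decision regions above.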
Example sugar cane cont.
Dependent Variable: LOG(A)
Method: Least Squares
Date: 04/26/08 Time: 04:56
Sample: 1 34
Included observations: 34


Variable Coefficient Std. Error t-Statistic Prob.


C 3.893256 0.061345 63.46486 0.0000
LOG(P) 0.776119 0.277467 2.797154 0.0087


R-squared 0.196466 Mean dependent var 3.980680
Adjusted R-squared 0.171355 S.D. dependent var 0.338123
S.E. of regression 0.307793 Akaike info criterion 0.538245
Sum squared resid 3.031571 Schwarz criterion 0.628031
Log likelihood -7.150159 F-statistic 7.824072
Durbin-Watson stat 1.168987 Prob(F-statistic) 0.008653



Example sugar cane cont.
For a test of H0: ρ = 0 against H1: ρ > 0 at the 5% level of
significance, the Durbin-Watson tables suggest that the
critical values for T = 34 are

dL = 1.393, so 4 - dL = 2.607
dU = 1.514, so 4 - dU = 2.486

Since d = 1.169 < dL, we reject the null hypothesis (at the 5%
significance level): we conclude there is evidence of autocorrelation!
So, we need to use OLS on the transformed model, as we discussed!...
AUTOREGRESSIVE MODELS
An autoregressive model is a model that expresses the current value
of a variable as a function of its own lagged values.

An autoregressive model of order p, denoted as AR(p), is given by:

y_t = δ + θ1 y_{t-1} + θ2 y_{t-2} + ... + θp y_{t-p} + v_t

where the v_t are independent random error terms with mean zero
and constant variance σ_v².

Note: The error here is well-behaved (satisfies MR1-MR5),
so this model can be estimated using OLS!
The usual hypothesis testing procedures and goodness-of-fit
statistics are valid.
We choose a value of p using the usual methods: hypothesis
tests, residual analysis, information criteria, parsimony...
Example inflation
Dependent Variable: INFLN
Method: Least Squares
Date: 04/25/08 Time: 07:58
Sample (adjusted): 1984M03 2006M05
Included observations: 267 after adjustments


Variable Coefficient Std. Error t-Statistic Prob.


C 0.209278 0.021781 9.608328 0.0000
INFLN(-1) 0.355224 0.060520 5.869540 0.0000
INFLN(-2) -0.180537 0.060341 -2.991927 0.0030


R-squared 0.120232 Mean dependent var 0.253534
Adjusted R-squared 0.113568 S.D. dependent var 0.209803
S.E. of regression 0.197531 Akaike info criterion -0.394674
Sum squared resid 10.30084 Schwarz criterion -0.354368
Log likelihood 55.68895 F-statistic 18.03964
Durbin-Watson stat 1.963006 Prob(F-statistic) 0.000000



Example inflation cont.
Dependent Variable: INFLN
Method: Least Squares
Date: 04/25/08 Time: 07:59
Sample (adjusted): 1984M04 2006M05
Included observations: 266 after adjustments


Variable Coefficient Std. Error t-Statistic Prob.


C 0.188335 0.025290 7.446877 0.0000
INFLN(-1) 0.373292 0.061481 6.071690 0.0000
INFLN(-2) -0.217919 0.064472 -3.380029 0.0008
INFLN(-3) 0.101254 0.061268 1.652641 0.0996


R-squared 0.129295 Mean dependent var 0.253389
Adjusted R-squared 0.119325 S.D. dependent var 0.210185
S.E. of regression 0.197247 Akaike info criterion -0.393799
Sum squared resid 10.19345 Schwarz criterion -0.339912
Log likelihood 56.37528 F-statistic 12.96851
Durbin-Watson stat 2.000246 Prob(F-statistic) 0.000000



Forecasting
The equation that determines the value y_{T+1} is

y_{T+1} = δ + θ1 y_T + θ2 y_{T-1} + ... + θp y_{T-p+1} + v_{T+1}

Our forecast of this value replaces the unknown parameters by their OLS estimates:

ŷ_{T+1} = δ̂ + θ̂1 y_T + θ̂2 y_{T-1} + ... + θ̂p y_{T-p+1}

The forecast value two periods beyond the end of the sample is

ŷ_{T+2} = δ̂ + θ̂1 ŷ_{T+1} + θ̂2 y_T + ... + θ̂p y_{T-p+2}

and so on.

Note:
Confidence intervals for our forecasts are difficult to compute manually,
because forecast error variances are highly nonlinear functions of
the variances of the OLS estimators.
But EViews can do it!...
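The recursion is mechanical once the estimates are in hand. The sketch below uses the rounded AR(2) coefficient estimates from the inflation output above (δ̂ = 0.2093, θ̂1 = 0.3552, θ̂2 = -0.1805) together with the last two sample values used in the worked example (y_T = 0.4468, y_{T-1} = 0.5988); later forecasts feed on earlier ones.

```python
# Recursive point forecasts from an estimated AR(2) model.
def ar2_forecasts(delta, theta1, theta2, y_T, y_Tm1, steps):
    history = [y_Tm1, y_T]
    forecasts = []
    for _ in range(steps):
        f = delta + theta1 * history[-1] + theta2 * history[-2]
        forecasts.append(f)
        history.append(f)        # the forecast becomes the next "observation"
    return forecasts

f1, f2 = ar2_forecasts(0.2093, 0.3552, -0.1805, 0.4468, 0.5988, steps=2)
print(round(f1, 4), round(f2, 4))  # close to the 0.2599 in the lecture example
```

Only the point forecasts come out of this recursion; the forecast-error variances (and hence the confidence intervals) are the part left to EViews.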
Example inflation cont.
Using our preferred model to forecast two periods beyond the end
of the sample:

ŷ_{T+1} = δ̂ + θ̂1 y_T + θ̂2 y_{T-1}
        = 0.2093 + 0.3552(0.4468) - 0.1805(0.5988)
        = 0.2599

ŷ_{T+2} = δ̂ + θ̂1 ŷ_{T+1} + θ̂2 y_T
        = 0.2093 + 0.3552(0.2599) - 0.1805(0.4468)
        = 0.2210

Accounting for coefficient uncertainty, a 95% confidence interval
for y_{T+1} is given by
Example inflation cont.

ŷ_{T+1} ± t_c · se(f) = 0.2599 ± 1.9689(0.19896)
                      = 0.2599 ± 0.39174

The forecast interval is (-0.1319, 0.6516).


FINITE DISTRIBUTED LAG MODELS
A finite distributed lag (FDL) model is a model that expresses the
current value of the dependent variable as a function of current and
lagged values of exogenous variables.

If there is only one exogenous variable, a finite distributed lag
model of order q takes the form:

y_t = α + β0 x_t + β1 x_{t-1} + β2 x_{t-2} + ... + βq x_{t-q} + e_t

where the e_t are independent random error terms with mean zero
and constant variance σ_e².

The coefficients βs (s = 0, 1, ..., q) are called distributed lag
weights or s-period delay multipliers.

The model can be estimated using OLS, and the usual hypothesis
testing procedures and goodness-of-fit statistics are valid.
We choose a value of q using the usual methods.
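Estimating an FDL model by OLS amounts to building a regressor matrix whose rows hold x_t, x_{t-1}, ..., x_{t-q}, losing the first q observations (just as the inflation output below shows 266 included observations after three lags). A sketch with a made-up x series:

```python
# Build the lagged-regressor rows for a finite distributed lag model of order q.
def fdl_design(x, q):
    # Row for period t is [x_t, x_{t-1}, ..., x_{t-q}]; starts at t = q.
    return [[x[t - s] for s in range(q + 1)] for t in range(q, len(x))]

x = [1.0, 2.0, 3.0, 4.0, 5.0]
rows = fdl_design(x, q=2)
print(rows)  # [[3.0, 2.0, 1.0], [4.0, 3.0, 2.0], [5.0, 4.0, 3.0]]
```

Prepending an intercept column and regressing y on these rows gives the distributed lag weights β0, ..., βq directly.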
Example inflation cont.
Dependent Variable: INFLN
Method: Least Squares
Date: 04/25/08 Time: 16:45
Sample (adjusted): 1984M04 2006M05
Included observations: 266 after adjustments


Variable Coefficient Std. Error t-Statistic Prob.


C 0.121873 0.048655 2.504862 0.0129
PCWAGE 0.156089 0.088502 1.763684 0.0790
PCWAGE (-1) 0.107498 0.085055 1.263861 0.2074
PCWAGE (-2) 0.049485 0.085258 0.580418 0.5621
PCWAGE (-3) 0.199014 0.087885 2.264475 0.0244


R-squared 0.048435 Mean dependent var 0.253389
Adjusted R-squared 0.033851 S.D. dependent var 0.210185
S.E. of regression 0.206597 Akaike info criterion -0.297475
Sum squared resid 11.14009 Schwarz criterion -0.230116
Log likelihood 44.56421 F-statistic 3.321216
Durbin-Watson stat 1.382599 Prob(F-statistic) 0.011241


