
Lecture 18.

Serial correlation: testing and estimation

Testing for serial correlation


In lecture 16 we used graphical methods to look for serial/autocorrelation in the random error term $u_t$. Because we cannot observe the $u_t$, we used the OLS residuals $e_t$. We looked at:

- The time series graph of $e_t$, $t = 1, \ldots, n$. If there is serial correlation, this graph shows gradual changes in the $e_t$.
- The scatterplot of $e_t$ versus $e_{t-1}$. If the AR(1) model $u_t = \rho u_{t-1} + \varepsilon_t$ holds, then we expect the scatterplot to be concentrated along a straight line through 0.

Tests for serial/autocorrelation also use the OLS residuals $e_t$.
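Before turning to the formal tests, here is a minimal Python sketch of these two graphical checks (not part of the lecture); the data are simulated with AR(1) errors, and all variable names are hypothetical:

```python
# A minimal sketch of the two graphical checks. The data are simulated:
# AR(1) errors with rho = 0.8 around a simple regression line, so the
# plots should show clear serial correlation.
import numpy as np
import statsmodels.api as sm
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
n = 100
x = rng.normal(size=n)
u = np.zeros(n)
for t in range(1, n):
    u[t] = 0.8 * u[t - 1] + rng.normal()   # u_t = rho * u_{t-1} + eps_t
y = 1.0 + 2.0 * x + u

e = sm.OLS(y, sm.add_constant(x)).fit().resid   # OLS residuals e_t

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(e)                    # gradual changes suggest autocorrelation
ax1.set(xlabel="t", ylabel="e_t", title="Residuals over time")
ax2.scatter(e[:-1], e[1:])     # points along a line through 0 suggest AR(1)
ax2.set(xlabel="e_{t-1}", ylabel="e_t", title="e_t versus e_{t-1}")
plt.show()
```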

Consider the linear regression model

$$Y_t = \beta_1 + \beta_2 X_{t2} + \cdots + \beta_K X_{tK} + u_t, \qquad t = 1, \ldots, n.$$

As with tests for heteroskedasticity, we assume a particular model for the autocorrelation. Initially we consider first-order serial correlation:

(1)  $u_t = \rho u_{t-1} + \varepsilon_t, \qquad -1 < \rho < 1,$

with $\varepsilon_t$ white noise (the $\varepsilon_t$ are independent with mean 0 and common variance). If $\rho = 0$, then $u_t = \varepsilon_t$, and in that case the random errors $u_t$ satisfy Assumption 4, i.e. there is no serial correlation. Hence a test for serial correlation is a test of $H_0: \rho = 0$.

The first step is to find an estimator for $\rho$. If we replace $u_t$ in (1) by $e_t$ and estimate $\rho$ by OLS, we obtain

$$\hat{\rho} = \frac{\sum_{t=2}^{n} e_t e_{t-1}}{\sum_{t=1}^{n} e_t^2}.$$

This is also the first-order autocorrelation coefficient of the time series $e_t$, $t = 1, \ldots, n$ (see lecture 16). The obvious thing to do is to use $\hat{\rho}$ to test whether $\rho = 0$.
Instead of $\hat{\rho}$, a related quantity is used: the Durbin-Watson statistic

$$d = \frac{\sum_{t=2}^{n} (e_t - e_{t-1})^2}{\sum_{t=1}^{n} e_t^2}.$$

It can be shown that

$$d \approx 2(1 - \hat{\rho}).$$

Hence if $\hat{\rho}$ is close to 0 (no autocorrelation), then $d$ is close to 2. If $\hat{\rho}$ is close to 1, then $d$ is close to 0, and if $\hat{\rho}$ is close to $-1$, then $d$ is close to 4.
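A minimal sketch of both quantities, assuming `e` is an array of OLS residuals (for instance the one computed in the sketch above):

```python
import numpy as np

def rho_hat(e):
    # first-order autocorrelation coefficient of the residuals:
    # sum_{t=2}^n e_t * e_{t-1} / sum_{t=1}^n e_t^2
    return np.sum(e[1:] * e[:-1]) / np.sum(e ** 2)

def durbin_watson(e):
    # d = sum_{t=2}^n (e_t - e_{t-1})^2 / sum_{t=1}^n e_t^2
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

# d should be close to 2 * (1 - rho_hat(e)): near 2 for rho = 0,
# near 0 for rho close to 1, near 4 for rho close to -1.
```

(statsmodels also ships a ready-made version in statsmodels.stats.stattools.durbin_watson.)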

[Figure: the scale of the Durbin-Watson statistic $d$ from 0 to 4. Values near 0 indicate positive autocorrelation, values near 2 no autocorrelation, and values near 4 negative autocorrelation; the critical values $c_L$, $c$, $c_U$ lie between 0 and 2.]

Because negative autocorrelation is rare, the usual test is $H_0: \rho = 0$ against $H_1: \rho > 0$. We reject (see the figure) if $d$ is small, i.e. close to 0. The critical value is then some number $c$ greater than 0 but less than 2.

With a 5% significance level we want that, if $H_0$ is true, the probability of rejecting $H_0$ is

$$\Pr(d < c) = .05.$$

If the errors have a normal distribution (Assumption 5), then the distribution of $d$ can be derived. This distribution depends on the independent variables $X_2, \ldots, X_K$; compare this with the $t$- or $F$-distribution, which do not depend on them. Some programs compute $c$ exactly for the independent variables in your dataset; this is easy with current computers. If not, there is a table with bounds $c_L$ and $c_U$. These bounds correspond to extreme datasets, and the $c$ for any dataset lies between them.

Example: for 2 independent variables (do not count the constant) and 25 observations, $c_L = 1.206$ and $c_U = 1.550$. Hence if e.g. $d = 1.1$ we reject, and if $d = 1.7$ we do not reject. If $d = 1.3$ we do not know what to do (the test is inconclusive). This is a computational problem, because $c$ can be computed exactly. Consider the regression of log housing starts per head on log GNP per head and the log mortgage interest rate. The DW statistic is .913 with $n = 23$, $k = 2$, so that $c_L = 1.168$, and we reject the hypothesis of no serial correlation.

Dependent Variable: LNHOUSINGCAP
Method: Least Squares
Date: 11/13/01   Time: 00:06
Sample: 1963 1985
Included observations: 23

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C              2.528899     1.180472      2.142278     0.0447
LNGNPCAP      -0.066000     0.540505     -0.122109     0.9040
LNINTRATE     -0.211284     0.202894     -1.041351     0.3101

R-squared            0.094147    Mean dependent var      1.991961
Adjusted R-squared   0.003562    S.D. dependent var      0.226095
S.E. of regression   0.225692    Akaike info criterion  -0.018186
Sum squared resid    1.018735    Schwarz criterion       0.129922
Log likelihood       3.209133    F-statistic             1.039325
Durbin-Watson stat   0.913015    Prob(F-statistic)       0.372027

An alternative to the DW test is the Lagrange Multiplier (LM) test, which also uses the OLS residuals $e_t$. The first step of the test is a linear regression with dependent variable $e_t$ and independent variables $X_{t2}, \ldots, X_{tK}, e_{t-1}$. Compute the $R^2$ of this regression. The test statistic is

$$LM = (n-1) R^2.$$

Note that we use $n - 1$ observations in the regression. If $H_0: \rho = 0$ is true, then $LM$ has a chi-square distribution with 1 degree of freedom. We reject if $LM > c$, and if we want a test with a 5% significance level, we find the critical value $c$ from

$$\Pr(LM > c) = .05.$$
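A minimal sketch of this test, where `lm_test` is a hypothetical helper, `e` holds the OLS residuals, and `X` is assumed to be the regressor matrix of the original regression including the constant:

```python
import numpy as np
import statsmodels.api as sm
from scipy import stats

def lm_test(e, X):
    # auxiliary regression of e_t on X_t2,...,X_tK and e_{t-1} (n - 1 obs)
    X_aux = np.column_stack([X[1:], e[:-1]])
    aux = sm.OLS(e[1:], X_aux).fit()
    lm = (len(e) - 1) * aux.rsquared       # LM = (n - 1) * R^2
    crit = stats.chi2.ppf(0.95, df=1)      # 5% critical value, about 3.84
    return lm, crit, lm > crit             # reject H0: rho = 0 if lm > crit
```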

Application to the housing starts data:

$$LM = 22 \times .311 = 6.85,$$

and the critical value for 5% significance is 3.84. Again we reject $H_0$.

Estimation with serial correlation


Consider the linear regression

$$Y_t = \beta_1 + \beta_2 X_t + u_t, \qquad u_t = \rho u_{t-1} + \varepsilon_t \quad \text{(AR(1))}.$$

How do we estimate the regression parameters $\beta_1, \beta_2$ and $\rho$? As with heteroskedasticity, we transform the variables so that we have a random error term that satisfies Assumptions 1-4. Hence we can apply OLS to the transformed regression.

Dependent Variable: RESID01
Method: Least Squares
Date: 11/14/01   Time: 23:56
Sample(adjusted): 1964 1985
Included observations: 22 after adjusting endpoints

Variable      Coefficient   Std. Error   t-Statistic   Prob.
C              0.330654     1.284974      0.257324     0.7998
LNGNPCAP      -0.233173     0.596719     -0.390759     0.7006
LNINTRATE      0.147169     0.203398      0.723551     0.4786
RESID01LAG     0.586673     0.208829      2.809346     0.0116

R-squared            0.311452    Mean dependent var     -0.006314
Adjusted R-squared   0.196694    S.D. dependent var      0.218061
S.E. of regression   0.195443    Akaike info criterion  -0.264134
Sum squared resid    0.687561    Schwarz criterion      -0.065763
Log likelihood       6.905473    F-statistic             2.713985
Durbin-Watson stat   1.272797    Prob(F-statistic)       0.075366

Because $\varepsilon_t$ satisfies all the usual assumptions, we must make it the random error term of the transformed regression. Note that

$$\varepsilon_t = u_t - \rho u_{t-1}.$$

Now do the subtraction:

$$Y_t = \beta_1 + \beta_2 X_t + u_t$$
$$\rho Y_{t-1} = \rho \beta_1 + \rho \beta_2 X_{t-1} + \rho u_{t-1}$$

which gives

(1)  $Y_t - \rho Y_{t-1} = (1 - \rho)\beta_1 + \beta_2 (X_t - \rho X_{t-1}) + \varepsilon_t.$

Conclusion: if we transform the dependent variable to $Y_t - \rho Y_{t-1}$ and the independent variable to $X_t - \rho X_{t-1}$, we can use OLS to estimate $\beta_2$. Note that the OLS estimator of the constant does not estimate $\beta_1$, but if we divide the OLS estimator of the constant by $1 - \rho$ we get an estimator of $\beta_1$.
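A minimal sketch of this transformation for a given $\rho$ (`fit_quasi_differenced` is a hypothetical helper; arrays `y` and `x` hold the dependent and independent variable):

```python
import statsmodels.api as sm

def fit_quasi_differenced(y, x, rho):
    # OLS on (1): Y_t - rho*Y_{t-1} on a constant and X_t - rho*X_{t-1}
    y_star = y[1:] - rho * y[:-1]
    x_star = sm.add_constant(x[1:] - rho * x[:-1])
    res = sm.OLS(y_star, x_star).fit()
    b1 = res.params[0] / (1 - rho)   # constant estimates (1 - rho)*beta_1
    b2 = res.params[1]               # slope estimates beta_2 directly
    return b1, b2, res.ssr
```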

Problem with this method: we do not know $\rho$. Solution: choose a range of values for $\rho$, e.g. $-.99, -.98, \ldots, .98, .99$, and estimate (1) for each of these values. For each $\rho$ compute the residuals

$$e_t = Y_t - \rho Y_{t-1} - (1 - \rho)\hat{\beta}_1 - \hat{\beta}_2 (X_t - \rho X_{t-1})$$

and the sum of squared residuals. Choose the value of $\rho$ and the OLS estimators of $\beta_1, \beta_2$ that give the smallest sum of squared residuals. This is the Hildreth-Lu procedure (see the sketch below). Application to consumption and wages (billion 1992$) for the US, 1959-1994: test for AR(1) errors (DW and LM) and compare estimates and standard errors.
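The sketch referred to above: a hypothetical `hildreth_lu` helper that scans the grid of $\rho$ values and keeps the fit with the smallest sum of squared residuals.

```python
import numpy as np
import statsmodels.api as sm

def hildreth_lu(y, x, grid=np.arange(-0.99, 0.995, 0.01)):
    # estimate (1) for every rho in the grid and keep the smallest SSR
    best = None
    for rho in grid:
        y_star = y[1:] - rho * y[:-1]
        x_star = sm.add_constant(x[1:] - rho * x[:-1])
        res = sm.OLS(y_star, x_star).fit()
        if best is None or res.ssr < best[0]:
            best = (res.ssr, rho, res.params[0] / (1 - rho), res.params[1])
    return best[1], best[2], best[3]   # rho, beta_1 hat, beta_2 hat
```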

Dependent Variable: CONS
Method: Least Squares
Date: 11/15/01   Time: 01:10
Sample: 1959 1994
Included observations: 36

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           1614.711     59.81012      26.99729     0.0000
WAGES       0.769682     0.030647      25.11440     0.0000

R-squared            0.948852    Mean dependent var      2811.178
Adjusted R-squared   0.947347    S.D. dependent var      945.5435
S.E. of regression   216.9661    Akaike info criterion   13.65131
Sum squared resid    1600526.    Schwarz criterion       13.73929
Log likelihood      -243.7236    F-statistic             630.7330
Durbin-Watson stat   0.072411    Prob(F-statistic)       0.000000

Dependent Variable: RESID01
Method: Least Squares
Date: 11/15/01   Time: 01:12
Sample(adjusted): 1960 1994
Included observations: 35 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             48.97566     13.26937      3.690880     0.0008
WAGES        -0.026748     0.006724     -3.978251     0.0004
RESID01LAG    0.930118     0.037631     24.71658      0.0000

R-squared            0.950467    Mean dependent var      12.50131
Adjusted R-squared   0.947371    S.D. dependent var      203.1813
S.E. of regression   46.61194    Akaike info criterion   10.60341
Sum squared resid    69525.55    Schwarz criterion       10.73672
Log likelihood      -182.5596    F-statistic             307.0144
Durbin-Watson stat   1.096482    Prob(F-statistic)       0.000000

Dependent Variable: CONS
Method: Least Squares
Date: 11/15/01   Time: 01:13
Sample(adjusted): 1960 1994
Included observations: 35 after adjusting endpoints
Convergence achieved after 8 iterations

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           2566.421     932.1594      2.753200     0.0096
WAGES       0.519497     0.138746      3.744220     0.0007
AR(1)       0.943172     0.047851     19.71054      0.0000

R-squared            0.997574    Mean dependent var      2851.680
Adjusted R-squared   0.997423    S.D. dependent var      927.1223
S.E. of regression   47.06673    Akaike info criterion   10.62283
Sum squared resid    70888.86    Schwarz criterion       10.75614
Log likelihood      -182.8995    F-statistic             6580.217
Durbin-Watson stat   1.135516    Prob(F-statistic)       0.000000

Inverted AR Roots    .94
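The AR(1) specification above is estimated iteratively by EViews; a roughly equivalent sketch in Python uses statsmodels' GLSAR, which alternates OLS for the betas with an update of $\rho$ estimated from the residuals (a Cochrane-Orcutt-type procedure, closely related but not necessarily identical to what EViews does). `fit_ar1_regression`, `cons` and `wages` are assumptions here, not part of the lecture:

```python
import statsmodels.api as sm

def fit_ar1_regression(y, X):
    # GLSAR with rho=1 specifies AR(1) errors; iterative_fit alternates
    # between estimating the betas and re-estimating rho from the residuals
    model = sm.GLSAR(y, sm.add_constant(X), rho=1)
    results = model.iterative_fit(maxiter=10)
    return results.params, model.rho   # beta hats and the estimated rho

# e.g. params, rho = fit_ar1_regression(cons, wages)
```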

Alternative interpretation of AR(1) errors

The linear regression in (1) can be rewritten as

(2)  $Y_t = (1 - \rho)\beta_1 + \rho Y_{t-1} + \beta_2 X_t - \rho \beta_2 X_{t-1} + \varepsilon_t.$

This is a linear regression model with independent variables $Y_{t-1}, X_t, X_{t-1}$. In the model with only $X_t$ as independent variable, $Y_{t-1}$ and $X_{t-1}$ are omitted and relegated to the error term. Because both variables are economic time series and change gradually, the error term is autocorrelated. Compare (2) to the linear regression model

(3)  $Y_t = \beta_1 + \beta_2 Y_{t-1} + \beta_3 X_t + \beta_4 X_{t-1} + \varepsilon_t.$

Note that (3) has 4 regression coefficients and (2) has 3. (3) becomes (2) if

$$\beta_4 = -\beta_2 \beta_3.$$

If we estimate (3) we find $\hat{\beta}_4 = -.691$, $\hat{\beta}_3 = .718$, $\hat{\beta}_2 = .933$, and hence

$$-\frac{\hat{\beta}_4}{\hat{\beta}_3} = \frac{.691}{.718} = .962.$$

Note that (3) is more general than (2) and is an alternative to it.
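A minimal sketch of estimating the unrestricted model (3) by OLS and computing the implied value of $\rho$ (`fit_unrestricted` is a hypothetical helper; applied to the consumption/wages data it should reproduce the .962 above):

```python
import numpy as np
import statsmodels.api as sm

def fit_unrestricted(y, x):
    # regress Y_t on a constant, Y_{t-1}, X_t and X_{t-1}
    X3 = sm.add_constant(np.column_stack([y[:-1], x[1:], x[:-1]]))
    b = sm.OLS(y[1:], X3).fit().params   # (b1, b2, b3, b4)
    rho_implied = -b[3] / b[2]           # rho = -beta_4 / beta_3 under (2)
    return b, rho_implied                # compare rho_implied with b[1]
```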

Dependent Variable: CONS
Method: Least Squares
Date: 11/15/01   Time: 01:15
Sample(adjusted): 1960 1994
Included observations: 35 after adjusting endpoints

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C           158.1357     75.36761      2.098192     0.0441
CONSLAG     0.932914     0.049885     18.70140      0.0000
WAGES       0.717754     0.288670      2.486413     0.0185
WAGESLAG   -0.691637     0.279581     -2.473829     0.0190

R-squared            0.997622    Mean dependent var      2851.680
Adjusted R-squared   0.997391    S.D. dependent var      927.1223
S.E. of regression   47.35197    Akaike info criterion   10.66030
Sum squared resid    69508.48    Schwarz criterion       10.83806
Log likelihood      -182.5553    F-statistic             4334.325
Durbin-Watson stat   1.101519    Prob(F-statistic)       0.000000

Dependent Variable: RESID02
Method: Least Squares
Date: 11/15/01   Time: 01:16
Sample(adjusted): 1961 1994
Included observations: 34 after adjusting endpoints

Variable     Coefficient   Std. Error   t-Statistic   Prob.
C             9.395652     14.09111      0.666779     0.5100
WAGES        -0.193875     0.205809     -0.942015     0.3537
WAGESLAG      0.202677     0.216017      0.938248     0.3556
RESID02LAG    0.477410     0.166159      2.873215     0.0074

R-squared            0.217167    Mean dependent var      1.297489
Adjusted R-squared   0.138884    S.D. dependent var      45.22842
S.E. of regression   41.97033    Akaike info criterion   10.42193
Sum squared resid    52845.25    Schwarz criterion       10.60151
Log likelihood      -173.1729    F-statistic             2.774118
Durbin-Watson stat   1.845907    Prob(F-statistic)       0.058508

Prediction with AR(1)

The prediction of $Y_{t+1}$ is

$$\hat{Y}_{t+1} = (1 - \rho)\hat{\beta}_1 + \rho Y_t + \hat{\beta}_2 X_{t+1} - \rho \hat{\beta}_2 X_t = \hat{\beta}_1 + \hat{\beta}_2 X_{t+1} + \rho (Y_t - \hat{\beta}_1 - \hat{\beta}_2 X_t) = \hat{\beta}_1 + \hat{\beta}_2 X_{t+1} + \rho e_t.$$

Compare this with

$$\hat{Y}_{t+1} = \hat{\beta}_1 + \hat{\beta}_2 X_{t+1}$$

for the linear regression without serial correlation. The error in period $t + 1$ can be predicted using the residual in period $t$.
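A minimal sketch of this forecast rule (`predict_next` is a hypothetical helper taking the estimated coefficients, $\rho$, next period's $X$, and the last observed pair):

```python
def predict_next(b1, b2, rho, x_next, y_last, x_last):
    e_last = y_last - b1 - b2 * x_last      # residual e_t in the last period
    return b1 + b2 * x_next + rho * e_last  # usual prediction plus rho * e_t
```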
