
Serial Correlation in Regression Analysis

One of the assumptions of both simple and multiple regression analysis is that the error terms are independent of one another; that is, they are uncorrelated. The magnitude of the error at some observation i (εi = Yi − A − BXi) has no effect on the magnitude or sign of the error at any other observation j. This assumption is formally expressed as E(εiεj) = 0 for all i ≠ j, which means that the expected value of every pairwise product of error terms is zero: if the error terms are indeed uncorrelated, the positive products cancel the negative ones, leaving an expected value of 0. If this assumption is violated, the estimated regression model can still be of some value for prediction, but its usefulness is greatly compromised. The estimated regression parameters a, b1, b2, . . ., bk remain unbiased estimators of the corresponding true values A, B1, B2, . . ., Bk, so the model is still appropriate for establishing point estimates of A, B1, B2, etc., and can be used for predicting values of Y for any given set of X values. However, the standard errors of the estimated regression parameters (e.g., sb) are significantly underestimated, which leads to erroneously inflated t values. Because tests of hypotheses about the slope coefficients (i.e., the Bk) and the corresponding confidence intervals rely on the calculated t values as test statistics, the presence of correlated error terms means that these types of inferences cannot be made reliably.

Although there are many ways this assumption might be violated, the most common occurs with time series data in the form of serial correlation: in a time series, the error in one period influences the error in a subsequent period, whether the next period or beyond. Consider this: if there are factors (other than the independent variables) making the observation at some point in time larger than expected (i.e., a positive error), then it is reasonable to expect the effects of those same factors to linger, creating an upward (positive) bias in the error term of a subsequent period.
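The "lingering factors" idea can be illustrated with a small simulation (this sketch is not from the original text; the persistence value rho = 0.7 is a hypothetical choice): errors generated as e_t = rho·e_{t−1} + u_t carry part of the previous period's error forward, so successive errors tend to share the same sign.

```python
import random

random.seed(42)

# Hypothetical illustration: errors that "linger" follow
# e_t = rho * e_{t-1} + u_t with rho > 0.
rho = 0.7
n = 500
errors = [random.gauss(0.0, 1.0)]
for _ in range(n - 1):
    errors.append(rho * errors[-1] + random.gauss(0.0, 1.0))

def lag1_corr(e):
    """Sample correlation between e_t and e_{t-1}; positive under
    positive first-order serial correlation, near zero for
    independent errors."""
    x, y = e[:-1], e[1:]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

print(round(lag1_corr(errors), 2))  # clearly positive, close to rho
```

With rho = 0 the same calculation hovers near zero, which is the situation the independence assumption describes.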
This phenomenon, for obvious reasons, is called positive first-order serial correlation, by far the most common way in which the assumption of independence of errors is violated.¹ Of course, it is easy to imagine situations where the error term in one period does not affect the error in the next period (no first-order serial correlation) but biases the error term two periods later (second-order serial correlation). Consider a time series influenced by quarterly seasonal factors: a model that ignores the seasonal factors will have correlated error terms with a lag of four periods. We will concern ourselves, however, only with first-order serial correlation. The Durbin-Watson test is a well-known formal method of testing whether serial correlation is a serious problem undermining the model's inferential suitability (e.g., assessing the confidence in the predicted value of the dependent variable). The test statistic of the Durbin-Watson procedure is d, calculated as follows:

d = Σ_{t=2}^{n} (e_t − e_{t−1})² / Σ_{t=1}^{n} e_t²

¹ Negative first-order serial correlation occurs when a positive (negative) error is followed by a negative (positive) error; that is, the current error negatively influences the following error.

Recall that e_t represents the observed error term (i.e., the residual), (Y_t − Ŷ_t) = Y_t − a − bX_t. It can be shown that the value of d will lie between zero and four; zero corresponds to perfect positive correlation and four to perfect negative correlation. If the error terms e_t and e_{t−1} are uncorrelated, the expected value of d is 2. The further d falls below 2, the stronger the evidence for the existence of positive first-order serial correlation, and vice versa. Unfortunately, the Durbin-Watson test can be inconclusive. The critical values of d for a given level of significance, sample size and number of independent variables are tabulated as pairs of values, dL and dU (a table is provided in the Course Documents/Statistical Tables folder). If the test statistic d falls between these two values, the test is inconclusive. The formal test of positive first-order serial correlation is as follows:

Ho: ρ = 0 (no serial correlation)
H1: ρ > 0 (positive serial correlation)
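Computing d from a series of residuals is straightforward; a minimal Python sketch (the function name is mine, not from the text):

```python
def durbin_watson(e):
    """Durbin-Watson d: sum of squared successive differences of the
    residuals divided by the sum of squared residuals. Ranges from 0
    (perfect positive correlation) to 4 (perfect negative correlation);
    uncorrelated errors give d near 2."""
    num = sum((e[t] - e[t - 1]) ** 2 for t in range(1, len(e)))
    den = sum(r ** 2 for r in e)
    return num / den

# Perfectly alternating residuals sit toward the negative-correlation
# end of the scale; identical residuals give d = 0.
print(durbin_watson([1, -1, 1, -1]))  # 12/4 = 3.0
```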

If d < dL, reject Ho; if d > dU, do not reject Ho. Although negative first-order serial correlation is far less likely, the statistic d can be used to test for the existence of negative serial correlation as well. For this test the critical limits are 4 − dL and 4 − dU. The test then is:

Ho: ρ = 0 (no serial correlation)
H1: ρ < 0 (negative serial correlation)

If d < 4 − dU, do not reject Ho; if d > 4 − dL, reject Ho. These decision points can be summarized as in the following figure.
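The five decision regions can be coded as a simple classifier (a sketch; the function name and verdict labels are mine):

```python
def dw_decision(d, dL, dU):
    """Classify a Durbin-Watson statistic d (0 <= d <= 4) given the
    tabulated critical values dL and dU for the chosen alpha, n and k."""
    if d < dL:
        return "reject Ho: positive serial correlation"
    if d <= dU:
        return "inconclusive"
    if d < 4 - dU:
        return "do not reject Ho"
    if d <= 4 - dL:
        return "inconclusive"
    return "reject Ho: negative serial correlation"

# With dL = .97 and dU = 1.32 (n = 12, k = 1, alpha = .05):
print(dw_decision(2.2007, 0.97, 1.32))  # do not reject Ho
```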

[Figure: decision regions for d on the scale 0 to 4]

    d < dL                 reject Ho (ρ > 0)
    dL ≤ d ≤ dU            inconclusive (?)
    dU < d < 4 − dU        do not reject Ho (ρ = 0)
    4 − dU ≤ d ≤ 4 − dL    inconclusive (?)
    d > 4 − dL             reject Ho (ρ < 0)

Example: The model we used earlier, Y = A + BX + ε, where Y is the sales revenue and X is the advertising budget, is actually a time series; the successive pairs of Y and X values are observed monthly. Therefore, there is concern that first-order serial correlation may exist. To test for positive serial correlation we calculate the error terms, ei. Note that the Excel regression tool will generate the error terms upon request if you check the Residuals box in the regression dialog. The errors (residuals) are given below.
RESIDUAL OUTPUT

Observation   Predicted Yi   Residuals (et)      et-1   (et - et-1)^2
     1          1473.967         348.4015
     2          1388.173          11.23608    348.4015      113680.5
     3          1242.198         -41.7989      11.23608       2812.713
     4          1154.249        -431.179      -41.7989      151616.8
     5          2453.899         133.2528    -431.179       318583.1
     6          1518.662         -23.3486     133.2528       24523.97
     7           733.0114         16.43126    -23.3486        1582.434
     8          1291.79          224.1726      16.43126      43156.46
     9           908.6782        288.5593     224.1726        4145.651
    10          1639.618        -650.965      288.5593      882705.1
    11          1357.892          -0.07709   -650.965       423654.5
    12          1855.751         125.3145      -0.07709      15723.06
Notice that the denominator of the Durbin-Watson statistic is SSE = Σ_{t=1}^{n} e_t² = 900,722.1.

To calculate the numerator we shift the residuals down by one row and compute the sum of the squared differences starting with the second row. These calculations are done by appending the last two columns to the table. The sum of the last column, which gives the numerator, is 1,982,184. We can now compute the Durbin-Watson statistic: d = 1,982,184/900,722.1 = 2.200661. Formally:

Ho: ρ = 0 (no serial correlation)
H1: ρ > 0 (positive serial correlation)
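The hand calculation can be checked directly from the residual column (a sketch; tiny discrepancies arise because the residuals are printed to a few decimals):

```python
# Residuals from the Excel output above (rounded as printed).
residuals = [348.4015, 11.23608, -41.7989, -431.179, 133.2528, -23.3486,
             16.43126, 224.1726, 288.5593, -650.965, -0.07709, 125.3145]

# Numerator: sum of squared successive differences, t = 2..n.
num = sum((residuals[t] - residuals[t - 1]) ** 2
          for t in range(1, len(residuals)))

# Denominator: SSE, the sum of squared residuals, t = 1..n.
den = sum(e ** 2 for e in residuals)

d = num / den
print(round(d, 4))  # about 2.2007, matching the hand calculation
```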

For n = 12, k = 1 and α = .05, dL = .97 and dU = 1.32. Since d > dU, we do not reject the null hypothesis: there is no significant positive serial correlation. You may wish to formally test for the existence of negative serial correlation.
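As a quick sketch of that suggested exercise (using the same table values, so the limits are 4 − dL = 3.03 and 4 − dU = 2.68):

```python
d = 2.200661
dL, dU = 0.97, 1.32

# Negative-correlation test: reject Ho if d > 4 - dL; do not reject
# if d < 4 - dU; inconclusive in between.
if d > 4 - dL:
    verdict = "reject Ho: negative serial correlation"
elif d < 4 - dU:
    verdict = "do not reject Ho"
else:
    verdict = "inconclusive"

print(verdict)  # do not reject Ho, since 2.2007 < 2.68
```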
