
ECONOMIC DATA ANALYSIS:

HETEROSKEDASTICITY
 Heteroskedasticity arises when the error term of a regression
equation does not have a constant variance:
 \( \mathrm{Var}(e_i) \neq \sigma^2 \).
 That is, the variance of the error term can VARY with a change in one of
the explanatory variables.
 For example, household expenditure on food will likely be
heteroskedastic because the spread of expenditure on food will rise with
income: richer households have more room to choose how much to spend.
 It is most likely to appear in, but is not limited to,
regressions taken from CROSS-SECTIONAL DATA.
 CONSEQUENCES:
 The OLS estimator is NO LONGER BLUE:
 It is still LINEAR and UNBIASED but is no longer BEST.
 The Estimated Standard Errors are no longer correct.
 Therefore hypothesis testing may give misleading results.
MATHEMATICALLY
 The maths behind the consequences is shown below.
 Normally, the variance for the OLS estimator βhat is given by the first equation:
\[ \mathrm{Var}(\hat{\beta}_2) = \frac{\sigma^2}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \]
 However, because the variance isn’t constant, the variance is given by the second equation:
\[ \text{Let } w_i = \frac{x_i - \bar{x}}{\sum_{i=1}^{N} (x_i - \bar{x})^2} \]
\[ \mathrm{Var}(\hat{\beta}_2) = \sum_{i=1}^{N} w_i^2 \sigma_i^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2 \sigma_i^2}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} \]
 The variance term now varies for all ‘i’, and is multiplied by a non-constant factor, as shown.
 Hence, the reason why we can’t trust OLS standard errors:
 We would be assuming the first equation for the variance, which is wrong, rather than the second.
CORRECTING FOR HETEROSKED. I
 To still be able to use OLS estimation and retrieve reliable estimated standard errors, we can calculate WHITE’S HETEROSKEDASTICITY-CONSISTENT STANDARD ERRORS:
\[ \widehat{\mathrm{Var}}(\hat{\beta}_2) = \sum_{i=1}^{N} w_i^2 \hat{e}_i^2 = \frac{\sum_{i=1}^{N} (x_i - \bar{x})^2 \hat{e}_i^2}{\left[ \sum_{i=1}^{N} (x_i - \bar{x})^2 \right]^2} \]
 These are calculated by simply substituting the squared ‘ith’ residual, \( \hat{e}_i^2 \), for the unknown variance term \( \sigma_i^2 \).
 To calculate the residual term, run the regression:
 \( Y_i = \beta_1 + \beta_2 X_i + e_i \)
 Then retrieve the residuals (the distance from the data points to the FITTED line):
 Residual ‘i’ = \( \hat{e}_i = y_i - b_1 - b_2 x_i \)
 However:
 This method allows us to use hypothesis testing...
 BUT OLS is still not the BEST method of estimation.
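The recipe above (fit by OLS, keep the residuals, plug their squares into the variance formula) can be sketched in a few lines. This is a minimal illustration for a single regressor with no small-sample correction; the helper names `ols_simple` and `slope_standard_errors` and the data are invented for the example.

```python
import math


def ols_simple(x, y):
    """OLS for y = b1 + b2*x; returns (b1, b2, residuals)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    residuals = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    return b1, b2, residuals


def slope_standard_errors(x, y):
    """Return (b2, usual OLS std. error, White std. error) for the slope."""
    n = len(x)
    x_bar = sum(x) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    _, b2, residuals = ols_simple(x, y)

    # Usual formula: one estimated sigma^2 (SSE / (N - 2)) divided by sxx.
    sigma2_hat = sum(e ** 2 for e in residuals) / (n - 2)
    se_ols = math.sqrt(sigma2_hat / sxx)

    # White's formula: each squared residual replaces sigma_i^2.
    white_var = sum((xi - x_bar) ** 2 * e ** 2
                    for xi, e in zip(x, residuals)) / sxx ** 2
    return b2, se_ols, math.sqrt(white_var)


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.5, 3.0, 8.0]
b2, se_ols, se_white = slope_standard_errors(x, y)
print(b2, se_ols, se_white)
```

The slope estimate is the same in both cases; only the standard error attached to it changes.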
CORRECTING FOR HETEROSKED II
 GENERALIZED LEAST SQUARES:
 If there is Heterosked, we may impose an assumption to make it easier to deal with.
 For example, assume that in the model:
 \( Y_i = \beta_1 + \beta_2 X_i + e_i \)
 Y is expenditure on food and X is income.
 We suspect that X is the explanatory variable driving heterosked in the model.
 Therefore, we can model the variance of the error term as being dependent on X:
\[ \mathrm{Var}(e_i) = \sigma_i^2 = \sigma^2 X_i \]
 Hence, if we work backwards, we can calculate that we have to divide all the terms by the SQUARE ROOT of X:
\[ \mathrm{Var}\left( \frac{e_i}{\sqrt{X_i}} \right) = \frac{1}{X_i} \mathrm{Var}(e_i) = \frac{1}{X_i} \sigma^2 X_i = \sigma^2 \]
\[ \frac{Y_i}{\sqrt{X_i}} = \beta_1 \frac{1}{\sqrt{X_i}} + \beta_2 \frac{X_i}{\sqrt{X_i}} + \frac{e_i}{\sqrt{X_i}} \]
 We then have a constant variance.
 HOWEVER:
 NO LONGER A CONSTANT TERM: \( \beta_1 \) now multiplies \( 1/\sqrt{X_i} \) instead of 1.
\[ Y_i^* = \beta_1 X_{1i}^* + \beta_2 X_{2i}^* + e_i^* \]
 This method is called the Weighted Least Squares Estimator.
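A minimal sketch of the transformation, assuming the error variance is proportional to X and every X is positive: each term is divided by the square root of X, and the transformed model is then fitted by OLS without an intercept. The function name `wls_sqrt_x` and the data are hypothetical.

```python
import math


def wls_sqrt_x(x, y):
    """Weighted least squares under the assumption Var(e_i) = sigma^2 * x_i.

    Transformed model: y* = b1 * x1* + b2 * x2* + e*, with
    y* = y / sqrt(x), x1* = 1 / sqrt(x), x2* = x / sqrt(x) = sqrt(x).
    Requires x_i > 0 for every observation.
    """
    y_star = [yi / math.sqrt(xi) for xi, yi in zip(x, y)]
    x1_star = [1.0 / math.sqrt(xi) for xi in x]
    x2_star = [math.sqrt(xi) for xi in x]

    # Normal equations for two regressors and NO constant term.
    s11 = sum(a * a for a in x1_star)
    s22 = sum(b * b for b in x2_star)
    s12 = sum(a * b for a, b in zip(x1_star, x2_star))
    s1y = sum(a * c for a, c in zip(x1_star, y_star))
    s2y = sum(b * c for b, c in zip(x2_star, y_star))

    det = s11 * s22 - s12 ** 2
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    return b1, b2


# Hypothetical check: data generated exactly from y = 1 + 2x
# should return b1 = 1 and b2 = 2 (up to rounding).
x = [1.0, 4.0, 9.0, 16.0]
y = [3.0, 9.0, 19.0, 33.0]
print(wls_sqrt_x(x, y))
```

The coefficients keep their original meaning: b1 is still the intercept of the untransformed model and b2 its slope, even though the transformed regression has no constant.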
DIFFERENT TYPES OF HETEROSKED
 What we have been discussing up to now is
PROPORTIONAL HETEROSKEDASTICITY.
 There is another type – PARTITION
HETEROSKEDASTICITY.
 The former is where the variance of the error term scales
proportionally with one or more of the variables.
 The latter is a more discrete affair: consider the price of
tea in China and the UK. If the error variance differs between the
two areas, then we have partition heterosked.
 To fix this, simply WEIGHT BY THE RESPECTIVE
VARIANCE OF THAT AREA: divide every observation by the estimated error standard deviation of its area.
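One way to picture the partition fix is as a transformation of the data before running OLS: every term of an observation, including the constant, is divided by the error standard deviation of its area. The sketch below assumes the two area variances have already been estimated; the function name `transform_by_group` and all numbers are hypothetical.

```python
import math


def transform_by_group(y, x, group, sigma2_by_group):
    """Divide y, the constant and x by the error std. deviation of each
    observation's group. Returns a list of (y*, const*, x*) tuples that can
    be fed to an OLS routine without an intercept."""
    rows = []
    for yi, xi, g in zip(y, x, group):
        sd = math.sqrt(sigma2_by_group[g])
        rows.append((yi / sd, 1.0 / sd, xi / sd))
    return rows


# Hypothetical example: error variance 4 in the UK and 1 in China.
rows = transform_by_group(
    y=[10.0, 6.0],
    x=[2.0, 3.0],
    group=["UK", "China"],
    sigma2_by_group={"UK": 4.0, "China": 1.0},
)
print(rows)
```

Observations from the noisier area are scaled down more, so they carry less weight in the fit.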
DETECTING HETEROSKED I
 FOUR main ways to detect Heterosked in a regression:
 Residual Plots:
 Should see a ‘fanning’ of residuals in a residual plot; homoskedasticity
would show up as residuals spread evenly around zero (the fitted line)
across all values of x.
 Goldfeld – Quandt Test
 Specifically designed for Partitioned heterosked.
 Uses an F distribution and a test statistic of:
\[ F = \frac{\hat{\sigma}^2_{\text{China}} / \sigma^2_{\text{China}}}{\hat{\sigma}^2_{\text{UK}} / \sigma^2_{\text{UK}}} \sim F_{(N_{\text{China}} - K_{\text{China}},\; N_{\text{UK}} - K_{\text{UK}})} \]
 Where ‘K’ is the number of estimated coefficients (including the intercept) in the model; usually
will be the same for both sub samples.
 Will test the restriction that the true population variances (the
denominators of both ratios) are EQUAL: variance of China =
variance of UK.
 Hence, under the null hypothesis, the statistic is a ratio of the two ESTIMATED VARIANCES.
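As an illustration of the Goldfeld-Quandt calculation, the sketch below fits a separate simple regression in each sub sample, estimates each error variance as SSE / (N - K) with K = 2, and takes the ratio. The function names and the two small data sets are hypothetical; the resulting F value would still have to be compared with a critical value from F tables.

```python
def residual_variance(x, y):
    """Fit y = b1 + b2*x by OLS and return SSE / (N - 2)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    sse = sum((yi - b1 - b2 * xi) ** 2 for xi, yi in zip(x, y))
    return sse / (n - 2)


def goldfeld_quandt(x_a, y_a, x_b, y_b):
    """Return (F statistic, (numerator df, denominator df)).

    Sub sample 'a' goes in the numerator; under the null of equal
    population variances F is just the ratio of the estimated variances.
    """
    f_stat = residual_variance(x_a, y_a) / residual_variance(x_b, y_b)
    return f_stat, (len(x_a) - 2, len(x_b) - 2)


# Hypothetical sub samples: 'a' is visibly noisier than 'b'.
x_a, y_a = [1.0, 2.0, 3.0, 4.0, 5.0], [2.0, 2.5, 4.5, 3.0, 8.0]
x_b, y_b = [1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 2.1, 2.9, 4.1, 4.9]
f_stat, df = goldfeld_quandt(x_a, y_a, x_b, y_b)
print(f_stat, df)
```

Putting the sub sample with the larger suspected variance in the numerator keeps the test one-sided in the upper tail.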
DETECTING HETEROSKED II
 Lagrange Multiplier / Breusch-Pagan
 It is a LARGE SAMPLE test which uses the CHI-SQUARED distribution.
 It’s also devilishly simple to calculate:
 \( LM = N \times R^2 \sim \chi^2_{(K-1)} \)
 Where R² is the ‘goodness of fit’ statistic gained from running an OLS
regression of the squared residuals on the ‘z’ variables defined below, and K-1 is the number of those variables.
 Also comes in an F-test form, but no need to go into it.
 It works by testing the variance function:
 \( \mathrm{Var}(y_i) = \sigma_i^2 = h(\alpha_1 + \alpha_2 z_{i2} + \dots + \alpha_K z_{iK}) \)
 Where the ‘z’ terms are the variables we think are affecting heterosked in the
model. The Null hypothesis is that \( \alpha_2 = \dots = \alpha_K = 0 \), leaving just \( \alpha_1 \), a constant.
 The White Test
 The problem with the LM/B-P test is that it doesn’t specify what the ‘z’ terms should be in
the case that the alternative hypothesis (not all of their coefficients are 0, so there is heterosked.)
proves true.
 The White test specifies that each ‘z’ term is either an explanatory variable OR a
square of an explanatory variable, OR a cross product of two explanatory
variables.
 It’s the same Chi squared/F test as before though.
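The LM statistic can be sketched for the simplest case of one regressor, taking the single ‘z’ variable to be x itself (an assumption made for this example). The squared OLS residuals are regressed on x, and N times the R² of that auxiliary regression is the statistic, compared here with the 5% chi-squared critical value for 1 degree of freedom, 3.841. Function names and data are hypothetical.

```python
def fit_simple(x, y):
    """OLS for y = b1 + b2*x; returns (residuals, R-squared)."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b2 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / sxx
    b1 = y_bar - b2 * x_bar
    residuals = [yi - b1 - b2 * xi for xi, yi in zip(x, y)]
    sse = sum(e ** 2 for e in residuals)
    sst = sum((yi - y_bar) ** 2 for yi in y)
    return residuals, 1.0 - sse / sst


def breusch_pagan_lm(x, y):
    """LM = N * R^2, with R^2 taken from regressing the squared
    residuals of the original model on z = x."""
    residuals, _ = fit_simple(x, y)
    squared = [e ** 2 for e in residuals]
    _, r2_aux = fit_simple(x, squared)
    return len(x) * r2_aux


# Hypothetical data, far too small for a large-sample test in practice.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.5, 3.0, 8.0]
lm = breusch_pagan_lm(x, y)
CHI2_5PCT_1DF = 3.841  # 5% critical value of chi-squared with 1 d.f.
print(lm, "reject homoskedasticity" if lm > CHI2_5PCT_1DF else "do not reject")
```

A White-style version of the same sketch would add the square of x (and, with more regressors, the cross products) to the auxiliary regression, which raises the degrees of freedom accordingly.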
