ARCH MODEL AND TIME-VARYING VOLATILITY
In this lesson we'll use Stata to estimate several models in which the variance of the dependent
variable changes over time. These are broadly referred to as ARCH (autoregressive conditional
heteroskedasticity) models and there are many variations upon the theme. Again, this is all
covered in POE4.
The first thing to do is illustrate the problem graphically using data on stock returns. The data
are stored in the Stata dataset returns.dta.
    use returns, clear
The data contain four monthly stock price indices: U.S. Nasdaq (nasdaq), the Australian All
Ordinaries (allords), the Japanese Nikkei (nikkei) and the U.K. FTSE (ftse). The data are
recorded monthly beginning in 1988m1 and ending in 2009m7.
    gen date = m(1988m1) + _n - 1
    format date %tm
    tsset date
Plots of the series in their levels are generated using twoway (tsline varname).
    qui tsline nasdaq, name(nas, replace)
    qui tsline allords, name(a, replace)
    qui tsline ftse, name(f, replace)
    qui tsline nikkei, name(nk, replace)
    graph combine nas a f nk, cols(2) name(all1, replace)
The series are characterized by random, rapid changes and are said to be volatile. The volatility
seems to change over time as well. For instance, U.S. stock returns (nasdaq) experience a
relatively sedate period from 1992 to 1996. Then, stock returns become much more volatile until
early 2004. Volatility increases again at the end of the sample. The other series exhibit similar
periods of relative calm followed by increased volatility.
Next, the histogram command is used to generate graphs of the empirical distribution of
returns. A curve from a normal distribution is overlaid using the normal option.
    qui histogram nasdaq, normal name(nas, replace)
    qui histogram allords, normal name(a, replace)
    qui histogram ftse, normal name(f, replace)
    qui histogram nikkei, normal name(nk, replace)
    graph combine nas a f nk, cols(2) name(all2, replace)
[Figure: time series plots of the four series against date (1990m1 to 2010m1). Panels: NASDAQ stock Index (USA); All Ordinaries Stock Index (Australia); FTSE Stock Index (UK); Nikkei Stock Index (Japan).]
Time-Varying Volatility and ARCH Models
These series are leptokurtic. That means they have lots of observations around the average and a
relatively large number of observations that are far from average; the center of the histogram has
a high peak and the tails are relatively heavy compared to the normal.
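The visual impression from the histograms can be backed up with a number. The lines below are a minimal sketch, assuming returns.dta is still the dataset in use; they print the sample kurtosis of each series, which Stata's summarize command stores in r(kurtosis). A normal distribution has kurtosis 3, so values well above 3 point to heavy tails.

```stata
* Sketch: compare the sample kurtosis of each series with the normal benchmark of 3
use returns, clear
foreach v of varlist nasdaq allords ftse nikkei {
    quietly summarize `v', detail
    display "`v': kurtosis = " %6.3f r(kurtosis)
}
```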
TESTING, ESTIMATING, AND FORECASTING
The basic ARCH models consist of two equations. The mean equation describes the behavior of
the mean of your time series; it is a linear regression function that contains a constant and
possibly some explanatory variables. In the cases considered below, the mean function contains
only an intercept.
$$y_t = \beta + e_t$$
In this case we expect the time series to vary randomly about its mean, $\beta$. If the mean of your
time series drifts over time or is explained by other variables, you'd add them to this equation just
as you would in the usual regression model. The error of the regression is normally distributed
and heteroskedastic. The variance of the current period's error depends on information that is
revealed in the preceding period. The variance of $e_t$ is given the symbol $h_t$. The variance
equation describes how the error variance behaves.
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2$$
[Figure: histograms (density) of the four series with a normal curve overlaid. Panels: NASDAQ stock Index (USA); All Ordinaries Stock Index (Australia); FTSE Stock Index (UK); Nikkei Stock Index (Japan).]
Notice that $h_t$ depends on the squared error in the preceding time period. The parameters in this
equation have to be positive to ensure that the variance, $h_t$, is positive.
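To see what the two equations imply, it can help to simulate a series from them. The following is a rough sketch only: the parameter values (beta0, alpha0, alpha1), the seed, and the variable names are made up for illustration and are not taken from the text. Note that it clears the data in memory, so reload your dataset afterwards.

```stata
* Sketch: simulate an ARCH(1) series; all parameter values are hypothetical
clear
set seed 12345
set obs 500
gen time = _n
tsset time

* mean, variance intercept, and ARCH(1) coefficient
scalar beta0  = 1
scalar alpha0 = 1
scalar alpha1 = 0.5

* start h at the unconditional variance alpha0/(1 - alpha1)
gen double h = alpha0/(1 - alpha1) in 1
gen double e = sqrt(h)*rnormal() in 1

* build h_t from last period's squared error, then draw e_t
forvalues t = 2/500 {
    quietly replace h = alpha0 + alpha1*e[`t' - 1]^2 in `t'
    quietly replace e = sqrt(h)*rnormal() in `t'
}

gen double y = beta0 + e
tsline y, name(sim, replace)
```

The plot should show calm stretches interrupted by bursts of large movements, because a large error in one period raises the variance of the next period's error.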
A Lagrange Multiplier (LM) test can be used to test for the presence of ARCH effects (i.e.,
whether $\alpha_1 > 0$). To perform this test, first estimate the mean equation. Save and square the
estimated residuals, $\hat e_t^2$. You will use these in an auxiliary regression from which you'll use the
sample size and goodness-of-fit measure to compute a test statistic. For first order ARCH, regress
$\hat e_t^2$ on the lagged squared residuals $\hat e_{t-1}^2$ and a constant:
$$\hat e_t^2 = \gamma_0 + \gamma_1 \hat e_{t-1}^2 + v_t$$
where $v_t$ is a random term. The null and alternative hypotheses are:
$$H_0: \gamma_1 = 0 \qquad H_1: \gamma_1 \ne 0$$
The test statistic is $TR^2$, where $T$ is the number of observations in the auxiliary regression. It has a
$\chi^2(1)$ distribution if the null hypothesis is true. Compare the p-value from this statistic to the
desired test level ($\alpha$) and reject the null if the p-value is smaller. If you suspect a higher order
ARCH(q) error variance process, then include q lags of $\hat e_t^2$ as regressors, compute $TR^2$, and use
the $\chi^2(q)$ distribution to obtain the p-value.
In the first ARCH example the byd.dta data are used. Load the data using the clear option to
remove any previous data from Stata's memory.
    use byd, clear
This dataset contains a single undated time series. Generate a time variable in the easiest way
possible and declare the data to be time series.
    gen time = _n
    tsset time
In this instance, a time counter equal to the observation number is created using _n and stored
in the variable time. Then the tsset command is used to declare the data to be a time series.
The first thing to do is plot the time series using
    tsline r, name(g1, replace)
This yields
There is visual evidence of time varying volatility. Towards the end of the time series, returns for
BYD appear to become more volatile. An ARCH(1) model is proposed and the ARCH(1) model
is tested against the null hypothesis of no ARCH using the LM test discussed above. The first
step is to estimate a regression that contains only an intercept. Obtain the residuals, which we call
ehat, and square them.
    regress r
    predict ehat, residual
    gen ehat2 = ehat * ehat
The auxiliary regression is
$$\hat e_t^2 = \gamma_0 + \gamma_1 \hat e_{t-1}^2 + v_t$$
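The auxiliary regression and the $TR^2$ statistic can be computed along the following lines. This is a sketch that assumes ehat2 has been generated as above and the data have been declared with tsset; the scalar names TR2 and pvalue are our own choices. The last two lines use Stata's built-in estat archlm after the intercept-only regression as a cross-check.

```stata
* auxiliary regression: squared residuals on their first lag and a constant
regress ehat2 L.ehat2

* LM statistic T*R^2 and its p-value from the chi-square(1) distribution
scalar TR2 = e(N)*e(r2)
scalar pvalue = chi2tail(1, TR2)
scalar list TR2 pvalue

* built-in alternative: rerun the mean equation, then request the ARCH LM test
quietly regress r
estat archlm, lags(1)
```

A small p-value leads to rejection of the null hypothesis of no ARCH effects.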
The estimated mean equation gives the forecast of the return:
$$\hat r_{t+1} = \hat\beta_0 = 1.063$$
The forecasted error variance is essentially an in-sample prediction model based on the estimated
variance function.
$$\hat h_{t+1} = \hat\alpha_0 + \hat\alpha_1 \left( r_t - \hat\beta_0 \right)^2 = 0.642 + 0.569 \left( r_t - 1.063 \right)^2$$
Stata generates this whenever it estimates an ARCH model and saves the result to a variable using
the predict command with option variance. Here the ARCH(1) model is estimated and the
variance is saved as a variable called htarch.
    arch r, arch(1)
    predict htarch, variance
This could be generated manually using saved results from the estimated ARCH model
    gen ht_1 = _b[ARCH:_cons] + _b[ARCH:L1.arch]*(L.r - _b[r:_cons])^2
    list htarch ht_1 in 496/500
which produces:
    . list htarch ht_1 in 496/500

            htarch1       ht_1
    496.   1.412281   1.412281
    497.   .8093833   .8093833
    498.   1.968768   1.968768
    499.   1.614941   1.614941
    500.   2.122526   2.122526

The built-in computation from Stata's predict command is confirmed by our manual calculation.
Then tsline is used to plot the forecast error variance against time.
    tsline htarch, name(g2, replace)
This produces the time series plot
[Figure: time series plot of the conditional variance (one-step) against time, observations 0 to 500.]
Obviously, there is a lot more volatility towards the end of the sample.
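As a quick check on how the estimated variance function works, take a purely hypothetical return of $r_t = 2$ (this value is chosen only for illustration and is not from the data). Plugging it into the estimated function gives:

```latex
\hat h_{t+1} = 0.642 + 0.569\,(2 - 1.063)^2
             = 0.642 + 0.569 \times 0.878
             \approx 1.142
```

A return further from the estimated mean of 1.063, in either direction, produces a larger predicted variance for the next period.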
EXTENSIONS
An important extension of the ARCH(1) is the ARCH(q) model. Here, additional lags of the
squared residuals are added as determinants of the equation's variance, $h_t$:
$$h_t = \alpha_0 + \alpha_1 e_{t-1}^2 + \alpha_2 e_{t-2}^2 + \dots + \alpha_q e_{t-q}^2$$
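In Stata, higher order ARCH terms are requested by giving the arch() option a list of lags. The lines below are a sketch for the BYD returns; the choice of q = 2 is only for illustration and is not a specification taken from the text.

```stata
* LM test for ARCH effects of order 2 after the intercept-only mean equation
quietly regress r
estat archlm, lags(2)

* ARCH(2): lags 1 and 2 of the squared residual enter the variance equation
arch r, arch(1/2)
```

If the coefficient on the second lag is not statistically significant, the simpler ARCH(1) may be adequate.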
GARCH
Another extension is the Generalized ARCH or GARCH model. The GARCH model adds lags
of the variance, $h_{t-p}$, to the standard ARCH. A GARCH(1,1) model would look like this:
$$h_t = \delta + \alpha_1 e_{t-1}^2 + \beta_1 h_{t-1}$$
It has one lag of the regression model's squared residual (1 ARCH term) and one lag of the variance itself
(1 GARCH term). Additional ARCH or GARCH terms can be added to obtain the GARCH(p,q),
where p is the number of lags of $h_t$ and q is the number of lags of $e_t^2$ included in the model.
Estimating a GARCH(1,1) model for BYD is simple. Basically, you just add a single
GARCH term to the existing ARCH model, so the command is
    arch r, arch(1) garch(1)
The syntax is interpreted this way. We have an arch regression model that includes r as a
dependent variable and has no independent variables other than a constant. The first option
arch(1) tells Stata to add a single lagged value of $e_t^2$ to the modeled variance; the second option
garch(1) tells Stata to add a single lag of the variance, $h_t$, to the modeled variance. The result is:
    Iteration 7:   log likelihood = -736.02814

    ARCH family regression

    Sample: 1 - 500                                    Number of obs   =       500
    Distribution: Gaussian                             Wald chi2(.)    =         .
    Log likelihood = -736.0281                         Prob > chi2     =         .

    ------------------------------------------------------------------------------
                 |                 OPG
               r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    r            |
           _cons |   1.049856   .0404623    25.95   0.000     .9705517    1.129161
    -------------+----------------------------------------------------------------
    ARCH         |
            arch |
             L1. |   .4911796   .1015995     4.83   0.000     .2920482    .6903109
           garch |
             L1. |   .2379837   .1114836     2.13   0.033     .0194799    .4564875
           _cons |   .4009868   .0899182     4.46   0.000     .2247505    .5772232
    ------------------------------------------------------------------------------

The estimate of $\alpha_1$ is 0.491 and the estimated coefficient on the lagged variance, $\beta_1$, is 0.238.
Again, there are a few minor differences between these results and those in the text, but that is to
be expected when coefficient estimates have to be solved for via numerical methods rather than
analytical ones.
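One way to summarize these estimates is through the sum $\alpha_1 + \beta_1$, which here is roughly 0.491 + 0.238 = 0.729. In a GARCH(1,1) model with this sum below one, the implied long-run variance is the variance intercept divided by $1 - \alpha_1 - \beta_1$. The following sketch computes both from the saved coefficients; it assumes the GARCH(1,1) model above is the most recent estimation, and it is offered as an illustration rather than as part of the text's example.

```stata
* Run right after: arch r, arch(1) garch(1)
* persistence of volatility shocks
display "alpha1 + beta1 = " _b[ARCH:L1.arch] + _b[ARCH:L1.garch]

* implied long-run (unconditional) variance, valid when alpha1 + beta1 < 1
display "long-run variance = " _b[ARCH:_cons]/(1 - _b[ARCH:L1.arch] - _b[ARCH:L1.garch])
```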
As in the ARCH model, the predicted forecast variance can be saved and plotted:
    predict htgarch, variance
    tsline htgarch
which yields the time series plot:
Threshold GARCH
The threshold GARCH model, or T-GARCH, is another generalization of the GARCH model
where positive and negative news are treated asymmetrically. In the T-GARCH version of the
model, the specification of the conditional variance is:
$$h_t = \delta + \alpha_1 e_{t-1}^2 + \gamma d_{t-1} e_{t-1}^2 + \beta_1 h_{t-1}$$
$$d_t = \begin{cases} 1 & e_t < 0 \ \text{(bad news)} \\ 0 & e_t \ge 0 \ \text{(good news)} \end{cases}$$
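Writing the threshold specification out regime by regime makes the asymmetry explicit. With $\delta$ the variance intercept, $\alpha_1$ the ARCH coefficient, $\gamma$ the asymmetry coefficient and $\beta_1$ the GARCH coefficient, the two cases of the dummy variable give:

```latex
\text{bad news } (e_{t-1} < 0):\quad  h_t = \delta + (\alpha_1 + \gamma)\, e_{t-1}^2 + \beta_1 h_{t-1}
\text{good news } (e_{t-1} \ge 0):\quad h_t = \delta + \alpha_1\, e_{t-1}^2 + \beta_1 h_{t-1}
```

So when $\gamma > 0$, a negative shock raises next period's variance by more than a positive shock of the same size.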
[Figure: time series plot of the predicted conditional variance (one-step) from the GARCH(1,1) model against time, observations 0 to 500.]
In Stata this just means that another option is added to the arch r regression model. The option to
add asymmetry of this sort is tarch(), where the argument tells Stata how many lagged asymmetry
terms to add. This can be less than the number of ARCH terms, q, but not greater.
Here is a T-GARCH model for BYD.
    arch r, arch(1) garch(1) tarch(1)
    predict httgarch, variance
    tsline httgarch
Once again, the variance is saved and plotted using a time series plot. The Threshold GARCH
result is:

    Iteration 7:   log likelihood = -730.55397

    ARCH family regression

    Sample: 1 - 500                                    Number of obs   =       500
    Distribution: Gaussian                             Wald chi2(.)    =         .
    Log likelihood =  -730.554                         Prob > chi2     =         .

    ------------------------------------------------------------------------------
                 |                 OPG
               r |      Coef.   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
    r            |
           _cons |   .9948399   .0429174    23.18   0.000     .9107234    1.078956
    -------------+----------------------------------------------------------------
    ARCH         |
            arch |
             L1. |    .754298   .2003852     3.76   0.000     .3615501    1.147046
           tarch |
             L1. |  -.4917071   .2045045    -2.40   0.016    -.8925285   -.0908856
           garch |
             L1. |      .2873   .1154888     2.49   0.013     .0609462    .5136538
           _cons |   .3557296   .0900538     3.95   0.000     .1792274    .5322318
    ------------------------------------------------------------------------------

and the plotted predicted error variances are shown in the time series plot of httgarch.
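The negative sign on the tarch coefficient can look puzzling next to the bad-news dummy in the T-GARCH equation. Our understanding of Stata's arch documentation is that the tarch() term is switched on when the lagged innovation is positive, which is the opposite of the dummy $d_{t-1}$ used here; treat that as an assumption to verify against the manual. Under it, the coefficients in the document's notation can be recovered as sketched below (the scalar names are our own): $\alpha_1$ is the sum of the arch and tarch coefficients, about 0.754 - 0.492 = 0.263, and $\gamma$ is the tarch coefficient with its sign reversed, about 0.492.

```stata
* Run right after: arch r, arch(1) garch(1) tarch(1)
* Assumption: Stata's tarch term applies to positive lagged innovations

* effect of a squared shock after good news (alpha1 in the equation above)
scalar a1_text = _b[ARCH:L1.arch] + _b[ARCH:L1.tarch]

* extra effect of a squared shock after bad news (gamma in the equation above)
scalar g_text = -_b[ARCH:L1.tarch]

scalar list a1_text g_text
```

On this reading, bad news raises next period's variance by more than good news of the same size, as the T-GARCH story suggests.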
GARCH-in-mean
A final variation of the ARCH model is called GARCH-in-mean (MGARCH). In this model,
the variance, $h_t$, is added to the regression function.
$$y_t = \beta_0 + \theta h_t + e_t$$
If its parameter, $\theta$, is positive then higher variances will cause the average return E(y) to increase.
This seems reasonable: more risk, higher average reward! To add a GARCH-in-mean to the BYD
example, we simply add another option to the growing list in the arch statement. The command
becomes:
    arch r, archm arch(1) garch(1) tarch(1)
In this case, the option archm (which stands for arch in mean) is added to the others, arch(1)
garch(1) and tarch(1). These are retained since these terms are included in the BYD example
from the text. The results are
[Figure: time series plot of the predicted conditional variance (one-step) from the T-GARCH model against time, observations 0 to 500.]
You can see that the coefficient on the GARCH-in-mean term