03/05/2010
Anton Parlow Lab session Econ710 UWM Econ Department () and Box-Jenkins Unit root tests
Our plan
Introduction to time series
AR and MA-process
Box-Jenkins Method
Unit root tests
Short review of Stata
Finding the proper model
Unit root tests
Arima
Forecasting
Introduction
A time series is the outcome of a variable observed over time, e.g. annually, quarterly, monthly and so on. There are different ways to describe a series, e.g. does it have a trend, a drift, or is it a random walk? Example: Quarterly real GDP from 1947 to 2008
We want to explain GDP today with past values of GDP, but we have to find the proper model first.
AR and MA-process
If GDP ($y_t$) depends only on its own (= auto) past values (= regressive), we have an autoregressive process: $y_t = \alpha + \phi_1 y_{t-1} + \phi_2 y_{t-2} + \phi_3 y_{t-3} + \dots + \phi_p y_{t-p} + \varepsilon_t$
In general we call it an AR(p)-model, and if GDP depends only on one past realization (= lag), it is an AR(1)-process: $y_t = \alpha + \phi_1 y_{t-1} + \varepsilon_t$
If a variable depends only on past realizations of its own error terms, we have a moving average process: $y_t = \alpha + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \theta_3 \varepsilon_{t-3} + \dots + \theta_q \varepsilon_{t-q}$
In general we call it an MA(q)-model, and if it depends only on one past error term, it is an MA(1)-process: $y_t = \alpha + \varepsilon_t + \theta_1 \varepsilon_{t-1}$
The error term is assumed to be well-behaved, sometimes called a white noise process: $E[\varepsilon_t] = 0$, $Var(\varepsilon_t) = \sigma^2$, and the errors are iid (= independently and identically distributed). It is a bit hard to find examples for MA-processes, so let us focus on AR-processes today!
In general these two models are special cases of an ARMA(p,q)-model, where p = order of the AR-process and q = order of the MA-process. Examples:
ARMA(1,0) = AR(1)-process: $y_t = \alpha + \phi_1 y_{t-1} + \varepsilon_t$
ARMA(0,1) = MA(1)-process: $y_t = \alpha + \varepsilon_t + \theta_1 \varepsilon_{t-1}$
If you see an ARIMA(p,I,q)-model, the I stands for integrated, i.e. how often the series has to be differenced to become stationary (see unit-root tests). If I=0 or I(0), the time series is already stationary. If I=1 or I(1), it is stationary after first differencing, and so on.
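Sometimes it is convenient to write these models in lag-operator notation, with $L$ = one lag, $L^2$ = two lags and so on, such that $L y_t = y_{t-1}$, $L^2 y_t = y_{t-2}$, $L^3 y_t = y_{t-3}$. Example: $y_t = \alpha + \phi_1 y_{t-1} + \varepsilon_t$ can be written as $[1 - \phi_1 L] y_t = \alpha + \varepsilon_t$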
How to figure out the process describing a time series? Use the autocorrelation function ACF (= correlation between the series and its past realizations) and the partial autocorrelation function PACF. See Hamilton chapter 3 for a very good step by step derivation of these. Take a look at these and decide. Time-series modeling is often referred to as an art (actually empirical work in general), meaning two economists may tell you different things when they look at these functions. Remember the ACF and PACF are pretty much opposite to each other when we talk about AR and MA-processes. An AR-process has an (exponentially) declining ACF and spikes in the PACF. An MA-process has spikes in the ACF and an (exponentially) declining PACF. CONFUSED??? See some examples next.
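To make the declining ACF of an AR-process concrete, here is a minimal simulation sketch. It is written in Python with the standard library only (not part of the Stata lab); the helper names simulate_ar1 and sample_acf, the coefficient 0.8, the sample size and the seed are all illustrative assumptions. For an AR(1) with coefficient 0.8 the theoretical autocorrelation at lag k is 0.8^k, and the sample ACF should roughly follow that pattern.

```python
import random

def simulate_ar1(phi, n, alpha=0.0, seed=0):
    """Simulate y_t = alpha + phi * y_{t-1} + e_t with standard normal errors."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(alpha + phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(1, max_lag + 1)]

phi = 0.8  # illustrative AR(1) coefficient
y = simulate_ar1(phi=phi, n=5000, seed=1)
acf = sample_acf(y, 5)
for k, r in enumerate(acf, start=1):
    print(f"lag {k}: sample ACF = {r:.3f}, theoretical = {phi ** k:.3f}")
```

The sketch only covers the ACF; in Stata the same picture comes from ac and pac on the actual series.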
Example AR(2):
Example MA(2):
Much more fun if you have AR and MA-terms in your model.. ARMA(1,1):
Another way to find the underlying process is to use information criteria like BIC, AIC or SIC, which are part of the regression output in Eviews but not in Stata (calculating them by hand is a lot of fun). E.g. start with AR(0), then AR(1), AR(2), ... and calculate the information criteria for each model. A trick: use estat ic after the estimation.
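The "start with AR(0), then AR(1)" idea can be sketched by hand as follows. This is a Python illustration under stated assumptions, not the Stata workflow: the data are simulated, the helper names are made up, and the criteria use the Gaussian least-squares form n*ln(RSS/n) + penalty, which drops constants and therefore will not match the numbers printed by estat ic. Both models are fitted on the same observations so that the criteria are comparable; the model with the smaller value is preferred.

```python
import math
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t with standard normal errors."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def rss_ar0(y):
    """Residual sum of squares of a constant-only model on y[1:]."""
    target = y[1:]
    mean = sum(target) / len(target)
    return sum((v - mean) ** 2 for v in target)

def rss_ar1(y):
    """Residual sum of squares of the OLS regression of y_t on a constant and y_{t-1}."""
    x, target = y[:-1], y[1:]
    n = len(target)
    mx, my = sum(x) / n, sum(target) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, target))
    slope = sxy / sxx
    intercept = my - slope * mx
    return sum((b - intercept - slope * a) ** 2 for a, b in zip(x, target))

def info_criteria(rss, n, k):
    """AIC and BIC up to an additive constant shared by all models."""
    aic = n * math.log(rss / n) + 2 * k
    bic = n * math.log(rss / n) + k * math.log(n)
    return aic, bic

y = simulate_ar1(phi=0.8, n=500, seed=2)
n_obs = len(y) - 1                       # both models use y[1:]
aic0, bic0 = info_criteria(rss_ar0(y), n_obs, k=1)
aic1, bic1 = info_criteria(rss_ar1(y), n_obs, k=2)
print(f"AR(0): AIC = {aic0:.1f}, BIC = {bic0:.1f}")
print(f"AR(1): AIC = {aic1:.1f}, BIC = {bic1:.1f}")
```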
Box-Jenkins-method
It was the first systematic approach to time-series modeling and includes 4 steps:
1. Model identification = test for stationarity, use the ACF and PACF or information criteria to find the right model
2. Model estimation = run the regressions, get the residuals
3. Model checking = use the residuals to check if they are white noise (graph, Q-tests and more)
(4. Forecasting - see appendix)
If a time series is stationary, regression results are not spurious. This means most of the time we want the series to be stationary (not needed if you do error-correction models). The problem is that most macroeconomic time series like GDP, unemployment, trade and many more are non-stationary (= contain a unit root): they do not go back to their mean and the variance is not constant (actually increasing over time). More formally, a series is stationary when the errors satisfy:
1. $E(\varepsilon_t) = 0$
2. $var(\varepsilon_t) = \sigma^2$, i.e. the variance is constant
3. $E(\varepsilon_t \varepsilon_{t-1}) = 0$
In other words: the errors are well-behaved or white noise. A non-stationary time series has the opposite properties!
Or if we use $y_t$ instead, a time series is stationary when:
1. $E(y_t) = \mu$: the mean is constant and does not depend on time
2. $E[(y_t - \mu)(y_{t-j} - \mu)] = \gamma_j$: the autocovariance is independent of time too!
This means we have to test for non-stationarity, which is done using unit root tests like the most common one, the Dickey-Fuller test. To make a non-stationary time series stationary, we can do the following:
1. take the first differences
2. or detrend the time series (we don't do this today)
The Dickey-Fuller test (or augmented Dickey-Fuller test if additional lags are included) uses the following test regressions:
1. $\Delta y_t = \delta y_{t-1} + \varepsilon_t$ if the time series is flat (no trend) and potentially slow-turning around zero
2. $\Delta y_t = \alpha + \delta y_{t-1} + \varepsilon_t$ if the series is flat and potentially slow-turning around a non-zero value (or has a drift, intercept = $\alpha$)
3. $\Delta y_t = \alpha + \delta y_{t-1} + \beta T + \varepsilon_t$ if the series has a trend T (up or down) and a drift (intercept), or is slow-turning around a trend line you would draw through the data
The DF-test has its own test statistics (critical values), and we want to reject $H_0: \delta = 0$ in favor of stationarity. In other words, if we cannot reject $H_0$, the series is non-stationary and has to be first differenced.
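The first test regression (no constant, no trend) can be sketched by hand as below. This is a Python illustration on simulated data, not the Stata dfuller command; the function names, coefficients and seeds are assumptions. The t-statistic of the coefficient on the lagged level is compared with the approximate 5% Dickey-Fuller critical value of -1.95 for this case, not with the usual t-table.

```python
import math
import random

def simulate_ar1(phi, n, seed=0):
    """Simulate y_t = phi * y_{t-1} + e_t; phi = 1 gives a random walk."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def df_regression(y):
    """OLS of dy_t on y_{t-1} without constant; returns (coefficient, t-statistic)."""
    lag = y[:-1]
    dy = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(a * a for a in lag)
    delta = sum(a * d for a, d in zip(lag, dy)) / sxx
    resid = [d - delta * a for a, d in zip(lag, dy)]
    s2 = sum(e * e for e in resid) / (len(dy) - 1)   # one estimated parameter
    return delta, delta / math.sqrt(s2 / sxx)

DF_CRIT_5PCT = -1.95  # approximate critical value, regression without constant

delta_st, t_st = df_regression(simulate_ar1(phi=0.5, n=500, seed=3))
delta_rw, t_rw = df_regression(simulate_ar1(phi=1.0, n=500, seed=3))
print(f"stationary AR(1): coefficient = {delta_st:.3f}, t = {t_st:.2f}")
print(f"random walk:      coefficient = {delta_rw:.3f}, t = {t_rw:.2f}")
print("reject unit root for the stationary series:", t_st < DF_CRIT_5PCT)
```

For the stationary series the coefficient is close to 0.5 - 1 = -0.5 and the unit root is clearly rejected; for the random walk the statistic is printed without a claim, since its value depends on the draw.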
How do we choose the lag length p for the DF-test? Schwert (1989) suggests the following rule of thumb, where T is the number of observations: $p_{max} = 12 \left( \frac{T}{100} \right)^{1/4}$
Why should we care? If p is (1) too small, some serial correlation can remain in the errors and bias the test; (2) too large, the power of the test will suffer. Another test for unit roots is suggested by Phillips and Perron (= PP), which corrects for serial correlation and heteroskedasticity in the errors. Both the ADF and the PP-test are not very helpful if the series is close to being stationary. Kwiatkowski, Phillips, Schmidt and Shin (1992) suggest a test for stationarity, the so-called KPSS-test, with H0 = series is stationary.
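As a quick worked example of the rule of thumb, the sketch below evaluates it in Python. Two assumptions: the result is truncated to its integer part (the formula above does not say how to round), and the GDP sample is taken to run from 1947q1 to 2008q4, i.e. T = 248 quarters.

```python
def schwert_pmax(T):
    """Integer part of 12 * (T / 100) ** (1/4)."""
    return int(12 * (T / 100) ** 0.25)

print(schwert_pmax(100))  # 12
print(schwert_pmax(248))  # 15, since 12 * 2.48 ** 0.25 is about 15.06
```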
There are more tests out there, but in general it is not enough to use the Dickey-Fuller test only. Usually you use some more to be confident about your time series.
Remember a command in Stata has the following structure: [command] variable, options
We used gen for generating new variables, e.g. gen lgdp=log(gdp) to generate the log of GDP.
Remember: if you want to have the residuals after a regression, use predict
We will work with quarterly GDP data first:
1. set mem 50m
2. load the data: use gdp.dta
3. Stata needs to know it is a time series.
3.1. generate a time variable: gen time=tq(1947q1)+_n-1
3.2. give it the right format: format time %tq
3.3. tell Stata about it: tsset time
4. graph the series: tsline gdp
5. generate the log: gen lgdp=log(gdp) and graph it again: tsline lgdp
Let us play around with the ACF (= ac) and the PACF (= pac), where lgdp is the variable and the option is the lag length:
1. ac lgdp, lags(10)
2. pac lgdp, lags(10)
or 3. corrgram lgdp, lags(10)
What do we see? Do it again for 20 lags. Let us do the same for the first-difference version of lgdp. There are two ways:
1. generate a new variable: gen flgdp=D.lgdp
or 2. use the operator directly: ac D.lgdp
Assume an AR(1)-model is okay for the log of real GDP. We should run the following regression: reg lgdp L.lgdp
Note:
Stata uses L for a lag (L2 = two lags, L3 = three lags)
Stata uses D for taking the first difference
Stata uses F if you have to forward your series, sometimes called a lead
This is pretty convenient, because you can use these for generating new variables too.
If the AR(1) model is the proper one, the errors should be white noise. There are a couple of ways to test for it:
1. graph the errors
2. do a Breusch-Godfrey test for serial correlation
3. do a Q-test, called white-noise test (or portmanteau test)
Note: The Box-Pierce test is not very common anymore, due to its poor performance in small samples.
1. Graphing the errors
To get the residuals after the regression: predict res, resid
Stata will save the errors in res. There are two ways to graph them:
1.1. tsline res plots them against time; there should be no pattern over time
1.2. plot the residuals against past residuals; there should be no pattern again!
reg res L.res, beta
twoway (scatter res L.res)
2. Breusch-Godfrey test
Again after the regression, do the following (no need for predicting errors): estat bgodfrey, lags(10)
H0 = no serial correlation; if we reject it, the errors are correlated and not white noise!
3. White-noise test
Run the regression, predict the errors (e.g. predict res, resid) and do the following: wntestq res, lags(10)
H0 = no serial correlation; if we reject it, the errors are correlated and not white noise!
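What a portmanteau Q-test measures can be sketched by hand. The Python code below computes a Ljung-Box type statistic, Q = n(n+2) * sum over k of rho_k^2 / (n-k), for 10 lags and compares it with 18.307, the 5% critical value of a chi-squared distribution with 10 degrees of freedom. The series are simulated and the function names are illustrative; whether the number matches the output of wntestq exactly is not checked here.

```python
import random

def sample_acf(y, max_lag):
    """Sample autocorrelations for lags 1..max_lag."""
    n = len(y)
    mean = sum(y) / n
    dev = [v - mean for v in y]
    var = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / var
            for k in range(1, max_lag + 1)]

def ljung_box_q(series, lags):
    """Ljung-Box Q-statistic using autocorrelations up to `lags`."""
    n = len(series)
    acf = sample_acf(series, lags)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))

CHI2_10_CRIT_5PCT = 18.307  # 5% critical value, chi-squared with 10 df

rng = random.Random(4)
noise = [rng.gauss(0.0, 1.0) for _ in range(500)]   # white noise
correlated = [0.0]
for e in noise[1:]:
    correlated.append(0.8 * correlated[-1] + e)     # AR(1), clearly not white noise

q_noise = ljung_box_q(noise, 10)
q_corr = ljung_box_q(correlated, 10)
print(f"white noise: Q = {q_noise:.1f}")
print(f"AR(1):       Q = {q_corr:.1f}")
print("reject white noise for the AR(1) series:", q_corr > CHI2_10_CRIT_5PCT)
```

Applied to regression residuals, a large Q means the model has left serial correlation in the errors.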
2. Phillips-Perron test
If we don't specify a lag length, the PP-test picks one by a rule of thumb based on the sample size (similar in spirit to Schwert's rule). Options are similar to dfuller: pperron lds
Remember: H0 = non-stationary
3. KPSS-test
kpss lds (type help kpss into Stata, the options are a bit different)
Remember: H0 = stationary. If we reject the null, the series is non-stationary. Stata gives you the test values for different lag lengths.
ARIMA in Stata
We focused on AR-processes using OLS so far, but a more powerful command is: arima
Arima estimation is a maximum likelihood estimation, and remember the notation is in general Arima(p,I,q), where I = order of integration, e.g. I=0: the series is already stationary, I=1: you have to take the first differences first.
Examples:
AR(1) for defense spending (ds): arima ds, ar(1)
still AR(1), for a series that is already stationary without first differencing: arima ds, arima(1,0,0)
first-difference version of the AR(1) on ds: arima D.ds, ar(1) = arima ds, arima(1,1,0)
an MA(1)-process for ds: arima ds, ma(1) = arima ds, arima(0,0,1)
an AR(1) and an MA(1) component: arima ds, ar(1) ma(1) = arima ds, arima(1,0,1)
To get the AIC and BIC for the models, use the following command after the estimation: estat ic
Residual test: to test the residuals for autocorrelation, proceed as before (but bgodfrey will not work), e.g. predict the residuals and graph them, do a white-noise test (wntestq res) or, if you like, compute a Durbin-Watson statistic (dwstat res), which should be around 2.
Forecasting
There are different types of forecasting after a regression. We can do an in-sample forecast (using the quarters given) or an out-of-sample forecast (adding quarters). I will do it for the arima command (OLS is a bit different). Remember: To check the quality of your forecast, you need to calculate the root mean square error (RMSE). The RMSE uses the forecast error (actual observation minus the forecast) and the formula is the following: $RMSE = \sqrt{\frac{\sum (Y_t - forecast_t)^2}{N}}$
Example AR(1)-model: arima fgdp, ar(1)
Do a one-step ahead forecast: predict fgdp1, y
Compare the actual value with the forecast: tsline fgdp fgdp1
Forecasting continued
Calculate the RMSE:
1. Generate the forecast error: gen ferr=fgdp-fgdp1
2. Generate the square of the forecast error: gen ferr2=ferr^2
3. Get the mean of the squared errors: sum ferr2 (here 0.0040)
4. Use it to compute the RMSE: display "rmse: " (0.0040)^.5
Note there are more ways to measure forecast accuracy.
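The same four steps can be written as one small function. This is a Python sketch with made-up numbers (the lists actual and forecast are purely illustrative); the last line just redoes the final step with the mean squared error of 0.0040 quoted above.

```python
import math

def rmse(actual, forecast):
    """Root mean square error: sqrt of the mean squared forecast error."""
    errors = [a - f for a, f in zip(actual, forecast)]   # step 1: forecast errors
    squared = [e * e for e in errors]                    # step 2: squared errors
    mean_sq = sum(squared) / len(squared)                # step 3: their mean
    return math.sqrt(mean_sq)                            # step 4: square root

actual = [1.0, 2.0, 3.0, 4.0]      # illustrative values
forecast = [1.5, 1.5, 3.5, 3.5]    # every forecast is off by 0.5
print(rmse(actual, forecast))      # 0.5
print(round(0.0040 ** 0.5, 4))     # 0.0632
```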
Forecasting continued
A dynamic forecast could be done as follows: predict fgdpd, xb dynamic(.)
Plot the actual value and the forecast: tsline fgdp fgdpd
Out-of-sample forecast: Do the regression, but then you have to extend the time horizon first: tsappend, add(24) adds 24 quarters to the quarterly data set we have. Then use the predict command for one-step ahead or dynamic forecasts.
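The difference between the two forecast types can be sketched for an AR(1) with known coefficients. In this Python illustration (function names and the values alpha = 1.0, phi = 0.5 are assumptions, not estimates from the GDP data), the one-step ahead forecast always uses the actual previous value, while the dynamic forecast feeds each forecast back in; over a long horizon it converges to the mean alpha / (1 - phi).

```python
def one_step_forecasts(y, alpha, phi):
    """Forecast each y_t from the actual y_{t-1}."""
    return [alpha + phi * prev for prev in y[:-1]]

def dynamic_forecasts(last_value, alpha, phi, horizon):
    """Iterate the AR(1) equation, using each forecast as the next input."""
    out = []
    prev = last_value
    for _ in range(horizon):
        prev = alpha + phi * prev
        out.append(prev)
    return out

alpha, phi = 1.0, 0.5   # illustrative coefficients
print(one_step_forecasts([10.0, 6.0, 4.0], alpha, phi))   # [6.0, 4.0]

# 24 steps ahead, mirroring the 24 quarters added by tsappend
path = dynamic_forecasts(last_value=10.0, alpha=alpha, phi=phi, horizon=24)
print(path[:3])    # [6.0, 4.0, 3.0]
print(path[-1])    # very close to alpha / (1 - phi) = 2.0
```

This is why out-of-sample dynamic forecasts from an AR-model flatten out after a few quarters.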
Forecasting continued
A simple linear OLS forecast (don't ask me about the dynamic one: the same command as above is not working; there should be a way to compute it manually in Stata):
reg fgdp L.fgdp
predict fgdp1 (Stata assumes the option xb anyway in this case)
tsline fgdp fgdp1
What else could be done? There is much more out there, e.g. rolling forecasts, or comparing forecasts of different models, e.g. AR(1) with AR(2), and so on.
The simplest way in Stata is: Let gdp be in levels and we want to create the first difference: gen fgdp=D.gdp (same as $y_t - y_{t-1}$); D2 would be $(y_t - y_{t-1}) - (y_{t-1} - y_{t-2})$. As you have seen above, in a regression you can use D, F and L in front of a variable without generating a new variable first!
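A tiny numeric check of the two difference formulas, as a Python sketch with made-up numbers (the helper diff is illustrative). Note one difference to Stata: D.gdp keeps the series length and puts a missing value in the first observation, while the lists here simply get shorter.

```python
def diff(y):
    """First difference: y_t - y_{t-1}."""
    return [b - a for a, b in zip(y[:-1], y[1:])]

y = [1.0, 4.0, 9.0, 16.0, 25.0]   # illustrative levels
d1 = diff(y)                      # first difference, like D.
d2 = diff(d1)                     # difference of the difference, like D2.

# D2 written out: (y_t - y_{t-1}) - (y_{t-1} - y_{t-2}) = y_t - 2*y_{t-1} + y_{t-2}
direct = [y[t] - 2 * y[t - 1] + y[t - 2] for t in range(2, len(y))]

print(d1)      # [3.0, 5.0, 7.0, 9.0]
print(d2)      # [2.0, 2.0, 2.0]
print(direct)  # [2.0, 2.0, 2.0]
```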