
11th March 2003

NOTES ON PROGRAMS
TRAMO AND SEATS

PART I
Introduction and Brief Review of Applied Time Series Analysis

Agustín Maravall
Bank of Spain

Thanks are due to Victor Gómez, Gianluca Caporello, Fernando Sánchez, and Nieves Morales.

Victor Gómez and Agustín Maravall, Programs TRAMO and SEATS, 1996.

1. INTRODUCTION

In our application, we center on series observed 1, 2, 3, 4, 6, or 12 times a year. The most relevant ones:

MONTHLY and QUARTERLY time series

TIME SERIES ≡ [x₁, x₂, …, x_T]

12-36 ≤ T ≤ 600 observations

(the minimum depends on the frequency of observations and on the type of analysis performed).

Our interest: SHORT-TERM ANALYSIS

SOME EXAMPLES OF PROBLEMS THAT WE SHALL ADDRESS:

[Figure: Monthly time series]

[Figure: Forecast, showing the last 2 years of the series and the forecast horizon]

[Figure: Series with missing observations (marked *)]

[Figure: Interpolated series]

[Figure: Series with outlier contamination]

[Figure: Outlier effects: LS (level shift), TC (transitory change), AO (additive outlier)]

[Figure: Series with intervention variable]

[Figure: Intervention variable effect]

[Figure: Regression variables and special effects. Top: holiday effect; middle: Easter effect; bottom: trading-day (TD) effect]

[Figure: Seasonal factors]

[Figure: Seasonally adjusted series]

[Figure: Trend-cycle]

[Figure: Forecast of seasonal factors]

[Figure: Forecast of trend and original series]

[Figure: Short-term and long-term trends, with the original series]

[Figure: Business cycle]

Standard ("routine") treatment at present typically solves the previous problems using different procedures that often have little to do with each other. For example:

* Forecasting: ARIMA, EWMA

* Interpolation: Chow-Lin, Denton

* Seasonal adjustment: X11 , X11A

* Preadjustment for trading day, Easter effect, holidays, ...:

Regression / prior correction (for example, divide by the number of working days)...

* Trend extraction: Henderson Moving Averages, HP filter…

* Outliers: Some weighted trimming? Robust procedures instead?

* Forecast of a trend: ?
Often: fit an ARIMA model to some trend, and obtain ARIMA forecasts (not recommended)

* SE of x̂ₜᵃ ? (xₜᵃ: SA series)
An important issue (Bach Committee, Moore Committee, …)

and so on...

We shall present a methodology that

* permits dealing with all those issues jointly, within a unified framework.

* This framework provides OPTIMAL ESTIMATORS (OR FORECASTS) with respect to

- well-defined STATISTICAL MODELS,

- a well-defined ESTIMATION CRITERION,

in an EFFICIENT way.

The Model-based approach will facilitate

* interpretation
For example, the model may specify that the sum of the seasonal component over 12 consecutive months is a zero-mean, small-variance, stationary process.
* diagnostics
The joint distribution of the estimators can be derived,
and hence standard tests can be performed.

* inference
For example, we can obtain optimal forecasts of the
rate of growth of the SA series, with the associated SE.

The methodology is based on:

1) Identifying REG-ARIMA models for the observed series.

2) Decomposing the previous model for the series into unobserved components.

3) Obtaining the MMSE estimators of the components (or signals).


These estimators will be:

E (signal | observations)

Before explaining the methodology, it will be helpful to start with a


brief (and informal) review of some applied time series analysis
concepts and tools.

2. BRIEF REVIEW OF APPLIED TIME SERIES ANALYSIS

General Framework:

Stochastic process:

zₜ ~ fₜ(zₜ)

Time Series: [ z1, z2, ... , zT ]

We consider it as a particular (partial) realization of a stochastic process.

Hence, a sample of size 1 for each fₜ.

Need to add more structure.

STATIONARITY AND DIFFERENCING

A strong condition. Although few economic series will satisfy it, simple transformations will render them stationary.

Basic condition:

f(z₁, z₂, ..., z_T) = f(z_{1+k}, z_{2+k}, ..., z_{T+k}) for every k

In particular, for the marginal distribution (T = 1):

fₜ(zₜ) = f(zₜ) for every t, hence

E zₜ = µ

V zₜ = V_z

both:

- are finite
- do not depend on t
In practice, constant variance is achieved through:

- log-level transformation, plus
- outlier correction

Alternatively, one may use

NONLINEAR FILTERS

(ex. : GARCH, Bilinear, Stochastic Volatility models, … )

- Not yet fit for large-scale use.

- For many of these models, the point forecasts or point estimators of the series and components obtained with the linear model remain approximately optimal.

- The decomposition of NL models still poses some problems.

- Monthly and lower-frequency data seldom display markedly nonlinear structures.

We shall not consider them.

Roughly:

LOGS are appropriate when the amplitude of the series' oscillations is approximately proportional to the level.

Note: LOG transformation has some nice features.

- "Scale" free

- Natural interpretation: d log x = dx/x ≅ ∇x/x ;

variations are expressed as fractions ("per one") of the level of the series.

Thus, for example,

ARIMA fit to the logs → σₐ (residuals) = .004

We can say: the series is forecast (1 period ahead) with an error equivalent to roughly 4‰ of the level of the series.

On the negative side, the log transformation may induce BIASES (because geometric means underestimate arithmetic means).

⇒ Annual mean of original series > mean of SA series (and of trend)

(If the biases are large, there are 2 options:
- ad hoc corrections;
- model the levels.)

Concerning STATIONARITY IN MEAN, most economic time
series display a mean (i.e., a “local level”) that cannot be
assumed constant. The two most important reasons:

a) The presence of a trend (or a trend-cycle)

[Figure: Trend]

Obviously, the mean of the series in the first years is not the same
as the one for the last years.

b) The presence of seasonality:

[Figure: Seasonality]

Obviously, the level of the series depends on the period within the
year.

To achieve Constant Mean:

Let:

B: "backward" operator; Bʲ zₜ = zₜ₋ⱼ

s = # obs./year (12 if monthly; 4 if quarterly; …)

We use the operators:

∇ = 1 − B (regular difference)

∇ₛ = 1 − Bˢ (seasonal difference)

Sₛ = 1 + B + ⋯ + Bˢ⁻¹ (sum over a year)

An important identity: ∇ₛ = ∇ Sₛ
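As a quick numerical check of this identity (a minimal sketch in Python/numpy; the operators are represented by their coefficient sequences in B, and polynomial multiplication is a convolution):

```python
import numpy as np

# Polynomials in B, coefficients ordered by increasing power of B.
s = 12
nabla = np.array([1.0, -1.0])                # regular difference: 1 - B
S = np.ones(s)                               # annual sum: 1 + B + ... + B^(s-1)
nabla_s = np.zeros(s + 1)                    # seasonal difference: 1 - B^s
nabla_s[0], nabla_s[-1] = 1.0, -1.0

# np.convolve of coefficient sequences multiplies the polynomials.
product = np.convolve(nabla, S)

print(np.allclose(product, nabla_s))         # True: nabla_s = nabla * S_s
```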

* Assume xₜ is a linear trend

xₜ = a + b t ;

then

∇xₜ = b

or

∇²xₜ = 0 .

In general:

∇ᵈ reduces a polynomial of degree d to a constant

(and cancels a polynomial of degree d − 1).

Also: ∇₁₂ xₜ = 12 b

(hence seasonal differencing also affects the trend)
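A small numerical illustration of these cancellation properties (illustrative values a = 2, b = 3; numpy only):

```python
import numpy as np

a, b = 2.0, 3.0
t = np.arange(48)
x = a + b * t                       # linear trend

d1 = np.diff(x)                     # regular difference, nabla x_t
d2 = np.diff(x, n=2)                # nabla^2 x_t
d12 = x[12:] - x[:-12]              # seasonal difference, nabla_12 x_t

print(np.allclose(d1, b))           # True: nabla x_t = b
print(np.allclose(d2, 0.0))         # True: nabla^2 x_t = 0
print(np.allclose(d12, 12 * b))     # True: nabla_12 x_t = 12 b
```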

Note: Cosine function

The basic element in cyclical and seasonal movements is the (deterministic) function

xₜ = mᵗ cos(ωt + ω₀) ,  t = 0, 1, …  (A)

[Figure: Cosine function with m = 1, showing the phase ω₀ and the period τ]

m = modulus (maximum value) ;

ω = frequency (# of cycles / unit of time), measured in radians, = 2π / τ ;

τ = period (# of units of time needed to complete a full cycle) ;

ω₀ = phase (angle at t = 0)

* Using the expression for cos(a + b), (A) above can also be rewritten as

xₜ = mᵗ (C cos ωt + D sin ωt) .

* Recall:

(A) is the solution of a 2nd-order difference equation (with real coefficients)

xₜ + φ₁ xₜ₋₁ + φ₂ xₜ₋₂ = 0 ,

when the roots are complex

⇒ always, pairs of complex conjugates.

(see below)

Consider a monthly series.

* Assume xₜ is a sine wave, for example

xₜ = cos((π/6) t) , then ∇₁₂ xₜ = 0 ,

given that cos((π/6)(t − 12)) = cos((π/6)t − 2π) = cos((π/6)t)

[Figure: cos((π/6)t) over one year, ω = π/6]

The same holds when

xₜ = cos((π/2) t)

[Figure: cos((π/2)t) over one year, ω = π/2]

So: what is the complete solution of

∇₁₂ xₜ = 0 ?

i.e., the most general F(t) that is cancelled by ∇₁₂ .

∇₁₂ xₜ = xₜ − xₜ₋₁₂ = 0 : a linear difference equation (homogeneous).

Note: Homogeneous linear difference equations (with real coefficients).

Let the equation be

xₜ + φ₁ xₜ₋₁ + ... + φₚ xₜ₋ₚ = 0 ,  t = 1, 2, …

Replacing xₜ by rᵗ, the characteristic equation is obtained:

rᵖ + φ₁ rᵖ⁻¹ + ... + φₚ₋₁ r + φₚ = 0 .

This equation has p roots:

some real,
some complex (always in pairs of complex conjugates:

r₁ = a + b i
r₂ = a − b i ).

Solution to the difference equation:

xₜ = Σⱼ₌₁ᵖ cⱼ rⱼᵗ ,

where

rⱼ : a root of the characteristic equation;
cⱼ : arbitrary constants.

The final form of the solution for each type of root is the following (the c's are always constants, to be determined from the starting conditions).

1) Single real root

xₜ = cⱼ rⱼᵗ .

Notice: if |rⱼ| > 1 ⇒ the root is explosive.

We shall restrict attention to roots with

|rⱼ| ≤ 1 .

2) Multiple real roots

order of multiplicity: k + 1 ;

xₜ = (c₀ + c₁ t + ... + cₖ tᵏ) rᵗ ,

where r₁ = ... = rₖ₊₁ = r .

3) Single complex root

Always as a pair of complex conjugates.

Let

r₁ = a + b i
r₂ = a − b i

be the pair, and let

[Figure: The complex conjugate pair in the complex plane: modulus r, angle ω]

r = (a² + b²)^(1/2) ,

ω = arccos(a / r) ,

then

xₜ = c₀ rᵗ cos(ωt + c₁) .

(c₀, c₁ : constants)

4) Multiple complex roots

Very rarely encountered.
xₜ = a mixture of the previous solutions.

Notice that, in all cases, if

|r| > 1 ,

the solution has a systematically explosive behavior, which is not found in actual economic series.

Thus, we shall always assume

|r| ≤ 1

(This assumption is in reality an identification condition.)

In summary:

SOLUTION OF DIFFERENCE EQUATIONS:

xₜ = sum of

* damped exponentials in time ( → 0 )
* polynomials in time (deterministic trends)
* cosine functions (seasonal and other cycles)

We shall come back often to this result!
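Numerically, the roots of the characteristic equation (and hence the building blocks of the solution) can be obtained with a standard root-finder. A minimal sketch, with illustrative coefficients φ₁ = 0.5, φ₂ = 0.8 (a complex pair, so the solution is a damped cosine):

```python
import numpy as np

# Characteristic equation r^p + phi_1 r^(p-1) + ... + phi_p = 0
phi = [0.5, 0.8]                      # illustrative coefficients
roots = np.roots([1.0] + phi)         # np.roots expects powers high-to-low

for r in roots:
    kind = "real" if abs(r.imag) < 1e-12 else "complex"
    # modulus < 1: damped term; = 1: persistent; > 1: explosive
    print(f"{kind} root: modulus {abs(r):.3f}, frequency {np.angle(r):.3f} rad")
```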

Remark

Using the backward operator B, the difference equation can be written as

φ(B) xₜ = 0 ,

where

φ(B) = 1 + φ₁ B + ... + φₚ Bᵖ .

Comparing this last expression with the characteristic equation:

roots of [φ(B) = 0] = 1 / rᵢ .

Thus, in terms of the roots of φ(B) = 0, the condition

|r| ≤ 1

becomes

|b| ≥ 1 ,

where b is the corresponding root of φ(B) = 0 .

Back to the ∇₁₂ example.

Thus, for xₜ − xₜ₋₁₂ = 0, the characteristic equation is r¹² − 1 = 0 ;

or r = (1)^(1/12) ,

i.e., the twelve roots of the unit circle.

[Figure: The twelve roots of unity on the unit circle; C1, C2 mark a complex conjugate pair at ω = π/6]

All 12 roots have unit modulus.

Roots are:

* 2 real roots: r₁ = 1 , r₂ = −1

* 10 complex roots, in pairs of complex conjugates.

Each complex conjugate pair is associated with a frequency:

ω = π/6
ω = 2π/6
ω = 3π/6
ω = 4π/6
ω = 5π/6

Each pair will generate a solution of the type

A cos(ωt + B)
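These twelve roots and their frequencies are easy to verify numerically (a small numpy sketch; conjugate pairs share the same frequency ±ω):

```python
import numpy as np

# Roots of r^12 - 1 = 0 (coefficients ordered from r^12 down to r^0)
coeffs = np.zeros(13)
coeffs[0], coeffs[-1] = 1.0, -1.0
roots = np.roots(coeffs)

for r in sorted(roots, key=lambda z: np.angle(z)):
    omega = abs(np.angle(r))          # frequency in [0, pi]
    j = omega / (np.pi / 6)           # cycles per year, for monthly data
    print(f"modulus {abs(r):.3f}, omega = {omega:.3f} rad "
          f"({j:.0f} cycles/year)")
```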

Consider the frequency ω = π/6. How many periods are needed to complete a full circle?

⇒ 12 periods.
Hence this frequency implies 1 circle (or cycle) per year.

For frequency ω = 2π/6 = π/3, 6 periods are needed to complete the circle, hence ω = π/3 implies that 2 circles per year are completed. For frequency ω = 3π/6 = π/2, 4 periods are needed to complete the circle, hence ω = π/2 ⇒ 3 circles completed per year, and so on.

Finally, for ω = 6π/6 = π, the root is real and equal to r = −1. For this frequency, a full circle is completed in two periods. Hence, for monthly data,

r = −1 (ω = π) ⇒ 6 circles per year

Notice that the root r = −1 implies that the factor (1 + B) appears in the factorization of the AR polynomial. (Such is the case when this polynomial is S, or ∇ₛ.)

In short:

Xₜ = C + Σⱼ₌₁⁶ Aⱼ cos((π/6) j t + Bⱼ)

C: constant, associated with the zero-frequency root B = 1 (i.e., with the factor (1 − B))

(π/6) j : seasonal frequencies.

j = 1 : once a year ("fundamental" frequency)
j = 2 : twice a year
...
j = 6 : six times a year
(j = 2, …, 6 : "harmonics")

[Figure: Cosine functions at the six seasonal frequencies ω = π/6, π/3, π/2, 2π/3, 5π/6, π, over one year of monthly data]

Another way to look at it: ∇₁₂ can be factorized as

AR factor (frequency):

1 − B¹² = (1 − √3 B + B²) ×  (once a year)
(1 − B + B²) ×  (twice a year)
(1 + B²) ×  (3 times a year)
(1 + B + B²) ×  (4 times a year)
(1 + √3 B + B²) ×  (5 times a year)
(1 + B) ×  (6 times a year)
(1 − B)  (associated with the trend)

= (1 − B) S

1 − B contains a trend root;

S contains the seasonal roots (one real and 5 pairs of complex conjugates).

All of them are "unit roots" (unit in modulus).
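The factorization can be checked by multiplying the factors back together (a numpy sketch; coefficients are ordered by increasing power of B, so np.convolve performs the polynomial product):

```python
import numpy as np

r3 = np.sqrt(3.0)
factors = [
    [1.0, -1.0],         # 1 - B                 (trend root, omega = 0)
    [1.0, -r3, 1.0],     # 1 - sqrt(3) B + B^2   (once a year)
    [1.0, -1.0, 1.0],    # 1 - B + B^2           (twice a year)
    [1.0,  0.0, 1.0],    # 1 + B^2               (3 times a year)
    [1.0,  1.0, 1.0],    # 1 + B + B^2           (4 times a year)
    [1.0,  r3, 1.0],     # 1 + sqrt(3) B + B^2   (5 times a year)
    [1.0,  1.0],         # 1 + B                 (6 times a year)
]
prod = np.array([1.0])
for f in factors:
    prod = np.convolve(prod, f)

target = np.zeros(13)                 # 1 - B^12
target[0], target[-1] = 1.0, -1.0
print(np.allclose(prod, target))      # True
```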

Hence, for example,

∇ ∇₁₂ xₜ = 0 ,  (1)

since ∇ ∇₁₂ = ∇² S ,

will cancel

(A) the polynomial pₜ = a + b t (∇² pₜ = 0)

and

(B) the seasonal cycles for the 1, 2, ..., 6 times-a-year frequencies (S sₜ = 0),

sₜ = Σⱼ Aⱼ cos(ωⱼ t + Bⱼ) ,

ωⱼ = (2π/12) j ,  j = 1, 2, …, 6.

Notice that, when specifying stochastic ARIMA models, we will not say that ∇∇₁₂xₜ is exactly 0, but instead that

∇ ∇₁₂ xₜ = zₜ

where zₜ is a zero-mean, finite-variance stationary stochastic process.

That is, ∇∇₁₂xₜ will on average be zero and will not depart too much from it.

Thus, every period, the functions (A) and (B) will be "perturbed" by a stochastic input, so that

a, b → a(t), b(t)

C → C(t)

Aⱼ → Aⱼ(t)

Bⱼ → Bⱼ(t)

and so on.

Hence the "MOVING" and ADAPTIVE features of the components.

DISTRIBUTION OF THE STATIONARY SERIES

So, let in general

δ(B) = ∇ᵈ ∇ₛᴰ

represent all the differences applied to reach stationarity, i.e.,

zₜ = δ(B) xₜ is stationary.

Implication: [z₁, ..., zₜ] will have a well-defined proper joint distribution.

Further, we assume: JOINT NORMALITY.

Hence, we consider: LINEAR STATIONARY STOCHASTIC PROCESSES.

The time series generated by them will be jointly normally distributed.

In the multivariate normal distribution, conditional expectations are linear functions of the observed series. For example:

Expectation of a future value:

E(z_{t+j} | z₁, ..., zₜ) : forecast (j > 0)

Expectation of a missing value:

E(zₜ | z₁ ... zₜ₋₁, zₜ₊₁ ... z_T) : interpolator (zₜ missing)

Expectation of a signal sₜ buried in zₜ (zₜ = sₜ + noise):

E(sₜ | z₁, …, z_T) : signal extraction

All are linear in the observations (i.e., linear filters).

Stationarity ⇒

E zₜ = µ

V zₜ = V_z

Cov(zₜ, zₜ₋ₖ) = γₖ

γₖ : depends only on |k|, the relative distance between observations (it does not depend on t).

Thus,

(z₁, …, z_T) ~ N_T(µ, Σ)

    ⎡ γ_z  γ₁   γ₂   ...  γ_{T−1} ⎤
    ⎢      γ_z  γ₁   ...          ⎥
Σ = ⎢           ...  ...          ⎥
    ⎢                γ_z  γ₁      ⎥
    ⎣ (sym.)              γ_z     ⎦

(the elements in each diagonal are the same)

A more parsimonious representation:

AUTOCOVARIANCE FUNCTION:

ACovF = γₖ as a function of k

Let F = B⁻¹ (i.e., F zₜ = zₜ₊₁) ;

F ≡ "forward" operator

AUTOCOVARIANCE GENERATING FUNCTION:

ACovGF = γ(B, F)

γ(B, F) = γ₀ + γ₁(B + F) + γ₂(B² + F²) + ⋯ = γ₀ + Σⱼ₌₁ γⱼ(Bʲ + Fʲ)

A better (scale-free) measure: autocorrelation.

ρₖ = γₖ / γ₀ = lag-k autocorrelation

AUTOCORRELATION FUNCTION

ACF ≡ ρₖ as a function of k
(since symmetric, only needed for k > 0)

AUTOCORRELATION GENERATING FUNCTION

ACGF ≡ ρ(B, F) = 1 + Σⱼ₌₁ ρⱼ(Bʲ + Fʲ)

[Figure: An ACF]

Results:

If zₜ is stationary, then

1) ρ₀ = 1,

2) ρⱼ = ρ₋ⱼ (symmetric),

3) |ρₖ| < 1, k ≠ 0,

4) ρₖ → 0 as k → ∞,

5) Σₖ₌₀ |ρₖ| < ∞ (convergence condition).

Note: JOINT NORMALITY implies that

µ , γ₀ , ρ(B, F)

fully characterize the joint distribution function of [z₁, ..., z_T]; they contain all the "sample" information.

ρₖ = lag-k autocorrelation is a measure of the (linear) dependence between observations k periods apart.

Wold Representation

A linear (≡ jointly normal) stationary purely stochastic process can be expressed as:

zₜ = aₜ + ψ₁aₜ₋₁ + ψ₂aₜ₋₂ + ... = Σⱼ₌₀ ψⱼ aₜ₋ⱼ ,  (ψ₀ = 1),

where aₜ is:

- niid(0, Vₐ) ≡ a white-noise variable
- the "residuals"
- the "innovations" = 1-period-ahead forecast error (known parameters):

aₜ = zₜ − ẑₜ|ₜ₋₁

(ẑₜ|ₜ₋ⱼ : forecast of zₜ made at period t − j)

Hence

zₜ = a linear filter applied to the innovations. Also called the MA representation of zₜ. The filter is one-sided (only past and present innovations) and convergent:

ψⱼ → 0 as j → ∞ ;  Σⱼ₌₀ |ψⱼ| < ∞ .

In short:

zₜ = ψ(B) aₜ ,  ψ(B) = Σⱼ₌₀ ψⱼ Bʲ

Useful result:

Let γ_z(B, F) = ACovGF(zₜ). Then

γ_z(B, F) = ψ(B) ψ(F) Vₐ

In particular, for the variance:

γ₀ = (1 + ψ₁² + ψ₂² + ⋯) Vₐ

ACF: the basic tool in "time domain" analysis of a series.

Another important tool:

Spectrum

The basic tool in "frequency domain" analysis of a series.

Consider the time series:

xₜ = [x₁, x₂, ..., x_T]

We can "exactly explain" the series with a polynomial of degree T − 1:

xₜ = a₀ + a₁ t + ... + a_{T−1} t^{T−1}

(Set, successively, t = 1, 2, …, T, and a linear system of T equations in the T unknowns a₀, a₁, ..., a_{T−1} is obtained.)

In a similar manner, we can represent (exactly) the T observations [xₜ] with sine-cosine functions, as follows.

To simplify, assume T is even, so that T = 2q.

Define the fundamental frequency

ω₁ = 2π / T

(i.e., the frequency of one full circle completed in T periods), and its harmonics:

ωⱼ = (2π / T) j ,  j = 1, 2, …, q.

Then, express xₜ as

xₜ = Σⱼ₌₁^q (aⱼ cos ωⱼt + bⱼ sin ωⱼt)

Letting t = 1, 2, …, T, a linear system of T equations in the T unknowns (aⱼ, bⱼ), j = 1, …, q, is obtained.

For a particular periodic component (with frequency ωⱼ), the "amplitude" (i.e., the height of the peak) is equal to

Aⱼ² = aⱼ² + bⱼ²

Notice: the bigger this amplitude, the larger the contribution of the component in explaining xₜ.
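A sketch of this exact sine-cosine representation in numpy. One detail glossed over in the informal account above: for integer t, sin(πt) = 0, so the j = q sine column is dropped and a constant term takes its place (otherwise the T × T system would be singular):

```python
import numpy as np

rng = np.random.default_rng(0)
T = 24
q = T // 2
t = np.arange(1, T + 1)
x = rng.normal(size=T)                 # any series of length T

cols = [np.ones(T)]                    # constant (replaces the zero sin(pi t) column)
for j in range(1, q):
    w = 2 * np.pi * j / T              # harmonic frequency omega_j
    cols += [np.cos(w * t), np.sin(w * t)]
cols.append(np.cos(np.pi * t))         # j = q: omega = pi
X = np.column_stack(cols)              # T x T and nonsingular

coef = np.linalg.solve(X, x)           # exact representation of the T points
print(np.allclose(X @ coef, x))        # True

a1, b1 = coef[1], coef[2]              # coefficients at the fundamental
print("A_1^2 =", a1**2 + b1**2)        # squared amplitude A_j^2 = a_j^2 + b_j^2
```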

Example:

[Figure: a short example series and its sine-cosine components]

In general, group the cosine functions by intervals of frequency (summing the amplitudes):

HISTOGRAM OF THE DISTRIBUTION BY FREQUENCY

[Figure: Histogram over the frequency range 0 to π]

In the same way that:

density function ≡ population counterpart of the standard histogram,

spectrum ≡ population counterpart of the frequency histogram.

Given that

cos(α) = cos(α + 2π) ,
cos(α) = cos(−α) ,

the spectrum is periodic and symmetric ⇒ enough to consider 0 ≤ ω ≤ π :

∫₀^π g(ω) dω = γ₀ (variance of the series)

(or ∫ from −π to π)

∴ The spectrum can be interpreted as a decomposition of the variance by intervals of frequency.

Standardized, it displays properties similar to those of a density function.

[Figure: Spectrum of a series over 0 to π]

ω = frequency in radians.

Recall:

ω = 2π / τ ,  τ = period

[Figure: A cosine wave of period τ]

Spectrum g(ω): decomposes V_x by frequency.

[Figure: Spectrum of a series; the variance associated with an interval dω is the area of g(ω) over it]

Consider the once-a-year frequency in monthly data:

τ = 12 months

τ = 12 ⇒ ω = 2π/12 = π/6

Hence, if the series has an important seasonal component with that frequency, gₓ(ω) shows a peak for ω = π/6:

[Figure: Spectrum of an AR(2), with a peak at ω = π/6]

For the trend:

A way to think about trends: cycles with period close to ∞ (e.g., cycles with periods of 1,000 years, 10,000 years, …)

τ → ∞ ⇒ ω → 0

[Figure: Spectrum g(ω) with a peak at ω = 0]

If the spectrum of a quarterly series is, for example:

[Figure: Spectrum of a quarterly series, with peaks at ω = 0, π/2, π]

Peaks at:

ω = 0 → trend

ω = π/2 → once a year (seasonal frequency)

ω = π → twice a year (seasonal frequency)

To extract a signal from a series, for example to seasonally adjust a series:

- remove the variation around the seasonal frequencies,
- leave the rest unchanged.

If nₜ = SA series is estimated through

n̂ₜ = c(B, F) xₜ = ... + c₂xₜ₋₂ + c₁xₜ₋₁ + c₀xₜ + c₁xₜ₊₁ + c₂xₜ₊₂ + ...

where

c(B, F) = c₀ + Σⱼ cⱼ(Bʲ + Fʲ)

is a symmetric filter, the Fourier transform of the filter (recall: Bʲ + Fʲ → 2 cos jω) is

c̃(ω) = c₀ + 2 Σⱼ cⱼ cos(jω)

From n̂ₜ = c(B, F) xₜ ,

g_n̂(ω) = [c̃(ω)]² gₓ(ω)

c̃(ω) = gain of the filter

[c̃(ω)]² = squared gain of the filter

Hence:

Spectrum of n̂ₜ = (squared gain of filter) × (spectrum of series)

Squared gain: determines, for each frequency, which proportion of the series variance is passed on to the signal estimator:

= 1 : all the variation is passed
= 0 : the frequency is ignored

[Figure: Squared gain of an SA filter, with dips at the seasonal frequencies]
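As an illustration, here is the gain formula applied to a classical symmetric trend filter, the centered 2×12 moving average for monthly data (an illustrative choice, not the filter of the figure): its squared gain is 1 at ω = 0 and 0 at every seasonal frequency.

```python
import numpy as np

# Centered 2x12 moving average: c_0 = c_1 = ... = c_5 = 1/12, c_6 = 1/24
c = np.array([1/12] * 6 + [1/24])      # c_0, c_1, ..., c_6

def gain(omega, c):
    """c~(omega) = c_0 + 2 * sum_j c_j cos(j omega), for a symmetric filter."""
    j = np.arange(1, len(c))
    return c[0] + 2.0 * np.sum(c[1:] * np.cos(j * omega))

for k in range(7):                     # omega = k pi/6, k = 0, ..., 6
    w = k * np.pi / 6
    print(f"omega = {k}pi/6: squared gain = {gain(w, c)**2:.4f}")
# Prints 1.0000 at omega = 0 (trend passed through) and 0.0000 at the
# seasonal frequencies pi/6, ..., pi (seasonal variation removed).
```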

SPECTRUM OF A LINEAR PROCESS (AND OF AN ARIMA MODEL)

In general, from the ACovGF, the spectrum is easily obtained as its Fourier transform:

γ(B, F) = γ₀ + Σⱼ γⱼ(Bʲ + Fʲ)

F.T.: B → e^(−iω), so that Bʲ + Fʲ ⇒ 2 cos jω ,  0 ≤ ω ≤ 2π

g(ω) = (1 / 2π) [γ₀ + 2 Σⱼ γⱼ cos jω]

For notational simplicity, we shall work with

g(ω) = 2π g(ω) ,

and avoid the factor 2π.

Ex. 1: ACF and spectrum of an MA(2)

xₜ = (1 + θ₁B + θ₂B²) aₜ  [Vₐ = 1]

ACovGF = θ(B) θ(F) :

(1 + θ₁B + θ₂B²)(1 + θ₁F + θ₂F²)

= 1 + θ₁² + θ₂²  (coeff. of B⁰ : γ₀)

+ θ₁(1 + θ₂) [B + F]  (coeff. of B and F : γ₁)

+ θ₂ [B² + F²]  (coeff. of B² and F² : γ₂)

In short, using Bᵏ + Fᵏ → 2 cos kω :

g(ω) = [γ₀ + 2γ₁ cos ω + 2γ₂ cos 2ω] Vₐ

(γ₀, 2γ₁, 2γ₂ : coefficients of the "harmonic" expressions)
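A quick cross-check of these formulas against a simulated realization (a minimal numpy sketch, with illustrative parameters θ₁ = 0.4, θ₂ = 0.2):

```python
import numpy as np

theta1, theta2, Va = 0.4, 0.2, 1.0     # illustrative MA(2) parameters

# Autocovariances from the worked expressions above:
g0 = (1 + theta1**2 + theta2**2) * Va
g1 = theta1 * (1 + theta2) * Va
g2 = theta2 * Va

def spectrum(omega):
    """g(omega) = g0 + 2 g1 cos(omega) + 2 g2 cos(2 omega) (2*pi factor dropped)."""
    return g0 + 2 * g1 * np.cos(omega) + 2 * g2 * np.cos(2 * omega)

# Simulate a long MA(2) realization and compare sample moments:
rng = np.random.default_rng(1)
a = rng.normal(size=200_000)
x = a[2:] + theta1 * a[1:-1] + theta2 * a[:-2]
print(g0, np.var(x))                   # theoretical vs. sample variance
print(g1, np.mean(x[1:] * x[:-1]))     # theoretical vs. sample lag-1 autocovariance
```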

Ex. 2:

zₜ ~ white noise ⇒ g_z(ω) = constant

[Figure: Flat spectrum and ACF of white noise]

Ex. 3: AR(1)

zₜ + φzₜ₋₁ = aₜ ,  |φ| < 1

(1 + φB) zₜ = aₜ ,

zₜ = (1 / (1 + φB)) aₜ

zₜ = ψ(B) aₜ → ψ(B) = 1 / (1 + φB)  (Wold representation)

Therefore, the ACovGF is ψ(B) ψ(F) Vₐ :

γ_z(B, F) = (1 / ((1 + φB)(1 + φF))) Vₐ = (1 / (1 + φ² + φ(B + F))) Vₐ

Hence the spectrum is (B + F → 2 cos ω):

g_z(ω) = Vₐ / (1 + φ² + 2φ cos ω)

[Figure: φ < 0, model (1 − .8B)xₜ = aₜ: spectrum peaking at ω = 0; ACF positive and decaying]

[Figure: φ > 0, model (1 + .8B)xₜ = aₜ: spectrum peaking at ω = π; ACF alternating in sign]

Ex. 4: PSEUDO-SPECTRUM

As φ → −1 the model approaches NONSTATIONARITY. In the limit:

(1 − B) zₜ = aₜ ≡ RANDOM WALK

∇zₜ = aₜ (Vₐ = 1)
zₜ = aₜ + aₜ₋₁ + aₜ₋₂ + aₜ₋₃ + …

As t → ∞:
* the mean is not defined ("0 · ∞");
* the variance goes to ∞.

As with stationary models, write

ψ(B) = 1 / (1 − B)

"Pseudo" ACovGF = (1 / (1 − B)) (1 / (1 − F)) Vₐ  (does not converge)

"Pseudo-spectrum" = F.T. of the pseudo-ACovGF

F.T. ≡ g_z(ω) = 1 / (2(1 − cos ω))

ω = 0 ⇒ g → ∞

∫ g does not converge (the variance goes to ∞)

[Figure: (Pseudo) spectrum of a random walk, diverging at ω = 0]

The p-spectrum is:

- informative
- well-behaved

in a very basic way.

For example, consider the two trend spectra:

[Figure: Two trend spectra, one with a wide peak at ω = 0 and one with a narrow peak]

and two associated realizations:

[Figure: Two trend realizations]

The trend that corresponds to the wider spectral peak contains more stochastic variability (i.e., is of a more "moving" nature). The narrow peak generates a more stable trend.

Similarly, from the two spectra:

[Figure: Two seasonal component spectra, with wider and narrower peaks at the seasonal frequencies]

the following two seasonal components are generated:

[Figure: Two seasonal component realizations]

As was the case with the trend, the narrow spectral peaks produce stable seasonal components. The wider peaks produce components that change faster (more "moving" components).

In what follows we shall also refer to the "p-spectrum" simply as the "spectrum".

Ex. 5: AR(2)

(1 + φ₁B + φ₂B²) zₜ = aₜ

The roots of the characteristic equation r² + φ₁r + φ₂ = 0 are either:

- 2 real roots (each ~ AR(1)), or
- a pair of complex conjugate roots → zₜ = rᵗ cos(ωt) + ⋯

with

* modulus r = √φ₂

* frequency ω = arccos(−φ₁ / 2r) (in radians);

period τ = 2π / ω

In this case, the spectrum shows a peak for ω:

[Figure: Spectrum of an AR(2) with complex roots, peaking near ω]
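A small numpy check of these relations (illustrative coefficients φ₁ = −1, φ₂ = 0.8, for which φ₁² < 4φ₂ so the roots are complex; the spectral peak lies near, though not exactly at, ω):

```python
import numpy as np

phi1, phi2 = -1.0, 0.8                     # illustrative AR(2) coefficients
roots = np.roots([1.0, phi1, phi2])        # r^2 + phi1 r + phi2 = 0

r = abs(roots[0])                          # modulus of the complex pair
omega = abs(np.angle(roots[0]))            # frequency of the pair
tau = 2 * np.pi / omega                    # period

print(r, np.sqrt(phi2))                    # equal: r = sqrt(phi2)
print(omega, np.arccos(-phi1 / (2 * r)))   # equal: omega = arccos(-phi1/(2r))
print("cycle of period approx.", tau, "observations")
```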

General hint: a useful way to look at an AR(p):

Factorize it as:

real roots × complex roots ,

each real root: an AR(1); each complex pair: an AR(2).

[Figure: Spectrum of AR(1) factors (real roots) and of AR(2) factors (complex roots, with a peak at ω)]

Ex. 6: SEASONAL SERIES

If there is seasonal nonstationarity, unit roots show up as ∞ in the spectrum at the seasonal frequencies.

For example, consider a quarterly time series xₜ such that

∇₄xₜ = zₜ ,  zₜ : a stationary process.

Then,

xₜ = (1/∇₄) zₜ = (1/(∇S)) zₜ = (1/∇)(1/(1 + B + B² + B³)) zₜ = (1/∇)(1/((1 + B)(1 + B²))) zₜ

∇ : root associated with ω = 0

1 + B² : roots associated with ω = π/2

1 + B : root associated with ω = π

p-ACGF(x) = (1/((1 − B)(1 − F))) (1/((1 + B)(1 + F))) (1/((1 + B²)(1 + F²))) γ_z(B, F) .

Operating, and using Bʲ + Fʲ = 2 cos jω, the p-spectrum is the product of

1 / (2(1 − cos ω)) : → ∞ for ω = 0

1 / (2(1 + cos ω)) : → ∞ for ω = π (seasonal freq.)

1 / (2(1 + cos 2ω)) : → ∞ for ω = π/2 (seasonal freq.)

and γ_z(ω), which is bounded.

Unit AR roots dominate the spectrum of the series.

Thus, a "standard" series with trend and seasonality (both nonstationary) will display a spectrum of the type:

[Figure: Spectrum of a quarterly series, with peaks at ω = 0, π/2, π]

[Figure: Spectrum of a monthly series, with peaks at ω = 0, π/6, π/3, π/2, 2π/3, 5π/6, π]

ARIMA models

Back to the Wold general representation of a purely stochastic stationary series:

zₜ = ψ(B) aₜ

Problem: in general, ψ(B) is of degree ∞.

Thus we use a rational approximation:

ψ(B) = θ(B) / φ(B)

θ(B) = finite degree q

φ(B) = finite degree p

Therefore,

zₜ = (θ(B) / φ(B)) aₜ ,

or:

φ(B) zₜ = θ(B) aₜ

Autoregressive Moving-Average models: ARMA models

AR(p) polynomial: 1 + φ₁B + … + φₚBᵖ
MA(q) polynomial: 1 + θ₁B + … + θ_qB^q

(1 + φ₁B + ⋯ + φₚBᵖ) zₜ = (1 + θ₁B + ⋯ + θ_qB^q) aₜ

ARMA(p, q) model.

Let zₜ = δ(B) xₜ [δ(B) ≡ stationary transformation].

If δ(B) = ∇ᵈ :

xₜ ~ ARIMA(p, d, q) model

I ≡ integrated (of order d) , xₜ ~ I(d)

Model:

φ(B) δ(B) xₜ = θ(B) aₜ

φ(B) : stationary
θ(B) : invertible
δ(B) : unit roots (nonstationary roots)

Stationarity of ARMA models:

The roots of φ(B) = 0 lie outside the UNIT CIRCLE.

UNIT CIRCLE = circle in the complex plane with radius 1.

Let B₁, …, Bₚ ≡ roots of φ(B) = 0.

[Figure: Unit circle in the complex plane, with four roots of unity on it and one stationary root of φ(B) outside it; ω is the angle (frequency), the modulus the distance from the origin]

Stationarity implies that the moduli of the roots B₁, …, Bₚ of φ(B) = 0 are > 1.

Ex.:

a) AR(1): xₜ + φxₜ₋₁ = aₜ

(1 + φB) xₜ = aₜ

* root of 1 + φB = 0 ⇒ B = −1/φ

* root outside the U.C. when

mod(B) = |−1/φ| > 1 ⇒ |φ| < 1

b) Stationarity region for an AR(2):

xₜ + φ₁xₜ₋₁ + φ₂xₜ₋₂ = aₜ

Conditions for the roots of

1 + φ₁B + φ₂B² = 0

to be > 1 in modulus.

A useful diagram:

[Figure: Stationarity region of the AR(2) in the (φ₁, φ₂) plane: a triangle, with a shaded sub-region of complex roots]

The stationarity region is the region inside the triangle. The shaded area is the region of complex roots (periodic behavior).
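A generic numerical check of stationarity, valid for any AR(p) (a minimal sketch; np.roots expects coefficients from the highest power down):

```python
import numpy as np

def is_stationary(phi):
    """Check that all roots of phi(B) = 1 + phi_1 B + ... + phi_p B^p = 0
    lie outside the unit circle (moduli > 1)."""
    roots = np.roots(list(reversed([1.0] + list(phi))))   # roots in B
    return bool(np.all(np.abs(roots) > 1.0))

print(is_stationary([-0.5]))        # True:  1 - 0.5 B, root B = 2
print(is_stationary([-1.0]))        # False: 1 - B, unit root
print(is_stationary([-1.0, 0.8]))   # True:  complex roots with moduli > 1
```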

If zₜ is stationary, φ(B)⁻¹ converges, and hence we can write

zₜ = [φ(B)⁻¹ θ(B)] aₜ .

- The series accepts a convergent MA representation;
- its ACF converges to zero.

Invertibility

The roots of θ(B) = 0 lie outside the unit circle. Thus θ(B)⁻¹ converges and we can write

[θ(B)⁻¹ φ(B)] zₜ = aₜ

If zₜ is invertible, it accepts a convergent AR representation.

Remark:

- A unit AR root induces nonstationarity:

gₓ(ω associated with the unit root) → ∞

- A unit MA root induces noninvertibility:

gₓ(ω associated with the unit root) = 0

Model: (1 − B) xₜ = (1 + B) aₜ

AR root (1 − B); MA root (1 + B)

[Figure: Spectrum → ∞ at ω = 0, = 0 at ω = π]

Model: (1 + B) xₜ = (1 − B) aₜ

AR root (1 + B); MA root (1 − B)

[Figure: Spectrum = 0 at ω = 0, → ∞ at ω = π]

TWO USEFUL ALTERNATIVE REPRESENTATIONS:

ARMA model:

φ(B) xₜ = θ(B) aₜ

(a) xₜ = (θ(B)/φ(B)) aₜ = (1 + ψ₁B + ψ₂B² + ⋯) aₜ

or

xₜ = ψ(B) aₜ

↑ "psi"-weights (ψ-weights)

* If xₜ is stationary:

ψ-weights → 0 ;

the ARMA model can be approximated by a finite MA.
(b) Alternatively,

(φ(B)/θ(B)) xₜ = aₜ , or

(1 + π₁B + π₂B² + ⋯) xₜ = aₜ ,

and in compact form,

π(B) xₜ = aₜ

↑ "pi"-weights (π-weights)

* If xₜ is invertible,

π-weights → 0 ;

the ARMA model can be approximated by a finite AR.
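The ψ-weights can be computed by matching powers of B in φ(B)ψ(B) = θ(B); the π-weights follow from the same recursion with the roles of φ and θ exchanged. A minimal sketch:

```python
import numpy as np

def psi_weights(phi, theta, n):
    """First n psi-weights of phi(B) x_t = theta(B) a_t, with
    phi = [1, phi_1, ..., phi_p] and theta = [1, theta_1, ..., theta_q].
    From phi(B) psi(B) = theta(B), matching the coefficient of B^j:
    psi_j = theta_j - sum_i phi_i psi_{j-i}."""
    psi = np.zeros(n)
    psi[0] = 1.0
    for j in range(1, n):
        psi[j] = theta[j] if j < len(theta) else 0.0
        for i in range(1, min(j, len(phi) - 1) + 1):
            psi[j] -= phi[i] * psi[j - i]
    return psi

# Example: (1 - 0.5 B) x_t = (1 + 0.3 B) a_t
print(psi_weights(phi=[1.0, -0.5], theta=[1.0, 0.3], n=6))
# [1, 0.8, 0.4, 0.2, 0.1, 0.05]: psi_1 = 0.8, then psi_j = 0.5 psi_{j-1} -> 0
```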

For seasonal data, the general multiplicative ARIMA model is often used:

φ(B) Φ(Bˢ) ∇ᵈ ∇ₛᴰ xₜ = θ(B) Θ(Bˢ) aₜ

φ(B) ≡ regular AR polynomial

Φ(Bˢ) ≡ seasonal AR polynomial (in Bˢ)

θ(B) ≡ regular MA polynomial

Θ(Bˢ) ≡ seasonal MA polynomial (in Bˢ)

where φ(B) and Φ(Bˢ) are stationary, and θ(B) and Θ(Bˢ) are invertible.

Two important points:

- PARSIMONY (few parameters);

- short-term use (maximum horizon: 1-2 years)

(this may affect the order of differencing: roughly,

* short-term use favors differencing,
* long-term use does not.)

"Box-Jenkins"-type approach:

IDENTIFICATION of the model
↓
ESTIMATION
↓
DIAGNOSIS → O.K.? If no, back to IDENTIFICATION; if yes:
↓
INFERENCE

IDENTIFICATION OF THE ARIMA MODEL

One has to determine:

a) the degrees of differencing;
b) the orders p and q of the ARMA part.

a) Traditional criterion: "fast enough convergence of the ACF". As we shall see, unit roots are easily detected through estimation (Tiao, Tsay).

b) Main idea: to "match" the ACF of some known ARMA.

Basic traits of the ACF of an ARMA(p, q):

(1 + φ₁B + ⋯ + φₚBᵖ) xₜ = (1 + θ₁B + ⋯ + θ_qB^q) aₜ

ACF: ρₖ ≡ lag-k autocorrelation of xₜ, as a function of k.

It will display

* q starting conditions,

after which the AR difference equation

ρₖ + φ₁ρₖ₋₁ + ⋯ + φₚρₖ₋ₚ = 0

holds. Hence, for k > q, ρₖ is the solution of

φ(B) ρₖ = 0 ,

where B operates on k.

Thus, the "eventual ACF":

ρₖ = Σ [(polynomials in k) + cosine functions]

[Figure: ACF of (1 − .8B)xₜ = aₜ (positive, decaying); ACF of (1 + .8B)xₜ = aₜ (alternating in sign); ACF of an AR(2) with complex roots (damped cosine)]

At present, "identification" uses more efficient procedures (as will be seen in TRAMO).

Notes:

a) Often, more than one model may seem reasonable; hence there is always some room for the analyst's experience or purpose.

b) In practice we do not know the ACF, and the autocorrelations have to be estimated.

Estimation can induce large (spurious) covariances that have a distorting effect on the sample ACF, which may fail to damp out according to expectations.

[Figure: Theoretical ACF vs. sample ACF of the same process; the sample ACF fails to damp out as fast]

PARAMETER ESTIMATION (more later)

Rough intuition: [x₁ ⋯ x_T]

Assume:

xₜ = aₜ + θaₜ₋₁  MA(1), |θ| < 1

or

aₜ = xₜ − θaₜ₋₁

Conditional on a₀: set θ = θ⁰ and compute sequentially:

a₁⁰ = x₁ − θ⁰a₀

a₂⁰ = x₂ − θ⁰a₁⁰

...

a_T⁰ = x_T − θ⁰a_{T−1}⁰

SS⁰ = Σₜ₌₁ᵀ (aₜ⁰)²

Varying θ:

[Figure: SS as a function of θ over (−1, 1), minimized at θ̂]

θ̂ = arg min SS

Notice that

aₜ = xₜ − θaₜ₋₁
aₜ₋₁ = xₜ₋₁ − θaₜ₋₂
aₜ₋₂ = xₜ₋₂ − θaₜ₋₃
…

yields

a_T = x_T − θx_{T−1} + θ²x_{T−2} − ... + (−θ)^{T−1}x₁ + (−θ)^T a₀

1) a HIGHLY NONLINEAR FUNCTION OF THE PARAMETERS (∴ NL optimization);

2) effect of the starting conditions:

IMPORTANCE OF INVERTIBILITY (|θ| < 1)
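A sketch of this conditional-sum-of-squares idea on simulated data (grid search over the invertibility region for clarity; in practice a nonlinear optimizer is used):

```python
import numpy as np

rng = np.random.default_rng(2)
theta_true = 0.6
a = rng.normal(size=301)
x = a[1:] + theta_true * a[:-1]        # simulated MA(1), T = 300

def css(theta, x, a0=0.0):
    """Conditional sum of squares: residuals recovered recursively
    from a_t = x_t - theta * a_{t-1}, conditional on a_0."""
    a_prev, ss = a0, 0.0
    for xt in x:
        a_t = xt - theta * a_prev
        ss += a_t**2
        a_prev = a_t
    return ss

# Grid search over the invertibility region (-1, 1):
grid = np.linspace(-0.99, 0.99, 199)
theta_hat = grid[np.argmin([css(th, x) for th in grid])]
print(theta_hat)                       # close to 0.6
```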

DIAGNOSTICS (more later)

* in-sample
* out-of-sample

Mostly: residual-based diagnostics:

âₜ ~ niid(0, Vₐ)

INFERENCE (more later)

Example: forecasting (parameters known).

Notation: x̂ₜ₊ₖ|ₜ ≡ forecast of xₜ₊ₖ made in period t.

xₜ = x̂ₜ|ₜ₋₁ + aₜ

aₜ ≡ 1-period-ahead forecast error

= xₜ − x̂ₜ|ₜ₋₁ (≡ innovation)

These aₜ's are the ones of the Wold representation, and of the ARIMA model.

Forecast:

x̂ₜ₊ₖ|ₜ = MMSEₜ(xₜ₊ₖ) = (under our assumptions) = E(xₜ₊ₖ | x₁ ⋯ xₜ)

Computation with the Kalman filter (later).

Forecast function:

x̂ₜ₊ₖ|ₜ as a function of k.

For an ARMA(p, q), the forecast function displays

* q starting conditions,

after which the AR difference equation

x̂ₜ₊ₖ|ₜ + φ₁x̂ₜ₊ₖ₋₁|ₜ + ⋯ + φₚx̂ₜ₊ₖ₋ₚ|ₜ = 0

holds. Hence x̂ₜ₊ₖ|ₜ is the solution of

φ(B) x̂ₜ₊ₖ|ₜ = 0 ,

where B operates on k.

Note: the eventual forecast function and the ACF are solutions of the same AR finite difference equation.

(By looking at the correlation between present and past, we know the correlation between present and future...)

A USEFUL WAY TO LOOK AT THE FORECAST:

Use the ψ-weights:

xₜ₊ₖ = aₜ₊ₖ + ψ₁aₜ₊ₖ₋₁ + ⋯ + ψₖ₋₁aₜ₊₁ + ψₖaₜ + ψₖ₊₁aₜ₋₁ + ⋯ ;

since

Eₜaₜ₊ₖ = 0 (k > 0): future forecast errors are unknown,

Eₜaₜ₊ₖ = aₜ₊ₖ (k ≤ 0): past and present forecast errors are known,

x̂ₜ₊ₖ|ₜ = Eₜxₜ₊ₖ = ψₖaₜ + ψₖ₊₁aₜ₋₁ + ... ;

a linear combination of past and present innovations.

Thus, the FORECAST ERROR is

eₜ₊ₖ|ₜ = xₜ₊ₖ − x̂ₜ₊ₖ|ₜ = aₜ₊ₖ + ψ₁aₜ₊ₖ₋₁ + ⋯ + ψₖ₋₁aₜ₊₁ :

an MA(k − 1) of "future" innovations.

From this, distributions are easily derived.

Example: the simplest one, k = 1:

eₜ₊₁|ₜ ~ N(0, Vₐ) ,

or, for a vector of 1-period-ahead forecast errors,

eₜ₊₁|ₜ ~ N(0, Vₐ I) …
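From the MA(k − 1) expression, the k-period-ahead forecast error variance is Vₐ(1 + ψ₁² + ⋯ + ψₖ₋₁²). A small sketch (illustrative ψ-weights of an AR(1) with coefficient 0.5):

```python
import numpy as np

def forecast_se(psi, Va, k):
    """SE of the k-step-ahead forecast error:
    e_{t+k|t} = a_{t+k} + psi_1 a_{t+k-1} + ... + psi_{k-1} a_{t+1},
    so Var = Va * (1 + psi_1^2 + ... + psi_{k-1}^2)."""
    psi = np.asarray(psi)              # psi[0] = 1 by convention
    return np.sqrt(Va * np.sum(psi[:k]**2))

psi = 0.5 ** np.arange(20)             # psi_j = 0.5^j for this AR(1)
for k in (1, 2, 4, 12):
    print(k, forecast_se(psi, Va=1.0, k=k))
# k = 1 gives sqrt(Va); the SE grows with the horizon and, for a
# stationary series, converges to the standard deviation of the series.
```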
