
Business Forecasting

ECON2209
Slides 03

Lecturer: Minxian Yang

BF-03

my, School of Economics, UNSW

Ch.4 Statistical Graphics

Lecture Plan
Graphs of data: merits and limitations
Examples: use graphs to show data features
How time series differ from a random sample
Components in time series
Classical decomposition


Statistical Graphics
Example: Anscombe's quartet
obs    X1      Y1    X2      Y2    X3      Y3    X4      Y4
1      10    8.04    10    9.14    10    7.46     8    6.58
2       8    6.95     8    8.14     8    6.77     8    5.76
3      13    7.58    13    8.74    13   12.74     8    7.71
4       9    8.81     9    8.77     9    7.11     8    8.84
5      11    8.33    11    9.26    11    7.81     8    8.47
6      14    9.96    14    8.10    14    8.84     8    7.04
7       6    7.24     6    6.13     6    6.08     8    5.25
8       4    4.26     4    3.10     4    5.39    19   12.50
9      12   10.84    12    9.13    12    8.15     8    5.56
10      7    4.82     7    7.26     7    6.42     8    7.91
11      5    5.68     5    4.74     5    5.73     8    6.89

Why identical regression line?

$$ y = \underset{(1.12)}{3.00} + \underset{(0.12)}{0.50}\,x, \qquad R^2 = 0.67 $$

(standard errors in parentheses)
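The claim that all four data sets share one fitted line can be checked directly. The sketch below is not part of the original slides: it uses plain Python with a hand-written simple-regression helper (`ols` is a name chosen here for illustration) and the data from the table above.

```python
# Anscombe's quartet, copied from the table in these slides.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
quartet = {
    "1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
          [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]),
}


def ols(x, y):
    """Return (intercept, slope, R-squared) for the regression of y on a constant and x."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    syy = sum((yi - ybar) ** 2 for yi in y)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = ybar - slope * xbar
    r_squared = sxy ** 2 / (sxx * syy)
    return intercept, slope, r_squared


results = {name: ols(x, y) for name, (x, y) in quartet.items()}
for name, (b0, b1, r2) in results.items():
    print(f"data set {name}: y = {b0:.2f} + {b1:.2f} x, R2 = {r2:.2f}")
```

The four fits agree to two decimals because the data sets share (almost) the same means, variances and covariance, which is all that simple regression uses. The differences between them only show up in a plot.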


Example
Scatter plots explain it vividly.
[Figure: four scatter plots, Y1 vs. X1, Y2 vs. X2, Y3 vs. X3 and Y4 vs. X4]

Advantages of graphs
Graphs represent data visually and help us to see
data features/patterns.
A graph is worth a thousand words.

Graphs make anomalies/outliers apparent.


Graphs are effective in comparing data sets.
But it is hard to visualise high-dimensional data.


Scatter plots
Useful to reveal the relationship between two
variables.
eg. xyz.dat
ls y c z
======================================================
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C             10.04891      0.272825      36.83280     0.0000
Z             -0.364783     0.242923      -1.501641    0.1400

[Figure: scatter plots of Y vs. X and Y vs. Z]

Weak relation? It could be
y = b0 + b1 z + b2 x + u

Scatter plots
eg. xyz.dat. Partial relationship
of y and z (after controlling for x):
regress y on c x to get resid01;
regress z on c x to get resid02;
scatter plot resid01 against resid02.
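The three steps above can be sketched in Python. This is an illustration only: the data are simulated (xyz.dat is not reproduced here), and the helper names `mean` and `residuals` are chosen for this example. It shows that the slope of resid01 on resid02 equals the coefficient on z in the regression of y on a constant, x and z.

```python
import random

random.seed(0)  # simulated data, NOT the xyz.dat values
n = 48
x = [random.gauss(0, 1) for _ in range(n)]
z = [0.6 * xi + random.gauss(0, 1) for xi in x]  # z is correlated with x
y = [10 + 1.0 * xi - 0.6 * zi + random.gauss(0, 0.5) for xi, zi in zip(x, z)]


def mean(v):
    return sum(v) / len(v)


def residuals(dep, reg):
    """Residuals from regressing dep on a constant and reg."""
    dbar, rbar = mean(dep), mean(reg)
    slope = (sum((r - rbar) * (d - dbar) for r, d in zip(reg, dep))
             / sum((r - rbar) ** 2 for r in reg))
    return [(d - dbar) - slope * (r - rbar) for r, d in zip(reg, dep)]


# Steps 1 and 2: remove the influence of x from y and from z.
resid01 = residuals(y, x)
resid02 = residuals(z, x)

# Step 3: slope of resid01 on resid02 (no constant needed, both have mean zero).
partial_slope = (sum(a * b for a, b in zip(resid01, resid02))
                 / sum(b * b for b in resid02))

# Coefficient on z in the regression of y on a constant, x and z,
# from the two-regressor normal equations in deviations from means.
xb, zb, yb = mean(x), mean(z), mean(y)
sxx = sum((a - xb) ** 2 for a in x)
szz = sum((a - zb) ** 2 for a in z)
sxz = sum((a - xb) * (b - zb) for a, b in zip(x, z))
sxy = sum((a - xb) * (b - yb) for a, b in zip(x, y))
szy = sum((a - zb) * (b - yb) for a, b in zip(z, y))
b_z = (sxx * szy - sxz * sxy) / (sxx * szz - sxz ** 2)

print(partial_slope, b_z)  # the two numbers agree up to rounding error
```

The agreement is exact algebra rather than a feature of the simulated data, which is why the residual scatter plot is a faithful picture of the partial relationship.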

EViews:
Type in top panel
ls y c x
On the result window, click
Proc, Make Residual Series,
resid01 (in Name for resid series), OK
Type in top panel
ls z c x
On the result window, click
Proc, Make Residual Series,
resid02 (in Name for resid series), OK

Type in top panel
group grp resid02 resid01
grp.linefit

[Figure: scatter plot of RESID01 vs. RESID02]

Also compare the result of
ls resid01 resid02
against the result of
ls y c x z

Dependent Variable: Y, Sample: 1 48
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C             9.884732      0.190297      51.94359     0.0000
X             1.073140      0.150341      7.138031     0.0000
Z            -0.638011      0.172499     -3.698642     0.0006

Time series plot


eg. liquor.dat, monthly sales, 1967.01-1994.12
trend (increase over time on average)
seasonality (recurring pattern: Dec high, Feb low)
[Figure: time series plot of Liquor Sales]

Time series plot


eg. US 10-year treasury bond yield (monthly, %), 530 obs
persistent (gentle moves with few large jumps)
random trend (ups & downs without a clear pattern)
[Figure: time series plot of 10-year T-bond Yield]

Time series plot


eg. Change of US 10-year treasury bond yield (monthly)
fluctuate about 0 (never move away for long)
volatile (change direction frequently with large spikes)
$$ \Delta y_t = y_t - y_{t-1} $$
[Figure: time series plot of Change of 10-year T-bond Yield]

EViews:

Read in bond10y.csv
File, Open, Foreign Data as Workfile,
bond10y.csv (in File name), Open,
Finish

Type in top panel


plot close
genr dy=close-close(-1)
plot dy
hist dy
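For readers without EViews, the differencing step `genr dy=close-close(-1)` can be mimicked in a few lines of Python. This is a minimal sketch with made-up numbers (not the bond10y.csv data), and `first_difference` is a name chosen for this example.

```python
def first_difference(y):
    """Return the changes y[t] - y[t-1]; the first observation has no predecessor."""
    return [curr - prev for prev, curr in zip(y, y[1:])]


yields = [4, 6, 5, 9]  # made-up values for illustration only
dy = first_difference(yields)
print(dy)  # [2, -1, 4]
```

Differencing always loses one observation, which is why a series of 530 yields gives 529 changes.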


Time series plot


Time series plots reveal

trends
seasonalities
volatilities (amount of variation)
breaks (pattern changes)
outliers (unusual observations)

in data.
The first step in time series analysis: plot the data.


Histogram
Describes how data are distributed.
(frequency distribution)
eg. Change of US 10-year treasury bond yield (monthly)
Thick-tailed (caused by a small number of large jumps)
[Figure: histogram of Change of 10-year T-bond Yield]

Sample 1962M01 2006M02
Observations 529

Mean           0.000926
Median         0.010000
Maximum        1.590000
Minimum       -1.880000
Std. Dev.      0.348751
Skewness      -0.280792
Kurtosis       6.565498

Jarque-Bera    287.1622
Probability    0.000000

Normal distribution: skewness = 0, kurtosis = 3.
At the 5% level, reject normality if Jarque-Bera > 5.99.
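The Jarque-Bera figure can be reproduced from the skewness and kurtosis reported above. The sketch below is an addition to the slides: it assumes the standard formula JB = n/6 (S^2 + (K-3)^2/4), which is approximately chi-square with 2 degrees of freedom under normality, and the function name `jarque_bera` is chosen for this example.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Values taken from the histogram statistics on this slide.
n, skew, kurt = 529, -0.280792, 6.565498
jb = jarque_bera(n, skew, kurt)

# For a chi-square(2) variable, P(X > c) = exp(-c/2).
critical_5pct = -2.0 * math.log(0.05)  # about 5.99
p_value = math.exp(-jb / 2.0)          # effectively zero here
reject_normality = jb > critical_5pct

print(round(jb, 2), round(critical_5pct, 2), reject_normality)
```

Almost all of the statistic comes from the kurtosis term, which matches the thick-tailed shape of the histogram.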


Empirical cumulative distribution (cdf)


Another way to look at how data are distributed.
eg. Change of US 10-year treasury bond yield (monthly)
[Figure: Empirical CDF of the change in yield; x-axis: Change, y-axis: Probability]
80% of observations are below 0.23
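An empirical cdf is simply the share of observations at or below each value. A minimal Python sketch, using ten made-up changes rather than the bond data (`ecdf` is a name chosen for this example):

```python
def ecdf(data, x):
    """Share of observations less than or equal to x."""
    return sum(1 for d in data if d <= x) / len(data)


# Made-up changes for illustration only.
changes = [-0.5, -0.2, -0.1, 0.0, 0.0, 0.1, 0.1, 0.2, 0.3, 0.6]
print(ecdf(changes, 0.2))  # 0.8: eight of the ten values are at or below 0.2
```

Reading the plot the other way round (from a probability on the vertical axis to a value on the horizontal axis) gives a quantile, such as the 80% point quoted on the slide.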


QQ-plot
Check how a theoretical distribution fits data.
For a perfect fit, the QQ-plot is a straight line.
eg. Change of US 10-year treasury bond yield: it appears non-normal.
data with 529 observations: (1/529)th quantile = -1.88
std normal distribution: (1/529)th quantile = -2.90
These give the point (-1.88, -2.90) on the plot.
[Figure: Theoretical Quantile-Quantile plot; x-axis: Change, y-axis: Normal Quantile]
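The normal quantile quoted above can be verified with the Python standard library. This is a small check added here, not part of the original slides. The data quantile -1.88 is the sample minimum reported with the histogram.

```python
from statistics import NormalDist

n = 529
# Standard normal quantile at probability 1/n.
q_normal = NormalDist().inv_cdf(1 / n)
q_data = -1.88  # smallest observed change, from the slides

print(round(q_normal, 2))  # -2.9
print((q_data, round(q_normal, 2)))
```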


EViews
Commands for the examples

EViews:
Read in bond10y.csv
File, Open, Foreign Data as Workfile,
bond10y.csv (in File name), Open,
Finish
Type in top panel
genr dy=close-close(-1)
plot close dy
dy.line
dy.hist
dy.distplot cdf
dy.qqplot
scalar q80=@quantile(dy, .8)
scalar qn529=@qnorm(1/529)


Style of graphs
Easy to understand
eg. indicate the meaning of axes
Highlight the point you are trying to make
Informative
eg. make self-explanatory graphs
Attractive
eg. use of proper colours and symbols
Avoid chart junk
eg. abuse of colours, shadings, grids,


Decomposition of a Time Series


Time series versus random sample
A random sample
is a set of independent observations on a variable,
often collected at one point in time.
eg. Income of a randomly-selected household
Med-insurance status of a randomly-selected household

A time series
is a set of observations on a variable observed
over consecutive time intervals.
eg. T-bond rate, gold price, retail sales, (daily, annual,...)

Features of economic time series data


Trend, seasonality, fluctuation/cycle
Autocorrelation (future is influenced by present)
eg. Department stores turnover: 1982.04-1999.10
Trend (sales growing), Seasonality (peak & trough repeat)
Cycle (random fluctuations)
[Figure: two time series plots, Retail Turnover ($M) and log Retail Turnover]

Features of economic time series data


eg. Gold price ($US per fine ounce, London 3pm, 1/80-11/99)
- sub-samples very different; varying trends (randomly);
- persistent with few large jumps
[Figure: time series plot of Gold Price (USD/fine ounce, London 3pm)]

Trend, seasonality and cycle


Let yt be a time series (observable).
Trend, mt, is the smoothly evolving part of yt.
It represents the long-run movement of yt.
eg. Trend in log retail turnover
[Figure: log Retail Turnover: Trend]

Trend, seasonality and cycle


Seasonality, st, is the repetitive part of yt.
It repeats over a fixed number of periods.
eg. quarterly seasonality repeats over 4 quarters.
eg. Log retail turnover: raw - trend = detrended
= seasonality + cycle
[Figure: log Retail Turnover: Trend & Seasonality; panels: Raw-Trend, Seasonality, Cycle]

Trend, seasonality and cycle


Cycle, xt, is the random fluctuation in yt, aka the
irregular component. It is a random variable (RV) for each t.
eg. Interesting to know how x_t and x_{t-1} are associated.
eg. Log retail turnover: cycle = raw - trend - seasonality
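One common way to measure how x_t and x_{t-1} are associated is the lag-1 sample autocorrelation. The sketch below is an addition to the slides and uses a made-up series. The function name `lag1_autocorrelation` is chosen for this example.

```python
def lag1_autocorrelation(x):
    """Sample correlation between x[t] and x[t-1], using the overall mean."""
    n = len(x)
    xbar = sum(x) / n
    numerator = sum((x[t] - xbar) * (x[t - 1] - xbar) for t in range(1, n))
    denominator = sum((v - xbar) ** 2 for v in x)
    return numerator / denominator


series = [1, 2, 3, 4, 5]  # made-up values for illustration only
r1 = lag1_autocorrelation(series)
print(r1)
```

A value near zero suggests that this period's cycle says little about the next one. A value well away from zero means the present helps to predict the future.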
[Figure: time series plot of Cycle, and Theoretical Quantile-Quantile plot of Cycle (x-axis: Cycle, y-axis: Normal Quantile)]

Classical decomposition of yt
CD is a model that splits the observable time
series, yt, into three unobserved components:
trend (mt), seasonality (st), cycle (xt)

Additive decomposition

$$ y_t = m_t + s_t + x_t, \qquad s_{t+p} = s_t, \qquad \sum_{i=1}^{p} s_{t+i} = 0, \qquad E(x_t) = 0, $$

where p is the number of periods in a season.
eg. p = 12 for monthly series

The seasonal sum and E(xt) are normalized to zero
so that each component is identified.

Multiplicative decomposition: $Y_t = M_t S_t X_t$
But log(Yt) has an additive decomposition.
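The role of the zero-sum normalization can be seen in a tiny example. The sketch below is an illustration added here, with made-up numbers, a quarterly season (p = 4) and the cycle set to zero for simplicity. It shows that a constant can be moved between the trend and the seasonal component without changing the observed series, so the components are pinned down only once the seasonal effects are forced to sum to zero.

```python
p = 4  # periods per season (quarterly), chosen to keep the example short
n = 8

# Made-up components; the cycle is set to zero for simplicity.
trend = [10.0 + 0.5 * t for t in range(n)]
raw_seasonal = [3.0, 1.0, 2.0, 6.0]  # sums to 12, not zero

# Normalize: move the average seasonal level into the trend.
shift = sum(raw_seasonal) / p
s = [v - shift for v in raw_seasonal]  # now sums to zero
m = [v + shift for v in trend]

y_before = [trend[t] + raw_seasonal[t % p] for t in range(n)]
y_after = [m[t] + s[t % p] for t in range(n)]

print(s)                    # [0.0, -2.0, -1.0, 3.0]
print(y_before == y_after)  # True: the observed series is unchanged
```

The same argument applies to the cycle: any non-zero mean of xt could be absorbed into the trend, so E(xt) = 0 is imposed.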

Classical decomposition of yt
e.g. Department stores turnover: 1982.04-1999.10
[Figure: four panels, log Retail Turnover, Trend, Seasonality and Cycle]

Data and EViews


EViews:
Read in data by clicking
File, New, Workfile, Dated (in Workfile structure), Monthly (in Frequency),
1982:01 (in Start date), 1999:10 (in End date), OK;
Proc (on Workfile window), Import, Import from file,
departstoresTurnover05.xls (in File name), Open, Finish, OK
Time series plot of log sales by typing in top panel
genr y=log(sales)
y.line
Generate MA trend
genr ma12b=(y(-6)+y(-5)+y(-4)+y(-3)+y(-2)+y(-1)+y+y(1)+y(2)+y(3)+y(4)+y(5))/12
genr ma12f=(y(-5)+y(-4)+y(-3)+y(-2)+y(-1)+y+y(1)+y(2)+y(3)+y(4)+y(5)+y(6))/12
genr ma12=0.5*(ma12b+ma12f)
De-trended series
genr ydt=y-ma12
plot y ma12
plot y ma12 ydt
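The moving-average commands above can be mirrored in Python to see what they do. This is a sketch with a made-up series (a linear trend plus a zero-sum monthly pattern), not the department stores data. `centered_ma12` is a name chosen for this example, and `None` marks the months at each end where the average cannot be formed.

```python
def centered_ma12(y):
    """Centered 12-month moving average, as in ma12 = 0.5*(ma12b + ma12f)."""
    n = len(y)
    ma = [None] * n
    for t in range(6, n - 6):
        ma12b = sum(y[t - 6:t + 6]) / 12  # y(-6) ... y(5)
        ma12f = sum(y[t - 5:t + 7]) / 12  # y(-5) ... y(6)
        ma[t] = 0.5 * (ma12b + ma12f)
    return ma


# Made-up series: linear trend plus a monthly pattern that sums to zero.
seasonal = [0.3, -0.1, -0.2, 0.0, 0.1, -0.3, 0.2, -0.2, 0.1, -0.1, -0.4, 0.6]
y = [5.0 + 0.1 * t + seasonal[t % 12] for t in range(36)]

ma12 = centered_ma12(y)
# De-trended series, defined only where the moving average exists.
ydt = [None if m is None else yt - m for yt, m in zip(y, ma12)]

print(ma12[6], ydt[6])  # trend value 5.6 and seasonal effect 0.2, up to rounding
```

Averaging over exactly one year removes a pattern that repeats every 12 months, and the symmetric weights leave a linear trend untouched, so here the moving average recovers the trend and the de-trended series recovers the seasonal pattern.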


Summary
List the types of graphs we used today.
Why is plotting data important?
What are the major components in a time series?
What is the additive decomposition?
Why must the seasonal component sum to zero
over a season?
Why is the mean of cycle component normalised
to zero?
