ECON2209
Slides 03
BF-03
Lecture Plan
Graphs of data: merits and limitations
Examples: use graphs to show data features
Time series differ from random sample
Components in time series
Classical decomposition
Statistical Graphics
Example: Anscombe's quartet
obs    X1     Y1     X2     Y2     X3     Y3     X4     Y4
 1     10    8.04    10    9.14    10    7.46     8    6.58
 2      8    6.95     8    8.14     8    6.77     8    5.76
 3     13    7.58    13    8.74    13   12.74     8    7.71
 4      9    8.81     9    8.77     9    7.11     8    8.84
 5     11    8.33    11    9.26    11    7.81     8    8.47
 6     14    9.96    14    8.10    14    8.84     8    7.04
 7      6    7.24     6    6.13     6    6.08     8    5.25
 8      4    4.26     4    3.10     4    5.39    19   12.50
 9     12   10.84    12    9.13    12    8.15     8    5.56
10      7    4.82     7    7.26     7    6.42     8    7.91
11      5    5.68     5    4.74     5    5.73     8    6.89
Why identical regression line?
$y = \underset{(1.12)}{3.00} + \underset{(0.12)}{0.50}\,x, \qquad R^2 = 0.67$
(standard errors in parentheses)
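The claim that all four data sets give the same fitted line can be checked directly. The following is a minimal Python sketch (the slides themselves use EViews); the helper name `ols` is an illustration only, and the data are the values tabulated above.

```python
# Anscombe's quartet, as tabulated in the slides.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
data = {
    "set 1": (x123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "set 2": (x123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "set 3": (x123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "set 4": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
              [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.5, 5.56, 7.91, 6.89]),
}


def ols(x, y):
    """Least-squares fit of y = a + b*x; returns (a, b, R^2)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r2 = sxy ** 2 / (sxx * syy)  # squared correlation = R^2 in simple regression
    return intercept, slope, r2


fits = {name: ols(x, y) for name, (x, y) in data.items()}
for name, (a, b, r2) in fits.items():
    print(f"{name}: y = {a:.2f} + {b:.2f} x, R^2 = {r2:.2f}")
```

The summary statistics agree to two decimals across the four sets, so the regression output alone cannot distinguish them; only a plot can.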
Example
Scatter plots explain it vividly.
[Figure: four scatter plots, Y1 vs. X1, Y2 vs. X2, Y3 vs. X3, Y4 vs. X4]
Advantages of graphs
Graphs represent data visually and help us to see
data features/patterns.
A graph is worth a thousand words.
Scatter plots
Useful to reveal the relationship between two
variables.
eg. xyz.dat
[Figure: two scatter plots, Y vs. X and Y vs. Z]
ls y c z
======================================================
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            10.04891      0.272825      36.83280      0.0000
Z            -0.364783     0.242923      -1.501641     0.1400
Weak relation?
It could be
y = b0 + b1 z + b2 x + u
Scatter plots
eg. xyz.dat. Partial relationship
of y and z (after controlling for x):
regress y on c x to get resid01;
regress z on c x to get resid02;
scatter plot resid01 against resid02.
EViews:
Type in top panel
ls y c x
On the result window, click
Proc, Make Residual Series,
resid01 (in Name for resid series), OK
Type in top panel
ls z c x
On the result window, click
Proc, Make Residual Series,
resid02 (in Name for resid series), OK
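The three steps above can be sketched in Python to show why the residual scatter plot reveals the partial relationship: the slope of resid01 on resid02 equals the coefficient on z in the regression of y on c, x and z. This is a sketch under stated assumptions: xyz.dat is not available here, so the data are simulated (hypothetical), and the helper names are illustrative.

```python
import random


def ols_simple(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


def residuals(x, y):
    """Residuals from regressing y on a constant and x."""
    a, b = ols_simple(x, y)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


# Hypothetical data standing in for xyz.dat: z is correlated with x.
random.seed(0)
n = 48
x = [random.gauss(0, 1) for _ in range(n)]
z = [0.8 * xi + random.gauss(0, 0.6) for xi in x]
y = [10 + 1.0 * xi - 0.6 * zi + random.gauss(0, 0.5) for xi, zi in zip(x, z)]

# Steps from the slide: partial out x from both y and z.
resid01 = residuals(x, y)
resid02 = residuals(x, z)
_, b_partial = ols_simple(resid02, resid01)

# Coefficient on z in y = b0 + b1*z + b2*x + u, from the normal equations.
mx, mz, my = sum(x) / n, sum(z) / n, sum(y) / n
sxx = sum((xi - mx) ** 2 for xi in x)
szz = sum((zi - mz) ** 2 for zi in z)
sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
szy = sum((zi - mz) * (yi - my) for zi, yi in zip(z, y))
b_z_multiple = (sxx * szy - sxz * sxy) / (sxx * szz - sxz ** 2)

# Slope from regressing y on z alone, ignoring x.
_, b_z_alone = ols_simple(z, y)

print(b_partial, b_z_multiple, b_z_alone)
```

The first two printed numbers coincide up to rounding error, while the third (z alone) generally differs because it also picks up the effect of the omitted x.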
[Figure: scatter plot of RESID01 against RESID02]
Dependent Variable: Y, Sample: 1 48
Variable    Coefficient    Std. Error    t-Statistic    Prob.
C            9.884732      0.190297      51.94359      0.0000
X            1.073140      0.150341      7.138031      0.0000
Z            -0.638011     0.172499      -3.698642     0.0006
EViews:
Read in bond10y.csv
File, Open, Foreign Data as Workfile,
bond10y.csv (in File name), Open,
Finish
[Figure: time series plot; horizontal axis in years]
trends
seasonalities
volatilities (amount of variation)
breaks (pattern changes)
outliers (unusual observations)
in data.
First thing in time series analysis: plotting data
Histogram
Describes how data are distributed.
(frequency distribution)
eg. Change of US 10-year treasury bond yield (monthly)
Thick-tailed (caused by a small number of large jumps)
[Figure: histogram, Change of 10-year T-bond Yield]
Sample 1962M01 2006M02, Observations 529
Mean           0.000926
Median         0.010000
Maximum        1.590000
Minimum       -1.880000
Std. Dev.      0.348751
Skewness      -0.280792
Kurtosis       6.565498
Jarque-Bera    287.1622
Probability    0.000000
Normal distribution: skewness = 0, kurtosis = 3.
At the 5% level, reject normality if Jarque-Bera > 5.99.
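The Jarque-Bera statistic reported with the histogram can be reproduced from the skewness, kurtosis and sample size. The sketch below uses the standard formula JB = n/6 (S^2 + (K - 3)^2 / 4) and the fact that JB is compared with a chi-square distribution with 2 degrees of freedom; the function name is illustrative.

```python
import math


def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


# Values from the histogram statistics in the slide.
n, skew, kurt = 529, -0.280792, 6.565498
jb = jarque_bera(n, skew, kurt)

# For a chi-square(2) variable the upper tail is P(X > c) = exp(-c / 2),
# so the 5% critical value is -2 * ln(0.05).
crit_5pct = -2.0 * math.log(0.05)
p_value = math.exp(-jb / 2.0)

print(f"JB = {jb:.4f}, 5% critical value = {crit_5pct:.2f}")
print("reject normality" if jb > crit_5pct else "do not reject normality")
```

Most of the statistic comes from the excess kurtosis term, which matches the thick-tailed description of the yield changes.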
[Figure: empirical cumulative distribution of the change; horizontal axis: Change, vertical axis: Probability]
80% of observations are below 0.23.
QQ-plot
Check how a theoretical distribution fits data.
For a perfect fit, the QQ-plot is a straight line.
eg. Change of US 10-year
treasury bond yield:
It appears non-normal.
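A QQ-plot is built by pairing each sorted observation with the quantile that the theoretical distribution assigns to the same rank. The Python sketch below shows the construction for a normal reference distribution. The plotting position (i - 0.5)/n is an assumption (EViews may use a different rule), and the sample values are made up for illustration, not the bond data.

```python
from statistics import NormalDist


def normal_qq_pairs(data):
    """Pair each sorted observation with a standard normal quantile."""
    n = len(data)
    std_normal = NormalDist()  # mean 0, standard deviation 1
    pairs = []
    for i, value in enumerate(sorted(data), start=1):
        p = (i - 0.5) / n  # assumed plotting position for rank i
        pairs.append((value, std_normal.inv_cdf(p)))
    return pairs


# Made-up sample with a few large jumps in both tails.
sample = [-1.9, -0.4, -0.15, -0.05, 0.0, 0.01, 0.06, 0.12, 0.35, 1.6]
pairs = normal_qq_pairs(sample)
for value, quantile in pairs:
    print(f"{value:6.2f}  {quantile:6.3f}")
```

If the data were normal, the printed pairs would lie close to a straight line; extreme observations that sit far from the line drawn through the middle pairs indicate thick tails.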
[Figure: Theoretical Quantile-Quantile plot; horizontal axis: Change, vertical axis: Normal Quantile; the point (-1.88, -2.90) is marked]
EViews
Commands for the examples
EViews:
Read in bond10y.csv
File, Open, Foreign Data as Workfile,
bond10y.csv (in File name), Open,
Finish
Type in top panel
genr dy=close-close(-1)
plot close dy
dy.line
dy.hist
dy.distplot cdf
dy.qqplot
scalar q80=@quantile(dy, .8)
scalar qn529=@qnorm(1/529)
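For readers without EViews, the last few commands can be mirrored in Python. This is a rough sketch, not a translation of EViews internals: the `close` values are made up, `empirical_quantile` is a hypothetical helper using a simple order-statistic rule (the rule behind `@quantile` may interpolate differently), and `NormalDist().inv_cdf` plays the role of `@qnorm`.

```python
import math
from statistics import NormalDist

# Made-up yield levels standing in for the series `close` in bond10y.csv.
close = [4.06, 4.03, 3.99, 4.12, 4.25, 4.10, 4.18]

# genr dy = close - close(-1): first difference, one observation shorter.
dy = [curr - prev for prev, curr in zip(close, close[1:])]


def empirical_quantile(data, q):
    """Smallest observation with at least a fraction q of the data at or below it."""
    ordered = sorted(data)
    k = max(1, math.ceil(q * len(ordered)))
    return ordered[k - 1]


# scalar q80 = @quantile(dy, .8)
q80 = empirical_quantile(dy, 0.8)

# scalar qn529 = @qnorm(1/529): the standard normal quantile at probability 1/529.
qn529 = NormalDist().inv_cdf(1 / 529)

print(q80, round(qn529, 2))
```

The value of `qn529` is the normal quantile that the QQ-plot pairs with the minimum change, which is why the point (-1.88, -2.90) appears there.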
Style of graphs
Easy to understand
eg. indicate the meaning of axes
Highlight the point you are trying to make
Informative
eg. make graphs self-explanatory
Attractive
eg. use of proper colours and symbols
Avoid chart junk
eg. abuse of colours, shadings, grids
A time series
is a set of observations on a variable observed
over consecutive time intervals.
eg. T-bond rate, gold price, retail sales, (daily, annual,...)
eg.
Trend in log retail turnover
[Figure: trend in log retail turnover; horizontal axis in years, 82 to 98]
[Figure: panels Raw-Trend, Seasonality and Cycle; horizontal axis in years, 82 to 98]
[Figure: Cycle over time (years 82 to 98), and QQ-plot of Normal Quantile against Cycle]
Classical decomposition of yt
CD is a model that splits the observable time
series, yt, into three unobserved components:
trend (mt), seasonality (st), cycle (xt).
Additive decomposition
$$y_t = m_t + s_t + x_t, \qquad s_{t+p} = s_t, \qquad \sum_{i=1}^{p} s_{t+i} = 0, \qquad E(x_t) = 0.$$
The seasonal sum and the cycle mean are normalized to zero so that each component is identified.
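One way to see the additive decomposition and its normalization at work is a small Python sketch. The slides do not say how the components are estimated, so the method here is an assumption: the trend is a centered moving average over one season, and the seasonal component is the average detrended value for each month, shifted so that it sums to zero over a season. The data are simulated (hypothetical), with period p = 12.

```python
import math
import random

P = 12  # assumed seasonal period (monthly data)


def centered_ma(y, p):
    """Centered moving average for even p (a 2 x p MA); None where undefined."""
    half = p // 2
    out = [None] * len(y)
    for t in range(half, len(y) - half):
        window = y[t - half: t + half + 1]  # p + 1 points
        total = sum(window) - 0.5 * (window[0] + window[-1])  # half weight at the ends
        out[t] = total / p
    return out


def decompose(y, p=P):
    """Return (trend, seasonal, cycle) with y = trend + seasonal + cycle."""
    trend = centered_ma(y, p)
    sums, counts = [0.0] * p, [0] * p
    for t, (yt, mt) in enumerate(zip(y, trend)):
        if mt is not None:
            sums[t % p] += yt - mt
            counts[t % p] += 1
    raw = [s / c for s, c in zip(sums, counts)]
    shift = sum(raw) / p
    seasonal = [r - shift for r in raw]  # normalise: sums to zero over a season
    cycle = [None if mt is None else yt - mt - seasonal[t % p]
             for t, (yt, mt) in enumerate(zip(y, trend))]
    return trend, seasonal, cycle


# Hypothetical series: linear trend + zero-sum seasonal pattern + noise.
random.seed(1)
true_s = [0.3 * math.sin(2 * math.pi * k / P) for k in range(P)]
y = [6.0 + 0.01 * t + true_s[t % P] + random.gauss(0, 0.02) for t in range(120)]

trend, seasonal, cycle = decompose(y)
print([round(s, 3) for s in seasonal])
```

Without the zero-sum step, any constant could be moved between the trend and the seasonal component without changing their sum, so the two would not be separately identified; the zero mean of the cycle plays the same role.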
Classical decomposition of yt
e.g. Department stores turnover: 1982.04 to 1999.10
[Figure: four panels, log Retail Turnover, Trend, Seasonality, Cycle; horizontal axis in years, 82 to 98]
Summary
List the types of graphs we used today.
Why is plotting data important?
What are the major components in a time series?
What is the additive decomposition?
Why must the seasonal component sum to zero
over a season?
Why is the mean of cycle component normalised
to zero?