
Introductory Econometrics

2019

Monash Econometrics and Business Statistics


1 / 38
Welcome

I This is the first unit in econometrics and a prerequisite for
most third-year units in econometrics, business analytics and
actuarial studies
I You are certainly in the right place to learn econometrics

2 / 38
What is Econometrics?

I “Metrics” as a suffix means measuring things and analysing
those measurements. “Econo” tells us that we are talking
about measuring and analysing economic things.
I Examples of quantitative analytics in other fields are
biometrics and psychometrics
I Remember that the term was coined in 1930, when many of the
business and commerce disciplines of today were in their
infancy or did not exist, so the context was “Econo” not
“Busino” (“Businometrics” does not sound very good to me)
I Econometric methods are used in economics, finance,
marketing and management
I Econometrics uses mathematics and statistics

3 / 38
Outline of unit

I Week 1: Introduction
I Week 2: Review of Statistical Concepts
I Week 3: Linear Regression (OLS)
I Week 4: More on Linear Regression Analysis
I Week 5: Inference
I Week 6: Model Selection / Prediction
I Week 7: Binary Variables
I Week 8: Heteroskedasticity
I Week 9: Serial Correlation
I Week 10: Persistence in Time Series Data
I Week 11: Large Sample Properties of OLS
I Week 12: Revision

4 / 38
Logistics

I Refer to the Unit Guide and the Moodle site

5 / 38
Overall goal

Evidence-based (aka empirical) analysis of business and economic
problems. This comprises the following stages:
1. Understanding the problem
2. Formulating an appropriate conceptual model to tackle the
problem
3. Collecting appropriate data
4. Looking at data (Descriptive Analytics)
5. Estimating the model, making inference, predictions and
policy prescriptions as appropriate (Predictive and Prescriptive
Analytics)
6. Evaluating, learning and improving each of the previous steps,
and iterating until the problem is solved

6 / 38
Two purposes of econometric modelling

I We use data and econometric methods for two purposes:
1. Prediction: To predict a target variable based on available
information, e.g.,
I future return to a stock based on historical returns
I school quality based on house prices
I rain and its intensity based on the number of people carrying
umbrellas
2. Policy prescription: To understand the causes of variation in a
target variable, so that we can control that variable with
suitable policy prescriptions, e.g.
I what determines the volatility of stock prices?
I what determines the demand for umbrellas?
I how does university education affect earnings?
I what is the effect of greenhouse gases on global temperature?

7 / 38
Two purposes of econometric modelling: 1. Prediction

I When do you think a variable x will be useful for predicting a
target variable y?
I Can we predict how many people will be carrying umbrellas
using the weather bureau’s forecast of incidence and intensity
of rain?
I Can we predict if it is raining outside by counting how many
people enter the lecture theatre with umbrellas?
I The answer to both questions is ‘yes’. We can exploit
correlation for prediction, regardless of the direction of
causation.
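
The point about correlation and prediction can be illustrated with a small simulation. This is only a sketch: the data are made up, rain is assumed to cause umbrella use (never the reverse), and the helper `fit_line` is a hypothetical name introduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical simulated data: rain intensity drives the umbrella count.
n = 500
rain = rng.gamma(shape=2.0, scale=1.5, size=n)          # rain intensity (made-up units)
umbrellas = 5.0 + 3.0 * rain + rng.normal(0.0, 2.0, n)  # people carrying umbrellas


def fit_line(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    slope = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
    intercept = y.mean() - slope * x.mean()
    return intercept, slope


# Causal direction: predict umbrellas from rain.
a1, b1 = fit_line(rain, umbrellas)
# Reverse direction: predict rain from umbrellas.
a2, b2 = fit_line(umbrellas, rain)

corr = np.corrcoef(rain, umbrellas)[0, 1]
print(f"correlation: {corr:.2f}")
print(f"umbrellas = {a1:.2f} + {b1:.2f} * rain")
print(f"rain      = {a2:.2f} + {b2:.2f} * umbrellas")
```

Both fitted lines have a non-zero slope, so each variable helps predict the other even though causation runs in one direction only.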

8 / 38
Two purposes of econometric modelling: 1. Prediction
Predictive models

I The tools we learn allow us to predict y using an x that is
correlated with y.
I If there are several variables that are correlated with y, these
tools will allow us to use a combination of these predictors to
predict y.
I When we have time series data, and today’s value is
correlated with yesterday’s value, these tools allow us to use
yesterday’s value to predict today.
I Prediction of future values is called forecasting. The tools we
learn allow us to build forecasting models.

9 / 38
Two purposes of econometric modelling: 2. Policy
Prescription
Causal models

I When do you think we can manipulate a target variable y by
changing a variable x?
I Example 1: CEO salary is positively correlated with a
company’s share price. Can we increase a company’s share
price by offering a large salary to its CEO?
I Example 2: Poor countries have high infant mortality. Does
this mean that the only way to reduce infant mortality is to
make the country rich?
I We need a causal model to be able to give policy
advice. For example, suppose that, after accounting for all confounding
factors, we establish that better sanitation reduces child
mortality. Then we can advise the country or NGOs to
implement programs to improve sanitation in order to reduce
child mortality.
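
The CEO salary example can be mimicked with simulated data. In this sketch a hypothetical confounder, `firm_size`, drives both salary and share price, and salary is given no effect on share price by construction. All numbers and the helper `ols` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2000

# Hypothetical confounder that drives both variables.
firm_size = rng.normal(0.0, 1.0, n)
ceo_salary = 1.0 + 0.8 * firm_size + rng.normal(0.0, 0.5, n)
# Salary does NOT enter this equation: its causal effect is zero by construction.
share_price = 10.0 + 2.0 * firm_size + rng.normal(0.0, 1.0, n)


def ols(y, *regressors):
    """Least-squares coefficients of y on a constant and the given regressors."""
    X = np.column_stack([np.ones(len(y)), *regressors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


naive = ols(share_price, ceo_salary)                  # ignores the confounder
controlled = ols(share_price, ceo_salary, firm_size)  # accounts for it

print(f"slope on salary, no controls:       {naive[1]:.2f}")
print(f"slope on salary, firm size included: {controlled[1]:.2f}")
```

The first slope is clearly positive, which is useful for prediction but misleading for policy; once the confounder is included the slope on salary is close to zero.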

10 / 38
The role of economics and finance theories
I Theories in economics and finance suggest how variables are
related to each other. Hence they are potentially useful for
developing causal and also predictive models. Examples:
I Economic theory suggests that quantity demanded is
negatively related to own price and positively related to income
of consumers.
I Finance theory suggests that bond prices depend negatively on
inflation. This implies that predictors that are useful for
predicting inflation should also help in predicting bond prices.
I Also, we can use econometrics to test economics and finance
theories. For example:
I The efficient market hypothesis implies that all important
information for next period’s equity prices is in this period’s
price. Hence equity returns should be unpredictable. If we can
find a significant predictor for returns, we can reject this
hypothesis.
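
As a rough sketch of such a test, one can regress returns on last period's return and look at the t-statistic of the slope. The returns below are simulated to be unpredictable, so this only illustrates the mechanics; the variable names and numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Hypothetical returns that are unpredictable by construction (i.i.d. draws).
n = 1000
returns = rng.normal(0.0005, 0.01, n)

# Regress this period's return on last period's return.
y, x = returns[1:], returns[:-1]
X = np.column_stack([np.ones(len(x)), x])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Usual OLS standard error of the slope (assumes homoskedastic errors).
resid = y - X @ beta
dof = len(y) - X.shape[1]
sigma2 = resid @ resid / dof
cov_beta = sigma2 * np.linalg.inv(X.T @ X)
t_stat = beta[1] / np.sqrt(cov_beta[1, 1])

print(f"slope on lagged return: {beta[1]:.4f}, t-statistic: {t_stat:.2f}")
```

A t-statistic that is large in absolute value (roughly above 2) would count as evidence against unpredictability.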

11 / 38
Some examples

I What predicts stock returns?


I What is the effect of reducing class size on student
achievement?
I How does another year of education change earnings?
I What determines if a person chooses to buy private health
insurance?
I What is the price elasticity of cigarettes?
I What is the effect on output growth of a 0.25 percentage
point increase in interest rates?

12 / 38
Let’s think of bigger issues: What do YOU think?

The most important problem facing us in the next 50 years is:


1. Global warming
2. Inequality and poverty
3. Ageing of the population
4. Threat of terrorism

Go to https://flux.qa, click on + and then enter: UIGIRZ

13 / 38
Econometric Analysis: Data
I The gold standard for measuring causal effects is using data
from randomised controlled trials (RCTs), similar to the way the
effectiveness of new drugs or vaccines is measured
I Such data are called experimental data
I Unfortunately, in business and economics we usually cannot run
such experiments
I Most often econometric analysis has to be carried out using
observational (i.e. non-experimental) data.
I Returns to education: data on wage and educational
attainment of a sample of individuals.

I Observational data pose major challenges:


I Confounding effects (omitted factors): smarter individuals go
to university. University graduates get higher wages. Is this
wage differential a return to university education or a return to
smartness?
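
The confounding problem can be written out with the standard omitted-variable algebra. This is a sketch under assumptions that are not on the slide: a linear wage equation, an unobserved "ability" variable, and an error u that is uncorrelated with both education and ability.

```latex
% Assumed (illustrative) model: wage depends on education and unobserved ability.
\[
\text{wage} = \beta_0 + \beta_1\,\text{educ} + \beta_2\,\text{ability} + u
\]
% If ability is left out and wage is regressed on educ alone,
% the estimated slope \tilde{\beta}_1 converges to
\[
\tilde{\beta}_1 \;\xrightarrow{\,p\,}\; \beta_1 + \beta_2\,\delta_1,
\qquad
\delta_1 = \frac{\operatorname{Cov}(\text{educ},\,\text{ability})}{\operatorname{Var}(\text{educ})}
\]
```

If ability raises wages (β2 > 0) and is positively correlated with education (δ1 > 0), the simple regression overstates the return to education.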
14 / 38
Econometric Analysis: Structure of data
I Three data structures that are commonly encountered in
econometrics:
1. Cross-sectional data: observations on one or more variables
taken at the same point in time.
I e.g. Observations on infant mortality and GDP per capita in a
group of countries in 2015.

2. Time series data: observations on one or more variables taken
at different points in time.
I e.g. Monthly observations on returns and volume of trade of
Qantas shares from January 2000 to June 2018.

3. Panel data (or longitudinal data): observations on the same
cross-section units at different points in time.
I e.g. Daily returns and volumes for ASX200 companies from
2/1/2017 to 30/6/2018.
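
One way to picture the three structures is as arrays of different shapes. The unit labels, years and values below are hypothetical placeholders, not real data.

```python
import numpy as np

rng = np.random.default_rng(4)

countries = ["A", "B", "C"]        # hypothetical cross-section units
years = [2015, 2016, 2017, 2018]   # hypothetical time periods

# Panel data: one value per unit per period (rows = units, columns = periods).
panel = rng.normal(size=(len(countries), len(years)))

# Cross-sectional data: all units at one point in time (one column of the panel).
cs_2015 = panel[:, years.index(2015)]

# Time series data: one unit followed over time (one row of the panel).
ts_A = panel[countries.index("A"), :]

print("panel shape:        ", panel.shape)
print("cross-section shape:", cs_2015.shape)
print("time series shape:  ", ts_A.shape)
```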

15 / 38
Cross-sectional versus Time series

I Below are two important differences between the properties of
cross-sectional and time series data:
P1 There is no natural ordering of observations in cross-sectional
data, i.e. there is no reason to assume that the characteristics
of Helen should be studied before those of Paul or vice versa.
P1’ Conversely, observations in time series data are ordered: There
is a natural ordering in time in the sense that GDP in quarter 1
of the year precedes quarter 2 and quarter 2 precedes quarter 3
P2 Cross-sectional data are generally assumed to be independent,
i.e. information collected on individual 1 does not provide any
information on individual 2
P2’ Conversely, time series data are generally characterized by
some form of temporal dependence, i.e. an observation on
cigarette consumption in July 2018 can be informative about
cigarette consumption in August 2018.

16 / 38
Cross-sectional versus Time series

I Time series data can be used to accomplish two important
tasks for which cross-sectional data are inadequate. These are
to:
I Forecast future values of a variable: e.g. stock prices, consumer
price index, gross domestic product, annual homicide rates one
or several days / months / quarters / years ahead.
I Estimate the dynamic causal effect of one variable x on
another variable y: e.g. estimate the effect on alcohol
consumption of an increase in the tax on alcohol, both initially
and subsequently as consumers adjust to the new tax.

17 / 38
Cross-sectional versus Time series: Notation
I Conventionally, a regression model for cross section analysis is
written as:

$$y_i = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots + \beta_k x_{ik} + u_i, \quad i = 1, 2, \dots, n$$

where i denotes units such as individuals, households, firms,
schools, cities, states, countries etc.
I For time series analysis a similar regression model is written as:

$$y_t = \beta_0 + \beta_1 x_{t1} + \beta_2 x_{t2} + \dots + \beta_k x_{tk} + u_t, \quad t = 1, 2, \dots, n$$

where t denotes units in time such as days, months, quarters,
years etc.
I All that changes in these two expressions is the notation for
units (from i to t)
I The remaining parts of these equations will be studied in
detail in subsequent lectures
18 / 38
Econometric Analysis: Modelling
I The fundamental step is to think of all variables as random
variables (variables with several possible outcomes that have a
probability distribution)
I Equally important is to remember that, with no other
information, the best prediction of a variable is the centre of
its distribution (typically its mean)
I Example: If we have a random sample of weekly wages in
Australia and based on that want to predict the wage of a
random person in Australia (not in our sample), what do we
do?
I Answer: ...
I If we group Australian population by age, the centre of the
wage distribution for different ages will not be the same
I To learn about the mean wage at each age, what should we
do?
I Calculating the average wage for each age from our sample may
not be feasible (the sample may not include every age)
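
A short numerical check of the claim that, with no other information, the mean is the best prediction. It assumes "best" means smallest average squared prediction error, and it uses simulated wages rather than real Australian data.

```python
import numpy as np

rng = np.random.default_rng(5)

# Hypothetical sample of weekly wages (simulated, right-skewed).
wages = rng.lognormal(mean=7.0, sigma=0.5, size=1000)


def mse(c):
    """Average squared error when every wage is predicted by the constant c."""
    return np.mean((wages - c) ** 2)


# Try many candidate constants and keep the one with the smallest error.
candidates = np.linspace(wages.min(), wages.max(), 2001)
best = candidates[np.argmin([mse(c) for c in candidates])]

print(f"sample mean:          {wages.mean():.1f}")
print(f"best constant (grid): {best:.1f}")
print(f"MSE at mean:   {mse(wages.mean()):.1f}")
print(f"MSE at median: {mse(np.median(wages)):.1f}")
```

The best constant found on the grid sits next to the sample mean, and the mean does at least as well as the median under squared error.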
19 / 38
Econometric Analysis: Modelling - interaction of theory
and practice
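Distribution of wage conditional on different values of age

Figure: conditional densities f(wage | age) drawn at ages 20, 40 and 60 (x-axis: age; y-axis: wage), together with the conditional mean line

$$E(\text{wage} \mid \text{age}) = \beta_0 + \beta_1\,\text{age}$$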

20 / 38
Econometric Analysis: Modelling - interaction of theory
and practice

I The problem is reduced to estimating the intercept and slope
of the conditional expectation line from the data
I Find the best fitting (regression) line to the scatter plot of the
data
I We study the regression line and establish why it is the best
we can do
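
A minimal sketch of estimating the intercept and slope by least squares, using the textbook closed-form expressions and checking them against NumPy's `polyfit`. The wage and age data are simulated, and the "true" values 400 and 15 are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(6)

# Hypothetical data: wage rises linearly with age, plus noise.
n = 300
age = rng.uniform(20, 60, n)
wage = 400.0 + 15.0 * age + rng.normal(0.0, 100.0, n)

# Closed-form least-squares estimates for a single regressor.
age_dev = age - age.mean()
b1_hat = np.sum(age_dev * (wage - wage.mean())) / np.sum(age_dev ** 2)
b0_hat = wage.mean() - b1_hat * age.mean()

# Cross-check with a library routine (polyfit returns the slope first).
slope_check, intercept_check = np.polyfit(age, wage, deg=1)

print(f"fitted line: wage = {b0_hat:.1f} + {b1_hat:.2f} * age")
```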

22 / 38
Econometric Analysis: Modelling - interaction of theory
and practice

I There are other predictors of wage, like education and IQ


I In the theory universe, we generalise the model to
$E(\text{wage} \mid \text{age}, \text{educ}, \text{IQ}) = \beta_0 + \beta_1\,\text{age} + \beta_2\,\text{educ} + \beta_3\,\text{IQ}$
I In the data universe, we find the combination of age,
education and IQ that fits the wage data best
I We study why this is the best we can do
I We also learn how to incorporate information that is
qualitative, e.g. gender and occupation.
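
The "data universe" step can be sketched as a least-squares fit with several predictors, with qualitative information entering as a 0/1 variable. Everything below is simulated: the coefficients and the `manager` indicator (standing in for an occupation category) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 1000

# Hypothetical predictors.
age = rng.uniform(20, 60, n)
educ = rng.integers(9, 21, n).astype(float)     # years of education
iq = rng.normal(100.0, 15.0, n)
manager = rng.integers(0, 2, n).astype(float)   # qualitative info coded as 0/1

# Hypothetical wage equation used to generate the data.
wage = (100.0 + 10.0 * age + 50.0 * educ + 2.0 * iq
        + 150.0 * manager + rng.normal(0.0, 100.0, n))

# Least-squares fit of wage on a constant and all four predictors.
X = np.column_stack([np.ones(n), age, educ, iq, manager])
beta_hat, *_ = np.linalg.lstsq(X, wage, rcond=None)

for name, b in zip(["const", "age", "educ", "IQ", "manager"], beta_hat):
    print(f"{name:>8}: {b:8.2f}")
```

The coefficient on the 0/1 variable is the estimated wage difference between the two groups, holding the other predictors fixed.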

23 / 38
Econometric Analysis: Modelling - The Main Focus

We study regression modelling in this unit, which is a powerful
tool for empirical analysis. We learn how to estimate regression
models, how to interpret them, and how to make inferences,
predictions and policy prescriptions on the basis of these models.

24 / 38
Summary

I The goal of econometric modelling is either to predict or to
prescribe policy
I The data available in business and economics are often
observational data
I Observational data sets are either cross sectional, or time
series or panel (longitudinal) data sets
I Regression modelling is a powerful tool for econometric
analysis

25 / 38
Quotes by Jim Barksdale

Jim Barksdale was the CEO of Netscape, and is now the
co-chairman of Spread Networks. He is known for his managerial
skills.
“You cannot manage that which you cannot measure.”

“If we have data, let’s look at data. If all we have are
opinions, let’s go with mine.”

26 / 38
Global Warming

I Global temperatures have been on a warming trend
I Question: Can we control this warming by reducing the
human activities that produce greenhouse gases?
I This translates to econometric questions such as:
I What is the nature of the warming trend in temperatures?
I What causes a warming trend in temperatures?
I What is the sensitivity of global temperatures to greenhouse
gas emissions?

27 / 38
Global Warming: Data
Global temperature anomalies (GTA)

Figure: GTA: Global temperature “anomalies” (actual average annual
temperatures minus the 1961-1990 average), plotted against year

28 / 38
Global Warming: Data
Greenhouse gases (GHG)

Figure: GHG: Greenhouse gases (an index that includes CO2, CH4,
N2O and CFC, constructed by NASA), plotted against year

29 / 38
Global Warming: Data Analysis
Figure: The scatter plot of GTA against GHG (x-axis: GHG; y-axis: GTA)

30 / 38
Global Warming: Data Analysis
OLS regression fitted values

$$\widehat{GTA}_t = 0.381 + 0.237\, GHG_t$$

Figure: Predicted GTA using GHG (global temperature anomalies and OLS fitted values, plotted against year)

32 / 38
Global Warming: Data Analysis

I We learn how to estimate the parameters using data.


I Things we need to pay attention to:
1. Possibility of spurious regression due to trend.
2. Possibility of better estimation of this relationship exploiting
the persistence in errors.
3. Feedback of temperatures on greenhouse gases.
I We learn how to pay attention to some but not all of the
above in this unit.
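
The first warning, spurious regression, can be seen in a small simulation. As an assumption for illustration, "trend" is modelled here as a random walk: pairs of completely unrelated random walks are generated, one is regressed on the other, and the usual t-test on the slope is recorded. The helper `slope_t_stat` is a hypothetical name.

```python
import numpy as np

rng = np.random.default_rng(8)


def slope_t_stat(x, y):
    """t-statistic of the slope in an OLS regression of y on a constant and x."""
    X = np.column_stack([np.ones(len(x)), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(X.T @ X)[1, 1])
    return beta[1] / se


n_obs, n_sims = 150, 500
rejections = 0
for _ in range(n_sims):
    x = np.cumsum(rng.normal(size=n_obs))  # random walk
    y = np.cumsum(rng.normal(size=n_obs))  # another random walk, independent of x
    if abs(slope_t_stat(x, y)) > 1.96:
        rejections += 1

rejection_rate = rejections / n_sims
print(f"share of 'significant' slopes: {rejection_rate:.2f} (nominal level 0.05)")
```

Although the two series are unrelated by construction, the slope looks significant far more often than 5% of the time, which is why trending data need care.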

33 / 38
Poverty and Inequality

I Reduction of poverty and inequality has been a Millennium
Development Goal.
I Stark evidence of poverty and inequality among nations is the
variation in child mortality.
I Obviously poorer nations have higher child mortality. But is
this the best policy advice we can provide: increase GDP per
capita and that will reduce child mortality?
I We need to determine which other factors explain variation in
child mortality after controlling for GDP per capita (i.e. among
countries with similar GDP per capita).
I Multiple regression analysis allows us to do that.
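
What "controlling for GDP per capita" means can be sketched numerically: the multiple regression coefficient on another factor equals the slope obtained after the influence of GDP has been removed from both that factor and mortality (the partialling-out, or Frisch-Waugh, result). The data, variable names and coefficients below are simulated assumptions, not the World Bank data.

```python
import numpy as np

rng = np.random.default_rng(9)
n = 500

# Hypothetical country-level data.
log_gdp = rng.normal(8.5, 1.0, n)
sanitation = 20.0 + 8.0 * (log_gdp - 6.0) + rng.normal(0.0, 10.0, n)
mortality = 200.0 - 15.0 * log_gdp - 0.5 * sanitation + rng.normal(0.0, 10.0, n)


def residuals(y, X):
    """Part of y that is left after a least-squares fit on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return y - X @ coef


# Multiple regression: mortality on a constant, log GDP and sanitation.
X_full = np.column_stack([np.ones(n), log_gdp, sanitation])
beta_full, *_ = np.linalg.lstsq(X_full, mortality, rcond=None)

# Partialling out: remove log GDP from both variables, then fit a simple slope.
Z = np.column_stack([np.ones(n), log_gdp])
m_res = residuals(mortality, Z)
s_res = residuals(sanitation, Z)
beta_partial = (s_res @ m_res) / (s_res @ s_res)

print(f"sanitation coefficient, multiple regression: {beta_full[2]:.3f}")
print(f"sanitation coefficient, partialled out:      {beta_partial:.3f}")
```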

34 / 38
Poverty and Inequality: Data
I Data from the World Bank's World Development Indicators (WDI)
data set
I Countries with highest and lowest under-5 mortality rates (per
1000 live births) in 2015

35 / 38
Poverty and Inequality: Data Analysis

Figure: Scatter plots of mortality rate against several contributing
variables (three panels with MORT_UNDER5 on the horizontal axis and
GDP_PERCAPITA, BASIC_WATER and SANITATION on the vertical axes)

36 / 38
Poverty and Inequality: Data Analysis

Figure: Multiple regression results

37 / 38
Poverty and Inequality: Data Analysis

I We used observational data for this analysis


I There are important issues surrounding the implementation of
sanitation and water purification programs: how to persuade
people to participate? How to evaluate the success of such
programs?
I Modern development economics includes a great deal of field
work on program implementation and program evaluation.
I These are important areas of economics where
experimental data are being created and analysed (including
important work by Monash staff and students in Indonesia,
India and Timor-Leste).

38 / 38