
SCHOOL OF ECONOMICS

Econ 2206 Introductory Econometrics

Final Examination

Session 1, 2007

1. TIME ALLOWED - 2 Hours.

2. TOTAL NUMBER OF QUESTIONS - 6.

3. ANSWER ALL QUESTIONS.

4. ALL QUESTIONS ARE OF EQUAL VALUE (the marks awarded to each part of a question are indicated;
the total for this exam is 60 marks).

5. CANDIDATES MAY BRING THEIR OWN CALCULATORS TO THE EXAM

6. STATISTICAL TABLES ARE PROVIDED AT THE END OF THE EXAM PAPER

7. ALL ANSWERS MUST BE WRITTEN IN PEN. PENCILS MAY BE USED ONLY FOR DRAWING,
SKETCHING OR GRAPHICAL WORK.
ANSWER ALL SIX QUESTIONS

REMINDER: When performing statistical tests, always state the null and alternative hypotheses,
the test statistic and its distribution under the null hypothesis, the level of significance
and the conclusion of the test.

Question 1. (10 Marks).

(i) Suppose that the correct population regression model is:

y = β₀ + β₁x₁ + β₂x₂ + u    (1.1)

However, we have data only on y and x₁, and as a consequence we estimate the following model by OLS:

y = β̂₀ + β̂₁x₁ + v    (1.2)

Under what circumstances will the OLS estimator for model (1.2):

(a) provide an unbiased estimate of the true population parameter β₁? (2 marks)

(b) provide an estimate of β₁ that has positive (or upward) bias? (2 marks)

(ii) Outline the advantages of using larger samples of data in regression analysis. (2 marks)

(iii) A model used for analysing the effect of house characteristics on the sale price was:

log(price) = β₀ + β₁ area + β₂ bdrms + β₃ area × bdrms + u

where price is the house price, area is the floor area of the house (measured in square metres), and bdrms
is the number of bedrooms. What is the partial effect on log(price) of increasing area by 1 square metre?
(2 marks).

(iv) What is the meaning of the term “contemporaneous exogeneity” as used in the context of time
series data ? What is the difference between contemporaneous exogeneity and “strict exogeneity” as used in
multiple regression models for time series data ? (2 marks)

Question 2. (10 Marks in total)
The following regression model explains monthly wages as a function of years of education (educ), years
of labour market experience (exper) and current job tenure (tenure):

log(wage) = β₀ + β₁ educ + β₂ exper + β₃ tenure + u    (2.1)

With a random sample of data the following output was obtained using SHAZAM:
Welcome to SHAZAM - Version 10.0
|_sample 1 722
|_read wage educ exper tenure
4 VARIABLES AND 722 OBSERVATIONS STARTING AT OBS 1
|_genr lnwage=log(wage)

|_* Model estimates


|_ols lnwage educ exper tenure

REQUIRED MEMORY IS PAR= 81 CURRENT PAR= 2000


OLS ESTIMATION
722 OBSERVATIONS DEPENDENT VARIABLE= LNWAGE
...NOTE..SAMPLE RANGE SET TO: 1, 722

R-SQUARE = 0.1551 R-SQUARE ADJUSTED = 0.1524


VARIANCE OF THE ESTIMATE-SIGMA**2 = 0.19493
STANDARD ERROR OF THE ESTIMATE-SIGMA = 0.44151
SUM OF SQUARED ERRORS-SSE= 139.96
MEAN OF DEPENDENT VARIABLE = 6.7790
LOG OF THE LIKELIHOOD FUNCTION = -438.839

VARIABLE   ESTIMATED    STANDARD    T-RATIO            PARTIAL  STANDARDIZED  ELASTICITY
NAME       COEFFICIENT  ERROR       718 DF    P-VALUE  CORR.    COEFFICIENT   AT MEANS
EDUC       0.74864E-01  0.6512E-02  11.50     0.000    0.353    0.3905        0.1487
EXPER      0.15328E-01  0.3370E-02  4.549     0.000    0.147    0.1592        0.0261
TENURE     0.13375E-01  0.2587E-02  5.170     0.000    0.167    0.1612        0.0143
CONSTANT   5.4967       0.1105      49.73     0.000    0.852    0.0000        0.8108

(i) What is the interpretation of the coefficient on education, β₁? (2 marks).

(ii) Calculate the exact percentage effect of another year of education on the predicted wage level. (2 marks).

(iii) Test the null hypothesis that all the slope parameters in the model are jointly equal to zero using a 1
percent significance level. What do you conclude? (3 marks).
Note: The F-test statistic based on R² is given by the formula:

F = [(R²_ur − R²_r)/q] / [(1 − R²_ur)/(n − k − 1)]

where q is the number of restrictions, and ur and r stand for unrestricted and restricted models, respectively.
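
As a minimal sketch of how the R²-based formula in the note is evaluated, the following Python function computes F from the two R² values. The function name and the input values are hypothetical and chosen only for illustration; they are not taken from the SHAZAM output in this question.

```python
def f_stat_from_r2(r2_ur: float, r2_r: float, n: int, k: int, q: int) -> float:
    """F statistic for q restrictions, computed from R-squared values.

    r2_ur, r2_r: R-squared of the unrestricted and restricted models.
    n: number of observations.
    k: number of slope parameters in the unrestricted model.
    q: number of restrictions being tested.
    """
    if not 0.0 <= r2_r <= r2_ur < 1.0:
        raise ValueError("expected 0 <= r2_r <= r2_ur < 1")
    numerator = (r2_ur - r2_r) / q             # gain in fit per restriction
    denominator = (1.0 - r2_ur) / (n - k - 1)  # unexplained share per residual df
    return numerator / denominator


# Illustrative (made-up) inputs, not the exam's numbers.
f_value = f_stat_from_r2(r2_ur=0.30, r2_r=0.10, n=100, k=4, q=2)
print(round(f_value, 2))
```

The resulting value would then be compared with the critical value of the F distribution with q numerator and n − k − 1 denominator degrees of freedom.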

(iv) We are interested in constructing a confidence interval for the (conditional) predicted log(wage) when
educ = 13, exper = 11 and tenure = 7. To obtain the standard error for the prediction we need to estimate
a transformed model that is equivalent to (2.1). Derive the transformed model which will give a direct
estimate of the prediction and the standard error of the prediction. (3 marks).

Question 3. (10 Marks in total)
We are interested in analysing the effect of different house characteristics on the market price of houses
in Sydney, and consider the following regression model:

log(price) = β₀ + β₁ log(lotsize) + β₂ log(sqrmtr) + β₃ log(bdrms) + u    (3.1)

where price is the sale price (measured in $1000), lotsize is land area (square metres), sqrmtr is the floor
area of the house (also measured in square metres), and bdrms is the number of bedrooms. Based on a
sample of data from 2005 house sales in Sydney, the following regression estimates were obtained:

log(price) = 0.5481 + 0.7013 log(sqrmtr) + 0.1745 log(lotsize) + 0.0363 log(bdrms)
            (0.3945) (0.0823)             (0.0353)              (0.0932)
n = 108, R² = 0.551, R̄² = 0.538

(i) Construct a 90% confidence interval for β₃ (the coefficient on log(bdrms)). Is zero within the confidence
interval? (3 marks).

(ii) Given the estimation results, would you conclude that this is a good econometric model ? Explain.
(3 marks).

(iii) We are concerned that the model in (3.1) may be misspecified. An alternative model specification where
all the variables are in level form (rather than in log form) is:

price = β₀ + β₁ lotsize + β₂ sqrmtr + β₃ bdrms + u    (3.2)

Outline a procedure for testing whether model (3.1) or model (3.2) is a better specification. What are the
limitations (if any) of the test ? Explain. (4 marks)

Question 4. (10 Marks in total).
In a recent study an economist examined the factors explaining whether a firm was taken over by another
firm during a given year. The dependent variable in the analysis was Takeover, a binary variable
equal to 1 if the firm was taken over (and 0 otherwise). The explanatory variables were profit, which is the firm’s
average profit rate over the previous five years, mktval, which is the market value of the firm (in $100m), and
debtearn, which is the debt-to-earnings ratio. The table below presents coefficient estimates (and standard
errors) based on a sample of 177 firms in 2004.

Table 4.1. Estimation Results for Takeover Models

Dependent Variable: Takeover

Variables            Estimate
profit                 0.251
                      (0.068)
mktval                −0.930
                      (0.287)
debtearn              −0.364
                      (0.249)
constant             −19.21
                      (4.839)
Observations (n)       177
R²                     0.233
Note: The usual OLS standard errors are in () below the coefficient estimates.

(i) What is the interpretation of the coefficient on profit? (2 marks)

(ii) What is the predicted probability of Takeover for a firm with the following characteristics: profit = 0.05,
mktval = 1.5 and debtearn = 6? Briefly explain whether the result is sensible. (2 marks)

(iii) We know the Linear Probability Model must contain “heteroskedasticity”. What is heteroskedasticity
and what are the consequences of heteroskedasticity for:
(a) estimation, and
(b) inference with the standard OLS procedures ?
(2 marks)

(iv) Given that we know the model contains heteroskedasticity, what advice would you give an economist
wishing to analyse the determinants of Takeover with regression methods? (4 marks)

Question 5. (10 Marks in total).
The following regression model was proposed to analyse the effect of the minimum wage on employment:

log(emprte_t) = β₀ + β₁ log(minwg_t) + β₂ log(minwg_{t−1}) + β₃ log(GNP_t) + u_t    (5.1)

where emprte_t is the employment rate, minwg_t is the minimum wage and GNP_t is GNP (a proxy for labour
demand) in year t.

(i) What is the interpretation of the coefficient β₁? (2 marks).

(ii) Is this a “static” or “dynamic” model? What is the purpose of including the lagged term minwg_{t−1}?
Briefly explain. (2 marks).

Using annual data from 1950-1987, the following regression model estimates were obtained:

log(emprte_t) = −7.05 − 0.072 log(minwg_t) − 0.061 log(minwg_{t−1}) − 0.012 log(GNP_t)    (5.2)
                (0.77) (0.031)              (0.015)                  (0.089)
n = 38, R² = 0.661, R̄² = 0.641

(iii) Test the null hypothesis that the lagged term minwg_{t−1} is insignificant using a 10 percent significance
level and the one-sided alternative that the coefficient is negative (H₀: β₂ = 0, H₁: β₂ < 0). (2 marks).

(iv) There is not enough information in the results presented in (5.2) to construct a confidence interval for
the Long Run Propensity (LRP). Rewrite the model in (5.1) in a form which gives you a direct estimate
of the LRP (and the standard error of the LRP). What parameter in this transformed model corresponds
to the LRP? (2 marks).

(v) I am concerned that the model in (5.2) may suffer from the “spurious regression” problem. What is the
spurious regression problem and what simple adjustment to the model would help reduce the possibility of
this problem ? (2 marks).

Question 6. (10 Marks in total).
We are interested in analysing the effect of locating a water desalination plant on local property prices.
Desalination plants are large, industrial sites which can generate a lot of noise pollution and reduce amenities
in the local area. The South Australian government built a desalination plant in the Adelaide area of South
Beach in 1998. Discussion about building a desalination plant in South Beach began after 1994, and the
plant was built and began operating in 1998. We have data on the prices of houses sold in South Beach in
1994 (the “before” period) and another sample on houses sold in 2002 (the “after” period). The hypothesis
we wish to test is that the price of houses located near the site of the desalination plant would fall below the
price of more distant houses.
The data for each year includes the dummy variable nearplant which is equal to one if the house is
located within 3 kilometres of the desalination plant. The variable hprice denotes the real house price
(scaled by $10,000). The following simple regression model was estimated using only the year 2002 sample
of data:

hprice = 21.311 − 6.198 nearplant    (6.1)
        (0.618)  (0.992)
n = 353, R² = 0.212

Using the 1994 sample, the following regression results were obtained:

hprice = 16.527 − 3.679 nearplant    (6.2)
        (0.538)  (0.615)
n = 182, R² = 0.172

(i) What is the interpretation of the coefficient on the intercept term in model (6.2) (that is, what does the
value 16.527 represent) ? What is the interpretation of the coefficient on nearplant in model (6.2) ?
(2 marks)

(ii) Can you infer from the estimates in (6.1), based on the year 2002 data, that the location of the plant
caused the price of houses located nearby to fall by an average of $61,980 ? Explain . (2 marks)

(iii) An alternative approach is to pool the data for both years and estimate the following model:

hprice = 16.527 + 4.7840 year2 − 3.679 nearplant − 2.519 year2 · nearplant    (6.3)
        (0.793)  (0.9471)       (0.876)           (1.128)
n = 535, R² = 0.202

where year2 is a dummy variable equal to one if the observation is for the year 2002 (and is equal to zero if
the observation is for the year 1994).
What is the estimated effect of the plant on neighbouring house prices based on the “difference-in-difference”
estimator ? Is the effect significantly different from 0 at the 5% significance level ? (use the one-sided
alternative hypothesis that the coefficient is negative). (3 marks)
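
As a minimal sketch of how the pooled specification (6.3) relates to the single-year regressions (6.1) and (6.2): in a regression on a single dummy variable, the intercept is the mean of the base group and the slope is the difference between the two group means. The four group means implied by (6.1) and (6.2) therefore reproduce the point estimates of the pooled model. The dictionary and function names below are illustrative only, and the sketch recovers point estimates, not standard errors.

```python
# Coefficients as reported in (6.2) for 1994 and (6.1) for 2002.
est_1994 = {"intercept": 16.527, "nearplant": -3.679}
est_2002 = {"intercept": 21.311, "nearplant": -6.198}


def group_means(est):
    """Return (mean price far from the plant, mean price near the plant)."""
    far = est["intercept"]
    near = est["intercept"] + est["nearplant"]
    return far, near


far_94, near_94 = group_means(est_1994)
far_02, near_02 = group_means(est_2002)

# Coefficients of the pooled model with year2, nearplant and their interaction.
pooled = {
    "constant": far_94,
    "year2": far_02 - far_94,
    "nearplant": near_94 - far_94,
    # Difference-in-differences: change in the near/far gap between the years.
    "year2_x_nearplant": (near_02 - far_02) - (near_94 - far_94),
}

for name, value in pooled.items():
    print(f"{name}: {value:.3f}")
```

The printed values can be checked against the coefficients shown in (6.3); whether the interaction term is statistically significant still has to be judged from its reported standard error.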

(iv) What, if any, would be the advantages of collecting and using panel data to evaluate the effect of the
location of the desalination plant on local property prices ? Explain. (3 marks).

Table 1. Critical Values of the t Distribution

                       Significance Level
1-Tailed:   0.10     0.05     0.025    0.01     0.005
2-Tailed:   0.20     0.10     0.05     0.02     0.01
Degrees of
Freedom
  1         3.078    6.314    12.706   31.821   63.656
  2         1.886    2.920    4.303    6.965    9.925
  3         1.638    2.353    3.182    4.541    5.841
  4         1.533    2.132    2.776    3.747    4.604
  5         1.476    2.015    2.571    3.365    4.032
  6         1.440    1.943    2.447    3.143    3.707
  7         1.415    1.895    2.365    2.998    3.499
  8         1.397    1.860    2.306    2.896    3.355
  9         1.383    1.833    2.262    2.821    3.250
 10         1.372    1.812    2.228    2.764    3.169
 11         1.363    1.796    2.201    2.718    3.106
 12         1.356    1.782    2.179    2.681    3.055
 13         1.350    1.771    2.160    2.650    3.012
 14         1.345    1.761    2.145    2.624    2.977
 15         1.341    1.753    2.131    2.602    2.947
 16         1.337    1.746    2.120    2.583    2.921
 17         1.333    1.740    2.110    2.567    2.898
 18         1.330    1.734    2.101    2.552    2.878
 19         1.328    1.729    2.093    2.539    2.861
 20         1.325    1.725    2.086    2.528    2.845
 21         1.323    1.721    2.080    2.518    2.831
 22         1.321    1.717    2.074    2.508    2.819
 23         1.319    1.714    2.069    2.500    2.807
 24         1.318    1.711    2.064    2.492    2.797
 25         1.316    1.708    2.060    2.485    2.787
 26         1.315    1.706    2.056    2.479    2.779
 27         1.314    1.703    2.052    2.473    2.771
 28         1.313    1.701    2.048    2.467    2.763
 29         1.311    1.699    2.045    2.462    2.756
 30         1.310    1.697    2.042    2.457    2.750
 40         1.303    1.684    2.021    2.423    2.704
 60         1.296    1.671    2.000    2.390    2.660
 90         1.291    1.662    1.987    2.368    2.632
120         1.289    1.658    1.980    2.358    2.617
  ∞         1.282    1.645    1.960    2.326    2.576
Example: The 1% critical value for a one-tailed test with 25 df is 2.485. The 5% critical value for a two-tailed
test with large (>120) df is 1.960.

Table 2. 1% Critical Values of the F Distribution

Denominator                  Numerator Degrees of Freedom
Degrees of
Freedom      1       2      3      4      5      6      7      8      9      10
 10        10.04   7.56   6.55   5.99   5.64   5.39   5.20   5.06   4.94   4.85
 11         9.65   7.21   6.22   5.67   5.32   5.07   4.89   4.74   4.63   4.54
 12         9.33   6.93   5.95   5.41   5.06   4.82   4.64   4.50   4.39   4.30
 13         9.07   6.70   5.74   5.21   4.86   4.62   4.44   4.30   4.19   4.10
 14         8.86   6.51   5.56   5.04   4.69   4.46   4.28   4.14   4.03   3.94
 15         8.68   6.36   5.42   4.89   4.56   4.32   4.14   4.00   3.89   3.80
 16         8.53   6.23   5.29   4.77   4.44   4.20   4.03   3.89   3.78   3.69
 17         8.40   6.11   5.19   4.67   4.34   4.10   3.93   3.79   3.68   3.59
 18         8.29   6.01   5.09   4.58   4.25   4.01   3.84   3.71   3.60   3.51
 19         8.18   5.93   5.01   4.50   4.17   3.94   3.77   3.63   3.52   3.43
 20         8.10   5.85   4.94   4.43   4.10   3.87   3.70   3.56   3.46   3.37
 21         8.02   5.78   4.87   4.37   4.04   3.81   3.64   3.51   3.40   3.31
 22         7.95   5.72   4.82   4.31   3.99   3.76   3.59   3.45   3.35   3.26
 23         7.88   5.66   4.76   4.26   3.94   3.71   3.54   3.41   3.30   3.21
 24         7.82   5.61   4.72   4.22   3.90   3.67   3.50   3.36   3.26   3.17
 25         7.77   5.57   4.68   4.18   3.85   3.63   3.46   3.32   3.22   3.13
 26         7.72   5.53   4.64   4.14   3.82   3.59   3.42   3.29   3.18   3.09
 27         7.68   5.49   4.60   4.11   3.78   3.56   3.39   3.26   3.15   3.06
 28         7.64   5.45   4.57   4.07   3.75   3.53   3.36   3.23   3.12   3.03
 29         7.60   5.42   4.54   4.04   3.73   3.50   3.33   3.20   3.09   3.00
 30         7.56   5.39   4.51   4.02   3.70   3.47   3.30   3.17   3.07   2.98
 40         7.31   5.18   4.31   3.83   3.51   3.29   3.12   2.99   2.89   2.80
 60         7.08   4.98   4.13   3.65   3.34   3.12   2.95   2.82   2.72   2.63
 90         6.93   4.85   4.01   3.53   3.23   3.01   2.84   2.72   2.61   2.52
120         6.85   4.79   3.95   3.48   3.17   2.96   2.79   2.66   2.56   2.47
  ∞         6.63   4.61   3.78   3.32   3.02   2.80   2.64   2.51   2.41   2.32
Example: The 1% critical value for numerator df = 3 and denominator df = 60 is 4.13.
