
Econ 531 Midterm Examination

Fall 2016

1) (21 points) Suppose we have the following population equation in vector form: $y = x\beta + u$, where $x$ is a
$1 \times K$ vector of regressors and $\beta = (\beta_1, \beta_2, \ldots, \beta_K)'$ is a $K \times 1$ vector. (Here $x_1 = 1$.) We obtain a sample of
size $N$ from the population to estimate $\beta$. Thus, $\{(x_i, y_i) : i = 1, 2, \ldots, N\}$ are independent, identically
distributed random variables, where $x_i$ is $1 \times K$ and $y_i$ is a scalar. For each observation $i$, we have
$y_i = x_i\beta + u_i$. (YOU NEED TO BE CAREFUL ABOUT THE DETAILS IN THIS QUESTION!)
a) Derive the OLS estimator (by taking expectations). What assumptions do you make?
b) Show that the OLS estimator is consistent for $\beta$.
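
For intuition on part b), a minimal simulation sketch follows. It assumes K = 2 (an intercept plus one regressor), hypothetical true values of 1.0 and 2.0 for the intercept and slope, and x and u drawn as independent standard normals so that the regressor is uncorrelated with the error. None of these choices come from the exam, and the function names `ols_simple` and `simulate` are illustrative only.

```python
import random

def ols_simple(xs, ys):
    """OLS for y = b0 + b1*x + u, using sample moments in place of population moments."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1

def simulate(n, beta0=1.0, beta1=2.0, seed=0):
    """Draw an i.i.d. sample of size n (assumed design) and return the OLS estimates."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    us = [rng.gauss(0.0, 1.0) for _ in range(n)]  # independent of x by construction
    ys = [beta0 + beta1 * x + u for x, u in zip(xs, us)]
    return ols_simple(xs, ys)

# The estimates should settle near (1.0, 2.0) as the sample size grows.
for n in (100, 10_000, 100_000):
    b0_hat, b1_hat = simulate(n)
    print(n, round(b0_hat, 3), round(b1_hat, 3))
```

A simulation like this only illustrates consistency for one assumed design; it does not replace the argument based on the law of large numbers that the question asks for.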

2) (8 points, True/False). Provide complete explanations.
a) $E(u \mid x) = 0$ is a sufficient condition for $E(xu) = 0$.
b) If $\sqrt{N}(\hat{\beta}_N - \beta) = O_p(1)$, then $(\hat{\beta}_N - \beta) = o_p(N^{-c})$ for any $0 \le c < 1/2$.

3) (7 points) Suppose we run the following regression:
$\log(wage) = \beta_0 + \beta_1 s + \beta_2 fem + \beta_3 fem \cdot s + u$
where $s$ denotes years of schooling, $fem$ is a dummy variable for female, and $u$ is the error term.
a) Provide interpretations of the coefficients $\beta_1$, $\beta_2$, and $\beta_3$.
b) What are the returns to schooling for men and for women?

4) (7 points) Consider estimating the effect of personal computer ownership, as represented by a
binary variable, PC, on college GPA, colGPA. With data on SAT scores and high school GPA you
postulate the model
$colGPA = \beta_0 + \beta_1 hsGPA + \beta_2 SAT + \beta_3 PC + u$
a) Why might $u$ and PC be positively correlated?
b) If the given equation is estimated by OLS using a random sample of college students, is $\hat{\beta}_3$ likely
to have an upward or downward asymptotic bias?
c) What are some variables that might be good proxies for the unobservables in $u$ that are correlated
with PC?

5) (7 points)
a) Explain the selection problem in words with an example.
b) Let $Y_i$ denote the outcome of interest (e.g. health status) and $D_i \in \{0, 1\}$ the treatment variable
(e.g. hospital visit). Can we find the causal effect of a hospital visit on health by calculating
$E(Y_i \mid D_i = 1) - E(Y_i \mid D_i = 0)$? Show the selection bias.
c) How could we solve this problem?
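
The comparison in part b) can be unpacked with potential-outcomes notation, which is an assumption here because the exam does not define it: let $Y_{1i}$ and $Y_{0i}$ denote individual $i$'s outcome with and without treatment, so that $Y_i = Y_{0i} + D_i (Y_{1i} - Y_{0i})$. Under that notation, one standard way to write the decomposition is:

```latex
\begin{align*}
E(Y_i \mid D_i = 1) - E(Y_i \mid D_i = 0)
  &= E(Y_{1i} \mid D_i = 1) - E(Y_{0i} \mid D_i = 0) \\
  &= \underbrace{E(Y_{1i} - Y_{0i} \mid D_i = 1)}_{\text{average effect on the treated}}
   + \underbrace{E(Y_{0i} \mid D_i = 1) - E(Y_{0i} \mid D_i = 0)}_{\text{selection bias}}
\end{align*}
```

The second line adds and subtracts $E(Y_{0i} \mid D_i = 1)$. The selection-bias term vanishes when $D_i$ is independent of $Y_{0i}$, for example under random assignment.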

6) (8 points) Suppose that you want to estimate the returns to schooling using data on 1,000 twins.
The data include information on the twins' wages and years of schooling, as well as other background
variables such as parental characteristics. We assume that both twins in a pair have the same level of
ability. How could we use panel data estimation methods to estimate the returns to schooling with
these data?
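
A small simulation can show what a within-pair approach buys in this setting. The sketch below assumes a hypothetical wage equation in which log wage equals 0.10 times schooling plus a shared ability term plus noise, and in which ability also raises schooling. All parameter values and the names `simulate_twins`, `ols_slope`, and `slope_through_origin` are illustrative and do not come from the exam.

```python
import random

def simulate_twins(n_pairs=500, beta=0.10, seed=1):
    """Simulate twin pairs under an assumed design; return pooled and within-pair data."""
    rng = random.Random(seed)
    pooled_s, pooled_w, diff_s, diff_w = [], [], [], []
    for _ in range(n_pairs):
        ability = rng.gauss(0.0, 1.0)  # identical for both twins in the pair
        pair = []
        for _ in range(2):
            s = 12.0 + ability + rng.gauss(0.0, 1.0)  # schooling rises with ability
            w = beta * s + 0.5 * ability + rng.gauss(0.0, 0.1)  # log wage
            pair.append((s, w))
            pooled_s.append(s)
            pooled_w.append(w)
        # Differencing within the pair removes the shared ability term.
        diff_s.append(pair[0][0] - pair[1][0])
        diff_w.append(pair[0][1] - pair[1][1])
    return pooled_s, pooled_w, diff_s, diff_w

def ols_slope(xs, ys):
    """Slope from a regression of y on x with an intercept."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return sxy / sum((x - x_bar) ** 2 for x in xs)

def slope_through_origin(xs, ys):
    """Slope from a regression of y on x without an intercept."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)

pooled_s, pooled_w, diff_s, diff_w = simulate_twins()
print("pooled OLS slope:", round(ols_slope(pooled_s, pooled_w), 3))
print("within-pair slope:", round(slope_through_origin(diff_s, diff_w), 3))
```

In this assumed design the pooled slope overstates the return to schooling because ability is omitted, while the within-pair slope stays close to the assumed value of 0.10.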

7) (13 points) Suppose that you want to conduct a study measuring the effect of Syrian immigrants
on the Turkish labor market.
a) Design and explain a difference-in-differences methodology to measure this effect. Explain the
data you would use. What are your control and treatment groups? Write down the regression
specification you would use, clearly defining each variable.
b) Discuss the identification assumptions in this study. Do you have a good control group? Would
you be able to check it?
c) Would you be able to add other control variables to this regression? What are the advantages
of doing so?
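
As a minimal illustration of the difference-in-differences logic in part a), the sketch below computes the two-by-two estimate from group-by-period means. The numbers are made up for illustration and are not real Turkish labor market data, and the names `did_estimate`, `treated`, and `control` are hypothetical.

```python
def did_estimate(means):
    """means maps (group, period) to the average outcome in that cell."""
    change_treated = means[("treated", "post")] - means[("treated", "pre")]
    change_control = means[("control", "post")] - means[("control", "pre")]
    # The control group's change stands in for the treated group's counterfactual trend.
    return change_treated - change_control

# Hypothetical average employment rates by region type and period.
example = {
    ("treated", "pre"): 0.50,
    ("treated", "post"): 0.46,
    ("control", "pre"): 0.48,
    ("control", "post"): 0.47,
}
print(round(did_estimate(example), 3))
```

In a regression of the outcome on a treatment-group dummy, a post-period dummy, and their interaction, with no other controls, the OLS coefficient on the interaction equals this difference of differences.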

8) (8 points) Suppose that you want to estimate the effect of the number of children on the female labor force
participation rate. Would you get consistent estimates if you ran OLS? What is the problem?
Suppose that in Turkey parents whose first child is a girl are more likely to have a second child.
Would this help you in solving this problem? How? Under what assumptions? Would these
assumptions hold in this case?

9) (True/False, 21 points). In either case, provide sufficient explanations.


a) Heteroskedasticity violates the unbiasedness of OLS estimators.
b) Suppose that in the following regression, $u_i$ includes an ability variable that is positively correlated with
both wage and education. Then, $E(\hat{\beta}_1) > \beta_1$ necessarily holds.
$wage_i = \beta_0 + \beta_1 educ_i + u_i$
c) In the case of a sample correlation coefficient of .95 between two independent variables both included
in the model, OLS t statistics are invalid.
d) Consider the following regression:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$
Adding an irrelevant variable, $x_3$, does not change the variance of $\hat{\beta}_1$.
e) Suppose you run the following regression
$crime\_rate = \beta_0 + \beta_1 police + u$
and find a negative and statistically significant $\hat{\beta}_1$. Then, we must have made a mistake in the regression.
f) Suppose that $E(u \mid x) = 0$ holds. We take a sample of 1,000 and run the following regression:
$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + u$
Our estimate is $\hat{\beta}_1 = 0.52$, whereas the true $\beta_1 = 0.5$. Then, we must have made a mistake in the estimation.
g) Suppose we regress the price of a house (price) on its square footage (sqrft) and number of
bedrooms (bdrms) and get the following results using a sample of 123 houses. The negative sign on bdrms
shows that there is something wrong in the regression.
$\log(price) = \underset{(1.15)}{7.46} + \underset{(.18)}{.65}\,\log(sqrft) - \underset{(.04)}{.08}\,bdrms$
