
Econ 140

Binary Response
Lecture 21


Today's plan


Three models:
Linear probability model
Probit model
Logit model
L21.xls provides an example of a linear probability model and a logit model


Discrete choice variable


Defining variables:
Yi = 1 if individual:
Takes BART
Buys a car
Joins a union
Yi = 0 if individual:
Does not take BART
Does not buy a car
Does not join a union

The discrete choice variable Yi is a function of individual characteristics: Yi = a + bXi + ei


Graphical representation


X = years of labor market experience
Y = 1 [if person joins union]
= 0 [if person doesn't join union]
[Figure: observed data (Y equal to 0 or 1) plotted against X, with the OLS regression line]

Linear probability model


The OLS regression line in the previous slide is called the linear probability model: it predicts the probability that an individual will join a union given their years of labor market experience
Using the linear probability model, we estimate the equation:
$\hat{Y} = \hat{a} + \hat{b}X$
using $\hat{a}$ and $\hat{b}$ we can predict the probability
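As a minimal sketch of this step, the snippet below fits the linear probability model by OLS and uses the estimates to predict a probability. The data are hypothetical (they are not the contents of L21.xls), and the helper names `ols_fit` and `predicted_probability` are made up for illustration.

```python
# Hypothetical data: years of experience (X) and union membership (Y = 0 or 1).
X = [1, 3, 5, 8, 10, 12, 15, 20, 25, 30]
Y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

def ols_fit(x, y):
    """Return (a_hat, b_hat) for the simple regression y = a + b*x."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    b_hat = sxy / sxx
    a_hat = y_bar - b_hat * x_bar
    return a_hat, b_hat

a_hat, b_hat = ols_fit(X, Y)

def predicted_probability(x):
    """Fitted value of the linear probability model at experience x."""
    return a_hat + b_hat * x

print("a_hat =", round(a_hat, 4), "b_hat =", round(b_hat, 4))
print("Predicted probability at 10 years:", round(predicted_probability(10), 3))
```

The fitted value is read directly as a probability, which is what makes the model easy to interpret.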

Linear probability model (2)


Problems with the linear probability model
1) Predicted probabilities don't necessarily lie within the 0 to 1 range
2) We get a very specific form of heteroskedasticity
errors for this model are $e_i = Y_i - \hat{Y}_i$
note: $\hat{Y}_i$ values are along the continuous OLS line, but $Y_i$ values jump between 0 and 1 - this creates large variation in errors
3) Errors are non-normal
We can use the linear probability model as a first guess: its estimates can be used for start values in a maximum likelihood problem
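The first two problems can be illustrated with a short sketch. The fitted values `a_hat` and `b_hat` below are hypothetical numbers chosen only for illustration. The variance formula p(1 - p) is the standard variance of a 0/1 outcome with success probability p, which is why the error variance changes with X.

```python
# Hypothetical fitted values, chosen only for illustration.
a_hat, b_hat = -0.02, 0.04

def lpm_prediction(x):
    """Fitted value of the linear probability model at x."""
    return a_hat + b_hat * x

def error_variance(p):
    """Variance of e = Y - p when Y is 0/1 with Pr(Y = 1) = p (0 <= p <= 1)."""
    return p * (1 - p)

for x in [0, 5, 13, 20, 30]:
    p = lpm_prediction(x)
    inside = 0 <= p <= 1
    # With Y in {0, 1}, the error can only take the values 1 - p or -p.
    print(x, round(p, 3), "inside [0, 1]:", inside,
          "possible errors:", (round(1 - p, 3), round(-p, 3)))

# The error variance depends on the fitted probability, hence on X.
for p in [0.1, 0.5, 0.9]:
    print("p =", p, "error variance =", round(error_variance(p), 3))
```

With these illustrative values the prediction is negative at X = 0 and exceeds 1 at X = 30, and the error variance is largest where the fitted probability is near 0.5.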


McFadden's Contribution


Suggestion: curve that runs strictly between 0 and 1 and tails off at the boundaries like so:
[Figure: S-shaped curve for Y that stays between 0 and 1]

McFadden's Contribution (2)


Recall the probability density function (PDF) and cumulative distribution function (CDF) for a standard normal:
[Figure: standard normal PDF and CDF; the CDF rises from 0 to 1]

Probit model


For the standard normal, we have the probit model, which takes its probabilities from the CDF
The density function for the normal is:
$f(Z) = \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} Z^{2}\right)$
where Z = a + bX
For the probit model, we want to find
$\Pr(Y_i = 1) = F(Z_i)$
$f(Z_i)$ = PDF, $F(Z_i)$ = CDF
$\Pr(Z \le z)$ = CDF
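A minimal sketch of these two functions, using only the Python standard library. The CDF is written with `math.erf`, a standard identity for the standard normal. The coefficients passed to `probit_probability` are hypothetical and serve only to show how Z = a + bX feeds into F.

```python
import math

def normal_pdf(z):
    """Standard normal density f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def normal_cdf(z):
    """Standard normal CDF F(z) = Pr(Z <= z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def probit_probability(x, a, b):
    """Pr(Y = 1) under the probit model with Z = a + b*x."""
    return normal_cdf(a + b * x)

# Hypothetical coefficients, for illustration only.
a, b = -0.6, 0.03
for x in [0, 10, 20, 30]:
    print(x, round(probit_probability(x, a, b), 3))
```

Because F is a CDF, every predicted probability lies strictly between 0 and 1, whatever the value of Z.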

Probit model (2)


The probit model imposes the distributional form of the CDF in order to estimate a and b
The values a and b have to be estimated as part of the maximum likelihood procedure


Logit model


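The logit model uses the logistic distribution
Density: $g(z) = \frac{e^{z}}{(1 + e^{z})^{2}}$
Cumulative: $G(z) = \frac{1}{1 + e^{-z}} = \frac{e^{z}}{1 + e^{z}}$
[Figure: the standard normal CDF F(Z) and the logistic CDF G(Z), both running from 0 to 1]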
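The sketch below writes out the logistic CDF and density and tabulates the logistic CDF next to the standard normal CDF at a few points. The density is coded as G(z)(1 - G(z)), which is algebraically the same as e^z / (1 + e^z)^2. Function names are illustrative.

```python
import math

def logistic_cdf(z):
    """Logistic CDF G(z) = 1 / (1 + e^(-z))."""
    return 1 / (1 + math.exp(-z))

def logistic_pdf(z):
    """Logistic density g(z) = G(z) * (1 - G(z))."""
    G = logistic_cdf(z)
    return G * (1 - G)

def normal_cdf(z):
    """Standard normal CDF F(z), via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

print("z, logistic G(z), normal F(z)")
for z in [-3, -1, 0, 1, 3]:
    print(z, round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

Both curves pass through 0.5 at z = 0 and flatten towards 0 and 1, but the logistic has heavier tails, so the two sets of coefficients are not on the same scale.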

Maximum likelihood


An alternative estimation method that assumes you know the form of the population distribution
Using maximum likelihood, we specify the model as part of the distribution


Maximum likelihood (2)


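For example: Bernoulli distribution (with a parameter $\theta$) where:
$\Pr(Y = 1) = \theta$
$\Pr(Y = 0) = 1 - \theta$
We have an outcome
1110000100
The probability expression is:
$\theta^{3} (1-\theta)^{4} \theta^{1} (1-\theta)^{2} = \theta^{4} (1-\theta)^{6}$
$\hat{\theta} = 0.4$
We pick a sample of $Y_1, \ldots, Y_n$
$\Pr(Y_i = 1) = \theta$
$\Pr(Y_i = 0) = 1 - \theta$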
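To see why 0.4 is the natural estimate for this outcome, the sketch below evaluates the probability of the sequence 1110000100 over a grid of candidate parameter values and picks the largest. The grid search is a simple stand-in chosen for illustration, not the method used in the lecture.

```python
outcome = [1, 1, 1, 0, 0, 0, 0, 1, 0, 0]

def likelihood(theta, ys):
    """Probability of the observed 0/1 sequence given Pr(Y = 1) = theta."""
    value = 1.0
    for y in ys:
        value *= theta if y == 1 else (1 - theta)
    return value

# Candidate values 0.01, 0.02, ..., 0.99.
grid = [i / 100 for i in range(1, 100)]
theta_best = max(grid, key=lambda t: likelihood(t, outcome))

print("theta maximizing the likelihood on the grid:", theta_best)
print("likelihood at that value:", likelihood(theta_best, outcome))
```

The maximizing value matches the share of ones in the sample (4 out of 10).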

Maximum likelihood (3)


Probability of getting observed $Y_i$ is based on the form we've assumed:
$\theta^{Y_i} (1-\theta)^{1-Y_i}$
If we multiply across the observed sample:
$\prod_{i=1}^{n} \theta^{Y_i} (1-\theta)^{(1-Y_i)}$
Given we think that an outcome of one occurs r times:
$\theta^{r} (1-\theta)^{(n-r)}$


Maximum likelihood (3)


If we take logs, we get
$L = r \log\theta + (n - r) \log(1 - \theta)$
This is the log-likelihood
We can differentiate this and obtain a solution for $\theta$
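The differentiation step can be written out as follows. This is a sketch of the standard derivation for the log-likelihood r log θ + (n - r) log(1 - θ); the slide itself only states that a solution can be obtained.

```latex
\frac{dL}{d\theta} = \frac{r}{\theta} - \frac{n - r}{1 - \theta} = 0
\quad\Longrightarrow\quad r(1 - \theta) = (n - r)\,\theta
\quad\Longrightarrow\quad \hat{\theta} = \frac{r}{n}
```

For the outcome 1110000100, r = 4 and n = 10, so the estimate is 4/10 = 0.4.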


Maximum likelihood (4)


In a more complex example, the logit model gives
$\Pr(Y_i = 1) = G(Z_i)$, where $Z_i = a + bX_i$
$\Pr(Y_i = 0) = 1 - G(Z_i)$
Instead of looking for estimates of $\theta$ we are looking for estimates of a and b
Think of $G(Z_i)$ as $\theta$: we get a log-likelihood
$L(a, b) = \sum_i [Y_i \log(G_i) + (1 - Y_i) \log(1 - G_i)]$
solve for a and b
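A minimal sketch of this log-likelihood as a function of a and b. The data are hypothetical (not the contents of L21.xls) and the candidate values of a and b are arbitrary, chosen only to show that the function ranks parameter values.

```python
import math

# Hypothetical data: years of experience (X) and union membership (Y).
X = [1, 3, 5, 8, 10, 12, 15, 20, 25, 30]
Y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]

def logistic_cdf(z):
    """G(z) = 1 / (1 + e^(-z))."""
    return 1 / (1 + math.exp(-z))

def log_likelihood(a, b, xs, ys):
    """Sum over i of Y_i*log(G_i) + (1 - Y_i)*log(1 - G_i), with Z_i = a + b*X_i."""
    total = 0.0
    for x, y in zip(xs, ys):
        G = logistic_cdf(a + b * x)
        total += y * math.log(G) + (1 - y) * math.log(1 - G)
    return total

# Two arbitrary candidate parameter pairs.
print(log_likelihood(0.0, 0.0, X, Y))
print(log_likelihood(-2.0, 0.15, X, Y))
```

The pair with the higher (less negative) log-likelihood fits the sample better; maximum likelihood searches for the pair that makes this value as large as possible.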


Example


Data on union membership and years of labor market experience (L21.xls)
To build the maximum likelihood form, we can think of:
intercept: a
coefficient on experience: b
There are three columns
Predicted value Z
Estimated probability (on the CDF)
Estimated likelihood as given by the model
The Solver from the Tools menu calculates estimates of a and b


Example (2)


How the solver works:
Defining a and b using start values
Choose start values of a and b equal to zero
Define our model: Z = a + bX
Define the predicted probabilities: $G(z) = \frac{1}{1 + e^{-z}}$
Define the log-likelihood and sum it
Can use Solver to change the values of a and b so that the summed log-likelihood is as large as possible
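The steps above can be sketched in Python. Two assumptions are worth stating plainly: the data are hypothetical rather than the contents of L21.xls, and the search uses Newton's method as a stand-in, which is not necessarily the algorithm the spreadsheet Solver applies.

```python
import math

# Hypothetical data, not the contents of L21.xls.
X = [1, 3, 5, 8, 10, 12, 15, 20, 25, 30]   # years of experience
Y = [0, 0, 0, 1, 0, 0, 1, 1, 1, 1]         # 1 = union member

def G(z):
    """Logistic CDF: the predicted probability."""
    return 1 / (1 + math.exp(-z))

def log_likelihood(a, b):
    """Sum of Y*log(G) + (1 - Y)*log(1 - G) over the sample."""
    return sum(
        y * math.log(G(a + b * x)) + (1 - y) * math.log(1 - G(a + b * x))
        for x, y in zip(X, Y)
    )

def newton_update(a, b):
    """One Newton step on the log-likelihood; returns the proposed (a, b)."""
    grad_a = grad_b = 0.0
    h_aa = h_ab = h_bb = 0.0               # entries of the negative Hessian
    for x, y in zip(X, Y):
        p = G(a + b * x)
        grad_a += y - p
        grad_b += (y - p) * x
        w = p * (1 - p)
        h_aa += w
        h_ab += w * x
        h_bb += w * x * x
    det = h_aa * h_bb - h_ab ** 2
    step_a = (h_bb * grad_a - h_ab * grad_b) / det
    step_b = (h_aa * grad_b - h_ab * grad_a) / det
    return a + step_a, b + step_b

a, b = 0.0, 0.0                             # start values of zero
for _ in range(50):
    new_a, new_b = newton_update(a, b)
    converged = abs(new_a - a) < 1e-10 and abs(new_b - b) < 1e-10
    a, b = new_a, new_b
    if converged:
        break

print("a =", round(a, 4), "b =", round(b, 4))
print("log-likelihood =", round(log_likelihood(a, b), 4))
```

Starting from zero mirrors the slide's choice of start values; the linear probability estimates would be another reasonable starting point.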


Comparing parameters


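How do we compare parameters across these models?
The linear probability form is: Y = a + bX
where $\frac{\partial \Pr}{\partial X} = b$
Recall the graphs associated with each model
Consequently $\frac{\partial \Pr}{\partial X} = g(Z_i)\, b$
This has the same form for the probit (with the normal density f) and the logit (with the logistic density g)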
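A short sketch of the comparison: the linear probability slope is constant, while the probit and logit slopes are the density at Z times b and therefore change with X. All coefficient values below are hypothetical and the function names are illustrative.

```python
import math

def normal_pdf(z):
    """Standard normal density f(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2 * math.pi)

def logistic_pdf(z):
    """Logistic density g(z) = G(z) * (1 - G(z))."""
    G = 1 / (1 + math.exp(-z))
    return G * (1 - G)

def probit_slope(a, b, x):
    """dPr/dX for the probit model at X = x."""
    return normal_pdf(a + b * x) * b

def logit_slope(a, b, x):
    """dPr/dX for the logit model at X = x."""
    return logistic_pdf(a + b * x) * b

# Hypothetical coefficients, for illustration only.
b_lpm = 0.01
a_probit, b_probit = -0.6, 0.03
a_logit, b_logit = -1.0, 0.05

print("x, LPM slope, probit slope, logit slope")
for x in [0, 10, 20, 30]:
    print(x, b_lpm,
          round(probit_slope(a_probit, b_probit, x), 4),
          round(logit_slope(a_logit, b_logit, x), 4))
```

The raw coefficients b are not directly comparable across the three models; the slopes evaluated at a common X are.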


L21.xls example


Predicting the linear probability model:
U = 0.281 + 0.005 EXPER
Note the value of the estimated coefficient (b) = 0.005


For the logit form:
use the logistic distribution:
$G(z) = \frac{e^{z}}{1 + e^{z}}$, with density $g(z) = G(z)\,(1 - G(z))$
logit estimated equation is:
Z = U = -0.923 + 0.020 EXPER

L21.xls example (2)

At 20 years of experience:
Z = U = -0.923 + 0.020(20) = -0.523
e^Z = e^(-0.523) = 0.590
G(Z) = 0.590/(1 + 0.590) = 0.371
g(Z) = 0.371 x (1 - 0.371) = 0.233
Thus the slope at 20 years of experience is:
0.233 x 0.020 = 0.005
Note the similarity (OLS value = 0.005), but for other
examples the difference can be notable.
Most software (e.g. STATA) will give the coefficient from
the logit, or the differential slope.
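The arithmetic of this example can be checked with a few lines of Python. This is a sketch that assumes the slope is the logistic density G(Z)(1 - G(Z)) multiplied by b, and it uses only the rounded coefficients reported on the slides, so results may differ from the slide figures in the third decimal place.

```python
import math

a, b = -0.923, 0.020    # logit estimates reported in the lecture
b_lpm = 0.005           # linear probability slope reported in the lecture
exper = 20              # years of experience

Z = a + b * exper
G = math.exp(Z) / (1 + math.exp(Z))   # estimated probability of union membership
g = G * (1 - G)                       # logistic density at Z
slope = g * b                         # change in probability per extra year

print("Z =", round(Z, 3))
print("G(Z) =", round(G, 3))
print("g(Z) =", round(g, 3))
print("logit slope =", round(slope, 4), "vs linear probability slope =", b_lpm)
```

Repeating the calculation at other values of `exper` shows how the logit slope moves along the curve while the linear probability slope stays fixed.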