[Figure: the linear probability model, p = β1 + β2·Xi — the fitted line can rise above 1 or fall below 0]
The linear probability model may make the nonsense predictions that an event will occur
with probability greater than 1 or less than 0.
p = F(Z) = 1 / (1 + e^(−Z)),   Z = β1 + β2·X

[Figure: the logistic curve F(Z), rising from 0 to 1 as Z increases]
The usual way of avoiding this problem is to hypothesize that the probability is a sigmoid
(S-shaped) function of Z, F(Z), where Z is a function of the explanatory variables.
Several mathematical functions are sigmoid in character. One is the logistic function shown here. As Z goes to infinity, e^(−Z) goes to 0 and p goes to 1 (but cannot exceed 1). As Z goes to minus infinity, e^(−Z) goes to infinity and p goes to 0 (but cannot fall below 0).
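These limits can be checked numerically. A minimal Python sketch (the helper name logistic is ours, not from the text):

```python
import math

def logistic(z):
    # F(Z) = 1 / (1 + e^(-Z))
    return 1.0 / (1.0 + math.exp(-z))

# p stays strictly between 0 and 1, approaching the bounds at the extremes
for z in (-8, -2, 0, 2, 8):
    p = logistic(z)
    assert 0.0 < p < 1.0
    print(f"Z = {z:+d}  p = {p:.4f}")
```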
The model implies that, for values of Z less than -2, the probability of the event occurring is
low and insensitive to variations in Z. Likewise, for values greater than 2, the probability is
high and insensitive to variations in Z.
To differentiate p = F(Z) = 1 / (1 + e^(−Z)), write it as a quotient Y = U/V with U = 1 and V = 1 + e^(−Z). The general rule for differentiating a quotient is

  dY/dZ = (V·dU/dZ − U·dV/dZ) / V²

Here dU/dZ = 0 and dV/dZ = −e^(−Z), so

  dp/dZ = [(1 + e^(−Z))·0 − 1·(−e^(−Z))] / (1 + e^(−Z))² = e^(−Z) / (1 + e^(−Z))²
To obtain an expression for the sensitivity, we differentiate F(Z) with respect to Z. The box
gives the general rule for differentiating a quotient and applies it to F(Z).
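The result can be checked numerically by comparing the analytical derivative with a central finite difference (a sketch; the helper names are ours):

```python
import math

def F(z):
    # F(Z) = 1 / (1 + e^(-Z))
    return 1.0 / (1.0 + math.exp(-z))

def f(z):
    # dp/dZ from the quotient rule: e^(-Z) / (1 + e^(-Z))^2
    return math.exp(-z) / (1.0 + math.exp(-z)) ** 2

h = 1e-6
for z in (-2.0, 0.0, 1.5):
    numeric = (F(z + h) - F(z - h)) / (2.0 * h)
    assert abs(numeric - f(z)) < 1e-8  # analytical and numerical slopes agree
```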
dp/dZ = f(Z) = e^(−Z) / (1 + e^(−Z))²

[Figure: the marginal function f(Z), peaking at Z = 0]
The sensitivity, as measured by the slope, is greatest when Z is 0. The marginal function,
f(Z), reaches a maximum at this point.
For a nonlinear model of this kind, maximum likelihood estimation is much superior to the
use of the least squares principle for estimating the parameters. More details concerning
its application are given at the end of this sequence.
p = F(Z) = 1 / (1 + e^(−Z)),   Z = β1 + β2·ASVABC
We will apply this model to the graduating from high school example described in the linear
probability model sequence. We will begin by assuming that ASVABC is the only relevant
explanatory variable, so Z is a simple function of it.
Iteration 0:  log likelihood = -162.29468
Iteration 1:  log likelihood = -132.97646
Iteration 2:  log likelihood = -117.99291
Iteration 3:  log likelihood = -117.36084
Iteration 4:  log likelihood = -117.35136
Iteration 5:  log likelihood = -117.35135

Logit Estimates                              Number of obs =       570
                                             chi2(1)       =     89.89
                                             Prob > chi2   =    0.0000
Log Likelihood = -117.35135                  Pseudo R2     =    0.2769

------------------------------------------------------------------------------
        grad |     Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
---------+--------------------------------------------------------------------
      asvabc |  .1666022    .0211265    7.886   0.000     .1251951    .2080094
       _cons | -5.003779    .8649213   -5.785   0.000    -6.698993   -3.308564
------------------------------------------------------------------------------
The Stata command is logit, followed by the outcome variable and the explanatory
variable(s). Maximum likelihood estimation is an iterative process, so the first part of the
output will be like that shown.
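The estimated coefficients can be used directly to compute fitted probabilities. A sketch in Python (the function name prob_grad is ours; the coefficient values are those shown in the output):

```python
import math

B_CONS, B_ASVABC = -5.003779, 0.1666022  # _cons and asvabc coefficients

def prob_grad(asvabc):
    # Fitted probability of graduating: 1 / (1 + e^(-Z)), Z = _cons + b * ASVABC
    z = B_CONS + B_ASVABC * asvabc
    return 1.0 / (1.0 + math.exp(-z))

for score in (20, 30, 50.15, 80):
    print(f"ASVABC = {score:6.2f}  p-hat = {prob_grad(score):.3f}")
```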
p̂ = 1 / (1 + e^(5.004 − 0.167·ASVABC))

[Figure: the fitted cumulative effect (probability of graduating, 0 to 1) and the marginal effect (0 to 0.04), plotted against ASVABC from 0 to 100]
p = F(Z) = 1 / (1 + e^(−Z)),   Z = β1 + β2·X2 + … + βk·Xk
However, we can use them to quantify the marginal effect of a change in ASVABC on the
probability of graduating. We will do this theoretically for the general case where Z is a
function of several explanatory variables.
Z = β1 + β2·X2 + … + βk·Xk

dp/dZ = f(Z) = e^(−Z) / (1 + e^(−Z))²

∂p/∂Xi = (dp/dZ) · (∂Z/∂Xi) = f(Z)·βi = [e^(−Z) / (1 + e^(−Z))²]·βi
We have already derived an expression for dp/dZ. The marginal effect of Xi on Z is given by
its coefficient.
The marginal effect is not constant because it depends on the value of Z, which in turn
depends on the values of the explanatory variables. A common procedure is to evaluate it
for the sample means of the explanatory variables.
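The dependence on Z can be illustrated with a short sketch (the helper name and the particular Z values are ours; the coefficient 0.167 is close to the ASVABC estimate):

```python
import math

def marginal_effect(z, beta):
    # dp/dX_i = f(Z) * beta_i, with f(Z) = e^(-Z) / (1 + e^(-Z))^2
    fz = math.exp(-z) / (1.0 + math.exp(-z)) ** 2
    return fz * beta

beta = 0.167
# The same coefficient implies a much smaller effect as Z moves away from 0
for z in (0.0, 2.0, 4.0):
    print(f"Z = {z}  marginal effect = {marginal_effect(z, beta):.4f}")
```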
Evaluating at the sample mean of ASVABC (50.15), Z = −5.004 + 0.167 × 50.15 = 3.37, so e^(−Z) = 0.034 and

  dp/dZ = f(Z) = e^(−Z) / (1 + e^(−Z))² = 0.034 / (1 + 0.034)² = 0.032
[Figure: the cumulative and marginal effect curves, with the mean ASVABC score, 50.15, marked]
In this example, the marginal effect at the mean of ASVABC is very low. The reason is that
anyone with an average score is very likely to graduate anyway. So an increase in the score
has little effect.
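The arithmetic can be verified in a few lines (coefficient values rounded as in the slide):

```python
import math

z = -5.004 + 0.167 * 50.15           # Z at the mean ASVABC score
e_neg_z = math.exp(-z)               # approximately 0.034
fz = e_neg_z / (1.0 + e_neg_z) ** 2  # approximately 0.032
print(round(e_neg_z, 3), round(fz, 3))
print(round(fz * 0.167, 4))          # marginal effect of ASVABC at the mean
```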
At a lower score the marginal effect is much larger. For example, at ASVABC = 30, Z = −5.004 + 0.167 × 30 = 0.006, so e^(−Z) = 0.994 and

  dp/dZ = f(Z) = 0.994 / (1 + 0.994)² = 0.250

giving a marginal effect of 0.250 × 0.167 = 0.042.

[Figure: the cumulative and marginal effect curves, with ASVABC = 30 marked]
The model is now extended to include mother's schooling (SM), father's schooling (SF), and a dummy variable for being male (MALE).

Iteration 0:  log likelihood = -162.29468
Iteration 1:  log likelihood = -133.21451
Iteration 2:  log likelihood = -117.16714
Iteration 3:  log likelihood = -116.51081
Iteration 4:  log likelihood = -116.49969
Iteration 5:  log likelihood = -116.49968

Logit Estimates                              Number of obs =       570
                                             chi2(4)       =     91.59
                                             Prob > chi2   =    0.0000
Log Likelihood = -116.49968                  Pseudo R2     =    0.2822

------------------------------------------------------------------------------
        grad |     Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
---------+--------------------------------------------------------------------
      ASVABC |  .1563271    .0224382    6.967   0.000     .1123491    .2003051
          SM |  .0645542    .0773804    0.834   0.404    -.0871086     .216217
          SF |  .0054552    .0616822    0.088   0.930    -.1154397      .12635
        MALE | -.2790915    .3601689   -0.775   0.438    -.9850095    .4268265
       _cons |  -5.15931     .994783   -5.186   0.000    -7.109049   -3.209571
------------------------------------------------------------------------------
We will estimate the marginal effects, putting all the explanatory variables equal to their
sample means.
Z = β1 + β2·X2 + … + βk·Xk

            Mean       b    Mean × b    f(Z)   f(Z) × b
ASVABC     50.15   0.156       7.839   0.033      0.005
SM         11.65   0.065       0.753   0.033      0.002
SF         11.82   0.006       0.065   0.033      0.000
MALE        0.57  -0.279      -0.159   0.033     -0.009
Constant    1.00  -5.159      -5.159
Total                          3.338
The first step is to calculate Z, when the X variables are equal to their sample means.
With Z = 3.338, e^(−Z) = 0.036, so

  f(Z) = e^(−Z) / (1 + e^(−Z))² = 0.036 / (1 + 0.036)² = 0.033
The estimated marginal effects are f(Z) multiplied by the respective coefficients. We see that the effect of ASVABC is about the same as before. Every extra year of schooling of the mother increases the probability of graduating by 0.2 percentage points.
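The table's computation can be reproduced as follows (a sketch using the rounded values from the table):

```python
import math

# (variable, sample mean, coefficient b), as in the table
rows = [
    ("ASVABC",   50.15,  0.156),
    ("SM",       11.65,  0.065),
    ("SF",       11.82,  0.006),
    ("MALE",      0.57, -0.279),
    ("Constant",  1.00, -5.159),
]

z = sum(mean * b for _, mean, b in rows)       # Z at the sample means
fz = math.exp(-z) / (1.0 + math.exp(-z)) ** 2  # f(Z), about 0.033

for name, _, b in rows[:-1]:                   # no marginal effect for the constant
    print(f"{name:8s} {fz * b:+.3f}")
```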
Father's schooling has no discernible effect. Males have a 0.9 percentage point lower probability of graduating than females. These effects would all have been larger if they had been evaluated at a lower ASVABC score.
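To illustrate the last point, the effects can be re-evaluated with ASVABC set to 30 instead of its mean, holding the other variables at their means (our own sketch, again using the rounded table values):

```python
import math

coefs = {"ASVABC": 0.156, "SM": 0.065, "SF": 0.006, "MALE": -0.279}
means = {"ASVABC": 50.15, "SM": 11.65, "SF": 11.82, "MALE": 0.57}
CONST = -5.159

def effects(asvabc):
    # Marginal effects f(Z) * b_i, with ASVABC as given and the rest at their means
    x = dict(means, ASVABC=asvabc)
    z = CONST + sum(b * x[k] for k, b in coefs.items())
    fz = math.exp(-z) / (1.0 + math.exp(-z)) ** 2
    return {k: fz * b for k, b in coefs.items()}

at_mean, at_30 = effects(means["ASVABC"]), effects(30.0)
for k in coefs:  # every effect is larger in magnitude at the lower score
    assert abs(at_30[k]) > abs(at_mean[k])
    print(f"{k:8s} at mean: {at_mean[k]:+.3f}   at 30: {at_30[k]:+.3f}")
```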
p = F(Z) = 1 / (1 + e^(−Z)) = 1 / (1 + e^(−(β1 + β2·ASVABC))),   Z = β1 + β2·ASVABC
This sequence will conclude with an outline explanation of how the model is fitted using
maximum likelihood estimation.
In the case of an individual who graduated, the probability of that outcome is F(Z). We will
give subscripts 1, ..., s to the individuals who graduated.
F(Zi) = 1 / (1 + e^(−(β1 + β2·ASVABCi)))
In the case of an individual who did not graduate, the probability of that outcome is 1 - F(Z).
We will give subscripts s+1, ..., n to these individuals.
Did graduate (individuals 1, …, s):

  F(Zi) = 1 / (1 + e^(−(b1 + b2·ASVABCi)))

Did not graduate (individuals s+1, …, n):

  1 − F(Zi) = 1 − 1 / (1 + e^(−(b1 + b2·ASVABCi)))

Joint probability = F(Z1) × … × F(Zs) × [1 − F(Zs+1)] × … × [1 − F(Zn)]
We choose b1 and b2 so as to maximize the joint probability of the outcomes, that is, F(Z1)
x ... x F(Zs) x [1 - F(Zs+1)] x ... x [1 - F(Zn)]. There are no mathematical formulae for b1 and b2.
They have to be determined iteratively by a trial-and-error process.
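This iterative process can be sketched in Python. The data below are simulated stand-ins (an assumption: the original sample is not reproduced here), and Newton-Raphson updates on the log-likelihood play the role of the iterations in the Stata output:

```python
import math
import random

random.seed(42)

# Simulated stand-in data; the true parameter values here are chosen arbitrarily
B1_TRUE, B2_TRUE = -5.0, 0.17
data = []
for _ in range(570):
    x = random.uniform(20.0, 80.0)
    p = 1.0 / (1.0 + math.exp(-(B1_TRUE + B2_TRUE * x)))
    data.append((x, 1 if random.random() < p else 0))

def log_likelihood(b1, b2):
    # log L = sum of log F(Z) for graduates plus log(1 - F(Z)) for the rest
    ll = 0.0
    for x, y in data:
        F = 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))
        F = min(max(F, 1e-12), 1.0 - 1e-12)  # guard against log(0)
        ll += math.log(F if y == 1 else 1.0 - F)
    return ll

# Newton-Raphson: choose b1, b2 to maximize the joint probability of the outcomes
b1 = b2 = 0.0
for it in range(8):
    g1 = g2 = h11 = h12 = h22 = 0.0
    for x, y in data:
        F = 1.0 / (1.0 + math.exp(-(b1 + b2 * x)))
        w = F * (1.0 - F)
        g1 += y - F
        g2 += (y - F) * x
        h11 += w
        h12 += w * x
        h22 += w * x * x
    det = h11 * h22 - h12 * h12
    b1 += (h22 * g1 - h12 * g2) / det
    b2 += (h11 * g2 - h12 * g1) / det
    print(f"Iteration {it}: log likelihood = {log_likelihood(b1, b2):.5f}")
```

The printed log likelihood rises at each step and then settles, mirroring the iteration log that Stata displays.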