Logit

About This Presentation

Title:

Logit

Description:

Logit & Probit Models. Interpretation. Goodness of Fit. Diagnostics. Interpretation ... In OLS the effect of X on Y is linear; in logit/probit it is not. ... – PowerPoint PPT presentation

Slides: 39

Provided by: davidl135

Transcript and Presenter's Notes

Title: Logit

1
Logit & Probit Models

Interpretation
Goodness of Fit
Diagnostics

2
Interpretation

Signs and significance are read the same way as in OLS
With a positive coefficient, an increase (decrease) in X is associated with a greater
(lesser) probability that Y = 1, all else equal
Test statistics follow the z, not the t, distribution. Why?
The effect of the independent variables is linear in
the latent variable (Y*) but not in the observed
variable Y. Why?
In OLS the effect of X on Y is linear; in
logit/probit it is not. It depends on the values
of the other Xs and the estimated ßs.

The effect of a change in X on Y also
depends on where you are on the curve, because the
impact is explicitly non-linear.
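The non-linearity can be written out as a derivative. This is the standard textbook result rather than something stated on the slides; the symbols F, f, Λ and φ (the CDF, its density, the logistic CDF and the standard normal density) are notation assumed here.

```latex
\begin{aligned}
Y^{*} &= x\beta + \varepsilon, \qquad Y = 1 \text{ if } Y^{*} > 0,\quad Y = 0 \text{ otherwise} \\
\Pr(Y = 1 \mid x) &= F(x\beta) \\
\frac{\partial \Pr(Y = 1 \mid x)}{\partial x_k} &= f(x\beta)\,\beta_k \\
\text{logit: } f(x\beta) &= \Lambda(x\beta)\,\bigl[1 - \Lambda(x\beta)\bigr], \qquad
\text{probit: } f(x\beta) = \phi(x\beta)
\end{aligned}
```

Because f(xβ) depends on every X and every ß, the effect of X_k on the probability changes with the values of the other covariates.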

4
Example Swedish Referenda

. logit yesno birthyear leftright gender citizen trust

Iteration 0:   log likelihood = -6548.2124
Iteration 1:   log likelihood = -5539.9057
Iteration 2:   log likelihood = -5507.3187
Iteration 3:   log likelihood = -5507.0733
Iteration 4:   log likelihood = -5507.0733

Logit estimates                                   Number of obs   =       9463
                                                  LR chi2(5)      =    2082.28
                                                  Prob > chi2     =     0.0000
Log likelihood = -5507.0733                       Pseudo R2       =     0.1590

------------------------------------------------------------------------------
       yesno |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyear |  -.0064844   .0014425    -4.50   0.000    -.0093117   -.0036571
   leftright |   .6738988   .0219747    30.67   0.000     .6308292    .7169684
      gender |   .4929538   .0462304    10.66   0.000     .4023438    .5835638

. sum prlogit

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     prlogit |    9734    .4757422   .2236455   .0333279   .9778679

. histogram prlogit, bfcolor(none) blcolor(black) normal normopts(clcolor(red))

Can also generate index values
. predict prlogit_index, index
(998 missing values generated)
. sum prlogit_index

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
prlogit_in~x |    9734   -.1297229   1.106006  -3.367464   3.788345

. twoway (connect prlogit prlogit_index, sort)

7
Index Values and Predicted Probabilities

Index value: the linear combination of the covariate
values and their estimated coefficients
. gen prlogit_index2 = _b[_cons] + _b[birthyear]*birthyear + _b[leftright]*leftright
>     + _b[gender]*gender + _b[citizen]*citizen + _b[trust]*trust
(998 missing values generated)
. sum prlogit_index*

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
prlogit_in~x |    9734   -.1297229   1.106006  -3.367464   3.788345
prlogit_in~2 |    9734   -.1297229   1.106006  -3.367464   3.788345

Convert to predicted probabilities: recall the
logit transformation

. gen prlogit2 = exp(prlogit_index2)/(1 + exp(prlogit_index2))
(998 missing values generated)
. sum prlogit prlogit2 if e(sample)

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
     prlogit |    9463    .4758533   .2242877   .0333279   .9778679
    prlogit2 |    9463    .4758533   .2242877   .0333279   .9778679
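The logit transformation can be checked by hand outside Stata. The sketch below, in Python with only the standard library, applies it to the smallest and largest index values reported above; the function name inv_logit is an illustrative choice, not part of the slides.

```python
import math

def inv_logit(index):
    """Logistic CDF: maps an index value (xb) to a probability."""
    return 1.0 / (1.0 + math.exp(-index))

# Smallest and largest index values from the `sum prlogit_index` output.
p_min = inv_logit(-3.367464)
p_max = inv_logit(3.788345)
print(round(p_min, 4), round(p_max, 4))
```

The two results should land close to the Min and Max of prlogit in the summary table (.0333279 and .9778679).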

9
Probit

. probit yesno birthyear leftright gender citizen trust

Iteration 0:   log likelihood = -6548.2124
Iteration 1:   log likelihood = -5534.9351
Iteration 2:   log likelihood = -5512.4941
Iteration 3:   log likelihood = -5512.4593

Probit estimates                                  Number of obs   =       9463
                                                  LR chi2(5)      =    2071.51
                                                  Prob > chi2     =     0.0000
Log likelihood = -5512.4593                       Pseudo R2       =     0.1582

------------------------------------------------------------------------------
       yesno |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyear |  -.0038943   .0008681    -4.49   0.000    -.0055958   -.0021928
   leftright |   .4033274   .0127411    31.66   0.000     .3783553    .4282995
      gender |   .2968808   .0277645    10.69   0.000     .2424633    .3512983
     citizen |  -.4833439   .0765248    -6.32   0.000    -.6333298    -.333358

. gen prprobit_index2 = _b[_cons] + _b[birthyear]*birthyear + _b[leftright]*leftright
>     + _b[gender]*gender + _b[citizen]*citizen + _b[trust]*trust
(998 missing values generated)
. gen prprobit2 = normprob(prprobit_index2)
(998 missing values generated)
. sum prprobit*

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
    prprobit |    9734    .4756409   .2214617   .0223669   .9877646
prprobit_i~x |    9734   -.0774396   .6591582   -2.00715   2.249656
prprobit_i~2 |    9734   -.0774396   .6591582   -2.00715   2.249656
   prprobit2 |    9734    .4756409   .2214617   .0223669   .9877646
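For probit, the step from index to probability uses the standard normal CDF instead of the logistic one. A minimal Python check, assuming only the standard library's statistics.NormalDist, applied to the extreme index values in the table above:

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Smallest and largest probit index values from the `sum prprobit*` output.
p_min = std_normal.cdf(-2.00715)
p_max = std_normal.cdf(2.249656)
print(round(p_min, 4), round(p_max, 4))
```

These should agree with the Min and Max of prprobit (.0223669 and .9877646) up to rounding.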

11
How might this be useful?

Can estimate the change in probability for a given
set of covariates. For example, the difference
between men and women

. * estimate probability of a yes vote for men,
. * holding all other variables at their means
. tab gender

     Gender |      Freq.     Percent        Cum.
------------+-----------------------------------
     Female |      5,320       50.93       50.93
       Male |      5,126       49.07      100.00
------------+-----------------------------------
      Total |     10,446      100.00

. sum gender

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------
      gender |   10446    1.490714   .4999377          1          2

. * estimate probability of a yes vote for women
. * (gender = 1), holding all other variables at their means
. gen prprobit_women = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*2.899
>     + _b[gender]*1 + _b[citizen]*.966 + _b[trust]*2.48)
. * estimate probability of a yes vote for men
. * (gender = 2), holding all other variables at their means
. gen prprobit_men = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*2.899
>     + _b[gender]*2 + _b[citizen]*.966 + _b[trust]*2.48)
. sum prprobit_women prprobit_men

    Variable |     Obs        Mean   Std. Dev.        Min        Max

Stata has a built-in capability to do this for
probit models using the command dprobit
. dprobit yesno birthyear leftright gender citizen trust

Iteration 0:   log likelihood = -6548.2124
Iteration 1:   log likelihood = -5534.9351
Iteration 2:   log likelihood = -5512.4941
Iteration 3:   log likelihood = -5512.4593

Probit estimates                                  Number of obs   =       9463
                                                  LR chi2(5)      =    2071.51
                                                  Prob > chi2     =     0.0000
Log likelihood = -5512.4593                       Pseudo R2       =     0.1582

------------------------------------------------------------------------------
   yesno |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
---------+--------------------------------------------------------------------
birthy~r |  -.0015489   .0003453    -4.49   0.000   1959.58  -.002226 -.000872
leftri~t |   .1604231   .0050664    31.66   0.000   2.89897   .150493  .170353
  gender |    .118084   .0110431    10.69   0.000   1.49878    .09644  .139728

Stata can also do this using the mfx compute
command
. mfx compute

Marginal effects after probit
      y  = Pr(yesno) (predict)
         =  .4691524
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
birthy~r |  -.0015489      .00035   -4.49   0.000  -.002226 -.000872   1959.58
leftri~t |   .1604231      .00507   31.66   0.000   .150493  .170353   2.89897
  gender |    .118084      .01104   10.69   0.000    .09644  .139728   1.49878
citizen* |  -.1889318      .02837   -6.66   0.000  -.244544  -.13332   .965867
   trust |  -.2356792      .00782  -30.15   0.000  -.250999 -.220359   2.48061
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

mfx compute has nice features that we will
exploit later. One problem is that the
computation of standard errors can be quite time
consuming, depending on the size of the data set.
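The dF/dx figures for the continuous regressors can be reproduced from the probit coefficients, since the marginal effect at the means is the normal density at the index times the coefficient. The Python sketch below is a rough check under two assumptions: the index at the means is recovered by inverting the reported Pr(yesno) = .4691524, and gender (coded 1/2) is treated as continuous, as the mfx table appears to do.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Predicted probability at the covariate means, from the mfx output.
p_at_means = 0.4691524
# Recover the index xb at the means by inverting the normal CDF.
xb = std_normal.inv_cdf(p_at_means)
density = std_normal.pdf(xb)

# Probit coefficients from the probit output.
coefs = {"birthyear": -0.0038943, "leftright": 0.4033274, "gender": 0.2968808}
# Marginal effect at the means: phi(xb) * coefficient.
marginal = {name: density * b for name, b in coefs.items()}
for name, value in marginal.items():
    print(name, round(value, 5))
```

The discrete-change effect for the dummy variable citizen is computed differently (a difference of two probabilities), so it is left out of this check.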

15
For continuous variables

. gen prprobit_lr1 = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*1
>     + _b[gender]*1.50 + _b[citizen]*.966 + _b[trust]*2.48)
. gen prprobit_lr2 = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*2
>     + _b[gender]*1.50 + _b[citizen]*.966 + _b[trust]*2.48)
. gen prprobit_lr3 = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*3
>     + _b[gender]*1.50 + _b[citizen]*.966 + _b[trust]*2.48)
. gen prprobit_lr4 = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*4
>     + _b[gender]*1.50 + _b[citizen]*.966 + _b[trust]*2.48)
. gen prprobit_lr5 = normprob(_b[_cons] + _b[birthyear]*1959.585 + _b[leftright]*5
>     + _b[gender]*1.50 + _b[citizen]*.966 + _b[trust]*2.48)
. sum prprobit_lr* if e(sample)

    Variable |     Obs        Mean   Std. Dev.        Min        Max
-------------+-------------------------------------------------------

. tab leftright, gen(lrdum)
. dprobit yesno birthyear gender citizen trust lrdum1 lrdum2 lrdum4 lrdum5
. /* note: leftright = 3 is the omitted category */

Probit estimates                                  Number of obs   =       9463
                                                  LR chi2(8)      =    2115.72
                                                  Prob > chi2     =     0.0000
Log likelihood = -5490.3545                       Pseudo R2       =     0.1615

------------------------------------------------------------------------------
   yesno |      dF/dx   Std. Err.      z    P>|z|     x-bar  [    95% C.I.   ]
---------+--------------------------------------------------------------------
birthy~r |  -.0016018   .0003462    -4.63   0.000   1959.58   -.00228 -.000923
  gender |   .1148058   .0110777    10.36   0.000   1.49878   .093094  .136518
 citizen |  -.1893555   .0283859    -6.32   0.000   .965867  -.244991  -.13372
   trust |  -.2318162   .0078473   -29.51   0.000   2.48061  -.247197 -.216436
  lrdum1 |  -.2406357   .0160484   -13.44   0.000   .132939   -.27209 -.209181
  lrdum2 |  -.0964487   .0144258    -6.59   0.000   .236077  -.124723 -.068175

17
Computing Marginal Effects

Stata's mfx command is very useful. It allows us
to calculate a variety of marginal effects for
almost every model that Stata can estimate.
Basic syntax after an estimation command:
mfx
This generates marginal effects or elasticities.
The user can specify values for the independent
variables; the default is to hold them at their means
and calculate the marginal effect there.
For dichotomous independent variables the
marginal effect is the change from X = 0 to X = 1,
holding all others at their means.

. logit yesno birthyear gender citizen trust leftright

Logit estimates                                   Number of obs   =       9463
                                                  LR chi2(5)      =    2082.28
                                                  Prob > chi2     =     0.0000
Log likelihood = -5507.0733                       Pseudo R2       =     0.1590

------------------------------------------------------------------------------
       yesno |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyear |  -.0064844   .0014425    -4.50   0.000    -.0093117   -.0036571
      gender |   .4929538   .0462304    10.66   0.000     .4023438    .5835638
     citizen |  -.8248991   .1305405    -6.32   0.000    -1.080754   -.5690445
       trust |  -.9999017    .034302   -29.15   0.000    -1.067132   -.9326711
   leftright |   .6738988   .0219747    30.67   0.000     .6308292    .7169684
       _cons |   13.16165   2.837379     4.64   0.000      7.60049    18.72281
------------------------------------------------------------------------------
. mfx

. mfx, at(mean gender=1)

Marginal effects after logit
      y  = Pr(yesno) (predict)
         =  .40720144
------------------------------------------------------------------------------
variable |      dy/dx    Std. Err.     z    P>|z|  [    95% C.I.   ]      X
---------+--------------------------------------------------------------------
birthy~r |  -.0015653      .00035   -4.49   0.000  -.002248 -.000883   1959.58
  gender |   .1189933      .01066   11.16   0.000   .098104  .139883         1
citizen* |  -.2033465      .03123   -6.51   0.000  -.264563  -.14213   .965867
   trust |  -.2413647      .00826  -29.23   0.000  -.257551 -.225178   2.48061
leftri~t |   .1626714      .00537   30.28   0.000   .152142  .173201   2.89897
------------------------------------------------------------------------------
(*) dy/dx is for discrete change of dummy variable from 0 to 1

. mfx, at(mean gender=2)     /* Pr(yesno) = .529, so .529 - .407 = .122 */
Marginal effects after logit
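The predicted probabilities behind the two mfx calls can be recomputed from the logit coefficients and the covariate means in the X column. The following Python sketch is only a hand check under the assumption that the rounded means shown in the table are close enough to the values Stata used; the helper name prob_yes is made up for illustration.

```python
import math

# Logit coefficients from the logit output.
b = {"_cons": 13.16165, "birthyear": -0.0064844, "gender": 0.4929538,
     "citizen": -0.8248991, "trust": -0.9999017, "leftright": 0.6738988}
# Covariate means as printed in the X column of the mfx table.
means = {"birthyear": 1959.58, "citizen": 0.965867, "trust": 2.48061,
         "leftright": 2.89897}

def prob_yes(gender):
    """Pr(yesno = 1) with gender fixed and the other covariates at their means."""
    xb = b["_cons"] + b["gender"] * gender
    xb += sum(b[name] * value for name, value in means.items())
    return 1.0 / (1.0 + math.exp(-xb))

p_women = prob_yes(1)   # gender = 1
p_men = prob_yes(2)     # gender = 2
first_difference = p_men - p_women
# Marginal effect of leftright at gender = 1: coefficient * p * (1 - p).
me_leftright = b["leftright"] * p_women * (1.0 - p_women)
print(round(p_women, 3), round(p_men, 3), round(first_difference, 3))
```

The difference of about .122 is slightly larger than the dy/dx of .119 reported for gender, because the former is a discrete change and the latter a slope evaluated at gender = 1.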

20
Continuous X Variables

With continuous X variables it is often easier to
graph the marginal effects. We could do this
with a series of mfx commands, setting age, for
example, at different values
. mfx, at(mean birthyear=1911) var(birthyear)
Marginal effects after logit
      y  = Pr(yesno) (predict)
         =  .54621076
. mfx, at(mean birthyear=1955) var(birthyear)
Marginal effects after logit
      y  = Pr(yesno) (predict)
         =  .47503579
. mfx, at(mean birthyear=1975) var(birthyear)
Marginal effects after logit
      y  = Pr(yesno) (predict)
         =  .44284412
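The same three probabilities, and the whole curve between them, can be traced without calling mfx repeatedly. This Python sketch assumes the logit coefficients reported earlier and holds gender, citizen, trust and leftright at the means printed in the mfx X column; the dictionary curve is a hypothetical name for the result.

```python
import math

# Logit coefficients from the logit output.
b = {"_cons": 13.16165, "birthyear": -0.0064844, "gender": 0.4929538,
     "citizen": -0.8248991, "trust": -0.9999017, "leftright": 0.6738988}
# All covariates other than birthyear are held at their means.
means = {"gender": 1.49878, "citizen": 0.965867, "trust": 2.48061,
         "leftright": 2.89897}

def prob_yes_at(birthyear):
    """Pr(yesno = 1) for a given birth year, other covariates at their means."""
    xb = b["_cons"] + b["birthyear"] * birthyear
    xb += sum(b[name] * value for name, value in means.items())
    return 1.0 / (1.0 + math.exp(-xb))

# One predicted probability per birth year, 1911 through 1985.
curve = {year: prob_yes_at(year) for year in range(1911, 1986)}
for year in (1911, 1955, 1975):
    print(year, round(curve[year], 3))
```

Plotting curve against year gives the graph the slide has in mind; small differences from the mfx figures come from the rounded means.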

It is, however, often easier to write a do file
that contains a loop
* generate a counter running from 1911 to 1985
gen year = _n + 1910 in 1/75
* create a variable to "hold" the predicted probabilities
gen p_year = .
forvalues i = 1911(1)1985 {
    * run the mfx command looping through values of birthyear
    mfx, at(mean birthyear=`i') var(birthyear)
    * capture the predicted probability and put it in p_year
    replace p_year = e(Xmfx_y) if year == `i'
}
Notes
If you are just interested in capturing the
predicted probability, the loop will run about 50%
faster if you use the nose option (no standard
errors)
You can see what results are kept in memory by
using the commands
ereturn list
return list
after the call to mfx

The mfx command can be arbitrarily complicated.
Let's say we are interested in differences in how
age affects the probability of voting yes for
male and female voters
gen p_year_m = .
gen p_year_f = .
forvalues i = 1911(1)1985 {
    * run the mfx command looping through values of birthyear (women, gender = 1)
    mfx, at(mean gender=1 birthyear=`i') var(birthyear)
    * capture the predicted probability and put it in p_year_f
    replace p_year_f = e(Xmfx_y) if year == `i'
}
forvalues i = 1911(1)1985 {
    * run the mfx command looping through values of birthyear (men, gender = 2)
    mfx, at(mean gender=2 birthyear=`i') var(birthyear)
    * capture the predicted probability and put it in p_year_m
    replace p_year_m = e(Xmfx_y) if year == `i'
}

23
twoway (connected p_year year) (connected
p_year_m year) (connected p_year_f year)
24
Clarify

Clarify is a set of programs written by Gary
King, Michael Tomz and Jason Wittenberg. It
allows researchers to interpret
regression-like models in a more intuitive
manner. It does this via simulation.
Citation: King, Gary, Michael Tomz and Jason
Wittenberg. 2000. "Making the Most of Statistical
Analyses: Improving Interpretation and
Presentation." American Journal of Political
Science 44: 341-55
gking.harvard.edu/clarify/docs/clarify.html

25
Parts of Clarify

estsimp: this is a wrapper that goes before
(almost) any regression-type command. After
estimating the model, it makes Stata
generate 1000 (or more if you choose) draws
from the asymptotic distribution of the
parameter estimates (using their variance-covariance matrix).
You can then plot the simulations to get an idea
of what your parameter estimates look like.
It is important to remember that your parameter
estimates are random variables with a certain
mean and distribution. Clarify exploits this
uncertainty.
setx: this allows the researcher to set the
values of the various independent variables at
the values of interest (e.g., means, medians,
etc.)
simqi: this simulates and generates the
quantities of interest. These depend in large
part on the type of model you are estimating:
after regress, simqi generates expected values;
after logit, simqi generates predicted
probabilities
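The simulation idea behind estsimp, setx and simqi can be sketched in a few lines. The Python example below is not Clarify's code: the two-parameter logit, its estimates and its variance-covariance matrix are all hypothetical numbers chosen for illustration, and the draws come from a bivariate normal built with a hand-written 2x2 Cholesky factor.

```python
import math
import random

random.seed(12345)

# Hypothetical estimates for a logit with an intercept and one covariate.
beta_hat = [-0.5, 0.8]
# Hypothetical variance-covariance matrix of those estimates.
vcov = [[0.04, -0.01],
        [-0.01, 0.01]]

# Cholesky factor L of the 2x2 matrix, so that a draw is beta_hat + L z.
l11 = math.sqrt(vcov[0][0])
l21 = vcov[1][0] / l11
l22 = math.sqrt(vcov[1][1] - l21 ** 2)

x_value = 1.0  # covariate value at which Pr(y = 1) is evaluated (the "setx" step)
sims = []
for _ in range(1000):
    z1, z2 = random.gauss(0, 1), random.gauss(0, 1)
    b0 = beta_hat[0] + l11 * z1
    b1 = beta_hat[1] + l21 * z1 + l22 * z2
    # Quantity of interest for this draw: the predicted probability.
    sims.append(1.0 / (1.0 + math.exp(-(b0 + b1 * x_value))))

sims.sort()
mean_p = sum(sims) / len(sims)
lower, upper = sims[24], sims[974]  # approximate 2.5th and 97.5th percentiles
print(round(mean_p, 3), round(lower, 3), round(upper, 3))
```

The mean and the percentile interval of the simulated probabilities play the role of the Mean and 95% Conf. Interval columns that simqi prints.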

26
Example

. estsimp logit yesno birthyear gender citizen trust leftright

Iteration 0:   log likelihood = -6548.2124
Iteration 4:   log likelihood = -5507.0733

Logit estimates                                   Number of obs   =       9463
                                                  LR chi2(5)      =    2082.28
                                                  Prob > chi2     =     0.0000
Log likelihood = -5507.0733                       Pseudo R2       =     0.1590

------------------------------------------------------------------------------
       yesno |      Coef.   Std. Err.       z    P>|z|    [95% Conf. Interval]
-------------+----------------------------------------------------------------
   birthyear |  -.0064844   .0014425    -4.50   0.000    -.0093117   -.0036571
      gender |   .4929538   .0462304    10.66   0.000     .4023438    .5835638
     citizen |  -.8248991   .1305405    -6.32   0.000    -1.080754   -.5690445
       trust |  -.9999017    .034302   -29.15   0.000    -1.067132   -.9326711
   leftright |   .6738988   .0219747    30.67   0.000     .6308292    .7169684
       _cons |   13.16165   2.837379     4.64   0.000      7.60049    18.72281

Use setx to set the values of the variables in
question to some value while holding the other
variables at other values. We will hold values
at their medians and examine the change in
leftright
. setx median
. simqi

      Quantity of Interest |     Mean     Std. Err.    [95% Conf. Interval]
---------------------------+------------------------------------------------
             Pr(yesno = 0) |   .4661651   .0087691     .4495492    .4844435
             Pr(yesno = 1) |   .5338349   .0087691     .5155565    .5504508

. simqi, fd(pr) changex(leftright 1 2)

First Difference: leftright 1 2

      Quantity of Interest |     Mean     Std. Err.    [95% Conf. Interval]
---------------------------+------------------------------------------------
            dPr(yesno = 0) |  -.1391568   .0034546    -.1458878   -.1326886
            dPr(yesno = 1) |   .1391568   .0034546     .1326886    .1458878

. simqi, fd(pr) changex(leftright 3 4)

Both the setx command and the changex option can
be arbitrarily complicated.
setx mean /* sets all variables to their means */
setx gender 2 /* sets gender equal to 2 */
Clarify is very flexible and great to use because
it generates standard errors. It is faster than
mfx but does not handle all the models that mfx
does; it is, however, under constant development.

29
What to Report

Parameter estimates
Standard errors (regular or robust)
Some measure that aids in interpretation
Marginal effects: problem of interpretation
Change from 0 to 1 for dichotomous X
Change from ½ SD below the mean of X to ½ SD
above the mean of X
Some measure of uncertainty with regard to this
marginal effect

30
Goodness of Fit

Scalar measures of fit
Lots and lots of different measures based on
likelihoods, predicted values, predicted
probabilities, etc
Difficult to interpret substantively
Not transparent: most depend on some arbitrary
cutoff to define a correct prediction/classification

Most measures are defined in the Long and Freese
text.
Easily implemented in Stata using fitstat
. qui logit yesno gender yearborn leftright trust
--output omitted
. fitstat

Measures of Fit for logit of yesno

Log-Lik Intercept Only:     -6572.174     Log-Lik Full Model:      -5553.639
D(9493):                    11107.277     LR(4):                    2037.070
                                          Prob > LR:                   0.000
McFadden's R2:                  0.155     McFadden's Adj R2:           0.154
Maximum Likelihood R2:          1.000     Cragg & Uhler's R2:          1.000
McKelvey and Zavoina's R2:      0.266     Efron's R2:                  0.199
Variance of y*:                 4.480     Variance of error:           3.290
Count R2:                       0.688     Adj Count R2:                0.343
AIC:                            1.170     AIC*n:                   11117.277
BIC:                       -75837.558     BIC':                    -2000.434
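Several of the likelihood-based measures follow directly from the two log likelihoods. The Python sketch below recomputes them; it assumes the model has five parameters (four slopes plus the constant) and that the 9493 in D(9493) is the number of observations minus those parameters, neither of which is stated on the slide.

```python
ll_null = -6572.174      # Log-Lik Intercept Only
ll_full = -5553.639      # Log-Lik Full Model
n_params = 5             # assumed: four slopes plus the constant
n_obs = 9493 + n_params  # assumed: D(9493) reports N minus the parameters

lr = 2 * (ll_full - ll_null)            # likelihood-ratio chi-square
deviance = -2 * ll_full                 # D
mcfadden = 1 - ll_full / ll_null        # McFadden's R2
mcfadden_adj = 1 - (ll_full - n_params) / ll_null
aic_n = deviance + 2 * n_params         # AIC*n
aic = aic_n / n_obs                     # AIC as fitstat scales it

print(round(lr, 2), round(mcfadden, 3), round(mcfadden_adj, 3), round(aic, 3))
```

The same formula applied to the earlier logit output, 1 - (-5507.0733)/(-6548.2124), gives the Pseudo R2 of about .159 that Stata prints.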

Classification tables. Key idea is to see how
well the model classifies/predicts outcomes.
Critical assumption is the definition of what
constitutes a correct prediction
Standard practice is to use a cutoff of .5
This is not necessarily appropriate if the
dependent variable is not evenly split (if the share of 1s is
substantially greater than or less than .5)
Focus in the classification table is often a matter
of theoretical motivation: what matters?
Correctly classified
False positives
False negatives

Classification table
True positive: outcome = 1, prediction = 1 (d)
True negative: outcome = 0, prediction = 0 (a)
False positive: outcome = 0, prediction = 1 (b)
False negative: outcome = 1, prediction = 0 (c)
Usually report these as percentages of the total
(a + b + c + d)
Difficult theoretical question is what we want
to maximize/minimize: in some cases false
negatives are worse than false positives (e.g.,
speculative attacks, AIDS tests, etc.)
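The four cells can be tallied mechanically once a cutoff is chosen. The Python sketch below uses eight made-up observations (the outcomes and predicted probabilities are hypothetical) and the conventional .5 cutoff, with the cell labels a to d as defined above.

```python
# Hypothetical observed outcomes and predicted probabilities.
outcomes = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [0.81, 0.35, 0.42, 0.66, 0.58, 0.12, 0.73, 0.49]
cutoff = 0.5

cells = {"a": 0, "b": 0, "c": 0, "d": 0}
for y, p in zip(outcomes, predicted):
    guess = 1 if p >= cutoff else 0
    if y == 0 and guess == 0:
        cells["a"] += 1   # true negative
    elif y == 0 and guess == 1:
        cells["b"] += 1   # false positive
    elif y == 1 and guess == 0:
        cells["c"] += 1   # false negative
    else:
        cells["d"] += 1   # true positive

total = sum(cells.values())
shares = {name: count / total for name, count in cells.items()}
percent_correct = (cells["a"] + cells["d"]) / total
print(cells, percent_correct)
```

Changing cutoff trades false positives against false negatives, which is why the choice should follow from which error matters more in the application.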

34
Diagnostics

Residuals. If we define the predicted
probability for a given set of independent
variables as pi = Pr(yi = 1 | xi),
then the deviations yi - pi are
heteroscedastic, with variance pi(1 - pi).
This implies that the variance in a binary
outcome is greatest when pi = .5 and least as pi
approaches 0 or 1
This would suggest the use of the Pearson
residual, which divides the residual by its
standard deviation
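Written out, the quantities described in words above take the following form. The notation (π_i for the predicted probability, r_i for the Pearson residual) is assumed here, since the slide's own formula did not survive in the transcript.

```latex
\begin{aligned}
\pi_i &= \Pr(y_i = 1 \mid x_i) \\
\operatorname{Var}(y_i \mid x_i) &= \pi_i\,(1 - \pi_i) \\
r_i &= \frac{y_i - \pi_i}{\sqrt{\pi_i\,(1 - \pi_i)}}
\end{aligned}
```

The variance term π_i(1 - π_i) reaches its maximum of .25 at π_i = .5, which is the sense in which the raw deviations are heteroscedastic.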

Pregibon (1981) showed that the variance of ri is
not equal to one and proposed the standardized
Pearson residual
Stata does this automatically
quietly logit yesno gender year leftright trust
predict rstd, rs
label var rstd "Standardized Residual"
gen obs = _n
label var obs "Observation number"
twoway (scatter rstd obs)
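The standardization divides the Pearson residual by the square root of one minus the observation's leverage. A small Python illustration follows; the observation (y = 1, predicted probability .2) and the leverage value .05 are hypothetical, and in practice the leverage comes from the fitted model rather than being supplied by hand.

```python
import math

def pearson_residual(y, p):
    """Raw deviation y - p divided by its standard deviation sqrt(p(1 - p))."""
    return (y - p) / math.sqrt(p * (1.0 - p))

def standardized_pearson(y, p, leverage):
    """Pearson residual rescaled by sqrt(1 - h), where h is the leverage."""
    return pearson_residual(y, p) / math.sqrt(1.0 - leverage)

# Hypothetical case: a yes vote (y = 1) that the model gave probability .2.
r = pearson_residual(1, 0.2)
r_std = standardized_pearson(1, 0.2, 0.05)
print(round(r, 3), round(r_std, 3))
```

Because 1 - h is below one, the standardized residual is always at least as large in absolute value as the unstandardized one.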

There are no hard and fast criteria to define a large
residual, but a good rule of thumb is +/- 2 SD
One way to proceed is to sort residuals by the
variable that you think may lead to the problem.
Of course, this is somewhat difficult to figure
out a priori.
Look at residuals v. trust

Influential cases. Large residuals do not
necessarily mean that there is a strong influence
on the estimated parameters.
Influential points are also called high-leverage
points. Influence is defined (as in OLS) by examining
the change in the estimated ß that occurs when
the ith observation is deleted
Pregibon (1981) defined a counterpart to
Cook's distance for the binomial case
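As a pointer, the form of this counterpart usually given in textbook treatments is shown below. It is quoted from memory of the standard presentation rather than from the slides, with r_i the Pearson residual and h_ii the leverage of observation i (both symbols assumed here).

```latex
C_i = \frac{r_i^{2}\, h_{ii}}{\left(1 - h_{ii}\right)^{2}}
```

An observation needs both a sizeable residual and sizeable leverage to score high on this measure, which is why large residuals alone do not imply influence.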