Logit - PowerPoint PPT Presentation

Slides: 26

Provided by: davidl135


1
Logit and Probit Models

Theory and Estimation

2
Linear Probability Model

The linear probability model is the OLS model applied
to a dichotomous dependent variable
Recall the OLS model
Recall also that
Since we are dealing with a proportion,
P(p) = E(p), we know the probability
Interpretation of the coefficients is
straightforward
A one-unit increase in X is associated with a β
increase in the probability of an event occurring
The relationship is linear, so the impact of X on
Y is constant
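
The equations on this slide did not survive in the transcript. A plausible reconstruction, assuming the usual bivariate notation (the symbols \alpha, \beta and e are assumptions, not taken from the original slide), is:

```latex
\begin{align*}
Y_i &= \alpha + \beta X_i + e_i \\
E(Y_i \mid X_i) &= \alpha + \beta X_i \\
% For a dichotomous Y the conditional mean is a probability:
E(Y_i \mid X_i) &= 1 \cdot P(Y_i = 1 \mid X_i) + 0 \cdot P(Y_i = 0 \mid X_i)
                 = P(Y_i = 1 \mid X_i) \\
\frac{\partial P(Y_i = 1 \mid X_i)}{\partial X_i} &= \beta
\end{align*}
```

The last line is the sense in which the impact of X is constant: the derivative does not depend on X.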

3
Example: Swedish EURO Referenda
(sweden_class.dta)

. reg yesno age    /* regress euro vote on age */

      Source |       SS       df       MS              Number of obs =    9936
-------------+------------------------------           F(  1,  9934) =   15.72
       Model |   3.9140087     1   3.9140087           Prob > F      =  0.0001
    Residual |  2473.23001  9934  .248966178           R-squared     =  0.0016
-------------+------------------------------           Adj R-squared =  0.0015
       Total |  2477.14402  9935   .24933508           Root MSE      =  .49897

------------------------------------------------------------------------------
       yesno |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   -.001222   .0003082    -3.96   0.000    -.0018262   -.0006179
       _cons |   2.868081   .6038954     4.75   0.000     1.684324    4.051839
------------------------------------------------------------------------------
Interpretation: as age increases by one year, the
expected probability of voting yes in the
referendum decreases by .001.
predict yhat, xb    /* generate predicted values */
(484 missing values generated)

5
Problems with the Linear Probability Model

Non-normal errors
Since Y takes only two possible values, the
residual (e) can also take on only two values
If Y = 1 then e = 1 - p(x), with probability p(x)
If Y = 0 then e = -p(x), with probability 1 - p(x)
The distribution of e will have mean 0 and
variance equal to p(x)[1 - p(x)]
Note: normality is not required for estimates to
be unbiased but it is necessary for efficiency
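
The mean and variance stated above follow directly from the two-point distribution of e. A short derivation, writing p for p(x) and taking e = 1 - p when Y = 1 and e = -p when Y = 0:

```latex
\begin{align*}
E(e \mid x) &= p\,(1 - p) + (1 - p)\,(-p) = 0 \\
\operatorname{Var}(e \mid x) &= p\,(1 - p)^2 + (1 - p)\,(-p)^2 \\
                             &= p\,(1 - p)\,\bigl[(1 - p) + p\bigr] \\
                             &= p\,(1 - p)
\end{align*}
```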

Non-Constant Error Variance
Since the variance of e is p(x)[1 - p(x)], we have
non-constant variance: a variance that is a
function of the value of X
This means that the OLS estimator for the linear
probability model is inefficient and the standard
errors are biased, resulting in incorrect
hypothesis tests
Non-Linearity
Since the linear probability model is linear,
probabilities increase by the same amount as X
goes up one unit, regardless of the value of X
This assumption is often met over a limited range
of values of X
We often expect the impact of an X variable on
the probability of Y to diminish as X increases
(or decreases)
For example, the likelihood of owning a house
increases as income increases, but at a decreasing
rate
Nonsensical Predictions
The linear model can create predicted values that
are not bounded by zero and one. This clearly
does not make sense.
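
To make the out-of-range problem concrete, here is a minimal Python sketch (not part of the original Stata session). The income and home-ownership values are hypothetical, invented only for illustration, and `ols_fit` is a helper written for this example rather than a library call.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical data: income (in thousands) and home ownership (0/1).
income = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
owns_home = [0, 0, 0, 0, 1, 0, 1, 1, 1, 1]

intercept, slope = ols_fit(income, owns_home)

# Fitted "probabilities" from the linear probability model.
fitted = [intercept + slope * xi for xi in income]

# Collect any fitted values that fall outside the unit interval.
out_of_range = [round(p, 3) for p in fitted if p < 0 or p > 1]
print("fitted values outside [0, 1]:", out_of_range)
```

With these made-up numbers the lowest income gets a negative fitted probability and the highest gets one above 1, which is the problem the slide describes.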

7
Logit and Probit Models: A Latent Variable
Approach

A latent variable approach treats the use of a
dichotomous variable essentially as a measurement
problem
There is a continuous underlying, latent,
variable (denoted Y*) but we cannot observe, and
therefore cannot measure, it.
Rather, we observe a dichotomous indicator of
that latent variable
For example, there is an underlying propensity
for an individual to vote, for a nation to go to
war, for a student to cheat. However, we only
observe the outcome (the action), not the
underlying propensity
The underlying model is

But we only observe the following realizations of
Y
We can write
The last equality holds because the e's are
distributed symmetrically.
In words, we can say that Y = 1 if the random part
is less than or equal to the systematic part
The problem is figuring out the probability. This
requires the use of some distribution.
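
The equations for this argument are missing from the transcript. The following is a reconstruction under assumptions: a zero threshold, an error distribution F symmetric around zero, and the symbols \alpha and \beta for the systematic part (none of this notation is confirmed by the original slides).

```latex
\begin{align*}
Y_i^{*} &= \alpha + \beta X_i + e_i \\[4pt]
Y_i &= \begin{cases} 1 & \text{if } Y_i^{*} > 0 \\ 0 & \text{if } Y_i^{*} \le 0 \end{cases} \\[4pt]
P(Y_i = 1 \mid X_i) &= P(Y_i^{*} > 0) \\
                    &= P\bigl(e_i > -(\alpha + \beta X_i)\bigr) \\
                    &= P\bigl(e_i \le \alpha + \beta X_i\bigr) = F(\alpha + \beta X_i)
\end{align*}
```

The step from the second-to-last line to the last is the one that uses symmetry of the distribution of e.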

9
Logit

If we assume that e follows a standard logistic
distribution then we get the logit model
Standard logistic distribution: the pdf
Cumulative distribution function for the standard
logistic distribution
The standard logistic distribution is symmetrical
around zero.
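
For reference, the density and cumulative distribution function of the standard logistic distribution (the formulas themselves are missing from the transcript) are:

```latex
\begin{align*}
f(e) &= \frac{\exp(e)}{\bigl[1 + \exp(e)\bigr]^{2}} \\[4pt]
F(e) &= \frac{\exp(e)}{1 + \exp(e)} = \frac{1}{1 + \exp(-e)} \\[4pt]
f(-e) &= f(e), \qquad F(-e) = 1 - F(e)
\end{align*}
```

The last line states the symmetry around zero.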

Recall that
Assuming a standard logistic distribution for e
we can write this as
We can write this out for every observation in
our sample in terms of the conditional
expectation of Y given the value(s) of X. The
likelihood for a given observation is
Observations with Y = 1 contribute P(Y = 1 | X) to the
likelihood while those with Y = 0 contribute
P(Y = 0 | X).
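
A reconstruction of the missing expressions, assuming the systematic part is written \alpha + \beta X_i (the notation is an assumption):

```latex
\begin{align*}
P(Y_i = 1 \mid X_i) &= \frac{\exp(\alpha + \beta X_i)}{1 + \exp(\alpha + \beta X_i)} \\[4pt]
L_i &= P(Y_i = 1 \mid X_i)^{\,Y_i}\,\bigl[1 - P(Y_i = 1 \mid X_i)\bigr]^{\,1 - Y_i}
\end{align*}
```

When Y_i = 1 the second factor equals 1, and when Y_i = 0 the first factor equals 1, which matches the verbal description of each observation's contribution.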

Assuming independent observations, we can take the
product over all N observations to get the
overall likelihood
Taking the natural logarithm results in
Now maximize this log-likelihood with respect to
the βs
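
As an illustration of that maximization, here is a minimal Python sketch for a logit with one regressor, using Newton-Raphson. This is an assumption about method: the slides do not say which optimizer is used, and Stata's own routine may differ. The data are hypothetical and the names `fit_logit` and `log_likelihood` are invented for this example.

```python
import math


def logistic_cdf(z):
    """Standard logistic CDF."""
    return 1.0 / (1.0 + math.exp(-z))


def log_likelihood(a, b, x, y):
    """ln L = sum over i of y_i*ln(p_i) + (1 - y_i)*ln(1 - p_i),
    where p_i = logistic_cdf(a + b*x_i)."""
    total = 0.0
    for xi, yi in zip(x, y):
        p = logistic_cdf(a + b * xi)
        total += yi * math.log(p) + (1 - yi) * math.log(1 - p)
    return total


def fit_logit(x, y, iterations=25):
    """Maximize the logit log-likelihood by Newton-Raphson, starting at (0, 0)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g_a = g_b = 0.0            # gradient (score) components
        h_aa = h_ab = h_bb = 0.0   # information matrix (negative Hessian)
        for xi, yi in zip(x, y):
            p = logistic_cdf(a + b * xi)
            w = p * (1 - p)
            g_a += yi - p
            g_b += (yi - p) * xi
            h_aa += w
            h_ab += w * xi
            h_bb += w * xi * xi
        det = h_aa * h_bb - h_ab * h_ab
        # Newton step: solve (information matrix) * step = gradient.
        a += (h_bb * g_a - h_ab * g_b) / det
        b += (h_aa * g_b - h_ab * g_a) / det
    return a, b


# Hypothetical data, chosen so the outcomes are not perfectly separated by x.
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
y = [0, 0, 1, 0, 0, 1, 1, 0, 1, 1]

a_hat, b_hat = fit_logit(x, y)
print("intercept:", a_hat, "slope:", b_hat)
print("log-likelihood at start:", log_likelihood(0.0, 0.0, x, y))
print("log-likelihood at fit:  ", log_likelihood(a_hat, b_hat, x, y))
```

The sketch has no step-size control or convergence check, so it is only suitable for small, well-behaved examples like this one.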

12
Probit Models

Standard normal distribution has mean zero and
unit variance. Its density looks as follows
The cumulative distribution function is

The probability for a probit looks like this
With a log likelihood of
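
The probit formulas are missing from the transcript. A reconstruction, again assuming the systematic part is written \alpha + \beta X_i:

```latex
\begin{align*}
\phi(e) &= \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{e^{2}}{2}\right) \\[4pt]
\Phi(z) &= \int_{-\infty}^{z} \phi(e)\, de \\[4pt]
P(Y_i = 1 \mid X_i) &= \Phi(\alpha + \beta X_i) \\[4pt]
\ln L &= \sum_{i=1}^{N} \Bigl\{ Y_i \ln \Phi(\alpha + \beta X_i)
        + (1 - Y_i) \ln\bigl[1 - \Phi(\alpha + \beta X_i)\bigr] \Bigr\}
\end{align*}
```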

14
What do these distributions look like?

. set obs 600
obs was 0, now 600
. egen x=fill(-300 -299)
. replace x=x/100
(599 real changes made)
. gen probit=1/sqrt(2*3.1415)*exp(-((x^2)/2))    /* standard normal pdf */
. gen logit=(exp(x))/(1+exp(x))^2    /* standard logistic pdf */
. twoway (connected probit x) (connected logit x)
The logit has fatter tails; that is the major
difference between the two
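
The same comparison can be sketched numerically in Python (standard library only; this is an illustration, not a translation of the Stata session above). It evaluates both densities at a few points and compares the two-sided tail probability beyond 3 in absolute value.

```python
import math


def normal_pdf(x):
    """Standard normal density."""
    return math.exp(-x * x / 2) / math.sqrt(2 * math.pi)


def logistic_pdf(x):
    """Standard logistic density."""
    return math.exp(x) / (1 + math.exp(x)) ** 2


for x in (0.0, 1.0, 2.0, 3.0, 4.0):
    print(f"x={x:.1f}  normal={normal_pdf(x):.5f}  logistic={logistic_pdf(x):.5f}")

# Two-sided tail probabilities P(|e| > 3) under each distribution.
normal_tail = math.erfc(3 / math.sqrt(2))
logistic_tail = 2 / (1 + math.exp(3))
print("P(|e| > 3): normal =", normal_tail, " logistic =", logistic_tail)
```

Part of the difference comes from scale: the standard logistic distribution has variance pi^2/3, not 1, so it is more spread out than the standard normal even before tail shape is considered.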

15
Cumulative distribution function

/* running sums of the densities; multiply by the grid step 0.01 to approximate the CDFs */
gen cumul_logit=sum(logit)
gen cumul_probit=sum(probit)
twoway (connected cumul_probit x) (connected
cumul_logit x)

16
Which is Better? Logit or Probit?

From an empirical standpoint, logits and probits
typically yield similar estimates of the relevant
derivatives
This is because the cumulative distribution functions
for the two models differ only slightly, and mainly in
the tails of their respective distributions
The derivatives differ noticeably only if there are
enough observations in the tails of the
distribution
While the derivatives are usually similar, the
parameter estimates associated with the two
models are not
Multiplying the logit estimates by 0.625 makes
them comparable to the probit estimates
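
A rough numerical check of the 0.625 rule of thumb, as a Python sketch (standard library only). The idea is that the logistic error has standard deviation pi/sqrt(3), about 1.81, while the probit error has standard deviation 1, so logit coefficients are on a larger scale; 0.625 is 1/1.6. The grid and its range are arbitrary choices made for this illustration.

```python
import math


def normal_cdf(z):
    """Standard normal CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


def logistic_cdf(z):
    """Standard logistic CDF."""
    return 1 / (1 + math.exp(-z))


# Index values from -6 to 6 in steps of 0.01.
grid = [i / 100 for i in range(-600, 601)]

# Largest gap between the logit probability and the probit probability
# when the logit index is rescaled by 0.625.
max_gap = max(abs(logistic_cdf(z) - normal_cdf(0.625 * z)) for z in grid)

# Same comparison without rescaling.
unscaled_gap = max(abs(logistic_cdf(z) - normal_cdf(z)) for z in grid)

print("max gap with 0.625 rescaling:", max_gap)
print("max gap without rescaling:   ", unscaled_gap)
```

After rescaling, the two curves stay within a couple of percentage points of each other across the whole grid, which is why the two models tend to give similar predicted probabilities.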

17
Hypothesis Testing

Logit and probit models are part of the
binomial family in the generalized linear model
(GLM) framework.
All GLMs are fit using maximum likelihood estimation
(MLE) and provide a framework that we will use later
when we add panel and time-series considerations.
The key component in hypothesis testing of GLM models
is the likelihood: both the initial likelihood
and the final likelihood.
The likelihood also provides information
regarding goodness of fit. In GLM models we can
construct a measure called the deviance (G²),
which is computed as G² = -2 ln L
The deviance is similar to the residual sum of
squares from OLS.
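
A small Python sketch of the deviance for a binary outcome, computed from predicted probabilities. The outcomes and fitted probabilities below are hypothetical, and `deviance` is a helper written for this illustration.

```python
import math


def deviance(y, p):
    """G^2 = -2 * ln L for binary outcomes y and predicted probabilities p."""
    log_l = sum(
        yi * math.log(pi) + (1 - yi) * math.log(1 - pi)
        for yi, pi in zip(y, p)
    )
    return -2 * log_l


# Hypothetical outcomes.
y = [0, 1, 1, 0]

# "Initial" model: every observation gets the sample proportion of ones.
p_initial = [0.5, 0.5, 0.5, 0.5]

# "Final" model: hypothetical fitted probabilities that track the outcomes.
p_final = [0.2, 0.8, 0.7, 0.1]

g2_initial = deviance(y, p_initial)
g2_final = deviance(y, p_final)
print("initial deviance:", g2_initial)
print("final deviance:  ", g2_final)
```

As with the residual sum of squares, a smaller deviance means the fitted probabilities are closer to the observed outcomes.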

Hypothesis tests and confidence intervals are
standard across all MLE models.
Tests for individual slopes are based on the Wald
statistic
Tests that several slopes are jointly equal to
zero are based on the generalized
likelihood-ratio test (based on the deviance) and
have a χ² distribution. This is similar to the
F-test from OLS, where the difference in ESS from
a nested model is compared to the ESS from the
comparison model, with degrees of freedom
dependent on the number of parameters being
tested
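
In symbols, using standard textbook forms (the subscripts R and U for the restricted and unrestricted models, and q for the number of restrictions, are notation introduced here):

```latex
\begin{align*}
\text{Wald:}\quad z &= \frac{\hat{\beta}_k}{\widehat{\operatorname{se}}(\hat{\beta}_k)}
   \;\overset{a}{\sim}\; N(0, 1) \quad \text{under } H_0\colon \beta_k = 0 \\[6pt]
\text{Likelihood ratio:}\quad LR &= G^{2}_{R} - G^{2}_{U}
   = -2\,\bigl(\ln L_{R} - \ln L_{U}\bigr)
   \;\overset{a}{\sim}\; \chi^{2}_{q}
\end{align*}
```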

19
Example: EURO referendum, Sweden, September
2003. VALU 2003: exit polls from 80 polling places.

Dataset is a subset of the exit poll: 44
questions in total.
N = 10,732.
sweden_class.dta
Question of interest: how did you vote in the
referendum today? Yes means that Sweden should
join the EMU and adopt the Euro. No means that
Sweden will maintain the status quo.
Outcome: the referendum was defeated.
Substantively interesting for lots of reasons;
useful for this class because there are lots of
questions that are coded on nominal, ordinal
and ratio scales.

Contains data from C:\Documents and Settings\Administrator\Desktop\class_sweden.dta
  obs:        10,732                          Extract from Swedish Exit Poll Data
 vars:            14                          8 Sep 2004 11:39
 size:       203,908 (99.7% of memory free)
-------------------------------------------------------------------------------
              storage  display     value
variable name   type   format      label         variable label
-------------------------------------------------------------------------------
eu              byte   %40.0g      eu            Do you think Sweden should
                                                   resign from the EU or stay in
                                                   the Union
party           byte   %39.0g      party         What political party would you
                                                   vote for in a parliamentary
                                                   election today
gender          byte   %14.0g      gender        Gender
birthyear       int    %14.0g      birth_year    What year were you born
citizen         byte   %14.0g      citizen       Are you a Swedish citizen

trust           byte   %14.0g      trust         Generally speaking, how much
                                                   trust do you have for
                                                   politicians
employed        byte   %67.0g      employment    What is your employment
                                                   situation
immigration     byte   %33.0g      imm_vote      How important was the issue of
                                                   immigration for how you decided
                                                   to vote
democracy       byte   %33.0g      dem_vote      How important was democracy for
                                                   how you decided to vote
interestrate    byte   %33.0g      intrate_vote  How important was the
                                                   possibility for Sweden to
                                                   decided its interest rate for
                                                   ho
ownecon         byte   %33.0g      ownecon_vote  How important was the question
                                                   of your own economy for how you

22
Variable and Value Labels

Variable labels allow you to add a label that
contains a description of the variable in the
dataset
label var yesno "Yes=vote for Euro"
Value labels allow you to label the values that
an ordinal or nominal variable takes
label define eu 1 "Sweden should resign from the EU" 2 "Sweden should remain a member of the EU" 3 "No opinion on the matter" 9 "No information"
label values v12 eu
. tab eu

  Do you think Sweden should resign from |
             the EU or stay in the Union |      Freq.     Percent        Cum.
-----------------------------------------+-----------------------------------
        Sweden should resign from the EU |      2,499       28.16       28.16
  Sweden should remain a member of the E |      6,375       71.84      100.00
-----------------------------------------+-----------------------------------
                                   Total |      8,874      100.00