Modeling Consumer Decision Making and Discrete Choice Behavior presentation

Title: Modeling Consumer Decision Making and Discrete Choice Behavior

1
(No Transcript)
2
Econometrics in Health Economics
Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation
Professor William Greene
Stern School of Business, New York University
September 2-4, 2007
3
Discrete Choices

Observed outcomes
Inherently discrete
Number of occurrences (e.g., family size)
Behavior: drug use, smoking behavior
Implicitly continuous, censored
The observed data are discrete by construction
(e.g., revealed preferences)
Discrete decisions that reveal underlying
preferences
Implicit censoring mechanism
Implications to be considered
For model building
For analysis and prediction of behavior

4
Modeling Discrete Choice

Theoretical foundations
Econometric methodology
Models
Statistical bases
Econometric methods
Applications

5
Discrete Choice Modeling

Random Utility Models
Binary Choice Modeling
Extensions
Heterogeneity
Semiparametrics
Panel Data

6
Two Fundamental Building Blocks

Underlying Behavioral Theory: Random utility
model. The link between underlying behavior and
observed data
Empirical Tool: Stochastic, parametric model for
binary choice. A platform for models of discrete
choice

7
Behavioral Assumptions

Utility is defined over alternatives, j =
1,...,J(i,t)
U(i,j,t) is a preference ordering that exists for
individual i in choice situation t for
alternative j.
Preferences are transitive and complete with respect to
choice situations
Utility maximization assumption:
If U(i,1,t) > U(i,2,t), the individual
will choose alternative 1, not alternative 2.
Revealed preference (duality): If the consumer
chooses alternative 1 and not alternative 2, then
U(i,1,t) > U(i,2,t).

8
Indirect Utility Functions

Utility for individual i: U(x), defined over consumption
choices x
Utility maximization subject to budget
constraints produces x = x(Income, Prices)
Indirect utility: V(Income, Prices)
Observable heterogeneity produces indirect
utility
V(i,t,j) = V(Income, Prices, Age, Educ, Sex, ...)
Unobservable heterogeneity produces random
utility: U(i,t,j) = V(Income, Prices,
Characteristics) + ε

9
Random Indirect Utility Functions
U(i,j,t) = α_j + β_i'x(i,t,j) + γ_i'z_it + ε_ijt
α_j = Choice specific constant
x_itj = Attributes of choice presented to person, such as Price
β_i = Person specific taste weights
z_it = Characteristics of the person (age, income)
γ_i = Weights on person specific characteristics
ε_ijt = Unobserved random component of utility
10
Application

210 Commuters Between Sydney and Melbourne
Available modes: Air, Train, Bus, Car
Observed:
Choice
Attributes: Cost, terminal time, travel time,
other
Characteristics: Household income
First application: Fly or Other

11
A Formal Model for Binary Choice

Yes or No decision
Example: choose to fly or not to fly to a
destination when there
are alternatives.
Model: Net utility of flying
U_fly = α + β1 Cost + β2 Time + γ Income + ε
Choose to fly if net utility is positive
Net utility = U_FLY - U_NOT FLY
Data: x = [1, cost, terminal time, travel time]
z = [income]
y = 1 if choose fly (U_fly > 0), 0 if
not.

12
An Econometric Model

Choose to fly iff U_FLY > U_OTHER = 0 (Normalize)
U_fly = α + β1 Cost + β2 Time + γ Income + ε
U_fly > 0 ⇒ ε > -(α + β1 Cost + β2 Time
+ γ Income)
Probability model: For any person observed by
the analyst,
Prob(fly) = Prob[ε > -(α + β1 Cost + β2 Time
+ γ Income)]
Note the relationship between the unobserved ε
and the outcome
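
The probability statement above can be sketched in a few lines. This is only an illustration: it assumes a logistic distribution for ε (symmetric, so Prob[ε > -a] = F(a)), and the function name and all coefficient values are hypothetical, not estimates from this deck.

```python
import math

def prob_fly(cost, time, income, alpha, b_cost, b_time, g_income):
    """Prob(fly) = F(index) for logistic epsilon, since Prob[eps > -a] = F(a)."""
    index = alpha + b_cost * cost + b_time * time + g_income * income
    return 1.0 / (1.0 + math.exp(-index))

# Hypothetical coefficient values, chosen only to illustrate the mechanics.
params = dict(alpha=1.0, b_cost=-0.02, b_time=-0.03, g_income=0.02)

# With a negative cost coefficient the choice probability falls as cost rises.
for cost in (60, 100, 140, 180):
    print(cost, round(prob_fly(cost, 30, 50, **params), 4))
```

Summing such probabilities over individuals gives an expected demand that slopes downward in cost whenever the cost coefficient is negative.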

13
Binary Choice Data
Choose Air  Gen.Cost  Travel Time  Income
1.0000      86.000    25.000       70.000
.00000      67.000    69.000       60.000
.00000      77.000    64.000       20.000
.00000      69.000    69.000       15.000
.00000      77.000    64.000       30.000
.00000      71.000    64.000       26.000
.00000      58.000    64.000       35.000
.00000      71.000    69.000       12.000
.00000      100.00    64.000       70.000
1.0000      158.00    30.000       50.000
1.0000      136.00    45.000       40.000
1.0000      103.00    30.000       70.000
.00000      77.000    69.000       10.000
1.0000      197.00    45.000       26.000
.00000      129.00    64.000       50.000
.00000      123.00    64.000       70.000
14
A Case for Randomness of Utility

Does GC1 < GC2 mean the traveler will always choose choice 1?
Apparently not
Does Income explain the difference?
Apparently not

Choose Air  Gen.Cost  Travel Time  Income
1.0000      86.000    25.000       70.000
.00000      67.000    69.000       60.000

Choose Air  Gen.Cost  Travel Time  Income
.00000      100.00    64.000       70.000
1.0000      158.00    30.000       50.000
15
What Can We Learn from the Data?

Are the attributes important?
Aggregate predictions Total Demand
Value of time

16
Implied Demand Curve

Expected Demand for Flights As
So, we can obtain a downward sloping demand

17
Value of Time

We can also compute the value of time as
If the direct cost measure is unavailable, use
the negative of the income coefficient.
(Numerator will generally be negative.)

18
Econometric Frameworks

Nonparametric
Semiparametric
Parametric
Classical (Sampling Theory)
Bayesian
(We will focus on classical inference methods)

19
Modeling Approaches

Nonparametric: Relationship
Minimal Assumptions
Minimal Conclusions
Semiparametric: Index function
Stronger assumptions
Robust to model misspecification
(heteroscedasticity)
Still weak conclusions
Parametric: Probability and index function
Strongest assumptions: complete specification
Strongest conclusions
Possibly less robust. (Not necessarily)

20
Nonparametric: Not Very Informative
P(Air) = f(Income)
21
Semiparametric Approaches

Maximum Score
Find b so that
Σ_i sign(b'x_i) sign(y_i) is maximized (treating y_i = 0 as -1)
Maximize the number of observations for which
b'x_i < 0 when y_i = 0 and b'x_i > 0 when y_i = 1.
Questions: (1) What do the coefficients
mean? (2) If b is a solution, Kb is a solution
for any K > 0. See question (1).
(Solution is scaled so b'b = 1.)
(3) Is inference possible? (Apparently
not: Abrevaya)
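
The objective being maximized can be sketched as a simple count. The helper names and the toy data below are hypothetical; the sketch only evaluates the score for a candidate b and does not search for the maximizer.

```python
import math

def score(b, X, y):
    """Count observations where the sign of b'x agrees with the outcome."""
    hits = 0
    for xi, yi in zip(X, y):
        index = sum(bk * xk for bk, xk in zip(b, xi))
        if (index > 0 and yi == 1) or (index < 0 and yi == 0):
            hits += 1
    return hits

def normalize(b):
    """Rescale b so that b'b = 1; positive rescaling never changes the score."""
    norm = math.sqrt(sum(bk * bk for bk in b))
    return [bk / norm for bk in b]

# Toy data (made up): a constant and one regressor.
X = [[1.0, -2.0], [1.0, -1.0], [1.0, 0.5], [1.0, 2.0]]
y = [0, 0, 1, 1]
b = [0.0, 1.0]
print(score(b, X, y))
print(score([5.0 * bk for bk in b], X, y))  # same score for K = 5
print(normalize([3.0, 4.0]))
```

Because only signs enter the count, the scale of b is not identified, which is why question (2) points back to question (1).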

22
MSCORE
23
Semiparametric Approaches

Klein and Spady Kernel Based

24
Klein and Spady Semiparametric
Note necessary normalizations. Coefficients are
not very meaningful.
25
Likelihood Based Inference Methods
Behavioral Theory
Likelihood Function
Statistical Theory
Observed Measurement
The likelihood function embodies the theoretical
description of the population. Characteristics of
the population are inferred from the
characteristics of the likelihood function.
(Bayesian and Classical)
26
Parametric Logit Model
27
Logit vs. MScore

Logit fits worse
MScore fits better, coefficients
are meaningless

28
Parametric Model Estimation

How to estimate α, β1, β2, γ?
It's not regression
The technique of maximum likelihood
Prob[y=1] =
Prob[ε > -(α + β1 Cost + β2 Time + γ Income)]
Prob[y=0] = 1 - Prob[y=1]
Requires a model for the probability
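
A minimal sketch of maximum likelihood for this kind of model follows, assuming a logistic distribution and using the 16 (Choose Air, Gen.Cost) pairs from the binary choice data listing. Only generalized cost is used as a regressor here: in those 16 rows travel time separates the two outcomes perfectly, so a model including it would have no finite maximizer. The Newton iteration with step halving is one common way to maximize the log likelihood, not necessarily the method used to produce the results in this deck.

```python
import math

# (Choose Air, Gen.Cost) pairs from the 16-row listing in this deck.
data = [(1, 86), (0, 67), (0, 77), (0, 69), (0, 77), (0, 71), (0, 58), (0, 71),
        (0, 100), (1, 158), (1, 136), (1, 103), (0, 77), (1, 197), (0, 129), (0, 123)]
y = [d[0] for d in data]
x = [d[1] / 100.0 for d in data]  # cost in hundreds, for numerical stability

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def loglik(a, b):
    """Sum of log Prob[y=1] for the ones and log Prob[y=0] for the zeros."""
    return sum(math.log(logistic(a + b * xi)) if yi == 1
               else math.log(1.0 - logistic(a + b * xi))
               for xi, yi in zip(x, y))

def newton_logit(max_iter=100, tol=1e-10):
    a, b = 0.0, 0.0
    for _ in range(max_iter):
        p = [logistic(a + b * xi) for xi in x]
        # Gradient of the log likelihood.
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        if abs(g0) + abs(g1) < tol:
            break
        # Negative Hessian (2 x 2), inverted by hand.
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        da = (h11 * g0 - h01 * g1) / det
        db = (h00 * g1 - h01 * g0) / det
        # Step halving keeps each iteration from lowering the log likelihood.
        step, base = 1.0, loglik(a, b)
        while loglik(a + step * da, b + step * db) < base - 1e-12 and step > 1e-8:
            step /= 2.0
        a, b = a + step * da, b + step * db
    return a, b

a_hat, b_hat = newton_logit()
print(a_hat, b_hat, loglik(a_hat, b_hat))
```

With a constant term in the model, the fitted probabilities average to the sample proportion of ones at the maximum, which is a quick check on convergence.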

29
Completing the Model: F(ε)

The distribution
Normal: PROBIT, natural for behavior
Logistic: LOGIT, allows thicker tails
Gompertz: EXTREME VALUE, asymmetric, underlies
the basic logit model for multiple choice
Does it matter?
Yes, large difference in estimates
Not much, quantities of interest are more stable.

30
Application: Doctor Visits (No Attributes of the
Choices)
German Health Care Usage Data, 7,293 Individuals,
Varying Numbers of Periods
Data downloaded from the Journal of Applied
Econometrics Archive. This is an unbalanced panel
with 7,293 individuals. The data can be used for
regression, count models, binary choice, ordered
choice, and bivariate binary choice. This is a
large data set. There are altogether 27,326
observations. The number of observations per person ranges
from 1 to 7. (Frequencies are 1=1525, 2=2158,
3=825, 4=926, 5=1051, 6=1000, 7=987.) Note, the
variable NUMOBS below tells how many observations
there are for each person. This variable is
repeated in each row of the data for the person.
Variables in the file are:
DOCTOR = 1(Number of doctor visits > 0)
HSAT = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS = number of doctor visits in last three months
HOSPVIS = number of hospital visits in last calendar year
PUBLIC = insured in public health insurance = 1; otherwise = 0
ADDON = insured by add-on insurance = 1; otherwise = 0
HHNINC = household nominal monthly net income in German marks / 10000
(4 observations with income = 0 were dropped)
HHKIDS = children under age 16 in the household = 1; otherwise = 0
EDUC = years of schooling
AGE = age in years
MARRIED = marital status
31
Estimated Binary Choice (Probit) Model
---------------------------------------------
Binomial Probit Model
Dependent variable DOCTOR
Number of observations 27326
Log likelihood function -17670.94
Info. Criterion AIC 1.29378
Info. Criterion BIC 1.29559
Restricted log likelihood -18019.55
McFadden Pseudo R-squared .0193462
---------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Index function for probability
Constant    .15500247    .05651561       2.743     .0061
HHNINC     -.11643121    .04632875      -2.513     .0120    .35208362
HHKIDS     -.14118362    .01821758      -7.750     .0000    .40273000
EDUC       -.02811531    .00350266      -8.027     .0000    11.3206310
AGE         .01283460    .00079035      16.239     .0000    43.5256898
MARRIED     .05226039    .02046202       2.554     .0106    .75861817
32
Estimated Binary Choice Models
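Variable    PROBIT       LOGIT        EXTREME VALUE
Constant    0.155002     0.251115     0.560723
HHNINC     -0.116431    -0.185922    -0.140951
HHKIDS     -0.141184    -0.22947     -0.182789
EDUC       -0.0281153   -0.0455878   -0.035887
AGE         0.0128346    0.0207086    0.016202
MARRIED     0.0522604    0.085293     0.068080
Log-L      -17670.9     -17673.1     -17679.5
Log-L(0)   -18019.6     -18019.6     -18019.6
(The first column reproduces the probit estimates and log likelihood
reported above, so the column labels are ordered probit, logit,
extreme value.)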
33
Index = α + β1 Income + β2 Educ + γ Age
34
Effect on Predicted Probability of an Increase in
Age
α + β1 Income + β2 Educ + γ(Age+1)
(γ is positive)
35
Marginal Effects in Probability Models

Prob[Outcome] = some F(α + β'X)
Partial effect = ∂F(α + β'X) / ∂x
(derivative)
Partial effects are derivatives
Result varies with model
Logit: ∂F(α + β Age) / ∂x = Prob × (1 - Prob) ×
β_AGE
Probit: ∂F(α + β Age) / ∂x = Normal density × β_AGE
Scaling usually erases model differences

36
The Delta Method For Computing Standard Errors

37
Marginal Effects for Binary Choice

Logit
Probit

38
Marginal Effect for a Dummy Variable

Prob[y_i = 1 | x_i, d_i] = F(β'x_i + γ d_i)
= conditional mean
Marginal effect of d:
Prob[y_i = 1 | x_i, d_i = 1] - Prob[y_i = 1 | x_i, d_i = 0]
Logit
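
Both kinds of effect, the derivative for a continuous variable and the difference P|1 - P|0 for a dummy variable, can be sketched for the probit case using the coefficients and sample means from the probit table reported earlier in this deck. The function names are hypothetical. Evaluating at the sample means is an assumption of this sketch; the output can be compared with the probit rows of the estimated marginal effects table.

```python
import math

def norm_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

def norm_pdf(v):
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)

# Probit estimates and sample means as reported in the deck's probit table.
coef = {"Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
        "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039}
mean = {"Constant": 1.0, "HHNINC": 0.35208362, "HHKIDS": 0.40273000,
        "EDUC": 11.3206310, "AGE": 43.5256898, "MARRIED": 0.75861817}

def index_at(values):
    return sum(coef[k] * values[k] for k in coef)

def continuous_effect(name):
    """Normal density at the index times the coefficient, at the means."""
    return norm_pdf(index_at(mean)) * coef[name]

def dummy_effect(name):
    """P|1 - P|0 for a binary regressor, other variables at their means."""
    on, off = dict(mean), dict(mean)
    on[name], off[name] = 1.0, 0.0
    return norm_cdf(index_at(on)) - norm_cdf(index_at(off))

for v in ("HHNINC", "EDUC", "AGE"):
    print(v, round(continuous_effect(v), 8))
for v in ("HHKIDS", "MARRIED"):
    print(v, round(dummy_effect(v), 8))
```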

39
Estimated Marginal Effects
Variable   Estimate     Standard Error  t Ratio  P Value
PROBIT
HHNINC    -.04388304    .01746073       -2.513   .0120
EDUC      -.01059669    .00132014       -8.027   .0000
AGE        .00483737    .00029767       16.251   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05341443    .00691172       -7.728   .0000
MARRIED    .01978313    .00777809        2.543   .0110
LOGIT
HHNINC    -.04321347    .01744584       -2.477   .0132
EDUC      -.01059587    .00131215       -8.075   .0000
AGE        .00481326    .00029819       16.142   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05359813    .00692332       -7.742   .0000
MARRIED    .01993604    .00782159        2.549   .0108
EXTREME VALUE
HHNINC    -.04067557    .01667101       -2.440   .0147
EDUC      -.01035617    .00124994       -8.285   .0000
AGE        .00467547    .00029841       15.668   .0000
---------Marginal effect for dummy variables is P|1 - P|0.
HHKIDS    -.05417190    .00697888       -7.762   .0000
MARRIED    .02018367    .00790414        2.554   .0107
40
Computing Effects

Compute at the data means?
Simple
Inference is well defined
Average the individual effects
More appropriate?
Asymptotic standard errors.
Is hypothesis testing about marginal effects
meaningful?

41
Average Partial Effects
42
Krinsky and Robb Method
43
Partial Effects for a Probit Model
------------------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]
------------------------------------------------------
Krinsky and Robb Method With 100 Replications Using Average Partial Effects
HHNINC     -.04318266    .01619878      -2.666     .0077
HHKIDS     -.05225173    .00678277      -7.704     .0000
EDUC       -.01032588    .00131932      -7.827     .0000
AGE         .00474397    .00027718      17.115     .0000
MARRIED     .01894322    .00792639       2.390     .0169
Delta Method Using Partial Effects at Means
HHNINC     -.04388304    .01746073      -2.513     .0120
HHKIDS     -.05341443    .00691172      -7.728     .0000
EDUC       -.01059669    .00132014      -8.027     .0000
AGE         .00483737    .00029767      16.251     .0000
MARRIED     .01978313    .00777809       2.543     .0110
44
Elasticities

Elasticity
How to compute standard errors?
Delta method
Bootstrap
Bootstrap the individual elasticities? (Will
neglect variation in parameter estimates.)
Bootstrap model estimation?

45
Income Elasticity Krinsky and Robb
46
How Well Does the Model Fit?

There is no R squared
Fit measures computed from log L
pseudo R squared = 1 - logL/logL0
Others - these do not measure fit.
Direct assessment of the effectiveness of the
model at predicting the outcome

47
Fit Measures for Binary Choice

Likelihood Ratio Index
Bounded by 0 and 1
Rises when the model is expanded
Cramer (and others)
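
The log-likelihood-based measures can be sketched directly from the two log likelihoods reported for the probit model of DOCTOR. The per-observation normalization of the information criteria is an assumption of this sketch, chosen because it is consistent with the AIC and BIC values printed in the deck's output.

```python
import math

log_l = -17670.94    # unrestricted probit log likelihood (from the deck)
log_l0 = -18019.55   # restricted (constant only) log likelihood
n, k = 27326, 6      # observations; parameters including the constant

pseudo_r2 = 1.0 - log_l / log_l0             # likelihood ratio index
aic = (-2.0 * log_l + 2.0 * k) / n           # normalized by n (assumption)
bic = (-2.0 * log_l + k * math.log(n)) / n   # normalized by n (assumption)
lr = 2.0 * (log_l - log_l0)                  # LR statistic, all slopes zero

print(round(pseudo_r2, 7), round(aic, 5), round(bic, 5), round(lr, 2))
```

The likelihood ratio index is small here even though the LR statistic is very large, which illustrates why the index is not a measure of fit in the regression sense.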

48
Fit Measures for the Probit Model
----------------------------------------
Fit Measures for Binomial Choice Model
Probit model for variable DOCTOR
----------------------------------------
Proportions  P0 = .370892   P1 = .629108
N = 27326    N0 = 10135     N1 = 17191
LogL = -17670.942   LogL0 = -18019.552
Estrella = 1-(L/L0)^(-2L0/n) = .02544
----------------------------------------
Efron      McFadden    Ben./Lerman
.02448     .01935      .54500
(McFadden is the Pseudo R-squared)
Cramer     Veall/Zim.  Rsqrd_ML
.02492     .04374      .02519
----------------------------------------
Information  Akaike I.C.  Schwarz I.C.
Criteria     1.29378      1.29559
----------------------------------------
Predictions for Binary Choice Model.
------------------------------------------------------
Actual      Predicted Value
Value       0               1               Total Actual
------------------------------------------------------
0           367 (  1.3)     9768 ( 35.7)    10135 ( 37.1)
1           387 (  1.4)    16804 ( 61.5)    17191 ( 62.9)
------------------------------------------------------
Total       754 (  2.8)    26572 ( 97.2)    27326 (100.0)
------------------------------------------------------
49
Predicting the Outcome

Predicted probabilities
P = F(a + b1 Age + b2 Educ + c Income)
Predicting outcomes
Predict y = 1 if P is large
Use 0.5 for large (more likely than not)
Generally, use
Count successes and failures

50
Aggregate Prediction is a Useful Way to Assess
the Importance of a Variable
Predictions for Binary Choice Model. Predicted value is 1 when
probability is greater than .500000, 0 otherwise.

Model fit without Income
------------------------------------------------------
Actual      Predicted Value
Value       0               1               Total Actual
------------------------------------------------------
0           351 (  1.3)     9784 ( 35.8)    10135 ( 37.1)
1           409 (  1.5)    16782 ( 61.4)    17191 ( 62.9)
------------------------------------------------------
Total       760 (  2.8)    26566 ( 97.2)    27326 (100.0)
------------------------------------------------------

Model fit with Income
------------------------------------------------------
Actual      Predicted Value
Value       0               1               Total Actual
------------------------------------------------------
0           367 (  1.3)     9768 ( 35.7)    10135 ( 37.1)
1           387 (  1.4)    16804 ( 61.5)    17191 ( 62.9)
------------------------------------------------------
Total       754 (  2.8)    26572 ( 97.2)    27326 (100.0)
------------------------------------------------------
Model fit with Income has 38 more correct
predictions (17,171 versus 17,133 on the diagonals)
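
Counting successes and failures from a prediction table can be sketched as below. The cell counts are copied from the two tables in this deck; the function names and the nested-list layout are choices made for this illustration.

```python
def predict(p, threshold=0.5):
    """Predict y = 1 when the fitted probability exceeds the threshold."""
    return 1 if p > threshold else 0

def correct_predictions(table):
    """table[actual][predicted] holds counts; correct cases lie on the diagonal."""
    return table[0][0] + table[1][1]

# Cell counts from the deck: rows are actual 0/1, columns are predicted 0/1.
without_income = [[351, 9784], [409, 16782]]
with_income = [[367, 9768], [387, 16804]]
n = 27326

for name, t in (("without income", without_income), ("with income", with_income)):
    hits = correct_predictions(t)
    print(name, hits, round(hits / n, 4))
```

Nearly every observation is predicted as 1 under the 0.5 rule in both models, so the hit rate here mostly reflects the sample proportion of ones.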
51
Hypothesis Tests

Restrictions: Linear or nonlinear functions of
the model parameters
Structural change: Constancy of parameters
Specification Tests: Heteroscedasticity, model
specification (distribution)

52
Hypothesis Testing Conventional Neyman/Pearson

Comparisons of Likelihood Functions: Likelihood
Ratio Tests
Distance Measures: Wald Statistics
Lagrange Multiplier Tests

53
Likelihood Ratio Tests

Null hypothesis restricts the parameter vector
Alternative releases the restriction
Test statistic: Chi-squared =
2(LogL|Unrestricted model -
LogL|Restricted model) > 0
Degrees of freedom = number of restrictions

54
Wald Test

Unrestricted parameter vector is estimated
Discrepancy: m = Rb - q (or r(b,q) if nonlinear)
is computed
Variance of discrepancy is estimated
Wald Statistic is m'[Var(m)]^-1 m

55
Lagrange Multiplier Test

Restricted model is estimated
Derivatives of unrestricted model and variances
of derivatives are computed at restricted
estimates
Wald test of whether derivatives are zero tests
the restrictions
Usually hard to compute difficult to program
the derivatives and their variances.

56
Hypothesis Tests Results

LIKELIHOOD RATIO
LRTEST 88.766777
WALD
Matrix WALDSTAT has 1
rows and 1 columns.
1
--------------
1 89.26382
LAGRANGE MULTIPLIER
---------------------------------------------
Binary Logit Model for Binary Choice
Dependent variable DOCTOR
Number of observations 27326
LM Stat. at start values 89.88971
LM statistic kept as scalar LMSTAT
---------------------------------------------

Testing the hypothesis that the coefficients on
the income and education variables are equal to
zero in the logit model.
57
Testing Structural Stability

Fit the same model in each of K subsamples
Unrestricted log likelihood is the sum of the
subsample log likelihoods: LogL1
Pool the subsamples, fit the model to the pooled
sample
Restricted log likelihood is that from the pooled
sample: LogL0
Chi-squared = 2(LogL1 - LogL0), degrees of
freedom = (K-1) × model size.

58
A Test of Structural Stability

(Application to be examined later) Liberal arts
college has gender economics course?
Covariates constant, size of economics faculty,
academic affiliation, religious affiliation
Data from 4 U.S. regions, West, North, South,
midwest.
Is the same model appropriate for all 4 regions?

59
Application: Men vs. Women Model for Doctor
Probit ; for[female=1] ; Lhs = Doctor ; Rhs = X
Log likelihood function     -7855.219
Probit ; for[female=0] ; Lhs = Doctor ; Rhs = X
Log likelihood function     -9541.066
Probit ; Lhs = Doctor ; Rhs = X
Log likelihood function     -17670.94
2[LogL(Female) + LogL(Male) - LogL(Pooled)]
------------------------------------
Listed Calculator Results
------------------------------------
Result = 549.310000 (Chi squared with 6 D.F.)
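
The calculation can be reproduced from the three log likelihoods. The tail probability below uses a closed form that holds only for an even number of degrees of freedom, which is enough for this case; the helper name is hypothetical.

```python
import math

def chi2_sf_even_df(x, df):
    """Upper tail of a chi-squared variate; closed form valid for even df only."""
    if df % 2 != 0:
        raise ValueError("this closed form needs an even number of degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 1.0
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

# Log likelihoods reported in the deck for the two subsamples and the pooled sample.
logl_female, logl_male, logl_pooled = -7855.219, -9541.066, -17670.94
k_groups, model_size = 2, 6

lr = 2.0 * (logl_female + logl_male - logl_pooled)
df = (k_groups - 1) * model_size
print(round(lr, 2), df, chi2_sf_even_df(lr, df))
```

The p-value is essentially zero, so the hypothesis that the same coefficient vector applies to men and women is rejected.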
60
Scaling

U_itj = α_j + β_i'x_itj + γ_i'z_it + ε_ijt
ε_ijt = Unobserved random component of utility
Mean: E[ε_ijt] = 0, Var[ε_ijt] = 1
Mean 0 is innocent. Why assume variance 1?
What if there are subgroups with different
variances?
Cost of ignoring the between group variation?
Specifically modeling
More general heterogeneity across people
Cost of the homogeneity assumption
Modeling issues

61
Heteroscedasticity in Binary Choice Models

Random utility: Y_i = 1 iff β'x_i + ε_i > 0
Resemblance to regression: How to accommodate
heterogeneity in the random unobserved effects
across individuals?
Heteroscedasticity: different scaling
Parameterize: Var[ε_i] = exp(γ'z_i)
Reformulate probabilities
Probit
Partial effects are now very complicated

62
Heteroscedasticity in Marginal Effects

For the univariate case:
E[y_i | x_i, z_i] = Φ[β'x_i / exp(γ'z_i)]
∂E[y_i | x_i, z_i] / ∂x_i = φ[β'x_i / exp(γ'z_i)] × β / exp(γ'z_i)
∂E[y_i | x_i, z_i] / ∂z_i = φ[β'x_i / exp(γ'z_i)] × [-β'x_i / exp(γ'z_i)] × γ
If the variables are the same in x and z, these
are added. Sign and magnitude are ambiguous
63
Testing For Heteroscedasticity

Likelihood Ratio, Wald and Lagrange Multiplier
Tests are all straightforward
All tests require a specification of the model of
heteroscedasticity
There is no generic test for heteroscedasticity
without a specific model

64
Robust Estimation

There is no heteroscedasticity robust (White)
covariance estimator.
Robust (semiparametric) parameter estimators do
not allow further analysis.
Only ratios of coefficients are estimable
Probabilities and partial effects cannot be
computed. (Scaling is not accounted for.)

65
Heteroscedasticity in the Doctor Equation
---------------------------------------------
Binomial Probit Model
Dependent variable DOCTOR
Log likelihood function -17496.19
Log likelihood function       -17670.94 (Restricted model. LR = 349.5 w/ 2 DF)
LM Stat. at start values       313.4050 (Computed separately)
---------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Index function for probability
Constant    .06472667    .01268180       5.104     .0000
HHNINC     -.01170328    .00554442      -2.111     .0348    .35208362
HHKIDS     -.01356948    .00462470      -2.934     .0033    .40273000
EDUC       -.00084257    .00051971      -1.621     .1050    11.3206310
AGE        -.00030092    .00014827      -2.030     .0424    43.5256898  (Note negative coefficient)
MARRIED     .00610723    .00268916       2.271     .0231    .75861817
Variance function
AGE        -.03914159    .00629100      -6.222     .0000    43.5256898  (Note larger negative coefficient)
FEMALE     -.77274469    .05529956     -13.974     .0000    .47877479   (Highly significant.)
----------------------------------------------------------------------
Partial derivatives of E[y] = F[*] with respect to
the vector of characteristics.
----------------------------------------------------------------------
HHNINC     -.03554999    .01374920      -2.586     .0097
HHKIDS     -.04121878    .00714739      -5.767     .0000
EDUC       -.00255939    .00127739      -2.004     .0451
AGE         .00350153    .00349942       1.001     .3170
MARRIED     .01855137    .00586656       3.162     .0016
Variance function
AGE         .00350153    .00349942       1.001     .3170    (Note positive marginal effect!)
FEMALE      .08717426    .04486398       1.943     .0520    (Insignificant?)
66
Choice Based Sampling

Sample estimator (MLE) mimics the sample
MLE assumes the sample mimics the population
If the sample is nonrepresentative of the
population, the MLE will be also.
Choice based samples
Sample is biased
Certain outcomes (choices) are over- or
undersampled
Estimator (MLE) will produce estimates that mimic
this bias.

67
Choice Based Sample for Transport Mode

68
Weighting and Choice Based Sampling

Weighted log likelihood for all data types
Endogenous weights for individual data
Biased sampling Choice Based

69
Choice Based Sampling Correction

Maximize Weighted Log Likelihood
Covariance Matrix Adjustment
V = H^-1 G H^-1 (all three weighted)
H = Hessian
G = Outer products of gradients

70
Effect of Choice Based Sampling

Unweighted
--------------------------------------------------------
Variable   Coefficient        Standard Error  b/St.Er.  P[|Z|>z]
--------------------------------------------------------
Constant    1.784582594       1.2693459        1.406    .1598
GC          .2146879786E-01   .68080941E-02    3.153    .0016
TTME       -.9846704221E-01   .16518003E-01   -5.961    .0000
HINC        .2232338915E-01   .10297671E-01    2.168    .0302

Corrected for Choice Based Sampling (Weighting variable CBWT)
--------------------------------------------------------
Variable   Coefficient        Standard Error  b/St.Er.  P[|Z|>z]
--------------------------------------------------------
Constant    1.014022236       1.1786164         .860    .3896
GC          .2177810754E-01   .63743831E-02    3.417    .0006
TTME       -.7434280587E-01   .17721665E-01   -4.195    .0000
HINC        .2471679844E-01   .95483369E-02    2.589    .0096
71
Panel Data Treatments

Pooling and robust estimation
Clustering corrections
Panel estimators
Random effect
Fixed effects
Modeling heterogeneity
Common effects
Random parameters
Mixed models
Latent class models

72
Panel Data Application

Did firm i produce a product or process
innovation in year t? y_it = 1[Yes] / 0[No]
Observed N = 1270 firms for T = 5 years, 1984-1988
Observed covariates: x_it = Industry, competitive
pressures, size, productivity, etc.
How to model?
Binary outcome
Correlation across time
Heterogeneity across firms

73
Application
74
Cluster Effects in Panel and Stratified Data

What do we mean by this?
Clustering is with respect to the dependent
variable
Clustering is with respect to unobserved
effects in the model
Clustering with respect to independent
variables is irrelevant and should be ignored.
Correction is with respect to the covariance
matrix, not the estimator
Is the robust covariance matrix robust? To
what?
What assumptions are needed for the correction
to work? The pooled estimator must be consistent!

75
Cluster Correction
76
(No Transcript)
77
Fixed and Random Effects in Regression

y_it = a_i + b'x_it + e_it
Random effects: Two step FGLS. First step is OLS
Fixed effects: OLS based on group mean
differences
Neither works (even approximately) if the model
is nonlinear.
How do we proceed for a binary choice model?
y_it* = a_i + b'x_it + e_it
y_it = 1 if y_it* > 0, 0 otherwise.

78
Panel Data and Binary Choice Models

Random Utility Model for Binary Choice
U_it = α + β'x_it + ε_it + Person i
specific effect
Fixed effects using dummy variables
U_it = α_i + β'x_it + ε_it
Random effects using omitted heterogeneity
U_it = α + β'x_it + (ε_it + v_i)
Same outcome mechanism: Y_it = 1[U_it > 0]

79
Fixed and Random Effects Models

Fixed Effects
Robust to both specifications
Inconvenient to compute (many parameters)
Incidental parameters problem
Random Effects
Inconsistent if effects are correlated with X
Small(er) number of parameters
Easier (?) to compute
Computation available estimators
Other Approaches to Modeling Heterogeneity

80
Random Effects

U_it = α + β'x_it + (ε_it + σ_v v_i)
Logit model (can be generalized)
Joint probability for individual i | v_i
Unobserved component v_i must be eliminated
Maximize wrt α, β and σ_v
How to do the integration?
Analytic integration: Integral does not exist in
closed form
Quadrature: most familiar software
Simulation

81
Quadrature Butler and Moffitt
82
Estimation by Simulation
The log likelihood is the sum of the logs of
E[Pr(y1, y2, ... | v_i)]. It can be estimated by
sampling v_i and averaging. (Use random numbers.)
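
One individual's contribution to the simulated log likelihood can be sketched as follows, assuming a random effects logit with a single regressor and standard normal v_i. The function names, data and parameter values are hypothetical, and ordinary pseudo-random draws are used here in place of the Halton draws mentioned in the deck's output.

```python
import math
import random

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def joint_prob_given_v(y, x, alpha, beta, sigma_v, v):
    """Product over t of Pr(y_it | x_it, v), index alpha + beta*x_it + sigma_v*v."""
    p = 1.0
    for yt, xt in zip(y, x):
        pt = logistic(alpha + beta * xt + sigma_v * v)
        p *= pt if yt == 1 else 1.0 - pt
    return p

def simulated_loglik_i(y, x, alpha, beta, sigma_v, draws):
    """Log of the average, over the draws of v, of the conditional joint probability."""
    avg = sum(joint_prob_given_v(y, x, alpha, beta, sigma_v, v) for v in draws) / len(draws)
    return math.log(avg)

rng = random.Random(12345)                        # fixed seed so the run is repeatable
draws = [rng.gauss(0.0, 1.0) for _ in range(500)]
y_i, x_i = [1, 0, 1], [0.2, -0.4, 1.1]            # made-up data for one individual
print(simulated_loglik_i(y_i, x_i, 0.1, 0.5, 0.9, draws))
```

Summing this quantity over individuals gives the simulated log likelihood, which is then maximized over α, β and σ_v. The same draws should be reused at every parameter value so that the objective is a smooth function of the parameters.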
83
Estimated Random Effects Models
---------------------------------------------
Random Effects Binary Probit Model
Log likelihood function -16273.96
Restricted log likelihood -17670.94
Unbalanced panel has 7293 individuals.
---------------------------------------------
Butler/Moffitt Quadrature
----------------------------------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Constant    .03411277    .09635399        .354     .7233
HHNINC     -.00317550    .06667150       -.048     .9620    .35208362
HHKIDS     -.15378566    .02704366      -5.687     .0000    .40273000
EDUC       -.03369428    .00628888      -5.358     .0000    11.3206310
AGE         .02014296    .00131894      15.272     .0000    43.5256898
MARRIED     .01632531    .03134693        .521     .6025    .75861817
Rho         .44789069    .01020965      43.869     .0000

Simulation with 50 Halton Points
---------------------------------------------
Random Coefficients Probit Model
Log likelihood function       -16279.97
PROBIT (normal) probability model
Simulation based on 50 Halton draws
---------------------------------------------
Means for random parameters
Constant    .03329051    .06322876        .527     .5985
Nonrandom parameters
HHNINC     -.00297343    .05201195       -.057     .9544    .35208362
HHKIDS     -.15357945    .02028593      -7.571     .0000    .40273000
EDUC       -.03348872    .00393143      -8.518     .0000    11.3206310
AGE         .02007864    .00090132      22.277     .0000    43.5256898
MARRIED     .01682560    .02277150        .739     .4600    .75861817
Scale parameters for dists. of random parameters
Constant    .90088375    .01126251      79.990     .0000
RHO = .90088375^2 / (1 + .90088375^2) = .447997
84
Ignoring Unobserved Heterogeneity
85
The Effect of Ignoring Random Effects: Logit
Coefficient Estimates
logit ; lhs = doctor ; rhs = x ; mar ; pds = _groupti ; ran
----------------------------------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Random Effects
Constant   -.13460475    .17764130       -.758     .4486
HHNINC      .02191356    .11865884        .185     .8535
HHKIDS     -.21598299    .04773805      -4.524     .0000
EDUC       -.06357790    .01132182      -5.616     .0000
AGE         .03926718    .00246587      15.924     .0000
MARRIED     .02507118    .05628204        .445     .6560
Rho         .41607571    .00583916      71.256     .0000
Pooled
Constant    .25111543    .09113537       2.755     .0059
HHNINC     -.18592232    .07506403      -2.477     .0133    .35208362
HHKIDS     -.22947000    .02953694      -7.769     .0000    .40273000
EDUC       -.04558783    .00564646      -8.074     .0000    11.3206310
AGE         .02070863    .00128517      16.114     .0000    43.5256898
MARRIED     .08529305    .03328573       2.562     .0104    .75861817
The cluster estimator does not fix this.
86
The Effect of Ignoring Random Effects: Marginal
Effects
----------------------------------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Elasticity
----------------------------------------------------------------------
Random Effects
HHNINC      .00358351    .01940382        .185     .8535     .00198108
EDUC       -.01039686    .00184906      -5.623     .0000    -.18480732
AGE         .00642134    .00040269      15.946     .0000     .43885160
---------Marginal effect for dummy variable is P|1 - P|0.
HHKIDS     -.03544814    .00786141      -4.509     .0000    -.02241578
MARRIED     .00410498    .00922645        .445     .6564     .00488969
Pooled
HHNINC     -.04321347    .01744584      -2.477     .0132    -.02405262
EDUC       -.01059587    .00131215      -8.075     .0000    -.18962896
AGE         .00481326    .00029819      16.142     .0000     .33119369
---------Marginal effect for dummy variable is P|1 - P|0.
HHKIDS     -.05359813    .00692332      -7.742     .0000    -.03412409
MARRIED     .01993604    .00782159       2.549     .0108     .02390890
87
Fixed Effects

Dummy variable coefficients
U_it = α_i + β'x_it + ε_it
Can be done by brute force for 10,000s of
individuals
F(.) = appropriate probability for the observed
outcome
Compute β and α_i for i = 1,...,N (may be large)
See "Estimating Econometric Models with Fixed
Effects" at www.stern.nyu.edu/wgreene

88
Models with Fixed Individual Effects

Additive Effects
Log Likelihood Function
Approach
Conditional estimation based on sufficient
statistics
Unconditional, brute force with all dummy
variables

89
Conditional Estimation

Principle: f(y_i1, y_i2, ... | some statistic) is free
of the fixed effects for some models.
Maximize the conditional log likelihood, given
the statistic.

90
Conditional Logit Model
91
Conditional Estimation

Other Distributions?
Poisson the leading nonbinary case
Loglinear Exponential
Almost no others
Estimating constants is still a problem if
marginal effects or probabilities are desired

92
Example Two Period Binary Logit

93
Binary Logit, cont.

Estimate β by maximizing conditional logL
Estimate α_i by using the known β in the FOC for
the unconditional logL
Solve for the N constants, one at a time, treating
β as known.
No solution when the y_it sum to 0 or T_i
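
The reason conditioning works in the two-period logit can be checked numerically: given y_i1 + y_i2 = 1, the probability of the sequence (0, 1) equals Λ(β(x_i2 - x_i1)) and does not involve α_i. The sketch below uses a single regressor and hypothetical values.

```python
import math

def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))

def cond_prob_01(alpha_i, beta, x1, x2):
    """Pr(y1=0, y2=1 | y1 + y2 = 1), built from the unconditional logit probabilities."""
    p1 = logistic(alpha_i + beta * x1)
    p2 = logistic(alpha_i + beta * x2)
    p01 = (1.0 - p1) * p2
    p10 = p1 * (1.0 - p2)
    return p01 / (p01 + p10)

beta, x1, x2 = 0.7, 0.3, 1.5   # hypothetical values
for alpha_i in (-2.0, 0.0, 3.0):
    print(alpha_i, cond_prob_01(alpha_i, beta, x1, x2))
print("logistic(beta * (x2 - x1)) =", logistic(beta * (x2 - x1)))
```

Individuals whose outcomes sum to 0 or to T_i contribute a conditional probability of one and carry no information about β, which matches the no-solution case noted above for their constants.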

94
Wooldridge on Estimating Partial Effects

The fixed effects logit estimator of β
immediately gives us the effect of each element
of x_i on the log-odds ratio. Unfortunately, we
cannot estimate the partial effects unless we
plug in a value for a_i. Because the distribution
of a_i is unrestricted (in particular, E[a_i] is
not necessarily zero), it is hard to know what to
plug in for a_i. In addition, we cannot estimate
average partial effects, as doing so would
require finding E[Λ(x_it β + a_i)], a task that
apparently requires specifying a distribution for
a_i.

95
Logit Constant Terms
96
Unconditional Estimation

Maximize the whole log likelihood
Difficult! Many (thousands) of parameters.
Feasible NLOGIT (2001) (Brute force)

97
Unconditional Estimator
---------------------------------------------
FIXED EFFECTS Logit Model
Log likelihood function -9458.638
Unbalanced panel has 7293 individuals.
Bypassed 3046 groups with inestimable a(i).
LOGIT (Logistic) probability model
---------------------------------------------
----------------------------------------------------------------------
Variable   Coefficient  Standard Error  b/St.Er.  P[|Z|>z]  Mean of X
----------------------------------------------------------------------
Index function for probability
HHNINC     -.06097272    .17828658       -.342     .7324    .35357827
HHKIDS     -.08840685    .07439887      -1.188     .2347    .43906021
EDUC       -.11670836    .06674866      -1.748     .0804    11.3554798
AGE         .10475175    .00725480      14.439     .0000    43.0477999
MARRIED    -.05731835    .10608750       -.540     .5890    .77178591
----------------------------------------------------------------------
Partial derivatives of E[y] = F[*] with respect to
the vector of characteristics. They are
computed at the means of the Xs.
Estimated E[y|means, mean alphai] = .612
Estimated scale factor for dE/dx  = .237
----------------------------------------------------------------------
HHNINC     -.01447714    .04208235       -.344     .7308    .35357827
---------Marginal effect for binary independent variable
HHKIDS     -.02099100    .00415463      -5.052     .0000    .43906021
EDUC       -.02771081    .02037808      -1.360     .1739    11.3554798
AGE         .02487187    .00420872       5.910     .0000    43.0477999
---------Marginal effect for binary independent variable
MARRIED    -.01360946    .00469760      -2.897     .0038    .77178591
98
Conditional Estimator
-------------------------------------------------
Panel Data Binomial Logit Model
Number of individuals        7293
Number of periods            _GROUPTI
Conditioning event is the sum of DOCTOR
-------------------------------------------------
Log likelihood function      -6299.016
-------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
---------------------------------------------------------------------------
HHNINC      -.05038297      .15887796       -.317      .7512
HHKIDS      -.07776425      .06628241      -1.173      .2407
EDUC        -.09081577      .05667292      -1.602      .1091
AGE          .08475971      .00650217      13.036      .0000
MARRIED     -.05207227      .09304411       -.560      .5757
---------------------------------------------------------------------------
Partial derivatives of probabilities with respect to the vector of
characteristics. They are computed at the means of the Xs.
(Constant terms estimated individually after estimation of beta by
Chamberlain method)
Observations used are All Obs.
---------------------------------------------------------------------------
Marginal effect for variable in probability
HHNINC      -.01205716      .03807980       -.317      .7515    -.00703544
HHKIDS      -.01860977      .01891717       -.984      .3252    -.34915032
EDUC        -.02173314      .01390241      -1.563      .1180    -.03601829
AGE          .02028386      .00310658       6.529      .0000    1.46317734
Marginal effect for dummy variable is P1 - P0.
MARRIED     -.00314259      .00566690       -.555      .5792    -.00209750
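The "conditioning event" in this output is the sum of the outcomes for each individual. A minimal sketch of why this removes the individual effect follows. It assumes a logit model with effect alpha_i; the data, the coefficient values, and the function names are hypothetical, chosen only for illustration. The conditional probability of the observed sequence, given its sum, is computed once without alpha_i and once by brute force with alpha_i included.

```python
from itertools import combinations, product
from math import exp, log

def logistic(z):
    return 1.0 / (1.0 + exp(-z))

def index(beta, row):
    return sum(b * x for b, x in zip(beta, row))

def conditional_loglik(beta, x_rows, y):
    """log Prob(y_1, ..., y_T | sum_t y_t); alpha_i has cancelled out."""
    ones = [t for t, value in enumerate(y) if value == 1]
    numerator = exp(sum(index(beta, x_rows[t]) for t in ones))
    # Denominator: every 0/1 sequence with the same number of ones.
    denominator = sum(
        exp(sum(index(beta, x_rows[t]) for t in subset))
        for subset in combinations(range(len(y)), len(ones))
    )
    return log(numerator / denominator)

def brute_force_conditional_prob(beta, alpha, x_rows, y):
    """The same probability built from the full logit model, alpha_i included."""
    def joint(seq):
        p = 1.0
        for row, value in zip(x_rows, seq):
            prob_one = logistic(alpha + index(beta, row))
            p *= prob_one if value == 1 else 1.0 - prob_one
        return p
    same_sum = [s for s in product((0, 1), repeat=len(y)) if sum(s) == sum(y)]
    return joint(tuple(y)) / sum(joint(s) for s in same_sum)

# Hypothetical individual: T = 3 periods, two regressors, outcome sequence y.
beta = [0.08, -0.05]
x_rows = [[43.0, 0.30], [44.0, 0.35], [45.0, 0.33]]
y = [0, 1, 1]

cond = exp(conditional_loglik(beta, x_rows, y))
for alpha in (-5.0, 0.0, 2.0):
    print(alpha, brute_force_conditional_prob(beta, alpha, x_rows, y), cond)

# A group with sum(y) = 0 or sum(y) = T has only one possible sequence, so it
# contributes log(1) = 0 and carries no information about beta.
print(conditional_loglik(beta, x_rows, [0, 0, 0]))
```

The same groups (all zeros or all ones) are the ones the unconditional estimator reports as having an inestimable a(i).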
99
Advantages of the FE Model

Allows correlation of effect and regressors
Fairly straightforward to estimate
Simple to interpret

100
Disadvantages of FE

Not necessarily simple to estimate in very large
samples (Stata just creates the thousands of
dummy variables)
The incidental parameters problem: small T bias.

101
Incidental Parameters Problems: Conventional
Wisdom

General: Biased in samples with fixed T except
in special cases such as linear or Poisson
regression
Specific: Upward bias (experience with probit
and logit) in estimators of β

102
What We Know About the IP Problem in Binary
Choice Models

Andersen, Hsiao, Abrevaya (Exact Analytic)
Bias in logit estimator is exactly 100% when
T = 2
No result ever obtained for any other model for
T = 2
Heckman (Nonreplicable Monte Carlo study)
Bias in probit estimator is small if T ≥ 8
Bias in probit estimator is toward 0 in some
cases
Katz (and numerous others), Greene
Bias in probit and logit estimators is large
Upward bias persists even as T → 20
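The T = 2 logit result can be illustrated with a small simulation. This is a sketch under stated assumptions, not the design of any of the studies cited above: one regressor with x = 0 in period 1 and x = 1 in period 2, true beta = 1, and normally distributed effects (all choices made here for illustration). In this design the conditional MLE is ln(n01/n10), where n01 and n10 count the individuals who switch from 0 to 1 and from 1 to 0. Concentrating alpha_i out of the unconditional likelihood gives alpha_i = -beta/2 for each switcher and an unconditional MLE of exactly twice the conditional one.

```python
import random
from math import exp, log

random.seed(12345)
beta = 1.0      # true slope (assumption for this experiment)
n = 200_000     # individuals, each observed for T = 2 periods

def logistic(z):
    return 1.0 / (1.0 + exp(-z))

n01 = n10 = 0
for _ in range(n):
    alpha = random.gauss(0.0, 1.0)                 # individual effect
    y1 = random.random() < logistic(alpha)         # period 1, x = 0
    y2 = random.random() < logistic(alpha + beta)  # period 2, x = 1
    if not y1 and y2:
        n01 += 1
    elif y1 and not y2:
        n10 += 1

# Conditional (Chamberlain) estimator: only the switchers matter.
beta_conditional = log(n01 / n10)
# Unconditional (dummy variable) estimator in this design: twice the conditional one.
beta_unconditional = 2.0 * beta_conditional

print(f"conditional   estimate: {beta_conditional:.3f}")
print(f"unconditional estimate: {beta_unconditional:.3f}")
```

With a large number of individuals the conditional estimate settles near the true value, while the unconditional estimate settles near twice that value: a bias of 100% that does not shrink as n grows.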

103
Some Familiar Territory: A Monte Carlo Study of
the FE Estimator, Probit vs. Logit
Estimates of Coefficients and Marginal Effects at
the Implied Data Means
Results are scaled so that the quantities being
estimated (β, δ, marginal effects) all equal 1.0
in the population.
104
Specification Tests: RE vs. FE

Fixed effects vs. Random effects
Unconditional FE estimator is never consistent
(if T is small)
RE is inconsistent if FE applies
Cannot use Hausman test
Effects vs. no effects
Conditional FE estimator is always consistent
Pooled estimator is consistent if no effects
Can use Hausman test for this. The specification
test is for common effects vs. no effects, not
fixed effects vs. no effects.
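The Hausman statistic for the "effects vs. no effects" comparison can be sketched as follows. It contrasts the slopes from an estimator that is consistent in both cases (conditional FE) with those from one that is efficient only without effects (pooled): H = d'[V_c - V_p]^(-1) d, where d is the difference of the slope vectors. The numbers below are hypothetical and are not estimates from these slides; the helper names are also illustrative.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x

def hausman(b_consistent, v_consistent, b_efficient, v_efficient):
    """Quadratic form in the slope differences; compare with chi-squared(K)."""
    d = [bc - be for bc, be in zip(b_consistent, b_efficient)]
    v = [[vc - ve for vc, ve in zip(row_c, row_e)]
         for row_c, row_e in zip(v_consistent, v_efficient)]
    return sum(di * xi for di, xi in zip(d, solve(v, d)))

# Hypothetical slope estimates and covariance matrices (constant term excluded).
b_conditional = [0.085, -0.050]
v_conditional = [[0.00004, 0.0], [0.0, 0.025]]
b_pooled = [0.020, -0.200]
v_pooled = [[0.00001, 0.0], [0.0, 0.010]]

h = hausman(b_conditional, v_conditional, b_pooled, v_pooled)
print(f"H = {h:.3f} with 2 degrees of freedom (95% critical value 5.991)")
```

In practice the difference of the two covariance matrices may fail to be positive definite in a finite sample, in which case the statistic is not usable as computed here.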

105
Dynamic Models
106
Dynamic Probit Model: A Standard Approach
107
Bias Reduction

Dynamic binary choice
$y_{it} = 1(\beta' x_{it} + \delta y_{i,t-1} + c_i > 0)$
Fixed or random effects
Common fixed effects
$y_{it} = 1(\beta' x_{it} + c_i > 0)$
Presumed proportional bias: $\operatorname{plim}\, b = k\beta$
Estimate the proportionality constant, k.
Literature is in its infancy
How do we know b is proportionally biased?
Known analytic results apply to trivial models
Not yet useful for practitioners

108
Ordered Outcomes

E.g., taste test, credit rating, course grade
Underlying random preferences: mapping to
observed choices
Strength of preferences
Censoring and discrete measurement
The nature of ordered data

109
Modeling Ordered Choices

Random Utility
Uit ? ?xit ?izit ?it
ait ?it
Observe outcome j if utility is in region j
Probability of outcome = probability of cell
$\Pr[Y_{it} = j] = F(\mu_j - a_{it}) - F(\mu_{j-1} - a_{it})$

110
Application: Health Care Usage
German Health Care Usage Data, 7,293 Individuals,
Varying Numbers of Periods
Data downloaded from the Journal of Applied
Econometrics Archive. This is an unbalanced panel
with 7,293 individuals. The data can be used for
regression, count models, binary choice, ordered
choice, and bivariate binary choice. This is a
large data set, with 27,326 observations
altogether. The number of observations per
individual ranges from 1 to 7. (Frequencies are:
1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000,
7=987.) Note that the variable NUMOBS tells how
many observations there are for each person. This
variable is repeated in each row of the data for
the person.
Variables in the file are:
DOCTOR   = 1(Number of doctor visits > 0)
HOSPITAL = 1(Number of hospital visits > 0)
HSAT     = health satisfaction, coded 0 (low) - 10 (high)
DOCVIS   = number of doctor visits in last three months
HOSPVIS  = number of hospital visits in last calendar year
PUBLIC   = 1 if insured in public health insurance, 0 otherwise
ADDON    = 1 if insured by add-on insurance, 0 otherwise
HHNINC   = household nominal monthly net income in German
           marks / 10000 (4 observations with income = 0
           were dropped)
HHKIDS   = 1 if children under age 16 in the household,
           0 otherwise
EDUC     = years of schooling
AGE      = age in years
MARRIED  = marital status
111
Health Care Satisfaction (HSAT)
Self-Administered Survey: Health Care
Satisfaction? (0 - 10)
Continuous Preference Scale
112
Ordered Probabilities
113
Four Ordered Probabilities
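[Figure: density partitioned into four cells, y = 0, 1, 2, 3, by the cut points $-\infty - \beta'x$, $0 - \beta'x$, $\mu_1 - \beta'x$, $\mu_2 - \beta'x$, $+\infty - \beta'x$]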
114
Coefficients
115
Effects in the Ordered Probability Model
Assume that $\beta_k$ is positive and that $x_k$
increases. Then $\beta'x$ increases, and
$\mu_j - \beta'x$ shifts to the left for all 4 cells.
Prob[y=0] decreases. Prob[y=1] decreases: the mass
shifted out is larger than the mass shifted in.
Prob[y=2] increases for the same reason. Prob[y=3]
increases.
When $\beta_k > 0$, an increase in $x_k$ decreases
Prob[y=0] and increases Prob[y=J]. Intermediate
cells are ambiguous, but there is only one sign
change in the marginal effects from 0 to 1 to ...
to J.
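The sign pattern follows from differentiating the cell probabilities. The sketch below uses the threshold convention of these slides ($\mu_{-1} = -\infty$, $\mu_0 = 0$, and, as an added convention, $\mu_J = +\infty$) and writes $f = F'$ for the density, a symbol introduced here for the derivation.

```latex
\begin{aligned}
\Pr[y = j \mid x] &= F(\mu_j - \beta'x) - F(\mu_{j-1} - \beta'x), \qquad j = 0, 1, \dots, J, \\
\frac{\partial \Pr[y = j \mid x]}{\partial x_k}
  &= \bigl[\, f(\mu_{j-1} - \beta'x) - f(\mu_j - \beta'x) \,\bigr]\, \beta_k, \\
\frac{\partial \Pr[y = 0 \mid x]}{\partial x_k} &= -f(-\beta'x)\, \beta_k,
\qquad
\frac{\partial \Pr[y = J \mid x]}{\partial x_k} = f(\mu_{J-1} - \beta'x)\, \beta_k, \\
\sum_{j=0}^{J} \frac{\partial \Pr[y = j \mid x]}{\partial x_k} &= 0 .
\end{aligned}
```

For a unimodal density such as the normal or logistic, the bracketed difference is negative for cells to the left of the mode and positive for cells to the right, which is why the effects change sign only once as j runs from 0 to J.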
116
Ordered Probability Model for Health Satisfaction
---------------------------------------------
Ordered Probability Model
Dependent variable               HSAT
Number of observations           27326
Underlying probabilities based on Normal
Cell frequencies for outcomes
 Y  Count  Freq     Y  Count  Freq     Y  Count  Freq
 0    447  .016     1    255  .009     2    642  .023
 3   1173  .042     4   1390  .050     5   4233  .154
 6   2530  .092     7   4231  .154     8   6172  .225
 9   3061  .112    10   3192  .116
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]    Mean of X
---------------------------------------------------------------------------
Index function for probability
Constant    2.61335825      .04658496      56.099      .0000
FEMALE      -.05840486      .01259442      -4.637      .0000     .47877479
EDUC         .03390552      .00284332      11.925      .0000    11.3206310
AGE         -.01997327      .00059487     -33.576      .0000    43.5256898
HHNINC       .25914964      .03631951       7.135      .0000     .35208362
HHKIDS       .06314906      .01350176       4.677      .0000     .40273000
Threshold parameters for index
Mu(1)        .19352076      .01002714      19.300      .0000
Mu(2)        .49955053      .01087525      45.935      .0000
Mu(3)        .83593441      .00990420      84.402      .0000
Mu(4)       1.10524187      .00908506     121.655      .0000
Mu(5)       1.66256620      .00801113     207.532      .0000
Mu(6)       1.92729096      .00774122     248.965      .0000
Mu(7)       2.33879408      .00777041     300.987      .0000
Mu(8)       2.99432165      .00851090     351.822      .0000
Mu(9)       3.45366015      .01017554     339.408      .0000
117
Ordered Probability Effects
----------------------------------------------------
Marginal effects for ordered probability model
M.E.s for dummy variables are Pr[y|x=1]-Pr[y|x=0]
Names for dummy variables are marked by *.
----------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]    Mean of X
---------------------------------------------------------------------------
These are the effects on Prob[Y=00] at means.
*FEMALE      .00200414      .00043473       4.610      .0000     .47877479
EDUC        -.00115962    .986135D-04     -11.759      .0000    11.3206310
AGE          .00068311    .224205D-04      30.468      .0000    43.5256898
HHNINC      -.00886328      .00124869      -7.098      .0000     .35208362
*HHKIDS     -.00213193      .00045119      -4.725      .0000     .40273000
These are the effects on Prob[Y=01] at means.
*FEMALE      .00101533      .00021973       4.621      .0000     .47877479
EDUC        -.00058810    .496973D-04     -11.834      .0000    11.3206310
AGE          .00034644    .108937D-04      31.802      .0000    43.5256898
HHNINC      -.00449505      .00063180      -7.115      .0000     .35208362
*HHKIDS     -.00108460      .00022994      -4.717      .0000     .40273000
... repeated for all 11 outcomes
These are the effects on Prob[Y=10] at means.
*FEMALE     -.01082419      .00233746      -4.631      .0000     .47877479
EDUC         .00629289      .00053706      11.717      .0000    11.3206310
AGE         -.00370705      .00012547     -29.545      .0000    43.5256898
HHNINC       .04809836      .00678434       7.090      .0000     .35208362
*HHKIDS      .01181070      .00255177       4.628      .0000     .40273000
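The reported effects for the continuous variables can be reproduced from the ordered probit estimates on the previous slide. The sketch below assumes that the effect on Prob[Y=j] is [phi(mu_(j-1) - b'x) - phi(mu_j - b'x)] * b_k, evaluated at the sample means. Coefficients, thresholds, and means are copied from the output; the function and variable names are illustrative. The dummy variables (FEMALE, HHKIDS) are not reproduced because the output computes them as differences in probabilities.

```python
from math import exp, inf, pi, sqrt

def normal_pdf(z):
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)

# Estimates and sample means copied from the ordered probit output.
beta = {"Constant": 2.61335825, "FEMALE": -0.05840486, "EDUC": 0.03390552,
        "AGE": -0.01997327, "HHNINC": 0.25914964, "HHKIDS": 0.06314906}
means = {"Constant": 1.0, "FEMALE": 0.47877479, "EDUC": 11.3206310,
         "AGE": 43.5256898, "HHNINC": 0.35208362, "HHKIDS": 0.40273000}
mu = [0.0, 0.19352076, 0.49955053, 0.83593441, 1.10524187, 1.66256620,
      1.92729096, 2.33879408, 2.99432165, 3.45366015]  # mu_0 = 0, then Mu(1)..Mu(9)

index_at_means = sum(beta[k] * means[k] for k in beta)
cuts = [-inf] + mu + [inf]  # cell j lies between cuts[j] and cuts[j + 1]

def effect(variable, j):
    """Derivative of Prob[Y = j] with respect to a continuous variable, at the means."""
    lower = normal_pdf(cuts[j] - index_at_means)
    upper = normal_pdf(cuts[j + 1] - index_at_means)
    return (lower - upper) * beta[variable]

for j in (0, 1, 10):
    row = "  ".join(f"{v} {effect(v, j): .8f}" for v in ("EDUC", "AGE", "HHNINC"))
    print(f"Prob[Y={j:02d}]: {row}")

# Count sign changes in the EDUC effects across the 11 outcomes.
educ_effects = [effect("EDUC", j) for j in range(11)]
sign_changes = sum(1 for a, b in zip(educ_effects, educ_effects[1:]) if a * b < 0)
print("sign changes in EDUC effects:", sign_changes)
```

The effects for one variable sum to zero across the 11 outcomes, since the probabilities must continue to sum to one.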
118
Ordered Probit Marginal Effects
119
Panel Data Treatments: FE

No sufficient statistics ⇒ No conditional
estimator
Unconditional (brute force) is straightforward
Transformed model
$\mathrm{Prob}(y_{it} > j,\ t = 1,\dots,T_i \mid X_i)$ is an ordered binary
choice model
Produces multiple estimators of $\beta$
Reconcile with minimum distance

120
Transformed Model
Fit this model for each j = 1,...,J as a fixed or
random effects binary choice model. Each has its
own j-specific constant term (random effects) or
estimates of $(\alpha_i - \mu_j)$ (fixed effects) and its own
specific vector $\beta_j$. Reconcile the multiple slope
vectors with a minimum distance weighted average.
(In both cases, use only the part of $\beta_j$ that is
not the constant term.) (Note: This is a way to get
a consistent estimator in the presence of fixed
effects. It is not needed for random effects.)
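The two steps can be sketched as follows. This is a simplified illustration rather than the full procedure. The first function builds the binary outcome for one cut point. The second combines several estimates of a single slope using inverse-variance weights, which treats the binary fits as independent. A full minimum distance estimator would weight by the joint covariance matrix of the stacked estimators, since all fits use the same data. All data and estimates below are hypothetical.

```python
def binarize(y, cut):
    """Indicator 1(y_it > cut) for every observation of the ordered outcome."""
    return [1 if value > cut else 0 for value in y]

def inverse_variance_average(estimates, variances):
    """Weighted average of several estimates of one slope; weights are 1/variance."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    combined = sum(w * b for w, b in zip(weights, estimates)) / total
    return combined, 1.0 / total  # combined estimate and its variance

# Hypothetical health satisfaction scores for a few observations.
hsat = [3, 8, 10, 0, 5]
print(binarize(hsat, 4))   # binary outcome for the cut point j = 4

# Hypothetical estimates of one slope from three binary fits, with their variances.
slopes = [0.28, 0.31, 0.35]
variances = [0.004, 0.002, 0.004]
combined, combined_variance = inverse_variance_average(slopes, variances)
print(f"combined slope = {combined:.4f}, variance = {combined_variance:.4f}")
```

Each binary fit would be a fixed effects (or random effects) binary choice model estimated on the corresponding indicator.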
121
Fixed Effects Estimates
---------------------------------------------
FIXED EFFECTS OrdPrb Model
Number of observations           27326
Iterations completed             6
Log likelihood function          -41876.93
Number of parameters             5680
Unbalanced panel has 7293 individuals.
Bypassed 1626 groups with inestimable a(i).
LHS variable values 0,1,...,10
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]    Mean of X
---------------------------------------------------------------------------
Index function for probability
AGE         -.07137568      .00273513     -26.096      .0000    43.9209856
HHNINC       .30140100      .06919695       4.356      .0000     .35112607
EDUC         .02405894      .02649654        .908      .3639    11.3100525
HHKIDS      -.05493925      .02766401      -1.986      .0470     .40921377
MU(1)        .32485866      .02036400      15.953      .0000
MU(2)        .84476382      .02736032      30.876      .0000
MU(3)       1.39396202      .03002635      46.425      .0000
MU(4)       1.82292900      .03101930      58.768      .0000
MU(5)       2.69905222      .03227934      83.615      .0000
MU(6)       3.12710904      .03273884      95.517      .0000
MU(7)       3.79215966      .03344847     113.373      .0000
MU(8)       4.84341077      .03489769     138.789      .0000
MU(9)       5.57238334      .03629754     153.520      .0000
Time invariant variable FEMALE is dropped from
the fixed effects model.
122
Generalizing the Ordered Probit

Index: $\beta' x_{it}$
Thresholds
Standard model: $\mu_{-1} = -\infty$, $\mu_0 = 0$, $\mu_j > \mu_{j-1} > 0$.
Homogeneous preference scale.
Generalized (Pudney/Shields, JAE 00, Job
Grades)
Note the identification problem involving
$\beta' x_{it}$: if any variables are common to the two
parts, the coefficients are not identified.
Harris, Zhao, Greene (2004, Drug Use)

123
Multivariate Binary Choice Models

Bivariate Probit Models
Analysis of bivariate choices
Marginal effects
Prediction
Simultaneous Equations and Recursive Models
A Sample Selection Bivariate Probit Model
The Multivariate Probit Model
Specification
Simulation based estimation
Inference
Partial effects and analysis
The panel probit model

124
Gross Relation Between Two Binary Variables

Cross Tabulation Suggests Presence or Absence of
a Bivariate Relationship

-------------------------------------------------------------------------
Cross Tabulation
Row variable is DOCTOR     (Out of range 0-49: 0)
Number of Rows  2          (DOCTOR   0 to 1)
Col variable is HOSPITAL   (Out of range 0-49: 0)
Number of Cols  2          (HOSPITAL 0 to 1)
Chi-squared independence tests
Chi-squared[1]   430.11235   Prob value  .00000
G-squared  [1]   477.27393   Prob value  .00000
-------------------------------------------------------------------------
Joint Frequencies for Row Variable DOCTOR, Column Variable HOSPITAL
-------------------------------------------------------------------------
                         HOSPITAL
DOCTOR       Total          0          1
-------------------------------------------------------------------------
0            10135       9715        420
1            17191      15216       1975
-------------------------------------------------------------------------
Total        27326      24931       2395
-------------------------------------------------------------------------
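Both independence statistics can be recomputed from the joint frequencies. The sketch below uses the usual Pearson and likelihood-ratio formulas for a 2 x 2 table; the counts are copied from the cross tabulation and the variable names are illustrative.

```python
from math import log

# Joint frequencies: rows are DOCTOR = 0, 1; columns are HOSPITAL = 0, 1.
observed = [[9715, 420],
            [15216, 1975]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_squared = 0.0
g_squared = 0.0
for i, row in enumerate(observed):
    for j, count in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n  # expected count under independence
        chi_squared += (count - expected) ** 2 / expected
        g_squared += 2.0 * count * log(count / expected)

print(f"n = {n}")
print(f"Chi-squared[1] = {chi_squared:.5f}")
print(f"G-squared  [1] = {g_squared:.5f}")
```

Both statistics have one degree of freedom for a 2 x 2 table, so values of this size reject independence of DOCTOR and HOSPITAL at any conventional level.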
125
Tetrachoric Correlation
http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm
http://www2.chass.ncsu.edu/garson/pa765/correl.htm
126
Estimating Tetrachoric Correlation

Numerous ad hoc algorithms suggested in the
literature
Do not appear to have noticed the connection to a
bivariate probit model
Maximum likelihood estimation is simple under the
(necessary) assumption of normality
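The connection to a bivariate probit can be sketched numerically. Assume two latent standard normal variables with correlation rho, with DOCTOR and HOSPITAL equal to 1 when the latent variable exceeds a threshold. The thresholds then follow from the marginal proportions, and rho is the value at which the bivariate normal probability matches the observed joint proportion. With only constants in the model there are three parameters and three free cell proportions, so matching the cells in this way should coincide with maximum likelihood. The bivariate normal CDF is computed from the identity that its derivative with respect to rho is the bivariate density. The counts are from the cross tabulation above; everything else, including all function names, is an illustrative construction and not NLOGIT's algorithm.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

counts = [[9715, 420], [15216, 1975]]  # rows: DOCTOR = 0, 1; columns: HOSPITAL = 0, 1
n = sum(map(sum, counts))
p_row0 = sum(counts[0]) / n                  # Prob(DOCTOR = 0)
p_col0 = (counts[0][0] + counts[1][0]) / n   # Prob(HOSPITAL = 0)
p_00 = counts[0][0] / n                      # Prob(DOCTOR = 0, HOSPITAL = 0)

std_normal = NormalDist()
h = std_normal.inv_cdf(p_row0)  # threshold for the latent DOCTOR variable
k = std_normal.inv_cdf(p_col0)  # threshold for the latent HOSPITAL variable

def bvn_density(r):
    """Standard bivariate normal density at (h, k) with correlation r."""
    q = (h * h - 2.0 * r * h * k + k * k) / (1.0 - r * r)
    return exp(-0.5 * q) / (2.0 * pi * sqrt(1.0 - r * r))

def bvn_cdf(rho, steps=200):
    """Prob(X <= h, Y <= k): Phi(h)Phi(k) plus the density integrated from 0 to rho."""
    width = rho / steps
    total = bvn_density(0.0) + bvn_density(rho)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * bvn_density(i * width)  # Simpson's rule weights
    return p_row0 * p_col0 + total * width / 3.0

# The joint probability is increasing in rho, so bisection finds the match.
lo, hi = -0.99, 0.99
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if bvn_cdf(mid) < p_00:
        lo = mid
    else:
        hi = mid
rho = 0.5 * (lo + hi)

print(f"thresholds: h = {h:.4f}, k = {k:.4f}")
print(f"tetrachoric correlation = {rho:.4f}")
```

The estimate is positive, consistent with the cross tabulation, where the (1, 1) cell is larger than independence would imply.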

127
Likelihood Function
128
Estimation
---------------------------------------------
FIML Estimates of Bivariate Probit Model
Maximum Likelihood Estimates
Dependent variable DOCHOS
Weighting variable None
Number of observation
