Title: Modeling Consumer Decision Making and Discrete Choice Behavior
 2Econometrics in Health Economics: Discrete Choice Modeling and Frontier Modeling and Efficiency Estimation
Professor William Greene, Stern School of Business, New York University
September 2-4, 2007 
 3Discrete Choices
- Observed outcomes 
 - Inherently discrete 
 - Number of occurrences (e.g., family size) 
 - Behavior: drug use, smoking behavior 
 - Implicitly continuous, censored 
 - The observed data are discrete by construction (e.g., revealed preferences) 
 - Discrete decisions that reveal underlying preferences 
 - Implicit censoring mechanism 
 - Implications to be considered 
 - For model building 
 - For analysis and prediction of behavior
 
  4Modeling Discrete Choice
- Theoretical foundations 
 - Econometric methodology 
 - Models 
 - Statistical bases 
 - Econometric methods 
 - Applications
 
  5Discrete Choice Modeling
- Random Utility Models 
 - Binary Choice Modeling 
 - Extensions 
 - Heterogeneity 
 - Semiparametrics 
 - Panel Data
 
  6Two Fundamental Building Blocks
- Underlying Behavioral Theory: Random utility model. The link between underlying behavior and observed data 
 - Empirical Tool: Stochastic, parametric model for binary choice. A platform for models of discrete choice 
  7Behavioral Assumptions
- Utility is defined over alternatives, $j = 1, \ldots, J(i,t)$ 
 - $U(i,j,t)$ is a preference ordering that exists for individual i in choice situation t for alternative j. 
 - Preferences are transitive and complete wrt choice situations 
 - Utility maximization assumption 
 - If $U(i,1,t) > U(i,2,t)$, the individual will choose alternative 1, not alternative 2. 
 - Revealed preference (duality). If the consumer chooses alternative 1 and not alternative 2, then $U(i,1,t) > U(i,2,t)$. 
  8Indirect Utility Functions
- Utility(i|x) = U(x), defined over consumption choices 
 - Utility maximization subject to budget constraints produces x = x(Income, Prices) 
 - Indirect utility = V(Income, Prices) 
 - Observable heterogeneity produces indirect utility 
 - V(i,t,j) = V(Income, Prices, Age, Educ, Sex, ...) 
 - Unobservable heterogeneity produces random utility: U(i,t,j) = V(Income, Prices, Characteristics) + $\varepsilon$ 
  9Random Indirect Utility Functions
$U(i,j,t) = \alpha_j + \beta_i' x(i,t,j) + \gamma_i' z_{it} + \varepsilon_{ijt}$
$\alpha_j$ = Choice specific constant
$x_{itj}$ = Attributes of choice presented to person, such as Price
$\beta_i$ = Person specific taste weights
$z_{it}$ = Characteristics of the person (age, income)
$\gamma_i$ = Weights on person specific characteristics
$\varepsilon_{ijt}$ = Unobserved random component of utility 
 10Application
- 210 Commuters Between Sydney and Melbourne 
 - Available modes: Air, Train, Bus, Car 
 - Observed: 
 - Choice 
 - Attributes: Cost, terminal time, travel time, other 
 - Characteristics: Household income 
 - First application: Fly or Other
 
  11A Formal Model for Binary Choice
- Yes or No decision 
 - Example: choose to fly or not to fly to a destination when there are alternatives. 
 - Model: Net utility of flying 
 - $U_{fly} = \alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income} + \varepsilon$ 
 - Choose to fly if net utility is positive 
 - Net utility = $U_{FLY} - U_{NOT\,FLY}$ 
 - Data: x = (1, cost, terminal time, travel time) 
 - z = income 
 - y = 1 if choose fly ($U_{fly} > 0$), 0 if not. 
  12An Econometric Model
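- Choose to fly iff $U_{FLY} > U_{OTHER} = 0$ (Normalize) 
 - $U_{fly} = \alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income} + \varepsilon$ 
 - $U_{fly} > 0 \iff \varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})$ 
 - Probability model: For any person observed by the analyst, 
 - $\text{Prob}(fly) = \text{Prob}[\varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})]$ 
 - Note the relationship between the unobserved $\varepsilon$ and the outcome 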
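The relationship between the unobserved disturbance and the observed choice can be illustrated by simulation. The sketch below assumes a logistic disturbance and uses made-up coefficient and attribute values (alpha, beta_cost, beta_time, gamma_income, cost, time and income are all hypothetical, not estimates from these slides). It compares the share of simulated individuals whose net utility is positive with the probability given by the formula.

```python
import math
import random


def logistic_cdf(v):
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-v))


def simulate_choice_share(index, n_draws=200_000, seed=12345):
    """Share of simulated individuals with index + eps > 0, eps ~ logistic."""
    rng = random.Random(seed)
    chosen = 0
    for _ in range(n_draws):
        u = rng.random()
        while u == 0.0:                      # avoid log(0) in the inverse CDF
            u = rng.random()
        eps = math.log(u / (1.0 - u))        # inverse-CDF draw from the logistic
        if index + eps > 0.0:                # choose to fly iff net utility > 0
            chosen += 1
    return chosen / n_draws


# Hypothetical values, chosen only for illustration (assumptions).
alpha, beta_cost, beta_time, gamma_income = 1.0, -0.02, -0.03, 0.02
cost, time, income = 86.0, 25.0, 70.0

index = alpha + beta_cost * cost + beta_time * time + gamma_income * income
prob_formula = logistic_cdf(index)           # Prob[eps > -index] = F(index) by symmetry
prob_simulated = simulate_choice_share(index)
print(round(prob_formula, 4), round(prob_simulated, 4))
```

The two numbers should agree up to simulation noise; the analyst never sees the disturbance, only the resulting 0/1 outcome.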
  13Binary Choice Data
 Choose Air   Gen.Cost   Travel Time   Income
   1.0000      86.000      25.000      70.000
   .00000      67.000      69.000      60.000
   .00000      77.000      64.000      20.000
   .00000      69.000      69.000      15.000
   .00000      77.000      64.000      30.000
   .00000      71.000      64.000      26.000
   .00000      58.000      64.000      35.000
   .00000      71.000      69.000      12.000
   .00000      100.00      64.000      70.000
   1.0000      158.00      30.000      50.000
   1.0000      136.00      45.000      40.000
   1.0000      103.00      30.000      70.000
   .00000      77.000      69.000      10.000
   1.0000      197.00      45.000      26.000
   .00000      129.00      64.000      50.000
   .00000      123.00      64.000      70.000
 14A Case for Randomness of Utility
- Does GC1 < GC2 imply that the individual will always choose choice 1? Apparently not 
 - Does Income explain the difference? 
 - Apparently not 
 
 Choose Air   Gen.Cost   Travel Time   Income
   1.0000      86.000      25.000      70.000
   .00000      67.000      69.000      60.000

 Choose Air   Gen.Cost   Travel Time   Income
   .00000      100.00      64.000      70.000
   1.0000      158.00      30.000      50.000
 15What Can We Learn from the Data?
- Are the attributes important? 
 - Aggregate predictions: Total Demand 
 - Value of time
 
  16Implied Demand Curve
- Expected Demand for Flights As 
 - So, we can obtain a downward sloping demand
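
One way to see the downward sloping demand is to sum predicted probabilities over individuals while raising the cost for everyone. The sketch below is only an illustration: the logit form, the coefficient tuple coef and the four (cost, time, income) rows are assumptions made for the example, not estimates from the slides.

```python
import math


def prob_fly(cost, time, income, coef):
    """Logit probability of flying for one individual."""
    a, b_cost, b_time, g_income = coef
    index = a + b_cost * cost + b_time * time + g_income * income
    return 1.0 / (1.0 + math.exp(-index))


def expected_demand(people, coef, cost_shift=0.0):
    """Expected number of flyers = sum of individual probabilities."""
    return sum(prob_fly(c + cost_shift, t, inc, coef) for c, t, inc in people)


# Hypothetical coefficients (constant, cost, time, income) and a tiny sample.
coef = (1.0, -0.02, -0.03, 0.02)
people = [(86.0, 25.0, 70.0), (67.0, 69.0, 60.0),
          (77.0, 64.0, 20.0), (158.0, 30.0, 50.0)]

# Trace out demand as the cost of flying rises for everyone.
demand_curve = [(shift, expected_demand(people, coef, shift))
                for shift in (0.0, 20.0, 40.0, 60.0)]
for shift, demand in demand_curve:
    print(shift, round(demand, 3))
```

With a negative cost coefficient, expected demand falls at every step of the cost increase.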
 
  17Value of Time
- We can also compute the value of time as 
 - If the direct cost measure is unavailable, use 
the negative of the income coefficient. 
(Numerator will generally be negative.) 
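The formula itself did not survive in this transcript. A common way to write it, assuming the linear index $U_{fly} = \alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income} + \varepsilon$ from the formal model (this reconstruction is an assumption, not the slide's own equation), is the ratio of marginal utilities:

```latex
\text{Value of time}
  = \frac{\partial U_{fly} / \partial \text{Time}}{\partial U_{fly} / \partial \text{Cost}}
  = \frac{\beta_2}{\beta_1},
\qquad
\text{or, without a direct cost measure,}\quad
\frac{\beta_2}{-\gamma}.
```

With $\beta_2 < 0$, $\beta_1 < 0$ and $\gamma > 0$, both ratios are positive amounts of money per unit of time.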
  18Econometric Frameworks
- Nonparametric 
 - Semiparametric 
 - Parametric 
 - Classical (Sampling Theory) 
 - Bayesian 
 - (We will focus on classical inference methods) 
 
  19Modeling Approaches
- Nonparametric: Relationship 
 - Minimal Assumptions 
 - Minimal Conclusions 
 - Semiparametric: Index function 
 - Stronger assumptions 
 - Robust to model misspecification (heteroscedasticity) 
 - Still weak conclusions 
 - Parametric: Probability and index function 
 - Strongest assumptions: complete specification 
 - Strongest conclusions 
 - Possibly less robust. (Not necessarily)
 
  20Nonparametric: Not Very Informative
P(Air) = f(Income) 
 21Semiparametric Approaches
- Maximum Score 
 - Find b so that 
 - $\sum_i \text{sign}(b'x_i) \cdot \text{sign}(y_i)$ is maximized 
 - Maximize the number of observations for which $b'x_i < 0$ when $y_i = 0$ and $b'x_i > 0$ when $y_i = 1$. 
 - Questions: (1) What do the coefficients mean? (2) If b is a solution, Kb is a solution for any K > 0. See question (1). (Solution is scaled so $b'b = 1$.) 
 - (3) Is inference possible? (Apparently not: Abrevaya) 
  22MSCORE 
 23Semiparametric Approaches
- Klein and Spady: Kernel Based 
 
  24Klein and Spady Semiparametric
Note necessary normalizations. Coefficients are 
not very meaningful. 
 25Likelihood Based Inference Methods
Behavioral Theory
Likelihood Function
Statistical Theory
Observed Measurement
The likelihood function embodies the theoretical 
description of the population. Characteristics of 
the population are inferred from the 
characteristics of the likelihood function. 
(Bayesian and Classical) 
 26Parametric Logit Model 
 27Logit vs. MScore
-  Logit fits worse 
 -  MScore fits better, coefficients 
are meaningless 
  28Parametric Model Estimation
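- How to estimate $\alpha$, $\beta_1$, $\beta_2$, $\gamma$? 
 - It's not regression 
 - The technique of maximum likelihood 
 - $\text{Prob}[y=1] = \text{Prob}[\varepsilon > -(\alpha + \beta_1 \text{Cost} + \beta_2 \text{Time} + \gamma \text{Income})]$ 
 - $\text{Prob}[y=0] = 1 - \text{Prob}[y=1]$ 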
 - Requires a model for the probability 
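
A minimal sketch of maximum likelihood for a binary choice model, assuming a logit probability with a constant and a single regressor. The data are simulated and the true values true_a and true_b are hypothetical; Newton's method is one standard way to maximize this log likelihood, not necessarily the algorithm used to produce the results in these slides.

```python
import math
import random


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def log_likelihood(a, b, xs, ys):
    """Sum of log Prob[y=1] for ones and log Prob[y=0] for zeros."""
    total = 0.0
    for x, y in zip(xs, ys):
        p = logistic(a + b * x)
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return total


def fit_logit(xs, ys, iterations=25):
    """Newton's method for Prob[y=1] = logistic(a + b*x)."""
    a, b = 0.0, 0.0
    for _ in range(iterations):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, ys):
            p = logistic(a + b * x)
            resid = y - p                    # contribution to the score
            w = p * (1.0 - p)                # weight in the negative Hessian
            g0 += resid
            g1 += resid * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (negative Hessian) * step = score.
        a += (h11 * g0 - h01 * g1) / det
        b += (h00 * g1 - h01 * g0) / det
    return a, b


rng = random.Random(2007)
true_a, true_b = 0.5, -1.0                   # hypothetical true values
xs = [rng.gauss(0.0, 1.0) for _ in range(5000)]
ys = [1 if rng.random() < logistic(true_a + true_b * x) else 0 for x in xs]

a_hat, b_hat = fit_logit(xs, ys)
print(round(a_hat, 3), round(b_hat, 3))
```

The estimates maximize the log likelihood, which is why they differ from what a linear regression of y on x would give.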
 
  29Completing the Model: F($\varepsilon$)
- The distribution 
 - Normal: PROBIT, natural for behavior 
 - Logistic: LOGIT, allows thicker tails 
 - Gompertz: EXTREME VALUE, asymmetric, underlies the basic logit model for multiple choice 
 - Does it matter? 
 - Yes, large difference in estimates 
 - Not much, quantities of interest are more stable.
 
  30Application: Doctor Visits (No Attributes of the Choices)
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from the Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. The data can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations per person ranges from 1 to 7. (Frequencies are: 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.) Note, the variable NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of the data for the person.
Variables in the file are:
 DOCTOR  = 1(Number of doctor visits > 0)
 HSAT    = health satisfaction, coded 0 (low) - 10 (high)
 DOCVIS  = number of doctor visits in last three months
 HOSPVIS = number of hospital visits in last calendar year
 PUBLIC  = insured in public health insurance = 1; otherwise = 0
 ADDON   = insured by add-on insurance = 1; otherwise = 0
 HHNINC  = household nominal monthly net income in German marks / 10000. (4 observations with income = 0 were dropped)
 HHKIDS  = children under age 16 in the household = 1; otherwise = 0
 EDUC    = years of schooling
 AGE     = age in years
 MARRIED = marital status
 31Estimated Binary Choice (Probit) Model
 ---------------------------------------------
 Binomial Probit Model
 Dependent variable                  DOCTOR
 Number of observations               27326
 Log likelihood function          -17670.94
 Info. Criterion: AIC               1.29378
 Info. Criterion: BIC               1.29559
 Restricted log likelihood        -18019.55
 McFadden Pseudo R-squared         .0193462
 ---------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
 ---------Index function for probability
 Constant    .15500247      .05651561        2.743      .0061
 HHNINC     -.11643121      .04632875       -2.513      .0120     .35208362
 HHKIDS     -.14118362      .01821758       -7.750      .0000     .40273000
 EDUC       -.02811531      .00350266       -8.027      .0000    11.3206310
 AGE         .01283460      .00079035       16.239      .0000    43.5256898
 MARRIED     .05226039      .02046202        2.554      .0106     .75861817
 32Estimated Binary Choice Models
             PROBIT        LOGIT        EXTREME VALUE
 Constant    0.155002      0.251115     0.560723
 HHNINC     -0.116431     -0.185922    -0.140951
 HHKIDS     -0.141184     -0.22947     -0.182789
 EDUC       -0.0281153    -0.0455878   -0.035887
 AGE         0.0128346     0.0207086    0.016202
 MARRIED     0.0522604     0.085293     0.068080
 Log-L      -17670.9      -17673.1     -17679.5
 Log-L(0)   -18019.6      -18019.6     -18019.6
(The first column reproduces the probit estimates reported on the previous slide; the second matches the pooled logit estimates reported later.)
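The difference in scale between the two sets of estimates can be checked directly. The sketch below takes the probit coefficients from the Estimated Binary Choice (Probit) Model output and the pooled logit coefficients from the later random effects comparison, and computes their ratios. The variable names are chosen for the example.

```python
# Probit coefficients as reported in the probit output for DOCTOR.
probit = {
    "Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
    "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039,
}
# Pooled logit coefficients as reported in the random effects comparison.
logit = {
    "Constant": 0.25111543, "HHNINC": -0.18592232, "HHKIDS": -0.22947000,
    "EDUC": -0.04558783, "AGE": 0.02070863, "MARRIED": 0.08529305,
}

# Ratio of logit to probit coefficient, variable by variable.
ratios = {name: logit[name] / probit[name] for name in probit}
for name, ratio in ratios.items():
    print(f"{name:8s} {ratio:.3f}")
```

The ratios are all close to 1.6, which suggests the two models differ mainly in the scaling of the index rather than in what they say about the data.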
 33Index = $\alpha + \beta_1 \text{Income} + \beta_2 \text{Educ} + \gamma \text{Age}$ 
 34Effect on Predicted Probability of an Increase in Age
$\alpha + \beta_1 \text{Income} + \beta_2 \text{Educ} + \gamma(\text{Age}+1)$
($\gamma$ is positive) 
 35Marginal Effects in Probability Models
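- Prob[Outcome] = some $F(\alpha + \beta'X)$ 
 - Partial effect = $\partial F(\alpha + \beta'X) / \partial x$ (derivative) 
 - Partial effects are derivatives 
 - Result varies with model 
 - Logit: $\partial F(\alpha + \beta'X) / \partial \text{Age} = \text{Prob} \times (1-\text{Prob}) \times \beta_{AGE}$ 
 - Probit: $\partial F(\alpha + \beta'X) / \partial \text{Age} = \text{Normal density} \times \beta_{AGE}$ 
 - Scaling usually erases model differences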
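
The probit partial effects can be computed from the estimated coefficients and the sample means reported in the probit output for DOCTOR. The sketch below evaluates the normal density at the index computed at the means and multiplies it by each coefficient. The dictionary names are chosen for the example.

```python
import math


def normal_pdf(v):
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


# Probit coefficients and means of X from the estimated probit model.
coef = {
    "Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
    "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039,
}
means = {
    "Constant": 1.0, "HHNINC": 0.35208362, "HHKIDS": 0.40273000,
    "EDUC": 11.3206310, "AGE": 43.5256898, "MARRIED": 0.75861817,
}

index_at_means = sum(coef[name] * means[name] for name in coef)
scale = normal_pdf(index_at_means)           # density evaluated at the means

# Partial effect of a continuous variable = density * coefficient.
partial_effects = {name: scale * coef[name] for name in ("HHNINC", "EDUC", "AGE")}
for name, effect in partial_effects.items():
    print(f"{name:8s} {effect:.8f}")
```

These values agree with the probit rows of the Estimated Marginal Effects table (for example, about -.0439 for HHNINC and .0048 for AGE).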
 
  36The Delta Method For Computing Standard Errors
  37Marginal Effects for Binary Choice
  38Marginal Effect for a Dummy Variable
- $\text{Prob}[y_i = 1 \mid x_i, d_i] = F(\beta'x_i + \gamma d_i)$ 
 - = conditional mean 
 - Marginal effect of d 
 - $\text{Prob}[y_i = 1 \mid x_i, d_i = 1] - \text{Prob}[y_i = 1 \mid x_i, d_i = 0]$ 
 - Logit 
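
For the probit model, the dummy variable effect P1 - P0 can be computed from the coefficients and sample means in the probit output for DOCTOR, holding the other variables at their means. This is a sketch; the function and dictionary names are chosen for the example.

```python
import math


def normal_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


coef = {
    "Constant": 0.15500247, "HHNINC": -0.11643121, "HHKIDS": -0.14118362,
    "EDUC": -0.02811531, "AGE": 0.01283460, "MARRIED": 0.05226039,
}
means = {
    "Constant": 1.0, "HHNINC": 0.35208362, "HHKIDS": 0.40273000,
    "EDUC": 11.3206310, "AGE": 43.5256898, "MARRIED": 0.75861817,
}


def dummy_effect(name):
    """Prob[y=1 | d=1] - Prob[y=1 | d=0], other variables at their means."""
    rest = sum(coef[k] * means[k] for k in coef if k != name)
    return normal_cdf(rest + coef[name]) - normal_cdf(rest)


effect_hhkids = dummy_effect("HHKIDS")
effect_married = dummy_effect("MARRIED")
print(round(effect_hhkids, 8), round(effect_married, 8))
```

The results agree with the P1 - P0 rows for HHKIDS and MARRIED in the probit block of the Estimated Marginal Effects table.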
 
  39Estimated Marginal Effects
            Estimate     Standard Error   t Ratio   P Value
 PROBIT
 HHNINC    -.04388304      .01746073      -2.513     .0120
 EDUC      -.01059669      .00132014      -8.027     .0000
 AGE        .00483737      .00029767      16.251     .0000
 ---------Marginal effect for dummy variables is P1 - P0.
 HHKIDS    -.05341443      .00691172      -7.728     .0000
 MARRIED    .01978313      .00777809       2.543     .0110
 LOGIT
 HHNINC    -.04321347      .01744584      -2.477     .0132
 EDUC      -.01059587      .00131215      -8.075     .0000
 AGE        .00481326      .00029819      16.142     .0000
 ---------Marginal effect for dummy variables is P1 - P0.
 HHKIDS    -.05359813      .00692332      -7.742     .0000
 MARRIED    .01993604      .00782159       2.549     .0108
 Extreme Value
 HHNINC    -.04067557      .01667101      -2.440     .0147
 EDUC      -.01035617      .00124994      -8.285     .0000
 AGE        .00467547      .00029841      15.668     .0000
 ---------Marginal effect for dummy variables is P1 - P0.
 HHKIDS    -.05417190      .00697888      -7.762     .0000
 MARRIED    .02018367      .00790414       2.554     .0107
 40Computing Effects
- Compute at the data means? 
 - Simple 
 - Inference is well defined 
 - Average the individual effects 
 - More appropriate? 
 - Asymptotic standard errors. 
 - Is hypothesis testing about marginal effects 
meaningful? 
  41Average Partial Effects 
 42Krinsky and Robb Method 
 43Partial Effects for a Probit Model
 ------------------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
 ------------------------------------------------------
 Krinsky and Robb Method With 100 Replications Using Average Partial Effects
 HHNINC     -.04318266      .01619878       -2.666      .0077
 HHKIDS     -.05225173      .00678277       -7.704      .0000
 EDUC       -.01032588      .00131932       -7.827      .0000
 AGE         .00474397      .00027718       17.115      .0000
 MARRIED     .01894322      .00792639        2.390      .0169
 Delta Method Using Partial Effects at Means
 HHNINC     -.04388304      .01746073       -2.513      .0120
 HHKIDS     -.05341443      .00691172       -7.728      .0000
 EDUC       -.01059669      .00132014       -8.027      .0000
 AGE         .00483737      .00029767       16.251      .0000
 MARRIED     .01978313      .00777809        2.543      .0110
 44Elasticities
- Elasticity  
 - How to compute standard errors? 
 - Delta method 
 - Bootstrap 
 - Bootstrap the individual elasticities? (Will neglect variation in parameter estimates.) 
 - Bootstrap model estimation?
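
The elasticity formula did not survive in this transcript. Assuming the usual definition, elasticity = (x / P) times the partial effect, evaluated at the sample means (this definition is an assumption here), the pooled logit marginal effects and elasticities reported later in the deck can be checked for consistency: every variable should imply the same predicted probability P.

```python
# (marginal effect, elasticity, mean of X) for the pooled logit model,
# as reported in the marginal effects comparison later in the deck.
pooled_logit = {
    "HHNINC": (-0.04321347, -0.02405262, 0.35208362),
    "EDUC": (-0.01059587, -0.18962896, 11.3206310),
    "AGE": (0.00481326, 0.33119369, 43.5256898),
}

# elasticity = effect * mean / P, so each row implies P = effect * mean / elasticity.
implied_prob = {
    name: effect * mean / elasticity
    for name, (effect, elasticity, mean) in pooled_logit.items()
}
for name, prob in implied_prob.items():
    print(f"{name:8s} implied P = {prob:.5f}")
```

All three rows imply a probability of about .633, close to the sample proportion of DOCTOR = 1, which is consistent with elasticities computed at the means.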
 
  45Income Elasticity Krinsky and Robb 
 46How Well Does the Model Fit?
- There is no R squared 
 - Fit measures computed from log L 
 - pseudo R squared = 1 - logL/logL0 
 - Others - these do not measure fit. 
 - Direct assessment of the effectiveness of the 
model at predicting the outcome 
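As a quick check of the likelihood-based fit measure, the sketch below computes 1 - logL/logL0 from the two log likelihoods reported in the probit output for DOCTOR, along with the likelihood ratio statistic for the hypothesis that all slopes are zero. The variable names are chosen for the example.

```python
log_l = -17670.94        # log likelihood of the fitted probit model
log_l0 = -18019.55       # restricted log likelihood (constant only)

pseudo_r2 = 1.0 - log_l / log_l0          # likelihood ratio index (McFadden)
lr_stat = 2.0 * (log_l - log_l0)          # LR statistic against the constant-only model

print(round(pseudo_r2, 7), round(lr_stat, 2))
```

The index reproduces the reported McFadden value of about .0193, which is small even though the LR statistic is very large; the two numbers answer different questions.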
  47Fit Measures for Binary Choice
- Likelihood Ratio Index 
 - Bounded by 0 and 1 
 - Rises when the model is expanded 
 - Cramer (and others)
 
  48Fit Measures for the Probit Model
 ----------------------------------------
 Fit Measures for Binomial Choice Model
 Probit model for variable DOCTOR
 ----------------------------------------
 Proportions  P0 = .370892    P1 = .629108
 N = 27326    N0 = 10135      N1 = 17191
 LogL = -17670.942    LogL0 = -18019.552
 Estrella = 1-(L/L0)^(-2L0/n) = .02544
 ----------------------------------------
 Efron        McFadden      Ben./Lerman
 .02448       .01935        .54500
 Cramer       Veall/Zim.    Rsqrd_ML
 .02492       .04374        .02519
 ----------------------------------------
 Information Criteria: Akaike I.C. 1.29378    Schwarz I.C. 1.29559
 ----------------------------------------
 Predictions for Binary Choice Model.
 Actual     Predicted 0      Predicted 1       Total Actual
   0         367 ( 1.3)      9768 (35.7)      10135 ( 37.1)
   1         387 ( 1.4)     16804 (61.5)      17191 ( 62.9)
 Total       754 ( 2.8)     26572 (97.2)      27326 (100.0)
Pseudo  R-squared 
 49Predicting the Outcome 
- Predicted probabilities 
 - P = F(a + b1 Age + b2 Educ + c Income) 
 - Predicting outcomes 
 - Predict y = 1 if P is large 
 - Use 0.5 for large (more likely than not) 
 - Generally, use 
 - Count successes and failures
 
  50Aggregate Prediction is a Useful Way to Assess 
the Importance of a Variable
Predictions for Binary Choice Model. Predicted value is 1 when probability is greater than .500000, 0 otherwise.

Model fit without Income
 Actual     Predicted 0      Predicted 1       Total Actual
   0         351 ( 1.3)      9784 (35.8)      10135 ( 37.1)
   1         409 ( 1.5)     16782 (61.4)      17191 ( 62.9)
 Total       760 ( 2.8)     26566 (97.2)      27326 (100.0)

Model fit with Income
 Actual     Predicted 0      Predicted 1       Total Actual
   0         367 ( 1.3)      9768 (35.7)      10135 ( 37.1)
   1         387 ( 1.4)     16804 (61.5)      17191 ( 62.9)
 Total       754 ( 2.8)     26572 (97.2)      27326 (100.0)

The model fit with Income has 38 more correct predictions (17,171 versus 17,133 on the diagonals of the two tables). 
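The comparison can be tallied directly from the cells of the two prediction tables. The sketch below counts correct predictions as the diagonal cells (actual 0 predicted 0, actual 1 predicted 1); the dictionary layout is chosen for the example.

```python
# Cell counts keyed by (actual, predicted), copied from the two tables.
tables = {
    "without income": {(0, 0): 351, (0, 1): 9784, (1, 0): 409, (1, 1): 16782},
    "with income": {(0, 0): 367, (0, 1): 9768, (1, 0): 387, (1, 1): 16804},
}


def correct_predictions(table):
    """Number of observations on the diagonal of the 2x2 table."""
    return table[(0, 0)] + table[(1, 1)]


hits = {label: correct_predictions(t) for label, t in tables.items()}
gain = hits["with income"] - hits["without income"]
share_correct = hits["with income"] / sum(tables["with income"].values())

print(hits, gain, round(share_correct, 4))
```

Counted this way, the two diagonals are 17,133 and 17,171, a difference of 38 correct predictions out of 27,326 observations.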
 51Hypothesis Tests
- Restrictions: Linear or nonlinear functions of the model parameters 
 - Structural change: Constancy of parameters 
 - Specification Tests: Heteroscedasticity, model specification (distribution) 
  52Hypothesis Testing: Conventional Neyman/Pearson
- Comparisons of Likelihood Functions: Likelihood Ratio Tests 
 - Distance Measures: Wald Statistics 
 - Lagrange Multiplier Tests 
 
  53Likelihood Ratio Tests
- Null hypothesis restricts the parameter vector 
 - Alternative releases the restriction 
 - Test statistic: Chi-squared = $2(\log L|\text{Unrestricted model} - \log L|\text{Restricted model}) > 0$ 
 - Degrees of freedom  number of restrictions
 
  54Wald Test
- Unrestricted parameter vector is estimated 
 - Discrepancy $m = Rb - q$ (or $r(b,q)$ if nonlinear) is computed 
 - Variance of discrepancy is estimated 
 - Wald Statistic is $m'[\text{Var}(m)]^{-1}m$
 
  55Lagrange Multiplier Test
- Restricted model is estimated 
 - Derivatives of unrestricted model and variances of derivatives are computed at restricted estimates 
 - Wald test of whether derivatives are zero tests the restrictions 
 - Usually hard to compute: difficult to program the derivatives and their variances. 
  56Hypothesis Tests Results
- LIKELIHOOD RATIO 
 - LRTEST = 88.766777 
 - WALD 
 - Matrix WALDSTAT has 1 rows and 1 columns: 89.26382 
 - LAGRANGE MULTIPLIER 
 - Binary Logit Model for Binary Choice; Dependent variable DOCTOR; Number of observations 27326 
 - LM Stat. at start values 89.88971 (LM statistic kept as scalar LMSTAT) 
 
Testing the hypothesis that the coefficients on the income and education variables are equal to zero in the logit model. 
 57Testing Structural Stability
- Fit the same model in each of K subsamples 
 - Unrestricted log likelihood is the sum of the subsample log likelihoods: LogL1 
 - Pool the subsamples, fit the model to the pooled sample 
 - Restricted log likelihood is that from the pooled sample: LogL0 
 - Chi-squared = 2(LogL1 - LogL0), degrees of freedom = (K-1) × model size. 
  58A Test of Structural Stability
- (Application to be examined later) Liberal arts college has gender economics course? 
 - Covariates: constant, size of economics faculty, academic affiliation, religious affiliation 
 - Data from 4 U.S. regions: West, North, South, Midwest. 
 - Is the same model appropriate for all 4 regions? 
 
  59Application: Men vs. Women Model for Doctor
 Probit ; for[female = 1] ; Lhs = Doctor ; Rhs = X     Log likelihood function  -7855.219
 Probit ; for[female = 0] ; Lhs = Doctor ; Rhs = X     Log likelihood function  -9541.066
 Probit ; Lhs = Doctor ; Rhs = X                       Log likelihood function  -17670.94
 2[LogL(Female) + LogL(Male) - LogL(Pooled)]
 ------------------------------------
 Listed Calculator Results
 ------------------------------------
 Result = 549.310000 (Chi squared with 6 D.F.) 
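The calculator result can be reproduced from the three log likelihoods above. The sketch below also computes a p-value using the closed-form upper tail of the chi-squared distribution, which is valid only for even degrees of freedom; the helper name chi2_sf_even_df is made up for this example.

```python
import math


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-squared variate (even df only)."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    for j in range(1, df // 2):
        term *= half / j                     # (x/2)^j / j!
        total += term
    return math.exp(-half) * total


logl_female = -7855.219
logl_male = -9541.066
logl_pooled = -17670.94
n_groups = 2                                 # K subsamples
n_params = 6                                 # constant plus five slopes

lr_stat = 2.0 * (logl_female + logl_male - logl_pooled)
df = (n_groups - 1) * n_params
p_value = chi2_sf_even_df(lr_stat, df)

print(round(lr_stat, 3), df, p_value)
```

The statistic of 549.31 is far beyond any conventional critical value for 6 degrees of freedom, so the hypothesis that one model fits both groups is rejected.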
 60Scaling
- $U_{itj} = \alpha_j + \beta_i' x_{itj} + \gamma_i' z_{it} + \varepsilon_{ijt}$ 
 - $\varepsilon_{ijt}$ = Unobserved random component of utility 
 - Mean: $E[\varepsilon_{ijt}] = 0$, $\text{Var}[\varepsilon_{ijt}] = 1$ 
 - Mean = 0 is innocent. Why assume variance = 1? 
 - What if there are subgroups with different variances? 
 - Cost of ignoring the between group variation? 
 - Specifically modeling 
 - More general heterogeneity across people 
 - Cost of the homogeneity assumption 
 - Modeling issues
 
  61Heteroscedasticity in Binary Choice Models
- Random utility: $Y_i = 1$ iff $\beta'x_i + \varepsilon_i > 0$ 
 - Resemblance to regression: How to accommodate heterogeneity in the random unobserved effects across individuals? 
 - Heteroscedasticity: different scaling 
 - Parameterize: $\text{Var}[\varepsilon_i] = \exp(\gamma'z_i)$ 
 - Reformulate probabilities 
 - Probit 
 - Partial effects are now very complicated
 
  62Heteroscedasticity in Marginal Effects
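- For the univariate case 
 - $E[y_i \mid x_i, z_i] = F[\beta'x_i / \exp(\gamma'z_i)]$ 
 - $\partial E[y_i \mid x_i, z_i] / \partial x_i = f[\beta'x_i / \exp(\gamma'z_i)] \, \beta / \exp(\gamma'z_i)$ 
 - $\partial E[y_i \mid x_i, z_i] / \partial z_i = f[\beta'x_i / \exp(\gamma'z_i)] \times [-\beta'x_i / \exp(\gamma'z_i)] \, \gamma$ 
 - If the variables are the same in x and z, these are added. Sign and magnitude are ambiguous 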
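The two derivatives can be checked numerically. The sketch below assumes a probit with a scalar x and a scalar z, with the index divided by exp(gamma * z); the parameter values beta and gamma and the evaluation points are hypothetical. The chain rule gives the factor 1 / exp(gamma * z) in the derivative with respect to x.

```python
import math


def normal_cdf(v):
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))


def normal_pdf(v):
    return math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)


def prob(x, z, beta, gamma):
    """E[y | x, z] = F[beta*x / exp(gamma*z)] for a heteroscedastic probit."""
    return normal_cdf(beta * x / math.exp(gamma * z))


def d_prob_dx(x, z, beta, gamma):
    s = math.exp(gamma * z)
    return normal_pdf(beta * x / s) * beta / s


def d_prob_dz(x, z, beta, gamma):
    s = math.exp(gamma * z)
    return normal_pdf(beta * x / s) * (-beta * x / s) * gamma


beta, gamma = 0.8, -0.5                      # hypothetical parameter values
x, z = 1.2, 0.7
h = 1e-6

numeric_dx = (prob(x + h, z, beta, gamma) - prob(x - h, z, beta, gamma)) / (2 * h)
numeric_dz = (prob(x, z + h, beta, gamma) - prob(x, z - h, beta, gamma)) / (2 * h)

# When the same variable w appears in both x and z, the two parts are added.
w = 1.0
total_effect = d_prob_dx(w, w, beta, gamma) + d_prob_dz(w, w, beta, gamma)
numeric_total = (prob(w + h, w + h, beta, gamma) - prob(w - h, w - h, beta, gamma)) / (2 * h)

print(d_prob_dx(x, z, beta, gamma), numeric_dx)
print(d_prob_dz(x, z, beta, gamma), numeric_dz)
print(total_effect, numeric_total)
```

Because the second term depends on the sign of the index as well as on gamma, the total effect of a variable that appears in both places can have either sign.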
  63Testing For Heteroscedasticity
- Likelihood Ratio, Wald and Lagrange Multiplier Tests are all straightforward 
 - All tests require a specification of the model of heteroscedasticity 
 - There is no generic test for heteroscedasticity without a specific model 
  64Robust Estimation
- There is no heteroscedasticity robust (White) covariance estimator. 
 - Robust (semiparametric) parameter estimators do not allow further analysis. 
 - Only ratios of coefficients are estimable 
 - Probabilities and partial effects cannot be computed. (Scaling is not accounted for.) 
  65Heteroscedasticity in the Doctor Equation
 ---------------------------------------------
 Binomial Probit Model
 Dependent variable                  DOCTOR
 Log likelihood function          -17496.19
 Log likelihood function          -17670.94   (Restricted model. LR = 349.5 w/ 2 DF)
 LM Stat. at start values          313.4050   (Computed separately)
 ---------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
 ---------Index function for probability
 Constant    .06472667      .01268180        5.104      .0000
 HHNINC     -.01170328      .00554442       -2.111      .0348     .35208362
 HHKIDS     -.01356948      .00462470       -2.934      .0033     .40273000
 EDUC       -.00084257      .00051971       -1.621      .1050    11.3206310
 AGE        -.00030092      .00014827       -2.030      .0424    43.5256898   (Note negative coefficient)
 MARRIED     .00610723      .00268916        2.271      .0231     .75861817
 ---------Variance function
 AGE        -.03914159      .00629100       -6.222      .0000    43.5256898   (Note larger negative coefficient)
 FEMALE     -.77274469      .05529956      -13.974      .0000     .47877479   (Highly significant.)
 ---------------------------------------------
 Partial derivatives of E[y] = F[*] with respect to the vector of characteristics.
 ---------------------------------------------
 HHNINC     -.03554999      .01374920       -2.586      .0097
 HHKIDS     -.04121878      .00714739       -5.767      .0000
 EDUC       -.00255939      .00127739       -2.004      .0451
 AGE         .00350153      .00349942        1.001      .3170
 MARRIED     .01855137      .00586656        3.162      .0016
 ---------Variance function
 AGE         .00350153      .00349942        1.001      .3170   (Note positive marginal effect!)
 FEMALE      .08717426      .04486398        1.943      .0520   (Insignificant?)
 66Choice Based Sampling
- Sample estimator (MLE) mimics the sample 
 - MLE assumes the sample mimics the population 
 - If the sample is nonrepresentative of the population, the MLE will be also. 
 - Choice based samples 
 - Sample is biased 
 - Certain outcomes (choices) are over- or undersampled 
 - Estimator (MLE) will produce estimates that mimic this bias. 
  67Choice Based Sample for Transport Mode
  68Weighting and Choice Based Sampling
- Weighted log likelihood for all data types 
 - Endogenous weights for individual data 
 -  Biased sampling: Choice Based
 
  69Choice Based Sampling Correction
- Maximize Weighted Log Likelihood 
 - Covariance Matrix Adjustment 
 -  $V = H^{-1} G H^{-1}$ (all three weighted) 
 -  H = Hessian 
 -  G = Outer products of gradients 
 
  70Effect of Choice Based Sampling
Unweighted
 --------------------------------------------------------
 Variable    Coefficient        Standard Error   b/St.Er.   P[|Z|>z]
 --------------------------------------------------------
 Constant    1.784582594        1.2693459          1.406      .1598
 GC          .2146879786E-01    .68080941E-02      3.153      .0016
 TTME       -.9846704221E-01    .16518003E-01     -5.961      .0000
 HINC        .2232338915E-01    .10297671E-01      2.168      .0302

 ---------------------------------------------
 Weighting variable CBWT
 Corrected for Choice Based Sampling
 ---------------------------------------------
 --------------------------------------------------------
 Variable    Coefficient        Standard Error   b/St.Er.   P[|Z|>z]
 --------------------------------------------------------
 Constant    1.014022236        1.1786164           .860      .3896
 GC          .2177810754E-01    .63743831E-02      3.417      .0006
 TTME       -.7434280587E-01    .17721665E-01     -4.195      .0000
 HINC        .2471679844E-01    .95483369E-02      2.589      .0096
 71Panel Data Treatments
- Pooling and robust estimation 
 - Clustering corrections 
 - Panel estimators 
 - Random effect 
 - Fixed effects 
 - Modeling heterogeneity 
 - Common effects 
 - Random parameters 
 - Mixed models 
 - Latent class models
 
  72Panel Data Application
- Did firm i produce a product or process innovation in year t? $y_{it}$ = 1[Yes] / 0[No] 
 - Observed N = 1270 firms for T = 5 years, 1984-1988 
 - Observed covariates: $x_{it}$ = Industry, competitive pressures, size, productivity, etc. 
 - How to model? 
 - Binary outcome 
 - Correlation across time 
 - Heterogeneity across firms
 
  73Application 
 74Cluster Effects in Panel and Stratified Data
- What do we mean by this? 
 - Clustering is with respect to the dependent variable 
 - Clustering is with respect to unobserved effects in the model 
 - Clustering with respect to independent variables is irrelevant and should be ignored. 
 - Correction is with respect to the covariance matrix, not the estimator 
 - Is the robust covariance matrix robust? To what? 
 - What assumptions are needed for the correction to work? The pooled estimator must be consistent! 
  75Cluster Correction 
 77Fixed and Random Effects in Regression
- $y_{it} = a_i + b'x_{it} + e_{it}$ 
 - Random effects: Two step FGLS. First step is OLS 
 - Fixed effects: OLS based on group mean differences 
 - Neither works (even approximately) if the model is nonlinear. 
 - How do we proceed for a binary choice model? 
 - $y^*_{it} = a_i + b'x_{it} + e_{it}$ 
 - $y_{it} = 1$ if $y^*_{it} > 0$, 0 otherwise.
 
  78Panel Data and Binary Choice Models
- Random Utility Model for Binary Choice 
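 - $U_{it} = \alpha + \beta'x_{it} + \varepsilon_{it}$ + Person i specific effect 
 - Fixed effects using dummy variables 
 - $U_{it} = \alpha_i + \beta'x_{it} + \varepsilon_{it}$ 
 - Random effects using omitted heterogeneity 
 - $U_{it} = \alpha + \beta'x_{it} + (\varepsilon_{it} + v_i)$ 
 - Same outcome mechanism: $Y_{it} = 1[U_{it} > 0]$ 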
 
  79Fixed and Random Effects Models
- Fixed Effects 
 - Robust to both specifications 
 - Inconvenient to compute (many parameters) 
 - Incidental parameters problem 
 - Random Effects 
 - Inconsistent if effects are correlated with X 
 - Small(er) number of parameters 
 - Easier (?) to compute 
 - Computation  available estimators 
 - Other Approaches to Modeling Heterogeneity
 
  80Random Effects
- $U_{it} = \alpha + \beta'x_{it} + (\varepsilon_{it} + \sigma_v v_i)$ 
 - Logit model (can be generalized) 
 - Joint probability for individual i | $v_i$ 
 - Unobserved component $v_i$ must be eliminated 
 - Maximize wrt $\alpha$, $\beta$ and $\sigma_v$ 
 - How to do the integration? 
 - Analytic integration: Integral does not exist in closed form 
 - Quadrature: most familiar software 
 - Simulation
 
  81Quadrature  Butler and Moffitt 
 82Estimation by Simulation
The log likelihood is the sum of the logs of $E[\Pr(y_1, y_2, \ldots \mid v_i)]$. It can be estimated by sampling $v_i$ and averaging. (Use random numbers.) 
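A minimal sketch of the idea for one individual, assuming a random effects logit with a standard normal $v_i$ and a single regressor. The parameter values and the three observations are hypothetical. The simulated probability (an average over random draws of $v_i$) is compared with a brute-force numerical integral over $v_i$, which stands in for quadrature in this example.

```python
import math
import random


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def joint_prob_given_v(ys, xs, alpha, beta, sigma_v, v):
    """Pr(y_1, ..., y_T | v): product of logit probabilities over periods."""
    prob = 1.0
    for y, x in zip(ys, xs):
        p = logistic(alpha + beta * x + sigma_v * v)
        prob *= p if y == 1 else 1.0 - p
    return prob


def simulated_prob(ys, xs, alpha, beta, sigma_v, n_draws=20_000, seed=82):
    """Average the conditional joint probability over random draws of v."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_draws):
        total += joint_prob_given_v(ys, xs, alpha, beta, sigma_v, rng.gauss(0.0, 1.0))
    return total / n_draws


def integrated_prob(ys, xs, alpha, beta, sigma_v, lo=-8.0, hi=8.0, steps=4000):
    """Trapezoid-rule integral of the joint probability against the N(0,1) density."""
    h = (hi - lo) / steps
    total = 0.0
    for k in range(steps + 1):
        v = lo + k * h
        density = math.exp(-0.5 * v * v) / math.sqrt(2.0 * math.pi)
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * joint_prob_given_v(ys, xs, alpha, beta, sigma_v, v) * density
    return total * h


# Hypothetical data for one individual and hypothetical parameter values.
ys, xs = [1, 0, 1], [0.2, -0.5, 1.0]
alpha, beta, sigma_v = 0.3, 0.8, 0.9

p_sim = simulated_prob(ys, xs, alpha, beta, sigma_v)
p_int = integrated_prob(ys, xs, alpha, beta, sigma_v)
log_l_i = math.log(p_sim)                    # this individual's contribution to logL
print(round(p_sim, 4), round(p_int, 4), round(log_l_i, 4))
```

Summing log_l_i over individuals gives the simulated log likelihood, which is then maximized over the parameters.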
 83Estimated Random Effects Models
Butler/Moffitt Quadrature
 ---------------------------------------------
 Random Effects Binary Probit Model
 Log likelihood function          -16273.96
 Restricted log likelihood        -17670.94
 Unbalanced panel has 7293 individuals.
 ---------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
 Constant    .03411277      .09635399         .354      .7233
 HHNINC     -.00317550      .06667150        -.048      .9620     .35208362
 HHKIDS     -.15378566      .02704366       -5.687      .0000     .40273000
 EDUC       -.03369428      .00628888       -5.358      .0000    11.3206310
 AGE         .02014296      .00131894       15.272      .0000    43.5256898
 MARRIED     .01632531      .03134693         .521      .6025     .75861817
 Rho         .44789069      .01020965       43.869      .0000

Simulation with 50 Halton Points
 ---------------------------------------------
 Random Coefficients Probit Model
 Log likelihood function          -16279.97
 PROBIT (normal) probability model
 Simulation based on 50 Halton draws
 ---------------------------------------------
 ---------Means for random parameters
 Constant    .03329051      .06322876         .527      .5985
 ---------Nonrandom parameters
 HHNINC     -.00297343      .05201195        -.057      .9544     .35208362
 HHKIDS     -.15357945      .02028593       -7.571      .0000     .40273000
 EDUC       -.03348872      .00393143       -8.518      .0000    11.3206310
 AGE         .02007864      .00090132       22.277      .0000    43.5256898
 MARRIED     .01682560      .02277150         .739      .4600     .75861817
 ---------Scale parameters for dists. of random parameters
 Constant    .90088375      .01126251       79.990      .0000

 RHO = $.90088375^2 / (1 + .90088375^2) = .447997$
 84Ignoring Unobserved Heterogeneity 
 85The Effect of Ignoring Random Effects: Logit Coefficient Estimates
logit ; lhs = doctor ; rhs = x ; mar ; pds = _groupti ; ran
 ----------------------------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
 ----------------------------------------------------------------
 Random Effects
 Constant   -.13460475      .17764130        -.758      .4486
 HHNINC      .02191356      .11865884         .185      .8535
 HHKIDS     -.21598299      .04773805       -4.524      .0000
 EDUC       -.06357790      .01132182       -5.616      .0000
 AGE         .03926718      .00246587       15.924      .0000
 MARRIED     .02507118      .05628204         .445      .6560
 Rho         .41607571      .00583916       71.256      .0000
 Pooled
 Constant    .25111543      .09113537        2.755      .0059
 HHNINC     -.18592232      .07506403       -2.477      .0133     .35208362
 HHKIDS     -.22947000      .02953694       -7.769      .0000     .40273000
 EDUC       -.04558783      .00564646       -8.074      .0000    11.3206310
 AGE         .02070863      .00128517       16.114      .0000    43.5256898
 MARRIED     .08529305      .03328573        2.562      .0104     .75861817
The cluster estimator does not fix this. 
 86The Effect of Ignoring Random Effects: Marginal Effects
 ----------------------------------------------------------------
 Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Elasticity
 ----------------------------------------------------------------
 Random Effects
 HHNINC      .00358351      .01940382         .185      .8535     .00198108
 EDUC       -.01039686      .00184906       -5.623      .0000    -.18480732
 AGE         .00642134      .00040269       15.946      .0000     .43885160
 ---------Marginal effect for dummy variable is P1 - P0.
 HHKIDS     -.03544814      .00786141       -4.509      .0000    -.02241578
 MARRIED     .00410498      .00922645         .445      .6564     .00488969
 Pooled
 HHNINC     -.04321347      .01744584       -2.477      .0132    -.02405262
 EDUC       -.01059587      .00131215       -8.075      .0000    -.18962896
 AGE         .00481326      .00029819       16.142      .0000     .33119369
 ---------Marginal effect for dummy variable is P1 - P0.
 HHKIDS     -.05359813      .00692332       -7.742      .0000    -.03412409
 MARRIED     .01993604      .00782159        2.549      .0108     .02390890
 87Fixed Effects
- Dummy variable coefficients 
 - $U_{it} = \alpha_i + \beta'x_{it} + \varepsilon_{it}$ 
 - Can be done by brute force for 10,000s of individuals 
 - F(.) = appropriate probability for the observed outcome 
 - Compute $\beta$ and $\alpha_i$ for $i = 1, \ldots, N$ (may be large) 
 - See "Estimating Econometric Models with Fixed Effects" at www.stern.nyu.edu/wgreene 
  88Models with Fixed Individual Effects
- Additive Effects 
 - Log Likelihood Function 
 - Approach 
 - Conditional estimation based on sufficient statistics 
 - Unconditional, brute force with all dummy variables 
  89Conditional Estimation
- Principle: $f(y_{i1}, y_{i2}, \ldots \mid \text{some statistic})$ is free of the fixed effects for some models. 
 - Maximize the conditional log likelihood, given the statistic. 
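The principle can be verified numerically for the two period binary logit, where the conditioning statistic is the sum $y_{i1} + y_{i2}$. The sketch below uses hypothetical values for beta, x1 and x2, and checks that the probability of the sequence (0, 1) given that the sum equals 1 is the same for very different values of the fixed effect.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def conditional_prob_01(alpha_i, beta, x1, x2):
    """Pr(y1=0, y2=1 | y1 + y2 = 1) in a two period logit with fixed effect alpha_i."""
    p1 = logistic(alpha_i + beta * x1)
    p2 = logistic(alpha_i + beta * x2)
    p01 = (1.0 - p1) * p2
    p10 = p1 * (1.0 - p2)
    return p01 / (p01 + p10)


def closed_form(beta, x1, x2):
    """The same conditional probability written without alpha_i."""
    return math.exp(beta * x2) / (math.exp(beta * x1) + math.exp(beta * x2))


beta, x1, x2 = 0.7, 0.4, 1.5                 # hypothetical values
conditional_values = [conditional_prob_01(a, beta, x1, x2) for a in (-2.0, 0.0, 3.0)]
reference = closed_form(beta, x1, x2)
print(conditional_values, reference)
```

The fixed effect cancels from the conditional probability, which is why the conditional log likelihood can be maximized over beta alone.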
  90Conditional Logit Model 
 91Conditional Estimation
- Other Distributions? 
 - Poisson  the leading nonbinary case 
 - Loglinear  Exponential 
 - Almost no others 
 - Estimating constants is still a problem if 
marginal effects or probabilities are desired 
  92Example Two Period Binary Logit
  93Binary Logit, cont.
- Estimate $\beta$ by maximizing conditional logL 
 - Estimate $\alpha_i$ by using the known $\beta$ in the FOC for the unconditional logL 
 - Solve for the N constants, one at a time treating $\beta$ as known. 
 - No solution when $y_{it}$ sums to 0 or $T_i$
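
A sketch of the second step for a single individual, assuming a logit with one regressor and treating beta as known. The first order condition for the constant is that the sum over periods of y_it minus the fitted probability equals zero. Bisection is used here for simplicity, and the data and the value of beta are hypothetical.

```python
import math


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def solve_alpha(ys, xs, beta, lo=-30.0, hi=30.0, iterations=200):
    """Solve sum_t [y_t - logistic(alpha + beta*x_t)] = 0 for alpha.

    Returns None when the outcomes sum to 0 or T, since no finite
    solution exists in that case.
    """
    total = sum(ys)
    if total == 0 or total == len(ys):
        return None

    def foc(alpha):
        return sum(y - logistic(alpha + beta * x) for y, x in zip(ys, xs))

    # foc is strictly decreasing in alpha, so bisection brackets the root.
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if foc(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


beta = 0.7                                   # treated as known (hypothetical value)
ys, xs = [1, 0, 1, 1], [0.2, -0.5, 1.0, 0.3]

alpha_hat = solve_alpha(ys, xs, beta)
residual = sum(y - logistic(alpha_hat + beta * x) for y, x in zip(ys, xs))
no_variation = solve_alpha([0, 0, 0, 0], xs, beta)
print(round(alpha_hat, 4), residual, no_variation)
```

Individuals whose outcome never changes have no finite constant term, which is consistent with groups being bypassed as inestimable in the unconditional estimates.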
 
  94Wooldridge on Estimating Partial Effects
- The fixed effects logit estimator of $\beta$ immediately gives us the effect of each element of $x_i$ on the log-odds ratio. Unfortunately, we cannot estimate the partial effects unless we plug in a value for $a_i$. Because the distribution of $a_i$ is unrestricted (in particular, $E[a_i]$ is not necessarily zero), it is hard to know what to plug in for $a_i$. In addition, we cannot estimate average partial effects, as doing so would require finding $E[\Lambda(x_{it}\beta + a_i)]$, a task that apparently requires specifying a distribution for $a_i$. 
  95Logit Constant Terms 
 96Unconditional Estimation
- Maximize the whole log likelihood 
 - Difficult! Many (thousands) of parameters. 
 - Feasible: NLOGIT (2001) (brute force)
 
  97Unconditional Estimator
---------------------------------------------
FIXED EFFECTS Logit Model
Log likelihood function       -9458.638
Unbalanced panel has 7293 individuals.
Bypassed 3046 groups with inestimable a(i).
LOGIT (Logistic) probability model
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
Index function for probability
HHNINC     -.06097272    .17828658        -.342      .7324      .35357827
HHKIDS     -.08840685    .07439887       -1.188      .2347      .43906021
EDUC       -.11670836    .06674866       -1.748      .0804     11.3554798
AGE         .10475175    .00725480       14.439      .0000     43.0477999
MARRIED    -.05731835    .10608750        -.540      .5890      .77178591
---------------------------------------------
Partial derivatives of E[y] = F[*] with
respect to the vector of characteristics.
They are computed at the means of the Xs.
Estimated E[y|means,mean alphai] =  .612
Estimated scale factor for dE/dx =  .237
---------------------------------------------
HHNINC     -.01447714    .04208235        -.344      .7308      .35357827
Marginal effect for binary independent variable
HHKIDS     -.02099100    .00415463       -5.052      .0000      .43906021
EDUC       -.02771081    .02037808       -1.360      .1739     11.3554798
AGE         .02487187    .00420872        5.910      .0000     43.0477999
Marginal effect for binary independent variable
MARRIED    -.01360946    .00469760       -2.897      .0038      .77178591
 98Conditional Estimator
--------------------------------------------------
Panel Data Binomial Logit Model
Number of individuals        7293
Number of periods            _GROUPTI
Conditioning event is the sum of DOCTOR
--------------------------------------------------
Log likelihood function      -6299.016
--------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]
--------------------------------------------------------------
HHNINC     -.05038297    .15887796        -.317      .7512
HHKIDS     -.07776425    .06628241       -1.173      .2407
EDUC       -.09081577    .05667292       -1.602      .1091
AGE         .08475971    .00650217       13.036      .0000
MARRIED    -.05207227    .09304411        -.560      .5757
-------------------------------------------
Partial derivatives of probabilities with
respect to the vector of characteristics.
They are computed at the means of the Xs.
Observations used are All Obs.
-------------------------------------------
(Constant terms estimated individually after estimation of beta by Chamberlain method)
Marginal effect for variable in probability
HHNINC     -.01205716    .03807980        -.317      .7515      -.00703544
HHKIDS     -.01860977    .01891717        -.984      .3252      -.34915032
EDUC       -.02173314    .01390241       -1.563      .1180      -.03601829
AGE         .02028386    .00310658        6.529      .0000      1.46317734
Marginal effect for dummy variable is P1 - P0.
MARRIED    -.00314259    .00566690        -.555      .5792      -.00209750
 99Advantages of the FE Model
- Allows correlation of effect and regressors 
 - Fairly straightforward to estimate 
 - Simple to interpret
 
  100Disadvantages of FE
- Not necessarily simple to estimate in very large samples (Stata just creates the thousands of dummy variables) 
 - The incidental parameters problem: small T bias.
 
  101Incidental Parameters Problems Conventional 
Wisdom
- General: biased in samples with fixed T except in special cases such as linear or Poisson regression 
 - Specific: upward bias (experience with probit and logit) in estimators of β 
  102What We Know About the IP Problem in Binary 
Choice Models
- Andersen, Hsiao, Abrevaya (Exact Analytic) 
 - Bias in logit estimator is exactly 100% when T = 2 
 - No result ever obtained for any other model for T = 2 
 - Heckman (Nonreplicable Monte Carlo study) 
 - Bias in probit estimator is small if T ≥ 8 
 - Bias in probit estimator is toward 0 in some cases 
 - Katz (et al.; numerous others), Greene 
 - Bias in probit and logit estimators is large 
 - Upward bias persists even as T → 20
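
The exact T = 2 result for the logit model can be checked numerically. A bias of exactly 100% means the unconditional (dummy variable) estimate is twice the conditional one. The sketch below concentrates each αi out of the unconditional log likelihood by solving its first order condition by bisection, then compares the maximizer with the conditional estimator on the same simulated sample. The data design, the search interval [0, 6], and all function names are assumptions made for this illustration.

```python
import math
import random


def logistic(v):
    return 1.0 / (1.0 + math.exp(-v))


def simulate_switchers(n, beta, seed=0):
    """Simulate a T = 2 panel and keep individuals with y1 + y2 = 1.
    Groups with sums 0 or 2 have inestimable effects and are bypassed."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        alpha = 0.5 * (x1 + x2) + rng.gauss(0.0, 1.0)
        y1 = 1 if rng.random() < logistic(alpha + beta * x1) else 0
        y2 = 1 if rng.random() < logistic(alpha + beta * x2) else 0
        if y1 + y2 == 1:
            rows.append((x1, x2, y1, y2))
    return rows


def solve_alpha(b, x1, x2, y1, y2):
    """Solve sum_t [y_t - Lambda(a + b*x_t)] = 0 for a by bisection."""
    lo, hi = -50.0, 50.0
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if logistic(mid + b * x1) + logistic(mid + b * x2) < y1 + y2:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def unconditional_loglik(b, rows):
    """Log likelihood with every alpha_i set to its optimum given b."""
    total = 0.0
    for x1, x2, y1, y2 in rows:
        a = solve_alpha(b, x1, x2, y1, y2)
        for x, y in ((x1, y1), (x2, y2)):
            p = logistic(a + b * x)
            total += math.log(p if y == 1 else 1.0 - p)
    return total


def conditional_loglik(b, rows):
    """Conditional log likelihood given y1 + y2 = 1."""
    return sum(math.log(logistic(b * (x2 - x1) * (1 if y2 == 1 else -1)))
               for x1, x2, y1, y2 in rows)


def golden_max(f, lo, hi, tol=1e-6):
    """Golden-section search for the maximizer of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - ratio * (hi - lo), lo + ratio * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            lo, c, fc = c, d, fd
            d = lo + ratio * (hi - lo)
            fd = f(d)
        else:
            hi, d, fd = d, c, fc
            c = hi - ratio * (hi - lo)
            fc = f(c)
    return 0.5 * (lo + hi)


rows = simulate_switchers(n=400, beta=1.0, seed=3)
b_cond = golden_max(lambda b: conditional_loglik(b, rows), 0.0, 6.0)
b_uncond = golden_max(lambda b: unconditional_loglik(b, rows), 0.0, 6.0)
print(f"conditional: {b_cond:.4f}  unconditional: {b_uncond:.4f}  "
      f"ratio: {b_uncond / b_cond:.4f}")
```

The ratio does not depend on the sample size, since it holds sample by sample rather than only in the limit.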
 
  103Some Familiar Territory  A Monte Carlo Study of 
the FE Estimator Probit vs. Logit
Estimates of Coefficients and Marginal Effects at 
the Implied Data Means
Results are scaled so the desired quantity being estimated (β, δ, marginal effects) all equal 1.0 in the population. 
 104Specification Tests RE vs. FE
- Fixed effects vs. random effects 
 - Unconditional FE estimator is never consistent (if T is small) 
 - RE is inconsistent if FE applies 
 - Cannot use Hausman test 
 - Effects vs. no effects 
 - Conditional FE estimator is always consistent 
 - Pooled estimator is consistent if no effects 
 - Can use Hausman test for this. The specification test is for common effects vs. no effects, not fixed effects vs. no effects. 
  105Dynamic Models 
 106Dynamic Probit Model A Standard Approach 
 107Bias Reduction
- Dynamic binary choice 
 - yit = 1(β'xit + δyi,t-1 + ci > 0) 
 - Fixed or random effects 
 - Common fixed effects 
 - yit = 1(β'xit + ci > 0) 
 - Presumed proportional bias: plim b = kβ 
 - Estimate the proportionality constant, k. 
 - Literature is in its infancy 
 - How do we know b is proportionally biased? 
 - Known analytic results apply to trivial models 
 - Not yet useful for practitioners
 
  108Ordered Outcomes
- E.g., taste test, credit rating, course grade 
 - Underlying random preferences: mapping to observed choices 
 - Strength of preferences 
 - Censoring and discrete measurement 
 - The nature of ordered data
 
  109Modeling Ordered Choices
- Random Utility 
 - Uit = α + β'xit + γi'zit + εit 
 -     = ait + εit 
 - Observe outcome j if utility is in region j 
 - Probability of outcome = probability of cell 
 - Pr[Yit = j] = F(μj - ait) - F(μj-1 - ait) 
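
The cell probabilities can be sketched as differences of a CDF evaluated at the thresholds. The example assumes F is the standard normal CDF (the ordered probit case) and uses hypothetical thresholds and index values chosen only for illustration.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probabilities(a, thresholds):
    """Pr[Y = j] = F(mu_j - a) - F(mu_{j-1} - a) for j = 0,...,J,
    with the outer thresholds fixed at -infinity and +infinity."""
    bounds = [-math.inf] + list(thresholds) + [math.inf]
    return [norm_cdf(upper - a) - norm_cdf(lower - a)
            for lower, upper in zip(bounds[:-1], bounds[1:])]


cuts = [0.0, 1.0, 2.0]                      # hypothetical: mu_0 = 0 < mu_1 < mu_2
low = ordered_probabilities(0.5, cuts)      # index a = 0.5
high = ordered_probabilities(1.0, cuts)     # index a = 1.0
print([round(p, 4) for p in low])
print([round(p, 4) for p in high])
```

Raising the index moves probability out of the lowest cell and into the highest cell, while the cells always sum to one.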
 
  110Application Health Care Usage
German Health Care Usage Data, 7,293 Individuals, Varying Numbers of Periods
Data downloaded from Journal of Applied Econometrics Archive. This is an unbalanced panel with 7,293 individuals. They can be used for regression, count models, binary choice, ordered choice, and bivariate binary choice. This is a large data set. There are altogether 27,326 observations. The number of observations ranges from 1 to 7. (Frequencies are 1=1525, 2=2158, 3=825, 4=926, 5=1051, 6=1000, 7=987.) Note, the variable NUMOBS below tells how many observations there are for each person. This variable is repeated in each row of the data for the person. 
Variables in the file are
 DOCTOR   = 1(Number of doctor visits > 0) 
 HOSPITAL = 1(Number of hospital visits > 0) 
 HSAT     = health satisfaction, coded 0 (low) - 10 (high) 
 DOCVIS   = number of doctor visits in last three months 
 HOSPVIS  = number of hospital visits in last calendar year 
 PUBLIC   = insured in public health insurance = 1; otherwise = 0 
 ADDON    = insured by add-on insurance = 1; otherwise = 0 
 HHNINC   = household nominal monthly net income in German marks / 10000 (4 observations with income = 0 were dropped) 
 HHKIDS   = children under age 16 in the household = 1; otherwise = 0 
 EDUC     = years of schooling 
 AGE      = age in years 
 MARRIED  = marital status 
 111Health Care Satisfaction (HSAT)
Self-Administered Survey: Health Care Satisfaction? (0 - 10)
Continuous Preference Scale 
 112Ordered Probabilities 
 113Four Ordered Probabilities
[Figure: four cells, y=0, y=1, y=2, y=3, separated by the cut points -∞ - β'x, 0 - β'x, μ1 - β'x, μ2 - β'x, +∞ - β'x]
 114Coefficients 
 115Effects in the Ordered Probability Model
Assume that βk is positive. Assume that xk increases. β'x increases. μj - β'x shifts to the left for all 4 cells. 
Prob[y=0] decreases. 
Prob[y=1] decreases: the mass shifted out is larger than the mass shifted in. 
Prob[y=2] increases: same reason. 
Prob[y=3] increases. 
When βk > 0, an increase in xk decreases Prob[y=0] and increases Prob[y=J]. Intermediate cells are ambiguous, but there is only one sign change in the marginal effects from 0 to 1 to … to J. 
 116Ordered Probability Model for Health Satisfaction
---------------------------------------------
Ordered Probability Model
Dependent variable               HSAT
Number of observations           27326
Underlying probabilities based on Normal
Cell frequencies for outcomes
 Y  Count  Freq    Y  Count  Freq    Y  Count  Freq
 0    447  .016    1    255  .009    2    642  .023
 3   1173  .042    4   1390  .050    5   4233  .154
 6   2530  .092    7   4231  .154    8   6172  .225
 9   3061  .112   10   3192  .116
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
Index function for probability
Constant   2.61335825    .04658496        56.099     .0000
FEMALE     -.05840486    .01259442        -4.637     .0000      .47877479
EDUC        .03390552    .00284332        11.925     .0000     11.3206310
AGE        -.01997327    .00059487       -33.576     .0000     43.5256898
HHNINC      .25914964    .03631951         7.135     .0000      .35208362
HHKIDS      .06314906    .01350176         4.677     .0000      .40273000
Threshold parameters for index
Mu(1)       .19352076    .01002714        19.300     .0000
Mu(2)       .49955053    .01087525        45.935     .0000
Mu(3)       .83593441    .00990420        84.402     .0000
Mu(4)      1.10524187    .00908506       121.655     .0000
Mu(5)      1.66256620    .00801113       207.532     .0000
Mu(6)      1.92729096    .00774122       248.965     .0000
Mu(7)      2.33879408    .00777041       300.987     .0000
Mu(8)      2.99432165    .00851090       351.822     .0000
Mu(9)      3.45366015    .01017554       339.408     .0000
 117Ordered Probability Effects
----------------------------------------------------
Marginal effects for ordered probability model
M.E.s for dummy variables are Pr[y|x=1]-Pr[y|x=0]
Names for dummy variables are marked by *.
----------------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
These are the effects on Prob[Y=00] at means.
*FEMALE     .00200414    .00043473         4.610     .0000      .47877479
EDUC       -.00115962    .986135D-04     -11.759     .0000     11.3206310
AGE         .00068311    .224205D-04      30.468     .0000     43.5256898
HHNINC     -.00886328    .00124869        -7.098     .0000      .35208362
*HHKIDS    -.00213193    .00045119        -4.725     .0000      .40273000
These are the effects on Prob[Y=01] at means.
*FEMALE     .00101533    .00021973         4.621     .0000      .47877479
EDUC       -.00058810    .496973D-04     -11.834     .0000     11.3206310
AGE         .00034644    .108937D-04      31.802     .0000     43.5256898
HHNINC     -.00449505    .00063180        -7.115     .0000      .35208362
*HHKIDS    -.00108460    .00022994        -4.717     .0000      .40273000
... repeated for all 11 outcomes
These are the effects on Prob[Y=10] at means.
*FEMALE    -.01082419    .00233746        -4.631     .0000      .47877479
EDUC        .00629289    .00053706        11.717     .0000     11.3206310
AGE        -.00370705    .00012547       -29.545     .0000     43.5256898
HHNINC      .04809836    .00678434         7.090     .0000      .35208362
*HHKIDS     .01181070    .00255177         4.628     .0000      .40273000
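
The marginal effects reported for the continuous variables can be recomputed from the ordered probit estimates and sample means reported for HSAT, using dProb[Y=j]/dxk = [f(μj-1 - β'x) - f(μj - β'x)]βk with f the standard normal density. The sketch below is an illustrative recomputation under that formula, with μ0 = 0 as the normalization. It covers only the derivative formula, so it is not expected to match the reported effects for the dummy variables FEMALE and HHKIDS, which are computed as differences in probabilities.

```python
import math


def norm_pdf(z):
    """Standard normal density (returns 0.0 at +/- infinity)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


# Coefficients and sample means copied from the ordered probit output for HSAT.
beta = {"Constant": 2.61335825, "FEMALE": -.05840486, "EDUC": .03390552,
        "AGE": -.01997327, "HHNINC": .25914964, "HHKIDS": .06314906}
means = {"Constant": 1.0, "FEMALE": .47877479, "EDUC": 11.3206310,
         "AGE": 43.5256898, "HHNINC": .35208362, "HHKIDS": .40273000}
# Thresholds mu_0 = 0 (normalization), then Mu(1),...,Mu(9).
mu = [0.0, .19352076, .49955053, .83593441, 1.10524187, 1.66256620,
      1.92729096, 2.33879408, 2.99432165, 3.45366015]

index_at_means = sum(beta[k] * means[k] for k in beta)
bounds = [-math.inf] + mu + [math.inf]


def marginal_effect(name, j):
    """Derivative of Prob[Y = j] with respect to variable `name` at the means."""
    lower, upper = bounds[j], bounds[j + 1]
    return (norm_pdf(lower - index_at_means)
            - norm_pdf(upper - index_at_means)) * beta[name]


educ_effects = [marginal_effect("EDUC", j) for j in range(11)]
sign_changes = sum(1 for u, v in zip(educ_effects[:-1], educ_effects[1:])
                   if (u < 0) != (v < 0))

print(f"EDUC on Prob[Y=0]:  {marginal_effect('EDUC', 0):.8f}")
print(f"AGE  on Prob[Y=0]:  {marginal_effect('AGE', 0):.8f}")
print(f"EDUC on Prob[Y=10]: {marginal_effect('EDUC', 10):.8f}")
print(f"sign changes across the 11 cells for EDUC: {sign_changes}")
```

Because the effects on the 11 cells are differences of one density evaluated at increasing points, they sum to zero and change sign exactly once.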
 118Ordered Probit Marginal Effects 
 119Panel Data Treatments FE
- No sufficient statistics → no conditional estimator 
 - Unconditional (brute force) is straightforward 
 - Transformed model 
 - Prob(yit > j, t = 1,…,Ti | Xi) is an ordered binary choice model 
 - Produces multiple estimators of β 
 - Reconcile with minimum distance
 
  120Transformed Model
Fit this model for each j = 1,…,J as a fixed or random effects binary choice model. Each has its own j-specific constant term (random effects) or estimates of (αi - μj) and its own specific vector βj. Reconcile the multiple slope vectors with a minimum distance weighted average. (In both cases, only the part of βj that is not the constant term.) (Note: This is a way to get a consistent estimator in the presence of fixed effects. It is not needed for random effects.) 
 121Fixed Effects Estimates
---------------------------------------------
FIXED EFFECTS OrdPrb Model
Number of observations           27326
Iterations completed             6
Log likelihood function          -41876.93
Number of parameters             5680
Unbalanced panel has 7293 individuals.
Bypassed 1626 groups with inestimable a(i).
LHS variable = values 0,1,...,10
---------------------------------------------
Variable   Coefficient   Standard Error   b/St.Er.   P[|Z|>z]   Mean of X
--------------------------------------------------------------------------
Index function for probability
AGE        -.07137568    .00273513       -26.096     .0000     43.9209856
HHNINC      .30140100    .06919695         4.356     .0000      .35112607
EDUC        .02405894    .02649654          .908     .3639     11.3100525
HHKIDS     -.05493925    .02766401        -1.986     .0470      .40921377
MU(1)       .32485866    .02036400        15.953     .0000
MU(2)       .84476382    .02736032        30.876     .0000
MU(3)      1.39396202    .03002635        46.425     .0000
MU(4)      1.82292900    .03101930        58.768     .0000
MU(5)      2.69905222    .03227934        83.615     .0000
MU(6)      3.12710904    .03273884        95.517     .0000
MU(7)      3.79215966    .03344847       113.373     .0000
MU(8)      4.84341077    .03489769       138.789     .0000
MU(9)      5.57238334    .03629754       153.520     .0000
Time invariant variable FEMALE is dropped from the fixed effects model. 
 122Generalizing the Ordered Probit
- Index = β'xit 
 - Thresholds 
 - Standard model: μ-1 = -∞, μ0 = 0, μj > μj-1 > 0. 
 - Homogeneous preference scale. 
 - Generalized (Pudney/Shields, JAE 00, Job Grades) 
 - Note identification problem - β'xit. If any variables are common to the two parts, the coefficients are not identified. 
 - Harris, Zhao, Greene (2004, Drug Use)
 
  123Multivariate Binary Choice Models
- Bivariate Probit Models 
 - Analysis of bivariate choices 
 - Marginal effects 
 - Prediction 
 - Simultaneous Equations and Recursive Models 
 - A Sample Selection Bivariate Probit Model 
 - The Multivariate Probit Model 
 - Specification 
 - Simulation based estimation 
 - Inference 
 - Partial effects and analysis 
 - The panel probit model 
 
  124Gross Relation Between Two Binary Variables
- Cross Tabulation Suggests Presence or Absence of 
a Bivariate Relationship 
-------------------------------------------------------------------------
Cross Tabulation
Row variable is DOCTOR   (Out of range 0-49: 0)
Number of Rows = 2       (DOCTOR = 0 to 1)
Col variable is HOSPITAL (Out of range 0-49: 0)
Number of Cols = 2       (HOSPITAL = 0 to 1)
Chi-squared independence tests:
Chi-squared[1] = 430.11235   Prob value = .00000
G-squared  [1] = 477.27393   Prob value = .00000
-------------------------------------------------------------------------
Joint Frequencies for Row Variable DOCTOR, Column Variable HOSPITAL
-------------------------------------------------------------------------
DOCTOR     Total        0        1
-------------------------------------------------------------------------
0          10135     9715      420
1          17191    15216     1975
-------------------------------------------------------------------------
Total      27326    24931     2395
-------------------------------------------------------------------------
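
The two independence statistics can be recomputed directly from the joint frequencies. The sketch below uses the standard Pearson chi-squared and likelihood-ratio (G-squared) formulas without a continuity correction. That these are the formulas behind the reported values is an assumption, although the results agree with the output above.

```python
import math

# Joint frequencies: rows are DOCTOR = 0, 1; columns are HOSPITAL = 0, 1.
table = [[9715, 420],
         [15216, 1975]]


def independence_tests(counts):
    """Return (Pearson chi-squared, G-squared) for a two-way table."""
    n = sum(sum(row) for row in counts)
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(col) for col in zip(*counts)]
    chi2 = g2 = 0.0
    for i, row in enumerate(counts):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # under independence
            chi2 += (observed - expected) ** 2 / expected
            g2 += 2.0 * observed * math.log(observed / expected)
    return chi2, g2


chi2, g2 = independence_tests(table)
print(f"Chi-squared[1] = {chi2:.5f}")
print(f"G-squared  [1] = {g2:.5f}")
```

Both statistics have one degree of freedom for a 2 x 2 table, so values this large reject independence of DOCTOR and HOSPITAL at any conventional level.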
 125Tetrachoric Correlation
http://ourworld.compuserve.com/homepages/jsuebersax/tetra.htm 
http://www2.chass.ncsu.edu/garson/pa765/correl.htm 
 126Estimating Tetrachoric Correlation
- Numerous ad hoc algorithms suggested in the literature 
 - Do not appear to have noticed the connection to a bivariate probit model 
 - Maximum likelihood estimation is simple under the (necessary) assumption of normality 
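
A minimal sketch of the maximum likelihood calculation for a 2 x 2 table follows. With no covariates, the bivariate normal model has three parameters (two thresholds and ρ) for three free cell probabilities, so the ML estimates can be found by matching the two margins and then solving for the ρ that reproduces the (0,0) cell. The bivariate normal CDF is computed here by Simpson's rule. The integration limits, grid size, and function names are assumptions made for this illustration, and this is not the NLOGIT routine.

```python
import math
from statistics import NormalDist

STD = NormalDist()  # standard normal


def bvn_cdf(a, b, rho, n=400):
    """P(z1 <= a, z2 <= b) for a standard bivariate normal with correlation rho,
    computed as the integral over x in [-8, a] of phi(x) * Phi((b - rho*x)/s)."""
    lower = -8.0
    h = (a - lower) / n
    s = math.sqrt(1.0 - rho * rho)

    def g(x):
        return STD.pdf(x) * STD.cdf((b - rho * x) / s)

    total = g(lower) + g(a)
    for k in range(1, n):
        total += (4.0 if k % 2 else 2.0) * g(lower + k * h)
    return total * h / 3.0


def tetrachoric(n00, n01, n10, n11):
    """Return (rho, a, b) matching both margins and the (0,0) cell."""
    n = n00 + n01 + n10 + n11
    a = STD.inv_cdf((n00 + n01) / n)   # threshold for the row variable
    b = STD.inv_cdf((n00 + n10) / n)   # threshold for the column variable
    target = n00 / n
    lo, hi = -0.99, 0.99
    for _ in range(50):                # the (0,0) probability increases with rho
        mid = 0.5 * (lo + hi)
        if bvn_cdf(a, b, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi), a, b


# DOCTOR (rows) by HOSPITAL (columns) joint frequencies.
rho, a, b = tetrachoric(9715, 420, 15216, 1975)
print(f"tetrachoric correlation: {rho:.4f}")
```

The same quantity is the correlation parameter of a bivariate probit model that contains only constant terms, which is the connection the ad hoc algorithms overlook.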
  127Likelihood Function 
 128Estimation
---------------------------------------------  
FIML Estimates of Bivariate Probit Model   
Maximum Likelihood Estimates   
Dependent variable DOCHOS   
Weighting variable None   
Number of observation