Title: Categorical Data Analysis (T.Y. Wang, Illinois State University) tywang@ilstu.edu
2- ?????????????(linear regression models)????????
3- ???????
- ?????????,????????
- ?
- ?
4- Heteroscedasticity (vs. homoscedasticity)
5- Mean and variance of a binary variable: E(x) = p, Var(x) = p(1-p)
  p      Var(x) = p(1-p)
  0.1    0.09
  0.2    0.16
  0.3    0.21
  0.4    0.24
  0.5    0.25
  0.6    0.24
  0.7    0.21
  0.8    0.16
  0.9    0.09
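The variance column can be checked directly from the formula Var(x) = p(1-p). The following is a minimal Python sketch, not part of the original slides; the function name bernoulli_variance is illustrative. It shows that the variance changes with p and peaks at p = 0.5, which is the source of the heteroscedasticity noted earlier.

```python
# Variance of a binary (Bernoulli) variable: Var(x) = p * (1 - p).
# The function name is an illustrative assumption, not from the slides.
def bernoulli_variance(p: float) -> float:
    """Return p * (1 - p) for a probability p in [0, 1]."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie between 0 and 1")
    return p * (1.0 - p)


if __name__ == "__main__":
    # Print the same grid of probabilities used in the table.
    for i in range(1, 10):
        p = i / 10
        print(f"{p:.1f}  {bernoulli_variance(p):.2f}")
```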
6- ? y ??????? ,??????(variance) ?????? x ????
8- Functional form
- ????????
- Nonsensical predictions
- prvalue, x(age=35 k5=4 wc=0 hc=0)
9- Example: Labor force participation of women (file name: binlfp2)
10- ?????????,??????????????,??????????
11- Models to be Considered
- Binary: Logit and Probit
- Ordinal: Ordered Logit and Ordered Probit
- Nominal: Multinomial Logit
- Count: Poisson and Negative Binomial
12- Maximum Likelihood Estimation
- The ML estimate is the parameter value that maximizes the likelihood of observing the sample data
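The idea of "the value that maximizes the likelihood of observing the sample data" can be illustrated with the simplest possible case. This is a hedged Python sketch, not the procedure used by Stata: it assumes a Bernoulli model with a single parameter p, uses a made-up sample, and searches a grid of candidate values instead of using an iterative optimizer.

```python
import math


def log_likelihood(p: float, data: list[int]) -> float:
    """Bernoulli log likelihood of 0/1 data for success probability p."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in data)


def ml_estimate(data: list[int], steps: int = 999) -> float:
    """Grid search over p in (0, 1); return the p with the highest log likelihood."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    return max(grid, key=lambda p: log_likelihood(p, data))


# Hypothetical sample (an assumption for illustration): 7 ones and 3 zeros.
sample = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
p_hat = ml_estimate(sample)
print(p_hat)  # the grid point closest to the sample proportion of ones
```

For this model the maximizing value coincides with the sample proportion of ones, which is a convenient check on the search.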
13- ????????????????(the method of least
squares)??????? - ??????????????????????????
14- ??????????????,????????????????????????,??????????
?(realization)?
15- Latent Variable Model
[Figure: latent variable running from -infinity to +infinity with a cutpoint at 0; observed y = 0 below the cutpoint and y = 1 above it]
17- ??????????????????????,???????????????????,???????
?????????????????????????
18- ??????????????,????????????
19- ?????????????????(numerical methods)??????????????
??
20- ??????????????(????? (start value)),????????????
?????,??????????? - ?????????????,?????????(iteration)
21- ??????????????????????????,????????????(convergence),???????????
23- Models for Binary Outcomes
24- Latent Variable Model
[Figure: latent variable running from -infinity to +infinity with a cutpoint at 0; observed y = 0 below the cutpoint and y = 1 above it]
27- Distribution of Errors (ε)
- binary logit model if the errors follow a logistic distribution
- binary probit model if the errors follow a normal distribution
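The link between the latent variable and the choice of error distribution can be illustrated by simulation. The sketch below is an assumption-laden illustration, not part of the slides: it takes a fixed value of the linear predictor (called xb here), draws logistic errors, sets y = 1 whenever the latent value exceeds 0, and compares the share of ones with the logistic CDF evaluated at xb.

```python
import math
import random


def logistic_cdf(z: float) -> float:
    """CDF of the standard logistic distribution."""
    return 1.0 / (1.0 + math.exp(-z))


def simulate_share_of_ones(xb: float, n: int = 100_000, seed: int = 1) -> float:
    """Simulate y* = xb + e with logistic e; return the share of cases with y* > 0."""
    rng = random.Random(seed)
    ones = 0
    for _ in range(n):
        u = rng.random()
        while u == 0.0:  # avoid log(0) in the inverse-CDF step
            u = rng.random()
        e = math.log(u / (1.0 - u))  # logistic error drawn by inverse CDF
        y_star = xb + e              # latent variable
        if y_star > 0:               # observed y = 1 above the cutpoint 0
            ones += 1
    return ones / n


if __name__ == "__main__":
    xb = 0.5  # hypothetical value of the linear predictor
    print(simulate_share_of_ones(xb), logistic_cdf(xb))
```

Replacing the logistic draw with a standard normal draw would give the probit counterpart, with the normal CDF in place of the logistic CDF.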
28- Generalized Linear Model: three components
- Random component
- Systematic component
- Link component
29- Random component: the binary outcome y is assumed to follow a Bernoulli distribution
31- ????????????? ?????????????, ??????
32?????(logit)??????????????
33- Odds
- ????????????????????????
- ??
- ??
34- Odds of 4: the probability of success is 0.8 and the probability of failure is 0.2, so success is 4 times as likely as failure
- ?????,???
35- ????????
- ??0????,???
- ???????????,???????????????,????????,????
36- ????????
- ??????1 ,???????,???????1???
- ?????1?,?????????????????
37- Probability and odds
  p       0.01   0.2    0.5    0.7    0.9    0.99
  1-p     0.99   0.8    0.5    0.3    0.1    0.01
  Odds    0.01   0.25   1.00   2.33   9.00   99.0
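The odds row follows from dividing each probability by its complement. A minimal Python sketch (the function name odds is illustrative) reproduces the figures:

```python
def odds(p: float) -> float:
    """Odds of an event with probability p: p / (1 - p)."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return p / (1.0 - p)


if __name__ == "__main__":
    for p in (0.01, 0.2, 0.5, 0.7, 0.9, 0.99):
        print(f"p = {p:<5} 1-p = {1 - p:<5.2f} odds = {odds(p):.2f}")
```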
38- odds > 1: success is more likely than failure
- 0 < odds < 1: success is less likely than failure
- odds = 1: success and failure are equally likely
40- Odds ratio: the ratio of two odds
41- Example: odds and odds ratio
              Event    No event    Odds
  Group 1     600      5000        0.12
  Group 2     400      60000       0.0067
  Odds ratio (Group 1 vs. Group 2): 18
?????????,????????????????
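The arithmetic behind the odds ratio of 18 can be sketched in Python. This is an illustration under an assumption: the two counts in each row are read as the number of cases with and without the event, since the original row and column labels did not survive. The function names are illustrative.

```python
def odds_from_counts(events: int, non_events: int) -> float:
    """Odds computed from counts: events divided by non-events."""
    if non_events <= 0:
        raise ValueError("non_events must be positive")
    return events / non_events


def odds_ratio(odds_a: float, odds_b: float) -> float:
    """Ratio of two odds."""
    return odds_a / odds_b


odds_1 = odds_from_counts(600, 5000)    # 0.12
odds_2 = odds_from_counts(400, 60000)   # about 0.0067
ratio = odds_ratio(odds_1, odds_2)      # 18, up to floating-point rounding
print(round(odds_1, 2), round(odds_2, 4), round(ratio, 1))
```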
42- Logit: the natural log of the odds
- ??????????,????????0???
- ????0.5??,????
- ??????????????????????
43- Probability, odds, and logit
  p       0.1     0.3     0.5    0.7    0.9
  1-p     0.9     0.7     0.5    0.3    0.1
  odds    0.11    0.43    1.00   2.33   9.00
  logit   -2.20   -0.85   0.00   0.85   2.20
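The logit row is the natural log of the odds row, and the transformation can be inverted to recover the probability. A minimal Python sketch (function names are illustrative):

```python
import math


def logit(p: float) -> float:
    """Natural log of the odds p / (1 - p), defined for 0 < p < 1."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(p / (1.0 - p))


def inverse_logit(z: float) -> float:
    """Map a logit back to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(f"p = {p}  logit = {logit(p):.2f}")
```

The values are symmetric around p = 0.5, where the logit is 0, and the logit is unbounded in both directions while p stays between 0 and 1.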
44- The model is linear in the logit
45- The same change in x has a constant effect on the logit
- The effect of the same change in x on the probability is not constant (nonlinear in probability)
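The contrast between "linear in logit" and "nonlinear in probability" can be made concrete with a small numerical sketch. The intercept and slope below are hypothetical values chosen only for illustration; they are not estimates from the slides.

```python
import math

A = 0.0  # hypothetical intercept (assumption)
B = 1.0  # hypothetical slope (assumption)


def logit_at(x: float) -> float:
    """Linear predictor: the logit is a linear function of x."""
    return A + B * x


def prob_at(x: float) -> float:
    """Probability implied by the logit at x."""
    return 1.0 / (1.0 + math.exp(-logit_at(x)))


if __name__ == "__main__":
    for x in (0, 1, 2, 3):
        d_logit = logit_at(x + 1) - logit_at(x)  # always equal to B
        d_prob = prob_at(x + 1) - prob_at(x)     # shrinks as x moves away from 0
        print(f"x: {x} -> {x + 1}  change in logit = {d_logit:.2f}  "
              f"change in probability = {d_prob:.4f}")
```

Each unit step in x adds the same amount to the logit, while the change in probability depends on where the step starts.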
47- The change in probability depends on the starting level of x_i and on the values of the other variables
- A nonlinear probability model
48- ?????????????????????,?????
- ????? ?
- ????,????????,???????(identity)??????????????,????
??
49- ??????,?
- ???????????
- ????????????,???0?1????????????????????????
50- ????????????
- Bernoulli distribution
- ?????
- binary logit model
- binary probit model
51- Example: Labor force participation of women (file name: binlfp2)
- Differences and similarities between Logit and Probit
52- Hypothesis Testing
- ????,???,p ????
- test: tests the effect of a single coefficient (e.g., test k5) or whether multiple coefficients are equal (e.g., test hc=wc)
- lrtest: compares competing (nested) models using the LR test
53- The LR test assesses a hypothesis by comparing the log likelihood of the full (unconstrained) model, M_U, with that of a restricted (constrained) model, M_C. H0: the imposed constraints are true. If the constraints significantly reduce the log likelihood, H0 is rejected and the full model is preferred.
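The comparison of the two log likelihoods is usually summarized as a statistic equal to twice their difference, referred to a chi-square distribution with degrees of freedom equal to the number of constraints. The Python sketch below is a hedged illustration, not Stata's lrtest: the log-likelihood values are hypothetical, and the p-value helper covers only 1 or 2 constraints, where simple closed forms exist.

```python
import math


def lr_statistic(lnl_full: float, lnl_restricted: float) -> float:
    """LR statistic: 2 * (lnL of full model - lnL of restricted model)."""
    return 2.0 * (lnl_full - lnl_restricted)


def chi2_pvalue(stat: float, df: int) -> float:
    """Upper-tail chi-square probability; closed forms for df = 1 and df = 2 only."""
    if stat < 0:
        raise ValueError("the statistic must be non-negative")
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise NotImplementedError("this sketch handles only df = 1 or df = 2")


if __name__ == "__main__":
    # Hypothetical log likelihoods (assumptions for illustration only).
    lnl_full, lnl_restricted = -452.6, -461.7
    g2 = lr_statistic(lnl_full, lnl_restricted)
    print(f"LR statistic = {g2:.2f}, p = {chi2_pvalue(g2, df=2):.4f}")
```

A small p-value indicates that the constraints reduce the log likelihood significantly, so H0 is rejected in favor of the full model.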
54- Consider the following models
- M1
- M2
-
- M3
55- When conducting an LR test:
- the two models must be nested (i.e., the nested model is created by imposing constraints on the coefficients of the other model),
- the two models must be fitted on exactly the same sample
56- mark and markout to exclude cases with missing data
- mark nomiss
- markout nomiss lfp k5 k618 age wc hc lwg inc
- logit lfp k5 inc if nomiss==1
57- Interpreting the results
- predict: predicted probabilities for each observation
- prvalue and prtab: predicted probabilities for a given profile
- prchange: changes in predicted probabilities
- listcoef: odds ratios
58- predict: predicted probabilities for each observation
- predict prlogit
59- prvalue and prtab: predicted probabilities for a given profile
- prvalue, x(age=50 k5=0 k618=0 wc=1 hc=1) rest(mean)
- prtab k5 wc, rest(mean)
60- prchange: changes in predicted probabilities
- marginal change
61- ?????????????, ???????????????? -0.0084
- ?????????????????,????????????????????,?????????,?
??
63- discrete change
- ?????????????, ??????????,?????????? 0.34
- ???????????, ?????????????, ??????? 0.19
64- min->max
- ??????????????,????????????????,??????? 0.4372
65- listcoef: odds ratios
66- ?????????,????????????????????
67 68- ??
- ???????????,??? ???????,??????????? ?
69- If exp(β) > 1, the odds are exp(β) times larger
- If exp(β) < 1, the odds are exp(β) times smaller
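70- Factor change: for a unit increase in the variable, holding other variables constant, the odds of being in the labor force change by a factor of 0.23
- Percentage change: for a unit increase in the variable, holding other variables constant, the odds of being in the labor force decrease by 76.8%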
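The two figures quoted here, 0.23 and 76.8, are two ways of expressing the same exponentiated coefficient. The Python sketch below illustrates the conversion; the coefficient value -1.463 is an assumption picked so that exp(b) is approximately 0.23, and the function names are illustrative.

```python
import math


def factor_change(b: float, delta: float = 1.0) -> float:
    """Factor by which the odds change for a change of delta in x: exp(b * delta)."""
    return math.exp(b * delta)


def percent_change(b: float, delta: float = 1.0) -> float:
    """Percentage change in the odds for a change of delta in x."""
    return 100.0 * (math.exp(b * delta) - 1.0)


if __name__ == "__main__":
    b = -1.463  # illustrative coefficient (assumption), chosen so exp(b) is about 0.23
    print(f"factor change  = {factor_change(b):.2f}")
    print(f"percent change = {percent_change(b):.1f}%")
```

A factor below 1 corresponds to a negative percentage change, and a factor above 1 to a positive one.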
71- Residuals and Influence using predict
72- ????
- drop if index==752 | index==345 | index==142
- (menu: Data > Create or change data > Keep or drop observations)
73- Scalar Measures of Fit using fitstat
- Pseudo-R2
- Information measures can be used to compare models across different samples or to compare non-nested models
74- AIC: the smaller the AIC, the better the fit
- BIC: the more negative the BIC, the better the fit
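Both measures penalize the log likelihood for the number of parameters. The Python sketch below uses common textbook definitions, AIC = -2 lnL + 2k and BIC = -2 lnL + k ln(N). These are assumptions for illustration and may differ in scaling or reference point from the values that fitstat reports; the log likelihoods are hypothetical.

```python
import math


def aic(lnl: float, k: int) -> float:
    """Akaike information criterion: -2 * lnL + 2 * k (k = number of parameters)."""
    return -2.0 * lnl + 2.0 * k


def bic(lnl: float, k: int, n: int) -> float:
    """Bayesian information criterion: -2 * lnL + k * ln(n) (n = sample size)."""
    return -2.0 * lnl + k * math.log(n)


if __name__ == "__main__":
    # Hypothetical fits of two models to the same n = 753 cases (assumed values).
    for name, lnl, k in (("Model A", -452.6, 8), ("Model B", -461.7, 6)):
        print(f"{name}: AIC = {aic(lnl, k):.1f}  BIC = {bic(lnl, k, 753):.1f}")
```

Under either definition, the model with the smaller value is preferred.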
75- ??????????
- ????????????????????
- Stata data file: Wang_PR_BRM.dta