Categorical Data Analysis (T.Y. Wang) (Illinois State University) tywang@ilstu.edu - PowerPoint PPT Presentation


Transcript and Presenter's Notes

1
Categorical Data Analysis
T.Y. Wang (Illinois State University)
tywang@ilstu.edu
1
2
  • Linear regression models

2
3
  • ???????
  • ?????????,????????
  • ?
  • ?

3
4
  • Heteroscedasticity (vs. homoscedasticity)

4
5
E(x) = p    Var(x) = p(1-p)
0.1         0.09
0.2         0.16
0.3         0.21
0.4         0.24
0.5         0.25
0.6         0.24
0.7         0.21
0.8         0.16
0.9         0.09
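
The table above can be reproduced with a few lines of code. This is a minimal Python sketch (the slides themselves use Stata); the function name `bernoulli_variance` is made up for illustration. It shows that the variance of a 0/1 variable is largest at p = 0.5 and shrinks toward the extremes, which is why the error variance cannot be constant when the outcome is binary.

```python
# Variance of a Bernoulli (0/1) variable with mean p: Var(x) = p * (1 - p).
def bernoulli_variance(p):
    return p * (1 - p)

# Same grid of probabilities as in the table: 0.1, 0.2, ..., 0.9.
probabilities = [round(0.1 * k, 1) for k in range(1, 10)]

for p in probabilities:
    print(f"{p:.1f}  {bernoulli_variance(p):.2f}")
```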
5
6
  • When y is a binary variable, the error variance depends on the value of x

6
7
7
8
  • Functional form
  • Nonsensical predictions
  • prvalue, x(age=35 k5=4 wc=0 hc=0)

8
9
  • Example: Labor force participation of women (file
    name: binlfp2)

9
10
  • ?????????,??????????????,??????????

10
11
  • Models to be Considered
  • Binary: Logit and Probit
  • Ordinal: Ordered Logit and Ordered Probit
  • Nominal: Multinomial Logit
  • Count: Poisson and Negative Binomial

11
12
  • Maximum Likelihood Estimation
  • The ML estimate is the value that maximizes the
    likelihood of observing the sample data
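
The definition above can be illustrated with the simplest possible case. The following Python sketch uses a hypothetical sample of ten 0/1 observations (not data from the slides) and a coarse grid search, chosen only for transparency. It evaluates the log-likelihood at each candidate probability and picks the one that makes the observed sample most likely.

```python
import math

def bernoulli_log_likelihood(p, data):
    """Log-likelihood of 0/1 data when Pr(y = 1) = p."""
    return sum(math.log(p) if y == 1 else math.log(1 - p) for y in data)

# Hypothetical sample: 7 ones and 3 zeros.
sample = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]

# Candidate values 0.01, 0.02, ..., 0.99.
grid = [k / 100 for k in range(1, 100)]

# The ML estimate is the candidate with the largest log-likelihood.
p_hat = max(grid, key=lambda p: bernoulli_log_likelihood(p, sample))

print(p_hat)                      # grid-search ML estimate
print(sum(sample) / len(sample))  # sample proportion of ones
```

For this model the maximizer coincides with the sample proportion, so the two printed values agree.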

12
13
  • ????????????????(the method of least
    squares)???????
  • ??????????????????????????

14
  • ??????????????,????????????????????????,??????????
    ?(realization)?

15
  • Latent Variable Model

[Figure: latent variable plotted on an axis from -8 to 8 with a threshold at 0; observed y = 0 below the threshold and y = 1 above it]
15
16
  • Structural Model

16
17
  • ??????????????????????,???????????????????,???????
    ?????????????????????????

18
  • ??????????????,????????????

19
  • ?????????????????(numerical methods)??????????????
    ??

20
  • ??????????????(????? (start value)),????????????
    ?????,???????????
  • ?????????????,?????????(iteration)

21
  • ??????????????????????????,????????????(convergenc
    e),???????????
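
The terms flagged on the preceding slides (numerical methods, start value, iteration, convergence) can be made concrete with a small example. The sketch below uses Newton-Raphson, one common numerical method; the slides do not say which algorithm they have in mind, so that choice is an assumption. The model is a logit with only a constant term and the data are hypothetical.

```python
import math

def fit_intercept_only_logit(y, start=0.0, tol=1e-8, max_iter=25):
    """Newton-Raphson for a logit model that has only a constant term."""
    n = len(y)
    total = sum(y)
    beta = start                               # start value
    for iteration in range(1, max_iter + 1):   # each pass is one iteration
        p = 1 / (1 + math.exp(-beta))          # predicted probability
        gradient = total - n * p               # first derivative of ln L
        hessian = -n * p * (1 - p)             # second derivative of ln L
        step = gradient / hessian
        beta -= step                           # Newton-Raphson update
        if abs(step) < tol:                    # convergence: estimate stops changing
            return beta, iteration
    raise RuntimeError("did not converge")

sample = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]        # hypothetical 0/1 data
beta_hat, n_iter = fit_intercept_only_logit(sample)

print(beta_hat, n_iter)
print(math.log(0.7 / 0.3))                     # closed-form answer for comparison
```

In this special case the answer is known in closed form (the log odds of the sample proportion), which makes it easy to confirm that the iterations converge to the right value.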

22
22
23
Models for Binary Outcomes
24
  • Latent Variable Model

[Figure: latent variable plotted on an axis from -8 to 8 with a threshold at 0; observed y = 0 below the threshold and y = 1 above it]
24
25
  • Structural Model

25
26
26
27
  • Distribution of errors (ε)
  • Binary logit model if the errors follow a logistic
    distribution
  • Binary probit model if the errors follow a normal
    distribution
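
To see how the two error distributions compare, the sketch below evaluates the standard logistic CDF and the standard normal CDF at a few points. It is plain Python, not part of the original Stata material, and the function names are illustrative.

```python
import math

def logistic_cdf(z):
    """CDF of the standard logistic distribution (variance pi^2 / 3)."""
    return 1 / (1 + math.exp(-z))

def normal_cdf(z):
    """CDF of the standard normal distribution (variance 1)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

for z in (-2, -1, 0, 1, 2):
    print(z, round(logistic_cdf(z), 3), round(normal_cdf(z), 3))
```

Both curves are S-shaped and symmetric around 0.5. Because the assumed error variances differ, logit and probit coefficients are on different scales even when the predicted probabilities are very similar.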

27
28
  • Generalized Linear Model (GLM) components
  • Random component
  • Systematic component
  • Link function

28
29
  • Random component: the outcome y is binary and follows a
    Bernoulli distribution

29
30
  • ???? ????????
  • ???????

30
31
  • ????????????? ?????????????, ??????

31
32
?????(logit)??????????????
32
33
  • Odds
  • ????????????????????????
  • ??
  • ??

33
34
  • If the odds are 4, the probability of success is 0.8 and the probability of failure is 0.2: success is 4 times as likely as failure.
  • ?????,???

34
35
  • ????????
  • ??0????,???
  • ???????????,???????????????,????????,????

35
36
  • ????????
  • ??????1 ,???????,???????1???
  • ?????1?,?????????????????

36
37
p       0.01   0.2    0.5    0.7    0.9    0.99
1 - p   0.99   0.8    0.5    0.3    0.1    0.01
Odds    0.01   0.25   1.00   2.33   9.00   99.0
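
The odds row of the table follows directly from odds = p / (1 - p). A short Python sketch (the helper name `odds` is illustrative) recomputes it:

```python
def odds(p):
    """Odds of an event with probability p: p / (1 - p)."""
    return p / (1 - p)

for p in (0.01, 0.2, 0.5, 0.7, 0.9, 0.99):
    print(f"{p:<5} {1 - p:<5.2f} {odds(p):.2f}")
```

Probabilities are bounded between 0 and 1, but odds run from 0 upward without limit, with 1 marking the point where the event and its complement are equally likely.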
37
38
  • odds > 1: the event is more likely to occur than not
  • 0 < odds < 1: the event is less likely to occur than not
  • odds = 1: the event is equally likely to occur or not

38
39
  • ????????

39
40
  • Odds ratio: the ratio of the odds for two groups

40
41
[label lost]    [label lost]   [label lost]   Odds
[label lost]    600            5000           0.12
[label lost]    400            60000          0.0067
Odds ratio: 0.12 / 0.0067 ≈ 18
?????????,???????????????? ?????????,???????????????? ?????????,???????????????? ?????????,????????????????
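
The arithmetic behind the table can be checked as follows. The row and column labels did not survive in the transcript, so this Python sketch assumes only what the numbers themselves imply: each group's odds equal the first count divided by the second count (600 / 5000 = 0.12). The variable and function names are hypothetical.

```python
def odds_from_counts(first_count, second_count):
    """Odds as the ratio of the two counts in a table row (assumed layout)."""
    return first_count / second_count

odds_group_1 = odds_from_counts(600, 5000)    # first row of the table
odds_group_2 = odds_from_counts(400, 60000)   # second row of the table

# The odds ratio compares the odds of the two groups.
odds_ratio = odds_group_1 / odds_group_2

print(round(odds_group_1, 2), round(odds_group_2, 4), round(odds_ratio))
```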
42
  • Logit: the natural log of the odds
  • ??????????,????????0???
  • ????0.5??,????
  • ??????????????????????

42
43
p       0.1     0.3     0.5    0.7    0.9
1 - p   0.9     0.7     0.5    0.3    0.1
odds    0.11    0.43    1.00   2.33   9.00
logit   -2.20   -0.85   0.00   0.85   2.20
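
The logit row can be recomputed, and inverted, with two small functions. This is an illustrative Python sketch; `logit` and `inverse_logit` are names chosen here, not commands from the slides.

```python
import math

def logit(p):
    """Natural log of the odds."""
    return math.log(p / (1 - p))

def inverse_logit(z):
    """Map a logit back to a probability."""
    return 1 / (1 + math.exp(-z))

for p in (0.1, 0.3, 0.5, 0.7, 0.9):
    print(p, round(logit(p), 2), round(inverse_logit(logit(p)), 2))
```

The logit is 0 at p = 0.5 and symmetric around that point: the logits of p and 1 - p have the same magnitude and opposite signs.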
43
44
  • The model is linear in the logit

44
45
  • The same change in x has a constant effect on the logit
  • The effect on the probability is not constant (nonlinear in
    probability)

45
46
  • ?????????(exponent)

46
47
  • The change in probability depends on the starting level of xi
    and the values of the other variables
  • A nonlinear probability model
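
A small numerical illustration may help here. The Python sketch below uses hypothetical coefficients (an intercept of -4.0 and a slope of 0.1, not estimates from the slides). It applies the same 10-unit increase in x at three different starting levels: the change in the logit is identical each time, while the change in the probability is not.

```python
import math

def inverse_logit(z):
    return 1 / (1 + math.exp(-z))

# Hypothetical coefficients, chosen only for illustration.
intercept, slope = -4.0, 0.1

def predicted_probability(x):
    return inverse_logit(intercept + slope * x)

changes = []
for start in (10, 40, 70):
    change_in_logit = slope * 10   # identical at every starting level
    change_in_prob = predicted_probability(start + 10) - predicted_probability(start)
    changes.append(change_in_prob)
    print(start, change_in_logit, round(change_in_prob, 4))
```

The change in probability is largest where the starting probability is near 0.5 and smallest near 0 or 1.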

47
48
  • ?????????????????????,?????
  • ????? ?
  • ????,????????,???????(identity)??????????????,????
    ??

49
  • ??????,?
  • ???????????
  • ????????????,???0?1????????????????????????

50
  • ????????????
  • Bernoulli distribution
  • ?????
  • Binary logit model
  • Binary probit model

51
  • Example: Labor force participation of women (file
    name: binlfp2)
  • Differences and similarities between logit and probit

51
52
  • Hypothesis Testing
  • ????,???,p ????
  • test: testing the effect of a single coefficient (e.g., test k5)
    or the equality of multiple coefficients (e.g., test
    hc=wc)
  • lrtest: comparing competing (nested) models
    using the LR test

52
53
  • The LR test assesses a hypothesis by comparing
    the log likelihood of the full, unconstrained model (MU) with that of a
    restricted, constrained model (MC). H0: the imposed constraints
    are true. If the constraints significantly
    reduce the log likelihood, H0 is rejected and
    the full model is preferred
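
The comparison described above is usually carried out with the statistic 2 × (ln L of the full model − ln L of the restricted model), referred to a chi-square distribution whose degrees of freedom equal the number of constraints. The Python sketch below is illustrative only: the log-likelihood values are hypothetical, and the helper `chi2_sf` is a hand-written upper-tail function for integer degrees of freedom rather than a call to any statistics library.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable (integer df >= 1)."""
    if x <= 0:
        return 1.0
    half = x / 2
    if df % 2 == 0:
        q, nu = math.exp(-half), 2                 # closed form for df = 2
    else:
        q, nu = math.erfc(math.sqrt(half)), 1      # closed form for df = 1
    while nu < df:                                 # recurrence: df -> df + 2
        q += half ** (nu / 2) * math.exp(-half) / math.gamma(nu / 2 + 1)
        nu += 2
    return q

def lr_test(ll_full, ll_restricted, n_constraints):
    """LR statistic and p-value for H0: the constraints are true."""
    statistic = 2 * (ll_full - ll_restricted)
    return statistic, chi2_sf(statistic, n_constraints)

# Hypothetical log likelihoods; the restricted model drops two coefficients.
statistic, p_value = lr_test(ll_full=-452.6, ll_restricted=-460.1, n_constraints=2)
print(round(statistic, 2), p_value)
```

A small p-value means the constraints reduce the log likelihood by more than chance alone would explain, so H0 is rejected in favor of the full model.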

53
54
  • Consider the following models
  • M1
  • M2
  • M3

54
55
  • When conducting an LR test:
  • the two models must be nested (i.e., the nested
    model is created by imposing constraints on the
    coefficients of the other model),
  • the two models must be fitted on exactly the
    same sample

55
56
  • mark and markout to exclude cases with missing
    data
  • mark nomiss
  • markout nomiss lfp k5 k618 age wc hc lwg inc
  • logit lfp k5 inc if nomiss==1

56
57
  • ??????
  • predict: predicted probabilities for each observation
  • prvalue and prtab: predicted probabilities for a given profile
  • prchange: changes in predicted probabilities
  • listcoef: odds ratios

57
58
  • predict: predicted probabilities for each observation
  • predict prlogit

58
59
  • prvalue and prtab: predicted probabilities for a given profile
  • prvalue, x(age=50 k5=0 k618=0 wc=1 hc=1)
    rest(mean)
  • prtab k5 wc, rest(mean)

59
60
  • prchange: changes in predicted probabilities
  • marginal change

60
61
  • ?????????????, ???????????????? -0.0084
  • ?????????????????,????????????????????,?????????,?
    ??
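
For a logit model, the marginal change in the predicted probability with respect to a variable equals that variable's coefficient times p(1 - p), evaluated at the chosen values of the regressors. The Python sketch below uses hypothetical coefficients and a hypothetical evaluation point (it does not reproduce the -0.0084 reported on the slide) and checks the formula against a numerical derivative.

```python
import math

def inverse_logit(z):
    return 1 / (1 + math.exp(-z))

# Hypothetical one-variable logit model.
intercept, slope = 1.0, -0.05

def predicted_probability(x):
    return inverse_logit(intercept + slope * x)

def marginal_change(x):
    """Analytic marginal change: slope * p * (1 - p)."""
    p = predicted_probability(x)
    return slope * p * (1 - p)

x0 = 40.0   # hypothetical evaluation point
h = 1e-5
numeric = (predicted_probability(x0 + h) - predicted_probability(x0 - h)) / (2 * h)

print(marginal_change(x0), numeric)
```

Because p(1 - p) varies with the evaluation point, the marginal change is not a single number for the model; it has to be reported at stated values of the regressors, such as their means.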

61
62
  • discrete change
  • ??,

62
63
  • discrete change
  • ?????????????, ??????????,?????????? 0.34
  • ???????????, ?????????????, ??????? 0.19

63
64
  • min->max
  • ??????????????,????????????????,??????? 0.4372

64
65
  • listcoef: odds ratios

65
66
  • ?????????,????????????????????

66
67
  • ? ????????????,??????

68
  • ??
  • ???????????,??? ???????,??????????? ?

68
69
  • If exp(β) > 1, the odds are exp(β) times larger for a unit increase in x
  • If exp(β) < 1, the odds are exp(β) times smaller for a unit increase in x

69
70
  • ?????????????,???????,?????????(???)????0.23?
  • ?????????????,???????,?????????(???)????76.8
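
The two figures quoted above (a factor of 0.23 and 76.8) are related by simple arithmetic: the factor change in the odds is exp(β), and the percent change is 100 × (exp(β) − 1). The transcript does not state the coefficient itself, so the Python sketch below assumes a value of about -1.463, chosen only because it reproduces both figures.

```python
import math

def factor_change(beta):
    """Multiplicative change in the odds for a unit increase in x."""
    return math.exp(beta)

def percent_change(beta):
    """Percent change in the odds for a unit increase in x."""
    return 100 * (math.exp(beta) - 1)

beta = -1.463   # assumed coefficient, not stated in the transcript

print(round(factor_change(beta), 2))
print(round(percent_change(beta), 1))
```

A factor below 1 and a negative percent change describe the same effect: a decrease in the odds, holding the other variables constant.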

70
71
  • Residuals and influence using
    predict

71
72
  • ????
  • drop if index==752
  • index==345 | index==142
  • (Data > Create or change data >
    Keep or drop observations)

72
73
  • Scalar measures of fit using fitstat
  • Pseudo-R2
  • Information measures can be used to compare
    models across different samples or to compare
    non-nested models

73
74
  • AIC: the smaller the AIC, the better the fit
  • BIC: the more negative the BIC, the better the
    fit
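
As a rough illustration of how such measures are computed, the Python sketch below uses the common textbook definitions AIC = -2 ln L + 2k and BIC = -2 ln L + k ln N, where k is the number of parameters and N the sample size. This is an assumption: fitstat reports several variants (including a deviance-based BIC that can be negative), so its output may differ from these values by scaling or by a constant. The log likelihoods and parameter counts are hypothetical.

```python
import math

def aic(log_likelihood, n_params):
    """Akaike information criterion (textbook form)."""
    return -2 * log_likelihood + 2 * n_params

def bic(log_likelihood, n_params, n_obs):
    """Bayesian information criterion (textbook form)."""
    return -2 * log_likelihood + n_params * math.log(n_obs)

# Two hypothetical models fitted to the same 100 observations.
model_a = {"log_likelihood": -100.0, "n_params": 3}
model_b = {"log_likelihood": -98.5, "n_params": 6}

for name, m in (("A", model_a), ("B", model_b)):
    print(name,
          round(aic(m["log_likelihood"], m["n_params"]), 2),
          round(bic(m["log_likelihood"], m["n_params"], 100), 2))
```

Under both definitions the model with the smaller value is preferred. Here the extra three parameters of model B do not improve the log likelihood enough to offset the penalty.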

74
75
  • ??????????
  • ????????????????????
  • Stata data file: Wang_PR_BRM.dta