Title: Linear Relationship Between Regression Coefficients under Different Links for Binary Data
1 Linear Relationship Between Regression Coefficients under Different Links for Binary Data
Qiong Feng, March 28, 2002
2 Outline
- Background
- Approaches to develop the relationship among the links
- Comparison of the two approaches
- Examples
- Remarks
3 Background
- Binary response data are data in which the response variable has only two possible qualitative outcomes
- Examples
  - Complete pain relief at 2 hours (Yes/No)
    - Is there any strong association between pain relief and dosage, patient age, patient gender, ...?
  - Passing or not passing a certain exam
    - Can you predict the outcome (success or failure) given explanatory variables such as campus, sex, age, education level of mother and/or father, location, province, secondary school type, profession of mother or father, ...?
4-6 Background
- Special problems for binary response data
  - Non-normal error terms
    - For a binary 0/1 response, each error term can only take two values
  - Non-constant error variance
    - The error variances differ at different levels of X, so ordinary least squares is no longer optimal
  - Constraint on the response function
    - The mean response is a probability, so it must lie between 0 and 1
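The three problems can be written out for a simple linear probability model. This is a sketch in the usual textbook form; the symbols $\beta_0$, $\beta_1$, $p_i$, and $\varepsilon_i$ are assumed here and do not come from the slides.

```latex
\begin{align*}
Y_i &= \beta_0 + \beta_1 X_i + \varepsilon_i, \qquad Y_i \in \{0, 1\}, \qquad
p_i = E\{Y_i\} = \beta_0 + \beta_1 X_i \\
\varepsilon_i &=
  \begin{cases}
    1 - p_i & \text{if } Y_i = 1 \\
    -p_i    & \text{if } Y_i = 0
  \end{cases}
  && \text{(only two possible values, so not normal)} \\
\operatorname{Var}(\varepsilon_i) &= p_i (1 - p_i)
  && \text{(depends on } X_i \text{ through } p_i\text{)} \\
0 &\le p_i \le 1
  && \text{(constraint on the response function)}
\end{align*}
```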
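7 Background
- Generalized Linear Model
  - Random component
    - Since the response variable is binary ($Y_i = 1$ or $0$), $Y_i \sim \text{Bernoulli}(p_i)$ with $p_i = P(Y_i = 1)$
  - Systematic component: a linear predictor such that $\eta_i = x_i'\beta$
    - The predictor variables may be continuous, discrete, or both
  - Link function: connects $p_i$ to the linear predictor, $F^{-1}(p_i) = \eta_i$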
8 Background
- Link Functions
  - Logistic (logit), probit, and $t_\nu$ links (symmetric links)
  - Complementary log-log link (asymmetric link)
- Summary of these four links
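As a rough companion to the list of links, the sketch below writes three of them as quantile functions using only the Python standard library. The function names are hypothetical, and the $t_\nu$ link is left out because the standard library has no Student-t quantile function.

```python
import math
from statistics import NormalDist


def logit_link(p):
    """Logit link: quantile function of the standard logistic distribution."""
    return math.log(p / (1.0 - p))


def probit_link(p):
    """Probit link: quantile function of the standard normal distribution."""
    return NormalDist().inv_cdf(p)


def cloglog_link(p):
    """Complementary log-log link: inverse of F(x) = 1 - exp(-exp(x))."""
    return math.log(-math.log(1.0 - p))


# Compare the links at a few probabilities (illustrative values only).
for p in (0.1, 0.5, 0.9):
    print(p, logit_link(p), probit_link(p), cloglog_link(p))
```

The symmetric links satisfy link(p) = -link(1 - p), so they return 0 at p = 0.5; the complementary log-log link does not.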
9 How to?
- Common thought: assume a linear relationship between the regression coefficients under two different links
- Two ways to develop the relationship between different links
  - Taylor Expansion Method
  - Quantile Matching Method
10-15 Taylor's Expansion Method
- Suppose we have two links $F^{-1}$ and $G^{-1}$
- Representation (Albert and Chib 1993)
  - $Y_j = 1$ if $w_j > 0$ and $Y_j = 0$ if $w_j \le 0$, where $w_j$ is a latent random variable
- Hence we can derive a linear relationship between the regression coefficients under the two links (by Taylor's expansion)
16 Taylor's Expansion Method
- Table of the two parameters of the linear relationship, based on Taylor expansion
Example
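One way to picture the Taylor expansion step is a first-order expansion of the map $F^{-1}(G(x))$ around $x = 0$, which gives an intercept and a slope linking the two scales. This is a minimal sketch under that assumption. The slides' own derivation is not recoverable from the transcript, and the helper names below are hypothetical.

```python
import math
from statistics import NormalDist


def taylor_coefficients(F_inv, f, G, g, x0=0.0):
    """First-order Taylor expansion of F_inv(G(x)) around x0.

    Returns (intercept, slope) with F_inv(G(x)) ~ intercept + slope * x.
    Assumption: expanding at x0 = 0 and truncating after the linear term
    are illustrative choices, not taken from the slides.
    """
    q0 = F_inv(G(x0))          # value of the composed map at x0
    slope = g(x0) / f(q0)      # chain rule: g(x) / f(F_inv(G(x)))
    intercept = q0 - slope * x0
    return intercept, slope


def logistic_inv(p):
    """Quantile function of the standard logistic distribution."""
    return math.log(p / (1.0 - p))


def logistic_pdf(x):
    """Density of the standard logistic distribution."""
    e = math.exp(-x)
    return e / (1.0 + e) ** 2


std_normal = NormalDist()

# Map probit-scale values to the logit scale.
intercept, slope = taylor_coefficients(
    logistic_inv, logistic_pdf, std_normal.cdf, std_normal.pdf
)
print(intercept, slope)
```

For the logit and probit pair, the intercept is 0 and the slope is the ratio of the two densities at 0, $4/\sqrt{2\pi} \approx 1.596$.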
17-19 Quantile Matching Method
- Suppose we have two links $F^{-1}$ and $G^{-1}$
- Assume there is a linear relationship between the quantiles under the two links
- Hence we can derive the relationship between the regression coefficients under the two links
20 Quantile Matching Method
- To satisfy the assumed linear relationship, we can calculate 1000 quantiles with probabilities evenly spaced from 0.01 to 0.99 under the two links, then fit a simple linear regression to estimate the two parameters of that relationship
- Table of the two parameters based on Quantile Matching
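The quantile matching recipe can be sketched in a few lines: compute 1000 quantiles under each link on an even probability grid from 0.01 to 0.99, then regress one set on the other by ordinary least squares. The function and variable names are hypothetical, and the logit and probit pair is chosen only as an illustration.

```python
import math
from statistics import NormalDist


def quantile_matching(F_inv, G_inv, n=1000, lo=0.01, hi=0.99):
    """Regress F-quantiles on G-quantiles by ordinary least squares.

    Returns (intercept, slope) with F_inv(p) ~ intercept + slope * G_inv(p).
    """
    probs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    x = [G_inv(p) for p in probs]   # quantiles under the link G^{-1}
    y = [F_inv(p) for p in probs]   # quantiles under the link F^{-1}
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def logit_link(p):
    return math.log(p / (1.0 - p))


def probit_link(p):
    return NormalDist().inv_cdf(p)


# Approximate logit quantiles as a linear function of probit quantiles.
intercept, slope = quantile_matching(logit_link, probit_link)
print(intercept, slope)
```

Because both links are symmetric, the fitted intercept is essentially zero and only the slope carries information for this pair.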
21 Comparison of two methods
- To compare the performance of these two methods, two summary measures, $D_q$ and $D_p$, are computed
- Here $F$ denotes the cdf that we want to approximate. The measures use the quantiles from $F$ corresponding to probabilities evenly spaced from 0.01 to 0.99, together with the fitted quantiles for $F$ obtained from the quantiles under another link $G^{-1}$
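The exact formulas for the two measures did not survive in the transcript. The sketch below assumes one plausible form: the quantile-scale measure is the mean absolute difference between true and fitted quantiles, and the probability-scale measure is the mean absolute difference between $F$ evaluated at each. Both the aggregate (mean absolute difference) and all names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def summary_measures(F, F_inv, G_inv, intercept, slope, n=1000, lo=0.01, hi=0.99):
    """Return (d_q, d_p) for approximating F-quantiles from G-quantiles.

    Assumption: mean absolute difference is used as the aggregate; the
    slides' own definitions of the two measures are not shown.
    """
    probs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    q = [F_inv(p) for p in probs]                          # true quantiles of F
    q_hat = [intercept + slope * G_inv(p) for p in probs]  # fitted quantiles
    d_q = sum(abs(a - b) for a, b in zip(q, q_hat)) / n
    d_p = sum(abs(F(a) - F(b)) for a, b in zip(q, q_hat)) / n
    return d_q, d_p


def logistic_cdf(x):
    return 1.0 / (1.0 + math.exp(-x))


def logit_link(p):
    return math.log(p / (1.0 - p))


def probit_link(p):
    return NormalDist().inv_cdf(p)


# Illustrative coefficients: first-order Taylor slope at 0 for the logistic
# cdf against the normal cdf (ratio of the two densities at 0), intercept 0.
taylor_slope = 4.0 / math.sqrt(2.0 * math.pi)
d_q, d_p = summary_measures(logistic_cdf, logit_link, probit_link, 0.0, taylor_slope)
print(d_q, d_p)
```

Smaller values mean a closer approximation on the corresponding scale; a perfect fit gives zero for both.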
22 Comparison of two methods
- Table of summary measures ($D_q$, $D_p$)
Note: TE denotes Taylor expansion and QM denotes Quantile Matching
23 Comparison of two methods
- Generally, quantile matching performs better under the $D_q$ measure and Taylor's expansion does better under $D_p$
- A symmetric link approximates another symmetric link better than an asymmetric link
- A light-tailed link performs better in approximating another light-tailed link
24 Example 1: Fertility Data
- A study of rats' fertility after the administration of doses (in mg) of vitamin E
- Fertility Data
25 Example 1: Fertility Data
- Results (TE and QM coefficients are obtained from the probit link)
Note: TE denotes Taylor expansion and QM denotes Quantile Matching
26 Example 2: Programming Task Data
- Study the effect of computer programming experience on the ability to complete a complex programming task within a limited time
- Programming Task Data
  - 25 persons were selected
  - Programming experience was measured in months. $Y = 1$ if the task was completed successfully, otherwise $Y = 0$
27 Example 2: Programming Task Data
- Results (TE and QM coefficients are obtained from the probit link)
Note: TE denotes Taylor expansion and QM denotes Quantile Matching
29 References
- [1] Wu, Y., Chen, M.-H., and Dey, D. K. (2002). On the relationship between links for binary response data. Journal of Statistical Studies, Special Issue.
- [2] Neter, J., Kutner, M. H., et al. Applied Linear Statistical Models, fourth edition.
30 Thank You!