Modeling with Observational Data

- Michael Babyak, PhD

What is a model?

$Y = f(x_1, x_2, x_3, \dots, x_n)$

$Y = a + b_1x_1 + b_2x_2 + \dots + b_nx_n$

$Y = e^{a + b_1x_1 + b_2x_2 + \dots + b_nx_n}$

All models are wrong, some are useful --

George Box

- A useful model is
- Not very biased
- Interpretable
- Replicable (predicts in a new sample)

Some Premises

- Statistics is a cumulative, evolving field
- Newer is not necessarily better, but should be entertained in the context of the scientific question at hand
- Data analytic practice resides along a continuum, from exploratory to confirmatory. Both are important, but the difference has to be recognized.
- There's no substitute for thinking about the problem

Observational Studies

- Undeserved reputation
- Especially if conducted and analyzed wisely
- Biggest threats
- Third Variable
- Selection Bias (see above)
- Poor Planning

Correlation between results of randomized trials and observational studies

http://www.epidemiologic.org/2006/11/agreement-of-observational-and.html

Mean of Estimates

Head-to-head comparisons

Statistics is a cumulative, evolving field: How do we know this stuff?

- Theory
- Simulation

Concept of Simulation

$Y = bX + \text{error}$

[Diagram: samples 1, 2, 3, 4, ..., k-1, k are generated from this model, b is estimated in each sample (bs1, bs2, ..., bsk), and the set of estimates is then evaluated.]

Simulation Example

$Y = .4X + \text{error}$

[Diagram: the same scheme with a known slope of .4; the estimates bs1, bs2, ..., bsk from k generated samples are evaluated against the true value.]

True Model: $Y = .4x_1 + e$
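The simulation idea above can be sketched in a few lines. This is a minimal illustration, not the code behind the slides: the sample size, the number of samples, the standard-normal predictor and error, and the function names are all assumptions chosen for the example.

```python
import random
import statistics

def ols_slope(x, y):
    """Least-squares slope of y on x (model includes an intercept)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx

def simulate_slopes(true_b=0.4, n=100, k=1000, seed=1):
    """Generate k samples of size n from y = true_b * x + error
    and return the k estimated slopes (the bs1 ... bsk of the diagram)."""
    rng = random.Random(seed)
    slopes = []
    for _ in range(k):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [true_b * xi + rng.gauss(0, 1) for xi in x]
        slopes.append(ols_slope(x, y))
    return slopes

# "Evaluate": compare the distribution of estimates with the known truth.
slopes = simulate_slopes()
mean_b = statistics.fmean(slopes)
sd_b = statistics.stdev(slopes)
print(f"mean estimate = {mean_b:.3f}, SD across samples = {sd_b:.3f}")
```

Because the true slope is known, the average of the estimates shows bias and their spread shows stability, which is the logic used by the simulation studies cited later in the deck.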

Ingredients of a Useful Model

Correct probability model

Based on theory

Good measures/no loss of information

Useful Model

Comprehensive

Parsimonious

Tested fairly

Flexible

Correct Model

- Gaussian General Linear Model
- Multiple linear regression
- Binary (or ordinal) Generalized Linear Model
- Logistic Regression
- Proportional Odds/Ordinal Logistic
- Time to event
- Cox Regression or parametric survival models

Generalized Linear Model

Normal

Binary/Binomial

Count, heavy skew, Lots of zeros

Poisson, ZIP, negbin, gamma

General Linear Model/ Linear Regression

Logistic Regression

ANOVA/t-test ANCOVA

Chi-square

Regression w/ Transformed DV

Can be applied to clustered data (e.g., repeated measures)

Factor Analytic Family

Structural Equation Models

Partial Least Squares

Latent Variable Models (Confirmatory Factor Analysis)

Multiple regression

Common Factor Analysis

Principal Components

Use Theory

- Theory and expert information are critical in helping sift out artifact
- Numbers can look very systematic when they are in fact random
- http://www.tufts.edu/~gdallal/multtest.htm

Measure well

- Adequate range
- Representative values
- Watch for ceiling/floor effects

Using all the information

- Preserving cases in data sets with missing data
- Conventional approaches
- Use only complete case
- Fill in with mean or median
- Use a missing data indicator in the model

Missing Data

- Imputation or related approaches are almost ALWAYS better than deleting incomplete cases
- Multiple Imputation
- Full Information Maximum Likelihood

Multiple Imputation

http://www.lshtm.ac.uk/msu/missingdata/mi_web/node5.html

Modern Missing Data Techniques

- Preserve more information from original sample
- Incorporate uncertainty about missingness into final estimates
- Produce better estimates of population (true) values

Don't waste information from variables

- Use all the information about the variables of interest
- Don't create clinical cutpoints before modeling
- Model with ALL the data first, then use prediction to make decisions about cutpoints

Dichotomizing for Convenience = Dubious Practice (C.R.A.P.)

- Convoluted Reasoning and Anti-intellectual Pomposity
- Streiner & Norman, Biostatistics: The Bare Essentials

Implausible measurement assumption

[Figure: a depression score axis cut into "not depressed" and "depressed", with three individuals A, B, and C marked along it]

Loss of power

http://psych.colorado.edu/~mcclella/MedianSplit/

Sometimes, through sampling error, you can get a "lucky cut."

http://www.bolderstats.com/jmsl/doc/medianSplit.html

Dichotomization, by definition, reduces the magnitude of the estimate by a minimum of about 30%

Dear Project Officer,

In order to facilitate analysis and interpretation, we have decided to throw away about 30% of our data. Even though this will waste about 3 or 4 hundred thousand dollars' worth of subject recruitment and testing money, we are confident that you will understand.

Sincerely,

Dick O. Tomi, PhD
Prof. Richard Obediah Tomi, PhD

Power to detect non-zero b-weight when x is continuous versus dichotomized

True model: $y = .4x + e$
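A rough sketch of how such a power comparison could be simulated follows. The sample size of 50, the 2,000 replications, the median split, and the critical value of about 2.011 (two-sided .05 with 48 df) are assumptions for illustration; the slide does not state the settings behind its figure.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def power(dichotomize, b=0.4, n=50, reps=2000, seed=2, t_crit=2.011):
    """Share of simulated samples in which the slope test rejects b = 0.
    t_crit is an approximate two-sided .05 critical value for n - 2 = 48 df."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x = [rng.gauss(0, 1) for _ in range(n)]
        y = [b * xi + rng.gauss(0, 1) for xi in x]   # y always comes from continuous x
        if dichotomize:
            cut = statistics.median(x)
            x = [1.0 if xi > cut else 0.0 for xi in x]  # median split
        r = pearson_r(x, y)
        t = r * math.sqrt((n - 2) / (1 - r * r))     # same test as the slope t-test
        hits += abs(t) > t_crit
    return hits / reps

power_continuous = power(dichotomize=False)
power_split = power(dichotomize=True)
print(f"continuous x: {power_continuous:.2f}, median-split x: {power_split:.2f}")
```

Both runs use the same seed, so they analyze identical data sets and differ only in whether x was cut before testing.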

Dichotomizing will obscure non-linearity

[Figure: x-axis CESD Score, split into Low and High]

Dichotomizing will obscure non-linearity: same data as previous slide, modeled continuously

Type I error rates for the relation between x2 and y after dichotomizing two continuous predictors. Maxwell and Delaney calculated the effect of dichotomizing two continuous predictors as a function of the correlation between them. The true model is $y = .5x_1 + 0x_2$, where all variables are continuous. If x1 and x2 are dichotomized, the error rate for the relation between x2 and y increases as the correlation between x1 and x2 increases.
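The effect described above can be reproduced approximately by simulation rather than by Maxwell and Delaney's analytic calculation. In this sketch the sample size (200), the correlation (.7), the replication count, and the critical value of about 1.97 are assumptions picked for illustration.

```python
import math
import random
import statistics

def t_for_x2(x1, x2, y):
    """t statistic for the x2 coefficient in the OLS fit y ~ x1 + x2."""
    n = len(y)
    m1, m2, my = statistics.fmean(x1), statistics.fmean(x2), statistics.fmean(y)
    a = [v - m1 for v in x1]
    b = [v - m2 for v in x2]
    c = [v - my for v in y]
    s11 = sum(u * u for u in a)
    s22 = sum(u * u for u in b)
    s12 = sum(u * v for u, v in zip(a, b))
    s1y = sum(u * v for u, v in zip(a, c))
    s2y = sum(u * v for u, v in zip(b, c))
    det = s11 * s22 - s12 * s12
    b1 = (s22 * s1y - s12 * s2y) / det
    b2 = (s11 * s2y - s12 * s1y) / det
    sse = sum((ci - b1 * ai - b2 * bi) ** 2 for ai, bi, ci in zip(a, b, c))
    se_b2 = math.sqrt(sse / (n - 3) * s11 / det)
    return b2 / se_b2

def type1_rate(rho, dichotomize, n=200, reps=400, seed=3, t_crit=1.97):
    """Rejection rate for x2 when the true model is y = .5*x1 + 0*x2 + e."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(reps):
        x1 = [rng.gauss(0, 1) for _ in range(n)]
        x2 = [rho * u + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1) for u in x1]
        y = [0.5 * u + rng.gauss(0, 1) for u in x1]   # x2 has no effect on y
        if dichotomize:
            c1, c2 = statistics.median(x1), statistics.median(x2)
            x1 = [1.0 if u > c1 else 0.0 for u in x1]
            x2 = [1.0 if u > c2 else 0.0 for u in x2]
        hits += abs(t_for_x2(x1, x2, y)) > t_crit
    return hits / reps

rate_continuous = type1_rate(rho=0.7, dichotomize=False)
rate_split = type1_rate(rho=0.7, dichotomize=True)
print(f"continuous: {rate_continuous:.3f}, both dichotomized: {rate_split:.3f}")
```

The intuition is that once x1 is cut, the split version no longer carries all of its information, and the correlated x2 picks up part of what was lost, so a truly null predictor looks related to y.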

Is it ever a good idea to categorize quantitatively measured variables?

- Yes
- when the variable is truly categorical
- for descriptive/presentational purposes
- for hypothesis testing, if enough categories are made.
- However, using many categories can lead to problems of multiple significance tests and still run the risk of misclassification

CONCLUSIONS

- Cutting
- Doesn't always make measurement sense
- Almost always reduces power
- Can fool you with too much power in some instances
- Can completely miss important features of the underlying function
- Modern computing/statistical packages can handle continuous variables
- Want to make good clinical cutpoints? Model first, decide on cuts afterward.

Statistical Adjustment/Control

- What does it mean to adjust or control for another variable?

Difficulties

- What if lines aren't parallel?
- What if there is poor overlap between groups?

A Note on Mediation vs Confounding

- Mathematically identical: no test can tell you which is which
- Depends on YOUR causal hypothesis
- Criteria for either
- All three variables (predictor, confounder/mediator, outcome) must be related

Possible Models

[Series of path diagrams among variables A, B, and C; the arrows are not recoverable from the transcript. Panel titles, in order:]

- Initial condition: all related
- Typical regression result
- Mediational relation between A and C
- Spurious relation between A and C
- Or worse (a fourth variable, U, is added to the diagram)

- With cross-sectional design, best you can do is say that observed relations are consistent/not consistent with hypothesized relation
- Prospective better but still vulnerable to outside variables
- Interpretation of mediator/confounding distinction is entirely substantive

Not always clear difference between mediator and confounder

- Beware that adjustment for confounder might actually be modeling an explanatory mechanism
- E.g., relation between depression and mortality
- Often adjust for medical comorbidity
- Comorbidity, however, might be a proxy for poor self-care, which in turn is linked to depression

Sample size and the problem of underfitting vs overfitting

- Model assumption is that ALL relevant variables be included: the "antiparsimony principle," or "As big as a house."
- Tempered by fact that estimating too many unknowns with too little data will yield junk.
- In other words, can't build a mansion with a shanty's worth of wood.

Sample Size Requirements

- Linear regression
- Minimum of N = 50 + 8/predictor (Green, 1991), or maybe more? (Kelley & Maxwell, 2003)
- Logistic Regression
- Minimum of N = 10-15/predictor among smallest group (Peduzzi et al., 1996)
- Survival Analysis
- Minimum of N = 10-15/predictor (Peduzzi et al., 1995)
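These rules of thumb are simple arithmetic, sketched below. The function names are made up for this example, and the rules themselves are rough minimums rather than guarantees of adequate power or stability.

```python
def green_min_n(n_predictors):
    """Green's rule of thumb for linear regression: N of at least 50 + 8 per predictor."""
    return 50 + 8 * n_predictors

def events_needed(n_predictors, per_predictor=(10, 15)):
    """Range of events required under the 10-15 events-per-predictor guideline
    (events = cases in the smaller outcome group for logistic models,
    observed events for survival models)."""
    low, high = per_predictor
    return low * n_predictors, high * n_predictors

def max_predictors(n_events, per_predictor=10):
    """Largest number of predictors the available events can support."""
    return n_events // per_predictor

print(green_min_n(6))       # linear regression with 6 predictors
print(events_needed(6))     # logistic or Cox model with 6 predictors
print(max_predictors(45))   # a study with only 45 events
```

Note that for logistic and survival models the limiting quantity is the number of events, not the total sample size: a cohort of 1,000 patients with 45 deaths supports only about four predictors under the 10-per-predictor version of the guideline.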

Consequences of inadequate sample size

- Lack of power for individual tests
- Unstable estimates
- Spurious good fit: lots of unstable estimates will produce spurious good-looking (big) regression coefficients

All-noise, but good fit

R-squares from multivariable models where the population is completely random numbers
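A small simulation makes the point concrete. This is a sketch under assumed settings (10 noise predictors, sample sizes of 20, 50, and 200, 200 replications each); it is not the simulation that produced the slide's figure, and the hand-rolled least-squares solver is there only to keep the example free of third-party packages.

```python
import random
import statistics

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def r_squared(xs, y):
    """R-square from an OLS fit of y on the predictor columns in xs (intercept included)."""
    n = len(y)
    cols = [[1.0] * n] + xs
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    fitted = [sum(bj * cj[i] for bj, cj in zip(beta, cols)) for i in range(n)]
    my = statistics.fmean(y)
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - my) ** 2 for yi in y)
    return 1 - sse / sst

def mean_noise_r2(n, p, reps=200, seed=4):
    """Average R-square when y and all p predictors are independent random numbers."""
    rng = random.Random(seed)
    vals = []
    for _ in range(reps):
        xs = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(p)]
        y = [rng.gauss(0, 1) for _ in range(n)]
        vals.append(r_squared(xs, y))
    return statistics.fmean(vals)

for n in (20, 50, 200):
    print(n, round(mean_noise_r2(n, p=10), 2))
```

With pure noise, the expected R-square is p/(n - 1), so ten useless predictors in a sample of 20 "explain" roughly half the variance, while the same ten in a sample of 200 explain about 5%.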

Events per predictor ratio

Simulation: number of events/predictor ratio

$Y = .5x_1 + 0x_2 + .2x_3 + 0x_4$, where $r_{x_1 x_4} = .4$

N/p = 3, 5, 10, 20, 50

Parameter stability and n/p ratio

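Peduzzi's Simulation: number of events/predictor ratio

$P(\text{survival}) = a + b_1\,\text{NYHA} + b_2\,\text{CHF} + b_3\,\text{VES} + b_4\,\text{DM} + b_5\,\text{STD} + b_6\,\text{HTN} + b_7\,\text{LVC}$

Events/p = 2, 5, 10, 15, 20, 25

% relative bias = $(\text{estimated } b - \text{true } b)/\text{true } b \times 100$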
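The relative bias measure used in that simulation is easy to compute once the true coefficient is known. The helper below is a hypothetical illustration; the numbers passed to it are invented and are not Peduzzi's results.

```python
def relative_bias(estimated_b, true_b):
    """Percent relative bias: (estimated b - true b) / true b * 100."""
    return (estimated_b - true_b) / true_b * 100

# Invented example: a true coefficient of 0.50 estimated on average as 0.45
# is biased toward zero by 10%.
bias = relative_bias(0.45, 0.50)
print(f"{bias:.1f}%")
```

Expressing bias relative to the true value lets coefficients of different sizes be compared on a common scale across the events-per-predictor conditions.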

Simulation results: number of events/predictor ratio

Approaches to variable selection

- Stepwise automated selection
- Pre-screening using univariate tests
- Combining or eliminating redundant predictors
- Fixing some coefficients
- Theory, expert opinion and experience
- Penalization/Random effects
- Propensity Scoring
- Matches individuals on multiple dimensions to improve baseline balance
- Tibshirani's Lasso

Any variable selection technique based on looking at the data first will likely be biased

- "I now wish I had never written the stepwise selection code for SAS."
- --Frank Harrell, author of forward and backwards selection algorithm for SAS PROC REG

Automated Selection: Derksen and Keselman (1992) Simulation Study

- Studied backward and forward selection
- Some authentic variables and some noise variables among candidate variables
- Manipulated correlation among candidate predictors
- Manipulated sample size

Automated Selection: Derksen and Keselman (1992) Simulation Study

- The degree of correlation between candidate predictors affected the frequency with which the authentic predictors found their way into the model.
- The greater the number of candidate predictors, the greater the number of noise variables included in the model.
- Sample size was of little practical importance in determining the number of authentic variables contained in the final model.

Simulation results: Number of noise variables included

[Figure: number of noise variables included, by sample size; 20 candidate predictors, 100 samples]

Simulation results: R-square from noise variables

[Figure: R-square from noise variables, by sample size; 20 candidate predictors, 100 samples]

SOME of the problems with stepwise variable selection.

1. It yields R-squared values that are badly biased high
2. The F and chi-squared tests quoted next to each variable on the printout do not have the claimed distribution
3. The method yields confidence intervals for effects and predicted values that are falsely narrow (See Altman and Andersen, Stat in Med)
4. It yields P-values that do not have the proper meaning, and the proper correction for them is a very difficult problem
5. It gives biased regression coefficients that need shrinkage (the coefficients for remaining variables are too large; see Tibshirani, 1996).
6. It has severe problems in the presence of collinearity
7. It is based on methods (e.g. F tests for nested models) that were intended to be used to test pre-specified hypotheses.
8. Increasing the sample size doesn't help very much (see Derksen and Keselman)
9. It allows us to not think about the problem
10. It uses a lot of paper

Chatfield, C. (1995). Model uncertainty, data mining and statistical inference (with discussion). JRSSA, 158, 419-466.

Annotation: bias by selecting model because it fits the data well; bias in standard errors.

- P. 420: "... need for a better balance in the literature and in statistical teaching between techniques and problem solving strategies."
- P. 421: "It is 'well known' to be 'logically unsound and practically misleading' (Zhang, 1992) to make inferences as if a model is known to be true when it has, in fact, been selected from the same data to be used for estimation purposes. However, although statisticians may admit this privately (Breiman (1992) calls it a 'quiet scandal'), they (we) continue to ignore the difficulties because it is not clear what else could or should be done."
- P. 421: "Estimation errors for regression coefficients are usually smaller than errors from failing to take into account model specification."
- P. 422: "Statisticians must stop pretending that model uncertainty does not exist and begin to find ways of coping with it."
- P. 426: "It is indeed strange that we often admit model uncertainty by searching for a best model but then ignore this uncertainty by making inferences and predictions as if certain that the best fitting model is actually true."

Phantom Degrees of Freedom

- Faraway (1992) showed that any pre-modeling strategy costs a df over and above the df used later in modeling.
- Premodeling strategies included variable selection, outlier detection, linearity tests, residual analysis.
- Thus, although not accounted for in the final model, these phantom df will render the model too optimistic
- Therefore, if you transform, select, etc., you must include the DF in (i.e., penalize for) the Final Model

Conventional Univariate Pre-selection

- Non-significant tests also cost a DF
- Non-significance is NOT necessarily related to importance
- Variables may not behave the same way in a multivariable model: a variable not significant at univariate test may be very important in the presence of other variables
- Despite the convention, testing for confounding has not been systematically studied; in many cases it leads to overadjustment and underestimate of the true effect of the variable of interest.
- At the very least, pulling variables in and out of models inflates the model fit, often dramatically
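One way to see why data-driven screening is risky is to run it on predictors that are known to be useless. The sketch below assumes 20 independent noise candidates, n = 100, a univariate p < .05 screen, and an approximate critical t of 1.984 for 98 df; these settings and the function names are illustrative choices, not taken from the slides.

```python
import math
import random
import statistics

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def screen_noise(n=100, n_candidates=20, reps=500, seed=5, t_crit=1.984):
    """Screen pure-noise candidates one at a time against y.
    Returns (share of runs where at least one passes, mean number passing)."""
    # Correlation that corresponds to the critical t with n - 2 df.
    r_crit = t_crit / math.sqrt(t_crit ** 2 + n - 2)
    rng = random.Random(seed)
    counts = []
    for _ in range(reps):
        y = [rng.gauss(0, 1) for _ in range(n)]
        passed = 0
        for _ in range(n_candidates):
            x = [rng.gauss(0, 1) for _ in range(n)]  # unrelated to y by construction
            if abs(pearson_r(x, y)) > r_crit:
                passed += 1
        counts.append(passed)
    any_pass = sum(c > 0 for c in counts) / reps
    return any_pass, statistics.fmean(counts)

any_pass, mean_passed = screen_noise()
print(f"runs with at least one 'significant' noise variable: {any_pass:.2f}")
print(f"average number passing the screen: {mean_passed:.2f}")
```

With 20 candidates and a .05 threshold, about one noise variable per data set is expected to pass, and a model built from the survivors will look better than it is.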

Better approach

- Pick variables a priori
- Stick with them
- Penalize appropriately for any data-driven decision about how to model a variable

Spending DF wisely

- If not enough N/predictor, combine covariates using techniques that do not look at Y in the sample: PCA, FA, conceptual clustering, collapsing, scoring, established indexes.
- Save DF for finer-grained look at variables of most interest, e.g., non-linear functions

What to do

- Penalization/Random effects
- Propensity Scoring
- Matches individuals on multiple dimensions to improve baseline balance
- Tibshirani's Lasso

Propensity Score Example

- Observational data on SSRI use in post myocardial infarction patients
- Early use of SSRI as an adjustment covariate revealed excess risk for all-cause mortality among SSRI users
- Can use Propensity Score to help rule out confounders

Step 1: Kitchen Sink Model predicting SSRI use

- Why is it OK to use lots of predictors in this case?
- Working strictly at the sample level

Generate conditional probabilities of being on an SSRI for each patient

ID  probssri
1   0.07071829
2   0.10357308
3   0.08324767
4   0.09562251
5   0.10424651
6   0.28105882
7   0.09824793

Step 2: Remove non-overlapping cases

[Figure: density of the propensity score for SSRI = 0 and SSRI = 1]

Perform primary analysis predicting survival

- Surv ~ ssri
- Surv ~ ssri + logit(pssri)
- Surv ~ ssri + logit(pssri) + BDI
- Surv ~ ssri + logit(pssri) + BDI + others
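The propensity score enters these models on the logit scale. As a small illustration, the snippet below converts the seven example probabilities listed earlier into logits; the dictionary name and the rounding are choices made for this sketch, and fitting the survival model itself is not shown.

```python
import math

def logit(p):
    """Log-odds of a probability p, with 0 < p < 1."""
    return math.log(p / (1 - p))

# Estimated probabilities of SSRI use for patients 1-7, as listed in the deck.
probssri = {
    1: 0.07071829,
    2: 0.10357308,
    3: 0.08324767,
    4: 0.09562251,
    5: 0.10424651,
    6: 0.28105882,
    7: 0.09824793,
}

logit_pssri = {pid: logit(p) for pid, p in probssri.items()}
for pid, value in logit_pssri.items():
    print(pid, round(value, 3))
```

The logit transform maps probabilities from the 0-1 interval onto an unbounded scale, which is usually a more natural scale for a covariate in a regression model. All seven patients have probabilities below .5, so their logits are negative.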

Step 3: Unadjusted estimate

Factor         Effect  S.E.  Lower 0.95  Upper 0.95
ssri                   0.22   0.18       1.05
 Hazard Ratio  1.85           1.20       2.86

Step 4: Adjusted for Propensity (linear)

Factor         Effect  S.E.  Lower 0.95  Upper 0.95
ssri           0.61    0.24   0.15       1.08
 Hazard Ratio  1.85    NA     1.16       2.95
LOGIT          0.00    0.14  -0.27       0.28
 Hazard Ratio  1.00    NA     0.76       1.33
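The "Hazard Ratio" rows in these tables are the exponentiated log-hazard coefficients and confidence limits. The sketch below applies that conversion to the rounded ssri row from the linear propensity adjustment; the function name is hypothetical, and small discrepancies from the printed table are expected because the table was computed from unrounded values.

```python
import math

def hazard_ratio(effect, lower, upper):
    """Exponentiate a log-hazard coefficient and its confidence limits."""
    return tuple(round(math.exp(v), 2) for v in (effect, lower, upper))

# ssri row as printed (rounded to two decimals): effect, lower .95, upper .95.
hr, hr_lower, hr_upper = hazard_ratio(0.61, 0.15, 1.08)
print(hr, hr_lower, hr_upper)
```

A confidence interval for the coefficient that excludes 0 corresponds to a hazard ratio interval that excludes 1, which is why the two rows always lead to the same conclusion.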

Better Step 4: Adjusted for Propensity (non-linear)

Factor         Effect  S.E.  Lower 0.95  Upper 0.95
ssri           0.55    0.24   0.07       1.03
 Hazard Ratio  1.73    NA     1.07       2.79
LOGIT          0.02    0.25  -0.47       0.51
 Hazard Ratio  1.02    NA     0.62       1.67

Limitations

- Still may be differences/confounding not measured and therefore not captured by propensity score
- If poor overlap, limited generalizability
- Many reviewers not familiar with it

What to do about heterogeneous slopes?

- We know there is always heterogeneity of slopes, perhaps even important
- Proper test is product interaction term, NOT within-subgroup tests (see BMJ series)
- Increased error rate
- Differential power
- Danger of accepting the null
- Sparse cells and unstable estimates
- Tension between low power of interaction and high error rate/instability
- Especially true in observational data
- I honestly don't know what to do. Any ideas?

If you worry about Type I

- Use pooled test (see, for example, Cohen & Cohen or Harrell)
- If pooled test not significant, stop there

If Type II is a bigger concern

- Report non-significant effects, acknowledging the uncertainty, but conveying need to investigate more
- Cf. HRT data: was there an age X HRT interaction?

Validation

- Apparent fit
- Usually too optimistic
- Internal
- cross-validation, bootstrap
- honest estimate for model performance
- provides an upper limit to what would be found on external validation
- External validation
- replication with new sample, different circumstances

Validation

- Steyerberg et al. (1999) compared validation methods
- Found that split-half was far too conservative
- Bootstrap was equal or superior to all other techniques

Conclusions

- Measure well
- Use all the information
- Recognize the limitations based on how much data you actually have
- In the confirmatory mode, be as explicit as possible about the model a priori, test it, and live with it
- By all means, explore data, but recognize and state frankly the limits post hoc analysis places on inference

http://myspace.com/monkeynavigatedrobots

Advanced topics and examples

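Bootstrap

[Diagram: resamples 1, 2, 3, 4, ..., k-1, k are drawn WITH REPLACEMENT from "My Sample"; the statistic is computed in each resample and the results are then evaluated.]

My Sample: 1, 3, 4, 5, 7, 10

Resamples drawn with replacement:
7 1 1 4 5 10
10 3 2 2 2 1
3 5 1 4 2 7
2 1 1 7 2 7
4 4 1 4 2 10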
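The resampling scheme in the diagram can be sketched as follows, using the six values listed as "My Sample". The choice of the mean as the statistic, the 2,000 resamples, and the simple percentile interval are assumptions for illustration; the slides apply the same idea to model validation rather than to a mean.

```python
import random
import statistics

sample = [1, 3, 4, 5, 7, 10]

def bootstrap_means(data, k=2000, seed=6):
    """Draw k resamples of len(data) values WITH replacement; return each resample's mean."""
    rng = random.Random(seed)
    return [statistics.fmean(rng.choices(data, k=len(data))) for _ in range(k)]

boot = sorted(bootstrap_means(sample))

# "Evaluate": centre, spread, and a rough 95% percentile interval.
boot_mean = statistics.fmean(boot)
boot_se = statistics.stdev(boot)
lower = boot[int(0.025 * len(boot))]
upper = boot[int(0.975 * len(boot)) - 1]
print(f"sample mean = {statistics.fmean(sample):.2f}")
print(f"bootstrap mean = {boot_mean:.2f}, SE = {boot_se:.2f}, 95% interval = ({lower:.2f}, {upper:.2f})")
```

Each resample has the same size as the original sample, so some observations appear several times and others not at all; the variation across resamples stands in for the variation across new samples from the population.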

Can use data to determine where to spend DF

- Use Spearman's Rho to test importance
- Not peeking because we have chosen to include the term in the model regardless of relation to Y
- Use more DF for non-linearity

Example: Predict survival from age, gender, and fare on the Titanic, using R software

If you have already decided to include them (and promise to keep them in the model), you can peek at predictors in order to see where to add complexity

Non-linearity using splines

Linear Spline (piecewise regression)

$Y = a + b_1(x < 10) + b_2(10 < x < 20) + b_3(x > 20)$

Cubic Spline (non-linear piecewise regression)

[Figure: spline curve with the knots marked]

Logistic regression model

fitfare <- lrm(survived ~ (rcs(fare,3) + age + sex)^2, x=T, y=T)
anova(fitfare)

Spline with 3 knots

Wald Statistics          Response: survived

Factor                                        Chi-Square  d.f.  P
fare (Factor+Higher Order Factors)             55.1       6     <.0001
 All Interactions                              13.8       4     0.0079
 Nonlinear (Factor+Higher Order Factors)       21.9       3     0.0001
age (Factor+Higher Order Factors)              22.2       4     0.0002
 All Interactions                              16.7       3     0.0008
sex (Factor+Higher Order Factors)             208.7       4     <.0001
 All Interactions                              20.2       3     0.0002
fare * age (Factor+Higher Order Factors)        8.5       2     0.0142
 Nonlinear                                      8.5       1     0.0036
 Nonlinear Interaction : f(A,B) vs. AB          8.5       1     0.0036
fare * sex (Factor+Higher Order Factors)        6.4       2     0.0401
 Nonlinear                                      1.5       1     0.2153
 Nonlinear Interaction : f(A,B) vs. AB          1.5       1     0.2153
age * sex (Factor+Higher Order Factors)         9.9       1     0.0016
TOTAL NONLINEAR                                21.9       3     0.0001
TOTAL INTERACTION                              24.9       5     0.0001
TOTAL NONLINEAR + INTERACTION                  38.3       6     <.0001
TOTAL                                         245.3       9     <.0001

Bootstrap Validation

Summary

- Think about your model
- Collect enough data
- Measure well
- Don't destroy what you've measured
- Pick your variables ahead of time and collect enough data to test the model you want
- Keep all your variables in the model unless extremely unimportant
- Use more df on important variables, fewer df on nuisance variables
- Don't peek at Y to combine, discard, or transform variables
- Estimate validity and shrinkage with bootstrap
- By all means, tinker with the model later, but be aware of the costs of tinkering
- Don't forget to say you tinkered
- Go collect more data

Web links for references, software, and more

- Harrell's regression modeling text
- http://hesweb1.med.virginia.edu/biostat/rms/
- R software
- http://cran.r-project.org/
- SAS Macros for spline estimation
- http://hesweb1.med.virginia.edu/biostat/SAS/survrisk.txt
- Some results comparing validation methods
- http://hesweb1.med.virginia.edu/biostat/reports/logistic.val.pdf
- SAS code for bootstrap
- ftp://ftp.sas.com/pub/neural/jackboot.sas
- S-Plus home page
- insightful.com
- Mike Babyak's e-mail
- michael.babyak_at_duke.edu
- This presentation
- http://www.duke.edu/~mababyak

- www.duke.edu/~mababyak
- michael.babyak _at_ duke.edu
- symptomresearch.nih.gov/chapter_8/

Observational Data and Clinical Trials

http://www.epidemiologic.org/2006/11/agreement-of-observational-and.html
http://www.epidemiologic.org/2006/10/resolving-differences-of-studies-of.html

Propensity Scoring

Rubin Symposium notes: http://www.symposion.com/nrccs/rubin.htm
Rosenbaum, P. R., and Rubin, D. B. (1984). "Reducing bias in observational studies using sub-classification on the propensity score." Journal of the American Statistical Association, 79, 516-524.
Pearl, J. (2000). Causality: Models, Reasoning, and Inference. Cambridge University Press.
Rosenbaum, P. R., and Rubin, D. B. (1983). "The Central Role of the Propensity Score in Observational Studies for Causal Effects." Biometrika, 70, 41-55.

Mediation and Confounding

MacKinnon DP, Krull JL, Lockwood CM. Equivalence of the mediation, confounding and suppression effect. Prev Sci 2000; 1: 173-81.

General Modeling

Harrell FE Jr. Regression modeling strategies: with applications to linear models, logistic regression and survival analysis. New York: Springer; 2001.

Sample Size

Kelley, K., & Maxwell, S. E. (2003). Sample size for Multiple Regression: Obtaining regression coefficients that are accurate, not simply significant. Psychological Methods, 8, 305-321.
Kelley, K., & Maxwell, S. E. (In press). Power and Accuracy for Omnibus and Targeted Effects: Issues of Sample Size Planning with Applications to Multiple Regression. Handbook of Social Research Methods, J. Brannon, P. Alasuutari, and L. Bickman (Eds.). New York, NY: Sage Publications.
Green SB. How many subjects does it take to do a regression analysis? Multivar Behav Res 1991; 26: 499-510.
Peduzzi PN, Concato J, Holford TR, Feinstein AR. The importance of events per independent variable in multivariable analysis, II: accuracy and precision of regression estimates. J Clin Epidemiol 1995; 48: 1503-10.
Peduzzi PN, Concato J, Kemper E, Holford TR, Feinstein AR. A simulation study of the number of events per variable in logistic regression analysis. J Clin Epidemiol 1996; 49: 1373-9.

Dichotomization

Cohen, J. (1983). The cost of dichotomization. Applied Psychological Measurement, 7, 249-253.
MacCallum, R. C., Zhang, S., Preacher, K. J., & Rucker, D. D. (2002). On the practice of dichotomization of quantitative variables. Psychological Methods, 7(1), 19-40.
Maxwell, S. E., & Delaney, H. D. (1993). Bivariate median splits and spurious statistical significance. Psychological Bulletin, 113, 181-190.
Royston, P., Altman, D. G., & Sauerbrei, W. (2006). Dichotomizing continuous predictors in multiple regression: a bad idea. Statistics in Medicine, 25, 127-141.
http://biostat.mc.vanderbilt.edu/twiki/bin/view/Main/CatContinuous

Pretesting

Grambsch PM, O'Brien PC. The effects of preliminary tests for nonlinearity in regression. Stat Med 1991; 10: 697-709.
Faraway JJ. The cost of data analysis. J Comput Graph Stat 1992; 1: 213-29.

Validation and Penalization

Steyerberg EW, Harrell FE Jr, Borsboom GJ, Eijkemans MJ, Vergouwe Y, Habbema JD. Internal validation of predictive models: efficiency of some procedures for logistic regression analysis. J Clin Epidemiol 2001; 54: 774-81.
Tibshirani R. Regression shrinkage and selection via the lasso. J R Stat Soc B 1996; 58: 267-88.
Greenland S. When should epidemiologic regressions use random coefficients? Biometrics 2000 Sep; 56(3): 915-21.
Moons KGM, Donders ART, Steyerberg EW, Harrell FE. Penalized maximum likelihood estimation to directly adjust diagnostic and prognostic prediction models for overoptimism: a clinical example. J Clin Epidemiol 2004; 57: 1262-1270.
Steyerberg EW, Eijkemans MJ, Habbema JD. Application of shrinkage techniques in logistic regression analysis: a case study. Stat Neerl 2001; 55: 76-88.

Variable Selection

Thompson B. Stepwise regression and stepwise discriminant analysis need not apply here: a guidelines editorial. Ed Psychol Meas 1995; 55: 525-34.
Altman DG, Andersen PK. Bootstrap investigation of the stability of a Cox regression model. Stat Med 1989; 8: 771-83.
Derksen S, Keselman HJ. Backward, forward and stepwise automated subset selection algorithms: frequency of obtaining authentic and noise variables. Br J Math Stat Psychol 1992; 45: 265-82.
Steyerberg EW, Harrell FE, Habbema JD. Prognostic modeling with logistic regression analysis: in search of a sensible strategy in small data sets. Med Decis Making 2001; 21: 45-56.
Cohen J. Things I have learned (so far). Am Psychol 1990; 45: 1304-12.
Roecker EB. Prediction error and its estimation for subset-selected models. Technometrics 1991; 33: 459-68.