Title: Lovely Lucid Logistics the analysis and graphic presentation of effects of nominal and metric variab
1 Lovely Lucid Logistics: the analysis and graphic presentation of effects of nominal and metric variables on binary outcomes
- Diana Eugenie Kornbrot
- Blended Learning Unit
- University of Hertfordshire
- d.e.kornbrot_at_herts.ac.uk
2 Abstract
- Logistic regression can be used to answer the same questions about binary variables that ANOVA and ANCOVA answer about metric variables.
- However, SPSS provides much less support for logistic regression. The Logistic Regression procedure provides no equivalent of ANOVA Means Tables or Profile Plots.
- This presentation shows how to use a combination of SPSS procedures to produce tables and graphs of predicted logits and probabilities as a function of categorical factor and metric covariate variables.
- Diagnostics for model fit are NOT discussed
- They merit their own presentation
3 Acknowledgments
- Lia Kvavilashvili
- For all the prospective memory data
- Stimulating theoretical discussion on content
- ESRC Project Grant
4 Goals
- Motivate logistic regression
- Graphic presentation of logistic model results
- Interpretation is much easier from graphs
- Predictions
- Logits and probabilities as a function of explanatory variables
- Identification of statistically reliable effects
- Factors and contrasts
- Application to different designs
- Explanatory variables: 2 or 3 categorical
- Explanatory variables: 1 metric, 1 or 2 categorical
- Recommendations to users of logistic regression
- Recommendations to SPSS
5 Why Logistic Analysis?
- Need to analyse binary, i.e. 2-alternative, responses
- Errors: right, wrong
- Events: remembered, forgotten
- Success: grant awarded, grant rejected
- Patient: recovered, or not
- More than 1 categorical variable
- Chi-square not sufficient
- Combination of metric and categorical explanatory variables
- Interactions matter
6 Why Interpretation of Results is a Problem
- Analysis is on log(odds), i.e. logits (effects are log odds ratios)
- Lack of intuitive feel for logits
- Lack of intuitive feel for odds ratios for non-betters
- Probabilities are more natural?
- Need for packages: SPSS or other
- Can't hand calculate, as there is no closed-form answer
- SPSS output
- Primary output is in logits
- No directly useful graphics output
- BUT Save permits direct saving of probabilities, not logits
- ?No confidence levels on probabilities
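The relation between the three scales can be sketched in a few lines of Python. This is only an illustration of the standard definitions (odds = p/(1 - p), logit = ln(odds)); the function names are made up for the example and are not SPSS names.

```python
import math

def odds(p: float) -> float:
    """Odds corresponding to a probability strictly between 0 and 1."""
    return p / (1.0 - p)

def logit(p: float) -> float:
    """Log odds (logit) of a probability strictly between 0 and 1."""
    return math.log(odds(p))

def inv_logit(lgt: float) -> float:
    """Probability corresponding to a logit."""
    return 1.0 / (1.0 + math.exp(-lgt))

# Equal steps on the logit scale are unequal steps on the probability scale,
# which is one reason effects that are linear in logits look curved in probabilities.
for lgt in (-2.0, -1.0, 0.0, 1.0, 2.0):
    p = inv_logit(lgt)
    print(f"logit={lgt:+.1f}  odds={odds(p):.3f}  probability={p:.3f}")
```

The model works on the logit scale, while readers usually want the probability scale, so both conversions are needed when tabulating or plotting results.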
7 Analysis
- Analysis
- GLM framework
- Effects assumed to be linear on logits
- Model Goodness of Fit: test on -2 Log Likelihood, -2LL
- Model Fitting Procedure
- SPSS uses Wald, other packages use deviance, -2LL
- Effect Evaluation Criteria: SPSS uses Wald
- On factors and covariates
- On model parameters
- Other packages vary, all give Wald as minimum
- JMP, SPSS, SAS, SYSTAT
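To make the two criteria concrete, here is a minimal Python sketch that fits a one-predictor logistic model by Newton-Raphson and reports both a Wald chi-square and a likelihood-ratio chi-square (difference in -2LL) for the slope. The data are invented toy values, not the prospective memory data, and the function names are hypothetical; real packages use more careful numerics.

```python
import math

def _score_info(b0, b1, x, y):
    """Gradient, information matrix and -2LL for logit(p) = b0 + b1*x."""
    g0 = g1 = h00 = h01 = h11 = loglik = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
        loglik += math.log(p) if yi == 1 else math.log(1.0 - p)
    return (g0, g1), (h00, h01, h11), -2.0 * loglik

def fit_logistic(x, y, iterations=25):
    """Newton-Raphson fit; iteration is needed because no closed form exists."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), (h00, h01, h11), _m2ll = _score_info(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    (g0, g1), (h00, h01, h11), m2ll = _score_info(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    # Null model: intercept only, so every case gets the overall proportion.
    p_null = sum(y) / len(y)
    m2ll_null = -2.0 * sum(
        math.log(p_null) if yi == 1 else math.log(1.0 - p_null) for yi in y
    )
    return {
        "b0": b0, "b1": b1, "se_b1": se_b1,
        "wald_chi2": (b1 / se_b1) ** 2,  # Wald test of the slope, 1 df
        "lr_chi2": m2ll_null - m2ll,     # likelihood-ratio test, 1 df
        "gradient": (g0, g1),            # should be ~0 at convergence
    }

# Invented toy data, NOT from the presentation.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [0, 0, 1, 0, 1, 0, 1, 1]
result = fit_logistic(x, y)
print(result)
```

The Wald statistic uses only the estimate and its standard error, whereas the likelihood-ratio statistic compares the fit of two nested models, so the two need not agree in small samples.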
8 Data Example: Prospective Memory
- Prospective Memory
- Does person have GOOD prospective memory?
- 5 or 6 occasions remembered from 6 opportunities
- Model 1: task(action, event, time), age(4 categories)
- Model 2: task(action, event, time), age(4), intellect
- Presentation Criteria
- Easy to interpret > Graphics
- Predicted probability and logits
- Estimate of accuracy as part of results
- Tests for explanatory variable effects and contrasts
9 Model 1 using SPSS menus
- Analyze > Regression > Binary Logistic
- Dependent: good
- Covariates: task(cat), age(cat), task(cat)*age(cat)
- Method: Enter
- Categorical: task(deviation), age(deviation)
- or task(repeated), age(repeated)
- !!!NOT indicator, the default!!!
- not a lot of people know that!
- Save: probabilities, Cook's, deviance
- Options: CI for exp(B)
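Why the contrast choice matters can be sketched with the conventional coding schemes for a 3-level factor such as task. The Python below builds indicator (dummy) codes and deviation (effect) codes with the last level as reference; it assumes the textbook definitions of these schemes, and the function names are illustrative only.

```python
def indicator_coding(levels):
    """Dummy coding: the last level is the reference and is coded all zeros."""
    k = len(levels)
    return {lev: [1 if j == i else 0 for j in range(k - 1)]
            for i, lev in enumerate(levels)}

def deviation_coding(levels):
    """Effect coding: as indicator, but the last level is coded -1 throughout."""
    codes = indicator_coding(levels)
    codes[levels[-1]] = [-1] * (len(levels) - 1)
    return codes

tasks = ["action", "event", "time"]
print("indicator:", indicator_coding(tasks))
print("deviation:", deviation_coding(tasks))
```

With indicator codes each parameter compares a level with the reference level; with deviation codes each parameter compares a level with the average over levels, which is closer to what an ANOVA user expects from a main effect.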
10 Model 1 Global Results
- Model 1: task(action, event, time), age(4 categories)
- Omnibus Test: significant. Good
- Model Summary: substantial variance accounted for
11 SPSS Model 1 Parameters
- Variable effect not salient
- No effects or standard errors for reference (last) category
- Wald estimates of s.e. may not be those that are needed?
12 SPSS Graphic Representation
- Predicted probabilities, pre_1
- Directly available from Save
- Logits can be calculated
- Transform > Compute
- Lgt = ln(pre_1/(1 - pre_1))
- NB Most other packages allow direct saving of logits
- Graphs > Interactive > Line plot
- Y axis: predicted probability (mean)
- X axis: age
- Colour: task
- No interactions
- So expect logit plots to be more linear
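The same table of cell means can be sketched outside SPSS. The Python below takes saved cases with a predicted probability (called pre_1, as in the Save output above), averages it within each age by task cell, and applies Lgt = ln(p/(1 - p)) to the cell mean. The case values are invented for illustration and the function name is hypothetical.

```python
import math
from collections import defaultdict

# Hypothetical saved cases: (age band, task, pre_1). Values are invented.
cases = [
    ("61-65", "action", 0.80), ("61-65", "action", 0.80),
    ("61-65", "event", 0.50), ("61-65", "event", 0.50),
    ("76-80", "action", 0.60), ("76-80", "event", 0.20),
]

def cell_means(cases):
    """Mean predicted probability and its logit for each (age, task) cell."""
    sums = defaultdict(lambda: [0.0, 0])
    for age, task, pre_1 in cases:
        cell = sums[(age, task)]
        cell[0] += pre_1
        cell[1] += 1
    table = {}
    for key, (total, n) in sums.items():
        mean_p = total / n
        table[key] = (mean_p, math.log(mean_p / (1.0 - mean_p)))
    return table

table = cell_means(cases)
for (age, task), (mean_p, lgt) in sorted(table.items()):
    print(f"{age:6s} {task:7s} prob={mean_p:.2f} logit={lgt:+.2f}")
```

Plotting the second column of each pair against age, with one line per task, gives the logit version of the profile plot.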
13 SPSS Logit and Probability Graphs
[Figure: two panels, raw probability and logit]
- Logit: ??looks more linear??
- Confidence levels??? NOT in SPSS!!!
14 Confidence Levels
- Assume no extra-binomial dispersion
- Asymptotic for logit
- Symmetric about mean(lgt)
- se(lgt)^2 = 1/N_occur + 1/N_not_occur
- Lower Confidence Level, 95%: LCL(lgt) = mean(lgt) - 1.96*se(lgt)
- Upper Confidence Level, 95%: UCL(lgt) = mean(lgt) + 1.96*se(lgt)
- Asymptotic for probability
- Asymmetric about mean(prob)
- Calculate from lgt CLs
- probability = exp(lgt)/(1 + exp(lgt))
- LCL(prob) = exp(LCL(lgt))/(1 + exp(LCL(lgt)))
- UCL(prob) = exp(UCL(lgt))/(1 + exp(UCL(lgt)))
- Use EXCEL, can't customise error bars in SPSS
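The calculation above, done in EXCEL in the presentation, can also be sketched in Python. The example assumes the group logit is estimated from the group's own counts of occurrences and non-occurrences and that there is no extra-binomial dispersion; the counts used (30 and 10) are invented, and the function name is hypothetical.

```python
import math

def logit_confidence_limits(n_occur: int, n_not_occur: int, z: float = 1.96):
    """95% limits (for z = 1.96) on the logit, back-transformed to probabilities."""
    lgt = math.log(n_occur / n_not_occur)
    se = math.sqrt(1.0 / n_occur + 1.0 / n_not_occur)
    lcl, ucl = lgt - z * se, lgt + z * se  # symmetric about the logit

    def to_prob(value: float) -> float:
        return math.exp(value) / (1.0 + math.exp(value))

    return {
        "logit": lgt, "se": se, "logit_lcl": lcl, "logit_ucl": ucl,
        "prob": to_prob(lgt),
        "prob_lcl": to_prob(lcl),  # asymmetric about the probability
        "prob_ucl": to_prob(ucl),
    }

# Invented counts: 30 cases where the event occurred, 10 where it did not.
limits = logit_confidence_limits(30, 10)
for name, value in limits.items():
    print(f"{name:10s} {value:.4f}")
```

Because the limits are computed on the logit scale and then transformed, the probability limits always stay inside 0 to 1, and the lower limit lies further from the estimate than the upper one whenever the probability is above .5.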
15 EXCEL Logit and Probability Graphs
[Figure: two panels, raw probability and logit]
- Errors are for each group. So low power for interaction
16 Model 2 Using SPSS menus
- Analyze > Regression > Binary Logistic
- Dependent: good
- Covariates: task(cat), age(cat), intellec
- task(cat)*age(cat)
- task(cat)*intellec
- intellec*age(cat)
- task(cat)*age(cat)*intellec
- Method: Enter
- Categorical: task(deviation), age(deviation)
- or task(repeated), age(repeated)
- Save: probabilities, Cook's, deviance
- Options: CI for exp(B)
17 Model 2 Summary
- Omnibus/Whole Model: LR chi2(23) = 82.2, p < .0000001
- Various r2 values
- McFadden = .36, Cox & Snell = .37, Nagelkerke = .51
- Variable Effects

Source              DF   Wald chi2   Wald Prob   LR chi2   LR Prob
TASK                 2     14.03      .000899     29.70    .000000
AGE                  3      4.45      .217040      4.96    .174500
intellect            1      2.87      .089995      6.03    .014101
TASK*AGE             6      6.00      .423621     14.63    .023371
TASK*intellect       2      4.32      .115183      7.73    .021003
AGE*intellect        3      5.00      .171542      7.07    .069614
TASK*AGE*intellect   6     10.52      .104480     21.43    .001532

- Comparison of Variable Effects with different methods/packages
- Likelihood Ratio shows strong effects of intellec and intellec interactions
- Used JMP-IN, even version 3; 5 is better for some things
- Wald does NOT show these effects: WORRYING
- Model improvement with intellec: chi2(12) = 33.3, p = .00087
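The probabilities in the Variable Effects table can be checked from the chi-square values and their degrees of freedom. The Python sketch below uses the standard finite-sum formulas for the upper tail of a chi-square distribution with integer df; the function name is hypothetical, and small discrepancies from the slide are expected because the chi-square values there are rounded to two decimals.

```python
import math

def chi2_sf(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate with integer df."""
    half = x / 2.0
    if df % 2 == 0:
        # Even df: exp(-x/2) times a finite sum of (x/2)^i / i!.
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= half / i
            total += term
        return math.exp(-half) * total
    # Odd df: normal tail plus a finite correction series.
    total = math.erfc(math.sqrt(half))
    term = math.sqrt(2.0 * x / math.pi) * math.exp(-half)
    for j in range(1, (df - 1) // 2 + 1):
        total += term
        term *= x / (2 * j + 1)
    return total

# (source, df, Wald chi2, LR chi2) as given on the slide.
effects = [
    ("TASK", 2, 14.03, 29.70),
    ("AGE", 3, 4.45, 4.96),
    ("intellect", 1, 2.87, 6.03),
    ("TASK*AGE", 6, 6.00, 14.63),
    ("TASK*intellect", 2, 4.32, 7.73),
    ("AGE*intellect", 3, 5.00, 7.07),
    ("TASK*AGE*intellect", 6, 10.52, 21.43),
]
for source, df, wald, lr in effects:
    print(f"{source:20s} df={df}  Wald p={chi2_sf(wald, df):.6f}  LR p={chi2_sf(lr, df):.6f}")

# Model improvement when intellec and its interactions are added: 12 extra df.
print(f"improvement p={chi2_sf(33.3, 12):.5f}")
```

Listing the Wald and likelihood-ratio probabilities side by side makes the disagreement for intellect and its interactions easy to see.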
18 Model 2 Probability by Age
- Not very clear!
- Task effect
- Event has lower prob
- Intellect
- Most groups
- Prob increases with intellec
- 3-way interactions
- > 70, event; 61-65, time
- Prob decreases with intellec
19 Model 2 Logit by Age
- Bit clearer!
- Task effect
- Event has lower prob
- Intellect
- Most groups
- Prob increases with intellec
- Large for 71-75 time, 76-80 action
- 3-way interactions
- > 70, event; 61-65, time
- Prob decreases with intellec
20 Summary Recommendations
- Recommend logit analyses as a very important tool
- Recommend graphic displays to improve interpretability
- SPSS provides basic procedure
- Limitations of SPSS
- No direct predicted logit or probability Table or Graph Summary
- Poor model diagnostics and power procedures
- No direct group standard errors
- No Maximum Likelihood estimates for explanatory variables
- No mixed models
- Other general packages are also DIRE, in different ways
- Need simple tools for routine logistic applications
- Can SPSS User Groups do anything?