Testing and Interpreting Mediational and Moderational Models in Logistic Regression

1
Testing and Interpreting Mediational and
Moderational Models in Logistic Regression
  • A practical talk
  • Kevin M. King, M.A.

2
Goals
  • Provide a basic understanding of logistic
    regression
  • Demonstrate how to compute standardized
    coefficients using Excel to compute the indirect
    and direct effect for mediational models in
    logistic regression (e.g. MacKinnon & Dwyer,
    1993)
  • Demonstrate how to use Excel to create graphs of
    logistic regression interactions

3
A caveat
  • I am not a quantitative expert.
  • I'm just a guy who's worked with these models
    quite a bit and has some shortcuts to share.
  • To really understand what's going on, you
    need to understand logistic regression, which you
    won't find here. Take a course or read a book.
  • A great (and cheap) resource for learning
    logistic regression is the Sage Primer by Pampel
    (2000)

4
Prediction of Dichotomous Variables
  • Many variables are dichotomous (e.g. death,
    psychological diagnosis, participation in an
    intervention) and very relevant.
  • It is desirable to apply the same modeling
    framework that we use to predict continuous
    outcomes to dichotomous outcomes

5
Difficulties
  • We often think in terms of continuous prediction
  • e.g.
  • How does personality lead to drug use disorder?
  • How is income related to participation?
  • Yet we can't talk about 1-to-1 relations, where
    $5,000 more of income leads to 1 more
    participation, or 1 point higher impulsivity leads
    to 1 more drug use disorder

6
Continuous Distribution
[Figure: continuous distribution]

7
Risk and Odds
  • We must translate our thinking to binary outcomes
    and talk about risk and odds, to describe the
    probability of being in one state or another.
  • These probabilities are not distributed linearly
  • E.g.
  • Relation of number of publications to tenure
  • Relation of number of alcoholic relatives to alcoholism

8
Binary Distributions
[Figure: binary distributions]

9
Probability Curve (Sigmoid Curve)
[Figure: sigmoid probability curve]

10
Estimation of Logistic Regression (and
implications for mediation)
  • Logistic regression uses a transformation of the
    probability curve, called the logit (logged
    odds), which linearizes the probability curve.
  • Logit = $\ln(P_i / (1 - P_i))$, Odds = $P_i / (1 - P_i)$,
    Probability = $P_i$
  • But since we observe the actual presence or
    absence, rather than its probability, we can't
    use OLS procedures
  • Error term is non-normally distributed
  • Error variances are non-equal across values of the IV
  • (i.e. distribution is inherently nonlinear)

11
Probability vs. Odds vs. Logit
  • Probability = $P_i$: chance of occurrence
  • Odds = $P_i / (1 - P_i)$: odds of occurring vs. not (e.g.
    men have twice the odds of being diagnosed with
    alcoholism)
  • Logit = $\ln(P_i / (1 - P_i))$: log of the odds,
    otherwise un-interpretable
  • Standardizing variables (Z-scores) by subtracting
    the mean and dividing by their SD helps interpretation
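
The three metrics above are just re-expressions of one another. A minimal Python sketch, assuming a probability strictly between 0 and 1 (the function names are illustrative, not from the talk):

```python
import math

def odds_from_probability(p):
    """Odds = p / (1 - p); requires 0 <= p < 1."""
    return p / (1.0 - p)

def logit_from_probability(p):
    """Logit = natural log of the odds; requires 0 < p < 1."""
    return math.log(odds_from_probability(p))

# Equal steps in probability are not equal steps in odds or logit.
for p in (0.10, 0.50, 0.90):
    odds = odds_from_probability(p)
    logit = logit_from_probability(p)
    print(f"p={p:.2f}  odds={odds:.3f}  logit={logit:+.3f}")
```

A probability of .50 corresponds to odds of 1 and a logit of 0, and probabilities of .10 and .90 sit at equal distances on either side of that point on the logit scale.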

12
Maximum Likelihood Estimation
  • Maximum likelihood estimation (MLE) is used to
    estimate these unobserved logged odds.
  • An estimation procedure that finds the parameter
    values under which the observed data are most
    likely, given the model
  • In predicting binary outcomes, the initial best
    guess is the proportion in the sample.
  • Each explanatory variable that is added to the
    model can improve model fit and the accuracy of
    predicted binary class membership
    (e.g. correctly predicting diagnosis).
  • Because the continuous variable underlying the
    binary outcome (Y*) is unobserved in logistic
    regression, its variance is also unobserved. In
    order to estimate the model, the variance of the
    residual is fixed to $\pi^2/3$ in logistic regression.
  • The scale in logistic regression depends on the
    extent of prediction that depends on the
    variables in the model (MacKinnon & Dwyer, 1993,
    p. 150)
  • Thus the coefficients in each model are scaled
    according to the explanatory power of the other
    coefficients in the model and $\pi^2/3$.
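
To make the "initial best guess" point concrete, here is a small Python sketch with made-up data. It searches a grid of candidate probabilities for the one that maximizes the likelihood of the observed 0/1 outcomes when there are no predictors. The data and the grid search are illustrative assumptions; real software uses iterative algorithms rather than a grid.

```python
import math

def bernoulli_log_likelihood(p, outcomes):
    """Log-likelihood of 0/1 outcomes when every case has probability p."""
    return sum(math.log(p) if y == 1 else math.log(1.0 - p) for y in outcomes)

# Hypothetical data: 3 of 10 cases have the diagnosis.
outcomes = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]

# Try candidate probabilities .01, .02, ..., .99 and keep the most likely one.
grid = [i / 100 for i in range(1, 100)]
best_p = max(grid, key=lambda p: bernoulli_log_likelihood(p, outcomes))

sample_proportion = sum(outcomes) / len(outcomes)

# Intercept of a model with no predictors: the logit of the best guess.
null_intercept = math.log(best_p / (1.0 - best_p))

# Residual variance fixed by the logistic model.
logistic_residual_variance = math.pi ** 2 / 3

print(best_p, sample_proportion, round(logistic_residual_variance, 2))
```

With no predictors, the most likely probability is simply the sample proportion, and the fixed residual variance works out to about 3.29.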

13
Mediational Models: Application to Logistic
Regression
  • Mediation is where a predictor has an indirect
    effect on an outcome through a third variable.
    This indirect effect accounts for some or all of
    the total effect of the predictor on the outcome
    (see Baron & Kenny, 1986)
  • Mediation can be tested through measuring the
    impact of the mediator on the total effect $\tau$
    (e.g. $\tau - \tau'$, where $\tau'$ is the direct
    effect with the mediator in the model) or
    through testing the significance of the indirect
    effect ($ab$)
  • (See Shrout & Bolger, 2002, and MacKinnon, Krull, &
    Lockwood, 2000, for good discussions of these
    methods)

[Path diagram: Predictor → Mediator (path a); Mediator → Outcome (path b); Predictor → Outcome (path $\tau'$)]

14
The Rub
  • In logistic regression, $a$, $b$, $\tau$, and $\tau'$
    don't come from the same equations, which means
    that when the outcome or mediator is binary they
    do not have the same scale.
  • Thus one cannot compute the mediation effect
    from the raw coefficients, either through $ab$
    or $\tau - \tau'$

[Path diagram, as on slide 13: paths a, b, and $\tau'$ among Predictor, Mediator, and Outcome]

15
The Solution
  • The solution is to standardize the coefficients,
    as MacKinnon and Dwyer (1993) recommend.
  • Because the variance of the outcome is dependent
    on the variables in the model plus $\pi^2/3$, we can
    estimate the variance of the outcome and use it
    to standardize the coefficients.

16
The Formula
  • Variance of Outcome
  • $\sigma^2(O) = b^2\sigma^2(M) + \tau'^2\sigma^2(P) + 2b\tau'\sigma(PM) + \pi^2/3$
  • This means
  • Variance of outcome = (coefficient for Mediator
    squared × variance of the Mediator) + (coefficient
    of Predictor squared × variance of Predictor) +
    (2 × coeff. of med. × coeff. of pred. × covariance
    of P and M) + pi squared/3.
  • This can be expanded to include any number of
    covariates. Each variable must be included both
    as its coefficient squared times its variance and as
    a term for 2 × its coefficient times each other
    coefficient in the model times their covariance
  • To standardize coefficients, divide by the SD of
    the outcome (the square root of its variance)
  • $b_{std} = b / \sqrt{\sigma^2(O)}$
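
The expansion to any number of covariates can be written compactly. In the sketch below, $b_i$ is the logistic coefficient of the $i$-th variable $X_i$ in the model, $\sigma^2(X_i)$ its variance, and $\sigma(X_i X_j)$ the covariance of two variables. The subscript notation is an assumption added here for illustration; it follows the slide's two-variable formula term by term.

```latex
\sigma^2(O) \;=\; \sum_{i} b_i^{2}\,\sigma^2(X_i)
\;+\; 2\sum_{i<j} b_i\, b_j\, \sigma(X_i X_j)
\;+\; \frac{\pi^{2}}{3}
```

With only the mediator and the predictor in the model, the first sum gives the two squared-coefficient terms, the second sum gives the single covariance term, and the last term is the fixed residual variance.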

17
An Example: The mediating effect of behavioral
undercontrol on the relation between parental
alcoholism and drug use disorder (from King &
Chassin, 2004)
Table 2: Logistic Regression Predicting Drug
Diagnosis from Parental Alcoholism and Behavioral
Undercontrol
Note. ***p < .001
[Path diagram: Parent Alc. → Undercontrol = 0.27; Undercontrol → Offspring Drug Disorder = 0.61; Parent Alc. → Offspring Drug Disorder = 0.54]

18
Standardization of coefficients
  • Using the coefficients and the variance-covariance
    matrix of the variables in the equation, we can
    easily fill in the values of the formula.
  • Steps
  • Get the variance-covariance matrix for all
    variables in the model, paste into Excel
  • Make a table of all coefficients and SEs from
    model results
  • Using the table of coefficients, make a
    variance-covariance-like table (where the
    on-diagonal is the coefficient squared and the
    off-diagonal is 2 × the product of the two
    coefficients; fill only one triangle so each
    pair is counted once)
  • Combine the variance-covariance table and the new
    table of coefficients by multiplying matching
    cells
  • Sum the new combined table and add $\pi^2/3$.
  • Use this outcome variance to standardize $b$ and
    $\tau'$ by dividing each coefficient by the square
    root of the outcome variance (the outcome SD)
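
The Excel steps above can be sketched in a few lines of Python. The coefficients and covariance matrix below are made up for illustration, and the function names are hypothetical. The sketch sums over the full symmetric matrix (so each off-diagonal pair is counted twice, which matches the 2 × coefficient × coefficient × covariance term), and it assumes standardization means dividing by the square root of the outcome variance, the usual definition of a standardized coefficient.

```python
import math

def latent_outcome_variance(coefs, cov):
    """Sum b_i * b_j * cov[i][j] over the full matrix, then add pi^2 / 3."""
    k = len(coefs)
    explained = sum(coefs[i] * coefs[j] * cov[i][j]
                    for i in range(k) for j in range(k))
    return explained + math.pi ** 2 / 3

def standardize(coefs, cov):
    """Divide each coefficient by the SD of the latent outcome."""
    sd = math.sqrt(latent_outcome_variance(coefs, cov))
    return [b / sd for b in coefs]

# Hypothetical inputs: coefficients for [mediator, predictor]
# and the variance-covariance matrix of those two variables.
coefs = [0.5, 0.4]
cov = [[1.0, 0.3],
       [0.3, 1.0]]

variance = latent_outcome_variance(coefs, cov)
standardized = standardize(coefs, cov)
print(round(variance, 3), [round(b, 3) for b in standardized])
```

Adding a covariate only means adding one more entry to `coefs` and one more row and column to `cov`.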

19
Standardized Model
[Path diagram: Parent Alc. → Undercontrol = 0.27; Undercontrol → Offspring Drug Disorder = 0.15; Parent Alc. → Offspring Drug Disorder = 0.12]
Proportion Mediated = $ab / (\tau' + ab)$
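
As a quick check of the arithmetic, here is a Python sketch of the proportion mediated. It reads the standardized diagram values as a = 0.27, b = 0.15, and a direct effect of 0.12, which is an interpretation of the slide's layout, and the function name is illustrative.

```python
def proportion_mediated(a, b, direct):
    """Indirect effect (a * b) as a share of indirect plus direct effect."""
    indirect = a * b
    return indirect / (direct + indirect)

# Standardized values read from the slide: a = 0.27, b = 0.15, direct = 0.12
share = proportion_mediated(0.27, 0.15, 0.12)
print(f"{share:.3f}")  # prints 0.252
```

Under that reading, roughly a quarter of the effect of parental alcoholism runs through undercontrol.
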
20
Moderation in Logistic Regression: Interpreting
Coefficients and Graphing Interactions
  • Testing interactions in logistic regression is
    similar to OLS regression methods, in that one
    includes an interaction term in the model
    predicting a binary outcome.
  • Interactions can also be probed using Aiken &
    West's method (test at 1 SD above and below the
    mean).
  • Centering is just as important as in OLS
    regression, and standardizing variables will also
    aid in interpretation of coefficients
  • The present model shows maternal support moderating
    the relation between behavioral undercontrol and
    risk for young adult drug use disorder.

[Path diagram: Support, Undercontrol, and Offspring Drug Disorder, with coefficients -0.61, 0.40, and 0.85]

21
Interpreting Coefficients and Graphing
Interactions: An Example (from King & Chassin,
2004)
  • An example: Behavioral undercontrol's effect on
    drug use disorder is moderated by parental
    support.

Table 5: Logistic Regression Predicting Drug
Diagnosis From Behavioral Undercontrol and
Parental Support
Note. B = the unstandardized logistic regression
coefficient. *p < .05, **p < .01, ***p < .001

22
The OLS Extension
  • Probe the interaction at 1 SD above and below the
    mean of the moderator to obtain
    coefficients and intercepts.
  • Plot these coefficients across a range of data
    points of the moderated variable (remember,
    you've standardized your predictors for easy
    interpretation).
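
Probing at ±1 SD amounts to rearranging the model equation. Here is a minimal Python sketch, assuming a model of the form logit = b0 + b_x·X + b_z·Z + b_xz·X·Z with standardized X (the moderated variable) and Z (the moderator). The coefficient values and names are hypothetical, not taken from Table 5.

```python
def simple_line(b0, b_x, b_z, b_xz, z):
    """Intercept and slope of the logit on X when the moderator equals z."""
    intercept = b0 + b_z * z
    slope = b_x + b_xz * z
    return intercept, slope

# Hypothetical coefficients for standardized X and Z
b0, b_x, b_z, b_xz = -1.0, 0.8, -0.5, 0.3

low = simple_line(b0, b_x, b_z, b_xz, z=-1.0)   # moderator 1 SD below its mean
high = simple_line(b0, b_x, b_z, b_xz, z=1.0)   # moderator 1 SD above its mean

print("low  moderator: intercept %.2f, slope %.2f" % low)
print("high moderator: intercept %.2f, slope %.2f" % high)
```

Each (intercept, slope) pair defines one line on the logit scale, which can then be evaluated across a range of X values for plotting.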

23
The Logistic Twist
  • The previous graph is in terms of the logit. It's
    good for helping us understand the nature of the
    interaction (in this case protective but
    reactive)
  • However, it fails to give us a sense of what's
    really happening in terms of how the predicted
    probabilities differ across levels of the
    moderator
  • Thus, we need to transform the coefficients to
    the odds or probability to create interpretable
    graphs
  • $P = e^{\text{logit}} / (1 + e^{\text{logit}})$, Odds = $e^{\text{logit}}$
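
The two transformations can be applied to every point on a simple-slope line before graphing. A minimal Python sketch, using a hypothetical line on the logit scale (intercept -0.5, slope 0.5) and a predictor in SD units:

```python
import math

def probability_from_logit(logit):
    """P = e^logit / (1 + e^logit), written as the equivalent 1 / (1 + e^-logit)."""
    return 1.0 / (1.0 + math.exp(-logit))

def odds_from_logit(logit):
    """Odds = e^logit."""
    return math.exp(logit)

# Hypothetical simple line on the logit scale: intercept -0.5, slope 0.5
x_values = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]   # predictor in SD units
for x in x_values:
    logit = -0.5 + 0.5 * x
    odds = odds_from_logit(logit)
    p = probability_from_logit(logit)
    print(f"x={x:+.1f}  logit={logit:+.2f}  odds={odds:.3f}  p={p:.3f}")
```

The line is straight in the logit metric but curved once converted to odds or probability, which is why the graphs look different in each metric.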

24
Odds and Probability Metric
[Figure: the interaction plotted in the odds and probability metrics]

25
Interpreting with REAL values
  • While we may see the shape of the probability or
    odds function in the above graphs, note that they
    extend out to 6.5 SD above the mean for
    undercontrol!
  • It's important to display your interactions where
    there is real data.
  • To do this, you can run your moderational model
    in SPSS and save out the predicted probabilities
    for each participant as a variable. See code
    below for an example.

* The SAVE PRED subcommand stores the predicted probability for each case.
LOGISTIC REGRESSION VAR=c4drugdx
  /METHOD=ENTER rgrp paranti rgen zc3age
  /METHOD=ENTER zunder zc3ss
  /METHOD=ENTER unbyks
  /CRITERIA PIN(.05) POUT(.10) ITERATE(20) CUT(.5)
  /SAVE PRED
  /CLASSPLOT.
26
Putting it all together
  • Take the predicted probabilities from SPSS and
    move them next to the participants' scores on the
    moderated variable (e.g. undercontrol).
  • Select both columns, copy and paste into Excel.
  • Select the predicted probabilities and the
    model-implied probabilities and graph them
  • I use a scatter plot in Excel. Using the chart
    wizard:
  • For the X values of the predicted probabilities,
    select the actual values of the moderated
    variable
  • For the model-implied values, select the column of
    values used to make the simple slope graph (e.g.
    -1.5 SD to 1.5 SD, etc.)
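
One way to keep the model-implied curve "where there is real data" is to trim it to the observed range of the moderated variable before plotting. The Python sketch below uses made-up scores and curve values; the function name and the trimming rule are illustrative assumptions, not part of the Excel procedure above.

```python
def clip_to_observed_range(curve, observed_x):
    """Keep only model-implied (x, probability) points inside the observed range of x."""
    lowest, highest = min(observed_x), max(observed_x)
    return [(x, p) for x, p in curve if lowest <= x <= highest]

# Hypothetical participant scores on the standardized moderated variable
observed_x = [-1.2, -0.4, 0.1, 0.9, 1.8]

# Hypothetical model-implied curve from -3 SD to +3 SD in steps of 1 SD
curve = [(-3.0, 0.05), (-2.0, 0.08), (-1.0, 0.14), (0.0, 0.22),
         (1.0, 0.33), (2.0, 0.46), (3.0, 0.60)]

plotted = clip_to_observed_range(curve, observed_x)
print(plotted)
```

Only the points at -1, 0, and +1 SD survive here, so the plotted curve never extends past the scores participants actually have.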
