Workshop on Statistical Mediation and Moderation: Statistical Mediation

1
Workshop on Statistical Mediation and
Moderation: Statistical Mediation
  • Paul Jose
  • Victoria University of Wellington
  • 27 March, 2008
  • SASP Conference

2
What do you want to know?
  • Let's briefly have each person state what he or
    she would like to learn this morning.
  • Also, what is your level of statistical
    knowledge/experience?
  • Okay, let me tell you what I'm planning to cover.

3
What am I doing today?
  • I want to define mediation and moderation
  • How are they similar or different?
  • Basic mediation and moderation
  • Advanced mediation and moderation
  • Questions and answers

4
Where does one start?
  • I began to be interested in mediation and
    moderation because I found that I was
    increasingly using these approaches in
    understanding process among variables.
  • I found that there was little about these
    techniques in traditional statistics textbooks; I
    mostly obtained information through
    word-of-mouth.
  • . . . and I was confused. I don't like being
    confused, so I did something about it. I educated
    myself on these techniques. And now I can pass on
    what I've learned. Let me list what I consider to
    be the main sources of confusion.

5
Five major sources of confusion
  • First, moderation and mediation sound alike. It
    makes it seem that they are very similar, and/or
    that they derive from the same origin. They are
    somewhat similar (cousins), but they don't come
    from the same place.
  • Second, statistics textbooks typically do not do
    a very good job of explaining these two
    approaches. Exception: Howell (2006).
  • Third, reports of moderation and mediation in the
    research literature are not always clear or
    accurately performed.

6
More confusion
  • Both are special cases of two separate broad
    statistical approaches: mediation is a special
    case of semi-partial correlations (path modeling)
    and moderation is a special case of statistical
    interactions (from ANOVA). Both are included
    under GLM, but this is not usually appreciated.
  • It's not entirely clear what distinguishes a
    moderating variable from a mediating variable.
    Can one a priori define mediating and moderating
    variables?

7
One last stumbling block
  • Problem: there are no easily used statistics
    programmes that compute mediation and moderation.
    One can do the analyses in SPSS and other
    programmes that do regression, but there is no
    graphing capability dedicated to either mediation
    or moderation (except ModGraph and MedGraph).
  • What we have here is a case of the users getting
    ahead of the statisticians in the sense that
    researchers frequently use mediation and
    moderation but many statisticians aren't even
    familiar with the terms.

8
Background and history
  • Most people's awareness of this area comes from
    this article:
  • Baron, Reuben M., & Kenny, David A. (1986). The
    moderator-mediator variable distinction in social
    psychological research: Conceptual, strategic,
    and statistical considerations. Journal of
    Personality and Social Psychology, Vol 51(6), pp.
    1173-1182.
  • Cited about 6,500 times by PsycINFO's count. And
    that's just in Psychology.
  • Most people are unclear about what they said and
    how to perform the techniques.

9
Let's get started: Similarities and differences
  • Similarities
  • They both involve three variables
  • You can use regression to compute both
  • You wish to see how a third variable affects a
    basic relationship (IV to DV).
  • Differences
  • You create a product term in moderation, not in
    mediation
  • You don't have to centre anything in mediation
  • Moderation can be used on concurrent or
    longitudinal data, but mediation is best used on
    longitudinal data.
  • Graphing is critical for moderation, helpful for
    mediation.

10
How do you know if you have a moderator or a
mediator?
  • What's the diff?
  • Moderators tend to be variables that are
    relatively immune to change over time
    (personality trait, gender, ethnic group, etc.).
  • Mediators tend to be variables that change in
    relation to other variables (anxiety,
    helpfulness, honesty, mood).
  • However, there is a class of variables (e.g.,
    coping efforts/strategies) that might be examined
    in both ways. These two categories are not
    mutually exclusive.

11
So let's focus on mediation first
  • Definition: A mediating variable is one which
    specifies how (or the mechanism by which) a given
    effect occurs between an independent variable
    (IV) and a dependent variable (DV). (Holmbeck,
    1997, p. 599).
  • The question you wish to answer is whether the
    effect of the IV on the DV is at least partially
    mediated by a third variable (MV).
  • You can answer this question with two regressions
    (and a correlation matrix).
  • Let's consider a specific example.

12
An example from my research
[Diagram: Stressor intensity (IV), Depression (DV), and Rumination (proposed mediator)]
13
The theories
  • Susan Nolen-Hoeksema believes that an individual
    who ruminates more ends up more depressed. X ->
    Y. Notice that it's a causal statement.
  • I don't disagree with her, but I think that this
    simple effect should be embedded within the
    stress and coping context.
  • We know that stress leads to depression. The
    question I want to ask is whether at least part
    of the effect of stress on depression occurs
    because certain individuals ruminate about
    stressful events, and this rumination leads to
    depression.

14
The basic relationship
[Diagram: Stressor intensity to Depression]
One must have a significant correlation between
the IV and DV (in fact, among all 3 variables).
The essential question is whether, by adding a
third variable, one can at least partially
explain the basic relationship. Let's look at
some real data.
15
The two steps
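Step 1
[Diagram: Stressor intensity to Depression = .45]
Step 2
[Diagram: Stressor intensity to Rumination = .51;
Rumination to Depression = .46 (.32 with Stressor
intensity in the equation); Stressor intensity to
Depression = .45 (.29 with Rumination in the
equation)]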
16
Baron & Kenny's 4 criteria
  • IV to MV must be significant
  • IV to DV must be significant
  • MV to DV must be significant (when entered with
    the IV)
  • The effect of the IV on the DV must be less in
    the third equation than the second. Perfect
    mediation holds if the IV has no effect when the
    mediator is controlled.
  • "Must be less" is measured with the Sobel formula
    (see following pages)
  • Perfect mediation occurs when the original
    relationship goes to zero. This never happens in
    psychology. I have a proposal for how to deal
    with this issue, presented below.

17
What changed?
  • Note that the beta weight from IV to DV changed
    from .45 to .29.
  • What does that tell us?
  • According to Baron and Kenny (1986), if one
    obtains a significant drop in beta for this
    relationship, then one has obtained significant
    mediation.
  • How can one test whether this is significant or
    not? (It is not simply whether it goes from
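    significant to non-significant.) One needs to
    compute Sobel's test:
  • $z = \dfrac{ab}{\sqrt{b^2 s_a^2 + a^2 s_b^2}}$,
    where a is the unstandardised coefficient for IV
    to MV, b is the unstandardised coefficient for MV
    to DV (with the IV in the equation), and s_a and
    s_b are their standard errors.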

18
Who ya gonna call?
  • Many people have been using a web-site by
    Preacher and Leonardelli, and it's quite useful
    for computing the Sobel statistic:
    http://www.psych.ku.edu/preacher/sobel/sobel.htm
  • Let me show you how to use the site. It is
    generally very helpful.
  • I have invented my own programme to do what P &
    L's site does, and MORE. Let's check it out too.

19
Preparatory work
  • Before we run off to use these, please know that
    you have to obtain some statistical information
    first:
  • Compute a correlation matrix of the 3 variables;
  • Perform a multiple regression of the IV on the
    mediating variable; and
  • Perform a multiple regression of the IV and
    mediator on the DV (simultaneous inclusion).
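The preparatory steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data are synthetic (not the workshop dataset), the helper name `ols` is hypothetical, and NumPy is used in place of SPSS.

```python
import numpy as np

def ols(y, *predictors):
    """Fit y ~ intercept + predictors; return (coefficients, standard errors)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    dof = len(y) - X.shape[1]
    sigma2 = resid @ resid / dof                  # residual variance
    cov = sigma2 * np.linalg.inv(X.T @ X)         # covariance of the estimates
    return coef, np.sqrt(np.diag(cov))

# Synthetic stand-ins for the three variables (assumed effect sizes).
rng = np.random.default_rng(0)
n = 500
stress = rng.normal(size=n)
rumination = 0.5 * stress + rng.normal(size=n)
depression = 0.3 * stress + 0.3 * rumination + rng.normal(size=n)

# Preparatory step 1: correlation matrix of the 3 variables.
corr = np.corrcoef([stress, rumination, depression])

# Preparatory step 2: IV predicting the mediator gives a and its standard error.
coef1, se1 = ols(rumination, stress)
a, s_a = coef1[1], se1[1]

# Preparatory step 3: IV and mediator entered together predicting the DV.
coef2, se2 = ols(depression, stress, rumination)
b, s_b = coef2[2], se2[2]      # mediator's B and se
c_prime = coef2[1]             # IV's B with the mediator in the equation

print(np.round(corr, 2))
print(f"a = {a:.3f} (se {s_a:.3f}), b = {b:.3f} (se {s_b:.3f}), c' = {c_prime:.3f}")
```

The four values a, s_a, b and s_b are the unstandardised coefficients and standard errors that a Sobel calculation takes as input.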

20
Correlation matrix
21
Results from the two regressions
1st regression (Stress on Rumination):
  B = 7.501 (unstandardised regression coefficient)
  se = .938 (standard error)
2nd regression (Stress, Rumination on Depression):
  You select the B and se for the mediating variable here.
  B = .069
  se = .016
  new beta for Stress = .288
  new beta for Rumination = .317
(P & L web-site needs the first four values.)
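As a worked check, the first four values above can be put through the Sobel formula by hand. The sketch below is a minimal Python version; the function name `sobel_z` is hypothetical, and the two-tailed p-value assumes a standard normal reference distribution.

```python
import math

def sobel_z(a, s_a, b, s_b):
    """Sobel z: a*b divided by sqrt(b^2 * s_a^2 + a^2 * s_b^2)."""
    return (a * b) / math.sqrt(b**2 * s_a**2 + a**2 * s_b**2)

# Values from the two regressions on this slide.
z = sobel_z(a=7.501, s_a=0.938, b=0.069, s_b=0.016)

# Two-tailed p-value under a standard normal distribution (assumption).
p = math.erfc(abs(z) / math.sqrt(2))

print(f"Sobel z = {z:.2f}")   # Sobel z = 3.80
print(f"p = {p:.5f}")
```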
22
Okay, go to the programmes
  • It is necessary to have written down the
    pertinent statistical output, or to have printed
    off the relevant sections.
  • Can do both programmes on the internet.
  • If you're away from the internet you can download
    the Excel macro of MedGraph and run it whenever
    you want.

23
MedGraph output
24
Comparison of web-sites
  • Preacher's site has been around longer, it allows
    variations on the Sobel formula, and gives you an
    alternate way to compute the Sobel's t.
  • My site results in a graphical presentation of
    results, I think it's harder to make mistakes
    with my programme, and it has/will have
    information about the type of mediation.

25
My criteria for type of mediation
  • At present my programme stipulates:
  • None: non-significant Sobel's z-value
  • Partial: significant Sobel's and significant
    basic relationship in the 2nd regression (IV to
    DV)
  • Full: significant Sobel's and non-significant
    basic relationship in the 2nd regression (IV to
    DV)
  • Dave Kenny argues against this (see his
    web-site), and I tend to agree with him now. My
    new approach is on the following page.

26
What kind of mediation?
  • None: non-significant Sobel's z-value
  • Partial: significant Sobel's and ratio < .80
    (ratio is indirect/total; in this case it's
    .161/.449)
  • Full: significant Sobel's and ratio > .80
  • In the present case we have a significant Sobel's
    and ratio = .36. Thus, we have partial mediation.
    Notice that I don't use the term "perfect
    mediation". There is no consensus on the
    partial/full mediation issue.
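The decision rule on this slide can be written out as a small function. This is a sketch only, not the MedGraph code: the function name, the 1.96 critical value (two-tailed .05) and the handling of a ratio of exactly .80 (treated here as partial) are assumptions.

```python
def classify_mediation(sobel_z, indirect, total, cutoff=0.80, z_crit=1.96):
    """Label mediation as 'none', 'partial' or 'full' from Sobel z and indirect/total."""
    if abs(sobel_z) < z_crit:          # non-significant Sobel z-value
        return "none"
    ratio = indirect / total
    # The slide leaves a ratio of exactly .80 undefined; it is treated as partial here.
    return "full" if ratio > cutoff else "partial"

# Present case: indirect = .161, total = .449, Sobel z = 3.80.
ratio = 0.161 / 0.449
label = classify_mediation(sobel_z=3.80, indirect=0.161, total=0.449)

print(f"ratio = {ratio:.2f}")   # ratio = 0.36
print(label)                    # partial
```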

27
Causal finding?
  • Many researchers would be keen to argue from this
    result that the experience of stress leads to
    rumination, which in turn partially leads to
    depressive symptoms, i.e., a causal argument. Is
    this merited?
  • Cole and Maxwell (2003) argue strenuously that
    concurrent mediation CANNOT support a causal
    statement. They argue that few concurrent
    mediation results actually turn out to hold up in
    longitudinal data. What do they mean?

28
Shared and unique variance
[Venn diagram: Stress and Depressive symptoms]
Basic relationship is just a correlation between
two variables.
29
Three variables: mediation
[Venn diagram: Stress, Depressive symptoms, and
Rumination, with the direct effect and indirect
effect areas labelled]
The green area indicates the degree of shared
variance among the three variables; that's the
size of the indirect effect. It is hard to
argue that these relationships are causal with
these data; they are the size of shared and
unique variance.
30
Warnings!
  • One must have all three correlations be
    significant before launching this. Kenny now
    suggests that the 1st one may be optional.
  • Be sure that you do the regressions correctly,
    and that you are taking the proper statistical
    information from the print-outs (B vs. β).
  • Some people make causal arguments from these
    results. They are shaky at best.
  • Types of specification error: 1) ordering of
    variables, 2) variables with/without error, and
    3) third variable problem
  • Longitudinal data are best.
  • Bootstrapping is best with small N samples.
  • Path models involving more than three variables
    are the general case; don't do a bunch of
    three-variable mediation analyses when you can do
    one path model.

31
Specification error
  • Major boogeyman in path model analytic work: have
    you correctly specified your model?
  • Several issues here:
  • Temporal order of variables
  • Variables measured with error
  • Missing variable?

32
Why is your proposed model the best?
[Diagram: Stress intensity, Rumination, and Depressive symptoms]
There are exactly 6 combinations of any three
variables; why is your proposed model the best?
Why not test all of them? I have, and in the
present case I find six instances of partial
mediation. Which is correct? They all tell us
something useful about shared and unique variance.
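The six combinations are simply the 3! orderings of the variables into the IV, MV and DV roles. A short Python sketch that lists them (the variable labels are taken from the slide; nothing is estimated here):

```python
from itertools import permutations

variables = ("Stress intensity", "Rumination", "Depressive symptoms")

# Each ordering assigns the roles IV -> MV -> DV.
models = list(permutations(variables))

for iv, mv, dv in models:
    print(f"{iv} -> {mv} -> {dv}")

print(len(models))  # 6
```

Each of these orderings would need its own pair of regressions to be tested.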
33
Variables measured with error
  • One can obtain biased estimates of the indirect
    effect if the MV is measured with significant
    error. (Same is true of the IV and DV too, by the
    way.)
  • Answer? Do mediation in a latent variable path
    model in SEM. Possible but not easy. Also, a lot
    of the time one doesn't have a sufficient N or
    multiple indicators of the variables (3
    indicators per variable). Would look like this:

34
Latent variable path model
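[Path diagram: Stress intensity to Depression = .30
(.20 with Rumination in the model); Stress
intensity to Rumination = .40; Rumination to
Depression = .24]
Indirect effect = .10; direct effect = .20; ratio
= .33 (.36 in MR)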
35
Missing variable?
  • This is the old third variable problem, but in
    this case we might wish to call it the fourth
    variable problem.
  • My student, Kirsty Weir, suggests that
    anxiety/worry might explain the relationship
    between rumination and depression. Graph is on
    the following page.
  • One can never completely resolve this question;
    include the likely candidates and try to reject
    them.

36
The road from stress to depression
Note that the Rum to Dep path was removed because
it was non-significant when we added the 4th
variable (control). Is the 3-variable mediation
pattern wrong then?
37
Bootstrapping
  • David MacKinnon and others have argued that
    typical multiple regression analysis is unbiased
    only for large samples. (present case: N = 575)
  • They suggest:
  • Large sample: use MR
  • Small sample: use bootstrapping
  • What is bootstrapping?

38
Wave of the future
  • Bootstrapping is a compilation of regression
    results from many resamples of the original
    dataset.
  • The programme draws a random resample of the data
    (in the standard bootstrap, N cases drawn with
    replacement from the original N), runs the
    regression analysis, stores the result, does it
    again and again up to a predetermined number of
    times, and then compiles the results of the
    repeated analyses.
  • Baron & Kenny didn't mention this; it wasn't used
    in 1986 very much at all. It is performed now, but
    infrequently. It is the wave of the future.
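The resample-and-repeat loop described above can be sketched as follows. This is an illustration under assumptions, not the Preacher and Hayes macro: the data are synthetic, the function names are hypothetical, each resample draws N cases with replacement, and the confidence interval uses the simple percentile method.

```python
import numpy as np

def indirect_effect(x, m, y):
    """a*b: slope of m on x, times the slope of m in y ~ x + m."""
    a = np.polyfit(x, m, 1)[0]
    X = np.column_stack([np.ones(len(y)), x, m])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return a * coef[2]

def bootstrap_indirect(x, m, y, n_boot=2000, seed=0):
    """Return (mean, lower, upper) of the bootstrapped indirect effect (95% percentile CI)."""
    rng = np.random.default_rng(seed)
    n = len(x)
    estimates = np.empty(n_boot)
    for i in range(n_boot):
        idx = rng.integers(0, n, size=n)      # resample cases with replacement
        estimates[i] = indirect_effect(x[idx], m[idx], y[idx])
    lower, upper = np.percentile(estimates, [2.5, 97.5])
    return estimates.mean(), lower, upper

# Synthetic data with an assumed true indirect effect of 0.6 * 0.5 = 0.3.
rng = np.random.default_rng(42)
n = 300
x = rng.normal(size=n)
m = 0.6 * x + rng.normal(size=n)
y = 0.2 * x + 0.5 * m + rng.normal(size=n)

mean, lower, upper = bootstrap_indirect(x, m, y)
print(f"indirect effect = {mean:.3f}, 95% CI [{lower:.3f}, {upper:.3f}]")
```

If the interval excludes zero, the indirect effect is taken to be significant without relying on the normality assumption behind the Sobel z.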

39
So how does one do this?
  • If you toddle off to SPSS to do this, you will be
    disappointed. Although it can perform
    bootstrapping, it is not set up to do mediation
    bootstrapping.
  • Preacher and Hayes (see the Preacher web-site on
    mediation) offer two different macros: SAS and
    SPSS. Download it and use it within SPSS. (not
    easy)
  • Let's look at the results of the SPSS macro.

40
Macro output
Run MATRIX procedure

DIRECT AND TOTAL EFFECTS
            Coeff     s.e.        t  Sig(two)
b(YX)       .3934    .0288  13.6685     .0000
b(MX)      1.0412    .0691  15.0779     .0000
b(YM.X)     .1369    .0165   8.3200     .0000
b(YX.M)     .2508    .0322   7.8002     .0000

INDIRECT EFFECT AND SIGNIFICANCE USING NORMAL DISTRIBUTION
            Value     s.e.  LL 95 CI  UL 95 CI        Z  Sig(two)
Sobel       .1426    .0196     .1042     .1810   7.2723     .0000

BOOTSTRAP RESULTS FOR INDIRECT EFFECT
             Mean     s.e.  LL 95 CI  UL 95 CI  LL 99 CI  UL 99 CI
Effect      .1434    .0239     .1001     .1939     .0879     .2113

SAMPLE SIZE: 575
NUMBER OF BOOTSTRAP RESAMPLES: 2000

It's telling us that the indirect effect was
significant; this agrees with the multiple regression
result, but this is an unbiased estimate. (z =
3.80 before)
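The "normal distribution" part of the macro output can be reproduced by hand from the four coefficients it reports. The sketch below only re-does that arithmetic in Python (variable names are hypothetical); small differences from the printed values come from the output being rounded to four decimals.

```python
import math

# Coefficients and standard errors as printed in the macro output.
b_yx = 0.3934                      # total effect of X on Y
b_mx, s_mx = 1.0412, 0.0691        # a: X to M
b_ym_x, s_ym_x = 0.1369, 0.0165    # b: M to Y, controlling X
b_yx_m = 0.2508                    # direct effect of X on Y, controlling M

indirect_product = b_mx * b_ym_x          # a * b
indirect_difference = b_yx - b_yx_m       # total minus direct

# Sobel standard error and z for the indirect effect.
sobel_se = math.sqrt(b_ym_x**2 * s_mx**2 + b_mx**2 * s_ym_x**2)
sobel_z = indirect_product / sobel_se

print(f"a*b       = {indirect_product:.4f}")
print(f"c - c'    = {indirect_difference:.4f}")
print(f"Sobel se  = {sobel_se:.4f}")
print(f"Sobel z   = {sobel_z:.2f}")
```

Both routes to the indirect effect agree with the reported value of .1426 to within rounding.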
41
Mediation with longitudinal data
  • . . . is very complicated but is very
    illuminating.
  • Much of structural equation modelling (SEM) is
    devoted to trying to understand mediational
    models.
  • Path modelling with longitudinal data is hard to
    do but will generate very interesting and
    interpretable results.
  • One should obtain the same variables at different
    times of measurement to allow residualisation.

42
Hierarchical multiple regression
[Diagram: Dep at Time 1 to Dep at Time 2 (1st
step); Rum at Time 1 to Dep at Time 2 (2nd step)]
This is N-H's hypothesis: Rum1 should explain
unique variance in Dep2 after Dep1 is entered,
i.e., explaining new variance in the residual.
43
Back to Venn diagrams, but with a difference
[Venn diagram: Dep1 and Dep2]
Stability coefficient typically medium to large.
The purple area is the residual variance. It
represents the change in depression over this
time period. The overlapping area refers to the
stability of depression over this time period.
44
Does Rum1 predict any of the residual?
[Venn diagram: Dep1, Dep2, and Rum1]
The red area is the amount of variance in Dep2
explained by Rum1, i.e., the degree to which Rum1
explains change in depression over time.
45
So what's the answer?
  • Perform a hierarchical regression
  • IVs: Dep1 (entered first), then Rum1; DV: Dep2
  • I found that N-H's hypothesis was not supported:
    Rum1 did not explain any of the residual of Dep2
    after Dep1 was entered.

[Coefficients shown on slide: Dep1 to Dep2 = .72; Rum1 to Dep2 = .05 (ns)]
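The two-step logic can be sketched as a change-in-R-squared calculation. This is an illustration only: the data are synthetic and deliberately built so that Rum1 has no unique effect on Dep2, and the helper name `r_squared` is hypothetical.

```python
import numpy as np

def r_squared(y, *predictors):
    """R-squared for y ~ intercept + predictors (ordinary least squares)."""
    X = np.column_stack([np.ones(len(y)), *predictors])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return 1.0 - resid.var() / y.var()

# Synthetic data: Dep1 and Rum1 are correlated, but only Dep1 drives Dep2.
rng = np.random.default_rng(1)
n = 400
dep1 = rng.normal(size=n)
rum1 = 0.5 * dep1 + rng.normal(size=n)
dep2 = 0.7 * dep1 + rng.normal(scale=0.7, size=n)

r2_step1 = r_squared(dep2, dep1)          # 1st step: stability of depression
r2_step2 = r_squared(dep2, dep1, rum1)    # 2nd step: add Rum1
delta_r2 = r2_step2 - r2_step1            # variance in the residual explained by Rum1

print(f"R2 step 1 = {r2_step1:.3f}")
print(f"R2 step 2 = {r2_step2:.3f}")
print(f"delta R2  = {delta_r2:.4f}")
```

A delta R2 near zero corresponds to the pattern reported on this slide: the second-step predictor adds little beyond the Time 1 score.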
46
This is what it looks like
[Venn diagram: Dep1, Dep2, and Rum1]
Although Dep1 and Rum1 are significantly
correlated, Rum1 doesn't explain much new
variance in Dep2 above and beyond what Dep1 can
do.
47
The other direction
  • IVs: Rum1 and Dep1; DV: Rum2
  • This result suggests that depression may
    contribute to rumination over a 3-month period of
    time, but not the other way around.
  • It is recommended that you perform a path
    analysis in SEM for this type of analysis; it
    allows for concurrent correlation (see next page).

[Coefficients shown on slide: Rum1 to Rum2 = .64; Dep1 to Rum2 = .08]
48
Two time points
[Path diagram: Rum and Dep at Time 1; Rum and Dep at Time 2]
SEM computes all of these relationships
simultaneously, allowing one to identify the
unique relationships. Enact in LISREL, EQS, AMOS,
etc. What did I find?
49
Same basic results
[Path diagram, Time 1 to Time 2: Rum to Rum = .63;
Dep to Dep = .74; other coefficients shown: .08,
.47, .43]
But you get model fit indices, modification
indices, and so forth . . . I deleted the Rum1 to
Dep2 path because it was non-significant.
50
Three time points and three variables
[Path diagram: Stress, Rum., and Dep. each measured
at Time 1, Time 2, and Time 3; candidate paths
labelled "my hypoth", "N-H", "MR", and "?"]
51
SEM yielded this result
[Path diagram: Stress, Rum., and Dep. each measured
at three time points; coefficients shown: .74, .59,
.11, .61, .59, .08, .72, .51]
52
Powerful but hard to do
  • Need to have three times of measurement
    reasonably spread out so that stability
    coefficients are not too high.
  • Need to have good measures (small measurement
    error) or do latent variable longitudinal path
    modelling.
  • This type of test of mediation is very stringent
    because it occurs over time and must be strong
    enough to exist against the backdrop of the
    stability coefficients, i.e., these residualised
    effects explain change in other variables.

53
Back to types of mediation
  • Why do I think in terms of null, partial, and
    full mediation?
  • Because SEM-based path models yield those three
    possible patterns.
  • Sociological point: basic mediation (e.g., B & K)
    is rooted in multiple regression where issues of
    model specification are not salient. On the other
    hand, if you learn SEM, then you will think in
    terms like I've enunciated above. Confusions
    occur because of the anachronisms in the field of
    mediation (harkening back to MR rather than
    embracing path modelling).

54
Let's bring mediation to a close
  • I've covered many powerful techniques that derive
    from the basic mediation paradigm.
  • Remaining issues:
  • Logistic mediation
  • Mediation in other contexts: HLM
  • Still much to learn and master, but this is a
    good start.