Meta-analysis - PowerPoint PPT Presentation





NCRM Research Methods Festival, University of Oxford, Department of Education


Title: Meta-analysis

  • NCRM Research Methods Festival
  • University of Oxford

Department of Education
Today's content
  • What is meta-analysis,
  • when and why we use meta-analysis,
  • Examples of meta-analyses
  • benefits and pitfalls of using meta-analysis,
  • defining a population of studies and finding
  • coding materials,
  • inter-rater reliability,
  • computing effect sizes,
  • structuring a database, and
  • a conceptual introduction to analysis and
    interpretation of results based on fixed effects,
    random effects, and multilevel models.

Why a course on meta-analysis?
  • Meta-analysis is an increasingly popular tool for
    summarising research findings
  • Cited extensively in research literature
  • Relied upon by policymakers
  • Important that we understand the method, whether
    we conduct or simply consume meta-analytic
    research
  • Should be one of the topics covered in all
    introductory research methodology courses

  • What is meta-analysis
  • When and why we use meta-analysis

What is meta-analysis?
  • Systematic synthesis of various studies on a
    particular research question
  • Do boys or girls have higher self-concepts?
  • Collect all studies relevant to a topic
  • Find all published journal articles on the topic
  • An effect size is calculated for each outcome
  • Determine the size/direction of gender difference
    for each study
  • Content analysis
  • Code characteristics of the study: age, setting,
    ethnicity, self-concept domain (math, physical,
    social), etc.
  • Effect sizes with similar features are grouped
    together and compared; this tests moderator
    variables
  • Do gender differences vary with age, setting,
    ethnicity, self-concept domain, etc.?

A blend of qualitative and quantitative approaches
  • Coding: the process of extracting the information
    from the literature included in the
    meta-analysis. Involves noting the
    characteristics of the studies in relation to a
    priori variables of interest (qualitative)
  • Effect size: the numerical outcome to be analysed
    in a meta-analysis; a summary statistic of the
    data in each study included in the meta-analysis
  • Summarise effect sizes: central tendency,
    variability, relations to study characteristics

Abridged history
When and why we use meta-analysis
  • One of the primary aims is to reach a conclusion
    related to the magnitude of the effect on a
    specific sample inferred to the population
  • Meta-analysis can test if the studies' outcomes
    show more variation than the variation that is
    expected because of sampling different research
  • In such cases, study characteristics (e.g., the
    measurement instrument used, population sampled,
    or aspects of the study's design) are coded.
    These characteristics are then used as predictor
    variables to analyze the excess variation in the
    effect sizes

What disciplines do meta-analysis? ISI search, 10 Feb 2008.
Topic: meta-analysis. [Results figure not reproduced]
Meta-analysis examples
Psychology: Where it all began
  • Amato, P. R., & Keith, B. (1991). Parental
    divorce and the well-being of children: A
    meta-analysis. Psychological Bulletin, 110,
    26-46. Times Cited: 471
  • Linn, M. C., & Petersen, A. C. (1985). Emergence
    and characterization of sex differences in
    spatial ability: A meta-analysis. Child
    Development, 56, 1479-1498. Times Cited: 570
  • Johnson, D. W., et al. (1981). Effects of
    cooperative, competitive, and individualistic
    goal structures on achievement: A meta-analysis.
    Psychological Bulletin, 89, 47-62. Times Cited
  • Tett, R. P., Jackson, D. N., & Rothstein, M.
    (1991). Personality measures as predictors of job
    performance: A meta-analytic review. Personnel
    Psychology, 44, 703-742. Times Cited: 387
  • Hyde, J. S., & Linn, M. C. (1988). Gender
    differences in verbal ability: A meta-analysis.
    Psychological Bulletin, 104, 53-69. Times Cited
  • Iaffaldano, M. T., & Muchinsky, P. M. (1985). Job
    satisfaction and job performance: A meta-analysis.
    Psychological Bulletin, 97, 251-273. Times
    Cited: 263

Education: Widely Cited Meta-analyses
  • De Wolff, M., & van IJzendoorn, M. H. (1997).
    Sensitivity and attachment: A meta-analysis on
    parental antecedents of infant attachment. Child
    Development, 68, 571-591. Times Cited: 340
  • Wellman, H. M., Cross, D., & Watson, J. (2001).
    Meta-analysis of theory-of-mind development: The
    truth about false belief. Child Development, 72,
    655-684. Times Cited: 276
  • Cohen, E. G. (1994). Restructuring the classroom:
    Conditions for productive small groups. Review
    of Educational Research, 64, 1-35. Times Cited
  • Hansen, W. B. (1992). School-based substance
    abuse prevention: A review of the state of the
    art in curriculum, 1980-1990. Health Education
    Research, 7, 403-430. Times Cited: 207
  • Kulik, J. A., Kulik, C.-L., & Cohen, P. A. (1980).
    Effectiveness of computer-based college teaching:
    A meta-analysis of findings. Review of
    Educational Research, 50, 525-544. Times Cited

Business/Management: Widely Cited Meta-analyses
  • Sheppard, B. H., Hartwick, J., & Warshaw, P. R.
    (1988). The theory of reasoned action: A
    meta-analysis of past research with
    recommendations for modifications and future
    research. Journal of Consumer Research, 15,
    325-343. Times Cited: 515
  • Jackson, S. E., & Schuler, R. S. (1985). A
    meta-analysis and conceptual critique of research
    on role ambiguity and role conflict in work
    settings. Organizational Behavior and Human
    Decision Processes, 36, 16-78. Times Cited: 401
  • Tornatzky, L. G., & Klein, K. J. (1994). Innovation
    characteristics and innovation
    adoption-implementation: A meta-analysis of
    findings. IEEE Transactions on Engineering
    Management, 29, 28-4. Times Cited: 269
  • Lowe, K. B., Kroeck, K. G., & Sivasubramaniam, N.
    (1996). Effectiveness correlates of
    transformational and transactional leadership: A
    meta-analytic review of the MLQ literature.
    Leadership Quarterly, 7, 385-425. Times Cited: 203
  • Churchill, G. A., Ford, N. M., Hartley, S. W.,
    et al. (1985). The determinants of salesperson
    performance: A meta-analysis. Journal of
    Marketing Research, 22, 103-118. Times Cited

Most Widely Cited Meta-analyses are in Medicine
  • Jadad, A. R., Moore, R. A., Carroll, D., et al.
    (1996). Assessing the quality of reports of
    randomized clinical trials: Is blinding necessary?
    Controlled Clinical Trials, 17, 1-12. Times
  • Boushey, C. J., Beresford, S. A. A., Omenn, G. S.,
    et al. (1995). A quantitative assessment of plasma
    homocysteine as a risk factor for vascular
    disease: Probable benefits of increasing folic
    acid intakes. JAMA, 274, 1049-1057. Times
    Cited: 2,128
  • Alberti, W., Anderson, G., Bartolucci, A., et al.
    (1995). Chemotherapy in non-small cell lung
    cancer: A meta-analysis using updated data on
    individual patients from 52 randomized clinical
    trials. British Medical Journal, 311, 899-909.
    Times Cited: 1,591
  • Block, G., Patterson, B., & Subar, A. (1992).
    Fruit, vegetables, and cancer prevention: A review
    of the epidemiologic evidence. Nutrition and
    Cancer, 18, 1-29. Times Cited: 1,422

Cohen, P. A. (1980). Effectiveness of
student-rating feedback for improving college
instruction: A meta-analysis. Research in Higher
Education, 13, 321-341.
  • Question: Does feedback from university students'
    evaluations of teaching lead to improved
    teaching?
  • Teachers are randomly assigned to experimental
    (feedback) and control (no feedback) groups
  • Feedback group gets ratings, augmented, perhaps,
    with personal consultation
  • Groups are compared on subsequent ratings and,
    perhaps, other variables
  • Feedback teachers improved their teaching
    effectiveness by .3 standard deviations compared
    to control teachers on the Overall Rating item;
    even larger differences for ratings of Instructor
    Skill, Attitude Toward Subject, Student Feedback
  • Studies that augmented feedback with consultation
    produced substantially larger differences, but
    other methodological variations had little
    effect
Hattie, J., & Marsh, H. W. (1996). The
relationship between research and teaching: A
meta-analysis. Review of Educational Research,
66, 507-542.
  • Question: What is the correlation between
    university teaching effectiveness and research
    productivity?
  • Based on 58 studies and 498 correlations
  • The mean correlation between measures of teaching
    effectiveness (mostly based on SETs) and research
    productivity was .06
  • This near-zero correlation was consistent across
    different disciplines, types of university,
    indicators of research, and components of
    teaching effectiveness.
  • This meta-analysis was followed by the Marsh &
    Hattie (2002) primary data study to more fully
    evaluate the theoretical model

O'Mara, A. J., Marsh, H. W., Craven, R. G., &
Debus, R. (2006). Do self-concept interventions
make a difference? A synergistic blend of
construct validation and meta-analysis.
Educational Psychologist, 41, 181-206.
  • Contention about global self-esteem versus
    multidimensional, domain-specific self-concept
  • Traditional reviews and previous meta-analyses of
    self-concept interventions have underestimated
    effect sizes by using an implicitly
    unidimensional perspective that emphasizes global
  • We used meta-analysis and a multidimensional
    construct validation approach to evaluate the
    impact of self-concept interventions for children
    in 145 primary studies (200 interventions).
  • Overall, interventions were significantly
    effective (d = .51, 460 effect sizes).
  • However, in support of the multidimensional
    perspective, interventions targeting a specific
    self-concept domain and subsequently measuring
    that domain were much more effective (d = 1.16).
  • This supports a multidimensional perspective of

Hanson, R. K., & Morton-Bourgon, K. E. (2005). The
characteristics of persistent sexual offenders: A
meta-analysis of recidivism studies. Journal of
Consulting and Clinical Psychology, 73, 1154-1163.
  • Examined predictors of sexual, nonsexual violent,
    and general (any) recidivism
  • 82 recidivism studies
  • Identified deviant sexual preferences and
    antisocial orientation as the major predictors of
    sexual recidivism for both adult and adolescent
    sexual offenders. Antisocial orientation was the
    major predictor of violent recidivism and general
    (any) recidivism
  • Concluded that many of the variables commonly
    addressed in sex offender treatment programs
    (e.g., psychological distress, denial of sex
    crime, victim empathy, stated motivation for
    treatment) had little or no relationship with
    sexual or violent recidivism

Bazzano, L. A., Reynolds, K., Holder, K. N., &
He, J. (2006). Effect of folic acid
supplementation on risk of cardiovascular
diseases: A meta-analysis of randomized
controlled trials. JAMA, 296, 2720-2726.
  • Epidemiologic studies have suggested that folate
    intake decreases risk of cardiovascular diseases.
    However, the results of randomized controlled
    trials on dietary supplementation with folic acid
    to date have been inconsistent
  • Included 12 studies with randomised control
  • The overall relative risks (95% confidence
    intervals) of outcomes for patients treated with
    folic acid supplementation compared with controls
    were 0.95 (0.88-1.03) for cardiovascular
    diseases, 1.04 (0.92-1.17) for coronary heart
    disease, 0.86 (0.71-1.04) for stroke, and 0.96
    (0.88-1.04) for all-cause mortality.
  • Concluded folic acid supplementation does not
    reduce risk of cardiovascular diseases or
    all-cause mortality among participants with prior
    history of vascular disease.

Fiske, P., Rintamaki, P. T., & Karvonen, E. (1998).
Mating success in lekking males: A meta-analysis.
Behavioral Ecology, 9, 328-338.
  • In lekking species (those that gather for
    competitive mating), a male's mating success can
    be estimated as the number of females that he
    copulates with.
  • Aim of the study was to find predictors of
    mating success in lekking species through
    analysis of 48 studies
  • Behavioural traits such as male display activity,
    aggression rate, and lek attendance were
    positively correlated with male mating success.
    The size of "extravagant" traits, such as birds'
    tails and ungulate antlers, and age were also
    positively correlated with male mating success.
  • Territory position was negatively correlated with
    male mating success, such that males with
    territories close to the geometric centre of the
    leks had higher mating success than other males.
  • Male morphology (measure of body size) and
    territory size showed small effects on male
    mating success.

Benefits and pitfalls of using meta-analysis
Benefits of meta-analysis
  • Compared to traditional literature reviews
  • (1) there is a definite methodology employed in
    the research analysis and 
  • (2) the results of the included studies are
    quantified to a standard metric thus allowing for
    statistical techniques for further analysis.
  • Therefore less biased and more replicable
  • Able to establish generalisability across many
    studies (and study characteristics).

Benefits of meta-analysis
  • Analyzing the results from a group of studies can
    allow more accurate data analysis
  • Increased power
  • Enhanced precision due to averaging out the
    sampling error deviations from the true values
  • Also, provides corrections to mean values with
    distortions due to measurement error and other
    possible artefacts

Publication bias
  • Studies that are published are more likely to
    report statistically significant findings. This
    is a source of potential bias.
  • The debate about using only published studies:
  • peer-reviewed studies are presumably of a higher
    quality
  • significant findings are more likely to be
    published than non-significant findings
  • There is no agreed-upon solution. However, one
    should retrieve all studies that meet the
    eligibility criteria, and be explicit about how
    publication bias was dealt with. Some methods
    for dealing with publication bias have been
    developed (e.g., Fail-safe N, Trim and Fill)
Study quality
  • Increasingly, meta-analysts evaluate the quality
    of each study included in a meta-analysis.
  • Sometimes this is a global holistic (subjective)
    rating. In this case it is important to have
    multiple raters to establish inter-rater
    agreement (more on this later).
  • Sometimes study quality is quantified in relation
    to objective criteria of a good study, e.g.
  • larger sample sizes
  • more representative samples
  • better measures
  • use of random assignment
  • appropriate control for potential bias
  • double blinding, and
  • low attrition rates (particularly for
    longitudinal studies)

Study quality: Does it make a difference?
  • Meta-analyses should always include subjective
    and/or objective indicators of study quality.
  • In the social sciences there is some evidence that
    studies with highly inadequate control for
    pre-existing differences lead to inflated effect
    sizes. However, it is surprising that other
    indicators of study quality make so little
    difference.
  • In medical research, studies are largely limited
    to RCTs, where there is MUCH more control than in
    social science research. Here there is evidence
    that inadequate concealment of assignment and
    lack of double-blinding inflate effect sizes, but
    perhaps only for subjective outcomes.
  • These issues are likely to be idiosyncratic to
    individual discipline areas and research
Conducting a meta-analysis
  • Defining a population of studies and finding
  • Coding materials
  • Inter-rater reliability
  • Computing effect sizes
  • Structuring a database

Steps in a meta-analysis
Establish research question
  • Comparison of treatment and control groups?
  • What is the effectiveness of a reading skills
    program for the treatment group compared to an
    inactive control group?
  • Pretest-posttest differences?
  • Is there a change in motivation over time?
  • What is the correlation between two variables?
  • What is the relation between teaching
    effectiveness and research productivity?
  • Moderators of an outcome?
  • Does gender moderate the effect of a
    peer-tutoring program on academic achievement?

Establish research question
  • Do you wish to generalise your findings to other
    studies not in the sample?
  • Do you have multiple outcomes per study? E.g.,
  • achievement in different school subjects
  • 5 different personality scales
  • multiple criteria of success
  • Such questions determine the choice of
    meta-analytic model
  • fixed effects
  • random effects
  • multilevel

Defining a population of studies and finding
  • Need to have explicit inclusion and exclusion
    criteria
  • The broader the research domain, the more
    detailed they tend to become
  • Refine criteria as you interact with the
  • Components of detailed criteria
  • distinguishing features
  • research respondents
  • key variables
  • research methods
  • cultural and linguistic range
  • time frame
  • publication types

Locate and collate studies
  • Search electronic databases (e.g., ISI,
    Psychological Abstracts, Expanded Academic ASAP,
    Social Sciences Index, PsycINFO, and ERIC)
  • Examine the reference lists of included studies
    to find other relevant studies
  • If including unpublished data, email researchers
    in your discipline, take advantage of Listservs,
    and search Dissertation Abstracts International

Locate and collate studies
  • Inclusion process usually requires several steps
    to cull inappropriate studies
  • Example from Bazzano, L. A., Reynolds, K.,
    Holder, K. N., & He, J. (2006). Effect of folic
    acid supplementation on risk of cardiovascular
    diseases: A meta-analysis of randomized
    controlled trials. JAMA, 296, 2720-2726.

Develop code materials
Code Sheet
Code Book/manual
  • __ Study ID
  • _ _ Year of publication
  • __ Publication type (1-5)
  • __ Geographical region (1-7)
  • _ _ _ _ Total sample size
  • _ _ _ Total number of males
  • _ _ _ Total number of females

Pilot coding
  • Random selection of papers coded by both coders
  • Meet to compare code sheets
  • Where there is discrepancy, discuss to reach
  • Amend code materials/definitions in code book if
  • May need to do several rounds of piloting, each
    time using different papers

Interrater reliability
  • Percent agreement: common but not recommended
  • Cohen's kappa coefficient
  • Kappa is the proportion of the optimum
    improvement over chance attained by the coders,
    where a value of 1 indicates perfect agreement
    and a value of 0 indicates that agreement is no
    better than that expected by chance
  • Kappas over .40 are considered to be a moderate
    level of agreement (but no clear basis for this
  • Correlation between different raters
  • Intraclass correlation: agreement among multiple
    raters, corrected for number of raters using the
    Spearman-Brown formula (r)
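
To make the contrast between percent agreement and kappa concrete, here is a minimal Python sketch for two coders assigning categorical codes to the same items. The function name and the example codes are hypothetical.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of agreement (this is the 'percent agreement')
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("Kappa is undefined when both raters use a single code")
    # Improvement over chance, as a proportion of the maximum possible improvement
    return (p_observed - p_chance) / (1 - p_chance)


# Hypothetical codes for ten studies (e.g., publication type 1 or 2)
coder_1 = [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]
coder_2 = [1, 1, 1, 1, 2, 2, 2, 2, 2, 1]
print(cohens_kappa(coder_1, coder_2))
```

In this example the coders agree on 8 of 10 items (80% agreement), but half of that agreement would be expected by chance, so kappa comes out noticeably lower than the raw percentage.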

Exercise 1a
  • The purpose of this exercise is to explore
    various issues of meta-analytic methodology
  • Discuss in groups of 3-4 people the following
    issues in relation to the gender differences in
    smiling study (LaFrance et al., 2003)
  • Did the aims of the study justify conducting a
  • Was selection criteria and the search process
  • How did they deal with interrater (coder)

Ex. 1a discussion points
  1. Extend previous meta-analyses, include previously
    untested moderators based on theory/empirical
  2. Search process: detailed databases and 5 other
    sources of studies, search terms. Selection
    criteria: justification provided (e.g., for
    excluding under the age of 13). However, not
    clear how many studies were retrieved and then
    eventually included (compare with flow chart on
    slide 51)
  3. Multiple coders (group of coders consisted of
    four people with two raters of each sex coding
    each moderator). Interrater reliability was
    calculated by taking the aggregate reliability of
    the four coders at each time using the
    Spearman-Brown formula

Effect size calculation
  • The effect size makes meta-analysis possible
  • It is based on the dependent variable (i.e.,
    the outcome)
  • It standardizes findings across studies such that
    they can be directly compared
  • Any standardized index can be an effect size
    (e.g., standardized mean difference, correlation
    coefficient, odds-ratio), but must
  • be comparable across studies (standardization)
  • represent magnitude and direction of the relation
  • be independent of sample size

Effect size calculation
Means and standard deviations
Effect sizes
  • Lipsey & Wilson (2001) present many formulae for
    calculating effect sizes from different
  • However, need to convert all effect sizes into a
    common metric, typically based on the natural
    metric given research in the area. E.g.
  • Standardized mean difference
  • Odds-ratio
  • Correlation coefficient

Effect size calculation
  • Standardized mean difference
  • Group contrast research
  • Treatment groups
  • Naturally occurring groups
  • Inherently continuous construct
  • Odds-ratio
  • Group contrast research
  • Treatment groups
  • Naturally occurring groups
  • Inherently dichotomous construct
  • Correlation coefficient
  • Association between variables research

Sample size, significance, effect size
Effect size calculation
  • Represents a standardized group contrast on an
    inherently continuous measure
  • Uses the pooled standard deviation (some
    situations use control group standard deviation)
  • Commonly called d

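In an intervention study with experimental and
control groups, the effect size might be
$d = (\bar{X}_{\text{experimental}} - \bar{X}_{\text{control}}) / SD_{\text{pooled}}$
In a gender difference study, the effect size
might be
$d = (\bar{X}_{\text{males}} - \bar{X}_{\text{females}}) / SD_{\text{pooled}}$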
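
A standardized mean difference using the pooled standard deviation can be sketched as follows. This is a minimal illustration; the function names and the example means, SDs, and group sizes are hypothetical.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def standardized_mean_difference(mean1, sd1, n1, mean2, sd2, n2):
    """d = (mean of group 1 - mean of group 2) / pooled SD."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical intervention study: experimental vs. control group
d = standardized_mean_difference(105.0, 10.0, 30, 100.0, 10.0, 30)
print(d)
```

The sign of d depends only on which group is entered first, so the order must be kept the same for every study in the database.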
Effect size calculation
  • Represents the strength of association between
    two inherently continuous measures
  • Generally reported directly as r (the Pearson
    product moment coefficient)

r to d, d to r
Alternatively, transform rs into Fisher's
Zr-transformed rs, which are more normally
distributed
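
The conversions named on this slide can be sketched in Python. The r-to-d and d-to-r formulas below are the common versions that assume two groups of equal size (an assumption; the slide's own formulas are not in the transcript). Fisher's transformation is the inverse hyperbolic tangent of r. Function names are hypothetical.

```python
import math


def r_to_d(r):
    """Convert a correlation to a standardized mean difference (equal-n assumption)."""
    return 2 * r / math.sqrt(1 - r ** 2)


def d_to_r(d):
    """Convert a standardized mean difference to a correlation (equal-n assumption)."""
    return d / math.sqrt(d ** 2 + 4)


def fisher_z(r):
    """Fisher's Zr transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def fisher_z_variance(n):
    """Sampling variance of Zr for a correlation based on n participants."""
    return 1.0 / (n - 3)


def z_to_r(z):
    """Back-transform a (mean) Zr to the r metric for reporting."""
    return math.tanh(z)


print(r_to_d(0.6), d_to_r(1.5), fisher_z(0.5))
```

Analyses are typically run on the Zr values, and the pooled result is converted back to r at the end.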
Effect size calculation
  • The odds-ratio is based on a 2 by 2 contingency
    table
  • The Odds-Ratio is the odds of success in the
    treatment group relative to the odds of success
    in the control group
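
A minimal sketch of the odds-ratio from a 2 by 2 table follows. The cell layout and names are assumptions made for the example. The log odds-ratio and its variance are included because analyses are usually carried out on the log scale; that step is an addition not shown on the slide.

```python
import math


def odds_ratio(treat_success, treat_failure, control_success, control_failure):
    """Odds of success in the treatment group relative to the control group."""
    cells = (treat_success, treat_failure, control_success, control_failure)
    if min(cells) <= 0:
        raise ValueError("All four cells must be positive for this simple sketch")
    return (treat_success / treat_failure) / (control_success / control_failure)


def log_odds_ratio_variance(treat_success, treat_failure, control_success, control_failure):
    """Sampling variance of the log odds-ratio: sum of reciprocal cell counts."""
    return (1 / treat_success + 1 / treat_failure
            + 1 / control_success + 1 / control_failure)


# Hypothetical trial: 30 of 50 succeed with treatment, 15 of 50 in control
ratio = odds_ratio(30, 20, 15, 35)
print(ratio, math.log(ratio), log_odds_ratio_variance(30, 20, 15, 35))
```

An odds-ratio of 1 means no difference between groups; values above 1 favour the treatment group when "success" is the desirable outcome.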

Correction for bias
  • Hedges proposed a correction for small sample
    size bias (n < 20)
  • Must be applied before analysis
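
The small-sample correction can be sketched as below, using the commonly cited approximation in which d is multiplied by 1 - 3 / (4N - 9), with N the total sample size across both groups. Treat this as an assumption about which form of the correction the slide had in mind; the function name is hypothetical.

```python
def hedges_correction(d, n1, n2):
    """Apply the small-sample bias correction to a standardized mean difference."""
    total_n = n1 + n2
    if total_n <= 3:
        raise ValueError("Total sample size is too small for the correction")
    return d * (1 - 3 / (4 * total_n - 9))


# Hypothetical small study: d = 0.5 with 10 participants per group
print(hedges_correction(0.5, 10, 10))
# The same d from a large study is barely changed
print(hedges_correction(0.5, 500, 500))
```

The correction always shrinks the effect size toward zero, and the shrinkage fades quickly as sample size grows.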

  • The effect sizes are weighted by the inverse of
    the variance to give more weight to effects based
    on large sample sizes
  • Variance is calculated as
  • The standard error of each effect size is given
    by the square root of the sampling variance
  • $SE_i = \sqrt{v_i}$
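
Inverse-variance weighting can be sketched as follows. The variance formula shown is the usual large-sample approximation for a standardized mean difference, which is an assumption here because the slide's formula is missing from the transcript. Function names and data are hypothetical.

```python
import math


def d_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def weighted_mean_effect(effects, variances):
    """Inverse-variance weighted mean effect size and its standard error."""
    weights = [1.0 / v for v in variances]
    mean_es = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    se_mean = math.sqrt(1.0 / sum(weights))
    return mean_es, se_mean


# Hypothetical pair of studies: the more precise study (variance .01)
# gets four times the weight of the less precise one (variance .04)
mean_es, se_mean = weighted_mean_effect([0.2, 0.6], [0.01, 0.04])
print(mean_es, se_mean)
print(d_variance(0.5, 50, 50), math.sqrt(d_variance(0.5, 50, 50)))
```

The weighted mean sits closer to the effect from the larger, more precise study than a simple average would.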

Population and sample
n = size, m = mean, d = effect size
Structuring a database
Constructing a database
Analytical Methods
  • Fixed effects model
  • Random effects model
  • Multilevel model

Fixed effects assumptions
  • Includes the entire population of studies to be
    considered; do not want to generalise to other
    studies not included (e.g., future studies).
  • All of the variability between effect sizes is
    due to sampling error alone. Thus, the effect
    sizes are only weighted by the within-study
  • Effect sizes are independent.

Conducting fixed effects meta-analysis
  • There are 2 general ways of conducting a fixed
    effects meta-analysis: ANOVA and multiple
    regression
  • The analogue to the ANOVA homogeneity analysis is
    appropriate for categorical variables
  • Looks for systematic differences between groups
    of responses within a variable
  • Multiple regression homogeneity analysis is more
    appropriate for continuous variables and/or when
    there are multiple variables to be analysed
  • Tests the ability of groups within each variable
    to predict the effect size
  • Can include categorical variables in multiple
    regression as dummy variables. (ANOVA is a
    special case of multiple regression)

Q-test of the homogeneity of variance
The homogeneity (Q) test asks whether the
different effect sizes are likely to have all
come from the same population (an assumption of
the fixed effects model). Are the differences
among the effect sizes no bigger than might be
expected by chance?
$Q = \sum_{i=1}^{k} w_i (ES_i - \overline{ES})^2$
where $ES_i$ = effect size for each study (i = 1 to k),
$\overline{ES}$ = mean effect size, and $w_i$ = a weight
for each study based on the sample size.
However, this (chi-square) test is heavily dependent
on sample size. It is almost always significant unless
the numbers (studies and people in each study) are
VERY small. This means that the fixed effect
model will almost always be rejected in favour of
a random effects model.
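
The homogeneity test can be sketched in a few lines of Python. Q is the weighted sum of squared deviations of each effect size from the weighted mean, compared against a chi-square distribution with k - 1 degrees of freedom. To stay dependency-free, the sketch uses a small hard-coded table of .05 critical values; function names and data are hypothetical.

```python
# Chi-square critical values at alpha = .05 for a few degrees of freedom
CHI2_CRITICAL_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def q_statistic(effects, variances):
    """Return (Q, degrees of freedom) for a set of effect sizes."""
    weights = [1.0 / v for v in variances]
    mean_es = sum(w * es for w, es in zip(weights, effects)) / sum(weights)
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effects))
    return q, len(effects) - 1


def is_heterogeneous(effects, variances):
    """True if Q exceeds the .05 critical value (homogeneity is rejected)."""
    q, df = q_statistic(effects, variances)
    if df not in CHI2_CRITICAL_05:
        raise ValueError("Critical value not tabulated for this df in the sketch")
    return q > CHI2_CRITICAL_05[df]


# Hypothetical data: four studies with equal sampling variance
effects = [0.1, 0.3, 0.5, 0.9]
variances = [0.04, 0.04, 0.04, 0.04]
print(q_statistic(effects, variances), is_heterogeneous(effects, variances))
```

Because each weight grows with sample size, the same spread of effect sizes produces a larger Q when the studies are larger, which is the sample-size dependence described above.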

Example fixed effects study
  • On the next slide, we will look at these outcomes
    in more detail to show the importance of various
    moderator variables
  • Do Psychosocial and Study Skill Factors Predict
    College Outcomes? A Meta-Analysis
  • Robbins, Lauver, Le, Davis, Langley, & Carlstrom
    (2004). Psychological Bulletin, 130, 261-288
  • Aim
  • To examine the relationship between psychosocial
    and study skill factors (PSFs) and college
    retention by meta-analyzing 109 studies

Fixed effects output
N = sample size for that variable; k = number of
correlation coefficients on which each
distribution was based; r = mean observed
correlation; CIr 10% = lower bound of the
confidence interval for observed r; CIr 90% =
upper bound of the confidence interval for
observed r
Regression output example
  • Target self-concept domains are those that are
    directly relevant to the intervention
  • Target-related are those that are logically
    relevant to the intervention, but not focal
  • Non-target are domains that are not expected to
    be enhanced by the intervention

Regression coefficients and their standard errors

                  B       SE      Sig?
Target            .4892   .0552   yes
Target-related    .1097   .0587   no
Non-target        .0805   .0489   no

From O'Mara, Marsh, Craven, & Debus (2006)
Random effects assumptions
  • Is only a sample of studies from the entire
    population of studies to be considered; want to
    generalise to other studies not included
    (including future studies).
  • Variability between effect sizes is due to
    sampling error plus variability in the population
    of effects.
  • Effect sizes are independent.

Random effects models
  • Variations in sampling schemes can introduce
    heterogeneity to the result, which is the
    presence of more than one intercept in the
  • E.g., if some studies used 30mg of a drug, and
    others used 50mg, then we would plausibly expect
    two clusters to be present in the data, each
    varying around the mean of one dosage or the
  • Random effects models account for this

Random effects models
  • If the homogeneity test is rejected (it almost
    always will be), it suggests that there are
    larger differences than can be explained by
    chance variation (at the individual participant
    level). There is more than one population in
    the set of different studies.
  • Now we turn to the random effects model to
    determine how much of this between-study
    variation can be explained by study
    characteristics that we have coded.
  • The total variance associated with the effect
    sizes has two components, one associated with
    differences within each study (participant level
    variation) and one between study variance

Weighting in random effects models
  • The random error variance component is added to
    the variance calculated earlier (see slide 44)
  • This means that the weighting for each effect
    size consists of the within-study variance ($v_i$)
    and between-study variance ($v_\theta$)
  • The new weighting for the random effects model
    ($w_{i,RE}$) is given by the formula
    $w_{i,RE} = 1 / (v_i + v_\theta)$
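
Random effects weighting can be sketched as below. The between-study variance is estimated with the method-of-moments (DerSimonian-Laird) estimator, which is an assumption: the slides do not say which estimator was used. Function names and data are hypothetical.

```python
def between_study_variance(effects, variances):
    """Method-of-moments estimate of between-study variance (assumed estimator)."""
    k = len(effects)
    if k < 2:
        raise ValueError("Need at least two effect sizes")
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    mean_es = sum(wi * es for wi, es in zip(w, effects)) / sum_w
    q = sum(wi * (es - mean_es) ** 2 for wi, es in zip(w, effects))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    # Truncate at zero: a negative estimate means no excess variation was found
    return max(0.0, (q - (k - 1)) / c)


def random_effects_mean(effects, variances):
    """Return (mean effect, between-study variance, random effects weights)."""
    v_theta = between_study_variance(effects, variances)
    # Each weight is the inverse of within-study plus between-study variance
    w_re = [1.0 / (v + v_theta) for v in variances]
    mean_es = sum(wi * es for wi, es in zip(w_re, effects)) / sum(w_re)
    return mean_es, v_theta, w_re


# Hypothetical data: four studies with equal within-study variance
print(random_effects_mean([0.1, 0.3, 0.5, 0.9], [0.04, 0.04, 0.04, 0.04]))
```

Adding the same between-study component to every study makes the weights more equal, so large studies dominate the random effects mean less than they do the fixed effects mean.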

Example random effects study
  • Do Self-Concept Interventions Make a Difference?
    A Synergistic Blend of Construct Validation and
    Meta-Analysis
  • O'Mara, Marsh, Craven, & Debus (2006).
    Educational Psychologist, 41, 181-206
  • Aim
  • To examine what factors moderate the
    effectiveness of self-concept interventions by
    meta-analyzing 200 interventions

Example random effects results: homogeneity
  • QB = between-group homogeneity. If the QB value
    is significant, then the groups (categories) are
    significantly different from each other
  • QW = within-group homogeneity. If QW is
    significant, then the effect sizes within a group
    (category) differ significantly from each other
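
The split into QB and QW can be sketched as a partition of the total Q statistic. The sketch uses whatever variances it is given: passing within-study variances gives the fixed effects version, while passing within-study plus between-study variance would give the random effects version. Function names, group labels, and data are hypothetical.

```python
def q_statistic(pairs):
    """Weighted sum of squared deviations from the weighted mean.

    pairs: list of (effect_size, variance) tuples.
    """
    weights = [1.0 / v for _, v in pairs]
    mean_es = sum(w * es for w, (es, _) in zip(weights, pairs)) / sum(weights)
    return sum(w * (es - mean_es) ** 2 for w, (es, _) in zip(weights, pairs))


def partition_q(groups):
    """Return (QB, QW) for effect sizes grouped by a categorical moderator.

    groups: dict mapping a category label to a list of (effect_size, variance).
    """
    all_pairs = [pair for pairs in groups.values() for pair in pairs]
    q_total = q_statistic(all_pairs)
    # QW pools the heterogeneity found inside each category
    q_within = sum(q_statistic(pairs) for pairs in groups.values())
    # QB is what remains: heterogeneity between the category means
    return q_total - q_within, q_within


# Hypothetical moderator with two categories
groups = {
    "target": [(0.8, 0.04), (1.0, 0.04)],
    "non_target": [(0.1, 0.04), (0.3, 0.04)],
}
print(partition_q(groups))
```

QB is compared against a chi-square distribution with (number of groups - 1) degrees of freedom, and QW with (number of effect sizes - number of groups).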

Multilevel modelling assumptions
  • Meta-analytic data is inherently hierarchical
    (i.e., effect sizes nested within studies) and
    has random error that must be accounted for.
  • Effect sizes are not necessarily independent
  • Allows for multiple effect sizes per study

Multilevel modelling
  • New technique that is still being developed
  • Provides more precise and less biased estimates
    of between-study variance than traditional

Multilevel model structure example
  • Level 1 outcome-level component
  • Effect sizes
  • Level 2 study component
  • Publications

Conducting multilevel model analyses
  • Intercept-only model, which incorporates both the
    outcome-level and the study-level components
    (similar to a random effects model)
  • Expand model to include predictor variables, to
    explain systematic variance between the study
    effect sizes

Example multilevel model
  • Acute Stressors and Cortisol Responses: A
    Theoretical Integration and Synthesis of
    Laboratory Research
  • Dickerson & Kemeny (2004). Psychological
    Bulletin, 130, 355-391
  • Aim
  • To examine methodological predictors of cortisol
    responses in a meta-analysis of 208 laboratory
    studies of acute psychological stressors

Example multilevel results
  • Only 2 variables significant (quadratic term for
    time between stress onset and assessment; time
    of day). The
    quadratic component is difficult to interpret as
    an unstandardized regression coefficient, but the
    graph suggests it is meaningfully large

Model selection
  • Fixed, random, or multilevel?
  • Generally, if more than one effect size per study
    is included in sample, multilevel should be used
  • However, if there is little variation at study
    level, the results of multilevel modelling
    meta-analyses are similar to random effects

Model selection
  • Do you wish to generalise your findings to other
    studies not in the sample?
  • Do you have multiple outcomes per study?

Exercise 1b
  • The purpose of this exercise is to consider
    choice of meta-analytic method
  • Discuss in groups of 3-4 people the question in
    relation to the gender differences in smiling
    study (LaFrance et al., 2003)
  • Is there independence of effect sizes? What are
    the implications for model choice (fixed, random,

Exercise 1b discussion points
  • No independence (research reports = 162, number
    of effect sizes (k) = 418).
  • "Of the total number of reports described here,
    less than one fourth contributed more than one
    effect size to the moderator analysis...
    Nevertheless, appropriate caution should be used
    interpreting these analyses, because they
    challenge the assumption of effect size
    independence" (p. 313).

Exercise 2
  • The purpose of this exercise is to practice
    reading meta-analytic results tables.
  • This study, by Reger et al. (2004), examines the
    relationship between neuropsychological
    functioning and driving ability in dementia.
  • In Table 3, which variables are homogeneous for
    the on-road tests driving measure in the All
    Studies column? What does this tell you about
    those variables?
  • In Table 4, look at the variables that were
    homogeneous in question (1) for the on-road
    tests using All Studies. Which variables have
    a significant mean ES? Which variable has the
    largest mean ES?

Exercise 2 Answers
  • Homogeneous variables (non-significant Q-values):
    Mental status-general cognition, Visuospatial
    skills, Memory, Executive functions, Language
  • All of the relevant mean effect sizes are
    significant. Memory and language are tied as the
    largest mean ESs for homogeneous variables (r

  • We established what is meta-analysis, when and
    why we use meta-analysis, and the benefits and
    pitfalls of using meta-analysis
  • Summarised how to conduct a meta-analysis
  • Provided a conceptual introduction to analysis
    and interpretation of results based on fixed
    effects, random effects, and multilevel models
  • Applied this information to examining the methods
    of a published meta-analysis

Steps in a meta-analysis
  • Comparing apples and oranges
  • Quality of studies included in the meta-analysis
  • What to do when studies don't report sufficient
    information (e.g., non-significant findings)?
  • Including multiple outcomes in the analysis
    (e.g., different achievement scores)
  • Publication bias

Future directions
  • With meta-analysis now one of the most popularly
    published research methods, it is an exciting
    time to be involved in meta-analytic research
  • The hottest topics in meta-analysis are
  • Multilevel modelling to address the issue of
    independence of effect sizes
  • New methods in publication bias assessment
  • Also receiving attention
  • Establishing guidelines for conducting
    meta-analysis (best practice)
  • Meta-analyses of meta-analyses

  • Purpose-built
  • Comprehensive Meta-analysis (commercial)
  • Schwarzer (free, http//
  • Extensions to standard statistics packages
  • SPSS, Stata and SAS macros, downloadable from
  • Stata add-ons, downloadable from
  • HLM V-known routine
  • MLwiN
  • MPlus

Key reference books
  • Cooper, H., & Hedges, L. V. (Eds.) (1994). The
    handbook of research synthesis (pp. 521-529). New
    York: Russell Sage Foundation.
  • Hox, J. (2003). Applied multilevel analysis.
    Amsterdam: TT Publishers.
  • Hunter, J. E., & Schmidt, F. L. (1990). Methods
    of meta-analysis: Correcting error and bias in
    research findings. Newbury Park: Sage
  • Lipsey, M. W., & Wilson, D. B. (2001). Practical
    meta-analysis. Thousand Oaks, CA: Sage

More information
  • Pick up a brochure about our intermediate and
    advanced meta-analysis courses
  • Visit our website http//