
Meta-analysis

- ESRC Workshop
- Researcher Development Initiative

Department of Education, University of Oxford

Today's content

- What is meta-analysis?
- When and why we use meta-analysis
- Examples of meta-analyses
- Benefits and pitfalls of using meta-analysis
- Defining a population of studies and finding publications
- Coding materials
- Inter-rater reliability
- Computing effect sizes
- Structuring a database
- A conceptual introduction to analysis and interpretation of results based on fixed effects, random effects, and multilevel models
- Supplementary analyses

Primary versus secondary data analysis

- Traditionally, education researchers collect and analyse their own data (referred to as primary data). Secondary data analysis is based on data collected by someone else (or, perhaps, reanalysis of your own published data). There are at least four logical perspectives on this issue:
- 1. Meta-analysis -- systematic, quantitative review of published research in a particular field; the focus of this presentation.
- 2. Systematic review -- systematic, qualitative review of published research in a particular field
- 3. Secondary data analyses -- using large (typically public) databases
- 4. Reanalyses of published studies -- (often in ways critical of the original study).

Why meta-analysis?

- Wilson & Lipsey (2001) synthesised 319 meta-analyses of intervention studies. Across the studies, roughly equal amounts of variance were due to:
- substantive features of the intervention (true differences),
- method effects (idiosyncratic study features and potential biases, particularly research design and operationalisation of outcome measures), and
- sampling error.
- They concluded:
- "These results underscore the difficulty of detecting treatment outcomes, the importance of cautiously interpreting findings from a single study, and the importance of meta-analysis in summarizing results across studies" (p. 413).

Why a course on meta-analysis?

- Meta-analysis is an increasingly popular tool for summarising research findings
- Cited extensively in research literature
- Relied upon by policymakers
- Important that we understand the method, whether we conduct or simply consume meta-analytic research
- Should be one of the topics covered in all introductory research methodology courses

Background...

- What is meta-analysis?
- When and why we use meta-analysis?

What is meta-analysis?

- Systematic synthesis of various studies on a particular research question
- Do boys or girls have higher self-concepts?
- Collect all studies relevant to a topic
- Find all published journal articles on the topic
- An effect size is calculated for each outcome
- Determine the size/direction of gender difference for each study
- "Content analysis"
- Code characteristics of the study: age, setting, ethnicity, self-concept domain (math, physical, social), etc.
- Effect sizes with similar features are grouped together and compared; this tests moderator variables
- Do gender differences vary with age, setting, ethnicity, self-concept domain, etc.?

A blend of qualitative and quantitative approaches

- Coding: the process of extracting the information from the literature included in the meta-analysis. Involves noting the characteristics of the studies in relation to a priori variables of interest (qualitative)
- Effect size: the numerical outcome to be analysed in a meta-analysis; a summary statistic of the data in each study included in the meta-analysis (quantitative)
- Summarise effect sizes: central tendency, variability, relations to study characteristics (quantitative)

Abridged history

Karl Pearson (1904)

Karl Pearson conducted what is reputed to be the first meta-analysis (although not called this), comparing effects of inoculation in different settings.

Classic Meta-analysis: Smith ML, Glass GV (1977). Meta-analysis of psychotherapy outcome studies. American Psychologist, 32, 752-760. Times Cited: 840.

- Gene Glass coined the phrase meta-analysis in a classic study of the effects of psychotherapy. Because most individual studies had small sample sizes, the effects typically were not statistically significant.
- Results of 375 controlled evaluations of psychotherapy and counselling were coded and integrated statistically. The findings provide convincing evidence of the efficacy of psychotherapy.
- On the average, the typical therapy client is better off than 75% of untreated individuals.
- Few important differences in effectiveness could be established among many quite different types of psychotherapy (e.g., behavioral and non-behavioral).

ESRC RDI One Day Meta-analysis workshop (Marsh, O'Mara, Malmberg)

Why is meta-analysis important? Generalisability

- The essence of good science is replicable and generalisable results.
- Do we get the same answer to important research questions when we run the study again?
- A primary aim of meta-analysis is to test the generalisability of results across a set of studies designed to answer the same research question.
- Are the results consistent? If not, what are the differences in the studies that explain the lack of consistency?

When and why we use meta-analysis

- A primary aim is to reach a conclusion to a research question from a sample of studies that is generalisable to the population of all such studies.
- Meta-analysis tests whether study-to-study variation in outcomes is more than can be explained by random chance.
- When there is systematic variation in outcomes from different studies, meta-analysis tries to explain these differences in terms of study characteristics, e.g. measures used; study design; participant characteristics; controls for potential bias.

When is meta-analysis appropriate?

- There exists a critical mass of comparable studies designed to address a common research question.
- Data are presented in a form that allows the meta-analyst to compute an effect size for each study.
- Characteristics of each study are described in sufficient detail to allow meta-analysts to compare characteristics of different studies and to judge the quality of each study.

Schulze, R. (2007). The state and the art of meta-analysis. Zeitschrift für Psychologie/Journal of Psychology, 215, 87-89.

The number of meta-analyses is increasing at a rapid rate.

Where are meta-analyses done? All over the world.

What disciplines publish meta-analyses? ISI, 10 Feb 2008. Topic: meta-analysis. Results found: 21,286

All disciplines do meta-analyses, but they are very popular in medicine

ISI, 10 Feb 2008. Topic: meta-analysis, Education disciplines. Results found: 612; Sum of the Times Cited: 12,294; Average Citations per Item: 20.09; h-index: 54

The number and frequency of citations are increasing in Education

ISI, 10 Feb 2008. Topic: meta-analysis, Psychology disciplines. Results found: 2,345; Sum of the Times Cited: 68,477; Average Citations per Item: 29.20; h-index: 125

The number and frequency of citations are increasing in Psychology

Meta-analysis examples

Psychology: Where it all began

- Amato, P. R., & Keith, B. (1991). Parental divorce and the well-being of children: A meta-analysis. Psychological Bulletin, 110, 26-46. Times Cited: 471
- Linn, M. C., & Petersen, A. C. (1985). Emergence and characterization of sex differences in spatial ability: A meta-analysis. Child Development, 56, 1479-1498. Times Cited: 570
- Johnson, D. W., et al. (1981). Effects of cooperative, competitive, and individualistic goal structures on achievement: A meta-analysis. Psychological Bulletin, 89, 47-62. Times Cited: 426
- Tett, R. P., Jackson, D. N., & Rothstein, M. (1991). Personality measures as predictors of job performance: A meta-analytic review. Personnel Psychology, 44, 703-742. Times Cited: 387
- Hyde, J. S., & Linn, M. C. (1988). Gender differences in verbal ability: A meta-analysis. Psychological Bulletin, 104, 53-69. Times Cited: 316
- Iaffaldano, M. T., & Muchinsky, P. M. (1985). Job satisfaction and job performance: A meta-analysis. Psychological Bulletin, 97, 251-273. Times Cited: 263

Education: Widely Cited Meta-analyses

- De Wolff, M., & van IJzendoorn, M. H. (1997). Sensitivity and attachment: A meta-analysis on parental antecedents of infant attachment. Child Development, 68, 571-591. Times Cited: 340
- Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72, 655-684. Times Cited: 276
- Cohen, E. G. (1994). Restructuring the classroom: Conditions for productive small groups. Review of Educational Research, 64, 1-35. Times Cited: 235
- Hansen, W. B. (1992). School-based substance abuse prevention: A review of the state of the art in curriculum, 1980-1990. Health Education Research, 7, 403-430. Times Cited: 207
- Kulik, J. A., Kulik, C-L., & Cohen, P. A. (1980). Effectiveness of computer-based college teaching: A meta-analysis of findings. Review of Educational Research, 50, 525-544. Times Cited: 198

Business/Management: Widely Cited Meta-analyses

- Sheppard, B. H., Hartwick, J., & Warshaw, P. R. (1988). The theory of reasoned action: A meta-analysis of past research with recommendations for modifications and future research. Journal of Consumer Research, 15, 325-343. Times Cited: 515
- Jackson, S. E., & Schuler, R. S. (1985). A meta-analysis and conceptual critique of research on role ambiguity and role conflict in work settings. Organizational Behavior and Human Decision Processes, 36, 16-78. Times Cited: 401
- Tornatzky LG, Klein KJ (1994). Innovation characteristics and innovation adoption-implementation: A meta-analysis of findings. IEEE Transactions on Engineering Management, 29, 28-4. Times Cited: 269
- Lowe KB, Kroeck KG, Sivasubramaniam N (1996). Effectiveness correlates of transformational and transactional leadership: A meta-analytic review of the MLQ literature. Leadership Quarterly, 7, 385-425. Times Cited: 203
- Churchill GA, Ford NM, Hartley SW, et al. (1985). The determinants of salesperson performance: A meta-analysis. Journal of Marketing Research, 22, 103-118. Times Cited: 189

Most Widely Cited Meta-analyses are in Medicine

- Jadad AR, Moore RA, Carroll D, et al. (1996). Assessing the quality of reports of randomized clinical trials: Is blinding necessary? Controlled Clinical Trials, 17, 1-12. Times Cited: 2,008
- Boushey CJ, Beresford SAA, Omenn GS, et al. (1995). A quantitative assessment of plasma homocysteine as a risk factor for vascular disease: Probable benefits of increasing folic acid intakes. JAMA-Journal of the American Medical Association, 274, 1049-1057. Times Cited: 2,128
- Alberti W, Anderson G, Bartolucci A, et al. (1995). Chemotherapy in non-small-cell lung cancer: A metaanalysis using updated data on individual patients from 52 randomized clinical trials. British Medical Journal, 311, 899-909. Times Cited: 1,591
- Block G, Patterson B, Subar A (1992). Fruit, vegetables, and cancer prevention: A review of the epidemiologic evidence. Nutrition and Cancer-an International Journal, 18, 1-29. Times Cited: 1,422

Cohen, P. A. (1980). Effectiveness of student-rating feedback for improving college instruction: A meta-analysis. Research in Higher Education, 13, 321-341.

- Question: Does feedback from university students' evaluations of teaching lead to improved teaching?
- Teachers are randomly assigned to experimental (feedback) and control (no feedback) groups
- Feedback group gets ratings, augmented, perhaps, with personal consultation
- Groups are compared on subsequent ratings and, perhaps, other variables
- Feedback teachers improved their teaching effectiveness by .3 standard deviations compared to control teachers on the Overall Rating item; even larger differences for ratings of Instructor Skill, Attitude Toward Subject, Student Feedback
- Studies that augmented feedback with consultation produced substantially larger differences, but other methodological variations had little effect.

Hattie, J., & Marsh, H. W. (1996). The relationship between research and teaching -- a meta-analysis. Review of Educational Research, 66, 507-542.

- Question: What is the correlation between university teaching effectiveness and research productivity?
- Based on 58 studies and 498 correlations
- The mean correlation between teaching effectiveness (mostly based on students' evaluations of teaching) and research productivity was almost exactly zero
- This near-zero correlation was consistent across different disciplines, types of university, indicators of research, and components of teaching effectiveness.
- This meta-analysis was followed by the Marsh & Hattie (2002) primary data study to more fully evaluate the theoretical model


O'Mara, A. J., Marsh, H. W., Craven, R. G., & Debus, R. (2006). Do self-concept interventions make a difference? A synergistic blend of construct validation and meta-analysis. Educational Psychologist, 41, 181-206.

- Contention about global self-esteem versus multidimensional, domain-specific self-concept
- Traditional reviews and previous meta-analyses of self-concept interventions have underestimated effect sizes by using an implicitly unidimensional perspective that emphasizes global self-concept.
- We used meta-analysis and a multidimensional construct validation approach to evaluate the impact of self-concept interventions for children in 145 primary studies (200 interventions).
- Overall, interventions were significantly effective (d = .51, 460 effect sizes).
- However, in support of the multidimensional perspective, interventions targeting a specific self-concept domain and subsequently measuring that domain were much more effective (d = 1.16).
- This supports a multidimensional perspective of self-concept


Hanson, R. K., & Morton-Bourgon, K. E. (2005). The characteristics of persistent sexual offenders: A meta-analysis of recidivism studies. Journal of Consulting and Clinical Psychology, 73, 1154-1163.

- Examined predictors of sexual, nonsexual violent, and general (any) recidivism
- 82 recidivism studies
- Identified deviant sexual preferences and antisocial orientation as the major predictors of sexual recidivism for both adult and adolescent sexual offenders. Antisocial orientation was the major predictor of violent recidivism and general (any) recidivism
- Concluded that many of the variables commonly addressed in sex offender treatment programs (e.g., psychological distress, denial of sex crime, victim empathy, stated motivation for treatment) had little or no relationship with sexual or violent recidivism

Bazzano, L. A., Reynolds, K., Holder, K. N., & He, J. (2006). Effect of folic acid supplementation on risk of cardiovascular diseases: A meta-analysis of randomized controlled trials. JAMA, 296, 2720-2726.

- Epidemiologic studies have suggested that folate intake decreases risk of cardiovascular diseases. However, the results of randomized controlled trials on dietary supplementation with folic acid to date have been inconsistent.
- Included 12 randomised controlled trials.
- The overall relative risks of outcomes for patients treated with folic acid supplementation compared with controls were non-significant for cardiovascular diseases, coronary heart disease, stroke, and for all-cause mortality.
- Concluded folic acid supplementation does not reduce risk of cardiovascular diseases or all-cause mortality among participants with prior history of vascular disease.

Fiske, P., Rintamaki, P. T., & Karvonen, E. (1998). Mating success in lekking males: a meta-analysis. Behavioral Ecology, 9, 328-338.

- In lekking species (those that gather for competitive mating), a male's mating success can be estimated as the number of females that he copulates with.
- Aim of the study was to find predictors of lekking species' mating success through analysis of 48 studies.
- Behavioural traits such as male display activity, aggression rate, and lek attendance were positively correlated with male mating success. The size of "extravagant" traits, such as birds' tails and ungulate antlers, and age were also positively correlated with male mating success.
- Territory position was negatively correlated with male mating success, such that males with territories close to the geometric centre of the leks had higher mating success than other males.
- Male morphology (measure of body size) and territory size showed small effects on male mating success.

Benefits and pitfalls of using meta-analysis

Benefits of meta-analysis

- Compared to traditional literature reviews:
- (1) there is a definite methodology employed in the research analysis (more like that used in primary research), and
- (2) the results of the included studies are quantified to a standard metric, thus allowing for statistical techniques for further analysis.
- Therefore the process of reviewing research literature is more objective, transparent, and replicable; less biased and idiosyncratic to the whims of a particular researcher

Battle between different camps: do extrinsic rewards increase intrinsic enjoyment?

- Cameron, J., & Pierce, W. D. (1994). Reinforcement, reward, and intrinsic motivation: A meta-analysis. Review of Educational Research, 64, 363-423.
- Ryan, R., & Deci, E. L. (1996). When paradigms clash: Comments on Cameron and Pierce's claim that rewards do not undermine intrinsic motivation. Review of Educational Research, 66, 33-38.
- Cameron, J., & Pierce, W. D. (1996). The debate about rewards and intrinsic motivation: Protests and accusations do not alter the results. Review of Educational Research, 66, 39-51.
- Deci, E. L., Koestner, R., & Ryan, R. (2001). Extrinsic rewards and intrinsic motivation in education: Reconsidered once again. Review of Educational Research, 71, 1-27.
- Cameron, J. (2001). Negative effects of reward on intrinsic motivation, a limited phenomenon: Comment on Deci, Koestner, and Ryan. Review of Educational Research, 71, 29-42.


Benefits of meta-analysis

- Increased power: by combining information from many individual studies, the meta-analyst is able to detect systematic trends not obvious in the individual studies.
- Conclusions based on the set of studies are likely to be more accurate than any one study.
- Improved precision: based on information from many studies, the meta-analyst can provide a more precise estimate of the population effect size (and a confidence interval).
- Provides potential corrections for potential biases, measurement error and other possible artefacts
- Identifies directions for further primary studies to address unresolved issues.


Benefits of meta-analysis

- Able to establish generalisability across many studies (and study characteristics).
- Typically there is study-to-study variation in results. When this is the case, the meta-analyst can explore what characteristics of the studies explain these differences (e.g., study design) in ways not easy to do in individual studies.
- Easy to interpret summary statistics (useful if communicating findings to a non-academic audience).

Publication bias

- Studies that are published are more likely to report statistically significant findings. This is a source of potential bias.
- The debate about using only published studies:
- peer-reviewed studies are presumably of a higher quality
- VERSUS
- significant findings are more likely to be published than non-significant findings
- There is no agreed-upon solution. However, one should retrieve all studies that meet the eligibility criteria, and be explicit about how publication bias was dealt with. Some methods for dealing with publication bias have been developed (e.g., Fail-safe N, Trim and Fill method).
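
As an illustration of one of the methods named above, the following is a minimal sketch of Rosenthal's version of the fail-safe N, assuming each study contributes a standard normal deviate (Z) for its effect. The function name, the example Z values, and the one-tailed .05 criterion (Z = 1.645) are assumptions made for this sketch and are not taken from the slides; other fail-safe N variants exist.

```python
def rosenthal_fail_safe_n(z_scores, z_critical=1.645):
    """Number of unretrieved null-result studies needed to bring the
    combined result down to the one-tailed significance threshold.

    Rosenthal's formula: N_fs = (sum of Z)^2 / z_critical^2 - k,
    where k is the number of studies actually retrieved.
    """
    k = len(z_scores)
    if k == 0:
        raise ValueError("at least one study is required")
    total_z = sum(z_scores)
    return (total_z ** 2) / (z_critical ** 2) - k


# Hypothetical Z values for five retrieved studies (illustration only).
example_z = [2.0, 1.5, 2.5, 1.0, 3.0]
fail_safe_n = rosenthal_fail_safe_n(example_z)
print(f"Fail-safe N: {fail_safe_n:.1f}")
```

A large fail-safe N relative to the number of retrieved studies is usually read as suggesting that the summary result is fairly robust to unpublished null findings.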

English language bias

- Meta-analyses are mostly limited to studies published in English.
- Juni et al. (2002) evaluated the implications of excluding non-English publications in meta-analyses of randomised clinical trials, in 50 meta-analyses
- treatment effects were modestly larger in non-English publications (16%).
- However, study quality was also lower in non-English publications.
- Effects were sufficiently small not to have much influence on treatment effect estimates, but may make a difference in some reviews.


Study quality

- Increasingly, meta-analysts evaluate the quality of each study included in a meta-analysis.
- Sometimes this is a global holistic (subjective) rating. In this case it is important to have multiple raters to establish inter-rater agreement (more on this later).
- Sometimes study quality is quantified in relation to objective criteria of a good study, e.g.
- larger sample sizes
- more representative samples
- better measures
- use of random assignment
- appropriate control for potential bias
- double blinding, and
- low attrition rates (particularly for longitudinal studies)

Study quality in the social sciences

- In a meta-analysis of Social Science meta-analyses, Wilson & Lipsey (1993) found an effect size of .50. They evaluated how this was related to study quality:
- For meta-analyses providing a global (subjective) rating of the quality of each study, there was no significant difference between high and low quality studies; the average correlation between effect size and quality was almost exactly zero.
- Almost no difference between effect sizes based on random and non-random assignment (effect sizes slightly larger for random assignment).
- The only study quality characteristic to make a difference was positively biased effects due to a one-group pre/post design with no control group at all


Study quality in the social sciences

- Goldring (1990) evaluated the effects of gifted education programs on achievement. She found a positive effect, but emphasised that findings were questionable because of weak studies:
- 21 of the 24 studies were unpublished and only one used random assignment.
- Effects varied with matching procedures
- largest effects for achievement outcomes were for studies in which all non-equivalent groups' differences were controlled by only one pretest variable.
- Effect sizes reduced as the number of control variables increased, and
- disappeared altogether with random assignment.
- Goldring (1990, p. 324) concluded that policy makers need to be aware of the limitations of the GAT literature.


Study quality in medicine

- Schulz (1995) evaluated study quality in 250 randomized clinical trials (RCTs) from 33 meta-analyses. Poor quality studies led to positively biased estimates:
- lack of concealment (30-41%),
- lack of double-blind (17%),
- participants excluded after randomization (NS).
- Moher et al. (1998) reanalysed 127 RCTs from 11 meta-analyses for study quality.
- Low quality trials resulted in significantly larger effect sizes, a 30-50% exaggeration in estimates of treatment efficacy.
- Wood et al. (2008) evaluated study quality (1346 RCTs from 146 meta-analyses).
- subjective outcomes: inadequate/unclear concealment and lack of blinding resulted in substantial biases.
- objective outcomes: no significant effects.
- conclusion: systematic reviewers should assess risk of bias.


Study quality: Does it make a difference?

- Meta-analyses should always include subjective and/or objective indicators of study quality.
- In Social Sciences there is some evidence that studies with highly inadequate control for pre-existing differences lead to inflated effect sizes. However, it is surprising that other indicators of study quality make so little difference.
- In medical research, studies are largely limited to RCTs, where there is MUCH more control than in social science research. Here there is evidence that inadequate concealment of assignment and lack of double-blind inflate effect sizes, but perhaps only for subjective outcomes.
- These issues are likely to be idiosyncratic to individual discipline areas and research questions.


Conducting a meta-analysis

- Defining a population of studies and finding publications
- Coding materials
- Inter-rater reliability
- Computing effect sizes
- Structuring a database

Steps in a meta-analysis

Establish research question

- Comparison of treatment & control groups?
- What is the effectiveness of a reading skills program for a treatment group compared to an inactive control group?
- Pretest-posttest differences?
- Is there a change in motivation over time?
- What is the correlation between two variables?
- What is the relation between teaching effectiveness and research productivity?
- Moderators of an outcome?
- Does gender moderate the effect of a peer-tutoring program on academic achievement?

Establish research question

- Do you wish to generalise your findings to other studies not in the sample?
- Do you have multiple outcomes per study? e.g.
- achievement in different school subjects
- 5 different personality scales
- multiple criteria of success
- Such questions determine the choice of meta-analytic model:
- fixed effects
- random effects
- multilevel

Defining a population of studies and finding publications

- Need to have explicit inclusion and exclusion criteria
- The broader the research domain, the more detailed they tend to become
- Refine criteria as you interact with the literature
- Components of detailed search criteria:
- distinguishing features
- research respondents
- key variables
- research methods
- cultural and linguistic range
- time frame
- publication types

Locate and collate studies

- Search electronic databases (e.g., ISI, Psychological Abstracts, Expanded Academic ASAP, Social Sciences Index, PsycINFO, and ERIC)
- Examine the reference lists of included studies to find other relevant studies
- If including unpublished data, email researchers in your discipline, take advantage of Listservs, and search Dissertation Abstracts International

Reporting the search procedures

- The following is one possible way to write up the search procedure (see LeBlanc & Ritchie, 2001):
- Electronic search strategy (e.g., PsycINFO, Dissertation Abstracts). Provide years included in database
- Keywords and limitations of the search (e.g., language)
- Additional search methods (e.g., mailing lists)
- Exclusion criteria (e.g., must contain control group)
- Yield of the search: number of studies found. Ideally should also mention how many were excluded from the meta-analysis and why

Search procedures


Locate and collate studies

- Inclusion process usually requires several steps to cull inappropriate studies
- Example from Bazzano, L. A., Reynolds, K., Holder, K. N., & He, J. (2006). Effect of folic acid supplementation on risk of cardiovascular diseases: A meta-analysis of randomized controlled trials. JAMA, 296, 2720-2726.

Inclusion/exclusion

You can report the inclusion/exclusion process using text rather than a flow chart, but it is not as easy to follow if the process is elaborate. Report the original sample and final yield as a minimum (in this case, original 139, final 22).

Develop code materials

Code Sheet

Code Book/manual

- __ Study ID
- _ _ Year of publication
- __ Publication type (1-5)
- __ Geographical region (1-7)
- _ _ _ _ Total sample size
- _ _ _ Total number of males
- _ _ _ Total number of females


Coding

Mode of therapy, Duration of therapy, Participant characteristics, Publication characteristics, Design characteristics

Coding characteristics should be mentioned in the paper. If the editor allows, a copy of the actual coding materials can be included as an appendix

Pilot coding

- Random selection of papers coded by both coders (e.g., 30% of publications are double-coded)
- Meet to compare code sheets
- Where there is a discrepancy, discuss to reach agreement
- Amend code materials/definitions in code book if necessary
- May need to do several rounds of piloting, each time using different papers

Interrater reliability

- Percent agreement: common but not recommended
- Cohen's kappa coefficient
- Kappa is the proportion of the optimum improvement over chance attained by the coders, where a value of 1 indicates perfect agreement and a value of 0 indicates that agreement is no better than that expected by chance
- Kappas over .40 are considered to be a moderate level of agreement (but there is no clear basis for this guideline)
- Correlation between different raters
- Intraclass correlation. Agreement among multiple raters corrected for number of raters using Spearman-Brown formula (r)
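
The two agreement indices described above can be sketched in a few lines of Python. This is a minimal illustration, assuming two coders have each assigned one categorical code per study; the function names and the example codes are hypothetical and do not come from the workshop materials.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters coding the same items into categories."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same non-empty set of items")
    n = len(rater_a)
    # Observed agreement: share of items given the same code by both raters.
    p_observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal proportions,
    # summed over categories.
    counts_a, counts_b = Counter(rater_a), Counter(rater_b)
    p_chance = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_chance == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (p_observed - p_chance) / (1 - p_chance)


def spearman_brown(mean_r, n_raters):
    """Reliability of the aggregate of n_raters, given the mean inter-rater r."""
    return n_raters * mean_r / (1 + (n_raters - 1) * mean_r)


# Hypothetical publication-type codes from two coders for ten studies.
coder_1 = [1, 1, 2, 2, 3, 3, 1, 2, 3, 1]
coder_2 = [1, 1, 2, 3, 3, 3, 1, 2, 2, 1]
kappa = cohens_kappa(coder_1, coder_2)
aggregate_reliability = spearman_brown(0.5, 4)
print(f"kappa = {kappa:.3f}, aggregate reliability = {aggregate_reliability:.2f}")
```

In this made-up example the coders agree on 8 of 10 studies, but part of that agreement would be expected by chance, which is why kappa is lower than the raw percent agreement.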

Exercise 1a

- The purpose of this exercise is to explore various issues of meta-analytic methodology
- Discuss in groups of 3-4 people the following issues in relation to the gender differences in smiling study (LaFrance et al., 2003):
- Did the aims of the study justify conducting a meta-analysis?
- Were the selection criteria and the search process explicit?
- How did they deal with interrater (coder) reliability?

Ex. 1a discussion points

- Extend previous meta-analyses, include previously untested moderators based on theory/empirical observations
- Search process detailed: databases and 5 other sources of studies, search terms. Selection criteria: justification provided (e.g., for excluding those under the age of 13). However, it is not clear how many studies were retrieved and then eventually included (compare with flow chart on slide 51)
- Multiple coders (group of coders consisted of four people, with two raters of each sex coding each moderator). Interrater reliability was calculated by taking the aggregate reliability of the four coders at each time using the Spearman-Brown formula

Effect size calculation


- The effect size makes meta-analysis possible
- It is based on the "dependent variable" (i.e., the outcome)
- It standardizes findings across studies such that they can be directly compared
- Any standardized index can be an "effect size" (e.g., standardized mean difference, correlation coefficient, odds-ratio), but must
- be comparable across studies (standardization)
- represent magnitude & direction of the relation
- be independent of sample size
- Different studies in the same meta-analysis can be based on different statistics, but each has to be transformed to a standardized effect size that is comparable across different studies

Sample size, significance, effect size


Scatter plot of effect size and sample size

- O'Mara (2004)

Effect sizes

- Within the one meta-analysis, can include studies based on any combination of statistical analysis (e.g., t-tests, ANOVA, multiple regression, correlation, odds-ratio, chi-square, etc). However, you have to convert each of these to a common "effect size" metric.
- Lipsey & Wilson (2001) present many formulae for calculating effect sizes from different information. The "art" of meta-analysis is how to compute effect sizes based on non-standard designs and studies that do not supply complete data.
- However, need to convert all effect sizes into a common metric, typically based on the "natural" metric given research in the area, e.g. standardized mean difference; odds-ratio; correlation, etc.


Effect size calculation

- Standardized mean difference
  - Group contrast research
    - Treatment groups
    - Naturally occurring groups
  - Inherently continuous construct
- Odds-ratio
  - Group contrast research
    - Treatment groups
    - Naturally occurring groups
  - Inherently dichotomous construct
- Correlation coefficient
  - Association between variables research

Effect size calculation

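- Represents a standardized group contrast on an inherently continuous measure
- Uses the pooled standard deviation (some situations use the control group standard deviation)
- Commonly called d

In an intervention study with experimental and control groups, the effect size might be the difference between the experimental and control group means, divided by the pooled standard deviation.

In a gender difference study, the effect size might be the difference between the male and female means, divided by the pooled standard deviation.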
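
As a rough illustration of the standardized mean difference described on this slide, the sketch below computes d from group means, standard deviations and sample sizes using the pooled standard deviation. The function names and the numbers are hypothetical, chosen only for illustration, and are not taken from the workshop materials.

```python
import math


def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2))


def standardized_mean_difference(mean1, sd1, n1, mean2, sd2, n2):
    """d = (mean1 - mean2) / pooled SD; the sign follows the order of the groups."""
    return (mean1 - mean2) / pooled_sd(sd1, n1, sd2, n2)


# Hypothetical summary statistics (not from any study discussed here).
d_example = standardized_mean_difference(
    mean1=105.0, sd1=10.0, n1=30,  # e.g. experimental group
    mean2=100.0, sd2=10.0, n2=30,  # e.g. control group
)
print(f"d = {d_example:.2f}")
```

Which group is subtracted from which is a coding decision; whatever is chosen should be applied consistently across all studies so that the sign of d has the same meaning throughout the database.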


Effect size calculation

Almost all test statistics can be transformed into a standardized effect size d:

- Means and standard deviations
- Correlations
- P-values
- F-statistics
- t-statistics
- Other test statistics


Effect size calculation using Excel


Effect size calculation

- Represents the strength of association between two inherently continuous measures
- Generally reported directly as r (the Pearson product-moment correlation coefficient)

Effect size calculation

- The odds-ratio is based on a 2 by 2 contingency table
- The odds-ratio is the odds of success in the treatment group relative to the odds of success in the control group
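
A minimal sketch of the odds-ratio calculation from a 2 by 2 table follows. The cell labels (a, b, c, d) and the counts are assumptions made for illustration. The log transformation and its standard error are not on the slide; they are included because analyses of odds-ratios are commonly carried out on the log scale.

```python
import math


def odds_ratio(a, b, c, d):
    """Odds ratio from a 2 by 2 table.

    a = treatment successes, b = treatment failures,
    c = control successes,   d = control failures.
    """
    return (a / b) / (c / d)


def log_odds_ratio_se(a, b, c, d):
    """Standard error of the natural log of the odds ratio."""
    return math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)


# Hypothetical counts: 20 of 30 succeed in treatment, 10 of 30 in control.
or_example = odds_ratio(20, 10, 10, 20)
print(f"odds ratio = {or_example:.2f}")
print(f"log odds ratio = {math.log(or_example):.3f}")
print(f"SE of log odds ratio = {log_odds_ratio_se(20, 10, 10, 20):.3f}")
```

An odds-ratio of 1 indicates no difference between the groups; any cell count of zero makes the ratio undefined and needs separate handling.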

Effect size calculation


r to d, d to r

Alternatively, transform rs into Fisher's Zr-transformed rs, which are more normally distributed
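
The conversions named on this slide can be sketched as below. The r-to-d and d-to-r formulas shown are the common textbook versions that assume two groups of equal size; they are an assumption here, since the slide's own formulas did not survive extraction. Fisher's Zr is the inverse hyperbolic tangent of r.

```python
import math


def r_to_d(r):
    """Convert a correlation to a standardized mean difference."""
    return 2 * r / math.sqrt(1 - r ** 2)


def d_to_r(d):
    """Convert a standardized mean difference to a correlation (equal group sizes assumed)."""
    return d / math.sqrt(d ** 2 + 4)


def fisher_z(r):
    """Fisher's Zr transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return math.atanh(r)


def inverse_fisher_z(z):
    """Transform a Zr value (e.g. a mean Zr) back to r."""
    return math.tanh(z)


r_example = 0.6  # hypothetical correlation
print(f"r = {r_example} -> d = {r_to_d(r_example):.3f}")
print(f"r = {r_example} -> Zr = {fisher_z(r_example):.3f}")
```

When Zr is used, the analysis is run on the transformed values and the resulting mean is converted back to r for reporting.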


Correction for bias

- Hedges proposed a correction for small sample size bias (n < 20)
- Must be applied before analysis
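
A sketch of the small-sample correction, assuming the commonly used approximation in which d is multiplied by 1 - 3 / (4N - 9), with N the total sample size across both groups. The slide's own formula is not recoverable, so treat this form and the function name as assumptions.

```python
def hedges_correction(d, n1, n2):
    """Small-sample bias correction for a standardized mean difference.

    Uses the approximation d * (1 - 3 / (4N - 9)), where N = n1 + n2.
    """
    total_n = n1 + n2
    return d * (1 - 3 / (4 * total_n - 9))


# Hypothetical study with 10 participants per group.
d_raw = 0.50
d_corrected = hedges_correction(d_raw, n1=10, n2=10)
print(f"uncorrected d = {d_raw:.3f}, corrected d = {d_corrected:.3f}")
```

The correction shrinks d slightly towards zero, and its size fades quickly as the sample grows, which is why it matters mainly for small studies.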

Weighting

- The effect sizes are weighted by the inverse of the variance to give more weight to effects based on large sample sizes
- A sampling variance ($v_i$) is calculated for each effect size
- The standard error of each effect size is given by the square root of the sampling variance: $SE = \sqrt{v_i}$
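
To make the weighting concrete, the sketch below computes the sampling variance, standard error and inverse-variance weight for a standardized mean difference. The variance formula is the usual large-sample approximation for d and is an assumption here, because the slide's formula was lost; other effect size types (r, odds-ratio) use different variance formulas.

```python
import math


def d_variance(d, n1, n2):
    """Approximate sampling variance of a standardized mean difference."""
    return (n1 + n2) / (n1 * n2) + d ** 2 / (2 * (n1 + n2))


def inverse_variance_weight(variance):
    """Weight given to an effect size: the inverse of its sampling variance."""
    return 1.0 / variance


# Hypothetical study: d = 0.5 with 50 participants per group.
v_example = d_variance(0.5, 50, 50)
se_example = math.sqrt(v_example)
w_example = inverse_variance_weight(v_example)
print(f"v = {v_example:.5f}, SE = {se_example:.4f}, weight = {w_example:.2f}")
```

Larger samples give smaller variances and therefore larger weights, which is how studies with more participants come to count for more in the weighted mean.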


Population and sample

Sample: n = size, m = mean, d = effect size


Structuring a database

Constructing a database

Analytical Methods

- Fixed effects model
- Random effects model
- Multilevel model

Fixed effects assumptions

- Includes the entire population of studies to be considered; we do not want to generalise to other studies not included (e.g., future studies).
- All of the variability between effect sizes is due to sampling error alone. Thus, the effect sizes are only weighted by the within-study variance.
- Effect sizes are independent.

Conducting fixed effects meta-analysis

- There are 2 general ways of conducting a fixed effects meta-analysis: ANOVA and multiple regression
- The analogue to the ANOVA homogeneity analysis is appropriate for categorical variables
  - Looks for systematic differences between groups of responses within a variable
- Multiple regression homogeneity analysis is more appropriate for continuous variables and/or when there are multiple variables to be analysed
  - Tests the ability of groups within each variable to predict the effect size
  - Can include categorical variables in multiple regression as dummy variables (ANOVA is a special case of multiple regression)

Q-test of the homogeneity of variance

The homogeneity (Q) test asks whether the different effect sizes are likely to have all come from the same population (an assumption of the fixed effects model). Are the differences among the effect sizes no bigger than might be expected by chance?

$Q = \sum_{i=1}^{k} w_i (ES_i - \overline{ES})^2$

where $ES_i$ = effect size for each study ($i = 1$ to $k$), $\overline{ES}$ = mean effect size, and $w_i$ = a weight for each study based on the sample size.

However, this (chi-square) test is heavily dependent on sample size. It is almost always significant unless the numbers (studies and people in each study) are VERY small. This means that the fixed effect model will almost always be rejected in favour of a random effects model.
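
The fixed effects mean and the Q statistic described above can be sketched as follows. The function name and the three effect sizes are hypothetical. The p-value is not computed here: Q is compared against a chi-square distribution with k - 1 degrees of freedom, which needs a statistics library or table.

```python
import math


def fixed_effects_summary(effect_sizes, variances):
    """Inverse-variance weighted mean, its standard error, Q and its df."""
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / sum_w
    se = math.sqrt(1.0 / sum_w)
    # Q: weighted sum of squared deviations from the weighted mean.
    q = sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))
    df = len(effect_sizes) - 1
    return mean_es, se, q, df


# Hypothetical data: three studies with equal sampling variances.
mean_es, se, q, df = fixed_effects_summary([0.1, 0.3, 0.5], [0.04, 0.04, 0.04])
print(f"mean ES = {mean_es:.3f}, SE = {se:.3f}, Q = {q:.3f}, df = {df}")
```

A Q value well above its degrees of freedom suggests more variability among the effect sizes than sampling error alone would produce.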

Fixed effects mean effect size

Run MATRIX procedure:

Meta-Analytic Results

------- Distribution Description ---------------------------------
       N     Min ES     Max ES   Wghtd SD
  15.000       .050      1.200       .315

------- Fixed & Random Effects Model -----------------------------
         Mean ES    -95%CI    +95%CI        SE         Z         P
Fixed      .4312     .3383     .5241     .0474    9.0980     .0000
Random     .3963     .2218     .5709     .0890    4.4506     .0000

------- Random Effects Variance Component ------------------------
v    =   .074895

------- Homogeneity Analysis -------------------------------------
       Q         df          p
 44.1469    14.0000      .0001

Random effects v estimated via noniterative method of moments.

------ END MATRIX -----


Modelling moderators

- Model moderators by grouping effect sizes that are similar on a specific characteristic
- For example, group all effect size outcomes that come from studies using a placebo control group design and compare with effect sizes from studies using a waitlist control group design
- So in this example, Design is a dichotomous variable with the values 0 = placebo control and 1 = waitlist control

[Diagram labels: Exp. cond, ES, design]

Example fixed effects study

- On the next slide, we will look at the outcomes of a study to show the importance of various moderator variables
- Do Psychosocial and Study Skill Factors Predict College Outcomes? A Meta-Analysis
- Robbins, Lauver, Le, Davis, Langley, & Carlstrom (2004). Psychological Bulletin, 130, 261-288
- Aim
  - To examine the relationship between psychosocial and study skill factors (PSFs) and college retention by meta-analyzing 109 studies

Fixed effects output

- N = sample size for that variable
- k = number of correlation coefficients on which each distribution was based
- r = mean observed correlation
- CIr 10% = lower bound of the confidence interval for observed r
- CIr 90% = upper bound of the confidence interval for observed r

Regression output example

- Target self-concept domains are those that are directly relevant to the intervention
- Target-related are those that are logically relevant to the intervention, but not focal
- Non-target are domains that are not expected to be enhanced by the intervention

Regression coefficients and their standard errors

                      B        SE     Sig?
Target            .4892     .0552     yes
Target-related    .1097     .0587     no
Non-target        .0805     .0489     no

From O'Mara, Marsh, Craven, & Debus (2006)

Random effects assumptions

- Includes only a sample of studies from the entire population of studies to be considered. As a result, we do want to generalise to other studies not included in the sample (e.g., future studies).
- Variability between effect sizes is due to sampling error plus variability in the population of effects.
- Effect sizes are independent.

Random effects models

- If the homogeneity test is rejected (it almost always will be), it suggests that there are larger differences than can be explained by chance variation (at the individual participant level). There is more than one population in the set of different studies.
- Now we turn to the random effects model to determine how much of this between-study variation can be explained by study characteristics that we have coded.
- The total variance associated with the effect sizes has two components: one associated with differences within each study (participant-level variation) and one associated with between-study variance

Weighting in random effects models

- The random error variance component is added to the variance calculated earlier
- This means that the weighting for each effect size consists of the within-study variance ($v_i$) and the between-study variance ($v_\theta$)
- The new weighting for the random effects model ($w_{iRE}$) is given by the formula $w_{iRE} = 1 / (v_i + v_\theta)$
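
The sketch below shows one way the between-study variance and the random effects weights could be computed. It assumes the noniterative method-of-moments estimator (the approach named in the SPSS macro output shown in this presentation); the function name and data are hypothetical, and the estimate is floored at zero.

```python
import math


def random_effects_summary(effect_sizes, variances):
    """Method-of-moments random effects summary.

    Returns (v_theta, mean_es, se), where v_theta is the between-study
    variance component.
    """
    k = len(effect_sizes)
    if k < 2:
        raise ValueError("at least two effect sizes are needed")
    w = [1.0 / v for v in variances]  # fixed effects weights
    sum_w = sum(w)
    fixed_mean = sum(wi * es for wi, es in zip(w, effect_sizes)) / sum_w
    q = sum(wi * (es - fixed_mean) ** 2 for wi, es in zip(w, effect_sizes))
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    v_theta = max(0.0, (q - (k - 1)) / c)

    # Random effects weights: within-study plus between-study variance.
    w_re = [1.0 / (v + v_theta) for v in variances]
    mean_es = sum(wi * es for wi, es in zip(w_re, effect_sizes)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    return v_theta, mean_es, se


# Hypothetical data: three studies with equal sampling variances.
v_theta, re_mean, re_se = random_effects_summary([0.0, 0.3, 0.6], [0.04, 0.04, 0.04])
print(f"v_theta = {v_theta:.4f}, mean ES = {re_mean:.3f}, SE = {re_se:.4f}")
```

Because the between-study component is added to every study's variance, the random effects weights are more equal across studies and the standard error of the mean is larger than under the fixed effects model.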

Example random effects study

- Do Self-Concept Interventions Make a Difference? A Synergistic Blend of Construct Validation and Meta-Analysis
- O'Mara, Marsh, Craven, & Debus (2006). Educational Psychologist, 41, 181-206
- Aim
  - To examine what factors moderate the effectiveness of self-concept interventions by meta-analyzing 200 interventions

Example random effects results: homogeneity analyses

- QB = between-group homogeneity. If the QB value is significant, then the groups (categories) are significantly different from each other
- QW = within-group homogeneity. If QW is significant, then the effect sizes within a group (category) differ significantly from each other
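
As an illustration of how QB and QW relate, the sketch below partitions the total Q into a between-group and a within-group part for a categorical moderator. The category labels, data and function names are hypothetical. The variances passed in determine the weights; for a random effects version, the between-study variance component would first be added to each variance.

```python
def weighted_q(effect_sizes, variances):
    """Q statistic around the inverse-variance weighted mean."""
    weights = [1.0 / v for v in variances]
    mean_es = sum(w * es for w, es in zip(weights, effect_sizes)) / sum(weights)
    return sum(w * (es - mean_es) ** 2 for w, es in zip(weights, effect_sizes))


def anova_analogue(groups):
    """Partition Q for a categorical moderator.

    groups maps a category label to a list of (effect_size, variance) pairs.
    Returns (q_between, q_within, q_by_group).
    QB has (number of groups - 1) df; QW has (k - number of groups) df.
    """
    all_es = [es for pairs in groups.values() for es, _ in pairs]
    all_v = [v for pairs in groups.values() for _, v in pairs]
    q_total = weighted_q(all_es, all_v)
    q_by_group = {
        label: weighted_q([es for es, _ in pairs], [v for _, v in pairs])
        for label, pairs in groups.items()
    }
    q_within = sum(q_by_group.values())
    return q_total - q_within, q_within, q_by_group


# Hypothetical effect sizes grouped by control group design.
example_groups = {
    "placebo control": [(0.2, 0.04), (0.4, 0.04)],
    "waitlist control": [(0.6, 0.04), (0.8, 0.04)],
}
q_between, q_within, q_by_group = anova_analogue(example_groups)
print(f"QB = {q_between:.2f}, QW = {q_within:.2f}")
```

Each statistic is compared against a chi-square distribution with the degrees of freedom noted in the docstring.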

Random effects mean effect size

Run MATRIX procedure:

Meta-Analytic Results

------- Distribution Description ---------------------------------
       N     Min ES     Max ES   Wghtd SD
  15.000       .050      1.200       .315

------- Fixed & Random Effects Model -----------------------------
         Mean ES    -95%CI    +95%CI        SE         Z         P
Fixed      .4312     .3383     .5241     .0474    9.0980     .0000
Random     .3963     .2218     .5709     .0890    4.4506     .0000

------- Random Effects Variance Component ------------------------
v    =   .074895

------- Homogeneity Analysis -------------------------------------
       Q         df          p
 44.1469    14.0000      .0001

Random effects v estimated via noniterative method of moments.

------ END MATRIX -----

Multilevel modelling assumptions

- Meta-analytic data is inherently hierarchical (i.e., effect sizes nested within studies) and has random error that must be accounted for
- Effect sizes are not necessarily independent
- Allows for multiple effect sizes per study

Multilevel modelling

- New technique that is still being developed
- Provides more precise and less biased estimates of between-study variance than traditional techniques

Multilevel model structure example

- Level 1: outcome-level component
  - Effect sizes
- Level 2: study component
  - Publications

Conducting multilevel model analyses

- Intercept-only model, which incorporates both the outcome-level and the study-level components (similar to a random effects model)
- Expand the model to include predictor variables, to explain systematic variance between the study effect sizes

Example multilevel model

- Acute Stressors and Cortisol Responses: A Theoretical Integration and Synthesis of Laboratory Research
- Dickerson & Kemeny (2004). Psychological Bulletin, 130, 355-391.
- Aim
  - To examine methodological predictors of cortisol responses in a meta-analysis of 208 laboratory studies of acute psychological stressors

Example multilevel results

- Only 2 variables were significant (the quadratic term for time between stress onset and assessment, and time of day). The quadratic component is difficult to interpret as an unstandardized regression coefficient, but the graph suggests it is meaningfully large

Model selection

- Fixed, random, or multilevel?
- Generally, if more than one effect size per study is included in the sample, multilevel should be used
- However, if there is little variation at study level, the results of multilevel modelling meta-analyses are similar to random effects models

Model selection

- Do you wish to generalise your findings to other

studies not in the sample?

- Do you have multiple outcomes per study?

Exercise 1b

- The purpose of this exercise is to consider choice of meta-analytic method
- Discuss in groups of 3-4 people the question in relation to the gender differences in smiling study (LaFrance et al., 2003)
  - Is there independence of effect sizes? What are the implications for model choice (fixed, random, multilevel)?

Supplementary analyses: publication bias

- Fail-safe N
- Power analysis
- Trim-and-fill method

Dealing with publication bias

- The fail-safe N (Rosenthal, 1991) determines the number of studies with an effect size of zero needed to lower the observed effect size to a specified (criterion) level.
- For example, assume that you want to test the assumption that an effect size is at least .20.
- If the observed effect size was .26 and the fail-safe N was found to be 44, this means that 44 unpublished studies with a mean effect size of zero would need to be included in the sample to reduce the observed effect size of .26 to .20.
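
A sketch of a criterion-based fail-safe N follows. It assumes the form k0 = k (ES_observed / ES_criterion - 1), which is often attributed to Orwin (1983) and treats the observed mean as an unweighted average of k studies. The function name and the value of k are hypothetical, since the slide does not report how many studies were in its example.

```python
def fail_safe_n(k, observed_es, criterion_es):
    """Number of zero-effect studies needed to pull the mean effect size
    of k studies down from observed_es to criterion_es."""
    return k * (observed_es / criterion_es - 1.0)


# Hypothetical meta-analysis of 20 studies with a mean effect size of .26,
# tested against a criterion of .20.
n_needed = fail_safe_n(k=20, observed_es=0.26, criterion_es=0.20)
print(f"fail-safe N = {n_needed:.1f}")
```

The closer the criterion is to the observed effect size, the fewer zero-effect studies are needed, so the choice of criterion should be justified on substantive grounds.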

Dealing with publication bias

- Power is the probability that a statistical test avoids a Type II error. That is, it indicates the likelihood that the test correctly rejects the null hypothesis when in reality there is an effect. A Type II error is a failure to reject the null hypothesis, which implicitly suggests that there is no effect when in reality there is.
- Power, sample size, significance level, and effect size are inter-related.
- A lower powered study has to exhibit a much larger effect size to produce a significant finding. This has ramifications for publication bias.
- Muncer, Craigie, & Holmes (2003) recommend conducting a power analysis on all studies included in the meta-analysis
- Compare the observed value (d) against a theoretical value (includes information about sample size)
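
To show how power, sample size, significance level and effect size are tied together, here is a rough sketch using a normal approximation for a two-sided comparison of two equally sized groups. This is an illustrative approximation under those assumptions, not the procedure recommended by Muncer et al.; the function name and values are hypothetical.

```python
import math
from statistics import NormalDist


def approximate_power(d, n_per_group, alpha=0.05):
    """Approximate power to detect a standardized mean difference d
    with a two-sided test and n_per_group participants in each group."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    noncentrality = abs(d) * math.sqrt(n_per_group / 2)
    # Probability of rejecting in the direction of the true effect;
    # the very small chance of rejecting in the wrong direction is ignored.
    return NormalDist().cdf(noncentrality - z_crit)


# Hypothetical comparison: the same true effect in a small and a larger study.
power_small = approximate_power(d=0.5, n_per_group=20)
power_large = approximate_power(d=0.5, n_per_group=64)
print(f"n = 20 per group: power = {power_small:.2f}")
print(f"n = 64 per group: power = {power_large:.2f}")
```

The small study detects the same true effect far less often, so a small study that does reach significance has usually observed a larger-than-typical effect, which is the link to publication bias made above.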

Dealing with publication bias

- Trim and fill procedure (Duval & Tweedie, 2000a, 2000b) calculates the effect of potential data censoring (including publication bias) on the outcome of the meta-analyses.
- Nonparametric, iterative technique that examines the symmetry of effect sizes plotted by the inverse of the standard error. Ideally, the effect sizes should mirror on either side of the mean.

Exercises

- Examining the methods and output of published

meta-analysis

Exercise 1c

- Discuss in groups of 3-4 people the following question in relation to the gender differences in smiling study (LaFrance et al., 2003)
  - How did they deal with publication bias? Does this seem appropriate?

Exercise 2

- The purpose of this exercise is to practice reading meta-analytic results tables.
- This study, by Reger et al. (2004), examines the relationship between neuropsychological functioning and driving ability in dementia.
- (1) In Table 3, which variables are homogeneous for the "on-road tests" driving measure in the "All Studies" column? What does this tell you about those variables?
- (2) In Table 4, look at the variables that were homogeneous in question (1) for the "on-road tests" using "All Studies". Which variables have a significant mean ES? Which variable has the largest mean ES?

Exercise 2 Answers

- Homogeneous variables (non-significant Q-values): Mental status-general cognition, Visuospatial skills, Memory, Executive functions, Language
- All of the relevant mean effect sizes are significant. Memory and language are tied as the largest mean ESs for homogeneous variables (r = .44)

Conclusion

Summary

- We established what meta-analysis is, when and why we use meta-analysis, and the benefits and pitfalls of using meta-analysis
- Summarised how to conduct a meta-analysis
- Provided a conceptual introduction to analysis and interpretation of results based on fixed effects, random effects, and multilevel models
- Applied this information to examining the methods of a published meta-analysis

Limitations

- Comparing apples and oranges
- Quality of the studies included in the meta-analysis
- What to do when studies don't report sufficient information (e.g., "non-significant" findings)?
- Including multiple outcomes in the analysis (e.g., different achievement scores)
- Publication bias

Future directions

- With meta-analysis now one of the most popularly published research methods, it is an exciting time to be involved in meta-analytic research
- The hottest topics in meta-analysis are
  - Multilevel modelling to address the issue of independence of effect sizes
  - New methods in publication bias assessment (trim-and-fill method, post hoc power analysis)
- Also receiving attention
  - Establishing guidelines for conducting meta-analysis (best practice)
  - Meta-analyses of meta-analyses

Software

- Purpose-built
- Comprehensive Meta-analysis (commercial)
- Schwarzer (free, http://userpage.fu-berlin.de/health/meta_e.htm)
- Extensions to standard statistics packages
- SPSS, Stata and SAS macros, downloadable from http://mason.gmu.edu/~dwilsonb/ma.html
- Stata add-ons, downloadable from http://www.stata.com/support/faqs/stat/meta.html
- HLM V-known routine
- MLwiN
- Mplus
- Please note that we do not advocate any one programme over another, and cannot guarantee the quality of all of the products downloadable from the internet. This list is not exhaustive.

Key reference books

- Cooper, H., & Hedges, L. V. (Eds.) (1994). The handbook of research synthesis (pp. 521-529). New York: Russell Sage Foundation.
- Hox, J. (2003). Applied multilevel analysis. Amsterdam: TT Publishers.
- Hunter, J. E., & Schmidt, F. L. (1990). Methods of meta-analysis: Correcting error and bias in research findings. Newbury Park: Sage Publications.
- Lipsey, M. W., & Wilson, D. B. (2001). Practical meta-analysis. Thousand Oaks, CA: Sage Publications.

More information

- Pick up a brochure about our intermediate and advanced meta-analysis courses
- Visit our website: http://www.education.ox.ac.uk/research/resgroup/self/training.php