Title: Meta Analysis and Selection Bias with Applications in Medical Research
1. Meta Analysis and Selection Bias with Applications in Medical Research
- Jian Qing Shi
- Newcastle University, UK
- http://www.staff.ncl.ac.uk/j.q.shi
- email: j.q.shi_at_ncl.ac.uk
- INHA University, 16/12/2008
2. Outline
- Introduction: meta-analysis
- Selection bias
- Selectivity model and sensitivity analysis
- Meta-analysis for 2x2 tables
- Meta-analysis and dose-analysis
- Meta-analysis for Multi-arm trials
- Comments
3. New England J of Medicine (V342: 42-49, January 6, 2000)
- Looking Back on the Millennium in Medicine
- Elucidation of Human Anatomy and Physiology
- Discovery of Cells and Their Substructures
- Elucidation of the Chemistry of Life
- Application of Statistics to Medicine
- Development of Anesthesia
- Discovery of the Relation of Microbes to Disease
- Elucidation of Inheritance and Genetics
- Knowledge of the Immune System
- Development of Body Imaging
- Discovery of Antimicrobial Agents
- Development of Molecular Pharmacotherapy
4. Introduction: meta-analysis
- Example 1. Passive smoking and lung cancer
- Are they correlated?
- How are they correlated?
- What is the correlation?
5. Example 1. Passive smoking and lung cancer
- Number of cases for the exposure group
- Number of cases for the group without exposure
- Need to compare the two groups
6. Example 1. Passive smoking and lung cancer
- Compare the exposure group with the group without exposure
- The empirical log odds ratio is -0.285
- From its asymptotic variance, the 95% CI is (-0.833, 0.264).
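The quantities on this slide can be sketched in a few lines. The cell counts below are hypothetical (the table for this study is not reproduced here), and the function name is an assumption; the formulas are the usual empirical log odds ratio and its large-sample variance for a 2x2 table.

```python
import math

def log_odds_ratio(cases_exp, noncases_exp, cases_unexp, noncases_unexp):
    """Return (log odds ratio, standard error, 95% CI) for one 2x2 table."""
    # Empirical log odds ratio: log of the cross-product ratio.
    y = math.log((cases_exp * noncases_unexp) / (noncases_exp * cases_unexp))
    # Large-sample variance: sum of reciprocals of the four cell counts.
    s = math.sqrt(1 / cases_exp + 1 / noncases_exp
                  + 1 / cases_unexp + 1 / noncases_unexp)
    return y, s, (y - 1.96 * s, y + 1.96 * s)

# Hypothetical counts, for illustration only.
y, s, ci = log_odds_ratio(20, 80, 25, 75)

# A log odds ratio converts to a relative change in odds via exp(y) - 1.
change = math.exp(-0.285) - 1.0  # roughly -0.25, i.e. about a 25% decrease
```

A confidence interval that covers 0, as the one on the slide does, means the data from a single study are compatible with no association at all.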
7.
- Conclusion from this study: the log odds ratio is -0.285, meaning that exposure to smoke decreases the risk of lung cancer by about 25%.
- It might be a wrong result!
- Possible reasons: randomness, small sample size
- Meta-analysis
- Collect and synthesize results from similar individual studies. 37 studies are collected to assess the epidemiological evidence on lung cancer and passive smoking
- Table 1
8. Introduction: meta-analysis
- Assume the empirical log-odds ratio is approximately normal
- A random-effects meta-analysis model is defined for the study-level log-odds ratios
- For the passive smoking data, there is an overall excess risk of 24% (95% CI 13% to 36%)
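A minimal sketch of random-effects pooling follows. It uses the DerSimonian-Laird moment estimator of the between-study variance, which is one common choice and not necessarily the estimator used in the talk; the function name and the inputs are illustrative assumptions.

```python
import math

def random_effects_dl(y, s):
    """Pool estimates y (with standard errors s) under a random-effects model.

    Returns (pooled mean, its standard error, between-study variance tau^2),
    using the DerSimonian-Laird moment estimator for tau^2.
    """
    k = len(y)
    w = [1.0 / si ** 2 for si in s]                       # fixed-effect weights
    mu_fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - mu_fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (k - 1)) / c)                    # truncated at zero
    w_star = [1.0 / (si ** 2 + tau2) for si in s]         # random-effects weights
    mu = sum(wi * yi for wi, yi in zip(w_star, y)) / sum(w_star)
    se = math.sqrt(1.0 / sum(w_star))
    return mu, se, tau2

# Illustrative inputs only: two equally precise studies that disagree.
mu, se, tau2 = random_effects_dl([0.0, 1.0], [0.1, 0.1])
```

On the log-odds-ratio scale the pooled excess risk is `math.exp(mu) - 1`, with interval limits obtained by exponentiating `mu - 1.96 * se` and `mu + 1.96 * se`.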
9. Publication bias: in favour of large studies and studies with significant results
- Figure 1: funnel plot for passive smoking data
10. Example: passive smoking and coronary heart disease (10 cohort and 8 case-control studies)
11. Example: the effect of selective decontamination of the digestive tract on the risk of respiratory tract infection (22 trials)
12.
- Other biases related to publication, e.g.
- Time lag bias
- Grey literature bias
- Language bias
- Duplicate publication bias
- Selection bias: a headache for statisticians
- Lack of information (non-ignorable missing data)!
13. Modeling for publication bias
- Use weighted distribution
- Nonparametric approach: trim and fill
- Selectivity model and sensitivity analysis
14. Use weighted distribution
- Weighted distributions have been proposed for problems of non-random sampling (Rao 1965)
- In meta-analysis, the weight function is the probability that a study is selected
15. Use weighted distribution
- How to choose a weight function?
- Parametric model
- Nonparametric model
- Bayesian approach (e.g. Cleary et al 1997)
- Problems: how to choose the form of the weight function? How to determine its parameters?
- Copas and Jackson (2004) gave a bound for publication bias
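As a sketch of the weighted-distribution idea, suppose a study outcome y has density f(y) when there is no selection and is selected with probability proportional to a weight w(y). Both symbols are notation assumed here, not taken from the slides. The density of the selected studies is then the reweighted and renormalized version of f:

```latex
f_w(y) \;=\; \frac{w(y)\, f(y)}{\int w(t)\, f(t)\, dt}
```

When w is constant the observed density is f itself, so bias arises only when selection depends on the outcome.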
16. Trim and fill (Duval and Tweedie, 2000)
- Given a starting value of the overall effect, estimate the number of missing studies
- The asymmetric studies are trimmed, and the overall effect is re-estimated
- After convergence, fill in the missing counterparts and calculate the final estimate of the overall effect
17. Selection model and sensitivity analysis
- Idea: define a latent variable z
- A study is selected only when the latent variable z > 0
- Meta-analysis and selection models
18. Selection model and sensitivity analysis
- If the correlation between y and z is zero, this is the model without selection bias
- If the correlation is positive, selected studies will have z > 0 and so the errors in y are more likely to be positive, leading to a positive bias in y.
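The formulas for this model are not reproduced in the text. Assuming the standard formulation of Copas and Shi (2000), with y_i the observed effect, s_i its reported standard error, and all Greek symbols being notation assumed here, the model and the resulting bias can be written as:

```latex
\begin{aligned}
y_i &= \mu_i + \sigma_i \epsilon_i, \qquad \mu_i \sim N(\mu, \tau^2), \qquad \epsilon_i \sim N(0,1),\\
z_i &= a + \frac{b}{s_i} + \delta_i, \qquad \delta_i \sim N(0,1), \qquad \operatorname{corr}(\epsilon_i, \delta_i) = \rho,\\
E(y_i \mid z_i > 0,\, s_i) &= \mu + \rho\,\sigma_i\, \lambda\!\left(a + \frac{b}{s_i}\right), \qquad \lambda(u) = \frac{\phi(u)}{\Phi(u)}.
\end{aligned}
```

Under these assumptions the bias term vanishes when rho = 0 and grows with sigma_i, which is why small studies show the largest upward shift in a funnel plot.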
19. Selection model and sensitivity analysis
- The parameters a and b control the marginal probability that a study is selected
- a controls the overall proportion published
- b controls how the chance of selection depends on study size. We expect b > 0, so that
- very large studies (very small s) are almost bound to be selected,
- but only a proportion of the smaller ones will be selected.
20. Sensitivity analysis
- Since we do not observe how many studies are not selected, a and b cannot be estimated!
- Idea of sensitivity analysis
- For fixed values of a and b, make an inference about the parameters
- Examine how sensitively any conclusions depend on the particular choice of these parameters
- Test the fit of the funnel plot
21. Sensitivity analysis: Step 1
- Identify a range of selection models
- Determine a plausible range of (a,b)
- Or assume that 0.01 < P(select | s) < 0.99, which is then converted into a range of values of a and b.
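The conversion from selection probabilities to (a,b) can be sketched as follows, assuming the probit form P(select | s) = Phi(a + b/s) of the Copas and Shi selection model. The function names and the example values of s are hypothetical.

```python
from statistics import NormalDist

_STD_NORMAL = NormalDist()

def select_prob(s, a, b):
    """Assumed marginal selection probability P(select | s) = Phi(a + b/s)."""
    return _STD_NORMAL.cdf(a + b / s)

def ab_from_probs(s_min, s_max, p_at_s_min, p_at_s_max):
    """Solve for (a, b) from selection probabilities at two standard errors.

    s_min is the smallest standard error (largest study) and s_max the
    largest one (smallest study); both probabilities must lie in (0, 1).
    """
    q_min = _STD_NORMAL.inv_cdf(p_at_s_min)   # equals a + b / s_min
    q_max = _STD_NORMAL.inv_cdf(p_at_s_max)   # equals a + b / s_max
    b = (q_min - q_max) / (1.0 / s_min - 1.0 / s_max)
    a = q_min - b / s_min
    return a, b

# Hypothetical range: the largest study is almost surely selected,
# the smallest one only half of the time.
a, b = ab_from_probs(s_min=0.1, s_max=0.5, p_at_s_min=0.99, p_at_s_max=0.5)
```

Scanning `p_at_s_min` and `p_at_s_max` over a grid between 0.01 and 0.99 then yields the grid of (a,b) pairs used in the later steps.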
22. Sensitivity analysis: Step 2
- For each (a,b) pair on the grid, calculate the overall mean and other quantities.
23. Sensitivity analysis: Step 3
- Test how well the meta-analysis selection models fit the funnel plot
- Idea: refit the same model but with a linear term in s added to the expected value of y.
- If we accept the null hypothesis that the coefficient of this linear term is zero, then we accept that the selection model has satisfactorily explained any apparent linear relationship between y and s.
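In symbols, and assuming the notation of the Copas and Shi selection model (mu_i for the study-level mean, sigma_i for the within-study standard deviation, beta for the added slope, all assumed here), the augmented model and the hypothesis being tested are:

```latex
y_i = \mu_i + \beta s_i + \sigma_i \epsilon_i, \qquad H_0:\ \beta = 0
```

A small P-value for H_0 indicates that the chosen (a,b) pair leaves a residual trend in the funnel plot unexplained.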
24. Sensitivity analysis: Step 3
- This P-value should be at least, say, 5%. For the passive smoking data, this means the estimated excess risk is at most 22%.
- Corresponding to the average P-value of 0.5, the excess risk is only 14%.
25. Sensitivity analysis: Step 4
- For each pair (a,b), the inference can be summarized by the following quantities:
- P-value for testing whether the overall mean is zero
- Lower limit of the 95% confidence interval for the overall mean
- Upper limit of the 95% confidence interval for the overall mean
- P-value for fit to the funnel plot
- P(select | s) at two reference values of the standard error s
- Estimated number of selected and unselected studies
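One way to estimate the number of unselected studies is an inverse-probability argument: a selected study with selection probability p stands for (1 - p)/p unselected ones. The sketch below assumes this estimate together with the probit form P(select | s) = Phi(a + b/s); the function name and inputs are hypothetical, and the formula on the original slide may differ.

```python
from statistics import NormalDist

def estimated_unselected(std_errors, a, b):
    """Inverse-probability estimate of the number of unselected studies."""
    phi = NormalDist().cdf
    total = 0.0
    for s in std_errors:
        p = phi(a + b / s)          # assumed P(select | s) for this study
        total += (1.0 - p) / p      # unselected studies it stands in for
    return total

# With a = b = 0 every study has selection probability 0.5,
# so each observed study corresponds to one unobserved study.
n_missing = estimated_unselected([0.1, 0.2, 0.3], a=0.0, b=0.0)
```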
26. Example 2. Passive smoking and coronary heart disease
- Without considering selection bias, we estimated a 28% excess risk
- Sensitivity analysis: only an excess risk of less than 22% can be considered reasonably consistent with the data
27. Selection bias and meta-analysis for 2x2 tables by using exact distributions
- Suppose that a study has binomial outcomes
- The empirical logistic transformation
- The normal approximation is clearly inappropriate if the sample sizes are not large, or if the underlying probabilities are such that the observed counts can be close to 0 or to the sample size
28. Meta-analysis for 2x2 tables by using exact distributions
- Meta-analysis model
- For several 2x2 tables, conditional and unconditional estimates of an assumed common log-odds ratio are asymptotically equivalent, but the unconditional estimates may be biased if the number of tables is large (see e.g. Cox and Snell, 1989)
- We use the conditional likelihood approach
29. Meta-analysis for 2x2 tables by using exact distributions
- For each study, use the conditional probability of the observed table
- The conditional likelihood involves an integral
- Use Gaussian quadrature approximation.
- Use Laplace method or other asymptotic method.
- MCEM
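For a single 2x2 table, conditioning on the total number of events gives the noncentral hypergeometric distribution, which depends only on the log-odds ratio. The sketch below computes that probability for a fixed log-odds ratio `theta`; the function and argument names are assumptions, and the integral over a random study effect mentioned above is not included.

```python
import math

def conditional_prob(k, n1, n2, t, theta):
    """P(y1 = k | y1 + y2 = t) for two binomial arms with log-odds ratio theta.

    n1 and n2 are the arm sizes; k is the number of events in the first arm.
    """
    lo, hi = max(0, t - n2), min(n1, t)      # support of y1 given the margins
    if not lo <= k <= hi:
        return 0.0

    def term(j):
        # Unnormalized noncentral hypergeometric weight.
        return math.comb(n1, j) * math.comb(n2, t - j) * math.exp(theta * j)

    return term(k) / sum(term(j) for j in range(lo, hi + 1))

# With theta = 0 this reduces to the ordinary hypergeometric probability.
p_null = conditional_prob(1, n1=2, n2=2, t=2, theta=0.0)
```

Because the nuisance baseline probability cancels in this ratio, the conditional approach avoids estimating one extra parameter per table.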
30. Meta-analysis for 2x2 tables: Monte Carlo EM algorithm
- Start from the full log-likelihood
- E-step at the (r+1)-th iteration
- There is no analytical form for the E-step expectation, so we use a Monte Carlo approximation.
31. Meta-analysis for 2x2 tables: Markov chain Monte Carlo EM algorithm
- M-step: it is rather simple.
- The standard deviation of the estimates can also be calculated easily by the MC-EM algorithm.
32. Meta-analysis with selectivity: inspection for selection bias
- Juvenile offending data: studies on the effectiveness of rehabilitation programmes for juvenile offenders
33. Meta-analysis with selectivity
- Let S be the event that a study is selected. The selection model specifies the probability of S.
- The MC-EM algorithm can be extended to cover the sensitivity analysis model
34. Example: Juvenile offending data
- Conclusion: using 5% as the threshold, the average treatment effect comes down from 1.14 to 1, i.e., the conventional method overestimates by at least 14%.
- It comes down to 0.6 when the P-value attains its expected null value of 0.5.
35. Meta-analysis and dose-analysis
- Example: alcohol use and breast cancer
- Dose-analysis model for the i-th study
36. Meta-analysis and dose-analysis
- Three major problems
- Heterogeneity
- Grouped dose measures
- Publication bias
37. Heterogeneity and within-study dependence
- Meta-analysis model
- Within-study correlation between log-odds ratios, which is approximated
- With suitable notation, the model can be written in a form from which the overall MLE can be calculated
38. Grouped exposure levels
- The exposure levels are often not recorded exactly but grouped into class intervals (e.g. 2.5-9.3)
- Disadvantages of using a single assigned value: inaccurate estimates and underestimation of variance
- Idea: suppose the exposure levels of all individuals have pdf f(x), and the probability of being a case, given dose x, is p(x), say. Then the probability that an individual in class interval J is a case is the average of p(x) over J, weighted by f(x).
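Written out, with p(x) denoting the probability of being a case at dose x (a symbol assumed here) and f(x) the dose density, the probability for class interval J follows from conditioning on the dose falling in J:

```latex
P(\text{case} \mid x \in J) \;=\; \frac{\int_J p(x)\, f(x)\, dx}{\int_J f(x)\, dx}
```

Replacing the interval by a single assigned dose amounts to evaluating p at one point, which ignores the spread of f within J.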
39. Grouped exposure levels
- Theorem. Approximately, we have
- The above approach is also valid for adjusted log-odds ratios, provided the values of the covariate adjustments are not too large
40. Grouped exposure levels
- The likelihood for is
- Dose distribution: a parametric model can be fitted to the observed frequencies of the dose intervals by maximising the log-likelihood
41. Example: breast cancer and alcohol use
- 13 studies are used in the meta-analysis
- When the number of categories is small, the trend estimate is very sensitive to the choice of assigned value
- The mean and GL approaches suggest a much stronger dose trend (excess risk 16% for one extra drink daily) than the ML approach (7%).
- Fixed-effect and random-effect models are almost indistinguishable when ML is used; variability of dose levels may explain the heterogeneity.
42. Selection bias and sensitivity analysis
43. Selection bias and sensitivity analysis
- Selection model
- It can be proved that the marginal selection probability for a study with standard error s has an explicit form
44. Example: breast cancer and alcohol use
- Greenland and Longnecker (1988) gave a random-effect trend estimate of 0.0112, implying that one extra drink daily (13 g of alcohol) increases risk by 16%
- Sensitivity analysis: the average value of the estimate is about 0.0038, so the risk increase is about 5%, which is consistent with the later, more extensive meta-analysis. The causal role of alcohol is in question.
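The percentages above follow from the trend estimates by simple arithmetic, assuming the trend is the change in log odds ratio per gram of alcohol per day. The function name below is hypothetical.

```python
import math

def excess_risk_per_drink(trend_per_gram, grams_per_drink=13.0):
    """Relative increase in risk for one extra drink per day."""
    return math.exp(trend_per_gram * grams_per_drink) - 1.0

conventional = excess_risk_per_drink(0.0112)   # estimate without selection
adjusted = excess_risk_per_drink(0.0038)       # average under sensitivity analysis
```

Rounded to whole percentages these give 16% and 5%, matching the figures on the slide.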
45. Further development: using exact distribution
- Meta-analysis and dose-analysis by using the exact distribution, based on the conditional likelihood
46. Meta-analysis for Multi-arm trials
47. Comments: how to adjust bias
- Is it possible to adjust bias? This is actually a problem of non-ignorable missing data
- Sensitivity analysis
- Local sensitivity analysis (model misspecification) in function space for model
- Bound for possible bias
- Double the variance
48. References
- Copas and Shi (2000, Biostatistics)
- Copas and Shi (2000, BMJ)
- Copas and Shi (2001, SMMR)
- Shi and Copas (2002, JRSSB)
- Shi and Copas (2004, SIM)
- Chootrakool and Shi (2008, 2009)