Title: Clinical Significance for Quality of Life Endpoints in Clinical Trials
1Clinical Significance for Quality of Life
Endpoints in Clinical Trials
Jeff A. Sloan, Ph.D. Mayo Clinic, Rochester,
MN, USA
- FDA/Industry Statistics Workshop
- Washington, September 16, 2005
2Primary goal: advance the state of the science to help cancer patient QOL soar
3Take home message: there is good news
- There are problems with using QOL assessments as indicators of efficacy in clinical trials.
- There are scientifically sound solutions to these problems.
The problems have been disseminated widely and consistently. The solutions have not.
4It takes a certain amount of bravery to work in
QOL research
5Science is a candle in the dark - Carl Sagan
We will use the candle of science to improve the
QOL of cancer patients
6How do you determine the clinical significance
of QOL assessments?
7What is a clinically meaningful QOL burden?
8Why is it difficult to define clinical
significance for QOL?
- Pain analogy
- 25 years ago physicians were the sole raters of patient pain
- JCAHO 2000 guideline: every patient's pain to be assessed upon intake on a 0-10 scale
- Time and experience alleviate novelty and skepticism, and guidelines evolve
9Why is it difficult to define clinical
significance for QOL?
- Blood pressure analogy
- 100 years ago, clinical significance of BP scores was unknown (Lancet 1899)
- massage therapy was the gold standard
- guidelines for BP clinical significance are still being redrawn today (McCrory DC, Lewis SZ. Chest. 126(1 Suppl):11S-13S, 2004)
10The solution found for tumor response cutoffs may
provide guidance
- We call a reduction of 50% a response.
- Have reductions of 49% all the time, but do not worry about misclassification.
- Moertel (1976): basis for 50% cutoff
- Find a cutoff and stick to it? (RECIST)
11What Clinical significance is NOT
- Statistical significance
- Example drawn from JCO 2001 (anonymous)
- HSQ before / after scores on 1300 patients
- all p-values < 0.0001
- conclusion: all domains of QOL were significantly different across treatment groups
- problem: 1300 patients provide 80% power to detect a change of 1 unit on a 0-100 point scale
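The sample-size point can be made concrete with a rough calculation. The sketch below is only an illustration, not the calculation from the cited paper: it assumes a paired before/after design, a two-sided 5% test, a normal approximation, and a hypothetical standard deviation of change scores of 13 points (the slide does not report one). The function name `detectable_change` is made up for this example.

```python
from math import sqrt
from statistics import NormalDist


def detectable_change(n, sd_change, power=0.80, alpha=0.05):
    """Smallest mean change a paired design can detect (normal approximation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)   # two-sided critical value
    z_power = z.inv_cdf(power)           # quantile for the target power
    return (z_alpha + z_power) * sd_change / sqrt(n)


# Hypothetical SD of change scores (13 points); not taken from the slide.
for n in (50, 200, 1300):
    print(n, round(detectable_change(n, sd_change=13.0), 2))
```

Under these assumptions a sample of 1300 can detect a mean change of about 1 point, so a very small p-value says little about whether the change matters to patients.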
12EORTC QLQ-LC13
- Item        n=537   n=346   Effect Size
- Coughing    46.2    44.3    small
- Dyspnea     17.2    16.2    small
- Pain        26.9    25.5    small
- all p-values were statistically significant
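To see why these differences are labeled small, they can be expressed in standard deviation units. The sketch below assumes the items are scored on a 0-100 scale and borrows the SD estimate of 16.7 points (range divided by 6) from the Empirical Rule Effect Size approach described later in this talk. The actual SDs from the study are not given on the slide, so the resulting numbers are illustrative only.

```python
# Group means as shown on the slide: (n=537 group, n=346 group).
items = {
    "Coughing": (46.2, 44.3),
    "Dyspnea": (17.2, 16.2),
    "Pain": (26.9, 25.5),
}

ASSUMED_SD = 100 / 6  # assumption: SD ~ range / 6 on a 0-100 scale


def effect_size(mean_a, mean_b, sd=ASSUMED_SD):
    """Absolute difference in means expressed in SD units."""
    return abs(mean_a - mean_b) / sd


for name, (a, b) in items.items():
    print(f"{name}: difference {a - b:.1f} points, effect size {effect_size(a, b):.2f} SD")
```

With this assumed SD, each difference is under 0.2 SD, below Cohen's conventional cutoff for a small effect, even though every p-value was statistically significant.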
14The Six Papers
- 1) Methods used to date
- 2) Group versus individual differences
- 3) Single item versus multi-item
- 4) Patient, clinician, population perspectives
- 5) Changes over time
- 6) Practical considerations for specific audiences
- MCP, April, May, June 2002
16No single statistical decision rule or procedure
can take the place of well-reasoned consideration
of all aspects of the data by a group of
concerned, competent, and experienced persons
with a wide range of scientific backgrounds and
points of view. Canner (1981)
If it looks like a duck, sounds like a duck, and
walks like a duck, the odds of it being a worm or
an elephant in a clever disguise are small in the
extreme. Sloan (2001)
17Bottom Line
- Assessing the clinical significance of QOL can be
as simple as a 10-point change on a 100-point
scale, if that is consistent with the goals of
the scientific enquiry. The real issue underlying
the controversy over QOL is the relative novelty
and lack of experience that presently exists with
QOL. With time and familiarity this too shall pass.
- (Sloan, J Chronic Obs. Pul. Dis. 2:57-62, 2005.)
18Presenting global solutions is always interesting
19Two general methods for clinical significance
- Anchor-based methods: requirements
- independent interpretable measure (the anchor) which has appreciable correlation between anchor and target
- Distribution-based methods
- rely on expression of magnitude of effect in
terms of measure of variability of results
(effect size)
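As an illustration of the distribution-based idea, the sketch below expresses a difference between two groups in units of their pooled standard deviation (Cohen's d). This is one common effect size, chosen here as an example rather than as the specific statistic the talk has in mind, and the scores are made up.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(group_a, group_b):
    """Standardized mean difference using the pooled sample SD."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_var)


# Made-up QOL scores on a 0-100 scale, for illustration only.
treatment = [62, 70, 55, 81, 66, 74, 59, 68]
control = [58, 61, 50, 72, 60, 66, 53, 64]
print(round(cohens_d(treatment, control), 2))
```

An anchor-based analysis would instead relate changes in the QOL score to an external, interpretable measure, which requires data beyond the QOL scores themselves.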
20The MID method in one slide
21The Empirical Rule Effect Size (ERES) Approach
(Sloan et al, Cancer Integrative Medicine 1(1):41-47, 2003)
- QOL tool range = 6 standard deviations
- SD estimate = 100 percent / 6
- = 16.7% of theoretical range
- Two-sample t-test effect sizes (J Cohen, 1988)
- small, moderate, large effect (0.2, 0.5, 0.8 SD shift)
- S, M, L effects = 3%, 8%, 12% of range
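The arithmetic behind the ERES approach can be sketched in a few lines. The function name `eres_thresholds` is hypothetical; the only inputs are the scale limits and Cohen's 0.2, 0.5, and 0.8 SD conventions quoted on the slide.

```python
def eres_thresholds(scale_min, scale_max):
    """Small/moderate/large effects in scale points, taking SD = range / 6."""
    sd = (scale_max - scale_min) / 6
    cohen = (("small", 0.2), ("moderate", 0.5), ("large", 0.8))
    return {label: d * sd for label, d in cohen}


for label, points in eres_thresholds(0, 100).items():
    print(f"{label}: {points:.1f} points")
```

On a 0-100 scale this gives about 3.3, 8.3, and 13.3 points. The large-effect value comes out nearer 13 than the 12 shown on the slide, which may reflect rounding in the original.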
23The Empirical Rule
- Tchebyshev's Theorem: at least 1 - 1/k^2 of any distribution will fall within k standard deviations (SDs) of the mean
- If the distribution is symmetric, 99% will fall within 3 standard deviations
- The pdf for the range is a function of the SD
- an estimate of the SD can be obtained via
- range = 6 SD
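The bound in Tchebyshev's theorem can be compared with the coverage of a normal distribution. The sketch below uses the normal distribution only as a familiar reference case; the talk itself refers more generally to symmetric distributions, and the function names are illustrative.

```python
from statistics import NormalDist


def chebyshev_lower_bound(k):
    """Minimum fraction of any distribution within k SDs of the mean."""
    return 1 - 1 / k**2


def normal_coverage(k):
    """Fraction of a normal distribution within k SDs of the mean."""
    z = NormalDist()
    return z.cdf(k) - z.cdf(-k)


for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 3), round(normal_coverage(k), 4))
```

For k = 3 the distribution-free bound is about 89%, while a normal distribution places about 99.7% of values within 3 SDs of the mean, which is why a range of roughly 6 SDs is a reasonable working estimate.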
24Assumption Checking for ERES (Dueck, Sloan, 2006, J. Biopharm. Stats, in press)
- Tchebyshev's Inequality is conservative
- Tested the effect of various distributional assumptions
- Only a uniform distribution results in deviation from the assumption of a 6 SD-based estimate (28% instead of 17%)
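The uniform-distribution figure can be checked directly, since a uniform distribution on an interval has an SD equal to its range divided by the square root of 12 (a standard result). The sketch compares that with the range/6 rule; the function names are illustrative and this is not the analysis from the cited paper.

```python
from math import sqrt


def uniform_sd_fraction():
    """SD of a uniform distribution as a fraction of its range: 1 / sqrt(12)."""
    return 1 / sqrt(12)


def six_sd_fraction():
    """SD as a fraction of the range under the range = 6 SD rule."""
    return 1 / 6


print(f"uniform: {uniform_sd_fraction():.1%} of range")
print(f"6-SD rule: {six_sd_fraction():.1%} of range")
```

This gives roughly 28.9% versus 16.7% of the range, in line with the figures quoted on the slide.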
25All Methods Give Similar Answers
- Cohen - 1/2 SD is moderate effect
- MCID - 1/2 point on 7-point Likert
- 7 - 1 = 6 point range => SD of 1 unit
- so 1/2 point => 1/2 SD
- Cella - 10 points on FACT-G
- 10/1.12 = 8.9% of range; 8.9/16.7 ≈ 1/2 SD
- Feinstein - correlation approach
- Cohen was arbitrary, should be 0.6 SD
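The conversions on this slide can be reproduced with the range/6 estimate of the SD. In the sketch below, the 112-point range for FACT-G is an assumption inferred from the slide's "10/1.12"; it is not stated explicitly in the talk, and the helper name `in_sd_units` is made up for this example.

```python
def in_sd_units(difference, scale_range):
    """Express a difference in SD units, taking SD = scale_range / 6."""
    return difference / (scale_range / 6)


# 7-point Likert item: range 7 - 1 = 6, so the SD estimate is 1 unit.
print(in_sd_units(0.5, 7 - 1))

# FACT-G: 10 points; the 112-point range is inferred from "10/1.12".
print(round(in_sd_units(10, 112), 2))
```

Both results land at or just above one half of an SD, which is the sense in which the different methods agree.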
26The Good News
- Statistical, Philosophical, Empirical, Clinical, Historical, Practical approaches to defining a clinically significant effect for symptom assessments are all in the same ballpark
- A 10 point difference on a 100-point scale (1/2 SD) is almost always going to be clinically significant
- Smaller differences may also be meaningful (data)
- Applies to groups or individuals (just different SD)
- Norman GR, Sloan JA, Wyrwich KW. Expert Review of Pharmacoeconomics and Outcomes Research, Sept 2004; 4(5): 515-519
- Sloan JA, Cella D, Hays R. J Clin Epidemiol (in press).
27Four Guidelines (Sloan, Cella, Hays, JCE 2005, in press)
- The method used to obtain an estimate of clinical significance should be scientifically supportable.
- The ½ SD is a conservative estimate of an effect size that is likely to be clinically meaningful. An effect size greater than ½ SD is not likely to be one that can be ignored. In the absence of other information, the ½ SD is a reasonable and scientifically supportable estimate of a meaningful effect.
28Four Guidelines (Sloan, Cella, Hays, JCE 2005, in press)
- Effect sizes below ½ SD, supported by data regarding the specific characteristics of a particular QOL assessment or application, may also be meaningful. The minimally important difference may be below ½ SD in such cases.
- If feasible, multiple approaches to estimating a tool's clinically meaningful effect size in multiple patient groups are helpful in assessing the variability of the estimates. However, the lack of multiple approaches with multiple groups should not preemptively restrict application of information gained to date.
29Summary
- Defining clinical significance for QOL assessments is today where pain was 25 years ago, tumor response was 50 years ago and blood pressure was 100 years ago
- Define clinical significance a priori, and use the definition in the analytical process
- Consensus is building as the answers from different approaches are similar and relatively robust
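One way to use an a priori definition in the analysis is to fix the threshold before looking at the data and then classify each patient's change against it. The sketch below assumes a 0-100 scale where higher scores mean better QOL and uses the 10-point threshold discussed in this talk; the function name and the patient data are made up.

```python
def classify_change(baseline, follow_up, threshold=10.0):
    """Label a patient's change against a pre-specified threshold (in scale points)."""
    change = follow_up - baseline
    if change >= threshold:
        return "improved"
    if change <= -threshold:
        return "worsened"
    return "stable"


# Made-up (baseline, follow-up) pairs on a 0-100 scale, higher = better QOL.
patients = [(50, 65), (70, 72), (60, 45), (40, 50), (80, 79)]
labels = [classify_change(b, f) for b, f in patients]
print(labels)
```

The proportions in each category can then be compared across treatment arms, with the threshold documented in the protocol rather than chosen after the results are known.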
30New ideas have enabled us to make advances in QOL
science
31A Mayo/FDA meeting regarding guidance on patient-reported outcomes (PRO): Discussion, Education, and Operationalization
- FDA to release guidances for assessing PROs in all clinical trials (3rd quarter 2005?)
- Meeting co-sponsored with FDA to
- provide a focused process to facilitate discussion among all stakeholders
- educate stakeholders on background, content, and concerns
- provide an opportunity for input
- delineate ways to best operationalize the guidance into clinical trials
- February 23-25, 2006, DC (Westfields Marriott, Chantilly, VA, 7 miles from Dulles)
32The NCCTG QOL Team
33Thank you
Email jsloan_at_mayo.edu