Title: Moving from Correlative Science to Predictive Medicine
1Moving from Correlative Science to Predictive Medicine
- Richard Simon, D.Sc.
- Chief, Biometric Research Branch
- National Cancer Institute
- http://linus.nci.nih.gov/brb
2BRB Website: brb.nci.nih.gov
- Powerpoint presentations and audio files
- Reprints & Technical Reports
- BRB-ArrayTools software
- BRB-ArrayTools Data Archive
  - 100 published cancer gene expression datasets with clinical annotations
- Sample Size Planning for Clinical Trials with Predictive Biomarkers
3Kinds of Biomarkers
- Surrogate endpoint
  - Pre- and post-rx, early measure of clinical outcome
- Pharmacodynamic
  - Pre- and post-rx, measures an effect of rx on disease
- Prognostic
  - Which patients need rx
- Predictive
  - Which patients are likely to benefit from a specific rx
- Product characterization
  - For biological rx
4Surrogate Endpoints
- It is extremely difficult to properly validate a biomarker as a surrogate for clinical outcome.
- It requires a series of randomized trials in which both the biomarker and the clinical outcome are measured and show correlated differences.
- Even the concept of a surrogate is dubious, because a large treatment effect on PFS often corresponds to a small treatment effect on survival.
5
- Pharmacodynamic biomarkers used as endpoints in phase I or II studies need not be validated surrogates of clinical outcome
- Unvalidated biomarkers can be used for early futility analyses in phase III trials
6Prognostic Biomarkers
- Most prognostic factors are not used because they are not therapeutically relevant
- Most prognostic factor studies are poorly designed
  - They are not focused on a clear therapeutic decision context
  - They use a convenience sample of patients for whom tissue is available. Generally the patients are too heterogeneous to support therapeutically relevant conclusions
  - They address statistical significance rather than predictive accuracy relative to standard prognostic factors
7Pusztai et al. The Oncologist 8:252-8, 2003
- 939 articles on prognostic markers or prognostic factors in breast cancer in the past 20 years
- ASCO guidelines recommend routine testing only for ER, PR and HER-2 in breast cancer
- With the exception of ER or progesterone receptor expression and HER-2 gene amplification, there are no clinically useful molecular predictors of response to any form of anticancer therapy.
8Prognostic Biomarkers Can be Therapeutically Relevant
- 3-5% of node negative ER+ breast cancer patients require or benefit from systemic rx other than endocrine rx
- Prognostic biomarker development should focus on specific therapeutic decision contexts
9Key Features of OncotypeDx Development
- Identification of important therapeutic decision context
- Prognostic marker development was based on patients with node negative ER positive breast cancer receiving tamoxifen as only systemic treatment
  - Use of patients in NSABP clinical trials
- Staged development and validation
  - Separation of data used for test development from data used for test validation
- Development of robust assay with rigorous analytical validation
  - 21 gene RT-PCR assay for FFPE tissue
  - Quality assurance by single reference laboratory operation
10Predictive Classifiers
- Most cancer treatments benefit only a minority of patients to whom they are administered
  - Particularly true for molecularly targeted drugs
- Being able to predict which patients are likely to benefit would
  - Save patients from unnecessary toxicity, and enhance their chance of receiving a drug that helps them
  - Help control medical costs
  - Improve the success rate of clinical drug development
11
- Cancers of a primary site are often a heterogeneous grouping of diverse molecular diseases
- The molecular diseases vary enormously in their responsiveness to a given treatment
- It is feasible (but difficult) to develop prognostic markers that identify which patients need systemic treatment and which have tumors likely to respond to a given treatment
  - e.g. breast cancer and ER/PR, HER-2
12
- Conducting a phase III trial in the traditional way with tumors of a specified site/stage/pre-treatment category may
  - Result in a false negative trial
    - Unless a sufficiently large proportion of the patients have tumors driven by the targeted pathway
  - Require a very large number of randomized patients to detect the small average treatment effect
13
- Positive results in traditionally designed broad eligibility phase III trials may result in subsequent treatment of many patients who do not benefit
15Predictive Biomarkers
- In the past often studied as unfocused post-hoc subset analyses of RCTs
  - Numerous subsets examined
  - Same data used to define subsets for analysis and for comparing treatments within subsets
  - Multiple comparisons with no control of type I error
- Led to conventional wisdom
  - Only for hypothesis generation
  - Only valid if overall treatment difference is significant
17The Roadmap
- Develop a completely specified genomic classifier of the patients likely to benefit from a new drug
- Establish analytical and pre-analytical validity of the classifier
- Use the completely specified classifier to design and analyze a new clinical trial to evaluate effectiveness of the new treatment with a pre-defined analysis plan that preserves the overall type I error of the study.
18Guiding Principle
- The data used to develop the classifier must be distinct from the data used to test hypotheses about treatment effect in subsets determined by the classifier
  - Developmental studies are exploratory
  - Studies on which treatment effectiveness claims are to be based should be definitive studies that test a treatment hypothesis in a patient population completely pre-specified by the classifier
19New Drug Developmental Strategy I
- Restrict entry to the phase III trial based on
the binary predictive classifier, i.e. targeted
design
20Develop Predictor of Response to New Drug
[Diagram: Using phase II data, develop predictor of response to new drug. Patients predicted responsive are randomized to new drug or control; patients predicted non-responsive are off study.]
21Applicability of Design I
- Primarily for settings where the classifier is based on a single gene whose protein product is the target of the drug
  - e.g. trastuzumab
- With a strong biological basis for the classifier, it may be unacceptable to expose classifier negative patients to the new drug
- Analytical validation, biological rationale and phase II data provide basis for regulatory approval of the test
- Phase III study focused on test + patients to provide data for approving the drug
22Evaluating the Efficiency of Strategy (I)
- Simon R and Maitournam A. Evaluating the efficiency of targeted designs for randomized clinical trials. Clinical Cancer Research 10:6759-63, 2004; Correction and supplement 12:3229, 2006
- Maitournam A and Simon R. On the efficiency of targeted clinical trials. Statistics in Medicine 24:329-339, 2005.
- Reprints and interactive sample size calculations at http://linus.nci.nih.gov
23
- Relative efficiency of targeted design depends on
  - proportion of patients test positive
  - effectiveness of new drug (compared to control) for test negative patients
- When less than half of patients are test positive and the drug has little or no benefit for test negative patients, the targeted design requires dramatically fewer randomized patients
- The targeted design may require fewer or more screened patients than the standard design
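The dependence on these two quantities can be sketched numerically. The snippet below is a minimal illustration, not the calculation used in the cited papers or the web software: it assumes that required sample size scales inversely with the squared average treatment effect and that outcome variance is the same in test positive and test negative patients. The function names and the example effect sizes are hypothetical.

```python
def randomized_ratio(prop_pos, effect_pos, effect_neg):
    """Approximate ratio of randomized patients: standard design / targeted design.

    Assumption: sample size is proportional to 1 / (average treatment effect)^2,
    with equal outcome variance in test-positive and test-negative patients.
    """
    avg_effect = prop_pos * effect_pos + (1 - prop_pos) * effect_neg
    return (effect_pos / avg_effect) ** 2


def screened_ratio(prop_pos, effect_pos, effect_neg):
    """Approximate ratio of screened patients: standard design / targeted design.

    The targeted design screens about 1 / prop_pos patients per randomized
    patient; the standard design screens only the patients it randomizes.
    """
    return prop_pos * randomized_ratio(prop_pos, effect_pos, effect_neg)


if __name__ == "__main__":
    # Hypothetical scenarios: (proportion positive, effect in +, effect in -)
    for scenario in [(0.25, 1.0, 0.0), (0.5, 1.0, 0.5)]:
        print(scenario, randomized_ratio(*scenario), screened_ratio(*scenario))
```

A ratio above 1 means the standard design needs more patients. In the first hypothetical scenario the targeted design needs fewer randomized and fewer screened patients; in the second it randomizes fewer patients but screens more.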
24Trastuzumab (Herceptin)
- Metastatic breast cancer
- 234 randomized patients per arm
- 90% power for 13.5% improvement in 1-year survival over 67% baseline at 2-sided .05 level
- If benefit were limited to the 25% assay + patients, overall improvement in survival would have been 3.375%
  - 4025 patients/arm would have been required
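The effect of dilution on sample size can be checked with a rough calculation. The sketch below assumes the ordinary unpooled normal approximation for comparing two proportions; the slide does not state which formula produced its figures, so the results are of the same order as 234 and 4025 but somewhat smaller (a continuity correction, for example, would raise them). The function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_arm(p_control, p_new, alpha=0.05, power=0.90):
    """Patients per arm to compare two proportions.

    Assumption: unpooled normal approximation, two-sided test, 1:1 allocation.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    variance = p_control * (1 - p_control) + p_new * (1 - p_new)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p_new - p_control) ** 2)


if __name__ == "__main__":
    baseline = 0.67                   # 1-year survival on control
    full_gain = 0.135                 # improvement if every patient benefits
    diluted_gain = 0.25 * full_gain   # only 25% of patients benefit: 0.03375
    print(n_per_arm(baseline, baseline + full_gain))
    print(n_per_arm(baseline, baseline + diluted_gain))
```

Because sample size grows roughly with the inverse square of the effect, cutting the average improvement to a quarter multiplies the requirement by about sixteen or more.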
25Web Based Software for Comparing Sample Size Requirements
- http://linus.nci.nih.gov/brb/
29Developmental Strategy (II)
30Developmental Strategy (II)
- Do not use the diagnostic to restrict eligibility, but to structure a prospective analysis plan
- Having a prospective analysis plan is essential
- Stratifying (balancing) the randomization is useful to ensure that all randomized patients have tissue available, but is not a substitute for a prospective analysis plan
- The purpose of the study is to evaluate the new treatment overall and for the pre-defined subsets, not to modify or refine the classifier
- The purpose is not to demonstrate that repeating the classifier development process on independent data results in the same classifier
31Validation of EGFR biomarkers for selection of EGFR-TK inhibitor therapy for previously treated NSCLC patients
[Diagram: 2nd line NSCLC patients with specimen undergo FISH testing. FISH + (30%) and FISH - (70%) patients are each randomized to erlotinib or pemetrexed. Outcome: primary PFS; secondary OS, ORR. 4 years accrual, 1196 patients; 957 patients; 1-2 years minimum additional follow-up.]
- PFS endpoint
  - 90% power to detect 50% PFS improvement in FISH +
  - 90% power to detect 30% PFS improvement in FISH -
- Evaluate EGFR IHC and mutations as predictive markers
- Evaluate the role of RAS mutation as a negative predictive marker
32Analysis Plan B (Limited confidence in test)
- Compare the new drug to the control overall for all patients, ignoring the classifier.
  - If p_overall ≤ 0.03, claim effectiveness for the eligible population as a whole
- Otherwise perform a single subset analysis evaluating the new drug in the classifier + patients
  - If p_subset ≤ 0.02, claim effectiveness for the classifier + patients.
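The two-step rule can be written as a small decision function. This is a minimal sketch of the logic as stated on the slide; the function name, argument names and return strings are hypothetical. The two thresholds sum to 0.05, so the chance of any false claim under the global null is bounded by 0.05.

```python
def analysis_plan_b(p_overall, p_subset, alpha_overall=0.03, alpha_subset=0.02):
    """Return the claim supported under the two-step rule of Analysis Plan B.

    p_overall: p-value comparing new drug vs control in all patients.
    p_subset:  p-value for the same comparison in classifier-positive patients;
               it is only consulted when the overall test is not significant.
    """
    if p_overall <= alpha_overall:
        return "effective in all eligible patients"
    if p_subset <= alpha_subset:
        return "effective in classifier-positive patients"
    return "no effectiveness claim"


if __name__ == "__main__":
    print(analysis_plan_b(0.01, 0.50))
    print(analysis_plan_b(0.04, 0.01))
    print(analysis_plan_b(0.04, 0.03))
```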
33
- This analysis strategy is designed to not penalize sponsors for having developed a classifier
- It provides sponsors with an incentive to develop genomic classifiers
- Incentives are appropriate because developing new drugs with companion diagnostics increases the complexity and cost of the drug development process
34Analysis Plan C (adaptive)
- Test for difference (interaction) between treatment effect in test positive patients and treatment effect in test negative patients
- If interaction is significant at level α_int, then compare treatments separately for test positive patients and test negative patients
- Otherwise, compare treatments overall
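The branching in Plan C can be sketched as a decision function. This is an illustration only: the function name, argument names and dictionary keys are hypothetical, and the default of 0.05 for the treatment comparisons is an assumption, since the slide specifies only the interaction level.

```python
def analysis_plan_c(p_interaction, p_positive, p_negative, p_overall,
                    alpha_int=0.10, alpha=0.05):
    """Return which treatment comparisons are declared significant under Plan C.

    If the interaction test is significant at alpha_int, treatments are compared
    separately in test-positive and test-negative patients; otherwise a single
    overall comparison is made. alpha=0.05 is an assumed level for those tests.
    """
    if p_interaction <= alpha_int:
        return {
            "test_positive": p_positive <= alpha,
            "test_negative": p_negative <= alpha,
        }
    return {"overall": p_overall <= alpha}


if __name__ == "__main__":
    print(analysis_plan_c(0.04, 0.001, 0.60, 0.03))   # separate comparisons
    print(analysis_plan_c(0.40, 0.02, 0.08, 0.004))   # overall comparison
```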
35Sample Size Planning for Analysis Plan C
- 88 events in test + patients needed to detect 50% reduction in hazard at 5% two-sided significance level with 90% power
- If 25% of patients are positive, when there are 88 events in positive patients there will be about 264 events in negative patients
  - 264 events provides 90% power for detecting 33% reduction in hazard at 5% two-sided significance level
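These event counts are consistent with a commonly used approximation for the logrank test under 1:1 randomization, in which the required number of events is 4(z_{α/2} + z_β)² / (ln HR)². The sketch below applies that approximation; the slide does not say which formula was used, the function names are hypothetical, and equal event rates in test positive and test negative patients are assumed when scaling from 88 to 264 events.

```python
from math import ceil, log, sqrt
from statistics import NormalDist

_norm = NormalDist()


def events_needed(hazard_ratio, alpha=0.05, power=0.90):
    """Events required to detect a hazard ratio (two-sided test, 1:1 allocation)."""
    z = _norm.inv_cdf(1 - alpha / 2) + _norm.inv_cdf(power)
    return ceil(4 * z ** 2 / log(hazard_ratio) ** 2)


def power_given_events(events, hazard_ratio, alpha=0.05):
    """Approximate power of a two-sided test with a given number of events."""
    z_alpha = _norm.inv_cdf(1 - alpha / 2)
    return _norm.cdf(sqrt(events) / 2 * abs(log(hazard_ratio)) - z_alpha)


if __name__ == "__main__":
    events_pos = events_needed(0.5)              # 50% reduction in hazard
    prevalence = 0.25                            # proportion test positive
    # Assumes equal event rates in test-positive and test-negative patients.
    events_neg = events_pos * (1 - prevalence) / prevalence
    print(events_pos, events_neg)
    print(power_given_events(events_neg, 0.67))  # 33% reduction in hazard
```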
36Simulation Results for Analysis Plan C
- Using α_int = 0.10, the interaction test has power 93.7% when there is a 50% reduction in hazard in test positive patients and no treatment effect in test negative patients
- A significant interaction and significant treatment effect in test positive patients is obtained in 88% of cases under the above conditions
- If the treatment reduces hazard by 33% uniformly, the interaction test is negative and the overall test is significant in 87% of cases
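The power of the interaction test can be approximated analytically as a rough check on the simulation. The sketch below rests on several assumptions not stated in the slides: the estimated log hazard ratio in each subset is treated as normal with variance 4/events, the event counts are 88 (test positive) and 264 (test negative) as in the sample size plan, and the interaction test is taken to be one-sided. Under those assumptions the result is close to the simulated 93.7%, but agreement is only approximate. The function name is hypothetical.

```python
from math import log, sqrt
from statistics import NormalDist


def interaction_power(hr_pos, hr_neg, events_pos, events_neg, alpha_int=0.10):
    """Approximate one-sided power of the treatment-by-test interaction test.

    Assumptions: log hazard ratio estimates are normal with variance 4 / events,
    the two subsets are independent, and the test is one-sided.
    """
    norm = NormalDist()
    diff = log(hr_neg) - log(hr_pos)   # > 0 when test positives benefit more
    se = sqrt(4 / events_pos + 4 / events_neg)
    return norm.cdf(diff / se - norm.inv_cdf(1 - alpha_int))


if __name__ == "__main__":
    # 50% hazard reduction in test positives, no effect in test negatives
    print(interaction_power(0.5, 1.0, 88, 264))
    # Uniform 33% reduction: rejection rate equals the significance level
    print(interaction_power(0.67, 0.67, 88, 264))
```

With a uniform treatment effect the interaction test rejects at its nominal rate, so in most trials the analysis falls through to the overall comparison.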
37Development of Genomic Classifiers
- Single gene or protein based on knowledge of therapeutic target
- Single gene or protein culled from set of candidate genes identified based on imperfect knowledge of therapeutic target
- Empirically determined based on correlating gene expression to patient outcome after treatment
  - Pusztai, Anderson, Hess. Clin Cancer Res 2007;13:6080
38Myth
- Huge sample sizes are needed to develop effective
predictive classifiers
39Sample Size Planning References
- K Dobbin, R Simon. Sample size determination in microarray experiments for class comparison and prognostic classification. Biostatistics 6:27-38, 2005
- K Dobbin, R Simon. Sample size planning for developing classifiers using high dimensional DNA microarray data. Biostatistics (In Press)
40Sample size as a function of effect size (log-base 2 fold-change between classes divided by standard deviation). Two different tolerances shown. Each class is equally represented in the population. 22000 genes on an array.
41Development of Genomic Classifiers
- During phase II development, or
- After failed phase III trial using archived specimens
- Adaptively during early portion of phase III trial
43Prognostic and Predictive Classifiers for Guiding
Use of Approved Drugs
44Developmental Studies vs Validation Studies
- Validation studies use prognostic or predictive
biomarkers or composite classifiers that have
been completely defined in previous developmental
studies
45Types of Validation for Prognostic and Predictive
Biomarkers
- Analytical validation
  - Pre-analytical, analytical, post-analytical robustness
- Clinical validation
  - Does the biomarker predict what it's supposed to predict for independent data
  - Not whether independent studies produce the same predictive biomarkers
- Clinical utility
  - Does use of the biomarker result in patient benefit
46Clinical Utility
- Benefits patient by improving treatment decisions
- Depends on context of use of the biomarker
- Treatment options and practice guidelines
- Other prognostic factors
47Establishing Clinical Utility of a Prognostic
Biomarker Classifier
- Identify patients
  - for whom practice standards imply cytotoxic chemotherapy
  - who have good prognosis without chemotherapy
- Prospective trial using pre-defined classifier to identify good risk patients and withhold chemotherapy
  - TAILORx, MINDACT
- Analysis of archived specimens from previous clinical trial in which patients did not receive chemotherapy
  - Pre-defined classifier
  - Prospective analysis plan developed before doing assay
  - Establish analytical and pre/post-analytical validity of assay
  - Large fraction of patients with adequate archived tissue
48Establishing Clinical Utility of a Predictive
Classifier of Benefit from Regimen T
- Randomized trial of treatment with T versus control
  - Include both test + and test - patients and size trial to evaluate T vs control separately for the two groups of patients
  - Or include only test patients if T is an established standard therapy
- Prospective trial may not be feasible
  - Prospective analysis of archived specimens from previous trial
49Myth of Gold Standard Design for Establishing
Clinical Utility of a Predictive Classifier of
Benefit from Regimen T
- Randomize patients either to have the classifier measured or to receive standard of care
  - Standard of care group receive T and don't have classifier measured
  - Patients randomized to have classifier measured
    - If test + (i.e. predicted to benefit from T) receive T
    - If test - receive control regimen C
- Very inefficient
  - Many patients get same treatment regardless of randomized arm
  - Since classifier is not measured in SOC arm, the trial must be huge to detect very small overall difference in outcome
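The dilution can be made concrete with a short calculation. In this design only test - patients are treated differently between the two arms, so the difference between arms is the effect in test - patients multiplied by their share of the population. The sketch below is illustrative; the function name and the response rates are hypothetical.

```python
def strategy_arm_rates(prop_pos, rate_t_pos, rate_t_neg, rate_c_neg):
    """Expected response rate in each arm of the marker-strategy design.

    Standard-of-care arm: everyone receives T.
    Classifier arm: test-positive patients receive T, test-negative receive C.
    """
    soc_arm = prop_pos * rate_t_pos + (1 - prop_pos) * rate_t_neg
    marker_arm = prop_pos * rate_t_pos + (1 - prop_pos) * rate_c_neg
    return soc_arm, marker_arm


if __name__ == "__main__":
    # Hypothetical rates: T in test +, T in test -, C in test -
    soc, marker = strategy_arm_rates(0.5, 0.60, 0.20, 0.30)
    print(soc, marker, marker - soc)
```

With these hypothetical numbers a 10 percentage point advantage of C over T in test - patients shrinks to about 5 points between arms, which roughly quadruples the required sample size.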
50Microarray Myths
- That the greatest challenge is managing the mass
of microarray data
51Greater Challenges Are
- Designing, conducting and analyzing key
experiments that effectively utilize microarray
technology to bridge the gap between basic
research and clinical development
53Major Flaws Found in 40 Studies Published in 2004
- 50% of studies contained one or more major flaws
- Cluster Analysis
  - 13/28 studies invalidly claimed that expression profiles could predict outcome based on clustering samples with regard to differentially expressed genes
- Finding genes correlated with outcome
  - 9/23 studies had inadequate methods to deal with false positives
  - 10,000 genes x .05 significance level = 500 false positives
- Supervised prediction
  - 12/28 reported a misleading estimate of prediction accuracy
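The false positive arithmetic is easy to verify. The sketch below computes the expected count and, as a check, simulates p-values for genes with no true association, assuming they are independent and uniformly distributed under the null hypothesis. The function names and the random seed are arbitrary choices for illustration.

```python
import random


def expected_false_positives(n_genes, alpha):
    """Expected number of null genes declared significant at level alpha."""
    return n_genes * alpha


def simulated_false_positives(n_genes, alpha, seed=0):
    """Count null genes with p < alpha.

    Assumes independent genes with uniform p-values under the null hypothesis.
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_genes))


if __name__ == "__main__":
    print(expected_false_positives(10_000, 0.05))
    print(simulated_false_positives(10_000, 0.05))
```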
55Solution
56BRB-ArrayTools: http://linus.nci.nih.gov
- Contains analysis tools that I have selected as valid and useful
- Targeted to biomedical scientists with analysis wizard and numerous help screens
- Imports data from all platforms and major databases
- Extensive built-in gene annotation and linkage to gene annotation websites
- Extensive gene-set enrichment tools for integrating gene expression with pathways, transcription factor targets, microRNA targets, protein domains and other biological information
- Extensive tools for the development and validation of predictive classifiers with binary outcome or survival outcome data
57Development and Validation of Predictive
Classifiers using Gene Expression Profiles
- To be continued Monday
- Thank you
58Conclusions
- New technology makes it feasible to identify which patients are likely or unlikely to benefit from a specified treatment
- Targeting treatment can greatly improve the therapeutic ratio of benefit to adverse effects
  - Treated patients benefit
  - Economic benefit
  - Greater chance of success in drug development
59Conclusions
- Some of the conventional wisdom about how to develop prognostic and predictive classifiers and how to use them in clinical trial design is flawed
- Prospectively specified analysis plans for phase III studies are essential to achieve reliable results
- Biomarker analysis does not mean exploratory analysis except in developmental studies
60Conclusions
- Achieving the potential of new technology requires paradigm changes in correlative science.
- Effective interdisciplinary research requires increased emphasis on cross education of laboratory, clinical and statistical/computational scientists
61Acknowledgements
- Kevin Dobbin
- Alain Dupuy
- Boris Freidlin
- Wenyu Jiang
- Aboubakar Maitournam
- Yingdong Zhao