Use of Prognostic - PowerPoint PPT Presentation

1 / 118
About This Presentation
Title:

Use of Prognostic

Description:

Use of Prognostic & Predictive Biomarkers in Clinical Trial Design Richard Simon, D.Sc. Chief, Biometric Research Branch National Cancer Institute – PowerPoint PPT presentation

Number of Views:177
Avg rating:3.0/5.0
Slides: 119
Provided by: rsimon
Category:

less

Transcript and Presenter's Notes

Title: Use of Prognostic


1
Use of Prognostic Predictive Biomarkers in
Clinical Trial Design
  • Richard Simon, D.Sc.
  • Chief, Biometric Research Branch
  • National Cancer Institute
  • http//brb.nci.nih.gov

2
BRB Websitebrb.nci.nih.gov
  • Powerpoint presentations
  • Reprints
  • BRB-ArrayTools software
  • Data archive
  • Q/A message board
  • Web based Sample Size Planning
  • Clinical Trials
  • Optimal 2-stage phase II designs
  • Phase III designs using predictive biomarkers
  • Phase II/III designs
  • Development of gene expression based predictive
    classifiers

3
Prognostic Predictive Biomarkers
  • Most cancer treatments benefit only a minority of
    patients to whom they are administered
  • Being able to predict which patients are likely
    to benefit would
  • Save patients from unnecessary toxicity, and
    enhance their chance of receiving a drug that
    helps them
  • Control medical costs
  • Improve the success rate of clinical drug
    development

4
  • Predictive biomarkers
  • Measured before treatment to identify who will or
    will not benefit from a particular treatment
  • ER, HER2, KRAS
  • Prognostic biomarkers
  • Measured before treatment to indicate long-term
    outcome for patients untreated or receiving
    standard treatment
  • Only have medical utility if therapeutically
    relevant
  • Used to identify who does or does not require
    more intensive than standard treatment
  • OncotypeDx

5
Prognostic and Predictive Biomarkers in Oncology
  • Single gene or protein measurement
  • Scalar index or classifier that summarizes
    expression levels of multiple genes

6
Prognostic Factors in Oncology
  • Many prognostic factors are not used because they
    are not actionable
  • Most prognostic factor studies are not conducted
    with an intended use
  • They use a convenience sample of heterogeneous
    patients for whom tissue is available
  • Retrospective studies of prognostic markers
    should be planned and analyzed with specific
    focus on intended use of the marker
  • Design of prospective studies depends on context
    of use of the biomarker
  • Treatment options and practice guidelines
  • Other prognostic factors

7
Clinical Utility
  • Biomarker benefits patient by improving treatment
    decisions
  • Depends on context of use of the biomarker
  • Treatment options and practice guidelines
  • Other prognostic factors

8
Potential Uses of a Prognostic Biomarker
  • Identify patients who have very good prognosis on
    standard treatment and do not require more
    intensive regimens
  • Identify patients who have poor prognosis on
    standard chemotherapy who are good candidates for
    experimental regimens

9
Prospective Evaluation of Prognostic Biomarker
  • Identify low stage patients for whom standard of
    care is chemotherapy
  • Find dataset of low stage patients who did not
    receive chemotherapy for whom archived tissue is
    available
  • Develop prognostic biomarker classifier of risk
    without chemotherapy of low stage patients
  • Conduct RCT in which low stage patients who are
    low risk by biomarker classifier are randomized
    to - chemotherapy

10
  • In some cases, if biomarker predicted risk of
    recurrence is sufficiently low for randomized
    patients, then randomization is omitted and the
    test of the biomarker is a test of whether the
    risk is as low as predicted
  • Absolute benefit of very low risk patients is by
    necessity very small
  • This is the approach of TAILORx

11
How Does This Approach Compare to the So Called
Gold Standard of Randomizing Patients to Receive
or Not Receive the Test?
12
Prospective Marker Strategy Design
  • Patients are randomized to either
  • have marker measured and treatment determined
    based on marker result and clinical features
  • dont have marker measured and receive standard
    of care treatment based on clinical features
    alone

13
Randomize Patients to Test or No Test
Rx Determined by Test
Rx Determined By SOC
14
Marker Strategy Design
  • Inefficient
  • Many patients get the same treatment regardless
    of which arm they are randomized to
  • Uninformative
  • Since patients in the standard of care arm do not
    have the marker measured, it is not possible to
    compare outcome for patients whose treatment is
    changed based on the marker result

15
Apply Test to All Eligible Patients
Using phase II data, develop predictor of
response to new drug
Test Deterimined Rx Different From SOC
Test Determined Rx Same as SOC
Off Study
Use Test Determined Rx
Use SOC
16
Prospective Evaluation of OncotypeDx (TAILORx)
  • For patients with predicted low risk of
    recurrence
  • Withhold chemotherapy and observe long term
    recurrence rate
  • If recurrence rate is very low, potential
    chemotherapy benefit must be very small

17
  • MINDACT randomizes breast cancer patients whose
    Mammaprint based Rx differs from SOC
  • SOCchemo, low risk Mammaprint
  • SOCno chemorx, high risk Mammaprint
  • Trial is sized to estimate risk of relapse of low
    risk Mammaprint patients randomized to no
    chemotherapy

18
Predictive Biomarkers
19
(No Transcript)
20
(No Transcript)
21
  • Cancers of a primary site are a molecularly
    heterogeneous group of diverse diseases which
    vary enormously in their responsiveness to a
    given treatment
  • Can we develop new drugs in a manner more
    consistent with modern tumor biology and obtain
    reliable actionable information about what
    regimens work for what kinds of tumors?

22
  • Evaluating a predictive biomarker for treatment T
    involves an RCT of T versus a control C.
  • Analysis of RCT determines whether the biomarker
    distinguishes the patients who benefit from T vs
    C from those who dont
  • In this RCT, the biomarker should ideally be
  • completely specified in advance
  • focused on the single specific biomarker
  • the trial sized with sufficient marker and
    marker patients for adequately powered separate
    analysis of T vs C differences in each stratum.
  • Evaluating a predictive biomarker does not
    involve comparison of outcome of marker vs
    marker patient

23
(No Transcript)
24
Prospective Co-Development of Drugs and Companion
Diagnostics
  1. Develop a completely specified genomic classifier
    of the patients likely to benefit from a new drug
  2. Establish analytical validity of the classifier
  3. Use the completely specified classifier in the
    primary analysis plan of a phase III trial of the
    new drug

25
Guiding Principle
  • The data used to develop the classifier should be
    distinct from the data used to test hypotheses
    about treatment effect in subsets determined by
    the classifier
  • Developmental studies can be exploratory
  • Studies on which treatment effectiveness claims
    are to be based should not be exploratory

26
Develop Predictor of Response to New Drug
Using phase II data, develop predictor of
response to new drug
Patient Predicted Responsive
Patient Predicted Non-Responsive
Off Study
New Drug
Control
27
Applicability of Targeted/Enrichment Design
  • Primarily for settings where the classifier is
    based on a single gene whose protein product is
    the target of the drug
  • eg trastuzumab
  • With a strong biological basis for the
    classifier, it may be unacceptable to expose
    classifier negative patients to the new drug
  • Analytical validation, biological rationale and
    phase II data provide basis for regulatory
    approval of the test
  • Phase III study focused on test patients to
    provide data for approving the drug

28
Principle
  • If a drug is found safe and effective in a
    defined (test ) patient population, approval
    should not depend on finding the drug ineffective
    in some other (test -) population

29
Evaluating the Efficiency of Enrichment Design
  • Simon R and Maitnourim A. Evaluating the
    efficiency of targeted designs for randomized
    clinical trials. Clinical Cancer Research
    106759-63, 2004 Correction and supplement
    123229, 2006
  • Maitnourim A and Simon R. On the efficiency of
    targeted clinical trials. Statistics in Medicine
    24329-339, 2005.
  • reprints and interactive sample size calculations
    at http//linus.nci.nih.gov

30
  • Relative efficiency of targeted design depends on
  • proportion of patients test positive
  • effectiveness of new drug (compared to control)
    for test negative patients
  • When less than half of patients are test positive
    and the drug has little or no benefit for test
    negative patients, the targeted design requires
    dramatically fewer randomized patients

31
TrastuzumabHerceptin
  • Metastatic breast cancer
  • 234 randomized patients per arm
  • 90 power for 13.5 improvement in 1-year
    survival over 67 baseline at 2-sided .05 level
  • If benefit were limited to the 25 assay
    patients, overall improvement in survival would
    have been 3.375
  • 4025 patients/arm would have been required

32
(No Transcript)
33
Model for Two Treatments With Binary Response
  • Molecularly targeted treatment T
  • Control treatment C
  • 1-? Proportion of patients that express target
  • pc control response probability
  • response probability for T patients who express
    target (R) is (pc ?1)
  • Response probability for T patients who do not
    express target (R-) is (pc ?0)

34
Untargeted Trial
  • Compare outcome for treatment group T vs control
    group C without classifier data
  • Fisher-Exact test at two-sided level .05
    comparing response proportion in control group to
    response proportion in treatment group
  • Number of responses in C group of n patients is
    binomial B(n,pc)
  • Number of responses in T group is
  • B(n,(1-?)(pc?1) ?(pc?0))
  • Determine n patients per treatment group for
    power 1-?
  • Use Ury Fleiss approximation Biom
    36347-51,1980.

35
Targeted Trial
  • Compare outcome for treatment group T vs control
    group C for Assay positive patients
  • Fisher-Exact test at two-sided level .05
    comparing response proportion in control group to
    response proportion in treatment group
  • Number of responses in C group of n patients is
    binomial B(n,pc)
  • Number of responses in T group is
  • B(n,pc?1)
  • Determine nT patients per treatment group for
    power 1-?
  • Use Ury Fleiss approximation Biom
    36347-51,1980.

36
(No Transcript)
37
(No Transcript)
38
Approximations
  • Observed response rate N(p,p(1-p)/n)
  • pe(1-pe) pc(1-pc)

39
Number of Randomized Patients Required
  • Type I error ?
  • Power 1-? for obtaining significance

40
Randomized Ratio(normal approximation)
  • RandRat nuntargeted/ntargeted
  • ?1 rx effect in marker patients
  • ?0 rx effect in marker - patients
  • ? proportion of marker - patients
  • If ?00, RandRat 1/ (1-?) 2
  • If ?0 ?1/2, RandRat 1/(1- ?/2)2

41
Randomized Rationuntargeted/ntargeted
1-? Express target ?00 ?0 ?1/2
0.75 1.78 1.31
0.5 4 1.78
0.25 16 2.56
42
Screened Ratio
  • Nuntargeted nuntargeted
  • Ntargeted ntargeted/(1-?)
  • ScreenRat Nuntargeted/Ntargeted(1- ?)RandRat

43
Screened Ratio
Express target ?00 ?0 ?1/2
0.75 1.33 0.98
0.5 2 0.89
0.25 4 0.64
44
Randomized Ratio
  • RandRat nuntargeted/ntargeted

45
Randomized Ratiosensitivityspecificity0.9
Express target ?00 ?0 ?1/2
0.75 1.29 1.26
0.5 1.8 1.6
0.25 3.0 1.96
0.1 25.0 1.86
46
Screened RatioImperfect Assay
  • Nuntargeted nuntargeted

47
Screened Ratiosensitivityspecificity0.9
Express target ?00 ?0 ?1/2
0.75 0.9 0.88
0.5 0.9 0.80
0.25 0.9 0.59
0.1 4.5 0.33
48
Web Based Software for Designing RCT of Drug and
Predictive Biomarker
  • http//brb.nci.nih.gov

49
  • It can be very difficult to develop an effective
    and analytically validated predictive biomarker
    prior to launch of the phase III trial
  • Even for anti-EGFR antibodies, a more effective
    biomarker turned out to be KRAS mutation, not
    EGFR expression
  • For small molecule kinase inhibitors the task is
    more difficult
  • In some settings it can be easier to use an
    analytically validated biomarker of poor outcome
    on the standard therapy

50
  • Score function S for distinguishing patients with
    favorable outcome on standard rx vs those with
    unfavorable outcome
  • Developed on training set of pts receiving std rx
  • GF(s)CDF of S in favorable pts
  • GU(s)CDF of S in unfavorable pts
  • Computed on test set of pts receiving std rx

51
  • GU(s)sensitivity of test for selecting pts with
    unfavorable outcome on std rx using threshold s
  • 1-GF(s)specificity of test
  • Plot of GU(s) vs GF(s) ROC curve

52
  • Latent classes
  • LCF
  • LCU
  • PrLCF?
  • PrSRespFLCFp1
  • PrSRespFLCUp0
  • PrERespFLCFp1
  • PrSRespFLCUp0?

53
(No Transcript)
54
(No Transcript)
55
(No Transcript)
56
(No Transcript)
57
  • The maximum treatment effect is ?. It can be
    achieved if one selects a threshold t small
    enough that the specificity of the test for
    excluding cases with favorable outcome on the
    standard treatment is 1. If the specificity is 1,
    then the size of the treatment effect does not
    depend on the sensitivity of the test
  • Proportion randomized (1-?)GU(t)?GF(t)

58
  • Simon and Maitnourim showed that the ratio of
    number of patients needed to randomize for a
    targeted design compared to a standard design
    that does not use the biomarker is approximately
    equal to the square of the ratio of the treatment
    effects for the two designs
  • For the standard design the treatment effect is
    (1-?)?

59
(No Transcript)
60
  • If the threshold is selected for specificity 1,
    then the randomization ratio equals (1-?)2
  • Hence if half of the patients have favorable
    outcome with standard treatment, i.e. ?0.5, then
    the targeted design requires only one quarter the
    number of randomized patients as the standard
    design.

61
Stratification Design
62
Stratification Design
  • Use the test to structure a prospective specified
    primary analysis plan
  • Having a prospective analysis plan is essential
  • Stratifying (balancing) the randomization is
    useful to ensure that all randomized patients
    have tissue available but is not a substitute for
    a prospective analysis plan
  • The purpose of the study is to evaluate the new
    treatment overall and for the pre-defined
    subsets not to modify or refine the classifier
  • The purpose is not to demonstrate that repeating
    the classifier development process on independent
    data results in the same classifier

63
Not Interaction Design
  • Requiring a significant interaction at 5 level
    to justify evaluating treatment effects in
    subsets
  • was useful in the context of post-hoc subset
    analysis when drugs were non-specific cytotoxins,
    the subsets were not biology based and the prior
    probability of qualitative interactions was low
  • is not useful for focused co-development of
    molecularly targeted drugs when the subset
    analysis is part of the primary analysis plan and
    the study-wise type I error is controlled
  • is an example of how progress could be
    unnecessarily stymied by making co-development
    impracticably expensive

64
  • R Simon. Using genomics in clinical trial design,
    Clinical Cancer Research 145984-93, 2008
  • R Simon. Designs and adaptive analysis plans for
    pivotal clinical trials of therapeutics and
    companion diagnostics, Expert Opinion in Medical
    Diagnostics 2721-29, 2008

65
Analysis Plan A
  • Compare the new drug to the control for
    classifier positive patients
  • If pgt0.05 make no claim of effectiveness
  • If p? 0.05 claim effectiveness for the
    classifier positive patients and
  • Compare new drug to control for classifier
    negative patients using 0.05 threshold of
    significance

66
Analysis Plan B(Limited confidence in test)
  • Compare the new drug to the control overall for
    all patients ignoring the classifier.
  • If poverall? 0.03 claim effectiveness for the
    eligible population as a whole
  • Otherwise perform a single subset analysis
    evaluating the new drug in the classifier
    patients
  • If psubset? 0.02 claim effectiveness for the
    classifier patients.

67
Analysis Plan C
  • Test for difference (interaction) between
    treatment effect in test positive patients and
    treatment effect in test negative patients at an
    elevated level ?int (e.g. .10)
  • If interaction is significant at level ?int then
    compare treatments separately for test positive
    patients and test negative patients
  • Otherwise, compare treatments overall

68
Sample Size Planning for Analysis Plan C
  • 88 events in test patients needed to detect 50
    reduction in hazard at 5 two-sided significance
    level with 90 power
  • If 25 of patients are positive, when there are
    88 events in positive patients there will be
    about 264 events in negative patients
  • 264 events provides 90 power for detecting 33
    reduction in hazard at 5 two-sided significance
    level

69
Simulation Results for Analysis Plan C
  • Using ?int0.10, the interaction test has power
    93.7 when there is a 50 reduction in hazard in
    test positive patients and no treatment effect in
    test negative patients
  • A significant interaction and significant
    treatment effect in test positive patients is
    obtained in 88 of cases under the above
    conditions
  • If the treatment reduces hazard by 33 uniformly,
    the interaction test is negative and the overall
    test is significant in 87 of cases

70
Does the RCT Need to Be Significant Overall for
the T vs C Treatment Comparison?
  • No
  • It is incorrect to require that the overall T vs
    C comparison be significant to claim that T is
    better than C for test patients but not for
    test patients
  • That requirement has been traditionally used to
    protect against data dredging. It is
    inappropriate for focused trials of a treatment
    with a companion test.

71
Development of Genomic Classifiers
  • During phase II development or
  • Adaptively during phase III trial
  • Using archived specimens from previous phase III
    trial

72
(No Transcript)
73
(No Transcript)
74
(No Transcript)
75
Biomarker Adaptive Threshold Design
  • Wenyu Jiang, Boris Freidlin Richard Simon
  • JNCI 991036-43, 2007

76
Biomarker Adaptive Threshold Design
  • Randomized trial of T vs C
  • Have identified a biomarker score B thought to be
    predictive of patients likely to benefit from T
    relative to C
  • Eligibility not restricted by biomarker
  • No threshold for biomarker determined
  • Biomarker value scaled to range (0,1)
  • Time-to-event data

77
Procedure A
  • Compare T vs C for all patients
  • If results are significant at level .04 claim
    broad effectiveness of T
  • Otherwise proceed as follows

78
Procedure A
  • Test T vs C restricted to patients with biomarker
    B gt b
  • Let S(b) be log likelihood ratio statistic
  • Repeat for all values of b
  • Let S maxS(b)
  • Compute null distribution of S by permuting
    treatment labels
  • If the data value of S is significant at 0.01
    level, then claim effectiveness of T for a
    patient subset
  • Compute point and bootstrap interval estimates of
    the threshold b

79
Estimated Power of Broad Eligibility Design
(n386 events) vs Adaptive Design A (n412
events) 80 power for 30 hazard reduction
Model Broad Eligibility Design Biomarker Adaptive Threshold A
40 reduction in 50 of patients (22 overall reduction) .70 .78
60 reduction in 25 of patients (20 overall reduction) .65 .91
79 reduction in 10 of patients (14 overall reduction) .35 .93
80
(No Transcript)
81
Multiple Biomarker Design
  • Have identified K candidate binary classifiers B1
    , , BK thought to be predictive of patients
    likely to benefit from T relative to C
  • Eligibility not restricted by candidate
    classifiers
  • For notation let B0 denote the classifier with
    all patients positive

82
  • Test T vs C restricted to patients positive for
    Bk for k0,1,,K
  • Let S(Bk) be log likelihood ratio statistic for
    treatment effect in patients positive for Bk
    (k1,,K)
  • Let S maxS(Bk) , k argmaxS(Bk)
  • For a global test of significance
  • Compute null distribution of S by permuting
    treatment labels
  • If the data value of S is significant at 0.05
    level, then claim effectiveness of T for patients
    positive for Bk

83
  • Test T vs C restricted to patients positive for
    Bk for k0,1,,K
  • Let S(Bk) be log likelihood ratio statistic for
    treatment effect in patients positive for Bk
    (k1,,K)
  • Let S maxS(Bk) , k argmaxS(Bk)
  • The new treatment is superior to control for the
    population defined by k
  • Repeating the analysis for bootstrap samples of
    cases provides
  • an estimate of the stability of k (the
    indication)
  • an interval estimate S (the size of treatment
    effect for the size of treatment effect in the
    target population)

84
Adaptive Signature Design
  • Boris Freidlin and Richard Simon
  • Clinical Cancer Research 117872-8, 2005

85
Adaptive Signature DesignEnd of Trial Analysis
  • Compare E to C for all patients at significance
    level 0.04
  • If overall H0 is rejected, then claim
    effectiveness of E for eligible patients
  • Otherwise

86
  • Otherwise
  • Using only the first half of patients accrued
    during the trial, develop a binary classifier
    that predicts the subset of patients most likely
    to benefit from the new treatment T compared to
    control C
  • Compare T to C for patients accrued in second
    stage who are predicted responsive to T based on
    classifier
  • Perform test at significance level 0.01
  • If H0 is rejected, claim effectiveness of T for
    subset defined by classifier

87
Treatment effect restricted to subset.10 of
patients sensitive, 10 sensitivity genes, 10,000
genes, 400 patients.
Test Power
Overall .05 level test 46.7
Overall .04 level test 43.1
Sensitive subset .01 level test (performed only when overall .04 level test is negative) 42.2
Overall adaptive signature design 85.3
88
Cross-Validated Adaptive Signature Design(to be
submitted for publication)
  • Wenyu Jiang, Boris Freidlin, Richard Simon

89
Cross-Validated Adaptive Signature DesignEnd of
Trial Analysis
  • Compare T to C for all patients at significance
    level ?overall
  • If overall H0 is rejected, then claim
    effectiveness of T for eligible patients
  • Otherwise

90
Otherwise
  • Partition the full data set into K parts
  • Form a training set by omitting one of the K
    parts. The omitted part is the test set
  • Using the training set, develop a predictive
    classifier of the subset of patients who benefit
    preferentially from the new treatment T compared
    to control C using the methods developed for the
    ASD
  • Classify the patients in the test set as
    sensitive (classifier ) or insensitive
    (classifier -)
  • Repeat this procedure K times, leaving out a
    different part each time
  • After this is completed, all patients in the full
    dataset are classified as sensitive or
    insensitive

91
  • Compare T to C for sensitive patients by
    computing a test statistic S e.g. the difference
    in response proportions or log-rank statistic
    (for survival)
  • Generate the null distribution of S by permuting
    the treatment labels and repeating the entire
    K-fold cross-validation procedure
  • Perform test at significance level 0.05 -
    ?overall
  • If H0 is rejected, claim effectiveness of T for
    subset defined by classifier
  • The sensitive subset is determined by developing
    a classifier using the full dataset

92
70 Response to T in Sensitive Patients25
Response to T Otherwise25 Response to C20
Patients Sensitive
ASD CV-ASD
Overall 0.05 Test 0.486 0.503
Overall 0.04 Test 0.452 0.471
Sensitive Subset 0.01 Test 0.207 0.588
Overall Power 0.525 0.731
93
Does It Matter If the Randomization in the RCT
Was Not Stratified By the Test?
  • No
  • Stratification improves balance of stratification
    factors in overall comparisons
  • Stratification does not improve comparability of
    treatment (T) and control (C) groups within test
    positive patients or within test negative
    patients.
  • In a fully prospective trial, stratification of
    the randomization by the test is only useful for
    ensuring that all patients have adequate test
    performed

94
Information about a predictive biomarker may
develop following completion of the pivotal
trials
  • It may be infeasible to conduct a new
    prospective trial for a previously approved drug
  • KRAS for anti-EGFR antibodies in colorectal
    cancer
  • HER2 for doxorubicin in breast cancer

95
  • In some cases the benefits of a prospective trial
    can be closely achieved by the carefully planned
    use of archived tissue from a previously
    conducted randomized clinical trial

96
Use of Archived Specimens in Evaluation of
Prognostic and Predictive BiomarkersRichard M.
Simon, Soonmyung Paik and Daniel F. Hayes
  • Claims of medical utility for prognostic and
    predictive biomarkers based on analysis of
    archived tissues can be considered to have either
    a high or low level of evidence depending on
    several key factors.
  • Studies using archived tissues, when conducted
    under ideal conditions and independently
    confirmed can provide the highest level of
    evidence.
  • Traditional analyses of prognostic or predictive
    factors, using non analytically validated assays
    on a convenience sample of tissues and conducted
    in an exploratory and unfocused manner provide a
    very low level of evidence for clinical utility.

97
Use of Archived Specimens in Evaluation of
Prognostic and Predictive BiomarkersRichard M.
Simon, Soonmyung Paik and Daniel F. Hayes
  • For Level I Evidence
  • (i) archived tissue adequate for a successful
    assay must be available on a sufficiently large
    number of patients from a phase III trial that
    the appropriate analyses have adequate
    statistical power and that the patients included
    in the evaluation are clearly representative of
    the patients in the trial.
  • (ii) The test should be analytically and
    pre-analytically validated for use with archived
    tissue.
  • (iii) The analysis plan for the biomarker
    evaluation should be completely specified in
    writing prior to the performance of the biomarker
    assays on archived tissue and should be focused
    on evaluation of a single completely defined
    classifier.
  • iv) the results from archived specimens should be
    validated using specimens from a similar, but
    separate, study.

98
Factor        
A B C D
Clinical trial PRCT designed to address tumor marker Prospective trial not designed to address tumor marker, but design accommodates tumor marker utility. Accommodation of predictive marker requires PRCT Prospective observational registry, treatment and followup not dictated No prospective aspect to study
Patients and patient data Prospectively enrolled, treated, and followed in RCT Prospectively enrolled, treated, and followed in clinical trial and, especially if a predictive utility is considered, a PRCT addressing the treatment of interest Prospectively enrolled in registry, but treatment and followup standard of care No prospective stipulation of treatment or followup patient data collected by retrospective chart review
Specimen collection, processing, and archival Specimens collected, processed and assayed for specific marker in real time Specimens collected, processed, and archived prospectively using generic SOPs. Assayed after trial completed Specimens collected, processed, and archived prospectively using generic SOPs. Assayed after trial completed Specimens collected, processed and archived with no prospective SOPs
Statistical Design and analysis Study powered to address tumor marker question. Study powered to address therapeutic question underpowered to address tumor marker question. Focused analysis plan for marker question developed prior to doing assays Study not prospectively powered at all. Retrospective study design confounded by selection of specimens for study. Focused analysis plan for marker question developed prior to doing assays Study not prospectively powered at all. Retrospective study design confounded by selection of specimens for study. No focused analysis plan for marker question developed prior to doing assays
Validation Result unlikely to be play of chance Although preferred, validation not required Result more likely to be play of chance that A, but less likely than C. Requires one or more validation studies Result very likely to be play of chance. Requires subsequent validation studies Result very likely to be play of chance. Requires subsequent validation
Terminology Prospective Prospective using archived samples Prospective /observational Retrospective/observational
99
Revised Levels of Evidence for Tumor Marker
Studies
Level of Evidence Category from Table 1 Validation Studies Available
I A None required
I B One or more with consistent results
II B None or Inconsistent results
II C 2 or more with consistent results
III C None or 1 with consistent results or Inconsistent results
IV-V D NA
100
New Paradigms for Stud Design and Analysis for
Prediction
  • Developments in biotechnology have forced
    statisticians to focus on prediction problems
  • This has led to many exciting methodological
    developments
  • pgtgtn problems in which number of genes is much
    greater than the number of cases
  • Statistics has over-focused on inference. Many of
    the methods and much of the conventional wisdom
    of statistics are based on inference problems and
    not applicable to prediction problems

101
Some statisticians believe that accurate
prediction is not possible for pgtgtn
  • Accurate prediction is often possible, but
    standard statistical methods for model building
    and evaluation are not effective
  • Much of the conventional wisdom about how to
    develop and evaluate regression models is flawed
    when applied to prediction

102
  • pgtn prediction problems are not multiple
    comparison problems
  • Feature selection should be optimized for
    accurate prediction, not for controlling the
    false discovery rate
  • Standard statistical methods for model building
    and evaluation are not effective
  • e.g. Fishers LDA vs diagonal LDA
  • Model performance on the training set is
    extremely misleading for pgtn problems and should
    never be reported
  • Inadequate focus on selecting samples based on
    pre-defined intended use of model

103
  • Goodness of fit is not a proper measure of
    predictive accuracy
  • Odds ratios, hazard ratios and statistical
    significance of regression coefficients are not a
    proper measures of predictive accuracy

104
  • Validation of a predictive model means that the
    model predicts accurately for independent data
  • Validation does not mean that the model is stable
    or that using the same algorithm on independent
    data will give a similar model

105
Prediction Based Clinical Trials
  • Using cross-validation we can evaluate new
    methods for analysis of clinical trials in terms
    of their intended use which is informing
    therapeutic decision making

106
  • fj(x) probability of response for patient with
    covariate vector x who receives treatment j

107
Single Hypothesis Testing Based Decision Making
in an RCT
  • Test H0 ExfT(x) ExfC(x)
  • or fT(x) fC(x) for all x
  • If you reject H0 then treat future patients with
    T, otherwise treat future patients with C

108
Other Approaches
109
Predicting the Effect of Analysis Methods on
Patient Outcome
  • At the conclusion of the trial randomly partition
    the patients into 10 equally sized sets P1 , ,
    P10
  • Let D-i denote the full dataset minus data for
    patients in Pi
  • Using 10-fold complete cross-validation, omit
    patients in Pi
  • Analyze trial using only data in D-i with both
    the standard analysis and the alternative
    analysis

110
  • For each patient j in Pi record the
    cross-validated treatment recommendations based
    on D-i

111
  • Let ST denote the set of cases for which the
    standard analysis recommends C and the
    alternative analysis recommends T
  • Let SC denote the set of cases for which the
    standard analysis recommends T and the
    alternative analysis recommends C

112
  • For patients in ST compare outcomes for patients
    who received T versus those who received C
  • For patients in SC compare outcomes for patients
    who received T versus those who received C

113
  • Hence, alternative methods for analyzing RCTs
    can be evaluated in an unbiased manner with
    regard to their value to patients using the
    actual RCT data

114
Conclusions
  • New biotechnology and knowledge of tumor biology
    provide important opportunities to improve
    therapeutic decision making
  • Treatment of broad populations with regimens that
    do not benefit most patients is increasingly no
    longer necessary nor economically sustainable
  • The established molecular heterogeneity of human
    diseases requires the use new approaches to the
    development and evaluation of therapeutics

115
Conclusions
  • New biotechnology and knowledge of tumor biology
    provide important opportunities to improve
    therapeutic decision making
  • Treatment of broad populations with regimens that
    do not benefit most patients is increasingly no
    longer necessary nor economically sustainable
  • The established molecular heterogeneity of human
    diseases requires the use new approaches to the
    development and evaluation of therapeutics

116
Conclusions
  • Some of the conventional wisdom about statistical
    analysis of clinical trials is not applicable to
    trials dealing with co-development of drugs and
    diagnostic
  • e.g. subset analysis if the overall results are
    not significant or if an interaction test is not
    significant or if the randomization was not
    stratified by the subsetting variable

117
Conclusions
  • Can we develop new drugs in a manner more
    consistent with modern tumor biology and obtain
    reliable actionable information about what
    regimens work for what kinds of patients?
  • The information doesnt have to be perfect to be
    much better than what we currently have

118
Conclusions
  • Co-development of drugs and companion diagnostics
    increases the complexity of drug development
  • It does not make drug development simpler,
    cheaper and quicker
  • But it may make development more successful and
    it has great potential value for patients and for
    the economics of health care
Write a Comment
User Comments (0)
About PowerShow.com