Statistical Principles for Clinical Research
Transcript and Presenter's Notes

Title: Statistical Principles for Clinical Research


1
Statistical Principles for Clinical Research
Conducting Clinical Trials 2007
Sponsored by NIH General Clinical Research Center
Los Angeles Biomedical Research Institute at Harbor-UCLA Medical Center
November 1, 2007
Peter D. Christenson
2
Speaker Disclosure Statement
The speaker has no financial relationships
relevant to this presentation.
3
Recommended Textbook
  • Making Inference
  • Design issues
  • Biases
  • How to read papers
  • Meta-analyses
  • Dropouts
  • Non-mathematical
  • Many examples
4
Example Harbor Study Protocol
18 pages of Background and Significance, Preliminary Studies, and Research Design and Methods. Then:
"Pearson correlation, repeated measures of the general linear model, ANOVA analyses, and Student t-tests will be used where appropriate. The two main parameters of interest will be A and B. For A, using a t-test, 40 subjects provide 80% assurance that a XX% reduction will be detected, with p<0.05. Similar comparisons as for A and B will be carried out …"
5
Example Harbor Study Protocol
…the Good: "The two main parameters of interest will be A and B. For A, using a t-test, 40 subjects provide 80% assurance that a XX% reduction will be detected, with p<0.05."
  • Because:
  • Explicit: specifies the primary outcome of interest.
  • Explicit: justification for the number of subjects.

6
Example Harbor Study Protocol
…the Bad: "Pearson correlation, repeated measures of the general linear model, ANOVA analyses, and Student t-tests will be used where appropriate."
  • Because:
  • Boilerplate.
  • These methods are almost always used.
  • "Where appropriate"?
  • Tries to satisfy the reviewer, not science.

7
Example Harbor Study Protocol
…and the Ugly: "Similar comparisons as for A and B will be carried out …"
  • Because:
  • 1º is OK: the difference between 2 visits for 2 measures, A & B.
  • But: 15 measures are taken at each of 19 visits.
  • Torture the data long enough, and it will confess to something.

8
Goals of this Presentation
More good. Less bad. Less ugly.
9
Biostatistical Involvement in Studies
Off-site statistical design and analysis:
  • Multicenter studies: data coordinating center.
  • In-house drug company statisticians.
  • CRO, through NIH or drug company.
  • Local study contracted elsewhere, e.g., UCLA, USC, CRO.
Local protocol, and statistical design and analysis:
  • Occasionally multicenter.
10
Studies with Off-Site Biostatistics
  • Not responsible for statistical design and analysis.
  • Are responsible for study conduct that may:
  • impact the analysis and the believability of results.
  • reduce the sensitivity (power) of the study to detect effects.

11
Review of Basic Method of Inference from
Clinical Studies
12
Typical Study Data Analysis
Large enough signal-to-noise ratio → proves an effect beyond a reasonable doubt. Often:

    Ratio = Signal / Noise = Observed Effect / (Natural Variation / √N)

For a t-test comparing two groups:

    t Ratio = Difference in Means / (SD / √N)

Degree of allowable doubt → how large t needs to be.
5% (p<0.05) → t > 2
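
As a minimal numeric sketch of this ratio (the values below are hypothetical, chosen only for illustration):

    import math

    # Hypothetical values: a 6-unit difference in means, SD = 12, N = 36.
    diff_in_means, sd, n = 6.0, 12.0, 36

    t = diff_in_means / (sd / math.sqrt(n))  # signal / noise
    print(t)  # 3.0 > 2, i.e., p < 0.05: effect beyond a reasonable doubt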
13
Meaning of p-value
p-value: the probability of a test statistic (ratio) at least as deviant as was observed, if there is really no effect. Smaller p-values → more evidence of effect.
  • Validity of the p-value interpretation typically requires:
  • Proper data generation, e.g., randomness.
  • Subjects provide independent information.
  • Data is not used in other statistical tests.
  • …or an accounting for not satisfying these criteria.

→ p-values are earned, by satisfying these appropriately.
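
A sketch of the definition (the t ratio and degrees of freedom below are hypothetical):

    from scipy.stats import t as t_dist

    t_ratio, df = 2.5, 38  # hypothetical observed ratio and degrees of freedom
    # Two-sided p-value: the chance of a ratio at least this deviant, in
    # either direction, if there is really no effect.
    p = 2 * t_dist.sf(abs(t_ratio), df)
    print(round(p, 3))  # ~0.017: smaller p-value -> more evidence of effect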
14
Analogy with Diagnostic Testing
Analogy: True Effect ↔ Disease; Study Claim ↔ Diagnosis

                              Truth
                       No Effect        Effect
  Study Claims:
    No Effect          Correct          Error
                       (Specificity)
    Effect             Error            Correct
                                        (Sensitivity)

Set p = 0.05 → Specificity = 95%.
Power (Sensitivity): maximize; choose N for 80% (typical).
15
Study Conduct Impacting Analysis
Decreased effect detectability (and decreased ratio) results from:
  • Non-adherence of study personnel to the protocol in general. Increases variation.
  • Enrolling subjects who do not satisfy inclusion or exclusion criteria. Can decrease the observed effect: e.g., with no effect in the 10% wrongly included and a real effect of 50%, the observed effect is 0.9(50%) = 45%.
  • Subjects not completing the entire study. May decrease N, or give potentially conflicting results.
16
Potentially Conflicting Results
Example Subjects not completing the entire study.
17
Tiagabine Study Results: How Believable?
[Figure: the trial's results under three different analyses (1, 2, 3) of the same data.]
Conclusions differ depending on how the non-completing subjects (24%) are handled in the analysis.
The primary analysis here is specified, but we would prefer robustness to the method of analysis (agreement among the three), which is more likely with more completing subjects.
18
Study Conduct Impacting Analysis
Intention-to-Treat (ITT)
ITT typically specifies that all subjects are included in the analysis, regardless of treatment compliance or whether lost to follow-up.
Purposes: Avoid bias from subjective exclusions, or from differential exclusion between treatment groups; sometimes argued to mimic non-compliance in a real-world setting.
More emphasis on policy implications of societal effectiveness than on scientific efficacy. Not appropriate for many studies.
Continued
19
Study Conduct Impacting Analysis
Intention-to-Treat (ITT)
Lost to follow-up: always minimize; there is no real-world analogy, as there is for treatment compliance.
Need to define outcomes for non-completing subjects.
Current Harbor study: N=1200 would need N=3000 if ITT were used, 20% were lost, and the lost were counted as treatment failures.
20
ITT Need to Impute Unknown Values
[Figure: change from baseline at the baseline, intermediate, and final visits for individual subjects. Top panel (observations): LOCF, last observation carried forward, ignores presumed progression. Bottom panel (ranks): LRCF, last rank carried forward, maintains expected relative progression.]
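
A minimal pandas sketch of LOCF imputation (the data are hypothetical; LRCF would carry each subject's last rank forward instead of the raw value):

    import numpy as np
    import pandas as pd

    # Hypothetical change-from-baseline data; NaN marks a visit missed
    # after the subject dropped out.
    data = pd.DataFrame(
        [[0.0, -4.0, np.nan],   # non-completer: no final visit
         [0.0, -2.0, -5.0]],    # completer
        columns=["baseline", "intermediate", "final"],
    )
    # LOCF: fill each missing visit with that subject's last observed value,
    # ignoring any presumed further progression.
    locf = data.ffill(axis=1)
    print(locf)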
21
Study Conduct Impacting Feasibility
Potential Effects of Slow Enrollment
  • Needed N may be impossible → study stopped.
  • Competitive site enrollment → local financial loss.
  • Insufficient person-years (PY) of observation for some studies, even if N is attained (see figure below).
[Figure: cumulative subjects enrolled over years 0-2 under planned, slower, and slower-yet enrollment. The area under each curve is the PY of observation, which determines whether an effect (e.g., 1.1 or 1.7) is detectable even when N is attained.]
22
Biostatistical Involvement in Studies
Off-site statistical design and analysis:
  • Multicenter studies: data coordinating center.
  • In-house drug company statisticians.
  • By CRO, through NIH or drug company.
  • Local study contracted elsewhere, e.g., UCLA, USC, CRO.
Local protocol, and statistical design and analysis:
  • Occasionally multicenter.
23
Local Protocols and Data Analysis
  • Develop protocol and data analysis plan.
  • Have randomization and blinding strategy, if
    study requires.
  • Data management.
  • Perform data analyses.

24
Local Data Analysis Resources
Biostatistician: Peter Christenson, PChristenson@labiomed.org.
  • Develops study design and analysis plan.
  • Advises throughout, for any study.
  • Performs all non-basic analyses.
  • Full responsibility for studies with a funded FTE.
  • Reviews some protocols for committees.
Data Management: database development for GCRC studies, by the database manager.
25
Statistical Components of Protocols
  • Target population / source of subjects.
  • Quantification of aims, hypotheses.
  • Case definitions, endpoints quantified.
  • Randomization plan, if any.
  • Masking, if used.
  • Study size: numbers screened, enrolled, completing.
  • Use of data from non-completers.
  • Justification of study size (power, precision,
    other).
  • Methods of analysis.
  • Mid-study analyses.

26
Selected Statistical Components and Issues
27
Case Definitions and Endpoints
  • Primary case definitions and endpoints need
    careful thought.
  • Will need to report results based on these.

Example: Study at Harbor. The definition of cure was very strict. Data were analyzed with this definition. Cure rates were too low and would not be taken seriously. Scientific method → need to report them; otherwise, cherry-picking.
Publication: Use the primary definition and explain; also report with a secondary definition. Less credible.
28
Randomization
  • Helps assure attributability of treatment effects.
  • Blocked randomization assures approximate chronologic equality of the numbers of subjects in each treatment group.
  • Recruiters must not have access to the randomization list.
  • The list can be created with a random number generator in software, printed tables in stat texts, or even shuffled slips of paper (see the sketch below).
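
A minimal sketch of a blocked list from a software random number generator (the block size of 4 and the seed are assumptions for illustration):

    import random

    def blocked_randomization(n_blocks, block=("A", "A", "B", "B"), seed=2007):
        # Shuffle each block separately, so group sizes stay approximately
        # equal in chronological order of enrollment.
        rng = random.Random(seed)
        assignments = []
        for _ in range(n_blocks):
            b = list(block)
            rng.shuffle(b)
            assignments.extend(b)
        return assignments

    print(blocked_randomization(3))  # 12 assignments, 6 per group
    # The list stays with the statistician; recruiters never see it.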

29
Non-completing Subjects
  • Enrolled subjects are never dropouts.
  • Protocol should specify
  • Primary analysis set (e.g., ITT or per-protocol).
  • How final values will be assigned to
    non-completers.
  • Time-to-event (survival analysis) studies may not need final assignments; use time followed.
  • Study size estimates should incorporate the
    number of expected non-completers.

30
Study Size Power
  • Power: the probability of detecting real effects of a specified minimal (clinically relevant) magnitude.
  • Power will be different for each outcome.
  • Power depends on the statistical method.
  • Five factors, including power, are inter-related; fixing four of these specifies the fifth:
  • Study size
  • Heterogeneity among subjects (SD)
  • Magnitude of treatment effect to be detected
  • Power to detect this magnitude of effect
  • Acceptable chance of false positive conclusion,
    usually 0.05

31
Free Study Size Software
www.stat.uiowa.edu/~rlenth/Power
32
Free Study Size Software Example
Pilot data: SD = 8.19 in 36 subjects. "We propose N = 40 subjects/group in order to provide 80% power to detect (p<0.05) an effect Δ of 5.2."
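
A sketch of the underlying calculation, using the usual normal-approximation formula (Lenth's software performs an exact t-based version); it reproduces the slide's numbers:

    from math import ceil
    from scipy.stats import norm

    def n_per_group(sd, delta, alpha=0.05, power=0.80):
        # N per group = 2 * (z_{1-alpha/2} + z_{power})^2 * (SD / delta)^2
        z = norm.ppf(1 - alpha / 2) + norm.ppf(power)
        return ceil(2 * z ** 2 * (sd / delta) ** 2)

    print(n_per_group(sd=8.19, delta=5.2))  # 39: N = 40/group gives >= 80% power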
33
Study Size May Not be Based on Power
Precision refers to how well a measure is estimated.
Margin of error = the value (half-width) of the 95% confidence interval. Smaller margin of error ↔ greater precision.
To achieve a specified margin of error, solve the CI formula for N (see the sketch below). Polls: N ≈ 1000 → margin of error on % ≈ 1/√N ≈ 3%.
Pilot studies, Phase I, and some Phase II: power is not relevant; the goal may be obtaining an SD for future studies.
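
A sketch of solving the CI half-width for N, assuming a 95% CI for a proportion with the worst case p = 0.5 (the polling situation above):

    from math import ceil, sqrt
    from scipy.stats import norm

    def n_for_margin(margin, p=0.5, conf=0.95):
        # CI half-width for a proportion: z * sqrt(p(1-p)/N) <= margin
        z = norm.ppf(1 - (1 - conf) / 2)
        return ceil(p * (1 - p) * (z / margin) ** 2)

    print(n_for_margin(0.03))        # ~1068 subjects for a 3% margin of error
    print(round(1 / sqrt(1000), 3))  # ~0.032: the 1/sqrt(N) shortcut above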
34
Mid-Study Analyses
  • Mid-study comparisons should not be made before
    study completion unless planned for (interim
    analyses). Early comparisons are unstable, and
    can invalidate final comparisons.
  • Interim analyses are planned comparisons at
    specific times, usually by an unmasked advisory
    board. They allow stopping the study early due to
    very dramatic effects, and final comparisons, if
    study continues, are adjusted to validly account
    for peeking.

Continued
35
Mid-Study Analyses
Too many analyses:
[Figure: estimated effect vs. number of subjects enrolled over time. Early estimates fluctuate widely around 0, so too many looks can yield a wrong early conclusion.]
Need to monitor, but also to account for many analyses.
36
Mid-Study Analyses
  • Mid-study reassessment of study size is advised
    for long studies. Only standard deviations to
    date, not effects themselves, are used to assess
    original design assumptions.
  • Feasibility analysis
  • may use the assessment noted above to decide
    whether to continue the study.
  • may measure effects, like interim analyses, by
    unmasked advisors, to project ahead on the
    likelihood of finding effects at the planned end
    of study.

Continued
37
Mid-Study Analyses
Examples: Studies at Harbor. Randomized but not masked; data available to the PI, who compared treatment groups repeatedly as more subjects were enrolled.
Study 1: Groups do not differ; plan to add more subjects. Consequence → the final p-value is not valid; its probability requires no prior knowledge of the effect.
Study 2: Groups differ significantly; plan to stop the study. Consequence → use of this p-value is not valid; its probability requires incorporating the later comparisons.
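
A minimal simulation (illustrative assumptions, not data from the studies above) showing how such repeated peeking inflates the false-positive rate far above the nominal 5%:

    import numpy as np
    from scipy.stats import ttest_ind

    rng = np.random.default_rng(1)
    n_trials, batch, n_batches = 2000, 10, 10
    false_pos = 0
    for _ in range(n_trials):
        # Two groups with NO true difference, enrolled 10 per group at a time.
        a = rng.normal(size=batch * n_batches)
        b = rng.normal(size=batch * n_batches)
        # "Peek" with a t-test after every batch; count the trial as a false
        # positive if ANY peek reaches p < 0.05.
        if any(ttest_ind(a[:k], b[:k]).pvalue < 0.05
               for k in range(batch, batch * n_batches + 1, batch)):
            false_pos += 1
    print(false_pos / n_trials)  # roughly 0.15-0.20, not 0.05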
38
Multiple Analyses at Study End
False Positive Conclusions
[Figure: the probability of at least one false-positive conclusion rises with the number of analyses performed (torturing data); replacing "subgroup" with "analysis" gives a similar problem. From Lagakos, NEJM 2006;354(16):1667-1669.]
39
Multiple Analyses at Study End
  • There are formal methods to incorporate the
    number of multiple analyses.
  • Bonferroni
  • Tukey
  • Dunnett
  • Transparency about what was done is most important.
  • Be aware of the number of analyses and report it with any conclusions (see the sketch below).
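
A sketch of the problem and of the Bonferroni correction (the count of analyses is hypothetical):

    # With k independent analyses, each at alpha = 0.05, the chance of at
    # least one false-positive conclusion is 1 - (1 - alpha)^k.
    alpha, k = 0.05, 15  # hypothetical: 15 analyses
    print(round(1 - (1 - alpha) ** k, 2))  # ~0.54: the data will "confess"

    # Bonferroni: test each analysis at alpha/k to keep the overall
    # false-positive rate near alpha.
    print(alpha / k)  # 0.0033 per-analysis threshold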

40
Summary: Bad Science That May Seem So Good
  • Re-examining data, or using many outcomes,
    seeming to be performing due diligence.
  • Adding subjects to a study that is showing marginal effects, or stopping early due to strong results.
  • Examining effects in subgroups. See NEJM 2006;354(16):1667-1669.
  • Actually bad? Could be negligent NOT to do these,
    but need to account for doing them.

41
Statistical Software
42
Professional Statistics Software Package
[Screenshot: code/syntax is entered; stored data are accessible; results appear in an output window.]
43
Microsoft Excel for Statistics
  • Primarily for descriptive statistics.
  • Limited output.

44
Almost Free On-Line Statistics Software
www.statcrunch.com
Runs from the browser, not locally. $5 for 6 months' usage. Potential HIPAA concerns.
Supported by NSF
45
Typical Statistics Software Package
Select Methods from Menus
www.ncss.com   www.minitab.com   www.stata.com
$100 - $500
[Screenshot: data in a spreadsheet; output appears after menu selection.]
46
http://gcrc.labiomed.org/biostat
This and other biostat talks posted
47
Conclusions
Don't put off slow enrollment; find the cause and solve it. I am available.
Do put off analyses of efficacy, but not of design assumptions. I am available.
P-values are earned by following the methods that are needed for them to be valid. I am available.
You may have to pay for lack of attention to protocol decisions, to satisfy the scientific method. I am available.
Software always takes more time than expected.
48
Thank You
Nils Simonson, in Furberg & Furberg, Evaluating Clinical Research