Title: Power (and Precision): Effective Research Design Planning for Grant Proposals
1. Power (and Precision): More Effective Research Design Planning for Grant Proposals
Walt Stroup, Ph.D.
Professor and Chair, Department of Statistics
University of Nebraska, Lincoln
2. Outline for Talk
- What is Power Analysis? Why should I do it?
- Essential Background
- A Word about Software
- Decisions that Affect Power: several examples
- Latest Thinking
- Final Thoughts
3. Power and Precision Defined
- Precision, a.k.a. "Margin of Error"
  - In most cases, the standard error of the relevant estimate
- Power
  - Prob{reject H0 given H0 false}
  - Prob{research hypothesis statistically significant}
- Power analysis
  - essentially, "If I do the study this way, power = ?"
- Sample size estimation
  - How many observations are required to achieve a given power?
4. What's involved in Power Analysis
- WHAT IT'S NOT
  - "Painting by numbers..."
- IF IT'S DONE RIGHT
  - Power analysis should be
    - a comprehensive conversation to plan the study
    - a "dress rehearsal" for the statistical analysis once the data are collected
5. Why do a Power Analysis?
- For NIH Grant Proposal
  - because it's required
- For many other grant proposals
  - because it gives you a competitive edge
- Other reasons
  - practical: increases chance of success; reduces the "we don't have time to do it right, but lots of time to do it over" syndrome
  - ethical
6. Ethical???
- Last Ph.D. in U.S. Senate
- Irritant to doctrinaire left and right
- Keynote address to 1997 American Stat. Assoc.
"... we can continue to make policy based on data-free ideology, or we can inform policy where possible by competent inquiry..."
late U.S. Senator Daniel Patrick Moynihan
7. Ethical
- Results of your study may affect policy
- Well-conceived research means
  - better information
  - greater chance of sound decisions
- Poorly-conceived research
  - lost opportunity
  - deprives policy-makers of information that might have been useful
  - or worse: bad information misinforms or misleads the public
8. What affects Power and Precision?
- A short statistics lesson
  - What goes into computing test statistics
  - What test statistics are supposed to tell us
  - A bit about the distribution of test statistics
  - Central and non-central t, F, and chi-square (mostly F)
9. What goes into a test statistic?
- Research hypothesis: the motivation for the study
- Assumed not true unless the data show compelling evidence otherwise
- Research hypothesis: HA; its opposite: H0
10. What goes into a test statistic?
- Visualize using F
- But same basic principles for t, chi-square, etc.
- F is the ratio of variation attributable to the factor under study vs. variation attributable to noise
- Ingredients of F:
  - N of obs
  - effect size
  - variance of noise (i.e. among obs)
11. When H0 True (i.e. no trt effect)
12. When H0 false (i.e. Research HA true)
13. What affects Power?
- N of obs
- effect size
- variance of noise (i.e. among obs)
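For a balanced one-way layout, one standard way to see how these three ingredients combine is through the non-centrality parameter of the F distribution. The notation here (t treatments, n observations per treatment, expected means \mu_i, noise variance \sigma^2) is assumed for this sketch and does not come from the slides.

```latex
\[
F \sim F\bigl(t-1,\; t(n-1),\; \lambda\bigr),
\qquad
\lambda = \frac{n \sum_{i=1}^{t} (\mu_i - \bar{\mu})^2}{\sigma^2},
\qquad
\text{Power} = P\bigl\{ F(t-1,\; t(n-1),\; \lambda) > F_{crit} \bigr\}
\]
```

Under H0 all \mu_i are equal, so \lambda = 0 and F follows the central F distribution. More observations or a larger effect size raise \lambda, and more noise lowers it.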
14. What should be in a conversation about Power?
- N of obs
- effect size
- variance of noise (i.e. among obs)
- Effect size: what is the minimum that matters?
- Variance: how much noise in the response variable (range? distribution? count? pct?)
- Practical Constraints
- Design: same N can produce varying Power
15. About Software (part I)
- Canned Software
  - lots of it
  - Xiang and Zhou working on report
  - "painting by numbers"
- Simulation
  - most accurate; not constrained by canned scenarios
  - you can see what will happen if you actually do this...
- Exemplary data set + modeling software
  - nearly as accurate as simulation
  - "dress rehearsal" for actual analysis
  - MIXED, GLIMMIX, NLMIXED: if you can model it, you can do power analysis
16. Design Decisions: Some Examples
- Main Idea: For the same amount of effort, or $, or observations, power and precision can be quite different
- Power analysis objective: Work smarter, not harder
- Simple example: design of regression study
  - From STAT 412 exercise
17. Treatment Design Exercise
- Class was asked to predict Bounce Height of a basketball from Drop Height and to see if the relationship changes depending on floor surface
- Decision: What drop heights to use???
18. Objectives and Operating Definitions
- Recall objective: does the drop height / bounce height relationship change with floor surface?
- operating definition
19. Consequences of Drop Height Decisions
- Should we use fewer drop heights and more obs per drop height, or vice versa?
[table from STAT 412 Avery archive]
20. Simulation
- CRD example: 3 treatments, 5 reps / treatment
- Suspected effect size: 6-10 relative to control, whose mean is known to be 100
- Standard deviation: 10 considered reasonable
- Simulate 1000 experiments
- Reject H0 (equal trt means) 228 times
  - power = 0.228 at alpha = 0.05
- Ctl mean ranked correctly 820 times
  - (intermediate mean ranked correctly 589 times)
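A simulation of this kind could be sketched in SAS along the following lines. This is an illustration under stated assumptions, not the program behind the slide: the treatment means 100, 94 and 90 are the expected means used in the exemplary-data CRD example, while the seed, the data set names (sim, simtests, reject) and the variable names are made up for the sketch.

```sas
/* Sketch only: seed, data set names and variable names are assumptions */
data sim;
  call streaminit(412);                  /* arbitrary seed */
  array m{3} _temporary_ (100 94 90);    /* expected treatment means */
  do expt = 1 to 1000;                   /* 1000 simulated experiments */
    do trt = 1 to 3;
      do rep = 1 to 5;                   /* 5 reps per treatment */
        y = m{trt} + 10*rand('normal');  /* standard deviation 10 */
        output;
      end;
    end;
  end;
run;

ods exclude all;                         /* suppress 1000 sets of printed output */
proc glimmix data=sim;
  by expt;
  class trt;
  model y = trt;
  ods output tests3=simtests;            /* one F test per experiment */
run;
ods exclude none;

/* flag experiments that reject H0: equal trt means at alpha = 0.05 */
data reject;
  set simtests;
  reject = (probf < 0.05);
run;

/* the mean of the 0/1 flag is the simulated power */
proc means data=reject mean;
  var reject;
run;
```

The rejection rate changes with the seed, so it should be compared with the slide's figure only up to simulation error.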
21. Exemplary Data
- Many software packages for power and sample size
  - e.g. SAS PROC POWER
  - for FIXED effect models only
- Exemplary Data: more general
  - Especially (but not only) when "Mixed Model Issues"
    - random effects
    - split-plot structure
    - errors potentially correlated: longitudinal or spatial data
    - any other non-standard model structure
- Methods use PROC MIXED or GLIMMIX
  - adapted from Stroup (2002, JABES)
  - Chapter 12, SAS for Mixed Models (Littell, et al, 2006)
22. Exemplary Data - Computing Power using SAS
- create data set like proposed design
- run PROC GLIMMIX (or MIXED) with variance fixed
- $\lambda = (F \text{ computed by GLIMMIX}) \times \mathrm{rank}(K)$ [or chi-sq with GLM]
- use GLIMMIX to compute $\lambda$
- critical F ($F_{crit}$) is the value s.t. $P\{F(\mathrm{rank}(K), \nu, 0) > F_{crit}\} = \alpha$ [or chi-square]
- Power $= P\{F(\mathrm{rank}(K), \nu, \lambda) > F_{crit}\}$
- SAS functions can compute $F_{crit}$ and Power
23. Compute Power with GLIMMIX: CRD example
/* step 1 - create data set with same structure as proposed design;
   use MU (expected mean) instead of observed Y_ij values */
/* this example shows power for 5, 10, and 15 e.u. per trt */
data crdpwrx1;
  input trt mu;
  do n=5 to 15 by 5;
    do eu=1 to n;
      output;
    end;
  end;
cards;
1 100
2 94
3 90
;
24. Compute Power with GLIMMIX: CRD example
/* step 2 - use PROC GLIMMIX to compute non-centrality parameters
   for ANOVA tests and contrasts;
   ODS statements output them to new data sets */
proc sort data=crdpwrx1;
  by n;
proc glimmix data=crdpwrx1;
  by n;
  class trt;
  model mu=trt;
  parms (100)/hold=1;
  contrast 'et1 v et2' trt 0 1 -1;
  contrast 'c vs et' trt 2 -1 -1;
  ods output tests3=b;
  ods output contrasts=c;
run;
25. Compute Power with GLIMMIX: CRD example
/* step 3 - combine ANOVA and contrast n-c parameter data sets;
   use SAS functions PROBF and FINV to compute power */
data power;
  set b c;
  alpha=0.05;
  ncparm=numdf*fvalue;
  fcrit=finv(1-alpha,numdf,dendf,0);
  power=1-probf(fcrit,numdf,dendf,ncparm);
proc print;
run;
Note close agreement of Simulated Power (0.228) and exemplary data power (0.224)

Obs  Effect  Label      DF  DenDF  alpha  nc       fcrit    power
1    trt                2   12     0.05   2.53333  3.88529  0.22361
2            et1 v et2  1   12     0.05   0.40000  4.74723  0.08980
3            c vs et    1   12     0.05   2.13333  4.74723  0.26978
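The non-centrality values in the nc column can be checked by hand. The worked arithmetic below uses the expected means 100, 94, 90, the variance of 100 held in the PARMS statement, and n = 5 e.u. per treatment. The closed-form expressions are the standard ones for a balanced one-way layout and are supplied here as a check, not taken from the slides.

```latex
\[
\bar{\mu} = \tfrac{100 + 94 + 90}{3} = 94.667, \qquad
\lambda_{trt} = \frac{n \sum_i (\mu_i - \bar{\mu})^2}{\sigma^2}
             = \frac{5\,(28.444 + 0.444 + 21.778)}{100} = 2.5333
\]
\[
\lambda_{c} = \frac{\bigl(\sum_i c_i \mu_i\bigr)^2}{\sigma^2 \sum_i c_i^2 / n}:
\qquad
\lambda_{\text{et1 v et2}} = \frac{(94 - 90)^2}{100 \cdot 2/5} = 0.4,
\qquad
\lambda_{\text{c vs et}} = \frac{(200 - 94 - 90)^2}{100 \cdot 6/5} = 2.1333
\]
```

These agree with the values GLIMMIX produces as numdf times the F value.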
26. More Advanced Example
- Plots in 8 x 3 grid
- Main variation along 8 rows
- 3 x 2 treatment design
- Alternative designs
  - randomized complete block (4 blocks, size 6)
  - incomplete block (8 blocks, size 3)
  - split plot
- RCBD: easy, but ignores natural variation
27. Picture the 8 x 3 Grid
[figure: 8 x 3 grid of plots, with the gradient running along the 8 rows]
e.g. 8 schools, gradient is SES, 3 classrooms each
28. SAS Programs to Compare 8 x 3 Design
Split-Plot
data a;
  input bloc trtmnt @@;
  do s_plot=1 to 3;
    input dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1 1 2 3
1 2 1 2 3
2 1 1 2 3
2 2 1 2 3
3 1 1 2 3
3 2 1 2 3
4 1 1 2 3
4 2 1 2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  random trtmnt/subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
29. 8 x 3 Incomplete Block
data a;
  input bloc @@;
  do eu=1 to 3;
    input trtmnt dose @@;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1  1 1  1 2  1 3
2  1 1  1 2  2 2
3  1 1  1 3  2 3
4  1 1  2 1  2 2
5  1 2  1 3  2 2
6  1 2  2 1  2 3
7  1 3  2 1  2 3
8  2 1  2 2  2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=trtmnt|dose;
  random intercept / subject=bloc;
  parms (4) (6) / hold=1,2;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
30. 8 x 3 Example - RCBD
data a;
  input trtmnt dose @@;
  do bloc=1 to 4;
    mu=trtmnt*(0*(dose=1)+4*(dose=2)+8*(dose=3));
    output;
  end;
cards;
1 1  1 2  1 3  2 1  2 2  2 3
;
proc glimmix data=a noprofile;
  class bloc trtmnt dose;
  model mu=bloc trtmnt|dose;
  parms (10) / hold=1;
  lsmeans trtmnt*dose / diff;
  contrast 'trt x lin' trtmnt*dose 1 0 -1 -1 0 1;
  ods output diffs=b;
  ods output contrasts=c;
run;
31. How did designs compare?
- Suppose the main objective is to compare regression over 3 levels of dose: do they differ by treatment? (similar to basketball experiment)
- Operating definition is thus H0: dose regression coefficients equal
- Power for Randomized Block: 0.66
- Power for Incomplete Block: 0.85
- Power for Split-Plot: 0.85
- Same number of observations: you can work smarter
32. But what if I don't know Trt Effect Size or Variance?
- "How can I do a power analysis? If I knew the effect size and the variance I wouldn't have to do the study."
- What trt effect size is NOT: it is NOT the effect size you are going to observe
- It is somewhere between
  - what current knowledge suggests is a reasonable expectation
  - minimum difference that would be considered "important" or "meaningful"
33. And Variance??
- Know thy relevant background / Do thy homework
- Literature search: what have others working with similar subjects reported as variance?
- Pilot study
- Educated guess
  - range you'd expect 95% of likely obs? divide it by 4
  - most extreme values you can plausibly imagine? divide range by 6
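The two divide-by rules follow from the normal distribution, on the assumption (not stated on the slide) that the response is roughly normal. About 95% of observations fall within two standard deviations of the mean, and essentially all of them (about 99.7%) fall within three.

```latex
\[
\text{95\% range} \approx (\mu + 2\sigma) - (\mu - 2\sigma) = 4\sigma
\;\;\Rightarrow\;\; \hat{\sigma} \approx \frac{\text{95\% range}}{4},
\qquad
\text{extreme range} \approx (\mu + 3\sigma) - (\mu - 3\sigma) = 6\sigma
\;\;\Rightarrow\;\; \hat{\sigma} \approx \frac{\text{extreme range}}{6}
\]
```

The variance to hold fixed in the power calculation is then the square of this guess.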
34. Hierarchical Linear Models
- From Bovaird (10-27-2006) seminar
- 2 treatments
- 20 classrooms / trt
- 25 students / classroom
- 4 years
- reasonable ideas of classroom(trt), student(classroom*trt), and within-student variances, as well as effect size
- Implement via exemplary data + GLIMMIX
35. Categorical Data?
- Example: Binary data
- Standard has success probability of 0.25
- "New and Improved": hope to increase to 0.30
- Have N subjects at each of L locations
- For sake of argument, suppose we have
  - 900 subjects / location
  - 10 locations
36. Power for GLMs
- 2 treatments
- P{favorable outcome}
  - for trt 1, p = 0.30; for trt 2, p = 0.25
- power if n1 = 300, n2 = 600
/* exemplary data */
data a;
  input trt y n;
datalines;
1 90 300
2 150 600
;
proc glimmix;
  class trt;
  model y/n=trt / chisq;
  ods output tests3=pwr;
run;
data power;
  set pwr;
  alpha=0.05;
  ncparm=numdf*chisq;
  crit=cinv(1-alpha,numdf,0);
  power=1-probchi(crit,numdf,ncparm);
proc print;
run;
37. Power for GLMM
- Same trt and sample size per location as before
- 10 locations
- Var(Location) = 0.25, Var(Trt*Loc) = 0.125
- Variance Components: variation in log(OddsRatio)
- Power?
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;
proc glimmix data=a initglm;
  class trt loc;
  model y/n = trt / oddsratio;
  random intercept trt / subject=loc;
  random _residual_;
  parms (0.25) (0.125) (1) / hold=1,2,3;
  ods output tests3=pwr;
run;
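The program for this slide stops at the ODS OUTPUT statement. A minimal sketch of the remaining step mirrors the F-based calculation from the CRD example. It assumes the pwr data set carries the NumDF, DenDF and FValue columns that GLIMMIX writes to its Tests3 table; the data set name glmmpower is made up for the sketch.

```sas
/* Sketch only: assumes PWR holds NumDF, DenDF and FValue from Tests3 */
data glmmpower;
  set pwr;
  alpha  = 0.05;
  ncparm = numdf*fvalue;                           /* non-centrality parameter */
  fcrit  = finv(1-alpha, numdf, dendf, 0);         /* critical F under H0 */
  power  = 1 - probf(fcrit, numdf, dendf, ncparm); /* power for the trt test */
run;

proc print data=glmmpower;
run;
```

Unlike the GLM case, the test here is an F test with denominator degrees of freedom that depend on the number of locations. That is one reason the number of locations matters so much for power.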
38. GLMM Power Analysis Results
- Gives you expected Conf Limits for the number of Locations and N / Loc contemplated
- Gives you the power of the test of TRT effect on prob(favorable)
39. GLMM Power: Impact of Sample Size?
- N of subjects per trt per location?
- N of Locations?
- Three cases
  - n = 300/600, 10 loc
  - n = 600/1200, 10 loc
  - n = 300/600, 20 loc
/* n = 300/600, 10 loc */
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 90 300
2 150 600
;
/* n = 600/1200, 10 loc */
data a;
  input trt y n;
  do loc=1 to 10;
    output;
  end;
datalines;
1 180 600
2 300 1200
;
/* n = 300/600, 20 loc */
data a;
  input trt y n;
  do loc=1 to 20;
    output;
  end;
datalines;
1 90 300
2 150 600
;
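One way to run the three cases in a single pass is to stack them into one exemplary data set and use BY-group processing. This is a sketch under assumptions: the variables scenario and nloc and the data set names (cases, pwr3, power3) are hypothetical, and the GLIMMIX statements are copied from the GLMM example with the same held variance components.

```sas
/* Sketch only: SCENARIO, NLOC and the data set names are assumptions */
data cases;
  input scenario trt y n nloc;
  do loc = 1 to nloc;             /* replicate each trt row over locations */
    output;
  end;
datalines;
1 1 90 300 10
1 2 150 600 10
2 1 180 600 10
2 2 300 1200 10
3 1 90 300 20
3 2 150 600 20
;

proc glimmix data=cases initglm;
  by scenario;                    /* data are already in scenario order */
  class trt loc;
  model y/n = trt / oddsratio;
  random intercept trt / subject=loc;
  random _residual_;
  parms (0.25) (0.125) (1) / hold=1,2,3;
  ods output tests3=pwr3;
run;

/* one row of power per scenario */
data power3;
  set pwr3;
  alpha  = 0.05;
  ncparm = numdf*fvalue;
  fcrit  = finv(1-alpha, numdf, dendf, 0);
  power  = 1 - probf(fcrit, numdf, dendf, ncparm);
run;

proc print data=power3;
  var scenario numdf dendf ncparm fcrit power;
run;
```

Keeping the cases in one data set makes it easy to add further combinations of N and number of locations by adding two lines of data each.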
40. GLMM Power: Impact of Sample Size?
Recall, for 10 locations, N = 300/600: CI for OddsRatio was (0.884, 1.871); Power was 0.274
For 10 locations, N = 600/1200
N alone has almost no impact
For 20 locations, N = 300/600
41. Recent developments
- Continue binary example
- Power analysis shows
what do you do?
42. More Information
- Consider studies directed toward improving success rate similar to that proposed in study
- Lit search yields 95 such studies
- 29 have reported statistically significant gains of p1 - p2 > 0.05 (or, alternatively, significant odds ratios of (30/70)/(25/75) = 1.29 or greater)
- If this holds, prior prob(desired effect size) is approx 0.3
43. An Intro Stat Result
"real" Pr{type I error} is more like 0.23 than 0.10!!!
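The slide's own derivation did not survive, but the "intro stat result" is presumably Bayes' theorem applied to the testing decision. As a hedged sketch, let \pi be the prior probability that the desired effect size (DES) is real, \alpha the nominal type I error rate, and 1 - \beta the power. These symbols are assumed here and do not come from the slides.

```latex
\[
P\{H_0 \text{ true} \mid \text{significant result}\}
= \frac{\alpha\,(1 - \pi)}{\alpha\,(1 - \pi) + (1 - \beta)\,\pi}
\]
```

With a low prior such as \pi \approx 0.3, this probability can be well above the nominal \alpha, and it shrinks as power increases.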
44. Returning to All Scenarios
NOTE: dramatic impact of alpha-level when prior Pr{DES} is relatively low
POWER's role increases as Pr{DES} increases
45. Closing Comments
- In case it's not obvious
  - I'm not a fan of "painting by numbers"
- Role of power analysis: misunderstood and underappreciated
- MOST of ALL: it is an opportunity to explore and rehearse study design and planned analysis
- Engage statistician as a participating member of research team
- Give it the TIME it REQUIRES
46. Thanks
... for coming