Lecture 2: Statistical Overview

Provided by: peopleMu1

Learn more at: http://people.musc.edu

1
Lecture 2: Statistical Overview
Child Psychiatry Research Methods Lecture Series

Elizabeth Garrett
esg_at_jhu.edu

2
Two Types of Statistics

Descriptive Statistics
Uses sample statistics (e.g. mean, median,
standard deviation) to describe the sample and
the population from which it was drawn.
Not decision oriented
Pilot studies are descriptive
Statistical Inference
Inference: The act of passing from statistical
sample data to generalizations, usually with
calculated degrees of certainty.
Key elements
sample
generalizations
certainty
Often used for making decisions
drug works or it doesn't
ADHD is genetically inherited or it isn't
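
The descriptive statistics named on this slide (mean, median, standard deviation) can be sketched with Python's standard library. The response times below are hypothetical values made up for illustration, not data from the lecture.

```python
import statistics

# Hypothetical response times (ms) from a small pilot sample; not lecture data.
response_times = [512, 478, 530, 495, 610, 487, 502, 520]

summary = {
    "n": len(response_times),
    "mean": statistics.mean(response_times),
    "median": statistics.median(response_times),
    # Sample standard deviation (n - 1 in the denominator).
    "sd": statistics.stdev(response_times),
}

for name, value in summary.items():
    print(f"{name}: {value:.1f}")
```

These numbers only describe the sample; no decision or hypothesis test is involved, which is the distinction the slide draws between description and inference.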

3
Example 1: Viral Exposure and Autism (Deykin
and MacMahon, 1979)

Hypothesis
Direct exposure to or clinical illness with
measles, mumps, or chicken pox may play a causal
role in autism.

4
Example 2: Neurobiology of Attention in Fetal
Alcohol Syndrome (Lockhart, 2001?)

Hypotheses
(1) The neurobiological basis of problems in
response inhibition and motor impersistence in
children with FAS is related to abnormalities in
the anterior frontostriatal network.
(2) The neurobiological basis of problems in
orienting/shifting attention in children with FAS
is related to abnormalities in the posterior
parietal network.

5
4 Statistical Plan
4.1 Primary outcome(s)
4.2 Statistical analysis
4.3 Sample size justification
6
4 Statistical Plan
4.1 Primary outcome(s)
Common Problem: Primary outcome variable not defined!
7
Defining Primary Outcome Variables

Continuous
MRI volumes
fMRI activation levels
blood pressure
response time
number of voxels activated
cost of hospital visit
neurobehavioral test score

Categorical
Nominal
Binary (two categories)
gene carrier status (as diagnosed by ...)
measles (as diagnosed by ...)
ADHD (as diagnosed by ...)
Polychotomous (more than two unordered
categories)
region of activation
Ordinal
severity score (see BPI)
symptom rating on a scale of 1 to 5

Example 1: Primary outcomes
Disease history of
measles
mumps
chicken pox
Example 2: Primary outcomes
MRI volumes of
corpus callosum
caudate
cerebellar vermis
parietal lobes
frontal lobe

10
4 Statistical Plan
4.1 Primary outcomes
- Be clear about each variable and how it is
measured.
- NOT okay to say "our primary outcome variable
is cognition."
- It IS okay to say "our primary outcome variable
is cognition as measured by the WISC-III."
- Multiple outcomes are okay, e.g. MRI volumes
and cognitive tests can both be primary outcomes.
11
4 Statistical Plan
4.1 Primary outcome(s)
4.2 Statistical analysis

- How are you going to answer the specific aims using
the primary outcome variable?
12

Commonly seen statistical methods in analysis
plans
t-test
confidence interval
Chi-square test
Fisher's exact test
linear regression
logistic regression
Wilcoxon rank sum test
ANOVA
GEE

13
Key Idea: Data Reduction

Statistics is the art/science of summarizing a
large amount of information by just a few numbers
and/or statements.
Examples
p-value = 0.01
OR = 5.0
prevalence = 0.20 ± 0.05

14
Example 1

Recall aim: To compare measles history in
autistic versus non-autistic kids.
Methods
Odds ratio: Quantifies risk of disease in two
exposure groups
Confidence interval: Answers "What is a reasonable
range for the true odds ratio?"
Fisher's exact test: Answers "Is the risk the
same in the two exposure groups?"

15
Statistical Analysis

We will measure the risk of autism associated
with measles using an odds ratio. Significance
will be assessed by Fisher's exact test and a 95%
confidence interval will be calculated.
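
A minimal sketch of the three pieces in this analysis statement, using only the standard library. The 2x2 counts are hypothetical (not the Deykin and MacMahon data). The confidence interval uses the common log-scale (Woolf) approximation, which is an assumption, since the slides do not say how the interval is computed.

```python
from math import comb, exp, log, sqrt

def odds_ratio(a, b, c, d):
    """Odds ratio for a 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

def woolf_ci(a, b, c, d, z=1.96):
    """Approximate 95% CI for the odds ratio, built on the log scale."""
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    center = log(odds_ratio(a, b, c, d))
    return exp(center - z * se), exp(center + z * se)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value: sum the hypergeometric
    probabilities of all tables no more likely than the observed one."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(k):
        return comb(row1, k) * comb(row2, col1 - k) / total

    p_obs = prob(a)
    k_min, k_max = max(0, col1 - row2), min(row1, col1)
    p = sum(prob(k) for k in range(k_min, k_max + 1)
            if prob(k) <= p_obs * (1 + 1e-9))
    return min(1.0, p)

# Hypothetical counts:   measles   no measles
#   autistic                6          94
#   non-autistic            2          98
a, b, c, d = 6, 94, 2, 98
or_hat = odds_ratio(a, b, c, d)
ci_low, ci_high = woolf_ci(a, b, c, d)
p_value = fisher_exact_two_sided(a, b, c, d)
print(f"OR = {or_hat:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f}), p = {p_value:.2f}")
```

With counts this small the interval is wide and includes 1, which is the situation the later slides on sample size address.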

16
Example 2

Recall aim: To compare MRI volumes in FAS kids
and controls.
Methods
Two-sample t-test: Answers "Are the mean volumes
in the two groups different?"
95% confidence interval: Answers "What is the
estimated difference in volumes in the two
groups, approximately?"

17
Statistical Analysis

To answer the specific aims, we will compare the
caudate volumes in the FAS group to those in the
control group using a two-sample t-test. We will
also estimate a 95% confidence interval to
provide a reasonable range for the difference in
mean volumes between the two groups.
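
The comparison described here can be sketched as follows. The volumes are hypothetical numbers, and the interval uses a normal quantile as a large-sample stand-in for the t quantile (an assumption made to stay within the standard library; with groups this small, a t quantile from a statistics package would give a somewhat wider interval).

```python
from math import sqrt
from statistics import NormalDist, mean, variance

def two_sample_t(x, y):
    """Pooled-variance two-sample t statistic.

    Returns (t, degrees of freedom, difference in means, standard error).
    """
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    se = sqrt(pooled_var * (1 / nx + 1 / ny))
    diff = mean(x) - mean(y)
    return diff / se, nx + ny - 2, diff, se

def approx_ci(diff, se, level=0.95):
    """Normal-approximation confidence interval for the difference in means."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff - z * se, diff + z * se

# Hypothetical volumes (arbitrary units); not data from the study.
control = [455, 430, 520, 390, 470, 445, 500, 410]
fas = [405, 380, 450, 340, 420, 395, 430, 370]

t_stat, df, diff, se = two_sample_t(control, fas)
ci_low, ci_high = approx_ci(diff, se)
print(f"difference = {diff:.2f}, t = {t_stat:.2f} on {df} df")
print(f"approximate 95% CI: ({ci_low:.1f}, {ci_high:.1f})")
```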

18
4 Statistical Plan
4.1 Primary outcome(s)
4.2 Statistical analysis
- Data reduction is key: how are you going to
combine information from all patients to answer
the scientific question?

- Specific methods need to be designated.
- Study design often changes after statistical
issues are considered!
19
4 Statistical Plan
4.1 Primary outcome(s)
4.2 Statistical analysis
4.3 Sample size justification
- Do you have enough subjects to answer the
question, but not so many that the study is
inefficient (in terms of money and risks)?
20
Power and Sample Size Considerations

All about precision! (Recall Craig last time)
Intuition
the more individuals, the better your estimate
the more individuals, the less variability in
your estimate
the more individuals, the more precise your
estimate
but how precise does your estimate need to be?
Example 1
Odds ratio of measles for autism = 3.7
Interpretation: Babies exposed to measles
prenatally or in early infancy are at 3.7 times
the risk for autism compared to children who are
unexposed.
Strong result?

21
Three Theoretical Outcomes

95% confidence intervals

22
Actual Result from Study
95% confidence interval: (0.97, 14.2)
Fisher's exact p-value = 0.12
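
A quick consistency check on these numbers, assuming the interval was built on the log-odds scale as exp(log(OR) ± z × SE). That construction is an assumption, since the slide does not say how the interval was obtained. Under it, the point estimate is the geometric mean of the two endpoints.

```python
from math import log, sqrt
from statistics import NormalDist

ci_low, ci_high = 0.97, 14.2          # interval reported on the slide
z_975 = NormalDist().inv_cdf(0.975)   # about 1.96

# Assumption: the interval is symmetric around log(OR) on the log scale.
or_implied = sqrt(ci_low * ci_high)
se_log_or = (log(ci_high) - log(ci_low)) / (2 * z_975)

# The interval includes OR = 1, so "no association" is not ruled out.
includes_null = ci_low <= 1.0 <= ci_high

print(f"implied OR = {or_implied:.2f}, SE of log(OR) = {se_log_or:.2f}")
print(f"interval includes 1: {includes_null}")
```

The implied estimate comes out close to the odds ratio of 3.7 quoted on slide 20, and the interval just reaches below 1, matching the non-significant p-value.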
23
Magnitude versus Significance

Magnitude of finding: How big is the odds ratio?
Statistical significance of the finding: Is the
odds ratio different from 1?
Clinical significance of the finding: Is the
size of the estimated odds ratio worth worrying
about?
Autism and Measles
exposure to measles is rare
need a lot of subjects to show significant
difference!

24
Justifying sample size in a study design

Hypothesis testing
H0: OR = 1
Ha: OR = 3
Which is a more reasonable conclusion?
Issues
type I error (α)
type II error (β)

[Figure labels: Ha, H0]
25
Type I and II Errors

Type I error (α)
The probability that we reject H0 given that it
is true
The probability that we find an association
between measles and autism when, in truth, one
does not exist.
Type II error (β)
The probability that we reject Ha given that it
is true
The probability that we find no association
between measles and autism when, in truth, one
does exist.
Note: Power = 1 - β

26
Sample size dictates overlap

Scenario 1: Small samples
Scenario 2: Large samples
27
Decision Rule

Before the study is completed, you know what you need
to observe to find evidence for OR = 1 or OR = 3
Scenario 1: If observed OR > 3.6, then conclude
that there IS an association.
Scenario 2: If observed OR > 1.6, then conclude
that there IS an association.

28
Type I Error: alpha
Alpha is usually predetermined: α = 0.05

29
Type II Error: beta
Beta is figured out conditional on alpha.
If sample size is small, beta will be big (β = 0.60)
If sample size is big, beta will be small (β = 0.02)
30
Power = 1 - beta
If sample size is small, power will be small (power = 0.40)
If sample size is large, power will be large (power = 0.98)
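
The link between the decision thresholds on slide 27 (observed OR > 3.6 or > 1.6) and the beta and power values on slides 29 and 30 can be sketched under a normal approximation. The assumptions here are not stated in the slides: log(observed OR) is treated as normal with the same standard error under H0 and Ha, and each threshold is taken to correspond to a one-sided alpha of 0.05.

```python
from math import log
from statistics import NormalDist

def beta_for_threshold(threshold_or, or_alt=3.0, alpha=0.05):
    """Type II error implied by a decision threshold on the observed OR.

    Assumes log(observed OR) ~ Normal with equal SE under H0 (OR = 1)
    and Ha (OR = or_alt), and a one-sided test at level alpha.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha)
    # Under H0 the mean of log(OR) is 0, so the threshold fixes the SE.
    se = log(threshold_or) / z_alpha
    sampling_dist_ha = NormalDist(mu=log(or_alt), sigma=se)
    # Beta: chance the observed OR falls below the threshold when Ha is true.
    return sampling_dist_ha.cdf(log(threshold_or))

for label, threshold in [("Scenario 1 (threshold 3.6)", 3.6),
                         ("Scenario 2 (threshold 1.6)", 1.6)]:
    beta = beta_for_threshold(threshold)
    print(f"{label}: beta = {beta:.2f}, power = {1 - beta:.2f}")
```

Under these assumptions the two scenarios give beta of roughly 0.59 and 0.01, close to the 0.60 and 0.02 shown on the slides. A higher threshold implies a wider sampling distribution (fewer subjects), more overlap between H0 and Ha, and therefore lower power.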
31
Power/Sample Size Estimate

Kids with Autism: N = 608
Kids without Autism: N = 1216
Using Fisher's exact test, we have 80% power
with alpha = 0.05 to detect an odds ratio of 3 if
we enroll 608 children with autism and 1216
normal controls. This assumes that 3% of
autistic children have been exposed to measles
and 1% of the controls have been exposed.
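
As a rough check on this statement, the sketch below uses a normal approximation for comparing two proportions. This is a different technique from the Fisher's exact calculation quoted on the slide, so the result is only expected to land in the same neighborhood. It also confirms that exposure rates of 3% and 1% correspond to an odds ratio of about 3.

```python
from math import sqrt
from statistics import NormalDist

def odds(p):
    return p / (1 - p)

def power_two_proportions(p_cases, p_controls, n_cases, n_controls, alpha=0.05):
    """Approximate power of a two-sided test of two proportions
    (normal approximation, not Fisher's exact test)."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    p_pooled = (p_cases * n_cases + p_controls * n_controls) / (n_cases + n_controls)
    se_null = sqrt(p_pooled * (1 - p_pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = sqrt(p_cases * (1 - p_cases) / n_cases
                  + p_controls * (1 - p_controls) / n_controls)
    diff = abs(p_cases - p_controls)
    # Ignores the negligible chance of rejecting in the wrong direction.
    return NormalDist().cdf((diff - z_crit * se_null) / se_alt)

implied_or = odds(0.03) / odds(0.01)
power = power_two_proportions(0.03, 0.01, n_cases=608, n_controls=1216)
print(f"implied odds ratio = {implied_or:.2f}")
print(f"approximate power = {power:.2f}")
```

The approximation gives a power somewhat above 80%. Exact tests are generally more conservative than the normal approximation, so a slightly lower figure from a Fisher's exact calculation is the direction one would expect.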

32
Sample Size Table (80% power, alpha = 0.05)
33
Example 2: FAS and controls

How many FAS children and controls do we need to
detect a significant difference in MRI volumes?
From previous research we can estimate (i.e.
guess):
Volumes of cerebellar vermis in FAS kids are
approximately 400.
It would be interesting if FAS kids had volumes
10% or more below those of normal controls (i.e. 400
versus 450).

34
Sample size needed depends on overlap between FAS
and control kids.
[Figure: two panels, each labeled control and FAS]
35
Two sample t-test

Same general approach as the odds ratio
Define Δ = difference in mean volumes
= control mean - FAS mean
H0: Δ = 0
Ha: Δ = 50
Same question: which hypothesis is more reasonable
based on our data?
Note: Based on previous research, we can
estimate that the standard deviation of volumes is
70.

36
What if N = 100 (50 per group)?
Alpha = 0.05
Beta = 0.06
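
The beta on this slide can be approximated from the quantities on slide 35 (a difference of 50 and a standard deviation of 70). The sketch below uses a normal approximation to the two-sample t-test with a two-sided alpha, both of which are assumptions; the slides do not say which software or formula produced the value.

```python
from math import sqrt
from statistics import NormalDist

def power_two_sample(delta, sd, n_per_group, alpha=0.05):
    """Approximate power to detect a mean difference `delta` with
    equal group sizes and a common standard deviation `sd`."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference in means
    return NormalDist().cdf(delta / se - z_crit)

power = power_two_sample(delta=50, sd=70, n_per_group=50)
beta = 1 - power
print(f"power = {power:.2f}, beta = {beta:.2f}")
```

The approximation gives a beta of about 0.05, close to the 0.06 on the slide; a t-based calculation would be expected to give slightly lower power and hence a slightly larger beta.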
37
Power/Sample Size Options

For power = 80%, alpha = 0.05:
32 FAS and 32 controls
For power = 90%, alpha = 0.05:
43 FAS and 43 controls
To achieve 80% power with a type I error of 5%,
we require 32 FAS kids and 32 controls. This
will allow us to detect a 10% difference in mean
MRI volumes of cerebellar vermis (400 versus 450,
respectively) assuming standard deviations of 70
in each group.
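
The sample sizes above can be approximated with the textbook normal-approximation formula n = 2 (z_{1-α/2} + z_{power})² σ² / Δ² per group. This is a sketch under that assumption, not necessarily the method used to produce the slide's numbers.

```python
from math import ceil
from statistics import NormalDist

def n_per_group(delta, sd, power=0.80, alpha=0.05):
    """Subjects per group for a two-sided, two-sample comparison of means
    (normal approximation, equal group sizes, common standard deviation)."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_power = z.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_power) * sd / delta) ** 2)

for target in (0.80, 0.90):
    n = n_per_group(delta=50, sd=70, power=target)
    print(f"power {target:.0%}: about {n} per group")
```

The formula gives 31 and 42 per group, one fewer than the 32 and 43 quoted on the slide. A calculation based on the t distribution asks for slightly more subjects than the normal approximation, which would account for a gap of that size.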

38
4 Statistical Plan
4.1 Primary outcome(s)
4.2 Statistical analysis
4.3 Sample size justification
- Explain the justification in terms of statistics.
Saying "we are confident that 10 subjects will
provide ..." is not sufficient.
39
General Biostatistics References