Title: Advanced topics in study design II: Less commonly used observational study designs
1Advanced topics in study design II: Less commonly used observational study designs
- John S. Witte
- jwitte_at_ucsf.edu
2Cross-Sectional Studies
- Subjects are all persons in the population at the time of ascertainment, or a representative sample.
- Often deal with exposures that cannot change, such as blood type or other invariable personal characteristics.
- Cross-sectional analyses of baseline information in cohort studies provide possible exposure-disease associations that can later be confirmed.
- Cases in a cross-sectional study will overrepresent cases with a long duration of illness and underrepresent those with a short duration of illness.
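A minimal sketch of why prevalent cases overrepresent long-duration illness. It assumes a steady-state population, in which the number of prevalent cases of each type is proportional to incidence times mean duration. The function name and all numbers are hypothetical.

```python
def prevalent_share(incidence_by_type, mean_duration_by_type):
    """Share of each case type in a cross-sectional (prevalent) sample.

    Assumes a steady state, so the prevalent pool for each type is
    proportional to incidence rate x mean duration of illness.
    """
    pool = {k: incidence_by_type[k] * mean_duration_by_type[k]
            for k in incidence_by_type}
    total = sum(pool.values())
    return {k: v / total for k, v in pool.items()}


# Hypothetical: two forms of a disease with equal incidence,
# one lasting 1 year on average and the other 10 years.
shares = prevalent_share({"short": 5.0, "long": 5.0},
                         {"short": 1.0, "long": 10.0})
print(shares)
```

Although both forms arise equally often, the long-duration form makes up 10/11 of the cases found at any one point in time.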
3Case or Control Cross Sectional Studies
- Use cases only to estimate disease prevalence, etc., among different groups (e.g., defined by geographical region).
- Use controls to estimate exposure prevalence in the population.
4Cross-sectional Study Strengths
- Relatively feasible and not too time-consuming, since there is no follow-up period (though random sampling in a large population can be expensive and problematic).
- We can study several diseases and/or exposures; thus, it is useful for screening new hypotheses.
- We can describe disease frequency and health needs of a large population; thus, it is useful for health planning.
Hal Morgenstern
5Cross-sectional Study Weaknesses
- Potential temporal ambiguity (exposure and disease).
- Possible large measurement error that may be nondifferential (e.g., exposures collected after disease occurs), resulting in biased effect estimates.
- Selection bias is possible since prevalent cases occurred before the study is conducted, so disease status can influence the selection of subjects.
- It is inefficient for studying rare or highly fatal diseases or diseases with short durations of expression.
Hal Morgenstern
6Repeated Survey
- Combines two or more cross-sectional studies of the same source population at different times. Although we might say that the population is followed in this type of study, individuals are not followed.
- Design is not much better than the simple cross-sectional study for testing etiologic hypotheses.
- Useful for studying population trends or evaluating the effectiveness of population interventions initiated between surveys.
- Can assess the extent to which a change in disease rate can be explained by changes in specific exposures.
Hal Morgenstern
7Survey Follow-up
- Combines a cross-sectional study with a subsequent cohort study of those individuals who are still at risk of developing the disease.
- This design is used when:
  - We want to estimate both the prevalence and incidence rates of a disease in the same source population.
  - It is hard to distinguish between prevalent and incident cases.
  - We need baseline assessments to identify persons still at risk of developing the disease (e.g., as a necessary first phase of a cohort study).
Hal Morgenstern
8Intervention Follow-up
- Combines an intervention with a cohort study, each part having a different follow-up period and outcome variable.
- The first follow-up period is short and is used to assess the effect of an intervention / exposure on an outcome (not the primary disease).
- The second follow-up period is generally longer and is used to observe disease occurrence.
- This design is useful for examining relationships between acute biological/behavioral responses and chronic health effects.
Hal Morgenstern
9Proportionate Study
- A proportional morbidity or mortality study involves data on cases or deaths only.
- Special type of case-control (or cross-sectional) study.
- A group of individuals with (or dying from) the index disease of interest is compared with a group of individuals with (or dying from) certain other diseases.
Hal Morgenstern
10Example: Proportional mortality study
- Occupational exposure to low-level ionizing radiation and cancer.
- All certified deaths among employees of the Hanford nuclear power facility between 1944 and 1972 were classified by cause of death and exposure status (based on company records of radiation monitoring).
- Proportion of deaths that was exposed to ionizing radiation (i.e., at least one positive badge reading) among male employees, by cause of death (n = 3520). Cancers of the reticuloendothelial system (RES) include lymphomas, myelomas, and leukemias.
Mancuso et al. Health Physics 1977;33:369-385.
Hal Morgenstern
11Two-Stage Sampling Case-Control Studies
- Large control sample has some exposure information or a limited amount of information on some relevant variables.
- A subsample is selected and more detailed information is obtained.
- Useful when it is relatively inexpensive to obtain exposure information but more expensive to obtain specific covariate information.
- Also useful when exposure information has already been collected on the entire population but more detailed information is needed on covariates.
- Special analytic methods are needed to take full advantage of the information collected at both stages.
12Revisit Nested Case-Control Studies
- What does the OR from these studies estimate?
- Depends on how the controls are sampled.
  - Random at start of follow-up (case-cohort)
    [Timeline figure: Start FU to End FU]
  - Density sampling
    [Timeline figure: Start FU to End FU]
  - Cumulative sampling
    [Timeline figure: Start FU to End FU]
13Nested Case-control Studies
            Cases   Total   Person Time   Controls
Exposed     A1      N1      T1            B1
Unexposed   A0      N0      T0            B0
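A short sketch, in the notation of the table above, of what the nested case-control OR, (A1/A0) / (B1/B0), is expected to estimate under each control-sampling scheme. It relies on the standard result that the expected control ratio B1/B0 equals N1/N0 (case-cohort), T1/T0 (density) or (N1 - A1)/(N0 - A0) (cumulative). The function name and the cohort numbers are hypothetical.

```python
def expected_or(A1, A0, N1, N0, T1, T0, scheme):
    """Value the nested case-control OR is expected to estimate.

    The OR is (A1/A0) / (B1/B0); the expected control ratio B1/B0
    depends on how the controls are sampled.
    """
    if scheme == "case-cohort":    # controls drawn from all subjects at start of FU
        control_ratio = N1 / N0
    elif scheme == "density":      # controls drawn in proportion to person-time
        control_ratio = T1 / T0
    elif scheme == "cumulative":   # controls drawn from non-cases at end of FU
        control_ratio = (N1 - A1) / (N0 - A0)
    else:
        raise ValueError(f"unknown sampling scheme: {scheme}")
    return (A1 / A0) / control_ratio


# Hypothetical cohort: 1000 exposed and 1000 unexposed subjects.
cohort = dict(A1=200, A0=100, N1=1000, N0=1000, T1=4500.0, T0=4750.0)
for scheme in ("case-cohort", "density", "cumulative"):
    print(scheme, round(expected_or(**cohort, scheme=scheme), 3))
```

Under these assumptions the case-cohort OR equals the risk ratio, the density-sampled OR equals the rate ratio, and the cumulative-sampled OR equals the odds ratio of the underlying cohort.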
14Case-Only Studies
- Study only cases.
- Use theoretical considerations to construct a distribution of exposure in the source population.
- Use this distribution in place of an observed control series.
- Examples:
  - Case-crossover studies
  - Case-specular studies
  - Genetic epidemiology
    - Hardy-Weinberg disequilibrium
    - Gene x environment interaction
15Case-Crossover Studies
- One or more time periods are selected as matched control periods for the case.
- Compare exposure status at the time of disease onset to the control exposure status within the same individual.
- Depends on the assumption that neither exposure nor confounders are changing over time in a systematic way (e.g., in a cyclic manner).
- Exposure must vary over time within individuals.
- Exposure must have a short duration and a transient effect.
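A minimal sketch of the within-person comparison, assuming one matched control period per case. With 1:1 matching, the usual matched-pair (conditional) odds ratio is the ratio of the two kinds of discordant cases. The function name and the data are hypothetical.

```python
def case_crossover_or(pairs):
    """Matched-pair OR for a case-crossover study.

    pairs: one (exposed_in_hazard_period, exposed_in_control_period)
    tuple of booleans per case. Only discordant cases contribute.
    """
    hazard_only = sum(1 for h, c in pairs if h and not c)
    control_only = sum(1 for h, c in pairs if c and not h)
    if control_only == 0:
        raise ValueError("no cases exposed only in the control period")
    return hazard_only / control_only


# Hypothetical data: 12 cases exposed only just before onset,
# 4 exposed only in the control period, 3 in both, 21 in neither.
pairs = ([(True, False)] * 12 + [(False, True)] * 4
         + [(True, True)] * 3 + [(False, False)] * 21)
print(case_crossover_or(pairs))
```

Cases exposed in both periods or in neither carry no information about the effect, which is why exposure must vary over time within individuals.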
16Example: Physical Exertion and MI
- A number of different exposure periods can be measured.
- One might also use a bidirectional approach to measuring exposures.
Tager, 2000
17Limitations of Case-Crossover Studies
- There can be overmatching on the exposures, which leads to decreased precision of estimates.
- Misclassification could be differential between case and control groups if different methods are used to measure exposures or past exposures are more poorly measured.
18Case-Specular Design
- Use some physical properties to determine the controls' environmental exposures.
- E.g., in a study of electromagnetic field exposure and disease, measure the distance from each case's home to electrical wires. Then flip the block and measure the distance from the specular (mirror-image) home to the electrical wires to obtain the control's distance.
19Genetic Epidemiology Case-Only Studies
- The laws of inheritance may be combined with certain assumptions to derive a population distribution of genotypes.
- Hardy-Weinberg Principle: Genotypes will reflect allele frequency distributions in the general population.
- That is, both allele and genotype frequencies in a population remain constant (they are in equilibrium) from generation to generation unless specific disturbing influences are introduced.
20Hardy-Weinberg Disequilibrium
- Expect the cases to have an increased frequency of the disease-causing genetic alleles.
- Study cases only, and look for departures from Hardy-Weinberg equilibrium.
- This suggests chromosomal regions where a disease-causing gene resides.
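One common way to look for such a departure is a chi-square goodness-of-fit test of the case genotype counts against Hardy-Weinberg proportions (p^2, 2pq, q^2). The sketch below assumes a biallelic marker with alleles labeled A and B; the function name and the genotype counts are hypothetical, and the slides do not specify which test was intended.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square test (1 df) of Hardy-Weinberg proportions among cases.

    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1 - p                                # frequency of allele B
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Hypothetical case genotype counts with too few heterozygotes.
stat, p_value = hwe_chi_square(40, 30, 30)
print(stat, p_value)
```

One degree of freedom is used because three genotype classes are fitted with one estimated allele frequency.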
21Case-Parents Transmission Disequilibrium Test (TDT)
- Transmitted alleles vs. non-transmitted alleles
[Pedigree figure: parents with genotypes M1 M2 and M2 M2; affected offspring with genotype M1 M2]
22TDT
- Transmitted alleles vs. non-transmitted alleles
                     Non-Transmitted Allele
Transmitted Allele   M1    M2
M1                   n11   n12
M2                   n21   n22
TDT = $(n_{12} - n_{21})^2 / (n_{12} + n_{21})$
Asymptotically $\chi^2$ with 1 degree of freedom
23TDT
                     Non-Transmitted Allele
Transmitted Allele   M1    M2
M1                   0     1
M2                   0     1
TDT = $(1 - 0)^2 / (1 + 0) = 1$
p-value = 0.32
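The TDT statistic and its asymptotic p-value can be checked with a few lines of code. This is a sketch using the off-diagonal counts n12 and n21 from the table; the function name is hypothetical, and the p-value uses the chi-square distribution with 1 degree of freedom.

```python
import math


def tdt(n12, n21):
    """TDT statistic (n12 - n21)^2 / (n12 + n21) and asymptotic p-value.

    n12: transmitted M1, non-transmitted M2
    n21: transmitted M2, non-transmitted M1
    Only heterozygous parents contribute to these two counts.
    """
    if n12 + n21 == 0:
        raise ValueError("no informative (heterozygous) parents")
    stat = (n12 - n21) ** 2 / (n12 + n21)
    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Counts from the single-family example: n12 = 1, n21 = 0.
stat, p_value = tdt(1, 0)
print(stat, round(p_value, 2))
```

With a single informative transmission the asymptotic approximation is crude; the example only illustrates the arithmetic.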
24Case-Only for Interactions
               E+            E-
            G+    G-      G+    G-
Case        A11   A10     A01   A00
Control     B11   B10     B01   B00
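A sketch of the case-only estimate of gene x environment interaction, using the cell labels from the table. It relies on the standard assumption that genotype and exposure are independent in the source population (together with a rare disease), in which case the G-E odds ratio among cases estimates the multiplicative interaction. The function names and counts are hypothetical.

```python
def case_only_interaction_or(A11, A10, A01, A00):
    """G-E odds ratio among cases only."""
    return (A11 * A00) / (A10 * A01)


def case_control_interaction_or(A11, A10, A01, A00, B11, B10, B01, B00):
    """Multiplicative interaction from cases and controls:
    OR(G+,E+) / (OR(G+,E-) * OR(G-,E+)), which simplifies to the
    case G-E odds ratio divided by the control G-E odds ratio.
    """
    control_ge_or = (B11 * B00) / (B10 * B01)
    return case_only_interaction_or(A11, A10, A01, A00) / control_ge_or


# Hypothetical counts; G and E are independent among the controls.
cases = dict(A11=60, A10=40, A01=50, A00=100)
controls = dict(B11=20, B10=80, B01=40, B00=160)
print(case_only_interaction_or(**cases))
print(case_control_interaction_or(**cases, **controls))
```

When G and E are independent among controls, the two estimates agree, and the case-only version needs no control series.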
25Family-Based Association Studies
[Figure: pedigree diagrams for family-based designs using siblings, parents, or cousins; individuals are labeled G]
26Twin Studies
- Compare the disease concordance rates of MZ (identical) and DZ (fraternal) twins.
                   Twin 1
Twin 2 Disease     Yes   No
Yes                A     B
No                 C     D
Concordance = $2A / (2A + B + C)$
- Then one can estimate heritability of a phenotype.
27Example of Twin Study: Prostate Cancer (PCa)
- Twin registry (Sweden, Denmark, and Finland)
- 7,231 MZ and 13,769 DZ twins (male)
Twin   Concordant pairs (A)   Discordant pairs (B + C)   Concordance
MZ     40                     299                        0.21
DZ     20                     584                        0.06
Heritability: 0.42 (0.29-0.50)
Non-shared environment: 0.58 (0.50-0.67)
Lichtenstein et al. NEJM 2000;343:78-85.
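The concordance values in the table follow from the formula 2A / (2A + B + C). A short sketch that reproduces them from the pair counts above (the function name is hypothetical):

```python
def concordance(concordant_pairs, discordant_pairs):
    """Concordance 2A / (2A + B + C).

    concordant_pairs: A, pairs in which both twins are affected
    discordant_pairs: B + C, pairs in which only one twin is affected
    """
    return 2 * concordant_pairs / (2 * concordant_pairs + discordant_pairs)


mz = concordance(40, 299)   # MZ pairs from the table
dz = concordance(20, 584)   # DZ pairs from the table
print(round(mz, 2), round(dz, 2))
```

The heritability and non-shared environment estimates quoted above come from the cited paper and are not reproduced by this calculation.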
28Comparison of Designs
- Family-based designs can be less efficient than population-based designs.
                   Rare Recessive   Common     Rare Dominant
                   High Risk        Low Risk   High Risk
Population-based   100              100        100
Case-sibling       69               51         50
Case-cousin        97               88         88
TDT                231              102        101
Witte et al. Am J Epidemiol 1999
- Further, family-based designs can require more recruitment effort.
29Population Stratification
- Confounding bias that may occur if one's sample comprises sub-populations with different
  - allele frequencies (?) and
  - disease rates (RpR)
- Cases are more likely than controls to arise from the sub-population with the higher baseline disease rate.
- Cases and controls will have different allele frequencies regardless of whether the locus is causal.
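A small numerical sketch of this bias. It assumes two sub-populations in which the variant has no effect on disease, but both carrier frequency and baseline risk differ between them. For simplicity it works with carrier status rather than allele counts; the function name and all numbers are hypothetical.

```python
def crude_odds_ratio(strata):
    """Expected carrier-disease odds ratio when strata are pooled.

    strata: list of (population share, carrier frequency, disease risk).
    Within each stratum, risk is the same for carriers and non-carriers,
    so the variant is not causal.
    """
    case_carrier = sum(w * f * r for w, f, r in strata)
    case_noncarrier = sum(w * (1 - f) * r for w, f, r in strata)
    ctrl_carrier = sum(w * f * (1 - r) for w, f, r in strata)
    ctrl_noncarrier = sum(w * (1 - f) * (1 - r) for w, f, r in strata)
    return (case_carrier * ctrl_noncarrier) / (case_noncarrier * ctrl_carrier)


# Hypothetical sub-populations: (share, carrier frequency, disease risk).
strata = [(0.5, 0.2, 0.01), (0.5, 0.6, 0.05)]
pooled = crude_odds_ratio(strata)
within = [crude_odds_ratio([s]) for s in strata]
print(pooled, within)
```

The odds ratio is 1 within each sub-population but clearly above 1 when the two are pooled, purely because the higher-risk sub-population also has more carriers.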
30Genomic Control
- Use a population-based design, but incorporate genomic information into the analysis to adjust for population stratification.
- Genomic control adjusts test statistics for outliers due to population stratification.
- Use unlinked genetic markers.
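One common form of genomic control estimates an inflation factor (lambda) as the median of the 1-df chi-square statistics at unlinked markers divided by the median of a chi-square with 1 degree of freedom (about 0.455), then divides the candidate test statistics by lambda. The sketch below assumes that form; the function name and the statistics are hypothetical.

```python
import statistics

# Approximate median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549


def genomic_control(unlinked_marker_stats, candidate_stats):
    """Inflation factor and adjusted statistics (median-based form).

    unlinked_marker_stats: 1-df chi-square statistics at unlinked markers
    candidate_stats: 1-df chi-square statistics to be adjusted
    """
    lam = statistics.median(unlinked_marker_stats) / CHI2_1DF_MEDIAN
    lam = max(lam, 1.0)   # do not inflate statistics when lambda < 1
    return lam, [s / lam for s in candidate_stats]


# Hypothetical statistics showing inflation at the unlinked markers.
lam, adjusted = genomic_control([0.1, 0.5, 0.9, 1.4, 2.0], [8.0])
print(lam, adjusted)
```

The adjusted statistics are then referred to the chi-square distribution with 1 degree of freedom as usual.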
31Principal Components / Genetic Matching of Controls
Luca et al. AJHG 2008
32Continuum of Assoc Study Designs
[Figure: continuum of designs, from Population-based through Ethnicity Matched and Genomic-based to Family-based; the concern shifts from Population Stratification to Overmatching (bias versus efficiency)]
- ? Sharing of genes and environment
- Efficiency
- Also, recruitment issues
33Ecologic Studies
- Levels of Measurement
  - Aggregate measures: summaries (e.g., means, proportions) of observations derived from individuals in each group.
  - Environmental measures: physical characteristics of the place in which members of each group live or work (e.g., air pollution level, hours of sunlight).
  - Global measures: attributes of groups, organizations, or places for which there is no distinct analogue at the individual level (e.g., population density, level of social disorganization, existence of a specific law, or type of health-care system).
34Ecologic Studies- Concepts (continued)
- Levels of Analysis
  - The common level for which data on all variables are reduced and analyzed.
  - a. Complete ecologic analysis
[Table: disease (+/-) by exposure (+/-) for the total group; the interior cell counts are unknown (?), and only the margins (T1, T0) and the total T are shown]
35Ecologic Studies- Concepts (continued)
- Levels of Analysis
  - The common level for which data on all variables are reduced and analyzed.
  - a. Complete ecologic analysis
  - b. Partially ecologic analysis
[Table: disease (+/-) by exposure (+/-) within covariate strata Z1 and Z0 and in total; the interior cell counts are unknown (?). Margins as extracted: stratum Z1 has M11, M01, N11, N01 and total T1; stratum Z0 has M11, M01, N10, N00 and total T0; the Total table has T1, T0 and overall total T]
36Ecologic Studies- Concepts (continued)
- Levels of Inference
  - The goal is to make ecologic inferences about effects on group rates (an ecologic effect).
    - e.g., helmet-use laws
    - Ecologic effects vs. biological effects
  - An interest may exist to estimate the contextual effect of an ecologic exposure on individual risk.
    - Commonly found in infectious disease epidemiology
37Ecologic Studies- Study Designs
- Multiple-group Design
  - The rate of disease is compared among many groups during one period of time to search for spatial patterns.
    - Example: NCI cancer study
  - The rate of disease may be compared between migrants and their offspring and residents of the countries of immigration and emigration.
    - Environmental or behavioral risk factors
    - Genetic risk factors
  - Examples: Migrant Study and Multiple-group Analytic Study
38Migrant Studies
Weeks, Population. 1999
39Example: Standardized Mortality Ratios
                             Japanese
Cancer Site      Japan   Not US Born   US Born   US Caucasians
Stomach (M)      100     72            38        17
Colorectal (F)   100     218           209       483
Breast           100     166           136       591
MacMahon B, Pugh TF. Epidemiology. 1970:178.
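The table reports standardized mortality ratios with Japan as the reference (100). As a sketch of how such a ratio is usually obtained, assuming indirect standardization: observed deaths in the study group are divided by the deaths expected if the reference population's age-specific rates applied, then multiplied by 100. The function name, age bands, and numbers are hypothetical and are not taken from the cited source.

```python
def smr(observed_deaths, person_years_by_age, reference_rates_by_age):
    """Standardized mortality ratio (x100) by indirect standardization."""
    expected = sum(person_years_by_age[age] * reference_rates_by_age[age]
                   for age in person_years_by_age)
    return 100 * observed_deaths / expected


# Hypothetical study group and reference rates (deaths per person-year).
person_years = {"40-59": 10000, "60-79": 5000}
reference_rates = {"40-59": 0.001, "60-79": 0.004}
print(smr(15, person_years, reference_rates))
```

An SMR below 100 means fewer deaths were observed than the reference rates would predict, as for stomach cancer among US-born Japanese in the table.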
40Ecologic Studies- Study Designs (continued)
- Time-trend Design
  - One group or population is followed over time to assess a possible association between a change in exposure frequency and a change in disease frequency.
  - Example: NCI study of artificial sweetener consumption and bladder cancer between 1950 and 1969.
- Mixed Design
  - A mixture of the two previous designs. A number of groups or populations are followed over time to assess a possible association between a change in exposure frequency and a change in disease frequency.
  - Example: Change in annual CVD mortality rate for males between 1948 and 1964 in 83 British towns, by age and water hardness level.
41Ecologic Studies- Rationale
- Strengths
- Low cost and convenience
- Measurement limitations of individual-level studies
- Design limitations of individual-level studies
- Interest in ecologic effects
- Simplicity of analysis and presentation
42Ecologic Studies- Rationale (continued)
- Weaknesses
- Ecologic fallacy (or bias)
- Cannot assess confounding or effect modification
- Temporal ambiguity
- Migration across groups
- Collinearity
- Lack of adequate data
43COMMONLY USED DESIGNS FOR SELECTED STUDY OBJECTIVES
Objective of the Study and Nature of the Disease | Commonly Used Study Designs
1. Test / screen new etiologic hypotheses regarding several possible risk factors for one disease | Case-control; Cross-sectional; Ecologic
2. Test / screen new etiologic hypotheses regarding the effects of a specific exposure on several outcomes | Cohort; Ecologic
3. Test or screen new etiologic hypotheses, based on the merging of two or more large data sets, to obtain information on both exposure and disease frequencies | Ecologic
4. Identify risk factors for a disease for which we cannot observe the (base) population at risk | Selective prevalence; Proportional
5. Study the possible genetic etiology of a disease | Family-based
6. Determine whether a disease is likely to have an infectious etiology | Space-time cluster
7. Identify environmental risk factors for a remittent disease, or study the possible mutual effects between two diseases | Repeated follow-up
Hal Morgenstern
44COMMONLY USED DESIGNS FOR SELECTED STUDY OBJECTIVES
Objective of the Study and Nature of the Disease | Commonly Used Study Designs
8. Identify environmental risk factors for a specific rare disease, which might have a long latent period | Retrospective cohort; Case-control
9. Study the relationship between an acute response to an exposure and a chronic health outcome | Intervention follow-up
10. Identify risk factors for a relatively frequent disease with a long duration of expression, which often goes undiagnosed or unreported | Prospective cohort; Cross-sectional
11. Study the possible effect of an exposure on disease occurrence, where exposure status is likely to be influenced by disease status | Prospective cohort; Repeated follow-up
12. Assess the impact of a planned intervention on the health status of a target population | Repeated survey; Ecologic
13. Assess the need for health services and facilities in a target population | Cross-sectional; Survey follow-up; Repeated survey; Ecologic
Hal Morgenstern
45Criteria for Comparing Study Designs
- There are three general criteria for evaluating and comparing different study designs.
- 1. Relevance of the information to the investigator
  - Extent to which expected findings will satisfy the specific objectives of the study
  - Investigator's desire to estimate specific population parameters
- 2. Quality or accuracy of the information expected in the data
  - Ability of the investigator to determine that the exposure preceded disease occurrence
  - Ability of the investigator to eliminate the possibility that the statistical findings were due to various methodological problems or sources of error
- 3. Cost of the information
  - The ultimate worth of a study is the total value of all derived information, now and in the future, relative to the total (direct and indirect) costs of the study