Case-Control and Cohort Studies
Robert Heimer, Yale University School of Public Health
April 2009
Overview for Today
  • For Cohort
  • Definition and description
  • Dynamic, fixed
  • Prospective, retrospective
  • Example -- Nurses' Health Study and breast cancer
  • Conduct
  • Assemble the cohort
  • Classify exposed/non-exposed
  • Follow over time for outcome
  • Analysis
  • Strengths and limitations
  • For Case-Control
  • Definitions
  • Cases
  • Definition, selection
  • Controls
  • Definition, purpose, selection
  • Sampling
  • Types
  • Analysis
  • Strengths and limitations
  • Examples

Main types of epidemiologic studies
Review slide
Epidemiologic research study designs
Experimental studies
Observational studies
Some examples and special types
  • Nested
  • Case-crossover
  • Time series
  • Multi-level
  • Individual-level RCT
  • Community randomized trial
  • Quasi-experimental
  • One-time
  • Panel
  • Retrospective
  • Prospective

Case-Control: A Definition
  • Examines exposure-disease relationship by
    enrolling cases (with disease) and controls
    (without disease) and comparing exposure history
  • Backward design ("flashback")
  • Two groups are enrolled (cases and non-cases) and
    compared with respect to past exposure

Hypothetical Example
  • Research question: Does pesticide exposure increase
    the risk of bladder cancer?
  • Methods: Consider a prospective cohort study, in
    which you enrolled 89,949 individuals aged 34-59
    and followed the cohort for 8 years
  • Outcome: 1,439 bladder cancer cases identified
    over 8 years of follow-up
  • Exposure: Blood drawn and frozen at beginning of
    study can be analyzed for level of pesticides

A Practical Problem
  • Quantifying pesticide levels in the blood is expensive
  • It's not practical to analyze all 89,949 blood samples
  • To be efficient, analyze a select number:
  • all cases (N = 1,439)
  • just take a sample of the cohort participants who
    did not get bladder cancer
  • For example, twice as many controls as cases (N = 2,878)
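
The sampling step above can be sketched in Python. This is only an illustration under stated assumptions: the participant IDs are made up, the cases are picked at random just to have something to sample around, and `sample_controls` is a hypothetical helper, not a function from any epidemiology library.

```python
import random

def sample_controls(cohort_ids, case_ids, ratio=2, seed=0):
    """Draw `ratio` controls per case from cohort members who did not become cases."""
    case_set = set(case_ids)
    non_cases = [pid for pid in cohort_ids if pid not in case_set]
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    return rng.sample(non_cases, ratio * len(case_set))

# Hypothetical cohort: IDs 0..89,948 stand in for the 89,949 participants.
cohort_ids = list(range(89_949))
# Hypothetical case list: 1,439 IDs chosen at random, purely for illustration.
case_ids = random.Random(42).sample(cohort_ids, 1_439)

controls = sample_controls(cohort_ids, case_ids, ratio=2)
assays_needed = len(case_ids) + len(controls)
print(len(controls), assays_needed)  # 2878 4317
```

With these numbers, 4,317 blood samples are analyzed instead of 89,949, which is the efficiency argument made on this slide.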

Case-Control Data
Bladder Cancer
We have just identified cases of disease from a
defined population, and then taken a sample of
that population (at-risk population) for
comparison. Exposure histories are determined for
each group. This is an example of a
case-control study that is nested in a cohort.

A Refined Definition of Case-Control Study
  • A method of sampling a population in which cases
    of disease are identified and enrolled, and a
    sample of the population that produced the cases
    is identified and enrolled; exposures are
    determined for individuals in each group.
  • If done properly, case-control is a method of
    sampling a population so that controls reflect
    the source population that gave rise to cases.

Cases
  • Need a clear case definition that leads to
    an accurate classification of disease
  • Disease definitions can be based on:
  • Signs, symptoms
  • Clinical exam
  • Laboratory test results
  • May come from registries, hospitals/clinics,
    surveillance reports, etc.
  • Cases may be enrolled going forward in time
  • Incident cases are preferred to prevalent cases
    because they are a better measure of risk:
    prevalence also reflects disease duration
  • Cases may be sampled if you think you'll have too
    many, but this is not usually the situation

Cases in Doll and Hill, 1950
  • Twenty London hospitals were asked to refer all
    patients admitted with carcinoma of the lung.
  • Identified by admitting clerk, house physician,
    cancer registrar, radiotherapy department
  • Most diagnoses were made by necropsy, biopsy, or
    exploratory operation; some were made by
    other criteria
  • Diagnoses confirmed by hospital diagnosis on
    discharge
  • As a general rule this was the final diagnosis.
  • Some diagnoses were changed after discharge.
  • Patients were excluded from case population if:
  • Subsequent checking revealed primary carcinoma
    was at another site (e.g. breast, colon)
  • Histologic exam revealed growth was not carcinoma
  • Found not to be malignant disease at all

Controls
  • A sample of the source population (study base)
    that gave rise to the cases
  • Purpose is to estimate exposure distribution in
    the source population that produced the cases
  • Controls provide a fast and inexpensive
    (efficient) means of obtaining the exposure
    experience in the source population
  • Controls are selected:
  • Without outcome of interest
  • Independent of exposure status
  • The "would" criterion
  • If a member of the control group actually had the
    disease under study, WOULD the person end up as a
    case in your study?

Selecting Controls
  • Sources
  • General population
  • Hospital, clinic
  • Other
  • Sampling
  • Risk-set sampling
  • Survivor sampling
  • Case-base sampling

General Population Versus Hospital Controls
  • General Population
  • Often cases are selected from a defined
    geographic population that gave rise to the cases
  • Can use residence lists, driver's license
    records, voter lists, etc.
  • Advantages
  • High likelihood controls are from same study base
    as cases
  • Disadvantages
  • Need an enumerated list
  • Time consuming, expensive
  • Typically have high refusals
  • Hospital Controls
  • Use patients with diseases that have no relation
    to the exposures under study
  • Advantages
  • May have similar selection factors to cases
  • Identifiable and accessible
  • May be more willing to participate than general
    population controls
  • Some potential disadvantages
  • Referral patterns may not be the same, for example
    if a hospital has a world-famous cancer center
  • Controls are sick, so exposure patterns may not
    reflect study base

Other Types of Controls
  • Friends
  • Spouses
  • Siblings/twins
  • (Deceased individuals)
  • The goal with each of these types of controls is to:
  • Measure exposure in the source population
  • Minimize differences between cases and controls

Matching
  • You pair each case with someone who is like the
    case but who does not have the disease or outcome
    under study.
  • You can do FREQUENCY MATCHING, which means
    picking people from the general groups from which
    cases come, so that the overall makeup of the
    two groups is similar.
  • You can connect each case to more than one person
    who resembles that case (R:1 MATCHING)
  • Requires a different analytical approach; most
    common are:
  • Matched OR
  • McNemar chi-square
  • Conditional logistic regression
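
For 1:1 matched pairs, the matched OR and the McNemar chi-square listed above both depend only on the discordant pairs. The Python sketch below shows the arithmetic; the pair counts are hypothetical, the function name is an assumption, and the chi-square is the uncorrected form (no continuity correction).

```python
def matched_pair_analysis(case_only_exposed, control_only_exposed):
    """Matched odds ratio and McNemar chi-square (1 df, uncorrected) for 1:1 pairs.

    case_only_exposed:    pairs where the case was exposed and the control was not
    control_only_exposed: pairs where the control was exposed and the case was not
    Concordant pairs carry no information and are not used.
    """
    if control_only_exposed == 0:
        raise ValueError("matched OR is undefined when no control-only-exposed pairs exist")
    matched_or = case_only_exposed / control_only_exposed
    discordant = case_only_exposed + control_only_exposed
    mcnemar_chi2 = (case_only_exposed - control_only_exposed) ** 2 / discordant
    return matched_or, mcnemar_chi2

# Hypothetical data: 30 pairs with only the case exposed, 10 with only the control exposed.
matched_or, chi2 = matched_pair_analysis(30, 10)
print(matched_or, chi2)  # 3.0 10.0
```

Conditional logistic regression generalizes this to R:1 matching and to adjustment for other variables; it is not sketched here.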

Doll and Hill, 1950
  • Required to make similar inquiries of a group of
    non-cancer control patients
  • For each lung cancer patient, interviewers were
    instructed to interview a patient of same sex,
    within the same five-year age group, and in the
    same hospital at or about the same time
  • Could not always find a suitable control
  • 743 general medical and surgical patients
  • Some differences with regard to place of residence
  • Higher proportion of cases from outside London

Data Collection: Exposure Assessment
  • Once cases and controls are identified and
    enrolled, collect information on the exposure of
    interest and other variables
  • Ideally, use same data collection techniques for
    both cases and controls

Analysis of Case-Control Data: Overview
  • Because controls are a sample of the population
    that produced the cases, you do not know the size
    of the total population
  • Therefore, you cannot get a prevalence,
    cumulative incidence or incidence rate of
    disease
  • Do not have the appropriate denominator for these
    measures
  • Instead, we compute odds and odds ratios that we
    use for estimation of relative risk in special
    circumstances
Doll and Hill, 1950 Table IV
  • Problem with usual interpretation of this table
  • 1298 is not an at-risk population, it is the
    study population
  • 1269 and 29 do not tell us about exposure in the
    source population
  • 647/1269 and 2/29 do not tell us about risk of
    outcome (lung cancer) among exposed and unexposed
    (smokers and non-smokers) in source population
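
Why the margins mislead can be checked numerically. The Python sketch below uses the four cell counts implied by the figures on this slide (647 and 2 cases, 622 and 27 controls, for smokers and non-smokers respectively) and compares them with a hypothetical study that enrolls twice as many controls with the same smoking distribution. The function name and the doubled-control scenario are illustrative assumptions, not part of Doll and Hill's analysis.

```python
def apparent_risk_and_odds_ratio(a, c, b, d):
    """a/c = exposed/unexposed cases, b/d = exposed/unexposed controls."""
    apparent_risk_exposed = a / (a + b)  # the tempting but meaningless "risk"
    odds_ratio = (a * d) / (b * c)
    return apparent_risk_exposed, odds_ratio

# Counts implied by the slide: 647 + 622 = 1269 smokers, 2 + 27 = 29 non-smokers.
risk_1, or_1 = apparent_risk_and_odds_ratio(647, 2, 622, 27)

# Hypothetical: same cases, but twice as many controls sampled.
risk_2, or_2 = apparent_risk_and_odds_ratio(647, 2, 2 * 622, 2 * 27)

print(round(risk_1, 2), round(risk_2, 2))  # 0.51 0.34
print(round(or_1, 1), round(or_2, 1))      # 14.0 14.0
```

The apparent "risk" among smokers moves with the investigator's choice of how many controls to enroll, while the odds ratio does not.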

Need a Different Analytical Approach
  • Use Odds Ratio
  • Delete marginals from the table because they do
    not have a lot of meaning for understanding risk.
  • Odds = probability(event) / probability(non-event)
  • Compares the frequency of occurrence of something
    to the frequency of non-occurrence.
  • Calculation of exposure odds ratio (EOR), where
    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls
  • Odds of exposure vs. no exposure in diseased
    persons = a/c
  • Odds of exposure vs. no exposure in non-diseased
    persons = b/d
  • Odds ratio of exposure for cases compared to
    controls = (a/c) / (b/d) = ad/bc

Exposure Versus Disease Odds Ratio
  • Suppose we want to determine the odds of disease
    rather than the odds of exposure
  • The odds of disease in the exposed = a/b
  • The odds of disease in the unexposed = c/d
  • The odds ratio of disease for exposed compared to
    unexposed = (a/b) / (c/d) = ad/bc
  • This is the same as the exposure odds ratio
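
The equality of the two odds ratios can be verified with exact arithmetic. In this Python sketch the cell labels follow the slides (a and c are exposed and unexposed cases, b and d are exposed and unexposed non-cases); the counts themselves are hypothetical.

```python
from fractions import Fraction

def exposure_odds_ratio(a, b, c, d):
    """Odds of exposure among cases (a/c) over odds of exposure among non-cases (b/d)."""
    return Fraction(a, c) / Fraction(b, d)

def disease_odds_ratio(a, b, c, d):
    """Odds of disease among exposed (a/b) over odds of disease among unexposed (c/d)."""
    return Fraction(a, b) / Fraction(c, d)

# Hypothetical 2x2 table.
a, b, c, d = 40, 20, 10, 30
eor = exposure_odds_ratio(a, b, c, d)
dor = disease_odds_ratio(a, b, c, d)
cross_product = Fraction(a * d, b * c)
print(eor, dor, cross_product)  # 6 6 6
```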

OR Approximates RR
  • when disease is rare
  • Proportion of cases in exposed and unexposed
    groups is low in total (source) population
  • a + b ≈ b and c + d ≈ d
  • RR = [a/(a+b)] / [c/(c+d)] ≈ (a/b) / (c/d) = ad/bc
  • if disease is not rare
  • depends on sampling: when case-base or risk-set
    sampling is used for control selection
  • when cases are newly diagnosed and prevalent
    cases are excluded from control group, and
    selection of cases and controls is not based on
    exposure status
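
A quick numerical check of the rare-disease approximation, as a Python sketch. Both tables are hypothetical full-cohort counts (a, c = exposed and unexposed cases; b, d = exposed and unexposed non-cases) chosen so that the true risk ratio is 2 in each.

```python
def risk_ratio(a, b, c, d):
    """Risk in exposed a/(a+b) divided by risk in unexposed c/(c+d)."""
    return (a / (a + b)) / (c / (c + d))

def odds_ratio(a, b, c, d):
    """Cross-product ratio ad/bc."""
    return (a * d) / (b * c)

# Rare disease: risks of 0.2% and 0.1%.
rare = (20, 9_980, 10, 9_990)
# Common disease: risks of 40% and 20%.
common = (4_000, 6_000, 2_000, 8_000)

for label, table in (("rare", rare), ("common", common)):
    print(label, round(risk_ratio(*table), 3), round(odds_ratio(*table), 3))
# rare 2.0 2.002
# common 2.0 2.667
```

When the disease is rare the OR is almost identical to the RR; when it is common the OR overstates the RR.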

Doll and Hill, 1950 Table IV
  • OR = (647 × 27) / (2 × 622) = 14.0
  • Technically: The odds of smoking among lung
    cancer cases are 14 times the odds of
    smoking among controls (patients without lung cancer).
  • Loosely: People who smoke have a 14-fold
    increased risk of lung cancer compared to people
    who do not smoke.
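
The calculation can be reproduced in a few lines of Python. The confidence interval below is an addition for illustration only: it uses Woolf's logit method, which is not mentioned in the slides, and it is a rough approximation here because one cell contains only 2 people.

```python
import math

def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio ad/bc with an approximate 95% CI (Woolf's logit method).

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Cell counts from the OR shown on the slide: smokers 647 (cases) and 622 (controls),
# non-smokers 2 (cases) and 27 (controls).
or_est, ci_low, ci_high = odds_ratio_with_ci(647, 622, 2, 27)
print(round(or_est, 1))  # 14.0
```

The interval is wide because the standard error is dominated by the 1/2 term from the two non-smoking cases.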

Advantages of Case-Control Method
  • Useful when exposure data are expensive or
    difficult to obtain
  • Pesticide and bladder cancer example
  • Useful when little is known about disease
  • Vaginal cancer in women
  • Useful when disease has long induction or latent period
  • Lung cancer
  • Many special cases
  • Outbreaks, vaccine effectiveness, etc.
  • A major advantage is efficiency (time and money)

Disadvantages of Case-Control Method
  • Retrospective nature makes them prone to many biases
  • Information bias
  • Recall bias, if cases and controls report
    exposures differently because of their
    case/control status
  • Selection bias
  • If selection of control group is not
    representative of source population that gave
    rise to cases with respect to exposure

Rare Vaginal Cancer in Young Women
Herbst et al. NEJM 1971;284(15):878-881
  • Background
  • Initial clinical observation of 7 women, ages
    15-22, with adenocarcinoma of the vagina
  • Never before seen at that hospital
  • Study design
  • We then decided to conduct a case-control,
    retrospective study that would compare in detail
    these patients and their families with an
    appropriate control group to uncover factors that
    might be associated with the sudden appearance of
    these tumors.
  • Cases
  • 8 women diagnosed with clear-cell or
    endometrial type adenocarcinoma of the vagina
    between 1966 and 1969 at Boston hospitals

Controls and Exposure Assessment
  • Controls
  • 4 matched controls per case
  • Using persons born at the same hospital as the
    case and within 5 days and on the same type of
    service (ward or private)
  • Exposure assessment
  • Reproductive and other factors

Major factor: Ingestion of estrogen
(diethylstilbestrol, or DES) during the first
trimester of pregnancy by 7 of 8 mothers of
affected women and by 0 of 32 mothers of control
women (p < .001) (OR?)
Conclusion: DES is a risk factor for subsequent
adenocarcinoma of the vagina in offspring.
Implications: It is unwise to administer estrogen
to women early in pregnancy. Adolescent women with
abnormal bleeding should be examined for vaginal
tumors.
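
The "(OR?)" prompt has a catch: with 0 exposed controls, the cross-product ratio divides by zero. One common workaround, shown here as a Python sketch and not taken from the slides or from Herbst et al., is to add 0.5 to every cell (the Haldane-Anscombe correction). The calculation also ignores the 4:1 matching, so it is illustrative only.

```python
def odds_ratio_haldane(a, b, c, d):
    """Odds ratio after adding 0.5 to each cell, usable when a cell is zero.

    a = exposed cases, b = exposed controls,
    c = unexposed cases, d = unexposed controls.
    """
    a, b, c, d = (x + 0.5 for x in (a, b, c, d))
    return (a * d) / (b * c)

# From the slide: 7 of 8 case mothers and 0 of 32 control mothers took DES.
exposed_cases, unexposed_cases = 7, 1
exposed_controls, unexposed_controls = 0, 32

corrected_or = odds_ratio_haldane(
    exposed_cases, exposed_controls, unexposed_cases, unexposed_controls
)
print(corrected_or)  # 325.0
```

The exact value depends entirely on the correction chosen; the substantive point is that the uncorrected OR is unbounded and the association is very strong.
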
Summary Points on Case-Control
  • Case-control studies are useful in many
    situations for epidemiologic research because of
    their efficiency
  • Selection of control population is often
  • With appropriate methodologies and mindfulness
    toward common biases, they can produce valid and
    important results.

Cohort Studies
Definition of a Cohort
  • Cohort
  • A group of persons followed over time
  • Cohort study
  • A study in which two or more groups of people
    that are free of disease and that differ
    according to exposure level(s) are followed over
    time and compared with respect to disease
    incidence to assess the association between
    exposure and disease
  • Also called prospective, follow-up, or
    longitudinal studies
  • May be considered a natural experiment
  • People are exposed to substances and risk
    behaviors all the time, either on purpose or not;
    these can be studied as exposures
  • Interventions may be considered a subset of
    cohort studies

Some Characteristics of Cohort Studies
  • May be open (dynamic), fixed, or closed
  • Dynamic: members can enter and leave during
    follow-up time
  • Residents of Kazan
  • Fixed: membership is fixed (permanent), but
    members can exit the cohort
  • People present in lower Manhattan 9/11/2001
  • Women who have given birth
  • Closed: members cannot enter after start of
    study and nobody is lost to follow-up; defined
    start and end time
  • Attendees of church supper
  • Everybody has same follow-up time
  • May be prospective or retrospective
  • Depends on temporal relationship between
    initiation of study and occurrence of disease
  • Calendar time vs. follow-up time

Dynamic, Fixed, and Closed Cohort
  • Closed cohort is like a fixed cohort, but all
    members have the same exposure time

Timing of Cohort Studies
  • Prospective
  • Exposure has occurred but disease has not
    occurred at start of study
  • Exposure ---------------> Disease
  • Study starts here (Calendar time and follow-up
    time are concurrent)
  • Retrospective
  • Both exposure and disease have occurred at start
    of study
  • Exposure ---------------> Disease
  • Study starts here (Calendar time and follow-up
    time NOT concurrent)

  • Retrospective
  • Cheaper, faster
  • Efficient with diseases with long latent period
  • Exposure data and other information may be
    limited or missing
  • Prospective
  • More expensive, time consuming
  • Not efficient for diseases with long latent periods
  • Better exposure and confounder data (planned)
  • Enhanced follow-up
  • Less vulnerable to some bias
  • How to choose?
  • Necessity (logistics: time, money)
  • Research question (science: available data)

Design Overview
  • Identify and assemble a group of individuals
    without disease of interest
  • Classify with respect to exposure status at start
    of study
  • Monitor subsequent development of disease in
    exposed and non-exposed subjects over time
  • Analysis

Example: Nurses' Health Study
  • Background and Purpose
  • Based at Harvard Medical School and School of
    Public Health
  • Originally conceived as a study to examine the
    association between oral contraceptive use
    (widespread 1960s and 1970s) and breast cancer
  • In part, due to conflicting previous studies and
    concerns of limitations of case-control approach
    for this association
  • Cohort
  • Enrolled 120,000 married female nurses age 30-55
    registered in one of 11 states in 1976
  • Identified by American Nurses Association and
    state boards of nursing
  • Initial baseline mail survey collected
    information about demographic, reproductive,
    medical, and life-style variables

Assembling a Prospective Cohort
  • Exclude those with disease or not at risk
  • Depending on research question and feasibility
  • General cohorts and special cohorts
  • Internal and external comparison groups

General Cohorts
  • Select a group of individuals from general
    population
  • Geographically defined areas
  • Well-identified groups (e.g., NHS)
  • Others
  • Not chosen for exposure status but rather for
    feasibility and logistical reasons that make the
    study possible
  • Nurses' Health Study
  • Believed that nurses could be interested in
    participating in a health study
  • Believed that nurses could answer survey
    questions correctly
  • Often useful for common exposures
  • NHS -- OC use: 42% past users, 6% current users
  • Often used for multiple exposures and multiple
    outcomes

Special Cohorts
  • Groups with a particular health status or other
    special characteristic
  • For example: repeated x-rays, living near a toxic
    waste dump site, present at event such as
  • Useful for occupational settings
  • Often have unusual exposures
  • Danish workers exposed to trichloroethylene
  • Often useful for rare exposures
  • Allows accrual of sufficient exposed individuals
    because of targeted recruitment

Comparison Groups (Exposure)
  • Principle: You want the comparison (unexposed)
    group to be as similar as possible to the exposed
    group with respect to all other factors except
    the exposure. If the exposure has no effect on
    disease occurrence, then the rate of disease in
    the exposed and comparison groups will be the
    same.
  • Counterfactual ideal: The ideal comparison group
    consists of exactly the same individuals in the
    exposed group had they not been exposed. Since it
    is impossible for the same person to be exposed
    and unexposed simultaneously, epidemiologists
    must select different sets of people who are as
    similar as possible.

Classifying Exposure
  • Define and measure
  • Need a way to handle exposures that may change
    over time
  • NHS and OC use
  • Current and past (ever), never; current defined
    as use within the past 2 years
  • Total duration, duration prior to first
    pregnancy, duration prior to age 25
  • May want to consider multiple levels of exposure

Internal Comparison Group
  • Unexposed members of the same cohort
  • Single cohort in which individuals are classified
    into exposure categories
  • NHS: nurses were enrolled, surveyed about risks
    and classified as exposed or unexposed
  • Usually preferred because of higher likelihood of
    similarity between exposed and unexposed
  • Selected into cohort in same way
  • Because you are a nurse, because you live in a
  • Measurement and follow-up of disease done in the
    same way

External Comparison Group
  • A different cohort, from another similar
    population, that is not exposed
  • Different cohort: for example, same type of work
    at different organization
  • General population
  • Useful when a special exposure group is used
    and/or when entire cohort is exposed
  • Inclusion in the cohort meant exposed so had to
    go elsewhere for comparison

Following the Cohort
  • A major challenge, a major expense, a major
    potential threat to validity
  • Method
  • Passive or active (NHS)
  • Length of time
  • Depends on outcome and sample size
  • For chronic diseases, follow-up will need to be
    years or decades
  • Data collection
  • For exposure (updated and new) and outcome
    (multiple) information
  • Pre-existing records (medical, employment),
    questionnaires, physical exams, medical tests,
    laboratory assays, external sources (e.g. cancer
    registries, vital statistics such as the US
    National Death Index)

Important to Minimize Losses to Follow-up
  • For reasons of sample size and bias
  • Depends on population, duration, etc.
  • Methods
  • Collecting sufficient locating information
  • Maintaining regular contact with participants
  • Using multiple methods (phone, mail, internet,
    postal office databases, disease registries,
    vital statistics, physicians)
  • Make participation worthwhile for participants

Nurses' Health Study
  • Follow-up
  • Every 2 years complete another mail survey that
    collects information about development of
    outcomes, updated exposures, new exposures
  • Biologic specimens also collected
  • Self-reported outcomes confirmed by medical record
    reviews, pathology reports, review of National
    Death Index
  • Promoting follow-up
  • Follow-up surveys included a newsletter
  • 2-4 follow-up mailings; 5th mailing was an
    abbreviated survey
  • Added a telephone follow-up
  • Use of certified mail
  • Follow-up through state boards of nursing
  • >90% follow-up at each cycle

Other Advantage to Cohorts
  • Nurses' Health Study has continued and expanded to
    include examination of associations between:
  • Exposures: diet, physical activity, obesity,
    post-menopausal hormone use, reproductive
    factors, smoking, alcohol, coffee, hair dyes
  • Outcomes: cancer, CVD, diabetes, mortality,
    osteoporosis, to name just a few
  • >800 publications to date

Analysis
  • Basic analysis involves calculation and
    comparison of incidence of disease among exposed
    and unexposed
  • Depending on available data, you can calculate
    cumulative incidence or incidence rates and
    corresponding ratios
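
As a minimal Python sketch of these measures: cumulative incidence uses the number of people at risk as the denominator, while the incidence rate uses person-time. All counts below are hypothetical and the function names are assumptions made for illustration.

```python
def cumulative_incidence(new_cases, persons_at_risk):
    """Proportion of an initially disease-free group that develops disease."""
    return new_cases / persons_at_risk

def incidence_rate(new_cases, person_years):
    """New cases per person-year of follow-up."""
    return new_cases / person_years

# Hypothetical cohort: 1,000 exposed and 1,000 unexposed people followed up to 10 years.
exposed = {"cases": 30, "persons": 1_000, "person_years": 9_500}
unexposed = {"cases": 15, "persons": 1_000, "person_years": 9_800}

risk_ratio = (cumulative_incidence(exposed["cases"], exposed["persons"])
              / cumulative_incidence(unexposed["cases"], unexposed["persons"]))
rate_ratio = (incidence_rate(exposed["cases"], exposed["person_years"])
              / incidence_rate(unexposed["cases"], unexposed["person_years"]))

print(round(risk_ratio, 2), round(rate_ratio, 2))  # 2.0 2.06
```

The two ratios differ slightly because the exposed group contributes less person-time; in a dynamic cohort with unequal follow-up, only the rate-based measures are appropriate.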

OC use and breast cancer in NHS
Romieu et al. JNCI 1989;81:1313-1321.
  • Background
  • Conflicting evidence regarding OC use and breast cancer
  • OC use is widespread, therefore possible large
    impact on public health
  • Cohort
  • 118,273 women who did not report a diagnosis of
    cancer (other than non-melanoma skin cancer) on
    1976 questionnaire
  • Exposure
  • Self-reported OC use on each questionnaire (every
    2 years)
  • Classified as current, past, never; time since
    first use, time since last use, duration of use,
    use before first pregnancy

Conclusions from NHS on Risk of Breast Cancer
following Oral Contraception Use
  • Overall, past use not associated with breast
    cancer (RR = 1.06)
  • Slightly increased risk for current users (RR = 1.56)
  • Number of women who used for long duration early
    in reproductive life was too small for meaningful
    analysis
Advantages of Cohort Studies
  • Known temporal association: exposure → outcome
  • Preferred for causal inference
  • Can evaluate multiple outcomes for a given
    exposure
  • NHS examined relationship between OC use and
    breast cancer, ovarian cancer, malignant melanoma
    and myocardial infarction
  • Less prone to certain types of bias
  • Recall bias: outcome does not influence recall of
    past exposures (pre-classified)
  • Selection bias: disease status does not influence
    selection of subjects with respect to exposure
  • Can estimate risk and incidence (person-time)

Disadvantages of Cohort Studies
  • Expensive
  • Time-consuming
  • Loss to follow-up bias
  • If those lost are different in ways related to
    exposure and outcome
  • Inefficient for rare diseases
  • unless AR is high
  • retrospective cohort can then be established
  • Inefficient for diseases with long induction or
    latent period (unless retrospective cohort)

Summary Points on Cohort Studies
  • Cohort studies are generally considered the
    strongest of the observational designs
  • They are often the most expensive and time
    consuming
  • Vulnerable to a different set of limitations than
    ecological, cross-sectional, and case-control
    studies
