LECTURE 3 June 8 2007 Cohort Studies, Selection Bias Survival analysis

LECTURE 3 June 8 2007 Cohort Studies,
Selection Bias, Survival analysis
  • Dr. Dick Menzies

Cohort Studies: General
  • Prospective study: incidence of new disease in
    persons who start without disease.
  • Follow-up period: weeks, months, years
  • One or more diseases can be measured
  • Measure exposures at start or ongoing.
  • Can measure multiple exposures
  • Compare incidence in exposed vs unexposed groups
    within population per unit of time

Advantages of cohort over case-control or
cross-sectional designs
  • KEY: exposure measurement is made before disease
  • Exposure more accurate prospective, and often
  • Eliminates bias in measurement of exposures
  • Recall bias of patients, or observer bias in
    exposure assessment - with knowledge of disease

Experimental vs cohort studies
  • Experimental studies - a form of cohort study
  • SAME - Persons are free of disease at outset
  • DIFFERENCE - Exposure is ASSIGNED (given) to
    some/not others
  • SAME - Measure outcomes after exposure
  • Cohort study: observational study (natural
  • Exposures NOT assigned,
  • Occur naturally, or are chosen by subjects, or by
    their MDs, etc

Advantages of cohort studies over experimental
  • Ideal to study natural history, course of
    disease, prognostic factors.
  • Etiologic research for exposures that can not be
    given experimentally, for ethical reasons
  • Smoking, asbestos, air pollution
  • Interventions not feasible for randomization
  • Diagnostic tests, complex care management
  • Some outcomes not well measured in trials
  • Compliance by patients and MDs,

Advantages of cohort studies over experimental
  • Total population studied.
  • Children, elderly, pregnancy, mentally
  • Full spectrum of illness
  • From patients in ICU to minimal forms of disease
  • Often excluded in RCT esp Pharma trials
  • Findings more likely to be applicable in real
  • Adverse events often more accurately measured
  • Population based estimates of exposure effects
  • BUT you MUST include as full a spectrum of
    patients as possible (no exclusions in
    observational studies)

  • Selection bias: persons who get exposed are not
    the same as unexposed
  • Surgery: who is operable vs inoperable
  • Exposures that seem same, are not
  • Potential bias in measuring
  • Drop-outs reduce power, may bias (a lot)
  • Outcome assessment can be biased

Cohort Designs: Prospective
  • Subjects without disease at onset
  • Followed to determine incidence of diseases
  • Exposures measured at baseline, and/or
  • Disease occurrence after onset
  • Disease measured directly during follow-up

Cohort Designs: Retrospective (Historical)
  • A group of subjects are identified because of
    known past exposures.
  • Diseases may be ascertained directly, or may also
    have already occurred.
  • KEY: exposure must have been well defined, and
    was measured or documented clearly.
  • AND exposure occurred well before disease.
  • Example: Hiroshima survivors and leukemia.
  • Or, asbestos workers in WW II, and mesothelioma

Cohort Populations
  • General populations: no special exposures
  • Framingham study: a true general population
  • All persons in the community invited
  • Proxy general Popn - Nurses, Military, Company
  • Exposures studied are those of general popn.
  • Diet, exercise, smoking, alcohol
  • Exposure defined cohort
  • Work-force to study occupational exposures
  • Group of patients who received certain therapy

Cohorts of patients
  • Clinical cohorts: patients with a given
  • Case series can be a form of cohort study
  • But must have differences in exposure
  • Different types, severity, causes
  • Potential problems in cohort studies with
  • Referral bias: only sickest, rarest,
  • Lead-time bias: better facilities, earlier Dx
  • Multi-serial cohorts
  • Cohort starts with all diabetics in 2004
  • New, and old very different patients

Open versus Closed Cohorts
  • An open cohort or dynamic cohort - is one where
    people can enter or leave
  • Examples: a workforce study that is ongoing
  • A city or other geographic location
  • A closed cohort is where all persons in the
    cohort are defined at entry. No one enters,
    members can only exit.
  • Eg. McGill medical school class of 2004

Selection Bias
  • Definition: selection bias occurs when there is
    a distortion in the estimate of effect
    (association) because the study or sample
    population is not truly representative of the
    underlying population in terms of the
    distribution of exposures and/or outcomes.
  • Other terms: referral bias, volunteer bias,
    healthy worker effect, susceptibility bias,
    drop-out bias
  • How/where in a study can this occur?

Figure 15-2. Diagram showing successive transfers
from the intended population to the group
admitted to a study of therapy
Obtaining a representative sample
  • In a representative sample we hope for a sample
    that shows us the true underlying distribution of
    exposure and disease
  • Odds Ratio = (A/B) / (C/D) = (A x D) / (B x C)

Getting a representative sample
  • If we had a 10% sample of all with disease
  • Simplest conceptually is to have same sample
  • 10% of Diseased WITH Exposure
  • 10% of Diseased WITHOUT Exposure
  • Then if 10% sample of all controls
  • Need 10% of with and 10% of without exposure
  • But if we had a 1% sample of all controls
  • We would want the same proportions
  • 1% of Healthy WITH Exposure
  • 1% of Healthy WITHOUT Exposure

Getting a representative sample
  • But does not have to be equal in all groups
  • KEY is that it must be equal ratios
  • If sample 10% of Diseased WITH Exposure
  • And 5% of Diseased WITHOUT Exposure
  • Then for controls - want the same proportions
  • If sample 1% of Healthy WITH Exposure
  • Then 0.5% of Healthy WITHOUT Exposure

Un-biased Sampling
  • Odds Ratio = [(P1 x P4) / (P2 x P3)] x [(A x D) / (B x C)]
  • IF (P1 x P4) / (P2 x P3) = 1, THEN OR = [(A x D) / (B x C)] x 1

But, you do not know these proportions in advance
  • Right, so you have to design to avoid selection
  • Do not state the study Hypothesis
  • Try to minimize refusals and drop-outs
  • Find ways to assess completeness of study group
  • Did people leave the group already?
  • What are the pre-requisites to be in the study

Example: biased sampling
  • You are studying Lupus and metal-working
  • Your hypothesis: Lupus is caused by exposure to
    endotoxins in metal-working fluids.
  • Design: historical cohort
  • Cohort: all workers in 3 plants, whether or not
    they have ever worked with these fluids.
  • Exposure: bacteria in fluid have been measured
    every 6 months for past 10 years.
  • Title of your survey: A study of Lupus and
    endotoxin in metal-working fluids
  • WHO volunteers to take part? How will that affect
    your results?

Biased Sampling
  • If 2/3 of A (P1 = 0.66) volunteer, but only 1/3 of B (P2 = 0.33)
  • And 1/3 of C and D (P3 = 0.33, P4 = 0.33)
  • Odds Ratio = [(P1 x P4) / (P2 x P3)] x [(A x D) / (B x C)]
    = [(0.66 x 0.33) / (0.33 x 0.33)] x [(A x D) / (B x C)]
    = 2 x [(A x D) / (B x C)]
  • IF (P1 x P4) / (P2 x P3) = 2, THEN OR estimated = 2 x OR true
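
The effect of unequal participation on the odds ratio can be checked numerically. The Python sketch below assumes a 2x2 table with cells A to D as used in the odds ratio formula and one sampling fraction per cell (P1 to P4). The function name and the source-population counts are hypothetical and chosen only for illustration.

```python
def observed_odds_ratio(a, b, c, d, p1, p2, p3, p4):
    """Odds ratio seen in a sample where cell A is sampled with
    fraction p1, cell B with p2, cell C with p3 and cell D with p4."""
    return (p1 * a * p4 * d) / (p2 * b * p3 * c)


# Hypothetical source-population counts (not from the lecture).
a, b, c, d = 30, 20, 100, 100
true_or = (a * d) / (b * c)

# Equal sampling fractions in every cell: the bias factor is 1.
unbiased_or = observed_odds_ratio(a, b, c, d, 0.5, 0.5, 0.5, 0.5)

# 2/3 of cell A volunteer, but only 1/3 of cells B, C and D.
biased_or = observed_odds_ratio(a, b, c, d, 2 / 3, 1 / 3, 1 / 3, 1 / 3)

print(f"true OR     = {true_or:.2f}")
print(f"unbiased OR = {unbiased_or:.2f}")
print(f"biased OR   = {biased_or:.2f}")
```

With these fractions the biased estimate is twice the true odds ratio, whatever counts are used, because only the ratio (P1 x P4) / (P2 x P3) matters.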

Example 2: Biased sampling
  • We are planning a case control study of spicy
    foods and peptic ulcer disease
  • Cases: endoscopy-proven peptic ulcer disease
  • Controls: elective inguinal hernia repair at the
    same hospital
  • The truth: no relationship, i.e. the odds ratio
    is 1.0
  • The problem: physicians at this hospital strongly
    believe spicy food is an important risk factor
    for peptic ulcer disease.
  • Therefore they tend to refer patients for
    endoscopy more often if they had a diet of spicy
    foods

Example: biased sampling
  • So, 100% of patients with ulcers AND spicy foods
    have endoscopy
  • But, only 50% of patients with ulcers, WITHOUT
    spicy food are endoscoped - so they are missed
  • Estimated association = 2.0 (not 1.0)
  • This is NOT subject bias, it is MDs who
    introduce a diagnostic bias.
  • This is still a form of selection bias

Biased sampling (contd)

Volunteer Bias (1)
  • Participants in a study are different from
    non-participants
  • Mortality of non-participants in almost all
    large-scale cohort studies is significantly
    higher
  • Typically 20-30% higher
  • Often seen within 1-2 years after study starts
  • WHY?

Volunteer Bias (2)
  • Subjects with exposure and the outcome are more
    (or less) likely to participate
  • Eg HIV infection and homosexuality in Africa
  • What direction?
  • Disease and self-reported occupational exposures
  • What direction?
  • Compensable illnesses and occupational exposures
  • What direction?

Susceptibility bias
  • Persons who self-select to certain exposures are
    more, or less susceptible
  • Bus-drivers in Montreal vs Insulation workers
  • Who had worse lung function?
  • Lab animal workers
  • How many leave in first 6 months?
  • Persons allocated to one form of treatment,
    develop health outcomes of interest.
  • Eg Cancer patients surgery vs medical or
    radiotherapy only. Surgical patients do better.
  • WHY?

Healthy worker effect
  • An important bias found in work-force studies
  • Reflects medical screening (military, mining)
  • Or, physical requirements of job
  • Health status better initially
  • Strongly affects results in cross-sectional
  • Reduces risk or delays occurrence of health
    outcomes of interest.
  • Healthy smoker effect
  • Lung function in adolescents: smokers > non-smokers

Example of healthy smoker effect
Selection Bias in Cohort Studies: Dropouts
  • Losses to follow up occur in all cohort studies
  • Reduce power, and dilute results
  • Problematic if more drop-outs in one exposure
  • REALLY important if more drop-outs in one cell
  • Eg. Group with exposure who develop disease

Selection Bias in Cohort Studies: Unequal
  • Study of incidence of diabetes in obese persons.
  • Truth: IRR = 3.0
  • Drop-outs: 33% in obese persons who developed
    diabetes (death/other)
  • Drop-outs: 5% losses in all other groups

Example: No Dropouts
  • Incidence (no drop-outs)
  • In obese: 27/227 = 12%
  • In non-obese: 33/773 = 4%
  • Unbiased incidence rate ratio = 12% / 4% = 3.0

Example: Unequal Dropouts
  • (P1 x P4) / (P2 x P3) does not equal 1
  • Incidence (biased)
  • In obese: 18/208 = 8.7%
  • In non-obese: 30/735 = 4.1%
  • Biased incidence rate ratio = 8.7% / 4.1% = 2.1
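
The two dropout scenarios can be recomputed directly from the counts on the slides. This is a minimal Python sketch; the variable and function names are illustrative only. Note that the slides round the incidences to 12% and 4% before dividing, so the unrounded ratio without dropouts comes out at about 2.8 rather than 3.0.

```python
def risk(events, n):
    """Proportion of a group that developed the outcome."""
    return events / n


# Counts from the slides: no dropouts.
full_obese = risk(27, 227)
full_non_obese = risk(33, 773)
full_ratio = full_obese / full_non_obese

# Counts from the slides: unequal dropouts (33% lost among obese
# persons who developed diabetes, 5% lost in all other groups).
drop_obese = risk(18, 208)
drop_non_obese = risk(30, 735)
drop_ratio = drop_obese / drop_non_obese

print(f"no dropouts:      {full_obese:.1%} vs {full_non_obese:.1%}, ratio {full_ratio:.2f}")
print(f"unequal dropouts: {drop_obese:.1%} vs {drop_non_obese:.1%}, ratio {drop_ratio:.2f}")
```

The selective loss of obese persons with diabetes pulls the estimated ratio down toward 2.1.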

Drop-outs from a work-force - impact
  • An occupational exposure causes health effects
    quickly in a susceptible sub-group.
  • They leave the work-force (quit) quickly.
  • Examples
  • Allergy to lab animals in researchers
  • Asthma in Grain workers
  • Cross-sectional studies: no susceptibles left
  • Cohorts: can miss when setting up cohort.
  • Outcomes occur in new workers, then no more
    (power problem)

Controlling Selection Bias: Design
  • Design is most important
  • Recruitment high in all groups (80% rule)
  • Same recruitment in exposed/not exposed
  • Close follow-up to prevent dropouts
  • Assess potential bias of study group
  • Requirements for members

Assessing Selection Bias - analysis
  • Compare participants to non-participants
  • Sub-groups of non-participants
  • Compare dropouts with those who remained
  • Initial/baseline characteristics
  • Exposures
  • Sensitivity analysis: best case / worst case to
    assess impact of selection biases
  • Commonly used for drop-outs
  • What if? All died / all survived, etc.
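
A best case / worst case analysis for drop-outs can be sketched in Python as follows. The counts are taken from the obesity and diabetes dropout example in this lecture (227 and 773 enrolled, 208 and 735 still followed, 18 and 30 observed events). The function name and the two extreme assumptions (no drop-out developed the outcome, or every drop-out did) are illustrative choices, not a method prescribed by the slides.

```python
def risk_bounds(events, followed, enrolled):
    """Lowest and highest possible risk given the number lost to follow-up."""
    lost = enrolled - followed
    lowest = events / enrolled            # no drop-out had the outcome
    highest = (events + lost) / enrolled  # every drop-out had the outcome
    return lowest, highest


obese_low, obese_high = risk_bounds(events=18, followed=208, enrolled=227)
non_low, non_high = risk_bounds(events=30, followed=735, enrolled=773)

# Extreme risk ratios: pair the extremes that push the ratio furthest.
ratio_min = obese_low / non_high
ratio_max = obese_high / non_low

print(f"risk in obese:     {obese_low:.1%} to {obese_high:.1%}")
print(f"risk in non-obese: {non_low:.1%} to {non_high:.1%}")
print(f"risk ratio range:  {ratio_min:.2f} to {ratio_max:.2f}")
```

A wide range such as this one signals that the drop-outs alone could explain, or hide, the observed association.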

Cohort Studies: Exposure Assessments
  • Prospective - Measure one or more exposures at
  • Specific: cholesterol, obesity, smoking, blood
  • Proxies: occupation, housing
  • Measure once, or repeatedly to account for
    changes in exposure over time (obesity, smoking,
  • Retrospective
  • Exposure based upon past events
  • Sometimes recorded (transfusions, dust levels)
  • Usually not quantified
  • Proxies used (job description, distance from

Pitfalls in exposure assessments
  • Observer bias if disease ascertained at same time
  • Blind observers to study hypothesis
  • Standardized protocols
  • Are all exposures the same?
  • Complications: pleural tap >> thoracoscopy. WHY?
  • You can't think of everything
  • Hard to go back to the start of cohort
  • Freeze some samples
  • New things reported can be very valuable
  • Nairobi sex workers in 1980s
  • Meta-pneumovirus

Cohort Studies: Outcome Assessments
  • Baseline: ensure cohort members are free of disease.
  • Easy if prospective, harder if retrospective
    (except cancers)
  • Outcomes measured periodically
  • Through questionnaire, exam, labs (direct)
  • Through health service utilization (databases)
  • Through vital statistics (databases)
  • Case definition: key for outcome assessments
  • Diagnosis of milder disease: common problem

Pitfalls in outcome assessments
  • Ascertainment bias: patients with Factor X are more
    likely to have testing, so disease is detected.
  • Standardized protocols, blinding to exposures
  • But diagnosis can happen outside of study
  • Observer bias: patients with Factor X are more
    likely to be diagnosed with outcome of interest
  • Problem with subjective tests, e.g. CXR
  • Solution: independent blinded reviewers
  • Lead-time bias: earlier diagnosis makes survival
    look better

Lead-time bias - example
Cohort Studies: Measures of Incidence
  • Incidence rate (simplest) =
    (number developing disease) /
    (total number who entered cohort),
    per unit of time
  • Cumulative incidence =
    (number developing disease) /
    (total number who entered cohort),
    over total follow-up period

Measuring Incidence in Cohort Studies: How to
handle drop-outs etc.?
  • Drop-outs, loss to follow-up, death from other causes
  • Up to 50% in long term cohorts
  • Simple incidence measures exclude these
  • Need to allow variable length of follow up
  • And count people who enter after the first year

Incidence Density (ID)
  • Counts person-time (person-years/months)
  • Starts count when person enters cohort
  • Each year of follow-up added up

ID in Exposed: 1 event in 12 person-years
ID in Unexposed: 1 event in 18 person-years

Cohort studies: Measure of Association - Risk
Ratios, or Incidence rate ratios
  • Summary measure of association in Cohort Studies
  • Formula for Incidence rate ratio (IRR):
    IRR = (incidence of disease in persons with exposure) /
          (incidence of disease in persons without exposure)
        = (Ndisease/Nexposed per unit time) /
          (Ndisease/Nunexposed per unit time)
  • Note: IRR has no unit of time.
  • Assumes that time was similar
  • For diseased / disease-free
  • For exposed / unexposed

Calculation of Risk Ratio - example
  • Cohort at inception: 1,000 people without
  • Prevalence of obesity at inception: 22.7%
  • Outcome: incidence of diabetes in a population
  • Exposure: obesity at inception of cohort
  • Follow-up: six years
  • Overall incidence of diabetes: 1% per year
  • Risk = cumulative incidence = 6%

Risk Ratio Calculation - Example
Ratio of incidence (risk ratio) = (27/227) /
(33/773) = 12% / 4% = 3.0
Incidence Density Ratio
  • Incidence rate ratio = (1/2) / (1/2) = 1
  • Density method, exposed: (0 events / 2 years) and (1 event / 10 years)
  • Density method, unexposed: (0 events / 8 years) and (1 event / 10 years)
  • Incidence density ratio = (1/12) / (1/18) = 1.5
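
The person-time calculation can be reproduced with a few lines of Python. This sketch assumes two subjects per group, each recorded as (years of follow-up, whether the event occurred), which is one reading of the figures on the slide. The function and variable names are hypothetical.

```python
def incidence_density(follow_up):
    """Events divided by total person-years of follow-up."""
    person_years = sum(years for years, _ in follow_up)
    events = sum(1 for _, had_event in follow_up if had_event)
    return events / person_years


def simple_incidence(follow_up):
    """Events divided by number of persons, ignoring follow-up time."""
    events = sum(1 for _, had_event in follow_up if had_event)
    return events / len(follow_up)


# (years followed, event occurred) for each subject.
exposed = [(2, False), (10, True)]
unexposed = [(8, False), (10, True)]

simple_ratio = simple_incidence(exposed) / simple_incidence(unexposed)
density_ratio = incidence_density(exposed) / incidence_density(unexposed)

print(f"simple incidence ratio:  {simple_ratio:.2f}")
print(f"incidence density ratio: {density_ratio:.2f}")
```

The simple ratio treats both groups as identical, while the density ratio reflects that the exposed group accumulated its one event over less person-time.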

Incidence Rate Difference
  • A patient asks: How much will my risk of heart
    attack go down if I take this new drug (B),
    instead of old one (A)?
  • Answer using incidence rate difference
  • Incidence with Drug A - Incidence with Drug B
  • 0.5/year - 0.3/year = 0.2/year, or, a
    40% reduction
  • Same answer using incidence rate ratio
  • (Incidence with Drug B) / (Incidence with Drug A)
    = 0.3 / 0.5 = 0.6, or, a 40% reduction
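
The two ways of answering the patient's question can be written out as a short Python calculation, using the rates from the slide. The variable names are illustrative only.

```python
rate_drug_a = 0.5  # events per year on the old drug (A)
rate_drug_b = 0.3  # events per year on the new drug (B)

# Absolute effect: how many fewer events per year.
rate_difference = rate_drug_a - rate_drug_b

# Relative effect: the rate on B as a fraction of the rate on A.
rate_ratio = rate_drug_b / rate_drug_a

# Both routes lead to the same relative reduction.
reduction_from_difference = rate_difference / rate_drug_a
reduction_from_ratio = 1 - rate_ratio

print(f"rate difference:    {rate_difference:.1f} per year")
print(f"rate ratio:         {rate_ratio:.1f}")
print(f"relative reduction: {reduction_from_difference:.0%}")
```

The difference keeps the unit of time (0.2 per year), whereas the ratio and the relative reduction are unitless.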

Attributable risk
  • How many lung cancers are due to air pollution
    in Montreal? Same as: What is attributable
  • Attributable risk = IRR x prevalence of exposure
  • Increases with higher IRR
  • Or if exposure more common
  • Diabetes vs Silicosis and TB
  • Diabetes: IRR 3.5, prevalence 3%
  • Silicosis: IRR 12, prevalence 0.1%
  • Attributable risk for Diabetes >> than for Silicosis
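
The diabetes versus silicosis comparison can be checked numerically. The Python sketch below computes the simple product of IRR and prevalence given on the slide, and also, as an assumption not stated in the lecture, the population attributable fraction from Levin's formula, p(IRR - 1) / (1 + p(IRR - 1)), which is a standard way to express the share of cases attributable to an exposure. Prevalences are read as 3% and 0.1%; the function names are hypothetical.

```python
def slide_product(irr, prevalence):
    """IRR x prevalence of exposure, as written on the slide."""
    return irr * prevalence


def levin_attributable_fraction(irr, prevalence):
    """Population attributable fraction (Levin's formula).
    This is a swapped-in standard formula, not taken from the slides."""
    excess = prevalence * (irr - 1)
    return excess / (1 + excess)


diabetes = {"irr": 3.5, "prevalence": 0.03}
silicosis = {"irr": 12, "prevalence": 0.001}

for name, exposure in (("diabetes", diabetes), ("silicosis", silicosis)):
    product = slide_product(**exposure)
    fraction = levin_attributable_fraction(**exposure)
    print(f"{name}: IRR x prevalence = {product:.3f}, "
          f"attributable fraction = {fraction:.1%}")
```

Under either measure the more common exposure (diabetes) accounts for more TB than the exposure with the much higher IRR (silicosis).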

Cohort Studies: Survival Analysis
  • Analysis of time to event
  • Accounts for variable length of follow up.
  • Advantage if time to event affected by exposure.
  • Can find important differences in treatments even
    if overall survival is the same
  • Cancer treatment A increases survival at two
    years
  • But five year mortality is same as treatment B.
  • Treatment A - preferred by most patients!

Important differences found using Survival
Types of Survival Analysis
  • Simplest: Direct
  • Kaplan-Meier: still pretty simple. Calculates
    cumulative proportion free of outcome (survived)
    at each point in time when that outcome occurs.
    People who drop out or die of other causes are
    censored. At each point the numerator is all who
    have developed disease, while the denominator is
    all without outcome in the interval just before
  • Cox regression analysis: multivariate analysis
    with same basic principles
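
The Kaplan-Meier calculation described above can be sketched in plain Python. This is a minimal illustration, not the lecture's own worked example: the function name and the seven follow-up records are hypothetical. Subjects censored at the same time as an event are counted as still at risk at that time, which is the usual convention.

```python
def kaplan_meier(times, events):
    """Return [(time, proportion surviving)] at each time an outcome occurs.

    times  -- follow-up time for each subject
    events -- True if the outcome occurred at that time, False if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        outcomes = 0
        leaving = 0
        # Gather everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            outcomes += data[i][1]  # True counts as 1
            leaving += 1
            i += 1
        if outcomes:
            # Proportion of those still at risk who survive this time point.
            survival *= (at_risk - outcomes) / at_risk
            curve.append((t, survival))
        # Both outcomes and censored subjects leave the risk set.
        at_risk -= leaving
    return curve


# Hypothetical follow-up times (years) and outcome flags.
times = [2, 3, 3, 5, 8, 8, 10]
events = [True, False, True, True, False, True, False]

curve = kaplan_meier(times, events)
for t, s in curve:
    print(f"t = {t:>2}: proportion surviving = {s:.3f}")
```

Censored subjects never lower the curve themselves; they only shrink the denominator for later time points.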

Kaplan Meier survival analysis - example
Notes: Intervals are variable, defined by when
subjects die. Proportion surviving each interval
excludes drop-outs during the interval (censored).
Kaplan Meier survival analysis - example
Example of Kaplan-Meier analysis: General
Hospital Ventilation and time to TST conversion
Selection Bias: Berkson's
  • This is described in case control studies in
    hospitalized patients
  • First described on mathematical basis.
  • Probability of hospitalization if Factor Z = 0.1
  • Probability of hospitalization if Factor Y = 0.05
  • Probability of hospitalization if both: higher
  • These two independent conditions will appear to
    be associated but may not be.
  • In practice it is common that patients with 2 or
    more conditions ARE more likely hospitalized (eg
    CHF and pneumonia) so in hospital based
    Case-control study they appear to be strongly
  • Fundamental problem is the same. P1 does not
    equal P2 does not equal P3 does not equal P4