1Measurement issues
Jean Bourbeau, MD
Respiratory Epidemiology and Clinical Research Unit, McGill University
Clinical Epidemiology (679), June 19, 2006
2Objectives
- Define categorical and continuous variables
- Define 2 sources of variation: biological and measurement error (random and bias)
- Describe the classifications of measures and their focus: functional, descriptive and methodological
- Define and discuss the advantages and disadvantages of objective and subjective health measures
- Define the psychometric properties of measurement instruments: reliability, validity, responsiveness
- Discuss key questions and concerns about each of the psychometric properties of an instrument: reliability, validity and responsiveness
- Define and discuss minimal clinically important difference
3Reading
4Outline of Measurement issues
- 1. Measurements
- 2. Sources of variation
- 3. Classification
- 4. Health measurements
- 5. Measurement properties
6Examples
- In a 60-year-old patient after right hemicolectomy, the Dukes stage is a widely accepted, indispensable descriptive tool for planning further treatment.
- Adjuvant postoperative chemotherapy is currently the recommended treatment for resected Dukes C colon cancer.
7Examples
- In a 20-year-old woman with right lower quadrant pain and vomiting, the likely diagnosis is appendicitis or a gynecological infection.
- After excluding pelvic inflammatory disease, an experienced surgeon or gastroenterologist will diagnose appendicitis based on history, clinical findings and ultrasound.
8Measurement
We need to assign numbers to certain clinical
phenomena to make them manageable and scientific
9Measurement
- Measure
- A scale or test is an instrument to measure a clinical phenomenon; a score is a value on the scale in a given patient
10Measurement
- The attributes or events that are measured in a research study are called variables
- Variables are measured according to 2 types
- Categorical
- Continuous
11Categorical variables
- Also called discrete variables
- Dichotomous
- or Polychotomous (multilevel)
- - Nominal
- - Ordinal
12Dichotomous categorical variables
- Examples
- Vital status (alive vs dead)
- Yes or no (response to a question)
- Sex (male vs female)
13Polychotomous categorical variables
- Nominal
- Named categories that bear no ordered relationship to one another
- Example
- Hair colour, race, or country of origin
14Nominal scale
- Hierarchy of mathematical adequacy
- Lowest level (not a measurement but a classification)
- Uses numbers as labels (such as for male or female)
- No inference can be drawn from the relative size of the numbers used
15Polychotomous categorical variables
- Ordinal
- Named categories that bear an ordered relationship to one another
- The intervals need not be equal
- Example
- Ordinal pain scale that includes pain severity: none, mild, moderate, and severe
- Deep tendon reflex: absent, 1, 2, 3, or 4
16Ordinal scale
- Hierarchy of mathematical adequacy
- Numbers are again used as labels for response categories
- Numbers reflect the increasing order of the characteristics being measured (mild, moderate, severe)
- The numeric values, and the differences between them, hold no intrinsic meaning
17Continuous variables
- Also called dimensional, quantitative or interval variables
- Expressed as integers, fractions, or decimals in which equal distances exist between successive intervals
- Examples: age, blood pressure, temperature
18Interval scale
- Hierarchy of mathematical adequacy
- Numbers are assigned to the response categories
in such a way that a unit change represents a
constant change across the range of the scale
(temperature in degrees Celsius)
19Ratio scale
- Hierarchy of mathematical adequacy
- With a ratio scale, it becomes possible to state how many times greater one score is than another
- This improves on the interval scale by including a true zero point
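The difference between an interval and a ratio scale can be illustrated with temperature. The following is a minimal sketch, not part of the original slides: the helper name and the two temperatures are illustrative assumptions. It shows that differences agree on both scales, while a ratio is only meaningful once the zero point is a true zero (kelvin).

```python
def celsius_to_kelvin(c: float) -> float:
    """Convert degrees Celsius (interval scale) to kelvin (ratio scale)."""
    return c + 273.15


# Two illustrative temperatures (hypothetical values).
low_c, high_c = 10.0, 20.0

# Differences are meaningful on an interval scale: both scales give 10 units.
diff_c = high_c - low_c
diff_k = celsius_to_kelvin(high_c) - celsius_to_kelvin(low_c)

# Ratios need a true zero: 20 C is not "twice as hot" as 10 C.
naive_ratio = high_c / low_c
true_ratio = celsius_to_kelvin(high_c) / celsius_to_kelvin(low_c)

print(diff_c, round(diff_k, 2), naive_ratio, round(true_ratio, 3))
```

The ratio computed on the Celsius values depends on an arbitrary zero, which is why the interval scale sits below the ratio scale in the hierarchy.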
20Scales
- Binary
- Rank order (small to large)
- Continuous (0 to 8)
- Ratios
21Outline of Measurement issues
- 1. Measurements
- 2. Sources of variation
- 3. Classification
- 4. Health measurements
- 5. Measurement properties
22Sources of variation
- 2 sources of variation
- Biological variation
- Measurement error
23Biological variation
- Sources
- Dynamic nature of most biologic entities (differences in age, sex, race, or disease status)
- Temporal variation (sometimes predictable, such as the diurnal cycle of plasma cortisol)
24Measurement error
- 2 different types
- Random (chance error)
- Bias (systematic error)
26Measurement error
- Can arise from
- The method (measuring instrument)
- Observer (the measurer)
27Measurement error
- We can talk about the variability between methods of making the measurement or between the observers
- Repeated measurements by the same method or observer
- Intramethod or Intraobserver
- Between two or more methods or observers
- Intermethod or Interobserver
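Interobserver variability for a categorical reading is often summarized with a chance-corrected agreement statistic. The slides do not name one; the sketch below assumes Cohen's kappa as one common choice, and the two observers' readings are invented for illustration.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Chance-corrected agreement between two observers rating the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two observers agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each observer's category frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both observers used one identical category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings of the same 10 films by two observers.
observer_1 = ["abnormal", "abnormal", "normal", "normal", "abnormal",
              "normal", "normal", "abnormal", "normal", "normal"]
observer_2 = ["abnormal", "normal", "normal", "normal", "abnormal",
              "normal", "abnormal", "abnormal", "normal", "normal"]

kappa = cohen_kappa(observer_1, observer_2)
print(round(kappa, 3))
```

The same function applied to two readings by one observer on different occasions would give an intraobserver estimate.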
28Consequences of erroneous measurement
- Individual
- Makes no difference whether the error is systematic or random
- Group
- Variability in the absence of bias should not change the average group value
- However, it can have deleterious consequences when one is seeking associations or correlations between 2 measures (analytic bias)
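The group-level point can be checked with a small simulation. This is a sketch under stated assumptions (normally distributed values, purely random error with mean zero, arbitrary standard deviations and seed), not an analysis from the lecture: the group mean barely moves, but the correlation with a second measure is weakened.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def simulate_attenuation(n=5000, error_sd=1.0, seed=0):
    """Add unbiased random error to x and compare means and correlations."""
    rng = random.Random(seed)
    x_true = [rng.gauss(0.0, 1.0) for _ in range(n)]
    # A second measure that is truly associated with x.
    y = [x + rng.gauss(0.0, 0.5) for x in x_true]
    # x as actually observed: true value plus random (not systematic) error.
    x_observed = [x + rng.gauss(0.0, error_sd) for x in x_true]
    return {
        "mean_shift": sum(x_observed) / n - sum(x_true) / n,
        "r_true": pearson(x_true, y),
        "r_observed": pearson(x_observed, y),
    }


result = simulate_attenuation()
print(result)
```

With these assumed settings the observed correlation is clearly smaller than the correlation computed from error-free values, even though no bias was introduced.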
29Regression toward the mean
- Individual measurement is subject to both biologic variation and measurement error
- An extremely high or low value obtained in an individual from a group is more likely to be in error than is an intermediate value
- On remeasurement, the tendency of an extreme value to move toward a less extreme value is greater than the tendency for an intermediate value to become more extreme
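Regression toward the mean can be reproduced with simulated repeat measurements. The sketch below is illustrative only: the distribution, its mean of 100, the error size, and the choice of the top 10% are all assumptions. Each subject has a stable true value and is measured twice with independent random error.

```python
import random


def simulate_regression_to_mean(n=10000, top_fraction=0.10, seed=0):
    """Select subjects with extreme first readings and remeasure them."""
    rng = random.Random(seed)
    true_values = [rng.gauss(100.0, 10.0) for _ in range(n)]
    # Two measurements of the same subjects, each with independent error.
    first = [t + rng.gauss(0.0, 10.0) for t in true_values]
    second = [t + rng.gauss(0.0, 10.0) for t in true_values]

    # Subjects whose FIRST reading fell in the top fraction of the group.
    k = int(n * top_fraction)
    extreme = sorted(range(n), key=lambda i: first[i], reverse=True)[:k]

    return {
        "group_mean": sum(first) / n,
        "extreme_first": sum(first[i] for i in extreme) / k,
        "extreme_second": sum(second[i] for i in extreme) / k,
    }


rtm = simulate_regression_to_mean()
print(rtm)
```

The extreme group's second reading is, on average, closer to the group mean than its first reading, although nothing was done to these subjects between measurements.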
30Outline of Measurement issues
- 1. Measurements
- 2. Sources of variation
- 3. Classification
- 4. Health measurements
- 5. Measurement properties
31Classifications of measures
- Functional classifications focus on
- Purpose of application of the measures
- Descriptive classifications focus on
- Their scope
- Methodological classifications focus on
- Technical aspects
32Functional classification
- Measures have discriminative, evaluative or predictive properties
- Choice of measure depends on the purpose(s) for which it will be used
33Functional classification
- Discriminative instrument
- Can discriminate between people with different levels of a particular attribute or disease
- For example
- NYHA scale
- MRC dyspnea scale
34MRC Dyspnea Scale
none
- Grade 1: Breathless with strenuous exercise
- Grade 2: Short of breath when hurrying on the level or walking up a slight hill
- Grade 3: Walks slower than people of the same age on the level or stops for breath while walking at own pace on the level
- Grade 4: Stops for breath after walking 100 yards
- Grade 5: Too breathless to leave the house or breathless when dressing
severe
36Functional classification
- Predictive instrument
- Can predict the probability of a clinical
diagnosis (diagnostic test) or the likelihood of
a future event (prognostic test)
375-year survival COPD
[Figure: 5-year survival in COPD, shown according to the level of dyspnea as evaluated by the MRC Dyspnea Scale, and according to staging as defined by the ATS Guidelines (predicted FEV1)]
Nishimura K, et al. Chest 2002;121:1434-1440.
38Functional classification
- Evaluative instrument
- Can measure change over time in the same person
- For example
- Dyspnea subscale of the Chronic Respiratory Questionnaire (CRQ), a COPD disease-specific quality of life questionnaire
39Descriptive classification
- Large number of possible categories
- Can categorize instruments by
- Content domains of interest (dyspnea, fatigue, emotion)
- Generic or disease-specific
40COPD Questionnaires
General
- used in any population
- cross-condition comparison
- co-morbid conditions and effects of treatment covered
- do not focus on HRQL/COPD
- irrelevant items
- insensitive to small changes
Disease-Specific
- focus on relevant aspects of HRQL
- greater sensitivity for disease changes
- increased responsiveness
- no comparisons
41Methodological classification
- Large number of possible categories
- Can categorize by
- Interviewer versus self-administered
- Objective versus subjective
42Outline of Measurement issues
- 1. Measurements
- 2. Sources of variation
- 3. Classification
- 4. Health measurements
- 5. Measurement properties
43Health measurements
- Measurements may be based on
- Laboratory or diagnostic tests (objective)
- Indicators in which the patient or the clinician
makes a judgement (subjective)
44Health measurements
- Unfortunately, subjective is also used in other ways
- To indicate if the variable is observable or not
- Examples
- Objective indicators such as the ability to climb stairs
- Subjective indicators such as pain or feelings
45Objective vs Subjective
- Objective
- More often continuous (lab data)
- Few categorical (vital status, sex and race)
- Subjective
- Greater potential for bias or variability on the part of the observer
- Many variables that are most important in caring for patients are soft and subjective
- For example: pain, mood, dyspnea, ability to work, HRQL
46The example of CABG
- Why is quality of life important in studies of CABG patients?
- Survival with surgery > medical treatment for patients with left main and triple-vessel disease
- Survival similar in patients with less severe disease
- CASS, NEJM 1984; European cooperative study, Lancet 1982.
47As Feinstein has emphasized
The tendency of clinical investigators to focus
on objective rather than subjective
measurements can result in research that is both
dehumanizing and irrelevant
48Subjective vs Objective measurement
49Objective vs Subjective
- Data traditionally considered objective or hard can be seen to have feet of softer clay
- Example
- X-ray or cytopathologic diagnoses have been shown to be subject to considerable intra- and interobserver variability
50Subjective health measurements
- May be grouped into 3 main categories
- General feelings of well-being
- Symptoms of illness
- Adequacy of a person's functioning
51Subjective health measurements
- Advantages
- Amplify the data obtainable from morbidity and mortality statistics
- Give insights into matters of human concern such as pain, suffering or depression
- Offer a systematic way to record the voice of the patient
- Do not require expensive or invasive procedures
52Subjective health measurements
- Disadvantages
- Contrast sharply with the inherent reliability of mortality rates
- Seem more susceptible to bias
- Applying these measures to an entire population is more difficult or impossible
53Subjective health measurements
The use of rating methods suitable for
statistical analysis permits subjective health
measurements to rival the quantitative strengths
of the traditional objective indicators
54Health measurements
- Scientific basis
- Subjective judgements as a valid approach to measurement derive from the field of psychophysics
- Psychophysical principles were later incorporated into psychometrics, from which most of the techniques used to develop subjective measurements of health have been derived
55Outline of Measurement issues
- 1. Measurements
- 2. Sources of variation
- 3. Classification
- 4. Health measurements
- 5. Measurement properties
56Psychometric properties
- Definition
- Psychometrics is the science of using
standardized tests or scales to measure
attributes of a person or object
57Numerical estimates of health
- Many scaling methods exist for translating indicators into numerical estimates of severity
- When this is done, they may be combined into an overall score, termed a health index
58Psychometric properties
- Criteria for a scoring system
- Reliability
- Validity
- Responsiveness
- Minimal clinically important difference (MCID)
59Reliability
- Definition
- The extent to which the same results are obtained when the measurement is repeated
- Lack of reliability may reflect either (temporal) variation or random measurement error
60Reliability
- Key Questions
- Internal consistency
- Test-retest reliability (reproducibility)
- Key Concern
- Error
- (error attenuates relationships between
variables, and makes it more difficult to detect
treatment effects)
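Internal consistency is usually reported as a single coefficient. The slides do not specify which one; the sketch below assumes Cronbach's alpha, a widely used choice, and applies it to invented item scores (5 respondents, 3 items).

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha; rows holds one list of item scores per respondent."""
    k = len(rows[0])
    if k < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance([row[i] for row in rows]) for i in range(k)]
    # Sample variance of the total (summed) score.
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical responses: each row is one respondent, each column one item.
responses = [
    [1, 2, 2],
    [2, 2, 3],
    [3, 4, 3],
    [4, 4, 5],
    [5, 5, 5],
]

alpha = cronbach_alpha(responses)
print(round(alpha, 3))
```

Test-retest reliability would instead compare scores from the same respondents on two occasions, for example with a correlation or intraclass correlation coefficient.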
61Validity
- Definition
- The extent to which the measurement corresponds to the true value (some accepted gold standard), or behaves as expected
- Validity depends on minimizing measurement error caused by bias
62Type of measurement validity
- Content validity
- Construct validity (convergent, discriminant)
- Criterion validity (predictive, concurrent)
- Cross-cultural validity
- Situational validity
63Content validity
- Definition
- The extent to which the items sampled for
inclusion in the instrument adequately represent
the domain of content (particular domain area)
addressed by the instrument
64Content validity
- Key Questions
- Theoretical foundation of the instrument
- Instrument development: primary sources of information, sources of items and scaling structure selection
- Rules applied for content validation: patient and/or clinician validation, scientific review
- Instrument is appropriate for the study under consideration
65Content validity
- Key concern
- Without validity, an instrument has no meaning
66Construct validity
- Definition
- The extent to which the instrument measures an abstract concept (construct) or attribute, evaluated by comparison with instruments measuring related constructs
- Convergent (comes together with other instruments measuring the same concept) or discriminant (truly measures something different from other instruments)
67Criterion validity
- Definition
- Extent to which the instrument relates to an external criterion (criterion of practical value)
- Concurrent (able to correlate with a present criterion) or predictive (able to correlate with a future criterion)
68Construct validity
It is important to understand that a direct test
of the validity of an abstract concept such as
impaired health due to disease is not possible
69Construct validity
- Key Questions
- Factor structure of the measure consistent with expectations
- Scores from the instrument correlate with those of other instruments (measuring the same or related constructs)
- Scores from the instrument independent of scores from instruments measuring dissimilar constructs
- Differentiates groups known to differ on the attribute being measured, e.g. on HRQL
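The second and third key questions amount to checking a pattern of correlations. The following minimal sketch uses invented scores for six patients on a new instrument, an established instrument for a related construct, and a measure of a dissimilar construct; all names and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for the same six patients.
new_instrument = [1, 2, 3, 4, 5, 6]
related_measure = [2, 1, 4, 3, 6, 5]      # same or related construct
dissimilar_measure = [3, 5, 1, 6, 2, 4]   # dissimilar construct

r_convergent = pearson(new_instrument, related_measure)
r_discriminant = pearson(new_instrument, dissimilar_measure)
print(round(r_convergent, 3), round(r_discriminant, 3))
```

A high correlation with the related measure together with a near-zero correlation with the dissimilar one is the pattern expected if the construct hypotheses hold.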
70Testing construct validity
- The most widely used method is the multitrait-multimethod matrix
- It involves testing a series of hypotheses concerning relationships between the new instrument and a range of reference measures of disease activity
71Construct validity
- Key concern
- Without validity, an instrument has no meaning
72Cross-cultural validity
- Definition
- The extent to which an instrument developed and
tested in one cultural group is appropriate for,
and behaves similarly in, another
73Cross-cultural validity
- Key Questions
- Items appropriate for the culture under consideration
- Instrument translated culturally and linguistically
- Evidence of reliability and validity
74Situational validity
- Definition
- The extent to which an instrument is appropriate
for use in any given situation
75Situational validity
- Key Questions
- Instrument should measure an appropriate outcome for the trial
- Instrument should be valid for the specific purpose of the trial
- Sufficiently reliable and responsive for this purpose
- Sample size sufficient to detect change in the outcome measure of interest
76Situational validity
- Key Issues
- Validity can be situation specific: an instrument valid for one situation is not necessarily valid for another
- Failure to detect treatment effects may be a function of study design, rather than a limitation of the instrument
77Responsiveness
- Definition
- The extent to which scores change with a given change in the condition or disease state
- Key Questions
- Instrument has been evaluated for responsiveness
- Effect sizes have been associated with the instrument in well designed trials
- Key concerns
- The ability to track changes
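Responsiveness is commonly quantified with a standardized change score. The slides mention effect sizes without giving a formula; the sketch below assumes two common definitions (mean change divided by the baseline standard deviation, and the standardized response mean, which divides by the standard deviation of the change scores) and uses invented before and after scores.

```python
from statistics import mean, stdev


def effect_size(baseline, follow_up):
    """Mean change divided by the standard deviation of baseline scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(baseline)


def standardized_response_mean(baseline, follow_up):
    """Mean change divided by the standard deviation of the change scores."""
    changes = [after - before for before, after in zip(baseline, follow_up)]
    return mean(changes) / stdev(changes)


# Hypothetical scores for six patients before and after an intervention.
before = [3.0, 3.5, 4.0, 2.5, 3.0, 4.5]
after = [3.5, 4.5, 4.5, 3.5, 3.5, 5.0]

es = effect_size(before, after)
srm = standardized_response_mean(before, after)
print(round(es, 3), round(srm, 3))
```

The two indices can differ substantially for the same data, so a report of responsiveness should state which definition was used.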
78MCID
- Definition
- The smallest difference that clinicians and patients would care about
- Key Questions
- Has the MCID been established?
- What was the method used?
- Key concerns
- The ability to detect true treatment effects
79Benefits of Pulmonary Rehabilitation
- Functional exercise capacity: 6-MWD (N = 444)
- Health status: CRQ dyspnea (N = 519)
Lacasse Y, et al. Cochrane Database Syst Rev 2002;(3):CD003793.
80Key messages
- Some simple criteria
- The system must address a well defined clinical phenomenon
- The scale has to have a clearly defined ranking in a hierarchical order (reasonable clinical or mathematical criteria)
- The different stages or categories have to be mutually exclusive
- The scale has to be adapted to the area of measurement where it will be applied
- Creating complex or composite scores such as quality of life requires one to address issues concerning the inner structure of a score
81Key messages
- Quote from McDowell and Newell
- Ultimately the selection of a measurement contains an element of art and perhaps even luck; it is often prudent to apply more than one measurement whenever possible.
- This has the advantage of reinforcing the conclusions of the study when the results from ostensibly similar methods are in agreement, and it also serves to increase our general understanding of the comparability of the measurements we use.