# Practical Applications of Statistical Methods in the Clinical Laboratory


1
Practical Applications of Statistical Methods in
the Clinical Laboratory
• Roger L. Bertholf, Ph.D., DABCC
• Associate Professor of Pathology
• Director of Clinical Chemistry Toxicology
• UF Health Science Center/Jacksonville

2
Statistics are the only tools by which an
opening can be cut through the formidable thicket
of difficulties that bars the path of those who
pursue the Science of Man.
• Sir Francis Galton (1822-1911)

3
There are three kinds of lies: lies, damned lies, and statistics
• Benjamin Disraeli (1804-1881)

4
What are statistics, and what are they used for?
• Descriptive statistics are used to characterize
data
• Statistical analysis is used to distinguish
between random and meaningful variations
• In the laboratory, we use statistics to monitor
and verify method performance, and interpret the
results of clinical laboratory tests

5
Do not worry about your difficulties in mathematics; I assure you that mine are greater
• Albert Einstein (1879-1955)

6
I don't believe in mathematics
• Albert Einstein

7
Summation function
8
Product function
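The formula images for these two slides did not survive the transcript; in standard notation, the summation and product functions are:

```latex
\sum_{i=1}^{n} x_i = x_1 + x_2 + \cdots + x_n
\qquad
\prod_{i=1}^{n} x_i = x_1 \cdot x_2 \cdots x_n
```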
9
The Mean (average)
• The mean is a measure of the centrality of a set
of data.

10
Mean (arithmetical)
11
Mean (geometric)
12
Use of the Geometric mean
• The geometric mean is primarily used to average
ratios or rates of change.

13
Mean (harmonic)
14
Example of the use of Harmonic mean
• Suppose you spend $6 on pills costing 30 cents
per dozen, and $6 on pills costing 20 cents per
dozen. What was the average price of the pills
you bought?

15
Example of the use of Harmonic mean
• You spent $12 on 50 dozen pills, so the average
cost is $12/50 = $0.24, or 24 cents per dozen.
• This also happens to be the harmonic mean of 20
and 30
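These means are all available in Python's standard statistics module. A quick sketch; the growth-ratio data are hypothetical, but the pill example uses the slide's numbers:

```python
from statistics import fmean, geometric_mean, harmonic_mean

# Geometric mean: appropriate for averaging ratios or rates of change
growth_ratios = [1.10, 1.20, 0.95]   # hypothetical year-over-year ratios
print(round(geometric_mean(growth_ratios), 4))

# Harmonic mean: equal dollar amounts spent at $0.30/dozen and
# $0.20/dozen average to $0.24/dozen, as in the pill example
print(round(harmonic_mean([0.30, 0.20]), 2))   # 0.24

# For comparison, the arithmetic mean of the two prices is $0.25
print(round(fmean([0.30, 0.20]), 2))           # 0.25
```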

16
Root mean square (RMS)
17
For the data set 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
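For this data set, the RMS (the square root of the mean of the squares) can be computed directly:

```python
from math import sqrt

data = list(range(1, 11))                          # the data set 1, 2, ..., 10
rms = sqrt(sum(x * x for x in data) / len(data))   # sqrt(385/10)
print(round(rms, 3))                               # 6.205
```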
18
The Weighted Mean
19
Other measures of centrality
• Mode

20
The Mode
• The mode is the value that occurs most often

21
Other measures of centrality
• Mode
• Midrange

22
The Midrange
• The midrange is the mean of the highest and
lowest values

23
Other measures of centrality
• Mode
• Midrange
• Median

24
The Median
• The median is the value for which half of the
remaining values are above and half are below it.
I.e., in an ordered array of 15 values, the 8th
value is the median. If the array has 16 values,
the median is the mean of the 8th and 9th values.
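The three measures of centrality can be compared on a small hypothetical data set with the standard statistics module:

```python
from statistics import mode, median

data = [2, 3, 3, 5, 8, 13, 21, 40]   # hypothetical, already ordered

print(mode(data))                    # 3: the most frequent value
print((min(data) + max(data)) / 2)   # 21.0: the midrange
print(median(data))                  # 6.5: mean of the two middle values
```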

25
Example of the use of median vs. mean
• Suppose you're thinking about building a house in
a certain neighborhood, and the real estate agent
tells you that the average (mean) size house in
that area is 2,500 sq. ft. Astutely, you ask:
"What's the median size?" The agent replies:
1,800 sq. ft.
• What does this tell you about the sizes of the
houses in the neighborhood?

26
Measuring variance
• Two sets of data may have similar means, but
otherwise be very dissimilar. For example, males
and females have similar baseline LH
concentrations, but there is much wider variation
in females.
• How do we express quantitatively the amount of
variation in a data set?

27
(No Transcript)
28
The Variance
29
The Variance
• The variance is the mean of the squared
differences between individual data points and
the mean of the array.
• Or, after simplifying, the mean of the squares
minus the squared mean.

30
The Variance
31
The Variance
• In what units is the variance?
• Is that a problem?

32
The Standard Deviation
33
The Standard Deviation
• The standard deviation is the square root of the
variance. Standard deviation is not the mean
difference between individual data points and the
mean of the array.

34
The Standard Deviation
In what units is the standard deviation? Is that
a problem?
35
The Coefficient of Variation
• Sometimes called the Relative Standard Deviation
(RSD or %RSD)
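These measures of dispersion can be sketched on the 1-10 data set used earlier (population formulas, dividing by N):

```python
from statistics import fmean, pvariance, pstdev

data = list(range(1, 11))

variance = pvariance(data)       # mean squared deviation: 8.25
sd = pstdev(data)                # square root of the variance
cv = 100 * sd / fmean(data)      # coefficient of variation, in %

print(variance, round(sd, 3), round(cv, 1))
```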

36
Standard Deviation (or Error) of the Mean
• The standard deviation of an average decreases by
the reciprocal of the square root of the number
of data points used to calculate the average:
σ_x̄ = σ/√N

37
Exercises
• How many measurements must we average to improve
our precision by a factor of 2?

38
• To improve precision by a factor of 2

39
Exercises
• How many measurements must we average to improve
our precision by a factor of 2?
• How many to improve our precision by a factor of
10?

40
• To improve precision by a factor of 10

41
Exercises
• How many measurements must we average to improve
our precision by a factor of 2?
• How many to improve our precision by a factor of
10?
• If an assay has a CV of 7%, and we decide to run
samples in duplicate and average the
measurements, what should the resulting CV be?

42
• Improvement in CV by running duplicates
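The answers to all three exercises follow from σ_x̄ = σ/√N; a quick check:

```python
from math import sqrt

# Averaging N measurements improves precision by a factor of sqrt(N),
# so improving precision by a factor of k requires k**2 measurements
print(2 ** 2)                   # 4 measurements to double precision
print(10 ** 2)                  # 100 measurements for a tenfold improvement

# Running duplicates (N = 2) with a 7% CV:
print(round(7 / sqrt(2), 2))    # 4.95 (% CV)
```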

43
Population vs. Sample standard deviation
• When we speak of a population, we're referring to
the entire data set, which will have a mean μ

44
Population vs. Sample standard deviation
• When we speak of a population, we're referring to
the entire data set, which will have a mean μ
• When we speak of a sample, we're referring to a
subset of the population, whose mean is
customarily designated x̄
• Which is used to calculate the standard deviation?

45
Sir, I have found you an argument. I am not
obliged to find you an understanding.
• Samuel Johnson (1709-1784)

46
Population vs. Sample standard deviation
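Python's statistics module implements both: pstdev divides by N (population formula), stdev by N − 1 (sample formula). A sketch:

```python
from math import sqrt
from statistics import pstdev, stdev

data = [2, 4, 4, 4, 5, 5, 7, 9]

print(pstdev(data))   # 2.0: divides by N = 8
print(stdev(data))    # ~2.138: divides by N - 1 = 7
```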
47
Distributions
• Definition

48
Statistical (probability) Distribution
• A statistical distribution is a
mathematically-derived probability function that
can be used to predict the characteristics of
certain applicable real populations
• Statistical methods based on probability
distributions are parametric, since certain
assumptions are made about the parameters of the
underlying population distribution
49
Distributions
• Definition
• Examples

50
Binomial distribution
• The binomial distribution applies to events that
have two possible outcomes. The probability of r
successes in n attempts, when the probability of
success in any individual attempt is p, is given
by

51
Example
• What is the probability that 10 of the 12 babies
born one busy evening in your hospital will be
girls?

52
Solution
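The binomial probability for this example, assuming p = 0.5 for each birth, computed with the standard library:

```python
from math import comb

# P(exactly 10 girls in 12 births) with p = 0.5
p = comb(12, 10) * 0.5 ** 12   # 66 / 4096
print(round(p, 4))             # 0.0161
```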
53
Distributions
• Definition
• Examples
• Binomial

54
God does arithmetic
• Karl Friedrich Gauss (1777-1855)

55
The Gaussian Distribution
• What is the Gaussian distribution?

56
63 81 36 12 28 7 79 52 96 17 22 4 61 85
etc.
57
(No Transcript)
58
63 81 36 12 28 7 79 52 96 17 22 4 61 85
22 73 54 33 99 5 61 28 58 24 16 77 43 8
85 152 90 45 127 12 140 70 154 41 38 81 104 93

59
(No Transcript)
60
. . . etc.
61
[Figure: Gaussian probability distribution of x]
62
The Gaussian Probability Function
• The probability of x in a Gaussian distribution
with mean μ and standard deviation σ is given by
P(x) = [1/(σ√2π)] exp[−(x − μ)²/2σ²]

63
The Gaussian Distribution
• What is the Gaussian distribution?
• What types of data fit a Gaussian distribution?

64
Like the ski resort full of girls hunting for
husbands and husbands hunting for girls, the
situation is not as symmetrical as it might seem.
• Alan Lindsay Mackay (1926- )

65
Are these Gaussian?
• Human height
• Outside temperature
• Raindrop size
• Blood glucose concentration
• Serum CK activity
• QC results
• Proficiency results

66
The Gaussian Distribution
• What is the Gaussian distribution?
• What types of data fit a Gaussian distribution?
• What is the advantage of using a Gaussian
distribution?

67
Gaussian probability distribution
[Figure: Gaussian curve centered at μ, with markers at μ±σ, μ±2σ, and μ±3σ; about 68% of the area lies within μ±σ and about 95% within μ±2σ]
68
What are the odds of an observation . . .
• more than 1σ from the mean (±)
• more than 2σ greater than the mean
• more than 3σ from the mean

69
Some useful Gaussian probabilities
| Range | Probability | Odds (of falling outside) |
|---|---|---|
| ±1.00σ | 68.3% | 1 in 3 |
| ±1.64σ | 90.0% | 1 in 10 |
| ±1.96σ | 95.0% | 1 in 20 |
| ±2.58σ | 99.0% | 1 in 100 |
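These coverage probabilities can be verified with statistics.NormalDist, which is a unit Gaussian by default:

```python
from statistics import NormalDist

z = NormalDist()   # mean 0, standard deviation 1
for k in (1.00, 1.64, 1.96, 2.58):
    coverage = z.cdf(k) - z.cdf(-k)   # probability within +/- k sigma
    print(f"±{k}σ: {coverage:.1%}")
```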
70
Example
[Figure: two Gaussian curves compared]
71
On the Gaussian curve Experimentalists think
that it is a mathematical theorem while the
mathematicians believe it to be an experimental
fact.
• Gabriel Lippman (1845-1921)

72
Distributions
• Definition
• Examples
• Binomial
• Gaussian

73
"Life is good for only two things, discovering
mathematics and teaching mathematics"
• Siméon Poisson (1781-1840)

74
The Poisson Distribution
• The Poisson distribution predicts the frequency
of r events occurring randomly in time, when the
expected frequency is λ: P(r) = λʳe^(−λ)/r!

75
Examples of events described by a Poisson
distribution
• Lightning
• Accidents
• Laboratory?

76
A very useful property of the Poisson distribution
• The variance of a Poisson distribution equals its mean: σ² = λ, so σ = √λ
77
Using the Poisson distribution
• How many counts must be collected in an RIA in
order to ensure an analytical CV of 5% or less?

78
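Because a Poisson variable's variance equals its mean, the CV of a count of N is 1/√N; the required number of counts follows directly:

```python
from math import sqrt

target_cv = 0.05                 # 5%
counts = (1 / target_cv) ** 2    # CV = 1/sqrt(N)  =>  N = 1/CV^2
print(counts)                    # 400.0
print(1 / sqrt(counts))          # check: 0.05
```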
79
Distributions
• Definition
• Examples
• Binomial
• Gaussian
• Poisson

80
The Student's t Distribution
• When a small sample is selected from a large
population, we sometimes have to make certain
assumptions in order to apply statistical methods

81
• Is the mean of our sample, x̄, the same as the
mean of the population, μ?
• Is the standard deviation of our sample, s, the
same as the standard deviation of the
population, σ?
• Unless we can answer both of these questions
affirmatively, we don't know whether our sample
has the same distribution as the population from
which it was drawn.

82
• Recall that the Gaussian distribution is defined
by the probability function
• Note that the exponential factor contains both
μ and σ, both population parameters. The factor
is often simplified by making the substitution
z = (x − μ)/σ

83
• The variable z in the equation
• is distributed according to a unit Gaussian,
since it has a mean of zero and a standard
deviation of 1

84
Gaussian probability distribution
[Figure: unit Gaussian in z, centered at 0 with markers at ±1, ±2, and ±3; about 68% of the area lies within ±1 and about 95% within ±2]
85
• But if we use the sample mean and standard
deviation in place of the population parameters,
we've defined a new quantity, t, which is not
distributed according to the unit Gaussian. It
is distributed according to the Student's t
distribution.

86
Important features of the Student's t distribution
• Use of the t statistic assumes that the parent
distribution is Gaussian
• The degree to which the t distribution
approximates a Gaussian distribution depends on N
(the degrees of freedom)
• As N gets larger (above 30 or so), the
differences between t and z become negligible

87
Application of Student's t distribution to a
sample mean
• The Student's t statistic can also be used to
analyze differences between the sample mean and
the population mean

88
Comparison of Student's t and Gaussian
distributions
• Note that, for a sufficiently large N (>30), t
can be replaced with z, and a Gaussian
distribution can be assumed

89
Exercise
• The mean age of the 20 participants in one
workshop is 27 years, with a standard deviation
of 4 years. Next door, another workshop has 16
participants with a mean age of 29 years and
standard deviation of 6 years.
• Is the second workshop attracting older
technologists?

90
Preliminary analysis
• Is the population Gaussian?
• Can we use a Gaussian distribution for our
sample?
• What statistic should we calculate?

91
Solution
• First, calculate the t statistic for the two
means

92
Solution, cont.
• Next, determine the degrees of freedom

93
Statistical Tables
94
Conclusion
• Since 1.16 is less than 1.64 (the t value
corresponding to the 90% confidence limit), the
difference between the mean ages of the
participants in the two workshops is not
significant
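A sketch of the calculation using the pooled two-sample t statistic (the slide's value of 1.16 suggests a slightly different pooling, but the conclusion is unchanged):

```python
from math import sqrt

n1, m1, s1 = 20, 27.0, 4.0   # first workshop: n, mean, SD
n2, m2, s2 = 16, 29.0, 6.0   # second workshop

# pooled variance, then the two-sample t statistic
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
t = abs(m1 - m2) / sqrt(sp2 * (1 / n1 + 1 / n2))
print(round(t, 2))   # ~1.2, below the 90% confidence cutoff
```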

95
The Paired t Test
• Suppose we are comparing two sets of data in
which each value in one set has a corresponding
value in the other. Instead of calculating the
difference between the means of the two sets, we
can calculate the mean difference between data
pairs.

96
• we use the mean difference d̄ and its standard
deviation s_d
• to calculate t: t = d̄/(s_d/√N)

97
• If the type of data permit paired analysis, the
paired t test is much more sensitive than the
unpaired t.
• Why?
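Because pairing removes the variability between subjects, consistent small differences produce a large t. A sketch with hypothetical paired method-comparison data:

```python
from math import sqrt
from statistics import mean, stdev

# hypothetical paired results from two methods
x = [10.1, 12.3, 9.8, 11.5, 10.9]
y = [10.4, 12.9, 10.1, 11.9, 11.3]

d = [b - a for a, b in zip(x, y)]          # per-pair differences
t = mean(d) / (stdev(d) / sqrt(len(d)))    # t on the mean difference
print(round(t, 2))                         # large t despite small differences
```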

98
Applications of the Paired t
• Method correlation
• Comparison of therapies

99
Distributions
• Definition
• Examples
• Binomial
• Gaussian
• Poisson
• Student's t

100
The χ² (Chi-square) Distribution
• There is a general formula that relates actual
measurements to their predicted values:
χ² = Σ (xᵢ − μᵢ)²/σᵢ²

101
The χ² (Chi-square) Distribution
• A special (and very useful) application of the χ²
distribution is to frequency data

102
Exercise
• You have had a number of cases of iatrogenic
strep infection in your last 725 patients. St.
Elsewhere, across town, reports 35 cases of
strep in their last 416 patients.
• Do you need to review your infection control
policies?

103
Analysis
• If your infection control policy is roughly as
effective as St. Elsewhere's, we would expect
the rates of strep infection at the two
hospitals to be similar. The expected
frequency, then, would be the average

104
Calculating χ²
• First, calculate the expected frequencies at your
hospital (f1) and at St. Elsewhere (f2)

105
Calculating χ²
• Next, we sum the squared differences between
actual and expected frequencies

106
Degrees of freedom
• In general, when comparing k sample proportions,
the degrees of freedom for ?2 analysis are k - 1.
Hence, for our problem, there is 1 degree of
freedom.

107
Conclusion
• A table of χ² values lists 3.841 as the χ²
corresponding to a probability of 0.05.
• Since our χ² is smaller, the variation between
strep infection rates at the two hospitals is
within statistically predicted limits, and
therefore is not significant.
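The transcript omits the number of cases at your hospital; taking a hypothetical 50 cases in 725 patients, with the slide's 35/416 for St. Elsewhere, the χ² calculation can be sketched as:

```python
def chi2_two_proportions(c1, n1, c2, n2):
    """Chi-square (1 degree of freedom) comparing two rates."""
    p = (c1 + c2) / (n1 + n2)   # pooled (expected) infection rate
    cells = [(c1, n1 * p), (n1 - c1, n1 * (1 - p)),
             (c2, n2 * p), (n2 - c2, n2 * (1 - p))]
    return sum((obs - exp) ** 2 / exp for obs, exp in cells)

# 50 cases at your hospital is a hypothetical count
chi2 = chi2_two_proportions(50, 725, 35, 416)
print(round(chi2, 2))   # 0.88, below the 3.841 cutoff: not significant
```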

108
Distributions
• Definition
• Examples
• Binomial
• Gaussian
• Poisson
• Student's t
• χ²

109
The F distribution
• The F distribution predicts the expected
differences between the variances of two samples
• This distribution has also been called Snedecor's
F distribution, the Fisher distribution, and the
variance ratio distribution

110
The F distribution
• The F statistic is simply the ratio of two
variances
• (by convention, the larger V is the numerator)

111
Applications of the F distribution
• There are several ways the F distribution can be
used. Applications of the F statistic are part
of a more general type of statistical analysis
called analysis of variance (ANOVA). We'll see
an example shortly.

112
Example
• You're asked to do a "quick and dirty"
correlation between three whole-blood glucose
analyzers. You prick your finger and measure
your blood glucose four times on each of the
analyzers.
• Are the results equivalent?

113
Data
114
Analysis
• The mean glucose concentrations for the three
analyzers are 70, 85, and 76.
• If the three analyzers are equivalent, then we
can assume that all of the results are drawn from
an overall population with mean μ and variance σ².

115
Analysis, cont.
• Approximate μ by calculating the mean of the
means

116
Analysis, cont.
• Calculate the variance of the means

117
Analysis, cont.
• But what we really want is the variance of the
population. Recall that the variance of a mean
is related to the population variance by
σ_x̄² = σ²/N

118
Analysis, cont.
• Since we just calculated the variance of the
means, we can solve for σ²

119
Analysis, cont.
• So we now have an estimate of the population
variance, which we'd like to compare to the real
variance to see whether they differ. But what is
the real variance?
• We don't know, but we can calculate the variance
based on our individual measurements.

120
Analysis, cont.
• If all the data were drawn from a larger
population, we can assume that the variances are
the same, and we can simply average the variances
for the three data sets.

121
Analysis, cont.
• Now calculate the F statistic

122
Conclusion
• A table of F values indicates that 4.26 is the
limit for the F statistic at a 95% confidence
level (when the appropriate degrees of freedom
are selected). Our value of 10.6 exceeds that,
so we conclude that there is significant
variation between the analyzers.
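The individual replicate results are not preserved in the transcript, so this sketch uses hypothetical replicates constructed to have the slide's means of 70, 85, and 76; the one-way ANOVA F is the ratio of between-analyzer to within-analyzer variance:

```python
from statistics import fmean, variance

# hypothetical replicate glucose results (means 70, 85, 76)
groups = [[68, 70, 71, 71], [83, 85, 86, 86], [74, 76, 77, 77]]
n = len(groups[0])                                  # replicates per analyzer

means = [fmean(g) for g in groups]                  # 70.0, 85.0, 76.0
ms_between = n * variance(means)                    # n * variance of the means
ms_within = fmean([variance(g) for g in groups])    # pooled within-run variance
F = ms_between / ms_within
print(F)
```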

123
Distributions
• Definition
• Examples
• Binomial
• Gaussian
• Poisson
• Student's t
• χ²
• F

124
Unknown or irregular distribution
• Transform

125
Log transform
Probability
Probability
log x
x
126
Unknown or irregular distribution
• Transform
• Non-parametric methods

127
Non-parametric methods
• Non-parametric methods make no assumptions about
the distribution of the data
• There are non-parametric methods for
characterizing data, as well as for comparing
data sets
• These methods are also called distribution-free,
robust, or sometimes non-metric tests

128
Application to Reference Ranges
• The concentrations of most clinical analytes are
not usually distributed in a Gaussian manner.
Why?
• How do we determine the reference range (limits
of expected values) for these analytes?

129
Application to Reference Ranges
• Reference ranges for normal, healthy populations
are customarily defined as the central 95%.
• An entirely non-parametric way of expressing this
is to eliminate the upper and lower 2.5% of data,
and use the remaining upper and lower values to
define the range.
• NCCLS recommends 120 values, dropping the two
highest and two lowest.

130
Application to Reference Ranges
• What happens when we want to compare one
reference range with another? This is precisely
what CLIA '88 requires us to do.
• How do we do this?

131
Everything should be made as simple as possible,
but not simpler.
• Albert Einstein

132
Solution 1 Simple comparison
• Suppose we just do a small internal reference
range study, and compare our results to the
manufacturer's range.
• How do we compare them?
• Is this a valid approach?

133
NCCLS recommendations
• Inspection Method: verify that the reference
populations are equivalent
• Limited Validation: collect 20 reference
specimens
• No more than 2 may exceed the range
• Repeat if failed
• Extended Validation: collect 60 reference
specimens and compare ranges.

134
Solution 2 Mann-Whitney
• Rank the normal values (x1, x2, x3 ... xn) and the
reference population (y1, y2, y3 ... yn) together:
• x1, y1, x2, x3, y2, y3 ... xn, yn
• Count the number of y values that follow each x,
and call the sum Ux. Calculate Uy also.
• Also called the U test, rank-sum test, or
Wilcoxon's test.

135
Mann-Whitney, cont.
• It should be obvious that Ux + Uy = NxNy
• If the two distributions are the same, then
• Ux = Uy = ½NxNy
• Large differences between Ux and Uy indicate that
the distributions are not equivalent
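The U statistics are easy to sketch directly from the definition (ties ignored for simplicity):

```python
def u_stat(a, b):
    # number of b values greater than each a value (assumes no ties)
    return sum(1 for x in a for y in b if y > x)

x = [1, 3, 5]
y = [2, 4, 6]
ux, uy = u_stat(x, y), u_stat(y, x)
print(ux, uy)                        # 6 3
print(ux + uy == len(x) * len(y))    # True: Ux + Uy = NxNy
```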

136
Obvious is the most dangerous word in
mathematics.
• Eric Temple Bell (1883-1960)

137
Solution 3 Run test
• In the run test, order the values in the two
distributions as before
• x1, y1, x2, x3, y2, y3 ... xn, yn
• Add up the number of runs (consecutive values
from the same distribution). If the two data
sets are randomly selected from one population,
the values will interleave and there will be many
runs; a small number of runs indicates that the
distributions differ.
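Counting runs is a one-liner once the merged values are sorted; interleaved samples give many runs, separated samples give few:

```python
def run_count(a, b):
    # label each value by its source, sort by value, count label changes
    labeled = sorted([(v, 'x') for v in a] + [(v, 'y') for v in b])
    labels = [lab for _, lab in labeled]
    return 1 + sum(l1 != l2 for l1, l2 in zip(labels, labels[1:]))

print(run_count([1, 3, 5], [2, 4, 6]))     # 6: interleaved, same population
print(run_count([1, 2, 3], [10, 11, 12]))  # 2: separated, distributions differ
```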

138
Solution 4 The Monte Carlo method
• Sometimes, when we don't know anything about a
distribution, the best thing to do is
independently test its characteristics.

139
The Monte Carlo method
140
The Monte Carlo method
Reference population
141
The Monte Carlo method
• With the Monte Carlo method, we have simulated
the test we wish to apply--that is, we have
randomly selected samples from the parent
distribution, and determined whether our in-house
data are in agreement with the randomly-selected
samples.

142
Analysis of paired data
• For certain types of laboratory studies, the data
we gather are paired
• We typically want to know how closely the paired
data agree
• We need quantitative measures of the extent to
which the data agree or disagree
• Examples?

143
Examples of paired data
• Method correlation data
• Pharmacodynamic effects
• Risk analysis
• Pathophysiology

144
Correlation
[Scatter plot of paired data; both axes run 0-50]
145
Linear regression (least squares)
• Linear regression analysis generates an equation
for a straight line
• y = mx + b
• where m is the slope of the line and b is the
value of y when x = 0 (the y-intercept).
• The calculated equation minimizes the differences
between actual y values and the linear regression
line.
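The least-squares slope and intercept follow from the standard formulas; a sketch with exact hypothetical data:

```python
from statistics import fmean

def least_squares(x, y):
    mx, my = fmean(x), fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    m = sxy / sxx          # slope minimizing squared y-deviations
    b = my - m * mx        # intercept
    return m, b

x = [0, 1, 2, 3, 4]
y = [1.0, 3.0, 5.0, 7.0, 9.0]   # hypothetical: exactly y = 2x + 1
m, b = least_squares(x, y)
print(m, b)                     # 2.0 1.0
```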

146
Correlation
[Scatter plot of paired data with fitted regression line; both axes run 0-50]
147
Covariance
• Do x and y values vary in concert, or randomly?

148
• What if y increases when x increases?
• What if y decreases when x increases?
• What if y and x vary independently?

149
Covariance
• It is clear that the greater the covariance, the
stronger the relationship between x and y.
• But . . . what about units?
• e.g., if you measure glucose in mg/dL, and I
measure it in mmol/L, who's likely to have the
higher covariance?

150
The Correlation Coefficient
151
The Correlation Coefficient
• The correlation coefficient is a unitless
quantity that roughly indicates the degree to
which x and y vary in the same direction.
• ρ is useful for detecting relationships between
parameters, but it is not a very sensitive
measure of agreement between methods.
152
Correlation
[Scatter plot: y = 1.031x − 0.024, ρ = 0.9986; axes 0-50]
153
Correlation
[Scatter plot: y = 1.031x − 0.024, ρ = 0.9894; axes 0-50]
154
Standard Error of the Estimate
• The linear regression equation gives us a way to
calculate an estimated y for any given x value,
given the symbol ŷ (y-hat)

155
Standard Error of the Estimate
• Now what we are interested in is the average
difference between the measured y and its
estimate, ŷ

156
Correlation
[Scatter plot: y = 1.031x − 0.024, ρ = 0.9986, s_y/x = 1.83; axes 0-50]
157
Correlation
[Scatter plot: y = 1.031x − 0.024, ρ = 0.9894, s_y/x = 5.32; axes 0-50]
158
Standard Error of the Estimate
• If we assume that the errors in the y
measurements are Gaussian (is that a safe
assumption?), then the standard error of the
estimate gives us the boundaries within which 67%
of the y values will fall.
• ±2s_y/x defines the 95% boundaries.

159
Limitations of linear regression
• Assumes no error in x measurement
• Assumes that variance in y is constant throughout
concentration range

160
Alternative approaches
• Weighted linear regression analysis can
compensate for non-constant variance among y
measurements
• Deming regression analysis takes into account
variance in the x measurements
• Weighted Deming regression analysis allows for
both

161
Evaluating method performance
• Precision

162
Method Precision
• Within-run: 10 or 20 replicates
• What types of errors does within-run precision
reflect?
• Day-to-day: NCCLS recommends evaluation over 20
days
• What types of errors does day-to-day precision
reflect?

163
Evaluating method performance
• Precision
• Sensitivity

164
Method Sensitivity
• The analytical sensitivity of a method refers to
the lowest concentration of analyte that can be
reliably detected.
• The most common definition of sensitivity is the
analyte concentration that will result in a
signal two or three standard deviations above
background.

165
[Figure: signal vs. time, showing baseline noise and analyte response]
166
Other measures of sensitivity
• Limit of Detection (LOD) is sometimes defined as
the concentration producing an S/N > 3.
• In drug testing, LOD is customarily defined as
the lowest concentration that meets all
identification criteria.
• Limit of Quantitation (LOQ) is sometimes defined
as the concentration producing an S/N > 5.
• In drug testing, LOQ is customarily defined as
the lowest concentration that can be measured
within ±20%.

167
Question
• At an S/N ratio of 5, what is the minimum CV of
the measurement?
• If the S/N is 5, 20% of the measured signal is
noise, which is random. Therefore, the CV must
be at least 20%.

168
Evaluating method performance
• Precision
• Sensitivity
• Linearity

169
Method Linearity
• A linear relationship between concentration and
signal is not absolutely necessary, but it is
highly desirable. Why?
• CLIA '88 requires that the linearity of
analytical methods be verified on a periodic
basis.

170
Ways to evaluate linearity
• Visual/linear regression

171
[Figure: signal vs. concentration with fitted line]
172
Outliers
• We can eliminate any point that differs from the
next highest value by more than 0.765 (p = 0.05)
times the spread between the highest and lowest
values (Dixon test).
• Example: 4, 5, 6, 13
• (13 − 4) × 0.765 = 6.89; since 13 − 6 = 7 exceeds
6.89, the value 13 may be rejected.
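The example above can be sketched directly, applying the test to the highest value:

```python
data = sorted([4, 5, 6, 13])

spread = data[-1] - data[0]      # 13 - 4 = 9
gap = data[-1] - data[-2]        # 13 - 6 = 7
outlier = gap > 0.765 * spread   # 7 > 6.885
print(outlier)                   # True: 13 may be rejected (p < 0.05)
```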

173
Limitation of linear regression method
• If the analytical method has a high variance
(CV), it is likely that small deviations from
linearity will not be detected due to the high
standard error of the estimate

174
[Figure: signal vs. concentration showing slight curvature]
175
Ways to evaluate linearity
• Visual/linear regression

176
• Recall that, for linear data, the relationship
between x and y can be expressed as
• y = f(x) = a + bx

177
• A curve is described by the quadratic equation
• y = f(x) = a + bx + cx²
• which is identical to the linear equation except
for the addition of the cx² term.

178
• It should be clear that the smaller the x²
coefficient, c, the closer the data are to linear
(since the equation reduces to the linear form
as c approaches 0).
• What is the drawback to this approach?
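For equally spaced calibrators (spacing h), the quadratic coefficient can be estimated without a full fit via second differences, since for y = a + bx + cx² the second difference Δ²y equals 2ch². A sketch with hypothetical data:

```python
# hypothetical calibration responses: y = 1 + 2x + 3x^2 at x = 0, 1, 2
y = [1, 6, 17]

second_diff = y[2] - 2 * y[1] + y[0]   # Δ²y = 2ch²
c = second_diff / 2                    # here h = 1
print(c)                               # 3.0: recovers the x² coefficient
```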

179
Ways to evaluate linearity
• Visual/linear regression
• Lack-of-fit analysis

180
Lack-of-fit analysis
• There are two components of the variation from
the regression line
• Intrinsic variability of the method
• Variability due to deviations from linearity
• The problem is to distinguish between these two
sources of variability
• What statistical test do you think is appropriate?

181
[Figure: signal vs. concentration with replicates at each level]
182
Lack-of-fit analysis
• The ANOVA technique requires that the method
variance is constant at all concentrations.
Cochran's test is used to check whether this is
the case.

183
Lack-of-fit method calculations
• Total sum of squares (TSS): the variance
calculated from all of the y values
• Linear regression sum of squares (LSS): the
variance of y values from the regression line
• Residual sum of squares (RSS): the difference
between TSS and LSS
• Lack-of-fit sum of squares (LOF): the RSS minus
the pure error (sum of variances)

184
Lack-of-fit analysis
• The LOF is compared to the pure error to give the
G statistic (which is actually F)
• If the LOF is small compared to the pure error, G
is small and the method is linear
• If the LOF is large compared to the pure error, G
will be large, indicating significant deviation
from linearity

185
Significance limits for G
• 90 confidence 2.49
• 95 confidence 3.29
• 99 confidence 5.42

186
If your experiment needs statistics, you ought
to have done a better experiment.
• Ernest Rutherford (1871-1937)

187
Evaluating Clinical Performance of laboratory
tests
• The clinical performance of a laboratory test
defines how well it predicts disease
• The sensitivity of a test indicates the
likelihood that it will be positive when disease
is present

188
Clinical Sensitivity
• If TP as the number of true positives, and FN
is the number of false negatives, the
sensitivity is defined as

189
Example
• Of 25 admitted cocaine abusers, 23 tested
positive for urinary benzoylecgonine and 2 tested
negative. What is the sensitivity of the urine
screen?

190
Evaluating Clinical Performance of laboratory
tests
• The clinical performance of a laboratory test
defines how well it predicts disease
• The sensitivity of a test indicates the
likelihood that it will be positive when disease
is present
• The specificity of a test indicates the
likelihood that it will be negative when disease
is absent

191
Clinical Specificity
• If TN is the number of true negative results,
and FP is the number of falsely positive results,
the specificity is defined as

192
Example
• What would you guess is the specificity of any
particular clinical laboratory test? (Choose any
one you want)

193
• Since reference ranges are customarily set to
include the central 95% of values in healthy
subjects, we expect 5% of values from healthy
people to be abnormal: this is the false
positive rate.
• Hence, the specificity of most clinical tests is
no better than 95%.

194
Sensitivity vs. Specificity
• Sensitivity and specificity are inversely related.

195
[Figure: overlapping distributions of marker concentration in disease-negative (−) and disease-positive (+) populations]
196
Sensitivity vs. Specificity
• Sensitivity and specificity are inversely
related.
• How do we determine the best compromise between
sensitivity and specificity?

197
198
Evaluating Clinical Performance of laboratory
tests
• The sensitivity of a test indicates the
likelihood that it will be positive when disease
is present
• The specificity of a test indicates the
likelihood that it will be negative when disease
is absent
• The predictive value of a test indicates the
probability that the test result correctly
classifies a patient

199
Predictive Value
• The predictive value of a clinical laboratory
test takes into account the prevalence of a
certain disease, to quantify the probability that
a positive test is associated with the disease in
a randomly-selected individual, or alternatively,
that a negative test is associated with health.

200
Illustration
• Suppose you have invented a new screening test
• The test correctly identified 98 of 100 patients
with confirmed Addison disease (What is the
sensitivity?)
• The test was positive in only 2 of 1000 patients
with no evidence of Addison disease (What is the
specificity?)

201
Test performance
• The sensitivity is 98.0%
• The specificity is 99.8%
• But Addison disease is a rare disorder: the
incidence is about 1 in 10,000
• What happens if we screen 1 million people?

202
Analysis
• In 1 million people, there will be 100 cases of
• Our test will identify 98 of these cases (TP)
• Of the 999,900 non-Addison subjects, the test
will be positive in 0.2, or about 2,000 (FP).

203
Predictive value of the positive test
• The predictive value is the percentage of all
positives that are true positives:
PV+ = TP/(TP + FP) = 98/2,098 ≈ 4.7%

204
What about the negative predictive value?
• TN = 999,900 − 2,000 = 997,900
• FN = 100 × 0.02 = 2

205
Summary of predictive value
• Predictive value describes the usefulness of a
clinical laboratory test in the real world.
• Or does it?

206
• Even when you have a very good test, it is
generally not cost effective to screen for
diseases which have low incidence in the general
population. Exception?
• The higher the clinical suspicion, the better the
predictive value of the test. Why?

207
Efficiency
• We can combine the PV+ and PV− to give a quantity
called the efficiency
• The efficiency is the percentage of all patients
that are classified correctly by the test result.
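The Addison screening numbers from the preceding slides tie all of these quantities together:

```python
# Addison screening example: 1 in 10,000 prevalence,
# sensitivity 98%, specificity 99.8%
population = 1_000_000
prevalence = 1 / 10_000
sens, spec = 0.98, 0.998

diseased = population * prevalence          # 100 true cases
tp = sens * diseased                        # 98 detected
fn = diseased - tp                          # 2 missed
fp = (1 - spec) * (population - diseased)   # ~2,000 false positives
tn = population - diseased - fp

ppv = tp / (tp + fp)                        # predictive value of a positive
npv = tn / (tn + fn)                        # predictive value of a negative
efficiency = (tp + tn) / population         # fraction classified correctly
print(round(ppv, 3), round(npv, 6), round(efficiency, 3))
```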

208
209
To call in the statistician after the experiment
is done may be no more than asking him to
perform a postmortem examination: he may be able
to say what the experiment died of.
• Ronald Aylmer Fisher (1890 - 1962)

210
Application of Statistics to Quality Control
• We expect quality control to fit a Gaussian
distribution
• We can use Gaussian statistics to predict the
variability in quality control values
• What sort of tolerance will we allow for
variation in quality control values?
• Generally, we will question variations that have
a statistical probability of less than 5%

211
He uses statistics as a drunken man uses lamp
posts -- for support rather than illumination.
• Andrew Lang (1844-1912)

212
Westgard's rules

| Rule | Probability of random occurrence |
|---|---|
| 1-2s | 1 in 20 |
| 1-3s | 1 in 300 |
| 2-2s | 1 in 400 |
| R-4s | 1 in 800 |
| 4-1s | 1 in 600 |
| 10-x | 1 in 1000 |
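Three of the rules can be sketched as checks on a sequence of control z-scores, (value − mean)/SD; the thresholds follow the rule names above:

```python
def westgard_violations(z):
    """Check control z-scores against a few of Westgard's rules."""
    v = []
    if any(abs(x) > 3 for x in z):                      # 1-3s: one point beyond 3 SD
        v.append('1-3s')
    if any(z[i] > 2 and z[i + 1] > 2 or
           z[i] < -2 and z[i + 1] < -2
           for i in range(len(z) - 1)):                 # 2-2s: two consecutive beyond 2 SD, same side
        v.append('2-2s')
    if any(len(set(x > 0 for x in z[i:i + 10])) == 1
           for i in range(len(z) - 9)):                 # 10-x: ten consecutive on one side of the mean
        v.append('10-x')
    return v

print(westgard_violations([0.5, 2.3, 2.6, -0.1]))   # ['2-2s']
print(westgard_violations([1.2] * 10))              # ['10-x']
```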

213
Some examples
[Levey-Jennings control chart: mean with ±1, ±2, and ±3 SD limits]
214
Some examples
[Levey-Jennings control chart: mean with ±1, ±2, and ±3 SD limits]
215
Some examples
[Levey-Jennings control chart: mean with ±1, ±2, and ±3 SD limits]
216
Some examples
[Levey-Jennings control chart: mean with ±1, ±2, and ±3 SD limits]
217
In science one tries to tell people, in such a
way as to be understood by everyone, something
that no one ever knew before. But in poetry, it's
the exact opposite.
• Paul Adrien Maurice Dirac (1902-1984)