RELIABILITY OF DISEASE CLASSIFICATION presentation

1
RELIABILITY OF DISEASE CLASSIFICATION

Nigel Paneth

2
TERMINOLOGY

Reliability is analogous to precision
Validity is analogous to accuracy
Reliability is how well an observer classifies
the same individual under different
circumstances.
Validity is how well a given test reflects
another test of known greater accuracy.

3
RELIABILITY AND VALIDITY

Reliability includes
assessments of the same observer at different
times - INTRA-OBSERVER RELIABILITY
assessments of different observers at the same
time - INTER-OBSERVER RELIABILITY
Reliability assumes that all tests or observers
are equal.
Validity assumes that there is a gold
standard to which a test or observer should be
compared.

4
ASSESSING RELIABILITY

How do we assess reliability?
One way is to look simply at percent agreement.
Percent agreement is the proportion of all
diagnoses classified the same way by two
observers.

5
EXAMPLE OF PERCENT AGREEMENT

Two physicians are each given a set
of 100 X-rays to look at independently and asked
to judge whether pneumonia is present or absent.
When both sets of diagnoses are tallied, it is
found that 95 of the diagnoses are the same.
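A minimal sketch of the percent agreement calculation. The function name `percent_agreement` and the ten readings are made up for illustration; they are not the 100 X-rays in the example.

```python
def percent_agreement(ratings_a, ratings_b):
    """Proportion of cases that two observers classified the same way."""
    if len(ratings_a) != len(ratings_b):
        raise ValueError("both observers must rate the same cases")
    if not ratings_a:
        raise ValueError("need at least one case")
    same = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return same / len(ratings_a)


# Hypothetical readings of 10 X-rays (True = pneumonia present).
md1 = [False, False, True, False, False, False, True, False, False, False]
md2 = [False, False, True, False, False, False, False, False, False, False]

# The two observers differ on one X-ray out of ten.
print(percent_agreement(md1, md2))
```

With 95 matching diagnoses out of 100, the same function would return 0.95.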

6
IS PERCENT AGREEMENT GOOD ENOUGH?

Do these two physicians exhibit high
diagnostic reliability?
Can there be 95% agreement between two
observers without really having good reliability?

7
Compare the two tables below
[Table 1 and Table 2: four-fold tables of the two physicians' pneumonia diagnoses]

In both instances, the physicians agree 95% of
the time. Are the two physicians equally reliable
in the two tables?
8

What is the essential difference between the two
tables?
The problem arises from the ease of agreement on
common events (e.g. not having pneumonia in the
first table).
So a measure of agreement should take into
account the ease of agreement due to chance
alone.

9
USE OF THE KAPPA STATISTIC TO ASSESS RELIABILITY

Kappa is a widely used measure of inter- or
intra-observer agreement (or reliability) that
corrects for chance agreement.

10
KAPPA VARIES FROM +1 TO -1

+1 means that the two observers are perfectly
reliable. They classify everyone exactly the same
way.
0 means there is no relationship at all between
the two observers' classifications, above the
agreement that would be expected by chance.
-1 means the two observers classify exactly the
opposite of each other. If one observer says
yes, the other always says no.
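The three reference points can be checked with a small sketch. The function `kappa_2x2` and the three tables of 100 cases are hypothetical, chosen so that each observer says yes half the time.

```python
def kappa_2x2(both_yes, yes_no, no_yes, both_no):
    """Kappa for a four-fold table of two observers' yes/no calls.

    yes_no = observer 1 yes, observer 2 no; no_yes = the reverse.
    Undefined (division by zero) if chance-expected agreement is 1.
    """
    n = both_yes + yes_no + no_yes + both_no
    observed = (both_yes + both_no) / n
    obs1_yes = both_yes + yes_no
    obs2_yes = both_yes + no_yes
    # Chance agreement: both say yes by chance plus both say no by chance.
    expected = (obs1_yes * obs2_yes + (n - obs1_yes) * (n - obs2_yes)) / n**2
    return (observed - expected) / (1 - expected)


perfect = kappa_2x2(50, 0, 0, 50)      # observers always agree
chance = kappa_2x2(25, 25, 25, 25)     # agreement no better than chance
opposite = kappa_2x2(0, 50, 50, 0)     # observers always disagree
print(perfect, chance, opposite)
```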

GUIDE TO USE OF KAPPAS IN EPIDEMIOLOGY AND
MEDICINE
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor
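The guide can be written as a small lookup. This is only a sketch: the function name `interpret_kappa` is hypothetical, and because the guide does not say which band the boundary values .40 and .60 belong to, the code assumes they fall in the higher band.

```python
def interpret_kappa(kappa):
    """Label a kappa value using the cut-offs in the guide.

    Assumption: exactly .60 counts as good and exactly .40 as fair.
    """
    if kappa > 0.80:
        return "excellent"
    if kappa >= 0.60:
        return "good"
    if kappa >= 0.40:
        return "fair"
    return "poor"


print(interpret_kappa(0.26))
print(interpret_kappa(0.90))
```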

12
1st WAY TO CALCULATE KAPPA

1. Calculate observed agreement (observations in
cells in which the observers agree / total
observations). In both Table 1 and Table 2 it is 95%.
2. Calculate expected agreement (chance
agreement) based on the marginal totals

13
Table 1's marginal totals are: MD 1 (columns) 3 yes
and 97 no; MD 2 (rows) 4 yes and 96 no; total 100.
14

How do we calculate the N expected by chance in
each cell?
We assume that each cell should reflect the
marginal distributions, i.e. the proportion of
yes and no answers should be the same within the
four-fold table as in the marginal totals.

To do this, we find the proportion of answers in
either the column (3% and 97%, yes and no
respectively for MD 1) or row (4% and 96%, yes
and no respectively for MD 2) marginal totals,
and apply one of the two proportions to the other
marginal total. For example, 96% of the row
totals are in the No category. Therefore, by
chance, 96% of MD 1's No's should also be
classified No by MD 2. 96% of 97 is 93.12.

16
By subtraction, all other cells fill in
automatically, and each yes/no distribution
reflects the marginal distribution. Any cell
could have been used to make the calculation,
because once one cell is specified in a 2x2 table
with fixed marginal distributions, all other
cells are also specified.
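The chance-expected counts for every cell can be sketched in a few lines. The function name `expected_counts` is hypothetical; the marginal totals are the ones quoted in the text (MD 1: 3 yes, 97 no; MD 2: 4 yes, 96 no).

```python
def expected_counts(row_totals, col_totals):
    """Chance-expected count in each cell of a table with fixed marginal totals."""
    n = sum(row_totals)
    if n != sum(col_totals):
        raise ValueError("row and column totals must add up to the same N")
    # Each cell gets its row's share of the column total: row * column / N.
    return [[r * c / n for c in col_totals] for r in row_totals]


md2_rows = [4, 96]   # MD 2: yes, no
md1_cols = [3, 97]   # MD 1: yes, no
table = expected_counts(md2_rows, md1_cols)

# Agreement cells are yes/yes (top left) and no/no (bottom right).
expected_agreement = table[0][0] + table[1][1]
print(table, expected_agreement)
```

The no/no cell comes out at about 93.12 and the yes/yes cell at about 0.12, the same values obtained by filling in one cell and subtracting.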
17
Now you can see that just by the operation of
chance, 93.24 of the 100 observations should have
been agreed to by the two observers (93.12 +
0.12).
18

Let's now compare the actual agreement with the
expected agreement.
Expected agreement is 6.76% from perfect
agreement of 100% (100 - 93.24).
Actual agreement is 5.0% from perfect agreement
(100 - 95).
So our two observers were 1.76% better than
chance, but if they had agreed perfectly they
would have been 6.76% better than chance. So
they are really only about ¼ better than chance
(1.76/6.76).

19
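Below is the formula for calculating Kappa from
expected agreement

\[ \kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}} \]

For Table 1:

\[ \kappa = \frac{95\% - 93.24\%}{100\% - 93.24\%} = \frac{1.76}{6.76} = 0.26 \]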
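As a rough sketch, the formula translates directly into code when both agreements are written as proportions (0.95 rather than 95%). The function name `kappa_from_agreement` is hypothetical.

```python
def kappa_from_agreement(observed, expected):
    """Kappa from observed and chance-expected agreement, both as proportions."""
    if expected >= 1:
        raise ValueError("kappa is undefined when expected agreement is 1")
    return (observed - expected) / (1 - expected)


# Table 1: observed agreement 95%, expected agreement 93.24%.
kappa_table1 = kappa_from_agreement(0.95, 0.9324)
print(round(kappa_table1, 2))  # prints 0.26
```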

How good is a Kappa of 0.26?
Kappa > .80 is considered excellent
Kappa .60 - .80 is considered good
Kappa .40 - .60 is considered fair
Kappa < .40 is considered poor

21
In the second example, the observed agreement was
also 95%, but the marginal totals were very
different.

Using the same procedure as before, we calculate
the expected N in any one cell, based on the
marginal totals. For example, the lower right
cell is 54% of 55, which is 29.7.

23
And, by subtraction the other cells are as below.
The cells which indicate agreement are
highlighted in yellow, and add up to 50.4
(29.7 + 20.7).
24

Enter the two agreements into the formula

\[ \kappa = \frac{\text{Observed agreement} - \text{Expected agreement}}{1 - \text{Expected agreement}} = \frac{95\% - 50.4\%}{100\% - 50.4\%} = \frac{44.6}{49.6} = 0.90 \]

In this example, the observers have the same
observed agreement, but now they do much better
than chance. A Kappa of 0.90 is considered excellent.
25
A 2nd WAY TO CALCULATE THE KAPPA STATISTIC

\[ \kappa = \frac{2(AD - BC)}{N_1 N_4 + N_2 N_3} \]

where the Ns are the marginal totals, labeled thus
[Figure: four-fold table with labeled cells and marginal totals]
26

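Look again at the tables on slide 7.
For Table 1

\[ \kappa = \frac{2(94 \times 1 - 2 \times 3)}{4 \times 97 + 3 \times 96} = \frac{176}{676} = 0.26 \]

For Table 2

\[ \kappa = \frac{2(52 \times 43 - 3 \times 2)}{46 \times 55 + 45 \times 54} = \frac{4460}{4960} = 0.90 \]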
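The cross-product calculation can be sketched as below. The function name `kappa_cross_product` and the cell layout (a, b on the top row and c, d on the bottom row, with a and d the agreement cells) are assumptions. The text does not say which physician each disagreement cell belongs to, but swapping b and c leaves kappa unchanged.

```python
def kappa_cross_product(a, b, c, d):
    """Kappa via 2(AD - BC) divided by the sum of the two products of
    opposite marginal totals, for a four-fold table.

    Assumed layout: a, b on the top row and c, d on the bottom row,
    so a and d are the cells in which the observers agree.
    """
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    return 2 * (a * d - b * c) / (row1 * col2 + row2 * col1)


# Cell counts as they appear in the two calculations in the text.
table1 = kappa_cross_product(94, 2, 3, 1)    # 176 / 676
table2 = kappa_cross_product(52, 3, 2, 43)   # 4460 / 4960
print(round(table1, 2), round(table2, 2))
```

Both results match the values from the observed and expected agreement method, since the two formulas are algebraically the same.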

Note parallels between
THE ODDS RATIO
THE CHI-SQUARE STATISTIC
THE KAPPA STATISTIC
Note that the cross-products
of the four-fold table, and their relation to
marginal totals, are central to all three
expressions
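For reference, the three expressions can be set side by side. This is a sketch under stated assumptions: A, B, C and D are the four cells with A and D on the agreement diagonal, N is the total, the Ns are the four marginal totals (paired in the kappa denominator so that each product takes one row total and the opposite column total), and the chi-square shown is the uncorrected Pearson form.

\[
\text{Odds ratio} = \frac{AD}{BC}, \qquad
\chi^2 = \frac{N\,(AD - BC)^2}{N_1 N_2 N_3 N_4}, \qquad
\kappa = \frac{2\,(AD - BC)}{N_1 N_4 + N_2 N_3}
\]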
