Using tests to improve decisions: Cutting scores

Provided by: chriswe
Transcript and Presenter's Notes

1
Using tests to improve decisions:
Cutting scores and base rates
2
Review: Conditional Probability
  • Recall: Conditional probabilities arise when the
    probability of one thing, A, depends on the
    probability of something else, B
  • In such cases, we want to factor in the
    probability of B before we worry about A
  • This amounts to focusing on the elements that are
    likely to be picked out by both A and B
  • P(A|B) = P(A and B) / P(B)

3
Three ways
  • We have already considered three ways to solve
    conditional probability questions (all are
    exactly equivalent)
  • Common sense
  • Probability tables
  • Bayes' Theorem

4
a.) Common sense
  • P(A|B) = P(A and B) / P(B)
  • 5 males, one wears a dress; 5 females, 4 wear
    dresses.
  • What is the probability that you wear a dress,
    given that you are female?
  • First: We want to know how many people are both
    dress wearers and females: 4 of the 10, so
    P(A and B) = 4/10
  • Second: We want to know what proportion of all
    females are accounted for by the dress-wearing
    females
  • = Dress-wearing females / Females
  • = P(Female and dress-wearing) / P(Female)
  • = 4/5
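
The counting argument above can be checked mechanically. The following is a minimal sketch in Python; the list-of-tuples layout and the helper name `conditional_probability` are illustrative assumptions, not part of the original slides.

```python
from fractions import Fraction

# The ten people from the example, stored as (sex, wears_dress).
people = (
    [("male", True)] * 1 + [("male", False)] * 4
    + [("female", True)] * 4 + [("female", False)] * 1
)

def conditional_probability(population, event_a, event_b):
    """Return P(A|B) = P(A and B) / P(B), computed by counting."""
    n = len(population)
    p_b = Fraction(sum(1 for p in population if event_b(p)), n)
    p_a_and_b = Fraction(
        sum(1 for p in population if event_a(p) and event_b(p)), n
    )
    return p_a_and_b / p_b

def wears_dress(person):
    return person[1]

def is_female(person):
    return person[0] == "female"

p_dress_given_female = conditional_probability(people, wears_dress, is_female)
print(p_dress_given_female)  # 4/5
```

Swapping the two event arguments gives P(Female | Dress-wearing) from the same population, which also happens to be 4/5 in this example.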

5
a.) Common sense
  • BUT: Remember that common sense isn't so common,
    especially when it comes to conditional
    probability problems
  • Relying on your intuitive understanding may lead
    you astray precisely because your intuitions
    about conditional probability are likely to be
    wrong.
  • So it is advisable to use one of the other two
    methods

6
b.) Probability Tables
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female?

7
b.) Probability Tables
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female? This amounts
    to saying: Just ignore all the non-females!

JUST IGNORE ALL THE MALES!
8
b.) Probability Tables
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female? This amounts
    to saying: Just ignore all the non-females!

JUST IGNORE ALL THE MALES!
The sum in this row/rectangle is P(B) = 5/10
This 4/10 is P(A and B)
9
c.) Bayes' Theorem
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female?
  • P(A|B) = P(B|A) P(A) / P(B)
  • Proof: By definition, (1.) P(A|B) = P(A,B) /
    P(B)
  • (2.) P(B|A) = P(A,B) / P(A)
  • (3.) P(A|B) P(B) = P(A,B)   [Multiply (1.) by
    P(B)]
  • (4.) P(B|A) P(A) = P(A,B)   [Multiply (2.) by
    P(A)]
  • (5.) P(A|B) P(B) = P(B|A) P(A)   [Substitute (4.)
    in (3.)]
  • (6.) P(A|B) = P(B|A) P(A) / P(B)   [Divide by
    P(B)]

10
c.) Bayes' Theorem
  • P(A|B) = P(A and B) / P(B)
  • = P(B|A) P(A) / P(B)
  • Note that this just means that P(B|A) P(A) = P(A
    and B). How come?
  • - We are multiplying two probabilities here, one
    of which is conditional on the other event
  • P(B|A) = The probability of B given A = the
    probability of being female, given that you wear
    a dress
  • P(A) = The probability of wearing a dress
  • When we multiply these together, we pick out that
    subset that falls under both (just as we would
    multiply two independent probabilities, except
    that here the second factor is conditional on the
    first): those that are female AND wear a dress

11
c.) Bayes' Theorem
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female?
  • P(A|B) = P(B|A) P(A) / P(B)
  • P(Dress-wearing|Female) = P(Female|Dress-wearing)
    P(Dress-wearing) / P(Female)
  • = (4/5 × 5/10) / (5/10)
  • = 4/5
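
The same arithmetic can be written as a small function. This is a minimal sketch, taking A = dress-wearing and B = female; the function name `bayes` and the variable names are illustrative assumptions.

```python
from fractions import Fraction

def bayes(p_b_given_a, p_a, p_b):
    """Bayes' Theorem: P(A|B) = P(B|A) * P(A) / P(B)."""
    return p_b_given_a * p_a / p_b

p_female_given_dress = Fraction(4, 5)  # 4 of the 5 dress wearers are female
p_dress = Fraction(5, 10)              # 5 of the 10 people wear dresses
p_female = Fraction(5, 10)             # 5 of the 10 people are female

p_dress_given_female = bayes(p_female_given_dress, p_dress, p_female)
print(p_dress_given_female)  # 4/5
```

Exact fractions are used so the result matches the hand calculation without rounding.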

12
c.) Bayes' Theorem
  • P(A|B) = P(A and B) / P(B)
  • What is the probability that you sometimes wear a
    dress, given that you are female?
  • P(A|B) = P(B|A) P(A) / P(B)
  • P(Dress-wearing|Female) = P(Female|Dress-wearing)
    P(Dress-wearing) / P(Female)
  • = (4/5 × 5/10) / (5/10)
  • = 4/5

Note that the red Ps refer to the population as a
whole! This is where we ignore all the males
13
Why use Bayes' Theorem?
  • Bayes' Theorem is not intended to confuse, but to
    simplify: you can use it to get the probability
    relation between any two cells in the 2x2 table
  • It can also be generalized to more complex
    situations
  • However, in this class we won't go outside of 2x2
    conditional probability tables, so just draw a
    picture or think it through if you prefer!

14
Why use Bayes' Theorem?
  • Bayes' rule (conditional probability) applies
    whenever we have to incorporate new evidence into
    a line of reasoning
  • That evidence may be related to how reliable our
    test (or diagnostician!) is (P(Diagnosis |
    Reliable test) ≠ P(Diagnosis | Less reliable
    test)) or to base rates (P(Diagnosis | Common
    disease) ≠ P(Diagnosis | Rare disease))
  • It is therefore of central concern in justifying
    beliefs, that is, in supporting belief in a
    hypothesis
  • Each new piece of relevant evidence (if it has a
    known probability) can be used to revise the
    current probability of a hypothesis (and that
    current probability may reflect prior pieces of
    relevant evidence).

15
Why use Bayes' Theorem?
  • Scientific reasoning is subject to Bayesian
    thinking
  • A hypothesis that is non-significant should be
    considered in light of known evidence that bears
    on it
  • E.g. Null or non-null results do not occur in a
    vacuum
  • If you get an effect no one else gets, be
    suspicious.
  • If you don't get an effect everyone else does
    get, be suspicious.
  • There may be relevant effects that are
    extra-scientific, e.g. P(Effect | Jim does the
    experiment) ≠ P(Effect | Sally does the
    experiment).

16
Why use Bayes' Theorem?
  • Bayes' Theorem also comes into play when base
    rates are skewed (as we have already seen) and
    when we are selecting cutting scores for our
    tests (as we will soon see)

17
Cutting scores
  • What is a cutting score or cutting line?
  • How shall we evaluate how good any given test is?

18
Cutting scores
  • What is a cutting score or cutting line?
  • In many tests we have criteria: if a subject
    scores above some specified score X, they are
    likely to be Y: a genius, depressed, a good
    marriage prospect, dead in six months
  • X is a cutting score

19
Cutting scores
  • In many tests we have criteria: if a subject
    scores above some specified cutting score X, they
    are likely to be Y
  • Note (as always with Bayes' Theorem!) that this
    is just a standard conditional probability: we
    are calculating P(Y|X) or P(diagnosis | test
    result)
  • Note also that in this case the probability of
    attaining a score of X (the test result) is not
    given by God (that is, is not a matter of
    empirical fact), because we test designers are
    free to change the cutting score as we like
  • In doing so, we can change P(diagnosis | test
    result)

20
Cutting scores
  • As an example, think of the probability that a
    person is a genius (defined, let's say, as IQ >
    130), given a test score of 128, on the one hand,
    or 110, on the other.
  • Assume the standard error for IQ is 5 points
  • Then there is a fair chance that a person who got
    128 has an IQ above 130, but a very small (but
    non-zero) chance that a person who got 110 has
    an IQ above 130
  • If we used 110 as a cutting score for detecting
    geniuses, we'd be wrong a lot: P(diagnosis | test
    result) is low
  • If we used 128 as a cutting score for genius,
    we'd be wrong less often: P(diagnosis | test
    result) is higher
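
To put rough numbers on "a fair chance" and "a very small chance", here is a minimal sketch. It assumes that the true score is normally distributed around the observed score with a standard deviation equal to the standard error, and it ignores the base rate of geniuses entirely; both are simplifying assumptions made for illustration, and the function name is hypothetical.

```python
from statistics import NormalDist

def prob_true_score_above(observed, threshold, standard_error):
    """P(true score > threshold), treating the true score as normally
    distributed around the observed score (a simplifying assumption)."""
    return 1.0 - NormalDist(mu=observed, sigma=standard_error).cdf(threshold)

p_128 = prob_true_score_above(128, 130, 5)
p_110 = prob_true_score_above(110, 130, 5)
print(f"{p_128:.3f}")  # 0.345
print(f"{p_110:.6f}")  # 0.000032
```

Under these assumptions a score of 128 leaves roughly a one-in-three chance of a true IQ above 130, while a score of 110 leaves a chance of about 3 in 100,000.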

21
Cutting scores
  • What we want is a principled way of choosing a
    good cutting score for any particular purpose
  • Clearly, our choice of cutting score must depend
    on that purpose
  • When we are diagnosing a brain tumour, we almost
    never want to be wrong if the person does have a
    brain tumour (low false negative rate) AND we
    don't care too much if we have a high false
    positive rate (at least until we cut into the
    brain!)
  • When we are trying to identify criminals, we
    might be more worried about minimizing false
    positives (we could horribly destroy an innocent
    life if we say someone is a criminal when they
    are not) and we might be willing to pay the price
    by letting some real criminals go free (increase
    our false negative rate)

23
False negative: Incorrectly undiagnosed
False positive: Incorrectly diagnosed
24
Low false negative rate
High false positive rate
Rewarding incompetence
25
High false negative rate
Low false positive rate
Ignoring competence
26
How shall we evaluate how good a test is?
  • Three things need to be taken into account
  • i.) The size of the correlation between test
    scores and criterion (which is called?)
  • - The higher the correlation, the narrower the
    scatterplot (i.e. the ellipse) and the smaller
    the error rates

27
How shall we evaluate how good a test is?
  • Two other things need to be taken into account
  • ii.) The base rate
  • iii.) The cutting score
  • What is the relation between these two measures?

28
The relation between base rate and cutting score
  • Example from Meehl (in a paper we did not read in
    this class)
  • Group A: 415 well-adjusted soldiers
  • Group B: 89 mal-adjusted soldiers
  • A scale diagnosed 55% of Group B, and only 19% of
    Group A, so the authors advocated its use

29
Example: Assume N = 10,000, P(Bad) = 0.05
  • 500 are bad. 55% (275) are classified as bad
  • 9500 are good. 81% (7695) are not classified as
    bad.
  • (7695 + 275)/10000 = 79.7% are correctly
    classified.
  • Why should this bother us?

30
Let's use Bayes' Theorem: Is "bad" bad?
Oh no! When we take base rates into account, an
identification of a person as bad actually has
only a 13% chance of being correct, not a 55%
chance as claimed.
31
Let's use Bayes' Theorem: Is "good" good?
When we take base rates into account, a failure
to identify a person as bad has a 97% chance of
being correct, but remember that we were already
95% sure before we bothered to do the calculation
or give the test!
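
Both figures follow from Bayes' Theorem applied to the numbers in the example (base rate 0.05, 55% of bad cases flagged, 81% of good cases not flagged). The following is a minimal sketch; the function name `predictive_values` and its parameter names are illustrative assumptions.

```python
def predictive_values(base_rate, sensitivity, specificity):
    """Return (P(bad | classified bad), P(good | not classified bad))."""
    true_pos = base_rate * sensitivity
    false_neg = base_rate * (1 - sensitivity)
    true_neg = (1 - base_rate) * specificity
    false_pos = (1 - base_rate) * (1 - specificity)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv

ppv, npv = predictive_values(base_rate=0.05, sensitivity=0.55, specificity=0.81)
print(f"P(bad | classified bad)      = {ppv:.2f}")  # 0.13
print(f"P(good | not classified bad) = {npv:.2f}")  # 0.97
```

In counts out of 10,000, that is 275 correct identifications among 2,080 people flagged as bad, and 7,695 correct clearances among 7,920 people not flagged.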
32
The relation between base rate and cutting score,
II
  • A certain Rorschach configuration is seen in 8.1%
    of schizophrenics, and 0% of non-schizophrenics
  • Such either/or certainty is rare, especially in
    projective tests
  • The authors therefore claim that this is
    clinically useful. But is it really?

33
Let's do the math!
Although the sign is certain in this case, it is
so rare itself and applies to a group with such a
rare base rate that it is P(Rorschach) that is
worrying. This information would be
diagnostically helpful in only 7 cases out of
10,000! It is clinically useless.
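
The base rate of schizophrenia used for this calculation is not preserved in the transcript. The sketch below therefore assumes a base rate of 0.85%, a hypothetical value chosen only because it reproduces the figure of about 7 in 10,000; the function name is also illustrative.

```python
def cases_showing_sign_per_10000(base_rate, sign_rate_in_group):
    """Expected number of people per 10,000 who show the sign,
    given that it occurs only within the diagnosed group."""
    return 10_000 * base_rate * sign_rate_in_group

ASSUMED_BASE_RATE = 0.0085  # hypothetical; not given in the transcript
SIGN_RATE = 0.081           # 8.1% of schizophrenics show the configuration

flagged = cases_showing_sign_per_10000(ASSUMED_BASE_RATE, SIGN_RATE)
print(round(flagged))  # 7
```

Whatever the exact base rate, the product of a rare sign and a rare condition is what makes P(Rorschach) so small.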
34
What can we do? Rule 1
  • In order for a positive diagnostic assertion to
    be more likely true than false, the ratio of
    positive to negative base rates in the examined
    population must exceed the ratio of the false
    positive rate to the valid (true) positive rate
  • (Base rate of positives / Base rate of negatives)
    > (False positive rate of test / True positive
    rate of test)
  • In other words: If your confidence in your test
    results can't beat base rates, then you should go
    with base rates.


35
What can we do? Rule 1
  • (Base rate of positives / Base rate of negatives)
    > (False positive rate of test / True positive
    rate of test)
  • - Imagine otherwise: Let's say we have 50/50
    positives versus negatives (ratio 1:1), but 75%
    of our positive tests are false (ratio 3:1)
  • You get a positive result
  • But that only gives you a 25% chance of being
    positive given that you test positive, and you
    had a 50% chance before you took the test!


36
Example: Rule 1
  • (Base rate of positives / Base rate of negatives)
    > (False positive rate of test / True positive
    rate of test)

A cutting score identifies 80% of brain-damaged
patients. 15% of non-damaged patients also exceed
that cut-off. What base rates can justify the use
of such a test? .15 (false positive) / .80 (true
positive) = 0.19. The ratio of brain-damaged to
non-brain-damaged patients in the population
under consideration must be equal to or greater
than .19, or about 1 to 5.
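
The rule can be checked against Bayes' Theorem directly. The following is a minimal sketch; the function names `min_base_rate_ratio` and `ppv` are illustrative assumptions. It also converts the ratio of damaged to non-damaged patients into a proportion of the whole population.

```python
def min_base_rate_ratio(false_positive_rate, true_positive_rate):
    """Rule 1: positives/negatives must exceed FPR/TPR."""
    return false_positive_rate / true_positive_rate

def ppv(base_rate, true_positive_rate, false_positive_rate):
    """P(condition present | positive test), by Bayes' Theorem."""
    tp = base_rate * true_positive_rate
    fp = (1 - base_rate) * false_positive_rate
    return tp / (tp + fp)

ratio = min_base_rate_ratio(0.15, 0.80)
# Convert the positives-to-negatives ratio into a population proportion.
threshold_base_rate = ratio / (1 + ratio)

print(f"{ratio:.4f}")                # 0.1875
print(f"{threshold_base_rate:.3f}")  # 0.158
print(f"{ppv(threshold_base_rate, 0.80, 0.15):.2f}")  # 0.50
```

At the threshold a positive result is exactly as likely to be right as wrong; with a higher proportion of brain-damaged patients it is more likely right than wrong.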
37
The easiest case: Equal base rates (Rule 2)
  • Iff base rates are equal, then the probability
    that a positive diagnosis is correct is the ratio
    of the true positive rate to the sum of the true
    and false positive rates.
  • Another way of saying this more simply is: equal
    base rates render Bayes' Theorem unnecessary.

38
Example: Equal base rates (Rule 2)
  • Iff base rates are equal, then the probability
    that a positive diagnosis is correct is the ratio
    of the true positive rate to the sum of the true
    and false positive rates.
  • Two kinds of cancers occur equally often. A test
    diagnoses Type B with 68% accuracy, but is at
    chance for Type A. You get a positive test
    result. What is the probability you have Type B
    cancer?

For once life is simple. The probability is
68%. 0.68 / (0.68 + 0.32) = 0.68
39
Example 2: Equal base rates (Rule 2)
  • A test picks out 75% of people who will continue
    in school (true positives) but also 40% of those
    who will not (false positives). It is claimed
    that about half of all students in the population
    drop out of school. How far off can that claim be
    without the test being useless?
  • The probability that a positive diagnosis is
    correct with an equal split is the ratio of the
    true positive rate to the sum of the true and
    false positive rates
  • 0.75 / (0.75 + 0.40) = 0.65
  • So the test gets about 65% right. If less than
    35% of the students actually do drop out, the
    test will not do better than base rates.
  • That is: If it is a matter of fact that (say)
    only 10% of students drop out, then there is no
    use giving this test: it can't beat the 90% chance
    you have of being correct before you bothered to
    give the test
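
To see why equal base rates make Bayes' Theorem unnecessary, the shortcut can be compared with the full calculation. This is a minimal sketch; both function names are illustrative assumptions.

```python
def hit_rate_equal_base_rates(true_positive_rate, false_positive_rate):
    """Rule 2 shortcut: P(correct | positive) = TPR / (TPR + FPR)."""
    return true_positive_rate / (true_positive_rate + false_positive_rate)

def hit_rate_bayes(base_rate, true_positive_rate, false_positive_rate):
    """Full Bayes' Theorem version, valid for any base rate."""
    tp = base_rate * true_positive_rate
    fp = (1 - base_rate) * false_positive_rate
    return tp / (tp + fp)

shortcut = hit_rate_equal_base_rates(0.75, 0.40)
full = hit_rate_bayes(0.5, 0.75, 0.40)
print(f"{shortcut:.2f}")  # 0.65
print(f"{full:.2f}")      # 0.65
```

With a 50/50 split the base rate appears in both the numerator and the denominator and cancels, which is why the two results agree.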

40
When can a test help? (Rule 3)
  • A test result can only help if the base rate of
    the more numerous class (here, positive) is less
    than the ratio of the true negative rate to the
    sum of the true and false negative rates

41
When can a test help? (Rule 3)
  • A test result can only help if the base rate of
    the more numerous class (say, positive) is less
    than the ratio of the true negative rate to the
    sum of the true and false negative rates
  • A test of maladjustment correctly classifies 85%
    of maladjusted girls, but only mis-identifies 15%
    of adjusted girls. What base rates are needed to
    support these ratios? (Assume, reasonably, that
    there are more adjusted than maladjusted girls.)
  • The ratio of the true negative rate to the sum of
    the true and false negative rates = 0.85 true
    negative / (0.85 true negative + 0.15 false
    negative) = 0.85. The test can only help if less
    than 85% of girls are well-adjusted.
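
One way to read this rule is as a comparison of overall accuracy: the test against simply assigning everyone to the more numerous class. The sketch below takes that reading, which is an assumption, and it agrees with the 85% threshold here because the true positive and true negative rates are both 0.85. The function names are illustrative.

```python
def overall_accuracy(base_rate_adjusted, true_positive_rate, true_negative_rate):
    """Proportion of all girls the test classifies correctly."""
    return (base_rate_adjusted * true_negative_rate
            + (1 - base_rate_adjusted) * true_positive_rate)

def beats_base_rate(base_rate_adjusted, true_positive_rate, true_negative_rate):
    """True if the test is right more often than always guessing
    'well-adjusted' (assumed to be the more numerous class)."""
    accuracy = overall_accuracy(
        base_rate_adjusted, true_positive_rate, true_negative_rate
    )
    return accuracy > base_rate_adjusted

print(beats_base_rate(0.80, 0.85, 0.85))  # True
print(beats_base_rate(0.90, 0.85, 0.85))  # False
```

If 90% of girls are well-adjusted, calling everyone well-adjusted is right 90% of the time, which the 85%-accurate test cannot match.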

42
What does this have to do with cutting lines?
  • The proportion of people selected (diagnosed,
    chosen) from a sample is called the selection
    ratio
  • When positive/negative base rates are not equal,
    there is a (fairly brutal) trade-off between the
    accuracy (error rate) of a diagnosis or
    prediction, and the size of the selection ratio

43
The brutal trade-off
  • If you want to be very sure you are right, you
    can speak of only a very small proportion of the
    sample (and you need a very large sample to get
    the cut-off points!)
  • If you want to say something about everyone, then
    you must be prepared to be uncertain about your
    cut-off points, and wrong very often.
  • In short: you can be certain about a few people,
    or uncertain about a lot of people. Take your
    pick!

44
False negative: Incorrectly unselected
False positive: Incorrectly selected
45
Low false negative rate
High false positive rate
Rewarding incompetence
46
High false negative rate
Low false positive rate
Ignoring competence
47
Sensitivity and Specificity
  • The sensitivity of a test: The probability of
    having a positive test result when the disease is
    present
  • P(Positive result | Disease present) = True
    positive rate
  • The specificity of a test: The probability of
    having a negative test result when the disease is
    absent
  • P(Negative result | Disease absent) = True
    negative rate
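
Both quantities can be read off the four cells of a 2x2 table. The sketch below uses the counts implied by the earlier soldier example (N = 10,000, 500 bad, 275 of them flagged; 9,500 good, 7,695 of them not flagged); the function names are illustrative assumptions.

```python
def sensitivity(true_pos, false_neg):
    """P(positive result | condition present) = true positive rate."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """P(negative result | condition absent) = true negative rate."""
    return true_neg / (true_neg + false_pos)

# Counts from the N = 10,000 example with a 5% base rate.
true_pos, false_neg = 275, 500 - 275      # bad soldiers: flagged, missed
true_neg, false_pos = 7695, 9500 - 7695   # good soldiers: cleared, flagged

print(f"{sensitivity(true_pos, false_neg):.2f}")  # 0.55
print(f"{specificity(true_neg, false_pos):.2f}")  # 0.81
```

Neither number depends on the base rate, which is why they describe the test itself rather than how trustworthy a given result is in a particular population.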

48
SENSITIVITY
False negative: Incorrectly unselected
True positive: Correctly selected
True negative: Correctly unselected
False positive: Incorrectly selected
SPECIFICITY
49
What to do?
  • 1.) Obviously, sometimes we can be satisfied with
    a small improvement on true negative base rates
    and with a large false positive rate
  • As we have noted earlier, we don't mind mistakenly
    diagnosing 90 brain tumors in order not to miss
    20.
  • 2.) Successive hurdles: Take a chance, allow
    errors, and give the expensive, time-consuming,
    but accurate tests to those who are selected out
    from a first pass of a less expensive, less
    time-consuming, and less accurate test
  • Repeat as necessary...

50
What to do?
  • 3.) Sometimes we can find sub-populations with
    less extreme base rates than in the
    world-at-large
  • If our referrals are well-screened, we can have
    more confidence in base rates that are less
    onerous (closer to being equal) than they would
    be in the world at large

51
What to do?
  • 4.) Sometimes "so what?" is the right thing to
    say.
  • Since testing with any accuracy is so difficult
    to do well, we should not bother to give tests
    that don't lead to real changes in therapy or
    other treatment
  • If you can identify good therapy candidates with
    70% accuracy, so what? Will you then ignore or
    refuse to treat those who don't make the cut?
  • If not, don't waste time and effort giving the
    test

52
What to do?
  • Gather base rate information.