Cost-Sensitive Classifier Evaluation using Cost Curves

1
Cost-Sensitive Classifier Evaluation using Cost Curves
MLJ 2006
  • Robert Holte
  • Computing Science Dept.
  • University of Alberta

Joint work with Chris Drummond, NRC, Ottawa
Cost Curve Tool programmed by Alden Flatt
2
Classifiers
  • A classifier assigns an object to one of a
    predefined set of categories or classes.
  • Example: a metal detector either
      • sounds an alarm, or
      • stays quiet when someone walks through.
  • This talk: only 2 classes, positive and
    negative.

3
Two Types of Error
  • False positive (false alarm), FP:
    alarm sounds but person is not carrying metal
  • False negative (miss), FN:
    alarm doesn't sound but person is carrying metal

4
2-class Confusion Matrix
  • Reduce the 4 numbers to two rates:
      • true positive rate TP = (number of true positives)/(number of positives)
      • false positive rate FP = (number of false positives)/(number of negatives)
  • Rates are independent of class ratio
    (subject to certain conditions)
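The reduction from four counts to two rates can be sketched as follows. This is a minimal illustration; the function name `rates` and the counts are hypothetical, not taken from the talk. The second call keeps the per-class behaviour fixed while making negatives ten times more common, which leaves both rates unchanged.

```python
def rates(tp_count, fn_count, fp_count, tn_count):
    """Return (TP rate, FP rate) from the four confusion-matrix counts."""
    positives = tp_count + fn_count  # P: all actual positives
    negatives = fp_count + tn_count  # N: all actual negatives
    return tp_count / positives, fp_count / negatives


# Hypothetical counts: 100 positives and 100 negatives.
balanced = rates(60, 40, 20, 80)

# Same per-class behaviour, but ten times as many negatives.
skewed = rates(60, 40, 200, 800)

print(balanced, skewed)
```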
5
Example 3 classifiers
Classifier 1: TP = 0.4, FP = 0.3
Classifier 2: TP = 0.7, FP = 0.5
Classifier 3: TP = 0.6, FP = 0.2
6
Assumptions
  • Standard Cost Model
      • correct classification costs 0
      • cost of misclassification depends only on the
        class, not on the individual example
      • costs are additive over a set of examples
  • True FP and TP do not vary with time or location,
    and are accurately estimated.
  • Costs and Class Distributions
      • are not known precisely at evaluation time
      • may vary with time
      • may depend on where the classifier is deployed

7
How to Evaluate Performance?
  • Scalar measure summarizing performance
      • Accuracy
      • Expected cost
      • Area under the ROC curve
  • Performance Visualization Techniques
      • ROC curve
      • Cost Curve

8
Is AUC = 0.95 better than AUC = 0.75?
When positives outnumber negatives 25:1, the classifier with
AUC = 0.95 has more than twice the error rate of the one with
AUC = 0.75.
9
The Key Question: When?
[Figure: two classifiers, labeled A and B]
The key question is: when is A better than B?
10
What's Genuinely Good About Scalar Measures?
  • we know how to average them, compute confidence
    intervals, test for significance, etc.
  • being one-dimensional leaves the second dimension
    free for other uses, e.g.
      • Learning curves
      • Multiple datasets
  • easily generalize to any number of classes

11
Cost Curves
Classifier 1: TP = 0.4, FP = 0.3
Classifier 2: TP = 0.7, FP = 0.5
Classifier 3: TP = 0.6, FP = 0.2
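Each classifier becomes a straight line in cost space. The sketch below draws on the relation stated later in the talk, Y = FN·X + FP·(1-X) with X = p(+) and FN = 1 - TP, and evaluates it for the three example classifiers. The function name `error_rate` and the chosen X values are illustrative assumptions.

```python
def error_rate(tp, fp, x):
    """Error rate at X = p(+): Y = FN*X + FP*(1 - X), where FN = 1 - TP."""
    fn = 1.0 - tp
    return fn * x + fp * (1.0 - x)


# (TP, FP) pairs for the three example classifiers.
classifiers = {1: (0.4, 0.3), 2: (0.7, 0.5), 3: (0.6, 0.2)}

for x in (0.0, 0.25, 0.5, 0.9, 1.0):
    row = {name: round(error_rate(tp, fp, x), 3)
           for name, (tp, fp) in classifiers.items()}
    best = min(row, key=row.get)  # lowest line at this X
    print(f"p(+) = {x:.2f}  error rates = {row}  lowest: classifier {best}")
```

With these numbers, the lines for classifiers 2 and 3 cross at p(+) = 0.75: classifier 3 has the lower error rate below that point and classifier 2 above it. The line for classifier 1 lies above the line for classifier 3 for every value of p(+).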
12
Operating Range
13
Lower Envelope
14
Varying a Threshold
always negative
always positive
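A rough sketch of how varying a threshold produces a family of cost lines, bounded by the two trivial classifiers, and how their lower envelope can be read off. The scores, labels, and function names are hypothetical and only illustrate the idea; the error-rate line Y = FN·X + FP·(1-X) is the one used elsewhere in the talk.

```python
def threshold_points(scores, labels):
    """(FP rate, TP rate) for each threshold, from 'always negative' to 'always positive'."""
    pos = sum(labels)
    neg = len(labels) - pos
    points = [(0.0, 0.0)]  # threshold above every score: always negative
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points  # the lowest threshold gives (1.0, 1.0): always positive


def lower_envelope(points, x):
    """Smallest error rate over all thresholds at X = p(+)."""
    return min((1.0 - tp) * x + fp * (1.0 - x) for fp, tp in points)


# Hypothetical classifier scores and true labels (1 = positive).
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

pts = threshold_points(scores, labels)
for x in (0.1, 0.5, 0.9):
    print(f"p(+) = {x:.1f}  envelope = {lower_envelope(pts, x):.3f}")
```

Because the two trivial classifiers are always among the candidates, the envelope can never exceed min(X, 1-X).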
15
Taking Costs Into Account
Y = FN·X + FP·(1-X)
So far, X = p(+), making Y = error rate.
With costs taken into account, Y = expected cost normalized to [0,1].
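The transcript does not preserve the definition of X once costs are included. The sketch below assumes the usual cost-curve definition, in which X is the "probability cost" p(+)·C(-|+) / (p(+)·C(-|+) + p(-)·C(+|-)), with C(-|+) the cost of a miss and C(+|-) the cost of a false alarm. The function names and the example costs are hypothetical.

```python
def probability_cost(p_pos, cost_fn, cost_fp):
    """X axis with costs (assumed definition): share of the maximum expected
    cost that is attributable to the positive class."""
    weighted_pos = p_pos * cost_fn          # p(+) * C(-|+)
    weighted_neg = (1.0 - p_pos) * cost_fp  # p(-) * C(+|-)
    return weighted_pos / (weighted_pos + weighted_neg)


def normalized_expected_cost(tp, fp, x):
    """Y = FN*X + FP*(1 - X); always lies in [0, 1]."""
    return (1.0 - tp) * x + fp * (1.0 - x)


# With equal misclassification costs, X reduces to p(+) and Y to error rate.
x_equal = probability_cost(0.2, 1.0, 1.0)

# Hypothetical scenario: positives are rare (20%) but a miss costs 5x a false alarm.
x_costly = probability_cost(0.2, 5.0, 1.0)

print(x_equal, x_costly, normalized_expected_cost(0.6, 0.2, x_costly))
```

One consequence of this definition is that a change in costs and a change in class distribution both act only by moving X, so a single curve covers both.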
16
Averaging Cost Curves
17
Cost Curve Avg. in ROC Space
18
Confidence Interval Example
19
Paired Resampling to Test Statistical Significance
For the 100 test examples in the negative class:
FP for classifier1 = (30+10)/100 = 0.40
FP for classifier2 = (30+0)/100 = 0.30
FP2 - FP1 = -0.10
Resample this matrix 10000 times to get (FP2-FP1)
values. Do the same for the matrix based on
positive test examples. Plot and take 95%
envelope as before.
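The resampling step for the negative class might look like the sketch below. The cell counts follow the arithmetic above (30 negatives misclassified by both classifiers, 10 by classifier1 only, 0 by classifier2 only); the remaining 60 correct-by-both examples are inferred from the total of 100. Taking a simple percentile interval on FP2-FP1 is an assumption made here for illustration and stands in for the envelope construction in the talk, which also uses the positive-class differences.

```python
import random

# Paired outcomes per negative example: (classifier1 wrong, classifier2 wrong).
pairs = [(1, 1)] * 30 + [(1, 0)] * 10 + [(0, 1)] * 0 + [(0, 0)] * 60


def resample_differences(pairs, n_resamples=10000, seed=0):
    """Bootstrap the paired outcomes and return the sorted FP2 - FP1 values."""
    rng = random.Random(seed)
    n = len(pairs)
    diffs = []
    for _i in range(n_resamples):
        sample = rng.choices(pairs, k=n)  # draw n examples with replacement
        fp1 = sum(a for a, b in sample) / n
        fp2 = sum(b for a, b in sample) / n
        diffs.append(fp2 - fp1)
    return sorted(diffs)


diffs = resample_differences(pairs)
low = diffs[int(0.025 * len(diffs))]
high = diffs[int(0.975 * len(diffs)) - 1]
print(f"FP2 - FP1: 95% percentile interval = [{low:.2f}, {high:.2f}]")
```

Resampling the paired outcomes, rather than each classifier separately, keeps the correlation between the two classifiers' errors in every resample.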
20
Statistical Significance Example
[Figure: curves for classifier1 and classifier2, with the differences FP2-FP1 and FN2-FN1]
21
Correlation between Classifiers
High Correlation (the preceding example)
Low Correlation
22
Low Correlation = Low Significance
[Figure: curves for classifier1 and classifier2, with the differences FP2-FP1 and FN2-FN1]
23
Limited Range of Significance
24
Comparing J48 and AdaBoost
25
Multiple Comparisons
26
Learning Curves
27
Conclusions
  • Scalar performance measures, including AUC, do
    not indicate when one classifier is better than
    another.
  • Cost curves enable easy visualization of:
      • average performance (expected cost)
      • operating range
      • confidence intervals on performance
      • difference in performance and its significance
  • See MLJ 2006 paper for all the details
  • Cost/ROC curve software is available.
    Contact holte_at_cs.ualberta.ca