Three Papers: AUC, PFA and BIOInformatics - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Three Papers: AUC, PFA and BIOInformatics


1
Three Papers: AUC, PFA and BIOInformatics
  • The three papers are posted online

2
Learning Algorithms for Better Ranking
  • Jin Huang, Charles X. Ling: Using AUC and
    Accuracy in Evaluating Learning Algorithms. IEEE
    Trans. Knowl. Data Eng. 17(3): 299-310 (2005)
  • Find the citations online (Google Scholar)
  • Goal: Accuracy vs. Ranking
  • Secondary Goal: Decision Trees vs. Bayesian
    Networks in Ranking
  • Design Algorithms That Directly Optimize Ranking

3
Accuracy not good enough
[Figure: the test examples of two classifiers ranked
left to right (higher ranking more desirable), with a
cutoff line separating the predicted classes]
  • Two classifiers: Classifier 1 and Classifier 2
  • Accuracy of Classifier 1 = 4/5; accuracy of
    Classifier 2 = 4/5. But intuitively, Classifier 1
    is better!
4
Accuracy vs ranking
  • Accuracy-based evaluation makes two assumptions:
    balanced class distribution and equal costs for
    misclassification
  • Ranking sets these assumptions aside
  • Problem: training examples are labeled, not
    ranked
  • How to evaluate ranking?

5
ROC curve
(Provost & Fawcett, AAAI'97)
6
How to calculate AUC
  • Rank the test examples in increasing order of
    predicted score
  • Let ri be the rank of the ith positive example
    (further to the right, i.e. higher ri, is better)
  • S0 = sum of the ri
  • AUC = (S0 - n0(n0+1)/2) / (n0 n1), where n0 and n1
    are the numbers of positive and negative examples
  • (Hand & Till, 2001, MLJ)

7
An example
Classifier 1
ri = 5, 7, 8, 9, 10 (the better result)
S0 = 5+7+8+9+10 = 39
AUC = (39 - 5x6/2) / 25 = 24/25 (see the sketch below)
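
Not part of the original slides: a minimal Python
sketch of the rank-based AUC computation from the two
slides above; the function name rank_auc and the
variable names are illustrative.

def rank_auc(ranks, n_pos, n_neg):
    # Hand & Till (2001): AUC = (S0 - n0(n0+1)/2) / (n0 * n1),
    # where S0 is the sum of the ranks of the positive examples.
    s0 = sum(ranks)
    return (s0 - n_pos * (n_pos + 1) / 2) / (n_pos * n_neg)

# Classifier 1 from the slide: positives ranked 5, 7, 8, 9, 10
# among 10 test examples (5 positive, 5 negative).
print(rank_auc([5, 7, 8, 9, 10], n_pos=5, n_neg=5))  # 0.96 = 24/25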
8
ROC curve and AUC
  • If curve A dominates curve D, then A is better
    than D
  • Often two curves A and B do not dominate each
    other
  • AUC: the area under the ROC curve
  • An overall performance measure
  • Use AUC for evaluating ranking (see the sketch
    below)
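
Not part of the original slides: a small sketch of how
an ROC curve and its AUC are typically obtained from
predicted scores, here using scikit-learn; the labels
and scores below are made up for illustration.

import numpy as np
from sklearn.metrics import roc_curve, roc_auc_score

y_true  = np.array([0, 1, 0, 1, 0, 1, 1, 0, 1, 0])              # true labels (illustrative)
y_score = np.array([.1, .9, .3, .7, .4, .8, .6, .5, .65, .2])   # predicted scores

fpr, tpr, thresholds = roc_curve(y_true, y_score)  # points of the ROC curve
print(roc_auc_score(y_true, y_score))              # area under that curve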

9
ROC curve and AUC
  • Traditional learning algorithms produce poor
    probability estimates as a by-product
  • Decision tree algorithms
  • Strategies to improve them
  • How about Bayesian network learning algorithms?

10
Evaluation of Classifiers
  • Classification accuracy or error rate.
  • ROC curve and AUC.

11
AUC
  • Two classifiers

The AUC of Classifier 1 = 24/25; the AUC of
Classifier 2 = 16/25. Classifier 1 is better than
Classifier 2!
12
AUC is more discriminating
  • For N examples:
  • (N+1) different accuracy values
  • N(N+1)/2 different AUC values
  • AUC is a better and more discriminating
    evaluation measure than accuracy

13
Naïve Bayes vs C4.4
Overall, Naïve Bayes outperforms C4.4 in AUC
(Ling & Zhang, submitted, 2002)
14
PCA in Face Recognition
15
Problem with PCA
  • The features are principal components
  • Thus they do not correspond directly to the
    original features
  • Problem for face recognition: we wish to pick a
    subset of the original features rather than
    composite ones
  • Principal Feature Analysis (PFA): pick the best
    uncorrelated subset of features of a data set
    (see the sketch below)
  • Equivalent to finding q dimensions of a random
    variable X = [x1, x2, ..., xn]^T
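
Not part of the original slides: a minimal sketch of
Principal Feature Analysis under its usual formulation
(cluster the rows of the matrix of leading principal
directions and keep one representative original
feature per cluster). The function name and the choice
of exactly q clusters are simplifying assumptions; the
algorithm on the later slides may differ in detail.

import numpy as np
from sklearn.cluster import KMeans

def principal_feature_analysis(X, q):
    # Eigen-decomposition of the covariance matrix (PCA without projecting the data).
    cov = np.cov(X, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)    # eigenvalues in ascending order
    A_q = eigvecs[:, ::-1][:, :q]             # n_features x q; ith row = ith feature
    # Cluster the feature representations and keep, per cluster,
    # the feature whose row lies closest to the cluster centre.
    km = KMeans(n_clusters=q, n_init=10, random_state=0).fit(A_q)
    selected = []
    for c in range(q):
        members = np.where(km.labels_ == c)[0]
        dists = np.linalg.norm(A_q[members] - km.cluster_centers_[c], axis=1)
        selected.append(members[np.argmin(dists)])
    return sorted(selected)                   # indices of retained original features

# Example: 200 samples, 6 features, keep 3 of the original features.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
print(principal_feature_analysis(X, q=3))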

16
How to find the q features?
[Figure: the matrix whose columns are the first q
principal components q1, q2, q3, ...; its ith row
corresponds to the ith original feature]
17
The subspace
18
Algorithm
19
Result
20
When PCA does not work
21
PCA + Clustering: Bad Idea
22
More
23
Rand Index for Clusters (Partitions)
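
Not part of the original transcript (the slide body
was not captured): assuming the slide presents the
standard Rand index for comparing two partitions, here
is a minimal pair-counting sketch; the function name
and example labelings are illustrative.

from itertools import combinations

def rand_index(labels_a, labels_b):
    # Fraction of example pairs on which the two partitions agree:
    # both place the pair in the same cluster, or both split it.
    agree = 0
    pairs = list(combinations(range(len(labels_a)), 2))
    for i, j in pairs:
        same_a = labels_a[i] == labels_a[j]
        same_b = labels_b[i] == labels_b[j]
        agree += (same_a == same_b)
    return agree / len(pairs)

print(rand_index([0, 0, 1, 1, 2], [0, 0, 1, 2, 2]))  # 0.8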
24
Results