Discriminative and Generative Classifiers - PowerPoint PPT Presentation

1
Discriminative and Generative Classifiers
  • Tom Mitchell
  • Statistical Approaches to Learning and Discovery,
    10-702 and 15-802
  • March 19, 2003
  • Lecture based on "On Discriminative vs.
    Generative Classifiers: A comparison of logistic
    regression and naïve Bayes," A. Ng and M. Jordan,
    NIPS 2002.

2
Lecture Outline
  • Generative and Discriminative classifiers
  • Asymptotic comparison (as the number of examples grows)
  • when model correct
  • when model incorrect
  • Non-asymptotic analysis
  • convergence of parameter estimates
  • convergence of expected error
  • Experimental results

3
Generative vs. Discriminative Classifiers
  • Training classifiers involves estimating f : X → Y,
    or P(Y|X)
  • Discriminative classifiers
  • Assume some functional form for P(Y|X)
  • Estimate parameters of P(Y|X) directly from
    training data
  • Generative classifiers (also called "informative"
    by Rubinstein & Hastie)
  • Assume some functional form for P(X|Y), P(Y)
  • Estimate parameters of P(X|Y), P(Y) directly from
    training data
  • Use Bayes rule to calculate P(Y | X = xi)
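The two recipes above can be sketched numerically. The following is a minimal illustration, not from the lecture (the function names, toy data, and training settings are ours): the generative route estimates P(Y) and P(xi|Y) and applies Bayes rule, while the discriminative route fits P(Y|X) directly.

```python
# Illustrative sketch only: names and data are ours, not the lecture's.
import numpy as np

rng = np.random.default_rng(0)

# Toy data: Y ~ Bernoulli(0.5); each boolean feature xi depends on Y.
m, n = 5000, 3
Y = rng.integers(0, 2, size=m)
p_feat = np.where(Y[:, None] == 1, 0.8, 0.3)      # P(xi = 1 | Y)
X = (rng.random((m, n)) < p_feat).astype(int)

def fit_generative(X, Y):
    """Estimate P(Y) and P(xi=1 | Y), then classify via Bayes rule."""
    prior = Y.mean()                               # estimate of P(Y=1)
    theta = np.array([X[Y == c].mean(axis=0) for c in (0, 1)])
    def predict(x):
        # log P(Y=c) + sum_i log P(xi | Y=c), for c in {0, 1}
        log_joint = [
            np.log([1 - prior, prior][c])
            + np.sum(x * np.log(theta[c]) + (1 - x) * np.log(1 - theta[c]))
            for c in (0, 1)
        ]
        return int(log_joint[1] > log_joint[0])    # argmax_c P(Y=c | x)
    return predict

def fit_discriminative(X, Y, steps=500, lr=0.1):
    """Fit P(Y=1|x) = sigmoid(w.x + b) directly by gradient ascent."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        w += lr * (X.T @ (Y - p)) / len(Y)
        b += lr * (Y - p).mean()
    return lambda x: int(x @ w + b > 0)

nb, lr_clf = fit_generative(X, Y), fit_discriminative(X, Y)
x_new = np.array([1, 1, 1])
print(nb(x_new), lr_clf(x_new))   # both should predict class 1 here
```

With ample data the two agree on easy points like this one; the lecture's question is how they differ when m is small or the assumed model is wrong.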

4
Generative-Discriminative Pairs
Example assume Y boolean, X ltx1, x2, , xngt,
where xi are boolean, perhaps dependent on Y,
conditionally independent given Y Generative
model naïve Bayes Classify new example x
based on ratio Equivalently, based on sign of
log of this ratio
s indicates size of set. l is smoothing parameter
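A minimal numeric check of that smoothed ("MAP-style") parameter estimate, which we take to be P̂(xi=1 | Y=b) = (#{xi=1, Y=b} + l) / (#{Y=b} + 2l); the function name and toy data below are ours:

```python
# Sketch of the smoothed count estimate; names and data are illustrative.
def smoothed_estimate(xi, Y, b, l=1.0):
    """Estimate P(xi = 1 | Y = b) with smoothing parameter l."""
    n_b = sum(1 for y in Y if y == b)                          # #{Y = b}
    n_xb = sum(1 for x, y in zip(xi, Y) if x == 1 and y == b)  # #{xi=1, Y=b}
    return (n_xb + l) / (n_b + 2 * l)

xi = [1, 1, 0, 1, 0, 0]
Y  = [1, 1, 1, 0, 0, 0]
print(smoothed_estimate(xi, Y, b=1, l=0))   # 2/3, the raw count ratio
print(smoothed_estimate(xi, Y, b=1, l=1))   # (2+1)/(3+2) = 0.6
```

With l = 0 this is the maximum-likelihood count ratio; l > 0 pulls estimates toward 1/2, which avoids zero probabilities for rare feature values.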
5
Generative-Discriminative Pairs
Example assume Y boolean, X ltx1, x2, , xngt,
where xi are boolean, perhaps dependent on Y,
conditionally independent given Y Generative
model naïve Bayes Classify
new example x based on ratio Discriminative
model logistic regression Note both learn
linear decision surface over X in this case
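The claim that naïve Bayes also yields a linear decision surface can be checked directly: for boolean xi, the log-odds log[P(Y=1|x)/P(Y=0|x)] rearranges into w·x + b. A small sketch (the parameter values and names are illustrative, not from the lecture):

```python
# Verify that the Bernoulli naive Bayes log-odds is linear in x.
# Parameter values below are made up for illustration.
import math

prior = 0.5                      # P(Y=1)
theta1 = [0.8, 0.7, 0.9]         # P(xi=1 | Y=1)
theta0 = [0.3, 0.4, 0.2]         # P(xi=1 | Y=0)

def log_odds_direct(x):
    """log [P(Y=1|x) / P(Y=0|x)] computed term by term from Bayes rule."""
    s = math.log(prior) - math.log(1 - prior)
    for xi, t1, t0 in zip(x, theta1, theta0):
        p1 = t1 if xi else 1 - t1
        p0 = t0 if xi else 1 - t0
        s += math.log(p1) - math.log(p0)
    return s

# The same quantity rearranged as a linear function w.x + b:
w = [math.log(t1 / t0) - math.log((1 - t1) / (1 - t0))
     for t1, t0 in zip(theta1, theta0)]
b = (math.log(prior / (1 - prior))
     + sum(math.log((1 - t1) / (1 - t0)) for t1, t0 in zip(theta1, theta0)))

for x in [(0, 0, 0), (1, 0, 1), (1, 1, 1)]:
    linear = sum(wi * xi for wi, xi in zip(w, x)) + b
    assert abs(log_odds_direct(x) - linear) < 1e-12
print("naive Bayes log-odds is linear in x")
```

So the two models share a hypothesis class here; they differ only in how the weights are estimated, which is what drives the asymptotic and finite-sample comparisons that follow.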
6
What is the difference asymptotically?
  • Notation: let ε(hA,m) denote the error of the
    hypothesis learned via algorithm A from m examples
  • If the assumed model is correct (e.g., the naïve
    Bayes model) and has a finite number of parameters,
    then ε(hGen,∞) = ε(hDis,∞)
  • If the assumed model is incorrect:
    ε(hDis,∞) ≤ ε(hGen,∞)
  • Note: the assumed discriminative model can be correct
    even when the generative model is incorrect, but not
    vice versa

7
Rate of convergence: logistic regression
Let hDis,m be logistic regression trained on m
examples in n dimensions. Then with high probability

  ε(hDis,m) ≤ ε(hDis,∞) + O( sqrt( (n/m) log(m/n) ) )

Implication: if we want ε(hDis,m) ≤ ε(hDis,∞) + ε0
for some constant ε0, it suffices to pick m = O(n).
Converges to the best linear classifier in on the
order of n examples (the result follows from Vapnik's
structural risk bound, plus the fact that the VC
dimension of n-dimensional linear separators is n).
8
Rate of convergence: naïve Bayes
Consider first how quickly the parameter estimates
converge toward their asymptotic values. Then we'll
ask how this influences the rate of convergence
toward the asymptotic classification error.
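A quick simulation of the first question (ours, not from the slides): the estimate of a single parameter P(xi=1 | Y=1) approaches its asymptotic (true) value as m grows, at roughly a 1/sqrt(m) rate.

```python
# Illustrative simulation: convergence of one naive Bayes parameter
# estimate toward its asymptotic value. Seed and values are ours.
import numpy as np

rng = np.random.default_rng(1)
true_theta = 0.7                 # asymptotic value of P(xi=1 | Y=1)
errors = []
for m in [10, 100, 1000, 10000]:
    xi = (rng.random(m) < true_theta).astype(int)   # samples of xi given Y=1
    errors.append(abs(xi.mean() - true_theta))      # |estimate - asymptote|
print(errors)   # shrinks roughly like 1/sqrt(m)
```

Because each parameter is a simple frequency, the estimates concentrate quickly; the lecture's point is that only m = O(log n) examples are needed to get all n of them uniformly close.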
9
Rate of convergence: naïve Bayes parameters
  • Key result (Ng & Jordan): with high probability, the
    naïve Bayes parameter estimates are all within ε of
    their asymptotic values after m = O(log n) examples
10
Rate of convergence: naïve Bayes classification
error
See blackboard.
  • Upshot (Ng & Jordan): ε(hGen,m) approaches
    ε(hGen,∞) after m = O(log n) examples, compared
    with the O(n) needed by logistic regression
11
Some experiments from UCI data sets
12
Pairs of plots comparing naïve Bayes and logistic
regression with a quadratic regularization penalty.
Left plots show training error vs. number of
examples; right plots show test error. Each row uses
a different regularization penalty: the top row uses
a small penalty, and the penalty increases as you
move down the page. Thanks to John Lafferty.
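In the same spirit as these plots, here is a small synthetic simulation (ours; the data-generating process, sample sizes, and training settings are assumptions) of test error vs. number of training examples m for both classifiers when the naïve Bayes assumption holds:

```python
# Illustrative learning-curve simulation; all settings are ours.
import numpy as np

rng = np.random.default_rng(2)
n = 10                                    # number of boolean features

def sample(m):
    Y = rng.integers(0, 2, size=m)
    p = np.where(Y[:, None] == 1, 0.65, 0.35)   # P(xi=1 | Y)
    return (rng.random((m, n)) < p).astype(int), Y

Xte, Yte = sample(5000)                   # held-out test set

def nb_error(Xtr, Ytr, eps=1.0):
    """Test error of smoothed Bernoulli naive Bayes fit on (Xtr, Ytr)."""
    prior = (Ytr.sum() + eps) / (len(Ytr) + 2 * eps)
    th = [(Xtr[Ytr == c].sum(axis=0) + eps) / ((Ytr == c).sum() + 2 * eps)
          for c in (0, 1)]
    s = (np.log(prior) - np.log(1 - prior)
         + Xte @ (np.log(th[1]) - np.log(th[0]))
         + (1 - Xte) @ (np.log(1 - th[1]) - np.log(1 - th[0])))
    return ((s > 0).astype(int) != Yte).mean()

def lr_error(Xtr, Ytr, steps=300, lr=0.5):
    """Test error of logistic regression fit by gradient ascent."""
    w, b = np.zeros(n), 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(Xtr @ w + b)))
        w += lr * Xtr.T @ (Ytr - p) / len(Ytr)
        b += lr * (Ytr - p).mean()
    return (((Xte @ w + b) > 0).astype(int) != Yte).mean()

results = {}
for m in [20, 200, 2000]:
    Xtr, Ytr = sample(m)
    results[m] = (nb_error(Xtr, Ytr), lr_error(Xtr, Ytr))
print(results)
```

With the model assumption satisfied, both classifiers approach the same asymptote as m grows; the paper's claim, which the UCI plots probe, is that naïve Bayes typically gets close with fewer examples while logistic regression has the better asymptote when the model is wrong.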