Pattern Classification
All materials in these slides were taken from Pattern Classification (2nd ed.) by R. O. Duda, P. E. Hart and D. G. Stork, John Wiley & Sons.
1
Pattern Classification
All materials in these slides were taken from
Pattern Classification (2nd ed.) by R. O. Duda,
P. E. Hart and D. G. Stork, John Wiley & Sons,
2000, with the permission of the authors and the publisher
2
Chapter 2 (Part 3): Bayesian Decision Theory
(Sections 2.6, 2.9)
  • Discriminant Functions for the Normal Density
  • Bayes Decision Theory: Discrete Features

3
Discriminant Functions for the Normal Density
  • We saw that minimum error-rate classification
    can be achieved by the discriminant function
  • g_i(x) = ln p(x | ω_i) + ln P(ω_i)
  • Case of multivariate normal: p(x | ω_i) ~ N(μ_i, Σ_i), which gives
  • g_i(x) = −(1/2)(x − μ_i)^t Σ_i^(−1) (x − μ_i) − (d/2) ln 2π − (1/2) ln |Σ_i| + ln P(ω_i)
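The Gaussian discriminant g_i(x) = −(1/2)(x − μ_i)^t Σ_i^(−1)(x − μ_i) − (d/2) ln 2π − (1/2) ln |Σ_i| + ln P(ω_i) can be evaluated directly with numpy. A minimal sketch follows; the two classes (means, covariances, priors) are made-up illustrative values, not from the slides:

```python
import numpy as np

def gaussian_discriminant(x, mu, sigma, prior):
    """g_i(x) = -1/2 (x-mu)^t Sigma^-1 (x-mu) - d/2 ln 2pi - 1/2 ln|Sigma| + ln P(w_i)."""
    d = len(mu)
    diff = x - mu
    return (-0.5 * diff @ np.linalg.inv(sigma) @ diff
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(sigma))
            + np.log(prior))

# two illustrative classes (assumed parameters)
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
sigma = [np.eye(2), np.eye(2)]
priors = [0.5, 0.5]

x = np.array([0.5, 0.5])
scores = [gaussian_discriminant(x, mu[i], sigma[i], priors[i]) for i in range(2)]
print(int(np.argmax(scores)))  # → 0: the class whose discriminant is largest wins
```

With equal priors and equal covariances, the point nearer to a class mean gets the larger discriminant value.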

4
  • Case Σ_i = σ²I (I stands for the identity
    matrix)
  • What does Σ_i = σ²I say about the dimensions?
    (The features are statistically independent.)
  • What about the variance of each dimension?
    (Every dimension has the same variance, σ².)

5
  • We can further simplify by recognizing that the
    quadratic term x^t x implicit in the Euclidean norm
    is the same for all i, and may be dropped.
  • This yields the linear discriminant
  • g_i(x) = w_i^t x + w_i0, where w_i = μ_i / σ² and
    w_i0 = −μ_i^t μ_i / (2σ²) + ln P(ω_i)
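Since dropping the shared x^t x term leaves a discriminant that is linear in x, with equal priors the rule reduces to assigning x to the nearest mean. A short sketch (the shared variance, means, and priors below are assumed example values):

```python
import numpy as np

sigma2 = 1.0  # shared variance sigma^2 (assumed value)
mu = [np.array([0.0, 0.0]), np.array([3.0, 3.0])]
priors = [0.5, 0.5]

def linear_g(x, mu_i, prior):
    # g_i(x) = w_i^t x + w_i0 with w_i = mu_i / sigma^2,
    # w_i0 = -mu_i^t mu_i / (2 sigma^2) + ln P(w_i)
    w = mu_i / sigma2
    w0 = -mu_i @ mu_i / (2 * sigma2) + np.log(prior)
    return w @ x + w0

x = np.array([1.0, 1.0])
scores = [linear_g(x, mu[i], priors[i]) for i in range(2)]
nearest = int(np.argmin([np.linalg.norm(x - m) for m in mu]))
print(int(np.argmax(scores)) == nearest)  # → True: equal priors give the nearest-mean rule
```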

6
  • A classifier that uses linear discriminant
    functions is called a linear machine
  • The decision surfaces for a linear machine are
    pieces of hyperplanes defined by
  • g_i(x) = g_j(x)
  • The equation can be written as
  • w^t (x − x_0) = 0, where w = μ_i − μ_j and
  • x_0 = (1/2)(μ_i + μ_j) − [σ² / ‖μ_i − μ_j‖²] ln [P(ω_i) / P(ω_j)] (μ_i − μ_j)

7
  • The hyperplane separating R_i and R_j is
    always orthogonal to the line linking the means!
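In the Σ_i = σ²I case, the boundary point x_0 = (1/2)(μ_i + μ_j) − [σ²/‖μ_i − μ_j‖²] ln[P(ω_i)/P(ω_j)](μ_i − μ_j) lies where the two discriminants are equal, even for unequal priors. A numeric check (all parameter values are illustrative assumptions):

```python
import numpy as np

sigma2 = 1.0
mu_i, mu_j = np.array([0.0, 0.0]), np.array([3.0, 3.0])
p_i, p_j = 0.7, 0.3  # unequal priors shift x_0 along the line of means

w = mu_i - mu_j
x0 = 0.5 * (mu_i + mu_j) - sigma2 / (w @ w) * np.log(p_i / p_j) * w

def g(x, mu, prior):
    # spherical-covariance discriminant: -||x - mu||^2 / (2 sigma^2) + ln P(w_i)
    return -((x - mu) @ (x - mu)) / (2 * sigma2) + np.log(prior)

print(np.isclose(g(x0, mu_i, p_i), g(x0, mu_j, p_j)))  # → True: x_0 is on the boundary
```

Note that the normal vector w = μ_i − μ_j points along the line linking the means, which is exactly the orthogonality statement above.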

8–10
(Figure slides: decision regions for the Σ_i = σ²I case; no transcript)
11
  • Case Σ_i = Σ (the covariance matrices of all classes are
    identical but arbitrary!) Hyperplane separating
    R_i and R_j:
  • w^t (x − x_0) = 0, with w = Σ^(−1) (μ_i − μ_j) and
  • x_0 = (1/2)(μ_i + μ_j) − [ln (P(ω_i)/P(ω_j)) / ((μ_i − μ_j)^t Σ^(−1) (μ_i − μ_j))] (μ_i − μ_j)
  • (the hyperplane separating R_i and R_j is generally
    not orthogonal to the line between the means!)
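For a shared but non-spherical Σ, the hyperplane normal w = Σ^(−1)(μ_i − μ_j) is in general not parallel to μ_i − μ_j, so the boundary is tilted relative to the line of means. A quick check with an assumed example covariance:

```python
import numpy as np

# shared but non-spherical covariance (assumed example values)
sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
mu_i, mu_j = np.array([0.0, 0.0]), np.array([3.0, 3.0])

w = np.linalg.inv(sigma) @ (mu_i - mu_j)  # normal vector of the separating hyperplane
d = mu_i - mu_j                           # direction of the line linking the means

cos = w @ d / (np.linalg.norm(w) * np.linalg.norm(d))
print(abs(cos) < 1.0)  # → True: w and the mean line are not parallel,
                       #   so the hyperplane is not orthogonal to that line
```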

12–13
(Figure slides: decision boundaries for the Σ_i = Σ case; no transcript)
14
  • Case Σ_i = arbitrary
  • The covariance matrices are different for each
    category
  • The decision surfaces are hyperquadrics
  • (Hyperquadrics include hyperplanes, pairs of
    hyperplanes, hyperspheres, hyperellipsoids,
    hyperparaboloids, and hyperhyperboloids)
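One concrete way to see a curved decision surface: give two classes the same mean but different spreads. With μ_1 = μ_2 and Σ_1 = I, Σ_2 = 4I (assumed illustrative values), the boundary is a hypersphere around the shared mean, with the tighter Gaussian winning inside it:

```python
import numpy as np

def g(x, mu, sigma, prior):
    # full Gaussian discriminant with class-specific covariance
    d = len(mu)
    diff = x - mu
    return (-0.5 * diff @ np.linalg.inv(sigma) @ diff
            - 0.5 * d * np.log(2 * np.pi)
            - 0.5 * np.log(np.linalg.det(sigma))
            + np.log(prior))

mu = np.zeros(2)
s1, s2 = np.eye(2), 4.0 * np.eye(2)  # same mean, different spread

near, far = np.array([0.5, 0.0]), np.array([4.0, 0.0])
print(g(near, mu, s1, 0.5) > g(near, mu, s2, 0.5))  # → True: tight Gaussian wins near the mean
print(g(far, mu, s1, 0.5) > g(far, mu, s2, 0.5))    # → False: wide Gaussian wins far away
```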

15–16
(Figure slides: hyperquadric decision surfaces for arbitrary Σ_i; no transcript)
17
Bayes Decision Theory: Discrete Features
  • The components of x are binary or integer valued; x
    can take only one of m discrete values
  • v_1, v_2, …, v_m
  • ⇒ We are concerned with probabilities rather than
    probability densities in Bayes' formula:
  • P(ω_j | x) = P(x | ω_j) P(ω_j) / P(x), where P(x) = Σ_j P(x | ω_j) P(ω_j)

18
Bayes Decision Theory: Discrete Features
  • The conditional risk is defined as before:
  • R(α_i | x) = Σ_j λ(α_i | ω_j) P(ω_j | x)
  • The approach is still to minimize risk: select the
    action α_i for which R(α_i | x) is minimum

19
Bayes Decision Theory: Discrete Features
  • Case of independent binary features in a 2-category
    problem
  • Let x = (x_1, x_2, …, x_d)^t, where each x_i is
    either 0 or 1, with probabilities
  • p_i = P(x_i = 1 | ω_1)
  • q_i = P(x_i = 1 | ω_2)

20
Bayes Decision Theory: Discrete Features
  • Assuming conditional independence, P(x | ω_i) can be
    written as a product of component probabilities:
  • P(x | ω_1) = Π_i p_i^(x_i) (1 − p_i)^(1 − x_i)
  • P(x | ω_2) = Π_i q_i^(x_i) (1 − q_i)^(1 − x_i)

21
Bayes Decision Theory: Discrete Features
  • Taking our likelihood ratio:
  • P(x | ω_1) / P(x | ω_2) = Π_i (p_i / q_i)^(x_i) [(1 − p_i) / (1 − q_i)]^(1 − x_i)

22
  • The discriminant function in this case is
  • g(x) = Σ_i w_i x_i + w_0, where
  • w_i = ln [p_i (1 − q_i) / (q_i (1 − p_i))], i = 1, …, d
  • w_0 = Σ_i ln [(1 − p_i) / (1 − q_i)] + ln [P(ω_1) / P(ω_2)]
  • Decide ω_1 if g(x) > 0, ω_2 otherwise
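For independent binary features, the log of the posterior ratio is exactly linear in x, with weights w_i = ln[p_i(1 − q_i)/(q_i(1 − p_i))]. A sketch verifying this identity numerically (the p_i, q_i, and priors below are assumed example values):

```python
import numpy as np

p = np.array([0.8, 0.7, 0.6])  # P(x_i = 1 | w1), illustrative values
q = np.array([0.2, 0.3, 0.4])  # P(x_i = 1 | w2)
prior1, prior2 = 0.5, 0.5

# linear weights of the binary-feature discriminant
w = np.log(p * (1 - q) / (q * (1 - p)))
w0 = np.sum(np.log((1 - p) / (1 - q))) + np.log(prior1 / prior2)

def g(x):
    return w @ x + w0

def log_posterior_ratio(x):
    # direct computation of ln [P(x|w1) P(w1) / (P(x|w2) P(w2))]
    l1 = np.prod(p**x * (1 - p)**(1 - x)) * prior1
    l2 = np.prod(q**x * (1 - q)**(1 - x)) * prior2
    return np.log(l1 / l2)

x = np.array([1, 1, 0])
print(np.isclose(g(x), log_posterior_ratio(x)))  # → True: the linear form matches exactly
print(g(x) > 0)                                  # → True: decide w1
```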