Transcript and Presenter's Notes

Title: character recognition based on probability tree model


1
character recognition based on
probability tree model
  • Presenter: Huang Kaizhu

2
Outline
  • Introduction
  • How can probability be used in character
    recognition?
  • What is the probability tree model?
  • Two improvement directions
    • Integrate prior knowledge
    • Relax the tree structure into a hyper tree
  • Experiments in character recognition

3
Disease diagnosis problem
  • How does a doctor decide whether a patient has a cold?
  • A. Does the patient have a headache?
  • B. Does the patient have a sore throat?
  • C. Does the patient have a fever?
  • D. Can the patient breathe well through his nose?
  • Now a patient has the following symptoms:
  • A is no, B is yes, C is no, D is yes
  • What is the hidden principle the doctor uses in
    making a judgment?

4
Disease diagnosis problem (cont.)
  • A good doctor will get his answer by comparing
  • P1 = P(Cold=true, A=N, B=Y, C=N, D=Y)
  • vs.
  • P2 = P(Cold=false, A=N, B=Y, C=N, D=Y)
  • If P1 > P2, the patient is judged to have a cold
  • If P2 > P1, the patient is judged to have no cold
    (a minimal sketch of this rule follows)
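To make the rule concrete, here is a minimal Python sketch; the two joint-probability values are hypothetical placeholders, not estimates from any real data.

```python
# Minimal sketch of the doctor's decision rule. The two joint-probability
# values are hypothetical placeholders, not estimates from real data.
symptoms = {"A": "N", "B": "Y", "C": "N", "D": "Y"}

p1 = 0.03  # P(Cold=true,  A=N, B=Y, C=N, D=Y), assumed value
p2 = 0.01  # P(Cold=false, A=N, B=Y, C=N, D=Y), assumed value

diagnosis = "has a cold" if p1 > p2 else "has no cold"
print(f"Given symptoms {symptoms}: the patient {diagnosis}")
```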

5
What is a probability model classifier?
  • A probability model classifier is a classifier
    based on probabilistic inference.
  • The focus now changes to how to calculate
  • P(Cold=true, A=N, B=Y, C=N, D=Y)
  • and
  • P(Cold=false, A=N, B=Y, C=N, D=Y)

A classification problem is thus changed into a
distribution estimation problem.
6
Used in character recognition
  • How can the probability model be used in character
    recognition? (similar to the disease diagnosis problem)
  • Find a probability distribution of the features
    for every type of character:
  • P(a, f1, f2, f3, ..., fn), P(b, f1, f2, f3, ..., fn), ...,
    P(z, f1, f2, f3, ..., fn)
  • Compute the probability that an unknown character
    belongs to each type of character, and classify it
    into the class with the highest probability.
  • For example, if
  • P(a, fu1, fu2, ..., fun) > P(C, fu1, fu2, ..., fun),
    C ∈ {b, c, ..., z}
  • we judge the unknown character to be an "a"
    (a sketch of this decision rule follows).
  • How can we estimate the joint probability
  • P(C, f1, f2, f3, ..., fn), C ∈ {a, b, ..., z}?
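A compact sketch of this argmax decision rule; `joint_prob` is a hypothetical estimator (for instance, the tree model introduced later), not an API from the slides.

```python
# Sketch of the classification step: score each class by its joint
# probability with the observed features and return the argmax.
# `joint_prob(c, features)` is a hypothetical callable returning
# P(C=c, f1, ..., fn); any estimator (e.g. a tree model) can plug in.
def classify(features, classes, joint_prob):
    return max(classes, key=lambda c: joint_prob(c, features))

# Usage: classify(unknown_features, "abcdefghijklmnopqrstuvwxyz", model)
```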

7
Estimate the joint probability
  • 1. Estimation based on direct counting:
  • P(Cold=true, A=N, B=Y, C=N, D=Y)
    = Num(Cold=true, A=N, B=Y, C=N, D=Y) / TotalNum
  • Impractical!!
  • Reason: a huge number of samples is needed.
    If the number of features is n, at least 2^n samples
    are needed for binary features (see the sketch below).
  • 2. Estimation based on the dependence relationships
    between features
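A sketch of approach 1 and its cost, assuming each sample is a tuple of binary feature values; the function name is mine, not the paper's.

```python
from collections import Counter

# Sketch of estimation by direct counting: the joint probability of a
# full configuration is just its relative frequency in the sample set.
def counting_estimator(samples):
    counts = Counter(samples)          # samples: tuples like (cold, A, B, C, D)
    total = len(samples)
    return lambda config: counts[config] / total

# With n binary features there are 2**n distinct configurations, so a
# reliable frequency for each one needs on the order of 2**n samples.
```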

8
Advantage
  • The joint probability can be written in product form:
  • P(A, B, C, D) = P(C) P(A|C) P(D|C) P(B|C)
  • By estimating each factor of the above with a
    counting process, we can avoid the sample-explosion
    problem (a numeric sketch follows).
  • The probability tree model is a kind of model based
    on this principle.
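A numeric sketch of the factorized computation; all probability tables below are hypothetical and only partially filled in.

```python
# Sketch of computing P(A,B,C,D) = P(C) P(A|C) P(D|C) P(B|C).
# Every factor involves at most two variables, so each table can be
# estimated from modest counts. All numbers here are hypothetical.
p_c = {"Y": 0.3, "N": 0.7}                  # P(C)
p_a_c = {("N", "Y"): 0.6, ("N", "N"): 0.8}  # P(A|C), partial table
p_d_c = {("Y", "Y"): 0.5, ("Y", "N"): 0.9}  # P(D|C), partial table
p_b_c = {("Y", "Y"): 0.7, ("Y", "N"): 0.2}  # P(B|C), partial table

a, b, c, d = "N", "Y", "Y", "Y"
joint = p_c[c] * p_a_c[(a, c)] * p_d_c[(d, c)] * p_b_c[(b, c)]
print(joint)  # 0.3 * 0.6 * 0.5 * 0.7 = 0.063
```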

9
Probability tree model
  • It assumes that the dependence relationships among
    features can be represented as a tree.
  • It seeks a tree structure that represents the
    dependence relationships optimally, so that the
    joint probability can be written as
  • P(v1, ..., vn) = P(v_root) ∏i P(vi | v_pa(i))
  • where v_pa(i) denotes the parent of vi in the tree.
10
Algorithm
  • 1. Obtain P(vi) and P(vi, vj) for each pair
    (vi, vj) by a counting process; vi denotes a feature.
  • 2. Calculate the mutual information I(vi, vj).
  • 3. Use a maximum-spanning-tree algorithm to find the
    optimal tree structure, where the edge weight between
    two nodes vi, vj is I(vi, vj) (a sketch of steps 2-3
    follows).
  • This algorithm was proved to be optimal in [1].
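A self-contained sketch of steps 2 and 3 for binary features, using empirical mutual information and Prim's algorithm for the maximum spanning tree; the function names are mine, not the paper's.

```python
import numpy as np
from itertools import combinations

def mutual_information(x, y):
    """Empirical mutual information I(X;Y) between two 0/1 columns."""
    mi = 0.0
    for a in (0, 1):
        for b in (0, 1):
            p_xy = np.mean((x == a) & (y == b))
            p_x, p_y = np.mean(x == a), np.mean(y == b)
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

def chow_liu_edges(data):
    """Edges of the maximum spanning tree over pairwise mutual information.
    `data` is an (n_samples, n_features) 0/1 array; uses Prim's algorithm."""
    n = data.shape[1]
    mi = np.zeros((n, n))
    for i, j in combinations(range(n), 2):
        mi[i, j] = mi[j, i] = mutual_information(data[:, i], data[:, j])
    in_tree, edges = {0}, []
    while len(in_tree) < n:
        # pick the highest-MI edge crossing the cut (tree vs. rest)
        i, j = max(((u, v) for u in in_tree for v in range(n)
                    if v not in in_tree), key=lambda e: mi[e])
        in_tree.add(j)
        edges.append((i, j))
    return edges
```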

11
(No Transcript)
12
Two problems of the tree model
  • 1. It can't process sparse data or missing data.
  • For example, if the samples are too sparse, maybe
    the nose problem never occurs in the records of
    patients with a cold, while it occurs twice in the
    records of patients without a cold.
  • Thus, no matter what symptoms a patient has, a
    cold=FALSE judgment will be made, since
  • P(cold=true, A, B, C, D=false) contains the factor
    P(D=false | C) estimated from the cold=true records,
    which is 0, so it is always
  • < P(cold=false, A, B, C, D=false)
  • (the sketch below illustrates this collapse)
  • 2. It can't perform well with multi-way dependence
    relationships.
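A tiny numeric sketch of the collapse; the non-zero factor values are hypothetical.

```python
# Sketch of how a single zero-count factor collapses the whole product.
# Suppose the nose problem never appeared together with cold=true in the
# training records; the other factor values are hypothetical.
p_d_false_given_c = 0.0          # zero count under cold=true
other_factors = 0.4 * 0.7 * 0.9  # hypothetical non-zero factors

p_cold_true = other_factors * p_d_false_given_c   # = 0.0
# cold=true can never win the comparison, whatever the other symptoms say.
```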

13
Our two improvements
  • To Problem 1:
  • Introduce prior knowledge to overcome it,
  • so the example in the last slide no longer
    collapses to a zero probability.

14
Key point of Technique 1
  • When a variable (feature) always takes the same
    value in one class, we replace its class-conditional
    probability with a proportion of that variable's
    probability in the whole database (see the sketch
    below).
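A minimal sketch of how I read this rule; the back-off proportion `ALPHA` is an assumed parameter, not a value given on the slides.

```python
ALPHA = 0.1  # assumed back-off proportion; not specified on the slides

def smoothed_conditional(p_class_cond, p_global):
    """Sketch of Technique 1: if a feature's class-conditional probability
    is zero (the feature was constant in that class), replace it with a
    proportion of the feature's probability over the whole database."""
    return p_class_cond if p_class_cond > 0 else ALPHA * p_global
```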

15
  • To Problem 2:
  • Introduce large-node methods to overcome it.

(Figure: the resulting large-node Chow-Liu tree (LNCLT)
compared with the ordinary Chow-Liu tree (CLT))
16
Algorithm
  • 1. Find the tree model.
  • 2. Refine the tree model based on frequent itemsets.
  • Basic idea:
  • The more frequently two variables occur together,
    the more likely they are to be combined into a
    large node (a sketch follows).
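A sketch of the frequent-itemset test behind step 2; the support threshold and the restriction to pairs (2-itemsets) are simplifying assumptions, not the paper's exact rule.

```python
from itertools import combinations

# Sketch of finding frequently co-occurring variable pairs (2-itemsets);
# such pairs are the candidates to merge into a large node. The
# min_support value and the pair-only scope are simplifying assumptions.
def frequent_pairs(transactions, min_support=0.3):
    counts = {}
    for t in transactions:                      # t: set of active variables
        for pair in combinations(sorted(t), 2):
            counts[pair] = counts.get(pair, 0) + 1
    n = len(transactions)
    return [p for p, c in counts.items() if c / n >= min_support]
```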

17
Experiment 1 --- Handwritten digit library
  • Database setup:
  • 60,000-digit training set, 10,000-digit test set
  • The database is not sparse
  • Purpose: evaluate the technique for Problem 2

(Figure: digits recognized correctly by LNCLT but
misrecognized by CLT as the digits shown at the
bottom right)
18
Experiment 2 --- Printed character library
  • Database setup:
  • 8270-character training set
  • The database is sparse
  • Purpose:
  • To evaluate the technique for Problem 1 (sparse data)
  • Before introducing prior knowledge:
  • Recognition rate on training data: 86.9%
  • After introducing prior knowledge:
  • Recognition rate on training data: 97.7%

19
  • Demo