1
Pattern Recognition
  • Lecture 2
  • By Rob Buxton (some ideas inspired by L.Noriega)

2
History
  • Mathematicians knew that organised data could be
    classified in terms of its inherent patterns long
    before the first computers were invented
  • One of the first attempts to apply statistical
    pattern recognition and classification to a
    problem was made by Fisher, who classified
    prehistoric skulls according to a complex set of
    measurements

3
Vectors
  • Statistical methods represent objects by using
    vectors, where the components are measurements or
    features associated with that object
  • These can be discrete (number of legs) or
    continuous (weight)
  • It is usual for a vector to have several
    components, e.g. (0.7, 0.9, 12, 5.07)
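
To make this concrete, here is a minimal Python sketch
of representing an object as a feature vector (the
feature names and values are illustrative assumptions,
not from the lecture):

    # Measurements associated with an object, mixing a
    # discrete feature and a continuous feature
    # (names and values are illustrative).
    animal = {"num_legs": 4, "weight_kg": 12.5}

    # Flatten the measurements into an ordered feature
    # vector for use by statistical methods.
    feature_vector = [animal["num_legs"], animal["weight_kg"]]
    print(feature_vector)  # [4, 12.5]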

4
Vectors
  • In order to be compared directly, vectors need to
    be of the same dimensionality, e.g.

(0.4,0.5,0.9,0.6,0.6)
(0.2,0.7,0.9,0.5,0.7)
5
Measurements
  • If we look at vectors in practical terms, the
    components are often measurements of some sort

[Figure: two cubes, each labelled with a 3-dimensional
feature vector (x, y, z), can be compared in those terms.]
6
Pattern Spaces
[Table: columns of measurements x1, x2, x3, one row per object.]
Each row of a table like this is a vector
describing a point in pattern space
7
Euclidean Distance
  • In Lecture 1 we discussed the concept of distance
    as a similarity measure
  • But how does this work?

d(x, y) = sqrt( Σ i = 1..n of (x_i - y_i)^2 )

where n is the number of components, i is the component
number, (x_i - y_i) is the difference between the ith
component of x and the ith component of y, and d(x, y)
is the difference between x and y
8
Euclidean Distance
  • Take two 2-d vectors
  • x = (0.5, 0.7)
  • y = (0.1, 0.9)

d(x, y) = sqrt((0.5 - 0.1)^2 + (0.7 - 0.9)^2)
        = sqrt(0.2) ≈ 0.447
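
A short Python sketch of this distance measure, checked
against the worked example above (the function name is
our own):

    import math

    def euclidean_distance(x, y):
        # Vectors must be of the same dimensionality
        # to be compared directly.
        if len(x) != len(y):
            raise ValueError("vectors must have the same number of components")
        # Square the component-wise differences, sum,
        # and take the square root.
        return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

    print(euclidean_distance((0.5, 0.7), (0.1, 0.9)))  # 0.447...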
9
Possible Problems
What happens if we have a lot of variance in one
feature compared to another ?
[Figure: two elongated clusters plotted against x1 (large
variance) and x2 (small variance); their overlap is an
area of indecision. A distance measure that attempts to
create a more symmetrical cluster may help.]
10
Modified Euclidean Distance Metrics
  • These can be a lot more complex, but can help to
    overcome the sort of problems shown in the last
    example
  • A relatively simple way of rescaling is based
    upon the variance in the measured features, as
    sketched below
  • What effect do you think this might have?
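
One possible reading of this rescaling, sketched in
Python: divide each component difference by that
feature's standard deviation before summing (an assumed
form of the idea; the data is illustrative):

    import statistics

    def standardised_distance(x, y, feature_columns):
        # Divide each feature by its standard deviation
        # so high-variance features no longer dominate
        # the distance.
        stdevs = [statistics.stdev(col) for col in feature_columns]
        return sum(((xi - yi) / s) ** 2
                   for xi, yi, s in zip(x, y, stdevs)) ** 0.5

    # Illustrative data: x1 has far more variance than x2.
    x1 = [1.0, 5.0, 9.0, 2.0, 8.0]
    x2 = [0.50, 0.52, 0.49, 0.51, 0.48]
    print(standardised_distance((1.0, 0.50), (9.0, 0.48), [x1, x2]))

If the rescaling works as intended, clusters stretched
along a high-variance axis become more symmetrical,
which is exactly the help suggested on the last slide.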

11
Features and their selection
  • So far we've looked at 3-d objects, and it is
    fairly simple to see that our vectors correspond
    to (width, length, height)
  • In some PR problems extracting the most
    appropriate feature information from unwanted
    clutter is a major task

12
Features example
[Figure: example data containing three distinctive points.]
13
Features
  • There are three distinctive points contained in
    the information on the previous slide; these
    shall be our features. Can you pick them?
  • That should've been quite easy, but how do we get
    a computer system to pick the features on a
    reliable basis?
  • This can be very tricky!

14
Graphing
  • Our system needs to pick the distinctive values
    in the same way we do
  • It can sometimes be easier to visualise the
    process if it is graphed

15
Problem dependent
  • The answer is problem dependent
  • If we know that in every case we are going to be
    searching for three points to be our features,
    then it is quite simple
  • We simply list the three highest points, as in
    the sketch below
  • What could we do if we don't always know the
    number of features?
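
A small Python sketch of both cases (the signal values
and the 0.5 threshold are illustrative assumptions):

    import heapq

    signal = [0.1, 0.9, 0.2, 0.15, 0.8, 0.1, 0.05, 0.7, 0.2]

    # Known number of features: simply list the three
    # highest points.
    print(heapq.nlargest(3, signal))  # [0.9, 0.8, 0.7]

    # Unknown number of features: one option is to keep
    # everything above a threshold (0.5 is an
    # illustrative choice, not from the slides).
    print([v for v in signal if v > 0.5])  # [0.9, 0.8, 0.7]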

16
Clusters
  • Many methods rely on grouping similar items
    together (in some imaginary space)
  • Often this process is called Clustering
  • Clusters that are close to each other are
    similar, clusters that are far apart are
    dissimilar
  • Often clusters are associated with unsupervised
    PR methods

17
K-Nearest Neighbour Revisited
  • This is a simple method that we have already
    discussed very briefly, but it serves a purpose
    because it allows us to discuss in more detail
    issues that affect these types of methods in real
    life

18
K- NN explained
[Figure: two clusters, Cluster 1 and Cluster 2, with a
novel vector between them. K-NN allocates a novel vector
to an existing cluster based upon its Euclidean distance
from the k (here 7) nearest allocated vectors.]
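
A minimal Python sketch of this allocation rule (the
data and helper names are our own; k is 3 here because
the toy data set is small):

    from collections import Counter

    def knn_classify(novel, labelled, k):
        # Euclidean distance between two vectors.
        def dist(p, q):
            return sum((pi - qi) ** 2 for pi, qi in zip(p, q)) ** 0.5
        # Find the k allocated vectors nearest the novel vector.
        nearest = sorted(labelled, key=lambda item: dist(item[0], novel))[:k]
        # Allocate to the cluster most common among them.
        return Counter(label for _, label in nearest).most_common(1)[0][0]

    data = [((0.1, 0.2), "cluster 1"), ((0.2, 0.1), "cluster 1"),
            ((0.15, 0.25), "cluster 1"), ((0.9, 0.8), "cluster 2"),
            ((0.8, 0.9), "cluster 2"), ((0.85, 0.95), "cluster 2")]
    print(knn_classify((0.2, 0.2), data, k=3))  # cluster 1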
19
K-NN, advantages and disadvantages
  • Advantages
  • Quick and simple to use
  • Disadvantages
  • The clusters have to be predefined
  • Very sensitive to outliers

20
K-Means
  • A better algorithm to use in practice!
  • How does it work?
  • This is an unsupervised method where the clusters
    are created from the data itself
  • Prototypes within the cluster structure are used
    to determine allocation (a sketch follows)
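
A minimal sketch of one common form of this algorithm
(Lloyd's algorithm), under the assumption that the
prototypes are cluster means; the data is illustrative:

    import random

    def k_means(points, k, iterations=20):
        # Start from k randomly chosen points as prototypes.
        prototypes = random.sample(points, k)
        for _ in range(iterations):
            # Allocate each point to its nearest prototype.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda i: sum(
                    (pi - qi) ** 2 for pi, qi in zip(p, prototypes[i])))
                clusters[nearest].append(p)
            # Update each prototype to the mean of its cluster.
            prototypes = [tuple(sum(col) / len(col) for col in zip(*cl))
                          if cl else proto
                          for cl, proto in zip(clusters, prototypes)]
        return prototypes, clusters

    points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25),
              (0.9, 0.8), (0.8, 0.9), (0.85, 0.95)]
    prototypes, clusters = k_means(points, k=2)
    print(prototypes)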

21
K-Means explained
[Figure: Cluster 1 with its prototype at the centre; the
distance from a novel vector is measured to the prototype.]
22
K-Means update
  • If a novel vector is allocated to a cluster then
    the prototype is updated so as to include the
    characteristics of the additional vector
  • How do you think this may be done?

23
Average of the vectors
  • One simple way would be to average the vector
    components

(i1, i2, ..., in) = ((j1 + k1)/2, (j2 + k2)/2, ..., (jn + kn)/2)

where the prototype i is the component-wise average of
the cluster's vectors j and k
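
A sketch of this update in Python, generalised to a
running mean over however many vectors the cluster
already holds (the function name and counts are our own):

    def update_prototype(prototype, count, novel):
        # The prototype is the mean of `count` allocated
        # vectors; fold the novel vector into that mean
        # component-wise.
        updated = tuple((count * p + v) / (count + 1)
                        for p, v in zip(prototype, novel))
        return updated, count + 1

    proto = (0.5, 0.7)  # mean of two vectors already in the cluster
    proto, n = update_prototype(proto, 2, (0.8, 0.4))
    print(proto, n)  # roughly (0.6, 0.6) 3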
24
Advantages and Disadvantages
  • Advantages
  • Simple
  • Efficient
  • Disadvantages
  • K (the number of clusters) must be provided in
    advance
  • Depends on linear separability of the clusters

25
Next Lecture
  • Syntactical Pattern Recognition