
1
Instance Based Learning
  • Ata Kaban
  • The University of Birmingham

2
  • Today we will learn about:
  • K-Nearest Neighbours
  • Case-based reasoning
  • Lazy and eager learning

3
Instance-based learning
  • One way of solving tasks of approximating
    discrete or real-valued target functions
  • We have training examples (x_n, f(x_n)), n = 1, ..., N.
  • Key idea:
  • just store the training examples
  • when a test example is given, find the
    closest matches

4
  • 1-Nearest neighbour:
  • Given a query instance x_q,
  • first locate the nearest training example x_n
  • then take f(x_q) = f(x_n)
  • K-Nearest neighbour:
  • Given a query instance x_q,
  • first locate the k nearest training examples
  • if the target function is discrete-valued, take a
    majority vote among the k nearest neighbours; if it
    is real-valued, take the mean of the f values of the
    k nearest neighbours

5
The distance between examples
  • We need a measure of distance in order to know
    which examples are the neighbours
  • Assume that we have T attributes for the learning
    problem. Then one example point x has elements
    x_t ∈ ℝ, t = 1, ..., T.
  • The distance between two points x_i and x_j is often
    defined as the Euclidean distance:
    d(x_i, x_j) = sqrt( sum_{t=1..T} (x_{i,t} - x_{j,t})^2 )
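To make this concrete, here is a minimal Python sketch of 1-NN and k-NN using the Euclidean distance defined above. The function names and the toy data are illustrative additions, not from the slides.

import math
from collections import Counter

def euclidean(xi, xj):
    # Euclidean distance between two points with T numeric attributes
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(xi, xj)))

def knn_predict(train, query, k=3, discrete=True):
    # train is a list of (x_n, f(x_n)) pairs, as on slide 3.
    # Discrete target: majority vote among the k nearest neighbours.
    # Real-valued target: mean of their f values (see slide 14).
    neighbours = sorted(train, key=lambda ex: euclidean(ex[0], query))[:k]
    values = [fx for _, fx in neighbours]
    if discrete:
        return Counter(values).most_common(1)[0][0]
    return sum(values) / k

# Toy data (hypothetical):
train = [((1.0, 2.0), 'Yes'), ((2.0, 1.0), 'Yes'),
         ((1.5, 1.5), 'Yes'), ((5.0, 6.0), 'No'), ((6.0, 5.0), 'No')]
print(knn_predict(train, (2.0, 2.0), k=3))   # -> 'Yes'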

6
Voronoi Diagram (diagram not included in the transcript: the 1-NN decision regions, where each training point claims the region of space closer to it than to any other training point)
7
Characteristics of Instance-Based Learning
  • An instance-based learner is a lazy learner and
    does all the work when the test example is
    presented. This is opposed to so-called eager
    learners, which build a parameterised compact
    model of the target.
  • It produces a local approximation to the target
    function (a different one for each test instance)

8
When to consider Nearest Neighbour algorithms?
  • Instances map to points in ℝ^n
  • Not more than, say, 20 attributes per instance
  • Lots of training data
  • Advantages:
  • Training is very fast
  • Can learn complex target functions
  • Don't lose information
  • Disadvantages:
  • ? (we will see them shortly)

9
(No Transcript)
10
Training data and test instance (table shown as an image; not in the transcript)
11
Keep data in normalised form
One way to normalise the data a_r(x) to a'_r(x) is to rescale each attribute (the slide's formula is shown as an image; one common choice is sketched below).
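A minimal sketch of one such rescaling, min-max normalisation to [0, 1] (the slide's own formula is not in the transcript, so this particular choice is an assumption):

def minmax_normalise(data):
    # Rescale each attribute: a'_r(x) = (a_r(x) - min_r) / (max_r - min_r).
    # data is a list of equal-length numeric attribute vectors.
    cols = list(zip(*data))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [tuple((v - l) / (h - l) if h > l else 0.0
                  for v, l, h in zip(row, lo, hi))
            for row in data]

print(minmax_normalise([(1, 200), (2, 400), (3, 600)]))
# -> [(0.0, 0.0), (0.5, 0.5), (1.0, 1.0)]

Normalisation matters for kNN because attributes on large scales would otherwise dominate the Euclidean distance.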
12
Normalised training data and test instance (table shown as an image; not in the transcript)
13
Distances of test instance from training data
Classification: 1-NN: Yes; 3-NN: Yes; 5-NN: No; 7-NN: No
14
What if the target function is real valued?
  • The k-nearest neighbour algorithm then simply
    calculates the mean of the k nearest neighbours:
    f(x_q) = (1/k) * sum_{i=1..k} f(x_i)

15
Variant of kNN: Distance-Weighted kNN
  • We might want to weight nearer neighbours more
    heavily, e.g. with weights w_i = 1 / d(x_q, x_i)^2
  • Then it makes sense to use all training examples
    instead of just k (Shepard's method):
    f(x_q) = sum_i w_i f(x_i) / sum_i w_i
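A minimal sketch of Shepard's method, assuming the common inverse squared distance weights w_i = 1 / d(x_q, x_i)^2 and reusing the euclidean function from the earlier sketch:

def weighted_knn(train, query, eps=1e-9):
    # f(x_q) = sum_i w_i f(x_i) / sum_i w_i over ALL training examples,
    # with weights w_i = 1 / d(x_q, x_i)^2.
    num, den = 0.0, 0.0
    for x, fx in train:
        d = euclidean(x, query)
        if d < eps:          # query coincides with a stored example
            return fx
        w = 1.0 / d ** 2
        num += w * fx
        den += w
    return num / den

For a discrete-valued target the analogous rule sums the weights per class and returns the class with the largest total.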


16
Difficulties with k-nearest neighbour algorithms
  • Have to calculate the distance of the test case
    from all training cases
  • There may be irrelevant attributes amongst the
    attributes (the curse of dimensionality)

17
Case-based reasoning (CBR)
  • CBR is an advanced form of instance-based learning,
    applied to more complex instance objects
  • Objects may include complex structural
    descriptions of cases and adaptation rules

18
  • CBR cannot use Euclidean distance measures
  • Distance measures must instead be defined for those
    complex objects (e.g. over semantic nets)
  • CBR tries to model human problem-solving:
  • it uses past experience (cases) to solve new
    problems
  • it retains solutions to new problems
  • CBR is an ongoing area of machine learning
    research with many applications

19
Applications of CBR
  • Design
  • landscape, building, mechanical, conceptual
    design of aircraft sub-systems
  • Planning
  • repair schedules
  • Diagnosis
  • medical
  • Adversarial reasoning
  • legal

20
CBR process
(CBR process diagram, starting from a new case; not included in the transcript)
21
CBR example: Property pricing
Test instance (case table shown as an image; not in the transcript)
22
How rules are generated
  • There is no unique way of doing it. Here is one
    possibility:
  • Examine cases and look for ones that are almost
    identical
  • cases 1 and 2:
  • R1: If recep-rooms changes from 2 to 1, then
    reduce price by 5,000
  • cases 3 and 4:
  • R2: If Type changes from semi to terraced, then
    reduce price by 7,000

23
Matching
  • Comparing the test instance (case 5) with each
    stored case:
  • matches(5,1) = 3
  • matches(5,2) = 3
  • matches(5,3) = 2
  • matches(5,4) = 1
  • Estimated price of case 5 is 25,000

24
Adapting
  • Reverse rule 2:
  • if type changes from terraced to semi, then
    increase price by 7,000
  • Apply reversed rule 2:
  • the new estimate of the price of property 5 is 32,000
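The matching and adapting steps can be sketched as follows. The attribute names and case values below are hypothetical reconstructions, since the slide's case table is not in the transcript; only the match-count idea and rule R2 come from the slides.

def matches(test, case):
    # Count the attributes on which the test instance agrees with a stored case
    return sum(test[k] == case[k] for k in test if k != 'price')

case_base = [
    {'type': 'terraced', 'recep_rooms': 2, 'location': 7, 'price': 25000},
    {'type': 'semi',     'recep_rooms': 1, 'location': 5, 'price': 20000},
]
test_5 = {'type': 'semi', 'recep_rooms': 2, 'location': 7, 'price': None}

best = max(case_base, key=lambda c: matches(test_5, c))
estimate = best['price']              # 25,000 before adaptation

# Adapting: the best match is 'terraced' but the test case is 'semi',
# so apply rule R2 in reverse and increase the price by 7,000.
if best['type'] == 'terraced' and test_5['type'] == 'semi':
    estimate += 7000
print(estimate)                       # -> 32000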

25
Learning
  • So far we have a new case and an estimated price
  • nothing has been added to the case base yet
  • If we later find that the house sold for 35,000,
    the case would be added
  • we could also add a new rule:
  • if location changes from 8 to 7, then increase
    price by 3,000

26
Problems with CBR
  • How should cases be represented?
  • How should cases be indexed for fast retrieval?
  • How can good adaptation heuristics be developed?
  • When should old cases be removed?

27
Advantages
  • A local approximation is found for each test case
  • Knowledge is in a form understandable to human
    beings
  • Fast to train

28
Summary
  • K-Nearest Neighbours
  • Case-based reasoning
  • Lazy and eager learning

29
Lazy and Eager Learning
  • Lazy: wait for the query before generalizing
  • k-Nearest Neighbour, case-based reasoning
  • Eager: generalize before seeing the query
  • Radial Basis Function Networks, ID3, ...
  • Does it matter?
  • An eager learner must create a global approximation
  • A lazy learner can create many local approximations