1
The Voted Perceptron for Ranking and Structured Classification
  • William Cohen
  • 3-6-2007

2
A few critique questions
  • Why use a non-convergent method for computing
    expectations (for skip-CRFs) ? Was that the only
    choice?
  • Sadly, the choice is provably fast or provably convergent -- pick only one.
  • Does it matter that the structure is different at
    different nodes in the skip-chain CRF?
  • Does it matter that some linear-chain nodes have
    only one neighbor?
  • Does it matter that some documents have 100 words
    and some have 1000?
  • What is all the loopy BP stuff about, anyway?
  • See Bishop's textbook, chapter 8, for an introduction.

3
The voted perceptron
[Figure: A sends an instance xi to B, who predicts its label.]
4
(1) A target u
(2) The guess v1 after one positive example.
5
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
(3b) The guess v2 after the one positive and one negative example: v2 = v1 - x2.
[Figure: two panels, each showing the target u, its negation -u, and the margin 2γ; panel (3a) shows v1, x1, x2, and v2 = v1 + x2; panel (3b) shows v1, -x2, and v2 = v1 - x2.]
6
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
(3b) The guess v2 after the one positive and one negative example: v2 = v1 - x2.
[Figure: the same two panels, with an added "> γ" annotation.]
7
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
(3b) The guess v2 after the one positive and one negative example: v2 = v1 - x2.
[Figure: the same two panels as slide 5.]
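For context, this geometry is the heart of the standard perceptron mistake bound (Novikoff; used by Freund and Schapire for the voted perceptron). A compact version of the argument, under the usual assumptions that ‖u‖ = 1, every ‖xi‖ ≤ R, and u separates the data with margin γ:

    v_{k+1} · u  =  (v_k + y x) · u  ≥  v_k · u + γ      (each mistake makes progress along u)
    ‖v_{k+1}‖²   =  ‖v_k + y x‖²    ≤  ‖v_k‖² + R²       (but the length grows slowly)

    After k mistakes:  kγ  ≤  v · u  ≤  ‖v‖  ≤  R√k,   so   k ≤ (R/γ)².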
8
(No Transcript)
9
On-line to batch learning
  1. Pick a vk at random according to mk/m, the fraction of examples it survived.
  2. Predict using the vk you just picked.
  3. (In practice, use a deterministic approximation to this random choice -- see the sketch below.)
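One standard deterministic approximation is Freund and Schapire's voted prediction: let every vk vote, weighted by the number of examples mk it survived. A minimal sketch, reusing the survivors list from the training sketch above:

    import numpy as np

    def predict_voted(survivors, x):
        # survivors: list of (v_k, m_k) pairs from training.
        vote = sum(m * np.sign(np.dot(v, x)) for v, m in survivors)
        return 1 if vote >= 0 else -1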

10
The voted perceptron for ranking
[Figure: A sends a set of instances x1 x2 x3 x4 to B, who must rank them.]
11
Ranking some x's with the target vector u
[Figure: several x's in the plane, ranked by their projection onto u.]
12
Ranking some x's with some guess vector v, part 1
[Figure: the same x's, now ranked by their projection onto the guess v.]
13
Ranking some x's with some guess vector v, part 2. The purple-circled x is xb, the one B has ranked highest under v; the green one is xb*, the one A has chosen to rank highest.
[Figure: the same x's, with xb and xb* highlighted.]
14
Correcting v by adding xb* - xb
[Figure: the correction vector xb* - xb applied to v.]
15
Correcting v by adding xb* - xb (part 2)
[Figure: the old guess vk and the new guess vk+1 = vk + xb* - xb.]
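Putting slides 10-15 together, one round of the ranking update looks like this (my own sketch; the names are illustrative): B ranks the instances by vk · x, A reveals the index b* it wants on top, and on a mistake B adds xb* - xb.

    import numpy as np

    def ranking_update(v, xs, b_star):
        # xs: list of instance vectors; b_star: index A wants ranked highest.
        b_hat = max(range(len(xs)), key=lambda i: np.dot(v, xs[i]))  # B's top pick
        if b_hat != b_star:                    # mistake: correct the guess
            v = v + xs[b_star] - xs[b_hat]     # v_{k+1} = v_k + xb* - xb
        return v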
16
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the margin picture again, with the "> γ" annotation.]
17
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the same margin picture.]
18
Notice this doesn't depend at all on the number of x's being ranked.
(3a) The guess v2 after the two positive examples: v2 = v1 + x2.
[Figure: the same margin picture.]
19
The voted perceptron for ranking
[Figure: A sends instances x1 x2 x3 x4 to B.]
Change number one: replace x with z.
20
The voted perceptron for NER
[Figure: A sends instances z1 z2 z3 z4 to B.]
  • A sends B the Sha & Pereira paper and instructions for creating the instances.
  • A sends a word vector xi. Then B could create the instances F(xi,y)...
  • ... but instead B just returns the y that gives the best score for the dot product vk · F(xi,y), by using Viterbi.
  • A sends B the correct label sequence yi.
  • On errors, B sets vk+1 = vk + zb* - zb = vk + F(xi,yi) - F(xi,ŷ), where ŷ was B's guess.

21
The voted perceptron for NER
[Figure: A sends instances z1 z2 z3 z4 to B.]
  1. A sends a word vector xi.
  2. B returns the y that gives the best score for vk · F(xi,y), found with Viterbi.
  3. A sends B the correct label sequence yi.
  4. On errors, B sets vk+1 = vk + zb* - zb = vk + F(xi,yi) - F(xi,ŷ).
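A minimal sketch of this loop for sequence labeling (my own illustration; features and viterbi_decode are assumed helper functions, not from the slides):

    def structured_perceptron_step(v, x, y_true, features, viterbi_decode):
        # features(x, y) -> feature vector F(x, y) as a numpy array
        # viterbi_decode(v, x) -> the y maximizing the dot product v · F(x, y)
        y_hat = viterbi_decode(v, x)        # step 2: B's best-scoring sequence
        if y_hat != y_true:                 # step 4: mistake, move toward the truth
            v = v + features(x, y_true) - features(x, y_hat)
        return v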

22
Collins' results