1
Training Tree Transducers
  • Authors: Jonathan Graehl, Kevin Knight
  • Presented by Zhengbo Zhou

2
Outline
  • Finite State Transducers (FSTs) and R
  • Trees and Regular Tree Grammars
  • xR and Derivation Tree
  • Inside-Outside algorithm and EM training
  • Turning trees into strings (xRS)
  • Example and Related Work
  • My thoughts/questions

3
Finite State Transducers (FSTs)
  • A finite-state transducer, as we have already learned →
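A minimal sketch of an FST as a transition table, assuming a toy state/word
encoding with transliterated Arabic output (not taken from the slides):

    # Toy FST: (state, input word) -> (output word, next state).
    fst = {
        ("q0", "he"):     ("huwa", "q1"),
        ("q1", "drinks"): ("yashrabu", "q2"),
        ("q2", "water"):  ("maa", "q3"),
    }

    def transduce(words, start="q0", finals={"q3"}):
        state, out = start, []
        for w in words:
            sym, state = fst[(state, w)]   # KeyError means no transition: reject
            out.append(sym)
        return out if state in finals else None

    print(transduce(["he", "drinks", "water"]))   # ['huwa', 'yashrabu', 'maa']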

4
R transducer
  • An R transducer compactly represents a potentially
    infinite set of input/output tree pairs,
  • while an FST compactly represents such a set of
    input/output string pairs.
  • R is a generalization of the FST from strings to trees.

5
Example of R
  • He drinks water

6
Example of R (cont.)
Rule 1
Rules 2, 3, 4
English order: S(PRO, VP(V, NP))
Arabic order: S(V, PRO, NP)
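A minimal sketch of the reordering rule above, assuming trees are encoded as
nested tuples (label, child, ...) with string leaves; the numbered rules in
the slide's figure are not reproduced here:

    def reorder(tree):
        """Apply S(PRO, VP(V, NP)) -> S(V, PRO, NP) top-down (English -> Arabic order)."""
        if isinstance(tree, str):              # leaf word: copy through
            return tree
        label, *kids = tree
        if (label == "S" and len(kids) == 2
                and kids[0][0] == "PRO" and kids[1][0] == "VP"):
            pro, (_, v, np) = kids             # unpack VP(V, NP)
            return ("S", reorder(v), reorder(pro), reorder(np))
        return (label,) + tuple(reorder(k) for k in kids)

    english = ("S", ("PRO", "he"), ("VP", ("V", "drinks"), ("NP", "water")))
    print(reorder(english))  # ('S', ('V', 'drinks'), ('PRO', 'he'), ('NP', 'water'))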
7
Trees
  • Definitions

8
Regular Tree Grammars (RTG)
  • Regular Tree Grammar, a common way of compactly
    representing a potentially infinite set of trees.
  • wRTG is just like WFSA.
  • wRTG G (?,N,S,P)
  • ? alphabet
  • N nonterminals
  • S start nonterminal
  • Weighted
    productions
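A minimal sketch of a wRTG in the same tuple encoding, with a toy grammar
assumed for illustration (the sample grammar on the next slide is in its figure):

    import random

    # Production: (lhs nonterminal, rhs tree, weight); nonterminal names
    # may appear as leaves of an rhs and are rewritten recursively.
    productions = [
        ("q",    ("S", "qpro", "qvp"), 1.0),
        ("qpro", ("PRO", "he"), 1.0),
        ("qvp",  ("VP", ("V", "drinks"), ("NP", "water")), 0.6),
        ("qvp",  ("VP", ("V", "runs")), 0.4),
    ]

    def expand(sym):
        """Sample one derivation: rewrite nonterminal leaves until none remain."""
        rules = [(rhs, w) for lhs, rhs, w in productions if lhs == sym]
        if rules:                          # nonterminal: pick a production by weight
            rhs = random.choices([r for r, _ in rules],
                                 weights=[w for _, w in rules])[0]
            return expand(rhs)
        if isinstance(sym, tuple):         # interior node: expand the children
            return (sym[0],) + tuple(expand(k) for k in sym[1:])
        return sym                         # terminal leaf

    print(expand("q"))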

9
Sample wRTG
10
Extended-LHS Tree Transducer (xR)
  • Different from R: xR explicitly represents the
    lookahead and movement with a more specific LHS.
  • The form of the LHS is a tree pattern.
  • The pattern is used to match an input subtree.
  • There is a finite set of tree patterns.
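A minimal sketch of matching an extended LHS against an input subtree, assuming
variables are written as the strings "x0", "x1", ... (a hypothetical encoding):

    def match(pattern, tree, binding=None):
        """Return variable bindings if pattern matches tree, else None."""
        binding = {} if binding is None else binding
        if isinstance(pattern, str) and pattern.startswith("x"):
            binding[pattern] = tree        # variable: capture the whole subtree
            return binding
        if isinstance(pattern, str) or isinstance(tree, str):
            return binding if pattern == tree else None
        if pattern[0] != tree[0] or len(pattern) != len(tree):
            return None                    # label or arity mismatch
        for p, t in zip(pattern[1:], tree[1:]):
            if match(p, t, binding) is None:
                return None
        return binding

    lhs = ("S", "x0", ("VP", "x1", "x2"))  # extended LHS: looks two levels deep
    tree = ("S", ("PRO", "he"), ("VP", ("V", "drinks"), ("NP", "water")))
    print(match(lhs, tree))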

11
Binary Relation
12
Derivation Tree
  • We have many kinds of trees now, but the derivation tree
    is a representation of the transducer's derivation (the
    rules applied), neither the input tree nor the output tree.
  • But a derivation tree deterministically produces
    a single weighted output tree.
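A minimal sketch of replaying a derivation tree into its single weighted output
tree, with a hypothetical rule table (the rule names r1..r4 and the "*" child
slots are assumptions of this encoding):

    from math import prod

    rules = {  # rule name -> (rhs template with "*" slots for children, weight)
        "r1": (("S", "*", "*"), 0.5),
        "r2": (("PRO", "he"), 1.0),
        "r3": (("VP", ("V", "drinks"), "*"), 0.8),
        "r4": (("NP", "water"), 1.0),
    }

    def replay(deriv):
        """Replay a derivation tree (rule, subderivation, ...) bottom-up."""
        name, *kids = deriv if isinstance(deriv, tuple) else (deriv,)
        rhs, w = rules[name]
        outs = [replay(k) for k in kids]           # (tree, weight) per child
        slot = iter(t for t, _ in outs)
        def fill(t):                               # substitute child trees for "*"
            if t == "*":
                return next(slot)
            if isinstance(t, tuple):
                return (t[0],) + tuple(fill(c) for c in t[1:])
            return t
        return fill(rhs), w * prod(wt for _, wt in outs)

    deriv = ("r1", "r2", ("r3", "r4"))
    print(replay(deriv))   # output tree with weight 0.5 * 1.0 * 0.8 * 1.0 = 0.4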

13
Derivation tree → derivation wRTG
14
Inside-Outside algorithm
  • Basic idea of the inside-outside algorithm:
    use the current rule probabilities to estimate the
    expected frequencies of certain types of derivation
    steps, then compute new probabilities for those rules. [1]
  • Generally:
  • the inside probability covers what happens below A,
    e.g. re-estimating p(A → a), possibly through A → B C;
  • the outside probability covers the context around A,
    e.g. through C → A B or C → B A.

15
Inside-Outside for wRTG
  • Inside weights using G are given by βG
  • Outside weights by αG
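A minimal sketch of the inside pass βG over a wRTG, reusing the toy tuple
encoding (assumed, and assuming the grammar is acyclic so the recursion
terminates; the outside pass αG mirrors it from the root down):

    productions = [
        ("q",   ("S", ("PRO", "he"), "qvp"), 1.0),
        ("qvp", ("VP", ("V", "drinks"), ("NP", "water")), 0.6),
        ("qvp", ("VP", ("V", "runs")), 0.4),
    ]

    def beta(sym):
        """Inside weight: total weight of all trees derivable from sym."""
        rules = [(rhs, w) for lhs, rhs, w in productions if lhs == sym]
        if rules:                        # nonterminal: sum over its productions
            return sum(w * beta(rhs) for rhs, w in rules)
        if isinstance(sym, tuple):       # interior node: product over children
            total = 1.0
            for kid in sym[1:]:
                total *= beta(kid)
            return total
        return 1.0                       # terminal leaf

    print(beta("q"))   # 1.0 * (0.6 + 0.4) = 1.0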

16
EM training
  • EM training maximizes the corpus likelihood by
    repeatedly estimating the expected counts of each
    decision (E-step) and maximizing by assigning those
    counts to the parameters and renormalizing (M-step).
  • Algorithm 2 implements EM xR training by
    repeatedly computing inside-outside weights.
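A minimal sketch of the EM loop only (not the paper's Algorithm 2);
expected_counts is a hypothetical stand-in for the inside-outside pass over
each pair's derivations, and rules are assumed encoded as (lhs_state, ...) tuples:

    from collections import defaultdict

    def em_train(weights, corpus, expected_counts, iterations=10):
        """weights: {rule: probability}; corpus: list of (input, output) pairs."""
        for _ in range(iterations):
            counts = defaultdict(float)
            for pair in corpus:                        # E-step (inside-outside)
                for rule, c in expected_counts(weights, pair).items():
                    counts[rule] += c
            totals = defaultdict(float)                # M-step: renormalize
            for rule, c in counts.items():             # per left-hand-side state
                totals[rule[0]] += c                   # rule[0] = lhs state
            weights = {r: c / totals[r[0]] for r, c in counts.items()}
        return weights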

17
From tree to string
  • We can use an extended-LHS tree transducer (xR)
    to get an output tree from an input tree (say, a
    parse tree), but the result is still a (parse)
    tree, not a sentence in another language (as
    machine translation needs).
  • Now we have xRS, a tree-to-string transducer.

18
Tree-to-string transducer
  • Weighted extended-LHS root-to-frontier
    tree-to-string transducer
  • X = (Σ, Δ, Q, Qi, R)
  • It is similar to xR, but the RHS consists of strings
    instead of trees.
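A minimal sketch of one xRS rule in the tuple encoding: the LHS is the tree
pattern S(PRO, VP(V, NP)) and the RHS is the string "x1 x0 x2" (toy rule,
assumed for illustration):

    def to_string(tree):
        """Apply the toy rule, then read the leaf words off each subtree."""
        if isinstance(tree, str):
            return [tree]
        label, *kids = tree
        if label == "S" and len(kids) == 2 and kids[1][0] == "VP":
            pro, (_, v, np) = kids
            order = (v, pro, np)       # RHS string "x1 x0 x2": verb first
        else:
            order = kids               # default: left-to-right read-off
        return [w for sub in order for w in to_string(sub)]

    english = ("S", ("PRO", "he"), ("VP", ("V", "drinks"), ("NP", "water")))
    print(" ".join(to_string(english)))   # drinks he water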

19
Example
  • Implemented the translation model of Yamada and
    Knight (2001).
  • There is a trainable xRS tree-to-string
    transducer that embodies this model.

20
Example
21
Related Work
  • TSG vs. RTG (equivalent)
  • xR vs. weighted synchronous TSG (similar)
  • EM training vs. the forward-backward algorithm for
    finite-state (string) transducers, and also for HMMs

22
Questions
  • Is there any future work on this tree transducer,
    especially for machine translation?
  • Precision? Recall?
  • Also a little confused by the descriptions of
    the two relations ⇒x and ⇒G.
  • Not very sure about the inside-outside algorithm.

Questions?
23
  • Thank you!!

24
Reference
  • [1] Fernando Pereira and Yves Schabes. Inside-Outside
    Reestimation from Partially Bracketed Corpora. In
    Proceedings of ACL, 1992.

25
What might be useful
  • "An Overview of Probabilistic Tree Transducers for
    Natural Language Processing", Kevin Knight and
    Jonathan Graehl

26
  • R: Top-down transducer, introduced before.
  • F: Bottom-up (frontier-to-root) transducer,
    with similar rules, but transforming the leaves
    of the input tree first and working its way up.
  • L: Linear transducer, which prohibits copying
    subtrees. Rule 4 in Figure 4 is an example of a
    copying production, so that whole transducer is R
    but not RL.
  • N: Non-deleting transducer, which requires that
    every left-hand-side variable also appear on the
    right-hand side. A deleting R transducer can
    simply delete a subtree (without inspecting it).
    The transducer in Figure 4 is the deleting kind,
    because of rules 34-39. It would also be deleting
    if it included a rule for dropping English
    determiners, e.g., q NP(x0, x1) → q x1.
  • D: Deterministic transducer, with a maximum of
    one production per <state, symbol> pair.
  • T: Total transducer, with a minimum of one
    production per <state, symbol> pair.
  • PDTT: Push-down tree transducer, the transducer
    analog of CFTG [36].
  • Subscript: Regular-lookahead transducer, which
    can check whether an input subtree is
    tree-regular, i.e., whether it belongs to a
    specified RTL. Productions only fire when their
    lookahead conditions are met.
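A minimal sketch of the copying vs. deleting distinction in the tuple encoding
(toy rules; the numbered rules of Figure 4 are not reproduced here):

    def copying_rule(tree):
        """Copying: x0 appears twice on the RHS, so the transducer is not linear (L)."""
        label, x0 = tree
        return (label, x0, x0)

    def deleting_rule(tree):
        """Deleting: x0 never appears on the RHS, like q NP(x0, x1) -> q x1."""
        label, x0, x1 = tree
        return (label, x1)

    print(deleting_rule(("NP", ("DET", "the"), ("N", "water"))))
    # ('NP', ('N', 'water'))  -- the determiner subtree is deleted unseen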

27
(No Transcript)
28
(No Transcript)