Stochastic Inversion Transduction Grammars (Dekai Wu)

Slides: 26
Provided by: sanj89
Learn more at: http://www.cs.cmu.edu

1
Stochastic Inversion Transduction Grammars (Dekai Wu)
  • 11-734 Advanced Machine Translation Seminar
  • Presented by Sanjika Hewavitharana
  • 04/13/2006

2
Overview
  • Simple Transduction Grammars
  • Inversion Transduction Grammars (ITGs)
  • Stochastic ITGs
  • Parsing with SITGs
  • Applications of SITGs
  • Main reading: Dekai Wu (1997), Stochastic Inversion
    Transduction Grammars and Bilingual Parsing of Parallel
    Corpora

3
Introduction
  • Mathematical models of translation:
    • IBM Models (Brown et al.): string generates string
    • Syntax-based (Yamada Kenji): tree generates string
    • ITG (Wu): two trees are generated simultaneously
  • ITGs:
    • A formalism for modeling bilingual sentence pairs
    • Not intended as full translation models, but as a tool
      for parallel corpus analysis
    • Extract useful structures from input data
    • Generative view rather than translation view: two output
      trees are generated simultaneously, one for each language

4
Transduction Grammars
  • A simple transduction grammar is a CFG whose
    terminals are pairs of symbols (or singletons)
  • Can be used to model the generation of bilingual
    sentence pairs
  • E: The Financial Secretary and I will be
    accountable.
  • C: (Chinese sentence not preserved in the transcript)

5
Transduction Grammar Rules (Examples)
  • Simple Rules
  • Inversion Rule

6
Transduction Grammars
  • A simple transduction grammar is a CFG whose
    terminals are pairs of symbols (or singletons)
  • Can be used to model the generation of bilingual
    sentence pairs
  • E / C: (example sentence pair not preserved in the transcript)

7
Transduction Grammars
  • In general, simple transduction grammars are not very useful:
    • The two languages would have to share exactly the same
      grammatical structure
    • So some sentence pairs cannot be generated
  • ITG removes the rigid parallel-ordering constraint:
    • Constituent order in one language may be the inverse of
      the order in the other language
    • Straight rules (square brackets): order is the same in
      both languages
    • Inverted rules (angle brackets): order is inverted in one
      language
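
The difference between the two orientations can be sketched in a few lines of Python. This is only an illustration: the tuple-based tree encoding, the function name yields, and the placeholder tokens a1, a2, b1, b2 are assumptions made here, not part of the slides.

```python
def yields(node):
    """Return (language-1 words, language-2 words) generated by a tree node.

    Hypothetical encoding: ("lex", w1, w2) is a terminal couple (use None
    for the missing side of a singleton); ("[]", left, right) is a straight
    node and ("<>", left, right) is an inverted node.
    """
    kind = node[0]
    if kind == "lex":
        _, w1, w2 = node
        return ([w1] if w1 else []), ([w2] if w2 else [])
    _, left, right = node
    l1, l2 = yields(left)
    r1, r2 = yields(right)
    if kind == "[]":   # straight: children appear in the same order in both languages
        return l1 + r1, l2 + r2
    if kind == "<>":   # inverted: the language-2 side reverses the children
        return l1 + r1, r2 + l2
    raise ValueError(f"unknown node type: {kind}")


straight = ("[]", ("lex", "a1", "b1"), ("lex", "a2", "b2"))
inverted = ("<>", ("lex", "a1", "b1"), ("lex", "a2", "b2"))
print(yields(straight))  # language-2 order follows language-1 order
print(yields(inverted))  # language-2 order is reversed
```

The same pair of subtrees produces the same language-1 string under both rules; only the language-2 order changes.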

8
ITGs
  • e.g.
  • With an ITG we can parse the previous sentence pair
  • Inversion rule: VP → ⟨VV PP⟩

9
ITG Parse Tree

10
Expressiveness of ITGs

11
Expressiveness of ITGs
  • Not all matchings are possible with an ITG
  • e.g., inside-out matchings are not allowed
  • This helps to reduce the combinatorial growth of
    matchings with the number of tokens
  • The number of matchings eliminated increases
    rapidly as the number of tokens increases
  • The author argues that this restriction is a benefit
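
The restriction can be checked by brute force for short sentences. The sketch below assumes a simplified setting in which every token is matched one-to-one (no singletons), so a matching is just a permutation; the function names are made up for this illustration. A permutation is reachable if it can be split into two blocks that combine in straight or inverted order, recursively.

```python
from functools import lru_cache
from itertools import permutations


@lru_cache(maxsize=None)
def itg_reachable(perm):
    """True if the permutation (a tuple of distinct ints) can be built
    using only straight and inverted binary combinations."""
    if len(perm) <= 1:
        return True
    for k in range(1, len(perm)):
        left, right = perm[:k], perm[k:]
        straight = max(left) < min(right)   # left block entirely below right block
        inverted = min(left) > max(right)   # left block entirely above right block
        if (straight or inverted) and itg_reachable(left) and itg_reachable(right):
            return True
    return False


def count_reachable(n):
    """Number of permutations of n tokens that an ITG can generate."""
    return sum(itg_reachable(p) for p in permutations(range(1, n + 1)))


for n in range(1, 7):
    print(n, count_reachable(n))
```

For four tokens, 22 of the 24 permutations are reachable; the two excluded ones, (2, 4, 1, 3) and (3, 1, 4, 2), are the inside-out matchings.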

12
Expressiveness of ITGs

13
Normal Form of ITG
  • For any ITG there exists an equivalent grammar in
    normal form
  • The right-hand side of every rule is one of:
    • A terminal couple
    • A terminal singleton
    • A pair of nonterminals with straight orientation
    • A pair of nonterminals with inverted orientation
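
In symbols, the four right-hand-side shapes listed above can be written as follows. This is a notational sketch: the nonterminal names A, B, C, the terminal symbols x and y, and ε for the empty string are labels assumed here; square brackets mark straight orientation and angle brackets inverted orientation.

```latex
\begin{align*}
A &\to x/y                              && \text{terminal couple} \\
A &\to x/\epsilon \;\mid\; \epsilon/y   && \text{terminal singletons} \\
A &\to [B\ C]                           && \text{straight pair of nonterminals} \\
A &\to \langle B\ C \rangle             && \text{inverted pair of nonterminals}
\end{align*}
```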

14
Stochastic ITGs
  • A probability is assigned to each rewrite rule
  • The probabilities of all the rules with a given
    left-hand side must sum to 1
  • An SITG gives the most probable (maximum-likelihood, ML)
    matching parse for a sentence pair
  • Parsing is similar to Viterbi or CYK (chart) parsing
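
The sum-to-one condition is easy to check mechanically. The snippet below is a minimal sketch: the dictionary layout (left-hand side and right-hand side as strings), the function name, and the toy probabilities are all assumptions for illustration.

```python
import math
from collections import defaultdict


def is_normalized(rules, tol=1e-9):
    """Check that rule probabilities sum to 1 for every left-hand side.

    `rules` maps (lhs, rhs) pairs to probabilities.
    """
    totals = defaultdict(float)
    for (lhs, _rhs), prob in rules.items():
        totals[lhs] += prob
    return all(math.isclose(total, 1.0, abs_tol=tol) for total in totals.values())


# Toy grammar with one nonterminal; the numbers are made up.
toy_rules = {
    ("A", "[A A]"): 0.3,
    ("A", "<A A>"): 0.2,
    ("A", "x/y"): 0.5,
}
print(is_normalized(toy_rules))
```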

15
Parsing with SITGs
  • Every node (q) in the parse tree has 5 elements:
    • Begin and end indices for the language-1 string (s, t)
    • Begin and end indices for the language-2 string (u, v)
    • Nonterminal category (i)
  • Each cell in the chart stores the probability
    of the most likely parse covering the corresponding
    substrings, rooted in the corresponding category

16
Parsing with SITGs - Algorithm
  • Initialize the cells corresponding to terminals
    using a translation lexicon
  • For the other cells, recursively find the most
    probable way of obtaining that nonterminal
    category.
  • Compute the probability by multiplying the
    probability of the rule by the probabilities of
    both the constituents
  • Store that probability plus the orientation of
    the rule
  • Complexity: O(n^3 m^3), where n and m are the lengths of
    the two sentences
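
The steps above can be sketched for the simplest case of a grammar with a single nonterminal. This is a hedged illustration, not the algorithm exactly as published: the function name biparse, the lexicon layout, the default rule probabilities, and the placeholder tokens are assumptions, and the only split restriction used here is that neither child may be empty in both languages.

```python
def biparse(e, c, lex, a_straight=0.4, a_inverted=0.4):
    """Viterbi biparse of word lists e and c with one nonterminal.

    `lex` maps (e_word or None, c_word or None) to a probability.
    Returns (best probability, list of aligned word pairs).
    """
    n, m = len(e), len(c)
    delta, back = {}, {}

    def lex_prob(s, t, u, v):
        ew = e[s] if t - s == 1 else None
        cw = c[u] if v - u == 1 else None
        return lex.get((ew, cw), 0.0)

    # Enumerate non-empty cells, shortest total span first, so that
    # children are always filled in before their parents.
    cells = [(s, t, u, v)
             for s in range(n + 1) for t in range(s, n + 1)
             for u in range(m + 1) for v in range(u, m + 1)
             if (t - s) + (v - u) >= 1]
    cells.sort(key=lambda q: (q[1] - q[0]) + (q[3] - q[2]))

    for (s, t, u, v) in cells:
        best, bp = 0.0, None
        if t - s <= 1 and v - u <= 1:          # initialization from the lexicon
            best, bp = lex_prob(s, t, u, v), ("lex",)
        for S in range(s, t + 1):              # split point in language 1
            for U in range(u, v + 1):          # split point in language 2
                for orient, a, left, right in (
                    ("[]", a_straight, (s, S, u, U), (S, t, U, v)),
                    ("<>", a_inverted, (s, S, U, v), (S, t, u, U)),
                ):
                    # Children that are empty in both languages are not in the chart.
                    if left not in delta or right not in delta:
                        continue
                    p = a * delta[left] * delta[right]
                    if p > best:
                        best, bp = p, (orient, left, right)
        delta[(s, t, u, v)], back[(s, t, u, v)] = best, bp

    def walk(q, out):
        bp = back[q]
        if bp[0] == "lex":
            s, t, u, v = q
            if t - s == 1 and v - u == 1:      # couples become alignment links
                out.append((e[s], c[u]))
        else:
            walk(bp[1], out)
            walk(bp[2], out)
        return out

    root = (0, n, 0, m)
    prob = delta.get(root, 0.0)
    return prob, (walk(root, []) if prob > 0 else [])


# Toy lexicon with placeholder tokens; probabilities are made up.
lexicon = {("x1", "y1"): 0.1, ("x2", "y2"): 0.1}
prob, alignment = biparse(["x1", "x2"], ["y2", "y1"], lexicon)
print(prob, alignment)
```

The six nested indices (s, t, u, v, S, U) are where the O(n^3 m^3) cost comes from. The stored back pointers record the orientation of each rule, and the alignment falls out of walking them.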

17
Applications of SITGs
  • Segmentation
  • Bracketing
  • Alignment
  • Bilingual Constraint Transfer
  • Mining parallel sentences from comparable corpora
    (Wu & Fung 2005)

18
Applications of SITGs - Segmentation
  • Word boundaries are not marked in Chinese text
  • No word chunks available for matching
  • One option: do word segmentation as
    preprocessing
  • This might produce chunks that do not agree
    bilingually
  • Solution: extend the algorithm to accommodate
    segmentation
  • Allow the initialization step to find strings of
    any length in the translation lexicon
  • The recursive step stores the most probable way
    of creating a constituent, whether it came from
    the lexicon or from rules
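
A minimal sketch of the extended initialization step, assuming the unsegmented side is given as a plain string of characters and the lexicon is keyed by (word, character string) pairs. The function name, the max_len cut-off, and the placeholder data (Latin letters standing in for Chinese characters) are assumptions for illustration.

```python
def init_lexical_cells(e_words, c_chars, lexicon, max_len=4):
    """Seed chart cells with lexicon matches over character spans of any length.

    Returns a dict mapping (s, t, u, v) to the lexicon probability, where
    e_words[s:t] is one word and c_chars[u:v] is a candidate segment.
    """
    cells = {}
    for s, word in enumerate(e_words):
        for u in range(len(c_chars)):
            # Try every candidate segment starting at u, up to max_len characters.
            for v in range(u + 1, min(len(c_chars), u + max_len) + 1):
                prob = lexicon.get((word, c_chars[u:v]))
                if prob is not None:
                    cells[(s, s + 1, u, v)] = prob
    return cells


# Overlapping candidates ("c" and "bc") are both kept; the recursive
# step later decides which segmentation gives the most probable parse.
toy_lexicon = {("w1", "ab"): 0.6, ("w2", "c"): 0.3, ("w2", "bc"): 0.1}
seeded = init_lexical_cells(["w1", "w2"], "abc", toy_lexicon)
print(seeded)
```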

19
Applications of SITGs - Bracketing
  • How to assign structure to a sentence with no
    grammar available?
  • Especially problematic for minority languages
  • A solution using ITGs:
    • Get a parallel corpus that pairs the language with
      some other language
    • Get a reasonable translation dictionary
    • Parse the corpus with a bracketing transduction grammar

20
Bracketing Transduction Grammar
  • A minimal ITG
  • Only one nonterminal A
  • Production rules
  • Lexical translation probabilities have
    prominence
  • Small prob. values for the two singleton
    production rules
  • Also, a very small value for
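
The rule list on this slide did not survive in the transcript. As a hedged sketch based on the description above and on the bracketing grammar in Wu (1997), a one-nonterminal ITG has rules of the following shape. The probability labels a, b_ij, b_iε, b_εj and the word symbols u_i (language 1) and v_j (language 2) are notation assumed here.

```latex
\begin{align*}
A &\xrightarrow{\,a\,} [A\ A] \\
A &\xrightarrow{\,a\,} \langle A\ A \rangle \\
A &\xrightarrow{\,b_{ij}\,} u_i / v_j
  && \text{for all lexical translation pairs } (u_i, v_j) \\
A &\xrightarrow{\,b_{i\epsilon}\,} u_i / \epsilon
  && \text{for all language-1 words } u_i \\
A &\xrightarrow{\,b_{\epsilon j}\,} \epsilon / v_j
  && \text{for all language-2 words } v_j
\end{align*}
```

With only one nonterminal, the parse carries no category information, so the bracketing is driven almost entirely by the lexical probabilities b_ij.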

21
Bracketing with Singletons
  • Singletons cause bracketing errors
  • Some refinements
  • Depending on the language, bias singleton
    attachment either to the left or to the right of a
    constituent
  • Apply a series of transformations that
    push the singletons as close as possible
    to couples
  • e.g. [x ⟨A B⟩] ⇌ ⟨x ⟨A B⟩⟩ ⇌ ⟨⟨x A⟩ B⟩ ⇌ ⟨[x A] B⟩
  • Before
  • After

22
Bracketing Experiments
  • Used 2000 Chinese-English sentence pairs from the
    HKUST corpus
  • Some filtering:
    • Removed sentence pairs that were not adequately
      covered by the lexicon (>1 unknown words)
    • Removed sentence pairs with many unmatched words
      (>2)
  • Bracketing precision:
    • 80% for English
    • 78% for Chinese
  • Errors mainly due to lexical imperfections
  • A statistical lexicon (6.5k English, 5.5k
    Chinese words)
  • Can be improved with extra information,
    e.g. POS, grammar-based bracketer
23
Applications of SITGs - Alignment
  • Alignments (phrasal or word) are a natural
    byproduct of bilingual parsing
  • Unlike parse-parse-match methods, this approach:
    • Does not require a robust grammar for both
      languages
    • Guarantees compatibility between parses
    • Has a principled way of choosing between possible
      alignments
    • Provides a more reasonable distortion penalty
  • Recent empirical studies show ITGs produce better
    alignments in various applications (Wu & Fung
    2005)

24
Bilingual Constraint Transfer
  • A high-quality parse for one language can be
    leveraged to get structure for the other
  • Alter the parsing algorithm: only allow constituents
    that match the parse that already exists for the
    well-studied language
  • This works for any sort of constraint supplied
    for the well-studied language
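
One way to read "match" is "do not cross any bracket of the existing parse". Under that assumption, the filter added to the parser could look like the sketch below. The half-open (start, end) span convention and the function names are choices made for this illustration.

```python
def crosses(span, bracket):
    """True if two half-open spans overlap without one containing the other."""
    (a, b), (c, d) = span, bracket
    return (a < c < b < d) or (c < a < d < b)


def allowed(span, brackets):
    """A candidate constituent is allowed if it crosses none of the
    brackets supplied for the well-studied language."""
    return not any(crosses(span, br) for br in brackets)


# Brackets from an existing parse of a five-word sentence (toy data).
known_brackets = [(0, 2), (2, 5), (0, 5)]
print(allowed((2, 4), known_brackets))  # nested inside (2, 5)
print(allowed((1, 3), known_brackets))  # straddles the (0, 2) boundary
```

During chart filling, cells whose language-1 span fails this test would simply be skipped, so the surviving parse is forced to respect the known structure.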

25
References
  • Dekai Wu (1997), Stochastic Inversion
    Transduction Grammars and Bilingual Parsing of
    Parallel Corpora, Computational Linguistics, Vol.
    23, no. 3, pp. 377-403.
  • Dekai Wu (1995), Grammarless Extraction of
    Phrasal Translation Examples from Parallel Texts,
    6th Intl. Conf. on Theoretical and Methodological
    Issues in Machine Translation, Vol. 2, pp.
    354-372. Leuven, Belgium.
  • Dekai Wu and Pascale Fung (2005), Inversion
    Transduction Grammar Constraints for Mining
    Parallel Sentences from Quasi-Comparable Corpora,
    2nd Intl. Joint Conf. on Natural Language
    Processing (IJCNLP-2005), Jeju, Korea, October.