Structural Phrase Alignment Based on Consistency Criteria

Toshiaki Nakazawa, Kun Yu, Sadao Kurohashi (Graduate School of Informatics, Kyoto University) {nakazawa, kunyu}_at_nlp.kuee.kyoto-u.ac.jp kuro_at_i.kyoto-u.ac.jp

Flow of Our EBMT System
Core Steps of Alignment
  • Searching Correspondence Candidates
      • Fine-grained alignment is effective for translation
      • Search for as many candidates as possible using
        a variety of linguistic information
          • Bilingual dictionaries
          • Transliteration (Katakana words, NEs)
              • ローズワイン → rosuwain ↔ rose wine
                (similarity: 0.78)
              • 新宿 → shinjuku ↔ shinjuku (similarity: 1.0)
          • Numeral normalization
              • 二百十六万 → 2,160,000 ↔ 2.16 million
          • Japanese flexible matching (Odani et al. 2007)
          • Substring co-occurrence measure (Cromieres 2006)
  • Selecting Correspondence Candidates
      • More candidates lead to more ambiguities and
        improper alignments
      • A robust alignment method is needed that can
        align parallel sentences consistently by
        selecting an adequate set of candidates
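The candidate search steps listed above can be illustrated with a small sketch. It is not the system's implementation: the function names (`kanji_to_int`, `english_to_int`, `transliteration_similarity`) are hypothetical, the kanji numeral is an assumed input for 2,160,000, and `difflib.SequenceMatcher` is only a stand-in for the poster's similarity measure, so it will not reproduce the 0.78 quoted for "rose wine".

```python
import difflib
import re
from decimal import Decimal

KANJI_DIGITS = {"一": 1, "二": 2, "三": 3, "四": 4, "五": 5,
                "六": 6, "七": 7, "八": 8, "九": 9}
SMALL_UNITS = {"十": 10, "百": 100, "千": 1000}
LARGE_UNITS = {"万": 10**4, "億": 10**8}
ENGLISH_SCALES = {"thousand": 10**3, "million": 10**6, "billion": 10**9}


def kanji_to_int(text):
    """Convert a kanji numeral such as 二百十六万 to an integer."""
    total = group = digit = 0
    for ch in text:
        if ch in KANJI_DIGITS:
            digit = KANJI_DIGITS[ch]
        elif ch in SMALL_UNITS:
            # A bare unit (e.g. 十 with no digit before it) counts as 1 x unit.
            group += (digit or 1) * SMALL_UNITS[ch]
            digit = 0
        elif ch in LARGE_UNITS:
            total += (group + digit) * LARGE_UNITS[ch]
            group = digit = 0
        else:
            raise ValueError(f"unsupported character: {ch!r}")
    return total + group + digit


def english_to_int(text):
    """Convert '2,160,000' or '2.16 million' to an integer."""
    match = re.fullmatch(r"([\d,]*\.?\d+)\s*([a-z]+)?", text.strip().lower())
    if match is None:
        raise ValueError(f"not a recognised numeral: {text!r}")
    number = Decimal(match.group(1).replace(",", ""))
    scale = ENGLISH_SCALES[match.group(2)] if match.group(2) else 1
    return int(number * scale)


def transliteration_similarity(romanized, english):
    """Stand-in string similarity in [0, 1]; spaces and case are ignored."""
    a = romanized.lower().replace(" ", "")
    b = english.lower().replace(" ", "")
    return difflib.SequenceMatcher(None, a, b).ratio()


# Two numerals are candidate correspondences when they normalise to one value.
same_value = kanji_to_int("二百十六万") == english_to_int("2.16 million")
```

Normalising both sides to an integer before comparing keeps the matching independent of surface form, which is the point of the numeral normalization step.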

(Figure: flow of the EBMT system, with panels Input,
Translation Examples, Language Models and Output. The
Japanese input is glossed as (cross) (point) (enter)
(when) (my) (signal) (blue) (was). The translation
examples contain the English fragments "came at me from
the side at the intersection", "to remove when entering
a house", "my signature", and "traffic", "The light",
"was green".)
Output: My traffic light was green when entering the
intersection.
Selecting Correspondence Candidates Using
Consistency Score and Dependency Type
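(Figure: correspondence candidates between a Japanese
dependency tree, glossed as (in Japan) (insurance)
(to company) (insurance) (claim) (instance) (you can),
and an English tree with the nodes "you", "will have to
file", "insurance", "an claim", "insurance", "with the
office", "in Japan". Callouts mark "Ambiguities!" and
"Improper alignments!", and label node pairs on each
side as "Near!" or "Far!".)
How to reflect the inconsistency?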
Dependency Type Distance

Japanese                          Distance
predicate level C                 6
predicate level B/B               5
predicate level B-/A              4
predicate level A- / Others       3
case no / rentai                  2
Inside clause                     1

English                           Distance
S / SBAR / SQ                     5
VP / WHADVP / WHADJP              4
ADVP / ADJP / NP / PP / INTJ /
QP / PRT / PRN                    3
Others                            1
Distribution of the distance of alignment pairs
in hand-annotated data (Mainichi newspaper, 40K
sentence pairs) [Uchimoto04]
  • Near-Near pair → Positive score
  • Far-Far pair → 0
  • Near-Far pair → Negative score
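The Near/Far scoring rule above can be sketched as follows. This is only an illustration under stated assumptions: the poster does not give the actual score function here, so the threshold `NEAR_LIMIT`, the reward and penalty values, and the idea of measuring tree distance by summing dependency type distances along the path are all assumptions, and every name is hypothetical.

```python
from itertools import combinations

NEAR_LIMIT = 3  # hypothetical cut-off between "near" and "far"


def path_distance(parent, cost, a, b):
    """Sum of edge costs on the tree path between nodes a and b.

    parent maps each node to its head (the root maps to None);
    cost maps each non-root node to the dependency type distance
    of the edge to its head.
    """
    up_from_a = {}
    node, total = a, 0
    while node is not None:
        up_from_a[node] = total
        total += cost.get(node, 0)
        node = parent[node]
    node, total = b, 0
    while node not in up_from_a:  # climb until the paths meet
        total += cost[node]
        node = parent[node]
    return total + up_from_a[node]


def pair_score(d_src, d_tgt, reward=1.0, penalty=-1.0):
    """Near-Near -> positive, Far-Far -> 0, Near-Far -> negative."""
    near_src = d_src <= NEAR_LIMIT
    near_tgt = d_tgt <= NEAR_LIMIT
    if near_src and near_tgt:
        return reward
    if near_src != near_tgt:
        return penalty
    return 0.0


def consistency(correspondences, src_tree, tgt_tree):
    """Total score over all pairs of selected (source, target) links."""
    src_parent, src_cost = src_tree
    tgt_parent, tgt_cost = tgt_tree
    return sum(
        pair_score(path_distance(src_parent, src_cost, s1, s2),
                   path_distance(tgt_parent, tgt_cost, t1, t2))
        for (s1, t1), (s2, t2) in combinations(correspondences, 2)
    )


# Toy English tree: "file" is the root; NP edges cost 3 as in the table.
en_parent = {"file": None, "you": "file", "claim": "file",
             "insurance": "claim"}
en_cost = {"you": 3, "claim": 3, "insurance": 3}
```

A candidate set with a higher total is more consistent, so selecting candidates amounts to preferring sets in which linked nodes that are close on one side are also close on the other.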
Consistency Score Function
Experimental Result
  • 500 test sentences from Mainichi newspaper
    parallel corpus
  • Bilingual dictionary: KENKYUSYA J-E/J-E, 500K
    entries
  • Evaluation criteria: Precision / Recall /
    F-measure
  • Character-based for Japanese, word-based for English

                          Precision  Recall  F-measure
Baseline                  77.47      64.32   70.29
Consistency Score         80.30      66.90   72.99
Proposed (CS, DpndType)   80.77      69.14   74.51
Filtering (80)            82.48      71.31   76.49
Moses (SMT Toolkit)       60.19      33.15   42.75
Manual (upper bound)      95.58      89.80   92.60

Quality of Other Language Pairs (AER)

                  English-French  English-Romanian  English-Korean
HLT-NAACL 2003    5.71            28.86             -
ACL 2005          -               26.55             -
(Gildea, 2003)    -               -                 32
GIZA              15.89           27.19             35
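The evaluation measures named above follow standard definitions, sketched here for reference. The function names are hypothetical, and the poster does not say how its alignment units were counted beyond being character-based for Japanese and word-based for English, so the link sets are assumed to be plain sets of (source, target) pairs. The AER formula is the usual one with sure links S and possible links P.

```python
def precision_recall(system, gold):
    """Precision and recall (in percent) of system links against gold links."""
    correct = len(system & gold)
    precision = 100.0 * correct / len(system) if system else 0.0
    recall = 100.0 * correct / len(gold) if gold else 0.0
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def alignment_error_rate(system, sure, possible):
    """AER = 1 - (|A & S| + |A & P|) / (|A| + |S|), with S a subset of P."""
    return 1.0 - (len(system & sure) + len(system & possible)) / (
        len(system) + len(sure))


# The F-measure column can be checked from the other two columns,
# for example for the Baseline row:
baseline_f = f_measure(77.47, 64.32)
```

Applying `f_measure` to the rounded Precision and Recall columns reproduces the F-measure column to within rounding.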
Conclusion
  • Proposed a new phrase alignment method using
    consistency criteria.
  • Achieved sufficient alignment accuracy compared
    with results reported for other language pairs.
  • The parameters need to be acquired automatically
    by machine learning.
  • We plan to develop the framework further so that
    it revises the parse result.

Using 300K newspaper-domain sentence pairs for
training
(There is a translation demo in the exhibition
corner by NICT that uses our system!)