Title: Imposing Constraints from the Source Tree on ITG Constraints for SMT

1. Imposing Constraints from the Source Tree on ITG Constraints for SMT
Hirofumi Yamamoto, Hideo Okuma, Eiichiro Sumita
National Institute of Information and Communications Technology
ATR Spoken Language Communication Research Labs.
Kindai University School of Science and Engineering, Department of Information
2. Background

In current SMT, erroneous word reordering is one of the most serious problems, especially for dissimilar language pairs such as English-Chinese or English-Japanese.

1) Introduce linguistic syntax directly:
   - Tree-to-string
   - String-to-tree
   - Tree-to-tree
   → Not robust to parsing errors
3. Background (cont.)

2) Assign probabilistic constraints for word reordering:
   - IBM distortion, lexical reordering, ITG
   → Weaker constraints than the first type

Our approach: introduce syntax information into the second type.
4. ITG Constraints

Translation source sentences are represented by a binary tree. Translation target sentences can be generated by rotating the branches at the nodes of the source tree.
[Figure: a binary source tree over words a b c d, and a target word order that cannot be produced by rotations]

The target word order shown above cannot be generated from any source binary tree. Note that no particular source binary tree instance is considered: ITG allows any tree shape.
5. Basic Idea of IST-ITG

Use the ITG constraints under the given source tree.

Tree ((a b) (c d)):   abcd, abdc, bacd, badc, cdab, cdba, dcab, dcba
Tree (((a b) c) d):   abcd, bacd, cabd, cbad, dabc, dbac, dcab, dcba

Under the original ITG constraints, 22 combinations are allowed; each fixed source tree allows only 8.
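The two 8-order lists above can be reproduced with a minimal sketch (our own code): for one fixed binary source tree, at every node the two children are either kept or swapped.

```python
from itertools import product

def ist_itg_orders(tree):
    """Target orders allowed by IST-ITG for a fixed binary source tree:
    keep or swap the two children at every node."""
    if isinstance(tree, str):          # leaf word
        return {tree}
    left = ist_itg_orders(tree[0])
    right = ist_itg_orders(tree[1])
    keep = {l + r for l, r in product(left, right)}
    swap = {r + l for l, r in product(left, right)}
    return keep | swap

balanced = (("a", "b"), ("c", "d"))
left_branching = ((("a", "b"), "c"), "d")
print(sorted(ist_itg_orders(balanced)))
print(sorted(ist_itg_orders(left_branching)))
```

Each tree yields exactly 8 orders, but different trees yield different sets.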
6The Number of Word Order Combinations
For binary source tree, word order
combinations are allowed without constraints.
Under the IST-ITG constraints, this number is
reduced to .
If
Without constraints ITG constraints IST-ITG
If
Without constraints ITG constraints IST-ITG
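The three counts can be checked numerically (our sketch). The ITG counts are the large Schröder numbers, computed here with their standard recurrence; the unconstrained and IST-ITG counts are n! and 2^(n-1).

```python
from math import factorial

def itg_count(n):
    """Number of ITG-reachable permutations of n words
    (the large Schröder numbers: 1, 2, 6, 22, 90, ...)."""
    r = [1, 2]                          # r[k] = count for k+1 words
    for k in range(1, n - 1):
        r.append((3 * (2 * k + 1) * r[k] - (k - 1) * r[k - 1]) // (k + 2))
    return r[n - 1]

def ist_itg_count(n):
    """A binary tree over n words has n-1 internal nodes, each of
    which may independently swap its children."""
    return 2 ** (n - 1)

for n in (4, 10):
    print(n, factorial(n), itg_count(n), ist_itg_count(n))
```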
7Extension to Non-binary Tree
Parsing results sometimes are not binary tree.
For the nodes which have more than two branches,
any word reorderings are allowed.
abcd, abdc, acbd, acdb, adbc, adcb, bcda, bdca,
cbda, cdba, dbca, dcba
8Extension to Non-binary Tree
Parsing results sometimes are not binary tree.
For the node which have more than two branches,
any word reorderings are allowed.
For non-binary tree, the number of combinations
of IST-ITG can represented by .
( represents number of branches in -th
node)
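Both the enumeration and the ∏ a_i! count can be sketched as follows (our own code). The example tree has a binary root and one ternary node, so the count is 2! × 3! = 12, matching the 12 orders listed on the previous slide.

```python
from itertools import permutations, product
from math import factorial

def ist_itg_orders(tree):
    """Non-binary IST-ITG: the children of every node may be permuted freely."""
    if isinstance(tree, str):          # leaf word
        return {tree}
    child_orders = [ist_itg_orders(c) for c in tree]
    out = set()
    for perm in permutations(child_orders):     # reorder the branches
        for pick in product(*perm):             # pick one order per branch
            out.add("".join(pick))
    return out

def num_orders(tree):
    """Closed form from the slide: product over nodes of (branches)!"""
    if isinstance(tree, str):
        return 1
    n = factorial(len(tree))
    for c in tree:
        n *= num_orders(c)
    return n

tree = ("a", ("b", "c", "d"))   # binary root, one ternary node
print(len(ist_itg_orders(tree)), num_orders(tree))  # 12 12
```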
9IST-ITG in Phrase-based SMT (1)
The unit of parsing tree is word, but the
unit of phrase-based
SMT is phrase. Units are different.
? Word-to-word alignments are sometimes not
one-to-one. But phrase-to-phrase
alignments are always one-to-one
Additional rules for phrase-based SMT
1) Word reordering that breaks a phrase is not
allowed.
2) Phrase internal word reordering is not checked.
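One way to read the two rules in code (our sketch; the tree and phrase strings are illustrative, not from the paper): generate word orders from the source tree, discard any order that splits a phrase (rule 1), and accept any internal order of a phrase's words when matching (rule 2).

```python
from itertools import product

def ist_itg_orders(tree):
    """Keep-or-swap enumeration for a binary source tree (as on slide 5)."""
    if isinstance(tree, str):
        return {tree}
    l, r = ist_itg_orders(tree[0]), ist_itg_orders(tree[1])
    return {a + b for a, b in product(l, r)} | {b + a for a, b in product(l, r)}

def phrase_orders(tree, phrases):
    """Phrase sequences compatible with IST-ITG under the two rules:
    rule 1 discards word orders that break a phrase; rule 2 ignores
    the word order inside a phrase (any anagram of the phrase matches)."""
    allowed = set()
    for order in ist_itg_orders(tree):
        seq, i = [], 0
        while i < len(order):
            for p in phrases:
                if sorted(order[i:i + len(p)]) == sorted(p):
                    seq.append(p)
                    i += len(p)
                    break
            else:       # no phrase fits here: this word order breaks a phrase
                break
        if i == len(order):
            allowed.add(tuple(seq))
    return allowed

tree = (("a", "b"), ("c", "d"))
print(phrase_orders(tree, ["ab", "cd"]))   # two phrase orders survive
```

With phrases "ab" and "cd" on this tree, only the phrase sequences (ab, cd) and (cd, ab) remain: phrase-to-phrase reordering is checked, phrase-internal order is not.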
10. IST-ITG in Phrase-based SMT (2)

[Figure: source tree with nodes A-G, phrase segmentation "Ph", and five candidate reorderings: 1 NG, 2 NG, 3 OK, 4 NG, 5 OK (NG = unacceptable)]
16Decoding Algorithm with IST-ITG
2
2
1
0
0
0
B
C
D
E
F
G
A
H
I
0
0
1
1
0
0
0
0
0
d e
0Untranslated 1Translated 2Translating
17Decoding Algorithm with IST-ITG
NG
2
1
2
0
0
B
C
D
E
F
G
A
H
I
1
0
1
1
0
0
1
0
0
d e a b
If phrases A and B are translated, Sub-tree that
includes more than two 2 ? NG
18Decoding Algorithm with IST-ITG
2
2
1
0
0
0
B
C
D
E
F
G
A
H
I
0
0
1
1
0
0
0
0
0
d e
Consider minimum Translating sub-tree (sub-tree
that includes both 0 and 1.)
19Decoding Algorithm with IST-ITG
2
1
1
0
1
2
B
C
D
E
F
G
A
H
I
0
0
1
1
1
1
0
1
0
d e f g h
All of minimum Translating sub-tree are
translated. ? OK
20. Decoding Algorithm with IST-ITG (cont.)

[Figure: phrases "d e g" translated; node states shown]
Translating a sub-part of the minimum translating sub-tree → OK.
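The 0/1/2 bookkeeping on slides 16-18 can be sketched as follows (our simplified reading; names and the example tree are ours, and the full acceptance test in the decoder involves more than this). A node is 0 when no word below it is covered, 1 when all are, and 2 (translating) otherwise; the decoder tracks the minimum translating sub-tree, the smallest sub-tree containing both covered and uncovered words.

```python
def words(tree):
    """Leaf words of a (possibly non-binary) tree, left to right."""
    return [tree] if isinstance(tree, str) else [w for c in tree for w in words(c)]

def state(tree, covered):
    """0 = untranslated, 1 = translated, 2 = translating (partly covered)."""
    below = set(words(tree))
    if below <= covered:
        return 1
    if not below & covered:
        return 0
    return 2

def minimum_translating_subtree(tree, covered):
    """Smallest sub-tree that includes both translated (1) and
    untranslated (0) words, per slide 18."""
    if state(tree, covered) != 2:
        return None
    for child in tree:
        # descend only into a child that still contains all covered words
        if covered & set(words(tree)) <= set(words(child)):
            sub = minimum_translating_subtree(child, covered)
            if sub is not None:
                return sub
    return tree

tree = (("a", "b"), (("c", "d"), ("e", "f")))
covered = {"c", "d"}               # phrase "c d" has been translated
print(state(tree, covered))                        # 2
print(minimum_translating_subtree(tree, covered))  # the ((c d)(e f)) sub-tree
```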
21. English and Japanese Patent Corpus Experiments

Experimental corpus size:

              # of sent.   Total words   # of entries
  E/J Train   1.8M         60M/64M       188K/118K
  E/J Dev     916          30K/32K       4,072/3,646
  E/J Eval    899          29K/32K       3,967/3,682

Single reference.
22Other Experimental Conditions
LM training SRI Language model toolkit
(5-grams) Word alignment for TM training
GIZA Decoder Moses compatible in-house decoder
named CleopATRa
Evaluation measures
BLEU,NIST,WER,PER
23. English and Japanese Patent Translation: Experimental Results

English-to-Japanese:

                     BLEU    NIST   WER     PER
  Monotone           24.91   6.95   79.97   40.02
  No Constraint      26.83   7.19   81.10   39.52
  IBM                28.34   7.29   78.35   39.25
  IST-ITG            30.26   7.41   74.90   38.93
  IBM+Lex            31.17   7.50   76.30   38.61
  IBM+Lex+IST-ITG    32.20   7.61   71.18   38.15
25. English and Japanese Patent Translation: Experimental Results (cont.)

Japanese-to-English:

            BLEU    NIST   WER     PER
  IBM+Lex   29.93   7.54   77.27   39.12
  IST-ITG   29.77   7.50   72.80   39.73
27. English-to-Chinese Translation Experiments

NIST MT08 English-to-Chinese track.

                       Sentences   References
  Training data (TM)   6.2M        -
  Training data (LM)   20.1M       -
  Development data     1,664       1
  Evaluation data      1,859       4

Experimental results:

            W-BLEU   C-BLEU   WER    CER
  IBM+Lex   21.0     35.2     75.0   74.1
  IST-ITG   23.2     37.0     67.9   69.7
28Conclusion
We proposed new word reordering constrains
IST-ITG using source tree structure. It is
extension of ITG constraints.
We conducted three experiments of proposed
method E-J and J-E patent translation and NIST
MT08 E-C track. In all experiments, improvements
of BLEU and WER are confirmed.
Especially, improvement for WER is very large,
and effectiveness for global word reordering is
confirmed.
29. Thank you!