Title: Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement
1 Error Analysis of Two Types of Grammar for the purpose of Automatic Rule Refinement
- Ariadna Font Llitjós, Katharina Probst, Jaime Carbonell
- Language Technologies Institute, Carnegie Mellon University
- AMTA 2004
2 Outline
- Automatic Rule Refinement
- AVENUE and resource-poor scenarios
- Experiment
- Data (eng2spa)
- Two types of grammar
- Evaluation results
- Error analysis
- RR required for each type
- Conclusions and Future Work
3 Motivation for Automatic RR
- General
- MT output still requires post-editing
- Current systems do not recycle post-editing efforts back into the system, beyond adding them as new training data
- Within AVENUE
- Resource-poor scenarios lack a manual grammar, or have only a very small initial grammar
- Need to validate the elicitation corpus and the automatically learned translation rules
5 AVENUE and resource-poor scenarios
- No electronic data available (often a spoken tradition), so no SMT or EBMT
- Lack of computational linguists to write a grammar
- So how can we even start to think about MT? That's what AVENUE is all about:
- Elicitation Corpus
- Automatic Rule Learning + Rule Refinement
- What do we usually have available in resource-poor scenarios? Bilingual users
6 AVENUE overview
7 Automatic and Interactive RLR
[Diagram]
- 1st step: sentence pairs (SL Sentence 1 / TL Sentence 1, SL Sentence 2 / TL Sentence 2, ...) are fed to rule learning, which outputs an automatically learned rule R
- 2nd step: R translates SL S3 into TL S3; the corrected translation TL S3' is fed to the RR module, which outputs R' (R refined), which now translates SL S3 into TL S3'
8 Interactive Elicitation of MT errors
- Assumptions
- Non-expert bilingual users can reliably detect and minimally correct MT errors, given:
- the SL sentence (I saw you)
- up to 5 TL sentences (Yo vi tú, ...)
- word-to-word alignments (I-yo, saw-vi, you-tú)
- (context)
- using an online GUI, the Translation Correction Tool (TCTool)
- Goal: simplify the MT correction task maximally
- User studies: 90% error detection accuracy and 73% error classification accuracy [LREC 2004]
9 1st Eng2Spa user study
Interactive elicitation of error information
- LREC 2004
- Manual grammar: 12 rules, 442 lexical entries
- MT error classification (v0.0): 9 linguistically-motivated classes
- word order, sense, agreement error (number, person, gender, tense), form, incorrect word and no translation
- Test set: 32 sentences from the AVENUE Elicitation Corpus (4 correct / 28 incorrect)
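The nine classes can be written down as a small enumeration. This is an illustrative sketch only: the identifier names paraphrase the slide's labels and are not the actual v0.0 tag set.

```python
from enum import Enum

# Sketch of the v0.0 MT error classification (9 classes), per the slide:
# word order, sense, agreement (number, person, gender, tense),
# form, incorrect word, and no translation.
class MTError(Enum):
    WORD_ORDER = "word order"
    SENSE = "sense"
    AGR_NUMBER = "agreement: number"
    AGR_PERSON = "agreement: person"
    AGR_GENDER = "agreement: gender"
    AGR_TENSE = "agreement: tense"
    FORM = "form"
    INCORRECT_WORD = "incorrect word"
    NO_TRANSLATION = "no translation"
```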
10 Data Analysis
Interactive elicitation of error information
- For 10 (of the 29) users:
- from Spain (to reduce geographical differences)
- 2 had a Linguistics background
- 2 had a Bachelor's degree, 5 a Master's and 3 a PhD
- Interested in high precision, even at the expense of lower recall: ideally no false positives (users correcting something that is not strictly necessary); we don't care so much about false negatives (errors that were not corrected)
11 TCTool v0.1
- Actions:
- Add a word
- Delete a word
- Modify a word
- Change word order
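The four actions above can be logged as structured edit records over the TL token list. A minimal sketch, assuming a token-list representation; the `Correction` record and `apply` function are hypothetical illustrations, not the TCTool API.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Correction:
    """One user edit action on a TL sentence (hypothetical record format)."""
    action: str                      # "add" | "delete" | "modify" | "reorder"
    position: int                    # token index the action applies to
    old_word: Optional[str] = None   # word before the edit (delete/modify)
    new_word: Optional[str] = None   # word after the edit (add/modify)
    new_pos: Optional[int] = None    # target index for "reorder"

def apply(tokens: List[str], c: Correction) -> List[str]:
    """Apply one correction, returning a new token list."""
    t = list(tokens)
    if c.action == "add":
        t.insert(c.position, c.new_word)
    elif c.action == "delete":
        del t[c.position]
    elif c.action == "modify":
        t[c.position] = c.new_word
    elif c.action == "reorder":
        t.insert(c.new_pos, t.pop(c.position))
    return t
```

For example, correcting "Yo vi tú" to "Yo te vi" takes one modify ("tú" to "te") plus one word-order change.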
12 RR Framework
- Find the best RR operations given a
- grammar (G),
- lexicon (L),
- (set of) source language sentence(s) (SL),
- (set of) target language sentence(s) (TL),
- its parse tree (P), and
- a minimal correction of TL (TL')
- such that TQ2 > TQ1
- Which can also be expressed as:
- max TQ(TL, TL', P, SL, RR(G, L))
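The search this formulation implies can be sketched as follows. This is a schematic only, not the AVENUE implementation: `candidate_ops` and `tq` are placeholder hooks standing in for the RR operations and the translation-quality score.

```python
def best_refinement(grammar, lexicon, sl, tl, tl_prime, parse,
                    candidate_ops, tq):
    """Try each candidate RR operation and keep the refined (grammar,
    lexicon) with the highest TQ, requiring TQ2 > TQ1 (the baseline
    score of the unrefined system)."""
    best = (grammar, lexicon)
    best_tq = tq(tl, tl_prime, parse, sl, grammar, lexicon)   # TQ1
    for op in candidate_ops:
        g2, l2 = op(grammar, lexicon)                          # apply one RR op
        new_tq = tq(tl, tl_prime, parse, sl, g2, l2)           # TQ2
        if new_tq > best_tq:                                   # keep only improvements
            best, best_tq = (g2, l2), new_tq
    return best
```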
13 Types of RR operations
- Grammar
- bifurcate: R0 → R0, R1 [R1 = R0 + constr]; Cov[R0] ⊆ Cov[R0, R1]
- refine: R0 → R1 [R1 = R0 + constr]; Cov[R0] ⊇ Cov[R1]
- R0 → R1 [R1 = R0 + constr = c] + R2 [R2 = R0 + constr = ¬c]; Cov[R0] = Cov[R1, R2]
- Lexicon
- Lex0 → Lex0, Lex1 [Lex1 = Lex0 + constr]
- Lex0 → Lex1 [Lex1 = Lex0 + constr]
- Lex0 → Lex1 [Lex1 = Lex0 with a different TL word]
- ∅ → Lex1 (adding a lexical item)
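The two grammar operations differ in how they move coverage: bifurcate keeps R0 and adds a constrained copy R1 (coverage can only grow), while refine replaces R0 by the constrained R1 (coverage can only shrink). An illustrative sketch, assuming a toy rule representation that is not the AVENUE rule formalism:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Toy transfer rule: a name plus a tuple of feature constraints."""
    name: str
    constraints: tuple = ()

def bifurcate(grammar, r0, constr):
    """R0 -> R0, R1 where R1 = R0 + constr: the original rule stays."""
    r1 = Rule(r0.name + "'", r0.constraints + (constr,))
    return grammar + [r1]

def refine(grammar, r0, constr):
    """R0 -> R1 where R1 = R0 + constr: the original rule is replaced."""
    r1 = Rule(r0.name + "'", r0.constraints + (constr,))
    return [r1 if r is r0 else r for r in grammar]
```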
14 Experiment
- Data (eng2spa)
- Grammars: manual vs. learned
- Results
- Error analysis
- Types of RR operations required by each grammar
15 Data: English - Spanish
- Training
- First 200 sentences from the AVENUE Elicitation Corpus
- Lexicon extracted semi-automatically from the first 400 sentences (442 entries)
- Test
- 32 sentences manually selected from the next 200 sentences in the EC to showcase a variety of MT errors
16 Manual grammar
- 12 rules (2 S, 7 NP, 3 VP)
- Produces 1.6 different translations on average
17 Learned Grammar + feature constraints
- 316 rules (194 S, 43 NP, 78 VP, 1 PP)
- Emulated a decoder by reordering 3 rules
- Produces 18.6 different translations on average
18 Comparing Grammar Output Results
- Manual evaluation
- Automatic MT evaluation
19 Error Analysis
- Most of the errors produced by the manual grammar can be classified into:
- lack of subject-predicate agreement
- wrong word order of object (clitic) pronouns
- wrong preposition
- wrong form (case)
- OOV words
- On top of these, the learned grammar output exhibited errors of the following types:
- lack of agreement constraints
- missing preposition
- over-generalization
20 Examples
- Same (both good)
- Manual Grammar better
- Learned Grammar better
- Different (both bad)
21 Types of RR required for each grammar
- Manual Grammar
- Bifurcate a rule to code an exception
- R0 → R0, R1 [R1 = R0 + constr]; Cov[R0] ⊆ Cov[R0, R1]
- R0 → R1 [R1 = R0 + constr = c] + R2 [R2 = R0 + constr = ¬c]; Cov[R0] = Cov[R1, R2]
- Learned Grammar
- Adjust feature constraints, such as agreement
- R0 → R1 [R1 = R0 − constr]; Cov[R0] ⊆ Cov[R1]
22 Conclusions
- TCTool + RR can improve both hand-crafted and automatically learned grammars.
- In the current experiment, MT errors differ almost 50% of the time, depending on the type of grammar.
- The manual grammar will need to be refined to encode exceptions, whereas the learned grammar will need to be refined to achieve the right level of generalization.
- We expect RR to give the most leverage when combined with the learned grammar.
23 Future Work
- Experiment where user corrections are used both as new training examples for RL and to refine the existing grammar with the RR module.
- Investigate using reference translations to refine MT grammars automatically... but this is much harder, since they are not minimal post-editions.
24 Questions??? Thank you!
25 2 steps to ARR
- Interactive elicitation of error information
- Automatic Rule Adaptation
26 Error Correction by bilingual users
27 MT error typology for RR (simplified)
- missing word
- extra word
- word order (local vs. long-distance, word vs. phrase, word change)
- incorrect word (sense, form, selectional restrictions, idiom, ...)
- agreement (missing constraint, extra agreement constraint)
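One way to read this typology is as a dispatch table from observed error type to the kind of RR operation that could repair it. The pairing below is a hypothetical illustration (the mapping itself is not stated on the slides), loosely echoing the operation types listed under the RR Framework: bifurcate, make more specific/general, add blocking constraints.

```python
# Hypothetical error-type -> candidate-repair lookup; the left-hand keys
# come from the simplified typology, the right-hand values are assumptions.
ERROR_TO_REPAIR = {
    "missing word": "add a lexical item or an insertion rule",
    "extra word": "add a blocking constraint",
    "word order": "bifurcate a rule with reordered constituents",
    "incorrect word": "refine the lexical entry (sense/form)",
    "agreement": "add or remove an agreement constraint",
}
```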
28 RR Framework
- Types of operations: bifurcate, make more specific/general, add blocking constraints, etc.
- Formalizing error information (clue word)
- Finding triggering features