Linguistics 187/287 Week 6

Transcript and Presenter's Notes

1
Linguistics 187/287 Week 6
Generation / Term-rewrite System / Machine Translation
  • Martin Forst, Ron Kaplan, and Tracy King

2
Generation
  • Parsing: string to analysis
  • Generation: analysis to string
  • What type of input?
  • How to generate

3
Why generate?
  • Machine translation
  • Lang1 string -> Lang1 fstr -> Lang2 fstr -> Lang2
    string
  • Sentence condensation
  • Long string -> fstr -> smaller fstr -> new string
  • Question answering
  • Production of NL reports
  • State of machine or process
  • Explanation of logical deduction
  • Grammar debugging

4
F-structures as input
  • Use f-structures as input to the generator
  • May parse sentences that shouldn't be generated
  • May want to constrain number of generated options
  • Input f-structure may be underspecified

5
XLE generator
  • Use the same grammar for parsing and generation
  • Advantages
  • maintainability
  • write rules and lexicons once
  • But
  • special generation tokenizer
  • different OT ranking

6
Generation tokenizer/morphology
  • White space
  • Parsing: multiple white space becomes a single TB
  • John appears. -> John TB appears TB . TB
  • Generation: single TB becomes a single space (or
    nothing)
  • John TB appears TB . TB -> John appears.

  • John appears .
  • Suppress variant forms
  • Parse both favor and favour
  • Generate only one

7
Morphconfig for parsing and generation
  • STANDARD ENGLISH MORPHOLOGY (1.0)
  • TOKENIZE:
  • P!eng.tok.parse.fst G!eng.tok.gen.fst
  • ANALYZE:
  • eng.infl-morph.fst G!amerbritfilter.fst
  • G!amergen.fst
  • ----

8
Reversing the parsing grammar
  • The parsing grammar can be used directly as a
    generator
  • Adapt the grammar with a special OT ranking
    GENOPTIMALITYORDER
  • Why do this?
  • parse ungrammatical input
  • have too many options

9
Ungrammatical input
  • Linguistically ungrammatical
  • They walks.
  • They ate banana.
  • Stylistically ungrammatical
  • No ending punctuation: They appear
  • Superfluous commas: John, and Mary appear.
  • Shallow markup: [NP John and Mary] appear.

10
Too many options
  • All the generated options can be linguistically
    valid, but too many for applications
  • Occurs when more than one string has the same,
    legitimate f-structure
  • PP placement:
  • In the morning I left. / I left in the morning.

11
Using the Gen OT ranking
  • Generally much simpler than in the parsing
    direction
  • Usually only use standard marks and NOGOOD
  • no marks, no STOPPOINT
  • Can have a few marks that are shared by several
    constructions
  • one or two for dispreferred
  • one or two for preferred

12
Example: Prefer initial PP
  • S --> (PP @ADJUNCT @(OT-MARK GenGood))
    NP @SUBJ
    VP.
  • VP --> V
    (NP @OBJ)
    (PP @ADJUNCT).
  • GENOPTIMALITYORDER NOGOOD +GenGood.
  • parse they appear in the morning.
  • generate without OT: In the morning they appear. /
    They appear in the morning.
  • with OT: In the morning they appear.

13
Debugging the generator
  • When generating from an f-structure produced by
    the same grammar, XLE should always generate
  • Unless
  • OT marks block the only possible string
  • something is wrong with the tokenizer/morphology
  • regenerate-morphemes: if this gets a
    string, the tokenizer/morphology is not the
    problem
  • Hard to debug: XLE has robustness features to help

14
Underspecified Input
  • F-structures provided by applications are not
    perfect
  • may be missing features
  • may have extra features
  • may simply not match the grammar coverage
  • Missing and extra features are often systematic
  • specify in XLE which features can be added and
    deleted
  • Not matching the grammar is a more serious problem

15
Adding features
  • English to French translation
  • English nouns have no gender
  • French nouns need gender
  • Solution: have XLE add gender
  • the French morphology will control
    the value
  • Specify additions in xlerc
  • set-gen-adds add "GEND"
  • can add multiple features
  • set-gen-adds add "GEND CASE PCASE"
  • XLE will optionally insert the feature

Note: Unconstrained additions make generation
undecidable
16
Example
The cat sleeps. -> Le chat dort.

Input f-structure (no GEND):
  [ PRED  'dormir<SUBJ>'
    SUBJ  [ PRED 'chat'
            NUM  sg
            SPEC def ]
    TENSE present ]

Output f-structure (GEND added by the generator):
  [ PRED  'dormir<SUBJ>'
    SUBJ  [ PRED 'chat'
            NUM  sg
            GEND masc
            SPEC def ]
    TENSE present ]
17
Deleting features
  • French to English translation
  • delete the GEND feature
  • Specify deletions in xlerc
  • set-gen-adds remove "GEND"
  • can remove multiple features
  • set-gen-adds remove "GEND CASE PCASE"
  • XLE obligatorily removes the features
  • no GEND feature will remain in the f-structure
  • if a feature takes an f-structure value, that
    f-structure is also removed
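
A before/after sketch in the French-to-English direction, reusing
the slide-16 f-structure (labels assumed):

  SUBJ [ PRED 'chat', NUM sg, GEND masc, SPEC def ]
    ->  SUBJ [ PRED 'chat', NUM sg, SPEC def ]

Since GEND here takes an atomic value, only the single feature
disappears; a feature whose value is itself an f-structure would
take that whole substructure with it (as noted above).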

18
Changing values
  • If values of a feature do not match between the
    input f-structure and the grammar
  • delete the feature and then add it
  • Example case assignment in translation
  • set-gen-adds remove "CASE"
  • set-gen-adds add "CASE"
  • allows dative case in input to become accusative
  • e.g., exceptional case marking verb in input
    language but regular case in output language
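
Schematically, for a hypothetical object that is dative in the
source but accusative in the target (a sketch):

  OBJ [ PRED 'child', CASE dat ]   (case still from the source)
    after remove "CASE":  OBJ [ PRED 'child' ]
    after add "CASE":     OBJ [ PRED 'child', CASE acc ]
                          (value supplied by the target grammar)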

19
Generation for Debugging
  • Checking for grammar and lexicon errors
  • create-generator english.lfg
  • reports ill-formed rules, templates, feature
    declarations, lexical entries
  • Checking for ill-formed sentences that can be
    parsed
  • parse a sentence
  • see if all the results are legitimate strings
  • regenerate they appear.
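
A minimal check of both kinds might look like this (a sketch;
XLE's actual messages are abbreviated here):

  create-generator english.lfg
      (reports ill-formed rules, templates, feature declarations,
       and lexical entries, if any)
  regenerate they appear.
      (parses the string and then generates from the resulting
       f-structures; any ill-formed strings among the results
       point to overgeneration)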

20
Rewriting/Transfer System
21
Why a Rewrite System
  • Grammars produce c-/f-structure output
  • Applications may need to manipulate this
  • Remove features
  • Rearrange features
  • Continue linguistic analysis (semantics,
    knowledge representation next week)
  • XLE has a general purpose rewrite system (aka
    "transfer" or "xfr" system)

22
Sample Uses of Rewrite System
  • Sentence condensation
  • Machine translation
  • Mapping to logic for knowledge representation and
    reasoning
  • Tutoring systems

23
What does the system do?
  • Input: set of "facts"
  • Apply a set of ordered rules to the facts
  • this gradually changes the set of input facts
  • Output: new set of facts
  • Rewrite system uses the same ambiguity management
    as XLE
  • can efficiently rewrite packed structures,
    maintaining the packing

24
Example F-structure Facts
  • PERS(var(1),3)
  • PRED(var(1),girl)
  • CASE(var(1),nom)
  • NTYPE(var(1),common)
  • NUM(var(1),pl)
  • SUBJ(var(0),var(1))
  • PRED(var(0),laugh)
  • TNS-ASP(var(0),var(2))
  • TENSE(var(2),pres)
  • arg(var(0),1,var(1))
  • lex_id(var(0),1)
  • lex_id(var(1),0)
  • F-structures get var()
  • Special arg facts
  • lex_id for each PRED
  • Facts have two arguments (except arg)
  • Rewrite system allows for any number of arguments

25
Rule format
  • Obligatory rule: LHS ==> RHS.
  • Optional rule: LHS ?=> RHS.
  • Unresourced fact: |- clause.
  • LHS
  • clause: match and delete
  • +clause: match and keep
  • -LHS: negation (don't have fact)
  • LHS, LHS: conjunction
  • ( LHS | LHS ): disjunction
  • ProcedureCall: procedural attachment
  • RHS
  • clause: replacement facts
  • 0: empty set of replacement facts
  • stop: abandon the analysis
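
A small made-up pair of rules exercising several of these devices
(a sketch, not from the slides; the STYLE feature is hypothetical):

  "if the structure is marked as bad, abandon this analysis"
  STYLE(%F,bad) ==> stop.

  "otherwise optionally drop the STYLE mark, keeping the PRED"
  +PRED(%F,%%), STYLE(%F,%%) ?=> 0.

The first rule rewrites to stop, abandoning the analysis; the
second matches and keeps the PRED fact (+), matches and deletes the
STYLE fact, and rewrites it to the empty set of facts (0).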

26
Example rules
Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SUBJ(var(0),var(1)),
  PRED(var(0),laugh), TNS-ASP(var(0),var(2)), TENSE(var(2),pres),
  arg(var(0),1,var(1)), lex_id(var(0),1), lex_id(var(1),0)

Rules:
"PRS (1.0)"
grammar = toy_rules.

"obligatorily add a determiner if there is a noun with no spec"
+NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F,def).

"optionally make plural nouns singular; this will split the choice space"
NUM(%F, pl) ?=> NUM(%F, sg).
27
Example: Obligatory Rule
Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SUBJ(var(0),var(1)),
  PRED(var(0),laugh), TNS-ASP(var(0),var(2)), TENSE(var(2),pres),
  arg(var(0),1,var(1)), lex_id(var(0),1), lex_id(var(1),0)

Rule:
"obligatorily add a determiner if there is a noun with no spec"
+NTYPE(%F,%%), -SPEC(%F,%%) ==> SPEC(%F,def).

Output facts: all the input facts plus
  SPEC(var(1),def)
28
Example: Optional Rule
Rule:
"optionally make plural nouns singular; this will split the choice space"
NUM(%F, pl) ?=> NUM(%F, sg).

Input facts:
  PERS(var(1),3), PRED(var(1),girl), CASE(var(1),nom),
  NTYPE(var(1),common), NUM(var(1),pl), SPEC(var(1),def),
  SUBJ(var(0),var(1)), PRED(var(0),laugh), TNS-ASP(var(0),var(2)),
  TENSE(var(2),pres), arg(var(0),1,var(1)), lex_id(var(0),1),
  lex_id(var(1),0)

Output facts: all the input facts plus a choice split:
  A1: NUM(var(1),pl)
  A2: NUM(var(1),sg)
29
Output of example rules
  • Output is a packed f-structure
  • Generation gives two sets of strings
  • The girls laugh.
  • The girl laughs.

30
Manipulating sets
  • Sets are represented with an in_set feature
  • He laughs in the park with the telescope
  • ADJUNCT(var(0),var(2))
  • in_set(var(4),var(2))
  • in_set(var(5),var(2))
  • PRED(var(4),in)
  • PRED(var(5),with)
  • Might want to optionally remove adjuncts
  • but not negation

31
Example: Adjunct Deletion Rules
  • "optionally remove member of adjunct set"
  • +ADJUNCT(%%, %AdjSet), in_set(%Adj, %AdjSet),
  • -PRED(%Adj, not)
  • ?=> 0.
  • "obligatorily remove adjunct with nothing in it"
  • ADJUNCT(%%, %Adj), -in_set(%%, %Adj)
  • ==> 0.

He laughs with the telescope in the park.
He laughs in the park with the telescope.
He laughs with the telescope.
He laughs in the park.
He laughs.
32
Manipulating PREDs
  • Changing the value of a PRED is easy
  • PRED(%F,girl) ==> PRED(%F,boy).
  • Changing the argument structure is trickier
  • Make any changes to the grammatical functions
  • Make the arg facts correlate with these

33
Example: Passive Rule
  • "make actives passive:
  • make the subject NULL; make the object the
    subject;
  • put in features"
  • SUBJ(%Verb, %Subj), arg(%Verb, %Num, %Subj),
  • OBJ(%Verb, %Obj), CASE(%Obj, acc)
  • ==>
  • SUBJ(%Verb, %Obj), arg(%Verb, %Num, NULL),
    CASE(%Obj, nom),
  • PASSIVE(%Verb, +), VFORM(%Verb, pass).

the girls saw the monkeys => The monkeys were seen.
in the park the girls saw the monkeys => In the park
the monkeys were seen.
34
Templates and Macros
  • Rules can be encoded as templates
  • n2n(%Eng,%Frn)
  • PRED(%F,%Eng), +NTYPE(%F,%%)
  • ==> PRED(%F,%Frn).
  • @n2n(man, homme).
  • @n2n(woman, femme).
  • Macros encode groups of clauses/facts
  • sg_noun(%F)
  • +NTYPE(%F,%%), +NUM(%F,sg).
  • @sg_noun(%F), -SPEC(%F,%%)
  • ==> SPEC(%F,def).
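
For concreteness, the call @n2n(man, homme) stands for the rule
obtained by substituting the template's arguments into its body
(a sketch of the expansion):

  PRED(%F,man), +NTYPE(%F,%%)
  ==> PRED(%F,homme).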

35
Unresourced Facts
  • Facts can be stipulated in the rules and referred
    to
  • Often used as a lexicon of information not
    encoded in the f-structure
  • For example, a list of days and months for the
    manipulation of dates
  • |- day(Monday). |- day(Tuesday). etc.
  • |- month(January). |- month(February). etc.
  • PRED(%F,%Pred), ( day(%Pred) | month(%Pred) )
    ==> ...

36
Rule Ordering
  • Rewrite rules are ordered (unlike LFG syntax
    rules but like finite-state rules)
  • Output of rule1 is input to rule2
  • Output of rule2 is input to rule3
  • This allows for feeding and bleeding
  • Feeding: insert facts used by later rules
  • Bleeding: remove facts needed by later rules
  • Can make debugging challenging

37
Example of Rule Feeding
  • Early Rule: Insert SPEC on nouns
  • +NTYPE(%F,%%), -SPEC(%F,%%) ==>
  • SPEC(%F, def).
  • Later Rule: Allow plural nouns to become singular
    only if they have a specifier (to avoid bad count
    nouns)
  • NUM(%F,pl), +SPEC(%F,%%) ==> NUM(%F,sg).

38
Example of Rule Bleeding
  • Early Rule: Turn actives into passives
    (simplified)
  • SUBJ(%F,%S), OBJ(%F,%O) ==>
  • SUBJ(%F,%O), PASSIVE(%F,+).
  • Later Rule: Impersonalize actives
  • SUBJ(%F,%%), -PASSIVE(%F,+) ==>
  • SUBJ(%F,%S), PRED(%S,they), PERS(%S,3),
    NUM(%S,pl).
  • will apply to intransitives and verbs with
    (X)COMPs but not transitives

39
Debugging
  • XLE command line: tdbg
  • steps through rules stating how they apply

Rule 1: (NTYPE(%F,%A)), -(SPEC(%F,%B))
        ==> SPEC(%F,def)
File ~thking/courses/ling187/hws/thk.pl, lines 4-10
Rule 1 matches:
  (2) NTYPE(var(1),common)
  1 --> SPEC(var(1),def)

Rule 2: NUM(%F,pl)
        ?=> NUM(%F,sg)
File ~thking/courses/ling187/hws/thk.pl, lines 11-17
Rule 2 matches:
  (3) NUM(var(1),pl)
  1 --> NUM(var(1),sg)

Rule 5: SUBJ(%Verb,%Subj),
        arg(%Verb,%Num,%Subj), OBJ(%Verb,%Obj),
        CASE(%Obj,acc) ==> SUBJ(%Verb,%Obj),
        arg(%Verb,%Num,NULL), CASE(%Obj,nom),
        PASSIVE(%Verb,+), VFORM(%Verb,pass)
File ~thking/courses/ling187/hws/thk.pl, lines 28-37
Rule does not apply.

girls laughed
40
Running the Rewrite System
  • create-transfer: adds menu items
  • load-transfer-rules FILE: loads rules from file
  • f-str window under commands has:
  • transfer: prints output of rules in XLE window
  • translate: runs output through generator
  • Need to do (where the path is XLEPATH/lib):
  • setenv LD_LIBRARY_PATH
    /afs/ir.stanford.edu/data/linguistics/XLE/SunOS/lib
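
A typical session might look like this (a sketch; the rules file
name is hypothetical):

  # in the shell, before starting XLE
  setenv LD_LIBRARY_PATH XLEPATH/lib

  # at the XLE prompt (or in xlerc)
  create-transfer
  load-transfer-rules toy_rules.pl
  parse they appear in the morning.
  # then, in the f-structure window:
  #   commands > transfer   (print the rewritten facts)
  #   commands > translate  (run the result through the generator)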

41
Rewrite Summary
  • The XLE rewrite system lets you manipulate the
    output of parsing
  • Creates versions of output suitable for
    applications
  • Can involve significant reprocessing
  • Rules are ordered
  • Ambiguity management is as with parsing

42
Grammatical Machine Translation
  • Stefan Riezler John Maxwell

43
Translation System
[Pipeline diagram: a German string is parsed with the German LFG
grammar (XLE parsing) into f-structures; translation rules map these
into English f-structures; the English LFG grammar (XLE generation)
produces the output string. Lots of statistics guide each step.]
44
Transfer-Rule Induction from aligned bilingual
corpora
  • Use standard techniques to find many-to-many
    candidate word-alignments in source-target
    sentence-pairs
  • Parse source and target sentences using LFG
    grammars for German and English
  • Select most similar f-structures in source and
    target
  • Define many-to-many correspondences between
    substructures of f-structures based on
    many-to-many word alignment
  • Extract primitive transfer rules directly from
    aligned f-structure units
  • Create powerset of possible combinations of basic
    rules and filter according to contiguity and type
    matching constraints

45
Induction
  • Example sentences: Dafür bin ich zutiefst
    dankbar. / I have a deep appreciation for that.
  • Many-to-many word alignment:
  • Dafür{6,7} bin{2} ich{1} zutiefst{3,4,5}
    dankbar{5}
  • F-structure alignment

46
Extracting Primitive Transfer Rules
  • Rule (1) maps lexical predicates
  • Rule (2) maps lexical predicates and interprets
    subj-to-subj link as indication to map subj of
    source with this predicate into subject of target
    and xcomp of source into object of target
  • X1, X2, X3, etc. are variables for f-structures
  • (2) PRED(X1, sein),
  •     SUBJ(X1,X2),
  •     XCOMP(X1,X3)
  •     ==>
  •     PRED(X1, have),
  •     SUBJ(X1,X2),
  •     OBJ(X1,X3)

(1) PRED(X1, ich) ==> PRED(X1, I)
47
Extracting Complex Transfer Rules
  • Complex rules are created by taking all
    combinations of primitive rules, and filtering
  • (4) zutiefst dankbar sein
  •     ==> have a deep appreciation
  • (5) zutiefst dankbar dafür sein
  •     ==> have a deep appreciation for that
  • (6) ich bin zutiefst dankbar dafür
  •     ==> I have a deep appreciation for that

48
Transfer Contiguity constraint
  • Transfer contiguity constraint
  • Source and target f-structures each have to be
    connected
  • F-structures in the transfer source can only be
    aligned with f-structures in the transfer target,
    and vice versa
  • Analogous to constraint on contiguous and
    alignment-consistent phrases in phrase-based SMT
  • Prevents extraction of a rule that would translate
    dankbar directly into appreciation, since
    appreciation is also aligned to zutiefst
  • Transfer contiguity allows learning idioms like
    es gibt -> there is from configurations that are
    local in f-structure but non-local in the string,
    e.g., es scheint zu geben -> there seems
    to be

49
Linguistic Filters on Transfer Rules
  • Morphological stemming of PRED values
  • (Optional) filtering of f-structure snippets
    based on consistency of linguistic categories
  • Extraction of a snippet that translates zutiefst
    dankbar into a deep appreciation maps the
    incompatible categories adjectival and nominal
    (valid only in a string-based world)
  • Translation of sein to have might be discarded
    because of the adjectival vs. nominal types of
    their arguments
  • The larger rule mapping zutiefst dankbar sein to
    have a deep appreciation is OK since the verbal
    types match

50
Transfer
  • Parallel application of transfer rules in
    non-deterministic fashion
  • Unlike the XLE ordered-rule rewrite system
  • Each fact must be transferred by exactly one rule
  • Default rule transfers any fact as itself
  • Transfer works on a chart, using the parser's
    unification mechanism for consistency checking
  • Selection of the most probable transfer output is
    done by beam-decoding on the transfer chart

51
Generation
  • Bi-directionality allows us to use the same grammar
    for parsing the training data and for generation in
    the translation application
  • The generator has to be fault-tolerant in cases where
    the transfer system operates on a FRAGMENT parse or
    produces non-valid f-structures from valid input
    f-structures
  • Robust generation from unknown (e.g.,
    untranslated) predicates and from unknown
    f-structures

52
Robust Generation
  • Generation from unknown predicates
  • The unknown German word Hunde is analyzed by the
    German grammar to extract the stem (e.g., PRED Hund,
    NUM pl), which is then inflected using English
    default morphology (Hunds)
  • Generation from unknown constructions
  • A default grammar that allows any attribute to be
    generated in any order is mixed in as a suboptimal
    option in the standard English grammar; e.g., if a
    SUBJ cannot be generated as a sentence-initial NP, it
    will be generated in any position as any category
  • An extension/combination of set-gen-adds and OT
    ranking

53
Statistical Models
  • Log-probability of source-to-target transfer
    rules, where the probability r(e|f) of a rule that
    transfers source snippet f into target snippet e
    is estimated by relative frequency
  • Log-probability of target-to-source transfer
    rules, estimated by relative frequency
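
Written out, the relative-frequency estimate behind these two
features is the standard one (notation assumed, not given on the
slide); in LaTeX:

  \log r(e \mid f) = \log \frac{\mathrm{count}(f \to e)}{\sum_{e'} \mathrm{count}(f \to e')}

and symmetrically r(f | e) for the target-to-source direction.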

54
Statistical Models, cont.
  • Log-probability of lexical translations l(e|f)
    from source to target snippets, estimated from
    Viterbi alignments a between source word
    positions i = 1,...,n and target word positions
    j = 1,...,m, for stems f_i and e_j in snippets f
    and e, with relative word translation frequencies
    t(e_j|f_i)
  • Log-probability of lexical translations from
    target to source snippets
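
The description matches the lexical weighting of Koehn et al.
(HLT-NAACL 03); assuming that formulation (the slide itself gives
no formula), in LaTeX:

  l(e \mid f, a) = \prod_{j=1}^{m} \frac{1}{|\{ i \mid (i,j) \in a \}|} \sum_{(i,j) \in a} t(e_j \mid f_i)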

55
Statistical Model, cont.
  • Number of transfer rules
  • Number of transfer rules with frequency 1
  • Number of default transfer rules
  • Log-probability of strings of predicates from
    root to frontier of target f-structure, estimated
    from predicate trigrams in English f-structures
  • Number of predicates in target f-structure
  • Number of constituent movements during
    generation, based on the original order of the
    head predicates of the constituents

56
Statistical Models, cont.
  • Number of generation repairs
  • Log-probability of target string as computed by
    trigram language model
  • Number of words in target string

57
Experimental Evaluation
  • Experimental setup
  • German-to-English on Europarl parallel corpus
    (Koehn 02)
  • Training and evaluation on sentences of length
    5-15, for quick experimental turnaround
  • Resulting in a training set of 163,141 sentences,
    a development set of 1,967 sentences, and a test
    set of 1,755 sentences (as used in Koehn et al. HLT03)
  • Improved bidirectional word alignment based on
    GIZA (Och et al. EMNLP99)
  • LFG grammars for German and English (Butt et al.
    COLING02; Riezler et al. ACL02)
  • SRI trigram language model (Stolcke 02)
  • Comparison with PHARAOH (Koehn et al. HLT03) and
    IBM Model 4 as produced by GIZA (Och et al.
    EMNLP99)

58
Experimental Evaluation, cont.
  • Around 700,000 transfer rules extracted from
    f-structures chosen by a dependency similarity
    measure
  • System operates on n-best lists of parses (n=1),
    transferred f-structures (n=10), and generated
    strings (n=1,000)
  • Selection of most probable translations in two
    steps:
  • Most probable f-structure by beam search (n=20)
    on the transfer chart using features 1-10
  • Most probable string selected from the strings
    generated from the selected n-best f-structures
    using features 11-13
  • Feature weights for modules trained by MER on 750
    in-coverage sentences of the development set

59
Automatic Evaluation
  • NIST scores (ignoring punctuation); Approximate
    Randomization for significance testing (see
    above)
  • 44% in-coverage of grammars; 51% FRAGMENT parses
    and/or generation repair; 5% timeouts
  • In-coverage: difference between LFG and P not
    significant
  • Suboptimal robustness techniques decrease overall
    quality

60
Manual Evaluation
  • Closer look at in-coverage examples
  • Random selection of 500 in-coverage examples
  • Two independent judges indicated preference for
    LFG or PHARAOH, or equality, in blind test
  • Separate evaluation under criteria of
    grammaticality/fluency and translational/semantic
    adequacy
  • Significance assessed by Approximate
    Randomization via stratified shuffling of
    preference ratings between systems

61
Manual Evaluation
  • Result differences on agreed-on ratings are
    statistically significant at p < 0.0001
  • Net improvement in translational adequacy on
    agreed-on examples is 11.4% on 500 sentences
    (57/500), amounting to 5% overall improvement in
    the hybrid system (44% of 11.4%)
  • Net improvement in grammaticality on agreed-on
    examples is 15.4% on 500 sentences, amounting to
    6.7% overall improvement in the hybrid system

62
Examples: LFG > PHARAOH
  • src: in diesem fall werde ich meine verantwortung
    wahrnehmen
  • ref: then i will exercise my responsibility
  • LFG: in this case i accept my responsibility
  • P: in this case i shall my responsibilities
  • src: die politische stabilität hängt ab von der
    besserung der lebensbedingungen
  • ref: political stability depends upon the
    improvement of living conditions
  • LFG: the political stability hinges on the
    recovery the conditions
  • P: the political stability is rejects the
    recovery of the living conditions
63
Examples: PHARAOH > LFG
  • src: das ist schon eine seltsame vorstellung von
    gleichheit
  • ref: a strange notion of equality
  • LFG: equality that is even a strange idea
  • P: this is already a strange idea of equality
  • src: frau präsidentin ich beglückwünsche herrn
    nicholson zu seinem ausgezeichneten bericht
  • ref: madam president I congratulate mr nicholson
    on his excellent report
  • LFG: madam president I congratulate mister
    nicholson on his report excellented
  • P: madam president I congratulate mr nicholson
    for his excellent report

64
Discussion
  • High percentage of out-of-coverage examples
  • Accumulation of 2 x 20% error rates in parsing
    the training data
  • Errors in rule extraction
  • Together result in ill-formed transfer rules
    causing a high number of generation
    failures/repairs
  • Propagation of errors through the system also for
    in-coverage examples
  • Error analysis: 69% transfer errors, 10% due to
    parse errors
  • Discrepancy between NIST and manual evaluation
  • Suboptimal integration of generator, making
    training and translation with large n-best lists
    infeasible
  • Language and distortion models applied after
    generation

65
Conclusion
  • Integration of grammar-based generator into
    dependency-based SMT system achieves
    state-of-the-art NIST and improved grammaticality
    and adequacy on in-coverage examples
  • Possibility of a hybrid system, since it can be
    determined when sentences are within the coverage
    of the system

66
Grammatical Machine Translation II
  • Ji Fang, Martin Forst, John Maxwell, and Michael
    Tepper

67
Overview of different approaches to MT
68
Limitations of string-based approaches
  • Transfer rules/correspondences of little
    generality
  • Problems with long-distance dependencies
  • Perform less well for morphologically rich
    (target) languages
  • N-gram LM-based disambiguation seems to have
    leveled out

69
Limitations of string-based approaches - little
generality
  • From Europarl: Das tut mir leid. / I'm sorry
    about that.
  • Google (SMT): I'm sorry. Perfect!
  • But: as soon as the input changes a bit, we get
    garbage.
  • Das tut ihr leid. (She is sorry about that.)
    -> It does their suffering.
  • Der Tod deines Vaters tut mir leid. (I am sorry
    about the death of your father.) -> The death of
    your father I am sorry.
  • Der Tod deines Vaters tut ihnen leid. (They are
    sorry about the death of your father.) -> The
    death of your father is doing them sorry.

70
Limitations of string-based approaches - problems
with LDDs
  • From Europarl: Dies stellt eine der großen
    Herausforderungen für die französische
    Präsidentschaft dar. / This is one of the
    major issues of the French Presidency.
  • Google (SMT): This is one of the major challenges
    for the French presidency represents.
  • Particle verb is identified and translated
    correctly
  • But two verbs -> ungrammatical; they seem to be too
    far apart to be filtered out by the LM

71
Limitations of string-based approaches - rich
morphology
  • Language pairs involving morphologically rich
    languages, e.g., Finnish, are hard

From Koehn (2005, MT Summit)
72
Limitations of string-based approaches - rich
morphology
  • Morphologically rich, free word order languages,
    e.g. German, are particularly hard as target
    languages.

Again from Koehn (2005, MT Summit)
73
Limitations of string-based approaches - n-gram
LMs
  • Even for morphologically poor languages,
    improving n-gram LMs becomes increasingly
    expensive.
  • Adding data helps improve translation quality
    (BLEU scores), but not enough.
  • Assuming the best improvement rate observed in
    Brants et al. (2007), 400 million times the
    available data would be needed to attain human
    translation quality by LM improvement.

74
Limitations of string-based approaches - n-gram
LMs
  • Best improvement rate: 0.7 BLEU points (BP) per
    doubling of the training data
  • Would need 40 more doublings to obtain human
    translation quality (42 + 0.7 x 40 = 70)
  • Necessary training data in tokens: 1e22
    (1e10 x 2^40 ≈ 1e22)
  • That is 4e8 times the current English Web
    (estimate) (2.5e13 x 4e8 = 1e22)
  • From Brants et al. (2007)

75
Limitations of bitext-based approaches
  • Generally available bitexts are limited in size
    and specialized in genre
  • Parliament proceedings
  • UN texts
  • Judiciary texts (from multilingual countries)
  • -> Makes it hard to repurpose bitext-based
    systems to new genres
  • Induced transfer rules/correspondences often of
    mediocre quality
  • Loose translations
  • Bad alignments

76
Limitations of bitext-based approaches -
availability and quality
  • Readily available bitexts are limited in size and
    specialized in genre
  • Approaches to auto-extracting bitexts from the
    web exist.
  • Additional data help to some degree, but then
    effect levels out.
  • Still a genre bias in bitexts, despite automatic
    acquisition?
  • Still more general problems with alignment
    quality etc.?

77
Limitations of bitext-based approaches -
availability and quality
  • Much more data needed to attain human translation
    quality
  • Logarithmic gains (at best) from adding bitext data
  • From Munteanu & Marcu (2005):
  • Base line: 100K - 95M English words
  • Mid line (auto): 90K - 2.1M
  • Top line (oracle): 90K - 2.1M

78
Context-Based MT / Meaningful Machines
  • Combines example-based MT (EBMT) and SMT
  • Very large (target) language model, large amount
    of monolingual text required
  • No transfer statistics, thus no parallel text
    required
  • Translation lexicon is developed
    semi-automatically (i.e. hand-validated)
  • Lexicon has slotted phrase pairs (like EBMT),
    e.g. NP1 biss ins Gras. -> NP1 bit the dust.

79
Context-Based MT / Meaningful Machines - pros
  • High-quality translation lexicon seems to allow
    for
  • Easier repurposing of system(s) to new genres
  • Better translation quality

From Carbonell (2006)
80
Context-Based MT / Meaningful Machines - cons
  • Works really well for English-Spanish. How about
    other language pairs?
  • Same problems with n-gram LMs as traditional
    SMT; probably affects pairs involving a
    morphologically rich (target) language
    particularly badly.
  • How much manual labor involved in development of
    translation lexicon?
  • Computationally expensive

81
Grammatical Machine Translation
  • Syntactic transfer-based approach
  • Parsing and generation identical/similar between
    GMT I and GMT II

[Translation pyramid diagram: at the base, string-level statistical
methods; above that, parse the source and score f-structures; apply
f-structure transfer rules and score target f-structures; generate
and pick the best realization.]
82
Grammatical Machine Translation: GMT I vs. GMT II
  • GMT I
  • Transfer rules induced from parsed bitexts
  • Target f-structures ranked using individual
    transfer rule statistics
  • GMT II
  • Transfer rules induced from a manually/semi-
    automatically constructed phrase lexicon
  • Target f-structures ranked using monolingually
    trained bilexical dependency statistics and
    general transfer rule statistics

83
GMT II
  • Where do the transfer rules come from?
  • Where do statistics/machine learning come in?

  • Transfer rules: induced from manually/semi-automatically
    compiled phrase pairs with slots; potentially, but not
    necessarily, from bitexts
  • Parse ranking (parse source, score f-structures): log-linear
    model trained on a syntactically annotated monolingual corpus
  • Transfer ranking (transfer, score target f-structures):
    log-linear model trained on bitext data; includes the score
    from the parse ranking model and very general transfer features
  • Realization ranking (generate, pick best realization):
    log-linear model trained on bitext data; includes scores from
    the other two models and the features/score of a monolingually
    trained model for realization ranking

[Same pyramid diagram as before, with string-level statistical
methods at the base.]
84
GMT II - The phrase dictionary
  • Contains phrase pairs with slot categories
    (Ddeff, Ddef, NP1nom, NP1, etc.) that allow for
    well-formed phrases without being included in
    induced rules
  • Currently hand-written
  • Will hopefully be compiled (semi-)automati-cally
    from bilingual dictionaries
  • Bitexts might also be used how exactly remains
    to be defined.

85
GMT II - Rule induction from the phrase dictionary
  • Sub-f-structures of slot variables are not included
  • F-structure attributes can be defined as irrelevant
    for translation, e.g. CASE (in both en and de), GEND
    (in de). Attributes so defined are never included
    in induced rules.
  • set-gen-adds remove "CASE GEND"
  • F-structure attributes can be defined as
    remove_equal_features. Attributes defined as
    such are not included in induced rules when they
    are equal.
  • set remove_equal_features "NUM OBJ OBL-AG
    PASSIVE SUBJ TENSE"
  • -> more general rules
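
A sketch of the effect (hypothetical rule fragment): if the German
and English sides of a phrase pair both carry NUM sg, the NUM facts
are left out of the induced rule, so the same rule also applies to
plural instances:

  "NUM omitted because it is equal on both sides"
  PRED(X1, Verfassung) ==> PRED(X1, constitution).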

86
GMT II - Rule induction from the phrase
dictionary (noun)
  • Ddeff Verfassung -> Ddef constitution
  • PRED(X1, Verfassung),
  • NTYPE(X1, Z2),
  • NSEM(Z2, Z3),
  • COMMON(Z3, count),
  • NSYN(Z2, common)
  • ==>
  • PRED(X1, constitution),
  • NTYPE(X1, Z4),
  • NSYN(Z4, common).

87
GMT II - Rule induction from the phrase
dictionary (adjective)
  • europäische -> European
  • PRED(X1, europäisch)
  • ==>
  • PRED(X1, European).
  • To accommodate certain non-parallelism with
    respect to SUBJs of adjectives etc., a special
    mechanism removes SUBJs of non-verbs and makes
    them addable in generation.

88
GMT II - Rule induction from the phrase
dictionary (verb)
  • NP1nom koordiniert NP2acc. -> NP1 coordinates
    NP2.
  • PRED(X1, koordinieren),
  • arg(X1, 1, A2),
  • arg(X1, 2, A3),
  • VTYPE(X1, main)
  • ==>
  • PRED(X1, coordinate),
  • arg(X1, 1, A2),
  • arg(X1, 2, A3),
  • VTYPE(X1, main).

89
GMT II - Rule induction (argument switching)
  • NP1nom tut NP2dat leid. -> NP2 is sorry about
    NP1.
  • PRED(X1, leidtun),
  • SUBJ(X1, A2),
  • OBJ-TH(X1, A3),
  • VTYPE(X1, main)
  • ==>
  • PRED(X1, be),
  • SUBJ(X1, A3),
  • XCOMP-PRED(X1, Z1),
  • PRED(Z1, sorry),
  • OBL(Z1, Z2),
  • PRED(Z2, about),
  • OBJ(Z2, A2),
  • VTYPE(X1, copular).

90
GMT II - Rule induction (head switching)
  • Ich versuche nur, mich jeder Demagogie zu
    enthalten. -> It is just that I am trying not to
    indulge in demagoguery.
  • NP1nom Vfin nur. -> It is just that NP1 Vs.
  • ADJUNCT(X1,Z2), in_set(X3,Z2),
    PRED(X3,nur), ADV-TYPE(X3,unspec)
  • ==>
  • PRED(Z4,be), SUBJ(Z4,X3), NTYPE(X3,Z5),
    NSYN(Z5,pronoun), GEND-SEM(Z5,nonhuman),
    HUMAN(Z5,-), NUM(Z5,sg), PERS(Z5,3),
    PRON-FORM(Z5,it), PRON-TYPE(Z5,expl_),
    arg(Z4,1,Z6), PRED(Z6,just), SUBJ(Z6,Z7),
    arg(Z6,1,A1), COMP-FORM(A1,that),
    COMP(Z6,A1), nonarg(Z6,1,Z7),
    ATYPE(Z6,predicative), DEGREE(Z6,positive),
    nonarg(Z4,1,X3), TNS-ASP(Z4,Z8),
    MOOD(Z8,indicative), TENSE(Z8,pres),
    XCOMP-PRED(Z4,Z6), CLAUSE-TYPE(Z4,decl),
    PASSIVE(Z4,-), VTYPE(A2,copular).

91
GMT II - Rule induction (more on head switching)
  • In addition to rewriting terms, the system
    re-attaches the rewritten f-structure if necessary.
    Here, this might be the case for X1.
  • ADJUNCT(X1,Z2), in_set(X3,Z2),
    PRED(X3,nur), ADV-TYPE(X3,unspec)
  • ==>
  • PRED(Z4,be), SUBJ(Z4,X3), NTYPE(X3,Z5),
    NSYN(Z5,pronoun), GEND-SEM(Z5,nonhuman),
    HUMAN(Z5,-), NUM(Z5,sg), PERS(Z5,3),
    PRON-FORM(Z5,it), PRON-TYPE(Z5,expl_),
    arg(Z4,1,Z6), PRED(Z6,just), SUBJ(Z6,Z7),
    arg(Z6,1,A1), COMP-FORM(A1,that),
    COMP(Z6,A1), nonarg(Z6,1,Z7),
    ATYPE(Z6,predicative), DEGREE(Z6,positive),
    nonarg(Z4,1,X3), TNS-ASP(Z4,Z8),
    MOOD(Z8,indicative), TENSE(Z8,pres),
    XCOMP-PRED(Z4,Z6), CLAUSE-TYPE(Z4,decl),
    PASSIVE(Z4,-), VTYPE(A2,copular).

92
GMT II - Pros and cons of rule induction from a
phrase dictionary
  • Development of phrase pairs can be carried out by
    someone with little knowledge of the grammar and
    transfer system; manual development of transfer
    rules would require experts (for boring,
    repetitive labor).
  • Phrase pairs can remain stable while grammars
    keep evolving. Since transfer rules are induced
    fully automatically, they can easily be kept in
    sync with grammars.
  • Induced rules are of much higher quality than
    rules induced from parsed bitexts (GMT I).
  • Although there is hope that phrase pairs can be
    constructed semi-automatically from bilingual
    dictionaries, it is not yet clear to what extent
    this can be automated.
  • If rule induction from parsed bitexts can be
    improved, the two approaches might well be
    complementary.

93
Lessons Learned for Parallel Grammar Development
  • Absence of a feature like PERF +/- is not
    equivalent to PERF -.
  • F-structure-internal features should not say
    anything about the function of the f-structure
  • Example: PRON-TYPE = poss instead of
    PRON-TYPE = pers
  • Compounds should be analyzed similarly, whether
    spelt together (de) or apart (en)
  • Possible with SMOR
  • Very hard or even impossible with DMOR

94
Absence of PERF ≠ PERF -
95
No function info in FS-internal features
  • I think NP1 Vs. / In my opinion NP1 Vs.

96
Parallel analysis of compounds
97
More Lessons Learned for Parallel Grammar
Development
  • ParGram needs to agree on a parallel PRED value
    for (personal) pronouns
  • We need an interlingua for numbers, clock
    times, dates etc.
  • Guessers should analyze (composite) names
    similarly

98
Parallel PRED values for (personal) pronouns
  • Otherwise the number of rules we have to learn
    for them explodes.
  • de-en: pro/er -> he, pro/er -> it, pro/sie -> she,
    pro/sie -> it, pro/es -> it, pro/es -> he, pro/es ->
    she
  • Also, a PRED-NUM-PERS combination may make no
    sense. Result: a lot of generator effort for
    nothing
  • en-de: he -> pro/er, she -> pro/sie, it -> pro/es,
    it -> pro/er, it -> pro/sie, ...

99
Interlingua for numbers, clock times, dates, etc.
  • We cannot possibly learn transfer rules for all
    dates.

100
Guessed (composite) names
We cannot possibly learn transfer rules for all
proper names in this world.
101
And Yet More Lessons Learned for Grammar
Development
  • Reflexive pronouns: PERS and NUM agreement
    should be ensured via inside-out function
    application, e.g. ((SUBJ ↑) PERS) = (↑ PERS); see
    the sketch below.
  • Semantically relevant features should not be
    hidden in CHECK
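
A sketch of the idea as a lexical entry in XLE notation, where ^
stands for ↑ (hypothetical entry using the slide's own equation;
the actual grammar's entry may differ):

  sich  PRON * @(PRON-TYPE refl)
               ((SUBJ ^) PERS) = (^ PERS)
               ((SUBJ ^) NUM)  = (^ NUM).

The inside-out equations identify the reflexive's PERS and NUM with
those of the containing clause's subject, instead of giving sich
its own, ambiguous values.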

102
Reflexive pronouns
  • Introduce their own values for PERS and NUM
  • Overgeneration: Ich wasche sich.
  • NUM ambiguity for (frequent) sich
  • Less generalization possible in transfer rules
    for inherently reflexive verbs: 6 rules
    necessary instead of 1.

103
Reflexive pronouns
104
Semantically relevant features in CHECK
  • sie -> they
  • Sie -> you (formal)
  • Since CHECK features are not used for
    translation, the distinction between sie and
    Sie is lost.

105
Planned experiments - Motivation
  • We do not have the resources to develop a
    general purpose phrase dictionary in the short
    or medium term.
  • Nevertheless, we want to get an idea about how
    well our new approach may scale.

106
Planned Experiments 1
  • Manually develop phrase dictionary for a few
    hundred Europarl sentences
  • Train target FS ranking model and realization
    ranking model on those sentences
  • Evaluate output in terms of BLEU, NIST and
    manually
  • Can we make this new idea work under ideal
    conditions? It seems we can.

107
Planned Experiments 2
  • Manually develop phrase dictionary for a few
    hundred Europarl sentences
  • Use bilingual dictionary to add possible phrase
    pairs that may distract the system
  • Train target FS ranking model and realization
    ranking model on those sentences
  • Evaluate output in terms of BLEU, NIST and
    manually
  • How well can our system deal with the
    distractors?

108
Planned Experiments 3
  • Manually develop phrase dictionary for a few
    hundred Europarl sentences
  • Use bilingual dictionary to add possible phrase
    pairs that may distract the system
  • Degrade the phrase dictionary at various levels
    of severity
  • Take out a certain percentage of phrase pairs
  • Shorter phrases may be penalized less than longer
    ones
  • Train target FS ranking model and realization
    ranking model on those sentences
  • Evaluate output in terms of BLEU, NIST and
    manually
  • How good or bad is the output of the system when
    the bilingual phrase dictionary lacks coverage?

109
Main Remaining Challenges
  • Get comprehensive and high-quality dictionary of
    phrase pairs
  • Get more and better (i.e. more normalized and
    parallel) analyses from grammars
  • Improve ranking models, in particular on source
    side
  • Improve generation behavior of grammars - So far,
    grammar development has mostly been
    parsing-oriented.
  • Efficiency, in particular on the generation side,
    including packed transfer and generation
110