Interlingua-based MT - PowerPoint PPT Presentation

Slides: 61
Provided by: Sri675

Transcript and Presenter's Notes


1
Interlingua-based MT
2
Interlingua-based Machine Translation
Interlingua
  • Syntactic transfer-based MT
  • Couples the syntax of the two languages
  • What if we abstract away the syntax
  • All that remains is meaning
  • Meaning is the same across languages
  • Simplicity: only N analyzer/generator pairs are needed to
    translate among N languages
  • Two small problems
  • What is meaning?
  • How do we represent meaning?

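[Figure: the MT pyramid. Direct MT maps Source to Target; Transfer-based MT maps between syntactic structures (parsing on the source side, syntactic generation on the target side); Interlingua sits at the top, reached by semantic interpretation and left by semantic generation.]
[Figure: English, Spanish and Japanese analyzers feed a single interlingual representation, from which English, Spanish and Japanese generators produce output.]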
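
The simplicity argument can be checked with a little arithmetic. The sketch below is illustrative only: it assumes that a transfer system needs one module per ordered language pair, and that an interlingua system needs one analyzer plus one generator per language, counted as separate modules. The function names are hypothetical.

```python
def transfer_modules(n: int) -> int:
    """Modules for pairwise transfer: one per ordered (source, target) pair."""
    return n * (n - 1)


def interlingua_modules(n: int) -> int:
    """Modules for an interlingua: one analyzer and one generator per language."""
    return 2 * n


for n in (3, 10):
    print(n, transfer_modules(n), interlingua_modules(n))
```

For three languages the two designs cost the same under these assumptions; the gap opens as more languages are added.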
3
Example of Interlingua Machine Translation
Interlingua representation
4
Ingredients of a semantic representation
  • language neutral
  • Syntactic variations should result in the same
    semantics
  • sense of a word
  • deep semantic role labels
  • scope of quantifiers, adverbials, adjectives
  • polarity information
  • Distinguish between
  • surface structure (syntactic structure) and
  • deep structure (semantic structure) of
    sentences.
  • Different forms of semantic representation
  • logic formalisms
  • ontology / semantic representation languages
  • Case Frame Structures (Fillmore)
  • Conceptual Dependency Theory (Schank)
  • Description Logic (DL) and similar KR languages
  • Ontologies

5
Text Meaning Representation
  • Lexicon has two components
  • Syntactic part
  • Semantic constraints part
  • Given a sentence, the syntactic part analyzes
    the input syntactically and the semantic
    constraints create semantic expressions that can
    be evaluated.
  • Ontology specifies the type hierarchy
  • Used for checking selectional restrictions
  • Selectional restrictions used for word-sense
    disambiguation
  • e.g. an accident is an event; an organization has
    humans
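
A minimal sketch of how a type hierarchy can support selectional restrictions and word-sense disambiguation. The concept names, the is-a links and the "disease" restriction are assumptions made up for illustration; they are not taken from any real ontology.

```python
# Toy is-a hierarchy (child -> parent); all entries are illustrative assumptions.
IS_A = {
    "accident": "event",
    "cancer": "disease",
    "disease": "object",
    "cancer-zodiac": "zodiac-sign",
    "zodiac-sign": "object",
}


def is_a(concept, ancestor):
    """True if `ancestor` is reachable from `concept` by following is-a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def disambiguate(senses, restriction):
    """Keep only the candidate senses that satisfy the selectional restriction."""
    return [sense for sense in senses if is_a(sense, restriction)]


# A slot restricted to diseases rules out the zodiac reading.
print(disambiguate(["cancer", "cancer-zodiac"], "disease"))
```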

6
Constructing a Semantic Representation
  • General approach
  • Start with surface structure derived from parser.
  • Map surface structure to semantic structure
  • Use phrases as sub-structures.
  • Find concepts and representations for central
    phrases (e.g. VP, NP, then PP)
  • Assign phrases to appropriate roles around
    central concepts (e.g. bind PP into VP
    representation).

7
Semantic Representation
  • Semantic Representations are based on some form
    of (formal) Representation Language.
  • Semantic Networks
  • Conceptual Dependency Graphs
  • Case Frames
  • Ontologies
  • DL and similar KR languages
  • Important note: relations between text strings differ
    from relations between referents in the world.

8
Ontology (Interlingua) approach
  • Ontology: a language-independent classification
    of objects, events, relations
  • A Semantic Lexicon, which connects lexical items
    to nodes (concepts) in the ontology
  • An analyzer that constructs Interlingua
    representations and selects an appropriate one

9
Semantic Lexicon
  • Provides a syntactic context for the appearance
    of the lexical item
  • Provides a mapping for the lexical item to a node
    in the ontology (or more complex associations)
  • Provides connections from the syntactic context
    to semantic roles and constraints on these roles

10
Constructing an InterLingua Representation
  • For each syntactic analysis
  • Access all semantic mappings and contexts for
    each lexical item.
  • Create all possible semantic representations.
  • Test them for coherency of structure and content.
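
The generate-and-test loop above can be sketched as follows. This is an assumption-laden toy, not the system's procedure: the sense inventory, the second sense of "makes" ("compel") and the role constraints are hypothetical, and coherence is reduced to checking agent and theme types.

```python
from itertools import product

# Toy is-a links and role constraints; all hypothetical.
IS_A = {"human": "object", "tool": "artifact", "artifact": "object"}
CONSTRAINTS = {
    "manufacturing-activity": {"agent": "human", "theme": "artifact"},
    "compel": {"agent": "human", "theme": "event"},
}
SENSES = {
    "John": ["human"],
    "makes": ["manufacturing-activity", "compel"],
    "tools": ["tool"],
}


def is_a(concept, ancestor):
    """True if `ancestor` is reachable from `concept` by following is-a links."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = IS_A.get(concept)
    return False


def candidates(subject, verb, obj):
    """Create every combination of senses for one syntactic analysis."""
    for s, v, o in product(SENSES[subject], SENSES[verb], SENSES[obj]):
        yield {"event": v, "agent": s, "theme": o}


def coherent(rep):
    """Test a candidate against the constraints of its event concept."""
    wanted = CONSTRAINTS[rep["event"]]
    return is_a(rep["agent"], wanted["agent"]) and is_a(rep["theme"], wanted["theme"])


readings = [rep for rep in candidates("John", "makes", "tools") if coherent(rep)]
print(readings)
```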

11
Basic Semantic Dependency - Example
Input: John makes tools
Syntactic analysis:
  cat      verb
  root     make
  tense    present
  subject  root john, cat noun-proper
  object   root tool, cat noun, number plural
12
Lexicon Entries for John and tool
John-n1
  syn-struc  root john, cat noun-proper
  sem-struc  human (name john, gender male)
tool-n1
  syn-struc  root tool, cat n
  sem-struc  tool
13
Ontological Representation - Example
Relevant extract from the specification of the
ontological concept used to describe the
appropriate meaning of make:
manufacturing-activity
  ...
  agent  human
  theme  artifact
14
Semantic Dependency Component
The basic semantic dependency component of the
Text Meaning Representation (TMR) for "John
makes tools":
manufacturing-activity-7
  agent  human-3
  theme  set-1
    element      tool
    cardinality  > 1
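
A rough sketch of how the syntactic analysis, the lexicon entries and the ontological concept for make could be combined into such a structure. The mapping of subject to agent and object to theme, the dictionary layout and the `InstanceNamer` helper are assumptions for illustration; instance numbers start at 1 here, whereas a running system carries numbers such as -7 and -3 over from earlier text.

```python
from collections import defaultdict

# Simplified lexicon; the subject/object-to-role mapping is an assumption.
LEXICON = {
    "john": {"concept": "human", "props": {"name": "john", "gender": "male"}},
    "tool": {"concept": "tool", "props": {}},
    "make": {
        "concept": "manufacturing-activity",
        "roles": {"subject": "agent", "object": "theme"},
    },
}


class InstanceNamer:
    """Hands out numbered instance names such as human-1, human-2."""

    def __init__(self):
        self._last = defaultdict(int)

    def new(self, concept):
        self._last[concept] += 1
        return f"{concept}-{self._last[concept]}"


def build_tmr(analysis, namer):
    """Map one syntactic analysis to a basic semantic dependency structure."""
    verb = LEXICON[analysis["root"]]
    head = namer.new(verb["concept"])
    tmr = {head: {}}
    for function, role in verb["roles"].items():
        arg = analysis[function]
        entry = LEXICON[arg["root"]]
        if arg.get("number") == "plural":
            # A plural noun becomes a set whose elements are of the noun's concept.
            filler = namer.new("set")
            tmr[filler] = {"element": entry["concept"], "cardinality": "> 1"}
        else:
            filler = namer.new(entry["concept"])
            tmr[filler] = dict(entry["props"])
        tmr[head][role] = filler
    return tmr


analysis = {
    "root": "make", "cat": "verb", "tense": "present",
    "subject": {"root": "john", "cat": "noun-proper"},
    "object": {"root": "tool", "cat": "noun", "number": "plural"},
}
tmr = build_tmr(analysis, InstanceNamer())
print(tmr)
```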
15
Semantic representation of try-v3
try-v3
  syn-struc
    root   try
    cat    v
    subj   root var1, cat n
    xcomp  root var2, cat v, form OR infinitive gerund
  sem-struc
    set-1
      element-type  refsem-1
      cardinality   > 1
    refsem-1
      sem     event
      agent   var1
      effect  refsem-2
      modality
        modality-type   epiteuctic
        modality-scope  refsem-2
        modality-value  < 1
    refsem-2
      value  var2
      sem    event
Means: unfinished action, outcome unclear
16
Why is Iraq developing weapons of mass
destruction?
17
Word sense Disambiguation
  • Methods
  • Constraint checking
  • make sure the constraints imposed on context are
    met
  • Graph traversal
  • is-a links are inexpensive
  • other links are more expensive
  • the cheapest structure is the most coherent one
  • Hunter-gatherer processing
  • find (hunt) and eliminate (kill) unlikely
    interpretations
  • collect (gather) remaining interpretations
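
The graph-traversal method can be sketched as a cheapest-path search. This is a minimal sketch under stated assumptions: the link costs (1 for is-a, 3 for any other link), the toy graph and the node names are invented for illustration, and Dijkstra's algorithm is used as one plausible way to find the cheapest structure.

```python
import heapq

COST = {"is-a": 1, "other": 3}  # assumed weights: is-a links are cheaper

# Undirected toy graph of (node, node, link type); entirely hypothetical.
LINKS = [
    ("cancer", "disease", "is-a"),
    ("disease", "diagnose", "other"),
    ("cancer-zodiac", "zodiac-sign", "is-a"),
    ("zodiac-sign", "astrology", "is-a"),
    ("astrology", "human", "other"),
    ("human", "diagnose", "other"),
]


def build_graph(links):
    """Build an adjacency list with one weighted edge in each direction."""
    graph = {}
    for a, b, kind in links:
        graph.setdefault(a, []).append((b, COST[kind]))
        graph.setdefault(b, []).append((a, COST[kind]))
    return graph


def cheapest_path_cost(graph, start, goal):
    """Dijkstra's algorithm: cost of the cheapest path from start to goal."""
    best = {start: 0}
    queue = [(0, start)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == goal:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(queue, (new_cost, neighbor))
    return float("inf")


def pick_sense(graph, senses, context):
    """Choose the sense with the cheapest connection to the context concept."""
    return min(senses, key=lambda sense: cheapest_path_cost(graph, sense, context))


graph = build_graph(LINKS)
print(pick_sense(graph, ["cancer", "cancer-zodiac"], "diagnose"))
```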

18
Ontological Semantics: An example semantic
representation language
  • slides from S. Nirenburg

19
Ontological semantics is a computationally
tractable theory of meaning in natural language
as well as a suite (OntoSem) of implemented NLP
programs and a set of static knowledge resources
that support these programs. Ontological
semantics deals directly with extraction,
representation and manipulation of text meaning.
OntoSem text analyzers produce interpreted knowledge
ready to be used in reasoning-heavy applications
that include question answering, cross-document
and cross-lingual text summarization,
machine translation and others.
Support of intelligent human-computer
interaction in domain- and task-oriented
environments is squarely within the purview of
ontological semantics.
20
Ontological semantics concentrates on content of
representations and is adaptable to a number of
different representation formats. Ontological
semantics is both a producer and a consumer of
knowledge: deriving text meaning is itself a
knowledge-intensive task.
21
  • OntoSem
  • is devoted to processing naturally occurring
    texts
  • strives for high-quality results first
    followed by concern for broad coverage
  • expects unexpected inputs
  • seeks quality heuristics of any provenance
    (knowledge-based or probabilistic,
    cooccurrence-based)
  • does not grant syntax a privileged position
    among the providers of heuristics for
    semantic processing
  • does not make a strong distinction between
    semantics and pragmatics
  • is applicable to any natural language

22
Ontological-semantic analyzers take natural
language texts as inputs and generate
machine-tractable text meaning representations
(TMRs) that form the basis of various reasoning
processes. Sample input sentence: Iran, Iraq
and North Korea on Wednesday rejected an
accusation by President Bush that they are
developing weapons of mass destruction. The TMR
(presented graphically) for the above is as
follows:
23
Output: A Text Meaning Representation (TMR)
This presentation is simplified; the system, in
fact, derives much more from text. Event
instances are shown in ellipses; object
instances, in rectangles. Only case-role and
set-membership relations are shown (as labels on
links). Numerical constraints can be fuzzy, as
in the cardinality of SET-1226.
24
A pretty-printed fragment of the actual TMR
representation for sample input
25
  • Ontological-semantic systems centrally rely on
    the following static knowledge resources:
  • a language-independent ontology that
    includes knowledge about types of
    entities in the world,
    e.g., ATHLETE, WELD or SPEED
  • ontology-oriented lexicons (and onomasticons,
    or lexicons of proper names) for each
    natural language in the system and
  • a fact repository containing instances of
    ontological concepts, e.g., Andre
    Agassi (ATHLETE-3176) or the Apollo 13 mission
    (SPACEFLIGHT-142)

26
A Sample Screen of the Ontology/Lexicon/Fact
Repository Browsing and Editing Environment
29
(diagnosis
  (diagnosis-n1
    (cat n)
    (anno
      (def "")
      (ex "The diagnosis (of cancer) (by the specialist) was made quickly")
      (comments ""))
    (syn-struc
      ((root var0) (cat n)                                   ; diagnosis
       (pp-adjunct ((root of) (root var1) (cat prep) (opt )  ; of
                    (obj ((root var2) (cat n)))))            ; disease
       (pp-adjunct ((root by) (root var3) (cat prep) (opt )  ; by
                    (obj ((root var4) (cat n)))))))          ; someone
    (sem-struc
      (DIAGNOSE                      ; the ontological mapping
        (agent (value var4))         ; the case roles
        (theme (value var2)))
      (var1 (null-sem ))             ; blocks compositional analysis of preps
      (var3 (null-sem )))))
30
(cancer
  (cancer-n1
    (cat n)
    (anno (def "a disease") (ex "") (comments ""))
    (syn-struc
      ((n ((root var1) (cat n) (opt )))    ; animal part as modifier
       (root var0) (cat n)))               ; cancer
    (sem-struc
      (CANCER (location (value var1) (sem animal-part)))))
31
  (cancer-n2
    (cat n)
    (anno (def "a sign of the zodiac") (ex "") (comments ""))
    (syn-struc
      ((root var0) (cat n)))
    (sem-struc
      (CANCER-ZODIAC))))
32
  • Currently Available Static Knowledge Sources for
    English
  • Ontology of about 6,500 concepts (about
    95,000 property-value pairs)
  • English lexicon of about 40,000 entries
  • Fact repository of about 20,000 facts (outside
    medical domain)
  • English Onomasticon of about 350,000 entries
  • Tokenization knowledge, morphological and
    syntactic grammars
  • for a number of languages

33
The analyzer's conceptual architecture
(in reality, not strictly pipelined)
[Figure: Processing Modules (Preprocessor, Syntactic Analyzer, Semantic Analyzer) produce the TMR. Static Knowledge Resources: Grammar (Ecology, Morphology, Syntax); Lexicon and Onomasticon; Ontology and Fact Repository.]
34
  • The basic (who did what to whom) semantic
    dependency is derived, in the general case, on
    the basis of
  • lexical-semantic expectations (selectional
    restrictions) recorded in the ontology and the
    lexicon and
  • syntactic dependency derived from the results of
    syntactic analysis.

35
The beginnings of system evaluation
Run I: raw
Run II: preprocessor output correct
Run III: preprocessor and syntactic analysis output correct
36
In addition to the basic semantic dependency,
TMRs also include parameterized information
provided by the microtheories of aspect, modality
(including speaker attitudes), time, style and
others. Most of these microtheories have been
implemented. All would benefit from further
work. We are also actively looking
into possibilities of borrowing some
microtheories, either in toto or partially.
41
FrameNet: Another example of semantic
representation
  • Frame Semantics (Fillmore 1976, 1977, ..)
  • Frame: a conceptual structure or prototypical
    situation
  • Frame elements (roles)
  • Identify participants of the situation
  • Are local to their frame
  • Frame evoking elements (verbs, nouns, adjectives)
    introduce frames
  • E.g. VERDICT
  • [The jury]_Judge convicted [him]_Defendant [on the
    counts of theft]_Charges.
  • On Thursday [a jury]_Judge found [the
    youth]_Defendant [guilty of wounding Mr Lay]_Finding
  • Berkeley FrameNet Project
  • Database of frames for core lexicon of English
  • Current release: 610 frames, about 9000 lexical
    units
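
A small sketch of how a frame, its frame elements and one annotated sentence might be represented. The classes are hypothetical and do not reflect the FrameNet database format; the element list for VERDICT contains only the roles that appear in the two examples above.

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """A frame type with the names of its frame elements (roles)."""
    name: str
    elements: tuple


@dataclass
class FrameInstance:
    """One frame evoked by a target word, with the roles filled so far."""
    frame: Frame
    target: str
    fillers: dict = field(default_factory=dict)

    def fill(self, role, text):
        # Roles are local to their frame, so reject names the frame does not define.
        if role not in self.frame.elements:
            raise KeyError(f"{role} is not an element of {self.frame.name}")
        self.fillers[role] = text

    def unfilled(self):
        return [role for role in self.frame.elements if role not in self.fillers]


VERDICT = Frame("VERDICT", ("Judge", "Defendant", "Charges", "Finding"))

convicted = FrameInstance(VERDICT, target="convicted")
convicted.fill("Judge", "The jury")
convicted.fill("Defendant", "him")
convicted.fill("Charges", "the counts of theft")
print(convicted.unfilled())
```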

42
Types of Relations
  • FrameNet Relations
  • Frame hierarchy inherits
  • Subframes
  • Contextual Relations between instantiated frames
    and roles
  • Syntactic and/or semantic embedding
  • Discourse relations
  • Anaphoric relations
  • Inferences
  • On the basis of both

43
A Case Study
  • In the first trial in the world in connection
    with the terrorist attacks of 11 September 2001,
    the Higher Regional Court of Hamburg has passed
    down the maximum sentence. Mounir al Motassadeq
    will spend 15 years in prison. The 28-year-old
    Moroccan was found guilty as an accessory to
    murder in more than 3000 cases.

44
FrameNet as a Net: Frame-to-Frame Relations
  • Subframe relation
  • Super frame represents complex event
  • Subframes represent sub-events
  • Subframes usually inherit some roles of the super
    frame

[Figure: Criminal process as super frame, with subframes Arrest, Arraignment, Trial and Sentencing.]
45
Local Roles
  • In the first trial in the world [in connection
    with the [terrorist]_Assailant attacks of [11
    September 2001]_Time]_Case, [the Higher Regional
    Court of Hamburg]_Court has passed down the
    [maximum]_Type sentence.

46
Local Roles
  • [Mounir al Motassadeq]_Inmates will spend [15
    years]_Duration in prison.

47
Local Roles
  • [The 28-year-old Moroccan]_Defendant was found
    [guilty]_Finding [as an accessory to
    [murder]_FocalEntity [in more than 3000
    cases]_Victim]_Charge.

48
Unfilled Roles
Target     Frame       Frame role        Filler (given vs. induced)
trial      TRIAL       CASE              terrorist attacks (1)
                       CHARGE            accessory to murder (2)
                       COURT             Higher Regional Court (3)
                       DEFENDANT         ... 28-year-old Moroccan (4)
attacks    ATTACK      ASSAILANT         terrorist (5)
                       VICTIM            ... (6)
                       TIME (exth.)      11 September 2001 (7)
sentence   SENTENCING  CONVICT           Mounir al Motassadeq (8)
                       COURT             Higher Regional Court (9)
                       TYPE              ... maximum sentence (10)
prison     PRISON      INMATES           ... Mounir al Motassadeq (11)
                       DURATION (exth.)  15 years (12)
found      VERDICT     CASE              terrorist attacks (13)
                       CHARGE            accessory to murder (14)
                       DEFENDANT         28-year-old Moroccan (15)
                       FINDING           ... guilty (16)
accessory  ASSISTANCE  CO-AGENT          (17)
                       FOCAL_ENTITY      murder (18)
                       HELPER            ... 28-year-old Moroccan (19)

51
Linking Frames and Roles in Context
  • At the instance level
  • given frame instances f1 (of type F1) and f2 (of type F2), where
  • f1 and f2 stand in a contextual relation (syn,
    sem, discourse)
  • frame types F1 and F2 stand in some frame
    relation
  • => identify role instances (referents) of f1 and
    f2 (r1 = r0 = r2)

[Figure legend: inferred relation; frame relation; context-related instances.]
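
An illustrative sketch of instance-level linking. It makes several simplifying assumptions that are not from the slides: two frame types count as related when they are subframes of the same super frame, roles are matched by identical names, and a filled role in one instance is copied to the unfilled role in the other. All data structures are hypothetical.

```python
# Assumed frame relations (subframe -> super frame).
SUBFRAME_OF = {"TRIAL": "CRIMINAL_PROCESS", "SENTENCING": "CRIMINAL_PROCESS"}


def related(type1, type2):
    """Treat two frame types as related if they share a super frame."""
    parent1, parent2 = SUBFRAME_OF.get(type1), SUBFRAME_OF.get(type2)
    return parent1 is not None and parent1 == parent2


def link_roles(f1, f2, in_context):
    """Return fillers for unfilled roles of f1 that can be inferred from f2."""
    inferred = {}
    if not (in_context and related(f1["type"], f2["type"])):
        return inferred
    for role, filler in f2["roles"].items():
        if filler is not None and role in f1["roles"] and f1["roles"][role] is None:
            inferred[role] = filler
    return inferred


trial = {
    "type": "TRIAL",
    "roles": {"Case": "terrorist attacks", "Court": None, "Defendant": None},
}
sentencing = {
    "type": "SENTENCING",
    "roles": {"Court": "Higher Regional Court of Hamburg", "Type": "maximum", "Convict": None},
}
print(link_roles(trial, sentencing, in_context=True))
```

Matching roles by name is the crudest possible choice; a fuller treatment would consult role mappings attached to the frame relation itself.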
52
Linking Frames and Roles in Context
  • In the first trial in the world in connection
    with the terrorist attacks of 11 September 2001,
    the Higher Regional Court of Hamburg has passed
    down the maximum sentence.

[Figure: Criminal Process with Trial and Sentencing; role Court. Legend: frame relation.]
53
Linking Frames and Roles in Context
  • In the first trial (f1) in the world in
    connection with the terrorist attacks of 11
    September 2001, the Higher Regional Court of
    Hamburg (r2) has passed down the maximum
    sentence (f2).

[Figure: Criminal Process with Trial and Sentencing; role Court; Functional Embedding; instance "The Higher Regional Court of Hamburg". Legend: frame relation; context-related instances.]
54
Linking Frames and Roles in Context
  • In the first trial (f1) in the world in
    connection with the terrorist attacks of 11
    September 2001, the Higher Regional Court of
    Hamburg (r2 = r0 = r1) has passed down the maximum
    sentence (f2).

[Figure: Criminal Process with Trial and Sentencing; role Court; Functional Embedding; instance "The Higher Regional Court of Hamburg". Legend: frame relation; context-related instances; inferred relation.]
55
Linking Frames and Roles in Context
  • At the type level (more involved)
  • If role instances of frames f1 (of type F1) and f2
    (of type F2) are often found co-referent within
    particular contextual relations
  • => Hypothesize a frame relation between F1 and F2

[Figure legend: inferred relation; (no) frame relation; context-related instances.]
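
The type-level idea can be sketched as simple counting. This is a toy under stated assumptions: each observation is a pair of (frame, role) types whose instances were found co-referent in context, the observations listed are invented for illustration, and "often" is approximated by a hypothetical `min_count` threshold.

```python
from collections import Counter


def hypothesize_bindings(observations, min_count=2):
    """Propose role bindings for pairs observed co-referent at least min_count times."""
    counts = Counter(observations)
    return sorted(pair for pair, seen in counts.items() if seen >= min_count)


# Invented observations, not corpus data.
observations = [
    (("SENTENCING", "Convict"), ("PRISON", "Inmates")),
    (("SENTENCING", "Convict"), ("PRISON", "Inmates")),
    (("SENTENCING", "Court"), ("PRISON", "Inmates")),
]
print(hypothesize_bindings(observations))
```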
56
Linking Frames and Roles in Context
  • ... the Higher Regional Court of Hamburg has
    passed down the maximum sentence. Mounir al
    Motassadeq will spend 15 years in prison.
  • New Frame Relation
  • (Role Binding: Convict-Inmates)

[Figure: Sentencing (role Convict) and Prison (role Inmates), connected by a Discourse Relation (Co-reference). Legend: inferred relations; (no) frame relation; context-related instances.]
57
Frame, Contextual, and Inferred Relations
[Figure: graph of frames and their roles; the number after each frame is the sentence number. CRIMINAL PROCESS; TRIAL (1): Defendant, Case, Charge, Court; SENTENCING (1): Convict, Type, Court; PRISON (2): Inmates, Duration; VERDICT (3): Defendant, Case, Charge, Finding; ASSISTANCE (3): Helper, Co_agent, Focal_entity; KILLING (3): Killer, Victim. Edge types: Subframe/FE; Contextual Relation; Inferred Relation.]
58
[Figure: the relation graph with role fillers. Frames: CRIMINAL PROCESS, SENTENCING, TRIAL, PRISON, VERDICT, ASSISTANCE, KILLING. Role labels as shown: Defendant; Inmates (Motus.); Duration (15Y); Convict; Duration (maximum); Court (Hmbg.); Case (9/11); Charge; Defendant (the Moroccan); Case; Charge (accessory); Helper; Co_agent; Goal (murder); Killer; Victim (3000). Edge types: Hierarchy/Subframe/FE; Contextual Relations; Inference.]
In the first trial .. the Higher Regional Court
.. has passed down the maximum sentence. Mounir
al Motassadeq will spend 15 years in prison. The
28-year-old Moroccan was found guilty as an
accessory to murder in .. 3000 cases.
59
Statistical Semantic Role Labeling
60
References
  • Jurafsky, D. and J. H. Martin, Speech and Language
    Processing, Prentice-Hall, 2000. (Chapters 9 and
    10)
  • Helmreich, S., From Syntax to Semantics,
    Presentation in the 74.419 Course, November 2003.
  • Nirenburg, S. and V. Raskin, Ontological Semantics,
    MIT Press, 2004.
  • WordNet, http://wordnet.princeton.edu/
  • Suggested Upper Merged Ontology (SUMO),
    http://www.ontologyportal.org/