1
Starting With Complex Primitives Pays Off:
Complicate Locally, Simplify Globally
Aravind K. Joshi
Department of Computer and Information Science
and Institute for Research in Cognitive Science

2
Outline
  • Introduction
  • Towards CLSG
  • Syntactic description
  • Semantic composition
  • Statistical processing
  • Psycholinguistic properties
  • Applications to other domains
  • Discourse structure
  • Folded structure of biomolecular sequences
  • Summary

3
Introduction
  • Formal systems to specify a grammar formalism
  • Start with primitives (basic primitive
    structures or building blocks) as simple as
    possible and then introduce various
    operations for constructing more complex
    structures
  • Such systems are string rewriting systems,
    requiring string adjacency of function and
    argument
  • Alternatively,

4
Introduction: CLSG
  • Start with complex (more complicated)
    primitives which directly capture some crucial
    linguistic properties and then introduce some
    general operations for composing them
    -- Complicate Locally, Simplify Globally (CLSG)
  • CLSG systems are structure rewriting systems,
    requiring structure adjacency of function and
    argument
  • The CLSG approach is characterized by localizing
    almost all complexity in the set of primitives --
    a key property

5
Introduction: CLSG localization of complexity
  • Specification of the set of complex primitives
    becomes the main task of a linguistic theory
  • CLSG makes non-local dependencies local, i.e.,
    they are present in the primitive structures to
    start with

6
CLSG
  • The CLSG approach has led to several new insights
    into
  • Syntactic description
  • Semantic composition
  • Language generation
  • Statistical processing
  • Psycholinguistic properties
  • Discourse structure

7
Context-free Grammars
  • The domain of locality is the one-level tree
    -- the primitive building block


CFG, G:
  S   → NP VP        NP  → DET N
  VP  → V NP         DET → the
  VP  → VP ADV       N   → man | car
  V   → likes        ADV → passionately

[Figure: the one-level trees corresponding to these rules -- the primitive
building blocks (local domains) of the CFG]
8
Context-free Grammars
  • The arguments of the predicate are not in the
    same local domain
  • They can be brought together in the same domain
    -- by introducing a rule

S → NP V NP
  • However, then the structure is lost
  • Further, the local domains of a CFG are not
    necessarily lexicalized
  • Domain of Locality and Lexicalization (a small
    sketch of CFG local domains follows below)
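A minimal sketch (in Python, not part of the original slides) of the point
above: each CFG production is a one-level tree, so the verb likes and its two
NP arguments never share a single local domain.

    # Each production (parent, children) is one local domain of the CFG.
    cfg = [
        ("S",   ["NP", "VP"]),
        ("VP",  ["V", "NP"]),
        ("VP",  ["VP", "ADV"]),
        ("NP",  ["DET", "N"]),
        ("DET", ["the"]),
        ("N",   ["man"]),
        ("N",   ["car"]),
        ("V",   ["likes"]),
        ("ADV", ["passionately"]),
    ]

    # The subject NP (child of S) and the object NP (sister of V under VP)
    # occur in different productions: no single rule mentions the verb and
    # both of its arguments, unless the grammar is flattened (S -> NP V NP),
    # which destroys the VP structure.
    for parent, children in cfg:
        print(parent, "->", " ".join(children))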

9
Towards CLSG: Lexicalization
  • Lexical item: one or more elementary structures
    (trees, directed acyclic graphs), which are
    syntactically and semantically encapsulated
  • Universal combining operations
  • Grammar = Lexicon

10
Lexicalized Grammars
  • Context-free grammar (CFG)

CFG, G:
  S   → NP VP           (non-lexical)
  VP  → V NP            (non-lexical)
  VP  → VP ADV          (non-lexical)
  NP  → Harry           (lexical)
  NP  → peanuts         (lexical)
  V   → likes           (lexical)
  ADV → passionately    (lexical)

[Figure: derived tree for "Harry likes peanuts passionately"]
11
Weak Lexicalization
  • Greibach Normal Form (GNF)

In GNF, CFG rules are of the form A → a B1 B2 ... Bn
or A → a. This lexicalization gives the same set of
strings but not the same set of trees, i.e., not the
same set of structural descriptions. Hence, it is a
weak lexicalization.
12
Strong Lexicalization
  • Same set of strings and same set of trees or
    structural descriptions.
  • Tree substitution grammars (TSG)
  • Increased domain of locality
  • Substitution as the only combining operation

13
Substitution
[Figure: a tree β with root labeled X is substituted at a node labeled X
(a substitution site) in tree α, yielding the derived tree γ]
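A small sketch (Python; the tuple representation is an assumption, not the
notation of the slides) of the substitution operation: a tree β whose root
matches a substitution site of α is plugged in at that site.

    # Trees as (label, children) tuples; children=None marks a substitution
    # site (a frontier nonterminal waiting for an initial tree).
    def substitute(tree, site_label, beta):
        """Replace the first substitution site labeled site_label in tree by
        the initial tree beta. Returns (new_tree, replaced?)."""
        label, children = tree
        if children is None:                       # a substitution site
            return (beta, True) if label == site_label else (tree, False)
        new_children, replaced = [], False
        for child in children:
            if replaced:
                new_children.append(child)
            else:
                new_child, replaced = substitute(child, site_label, beta)
                new_children.append(new_child)
        return (label, new_children), replaced

    # alpha1: S tree anchored on likes, with NP substitution sites
    alpha1 = ("S", [("NP", None),
                    ("VP", [("V", [("likes", [])]), ("NP", None)])])
    tree, _ = substitute(alpha1, "NP", ("NP", [("Harry", [])]))     # subject
    tree, _ = substitute(tree,   "NP", ("NP", [("peanuts", [])]))   # object
    print(tree)   # derived tree for "Harry likes peanuts"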
14
Strong Lexicalization
  • Tree substitution grammars (TSG)

CFG, G:
  S  → NP VP      NP → Harry
  VP → V NP       NP → peanuts
                  V  → likes

[Figure: TSG, G′ with elementary trees -- α1, an S tree anchored on likes with
NP substitution sites for subject and object, and NP trees for Harry and
peanuts]
15
Insufficiency of TSG
  • Formal insufficiency of TSG

CFG, G:
  S → S S   (non-lexical)
  S → a     (lexical)

[Figure: TSG, G′ with elementary trees α1, α2, α3 built from S → S S and
S → a]
16
Insufficiency of TSG
[Figure: a tree γ of G in which the derived structure grows on both sides of
the root, and the TSG trees that fail to produce it]

γ grows on both sides of the root. G′ can generate all strings of G but not
all trees of G. Hence CFGs cannot be lexicalized by TSGs, i.e., by
substitution alone.
17
Adjoining
[Figure: tree β (an auxiliary tree with root and foot labeled X) is adjoined
to tree α at the node labeled X in α, yielding the derived tree γ]
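A companion sketch (same assumed tuple representation; node addresses are
passed in explicitly just to keep the code short) of adjoining: the subtree of
α at the adjunction node is excised, the auxiliary tree β is inserted there,
and the excised subtree is re-attached at β's foot node.

    def adjoin(alpha, path, beta, foot_path):
        """Adjoin auxiliary tree beta at the node of alpha addressed by path
        (a list of child indices). The subtree of alpha at that node is
        re-attached at beta's foot node, addressed by foot_path."""
        def get(tree, p):
            return tree if not p else get(tree[1][p[0]], p[1:])
        def put(tree, p, sub):
            if not p:
                return sub
            label, children = tree
            i = p[0]
            return (label,
                    children[:i] + [put(children[i], p[1:], sub)] + children[i+1:])
        excised = get(alpha, path)                   # subtree at the adjunction node
        return put(alpha, path, put(beta, foot_path, excised))

    # Toy grammar S -> S S, S -> a (as on the TSG slides):
    a1 = ("S", [("a", [])])                          # initial tree, yields "a"
    b  = ("S", [("S", None), ("S", [("a", [])])])    # auxiliary tree, foot = left S
    print(adjoin(a1, [], b, [0]))                    # S over (S a)(S a), yields "a a"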
18
With Adjoining
G:  S → S S,  S → a

[Figure: adjoining α2 to α3 at the root S node, and then adjoining α1 to the
S node of the derived tree, we obtain the tree γ that grows on both sides of
the root]

CFGs can be lexicalized by LTAGs. Adjoining is
crucial for lexicalization.
Adjoining arises out of lexicalization.
19
Lexicalized LTAG
  • Finite set of elementary trees anchored on
    lexical items -- extended projections of
    lexical anchors -- which encapsulate syntactic
    and semantic dependencies
  • Elementary trees: initial and auxiliary
  • Operations: substitution and adjoining
  • Derivation
  • Derivation tree: how elementary trees are put
    together
  • Derived tree

20
Localization of Dependencies
  • agreement: person, number, gender
  • subcategorization: sleeps (null), eats (NP),
    gives (NP NP), thinks (S)
  • filler-gap: who did John ask Bill to invite e
  • word order: within and across clauses, as in
    scrambling and clitic movement
  • function-argument: all arguments of the
    lexical anchor are localized

21
Localization of Dependencies
  • word-clusters (flexible idioms): the
    non-compositional aspect
    -- take a walk, give a cold shoulder to
  • word co-occurrences
  • lexical semantic aspects
  • statistical dependencies among heads
  • anaphoric dependencies

22
LTAG Examples
[Figure: two elementary trees for likes -- α1, the transitive tree (S with an
NP subject, V likes, and an NP object), and α2, the object-extraction tree
(with a fronted NP and an empty element e in object position)]

Some other trees for likes: subject extraction, topicalization, subject
relative, object relative, passive, etc.
23
LTAG: A Derivation

[Figure: elementary trees for deriving "who does Bill think Harry likes" --
α2 (likes, object extraction), β1 (think, auxiliary), β2 (does, auxiliary),
α3 (who), α4 (Harry), α5 (Bill)]
24
LTAG: A Derivation
who does Bill think Harry likes

[Figure: the same elementary trees composed -- α3 (who), α4 (Harry), and
α5 (Bill) are attached by substitution; β1 (think) and β2 (does) are attached
by adjoining]
25
LTAG: Derived Tree
who does Bill think Harry likes

[Figure: the derived tree for "who does Bill think Harry likes", with the
empty element e in the object position of likes]
26
LTAG: Derivation Tree
who does Bill think Harry likes

[Figure: the derivation tree -- α2 (likes) at the root; α3 (who) and
α4 (Harry) attached by substitution; β1 (think) attached by adjoining, with
α5 (Bill) substituted and β2 (does) adjoined into it]

Compositional semantics is defined on this derivation structure; it is
related to dependency diagrams.
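A small sketch (Python; the record format is an assumption, not the slides'
notation) of the derivation tree above as a data structure: each node records
which elementary tree attached to its parent and by which operation, and
parent-child pairs can be read off it, which is the sense in which it
resembles a dependency diagram.

    # Derivation tree for "who does Bill think Harry likes", mirroring the
    # figure: (elementary-tree label, operation, children).
    derivation = ("a2:likes", "root", [
        ("a3:who",   "substitution", []),
        ("a4:Harry", "substitution", []),
        ("b1:think", "adjoining", [
            ("a5:Bill", "substitution", []),
            ("b2:does", "adjoining", []),
        ]),
    ])

    def edges(node, parent=None):
        """List (parent, child, operation) triples of the derivation tree."""
        label, op, children = node
        result = [] if parent is None else [(parent, label, op)]
        for child in children:
            result += edges(child, parent=label)
        return result

    for parent, child, op in edges(derivation):
        print(f"{child} --{op}--> {parent}")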
27
Topology of Elementary Trees: Nested Dependencies
The topology of the elementary trees α and β determines the nature of the
dependencies described by the TAG grammar.

[Figure: grammar G with an initial tree α and an auxiliary tree β in which a
and b appear on opposite sides of the spine; repeated adjoining yields nested
dependencies, e.g., a a a b b b]
28
Topology of Elementary Trees: Crossed Dependencies

[Figure: elementary trees α and β for crossed dependencies]

The topology of the elementary trees α and β determines the kinds of
dependencies that can be characterized: here b is one level below a and to
the right of the spine.
29
Topology of Elementary Trees: Crossed Dependencies

[Figure: composing these trees gives crossed dependencies in the linear
structure, e.g., a a b b]
30
Examples Nested Dependencies
  • Center embedding of relative clauses in English

The rat1 the cat2 chased2 ate1 the cheese
  • Center embedding of complement clauses in German

Hans1 Peter2 Marie3 schwimmen3 lassen2
sah1 (Hans saw Peter make Marie swim)
31
Examples Crossed Dependencies
  • Center embedding of complement clauses in Dutch

Jan1 Piet2 Marie3 zag1 laten2 zwemmen3
(Jan saw Piet make Marie swim)
  • It is possible to obtain a wide range of
    complex dependencies, i.e., complex
    combinations of nested and crossed
    dependencies. Such patterns arise in word
    order phenomena such as scrambling and clitic
    climbing, and also due to scope ambiguities
    (a small index-pattern sketch follows below)
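A tiny sketch (Python; the flat index encoding is an assumption) that
classifies such word order patterns: each argument and its verb carry the same
index, as in the German and Dutch examples above, and two dependencies either
nest or cross.

    def classify(indices):
        """indices, e.g. [1, 2, 3, 3, 2, 1] (German, nested) or
        [1, 2, 3, 1, 2, 3] (Dutch, crossed)."""
        opens, spans = {}, []
        for pos, i in enumerate(indices):
            if i in opens:
                spans.append((opens.pop(i), pos))   # (argument position, verb position)
            else:
                opens[i] = pos
        crossed = any(a1 < a2 < b1 < b2
                      for (a1, b1) in spans for (a2, b2) in spans)
        return "crossed" if crossed else "nested"

    print(classify([1, 2, 3, 3, 2, 1]))   # nested:  ...schwimmen lassen sah
    print(classify([1, 2, 3, 1, 2, 3]))   # crossed: ...zag laten zwemmen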

32
LTAG: Some Important Properties
  • Factoring recursion from the domain of
    dependencies (FRD) and extended domain of
    locality (EDL)
  • All interesting properties of LTAG follow from
    FRD and EDL: mathematical, linguistic, and
    processing properties
  • Belongs to the class of so-called mildly
    context-sensitive grammars
  • The automaton equivalent of TAG is the embedded
    pushdown automaton (EPDA)

33
Processing of crossed and nested dependencies
Crossed dependencies (CD)
Jan1 Piet2 Marie3 zag1 laten2 zwemmen3
Nested dependencies (ND)
Hans1 Peter2 Marie3 schwimmen3 lassen2 sah1
(Hans saw Peter make Marie swim)
  • CDs are easier to process (by about one-half)
    than NDs (Bach, Brown, and Marslen-Wilson, 1986)
  • Principle of partial interpretation (PPI)
  • The EPDA model correctly predicts the BBM
    results (Joshi, 1990)

34
Some Important Properties of LTAG
  • Extended domain of locality (EDL)
  • Localizing dependencies
  • The elementary trees are the domains for
    specifying linguistic constraints
  • Factoring recursion from the domain of
    dependencies (FRD)
  • All interesting properties of LTAG follow from
    EDL and FRD: mathematical, linguistic, and
    processing properties
  • Belongs to the class of mildly context-sensitive
    grammars

35
A different perspective on LTAG
  • Treat the elementary trees associated with a
    lexical item as if they were super parts of
    speech (super-POS, or supertags)
  • Local statistical techniques have been remarkably
    successful in disambiguating standard POS
  • Apply these techniques for disambiguating
    supertags -- almost parsing

36
Supertag disambiguation -- supertagging
  • Given a corpus parsed by an LTAG grammar,
  • we have statistics of supertags -- unigram,
    bigram, trigram, etc.
  • these statistics combine the lexical statistics
    as well as the statistics of the constructions in
    which the lexical items appear (a decoding
    sketch follows below)
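A minimal decoding sketch (Python; the probability functions and smoothing are
assumptions, not the slides' model) of trigram supertagging: each word
contributes its candidate supertags, and a Viterbi search over supertag
trigrams picks the best sequence.

    import math

    def supertag(words, candidates, p_word, p_trigram):
        """Best supertag sequence under a trigram model.
        candidates[w]        -> list of supertags for word w (typically 8-10)
        p_word(w, t)         -> P(w | t), assumed smoothed (nonzero)
        p_trigram(t2, t1, t) -> P(t | t2, t1), assumed smoothed (nonzero)
        """
        START = "<s>"
        # chart maps a state (previous two supertags) to (log-prob, best sequence)
        chart = {(START, START): (0.0, [])}
        for w in words:
            new_chart = {}
            for (t2, t1), (score, seq) in chart.items():
                for t in candidates[w]:
                    s = score + math.log(p_trigram(t2, t1, t)) + math.log(p_word(w, t))
                    if (t1, t) not in new_chart or s > new_chart[(t1, t)][0]:
                        new_chart[(t1, t)] = (s, seq + [t])
            chart = new_chart
        return max(chart.values())[1] if chart else []

In the slides' terms, disambiguating supertags this way is "almost parsing",
since the chosen elementary trees already encode most of the structure.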

37
Supertagging
[Figure: the lattice of candidate supertags (α1, α2, ..., β1, β2, ...) above
each word of the sentence "the purchase price includes two ancillary
companies"]

On average, a lexical item has about 8 to 10 supertags.
38
Supertagging
[Figure: the same supertag lattice for "the purchase price includes two
ancillary companies", with the correct supertag for each word highlighted
(shown in green in the original slide)]

- Select the correct supertag for each word
- The correct supertag for a word is the supertag that corresponds to that
  word in the correct parse of the sentence
39
Supertagging -- performance
- Performance of a trigram supertagger
- Performance on the WSJ corpus, Srinivas (1997), Chen (2002)

  Baseline:             75.3% of words correctly supertagged
                        (35,391 of the 47,000-word test corpus)
  Trigram supertagger:  92.2% of words correctly supertagged
                        (43,334 of the 47,000-word test corpus),
                        trained on a 1-million-word corpus
40
Abstract character of supertagging
  • Complex (richer) descriptions of primitives
    (anchors)
  • contrary to the standard mathematical convention,
    in which descriptions of primitives are simple
    and complex descriptions are built from simple
    descriptions
  • Associate with each primitive all the information
    associated with it

41
Complex descriptions of primitives
  • Making descriptions of primitives more complex
  • increases the local ambiguity, i.e., there are
    more descriptions for each primitive
  • however, these richer descriptions of primitives
    locally constrain each other
  • analogy to a jigsaw puzzle -- the richer the
    description of each primitive the better

42
Complex descriptions of primitives
  • Making the descriptions of primitives more
    complex
  • allows statistics to be computed over these
    complex descriptions
  • these statistics are more meaningful
  • local statistical computations over these complex
    descriptions lead to robust and efficient
    processing

43
Flexible Composition
Adjoining as Wrapping
[Figure: splitting α at a node labeled X -- α1 is the supertree of α at X
(the part above X) and α2 is the subtree of α at X (the part rooted at X)]
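A short sketch (Python, same assumed representation as the earlier tree
sketches) of the split: given a node address, α is divided into α1, the
supertree above that node, and α2, the subtree rooted at it.

    def split(tree, path):
        """Split tree at the node addressed by path (a list of child indices)
        into (supertree, subtree)."""
        def get(t, p):
            return t if not p else get(t[1][p[0]], p[1:])
        def cut(t, p):
            if not p:
                return (t[0], None)                  # leave an open slot at X
            label, children = t
            i = p[0]
            return (label, children[:i] + [cut(children[i], p[1:])] + children[i+1:])
        return cut(tree, path), get(tree, path)

    alpha = ("S", [("NP", None),
                   ("VP", [("V", [("thinks", [])]), ("S", None)])])
    alpha1, alpha2 = split(alpha, [1])               # split at the VP node
    print(alpha1)   # supertree: ("S", [("NP", None), ("VP", None)])
    print(alpha2)   # subtree:   ("VP", [("V", [("thinks", [])]), ("S", None)])

Adjoining β at X can then be viewed as wrapping: α1 goes above β's root and α2
goes at β's foot, which is what the next slides spell out.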
44
Flexible Composition
Adjoining as Wrapping
[Figure: adjoining β to α at X, viewed as wrapping -- α1 (the supertree of α
at X) attaches above the root of β and α2 (the subtree of α at X) attaches at
the foot of β, yielding γ]

α is wrapped around β, i.e., the two components α1 and α2 are wrapped
around β.
45
Flexible Composition
Wrapping as substitutions and adjunctions
[Figure: the think and likes (object extraction) trees composed by
substitution and adjoining]

- We can also view this composition as α wrapped around β
- Non-directional composition
46
Adjoining as Wrapping
Wrapping as substitutions and adjunctions
[Figure: the two components α1 and α2 of α composed with β]

α1 and α2 are the two components of α: α1 is attached (adjoined) to the root
node S of β, and α2 is attached (substituted) at the foot node S of β.

47
Multi-component LTAG (MC-LTAG)
- The components are used together in one composition step, with the
  individual components being composed by either substitution or adjoining
- The representation can be used for both
  -- predicate-argument relationships
  -- scope information
- The two pieces of information are together before the single composition
  step
- However, after the composition there may be intervening material between
  the components
48
Tree-Local Multi-component LTAG (MC-LTAG)
- How can the components of MC-LTAG compose while preserving the locality of
  LTAG?
- Tree-Local MC-LTAG -- components of a set compose only with an elementary
  tree or an elementary component
- Flexible composition
- Tree-Local MC-LTAGs are weakly equivalent to LTAGs
- However, Tree-Local MC-LTAGs provide structural descriptions not obtainable
  by LTAGs -- increased strong generative power
49
Scope Ambiguities: Example
(every student hates some course)

[Figure: elementary trees -- α3, the S tree anchored on hates; α1 and α2, the
two-component trees for every and some (a scope component α11/α21 that
adjoins at S and an NP component α12/α22); and α4 (student), α5 (course)]
50
Derivation with Scope Information: Example
(every student hates some course)

[Figure: the same elementary trees composed -- the scope components α11 and
α21 adjoin at the S node of α3 (hates), the NP components α12 (every) and
α22 (some) substitute into its argument positions, and α4 (student) and
α5 (course) substitute below them]
51
Derivation Tree with Scope Information: Example
(every student hates some course)

[Figure: derivation tree rooted in α3 (hates), with α11(E) and α21(S) adjoined
at its root, α12 (every) and α22 (some) substituted at the argument positions,
and α4 (student), α5 (course) substituted below them]

- α11 and α21 are both adjoined at the root of α3 (hates)
- They can be adjoined in either order
- α11(E) will outscope α21(S) if α11(E) is adjoined before α21(S)
- Scope information is represented in the LTAG system itself
52
Competence/Performance Distinction: A New Twist
  • For a property, P, of language, how does one
    decide whether P is a competence or a
    performance property?
  • The answer is not given a priori
  • It depends on the formal devices (grammars and
    corresponding machines) available for
    describing language

53
Competence/Performance Distinction: A New Twist
  • With MC-TAG and flexible composition, all word
    order patterns up to two levels of embedding
    can be described with the correct structural
    descriptions assigned, i.e., with correct
    semantics
  • Examples: center embedding of complement
    clauses, clitic movement, scope ambiguities,
    etc.
  • Beyond two levels of embedding, although all
    word order patterns can be described, there is
    no guarantee that correct semantics can be
    assigned to all strings
  • No corresponding result known so far for center
    embedding of relative clauses as in English

54
Summary
  • Complex primitive structures (building blocks)
  • CLSG Complicate Locally, Simplify Globally
  • CLSG makes non-local dependencies become local,
    i.e., they are encapsulated in the primitive
    building blocks
  • New insights into
  • Syntactic description
  • Semantic composition
  • Statistical processing
  • Psycholinguistic properties
  • Applications to other domains