The Rule-based Parser of the NLP Group of the University of Torino

1
The Rule-based Parser of the NLP Group of the
University of Torino
  • Leonardo Lesmo
  • Dipartimento di Informatica and Centro di Scienze Cognitive,
  • Università di Torino,
  • Italy
  • Email: lesmo_at_di.unito.it

2
Goals
  • Wide-coverage tool
  • Domain-independence
  • Extensibility to semantics

Approach
  • Manually developed rules
  • Two phases: chunking and subcategorization
  • Procedural analysis of conjunctions and procedural
    identification of verbal dependents

3
TULE (Turin University Linguistic Environment)
4
The grammar
  • Rule-based dependency grammar
  • Chunking (non-verbal groups) plus verbal
    subcategorization frames
  • Output: a projective tree represented as pointers
    to parents, including some null elements
    (understood items, e.g. pro-drop, and traces)
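A minimal sketch of the output format described above, assuming a tree is stored as a list of parent indices (one per node, `None` for the root). The function name and the sample arcs are hypothetical, and the check is the usual "no crossing arcs" simplification of projectivity, not the parser's own code.

```python
from typing import List, Optional

# Hypothetical encoding: parents[i] is the index of the parent of node i,
# or None for the root. A null element (e.g. a dropped subject) would be an
# ordinary node whose surface form is empty.
def is_projective(parents: List[Optional[int]]) -> bool:
    """Return True if no two dependency arcs cross (the root has no arc)."""
    arcs = [(min(i, p), max(i, p)) for i, p in enumerate(parents) if p is not None]
    for a, b in arcs:
        for c, d in arcs:
            # Arcs (a, b) and (c, d) cross when c lies strictly inside
            # (a, b) and d lies strictly outside it.
            if a < c < b < d:
                return False
    return True

# Illustrative arcs only: node 0 is the root, 1 depends on 0, 2 depends on 1.
chain = [None, 0, 1]
# Arcs 0-2 and 1-3 cross, so this structure is not projective.
crossing = [2, 3, None, 2]
```

Calling `is_projective(chain)` gives `True` and `is_projective(crossing)` gives `False`.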

5
Parser Architecture
  • Input: lexical items
  • CHUNKING (resource: chunking rules): splits the text into groups of
    strictly connected words; produces chunked text
  • ANALYSIS OF CONJUNCTIONS (resource: procedural preference rules 1):
    connects chunks linked by conjunctions, to form larger chunks;
    produces chunked text
  • SEGMENTATION (resource: procedural preference rules 2): determines
    the dependents of verbs
  • VERBAL ATTACHMENT (resources: lexical items, verb classes, verbal
    caseframes): determines the role (arc labels) of the verbal
    dependents
  • Output: parse tree
6
An example
7
Chunking
Example: Puoi dirmi che spettacoli di cabaret posso vedere domani?
(Can you tell me what cabaret plays I can see tomorrow?)
Puoi/V-modal-2nd-sing-pres dir/V-inf [mi/Pron-1st-dative]Pron
[che/Adj-interr spettacoli/Noun [di/Prep cabaret/Noun]P-group]N-group
posso/V-modal-1st-sing-pres vedere/V-inf [domani/Adv]A-group ?
Chunking Rules
  • Chunking rules are grouped in packets.
  • Each packet is associated with a lexical
    category, and describes the possible chunkable
    dependents of words of that category.
  • "Chunkable" means that the dependent is handled
    during chunking (e.g. auxiliaries, but not
    arguments of verbs)
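The packet organisation can be pictured roughly as follows. This is a simplified sketch, not the TULE rule engine: the `Rule` fields, the arc labels and the restriction to immediately adjacent dependents are all assumptions made for illustration.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple

class Rule(NamedTuple):
    dep_cat: str    # category of the dependent
    position: str   # "precedes" or "follows" the head
    label: str      # arc label given to the dependent (hypothetical names)

# Hypothetical packets: one list of rules per head category.
PACKETS: Dict[str, List[Rule]] = {
    "NOUN": [Rule("ADJ", "precedes", "ADJ-RMOD"), Rule("ART", "precedes", "DET")],
    "VERB": [Rule("AUX", "precedes", "AUX")],
}

def chunk(tagged: List[Tuple[str, str]]) -> List[Optional[Tuple[int, str]]]:
    """For each word return (head index, label), or None if left unattached."""
    result: List[Optional[Tuple[int, str]]] = [None] * len(tagged)
    for head, (_, head_cat) in enumerate(tagged):
        # Only the packet of the head's category is consulted.
        for rule in PACKETS.get(head_cat, []):
            dep = head - 1 if rule.position == "precedes" else head + 1
            if (0 <= dep < len(tagged)
                    and tagged[dep][1] == rule.dep_cat
                    and result[dep] is None):
                result[dep] = (head, rule.label)
    return result

sentence = [("ho", "AUX"), ("visto", "VERB"), ("bei", "ADJ"), ("spettacoli", "NOUN")]
attachments = chunk(sentence)
```

Here the auxiliary attaches to the verb and the adjective to the noun, while the noun itself stays unattached: it is a possible verbal argument, which the slides leave to the later phases.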

8
A chunk rule
(NOUN common
  (precedes (ADJ qualif T (\- \' \"))
            (ADJ ((type qualif) (agree)))
            ADJCQUALIF-RMOD))
9
Conjunctions
  • When a coordinating conjunction is found, all
    following and preceding chunks are collected
  • All pairs are built, and the best one is chosen
    according to criteria based on structural
    similarity and distance
  • Special treatment for verbs

Example: Ho incontrato Marco e Lucia e li ho salutati
(I met Marco and Lucia and I greeted them)
Ho/V-aux incontrato/V-main [Marco/Noun-Proper]Noun e/Conj-coord
[Lucia/Noun-Proper]Noun e/Conj-coord [li/Pron-pers]Pron
ho/V-aux salutati/V-main
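The pair-selection step can be sketched as below. The scoring function is an assumption: the slides say only that the criteria are structural similarity and distance, so the category-equality test and the 0.1 distance weight are placeholders, and the special treatment of verbs is not modelled.

```python
from itertools import product
from typing import List, Tuple

Chunk = Tuple[int, str]  # (position in the sentence, chunk category)

def best_conjuncts(before: List[Chunk], after: List[Chunk],
                   conj_pos: int) -> Tuple[Chunk, Chunk]:
    """Build every (left, right) pair and return the best-scoring one."""
    def score(pair: Tuple[Chunk, Chunk]) -> float:
        left, right = pair
        # Placeholder similarity: 1 if the two chunks share a category.
        similarity = 1.0 if left[1] == right[1] else 0.0
        # Distance of both conjuncts from the conjunction.
        distance = (conj_pos - left[0]) + (right[0] - conj_pos)
        return similarity - 0.1 * distance
    return max(product(before, after), key=score)

# "Ho incontrato Marco e Lucia": chunks before and after the first "e".
before = [(0, "Verb"), (1, "Noun")]   # "Ho incontrato", "Marco"
after = [(3, "Noun")]                 # "Lucia"
pair = best_conjuncts(before, after, conj_pos=2)
```

With these placeholder weights the pair ("Marco", "Lucia") wins over ("Ho incontrato", "Lucia"), because the two noun chunks are both more similar and closer to the conjunction.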
10
Segmentation
  • For each verb (going from left to right), look
    for possible dependents on its right and on its
    left
  • On the left, the search is blocked by the
    previous verb
  • On the right, some barriers are defined to stop
    the search (for instance, a subordinating
    conjunction acts as a barrier)

Puoi/V-modal-2nd-sing-pres dir/V-inf [mi/Pron-1st-dative]Pron
[che/Adj-interr spettacoli/Noun [di/Prep cabaret/Noun]P-group]N-group
posso/V-modal-1st-sing-pres vedere/V-inf [domani/Adv]A-group ?
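The search window for one verb can be sketched as follows. The tag names and the barrier set are hypothetical, and the sketch only collects candidates; the preference rules that choose among them are not modelled.

```python
from typing import FrozenSet, List, Tuple

def candidate_dependents(
    tags: List[str],
    verb_idx: int,
    barriers: FrozenSet[str] = frozenset({"CONJ-SUBORD"}),
) -> Tuple[List[int], List[int]]:
    """Return (left candidates, right candidates) for the verb at verb_idx."""
    left: List[int] = []
    i = verb_idx - 1
    # Leftward search: blocked by the previous verb.
    while i >= 0 and tags[i] != "VERB":
        left.append(i)
        i -= 1
    right: List[int] = []
    j = verb_idx + 1
    # Rightward search: stopped by the first barrier element.
    while j < len(tags) and tags[j] not in barriers:
        right.append(j)
        j += 1
    return sorted(left), right

# Hypothetical chunk-level tags for a two-clause sentence.
tags = ["NOUN", "VERB", "NOUN", "CONJ-SUBORD", "NOUN", "VERB", "ADV"]
first = candidate_dependents(tags, 1)
second = candidate_dependents(tags, 5)
```

For the first verb the rightward search stops at the subordinating conjunction; for the second verb the leftward search stops at the first verb.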
11
Verbal Subcategorization
The subcategorization classes
12
Example subcategorization class definitions
(subj-verbs (intrans) (verbs)
  ; verbs with a subject. Definition of subject
  (verb-subj ((noun (agree))
              (art (agree))
              (pron (not (word quale) (type relat)) (case lsubj) (agree))
              (adj (type (indef demons deitt interr poss)) (agree))
              (num (agree))
              (prep (word in) (down (cat pron) (type indef)) (agree)))))
(ssubj-inf-verbs () (verbs)
  ; verbs with an inf-verb sentential subject
  (verb-subj ((verb (mood infinite) (agree)))))
(empty-modal () (no-subj-verbs)
  ; modals without subject
  (verb-indcompl-modal ((verb (mood infinite)))))
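One way to read these definitions is as a class hierarchy in which a class inherits the cases of its ancestors. The sketch below assumes that reading; the slides do not state what the second and third list elements mean, so the parent links, the `trans` class and the `verb-obj` case are all hypothetical.

```python
from typing import Dict, List

# Hypothetical, simplified hierarchy: each class lists its parents and the
# cases it defines locally. Names loosely follow the slide; "trans" and
# "verb-obj" are invented for the example.
CLASSES: Dict[str, Dict[str, List[str]]] = {
    "verbs": {"parents": [], "cases": []},
    "subj-verbs": {"parents": ["verbs"], "cases": ["verb-subj"]},
    "intrans": {"parents": ["subj-verbs"], "cases": []},
    "trans": {"parents": ["intrans"], "cases": ["verb-obj"]},
}

def case_frame(name: str) -> List[str]:
    """Collect inherited cases first, then the class's own cases."""
    frame: List[str] = []
    for parent in CLASSES[name]["parents"]:
        for case in case_frame(parent):
            if case not in frame:
                frame.append(case)
    for case in CLASSES[name]["cases"]:
        if case not in frame:
            frame.append(case)
    return frame

trans_frame = case_frame("trans")
```

Under these assumptions `trans_frame` contains the inherited `verb-subj` followed by the locally defined `verb-obj`.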
13
Transformations
basic class (e.g. trans)
transformed classes (e.g. trans, trans+passivization,
trans+infinitivization, trans+prodrop,
trans+passivization+infinitivization, ...)
Example transformation
(infinitivization replacing
  (subj-verbs)
  (is-inf-form tr-verb v-casefr)
  (cancel-case s-subj))
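The derivation of surface case frames from base classes can be sketched like this. Only the subject-cancelling effect of infinitivization is taken from the slide, via `(cancel-case s-subj)`; the `s-obj` case, the "+" used to join derived names and the applicability test are assumptions standing in for details the slides do not give.

```python
from typing import Callable, Dict, List

Frame = List[str]

def infinitivization(frame: Frame) -> Frame:
    # Mirrors (cancel-case s-subj): the surface subject is dropped.
    return [case for case in frame if case != "s-subj"]

# Hypothetical registry; the other transformations named on the slide are
# not specified there, so they are left out.
TRANSFORMATIONS: Dict[str, Callable[[Frame], Frame]] = {
    "infinitivization": infinitivization,
}

def derive(base: Dict[str, Frame]) -> Dict[str, Frame]:
    """Return the base frames plus one derived frame per applicable transformation."""
    derived = dict(base)
    for name, frame in base.items():
        for t_name, transform in TRANSFORMATIONS.items():
            # Stand-in applicability test: the class must have a subject.
            if "s-subj" in frame:
                derived[f"{name}+{t_name}"] = transform(frame)
    return derived

frames = derive({"trans": ["s-subj", "s-obj"]})
```

The result keeps the base `trans` frame and adds a derived frame without the subject. Applying several transformations in sequence in this way is how a small set of base classes could expand into a much larger set of surface case frames.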
14
Some statistics
  • Chunking rules
      • Total: 295 rules
      • Common: 250 rules
      • English: 34 rules
      • Italian: 7 rules
      • Spanish and Catalan: 4 rules
  • Base subcategorization
      • Total: 118 classes
      • Abstract: 21 classes
      • plus verbal locutions
      • Italian: 40 classes
      • English: 1 class
  • Derived surface case frames
      • 2653 case frames

15
Conclusions
  • Test of the parser on other languages, using the
    same grammar augmented with extra rules (see
    previous slide)
  • Partial use of semantic information (about 400
    words classified according to a semantic taxonomy)
  • The parser has been used in a project involving
    spoken and written linguistic interaction with a
    user. It has been interfaced with a repository
    of semantic knowledge to build a meaning
    representation.