1
Word Sense Disambiguation at Senseval-II
  • Bernardo Magnini, Carlo Strapparava
  • Giovanni Pezzulo and Alfio Gliozzo
  • ITC-irst, Centro per la Ricerca Scientifica e
    Tecnologica
  • Povo (Trento) - Italy
  • {magnini, strappa, pezzulo, gliozzo}@itc.it

2
Outline
  • Word Sense Disambiguation (WSD)
  • Definition of the task
  • Methodological issues
  • Senseval-II Experience
  • Overview
  • Systems description

3
Word Sense Disambiguation (WSD)
WSD is the process of deciding the meaning of a word in its context.
  • The problem arises from lexical ambiguity:
  • he cashed a check at the bank
  • he sat on the bank of the river and watched the
    currents

The same word can assume different senses depending on the particular context in which it occurs.
4
WSD Preliminary observations
  • All the senses for a word are collected into a dictionary.
  • Evaluating a WSD program is a process of comparing human and system answers.
  • Most common words have more than one meaning (Zipf's law), so most of the terms in a text are polysemous.

5
Uses of WSD systems
  • Many NLP applications could be improved using a
    good WSD module
  • Examples
  • Machine Translation
  • Information Retrieval
  • Question Answering

6
The WSD Problem
  • Choose the sense repository (meanings are
    represented in different ways in different
    dictionaries)
  • Develop WSD procedures and systems
  • Evaluate system results

7
Machine Readable Dictionaries (MRD)
  • Provide sense repertories for disambiguation
    systems
  • Different dictionaries present different sense
    distinctions for the same word (granularity)
  • Some algorithms use information taken from
    dictionaries
  • The most widely used dictionaries for WSD are WordNet, LDOCE, and Hector

8
Choosing the right sense
  • he cashed a check at the bank

Fine-grained dictionary (WordNet):
1. depository financial institution, bank, banking concern, banking company
2. bank -- (sloping land (especially the slope beside a body of water))
3. bank -- (a supply or stock held in reserve for future use (especially in emergencies))
4. bank, bank building
5. bank -- (an arrangement of similar objects in a row or in tiers)
6. savings bank, coin bank, money box, bank
7. bank -- (a long ridge or pile)
8. bank -- (the funds held by a gambling house or the dealer in some gambling games)
9. bank, cant, camber
10. bank -- (a flight maneuver; aircraft tips laterally about its longitudinal axis)

Coarse-grained dictionary (WordNet Domains):
1. ECONOMY (institution or place where money can be saved)
2. GEOGRAPHY (the sloping land beside a body of water)
3. FACTOTUM (an arrangement of similar objects in a row or in tiers)
4. ARCHITECTURE (a slope in the turn of a road)
5. TRANSPORT (a flight maneuver)
9
Evaluation of WSD systems
  • Consists of a comparison between system and human answers
  • Human answers are collected in an annotated
    corpus (Gold Standard)
  • Precision and Recall can be used.
  • Baseline and upper bound can be fixed.

10
Corpora
  • Large collections of texts
  • Sense Annotated
  • Semcor (200,000), DSO (192,000 semantically annotated occurrences of 121 nouns and 70 verbs), Senseval training data (8,699 texts for 73 words), TAL-treebank (80,000)
  • Difficult and expensive to realise.
  • Non Annotated
  • Brown, LOB, Reuters
  • Available in large quantity
  • Uses for WSD
  • To evaluate systems (gold standard)
  • Learning

11
Gold Standard Datasets
Manually sense-tagged corpora with respect to a given dictionary
  • Requirements
  • Sense selections must be made independently by more than one person using the same dictionary; in cases of disagreement a supervisor is called in to choose.
  • Inter-Tagger Agreement must be high enough (more than 80%)

12
Inter-Tagger Agreement (ITA)
  • People often disagree on the sense to be assigned
    to a corpus instance of a word
  • ITA can be evaluated if more than one person made
    the sense selection on the same text
  • It is the percentage of instances on which the annotators make the same choice
  • It can also be evaluated using the Kappa measure (a minimal sketch follows below).
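A minimal sketch of how the Kappa measure corrects raw agreement for chance, in plain Python (the sense labels are toy data; this is illustrative, not the official Senseval scorer):

  from collections import Counter

  def cohen_kappa(a, b):
      """Cohen's Kappa for two annotators; a and b hold one sense label
      per corpus instance."""
      n = len(a)
      p_observed = sum(x == y for x, y in zip(a, b)) / n
      ca, cb = Counter(a), Counter(b)
      # chance agreement: probability that both taggers pick the same label
      p_chance = sum(ca[label] * cb[label] for label in ca) / (n * n)
      return (p_observed - p_chance) / (1 - p_chance)

  # raw ITA is 2/3 here, but Kappa is 0: agreement is no better than chance
  print(cohen_kappa(["bank#1", "bank#2", "bank#1"],
                    ["bank#1", "bank#1", "bank#1"]))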

13
Precision and Recall for WSD
  • Precision = Good / (Good + Bad)
  • Recall = Good / (Good + Bad + Null)
  • where Good is the number of correct answers provided by the system
  • Bad is the number of wrong answers provided by the system
  • Null is the number of cases in which the system doesn't provide any answer
  • Example: with Good = 60, Bad = 20 and Null = 20, precision is 60/80 = 0.75 and recall is 60/100 = 0.60

Note: many systems provide multiple senses for a single instance of a word, so variations of the measures shown above can be used.
14
Classification of WSD systems
  • Unsupervised
  • Knowledge based (WN, dictionaries)
  • Learning from non annotated corpora
  • Supervised
  • Learning from sense annotated corpora (e.g.
    Semcor, DSO, TAL-treebank, and training data)

Many systems make use of mixed techniques to
improve their results.
15
Baselines for a WSD system
  • Very easy (naive) WSD procedures
  • Used to measure the improvement in a WSD system
    performance
  • Represent the lower bound of a WSD system's accuracy.
  • Examples
  • Unsupervised: Random, Simple Lesk
  • Supervised: Most Frequent, Lesk-plus-corpus.

16
Lesk's algorithm (1986)
  • Simple
  • Choose the sense whose dictionary definition and example texts have the most words in common with the words around the instance to be disambiguated.
  • Plus corpus
  • As Simple Lesk, but also considers the words contained in the tagged training data.

Simple Lesk is unsupervised; Lesk-plus-corpus is supervised (a minimal sketch of Simple Lesk follows).
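A minimal sketch of Simple Lesk in Python; the two-sense inventory below is a toy fragment standing in for the real dictionary definitions and example texts:

  def simple_lesk(context_words, senses):
      """Pick the sense whose definition-plus-example text shares the
      most words with the context of the instance to disambiguate."""
      context = set(w.lower() for w in context_words)
      return max(senses,
                 key=lambda s: len(context & set(senses[s].lower().split())))

  senses = {
      "bank#1": "a financial institution; he cashed a check at the bank",
      "bank#2": "sloping land beside a body of water; the bank of the river",
  }
  print(simple_lesk("he cashed a check at the bank".split(), senses))  # bank#1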
17
Is ITA the upper bound for the accuracy of WSD
systems?
  • If a second human agrees with a first only 80% of the time, then it is not clear what it means to say that a program was more than 80% accurate (Kilgarriff, 1998)
  • The debate is still open
  • ITA defines the upper bound of how well a
    computer program can perform (Kilgarriff)
  • Computers could work better than humans (Wilks)
  • If a WSD system achieves a recall higher than ITA, then either the system or the task itself is wrongly designed (our opinion).

18
Outline
  • Word Sense Disambiguation (WSD)
  • Definition of the task
  • Methodological issues
  • Senseval-II Experience
  • Overview
  • Description of some systems

19
SENSEVAL goals
  • Provide a common framework to compare WSD systems
  • Standardise the task (especially evaluation
    procedures)
  • Build and distribute new lexical resources
    (dictionaries and sense tagged corpora)
  • "There are now many computer programs for automatically determining the sense of a word in context (Word Sense Disambiguation or WSD). The purpose of Senseval is to evaluate the strengths and weaknesses of such programs with respect to different words, different varieties of language, and different languages." (from http://www.sle.sharp.co.uk/senseval2)

20
SENSEVAL History
  • ACL-SIGLEX workshop (1997)
  • Yarowsky and Resnik paper
  • SENSEVAL-I (1998)
  • Lexical Sample for English, French, and Italian
  • SENSEVAL-II (Toulouse, 2001)
  • Lexical Sample and All Words
  • Organization: Kilgarriff (Brighton)
  • SENSEVAL-III (???)
  • Senseval workshop (ACL 2002)

21
WSD at SENSEVAL-II
  • Choosing the right sense for a word among those
    of WordNet

Sense 1: horse, Equus caballus -- (solid-hoofed herbivorous quadruped domesticated since prehistoric times)
Sense 2: horse -- (a padded gymnastic apparatus on legs)
Sense 3: cavalry, horse cavalry, horse -- (troops trained to fight on horseback: "500 horse led the attack")
Sense 4: sawhorse, horse, sawbuck, buck -- (a framework for holding wood that is being sawed)
Sense 5: knight, horse -- (a chessman in the shape of a horse's head; can move two squares horizontally and one vertically (or vice versa))
Sense 6: heroin, diacetyl morphine, H, horse, junk, scag, shit, smack -- (a morphine derivative)

Example instance: Corton has been involved in the design, manufacture and installation of horse stalls and horse-related equipment like external doors, shutters and accessories.
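For illustration, a hedged sketch of listing a word's WordNet senses with NLTK's WordNet interface (NLTK ships a more recent WordNet than the special 1.7 Senseval release, so sense numbering and glosses may differ from the list above; requires nltk.download('wordnet')):

  from nltk.corpus import wordnet as wn

  for i, synset in enumerate(wn.synsets("horse", pos=wn.NOUN), start=1):
      lemmas = ", ".join(lemma.name() for lemma in synset.lemmas())
      print(f"Sense {i}: {lemmas} -- ({synset.definition()})")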
22
SENSEVAL-II Schedule
23
SENSEVAL-II Tasks
  • All Words (without training data): Czech, Dutch, English, Estonian
  • Lexical Sample (with training data): Basque, Chinese, Danish, English, Italian, Japanese, Korean, Spanish, Swedish

24
English SENSEVAL-II
  • Organization: Martha Palmer (UPENN)
  • Gold standard: 2 annotators and 1 supervisor (Fellbaum)
  • Interchange data format: XML
  • Sense repository: WordNet 1.7 (special Senseval release)
  • Competitors:
  • All Words: 11 systems
  • Lexical Sample: 16 systems

25
English All Words
  • Data: 3 texts for a total of 1,770 words
  • Average polysemy: 6.5
  • Example: (part of) Text 1

The art of change-ringing is peculiar to the English and, like most English peculiarities, unintelligible to the rest of the world. -- Dorothy L. Sayers, "The Nine Tailors"
ASLACTON, England -- Of all scenes that evoke rural England, this is one of the loveliest: An ancient stone church stands amid the fields, the sound of bells cascading from its tower, calling the faithful to evensong. The parishioners of St. Michael and All Angels stop to chat at the church door, as members here always have.
26
English All Words Systems
  • Supervised (5)
  • S. Sebastian (decision lists in Semcor)
  • UCLA (Semcor, Semantic Distance and Density,
    AltaVista for frequency)
  • Sinequa (Semcor and Semantic Classes)
  • Antwerp (Semcor, Memory Based Learning)
  • Moldovan (Semcor plus an additional sense tagged
    corpus, heuristics)
  • Unsupervised (6)
  • UNED (relevance matrix over a Project Gutenberg corpus)
  • Illinois (Lexical Proximity)
  • Malaysia (MTD, Machine Tractable Dictionary)
  • Litkowski (New Oxford Dictionary and Contextual Clues)
  • Sheffield (Anaphora and WN hierarchy)
  • IRST (WordNet Domains)

27
Fine and coarse grained senses
  • Fine-grained: the answers are compared to the senses from the Gold Standard.
  • Coarse-grained: the answers are mapped to coarse-grained senses and compared to the Gold Standard tags, also mapped to coarse-grained senses.
  • Example: groups for the verb "to use"
  • GROUP 1
  • use1: use, utilize, utilise, apply, employ -- (put into service)
  • use3: use -- (seek or achieve an end)
  • use5: practice, apply, use -- (avail oneself to)
  • GROUP 2
  • use2: use -- (take or consume (regularly))
  • use4: use, expend -- (use up, consume fully ...)
  • GROUP 3
  • use6: use -- (habitually do something)

28
(No Transcript)
29
Lexical Sample
  • Data: 8,699 texts for 73 words
  • Average WN polysemy: 9.22
  • Training data: 8,166 instances (average 118 per word)
  • Baseline (commonest sense): 0.47 precision
  • Baseline (Lesk): 0.51 precision

30
Lexical Sample
Example: to leave

<instance id="leave.130">
<context>
I 'd been seeing Johnnie almost a year now, but I still didn't want to <head>leave</head> him for five whole days.
</context>
</instance>
<instance id="leave.157">
<context>
And he saw them all as he walked up and down. At two that morning, he was still walking -- up and down Peony, up and down the veranda, up and down the silent, moonlit beach. Finally, in desperation, he opened the refrigerator, filched her hand lotion, and <head>left</head> a note.
</context>
</instance>
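A sketch of how such instances could be read with Python's standard library, assuming they are wrapped in a single root element in a file named leave.xml (the file name and wrapping are assumptions, not the actual Senseval distribution layout):

  import xml.etree.ElementTree as ET

  root = ET.parse("leave.xml").getroot()
  for instance in root.iter("instance"):
      context = instance.find("context")
      head = context.find("head")       # the target word to disambiguate
      words = " ".join(context.itertext()).split()
      print(instance.get("id"), "->", head.text, "|", " ".join(words[:10]), "...")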
31
English Lexical Sample Systems
  • Unsupervised (5): Sunderland, UNED, Illinois, Litkowski, ITRI
  • Supervised (12): S. Sebastian, Sinequa, Manning, Pedersen, Korea, Yarowsky, Resnik, Pennsylvania, Barcelona, Moldovan, Alicante, IRST

32
Supervised Techniques
  • Algorithms
  • Decision Lists
  • Boosting
  • Domain Driven Disambiguation
  • ...
  • Features
  • Lexical Context
  • Words
  • Morphological roots
  • Syntactic Context
  • POS bigrams/trigrams
  • Semantic Context
  • Domains
  • ...

33
Decision Lists 1/2
  • Training: lexical context of n words
  • Example: the word "bank"
  • bank1: depository financial institution ...
  • bank2: sloping land ...

34
Decision Lists 2/2
  • The pieces of evidence most strongly indicative of a particular pattern have the largest log-likelihood (strongest and most reliable evidence)
  • The log-likelihood of each piece of evidence takes into account positive and negative examples

Classification of new examples: the highest line in the list that matches the given context wins (a minimal sketch follows).
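A minimal Yarowsky-style decision list sketch in Python for a two-sense word; the smoothing constant and the toy training contexts are illustrative assumptions:

  import math
  from collections import defaultdict

  def train_decision_list(examples, alpha=0.1):
      """examples: (set_of_context_words, sense) pairs, sense in {0, 1}."""
      counts = defaultdict(lambda: [0.0, 0.0])
      for features, sense in examples:
          for f in features:
              counts[f][sense] += 1
      rules = []
      for f, (n0, n1) in counts.items():
          llr = math.log((n1 + alpha) / (n0 + alpha))  # smoothed evidence
          rules.append((abs(llr), f, 1 if llr > 0 else 0))
      return sorted(rules, reverse=True)  # strongest evidence first

  def classify(rules, features, default=0):
      for _, f, sense in rules:
          if f in features:   # the highest matching line wins
              return sense
      return default

  data = [({"cashed", "check"}, 0), ({"river", "currents"}, 1),
          ({"money", "check"}, 0), ({"slope", "river"}, 1)]
  print(classify(train_decision_list(data), {"river", "bank"}))  # -> 1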
35
Boosting 1/2
  • Combine many simple and moderately accurate Weak
    Classifiers (WC)
  • Train WCs sequentially, each on the examples
    which were most difficult to classify by the
    preceding WCs
  • Examples of WCs:
  • preceding_word = house
  • domain = sport
  • ...

36
Boosting 2/2
  • WCi is trained and tested on the whole corpus
  • Each pair (word, synset) is given an importance weight h depending on how difficult it was for WC1, ..., WCi to classify
  • WCi+1 is tuned to classify the worst (word, synset) pairs correctly, and it is tested on the whole corpus
  • so h is updated at each step

At the end all the WCs are combined into a single rule, the combined hypothesis: each WC is weighted according to its effectiveness in the tests (a sketch follows).
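An AdaBoost-flavoured sketch of this scheme in Python; single-feature tests stand in for the weak classifiers, and the data and feature names are illustrative assumptions:

  import math

  def train_boosting(examples, features, rounds=3):
      """examples: (set_of_features, label) pairs with label in {-1, +1}."""
      w = [1.0 / len(examples)] * len(examples)   # importance weights (h)
      committee = []                              # (alpha, feature) pairs
      for _ in range(rounds):
          # pick the weak classifier (feature test) with lowest weighted error
          errs = {f: sum(wi for wi, (x, y) in zip(w, examples)
                         if (1 if f in x else -1) != y) for f in features}
          f = min(errs, key=errs.get)
          e = min(max(errs[f], 1e-9), 1 - 1e-9)
          alpha = 0.5 * math.log((1 - e) / e)     # effectiveness of this WC
          committee.append((alpha, f))
          # raise the weight of misclassified examples, then renormalise
          w = [wi * math.exp(-alpha * y * (1 if f in x else -1))
               for wi, (x, y) in zip(w, examples)]
          w = [wi / sum(w) for wi in w]
      return committee

  def classify(committee, x):
      score = sum(a * (1 if f in x else -1) for a, f in committee)
      return 1 if score >= 0 else -1

  data = [({"preceding_word=house", "domain=sport"}, 1),
          ({"domain=economy"}, -1), ({"domain=sport"}, 1)]
  feats = {"preceding_word=house", "domain=sport", "domain=economy"}
  print(classify(train_boosting(data, feats), {"domain=sport"}))  # -> 1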
37
Domain Driven Disambiguation 1/3
  • Comparison between
  • the domain(s) of each synset of a word
  • the domain(s) of the context where the word
    appears
  • Domain information is collected in Domain Vectors
    having 41 dimensions (one for each domain label)
  • We build
  • Text Vectors
  • Synset Vectors
  • and we compare them using scalar products

38
Domain Driven Disambiguation 2/3
Example:
Bank1: depository financial institution ...
Bank2: sloping land ...
TEXT: "He cashed a check at the bank"
Scores (scalar products): Bank1 = 1.731878, Bank2 = 0.06185
  • The module of a Synset Vector is proportional to its frequency (in Semcor or in other training data)
  • The direction is indicative of the contribution of its domain(s)

39
Domain Driven Disambiguation 3/3
  • Obtaining Text Vectors:
  • a text categorisation technique based on the WordNet Domains resource
  • Obtaining Synset Vectors:
  • from training data
  • from manual annotation (WordNet Domains)
  • (a minimal scoring sketch follows)
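A hedged sketch of the scoring step in Python; the two-sense inventory, the domain weights, and the 2-dimensional vectors (41 dimensions in the real system) are illustrative assumptions:

  def dot(u, v):
      """Scalar product of two sparse domain vectors."""
      return sum(u.get(d, 0.0) * v.get(d, 0.0) for d in u)

  text_vector = {"ECONOMY": 0.9, "GEOGRAPHY": 0.1}  # "He cashed a check ..."
  synset_vectors = {
      "bank#1": {"ECONOMY": 1.7},      # module grows with training frequency
      "bank#2": {"GEOGRAPHY": 0.3},
  }
  scores = {s: dot(text_vector, v) for s, v in synset_vectors.items()}
  print(max(scores, key=scores.get))   # -> bank#1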

40
(No Transcript)
41
Discussion about IRST Results
  • Domain Driven Disambiguation cannot be successfully applied to words that do not carry relevant domain information,
  • for instance:
  • factotum words (i.e. words with many generic senses, e.g. the verb "to be")
  • words whose senses have domains that are far from the relevant ones in the context
  • In these cases the system gives no answer; this explains the low recall

42
IRST Results at SENSEVAL-II
43
(No Transcript)
44
Word Sense Disambiguation at Senseval-II
  • Bernardo Magnini, Carlo Strapparava
  • Giovanni Pezzulo and Alfio Gliozzo
  • ITC-irst, Centro per la Ricerca Scientifica e
    Tecnologica
  • Povo (Trento) - Italy
  • {magnini, strappa, pezzulo, gliozzo}@itc.it

45
Domain Driven Disambiguation
  • Semantic domains play an important role in the
    disambiguation process
  • Underlying assumption
  • Knowing in advance the relevant semantic
    domain(s) of a text makes word sense
    disambiguation easier

46
Domain Information 1/5
From the plush Connolly hide leather sofa and
chairs in the living room to the Bang and Olufsen
stereo, and remote control television complete
with video, you're surrounded by the HIGHEST
QUALITY. The inlaid chequerboard top of the
coffee table houses all kind of games, including
backgammon, chess and Scrabble. You'll also find
a selection of books, from Queen Victoria's
Highland journals, to the very latest bestselling
thriller. The dinner table and chairs are
elegant yet comfortable, and you can be assured
of the finest tableware and crystal for meals at
home.
1. FURNITURE: chair -- (a seat for one person)
2. UNIVERSITY: professorship, chair -- (the position of professor)
3. ADMINISTRATION: president, chairman, chairwoman, chair, chairperson
4. LAW: electric chair, chair, death chair, hot seat

Domains evoked by the text: FURNITURE, PLAY, LITERATURE
47
Domain Information 2/5
(Same passage as in 1/5; in the original slide, domain-relevant words are highlighted)
48
Domain Information 3/5
(Same passage, with further domain-relevant words highlighted)
49
Domain Information 4/5
(Same passage, with further domain-relevant words highlighted)
50
Domain Information 5/5
(Same passage, with further domain-relevant words highlighted)
51
Domain Information Sources
  • Annotated WordNet (WordNet Domains)
  • ontology-based (according to the WordNet
    hierarchical structure)
  • focused on technical senses (e.g. believe)
  • Categorised corpora
  • word clustering reflects the distribution of words over texts
  • focused on common use

52
WordNet Domains
  • Integrates taxonomic and domain-oriented information
  • Cross-hierarchy relations:
  • doctor#2 [MEDICINE] --> person#1
  • hospital#1 [MEDICINE] --> location#1
  • Cross-category relations: operate#3 [MEDICINE]
  • Cross-language information

53
Polysemy Reduction
(Figure: polysemy reduction across domain labels such as PUBLISHING, RELIGION, THEATER, COMMERCE and FACTOTUM)
54
Semantic Domains Organization
  • 250 domain labels collected from dictionaries
  • Four-level hierarchy (Dewey Decimal Classification)
  • 41 basic domains used for Senseval

55
WordNet Domains Statistics 1/2
56
WordNet Domains Statistics 2/2
57
Domain Overlapping
(Figure: domain overlap between ALIMENTATION (supermarket, recipe, restaurant, cooking, food, eating, fork, kitchen, drinking) and MEDICINE (hospital, illness, doctor), with words such as diet and bulimia in the overlap)
58
This was another difficult verb to group, possibly even more difficult than "match" (and thankfully less polysemous!). The problem with grouping "use" -- and I remember encountering this in my tagging -- is that the various senses of "use" sort of shade off into one another, so that the boundaries are fuzzy even for verbs. In fact, of all the verbs I tagged, this one is the murkiest. Ultimately, these groups are almost artificial. This is true for any grouping assignment really, but in this case the artifice is worn on the sleeve.

GROUP ONE. If the sense seemed to be fairly explicit about the existence of an inherent function or purpose, I grouped it here. This ended up being the general all-purpose when-in-doubt-tag-to-this-sense sense (that would be sense 1), as well as the specific exploitative sense where the subject is using the direct object to further his own advantage (sense 3) and the sense which refers to using more abstract sorts of principles (sense 5).

GROUP TWO. If the sense seemed to imply that the thing being used was a commodity, and that it was being consumed, I put it here. This ended up being the drug-addict sense (sense 2) and the deplete sense (sense 4).

SENSE 6. I love sense 6. Since it is really only an aspectual marker, you can't group it with anything, and no matter how lumpy you might be, you can't argue that it should be grouped anywhere.