The extension of the Anaphora Resolution Exercise (ARE) to Spanish and Catalan
1
The extension of the Anaphora Resolution Exercise
(ARE)to Spanish and Catalan
  • Constantin Orasan
  • University of Wolverhampton, UK
  • and
  • Marta Recasens
  • Universitat de Barcelona, Spain

2
Structure
  1. Description of ARE2007
  2. English corpus used in ARE2007
  3. The AnCora corpora
  4. Adapting the AnCora corpora for ARE2009
  5. Plans for ARE2009

3
The Anaphora Resolution Exercises (AREs)
  • the goal of ARE was to "develop discourse
    anaphora resolution methods and to evaluate them
    in a common and consistent manner"
  • organised in conjunction with the DAARC
    conferences
  • conceived as multilingual evaluations
  • not meant to be restricted to pronominal
    and NP coreference
  • Do we need a roadmap?

4
ARE2007
  • organised in conjunction with DAARC2007
  • only English texts
  • very short time to organise it
  • can be considered a dry-run for ARE2009
  • focused on 4 tasks
  • 3 participants, 8 runs submitted
  • we used the NP4E corpus, a corpus of newswire
    texts

5
Task 1: Pronominal resolution on pre-annotated
texts
  • resolve pronouns to NPs
  • participants received the pronouns to be resolved
    and the candidate NPs

Pronoun to resolve: it (markable 6)
  • Input text
  • [Israeli-PLO relations]1 have hit [a new low]2
    with [the Palestinian Authority]3 saying [Israel]5
    is wrong to think [it]6 can treat [the
    Authority]7 like [a client militia]8.

Output: (6, 5) = (it, Israel)
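Even this toy example suffices to exercise the task's input/output format. As an illustration only (not one of the submitted systems), a minimal recency baseline resolves each pronoun to the nearest preceding candidate NP:

```python
# Baseline resolver for Task 1: pick the nearest preceding candidate
# NP for each pronoun to be resolved. Markable ids are assumed to
# follow text order, as in the slide example.

def resolve_nearest(pronouns, candidates):
    """pronouns/candidates: dicts mapping markable id -> surface string."""
    output = []
    for pid in sorted(pronouns):
        preceding = [cid for cid in candidates if cid < pid]
        if preceding:
            aid = max(preceding)  # most recent candidate wins
            output.append((pid, aid, pronouns[pid], candidates[aid]))
    return output

pronouns = {6: "it"}
candidates = {1: "Israeli-PLO relations", 2: "a new low",
              3: "the Palestinian Authority", 5: "Israel",
              7: "the Authority", 8: "a client militia"}
print(resolve_nearest(pronouns, candidates))
# nearest preceding candidate to pronoun 6 is markable 5, "Israel"
```

Recency happens to give the right answer here; real systems add agreement and salience constraints on top of it.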
6
Evaluation method for Task 1
  • success rate (accuracy), defined as the number
    of correctly resolved anaphoric pronouns divided
    by the total number of anaphoric pronouns
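The metric is straightforward to compute; a minimal sketch, assuming system and gold answers are given as (pronoun id, antecedent id) pairs:

```python
# Success rate (accuracy) for Task 1: correctly resolved anaphoric
# pronouns divided by the total number of anaphoric pronouns.
# The gold list enumerates all anaphoric pronouns, so it supplies
# the denominator.

def success_rate(system_pairs, gold_pairs):
    gold = dict(gold_pairs)  # pronoun id -> gold antecedent id
    correct = sum(1 for p, a in system_pairs if gold.get(p) == a)
    return correct / len(gold_pairs)

print(success_rate([(6, 5)], [(6, 5)]))  # 1.0
```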

7
Task 2: Coreferential chain resolution on
pre-annotated texts
  • assign NPs to chains
  • participants received texts in which the NPs
    belonging to a chain with at least two elements
    were annotated
  • Input text
  • [Israeli-PLO relations]1 have hit a new low
    with [the Palestinian Authority]2 saying
    [Israel]3 is wrong to think [it]4 can treat
    [the Authority]5 like a client militia.

Output: Chain 1 = {3 Israel, 4 it}; Chain 2 =
{2 the Palestinian Authority, 5 the Authority}
8
Evaluation method for Task 2
  • Precision and recall as defined by MUC
  • only one system participated
  • precision 53.01
  • recall 45.72
  • f-measure 48.32
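The MUC metric counts coreference links rather than mentions; a sketch of the link-based recall/precision of Vilain et al. (1995), with chains represented as sets of mention identifiers, could look like this:

```python
# MUC link-based scoring, the scheme used for Task 2. Recall over
# gold chains S: sum(|S| - |partition of S by system chains|) /
# sum(|S| - 1); precision is the same computation with the roles of
# gold and system swapped.

def muc_recall(gold_chains, sys_chains):
    num = den = 0
    for chain in gold_chains:
        # partition this gold chain by the system chains; mentions
        # the system left unresolved become singleton parts
        parts = set()
        for m in chain:
            owner = next((i for i, c in enumerate(sys_chains) if m in c),
                         None)
            parts.add(('sys', owner) if owner is not None else ('none', m))
        num += len(chain) - len(parts)
        den += len(chain) - 1
    return num / den if den else 0.0

def muc_scores(gold_chains, sys_chains):
    r = muc_recall(gold_chains, sys_chains)
    p = muc_recall(sys_chains, gold_chains)  # roles swapped
    f = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f

gold = [{'Israel', 'it'}, {'the Palestinian Authority', 'the Authority'}]
sys_ = [{'Israel', 'it', 'the Authority'}]
print(muc_scores(gold, sys_))  # (0.5, 0.5, 0.5)
```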

9
Task 3: Pronominal resolution on raw texts
  • unannotated texts were given to participants
  • systems had to
  • determine the referential pronouns
  • identify the candidate NPs
  • resolve the pronouns to NPs

Input text: Japan/26 and/27 Peru/28 on/29 Saturday/30
took/31 a/32 tough/33 stand/34 ... their/45 accord/46
was/47 swiftly/48 ...
Output: (45 45, 26 28) = (their, Japan and Peru)
10
Task 4: Coreferential chain resolution on raw
texts
  • unannotated texts were given to participants
  • systems had to
  • determine the coreferential NPs
  • assign them to chains
  • the most popular task (3 runs submitted)

Input text: Japan/26 and/27 Peru/28 on/29 Saturday/30
took/31 a/32 tough/33 stand/34 ... their/45 accord/46
was/47 swiftly/48 ...
Output: (26 28, 45 45, ...) = (Japan and Peru,
their, ...)
11
Overlap measure
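The formula on this slide was not preserved in the transcript. As an assumption for illustration, a common way to score partial markable matches is intersection-over-union of the system and gold token spans:

```python
# Illustrative span-overlap measure (assumed, not the slide's exact
# formula): intersection-over-union of the token ranges of the
# system and gold markables, giving partial credit for near misses.

def span_overlap(sys_span, gold_span):
    """Spans are (start, end) token offsets, inclusive, as in the
    Task 3/4 examples (e.g. (26, 28) for 'Japan and Peru')."""
    s = set(range(sys_span[0], sys_span[1] + 1))
    g = set(range(gold_span[0], gold_span[1] + 1))
    return len(s & g) / len(s | g)

print(span_overlap((26, 28), (26, 28)))  # 1.0 for an exact match
print(span_overlap((27, 28), (26, 28)))  # partial credit for a near miss
```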
12
Evaluation method for Task 3
13
Evaluation of Task 3
14
Evaluation for Task 4
  • MUC scores modified to use the overlap metric

15
Rationale for the tasks
  • tasks 1 and 3: evaluation of pronominal anaphora
  • tasks 2 and 4: evaluation of coreference
    resolution
  • tasks 1 and 2: evaluation of algorithms
  • tasks 3 and 4: evaluation of fully automatic
    systems

16
Corpus used in ARE2007
  • we used the NP4E corpus (Hasler et al., 2006)
  • over 55,000 words
  • newswire texts
  • five clusters of related documents
  • annotation in two steps
  • identification of markables
  • identification of relations between markables
  • annotation done using PALinkA (Orasan 2003)

17
Markables
  • all the NPs at all levels, regardless of whether
    they are coreferential or not
  • include all the modifiers (both pre- and
    post-modifiers)
  • possessive pronouns and possessors
  • no relative pronouns or relative clauses
  • no NPs from fixed expressions (in town, on board,
    etc.)

18
Coreferential links
  • COREF and UCOREF
  • only nominal identity of reference and direct
    anaphoric expressions
  • relations marked:
  • identity
  • synonymy
  • generalisation
  • specialisation

lexical choice rather than concept (e.g. the
house ... the door)
19
Coreferential links
  • definite NPs in copular relation: the blast was
    the worst attack on civilians on U.S. soil
  • definite appositives: Zaire Airlines, the main
    commercial airline in Zaire
  • text in brackets
  • I, you, we in speech coreferential to their
    antecedents

20
AnCora corpora
  • ANnotated CORporA for Catalan and Spanish
  • Newspaper and newswire texts
  • 500,000 words each
  • Annotated with
  • PoS tags and lemmas
  • Constituents and functions
  • Argument structures, thematic roles
  • Named entities
  • Nominal WordNet synsets
  • Coreference relations

21
AnCora corpora
  • XML in-line annotation
  • Markables (syntactic nodes)
  • NPs → <sn> ... </sn>, <sn elliptic="yes"/>
  • Clitics → <v> darles </v>
  • Clauses, sentences → <S> ... </S>
  • Attributes
  • entity="entityN"
  • coreftype="ident" / "pred" / "dx"

22
Example AnCora-Ca
  • "L' aeroport ha d' anar amb compte amb els
    sorolls. Ø Ha de comportar -se com un bon
    veí", va recomanar Morlanes ... Malgrat les
    diferències entre AENA i veïns de Gavà_Mar
  • ("The airport must be careful with the noise.
    [It] must behave like a good neighbour,"
    Morlanes recommended ... Despite the differences
    between AENA and residents of Gavà_Mar)

23
Example AnCora-Ca
  • <sn entity="entity3"> "L' aeroport </sn> ha d'
    anar amb compte amb els sorolls. <sn
    elliptic="yes" entity="entity3"
    coreftype="ident"/> Ø Ha de comportar -se com un
    bon veí", va recomanar Morlanes ... Malgrat les
    diferències entre <sn entity="entity3"
    coreftype="ident"> AENA </sn> i veïns de Gavà_Mar

24
Identity of reference
  • Identity, synonymy
  • l'Ajuntament de Tarragona ... l'Ajuntament
    (the Tarragona City Council ... the Council)
  • los usuarios de la Red en EEUU ... los
    internautas estadounidenses
    (Internet users in the US ... US internauts)
  • Generalisation, specialisation
  • los precios del café ... los precios
    (coffee prices ... the prices)
  • Metonymy
  • los conductores de camiones ... los camiones no
    hacen caso de los agentes
    (truck drivers ... the trucks ignore the officers)

25
Identity of reference
  • Different scope of generics
  • las mujeres de España ... las mujeres
    (the women of Spain ... women)
  • Place boundedness
  • In Garraf, the unemployment rate ... it is
    higher in Lleida
  • Time boundedness
  • el Festival de la Música Viva ... aquesta edició
    (the Festival de la Música Viva ... this edition)
  • Unrealized entities
  • Si hay un fan de Georgie_Fame, o de
    Gary_Brooker o de Albert_Lee, Ø puede
    estar a dos metros de él
    (If there is a fan of Georgie_Fame, or of
    Gary_Brooker or of Albert_Lee, [he] may be
    standing two metres away from him)

26
AnCora vs. NP4E
  • Similarities
  • All NPs (modifiers, embedded NPs, coordinated
    NPs)
  • e.g. passengers on a flight from Moscow to
    Nigeria
  • la existencia de una fuerte división en esta
    institución (the existence of a strong division
    within this institution)
  • McVeigh and Nichols / el PP y el PSOE
  • Barcelona airport vs. l'aeroport de Barcelona
  • No NPs that are part of fixed expressions
  • e.g. came to power, subió al poder
  • Identity relation
  • Predicative relation: copular, apposition
  • No identity of sense: China-org vs. China-loc
  • No pleonastic pronouns

27
AnCora vs. NP4E
  • Differences
  • Zero elements elliptical subjects
  • Clitic pronouns
  • e.g. give them (Spanish) darles / (Catalan)
    donar-les
  • Relative pronouns
  • Discourse deixis
  • No possessive pronouns
  • Split antecedents

28
Preparation of Catalan and Spanish data for
ARE2009
  • there are many similarities between the
    guidelines used for the AnCora and NP4E corpora
  • features that are too specific will be discarded
  • AnCora will be converted to the light XML
    annotation used in ARE2007
  • we hope not to encounter major problems when we
    do the actual conversion
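The conversion itself can be fairly mechanical. A minimal sketch (not the actual conversion script) that walks AnCora's in-line <sn> elements and groups mentions by their entity attribute, following the annotation shown on the earlier slides:

```python
# Sketch of reducing AnCora in-line XML to entity chains: collect
# every <sn> node and group its text by the "entity" attribute.
# Tag and attribute names follow the AnCora-Ca example slide; the
# SAMPLE document is a trimmed version of that example.
import xml.etree.ElementTree as ET
from collections import defaultdict

SAMPLE = ('<doc><sn entity="entity3">L\' aeroport</sn> ha d\' anar '
          'amb compte amb els sorolls. '
          '<sn elliptic="yes" entity="entity3" coreftype="ident"/>'
          '<sn entity="entity3" coreftype="ident">AENA</sn></doc>')

def extract_chains(xml_text):
    chains = defaultdict(list)
    root = ET.fromstring(xml_text)
    for sn in root.iter('sn'):
        # empty self-closing <sn/> nodes are elliptical subjects
        mention = (sn.text or '').strip() or 'Ø'
        chains[sn.get('entity')].append(mention)
    return dict(chains)

print(extract_chains(SAMPLE))
# {'entity3': ["L' aeroport", 'Ø', 'AENA']}
```

A real converter would also have to handle nested NPs, clitics inside <v> elements, and the coreftype values, which is where the guideline differences listed above come in.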

29
Lessons learnt from ARE2007
  • if possible, more evaluation methods and more
    baselines
  • a better overlap metric
  • participants want more time (there was lots of
    interest, but the evaluation clashed with some
    major conferences)
  • participants want to be able to publish

30
Better overlap metric
  • no head/MIN attribute for the markables
  • the same system obtained better results on Task 4
    than on Task 2

Task 2 → score 0; Task 4 → score 0.xxxx
31
Plans for ARE2009
  • include 4 languages: Catalan, Dutch, English, and
    Spanish
  • keep the 4 tasks
  • include a multilingual task for pronominal
    anaphora resolution
  • evaluate some preprocessing stages for anaphora
    resolution
  • have a real-time task

32
Preprocessing tasks
  1. Identification of pleonastic it pronouns in
    English texts
  2. Identification of pleonastic het pronouns in
    Dutch
  3. Identification of elliptical subjects in Spanish
    and Catalan
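For the first of these preprocessing tasks, a simple rule-based detector is a natural baseline; the patterns below are illustrative assumptions, not the organisers' specification:

```python
# Illustrative rule-based detector for pleonastic "it" in English
# (preprocessing task 1 above): flag "it" in extraposition frames
# ("it is ADJ that/to"), raising verbs, and weather/time uses.
# The pattern list is an assumption for the sketch, not exhaustive.
import re

PLEONASTIC_PATTERNS = [
    r'\bit\s+is\s+\w+\s+(that|to)\b',       # it is clear that / hard to
    r'\bit\s+(seems|appears|turns out)\b',   # it seems (that) ...
    r'\bit\s+is\s+(raining|snowing|late|early)\b',
]

def is_pleonastic(sentence, patterns=PLEONASTIC_PATTERNS):
    s = sentence.lower()
    return any(re.search(p, s) for p in patterns)

print(is_pleonastic("It is clear that the plan failed."))          # True
print(is_pleonastic("Israel thinks it can treat the Authority."))  # False
```

Analogous pattern sets would be needed for Dutch pleonastic "het", while elliptical subjects in Spanish and Catalan are better attacked through the parser, since there is no overt pronoun to match.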

33
NP anaphora and NP coreference resolution tasks
  •          Catalan  Dutch  English  Spanish
  • Task 1   Yes      No     Yes      Yes
  • Task 2   Yes      No     Yes      Yes
  • Task 3   Yes      Yes    Yes      Yes
  • Task 4   Yes      Yes    Yes      Yes

34
Multilingual task for pronominal anaphora
resolution
  • Is it possible to have a multilingual system?
  • participants get a set of documents with
    paragraphs in Catalan, Dutch, English, and
    Spanish
  • referential personal pronouns marked in all the
    texts
  • candidate noun phrases not annotated
  • use a modified version of success rate that
    considers how many pronouns were correctly
    resolved and in how many languages
  • 35050 pronouns per language
  • but more thinking is necessary

35
Real-time tasks
  • Invite DAARC participants to take part in a
    real-time exercise
  • The same tasks as in the main ARE2009 exercise,
    but
  • participants will need to bring their programs
  • they will have one hour to submit the results
  • the tasks may include some surprise texts
  • subject to interest from participants and the
    presence of the necessary infrastructure

36
Tentative timescale
  • 14 Nov 2008: Preliminary call for participation
  • 15 Jan 2009: Training data released
  • 4 - 23 May 2009: Test data released (48 hours
    to submit the results after the test data is
    downloaded)
  • 30 May 2009: Results communicated back to
    participants
  • 6 June 2009: 4-page technical reports due from
    participants
  • 20 June 2009: Reviews back to participants
  • 1 July 2009: Final version of technical reports
  • 5 - 6 Nov 2009: DAARC2009, Goa, India

37
  • Webpage
  • http://www.anaphora-and-coreference.info/ARE2009
  • Mailing list
  • ARE2009-list@anaphora-and-coreference.info
  • Email address
  • ARE2009@anaphora-and-coreference.info
  • Thank you!