Indirect Anaphora Resolution as Semantic Path Search

1
Indirect Anaphora Resolution as Semantic Path
Search
  • James Fan, Ken Barker and Bruce Porter
  • University of Texas at Austin

2
Indirect Anaphora
  • Indirect anaphora is a type of anaphora in which
    the referring expression and the object being
    referred to are related by unstated background
    knowledge.
  • May account for 15% of noun phrase anaphora
    [Poesio and Vieira 98].

3
Indirect Anaphora and Knowledge Capturing
  • To automatically capture knowledge from text,
    indirect anaphora must be resolved.
  • For example:
  • When the detective got back to the garage, the
    door was unlocked.
  • The referring expression, the door, relates to
    the antecedent, the garage, through a part-whole
    (meronymy) link.

4
Challenges in Indirect Anaphora Resolution
  • Requires semantic knowledge of the relationship
    between the referring expression and the
    antecedent.
  • Problematic for shallow processing systems.

5
Our Approach
  • Resolve indirect anaphora by using a
    general-purpose search program that finds short
    semantic paths in a knowledge base.
  • The search program has been used for a variety
    of tasks, including:
  • noun compound interpretation [Fan, et al. 2003]
  • query interpretation [Fan and Porter 2004].

6
Previous Work: Theoretical
  • Theoretical work has identified a variety of
    types of indirect anaphora [Clark 1975; Gardent,
    et al. 2003].

7
Some Frequent Types of Indirect Anaphora
8
Previous Work: WordNet-Based
  • Use WordNet as the knowledge base [Vieira and
    Poesio 2000].
  • Choose one noun as the most likely antecedent
    from a list of nouns that appear earlier in the
    text.
  • The antecedent must relate to the referring
    expression as a synonym, hypernym/hyponym,
    coordinate sibling or meronym/holonym.
  • If multiple antecedents are found, they are
    ranked based on their contextual distances from
    the referring expression.
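The selection procedure on this slide can be sketched as follows. The tiny relation tables below are illustrative stand-ins for WordNet lookups, and the noun lists are invented; this is a sketch of the general idea, not the published system:

```python
# Toy stand-ins for WordNet relations (illustrative, not real WordNet data).
HYPERNYM = {"door": "barrier", "gate": "barrier"}   # hyponym -> hypernym
HOLONYM = {"door": "garage", "wheel": "car"}        # part -> whole
SYNONYMS = {"car": {"automobile"}, "automobile": {"car"}}

def related(referring, candidate):
    """Return the allowed WordNet-style link between the pair, if any."""
    if candidate in SYNONYMS.get(referring, set()):
        return "synonym"
    if HYPERNYM.get(referring) == candidate or HYPERNYM.get(candidate) == referring:
        return "hypernym/hyponym"
    if HYPERNYM.get(referring) and HYPERNYM.get(referring) == HYPERNYM.get(candidate):
        return "coordinate sibling"
    if HOLONYM.get(referring) == candidate or HOLONYM.get(candidate) == referring:
        return "meronym/holonym"
    return None

def choose_antecedent(referring, prior_nouns):
    """prior_nouns are ordered by contextual distance, nearest first,
    so the first related candidate is the ranked winner."""
    for cand in prior_nouns:
        link = related(referring, cand)
        if link:
            return cand, link
    return None

# e.g. choose_antecedent("door", ["detective", "garage"])
#      -> ("garage", "meronym/holonym")
```

Ranking by contextual distance falls out of the iteration order here; a fuller implementation would score all candidates before committing.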

9
Previous Work: WordNet-Based
  • Strength
  • Reveals the type of association between each
    referring expression and its antecedent.
  • Weakness
  • Low recall, commonly attributed to the fact that
    many frequently used types of links, such as
    event/role or cause/consequence, are not
    available in WordNet.

10
Previous Work: Machine-Learning Systems
  • Use the web as the corpus [Markert, et al. 2003;
    Bunescu 2003].
  • Issue a series of web search queries made of the
    referring expression and each candidate
    antecedent.
  • Use the number of web pages returned as a measure
    of the strength of association.
  • If the strength exceeds a threshold, then
    consider the candidate the true antecedent.
  • Machine learning techniques are used to determine
    the best threshold.
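A minimal sketch of this counting scheme, with a stub dictionary standing in for web search hit counts; all names, counts, and the threshold value are invented for illustration:

```python
# Stub for web hit counts of joint queries such as "the door of the garage".
# A real system would issue web search queries; these numbers are made up.
HIT_COUNTS = {
    ("door", "garage"): 95_000,
    ("door", "detective"): 1_200,
}
SOLO_COUNTS = {"garage": 2_000_000, "detective": 3_500_000}

def association(referring, candidate):
    # Normalize joint hits by the candidate's own frequency so common
    # nouns do not win on raw counts alone.
    return HIT_COUNTS.get((referring, candidate), 0) / SOLO_COUNTS[candidate]

def resolve(referring, candidates, threshold=1e-3):
    """Pick the strongest candidate; accept it only above the threshold,
    which a real system would learn from training data."""
    best = max(candidates, key=lambda c: association(referring, c))
    return best if association(referring, best) >= threshold else None
```

Note what this sketch shares with the slide's weakness: even when it returns an antecedent, it says nothing about the semantic nature of the link.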

11
Previous Work: Machine-Learning Systems
  • Strengths
  • Broad coverage of all types of links.
  • Achieved results comparable with WordNet-based
    approaches.
  • Weakness
  • Do not determine the semantic nature of the
    relationship between the referring expression and
    the antecedent.

12
Our Interpreter
  • Task
  • Given: a knowledge base encoded as a semantic
    network.
  • Input: a pair of nouns corresponding to two
    nodes in the network.
  • Output: a path of semantic relations between the
    two nodes.
  • Stopping: stop when any subclass or superclass
    of the goal node is found.
  • Sorting: prefer paths of short length.
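One way to sketch this search is a breadth-first traversal of the semantic network, so the first match found lies on a shortest path, with the relaxed stopping test applied at each node. The toy network and taxonomy below are illustrative, not the knowledge base used in the paper:

```python
from collections import deque

# Toy semantic network: node -> [(relation, neighbor), ...] (illustrative).
NETWORK = {
    "garage": [("has-part", "door"), ("instance-of", "building")],
    "door": [("part-of", "garage")],
    "building": [("superclass-of", "garage")],
}
# Toy taxonomy: node -> set of its superclasses.
TAXONOMY = {"door": {"barrier"}, "garage": {"building"}}

def matches_goal(node, goal):
    # Relaxed stopping criterion: the goal itself, or any
    # subclass or superclass of it.
    return (node == goal
            or goal in TAXONOMY.get(node, set())
            or node in TAXONOMY.get(goal, set()))

def find_path(start, goal):
    """Return the shortest list of (relation, node) steps from start
    to a node matching the goal, or None if no path exists."""
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if matches_goal(node, goal):
            return path
        for relation, nbr in NETWORK.get(node, []):
            if nbr not in seen:
                seen.add(nbr)
                queue.append((nbr, path + [(relation, nbr)]))
    return None

# find_path("garage", "door") -> [("has-part", "door")]
```

Because BFS explores paths in order of length, "prefer paths of short length" needs no separate sorting pass in this sketch; a system keeping multiple candidate paths would sort them explicitly.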

13
Our Interpreter
(Diagram: example semantic path found by the interpreter, ending at the node Door.)
14
Comparison With Previous Approaches
  • Similar to WordNet-based systems.
  • Differences
  • More relaxed stopping criterion.
  • Sorting based on lexical distance (path length)
    rather than contextual distance.
  • Search inherited properties (not just local
    ones).
  • Deeper search.

15
Applying Our Interpreter to Indirect Anaphora
Resolution
  • Word sense: form the cross product of all
    possible word senses of each referring expression
    and each candidate antecedent. This forms
    candidate pairs:
  • <referring expression, antecedent>
  • Search: find semantic paths for each candidate
    pair.
  • Select: rank the semantic paths to choose the
    best candidate path.
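The three steps on this slide might be wired together as below. Here `path_search` is a hypothetical stub for the interpreter, and the sense labels (`door#portal`, `garage#building`) are invented notation:

```python
from itertools import product

def path_search(ref_sense, ant_sense):
    """Stub for the interpreter: return a semantic path between two
    word senses, or None. The table below is invented for illustration."""
    toy_paths = {("door#portal", "garage#building"): ["part-of"]}
    return toy_paths.get((ref_sense, ant_sense))

def resolve_indirect(ref_senses, antecedent_senses):
    candidates = []
    # Step 1: cross product of all word senses of both expressions.
    for ref, ant in product(ref_senses, antecedent_senses):
        # Step 2: search for a semantic path for each candidate pair.
        path = path_search(ref, ant)
        if path is not None:
            candidates.append((len(path), ref, ant, path))
    # Step 3: rank the paths; prefer the shortest.
    return min(candidates, default=None)

# resolve_indirect(["door#portal"], ["garage#building", "garage#verb"])
# -> (1, "door#portal", "garage#building", ["part-of"])
```

A side effect of ranking over the sense cross product is that resolution also disambiguates both nouns, since the winning pair fixes one sense for each.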

16
Experiment 1: Evaluate the Interpreter's
Performance
  • Two data sets
  • 32 articles from Brown corpus.
  • 32 articles from Wall Street Journal.
  • Compared with an implementation of a
    state-of-the-art WordNet-based system [Vieira and
    Poesio 2000].
  • WordNet 2.0 as knowledge base.

17
Experiment 1 (Results)
18
Experiment 1 Analysis
  • Precision remains the same.
  • Recall increases significantly.

19
Ablation Study
  • Why is the recall significantly better?
  • The systems differ in only four ways
  • More relaxed stopping criterion.
  • Sorting based on lexical distance (path length)
    rather than contextual distance.
  • Search inherited properties (not just local
    ones).
  • Deeper search.
  • We measured the contribution of each difference
    through a series of ablations.

20
Ablation Study Results
21
Ablation Study Analysis
  • Little impact
  • Sorting.
  • Search depth.
  • Inherited properties.
  • Big impact
  • Stopping criterion.

22
Experiment 2
  • Is the effect of stopping criterion restricted to
    these data sets and this task?
  • Evaluated impact of four different stopping
    criteria on semantic path search: equality,
    superclass, subclass, super_or_subclass.
  • Task is noun compound interpretation.
  • Four sets of data (total of 742 pairs of nouns)
  • Biology text.
  • Small engine repair manual.
  • Sparcstation owners manual.
  • Online airplane descriptions.
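The four stopping criteria named above can be sketched as predicates over a toy taxonomy; the class table is illustrative, not data from the experiment:

```python
# Toy taxonomy: node -> set of all its (transitive) superclasses.
SUPERCLASSES = {"garage": {"building", "structure"}, "building": {"structure"}}

def equality(node, goal):
    # Most restricted: stop only at the goal node itself.
    return node == goal

def superclass(node, goal):
    # Also stop at any superclass of the goal.
    return equality(node, goal) or node in SUPERCLASSES.get(goal, set())

def subclass(node, goal):
    # Also stop at any subclass of the goal.
    return equality(node, goal) or goal in SUPERCLASSES.get(node, set())

def super_or_subclass(node, goal):
    # Least restricted: stop at the goal or any sub/superclass of it.
    return superclass(node, goal) or subclass(node, goal)
```

Each predicate accepts strictly more nodes than the one before it, which is why relaxing the criterion can only raise recall, at a theoretical (but, per the next slides, rarely realized) risk to precision.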

23
Experiment 2 Results
24
Experiment 2: Results Analysis
  • The interpreter that used the most restricted
    stopping criterion had the worst recall.
  • The interpreter that used the least restricted
    stopping criterion had the best recall.
  • The more relaxed stopping criterion could, in
    principle, induce many false positives, but this
    was rarely the case in practice.

25
Conclusion and Discussion
  • We applied a general tool for finding semantic
    paths between concepts to indirect anaphora
    resolution.
  • Our system achieved much higher recall with no
    drop in precision.
  • A relaxed stopping criterion, not search depth,
    is responsible for the increase in recall. This
    suggests that the interpreter can be used on very
    large knowledge bases.
  • In the future, we plan to:
  • Assess the interpreter's effectiveness on
    additional natural language processing tasks.
  • Evaluate the impact of taxonomy design on
    performance.