Discourse Analysis - PowerPoint PPT Presentation


1
Discourse Analysis
  • David M. Cassel
  • Natural Language Processing
  • Villanova University
  • April 21st, 2005

2
Discourse Analysis
  • Discourse -- collocated, related groups of
    sentences (from book)

3
Discourse Analysis
  • Discourse Model -- a model to represent the
    entities mentioned in the discourse
  • Coreference or Anaphora Resolution -- determining
    which entity a referring expression refers to
  • Coherence -- modeling the logical flow of the
    discourse
  • The book also discusses Psycholinguistic Studies
    of Reference and Coherence

4
Anaphora Resolution
  • Before the game, manager Charlie Manuel said
    Gavin Floyd's performance would not affect
    whether he remains with the team when Vicente
    Padilla comes off the disabled list Tuesday.
  • Then Floyd went out and had a nightmarish first
    inning: four walks, one wild pitch, one hit, four
    runs.
  • After the game, Manuel said Floyd's disastrous
    outing had not changed his mind. The righthander
    will remain with the club and be used in relief.
  • "The pitcher we saw in St. Louis is a pitcher who
    has the ability to be a very good major-league
    pitcher," he said. "He didn't have command of his
    fastball and couldn't get his breaking ball over
    tonight... . Maybe the cold was affecting his
    breaking ball, because he was bouncing a lot of
    them."
  • -- Sam Carchidi, Philadelphia Inquirer, 4/16/05

5
Discourse Model
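[Figure: the first mention evokes (introduces) the entity Gavin Floyd. The
expressions "he", "Floyd", "The righthander", "The pitcher we saw in
St. Louis", and "his" all refer to that entity, so they corefer.]
Adapted from Figure 18.1, Speech and Language Processing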
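
The evoke / refer / corefer relationships above can be sketched as a small data structure. This is only an illustration: the class and method names (`Entity`, `DiscourseModel`, `evoke`, `refer`, `corefer`) are hypothetical and do not come from the book or the slides.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """One entity tracked by the discourse model."""
    name: str
    mentions: list[str] = field(default_factory=list)


class DiscourseModel:
    """Hypothetical container mapping entity names to their mentions."""

    def __init__(self) -> None:
        self.entities: dict[str, Entity] = {}

    def evoke(self, name: str, mention: str) -> Entity:
        """First mention: introduce a new entity into the model."""
        entity = Entity(name, [mention])
        self.entities[name] = entity
        return entity

    def refer(self, name: str, mention: str) -> Entity:
        """Later mention: attach a referring expression to a known entity."""
        entity = self.entities[name]
        entity.mentions.append(mention)
        return entity

    def corefer(self, a: str, b: str) -> bool:
        """Two expressions corefer if one entity has both as mentions."""
        return any(a in e.mentions and b in e.mentions
                   for e in self.entities.values())


model = DiscourseModel()
model.evoke("Gavin Floyd", "Gavin Floyd")
for expression in ["he", "Floyd", "The righthander", "his"]:
    model.refer("Gavin Floyd", expression)
model.evoke("Charlie Manuel", "manager Charlie Manuel")

print(model.corefer("he", "The righthander"))
print(model.corefer("manager Charlie Manuel", "The righthander"))
```

Keying mentions by surface string is a simplification. A real model would key them by position in the text, since the same word ("he") can refer to different entities at different points.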
6
Types of Anaphoric References
  • Indefinite noun phrases
      • A baseball player like that should do well.
  • Definite noun phrases
      • The righthander will remain with the club.
  • Pronouns
      • He had a bad game.
  • Demonstratives
      • This player has a bright future.
  • One-anaphora
      • I saw no less than 6 Acura Integras today. Now I
        want one. (from book)

7
Reference Constraints
  • Number Agreement
      • Floyd pitched 6 innings. They went well.
  • Person and Case
      • He didn't have command of his fastball.
  • Gender Agreement
      • Floyd took his glove with him. It fit well.
  • Syntactic Constraints
      • Floyd threw him the ball.
  • Selectional Restrictions
      • Floyd stepped onto the mound with the ball. He
        threw it really fast.
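
As a minimal sketch, the number and gender constraints above can be applied as a filter over candidate antecedents. The `Mention` type and the feature labels ("sg", "pl", "m", "n") are assumptions made for this illustration. Syntactic constraints and selectional restrictions need a parse and lexical knowledge, so they are left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mention:
    text: str
    number: str             # "sg" or "pl"
    gender: Optional[str]   # "m", "f", "n", or None if unspecified


def agrees(pronoun: Mention, candidate: Mention) -> bool:
    """Check number and gender agreement only."""
    if pronoun.number != candidate.number:
        return False
    if pronoun.gender is None or candidate.gender is None:
        return True  # an unspecified gender is compatible with anything
    return pronoun.gender == candidate.gender


def filter_candidates(pronoun, candidates):
    """Keep only the candidates that agree with the pronoun."""
    return [c for c in candidates if agrees(pronoun, c)]


candidates = [
    Mention("Floyd", "sg", "m"),
    Mention("his glove", "sg", "n"),
    Mention("6 innings", "pl", "n"),
]
it = Mention("It", "sg", "n")
they = Mention("They", "pl", None)

print([c.text for c in filter_candidates(it, candidates)])
print([c.text for c in filter_candidates(they, candidates)])
```

With these hand-assigned features, "It" keeps only "his glove" and "They" keeps only "6 innings", matching the slide's examples.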

8
Preferences
  • Recency
      • Floyd threw the ball. Lieberthal picked it up. He
        put the ball in his pocket.
  • Grammatical Role
      • Floyd threw the ball to Lieberthal. His arm was
        getting tired.
  • Repeated Mention
      • (See article)
  • Parallelism
      • Floyd threw a ball to Lieberthal. Wagner threw a
        ball to him, too.
  • Verb Semantics
      • John telephoned Bill. He lost the pamphlet on
        Acuras.
      • John criticized Bill. He lost the pamphlet on
        Acuras.

9
Pronoun Resolution Algorithms
  • Traditional
      • Carter: shallow parsing
      • Rich, LuperFoy: distributed architecture
      • Carbonell, Brown: multi-strategy
      • Rico Pérez: scalar product
      • Mitkov: combination of linguistic, statistical
        (high 80s)
      • Lappin, Leass: syntax-based (86%)
      • Hobbs: Tree Search Algorithm (91.7%)
      • Grosz, Joshi, Weinstein: Centering Algorithm
        (77.6%)
      • Hobbs: Coherence
  • Alternative
      • Nasukawa: knowledge-independent (93.8%)
      • Dagan, Itai: statistical, corpus processing (87%
        for genuine "it")
      • Connolly, Burger, Day: machine learning
      • Aone, Bennett: machine learning (close to 90%)
      • Mitkov: uncertainty reasoning
      • Mitkov: 2-engine (90%)
      • Tin, Akman: situational semantics
      • Say, Vakman

10
Lappin & Leass
  • Book presents a slightly modified algorithm for
    nonreflexive, 3rd person pronouns. Two parts:
      • Update discourse model with salience value
      • Resolve pronouns
  • Let's apply this to some text:
      • In the afternoon, Gavin Floyd played baseball at
        the park. Then he went to a bar with Mike
        Lieberthal. He enjoyed a beer.
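
The first part, updating the discourse model, can be sketched as follows. The weights are the ones the book gives for its version of Lappin and Leass, where salience is also halved each time a new sentence is processed. The function names and the factor assignments for each mention are assumptions made for this illustration, so the totals need not match the deck's worked example.

```python
# Salience factor weights from the book's version of Lappin and Leass.
SALIENCE_WEIGHTS = {
    "sentence_recency": 100,
    "subject": 80,
    "existential": 70,
    "direct_object": 50,
    "indirect_object_or_oblique": 40,
    "non_adverbial": 50,
    "head_noun": 80,
}


def start_new_sentence(salience: dict[str, float]) -> None:
    """Halve every referent's salience when a new sentence begins."""
    for referent in salience:
        salience[referent] /= 2


def add_mention(salience: dict[str, float], referent: str,
                factors: list[str]) -> None:
    """Add the weights of all factors that apply to this mention."""
    score = sum(SALIENCE_WEIGHTS[f] for f in factors)
    salience[referent] = salience.get(referent, 0) + score


salience: dict[str, float] = {}

# Sentence 1: "In the afternoon, Gavin Floyd played baseball at the park."
# Factor assignments are illustrative assumptions.
add_mention(salience, "Gavin Floyd",
            ["sentence_recency", "subject", "non_adverbial", "head_noun"])
add_mention(salience, "baseball",
            ["sentence_recency", "direct_object", "non_adverbial", "head_noun"])

# Moving on to sentence 2 degrades everything seen so far.
start_new_sentence(salience)
print(salience)
```

Halving at each sentence boundary is what makes recent mentions win over older ones unless the older referent keeps being mentioned.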

11
Salience Factors
12
Pronoun Salience
13
LL Algorithm
  • Collect the potential referents (up to four
    sentences back).
  • Remove potential referents that do not agree in
    number or gender with the pronoun.
  • Remove potential referents that do not pass
    intrasentential syntactic coreference
    constraints.
  • Compute the total salience value of the referent
    by adding any applicable values to existing
    salience value.
  • Select the referent with the highest salience
    value. In case of ties, select closest referent
    in terms of string position.
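
The five steps above can be sketched as one function. This is a simplified illustration, not the published algorithm: the `Mention` type, the `blocked` argument (standing in for the syntactic constraints, which would need a parser), and the token positions are assumptions. The per-pronoun values of 35 for role parallelism and -175 for cataphora are the book's. The stored salience values of 230 and 75 are taken from the deck's worked example, assuming its 265 total for Floyd includes the 35-point parallelism bonus. The value for "a bar" is made up.

```python
from dataclasses import dataclass

ROLE_PARALLELISM = 35   # referent and pronoun share a grammatical role
CATAPHORA = -175        # referent appears after the pronoun


@dataclass
class Mention:
    name: str
    number: str      # "sg" or "pl"
    gender: str      # "m", "f", or "n"
    role: str        # grammatical role, e.g. "subject"
    sentence: int    # sentence index of the latest mention
    position: int    # token offset in the whole text
    salience: float = 0.0   # value already stored in the discourse model


def resolve(pronoun, referents, blocked=frozenset(), window=4):
    """Return (referent, total salience) for the best candidate, or None."""
    # Step 1: collect referents up to `window` sentences back.
    pool = [r for r in referents
            if 0 <= pronoun.sentence - r.sentence <= window]
    # Step 2: drop referents that disagree in number or gender.
    pool = [r for r in pool
            if (r.number, r.gender) == (pronoun.number, pronoun.gender)]
    # Step 3: drop referents ruled out by syntactic constraints,
    # supplied by the caller as a set of names.
    pool = [r for r in pool if r.name not in blocked]
    if not pool:
        return None

    # Step 4: add the per-pronoun values to the stored salience.
    def total(r):
        score = r.salience
        if r.role == pronoun.role:
            score += ROLE_PARALLELISM
        if r.position > pronoun.position:
            score += CATAPHORA
        return score

    # Step 5: highest total wins; ties go to the closest referent.
    best = max(pool,
               key=lambda r: (total(r), -abs(pronoun.position - r.position)))
    return best, total(best)


he = Mention("He", "sg", "m", "subject", sentence=2, position=19)
referents = [
    Mention("Gavin Floyd", "sg", "m", "subject", 1, 11, salience=230),
    Mention("a bar", "sg", "n", "oblique", 1, 15, salience=115),
    Mention("Mike Lieberthal", "sg", "m", "oblique", 1, 17, salience=75),
]
best, score = resolve(he, referents)
print(best.name, score)  # Gavin Floyd 265
```

"a bar" is removed in step 2 because it is neuter, and Floyd beats Lieberthal in step 5 with 265 against 75.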

14
Example
In the afternoon, Gavin Floyd played baseball at
the park. Then he went to a bar with Mike
Lieberthal. He enjoyed a beer.
17
Example
In the afternoon, Gavin Floyd played baseball at
the park. Then he went to a bar with Mike
Lieberthal. He enjoyed a beer.
Gavin Floyd gets 35 points for Role Parallelism.
Mike Lieberthal does not.
Floyd: 265 points
Lieberthal: 75 points
We pick Floyd as the antecedent of "He".
18
Summary
  • Discourse Analysis requires processing more text
    than POS tagging or finding entities.
  • Part of tracing the flow of discourse is
    resolving anaphora.
  • That resolution lets us capture more
    relationships and other information than we could
    otherwise.