1
A Probabilistic model of Lexical and Syntactic
Access and Disambiguation
Daniel Jurafsky, 1995
Presented by Connor Stroomberg, 13-12-2006
2
Introduction
  • Access: retrieving linguistic structure from
    some mental grammar.
  • Disambiguation: choosing among combinations of
    structures to correctly parse ambiguous
    linguistic input.

3
Previous models
  • Try to solve the problem with a
    divide-and-conquer method.
  • Model either access or disambiguation.
  • Model either the lexical or the syntactic level.

4
The proposed parser
  • Parallel.
  • Built on a standard dynamic-programming (chart)
    parsing architecture.
  • Access and disambiguation implemented as a set of
    pruning heuristics.

5
Why pruning ?
  • Dynamic programming can make syntactic parsing
    efficient.
  • Interpretation may not be as efficient.
  • Both access and disambiguation prune based on
    probability ranking.

6
Grammar
  • Each grammatical construction is a sign.
  • A sign is a form-meaning pair, represented by
    typed unification-based context-free rules.
  • Four assumptions are made:
  • The representation of constituent-structure rules
    as mental objects.
  • A uniform context-free model of lexical,
    morphological, and syntactic rules.
  • Valence expectations on lexical heads.
  • A lack of empty categories.

7
Representational uniformity
  • No distinction between lexicon, morphology, and
    syntax.
  • Each is represented by an augmented context-free
    rule.

8
Augmentations
  • Each construction is augmented with two types of
    probabilities:
  • 1. A prior probability (resting activation).
  • 2. A probability in each case where a construction
    expresses some linguistic expectation.
  • Valence:
  • Obligatory arguments have a probability of 1.
  • Optional arguments have a probability between 0
    and 1.
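As a rough sketch, such valence augmentations could be stored as a table mapping each lexical head to the probability of each subcategorization frame. The frame names and numbers below are hypothetical illustrations, not Jurafsky's corpus figures:

```python
# Hypothetical valence table. Obligatory arguments get probability 1.0;
# optional arguments get a value strictly between 0 and 1.
VALENCE = {
    "put":     {"<NP PP>": 1.0},               # 'put' obligatorily takes NP + PP
    "discuss": {"<NP>": 0.8, "<NP PP>": 0.2},  # the PP argument is optional
}

def frame_probability(verb, frame):
    """Probability that `verb` expects the given valence frame."""
    return VALENCE.get(verb, {}).get(frame, 0.0)
```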

9
Access
  • Traditional psycholinguistic models were serial.
  • Motivation: garden-path sentences.
  • Traditional computational models were parallel.
  • Due to dynamic programming.
  • The proposed model is a parallel model based on
    dynamic programming, but is still able to model
    the garden-path effect.

10
Examples of previous models
11
Problems with Previous models
  • Bottom-up and top-down models have been discussed.
  • Timing and frequency effects.
  • Solution: a key or clue.
  • Problem 1: each item needs to be annotated with a
    key or clue.
  • Problem 2: context effects.

12
The probabilistic model
  • For each construction, compute the conditional
    probability given the evidence.
  • Evidence may be syntactic, semantic, or lexical,
    and bottom-up or top-down.
  • Constructions are accessed according to a beam
    search.
  • The beam width is a universal constant of the
    grammar.

13
Evidence
  • Top-down evidence (e): P(c | e), the probability
    that the evidence construction e left-expands to
    construction c.
  • Bottom-up evidence (e-).

14
Evidence
  • Combining top-down and bottom-up evidence is
    complex.
  • A simplifying assumption can be made: e can only
    affect e- through c.
15
Simplifying evidence
  • Since ratios are compared, the denominator may be
    dropped.
  • However, psycholinguistic studies suggest this is
    too simplistic.

16
Choosing constructions
  • Constructions have probabilities; how do we choose
    between them?
  • Pruning: a relative beam search.
  • Prune any construction more than a constant factor
    worse than the best construction.
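A minimal sketch of such relative beam pruning (the candidate names and probabilities are invented for illustration):

```python
def relative_beam_prune(candidates, beam_factor):
    """Prune every construction whose probability is more than
    `beam_factor` times worse than the best candidate.
    `candidates` maps construction names to probabilities."""
    best = max(candidates.values())
    return {c: p for c, p in candidates.items() if p * beam_factor >= best}

# With beam_factor = 5, anything more than 5x worse than the best is cut:
kept = relative_beam_prune({"main-verb": 0.9, "reduced-relative": 0.05}, 5)
```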

17
Advantages of the model
  • The model is able to account for a number of
    psycholinguistic results.
  • Lexical items with a higher frequency will have a
    higher probability.

This implies:
  • Access of a construction will be inversely
    proportional to the probability of the evidence.

18
Disambiguation
  • Natural language is ambiguous.
  • Examples:
  • Preference:
  • "The woman discussed the dogs on the beach."
  • The woman discussed the dogs which were on the
    beach. (90%)
  • The woman discussed them (the dogs) while on the
    beach. (10%)
  • "The woman kept the dogs on the beach."
  • The woman kept the dogs which were on the beach.
    (5%)
  • The woman kept them (the dogs) while on the beach.
    (95%)
  • Garden-path: "The horse raced past the barn
    fell."
  • Gap-filling / valence ambiguities.
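The preference percentages above can be read as attachment probabilities. A toy lookup using the figures from the slide (the table layout and function name are my own):

```python
# Probability that "on the beach" attaches to the noun ("the dogs")
# versus the verb, for each verb (percentages from the slide).
PREFERENCES = {
    "discussed": {"noun-attach": 0.90, "verb-attach": 0.10},
    "kept":      {"noun-attach": 0.05, "verb-attach": 0.95},
}

def preferred_attachment(verb):
    """Return the higher-probability attachment for the verb."""
    return max(PREFERENCES[verb], key=PREFERENCES[verb].get)
```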

19
Serial or parallel disambiguation
  • Garden-path effect:
  • A serial explanation needs an additional
    heuristic, e.g. a word-based window or a
    3-constituent window.
  • A parallel parser uses pruning.
  • Using probability as the pruning metric, the most
    coherent interpretation is chosen.

20
Modeling preference effects
21
Modeling preference effects
22
Modeling the garden-path effect
  • Requires showing that the theory predicts pruning
    in a case of ambiguity whenever the garden-path
    effect can be shown to occur.
  • This requires setting the appropriate beam width.
  • Set the beam too wide and the garden-path sentence
    will be mislabeled as a less preferred
    interpretation.
  • Set the beam too narrow and parsable sentences
    will be mislabeled as garden-path sentences.
  • Jurafsky suggests a beam width of 1/5.
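The trade-off can be illustrated with a toy check. Only the 1/5 beam width comes from the slide; the reading probabilities and function name are invented for illustration:

```python
def is_pruned(p_reading, p_best, beam_width=1/5):
    """A reading is pruned when it falls outside the relative beam,
    i.e. when it is more than 1/beam_width times worse than the best
    reading. If the pruned reading later turns out to be the correct
    one, the model predicts a garden-path effect."""
    return p_reading < beam_width * p_best
```

With a beam of 1/5, a reading 9x less likely than the best is pruned; with a much wider beam (say 1/100) it survives, and the model would no longer predict a garden path for that sentence.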

23
Modeling the garden-path effect
  • "The complex houses married and single students
    and their families."

24
Semantics in disambiguation
  • We need semantics to explain pairs like:
  • "The teachers taught by the Berlitz method
    passed the test."
  • ?"The children taught by the Berlitz method
    passed the test."
  • Possible solution: add semantics to the valence
    probabilities.
  • This may also be used to model real-world
    knowledge:
  • "The view from the window would be improved by the
    addition of a plant out there."
  • "The view from the window would be destroyed by
    the addition of a plant out there."

25
Problems and future work
  • The simplifying assumption.
  • No discussion of morphology.
  • No discussion of overload effects
    (center-embedding).
  • Embedding in a connectionist framework.

26
Final thought
  • The author sees probabilities not as a
    replacement for structure, but as an enrichment
    of structure.

27
Questions / Comments / Discussion