A Dynamic Learning Model For Categorizing Words Using Frames

Slides: 30
Provided by: haow6
1
A Dynamic Learning Model For Categorizing Words
Using Frames
  • Hao Wang, Toben Mintz
  • Department of Psychology
  • University of Southern California

2
The Problem of Learning Syntactical Categories
  • Grammar includes manipulations of lexical items
    based on their syntactical categories.
  • Learning syntactical categories is fundamental
    to the acquisition of language.

3
The Problem of Learning Syntactical Categories
  • Nativist approach
  • Children are innately endowed with the possible
    syntactical categories.
  • How to map a lexical item to its syntactical
    category or categories?
  • Empirical approach
  • Children have to figure out the syntactical
    categories in their target language, and assign
    categories to lexical items.
  • There is little or no help from syntactical
    constraints.

4
Approaches Based on Semantic Categories
  • Grammatical categories correspond to
    semantic/conceptual categories (Macnamara, 1972;
    Bowerman, 1973; Bates & MacWhinney, 1979; Pinker,
    1984): object → noun, action → verb
  • But what about
  • action, noise, love
  • to think, to know (Maratsos & Chalkley, 1980)

5
Grammatical Categories from Distributional
Analyses
  • Structural linguistics: grammatical categories
    defined by similarities of word patterning
    (Bloomfield, 1933; Harris, 1951)
  • Maratsos & Chalkley (1980): distributional
    learning theory
  • lexical co-occurrence patterns
  • (and morphology and semantics)
  • the cat is on the mat
  • cat, mat

6
Grammatical Categories from Distributional
Analyses
  • Patterns across whole utterances (Cartwright &
    Brent, 1997)
  • My cat meowed.
  • Your dog slept.
  • Det N X/Y.
  • Bigram co-occurrence patterns (Mintz, Newport, &
    Bever, 1995, 2002; Redington, Chater, & Finch,
    1998)
  • the cat is on the mat

7
Probabilistic Bigram Co-Occurrence Patterns

8
Frequent Frames (Mintz, 2003)
  • Frames are defined as two jointly occurring
    words with one word intervening.
  • would you put the cans back ?
  • you get the nuts .
  • you take the chair back .
  • you read the story to Mommy .
  • Frame you_X_the
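
As a minimal sketch of the frame definition above, the snippet below collects every word that appears between "you" and "the" in the four example utterances. The function name `words_in_frame` and the whitespace tokenization are illustrative assumptions, not part of the original presentation.

```python
from typing import List


def words_in_frame(utterances: List[str], left: str, right: str) -> List[str]:
    """Collect the words that occur between `left` and `right` (hypothetical helper)."""
    found = []
    for utterance in utterances:
        tokens = utterance.split()  # assumption: tokens are separated by spaces
        for i in range(len(tokens) - 2):
            if tokens[i] == left and tokens[i + 2] == right:
                found.append(tokens[i + 1])
    return found


utterances = [
    "would you put the cans back ?",
    "you get the nuts .",
    "you take the chair back .",
    "you read the story to Mommy .",
]
print(words_in_frame(utterances, "you", "the"))
# → ['put', 'get', 'take', 'read']
```

All four words picked out by the frame you_X_the are verbs, which is the intuition behind using frames as a categorization cue.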

9
Sensitivity to Frame-like Units
  • Frames lead to categorization in adults (Mintz,
    2002)
  • Fifteen-month-olds are sensitive to frame-like
    sequences (Gómez & Maye, 2005)

10
Other Motivation for Frames
  • Verb learning in children can be facilitated by
    frequent frames (Childers & Tomasello, 2001)
  • Aspects of verb meaning are carried by the verb
    frame, linguistically defined (Gleitman, 1991;
    Gillette, Gleitman, Gleitman, & Lederer, 1999; etc.)

11
Distributional Analyses Using Frequent Frames
(Mintz, 2003)
  • Six corpora from CHILDES (MacWhinney, 2000).
  • Analyzed utterances to children under 2;6
    (2 years, 6 months).
  • Accuracy results averaged over all corpora.

12
Limitation of the Frequent Frame Analyses
  • Requires two passes through the corpus
  • Step 1: identify the frequent frames by tallying
    frame frequencies.
  • Step 2: categorize words using those frames.
  • Tracks the frequency of all frames
  • E.g., approximately 15,000 frame types in one of
    the corpora in Mintz (2003).
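
The two-pass procedure can be sketched roughly as follows. This is an illustration under stated assumptions, not the original analysis code: the function name `frequent_frame_categories`, the whitespace tokenization, and the `n_frames` cutoff parameter are all hypothetical.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set, Tuple

Frame = Tuple[str, str]  # (word before the slot, word after the slot)


def frequent_frame_categories(
    utterances: List[str], n_frames: int
) -> Dict[Frame, Set[str]]:
    """Two-pass sketch: tally all frames, then group words by the top frames."""
    # Pass 1: tally the frequency of every frame type in the corpus.
    counts: Counter = Counter()
    for utterance in utterances:
        tokens = utterance.split()
        for i in range(len(tokens) - 2):
            counts[(tokens[i], tokens[i + 2])] += 1
    frequent = {frame for frame, _ in counts.most_common(n_frames)}

    # Pass 2: revisit the corpus and collect the words inside frequent frames.
    categories: Dict[Frame, Set[str]] = defaultdict(set)
    for utterance in utterances:
        tokens = utterance.split()
        for i in range(len(tokens) - 2):
            frame = (tokens[i], tokens[i + 2])
            if frame in frequent:
                categories[frame].add(tokens[i + 1])
    return dict(categories)
```

Note that `counts` holds an entry for every frame type ever seen, which is exactly the memory cost the limitation above points to.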

13
Goal of the current study
  • Provide a psychologically plausible model of word
    categorization
  • Children possess limited memory and cognitive
    capacity.
  • Human memory is imperfect.
  • Children may not be able to track all the frames
    they have encountered.

14
Features of the current model
  • It processes input and updates the categorization
    frames dynamically.
  • Each frame is associated with, and ranked by, an
    activation value.
  • It has a limited memory buffer for frames.
  • It stores only the 150 most activated frames.
  • It implements a forgetting function on the
    memory.
  • After a new frame is processed, the activation of
    every frame in memory decreases by 0.0075.

15
Child Input Corpora
  • Six corpora from CHILDES (MacWhinney, 2000).
  • Analyzed utterances to children under 2;6
    (2 years, 6 months).
  • Peter (Bloom, Hood, & Lightbown, 1974; Bloom,
    Lightbown, & Hood, 1975); Eve (Brown, 1973); Nina
    (Suppes, 1974); Naomi (Sachs, 1983); Anne
    (Theakston, Lieven, Pine, & Rowland, 2001); Aran
    (Theakston et al., 2001)
  • Mean utterances per child: 17,200
  • Min: 6,950; Max: 20,857

16
Procedure
  • The child-directed utterances from each corpus
    were processed individually
  • Utterances were presented to the model in the
    order of appearance in the corpus
  • Each utterance was segmented into frames
  • you read the story to Mommy
  • you read the
  • read the story
  • the story to
  • story to Mommy
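
The segmentation step amounts to sliding a three-word window over the utterance. The sketch below shows one way to do it; the function names `segment_into_frames` and `frame_label` are hypothetical, and splitting on whitespace is an assumption about tokenization.

```python
from typing import List, Tuple

Trigram = Tuple[str, str, str]


def segment_into_frames(utterance: str) -> List[Trigram]:
    """Return every three-word window of the utterance, in order."""
    tokens = utterance.split()  # assumption: whitespace-separated tokens
    return [
        (tokens[i], tokens[i + 1], tokens[i + 2])
        for i in range(len(tokens) - 2)
    ]


def frame_label(trigram: Trigram) -> str:
    """Name the frame by its outer words, with X marking the middle slot."""
    left, _, right = trigram
    return f"{left}_X_{right}"


for trigram in segment_into_frames("you read the story to Mommy"):
    print(" ".join(trigram), "->", frame_label(trigram))
# you read the -> you_X_the
# read the story -> read_X_story
# the story to -> the_X_to
# story to Mommy -> story_X_Mommy
```

Utterances shorter than three words produce no frames under this scheme.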

17
Procedure continued
  • you read the
  • read the story
  • the story to
  • story to Mommy

18
Procedure continued
  • The memory buffer stores only the 150 most
    activated frames.
  • It becomes full very quickly after processing
    several utterances.

19
Procedure continued
  • you put the
  • Frame you_X_the
  • Look up you_X_the frame in the memory
  • Increase the activation of you_X_the frame by 1
  • Re-rank the memory by activation

20
Procedure continued
  • you have a
  • Frame you_X_a
  • Look up you_X_a frame in the memory
  • Activation of story_X_Mommy < 1
  • Remove story_X_Mommy
  • Add you_X_a to memory, set the activation to 1
  • Re-rank the memory by activation

21
Procedure continued
  • A new frame not in memory
  • The activation of every frame in memory is
    greater than 1
  • There is no change to the memory.
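
The update rules from the preceding slides can be put together in one possible sketch. The slides leave several details open, so the following are assumptions: the class name `FrameMemory`, applying the decay after each update, admitting new frames freely while the buffer is not yet full, choosing the least activated frame as the candidate for removal, treating an activation of exactly 1 as "no change", and discarding a frame's words when the frame is removed.

```python
from typing import Dict, List, Set, Tuple

Frame = Tuple[str, str]


class FrameMemory:
    """Limited, decaying frame buffer (illustrative sketch, not the authors' code)."""

    def __init__(self, capacity: int = 150, decay: float = 0.0075) -> None:
        self.capacity = capacity
        self.decay = decay
        self.activation: Dict[Frame, float] = {}
        self.words: Dict[Frame, Set[str]] = {}

    def _add(self, frame: Frame, middle: str) -> None:
        self.activation[frame] = 1.0
        self.words[frame] = {middle}

    def process(self, left: str, middle: str, right: str) -> None:
        frame = (left, right)
        if frame in self.activation:
            # Known frame: raise its activation by 1 and record the word.
            self.activation[frame] += 1.0
            self.words[frame].add(middle)
        elif len(self.activation) < self.capacity:
            self._add(frame, middle)  # assumption: free slots are simply filled
        else:
            weakest = min(self.activation, key=self.activation.get)
            if self.activation[weakest] < 1.0:
                # Replace the weakest frame; otherwise memory is unchanged.
                del self.activation[weakest]
                del self.words[weakest]
                self._add(frame, middle)
        # Forgetting: every stored frame loses a little activation.
        for stored in self.activation:
            self.activation[stored] -= self.decay

    def ranked(self) -> List[Frame]:
        """Frames ordered from most to least activated."""
        return sorted(self.activation, key=self.activation.get, reverse=True)
```

Ranking is computed on demand here rather than after every update, which gives the same ordering with less bookkeeping. Feeding the model the trigrams of each utterance in corpus order would reproduce the single pass described in the procedure.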

22
Evaluating Model Performance
  • Hit: two words from the same linguistic category
    grouped together
  • False Alarm: two words from different linguistic
    categories grouped together
  • Upper bound of 1

23
Accuracy Example
  • Hits: 10
  • False Alarms: 5
  • Accuracy
  • Labels of the grouped words: V, V, V, ADV, V, V
24
Ten Categories for Accuracy
  • Noun, pronoun
  • Verb, Aux., Copula
  • Adjective
  • Preposition
  • Adverb
  • Determiner
  • Wh-word
  • Negation -- not
  • Conjunction
  • Interjection

25
Average accuracy across the six corpora

26
The Development of Accuracy
  • Accuracy is very high and stable throughout the
    entire process

27
Compare to Frequent Frames
  • After processing about half of the corpus, 70% of
    the frequent frames are among the 45 most
    activated frames in memory.

28
Memory of Final Step of Eve Corpus

29
Stability of Frames in Memory
  • The frames in memory change considerably in the
    early stage, but become stable after processing
    10% of the corpus

30
Summary
  • After processing the entire corpus, the learning
    algorithm has identified almost all of the
    frequent frames by highest activation.
  • Consequently, high accuracy of word
    categorization is achieved.
  • After processing fewer than half of the
    utterances, the 45 most activated frames included
    approximately 70% of the frequent frames.

31
Summary
  • Frames are a robust cue for categorizing words.
  • With limited and imperfect memory, the learning
    algorithm can identify most frequent frames after
    processing a relatively small number of
    utterances, and thus yields high accuracy of word
    categorization.