Connectionist Time and Dynamic Systems Time in One Architecture? Modeling Word Learning at Two Timescales - PowerPoint PPT Presentation

About This Presentation
Description:

Title: Slide 1 Author: Jessica Horst Last modified by: Jessica Horst Created Date: 6/14/2005 3:05:00 PM Document presentation format: On-screen Show – PowerPoint PPT presentation

Transcript and Presenter's Notes


1
Connectionist Time and Dynamic Systems Time in
One Architecture? Modeling Word Learning at Two
Timescales
Jessica S. Horst (jessica-horst_at_uiowa.edu), Bob
McMurray, Larissa K. Samuelson
Dept. of Psychology, University of Iowa
2
Two Time Scales in Neural Networks
  • Connectionist and dynamical systems accounts
  • stress change over time
  • complement each other in timescale
  • Dynamic Systems: online processes
  • Connectionist Networks: long-term learning
  • Many domains of development require both
    timescales
  • Example: language development requires
  • sensitivity to the brief and sequential nature of
    the input
  • slower developmental processes

3
Two Time Scales in Language Acquisition
Word learning is often attributed to fast mapping:
a quick link between a novel name and a novel
object (e.g., Carey, 1978). But recent empirical
data suggest that fast mapping and word learning
may represent two distinct time scales (Horst &
Samuelson, April, 2005).
  • Fast Mapping: quick process emerging in the
    moment.
  • Word Learning: gradual process over the course
    of development.
We capture both timescales in a recurrent network.
4
The Architecture
  • Activation feeds from input layers to decision
    layers.
  • Decision units compete via inhibition.
  • Activation feeds back to input layers.
  • Cycle continues until system settles.

Figure labels: Auditory Inputs, Decision Units (Hidden) Layer, Visual Inputs
Initial State (Before Learning)
(McMurray & Spivey, 2000)
  • Unsupervised Hebbian learning occurs on every
    cycle.
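
The cycle described on this slide can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation. The layer sizes (15 auditory, 15 visual, 90 decision units) come from the model slide. The competition rule (scale by current activation, then normalize), the multiplicative feedback, the learning rate, the settling tolerance, and the names `settle` and `normalize` are all assumptions made for the example.

```python
import random

def normalize(xs):
    """Scale non-negative activations so they sum to 1."""
    total = sum(xs)
    return [x / total for x in xs] if total > 0 else list(xs)

def settle(aud, vis, w_aud, w_vis, lr=0.01, tol=1e-4, max_cycles=200):
    """Cycle feedforward, competition, feedback, and Hebbian learning.

    w_aud[d][a] and w_vis[d][v] hold input-to-decision weights and are
    updated in place. Returns the final decision activations and the
    number of cycles that were run.
    """
    n_dec = len(w_aud)
    dec = [1.0 / n_dec] * n_dec          # start with no preferred unit
    aud, vis = normalize(aud), normalize(vis)
    cycles = 0
    for cycles in range(1, max_cycles + 1):
        # 1. Activation feeds from the input layers to the decision layer.
        net = [sum(w * a for w, a in zip(w_aud[d], aud)) +
               sum(w * v for w, v in zip(w_vis[d], vis))
               for d in range(n_dec)]
        # 2. Competition (assumed rule): scale by current activation,
        #    then normalize so that stronger units suppress weaker ones.
        new_dec = normalize([dec[d] * net[d] for d in range(n_dec)])
        # 3. Activation feeds back and reweights the input layers.
        aud = normalize([a * (1.0 + sum(new_dec[d] * w_aud[d][j]
                                        for d in range(n_dec)))
                         for j, a in enumerate(aud)])
        vis = normalize([v * (1.0 + sum(new_dec[d] * w_vis[d][j]
                                        for d in range(n_dec)))
                         for j, v in enumerate(vis)])
        # 4. Unsupervised Hebbian learning on every cycle.
        for d in range(n_dec):
            for j, a in enumerate(aud):
                w_aud[d][j] += lr * new_dec[d] * a
            for j, v in enumerate(vis):
                w_vis[d][j] += lr * new_dec[d] * v
        change = max(abs(n - o) for n, o in zip(new_dec, dec))
        dec = new_dec
        if change < tol:                  # the system has settled
            break
    return dec, cycles

# Hypothetical trial: one name presented with three visible objects.
rng = random.Random(0)
w_aud = [[rng.uniform(0.0, 0.1) for _ in range(15)] for _ in range(90)]
w_vis = [[rng.uniform(0.0, 0.1) for _ in range(15)] for _ in range(90)]
name = [1.0 if i == 0 else 0.0 for i in range(15)]
objects = [1.0 if i in (0, 1, 2) else 0.0 for i in range(15)]
dec, cycles = settle(name, objects, w_aud, w_vis)
print(cycles, max(dec))
```

Because the weights change inside the settling loop, each trial leaves a small trace in `w_aud` and `w_vis`, which is the sense in which learning rides on the online dynamics.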

5
  • Online decision dynamics reflect auditory and
    visual competitors.

6
The Model
  • 15 Auditory and 15 Visual units
  • 90 Decision units
  • Names presented singly with a variable number of
    objects
  • Name-Decision and Object-Decision associations
    strengthened via learning
  • After 4000 training trials, the network forms
    localist representations
  • Learns name-object links and learns to ignore
    visual competitors

End State (Post Learning)
7
Connection Strength
8
Two Time Scales
  • Fast: Moment by Moment
  • Online information integration and constraint
    satisfaction (e.g., McClelland & Elman, 1986;
    Dell, 1986)
  • Reaches a stable pattern of activation based on
    auditory and visual inputs and stored knowledge
    (weights)
  • Model makes correct name-object links based on
    the latest input
  • Slow: Over the Long Term
  • Unsupervised Hebbian Learning
  • Associates words with visual targets
  • Learns to ignore visual competitors

9
Dependent Time Scales
  • The two time scales are not independent
  • Long-term learning depends critically on the
    dynamics of the fast time scales
  • Competition between decision units ensures
    pseudo-localist representations, critical for
    Hebbian learning (e.g., Rumelhart & Zipser, 1986)
  • Learning occurs on each cycle
  • Influences processing cycle by cycle and
    trial by trial
  • Accumulated learning across trials leads to
    learning on long-term time scale (i.e., word
    learning)

10
Empirical Results
11
Fast Time Scale
  • 24-month-old children
  • Saw 2 familiar objects and 1 novel object
  • Asked to get familiar and novel objects (e.g.,
    "get the cow!" or "get the yok!")


  • Children were excellent at fast mapping (finding
    the referent of novel and familiar words in the
    moment).

12
Slow Time Scale
After a 5-minute delay, children were asked to
pick the referent of a newly fast-mapped name
(e.g., "get the yok!")

  • Children were unable to retain mappings after a
    5-minute delay

13
Replication
  • Initial findings replicated with simpler tasks
  • Is there an effect of the number of names or
    trials?
  • Children's difficulty in retaining newly
    fast-mapped names is not related to the number of
    names or trials

Replication 1 (N = 12): 1 Novel Name, 8 Familiar Names, 7 Preference Trials
  • Fast Mapping: 9/12
  • Retention: 4/9, n.s.
Replication 2 (N = 12): 1 Novel Name, 2 Familiar Names
  • Fast Mapping: 7/12
  • Retention: 4/7, n.s.

Binomial, p < .05; Binomial, p < .01
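
For readers who want to check counts like these, a binomial tail probability can be computed as sketched below. The slides do not state the chance level or whether the tests were one- or two-tailed. The value of 1/3 used here is an assumption based on the three-object trials of the original task, and the replications used simpler tasks, so the appropriate chance level may differ. The function name `binomial_tail` is hypothetical.

```python
from math import comb

def binomial_tail(successes, n, chance):
    """One-tailed P(X >= successes) for X ~ Binomial(n, chance)."""
    return sum(comb(n, k) * chance ** k * (1 - chance) ** (n - k)
               for k in range(successes, n + 1))

ASSUMED_CHANCE = 1 / 3   # assumption: one target among three objects

# Fast-mapping counts from the two replications.
for correct, total in [(9, 12), (7, 12)]:
    print(correct, total, binomial_tail(correct, total, ASSUMED_CHANCE))
```

Changing `ASSUMED_CHANCE` shows how strongly the conclusion depends on the number of response alternatives in each task.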
14
Simulations
15
  • 20 networks initialized with random weights
  • 15-word lexicon (names and objects)
  • 5 familiar words
  • 5 novel words
  • 5 held out
  • Trained on 5 familiar items for 5000 epochs
  • Items presented in random order
  • Run in the Fast Mapping Experiment
  • 10 fast mapping trials (5 familiar, 5 novel)
  • 5 retention trials
  • Learning was not turned off during experiment.
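
The simulation design listed above can be laid out as a small data structure. This is a sketch under stated assumptions: the slides give the counts (20 networks, 15 items split 5/5/5, 10 fast-mapping trials, 5 retention trials) but not how items were assigned or which items the retention trials tested. Random assignment per network and retention trials on the five novel items are assumptions, and `build_design` and `training_order` are hypothetical names.

```python
import random

def build_design(n_networks=20, n_items=15, seed=0):
    """Return one trial plan per simulated network."""
    rng = random.Random(seed)
    designs = []
    for net_id in range(n_networks):
        items = list(range(n_items))
        rng.shuffle(items)               # assumed: random assignment per network
        familiar, novel, held_out = items[:5], items[5:10], items[10:15]
        # 10 fast-mapping trials: 5 familiar targets and 5 novel targets.
        fast_mapping = ([("familiar", i) for i in familiar] +
                        [("novel", i) for i in novel])
        rng.shuffle(fast_mapping)
        # 5 retention trials (assumed to retest the fast-mapped novel items).
        retention = [("retention", i) for i in novel]
        designs.append({"network": net_id, "familiar": familiar,
                        "novel": novel, "held_out": held_out,
                        "fast_mapping": fast_mapping,
                        "retention": retention})
    return designs

def training_order(familiar, rng):
    """One training epoch: the familiar items in random order."""
    order = list(familiar)
    rng.shuffle(order)
    return order

designs = build_design()
print(len(designs), designs[0]["fast_mapping"][:3])
```

Since learning stays on during the experiment, a full simulation would keep applying weight updates while stepping through `fast_mapping` and `retention`, rather than freezing the network after training.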

16
How The Model Behaves
  • Fast Time Scale
  • Model succeeded on both types of fast-mapping
    trials
  • Model behavior patterned with empirical results

17
  • Slow Time Scale
  • The model fails to retain the newly learned
    words after a delay

(Figure label: Chance)
18
How The Model Thinks
  • Analyses of weight matrices revealed that
    relatively little learning occurred during the
    test phase.

[Figure: Change (RMS) in portions of weight matrix. Y-axis: Squared Deviations, 0 to 2. X-axis labels: Familiar Words, Familiar Words, Novel Words, Control Words (After Learning, After Test, End).]
Temporal dynamics of processing
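
A root-mean-square change over part of a weight matrix, the kind of measure named in the figure, could be computed along these lines. The slides do not say how the portions were defined. Selecting the input columns that belong to one word group, across all decision units, is an assumption, and `rms_change` is a hypothetical helper.

```python
from math import sqrt

def rms_change(before, after, cols):
    """RMS difference between two weight matrices over selected columns.

    before and after are lists of rows (one row per decision unit);
    cols lists the input units belonging to one word group.
    """
    squared = [(after[r][c] - before[r][c]) ** 2
               for r in range(len(before)) for c in cols]
    return sqrt(sum(squared) / len(squared))

# Toy example: two decision units, two input units.
before = [[0.0, 0.0], [0.0, 0.0]]
after = [[3.0, 0.0], [4.0, 0.0]]
print(rms_change(before, after, cols=[0]))   # weights for group 0 changed
print(rms_change(before, after, cols=[1]))   # weights for group 1 did not
```

Comparing this value for the novel-word columns before and after the test phase against the value for familiar words over training gives a direct measure of how little the experiment itself changes the weights.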
19
Prior to Experiment
Connection Strength
After Experiment
20
Conclusions
  • Two time scales captured in a single
    architecture
  • Fast, online: fast mapping
  • Slow, long-term: word learning
  • The model replicated the empirical findings
  • Excellent word learning and fast mapping
  • Poor retention
  • The model has sufficient knowledge to select the
    referent at a given moment in time, given
    auditory and visual input and stored knowledge
    (weights).
  • But not enough to subsequently know the word.

21
Conclusions
  • In-the-moment learning
  • Subtly biases behavior
  • Combined with activation dynamics, yields correct
    response.
  • Does not provide robust, context-independent word
    knowledge (in the short term)
  • Continued training on fast-mapped words (i.e.,
    5000 epochs) makes them familiar words.
  • Accumulation of this learning provides robust
    context-independent word knowledge over
    development.

22
Take-Home Messages
1) A fast-mapped word is not a known word, but
a known word is known because it has been
fast-mapped many, many times.
2) Understanding development requires models that
integrate both short-term dynamic processes and
long-term learning.
23
References
  • Carey, S. (1978). The child as word learner. In
    M. Halle, J. Bresnan, & A. Miller (Eds.),
    Linguistic Theory and Psychological Reality (pp.
    264-293). Cambridge, MA: MIT Press.
  • Dell, G. S. (1986). A spreading-activation
    theory of retrieval in sentence production.
    Psychological Review, 93(3), 283-321.
  • Horst, J. S., & Samuelson, L. K. (2005, April).
    Slow Down: Understanding the Time Course Behind
    Fast Mapping. Poster session presented at the 2005
    Biennial Meeting of the Society for Research in
    Child Development, Atlanta, GA.
  • McClelland, J., & Elman, J. (1986). The TRACE
    Model of Speech Perception. Cognitive Psychology,
    18(1), 1-86.
  • McMurray, B., & Spivey, M. (2000). The
    Categorical Perception of Consonants: The
    Interaction of Learning and Processing. The
    Proceedings of the Chicago Linguistics Society,
    34(2), 205-220.
  • Rumelhart, D., & Zipser, D. (1986). Feature
    Discovery by Competitive Learning. In D.
    Rumelhart & J. McClelland (Eds.), Parallel
    Distributed Processing: Explorations in the
    Microstructure of Cognition, 1. Cambridge, MA:
    MIT Press.

Acknowledgements
The authors would like to thank Joseph Toscano
for programming assistance and support. This
work was supported by NICHD Grant R01-HD045713 to
LKS.