Title: Explanation and Simulation in Cognitive Science


1
Explanation and Simulation in Cognitive Science
  • Simulation and computational modeling
  • Symbolic models
  • Connectionist models
  • Comparing symbolism and connectionism
  • Hybrid architectures
  • Cognitive architectures

2
Simulation and Computational Modeling
  • With detailed and explicit cognitive theories, we
    can implement the theory as a computational model
  • And then execute the model to
  • Simulate cognitive capacity
  • Derive predictions from the theory
  • The predictions can then be compared to empirical
    data

3
Questions
  • What kinds of theories are amenable to
    simulation?
  • What techniques work for simulation?
  • Is simulating the mind different from simulating
    the weather?

4
The Mind vs. the Weather
  • The mind may just be a complex dynamic system,
    but it isn't amenable to generic simulation
    techniques
  • The relation between theory and implementation is
    indirect: theories tend to be rather abstract
  • The relation between simulation results and
    empirical data is indirect: simulations tend to
    be incomplete
  • The need to simulate helps make theories more
    concrete
  • But improvement of the simulation must be
    theory-driven, not just an attempt to capture the
    data

5
Symbolic Models
  • High-level functions (e.g., problem solving,
    reasoning, language) appear to involve explicit
    symbol manipulation
  • Example: Chess and shopping seem to involve
    representation of aspects of the world and
    systematic manipulation of those representations

6
Central Assumptions
  • Mental representations exist
  • Representations are structured
  • Representations are semantically interpretable

7
What's in a representation?
  • Representation must consist of symbols
  • Symbols must have parts
  • Parts must have independent meanings
  • Those meanings must contribute to the meanings of
    the symbols which contain them
  • e.g., "34" contains "3" and "4", parts which have
    independent meanings
  • the meaning of "34" is a function of the meaning
    of "3" in the tens position and "4" in the units
    position

8
In favor of structured mental representations
  • Productivity
  • It is through structuring that thought is
    productive (finite number of elements, infinite
    number of possible combinations)
  • Systematicity
  • If you think "John loves Mary", you can think
    "Mary loves John"
  • Compositionality
  • The meaning of "John loves Mary" is a function of
    its parts, and their modes of combination
  • Rationality
  • If you know "A and B" is true, then you can infer
    "A" is true
  • Fodor & Pylyshyn (1988)

9
What do you do with them?
  • Suppose we accept that there are symbolic
    representations
  • How can they be manipulated? By a computing
    machine
  • Any such approach has three components
  • A representational system
  • A processing strategy
  • A set of predefined machine operations

10
Automata Theory
  • Identifies a family of increasingly powerful
    computing machines
  • Finite state automata
  • Push down automata
  • Turing machines

11
Automata, in brief (Figure 2.2 in Green et al.,
Chapter 2)
  • This FSA takes as input a sequence of "on" and
    "off" messages, and accepts any sequence ending
    with an "on" (see the sketch below)
  • A PDA adds a stack: an infinite-capacity,
    limited-access memory, so that what a machine
    does depends on input, current state, plus the
    memory
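
A minimal Python sketch of that FSA (the state names and message
strings are invented for illustration; note the machine has no memory
beyond its current state):

```python
# A finite state automaton that accepts any sequence of
# "on"/"off" messages ending with an "on".

ACCEPTING = "seen_on"  # accept if we finish in this state
TRANSITIONS = {
    ("seen_off", "on"): "seen_on",
    ("seen_off", "off"): "seen_off",
    ("seen_on", "on"): "seen_on",
    ("seen_on", "off"): "seen_off",
}

def fsa_accepts(messages):
    state = "seen_off"                     # start state: no "on" seen yet
    for msg in messages:
        state = TRANSITIONS[(state, msg)]  # pure table lookup: no memory
    return state == ACCEPTING

print(fsa_accepts(["off", "on", "off", "on"]))  # True
print(fsa_accepts(["on", "off"]))               # False
```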

12
  • A Turing machine changes this memory to allow any
    location to be accessed at any time. The state
    transition function specifies read/write
    instructions, as well as which state to move to
    next.
  • Any effective procedure can be implemented on an
    appropriately programmed Turing machine
  • And Universal Turing machines can emulate any
    Turing machine, via a description on the tape of
    the machine and its inputs
  • Hence, philosophical disputes
  • Is the brain Turing powerful?
  • Does machine design matter or not?
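
To make the read/write/move transition function concrete, here is a
minimal sketch; the bit-inverting task and all names are invented for
illustration:

```python
# A tiny Turing machine: the transition table maps
# (state, symbol) -> (symbol to write, head move, next state).
# This toy machine inverts a binary string, then halts on the blank.

RULES = {
    ("invert", "0"): ("1", +1, "invert"),
    ("invert", "1"): ("0", +1, "invert"),
    ("invert", " "): (" ", 0, "halt"),   # blank: stop
}

def run(tape, state="invert", head=0):
    tape = list(tape) + [" "]            # blank-terminated tape
    while state != "halt":
        write, move, state = RULES[(state, tape[head])]
        tape[head] = write               # write, then move the head
        head += move
    return "".join(tape).strip()

print(run("0110"))  # -> "1001"
```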

13
More practical architectures
  • Von Neumann machines
  • Strictly less powerful than Turing machines
    (finite memory)
  • Distinguished area of memory for stored programs
  • Makes them conceptually easier to use than TMs
  • A special memory location points to the
    next instruction; on each processing cycle:
    fetch instruction, move pointer to next
    instruction, execute current instruction (see
    the sketch below)
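
A minimal sketch of that fetch/move/execute cycle, with an invented
three-instruction machine language:

```python
# Von Neumann fetch/execute cycle: a program counter into stored
# memory selects the next instruction; it is fetched, the pointer is
# advanced, and only then is the instruction executed.

program = [("LOAD", 3), ("ADD", 4), ("PRINT", None), ("HALT", None)]
acc = 0          # a single accumulator register
pc = 0           # program counter: points at the next instruction

while True:
    op, arg = program[pc]   # fetch instruction
    pc += 1                 # move pointer to next instruction
    if op == "LOAD":        # execute current instruction
        acc = arg
    elif op == "ADD":
        acc += arg
    elif op == "PRINT":
        print(acc)          # -> 7
    elif op == "HALT":
        break
```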

14
Production Systems
  • Introduced by Newell & Simon (1972)
  • Cyclic processor with two main memory structures
  • Long term memory with rules (productions)
  • Working memory with symbolic representation of
    current system state
  • Example: IF goal(sweeten(X)) AND available(sugar)
    THEN action(add(sugar, X)) AND
    retract(goal(sweeten(X)))

15
  • Recognize phase (pattern matching)
  • Find all rules in LTM that match elements in WM
  • Act phase (conflict resolution)
  • Choose one matching rule, execute, update WM and
    (possibly) perform action
  • Complex sequences of behavior can thus result
  • Power of pattern matcher can be varied, allowing
    different use of WM
  • Power of conflict resolution will influence
    behavior given multiple matches
  • Most specific?
  • This works well for problem-solving. Would it
    work for pole-balancing?
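
A minimal sketch of the recognize-act cycle under these assumptions;
the rule encoding and the trivial "first match wins" conflict
resolution are simplifications, not Newell & Simon's actual formalism:

```python
# Production system sketch: rules match working memory (WM); one
# matching rule is chosen and its action updates WM.

wm = {"goal(sweeten(tea))", "available(sugar)"}

rules = [
    # (name, conditions all required in WM, elements to add, to retract)
    ("sweeten", {"goal(sweeten(tea))", "available(sugar)"},
     {"added(sugar, tea)"}, {"goal(sweeten(tea))"}),
]

def cycle(wm, rules):
    # Recognize phase: find all rules whose conditions WM satisfies
    matches = [r for r in rules if r[1] <= wm]
    if not matches:
        return False                  # quiescence: nothing fired
    # Act phase: trivial conflict resolution -- take the first match
    name, _, add, retract = matches[0]
    wm |= add
    wm -= retract
    print(f"fired {name}; WM = {wm}")
    return True

while cycle(wm, rules):
    pass
```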

16
Connectionist Models
  • The basic assumption
  • There are many processors connected together, and
    operating simultaneously
  • Processors: units, nodes, artificial neurons

17
A connectionist network is
  • A set of nodes, connected in some fashion
  • Nodes have varying activation levels
  • Nodes interact via the flow of activation along
    the connections
  • Connections are usually directed (one-way flow),
    and weighted (strength and nature of interaction:
    positive weight = excitatory; negative =
    inhibitory)
  • A node's activation will be computed from the
    weighted sum of its inputs (see the sketch below)
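
A sketch of that activation rule for a single node, assuming a
logistic squashing function (the slides do not fix a particular one):

```python
import math

# One connectionist node: activation is a squashed weighted sum of
# inputs. Negative weights act as inhibitory connections, positive
# weights as excitatory ones.

def node_activation(inputs, weights):
    net = sum(a * w for a, w in zip(inputs, weights))  # weighted sum
    return 1.0 / (1.0 + math.exp(-net))  # logistic squash (assumed)

print(node_activation([1.0, 0.5], [0.8, -2.0]))  # inhibitory 2nd input
```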

18
Local vs. Distributed Representation
  • Parallel Distributed Processing is a (the?) major
    branch of connectionism
  • In principle, a connectionist node could have an
    interpretable meaning
  • E.g., active when "red" input, or "grandmother",
    or whatever
  • However, an individual PDP node will not have
    such an interpretable meaning
  • Activation over whole set of nodes corresponds to
    "red"
  • Individual node participates in many such
    representations

19
PDP
  • PDP systems lack systematicity and
    compositionality
  • Three main types of networks
  • Associative
  • Feed-forward
  • Recurrent

20
Associative
  • To recognize and reconstruct patterns
  • Present activation pattern to subset of units
  • Let network settle into a stable activation
    pattern (reconstruction of previously learned
    state)

21
Feedforward
  • Not for reconstruction, but for mapping from one
    domain to another
  • Nodes are organized into layers
  • Activation spreads through layers in sequence
  • A given layer can be thought of as an activation
    vector
  • Simplest case
  • Input layer (stimulus)
  • Output layer (response)
  • Two-layer networks are very restricted in power.
    Intermediate (hidden) layers provide most of the
    additional computational power needed.
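
A sketch of a forward pass through one hidden layer, treating each
layer as an activation vector (NumPy; the weights are random
placeholders, not a trained network):

```python
import numpy as np

# Feedforward pass: activation flows input -> hidden -> output
# through weight matrices, one layer at a time.

rng = np.random.default_rng(0)
W_hidden = rng.normal(size=(3, 4))   # 4 inputs -> 3 hidden units
W_output = rng.normal(size=(2, 3))   # 3 hidden -> 2 outputs

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def forward(stimulus):
    hidden = sigmoid(W_hidden @ stimulus)  # intermediate (hidden) layer
    return sigmoid(W_output @ hidden)      # response

print(forward(np.array([1.0, 0.0, 0.0, 1.0])))
```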

22
Recurrent
  • Feedforward nets compute mappings given current
    input only. Recurrent networks allow mapping to
    take into account previous input.
  • Jordan (1986) and Elman (1990) introduced
    networks with
  • Feedback links from output or hidden layers to
    context units, and
  • Feedforward links from the context units to the
    hidden units
  • Jordan network output depends on current input
    and previous output
  • Elman network output depends on current input and
    whole of previous input history
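
A sketch of one Elman-style step, assuming the context units simply
hold a copy of the previous hidden vector (weights again random
placeholders):

```python
import numpy as np

# Elman-style recurrent step: the hidden layer sees the current input
# plus "context" units holding the previous hidden activations, so the
# output can depend on the whole input history.

rng = np.random.default_rng(1)
W_in = rng.normal(size=(3, 2))    # input -> hidden
W_ctx = rng.normal(size=(3, 3))   # context (previous hidden) -> hidden
W_out = rng.normal(size=(1, 3))   # hidden -> output

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

context = np.zeros(3)             # context starts empty
for x in [np.array([1.0, 0.0]), np.array([0.0, 1.0])]:
    hidden = sigmoid(W_in @ x + W_ctx @ context)
    context = hidden.copy()       # context units copy the hidden layer
    print(sigmoid(W_out @ hidden))
```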

23
Key Points about PDP
  • It's not just that a net can recognize a pattern
    or perform a mapping
  • It's the fact that it can learn to do so, on the
    basis of limited data
  • And the way that networks respond to damage is
    crucial

24
Learning
  • Present network with series of training patterns
  • Adjust the weights on connections so that the
    patterns are encoded in the weights
  • Most training algorithms perform small
    adjustments to the weights per trial, but require
    many presentations of the training set to reach a
    reasonable degree of performance
  • There are many different learning algorithms

25
Learning (contd.)
  • Associative nets support the Hebbian learning
    rule
  • Adjust weight of connection by amount
    proportional to the correlation in activity of
    corresponding nodes
  • So if both active, increase weight; if both
    inactive, increase weight; if they differ,
    decrease weight
  • Important because this is biologically
    plausible, and very effective
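
A sketch of that rule with bipolar (+1/-1) activations, so that the
product of the two activities directly implements "agree, increase;
differ, decrease" (the learning rate is an arbitrary choice):

```python
# Hebbian update sketch: with +1/-1 activations, the product of pre-
# and post-synaptic activity is +1 when the nodes agree (both active
# or both inactive) and -1 when they differ.

def hebb_update(w, pre, post, rate=0.1):
    return w + rate * pre * post   # correlation-proportional adjustment

w = 0.0
for pre, post in [(+1, +1), (-1, -1), (+1, -1)]:
    w = hebb_update(w, pre, post)
    print(f"pre={pre:+d} post={post:+d} -> w={w:+.2f}")
# weight rises twice (agreement), then falls once (disagreement)
```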

26
Learning (contd.)
  • Feedforward and recurrent nets often exploit the
    backpropagation of error rule
  • Actual output compared to expected output
  • Difference computed and propagated back to input,
    layer by layer, requiring weight adjustments
  • Note: unlike Hebb, this is supervised learning
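
A minimal backpropagation sketch for a one-hidden-layer network
(target, learning rate, and layer sizes are arbitrary illustrative
choices):

```python
import numpy as np

# Backpropagation sketch: compare actual to expected output, propagate
# the error back through the layers, and adjust the weights to reduce
# the error. This is supervised: a target output must be supplied.

rng = np.random.default_rng(2)
W1 = rng.normal(size=(3, 2))          # input -> hidden
W2 = rng.normal(size=(1, 3))          # hidden -> output
x = np.array([1.0, 0.0])
target = np.array([1.0])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(200):
    h = sigmoid(W1 @ x)               # forward pass
    y = sigmoid(W2 @ h)
    delta_out = (y - target) * y * (1 - y)        # output error signal
    delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error sent back a layer
    W2 -= 0.5 * np.outer(delta_out, h)            # weight adjustments
    W1 -= 0.5 * np.outer(delta_hid, x)

print(sigmoid(W2 @ sigmoid(W1 @ x)))  # output approaches the target
```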

27
Psychological Relevance
  • Given a network of fixed size, if there are too
    few units to encode the training set, then
    interference occurs
  • This is suboptimal, but is better than nothing,
    since at least approximate answers are provided
  • And this is the flipside of generalization, which
    provides output for unseen input
  • E.g., weep → wept; bid → bid

28
Damage
  • Either remove a proportion of connections
  • Or introduce random noise into activation
    propagation
  • And behavior can simulate that of people with
    various forms of neurological damage
  • Graceful degradation: impairment, but residual
    function
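
A sketch of the connection-removal form of damage, assuming the
network's connections are held in a weight matrix (the lesion
proportion is arbitrary):

```python
import numpy as np

# Lesioning sketch: zero out a random proportion of connections
# (weights). Performance typically degrades gracefully rather than
# failing outright.

rng = np.random.default_rng(3)

def lesion(weights, proportion=0.3):
    mask = rng.random(weights.shape) >= proportion  # keep ~70%
    return weights * mask

W = rng.normal(size=(4, 4))
print(lesion(W))   # damaged network retains most of its structure
```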

29
Example of Damage
  • Hinton & Shallice (1991), Plaut & Shallice (1993)
    on deep dyslexia
  • Visual error ("cat" read as "cot")
  • Semantic error ("cat" read as "dog")
  • Networks constructed for orthography-to-phonology
    mapping, lesioned in various ways, producing
    behavior similar to human subjects

30
Symbolic Networks
  • Though distributed representations have proved
    very important, some researchers prefer localist
    approaches
  • Semantic networks
  • Frequently used in AI-based approaches, and in
    cognitive approaches which focus on conceptual
    knowledge
  • One node per concept; typed links between
    concepts
  • Inference = link-following

31
Production systems with spreading activation
  • Anderson's work (ACT, ACT*, ACT-R)
  • Symbolic networks with continuous activation
    values
  • ACT-R never removes working memory elements;
    activation instead decays over time
  • Productions chosen on basis of (co-) activation

32
Interactive Activation Networks
  • Essentially, localist connectionist networks
  • Featuring self-excitatory and lateral inhibitory
    links, which ensure that there's always a winner
    in a competition (e.g., McClelland & Rumelhart's
    model of letter perception)
  • Appropriate combinations of levels, with feedback
    loops in them, allow modeling of complex
    data-driven and expectation-driven behavior
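
A sketch of that competition dynamic: localist units with
self-excitation and lateral inhibition, iterated until one unit wins
(all parameters are arbitrary illustrative choices):

```python
import numpy as np

# Interactive-activation sketch: each unit excites itself and inhibits
# its rivals; repeated updates drive the network to a single winner.

a = np.array([0.55, 0.50, 0.45])    # initial evidence for 3 candidates
SELF_EXCITE, INHIBIT, DECAY = 0.2, 0.2, 0.1

for _ in range(30):
    rivals = a.sum() - a            # total activation of competitors
    net = SELF_EXCITE * a - INHIBIT * rivals
    a = np.clip(a + net - DECAY * a, 0.0, 1.0)

print(a)  # the initially strongest unit wins the competition
```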

33
Comparing Symbolism Connectionism
  • As is so often the case in science, the two
    approaches were initially presented as exclusive
    alternatives

34
Connectionist
  • Interference
  • Generalization
  • Graceful degradation
  • Symbolists complain
  • Connectionists don't capture structured
    information
  • Network computation is opaque
  • Networks are merely implementation-level

35
Symbolic
  • Productive
  • Systematic
  • Compositional
  • Connectionists complain
  • Symbolists don't relate assumed structures to the
    brain
  • They relate them to von Neumann machines

36
Connectionists can claim
  • Complex rule-oriented behavior emerges from
    interaction of subsymbolic behavior
  • So symbolic models describe, but do not explain

37
Symbolists can claim
  • Though PDP models can learn implicit rules, the
    learning mechanisms are usually not neurally
    plausible after all
  • Performance is highly dependent on exact choice
    of architecture

38
Hybrid Architectures
  • But really, the truth is that different tasks
    demand different technologies
  • Hybrid approaches explicitly assume
  • Neither connectionist nor symbolic approach is
    flawed
  • Their techniques are compatible

39
Two main hybrid options
  • Physically hybrid models
  • Contain subsystems of both types
  • Issues: interfacing, modularity (e.g., use
    Interactive Activation Network to integrate
    results)
  • Non-physically hybrid models
  • Subsystems of only one type, but described two
    ways
  • Issue: levels of description (e.g.,
    connectionist production systems)

40
Cognitive Architectures
  • Most modeling is aimed at specific processes or
    tasks
  • But it has been argued that
  • Most real tasks involve many cognitive processes
  • Most cognitive processes are used in many tasks
  • Hence, we need unified theories of cognition

41
Examples
  • ACT-R (Anderson)
  • Soar (Newell)
  • Both based on production system technology
  • Task-specific knowledge coded into the
    productions
  • Single processing mechanism, single learning
    mechanism

42
  • Like computer architectures, cognitive
    architectures tend to make some tasks easy, at
    the price of making others hard
  • Unlike computer architectures, cognitive
    architectures must include learning mechanisms
  • But note that the unified approaches sacrifice
    genuine task-appropriateness and perhaps also
    biological plausibility

43
A Cognitive Architecture is
  • A fixed arrangement of particular functional
    components
  • A processing strategy