Using auto-encoders to model early infant categorization: results, predictions and insights
(Original slide file: "The Importance of Starting Blurry: Simulating Improved Basic-Level Category Learning in Infants Due to Weak Visual Acuity", by Robert M. French)

Transcript
1
Using auto-encoders to model early infant
categorization: results, predictions and
insights
2
Overview
  • An odd categorization asymmetry was observed in
    3-4 month old infants.
  • We explain this asymmetry using a connectionist
    auto-encoder model.
  • Our model made a number of predictions, which
    turned out to be correct.
  • We used a more neurobiologically plausible
    encoding for the stimuli.
  • The model can now show how young infants' reduced
    visual acuity may actually help them do
    basic-level categorization.

3
Background on infant statistical
category-learning
  • Quinn, Eimas, and Rosenkrantz (1993) noticed a
    rather surprising categorization asymmetry in 3-4
    month old infants:
  • Infants familiarized on cats are surprised by
    novel dogs
  • BUT infants familiarized on dogs are bored by
    novel cats.

4
How their experiment worked
Familiarization phase: infants saw 6 pairs of
pictures of animals from one category, say, cats
(i.e., a total of 12 different animals).
Test phase: infants saw a pair consisting of a
new cat and a new dog. Their gaze time was
measured for each of the two novel animals.
5

Familiarization Trials
6
(No Transcript)
7
(No Transcript)
8
(No Transcript)
9
(No Transcript)
10
(No Transcript)
11
Test phase
Compare looking times
12
Results (Quinn et al., 1993): The categorization
asymmetry
  • Infants familiarized on cats look significantly
    longer at the novel dog in the test phase than
    the novel cat.
  • No significant difference for infants
    familiarized on dogs on the time they look at a
    novel cat compared to a novel dog.

13
Our hypothesis
  • We assume that infants are hard-wired to be
    sensitive to novelty (i.e., they look longer at
    novel objects than at familiar objects).
  • Cats, on the whole, are less varied, and their
    feature distributions are largely included in
    those of Dogs.
  • Thus, when they have seen a number of cats, a
    dog is perceived as novel. But, when they have
    seen a number of dogs, the new cat is perceived
    as just another dog.

14
Statistical distributions of patterns are what
count
The infants are becoming sensitive to the
statistical distributions of the patterns they
are observing.
15
Consider the distribution of values of a
particular characteristic for Cats and Dogs
  • Note that the distribution for Cats is narrower
    than that of Dogs and included in that of Dogs.

16
Suppose an infant has become familiarized with
the distribution for cats
And then sees a dog
Chances are the new stimulus will fall outside of
the familiarized range of values
17
On the other hand, if an infant has become
familiarized with the distribution for Dogs
And then sees a cat
Chances are the new stimulus will be inside the
familiarized range of values
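To make the distribution argument above concrete, here is a minimal numeric sketch (my illustration, not part of the original slides), assuming a single Gaussian-distributed feature per category, with the Cat distribution narrower than and nested inside the Dog distribution; the means and standard deviations are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature distributions: Cats narrow, Dogs broad,
# with the Cat distribution nested inside the Dog distribution.
cat_mu, cat_sd = 0.5, 0.05
dog_mu, dog_sd = 0.5, 0.20

def familiarized_range(mu, sd, n=12):
    """Range of feature values seen across 12 familiarization exemplars."""
    sample = rng.normal(mu, sd, n)
    return sample.min(), sample.max()

def fraction_outside(mu, sd, lo, hi, n=10_000):
    """How often a novel exemplar falls outside the familiarized range."""
    novel = rng.normal(mu, sd, n)
    return float(np.mean((novel < lo) | (novel > hi)))

lo, hi = familiarized_range(cat_mu, cat_sd)   # familiarized on Cats
print("novel Dog outside the Cat range:", fraction_outside(dog_mu, dog_sd, lo, hi))

lo, hi = familiarized_range(dog_mu, dog_sd)   # familiarized on Dogs
print("novel Cat outside the Dog range:", fraction_outside(cat_mu, cat_sd, lo, hi))
```

With these made-up parameters, a novel dog falls outside the familiarized cat range most of the time, while a novel cat almost never falls outside the familiarized dog range, which is the asymmetry described above.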
18
How could we model this asymmetry?
  • We based our connectionist model on a model of
    infant categorization proposed by Sokolov (1963).

19
Sokolov's (1963) model (animated over slides 19-29)
A stimulus in the environment is encoded into an
internal representation. The representation is
decoded and compared with the stimulus (equal?);
if they do not match, the representation is
adjusted and the encode, decode-and-compare,
adjust loop repeats, continuing until the internal
representation corresponds to the external
stimulus.
30
Using an autoassociator to simulate the Sokolov
model
(Animated over slides 30-45.) The stimulus from
the environment is encoded by the network, the
internal representation is decoded, the decoded
output is compared with the input, and the weights
are adjusted. The encode, decode, compare,
adjust-weights loop continues until the internal
representation corresponds to the external
stimulus.
46
Infant looking time ≈ network error
  • In the Sokolov model, an infant continues to
    look at the image until the discrepancy between
    the image and the internal representation of the
    image drops below a certain threshold.
  • In the auto-encoder model, the network continues
    to process the input until the discrepancy
    between the input and the (decoded) internal
    representation of the input drops below a certain
    (error) threshold.

47
Input to our model
We used a three-layer, 10-8-10, non-linear
auto-encoder (i.e., a network that tries to
reproduce on output what it sees on input) to
model the data. The inputs were ten feature
values, normalized between 0 and 1.0 across all
of the images, taken from the original stimuli
used by Quinn et al. (1993). They were head
length, head width, eye separation, ear
separation, ear length, nose length, nose width,
leg length, vertical extent, and horizontal
extent. The distributions and, especially, the
amount of inclusion of these features are shown in
the following graphs.
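The slides specify the architecture (a 10-8-10 auto-encoder with inputs normalized between 0 and 1) but not the implementation. Below is a minimal numpy sketch of such a network, in which the number of weight updates needed to drive the reconstruction error below a threshold stands in for looking time; the learning rate, error threshold, and the random stimulus vector are illustrative assumptions rather than values from the original simulations.

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

class AutoEncoder:
    """Three-layer (10-8-10) non-linear auto-encoder trained with plain backprop."""
    def __init__(self, n_in=10, n_hidden=8, lr=0.1):
        self.W1 = rng.normal(0, 0.5, (n_in, n_hidden))
        self.W2 = rng.normal(0, 0.5, (n_hidden, n_in))
        self.lr = lr

    def forward(self, x):
        h = sigmoid(x @ self.W1)       # encode
        out = sigmoid(h @ self.W2)     # decode
        return h, out

    def train_step(self, x):
        h, out = self.forward(x)
        err = out - x                  # compare the reconstruction with the input
        # Backpropagate and adjust weights (sigmoid derivative is a * (1 - a))
        d_out = err * out * (1 - out)
        d_h = (d_out @ self.W2.T) * h * (1 - h)
        self.W2 -= self.lr * np.outer(h, d_out)
        self.W1 -= self.lr * np.outer(x, d_h)
        return float(np.mean(err ** 2))

def looking_time(net, stimulus, threshold=0.01, max_steps=5000):
    """Sokolov-style loop: keep processing until the error drops below threshold."""
    for step in range(1, max_steps + 1):
        if net.train_step(stimulus) < threshold:
            return step
    return max_steps

net = AutoEncoder()
cat = rng.uniform(0, 1, 10)            # placeholder 10-feature stimulus vector
print("steps to habituate:", looking_time(net, cat))
```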
48
Comparing the distributions of the input features
49
Results of Our Simulation
50
51
A Prediction of the auto-encoder model
  • If we were to reverse the inclusion relationship
    between Dogs and Cats, we should be able to
    reverse the asymmetry.
  • We selected the new stimuli from dog- and
    cat-breeder books (and very slightly morphed some
    of these stimuli).
  • We created a set of Cats and Dogs such that Cats
    now included Dogs, i.e., the Cat category was
    the broad category and the Dog category was the
    narrow category.

52
Reversing the Inclusion Relationship
Eye separation
Reversed distributions: Cats include Dogs
Old distributions: Dogs include Cats
Ear length
53
Results
Prediction by the model
3-4 month infant data
54
Removing the inclusion relationship: Another
prediction from the model
  • Our model also predicts that, regardless of the
    variance of each category, if we remove the
    inclusion relationship, we should eliminate the
    categorization asymmetry.

55
A new set of cat/dog stimuli was created in which
there is no inclusion relationship
56
Prediction and Empirical Results: The
categorization asymmetry disappears.
Infant data
Prediction of the auto-encoder
57
A critique of our methodology: The use of
explicit features
  • We used explicit features (head length, leg
    length, ear separation, nose length, etc.) to
    characterize the animals (we hand-measured the
    values using the photos shown to the infants).
  • We decided instead to use simply Gabor-filtered
    spatial-frequency information to characterize the
    pictures.

58
The Forest and the Trees: What are spatial
frequencies?
Very low spatial frequencies
59
The Forest and the Trees: What are spatial
frequencies?
Low spatial frequencies
60
The Forest and the Trees: What are spatial
frequencies?
Medium spatial frequencies
61
The Forest and the Trees: What are spatial
frequencies?
Medium-high spatial frequencies
62
The Forest and the Trees: What are spatial
frequencies?
High spatial frequencies
63
The Forest and the Trees: What are spatial
frequencies?
Very high spatial frequencies
64
The Forest and the Trees: What are spatial
frequencies?
At 10 m away, the forest is no longer visible;
trees with branches and individual leaves are
visible.
Extremely high spatial frequencies
65
The Forest and the Trees: Combining spatial
frequencies to obtain the full image
66
Cats: infant-to-adult visual acuity
Very low spatial frequencies
Two-month old vision
3-4 month old vision
(almost) adult vision
67
Cats: infant-to-adult visual acuity
68
(No Transcript)
69
(No Transcript)
70
(No Transcript)
71
(No Transcript)
72
(No Transcript)
73
Adult Vision with full range of spatial
frequencies
74
Spatial frequency maps of images with Gabor
filtering
 
low freq.
high freq.
spatial-frequency map
We cover this map with spatial-frequency ovals
along various orientations of the image. (Each
oval is normalized to have approximately the same
energy.)
This allows us to characterize each dog/cat image
with a 26-unit vector.
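The slides do not give the filtering details beyond the 26-unit vector, so the sketch below only illustrates the general recipe: convolve the image with Gabor kernels at several spatial frequencies and orientations, and take the normalized energy of each response as one feature. The particular frequencies, orientations, and kernel parameters are assumptions for illustration, not those used in the actual model.

```python
import numpy as np
from scipy.signal import convolve2d

def gabor_kernel(frequency, theta, sigma=3.0, size=21):
    """Real part of a Gabor filter: a sinusoid windowed by a Gaussian."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    x_rot = x * np.cos(theta) + y * np.sin(theta)
    envelope = np.exp(-(x ** 2 + y ** 2) / (2 * sigma ** 2))
    return envelope * np.cos(2 * np.pi * frequency * x_rot)

def gabor_energy_vector(image, frequencies, orientations):
    """Energy of the image's response to each frequency/orientation band."""
    features = []
    for f in frequencies:
        for theta in orientations:
            response = convolve2d(image, gabor_kernel(f, theta), mode="same")
            features.append(np.sqrt(np.mean(response ** 2)))
    features = np.array(features)
    return features / features.sum()          # normalize the total energy

# Illustrative parameters only (the actual model used 26 bands).
frequencies = (0.05, 0.1, 0.2, 0.3, 0.4)      # cycles per pixel
orientations = (0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)

image = np.random.default_rng(2).random((64, 64))   # stand-in for a cat/dog photo
vector = gabor_energy_vector(image, frequencies, orientations)
print(vector.shape)                            # (20,) here; 26 in the actual model
```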
75
This is an experiment. Consider the following
image.
76
?
77
(No Transcript)
78
(No Transcript)
79
(No Transcript)
80
Moral of the story: Sometimes too much detail
hinders categorization (even for adults!)
81
The same is true for infants: Reducing
high-frequency information improves category
discrimination for distinct categories
Reducing the range of the spatial frequencies
from the retinal map to V1 decreases
within-category variance. This decreases the
difference between two exemplars of the same
category, but increases the difference between
exemplars from two different categories. This
will make learning distant basic-level or
super-ordinate category distinctions easier (but
subordinate-level category distinctions will be
more difficult).
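A toy numeric check of this claim (my construction, not from the slides): two synthetic categories each share a low-frequency "shape", while individual exemplars differ in independent high-frequency noise; smoothing the signals, a crude stand-in for removing high spatial frequencies, shrinks within-category distances far more than the between-category distance.

```python
import numpy as np

rng = np.random.default_rng(3)
t = np.linspace(0, 1, 200)

def exemplars(low_freq_shape, n=20, noise=0.5):
    """Category members: shared low-frequency shape + individual high-frequency noise."""
    return np.array([low_freq_shape + noise * rng.normal(size=t.size) for _ in range(n)])

def low_pass(signals, width=15):
    """Moving-average smoothing as a crude stand-in for removing high spatial frequencies."""
    kernel = np.ones(width) / width
    return np.array([np.convolve(s, kernel, mode="same") for s in signals])

def mean_dist(a, b):
    """Average Euclidean distance between exemplars of group a and group b."""
    return np.mean([np.linalg.norm(x - y) for x in a for y in b])

cats = exemplars(np.sin(2 * np.pi * 2 * t))        # category A shape
dogs = exemplars(np.sin(2 * np.pi * 3 * t) + 0.5)  # category B shape

for label, (a, b) in {"full detail": (cats, dogs),
                      "low-pass":    (low_pass(cats), low_pass(dogs))}.items():
    within = mean_dist(a[:10], a[10:])             # distances within category A
    between = mean_dist(a, b)                      # distances across categories
    print(f"{label}: within={within:.2f}  between={between:.2f}  ratio={between/within:.2f}")
```

Under these assumptions the between/within distance ratio rises sharply after low-pass filtering, which is the sense in which blurring makes broad category distinctions easier while washing out the fine differences needed for subordinate-level distinctions.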
82
In other words, reduced visual acuity might
actually be good for infant categorization.
  • Visual acuity in infants is not the same as that
    of adults. They do not perceive high spatial
    frequencies (i.e., fine details), or perceive
    them only poorly.
  • This reduced visual acuity may actually improve
    perceptual efficiency by eliminating the
    information overload caused by extraneous fine
    details that would otherwise overwhelm their
    cognitive system.
  • Thus, distant basic-level category and
    super-ordinate level category learning may
    actually be facilitated by reduced visual
    acuity.

83
Reducing visual acuity in our model to simulate
young-infant vision by removing high spatial
frequencies
High spatial frequencies
84
Reducing visual acuity in our model to simulate
young-infant vision by removing high spatial
frequencies
High spatial frequencies
85
Reducing visual acuity in our model to simulate
young-infant vision by removing high spatial
frequencies
High spatial frequencies
86
Reducing visual acuity in our model to simulate
young-infant vision by removing high spatial
frequencies
The high spatial frequencies have been removed.
The autoencoder will work with input from these
images, thereby simulating early infant vision.
87
Two simulations with Gabor-filtered input
  • Reproducing previous results: Using vectors of
    the 26 weighted spatial-frequency values, instead
    of explicit feature values, produces auto-encoder
    network results similar to those produced by
    infants tested on the same images.
  • Reduced visual acuity: This is produced by
    largely eliminating high spatial-frequency
    information from the input (i.e., blurry
    vision); doing so actually significantly improves
    the network's ability to categorize the images
    presented to it.

88
Reproducing previous results (Cats are the more
variable category)
Results with explicit feature values (French et
al., 2001)
Results for 3-4 month old infants
Very little jump in error
Large jump in error
Network generalization errors with Gabor-filtered
spatial-frequency information
89
Conclusion about the use of Gabor-filtered input
instead of explicit feature measurements
  • Spatial frequency data in the model produces a
    reasonable fit to empirical data.
  • We avoid the thorny issue of using a particular
    set of high-level feature measurements (ear
    length, eye separation, etc.) to characterize the
    images used in the simulations.

90
Reduced visual acuity
Reduced perceptual acuity in 3-4 month old
infants produces an advantage for differentiating
perceptually distant basic-level categories and
super-ordinate categories.
91
Simulation 2: The advantage in 3-4 month old
infants of reduced visual acuity
The frequencies removed or reduced were:
  • Above 3-4 cycles/degree: very little contribution
  • Above 7.1 cycles/degree: no contribution

Network used: a 26-16-26 feedforward BP
autoencoder (learning rate 0.1,
momentum 0.9)
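The slide gives the hyperparameters but not the update rule; the standard backprop-with-momentum weight update usually meant by "learning rate 0.1, momentum 0.9" is sketched below. This is my reconstruction of a generic momentum step, not the original code, and the gradients shown are random placeholders.

```python
import numpy as np

def momentum_step(weights, grads, velocities, lr=0.1, momentum=0.9):
    """One gradient-descent-with-momentum update, applied in place."""
    for W, g, v in zip(weights, grads, velocities):
        v *= momentum        # keep a fraction of the previous update
        v -= lr * g          # add the new gradient step
        W += v               # apply the accumulated update

# 26-16-26 autoencoder weight matrices and their velocity buffers
rng = np.random.default_rng(4)
weights    = [rng.normal(0, 0.5, (26, 16)), rng.normal(0, 0.5, (16, 26))]
velocities = [np.zeros_like(W) for W in weights]
# The gradients would come from backpropagating the reconstruction error;
# random placeholders here just to show the call.
grads = [rng.normal(0, 0.01, W.shape) for W in weights]
momentum_step(weights, grads, velocities)
```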
92
Error reduction (i.e., improved generalization, even
overgeneralization) with reduced visual acuity
Network error
93
Close categories vs. Very dissimilar categories
When a network is familiarized on one category
(say, Cat), reduced visual acuity decreases
errors (i.e., improves generalization) for novel
exemplars in the same category or very similar
categories (like Dog). But it should help in
discriminating dissimilar categories. So, for
example, reduced visual acuity should produce a
greater jump in error for a network (or increased
attention for an infant) familiarized on Cats
when exposed to Cars.
94
When trained on one category (Cats), errors on
dissimilar categories (Cars) are increased by
reduced visual acuity (i.e., better category
discrimination). The larger the error, the better
the discrimination.
95
When trained on one category (Cats), errors on
dissimilar categories (Cars) are increased by
reduced visual acuity (i.e., better category
discrimination)
96
Jump in network error when trained on Cats and
tested on Novel Cats vs. Cars.
97
A Prediction of the Model: Consider Quinn et al.
(1993)
Familiarized on Cats
Familiarized on Dogs
Cat
Dog
Jump in interest
No jump in interest.
But what if we took this test Cat and, by adding
only high spatial-frequency information,
transformed it into this Dog?
98
Presumably what the 3-month old infant would see
is this
Familiarized on Cats
Familiarized on Dogs
Cat
Cat
Prediction: No jump in interest
No jump in interest.
The asymmetry would disappear, even though adults
would perceive a series of cats followed by a dog
and would expect a jump in infants' interest, as
there usually is for a novel dog following
familiarization on cats.
99
Modeling Dogs and Cats: Conclusions
A simple connectionist auto-encoder does a good
job of reproducing certain surprising infant
categorization data.
This supports a statistical, perceptually based,
on-line categorization mechanism in young infants.
This model makes testable predictions
that have subsequently been confirmed in infants.
Gabor-filtered spatial-frequency input is
neurobiologically plausible and produces a good
approximation to infant categorization data.
A counter-intuitive learning advantage for
categorizing distant basic-level categories and
super-ordinate categories arises from reduced
acuity input.
100
The case of anomia
One possible answer: variability. In general,
natural kinds are LESS VARIABLE than artefactual
kinds (e.g., "cats" are less variable than
"chairs").
101
Natural and artificial kinds: Before and after
representational compression
Chair
Chair after compression
Butterfly
Butterfly after compression