Modelling Language Evolution
Lecture 4: Learning bias and linguistic structure


1
Modelling Language Evolution
Lecture 4: Learning bias and linguistic structure
  • Simon Kirby
  • University of Edinburgh
  • Language Evolution and Computation Research Unit

2
Summary: the story so far
  • What is a model? Why do linguists need
    computational models?
  • Modelling learning. One approach: neural nets
  • Nodes, activations, connection weights, hidden
    representations
  • Error driven learning
  • Learning syntax: recurrent nets, starting small,
    critical period
  • Evolving network structure: genetic algorithms

3
Learning bias
  • We have been talking about what learners are
    good or bad at: what they can and cannot
    learn.
  • We refer to this as the learner's prior bias.
  • (This can be given a simple mathematical
    definition, sketched below, but let's not worry
    about the details here)
  • Prior bias is everything the learner brings to
    the problem that is independent of the data
  • Where does the bias come from?
  • It comes from biology. It is what is innate.
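
A minimal sketch of that definition, assuming the
standard Bayesian framing (the slide itself leaves the
maths out): the prior bias is a distribution P(h) over
hypotheses h, combined with the data d by Bayes' rule:

    P(h | d) ∝ P(d | h) · P(h)

Two learners shown the same data d can differ only
through P(h); that data-independent term is the prior
bias.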

4
Language universals and learning biases
  • Christiansen suggests that languages themselves
    adapt to learners.
  • So far we have looked at long-distance dependency
    and embedding
  • Christiansen suggests more specific targets for
    explanation:
  • Branching direction/head-order consistency
  • Subjacency
  • Typically, these are assumed to be innate (and
    therefore evolved by natural selection)
  • What if they arise naturally from sequential
    learning biases?

5
Head-ordering consistency
  • Languages are typically head-first or head-last.
  • (For the linguists:) This might be explained by a
    parameterised version of X-bar theory, as
    illustrated below.
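
An illustration of the X-bar idea (a textbook sketch,
not taken from the slide): one head-direction parameter
fixes the order of head and complement across all
categories.

    Head-first (e.g., English):   X' -> X Complement    (VP: V NP,  PP: P NP)
    Head-last (e.g., Japanese):   X' -> Complement X    (VP: NP V,  PP: NP P)

Setting the parameter once per language predicts
consistent head ordering across phrase types.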

6
Recursive consistency
  • Christiansen generalises head-ordering in terms
    of the interaction of recursive rules.
  • Consistent trees

7
Recursive consistency
  • Christiansen generalises head-ordering in terms
    of the interaction of recursive rules.
  • Inconsistent trees (contrasted with consistent
    ones in the sketch below)
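
A concrete contrast (illustrative rules, not taken from
the slides) showing how recursive rules interact:

    Consistent:   VP -> V NP,  PP -> P NP   (both head-first; recursion
                                             always branches the same way)
    Inconsistent: VP -> V NP,  PP -> NP P   (head-first VP, head-last PP;
                                             branching flips wherever a PP
                                             is embedded)

Mixing head directions in mutually recursive rules
produces centre-embedded structure, which sequential
learners find hard to track.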

8
A simple typology
  • Typologists construct a space of
    logically-possible languages and assign each a
    type
  • Christiansen's binary typology codes each
    language as a five-bit string of binary order
    choices (enumerated in the sketch below)
  • English is 11100
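
A sketch of the space in code (the five-bit coding and
"English is 11100" are from the slides; treating each bit
as one binary order choice, and the enumeration below,
are illustrative):

  from itertools import product

  # Each language type is a string of five binary order
  # choices (e.g. verb-object, adposition, genitive, ...).
  types = ["".join(str(b) for b in bits)
           for bits in product((1, 0), repeat=5)]

  assert len(types) == 2 ** 5 == 32  # 32 logically possible languages
  assert "11100" in types            # English's type in this coding
  print(types[:4])                   # ['11111', '11110', '11101', '11100']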

9
Which languages can SRNs learn?
  • If languages adapt to learning biases (as opposed
    to the other way round), perhaps some types will
    be better than others?
  • Will the SRN biases predict cross-linguistic
    distribution?
  • 8x8x8 SRN trained on next-category prediction
    (a sketch of such a net follows this list)
  • Categories:
  • Singular N, Plural N
  • Singular V, Plural V
  • Singular genitive, Plural genitive
  • Adposition
  • End of sentence marker
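
A minimal NumPy sketch of such a net, assuming a standard
Elman SRN with one-hot category inputs and squared-error,
error-driven learning (the 8x8x8 sizes are from the slide;
the learning rate and initialisation are illustrative):

  import numpy as np

  rng = np.random.default_rng(0)
  N = 8                                  # 8 input, 8 hidden, 8 output units

  W_ih = rng.normal(0, 0.5, (N, N))      # input -> hidden weights
  W_ch = rng.normal(0, 0.5, (N, N))      # context -> hidden weights
  W_ho = rng.normal(0, 0.5, (N, N))      # hidden -> output weights
  lr = 0.1

  def sigmoid(x):
      return 1.0 / (1.0 + np.exp(-x))

  def train_on_sequence(seq, context):
      """One error-driven pass over a list of category
      indices (0-7), predicting each next category."""
      global W_ih, W_ch, W_ho
      for cur, nxt in zip(seq, seq[1:]):
          x = np.eye(N)[cur]                      # one-hot current category
          t = np.eye(N)[nxt]                      # one-hot next category
          h = sigmoid(W_ih @ x + W_ch @ context)  # new hidden state
          y = sigmoid(W_ho @ h)                   # predicted next category
          d_out = (y - t) * y * (1 - y)           # output delta (squared error)
          d_hid = (W_ho.T @ d_out) * h * (1 - h)  # backprop to hidden layer
          W_ho -= lr * np.outer(d_out, h)
          W_ih -= lr * np.outer(d_hid, x)
          W_ch -= lr * np.outer(d_hid, context)
          context = h                             # Elman copy-back
      return context

  # Example: one short "sentence" of category indices.
  ctx = train_on_sequence([0, 2, 6, 7], np.zeros(N))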

10
Experimental setup
  • Nets were trained on each of the 32 languages
  • 25 nets were trained per language, crossing 5
    different initial weight settings with 5
    different random training sets
  • Each training set contained 1000 words
  • Each net was trained for 7 passes through its
    data
  • So: 800 simulations of 7000 words each (the full
    loop is sketched below)
  • Output was measured as the mean standard error in
    predicting the correct probability distribution
    for the next word
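
The same design in code form (a sketch: generate_corpus,
init_srn, train_pass and mean_error are hypothetical
placeholders standing in for the machinery above):

  # NOTE: the four helper functions here are hypothetical placeholders.
  results = {}
  for lang in range(32):                    # each of the 32 language types
      errors = []
      for weight_seed in range(5):          # 5 initial weight settings ...
          for corpus_seed in range(5):      # ... x 5 random training sets = 25 nets
              corpus = generate_corpus(lang, corpus_seed, n_words=1000)
              net = init_srn(weight_seed)
              for _ in range(7):            # 7 passes -> 7000 words per net
                  train_pass(net, corpus)
              errors.append(mean_error(net, lang))
      results[lang] = sum(errors) / len(errors)
  # 32 languages x 25 nets = 800 simulations in total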

11
Results 1: Net error vs. recursive inconsistency
  • Net error correlates very well with the number of
    inconsistencies (r = .83, p < .0001; the sketch
    below shows how such a correlation is computed)
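
How such a correlation is computed (sketch; the arrays
are random placeholders, since the real per-type errors
and inconsistency counts are not given in the slides):

  import numpy as np
  from scipy.stats import pearsonr

  rng = np.random.default_rng(1)
  inconsistencies = rng.integers(0, 5, size=32)            # per-type counts (placeholder)
  net_error = inconsistencies + rng.normal(0, 1, size=32)  # per-type mean error (placeholder)

  r, p = pearsonr(net_error, inconsistencies)
  print(f"r = {r:.2f}, p = {p:.4f}")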

12
Typological data
  • 625 languages have been characterised in terms
    of:
  • Verb-object order
  • Adposition order (i.e., prepositions or
    postpositions)
  • Genitive order
  • Grouped according to historical relatedness into
    252 genera. (Why?)
  • This controls for imbalances in the sample that
    are due to historical epiphenomena (the grouping
    is sketched below).
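
A sketch of the genus-based control (the (type, genus)
pairs are placeholders; only the idea of counting genera
rather than raw languages comes from the slide):

  from collections import defaultdict

  # (language_type, genus) pairs -- placeholder sample, not the real database
  sample = [("11100", "Germanic"), ("11100", "Romance"), ("00011", "Japonic")]

  genera_by_type = defaultdict(set)
  for lang_type, genus in sample:
      genera_by_type[lang_type].add(genus)

  total = len({genus for _, genus in sample})
  proportions = {t: len(g) / total for t, g in genera_by_type.items()}
  # A genus with many sampled languages still counts once,
  # so dense sampling of one family cannot inflate a type's
  # frequency.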

13
(Figure slide: no transcript available)
14
Results 2: Net error vs. cross-linguistic distribution
  • Net error correlates well with the proportion of
    genera (r = .35, p < .05)

15
Conclusions, and potential problems
  • We have moved from:
  • Learners adapt to be good at language (via
    natural selection)
  • to:
  • Language adapts to us
  • Concerns:
  • What do Christiansen's results say about Elman's
    and Batali's?
  • Are the neural nets modelling learning, or
    processing?
  • What about other universals (e.g., subjacency)?
  • Is equating learning difficulty with universal
    distribution valid?
  • Where do the languages come from, and what do the
    errors mean?