Title: Modelling the evolution of language for modellers and non-modellers
1Modelling the evolution of language for modellers and non-modellers
- Introduction and techniques
2Today's presenters
- Paul Vogt
- Language Evolution and Computation Research Unit, University of Edinburgh, UK
- Induction of Linguistic Knowledge group, Tilburg University, The Netherlands
- Tony Belpaeme (not here)
- Center for Interactive Intelligent Systems, University of Plymouth, UK
- Bart de Boer
- Department of Artificial Intelligence, Rijksuniversiteit Groningen, the Netherlands
- All alumni of the Brussels AI-lab
- All AI-researchers with strong cognitive focus
3Why this Tutorial?
- AI, in the sense of using computer models to understand human intelligence, benefits from interaction with other disciplines
- Evolution of language is such a discipline
- The study of language evolution deals with systems that are so complex that they need to be modeled with computers
- Linguists/paleontologists appreciate modeling, but are usually no good with computers themselves
4Our aims
- To introduce the field of language evolution to the AI community
- To present examples of possible models
- Based on our own work
- To explain how to communicate outside the field of AI
- Linguists/paleontologists read AI papers in a different way than AI researchers
5Organisation of the tutorial
- Theory
- 14:00-14:45 Introduction and techniques (Bart)
- 14:45-15:30 Topics of research (Paul)
- BREAK
- 16:00-16:45 Communication and caveats (Paul/Tony)
- Practical Examples
- 16:45-17:00 Vowel systems (Bart)
- 17:00-17:15 Talking Heads simulator (Paul)
- 17:15-17:30 Hands-on demonstration (Bart/Tony)
6Language Evolution
8Early Scientific Experiments
- Pharaoh Psamtik I
- Frederick II von Hohenstaufen
- James IV of Scotland
9And speculation
- Jespersen's critique
- Bow-wow theory
- Pooh-pooh theory
- Ding-dong theory
- Yo-he-ho theory
- But his own theory
- La-la theory
10As a result
- Chomsky also considered it impossible (and uninteresting) to study language evolution
11Can we do better today?
- 1990: Pinker & Bloom, Natural Language and Natural Selection
- Since 1996: biennial Evolang Conference
- Palaeontology
- Archaeology
- Anthropology
- Linguistics
- Biology
- Ethology
- Etc
- And of course Computer modelling
12Why language?
- Interesting and difficult question
- Many factors play a role (including chance), complex dynamics
- Possibilities for modelling
13Communication in Animals
Humans are not the only ones with complex communication
14Relation with primates
- Chimpanzees are very smart
- But do not learn how to speak
- And only learn sign language with difficulty
- Apes do communicate vocally
- But more comparable with involuntary human cries of pain, joy, laughter etc.
- Neural structures in ape and monkey brains for manipulation and vocalization are analogues of human brain structures for speech
15What evolved?
- Very specific mechanisms?
- Universal Grammar
- Principles and Parameters
- More general learning mechanisms, some specialised for communication?
- Completely general mechanisms
- Language itself evolved culturally
- Nature versus nurture
16Which of these had language?
Australopithecus africanus
Homo erectus
Homo neanderthalensis
17When did language emerge?
- Two extremes
- Late emergence (30 000 years ago)
- Early emergence (A. africanus)
- But there is no direct archaeological evidence
18The argument for late emergence
- Symbolic explosion
- About 30 000 years ago humans started to produce art
- Problem: European bias, nowadays earlier and earlier finds
- Appears to have emerged and disappeared repeatedly
19Against late emergence
- How can complex language evolve so quickly?
- How does one explain biological adaptations to language?
- Homo sapiens started to spread much earlier than 50-70 000 years ago
- Would language have emerged in different places?
20Fossil evidence
- Hypoglossal canal (Kay, Cartmill & Balow)
- For tongue control
- Not enlarged in Homo erectus, but in early sapiens and Neanderthal (>400 000 years)
- Thoracic vertebral canal (MacLarnon & Hewitt)
- For diaphragm control
- Not enlarged in Homo ergaster, but in Neanderthal and modern man
21Modern Language
- Are there primitive languages?
- No, not if native
- No data on language evolution
- But data on possibilities of language
- But pidgin-languages
- Jargons, second language etc.
- And creolisation
- Or emergence of new language
- Nicaraguan sign language
- Idea: proto-language
- Bickerton
22Jackendoff
Pre-existing primate conceptual structure
Use of symbols in non-situation-specific fashion
Concatenation of symbols
Use of an open, unlimited class of symbols
Use of symbol position to convey basic semantic relations
Development of a phonological combinatorial system to enlarge open, unlimited class of symbols
(Protolanguage about here)
Hierarchical phrase structure
Symbols that explicitly encode abstract semantic relations
Grammatical categories
System of inflections to convey semantic relations
System of grammatical functions to convey semantic relations
(Modern language)
23Deep history of language
- Historical linguistics can reconstruct older forms of a language (e.g. Indo-European)
- Traditional linguistics: up to 8000 years ago
- But
- Ruhlen claims proto-world
- Very unlikely
- Human expansion started about 150 000 yrs ago
- After this time all similarities are gone
24Language and Genes
25Conclusion
- Language is >400 000 years old
- No primitive languages exist, and we know little about how languages spread very long ago
- But we can observe emergence of new languages
- That all have certain special properties
- We can also observe incomplete languages
- Language probably emerged as primitive proto-language
- Why is an interesting, but hard-to-answer question
- How did it spread, how did it emerge?
- Cannot be reconstructed from fossils
- But possible to model
26Techniques
27The process of modelling
- Choose a cognitive/linguistic problem
- Gather data and theories on which to base a model
- Implement the theory as a computer model
- Make abstractions of reality
- Make mappings between abstractions and reality
- Make measures of performance of the model
- Implement the model using abstractions and tradeoffs
- Run tests for different parameter settings
- Check whether predictions of your model conform with reality
- Communicate your results
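The step of running tests for different parameter settings can be sketched as a small parameter sweep. This is only an illustration in Python: `run_toy_model`, its `noise` parameter and its success measure are hypothetical stand-ins for a real simulation, not something taken from the tutorial. Each setting is run with several random seeds so that differences between settings can be told apart from chance.

```python
import random
import statistics


def run_toy_model(noise, seed, rounds=1000):
    """Hypothetical stand-in for a real simulation.

    Returns the fraction of signals received without error when each
    signal is corrupted with probability `noise`.
    """
    rng = random.Random(seed)
    successes = sum(1 for _ in range(rounds) if rng.random() >= noise)
    return successes / rounds


def sweep(noise_levels, seeds):
    """Run the model for every parameter setting and several random seeds."""
    results = {}
    for noise in noise_levels:
        scores = [run_toy_model(noise, seed) for seed in seeds]
        # Report mean and spread so settings can be compared fairly.
        results[noise] = (statistics.mean(scores), statistics.stdev(scores))
    return results


if __name__ == "__main__":
    for noise, (mean, sd) in sweep([0.0, 0.1, 0.3], seeds=range(10)).items():
        print(f"noise={noise:.1f}  success={mean:.3f} +/- {sd:.3f}")
```

Reporting a spread over seeds alongside the mean makes it easier to communicate how robust a result is.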
28About compromises
- Reality is too complex to model completely
- Especially true for models of systems involving humans
- Computing power is limited
- Our knowledge of the underlying systems is limited
- Especially of the cognitive aspects
- One is forced to make compromises/tradeoffs
- Identify the bottlenecks: it is no use to make one part of the system extremely realistic if other parts are not
- Always communicate your compromises
Figure: tradeoff between Computation and Realism
29About abstractions
- Making abstractions is one form of increasing performance
- Do not model certain aspects of the system and consider them a black box
- Abstractions are perfectly acceptable
- Make sure not to abstract away the baby with the bathwater
- Describe and defend your abstractions
- Keep in mind how the abstractions map onto reality
- Do not use the model for something you abstracted out
Figure: a Speaker and a Hearer; the Noisy Speech between them is not modelled, both work with Errorless words
30About implementation
- Choose a language and a programming environment in which you are comfortable
- Describe your model independently of your chosen programming language
- No single programming language is best
- Models are often on the edge of what is computationally feasible
- (Evolution of) language is complex, so you want to model as much of it as possible
- Optimization of implementation becomes crucial
- Preferably choose algorithms that are cognitively plausible
- No exponential complexity
- No global knowledge
31Techniques
- Optimization
- Genetic Algorithms
- Agent-based models
- Mathematical analysis
32Agent-based models (1)
- Two aspects of language/linguistics
- Individual
- Psycholinguistics/language acquisition/speech errors
- performance, parole
- Collective
- Historical linguistics, general linguistics
- competence, langue
- These aspects influence each other
- Individual performance based on group conventions
- Collective behaviour caused by individuals
- This link is difficult to investigate
- Complex feedback
- Non-linear behaviour (influences are not separable)
- Difficult to understand with pen-and-paper
33Agent-based models (2)
- Computer simulations have no problems with complex systems
- Ideal to investigate interaction of individual and collective levels
- Model a population of individuals
- Individual: learning behavior, language behavior
- Population: interactions, population dynamics
Figure: a population of Speakers and the Langue
34Architecture of an individual
Figure: components of an individual: Language 1 to Language N, Age, Language Learning, chromosomes, Language behaviour (Perception and Production of speech), Social behaviour
35The population
Figure: Agents with Language interaction, Social interaction, mating and Spatial structure; Add to population; Remove from population
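The population dynamics named above (language interaction, removal from and addition to the population) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the presenters' implementation: the `Agent` class, the "adopt what you hear" learning rule, the `max_age` cut-off and the example word are all hypothetical, and mating, social interaction and spatial structure are left out.

```python
import random
from dataclasses import dataclass, field


@dataclass
class Agent:
    """A minimal individual: an age and a lexicon (meaning -> word)."""
    age: int = 0
    lexicon: dict = field(default_factory=dict)

    def produce(self, meaning):
        """Return the agent's word for a meaning, or None if it has none."""
        return self.lexicon.get(meaning)

    def learn(self, meaning, word):
        # Assumed learning rule: simply adopt the word that was heard.
        self.lexicon[meaning] = word


def step(population, meanings, max_age, rng):
    """One time step: language interaction, ageing, removal, addition."""
    # Language interaction: a random speaker talks to a random hearer.
    speaker, hearer = rng.sample(population, 2)
    meaning = rng.choice(meanings)
    word = speaker.produce(meaning)
    if word is not None:
        hearer.learn(meaning, word)

    # Ageing and removal from the population.
    for agent in population:
        agent.age += 1
    survivors = [a for a in population if a.age <= max_age]

    # Addition: newborns with empty lexicons keep the population size constant.
    newborns = [Agent() for _ in range(len(population) - len(survivors))]
    return survivors + newborns


if __name__ == "__main__":
    rng = random.Random(0)
    population = [Agent(age=i, lexicon={"food": "wabu"}) for i in range(5)]
    for _ in range(50):
        population = step(population, ["food", "danger"], max_age=20, rng=rng)
    print([agent.lexicon for agent in population])
```

Whether the word survives the turnover depends on how often newborns hear it before the older agents are removed, which is the kind of question such a model lets one explore.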
36Agent-based paradigms
- Iterated learning model (Edinburgh)
- Vertical transmission (Transfer over generations)
- Small populations
- Language game model (Brussels)
- Horizontal transmission (Transfer within generations)
- Larger population
- Sometimes the differences are accentuated, but we would like to stress the similarities
Figure: paradigms by Population size and transmission: Language Games (Large, Horizontal), Iterated Learning (Small, Vertical)
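To make horizontal transmission concrete, here is a minimal naming game in Python. It is a sketch of the general language game idea, not the Brussels implementation: the update rule (on success both agents keep only the winning word, on failure the hearer adds the word), the single shared object, and all names and default values are assumptions made for illustration. An iterated learning model would instead pass the language from one generation of agents to the next.

```python
import random


def naming_game(n_agents=20, n_games=5000, seed=0):
    """Play a minimal naming game about a single object.

    Each agent keeps a set of candidate words. Returns the final
    inventories and the success (True/False) of every game.
    """
    rng = random.Random(seed)
    inventories = [set() for _ in range(n_agents)]
    outcomes = []
    next_word = 0  # counter used to invent new, unique words

    for _ in range(n_games):
        speaker, hearer = rng.sample(range(n_agents), 2)
        if not inventories[speaker]:
            # A speaker without any word invents one.
            inventories[speaker].add(f"w{next_word}")
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))

        if word in inventories[hearer]:
            # Success: both agents drop all competing words.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            outcomes.append(True)
        else:
            # Failure: the hearer adopts the word as a new candidate.
            inventories[hearer].add(word)
            outcomes.append(False)
    return inventories, outcomes


if __name__ == "__main__":
    inventories, outcomes = naming_game()
    print("success in first 100 games:", sum(outcomes[:100]) / 100)
    print("success in last 100 games: ", sum(outcomes[-100:]) / 100)
```

No agent has global knowledge here: each one only ever sees the words it hears in its own games, yet the population can still settle on a shared word.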
37On measuring performance
- Important to define measures of performance
- Optimization needs quality function
- GA needs a fitness function
- Agent-based models need to be monitored
- The whole model contains too much data (too many degrees of freedom)
- Especially true in agent-based models
- Large complex systems can sometimes be described by a smaller number of parameters
- Compare statistical mechanics: properties of a gas are temperature, pressure and specific gravity
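The point that a GA needs a fitness function can be illustrated with a toy genetic algorithm in Python. Everything specific here is an assumption for the sake of the example: a genome is a list that assigns one signal to each meaning, and the hypothetical `fitness` simply counts how many distinct signals are used, so that meanings stay distinguishable. Selection keeps the fitter half, and children are made by one-point crossover plus mutation.

```python
import random


def fitness(genome):
    """Assumed fitness: the number of distinct signals in the genome."""
    return len(set(genome))


def evolve(n_meanings=5, n_signals=5, pop_size=30, generations=60,
           mutation_rate=0.1, seed=0):
    """Evolve signal assignments and return the fittest genome found."""
    rng = random.Random(seed)
    population = [[rng.randrange(n_signals) for _ in range(n_meanings)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep the fitter half of the population as parents.
        population.sort(key=fitness, reverse=True)
        parents = population[:pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            mother, father = rng.sample(parents, 2)
            cut = rng.randrange(1, n_meanings)  # one-point crossover
            child = mother[:cut] + father[cut:]
            # Mutation: occasionally replace a signal with a random one.
            child = [rng.randrange(n_signals) if rng.random() < mutation_rate
                     else gene for gene in child]
            children.append(child)
        population = parents + children
    return max(population, key=fitness)


if __name__ == "__main__":
    best = evolve()
    print(best, fitness(best))
```

The single number returned by `fitness` is also an example of describing a system with many degrees of freedom by a much smaller number of parameters.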
38Measures
- Examples
- Energy (Liljencrants and Lindblom model)
- Communicative success
- Number of words
- Coherence in the population
- Productivity of a grammar
- ...
- Important to clearly define and describe your measures
- Important to explain how the measures map from the simulation onto real language
- Measures are abstractions from an already abstract model
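Three of the measures listed above can be written down as short Python functions. This is a sketch under assumptions: the energy follows the inverse-squared-distance form in which the Liljencrants and Lindblom measure is commonly stated, with vowels as points in an unspecified acoustic space, while the definitions of `communicative_success` and `coherence` are plausible choices made here for illustration, not definitions given in the tutorial.

```python
import math
from collections import Counter
from itertools import combinations


def energy(vowels):
    """Sum of inverse squared distances between all pairs of vowels.

    `vowels` is a list of points in some acoustic space; lower energy
    means the vowels are more spread out.
    """
    return sum(1.0 / math.dist(a, b) ** 2 for a, b in combinations(vowels, 2))


def communicative_success(outcomes):
    """Fraction of successful interactions in a list of booleans."""
    return sum(outcomes) / len(outcomes)


def coherence(words):
    """Fraction of agents that use the most common word for one meaning."""
    most_common_count = Counter(words).most_common(1)[0][1]
    return most_common_count / len(words)


if __name__ == "__main__":
    print(energy([(0, 0), (1, 0), (0, 1)]))
    print(communicative_success([True, False, True, True]))
    print(coherence(["wabu", "wabu", "kiko", "wabu"]))
```

Each function reduces a whole simulation state to one number, so it is worth stating explicitly what that number is supposed to correspond to in real language.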
39About mathematical analysis
- Mathematical analysis can also be used to gain understanding of complex linguistic phenomena
- Has been used successfully in a number of cases (e.g. Nowak et al.)
- Comparable to mathematical biology
- Is considered more exact than computational models: insight in the why, not just the how
- In order to do mathematical analysis, models must be made even more simple and abstract
- Complex, non-linear models are often not solvable
- Complementary to computational models
- Perhaps design models with analysis in mind
- Use analysis to gain deeper understanding of the model's dynamics
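As an illustration of what such an analysis can look like, the replicator-mutator equation from mathematical biology is a form commonly associated with the work of Nowak and colleagues on language dynamics. Whether this is the exact model the presenters have in mind is an assumption. Here $x_j$ is the fraction of the population using grammar $j$, $F_{ik}$ is the payoff when users of grammars $i$ and $k$ communicate, and $Q_{ij}$ is the probability that a learner exposed to grammar $i$ ends up with grammar $j$.

```latex
\dot{x}_j = \sum_{i=1}^{n} f_i \, x_i \, Q_{ij} - \phi \, x_j,
\qquad
f_i = \sum_{k=1}^{n} F_{ik} \, x_k,
\qquad
\phi = \sum_{i=1}^{n} f_i \, x_i
```

The average fitness $\phi$ keeps the fractions summing to one. An equation of this kind can be analysed for fixed points and their stability, which gives insight into why a population does or does not converge on one grammar.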
40What have we seen?
- Steps in modeling
- Design an abstract model of a linguistic phenomenon
- Specify mappings from the abstraction to reality
- Choose a technique for implementing your model
- Make decisions about computational simplifications
- Design measures on your system
- Do mathematical analysis
- Techniques for modeling
- Optimization
- Genetic Algorithms
- Agent-based models (Language Games and Iterated Learning)
- (mathematical analysis)