Title: ON THE USE OF LINGUISTIC CORPORA IN CONNECTIONIST MODELLING

1
ON THE USE OF LINGUISTIC CORPORA IN CONNECTIONIST
MODELLING

Kari Hiltula
University of Tampere
Finnish language

2
Outline of the presentation

A focus on the data in modelling
Corpus as a basis for the training environment of
a model
Training a connectionist model
The relationship between the model and the real
thing
The learning situation redefined
Towards modelling meaning-induced learning

3
A focus on the data in modelling

Models of language learning are seen to partake
in the debate between connectionist and symbolic
theories of cognition (Pinker & Prince 1988)
As a result, the discussion has focussed more on
the appropriate mechanism(s) of a model than on
the actual data
What a connectionist model comes to represent
depends mostly on the data it has been trained
with

4
Corpus as a basis for the learning environment of
a model (1)

The training data of a connectionist model is
often based on lexical and frequency data derived
from a corpus, which could be a general written
language corpus (e.g., The Brown Corpus), a
particular literary text in an electronic format,
a dictionary, etc.
The training data may consist of simplified
patterns to scale down the original problem for
the purposes of modelling but preserve the
relative frequency of the patterns in the chosen
corpus
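The scaling-down step can be sketched as follows. This is a minimal illustration, not the procedure of any cited model: the function name `scale_counts`, the target size, and the verb counts are all hypothetical. It reduces raw corpus counts to a fixed number of training tokens while keeping relative frequencies as close as integer counts allow (largest-remainder rounding).

```python
from fractions import Fraction


def scale_counts(corpus_counts, target_total):
    """Scale raw counts down to target_total tokens, preserving
    relative frequencies up to integer rounding."""
    corpus_total = sum(corpus_counts.values())
    # Exact (fractional) share of the target size for each item.
    exact = {w: Fraction(c * target_total, corpus_total)
             for w, c in corpus_counts.items()}
    # Round down first, then hand the leftover tokens to the items
    # with the largest fractional remainders.
    scaled = {w: int(q) for w, q in exact.items()}
    leftover = target_total - sum(scaled.values())
    by_remainder = sorted(exact, key=lambda w: exact[w] - scaled[w],
                          reverse=True)
    for w in by_remainder[:leftover]:
        scaled[w] += 1
    return scaled


# Hypothetical counts for illustration only (not taken from any corpus).
corpus_counts = {"go": 600, "walk": 300, "sing": 90, "cleave": 10}
training_counts = scale_counts(corpus_counts, 100)
print(training_counts)
```

Note that with a small enough target size, rare items can be rounded down to zero tokens, so the modeller still has to decide whether to keep them in the training set.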

5
Corpus as a basis for the learning environment of
a model (2)

A common conception of a connectionist model: an
attempt to approximate or mimic the acquisitional
situation of a young native learner (MacWhinney
et al. 1989, p. 263)
The training data or set can be regarded as a
phenomenon-relevant (e.g., past tense learning)
sample of the actual language environment
confronted by the child
So far, there are no exact criteria for choosing
a representative corpus for that sample

6
Corpus as a basis for the learning environment of
a model (3)

The difficulty of measuring the match between the
actual and model input:
"These numbers, although accurate, may justly be
regarded with a certain degree of suspicion with
regard to their appropriateness as a measure of a
child's input, as the Brown corpus (from which
the frequency data derives) is a sample not of
children's (or child-directed) spontaneous
speech, but of written, edited, adult-to-adult
communication such as novels, magazines, and
newspapers. On the other hand, measuring only
child-directed or child-initiated speech could
also be misleading as most children certainly
listen to adult conversation (and even edited
adult speech, e.g., on television)." (Plunkett &
Juola 1999, pp. 467–468.)

7
Training a connectionist model (1)

A training set is essentially an input for the
learning model (here, a supervised neural network),
together with the desired output
During training, the chosen set of verbs, nouns,
or other patterns relevant to the phenomenon
being modelled is presented to the model several
hundred times
The output of the model is tested at various
points during training, and at the end of
training (e.g., with a new set of patterns)
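The training procedure described above can be sketched in a few lines. This is a toy illustration under stated assumptions, not a reconstruction of any cited model: a single-layer sigmoid network trained with the delta rule, binary patterns invented for the example (a three-bit "stem" mapped to the same three bits plus a constant "suffix" bit), and hypothetical names such as `train` and `test_every`.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, bias, pattern):
    """Output activations of a single-layer network for one input pattern."""
    return [sigmoid(sum(w * x for w, x in zip(row, pattern)) + b)
            for row, b in zip(weights, bias)]


def mean_squared_error(weights, bias, pairs):
    total, count = 0.0, 0
    for inp, target in pairs:
        out = forward(weights, bias, inp)
        total += sum((t - o) ** 2 for t, o in zip(target, out))
        count += len(target)
    return total / count


def train(pairs, test_pairs, epochs=300, rate=0.5, test_every=100, seed=0):
    """Present every training pair once per epoch (delta rule) and
    record training and test error at regular checkpoints."""
    rng = random.Random(seed)
    n_in, n_out = len(pairs[0][0]), len(pairs[0][1])
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    bias = [0.0] * n_out
    history = [(0, mean_squared_error(weights, bias, pairs),
                mean_squared_error(weights, bias, test_pairs))]
    for epoch in range(1, epochs + 1):
        order = pairs[:]
        rng.shuffle(order)  # new presentation order each epoch
        for inp, target in order:
            out = forward(weights, bias, inp)
            for j in range(n_out):
                delta = (target[j] - out[j]) * out[j] * (1.0 - out[j])
                for i in range(n_in):
                    weights[j][i] += rate * delta * inp[i]
                bias[j] += rate * delta
        if epoch % test_every == 0:
            history.append((epoch,
                            mean_squared_error(weights, bias, pairs),
                            mean_squared_error(weights, bias, test_pairs)))
    return weights, bias, history


# Invented toy patterns: input bits -> same bits plus a constant final bit.
train_pairs = [([1, 0, 0], [1, 0, 0, 1]), ([0, 1, 0], [0, 1, 0, 1]),
               ([0, 0, 1], [0, 0, 1, 1]), ([1, 1, 0], [1, 1, 0, 1])]
test_pairs = [([0, 1, 1], [0, 1, 1, 1])]  # a pattern not seen in training

weights, bias, history = train(train_pairs, test_pairs)
for epoch, train_err, test_err in history:
    print(epoch, round(train_err, 4), round(test_err, 4))
```

Each entry in `history` holds the epoch number, the error on the training set, and the error on the unseen pattern, which corresponds to testing the model during and at the end of training.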

8
Training a connectionist model (2)

The model is trained until it has learned the
particular input-output mapping (base form of a
verb/noun → inflected form, e.g., past tense of a
verb or plural form of a noun) to a certain
criterion; what is often examined is the U-shaped
learning curve
To sum up: in order to train a model, the
modeller has to define the network type and
algorithm, the patterns that represent the
mapping, the representational format of the
patterns, and the training regime

9
The relationship between the model and the real
thing (1)

The model is a hypothesis of how particular
mental processes take place
"It is useful here to recall the theoretical
assumptions of the model, namely that children's
overregularization errors can be explained in
terms of their attempt to systematize the
relationship between phonological representations
of verb stems known to them and phonological
representations of the past tense forms known to
them." (Plunkett & Marchman 1996, p. 303, italics
in original.)

10
The relationship between the model and the real
thing (2)

The mapping (e.g., base form → inflected form)
represents the kind of environment under the
influence of which the learning (e.g., of the
English past tense) takes place
Some unanswered questions:
Why start with the base form?
Are there other forms in the environment that may
have an influence on learning? (gerund forms in
English; cf. Finnish: a common past tense and
plural marker -i-)

11
The relationship between the model and the real
thing (3)

In the connectionist literature, the training set
has been interpreted either as a) a (mutatis
mutandis) actual input, or b) an already
interpreted input for the learning model or agent
A problem: it is difficult to conceive of the
training set both as a sample of the actual
learning environment (based on a corpus) and as
to-be-internalized data processed by a putatively
mental mechanism

12
The relationship between the model and the real
thing (4)

One solution is to distinguish between the uptake
(internalized) and the input (actual)
environment
"In essence, the modeller specifies both the
uptake and the input environment in the
assessment of the degree to which absolute token
frequencies influence the saliency of the
training item. As a result, the incidence of low
frequency forms in the uptake environment are
inflated relative to the hypothesized input
environment." (Plunkett & Marchman 1996, p. 303.)
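The effect described in the quotation can be illustrated with a small calculation. The quoted passage does not say which transformation relates input to uptake, so the logarithmic compression below is purely an assumption made for the sake of the example, as are the function names and the verb counts.

```python
import math


def relative(freqs):
    """Normalise a frequency table so that the shares sum to 1."""
    total = sum(freqs.values())
    return {w: f / total for w, f in freqs.items()}


def compress(token_counts):
    """ASSUMPTION: log-compress token counts to obtain uptake weights.
    Any monotone, flattening transform would show the same effect."""
    return {w: math.log(c + 1) for w, c in token_counts.items()}


# Hypothetical token counts standing in for the input environment.
input_counts = {"go": 600, "walk": 300, "sing": 90, "cleave": 10}

input_env = relative(input_counts)             # hypothesized input
uptake_env = relative(compress(input_counts))  # hypothesized uptake

for verb in input_counts:
    print(verb, round(input_env[verb], 3), round(uptake_env[verb], 3))
```

Under this assumed transform the rank order of the items is unchanged, but the share of the rarest item rises and the share of the most frequent item falls, which is the sense in which low-frequency forms are inflated in the uptake environment.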

13
The relationship between the model and the real
thing (5)

The distinction creates a further problem: the
training set is a hypothesis of the data
to-be-internalized by the child, based on a
hypothesis of the actual input data for the
child, which in turn is most often based on the
corpus or other data from which the absolute
token frequencies derive
As a consequence, the choice of the original
corpus has considerable influence on the
composition of the training set and thus on the
possible hypotheses

14
The learning situation redefined (1)

The samples of observations (a corpus, a child
language study, etc.) that are made use of in the
training data, together with other decisions on
the construction of the model, are an example of
(theoretical and contemplative) observer's
knowledge (as defined by Itkonen 2005, p. 187)
A question: can the leap from observer's
knowledge to (practical) agent's knowledge in a
fully trained model be justified, or is it only
stipulated?

15
The learning situation redefined (2)

The leap is not justified if the modeller simply
claims that the learning properties stem from the
model itself - on the contrary, the models have
specifically been trained to accomplish certain
tasks
To interpret a model of language learning as
characterizing agent's knowledge, some notion of
the role of meaning in the formation of that
knowledge should be considered

16
The learning situation redefined (3)

Semantic cues eventually used as a guide for
learning must first be recognized and excluded
from other equivalent cues by the learner,
whereas the modeller has power over which cues to
include in the training data of the model, e.g.,
cues on class membership, gender, etc.
Paying attention to whatever relevant cues there
are requires conceptualization (see, e.g.,
Mandler 2004, p. 188)

17
The learning situation redefined (4)

The question of meaning hardly arises if the
training set is seen as a learning environment
external to the agent
If the set is seen as a hypothesis of salient
to-be-internalized data, the question of meaning
is presupposed (by letting the model have access
to crucial semantic cues) or simply ignored
So far, actual production or comprehension data
have not been used in training the models

18
Towards modelling meaning-induced learning

Instead of seeing the training set as
representing a certain grammatical domain as such
(in the mind) of the learner, it should optimally
focus on a particular setting under which that
domain is active
A comparison between a model accomplishing a
certain task and human performance may call for a
delimitation of the corpus base
What is needed is a theory of pragmatics
compatible with connectionist modelling

19
REFERENCES

Esa Itkonen. 2005. Analogy as structure and
process. John Benjamins, Amsterdam.
Brian MacWhinney, Jared Leinbach, Roman Taraban,
and Janet McDonald. 1989. Language learning: cues
or rules? Journal of Memory and Language 28,
255–277.
Jean Matter Mandler. 2004. The foundations of
mind. Oxford University Press, Oxford.
Steven Pinker and Alan Prince. 1988. On language
and connectionism: analysis of a parallel
distributed processing model of language
acquisition. Cognition 28, 73–193.
Kim Plunkett and Patrick Juola. 1999. A
connectionist model of English past tense and
plural morphology. Cognitive Science 23, 463–490.
Kim Plunkett and Virginia Marchman. 1996.
Learning from a connectionist model of the
acquisition of the English past tense. Cognition
61, 299–308.