Title: A Computational Model for Children's Language Acquisition using ILP
1. A Computational Model for Children's Language Acquisition using ILP
- Koichi Furukawa
- School of Media and Governance, Keio University
2. Difficulties in Vocabulary Acquisition by Children
- Formalization of sensory inputs
  - humans may select appropriate types of attributes by switching context
  - such selection is itself a subject for learning, at the level of perception
- Vastness of the search space
  - Quine's paradox: for one label, there are many possible targets it could refer to
3. An Approach to Overcome the Search Space Problem
- Constraint Theory on Word Meanings: prior hypotheses on word meanings
  - necessary for learning vocabulary while avoiding Quine's paradox
  - like biases in machine learning, essential for learning
4. Biases under the Constraint Theory
- Priority hypothesis on the selection of the type of word meaning
  - Whole Object Bias
- Hypotheses on the categorization of objects
  - Taxonomic Bias,
  - Mutual Exclusivity Bias, and
  - etc.
- Priority hypothesis on the selection of the type of sensory inputs
  - Shape Bias
5. Accommodation of Biases into ILP
- Two types of biases in ILP
  - declarative / procedural
- Each bias from the Constraint Theory does not necessarily correspond to a single declarative or procedural bias.
  - the Shape Bias is implemented by assigning heavier weights to shape-related attributes in evaluation functions
  - the Taxonomic Bias is implemented as a switch to the proper evaluation function depending on the taxonomic position of the concept to be learned
6. Co-evolution between Word Description and Concept Hierarchy
- Word description learning utilizes the concept hierarchy
- Concept hierarchy building utilizes word descriptions
7. Word Learning and Induction
- Applying an inductive logic model raises difficulties in deriving an intensional description of a word
  - poverty of stimuli
    - only positive examples
    - few examples
  - hardness of finding a sufficient set of descriptors (attributes) to explain a concept such as "cat"
  - hardness of finding a proper type of intension
- The Constraint Theory suggests overcoming these problems by bounding the possible hypotheses
8. The Whole Object Bias
- Statement: a child assumes a novel label to refer to the whole of a given related object.
- Analyzed meanings
  - a label first refers to the whole of an object
  - a label does not refer to a part or an attribute of an object
- Implementation: setting goal clauses as explanations of whole objects
9. The Taxonomic Bias
- Statement
  - A child maps a label to a taxonomic category which includes the referred object.
- Analyzed meanings
  - a label refers to a category of objects
  - a label is not limited to referring to a single object
  - a category contains taxonomically similar objects
  - a category does not contain associatively or syntactically similar objects
- Implementation
  - introducing a switch among evaluation functions depending on taxonomic position
10. The Shape Bias
- Statement: a child assumes that objects are similar when their shapes are similar.
- Analyzed meanings
  - a label refers to a category of objects similar in shape
  - shape-related attributes take priority over other types of attributes
- Implementation
  - assigning heavier weights to shape-related attributes in evaluation functions
11. The Principle of Contrast
- Statement: a child assumes that different labels cannot refer to the same category.
- Analyzed meanings
  - different labels refer to different categories
  - yet different labels can refer to the same object
- Implementation: prohibiting the intensional explanations of different labels from being the same
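As a minimal sketch, the prohibition above can be read as a check that rejects a candidate intension when it duplicates another label's. The frozenset-of-tuples encoding of intensions below is a hypothetical simplification for illustration, not the system's actual representation.

```python
# Hedged sketch of the Principle of Contrast: a candidate intension for one
# label is rejected if it is identical to another label's intension.
# Intensions are encoded as frozensets of descriptor tuples (an assumption).

def violates_contrast(intensions, label, candidate_body):
    """True if candidate_body duplicates the intension of a different label."""
    return any(other != label and body == candidate_body
               for other, body in intensions.items())

intensions = {
    "dog": frozenset({("tax", "animation", "animate"),
                      ("attr", "shape", "short_tail")}),
}
duplicate = frozenset({("tax", "animation", "animate"),
                       ("attr", "shape", "short_tail")})
distinct = frozenset({("tax", "animation", "animate"),
                      ("attr", "shape", "barrel")})
```

Note that re-learning the same label with the same body is not a violation; only a *different* label claiming the same intension is rejected.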
12. The Mutual Exclusivity Bias
- Statement: a child assumes that different objects have different labels.
- Analyzed meanings
  - different labels cannot refer to the same object
  - different categories cannot contain the same object
  - a stricter reference limitation than the Principle of Contrast
- Implementation
  - searching for an alternative solution when the intensions of multiple labels explain the same object
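The trigger condition for that search can be sketched as a conflict detector: find the objects that the intensions of two or more labels explain. The coverage mapping (label to set of explained object ids) is an illustrative assumption, not the system's data structure.

```python
# Hedged sketch of the Mutual Exclusivity check: objects explained by more
# than one label are conflicts that would trigger re-induction of an
# alternative hypothesis. The coverage encoding is an assumption.

def exclusivity_conflicts(coverage):
    """Return the set of objects explained by two or more labels."""
    counts = {}
    for objs in coverage.values():
        for obj in objs:
            counts[obj] = counts.get(obj, 0) + 1
    return {obj for obj, n in counts.items() if n > 1}

coverage = {"fork": {"obj051", "obj052"}, "spoon": {"obj052"}}
```

Here obj052 would be flagged, matching the situation in Experiment 1 where the spoon object is initially also explained by the fork intension.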
13. Configuration of the Model
[Figure: system configuration. The Label Acceptor and Sensory Input Acceptor feed labels, General Attributes, and Categorical Classifiers to the Word Learner and Attribute Selector; the Word Learner scores hypotheses with an evaluation function over positive and negative example objects; the Similarity Calculation Module, Weight Learner, and Hierarchy Constructor exchange concepts, weights, similarities, and the supercategory (taxonomic domain).]
14. Modules in the System
- Stimuli acceptor modules
  - Label Acceptor
  - Sensory Input Acceptor
- Word Learner
- Concept hierarchy construction modules
  - Similarity Calculation Module
  - Hierarchy Constructor
- Weight Learner
- Attribute Selector
15. Stimuli Acceptor Modules
- These modules accept stimuli for one current object at the start of each session.
- Label Acceptor
  - accepts a label for the current object
  - the label is given by the teacher
  - under the assumption that a child properly identifies the current object
  - depends on the Whole Object Bias, the Mutual Exclusivity Bias, and the Shape Bias
- Sensory Input Acceptor
  - (the next slide)
16. Stimuli Acceptor Modules
- Sensory Input Acceptor
  - accepts a set of attributes (General Attributes) of the current object
  - General Attributes represent information that sensors perceive directly from the current object
  - also accepts Categorical Classifier(s) for the current object
  - a Categorical Classifier represents the innate ontological class (the Supercategory) to which the current object would belong
  - determining the Categorical Classifier(s) is assumed to be possible for a child learner using pre-language knowledge
17. Word Learner
- A label is assumed to represent a category.
- This module inductively derives an intensional expression for the questioned label.
- Examples (next slide)
- Background knowledge (2 slides later)
- Hypothesis candidates (2 slides later)
- Candidate evaluation function (2 slides later)
18. Word Learner
- Examples: current and past objects in the same supercategory
- Positives
  - the current object
  - objects ever related with the questioned label by the teacher
  - objects referred to by subordinate category(ies) of the questioned label
- Negatives
  - objects never related with the questioned label
  - objects having sufficiently small similarity with the typical instance of the questioned label
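The example-selection rule can be sketched as below. The history encoding (object to the labels the teacher has used for it) and the plain similarity threshold for negatives are illustrative assumptions, and positives drawn from subordinate categories are omitted for brevity.

```python
# Hedged sketch of example selection for the Word Learner: positives are the
# current object plus objects the teacher ever related with the label;
# negatives are unrelated objects with sufficiently small similarity to the
# label's typical instance. Data encodings here are assumptions.

def select_examples(label, current_obj, history, similarity, threshold):
    positives = {current_obj}
    positives |= {o for o, labels in history.items() if label in labels}
    negatives = {o for o, labels in history.items()
                 if label not in labels and similarity(o) < threshold}
    return positives, negatives - positives

history = {"obj051": {"fork"}, "obj052": {"spoon"}}
pos, neg = select_examples("fork", "obj053", history,
                           similarity=lambda o: 0.0, threshold=0.5)
```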
19. Word Learner
- Background knowledge: descriptors (General Attributes and Categorical Classifiers) of the positive example objects
- Hypothesis candidates
  - the head of each clause: the reference of the label
  - the body of each clause: a conjunction of descriptors appearing in positive examples
- Candidate evaluation function
  - the one selected and handed over by the Attribute Selector
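A minimal sketch of candidate generation follows. Restricting bodies to descriptors shared by all positives and capping the conjunction length are simplifying assumptions; a real ILP learner would use a refinement operator over the hypothesis space.

```python
# Hedged sketch of hypothesis-candidate generation: clause bodies are
# conjunctions of descriptors appearing in the positive examples. Only
# descriptors shared by every positive are used here (an assumption).
from itertools import combinations

def candidate_bodies(positive_descriptor_sets, max_literals=2):
    shared = set.intersection(*positive_descriptor_sets)
    bodies = []
    for n in range(1, max_literals + 1):
        bodies.extend(frozenset(c) for c in combinations(sorted(shared), n))
    return bodies

positives = [
    {"animate", "short_tail", "furred"},
    {"animate", "short_tail"},
]
```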
20. Characteristics of Our Word Learner
- Differences from general ILP
  - constructs candidates from few examples
  - through serial sessions, the same concept is revised repeatedly
    - incremented example objects
    - change of the positive and negative sets of objects by hierarchy construction
    - change of the most likely candidate hypothesis by weight learning
21. Two-step Model of Word Learning
- Steps
  - first, depending on the Categorical Classifier, the learner switches into one supercategory
  - second, the learner performs induction within that supercategory
- The merits of two-step learning
  - reduction of the number of General Attributes
  - reduction of the search space by limiting examples
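The two steps can be sketched as a simple dispatch: select the example pool named by the Categorical Classifier, then run induction only over it. The `induce` callback is a stand-in for the Word Learner.

```python
# Hedged sketch of the two-step model: step 1 switches to the supercategory
# named by the Categorical Classifier; step 2 runs induction only on the
# examples inside that supercategory, shrinking the search space.

def two_step_learn(categorical_classifier, examples_by_supercat, induce):
    domain_examples = examples_by_supercat[categorical_classifier]
    return induce(domain_examples)

examples_by_supercat = {
    "animate": ["obj031", "obj032"],
    "inanimate": ["obj051", "obj052"],
}
result = two_step_learn("inanimate", examples_by_supercat,
                        induce=lambda examples: sorted(examples))
```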
22. Concept Hierarchy Construction Modules
- Modules for constructing the Concept Hierarchy, which represents relations between the learner's concepts
- Similarity Calculation Module
  - calculates similarities among concepts
  - also calculates similarities among objects, for use in the calculation among concepts
- Hierarchy Constructor
  - (next slide)
23. Concept Hierarchy Construction Modules
- Hierarchy Constructor
  - decides the relation between two concepts
    - whether they are hierarchically related
    - whether they are mutually exclusive
    - in the case of a hierarchical relation, which concept is the superordinate of the other
  - decides the position of a concept in the hierarchy
  - uses
    - the similarity between the two concepts provided by the Similarity Calculation Module
    - the supercategory (prior taxonomic class) of the two concepts from the Attribute Selector
24. Weight Learner
- There exists one evaluation function for each supercategory.
- An evaluation function has its own weight distribution among types of descriptors.
- This module learns such a distribution for each evaluation function.
- formula of the evaluation function
  - (next slide)
25. Weight Learner
- Formula of the evaluation function
  - the value of a candidate clause for a category within the supercategory CC_i
  - PE, NE: numbers of positive and negative examples
  - CC, w_CC_i: number of appearances of Categorical Classifier(s) in the clause and its coefficient
  - GA_j, w_GA_j: number of appearances of type-j General Attributes and its coefficient
- The Weight Learner learns w_CC_i and w_GA_j.
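The slide lists the ingredients of the function (PE, NE, CC with w_CC, GA_j with w_GA_j) but not their exact combination; the linear score below is therefore an assumption for illustration only.

```python
# Hedged sketch of an evaluation function for candidate clauses: example
# coverage (PE - NE) plus weighted counts of Categorical Classifiers and
# General Attributes. Combining the terms linearly is an assumption.

def clause_value(pe, ne, cc_count, ga_counts, w_cc, w_ga):
    """Score = coverage (PE - NE) plus weighted descriptor counts."""
    score = pe - ne + w_cc * cc_count
    score += sum(w_ga.get(ga_type, 1.0) * n
                 for ga_type, n in ga_counts.items())
    return score

# Shape-related attributes weighted 1.5x, as in the slides' current design.
value = clause_value(pe=3, ne=1, cc_count=1,
                     ga_counts={"shape": 2, "covering": 1},
                     w_cc=2.0, w_ga={"shape": 1.5})
```

The Weight Learner's job, in these terms, is to adjust `w_cc` and the per-type entries of `w_ga` separately for each supercategory.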
26. Attribute Selector
- This module selects the appropriate evaluation function for a category depending on the supercategory of that category, then hands it to the Word Learner.
  - this amounts to switching the learning domain
- This module also hands the Categorical Classifiers of the concerned categories to the Hierarchy Constructor.
27. Realization of the Whole Object Bias
- Limit the goals of induction to intensional explanations for labels only
  - do not allow other types of meaning for labels
28. Realization of the Taxonomic Bias
- Feature of forming concepts that contain taxonomically similar objects
  - the Word Learner's use of ILP naturally employs generalization over intensionally similar example objects
- Restriction of the domain concerned in learning
  - the Attribute Selector switches the domain where word learning and hierarchy construction are executed
29. Realization of the Shape Bias
- According to the Shape Bias, shape-related types of General Attributes are given heavier weights in similarity calculation than the other types.
- Shape-related General Attributes are also given heavier weights (in the current design, 1.5 times) in the evaluation functions employed by the Word Learner.
30. Concept Relation Determination and Biases
- These determinations are implemented in the Hierarchy Constructor.
- The Mutual Exclusivity Bias
  - becomes effective if the two concerned concepts are less similar than a threshold
  - revises both concepts under the condition that no object is allowed to be positive for both
- Otherwise, the Principle of Contrast becomes valid and allows overlapping extensions.
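This decision rule can be stated in a few lines. The 1/6 threshold is the value used in the slides' worked example (slide 35).

```python
# Hedged sketch of the relation decision: below the similarity threshold the
# Mutual Exclusivity Bias applies (no object may be positive for both
# concepts); otherwise the Principle of Contrast allows overlapping
# extensions.

MUTUAL_EXCLUSIVITY_THRESHOLD = 1 / 6  # value used in the slides' example

def relation_policy(similarity, threshold=MUTUAL_EXCLUSIVITY_THRESHOLD):
    return ("mutually_exclusive" if similarity < threshold
            else "overlap_allowed")
```

With this threshold, a concept whose similarity to dog is 1/4 (like label001 later) falls on the overlap-allowed side, which matches the judgement reported in the example.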
31. Ontological Domain and Attribute Relativity
- Discrimination of similar concepts is an important issue in such an application
  - distinguishing cats from dogs is a more difficult problem than distinguishing cats from chairs
- The importance of certain types of sensory inputs differs among domains
  - the way of locomotion is important in animal classification, while it is meaningless in furniture classification
32. Domain-specific Learning and Building the Hierarchy
- Some supercategories are prepared a priori
- The Word Learner acts within one supercategory during a session
- The Hierarchy Constructor constructs a hierarchy in each supercategory
- The Weight Learner learns weight distributions over General Attribute types separately for each supercategory
33. Similarity Measurement
- Similarity between two concepts a, b in supercategory j
  - S_a,CC: set of Categorical Classifiers appearing in the intension of concept (label) a
  - S_a,GA_i: set of type-i General Attributes appearing in the intension of a
  - w_CC, w_GA_i: weights on Categorical Classifiers and type-i General Attributes in supercategory j
  - we set 2, 1.5, and 1 for w_CC, w_shape, and w_other, respectively
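A weighted-overlap sketch of this measure follows, using the slides' weights of 2 for Categorical Classifiers, 1.5 for shape attributes, and 1 for the others. The normalization by the weighted union is an assumption (the extracted slide preserves only the weights), so the sketch reproduces the slides' similarity *ordering* but not necessarily their exact values.

```python
# Hedged sketch of the similarity measure: weighted overlap of the
# descriptor sets of two intensions, normalized by the weighted union
# (the normalization is an assumption).

WEIGHTS = {"cc": 2.0, "shape": 1.5, "other": 1.0}

def weighted_similarity(a, b, weights=WEIGHTS):
    """a, b: sets of (descriptor_type, value) pairs from two intensions."""
    shared = sum(weights[t] for t, _ in a & b)
    union = sum(weights[t] for t, _ in a | b)
    return shared / union if union else 0.0

# Descriptor sets taken from the intensions on the next slide.
dog = {("cc", "animate"), ("shape", "short_tail")}
label001 = {("cc", "animate"), ("shape", "hanging_ears"),
            ("other", "furred")}
label003 = {("cc", "animate"), ("shape", "short_tail"),
            ("other", "furred")}
```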
34. Example of Similarity Measurement
- Relative similarity between dog and each of label001-label003
- Shape-type values take priority over the others
  - label001 is least typical as a dog because its intension shares no shape-type value with dog's
  - label002 is less typical than label003 as a dog because its intension has a shape-type value which dog's does not have
- Intension of dog
  - labeling(dog, A) :-
      tax(A, animation, animate),
      attr(A, shape, short_tail).
- Derived intensions of newly introduced labels
  - labeling(label001, A) :-
      tax(A, animation, animate),
      attr(A, shape, hanging_ears),
      attr(A, covering, furred).
  - labeling(label002, A) :-
      tax(A, animation, animate),
      attr(A, shape, short_tail),
      attr(A, shape, hanging_ears).
  - labeling(label003, A) :-
      tax(A, animation, animate),
      attr(A, shape, short_tail),
      attr(A, covering, furred).
35. Example of Concept Relation Judgement
- label001 (similarity with dog: 1/4) is allowed to share the same object with dog, since we set the threshold of mutual exclusivity at 1/6. It would be exclusive with dog if we set the threshold larger than 1/4.
- The other labels are even more likely to relate hierarchically with dog.
- (intensions of dog and label001-label003 as on the previous slide)
36. Co-evolution between Concepts and the Hierarchy
- Services of the concept hierarchy for word learning
  - enables domain-specific learning
    - can switch the relativity of General Attributes
    - can use only near-miss negative examples
    - can reduce the search space
  - provides evidence for example selection
- Service of concept descriptions for hierarchy construction
  - determination of the relation between concepts depends on their intensions
37. Concept Hierarchy Construction
- Unsupervised
  - there is no way to verify that the structure is correct (e.g., the same as the teacher's)
- Possible trigger of construction
  - judgement of the hierarchical relation of concepts
38. Virtual Experiments
- Two experiments within different supercategories
  - one uses tableware as examples
  - the other uses animals
- We intend to find differences in the important General Attribute types between them
39. Experiment 1: Tableware
- Objects: a fork and a spoon
  - almost the same length
  - made of the same material
- Input to the Label Acceptor
  - the teacher associates an object obj051 with a category named fork
  - asserts which labeling is the latest
  - first introduction of a label:
    labeling(fork, obj051, 1). newest_labeling(1).
40. Sensory Inputs on an Object
- tax(O, C, Sc) means that a given object O belongs to supercategory Sc under classification C.
- attr(O, A, AV) means that an object O has value AV for attribute A.
- subobj(P, O) means that P is a convex-shaped part of an object (or part of an object) O.
- connection(P1, D1, P2, D2) means that two points p1 and p2 are in contact with each other, where pi is on the surface of Pi in the direction Di from the center of Pi.

tax(obj051, animation, inanimate).
attr(obj051, color, having_reflection).
attr(obj051, color, shining).
attr(obj051, shape, constant).
attr(obj051, direction, y_axis).
subobj(obj051a, obj051).
subobj(obj051b, obj051).
connection(obj051a, y_minus, obj051b, y_plus).

[Figure: obj051 with parts a and b stacked along the y-axis.]
41. Sensory Inputs on the Object

attr(obj051b, direction, y_axis).
attr(obj051b, shape, board).
attr(obj051b, shape, x_axis/y_plus, y_minus).
subobj(obj051a1, obj051a).
subobj(obj051a2, obj051a).
subobj(obj051a3, obj051a).
subobj(obj051a4, obj051a).
subobj(obj051a5, obj051a).
connection(obj051a1, x_minus_y_plus, obj051a2, y_minus).
connection(obj051a1, y_plus, obj051a3, y_minus).
connection(obj051a1, y_plus, obj051a4, y_minus).
connection(obj051a1, x_plus_y_plus, obj051a5, y_minus).

[Figure: parts obj051a1, obj051a2, obj051a5, and obj051b in the x-y plane.]
42. Sensory Inputs on the Object

attr(obj051a2, direction, y_axis).
attr(obj051a2, shape, pyramid).
attr(obj051a3, direction, y_axis).
attr(obj051a3, shape, pyramid).
attr(obj051a4, direction, y_axis).
attr(obj051a4, shape, pyramid).
attr(obj051a5, direction, y_axis).
attr(obj051a5, shape, pyramid).

[Figure: parts obj051a1, obj051a2, and obj051a5 in the x-y plane.]
- After giving the information above, we let the learner start learning.
- Then, we gave the information on a spoon (obj052) and let it start learning again.
43. Result of Experiment 1
- First, the learner induces spoon, taking obj052 as a positive and obj051 as a negative (row 2). Row 1 is fork's intension, unchanged.
- Then it checks the label(s) that refer to the current object obj052 (rows 3-4).
- Currently, obj052 is assumed to be negative for fork (row 5), so fork is re-induced and row 7 is the revision.
- Checking again completes this session (rows 8-9).

- Output during learning after giving the spoon example
  - (1) labeling(fork, A) :- tax(A, animation, inanimate).
  - (2) labeling(spoon, A) :- tax(A, animation, inanimate),
        subobj(B, A), attr(B, shape, oval_semisphere).
  - (3) obj052 is a fork.
  - (4) obj052 is a spoon.
  - (5) Need induction for fork.
  - (6) labeling(spoon, A) :- tax(A, animation, inanimate),
        subobj(B, A), attr(B, shape, oval_semisphere).
  - (7) labeling(fork, A) :- tax(A, animation, inanimate),
        subobj(B, A), subobj(C, B),
        attr(C, shape, pyramid).
  - (8) obj052 is a spoon.
  - (9) Need induction for .
44. Experiment 2: Terrestrial Mammals
- In this experiment too, the learner starts as a complete novice, as in the preceding experiment.
- Objects: a cat and a dog
  - They have different postures.
    - the cat is lying down
    - the dog is standing
45. Result of Experiment 2
- Similar process to Experiment 1

labeling(cat, A) :- tax(A, animation, animate).
labeling(dog, A) :- tax(A, animation, animate),
    subobj(B, A), attr(B, direction, z_axis).
obj032 is a cat.
obj032 is a dog.
Need induction for cat.
labeling(dog, A) :- tax(A, animation, animate),
    subobj(B, A), attr(B, direction, z_axis).
labeling(cat, A) :- tax(A, animation, animate),
    subobj(B, A), attr(B, shape, barrel).
obj032 is a dog.
Need induction for .
46. Comparison of the Experiments

labeling(spoon, A) :- tax(A, animation, inanimate),
    subobj(B, A), attr(B, shape, oval_semisphere).
labeling(fork, A) :- tax(A, animation, inanimate),
    subobj(B, A), subobj(C, B), attr(C, shape, pyramid).
labeling(dog, A) :- tax(A, animation, animate),
    subobj(B, A), attr(B, direction, z_axis).
labeling(cat, A) :- tax(A, animation, animate),
    subobj(B, A), attr(B, shape, barrel).

- In both cases, mutual exclusivity between the two concepts appears.
- The shape of parts makes the difference within the silverware domain.
- Between the animals, a difference in posture appears.
47. Conclusions from the Experiments
- It is difficult to find an outstanding difference between the relevant General Attributes of each domain.
  - all General Attributes remaining in the intensions are of shape-related types
- However, because of the high similarity of the example objects, the silverware concepts are explained by the same type of attributes (shape of parts).
  - the shape of parts can be the relevant attribute for discrimination within this domain
- It is necessary to increase the number of example scenes to find the relevant attributes, especially for the animal domain.
48. Conclusion
- Biases under the Constraint Theory of Word Meaning can be realized in ILP.
  - weight distributions among the types of General Attributes
  - switching such distributions depending on the ontological domain
- The effects of the biases were confirmed to appear in the experiments.
49. Future Work
- We expect to use the differences in relevant attributes for weight revision by the Weight Learner.
- For this purpose, we would like to refine the sensory input to the learner.
  - how can the learner deal with the locomotion of animate objects?
  - a more precise structure of shape-related attributes
  - what attribute types can exist other than shape-related ones?