Title: A Learning architecture for autonomous robots based on concept generation
1A Learning architecture for autonomous robots
based on concept generation
- Pejman Iravani
- (Supervisors: Lucia Rapanotti, Jeff Johnson)
- The Open University
- 02-March-2005
2Outline
- Introduction
- Robotic Architectures
- Learning and Adaptation
- Concepts and Hierarchical representation
- Architecture based on concept generation
- Experimental analysis
- Summary and conclusions
3Introduction
Introduction | Robotic architectures | Learning and adaptation | Concepts and hierarchical rep | Architecture based on concepts | Experimental analysis
4Introduction
- Robots are designed to be...
- Task oriented: we want robots to do things.
- Autonomous: we want robots to be independent.
- Robust: not having to rescue them.
- Flexible: not having to reprogram them.
5Introduction
- Robots must be adaptable to be autonomous.
- If a robot can adapt its behaviour to overcome failures, it becomes robust.
- If a robot can adapt its behaviour to changes in the task or the environment, it becomes more flexible.
- Robots can exploit machine learning techniques to adapt.
6Robotic Architectures
7Robotic Architectures
- Deliberative or Symbolic Architecture
- Based on traditional AI approach
8Robotic Architectures
- Deliberative or Symbolic Architecture
- Symbol Generation
in(ball, field), in(goal, field), aligned(goal, ball, robot), behind(robot, ball), etc.
Symbol grounding problem: how do symbols get their meaning from the real world? (Harnad 1990)
9Robotic Architectures
- Deliberative or Symbolic Architecture
- Symbol Manipulation
PLAN: go(initial, p_ball), get(ball), go(p_ball, p_goal), release(ball)
10Robotic Architectures
- Deliberative or Symbolic Architecture
- Symbol Manipulation
PLAN: go(initial, p_ball), get(ball), go(p_ball, p_goal), release(ball)
Frame problem: representing the effects of actions (models) without having to represent a large number of obvious non-effects.
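The plan on this slide can be sketched in code. The following is only an illustration, not the planner discussed in the talk: it assumes STRIPS-style actions with explicit add and delete lists, so any fact an action does not mention is taken to persist. The fact names (`at`, `holding`), the `ACTIONS` table and the helper `apply_plan` are hypothetical.

```python
# Hypothetical STRIPS-style encoding of the slide's plan.
# Each action lists only the facts it adds and deletes; every other fact
# is assumed to persist, so obvious non-effects are never written down.
ACTIONS = {
    "go(initial,p_ball)": {"add": {"at(robot,p_ball)"}, "delete": {"at(robot,initial)"}},
    "get(ball)": {"add": {"holding(ball)"}, "delete": {"at(ball,p_ball)"}},
    "go(p_ball,p_goal)": {"add": {"at(robot,p_goal)"}, "delete": {"at(robot,p_ball)"}},
    "release(ball)": {"add": {"at(ball,p_goal)"}, "delete": {"holding(ball)"}},
}

PLAN = ["go(initial,p_ball)", "get(ball)", "go(p_ball,p_goal)", "release(ball)"]


def apply_plan(state, plan):
    """Apply each action in order: remove its delete list, then add its add list."""
    state = set(state)
    for name in plan:
        action = ACTIONS[name]
        state = (state - action["delete"]) | action["add"]
    return state


initial_state = {"at(robot,initial)", "at(ball,p_ball)", "in(goal,field)"}
final_state = apply_plan(initial_state, PLAN)
```

Note that `in(goal,field)` survives the whole plan without any action having to state that it is unaffected.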
11Robotic Architectures
- Deliberative or Symbolic Architecture
- The environment and the robot's body are dismissed as sources of intelligent behaviour
12Robotic Architectures
- Reactive or Embodied Architecture
- Based on nouvelle AI approach
- Symbolic representations are not needed
- The environment itself is the best model of the environment
- Simple sensor-to-motor mapping (stimulus-response)
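A sensor-to-motor mapping of this kind can be sketched as a single stateless function. This is an assumed example, not a controller from the talk: the two proximity sensors, the wheel-power scale, and the parameters `base_power` and `gain` are all hypothetical.

```python
def reactive_step(left_obstacle, right_obstacle, base_power=50.0, gain=40.0):
    """Map two obstacle-proximity readings in [0, 1] to (left, right) wheel power.

    Nothing is stored between calls: the output depends only on the
    current sensor readings (stimulus-response).
    """
    # An obstacle on one side speeds up the wheel on that side and slows
    # the other, which steers the robot away from the obstacle.
    left_power = base_power + gain * left_obstacle - gain * right_obstacle
    right_power = base_power + gain * right_obstacle - gain * left_obstacle
    return left_power, right_power
```

With no obstacle the robot drives straight; with an obstacle on the left, the left wheel runs faster than the right and the robot turns right.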
13Robotic Architectures
- Reactive or Embodied Architecture
- Behaviours drive the robot
- No symbols: complex behaviours, such as communication, need symbols.
- No models: goal-directed behaviour must be encoded in the robot's behaviours.
14Robotic Architectures
- Reactive or Embodied Architecture
- The robot-environment interaction is essential for intelligent behaviour
15Robotic Architectures
- Hybrid Architectures
- Combination of the previous approaches
- The mediator layer is very difficult to implement, and it is usually better to use only a reactive architecture.
16Robotic Architectures
17Learning and Adaptation
- Reinforcement Learning...
18Learning and Adaptation
- Reinforcement Learning...
- What to learn?
- Learn a way of acting that maximises the rewards received over time.
- The way of acting is known as a policy.
19Learning and Adaptation
- Reinforcement Learning...
- A policy, π, indicates, in a probabilistic manner, which action, a, the robot should take given the state, s.
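A probabilistic policy of this kind can be sketched as a table of action probabilities per state. The state and action names and the probabilities below are invented for illustration; only the idea of sampling an action according to the policy comes from the slide.

```python
import random

# Hypothetical policy table: policy[s][a] is the probability of taking
# action a when the robot is in state s. Each row sums to 1.
policy = {
    "ball_far": {"approach": 0.9, "kick": 0.1},
    "ball_near": {"approach": 0.2, "kick": 0.8},
}


def sample_action(policy, state, rng=random):
    """Draw one action for `state` according to the policy's probabilities."""
    actions = list(policy[state])
    weights = [policy[state][a] for a in actions]
    return rng.choices(actions, weights=weights, k=1)[0]
```

Calling `sample_action(policy, "ball_near")` returns `"kick"` most of the time, but not always, which is what "in a probabilistic manner" amounts to here.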
20Learning and Adaptation
- Reinforcement Learning...
- A well known problem in reinforcement learning is
the so-called curse of dimensionality.
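For binary dimensions $x_i \in \{0, 1\}$:
$s = (x_1)$: $\{s_0, s_1\}$, $|s| = 2$
$s = (x_1, x_2)$: $\{s_{0,0}, s_{0,1}, s_{1,0}, s_{1,1}\}$, $|s| = 4$
$s = (x_1, x_2, x_3)$: $\{s_{0,0,0}, s_{0,0,1}, \ldots, s_{1,1,1}\}$, $|s| = 8$
$s = (x_1, \ldots, x_n)$: $\{s_{0,0,\ldots,0}, s_{0,0,\ldots,1}, \ldots, s_{1,1,\ldots,1}\}$, $|s| = 2^n$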
21Learning and Adaptation
- Reinforcement Learning...
- A well known problem in reinforcement learning is
the so-called curse of dimensionality.
- Curse of dimensionality: the exponential growth of a hyperspace (e.g. robot states) as a function of its dimensions (e.g. robot sensors) requires the computational power to also grow exponentially (Bellman 1957).
- For example, to represent a policy, a robot will need to increase its memory exponentially.
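The memory growth mentioned above can be made concrete with a small calculation. This sketch assumes a tabular policy with one entry per state-action pair and sensors that all take the same number of discrete values; the function names are hypothetical.

```python
def num_states(values_per_dimension, dimensions):
    """Number of distinct states when every dimension has the same number of values."""
    return values_per_dimension ** dimensions


def policy_table_entries(values_per_dimension, dimensions, num_actions):
    """Entries needed to store one value per (state, action) pair."""
    return num_states(values_per_dimension, dimensions) * num_actions


# Binary sensors: each added sensor doubles the number of states.
sizes = [num_states(2, n) for n in (1, 2, 3, 10, 20)]
```

Here `sizes` is `[2, 4, 8, 1024, 1048576]`: going from 10 to 20 binary sensors multiplies the table by 1024.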
22Learning and Adaptation
Generalisation: given a large state or action space, produce a more compact representation of it. For example, from 1000 states, represent only 10.
Dimension filtering: from all the possible dimensions, select a subset of relevant ones. For example, from 10 sensors use only 5 relevant ones.
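The two examples above (1000 states to 10, and 10 sensors to 5) can be sketched as follows. The grouping rule (contiguous blocks of 100 states) and the choice of relevant sensor indices are assumptions made for illustration; the slide does not say how either is chosen.

```python
def generalise(state_index, states_per_group=100):
    """Map one of 1000 raw states (0..999) to one of 10 groups (0..9)."""
    return state_index // states_per_group


def filter_dimensions(reading, relevant):
    """Keep only the sensor dimensions whose indices are listed in `relevant`."""
    return tuple(reading[i] for i in relevant)


# Hypothetical 10-sensor reading, of which only five sensors are kept.
reading = (0.1, 0.7, 0.3, 0.9, 0.5, 0.2, 0.8, 0.4, 0.6, 0.0)
filtered = filter_dimensions(reading, relevant=[0, 2, 4, 6, 8])
```

Both steps shrink the table a learner has to fill: the first merges states, the second removes whole dimensions before states are even formed.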
- Summary
- Reinforcement learning is a simple framework, based on the definition of states, actions and rewards, that allows robots to adapt autonomously to their environment.
- Reinforcement learning suffers from the so-called curse of dimensionality.
- To alleviate this problem, generalisation and dimension filtering are necessary.
23Concepts and hierarchical representation
- Concepts are...
- Classes composed of primitives.
By definition, concepts are generalisations of their primitives. Concepts can be used to represent a robot's state and action spaces in a generalised manner.
Example primitives: car, bus, plane, bicycle, train
24Concepts and hierarchical representation
- Concepts can be represented on a multi-level
hierarchy...
Level N+1
Level N
25Concepts and hierarchical representation
- The hierarchy can have increasing levels of
description...
Figure: concept hierarchy over Levels N, N+1 and N+2, with the labels transport vehicle, car, train, bus, plane and bicycle.
26Concepts and hierarchical representation
- Two main types of concepts can be distinguished
Generalisation concept: any one of the primitives is sufficient for the concept to be formed.
Relational concept: all the primitives, and a special relation between them, are necessary for the concept to be formed.
Example primitives: car, train, bus, plane, bicycle
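The difference between the two concept types can be sketched as an "any" test versus an "all plus relation" test. The function names, the `observed` dictionary and the example relation (ball lying between robot and goal on a line) are hypothetical and only meant to make the two definitions concrete.

```python
def generalisation_concept(observed, primitives):
    """Formed when any one of the primitives is present."""
    return any(p in observed for p in primitives)


def relational_concept(observed, primitives, relation):
    """Formed only when all primitives are present and the relation holds."""
    return all(p in observed for p in primitives) and relation(observed)


# Generalisation: seeing a bus is enough to form "transport vehicle".
vehicles = ["car", "train", "bus", "plane", "bicycle"]
is_vehicle = generalisation_concept({"bus": 1}, vehicles)

# Relational: robot, ball and goal must all be present AND be in order
# along a line (positions are invented 1-D coordinates).
def in_line(o):
    return o["robot"] < o["ball"] < o["goal"]

aligned = relational_concept({"robot": 0.0, "ball": 2.0, "goal": 5.0},
                             ["robot", "ball", "goal"], in_line)
not_aligned = relational_concept({"robot": 3.0, "ball": 2.0, "goal": 5.0},
                                 ["robot", "ball", "goal"], in_line)
missing = relational_concept({"robot": 0.0, "ball": 2.0},
                             ["robot", "ball", "goal"], in_line)
```

In the last call the relation is never evaluated, because one primitive is missing and the `all(...)` check fails first.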
27Concepts and hierarchical representation
- Hypothesis
- Generalisation and relational concepts can be used to reduce the state and action spaces in robotic systems.
- Both types of concepts can be used to learn robotic behaviours and to control robots.
- A robotic architecture can be developed that exploits the definition of concepts.
28Concepts and hierarchical representation
- Summary
- Concepts have been defined as general descriptions of some primitives.
- Two different types of concepts have been presented, namely generalisation and relational concepts.
- It has been hypothesised that both types of concepts can be integrated in a robotic architecture capable of learning behaviours and controlling robots.
29Architecture based on concepts
- The proposed architecture operates as follows...
- Robot interacts with environment, producing sensor and motor data
- Concept generation, producing a hierarchy of concepts
- Behaviour learning, producing behaviours
- Robot control
30Architecture based on concepts
- Robot interacts with the environment
- As the robot interacts with the environment, it collects data from its sensors and motors.
31Architecture based on concepts
- Concept generation
- Sensor and motor data can be aggregated into state and action concepts, denoted sc and ac.
Concepts are grounded: the concepts created in this manner are grounded in sensor and motor data. This means that all concepts can ultimately be interpreted as sensor and motor data, so there is no symbol grounding problem.
Figure labels: generalisation concepts, relational concepts
32Architecture based on concepts
- Behaviour learning
- In reinforcement learning, a behaviour or policy is a function that maps states to the probability of selecting an action. In our architecture, behaviours are represented by a policy that maps state concepts to the probability of selecting action concepts.
33Architecture based on concepts
- Robot control
- A policy controls a robot by observing the state concept from the environment and selecting the action concept with the highest probability of occurrence.
- So, action selection is stimulus-response.
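The control step described above amounts to a table lookup followed by picking the most probable entry. The sketch below assumes the policy is stored as a nested dictionary; the concept labels and probabilities are invented, and `select_action_concept` is a hypothetical helper name.

```python
# Hypothetical policy over concepts: policy[sc][ac] is the probability of
# selecting action concept ac when state concept sc is observed.
policy = {
    "sc_target_left": {"ac_turn_left": 0.7, "ac_forward": 0.2, "ac_turn_right": 0.1},
    "sc_target_ahead": {"ac_turn_left": 0.1, "ac_forward": 0.8, "ac_turn_right": 0.1},
}


def select_action_concept(policy, state_concept):
    """Return the action concept with the highest probability for this state concept."""
    probabilities = policy[state_concept]
    return max(probabilities, key=probabilities.get)
```

No planning or lookahead is involved: the observed state concept alone determines the response.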
34Architecture based on concepts
- Summary
- An architecture has been presented that creates and uses concepts.
- Concepts are created from sensor and motor data, so they are grounded.
- Behaviour learning uses state and action concepts. That is, it uses the generalised state and action spaces.
35Experimental analysis
- Low-level behaviour learning using generalisation concepts.
- Behaviour to learn: navigation from A to B.
- State space: distance to B and deviation to B, $100 \times 360$
- Action space: power of right and left wheel, $100 \times 100$
- The total state-action space is $100 \times 360 \times 100 \times 100 = 36 \times 10^7$
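The size of the raw state-action space follows directly from the four ranges on this slide. The variable names below are just labels for those ranges.

```python
# Sizes taken from the slide: 100 distance values, 360 deviation values,
# and 100 power levels for each of the two wheels.
state_space = 100 * 360          # distance to B x deviation to B
action_space = 100 * 100         # right wheel power x left wheel power
total = state_space * action_space
```

This gives `total == 360_000_000`, that is, 36 x 10^7 state-action pairs before any generalisation.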
36Experimental analysis
- To acquire generalisation concepts
- Data is acquired by a hand-coded robot.
- Clustering techniques are applied to the data.
- The resulting state-action space has a size of $10 \times 10 = 100$
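The slides do not say which clustering technique was used. As one possible reading, the sketch below uses a plain one-dimensional k-means to turn raw sensor values into a handful of concept labels; the algorithm choice, the toy data and the function names are all assumptions.

```python
def kmeans_1d(values, k, iterations=20):
    """Plain 1-D k-means; returns the cluster centres in ascending order."""
    ordered = sorted(values)
    # Deterministic initialisation: spread initial centres over the sorted data.
    centres = [ordered[(2 * i + 1) * len(ordered) // (2 * k)] for i in range(k)]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda j: abs(v - centres[j]))
            clusters[nearest].append(v)
        # Keep the previous centre if a cluster ends up empty.
        centres = [sum(c) / len(c) if c else centres[j]
                   for j, c in enumerate(clusters)]
    return sorted(centres)


def to_concept(value, centres):
    """Index of the nearest centre, used as the concept label for `value`."""
    return min(range(len(centres)), key=lambda j: abs(value - centres[j]))


# Toy "distance to B" samples, standing in for data logged by a hand-coded robot.
distances = [1, 2, 3, 50, 51, 52, 97, 98, 99]
centres = kmeans_1d(distances, k=3)
```

With ten centres per space instead of three, the same idea would give the 10 state concepts and 10 action concepts implied by the slide's 10 x 10 figure.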
37Experimental analysis
- Results obtained by a robot using a learned
behaviour versus the hand-coded robot.
38Experimental analysis
- Higher-level behaviour learning using relational concepts.
Passing behaviour: who to pass to?
39Experimental analysis
- State representation of successful passes...
neighbour is team-mate
neighbour is opponent
40Experimental analysis
- State representation as an incidence matrix
            C1  C2  C3  C4  C5  C6
Triangle 1   1   1   1   1   0   1   0   1
Triangle 2   1   1   1   0   0   1   0   0
Triangle 3   1   1   1   1   0   1   1   1
Triangle 4   0   0   0   1   1   1   0   1
Triangle 5   1   0   1   1   1   1   0   0
Triangle 6   0   1   0   1   1   1   0   1
41Experimental analysis
- Q-analysis methods are used to find shared columns in the passes.
            C1  C2  C3  C4  C5  C6
Triangle 1   1   1   1   1   0   1   0   1
Triangle 2   1   1   1   0   0   1   0   0
Triangle 3   1   1   1   1   0   1   1   1
Triangle 4   0   0   0   1   1   1   0   1
Triangle 5   1   0   1   1   1   1   0   0
Triangle 6   0   1   0   1   1   1   0   1
42Experimental analysis
- Relational concepts of good-passing configurations are defined
            C1  C2  C3  C4  C5  C6
Triangle 1   1   1   1   1   0   1
Triangle 2   1   1   1   0   0   1
Triangle 3   1   1   1   1   0   1
Triangle 4   0   0   0   1   1   1
Triangle 5   1   0   1   1   1   1
Triangle 6   0   1   0   1   1   1
C1 and C2 and C3
C4 and C5 and C6
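The two column groups on this slide can be checked against the six-column table with a simple lookup: which passes have a 1 in every column of a group. This is only a consistency check over the incidence matrix, not a full Q-analysis, and the names `PASSES` and `rows_sharing` are hypothetical.

```python
COLUMNS = ["C1", "C2", "C3", "C4", "C5", "C6"]

# Incidence matrix as shown on the slide: one row per successful pass.
PASSES = {
    "Triangle 1": [1, 1, 1, 1, 0, 1],
    "Triangle 2": [1, 1, 1, 0, 0, 1],
    "Triangle 3": [1, 1, 1, 1, 0, 1],
    "Triangle 4": [0, 0, 0, 1, 1, 1],
    "Triangle 5": [1, 0, 1, 1, 1, 1],
    "Triangle 6": [0, 1, 0, 1, 1, 1],
}


def rows_sharing(columns):
    """Names of the rows that have a 1 in every one of the given columns."""
    indices = [COLUMNS.index(c) for c in columns]
    return [name for name, row in PASSES.items()
            if all(row[i] == 1 for i in indices)]


group_a = rows_sharing(["C1", "C2", "C3"])
group_b = rows_sharing(["C4", "C5", "C6"])
```

In this table, `group_a` is Triangles 1 to 3 and `group_b` is Triangles 4 to 6, so each column group picks out exactly one half of the passes.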
43Experimental analysis
- Control using relational concepts...
if (C1 and C2 and C3) then pass
if (C4 and C5 and C6) then pass
- The method was tried using data from the RoboCup simulation league, providing positive results (Iravani et al. 2005).
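The two pass rules can be written as one boolean function. The sketch assumes the current configuration is given as a dictionary from column name to a truth value; how C1 to C6 are computed from the game state is not covered here, and `should_pass` is a hypothetical name.

```python
def should_pass(c):
    """Apply the two relational-concept rules to a configuration.

    `c` maps the column names 'C1'..'C6' to booleans (or 0/1 values).
    """
    rule_a = c["C1"] and c["C2"] and c["C3"]
    rule_b = c["C4"] and c["C5"] and c["C6"]
    return bool(rule_a or rule_b)


# Example configurations; the first one matches the second rule only.
matches_rule_b = dict(zip(["C1", "C2", "C3", "C4", "C5", "C6"], [0, 0, 0, 1, 1, 1]))
matches_nothing = dict(zip(["C1", "C2", "C3", "C4", "C5", "C6"], [1, 0, 0, 1, 0, 0]))
```

As with the policy-based control earlier in the talk, the decision is a direct response to the observed configuration.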
44Summary
- A review of existing architectures indicated that reactive/embodied architectures are the most appropriate in robotics.
- Reinforcement learning was proposed as a method to improve autonomy in robots.
- Concepts were introduced as generalised representations that help reduce dimensionality problems in behaviour learning.
- An architecture based on concept generation was presented.
45Conclusions
- It is possible to acquire generalisation concepts from sensor and motor data.
- It is possible to acquire relational concepts from sensor and motor data.
- A multilevel architecture can be defined that uses concepts for behaviour learning and robot control.
- The architecture acquires concepts (symbols) from the robot-environment interaction (grounded).
- The architecture uses concepts (symbols) in a reactive manner.
46Last remark
- Not all robots are good...
47The End
Thanks for your attention