Learning to Coordinate Behaviors

Slides: 18
Provided by: JavierM
Learn more at: https://www.cse.unr.edu

Transcript and Presenter's Notes

1
Learning to Coordinate Behaviors
  • Pattie Maes and Rodney A. Brooks
  • Presented by Javier Martinez

2
Introduction
  • Behavior-based system
  • Learning using positive and negative feedback
  • Behaviors decide when it is time to activate
  • Distributed algorithm
  • Test the concept in a robot

3
Motivation
  • Behavior control is a weak point of early
    behavior-based systems
  • Behavior control has to be prewired
  • This approach doesn't scale well

4
New Ideas
  • Behavior control is learned through experience
  • Learning algorithm completely distributed
  • Each behavior learns when to become active
  • The solution maximizes positive feedback and
    minimizes negative feedback

5
The Learning Task
  • What is needed
  • Vector of binary perceptual conditions
  • Set of behaviors
  • Positive feedback generator
  • Negative feedback generator
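
The four ingredients above can be pictured as a small data structure. This is only a sketch under assumptions: the names `Behavior`, `LearningTask`, and `is_applicable`, the representation of a precondition list as a mapping from condition index to required value, and the example behavior are all hypothetical and do not come from the slides.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Vector of binary perceptual conditions, e.g. [leg_up, moving_forward, belly_down].
Conditions = List[bool]


@dataclass
class Behavior:
    name: str
    # Assumed representation: condition index -> value required for activation.
    preconditions: Dict[int, bool] = field(default_factory=dict)

    def is_applicable(self, conditions: Conditions) -> bool:
        """True when every listed precondition matches the current conditions."""
        return all(conditions[i] == want for i, want in self.preconditions.items())


@dataclass
class LearningTask:
    behaviors: List[Behavior]
    # Feedback generators map the current conditions to "feedback happened".
    positive_feedback: Callable[[Conditions], bool]
    negative_feedback: Callable[[Conditions], bool]


# Hypothetical example: one behavior, feedback read from conditions 1 and 2.
task = LearningTask(
    behaviors=[Behavior("swing-leg-forward", {0: True})],
    positive_feedback=lambda c: c[1],
    negative_feedback=lambda c: c[2],
)
```

A behavior with an empty precondition list is always applicable in this sketch, which matches the idea that learning starts from unconstrained behaviors and adds preconditions over time.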

6
The Learning Task
  • The task
  • Change the precondition list of each behavior
    to maximize relevance and reliability

7
The Learning Task
  • Constraints
  • Relevance: a behavior is correlated with positive
    feedback and not correlated with negative feedback
  • Reliability: a behavior receives consistent feedback

8
The Learning Task
  • More constraints
  • Algorithm should deal with noise,
  • Perform in real time,
  • Support readaptation

9
The Learning Task
  • Assumptions
  • At least one combination of preconditions is
    bounded
  • Feedback is immediate
  • Only combinations of conditions can be learned

10
Algorithm
  • Measure
  • Number of times a positive/negative feedback
    did/didn't happen when a behavior was/wasn't
    active
  • Calculate the correlation between
    positive/negative feedback and the status of the
    behavior
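
The four counts described above form a 2x2 table, and one standard way to turn such a table into a correlation is the phi coefficient (the Pearson correlation of two binary variables). The slides do not give the formula, so treat the choice of phi, the class name `FeedbackCounts`, and its field names as assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class FeedbackCounts:
    """2x2 table for one behavior and one kind of feedback (positive or negative)."""
    active_fb: float = 0.0      # behavior active, feedback happened
    active_no_fb: float = 0.0   # behavior active, feedback did not happen
    idle_fb: float = 0.0        # behavior not active, feedback happened
    idle_no_fb: float = 0.0     # behavior not active, feedback did not happen

    def record(self, active: bool, feedback: bool) -> None:
        """Update the table with one observation."""
        if active and feedback:
            self.active_fb += 1
        elif active:
            self.active_no_fb += 1
        elif feedback:
            self.idle_fb += 1
        else:
            self.idle_no_fb += 1

    def correlation(self) -> float:
        """Phi coefficient in [-1, 1]; 0.0 when a row or column is still empty."""
        j, k, l, m = self.active_fb, self.active_no_fb, self.idle_fb, self.idle_no_fb
        denom = math.sqrt((j + k) * (l + m) * (j + l) * (k + m))
        if denom == 0:
            return 0.0
        return (j * m - k * l) / denom
```

Each behavior would keep two such tables, one for positive and one for negative feedback, and update them at every time step.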

11
Algorithm
  • Measure
  • Express relevance and reliability in terms of
    this correlation
  • Relevance controls whether a behavior should be
    active or not
  • Reliability decides whether the behavior should
    try to improve itself
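
One plausible way to express the two measures is sketched here. The exact formulas are not on the slides, so the definitions used (relevance as the difference of the two correlations, reliability as how consistent the feedback is while the behavior is active) and both thresholds are assumptions, as are all function names.

```python
from typing import Tuple

# (times feedback happened, times it did not happen) while the behavior was active
ActiveCounts = Tuple[float, float]


def relevance(corr_positive: float, corr_negative: float) -> float:
    """High when the behavior correlates with positive, not negative, feedback."""
    return corr_positive - corr_negative


def consistency(counts: ActiveCounts) -> float:
    """Fraction of active steps that produced the more frequent outcome."""
    happened, did_not = counts
    total = happened + did_not
    if total == 0:
        return 0.0  # no evidence yet
    return max(happened, did_not) / total


def reliability(pos_active: ActiveCounts, neg_active: ActiveCounts) -> float:
    """The behavior is only as reliable as its least consistent feedback."""
    return min(consistency(pos_active), consistency(neg_active))


def wants_to_be_active(rel: float, threshold: float = 0.0) -> bool:
    # Hypothetical threshold: relevance controls activation.
    return rel > threshold


def needs_improvement(reliab: float, threshold: float = 0.9) -> bool:
    # Hypothetical threshold: unreliable behaviors keep refining preconditions.
    return reliab < threshold
```

For example, a behavior that received positive feedback on 5 of 10 active steps has a consistency of 0.5 and would keep trying to improve itself under these assumptions.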

12
Algorithm
  • Measure
  • Improvement is done by monitoring a perceptual
    condition
  • If reliability increases, the monitored condition
    is added to the list of preconditions
  • Keep cycling through the conditions until
    reliability reaches the threshold
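
The monitoring cycle can be sketched as a round-robin loop over the perceptual conditions. This is a simplified, offline illustration and not the robot's actual online procedure: `measure_reliability` is a hypothetical callback standing in for a period of observation with a candidate precondition list, and `threshold` and `max_passes` are assumed parameters.

```python
from typing import Callable, Dict


def improve_preconditions(
    preconditions: Dict[int, bool],
    num_conditions: int,
    measure_reliability: Callable[[Dict[int, bool]], float],
    threshold: float = 0.9,
    max_passes: int = 3,
) -> Dict[int, bool]:
    """Cycle through conditions, keeping any that raises reliability."""
    current = dict(preconditions)
    best = measure_reliability(current)
    for _ in range(max_passes):
        for index in range(num_conditions):
            if best >= threshold:
                return current          # reliable enough, stop monitoring
            if index in current:
                continue                # already a precondition
            # Try the condition itself, then its negation.
            for required in (True, False):
                candidate = {**current, index: required}
                score = measure_reliability(candidate)
                if score > best:
                    current, best = candidate, score
                    break
    return current
```

The `max_passes` bound is there only so the sketch terminates when no condition helps; a behavior on the robot would simply keep monitoring.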

13
Genghis
  • Six-legged robot that walks forward
  • 12 behaviors, 6 conditions, 8742 nodes
  • 4 eight-bit microprocessors, 32 KB memory
  • The challenge is to learn how to coordinate the
    legs to produce a forward movement

14
Results
  • Convergence time
  • Non-intelligent search during the monitoring
    stage: 10 minutes
  • Intelligent search: 1 min 45 sec
  • A tripod gait emerged, which is common among
    six-legged insects

15
Conclusions
  • A learning algorithm was developed which allows a
    behavior-based robot to learn when its behaviors
    should become active using positive and negative
    feedback

16
Comments
  • Impressive results
  • Global behavior (walking) emerges from
    coordinated behaviors
  • Simple idea, powerful consequences. The robot
    learned how to walk; it wasn't taught

17
Comments
  • Dead behaviors don't revive. They might be useful
    in other situations
  • How to deal with concurrent actions? (e.g.,
    walking and following a target)