Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back

Transcript and Presenter's Notes

Title: Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back


1
Humanoid Robots Learning to Walk Faster: From the
Real World to Simulation and Back
  • Alon Farchy, Samuel Barrett, Patrick MacAlpine,
    Peter Stone

3
Motivation
  • Low-level robot skills are important
      • Robust walking and turning
      • Precise robotic arm movement
      • Localization
      • Stability
      • Etc.
  • These skills can be parameterized, but learning
    on a robot is challenging
      • Many environmental factors
      • Robot performance degrades with use
      • Robots take time to operate
      • Lack of ground truth

4
A Little Background: RoboCup
  • The RoboCup Standard Platform League: soccer for
    robots
  • Requires fast, stable, intelligent robots
  • Robots wear out and are time-consuming to work
    with
  • Therefore, applying machine learning directly on
    the robots is hard

5
A Little Background: RoboCup Simulation
  • The 3D Simulation League: soccer for virtual
    robots
  • Requires fast, stable, intelligent robots
  • Robots are unlimited
  • Great environment for machine learning
  • In 2011 and 2012, the UT Austin Villa Simulation
    League team outpaced the competition by using
    machine learning.

6
How can we transfer our knowledge to the real
world?
(Diagram labels: Learn, Apply, Transfer)
7
Challenges
  • Many differences between simulation and the real
    world
  • Open Dynamics Engine (ODE) is far from perfect
      • No friction on joints
      • No heat simulation
  • Virtual Nao is greatly approximated
      • Equal joint strength, perfect balance, simple
        foot shape
  • Soccer environment is greatly approximated
      • Perfectly flat surface

8
Outline
  • Grounded Simulation Learning (GSL)
  • Assumptions, Parameters, Overview
  • Ground, Optimize, Guide
  • Implementation
  • Fitness Evaluation
  • Predicting Joint Angles
  • Optimizing (CMA-ES)
  • Manual Guidance
  • Results
  • References

9
Grounded Simulation Learning (GSL)
  • Concept
      • Iteratively bound the search space to find areas
        that overlap between the simulation and the real
        world. Reduce the disparity between simulation
        and the real world along the way.
  • Assumptions
      1. Evaluation in simulation can be modified.
      2. A small number of evaluations can be run on
         the robot.
      3. A small number of explorations can be run on
         the robot to collect data.
      4. Using data from (3), the disparity between the
         simulation and the robot can be reduced via
         supervised machine learning.
      5. Optimization in simulation can be biased
         towards or against certain parameters.

10
GSL Parameters
  • Input
      • P_0: Initial parameter set
      • Fitness_sim: A simulation fitness function that
        uses a model that maps joint commands to outputs
      • Fitness_robot: A robot fitness function
      • Explore_robot: A robot exploration routine
      • Learn: A supervised learning algorithm
      • Optimize: An optimization algorithm to run in
        simulation
  • Output
      • P_opt: Optimized parameter set
  • Variables
      • BestFitness: Current best fitness evaluation on
        the robot
      • OpenParams: Bag of pairs (Parameter set,
        Fitness) to try

11
GSL: Ground
  • Using the next parameter set in OpenParams:
      1. Collect data about the robot's states and
         actions using Explore_robot.
      2. Use Learn to create a mapping between states
         and actions on the robot.
      3. Use this mapping to reduce the disparity
         between simulation and the real world.
  • → Force the simulation to act like the robot

12
GSL: Optimize
  • Use Optimize to find good parameters in the
    grounded simulation.
  • Note: The optimization should not search too
    deeply. Searching far from the base parameters is
    very likely to exploit idiosyncrasies in the
    simulation.

13
GSL: Guide
  • 1. Try as many good parameters on the robot as is
    feasible. Add the good ones to OpenParams.
  • 2. Based on the results, select parameters to
    focus on for the next round of optimization.
      • In our case, this selection was performed
        manually.
  • Repeat ground, optimize, and guide until
    OpenParams is empty.

14
GSL: Putting it together
  • P_0: Initial parameter set
  • Fitness_sim: Simulation fitness function
  • Fitness_robot: Robot fitness function
  • Explore_robot: Robot exploration routine
  • Learn: Supervised learning algorithm
  • Optimize: Simulation optimization algorithm
  • BestFitness: Current best fitness
  • OpenParams: Bag of (Parameter set, Fitness) pairs
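
The ground, optimize, and guide steps can be sketched as a single loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `gsl`, the callable signatures, and `max_robot_evals` are hypothetical, fitness is assumed to be "higher is better" (for example walking speed), and the manual guidance step is reduced to "keep any candidate that beats the current best on the robot".

```python
from typing import Any, Callable, List, Sequence, Tuple

Params = Tuple[float, ...]


def gsl(
    p0: Params,
    fitness_robot: Callable[[Params], float],
    explore_robot: Callable[[Params], Any],
    learn: Callable[[Any], Any],
    optimize: Callable[[Params, Any], Sequence[Params]],
    max_robot_evals: int = 5,
) -> Tuple[Params, float]:
    """Hypothetical sketch of the ground / optimize / guide loop.

    Assumes higher fitness is better (e.g. walking speed).
    """
    best_params, best_fitness = p0, fitness_robot(p0)
    open_params: List[Tuple[Params, float]] = [(p0, best_fitness)]

    while open_params:
        params, _ = open_params.pop()

        # Ground: collect robot data and learn a model that makes the
        # simulation behave more like the robot.
        data = explore_robot(params)
        model = learn(data)

        # Optimize: search near `params` in the grounded simulation.
        candidates = optimize(params, model)

        # Guide: try a few candidates on the robot and keep improvements.
        for candidate in list(candidates)[:max_robot_evals]:
            fitness = fitness_robot(candidate)
            if fitness > best_fitness:
                best_params, best_fitness = candidate, fitness
                open_params.append((candidate, fitness))

    return best_params, best_fitness
```

The loop ends when no candidate from the last optimization round improves on the robot, which mirrors "repeat until OpenParams is empty".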

15
GSL: Putting it together
(Flow diagram. Labels: OpenParams, Pop, Explore
(Robot), States, Actions, Learn, Model, Optimize
(Sim), Good Params (sim), Evaluate (Robot), Good
Params (robot), Guide, Focus)
16
Implementation
  • Fitness Evaluation
  • Predicting Joint Angles
  • Optimizing (CMA-ES)
  • Manual Guidance
  • Results

17
Fitness Evaluation (Real World)
  • Walk 238 cm forward (towards orange ball)
  • Manual stop when foot reaches white line
  • Robot measures time delta
  • Shorter time is better
18
Fitness Evaluation (Real World)
19
Fitness Evaluation (Simulation)
  • Original
      • OmniWalk (goToTarget)
      • Omnidirectional walk towards various targets.
      • Closer to target is better.
      • Penalty for falling.
      • Needs to be able to turn and stop quickly,
        which is out of scope here.
  • New
      • WalkFront
      • Walk forward only for 15 seconds.
      • Measure forward delta.
      • Higher is better.

20
(No Transcript)
21
Grounding: Predicting Joint Angles
  • Explore_robot: modified OmniWalk.
      • Only walk forward and turn
      • Record joint commands and joint angles at each
        frame
  • Learn: M5P
      • Learn mapping from
        (Joint Angles, Joint Commands)
        to
        Next Joint Angles

RAE: Relative Absolute Error; RRSE: Root Relative
Squared Error
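
A minimal sketch of how the recorded frames could be turned into training examples, and how RAE and RRSE are conventionally computed. The frame layout and all function names are assumptions for illustration. M5P itself (a model-tree learner, available in Weka) is not reproduced here; the sketch only shows how inputs and targets line up.

```python
import math
from typing import List, Sequence, Tuple

# One recorded frame: (joint angles, joint commands). Layout is assumed.
Frame = Tuple[Sequence[float], Sequence[float]]


def build_training_set(
    frames: Sequence[Frame],
) -> Tuple[List[List[float]], List[List[float]]]:
    """Pair (angles_t, commands_t) with angles_{t+1} for consecutive frames."""
    inputs, targets = [], []
    for (angles, commands), (next_angles, _) in zip(frames, frames[1:]):
        inputs.append(list(angles) + list(commands))
        targets.append(list(next_angles))
    return inputs, targets


def relative_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Total absolute error, relative to always predicting the mean.

    Undefined (division by zero) if every actual value is identical.
    """
    mean = sum(actual) / len(actual)
    num = sum(abs(p - a) for a, p in zip(actual, predicted))
    den = sum(abs(a - mean) for a in actual)
    return num / den


def root_relative_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Square root of squared error relative to always predicting the mean."""
    mean = sum(actual) / len(actual)
    num = sum((p - a) ** 2 for a, p in zip(actual, predicted))
    den = sum((a - mean) ** 2 for a in actual)
    return math.sqrt(num / den)
```

Both metrics equal 1.0 for a model that always predicts the mean, so values below 1.0 indicate the learned mapping is doing better than that baseline.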
22
Grounding: Predicting Joint Angles
  • How to apply the model to the simulation?
      • Linear combination of requested joint angles
        and predicted joint angles
      • By manual testing: 70% requested / 30%
        predicted.
  • Now we can use this grounded simulation in
    Optimize.
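
The linear combination can be written in a few lines. This is a sketch only: the function name and the `requested_weight` parameter are hypothetical, and the 0.7 default reflects the manually chosen 70/30 split mentioned above.

```python
from typing import List, Sequence


def blend_joint_angles(
    requested: Sequence[float],
    predicted: Sequence[float],
    requested_weight: float = 0.7,
) -> List[float]:
    """Blend requested and model-predicted joint angles, joint by joint."""
    if len(requested) != len(predicted):
        raise ValueError("requested and predicted must have the same length")
    w = requested_weight
    return [w * r + (1.0 - w) * p for r, p in zip(requested, predicted)]
```

A weight of 1.0 recovers the ungrounded simulation, while 0.0 would make the simulated joints follow the learned model alone.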

23
Optimizing in Simulation: CMA-ES
  • Covariance Matrix Adaptation Evolution Strategy
  • Candidates sampled from a multidimensional
    Gaussian distribution.
  • Evaluated by Fitness_sim
  • Weighted average of members with highest fitness
    used to update mean of distribution
  • Covariance updated using evolution paths; controls
    search step sizes
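
The sample, evaluate, and recombine cycle can be illustrated with a deliberately simplified evolution strategy. This is not full CMA-ES: it keeps only a diagonal covariance (one standard deviation per parameter) and omits evolution paths and step-size control, so it is closer to a cross-entropy-style search. The function name and arguments are hypothetical; the defaults of 10 generations and 150 candidates echo the numbers quoted in this talk.

```python
import math
import random
from typing import Callable, List, Sequence


def diagonal_es(
    fitness: Callable[[List[float]], float],
    mean: Sequence[float],
    sigmas: Sequence[float],
    generations: int = 10,
    population: int = 150,
    seed: int = 0,
) -> List[float]:
    """Simplified diagonal-covariance evolution strategy (not full CMA-ES)."""
    rng = random.Random(seed)
    mean = list(mean)
    sigmas = list(sigmas)
    mu = max(1, population // 2)  # number of top candidates kept

    # Log-rank weights: better-ranked candidates count more.
    raw = [math.log(mu + 0.5) - math.log(rank + 1) for rank in range(mu)]
    weights = [w / sum(raw) for w in raw]

    for _ in range(generations):
        # Sample candidates from an axis-aligned Gaussian.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, sigmas)]
            for _ in range(population)
        ]
        candidates.sort(key=fitness, reverse=True)  # higher fitness first
        elite = candidates[:mu]

        dims = range(len(mean))
        # New mean: weighted average of the best candidates.
        new_mean = [sum(w * c[d] for w, c in zip(weights, elite)) for d in dims]
        # New spread: weighted deviation of the elite from the OLD mean.
        sigmas = [
            math.sqrt(sum(w * (c[d] - mean[d]) ** 2 for w, c in zip(weights, elite)))
            for d in dims
        ]
        mean = new_mean

    return mean
```

In this sketch the per-parameter `sigmas` are the only search-width controls, which makes them a natural place to bias the search toward or away from particular parameters.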

24
Optimizing in Simulation: CMA-ES
25
Optimizing in Simulation: CMA-ES
  • Condor workload management system.
  • 150 simultaneous fitness evaluations.
  • Even with a small number of generations (10),
    explores far more parameter sets than a real
    robot could.
26
Guidance
  • Evaluate optimized parameters using Fitness_robot.
  • Select parameters for OpenParams (easy)
      • Robot falls?
      • Robot faster?
  • Bias Optimize towards better parameters (harder)
      • Manually tweaked variance of parameters in the
        CMA-ES.
      • Could be automated.
27
(No Transcript)
28
(No Transcript)
29
Results
  • GSL was run at 67% step size for stability.
  • But ITER 4 (WalkFront) could run at 100% step
    size.
  • Original @ 100%: 13.5 cm/s
      • http://youtu.be/grlceQkBTxw
  • Optimized @ 100%: 17.1 cm/s
      • http://youtu.be/nGc127yYoSs

  • 26.7% improvement!
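
The improvement figure follows directly from the two speeds. The helper names below are hypothetical; the numbers are the ones reported above.

```python
def walk_speed_cm_per_s(distance_cm: float, elapsed_s: float) -> float:
    """Average speed over a timed walk."""
    return distance_cm / elapsed_s


def relative_improvement(baseline: float, optimized: float) -> float:
    """Fractional gain of `optimized` over `baseline`."""
    return (optimized - baseline) / baseline


original_speed = 13.5   # cm/s, original parameters
optimized_speed = 17.1  # cm/s, optimized parameters
print(f"{relative_improvement(original_speed, optimized_speed):.1%}")  # 26.7%
```

If both speeds were measured on the 238 cm course, they correspond to roughly 17.6 s and 13.9 s per run.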

30
Related Work
  • UT Austin Villa RoboCup 3D Simulation League
      • P. MacAlpine, S. Barrett, D. Urieli, V. Vu, and
        P. Stone. Design and optimization of an
        omnidirectional humanoid walk: A winning
        approach at the RoboCup 2011 3D simulation
        competition. In Twenty-Sixth Conference on
        Artificial Intelligence (AAAI-12), July 2012.
  • CMA-ES
      • N. Hansen. The CMA evolution strategy: A
        tutorial, 2005.
  • M5P
      • R. J. Quinlan. Learning with continuous classes.
        In 5th Australian Joint Conference on Artificial
        Intelligence, pages 343-348, Singapore, 1992.
        World Scientific.

31
Related Work
  • Simulation and robot learning
      • P. Abbeel, M. Quigley, and A. Y. Ng. Using
        inaccurate models in reinforcement learning. In
        International Conference on Machine Learning
        (ICML), Pittsburgh, pages 1-8. ACM Press, 2006.
      • J. C. Zagal, J. Delpiano, and J. Ruiz-del-Solar.
        Self-modeling in humanoid soccer robots. Robot.
        Auton. Syst., 57(8):819-827, July 2009.
      • L. Iocchi, F. D. Libera, and E. Menegatti.
        Learning humanoid soccer actions interleaving
        simulated and real data. In Second Workshop on
        Humanoid Soccer Robots, November 2007.
      • S. Koos, J.-B. Mouret, and S. Doncieux. Crossing
        the reality gap in evolutionary robotics by
        promoting transferable controllers. In
        Proceedings of the 12th Annual Conference on
        Genetic and Evolutionary Computation, GECCO '10,
        pages 119-126, New York, NY, USA, 2010. ACM.

32
Questions?