Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back - PowerPoint PPT Presentation

About This Presentation

Title:

Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back

Description:

Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back ALON FARCHY, SAMUEL BARRETT, PATRICK MACALPINE, PETER STONE Motivation Low-level ... – PowerPoint PPT presentation

Number of Views:251

Avg rating:3.0/5.0

Slides: 33

Provided by: Alon152

Learn more at: https://www.cs.utexas.edu

Category:

more less

Transcript and Presenter's Notes

Title: Humanoid Robots Learning to Walk Faster: From the Real World to Simulation and Back

1
Humanoid Robots Learning to Walk FasterFrom the
Real World to Simulation and Back

Alon Farchy, Samuel Barrett,Patrick Macalpine,
Peter Stone

2
Motivation

Low-level robot skills are important
Robust walking and turning
Precise robotic arm movement
Localization
Stability
Etc.

3
Motivation

Low-level robot skills are important
Robust walking and turning
Precise robotic arm movement
Localization
Stability
Etc.
These skills can be parameterized, but learning
on a robot is challenging
Many environmental factors
Robot performance degrades with use
Robots take time to operate
Lack of ground truth

4
A Little Background RoboCup

The RoboCup Standard Platform League Soccer for
robots
Requires fast, stable, intelligent robots
Robots wear out and are time consuming to work
with
Therefore, using machine learning is hard

5
A Little Background RoboCup Simulation

The 3D Simulation League Soccer for virtual
robots
Requires fast, stable, intelligent robots
Robots are unlimited
Great environment for machine learning
In 2011 and 2012, UT Austin Villa Simulation
League outpaced the competition by using machine
learning.

6
How can we transfer our knowledge to the real
world?
Learn
Apply
Transfer
7
Challenges

Many differences between simulation and the real
world
Open Dynamics Engine (ODE) is far from perfect
No fiction on joints
No heat simulation
Virtual Nao is greatly approximated
Equal joint strength, perfect balance, simple
foot shape
Soccer environment is greatly approximated
Perfectly flat surface

8
Outline

Grounded Simulation Learning (GSL)
Assumptions, Parameters, Overview
Ground, Optimize, Guide
Implementation
Fitness Evaluation
Predicting Joint Angles
Optimizing (CMA-ES)
Manual Guidance
Results
References

9
Grounded Simulation Learning (GSL)

Concept
Iteratively bound the search space to find areas
that overlap between the simulation and the real
word. Reduce disparity between simulation / real
world along the way.
Assumptions
1. Evaluation in simulation can be modified.
2. A small number of evaluations can be run on
the robot.
3. A small number of explorations can be run on
the robot to collect data.
4. Using data from (3), the disparity between the
simulation and robot can be reduced via
supervised machine learning.
5. Optimization in simulation can be biased
towards / against certain parameters.

10
GSL Parameters

Input
P0 Initial parameter set
Fitnesssim A simulation fitness function that
uses a model that maps joint commands to outputs
Fitnessrobot a robot fitness function
Explorerobot a robot exploration routine
Learn A supervised learning algorithm
Optimize An optimization algorithm to run in
simulation
Output
Popt Optimized parameter set
Variables
BestFitness Current best fitness evaluation on
the robot
OpenParams Bag of pairs (Parameter set,
Fitness) to try

11
GSL Ground

Using the next parameter set in OpenParams
1. Collect data about the robots states and
actions using Explorerobot.
2. Use Learn to create a mapping between states
and actions on the robot.
3. Use this mapping to reduce disparity between
simulation and the real world.
? Force simulation to act like the robot

12
GSL Optimize

Use Optimize to find good parameters in the
grounded simulation.
Note The optimization should not search too
deeply. Searching far from the base parameters is
very likely to exploit idiosyncrasies in the
simulation.

13
GSL Guide

1. Try some as many good parameters on the robot
as is feasible. Add the good ones to OpenParams.
2. Based on results, select parameters to focus
on for the next round of optimization.
In our case, this selection was performed
manually.
Repeat ground, optimize, and guide until
OpenParams is empty.

14
GSL Putting it together

P0 Initial parameter set
Fitnesssim Simulation fitness function
Fitnessrobot Robot fitness function
Explorerobot Robot exploration routine
Learn Supervised learning algorithm
Optimize Simulation optimization algorithm
BestFitness Current best fitness
OpenParams Bag (Parameter set, Fitness)

15
GSL Putting it together
Good Params (robot)
Pop
OpenParams
Evaluate (Robot)
Explore (Robot)
Good Params (sim)
Learn
Optimize (Sim)
Good Params (robot)
Model
States, Actions
Guide
Focus
16
Implementation

Fitness Evaluation
Predicting Joint Angles
Optimizing (CMA-ES)
Manual Guidance
Results

17
Fitness Evaluation (Real World)
Walk 238cm forward (towards orange ball) Manual
stop when foot reaches white line Robot measures
time delta. Shorter time is better
18
Fitness Evaluation (Real World)
19
Fitness Evaluation (Simulation)

Original
OmniWalk (goToTarget)
Omnidirectional walk towards various targets.
Closer to target is better.
Penalty for falling.
Needs to be able to turn and stop quickly out
of scope.
New
WalkFront
Walk forward only for 15 seconds.
Measure forward delta.
Higher is better.

20
(No Transcript)
21
Grounding Predicting Joint Angles

Explorerobot modified OmniWalk.
Only walk forward and turn
Record joint commands and joint angles at each
frame
Learn M5P Learn mapping from
(Joint Angles, Joint Commands)
to
Next Joint Angles

RAE Relative Absolute Error RRSE Relative
Root Squared Error
22
Grounding Predicting Joint Angles

How to apply model to simulation?
Linear combination of requested joint angles and
predicted joint angles
By manual testing, 70 requested / 30 predicted.
Now we can use this grounded simulation in
Optimize.

23
Optimizing in Simulation CMA-ES

Covariance Matrix Adaptation Evolution Strategy
Candidates sampled from multidimentional Gaussian
distribution.
Evaluated by Fitnesssim
Weighted average of members with highest fitness
used to update mean of distribution
Covariance updated using evolution paths controls
search step sizes

24
Optimizing in Simulation CMA-ES
25
Optimizing in Simulation CMA-ES
Condor workload management system. 150
simultaneous fitness evaluations. Even with
small number of generations (10), explores a LOT
more parameter sets than a real robot could.
26
Guidance

Evaluate optimized parameters using Fitnessrobot.
Select parameters for OpenParams (easy)
Robot Falls?
Robot Faster?
Bias Optimize to better parameters (harder)
Manually tweaked variance of parameters in the
CMA-ES.
Could be automated.

27
(No Transcript)
28
(No Transcript)
29
Results

GSL was run at 67 step size for stability.
But ITER 4 (WalkFront) could run at 100 step
size.
Original _at_ 100 13.5 cm/s
http//youtu.be/grlceQkBTxw
Optimized _at_ 100 17.1 cm/s
http//youtu.be/nGc127yYoSs
26.7 Improvement!

30
Related Work

UT Austin Villa RobotCup 3D Simulation League
P. MacAlpine, S. Barrett, D. Urieli, V. Vu, and
P. Stone. Design and optimization of an
omnidirectional humanoid walk A winning approach
at the RoboCup 2011 3D simulation competition. In
Twenty-Sixth Conference on Articial Intelligence
(AAAI-12), July 2012.
CMA-ES
N. Hansen. The CMA evolution strategy A
tutorial, 2005.
M5P
R. J. Quinlan. Learning with continuous classes.
In 5th Australian Joint Conference on Articial
Intelligence, pages 343348, Singapore, 1992.
World Scientific.

31
Related Work

Simulation Robot learning
P. Abbeel, M. Quigley, and A. Y. Ng. Using
inaccurate models in reinforcement learning. In
International Conference on Machine Learning
(ICML) Pittsburgh, pages 18. ACM Press, 2006.
J. C. Zagal, J. Delpiano, and J. Ruiz-del Solar.
Self-modeling in humanoid soccer robots. Robot.
Auton. Syst., 57(8)819827, July 2009.
L. Iocchi, F. D. Libera, and E. Menegatti.
Learning humanoid soccer actions interleaving
simulated and real data. In Second Workshop on
Humanoid Soccer Robots, November 2007.
S. Koos, J.-B. Mouret, and S. Doncieux. Crossing
the reality gap in evolutionary robotics by
promoting transferable controllers. In
Proceedings of the 12th annual conference on
Genetic and volutionary computation, GECCO '10,
pages 119126, New York, NY, USA, 2010. ACM.