1
Helsinki University of Technology
Adaptive Informatics Research Centre
Finland
  • Variational Bayesian Approach for Nonlinear
    Identification and Control
  • Matti Tornio and Tapani Raiko
  • October 9, 2006

2
Introduction
  • System identification and control in nonlinear
    state-space models
  • Continues the work by Rosenqvist and Karlström
    (Automatica 2005)
  • Our background is in machine learning
  • Uncertainties are taken explicitly into account
    by using a variational Bayesian treatment

3
Why nonlinear state-space models?
  • System identification using a hidden state has
    many benefits
  • More resistant to noise
  • Observations (without history) do not always
    carry enough information about the system state
  • Finds a representation of the state that is more
    suitable for approximating the dynamics

4
System identification in nonlinear state-space
models
  • We use a state-of-the-art tool by Valpola and
    Karhunen (Neural Computation 2002)
  • Parameters, states, and observations are modelled
    with Gaussian distributions (model sketched below)
  • Less prone to overfitting (than the prediction
    error method)
  • Can determine the dimensionality of the state
    space etc.
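
For reference, a sketch of the generative model behind this approach, written in a standard notation that may differ from the slides: the state evolves through one MLP mapping and produces observations through another, with additive Gaussian noise on both.

    \begin{aligned}
    \mathbf{x}(t) &= \mathbf{g}\big(\mathbf{x}(t-1), \boldsymbol{\theta}\big) + \text{process noise} \\
    \mathbf{y}(t) &= \mathbf{f}\big(\mathbf{x}(t), \boldsymbol{\theta}\big) + \text{observation noise}
    \end{aligned}

In the control setting the control signal u(t) would additionally enter the dynamics mapping.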

5
Properties of the method
  • The model scales well to higher dimensions
  • Can model very complex dynamics
  • A natural conjugate gradient algorithm is used for
    fast system identification (update sketched below)
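
As a rough sketch (the precise formulation is in the authors' related work, not on the slides), the natural gradient replaces the ordinary gradient of the cost C with one scaled by the Riemannian metric of the approximation q, and conjugate search directions are then built from these natural gradients:

    \tilde{\nabla} \mathcal{C}(\boldsymbol{\xi}) = \mathbf{G}^{-1}(\boldsymbol{\xi}) \, \nabla \mathcal{C}(\boldsymbol{\xi})

where xi collects the parameters of q (means and variances) and G(xi) is the Fisher information matrix of q.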

6
Nonlinearities by MLP networks
  • f(x(t), θ) = B tanh(A x(t) + a) + b + noise
  • The parameters θ include the weight matrices,
    bias vectors, noise variances etc.
  • Note that the policy mapping does not fix the
    control signal (because of the noise model); a small
    numerical sketch of this mapping is given below
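
A minimal numerical sketch of the MLP mapping, assuming hypothetical dimensions and randomly drawn weights; this is not the authors' code, just the formula f(x, θ) = B tanh(A x + a) + b plus Gaussian noise.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical sizes, for illustration only.
    state_dim, hidden_dim = 6, 20

    # Weight matrices A, B and bias vectors a, b are part of theta;
    # in the actual method their posterior is learned, not drawn at random.
    A = 0.1 * rng.standard_normal((hidden_dim, state_dim))
    a = np.zeros(hidden_dim)
    B = 0.1 * rng.standard_normal((state_dim, hidden_dim))
    b = np.zeros(state_dim)
    noise_std = 0.01  # assumed noise level

    def f(x):
        """MLP mapping f(x, theta) = B tanh(A x + a) + b."""
        return B @ np.tanh(A @ x + a) + b

    def step(x):
        """One noisy transition: f(x) plus Gaussian noise."""
        return f(x) + noise_std * rng.standard_normal(state_dim)

    x_next = step(np.zeros(state_dim))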

7
Variational Bayesian treatment
  • Posterior probability p(x, θ | y) is approximated
    by q(x, θ)
  • q is assumed to be Gaussian with limited
    dependencies
  • The fit of q to p is measured by a cost function
    (written out below)
  • Both identification and prediction can be done by
    minimising the misfit by adjusting the parameters
    defining q (means, variances, covariances)
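
For completeness, the cost function referred to above is the standard variational Bayesian one (a sketch; the notation is not copied from the slides): the Kullback-Leibler divergence between q and the true posterior, plus a term that does not depend on q.

    \mathcal{C}(q)
      = \int q(\mathbf{x}, \boldsymbol{\theta})
        \ln \frac{q(\mathbf{x}, \boldsymbol{\theta})}{p(\mathbf{x}, \boldsymbol{\theta}, \mathbf{y})}
        \, d\mathbf{x} \, d\boldsymbol{\theta}
      = D_{\mathrm{KL}}\!\big(q(\mathbf{x}, \boldsymbol{\theta}) \,\|\, p(\mathbf{x}, \boldsymbol{\theta} \mid \mathbf{y})\big)
        - \ln p(\mathbf{y})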

8
Control
  • Current state is estimated with extended Kalman
    filter (EKF)
  • Control signals u(t) are selected to minimise
    the expected cost E[J] over the distribution q
  • Quasi-Newton algorithm for optimisation
  • Compare to dual control: estimation errors
    increase the expected cost (see the optimisation
    sketch below)
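
A minimal sketch of this control step, assuming a stand-in predictive model and cost (the names predict_mean_var and expected_cost, the toy dynamics, and the cost weights are illustrative assumptions, not the authors' implementation). It optimises a whole control sequence with a quasi-Newton method, and the predictive variances contribute to the cost, as in dual control.

    import numpy as np
    from scipy.optimize import minimize

    # Stand-in one-step predictor: the real method would propagate the state
    # mean and variance through the learned MLP dynamics and the posterior q.
    # The linear toy dynamics below are an assumption for illustration only.
    def predict_mean_var(x_mean, x_var, u):
        x_mean = 0.9 * x_mean + 0.1 * np.pad(u, (0, x_mean.size - u.size))
        x_var = 0.81 * x_var + 0.01   # predictive variance grows with process noise
        return x_mean, x_var

    def expected_cost(u_flat, x_mean, x_var, horizon, control_dim):
        # Expected cost E[J] over the predicted trajectory; the predictive
        # variances enter the cost, which is the link to dual control.
        total = 0.0
        for u in u_flat.reshape(horizon, control_dim):
            x_mean, x_var = predict_mean_var(x_mean, x_var, u)
            total += np.sum(x_mean**2) + np.sum(x_var) + 0.01 * np.sum(u**2)
        return total

    horizon, control_dim, state_dim = 10, 1, 6
    x_mean = np.ones(state_dim)            # current state estimate (e.g. from an EKF)
    x_var = 0.1 * np.ones(state_dim)
    u0 = np.zeros(horizon * control_dim)   # initial guess for the control sequence

    # Quasi-Newton (BFGS) optimisation of the whole control sequence.
    result = minimize(expected_cost, u0, method="BFGS",
                      args=(x_mean, x_var, horizon, control_dim))
    print("first control to apply:", result.x.reshape(horizon, control_dim)[0])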

9
Control (cont.)
  • Prediction with variances is 5 times slower
  • if that is too slow for an application, the method
    can still be used for system identification alone
  • Learning is done offline
  • online learning is possible as well and leads to
    different exploration strategies

10
Optimistic inference control
  • Alternative control scheme
  • Observations at some point in the future are
    fixed and the states leading to this desired
    future are inferred (sketched below)
  • Allows the direct use of inference algorithms
  • Conceptually very simple, but not as versatile as
    NMPC (nonlinear model predictive control)
  • constraints are hard to model
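
Sketched in notation not taken from the slides, optimistic inference control amounts to conditioning the model on the desired future observation and running the ordinary inference algorithm for the intervening states and control signals:

    q\big(\mathbf{x}(t\!+\!1..T),\, \mathbf{u}(t\!+\!1..T)\big) \approx
    p\big(\mathbf{x}, \mathbf{u} \mid \mathbf{y}(1..t),\; \mathbf{y}(T) = \mathbf{y}_{\mathrm{desired}}\big)

The exact conditioning (for example, whether several future observations are fixed) follows the authors' papers and is not spelled out on the slides.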

11
Experiments
  • Assume the cart-pole system to be unknown
  • Dynamics are identified from only 2500 samples
  • A 6-dimensional state space x(t) was used

14
Results
  • A very high success rate was reached even under
    high noise
  • A partially observed system is hard to control

(Results shown for the low-noise and high-noise cases)
15
Results (initialisation)
  • Good initialisations are important
  • Local minima are the biggest problem
  • Internal forward model can provide reasonable
    initialisations without significant extra
    computation

16
Conclusion
  • Learning nonlinear state-space models seems
    promising when
  • observations about the system state are
    incomplete, or
  • the dynamics of the system are not well known
  • Variational Bayesian treatment helps
  • against overfitting
  • to determine the model order