LMS Algorithm in a Reproducing Kernel Hilbert Space (presentation transcript)

1
LMS Algorithm in a Reproducing Kernel Hilbert
Space
  • Weifeng Liu, P. P. Pokharel, J. C. Principe
  • Computational NeuroEngineering Laboratory,
  • University of Florida
  • Acknowledgment: This work was partially supported by NSF grants ECS-0300340 and ECS-0601271.

2
Outline
  • Introduction
  • Least Mean Square algorithm (easy)
  • Reproducing kernel Hilbert space (tricky)
  • The convergence and regularization analysis
    (important)
  • Learning from error models (interesting)

3
Introduction
  • Puskal Pokharel (2006): Kernel LMS
  • Kivinen, Smola (2004): Online learning with kernels (more like leaky LMS)
  • Moody, Platt (1990s): Resource-allocating networks (growing and pruning)

4
LMS (1960, Widrow and Hoff)
  • Given a sequence of examples (u_i, d_i) drawn from U × R
  • U is a compact subset of R^L.
  • The assumed model and the cost function are reconstructed below.
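The slide equations are images that did not survive extraction; the following is a plausible reconstruction of the standard LMS setup (w denotes the weight vector and v_i the noise term, both assumed names):

    d_i = w^T u_i + v_i                          (model)
    J(w) = \sum_{i=1}^{N} (d_i - w^T u_i)^2      (cost function)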

5
LMS
  • The LMS algorithm: update rule (1) below.
  • The weight after n iterations: expression (2) below.
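A plausible reconstruction of (1) and (2) (standard LMS with step size η and initial weight w_0 = 0; the slide's exact notation may differ):

    e_n = d_n - w_{n-1}^T u_n,    w_n = w_{n-1} + η e_n u_n       (1)
    w_n = η \sum_{i=1}^{n} e_i u_i                                (2)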
6
Reproducing kernel Hilbert space
  • A continuous, symmetric, positive-definite kernel κ(u, u'), a mapping Φ, and an inner product in the feature space
  • H is the closure of the span of all Φ(u).
  • Reproducing property (below)
  • Kernel trick (below)
  • The induced norm (below)
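These three properties in their standard form (a reconstruction; the slide images are unavailable):

    Reproducing property:  f(u) = ⟨f, Φ(u)⟩_H  for every f in H
    Kernel trick:          ⟨Φ(u), Φ(u')⟩_H = κ(u, u')
    Induced norm:          ||f||_H^2 = ⟨f, f⟩_H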

7
RKHS
  • Kernel trick
  • An inner product in the feature space
  • A similarity measure you need
  • Mercer's theorem (statement sketched below)
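In its standard form (a reconstruction, not the exact slide content), Mercer's theorem states that

    κ(u, u') = \sum_{j} λ_j φ_j(u) φ_j(u'),   λ_j ≥ 0,

so Φ(u) = (√λ_1 φ_1(u), √λ_2 φ_2(u), ...) realizes the kernel as an inner product in a feature space.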

8
Common kernels
  • Gaussian kernel (standard form below)
  • Polynomial kernel (standard form below)
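Commonly used forms of these kernels (a reconstruction; the kernel width a and the polynomial degree p are assumed parameter names):

    Gaussian:    κ(u, u') = exp(-a ||u - u'||^2)
    Polynomial:  κ(u, u') = (u^T u' + 1)^p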

9
Kernel LMS
  • Transform the input u_i to Φ(u_i)
  • Assume Φ(u_i) ∈ R^M
  • The model and the cost function in the feature space are reconstructed below.
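A plausible reconstruction of the feature-space setup (Ω denotes the weight in the RKHS, an assumed symbol):

    d_i = Ω^T Φ(u_i) + v_i                            (model)
    J(Ω) = \sum_{i=1}^{N} (d_i - Ω^T Φ(u_i))^2        (cost function)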

10
Kernel LMS
  • The KLMS algorithm: update rule (3) below.
  • The weight after n iterations: expression (4) below.
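A plausible reconstruction of (3) and (4), mirroring the LMS recursion in the feature space (Ω_0 = 0 assumed):

    e_n = d_n - Ω_{n-1}^T Φ(u_n),    Ω_n = Ω_{n-1} + η e_n Φ(u_n)     (3)
    Ω_n = η \sum_{i=1}^{n} e_i Φ(u_i)                                 (4)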
11
Kernel LMS
  • Equation (5) is reconstructed below.
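Equation (5) most likely expresses the prediction error through kernel evaluations alone, so the feature-space weight never has to be formed explicitly (a reconstruction based on the standard KLMS derivation):

    e_n = d_n - η \sum_{i=1}^{n-1} e_i κ(u_i, u_n)        (5)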
12
Kernel LMS
  • After learning, the input-output relation is given by equation (6), reconstructed below.
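A reconstruction of (6), the standard KLMS output after N training samples:

    f(u) = η \sum_{i=1}^{N} e_i κ(u_i, u)                 (6)

Putting (3)-(6) together, the whole procedure fits in a few lines. The following Python sketch is only an illustration under the assumptions above (Gaussian kernel, step size 0.2, assumed function and variable names), not the authors' implementation:

```python
import numpy as np

def gaussian_kernel(u, v, a=1.0):
    # kappa(u, v) = exp(-a * ||u - v||^2)
    return np.exp(-a * np.sum((u - v) ** 2))

def klms_train(U, d, eta=0.2, a=1.0):
    """Run KLMS over the training pairs (U[i], d[i]);
    return the error coefficients of the kernel expansion."""
    errors = []
    for n in range(len(U)):
        # prediction with the current expansion: eta * sum_i e_i * kappa(u_i, u_n)
        y_n = eta * sum(e * gaussian_kernel(U[i], U[n], a)
                        for i, e in enumerate(errors))
        e_n = d[n] - y_n      # prediction error, as in (5)
        errors.append(e_n)    # the new center u_n enters with coefficient e_n
    return np.array(errors)

def klms_predict(u, U, errors, eta=0.2, a=1.0):
    # input-output relation after learning, as in (6)
    return eta * sum(e * gaussian_kernel(U[i], u, a)
                     for i, e in enumerate(errors))
```

For example, errors = klms_train(U, d) followed by klms_predict(u, U, errors) evaluates the expansion in (6) at a new input u.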
13
KLMS vs. RBF
  • KLMS solution: equation (7) below.
  • RBF network solution: equation (8) below.
  • The coefficients a satisfy a linear system in the Gram matrix (below).
  • G is the Gram matrix, G(i, j) = κ(u_i, u_j).
  • RBF needs regularization.
  • Does KLMS need regularization?
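A plausible reconstruction of (7) and (8) (the standard KLMS expansion and the regularized RBF-network solution; λ is the regularization parameter used in the tables that follow):

    KLMS:  f(u) = η \sum_{i=1}^{N} e_i κ(u_i, u)                       (7)
    RBF:   f(u) = \sum_{i=1}^{N} a_i κ(u_i, u),   (G + λI) a = d       (8)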
14
KLMS vs. LMS
  • Kernel LMS is nothing but LMS in the feature space: a very high-dimensional reproducing kernel Hilbert space (M > N).
  • The eigen-spread is awful; does it converge?

15
Example: Mackey-Glass (MG) signal prediction
  • Time embedding: 10
  • Learning rate: 0.2
  • 500 training samples
  • 100 test samples
  • Additive Gaussian noise, variance 0.04

16
Example: Mackey-Glass (MG) signal prediction

MSE        Linear LMS   KLMS     RBF (λ=0)   RBF (λ=0.1)   RBF (λ=1)   RBF (λ=10)
training   0.021        0.0060   0           0.0026        0.0036      0.010
test       0.026        0.0066   0.019       0.0041        0.0050      0.014
17
Complexity Comparison
              RBF            KLMS     LMS
Computation   O(N^3)         O(N^2)   O(L)
Memory        O(N^2 + NL)    O(NL)    O(L)
18
The asymptotic analysis on convergence: small step-size theory
  • Denote the correlation matrix of the transformed inputs (sketched below).
  • It is singular. Assume its eigen-decomposition into eigenvalues and eigenvectors.
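A sketch of the quantities presumably introduced here (standard small-step-size analysis; the notation R_Φ, P, Λ is assumed):

    R_Φ = (1/N) \sum_{i=1}^{N} Φ(u_i) Φ(u_i)^T,    R_Φ = P Λ P^T

R_Φ is M × M but has at most N nonzero eigenvalues, so with M > N it is necessarily singular.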

19
The asymptotic analysis on convergence: small step-size theory
  • Denote the weight error projected onto the eigenvectors of R_Φ.
  • We then have the per-mode recursion sketched below.
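From standard small-step-size LMS theory (a plausible reconstruction, not necessarily the slide's exact expression; ε_n denotes the weight error and ς_k the k-th eigenvalue of R_Φ):

    E[ε_n(k)] = (1 - η ς_k)^n ε_0(k)

Each mode therefore converges at a rate set by its own eigenvalue.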

20
The weight stays at the initial place in the zero-eigenvalue directions
  • If ς_k = 0 for some direction k,
  • we have ε_n(k) = ε_0(k) for all n: the weight component along that direction never moves.

21
The 0-eigen-value directions does not affect the
MSE
  • Denote

It does not care about the null space! It only
focuses on the data space!
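A sketch of why the null space drops out (the standard excess-MSE decomposition, assumed here since the slide equation is unavailable):

    J_ex(n) = \sum_{k} ς_k E[ε_n(k)^2]

Every zero-eigenvalue mode enters with weight ς_k = 0, so weight components lying in the null space of R_Φ never appear in the MSE.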
22
The minimum norm initialization
  • The initialization Ω_0 = 0 gives the minimum-norm solution possible, since the weight then stays in the span of the transformed data and never acquires a null-space component.

23
Minimum norm solution
24
Learning is Ill-posed
25
Over-learning
26
Regularization Technique
  • Learning from finite data is ill-posed.
  • A priori information, such as smoothness, is needed.
  • The norm of the function, which bounds the slope of the operator, is constrained.
  • In statistical learning theory, this norm is associated with the confidence of uniform convergence!

27
Regularized RBF
  • The cost function (reconstructed below)
  • or, equivalently, a constrained form
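A plausible reconstruction of the regularized-RBF cost (standard Tikhonov regularization with parameter λ):

    min_f  \sum_{i=1}^{N} (d_i - f(u_i))^2 + λ ||f||_H^2

or, equivalently, minimize the squared error subject to a norm constraint ||f||_H^2 ≤ C.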

28
KLMS as a learning algorithm
  • The model, with the same assumptions as before.
  • The following inequalities hold.
  • The proof uses H∞ robustness, the triangle inequality, a matrix transformation, and a derivative argument.

29
The numerical analysis
  • The solution of regularized RBF is reconstructed below.
  • The source of ill-posedness is the inversion of the matrix (G + λI).
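The standard regularized-RBF solution (a reconstruction consistent with (8) above):

    a = (G + λI)^{-1} d

With λ = 0 the smallest eigenvalues of G dominate the inverse, which is what makes the unregularized solution numerically ill-posed; compare the solution norms in the later weight-norm table.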

30
The numerical analysis
  • The solution of KLMS is
  • By the inequality above, we have

31
Example: Mackey-Glass (MG) signal prediction

weight   KLMS    RBF (λ=0)   RBF (λ=0.1)   RBF (λ=1)   RBF (λ=10)
norm     0.520   4.8e3       10.90         1.37        0.231
32
The conclusion
  • The LMS algorithm can be readily used in an RKHS to derive nonlinear algorithms.
  • From the machine learning viewpoint, the LMS method is a simple way to obtain a regularized solution.

33
Demo
34
Demo
35
LMS learning model
  • An event happens, and a decision is made.
  • If the decision is correct, nothing happens.
  • If an error is incurred, a correction is made to the original model.
  • If we do things right, everything is fine and life goes on.
  • If we do something wrong, lessons are drawn and our abilities are honed.

36
Would we over-learn?
  • If we attempt to model the real world mathematically, what dimension is appropriate?
  • Are we likely to over-learn?
  • Are we using the LMS algorithm?
  • Why is it good to remember the past?
  • Why is it bad to be a perfectionist?

37
  • "If you shut your door to all errors, truth will
    be shut out."---Rabindranath Tagore