Improved BP algorithms (first order gradient method) - PowerPoint PPT Presentation

1 / 42
About This Presentation
Title:

Improved BP algorithms (first order gradient method)

Description:

Improved BP algorithms (first order gradient method) BP with momentum Delta- bar- delta Decoupled momentum RProp Adaptive BP Trinary BP BP with adaptive gain – PowerPoint PPT presentation

Number of Views:52
Avg rating:3.0/5.0
Slides: 43
Provided by: Mah103
Category:

less

Transcript and Presenter's Notes

Title: Improved BP algorithms (first order gradient method)


1
Improved BP algorithms(first order gradient
method)
  1. BP with momentum
  2. Delta- bar- delta
  3. Decoupled momentum
  4. RProp
  5. Adaptive BP
  6. Trinary BP
  7. BP with adaptive gain
  8. Extended BP

2
BP with momentum (BPM)
  • The basic improvement to BP (Rumelhart 1986)
  • Momentum factor alpha selected between zero and
    one
  • Adding momentum improves the convergence speed
    and helps network from being trapped in a local
    minimum.

3
Modification form
  • Proposed by nagata 1990
  • Beta is a constant value decided by user
  • Nagata claimed that beta term reduce the
    possibility of the network being trapped in the
    local minimum
  • This seems beta repeating alpha rule again!!! But
    not clear

4
Delta-bar-delta (DBD)
  • Use adaptive learning rate to speed up the
    convergence
  • The adaptive rate adopted base on local
    optimization
  • Use gradient descent for the search direction,
    and use individual step sizes for each weight

5
RProp
  • Jervis and Fitzgerald (1993)
  • and limit the size of the step

6
Second order gradient methods
  • Newton
  • Gauss-Newton
  • Levenberg-Marquardt
  • Quickprop
  • Conjugate gradient descent
  • Broyde Fletcher-Goldfab-Shanno

7
Performance Surfaces
  • Taylor Series Expansion

8
Example
9
Plot of Approximations
10
Vector Case
11
Matrix Form
12
Performance Optimization
  • Steepest Descent
  • Basic Optimization Algorithm

13
Examples
14
Minimizing Along a Line
15
Example
16
Plot
17
Newtons Method
18
Example
19
Plot
20
Non-Quadratic Example
21
Different Initial Conditions
22
  • DIFFICULT
  • Inverse a singular matrix!!!
  • Complexity

23
Newtons Method
24
Matrix Form
25
Hessian
26
Gauss-Newton Method
27
Levenberg-Marquardt
28
Adjustment of mk
29
Application to Multilayer Network
30
Jacobian Matrix
31
Computing the Jacobian
32
Marquardt Sensitivity
33
Computing the Sensitivities
34
LMBP
  • Present all inputs to the network and compute the
    corresponding network outputs and the errors.
    Compute the sum of squared errors over all
    inputs.
  • Compute the Jacobian matrix. Calculate the
    sensitivities with the backpropagation algorithm,
    after initializing. Augment the individual
    matrices into the Marquardt sensitivities.
    Compute the elements of the Jacobian matrix.
  • Solve to obtain the change in the weights.
  • Recompute the sum of squared errors with the new
    weights. If this new sum of squares is smaller
    than that computed in step 1, then divide mk by
    u, update the weights and go back to step 1. If
    the sum of squares is not reduced, then multiply
    mk by u and go back to step 3.

35
Example LMBP Step
36
LMBP Trajectory
37
Conjugate Gradient
38
LRLS method
  • Recursive least square method based method, does
    not need gradient descent.

39
LRLS method cont.
40
LRLS method cont.
41
Example for compare methods
Algorithms Average time steps (cut off at 0.15) Average time steps (cut off at 0.1) Average time steps (cut off at 0.05)
BPM 101 175 594
DBD 64 167 2382
LM 2 123 264
LRLS 3 9 69
42
IGLS (Integration of the Gradient and Least
Square)
  • The LRLS algorithm has faster convergence speed
    than BPM algorithm. However, it is
    computationally more complex and is not ideal for
    very large network size. The learning strategy
    describe here combines ideas from both the
    algorithm to achieve the objectives of
    maintaining complexity for large network size and
    reasonably fast convergence.
  • The out put layer weights update using BPM method
  • The hidden layer weights update using BPM method
Write a Comment
User Comments (0)
About PowerShow.com