Transcript and Presenter's Notes

Title: 2L490 CNperceptrons 1


1
Disadvantages of Discrete Neurons
  • Only Boolean-valued functions can be computed
  • A simple learning algorithm for multi-layer
    discrete-neuron perceptrons is lacking
  • The computational capabilities of single-layer
    discrete-neuron perceptrons are limited
  • These disadvantages disappear when we consider
    multi-layer continuous-neuron perceptrons

2
Preliminaries
  • A continuous-neuron perceptron with n inputs and m
    outputs computes
  • a function R^n → [0, 1]^m, when the sigmoid
    activation function is used
  • a function R^n → R^m, when a linear activation
    function is used
  • The learning rules for continuous-neuron
    perceptrons are based on optimization techniques
    for error functions. This requires a continuous
    and differentiable error function.

3
Sigmoid transfer function
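The standard (logistic) sigmoid and its derivative, which the delta rule
below relies on, can be sketched in Python as follows (a minimal sketch;
the function names are illustrative):

    import numpy as np

    def sigmoid(x):
        # Standard logistic sigmoid 1 / (1 + exp(-x)); maps R into (0, 1)
        return 1.0 / (1.0 + np.exp(-x))

    def sigmoid_prime(x):
        # Derivative of the sigmoid: sigmoid(x) * (1 - sigmoid(x))
        s = sigmoid(x)
        return s * (1.0 - s)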
4
Computational Capabilities
  • Let g : [0, 1]^n → R be a continuous function and
    let ε > 0. Then there exists a two-layer
    perceptron with
  • a first layer built from neurons with threshold and
    standard sigmoid activation function
  • a second layer built from one neuron without
    threshold and linear activation function
  • such that the function G computed by this network
    satisfies |G(x) − g(x)| < ε for all x in [0, 1]^n
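As an illustration of the network shape this theorem refers to (not a
construction of the approximation itself), such a two-layer network can be
sketched as follows; the parameter names W, theta and v are assumptions
made for the sketch:

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def two_layer_net(x, W, theta, v):
        # First layer: sigmoid neurons with thresholds theta
        h = sigmoid(W @ x + theta)
        # Second layer: one linear neuron without threshold
        return v @ h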

5
Single-layer networks
  • Compute a function from R^n to [0, 1]^m
  • Sufficient to consider a single neuron
  • Compute a function f(w_0 + Σ_{1 ≤ j ≤ n} w_j x_j)
  • Assume x_0 = 1; then compute
    a function f(Σ_{0 ≤ j ≤ n} w_j x_j)
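A single continuous neuron with the bias absorbed into the weight vector
via x_0 = 1 can be sketched as follows (names are illustrative):

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def neuron_output(w, x, f=sigmoid):
        # Prepend x_0 = 1 so that w[0] acts as the threshold/bias weight
        x_aug = np.concatenate(([1.0], x))
        return f(w @ x_aug)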

6
Error function
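A standard form of the error function minimized by the delta rule, summed
over the training pairs (x^(q), d^(q)) of a single-output neuron, is (the
factor 1/2 and this notation are assumptions of this sketch):

\[
E(w) \;=\; \tfrac{1}{2} \sum_{q=1}^{Q} \bigl( d^{(q)} - y^{(q)} \bigr)^{2},
\qquad
y^{(q)} \;=\; f\Bigl( \sum_{j=0}^{n} w_j x_j^{(q)} \Bigr)
\]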
7
Gradient Descent
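Gradient descent moves the weights a small step against the gradient of the
error function; with learning parameter α (symbol assumed) the update reads:

\[
w_i \;:=\; w_i \;-\; \alpha\, \frac{\partial E}{\partial w_i}
\]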
8
Update of Weight i by Training Pair q
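Differentiating the per-pair error E^(q) = ½ (d^(q) − y^(q))² with respect
to w_i gives the standard per-pair update (notation assumed):

\[
\Delta^{(q)} w_i \;=\; \alpha \bigl( d^{(q)} - y^{(q)} \bigr)\,
f'\bigl(\mathrm{net}^{(q)}\bigr)\, x_i^{(q)},
\qquad
\mathrm{net}^{(q)} \;=\; \sum_{j=0}^{n} w_j x_j^{(q)}
\]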
9
Delta Rule Learning (incremental version,
arbitrary transfer function)
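A minimal Python sketch of the incremental delta rule for a single neuron
with an arbitrary differentiable transfer function f (function names and the
fixed number of epochs are illustrative choices):

    import numpy as np

    def delta_rule_incremental(X, d, f, f_prime, alpha=0.1, epochs=100):
        # X: (Q, n) inputs, d: (Q,) targets; x_0 = 1 is prepended so w[0] is the bias
        X = np.hstack([np.ones((X.shape[0], 1)), X])
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            for x_q, d_q in zip(X, d):      # one weight update per training pair
                net = w @ x_q
                y = f(net)
                w += alpha * (d_q - y) * f_prime(net) * x_q
        return w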
10
Delta Rule Learning (incremental version,
sigmoid transfer function)
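With the sigmoid transfer function the derivative can be expressed in the
output itself, f'(net) = y (1 − y), so the per-pair update simplifies to:

\[
\Delta^{(q)} w_i \;=\; \alpha \bigl( d^{(q)} - y^{(q)} \bigr)\,
y^{(q)} \bigl( 1 - y^{(q)} \bigr)\, x_i^{(q)}
\]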
11
Delta Rule Learning (incremental version, linear
transfer function)
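For a linear transfer function f(net) = net the derivative is 1, and the
update reduces to the classical Widrow-Hoff / LMS form:

\[
\Delta^{(q)} w_i \;=\; \alpha \bigl( d^{(q)} - y^{(q)} \bigr)\, x_i^{(q)}
\]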
12
Stop Criteria
  • The mean square error becomes small enough
  • The mean square error does not decrease anymore,
    i.e. the gradient has become very small or even
    changes sign
  • The maximum number of iterations has been exceeded
    (these criteria are sketched in code below)
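A minimal sketch of how these three criteria can terminate batch training of
a linear neuron (the thresholds eps and max_iter are illustrative parameters):

    import numpy as np

    def train_until_stop(X, d, alpha=0.01, eps=1e-4, max_iter=10000):
        # Batch delta rule for a linear neuron; x_0 = 1 prepended for the bias weight
        X = np.hstack([np.ones((X.shape[0], 1)), X])
        w = np.zeros(X.shape[1])
        for it in range(max_iter):              # criterion 3: iteration limit
            err = d - X @ w
            grad = -(X.T @ err) / len(d)
            if np.mean(err ** 2) < eps:         # criterion 1: error small enough
                break
            if np.linalg.norm(grad) < eps:      # criterion 2: error no longer decreases
                break
            w -= alpha * grad
        return w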

13
Remarks
  • Delta rule learning is also called Least Mean
    Square (LMS) learning or Widrow-Hoff learning
  • Note that the incremental version of the delta
    rule is, strictly speaking, not a gradient descent
    algorithm, because in each step a different error
    function E^(q) is used
  • Convergence of the incremental version can only
    be guaranteed if the learning parameter α goes to
    0 during learning (a decaying schedule is sketched
    below)
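One common way to let the learning parameter decay towards 0, with a
1/t-style schedule chosen purely for illustration:

    def decaying_alpha(alpha0, t, tau=100.0):
        # Learning parameter shrinks towards 0 as the step count t grows
        return alpha0 / (1.0 + t / tau)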

14
Perceptron Learning Rule (batch version,
arbitrary transfer function)
15
Perceptron Learning Delta Rule (batch version,
sigmoidal transfer function)
16
Perceptron Learning Rule (batch version, linear
transfer function)
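A minimal sketch of the batch version: the per-pair updates are accumulated
over the whole training set before the weights are changed (shown for an
arbitrary differentiable transfer function f; names are illustrative):

    import numpy as np

    def delta_rule_batch(X, d, f, f_prime, alpha=0.1, epochs=100):
        # X: (Q, n) inputs, d: (Q,) targets; x_0 = 1 prepended so w[0] is the bias
        X = np.hstack([np.ones((X.shape[0], 1)), X])
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            net = X @ w
            y = f(net)
            # Sum the per-pair updates, then apply them in a single step
            w += alpha * X.T @ ((d - y) * f_prime(net))
        return w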
17
Convergence of the batch version
For a small enough learning parameter, the batch
version of the delta rule always converges. The
resulting weights, however, may correspond to a
local minimum of the error function instead of
the global minimum.
18
Linear Neurons and Least Squares
19
Linear Neurons and Least Squares
20
C is non-singular
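For a linear neuron the error function is quadratic, so its minimum can also
be found directly from the normal equations C w = b, with C the input
correlation matrix and b the input-target correlation vector (this notation
is an assumption here); when C is non-singular the solution is unique. A
sketch:

    import numpy as np

    def least_squares_weights(X, d):
        # x_0 = 1 prepended so the bias is part of the weight vector
        X = np.hstack([np.ones((X.shape[0], 1)), X])
        C = X.T @ X          # input correlation matrix
        b = X.T @ d          # input-target correlation vector
        # Unique solution of C w = b, provided C is non-singular
        return np.linalg.solve(C, b)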
21
Linear Least Squares Convergence
22
(No Transcript)
23
Linear Least Squares Convergence
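For a linear neuron the error surface is quadratic, so the batch update
converges exactly when the learning parameter is small relative to the
curvature; with C the correlation matrix from the least-squares sketch above
and λ_max(C) its largest eigenvalue, the standard condition is:

\[
0 \;<\; \alpha \;<\; \frac{2}{\lambda_{\max}(C)}
\]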
24
Find the line
25
Solution
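A line y = w_0 + w_1 x can be fitted through a set of points with exactly
the linear least-squares machinery above; the points below are made up for
illustration and are not the exercise data:

    import numpy as np

    # Illustrative data points only
    xs = np.array([0.0, 1.0, 2.0, 3.0])
    ds = np.array([1.0, 2.9, 5.1, 7.0])

    # Normal equations for the line y = w0 + w1 * x
    X = np.column_stack([np.ones_like(xs), xs])
    w0, w1 = np.linalg.solve(X.T @ X, X.T @ ds)
    print("w0 = %.3f, w1 = %.3f" % (w0, w1))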