CS621: Artificial Intelligence, Lecture 18: Feedforward network (contd.)

Transcript and Presenter's Notes
1
CS621 Artificial Intelligence, Lecture 18
Feedforward network (contd.)
  • Pushpak Bhattacharyya
  • Computer Science and Engineering Department
  • IIT Bombay

2
Pocket Algorithm
  • The algorithm, which evolved in 1985, essentially uses the PTA
    (Perceptron Training Algorithm)
  • Basic idea
  • Always preserve the best weights obtained so far
    "in the pocket"
  • Change the stored weights only if the changed weights are found to
    be better (i.e. the changed weights result in reduced error);
    a minimal sketch follows.
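A minimal sketch of the pocket idea, assuming a simple threshold unit over NumPy feature vectors with 0/1 labels (the names train_pocket, count_errors, X, y are illustrative, not from the lecture; a bias can be handled by appending a constant 1 to every input vector):

```python
import numpy as np

def count_errors(w, X, y):
    """Number of misclassified patterns for a threshold unit with weights w."""
    return int(sum(yi != (1 if xi @ w >= 0 else 0) for xi, yi in zip(X, y)))

def train_pocket(X, y, epochs=100, seed=0):
    """PTA with a 'pocket': always keep the best weights seen so far."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1])                    # current PTA weights
    pocket_w, pocket_err = w.copy(), count_errors(w, X, y)
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            oi = 1 if xi @ w >= 0 else 0
            w = w + (yi - oi) * xi                     # standard PTA update
            err = count_errors(w, X, y)
            if err < pocket_err:                       # changed weights are better:
                pocket_w, pocket_err = w.copy(), err   # put them in the pocket
    return pocket_w
```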

3
XOR using 2 layers
  • A non-linearly-separable (non-LS) function expressed as a linearly
    separable function of individual linearly separable functions

4
Example - XOR
[Figure: calculation of XOR as the OR of x1x̄2 and x̄1x2. The output
unit has threshold 0.5 and weights w2 = 1, w1 = 1 over the two hidden
units. Inset: calculation of x̄1x2 by a unit with threshold 1 and
weights w2 = 1.5 (from x2), w1 = -1 (from x1).]
5
Example - XOR
[Figure: the complete two-layer network for XOR. Output unit: threshold
0.5, weights w2 = 1 and w1 = 1 from the two hidden units computing
x1x̄2 and x̄1x2. Each hidden unit has threshold 1; the x1x̄2 unit
receives weights 1.5 from x1 and -1 from x2, and the x̄1x2 unit
receives -1 from x1 and 1.5 from x2. Inputs: x1, x2.]
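A quick check of the network sketched above, using ordinary threshold units and the values read off the figure (thresholds 1 for the hidden units, 0.5 for the output unit; treat the exact numbers as reconstructed from the slide):

```python
def step(net, theta):
    """Threshold (step) unit: fires iff the net input reaches the threshold."""
    return 1 if net >= theta else 0

def xor_net(x1, x2):
    h1 = step(1.5 * x1 - 1.0 * x2, 1.0)   # computes x1 AND NOT x2
    h2 = step(-1.0 * x1 + 1.5 * x2, 1.0)  # computes NOT x1 AND x2
    return step(1.0 * h1 + 1.0 * h2, 0.5) # OR of the two hidden units

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))    # prints the XOR truth table
```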
6
Some Terminology
  • A multilayer feedforward neural network has
  • Input layer
  • Output layer
  • Hidden layer (assists in the computation)
  • Output units and hidden units are called
    computation units

7
Training of the MLP
  • Multilayer Perceptron (MLP)
  • Question: how to find weights for the hidden
    layers when no target output is available?
  • This credit assignment problem is solved by
    gradient descent

8
Gradient Descent Technique
  • Let E be the error at the output layer
  • ti = target output, oi = observed output of the ith neuron
  • i is the index going over the n neurons in the
    outermost layer
  • j is the index going over the p patterns (1 to p)
  • Example: for XOR, p = 4 and n = 1
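The formula for E did not survive in the transcript; with the definitions above (superscript j indexing patterns), the usual sum-of-squared-errors form is

$$E = \frac{1}{2}\sum_{j=1}^{p}\sum_{i=1}^{n}\left(t_i^{\,j} - o_i^{\,j}\right)^2$$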

9
Weights in a ff NN
  • wmn is the weight of the connection from the nth
    neuron to the mth neuron
  • The E vs. W (weight vector) surface is a complex
    surface in the space defined by the weights wij
  • -∂E/∂wmn gives the direction in which a movement
    of the operating point in the wmn co-ordinate
    space will result in maximum decrease in error

[Figure: weight wmn on the connection from neuron n to neuron m.]
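Written out (the learning-rate symbol η is an addition here, not on the slide), the gradient descent step moves each weight against the gradient:

$$\Delta w_{mn} = -\,\eta\,\frac{\partial E}{\partial w_{mn}}, \qquad \eta > 0$$

so the weight moves along $-\partial E/\partial w_{mn}$, the direction of maximum decrease in error.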
10
Sigmoid neurons
  • Gradient descent needs a derivative computation
    - not possible in the perceptron due to the
    discontinuous step function used!
  • ⇒ Sigmoid neurons with easy-to-compute
    derivatives are used!
  • The computing power comes from the non-linearity of
    the sigmoid function.

11
Derivative of Sigmoid function
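The derivation on this slide is missing from the transcript; the sigmoid and its derivative, in the standard form, are

$$o = \frac{1}{1 + e^{-net}}, \qquad \frac{do}{d\,net} = o\,(1 - o)$$

which is what makes the derivative easy to compute: it is expressed in terms of the output o itself.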
12
Training algorithm
  • Initialize weights to random values.
  • For input x = <xn, xn-1, ..., x0>, modify weights as
    follows
  • Target output t, observed output o
  • Iterate until E < ε (threshold)

13
Calculation of Δwi
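The calculation itself is missing from the transcript; for a single sigmoid neuron with $E = \frac{1}{2}(t - o)^2$ and net input $net = \sum_i w_i x_i$, the standard delta rule (consistent with the observations on the next slide) is

$$\Delta w_i = -\,\eta\,\frac{\partial E}{\partial w_i} = \eta\,(t - o)\,o\,(1 - o)\,x_i$$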
14
Observations
  • Does the training technique support our
    intuition?
  • The larger the xi, the larger is Δwi
  • The error burden is borne by the weight values
    corresponding to large input values

15
Backpropagation on feedforward network
16
Backpropagation algorithm
[Figure: a layered network. Input layer of n neurons at the bottom,
hidden layers in between (neuron i shown), output layer of m neurons
at the top (neuron j shown); wji is the weight of the connection from
neuron i to neuron j.]
  • Fully connected feed forward network
  • Pure FF network (no jumping of connections over
    layers)

17
Gradient Descent Equations
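The equations on this slide are missing from the transcript; in the standard notation, with $net_j$ the net input to neuron j and $\delta_j$ its negative error gradient (both symbols assumed here), they read

$$\Delta w_{ji} = -\,\eta\,\frac{\partial E}{\partial w_{ji}} = -\,\eta\,\frac{\partial E}{\partial net_j}\,\frac{\partial net_j}{\partial w_{ji}} = \eta\,\delta_j\,o_i, \qquad \delta_j \equiv -\,\frac{\partial E}{\partial net_j}$$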
18
Backpropagation for outermost layer
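The derivation is missing from the transcript; using the sigmoid derivative from slide 11, the standard result for the outermost layer is

$$\delta_j = (t_j - o_j)\,o_j\,(1 - o_j), \qquad \Delta w_{ji} = \eta\,(t_j - o_j)\,o_j\,(1 - o_j)\,o_i$$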
19
Backpropagation for hidden layers
[Figure: the same layered network. Neuron k lies in the output layer
(m o/p neurons), neuron j in the hidden layer below it, neuron i below
that; the input layer has n i/p neurons.]
δk is propagated backwards to find the value of δj
20
Backpropagation for hidden layers
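The derivation is missing from the transcript; with δk propagated back from the layer above as stated on the previous slide, the standard hidden-layer rule is

$$\delta_j = o_j\,(1 - o_j)\sum_{k \,\in\, \text{next layer}} w_{kj}\,\delta_k, \qquad \Delta w_{ji} = \eta\,\delta_j\,o_i$$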
21
General Backpropagation Rule
  • General weight updating rule: Δwji = η δj oi
  • where δj = (tj - oj) oj (1 - oj)
    for the outermost layer
  • and δj = oj (1 - oj) Σk wkj δk (k ranging over the
    next layer) for hidden layers
22
How does it work?
  • Inputs propagate forward and errors propagate
    backward (e.g. for XOR); a minimal end-to-end
    sketch follows.
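A minimal sketch of forward input propagation and backward error propagation on XOR, assuming a 2-2-1 sigmoid network trained with the rules above (all names, the learning rate 0.5, and the stopping threshold are illustrative; with an unlucky initialization XOR training can stall in a local minimum, in which case a different seed is needed):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR: p = 4 patterns, n = 1 output neuron
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
T = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 2)), np.zeros(2)    # input -> hidden weights, biases
W2, b2 = rng.normal(size=(2, 1)), np.zeros(1)    # hidden -> output weights, biases
eta = 0.5                                        # learning rate

for epoch in range(20000):
    # forward pass: inputs propagate forward
    h = sigmoid(X @ W1 + b1)                     # hidden outputs o_i
    o = sigmoid(h @ W2 + b2)                     # network outputs o_j
    E = 0.5 * np.sum((T - o) ** 2)               # summed over patterns and outputs
    if E < 1e-3:
        break
    # backward pass: errors propagate backward
    delta_out = (T - o) * o * (1 - o)            # outermost-layer delta
    delta_hid = (delta_out @ W2.T) * h * (1 - h) # hidden-layer delta (sum over k)
    # weight updates: delta_w_ji = eta * delta_j * o_i
    W2 += eta * h.T @ delta_out
    b2 += eta * delta_out.sum(axis=0)
    W1 += eta * X.T @ delta_hid
    b1 += eta * delta_hid.sum(axis=0)

print(np.round(o, 2))   # should approach [[0], [1], [1], [0]]
```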