1 Before we start ADALINE
- Test the response of your Hebb net and Perceptron on the following noisy version of the input (Exercise 2.6(d), pp. 98).
2 ADALINE
- ADAptive LINear NEuron
- Typically uses bipolar (1, -1) activations for its input signals and its target output.
- The weights are adjustable; the unit also has a bias whose activation is always 1.
Architecture of an ADALINE
3 ADALINE
- In general, an ADALINE can be trained using the delta rule, also known as least mean squares (LMS) or the Widrow-Hoff rule.
- The delta rule can also be used for single-layer nets with several output units.
- An ADALINE is the special case with only one output unit.
4 ADALINE
- The activation of the unit is its net input (identity activation function).
- The learning rule minimizes the mean squared error between the activation and the target value.
- This allows the net to continue learning on all training patterns, even after the correct output value is generated.
5 ADALINE
- After training, if the net is being used for pattern classification in which the desired output is either 1 or -1, a threshold function is applied to the net input to obtain the activation:
  If net_input >= 0, then activation = 1;
  else activation = -1.
6 The Algorithm
Step 0. Initialize all weights and bias (small random values are usually used). Set learning rate α (0 < α <= 1).
Step 1. While stopping condition is false, do Steps 2-6.
Step 2. For each bipolar training pair s:t, do Steps 3-5.
Step 3. Set activations for input units, i = 1, ..., n: xi = si.
Step 4. Compute net input to output unit: y_in = b + Σ xi wi.
7 The Algorithm
Step 5. Update weights and bias, i = 1, ..., n:
  If t ≠ y_in:
    wi(new) = wi(old) + α(t - y_in)xi
    b(new) = b(old) + α(t - y_in)
  else:
    wi(new) = wi(old); b(new) = b(old)
Step 6. Test stopping condition: if the largest weight change that occurred in Step 2 is smaller than a specified tolerance, then stop; otherwise continue.
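The training procedure above can be sketched in Python. The bipolar AND data come from Example 2 later in these slides; the function name, learning rate, tolerance, and epoch limit are illustrative choices, not part of the slides:

```python
import random

def train_adaline(patterns, alpha=0.1, tol=1e-4, max_epochs=1000):
    """Delta-rule (LMS / Widrow-Hoff) training of a single ADALINE unit."""
    n = len(patterns[0][0])
    random.seed(0)                                     # reproducible init
    w = [random.uniform(-0.5, 0.5) for _ in range(n)]  # Step 0: small random weights
    b = random.uniform(-0.5, 0.5)                      # Step 0: bias
    for _ in range(max_epochs):                        # Step 1
        largest_change = 0.0
        for x, t in patterns:                          # Steps 2-3
            y_in = b + sum(xi * wi for xi, wi in zip(x, w))  # Step 4
            delta = alpha * (t - y_in)                 # Step 5: delta rule
            w = [wi + delta * xi for wi, xi in zip(w, x)]
            b += delta
            # with bipolar inputs |xi| = 1, so |delta| bounds every change
            largest_change = max(largest_change, abs(delta))
        if largest_change < tol:                       # Step 6
            break
    return w, b

# Bipolar AND training pairs (Example 2 in these slides)
and_data = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
w, b = train_adaline(and_data)
# With a fixed small alpha, the weights settle near the LMS minimizer
# w1 = w2 = 1/2, b = -1/2 quoted in Example 2.
```

Note that with a fixed α the per-pattern error never reaches exactly zero at the LMS minimum, so in practice the loop stops on the epoch limit while the weights hover near the optimum.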
8 Setting the learning rate α
- It is common to start with a small value, e.g. α = 0.1.
- If α is too large, the learning process will not converge.
- If α is too small, learning will be extremely slow.
- For a single neuron, a practical range is 0.1 <= n·α <= 1.0, where n is the number of inputs.
9 Application
After training, an ADALINE unit can be used to classify input patterns. If the target values are bivalent (binary or bipolar), a step function can be applied as the activation function for the output unit.
Step 0. Initialize all weights (to the values found during training).
Step 1. For each bipolar input vector x, do Steps 2-4.
Step 2. Set activations of the input units to x.
Step 3. Compute net input to output unit: y_in = b + Σ xi wi.
Step 4. Apply the activation function: y = 1 if y_in >= 0, else y = -1.
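The application phase amounts to a thresholded dot product. A minimal sketch (the function name is my own; the weights are those quoted for the bipolar AND net in Example 2):

```python
def adaline_classify(x, w, b):
    """Apply a trained ADALINE: net input (Step 3), then bipolar step (Step 4)."""
    y_in = b + sum(xi * wi for xi, wi in zip(x, w))
    return 1 if y_in >= 0 else -1

# Weights from Example 2 (bipolar AND): w1 = w2 = 1/2, w0 = -1/2
w, b = [0.5, 0.5], -0.5
print(adaline_classify((1, 1), w, b))    # -> 1
print(adaline_classify((-1, 1), w, b))   # -> -1
```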
10 Example 1
- ADALINE for the AND function: binary input, bipolar targets
  (x1 x2 : t)
  (1 1 : 1)
  (1 0 : -1)
  (0 1 : -1)
  (0 0 : -1)
- The delta rule in ADALINE is designed to find weights that minimize the total error
  E = Σ_{p=1..4} (x1(p) w1 + x2(p) w2 + w0 - t(p))²
  where x1(p) w1 + x2(p) w2 + w0 is the net input to the output unit for pattern p, and t(p) is the associated target for pattern p.
11 Example 1
- ADALINE for the AND function: binary input, bipolar targets.
- The delta rule in ADALINE is designed to find weights that minimize the total error.
- Weights that minimize this error are w1 = 1, w2 = 1, w0 = -3/2.
- Separating line: x1 + x2 - 3/2 = 0.
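The quoted minimizer can be checked numerically by evaluating the total error E at w1 = w2 = 1, w0 = -3/2 over the four binary patterns (the helper name is my own):

```python
def total_error(w1, w2, w0, patterns):
    """E = sum over patterns of (net input - target)^2."""
    return sum((x1 * w1 + x2 * w2 + w0 - t) ** 2 for (x1, x2), t in patterns)

# Binary inputs, bipolar targets (Example 1)
and_binary = [((1, 1), 1), ((1, 0), -1), ((0, 1), -1), ((0, 0), -1)]
print(total_error(1, 1, -1.5, and_binary))  # -> 1.0 (each pattern contributes 0.25)
```

Note the minimum error is not zero: a linear unit cannot hit the bipolar targets exactly from binary inputs, it can only get the signs right.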

12 Example 2
- ADALINE for the AND function: bipolar input, bipolar targets
  (x1 x2 : t)
  (1 1 : 1)
  (1 -1 : -1)
  (-1 1 : -1)
  (-1 -1 : -1)
- The delta rule in ADALINE is designed to find weights that minimize the total error
  E = Σ_{p=1..4} (x1(p) w1 + x2(p) w2 + w0 - t(p))²
  where x1(p) w1 + x2(p) w2 + w0 is the net input to the output unit for pattern p, and t(p) is the associated target for pattern p.
13 Example 2
- ADALINE for the AND function: bipolar input, bipolar targets.
- Weights that minimize this error are w1 = 1/2, w2 = 1/2, w0 = -1/2.
- Separating line: (1/2)x1 + (1/2)x2 - 1/2 = 0.
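Here too the quoted weights can be checked numerically: perturbing any one of w1, w2, w0 away from (1/2, 1/2, -1/2) only increases the total error, so the point is indeed a minimizer (the helper name is my own):

```python
def total_error(w1, w2, w0, patterns):
    """E = sum over patterns of (net input - target)^2."""
    return sum((x1 * w1 + x2 * w2 + w0 - t) ** 2 for (x1, x2), t in patterns)

# Bipolar inputs, bipolar targets (Example 2)
and_bipolar = [((1, 1), 1), ((1, -1), -1), ((-1, 1), -1), ((-1, -1), -1)]
e_min = total_error(0.5, 0.5, -0.5, and_bipolar)   # -> 1.0
for eps in (0.1, -0.1):
    # every one-coordinate perturbation strictly increases E
    assert total_error(0.5 + eps, 0.5, -0.5, and_bipolar) > e_min
    assert total_error(0.5, 0.5 + eps, -0.5, and_bipolar) > e_min
    assert total_error(0.5, 0.5, -0.5 + eps, and_bipolar) > e_min
```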

14 Examples
- Example 3: ADALINE for the AND NOT function: bipolar input, bipolar targets.
- Example 4: ADALINE for the OR function: bipolar input, bipolar targets.
15 Derivations
- Delta rule for a single output unit.
- The delta rule changes the weights of the connections so as to minimize the difference between the net input to the output unit, y_in, and the target value t.
- It does so by reducing the error for each pattern, one at a time.
- The delta rule for the I-th weight (for each pattern) is:
  ΔwI = α(t - y_in)xI
16 Derivations
- The squared error for a particular training pattern is E = (t - y_in)².
- E is a function of all the weights wi, i = 1, ..., n.
- The gradient of E is the vector consisting of the partial derivatives of E with respect to each of the weights.
- The gradient gives the direction of most rapid increase in E; the opposite direction gives the most rapid decrease in the error.
- The error can be reduced by adjusting the weight wI in the direction of -∂E/∂wI.
17 Derivations
-∂E/∂wI = 2(t - y_in)xI
The local error will be reduced most rapidly by adjusting the weights according to the delta rule:
ΔwI = α(t - y_in)xI
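The differentiation implicit on this slide can be written out step by step: with y_in = b + Σ_i x_i w_i,

```latex
\frac{\partial E}{\partial w_I}
  = \frac{\partial}{\partial w_I}\,\bigl(t - y_{\mathrm{in}}\bigr)^2
  = -2\,\bigl(t - y_{\mathrm{in}}\bigr)\,
      \frac{\partial\, y_{\mathrm{in}}}{\partial w_I}
  = -2\,\bigl(t - y_{\mathrm{in}}\bigr)\, x_I
```

so -∂E/∂wI = 2(t - y_in)xI, and absorbing the constant factor 2 into the learning rate α gives ΔwI = α(t - y_in)xI.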
18 Derivations
- Delta rule for multiple output units.
- The delta rule for the weight from the I-th input to the J-th output unit (for each pattern) is:
  ΔwIJ = α(tJ - y_inJ)xI
19 Derivations
- The squared error for a particular training pattern is E = Σ_{j=1..m} (tj - y_inj)².
- E is a function of all the weights wij.
- The error can be reduced by adjusting the weight wIJ in the direction of -∂E/∂wIJ:
  ∂E/∂wIJ = ∂/∂wIJ Σ_{j=1..m} (tj - y_inj)² = ∂/∂wIJ (tJ - y_inJ)²
Continued pp. 88
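Carrying the differentiation through (only the J-th term of the sum depends on w_IJ, and ∂y_inJ/∂w_IJ = x_I):

```latex
\frac{\partial E}{\partial w_{IJ}}
  = \frac{\partial}{\partial w_{IJ}}
      \sum_{j=1}^{m} \bigl(t_j - y_{\mathrm{in}\,j}\bigr)^2
  = \frac{\partial}{\partial w_{IJ}}
      \bigl(t_J - y_{\mathrm{in}\,J}\bigr)^2
  = -2\,\bigl(t_J - y_{\mathrm{in}\,J}\bigr)\, x_I
```

so the local error is reduced most rapidly by ΔwIJ = α(tJ - y_inJ)xI, the multi-output rule stated on the previous slide.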
20 Exercise
- Adaline Network Simulator:
  http://www.neuralnetworksatyourfingertips.com/adaline.html
21 MADALINE
- Many ADAptive LINear NEurons
Architecture of a MADALINE with two hidden ADALINEs and one output ADALINE
22 MADALINE
- The derivation of the delta rule for several outputs shows no change in the training process when several ADALINEs are combined.
- The outputs of the two hidden ADALINEs, z1 and z2, are determined by signals from the input units X1 and X2.
- Each output signal is the result of applying a threshold function to the unit's net input.
- y is a nonlinear function of the input vector (x1, x2).
23 MADALINE
- Why do we need hidden units?
- The use of hidden units Z1 and Z2 gives the net computational capabilities not found in single-layer nets,
- but it complicates the training process.
- Two algorithms:
  - MRI: only the weights for the hidden ADALINEs are adjusted; the weights for the output unit are fixed.
  - MRII: provides methods for adjusting all weights in the net.
24 ALGORITHM MRI
The weights v1 and v2 and the bias b3 that feed into the output unit Y are determined so that the response of Y is 1 if the signal it receives from either Z1 or Z2 (or both) is 1, and is -1 if both Z1 and Z2 send a signal of -1. Thus the unit Y performs the logic function OR on the signals it receives from Z1 and Z2.
Set v1 = 1/2, v2 = 1/2 and b3 = 1/2 (see Example 2.19, the OR function).
25 ALGORITHM MRI
- Training pairs:
   x1  x2   t
    1   1  -1
    1  -1   1
   -1   1   1
   -1  -1  -1
- Set α = 0.5
- Initial weights into Z1, Z2 and Y:
   w11  w21  b1   w12  w22  b2   v1  v2  b3
   .05  .2   .3   .1   .2   .15  .5  .5  .5
Set v1 = 1/2, v2 = 1/2 and b3 = 1/2 (see Example 2.19, the OR function).
26 The Algorithm
Step 0. Initialize all weights and biases:
  weights into the hidden units: small random values (i = 1 to n);
  v1 = v2 = 1/2, b3 = 1/2 (fixed).
  Set learning rate α (0 < α <= 1).
Step 1. While stopping condition is false, do Steps 2-8.
Step 2. For each bipolar training pair s:t, do Steps 3-7.
Step 3. Set activations for input units: xi = si.
Step 4. Compute net input to each hidden ADALINE unit:
  z_in1 = b1 + x1 w11 + x2 w21
  z_in2 = b2 + x1 w12 + x2 w22
Step 5. Determine output of each hidden ADALINE:
  z1 = f(z_in1)
  z2 = f(z_in2)
Step 6. Determine output of net:
  y_in = b3 + z1 v1 + z2 v2; y = f(y_in),
  where f(x) = 1 if x >= 0, and -1 if x < 0.
27 The Algorithm
Step 7. Update weights and biases if an error occurred for this pattern:
  If t = y, no weight updates are performed;
  otherwise:
  If t = 1, then update weights on ZJ, the unit whose net input is closest to 0:
    wiJ(new) = wiJ(old) + α(1 - z_inJ)xi
    bJ(new) = bJ(old) + α(1 - z_inJ)
  If t = -1, then update weights on all units ZK that have positive net input:
    wiK(new) = wiK(old) + α(-1 - z_inK)xi
    bK(new) = bK(old) + α(-1 - z_inK)
Step 8. Test stopping condition:
  If weight changes have stopped (or reached an acceptable level), or if a specified maximum number of weight-update iterations (Step 2) has been performed, then stop; otherwise continue.
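Steps 4-7 of MRI can be sketched in Python. The training pairs and initial weights follow the table on slide 25, with α = 0.5 and the OR-unit weights held fixed; the function names and the epoch limit are my own choices:

```python
def step(x):
    """Bipolar threshold used by each ADALINE unit."""
    return 1 if x >= 0 else -1

def mri_epoch(patterns, w, b, v, b3, alpha=0.5):
    """One epoch of MRI: adjust only the hidden-ADALINE weights.

    w[j][i]: weight from input i to hidden unit j; b[j]: hidden biases.
    v, b3: fixed OR-unit weights (v1 = v2 = 1/2, b3 = 1/2).
    """
    for (x1, x2), t in patterns:
        z_in = [b[j] + x1 * w[j][0] + x2 * w[j][1] for j in range(2)]  # Step 4
        z = [step(zi) for zi in z_in]                                  # Step 5
        y = step(b3 + v[0] * z[0] + v[1] * z[1])                       # Step 6
        if y == t:
            continue                                   # Step 7: no error, no update
        if t == 1:
            j = min(range(2), key=lambda k: abs(z_in[k]))  # unit closest to 0
            delta = alpha * (1 - z_in[j])
            w[j] = [w[j][0] + delta * x1, w[j][1] + delta * x2]
            b[j] += delta
        else:  # t == -1: push every positive-net-input unit toward -1
            for j in range(2):
                if z_in[j] > 0:
                    delta = alpha * (-1 - z_in[j])
                    w[j] = [w[j][0] + delta * x1, w[j][1] + delta * x2]
                    b[j] += delta
    return w, b

# Training pairs and initial weights from slide 25
patterns = [((1, 1), -1), ((1, -1), 1), ((-1, 1), 1), ((-1, -1), -1)]
w = [[0.05, 0.2], [0.1, 0.2]]   # rows: weights into Z1, Z2
b = [0.3, 0.15]
v, b3 = [0.5, 0.5], 0.5
for _ in range(50):              # Step 8 simplified to a fixed epoch cap
    w, b = mri_epoch(patterns, w, b, v, b3)
```

With these initial values the net classifies all four training pairs correctly after a few epochs, after which `mri_epoch` makes no further changes.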