1
Chapter 5 Recurrent Networks and Temporal
Feedforward Networks
  • Instructor: Chuan-Yu Chang
  • Office: ES 709
  • TEL: 05-5342601 ext. 4337
  • E-mail: chuanyu@yuntech.edu.tw

2
Overview of Recurrent Neural Networks
  • A network that has closed loops in its
    topological structure is considered a recurrent
    network.
  • Feedforward networks
  • Implement a fixed weighted mapping from input
    space to output space.
  • The state of any neuron is solely determined by
    the input to the unit, not by the initial and
    past states of the neuron.
  • Recurrent neural networks
  • Recurrent neural networks utilize feedback to
    allow the initial and past states to take part,
    along with serial processing.
  • Fault-tolerant.
  • These networks can be fully connected.
  • The connection weights in a recurrent neural
    network can be symmetric or asymmetric.
  • In the symmetric case (wij = wji), the network
    always converges to a stable point. However,
    these networks cannot accommodate temporal
    sequences of patterns.
  • In the asymmetric case (wij ≠ wji), the dynamics
    of the network can exhibit limit cycles and
    chaos, and with the proper selection of weights,
    temporal/spatial patterns can be generated and
    stored in the network.

3
Hopfield Associative Memory
  • Hopfield (1982)
  • Physical systems consisting of a large number of
    simple neurons can exhibit collective emergent
    properties.
  • A collective property of a system cannot emerge
    from a single neuron, but it can emerge from
    local neuron interactions in the system.
  • Produce a content-addressable memory that can
    correctly yield an entire memory from partial
    information.

4
Hopfield Associative Memory (cont.)
  • The standard discrete-time Hopfield neural
    network
  • A kind of recurrent network
  • Can be viewed as a nonlinear associative memory,
    or content-addressable memory.
  • To perform a dynamic mapping function.
  • Intended to perform the function of data storage
    and retrieval.
  • The network stores the information in a
    dynamically stable environment.
  • A stored pattern in memory is to be retrieved in
    response to an input pattern that is a noisy
    version (incomplete) of the stored pattern.

5
Hopfield Associative Memory (cont.)
  • Content-addressable memory (CAM)
  • An attractor is a state toward which the system
    evolves in time, starting from a set of initial
    conditions (its basin of attraction).
  • If an attractor is a unique point in the state
    space, it is called a fixed point.
  • A prototype state ξh is represented by a fixed
    point sh of the dynamic system.
  • Thus, ξh is mapped onto the stable point sh of
    the network.

6
Hopfield Associative Memory (cont.)
7
Hopfield Associative Memory (cont.)
Activation function: symmetric hard limiter.
The output can only be +1 or -1.
The output of a neuron is not fed back to itself;
therefore, wii = 0 for all i.
8
Hopfield Associative Memory (cont.)
  • The output of the linear combiner is written as
        vi(k) = Σj wij xj(k) + θi                                (5.1)
    where x(k) = [x1(k), x2(k), ..., xn(k)]T is the state of the
    network and θi is the external threshold applied to neuron i.
  • The state of each neuron is given by
        xi(k+1) = sgn[vi(k)] = +1 if vi(k) > 0, -1 if vi(k) < 0,
                  and xi(k) (unchanged) if vi(k) = 0             (5.2)
  • The vector-matrix form of (5.1) is given by
        v(k) = W x(k) + θ                                        (5.3)
9
Hopfield Associative Memory (cont.)
  • The network weight matrix W is written as
        W = [w1, w2, ..., wn]T, where wi = [wi1, wi2, ..., win]T
        is the weight vector of neuron i                         (5.4)
    Each row in (5.4) is the associated weight vector for one
    neuron.
  • The output of the network can be written as
    Vector-matrix form:
        x(k+1) = sgn[W x(k) + θ]                                 (5.5)
    Scalar form:
        xi(k+1) = sgn[Σj wij xj(k) + θi],  i = 1, ..., n         (5.6)
10
Hopfield Associative Memory (cont.)
  • There are two basic operational phases associated
    with the Hopfield network the storage phase and
    the recall phase.
  • During the storage phase, the associative memory
    is build according to the outer-product rule for
    correlation matrix memories.
  • Given the set of r prototype memories,
    the network weight matrix is computed
    as
  • Recall phase
  • A test input vector x
  • The state of network x(k) is initialized with the
    values of the unknown input, ie., x(0)x.
  • Using the Eq.(5.6), the elements of the state
    vector x(k) are updated one at a time until there
    is no significant change in the elements of the
    vector. When this condition is reached, the
    stable state xe is the network output.

(5.7)
????wij0 for ij
11
Hopfield Associative Memory (cont.)
  • Discrete-time Hopfield network training algorithm (a minimal
    code sketch follows these steps)
  • Step 1 (Storage phase): Given a set of prototype memories
    {ξ1, ..., ξr}, the synaptic weights of the network are
    calculated, using (5.7), according to
        wij = (1/n) Σh ξh,i ξh,j for i ≠ j, and wii = 0           (5.8)
  • Step 2 (Recall phase): Given an unknown input vector x, the
    Hopfield network is initialized by setting the state of the
    network x(k) at time k = 0 to x, that is,
        x(0) = x                                                  (5.9)
  • Step 3: The elements of the state of the network x(k) are
    updated asynchronously according to (5.6), that is,
        xi(k+1) = sgn[Σj wij xj(k) + θi]                          (5.10)
  • This iterative process is continued until it can be shown that
    the elements of the state vector do not change. When this
    condition is met, the network outputs the equilibrium state
        xe = x(k), reached when x(k+1) = x(k)                     (5.11)
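A minimal NumPy sketch (not the textbook's code) of the two phases above;
the 1/n scaling of the outer-product rule and the random asynchronous
update order are assumptions made only for illustration.

```python
import numpy as np

def store(prototypes):
    """Outer-product (Hebbian) storage in the spirit of (5.7), with zero diagonal."""
    n = prototypes.shape[1]
    W = prototypes.T @ prototypes / n        # sum of outer products, scaled by 1/n
    np.fill_diagonal(W, 0.0)                 # no self-feedback: wii = 0
    return W

def recall(W, x, theta=None, max_sweeps=100, rng=None):
    """Asynchronous recall: update one randomly chosen neuron at a time."""
    rng = np.random.default_rng() if rng is None else rng
    theta = np.zeros(len(x)) if theta is None else theta
    x = x.astype(float).copy()
    for _ in range(max_sweeps):
        changed = False
        for i in rng.permutation(len(x)):    # asynchronous, random update order
            v = W[i] @ x + theta[i]
            new = x[i] if v == 0 else np.sign(v)   # keep previous state when v = 0
            if new != x[i]:
                x[i], changed = new, True
        if not changed:                      # equilibrium state reached
            break
    return x

# Example: store one random bipolar prototype and recall it from a probe
# in which two bits have been flipped.
rng = np.random.default_rng(0)
xi = rng.choice([-1.0, 1.0], size=16)
W = store(xi[None, :])
probe = xi.copy()
probe[:2] *= -1
print(np.array_equal(recall(W, probe, rng=rng), xi))   # expected: True
```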
12
Hopfield Associative Memory (cont.)
  • The major problem associated with the Hopfield
    network is spurious equilibrium states.
  • These are stable equilibrium states that are not
    part of the design set of prototype memories.
  • Causes of spurious equilibrium states:
  • They can result from linear combinations of an
    odd number of patterns.
  • For a large number of prototype memories to be
    stored, there can exist local minima in the
    energy landscape.
  • Spurious attractors can result from the symmetric
    energy function.
  • Li et al. proposed a design approach based on a
    system of first-order linear ordinary
    differential equations, in which the number of
    spurious attractors is minimized.

13
Hopfield Associative Memory (cont.)
  • Because the Hopfield network has symmetric weights and no neuron
    self-loops, an energy function (Lyapunov function) can be
    defined.
  • An energy function for the discrete-time Hopfield neural network
    can be written as
        E = -(1/2) xT W x - xT θ                                  (5.12)
    where x is the state of the network and θ is the vector of
    external thresholds (the externally applied inputs presented to
    the network).
  • The change in the energy function caused by a change Δxi in the
    state of neuron i is given by
        ΔE = -Δxi (Σj wij xj + θi) = -Δxi vi ≤ 0                  (5.13)
  • The operation of the Hopfield network therefore leads to a
    monotonically decreasing energy function, and changes in the
    state of the network will continue until a local minimum of the
    energy landscape is reached (a numerical check follows).
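A small numerical check, assuming the energy form quoted in (5.12) above
and toy symmetric weights: a single asynchronous update of a randomly
chosen neuron never increases E.

```python
import numpy as np

def energy(W, x, theta):
    """E = -1/2 x^T W x - theta^T x for the discrete Hopfield network."""
    return -0.5 * x @ W @ x - theta @ x

rng = np.random.default_rng(0)
n = 6
A = rng.choice([-1.0, 1.0], size=(n, n))
W = (A + A.T) / 2                       # symmetric weights ...
np.fill_diagonal(W, 0.0)                # ... with no self-feedback
theta = np.zeros(n)
x = rng.choice([-1.0, 1.0], size=n)

for _ in range(50):                     # repeated asynchronous updates
    i = rng.integers(n)
    e_before = energy(W, x, theta)
    v = W[i] @ x + theta[i]
    x[i] = x[i] if v == 0 else np.sign(v)
    assert energy(W, x, theta) <= e_before + 1e-12   # E never increases
print(energy(W, x, theta))
```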
14
Hopfield Associative Memory (cont.)
  • For no externally applied inputs, the energy function is given by
        E = -(1/2) xT W x                                         (5.14)
  • The corresponding energy change is
        ΔE = -Δxi Σj wij xj                                       (5.15)
  • The storage capacity (for bipolar patterns) of the Hopfield
    network is approximated by (5.16), where n is the number of
    neurons in the network.
  • If most of the prototype memories are to be recalled perfectly,
    the maximum storage capacity of the network is given by (5.17).
  • If it is required that 99% of the prototype memories be recalled
    perfectly, the capacity is given by (5.18).
15
Hopfield Associative Memory (cont.)
  • Example 5.1
  • Consider a discrete Hopfield network with three neurons,
    thresholds set to 0, and bipolar state vectors (a short
    numerical sketch follows).
  • The two prototype memories [-1, 1, -1] and [1, -1, 1] are stored
    by computing the weight matrix with (5.7).
  • We then examine how the network recalls a stored prototype from
    a noisy test input.
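A short numerical sketch of this example (not the textbook's worked
solution; the 1/n scaling is an assumption): build W from the two
prototypes with the outer-product rule and list the energy
E = -1/2 xT W x of every bipolar state.

```python
import itertools
import numpy as np

prototypes = np.array([[-1, 1, -1],
                       [ 1, -1, 1]], dtype=float)
n = prototypes.shape[1]
W = prototypes.T @ prototypes / n        # outer-product rule
np.fill_diagonal(W, 0.0)                 # zero the diagonal (no self-feedback)
print("W =\n", W)

for x in itertools.product([-1.0, 1.0], repeat=n):
    x = np.array(x)
    print(x, "E =", -0.5 * x @ W @ x)
# The two stored prototypes (which here are each other's complements)
# have the lowest energy of all eight states.
```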

16
Hopfield Associative Memory (cont.)
  • The test inputs [-1, -1, 1], [1, -1, -1], and [1, 1, 1], each of
    which differs from the prototype [1, -1, 1] in a single bit, are
    all recalled as [1, -1, 1].
  • Using (5.14), the energy of each possible state can be computed;
    the stored prototypes correspond to the minimum-energy (stable)
    states.

17
Hopfield Associative Memory (cont.)
  • Example 5.2
  • The patterns are defined on a 12 x 12 grid of pixels (1 = black,
    -1 = white).
  • The network therefore has 12 x 12 = 144 neurons and
    144 x 144 = 20,736 weights.
  • The thresholds are set to θ = 0.
  • Each pattern is represented as a bipolar vector (every pixel is
    either -1 or 1).

18
Hopfield Associative Memory (cont.)
  • Five prototype vectors are stored in the network using (5.7);
    the accompanying figure shows the recall results for test inputs
    corrupted with a 30% bit error rate.
19
The Traveling-Salesperson Problem
  • Optimization problems
  • Finding the best way to do something subject to
    certain constraints.
  • The best solution is defined by a specific
    criterion.
  • In many cases optimization problems are described
    in terms of a cost function.
  • The Traveling-Salesperson Problem (TSP)
  • A salesperson must make a circuit through a
    certain number of cities.
  • Visiting each city only once.
  • The salesperson returns to the starting point at
    the end of the trip.
  • Minimizing the total distance traveled.

20
The Traveling-Salesperson Problem (cont.)
  • Constraints
  • Weak constraints
  • E.g., minimizing the distance traveled.
  • Strong constraints
  • Constraints that must be satisfied by any solution.
  • The Hopfield network is guaranteed to converge to
    a local minimum of the energy function.
  • To use the Hopfield memory for optimization
    problems, we must find a way to map the problem
    onto the network architecture.
  • The first step is to develop a representation of
    the problem's solutions that fits an architecture
    having a single array of PEs.

21
The Traveling-Salesperson Problem (cont.)
  • Steps for solving an optimization problem with the Hopfield
    network:
  • Choose a representation scheme in which the outputs of the
    neurons can be decoded into a solution of the problem.
  • Define a cost function whose minimum corresponds to the best
    solution of the problem.
  • Express the cost function in the form of the network's Lyapunov
    energy function.
  • By comparing the two functions term by term, derive the network
    weights W and the thresholds θ.
  • Initialize the network and let it run until it converges; the
    stable state it reaches is decoded as the solution.

22
The Traveling-Salesperson Problem (cont.)
  • An energy function must satisfy the following criteria (a code
    sketch of these penalty terms follows this list):
  • Visit each city only once on the tour.
  • Occupy each position on the tour with only one city at a time.
  • Include all n cities.
  • Favor the shortest total distance.
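The sketch below is my own illustration of how these criteria become
penalty terms evaluated on an n-by-n city-by-position matrix V, following
the usual Hopfield-Tank formulation; the penalty weights A, B, C, and D
are assumed values.

```python
import numpy as np

def tsp_energy(V, dist, A=500.0, B=500.0, C=200.0, D=1.0):
    """V[x, i] = 1 if city x occupies tour position i; dist is the distance matrix."""
    n = V.shape[0]
    row = sum(V[x, i] * V[x, j]                      # city x in two positions
              for x in range(n) for i in range(n) for j in range(n) if i != j)
    col = sum(V[x, i] * V[y, i]                      # two cities in one position
              for i in range(n) for x in range(n) for y in range(n) if x != y)
    count = (V.sum() - n) ** 2                       # exactly n entries should be on
    length = sum(dist[x, y] * V[x, i] * (V[y, (i + 1) % n] + V[y, (i - 1) % n])
                 for x in range(n) for y in range(n) for i in range(n) if x != y)
    return A / 2 * row + B / 2 * col + C / 2 * count + D / 2 * length

# A valid tour (a permutation matrix) zeroes the three constraint terms,
# leaving only the tour-length term.
n = 4
rng = np.random.default_rng(1)
dist = rng.random((n, n)); dist = (dist + dist.T) / 2; np.fill_diagonal(dist, 0.0)
V = np.eye(n)                                        # tour 0 -> 1 -> 2 -> 3 -> 0
print(tsp_energy(V, dist))
```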

23
The Traveling-Salesperson Problem (cont.)
The energy function for the TSP
  • The energy equation is
        E = (A/2) Σx Σi Σj≠i vx,i vx,j
          + (B/2) Σi Σx Σy≠x vx,i vy,i
          + (C/2) (Σx Σi vx,i - N)^2
          + (D/2) Σx Σy≠x Σi dx,y vx,i (vy,i+1 + vy,i-1)
  • The first term is zero only if each city appears in at most one
    tour position; the second term is zero only if each position is
    occupied by at most one city; the third term is zero only if
    exactly N cities appear on the tour; the last term measures the
    total tour length.
24
The Traveling-Salesperson Problem (cont.)
  • Comparing the cost function with the Lyapunov function of the
    Hopfield network, the synaptic interconnection strengths and the
    bias inputs of the network are obtained as
        w(x,i),(y,j) = -A δx,y (1 - δi,j) - B δi,j (1 - δx,y) - C
                       - D dx,y (δj,i+1 + δj,i-1)
        I(x,i) = C N
    where the Kronecker delta function is defined as
        δi,j = 1 if i = j, and δi,j = 0 otherwise.
25
The Traveling-Salesperson Problem (cont.)
  • The total input to neuron (x,i) is then
        v(x,i) = Σy Σj w(x,i),(y,j) vy,j + I(x,i)

26
A Contextual Hopfield Neural Networks for
Medical Image Edge Detection
  • Chuan-Yu Chang
  • Optical Engineering,
  • Vol. 45, No. 3, pp. 037006-1 to 037006-9, 2006. (EI, SCI)

27
Introduction
  • Edge detection from medical images (such as CT and MRI) is an
    important step in medical image understanding systems.

28
Introduction
Chang's 2000 CHEFNN
  • Advantages: takes each pixel's contextual information into
    account; adopts the competitive learning rule.
  • Disadvantages: the parameters A and B must be predetermined by
    trial and error; the execution time is long (above 26 seconds).
  • The proposed CHNN
  • The input of the CHNN is the original two-dimensional image, and
    the output is an edge-based feature map.
  • It takes each pixel's contextual information into account.
  • The experimental results are perceptually better than those of
    the CHEFNN.
  • The execution time is faster than that of the CHEFNN.

29
The Contextual Hopfield Neural Network (CHNN)
The architecture of CHNN
30
The CHNN
  • The total input to neuron (x,i) is computed as
  • The activation function in the network is defined
    as

(1)
(2)
31
The CHNN
  • Based on the update equation, the Lyapunov energy function of
    the two-dimensional Hopfield neural network is given as

(3)
32
The CHNN
  • The energy function of the CHNN must satisfy the following
    conditions:
  • The gray levels within an area belonging to the non-edge points
    have the minimum Euclidean distance measure.

(4)
where
(5)
33
The CHNN
  • The neighborhood function

(6)
(7)
34
The CHNN
  • The objective function for CHNN

(8)
35
The CHNN
  • Comparing the objective function of the CHNN in Eq. (8) with the
    Lyapunov function of the CHNN in Eq. (3), we obtain

(9)
(10)
(11)
36
The CHNN Algorithm
  • Input: the original image X and the neighborhood parameters p
    and q.
  • Output: the stabilized neuron states, representing the
    classified edge feature map of the original image.

37
The CHNN Algorithm
  • Algorithm (a control-flow sketch follows these steps):
  • Step 1) Assign the initial state of every neuron to 1.
  • Step 2) Use Eq. (11) to calculate the total input of each
    neuron.
  • Step 3) Apply the activation rule given in Eq. (2) to obtain the
    new output state of each neuron.
  • Step 4) Repeat Steps 2 and 3 for all neurons and count the
    number of neurons whose state changed during the update. If any
    state changed, go to Step 2; otherwise, go to Step 5.
  • Step 5) Output the final states of the neurons, which indicate
    the edge detection results.
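A control-flow sketch of these steps only: the actual total-input
computation of Eq. (11) and the activation rule of Eq. (2) are not
reproduced here, so the total_input and activate arguments below are
hypothetical placeholders standing in for them.

```python
import numpy as np

def chnn_edge_detect(image, total_input, activate, max_iters=100):
    states = np.ones_like(image, dtype=float)       # Step 1: initial states set to 1
    for _ in range(max_iters):
        changed = 0
        for idx in np.ndindex(states.shape):        # Steps 2-3: visit every neuron
            v = total_input(image, states, idx)     # Eq. (11) would go here
            new_state = activate(v)                 # Eq. (2) would go here
            if new_state != states[idx]:
                states[idx] = new_state
                changed += 1
        if changed == 0:                            # Step 4: stop when nothing changes
            break
    return states                                   # Step 5: edge feature map

# Purely illustrative placeholders (NOT Eqs. (11) and (2) from the paper):
def dummy_input(image, states, idx):
    return image[idx] - image.mean()

def dummy_activate(v):
    return 1.0 if v > 0 else -1.0

print(chnn_edge_detect(np.random.default_rng(0).random((8, 8)),
                       dummy_input, dummy_activate))
```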

38
Experimental Results
Phantom images
(a) Original phantom image, (b) with added noise (K = 18), (c) with
added noise (K = 20), (d) with added noise (K = 23), (e) with added
noise (K = 25), (f) with added noise (K = 30).
39
Experimental Results
  • Noiseless phantom image:
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

40
Experimental Results
  • Noisy phantom image (K = 18):
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

41
Experimental Results
  • Noisy phantom image (K = 20):
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

42
Experimental Results
  • Noisy phantom image (K = 23):
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

43
Experimental Results
  • Noisy phantom image (K = 25):
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

44
Experimental Results
  • Noisy phantom image (K = 30):
  • (a) the Laplacian-based,
  • (b) the Marr-Hildreth's,
  • (c) the wavelet-based,
  • (d) the Canny's,
  • (e) the CHEFNN,
  • (f) the proposed CHNN.

45
Experimental Results
46
Experimental Results
Knee joint based MR image
47
Experimental Results
Skull-based CT image
48
Conclusion
  • We proposed a new contextual Hopfield neural network, called the
    Contextual Hopfield Neural Network (CHNN), for edge detection.
  • The CHNN considers the contextual information of pixels.
  • The results of our experiments indicate that the CHNN can be
    applied to various kinds of medical image segmentation,
    including CT and MRI.

49
Recommended Reading
  • Chuan-Yu Chang and Pau-Choo Chung, "Two-layer competitive based
    Hopfield neural network for medical image edge detection,"
    Optical Engineering, Vol. 39, No. 3, pp. 695-703, March 2000.
    (SCI)
  • Chuan-Yu Chang and Pau-Choo Chung, "Medical Image Segmentation
    Using a Contextual-Constraint Based Hopfield Neural Cube," Image
    and Vision Computing, Vol. 19, pp. 669-678, 2001. (SCI)
  • Chuan-Yu Chang, "Spatiotemporal-Hopfield Neural Cube for
    Diagnosing Recurrent Nasal Papilloma," Medical & Biological
    Engineering & Computing, Vol. 43, pp. 16-22, 2005. (EI, SCI)
  • Chuan-Yu Chang, "A Contextual-based Hopfield Neural Network for
    Medical Image Edge Detection," Optical Engineering, Vol. 45,
    No. 3, pp. 037006-1 to 037006-9, 2006. (EI, SCI)
  • Chuan-Yu Chang, Hung-Jen Wang and Si-Yan Lin, "Simulation Studies
    of Two-layer Hopfield Neural Networks for Automatic Wafer Defect
    Inspection," Lecture Notes in Computer Science 4031,
    pp. 1119-1126, 2006. (SCI)

50
Simulated Annealing
  • When the Hopfield neural network is used for recalling stored
    patterns, converging to a local minimum is acceptable; for an
    optimization problem, however, we want the global minimum.
  • Because the Hopfield neural network is based on gradient
    descent, it can easily become trapped in a local minimum.
  • By adding a controlled amount of randomness to the search
    process, the system can escape from local minima and has a
    chance of reaching the global minimum.
  • Simulated annealing (SA) is often applied to hard (NP-complete)
    combinatorial optimization problems.
  • SA imitates the physical annealing of a material: the system is
    heated and then cooled slowly, so that instead of freezing into
    a local minimum it settles near the lowest-energy state (the
    global minimum).
  • SA involves two basic processes:
  • Melting the system to be optimized at an
    effectively high temperature.
  • Lowering the temperature in slow stages until the
    system freezes.

51
Simulated Annealing (cont.)
  • Plot of a function of two variables with multiple
    minima and maxima

52
Simulated Annealing (cont.)
  • In statistical mechanics, a physical system with energy function
    E(x) that is in thermal equilibrium is found in state x with a
    probability given by
        P(x) = (1/Z) exp(-E(x) / (KB T))                          (5.24)
  • Here KB is the Boltzmann constant (KB = 1.3806 x 10^-23 J/K),
    T is the absolute temperature, and Z is the partition function
    that normalizes the distribution.
  • In practice the Boltzmann constant is absorbed into the
    temperature, defining the pseudo-temperature parameter T used
    below.                                                        (5.25)
  • With this substitution, (5.24) becomes the Boltzmann-Gibbs
    distribution
        P(x) = (1/Z) exp(-E(x) / T)                               (5.26)
53
Simulated Annealing (cont.)
  • Let Pr(x -> x') denote the probability that the system makes a
    transition from state x to a new state x'.
  • For the system to reach and remain in thermal equilibrium, a
    sufficient condition on the transition probabilities Pr(x -> x')
    is the detailed-balance condition
        P(x) Pr(x -> x') = P(x') Pr(x' -> x)                      (5.27)
  • In other words, at equilibrium the rate of transitions from x to
    x' equals the rate of transitions from x' back to x.
54
Simulated Annealing (cont.)
  • Rearranging (5.27) and using (5.26) gives the required ratio of
    transition probabilities,
        Pr(x -> x') / Pr(x' -> x) = exp(-ΔE / T),
        where ΔE = E(x') - E(x)                                   (5.28)
  • The Metropolis algorithm is a Monte Carlo technique that
    realizes this ratio with the acceptance rule
        Pr(ΔE) = exp(-ΔE / T) if ΔE > 0, and Pr(ΔE) = 1 otherwise (5.29)
  • At each step, an element (atom) of the configuration is given a
    small random displacement, and the resulting change ΔE in the
    energy of the system is computed.
  • If ΔE <= 0, the new configuration is accepted; if ΔE > 0, it is
    accepted only with probability exp(-ΔE / T).
55
Simulated Annealing (cont.)
  • The acceptance decision is implemented with a uniformly
    distributed random number ρ drawn from the interval [0, 1].
  • When ΔE > 0, the acceptance probability Pr(ΔE) is computed from
    (5.29).
  • If ρ < Pr(ΔE), the new configuration is accepted;
  • if ρ > Pr(ΔE), the new configuration is rejected and the old
    configuration is retained.

56
Simulated Annealing (cont.)
  • There are four basic components associated with a
    simulated annealing based global search
    algorithm
  • A concise description of the system
    configuration.
  • An objective or cost function.
  • An exploration process, i.e., a random generator of moves or
    rearrangements of the system elements in a configuration.
  • An annealing schedule of temperatures and defined time periods
    for which the system is to be evolved.
  • The basic idea is to go downhill "most of the time" instead of
    always going downhill.

57
Simulated Annealing (cont.)
  • The performance of SA depends heavily on the annealing schedule
    for the temperature T: if T is lowered too quickly, the system
    may be frozen into a local minimum; if it is lowered too slowly,
    the search takes an impractically long time.
  • Geman and Geman showed that a temperature schedule of the form
        T(k) = T0 / log(1 + k)                                    (5.32)
    guarantees convergence to the global minimum, but it is far too
    slow for practical use.
  • In practice, SA is usually run with a faster geometric schedule,
        T(k+1) = α T(k)                                           (5.33)
    which no longer guarantees reaching the global minimum.
  • The decrementing factor α should be close to unity, typically
    0.8 <= α <= 0.99.
58
Simulated Annealing based global search algorithm
  • Step 1
  • Initialize the vector x to a random point in the feasible set.
  • Step 2
  • Select an annealing schedule for the parameter T, and initialize
    T to a sufficiently large number.
  • Step 3
  • Compute a perturbed candidate xp = x + Δx.
  • Step 4
  • Compute the change in the cost function, Δf = f(xp) - f(x).
  • Step 5
  • Use (5.29), associated with the Metropolis algorithm, to decide
    whether xp should be taken as the new state of the system or the
    current state x should be kept.

59
Simulated Annealing based global search algorithm
(cont.)
  • Step 6
  • Steps 3 through 5 are repeated until the system reaches
    equilibrium, which is determined when the number of accepted
    transitions becomes insignificant.
  • Typically, Steps 3 through 5 are carried out a predetermined
    number of times.
  • Step 7
  • The temperature T is updated according to the annealing schedule
    specified in Step 2, and Steps 3 through 6 are repeated.
  • The process can be stopped when the temperature T reaches zero
    or a predetermined small value. (A compact sketch of Steps 1-7
    follows.)
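A compact sketch of Steps 1 through 7 for a one-dimensional cost
function; the particular cost f, the Gaussian perturbation, and the
geometric cooling schedule are assumptions chosen only for illustration.

```python
import math
import random

def f(x):                                    # illustrative cost with several minima
    return x * x + 10 * math.sin(3 * x)

def simulated_annealing(T0=10.0, alpha=0.95, moves_per_T=50):
    x = random.uniform(-5, 5)                # Step 1: random initial point
    T = T0                                   # Step 2: sufficiently large initial T
    while T > 1e-3:
        for _ in range(moves_per_T):         # Steps 3-6: explore at this temperature
            xp = x + random.gauss(0.0, 0.5)  # Step 3: perturbed candidate xp = x + dx
            df = f(xp) - f(x)                # Step 4: change in the cost function
            # Step 5: Metropolis rule (5.29) - accept downhill moves always,
            # uphill moves with probability exp(-df / T).
            if df <= 0 or random.random() < math.exp(-df / T):
                x = xp
        T *= alpha                           # Step 7: lower T on the annealing schedule
    return x

print(simulated_annealing())                 # should land near the global minimum
```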

60
Simulated Annealing (cont.)
  • Example 5.3: the TSP
  • The optimization task is to determine the optimum sequence of
    cities that the salesperson is to follow on the trip.
  • The steps in applying SA to this problem (sketched in code after
    this list):
  • Identify the state space of possible solutions.
  • A solution is an ordered list of the cities on the sales trip.
  • The number of different possible sequences is N!.
  • Specify the nature of the state perturbation.
  • Assume that a new solution is obtained by swapping the positions
    of two cities in the current solution.
  • Specify the cost function that quantifies the fitness of a
    proposed solution.
  • The cost is the total distance traveled by the salesperson.
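Under the stated assumptions (random city coordinates, a two-city swap as
the perturbation, the total tour length as the cost, and a geometric
cooling schedule), the example can be sketched as follows.

```python
import math
import random

def tour_length(order, cities):
    return sum(math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
               for i in range(len(order)))

def sa_tsp(cities, T0=10.0, alpha=0.99, moves_per_T=200):
    order = list(range(len(cities)))
    random.shuffle(order)                    # random initial tour
    T = T0
    while T > 1e-3:
        for _ in range(moves_per_T):
            i, j = random.sample(range(len(order)), 2)
            new = order[:]
            new[i], new[j] = new[j], new[i]  # swap the positions of two cities
            d = tour_length(new, cities) - tour_length(order, cities)
            if d <= 0 or random.random() < math.exp(-d / T):
                order = new
        T *= alpha                           # geometric cooling
    return order

cities = [(random.random(), random.random()) for _ in range(20)]   # 20 random cities
best = sa_tsp(cities)
print(tour_length(best, cities))             # tour length after annealing
```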

61
Simulated Annealing (cont.)
  • Example 5.3 (cont.)
  • Initial solution: the coordinates of 20 cities are generated at
    random.
  • As the annealing proceeds, the cost (total tour length)
    decreases.
  • Final solution obtained by SA.
62
Boltzmann Machine
  • The Boltzmann machine is a parallel constraint-satisfaction
    network that is based on simulated annealing and uses stochastic
    neurons.
  • The Boltzmann machine can learn the underlying probability
    distribution of a set of patterns presented by its environment.
  • Because the network uses feedback connections and stochastic
    neurons, it is also called a stochastic recurrent network.
  • Differences between the Boltzmann machine and the Hopfield
    network:
  • The Boltzmann machine may contain hidden neurons; the Hopfield
    network has none.
  • The Boltzmann machine uses stochastic neurons; the Hopfield
    network uses McCulloch-Pitts neurons.
  • The Hopfield network operates deterministically, whereas the
    operation of the Boltzmann machine is stochastic and is
    controlled by a temperature parameter.

63
Boltzmann Machine (cont.)
  • Similarities between the Boltzmann machine and the Hopfield
    network:
  • The connection weights between neurons are symmetric.
  • There is no self-feedback.
  • Processing units have bipolar states.
  • The neurons are selected randomly and one at a time for
    updating.
  • Constraints
  • Strong constraints
  • Must be satisfied by any solution.
  • The strong constraints are the rule.
  • Weak constraints
  • The Boltzmann machine can also accommodate problems expressed in
    terms of weak constraints.
64
Boltzmann Machine (cont.)
  • Stochastic neuron
  • Given its net input vq, a stochastic neuron fires (outputs +1)
    with probability P(vq) and outputs -1 with probability
    1 - P(vq), where
        P(vq) = 1 / (1 + exp(-2 vq / T))                          (5.34)
        xq = +1 with probability P(vq), -1 with probability
        1 - P(vq)                                                 (5.35)
  • When vq = 0, the output yq is +1 or -1, each with probability
    0.5.
  • T is a pseudo-temperature that controls the amount of randomness
    in the neuron's behavior; as T -> 0, the stochastic neuron
    reduces to the deterministic Hopfield network update of (5.2).
    (A small sampling sketch follows.)
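A minimal sampling sketch of this firing rule; the 1/(1 + exp(-2v/T))
form follows the reconstruction of (5.34) given above, and the textbook's
exact scaling may differ.

```python
import math
import random

def stochastic_neuron(v, T):
    """Return +1 with probability P(v) = 1 / (1 + exp(-2 v / T)), else -1."""
    p_fire = 1.0 / (1.0 + math.exp(-2.0 * v / T))
    return 1 if random.random() < p_fire else -1

# At v = 0 the two outputs are equally likely; as T -> 0 the rule approaches
# the deterministic hard limiter sign(v) of the McCulloch-Pitts neuron.
print(sum(stochastic_neuron(0.0, T=1.0) for _ in range(10000)))   # roughly zero
```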
65
Boltzmann Machine (cont.)
  • Probability distribution function for a stochastic neuron
    firing, compared with the McCulloch-Pitts neuron activation
    function.
  • As in the Hopfield network, the Boltzmann machine has symmetric
    weights, wij = wji, and no self-feedback, wii = 0 for all i.
  • As T -> 0, the stochastic neuron reduces to the deterministic
    McCulloch-Pitts activation function.
66
Boltzmann Machine (cont.)
  • Because the equilibrium distribution of the network states
    follows the Boltzmann-Gibbs distribution, the network is called
    the Boltzmann machine.
  • The neurons of the Boltzmann machine are divided into two
    groups:
  • nv visible neurons, and
  • nh hidden neurons.
  • A fully connected network has (nv + nh)(nv + nh - 1)/2 symmetric
    connections.
  • The network is run in two phases: a clamped phase, in which the
    visible neurons are fixed to patterns supplied by the
    environment, and a free-running phase, in which all neurons
    update freely.
  • In the unsupervised mode of training, patterns associated with
    the environment are clamped onto the visible neurons with the
    appropriate probabilities.
  • The supervised mode of training may involve a probabilistic
    correct response pattern for each of the input patterns.

67
Boltzmann Machine (cont.)
  • The energy of a global network configuration is
        E(x) = -(1/2) Σi Σj wij xi xj + Σi θi xi                  (5.36)
    where xi denotes the output state of the ith neuron and θi is
    the threshold of the ith neuron.
  • The energy function can be written in vector form as
        E(x) = -(1/2) xT W x + θT x                               (5.37)
  • The Boltzmann machine's learning cycle:
  • A positive (clamped) phase and a negative (free-running) phase
    alternate, followed by synaptic weight adjustments.
  • The state transition function is given by the probability of
    flipping a neuron,
        Pr(xi -> -xi) = 1 / (1 + exp(ΔEi / T))                    (5.38)
68
Boltzmann Machine (cont.)
  • If neuron i changes state (xi -> -xi), the corresponding change
    in energy is
        ΔEi = 2 xi vi,  where vi = Σj wij xj - θi                 (5.39)
  • Substituting (5.39) into (5.38) gives
        Pr(xi -> -xi) = 1 / (1 + exp(2 xi vi / T))                (5.40)
  • If neuron i is currently in state xi = -1, the probability that
    it flips to the state xi = +1 is therefore
        Pr(-1 -> +1) = 1 / (1 + exp(-2 vi / T))                   (5.41)
69
Boltzmann Machine (cont.)
  • If neuron i is currently in state xi = +1, the probability that
    it flips to the state xi = -1 is
        Pr(+1 -> -1) = 1 / (1 + exp(2 vi / T))                    (5.42)
  • (5.42) can also be written as
        Pr(+1 -> -1) = 1 - 1 / (1 + exp(-2 vi / T))               (5.43)
  • Equations (5.41) and (5.43) have the same form as the general
    stochastic neuron rule in (5.34).
  • A Boltzmann machine with n = nv + nh neurons, each taking a
    bipolar state (+1 or -1), has 2^n possible global states.
  • The Boltzmann machine uses simulated annealing: updating begins
    at a sufficiently high temperature T, which is then gradually
    lowered.
  • The Boltzmann machine learning rule will be presented as a
    step-by-step algorithm.
70
Learning algorithm for the Boltzmann machine
  • Loop 1
  • The weight update is driven by the difference between the
    correlation of each pair of connected neurons measured with the
    visible units clamped to the training patterns and the same
    correlation measured when the network runs freely (a sketch of
    this update follows).
  • In the outermost loop, the synaptic weights of the network are
    updated many times, to ensure convergence, according to
        wij(m+1) = wij(m) + Δwij(m),  m > 0                       (5.45)
    where
        Δwij(m) = η [<xi xj>clamped - <xi xj>free],  η > 0        (5.46)
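A sketch of the outer-loop update (5.45)-(5.46) as written above; the
correlation matrices passed in are placeholders for the clamped and
free-running equilibrium averages described in Loops 2-4 below.

```python
import numpy as np

def boltzmann_weight_update(W, corr_clamped, corr_free, eta=0.1):
    """wij <- wij + eta * (<xi xj>_clamped - <xi xj>_free)."""
    dW = eta * (corr_clamped - corr_free)
    np.fill_diagonal(dW, 0.0)                # keep wii = 0 (no self-feedback)
    return W + dW

# Illustrative call with dummy 4 x 4 correlation estimates.
n = 4
rng = np.random.default_rng(0)
W = np.zeros((n, n))
cc = rng.uniform(-1, 1, (n, n)); cc = (cc + cc.T) / 2   # symmetric dummy <xi xj>+
cf = rng.uniform(-1, 1, (n, n)); cf = (cf + cf.T) / 2   # symmetric dummy <xi xj>-
print(boltzmann_weight_update(W, cc, cf))
```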
71
Learning algorithm for the Boltzmann machine
(cont.)
  • Loop 2
  • For each iteration in Loop 1, <xi xj> must be calculated both in
    the unclamped (free-running) state and with the visible units
    clamped to each desired pattern.
  • To operate the Boltzmann machine, the system must be in thermal
    equilibrium at some positive temperature T > 0.
  • The state of the system x then fluctuates, and the correlations
    <xi xj> are measured by taking the time average of xi xj.
  • To obtain all the information that is necessary to compute the
    synaptic weight update rule in (5.45), this process must be
    carried out once with the visible neurons clamped in each of
    their states α for which Rα > 0, and once with all neurons
    unclamped.
  • The system must repeatedly reach thermal equilibrium before an
    average can be taken.

72
Learning algorithm for the Boltzmann machine
(cont.)
  • Loop 3
  • For each of the averages in Loop 2, thermal equilibrium must be
    reached using a simulated annealing temperature schedule T(k),
    with a sufficiently large initial temperature T(0) followed by a
    gradual decrease in the temperature.

73
Learning algorithm for the Boltzmann machine
(cont.)
  • Loop 4
  • At each of the temperatures in Loop 3, many neurons must be
    sampled and updated according to the stochastic rule
        xi = +1 with probability P(vi), xi = -1 otherwise         (5.47)
    where
        P(vi) = 1 / (1 + exp(-2 vi / T(k)))                       (5.48)
    and vi is the activity level of neuron i, that is,
        vi = Σj wij xj - θi                                       (5.49)
74
Overview of Temporal Feedforward Networks
  • Time delays allow a network to become a dynamic network.
  • The most common types of temporal networks are:
  • Time-delay neural network (TDNN)
  • Finite impulse response (FIR) network
  • Simple recurrent network (SRN)
  • Real-time recurrent neural network (RTRNN)
  • Pipelined recurrent neural network (PRNN)
  • Nonlinear autoregressive moving average (NARMA) model

75
Simple Recurrent Network
  • Simple recurrent network
  • Elman network
  • A single hidden-layer feedforward neural network.
  • It has feedback connections from the outputs of
    the hidden-layer neurons to the input of the
    network.
  • Developed to learn time-varying patterns or
    temporal sequences.

76
Simple Recurrent Network (cont.)
  • The upper portion of the network contains the
    context units.
  • The function of these units is to replicate the
    hidden-layer output signals at the previous time
    step.
  • The purpose of the context units is to deal with
    input pattern dissonance.

77
Simple Recurrent Network (cont.)
  • The feedback provides a mechanism within the network to
    discriminate between patterns occurring at different times that
    are otherwise essentially identical (a sketch of the forward
    pass follows).
  • The weights of the context units are fixed.
  • The other network weights can be adjusted in a supervised
    training mode by using the error backpropagation algorithm with
    momentum.
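A minimal sketch of the Elman forward pass (layer sizes, the tanh
activation, and the weight scaling are illustrative assumptions): the
context units hold a copy of the previous hidden-layer output and feed it
back as extra inputs.

```python
import numpy as np

def elman_step(x, context, W_in, W_ctx, W_out, b_h, b_o):
    h = np.tanh(W_in @ x + W_ctx @ context + b_h)   # hidden layer sees input + context
    y = np.tanh(W_out @ h + b_o)                    # network output
    return y, h                                     # h becomes the next context

n_in, n_hid, n_out = 3, 5, 2
rng = np.random.default_rng(0)
W_in = 0.1 * rng.standard_normal((n_hid, n_in))
W_ctx = 0.1 * rng.standard_normal((n_hid, n_hid))   # context-to-hidden weights
W_out = 0.1 * rng.standard_normal((n_out, n_hid))
b_h, b_o = np.zeros(n_hid), np.zeros(n_out)

context = np.zeros(n_hid)                           # context units start at zero
for x in rng.standard_normal((4, n_in)):            # a short temporal sequence
    y, context = elman_step(x, context, W_in, W_ctx, W_out, b_h, b_o)
    print(y)
```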

78
Time-delay neural network
  • The TDNN uses time delays to perform temporal processing
    (a sketch of this neuron follows).
  • It is a feedforward neural network in which the inputs to the
    network are successively delayed in time.
  • A temporal sequence for the input is thereby established and can
    be expressed as the set {x(k), x(k-1), ..., x(k-p)}.
  • The total number of weights required for a single neuron is
    (p + 1)n.
  • This single-neuron model can be extended to a multilayer
    structure.
  • The TDNN can be trained using a modified version of the standard
    backpropagation algorithm.
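A sketch of the single TDNN neuron described above, with n inputs each
seen through p delays, so that (p + 1)n weights feed one nonlinear unit;
the shapes and the tanh activation are illustrative assumptions.

```python
import numpy as np

def tdnn_neuron(x_window, W, bias=0.0):
    """x_window: shape (p + 1, n) holding x(k), x(k-1), ..., x(k-p)."""
    return np.tanh(np.sum(W * x_window) + bias)     # weighted sum over all taps

p, n = 3, 2                                         # 3 delays, 2 inputs -> 8 weights
rng = np.random.default_rng(0)
W = 0.1 * rng.standard_normal((p + 1, n))
x_seq = rng.standard_normal((10, n))                # an input sequence of length 10

for k in range(p, len(x_seq)):
    window = x_seq[k - p:k + 1][::-1]               # x(k), x(k-1), ..., x(k-p)
    print(tdnn_neuron(window, W))
```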

79
Time-delay neural network (cont.)
  • Basic TDNN neuron with n inputs and p delays for
    each input.

80
Time-delay neural network (cont.)
  • Three layered TDNN architecture for the
    recognition of phonemes.

81
Distributed Time-Lagged Feedforward neural
networks
  • A DTLFNN is distributed in the sense that the
    element of time is distributed throughout the
    entire network.

82
Distributed Time-Lagged Feedforward neural
networks (cont.)
  • The output of the linear combiner is given by (5.51), where each
    filtered input is the convolution sum defined in (5.52).
  • The sum in (5.52) is referred to as a convolution sum.
  • In the z-domain, (5.52) can be written as (5.53).
83
Distributed Time-Lagged Feedforward neural
networks (cont.)
  • Equivalently, each filter can be expressed as a transfer
    function, (5.54) or (5.55).
  • The output of the linear combiner in Fig. 5.19 for the qth
    neuron of the network is given by (5.56).
84
Distributed Time-Lagged Feedforward neural
networks (cont.)
  • Each filtered input in Fig. 5.19, expressed in the time domain,
    is given by the convolution sum (5.57).
  • The output of the jth neuron in the network is then given by
    (5.58) (a sketch follows).
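A sketch of the FIR-synapse idea behind (5.51)-(5.57): each input signal
passes through its own finite-impulse-response filter (a convolution sum)
before the filtered signals are summed and squashed; the filter lengths
and the tanh activation are assumptions.

```python
import numpy as np

def fir_synapse(x, w):
    """Convolution sum s(k) = sum_l w[l] x(k - l), with zero initial history."""
    return np.convolve(x, w)[:len(x)]

def dtlfnn_neuron(inputs, tap_weights, bias=0.0):
    """inputs: list of 1-D signals; tap_weights: one FIR tap vector per input."""
    v = sum(fir_synapse(x, w) for x, w in zip(inputs, tap_weights))
    return np.tanh(v + bias)                        # neuron output y(k) over time

rng = np.random.default_rng(0)
x1, x2 = rng.standard_normal(20), rng.standard_normal(20)
w1, w2 = 0.2 * rng.standard_normal(4), 0.2 * rng.standard_normal(4)
print(dtlfnn_neuron([x1, x2], [w1, w2]))
```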
85
Distributed Time-Lagged Feedforward neural
networks (cont.)
  • A DTLFNN is trained using a supervised learning algorithm,
    namely a temporal backpropagation algorithm.
  • This training algorithm is a temporal generalization of the
    standard backpropagation training algorithm.
  • The appropriate network weight vector is updated according to
    (5.59).
86
Distributed Time-Lagged Feedforward neural
networks (cont.)
  • where, in (5.60), ej(k) is the instantaneous error of the jth
    neuron.