ECE 517: Reinforcement Learning in Artificial Intelligence Lecture 13: Artificial Neural Networks


ECE 517: Reinforcement Learning in Artificial Intelligence
Lecture 13: Artificial Neural Networks - Introduction, Feedforward Neural Networks
October 30, 2012
Dr. Itamar Arel
College of Engineering
Electrical Engineering and Computer Science Department
The University of Tennessee
Fall 2012
Final projects - logistics
  • Projects can be done individually or in pairs
  • Students are encouraged to propose a topic
  • Please email me your top three choices for a
    project along with a preferred date for your
    presentation
  • Presentation dates
  • Nov. 27, 29 and Dec. 4
  • Format: 17 min presentation + 3 min Q&A
  • 7 min for background and motivation
  • 10 min for description of your work and conclusions
  • Written report due Friday, Dec. 7
  • Format similar to project report

Final projects - topics
  • Tetris player using RL (and NN)
  • Curiosity based TD learning
  • States vs. Rewards in RL
  • Human reinforcement learning
  • Reinforcement Learning of Local Shape in the Game
    of Go
  • Where do rewards come from?
  • Efficient Skill Learning using Abstraction
  • AIBO Playing on a PC using RL
  • AIBO learning to walk within a maze
  • Study of value function definitions for TD

Outline
  • Introduction
  • Brain vs. Computers
  • The Perceptron
  • Multilayer Perceptrons (MLP)
  • Feedforward Neural-Networks and Backpropagation

Pigeons as art experts (Watanabe et al. 1995)
  • Experiment
  • Pigeon was placed in a closed box
  • Present paintings of two different artists (e.g.
    Chagall / Van Gogh)
  • Reward for pecking when presented a particular
    artist (e.g. Van Gogh)
  • Pigeons were able to discriminate between Van
    Gogh and Chagall with 95% accuracy (when
    presented with pictures they had been trained on)

Pictures by different artists
Interesting results
  • Discrimination still 85% successful for
    previously unseen paintings of the artists
  • Conclusions from the experiment
  • Pigeons do not simply memorise the pictures
  • They can extract and recognise patterns (e.g.
    artistic style)
  • They generalise from the already seen to make
    predictions
  • This is what neural networks (biological and
    artificial) are good at (unlike conventional
    computers)
  • Provided further justification for use of ANNs

"Computers are incredibly fast, accurate, and
stupid. Human beings are incredibly slow,
inaccurate, and brilliant. Together they are
powerful beyond imagination." - Albert Einstein
The Von Neumann architecture vs. Neural Networks
Von Neumann
  • Memory for programs and data
  • CPU for math and logic
  • Control unit to steer program flow
  • Follows rules
  • Solution can/must be formally specified
  • Cannot generalize
  • Not error tolerant

Neural Net
  • Learns from data
  • Rules on data are not visible
  • Able to generalize
  • Copes well with noise

Biological Neuron
  • Input builds up on receptors (dendrites)
  • Cell has an input threshold
  • Upon breach of the cell's threshold, activation is
    fired down the axon
  • Synapses (i.e. weights) exist prior to the
    dendrite (input) interfaces

  • Connectionist techniques (a.k.a. neural networks)
    are inspired by the strong interconnectedness of
    the human brain.
  • Neural networks are loosely modeled after the
    biological processes involved in cognition
  • 1. Information processing involves many simple
    processing elements called neurons.
  • 2. Signals are transmitted between neurons using
    connecting links.
  • 3. Each link has a weight that modulates (or
    controls) the strength of its signal.
  • 4. Each neuron applies an activation function to
    the input that it receives from other neurons.
    This function determines its output.
  • Links with positive weights are called excitatory
  • Links with negative weights are called inhibitory

Some definitions
  • A Neural Network is an interconnected assembly of
    simple processing elements, units or nodes. The
    long-term memory of the network is stored in the
    inter-unit connection strengths, or weights,
    obtained by a process of adaptation to, or
    learning from, a set of training patterns.
  • Biologically inspired learning mechanism

Brain vs. Computer
  • The brain's performance tends to degrade
    gracefully under partial damage
  • In contrast, most programs and engineered systems
    are brittle: if you remove some arbitrary parts,
    very likely the whole will cease to function
  • The brain performs massively parallel computations
    extremely efficiently. For example, complex
    visual perception occurs within less than 100 ms,
    that is, 10 processing steps!

Dimensions of Neural Networks
  • Various types of neurons
  • Various network architectures
  • Various learning algorithms
  • Various applications
  • We'll focus mainly on supervised learning based
    networks
  • The architecture of a neural network is linked
    with the learning algorithm used to train it

ANNs: The basics
  • ANNs incorporate the two fundamental components
    of biological neural nets
  • Neurons: computational nodes
  • Synapses: weights or memory storage devices

Neuron vs. Node
The Artificial Neuron
Bias as an extra input
  • Bias is an external parameter of the neuron. It
    can be modeled by adding an extra (fixed-valued)
    input
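
The equivalence between a separate bias term and an extra fixed-valued input can be checked with a small sketch. The logistic activation, the example weights, and the function names below are illustrative assumptions; the slide's own figure is not available in this transcript.

```python
import math

def neuron_output(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, passed through a logistic activation."""
    net = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1.0 / (1.0 + math.exp(-net))

def neuron_output_bias_as_input(inputs, weights_with_bias):
    """Same neuron, with the bias folded in as the weight of a constant input of 1."""
    extended = list(inputs) + [1.0]  # the extra fixed-valued input
    net = sum(w * x for w, x in zip(weights_with_bias, extended))
    return 1.0 / (1.0 + math.exp(-net))

# Hypothetical values chosen only for illustration.
x = [0.5, -1.0]
w = [0.8, 0.2]   # positive weights act as excitatory links; negative ones as inhibitory
b = -0.3

y_separate_bias = neuron_output(x, w, b)
y_extra_input = neuron_output_bias_as_input(x, w + [b])
print(y_separate_bias, y_extra_input)
```

Both calls compute the same net input, so the two outputs agree. Treating the bias as one more weight means a learning rule can update it exactly like any other weight.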

Face recognition example
90% accurate learning head pose, and recognizing
1-of-20 faces
The XOR problem
  • A single-layer (linear) neural network cannot
    solve the XOR problem.
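  • Input -> Output
  • 00 -> 0
  • 01 -> 1
  • 10 -> 1
  • 11 -> 0
  • To see why this is true, we can try to express
    the problem as a linear equation aX + bY = Z
  • a·0 + b·0 = 0
  • a·0 + b·1 = 1 -> b = 1
  • a·1 + b·0 = 1 -> a = 1
  • a·1 + b·1 = 0 -> a = -b
  • The last condition contradicts a = 1 and b = 1,
    so no such a and b exist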

The XOR problem (cont.)
  • But by adding a third bit, the problem can be
    solved
  • Input -> Output
  • 000 -> 0
  • 010 -> 1
  • 100 -> 1
  • 111 -> 0
  • Once again, we express the problem as a linear
    equation aX + bY + cZ = W
  • a·0 + b·0 + c·0 = 0
  • a·0 + b·1 + c·0 = 1 -> b = 1
  • a·1 + b·0 + c·0 = 1 -> a = 1
  • a·1 + b·1 + c·1 = 0 -> a + b + c = 0 -> 1 + 1 + c = 0 -> c = -2
  • So the equation X + Y - 2Z = W will solve the
    problem
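
The derived coefficients can be checked against the four patterns with a short script. The variable names are illustrative; note that the third bit in the table equals X AND Y.

```python
# Truth table from the slide: inputs (X, Y, Z) and target W.
# The third bit Z is 1 only when both X and Y are 1.
patterns = [
    ((0, 0, 0), 0),
    ((0, 1, 0), 1),
    ((1, 0, 0), 1),
    ((1, 1, 1), 0),
]

a, b, c = 1, 1, -2  # coefficients derived from the four equations

def linear_output(x, y, z):
    """Evaluate the linear expression aX + bY + cZ."""
    return a * x + b * y + c * z

results = [linear_output(*inputs) for inputs, _ in patterns]
targets = [target for _, target in patterns]
print(results)  # [0, 1, 1, 0]
```

Every pattern is reproduced exactly, which a linear expression in X and Y alone cannot do.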

A Multilayer Network for the XOR function
Hidden Units
  • Hidden units are a layer of nodes that are
    situated between the input nodes and the output
    nodes
  • Hidden units allow a network to learn non-linear
    functions
  • The hidden units allow the net to represent
    combinations of the input features
  • Given too many hidden units, however, a net will
    simply memorize the input patterns
  • Given too few hidden units, the network may not
    be able to represent all of the necessary
    generalizations
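
As a minimal sketch of how hidden units represent combinations of the inputs, the network below computes XOR with two hidden threshold units. The weights and thresholds are hand-picked for illustration and are not taken from the slide's figure, which is missing from this transcript.

```python
def step(v):
    """Threshold activation: fire (1) only when the net input is positive."""
    return 1 if v > 0 else 0

def xor_net(x, y):
    """Two-layer threshold network for XOR with hand-picked (assumed) weights."""
    h_or = step(x + y - 0.5)    # hidden unit 1: fires if at least one input is on
    h_and = step(x + y - 1.5)   # hidden unit 2: fires only if both inputs are on
    # Output unit: excited by the OR unit, inhibited by the AND unit.
    return step(h_or - 2 * h_and - 0.5)

outputs = {(x, y): xor_net(x, y) for x in (0, 1) for y in (0, 1)}
print(outputs)
```

The AND hidden unit plays the same role as the third bit in the preceding linear example: it supplies the feature that makes the problem linearly separable at the output.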

Backpropagation Networks
  • Backpropagation networks are among the most
    popular and widely used neural networks because
    they are relatively simple and powerful
  • Backpropagation was one of the first general
    techniques developed to train multilayer
    networks, which do not have many of the inherent
    limitations of the earlier, single-layer neural
    nets criticized by Minsky and Papert.
  • Backpropagation networks use a gradient descent
    method to minimize the total squared error of the
    network's output
  • A backpropagation net is a multilayer,
    feedforward network that is trained by
    backpropagating the errors using the generalized
    delta rule.

The idea behind (error) backpropagation learning
  • Feedforward training of input patterns
  • Each input node receives a signal, which is
    broadcast to all of the hidden units
  • Each hidden unit computes its activation, which
    is broadcast to all of the output nodes
  • Backpropagation of errors
  • Each output node compares its activation with the
    desired output
  • Based on this difference, the error is propagated
    back to all previous nodes
  • Adjustment of weights
  • The weights of all links are computed
    simultaneously based on the errors that were
    propagated back
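
The three phases can be sketched as a small batch-training loop on the XOR patterns. This is an illustrative sketch under assumptions, not the lecture's implementation: the logistic activation, the 2-2-1 layout, the learning rate, the seed, and all names are chosen here. With only two hidden units the error is not guaranteed to reach zero, so the sketch only tracks how it changes.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_xor(epochs=2000, lr=0.5, n_hidden=2, seed=0):
    """Batch gradient descent on XOR; returns the summed squared error per epoch."""
    rng = random.Random(seed)
    data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 0)]
    # Each weight row ends with a bias weight (bias modeled as an extra input of 1).
    w_hidden = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):
        grad_hidden = [[0.0] * 3 for _ in range(n_hidden)]
        grad_out = [0.0] * (n_hidden + 1)
        total_error = 0.0
        for x, d in data:
            # Phase 1: feedforward pass through hidden and output layers.
            xin = x + [1]
            h = [sigmoid(sum(w * v for w, v in zip(row, xin))) for row in w_hidden]
            hin = h + [1.0]
            y = sigmoid(sum(w * v for w, v in zip(w_out, hin)))
            total_error += 0.5 * (d - y) ** 2
            # Phase 2: propagate the output error back to the hidden units.
            delta_out = (y - d) * y * (1 - y)
            delta_hidden = [delta_out * w_out[j] * h[j] * (1 - h[j])
                            for j in range(n_hidden)]
            for j in range(n_hidden + 1):
                grad_out[j] += delta_out * hin[j]
            for j in range(n_hidden):
                for k in range(3):
                    grad_hidden[j][k] += delta_hidden[j] * xin[k]
        # Phase 3: adjust all weights together, using the accumulated gradients.
        for j in range(n_hidden + 1):
            w_out[j] -= lr * grad_out[j]
        for j in range(n_hidden):
            for k in range(3):
                w_hidden[j][k] -= lr * grad_hidden[j][k]
        history.append(total_error)
    return history

history = train_xor()
print(history[0], history[-1])
```

All gradients are computed from the weights as they stood before the update, which is what adjusting the weights simultaneously amounts to in this sketch.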

Multilayer Perceptron (MLP)
Activation functions
  • Transforms a neuron's input into its output
  • Features of activation functions
  • A squashing effect is required
  • Prevents accelerating growth of activation levels
    through the network
  • Simple and easy to calculate
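
Two commonly used squashing functions, the logistic sigmoid and tanh, illustrate these features. Choosing these two is an assumption; the transcript does not say which functions the slide showed.

```python
import math

def sigmoid(v):
    """Logistic function: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))

def sigmoid_derivative(v):
    """The derivative is cheap to compute from the function value itself."""
    s = sigmoid(v)
    return s * (1.0 - s)

samples = [-10.0, -1.0, 0.0, 1.0, 10.0]
sigmoid_values = [sigmoid(v) for v in samples]   # all strictly between 0 and 1
tanh_values = [math.tanh(v) for v in samples]    # all strictly between -1 and 1

for v, s, t in zip(samples, sigmoid_values, tanh_values):
    print(f"{v:6.1f}  sigmoid={s:.5f}  tanh={t:.5f}")
```

However large the net input becomes, the output stays bounded, so activation levels cannot grow without limit from layer to layer.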

Backpropagation Learning
  • We want to train a multi-layer feedforward
    network by gradient descent to approximate an
    unknown function, based on some training data
    consisting of pairs (x,d)
  • Vector x represents a pattern of input to the
    network, and the vector d the corresponding
    target (desired output)
  • BP is a gradient-descent based scheme
  • The overall gradient with respect to the entire
    training set is just the sum of the gradients for
    each pattern
  • We will therefore describe how to compute the
    gradient for just a single training pattern
  • We will number the units, and denote the weight
    from unit j to unit i by w_ij
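
Writing the weight from unit j to unit i as w_{ij}, the statement that the overall gradient is the sum of per-pattern gradients can be made explicit. The factor of 1/2, the pattern index p, the output symbol y_i, and the learning rate \eta are notational assumptions introduced here, not symbols from the slides.

```latex
E = \sum_{p} E_p ,
\qquad
E_p = \frac{1}{2} \sum_{i \in \text{outputs}} \left( d_i^{(p)} - y_i^{(p)} \right)^2 ,
\qquad
\frac{\partial E}{\partial w_{ij}} = \sum_{p} \frac{\partial E_p}{\partial w_{ij}} ,
\qquad
\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}
```

Because differentiation is linear, it suffices to derive the gradient of E_p for a single pattern (x, d) and then sum over the training set.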

BP Forward Pass at Layer 1
BP Forward Pass at Layer 2

BP Forward Pass at Layer 3
  • The last layer produces the network's output
  • We can now derive an error (difference between
    output and the target)

BP Back-propagation of error: output layer
  • We have an error with respect to the target (z)
  • This error signal will be propagated back towards
    the input layer (layer 1)
  • Each neuron will forward error information to the
    neurons feeding it from the previous layer
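
A minimal single-pattern sketch of this backward flow follows, with a finite-difference check on one weight. The logistic activation, the tiny 2-2-1 network without biases, and all numeric values are assumptions made for illustration; z denotes the target, as on the slide.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(x, w_hidden, w_out):
    """Forward pass: hidden activations h, then the single output y."""
    h = [sigmoid(sum(w * v for w, v in zip(row, x))) for row in w_hidden]
    y = sigmoid(sum(w * v for w, v in zip(w_out, h)))
    return h, y

def error(x, z, w_hidden, w_out):
    """Squared error between the output and the target z."""
    _, y = forward(x, w_hidden, w_out)
    return 0.5 * (z - y) ** 2

def backward(x, z, w_hidden, w_out):
    """Propagate the output error back and return the weight gradients."""
    h, y = forward(x, w_hidden, w_out)
    delta_out = (y - z) * y * (1.0 - y)          # error signal at the output neuron
    grad_out = [delta_out * hj for hj in h]
    # Each hidden neuron receives the output error scaled by the weight on its link.
    delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    grad_hidden = [[dj * xk for xk in x] for dj in delta_hidden]
    return grad_hidden, grad_out

# Hypothetical pattern and weights.
x, z = [0.3, -0.7], 1.0
w_hidden = [[0.1, -0.4], [0.6, 0.2]]
w_out = [0.5, -0.3]

grad_hidden, grad_out = backward(x, z, w_hidden, w_out)

# Central finite-difference estimate of dE/dw for one hidden weight.
eps = 1e-6
plus = [row[:] for row in w_hidden]
minus = [row[:] for row in w_hidden]
plus[0][1] += eps
minus[0][1] -= eps
numeric = (error(x, z, plus, w_out) - error(x, z, minus, w_out)) / (2 * eps)
print(grad_hidden[0][1], numeric)
```

The analytic gradient obtained by passing the error signal backwards should agree with the numerical estimate to within rounding, which is a common sanity check for a backpropagation implementation.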

BP Back-propagation of error towards the hidden layer
BP Back-propagation of error towards the input layer
BP Illustration of Weight Update