
ECE 517: Reinforcement Learning in Artificial Intelligence
Lecture 13: Artificial Neural Networks - Introduction, Feedforward Neural Networks

October 30, 2012

Dr. Itamar Arel
College of Engineering, Electrical Engineering and Computer Science Department
The University of Tennessee, Fall 2012

Final projects - logistics

- Projects can be done individually or in pairs
- Students are encouraged to propose a topic
- Please email me your top three choices for a project, along with a preferred date for your presentation
- Presentation dates: Nov. 27, 29 and Dec. 4
- Format: 17 min presentation + 3 min Q&A
- 7 min for background and motivation
- 10 min for description of your work and conclusions
- Written report due Friday, Dec. 7
- Format similar to project report

Final projects - topics

- Tetris player using RL (and NN)
- Curiosity-based TD learning
- States vs. Rewards in RL
- Human reinforcement learning
- Reinforcement Learning of Local Shape in the Game of Go
- Where do rewards come from?
- Efficient Skill Learning using Abstraction Selection
- AIBO playing on a PC using RL
- AIBO learning to walk within a maze
- Study of value function definitions for TD learning

Outline

- Introduction
- Brain vs. Computers
- The Perceptron
- Multilayer Perceptrons (MLP)
- Feedforward Neural Networks and Backpropagation

Pigeons as art experts (Watanabe et al. 1995)

- Experiment
- Pigeon was placed in a closed box
- Present paintings of two different artists (e.g. Chagall / Van Gogh)
- Reward for pecking when presented a particular artist (e.g. Van Gogh)
- Pigeons were able to discriminate between Van Gogh and Chagall with 95% accuracy (when presented with pictures they had been trained on)

Pictures by different artists

Interesting results

- Discrimination was still 85% successful for previously unseen paintings of the artists
- Conclusions from the experiment
- Pigeons do not simply memorise the pictures
- They can extract and recognise patterns (e.g. artistic style)
- They generalise from the already seen to make predictions
- This is what neural networks (biological and artificial) are good at (unlike conventional computers)
- Provided further justification for use of ANNs

"Computers are incredibly fast, accurate, and stupid. Human beings are incredibly slow, inaccurate, and brilliant. Together they are powerful beyond imagination." - Albert Einstein

The Von Neumann architecture vs. Neural Networks

Von Neumann
- Follows rules
- Solution can/must be formally specified
- Cannot generalize
- Not error tolerant
- Memory for programs and data
- CPU for math and logic
- Control unit to steer program flow

Neural Net
- Learns from data
- Rules on data are not visible
- Able to generalize
- Copes well with noise

Biological Neuron

- Input builds up on receptors (dendrites)
- Cell has an input threshold
- Upon breach of the cell's threshold, activation is fired down the axon
- Synapses (i.e. weights) exist prior to the dendrite (input) interfaces

Connectionism

- Connectionist techniques (a.k.a. neural networks) are inspired by the strong interconnectedness of the human brain
- Neural networks are loosely modeled after the biological processes involved in cognition:
- 1. Information processing involves many simple processing elements called neurons
- 2. Signals are transmitted between neurons using connecting links
- 3. Each link has a weight that modulates (or controls) the strength of its signal
- 4. Each neuron applies an activation function to the input that it receives from other neurons; this function determines its output
- Links with positive weights are called excitatory links
- Links with negative weights are called inhibitory links

Some definitions

- A Neural Network is an interconnected assembly of simple processing elements, units or nodes. The long-term memory of the network is stored in the inter-unit connection strengths, or weights, obtained by a process of adaptation to, or learning from, a set of training patterns
- Biologically inspired learning mechanism

Brain vs. Computer

- Performance tends to degrade gracefully under partial damage
- In contrast, most programs and engineered systems are brittle: if you remove some arbitrary parts, very likely the whole will cease to function
- The brain performs massively parallel computations extremely efficiently. For example, complex visual perception occurs within less than 100 ms, that is, about 10 processing steps!

Dimensions of Neural Networks

- Various types of neurons
- Various network architectures
- Various learning algorithms
- Various applications
- We'll focus mainly on supervised learning based networks
- The architecture of a neural network is linked with the learning algorithm used to train it

ANNs - The basics

- ANNs incorporate the two fundamental components of biological neural nets:
- Neurons: computational nodes
- Synapses: weights or memory storage devices

Neuron vs. Node

The Artificial Neuron

Bias as an extra input

- Bias is an external parameter of the neuron. It can be modeled by adding an extra (fixed-valued) input (see the sketch below)
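As an illustration, here is a minimal sketch (Python with NumPy; the function names and example values are ours, not from the slides) of an artificial neuron whose bias is folded in as an extra input fixed at 1:

```python
import numpy as np

def neuron_output(x, w, bias, activation=np.tanh):
    """Artificial neuron: weighted sum of inputs plus bias, then activation."""
    return activation(np.dot(w, x) + bias)

def neuron_output_bias_as_input(x, w_with_bias, activation=np.tanh):
    """Same neuron, with the bias modeled as an extra input fixed at 1."""
    x_ext = np.append(x, 1.0)  # the extra (fixed-valued) input
    return activation(np.dot(w_with_bias, x_ext))

x = np.array([0.5, -1.0])
w = np.array([0.8, 0.3])
b = 0.25
# Both formulations produce the same output:
print(neuron_output(x, w, b))                           # tanh(0.4 - 0.3 + 0.25)
print(neuron_output_bias_as_input(x, np.append(w, b)))  # identical
```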

Face recognition example

90% accurate learning of head pose, and recognizing 1-of-20 faces

The XOR problem

- A single-layer (linear) neural network cannot solve the XOR problem
- Input -> Output
- 00 -> 0
- 01 -> 1
- 10 -> 1
- 11 -> 0
- To see why this is true, we can try to express the problem as a linear equation aX + bY = Z:
- a·0 + b·0 = 0
- a·0 + b·1 = 1 -> b = 1
- a·1 + b·0 = 1 -> a = 1
- a·1 + b·1 = 0 -> a = -b, contradicting a = b = 1

The XOR problem (cont.)

- But by adding a third input bit (Z = X AND Y), the problem can be resolved
- Input -> Output
- 000 -> 0
- 010 -> 1
- 100 -> 1
- 111 -> 0
- Once again, we express the problem as a linear equation aX + bY + cZ = W:
- a·0 + b·0 + c·0 = 0
- a·0 + b·1 + c·0 = 1 -> b = 1
- a·1 + b·0 + c·0 = 1 -> a = 1
- a·1 + b·1 + c·1 = 0 -> a + b + c = 0 -> 1 + 1 + c = 0 -> c = -2
- So the equation X + Y - 2Z = W will solve the problem (verified in the sketch below)
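A quick sketch (Python; our own illustration, not from the slides) confirming that a single linear threshold unit computes XOR once the extra feature Z = X AND Y is supplied:

```python
def xor_via_third_bit(x, y):
    """XOR via one linear threshold unit, given the extra feature z = x AND y."""
    z = x & y
    w = x + y - 2 * z          # the linear equation X + Y - 2Z = W from the slide
    return 1 if w >= 1 else 0  # threshold at 1

for x in (0, 1):
    for y in (0, 1):
        print(x, y, "->", xor_via_third_bit(x, y))
# prints 0 0 -> 0, 0 1 -> 1, 1 0 -> 1, 1 1 -> 0
```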

A Multilayer Network for the XOR function

Thresholds

Hidden Units

- Hidden units are a layer of nodes that are situated between the input nodes and the output nodes
- Hidden units allow a network to learn non-linear functions
- The hidden units allow the net to represent combinations of the input features
- Given too many hidden units, however, a net will simply memorize the input patterns
- Given too few hidden units, the network may not be able to represent all of the necessary generalizations

Backpropagation Networks

- Backpropagation networks are among the most popular and widely used neural networks because they are relatively simple and powerful
- Backpropagation was one of the first general techniques developed to train multilayer networks, which do not have many of the inherent limitations of the earlier, single-layer neural nets criticized by Minsky and Papert
- Backpropagation networks use a gradient descent method to minimize the total squared error of the output
- A backpropagation net is a multilayer, feedforward network that is trained by backpropagating the errors using the generalized delta rule

The idea behind (error) backpropagation learning

- Feedforward training of input patterns
- Each input node receives a signal, which is broadcast to all of the hidden units
- Each hidden unit computes its activation, which is broadcast to all of the output nodes
- Backpropagation of errors
- Each output node compares its activation with the desired output
- Based on this difference, the error is propagated back to all previous nodes
- Adjustment of weights
- The weights of all links are computed simultaneously, based on the errors that were propagated backwards (see the sketch after this list)
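To make the three phases concrete, here is a minimal sketch (Python with NumPy; the layer sizes, learning rate, and variable names are our own assumptions) of one training step for a one-hidden-layer network with sigmoid units:

```python
import numpy as np

def sigmoid(s):
    return 1.0 / (1.0 + np.exp(-s))

rng = np.random.default_rng(0)
W1 = rng.normal(scale=0.5, size=(3, 2))  # input -> hidden weights (3 hidden, 2 inputs)
W2 = rng.normal(scale=0.5, size=(1, 3))  # hidden -> output weights
eta = 0.5                                # learning rate (assumed)

x = np.array([0.0, 1.0])                 # input pattern
d = np.array([1.0])                      # desired (target) output

# 1. Feedforward: each layer's activation is broadcast to the next layer
h = sigmoid(W1 @ x)                      # hidden activations
y = sigmoid(W2 @ h)                      # output activations

# 2. Backpropagation of errors (generalized delta rule; sigmoid derivative is y(1-y))
delta_out = (d - y) * y * (1 - y)             # output nodes compare activation to target
delta_hid = (W2.T @ delta_out) * h * (1 - h)  # error propagated back to hidden nodes

# 3. Weight adjustment: all links updated simultaneously from the propagated errors
W2 += eta * np.outer(delta_out, h)
W1 += eta * np.outer(delta_hid, x)
```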

Multilayer Perceptron (MLP)

Activation functions

- Transforms a neuron's input into its output
- Features of activation functions:
- A squashing effect is required
- Prevents accelerating growth of activation levels through the network
- Simple and easy to calculate (see the sketch below)
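For instance, the two classic squashing functions (a sketch of our own; the slides do not prescribe a particular choice) map any real input into a bounded range and have cheap derivatives:

```python
import numpy as np

def sigmoid(s):
    """Logistic sigmoid: squashes any real s into (0, 1); derivative is y*(1-y)."""
    return 1.0 / (1.0 + np.exp(-s))

def tanh(s):
    """Hyperbolic tangent: squashes into (-1, 1); derivative is 1 - y**2."""
    return np.tanh(s)

# Large inputs are squashed, preventing runaway activation levels:
print(sigmoid(np.array([-10.0, 0.0, 10.0])))  # ~[0.00005, 0.5, 0.99995]
```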

Backpropagation Learning

- We want to train a multilayer feedforward network by gradient descent to approximate an unknown function, based on some training data consisting of pairs (x, d)
- Vector x represents a pattern of input to the network, and vector d the corresponding target (desired output)
- BP is a gradient-descent based scheme
- The overall gradient with respect to the entire training set is just the sum of the gradients for each pattern (summarized in the formulas below)
- We will therefore describe how to compute the gradient for just a single training pattern
- We will number the units, and denote the weight from unit j to unit i by w_ij
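In symbols (a brief restatement in our own notation, consistent with the (x, d) pairs above; p indexes training patterns, i output units, and eta is the learning rate), the squared error, its per-pattern decomposition, and the gradient-descent update are:

```latex
E = \sum_p E_p, \qquad
E_p = \tfrac{1}{2} \sum_i \left( d_{p,i} - y_{p,i} \right)^2, \qquad
\frac{\partial E}{\partial w_{ij}} = \sum_p \frac{\partial E_p}{\partial w_{ij}}, \qquad
\Delta w_{ij} = -\eta \, \frac{\partial E}{\partial w_{ij}}
```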

BP Forward Pass at Layer 1

BP Forward Pass at Layer 2

BP Forward Pass at Layer 3

- The last layer produces the network's output
- We can now derive an error (the difference between the output and the target), as the sketch below illustrates
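A compact sketch (Python with NumPy; the three weight matrices and the tanh nonlinearity are our assumptions) of the three forward passes and the resulting error:

```python
import numpy as np

rng = np.random.default_rng(1)
W = [rng.normal(scale=0.5, size=(4, 3)),  # layer 1 weights
     rng.normal(scale=0.5, size=(4, 4)),  # layer 2 weights
     rng.normal(scale=0.5, size=(2, 4))]  # layer 3 (output layer) weights

x = np.array([0.2, -0.4, 0.7])            # input pattern
z = np.array([1.0, 0.0])                  # target z

# Forward pass, layer by layer; the last layer produces the network's output
a = x
for Wl in W:
    a = np.tanh(Wl @ a)

error = z - a                             # difference between target and output
print("output:", a, "error:", error)
```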

BP Back-propagation of error - output layer

- We have an error with respect to the target (z)
- This error signal will be propagated back towards the input layer (layer 1)
- Each neuron will forward error information to the neurons feeding it from the previous layer

BP Back-propagation of error towards the hidden layer

BP Back-propagation of error towards the input layer

BP Illustration of Weight Update
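As a closing illustration (our own sketch; the learning rate and the example values are assumptions), the generalized delta rule update for a single link from unit j to unit i:

```python
# Generalized delta rule for one link from unit j to unit i.
# delta_i is the error signal at unit i (already includes the activation
# derivative, as computed during the backward pass); y_j is the activation
# of the feeding unit j; eta is an assumed learning rate.
eta = 0.1        # learning rate (assumption)
y_j = 0.6        # activation of unit j (example value)
delta_i = 0.05   # back-propagated error signal at unit i (example value)

w_ij = 0.25      # current weight from unit j to unit i
w_ij += eta * delta_i * y_j  # weight moves along the negative gradient
print(w_ij)      # 0.253
```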