1
CS451/CS551/EE565 ARTIFICIAL INTELLIGENCE
  • Neural Networks
  • 12-06-2006
  • Prof. Janice T. Searleman
  • jets@clarkson.edu, jetsza

2
Outline
  • Neural Nets
  • Reading Assignment: AIMA
  • Chapter 20, section 20.5, Neural Networks
  • Final Exam: Mon. 12/11/06, 8:00 am, SC342
  • HW7 posted, due Wed. 12/06/06

3
Connectionist models
  • Key intuition: Much of intelligence is in the
    connections between the 10 billion neurons in the
    human brain.
  • Neuron switching time is roughly 0.001 second;
    scene recognition time is about 0.1 second. This
    suggests that the brain is massively parallel,
    because 100 sequential computational steps are
    simply not sufficient to accomplish scene recognition.
  • Development: formation of the basic connection
    topology.
  • Learning: fine-tuning of the topology plus major
    synaptic-efficiency changes.
  • The matrix IS the intelligence!

4
Artificial Neural Networks (ANN)
  • Distributed representational and computational
    mechanism based (very roughly) on
    neurophysiology.
  • A collection of simple interconnected processors
    (neurons) that can learn complex behaviors and
    solve difficult problems.
  • Wide range of applications
  • Supervised Learning
  • Function Learning (Correct mapping from inputs to
    outputs)
  • Time-Series Analysis, Forecasting, Controller
    Design
  • Concept Learning
  • Standard Machine Learning classification tasks:
    Features -> Class
  • Unsupervised Learning
  • Pattern Recognition (Associative Memory models)
  • Words, Sounds, Faces, etc.
  • Data Clustering
  • Unsupervised Concept Learning

5
NeuroComputing
  • Nodes fire when sum(weighted inputs) >
    threshold.
  • Other varieties are common: unthresholded linear,
    sigmoidal, etc.
  • Connection topologies vary widely across
    applications.
  • Weights vary in magnitude and sign (stimulate or
    inhibit).
  • Learning: finding the proper topology and weights.
  • A search process in the space of possible
    topologies and weights.
  • Most ANN applications assume a fixed topology.
  • The matrix IS the learning machine!

6
Properties of connectionist models
  • Many neuron-like threshold switching units.
  • Many weighted interconnections between units.
  • Highly parallel, distributed computation.
  • Weights are tuned automatically.
  • Especially useful for learning complex functions
    with continuous-valued outputs and large numbers
    of noisy inputs, the kind of problem that
    logic-based techniques have difficulty with.
  • Fault-tolerant.
  • Degrades gracefully.

7
Neural Networks
[Diagram of a single node (unit): input links deliver activations a_j, each weighted by W_j,i; the input function sums them into in_i; the activation function g produces the output a_i = g(in_i), which is sent along the output links.]
8
Simple Computing Elements
  • Each unit (node) receives signals from its input
    links and computes a new activation level that it
    sends along all output links.
  • Computation is split into two steps:
  • in_i = Σ_j W_j,i a_j , the linear step, and then
  • a_i = g(in_i), the nonlinear step (sketched in code below).
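As a rough sketch in Python (the function name unit_output and the choice of the sigmoid for g are illustrative assumptions, not from the slides):

    import math

    def unit_output(weights, activations):
        # Linear step: in_i = sum over j of W_j,i * a_j
        in_i = sum(w * a for w, a in zip(weights, activations))
        # Nonlinear step: a_i = g(in_i); here g is taken to be the sigmoid
        return 1.0 / (1.0 + math.exp(-in_i))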
9
Possibilities for g
Step function, sign function, sigmoid (logistic) function:
step(x) = 1 if x > threshold, 0 if x < threshold (in the figure, threshold = 0)
sign(x) = 1 if x > 0, -1 if x < 0
sigmoid(x) = 1 / (1 + e^-x)
Adding an extra input with activation a_0 = -1
and weight W_0,j = t is equivalent to having a
threshold at t. This way we can always assume a
threshold of 0.
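For concreteness, a minimal Python version of the three choices for g (the default threshold of 0 follows the convention above; the function names are mine):

    import math

    def step(x, threshold=0.0):
        # 1 above the threshold, 0 below it
        return 1 if x > threshold else 0

    def sign(x):
        # +1 for positive inputs, -1 otherwise
        return 1 if x > 0 else -1

    def sigmoid(x):
        # the logistic function 1 / (1 + e^-x)
        return 1.0 / (1.0 + math.exp(-x))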
10
Real vs artificial neurons
11
Similarities with neurons
12
Differences
13
Neural Nets A Brief History
  • McCulloch and Pitts, 1943: Showed how neural-like
    networks could compute.
  • Rosenblatt, 1950s: Perceptrons.
  • Minsky and Papert, 1969: Perceptron deficiencies.
  • Hopfield, 1982: Hopfield nets.
  • Hinton and Sejnowski, 1986: Boltzmann machines.
  • Rumelhart et al., 1986: Multilayer nets with
    backpropagation.

14
Universal computing elements
  • In 1943, McCulloch and Pitts showed that a
    synchronous assembly of such neurons is a
    universal computing machine. That is, any Boolean
    function can be implemented with threshold (step
    function) units.

15
Implementing AND
[Diagram: a threshold unit computing o(x1,x2) = x1 AND x2, with inputs x1 and x2 each weighted 1 and a bias input of -1 with weight W = 1.5; the unit outputs 1 if x1 + x2 - 1.5 > 0, and 0 otherwise.]
16
Implementing OR
[Diagram: a threshold unit computing o(x1,x2) = x1 OR x2, with inputs x1 and x2 each weighted 1 and a bias input of -1 with weight W = 0.5.]
o(x1,x2) = 1 if x1 + x2 - 0.5 > 0,
           0 otherwise
17
Implementing NOT
[Diagram: a threshold unit computing o(x1) = NOT x1, with input x1 weighted -1 and a bias input of -1 with weight W = -0.5; the unit outputs 1 if 0.5 - x1 > 0, and 0 otherwise.]
18
Implementing more complex Boolean functions
[Diagram: two threshold units composed into a two-layer circuit. The first unit takes x1 and x2 (weights 1, 1) with a bias input of -1 weighted 0.5 and computes x1 or x2. The second unit takes that output and x3 (weights 1, 1) with a bias input of -1 weighted 1.5 and computes (x1 or x2) and x3. A code sketch follows below.]
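A small Python sketch of these gates, using the weights shown in the diagrams (the helper threshold_unit and its signature are my own illustration):

    def threshold_unit(weights, inputs, bias_weight):
        # The threshold is modeled as an extra input of -1 with weight bias_weight.
        total = sum(w * x for w, x in zip(weights, inputs)) - bias_weight
        return 1 if total > 0 else 0

    def AND(x1, x2): return threshold_unit([1, 1], [x1, x2], 1.5)
    def OR(x1, x2):  return threshold_unit([1, 1], [x1, x2], 0.5)
    def NOT(x1):     return threshold_unit([-1], [x1], -0.5)

    def or_and(x1, x2, x3):
        # (x1 or x2) and x3, built by feeding one unit's output into another
        return threshold_unit([1, 1], [OR(x1, x2), x3], 1.5)

For example, or_and(0, 1, 1) returns 1 and or_and(1, 1, 0) returns 0.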
19
Types of Neural Networks
  • Feedforward: Links are unidirectional, and there
    are no cycles, i.e., the network is a directed
    acyclic graph (DAG). Units are arranged in
    layers, and each unit is linked only to units in
    the next layer. There is no internal state other
    than the weights.
  • Recurrent: Links can form arbitrary topologies,
    which can implement memory. Behavior can become
    unstable, oscillatory, or chaotic.

20
Feedforward Neural Net
21
A recurrent network topology
  • Hopfield net: every unit i is connected to every
    other unit j by a weight W_ij.

Weights are assumed to be symmetric: W_ij = W_ji.
Useful for associative memory: after training on
a set of examples, a new stimulus will cause the
network to settle into an activation pattern
corresponding to the example in the training set
that most closely resembles the new stimulus.
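The slides don't say how the symmetric weights are obtained; one standard choice is the Hebbian outer-product rule over the stored patterns, sketched here under the assumption of +1/-1 activations:

    def hopfield_weights(patterns):
        # patterns: list of equal-length lists of +1/-1 activations
        n = len(patterns[0])
        W = [[0.0] * n for _ in range(n)]
        for p in patterns:
            for i in range(n):
                for j in range(n):
                    if i != j:                  # no self-connections
                        W[i][j] += p[i] * p[j]  # symmetric: W[i][j] == W[j][i]
        return W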
22
Hopfield Net
23
  • Perceptrons
  • Hopfield Nets
  • Multilayer Feedforward Nets

24
Perceptrons
  • Perceptrons are single-layer feedforward networks
  • Each output unit is independent of the others
  • Can assume a single output unit
  • Activation of the output unit is calculated by
  • O = Step_0( Σ_j W_j x_j )
  • where xj is the activation of input unit j, and
    we assume an additional weight and input to
    represent the threshold
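In code, the output computation might look like the sketch below, with the threshold folded in as the extra weight/input pair described above (names are illustrative):

    def perceptron_output(weights, inputs):
        # inputs[0] is the extra bias input -1; weights[0] plays the role of the threshold
        total = sum(w * x for w, x in zip(weights, inputs))
        return 1 if total > 0 else 0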

25
Perceptron
26
Multiple Perceptrons
27
How can perceptrons be designed?
  • The Perceptron Learning Theorem (Rosenblatt,
    1960): Given enough training examples, there is
    an algorithm that will learn any linearly
    separable function.
  • Learning algorithm:
  • If the perceptron fires when it should not, make
    each weight wi smaller by an amount proportional
    to xi
  • If it fails to fire when it should, make each wi
    proportionally larger

28
The perceptron learning algorithm
  • Inputs: training set (x1, x2, ..., xn, t)
  • Method:
  • Randomly initialize weights w(i) to values in (-0.5, 0.5)
  • Repeat for several epochs until convergence:
  • for each example
  • Calculate network output o.
  • Adjust weights (a code sketch follows below).
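A minimal sketch of this loop in Python. The update w_i <- w_i + eta*(t - o)*x_i is the standard perceptron rule, which matches the verbal description on the previous slide (weights shrink when the unit fires spuriously and grow when it fails to fire); the learning rate eta and the epoch limit are illustrative choices.

    import random

    def train_perceptron(examples, n_inputs, eta=0.1, epochs=100):
        # examples: list of (inputs, target) pairs; inputs[0] should be the bias input -1
        w = [random.uniform(-0.5, 0.5) for _ in range(n_inputs)]
        for _ in range(epochs):
            changed = False
            for x, t in examples:
                o = 1 if sum(wi * xi for wi, xi in zip(w, x)) > 0 else 0
                if o != t:
                    w = [wi + eta * (t - o) * xi for wi, xi in zip(w, x)]
                    changed = True
            if not changed:    # converged: every example is classified correctly
                break
        return w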

29
Expressive limits of perceptrons
  • Can the XOR function be represented by a
    perceptron (a network without a hidden layer)?

Writing w0 for the threshold weight, the four input/output
cases of XOR require 0 <= w0, w1 > w0, w2 > w0, and
w1 + w2 <= w0 all at once. There is no assignment of values
to w0, w1, and w2 that satisfies these inequalities. XOR
cannot be represented!
30
So what can be represented using perceptrons?
[Plots: the linearly separable decision boundaries for the "and" and "or" functions.]
Representation theorem: 1-layer feedforward
networks can only represent linearly separable
functions. That is, the decision surface
separating positive from negative examples has to
be a plane.
31
Why does the method work?
  • The perceptron learning rule performs gradient
    descent in weight space.
  • Error surface: the surface that describes the
    error on each example as a function of all the
    weights in the network. A set of weights defines
    a point on this surface.
  • We look at the partial derivative of the surface
    with respect to each weight (i.e., the gradient
    -- how much the error would change if we made a
    small change in that weight). Each weight is then
    altered by an amount proportional to the slope in
    its direction. Thus the network as a whole moves
    in the direction of steepest descent on the error
    surface.
  • The error surface in weight space has a single
    global minimum and no local minima. Gradient
    descent is guaranteed to find the global minimum,
    provided the learning rate is not so big that
    you overshoot it.
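As a concrete (if simplified) illustration, here is one gradient-descent step for a single unit with squared error E = 0.5*(t - o)^2, assuming a sigmoid activation so that the derivative exists (the step function itself is not differentiable, so this is an assumption, not the literal rule above):

    import math

    def gradient_step(w, x, t, eta=0.1):
        in_ = sum(wi * xi for wi, xi in zip(w, x))
        o = 1.0 / (1.0 + math.exp(-in_))          # sigmoid output
        # dE/dw_i = -(t - o) * o * (1 - o) * x_i, so step each weight downhill
        return [wi + eta * (t - o) * o * (1 - o) * xi for wi, xi in zip(w, x)]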

32
  • Perceptrons
  • Hopfield Nets
  • Multilayer Feedforward Nets

33
Hopfield Nets
  • John Hopfield, 1982
  • distributed representation
  • memory is stored as a pattern of activation
  • different memories are different patterns on the
    SAME PEs
  • distributed, asynchronous control
  • each processor makes decisions based on local
    situation
  • content-addressable memory
  • a number of patterns can be stored in a net
  • to retrieve a pattern, specify some (or all) of
    it; the net will find the closest match
  • fault tolerance
  • the network works even if a few PEs misbehave or
    fail (graceful degradation)
  • also handles novel inputs well (robust)

34
Distributed Information Storage and Processing
  • Information is stored in the weights, with:
  • Concepts/patterns spread over many weights and
    nodes.
  • Individual weights can hold info for many
    different concepts.

35
Parallel Relaxation
  • Choose an arbitrary unit; if any neighbors are
    active, compute the sum of the weights on the
    connections to those active neighbors.
  • If the sum is positive, activate the unit;
    otherwise, deactivate it.
  • Continue until a stable state is achieved (all
    units have been considered and no more units can
    change). A code sketch follows below.
  • Hopfield showed that given any set of weights and
    any initial state, the parallel relaxation
    algorithm would eventually settle into a stable
    state
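A rough Python sketch of parallel relaxation as described above, with 1/0 unit states (the function name and the random visiting order are my choices):

    import random

    def parallel_relaxation(W, state):
        # W: symmetric weight matrix with zero diagonal; state: list of 0/1 activations
        n = len(state)
        while True:
            changed = False
            for i in random.sample(range(n), n):      # visit units in arbitrary order
                active = [j for j in range(n) if j != i and state[j] == 1]
                if not active:                        # only update if some neighbor is active
                    continue
                s = sum(W[i][j] for j in active)      # sum of weights to active neighbors
                new = 1 if s > 0 else 0
                if new != state[i]:
                    state[i] = new
                    changed = True
            if not changed:                           # stable: no unit wants to change
                return state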

36
Example Hopfield Net
Note that this is a stable state
37
Test Input
What steady state does this converge to?
38
Another Test Input
What steady state does this converge to?
39
Four Stable States
40
  • Perceptrons
  • Hopfield Nets
  • Multilayer Feedforward Nets

41
Multilayer Feedforward Net
42
Multi-layer networks
  • Multi-layer feedforward networks are trainable by
    backpropagation, provided the activation function
    g is differentiable.
  • Threshold units don't qualify, but the logistic
    function does (see the sketch below).
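One reason the logistic (sigmoid) function qualifies is that its derivative is easy to compute from its own output, which backpropagation relies on; a small sketch (standard formulas, names are mine):

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    def sigmoid_derivative(x):
        # g'(x) = g(x) * (1 - g(x)); defined everywhere, unlike the step function's derivative
        s = sigmoid(x)
        return s * (1.0 - s)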

43
Sigmoid units
  • Soft threshold units.