1
  • Connectionist Sentence Comprehension
  • and Production System
  • A model by Dr. Douglas Rohde, MIT
  • Presented by Dave Cooke
  • Nov. 6, 2004

2
Overview
  • Introduction
  • A brief overview of Artificial Neural Networks
  • The basic architecture
  • Introduce Douglas Rohde's CSCP model
  • Overview
  • Penglish Language
  • Architecture
  • Semantic System
  • Comprehension, Prediction, and Production System
  • Training
  • Testing
  • Conclusions
  • Bibliography

3
A Brief Overview
  • Basic definition of an Artificial Neural Network
  • A network of interconnected neurons inspired by
    the biological nervous system.
  • The function of an Artificial Neural Network is
    to produce an output pattern from a given input.
  • First described by Warren McCulloch and Walter
    Pitts in 1943 in their seminal paper A logical
    calculus of the ideas immanent in nervous activity.

4
  • Artificial neurons are modeled after biological
    neurons
  • The architecture of an Artificial Neuron (a
    minimal code sketch follows)
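
As an illustration (not from the original slides), a single artificial neuron can be sketched in a few lines of Python: a weighted sum of the inputs plus a bias, passed through a squashing activation. The weights and inputs below are arbitrary example values.

```python
import numpy as np

def neuron(x, w, b):
    """A single artificial neuron: a weighted sum of the inputs plus a
    bias, squashed through a sigmoid activation."""
    return 1.0 / (1.0 + np.exp(-(np.dot(w, x) + b)))

x = np.array([0.5, -1.0, 2.0])   # input pattern (arbitrary example values)
w = np.array([0.8, 0.2, -0.5])   # connection weights
b = 0.1                          # bias term
print(neuron(x, w, b))           # an activation in (0, 1)
```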

5
Architecture -- Structure
  • Network Structure
  • Many types of neural network structures
  • Ex: feedforward, recurrent
  • Feedforward
  • Can be single-layered or multi-layered
  • Inputs are propagated forward to the output
    layer, as in the sketch below
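
A minimal sketch of forward propagation through a multi-layered feedforward network, assuming sigmoid activations; the layer sizes and random weights are illustrative only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def feedforward(x, layers):
    """Propagate an input pattern forward through each (W, b) layer in turn."""
    a = x
    for W, b in layers:
        a = sigmoid(W @ a + b)
    return a

rng = np.random.default_rng(0)
# A 4-3-2 network: one hidden layer and one output layer.
layers = [(rng.standard_normal((3, 4)), np.zeros(3)),
          (rng.standard_normal((2, 3)), np.zeros(2))]
print(feedforward(rng.standard_normal(4), layers))
```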

6
Architecture -- Recurrent NN
  • Recurrent Neural Networks
  • Operate on an input space and an internal state
    space; they have memory.
  • Primary types of Recurrent neural networks
  • simple recurrent
  • fully recurrent
  • Below is an example of a simple recurrent network
    (SRN)
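
A minimal Python sketch of one SRN time step, assuming tanh hidden units and a linear output; all sizes and weights are illustrative, not taken from the slides.

```python
import numpy as np

def srn_step(x, context, W_in, W_ctx, W_out):
    """One time step of an Elman-style SRN: the hidden layer sees the
    current input plus a copy of its own previous state (the context)."""
    h = np.tanh(W_in @ x + W_ctx @ context)    # new hidden state
    y = W_out @ h                              # output for this step
    return y, h                                # h becomes the next context

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 5, 8, 3
W_in = rng.standard_normal((n_hid, n_in)) * 0.1
W_ctx = rng.standard_normal((n_hid, n_hid)) * 0.1
W_out = rng.standard_normal((n_out, n_hid)) * 0.1

context = np.zeros(n_hid)                      # the memory starts empty
for x in rng.standard_normal((4, n_in)):       # a 4-step input sequence
    y, context = srn_step(x, context, W_in, W_ctx, W_out)
```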

7
Architecture -- Learning
  • Learning used in NNs
  • Learning = a change in connection weights
  • Supervised networks: the network is told the
    correct answer
  • e.g. backpropagation, backpropagation through
    time, reinforcement learning
  • Unsupervised networks: the network must find
    structure in the input on its own
  • e.g. competitive learning, self-organizing
    (Kohonen) maps

8
Architecture -- Learning (BPTT)
  • Backpropagation Through Time (BPTT) is used in
    the CSCP Model and SRNs
  • In BPTT the network runs ALL of its forward
    passes, then performs ALL of the backward passes.
  • Equivalent to unrolling the network through time
    and propagating the error backwards through the
    unrolled copies (sketched below)
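
A minimal sketch of BPTT for a bias-free SRN with tanh hidden units, a linear output layer, and squared error. It illustrates the "all forward passes, then all backward passes" scheme described above; it is not the CSCP model's actual training code.

```python
import numpy as np

def bptt(xs, ts, W_in, W_ctx, W_out):
    """Minimal BPTT: run ALL the forward passes first, storing every
    hidden state, then run ALL the backward passes through the
    unrolled network, accumulating weight gradients."""
    T = len(xs)
    hs = [np.zeros(W_ctx.shape[0])]            # hs[0] is the initial context
    ys = []
    for t in range(T):                         # forward through time
        hs.append(np.tanh(W_in @ xs[t] + W_ctx @ hs[-1]))
        ys.append(W_out @ hs[-1])

    gW_in, gW_ctx, gW_out = (np.zeros_like(W_in), np.zeros_like(W_ctx),
                             np.zeros_like(W_out))
    dh_next = np.zeros(W_ctx.shape[0])         # gradient arriving from the future
    for t in reversed(range(T)):               # backward through time
        dy = ys[t] - ts[t]                     # dE/dy for squared error
        gW_out += np.outer(dy, hs[t + 1])
        dh = W_out.T @ dy + dh_next            # from the output AND later steps
        dz = dh * (1.0 - hs[t + 1] ** 2)       # through the tanh derivative
        gW_in += np.outer(dz, xs[t])
        gW_ctx += np.outer(dz, hs[t])
        dh_next = W_ctx.T @ dz                 # pass the gradient one step back
    return gW_in, gW_ctx, gW_out

rng = np.random.default_rng(0)
xs = rng.standard_normal((5, 4))               # a 5-step input sequence
ts = rng.standard_normal((5, 2))               # matching targets
W_in, W_ctx, W_out = (rng.standard_normal((6, 4)) * 0.1,
                      rng.standard_normal((6, 6)) * 0.1,
                      rng.standard_normal((2, 6)) * 0.1)
grads = bptt(xs, ts, W_in, W_ctx, W_out)       # apply as W -= lr * g
```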

9
The CSCP Model
  • Connectionist Sentence Comprehension and
    Production model
  • Primary goal: learn to comprehend and produce
    sentences in the Penglish (Pseudo-English)
    language.
  • Secondary goal: to construct a model that will
    account for a wide range of human sentence
    processing behaviours.

10
Basic Architecture
  • A Simple Recurrent NN is used
  • Penglish (Pseudo English) was used to train and
    test the model.
  • Consists of 2 separate parts connected by a
    message layer (sketched below)
  • Semantic System (Encoding/Decoding System)
  • CPP system
  • Backpropagation Through Time (BPTT) is the
    learning algorithm.
  • A method for learning temporal tasks
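
A structural sketch only, echoing the slide's description of two subnetworks joined by a message layer. The layer names, sizes, and random weights below are invented for illustration; this is not Rohde's actual architecture or training procedure.

```python
import numpy as np

rng = np.random.default_rng(2)
dense = lambda n_out, n_in: rng.standard_normal((n_out, n_in)) * 0.1

# Illustrative sizes only; the real CSCP layers differ.
N_PROP, N_MSG, N_WORD, N_HID = 30, 40, 20, 50

W_enc = dense(N_MSG, N_PROP)    # Semantic System: propositions -> message
W_dec = dense(N_PROP, N_MSG)    # Semantic System: message -> propositions
W_hid = dense(N_HID, N_WORD)    # CPP side: word input -> hidden state
W_comp = dense(N_MSG, N_HID)    # CPP side: hidden state -> message

def encode(propositions):
    """Semantic half: compress a proposition pattern into the message layer."""
    return np.tanh(W_enc @ propositions)

def decode(message):
    """Semantic half: recover a proposition pattern from the message."""
    return np.tanh(W_dec @ message)

def comprehend(word):
    """CPP half: map a word input to a message-layer pattern."""
    return np.tanh(W_comp @ np.tanh(W_hid @ word))

print(decode(encode(rng.standard_normal(N_PROP))).shape)  # encode/decode path
print(decode(comprehend(rng.standard_normal(N_WORD))).shape)  # comprehension path
```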

11
Penglish
  • Goal: to produce only sentences that are
    reasonably valid in English
  • Built around the framework of a stochastic
    context-free grammar.
  • Given an SCFG it is easy to generate sentences,
    parse sentences, and perform optimal prediction
    (a toy generator is sketched below)
  • A subset of English; some grammatical structures
    used are:
  • 56 verb stems
  • 45 noun stems
  • adjectives, determiners, adverbs, subordinate
    clauses
  • several types of local ambiguity.
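
The Penglish grammar itself is not given in the slides; the toy stochastic CFG below uses made-up rules and probabilities, and only illustrates how an SCFG generates sentences by sampling one production per nonterminal.

```python
import random

# A toy stochastic CFG (made-up rules, NOT the actual Penglish grammar).
# Each nonterminal maps to a list of (probability, expansion) pairs.
GRAMMAR = {
    "S":   [(1.0, ["NP", "VP"])],
    "NP":  [(0.6, ["Det", "N"]), (0.4, ["Det", "Adj", "N"])],
    "VP":  [(0.7, ["V", "NP"]), (0.3, ["V"])],
    "Det": [(0.5, ["the"]), (0.5, ["a"])],
    "Adj": [(0.5, ["nice"]), (0.5, ["new"])],
    "N":   [(0.4, ["teacher"]), (0.3, ["dog"]), (0.3, ["book"])],
    "V":   [(0.5, ["saw"]), (0.5, ["took"])],
}

def generate(symbol="S"):
    """Expand a symbol by sampling one production according to its probability."""
    if symbol not in GRAMMAR:            # terminal: emit the word itself
        return [symbol]
    r, acc = random.random(), 0.0
    for p, expansion in GRAMMAR[symbol]:
        acc += p
        if r <= acc:
            return [w for s in expansion for w in generate(s)]
    return [w for s in GRAMMAR[symbol][-1][1] for w in generate(s)]

print(" ".join(generate()))   # e.g. "the dog saw a new book"
```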

12
Penglish
  • Penglish sentences do not always sound entirely
    natural, even though constraints to avoid
    semantic violations were implemented
  • Example sentences:
  • (1) We had played a trumpet for you.
  • (2) A answer involves a nice school.
  • (3) The new teacher gave me a new book of
    baseball.
  • (4) Houses have had something the mother has
    forgotten.

13
The CSCP Model
[Diagram: overview of the CSCP model. The Semantic
System, which stores all propositions seen for the
current sentence, is connected to the CPP System.]
14
Semantic System
[Diagram: propositions are loaded sequentially into
the Semantic System and stored in its memory.]
15
Semantic System
[Diagram: where the error measure is taken in the
Semantic System.]
16
Training (SS)
  • Backpropagation
  • Trained separately from, and prior to, the rest
    of the model.
  • The decoder uses standard single-step
    backpropagation
  • The encoder is trained using BPTT.
  • The majority of the running time is in the
    decoding stage.

17
Training (SS)
[Diagram: the point in the Semantic System at which
error is assessed during training.]
18
CPP System
[Diagram: the CPP System, showing its error measure
and the phonologically encoded word input.]
19
CPP System (cont.)
[Diagram: the CPP System starts by trying to predict
the next word in the sentence; the goal is to produce
the next word and pass it to the Word Input Layer.]
20
The CPP System - Training
[Diagram: training the CPP System. (1) BPTT starts
here; (2) error is backpropagated to here; (3)
previously recorded output errors are injected here;
(4) BPTT continues backwards through time.]
21
Training
  • 16 Penglish training sets
  • Each set: 250,000 sentences; 4 million sentences
    in total
  • 50,000 weight updates per set = 1 epoch
  • Total of 16 epochs.
  • The learning rate started at 0.2 for the first
    epoch and was gradually reduced over the course
    of learning.
  • After the Semantic System, the CPP system was
    similarly trained
  • Training began with sentences of limited
    complexity, and complexity increased gradually.
  • Training a single network took about 2 days on a
    500 MHz Alpha. Total training time was about
    two months.
  • Overall, 3 networks were trained

22
Testing
  • 50,000 sentences
  • 33.8% of the test sentences also appeared in one
    of the training sets.
  • Nearly all of the sentences had 1 or 2
    propositions.
  • 3 forms of measurement are used in measuring
    comprehension:
  • multiple-choice measure
  • reading-time measure
  • grammaticality-rating measure

23
Testing (Multiple Choice)
  • Example: When the owner let go, the dog ran
    after the mailman.
  • Expressed as the query (ran after, theme, ?)
  • Possible answers:
  • mailman (correct answer)
  • owner, dog, girls, cats (distractors)
  • Error measure: with four distractors, chance
    performance is 20% correct (a scoring sketch
    follows below).
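
A sketch of the multiple-choice scoring idea: the candidate filler with the lowest error wins, and with one correct answer among five options, chance performance is 20%. The error values below are invented stand-ins for querying the trained network.

```python
def multiple_choice(query_error, candidates):
    """Pick the candidate filler whose decoded output error is lowest.
    `query_error` maps a candidate word to the model's error for it;
    here it is a stand-in for querying the trained Semantic System."""
    return min(candidates, key=query_error)

# Illustrative stand-in errors (a real run would query the network).
errors = {"mailman": 0.08, "owner": 0.31, "dog": 0.27,
          "girls": 0.55, "cats": 0.49}
answer = multiple_choice(errors.get, list(errors))
print(answer)                  # "mailman"
print(f"chance = {1/5:.0%}")   # 20% with one correct answer in five
```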

24
Testing (Reading Time)
  • Also known as Simulated Reading Time
  • It is a weighted average of 4 components:
  • (1) and (2): measure the degree to which the
    current word was expected
  • (3): the change in the message that occurred
    when the current word was read
  • (4): the average level of activation in the
    message layer
  • The four components are multiplied by scaling
    factors to achieve average values close to 1.0
    for each of them, and a weighted average is then
    taken (sketched below).
  • Ranges from 0.4 for easy words to 2.5 or more
    for very hard words.
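
A sketch of the weighted-average computation; the scaling factors and weights below are placeholders, not Rohde's fitted values.

```python
import numpy as np

def simulated_reading_time(components, scales, weights):
    """Weighted average of the four per-word components, after each is
    scaled so that its average value is close to 1.0 (scales and
    weights here are placeholders)."""
    c = np.asarray(components) * np.asarray(scales)
    w = np.asarray(weights)
    return float(np.dot(w, c) / w.sum())

# Two expectation-based components, message change, message activation.
print(simulated_reading_time(components=[0.9, 1.2, 1.5, 0.8],
                             scales=[1.0, 1.0, 1.0, 1.0],
                             weights=[0.3, 0.3, 0.2, 0.2]))
```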

25
Testing (Grammaticality)
  • The Grammaticality Method
  • (1) prediction error (PE)
  • An indicator of syntactic complexity
  • Involves the point in the sentence at which the
    worst two consecutive predictions occur.
  • (2) comprehension error (CE)
  • The average strict-criterion comprehension error
    rate on the sentence.
  • Intended to reflect the degree to which the
    sentence makes sense.
  • Simulated ungrammaticality rating (SUR)
  • SUR = (PE + 8) × (CE + 0.5)
  • Combines the two components into a single measure
    of ungrammaticality (computed in the sketch below)
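
A one-line implementation of the SUR formula as reconstructed above, assuming the operators stripped from the slide were plus signs.

```python
def simulated_ungrammaticality(pe, ce):
    """SUR = (PE + 8) * (CE + 0.5): combines prediction error and
    comprehension error into a single ungrammaticality score
    (assuming the slide's stripped operators were plus signs)."""
    return (pe + 8.0) * (ce + 0.5)

print(simulated_ungrammaticality(pe=0.4, ce=0.1))  # higher = less grammatical
```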

26
Conclusions
  • General comprehension results
  • The final networks are able to provide complete,
    accurate answers
  • Given NO choices: 77% correct
  • Given 5 choices: 92% correct
  • Sentential complement ambiguity
  • Strict-criterion error rate: 13.5%
  • Multiple choice: 2%
  • Subordinate clause ambiguity
  • Ex. Although the teacher saw a book was taken in
    the school.
  • The intransitive, weak bad, weak good, strong
    bad, and strong good conditions all had under
    20% error rates on multiple-choice questions.

27
Bibliography
  • Artificial Intelligence, 4th ed., Luger G.F.,
    Addison Wesley, 2002
  • Artificial Intelligence: A Modern Approach,
    2nd ed., Russell S. & Norvig P., Prentice Hall,
    2003
  • Neural Networks, 2nd ed., Picton P., Palgrave,
    2000
  • A Connectionist Model of Sentence Comprehension
    and Production, Rohde D., MIT, March 2, 2002
  • Finding Structure in Time, Elman J.L., UC San
    Diego, Cognitive Science, 14, 179-211, 1990
  • Fundamentals of Neural Networks, Fausett L.,
    Pearson, 1994