Title: Multiword Expressions Facilitate, not Hinder, Understanding 9 November 2006
- Jerry Ball
- Senior Research Psychologist
- Human Effectiveness Directorate
- Air Force Research Laboratory
2 Multiword Expressions: A Pain in the Neck?
- According to Sag et al. (2002), Multiword Expressions (MWEs) are a pain in the neck for developing Natural Language Processing (NLP) systems
- MWEs must be handled as exceptions to a word-based compositional semantics
- The meaning of an MWE cannot be determined from the meanings of its individual words composed together according to syntax
- Unfortunately, MWEs are ubiquitous in natural language
- Sag, I., Baldwin, T., Bond, F., Copestake, A. and Flickinger, D. (2002). Multiword Expressions: A Pain in the Neck for NLP. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics
3 Multiword Expressions: A Pain in the Neck?
- Maybe the current word-based compositional semantic approach to building NLP systems is missing something!
- Words are the base meaningful units
- Words are the base units of recognition
- The meaning of an expression is composed from the meanings of words recognized independently and combined syntactically
- But humans recognize and understand linguistic units holistically at multiple levels, not just words
- Letter, Phoneme, Syllable, Morpheme, Word, Phrase, Text
4 Identifying Letters in Words
Count the number of F's in the following text:
FINISHED FILES ARE THE RESULT OF
YEARS OF SCIENTIFIC STUDY COMBINED WITH
THE EXPERIENCE OF YEARS
7 Composing Words from Letters
- The word "of" is recognized holistically
- "of" is not recognized by recognizing "o" and recognizing "f" and combining them to get "of"
- Words can be recognized without recognizing the individual letters
- Even when the task is to identify letters, this can be difficult for very common words
- The "f" in "of" is perceptually implicit
Healy, A. F. (1976). Detection errors on the word "the": Evidence for reading units larger than letters. Journal of Experimental Psychology: Human Perception and Performance, 2, 235-242.
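As a rough illustration of the F-counting exercise, the sketch below tallies the letter F word by word in the sentence from the earlier slide. The function name `count_letter_by_word` is an assumption made for this example and is not part of the original talk. It shows that half of the F's sit in the very common word OF, which is where readers tend to miss them.

```python
from collections import Counter

# Sentence from the "Identifying Letters in Words" slide.
TEXT = (
    "FINISHED FILES ARE THE RESULT OF "
    "YEARS OF SCIENTIFIC STUDY COMBINED WITH "
    "THE EXPERIENCE OF YEARS"
)


def count_letter_by_word(text: str, letter: str) -> Counter:
    """Return how often `letter` occurs, summed per distinct word of `text`."""
    counts = Counter()
    for word in text.split():
        n = word.count(letter)
        if n:
            counts[word] += n
    return counts


f_counts = count_letter_by_word(TEXT, "F")
total = sum(f_counts.values())  # all F's in the sentence
in_of = f_counts["OF"]          # F's hidden in the word OF

print(total, in_of)  # prints: 6 3
```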
8 Identifying Words
rscheearch ltteer waht lteter oredr wrod pclae
deosn't olny tihs taht frist uinervtisy lsat
rghit rset toatl mttaer mses iprmoetnt raed
aoccdrnig wouthit porbelm cmabrigde ltteers
bcuseae huamn deos raed sitll mnid ervey istlef
tihng wrod wlohe
12 Identifying Words in Context
Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy
it deosn't mttaer in waht oredr the ltteers in a
word are, the olny iprmoetnt tihng is taht the
frist and lsat ltteer be at the rghit pclae. The
rset can be a toatl mses and you can sitll raed
it wouthit porbelm. Tihs is bcuseae the huamn
mnid deos not raed ervey lteter by istlef, but
the wrod as a wlohe.
Rawlinson, G. E. (1976). The significance of letter position in word recognition. Unpublished PhD Thesis, Psychology Department, University of Nottingham, Nottingham, UK.
http://www.mrc-cbu.cam.ac.uk/mattd/Cmabrigde/
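The jumbling used in the passage above can be sketched in a few lines: keep the first and last letter of each word and shuffle the letters in between. This is only a minimal sketch. The names `jumble_word` and `jumble_text` are assumptions for illustration, punctuation is not handled (the example sentence has none), and the output depends on the random seed.

```python
import random


def jumble_word(word: str, rng: random.Random) -> str:
    """Shuffle the interior letters of `word`, keeping first and last in place."""
    if len(word) <= 3:
        # Words of up to three letters have at most one interior letter.
        return word
    middle = list(word[1:-1])
    rng.shuffle(middle)
    return word[0] + "".join(middle) + word[-1]


def jumble_text(text: str, seed: int = 0) -> str:
    """Jumble every whitespace-separated word of `text` (no punctuation handling)."""
    rng = random.Random(seed)
    return " ".join(jumble_word(w, rng) for w in text.split())


print(jumble_text("the human mind reads the word as a whole"))
```

Because only the interior letters move, each jumbled word still has the same length, the same first and last letter, and the same set of letters as the original.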
13 Words in Context are Easier to Recognize
- It is easier to recognize words whose letters are jumbled within an expression than to recognize isolated words with jumbled letters
- toatl
- a toatl mses
- More noise in the input can be tolerated when recognizing larger units
- If the linguistic unit can't be recognized, the meaning cannot be determined!
- Larger units facilitate recognition → larger units facilitate understanding
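One way to make the point about noise tolerance concrete is a toy recognizer that matches jumbled input by a "signature": first letter, sorted interior letters, last letter. This is a sketch under stated assumptions and not a model proposed in the talk. The word list, the phrase list, and the function names are all hypothetical. With this lexicon an isolated jumbled word can match more than one entry, while the same word inside a stored expression matches only one.

```python
from collections import defaultdict

# Hypothetical toy lexicons, chosen only for illustration.
WORDS = ["salt", "slat", "total", "mess", "a", "pinch", "of"]
PHRASES = ["a total mess", "a pinch of salt"]


def signature(word: str) -> str:
    """First letter + sorted interior letters + last letter."""
    if len(word) <= 2:
        return word
    return word[0] + "".join(sorted(word[1:-1])) + word[-1]


# Index single words by signature.
index = defaultdict(list)
for w in WORDS:
    index[signature(w)].append(w)


def word_candidates(jumbled: str) -> list:
    """All lexicon words compatible with an isolated jumbled word."""
    return sorted(index.get(signature(jumbled), []))


def phrase_candidates(jumbled_phrase: str) -> list:
    """All stored phrases compatible with a jumbled multiword input."""
    sig = [signature(t) for t in jumbled_phrase.split()]
    return [p for p in PHRASES if [signature(w) for w in p.split()] == sig]


print(word_candidates("slat"))                 # ['salt', 'slat']
print(phrase_candidates("a pcinh of slat"))    # ['a pinch of salt']
print(phrase_candidates("a toatl mses"))       # ['a total mess']
```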
15 What's Wrong With Compositional Semantics!
- The meaning of an expression is composed from the meanings of words recognized independently
- Meaning of "black cat" equals meaning of "black" + meaning of "cat"
- MWEs must be treated as exceptions
- Meaning of "black ice" does not equal meaning of "black" + meaning of "ice"
- "black ice" is actually clear, not black!
- Why not recognize the largest units of meaning and simplify the problem?
- Don't treat MWEs as exceptions
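Recognizing the largest unit first can be sketched as longest-match segmentation against a lexicon that stores MWEs alongside single words. The following is a minimal sketch, not the author's system. The lexicon entries, their glosses, and the function name `segment` are hypothetical and chosen only to mirror the slide's examples.

```python
# Hypothetical MWE lexicon: token tuples mapped to illustrative glosses.
MWE_LEXICON = {
    ("black", "ice"): "thin, transparent ice on a road",
    ("take", "place"): "occur",
    ("take", "a", "hike"): "go away",
}
MAX_LEN = max(len(key) for key in MWE_LEXICON)


def segment(tokens: list) -> list:
    """Split tokens into units, preferring the longest stored MWE at each position."""
    units = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first, down to two-word candidates.
        for n in range(min(MAX_LEN, len(tokens) - i), 1, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in MWE_LEXICON:
                units.append(candidate)
                i += n
                break
        else:
            # No MWE starts here: fall back to the single word.
            units.append((tokens[i],))
            i += 1
    return units


print(segment("the car hit black ice".split()))
# [('the',), ('car',), ('hit',), ('black', 'ice')]
print(segment("black cat".split()))
# [('black',), ('cat',)]
```

In this sketch "black ice" is retrieved as one unit, so the meaning of "black" is never composed in, while "black cat" still falls through to ordinary word-by-word composition.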
16 High Frequency Words
- The meaning of high frequency words like "take" and "have" cannot be determined in isolation from the expressions in which they occur
- Take "take" for instance
- Take a hike
- Take five
- Take place
- Have a blast
- Don't have a cow
- Have at it
17 High Frequency Words
- Why are high frequency words the most ambiguous?
- It isn't possible to have a separate word for every concept that may need to be expressed
- Some words must be used in the expression of multiple concepts
- The words used in the expression of multiple concepts are necessarily ambiguous and tend to be high frequency
18 Syllables, Morphemes or Words?
- Irrelevant, but possible words or morphemes within words are better recognized as meaningless syllables
- It does not make sense to try to compose the meaning of "carpet" from the meanings of "car" and "pet"!
- How do we avoid recognizing "car" and "pet" as meaningful?
- Words in MWEs often function more like meaningless syllables than independent meaningful units!
- The meanings of "ad" and "hoc" in "ad hoc"
- Although "ad" and "hoc" have meanings in Latin
- The meaning of "blue" in "blue moon"
- Even if the meaning of "blue" is initially activated by "blue", it is not part of the meaning of "blue moon"
19 Syllables, Morphemes, Words or Expressions?
- No sharp divide between syllables, morphemes, words and expressions
- "nonetheless" vs. "none the less"
- Is "none" a syllable or morpheme in "nonetheless" or a word in "none the less"?
- "whatever" vs. "what ever"
- "alot" vs. "a lot"
- "whatchamacallit" vs. "what do you call it"
20 What are Acronyms?
- Acronyms are MWEs that are perceptually re-encoded as a sequence of letters (written) or syllables corresponding to letters (spoken)!
- AFMC vs. Air Force Materiel Command
- Acronyms allow a single perceptual unit to encode an entire MWE!
- Overcome limitations of visual and aural perceptual span
21 Frequency of Multiword Expressions
- Conventionalized Expressions
- We say "baked potato" and "roast beef", not "baked beef" or "roast potato" (although "roast potatoes" is OK)
- 25% of expressions are conventionalized ways of saying things!
- Erman, B. & Warren, B. (1999). The idiom principle and the open choice principle. Text, Vol. 20, pp. 29-62
- Formulaic Language
- As much as 70% of our adult native language may be formulaic! (Altenberg, 1990)
- Wray, A. & Perkins, M. (2000). The functions of formulaic language: an integrated model. Language & Communication, 20, pp. 1-29
- Altenberg, B. (1990). Speech as linear composition. Proceedings of the Fourth Nordic Conference for English Studies
22 Frequency of Multiword Expressions
- The number of MWEs in a speaker's lexicon is of the same order of magnitude as the number of single words
- Jackendoff, R. (1997). The Architecture of the Language Faculty. Cambridge, MA: The MIT Press.
- In WordNet, 41% of entries are multiword
- The number of MWEs increases in specialized domains and acronyms are ubiquitous!
- AFRL vs. Air Force Research Laboratory
- BRAA vs. Bearing, Range, Altitude, Aspect
- Sag, I., Baldwin, T., Bond, F., Copestake, A. and Flickinger, D. (2002). Multiword Expressions: A Pain in the Neck for NLP. In Proceedings of the Third International Conference on Intelligent Text Processing and Computational Linguistics
23 Processing Efficiency
- There really isn't time to process spoken input one word at a time
- Word-based compositionality is computationally too expensive
- Even if each word in a 20 word sentence has only 3 meanings (on average), there are 3^20 (about 3.5 billion) possible combinations!
- Extensive search is not a cognitively viable option
- There must be constraints that minimize the number of alternatives
- MWEs offer one such constraint
- MWEs are directly retrievable from memory, reducing the amount of processing required to determine meaning
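The size of the search space can be checked with a short calculation. If every unit has the same number of meanings, the number of combinations is that number raised to the power of the number of units. The sketch below assumes, purely for illustration, that recognizing MWEs reduces a 20 word sentence to 12 units that are still three-ways ambiguous. The figure 12 and the function name `count_readings` are hypothetical.

```python
def count_readings(n_units: int, meanings_per_unit: int) -> int:
    """Number of meaning combinations when every unit is equally ambiguous."""
    return meanings_per_unit ** n_units


# Word by word: 20 words, 3 meanings each.
word_by_word = count_readings(20, 3)

# Hypothetical: MWEs group the same 20 words into 12 larger units.
with_mwes = count_readings(12, 3)

print(word_by_word)               # 3486784401
print(with_mwes)                  # 531441
print(word_by_word // with_mwes)  # 6561
```

Even under this conservative assumption, where larger units are no less ambiguous than single words, the number of alternatives drops by a factor of several thousand.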
24 Processing Efficiency
- Humans can recognize letters in words more rapidly than letters in isolation
- Word Superiority Effect
- Can humans recognize words in MWEs more rapidly than words in isolation?
- Multiword Superiority Effect?
- Suggested by our ability to complete unfinished MWEs without seeing or hearing the entire final word
- kicked the bu
- spill the b
- Suggested by the Cambridge Study example
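Completion of an unfinished MWE can be sketched as prefix lookup in a stored list of expressions. This is a minimal sketch under stated assumptions. The stored expressions, including the full forms "kicked the bucket" and "spill the beans", are supplied here as assumptions for illustration, and the function name `complete` is hypothetical.

```python
# Hypothetical stored expressions; the full forms are assumed for this example.
MWES = [
    "kicked the bucket",
    "spill the beans",
    "take place",
    "have a blast",
]


def complete(fragment: str, lexicon=MWES) -> list:
    """Return stored expressions that start with `fragment` and extend it."""
    return [m for m in lexicon if m.startswith(fragment) and m != fragment]


print(complete("kicked the bu"))  # ['kicked the bucket']
print(complete("spill the b"))    # ['spill the beans']
```

When a fragment matches exactly one stored expression, the final word can be supplied before the input is complete, which is the behaviour the slide points to.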
25 Processing Efficiency
- Perceptual processing is constrained by the visual perceptual span in reading and the size of the phonological buffer in speech
- Mechanisms that shorten the visual and aural span should facilitate processing
- Mechanisms that link perceptual units to larger units of meaning should facilitate processing
26 Processing Efficiency
- Acronyms and abbreviations support efficient processing
- HE vs. Human Effectiveness Directorate
- AFRL/HE vs. Air Force Research Laboratory
- They achieve this by associating a perceptual unit with a larger unit of meaning
- "HE" is perceived as a unit
- "HE" is stored as a unit and linked to "Human Effectiveness Directorate", which is also stored as a unit
- Sometimes the original expression is lost or modified
- AOC vs. Air and Space Operations Center
- RADAR vs. ??? (Radio Detection and Ranging)
27 Processing Efficiency
- Recognition of larger units competes with recognition of smaller units
- If the larger unit is recognized first, the smaller unit remains implicit unless the task requires accessing it
- Recognition of smaller units of meaning is detrimental to understanding in many cases!
- Irrelevant meanings
- "car" in "carpet"
- "a" in "a priori"
- Literal interpretation of non-literal language
- Have a nice day!
- I wasn't going to, but if you say so!
28 Processing Efficiency
- MWE Storage
- Humans have a powerful associative memory
- Storage of frequently occurring MWEs is psychologically plausible
- MWE Perception
- MWEs may be holistically perceivable
- Perhaps in a single fixation when reading
- Advantage of acronyms and abbreviations in English
- In written Hebrew, only consonants are written, which should facilitate recognition of MWEs
- Via some concatenation mechanism in speech
29 Why MWEs are good!
- The larger the linguistic unit, the less likely it is to be ambiguous
- The larger the linguistic unit, the less susceptible it is to noise
- The larger the linguistic unit, the more rapidly it can be recognized relative to individually recognizing the lower level elements of the unit
- Bigger is better!
30 Summary
- Humans have little difficulty understanding MWEs
- NLP systems should be designed to handle MWEs as part and parcel of what they do, not treat them as exceptions that are a pain in the neck!
- The result will be better NLP systems!
31 Questions?
32 Perceiving Larger Linguistic Units
- Phonological Loop
- 2 seconds of spoken input
- Baddeley, A. (???)
- Visual fixations
- 4 letters to left of fixation
- 9 letters to right of fixation
- Carpenter & Just (???).
33 Storing Larger Linguistic Units
- Long-Term Memory
- About 4 Distinct Units in a Single Declarative Memory Chunk
- Hierarchically organized
- No limit to depth of hierarchy
- Short-Term Working Memory
- Phonological Loop
- Visuo-Spatial Sketch Pad
34 Change in Meaning → Change in Form
- Changes in meaning often result in changes in form
- Grammaticization Processes
- going to → gonna
- want to → wanna
- Bybee (2001) explains the processes of reduction and drift by which frequently co-occurring words come to have unique phonology (i.e. perceptual form) and meaning
- Bybee, J. (2001). Phonology and Language Use. Cambridge, UK: Cambridge University Press
- Specialized Uses Lead to Specialized Pronunciation
- "Whatever" used as a negative response
- "Bad" used to mean "Good"