Title: LING 406 Intro to Computational Linguistics Computational Morphology and Morphological Theory
1 LING 406 Intro to Computational Linguistics: Computational Morphology and Morphological Theory
- Richard Sproat
- URL: http://catarina.ai.uiuc.edu/L406_08/
2 This Lecture
- Overview of a thriving debate in morphological theory, and its computational interpretation
3 Item-and-arrangement versus item-and-process
- Charles Hockett (1954), "Two models of grammatical description"
  - Item-and-arrangement: words are composed of morphemes that are put together by a kind of word syntax
  - Item-and-process: words are built up via the application of rules that add phonological and morphosyntactic information
4 For example
- The word dogs
- An item-and-arrangement analysis
  - dog[N] + s[N__, pl] → dogs[N, pl]
  - Each morpheme carries its meaning and syntactic properties with it
- An item-and-process analysis
  - Ø → s / [N, pl] __
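A minimal Python sketch of the two analyses of dogs, not taken from the lecture. The function names, the tuple layout, and the feature sets are illustrative assumptions. In the item-and-arrangement version the affix is a lexical item that contributes the plural feature; in the item-and-process version a rule spells out a plural feature that is already present. Both yield the same form and features.

```python
# Hypothetical sketch: names and data layout are assumptions for illustration.

def item_and_arrangement(stem, affix):
    """Concatenate two lexical items; the affix brings its own features."""
    stem_form, stem_feats = stem
    affix_form, affix_selects, affix_feats = affix
    # The affix subcategorizes for a category, as in s[N__, pl].
    if affix_selects not in stem_feats:
        raise ValueError("affix does not attach to this category")
    return stem_form + affix_form, stem_feats | affix_feats


def item_and_process(form, feats):
    """Apply the rule  Ø -> s / [N, pl] __  to a stem already marked plural."""
    if {"N", "pl"} <= feats:
        return form + "s", feats
    return form, feats


DOG = ("dog", frozenset({"N"}))
PLURAL_S = ("s", "N", frozenset({"pl"}))  # form, selected category, features added

ia = item_and_arrangement(DOG, PLURAL_S)
ip = item_and_process("dog", frozenset({"N", "pl"}))
print(ia == ip)
```

The only difference in this sketch is where the plural feature comes from: the affix entry in one case, the input to the rule in the other.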
5 The separation hypothesis (Beard, 1995; Beard & Volpe, 2005)
- Lexical and inflectional derivation are processes distinct from phonological realization (affixation, etc.).
- Again, there's a clear difference between item-and-process and item-and-arrangement theories
6 Stump's classification
hoots[3sg]
Ø → s / hoot[3sg]
hoots: 3sg because of -s
-s is introduced due to 3sg
7 Turkish (Hankamer, 1986)
çöplüklerimizdekilerdenmiydi
garbage-AFF-PL-1P/PL-LOC-REL-PL-ABL-INT-AUX-PST
'was it from those that were in our garbage cans?'
8 Kannada
9 Reduplication
10 Problems for IA
- Multiple exponence
  - Breton
    - bagig 'little boat'
    - bagoùigoù 'little boats'
  - Classical Greek
    - pepaideuka 'I have educated'
11 Problems for IA
- Zero exponence: Russian feminine nouns
12 But is there really a difference?
- Suppose we reduce both item-and-arrangement and item-and-process to a computational interpretation?
- Is there any difference between the two?
13 An alternative view
- In (Roark & Sproat, 2006) we reanalyze Stump's analyses of
  - Sanskrit nominal declensions
  - Swahili verbal declensions
  - Breton double plurals
- All of which purport to show the need for a realizational-inferential account
14 Sanskrit declensions
15 Sanskrit declensions
16 Issues with Sanskrit
- Nouns have two or three stems: strong, middle and (optionally) weakest
- A different series of stem alternations cross-cuts this: guna, vrddhi, and zero
  - 'foot': pad-, pād-, pd-
- strong stems may be guna or vrddhi
- middle stems may be zero, or a lexeme-specific stem
- weakest stems may be zero or a lexeme-specific stem
17 Sanskrit declensions
[Figure labels: zero, guna]
18 Sanskrit declensions
[Figure labels: vrddhi, lexeme-class particular, lexeme-class particular]
19 Further issues
- Stump argues for the Indexing Autonomy Hypothesis
  - A stem's index is independent of the form used for the stem
- Sanskrit nominal declensions are morphomic in Aronoff's sense
- Also involved are rules of referral, whereby a particular form is systematically used to represent more than one slot in the paradigm.
  - For example, in Latin the ablative and dative plural in nominal paradigms are identical, no matter what form is used for the particular paradigm
- So we have several layers of complexity here, which would seem to make an item-and-arrangement approach impossible
20 Computational analysis
21 Refactoring
22 A simpler example from (Blevins, 2003)
- Verbal d in West Germanic
  - PAST: John whacked the toadstool
  - PERF: John has whacked the toadstool
  - PASS: The toadstool was whacked
- Lexical-incremental accounts posit three homophonous affixes
- Blevins proposes a realization function that affixes /d/ to the stem
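A rough Python sketch of the contrast, under stated assumptions: the orthographic suffix "ed" stands in for /d/, and the function names and the affix table are hypothetical rather than Blevins's own formulation. One version lists three homophonous affixes, each contributing its own feature; the other uses a single realization function that fires for any of the three feature values.

```python
# Hypothetical sketch: "ed" is an orthographic stand-in for /d/.

# Lexical-incremental view: three homophonous affix entries.
LEXICAL_AFFIXES = [("ed", "PAST"), ("ed", "PERF"), ("ed", "PASS")]


def lexical_incremental(stem, feature):
    """Pick the affix entry whose contributed feature matches."""
    for form, contributed in LEXICAL_AFFIXES:
        if contributed == feature:
            return stem + form
    return stem


# Realizational view: one function, triggered by any of the three features.
D_FEATURES = {"PAST", "PERF", "PASS"}


def realize(stem, feature):
    """Affix the /d/ exponent whenever the feature is in D_FEATURES."""
    return stem + "ed" if feature in D_FEATURES else stem


forms = {f: (lexical_incremental("whack", f), realize("whack", f))
         for f in sorted(D_FEATURES)}
print(forms)
```

On this toy lexicon the two versions compute the same mapping from (stem, feature) to form; they differ only in how the table is organized.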
23 Refactorization
This is simply an incremental-lexical model
24 English agentive nominals (cf. Beard & Volpe, 2005)
- read-er, stand-ee, correspond-ent, record-ist, cook
- ε → ent / ent[noun,agentive] Σ __
- Call the set of all agentive rules R
- We can define a new metarule R' that is the union of all rules in R
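The union step can be sketched in Python as follows. This is only an illustration: the suffix-class table, `make_rule`, and `union` are hypothetical names, and each rule is modeled as a function from a base to a (possibly empty) set of outputs rather than as a finite-state transducer.

```python
# Hypothetical sketch: each agentive rule applies only to bases of its class.
SUFFIX_CLASS = {
    "read": "er",
    "stand": "ee",
    "correspond": "ent",
    "record": "ist",
    "cook": "",  # no overt suffix
}


def make_rule(suffix):
    """Build one rule: add `suffix` to bases lexically marked for it."""
    def rule(base):
        if SUFFIX_CLASS.get(base) == suffix:
            return {base + suffix}
        return set()
    return rule


# R: the set of all agentive rules.
R = [make_rule(s) for s in ("er", "ee", "ent", "ist", "")]


def union(rules):
    """R': a single metarule whose output is the union of all rule outputs."""
    def metarule(base):
        out = set()
        for r in rules:
            out |= r(base)
        return out
    return metarule


R_prime = union(R)
print(R_prime("correspond"), R_prime("cook"))
```

Because each base belongs to exactly one class in this toy table, R' returns a single form per base; a base outside the table yields the empty set.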
25 Feature [noun,agentive]
- Presumably this is also introduced by rule: call this rule M
- Then given a base B, the base with that feature specification added is given by B∘M
- Then the appropriate suffixed form is given by (B∘M)∘R'
- But this can be written, by associativity, as B∘(M∘R')
- Finally, M∘R' can be precomposed: call this R''
26 So what?
- R''
  - Introduces the morphosyntactic feature [noun,agentive]
  - Introduces the affixal morphology as appropriate to the base
- In short, R'' encodes a lexical-incremental model of morphology.
27 Why form-function mismatch?
- Affixes frequently show a non-one-to-oneness between form and function; this is rarely if ever true of lexemes
- Still, for cases like the English agentive, part of the explanation must surely be historical.
  - E.g., -ent is associated with Latinate stems, which either were used as agentives originally, or became so via lexical drift
- Is something beyond such a historical story required?
28 Summary
- Morphological theory has been divided between item-and-process and item-and-arrangement approaches
- Yet when viewed from a formal or computational point of view there is relatively little difference between these two approaches
- This again echoes Karttunen's point about the non-difference between traditional rule-based approaches to phonology and constraint-based approaches.
29 Reading