An Approach to Catalan Adjective Classes by Clustering

1
An Approach to Catalan Adjective Classes by
Clustering
  • Laura Alonso Alemany
  • Universitat de Barcelona
  • lalonso_at_fil.ub.es
  • Gemma Boleda Torrent
  • Universitat Pompeu Fabra
  • gemma.boleda_at_trad.upf.es

2
motivation
  • to search for empirical (corpus-based) support
    for theories of adjective classification via
    data-driven methods
  • to enhance a lexicon with information on
    adjective classes in an inexpensive and reliable
    way

3
contents
  • introduction
  • previous theoretical work
  • a preliminary hypothesis
  • experiments on clustering adjectives
  • results and discussion

4
introduction
  • hypothesis 0: a single class of adjectives
  • BUT: heterogeneous behaviour of adjectives
  • La noia és (molt) alta 'the girl is (very) tall'
  • La bandera és nacional 'the flag is national'
  • L'assassí és presumpte 'the murderer is alleged'

5
why clustering
  • clustering has been used for inferring knowledge
    in not-so-well-known domains
  • verbal subcategorization and selectional
    restrictions (Schulte im Walde and Brew 2002)
  • inference of POS tags for unknown languages
  • it introduces little bias into the final results
  • there are no pre-defined classes (as opposed to
    classification methods; see Bohnet et al. 2002)
  • ... but bias in modelling the data

6
problems with clustering
  • it is a data-driven technique, but the
    appropriate degree of abstraction must be chosen
  • completely data-driven approaches are possible,
    but
  • the search space becomes far too big
  • they are very sensitive to data sparseness

7
contents
  • introduction
  • previous theoretical work
  • a preliminary hypothesis
  • experiments on clustering adjectives
  • results and discussion

8
two traditions
  • two main scholarly traditions regarding the study
    of adjectives
  • descriptive grammar
  • morphology (derivational processes) and syntax
    (ordering among adjectives and with respect to
    head)
  • denotational semantics
  • formal semantics
  • semantic type (modifier or predicate)

9
classifications
10
qualitative / <e,t>
  • predicative (syntactic version: Levi 1978)
  • red house / this house is red
  • national flag / this flag is national
  • alleged murderer / this murderer is alleged
  • gradable / comparable
  • very red / redder, reddish
  • scalar (Raskin and Nirenburg 1995)
  • red/green/blue, big/small
  • in Catalan, typically following the head noun

11
adverbial / <<e,t>, <e,t>>
these parameters seem to be relevant; we'll use
them in experiments
  • nonpredicative
  • alleged murderer / this murderer is alleged
  • nongradable, noncomparable
  • very/more alleged murderer
  • nonscalar
  • and no antonym
  • in Catalan, only preceding the head noun

12
on adjective position
  • the position of the adjective in Catalan and in
    other Romance languages is related to reference
    restriction (GCC, GDLE)
  • prenominal → nonrestricting
  • postnominal → restricting
  • very few strict nonpredicative adjectives
  • usual case: mixed behaviour, with shift in
    meaning (potential problem!)
  • antic president 'former president'
  • nonpredicative reading
  • armari antic 'antique wardrobe'
  • qualitative reading

13
a gap: relational
  • a.o. Bally 1944, GDLE, GCC, Engel 1988, Levi 1978

la màquina és agrícola THE MACHINE IS AGRICULTURAL
una màquina gran agrícola vs. una màquina agrícola gran A MACHINE AGRICULTURAL BIG
una màquina agrícola i gran A MACHINE AGRICULTURAL AND BIG
14
a gap: relational
  • predicativity: mixed behaviour
  • El congrés és internacional → <e,t>
  • THE CONFERENCE IS INTERNATIONAL
  • La Joana és corresponsal internacional
  • THE JOANA IS INTERNATIONAL CORRESPONDENT
  • La Joana és internacional ↛ <e,t>
  • La corresponsal és internacional ↛ <e,t>
  • ambiguity / class shift or property?

15
a gap: relational
  • gradability and comparativity
  • said to be nongradable and noncomparable, but very
    easy qualitativization
  • un tractor molt agrícola
  • A TRACTOR VERY AGRICULTURAL
  • una noia molt internacional
  • A GIRL VERY INTERNATIONAL
  • (has travelled a lot, knows many people
    from abroad)
  • could reflect diachronic processes

these facts could explain the results, at least in
part
16
contents
  • introduction
  • previous theoretical work
  • a preliminary hypothesis
  • experiments on clustering adjectives
  • results and discussion

17
adjective classes
  • hypothesis: three classes of adjectives
  • qualitative
  • non predicative
  • relational

qualitative: vermell 'red', alt 'tall'
non-predicative: presumpte 'alleged'
relational: agrícola 'agricultural'
18
challenges
  • does this classification have empirical
    (corpus-based) support?
  • can adjectives be automatically classified using
    the features reviewed?
  • which are the most relevant features for
    adjective classification?

19
contents
  • introduction
  • previous theoretical work
  • a preliminary hypothesis
  • experiments on clustering adjectives
  • results and discussion

20
modelling adjectives
  • find a textual correlate of theoretical
    parameters that describe semantic classes
  • in terms of morphosyntactic data
  • retrievable from an annotated corpus
  • it is not always possible
  • and careful with redundant features!
  • values are difficult to set adequately

21
the set of attributes
  • follows a verb
  • cooccurs with molt 'very' and the like
  • form inflected by size morphemes
  • cooccurs with més/menys 'more/less'
  • form inflected by superlative morpheme -íssim
  • precedes or follows a noun
  • precedes or follows an adjective
  • predicativity
  • gradability, scalability
  • comparativity, scalability
  • reference restriction
  • ref. restr. / relative ordering
  • distributional properties
  • POS of surrounding words (five word window)
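
The distributional attribute in the last bullet can be sketched as follows. This is a minimal illustration, not the authors' extraction code: it assumes a corpus already tagged as (lemma, POS) pairs, reads "five word window" as two words on each side of the adjective, and uses a hypothetical `"no"` tag when the window runs past the sentence edge. The function name, the toy sentences, and the tag labels are all assumptions.

```python
from collections import Counter

def context_features(sentences, lemma, window=2):
    """Relative frequency of each POS tag at each offset around `lemma`.

    `sentences` is a list of sentences, each a list of (lemma, pos) pairs.
    Offsets run from -window to +window, skipping 0 (the adjective itself).
    """
    counts = Counter()
    total = 0
    for sent in sentences:
        for i, (lem, pos) in enumerate(sent):
            if lem != lemma or pos != "Adj":
                continue
            total += 1
            for off in range(-window, window + 1):
                if off == 0:
                    continue
                j = i + off
                # "no" marks a position outside the sentence (assumed convention).
                tag = sent[j][1] if 0 <= j < len(sent) else "no"
                counts[(off, tag)] += 1
    if total == 0:
        return {}
    return {key: n / total for key, n in counts.items()}

# Toy tagged sentences, invented for illustration.
sentences = [
    [("un", "Det"), ("tractor", "Nom"), ("agrícola", "Adj"), ("gran", "Adj")],
    [("la", "Det"), ("màquina", "Nom"), ("ser", "Verb"), ("agrícola", "Adj")],
]
feats = context_features(sentences, "agrícola")
```

Each offset yields a distribution over tags that sums to 1, so every value already falls in the 0 to 1 range the vectors require.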

22
corpus: fragment of CTILC
  • collected by the Institute for Catalan Studies
    (IEC)
  • 8.5 million words
  • Catalan texts from 1970 onwards
  • only written, quite formal register
  • manually revised tagging (but there are errors!!)
  • lemma, part-of-speech, morphological info (EAGLES
    standard)
  • no syntactic information
  • 571365 adjective occurrences (tokens)
  • 17325 adjective lemmata (types)

23
data and tools
  • each adjective is described as a vector
  • where each dimension is one of the features
    relevant for characterising the adjective
  • the values of the features are a real value
    between 0 and 1
  • a matrix is built with all the vectors
  • perform the clustering with CLUTO (Karypis 2002)
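
A minimal sketch of the vector and matrix construction described above, assuming each feature value is the share of an adjective's occurrences that show the feature. The counts for verd are back-computed from the example vector shown later in the deck (value times 181 occurrences); the second adjective, its counts, and the helper names are hypothetical.

```python
def proportion(hits, occurrences):
    """Share of an adjective's occurrences showing a feature, in [0, 1]."""
    return hits / occurrences if occurrences else 0.0

def build_matrix(vectors):
    """Turn {adjective: {feature: value}} into (row labels, column labels, rows)."""
    columns = sorted({name for vec in vectors.values() for name in vec})
    labels = sorted(vectors)
    # Features never observed for an adjective get the value 0.
    rows = [[vectors[adj].get(name, 0.0) for name in columns] for adj in labels]
    return labels, columns, rows

# Counts for "verd" are back-computed from the deck's example vector;
# "hypothetical_adj" and its counts are invented for illustration.
counts = {
    "verd": (181, {"serestarsemblarpredicatiu": 7, "comparativitat": 0,
                   "gradabilitat": 3, "modificador_dreta": 4,
                   "modificador_esquerra": 162}),
    "hypothetical_adj": (40, {"modificador_dreta": 36, "modificador_esquerra": 2}),
}
vectors = {adj: {f: proportion(n, total) for f, n in feats.items()}
           for adj, (total, feats) in counts.items()}
labels, columns, rows = build_matrix(vectors)
```

Fixing the column order once, as `build_matrix` does, keeps every row comparable, which is what a clustering tool reading a plain numeric matrix needs.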

24
experiment setting
  • set of objects: only frequent adjectives (4859
    objects, 10 occurrences)
  • set of attributes:
  • (1) only textual correlates of semantic properties
  • (2) only context of occurrence
  • combination of (1) and (2) / with customized values
  • attribute values: true percentages
  • number of clusters: 2, 3, 4, 5, 6, 7
  • clustering parameters:
  • combination of external/internal (E/I) criteria,
    partitional algorithm
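
The sweep over cluster counts can be sketched as below. The experiments used CLUTO; this sketch swaps in plain k-means with squared Euclidean distance as a stand-in partitional algorithm, so it does not reproduce CLUTO's criterion functions. The farthest-first initialisation, the function names, and the toy rows are assumptions made to keep the example deterministic.

```python
def sq_dist(u, v):
    return sum((a - b) ** 2 for a, b in zip(u, v))

def kmeans(rows, k, iterations=20):
    """Plain k-means; returns one cluster id per row."""
    # Farthest-first initialisation: start at row 0, then repeatedly take
    # the row farthest from all centroids chosen so far.
    centroids = [list(rows[0])]
    while len(centroids) < k:
        far = max(rows, key=lambda r: min(sq_dist(r, c) for c in centroids))
        centroids.append(list(far))
    assignment = [0] * len(rows)
    for _ in range(iterations):
        # Assign each row to its nearest centroid.
        for i, row in enumerate(rows):
            assignment[i] = min(range(k), key=lambda c: sq_dist(row, centroids[c]))
        # Recompute each centroid as the mean of its members.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if assignment[i] == c]
            if members:  # keep the old centroid if a cluster went empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment

# Toy two-feature vectors, invented for illustration.
rows = [[0.9, 0.0], [0.85, 0.05], [0.8, 0.1], [0.95, 0.02],
        [0.1, 0.9], [0.05, 0.95], [0.0, 0.8], [0.12, 0.85]]
solutions = {k: kmeans(rows, k) for k in range(2, 8)}
```

Running every value of k on the same matrix, as in `solutions`, is what makes the later comparison across solutions possible.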

25
gold standard
nonpredicative: very few, not represented → added
manually
  • annotated by human judges
  • 76 adjectives chosen randomly from the corpus
  • classified by human judges into 4+1 classes
  • qualitative: calent 'hot', actiu 'active/lively'
  • relational: científic, digital
  • qualitative/non-predicative: antic
  • non-predicative: presumpte 'alleged', mer 'mere'
  • errors: artista
  • costly process, only a small number of adjectives
    can be considered

26
contents
  • introduction
  • previous theoretical work
  • a preliminary hypothesis
  • experiments on clustering adjectives
  • results and discussion

27
semantic parameters vs. gold standard
467 3040 229
593 787
28
semantic parameters vs. gold standard
gradability 0.07, comparativity 0: millor 'best', eixerit 'nice, lively'
preceding common noun 0.06, after common noun 0.49: presumpte 'alleged', antic 'former/old/antique'
after common noun 0.49, comparativity 0: important, subversiu 'subversive'
after noun 0.54, comparativity 0: alemany 'German', internacional
predicativity 0.1, comparativity 0, after Adj 0.03: possible, necessari
467 3040 229
593 787
29
contextual vs. semantic attributes
contextual
2107 697 290
593 1172
467 3040 229
336 787
semantic
30
contextual vs. semantic attributes
-1 common noun 0.5, -2 determiner 0.34: general, negre 'black', alemany 'German', internacional
+1 punctuation 0, -1 noun 0.5: preescolar, subversiu
+1 punctuation, -1 adv: possible, hot
+1 noun, -1 determiner: mer 'mere', antic
contextual
2107 697 290
593 1172
+1 prep 0.3, +2 determiner 0.25: important, necessari, diagonal
467 3040 229
336 787
semantic
31
agreement between solutions
32
homogeneity of adjective classes vs. gold
standard
semantic parameters and context
contextual attributes
semantic parameters
customized values
33
questions
  • which is the best clustering solution?
  • which attributes are actually descriptive of
    adjective behaviour?
  • which are noisy?
  • which classes receive empirical support?

34
discussion
  • contextual and semantic features yield quite
    similar results, although
  • semantic features seem to be more adequate
  • contextual are stronger!
  • the most discriminating attribute is position of
    the adjective with respect to the noun
  • why are some others not discriminating?
    (modelling)
  • noisy
  • preposition follows
  • punctuation follows

35
discussion
  • clustering is a useful technique for inductive
    investigation on adjective classes
  • which hadn't been done before
  • theoretically biased results are supported by
    distributional properties

36
discussion
  • the following classes of adjectives emerge from
    the results
  • nonpredicative (with few elements)
  • relational
  • consistent behaviour
  • similar to a part of the qualitative
  • could reflect a diachronic process or class shift
  • or a bad modelling of the adjectives

37
discussion
  • qualitative adjectives as described in the
    literature are not homogeneous
  • predicativity, gradability and comparativity are
    not distributed uniformly in these adjectives
  • distributional properties are not uniform either

unexpected?
38
future work
  • further linguistic investigation of results
  • other clustering solutions
  • evaluation

39
references
  • Bally, C. (1944) Linguistique générale et
    linguistique française
  • Bohnet, B., S. Klatt and L. Wanner (2002) An
    Approach to Automatic Annotation of Functional
    Information to Adjectives with an Application to
    German
  • GDLE: Bosque, I. and V. Demonte, eds. (1999)
    Gramática Descriptiva de la Lengua Española
  • Engel, U. (1988) Deutsche Grammatik, Heidelberg:
    Julius Groos Verlag
  • Levi, J. N. (1978) The Syntax and Semantics of
    Complex Nominals
  • Montague, R. (1974) Formal Philosophy. Selected
    Papers of Richard Montague
  • Raskin, V. and S. Nirenburg (1995) Lexical
    Semantics of Adjectives. A Microtheory of
    Adjectival Meaning
  • Schulte im Walde, S. and C. Brew (2002) Inducing
    German Semantic Verb Classes from Purely
    Syntactic Subcategorisation Information
  • GCC: Solà, J. et al., eds. (2002) Gramàtica del
    Català Contemporani

40
a vector
  • verd 181
  • serestarsemblarpredicatiu 0.0386740331491713 (predicativity)
  • comparativitat 0 (comparativity)
  • gradabilitat 0.0165745856353591 (gradability)
  • modificador_dreta 0.0220994475138122 (right modifier)
  • modificador_esquerra 0.895027624309392 (left modifier)
  • menys2_Adj 0.0497237569060773
  • menys2_Adv 0.00828729281767956
  • menys2_Conj 0.00552486187845304
  • menys2_Det 0.281767955801105
  • menys2_Esp 0.0441988950276243
  • menys2_Nom 0.0911602209944751
  • menys2_Num 0
  • menys2_PT 0.0607734806629834
  • menys2_Prep 0.187845303867403
  • menys2_Pron 0.0110497237569061
  • menys2_Verb 0.25414364640884
  • menys2_no 0.00552486187845304
  • menys1_Adj 0.00276243093922652
  • menys1_Adv 0.0110497237569061
  • menys1_Conj 0.0276243093922652
  • menys1_Det 0.0276243093922652
  • menys1_Esp 0
  • menys1_Nom 0.81767955801105
  • menys1_Num 0
  • menys1_PT 0.0110497237569061
  • menys1_Prep 0.00552486187845304
  • menys1_Verb 0.0524861878453039
  • menys1_no 0.0331491712707182
  • mes1_Adj 0.00828729281767956
  • mes1_Adv 0.0441988950276243
  • mes1_Conj 0.0718232044198895
  • mes1_Det 0.0386740331491713
  • mes1_Esp 0.0331491712707182
  • mes1_Nom 0.0267034990791897
  • mes1_PT 0.320441988950276
  • mes1_Prep 0.366482504604052
  • mes1_Pron 0.0220994475138122
  • mes1_Verb 0.0460405156537753
  • mes1_no 0.0220994475138122
  • mes2_Adj 0.0607734806629834
  • mes2_Adv 0.00552486187845304
  • mes2_Conj 0.0552486187845304
  • mes2_Det 0.276243093922652
  • mes2_Esp 0.069060773480663
  • mes2_Nom 0.160220994475138
  • mes2_PT 0.0718232044198895
  • mes2_Prep 0.124309392265193
  • mes2_Pron 0.0303867403314917
  • mes2_Verb 0.140883977900552
41
the matrix
  • 0 0 0 0 0 0.0833333333333333 0 0 0.75 0 0
    0.166666666666667 0 0 0 0 0 0 0 0 0 1
  • 0.2 0 0.0666666666666667 0.0666666666666667 0 0
    0.0666666666666667 0 0.466666666
  • 0.384615384615385 0 0.153846153846154 0 0 0 0
    0.0769230769230769 0.2307692307692
  • 0.11 0 0.075 0.015 0.005 0.12 0.04 0.04 0.33 0.13
    0.01 0.15 0.01 0.06 0 0.01 0.1
  • 0 0 0 0 0 0 0.0133333333333333 0 0.88 0 0
    0.0933333333333333 0 0.013333333333333
  • 0.192307692307692 0.0384615384615385
    0.0384615384615385 0 0 0.0769230769230769 0
  • 0.117647058823529 0 0.0784313725490196 0 0
    0.196078431372549 0.0196078431372549
  • 0 0 0 0.0789473684210526 0.0263157894736842
    0.105263157894737 0 0 0.368421052631
  • 0 0 0 0 1 0.0588235294117647 0 0.294117647058824
    0 0 0 0 0 0.588235294117647 0.0
  • 0.0952380952380952 0.0476190476190476
    0.0476190476190476 0.0476190476190476 0.04
  • 0 0 0.0681818181818182 0 0 0.204545454545455
    0.0681818181818182 0 0.272727272727
  • 0.0769230769230769 0 0 0 0 0.230769230769231 0 0
    0.461538461538462 0 0 0.2307692
  • 0.04 0 0.08 0 0 0.28 0 0 0.4 0.04 0 0.04 0 0.08 0
    0 0.16 0 0.12 0.2 0.2 0.36 0 0
  • 0.293333333333333 0 0.04 0.0133333333333333 0
    0.08 0.0133333333333333 0.02666666
  • 0.133333333333333 0 0 0 0 0.133333333333333 0
    0.0666666666666667 0.4666666666666
  • 0 0 0 0.0909090909090909 0.0909090909090909
    0.181818181818182 0 0.09090909090909
  • 0.0434782608695652 0 0.130434782608696
    0.0434782608695652 0 0.130434782608696 0
  • 0.104166666666667 0 0.0625 0 0.0208333333333333
    0.0208333333333333 0.0625 0.1041
  • 0.0526315789473684 0 0.105263157894737
    0.0526315789473684 0 0 0.105263157894737

42
CLUTO (v. 1.5.1, Karypis 2002)
  • high dimensional datasets
  • analysis of cluster features
  • partitional or agglomerative algorithms
  • various criterion functions, taking into account
    similarity within the objects in a cluster
    (internal criterion) and/or the differences
    between objects of different clusters (external
    criterion)

partitional
combination of internal and external criteria
43
human gold standard: inter-judge agreement
44
contextual attributes vs. gold standard
agreement with semantic attributes
45
contextual attributes vs. gold standard
following common noun (50), following specifier (34)
preceding common noun (7), following specifier (7)
preceding preposition (30), preceding specifier (25)
preceding punctuation (40), following adverb or verb
not preceding punctuation, following common noun (50)
human
agreement with semantic attributes
46
customized values: gradability and comparativity
normalized to binary
gradability (61), after common noun (10), followed by common noun (2)
comparativity (12), followed by common noun (2), after common noun (11)
comparativity (14), gradability (61)
after common noun (11), comparativity (13), gradability (61)
gradability (65), comparativity (12)
47
customized values: gradability and comparativity
normalized to binary
48
interpretation of results: quality of cluster
solution
  • tightness of obtained clusters
  • objects within a cluster are very similar to each
    other
  • objects are very dissimilar to objects in
    different clusters
  • attribute distribution: different values across
    clusters are evidence of the discriminating
    function of attributes
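
The two tightness bullets can be made concrete with a small sketch. It assumes cosine similarity between vectors and reports, per cluster, the mean similarity among its members and the mean similarity between its members and all other objects. This only illustrates the idea and is not CLUTO's own report; the function names and toy rows are assumptions.

```python
import math

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def tightness(rows, assignment):
    """Per cluster: (mean similarity inside the cluster, mean similarity to the rest)."""
    report = {}
    for c in sorted(set(assignment)):
        inside = [r for r, a in zip(rows, assignment) if a == c]
        outside = [r for r, a in zip(rows, assignment) if a != c]
        # All unordered pairs of distinct members of the cluster.
        internal = [cosine(u, v) for i, u in enumerate(inside) for v in inside[i + 1:]]
        # All pairs of one member and one non-member.
        external = [cosine(u, v) for u in inside for v in outside]
        report[c] = (
            sum(internal) / len(internal) if internal else 1.0,
            sum(external) / len(external) if external else 0.0,
        )
    return report

# Toy vectors and cluster ids, invented for illustration.
rows = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
report = tightness(rows, [0, 0, 1, 1])
```

A solution looks tight when the first number of each pair is high and the second is low.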

49
tightness of clustering solutions
50
attribute distribution across clusters
51
attribute distribution across clusters
52
attribute distribution across clusters
53
decision list
  • a gold standard annotated by human judges
  • a gold standard built with a decision list
  • deductive classification using some of the
    attributes in the vectors for classifying
    adjectives into pre-defined classes
  • predicativity
  • position with respect to the head noun
  • gradability and comparativity
  • fully automatic: inexpensive but unsupervised
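
A decision list of this kind might look like the sketch below. The rule order and every threshold are hypothetical, since the slides do not state them. The feature names follow the example vector in the deck, and the sketch assumes that `modificador_dreta` measures how often the adjective precedes its noun, which is an inference from that vector rather than something the slides state.

```python
def classify(vec, prenominal_min=0.5, predicative_max=0.05, gradable_min=0.01):
    """Ordered rules; the first rule that fires decides the class.

    All thresholds are illustrative assumptions, not values from the slides.
    """
    # Rule 1: mostly prenominal and hardly ever predicative.
    if (vec["modificador_dreta"] >= prenominal_min
            and vec["serestarsemblarpredicatiu"] <= predicative_max):
        return "non-predicative"
    # Rule 2: shows gradability or comparativity.
    if vec["gradabilitat"] >= gradable_min or vec["comparativitat"] >= gradable_min:
        return "qualitative"
    # Default: none of the above.
    return "relational"

# The "verd" values come from the deck's example vector; the other two
# vectors are invented for illustration.
verd = {"serestarsemblarpredicatiu": 0.0386740331491713, "comparativitat": 0.0,
        "gradabilitat": 0.0165745856353591, "modificador_dreta": 0.0220994475138122}
prenominal_like = {"serestarsemblarpredicatiu": 0.0, "comparativitat": 0.0,
                   "gradabilitat": 0.0, "modificador_dreta": 0.9}
postnominal_like = {"serestarsemblarpredicatiu": 0.02, "comparativitat": 0.0,
                    "gradabilitat": 0.0, "modificador_dreta": 0.01}
predicted = [classify(v) for v in (verd, prenominal_like, postnominal_like)]
```

Because the rules are ordered, an adjective with mixed behaviour is forced into whichever class fires first, which is one way a deductive approach can go wrong on items such as antic.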

54
decision list vs. human gold standard
a deductive approach does not provide a good
solution