Transcript and Presenter's Notes

Title: morphological tag of some word


1
Introduction
  • morphological tag of a word
  • a combination of values of morphological
    categories that makes sense for this word
  • morphological ambiguity
  • more than one tag is possible for some word forms
  • the word form own can represent either an
    adjective or a verb
  • It's my own car. × I own a car.
  • in Czech, e.g., je can be either a verb or a
    pronoun
  • Petr je nemocný. (Petr is ill.) × Už je vidím.
    (I can already see them.)
  • morphological analyzer
  • returns the morphological tag(s) for a given word
  • morphological disambiguation/tagging
  • removing the incorrect tags
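The analyzer step above can be sketched in a few lines of Python. The lexicon and the tag names here are invented for illustration; the actual analyzer used later (ajka) draws on a full Czech morphological database.

```python
# Toy morphological analyzer: look up all tags possible for a word form.
# LEXICON and the tag names (ADJ, VERB, ...) are hypothetical placeholders.
LEXICON = {
    "own": {"ADJ", "VERB"},   # It's my own car. / I own a car.
    "je":  {"VERB", "PRON"},  # Petr je nemocný. / Už je vidím.
    "car": {"NOUN"},
}

def analyze(word_form):
    """Return all morphological tags for a word form
    (empty set if the word is unknown to the analyzer)."""
    return LEXICON.get(word_form.lower(), set())

print(sorted(analyze("own")))  # ['ADJ', 'VERB'] -> ambiguous
print(sorted(analyze("car")))  # ['NOUN']        -> unambiguous
```

Disambiguation then amounts to removing all but one of the tags returned for an ambiguous form.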

2
Morphological tagging
  • still an open task for highly inflectional
    languages
  • statistical methods and/or rule-based techniques
  • advantages of the rule-based approach
  • transparency
  • linguistic interpretability
  • manual improvability
  • independence
  • the manual design of the rules is expensive and
    requires some linguistic knowledge
  • attempts to learn the rules automatically using
    ILP and active learning (Nepil, Popelínský 2001;
    Nepil et al., 2001)
  • manually annotated learning data were still
    needed
  • a new method of generating the disambiguation
    rules using only unannotated data

3
Basic Idea
  • homonymy often has a rather accidental nature
  • ženu: the accusative of the feminine noun žena,
    or the 1st person singular of the verb hnát
  • many Czech feminines in the accusative are not
    homonymous with the 1st sg of any verb, and
    conversely...
  • a step aside
  • grammatical meaning/morphological tag is
    connected with a function of the word in a
    sentence; words with the same grammatical
    meaning have the same or similar functions
  • words sharing the same function occur in contexts
    which have some common properties
  • these properties are manifested in morphological
    tags of the words constituting these contexts

4
Basic Idea
  • Back to our example
  • all feminines in the accusative occur in similar
    contexts, which differ from the contexts of
    verbs in the 1st sg
  • the word form ženu cannot be both at the same
    time
  • to resolve a particular ambiguity between the
    tags (or sets of tags) X and Y, we search a
    corpus for words unambiguously tagged with X and
    Y, respectively, and find some common properties
    of their contexts. Then we can examine the
    context of a word ambiguously tagged with both X
    and Y and determine which of the two can be
    removed
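The corpus search described above can be illustrated as follows. The data layout and tag names are hypothetical simplifications; the actual system encodes learning examples as Prolog facts.

```python
# Collect (left-neighbour tags, right-neighbour tags) patterns around words
# that are UNAMBIGUOUSLY tagged with a target tag. Comparing the profiles
# for tags X and Y reveals context properties that separate them.
from collections import Counter

def context_profile(corpus, target_tag):
    """corpus: list of sentences; a sentence is a list of (form, tags) pairs.
    Returns a Counter over (left_tags, right_tags) context patterns."""
    profile = Counter()
    for sentence in corpus:
        for i, (_form, tags) in enumerate(sentence):
            if tags == {target_tag}:  # only unambiguous occurrences count
                left = sentence[i - 1][1] if i > 0 else {"<S>"}
                right = sentence[i + 1][1] if i + 1 < len(sentence) else {"</S>"}
                profile[(frozenset(left), frozenset(right))] += 1
    return profile

# Tiny made-up corpus: accusative nouns after a verb, before punctuation.
corpus = [
    [("vidím", {"VERB"}), ("knihu", {"NOUN_ACC"}), (".", {"PUNCT"})],
    [("mám", {"VERB"}), ("tašku", {"NOUN_ACC"}), (".", {"PUNCT"})],
]
print(context_profile(corpus, "NOUN_ACC").most_common(1))
```

An ambiguous occurrence can then be scored against the profiles of both competing tags.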

5
Training Data
  • an unlimited amount of unannotated data
  • we can afford to select only good learning
    examples
  • quite strong restrictions
  • only whole sentences, no numbers, abbreviations,
    interjections, proper names, words unknown to a
    morphological analyzer, no very short sentences,
    sentences without a finite verb, nor sentences
    with an unclear punctuation
  • even with these restrictions we get many
    non-grammatical sentences and incorrectly tagged
    words
  • learning examples are annotated by the
    morphological analyzer ajka (Sedláček, 2001),
    each word is labeled with all tags offered by the
    analyzer
  • rare readings of some frequent words are removed
    by simple lexical filters
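A much-simplified sketch of the sentence filter described above, assuming a plain tokenized sentence and a lexicon mapping words to tag sets (both hypothetical; the tag name VERB_FIN marking a finite verb is likewise invented):

```python
# Reject sentences that violate (a toy version of) the restrictions:
# very short sentences, tokens with digits, abbreviations (trailing dot),
# words unknown to the analyzer, sentences without a finite verb.
import re

def is_fine_example(sentence, lexicon):
    """sentence: list of tokens; lexicon: word -> set of tags."""
    if len(sentence) < 4:                                # very short sentence
        return False
    for word in sentence:
        if re.search(r"\d", word) or word.endswith("."):  # numbers, abbreviations
            return False
        if word.lower() not in lexicon:                   # unknown to the analyzer
            return False
    return any("VERB_FIN" in lexicon[w.lower()] for w in sentence)

lexicon = {"petr": {"NOUN"}, "je": {"VERB_FIN", "PRON"},
           "velmi": {"ADV"}, "nemocný": {"ADJ"}}
print(is_fine_example(["Petr", "je", "velmi", "nemocný"], lexicon))  # True
print(is_fine_example(["Petr", "je"], lexicon))                      # False
```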

6
Learning Algorithm
  • positive examples, negative examples, domain
    knowledge → ILP system
  • generates (induces) rules covering the positive
    examples without covering the negative ones
  • refines a particular rule according to a given
    criterion
  • learning examples for resolving X/Y ambiguity
  • sentences with some word unambiguously tagged
    X or Y
  • learning examples are encoded into Prolog facts
  • to resolve a particular ambiguity X/Y, we learn
    two sets of disambiguation rules, one for each
    tag
  • during disambiguation, whenever all rules
    covering a certain word fall into the same set,
    the respective tag is retained and the other
    removed. If both sets contain some rule covering
    the word, the more probable tag is retained.
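The retain/remove decision described above might look like this in outline. Rules are modeled as simple predicates over a word's context; the actual system uses rules induced by ILP, and the probabilities would come from corpus counts.

```python
def disambiguate(context, rules_x, rules_y, p_x, p_y):
    """Retain tag 'X' or 'Y' for an ambiguous word.

    rules_x / rules_y: the two learned rule sets (predicates on context);
    p_x / p_y: the tags' corpus probabilities, used as a fallback."""
    fires_x = any(rule(context) for rule in rules_x)
    fires_y = any(rule(context) for rule in rules_y)
    if fires_x and not fires_y:
        return "X"                       # all covering rules agree: remove Y
    if fires_y and not fires_x:
        return "Y"                       # all covering rules agree: remove X
    return "X" if p_x >= p_y else "Y"    # conflict or no rule: more probable tag

# Hypothetical rules keyed on the neighbouring tags.
rules_x = [lambda c: c["left"] == "VERB"]
rules_y = [lambda c: c["right"] == "NOUN"]
print(disambiguate({"left": "VERB", "right": "ADJ"}, rules_x, rules_y, 0.4, 0.6))  # X
```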

7
Learning Algorithm
  • Construction of the set of rules RS covering as
    many positive examples as possible and none of
    the negative ones (P and N can be chosen
    arbitrarily)
  1. Set the rule set RS to the empty set
  2. Set the rule R to a trivial rule and choose at
     most P positive examples covered by R but not
     covered by any rule from RS. End if there are
     no such examples
  3. Choose at most N negative examples covered by
     R. If there are no such examples, add R to RS
     and continue with step 2
  4. Try to refine (specialize) the rule R to the
     best advantage according to the selected
     positive and negative examples. End if there is
     no possibility of refinement, otherwise
     continue with step 3
  • The utility of a possible refinement is measured
    by the formula
    Pcov / Pall − Ncov / Nall
  • where Pcov (Ncov) stands for the count of
    positive (negative) examples covered by the
    refined rule, and Pall (Nall) for the count of
    all positive (negative) examples selected in
    steps 2 and 3
  • ILP system INDEED (Nepil 2003) is used for
    refining the rules
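The utility measure and the greedy choice of the best refinement can be illustrated with made-up counts (the candidate rule names and numbers below are purely illustrative):

```python
def utility(p_cov, n_cov, p_all, n_all):
    """Utility of a refinement: Pcov/Pall - Ncov/Nall."""
    return p_cov / p_all - n_cov / n_all

# Hypothetical candidate refinements as (name, Pcov, Ncov), with
# Pall = 10 positive and Nall = 6 negative examples selected.
candidates = [("rule_a", 8, 3), ("rule_b", 6, 0), ("rule_c", 10, 6)]
best = max(candidates, key=lambda c: utility(c[1], c[2], 10, 6))
print(best[0])  # rule_b: 6/10 - 0/6 = 0.6 beats 0.3 (rule_a) and 0.0 (rule_c)
```

A rule covering all positives and all negatives (rule_c) scores zero: it separates nothing.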

8
Experiments
  • three experiments have been performed
  • the third and the fifth most frequent Czech word
    (se and je) and the subset of the most frequent
    POS ambiguity (words of type vedení) were chosen
  • se is either a reflexive pronoun or a vocalized
    form of the preposition s
  • je is either a personal pronoun or the 3rd sg of
    the verb být
  • words of the vedení type are forms of either
    nouns or adjectives
  • the evaluation of the generated rules has been
    performed on the manually annotated corpus DESAM
  • all occurrences of these three types have been
    used.
  • even the badly disambiguated words in
    non-grammatical, but human-parseable sentences
    have been counted as errors caused by the rules

9
Results
  • recall (left column) = correctly disambiguated /
    all ambiguous words
  • precision (right column) = correctly
    disambiguated / disambiguated words
  • baseline = default precision, i.e., selecting
    the more probable tag
  • frequency = portion among all words ambiguous in
    POS
  • HMM is the Czech HMM-based tagger (Krbec, Hajič
    2001)
  • EXP is the Czech feature-based tagger (Hajič
    2001)
  • it should be stated that the comparison with HMM
    and EXP is not quite fair, as they are not
    specialized in solving these three particular
    ambiguities
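With purely illustrative counts (not the actual DESAM figures), the two measures work out as:

```python
def recall(correct, all_ambiguous):
    """correctly disambiguated / all ambiguous words"""
    return correct / all_ambiguous

def precision(correct, disambiguated):
    """correctly disambiguated / disambiguated words"""
    return correct / disambiguated

# Made-up example: 100 ambiguous occurrences, the rules decide on 80
# of them and get 76 right -> high precision, partial coverage.
print(recall(76, 100))    # 0.76
print(precision(76, 80))  # 0.95
```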

10
Discussion
  • in all cases I had to relax some of the
    principles proposed earlier
  • allowing the coverage of a small amount of
    negative examples
  • not all unambiguous words can be used, words
    appearing in non-typical contexts have been
    discarded
  • je was substituted with words bearing slightly
    different tags
  • disadvantages
  • difficulty of searching for adequate unambiguous
    substitutes
  • it seems that many ambiguities will remain
    unresolvable with our method; e.g., the
    nominative, accusative and vocative cases have
    the same form for all Czech neuters
  • on the other hand, the results show that at least
    for some ambiguities quite accurate rules can be
    learned, which could be useful for a partial
    disambiguation or some preprocessing

11
Conclusions and Future Work
  • a new method of inducing rules for
    disambiguation
  • learning from raw, unannotated data
  • promising results, mainly in accuracy
  • limitations constraining a wider/general
    application
  • possible improvements
  • some kind of (semi?)automatic elimination of
    non-grammatical learning examples should be
    considered
  • some rearranging of the tagset
  • some rules are very similar to others; methods
    of detecting these similarities could be useful
  • the lexical filters can always be improved
  • some improving or lexicalization of the domain
    knowledge?