Translation Selection Using Bilingual Lexicon and Monolingual Corpus

Slides: 13
Provided by: hong45

1
Translation Selection Using Bilingual Lexicon and
Monolingual Corpus
  • Ye Hong Hoe

2
Introduction
  • In a bilingual dictionary
      • a word may have many senses, and each sense may have
        multiple target words

English word: kill
  English gloss: deprive of life
    Malay target words: membunuh, mengorbankan
  English gloss: put an end to
    Malay target words: memusnahkan, menghapuskan
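
The word-to-sense-to-target-word structure in the "kill" example can be sketched as a small data structure. This is only an illustrative layout, not the presenter's format: the name `LEXICON`, the helper `candidate_targets`, and the grouping of two target words under each gloss are assumptions.

```python
# Hypothetical in-memory layout for one bilingual lexicon entry.
# Glosses and target words come from the "kill" example; the grouping
# of target words under each gloss is an assumption.
LEXICON = {
    "kill": [
        {"gloss": "deprive of life", "targets": ["membunuh", "mengorbankan"]},
        {"gloss": "put an end to", "targets": ["memusnahkan", "menghapuskan"]},
    ],
}


def candidate_targets(word, lexicon=LEXICON):
    """Return every target word listed under any sense of `word`."""
    return [t for sense in lexicon.get(word, []) for t in sense["targets"]]


print(candidate_targets("kill"))
```

Without any selection step, every target word under every sense is a candidate, which is the ambiguity that translation selection has to resolve.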
3
Introduction (cont.)
  • In machine translation
  • translation selection is a process that selects
    an appropriate target word corresponding to a
    word in a source language (Lee et al., 2002)

to kill someone
→ memusnahkan seseorang (X)
→ menghapuskan seseorang (X)
→ mengorbankan seseorang (?)
→ membunuh seseorang (O)
4
Introduction (cont.)
  • Bilingual dictionary
      • essential for translation selection
      • usually contains less descriptive definitions

5
Objective
  • to build an English-Malay translation selection
    system
  • select an appropriate Malay word corresponding to
    an English word in an English sentence

6
Previous Work
  • Hyun and Gil, 2002
      • Translation selection
          • Source word sense disambiguation
          • Target word selection
      • Advantage
          • reduces complexity
          • considers only the target words for each word sense
      • Weakness
          • uses only a simple method of word sense
            disambiguation

7
Proposed System Design
[Figure 1. Proposed System Design. Knowledge sources: Bilingual Lexicon, Lexical Conceptual Distance Data (LCDD), Target Language Corpus. Data labels: Sense Definition, Example Sentence, Sense Target Word Equivalent, Target Word Co-occurrence. Processing stages: Input sentence, Source word sense disambiguation, Target word selection, Word-level translation.]
8
Knowledge Source
  • Bilingual lexicon

[Figure 2. Bilingual lexicon enriched by monolingual lexicons. English-Malay lexicon: Kamus Inggeris-Melayu Dewan (KIMD). English lexicons: Longman Dictionary of Contemporary English (LDOCE), WordNet. Edge labels: "gloss, example"; "definition".]
9
Knowledge Source (cont.)
  • Target language corpus
      • Malay corpus
      • provides target word co-occurrence
          • a pair of words that co-occur within a
            predefined window (e.g. a sentence)
          • e.g. (melawan, pertandingan), (mengajar, sekolah)
  • Lexical Conceptual Distance Data (LCDD)
      • a measure of relatedness between two word senses
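
Target word co-occurrence with a sentence-sized window can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the function name `cooccurrence_counts`, the whitespace tokenization, and the two-sentence toy corpus are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(sentences):
    """Count unordered word pairs that appear in the same sentence."""
    counts = Counter()
    for sentence in sentences:
        # Sort the distinct words so each pair has one canonical order.
        words = sorted(set(sentence.lower().split()))
        for pair in combinations(words, 2):
            counts[pair] += 1
    return counts


# Made-up toy corpus, for illustration only.
corpus = [
    "dia mengajar di sekolah",
    "guru itu mengajar di sekolah rendah",
]
counts = cooccurrence_counts(corpus)
print(counts[("mengajar", "sekolah")])
```

Pairs are stored in alphabetical order, so a lookup must use the same order, as in `("mengajar", "sekolah")`.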

10
Source Word Sense Disambiguation
  • Methods
      • part-of-speech tagging
      • use word sense definitions and LCDD
          • calculate relatedness between definitions
          • the sense with the most related definition is the
            preferred sense
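
The "most related definition wins" step can be sketched as below. LCDD itself is not described in enough detail here to reproduce, so the sketch swaps in a simple word-overlap count as a stand-in relatedness measure. The function names, the sense definitions, and the definition given for "someone" are all hypothetical.

```python
def overlap_relatedness(def_a, def_b):
    """Stand-in for an LCDD lookup: number of words the definitions share."""
    return len(set(def_a.lower().split()) & set(def_b.lower().split()))


def disambiguate(sense_definitions, context_definitions,
                 relatedness=overlap_relatedness):
    """Pick the sense whose definition is most related to the context."""
    def score(sense):
        return sum(relatedness(sense_definitions[sense], c)
                   for c in context_definitions)
    return max(sense_definitions, key=score)


# Invented definitions for the two senses of "kill" (illustration only).
senses = {
    "deprive of life": "cause a person or animal to die",
    "put an end to": "cause something to stop or fail",
}
# Invented definition of the context word "someone".
context = ["a person who is not known or named"]
print(disambiguate(senses, context))
```

Passing the relatedness function as a parameter keeps the selection logic separate from the measure, so a real LCDD lookup could replace the overlap count without changing `disambiguate`.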

11
Target Word Selection
  • Method
      • use target word co-occurrence
      • calculate frequency
          • how often a target word co-occurs in the corpus with
            translations of the other words in the input sentence
          • as frequency increases, probability increases
      • apply distance factor
          • as the distance between the two co-occurring words
            increases, probability decreases
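
One way to combine frequency with a distance factor is sketched below. The slides do not give the actual formula, so the 1/distance weighting is an assumption, as are the function names and the three-sentence toy corpus.

```python
from collections import defaultdict


def build_weighted_cooccurrence(sentences):
    """Sum 1/distance over every co-occurrence of two words in a sentence."""
    weights = defaultdict(float)
    for sentence in sentences:
        words = sentence.lower().split()
        for i in range(len(words)):
            for j in range(i + 1, len(words)):
                if words[i] != words[j]:
                    # Assumed distance factor: closer pairs contribute more.
                    weights[frozenset((words[i], words[j]))] += 1.0 / (j - i)
    return weights


def select_target(candidates, context_translations, weights):
    """Return the candidate with the highest weighted co-occurrence score."""
    def score(candidate):
        return sum(weights.get(frozenset((candidate, c)), 0.0)
                   for c in context_translations)
    return max(candidates, key=score)


# Made-up toy corpus, for illustration only.
corpus = [
    "dia cuba membunuh seseorang",
    "polis menahan lelaki yang membunuh seseorang",
    "mereka mahu menghapuskan kemiskinan",
]
weights = build_weighted_cooccurrence(corpus)
best = select_target(["membunuh", "menghapuskan"], ["seseorang"], weights)
print(best)
```

Each additional co-occurrence raises a candidate's score, and each extra word of separation lowers that occurrence's contribution, which matches the two tendencies described in the slide.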

12
Thank You