Machine Translation Lecture 20 - PowerPoint PPT Presentation

Slides: 17
Provided by: alison87
Transcript and Presenter's Notes

1
Machine Translation: Lecture 20
  • Motivation, Difficulties and Core Techniques

2
Motivation
  • Huge commercial need for translation
  • 2,000,000,000 industry.
  • 10% of EU budget.
  • Mostly technical documentation for products with
    international market. Also EU documents!
  • Potential saving (and potential quality
    improvements) from machine translation.
  • MT is already deployed successfully:
  • 80% of EU documents between Spanish and French
    are results of machine translation.

3
Difficulties (words/syntax)
  • One word in source language may have many
    translations.
  • E.g., English wear → Japanese haku, kiru, kaburu
  • Complex expression in source language → single
    word in target language (nach dem Krieg →
    postwar).
  • Order/phrasing differ: I swam across the
    river → J'ai traversé le fleuve en nageant

4
Difficulties (pragmatics)
  • Conventional structures for reports may be
    different in each language.
  • Politeness conventions differ greatly.

5
Example: Google
  • Que ce soit à l'Halloween ou dans ta vie de
    tous les jours, ils y a des règles à suivre
    lorsque tu dois traverser la rue.
  • →
  • That it is in Halloween or in your life of the
    every day, they has rules there to follow when
    you must cross the street.

6
Traditional Approach
  • Do some level of analysis of source, then use
    transfer rules to map to representation of target.

  • Direct translation: source text → target text
  • Shallow transfer: source syntax → target syntax
  • Deep transfer: source semantics → target
    semantics
  • Interlingua: a single representation shared by
    source and target
  • Varying approaches depending on depth of
    analysis.

7
Transfer vs Interlingua
  • Transfer systems map words/syntax/semantics in
    one language to words/syntax/semantics in
    another.
  • May need many special case rules to deal with
    language idiosyncrasies.
  • Interlingua systems translate from a source
    language to a language-independent
    representation, and can then translate from
    that to a target language.

8
Translating between multiple languages
Transfer systems:
  • Analysis: Spanish analysis, German analysis,
    English analysis
  • Transfer: Spanish-German transfer,
    Spanish-English transfer, etc.
  • Generation: Spanish generation, German
    generation, English generation
Versus interlingua:
  • Analysis: Spanish analysis, German analysis,
    English analysis
  • Interlingua
  • Generation: Spanish generation, German
    generation, English generation

9
Interlingua versus transfer
  • So an interlingua is simpler when translating
    between multiple languages.
  • But the required language-independent
    representation is complex.
  • The target language has no influence on the
    analysis process, so it can be harder to deal
    with some idiosyncratic translations.
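
Why an interlingua scales better can be shown with a rough module count. The sketch below assumes one analysis and one generation module per language, plus one transfer module per ordered language pair in the transfer design; the function names are illustrative, not from the lecture.

```python
def transfer_modules(n: int) -> int:
    """Modules needed by a transfer design covering n languages.

    Assumption: n analysis modules, n generation modules, and one
    transfer module for every ordered (source, target) pair.
    """
    return n + n + n * (n - 1)


def interlingua_modules(n: int) -> int:
    """Modules needed by an interlingua design covering n languages.

    Assumption: n analysis modules and n generation modules, with no
    pairwise transfer modules at all.
    """
    return n + n


for n in (3, 5, 10):
    # Transfer grows quadratically with n, interlingua linearly.
    print(n, transfer_modules(n), interlingua_modules(n))
```

Under these assumptions, three languages need 12 modules with transfer and 6 with an interlingua; ten languages need 110 versus 20.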

10
Speech versus Text
  • Spoken language translation is particularly hard.
      • The speech recognition process is error-prone.
      • Intonation/prosody is often discarded in
        recognition, and is hard to add into
        synthesised speech in the other language.
  • Text translation
      • Easier, but spelling errors etc. are still
        common.
  • In either case there may be many unknown words.

11
Machine Aided Translation
  • Translation can often be made easier by sharing
    the task between human and machine; this still
    saves on manual translation.
  • Human (with specialised pre-processing tools) may
    pre-edit source documents:
      • substituting unknown words,
      • identifying proper nouns,
      • indicating sense/class of ambiguous words.

12
Machine Aided Translation
  • Human may also post-edit:
      • Correcting output from the MT system,
      • e.g., In this study it will be sought to
        answer.. → This study will seek to answer..
  • MT system may interact with the user, allowing
    the user to select from or correct ambiguous or
    potentially incorrect translations.
  • Pre- and post-editing do not always require
    knowledge of both languages.

13
Machine Aided Translation
  • More extreme example: translation memories.
  • Store human translations of words and phrases.
  • Make these easily retrievable for the human
    translator translating same things again.
  • Can be made more intelligent with fuzzy
    matching and replacement.
  • Popular with human translators.
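
A translation memory with fuzzy matching can be sketched in a few lines. This is a minimal illustration, not how any particular commercial tool works: the `TranslationMemory` class, the 0.75 threshold and the stored sentence pairs are all assumptions made for the example, and similarity is measured with the standard library's `difflib.SequenceMatcher`. The replacement step mentioned above is not implemented.

```python
from difflib import SequenceMatcher


class TranslationMemory:
    """Toy translation memory: stores human translations of segments and
    retrieves the closest stored segment for a new one."""

    def __init__(self, threshold=0.75):
        # Minimum similarity (0..1) for a fuzzy match; an arbitrary choice.
        self.threshold = threshold
        self.entries = {}

    def add(self, source, target):
        self.entries[source] = target

    def lookup(self, segment):
        """Return (stored_source, stored_target, score), or None when no
        stored segment is similar enough."""
        best_source, best_score = None, 0.0
        for source in self.entries:
            score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
            if score > best_score:
                best_source, best_score = source, score
        if best_source is None or best_score < self.threshold:
            return None
        return best_source, self.entries[best_source], best_score


tm = TranslationMemory()
tm.add("Press the red button.", "Appuyez sur le bouton rouge.")
tm.add("Close the cover.", "Fermez le couvercle.")

exact = tm.lookup("Press the red button.")    # identical segment
fuzzy = tm.lookup("Press the green button.")  # near match, needs editing
miss = tm.lookup("Replace the battery every two years.")  # nothing similar
print(exact, fuzzy, miss, sep="\n")
```

A fuzzy hit returns the stored translation of the nearest segment, which the human translator then edits; a real system would also highlight which words differ.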

14
Modern MT Approaches
  • Rule-based
      • Transfer rules map from source to target
        language representations.
  • Statistical
      • Given alternative possible translations, find
        the most probable one in the target language
        (given a corpus in the target language).
      • Use P(S|T) and P(T) to score possible
        translations, for source S and target T.
      • Google is moving from rule-based to more
        statistical methods.
  • Example-based
      • Extend the idea of translation memories: reuse
        existing translation fragments.
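
The statistical approach is usually written as choosing the target sentence T that maximises P(S|T) · P(T) for a source sentence S, which by Bayes' rule is the same T that maximises P(T|S). The sketch below applies that rule to three candidate translations. All probabilities are invented for illustration; a real system would estimate P(S|T) from a bilingual corpus and P(T) from a target-language corpus.

```python
import math

# Hypothetical toy values, not estimated from any corpus.
# p_s_given_t: how well the candidate explains the source sentence.
# p_t: how fluent the candidate is as target-language text.
candidates = {
    "there are rules to follow": {"p_s_given_t": 0.20, "p_t": 0.010},
    "they has rules there to follow": {"p_s_given_t": 0.30, "p_t": 0.0001},
    "rules exist": {"p_s_given_t": 0.02, "p_t": 0.020},
}


def score(p_s_given_t, p_t):
    """Log of P(S|T) * P(T); logs avoid underflow on long sentences."""
    return math.log(p_s_given_t) + math.log(p_t)


def best_translation(cands):
    """Return the candidate T with the highest P(S|T) * P(T)."""
    return max(
        cands,
        key=lambda t: score(cands[t]["p_s_given_t"], cands[t]["p_t"]),
    )


best = best_translation(candidates)
print(best)
```

With these toy numbers the word-for-word candidate has the highest P(S|T) but loses overall, because the language model P(T) penalises its ungrammatical English.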

15
MT Evaluation
  • Evaluating translation systems is hard.
  • Expensive to evaluate by hand: requires a human
    translator to make judgements.
  • May compare results on sample documents to fixed
    set of reference translations.
  • But hard to automatically compare translations to
    assess closeness.
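
One simple way to compare a system's output with a reference translation is to count shared word n-grams. The sketch below computes a clipped n-gram precision, a simplified relative of metrics such as BLEU; it is an assumption about how "closeness" might be measured, not a method named in the lecture. The sentences reuse the post-editing example from the earlier slide.

```python
from collections import Counter


def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def clipped_precision(candidate, reference, n=1):
    """Fraction of candidate n-grams that also occur in the reference.

    Each n-gram is credited at most as often as it appears in the
    reference, so repeating a correct word does not inflate the score.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


reference = "This study will seek to answer"
same = "This study will seek to answer"
passive = "In this study it will be sought to answer"

print(clipped_precision(same, reference))        # identical output
print(clipped_precision(passive, reference))     # unigram overlap
print(clipped_precision(passive, reference, 2))  # bigram overlap
```

The passive sentence carries the same meaning as the reference yet shares only 5 of its 9 words and 2 of its 8 bigrams with it, which illustrates why surface overlap is a weak proxy for translation quality.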

16
Summary
  • Translating is hard: languages vary by more than
    just words.
  • Main approaches: transfer or interlingua.
  • Statistical and example-based techniques are
    also used.
  • Fully automatic, high-quality translation is
    beyond the state of the art.
  • Machine-aided translation is nevertheless very
    useful.