1
Machine Translation
  • Research Seminar on Software Business 21.5.2003
  • Antti Ilmo

2
Outline
  • Introduction
  • Translation and Machine Translation Techniques
  • The Early Machine Translation Systems
  • Problems of Machine Translation
  • Proposed Solutions to the Problems
  • Summary

3
Introduction
  • The Internet and globalisation have increased the
    need for localization of documentation and
    interaction between different nationalities
  • Localization is expensive and time-consuming
  • Machine Translation is a potential solution
  • But ...

4
Introduction (2)
  • MT quality is not good enough
  • language works on many levels
  • interpretation
  • dictionary may tell a meaning, but not how it is
    interpreted
  • competence, experience and internal models of
    language users important
  • local usage etc. (Canadian French and French
    French)
  • translation may sound wrong in a dialect
  • typos
  • syntactic errors occur

6
What is translation?
  • Preservation of the original text
  • stylistic and semantic characteristics
  • word-for-word
  • meaning-for-meaning
  • Rules of language
  • e.g. letters c, a and t form a word only in
    the right order
  • Translation process (translating) and translation
    product (translated text)
  • translation concept consists of both of the above
  • Translator re-codes the message into a different
    language

7
MT Technology
  • Machine Translation (MT)
  • machine takes care of translation process
  • Machine Aided Translation (MAT)
  • Machine-Assisted Human Translation (MAHT)
  • humans translate, machine assists
  • Human-Assisted Machine Translation (HAMT)
  • machine translates, humans assist
  • e.g. choosing a correct word from a dictionary
  • Terminology Databanks (TD)
  • technical terminology
  • most commonly used nowadays

8
Linguistic Techniques
  • Direct vs. indirect
  • direct uses word replacement
  • indirect tries to express a meaning
  • Interlingua vs. transfer
  • Interlingua does not take into account variations
    in target languages
  • transfer approach uses language-specific meaning
  • local vs. global
  • local scope uses word-level analysis
  • global scope analyses sentences or even more
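
The direct, word-level approach above can be sketched in a few lines of Python. This is only an illustration: the tiny English-to-French lexicon and the function name translate_direct are assumptions made for the example, not part of any system described in the presentation.

```python
# Toy English-to-French lexicon; the entries are illustrative assumptions.
LEXICON = {
    "the": "le",
    "cat": "chat",
    "black": "noir",
    "sleeps": "dort",
}


def translate_direct(sentence: str) -> str:
    """Replace each word independently; unknown words pass through unchanged."""
    words = sentence.lower().split()
    return " ".join(LEXICON.get(word, word) for word in words)


print(translate_direct("The black cat sleeps"))  # le noir chat dort
```

The output keeps the English word order, whereas French would say "le chat noir dort". Fixing that needs analysis beyond the single word, which is what the global scope is meant to provide.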

10
Early Systems (GAT)
  • Georgetown Automatic Translation
  • one of the earliest MT projects
  • development began in 1952, in use 1964-1979
  • physics texts from Russian to English
  • replacement of words
  • no real linguistic theory
  • "The spirit is willing, but the flesh is weak"
    translated to Russian and then back to English.
    The result: "The wine is agreeable, but the meat
    has spoiled"

11
Early Systems (CETA)
  • Centre d'Études pour la Traduction Automatique
  • launched in 1961 in Grenoble
  • in use 1967-71
  • approximately 400,000 words translated
  • Russian to French
  • sentence based analysis
  • Interlingua and transfer mixed
  • grammatical level vs. dictionary level
  • Realization: the Interlingua approach was not a good one

12
Early Systems (SYSTRAN)
  • one of the first systems marketed
  • installed in 1970 (US Air Force Foreign
    Technology Division)
  • used also at NASA and EURATOM
  • semantic features ad hoc
  • negative feedback at first
  • post-editing found to be a good approach
  • GM of Canada claimed the system sped up the
    work of human translators three to four times
    (3000-4000 words a day, approximately the same as
    a human translator now translates with the help
    of translation workbenches)

13
Early Systems (TAUM-METEO)
  • TAUM-METEO was the first truly automatic MT
    system
  • developed in 1960s
  • used by Canadian Meteorological Center
  • scanned network for English weather reports and
    translated them to French
  • corrected its own errors without post-editors
  • forwarded offending content to human translators
  • 24,000 words/day
  • problems
  • communication noise
  • misspellings
  • words missing from the dictionary
  • specialised language made translations possible

15
Problems
  • Translation is not straightforward
  • it is not a matter of replacing word for word
  • word orders
  • rewriting of text into another language
  • choosing the right words
  • e.g. imperative mood in English, infinitive in
    French

16
Problems (2)
  • Automation of translation not easy
  • quality is poor
  • homographs
  • "fan": a ventilator or an enthusiast
  • different word classes
  • e.g. "love" is both a verb and a noun
  • "you" can be both singular and plural
  • idioms
  • e.g. "country music", meaning a type of music
  • personal pronouns
  • second person pronouns may vary in familiar and
    formal situations
  • also post-editing can take more time than
    translating from scratch
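
To make the homograph problem concrete, here is a minimal Python sketch that picks a sense for "fan" by counting cue words in the rest of the sentence. The cue-word overlap technique, the SENSES table and the function name pick_sense are assumptions introduced for illustration; the presentation does not describe how any particular system resolves homographs.

```python
from typing import Optional

# Hypothetical cue words for each sense of a homograph.
SENSES = {
    "fan": {
        "ventilator": {"air", "cooling", "ceiling", "blades"},
        "enthusiast": {"football", "music", "club", "concert"},
    }
}


def pick_sense(word: str, sentence: str) -> Optional[str]:
    """Return the sense whose cue words overlap most with the sentence."""
    candidates = SENSES.get(word)
    if not candidates:
        return None
    context = set(sentence.lower().split()) - {word}
    best_sense, best_overlap = max(
        ((sense, len(cues & context)) for sense, cues in candidates.items()),
        key=lambda pair: pair[1],
    )
    # With no cue word present, the sentence alone cannot decide.
    return best_sense if best_overlap > 0 else None


print(pick_sense("fan", "the ceiling fan moves air"))
print(pick_sense("fan", "she is a football fan"))
```

A sentence with no cue words returns None, which mirrors the point of the slide: the word by itself does not say which translation is right.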

17
Problems (3)
  • Morphological analysis
  • e.g. Chinese and Japanese do not put spaces
    between words
  • words within a sentence are not separated by
    anything
  • Syntactic analysis
  • modifiers a problem
  • The boy saw a girl with a telescope
  • the girl had a telescope vs. the boy used a
    telescope to see a girl
  • Analysis of context
  • 20-40 words in a sentence
  • 100 million possible translations
  • There are always going to be problem cases
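
The slide does not say how the figure of 100 million is reached. One plausible reading, offered here as an assumption, is that the candidates multiply: if every word has a few possible renderings, the number of combinations grows exponentially with sentence length. The function name candidate_count is hypothetical.

```python
def candidate_count(words_in_sentence: int, options_per_word: int) -> int:
    """Number of word-by-word combinations if each word has the same number of options."""
    return options_per_word ** words_in_sentence


# Even with only two options per word, the count explodes quickly.
for length in (20, 27, 40):
    print(length, candidate_count(length, 2))
```

Under this reading, a 27-word sentence with just two options per word already yields more than 100 million combinations, which is why context analysis cannot simply enumerate every candidate.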

19
AI-Based Approach
  • Raman and Alwar 1990
  • Conversations carried out across enquiry counters
    at railway stations in India
  • System should understand a text before
    translating it
  • analysis of text to understand the meaning and
    storing it in a language-free semantic map
  • semantic maps used to generate translations
  • Analyzer analyses one sentence at a time
  • unnecessary adjectives not taken into account
  • morphological analysis first
  • building of semantic map second
  • stages work concurrently
  • large dictionary needed

20
AI-Based Approach (2)
  • Natural language generator builds a sentence in
    target language
  • analyzer's result is fed into the generator
  • translate everything vs. leave something out
  • definition of structure
  • words in right order and inflected correctly
  • minimal importance to style
  • Successful in specific application and a
    restricted set of sentences
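
The analyse-then-generate pipeline can be sketched as follows. This is a toy under stated assumptions, not the system from the cited work: the SemanticMap fields, the keyword-based analyse function, the DEPARTURE_TIME label and the templates are all invented for illustration, and the single railway-enquiry sentence pattern stands in for the restricted domain.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SemanticMap:
    """Language-free representation of one enquiry (hypothetical fields)."""
    query: str
    destination: str


def analyse(sentence: str) -> SemanticMap:
    """Very small keyword-based analyser for one sentence pattern."""
    words = sentence.rstrip("?").split()
    lowered = [word.lower() for word in words]
    if "leave" not in lowered or "to" not in lowered:
        raise ValueError("sentence is outside the restricted domain")
    position = lowered.index("to") + 1
    if position >= len(words):
        raise ValueError("no destination found after 'to'")
    return SemanticMap(query="DEPARTURE_TIME", destination=words[position])


# One template per (concept, target language); style is deliberately ignored.
TEMPLATES = {
    ("DEPARTURE_TIME", "en"): "When does the train to {destination} leave?",
    ("DEPARTURE_TIME", "fr"): "Quand part le train pour {destination} ?",
}


def generate(meaning: SemanticMap, target: str) -> str:
    """Build a target-language sentence from the semantic map."""
    template = TEMPLATES[(meaning.query, target)]
    return template.format(destination=meaning.destination)


meaning = analyse("When does the train to Delhi leave?")
print(generate(meaning, "fr"))
```

Because the generator reads only the semantic map, adding a target language means adding templates, not touching the analyser. The sketch also shows the limitation noted on the slide: anything outside the expected sentence pattern is rejected.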

21
Interactive Approach
  • Sen, Zhaoxiong and Heyan 1997
  • Knowledge of MT systems incomplete -> incorrect
    translations
  • Possibility for an MT system to learn
  • quality should improve
  • Interaction starts when a sentence is found that
    the system cannot analyse properly
  • message to the user
  • user responds with a coded message
  • updates the system's knowledge base
  • interaction limited to three stages
  • lexical analysis
  • uncertain modifiers
  • multiple translations
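
The learning loop can be illustrated with a small Python sketch that covers only the lexical stage. The function name translate_interactively, the dictionary-based knowledge base and the ask_user callback are assumptions for the example; the coded messages of the original system are not reproduced here.

```python
from typing import Callable, Dict, List


def translate_interactively(
    words: List[str],
    lexicon: Dict[str, str],
    ask_user: Callable[[str], str],
) -> List[str]:
    """Translate word by word, asking the user about unknown words."""
    output = []
    for word in words:
        if word not in lexicon:
            # Interaction starts only when the system cannot handle the input;
            # the answer is stored so the same question is not asked again.
            lexicon[word] = ask_user(word)
        output.append(lexicon[word])
    return output


knowledge = {"station": "gare"}
answers = {"ticket": "billet"}  # stands in for a human user's replies
print(translate_interactively(["ticket", "station"], knowledge, answers.__getitem__))
print(knowledge)
```

After the first call the knowledge base contains the new entry, so a later sentence with the same word is translated without any interaction.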

22
Multiple Translation Engines and Sentence Partitioning
  • Ren, Shi and Kuroiwa 2000
  • Multiple MT systems running in parallel
  • all use different MT techniques
  • controller coordinates translating
  • each engine translates a sentence independently
  • controller chooses the best translation
  • if there is no proper translation, the sentence
    is partitioned
  • process starts again from the beginning
  • in the end the partitioned sentence is put back
    together
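
The controller logic described above can be sketched as follows. Several details are assumptions made for the example: engines are modelled as functions returning a translation with a confidence score (or None on failure), the sentence is split at its midpoint, and the engines are called one after another instead of in parallel. The names Engine, phrase_engine and translate are hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple

# An engine returns (translation, confidence) or None when it has no result.
Engine = Callable[[str], Optional[Tuple[str, float]]]


def phrase_engine(table: Dict[str, str], confidence: float) -> Engine:
    """Build a toy engine that only knows the phrases in its table."""
    def engine(text: str) -> Optional[Tuple[str, float]]:
        hit = table.get(text)
        return (hit, confidence) if hit is not None else None
    return engine


def translate(sentence: str, engines: List[Engine]) -> str:
    """Pick the best engine result; partition the sentence if none succeeds."""
    results = [r for r in (engine(sentence) for engine in engines) if r is not None]
    if results:
        return max(results, key=lambda result: result[1])[0]
    words = sentence.split()
    if len(words) <= 1:
        return sentence  # cannot partition further; pass through untranslated
    middle = len(words) // 2
    left = translate(" ".join(words[:middle]), engines)
    right = translate(" ".join(words[middle:]), engines)
    return f"{left} {right}"  # put the partitioned sentence back together


engines = [
    phrase_engine({"good morning": "bonjour"}, 0.9),
    phrase_engine({"good morning": "bon matin", "my friend": "mon ami"}, 0.5),
]
print(translate("good morning my friend", engines))  # bonjour mon ami
```

No engine knows the full sentence, so the controller partitions it, keeps the higher-scoring result for the first half, and rejoins the pieces. Splitting at the midpoint is the simplest possible choice; choosing good partition points is a problem in its own right.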

23
Multiple Translation Engines and Sentence Partitioning (2)
  • Parallel processing should improve success rate
  • correct translation preserved through procedures
  • combining the best translations should improve
    quality
  • Morphological analysis
  • analysis gives results that are used as inputs
    for the engines
  • engines are then run in parallel
  • if there is more than one result, the number of
    engines is increased
  • if there are no results, the sentence is
    partitioned
  • problem of partitioning a sentence, e.g. in
    Chinese and Japanese
  • In a test situation with four engines the results
    improved dramatically
  • consumed time doubled
  • 1 MT system translated 45.6% of sentences
    correctly
  • with multiple engines the result was 74.2%
    (Japanese to Chinese)
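
Reading the two figures as percentages of correctly translated sentences (an assumption, since the unit is not shown on the slide), the size of the improvement can be worked out directly. The variable names are illustrative.

```python
single_engine = 45.6  # share of sentences translated correctly by one MT system
multi_engine = 74.2   # share with multiple engines (Japanese to Chinese)

# Absolute gain in percentage points and relative gain over the single engine.
absolute_gain = round(multi_engine - single_engine, 1)
relative_gain = round(100 * (multi_engine / single_engine - 1), 1)

print(absolute_gain)  # 28.6
print(relative_gain)  # 62.7
```

On this reading, the multi-engine setup gains 28.6 percentage points, roughly 63% more correct sentences than a single engine, at the cost of doubling the time consumed.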

25
Summary
  • A definitive solution is still to be found
  • Biggest problems of MT are linguistic
  • it is very hard to cover all the rules and adjust
    them to all possible languages and variations
  • misspellings cause problems, which means a very
    good proof-reading function is needed
  • There is a long way to go before MT systems
    replace human translators
  • Machine Translation can be used in applications
    where the language is very specific