IT AND TRANSLATION - PowerPoint PPT Presentation

About This Presentation
Title:

IT AND TRANSLATION

Slides: 82
Provided by: aca160

Transcript and Presenter's Notes

Title: IT AND TRANSLATION


1
IT AND TRANSLATION
  • INTRODUCTION

2
Rationale for IT Applications to Translation
"A computer is a device that can be used to
magnify human productivity. Properly used, it
does not dehumanize by imposing its own Orwellian
stamp on the products of human spirit. ...
Translation is a fine and exacting art, but
there is much about it that is mechanical and
routine and, if this were given over to a machine,
the productivity of the translator would not only
be magnified but his work would become more
rewarding, more exciting, more human."

Martin Kay (1987)
3
COURSE OVERVIEW
  • ESSENTIALS
  • TEXT PROCESSING
  • MT
  • TM
  • WORKING WITH CORPORA
  • TERMINOLOGY EXTRACTION AND GLOSSARY PRODUCTION
    (MONOLINGUAL AND BILINGUAL CORPORA)

4
COURSE OVERVIEW - DETAILS
  • 1) ESSENTIALS
  • Types of computer aids
  • CAT vs. MT
  • History of CAT tools
  • General principles of working with CAT tools
  • Reference materials
  • Localization and internationalization
  • UNIX

SOME OF THIS TODAY!
5
COURSE OVERVIEW - DETAILS
  • 2) TEXT PROCESSING
  • Word and WordPad (tips and tricks)
  • Fonts, code pages, keyboard layout, language
    tools in Windows XP and Office
  • Speech recognition software
  • Scanning
  • OCR
  • File types (essential info on the most common
    file types and file conversion utilities)

6
COURSE OVERVIEW - DETAILS
  • 3) MT
  • How it works, brief demonstration
  • Systran Pro
  • PROMT
  • NeuroTran
  • Babel Fish

DESKTOP BASED
SUPPORTS CROATIAN (partially Serbian)
WEB BASED
7
COURSE OVERVIEW - DETAILS
  • 4) TM
  • Overview (what it is, standards and file formats)
  • Desktop vs. server based TM programs
  • WinAlign
  • Wordfast
  • Trados (nowadays SDL Trados) Freelance edition
  • Sisulizer

8
COURSE OVERVIEW - DETAILS
  • 5) WORKING WITH CORPORA
  • Essentials
  • Concordancing (WordSmith, Concordancer, AntConc)
  • Advanced corpus analysis: WordSmith, TIGERSearch
  • Lemmatization and annotation
  • Parallel corpora: ParaConc

9
COURSE OVERVIEW - DETAILS
  • 6) TERMINOLOGY EXTRACTION AND GLOSSARY PRODUCTION
  • Essentials
  • Doing it automatically: Trados (i.e. SDL)
    MultiTerm (Desktop and Extract)
  • Doing it semi-automatically: ParaConc,
    Concordancer

10
COURSE REQUIREMENTS
  • Basic computer literacy
  • Positive outlook
  • Computers don't bite
  • CAT tools are not complex, they are actually made
    to make you more efficient
  • Interest in translation
  • Willingness to become several times more
    efficient in doing translations

11
SCHEDULE
  • HONESTLY, WE DON'T KNOW FOR CERTAIN!
  • THAT'S WHY WE NEED YOUR EMAIL ADDRESSES, SO THAT
    WE CAN KEEP YOU UPDATED WITH THE LATEST SCHEDULE
    DEVELOPMENTS
  • PROBABLY LOCATION: 25 (lectures) and 38
    (computer lab), SATURDAYS, AT 16:00

12
LITERATURE
  • Geoffrey Samuelsson-Brown, A Practical Guide for
    Translators (Topics in Translation), Multilingual
    Matters, 4th edition (May 28, 2004)
  • H. L. Somers (Editor), Computers and Translation:
    A Translator's Guide (Benjamins Translation
    Library, 35), John Benjamins Publishing Co, 1st
    edition (May 2003)
  • Bert Esselink, A Practical Guide to Localization
    (Language International World Directory), John
    Benjamins Publishing Co, Revised 1st edition
    (September 2000)
  • Silvia Pavel and Diane Nolet, Handbook of
    Terminology, Translation Bureau of Canada, 1st
    edition (2001)
  • Frank Austermühl, Electronic Tools for
    Translators (Translation Practices Explained),
    St. Jerome, 1st edition, (April 2001)

13
COURSE OVERVIEW - GRADING
  • This is a hands-on course
  • You will be graded on the basis of the results of
    your practical assignments
  • Creating TMs from parallel texts (fiction and
    non-fiction, e.g. a book and a manual); in a way,
    you will also be creating a parallel corpus
  • Translating two short passages (fiction and
    non-fiction) using your newly created TMs

14
IT AND TRANSLATION
  • ESSENTIALS AND MORE ABOUT THE COURSE

15
TYPES OF COMPUTER AIDS
  • Computer aids / tools that are relevant to
    translators can be roughly classified into three
    groups:
  • Basic input and editing tools: word processors
  • Reference tools: electronic books (desktop and
    web), electronic dictionaries, the Web
    (Eurodicautom, OneLook, etc.), software-based
    reference materials (encyclopedias, e-Bible, etc.)
  • Productivity tools: TM tools, MT tools, speech
    technology (i.e. voice recognition)

16
CAT vs MT
  • As soon as you start using computer software in
    the process of translating, you are entering the
    realm of COMPUTER-AIDED TRANSLATION, or CAT in
    short.
  • In other words, CAT is a form of translation
    wherein a human translator translates texts using
    computer software designed to support and
    facilitate the translation process.

17
CAT vs MT (continued)
  • The problem is that COMPUTER-AIDED TRANSLATION
    is sometimes also called COMPUTER-ASSISTED
    TRANSLATION, MACHINE-AIDED TRANSLATION or
    MACHINE-ASSISTED TRANSLATION.
  • Due to the latter two terms, CAT is sometimes
    confused with MACHINE TRANSLATION, or MT in
    short.

18
CAT vs MT (continued)
  • Although these two concepts are related and
    similar in some aspects, CAT and MT denote two
    fundamentally different processes:
  • In CAT, the computer program merely supports the
    translator, so the translator translates the text
    himself/herself, making all the essential
    decisions involved.
  • In MT, the translator supports the machine, that
    is to say the computer (i.e. program) translates
    the text, which is then edited by the translator,
    or, in most cases, not edited at all.

19
CAT vs MT (continued)
  • Graphically represented, the difference is:

[Figure: Translation Technology Continuum, running from
maximum human involvement to maximum automation.
Unaided Translation: translation process not aided by any
electronic tools.
Computer-aided Translation (CAT): translation process aided
by electronic tools such as (most typically) Translation
Memory.
Automatic Translation / Machine Translation: translation
process automated by use of Machine Translation.
Adapted from Hutchins and Somers (1992)]
20
CAT: its scope
THE FOLLOWING VIEW IS WRONG!!!
  • CAT is traditionally associated with large-scale
    / corporate translations:
  • manuals and technical documentation
  • software localization
  • Typewriter-assisted (i.e. traditional)
    translation is usually associated with
    small-scale / individual translations (done by
    freelancers):
  • fiction books, scientific papers, etc.

21
CAT: its scope (continued)
  • This notion of CAT being restricted to
    corporate translation projects dates back to the
    1990s and is based exclusively on financial
    criteria:
  • during the early and mid 1990s, a combination of a
    high-end computer and a high-end CAT tool cost as
    much as a new car
  • from their very beginnings, CAT tools were
    designed to be capable of handling both big- and
    small-scale projects, but initially no freelance
    translator could afford them

22
CAT: its scope (continued)
  • Even for a freelance translator, the CAT route is
    nowadays the only possibility if one wants to
    provide high-quality, 100% terminologically
    consistent and efficiently produced translations.
  • A testimony to that is the industry-standard TM
    program Trados: Trados Freelance edition has been
    the company's best-selling TM program for a
    number of years.

23
CAT tools: a bit about their history
  • CAT tools were developed after (very)
    disappointing initial experiments with MT tools.
  • So, in order to give you a proper overview of how
    we got where we are now, we have to start with
    the history of MT tools.

24
MT History: how we switched to CAT
  • MT research began in the 1950s, prompted by Warren
    Weaver's 1949 memorandum:
  • "When I look at an article in Russian, I say:
    'This is really written in English, but it has
    been coded in some strange symbols. I will now
    proceed to decode.'"
  • (in Locke and Booth 1955: 18)

25
MT History: how we switched to CAT
  • Initially based on some misconceptions about human
    translation:
  • knowledge of two language systems suffices
  • it is merely a matter of looking things up in
    dictionaries
  • it is easy to define a good translation
  • there is only one correct translation possible

26
MT History: how we switched to CAT

MT history milestones: pre-ALPAC
  • 1954: Georgetown system demo
  • successful translation of 49 Russian sentences
    into English
  • 1955-1966: $50m spent in 20 research centres in
    the USA
  • 1966: Automatic Language Processing Advisory
    Committee (ALPAC) Report concludes:
  • "...MT is slower, less accurate and twice as
    expensive as Human Translation..."
  • "...there is no prospect of useful MT either
    immediately or in the future..."

27
MT History: how we switched to CAT

MT history milestones: post-ALPAC
  • 1969: privately funded projects
  • Logos system (1969), Weidner-CAT (1977), ALPS
    (1980)
  • 1975: Météo project in Canada
  • 1976: European Commission acquires Systran
  • 1979: Eurotra project in Europe for a
    multilingual system
  • 1980: PC-based systems
  • 1990: data-driven systems, Web MT

28
MT History: how we switched to CAT
  • 1975: Météo project in Canada
  • Automatic translation of weather forecasts (En
    -> Fr)
  • Sublanguage approach (domain-specific MT)
  • Most successful MT application to date
  • public broadcasting since 1977
  • Fr -> En available since 1989
  • only 4% of output needs post-editing
  • rapid translation; staff turnover no longer a
    problem

29
MT History: how we switched to CAT
  • Renewed interest in MT in the late 1980s and early
    1990s
  • Technological factors
  • specifically prevalence of PC with improved
    processing power
  • Translation market factors
  • official bilingualism/multilingualism create
    institutional needs
  • globalisation creates huge commercial needs
  • Advances in computational linguistics
  • More realistic user expectations
  • Internet creates casual access to multilingual
    information

30
MT History: how we switched to CAT
  • However, translations produced by MT were still
    not reliable and accurate enough for large-scale
    commercial applications.
  • So, it became evident that the human translator
    cannot be eliminated and replaced by computers.
  • Actually, it became obvious that computer
    programs should be used as TOOLS which only HELP
    the translator.

31
History of CAT Tools
  • Unreliability of MT tools -> large corporations
    hire translation agencies
  • Translation agencies find it difficult to cope
    with the increasing demand
  • Translation agencies develop their own in-house
    CAT tools
  • Translation agencies begin to sell their CAT tools

32
History of CAT Tools
  • Two major players in the domain of CAT tool
    development, Trados and STAR Group, both started
    as
  • TRANSLATION AGENCIES!!!

STAR AG was founded as a small translation agency
in 1984 by Josef Zibung and Hanspeter Siegrist in
the northern Swiss city of Stein am Rhein near
Schaffhausen. It won and kept customers from the
automotive, machine tool, computer and
aeronautics industries, such as ABB, AT&T, BMW,
Dornier, IBM, Mazda, Mercedes, Nissan, Saab and
Siemens.
TRADOS was founded in 1984 by Jochen Hummel and
Iko Knyphausen in Stuttgart, Germany, to provide
translation services for IBM.
33
TRADOS timeline
  • 1990 - first version of TRADOS's main component,
    MultiTerm, was created for DOS
  • 1992 - TRADOS developed the first MultiTerm for
    Windows (v3.1)
  • 1992 - TRADOS's Translator's Workbench with
    linguistic fuzzy-matching on translation memories
    for DOS
  • 1994 - TRADOS's Translator's Workbench for
    Windows

34
TRADOS timeline (continued)
  • 1997 - BREAKTHROUGH! Microsoft decides to base
    its internal localization memory store on TRADOS
  • 1998 - Microsoft acquires a 20% share in TRADOS

TRADOS becomes a de facto industry-standard CAT
tool!!!
That's why we will mostly work with TRADOS in
this course (as far as TM is concerned).
But we will also work with Wordfast, because not
everyone can afford Trados.
35
WHAT DO WE WANT TO TEACH YOU HERE?
  • TWO PRACTICAL EXAMPLES OF COMMON TRANSLATION
    PROBLEMS

36
(No Transcript)
37
  • IMPORTANT THINGS TO NOTE
  • (quite obvious) the book has an index: YOU
    (i.e. the translator) are supposed to make it in
    the translated version of the book
  • a vast index: a lot of terminology
  • some index terms appear on several pages that
    are not necessarily in the same chapter (e.g. pg.
    36, pg. 92 and pg. 255): a very serious problem
    for the consistency of your translation

38
(No Transcript)
39
(No Transcript)
40
(No Transcript)
41
(No Transcript)
42
(No Transcript)
43
General principles of working with CAT tools
  • The main goals are EFFICIENCY and CONSISTENCY
  • CAT tools = TM tools (in this case only)
  • The basic idea is fairly simple:
  • Documents, especially technical ones, contain a
    large amount of content that is similar or
    identical to information already contained in
    earlier versions or similar documents that have
    been translated before.
  • This applies to the source (editing) language (SL)
    as well as the target (translation) languages (TL).

44
General principles of working with CAT tools
  • So, wouldn't it be great to re-use previously
    translated content as valuable reference material
    for new translations, so as to obtain
    consistency of terminology and phrasing?
  • That is exactly what CAT tools do!
  • CAT tools make it possible for translators to
    work only on content that is being created for
    the first time. Existing text, and text similar to
    existing text, is taken from the available
    reference translations (i.e. from the TM, the
    translation memory).
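
The re-use idea above can be illustrated with a minimal sketch. It is not how Trados or any other TM tool is implemented: the toy memory `TM`, the function name `lookup`, the 0.75 threshold and the sample Croatian translations are all assumptions made for illustration, and the similarity score comes from Python's standard `difflib` module rather than from a real fuzzy-matching engine.

```python
from difflib import SequenceMatcher

# Hypothetical toy translation memory: source segment -> stored translation.
# The entries are illustrative only.
TM = {
    "Press the power button.": "Pritisnite tipku za uključivanje.",
    "Close the cover before use.": "Zatvorite poklopac prije uporabe.",
}

def lookup(segment, memory, threshold=0.75):
    """Return (score, stored_source, stored_target) for the best match,
    or None if no stored segment is similar enough."""
    best = None
    for source, target in memory.items():
        # ratio() is 1.0 for identical strings and lower for partial overlap.
        score = SequenceMatcher(None, segment.lower(), source.lower()).ratio()
        if best is None or score > best[0]:
            best = (score, source, target)
    if best is not None and best[0] >= threshold:
        return best
    return None

# An exact match is re-used as is; a near match is offered for editing;
# an unrelated segment has to be translated from scratch.
for new_segment in ("Press the power button.",
                    "Press the red power button.",
                    "Warranty information"):
    print(new_segment, "->", lookup(new_segment, TM))
```

In this sketch a score of 1.0 corresponds to what TM tools call an exact match, and anything between the threshold and 1.0 to a fuzzy match that the translator still has to check and edit.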


46
TRADOS - a screenshot
47
A DREAM COME TRUE?
  • NOT REALLY!
  • TO ENJOY ALL THE BENEFITS OF CAT TOOLS, FIRST YOU
    HAVE TO CREATE A TM AND A TERMINOLOGY DATABASE
  • either from your old translations
  • or from new translations (i.e. creating a TM from
    scratch)

THAT IS WHERE OTHER CAT TOOLS (i.e. NON-TM CAT
TOOLS) STEP IN TO SAVE THE DAY!!!
48
REUSING YOUR OLD TRANSLATIONS
  • The best way to make a TM:
  • reliable source (YOU did the translation)
  • readily available (stored on your PC)

49
A BRIEF DIGRESSION
  • The term LOCALIZATION has often popped up in
    previous slides
  • What is LOCALIZATION?

50
WHAT IS LOCALIZATION?
  • Localization is the process of adapting,
    translating and customizing a product (software)
    for a specific market (for a specific locale or
    set of cultural conventions; the locale usually
    determines conventions such as sort order,
    keyboard layout, date, time, number and currency
    formats). In terms of software localization, this
    means the production of interfaces that are
    meaningful and comprehensible to local users.
  • The Localization Industry Standards Association
    (LISA) defines localization as follows:
    "Localization involves taking a product and making
    it linguistically and culturally appropriate to the
    target locale (country/region and language) where
    it will be used and sold."
  • Typically, this involves the translation of the
    user interface (the messages a program presents
    to users) to enable them to create documents and
    data, modify them, print them, send them by
    e-mail, etc.

51
LOCALIZATION: what it includes
  • Focal points of internationalization and
    localization efforts include:
  • Language
  • Computer-encoded text
  • Alphabets/scripts; different systems of numerals;
    left-to-right vs. right-to-left scripts.
    Most recent systems use Unicode to solve many
    of these character encoding problems.
  • Graphical representations of text (printed
    materials, online images containing text)
  • Spoken (Audio)
  • Sub-titles for video
  • Date/time format, including use of different
    calendars
  • Formatting of numbers (decimal points,
    positioning of separators, character used as
    separator)
  • Time zones (UTC in internationalized
    environments)
  • Currency
  • Images and colors: issues of comprehensibility
    and cultural appropriateness
  • Names and titles
  • Government assigned numbers (such as the Social
    Security number in the US, National Insurance
    number in the UK) and passports
  • Telephone numbers, addresses and international
    postal codes
  • Weights and measures
  • Paper sizes
  • Differences between local standards (e.g. YU ISO
    or JUS) and international standards (ISO)
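
The character encoding problem mentioned in the list above (and the reason recent systems use Unicode) can be shown with a short sketch. It relies only on Python's built-in codecs; the sample string and the variable names are illustrative assumptions.

```python
# Croatian letters that lie outside basic ASCII.
text = "čćđšž"

# Encode with the Central European Windows code page (cp1250):
# each letter becomes a single byte.
raw = text.encode("cp1250")

# Reading the same bytes with the Western European code page (cp1252)
# silently produces wrong letters, because several byte values stand for
# different characters in the two code pages.
garbled = raw.decode("cp1252")

# With Unicode (here in its UTF-8 encoding) every character has one
# unambiguous code point, so the round trip is lossless.
utf8 = text.encode("utf-8")
restored = utf8.decode("utf-8")

print(garbled)           # 'èæðšž': č, ć and đ are corrupted, š and ž survive
print(restored == text)  # True
```

This is the kind of damage a translator sees when a file saved under one code page is opened under another; storing texts and translation memories as Unicode avoids it.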

52
LOCALIZATION vs. INTERNATIONALIZATION
  • The distinction between internationalization and
    localization is subtle but important:
  • Internationalization is the adaptation of
    products for potential use virtually everywhere,
    while
  • localization is the addition of special features
    for use in a specific locale.
  • The processes are complementary, and must be
    combined to lead to the objective of a system
    that works globally.

53
CAT tools for localization
  • Over the last couple of years, in addition to
    general-purpose TM tools such as Trados and
    Transit, translation technology companies have also
    developed a number of TM tools specially designed
    for localization:
  • Alchemy CATALYST
  • PASSOLO
  • Sisulizer

SISULIZER is currently the industry standard
localization tool, so that's the one in which we
will work!!!
54
SISULIZER - a screenshot
55
Other CAT tools (non-TM based)
  • As we said earlier, computer-assisted translation
    (CAT) is a broad and somewhat imprecise term
    covering a range of tools, from the fairly simple
    to the more complicated, which can include:
  • Word processors, grammar and spell checkers,
    terminology managers, eBooks, eDictionaries,
    full-text search tools, concordancers, the Web, TM
    tools, bitexts, etc.

56
CAT - REFERENCE MATERIALS
  • Reference materials are the primary source of
    terminology in the absence of a translation memory.
  • Computer-based reference materials can be
    classified into:
  • Online libraries
  • Specialized web resources
  • Specialized software products
  • Other materials in electronic formats

57
Online Libraries
  • Large collections of books in electronic form,
    e.g.
  • eBrary (new scientific books, pay site)
  • Internet Archive (hosting the Million Book
    Project)
  • Project Gutenberg (PD fiction books, free)
  • Questia (popular titles, fiction and
    non-fiction; pay site, some sections free)

58
Internet Archive
59
eBrary
60
Questia
61
Questia
62
Specialized web resources
  • Online glossaries
  • e.g. http://www.lai.com/glossaries.html
  • Online terminology databases
  • e.g. EURODICAUTOM
  • Acronym dictionaries
  • e.g. www.acronymfinder.com
  • Online dictionaries
  • e.g. www.thefreedictionary.com
  • Online corpora (e.g. BNC and COCA)

63
Online glossary: Language Automation glossary
index
64
Online terminology databases - EURODICAUTOM
65
Online terminology databases - EURODICAUTOM
66
Acronym dictionary: www.acronymfinder.com
67
Online dictionary: www.thefreedictionary.com
68
BNC: British National Corpus
69
BNC: British National Corpus
70
(No Transcript)
71
COCA: Corpus of Contemporary American English
72
(No Transcript)
73
Specialized software products
  • Various programs that can be used for terminology
    extraction:
  • Electronic dictionaries
  • General monolingual, e.g. OED v3
  • Specialized monolingual, e.g. Cambridge
    Pronouncing Dictionary, Collins Collocations
  • Bilingual, e.g. Morton Benson, MidiDict
  • Electronic Bible (e.g. e-Sword)
  • Concordance programs (e.g. Concordancer)
  • Data-mining programs (e.g. Summarizer Pro)

74
Electronic dictionaries - OED
75
Electronic Bible - e-Sword
76
Concordancers
  • Make it possible to see a word in context
  • Useful for finding collocations and phrases
  • Useful for extracting terminology
  • Two types:
  • Monolingual concordancers (e.g. WordSmith)
  • Multilingual (parallel) concordancers (e.g. ParaConc)

77
Monolingual Concordancer
78
Parallel Concordancer
79
Intellexer Summarizer Pro
80
Intellexer Summarizer Pro
81
THE END