ML_MUA_1 - PowerPoint PPT Presentation

Testing multilingual support in Mail User Agents. TERENA Pilot Project. Yuri Demchenko, TERENA. TNC 98, Dresden, October 5-8, 1998.

Provided by: Yuri99
Learn more at: http://www.uazone.org
1
Testing multilingual support in Mail User Agents:
TERENA Pilot Project
  • Yuri Demchenko, TERENA <demch_at_terena.nl>
  • TNC98, Dresden, October 5-8, 1998

2
TERENA Pilot Project on Testing Multilingual MUAs
  • Officially ran from April 1998 to September
    1998
  • The project objectives are to
  • Develop benchmarking methodology for Multilingual
    MUAs, and specify templates for collecting the
    results in a coherent way.
  • Design a set of composite multilingual test
    messages
  • Configure each MUA for all supported national
    character sets and send the test messages to
    other MUAs and to themselves.
  • Compile the results, analyzing how the MUA
    composes, sends, receives and displays the test
    messages.
  • Prepare recommendations for users - correct setup
    and operation of popular multilingual MUAs

3
The list of mail clients to be tested
  • Derived from TERENA MUAs usage statistics based
    on analysis of more than 3000 messages from
    TERENA Mail archives collected during the period
    August 1997 - March 1998
  • Microsoft Windows (NT, 3.11, 95)
  • Microsoft Outlook Express
  • Netscape Mail 3.x and 4.x
  • Netscape Messenger
  • Qualcomm Eudora 3.0 and 4.0 beta
  • Pegasus Mail
  • The Bat!
  • ESYS Simeon
  • Alis Tango Mailer
  • UNIX Terminal
  • Elm
  • MH
  • Pine
  • UNIX GUI (with X11R6)
  • Netscape Mail
  • EXMH
  • Z-Mail

4
Activity and Projects in i18n and Multilingual
Support
  • i18n activity (ISO, IETF, ECMA, TERENA, Unicode
    Consortium)
  • CEN/TC304 work on European character sets and
    keyboards
  • MAITS project
  • Internet Mail Consortium - Report on using
    International Characters in Internet Mail
  • TERENA Pilot Project on Testing Multilingual
    support in MUAs

5
Internet Mail Consortium - i18n Report
  • Summary of recommendations
  • 1. Explicit charset parameter
  • 2. Sending UTF-8
  • 3. Displaying UTF-8
  • 4. Choosing charsets on creation
  • 5. Specifying languages
  • 6. Multi-language text
  • 7. Non-ASCII headers
  • 8. Handling all common charsets
  • 9. MTAs and 8-bit content

The report strongly recommends that all mail-creating
and mail-displaying programs created or revised
after January 1, 1999 be able to create and
display mail using UTF-8 and to handle all common
charsets in addition to UTF-8.

6
Standards on i18n and Character Set Technologies
  • ISO standards
  • ISO 2022 Character Set Concept and Terminology
  • ISO 8859-x Character Sets
  • ISO Standards on APIs i18n and FDCC
  • Unicode standards
  • RFC 2277 IETF Policy on Character Sets and
    Languages
  • Recommendation of IAB Workshop on character sets
    technology (RFC 2130)
  • MIME format of messages (Using MIME in Internet
    Mail) RFC 2045-RFC 2049
  • RFC 822 - Syntax of the electronic message format

7
Standards in i18n and Multilingual Support in
Internet Mail
  • RFC 2045 - RFC 2049, RFC 2231 - MIME
  • Coded Character Set
  • Character Encoding Scheme specified by the
    Charset parameter to the Content-Type header
    field
  • Transfer Encoding Syntax, such as Base64 or
    Quoted-Printable (QP), specified by the
    Content-Transfer-Encoding header field
  • RFC 2277 - IETF Policy on Character Sets and
    Languages
  • main definitions and requirements for language
    tagging
  • RFC 2130 - Recommendation of IAB Workshop on
    character sets technology
  • framework for interoperability between the many
    character sets in use
  • an architecture model for on-the-wire
    transmission of text
  • recommendations for tagging transmitted (and
    stored) text
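The layering listed above (characters, then a charset that turns them into octets, then a transfer encoding that makes the octets safe for transport) can be illustrated with a minimal sketch using Python's standard email package. The helper name build_body and the sample text are assumptions made for illustration; they are not part of the project materials.

```python
from email.message import EmailMessage

def build_body(text: str, charset: str, cte: str) -> EmailMessage:
    """Label one text body with a charset and a transfer encoding."""
    msg = EmailMessage()
    # charset -> "charset" parameter of Content-Type (character encoding scheme)
    # cte     -> Content-Transfer-Encoding header (transfer encoding syntax)
    msg.set_content(text, charset=charset, cte=cte)
    return msg

# Hypothetical sample: a Russian word sent as KOI8-R octets wrapped in Base64.
msg = build_body("Привет", "koi8-r", "base64")
print(msg["Content-Type"])
print(msg["Content-Transfer-Encoding"])
print(msg.get_content().strip())  # the receiver reverses both layers
```

A receiving MUA has to undo the layers in the opposite order: first the transfer encoding, then the charset named in Content-Type.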

8
RFC 2130 Architecture model
  • User interface issues (OS, GUI, API)
  • Layout
  • Culture
  • Locale
  • Language
  • On-the-wire
  • The Coded Character Set
  • The Character Encoding Scheme
  • The Transfer Encoding Syntax

9
The testing and the evaluation scheme
10
Testing of Multilingual support in MUAs
  • Includes the following phases
  • Evaluation of Multilingual features/settings of
    MUAs
  • Testing Message Reading procedure
  • Testing Message Composing procedure
  • Testing Message Sending and Receiving procedure

11
Evaluation of Multilingual features/settings of
MUAs
  • READ operation mode
  • choose Language/Encoding
  • choose Fonts (Optional for Address, Subject,
    Message Body, Quoted Text)
  • Optional - Font mapping
  • COMPOSE operation mode
  • choose Language/Encoding Settings
  • Optional - Possibility to switch
    Language/Encoding during composition/typing
  • choose Fonts (Optional for Address, Subject,
    Message Body, Quoted Text)
  • Optional - choose Spelling/Language/Dictionary
  • SEND operation mode
  • set MIME encoding (Quoted Printable, Base64)
  • Optional - select/disable Uuencode mode
    (non-standard)
  • Allow/disallow 8-bit in Header Fields
  • select/disable HTML in body parts
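Why the choice between Quoted-Printable and Base64 in SEND mode matters can be seen by comparing encoded sizes. The following is only a sketch with hypothetical sample strings: Quoted-Printable tends to stay compact and readable for mostly-ASCII text, while Base64 tends to be shorter when nearly every octet is non-ASCII.

```python
import base64
import quopri

def transfer_sizes(text: str, charset: str) -> dict:
    """Return the octet count of a text under each transfer encoding."""
    octets = text.encode(charset)
    return {
        "raw": len(octets),
        "quoted-printable": len(quopri.encodestring(octets)),
        "base64": len(base64.b64encode(octets)),
    }

# Hypothetical samples: one mostly-ASCII Latin-1 text, one all-Cyrillic text.
latin = transfer_sizes("Resume: caf\u00e9 menu attached", "iso-8859-1")
cyrillic = transfer_sizes("Тестовое сообщение", "koi8-r")
print(latin)
print(cyrillic)
```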

12
Message Reading procedure
  • Multilingual MUAs should support the following
    features
  • Reading/Displaying non-ASCII characters in
    Message Body
  • Reading/Displaying non-ASCII characters in
    Message Header (Address, Subject Lines)
  • Reading Forwarded Message with non-ASCII
    characters in Address, Subject, Message Body,
    using the same or different MIME character set
    attributes
  • Reading Attached non-ASCII Text File (Document)
  • Possible problems are detected by comparing the
    appearance of the original and the delivered
    test messages
  • This includes evaluating whether the MUA
    processes the MIME attributes of the test
    message correctly.

13
Message Composing procedure
  • Message composition operations to be tested
  • Typing message from keyboard
  • Copy and Paste operations
  • Text/File attachments
  • Quoted text/message
  • Edit different parts of message
  • Charset/Encoding processing by Message
    Composer/Editor
  • Real Message composition also includes operations
    like
  • Typing non-ASCII text in Message Body and Message
    Header
  • Pasting non-ASCII-Text into Body and Header
    fields
  • Reply to message with non-ASCII Text
  • Forward message with non-ASCII content
  • Attach text documents containing non-ASCII
    characters

14
Test messages set
  • Each test is performed in at least two character
    sets: one is US-ASCII (or ISO 8859-1), and the
    other contains characters that are not part of
    US-ASCII or ISO 8859-1.
  • Mandatory
  • tmsg1 - Message with non-ASCII characters/text in
    the Subject line
  • tmsg2 - Message with non-ASCII characters/text in
    Mail Address free-form name
  • tmsg3 - Message with non-ASCII characters/text in
    the Message Body text (single part)
  • tmsg4 - Message with non-ASCII characters/text in
    text/plain attachment
  • Optional
  • tmsg6 - Message with UTF-7/UTF-8 character set
    in Message Body and Header
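As a rough illustration of the four mandatory messages, the sketch below builds tmsg1-tmsg4 with Python's standard email package. The addresses, the ASCII subject and body, the file name and the sample word are all hypothetical; the project's actual test strings are not reproduced here.

```python
from email.header import Header
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText
from email.utils import formataddr

def make_test_messages(sample: str, charset: str) -> dict:
    """Build tmsg1-tmsg4, each carrying `sample` in a different place."""
    def envelope(msg, subject="ML MUA test", name="Tester"):
        # formataddr wraps a non-ASCII display name as an encoded-word.
        msg["From"] = formataddr((name, "tester@example.org"), charset)
        msg["To"] = "iut@example.org"
        # Non-ASCII subjects become encoded-words in the tested charset.
        msg["Subject"] = subject if subject.isascii() else Header(subject, charset)
        return msg

    def ascii_body():
        return MIMEText("Plain ASCII body.", "plain", "us-ascii")

    multipart = MIMEMultipart()
    multipart.attach(ascii_body())
    attachment = MIMEText(sample, "plain", charset)
    attachment.add_header("Content-Disposition", "attachment", filename="sample.txt")
    multipart.attach(attachment)

    return {
        "tmsg1": envelope(ascii_body(), subject=sample),       # Subject line
        "tmsg2": envelope(ascii_body(), name=sample),          # free-form name
        "tmsg3": envelope(MIMEText(sample, "plain", charset)), # single-part body
        "tmsg4": envelope(multipart),                          # text/plain attachment
    }

messages = make_test_messages("Тест", "koi8-r")
print(messages["tmsg1"].as_string())
```

Keeping every other field in ASCII means that a display failure can be traced to the one place where the non-ASCII text was put.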

15
Testing program map
16
Testing Methodology - The tests to be performed
  • test-1 - Receive all 4 test messages tmsg1-tmsg4
    and display them correctly (Change
    Language/Alphabet/Encoding Options if needed)
  • test-2 - Print all 4 messages tmsg1-tmsg4 to the
    standard printer
  • test-3 - Reply to messages tmsg1 and tmsg2, and
    check that information is returned in the same
    character set as it arrived in
  • test-4 - Reply to message tmsg3 using "reply
    including quote of body"
  • test-5 - Reply to message tmsg3 using the
    environment's "cut and paste" function to insert
    the non-ASCII characters into the outgoing
    message
  • test-6 - Forward all 4 messages to the originator
    address
  • test-7 - Generate, as completely as possible, the
    same messages from the keyboard of the IUT
    (implementation under test)
  • test-8 - Check for text distortion when
    exchanging tmsg1, tmsg2 and tmsg3 with a
    non-ASCII default Language/Alphabet/Encoding
  • test-9 - Perform tests 1-5 for message tmsg6
    with UTF-7/UTF-8
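The check in test-3 (the reply comes back in the same character set as the original) could be automated along the following lines. This is a sketch under assumptions: the function names and sample messages are hypothetical, and "same character set" is interpreted as the same set of non-ASCII charset labels in the headers and text parts.

```python
from email import message_from_string
from email.header import Header, decode_header
from email.mime.text import MIMEText

def charsets_used(msg) -> set:
    """Collect charset labels from encoded-word headers and text parts."""
    found = set()
    for name in ("Subject", "From", "To"):
        for _, charset in decode_header(msg.get(name, "")):
            if charset:
                found.add(charset.lower())
    for part in msg.walk():
        if part.get_content_maintype() == "text":
            found.add(part.get_content_charset("us-ascii"))
    return found

def reply_keeps_charset(original, reply) -> bool:
    """True if the reply uses the same non-ASCII charsets as the original."""
    ascii_only = {"us-ascii"}
    return charsets_used(reply) - ascii_only == charsets_used(original) - ascii_only

def sample(subject: str, body: str, charset: str):
    """Build a message and re-parse it, as a receiving MUA would see it."""
    msg = MIMEText(body, "plain", charset)
    msg["Subject"] = Header(subject, charset)
    return message_from_string(msg.as_string())

original = sample("Тест", "Тест", "koi8-r")
same = sample("Re: Тест", "Ответ", "koi8-r")
changed = sample("Re: Тест", "Ответ", "windows-1251")
print(reply_keeps_charset(original, same), reply_keeps_charset(original, changed))
```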

17
Testing Results Presentation
18
ML MUAs Testing Results and Data Analysis
  • Testing results are documented and presented at
  • http://park.kiev.ua/multiling/ml-mua/prjdocs/mlmua-repv1.html
  • Standards overview on Internationalisation and
    Multilinguality
  • http://park.kiev.ua/multiling/ml-mua/mldoc-review.html
  • Test messages constructor pilot version
  • http://park.kiev.ua/multiling/ml-mua/testcon.html

19
Evaluation of ML MUAs
  • First group - includes MUAs that support multiple
    languages/alphabets by means of multiple charsets
    support and use internal language/charset
    transformation
  • Microsoft Outlook Express
  • Netscape Messenger 4.04 and previous product
    Netscape Mail 3
  • exmh for X Windows
  • Second group - provides ML support by selecting
    proper font for creating and displaying messages
  • Eudora Pro 3.0
  • Pegasus
  • Forte Agent
  • The Bat!
  • Simeon
  • UNIX Terminal Products
  • pine
  • elm

20
First group - Full Multilingual Support
слово
  • Microsoft Outlook Express
  • has the best and richest multilingual support
  • uses an effective internal conversion scheme that
    users can control well via setup and the
    Alphabet/Charset selection menu
  • Netscape Messenger 4.04 and Netscape Mail 3.04
  • provide rich multilingual support for many
    charsets/encodings
  • but are very inflexible for languages that have
    many charsets in use (for example, Cyrillic
    Windows CP-1251 and KOI8-R/U for
    Russian/Ukrainian, or ISO 8859-2 and Windows
    CP-1250 for Central European languages)
  • Netscape products for X Windows - the same
    features.
  • exmh for X Windows
  • provides good support for the main groups of
    European languages using Latin 1, Latin 2 and
    Cyrillic charsets
21
Second group - Simplified Multilingual Support
твердо
  • Popular in Latin1 (ISO 8859-1) and English
    speaking community
  • Languages and charsets/encodings support is
    provided by selecting proper font for creating
    and displaying messages.
  • Eudora Pro 3.0
  • Pegasus
  • Forte Agent
  • The Bat! provides simple conversion between
    Cyrillic encodings (ISO 8859-5, Windows CP-1251,
    KOI8-R)
  • Simeon
  • pine and elm for UNIX

22
Common problems of multilingual support in MUAs
ук
  • Conversion between different Encodings/Charsets
    for the same language
  • Correct processing of MIME tags in message Header
    fields (Subject and Address lines) when
    displaying a message whose header charset
    differs from that of the Message Body
  • The same problems occur when the user tries to
    change Charset/Encoding while displaying or
    composing a message, or uses Copy/Paste
    operations between different Charsets
  • Viewing message source code and/or message info
    (charset/encoding for the Header and Body,
    Multipart MIME structure, and so on)
  • Using common and correct terminology for
    language/charset settings in MUAs
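The first problem in the list, conversion between charsets for the same language, and the garbling caused by a wrong charset label can be sketched with Python's built-in codecs. The helper transcode and the sample word are assumptions chosen for illustration.

```python
def transcode(octets: bytes, src: str, dst: str) -> bytes:
    """Convert octets from one charset to another via Unicode."""
    return octets.decode(src).encode(dst)

word = "слово"
koi8 = word.encode("koi8-r")
cp1251 = transcode(koi8, "koi8-r", "cp1251")
# Same characters, different octets in each charset.
print(koi8.hex(), cp1251.hex())

# A MUA that ignores the label and assumes ISO 8859-1 shows garbage.
garbled = cp1251.decode("iso-8859-1")
print(garbled)

# Reversing the wrong step and applying the right charset restores the text.
repaired = garbled.encode("iso-8859-1").decode("cp1251")
print(repaired)
```

The repair step works here only because ISO 8859-1 maps every octet to a character, so no information is lost by the wrong decoding.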

23
Project's Main Results
ферть
  • The international environment of the project
    made it possible to discover the main problems
    in multilingual support in MUAs
  • Multilingual test messages set
  • Evaluation scheme for the forthcoming ML MUAs
  • Project activity was conducted in coordination
    with other multilingual related projects
  • IMC MAIL-I18N report on Internationalization and
    Character Set technologies
  • Mozilla i18n project (Netscape 5.0)
  • Project team members have contributed to the new
    Ukrainian-language-enabled Mozilla
  • the proposed model of multilingual support in
    MUAs was discussed
  • ESYS Simeon IMAP Mail multilingual features
    testing

24
Follow-on Projects and activity
хер
  • Testing new products using the proposed
    methodology
  • New releases of Outlook Express 98, Netscape
    Messenger 4.5 and 5.0
  • New products of 1999 that are expected to
    implement the IETF/IMC recommendations
  • Other areas of further activity
  • Establishing a repository of ML/i18n supporting
    Charsets for online support of Multilingual
    mail (mapping reference tables download,
    translation, configuration, etc.)
  • Creating a Web-based ML test messages
    Constructor, whose pilot version is demonstrated
    on the project's page
  • http://park.kiev.ua/multiling/ml-mua/testcon.html

25
Test Messages Constructor
http://park.kiev.ua/multiling/ml-mua/testcon.html
26
Test Messages Constructor - Creating test message
27
Project Team
  • Yuri Demchenko, TERENA
  • Konstantin Chuguev, Ural Technical University,
    Russia
  • Janja Faganel, Jozef Stefan Institute, Slovenia
  • Vadim Shevchenko, Kiev Polytechnic Institute
  • Alexey Medvedev, Kiev Polytechnic Institute

28
Acknowledgments
шта
  • Borka Jerman-Blazic, Jozef Stefan Institute,
    Slovenia
  • Claudio Allocchio, Sincrotrone Trieste INFN
    Trieste, Italy
  • Peter Heijmens Visser from TERENA, for providing
    MUA usage statistics
  • Harald T. Alvestrand, Maxware Norway

29
IMPORTANT NOTE
ер
  • The Multilingual page will be moved to and
    maintained on the TERENA web server:
    http://www.terena.nl/multiling/

30
еры
31
ерь
32
ять
33
ю
34
иа
35
юс малый
36
юс большой
37
кси
38
пси
39
Russian/Ukrainian Languages: Historical overview
фита
  • VI-XI cent. - Ancient Rus written language
  • X-XIV cent. - Cyrillic written language
  • Invented by Cyril and Methodius (Thessaloniki)
    in IX cent.
  • First introduced in Moravia with the advent of
    Christianity
  • Introduced in Kievan Rus with the advent of
    Christianity in X cent.
  • XIV-XVII - Formation of the Russian literary
    language
  • With the formation of the Moscow State after the
    Mongol yoke
  • XVII - Development of the modern Russian literary
    language
  • Lomonosov, Pushkin

40
Ukrainian Literary Language
ижица
  • Common ancient roots with Russian and all Slavic
    languages
  • Was influenced by centuries of conquerors'
    languages
  • features of an analytical language (like English)
  • 1818 - Published Grammar of the Ukrainian
    (Malorussian) dialect
  • introduced ukr. i, (for kg sounds),
    spelling of дз, дж
  • Formation of the modern Ukrainian literary
    language (Taras Shevchenko)
  • 1921 - Published Main rules of Ukrainian
    orthography
  • 1984 - introduction of new/lost ukr. letter

41
(No Transcript)
42
(No Transcript)