Farsi eOrthography: an example of eOrthography Concept

Description:

How the orthography of a language can be followed in an encoding system ... policy of text encoding, tokenization, orthography, and text processing e.g. Corpus tagging ...

Provided by: behrangqa

Transcript and Presenter's Notes


1
Farsi e-Orthography: an example of the e-Orthography
Concept
  • Behrang Qasemizadeh

2
What Is e-Orthography and Why Do We Need It?
  • e-Orthography tells us:
  • How the orthography of a language can be followed
    in an encoding system
  • Which character codes should be used
  • How they attach to each other to form a word
  • Which tokenization policy should be adopted in
    document processing
  • How the policies for text encoding, tokenization,
    orthography, and text processing (e.g. corpus
    tagging) interact
  • e-Orthography is an implicit assumption in any
    text analysis application
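
As a rough illustration of how a tokenization policy depends on the encoding, the sketch below treats the Unicode zero-width non-joiner (U+200C), which Farsi text commonly uses inside a word, under two policies. The function name `tokenize` and the `split_on_zwnj` flag are hypothetical and are not taken from the slides.

```python
# Minimal sketch (assumption: whitespace tokenization, with an explicit
# choice about the zero-width non-joiner). Not the author's method.

ZWNJ = "\u200c"  # ZERO WIDTH NON-JOINER, used inside many Farsi words


def tokenize(text: str, split_on_zwnj: bool = False) -> list[str]:
    """Split text on whitespace.

    split_on_zwnj=False: ZWNJ stays inside the token (one word).
    split_on_zwnj=True:  ZWNJ is treated as a token boundary.
    """
    if split_on_zwnj:
        text = text.replace(ZWNJ, " ")
    # str.split() with no argument splits on whitespace only;
    # ZWNJ is a format character, so it is not a separator by default.
    return text.split()


# The verb form "mi-ravam" written with a ZWNJ between prefix and stem.
word = "\u0645\u06cc" + ZWNJ + "\u0631\u0648\u0645"

one_token = tokenize(word)                        # ZWNJ kept inside the word
two_tokens = tokenize(word, split_on_zwnj=True)   # prefix and stem separated
```

Whichever policy is chosen, a corpus tagger downstream sees different token counts for the same text, which is the kind of interaction the slide points to.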

3
e-Orthography is a Must for Some Languages
  • Farsi, an Indo-European language, is written in
    the Arabic script (the cursive script of a
    Semitic language)
  • Words can be represented in different forms
    (different codes, different keys)
  • Available standards do not tell us how words
    should be represented in a digital environment,
    or at least which keys are equivalent
  • Solution: e-Orthography
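
To make "different codes, different keys" concrete, the sketch below maps two Arabic code points onto the Farsi letters they are often typed in place of, so that two spellings of the same word compare equal. The mapping table and the name `normalize` are illustrative assumptions, not rules stated in the slides; a real e-Orthography would define the full set of equivalences.

```python
# Minimal sketch (assumption: only two equivalences are handled).
# Unicode encodes these look-alike letters at separate code points.

EQUIVALENT_CODES = str.maketrans({
    "\u064a": "\u06cc",  # ARABIC LETTER YEH -> ARABIC LETTER FARSI YEH
    "\u0643": "\u06a9",  # ARABIC LETTER KAF -> ARABIC LETTER KEHEH
})


def normalize(text: str) -> str:
    """Replace each listed Arabic code point with its Farsi counterpart."""
    return text.translate(EQUIVALENT_CODES)


# "ketab" (book) typed with the Arabic kaf and with the Farsi keheh.
typed_with_arabic_kaf = "\u0643\u062a\u0627\u0628"
typed_with_farsi_keheh = "\u06a9\u062a\u0627\u0628"

# The raw strings differ code point by code point...
differs_before = typed_with_arabic_kaf != typed_with_farsi_keheh
# ...but compare equal once both pass through the same normalization.
equal_after = normalize(typed_with_arabic_kaf) == normalize(typed_with_farsi_keheh)
```

Without such a step, a search or frequency count would treat the two spellings as unrelated words.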