Machine Translation and NLP - PowerPoint PPT Presentation

Description:

Context Free Grammar of English Language. S PreNP NP VP PostVP ... Urdu translation is built by re-arrangement and inflection of words and phrases. ...

Slides: 36
Provided by: Inb2

Transcript and Presenter's Notes

1
Machine Translation and NLP
??? ???? ?????? ??????
  • ????? ????
  • ???? ??????? ?????
  • ????? ?????

2
  • Machine Translation: An Overview
  • English to Urdu Translation
  • Urdu to English Translation

3
??? ???
  • Machine Translation
  • An Overview

4
Natural Language Processing
  • Natural language is a term which denotes a
    (naturally occurring) human language as opposed
    to computer languages and other artificial
    languages.
  • Natural Language Processing is the field of
    inquiry concerned with the study and development
    of computer systems for processing natural
    (human) languages.

5
Applications of NLP
  • Question Answering Systems
  • Chat Bots
  • NL Questions to DB or Search Engine
  • NL Commands to Computers and other machines
  • Information Extraction
  • Grammar Checkers
  • Machine Translation
  • ...

6
Machine Translation
  • Translation of text from one natural language
    into another natural language.
  • But what do Translation and Human Translation
    mean?

7
Machine Translation Problems
  • A word can have many parts of speech.
  • A word with the same part of speech can have many
    meanings.
  • A grammar that covers all possible sentences of a
    natural language is not possible.
  • Grammars of natural languages are ambiguous.
  • Understanding a sentence requires background
    knowledge of the world.

8
MT Problems (some examples)
  • Time flies like an arrow.
  • Is "flies" a noun or a verb?
  • The spirit is willing, but the flesh is weak
  • translated into Russian and back into English, becomes
  • The vodka is good, but the steak is lousy
  • Cette personne n'est pas de permanence
    aujourd'hui (This person is not on duty today)
  • is translated into English as
  • This person is not any today permanence

9
MT Problems (some examples)
  • The federal cabinet here on Saturday approved a
    new labour policy to meet the challenges of
    globalization and emerging technologies, giving
    new directions for improvement and guidance in
    the labour sector. (Dawn, 22nd September, 2002)
  • Real textual data looks more like this text than
    like the traditionally quoted sentences, such as
  • I have a book.
  • He goes to school.

10
English to Urdu Translator
  • ??? ???

11
Structure of Translator
  • Lexical Module
  • Syntax Module
  • Transformation Module

12
Lexical Module
  • Pre-Processor
  • Detect proper nouns
  • Convert short forms (don't → do not)
  • Detect abbreviations like etc., Mr.
  • Tokenizer
  • Search the database of words and proper nouns and
    generate all possible interpretations of a word.
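
The lexical steps above can be sketched roughly as follows. This is only an illustration under stated assumptions: the tables SHORT_FORMS, ABBREVIATIONS and WORDS, the function names, and the rule "an unknown capitalized word that is not sentence-initial is a proper noun" are all hypothetical and are not taken from the presented system.

```python
import re

# Hypothetical tables for illustration; the slides do not list the real entries.
SHORT_FORMS = {"dont": "do not", "don't": "do not"}
ABBREVIATIONS = {"etc.", "mr."}
WORDS = {
    "time": ["Noun", "Verb"],
    "flies": ["Noun", "Verb"],
    "like": ["Verb", "Preposition"],
    "an": ["Article"],
    "arrow": ["Noun"],
}

def preprocess(text):
    """Expand short forms word by word; other words pass through unchanged."""
    words = []
    for word in text.split():
        words.extend(SHORT_FORMS.get(word.lower(), word).split())
    return words

def tokenize(words):
    """Split trailing punctuation from each word, except for known abbreviations."""
    tokens = []
    for word in words:
        if word.lower() in ABBREVIATIONS:
            tokens.append(word)
            continue
        core, punctuation = re.match(r"^(.*?)([.,;:!?]*)$", word).groups()
        if core:
            tokens.append(core)
        tokens.extend(punctuation)  # one token per punctuation character
    return tokens

def interpretations(token, position):
    """Return every category listed for a token.

    Assumption: an unknown capitalized word that is not sentence-initial
    is guessed to be a proper noun.
    """
    found = WORDS.get(token.lower())
    if found:
        return list(found)
    if position > 0 and token[:1].isupper():
        return ["ProperNoun"]
    return ["Unknown"]
```

Returning every category for a word, rather than picking one, leaves the choice between interpretations to the later modules.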

13
Structure of Lexicon
  • Word
  • Category: Noun, Pronoun, ...
  • SubCategory: Auxiliary Verb, Possessive Pronoun,
    ToPreposition, ...
  • Sense: Human, Animate, Inanimate

14
Structure of Lexicon - Contd.
  • Form: Base, First, Second (for verb form); First,
    Second, Third (for person); Comparative,
    Superlative (for adjectives)
  • Number: Singular, Plural
  • Gender: Masculine, Feminine
  • Object Preposition, Subject Preposition
  • ?? ? ??? ??

15
Structure of Lexicon - Contd.
  • Object Count: number of objects required by the
    verb
  • Urdu Meaning
  • Meaning for different forms
  • Meaning of Adjective and Noun for different forms
    of Gender and Number like ??? ??? ??? ???? ?
    ????? ????? ???
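
One way to picture a lexicon entry with the fields listed on the last three slides is a small record type. This is a sketch only: the class name, field names, types and defaults are assumptions, and the romanized Urdu strings ("likh", "kitab") are placeholders chosen for illustration, not entries from the actual database.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

@dataclass
class LexiconEntry:
    """One interpretation of a word (hypothetical layout)."""
    word: str
    category: str                              # Noun, Pronoun, ...
    subcategory: Optional[str] = None          # Auxiliary Verb, Possessive Pronoun, ...
    sense: Optional[str] = None                # Human, Animate, Inanimate
    form: Optional[str] = None                 # Base, Comparative, ...
    number: Optional[str] = None               # Singular, Plural
    gender: Optional[str] = None               # Masculine, Feminine
    subject_preposition: Optional[str] = None
    object_preposition: Optional[str] = None
    object_count: int = 0                      # objects required by a verb
    urdu_meanings: Dict[str, str] = field(default_factory=dict)  # form -> meaning

# Illustrative entries with romanized placeholder meanings.
write = LexiconEntry(word="write", category="Verb", form="Base",
                     object_count=1, urdu_meanings={"Base": "likh"})
book = LexiconEntry(word="book", category="Noun", sense="Inanimate",
                    number="Singular", gender="Feminine",
                    urdu_meanings={"Singular": "kitab"})
```

A word with several interpretations would simply have several such entries, one per category.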

16
Syntax Box
  • Context Free Grammar of English Language.
  • S → PreNP NP VP PostVP ...
  • ? Noun
  • ?
  • ?
  • ? Prep

17
Some important points of Grammar
  • Active and Passive forms of Positive and Negative
    sentences are modeled.
  • Adverbial Phrases at the beginning, middle and
    end of the sentence are modeled.
  • Infinitive Verb Phrase (to VERB) is modeled.
  • ...

18
Partial Parsing
  • The system uses a Bottom-Up Chart Parser, which
    makes Partial Parsing possible. Hence it can deal
    with sentences that have some small error (or
    sentences that do not conform to the grammar).
  • I know him He lives here.
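
To illustrate why a bottom-up chart allows partial parsing, here is a minimal sketch. The toy GRAMMAR and LEXICON, the function names, and the greedy "longest constituent first" cover are all assumptions made for this example; the slides do not describe how the actual parser selects partial results.

```python
# Toy grammar and lexicon, invented for this example.
GRAMMAR = [
    ("S", ("NP", "VP")),
    ("NP", ("Pronoun",)),
    ("VP", ("Verb", "NP")),
    ("VP", ("Verb", "Adverb")),
    ("VP", ("Verb",)),
]
LEXICON = {"i": "Pronoun", "he": "Pronoun", "him": "Pronoun",
           "know": "Verb", "lives": "Verb", "here": "Adverb"}

def match_sequence(chart, rhs, start):
    """Return every end position reachable by matching rhs from start."""
    positions = {start}
    for symbol in rhs:
        positions = {end for (begin, end, label) in chart
                     if label == symbol and begin in positions}
    return positions

def build_chart(tokens):
    """Add edges (start, end, label) bottom-up until nothing new appears."""
    chart = {(i, i + 1, LEXICON.get(token.lower(), "Unknown"))
             for i, token in enumerate(tokens)}
    changed = True
    while changed:
        changed = False
        for lhs, rhs in GRAMMAR:
            for start in range(len(tokens)):
                for end in match_sequence(chart, rhs, start):
                    if (start, end, lhs) not in chart:
                        chart.add((start, end, lhs))
                        changed = True
    return chart

def partial_parse(tokens):
    """Cover the input left to right with the longest edge at each position."""
    chart = build_chart(tokens)
    pieces, position = [], 0
    while position < len(tokens):
        candidates = [(end, label) for (begin, end, label) in chart
                      if begin == position]
        # Prefer the longest span; among equal spans prefer a full sentence.
        end, label = max(candidates, key=lambda c: (c[0], c[1] == "S"))
        pieces.append((label, tokens[position:end]))
        position = end
    return pieces
```

Because every constituent that can be built stays in the chart, the run-on input "I know him He lives here" still yields two sentence-level pieces even though no single S spans the whole input. The greedy cover is a simplification and is not guaranteed to find the best segmentation in general.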

19
Transformational Module
  • The parse structure produced by the Syntactical
    Module is traversed.
  • Urdu translation is built by re-arrangement and
    inflection of words and phrases.

20
Transformational Module (contd.)
  • If more than one parse is generated by the
    Syntactical Module, heuristics are used to choose
    the best interpretation.
  • If an Auxiliary Verb is used as the Main Verb, the
    parse gets a negative weight.
  • If an Adjective is used as a noun, the parse gets
    a negative weight.
  • If a Verb is used as a noun, the parse gets a
    negative weight.
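
The weighting heuristics above can be sketched as a simple scoring function. The penalty value of -1, the parse representation (a list of word, lexical category, role triples) and the two readings of "I can fish" are assumptions for illustration; the slides state only that these cases receive a negative weight.

```python
# Hypothetical weights; the slides only say these cases carry a negative weight.
PENALTIES = {
    ("AuxiliaryVerb", "MainVerb"): -1,
    ("Adjective", "Noun"): -1,
    ("Verb", "Noun"): -1,
}

def score(parse):
    """Sum the penalties for every word whose role differs in a penalized way."""
    return sum(PENALTIES.get((category, role), 0)
               for _word, category, role in parse)

def best_parse(parses):
    """Pick the parse with the highest (least negative) score."""
    return max(parses, key=score)

# Two made-up readings of "I can fish".
able_to_fish = [("I", "Pronoun", "Pronoun"),
                ("can", "AuxiliaryVerb", "AuxiliaryVerb"),
                ("fish", "Verb", "MainVerb")]
put_fish_in_cans = [("I", "Pronoun", "Pronoun"),
                    ("can", "AuxiliaryVerb", "MainVerb"),
                    ("fish", "Noun", "Noun")]
```

Here best_parse([put_fish_in_cans, able_to_fish]) returns the reading in which "can" stays an auxiliary, because the other reading is penalized once.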

21
English and Urdu Comparison
  • SVO and SOV
  • Order of Words in Phrases
  • Many Forms of Adjective and Prepositions
  • Many Forms of Verb
  • Object Preposition and Subject Preposition

22
SVO vs SOV
  • English is a Subject-Verb-Object language.
  • Hamid writes a letter.
  • Urdu is a Subject-Object-Verb language.
  • حامد خط لکھتا ہے

23
Order of Words in Phrases
  • For English
  • ? Prep
  • Example: of red color
  • For Urdu
  • -- Prep
  • Example: ??? ??? ??

24
Many Forms of Adjective and Prepositions
  • Blue Book, Blue Books, Blue Pen, Blue Pens
  • ???? ????? ???? ??????? ???? ??? ? ???? ???
  • Price of Book, Writer of Book
  • ???? ?? ????? ???? ?? ????

25
Many Forms of Adj and Prep (Contd.)
  • Blue Color
  • ???? ???
  • Book of Blue Color
  • (???) ???? ??? ?? ????
  • ???? ??? ?? ???? (????)

26
Many Forms of Verb
  • Rule Based System for Verb Inflection
  • The inflected form of a verb can depend on
  • Tense of Sentence
  • Gender, Number and Person of Subject or Object
  • Transitive and Intransitive Verb
  • Subject Preposition and Object Preposition
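
A heavily simplified sketch of such a rule-based inflection step is given below. Everything in it is an assumption for illustration: it uses romanized Urdu, handles only third-person forms of a regular stem in two tenses, and lets the verb agree with the object when the tense is past and the verb is transitive, and with the subject otherwise. The actual rule set of the system is not given in the slides.

```python
# Romanized suffix tables (simplified, third person only).
PRESENT_SUFFIX = {("m", "sg"): "ta", ("m", "pl"): "te",
                  ("f", "sg"): "ti", ("f", "pl"): "ti"}
PAST_SUFFIX = {("m", "sg"): "a", ("m", "pl"): "e",
               ("f", "sg"): "i", ("f", "pl"): "in"}

def inflect(stem, tense, transitive, subject, obj=None):
    """Inflect a romanized verb stem.

    subject and obj are (gender, number) pairs such as ("f", "sg").
    """
    # Choose which noun phrase controls agreement.
    if tense == "past" and transitive and obj is not None:
        gender, number = obj
    else:
        gender, number = subject
    if tense == "present":
        auxiliary = "hai" if number == "sg" else "hain"
        return f"{stem}{PRESENT_SUFFIX[(gender, number)]} {auxiliary}"
    if tense == "past":
        return f"{stem}{PAST_SUFFIX[(gender, number)]}"
    raise ValueError(f"unsupported tense: {tense}")
```

For the stem "likh" (write), a masculine singular subject in the present gives "likhta hai" and a feminine one gives "likhti hai", while in the past the form follows the object: "likha" for a masculine singular object and "likhi" for a feminine singular one.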

27
Many Forms of Verb (examples)
  • Verb Form depends on Subject (Gender, Number and
    Person) and Tense
  • ???? ???? ?????? ??
  • ???? ???? ?????? ??
  • Verb Form depends on Object (Gender, Number and
    Person) and Tense
  • ???? ?? ??? ?????
  • ???? ?? ???? ?????
  • Verb Form Depends on Verb Gender and Tense
  • ???? ?? ???? ?? ??? ??
  • ???? ?? ???? ?? ??? ??

28
Subj Preposition and Obj Preposition
  • Used in the Past Indefinite Tense with a
    transitive Urdu verb
  • Commonly نے is used with the Subject and کو is
    used with the Object
  • ?? ?? ?? ?? ?????
  • In some cases, other prepositions like ?? can be
    used.
  • ?? ?? ?? ?? ?????.
  • The presence or absence of the Object Preposition
    depends on the sense (semantic type) of the verb.
  • ?? ?? ?? ?? ????? (He asked you)
  • ? ? ?? ??? ???? ?????(He asked a question)

29
Implementation of Translator
  • Bottom Up Chart Parsing Framework
  • Words in Database
  • Grammar Rules in Database
  • Transformational Framework
  • Depth First Traversal of Parse Structure
  • Script a Rule Body (corresponding to each rule in
    database)
  • Can be customized to other NLP problems like
    Grammar Checking etc.

30
Future Directions
  • Improvement in Grammar
  • Interrogative Sentences
  • Verb Phrases acting as Nouns (Example: Reading is
    a good hobby)
  • ...
  • Statistical Disambiguation
  • Will select a suitable interpretation of a word
    depending on its adjacent words.
  • Improvements in Chart Parser
  • Every production will have a weight; high-weight
    elements will be tried first to get results
    quickly.
  • Rule-based system for prepositions
  • There is no one-to-one relationship between
    English and Urdu prepositions.
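
The statistical disambiguation idea in the list above could look roughly like this. It is a sketch of one possible approach, not the planned design: the counting scheme (previous word, word, category), the function names and the tiny hand-made corpus are all assumptions.

```python
from collections import Counter

def train(tagged_sentences):
    """Count how often a word has a category after a given previous word."""
    counts = Counter()
    for sentence in tagged_sentences:
        previous = "<s>"                      # sentence-start marker
        for word, category in sentence:
            counts[(previous, word.lower(), category)] += 1
            previous = word.lower()
    return counts

def disambiguate(counts, previous_word, word, candidates):
    """Choose the candidate category seen most often in this left context."""
    return max(candidates,
               key=lambda category: counts[(previous_word.lower(), word.lower(), category)])

# Tiny made-up training data.
corpus = [
    [("I", "Pronoun"), ("book", "Verb"), ("tickets", "Noun")],
    [("a", "Article"), ("book", "Noun")],
    [("the", "Article"), ("book", "Noun")],
]
counts = train(corpus)
```

With this data, "book" after "a" is resolved to Noun and "book" after "I" to Verb. A real system would need a much larger corpus and some smoothing for unseen contexts.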

31
Urdu to English Translation
  • ??? ???

32
Urdu To English Translation Issues
  • SOV and OSV
  • Light Verbs
  • Noun Phrase Boundary

33
SOV and OSV
  • Most of the time, Urdu sentences are in SOV
    form, but OSV is also grammatically valid.
  • ??? ?? ?? ?? ?????
  • ?? ?? ??? ?? ?????

34
Light Verbs
  • Verbs that come after main verbs.
  • ??? ??? ???? ???
  • I do work. (Incorrect)
  • I work. (Correct)
  • ??? ?? ?? ??? ???
  • I gave wrote a letter. (Incorrect)
  • I wrote a letter. (Correct)

35
Noun Phrase Boundary
  • ??? ??????? ????? ???? ?? ?????? ?? ??? ???-
  • (NP boundaries are specified by ?? and ??)
  • ??? ??????? ????? ???? ?????? ???
  • (No Hint for NP Boundaries)