STRANS 2002 - PowerPoint PPT Presentation

Slides: 18
Provided by: draja

Transcript and Presenter's Notes

1
STRANS 2002 Keynote Presentation
TOWARDS SPEECH TO SPEECH TRANSLATION (S2S)
Prof. R.M.K. Sinha, I.I.T. Kanpur
2
RANGE OF DEVICES
Figure 1 (block diagram). Components: document, OCR, Speech Input, ASR, TEXT, EDIT, TRANSLATE.
Figure 2 (block diagram). Components: Speech Input, ASR, OCR, TEXT, TRANSLATE, POST-EDIT, TTS.
3
Figure 3 (block diagram, Interactive). Components: ASR, TEXT (SOURCE), TRANSLATE, TEXT (TARGET), BACK TRANSLATE, Confirm, EDIT SOURCE TEXT, TRANSLATE, TTS.
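One possible reading of the interactive loop in Figure 3 is sketched below in Python: translate the recognized source text, back translate the result, let the speaker confirm it or edit the source text, and only then pass the target text on to TTS. This is a minimal sketch, not the presenter's system. The function names, the `max_rounds` limit and the two-entry phrase table (in romanized Hindi) are all hypothetical stand-ins.

```python
from typing import Callable

def interactive_translate(
    source_text: str,
    translate: Callable[[str], str],
    back_translate: Callable[[str], str],
    confirm: Callable[[str, str], bool],
    edit: Callable[[str], str],
    max_rounds: int = 3,
) -> str:
    """Translate, show the back translation, repeat until the speaker confirms."""
    text = source_text
    target = translate(text)
    for _ in range(max_rounds):
        round_trip = back_translate(target)
        if confirm(text, round_trip):
            return target          # confirmed: ready for TTS
        text = edit(text)          # speaker edits the source text
        target = translate(text)   # translate the edited text again
    return target                  # unconfirmed after max_rounds


# Hypothetical toy phrase table (romanized Hindi), used in both directions.
EN_HI = {"say it again": "phir se bolen", "anything else": "aur kuchh"}
HI_EN = {hindi: english for english, hindi in EN_HI.items()}

def toy_translate(sentence: str) -> str:
    return EN_HI.get(sentence.lower(), "<unknown>")

def toy_back_translate(sentence: str) -> str:
    return HI_EN.get(sentence, "<unknown>")

result = interactive_translate(
    "Say it agian",                                  # misrecognized source text
    toy_translate,
    toy_back_translate,
    confirm=lambda src, round_trip: round_trip != "<unknown>",
    edit=lambda src: "say it again",                 # speaker's correction
)
print(result)
```

Here the first round fails the confirmation step because the back translation is unusable, and the second round succeeds after the edit.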
4
WHY IS S2S SO DIFFICULT?
  • Difficulties in ASR

1. Vocabulary size
2. Speaker independence
3. Continuous speech
  • Unclear word boundaries
  • Co-articulation effect
  • Poor articulation of function words
  • Sentence delimiters

5
Difficulties in MT Compounded with Speech Recognition
  • Requires understanding of the conversation
  • Ill-formed inputs: superfluous parts
  • Noisy inputs: phoneme errors
  • Dealing with discourse
  • Understanding the speaker's intention
  • Real-time performance
  • Multilinguality
  • and so on.

6
Customer (English Speaking)
Service Counter Clerk (Hindi Speaking)
यदि आपको कष्ट न हो तो, क्या आप कृपया मेरे एकाउंट में जमा बता सकते हैं
If you don't mind, could you please also let me know the balance in my account?
आपका एकाउंट नम्बर ?
Your account number?
अच्छा यह दो पाँच पाँच नौ तीन है
Well, it is two five five nine three
फिर से बोलें
Say it again.
Twenty five thousand five hundred ninety three.
पच्चीस हजार पांच सौ तिरानवे
सेविंग्स है कि करेंट ?
Is it Savings or Current?
7
ओह सेविंग्स सेविंग्स
Oh Savings Savings
चार हजार नौ सौ तीस रुपये
Rupees four thousand nine hundred thirty
और पिछली निकासी
How about the last withdrawal?
Rupees twenty five hundred
पच्चीस सौ रुपये
Anything else
और कुछ
No thank you
नहीं धन्यवाद
8
Conversation in Indian Context (Hindi Speaking Region)
Customer
Service Counter Clerk
यदि आपको कष्ट न हो तो, क्या आप कृपया मेरे एकाउंट में जमा बताएंगे
If you don't mind, मेरे एकाउंट का balance बताएंगे ?
आपका एकाउंट नम्बर ?
Your account number?
हां दो पाँच पाँच नौ तीन है
हां, two five five nine three
Beg your pardon
Beg your pardon
Twenty five thousand five hundred ninety three.
पच्चीस हजार पांच सौ तिरानवे
Saving or Current ?
Saving or Current ?
9
ओह सेविंग्स सेविंग्स
Oh Savings Savings
Four thousand nine hundred thirty
Four thousand nine hundred thirty
और पिछला निकासी
और last withdrawal.
Two thousand five hundred
Two thousand five hundred
Anything else
और कुछ
No thank you
नहीं धन्यवाद
10
Block diagram of the two translation directions:
English to Hindi: Voice Input in English → English ASR → MT from English to Hindi → Speech Synthesis in Hindi → Translated Voice Output in Hindi
Hindi to English: Voice Input in Hindi → Hindi ASR → MT from Hindi to English → Speech Synthesis in English → Translated Voice Output in English
Knowledge Bases for Phoneme, Word, Sentence hypothesization, Discourse Analysis, Contextual information etc.
11
Input Speech → Phoneme Recognition → Phoneme Hypotheses → Word Identification → Word Hypotheses → Sentence Identification → Machine Translation → Speech Synthesis → Speech in Target language
Also shown: Alternate Phoneme Hypotheses, Alternate Word Hypotheses
Figure 5
12
Continuous Speech
Word boundary: Hypothesization, Identification, Revise
Sentence boundary: Hypothesization, Identification, Revise
Syntactic / Semantic analysis
Natural Language Model
Lexical Phoneme database (Hindi-English)
Bilingual Speech Model (switching Hindi to English)
Figure 6
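Word boundary hypothesization over a continuous stream, as named in Figure 6, can be illustrated with a small Python sketch. It enumerates every way to split an unsegmented string into words from a mixed Hindi-English lexicon, then picks one hypothesis. Everything here is an assumption made for illustration: the stream is a romanized character string rather than a phoneme sequence, the toy lexicon and its language tags are invented, and "fewest words wins" merely stands in for a real language model score.

```python
from functools import lru_cache
from typing import Dict, List, Tuple

# Hypothetical toy lexicon: word -> language tag ("hi" = Hindi, "en" = English).
LEXICON: Dict[str, str] = {
    "mere": "hi", "ka": "hi",
    "account": "en", "ac": "en", "count": "en", "balance": "en",
}

def segmentations(stream: str) -> List[List[str]]:
    """Return every split of `stream` into consecutive lexicon words."""

    @lru_cache(maxsize=None)
    def from_pos(i: int) -> Tuple[Tuple[str, ...], ...]:
        if i == len(stream):
            return ((),)                      # one way to segment nothing
        found = []
        for j in range(i + 1, len(stream) + 1):
            word = stream[i:j]
            if word in LEXICON:               # hypothesize a boundary at j
                for rest in from_pos(j):
                    found.append((word,) + rest)
        return tuple(found)

    return [list(seg) for seg in from_pos(0)]

hyps = segmentations("mereaccountkabalance")
best = min(hyps, key=len)                     # stand-in for a language model
tags = [LEXICON[word] for word in best]

for hyp in hyps:
    print(hyp)
print("selected:", best, tags)
```

The language tags on the selected hypothesis show where the speaker switches between Hindi and English, which is the kind of information a bilingual speech model would need.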
13
Chunk1: If you don't mind
Chunk2: could you please also let me know
Chunk3: the balance in my account
Translation engines: Statistical Translator, Semantic Translator, Rule-based Translator, Example-based Translator, ...
Selection based on Confidence Measure
Chunk1: यदि आपको कष्ट न हो तो
Chunk2: क्या आप कृपया बता सकते हैं
Chunk3: मेरे एकाउंट में बैलेन्स
Merger of the Output
यदि आपको कष्ट न हो तो, क्या आप कृपया मेरे एकाउंट में बैलेन्स बता सकते हैं
Figure 7
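The selection step of Figure 7 can be sketched in Python as follows: each chunk goes to several engines, each engine returns a translation with a confidence, and the highest-confidence candidate is kept. This is a minimal sketch under stated assumptions. The engines are lookup tables, and their romanized Hindi outputs and confidence values are invented for illustration. The merge step here only concatenates the selected chunks, whereas the merged sentence on the slide places chunk 3 inside chunk 2, so a real merger would also have to reorder.

```python
from typing import Callable, Dict, List, Tuple

# An engine maps a source chunk to (translation, confidence in [0, 1]).
Engine = Callable[[str], Tuple[str, float]]

def make_toy_engine(table: Dict[str, Tuple[str, float]]) -> Engine:
    """Build a hypothetical engine from a lookup table; unknown chunks score 0."""
    def engine(chunk: str) -> Tuple[str, float]:
        return table.get(chunk, ("", 0.0))
    return engine

C1 = "If you don't mind"
C2 = "could you please also let me know"
C3 = "the balance in my account"

# Invented outputs and confidences, for illustration only.
ENGINES: Dict[str, Engine] = {
    "rule_based": make_toy_engine({
        C1: ("yadi aapko kasht na ho to", 0.6),
        C2: ("kya aap kripaya bata sakte hain", 0.8),
        C3: ("mere khate mein shesh", 0.5),
    }),
    "example_based": make_toy_engine({
        C1: ("yadi aapko kasht na ho to", 0.9),
        C3: ("mere account mein balance", 0.7),
    }),
}

def select_by_confidence(
    chunks: List[str], engines: Dict[str, Engine]
) -> List[Tuple[str, str, float]]:
    """For each chunk keep the (engine name, translation, confidence) with top confidence."""
    selected = []
    for chunk in chunks:
        candidates = [(name, *engine(chunk)) for name, engine in engines.items()]
        selected.append(max(candidates, key=lambda c: c[2]))
    return selected

def merge(selected: List[Tuple[str, str, float]]) -> str:
    """Naive merger: join the chosen translations in source order."""
    return " ".join(text for _, text, _ in selected)

selected = select_by_confidence([C1, C2, C3], ENGINES)
merged = merge(selected)
print([name for name, _, _ in selected])
print(merged)
```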
14
CURRENT RESEARCH
  • SpeechTrans (CMU): Japanese to English
    - Doctor-patient domain
    - Uses top-down prediction from language model
  • SL-Trans (ATR, Japan): Japanese into English
    - ATR conference registration domain
    - Uses HMM-LR method

15
  • JANUS (CMU): English to Japanese and German
    - ATR conference registration domain
    - JANUS2 based on a connectionist speech recognition module, Linked Predictive Neural Network (LPNN); large vocabulary used
  • VERBMOBIL: bidirectional; German, English, Japanese
    - Uses three business-oriented domains
    - Uses multiple translation engines (5) and 198 blackboards for information exchange
  • DIPLOMAT (CMU): interactive, bidirectional; Serbo-Croatian and English
    - Peacekeeping missions: interviewing locals
    - Uses multi-engine MT
    - Back translation used for interaction

16
Road Map for S2S in Indian Context (Coarse Level)

1. Identify application domains and languages
   - industrial houses for tech. transfer support
2. Create speech corpus in the domain with transcription and tagging
   - parallel corpus in target language
   - multiple speakers from varied geographical regions and linguistic backgrounds
3. Create lexical and phoneme database
   - obtain confusion matrices
4. Algorithm for chunk identifier
5. Algorithm for parsing multilingual speech (say, Hindi mixed with English)
6. Develop multiple translation engines (a number of sub-tasks, each with its own road map)
7. Algorithm for merger of chunk translations
Parallel task: Develop target-language text-to-speech synthesizer.
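For the "obtain confusion matrices" step of the road map, the Python sketch below counts how often each reference phoneme is recognized as each output phoneme and turns the counts into per-phoneme probabilities. It assumes the reference and recognized phonemes have already been aligned pairwise, which is a separate step not shown here. The phoneme symbols and counts are made up for illustration.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple

def confusion_counts(
    pairs: Iterable[Tuple[str, str]]
) -> Dict[str, Counter]:
    """Count (reference phoneme, recognized phoneme) pairs."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for reference, recognized in pairs:
        counts[reference][recognized] += 1
    return counts

def confusion_probs(counts: Dict[str, Counter]) -> Dict[str, Dict[str, float]]:
    """Normalize each row so that it sums to 1: P(recognized | reference)."""
    return {
        reference: {
            recognized: n / sum(row.values())
            for recognized, n in row.items()
        }
        for reference, row in counts.items()
    }

# Made-up aligned pairs: (what was said, what the recognizer produced).
aligned = (
    [("p", "p")] * 3 + [("p", "b")] * 1 +
    [("t", "t")] * 2 + [("t", "d")] * 2
)

counts = confusion_counts(aligned)
probs = confusion_probs(counts)
print(probs["p"])
print(probs["t"])
```

A matrix of this kind could feed the alternate phoneme hypotheses of the recognition stage, since it indicates which substitutions are worth considering for a given recognized phoneme.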
17
Need for an R&D Institute
  • Highly interdisciplinary area.
  • Need to grow manpower in the area.
  • Continued and concerted effort is needed if the goal is to be achieved in a cost-effective manner in a reasonable time frame.
  • The institute will facilitate industry participation, consolidation of research groups within the country, and liaison with international groups.
  • Attached to an academic institute.