Title:

Spoken Language Translation

Description:

On the Integration of Speech Recognition and Statistical Machine Translation. In Proc. Interspeech 2005. E. ... Statistical Phrase-based Speech Translation.

Slides: 52
Provided by: star75
Transcript and Presenter's Notes

Title: Spoken Language Translation


1
Spoken Language Translation
2
Spoken Language Translation
3
Spoken Language Translation
  • Spoken language translation (SLT) translates
    spoken utterances directly into another
    language.
  • Major components
  • Automatic Speech Recognition (ASR)
  • Machine Translation (MT)
  • Text-to-Speech (TTS)

4
Spoken Language Translation
  • Compared with written language, speech, and
    especially spontaneous speech, poses additional
    difficulties for automatic translation.
  • Typically, these difficulties are caused by
    errors in the speech recognition step, which is
    carried out before the translation process.
  • As a result, the sentence to be translated is not
    necessarily well-formed from a syntactic
    point of view.
  • Why a statistical approach to machine
    translation?
  • Even without recognition errors, the structures of
    spontaneous speech differ from those of written
    language.
  • The statistical approach
  • Avoids hard decisions at any level of the
    translation process
  • Guarantees that a translated sentence in the
    target language is generated for any source
    sentence.

5
Coupling ASR to MT
  • Motivation
  • ASR cannot guarantee error-free output
  • The 1-best ASR hypothesis may be wrong
  • SLT must be designed to be robust to speech
    recognition errors
  • MT can benefit from the wide range of
    supplementary information provided by ASR
  • MT quality may depend on the WER of ASR
  • Strong correlation between recognition and
    translation quality
  • The WER of ASR is lower over a set of hypotheses
    than for the 1-best alone
  • Idea: exploit more transcriptions
  • SLT systems vary in the degree to which SMT and
    ASR are integrated within the overall translation
    process.
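
The point about WER over a set of hypotheses can be illustrated with a minimal sketch. It assumes WER is the word-level edit distance divided by the reference length, and it reads "WER over a set of hypotheses" as the lowest WER of any hypothesis in the N-best list (often called oracle WER). The sentences and function names are made up for illustration and do not come from the slides.

```python
def edit_distance(ref, hyp):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # word missing from hyp
                            curr[j - 1] + 1,      # extra word in hyp
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(ref, hyp):
    """Word error rate of one hypothesis against the reference."""
    ref_words = ref.split()
    return edit_distance(ref_words, hyp.split()) / len(ref_words)


def oracle_wer(ref, nbest):
    """Lowest WER achieved by any hypothesis in the N-best list."""
    return min(wer(ref, hyp) for hyp in nbest)


# Illustrative data only: a reference and a 3-best list, best-scored first.
reference = "i would like a room for two nights"
nbest = [
    "i would like a room for tonight",
    "i would like a room for two nights",
    "i would like room for two night",
]

one_best_wer = wer(reference, nbest[0])
oracle = oracle_wer(reference, nbest)
```

Here the 1-best hypothesis has errors while a lower-ranked hypothesis matches the reference, which is the motivation for passing more than one transcription to MT.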

6
Coupling ASR to MT
  • Loose coupling
  • SMT takes the ASR output (1-best, N-best, lattice, or
    confusion network) as input; communication between
    modules is one-way
  • Tight coupling
  • The whole search space of ASR and MT is integrated
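
A loosely coupled pipeline over an N-best list might look like the following sketch. It assumes the ASR module supplies a log-probability for each transcription f and the MT module supplies candidate translations e with log Pr(e | f). Each translation is scored by summing Pr(f) * Pr(e | f) over the list. The labels `f1`, `e_a`, the table `TOY_MT`, and all numbers are hypothetical stand-ins, not a real system's output.

```python
import math
from collections import defaultdict


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    peak = max(values)
    return peak + math.log(sum(math.exp(v - peak) for v in values))


def translate_nbest(nbest, translate):
    """Pick the translation with the highest score summed over all ASR hypotheses."""
    per_target = defaultdict(list)
    for source, asr_logprob in nbest:
        for target, mt_logprob in translate(source):
            # One-way communication: ASR scores flow into MT, never back.
            per_target[target].append(asr_logprob + mt_logprob)
    scores = {target: logsumexp(parts) for target, parts in per_target.items()}
    return max(scores, key=scores.get), scores


# Made-up toy numbers: three ASR hypotheses with their probabilities.
NBEST = [("f1", math.log(0.5)), ("f2", math.log(0.3)), ("f3", math.log(0.2))]

# Hypothetical MT stand-in: candidate translations with Pr(e | f).
TOY_MT = {
    "f1": [("e_a", 0.6), ("e_b", 0.4)],
    "f2": [("e_b", 0.9), ("e_c", 0.1)],
    "f3": [("e_b", 0.8), ("e_a", 0.2)],
}


def toy_translate(source):
    """Return (translation, log-probability) pairs for one transcription."""
    return [(target, math.log(p)) for target, p in TOY_MT[source]]


one_best, _ = translate_nbest(NBEST[:1], toy_translate)  # 1-best input only
n_best, scores = translate_nbest(NBEST, toy_translate)   # full 3-best input
```

With these toy numbers the 1-best input yields `e_a`, while the 3-best input yields `e_b`, because two lower-ranked transcriptions agree on it. A tightly coupled system would instead search ASR and MT hypotheses jointly, which this sketch does not attempt.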

7
Coupling ASR to MT
  • Statistical spoken language translation
  • Given a speech input x in the source language,
    find the best translation e
  • F(x) is the set of possible transcriptions of x
  • Loose coupling: 1-best, N-best, lattice, or
    confusion network
  • Tight coupling: full search space
  • Pr(f, e | x): speech translation model
  • Acoustic and translation features
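
The decision rule described in these bullets can be written out as below. This is a reconstruction from the bullet text, not the slide's original formula: the hat notation is an assumption, F(x) stands for the set of possible transcriptions f of the speech input x, and the sum over f is the standard way to marginalize out the transcription.

```latex
\hat{e} = \operatorname*{argmax}_{e} \Pr(e \mid x)
        = \operatorname*{argmax}_{e} \sum_{f \in F(x)} \Pr(f, e \mid x)
```

Under loose coupling, F(x) is restricted to the 1-best, N-best list, lattice, or confusion network produced by ASR; under tight coupling it ranges over the full search space.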

8
Coupling ASR to MT
  • Loose coupling vs. tight coupling

                                | Loose Coupling                | Tight Coupling
Modularity of Knowledge Sources | Each KS in stand-alone module | All KSs integrated in single model
Inter-module Communication      | Typically one-way (pipelined) | N/A
Scalability                     | Easy                          | Not easy
Complexity                      | Feasible                      | Feasible only for very small domains
9
ASR Outputs
  • Automatic speech recognition (ASR) is a process
    by which an acoustic speech signal is converted
    into a sequence of words.
  • Architecture

[Figure: ASR architecture diagram. Labeled blocks: Speech Signals, Feature Extraction, Decoding, ASR outputs (1-best, N-best, Lattice, or CN), SMT, Network Construction, Speech DB, HMM Estimation, Acoustic Model, G2P, Pronunciation Model, Text Corpora, LM Estimation, Language Model]
10
ASR Outputs
  • Network Structure
  • Decoding of HMM-based ASR
  • Searching for the best path in a huge HMM-state
    lattice
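
The best-path search can be sketched with the Viterbi algorithm on a toy HMM. This is a minimal illustration under stated assumptions: the two states, three observation symbols, and all probabilities are made up, and a real ASR decoder searches a far larger network with pruning. Back-pointers stored at each step let the best path be recovered by backtracking from the final state.

```python
import math


def viterbi(obs, states, log_start, log_trans, log_emit):
    """Return (best_log_prob, best_state_path) for an observation sequence."""
    # trellis[t][s]: best log-probability of any path ending in state s at time t
    trellis = [{s: log_start[s] + log_emit[s][obs[0]] for s in states}]
    backptr = [{}]
    for t in range(1, len(obs)):
        trellis.append({})
        backptr.append({})
        for s in states:
            # Best predecessor of state s at time t.
            best_prev = max(states, key=lambda p: trellis[t - 1][p] + log_trans[p][s])
            trellis[t][s] = (trellis[t - 1][best_prev]
                             + log_trans[best_prev][s]
                             + log_emit[s][obs[t]])
            backptr[t][s] = best_prev
    # Backtracking: start at the best final state and follow the back-pointers.
    last = max(states, key=lambda s: trellis[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(backptr[t][path[-1]])
    path.reverse()
    return trellis[-1][last], path


def to_log(table):
    """Convert a dict (or dict of dicts) of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Made-up toy model, for illustration only.
STATES = ["A", "B"]
START = to_log({"A": 0.6, "B": 0.4})
TRANS = to_log({"A": {"A": 0.7, "B": 0.3}, "B": {"A": 0.4, "B": 0.6}})
EMIT = to_log({"A": {"x": 0.5, "y": 0.4, "z": 0.1},
               "B": {"x": 0.1, "y": 0.3, "z": 0.6}})

best_log_prob, best_path = viterbi(["x", "y", "z"], STATES, START, TRANS, EMIT)
```

Working in log-probabilities avoids numerical underflow on long observation sequences, which is why the sketch adds scores instead of multiplying probabilities.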

11
ASR Outputs
  • 1-best
  • The best path can be found by backtracking
  • Why a 1-best