IRCS/CCN Summer Workshop June 2003 Speech Recognition

Slides: 16

Provided by: MarkLi97

Learn more at: http://languagelog.ldc.upenn.edu

Transcript and Presenter's Notes

1
IRCS/CCN Summer Workshop June 2003 Speech Recognition
2
Why is perception hard?

Task: available signals → model of the world around
signals are mostly accidental, inadequate
sometimes disguised or falsified
always mixed-up and ambiguous
Reasoning about the source of signals
Integration of context: what do you expect?
Sensor fusion: integration of vision, sound, smell, etc.
Source (and noise) separation: there's more than one thing out there
Variable perspective, source variation etc.
depends on the type of signal
depends on the type of object
Much harder than chess or calculus!

3
Bayesian probability estimation

Thomas Bayes (1702-1761)
Minister of the Presbyterian Chapel at Tunbridge
Wells
Amateur mathematician
Essay towards solving a problem in the doctrine of chances,
published (posthumously) in 1764
Crucial idea
background (prior) knowledge about the plausibility of different theories
can be combined with knowledge about the relation of theories to evidence
in a mathematically well-defined way
even if all knowledge is uncertain
to reason about the most likely explanation of
the available evidence
Bayes' theorem
the most important equation in the history of
mathematics (?)
a simple consequence of basic definitions, or
a still-controversial recipe for the probability
of alternative causes for a given event, or
the implicit foundation of human reasoning
a general framework for solving the problems of
perception

Tutorial on Bayes Theorem
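
For reference, the theorem can be written out as follows. The symbols H (a candidate theory or hypothesis) and E (the available evidence) are notation chosen here for illustration, not taken from the slides; the sum in the denominator runs over all competing theories H'.

```latex
\[
P(H \mid E) \;=\; \frac{P(E \mid H)\,P(H)}{P(E)}
           \;=\; \frac{P(E \mid H)\,P(H)}{\sum_{H'} P(E \mid H')\,P(H')}
\]
```

Here P(H) is the prior plausibility of the theory, P(E | H) is the relation of theory to evidence, and P(H | E) is the updated plausibility once the evidence is seen. Since P(E) does not depend on H, it can be ignored when only the most likely explanation is wanted.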
4
(No Transcript)
5
(No Transcript)
6
(No Transcript)
7
(No Transcript)
8
Fundamental theorem of speech recognition

P(W|S) ∝ P(S|W) P(W)
where W is Word(s) (i.e. message text)
S is Sound(s) (i.e. speech signal)
Noisy channel model of communications engineering
due to Shannon 1949
New algorithms, especially relevant to speech recognition
due to L.E. Baum et al. 1965-1970
Applied to speech recognition by Jim Baker (CMU PhD 1975),
Fred Jelinek (IBM speech group >>1975)
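
A minimal sketch of the noisy-channel idea on this slide, assuming just two candidate word strings and made-up probabilities. The candidate strings, the numbers, and the function name `decode` are all hypothetical; a real recognizer would get P(W) from a language model and P(S|W) from an acoustic model.

```python
# Toy noisy-channel decoder: choose the W that maximizes P(S|W) * P(W).
# Every number here is invented for illustration only.

# Prior P(W): how plausible each word string is before hearing anything.
prior = {"recognize speech": 0.7, "wreck a nice beach": 0.3}

# Likelihood P(S|W): how well each word string explains the observed sound S.
likelihood = {"recognize speech": 0.2, "wreck a nice beach": 0.3}


def decode(prior, likelihood):
    """Return the most probable W and the full posterior P(W|S)."""
    # Unnormalized scores P(S|W) * P(W), proportional to P(W|S).
    scores = {w: likelihood[w] * prior[w] for w in prior}
    # P(S) is the same for every W, so it only matters for normalization.
    total = sum(scores.values())
    posterior = {w: s / total for w, s in scores.items()}
    best = max(scores, key=scores.get)
    return best, posterior


best, posterior = decode(prior, likelihood)
print(best, posterior)
```

In this toy case the acoustically better-matching string loses because its prior is much lower, which is the point of combining the two terms.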

9
Motivations for a Bayesian approach

A consistent framework for integrating
previous experience and current evidence
A quantitative model for abduction
reasoning about the best explanation
A general method for turning a generative model
into an analytic one: analysis by synthesis
helpful where categories << signals

These motivations apply both in engineering
practice and in the evolution of biological
systems
10
Basic architecture of standard speech
recognition technology

1. Bayes' Rule: P(W|S) ∝ P(S|W) P(W)
2. Approximate P(S|W) P(W) as a Hidden Markov Model
a probabilistic function (to get P(S|W))
of a Markov chain (to get P(W))
3. Use Baum/Welch (EM) algorithm to
learn HMM parameters
4. Use Viterbi decoding
to find the most probable W given S
in terms of the estimated HMM
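
Step 2 can be illustrated with a small sketch, assuming a discrete two-state HMM whose state names (`q1`, `q2`), observation symbols (`a`, `b`, `c`), and probabilities are all hypothetical. The hidden state path plays the role of W and the observation sequence the role of S; the joint probability factors into a Markov-chain term and an emission term.

```python
# Hypothetical two-state HMM; all numbers are invented for illustration.
start_p = {"q1": 0.6, "q2": 0.4}
trans_p = {"q1": {"q1": 0.7, "q2": 0.3},
           "q2": {"q1": 0.4, "q2": 0.6}}
emit_p = {"q1": {"a": 0.5, "b": 0.4, "c": 0.1},
          "q2": {"a": 0.1, "b": 0.3, "c": 0.6}}


def joint_probability(path, obs, start_p, trans_p, emit_p):
    """Return (chain term, emission term, their product) for one state path."""
    # Markov chain term: probability of the hidden path itself.
    chain = start_p[path[0]]
    for prev, cur in zip(path, path[1:]):
        chain *= trans_p[prev][cur]
    # "Probabilistic function" term: probability of the observations given the path.
    emission = 1.0
    for state, symbol in zip(path, obs):
        emission *= emit_p[state][symbol]
    return chain, emission, chain * emission


chain, emission, joint = joint_probability(
    ["q1", "q1", "q2"], ["a", "b", "c"], start_p, trans_p, emit_p)
print(chain, emission, joint)
```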

11
HMM parameter estimation given
labelled/aligned training data...
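
When every observation is already aligned with its state, maximum-likelihood estimation reduces to counting and normalizing. The sketch below assumes a discrete HMM and a tiny invented data set; the function name `estimate_hmm` and the labels are hypothetical, and no smoothing is applied.

```python
from collections import Counter, defaultdict


def estimate_hmm(sequences):
    """Relative-frequency HMM estimates from aligned (state, observation) pairs."""
    start = Counter()
    trans = defaultdict(Counter)
    emit = defaultdict(Counter)
    for seq in sequences:
        if not seq:
            continue  # skip empty training sequences
        states = [s for s, _ in seq]
        start[states[0]] += 1
        for state, symbol in seq:
            emit[state][symbol] += 1
        for prev, cur in zip(states, states[1:]):
            trans[prev][cur] += 1

    def normalize(counter):
        total = sum(counter.values())
        return {key: count / total for key, count in counter.items()}

    return (normalize(start),
            {s: normalize(c) for s, c in trans.items()},
            {s: normalize(c) for s, c in emit.items()})


# Invented aligned training data: each item is (state label, observed symbol).
data = [
    [("A", "x"), ("A", "x"), ("B", "y")],
    [("A", "x"), ("B", "y"), ("B", "x")],
]
start_p, trans_p, emit_p = estimate_hmm(data)
print(start_p, trans_p, emit_p)
```

Unseen transitions and emissions get no entry at all here, so a practical system would add smoothing before using these estimates for decoding.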
12
Viterbi decoding given HMM and observed signal...
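
A minimal sketch of Viterbi decoding for a discrete HMM, working in the log domain. The two states, three observation symbols, and all probabilities are hypothetical; real recognizers use far larger state spaces and continuous acoustic features.

```python
import math


def viterbi(obs, states, start_p, trans_p, emit_p):
    """Return the most probable state path for obs and its log-probability."""
    def log(p):
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at time t.
    best = [{s: log(start_p[s]) + log(emit_p[s][obs[0]]) for s in states}]
    back = []  # back[t-1][s]: best predecessor of state s at time t
    for symbol in obs[1:]:
        scores, pointers = {}, {}
        for s in states:
            prev = max(states, key=lambda r: best[-1][r] + log(trans_p[r][s]))
            scores[s] = best[-1][prev] + log(trans_p[prev][s]) + log(emit_p[s][symbol])
            pointers[s] = prev
        best.append(scores)
        back.append(pointers)

    # Trace the back-pointers from the best final state.
    last = max(states, key=lambda s: best[-1][s])
    path = [last]
    for pointers in reversed(back):
        path.append(pointers[path[-1]])
    path.reverse()
    return path, best[-1][last]


# Hypothetical model; all numbers are invented for illustration.
states = ["q1", "q2"]
start_p = {"q1": 0.6, "q2": 0.4}
trans_p = {"q1": {"q1": 0.7, "q2": 0.3},
           "q2": {"q1": 0.4, "q2": 0.6}}
emit_p = {"q1": {"a": 0.5, "b": 0.4, "c": 0.1},
          "q2": {"a": 0.1, "b": 0.3, "c": 0.6}}

path, log_prob = viterbi(["a", "b", "c"], states, start_p, trans_p, emit_p)
print(path, math.exp(log_prob))
```

The dynamic program keeps only the best path into each state at each time step, so the cost grows linearly with the length of the signal rather than exponentially.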
13
Sketch of Baum-Welch (EM) algorithm for
estimating HMM parameters given unaligned
(or even unlabelled) training data
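
A minimal sketch of one way to write the Baum-Welch iteration for a discrete HMM, assuming a single short observation sequence so that unscaled forward and backward tables do not underflow. The function names, the starting parameters, and the observation sequence are all hypothetical.

```python
def forward_backward(obs, pi, A, B):
    """Unscaled forward/backward tables and the likelihood of obs."""
    n, T = len(pi), len(obs)
    alpha = [[0.0] * n for _ in range(T)]
    beta = [[1.0] * n for _ in range(T)]
    for i in range(n):
        alpha[0][i] = pi[i] * B[i][obs[0]]
    for t in range(1, T):
        for j in range(n):
            alpha[t][j] = B[j][obs[t]] * sum(
                alpha[t - 1][i] * A[i][j] for i in range(n))
    for t in range(T - 2, -1, -1):
        for i in range(n):
            beta[t][i] = sum(
                A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] for j in range(n))
    return alpha, beta, sum(alpha[T - 1])


def baum_welch_step(obs, pi, A, B):
    """One EM iteration; also returns the likelihood under the old parameters."""
    n, T, m = len(pi), len(obs), len(B[0])
    alpha, beta, like = forward_backward(obs, pi, A, B)
    # E-step: gamma[t][i] = P(state i at t | obs),
    #         xi[t][i][j] = P(state i at t and state j at t+1 | obs).
    gamma = [[alpha[t][i] * beta[t][i] / like for i in range(n)]
             for t in range(T)]
    xi = [[[alpha[t][i] * A[i][j] * B[j][obs[t + 1]] * beta[t + 1][j] / like
            for j in range(n)] for i in range(n)] for t in range(T - 1)]
    # M-step: re-estimate parameters from expected counts.
    new_pi = gamma[0][:]
    new_A = [[sum(xi[t][i][j] for t in range(T - 1)) /
              sum(gamma[t][i] for t in range(T - 1))
              for j in range(n)] for i in range(n)]
    new_B = [[sum(gamma[t][i] for t in range(T) if obs[t] == k) /
              sum(gamma[t][i] for t in range(T))
              for k in range(m)] for i in range(n)]
    return new_pi, new_A, new_B, like


# Invented unlabelled data (symbol indices) and an arbitrary starting guess.
obs = [0, 0, 1, 0, 1, 1, 1, 0, 0, 1]
pi = [0.6, 0.4]
A = [[0.7, 0.3], [0.4, 0.6]]
B = [[0.8, 0.2], [0.3, 0.7]]

likelihoods = []
for _ in range(10):
    pi, A, B, like = baum_welch_step(obs, pi, A, B)
    likelihoods.append(like)
print(likelihoods)
```

Each iteration should leave the likelihood of the training data no lower than before, which is the usual sanity check for an EM implementation. Longer sequences would need scaling or log-domain arithmetic.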
14
Other typical details
Complex elaborations of the basic ideas