1
THE MAP-SPACE DENOISING ALGORITHM FOR NOISE
ROBUST SPEECH RECOGNITION
Stereo-based Piecewise Affine Compensation for
Environment
ShihHsiang 2006
2
Reference
  • K. Daoudi and C. Cerisara, "The MAP-SPACE
    denoising algorithm for noise robust speech
    recognition," in Proc. ASRU, Cancún, Mexico,
    2005.
  • C. Cerisara and K. Daoudi, "Evaluation of the
    SPACE denoising algorithm on Aurora2," in Proc.
    ICASSP, Toulouse, France, 2006.

3
Outline
  • Introduction
  • The SPACE algorithm
  • The MAP-SPACE algorithm
  • Experiments
  • Conclusions

4
Introduction
  • Robustness techniques can be roughly classified
    into two categories
  • Signal processing
  • Achieves noise robustness by denoising the signal
  • Adaptation techniques
  • Initial canonical models are transformed to
    represent the new environment
  • Require a relatively large amount of adaptation
    data
  • Need transcriptions of the speech data, which is
    problematic in an unsupervised mode
  • In this paper, the authors propose an algorithm,
    called MAP-SPACE, which can be seen as a hybrid
    between a denoising and an adaptation technique

5
The SPACE algorithm
  • The first step of the algorithm is to model the
    noisy speech $y$ by a mixture of $I$ Gaussians,
    $P(y) = \sum_{i=1}^{I} \pi_i^y \, \mathcal{N}(y; \mu_i^y, \Sigma_i^y)$,
    where the $\pi_i^y$ are the priors
  • The second step is to model the clean speech $x$
    by a mixture of $I$ Gaussians,
    $P(x) = \sum_{i=1}^{I} \pi_i^x \, \mathcal{N}(x; \mu_i^x, \Sigma_i^x)$,
    where the $\pi_i^x$ are the priors
  • Assume that the acoustic region modeled by the
    $i$-th clean Gaussian is related to the one modeled
    by the $i$-th noisy Gaussian by a certain
    transformation $F_i$
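As a rough illustration of these two modeling steps, the sketch below fits the two GMMs with scikit-learn, using diagonal covariances and 8 components as in the Aurora2 experiments later in the deck. Note that the paper builds the clean GMM so that its i-th Gaussian corresponds to the i-th noisy Gaussian; plain maximum-likelihood training (as here) does not enforce that correspondence.

```python
# Minimal sketch (not the paper's exact recipe): fit an I-component GMM to the
# noisy features and another to the paired clean features with scikit-learn.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmms(noisy_feats, clean_feats, n_components=8, seed=0):
    """noisy_feats, clean_feats: (T, D) arrays of stereo (paired) frames."""
    gmm_y = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=seed).fit(noisy_feats)
    gmm_x = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=seed).fit(clean_feats)
    return gmm_y, gmm_x
```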
6
The SPACE algorithm (cont.)
[Figure: the clean GMM and the noisy GMM side by side; for each pair of corresponding Gaussians, a transformation is found through stereo data]
7
The SPACE algorithm (cont.)
  • Assume the relationship is deterministic and
    affine in each acoustic region $i$, i.e. the
    mapping transformation is $F_i(y) = A_i y + b_i$
  • This mapping is the basis of their denoising
    algorithm, that is, they assume $x \approx F_i(y)$
    when $y$ belongs to region $i$
  • The clean feature estimate is then given by
    $\hat{x} = \sum_{i=1}^{I} P(i \mid y) \, (A_i y + b_i)$
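A minimal sketch of this denoising step, assuming the posterior-weighted form above, diagonal matrices $A_i$ stored as vectors, and a scikit-learn GMM for the noisy speech; the function and parameter names are illustrative.

```python
import numpy as np

def space_denoise(Y, gmm_y, A, b):
    """SPACE-style estimate: x_hat_t = sum_i P(i | y_t) (A_i y_t + b_i).

    Y: (T, D) noisy frames; gmm_y: fitted noisy-speech GMM (sklearn);
    A, b: (I, D) diagonal affine parameters per Gaussian."""
    post = gmm_y.predict_proba(Y)                  # (T, I) posteriors P(i | y_t)
    mapped = Y[:, None, :] * A[None] + b[None]     # (T, I, D) maps A_i y_t + b_i
    return np.einsum("ti,tid->td", post, mapped)   # posterior-weighted average
```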

8
The SPACE algorithm (cont.)
  • In order to estimate the parameters $(A_i, b_i)$,
    the MMSE criterion is used on the stereo data. The
    objective function to minimize is
    $F_i = \sum_t P(i \mid y_t) \, \lVert x_t - A_i y_t - b_i \rVert^2$,
    where $(x_t, y_t)$ are the stereo clean/noisy pairs
  • Given that the covariances are diagonal, if
    $A_i = \mathrm{diag}(a_{i,1}, \ldots, a_{i,N})$ and
    $b_i = (b_{i,1}, \ldots, b_{i,N})$, then the
    objective decouples over the dimensions and, for
    each $i$, the function to minimize is
    $F_{i,n} = \sum_t P(i \mid y_t) \, (x_{t,n} - a_{i,n} y_{t,n} - b_{i,n})^2$,
    where $x_{t,n}$ and $y_{t,n}$ denote the $n$-th
    components of $x_t$ and $y_t$
9
The SPACE algorithm (cont.)
  • The problem is then equivalent to minimizing
    $F_{i,n}$ w.r.t. $a_{i,n}$ and $b_{i,n}$, for each
    $i$ and $n$. Let $\gamma_t = P(i \mid y_t)$,
    $\bar{y}_{i,n} = \frac{\sum_t \gamma_t y_{t,n}}{\sum_t \gamma_t}$ and
    $\bar{x}_{i,n} = \frac{\sum_t \gamma_t x_{t,n}}{\sum_t \gamma_t}$
  • Then the problem becomes a weighted least-squares
    regression of $x_{t,n}$ on $y_{t,n}$
  • The solution to this problem is given by
    $a_{i,n} = \frac{\sum_t \gamma_t (y_{t,n} - \bar{y}_{i,n})(x_{t,n} - \bar{x}_{i,n})}{\sum_t \gamma_t (y_{t,n} - \bar{y}_{i,n})^2}$,
    $b_{i,n} = \bar{x}_{i,n} - a_{i,n} \, \bar{y}_{i,n}$
  • Thus $A_i$ and $b_i$ are obtained for each $i$
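A sketch of this estimation on stereo data, assuming the weighted least-squares form above with the noisy-GMM posteriors as weights; variable and function names are illustrative.

```python
import numpy as np

def estimate_affine_params(X, Y, gmm_y, eps=1e-10):
    """Estimate a_{i,n}, b_{i,n} per Gaussian i and dimension n.

    X, Y: (T, D) stereo clean/noisy frames; returns A, b of shape (I, D)."""
    gamma = gmm_y.predict_proba(Y)                    # (T, I) weights P(i | y_t)
    w = gamma.sum(axis=0)[:, None] + eps              # (I, 1) total weight per Gaussian
    y_bar = gamma.T @ Y / w                           # (I, D) weighted mean of y
    x_bar = gamma.T @ X / w                           # (I, D) weighted mean of x
    cov_xy = gamma.T @ (X * Y) / w - x_bar * y_bar    # weighted cov(x, y) per dim
    var_y = gamma.T @ (Y * Y) / w - y_bar ** 2 + eps  # weighted var(y) per dim
    A = cov_xy / var_y                                # a_{i,n}
    b = x_bar - A * y_bar                             # b_{i,n}
    return A, b
```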

10
The MAP-SPACE Algorithm
  • The originality of the SPACE algorithm resides in
    its ability to be easily modified to handle new
    environments
  • Assume that SPACE has been performed for some
    initial noisy training condition, and that test
    observations from a new environment are given
  • These observations are used in a MAP criterion to
    adapt the initial noisy speech GMM to the new
    environment
  • Such adaptation keeps the correspondence between
    the initial and the new model parameters
  • That is, the adapted noisy speech GMM keeps the
    same Gaussian indexing $i$, and for each $i$ the
    clean-feature estimate is recomputed from the
    adapted parameters (a sketch follows after this
    slide)
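A minimal sketch of one way to realize this step, assuming GMM-UBM style MAP adaptation of the means only with a relevance factor tau; the paper's MAP criterion may also update weights and covariances, but mean-only adaptation makes the kept correspondence explicit.

```python
import numpy as np

def map_adapt_means(gmm_y, Y_test, tau=10.0):
    """MAP-adapt the noisy GMM means to unlabeled test frames Y_test (T, D).

    Each adapted mean stays tied to its initial Gaussian, so the i-th learned
    transformation (A_i, b_i) still refers to the same acoustic region."""
    gamma = gmm_y.predict_proba(Y_test)        # (T, I) posteriors on test data
    n_i = gamma.sum(axis=0)                    # (I,) soft frame counts
    y_i = gamma.T @ Y_test                     # (I, D) soft first-order statistics
    # MAP interpolation between the prior means and the test-data means
    return (tau * gmm_y.means_ + y_i) / (tau + n_i)[:, None]
```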

11
The MAP-SPACE Algorithm (cont.)
  • For faster implementation, this estimate can be
    approximated by a hard decision on the most likely
    Gaussian (a sketch follows after this slide)
  • MAP-SPACE has two major advantages w.r.t.
    traditional adaptation techniques
  • No transcription of the adaptation data is
    required, and no assumption on the noise alteration
    is made
  • The amount of adaptation data required to achieve a
    good estimate is much lower than in traditional
    adaptation
  • Compared to SPLICE, MAP-SPACE has the major
    advantage of making no assumption on the type of
    noise that corrupts the test data
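A sketch of such a speed-up under the assumption stated above, i.e. a hard decision on the most likely Gaussian per frame instead of the full posterior-weighted sum; this is a common SPLICE-style shortcut and not necessarily the paper's exact approximation.

```python
import numpy as np

def space_denoise_fast(Y, gmm_y, A, b):
    """Hard-decision approximation: x_hat_t = A_{i*} y_t + b_{i*}, i* = argmax_i P(i|y_t).

    Y: (T, D) noisy frames; A, b: (I, D) diagonal affine parameters."""
    i_star = gmm_y.predict(Y)          # (T,) most likely Gaussian per frame
    return Y * A[i_star] + b[i_star]   # apply only that Gaussian's affine map
```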

12
Experiment
  • The results are obtained using pseudo-clean
    trained HMMs
  • The HMMs modeling the acoustic units are trained
    using the denoised features instead of the clean
    ones
  • This strategy provides an approximate modeling of
    the residual noise
  • Experiments have been conducted on the clean part
    of Aurora2 test set A, which is artificially
    corrupted by adding various types of noise at
    different SNRs (taken from NOISEX)
  • A mixture of 8 Gaussians is used to model the
    noisy and clean speech GMMs
  • Adaptation of the noisy GMM is realized using the
    whole noisy test set

13
Experiment (cont.)
  • Experiments on white noise with different SNRs
  • Matched scores are the best (but impractical)
  • Clean models give very poor performance (24.7%
    and 9.8%)
  • Experiments on different noise types (5 dB)
  • Training is done on white noise and testing on
    different noise types

14
Conclusions
  • MAP-SPACE is a combination of an extension of
    SPLICE and traditional adaptation techniques
  • MAP-SPACE has shown its robustness to SNR and
    noise type mismatches
  • In this paper, the authors only aim to show that
    their algorithm efficiently compensates the
    distortion introduced by various kinds of noise

15
Practical Implementation (on AURORA2)
  • The pseudo-clean features are obtained by the
    following steps
  • For each noisy condition of the training corpus,
    a GMM is trained using the maximum likelihood
    criterion
  • The corresponding clean GMM is trained on the
    corresponding clean sentences using the MMSE
    criterion
  • For each testing condition
  • First estimate the SNR, to detect the closest
    training environment (a sketch of this step
    follows after this slide)
  • For each test sentence, the energy is computed on
    a sliding window of 64 ms length
  • SNR = the highest energy / the lowest energy
    (over the sliding windows)
  • The closest corresponding SNR of the training
    corpus is found (4 noisy GMMs)
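A rough sketch of this SNR pre-selection; the 8 kHz sampling rate, the half-overlapping window, and the set of candidate training SNRs are assumptions made here for illustration.

```python
import numpy as np

def estimate_snr_db(signal, fs=8000, win_ms=64.0):
    """Crude sentence-level SNR: ratio of the highest to the lowest window energy."""
    win = int(fs * win_ms / 1000)                  # 64 ms window in samples
    hop = win // 2                                 # half-overlapping sliding window
    energies = [float(np.sum(signal[t:t + win] ** 2))
                for t in range(0, len(signal) - win + 1, hop)]
    e_max, e_min = max(energies), max(min(energies), 1e-12)
    return 10.0 * np.log10(e_max / e_min)

def closest_training_snr(snr_db, training_snrs=(5.0, 10.0, 15.0, 20.0)):
    """Pick the training condition whose SNR is closest to the estimate."""
    return min(training_snrs, key=lambda s: abs(s - snr_db))
```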

16
Practical Implementation (on AURORA2) (cont.)
  • In the second step, the four noisy GMMs of these
    training conditions are compared, and the one that
    maximizes the likelihood on the test data is chosen
    (a sketch follows after this slide)
  • The test corpus is then denoised using the
    parameters of this GMM and of its corresponding
    clean GMM
  • Evaluation on Aurora2 test set A
  • Multi-style training always outperforms SPACE
  • There is no apparent stability in the SPACE
    behavior when the number of Gaussians varies
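A minimal sketch of this selection step, assuming the four candidates are fitted scikit-learn GMMs; the one with the highest total log-likelihood on the test data decides which training condition's parameters are used for denoising.

```python
import numpy as np

def select_noisy_gmm(Y_test, candidate_gmms):
    """Return the index and model of the candidate GMM most likely on Y_test."""
    total_loglik = [gmm.score_samples(Y_test).sum() for gmm in candidate_gmms]
    best = int(np.argmax(total_loglik))
    return best, candidate_gmms[best]
```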

17
Joint Modeling of Clean and Noisy Speech
Distribution
  • There may be several reasons that explain the
    poor results of SPACE and MAP-SPACE
  • The most important one is the fact that the
    Gaussian correspondence hypothesis is not verified
    in practice
  • This means that the MMSE criterion is not the
    best way to build such correspondences
  • Joint modeling of the clean and noisy speech
    distribution
  • Modeling P(x, y) (a sketch follows after this
    slide)
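One standard way to realize "modeling P(x, y)" is sketched below: fit a single GMM on stacked stereo vectors z = [x; y] and denoise with the conditional MMSE estimate E[x | y] derived from each joint Gaussian. This illustrates the idea only and is not necessarily the exact SPACE-JM formulation.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=8, seed=0):
    """Fit one GMM on stacked stereo vectors z = [x; y] of shape (T, 2D)."""
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=seed).fit(np.hstack([X, Y]))

def joint_mmse_denoise(Y, gmm_z, D):
    """MMSE estimate E[x | y] under the joint GMM; D is the clean-feature dim."""
    T, I = len(Y), gmm_z.n_components
    log_resp = np.zeros((T, I))
    cond_means = np.zeros((T, I, D))
    for i in range(I):
        mx, my = gmm_z.means_[i, :D], gmm_z.means_[i, D:]
        C = gmm_z.covariances_[i]
        Cxy, Cyy = C[:D, D:], C[D:, D:]
        # responsibilities come from the y-marginal of each joint Gaussian
        log_resp[:, i] = (np.log(gmm_z.weights_[i])
                          + multivariate_normal(my, Cyy).logpdf(Y))
        # conditional mean E[x | y, i] = mx + Cxy Cyy^{-1} (y - my)
        cond_means[:, i] = mx + (Y - my) @ np.linalg.solve(Cyy, Cxy.T)
    resp = np.exp(log_resp - log_resp.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    return np.einsum("ti,tid->td", resp, cond_means)
```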

18
Comparison of SPACE and SPACE-JM
[Figure: average accuracy over the 4 noises and 5 SNRs (5 dB, 10 dB, 15 dB, 20 dB, and clean) of Aurora2 test set A]
[Figure: average accuracy over the 4 noises of Aurora2 test set A at SNR = 0 dB]
  • SPACE-JM results are much more stable than SPACE
    results
  • The results suggest not only that a better
    Gaussian correspondence is achieved by SPACE-JM,
    but also that it is robust to SNR changes