1
THE MAP-SPACE DENOISING ALGORITHM FOR NOISE
ROBUST SPEECH RECOGNITION
Stereo-based Piecewise Affine Compensation for
Environment
ShihHsiang 2006
2
Reference
  • K. Daoudi and C. Cerisara, "The MAP-SPACE
    denoising algorithm for noise robust speech
    recognition," in Proc. ASRU, Cancún, Mexico,
    2005.
  • C. Cerisara and K. Daoudi, "Evaluation of the
    SPACE denoising algorithm on Aurora2," in Proc.
    ICASSP, Toulouse, France, 2006.

3
Outline
  • Introduction
  • The SPACE algorithm
  • The MAP-SPACE algorithm
  • Experiments
  • Conclusions

4
Introduction
  • Robustness techniques can be roughly classified
    into two categories
  • Signal processing
  • Achieves noise robustness by denoising the signal
  • Adaptation techniques
  • Initial canonical models are transformed to
    represent the new environment
  • Require a relatively large amount of adaptation
    data
  • Need transcriptions of the speech data, which is
    problematic in an unsupervised mode
  • In this paper, the authors propose an algorithm,
    called MAP-SPACE, which can be seen as a hybrid
    between a denoising and an adaptation technique

5
The SPACE algorithm
  • The first step of the algorithm is to model the
    noisy speech $y$ by a mixture of $I$ Gaussians,
    $P(y) = \sum_{i=1}^{I} \pi_i^y \, \mathcal{N}(y; \mu_i^y, \Sigma_i^y)$,
    where the $\pi_i^y$ are the priors
  • The second step is to model the clean speech $x$
    by a mixture of $I$ Gaussians,
    $P(x) = \sum_{i=1}^{I} \pi_i^x \, \mathcal{N}(x; \mu_i^x, \Sigma_i^x)$,
    where the $\pi_i^x$ are the priors
  • Assume that the acoustic region modeled by the
    $i$-th clean Gaussian is related to the one modeled
    by the $i$-th noisy Gaussian by a certain
    transformation $F_i$
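As a rough illustration of these two modeling steps, the sketch below fits the two GMMs with scikit-learn, using diagonal covariances and 8 components as in the Aurora2 experiments later in the deck. Note that the paper builds the clean GMM so that its i-th Gaussian corresponds to the i-th noisy Gaussian; plain maximum-likelihood training (as here) does not enforce that correspondence.

```python
# Minimal sketch (not the paper's exact recipe): fit an I-component GMM to the
# noisy features and another to the paired clean features with scikit-learn.
import numpy as np
from sklearn.mixture import GaussianMixture

def fit_gmms(noisy_feats, clean_feats, n_components=8, seed=0):
    """noisy_feats, clean_feats: (T, D) arrays of stereo (paired) frames."""
    gmm_y = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=seed).fit(noisy_feats)
    gmm_x = GaussianMixture(n_components=n_components, covariance_type="diag",
                            random_state=seed).fit(clean_feats)
    return gmm_y, gmm_x
```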
6
The SPACE algorithm (cont.)
[Figure: the clean GMM and the noisy GMM side by side; for each pair of corresponding Gaussians, a transformation is found through stereo data]
7
The SPACE algorithm (cont.)
  • Assume the relationship is deterministic and
    affine in each acoustic region $i$, i.e. the
    mapping transformation is $F_i(y) = A_i y + b_i$
  • This mapping is the basis of their denoising
    algorithm, that is, they assume $x \approx F_i(y)$
    when $y$ belongs to region $i$
  • The clean feature estimate is then given by
    $\hat{x} = \sum_{i=1}^{I} P(i \mid y) \, (A_i y + b_i)$
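A minimal sketch of this denoising step, assuming the posterior-weighted form above, diagonal matrices $A_i$ stored as vectors, and a scikit-learn GMM for the noisy speech; the function and parameter names are illustrative.

```python
import numpy as np

def space_denoise(Y, gmm_y, A, b):
    """SPACE-style estimate: x_hat_t = sum_i P(i | y_t) (A_i y_t + b_i).

    Y: (T, D) noisy frames; gmm_y: fitted noisy-speech GMM (sklearn);
    A, b: (I, D) diagonal affine parameters per Gaussian."""
    post = gmm_y.predict_proba(Y)                  # (T, I) posteriors P(i | y_t)
    mapped = Y[:, None, :] * A[None] + b[None]     # (T, I, D) maps A_i y_t + b_i
    return np.einsum("ti,tid->td", post, mapped)   # posterior-weighted average
```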

8
The SPACE algorithm (cont.)
  • In order to estimate the parameters $(A_i, b_i)$,
    the MMSE criterion is used on the stereo data. The
    objective function to minimize is
    $F_i = \sum_t P(i \mid y_t) \, \lVert x_t - A_i y_t - b_i \rVert^2$,
    where $(x_t, y_t)$ are the stereo clean/noisy pairs
  • Given that the covariances are diagonal, if
    $A_i = \mathrm{diag}(a_{i,1}, \ldots, a_{i,N})$ and
    $b_i = (b_{i,1}, \ldots, b_{i,N})$, then the
    objective decouples over the dimensions and, for
    each $i$, the function to minimize is
    $F_{i,n} = \sum_t P(i \mid y_t) \, (x_{t,n} - a_{i,n} y_{t,n} - b_{i,n})^2$,
    where $x_{t,n}$ and $y_{t,n}$ denote the $n$-th
    components of $x_t$ and $y_t$
9
The SPACE algorithm (cont.)
  • The problem is then equivalent to minimizing
    $F_{i,n}$ w.r.t. $a_{i,n}$ and $b_{i,n}$, for each
    $i$ and $n$. Let $\gamma_t = P(i \mid y_t)$,
    $\bar{y}_{i,n} = \frac{\sum_t \gamma_t y_{t,n}}{\sum_t \gamma_t}$ and
    $\bar{x}_{i,n} = \frac{\sum_t \gamma_t x_{t,n}}{\sum_t \gamma_t}$
  • Then the problem becomes a weighted least-squares
    regression of $x_{t,n}$ on $y_{t,n}$
  • The solution to this problem is given by
    $a_{i,n} = \frac{\sum_t \gamma_t (y_{t,n} - \bar{y}_{i,n})(x_{t,n} - \bar{x}_{i,n})}{\sum_t \gamma_t (y_{t,n} - \bar{y}_{i,n})^2}$,
    $b_{i,n} = \bar{x}_{i,n} - a_{i,n} \, \bar{y}_{i,n}$
  • Thus $A_i$ and $b_i$ are obtained for each $i$
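A sketch of this estimation on stereo data, assuming the weighted least-squares form above with the noisy-GMM posteriors as weights; variable and function names are illustrative.

```python
import numpy as np

def estimate_affine_params(X, Y, gmm_y, eps=1e-10):
    """Estimate a_{i,n}, b_{i,n} per Gaussian i and dimension n.

    X, Y: (T, D) stereo clean/noisy frames; returns A, b of shape (I, D)."""
    gamma = gmm_y.predict_proba(Y)                    # (T, I) weights P(i | y_t)
    w = gamma.sum(axis=0)[:, None] + eps              # (I, 1) total weight per Gaussian
    y_bar = gamma.T @ Y / w                           # (I, D) weighted mean of y
    x_bar = gamma.T @ X / w                           # (I, D) weighted mean of x
    cov_xy = gamma.T @ (X * Y) / w - x_bar * y_bar    # weighted cov(x, y) per dim
    var_y = gamma.T @ (Y * Y) / w - y_bar ** 2 + eps  # weighted var(y) per dim
    A = cov_xy / var_y                                # a_{i,n}
    b = x_bar - A * y_bar                             # b_{i,n}
    return A, b
```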

10
The MAP-SPACE Algorithm
  • The originality of the SPACE algorithm resides in
    its ability to be easily modified to handle new
    environments
  • Assume that SPACE has been performed for some
    initial noisy training condition, and that test
    observations from a new environment are given
  • These observations are used in a MAP criterion to
    adapt the initial noisy speech GMM to the new
    environment
  • Such adaptation keeps the correspondence between
    the initial and the new model parameters
  • That is, the adapted noisy speech GMM keeps the
    same Gaussian indexing $i$, and for each $i$ the
    clean-feature estimate is recomputed from the
    adapted parameters (a sketch follows after this
    slide)
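A minimal sketch of one way to realize this step, assuming GMM-UBM style MAP adaptation of the means only with a relevance factor tau; the paper's MAP criterion may also update weights and covariances, but mean-only adaptation makes the kept correspondence explicit.

```python
import numpy as np

def map_adapt_means(gmm_y, Y_test, tau=10.0):
    """MAP-adapt the noisy GMM means to unlabeled test frames Y_test (T, D).

    Each adapted mean stays tied to its initial Gaussian, so the i-th learned
    transformation (A_i, b_i) still refers to the same acoustic region."""
    gamma = gmm_y.predict_proba(Y_test)        # (T, I) posteriors on test data
    n_i = gamma.sum(axis=0)                    # (I,) soft frame counts
    y_i = gamma.T @ Y_test                     # (I, D) soft first-order statistics
    # MAP interpolation between the prior means and the test-data means
    return (tau * gmm_y.means_ + y_i) / (tau + n_i)[:, None]
```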

11
The MAP-SPACE Algorithm (cont.)
  • For faster implementation, this estimate can be
    approximated by a hard decision on the most likely
    Gaussian (a sketch follows after this slide)
  • MAP-SPACE has two major advantages w.r.t.
    traditional adaptation techniques
  • No transcription of the adaptation data is
    required, and no assumption on the noise alteration
    is made
  • The amount of adaptation data required to achieve a
    good estimate is much lower than in traditional
    adaptation
  • Compared to SPLICE, MAP-SPACE has the major
    advantage of making no assumption on the type of
    noise that corrupts the test data
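A sketch of such a speed-up under the assumption stated above, i.e. a hard decision on the most likely Gaussian per frame instead of the full posterior-weighted sum; this is a common SPLICE-style shortcut and not necessarily the paper's exact approximation.

```python
import numpy as np

def space_denoise_fast(Y, gmm_y, A, b):
    """Hard-decision approximation: x_hat_t = A_{i*} y_t + b_{i*}, i* = argmax_i P(i|y_t).

    Y: (T, D) noisy frames; A, b: (I, D) diagonal affine parameters."""
    i_star = gmm_y.predict(Y)          # (T,) most likely Gaussian per frame
    return Y * A[i_star] + b[i_star]   # apply only that Gaussian's affine map
```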

12
Experiment
  • The results are obtained using pseudo-clean
    trained HMMs
  • The HMMs modeling the acoustic units are trained
    using the denoised features instead of the clean
    ones
  • This strategy provides an approximate modeling of
    the residual noise
  • Experiments have been conducted on the clean part
    of Aurora2 test set A, which is artificially
    corrupted by adding various types of noise at
    different SNRs (taken from NOISEX)
  • A mixture of 8 Gaussians is used to model the
    noisy and clean speech GMMs
  • Adaptation of the noisy GMM is realized using the
    whole noisy test set

13
Experiment (cont.)
  • Experiments on white noise with different SNRs
  • Matched scores are the best (but impractical)
  • Clean models give very poor performance (24.7%
    and 9.8%)
  • Experiments on different noise types (5 dB)
  • Training is done on white noise and testing on
    different noise types

14
Conclusions
  • MAP-SPACE is a combination of an extension of
    SPLICE and traditional adaptation techniques
  • MAP-SPACE has shown its robustness to SNR and
    noise type mismatches
  • In this paper, the authors only aim to show that
    their algorithm efficiently compensates the
    distortion introduced by various kinds of noise

15
Practical Implementation (on AURORA2)
  • The pseudo-clean features are obtained by the
    following steps
  • For each noisy condition of the training corpus,
    a GMM is trained using the maximum likelihood
    criterion
  • The corresponding clean GMM is trained on the
    corresponding clean sentences using the MMSE
    criterion
  • For each testing condition
  • First estimate the SNR, to detect the closest
    training environment (a sketch of this step
    follows after this slide)
  • For each test sentence, the energy is computed on
    a sliding window of 64 ms length
  • SNR = the highest energy / the lowest energy
    (over the sliding windows)
  • The closest corresponding SNR of the training
    corpus is found (4 noisy GMMs)
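A rough sketch of this SNR pre-selection; the 8 kHz sampling rate, the half-overlapping window, and the set of candidate training SNRs are assumptions made here for illustration.

```python
import numpy as np

def estimate_snr_db(signal, fs=8000, win_ms=64.0):
    """Crude sentence-level SNR: ratio of the highest to the lowest window energy."""
    win = int(fs * win_ms / 1000)                  # 64 ms window in samples
    hop = win // 2                                 # half-overlapping sliding window
    energies = [float(np.sum(signal[t:t + win] ** 2))
                for t in range(0, len(signal) - win + 1, hop)]
    e_max, e_min = max(energies), max(min(energies), 1e-12)
    return 10.0 * np.log10(e_max / e_min)

def closest_training_snr(snr_db, training_snrs=(5.0, 10.0, 15.0, 20.0)):
    """Pick the training condition whose SNR is closest to the estimate."""
    return min(training_snrs, key=lambda s: abs(s - snr_db))
```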

16
Practical Implementation (on AURORA2) (cont.)
  • In the second step, the four noisy GMMs of these
    training conditions are compared, and the one that
    maximizes the likelihood on the test data is chosen
    (a sketch follows after this slide)
  • The test corpus is then denoised using the
    parameters of this GMM and of its corresponding
    clean GMM
  • Evaluation on Aurora2 test set A
  • Multi-style training always outperforms SPACE
  • There is no apparent stability in the SPACE
    behavior when the number of Gaussians varies
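A minimal sketch of this selection step, assuming the four candidates are fitted scikit-learn GMMs; the one with the highest total log-likelihood on the test data decides which training condition's parameters are used for denoising.

```python
import numpy as np

def select_noisy_gmm(Y_test, candidate_gmms):
    """Return the index and model of the candidate GMM most likely on Y_test."""
    total_loglik = [gmm.score_samples(Y_test).sum() for gmm in candidate_gmms]
    best = int(np.argmax(total_loglik))
    return best, candidate_gmms[best]
```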

17
Joint Modeling of Clean and Noisy Speech
Distribution
  • There may be several reasons that explain the
    poor results of SPACE and MAP-SPACE
  • The most important one is the fact that the
    Gaussian correspondence hypothesis is not verified
    in practice
  • This means that the MMSE criterion is not the
    best way to build such correspondences
  • Joint modeling of the clean and noisy speech
    distribution
  • Modeling P(x, y) (a sketch follows after this
    slide)
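One standard way to realize "modeling P(x, y)" is sketched below: fit a single GMM on stacked stereo vectors z = [x; y] and denoise with the conditional MMSE estimate E[x | y] derived from each joint Gaussian. This illustrates the idea only and is not necessarily the exact SPACE-JM formulation.

```python
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.mixture import GaussianMixture

def fit_joint_gmm(X, Y, n_components=8, seed=0):
    """Fit one GMM on stacked stereo vectors z = [x; y] of shape (T, 2D)."""
    return GaussianMixture(n_components=n_components, covariance_type="full",
                           random_state=seed).fit(np.hstack([X, Y]))

def joint_mmse_denoise(Y, gmm_z, D):
    """MMSE estimate E[x | y] under the joint GMM; D is the clean-feature dim."""
    T, I = len(Y), gmm_z.n_components
    log_resp = np.zeros((T, I))
    cond_means = np.zeros((T, I, D))
    for i in range(I):
        mx, my = gmm_z.means_[i, :D], gmm_z.means_[i, D:]
        C = gmm_z.covariances_[i]
        Cxy, Cyy = C[:D, D:], C[D:, D:]
        # responsibilities come from the y-marginal of each joint Gaussian
        log_resp[:, i] = (np.log(gmm_z.weights_[i])
                          + multivariate_normal(my, Cyy).logpdf(Y))
        # conditional mean E[x | y, i] = mx + Cxy Cyy^{-1} (y - my)
        cond_means[:, i] = mx + (Y - my) @ np.linalg.solve(Cyy, Cxy.T)
    resp = np.exp(log_resp - log_resp.max(axis=1, keepdims=True))
    resp /= resp.sum(axis=1, keepdims=True)
    return np.einsum("ti,tid->td", resp, cond_means)
```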

18
Comparison of SPACE and SPACE-JM
[Figure: average accuracy over the 4 noises and 5 SNRs (5 dB, 10 dB, 15 dB, 20 dB, and clean) of Aurora2 test set A]
[Figure: average accuracy over the 4 noises of Aurora2 test set A at SNR = 0 dB]
  • SPACE-JM results are much more stable than SPACE
    results
  • The results suggest not only that a better
    Gaussian correspondence is achieved by SPACE-JM,
    but also that it is robust to SNR changes