RCCMean Subtraction Robust Feature and Compare Various Feature based Methods for Robust Speech Recog - PowerPoint PPT Presentation

Description:

Computer Engineering Department, Sharif University of Technology. Wednesday, February 18, 2005 ... Babble, car, subway. Exhibit, office, ... Convolutional Noise ... – PowerPoint PPT presentation

1
RCC-Mean Subtraction Robust Feature and Compare
Various Feature based Methods for Robust Speech
Recognition in presence of Telephone Noise
  • Amin Fazel
  • Sharif University of Technology
  • Hossein Sameti, Mohammad T. Manzuri
  • February 2005

Computer Engineering Department, Sharif
University of Technology
2
Outline
  • Introduction
  • Feature based methods
    • MFCC, RCC, CMN, PLP, RASTA
  • Mean Normalization Root Cepstral Coefficients
  • Experimental Results
    • Experiment 1: Sharif CSR and TFARSDAT Database
    • Experiment 2: HTK CSR and AURORA 2 Database
  • Summary

3
Effect of Noise on ASR
  • Two phases in most ASR systems
    • Training
    • Operation (testing)
  • Mismatch between the two phases reduces accuracy
  • Mismatch occurs because of
    • Environment
      • Microphone, babble, distance, transmission channel
    • Speaker
      • Specific speaker: speaking speed
      • Various speakers: gender, age, accent

4
Effect of Noise on ASR
  • Noise
    • Additive noise
      • Babble, car, subway
      • Exhibition, office
    • Convolutional noise
      • Channel, telephone line
      • Microphone effect
      • Distance of speaker to microphone
    • Others
      • Lombard effect, reflections from buildings

5
Effect of Noise on ASR
  • Simple model
  • Robust speech recognition is the study of
    building speech recognition systems that handle
    mismatched conditions.
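The figure for the "simple model" is not preserved in this transcript. A commonly used formulation, taken here as an assumption rather than as the slide's own content, is y[n] = x[n] * h[n] + d[n]: clean speech x convolved with a channel response h, plus additive noise d. The sketch below simulates that model at a target SNR; the function name `corrupt`, the sine stand-in for speech, and the channel taps are all hypothetical.

```python
import numpy as np

def corrupt(clean, channel, noise, snr_db):
    """Apply a channel filter, then add noise scaled to the requested SNR in dB."""
    # Convolutional distortion: filter the clean signal with the channel response.
    filtered = np.convolve(clean, channel)[: len(clean)]
    noise = noise[: len(filtered)]
    # Scale the noise so that 10*log10(P_speech / P_noise) equals snr_db.
    p_speech = np.mean(filtered ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_speech / (p_noise * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return filtered + scaled_noise, filtered, scaled_noise

rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 0.01 * np.arange(8000))  # stand-in for a speech signal
channel = np.array([1.0, -0.5, 0.2])                 # hypothetical channel taps
noise = rng.standard_normal(8000)                    # stand-in for babble, car, etc.

noisy, speech_part, noise_part = corrupt(clean, channel, noise, snr_db=10.0)
measured_snr = 10 * np.log10(np.mean(speech_part ** 2) / np.mean(noise_part ** 2))
```

Training on `clean` and testing on `noisy` reproduces, in miniature, the train/test mismatch described on the earlier slides.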

6
Robustness Methods
  • Signal
    • Speech enhancement
  • Feature
    • Robust feature extraction
  • Model
    • Change of the model parameters
    • Model training

8
Mel-Frequency Cepstral Coefficient
  • Compute the magnitude squared of the Fourier transform
  • Apply triangular frequency weights that represent
    the effects of peripheral auditory frequency
    resolution
  • Take the log of the outputs (for RCC, take a root
    instead of the log)
  • Compute the cepstral coefficients using the discrete
    cosine transform
  • Smooth by dropping higher-order coefficients
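The five steps above can be sketched for a single frame as follows. This is a minimal illustration, not the front end used in the experiments: the mel formula, the Hamming window, the filter count (23), the number of coefficients (13), and the root value 0.1 are all assumptions chosen for the example.

```python
import numpy as np

def mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def inv_mel(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose edges are equally spaced on the mel scale."""
    edges_hz = inv_mel(np.linspace(0.0, mel(sample_rate / 2.0), n_filters + 2))
    bin_freqs = np.arange(n_fft // 2 + 1) * sample_rate / n_fft
    bank = np.zeros((n_filters, len(bin_freqs)))
    for i in range(n_filters):
        lo, mid, hi = edges_hz[i], edges_hz[i + 1], edges_hz[i + 2]
        rising = (bin_freqs - lo) / (mid - lo)
        falling = (hi - bin_freqs) / (hi - mid)
        bank[i] = np.clip(np.minimum(rising, falling), 0.0, None)
    return bank

def cepstra(frame, sample_rate, n_filters=23, n_ceps=13, root=None):
    """Return MFCC-style coefficients (root=None) or RCC-style ones (root=gamma)."""
    n_fft = len(frame)
    # Step 1: magnitude squared of the Fourier transform.
    power = np.abs(np.fft.rfft(frame * np.hamming(n_fft))) ** 2
    # Step 2: triangular frequency weighting.
    energies = mel_filterbank(n_filters, n_fft, sample_rate) @ power
    energies = np.maximum(energies, 1e-10)  # floor to avoid log(0)
    # Step 3: log compression (MFCC) or root compression (RCC).
    compressed = np.log(energies) if root is None else energies ** root
    # Step 4: DCT-II; step 5: keep only the first n_ceps coefficients.
    n = np.arange(n_filters)
    k = np.arange(n_ceps)[:, None]
    dct = np.cos(np.pi * k * (n + 0.5) / n_filters)
    return dct @ compressed

rng = np.random.default_rng(0)
frame = rng.standard_normal(256)        # stand-in for one 32 ms frame at 8 kHz
mfcc = cepstra(frame, 8000)             # log compression
rcc = cepstra(frame, 8000, root=0.1)    # root compression, gamma assumed
```

The only difference between the two feature types in this sketch is the compression line in step 3.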

9
Temporal processing
  • Captures the temporal features of the spectral
    envelope to provide robustness
  • Delta features: first- and second-order
    differences (regression)
  • Cepstral mean subtraction
    • Normalizes for channel effects and adjusts
      for spectral slope
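Both operations above act on a matrix of per-frame features. The sketch below assumes a (frames, coefficients) layout and a regression window of plus or minus 2 frames; the function names and the window width are illustrative choices, not taken from the slides.

```python
import numpy as np

def delta(features, width=2):
    """Regression-based delta over +/- width frames; features is (frames, dims)."""
    n_frames = len(features)
    # Repeat the edge frames so every frame has a full regression window.
    padded = np.pad(features, ((width, width), (0, 0)), mode="edge")
    num = np.zeros(features.shape, dtype=float)
    for k in range(1, width + 1):
        ahead = padded[width + k : width + k + n_frames]
        behind = padded[width - k : width - k + n_frames]
        num += k * (ahead - behind)
    return num / (2.0 * sum(k * k for k in range(1, width + 1)))

def cepstral_mean_subtraction(features):
    """Subtract the per-coefficient mean taken over all frames of the utterance."""
    return features - features.mean(axis=0, keepdims=True)

ramp = np.arange(10, dtype=float)[:, None]   # a feature rising by 1 per frame
d = delta(ramp)                              # first-order delta
dd = delta(d)                                # second-order delta

feats = np.random.default_rng(0).standard_normal((50, 13))
normalized = cepstral_mean_subtraction(feats + 5.0)  # constant offset is removed
```

A constant offset in the cepstral domain, which is what a fixed channel produces under log compression, disappears after the mean is subtracted.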

10
Perceptual Linear Prediction (PLP)
  • Compute magnitude-squared of Fourier transform
  • Apply triangular frequency weights that represent
    the effects of peripheral auditory frequency
    resolution
  • Apply compressive nonlinearities
  • Compute discrete cosine transform
  • Smooth using autoregressive modeling
  • Compute the cepstral coefficients using a linear recursion
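The last step, turning autoregressive (all-pole) coefficients into cepstral coefficients, is usually done with the standard LPC-to-cepstrum recursion. The sketch below assumes the model H(z) = 1 / (1 - sum_k a_k z^-k) and leaves out the gain term c_0; the function name is hypothetical, and other sign conventions for the a_k exist.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert all-pole coefficients a_1..a_p into cepstra c_1..c_n_ceps.

    a[k - 1] holds a_k for H(z) = 1 / (1 - sum_k a_k z^-k).
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        # Direct term exists only while n does not exceed the model order.
        acc = a[n - 1] if n <= p else 0.0
        # Recursive term: sum over k of (k / n) * c_k * a_(n - k).
        for k in range(max(1, n - p), n):
            acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c

# For a single pole at 0.5 the cepstrum is known in closed form: c_n = 0.5**n / n.
cep = lpc_to_cepstrum([0.5], 4)
```

The single-pole case is a convenient sanity check because the closed form follows directly from the series expansion of -ln(1 - a z^-1).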

11
PLP (Cont.)
  • Algorithm

12
RelAtive SpecTral Analysis
  • Makes PLP (and possibly also some other
    short-term spectrum based techniques) more robust
    to linear spectral distortions
  • The new spectral estimate is less sensitive to
    slow variations in the short-term spectrum
  • Filters the temporal trajectories of some
    function of each of the spectral values to
    provide more reliable spectral features
  • The filter is usually a bandpass filter that maintains
    the linguistically important spectral envelope
    modulations (1-16 Hz)
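The trajectory filtering described above can be sketched with the commonly cited RASTA transfer function, H(z) = 0.1 (2 + z^-1 - z^-3 - 2 z^-4) / (1 - 0.98 z^-1), written here in causal form. Whether the presentation used exactly these coefficients is not stated; the pole value 0.98 and the function name are assumptions.

```python
def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one time trajectory (one band, one value per frame)."""
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]  # 0.1 * (2, 1, 0, -1, -2)
    out = []
    for t in range(len(trajectory)):
        # FIR part: a smoothed difference over the last five frames.
        acc = sum(b * trajectory[t - i] for i, b in enumerate(numer) if t - i >= 0)
        # IIR part: a leaky integrator with a single pole.
        if t > 0:
            acc += pole * out[t - 1]
        out.append(acc)
    return out

# A constant offset in a band (for example a fixed channel gain in the log
# domain) decays to zero because the numerator coefficients sum to zero.
constant = [3.0] * 2000
filtered = rasta_filter(constant)
```

In a full front end, the filter would be applied separately to each compressed critical-band trajectory before the remaining PLP steps.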

13
RASTA (Cont.)
  • Algorithm

14
RASTA-PLP
  • Algorithm

16
RCC-Mean Normalization
  • Root Cepstral Coefficients (RCC)
    • Derived using root compression rather than log
      compression on the filterbank energies
  • Advantages of RCC over MFCC
    • More immune to noise
    • Faster decoding

17
RCC-Mean Normalization
  • Mean normalization
  • If we approximate the root with a logarithm
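The equations for this slide are not preserved in the transcript. One plausible reading of the argument, offered as an assumption and not as the authors' derivation, is the following. Let X_t(f) be the clean filterbank energy in frame t, H(f) a time-invariant channel gain, Y_t(f) the observed energy, and gamma the root exponent. When gamma ln Y_t(f) is small, the root behaves like a scaled logarithm, so the channel term becomes approximately additive and is removed by subtracting the mean over the T frames of the utterance:

```latex
\begin{align}
Y_t(f) &= X_t(f)\,H(f) \\
Y_t(f)^{\gamma} &= e^{\gamma \ln Y_t(f)}
  \approx 1 + \gamma \ln X_t(f) + \gamma \ln H(f) \\
\frac{1}{T}\sum_{\tau=1}^{T} Y_\tau(f)^{\gamma}
  &\approx 1 + \frac{\gamma}{T}\sum_{\tau=1}^{T} \ln X_\tau(f) + \gamma \ln H(f) \\
Y_t(f)^{\gamma} - \frac{1}{T}\sum_{\tau=1}^{T} Y_\tau(f)^{\gamma}
  &\approx \gamma \left( \ln X_t(f) - \frac{1}{T}\sum_{\tau=1}^{T} \ln X_\tau(f) \right)
\end{align}
```

Because the discrete cosine transform is linear, the same cancellation carries over to the cepstral coefficients, which is the sense in which mean normalization can be applied to RCC as it is to MFCC.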

19
Experiment 1
  • Database
    • TFARSDAT
    • 64 speakers
    • 8 hours of telephone speech data
  • ASR
    • Sharif ASR system
    • HMM based
    • Training: segmental k-means
    • Search: beam Viterbi

20
Experiment 1
  • Test results

21
Experiment 2
  • Aurora 2.0
    • Noisy connected digit recognition
    • 4 hours of training data, 2 hours of test data in 70
      noise type/SNR conditions
  • HTK
    • HMM based
    • One model for each digit
    • 16 states with 3 Gaussian mixtures

22
Experiment 2
  • Average results on AURORA
    • Averages are taken over the SNRs of each noise type

23
Experiment 2
  • Subway noise at various SNRs

24
Experiment 2
  • Babble noise at various SNRs

25
Experiment 2
  • Car noise at various SNRs

26
Experiment 2
  • Exhibition noise at various SNRs

28
Summary
  • Various robust features were tested
  • RCC_MN was introduced
  • In the first experiment
    • RASTA-PLP
    • Although RCC_MN is good
  • In the second experiment
    • RCC_MN

30
Thanks for your patience!