Title: Noise Suppression Techniques for Speech Enhancement Using Adaptive Filtering
1Noise Suppression Techniques for Speech
Enhancement Using Adaptive Filtering
- Derek Shiell
- 03/09/2006
- ECE 463 Project Presentation
- Professor Michael Honig
2Overview
- Objective/Problem Description
- Applications
- Overview of Noise Reduction Methods
- System Description
- Filter analysis
- Linear methods
- Wiener approximation
- KLT preprocessing
- Signal subspace embedding
- Kalman filter based methods
- Non-linear methods
- Current results
- Future work
- Implementation/practical considerations
- Conclusions
3Objective/Problem Description
- The goal of my project was to research noise
reduction techniques specifically for automatic
speech recognition system front-end processing on
a single microphone without an independent noise
recording or clean reference signal.
4Applications
- Cell phone speech enhancement
- Automatic speech recognition
- Speaker identification
- Biomedical signal processing
Image sources for figures (1)-(3):
- http://images.businessweek.com/mz/04/45/techbuy/images/razr_phone.jpg
- http://www.nanopac.com/images/smnsbox.jpg
- http://ldt.stanford.edu/sgilutz/Shulis_Portfolio/fall/hci/images/sensory.jpg
5Overview of Speech Enhancement
- Microphone Array Processing
- Utilizing multiple microphones, blind source
separation (BSS) techniques such as independent
component analysis (ICA) may be used to
distinguish one speaker from other directional or
diffuse noises.
- Active echo/noise cancellation (ANC)
- In this case, the echo or noise is estimated and
re-generated with opposite phase to destructively
interfere with the original echo or noise.
- Blind noise suppression
- In this case, there is a single speech signal
corrupted by noise, no separate noise recording
with which to make noise estimates, and no source
signal to reference.
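The destructive-interference idea behind ANC can be illustrated with a minimal sketch. It assumes the noise estimate is already available as an array and models the estimation error as a simple amplitude mismatch; the names `anc_residual`, `power_db`, and `gain` are illustrative and do not come from the cited references.

```python
import numpy as np

def anc_residual(noise, noise_estimate):
    """Residual left after adding the phase-inverted noise estimate to the noise."""
    anti_noise = -np.asarray(noise_estimate, dtype=float)  # opposite phase
    return np.asarray(noise, dtype=float) + anti_noise

def power_db(v):
    """Average power of a signal in dB."""
    return 10.0 * np.log10(np.mean(np.square(v)))

t = np.arange(1000)
noise = np.sin(2 * np.pi * 0.01 * t)           # toy narrowband noise
gain = 0.9                                     # assumed: estimate captures 90% of the amplitude
residual = anc_residual(noise, gain * noise)   # equals 0.1 * noise
reduction_db = power_db(noise) - power_db(residual)
```

With a 10% amplitude mismatch the residual is 0.1 times the original noise, which corresponds to a 20 dB reduction; a real system would also have to deal with delay and phase errors, which this sketch ignores.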
6System Descriptions
BSS/ICA
[Figure: BSS based on frequency-domain ICA [6]]
ANC
[Figure: Active noise cancellation with single
microphone/speaker [4]]
Blind Noise Reduction
[Figure: Blind noise reduction schematic [1]]
7Filter Analysis (1)
Linear MMSE (Wiener approximation)
MMSE cost function
Reduces to (frame length N)
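As a sketch of why an MMSE cost can be minimized without a clean reference, assume the noisy sample is $y_k = x_k + n_k$, where the noise $n_k$ is zero-mean with variance $\sigma_n^2$ and uncorrelated with the speech $x_k$. The symbols $\mathbf{y}_k$ (vector of recent noisy samples), $\mathbf{w}$, $\mathbf{r}_n$, and $\sigma_n^2$ are assumptions chosen to match the notation of the following slides. The expansion below is a standard identity under those assumptions and may differ in form from the expression on the original slide; in practice the expectations are replaced by averages over a frame of length $N$.

```latex
\begin{align}
J &= E\left[(x_k - \hat{x}_k)^2\right]
   = E\left[(y_k - \hat{x}_k - n_k)^2\right] \\
  &= E\left[(y_k - \hat{x}_k)^2\right] + 2\,E\left[n_k \hat{x}_k\right] - \sigma_n^2 ,
  \qquad \text{using } E[n_k y_k] = \sigma_n^2 . \\
\intertext{For a linear estimate $\hat{x}_k = \mathbf{w}^T \mathbf{y}_k$, with
$\mathbf{r}_n = E[n_k \mathbf{n}_k]$:}
J(\mathbf{w}) &= E\left[(y_k - \mathbf{w}^T \mathbf{y}_k)^2\right]
   + 2\,\mathbf{w}^T \mathbf{r}_n - \sigma_n^2 .
\end{align}
```

Every term on the right depends only on the noisy signal and on noise statistics, which is what makes blind estimation possible once the noise statistics can be estimated.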
8Filter Analysis (2)
- Linear Estimation (continued)
The signal is estimated by linear filtering of
the corrupted signal.
Minimizing the MMSE cost function with respect to
w gives w = R_y^{-1} (r_y - r_n), where R_y is the
autocorrelation matrix of the noisy signal.
This is an approximation to the Wiener solution
in which the cross-correlation
vector p is estimated by (r_y - r_n) (similar to spectral
subtraction).
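The Wiener approximation above can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the author's implementation: the noise autocorrelation `noise_autocorr` is assumed known (here white noise of variance 0.25), the filter order of 16 is arbitrary, and the helper names `autocorr`, `toeplitz_from`, `wiener_approx_weights`, and `apply_fir` are hypothetical.

```python
import numpy as np

def autocorr(v, order):
    """Biased sample autocorrelation r[0..order-1] of a 1-D signal."""
    v = np.asarray(v, dtype=float)
    n = len(v)
    return np.array([np.dot(v[:n - lag], v[lag:]) / n for lag in range(order)])

def toeplitz_from(r):
    """Symmetric Toeplitz matrix with entries R[i, j] = r[|i - j|]."""
    idx = np.abs(np.arange(len(r))[:, None] - np.arange(len(r))[None, :])
    return r[idx]

def wiener_approx_weights(y, noise_autocorr):
    """Solve R_y w = r_y - r_n, i.e. estimate p by (r_y - r_n)."""
    r_n = np.asarray(noise_autocorr, dtype=float)
    r_y = autocorr(y, len(r_n))
    return np.linalg.solve(toeplitz_from(r_y), r_y - r_n)

def apply_fir(w, y):
    """x_hat[k] = sum_i w[i] * y[k - i] (causal FIR filtering)."""
    return np.convolve(y, w)[:len(y)]

# Toy demonstration: a sinusoid standing in for speech, plus white noise.
rng = np.random.default_rng(0)
k = np.arange(4000)
x = np.sin(2 * np.pi * 0.1 * k)
y = x + 0.5 * rng.standard_normal(len(k))
r_n = np.zeros(16)
r_n[0] = 0.25                      # assumed-known white-noise autocorrelation
w = wiener_approx_weights(y, r_n)
x_hat = apply_fir(w, y)
mse_before = np.mean((y - x) ** 2)
mse_after = np.mean((x_hat - x) ** 2)
```

On real speech the statistics change over time, so the weights would be recomputed frame by frame, and the noise autocorrelation would have to be estimated rather than assumed.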
9Filter Analysis (3)
- Linear estimation with Karhunen-Loève Transform
(KLT)
Preprocessing the signal using the KLT (or PCA)
separates the signal into its directions of
greatest variance. Using the transform, the
signal can be mapped into a lower-dimensional
space, which helps decorrelate the signal from
noise. For a changing signal this requires that
U be adaptively updated.
Define U, the KLT transform, as the eigenvectors
of R_y, the autocorrelation matrix of the noisy
signal.
Using this transformation we can define
the transformed y_k, and from it a closed-form
solution for the weight vector.
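One way the KLT step could look in code is sketched below. It assumes the transformed vector is the projection U^T y_k onto the leading eigenvectors of R_y, and that the weights in the transformed domain follow from the same normal equations as the linear case, giving Lambda^{-1} U^T p with p = r_y - r_n. That closed form is derived here from those assumptions and is not confirmed to be the expression on the original slide; `klt_basis`, `klt_weights`, and `rank` are hypothetical names.

```python
import numpy as np

def klt_basis(R_y, rank):
    """Leading `rank` eigenvalues/eigenvectors of the symmetric matrix R_y."""
    eigvals, eigvecs = np.linalg.eigh(R_y)          # eigh returns ascending order
    order = np.argsort(eigvals)[::-1][:rank]        # keep directions of greatest variance
    return eigvals[order], eigvecs[:, order]

def klt_weights(R_y, p, rank):
    """Weights in the KLT domain: w_z = Lambda^{-1} U^T p.

    Because U^T R_y U is diagonal, the normal equations decouple and each
    transformed coefficient gets its own scalar gain.
    """
    lam, U = klt_basis(R_y, rank)
    w_z = (U.T @ p) / lam
    return w_z, U

# Toy example with a random positive-definite matrix standing in for R_y.
rng = np.random.default_rng(1)
A = rng.standard_normal((6, 6))
R_y = A @ A.T + np.eye(6)
p = rng.standard_normal(6)
w_full, U_full = klt_weights(R_y, p, rank=6)   # full rank: same as solving R_y w = p
w_low, U_low = klt_weights(R_y, p, rank=2)     # reduced-dimension version
```

With all eigenvectors retained, `U_full @ w_full` reproduces the ordinary solution of R_y w = p; dropping low-variance directions trades some accuracy for a lower-dimensional, better-conditioned problem.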
10Filter Analysis (4)
- Signal subspace embedding
- This method allows for a matrix of gain
factors, W, rather than simply a weight vector, w
(MIMO), so that a simultaneous block estimate of
the signal can be made. In addition, the matrix Q can be
chosen either as I or so as to taper the tap weights by
some factor(s), such that some components are emphasized more
in the minimization phase.
- MMSE cost function
- Update equations for the filter matrix
and transform basis can be found iteratively
11Filter Analysis (5)
- Kalman Filtering Approaches
- Kalman filters are widely used in speech
enhancement, and much theoretical work has been
done analyzing them. The Kalman filter
is the minimum mean-square-error estimator of the state
of a linear dynamical system and can be used to
derive many types of RLS filters. Kalman filters
can be extended (the extended Kalman filter) to handle
nonlinear models through a linearization process.
- Kalman filters have the advantages that they
- are more robust (stationarity is not assumed)
- require only the previous estimate for the next
estimation (versus all past values, for
instance)
- are computationally efficient
Standard linear state-space model for Kalman
filter
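A common way to write the linear state-space model is x_{k+1} = F x_k + v_k and y_k = H x_k + n_k, with process-noise covariance Q and measurement-noise covariance R. The sketch below implements one predict/update cycle for that model; the symbol names and the function `kalman_step` are assumptions for illustration (this Q is unrelated to the weighting matrix Q on the previous slide), and the toy example estimates a constant rather than speech.

```python
import numpy as np

def kalman_step(x_est, P, y, F, H, Q, R):
    """One Kalman predict/update cycle for x_{k+1} = F x_k + v_k, y_k = H x_k + n_k."""
    # Predict the state and its error covariance one step ahead.
    x_pred = F @ x_est
    P_pred = F @ P @ F.T + Q
    # Update with the new measurement; only the previous estimate is needed.
    S = H @ P_pred @ H.T + R                  # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)       # Kalman gain
    x_new = x_pred + K @ (y - H @ x_pred)
    P_new = (np.eye(len(x_est)) - K @ H) @ P_pred
    return x_new, P_new

# Toy example: estimate a constant from three noisy measurements.
F = np.eye(1)
H = np.eye(1)
Q = np.zeros((1, 1))
R = np.eye(1)
x_est = np.zeros(1)
P = np.array([[1e6]])                         # very uncertain initial guess
for measurement in [1.0, 2.0, 3.0]:
    x_est, P = kalman_step(x_est, P, np.array([measurement]), F, H, Q, R)
```

With a constant state and a very uncertain prior, the recursion converges to the running average of the measurements, which shows how the filter summarizes all past data in the previous estimate and its covariance.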
12Filter Analysis (6)
- Nonlinear filtering
- Many nonlinear filtering methods exist to
suppress noise in noisy speech. Examples include
filters based on neural networks or phase space
reconstruction. In general, they are very complex
to analyze, but do not require estimation of
noise or speech spectra and are not characterized
by musical tone artifacts.
[Figure (1): Feed-forward neural network]
[Figure: Phase space reconstruction for different speech
phonemes [9]]
Image source for figure (1):
- http://research.yale.edu/ysm/images/78.2/articles-neural-network.jpg
13Typical Results
[Figure: Segmental SNR results (left) and SNR results
(below) for various linear and nonlinear noise
reduction methods [8]]
[Figure panels: Noisy Speech Signal (white noise);
Wiener Filtered; Ephraim Filtered]
- Comparison of segmental SNR performance for
different noise sources
- White noise (SNR 6.08 dB)
- Pink noise (SNR 4.34 dB)
- Factory noise (SNR 5.16 dB)
- F16 noise (SNR 4.61 dB)
- a) Linear estimation
- b) Linear estimation with KLT preprocessing
- c) Signal subspace embedding
- d) Weighted signal subspace embedding
- e) NN with KLT
- f) Linear with clean target
- g) Nonlinear with clean target
- h) Standard spectral subtraction
method (3dB segmental SNR 5dB SNR) [1]
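The two metrics reported above can be computed as in the following sketch. It assumes the usual definitions: SNR is the ratio of clean-signal energy to error energy in dB, and segmental SNR is the average of per-frame SNR values. The frame length of 256 samples, the optional `clip_db` range, and the function names are assumptions; the cited papers may use different settings.

```python
import numpy as np

def snr_db(clean, estimate, eps=1e-12):
    """Global SNR in dB: clean-signal energy over error energy."""
    clean = np.asarray(clean, dtype=float)
    err = clean - np.asarray(estimate, dtype=float)
    return 10.0 * np.log10((np.sum(clean ** 2) + eps) / (np.sum(err ** 2) + eps))

def segmental_snr_db(clean, estimate, frame_len=256, clip_db=None):
    """Mean of per-frame SNR values over non-overlapping frames.

    clip_db is an optional (low, high) pair; some implementations clamp
    per-frame values so silent frames do not dominate the average.
    """
    clean = np.asarray(clean, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    n_frames = len(clean) // frame_len
    values = np.array([
        snr_db(clean[i * frame_len:(i + 1) * frame_len],
               estimate[i * frame_len:(i + 1) * frame_len])
        for i in range(n_frames)
    ])
    if clip_db is not None:
        values = np.clip(values, clip_db[0], clip_db[1])
    return float(np.mean(values))

# Toy check: an estimate with a uniform 10% amplitude error.
clean = np.sin(2 * np.pi * 0.01 * np.arange(1024))
estimate = 1.1 * clean
global_snr = snr_db(clean, estimate)
seg_snr = segmental_snr_db(clean, estimate, frame_len=256)
```

When the error is a fixed fraction of the signal, both metrics agree (20 dB here). They diverge on real speech, where quiet frames pull the segmental average down, which is why the two are reported separately.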
14Future Work
- Perform ASR after noise reduction filtering
- AVICAR database
- Data collected in a car environment
- Time varying SNR
- No independent noise recording (detecting speech
is difficult)
- Experiments
- KLT preprocessing linear estimation (Wiener)
- Ephraim filter (ML short time spectral amplitude
estimator)
- Nonlinear methods
15Implementation/Practical Considerations
- Real-time processing
- Applications require computationally efficient
algorithms to be feasible.
- Determining noise sample
- With a single microphone, speech detection to estimate
noise statistics is difficult.
- Use visual information to detect speech, or use
nonlinear noise reduction methods
16Conclusions
- Noise suppression methods have become
increasingly important due to the proliferation
of mobile devices, ASR systems, and
biometrics/bioinformatics
- Speech enhancement is a very broad field
- Array processing for source separation, noise
cancellation
- Interested in blind noise reduction
- Linear, Linear KLT preprocessing, Signal
subspace embedding
- Kalman filter based methods, Non-linear methods
- Using state-of-the-art noise reduction methods,
typical SNR improvements are 5 dB
- Proposed experiments to test ASR improvement
17References
- [1] Eric A. Wan and Rudolph van der Merwe,
Noise-Regularized Adaptive Filtering for Speech
Enhancement, Proc. Eurospeech, pp. 2643-2646,
1999.
- [2] Ki Yong Lee, Byung-Gook Lee, Iickho Song, and
Souguil Ann, Robust Estimation of AR Parameters
and its Application for Speech Enhancement,
Proc. IEEE ICASSP, pp. 309-312, 1992.
- [3] Phil S. Whitehead, David V. Anderson, and Mark A.
Clements, Adaptive, Acoustic Noise Suppression
for Speech Enhancement, Proc. IEEE ICME, pp.
565-568, 2003.
- [4] A. V. Oppenheim, E. Weinstein, K. C. Zangi, M.
Feder, and D. Gauger, Single Sensor Active Noise
Cancellation Based on the EM Algorithm, Proc.
IEEE ICASSP, pp. 277-280, 1992.
- [5] T. Rutkowski, A. Cichocki, and A. K. Barros,
Speech Enhancement Using Adaptive Filters and
Independent Component Analysis Approach, Proc.
AISAT, 2000.
- [6] H. Saruwatari, K. Sawai, A. Lee, K. Shikano, A.
Kaminuma, and M. Sakata, Speech Enhancement and
Recognition in Car Environment Using Blind Source
Separation and Subband Elimination Processing,
Proc. ICA, pp. 367-372, 2003.
- [7] Simon Haykin, Adaptive Filter Theory,
Prentice-Hall Inc., Upper Saddle River, NJ, pp.
466-501, 2002.
- [8] M. T. Johnson, A. C. Lindgren, R. J. Povinelli,
and X. Yuan, Performance of Nonlinear Speech
Enhancement using Phase Space Reconstruction,
Proc. IEEE ICASSP, pp. 872-875, 2003.
- [9] Andrew C. Lindgren, Speech Recognition Using
Features Extracted from Phase Space
Reconstructions, Thesis, Marquette University,
Milwaukee WI, May 2003.