A Signal Subspace Approach for Speech Enhancement

Provided by: sundee4
1
A Signal Subspace Approach for Speech Enhancement
  • Karhunen-Loève Transform Approach
  • Sundeep Singh

2
Signal Spaces
  • Signal occupies an N-dimensional space (for a
    vector of length N)
  • For speech signal, the speech occupies only a
    subspace of the entire N-dimensional space
  • White noise occupies the entire N-dimensional
    space
  • For a noisy signal $z = y + w$ (clean speech y plus
    noise w), find the subspace that contains the
    speech and project z onto this subspace to obtain
    an estimate of the clean signal, y
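
A minimal numerical sketch of the projection idea on this slide, on synthetic data rather than speech. The dimensions `N`, `K`, `T`, the noise variance `sigma2`, and the basis `B` are illustrative assumptions; in the real algorithm the subspace is not known in advance and has to be estimated.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, T = 8, 2, 20000   # vector length, subspace dimension, number of vectors (illustrative)

# Clean vectors y live in a K-dimensional subspace spanned by the columns of B.
B, _ = np.linalg.qr(rng.standard_normal((N, K)))
y = B @ rng.standard_normal((K, T))

# White noise w spreads its energy over all N dimensions.
sigma2 = 0.1
w = np.sqrt(sigma2) * rng.standard_normal((N, T))
z = y + w

# Orthogonal projection of z onto the (here, known) signal subspace.
P = B @ B.T
y_hat = P @ z

mse_noisy = np.mean((z - y) ** 2)      # error before projection
mse_proj = np.mean((y_hat - y) ** 2)   # error after projection
print(mse_noisy, mse_proj)
```

Projection leaves the signal untouched and removes the noise components outside the subspace, so the remaining error is roughly the fraction K/N of the original noise power.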

3
Subspace Linear Estimator
4
Karhunen-Loève Expansion
  • Decompose a general second-order random process
    into an orthonormal expansion, whose coefficients
    are uncorrelated random variables
  • There exists at least one set of orthonormal
    functions with the property that the coefficients
    in its expansion are uncorrelated random
    variables
  • Different from the Fourier series expansion,
    because the coefficients are uncorrelated, or
    statistically orthogonal
  • $X(t) = \sum_n X_n \phi_n(t)$ (in the mean-square sense)
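
A discrete-time sketch of the same property, assuming a finite-length random vector in place of the continuous process X(t). The covariance model `rho**|i-j|` is an arbitrary illustrative choice; the point is only that expanding in the eigenvectors of the covariance matrix yields uncorrelated coefficients.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 6, 50000                       # vector length, number of realizations (illustrative)

# Correlated zero-mean vectors with covariance R[i, j] = rho**|i - j| (assumed model).
rho = 0.9
idx = np.arange(N)
R = rho ** np.abs(idx[:, None] - idx[None, :])
x = np.linalg.cholesky(R) @ rng.standard_normal((N, T))

# Orthonormal basis: the eigenvectors of R (columns of Phi).
eigvals, Phi = np.linalg.eigh(R)

coeffs = Phi.T @ x                    # expansion coefficients X_n
x_rec = Phi @ coeffs                  # expansion reproduces x

C = coeffs @ coeffs.T / T             # sample covariance of the coefficients
off_diag = C - np.diag(np.diag(C))
print(np.max(np.abs(off_diag)))       # small: coefficients are (nearly) uncorrelated
```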

5
K-L Transform
  • KLT is the theoretically optimal transform for
    transform coding
  • Implemented in this algorithm by using an
    empirical estimate based on eigendecomposition of
    the (Toeplitz) covariance matrix
  • Decomposes the vector space of the noisy signal
    into a signal-plus-noise subspace and a noise
    subspace
  • Linear estimation is performed by modifying the
    KLT components which lie in the signal subspace
    by a gain function, determined by an estimator
  • Enhanced signal is obtained from the inverse KLT
    of the altered components

6
K-L Transform (cont.)
  • Find the covariance matrix of a frame of noisy
    speech, $R_z$
  • Find the covariance matrix of the noise, $R_w$; for
    white noise this is a diagonal matrix with each
    element on the diagonal equal to the variance of
    the noise, σ² (that is, $R_w = \sigma^2 I$)
  • σ² is estimated from the first few frames of the
    signal, which are assumed to contain only noise
    and no speech
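
One way these two estimates might be computed, as a sketch only. The helper names `toeplitz_covariance` and `noise_variance`, the frame length, and the matrix size are assumptions, not taken from the slides; the slides do not say which autocorrelation estimator was used, so the biased estimator here is one common choice.

```python
import numpy as np

def toeplitz_covariance(frame, N):
    """Hypothetical helper: N x N symmetric Toeplitz matrix built from the
    biased autocorrelation estimate of one frame."""
    frame = np.asarray(frame, dtype=float)
    M = len(frame)
    r = np.array([np.dot(frame[:M - k], frame[k:]) / M for k in range(N)])
    lags = np.abs(np.arange(N)[:, None] - np.arange(N)[None, :])
    return r[lags]                     # entry (i, j) is r[|i - j|]

def noise_variance(signal, frame_len, n_noise_frames):
    """Hypothetical helper: variance of the leading frames, which are
    assumed to contain noise only."""
    lead = np.asarray(signal[:frame_len * n_noise_frames], dtype=float)
    return np.var(lead)

# Stand-in data: pure white noise with variance 0.04 (illustrative).
rng = np.random.default_rng(0)
signal = np.sqrt(0.04) * rng.standard_normal(4000)

sigma2_hat = noise_variance(signal, frame_len=80, n_noise_frames=10)
Rz = toeplitz_covariance(signal[:80], N=16)
print(sigma2_hat, Rz.shape)
```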

7
K-L Transform (cont.)
  • Perform eigendecomposition on $R_z$ to find the
    eigenvectors and eigenvalues of the noisy signal,
    $R_z = U \Lambda_z U^T$
  • $\Lambda_z$ is a diagonal matrix of the eigenvalues
    of the noisy signal
  • U is a matrix of the corresponding eigenvectors
  • $\lambda_z(k) = \sigma^2$ in the noise subspace
  • $\lambda_z(k) = \lambda_y(k) + \sigma^2$ in the signal subspace
  • So the eigenvalues which are greater than σ² can
    be assumed to lie in the signal subspace.
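
The eigenvalue split can be checked on a synthetic covariance matrix. This sketch assumes an exactly low-rank clean covariance `Ry` plus white noise; the sizes, `sigma2`, and the tolerance `tol` are illustrative. With covariances estimated from real data the noise eigenvalues only cluster around σ² instead of equaling it.

```python
import numpy as np

rng = np.random.default_rng(2)
N, K, sigma2 = 8, 3, 0.5              # illustrative sizes and noise variance

A = rng.standard_normal((N, K))
Ry = A @ A.T                          # rank-K clean covariance (synthetic)
Rz = Ry + sigma2 * np.eye(N)          # noisy covariance: Rz = Ry + sigma^2 I

# Eigendecomposition Rz = U diag(lam_z) U^T, sorted largest eigenvalue first.
lam_z, U = np.linalg.eigh(Rz)
order = np.argsort(lam_z)[::-1]
lam_z, U = lam_z[order], U[:, order]

tol = 1e-8                            # guards against floating-point round-off
in_signal = lam_z > sigma2 + tol      # eigenvalues above the noise floor

print(lam_z)
print(in_signal)
```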

8
Implementation
  • For each of the eigenvalues, $\lambda_z(k)$, of the
    noisy signal, compare the value to the variance of
    the noise, σ²
  • If the eigenvalue is greater than the noise
    variance, assume it lies in the signal subspace
    along with its corresponding eigenvector
  • Otherwise, assume it lies in the noise subspace,
    and null it and its corresponding eigenvector
  • To estimate the eigenvalues, $\lambda_y(k)$, of the
    clean signal, take the noisy eigenvalues which lie
    in the signal subspace and subtract the noise
    variance
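
The thresholding and subtraction steps above can be sketched in a few lines. The function name `clean_eigenvalues` and the example numbers are hypothetical.

```python
import numpy as np

def clean_eigenvalues(lam_z, sigma2):
    """Hypothetical helper: flag eigenvalues above the noise variance and
    subtract the noise variance from the ones that are kept."""
    lam_z = np.asarray(lam_z, dtype=float)
    in_signal = lam_z > sigma2         # signal subspace if above the noise variance
    lam_y = lam_z[in_signal] - sigma2  # estimated clean-signal eigenvalues
    return in_signal, lam_y

# Made-up noisy eigenvalues with noise variance 0.5.
in_signal, lam_y = clean_eigenvalues([3.0, 1.2, 0.5, 0.4], sigma2=0.5)
print(in_signal, lam_y)
```

In this example the last two eigenvalues do not exceed the noise variance, so they are assigned to the noise subspace and dropped.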

9
Implementation (cont.)
  • The gain function and inverse transform can be
    implemented via a filter, H
  • Let $\hat{y} = Hz$ be a linear estimator of the clean
    signal y, where H is an N×N matrix
  • It can be shown that the optimal H to estimate
    the clean signal is $H = U_1 G_\mu U_1^T$
  • $U_1$ is the matrix of eigenvectors that correspond
    to the eigenvalues in the signal subspace
  • $G_\mu$ is a diagonal matrix, with the gain values as
    elements on the diagonal
  • $g_\mu(m) = \lambda_y(m) / (\lambda_y(m) + \mu\sigma^2)$
  • µ is the Lagrange multiplier, which can be
    estimated in several ways, or chosen to be an
    arbitrary value
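
Putting the pieces together, the per-frame filter might be built as in the sketch below. The function name `subspace_filter` and the synthetic covariance are assumptions for illustration, not the presenter's code. As a sanity check, setting µ = 1 makes the gain λy/(λy + σ²), which is the Wiener gain, so the resulting H should match Ry·Rz⁻¹ on this synthetic example.

```python
import numpy as np

def subspace_filter(Rz, sigma2, mu):
    """Hypothetical helper: H = U1 G_mu U1^T from the noisy covariance Rz."""
    lam_z, U = np.linalg.eigh(Rz)
    keep = lam_z > sigma2                 # signal-subspace eigenvalues
    U1 = U[:, keep]                       # corresponding eigenvectors
    lam_y = lam_z[keep] - sigma2          # clean-signal eigenvalue estimates
    g = lam_y / (lam_y + mu * sigma2)     # diagonal of G_mu
    return (U1 * g) @ U1.T                # same as U1 @ diag(g) @ U1.T

# Synthetic covariance: rank-3 clean part plus white noise (illustrative).
rng = np.random.default_rng(3)
N, K, sigma2 = 8, 3, 0.5
A = rng.standard_normal((N, K))
Ry = A @ A.T
Rz = Ry + sigma2 * np.eye(N)

H = subspace_filter(Rz, sigma2, mu=1.0)
z = rng.standard_normal(N)                # stand-in for one noisy frame
y_hat = H @ z                             # enhanced frame
print(H.shape, y_hat.shape)
```

Multiplying by `U1.T` is the forward KLT restricted to the signal subspace, the gains `g` modify the components, and multiplying by `U1` is the inverse transform, so a single matrix H carries out all three steps.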

10
Algorithm Complexity
  • Each frame of the noisy signal has a unique
    filter, H, which cleans it up
  • Very computationally expensive to implement,
    because a new filter has to be computed for each
    frame
  • As frame size increases, the covariance matrices
    get larger, and running time of the algorithm
    increases
  • On a 1.3 GHz processor, with 99% CPU utilization:
  • 5 ms frame - 20 s running time
  • 10 ms frame - 48 s running time
  • 20 ms frame - 147 s running time
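
A rough reading of these three timings: the sketch below fits a local power-law exponent between consecutive measurements. It uses only the numbers on the slide; the input length, sampling rate, and frame overlap are not stated, so the interpretation is an assumption.

```python
import math

frame_ms = [5, 10, 20]        # frame lengths from the slide
runtime_s = [20, 48, 147]     # total running times from the slide

# Local exponent p in runtime ~ frame_length**p between consecutive points.
exponents = [
    math.log(runtime_s[i + 1] / runtime_s[i]) / math.log(frame_ms[i + 1] / frame_ms[i])
    for i in range(len(frame_ms) - 1)
]
print(exponents)
```

Total running time grows faster than linearly in the frame length. If the number of frames is assumed to fall in proportion to the frame length, the cost per frame grows with an exponent about one higher than these values, which would be consistent with the cubic cost of a dense eigendecomposition starting to dominate.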

11
Audio Demos (µ = 10)
  • Noisy signal, 0 dB SNR
  • Enhanced signal
  • Noisy signal, 5 dB SNR
  • Enhanced signal
  • Noisy signal, 10 dB SNR
  • Enhanced signal

12
Effect of µ on output of algorithm
  • As µ increases, the residual noise decreases
    while signal distortion increases
  • The effect of changes in µ becomes apparent in
    highly noisy signals
  • Noisy signal, -5 dB SNR
  • Enhanced signal, µ = 0
  • Enhanced signal, µ = 10
  • Enhanced signal, µ = 100
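
The trade-off can be seen directly in the gain formula $g_\mu(m) = \lambda_y(m) / (\lambda_y(m) + \mu\sigma^2)$. In this sketch the clean eigenvalues and noise variance are made-up values; only the three µ settings come from the slide.

```python
import numpy as np

def gain(lam_y, sigma2, mu):
    # Gain applied to each signal-subspace component.
    return lam_y / (lam_y + mu * sigma2)

lam_y = np.array([10.0, 1.0, 0.1])   # strong, medium, weak components (illustrative)
sigma2 = 1.0                         # noise variance (illustrative)

gains = {mu: gain(lam_y, sigma2, mu) for mu in (0, 10, 100)}
for mu, g in gains.items():
    print(mu, g)
```

With µ = 0 every kept component passes unchanged, so residual noise in the signal subspace remains. Larger µ pushes all gains toward zero, weak components first, which suppresses more noise but also attenuates more of the speech.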

13
The End