Blind Separation of Speech Mixtures

1
Blind Separation of Speech Mixtures
  • Vaninirappuputhenpurayil Gopalan REJU
  • School of Electrical and Electronic Engineering
  • Nanyang Technological University

2
Introduction
Blind Source Separation
Convolutive
  • Mixing process

s1
s2
  • Unmixing process

3
Introduction
Convolutive Blind Source Separation
Instantaneous Blind Source Separation
4
Introduction
Convolutive Blind Source Separation
Instantaneous Blind Source Separation
Difficult to separate
Easy to separate
  • In frequency domain

5
Introduction
No. of sources < No. of sensors
Overdetermined mixing
Easy to separate
No. of sources = No. of sensors
Determined mixing
No. of sources > No. of sensors
Difficult to separate
Underdetermined mixing
6
Approaches for BSS of Speech Signals
Types of mixing
Instantaneous mixing
Convolutive mixing
7
Approaches for BSS of Speech Signals
Instantaneous mixing
Step 1: Selection of cost function
Step 2: Minimization or maximization of the cost function
[Figure: sources S1, S2 → mixing system H → mixtures X1, X2 → unmixing system W → outputs Y1, Y2. Separated?]
8
Approaches for BSS of Speech Signals
Instantaneous mixing
Selection of cost function
Statistical independence
Signals from two different sources are independent
Information theoretic
Non-Gaussianity
Central limit theorem: A mixture of two or more sources is more Gaussian than its individual components.
Non-Gaussianity measures
Kurtosis
Negentropy
Nonlinear cross moments
Temporal structure of speech
Non-stationarity of speech
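The central limit theorem argument above can be checked numerically. The sketch below is only an illustration: it assumes Laplacian noise as a rough stand-in for speech and uses excess kurtosis as the non-Gaussianity measure. The function names and the sample count are hypothetical, not taken from the slides.

```python
import math
import random


def excess_kurtosis(x):
    """Sample excess kurtosis: E[(x - m)^4] / var^2 - 3 (zero for a Gaussian)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    fourth = sum((v - mean) ** 4 for v in x) / n
    return fourth / var ** 2 - 3.0


def laplace_sample(rng):
    """One zero-mean Laplacian sample (difference of two exponentials)."""
    return rng.expovariate(1.0) - rng.expovariate(1.0)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 50_000
s1 = [laplace_sample(rng) for _ in range(n)]
s2 = [laplace_sample(rng) for _ in range(n)]

# Equal-weight instantaneous mixture of the two independent sources.
mix = [(a + b) / math.sqrt(2.0) for a, b in zip(s1, s2)]

k_s1 = excess_kurtosis(s1)
k_s2 = excess_kurtosis(s2)
k_mix = excess_kurtosis(mix)

# The mixture's excess kurtosis is closer to zero, i.e. it is "more Gaussian".
print(k_s1, k_s2, k_mix)
```

An ICA cost function based on kurtosis exploits exactly this gap: maximizing the non-Gaussianity of an output pushes it away from a mixture and toward a single source.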
9
Approaches for BSS of Speech Signals
Instantaneous mixing
Minimization or maximization of the cost function
Simple gradient method
Natural gradient method
e.g. Infomax ICA algorithm
Newton's method
e.g. FastICA
10
Approaches for BSS of Speech Signals
Convolutive Mixing
Time Domain
Advantage: No permutation problem
Disadvantages: Slow convergence; high computational cost for long filter taps
Frequency Domain
Advantages: Low computational cost; fast convergence
Disadvantage: Permutation problem
[Figure: sources S1, S2 → mixing system H → mixtures X1, X2 → unmixing system W → outputs (Y1, Y2) or (Y2, Y1)]
11
Permutation Problem in Frequency Domain BSS
[Figure: mixed signals x1, x2, x3 → K-point FFT → instantaneous ICA algorithm (BSS) applied in each frequency bin f1, f2, ..., fk → solving permutation problem → K-point IFFT → separated signals y1, y2, y3. Before the permutation problem is solved, the signals are still mixed: due to the permutation problem, the outputs of one frequency bin can correspond to different sources, e.g. to y3.]
12
Motivation
BSS
  • Determined/Overdetermined (mixtures ≥ sources)
    • Instantaneous
    • Convolutive
      • Time domain
      • Frequency domain: frequency bin-wise separation, permutation problem
  • Underdetermined (mixtures < sources)
    • Instantaneous: mixing matrix estimation, source estimation
    • Convolutive
      • Time domain
      • Frequency domain: frequency bin-wise separation, permutation problem
    • Automatic detection of no. of sources
13
My Contribution - I
[Figure: BSS taxonomy diagram, repeated from the Motivation slide]
14
Algorithm for Solving the Permutation Problem
[Figure: mixed signals x1, x2, x3 → K-point FFT → instantaneous ICA algorithm (BSS) applied in each frequency bin f1, f2, ..., fk (permutation problem) → solving permutation problem (permutation problem solved) → K-point IFFT → separated signals y1, y2, y3]
15
Existing Method for Solving the Permutation Problem
Direction Of Arrival (DOA) method
Direction of y1: -30°; Direction of y2: 20°
Position of the pth sensor
Velocity of sound
16
Existing Method for Solving the Permutation Problem
Direction Of Arrival (DOA) method
  • Disadvantages
    • Fails at lower frequencies.
    • Fails when sources are near.
    • Affected by room reverberation.
    • Sensor positions must be known.
  • Reasons for failure at lower frequencies
    • Lower spacing causes error in phase difference measurement.
    • The relation is approximated for a plane wavefront under anechoic conditions.

17
Existing Method for Solving the Permutation Problem
Adjacent bands correlation method
[Figure: mixed signals x1, x2, x3 → K-point FFT → BSS in each frequency bin f1, f2, ..., fk → solving permutation problem → K-point IFFT → separated signals y1, y2, y3. Output pairs in adjacent bins are labeled "High correlation" or "Low correlation".]
18
Existing Method for Solving the Permutation Problem
Adjacent bands correlation method
[Figure: for sources s1 and s2, correlations r11, r12, r21, r22 between frequency bins K-1, K, K+1, K+2, K+3, ...; correlation matrix with entries r11, r12, r21, r22]
With confidence
Without confidence
Examples: Change permutation; No change
19
Existing Method for Solving the Permutation Problem
Adjacent bands correlation method
[Figure: for sources s1 and s2, correlations r11, r12, r21, r22 between frequency bins K-1, K, K+1, K+2, K+3, ...; correlation matrix with entries r11, r12, r21, r22]
Disadvantage: The method is not robust
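The adjacent bands correlation idea can be sketched as follows. This is a minimal illustration, not the method evaluated in the talk: it assumes each bin's outputs are represented by their magnitude envelopes over time, picks for every bin the ordering with the highest total correlation to the previously aligned bin, and omits the confidence check mentioned on the slides. The names `correlation` and `align_bins` are hypothetical.

```python
import itertools
import math


def correlation(a, b):
    """Pearson correlation between two equal-length, non-constant sequences."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def align_bins(envelopes):
    """envelopes[f][i] is the magnitude envelope of output i in frequency bin f.

    Returns one permutation per bin; perms[f][i] is the index of the output
    in bin f that is assigned to source i. Bin 0 is taken as the reference.
    """
    n_out = len(envelopes[0])
    perms = [tuple(range(n_out))]
    aligned_prev = envelopes[0]
    for f in range(1, len(envelopes)):
        # Exhaustive search over orderings; fine for a handful of sources.
        best = max(
            itertools.permutations(range(n_out)),
            key=lambda p: sum(
                correlation(aligned_prev[i], envelopes[f][p[i]])
                for i in range(n_out)
            ),
        )
        perms.append(best)
        aligned_prev = [envelopes[f][j] for j in best]
    return perms


# Toy data: two source envelopes, swapped in the middle bin.
e1 = [1.0, 5.0, 2.0, 8.0, 1.0, 7.0]
e2 = [6.0, 1.0, 7.0, 1.0, 5.0, 2.0]
bins = [
    [e1, e2],
    [[1.1 * v for v in e2], [0.9 * v for v in e1]],  # permuted bin
    [[0.8 * v for v in e1], [1.2 * v for v in e2]],
]
perms = align_bins(bins)
print(perms)
```

Because each bin is aligned only against its neighbor, one wrong decision propagates to all later bins, which is one way to read the robustness concern noted on the slide.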
20
Existing Method for Solving the Permutation Problem
Combination of DOA and correlation methods
DOA + Harmonic correlation + Adjacent bands correlation
Advantage: Increased robustness
21
Proposed algorithm: Partial separation method (Parallel configuration)
Reference: V. G. Reju, S. N. Koh and I. Y. Soon, "Partial separation method for solving permutation problem in frequency domain blind source separation of speech signals," Neurocomputing, Vol. 71, No. 10-12, June 2008, pp. 2098-2112.
Time domain stage
Frequency domain stage
22
Partial separation method (Parallel configuration)
Time domain stage
Frequency domain stage
23
Partial separation method (Cascade configuration)
Parallel configuration
Frequency domain stage
Time domain stage
24
Advantages of Partial Separation method
  • Robustness

25
Comparison with Adjacent Bands Correlation Method
26
Comparison with DOA method
PS: Partial separation method with confidence check
C1: Correlation between the adjacent bins without confidence check
C2: Correlation between adjacent bins with confidence check
Ha: Correlation between the harmonic components with confidence check
PS1: Partial separation method alone, without confidence check
27
My Contribution -II
[Figure: BSS taxonomy diagram, repeated from the Motivation slide]
28
Underdetermined Blind Source Separation of Instantaneous Mixtures
29
Mathematical Representation of Instantaneous Mixing
Reference: V. G. Reju, S. N. Koh and I. Y. Soon, "An algorithm for mixing matrix estimation in instantaneous blind source separation," Signal Processing, Vol. 89, Issue 9, September 2009, pp. 1762-1773.
Time domain
P: No. of mixtures; Q: No. of sources
Time-Frequency domain
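A standard way to write the instantaneous mixing model in both domains is sketched below. The mixture symbols x and X(k, t) follow the algorithm steps given later in the talk; the mixing matrix symbol A, its columns a_q, and the source symbols s and S are assumptions made for this sketch.

```latex
% Time domain: P mixtures, Q sources, A is the (assumed) P x Q mixing matrix
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) = \sum_{q=1}^{Q} \mathbf{a}_q\, s_q(t),
\qquad \mathbf{x}(t) \in \mathbb{R}^{P},\ \mathbf{s}(t) \in \mathbb{R}^{Q}

% Time-frequency domain: the STFT is linear, so the same A applies at every
% frequency bin k and time frame t
\mathbf{X}(k,t) = \mathbf{A}\,\mathbf{S}(k,t) = \sum_{q=1}^{Q} \mathbf{a}_q\, S_q(k,t)
```

At a time-frequency point where only source q is active, the sum collapses to a single term, so X(k, t) points along the column a_q.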
30
Single Source Points in Time-Frequency domain
Single source point 1
Single source point 2
31
Single Source Points in Time-Frequency domain
Single source point 1
Single source point 2
32
Single Source Points in Time-Frequency domain
Single source point 1
Single source point 2
[Equation labels: Scalar]
∴ At single source point 1
∴ At single source point 2
33
Scatter Diagram of the Mixtures When Sources Are Perfectly Sparse
Example
34
Scatter Diagram of the Mixtures When Sources Are Not Perfectly Sparse
Example
35
Scatter Diagram of the Mixtures When Sources Are Sparse
No. of sources: 6; No. of mixtures: 2
36
Scatter Diagram of the Mixtures When Sources Are Sparse, After Clustering
No. of sources: 6; No. of mixtures: 2
37
Scatter Diagram of the Mixtures When Sources Are Not Perfectly Sparse
Objective: Estimation of the single source points.
No. of sources: 6; No. of mixtures: 2
38
Principle of the Proposed Algorithm for the Detection of Single Source Points
[Figure: single source point 1, single source point 2 (equation terms labeled "Scalar") and a multi source point]
39
Principle of the Proposed Algorithm for the Detection of Single Source Points
[Figure: single source point 1, single source point 2 (equation terms labeled "Scalar") and a multi source point]
40
Principle of the Proposed Algorithm for the Detection of Single Source Points
Average of 15 pairs of speech utterances of length 10 s each
[Figure: curves for SSP and MSP]
41
Proposed Algorithm for the Detection of Single Source Points
[Figure: regions labeled SSP and MSP]
42
Elimination of Outliers
SSPs detection
Clustering
Outlier elimination
43
Experimental Results
No. of mixtures: 2; No. of sources: 6
44
Detected Single Source Points, Three Mixtures
No. of mixtures: 3; No. of sources: 6
45
Comparison with Classical Algorithms for Determined Case
Average of 500 experimental results
No. of mixtures: 2; No. of sources: 2
46
Comparison with Method Proposed in [1], Underdetermined Case
Normalized mean square error (NMSE) in mixing matrix estimation (dB)
P: No. of mixtures; Q: No. of sources
Order of the mixing matrices (P×Q)
[1] Y. Li, S. Amari, A. Cichocki, D. W. C. Ho, and S. Xie, "Underdetermined blind source separation based on sparse representation," IEEE Transactions on Signal Processing, vol. 54, pp. 423-437, Feb. 2006.
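The NMSE formula itself is not shown in the transcript. A common definition, assumed here, is the squared error summed over all matrix entries, normalized by the energy of the true matrix and expressed in dB. The function name `nmse_db` is hypothetical, and the sketch further assumes that the estimated columns have already been matched to the true columns in order, sign and scale.

```python
import math


def nmse_db(a_true, a_est):
    """NMSE (dB) between a true and an estimated mixing matrix.

    Both matrices are lists of rows with identical shape. Assumes the
    permutation and scaling ambiguities were resolved beforehand.
    """
    num = sum(
        (e - t) ** 2
        for row_t, row_e in zip(a_true, a_est)
        for t, e in zip(row_t, row_e)
    )
    den = sum(t ** 2 for row in a_true for t in row)
    return 10.0 * math.log10(num / den)


a_true = [[1.0, 0.0], [0.0, 1.0]]
a_est = [[1.1, 0.0], [0.0, 0.9]]
value = nmse_db(a_true, a_est)
print(value)  # about -20 dB: error energy is 1% of the matrix energy
```

Lower (more negative) values mean a better estimate; a perfect estimate would give minus infinity, so the function is only meaningful when the two matrices differ.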
47
Advantages of the Proposed Algorithm
1) Much simpler constraint: the algorithm does not require a single source zone.
2) Separation performance is better.
3) The algorithm is extremely simple but effective:
Step 1: Convert x in the time domain to the TF domain to get X.
Step 2: Check the condition.
Step 3: If the condition is satisfied, then X(k, t) is a sample at the SSP, and this sample is kept for mixing matrix estimation; otherwise, discard the point.
Step 4: Repeat Steps 2 to 3 for all the points in the TF plane or until a sufficient number of SSPs are obtained.
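Steps 2 to 4 can be sketched in a few lines. The condition of Step 2 did not come through in the transcript, so the test below is an assumption: a point is treated as a single source point when the real and imaginary parts of X(k, t) point in the same direction (or exactly opposite directions) to within a small angle, which is what happens when X(k, t) is one real mixing vector times a complex scalar. The function names and the 1 degree threshold are hypothetical.

```python
import math


def is_single_source_point(x_tf, max_angle_deg=1.0):
    """Assumed Step 2 test for one TF sample (one complex value per sensor)."""
    re = [z.real for z in x_tf]
    im = [z.imag for z in x_tf]
    norm_re = math.sqrt(sum(v * v for v in re))
    norm_im = math.sqrt(sum(v * v for v in im))
    if norm_re == 0.0 or norm_im == 0.0:
        return False  # direction undefined, so discard the point
    cos_angle = abs(sum(a * b for a, b in zip(re, im))) / (norm_re * norm_im)
    return cos_angle > math.cos(math.radians(max_angle_deg))


def select_ssps(tf_points, max_angle_deg=1.0):
    """Steps 2 to 4: keep only the TF samples that pass the test."""
    return [x for x in tf_points if is_single_source_point(x, max_angle_deg)]


# Two hypothetical real mixing vectors (columns of the mixing matrix).
a1 = [0.8, 0.6]
a2 = [0.3, 0.95]

# Only source 1 active: X is a1 times a complex scalar.
ssp = [a * (2 - 1j) for a in a1]
# Both sources active with different phases: real and imaginary parts diverge.
msp = [p * (2 - 1j) + q * (1 + 3j) for p, q in zip(a1, a2)]

kept = select_ssps([ssp, msp])
print(len(kept))
```

The kept samples can then be clustered by direction, with each cluster center serving as an estimate of one column of the mixing matrix.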
48
My Contributions III, IV and V
[Figure: BSS taxonomy diagram, repeated from the Motivation slide]
49
Underdetermined Convolutive Blind Source Separation via Time-Frequency Masking
Reference: V. G. Reju, S. N. Koh and I. Y. Soon, "Underdetermined Convolutive Blind Source Separation via Time-Frequency Masking," IEEE Transactions on Audio, Speech and Language Processing, Vol. 18, No. 1, Jan. 2010, pp. 101-116.
[Figure: Mic 1, ..., Mic P → STFT → mixture in TF domain → mask estimation → apply mask → separated signals in TF domain]
50
Mathematical Representation
Time domain
P: No. of mixtures; Q: No. of sources
Frequency domain
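A standard form of the convolutive mixing model is sketched below. H is used for the mixing system, as in the block diagrams earlier in the talk; the impulse response symbol h_pq, the filter length L, and the narrowband approximation in the frequency domain are assumptions made for this sketch.

```latex
% Time domain: each mixture is a sum of filtered sources
x_p(t) = \sum_{q=1}^{Q} \sum_{l=0}^{L-1} h_{pq}(l)\, s_q(t-l),
\qquad p = 1, \dots, P

% Frequency domain (STFT, window assumed long relative to L): convolution
% becomes an instantaneous complex-valued mixture in every frequency bin k
\mathbf{X}(k,t) \approx \mathbf{H}(k)\,\mathbf{S}(k,t)
  = \sum_{q=1}^{Q} \mathbf{H}_q(k)\, S_q(k,t)
```

Unlike the instantaneous case, the columns H_q(k) are complex and differ from bin to bin, which is why single source point detection needs a test that tolerates complex scaling.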
51
Single source points
Instantaneous mixing
Single source point 1
Single source point 2
[Equation labels: Real; Real scalar]
Convolutive mixing
Single source point 1
Single source point 2
[Equation labels: Complex; Complex scalar]
52
Basic Principle of Single Source Points Detection
Convolutive mixing
Single source point 1
Single source point 2
[Equation labels: Complex; Complex scalar]
The Hermitian angle between the complex vectors u1 and u2 remains the same even if the vectors are multiplied by any complex scalars, whereas the pseudo angle changes.
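The invariance claimed above is easy to verify numerically. The sketch assumes the usual definitions: the normalized Hermitian inner product of u1 and u2 is written as ρ·exp(jφ), the Hermitian angle is arccos(ρ), and the pseudo angle is φ. The function names and test vectors are hypothetical.

```python
import cmath
import math


def inner(u1, u2):
    """Hermitian inner product u1^H u2."""
    return sum(a.conjugate() * b for a, b in zip(u1, u2))


def norm(u):
    return math.sqrt(sum(abs(a) ** 2 for a in u))


def hermitian_angle(u1, u2):
    """arccos of the magnitude of the normalized inner product."""
    c = abs(inner(u1, u2)) / (norm(u1) * norm(u2))
    return math.acos(min(1.0, c))  # clamp guards against rounding above 1


def pseudo_angle(u1, u2):
    """Phase of the inner product."""
    return cmath.phase(inner(u1, u2))


u1 = [1 + 2j, 3 - 1j]
u2 = [2 - 1j, 1 + 1j]

# Multiply u1 by an arbitrary complex scalar: magnitude 0.5, phase 0.7 rad.
scale = cmath.rect(0.5, 0.7)
u1_scaled = [scale * a for a in u1]

h_before = hermitian_angle(u1, u2)
h_after = hermitian_angle(u1_scaled, u2)
p_before = pseudo_angle(u1, u2)
p_after = pseudo_angle(u1_scaled, u2)

print(h_before, h_after)  # equal up to rounding
print(p_before, p_after)  # differ by the phase of the scalar
```

This is the property that makes the Hermitian angle usable in the convolutive case, where each single source point is a mixing vector times an unknown complex scalar.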
53
Algorithm for Single Source Points Detection
[Figure: decision based on the Hermitian angles θH1 OR θH2]
54
Mask Estimation by k-means (KM)
Clean
Estimated
55
Mask Estimation by Fuzzy c-means (FCM)
Clean
Estimated
56
Automatic Detection of Number of Sources
Cluster validation technique
For c = 2 to cmax
  Cluster the data into c clusters.
  Calculate the cluster validation index.
End
Take c corresponding to the best clustering as the number of sources.
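The loop above can be sketched as runnable code. The slide does not say which clustering algorithm or validation index is used at this step, so this sketch makes two substitutions that should be read as assumptions: plain k-means with a deterministic farthest-point initialization, and the mean silhouette value as the validation index. All function names and the toy data are hypothetical.

```python
import math


def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points, c, iters=50):
    """Plain k-means; returns one cluster label per point."""
    centers = [points[0]]
    while len(centers) < c:  # farthest-point initialization (repeatable)
        centers.append(max(points, key=lambda p: min(dist(p, m) for m in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(c), key=lambda j: dist(p, centers[j])) for p in points]
        for j in range(c):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(v) / len(members) for v in zip(*members))
    return labels


def silhouette(points, labels):
    """Mean silhouette value; points in singleton clusters contribute 0."""
    total = 0.0
    for i, p in enumerate(points):
        by_cluster = {}
        for j, q in enumerate(points):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(p, q))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            continue
        a = sum(own) / len(own)  # mean distance within own cluster
        b = min(sum(d) / len(d) for d in by_cluster.values())  # nearest other
        total += (b - a) / max(a, b)
    return total / len(points)


def estimate_num_sources(points, c_max):
    """For c = 2..c_max: cluster, score, and return the best-scoring c."""
    scores = {c: silhouette(points, kmeans(points, c)) for c in range(2, c_max + 1)}
    return max(scores, key=scores.get), scores


# Toy feature vectors forming three well-separated groups.
data = [(0.0, 0.0), (0.5, 0.2), (-0.3, 0.4),
        (10.0, 0.0), (10.4, -0.2), (9.7, 0.3),
        (0.0, 10.0), (0.2, 10.5), (-0.4, 9.8)]
best_c, scores = estimate_num_sources(data, c_max=4)
print(best_c)
```

In the separation setting, the points would be feature vectors taken at the detected single source points rather than toy coordinates, and the best-scoring c is taken as the number of sources.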
57
Elimination of Low Energy Points