Subband cocktail-party speech separation: CASA vs. BSS

Provided by: berthommie
1
Subband cocktail-party speech separation: CASA vs. BSS
Seungjin Choi, Department of Computer Science and Engineering,
POSTECH, Korea (seungjin_at_postech.ac.kr)
Co-work with Frederic Berthommier, ICP, INPG, France
2
Numbers95 Stereo Database
ST-Numbers95 Database, ICP/INP Grenoble. Authors: E. Tessier and F. Berthommier
[Figure labels: Left source, Right source, Reference, Mixture]
A large database of binary mixtures of sentences
(n = 613) was recorded by Tessier and Berthommier
(1999). The Numbers95 signals are played through
loudspeakers and recorded. The temporal overlap
between words is about 75% and the relative level
is 0 dB. The setup is static. Only 332 mixture
sentences, truncated at 1 s, are used in the
present study.
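As a small illustration of what a relative level of 0 dB means, the sketch below measures the RMS level ratio between two sources and rescales one of them to hit a target ratio before summing. This is not how the database was produced (the slide says the signals were played through loudspeakers and recorded); the function names and the digital summation are assumptions made only for illustration, and both inputs are assumed to be non-silent.

```python
import math

def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))

def relative_level_db(left, right):
    """Level of `left` relative to `right`, in dB (0 dB means equal RMS)."""
    return 20.0 * math.log10(rms(left) / rms(right))

def mix_at_relative_level(left, right, level_db=0.0):
    """Hypothetical helper: rescale `right` so the left/right level equals
    `level_db`, then sum the two sources sample by sample."""
    n = min(len(left), len(right))
    left, right = left[:n], right[:n]
    target_ratio = 10.0 ** (level_db / 20.0)      # desired rms(left) / rms(right)
    scale = rms(left) / (rms(right) * target_ratio)
    scaled_right = [scale * r for r in right]
    mixture = [l + r for l, r in zip(left, scaled_right)]
    return mixture, scaled_right
```

With `level_db=0.0`, the two sources contribute equal energy to the mixture, which is the condition stated on the slide.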
3
Filterbank decomposition
4
The CASA Model
5
Reconstruction Accuracy
6
Gain of CASA
7
Gain of CASA: Relative Level
[Figure labels: RAX, RAY]
8
Subband effect for CASA
Effect of the number of subbands (nbsb) for the
CASA model on the RA (in dB). From left to right:
averaged left source RA, averaged right source
RA, averaged left+right RA over all frames. The
number of subbands varies from 1 to 5 and the two
curves correspond to durations of 256 and 512 bins.
The RA of the mixture, which is subtracted for
gain evaluation, is labelled ().
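The caption states that the RA of the mixture is subtracted to obtain the gain. The slides do not give the formula for RA, so the sketch below assumes an SNR-like definition (energy of the reference over energy of the reconstruction error, in dB); only the subtraction step comes from the caption, and the function names are hypothetical.

```python
import math

def reconstruction_accuracy_db(reference, estimate):
    """ASSUMED definition of RA: reference energy over error energy, in dB.
    The reference is assumed to be non-silent."""
    signal_energy = sum(r * r for r in reference)
    error_energy = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf                      # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)

def gain_db(reference, estimate, mixture):
    """Gain = RA of the separated output minus RA of the unprocessed mixture."""
    return (reconstruction_accuracy_db(reference, estimate)
            - reconstruction_accuracy_db(reference, mixture))
```

Under this assumed definition, an output that attenuates the interfering source by a factor of 10 in amplitude yields a gain of 20 dB, whatever the RA of the mixture was.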
9
Effect of nbsb: RA
[Figure: panels Mixt., Left, Right; labels 2 and 4]
10
Subband effect for CASA: Gain
[Figure: Left and Right panels for nbsb = 1 and nbsb = 4]
11
The BSS Model
[Diagram: inputs Xl(t) and Xr(t); outputs Yl(t) and Yr(t); blocks: gain, nonlinear function, delayed output]
12
Gain of BSS: Relative Level
[Figure labels: RAX, RAY]
13
Subband effect for BSS
[Figure: three panels (left, right, left+right) showing RA in dB against nbsb = 1 to 4. The y-axis runs from 5 to 10 dB in the left and right panels and from 10 to 20 dB in the left+right panel. Curves are labelled 2, 3, 10, and 100.]
Effect of the number of subbands (nbsb) for the
BSS model on the RA (in dB). From left to right:
av. left source RA, av. right source RA, av.
left+right RA over all frames. The number of
subbands varies from 1 to 4 and the curves
correspond to nbp = 2, 3, 10, and 100. The RA of the
mixture is labelled (). In each figure, two
points are added at nbsb = 1 for the "BSS giv"
condition (?) and for "BSS ori" data (?).
14
RA and Gain for BSS
Speech Separation Program (C) POSTECH. Authors: S. Choi and H. Hong
[Figure: two panels (Left, Right) showing RA (dB) against an axis running from 0 to 14; the y-axis runs from -5 to 20 in the Left panel and from -5 to 15 in the Right panel. Curve labels: RAX, RAY, Mixt.]
Frame: 1024 bins with half overlap
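The framing stated on this slide (1024 bins, half overlap) can be sketched as follows. The sketch assumes that "bins" means samples and that a trailing incomplete frame is simply dropped; neither detail is given on the slide, and the function name is hypothetical.

```python
def split_into_frames(signal, frame_length=1024):
    """Cut `signal` into frames of `frame_length` samples with 50% overlap.
    ASSUMPTION: a trailing partial frame is discarded."""
    hop = frame_length // 2                  # half overlap: advance by half a frame
    frames = []
    start = 0
    while start + frame_length <= len(signal):
        frames.append(signal[start:start + frame_length])
        start += hop
    return frames
```

Each frame shares its second half with the first half of the next one, so a per-frame measure such as RA can then be averaged over all frames.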
15
Subband effect for BSS: Gain
16
Demixing filters
17
Coherence spectrograms
18
Effect of nbp: Coherence spectrograms
[Figure: Left and Right coherence spectrograms for each nbp value]
NBP=3: Coh = 0.68
NBP=10: Coh = 0.65
NBP=100: Coh = 0.60
19
Coherence statistic
Effect of the number of subbands (nbsb) on the
coherence index for the BSS model. Left: average
left+right RA over all frames. Right: coherence
defined as the mean of the coherence spectrogram.
The number of subbands varies from 1 to 4 and the
curves correspond to nbp = 2, 3, 10, and 100. The
RA of the mixture is labelled (). The CohX
coherence between the two mixture channels is
labelled () in the right figure. In each
figure, two points are added at nbsb = 1 for the
"BSS giv" condition (?) and for "BSS ori" data
(?).
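The caption defines the coherence index as the mean of the coherence spectrogram, but the slides do not say how the spectrogram itself is computed. The sketch below assumes a magnitude-squared coherence between two channels, estimated per frequency bin by averaging cross- and auto-spectra over a few neighbouring half-overlapping frames. The frame length, the smoothing span, the naive DFT, and all function names are illustrative assumptions.

```python
import cmath
import math
import random

def dft(frame):
    """Naive DFT, non-negative frequency bins only (O(n^2), illustration only)."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n // 2 + 1)]

def coherence_spectrogram(x, y, frame_length=32, smooth=4):
    """ASSUMED estimator: magnitude-squared coherence per (time, frequency)
    cell, with spectra averaged over `smooth` consecutive frames."""
    hop = frame_length // 2
    starts = range(0, min(len(x), len(y)) - frame_length + 1, hop)
    X = [dft(x[s:s + frame_length]) for s in starts]
    Y = [dft(y[s:s + frame_length]) for s in starts]
    spectrogram = []
    for i in range(len(X) - smooth + 1):
        row = []
        for k in range(frame_length // 2 + 1):
            span = range(i, i + smooth)
            sxy = sum(X[j][k] * Y[j][k].conjugate() for j in span)
            sxx = sum(abs(X[j][k]) ** 2 for j in span)
            syy = sum(abs(Y[j][k]) ** 2 for j in span)
            row.append(abs(sxy) ** 2 / (sxx * syy) if sxx * syy > 0 else 0.0)
        spectrogram.append(row)
    return spectrogram

def coherence_index(spectrogram):
    """Mean of the coherence spectrogram over all time-frequency cells."""
    values = [c for row in spectrogram for c in row]
    return sum(values) / len(values)

# Usage: identical channels versus two unrelated noise channels.
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(512)]
y = [rng.gauss(0.0, 1.0) for _ in range(512)]
index_same = coherence_index(coherence_spectrogram(x, x))
index_indep = coherence_index(coherence_spectrogram(x, y))
```

With this estimator, two identical channels give an index of 1, and unrelated channels give a clearly lower value that depends on the smoothing span.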
20
Summary results
[Figure: summary comparison of CASA, BSS, and REF for Left, Right, and mean]