Acoustic impulse response measurement using speech and music signals


Acoustic impulse response measurement using speech and music signals. John Usher, Barcelona Media Innovation Centre, Av. Diagonal, 177, planta 9, 08018 Barcelona.

Transcript and Presenter's Notes

1
Acoustic impulse response measurement using speech and music signals
John Usher
Barcelona Media Innovation Centre, Av. Diagonal, 177, planta 9, 08018 Barcelona
2
Using adaptive filters to estimate acoustic IRs
  • In-situ acquisition of the electro-acoustic impulse
    response (IR), with the audience present.
  • Continuous.
  • Fast enough to track changing environmental conditions.
  • Uses speech and music signals radiated from the
    loudspeaker.
  • Using an adaptive filter (AF) to estimate an IR is
    nothing new! It is already used for:
      • Acoustic echo and feedback cancellation.
      • Upmixing (2 → 5.1, 2 → 3D).
      • Active noise control (ANC).
      • Room EQ (using noise).
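
As a rough illustration of the idea on this slide, the sketch below identifies an impulse response with a time-domain NLMS adaptive filter (NLMS is the update rule named in the conclusions). It is a minimal sample-by-sample version, not the overlapped-window implementation used in the talk. The function name `nlms_identify`, the step size `mu`, and the synthetic 32-tap "room" are assumptions made for the example.

```python
import numpy as np

def nlms_identify(x, d, num_taps, mu=0.5, eps=1e-8):
    """Estimate an FIR impulse response from source x and mic signal d (NLMS)."""
    w = np.zeros(num_taps)      # current impulse-response estimate
    buf = np.zeros(num_taps)    # most recent input samples, newest first
    err = np.zeros(len(x))
    for n in range(len(x)):
        buf = np.roll(buf, 1)
        buf[0] = x[n]
        y = w @ buf             # filter output: predicted mic sample
        e = d[n] - y            # error between mic signal and prediction
        w += mu * e * buf / (buf @ buf + eps)   # normalized LMS update
        err[n] = e
    return w, err

# Usage with synthetic data (assumed, not from the talk):
rng = np.random.default_rng(0)
h_true = rng.standard_normal(32) * np.exp(-np.arange(32) / 8.0)
x = rng.standard_normal(8000)             # loudspeaker signal (white noise here)
d = np.convolve(x, h_true)[: len(x)]      # simulated microphone signal
h_est, err = nlms_identify(x, d, num_taps=32)
```

With speech or music in place of white noise the same loop applies, but convergence is generally slower because the excitation is coloured.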

5
Localizing objects in a room
  • Emit speech warning from loudspeaker in room.
  • Extract RIR using adaptive filter.
  • Detect reflection onset timing, e.g. using
    running kurtosis.
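
The running-kurtosis step can be sketched as follows. This is only one plausible reading of the slide: a sliding-window excess kurtosis whose value jumps when an isolated reflection enters the window. The window length, the threshold of 10, and the synthetic RIR are assumptions for the example.

```python
import numpy as np

def running_kurtosis(h, win):
    """Excess kurtosis of h over a sliding window of `win` samples."""
    k = np.zeros(len(h) - win + 1)
    for i in range(len(k)):
        seg = h[i : i + win]
        c = seg - seg.mean()
        var = np.mean(c ** 2)
        k[i] = np.mean(c ** 4) / (var ** 2 + 1e-20) - 3.0
    return k

# Usage: low-level noise with one strong reflection at sample 300 (synthetic).
rng = np.random.default_rng(0)
rir = 0.01 * rng.standard_normal(600)
rir[300] += 1.0
win = 64
k = running_kurtosis(rir, win)
# The first window whose kurtosis exceeds the threshold ends at the onset sample.
onset = int(np.argmax(k > 10.0)) + win - 1
```

Converting the onset index to a path length would additionally need the sample rate and the speed of sound, which the slide does not specify.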

8
Empirical experiment with small-room configuration
  • Set-up:
      • Single microphone.
      • Single loudspeaker.
      • Small room (RT 0.5 s).
      • Noise, speech or music radiated.
  • Reference measurement using exponential
    swept-sine deconvolution.
  • Further test using live (spoken) voice, with
    close and far lavalier microphones.
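
For readers unfamiliar with the reference method, here is a hedged sketch of exponential swept-sine measurement with regularized frequency-domain deconvolution. The sweep range, duration, sample rate and regularization constant `reg` are assumptions chosen to keep the example small; the talk does not state the values actually used.

```python
import numpy as np

def exp_sweep(f1, f2, duration, fs):
    """Exponential (logarithmic) sine sweep from f1 to f2 Hz."""
    t = np.arange(int(duration * fs)) / fs
    rate = np.log(f2 / f1)
    return np.sin(2 * np.pi * f1 * duration / rate * (np.exp(t * rate / duration) - 1.0))

def deconvolve(recorded, sweep, ir_len, reg=1e-6):
    """Recover an impulse response by regularized spectral division."""
    n = len(recorded) + len(sweep)          # long enough to avoid circular wrap
    R = np.fft.rfft(recorded, n)
    S = np.fft.rfft(sweep, n)
    power = np.abs(S) ** 2
    H = R * np.conj(S) / (power + reg * power.max())
    return np.fft.irfft(H, n)[:ir_len]

# Usage: simulate a room with a direct path and one reflection (synthetic).
fs = 8000
sweep = exp_sweep(50.0, 3500.0, 2.0, fs)
h_true = np.zeros(100)
h_true[0], h_true[40] = 1.0, 0.5
recorded = np.convolve(sweep, h_true)
h_meas = deconvolve(recorded, sweep, ir_len=100)
```

The recovered response is band-limited to the sweep range, so each arrival appears as a narrow pulse rather than a single sample.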

10
Results: Error Criterion
  • Start with the reference RIR (measured using the
    swept-sine technique).
  • Allow the adaptive filter to converge for 10 seconds
    to get the AF spectrum.
  • Calculate the misalignment: mean of the difference
    between the reference and AF spectra (80 Hz to 12 kHz).
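
The error criterion above might be computed along these lines. The slide does not say whether the difference is signed or absolute, or whether any smoothing is applied, so this sketch assumes the mean absolute difference of the dB magnitude spectra with no smoothing. The function name and the test signal are illustrative.

```python
import numpy as np

def spectral_misalignment_db(h_ref, h_af, fs, f_lo=80.0, f_hi=12000.0):
    """Mean absolute dB difference between two magnitude spectra in a band."""
    n_fft = max(len(h_ref), len(h_af))
    freqs = np.fft.rfftfreq(n_fft, 1.0 / fs)
    band = (freqs >= f_lo) & (freqs <= f_hi)
    ref_db = 20 * np.log10(np.abs(np.fft.rfft(h_ref, n_fft))[band] + 1e-12)
    af_db = 20 * np.log10(np.abs(np.fft.rfft(h_af, n_fft))[band] + 1e-12)
    return float(np.mean(np.abs(ref_db - af_db)))

# Usage: a synthetic decaying response compared with a copy at twice the gain.
rng = np.random.default_rng(0)
h = rng.standard_normal(2048) * np.exp(-np.arange(2048) / 300.0)
same = spectral_misalignment_db(h, h, fs=48000)
doubled = spectral_misalignment_db(h, 2.0 * h, fs=48000)   # about 6 dB
```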

11
Rate of Convergence
12
RIR using noise, music, voice (no obvious
difference in the time domain!)
Reference RIR from sine-sweep
13
RIR from live voice and two lavalier mics
Reference RIR from sine-sweep
14
Comparison of filter spectra using noise, speech
and music (High SNR)
15
Robustness to SNR (25, 12, 3 dB SNR). Masker:
noise.
16
Robustness to SNR. Masker: babble.
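
A masker condition like those on the two robustness slides can be set up by scaling the masker to a target SNR before adding it to the microphone signal. The sketch below assumes SNR is defined on mean power over the whole signal; the talk does not state its definition, and `add_masker` is a hypothetical helper.

```python
import numpy as np

def add_masker(signal, masker, snr_db):
    """Scale masker so that signal-to-masker power ratio equals snr_db, then mix."""
    masker = masker[: len(signal)]
    p_sig = np.mean(signal ** 2)
    p_msk = np.mean(masker ** 2)
    gain = np.sqrt(p_sig / (p_msk * 10 ** (snr_db / 10)))
    return signal + gain * masker, gain

# Usage with synthetic stand-ins for the mic signal and the masker.
rng = np.random.default_rng(0)
mic = rng.standard_normal(1000)
noise = rng.standard_normal(1200)
mixed, gain = add_masker(mic, noise, snr_db=12.0)
achieved = 10 * np.log10(np.mean(mic ** 2) / np.mean((mixed - mic) ** 2))
```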
17
Comparison with DCFFT
  • Dual Channel FFT method
  • Following AES reviewer recommendation, compared
    with commercial DCFFT system (SMAART).
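
For comparison, a generic dual-channel FFT transfer-function estimate can be sketched as an averaged cross-spectrum divided by an averaged input auto-spectrum (the H1 estimator). This is a textbook formulation written for illustration, not SMAART's implementation; the frame size, hop and Hann window are assumptions.

```python
import numpy as np

def dual_channel_fft(x, y, n_fft=1024, hop=512):
    """H1 transfer-function estimate from reference x and measurement y."""
    win = np.hanning(n_fft)
    sxx = np.zeros(n_fft // 2 + 1)
    sxy = np.zeros(n_fft // 2 + 1, dtype=complex)
    for start in range(0, len(x) - n_fft + 1, hop):
        X = np.fft.rfft(win * x[start : start + n_fft])
        Y = np.fft.rfft(win * y[start : start + n_fft])
        sxx += np.abs(X) ** 2        # accumulated input auto-spectrum
        sxy += np.conj(X) * Y        # accumulated cross-spectrum
    return sxy / (sxx + 1e-12)

# Usage: short synthetic response driven by white noise.
rng = np.random.default_rng(2)
x = rng.standard_normal(40000)
h_true = 0.5 ** np.arange(8)
y = np.convolve(x, h_true)[: len(x)]
H_est = dual_channel_fft(x, y)
H_true = np.fft.rfft(h_true, 1024)
```

Unlike the adaptive filter, this estimate is only meaningful when the response is short relative to the analysis frame.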

18
Comparison of NLMS vs DCFFT
19
Effectiveness of AF RIR acquisition method with long RIRs.
  • 6 RIRs
  • Obtained from Dirac fed into Altiverb.
  • (NB: no background noise simulated.)
  • Football stadium, Caen Cathedral, church, EMT
    plate, Filmorch. Stage Berlin, castle.
  • RT60 9.6-1.1 s.
  • 1.2, 2.3, 3.5, 6.0, 7.8, 9.6.

20
What happens if we just model the early part of
the IR?
Not much: most of the spectral detail is in the
early part.
For longer IRs, the adaptive filter should be
longer.
Longer RT
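
One way to get a feel for why a short filter can go a long way is to ask how much of the total energy lies in the early part of an ideal exponential decay. This is only a back-of-envelope model: it concerns energy, not the spectral detail the slide refers to, and real rooms do not decay perfectly exponentially. The 340 ms length and the RT60 values are taken from elsewhere in the slides; the function is hypothetical.

```python
def early_energy_fraction(t_model, rt60):
    """Fraction of total energy in the first t_model seconds of an ideal
    exponential decay that falls by 60 dB in rt60 seconds."""
    return 1.0 - 10.0 ** (-6.0 * t_model / rt60)

# Usage: share of energy captured by a 340 ms filter for each RT60 in the talk.
for rt60 in (1.2, 2.3, 3.5, 6.0, 7.8, 9.6):
    share = 100.0 * early_energy_fraction(0.34, rt60)
    print(f"RT60 {rt60:4.1f} s: first 340 ms holds {share:.0f}% of the energy")
```

The fraction shrinks as RT60 grows, which is consistent with the remark above that longer IRs call for a longer adaptive filter.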
24
Rate of Convergence for different RTs. 340 ms
window, 32× overlap.
Longer RT
25
Conclusions
  • RIR acquisition for small and large rooms.
  • Adaptive filter updated using NLMS and overlapped
    window.
  • Tested with RT60 0.5 to 10 s.
  • Using music, speech and noise as excitation
    signals.
  • Less accurate using live voice and two mics.
  • Convergence in < 3 s (< 2 dB mean error).
  • Little change in performance with SNRs down to 0
    dB.

26
Conclusions
  • Music vs speech:
      • Music: AF matches RIR 60 Hz to 12 kHz.
      • Speech: AF matches RIR 100 Hz to 8 kHz.
  • No considerable improvement for filter sizes > 340
    ms.
      • I.e. we only need to model the first 1/8th of the
        RIR to have a good approximation of the spectrum.
  • Adaptive whitening algorithm (LPC residuals) can
    speed up convergence for highly coloured signals,
    but only at low SNRs.
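
To illustrate what whitening with LPC residuals means, the sketch below fits a linear predictor with the autocorrelation method and passes the signal through the resulting inverse filter. It is a fixed, whole-signal fit, whereas the talk refers to an adaptive whitening algorithm; the predictor order and the AR(1) test signal are assumptions.

```python
import numpy as np

def lpc_coefficients(x, order):
    """Predictor a such that x[n] is approximated by sum_k a[k] * x[n-1-k]."""
    r = np.array([x[: len(x) - k] @ x[k:] for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    # Small diagonal loading keeps the normal equations well conditioned.
    return np.linalg.solve(R + 1e-9 * r[0] * np.eye(order), r[1:])

def lpc_residual(x, a):
    """Prediction error (whitened signal) from the inverse filter 1 - A(z)."""
    inverse_filter = np.concatenate(([1.0], -a))
    return np.convolve(x, inverse_filter)[: len(x)]

def lag1_correlation(v):
    """Normalized lag-1 autocorrelation: near 0 for white, near 1 for coloured."""
    return float(v[:-1] @ v[1:] / (v @ v))

# Usage: strongly coloured synthetic signal x[n] = 0.9 x[n-1] + w[n].
rng = np.random.default_rng(1)
w = rng.standard_normal(20000)
x = np.zeros_like(w)
for n in range(1, len(w)):
    x[n] = 0.9 * x[n - 1] + w[n]
a = lpc_coefficients(x, order=4)
res = lpc_residual(x, a)
```

In an adaptive-filter setting the same whitening filter would typically be applied to both the loudspeaker and microphone signals, so that the room response being identified is unchanged.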

27
Applications
  • In-situ continuous room EQ using a filtered-x
    approach.
  • Object localization using a speech message
    (e.g. using running kurtosis).
  • Re-mixing live music: ambient sound separation
    using the filter output and error signal (e.g. get
    clean signal, room ambiance, audience applause).

28
Cheers! John Usher