Dynamic Aspects of the Cocktail Party Listening Problem

Transcript and Presenter's Notes


1
Dynamic Aspects of the Cocktail
Party Listening Problem
  • Douglas S. Brungart
  • Air Force Research Laboratory

2
Credits
  • AFOSR Sponsored Research
  • Team: Brian Simpson, Alex Kordik, Rich McKinley, Mark Ericson
  • Collaborators: Chris Darwin, Gerald Kidd
3
Introduction
  • 1) Energetic and Informational Masking: Speech in Noise vs. Speech in Speech
  • 2) Monaural speech segregation
  • 3) Binaural and Dichotic speech segregation
  • 4) Dynamic aspects of the cocktail party problem
  • 5) Audio-Visual cocktail party effects
4
Energetic Masking
  • In classic speech-on-noise masking, only one type of masking occurs: Energetic Masking
  • In Energetic Masking:
  • - The masking sound is more intense than the target in one or more critical bands
  • - Some portion of the target signal is inaudible at the periphery
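
As a rough illustration of the definition above, the sketch below flags the bands in which a masker is more intense than the target. The per-band levels are made-up values and `masked_bands` is a hypothetical helper, not something taken from the talk.

```python
def masked_bands(target_db, masker_db):
    """Return indices of bands where the masker level exceeds the target level."""
    if len(target_db) != len(masker_db):
        raise ValueError("target and masker must have the same number of bands")
    return [i for i, (t, m) in enumerate(zip(target_db, masker_db)) if m > t]


# Hypothetical per-band levels in dB (illustrative values only).
target_levels = [60.0, 58.0, 55.0, 50.0]
masker_levels = [55.0, 62.0, 55.0, 57.0]

print(masked_bands(target_levels, masker_levels))
```

A band where the two levels are equal is not flagged here; that tie-breaking choice is an assumption of the sketch.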
5
Energetic Masking: Articulation Theory
  • Energetic masking in speech was studied for years by Fletcher and others at Bell Labs
  • - Articulation Theory
  • - Articulation Index (AI)
  • Allows accurate prediction of intelligibility
  • - For any phonetically balanced vocabulary
  • - For any continuous noise source
  • - Plus numerous correction factors: High Amplitudes, Reverb, Peak-Clipping, etc.
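
The general idea of an index built from per-band signal-to-noise ratios can be sketched as follows. This is a simplified illustration, not the full Articulation Index procedure: the 12 dB offset, the 30 dB range, and the equal band weights are assumptions chosen for the example, and none of the correction factors listed above are modeled.

```python
def band_audibility(snr_db, offset_db=12.0, range_db=30.0):
    """Map a band SNR in dB to an audibility value clipped to [0, 1]."""
    return min(1.0, max(0.0, (snr_db + offset_db) / range_db))


def articulation_index(band_snrs_db, weights=None):
    """Weighted sum of band audibilities; equal weights unless given."""
    n = len(band_snrs_db)
    if weights is None:
        weights = [1.0 / n] * n
    return sum(w * band_audibility(s) for w, s in zip(weights, band_snrs_db))


# Hypothetical band SNRs in dB (illustrative values only).
print(articulation_index([18.0, 3.0, -12.0, -20.0]))
```

Bands whose SNR falls below the assumed lower limit contribute nothing, which mirrors the notion that part of the target is simply inaudible.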
6
Informational Masking
  • Energetic Masking also occurs in
    Speech-on-Speech masking
  • - Where signals overlap within a critical band
  • However, informational masking also occurs
  • Listeners hear two or more audible sounds, but
    can't segregate them into separate messages
  • Classic example: multi-tone complexes
  • - No energetic overlap in stimuli, but
    substantial masking is observed (Kidd, Neff)


7
Methods: The Coordinate Response Measure (CRM)
  • Data collected with the Coordinate Response Measure
  • - CRM originally developed by Moore & McKinley (1980)
  • - Format: Ready (Call Sign) go to (Color) (Number) now.
  • - Target is indicated by call sign Baron
  • - Maskers indicated by other call signs
  • - Complete CRM corpus is available (Bolia et al., 2001)
  • - 8 Talkers in corpus (4 M, 4 F), 2048 Phrases
  • - 8 Talkers x 4 Colors x 8 Numbers x 8 Call Signs
  • - Embedded call sign is ideal for multitalker studies
  • - Similar to many multichannel monitoring tasks
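
The corpus size follows directly from the phrase format, and a short sketch makes the arithmetic concrete. Only the call sign "Baron" and the counts come from the slide; the other call signs and the color names listed here are assumptions for illustration, and talkers are represented by plain indices.

```python
from itertools import product

TALKERS = range(8)  # 4 male and 4 female talkers, represented by index only
# Only "Baron" appears in the slides; the remaining names are assumed.
CALL_SIGNS = ["Arrow", "Baron", "Charlie", "Eagle", "Hopper", "Laker", "Ringo", "Tiger"]
COLORS = ["blue", "green", "red", "white"]  # assumed color names
NUMBERS = range(1, 9)


def crm_phrase(call_sign, color, number):
    """Build one phrase in the CRM format."""
    return f"Ready {call_sign} go to {color} {number} now."


corpus = [
    (talker, crm_phrase(call_sign, color, number))
    for talker, call_sign, color, number in product(TALKERS, CALL_SIGNS, COLORS, NUMBERS)
]

print(len(corpus))  # 8 talkers x 8 call signs x 4 colors x 8 numbers
```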

8
Your call sign is Baron.
Methods: The Coordinate Response Measure
Listeners respond by selecting the appropriate
colored digit with the computer mouse.


9
Methods: Pros and Cons of CRM
  • Advantages of CRM
  • - Rapid data collection, training, and scoring
  • - Sentences are reusable
  • - Embedded call sign designates the target (does not require a priori designation)
  • Disadvantages of CRM
  • - Limited vocabulary: partially offset by lack of context; not phonetically balanced
  • - Not conversationally realistic
  • CRM emphasizes speech-on-speech masking


11
Methods: Pros and Cons of CRM
  • Advantages of CRM
  • - Rapid data collection, training, and scoring
  • - Sentences are reusable
  • - Embedded call sign designates the target (does not require a priori designation)
  • Disadvantages of CRM
  • - Limited vocabulary: partially offset by lack of context; not phonetically balanced
  • - Not conversationally realistic
  • CRM emphasizes informational masking


12
Two-Talker Diotic Listening: Results
  • TM = Mod. Noise Masker
  • TN = Cont. Noise Masker
  • TD = Diff. Sex Masker
  • TS = Same Sex Masker
  • TT = Same Talker Masker
13
Two-Talker Diotic Listening: Error Distribution
Most errors match the color and number spoken by
the masking talker. This is indicative of
informational masking.
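
One way to tabulate this kind of error distribution is sketched below. It is a minimal illustration with invented trial data: a response counts as a "masker" error only when both color and number match the masking talker, which is a simplifying assumption, and the function names are hypothetical.

```python
from collections import Counter


def classify_response(response, target, masker):
    """Classify a (color, number) response against the target and masker phrases."""
    if response == target:
        return "correct"
    if response == masker:
        return "masker"
    return "other"


def error_distribution(trials):
    """Count outcome types over (response, target, masker) trials."""
    return Counter(classify_response(r, t, m) for r, t, m in trials)


# Invented trials: each entry is (response, target, masker).
trials = [
    (("red", 3), ("red", 3), ("blue", 5)),
    (("blue", 5), ("red", 3), ("blue", 5)),
    (("blue", 5), ("green", 1), ("blue", 5)),
    (("white", 2), ("red", 3), ("blue", 5)),
]

print(error_distribution(trials))
```

A large "masker" count relative to "other" would correspond to the pattern described on the slide: listeners heard the masking phrase clearly but attributed it to the wrong talker.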
14
Three-Talker Diotic Listening: Results

  • T = Target Talker
  • M = Mod. Noise Masker
  • D = Diff. Sex Masker
  • S = Same Sex Masker
  • T = Same Talker Masker
15
Four-Talker Diotic Listening: Results
  • T = Target Talker
  • M = Mod. Noise Masker
  • D = Diff. Sex Masker
  • S = Same Sex Masker
  • T = Same Talker Masker
16
3-4 Talker Listening Results
17
Dichotic Listening: Introduction
  • To this point, all stimuli have been diotic
  • Spatial separation is known to play a role
  • - Cherry's Cocktail Party Problem
  • Dichotic masking is pure informational masking
  • - No contralateral energetic masking occurs
  • Previous results have suggested
  • - Almost perfect segregation across ears
  • - Cherry, Broadbent, Treisman, Kidd, Neff, etc.

18
Dichotic Listening: Procedure

  • Dichotic listening procedure was similar to the other procedure, but:
  • 1) Talkers were known a priori
  • - 1 male, 1 female target talker
  • 2) 2 Talkers presented in right ear (T and M)
  • 3) Masking signal was presented in left ear
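
The ear assignment in this procedure can be sketched as a simple two-channel mix. This is an illustrative sketch only: signals are plain lists of samples, and the two level parameters (target level and left-ear masker level, both in dB relative to the right-ear masker) are an assumed parameterization, not the one reported for the experiment.

```python
def db_to_gain(db):
    """Convert a level difference in dB to a linear amplitude gain."""
    return 10.0 ** (db / 20.0)


def dichotic_trial(target, right_masker, left_masker, target_db=0.0, left_db=0.0):
    """Return (left, right) channels: target plus masker on the right, masker alone on the left."""
    n = min(len(target), len(right_masker), len(left_masker))
    g_target = db_to_gain(target_db)
    g_left = db_to_gain(left_db)
    right = [g_target * target[i] + right_masker[i] for i in range(n)]
    left = [g_left * left_masker[i] for i in range(n)]
    return left, right


# Toy sample values standing in for real waveforms.
left, right = dichotic_trial([1.0, 1.0], [0.5, 0.5], [0.25, 0.25])
print(left, right)
```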
19
Dichotic Listening: Results
  • With 2 talkers in right ear:
  • - Noise in left ear doesn't interfere (even when loud)
  • - Speech interferes substantially (even when quiet)
  • - Reversed speech interferes, but only when the target in the right ear is lower than the masker in the right ear
20
Binaural Listening: Spatial Separation in Azimuth
  • From the classic cocktail party effect
  • Spatial separation improves segregation

Diotic vs. 45° Separation, same-sex talkers
21
Binaural Listening: Spatial Separation in Distance

22
Binaural Listening: Spatial Separation in Distance

With natural better-ear SNR cues, both speech and
noise benefit from separation in distance.
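
For readers unfamiliar with the term, the better-ear SNR is simply the more favorable of the two per-ear signal-to-noise ratios. The sketch below shows that computation with invented levels; the function name and the numbers are assumptions for illustration.

```python
def better_ear_snr(target_left_db, target_right_db, masker_left_db, masker_right_db):
    """Return the larger of the left-ear and right-ear SNRs, in dB."""
    snr_left = target_left_db - masker_left_db
    snr_right = target_right_db - masker_right_db
    return max(snr_left, snr_right)


# Invented levels: a nearby target that is louder at the right ear.
print(better_ear_snr(62.0, 70.0, 60.0, 60.0))
```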
23
Binaural Listening: Spatial Separation in Distance

With normalization, speech is better but noise is
not.
24
Dynamic Aspects of Multitalker Listening
  • Most cocktail-party listening experiments assume either:
  • 1) Target talker is known (Selective Attention)
  • 2) Target talker is unknown (Divided Attention)
  • Real-world listening falls in between these extremes
  • - Attention focused primarily on one talker
  • - Other talkers monitored for important info
  • How do listeners adapt to conversational dynamics?
25
Dynamic Cocktail Party Effects: Multitalker
Transition Probability
  • Experiment: 3-Talker Condition
  • 1) Standard CRM task
  • 2) 2, 3, or 4 spatially separated same-sex talkers
  • - Close or far separation for 2 and 3 talkers
  • 3) 5 transition probabilities (0-1)
  • 4) 3 talker configurations
  • - Talkers selected randomly
  • - Each location assigned a talker
  • - Target talker follows target location
  • 5) Total of 106,200 trials
  • - Balanced by target talker and target location
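
A trial sequence governed by a transition probability might be generated along the following lines. The slide does not spell out the procedure, so this sketch assumes that on each trial the target moves to a different location with probability `p_transition` and otherwise stays put; the function and parameter names are hypothetical.

```python
import random


def simulate_target_locations(n_trials, n_locations, p_transition, seed=0):
    """Return a list of target locations, one per trial."""
    if n_trials <= 0:
        return []
    rng = random.Random(seed)
    location = rng.randrange(n_locations)
    sequence = [location]
    for _ in range(n_trials - 1):
        if rng.random() < p_transition:
            # Move to a different location, chosen uniformly from the others.
            others = [loc for loc in range(n_locations) if loc != location]
            location = rng.choice(others)
        sequence.append(location)
    return sequence


print(simulate_target_locations(n_trials=20, n_locations=3, p_transition=0.25))
```

With a probability of 0 the target never moves, which corresponds to the selective-attention extreme; with a probability of 1 it moves on every trial.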
26
Dynamic Cocktail Party Effects: Multitalker
Transition Probability
Overall Performance Improves Gradually After
Transitions
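
A recovery curve of this kind could be computed by grouping trials according to how many trials have passed since the target last changed location. The sketch below is one possible analysis under that assumption, with invented data; trials before the first observed transition are left out because their lag is undefined.

```python
from collections import defaultdict


def accuracy_by_trials_since_transition(locations, correct):
    """Return {lag: proportion correct}, where lag 0 is the transition trial itself."""
    totals = defaultdict(lambda: [0, 0])  # lag -> [number correct, number of trials]
    lag = None  # undefined until the first transition is seen
    for i in range(1, len(locations)):
        if locations[i] != locations[i - 1]:
            lag = 0
        elif lag is not None:
            lag += 1
        if lag is not None:
            totals[lag][0] += int(correct[i])
            totals[lag][1] += 1
    return {k: c / n for k, (c, n) in sorted(totals.items())}


# Invented data: target location per trial and whether the response was correct.
locations = [0, 0, 1, 1, 1, 2, 2]
correct = [1, 1, 0, 1, 1, 0, 1]
print(accuracy_by_trials_since_transition(locations, correct))
```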
27
Conclusions
  • 1) Speech-on-Speech ≠ Speech-in-Noise
  • - Deployment of Auditory Attention is Important
  • - Signal similarity is a major factor
  • - Spatial separation is particularly beneficial
  • 2) Multitalker Listening is a Dynamic Process
  • - Listeners adapt to source location changes
    over 5-8 trials
  • - Listeners learn new situations quickly (10
    trials)
  • - Listeners adopt optimal listening strategies

