The Whisper Effect

Provided by: Fran445
1
The Whisper Effect
11752 - Spring2004 - FrankLin
2
Shhhhh
Why do people whisper?
  • People whisper when they're telling others a
    scandalous secret.
  • People whisper when they're asked to speak
    softly so as not to disturb others (remember
    your elementary school librarian?).
  • People whisper when they're too weak to speak
    normally.
  • People whisper when... (can you think of other
    reasons?)
It seems that whispering is the most effective
and efficient form of vocal communication when
only people within very short range of the
speaker should hear the speech.
3
Shhhhh
  • So what exactly is whispering, or whisper
    speech?
  • Is it just a softer, less intense version of
    regular speech?
  • Why is it harder to understand whisper speech,
    even when it is spoken right next to your ear?
  • Would it be easier or more difficult to build a
    speech recognizer for whisper speech?
  • Can different voices be recognized in whisper
    speech?
  • How is a word stressed, or emphasized, in
    whisper speech?
  • In an attempt to answer these questions, we have

4
The Experiment
  • Four subjects are asked to:
  • Speak 10 medium-length sentences (5 to 12 words)
    as naturally and as clearly as possible.
  • Repeat the 10 sentences, but this time in
    whisper speech.
  • The first 9 sentences cover each of the phonemes
    in American English at least once. The 10th
    sentence is repeated three times (both in regular
    speech and whisper speech), and each time a
    different word is stressed.
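
One way to check a coverage claim like this is to transcribe each sentence and compare the result against a phoneme inventory. The sketch below is only an illustration under stated assumptions: it assumes the 39 ARPAbet-style labels used by the CMU Pronouncing Dictionary (the slides use labels such as HH and NG but never name an inventory), and the hand-written `LEXICON` and the helper `missing_phonemes` are hypothetical, covering just one sentence from the talk.

```python
# Assumed inventory: the 39 ARPAbet phoneme labels used by the CMU
# Pronouncing Dictionary. The talk does not say which inventory it used.
INVENTORY = set(
    "AA AE AH AO AW AY B CH D DH EH ER EY F G HH IH IY JH K L M N NG "
    "OW OY P R S SH T TH UH UW V W Y Z ZH".split()
)

# Hypothetical hand-made lexicon, just enough for one sentence.
LEXICON = {
    "the": ["DH", "AH"],
    "fish": ["F", "IH", "SH"],
    "thief": ["TH", "IY", "F"],
    "stole": ["S", "T", "OW", "L"],
    "my": ["M", "AY"],
    "house": ["HH", "AW", "S"],
}

def missing_phonemes(sentences, lexicon=LEXICON, inventory=INVENTORY):
    """Return the inventory phonemes that none of the sentences contain."""
    covered = set()
    for sentence in sentences:
        for word in sentence.lower().split():
            covered.update(lexicon[word.strip(".,!?")])
    return inventory - covered

missing = missing_phonemes(["The fish thief stole my house."])
print(len(missing), "phonemes still uncovered:", sorted(missing))
```

Running the same function over all nine sentences, with a full pronunciation dictionary in place of the toy lexicon, should return an empty set if the sentence list really covers the inventory.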

5
The Subjects
  • This experiment was made possible by four dear
    volunteers. They are:
  • A young female native English speaker from the
    Midwest
  • A young male native English speaker from eastern
    Canada
  • A young male native English speaker from the
    Southeast
  • A young male native English speaker from Texas
  • These four subjects should give us a good idea of
    the differences between regular and whisper
    speech in North American English.
  • Without further delay, let us look at the results.

6
General Appearance of the Spectrograms
Spectrogram of the phrase "stole my house" in
regular speech
7
General Appearance of the Spectrograms
Spectrogram of the phrase "stole my house" in
whisper speech
8
General Appearance of the Spectrograms
At first glance, the spectrograms of whisper
speech look like a string of fricative
noises. Whisper speech is definitely much less
intense than regular speech, which would explain
why it takes less energy to whisper than to speak.
Now let's take a closer look at what happens to
each type of phoneme when we whisper, starting
with vowels.
9
Vowels: What's all that hissing noise?
"I lied a lot on Saturday" in whisper speech
10
Vowels: What's all that hissing noise?
"Chang is not a China man" in whisper speech
11
Vowels: What's all that hissing noise?
Regular: "stole my house" again, but this time
notice the HH
12
Vowels: What's all that hissing noise?
Whisper: "stole my house" again. Can you tell
where the HH starts and stops?
13
Vowels: What's all that hissing noise?
A closer look at the vowels shows us something
interesting: they all look like HHs! We know
that HH is a very transparent phoneme; it
does not warp the vowels around it. In fact,
vowels seem to pass through HH, because we can
still make out the formants. So it seems that all
the whisper vowels are just HHs with different
vowels passing through. Can you guess what the
word "is" would sound like in whisper speech? Did
you notice something peculiar about the formants?
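
The "HH with a vowel passing through" picture can be illustrated with a simple source-filter sketch: the same formant filter is excited once by a periodic pulse train (standing in for voicing) and once by noise (standing in for whisper). Everything here is an assumption made for illustration, not something taken from the experiment: the 8 kHz sample rate, the single 1000 Hz formant with 100 Hz bandwidth, and the helper names `resonator` and `band_power`.

```python
import cmath
import math
import random

FS = 8000  # assumed sample rate in Hz

def resonator(x, freq_hz, bandwidth_hz, fs=FS):
    """Two-pole resonator standing in for a single vowel formant."""
    r = math.exp(-math.pi * bandwidth_hz / fs)
    theta = 2 * math.pi * freq_hz / fs
    a1, a2 = 2 * r * math.cos(theta), -r * r
    y1 = y2 = 0.0
    out = []
    for sample in x:
        y = sample + a1 * y1 + a2 * y2
        out.append(y)
        y1, y2 = y, y1
    return out

def band_power(x, lo_hz, hi_hz, step_hz=50, fs=FS):
    """Sum DFT power at lo_hz, lo_hz + step_hz, ..., hi_hz."""
    total = 0.0
    for f in range(lo_hz, hi_hz + 1, step_hz):
        w = -2j * math.pi * f / fs
        total += abs(sum(s * cmath.exp(w * n) for n, s in enumerate(x))) ** 2
    return total

N = 4000
PERIOD = 80  # samples between pulses: a 100 Hz pitch at 8 kHz
voiced_source = [1.0 if n % PERIOD == 0 else 0.0 for n in range(N)]
rng = random.Random(0)
whisper_source = [rng.gauss(0.0, 1.0) for _ in range(N)]

ratios = {}
for name, source in (("voiced", voiced_source), ("whispered", whisper_source)):
    speech = resonator(source, freq_hz=1000, bandwidth_hz=100)
    ratios[name] = band_power(speech, 800, 1200) / band_power(speech, 2800, 3200)
    print(f"{name}: formant band is {ratios[name]:.0f}x stronger than 2.8-3.2 kHz")
```

In both cases the energy piles up around the formant frequency, which is why the formants remain visible in a whisper spectrogram even though the excitation is only hiss.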
14
Vowels: What's all that hissing noise?
"The boy will eat oat, pit, or soot,
but only in small doses."
15
Vowels: What's all that hissing noise?
A second look shows us that a low f1 on vowels
seems to disappear entirely, which is also an
attribute of HHs. Fortunately, we can infer a low
f1 on a whisper spectrogram from its absence, and
f2 and f3 are good enough indicators of labial,
velar, and dental phonemes. But how about
voicing? Isn't f1 going down usually an indicator
of voicing? Let's look at the voicing for
16
Fricatives and Stops: Why we don't say "bzzzd"
"The fish thief stole my house"
17
Fricatives and Stops: Why we don't say "bzzzd"
"Can I pay tickets with tacos and pork?"
18
Fricatives and Stops: Why we don't say "bzzzd"
The whisper fricatives and stops seem to be
relatively easy to spot in the spectrogram, just
as in regular speech. Now let's take a look at
the voiced fricatives and stops.
19
Fricatives and Stops: Why we don't say "bzzzd"
"The very vexed zebra" in regular speech
20
Fricatives and Stops: Why we don't say "bzzzd"
"The very vexed zebra" in whisper speech
21
Fricatives and Stops: Why we don't say "bzzzd"
"Beat the good dog, boy!" in whisper speech
22
Fricatives and Stops: Why we don't say "bzzzd"
What happened? The voiced fricatives and stops
look just like their unvoiced counterparts! It
seems that they've lost their voicing! So how do
we hear things like "dog" and "zebra"? It is
because we rely on high-level knowledge. If we
play a whispered voiced consonant by itself, we
can hear that what is actually pronounced is its
unvoiced version!
23
Fricatives and Stops: Why we don't say "bzzzd"
Fricatives and stops are relatively easy to spot
in a whisper spectrogram, but they can be
confusing, which is exactly the opposite of
24
Nasals: Barely there
"Chang is not a China man"
25
Nasals: Barely there
It seems nasals follow suit with the other
phonemes: no voice bars and no low f1 formants.
Additionally, nasals seem so faint that they
almost look like pauses. However, we can see from
the spectrogram that it isn't difficult to
identify which nasal it is: we can see the
formants going up for N, going down for M, and
a velar pinch for NG. What about liquids and
glides? They actually behave pretty well in
whisper speech; identifying them is usually
easier.
26
Liquids and Glides
"Look, you wet your red leather boots!"
27
Try this at home!
  • Now that we have gone through the different types
    of phonemes, we can compile our results:
  • Vowels resemble HHs
  • Voiced fricatives and stops lose their voicing
  • Nasals become faint but can be differentiated
  • Liquids and glides do not change much
  • A great deal of high-level knowledge is required
    to recognize whisper speech
  • We can do a little test to demonstrate this.

28
F0 and Pitch
What sort of f0 and pitch does whisper speech
have? (Can you guess?) First, we can try using
the Emu Labeler to do the pitch analysis for us.
29
F0 and Pitch
Pitch analysis for "Somebody set up us the bomb!"
(stress on "us")
30
F0 and Pitch
It seems that the Emu Labeler has failed us (not
too surprisingly). But that's all right; we can
still do it ourselves. Let's turn the broadband
spectrograms into narrowband spectrograms.
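
Why an automatic pitch tracker might come up empty on whisper speech can be sketched with a generic autocorrelation detector. This is not the Emu Labeler's algorithm, which the slides do not describe; it is a minimal illustration in which the 8 kHz sample rate, the 80 to 400 Hz search range, the 0.5 threshold, and both synthetic signals are assumptions.

```python
import math
import random

FS = 8000  # assumed sample rate in Hz

def estimate_f0(frame, fs=FS, fmin=80.0, fmax=400.0, threshold=0.5):
    """Return F0 in Hz from the strongest autocorrelation peak, or None."""
    best_lag, best_r = None, 0.0
    for lag in range(int(fs / fmax), int(fs / fmin) + 1):
        a, b = frame[:-lag], frame[lag:]
        num = sum(x * y for x, y in zip(a, b))
        den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        r = num / den if den > 0 else 0.0
        if r > best_r:
            best_lag, best_r = lag, r
    # Report a pitch only when the frame is clearly periodic.
    return fs / best_lag if best_r >= threshold else None

# Stand-in for voiced speech: five harmonics of a 125 Hz fundamental.
voiced = [sum(math.sin(2 * math.pi * 125 * k * n / FS) / k for k in range(1, 6))
          for n in range(800)]
# Stand-in for whisper speech: noise with no periodic component.
rng = random.Random(0)
whispered = [rng.gauss(0.0, 1.0) for _ in range(800)]

f0_voiced = estimate_f0(voiced)
f0_whispered = estimate_f0(whispered)
print("voiced:", f0_voiced, "Hz; whispered:", f0_whispered)
```

The periodic signal produces a clear peak at its pitch period, while the noise never reaches the threshold, so the detector has nothing to report.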
31
F0 and Pitch
"Somebody set up us the bomb!" (stress on "us")
Bandwidth: 70
32
F0 and Pitch
"Somebody set up us the bomb!" (stress on "us")
Bandwidth: 40
33
F0 and Pitch
"Somebody set up us the bomb!" (stress on "us")
Bandwidth: 20
34
F0 and Pitch
As we make the bandwidth smaller and smaller, we
realize that we still cannot make out the f0. But
since pitch is so important in stressing and
emphasizing parts of speech, how are stress and
emphasis conveyed in whisper speech?
35
F0 and Pitch
"Somebody set up us the bomb!" (stress on
"somebody")
36
F0 and Pitch
"Somebody set up us the bomb!" (stress on "us")
37
F0 and Pitch
"Somebody set up us the bomb!" (stress on "bomb")
38
F0 and Pitch
As you may have expected, because they cannot
change the pitch, speakers use the other two
methods, more energy and longer duration, to
emphasize something they want to stress in
whisper speech. Try singing in a whisper: can
you do it?
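
The two remaining stress cues can be measured directly once word boundaries are known. The sketch below compares the same word in a neutral and a stressed rendition. All of the data is synthetic and invented for illustration (noise bursts standing in for whispered words, with made-up lengths and amplitudes), and the helpers `rms_db`, `make_rendition`, and `measure` are hypothetical names, not part of any tool used in the talk.

```python
import math
import random

FS = 8000  # assumed sample rate in Hz

def rms_db(samples):
    """Root-mean-square level in dB relative to an amplitude of 1.0."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def make_rendition(rng, word_specs):
    """Build a fake whispered utterance from (label, n_samples, amplitude)."""
    samples, segments = [], []
    for label, n_samples, amplitude in word_specs:
        start = len(samples)
        samples.extend(amplitude * rng.gauss(0.0, 1.0) for _ in range(n_samples))
        segments.append((label, start, start + n_samples))
    return samples, segments

def measure(samples, segments, fs=FS):
    """Map each word label to (duration in seconds, level in dB)."""
    return {label: ((end - start) / fs, rms_db(samples[start:end]))
            for label, start, end in segments}

rng = random.Random(0)
neutral = measure(*make_rendition(
    rng, [("set", 1600, 0.05), ("up", 1200, 0.05), ("us", 1200, 0.05)]))
stressed = measure(*make_rendition(
    rng, [("set", 1600, 0.05), ("up", 1200, 0.05), ("us", 2400, 0.10)]))

extra_duration = stressed["us"][0] - neutral["us"][0]
extra_level = stressed["us"][1] - neutral["us"][1]
print(f'"us" when stressed: +{extra_duration:.2f} s, +{extra_level:.1f} dB')
```

Comparing the same word across renditions, rather than different words within one sentence, avoids confusing stress with the fact that some words are simply longer than others.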
39
One Last Thought: Variability in Whisper Speech
One thing we notice throughout the experiment is
that many characteristics of regular speech are
lost in whisper speech. On the other hand, some
variability factors such as age, regional accent,
and emotion may also be reduced to some extent in
whisper speech.
40
One Last Thought: Variability in Whisper Speech
Which speaker whispered the sentence at the
bottom?
Speaker A
Speaker B
"Chang is not a China man." in whisper
"They treasured the very vexed zebra." in whisper
41
One Last Thought: Variability in Whisper Speech
Now can you tell?
Speaker A
Speaker B
"Chang is not a China man." in regular speech
"They treasured the very vexed zebra." in regular
speech
42
One Last Thought: Variability in Whisper Speech
It seems that whisper speech forces the speech to
lose some of its variability. Can you guess
anything about the speaker from this speech?
(sex, age, nationality, region, the person?)
"The fish thief stole my house." in whisper speech
"The fish thief stole my house." in regular speech
43
Conclusion
  • Whisper speech introduces more ambiguity into
    speech, so recognizing whisper speech requires
    a great deal of high-level knowledge.
  • There are no detectable pitch dynamics in whisper
    speech.
  • Whisper speech seems to reduce some variability
    in speech.

44
Conclusion
Would we ever need automatic speech recognition
for whisper speech?
  • For use in quiet places (library)
  • For people with speech difficulty (throat cancer)
  • Can you think of others? (secret agent watch?)
Would it be more difficult than automatic speech
recognition for regular speech?
  • More ambiguity
  • Need more high-level language modeling
  • Less variability?
45
The End
11752 - Spring2004 - FrankLin