Structure of Human Speech - PowerPoint PPT Presentation

Title:

Structure of Human Speech


Provided by: chrisd

Transcript and Presenter's Notes


1
Structure of Human Speech
  • Chris Darwin

2
Vocal Tract
3
Pitch and Formants
1. Harmonics (giving pitch) are produced by vocal cord vibration.
2. Formant frequencies are resonances of the vocal tract.
3. Formant frequencies change as you change the shape of your vocal tract.
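The relation between harmonics and formants can be sketched in a few lines of Python. This is only an illustration: the pitch and formant values below are hypothetical, not taken from the slides. It shows that the harmonics sit at whole-number multiples of the pitch, while the formant frequencies are set independently, so each formant simply boosts whichever harmonic happens to lie nearest.

```python
def harmonics(f0_hz, max_hz):
    """Return the harmonic frequencies k * f0 up to and including max_hz."""
    count = int(max_hz // f0_hz)
    return [k * f0_hz for k in range(1, count + 1)]


def nearest_harmonic(formant_hz, f0_hz):
    """Return the harmonic of f0 that lies closest to a formant frequency."""
    k = max(1, round(formant_hz / f0_hz))
    return k * f0_hz


# Hypothetical values, chosen only for illustration.
F0_HZ = 120                      # voice pitch
FORMANTS_HZ = [700, 1200, 2600]  # resonances of one vocal tract shape

print(harmonics(F0_HZ, 600))     # [120, 240, 360, 480, 600]
for formant in FORMANTS_HZ:
    print(formant, "->", nearest_harmonic(formant, F0_HZ))
```

Changing F0_HZ moves every harmonic but leaves FORMANTS_HZ untouched, which is the sense in which pitch and formants are separate.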
4
Source Filter
Larynx (source) -> Vocal tract (filter) -> Output sound
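A minimal sketch of the source-filter idea, under stated assumptions: the larynx is replaced by a plain pulse train, each formant by a standard two-pole digital resonator, and all numeric values (pitch, formant frequencies, bandwidths, sample rate) are hypothetical. The function names are illustrative only and do not come from the slides.

```python
import math


def impulse_train(f0_hz, duration_s, sample_rate):
    """Source: one unit pulse per pitch period (a crude stand-in for the larynx)."""
    n_samples = int(duration_s * sample_rate)
    period = int(round(sample_rate / f0_hz))
    return [1.0 if i % period == 0 else 0.0 for i in range(n_samples)]


def resonator(signal, centre_hz, bandwidth_hz, sample_rate):
    """Filter: a two-pole resonator standing in for one formant."""
    r = math.exp(-math.pi * bandwidth_hz / sample_rate)
    b1 = 2.0 * r * math.cos(2.0 * math.pi * centre_hz / sample_rate)
    b2 = -r * r
    a0 = 1.0 - b1 - b2  # normalise so the gain at 0 Hz is 1
    out = []
    y1 = y2 = 0.0
    for x in signal:
        y = a0 * x + b1 * y1 + b2 * y2
        out.append(y)
        y2, y1 = y1, y
    return out


def synthesise(f0_hz, formants, duration_s=0.5, sample_rate=16000):
    """Pass the source through one resonator per (centre_hz, bandwidth_hz) pair."""
    sound = impulse_train(f0_hz, duration_s, sample_rate)
    for centre_hz, bandwidth_hz in formants:
        sound = resonator(sound, centre_hz, bandwidth_hz, sample_rate)
    return sound


# Same source, two hypothetical filter settings (illustrative values only).
vowel_a = synthesise(100, [(300, 60), (2300, 100)])
vowel_b = synthesise(100, [(700, 60), (1100, 100)])
print(len(vowel_a), vowel_a != vowel_b)
```

Keeping the source and the filter as separate functions mirrors the diagram: the pitch is set only in impulse_train, and the vowel quality only in the list of resonators.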
5
Prosody vs Segments
Segmental: consonants / vowels -> words
Prosodic: pitch contour, stress. Emphasis, pragmatics.
I thought she was married?
I thought she was married.
I thought she was married!
I thought she was married!
NB: in tone languages pitch is used segmentally.
Are bird/mammal animal systems like human prosody? Yes! They generally use different pitch contours.
6
Vowel production
7
Vowel sounds
Vocal tract shapes for vowels
8
narrow-band spectrogram
sine-wave speech
9
Mynah bird speech
Klatt & Stefanski (1974). How does a mynah bird imitate human speech? J Acoust Soc Amer, 55, 822-832.
10
Mynah / Grey parrot
  • Mynah produces "formants" but probably through
    changing syrinx resonances, not through changing
    vocal tract shape. (Klatt & Stefanski, 1974, J
    Acoust Soc Amer)
  • Grey parrot has a longer vocal tract and may use
    changes in its shape to produce formant variation
    (more like human speech). (Warren, Patterson &
    Pepperberg, 1996, Auk)

11
Characteristics of speech
  • No gaps between words
  • Smoothly changing sound from one speech sound to
    the next
  • So you can't just shuffle the acoustic words

The only silence is in the /g/ of "ago"
12
Speech is more like semaphore than like music
  • Music: discrete targets giving discrete
    acoustic events
  • Semaphore: discrete targets with transitions
    between targets
  • Speech: articulatory transitions between targets

13
Semaphore
14
Formants in a wide-band spectrogram
[Figure: wide-band spectrogram of "we go", with F1, F2, F3, the burst and the formant transitions labelled]
15
Where are the segments?
16
Different transition - same consonant
[Figure: spectrograms of "dee" and "da", with the formant transitions labelled and 1400 Hz marked]
17
Speech is more like speech than like semaphore
Speech does not have invariant acoustic
targets: consonants change with the
vowel. Compare /s/ in /si/ with /s/ in
/su/. This is due to co-articulation.
18
Same noise - different consonant
[Figure: spectrograms of "pea" and "ka", with F1, F2 and the burst labelled and 1400 Hz marked]
19
Co-articulation
Arises (mainly) because consonant gestures don't
involve all the articulators, e.g. /b/ is lips
only, leaving the tongue free to take up position for the next
vowel. /d/ and /s/ just involve the tongue tip
touching the alveolar ridge, leaving the tongue body and lips
free to take up position for the next vowel, viz.
/si/ and /su/.
20
Two articulatory systems
Öhman suggested that articulation can be
decomposed into two semi-independent systems:
  • Slow movement from one vowel target to the
    next, e.g. /i/ -> /u/
  • Rapid consonantal movement superimposed,
    e.g. /b/, /d/
So the /b/ in /ibu/ is not the same as in /ibi/.
21
Co-articulation
Advantages:
1. Information about different
segments is spread across time (Hockett's
squashed eggs). You know that a /u/ is coming
because of the type of /s/ you have heard.
22
Co-articulation
2. Liberman thought that this spreading across
time makes it easier to transmit information at a
fast rate.
23
Co-articulation - 2
The disadvantage of co-articulation for
perception is that there are no constant acoustic
targets in speech. The same phoneme can be
represented as different sounds in different
contexts (/s/ before /u/ or /i/). Conversely, the
same sound can be heard as different consonants
in different contexts (e.g. as /p/ before /i/ and
/u/ but as /k/ before /a/).
24
Speech Code
Factors that make it hard (for machines) to
recognise speech
  • Co-articulation
  • Rapid speech /djewonega?at/
  • Different vocal-tract sizes
  • Different dialects
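Why vocal-tract size matters can be illustrated with a textbook idealisation that is not in the slides: treat the vocal tract as a uniform tube closed at the larynx and open at the lips, whose resonances are F_n = (2n - 1) * c / (4 * L). The tube lengths below are hypothetical round numbers, and real vocal tracts are not uniform tubes, so this is a sketch of the trend only.

```python
SPEED_OF_SOUND_CM_S = 35000  # approximate value for warm, moist air


def tube_formants(length_cm, count=3):
    """Resonances of a uniform tube closed at one end: (2n - 1) * c / (4 * L)."""
    return [
        (2 * n - 1) * SPEED_OF_SOUND_CM_S / (4 * length_cm)
        for n in range(1, count + 1)
    ]


# Hypothetical tract lengths, chosen only for illustration.
print(tube_formants(17.5))  # [500.0, 1500.0, 2500.0]
print(tube_formants(14.0))  # [625.0, 1875.0, 3125.0]
```

Under this assumption the shorter tube produces the same formant pattern shifted upwards, so a recogniser cannot rely on absolute formant frequencies alone.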