ECE 598: The Speech Chain - PowerPoint PPT Presentation

Provided by: markhasega

1
ECE 598: The Speech Chain
  • Lecture 8: Formant Transitions, Vocal Tract Transfer Function

2
Today
  • Perturbation Theory
      • A different way to estimate vocal tract resonant
        frequencies, useful for consonant transitions
  • Syllable-Final Consonants: Formant Transitions
  • Vocal Tract Transfer Function
      • Uniform Tube (Quarter-Wave Resonator)
      • During Vowels: All-Pole Spectrum
      • Q
      • Bandwidth
  • Nasal Vowels: Sum of two transfer functions gives
    spectral zeros

3
Topic 1: Perturbation Theory
4
Perturbation Theory (Chiba and Kajiyama, The Vowel, 1940)
A(x) is constant everywhere, except for one small perturbation.
Method:
1. Compute formants of the unperturbed vocal tract.
2. Perturb the formant frequencies to match the area perturbation.
5
Conservation of Energy Under Perturbation
6
Conservation of Energy Under Perturbation
7
Sensitivity Functions
8
Sensitivity Functions for the Quarter-Wave
Resonator (Lips Open)
[Figure: sensitivity functions plotted against position x, from 0 to L; curves labeled /AA/, /ER/, /IY/, /W/]
  • Note: the low F3 of /er/ is caused in part by a
    side branch under the tongue; perturbation alone
    is not enough to explain it.
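The sensitivity-function equations did not survive in this transcript, so the following is only a minimal sketch, not the lecture's own formulation. It assumes a uniform tube closed at the glottis (x = 0) and open at the lips (x = L), and the standard first-order result that the relative formant shift is the integral of the normalized kinetic-minus-potential energy density times the relative area change. The tube length, sound speed, and function names are illustrative assumptions.

```python
import math

C = 350.0   # assumed speed of sound in m/s (illustrative value)
L = 0.175   # assumed tube length in m (illustrative value)

def unperturbed_formant(n):
    """Quarter-wave resonance F_n = (2n - 1) c / (4 L)."""
    return (2 * n - 1) * C / (4 * L)

def sensitivity(n, x):
    """Normalized kinetic-minus-potential energy density at position x.

    Assumes x = 0 is the closed (glottis) end and x = L the open (lips)
    end, so velocity varies as sin(k x) and pressure as cos(k x).
    """
    k = (2 * n - 1) * math.pi / (2 * L)
    return (math.sin(k * x) ** 2 - math.cos(k * x) ** 2) / L

def perturbed_formant(n, x_start, width, rel_area_change, steps=200):
    """First-order estimate: F_n * (1 + integral of S_n(x) * dA/A dx)."""
    dx = width / steps
    integral = sum(sensitivity(n, x_start + (i + 0.5) * dx)
                   for i in range(steps)) * dx
    return unperturbed_formant(n) * (1.0 + rel_area_change * integral)

f1 = unperturbed_formant(1)
# 20% area reduction over the last 1 cm at the lips (velocity peak of F1)
f1_lip_squeeze = perturbed_formant(1, L - 0.01, 0.01, -0.2)
# 20% area reduction over the first 1 cm at the glottis (pressure peak of F1)
f1_glottis_squeeze = perturbed_formant(1, 0.0, 0.01, -0.2)
print(f1, f1_lip_squeeze, f1_glottis_squeeze)
```

Under these assumptions the lip constriction lowers F1 and the glottal constriction raises it, which matches the rule stated in the Summary slide.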
9
Sensitivity Functions for the Half-Wave Resonator
(Lips Rounded)
[Figure: sensitivity functions plotted against position x, from 0 to L; curves labeled /L,OW/, /UW/]
  • Note: the high F3 of /l/ is caused in part by a
    side branch above the tongue; perturbation alone
    is not enough to explain it.
10
Formant Frequencies of Vowels
From Peterson and Barney, 1952
11
Topic 2: Formant Transitions, Syllable-Final Consonant
12
Events in the Closure of a Nasal Consonant
Formant Transitions
Vowel Nasalization
Nasal Murmur
13
Formant Transitions: A Perturbation Theory Model
14
Formant Transitions: Labial Consonants
the mom
the bug
15
Formant Transitions: Alveolar Consonants
the supper
the tug
16
Formant Transitions: Post-alveolar Consonants
the shoe
the zsazsa
17
Formant Transitions: Velar Consonants
the gut
sing a song
18
Topic 3: Vocal Tract Transfer Functions
19
Transfer Function
  • Transfer function: $T(\omega) = \text{Output}(\omega)/\text{Input}(\omega)$
  • In speech, it's convenient to write
    $T(\omega) = U_L(\omega)/U_G(\omega)$
  • $U_L(\omega)$ = volume velocity at the lips
  • $U_G(\omega)$ = volume velocity at the glottis
  • $T(0) = 1$
  • Speech recorded at a microphone = pressure
  • $P_R(\omega) = R(\omega)\,T(\omega)\,U_G(\omega)$
  • $R(\omega) = j\rho f/r$ = "radiation characteristic"
  • $\rho$ = density of air
  • $r$ = distance to the microphone
  • $f$ = frequency in Hertz

20
Transfer Function of an Ideal Uniform Tube
  • Ideal terminations
  • Reflection coefficient at glottis: zero velocity,
    $\gamma = 1$
  • Reflection coefficient at lips: zero pressure,
    $\gamma = -1$
  • Obviously, this is an approximation, but it
    gives
  • $T(\omega) = \dfrac{1}{\cos(\omega L/c)} = \prod_{n=1}^{\infty} \dfrac{\omega_n^2}{(\omega_n+\omega)(\omega_n-\omega)}$
  • $\omega_n = n\pi c/L - \pi c/2L$
  • $F_n = nc/2L - c/4L$
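As a quick numerical check of the quarter-wave formula, the sketch below evaluates F_n = nc/2L - c/4L and confirms that cos(ωL/c) vanishes there, so the ideal transfer function has its poles at those frequencies. The tube length and sound speed are illustrative assumptions chosen to give round numbers; they are not taken from the slides.

```python
import math

C = 350.0   # assumed speed of sound in m/s (illustrative value)
L = 0.175   # assumed vocal tract length in m (illustrative value)

def formant(n):
    """Resonance of the ideal uniform tube: F_n = n c / 2L - c / 4L."""
    return n * C / (2 * L) - C / (4 * L)

def ideal_tube_gain(f):
    """T(w) = 1 / cos(w L / c) for the ideal (lossless) tube."""
    return 1.0 / math.cos(2 * math.pi * f * L / C)

formants = [formant(n) for n in (1, 2, 3)]
# cos(w L / c) at each formant; values near zero mean an unbounded peak
residuals = [abs(math.cos(2 * math.pi * f * L / C)) for f in formants]
dc_gain = ideal_tube_gain(0.0)

print(formants)    # roughly 500, 1500, 2500 Hz for these assumed values
print(residuals)
print(dc_gain)
```

With these values the formants fall at odd multiples of 500 Hz, and the gain at zero frequency is 1, consistent with T(0) = 1.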
21
Transfer Function of an Ideal Uniform Tube
Peaks are actually infinite in height (figure is
clipped to fit the display)
22
Transfer Function of a Non-Ideal Uniform Tube
  • Almost ideal terminations
  • At glottis: velocity almost zero, $\gamma \approx 1$
  • At lips: pressure almost zero, $\gamma \approx -1$
  • $T(\omega) = 1/(j/Q + \cos(\omega L/c))$
  • At $F_n = nc/2L - c/4L$,
  • $T(2\pi F_n) = -jQ$
  • $20\log_{10}|T(2\pi F_n)| = 20\log_{10} Q$
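The peak value of the non-ideal tube can be checked numerically. This is a minimal sketch assuming the loss term enters as +j/Q (the sign that reproduces the slide's peak value of -jQ); the values of Q, tube length, and sound speed are hypothetical.

```python
import math

C = 350.0   # assumed speed of sound in m/s (illustrative value)
L = 0.175   # assumed vocal tract length in m (illustrative value)
Q = 10.0    # hypothetical quality factor

def lossy_tube_gain(f, q=Q):
    """T(w) = 1 / (j/Q + cos(w L / c)), evaluated at w = 2 pi f."""
    return 1.0 / (1j / q + math.cos(2 * math.pi * f * L / C))

f1 = C / (4 * L)                      # first quarter-wave resonance
peak = lossy_tube_gain(f1)            # close to -jQ
peak_db = 20 * math.log10(abs(peak))  # close to 20 log10(Q)

print(peak, peak_db)
```

The peak is now finite: its height in dB is set entirely by Q.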

23
Transfer Function of a Non-Ideal Uniform Tube
24
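Transfer Function of a Vowel: Height of First Peak is $Q_1 = F_1/B_1$
  • $T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$
  • $T(2\pi F_1) \approx (2\pi F_1)^2/(j4\pi F_1 \cdot \pi B_1) = -jF_1/B_1$
  • Call $Q_n = F_n/B_n$
  • $T(2\pi F_1) \approx -jQ_1$
  • $20\log_{10}|T(2\pi F_1)| \approx 20\log_{10} Q_1$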
25
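Transfer Function of a Vowel: Bandwidth of a Peak is $B_n$
  • $T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$
  • $T(2\pi F_1 + \pi B_1) \approx (2\pi F_1)^2/((j4\pi F_1)(\pi B_1 + j\pi B_1)) = -jQ_1/(1+j)$
  • At $f = F_n \pm 0.5B_n$,
  • $|T(\omega)| \approx Q_n/\sqrt{2}$, i.e. $|T(\omega)|^2 \approx 0.5\,Q_n^2$
  • $20\log_{10}|T(\omega)| \approx 20\log_{10} Q_n - 3$ dB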
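The peak-height and bandwidth approximations can be tested against a direct evaluation of the all-pole product. The sketch below truncates the infinite product to three formants (an assumption made here for illustration) and uses the formant frequencies and bandwidths from the synthetic example slide.

```python
import math

# (F_n, B_n) in Hz, taken from the synthetic example; truncation to
# three formants is an assumption of this sketch.
FORMANTS = [(500.0, 80.0), (1500.0, 240.0), (2500.0, 600.0)]

def vowel_gain(f, formants=FORMANTS):
    """All-pole T(w) as a product of second-order resonances."""
    w = 2 * math.pi * f
    t = 1.0 + 0.0j
    for fn, bn in formants:
        wn, a = 2 * math.pi * fn, math.pi * bn
        t *= (wn ** 2 + a ** 2) / ((1j * w + 1j * wn + a) * (1j * w - 1j * wn + a))
    return t

def level_db(f):
    return 20 * math.log10(abs(vowel_gain(f)))

f1, b1 = FORMANTS[0]
q1_db = 20 * math.log10(f1 / b1)        # predicted peak height
peak_db = level_db(f1)                  # actual level at F1
drop_upper = level_db(f1 + b1 / 2) - peak_db
drop_lower = level_db(f1 - b1 / 2) - peak_db

print(q1_db, peak_db, drop_upper, drop_lower)
```

The actual peak sits somewhat above 20 log10(Q1) because the higher formants contribute a small boost at F1, and the level half a bandwidth away from the peak is close to 3 dB lower on both sides.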
26
Amplitudes of Higher Formants Include the Rolloff
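  • $T(\omega) = \prod_{n=1}^{\infty} \dfrac{(2\pi F_n)^2 + (\pi B_n)^2}{(j\omega + j2\pi F_n + \pi B_n)(j\omega - j2\pi F_n + \pi B_n)}$
  • At $f$ above $F_1$: $|T(2\pi f)| \propto F_1/f$
  • $T(2\pi F_2) \approx (-jF_2/B_2)(F_1/F_2)$
  • $20\log_{10}|T(2\pi F_2)| \approx 20\log_{10} Q_2 - 20\log_{10}(F_2/F_1)$
  • $1/f$ rolloff = 6 dB per octave (per doubling of
    frequency)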
27
Vowel Transfer Function: Synthetic Example
$L_1 = 20\log_{10}(500/80) = 16$ dB
$L_2 = 20\log_{10}(1500/240) - 20\log_{10}(F_2/F_1) = 16\ \text{dB} - 9.5\ \text{dB}$
$L_3 = 20\log_{10}(2500/600) - 20\log_{10}(F_3/F_1) - 20\log_{10}(F_3/F_2)$
$B_1 = 80$ Hz
$B_2 = 240$ Hz
$B_3 = 600$ Hz? (hard to measure because rolloff from
F1, F2 turns the F3 peak into a plateau)
F4 peak completely swamped by rolloff from lower
formants
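The arithmetic of the synthetic example can be reproduced in a few lines. This sketch applies the slide's rule as written (peak level = 20 log10 of Q_n minus one rolloff term per lower formant); the helper names are illustrative.

```python
import math

F = [500.0, 1500.0, 2500.0]   # formant frequencies in Hz, from the slide
B = [80.0, 240.0, 600.0]      # bandwidths in Hz, from the slide

def db(ratio):
    return 20 * math.log10(ratio)

def peak_level_db(n):
    """Level of formant n (1-based): Q_n in dB minus rolloff from lower formants."""
    i = n - 1
    return db(F[i] / B[i]) - sum(db(F[i] / F[k]) for k in range(i))

levels = [peak_level_db(n) for n in (1, 2, 3)]
print([round(level, 2) for level in levels])
```

Under this rule L1 is about 15.9 dB, L2 about 6.4 dB, and L3 about -6 dB, which is consistent with the remark that the F3 peak flattens into a plateau.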
28
Shorthand Notation for the Spectrum of a Vowel
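  • $T(s) = \prod_{n=1}^{\infty} \dfrac{s_n s_n^*}{(s-s_n)(s-s_n^*)}$
  • $s = j\omega$
  • $s_n = -\pi B_n + j2\pi F_n$
  • $s_n^* = -\pi B_n - j2\pi F_n$
  • $s_n s_n^* = |s_n|^2 = (2\pi F_n)^2 + (\pi B_n)^2$
  • $T(0) = 1$
  • $20\log_{10}|T(0)| = 0$ dB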
29
Another Shorthand Notation for the Spectrum of a
Vowel
  • $T(s) = \prod_{n=1}^{\infty} \dfrac{1}{(1-s/s_n)(1-s/s_n^*)}$
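The two shorthand notations describe the same function, which is easy to confirm numerically. The sketch below truncates the product to three formants (an assumption for illustration), reuses the values from the synthetic example, and picks an arbitrary test frequency of 700 Hz.

```python
import math

# (F_n, B_n) in Hz from the synthetic example; three-formant truncation
# is an assumption of this sketch.
FORMANTS = [(500.0, 80.0), (1500.0, 240.0), (2500.0, 600.0)]

def pole(fn, bn):
    """s_n = -pi B_n + j 2 pi F_n."""
    return complex(-math.pi * bn, 2 * math.pi * fn)

def gain_pole_form(s):
    """T(s) = prod s_n s_n* / ((s - s_n)(s - s_n*))."""
    t = 1.0 + 0.0j
    for fn, bn in FORMANTS:
        sn = pole(fn, bn)
        t *= (sn * sn.conjugate()) / ((s - sn) * (s - sn.conjugate()))
    return t

def gain_normalized_form(s):
    """T(s) = prod 1 / ((1 - s/s_n)(1 - s/s_n*))."""
    t = 1.0 + 0.0j
    for fn, bn in FORMANTS:
        sn = pole(fn, bn)
        t /= (1 - s / sn) * (1 - s / sn.conjugate())
    return t

s_test = 1j * 2 * math.pi * 700.0       # arbitrary test point on the jw axis
value_pole = gain_pole_form(s_test)
value_normalized = gain_normalized_form(s_test)
dc_gain = gain_pole_form(0j)

print(value_pole, value_normalized, dc_gain)
```

Both forms agree to rounding error, and both give T(0) = 1; the normalized form makes that property visible by inspection.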
30
Topic 4: Nasalized Vowels
31
Vowel Nasalization
Nasalized Vowel
Nasal Consonant
32
Nasalized Vowel
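  • $P_R(\omega) = R(\omega)\,(U_L(\omega) + U_N(\omega))$
  • $U_N(\omega)$ = volume velocity from nostrils
  • $P_R(\omega) = R(\omega)\,(T_L(\omega) + T_N(\omega))\,U_G(\omega) = R(\omega)\,T(\omega)\,U_G(\omega)$
  • $T(\omega) = T_L(\omega) + T_N(\omega)$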

33
Nasalized Vowel
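  • $T(s) = T_L(s) + T_N(s)$
  • $= \dfrac{1}{(1-s/s_{Ln})(1-s/s_{Ln}^*)} + \dfrac{1}{(1-s/s_{Nn})(1-s/s_{Nn}^*)}$
  • $\approx \dfrac{2(1-s/s_{Zn})(1-s/s_{Zn}^*)}{(1-s/s_{Ln})(1-s/s_{Ln}^*)(1-s/s_{Nn})(1-s/s_{Nn}^*)}$
  • $1/s_{Zn} = \tfrac{1}{2}(1/s_{Ln} + 1/s_{Nn})$
  • $s_{Zn}$ = nth spectral zero
  • $T(s) = 0$ if $s = s_{Zn}$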
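To see the spectral zero appear, the sketch below sums one oral and one nasal resonance and locates the zero two ways: from the slide's expression for 1/s_Zn, and by solving the quadratic numerator of the sum exactly. The nasal pole frequency and bandwidth are hypothetical values chosen for illustration; the oral pole reuses F1 and B1 from the synthetic example.

```python
import cmath
import math

def pole(f_hz, b_hz):
    """s = -pi B + j 2 pi F."""
    return complex(-math.pi * b_hz, 2 * math.pi * f_hz)

s_oral = pole(500.0, 80.0)      # oral pole (values from the synthetic example)
s_nasal = pole(1000.0, 100.0)   # nasal pole (hypothetical values)

def resonance(s, sn):
    return 1.0 / ((1 - s / sn) * (1 - s / sn.conjugate()))

def total(s):
    """T(s) = T_L(s) + T_N(s) for a single oral/nasal pole pair."""
    return resonance(s, s_oral) + resonance(s, s_nasal)

# Zero location from the slide: 1/s_Z = (1/2)(1/s_L + 1/s_N)
s_zero_slide = 1.0 / (0.5 * (1 / s_oral + 1 / s_nasal))

# Exact zero: the numerator of the sum is c2 s^2 + c1 s + c0
c0 = 2.0
c1 = -2.0 * ((1 / s_oral).real + (1 / s_nasal).real)
c2 = 1 / abs(s_oral) ** 2 + 1 / abs(s_nasal) ** 2
s_zero_exact = (-c1 + cmath.sqrt(c1 * c1 - 4 * c2 * c0)) / (2 * c2)

f_zero_slide = s_zero_slide.imag / (2 * math.pi)
f_zero_exact = s_zero_exact.imag / (2 * math.pi)
print(f_zero_slide, f_zero_exact, abs(total(s_zero_exact)))
```

For these values both estimates place the zero between the oral pole at 500 Hz and the nasal pole at 1000 Hz. They need not coincide: the slide's expression matches the exact numerator in its constant and first-order terms in s, so it is best read as an approximation.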
34
The Pole-Zero Pair
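  • $20\log_{10}|T(\omega)| = 20\log_{10}\left|\dfrac{1}{(1-s/s_{Ln})(1-s/s_{Ln}^*)}\right| + 20\log_{10}\left|\dfrac{(1-s/s_{Zn})(1-s/s_{Zn}^*)}{(1-s/s_{Nn})(1-s/s_{Nn}^*)}\right|$
    (plus a constant $20\log_{10} 2 \approx 6$ dB from the factor of 2 in the sum)
  • = original vowel log spectrum + log spectrum of a pole-zero pair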

35
Additive Terms in the Log Spectrum
36
Transfer Function of a Nasalized Vowel
37
Pole-Zero Pairs in the Spectrogram
Nasal Pole
Zero
Oral Pole
38
Summary
  • Perturbation Theory
      • Squeeze near a velocity peak: formant goes down
      • Squeeze near a pressure peak: formant goes up
  • Formant Transitions
      • Labial closure: loci near 250, 1000, 2000 Hz
      • Alveolar closure: loci near 250, 1700, 3000 Hz
      • Velar closure: F2 and F3 come together ("velar
        pinch")
  • Vocal Tract Transfer Function
      • $T(s) = \prod_n s_n s_n^*/((s-s_n)(s-s_n^*))$
      • $|T(\omega = 2\pi F_n)| \approx Q_n = F_n/B_n$
      • 3 dB bandwidth = $B_n$ Hertz
      • $T(0) = 1$
  • Nasal Vowels
      • Sum of two transfer functions gives a spectral
        zero between the oral and nasal poles
      • Pole-zero pair is a local perturbation of the
        spectrum