Study and Implementation of Two Voice Warping Algorithms: Pitch Shifting - PowerPoint PPT Presentation

1
Study and Implementation of Two Voice Warping
Algorithms: Pitch Shifting and Time Stretching
  • EEL 6586 Project Presentation
  • Deng, Chengyu
  • Wang, Dexiang

2
Outline
  • Pitch analysis
  • Voice warping applications
  • Pitch shifting
  • Time stretching
  • Pitch shifting algorithm
  • How to change pitch
  • General approaches: phase vocoder vs. PSOLA
  • Improve frequency resolution
  • Formants consideration
  • Time stretching implementation
  • Software exhibition (real work we have done)

3
Take a Look at Pitch
  • The perceived fundamental frequency of a sound
    (definition)

Pitch period
  • Produced by glottal excitation
  • An important cue for distinguishing male from female
    voices and adults from children
  • Accompanied by many harmonics
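
As a rough illustration of the pitch period, the sketch below estimates the fundamental frequency of a synthetic voiced signal from its autocorrelation peak. The function name `estimate_pitch_hz`, the 60 to 400 Hz search range, and the test signal are assumptions made for this example, not part of the original project.

```python
import math

def estimate_pitch_hz(signal, sample_rate, f_min=60.0, f_max=400.0):
    """Return the frequency whose lag maximizes the autocorrelation."""
    lag_min = int(sample_rate / f_max)   # shortest pitch period considered
    lag_max = int(sample_rate / f_min)   # longest pitch period considered
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        # Unnormalized autocorrelation at this lag.
        score = sum(signal[n] * signal[n + lag]
                    for n in range(len(signal) - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Synthetic "voiced" signal (assumed): 200 Hz fundamental plus two harmonics.
sample_rate = 8000
f0 = 200.0
voiced = [sum(amp * math.sin(2 * math.pi * h * f0 * n / sample_rate)
              for h, amp in ((1, 1.0), (2, 0.5), (3, 0.25)))
          for n in range(800)]

pitch = estimate_pitch_hz(voiced, sample_rate)
print(pitch)  # pitch period of 40 samples at 8 kHz corresponds to 200 Hz
```

The harmonics share the same period as the fundamental, which is why the strongest autocorrelation peak falls at the pitch period.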

4
Voice Warping Applications
  • Pitch shifting: maintain time duration but
    scale the pitch up or down
  • Change a man's voice to a woman's, or vice versa
  • Create chipmunk-like or Mickey Mouse-like sounds
  • Lots of applications in the movie industry
  • Time stretching: keep the pitch unchanged but
    shorten or stretch time duration
  • Help with word identification
  • Create extremely short or long stretches of
    speech that a person could hardly produce naturally

5
How to Change Pitch?
  • Naïve idea
  • Down-sample or up-sample the speech signal
  • Problems
  • Time duration also gets changed
  • Formants get moved as well
  • We should generate the same number of samples but
    only scale the pitch
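
The first problem with the naïve idea can be seen in a small experiment. The sketch below reads a 100 Hz tone twice as fast using linear interpolation and compares duration and frequency at the same playback rate. The helper names `resample_linear` and `rising_zero_crossings` and the test tone are assumptions for illustration only.

```python
import math

def resample_linear(signal, factor):
    """Read `signal` `factor` times faster using linear interpolation."""
    out = []
    for i in range(int(len(signal) / factor)):
        pos = i * factor
        k = int(pos)
        frac = pos - k
        nxt = signal[min(k + 1, len(signal) - 1)]
        out.append((1 - frac) * signal[k] + frac * nxt)
    return out

def rising_zero_crossings(signal):
    """Count negative-to-non-negative transitions (one per cycle of a tone)."""
    return sum(1 for a, b in zip(signal, signal[1:]) if a < 0 <= b)

sample_rate = 8000
# One second of a 100 Hz tone; the 0.3 rad offset keeps samples off exact zeros.
tone = [math.sin(2 * math.pi * 100 * n / sample_rate + 0.3)
        for n in range(sample_rate)]
fast = resample_linear(tone, 2.0)

for name, sig in (("original", tone), ("resampled", fast)):
    duration = len(sig) / sample_rate            # seconds at the same playback rate
    freq = rising_zero_crossings(sig) / duration # cycles per second
    print(name, duration, "s,", freq, "Hz")
```

The resampled tone has the same number of cycles packed into half as many samples, so the pitch doubles and the duration halves together, which is exactly the coupling a pitch shifter has to break.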

6
Two General Approaches
  • Phase Vocoder
  • Manipulate the signal in the frequency domain
  • Phase is an important feature for determining the
    pitch and the positions of its harmonics
  • More accurate, higher fidelity, but longer
    computation
  • Time-domain scaling ((P)SOLA, etc.)
  • Manipulate the signal in the time domain
  • Precise pitch detection is a critical
    prerequisite
  • Shorter computation, but lower quality
  • (P)SOLA: (Pitch) Synchronous OverLap/Add

7
Basic Algorithms We Used for Pitch Shifting
  • Frequency-domain processing (more accurate)
  • Use the short-time Fourier transform (STFT)
  • With overlapped windows
  • Scale the frequency axis to change the positions of
    the pitch and its harmonics
  • Upscale: discard high-frequency components to
    avoid aliasing (listeners cannot hear the difference)
  • Downscale: fill the high-frequency components with zeros
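
A minimal sketch of the frequency-axis scaling for one frame, operating on magnitudes only. It assumes each bin k is moved to the nearest bin to k times the pitch factor; the function name `scale_frequency_axis` and the toy 16-bin spectrum are made up for this example, and a full implementation would also carry the per-bin frequency or phase along.

```python
def scale_frequency_axis(magnitudes, pitch_factor):
    """Move the content of bin k to the bin nearest k * pitch_factor."""
    n_bins = len(magnitudes)
    scaled = [0.0] * n_bins          # bins that receive nothing stay zero
    for k, magnitude in enumerate(magnitudes):
        target = int(k * pitch_factor + 0.5)
        if target < n_bins:          # content pushed past the top bin is discarded
            scaled[target] += magnitude
    return scaled

# Toy spectrum (assumed): a fundamental at bin 4 with harmonics at bins 8 and 12.
spectrum = [0.0] * 16
spectrum[4], spectrum[8], spectrum[12] = 1.0, 0.5, 0.25

up = scale_frequency_axis(spectrum, 1.5)    # harmonics move to bins 6 and 12; the third is dropped
down = scale_frequency_axis(spectrum, 0.5)  # harmonics move to bins 2, 4, 6; upper half stays zero
print(up)
print(down)
```

Scaling up pushes the highest harmonic off the end of the axis, where it is dropped; scaling down leaves the upper bins empty, which matches the two cases listed above.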

8
Improve Frequency Resolution
  • The discrete Fourier transform has limited
    frequency resolution
  • It cannot precisely represent peaks that fall
    between bins
  • Example
  • A: a frequency exactly on the 50th sample
  • B: a frequency between the 50th and 51st
    samples
  • Solution
  • Use the phase difference between two successive
    windows to compute the exact frequency in each bin
    (final report will have more details)
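
The phase-difference idea can be sketched as follows, using the standard phase vocoder estimate: compare the measured phase advance between two overlapping frames with the advance expected for the bin center, and convert the wrapped deviation into a frequency offset. The frame size, hop, sample rate, Hann window, and the 1580 Hz test tone (which falls between bins 50 and 51) are assumptions chosen for this example.

```python
import cmath
import math

def dft_bin(frame, k):
    """Single DFT coefficient of `frame` at bin k."""
    n_fft = len(frame)
    return sum(x * cmath.exp(-2j * math.pi * k * n / n_fft)
               for n, x in enumerate(frame))

def true_frequency(signal, k, n_fft, hop, sample_rate):
    """Refine the frequency of bin k from the phase advance between two frames."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / n_fft) for n in range(n_fft)]
    frame_a = [signal[n] * window[n] for n in range(n_fft)]
    frame_b = [signal[n + hop] * window[n] for n in range(n_fft)]
    phase_a = cmath.phase(dft_bin(frame_a, k))
    phase_b = cmath.phase(dft_bin(frame_b, k))
    expected = 2 * math.pi * hop * k / n_fft        # advance of a tone at the bin center
    deviation = phase_b - phase_a - expected
    deviation = (deviation + math.pi) % (2 * math.pi) - math.pi  # wrap to [-pi, pi)
    bin_offset = deviation * n_fft / (2 * math.pi * hop)
    return (k + bin_offset) * sample_rate / n_fft

sample_rate, n_fft, hop = 8000, 256, 64
tone_hz = 1580.0                                    # 50.56 bins: between bins 50 and 51
signal = [math.sin(2 * math.pi * tone_hz * n / sample_rate)
          for n in range(n_fft + hop)]

k = 51                                              # nearest bin to the tone
bin_center = k * sample_rate / n_fft                # 1593.75 Hz
estimate = true_frequency(signal, k, n_fft, hop, sample_rate)
print(bin_center, estimate)
```

The bin center alone is off by more than 10 Hz here, while the phase-based estimate lands very close to the actual tone. The wrapped deviation is unambiguous only when the hop is small relative to the window, which is one reason overlapped windows are used.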

9
Formants Consideration
  • Deal with formant movement issues
  • Vocal tract information is lost
  • Upshifting pitch -> smaller vocal tract (shape)
    effect
  • Downshifting pitch -> bigger vocal tract (shape)
    effect
  • Solution
  • Calculate the formant envelope (LPC)
  • Normalize magnitudes before frequency scaling
  • Scale the frequency axis
  • Restore the formant envelope
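
The four solution steps can be sketched under some assumptions: the envelope comes from LPC coefficients computed with the Levinson-Durbin recursion, the envelope at each bin is the LPC gain divided by the magnitude of the prediction polynomial, and bins are moved to the nearest scaled position. All function names and the toy autocorrelation values are hypothetical and are not taken from the project's source code.

```python
import cmath
import math

def levinson_durbin(r, order):
    """Solve for LPC coefficients a[0..order] (a[0] = 1) from autocorrelation r."""
    a = [1.0] + [0.0] * order
    error = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / error                      # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        error *= (1.0 - k * k)                # remaining prediction error
    return a, error

def lpc_envelope(a, error, n_bins):
    """Spectral envelope sqrt(error) / |A(e^{jw})| on n_bins bins from 0 to pi."""
    envelope = []
    for k in range(n_bins):
        omega = math.pi * k / n_bins
        denom = abs(sum(c * cmath.exp(-1j * omega * j) for j, c in enumerate(a)))
        envelope.append(math.sqrt(error) / denom)
    return envelope

def shift_keeping_formants(magnitudes, envelope, pitch_factor):
    """Normalize by the envelope, scale the frequency axis, restore the envelope."""
    flat = [m / e for m, e in zip(magnitudes, envelope)]
    moved = [0.0] * len(flat)
    for k, value in enumerate(flat):
        target = int(k * pitch_factor + 0.5)
        if target < len(moved):
            moved[target] += value
    return [v * e for v, e in zip(moved, envelope)]

# Toy autocorrelation (assumed): r[k] = 0.5**k, a first-order low-pass process.
coeffs, err = levinson_durbin([1.0, 0.5, 0.25], order=2)
env = lpc_envelope(coeffs, err, n_bins=8)
unchanged = shift_keeping_formants([1.0] * 8, env, 1.0)
print(coeffs, err)
```

Because the envelope is divided out before the bins move and multiplied back at the new positions, the harmonics shift while the broad spectral shape stays where it was.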

10
Time Stretching Implementation
  • Still take advantage of frequency-domain
    manipulation
  • Stretch time duration
  • Interpolate additional samples between original
    frequency bins (upsampling in the frequency domain)
  • Linear interpolation instead of sinc
    interpolation (for computational convenience)
  • Shrink time duration
  • Compress the original frequency bin samples
    (downsampling in the frequency domain)
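
The slides do not spell out exactly which sequence is interpolated, so the sketch below only illustrates the linear interpolation step itself: mapping a list of values onto a longer or shorter grid while keeping its endpoints. The function name `stretch_linear` and the sample values are assumptions for this example.

```python
def stretch_linear(values, factor):
    """Resample `values` (at least 2 items) to about len(values) * factor points."""
    out_len = max(2, int(round(len(values) * factor)))
    step = (len(values) - 1) / (out_len - 1)   # spacing on the original grid
    out = []
    for i in range(out_len):
        pos = i * step
        k = min(int(pos), len(values) - 2)     # left neighbor, clamped at the end
        frac = pos - k
        out.append((1 - frac) * values[k] + frac * values[k + 1])
    return out

values = [0.0, 2.0, 4.0, 6.0]
longer = stretch_linear(values, 2.0)    # 8 points between the same endpoints
shorter = stretch_linear(values, 0.5)   # 2 points: just the endpoints
print(longer)
print(shorter)
```

Linear interpolation needs only the two nearest neighbors per output point, whereas sinc interpolation sums over many neighbors, which is the computational convenience mentioned above.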

11
Putting It All Together (Building Our Software
Implementation)
  • Windows platform / Visual C++
  • Self-developed framework algorithms
  • Formant position maintenance (LPC formant envelope
    calculation)
  • Time stretching
  • Borrowed ideas and some source code from DSP
    websites
  • http://www.dspdimension.com/ for the elementary
    frequency shifting algorithm
  • http://www.koders.com/ for the Levinson LPC algorithm

12
Introduction to Our GUI Functions
  • Set the target parameters needed by the pitch shifting
    and time stretching processes
  • Click "Process Voice File" to select the
    original voice file and the altered voice file
  • Wait for processing to complete
  • Click the "Play Voice File" button to hear the
    altered voice

13
Introduction to Our GUI Functions
  • Advanced setup
  • Change the parameters used in our algorithms
  • LPC order
  • Window size
  • Overlap percentage of the windows
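
To show how the window size and overlap percentage interact, the sketch below derives the hop size (the number of samples between successive window starts) and the number of full frames that fit in a signal. The function name `frame_layout` and the example values are assumptions; the software's actual defaults are not given in the slides.

```python
def frame_layout(n_samples, window_size, overlap_percent):
    """Return (hop size, number of full frames) for overlapped windows."""
    hop = int(window_size * (1 - overlap_percent / 100))
    if hop < 1:
        raise ValueError("overlap too large for this window size")
    if n_samples < window_size:
        return hop, 0
    n_frames = 1 + (n_samples - window_size) // hop
    return hop, n_frames

# Example values (assumed): one second at 8 kHz, 1024-sample windows, 75% overlap.
hop, n_frames = frame_layout(8000, 1024, 75)
print(hop, n_frames)
```

A higher overlap percentage means a smaller hop and more frames to transform, trading computation time for smoother frame-to-frame phase tracking.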

14
Demo Show (Question session follows)