Introduction to Voice and Image Processing

Provided by: sand211

Transcript and Presenter's Notes


1
Introduction to Voice and Image Processing, Spring 2003
2
Faculty: Dr. Lester A. Gerhardt, Associate Dean, School of Engineering, Rensselaer Polytechnic Institute
Phone: 518-276-6203/6400
Fax: 518-276-8788
E-mail: gerhal_at_rpi.edu
3
Text: Digital Image Processing, K.R. Castleman (Prentice-Hall, 1996)
References:
  • Digital Image Processing, R.C. Gonzalez and R.E. Woods (Prentice Hall, 2002)
  • Digital Image Processing, William Pratt (Wiley, 2001)
  • Image Processing: The Fundamentals, M. Petrou (Wiley, 1999)
4
Lecture 1: Course Outline, Background Requirements
Introduction - Similarities and differences, sampling/quantization, definitions, eye/ear physiology, sensors and scanned images, displays.
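As a rough illustration of the quantization topic named in the Lecture 1 outline, the sketch below maps continuous sample values to a fixed number of levels and back. It is a minimal example, not course material: the function names, the [0, 1] input range, and the midpoint reconstruction rule are all assumptions chosen for illustration.

```python
def quantize(samples, bits, lo=0.0, hi=1.0):
    """Uniformly quantize values in [lo, hi] to 2**bits integer level codes."""
    levels = 2 ** bits
    step = (hi - lo) / levels
    codes = []
    for x in samples:
        # Clamp to the assumed input range, then find the level index.
        x = min(max(x, lo), hi)
        code = min(int((x - lo) / step), levels - 1)
        codes.append(code)
    return codes


def reconstruct(codes, bits, lo=0.0, hi=1.0):
    """Map each level code back to the midpoint of its quantization cell."""
    step = (hi - lo) / (2 ** bits)
    return [lo + (c + 0.5) * step for c in codes]


codes = quantize([0.0, 0.24, 0.5, 1.0], bits=2)
approx = reconstruct(codes, bits=2)
```

With 2 bits there are only four levels, so 0.0 and 0.24 fall into the same cell; adding bits shrinks the cell width and the reconstruction error.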
5
Lecture 2: Van der Lugt paper
Optical Image Processing - Optics and imaging systems, basic elements and transfer functions, design/analysis of complete systems, point spread functions, imaging and transform conditions, spatial filtering, correlation, deblurring.
6
Lecture 3: Image Processing/Math Tools - Histograms, point, algebraic and geometric operations, convolution, correlation, 2-D Fourier Transform, other transforms, wavelet transforms, stochastic processes and LMSEE, Linear Predictive Coding (LPC).
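The convolution and correlation entries in the Lecture 3 outline differ only in whether the kernel is flipped before it slides over the image. The sketch below shows this in plain Python, assuming images and kernels are rectangular nested lists and that only the fully overlapping ("valid") region is computed. The function names are illustrative and do not come from the course.

```python
def correlate2d(image, kernel):
    """'Valid' 2-D cross-correlation: slide the kernel as-is over the image."""
    kh, kw = len(kernel), len(kernel[0])
    ih, iw = len(image), len(image[0])
    out = []
    for r in range(ih - kh + 1):
        out_row = []
        for c in range(iw - kw + 1):
            acc = 0
            for i in range(kh):
                for j in range(kw):
                    acc += image[r + i][c + j] * kernel[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


def convolve2d(image, kernel):
    """'Valid' 2-D convolution: correlation with the kernel flipped in both axes."""
    flipped = [row[::-1] for row in kernel[::-1]]
    return correlate2d(image, flipped)


image = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]
kernel = [[1, 0],
          [0, -1]]
corr = correlate2d(image, kernel)
conv = convolve2d(image, kernel)
```

For a kernel that is symmetric under a 180-degree rotation the two results coincide; the diagonal difference kernel used here is not, so the outputs differ in sign.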
7
Lecture 4: Image Bandwidth Compression - LPC, bit plane encoding, 2-D transform coding, selective filtering, delta modulation.
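Of the compression methods listed for Lecture 4, delta modulation is the simplest to sketch: each sample is coded with one bit that says whether a running estimate should step up or down. The example below is a minimal linear delta modulator, assuming a fixed step size and an initial estimate of zero. Both choices, and the function names, are illustrative assumptions rather than the course's own formulation.

```python
def delta_modulate(samples, step):
    """Encode each sample as one bit: 1 = estimate steps up, 0 = steps down."""
    bits = []
    estimate = 0.0  # assumed starting value shared by encoder and decoder
    for x in samples:
        bit = 1 if x >= estimate else 0
        estimate += step if bit else -step
        bits.append(bit)
    return bits


def delta_demodulate(bits, step):
    """Rebuild the staircase approximation from the bit stream."""
    estimate = 0.0
    out = []
    for bit in bits:
        estimate += step if bit else -step
        out.append(estimate)
    return out


bits = delta_modulate([1, 2, 3, 3, 2], step=1)
staircase = delta_demodulate(bits, step=1)
```

The decoder tracks the signal only as fast as the step size allows, so a step that is too small lags behind steep edges while a step that is too large adds granular noise in flat regions.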
8
Lecture 5: Image Enhancement and Restoration - Edge detection, color encoding, windowing, spectral enhancement, density transformations, deblurring.
9
Lecture 6: Image Representation and Description - Segmentation, boundaries, regions, morphology.
10
Lecture 7: Image Processing Applications - Satellite imagery, medical, fingerprints, robotics among others, pattern recognition, 3-D imagery.
11
Introduction Human Sensing
12
SIMILARITIES
  • Digital signal processing
  • Scanned imagery is sequential - as is speech
  • Transform processing applies
  • Similar compression methods
  • Performance evaluation difficult
  • Acoustic signals can be converted to spatial
    representation

13
One picture is worth 1000 words.
14
Human Mechanisms
Speech
Images
Receiving
Generation
15
The Eye
Figure labels: retina, choroid, sclera, eye lens, vitreous humor, optic nerve, fovea, iris, aqueous humor, cornea.
16
Figure labels (cross-section of the eye): cornea, iris, ciliary body, ciliary muscle, lens, ciliary fibers, visual axis, anterior chamber, vitreous humor, retina, blind spot, fovea, sclera, choroid, nerve sheath.
17
Optic pathways
1. Optic nerve
2. Frontal lobe
3. Temporal lobe
4. Optic tract
5. Temporal loop
6. Parietal lobe
7. Visual radiation
8. Occipital lobe
9. Chiasm
10. Lateral geniculate body
11. Posterior calcarine fissure
18
Figure (Cornsweet): distribution of rods and cones across the retina, with the blind spot marked.
Vertical axis: number of rods or cones per mm²
Horizontal axis: perimetric angle (deg), from temporal on retina to nasal
19
Figure (Yarbus): relative visual acuity depending on the position of the retinal image on the retina (Jones and Higgins, 1947).
Vertical axis: relative visual acuity
Horizontal axes: distance from center of fovea, in degrees and in minutes of angle
20
CAN
  • See a black line 1 second of arc wide on a white field
  • Detect motion to 10 seconds of arc; 2 minutes of arc per second of time
  • Match brightness or color well (within 2) (or a few millimicrons)
  • Process information in parallel
21
CANNOT
  • Judge absolute level or brightness accurately
  • Determine absolute wavelength of color well
  • Detect motion faster than 200° per second
  • See beyond 0.4 to 0.7 microns
22
HUMAN
  • In parallel
  • Color
  • Non-uniform distribution of sensitized elements
  • Gestalt processing
  • 3-D
MACHINE
  • Sequential
  • BW
  • Uniform distribution of sensitized elements
  • Induction/deduction
  • 2-D
23
Figure labels (vocal organs): nose, nasal pharynx, nasal cavity, velum, palate, alveolar ridge, uvula, teeth, lip, tongue (apex, blade, middle, back, root), oral pharynx, epiglottis, jawbone, laryngeal pharynx, hyoid bone, vocal cords, thyroid cartilage, cricoid cartilage.
24
Radiated speech
25
Schematic diagram of functional components of the vocal tract
Figure labels: muscle force, lung volume, trachea and bronchi, vocal cords, larynx tube, pharynx cavity, velum, nasal cavity, nose output, mouth cavity, tongue hump, mouth output.
26
Network simulation of the vocal system
Figure labels: lungs, trachea and bronchi, vocal cords, vocal tract, velum, nasal tract, nostril, mouth.
27
VOICED COMPONENT
UNVOICED COMPONENT
28
29
(No Transcript)
30
(No Transcript)
31
The cochlea as it would appear if unwound.
Figure labels: stapes, oval window, round window, base, apex, helicotrema, partition (with basilar membrane); length 35 mm.
32
Figure axes: relative amplitude of displacement vs. frequency (cps)
33
Figure axes: relative amplitude and phase (radians) vs. frequency (cycles per second)
34
(No Transcript)
35
(No Transcript)
36
LINGUISTIC AND TECHNICAL FUNDAMENTALS
Figure: position of maximum amplitude along the basilar membrane as a function of applied frequency (after Békésy, 1960).
Curves: localization from hearing loss measurement; localization as measured on basilar membrane
Axes: distance from the stapes (mm) vs. frequency (Hz)
37
Display Parameters: Static Image
Pixel
Resolution, Quantization
38
Display Parameters: Dynamic Image
Frame Rate, Resolution, Quantization
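The display parameters on the last two slides (resolution, quantization, frame rate) multiply together to give the raw, uncompressed size of an image or image sequence. The sketch below works through that arithmetic. The function names and the example figures (512 x 512 pixels, 8 bits per pixel, 30 frames per second) are illustrative assumptions, not values taken from the slides.

```python
def static_image_bits(width, height, bits_per_pixel):
    """Raw size of one frame: pixel count times quantization depth."""
    return width * height * bits_per_pixel


def dynamic_image_bit_rate(width, height, bits_per_pixel, frames_per_second):
    """Raw bit rate of a sequence: size of one frame times the frame rate."""
    return static_image_bits(width, height, bits_per_pixel) * frames_per_second


# Assumed example values, chosen only to make the arithmetic concrete.
frame_bits = static_image_bits(512, 512, 8)          # 2,097,152 bits
stream_bps = dynamic_image_bit_rate(512, 512, 8, 30)  # 62,914,560 bits/s
```

Halving the resolution in each dimension cuts the raw rate by a factor of four, while dropping one bit of quantization or halving the frame rate each cut it by a much smaller factor, which is one reason these parameters are traded off differently in compression.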