1. Sound and Music for Video Games
- Technology Overview
- Roger Crawfis
- Ohio State University
2. Overview
- Fundamentals of Sound
- Psychoacoustics
- Interactive Audio
- Applications
3. What is sound?
- Sound is the sensation perceived by the sense of hearing
- Audio is acoustic, mechanical, or electrical frequencies corresponding to normally audible sound waves
4. Dual Nature of Sound
- Transfer of sound and physical stimulation of the ear
- Physiological and psychological processing in the ear and brain (psychoacoustics)
5. Transmission of Sound
- Requires a medium with elasticity and inertia (air, water, steel, etc.)
- Movements of air molecules result in the propagation of a sound wave
6. Longitudinal Motion of Air
7. Wavefronts and Rays
8. Reflection of Sound
9. Absorption of Sound
- Some materials readily absorb the energy of a sound wave
- Examples: carpet, curtains in a movie theater
10. Refraction of Sound
11. Refraction of Sound
12. Diffusion of Sound
- Not analogous to diffusion of light
- Naturally occurring diffusion of sound typically affects only a small subset of audible frequencies
- Nearly full diffusion of sound requires a reflection phase grating (Schroeder Diffuser)
13. The Inverse-Square Law (Attenuation)
- I = W / (4πr²)
- I is the sound intensity in W/cm², W is the sound power of the source in W, and r is the distance from the source in cm
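As a minimal numeric sketch of the law above (assuming an idealized point source radiating uniformly in all directions, with no absorption):

```python
import math

def sound_intensity(power_w, distance_cm):
    """Inverse-square law: intensity in W/cm^2 at distance r (in cm)
    from a point source of power W, spread over a sphere of area 4*pi*r^2."""
    return power_w / (4.0 * math.pi * distance_cm ** 2)

# Doubling the distance quarters the intensity.
i_near = sound_intensity(1.0, 100.0)
i_far = sound_intensity(1.0, 200.0)
```

This is why simple game attenuation models scale volume with 1/r² (often clamped, since a true point source gets arbitrarily loud as r approaches 0).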
14. The Skull
- Occludes wavelengths that are small relative to the skull
- Causes diffraction around the head (helps amplify sounds)
- Wavelengths much larger than the skull are not affected (explains why low frequencies are not directional)
15. The Pinna
16. Ear Canal and Skull
- (A) Dark line: ear canal only
- (B) Dashed line: ear canal and skull diffraction
17. Auditory Area (20 Hz-20 kHz)
18. Spatial Hearing
- The ability to determine the direction and distance of a sound source
- Not a fully understood process
- However, some cues have been identified as useful
19. The Duplex Theory of Localization
- Interaural Intensity Differences (IIDs)
- Interaural Arrival-Time Differences (ITDs)
20. Interaural Intensity Difference
- The skull produces a sound shadow
- The intensity difference results from one ear being shadowed and the other not
- The IID does not apply to frequencies below 1000 Hz (waves similar to or larger than the size of the head)
- Sound shadowing can result in drops of up to 20 dB for frequencies > 6000 Hz
- The Inverse-Square Law can also affect intensity
21. Head Rotation or Tilt
- Rotation or tilt can alter the interaural spectrum in a predictable manner
22. Interaural Arrival-Time Difference
- Perception of the phase difference between the ears caused by an arrival-time delay (ITD)
- The ear closest to the sound source hears the sound before the other ear
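A common textbook sketch of the ITD is Woodworth's spherical-head approximation (not from these slides; the head radius below is an assumed average, and the formula models the extra path length around a rigid sphere):

```python
import math

SPEED_OF_SOUND_CM_S = 34300.0  # speed of sound in air, cm/s
HEAD_RADIUS_CM = 8.75          # assumed average human head radius

def itd_seconds(azimuth_deg):
    """Interaural arrival-time difference for a distant source at the
    given azimuth (0 = straight ahead, 90 = directly to one side),
    using Woodworth's formula: ITD = (r/c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_CM / SPEED_OF_SOUND_CM_S) * (theta + math.sin(theta))
```

The ITD is zero for a source straight ahead and grows to roughly two thirds of a millisecond for a source directly to the side, which is the cue the duplex theory assigns to low frequencies.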
23. Digital Sound
- Remember that sound is an analogue process (like vision)
- Computers need to deal with digital processes (like digital images)
- Computer imagery and computer sound processing share many similar properties
24. Class or Semantics
- Sample
- Stream Sounds
- Music
- Tracks
- MIDI
25. Sound for Games
- Stereo doesn't cut it anymore; you need positional audio
- Positional audio increases immersion
- The old way: vary volume as position changes
- The new way: Head-Related Transfer Functions (HRTFs) for 3D positional audio with 2-4 speakers
- Games use:
- Dolby 5.1: requires lots of speakers
- Creative's EAX: environmental audio
- Aureal's A3D: good positional audio
- DirectSound3D: Microsoft's answer
- OpenAL: open, cross-platform API
26. Audio Basics
- Sound has two fundamental physical properties:
- Frequency: the pitch of the wave; oscillations per second (Hertz)
- Amplitude: the loudness or strength of the wave (decibels)
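The two properties map directly onto the parameters of a pure tone. A minimal sketch (the function name and defaults are illustrative, not from the slides):

```python
import math

def sine_samples(freq_hz, amplitude, sample_rate=44100, duration_s=0.01):
    """Generate samples of a pure tone: freq_hz sets the pitch,
    amplitude sets the loudness (here on a -1..1 scale)."""
    n = int(sample_rate * duration_s)
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n)]

tone = sine_samples(440.0, 0.5)  # concert A at half amplitude
```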
27. Sampling
- A sound wave is sampled
- measurements of amplitude taken at a fast rate
- results in a stream of numbers
28. Data Rates for Sound
- The human ear can hear frequencies between ?? and ??.
- Must sample at twice the highest frequency.
- Assume stereo (two channels)
- Assume a 44.1 kHz sampling rate (the CD sampling rate)
- Assume 2 bytes per channel per sample
- How much raw data is required to record 3 minutes of music?
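Working the question above through with the stated assumptions:

```python
sample_rate = 44100    # samples per second per channel (CD rate)
channels = 2           # stereo
bytes_per_sample = 2   # 16-bit samples
seconds = 3 * 60       # 3 minutes

raw_bytes = sample_rate * channels * bytes_per_sample * seconds
megabytes = raw_bytes / (1024 * 1024)
# 44100 * 2 * 2 * 180 = 31,752,000 bytes, roughly 30 MB
```

About 30 MB for three minutes of raw CD-quality audio, which is why compression matters.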
29. Waveform Sampling: Quantization
- Quantization introduces noise
- Examples: 16-, 12-, 8-, 6-, 4-bit music; 16-, 12-, 8-, 6-, 4-bit speech
30. Limits of Human Hearing
- Time and Frequency
- Events longer than 0.03 seconds are resolvable in time
- Shorter events are perceived as features in frequency
- 20 Hz < Human Hearing < 20 kHz (for those under 15 or so)
- Pitch is PERCEPTION related to FREQUENCY
- Human pitch resolution is about 40-4000 Hz
31. Limits of Human Hearing
- Amplitude or Power???
- Loudness is PERCEPTION related to POWER, not AMPLITUDE
- Power is proportional to the (integrated) square of the signal
- The range of human loudness perception is about 120 dB, where dB = 10 × log10(power ratio) = 20 × log10(amplitude ratio)
- Waveform shape is of little consequence. Energy at each frequency, and how that changes in time, is the most important feature of a sound.
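The two decibel formulas are consistent because power goes as amplitude squared, so the exponent turns a factor of 10 into a factor of 20:

```python
import math

def db_from_power_ratio(p_ratio):
    """Decibels from a ratio of powers."""
    return 10.0 * math.log10(p_ratio)

def db_from_amplitude_ratio(a_ratio):
    """Decibels from a ratio of amplitudes; power ~ amplitude**2,
    so 10*log10(a**2) = 20*log10(a)."""
    return 20.0 * math.log10(a_ratio)

# 10x the power is +10 dB; 10x the amplitude is 100x the power, +20 dB.
```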
32. Limits of Human Hearing
- Waveshape or Frequency Content??
- Here are two waveforms with identical power spectra, which are (nearly) perceptually identical
- [Figure: Wave 1, Wave 2, and their shared magnitude spectrum]
33. Limits of Human Hearing
- Masking in Amplitude, Time, and Frequency
- Masking in amplitude: loud sounds mask soft ones
- Example: quantization noise
- Masking in time: a soft sound just before a louder sound is more likely to be heard than if it is just after
- Example (and reason): reverb vs. preverb
- Masking in frequency: a loud neighboring frequency masks soft spectral components. Low sounds mask higher ones more than high masks low.
34. Limits of Human Hearing
- Masking in Amplitude
- Intuitively, a soft sound will not be heard if there is a competing loud sound. Reasons:
- Gain controls in the ear (the stapedius reflex and more)
- Interaction (inhibition) in the cochlea
- Other mechanisms at higher levels
35. Limits of Human Hearing
- Masking in Time
- In the time range of a few milliseconds
- A soft event following a louder event tends to be grouped perceptually as part of that louder event
- If the soft event precedes the louder event, it might be heard as a separate event (become audible)
36. Limits of Human Hearing
- Masking in Frequency
- Only one component in this spectrum is audible, because of frequency masking
37. Sampling Rates
- For cheap compression, look at lowering the sampling rate first
- 44.1 kHz, 16-bit: CD quality
- 8 kHz, 8-bit mu-law: phone quality
- Examples:
- Music: 44.1, 32, 22.05, 16, 11.025 kHz
- Speech: 44.1, 32, 22.05, 16, 11.025, 8 kHz
38. Views of Digital Sound
- Two (mainstream) views of sound and their implications for compression:
- 1) Sound is Perceived
- The auditory system doesn't hear everything present
- Bandwidth is limited
- Time resolution is limited
- Masking in all domains
- 2) Sound is Produced
- A perfect model could provide perfect compression
39. Production Models
- Build a model of the sound production system, then fit the parameters
- Example: if the signal is speech, then a well-parameterized vocal model can yield the highest quality and compression ratio
- Benefits: highest possible compression
- Drawbacks: signal source(s) must be assumed, known, or identified
40. MIDI and Other Event Models
- Musical Instrument Digital Interface
- Represents music as notes and events, and uses a synthesis engine to render it
- An Edit Decision List (EDL) is another example
- A history of source materials, transformations, and processing steps is kept. Operations can be undone or recreated easily. Intermediate non-parametric files are not saved.
41. Event Based Compression
- A musical score is a very compact representation of music
- Benefits:
- Highest possible compression
- Drawbacks:
- Cannot guarantee the performance
- Cannot assure the quality of the sounds
- Cannot make arbitrary sounds
42. Event Based Compression
- Enter General MIDI
- Guarantees a base set of instrument sounds, and a means for addressing them, but doesn't guarantee any quality
- Better yet, Downloadable Sounds
- Download samples for instruments
- Benefits: does more to guarantee quality
- Drawbacks: samples aren't reality
43. Event Based Compression
- Downloadable Algorithms
- Specify the algorithm; the synthesis engine runs it, and we just send parameter changes
- Part of Structured Audio (MPEG-4)
- Benefits:
- Can upgrade algorithms later
- Can implement scalable synthesis
- Drawbacks:
- Different algorithm for each class of sounds (but can always fall back on samples)
44. Compressed Audio Formats

Name                  Extension      Ownership
AIFF (Mac)            .aif, .aiff    Public
AU (Sun/NeXT)         .au            Public
CD audio (CDDA)       N/A            Public
MP3                   .mp3           MPEG Audio Layer-III
Windows Media Audio   .wma           Proprietary (Microsoft)
QuickTime             .qt            Proprietary (Apple)
RealAudio             .ra, .ram      Proprietary (Real Networks)
WAV                   .wav           Public
45. To be continued
- Stop here
- Sound Group Technical Presentations
- Suggested Topics:
- Compression
- Controlling the Environment
- ToolKit I features
- ToolKit II features
- Examples and Demos
46. Environmental Effects
- Obstruction/Occlusion
- Reverberation
- Doppler Shift
- Atmospheric Effects
47. Obstruction
- Same as sound shadowing
- Generally approximated by a ray test and a low-pass filter
- High frequencies should get shadowed while low frequencies diffract
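The low-pass step can be as cheap as a one-pole filter, a minimal sketch of the idea (the smoothing factor below is an illustrative assumption; real engines tune it to the occluding material):

```python
def low_pass(samples, alpha=0.2):
    """One-pole low-pass filter, a cheap stand-in for sound shadowing.
    Each output moves a fraction alpha toward the input, so rapid
    (high-frequency) changes are smoothed away while slow ones pass."""
    out = []
    y = 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

hf = low_pass([1.0, -1.0] * 50)  # fastest possible alternation: attenuated
dc = low_pass([1.0] * 100)       # constant (lowest frequency): passes through
```

When the ray test reports an obstruction, the engine routes the source through a filter like this; when the path clears, it fades alpha back toward 1.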
48. Obstruction
49. Occlusion
- A completely blocked sound
- Example: a sound that penetrates a closed door or a wall
- The sound will be muffled (low-pass filter)
50. Reverberation
- Effects from sound reflection
- Similar to echo
- Static reverberation
- Dynamic reverberation
51. Static Reverberation
- Relies on the closed-container assumption
- Parameters used to specify approximate environment conditions (decay, room size, etc.)
- Examples: Microsoft DirectSound3D, EAX
52. Static Reverberation
53. Dynamic Reverberation
- Calculation of reflections off of surfaces, taking into account surface properties
- Diffusion and diffraction are typically ignored
- Wave Tracing
- Example: Aureal A3D 2.0
54. Dynamic Reverberation
55. Comparison
- Static reverberation: less expensive computationally, simple to implement
- Dynamic reverberation: very expensive computationally, difficult to implement, but potentially superior results
56. Doppler Shift
- Change in frequency due to the relative velocity of source and listener
- Very susceptible to temporal aliasing
- The faster the update rate, the better
- Requires dedicated hardware
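The underlying formula is simple; a sketch for motion along the line between source and listener (positive speeds mean they approach each other; the default speed of sound assumes air at room temperature):

```python
def doppler_frequency(freq_hz, source_speed, listener_speed=0.0, c=343.0):
    """Observed frequency under the classical Doppler effect:
    f' = f * (c + v_listener) / (c - v_source), speeds in m/s."""
    return freq_hz * (c + listener_speed) / (c - source_speed)

# A 440 Hz siren approaching at 34.3 m/s (~10% of c) is heard ~11% higher.
f = doppler_frequency(440.0, 34.3)
```

The temporal-aliasing warning on the slide applies here: the engine recomputes the relative speed each frame, and large per-frame jumps in f' produce audible pitch steps, so frequent updates (or interpolation) are needed.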
57. Atmospheric Effects
- The atmosphere attenuates high frequencies faster than low frequencies
- Moisture in the air increases this effect