1
Sound and Music for Video Games
  • Technology Overview
  • Roger Crawfis
  • Ohio State University

2
Overview
  • Fundamentals of Sound
  • Psychoacoustics
  • Interactive Audio
  • Applications

3
What is sound?
  • Sound is the sensation perceived by the sense of
    hearing
  • Audio is acoustic, mechanical, or electrical
    frequencies corresponding to normally audible
    sound waves

4
Dual Nature of Sound
  • Transfer of sound and physical stimulation of ear
  • Physiological and psychological processing in ear
    and brain (psychoacoustics)

5
Transmission of Sound
  • Requires a medium with elasticity and inertia
    (air, water, steel, etc.)
  • Movements of air molecules result in the
    propagation of a sound wave

6
Longitudinal Motion of Air
7
Wavefronts and Rays
8
Reflection of Sound
9
Absorption of Sound
  • Some materials readily absorb the energy of a
    sound wave
  • Example: carpet, curtains at a movie theater

10
Refraction of Sound
11
Refraction of Sound
12
Diffusion of Sound
  • Not analogous to diffusion of light
  • Naturally occurring diffusions of sounds
    typically affect only a small subset of audible
    frequencies
  • Nearly full diffusion of sound requires a
    reflection phase grating (Schroeder Diffuser)

13
The Inverse-Square Law (Attenuation)
  • I = W / (4πr²)
  • I is the sound intensity in W/cm²
  • W is the sound power of the source in W
  • r is the distance from the source in cm
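As a quick sketch (Python, not part of the original slides), the law can be applied directly; a point source's power spreads over a sphere of area 4πr²:

```python
import math

def intensity(power_w, distance_cm):
    """Sound intensity I = W / (4*pi*r^2) in W/cm^2 for a point
    source of power_w watts heard distance_cm centimetres away."""
    return power_w / (4.0 * math.pi * distance_cm ** 2)

# Doubling the distance quarters the intensity (a ~6 dB drop).
near = intensity(1.0, 100.0)   # 1 W source heard at 1 m
far = intensity(1.0, 200.0)    # same source at 2 m
assert abs(near / far - 4.0) < 1e-9
```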
14
The Skull
  • Occludes wavelengths small relative to the
    skull
  • Causes diffraction around the head (helps amplify
    sounds)
  • Wavelengths much larger than the skull are not
    affected (explains why low frequencies are not
    directional)

15
The Pinna
16
Ear Canal and Skull
  • (A) Dark line: ear canal only
  • (B) Dashed line: ear canal and skull diffraction

17
Auditory Area (20Hz-20kHz)
18
Spatial Hearing
  • The ability to determine the direction of and
    distance to a sound source
  • The process is not fully understood
  • However, some cues have been identified as useful

19
The Duplex Theory of Localization
  • Interaural Intensity Differences (IIDs)
  • Interaural Arrival-Time Differences (ITDs)

20
Interaural Intensity Difference
  • The skull produces a sound shadow
  • Intensity difference results from one ear being
    shadowed and the other not
  • The IID does not apply to frequencies below
    1000 Hz (wavelengths similar to or larger than
    the head)
  • Sound shadowing can result in up to 20 dB drops
    for frequencies >6000 Hz
  • The Inverse-Square Law can also affect intensity
21
Head Rotation or Tilt
  • Rotation or tilt can alter interaural spectrum in
    predictable manner

22
Interaural Arrival-Time Difference
  • Perception of phase difference between ears
    caused by arrival-time delay (ITD)
  • Ear closest to sound source hears the sound
    before the other ear
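A common sketch of this cue (not from the slides) is Woodworth's spherical-head formula; the head radius and speed of sound below are assumed round numbers:

```python
import math

SPEED_OF_SOUND = 343.0   # m/s in air (assumed)
HEAD_RADIUS = 0.0875     # m, a common spherical-head approximation

def itd_seconds(azimuth_deg):
    """Woodworth's estimate of the interaural arrival-time difference
    for a distant source at the given azimuth (0 = straight ahead,
    90 = directly to one side): ITD = (r/c) * (theta + sin(theta))."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))

# A source directly to the side leads at the near ear by roughly 0.65 ms.
print(round(itd_seconds(90) * 1e6), "microseconds")
```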

23
Digital Sound
  • Remember that sound is an analogue process (like
    vision).
  • Computers need to deal with digital processes
    (like digital images).
  • Many similar properties between computer imagery
    and computer sound processing.

24
Class or Semantics
  • Sample
  • Stream Sounds
  • Music
  • Tracks
  • MIDI

25
Sound for Games
  • Stereo doesn't cut it anymore; you need
    positional audio.
  • Positional audio increases immersion
  • The Old: vary volume as position changes
  • The New: Head-Related Transfer Functions (HRTFs)
    for 3D positional audio with 2-4 speakers
  • Games use:
  • Dolby 5.1: requires lots of speakers
  • Creative's EAX: environmental audio
  • Aureal's A3D: good positional audio
  • DirectSound3D: Microsoft's answer
  • OpenAL: open, cross-platform API
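A minimal sketch of "The Old" approach (illustrative Python, not any of the APIs above): constant-power panning, which varies only the per-speaker volume with source position:

```python
import math

def pan_gains(azimuth_deg):
    """Constant-power stereo pan: map azimuth in [-90, 90] degrees
    (negative = left) to (left_gain, right_gain) such that
    left^2 + right^2 == 1, so overall loudness stays constant."""
    angle = math.radians((azimuth_deg + 90.0) / 2.0)  # map to [0, 90]
    return math.cos(angle), math.sin(angle)

left, right = pan_gains(0)           # centered source
assert abs(left - right) < 1e-9      # equal gain in both speakers
assert abs(left**2 + right**2 - 1.0) < 1e-9
```

This conveys left/right position only; HRTF-based methods additionally filter the signal per ear to convey elevation and front/back cues.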

26
Audio Basics
  • Sound has two fundamental physical properties
  • Frequency (the pitch of the wave: oscillations
    per second, in Hertz)
  • Amplitude (the loudness or strength of the wave,
    in decibels)

27
Sampling
  • A sound wave is sampled
  • measurements of amplitude taken at a fast rate
  • results in a stream of numbers

28
Data Rates for Sound
  • Human ear can hear frequencies between ?? and ??.
  • Must sample at twice the highest frequency.
  • Assume stereo (two channels)
  • Assume 44.1 kHz sampling rate (CD sampling rate)
  • Assume 2 bytes per channel per sample
  • How much raw data is required to record 3 minutes
    of music?
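The arithmetic for the question above, as a sketch:

```python
sample_rate = 44_100       # samples per second per channel (CD rate)
channels = 2               # stereo
bytes_per_sample = 2       # 16 bits per channel per sample
seconds = 3 * 60           # 3 minutes

raw_bytes = sample_rate * channels * bytes_per_sample * seconds
print(raw_bytes)                    # 31752000 bytes
print(raw_bytes / (1024 * 1024))    # roughly 30 MiB for 3 minutes
```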

29
Waveform Sampling: Quantization
  • Quantization introduces noise
  • Examples: 16, 12, 8, 6, 4 bit music
  • 16, 12, 8, 6, 4 bit speech
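A sketch of where that noise comes from (illustrative Python, not from the slides): re-quantize a sample to fewer bits and measure the error:

```python
def quantize(sample, bits):
    """Round a sample in [-1.0, 1.0] to the nearest of 2**bits
    levels and convert back; the difference from the input is the
    quantization noise."""
    levels = 2 ** (bits - 1)
    return round(sample * levels) / levels

x = 0.3333
for bits in (16, 8, 4):
    q = quantize(x, bits)
    print(bits, q, abs(x - q))   # the error grows as bit depth falls
```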

30
Limits of Human Hearing
  • Time and Frequency
  • Events longer than 0.03 seconds are resolvable in
    time
  • shorter events are perceived as features in
    frequency
  • 20 Hz < Human Hearing < 20 kHz
  • (for those under 15 or so)
  • Pitch is PERCEPTION related to FREQUENCY
  • Human Pitch Resolution is about 40 - 4000
    Hz.

31
Limits of Human Hearing
  • Amplitude or Power???
  • Loudness is PERCEPTION related to POWER,
    not AMPLITUDE
  • Power is proportional to (integrated) square of
    signal
  • Human Loudness perception range is about 120 dB,
  • where 10 dB = 10x the power (and 20 dB = 10x
    the amplitude)
  • Waveform shape is of little consequence. Energy
    at each frequency, and how that changes in
    time, is the most important feature of a sound.
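The decibel relations above can be checked directly (a sketch, not from the slides):

```python
import math

def db_from_power_ratio(power_ratio):
    """Decibels are defined on power: dB = 10 * log10(P1 / P0)."""
    return 10.0 * math.log10(power_ratio)

def db_from_amplitude_ratio(amplitude_ratio):
    """Power goes as amplitude squared, hence the factor of 20."""
    return 20.0 * math.log10(amplitude_ratio)

print(db_from_power_ratio(10))       # 10x the power     -> 10 dB
print(db_from_amplitude_ratio(10))   # 10x the amplitude -> 20 dB
```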

32
Limits of Human Hearing
  • Waveshape or Frequency Content??
  • Here are two waveforms with identical power
    spectra, and which are (nearly) perceptually
    identical
  • (Figure: Wave 1 and Wave 2, with their shared
    magnitude spectrum)

33
Limits of Human Hearing
  • Masking in Amplitude, Time, and Frequency
  • Masking in Amplitude: Loud sounds mask soft
    ones.
  • Example: Quantization Noise
  • Masking in Time: A soft sound just before a
    louder sound is more likely to be heard than if
    it is just after.
  • Example (and reason): Reverb vs. Preverb
  • Masking in Frequency: A loud neighboring
    frequency masks soft spectral components. Low
    sounds mask higher ones more than high masks low.

34
Limits of Human Hearing
  • Masking in Amplitude
  • Intuitively, a soft sound will not be heard if
    there is a competing loud sound. Reasons:
  • Gain controls in the ear
  • (stapedius reflex, and more)
  • Interaction (inhibition) in the cochlea
  • Other mechanisms at higher levels

35
Limits of Human Hearing
  • Masking in Time
  • In the time range of a few milliseconds
  • A soft event following a louder event tends to be
    grouped perceptually as part of that louder event
  • If the soft event precedes the louder event, it
    might be heard as a separate event (become
    audible)

36
Limits of Human Hearing
  • Masking in Frequency
  • Only one component in this spectrum is audible,
    because of frequency masking

37
Sampling Rates
  • For Cheap Compression, Look at Lowering the
    Sampling Rate First
  • 44.1 kHz, 16 bit: CD Quality
  • 8 kHz, 8 bit µ-law: Phone Quality
  • Examples:
  • Music: 44.1, 32, 22.05, 16, 11.025 kHz
  • Speech: 44.1, 32, 22.05, 16, 11.025, 8 kHz
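A sketch of the cheapest version of this (illustrative, not a production resampler):

```python
def downsample(samples, factor):
    """Naive decimation: keep every factor-th sample. This cuts the
    data rate by the factor, but a real resampler must low-pass
    filter first, or content above the new Nyquist limit aliases."""
    return samples[::factor]

one_second_cd = list(range(44_100))        # one second at 44.1 kHz
half_rate = downsample(one_second_cd, 2)   # ~22.05 kHz, half the bytes
assert len(half_rate) == 22_050
```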

38
Views of Digital Sound
  • Two (mainstream) views of sound and their
    implications for compression
  • 1) Sound is Perceived
  • The auditory system doesn't hear everything
    present
  • Bandwidth is limited
  • Time resolution is limited
  • Masking in all domains
  • 2) Sound is Produced
  • Perfect model could provide perfect compression

39
Production Models
  • Build a model of the sound production system,
    then fit the parameters
  • Example: If the signal is speech, then a
    well-parameterized vocal model can yield the
    highest quality and compression ratio
  • Benefits: Highest possible compression
  • Drawbacks: Signal source(s) must be assumed,
    known, or identified

40
MIDI and Other Event Models
  • Musical Instrument Digital Interface
  • Represents Music as Notes and Events
  • and uses a synthesis engine to render it.
  • An Edit Decision List (EDL) is another example.
  • A history of source materials, transformations,
    and processing steps is kept. Operations can be
    undone or recreated easily. Intermediate
    non-parametric files are not saved.
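To make the compactness concrete, here is a sketch of the raw bytes of a MIDI channel-voice note event (the byte layout is the standard MIDI framing; the helper names are ours):

```python
def note_on(channel, note, velocity):
    """A MIDI Note On message: status byte 0x90 | channel, then the
    note number and velocity, each 7 bits."""
    return bytes([0x90 | (channel & 0x0F), note & 0x7F, velocity & 0x7F])

def note_off(channel, note):
    """Note Off: status byte 0x80 | channel, note number, velocity 0."""
    return bytes([0x80 | (channel & 0x0F), note & 0x7F, 0])

# Middle C (note 60): three bytes per event, versus ~176,400 bytes
# per second of raw 16-bit stereo CD audio.
print(note_on(0, 60, 64).hex())   # 903c40
```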

41
Event Based Compression
  • A Musical Score is a very compact representation
    of music
  • Benefits
  • Highest possible compression
  • Drawbacks
  • Cannot guarantee the performance
  • Cannot assure the quality of the sounds
  • Cannot make arbitrary sounds

42
Event Based Compression
  • Enter General MIDI
  • Guarantees a base set of instrument sounds,
  • and a means for addressing them,
  • but doesn't guarantee any quality
  • Better Yet: Downloadable Sounds
  • Download samples for instruments
  • Benefits: Does more to guarantee quality
  • Drawbacks: Samples aren't reality

43
Event Based Compression
  • Downloadable Algorithms
  • Specify the algorithm, the synthesis engine runs
    it, and we just send parameter changes
  • Part of Structured Audio (MPEG4)
  • Benefits
  • Can upgrade algorithms later
  • Can implement scalable synthesis
  • Drawbacks
  • Different algorithm for each class of sounds
    (but can always fall back on samples)

44
Compressed Audio Formats
Name                  Extension    Ownership
AIFF (Mac)            .aif, .aiff  Public
AU (Sun/NeXT)         .au          Public
CD audio (CDDA)       N/A          Public
MP3                   .mp3         MPEG Audio Layer-III
Windows Media Audio   .wma         Proprietary (Microsoft)
QuickTime             .qt          Proprietary (Apple)
RealAudio             .ra, .ram    Proprietary (Real Networks)
WAV                   .wav         Public
45
To be continued
  • Stop here
  • Sound Group Technical Presentations.
  • Suggested Topics
  • Compression
  • Controlling the Environment
  • ToolKit I features
  • ToolKit II features
  • Examples and Demos

46
Environmental Effects
  • Obstruction/Occlusion
  • Reverberation
  • Doppler Shift
  • Atmospheric Effects

47
Obstruction
  • Same as sound shadowing
  • Generally approximated by a ray test and a low
    pass filter
  • High frequencies should get shadowed while low
    frequencies diffract
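A sketch of that approximation (illustrative, not any specific engine's filter): a one-pole low-pass filter that passes slowly-varying (low-frequency) content while suppressing rapid (high-frequency) oscillation:

```python
def lowpass(samples, alpha):
    """One-pole low-pass filter: y[n] = y[n-1] + alpha * (x[n] - y[n-1]).
    Smaller alpha means heavier muffling (stronger obstruction)."""
    out, y = [], 0.0
    for x in samples:
        y += alpha * (x - y)
        out.append(y)
    return out

# A rapidly alternating (high-frequency) signal is nearly wiped out,
# while a constant (low-frequency) signal passes through.
high = lowpass([1.0, -1.0] * 8, alpha=0.1)
low = lowpass([1.0] * 16, alpha=0.1)
assert abs(high[-1]) < 0.2 < low[-1]
```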

48
Obstruction
49
Occlusion
  • A completely blocked sound
  • Example: A sound that penetrates a closed door
    or a wall
  • The sound will be muffled (low-pass filtered)

50
Reverberation
  • Effects from sound reflection
  • Similar to echo
  • Static reverberation
  • Dynamic reverberation

51
Static Reverberation
  • Relies on the closed container assumption
  • Parameters are used to specify approximate
    environment conditions (decay, room size, etc.)
  • Example: Microsoft DirectSound3D EAX
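One classic building block behind such parametric reverbs is the feedback comb filter (a sketch of the Schroeder approach; the parameter values below are arbitrary):

```python
def comb_reverb(dry, delay_samples, feedback):
    """Feedback comb filter: each output echoes the output from
    delay_samples earlier, scaled by feedback (< 1, so the tail
    decays). It is the core of Schroeder-style reverberators."""
    buf = [0.0] * delay_samples
    out = []
    for i, x in enumerate(dry):
        y = x + feedback * buf[i % delay_samples]
        buf[i % delay_samples] = y
        out.append(y)
    return out

# An impulse yields a train of echoes, each half as loud as the last.
tail = comb_reverb([1.0] + [0.0] * 30, delay_samples=10, feedback=0.5)
print(tail[0], tail[10], tail[20])   # 1.0 0.5 0.25
```

Real implementations run several comb filters in parallel (with mutually prime delays) plus allpass filters in series, with the delays and feedback derived from the room-size and decay parameters.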

52
Static Reverberation
53
Dynamic Reverberation
  • Calculation of reflections off of surfaces,
    taking into account surface properties
  • Diffusion and diffraction are typically ignored
  • Wave Tracing
  • Example: Aureal A3D 2.0

54
Dynamic Reverberation
55
Comparison
  • Static Reverberation: less expensive
    computationally, simple to implement
  • Dynamic Reverberation: very expensive
    computationally, difficult to implement, but
    potentially superior results

56
Doppler Shift
  • Change in frequency due to velocity
  • Very susceptible to temporal aliasing
  • The faster the update rate the better
  • Requires dedicated hardware
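The shift itself is simple to compute (a sketch for a stationary listener; the speed of sound is an assumed 343 m/s):

```python
SPEED_OF_SOUND = 343.0   # m/s in air (assumed)

def doppler_frequency(freq_hz, source_speed_mps):
    """Frequency heard by a stationary listener from a source moving
    toward them (positive speed) or away (negative speed):
    f' = f * c / (c - v)."""
    return freq_hz * SPEED_OF_SOUND / (SPEED_OF_SOUND - source_speed_mps)

# A 440 Hz engine approaching at 30 m/s sounds noticeably sharper.
print(round(doppler_frequency(440, 30)))   # 482
```

The temporal-aliasing issue arises because a game recomputes this per frame: if positions update too slowly, the pitch jumps in audible steps instead of gliding.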

57
Atmospheric Effects
  • Attenuate high frequencies faster than low
    frequencies
  • Moisture in air increases this effect