Audio - PowerPoint PPT Presentation


Provided by: Hao65
Transcript and Presenter's Notes

Title: Audio


1
Audio
  • Hao Jiang
  • Computer Science Department
  • Boston College
  • Oct. 11, 2007

2
Digital Audio
  • Audio comes from different sources:
      • Speech.
      • Sounds of instruments, music.
      • Sounds of all other kinds (the sound of wind,
        trains and the ocean).
  • Audio needs new methods for coding and
    processing.
  • Audio processing is a key task in multimedia
    systems:
      • Audio coding (MPEG audio, MP3, AAC and others)
      • Authoring and representation (composition)
      • Analysis and searching (retrieval and database)
      • 3D sound, etc.
  • We will focus on basic audio processing, MPEG
    audio and related topics.

3
Audio Processing
  • Audio authoring

Audio file formats: waveform files and MIDI.
MIDI stands for Musical Instrument Digital
Interface. Instead of storing waveform samples,
a MIDI file stores a sequence of commands that
tell an audio device to generate a specified
note with given properties.
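
To make the idea of "commands instead of samples" concrete, here is a minimal Python sketch (the slides themselves use Matlab) that builds the raw bytes of two common MIDI channel messages, Note On and Note Off. The function names are hypothetical, and the sketch only covers the message bytes; a real MIDI file also stores timing information and file headers, which are left out here.

```python
def note_on(channel, note, velocity):
    """Build a 3-byte MIDI Note On message: status byte, note number, velocity."""
    if not (0 <= channel <= 15 and 0 <= note <= 127 and 0 <= velocity <= 127):
        raise ValueError("channel must be 0-15; note and velocity must be 0-127")
    # Status byte: upper nibble 0x9 means Note On, lower nibble is the channel.
    return bytes([0x90 | channel, note, velocity])


def note_off(channel, note):
    """Build a 3-byte MIDI Note Off message with release velocity 0."""
    if not (0 <= channel <= 15 and 0 <= note <= 127):
        raise ValueError("channel must be 0-15; note must be 0-127")
    # Status byte: upper nibble 0x8 means Note Off.
    return bytes([0x80 | channel, note, 0])


# Middle C (note number 60) on the first channel at moderate loudness,
# followed by the command that releases it.
start = note_on(0, 60, 64)
stop = note_off(0, 60)
```

Six bytes describe the whole note here, whereas one second of 16-bit mono waveform data at 44.1 kHz takes 88200 bytes.
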
4
Audio Processing Using Matlab
  • To load a wave file in Windows:
  • audat = wavread('filename.wav')
  • Or, directly open the file and load a stream
    of words (2 bytes) or bytes, depending on the
    wav format.
  • To play a sound, use sound(audat, samplingrate).
  • To display the spectrogram, use specgram.
  • Audio analysis is done in frames 20 ms to 40 ms
    long.
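
As a rough illustration of the load-then-frame workflow, the following Python sketch reads a 16-bit mono WAV file with the standard `wave` module and cuts it into 20 ms frames. It is a stand-in for the Matlab calls on the slide, not a translation of them. The helper names are hypothetical, frames are assumed not to overlap, and `make_test_wav` exists only to produce test data in memory.

```python
import io
import math
import struct
import wave


def read_wav_mono16(fileobj):
    """Return (samples, sampling_rate) for a 16-bit mono PCM WAV file."""
    with wave.open(fileobj, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("this sketch handles 16-bit mono PCM only")
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())
    # Each sample is one little-endian signed 16-bit word (2 bytes).
    samples = list(struct.unpack("<%dh" % (len(raw) // 2), raw))
    return samples, rate


def split_into_frames(samples, rate, frame_ms=20):
    """Split samples into consecutive, non-overlapping frames of frame_ms ms."""
    frame_len = rate * frame_ms // 1000
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def make_test_wav(rate=8000, num_samples=800, freq=440.0):
    """Write a short sine tone to an in-memory WAV file (test data only)."""
    values = [int(10000 * math.sin(2 * math.pi * freq * t / rate))
              for t in range(num_samples)]
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % num_samples, *values))
    buf.seek(0)
    return buf


samples, rate = read_wav_mono16(make_test_wav())
frames = split_into_frames(samples, rate, frame_ms=20)
```

At 8 kHz a 20 ms frame holds 160 samples, so the 800-sample test tone yields five frames.
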

5
Frequency Domain Analysis
  • The Fourier transform can be used to decompose a
    signal into a sum of sinusoidal waves.
  • In Matlab, we can use fft (Fast Fourier
    Transform) for frequency domain analysis.

[Figure: the time-domain waveform with period T (base frequency = 1/T) and its frequency-domain components.]
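
The relation between the period T and the base frequency 1/T can be checked numerically. The Python sketch below uses a naive discrete Fourier transform written out in full so that it needs no libraries; an FFT computes the same values faster. The function names and the test signal are assumptions made for illustration.

```python
import cmath
import math


def dft(signal):
    """Naive O(N^2) discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n)
                for t in range(n))
            for k in range(n)]


def dominant_frequency(signal, rate):
    """Return the frequency in Hz of the strongest bin below half the rate."""
    spectrum = dft(signal)
    half = len(signal) // 2
    peak_bin = max(range(1, half), key=lambda k: abs(spectrum[k]))
    return peak_bin * rate / len(signal)


rate = 8000     # sampling rate in Hz
period = 16     # T in samples, so the base frequency is rate / period
tone = [math.sin(2 * math.pi * t / period) for t in range(128)]
peak_hz = dominant_frequency(tone, rate)
```

With these numbers the base frequency is 8000 / 16 = 500 Hz, and the spectrum has its single peak in the bin that corresponds to 500 Hz.
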
6
MP3 and Others
  • MPEG (Moving Picture Experts Group) and ISO
    (International Organization for Standardization)
    have published several standards for digital
    audio coding:
      • MPEG-1 Layers 1, 2 and 3 (MP3)
      • MPEG-2 AAC
      • MPEG-4 AAC and TwinVQ
  • Other standards:
      • Dolby AC3
  • They are widely used in consumer electronics,
    digital audio broadcasting, DVDs and movies.

7
Perceptual Coding in MPEG
[Figure: block diagrams of a perceptual coder. Labels: audio, FFT, masking threshold, dynamic bit allocation, encoder, MUX, bit stream.]
8
Simultaneous Masking
  • A strong audio component can mask its nearby
    frequency components.

[Figure: sound pressure level (dB) versus frequency (20 Hz to 20000 Hz), showing a masker, the masking threshold and the threshold in quiet.]
9
Masking and Quantization
[Figure: sound pressure level (dB) versus frequency (20 Hz to 20000 Hz) for critical band A and a neighboring critical band. Labels: masker, signal-to-mask ratio, minimum masking threshold for band A, m-bit quantizer SNR, m1-bit quantizer SNR.]
A critical band defines the resolution of
hearing at a given frequency location.
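
The link between quantizer SNR and the signal-to-mask ratio can be sketched with the common rule of thumb that each bit of a uniform quantizer adds about 6.02 dB of SNR. The Python sketch below picks the smallest bit count whose quantization noise stays under the masking threshold of a band. This is a simplified illustration under that assumption, with hypothetical function names; it is not the bit allocation procedure of the MPEG standard.

```python
import math

DB_PER_BIT = 20 * math.log10(2)   # about 6.02 dB of SNR per quantizer bit


def quantizer_snr_db(bits):
    """Approximate SNR of a uniform quantizer with the given number of bits."""
    return DB_PER_BIT * bits


def bits_to_hide_noise(signal_to_mask_ratio_db):
    """Smallest bit count whose quantizer SNR reaches the signal-to-mask ratio."""
    if signal_to_mask_ratio_db <= 0:
        # The signal is already below the masking threshold: no bits needed.
        return 0
    return math.ceil(signal_to_mask_ratio_db / DB_PER_BIT)


# A band whose signal sits 20 dB above its minimum masking threshold.
needed = bits_to_hide_noise(20.0)
```

For a 20 dB signal-to-mask ratio, 3 bits give roughly 18 dB of SNR, which leaves audible noise, while 4 bits give roughly 24 dB, which pushes the noise below the threshold.
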
10
Temporal Masking
[Figure: amplitude versus time, showing the pre-masking curve and the post-masking curve.]
11
MPEG Perceptual Model
  • A Matlab demo.

12
MPEG Audio Layer 1
  • MPEG-1 and MPEG-2 audio support sampling rates of
    44.1, 48, 32, 22.05, 24 and 16 kHz.
  • MPEG filters the input audio into 32 bands.

[Figure: Layer 1 pipeline. 384 audio samples are filtered and downsampled into bands of 12 samples each, normalized by a scale factor, and passed to the perceptual coder.]
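
The figures on this slide fit together as 32 bands x 12 samples = 384 samples per frame. The Python sketch below works out the frame duration for a given sampling rate and shows one simple way to normalize a 12-sample block by a scale factor. Using the block's peak magnitude as the scale factor is an assumption made for illustration; the standard chooses scale factors from a fixed table.

```python
SUBBANDS = 32
SAMPLES_PER_SUBBAND = 12
FRAME_SAMPLES = SUBBANDS * SAMPLES_PER_SUBBAND   # 384 samples per Layer 1 frame


def frame_duration_ms(sampling_rate_hz):
    """Duration of one 384-sample frame in milliseconds."""
    return 1000.0 * FRAME_SAMPLES / sampling_rate_hz


def normalize_block(block):
    """Divide a block of subband samples by its peak magnitude.

    Returns (scale_factor, normalized_block); normalized values lie in [-1, 1].
    """
    scale = max(abs(x) for x in block)
    if scale == 0:
        return 0.0, [0.0] * len(block)
    return scale, [x / scale for x in block]


scale, normalized = normalize_block([0.5, -2.0, 1.0])
```

At 48 kHz a frame lasts exactly 8 ms; at 44.1 kHz it lasts about 8.7 ms.
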
13
MPEG Audio Layer 2
  • Layer 2 is very similar to Layer 1, but groups
    three blocks of 12 samples together in coding.
  • It also improves the scale factor quantization
    and groups 3 audio samples together in bit
    assignment.

[Figure: Layer 2 pipeline. 3x384 audio samples are filtered and downsampled into bands of 36 samples each, normalized by a scale factor, and passed to the perceptual coder.]
14
Overlapped Transform and MDCT
[Figure: overlapping windows 1 to 4, each 2N samples long. In the overlapped transform, 2N samples are transformed to N elements. In the reverse transform, the outputs for windows 1 to 4 are overlapped to give the reconstructed result.]
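
The overlapped transform can be tried out with a direct, unoptimized MDCT in Python. The sketch below maps each block of 2N samples to N coefficients, inverts each block, and overlap-adds the results with a hop of N samples. The sine window, the 2/N scaling of the inverse, and all function names are choices made for this sketch; they are not taken from the slides or from the course code.

```python
import math


def sine_window(n_half):
    """Window of length 2N that satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * n_half))
            for n in range(2 * n_half)]


def mdct(block):
    """Transform 2N samples into N coefficients."""
    n_half = len(block) // 2
    return [sum(block[n] * math.cos(math.pi / n_half
                                    * (n + 0.5 + n_half / 2) * (k + 0.5))
                for n in range(2 * n_half))
            for k in range(n_half)]


def imdct(coeffs):
    """Map N coefficients back to 2N time-aliased samples."""
    n_half = len(coeffs)
    return [2.0 / n_half
            * sum(coeffs[k] * math.cos(math.pi / n_half
                                       * (n + 0.5 + n_half / 2) * (k + 0.5))
                  for k in range(n_half))
            for n in range(2 * n_half)]


def analyze_and_rebuild(signal, n_half):
    """Windowed MDCT on blocks with hop N, then overlap-add of the inverses."""
    window = sine_window(n_half)
    # Pad with N zeros on each side so every signal sample lies in two blocks,
    # then pad up to a multiple of N.
    padded = [0.0] * n_half + list(signal) + [0.0] * n_half
    padded += [0.0] * (-len(padded) % n_half)
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * n_half + 1, n_half):
        block = [padded[start + n] * window[n] for n in range(2 * n_half)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * n_half):
            # The aliasing in neighboring blocks cancels when they are added.
            out[start + n] += rebuilt[n] * window[n]
    return out[n_half:n_half + len(signal)]


signal = [math.sin(0.3 * t) + 0.5 * math.cos(1.1 * t) for t in range(40)]
rebuilt = analyze_and_rebuild(signal, n_half=8)
max_error = max(abs(a - b) for a, b in zip(signal, rebuilt))
```

A single inverse block does not match its input, because 2N samples cannot be recovered from N coefficients. The original signal appears only after adjacent blocks are overlapped and added, up to floating-point rounding.
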
15
Some Matlab Code
  • The program compares the DCT and the MDCT in
    audio processing.
  • The code is available on the course website as a
    tarball, mdct_and_dct.tar.

16
MP3
  • MP3 (MPEG audio Layer 3) is another layer built
    on top of MPEG audio Layer 2.
  • MP3 additionally applies an MDCT to each band and
    encodes the MDCT coefficients.
  • MP3 then uses Huffman coding to further compress
    the bit stream losslessly.
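
The lossless step can be illustrated with a small Huffman coder in Python. This sketch builds a code from the symbol counts of the data it is given, which keeps the example self-contained; MP3 itself selects from predefined code tables instead. The function names and the sample data, which stand in for quantized coefficients, are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): "0"}
    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        w0, _, low = heapq.heappop(heap)
        w1, _, high = heapq.heappop(heap)
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        merged = {s: "0" + c for s, c in low.items()}
        merged.update({s: "1" + c for s, c in high.items()})
        heapq.heappush(heap, (w0 + w1, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(symbols, code):
    """Concatenate the bit strings of all symbols."""
    return "".join(code[s] for s in symbols)


def decode(bits, code):
    """Recover the symbols by matching prefixes of the bit string."""
    inverse = {c: s for s, c in code.items()}
    out, current = [], ""
    for bit in bits:
        current += bit
        if current in inverse:
            out.append(inverse[current])
            current = ""
    return out


# Quantized coefficients are typically dominated by small values such as 0.
data = [0, 0, 0, 0, 0, 0, 1, 1, -1, 2]
code = huffman_code(data)
bits = encode(data, code)
```

With four distinct symbols a fixed-length code needs 2 bits per symbol, or 20 bits for this data; the Huffman code needs 16 bits and decodes back to the same sequence.
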

17
File Format
MPEG audio puts a header in each frame, so
that frames can be decoded separately.
Frame 1: Header | CRC | Bit Allocation | Scale factors | Subband Data
Frame 2: Header | CRC | Bit Allocation | Scale factors | Subband Data
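
Because every frame carries its own header, a decoder can start at any frame. The Python sketch below decodes a few fields from the 4-byte header at the start of a frame. The bit layout used here (11 sync bits, then version, layer, a protection bit, the bitrate index and the sampling rate index) is stated from memory of the MPEG audio header format and should be checked against the standard before being relied on; only a subset of fields is decoded, and the function name is hypothetical.

```python
VERSIONS = {0b11: "MPEG-1", 0b10: "MPEG-2"}
LAYERS = {0b11: 1, 0b10: 2, 0b01: 3}
SAMPLE_RATES = {
    "MPEG-1": (44100, 48000, 32000),
    "MPEG-2": (22050, 24000, 16000),
}


def parse_header(header_bytes):
    """Decode a few fields from the first 4 bytes of an MPEG audio frame."""
    if len(header_bytes) < 4:
        raise ValueError("a frame header is 4 bytes long")
    word = int.from_bytes(header_bytes[:4], "big")
    if word >> 21 != 0x7FF:
        raise ValueError("missing frame sync")
    version = VERSIONS.get((word >> 19) & 0b11)
    layer = LAYERS.get((word >> 17) & 0b11)
    rate_index = (word >> 10) & 0b11
    if version is None or layer is None or rate_index == 3:
        raise ValueError("reserved or unsupported field value")
    return {
        "version": version,
        "layer": layer,
        # A protection bit of 0 means a CRC follows the header.
        "has_crc": ((word >> 16) & 1) == 0,
        "sampling_rate": SAMPLE_RATES[version][rate_index],
    }


# A header as commonly seen at the start of MP3 frames.
info = parse_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
```

The example header decodes as MPEG-1, Layer 3, no CRC, 44100 Hz.
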
18
Other Audio Coding Standards
  • MPEG-2 and MPEG-4 AAC (Advanced Audio Coding)
      • Not backward compatible
      • Uses MDCT without bandpass filtering
  • Dolby AC3
      • MDCT-based codec
      • Similar to MPEG AAC, but uses a different
        quantization and coding scheme
      • A de facto standard for DVD and digital audio
        in movies.

19
Realtime Audio Systems
[Figure: an audio I/O process and an audio processing unit connected by an audio input circular queue and an audio output circular queue, each with a write pointer and a read pointer.]
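
A circular queue with separate read and write pointers can be sketched in Python as follows. This is a single-threaded illustration with hypothetical class and function names. A real-time system would run the I/O process and the processing unit concurrently and would need synchronization, which is omitted here.

```python
class CircularQueue:
    """Fixed-size ring buffer with separate read and write pointers."""

    def __init__(self, capacity):
        self._buf = [0] * capacity
        self._capacity = capacity
        self._read = 0      # index of the next sample to read
        self._write = 0     # index of the next free slot
        self._count = 0     # number of samples currently stored

    def free(self):
        """Number of samples that can still be written."""
        return self._capacity - self._count

    def write(self, samples):
        """Store as many samples as fit; return how many were accepted."""
        accepted = 0
        for sample in samples:
            if self._count == self._capacity:
                break
            self._buf[self._write] = sample
            self._write = (self._write + 1) % self._capacity   # wrap around
            self._count += 1
            accepted += 1
        return accepted

    def read(self, max_samples):
        """Remove and return up to max_samples samples in arrival order."""
        out = []
        while self._count > 0 and len(out) < max_samples:
            out.append(self._buf[self._read])
            self._read = (self._read + 1) % self._capacity     # wrap around
            self._count -= 1
        return out


def pump(input_queue, output_queue, process, chunk=2):
    """Move one chunk from the input queue through process to the output queue."""
    chunk = min(chunk, output_queue.free())
    block = input_queue.read(chunk)
    output_queue.write([process(s) for s in block])
    return len(block)


audio_in = CircularQueue(8)
audio_out = CircularQueue(8)
audio_in.write([1, 2, 3])
moved = pump(audio_in, audio_out, lambda s: s * 2, chunk=2)
```

The I/O side only ever advances the write pointer of the input queue and the read pointer of the output queue, while the processing unit does the opposite, so the two sides do not touch the same pointer.
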