1
Audio Compression Techniques
  • MUMT 611, January 2005
  • Assignment 2
  • Paul Kolesnik

2
Introduction
  • Digital Audio Compression
  • Removal of redundant or otherwise irrelevant
    information from audio signal
  • Audio compression algorithms are often referred
    to as audio encoders
  • Applications
  • Reduces required storage space
  • Reduces required transmission bandwidth

3
Audio Compression
  • Audio signal overview
  • Sampling rate (number of samples per second)
  • Bit rate (number of bits per second). Typically,
    uncompressed 16-bit stereo audio at 44.1 kHz has a
    bit rate of about 1.4 Mbit/s (worked out below)
  • Number of channels (mono / stereo / multichannel)
  • Reduction by lowering those values or by data
    compression / encoding
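  • As a quick check of the figure above, the
    uncompressed PCM bit rate follows directly from the
    three parameters on this slide (Python, for
    illustration):

    # Uncompressed PCM bit rate for 16-bit stereo audio at 44.1 kHz
    sample_rate = 44100        # samples per second
    bits_per_sample = 16
    channels = 2               # stereo

    bit_rate = sample_rate * bits_per_sample * channels
    print(bit_rate)            # 1411200 bits per second
    print(bit_rate / 1e6)      # about 1.41 Mbit/s, the figure quoted above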

4
Audio Data Compression
  • Redundant information
  • Implicit in the remaining information
  • Ex. oversampled audio signal
  • Irrelevant information
  • Perceptually insignificant
  • Cannot be recovered from remaining information

5
Audio Data Compression
  • Lossless Audio Compression
  • Removes redundant data
  • Resulting signal is the same as the original
    (perfect reconstruction)
  • Lossy Audio Encoding
  • Removes irrelevant data
  • Resulting signal is similar to original

6
Audio Data Compression
  • Audio vs. Speech Compression Techniques
  • Speech Compression uses a human vocal tract model
    to compress signals
  • Audio Compression does not use this technique due
    to larger variety of possible signal variations

7
Generic Audio Encoder
8
Generic Audio Encoder
  • Psychoacoustic Model
  • Psychoacoustics - the study of how sounds are
    perceived by humans
  • Uses perceptual coding
  • Eliminates information in the audio signal that is
    inaudible to the human ear
  • Detects conditions under which different audio
    signal components mask each other

9
Psychoacoustic Model
  • Signal Masking
  • Threshold cut-off
  • Spectral (Frequency / Simultaneous) Masking
  • Temporal Masking
  • Threshold cut-off and spectral masking occur in
    frequency domain, temporal masking occurs in time
    domain

10
Signal Masking
  • Threshold cut-off
  • Hearing threshold level is a function of frequency
  • Any frequency component below the threshold will
    not be perceived by the human ear (illustrated in
    the sketch below)
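  • To make the threshold curve concrete, below is a
    common closed-form approximation of the threshold
    in quiet (a Terhardt-style formula often cited in
    perceptual-coding tutorials, not taken from these
    slides); the frequencies and levels in the example
    are made up for illustration.

    import numpy as np

    def absolute_threshold_db(freq_hz):
        """Approximate threshold in quiet (dB SPL), Terhardt-style formula
        (an assumption for illustration, not part of the original slides)."""
        f = np.asarray(freq_hz, dtype=float) / 1000.0   # frequency in kHz
        return (3.64 * f**-0.8
                - 6.5 * np.exp(-0.6 * (f - 3.3)**2)
                + 1e-3 * f**4)

    # Keep only components whose level exceeds the threshold in quiet
    freqs = np.array([50.0, 1000.0, 4000.0, 16000.0])   # Hz
    levels_db = np.array([20.0, 5.0, -2.0, 10.0])       # hypothetical levels
    audible = levels_db > absolute_threshold_db(freqs)
    print(audible)   # components below the curve need not be coded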

11
Signal Masking
  • Spectral Masking
  • A frequency component can be partly or fully
    masked by another component that is close to it
    in frequency
  • This shifts the hearing threshold

12
Signal Masking
  • Temporal Masking
  • A quieter sound can be masked by a louder sound
    if they are temporally close
  • Sounds that occur (shortly) before and after a
    volume increase can both be masked

13
Spectral Analysis
  • Tasks of Spectral Analysis
  • To derive masking thresholds to determine which
    signal components can be eliminated
  • To generate a representation of the signal to
    which masking thresholds can be applied
  • Spectral Analysis is done through transforms or
    filter banks

14
Spectral Analysis
  • Transforms
  • Fast Fourier Transform (FFT)
  • Discrete Cosine Transform (DCT) - similar to FFT
    but uses cosine values only
  • Modified Discrete Cosine Transform (MDCT) - an
    overlapped and windowed version of the DCT, used by
    MPEG-1 Layer III, MPEG-2 AAC and Dolby AC-3
    (sketched below)
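  • A direct, unoptimized MDCT of a single block,
    following the textbook definition (block length,
    window and test signal are arbitrary illustrative
    choices, not the parameters of any particular
    codec):

    import numpy as np

    def mdct(block):
        """MDCT of one block of 2N time samples -> N frequency coefficients."""
        two_n = len(block)
        n_half = two_n // 2
        n = np.arange(two_n)
        k = np.arange(n_half)
        window = np.sin(np.pi / two_n * (n + 0.5))   # sine window
        x = block * window
        phase = np.pi / n_half * (n[None, :] + 0.5 + n_half / 2.0) * (k[:, None] + 0.5)
        return np.cos(phase) @ x

    # One block of 2N = 36 samples (in practice blocks overlap by 50%)
    signal = np.random.randn(1024)
    coeffs = mdct(signal[:36])
    print(coeffs.shape)   # (18,)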

15
Spectral Analysis
  • Filter Banks
  • Time sample blocks are passed through a set of
    bandpass filters
  • Masking thresholds are applied to resulting
    frequency subband signals
  • Polyphase and wavelet banks are the most popular
    filter structures

16
Filter Bank Structures
  • Polyphase Filter Bank - used in all of the MPEG-1
    encoders
  • Signal is separated into subbands, the widths of
    which are equal over the entire frequency range
  • The resulting subband signals are downsampled to
    create shorter signals (which are later
    reconstructed during the decoding process); a
    simplified version is sketched below
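  • A simplified uniform filter bank (plain Butterworth
    bandpass filters plus decimation) can illustrate
    equal-width subbands and downsampling; the real
    MPEG-1 polyphase bank uses 32 bands derived from a
    single prototype filter, which this sketch does not
    reproduce.

    import numpy as np
    from scipy.signal import butter, lfilter

    def uniform_filter_bank(x, fs, num_bands=4, order=6):
        """Split x into equal-width subbands, then downsample each subband
        by num_bands (a didactic stand-in for a polyphase filter bank)."""
        nyq = fs / 2.0
        width = nyq / num_bands
        subbands = []
        for b in range(num_bands):
            lo, hi = b * width, (b + 1) * width
            if b == 0:                                    # lowest band
                coeffs = butter(order, hi / nyq, btype='lowpass')
            elif b == num_bands - 1:                      # highest band
                coeffs = butter(order, lo / nyq, btype='highpass')
            else:
                coeffs = butter(order, [lo / nyq, hi / nyq], btype='bandpass')
            filtered = lfilter(*coeffs, x)
            subbands.append(filtered[::num_bands])        # downsampling
        return subbands

    fs = 44100
    x = np.random.randn(fs)            # one second of noise as a test signal
    bands = uniform_filter_bank(x, fs)
    print([len(b) for b in bands])     # each subband is fs/num_bands samples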

17
Filter Bank Structures
  • Wavelet Filter Bank - used by the Enhanced
    Perceptual Audio Coder (EPAC) by Lucent
  • Unlike the polyphase filter bank, the widths of the
    subbands are not uniform (narrower at lower
    frequencies, wider at higher frequencies)
  • This allows for better time resolution (ex. short
    attacks), but at expense of frequency resolution

18
Noise Allocation
  • System task - derive and apply the shifted hearing
    threshold to the input signal
  • Anything below the threshold doesn't need to be
    transmitted
  • Any noise below the threshold is irrelevant
  • Frequency component quantization
  • Tradeoff between space and noise
  • Encoder saves space by using just enough bits for
    each frequency component to keep the quantization
    noise under the threshold - this is known as noise
    allocation (see the toy allocation below)
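  • A toy bit-allocation loop, assuming per-subband
    signal levels and masking thresholds are already
    known and using the rough rule of thumb of about
    6 dB of signal-to-noise ratio per quantization bit;
    this only illustrates the space/noise tradeoff and
    is not any codec's actual allocation algorithm (all
    numbers are made up).

    import numpy as np

    def allocate_bits(signal_db, threshold_db, max_bits=16):
        """Give each subband just enough bits to keep quantization noise
        below its masking threshold, assuming ~6 dB of SNR per bit."""
        bits = []
        for s, t in zip(signal_db, threshold_db):
            snr_needed = s - t           # noise must stay this far below the signal
            b = int(np.ceil(max(snr_needed, 0.0) / 6.0))
            bits.append(min(b, max_bits))
        return bits

    # Hypothetical per-subband signal levels and masking thresholds (dB)
    signal_db    = [60, 48, 35, 20]
    threshold_db = [30, 40, 34, 25]
    print(allocate_bits(signal_db, threshold_db))   # [5, 2, 1, 0]; 0 bits = not transmitted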

19
Noise Allocation
  • Pre-echo
  • If a single audio block contains silence followed
    by a loud attack, a pre-echo error occurs - there
    will be audible noise in the silent part of the
    block after decoding
  • This is avoided by pre-monitoring the audio data at
    the encoding stage and splitting the audio into
    shorter blocks when a potential pre-echo is
    detected (see the detector sketch below)
  • This does not completely eliminate pre-echo, but
    can make it short enough to be masked by the
    attack (temporal masking)
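  • A crude way to pre-monitor for such transients is
    to compare the energy of the two halves of a block
    and switch to shorter blocks when the second half
    is much louder; the ratio threshold below is an
    arbitrary illustrative value, not one used by any
    standard.

    import numpy as np

    def needs_short_blocks(block, ratio_threshold=10.0):
        """Crude transient detector: flag a block whose second half carries far
        more energy than its first half (silence followed by a loud attack),
        so it can be coded as shorter blocks to limit pre-echo."""
        half = len(block) // 2
        energy_first = np.sum(block[:half] ** 2) + 1e-12   # avoid division by zero
        energy_second = np.sum(block[half:] ** 2)
        return energy_second / energy_first > ratio_threshold

    quiet = np.zeros(512)
    attack = np.concatenate([np.zeros(256), 0.8 * np.random.randn(256)])
    print(needs_short_blocks(quiet), needs_short_blocks(attack))   # False True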

20
Pre-echo Effect
21
Additional Encoding Techniques
  • Other encoding techniques are available (as
    alternatives or in combination)
  • Predictive Coding
  • Coupling / Delta Encoding
  • Huffman Encoding

22
Additional Encoding Techniques
  • Predictive Coding
  • Often used in speech and image compression
  • Estimates the expected value for each sample
    based on previous sample values
  • Transmits/stores the difference between the
    expected and received value
  • Generates an estimate for the next sample and
    then adjusts it by the difference stored for the
    current sample
  • Used for additional compression in MPEG-2 AAC (a
    simple first-order version is sketched below)
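  • A minimal first-order DPCM sketch of this idea:
    predict each sample as the previous one, store only
    the prediction error, and rebuild the signal by
    adding the stored differences back (this is not the
    actual MPEG-2 AAC predictor, just the simplest
    illustration).

    import numpy as np

    def dpcm_encode(samples):
        """Predict each sample as the previous sample and store the difference."""
        residual = np.empty_like(samples)
        prev = 0
        for i, s in enumerate(samples):
            residual[i] = s - prev     # transmit/store only the prediction error
            prev = s
        return residual

    def dpcm_decode(residual):
        """Rebuild each sample by adjusting the prediction with the stored error."""
        samples = np.empty_like(residual)
        prev = 0
        for i, r in enumerate(residual):
            samples[i] = prev + r
            prev = samples[i]
        return samples

    x = np.array([100, 102, 105, 104, 104], dtype=np.int32)
    assert np.array_equal(dpcm_decode(dpcm_encode(x)), x)   # lossless round trip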

23
Additional Encoding Techniques
  • Coupling / Delta encoding
  • Used in cases where audio signal consists of two
    or more channels (stereo or surround sound)
  • Similarities between channels are used for
    compression
  • A sum and a difference of the two channels are
    derived - the difference is usually close to zero
    and therefore requires less space to encode
  • This is a lossless encoding step (see the mid/side
    sketch below)
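  • A small mid/side (sum/difference) sketch for an
    integer stereo pair; the side channel stays near
    zero when the channels are similar, and the round
    trip is exact.

    import numpy as np

    def ms_encode(left, right):
        """Sum/difference coding of a stereo pair; the difference (side) channel
        is usually close to zero and cheaper to encode."""
        return left + right, left - right

    def ms_decode(mid, side):
        return (mid + side) // 2, (mid - side) // 2

    left  = np.array([1000, 1002,  998], dtype=np.int64)
    right = np.array([ 999, 1001,  997], dtype=np.int64)
    mid, side = ms_encode(left, right)
    print(side)                        # small values: [1 1 1]
    l2, r2 = ms_decode(mid, side)
    assert np.array_equal(l2, left) and np.array_equal(r2, right)   # lossless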

24
Additional Encoding Techniques
  • Huffman Coding
  • Information-theory-based technique
  • An element that recurs often in the signal is
    represented by a shorter symbol, and its value is
    stored in a look-up table
  • Implemented using look-up tables in the encoder and
    in the decoder
  • Provides substantial lossless compression, but
    requires high computational power and therefore
    is not very popular
  • Used by MPEG-1 and MPEG-2 AAC (a small example
    follows)
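  • A compact Huffman-table builder (the standard
    textbook construction using a heap, not any codec's
    specific tables): frequently recurring symbols
    receive shorter code words, and the resulting
    look-up table is shared by encoder and decoder.

    import heapq
    from collections import Counter

    def huffman_table(symbols):
        """Build a prefix-code look-up table: frequent symbols get shorter codes."""
        counts = Counter(symbols)
        if len(counts) == 1:                        # degenerate single-symbol case
            return {next(iter(counts)): "0"}
        # Heap items: (frequency, tie-breaker, {symbol: code-so-far})
        heap = [(freq, i, {sym: ""}) for i, (sym, freq) in enumerate(counts.items())]
        heapq.heapify(heap)
        counter = len(heap)
        while len(heap) > 1:
            f1, _, t1 = heapq.heappop(heap)         # two least frequent subtrees
            f2, _, t2 = heapq.heappop(heap)
            merged = {s: "0" + c for s, c in t1.items()}
            merged.update({s: "1" + c for s, c in t2.items()})
            heapq.heappush(heap, (f1 + f2, counter, merged))
            counter += 1
        return heap[0][2]

    data = "aaaaabbbccd"                            # 'a' recurs most often
    table = huffman_table(data)
    encoded = "".join(table[s] for s in data)
    print(table)                                    # 'a' gets the shortest code
    print(len(encoded), "bits vs", 8 * len(data), "bits uncompressed")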

25
Encoding - Final Stages
  • Audio data packed into frames
  • Frames stored or transmitted

26
Conclusion
  • HTML Bibliography
  • http://www.music.mcgill.ca/pkoles
  • Questions