Ogg Vorbis: Audio Compression Scheme presentation

1
Ogg Vorbis Audio Compression Scheme

2
Overview

3
Introduction

Vorbis is a lossy audio compression scheme
intended to replace other similar formats (MP3,
etc.). It uses a psychoacoustic model to
eliminate perceptually negligible information,
thereby decreasing file size.
The architecture is forward-adaptive, encouraging
future improvements to the encoding scheme, and
flexible in that a range of encoding algorithms
is possible.
Vorbis website: http://www.vorbis.com/

4
Introduction

5
Licensing and Availability

Ogg Vorbis is open source, unpatented and free.
There are no licensing fees for developers,
musicians, record labels, etc.
Encoders/decoders are available and compatible
with many popular media players on all major
platforms. Among these are the official command
line encoder, oggenc, and a graphical
encoder/decoder/player called OggDrop.
Software can be obtained here: http://www.vorbis.com/setup/

6
Audio quality metric

Ogg Vorbis audio quality is measured on a
subjective quality scale ranging from -1 to 10.
The default quality setting is 3, which the
developers claim sounds better than 128 kbps
MP3s while producing smaller files.
Although each quality rating corresponds to an
average bit rate, the encoder does not attempt to
achieve a specific ABR. Moreover, the developers
hope to avoid judging quality in terms of bit
rate, since bit rates correspond only loosely to
audio quality when differing compression
algorithms are in use.
This is demonstrated in comparisons between Vorbis
and MP3, MP3Pro, WMA, AAC and RealAudio, where
comparable average bit rates clearly correspond
to varying degrees of audio quality.
Comparisons can be found here: http://www.xiph.org/vorbis/listen.html

7
Audio quality metric

Additional specifications
Encodes using variable bit rates (VBR) or average
bit rates (ABR). ABR is used for streaming to
meet bandwidth requirements.
Supports up to 255 channels of audio.
Works with sample rates from 8 kHz to 192 kHz.
Can potentially support bit rate peeling. This
entails converting an already compressed file
to a lower quality without reintroducing (and
thereby compounding) the encoding artifacts
associated with the first conversion.

8
File Structure

The Vorbis I specification can be found here:
http://www.xiph.org/vorbis/doc/Vorbis_I_spec.html
The Vorbis bit stream specification consists of
four packet types, which occur consecutively. The
first three are headers. The header size is
unlimited.
1. The identification header: identifies the bit
stream as Vorbis and gives the version in use. It
includes audio characteristics required for
further interpretation, such as sampling rate and
number of channels.
2. The comment header: includes tags consisting of
user comments and a vendor string. Tags
themselves may be user-defined.

9
File Structure

3. The codec setup header: setup components
include modes, mappings, floors, residues
and codebooks, all of which have specific roles
in the decoding process.
Codebooks are required for decoding the audio
stream. For efficiency, audio is represented by
codewords derived using vector quantisation and
entropy coding (Huffman codes, represented as
binary trees). Encoding/decoding of individual
audio packets involves reading from the
appropriate codebook.
4. Audio packets.
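
As an illustration of the identification header, here is a minimal Python sketch that unpacks its fixed 30-byte layout as given in the Vorbis I specification (packet type 1, the string "vorbis", version, channel count, sample rate, three bit rate hints, two block size exponents and a framing bit). The function name and dictionary keys are made up for this example and do not belong to any library; the packet is assumed to be already extracted from its Ogg container.

```python
import struct

# Little-endian, no padding: type, "vorbis", version, channels, sample rate,
# max/nominal/min bit rate, packed block size exponents, framing flag.
_ID_HEADER = struct.Struct("<B6sIBIiiiBB")  # 30 bytes in total


def parse_identification_header(packet: bytes) -> dict:
    """Decode a Vorbis identification header packet (illustrative helper)."""
    if len(packet) < _ID_HEADER.size:
        raise ValueError("packet too short for an identification header")
    (ptype, magic, version, channels, rate,
     br_max, br_nominal, br_min, blocksizes, framing) = _ID_HEADER.unpack_from(packet)
    if ptype != 1 or magic != b"vorbis":
        raise ValueError("not a Vorbis identification header")
    if framing & 1 == 0:
        raise ValueError("framing bit not set")
    return {
        "version": version,
        "channels": channels,
        "sample_rate": rate,
        "bitrate_maximum": br_max,
        "bitrate_nominal": br_nominal,
        "bitrate_minimum": br_min,
        # Bits are packed least significant first, so the short block
        # exponent sits in the low nibble and the long one in the high nibble.
        "blocksize_0": 1 << (blocksizes & 0x0F),
        "blocksize_1": 1 << (blocksizes >> 4),
    }
```

A decoder would reject the stream if either check fails, since the later headers and the audio packets cannot be interpreted without these values.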

10
Encoding and Decoding

11
Encoding and Decoding

The decoding and synthesis process includes
several steps.
1. Decode packet type flag: first the decoder must
verify that a given packet contains audio data by
inspecting its type flag.
2. Decode mode number: the mode number indicates
the current frame size, window type, transform
type and mapping number.
The frame size is a power of 2 between 64 and
8192, and can be either short or long. Short
windows are used near attack transients in order
to limit artifacts associated with the MDCT
(pre-echo).
The window taper varies for long windows
depending on whether the previous and subsequent
frames are short or long.
The transform type is always type 0, the MDCT,
in Vorbis I.
The mapping number selects a mapping, which
contains a description of the channel coupling
scheme and a list of sub-maps that bundle sets
of channel vectors.
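
The way the window taper depends on neighbouring frames can be sketched in Python as follows. The window formula, sin(pi/2 * sin^2(x)), and the placement of the short taper inside a long window follow the Vorbis I specification; the function name, argument names and the tiny block sizes used in the usage note are assumptions chosen for illustration.

```python
import math


def vorbis_window(n, short_n, prev_short=False, next_short=False):
    """Window for a frame of n samples; short_n is the short block size.

    A long frame next to a short frame uses a short taper centred on the
    quarter point of the long frame, padded with zeros and ones.
    """
    if prev_short:
        left_start = n // 4 - short_n // 4
        left_end = n // 4 + short_n // 4
        left_n = short_n // 2
    else:
        left_start, left_end, left_n = 0, n // 2, n // 2
    if next_short:
        right_start = 3 * n // 4 - short_n // 4
        right_end = 3 * n // 4 + short_n // 4
        right_n = short_n // 2
    else:
        right_start, right_end, right_n = n // 2, n, n // 2

    window = [0.0] * n
    for i in range(left_start, left_end):          # rising taper
        phase = (i - left_start + 0.5) / left_n * math.pi / 2
        window[i] = math.sin(math.pi / 2 * math.sin(phase) ** 2)
    for i in range(left_end, right_start):         # flat top
        window[i] = 1.0
    for i in range(right_start, right_end):        # falling taper
        phase = (i - right_start + 0.5) / right_n * math.pi / 2 + math.pi / 2
        window[i] = math.sin(math.pi / 2 * math.sin(phase) ** 2)
    return window
```

For example, vorbis_window(16, 4) gives an ordinary symmetric long window, while vorbis_window(16, 4, prev_short=True) starts with zeros, rises over only two samples and then stays at one until the usual falling taper. In both cases the squared overlapping tapers of adjacent frames sum to one, which is what lets the overlap-add step cancel the MDCT aliasing.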

12
Encoding and Decoding

13
Encoding and Decoding

Floor type 1 uses a piecewise straight-line
representation to encode a spectral envelope
curve. The representation plots this curve
mechanically on a linear frequency axis and a
logarithmic (dB) amplitude axis. The integer
plotting algorithm used is similar to Bresenham's
algorithm.
5. Decode the residue: the residue is the fine
spectral detail that remains after the floor has
been removed from the audio spectrum (for a given
channel and frame) during encoding.
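
The integer line plotting used by floor type 1 can be sketched in Python as below. The arithmetic follows the line-rendering procedure given in the Vorbis I specification; the function name and the choice to return a list (rather than write into a preallocated vector) are assumptions made for this example. The y values are positions on the logarithmic amplitude axis, not linear amplitudes.

```python
def render_line(x0, y0, x1, y1):
    """Integer y values for x = x0 .. x1 - 1 along one floor segment."""
    dy = y1 - y0
    adx = x1 - x0
    if adx <= 0:
        raise ValueError("x1 must be greater than x0")
    # Whole-number slope per step, rounded toward zero.
    base = abs(dy) // adx
    if dy < 0:
        base = -base
    # Step taken when the accumulated error overflows.
    sy = base - 1 if dy < 0 else base + 1
    # Fractional part of the slope, expressed as a numerator over adx.
    ady = abs(dy) - abs(base) * adx

    err = 0
    y = y0
    points = [y]
    for _ in range(x0 + 1, x1):
        err += ady
        if err >= adx:
            err -= adx
            y += sy
        else:
            y += base
        points.append(y)
    return points
```

Because only integer additions and comparisons are involved, every decoder plots exactly the same curve; render_line(0, 0, 4, 2), for instance, yields [0, 0, 1, 1].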

14
Encoding and Decoding

6. Inverse channel coupling of residue vectors:
the bit rate is lowered during encoding by
eliminating redundancies between channels.
Two mechanisms exist for channel coupling:
channel interleaving via residue backend type 2,
and Cartesian to square polar mapping.
The inverse process is performed during decoding.
For encoder quality settings of six or greater,
channel coupling is lossless.
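
A minimal Python sketch of the square polar mapping follows. The decode rule is the one given in the Vorbis I specification for inverse coupling of a magnitude/angle pair. The encode function is an assumption: it is simply the inverse of that rule worked out for this example, not a description of what any particular encoder does, and both function names are hypothetical.

```python
def square_polar_decode(magnitude, angle):
    """Recover the two channel values from a coupled (magnitude, angle) pair."""
    if magnitude > 0:
        if angle > 0:
            return magnitude, magnitude - angle
        return magnitude + angle, magnitude
    if angle > 0:
        return magnitude, magnitude + angle
    return magnitude - angle, magnitude


def square_polar_encode(first, second):
    """Assumed encoder-side counterpart, derived by inverting the decode rule.

    The value with the larger absolute size is kept as the magnitude and
    the angle stores the (signed) difference between the channels.
    """
    if abs(first) > abs(second):
        magnitude = first
        angle = first - second if first > 0 else second - first
    else:
        magnitude = second
        angle = first - second if second > 0 else second - first
    return magnitude, angle
```

For integer residue values the round trip is exact, which is the lossless case mentioned above. When the two channels are similar, the angle values cluster near zero and are cheap to entropy code.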

15
Encoding and Decoding

7. The floor curve is generated from the decoded
floor data.
8. The floor and residue vectors are multiplied
element by element (the specification calls this
a dot product) to produce an audio spectrum vector.
9. The audio spectrum is converted back to the time
domain via the inverse MDCT.
10. The result of the transform is overlapped
and added together frame-by-frame to provide the
new audio stream.
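
The final synthesis steps can be sketched in Python as shown below: multiply floor and residue, apply the inverse MDCT, window the result and overlap-add it with the neighbouring frames. This is a slow, direct-form illustration using equal-sized frames only; the function names are made up for the example, the forward transform is included only so the round trip can be checked, and real decoders use a fast transform and handle mixed block sizes.

```python
import math


def window(size):
    """Symmetric Vorbis window for a frame of `size` samples."""
    return [math.sin(math.pi / 2 * math.sin(math.pi * (i + 0.5) / size) ** 2)
            for i in range(size)]


def audio_spectrum(floor, residue):
    """Element-by-element product of floor curve and residue vector."""
    return [f * r for f, r in zip(floor, residue)]


def mdct(block):
    """Direct-form MDCT: 2N windowed samples in, N coefficients out."""
    n = len(block) // 2
    return [sum(block[t] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                for t in range(2 * n))
            for k in range(n)]


def imdct(coeffs):
    """Direct-form inverse MDCT: N coefficients in, 2N samples out."""
    n = len(coeffs)
    return [2.0 / n * sum(coeffs[k] * math.cos(math.pi / n * (t + 0.5 + n / 2) * (k + 0.5))
                          for k in range(n))
            for t in range(2 * n)]


def overlap_add(spectra):
    """Inverse-transform each frame, window it, and sum the overlapping halves."""
    n = len(spectra[0])
    w = window(2 * n)
    out = [0.0] * (n * (len(spectra) + 1))
    for i, spectrum in enumerate(spectra):
        block = imdct(spectrum)
        for t in range(2 * n):
            out[i * n + t] += w[t] * block[t]
    return out
```

Each frame on its own contains time-domain aliasing from the MDCT. Because the squared window halves of adjacent frames sum to one, the aliasing terms cancel when the frames are added, so every sample covered by two frames is recovered exactly (up to floating-point error and whatever the encoder discarded).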
