What Is Audio?
Why Compression is needed?
Types Of Audio Compression.
Standard codecs for audio compression.
Categories of Audio Files
MPEG Audio Encoding Steps
MPEG Audio decoding
Successor of MP3
High quality audio coding based on perceptual models has
found its way to widespread application in broadcasting and
Internet audio (e.g. mp3).
Algorithms defined by the MPEG group (MPEG-1 Audio, e.g.
MPEG Layer-3 (mp3), MPEG-2 Advanced Audio Coding,
MPEG-4 Audio including its different functionalities) still
define the state of the art.
Audio is an electrical or other
representation of sound.
An audio file format is a file
format for storing digital audio
data on a computer system.
• Compression is the reduction in size of data in order to
save space or transmission time.
• Compression can be used to:
Reduce File Size
Save disk space
Reduce transmission time
• Compression is performed by a program that uses an
algorithm to determine how to compress or decompress
• Audio compression is a form of data compression designed
to reduce the size of audio files.
• There is a conditions on this definition :
the audio file must still be playable after
compression, without decompressing it to original size
when you want to play it (for example with WinRAR).
If the file is compressed 'too much' there will be loss
The compression is done with a thing called a codec.
This is an aggregation of the words: compressor and
This codec is a special algorithm to reduce the size.
There are mainly two types of audio compression show
• A compression technique that does not decompress data
back to 100% of the original.
• Lossy methods provide high degrees of compression
and result in smaller compressed files, but there is a
certain amount of visual loss when restored.
• A compression technique that decompresses
data back to its original form without any loss.
• The decompressed file and the original are
identical. For example, the ZIP archiving
technology (WinZip...) is the most widely used
• Lossless audio files typically require more storage
space than Lossy encoded ones.
• However this type of format is often favored by users
wanting to backup original audio CDs.
• A perfect copy can be restored in the event of loss or
damage to the CD. FLAC, Apple Lossless (ALAC) and
WMA Lossless are examples of lossless compression
For lossy compression:
• Nero AAC Codec (Nero “advanced audio coding”
codec): It was developed and distributed by Nero AG.
• FAAC(Freeware Advanced Audio Coder):is an audio
compression computer program that creates AAC sound
files from other formats , it is the recommended format
for the company's iPod music player.
For lossless compression
• LPAC (Lossless predictive audio compression):is an
improved lossless audio compression algorithm developed
by Tilman Liebchen, Marcus Purat and Peter Noll.
• ALAC (Apple Lossless Audio Codec):is an audio coding
format, and its reference audio codec implementation,
developed by Apple.
• FLAC(Free Lossless Audio Codec): can typically reduce
the original size of audio file to 50–60%, and decompressed
it to an identical copy of the original audio data, developed
by Josh Coalson.
• WMA Lossless (Windows Media Audio Lossless):
developed by Microsoft
Moving Picture Experts Group
Aim to create standards relating to synchronized
audio and video compression
high bit-rate is
Digital audio tap
over ISDN lines
There are 3 categories in which certain Audio files
◦ Hearing threshold level – a function of frequency
◦ Any frequency components below the threshold will
not be perceived by human ear
A frequency component can be partly or fully masked by
another component that is close to it in frequency
A lower tone can effectively mask higher tone
This shifts the hearing threshold
◦ A quieter sound can be masked by a louder
sound if they are temporally close Sounds that
occur both (shortly) before and after volume
increase can be masked
Power-law: larger values have less accuracy
quantization: uniform or non-uniform quantization.
coding: quantized spectral components are transmitted
either directly, or as entropy coded words (Huffman
For better data compression, variable- length Huffman
codes are used to encode the quantized samples.
Resultant bitstream is now reduced, because of coarser
quantisation, but can be further reduced by the use of
formats encoded quantized samples into an
encoded bit stream – final form in which the
compressed signal is transmitted.
Header (First 4 bytes of a frame)
◦ Contains: Frame Sync, MPEG Layer, Sampling Frequency,
Number of Channels, CRC, etc.
◦ Variable bit rate mp3’s switch bitrate between frames
Decoder side relatively easier. The gain, scale
factor, quantization steps recovered and used for
reconstruct the filter bank responses
Filter bank responses are combined to reconstruct
the decoded audio signal
Advanced Audio Coding (AAC) – now part of MPEG-
Inclusion of 48 full-bandwidth audio channels
Default audio format for iPhone, iPad, Nintendo,
PlayStation, Nokia, Android, BlackBerry
Introduced 1997 as MPEG-2 Part 7
In 1999 – updated and included in MPEG-4
standard for lossy digital audio compression. Designed
to be the successor of the MP3 format, AAC generally
achieves better sound quality than MP3 at similar bit
Opus is a lossy audio coding format developed by the
Internet Engineering Task Force (IETF) that is
particularly suitable for interactive real-time
applications over the Internet.
Opus incorporates technology from two other audio
coding formats: the speech-oriented SILK and the low
The MDCT was proposed by Princen, Johnson, and
Bradley in 1987, following earlier (1986) work by
Princen and Bradley to develop the MDCT’s
underlying principle of time-domain aliasing
cancellation (TDAC), described below.
In MP3, the MDCT is not applied to the audio signal
directly, but rather to the output of a 32-band polyphase
quadrature filter (PQF) bank. The output of this MDCT
is post processed by an alias reduction formula to
reduce the typical aliasing of the PQF filter bank.
Such a combination of a filter bank with an MDCT is
called a hybrid filter bank or a sub band MDCT. AAC,
on the other hand, normally uses a pure MDCT; only
the (rarely used) MPEG-4 AAC-SSR variant (by Sony)
uses a four band PQF bank followed by an MDCT.
Similar to MP3, ATRAC uses stacked quadrature
mirror filters (QMF) followed by an MDCT.
• MPEG-1 Layer III (MP3)
• MPEG-1 Layer II
• MPEG-1 Layer I
• MPEG-4 ALS
• MPEG-4 SLS
• MPEG-D USAC