Prepared By: Avni Guna, Khushali Panasala, Yogesh Pandey,
Priyanka Pandey, Hiteshri Patel
Guided By: Mr. Chandresh Parekh
 Introduction
 What Is Audio?
 Why Compression is needed?
 Audio Compression.
 Types Of Audio Compression.
 Standard codecs for audio compression.
 Categories of Audio Files
 MPEG Audio Encoding Steps
 MPEG Audio decoding
 Successor of MP3
 High quality audio coding based on perceptual models has
found its way to widespread application in broadcasting and
Internet audio (e.g. mp3).
 Algorithms defined by the MPEG group (MPEG-1 Audio, e.g.
MPEG Layer-3 (mp3), MPEG-2 Advanced Audio Coding,
MPEG-4 Audio including its different functionalities) still
define the state of the art.
 Audio is an electrical or other
representation of sound.
 An audio file format is a file
format for storing digital audio
data on a computer system.
• Compression is the reduction in size of data in order to
save space or transmission time.
• Compression can be used to:
 Reduce File Size
 Save disk space
 Reduce transmission time
• Compression is performed by a program that uses an
algorithm to determine how to compress or decompress
data.
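To put rough numbers on the space and transmission savings, the sketch below compares raw PCM audio with a constant-bit-rate compressed stream. The CD-style parameters (44.1 kHz, 16-bit, stereo), the 3-minute duration, and the 128 kbit/s rate are illustrative assumptions, not figures taken from these slides.

```python
def pcm_size_bytes(seconds, sample_rate=44_100, bits_per_sample=16, channels=2):
    """Size of raw (uncompressed) PCM audio in bytes."""
    return seconds * sample_rate * channels * bits_per_sample // 8


def cbr_size_bytes(seconds, bitrate_bps=128_000):
    """Size of a constant-bit-rate compressed stream in bytes."""
    return seconds * bitrate_bps // 8


duration = 180  # assumed track length in seconds
raw = pcm_size_bytes(duration)
compressed = cbr_size_bytes(duration)
ratio = raw / compressed

print(f"raw: {raw} bytes, compressed: {compressed} bytes, ratio: {ratio:.1f}:1")
```

Under these assumptions the compressed stream is roughly one eleventh of the raw size, which is the kind of saving that motivates compression for storage and transmission.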
• Audio compression is a form of data compression designed
to reduce the size of audio files.
• There is one condition on this definition:
 The audio file must still be playable after
compression, without first decompressing it to its original
size (unlike an archive made with, for example, WinRAR).
 If the file is compressed too much, there will be a loss
of quality.
 The compression is done with a codec, a contraction of
the words compressor and decompressor.
 A codec implements an algorithm that reduces the size
of the audio data.
 There are mainly two types of audio compression, shown
below:
1) Lossy compression (e.g. MP3)
2) Lossless compression (e.g. WinZip)
• Lossy compression is a technique that does not restore
data back to 100% of the original.
• Lossy methods provide high degrees of compression
and result in smaller compressed files, but there is a
certain amount of quality loss when the data is restored.
• Example: MP3
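The irreversibility of lossy methods can be illustrated with a deliberately crude sketch: discarding the low-order bits of each sample. This is not how MP3 works (MP3 relies on a perceptual model); the `requantize` helper, the 440 Hz test tone, and the 8-bit target are all assumptions chosen for illustration.

```python
import math


def requantize(samples, keep_bits, source_bits=16):
    """Drop the least significant bits of each sample, then scale back."""
    shift = source_bits - keep_bits
    return [(s >> shift) << shift for s in samples]


# A 440 Hz sine sampled at 8 kHz, stored as 16-bit style integers (assumed input).
samples = [round(20000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(64)]

restored = requantize(samples, keep_bits=8)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print("identical:", restored == samples)
print("largest sample error:", max_error)
```

The restored samples are close to the originals but not equal to them, and the discarded bits cannot be recovered.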
• Lossless compression is a technique that decompresses
data back to its original form without any loss.
• The decompressed file and the original are
identical. For example, the ZIP archiving
technology (WinZip...) is the most widely used
lossless method.
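A lossless round trip can be checked directly. The sketch below uses Python's standard `zlib` module, which implements DEFLATE, the algorithm commonly used inside ZIP archives; it does not produce a ZIP file, and the repetitive test data is an assumption picked so that it compresses well.

```python
import zlib

# Highly repetitive input (assumed test data), 16 KiB in total.
original = bytes(range(256)) * 64

packed = zlib.compress(original, 9)   # 9 = highest compression level
restored = zlib.decompress(packed)

print("original bytes:", len(original))
print("packed bytes:  ", len(packed))
print("bit-exact copy:", restored == original)
```

The restored bytes compare equal to the input, which is the defining property of a lossless method.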
• Lossless audio files typically require more storage
space than lossy encoded ones.
• However, this type of format is often favored by users
who want to back up original audio CDs.
• A perfect copy can be restored in the event of loss or
damage to the CD. FLAC, Apple Lossless (ALAC) and
WMA Lossless are examples of lossless compression
formats.
 For lossy compression:
• Nero AAC Codec (Nero “advanced audio coding”
codec): developed and distributed by Nero AG.
• FAAC (Freeware Advanced Audio Coder): an audio
compression program that creates AAC sound files
from other formats. AAC is the recommended format
for Apple's iPod music player.
 For lossless compression:
• LPAC (Lossless Predictive Audio Compression): an
improved lossless audio compression algorithm developed
by Tilman Liebchen, Marcus Purat and Peter Noll.
• ALAC (Apple Lossless Audio Codec): an audio coding
format, and its reference audio codec implementation,
developed by Apple.
• FLAC (Free Lossless Audio Codec): developed by Josh
Coalson; typically reduces an audio file to 50–60% of its
original size and decompresses to an identical copy of the
original audio data.
• WMA Lossless (Windows Media Audio Lossless):
developed by Microsoft.
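The idea behind predictive lossless coding can be sketched with the simplest possible predictor: guess that each sample equals the previous one and store only the prediction error. This is an illustrative assumption, not the actual algorithm of LPAC, ALAC, FLAC, or WMA Lossless, which use more elaborate predictors and entropy-code the residuals. The function names and sample values are hypothetical.

```python
def encode_residuals(samples):
    """Predict each sample as the previous one and keep only the error."""
    residuals = []
    previous = 0
    for s in samples:
        residuals.append(s - previous)
        previous = s
    return residuals


def decode_residuals(residuals):
    """Rebuild the samples by accumulating the stored errors."""
    samples = []
    previous = 0
    for r in residuals:
        previous += r
        samples.append(previous)
    return samples


samples = [1000, 1012, 1025, 1031, 1030, 1022, 1010, 995]
residuals = encode_residuals(samples)

print(residuals)
print(decode_residuals(residuals) == samples)
```

After the first value, the residuals are much smaller than the samples themselves, so they can be stored in fewer bits, while decoding still returns the exact input.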
 Moving Picture Experts Group (MPEG)
 Aims to create standards relating to synchronized
audio and video compression
 MPEG-1
 MPEG-2
Layer I: quite good quality when a high bit-rate is available; used for digital audio tape
Layer II: complex; used for Digital Audio Broadcasting
Layer III: most complex; used for audio transmission over ISDN lines
Audio files fall into 3 categories:
1) Uncompressed:
Ex) .wav
2) Lossless:
Ex) .wma (when encoded with WMA Lossless)
3) Lossy:
Ex) .mp3
◦ Hearing threshold level: a function of frequency
◦ Any frequency component below the threshold will
not be perceived by the human ear
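The frequency dependence of the hearing threshold can be sketched with Terhardt's widely quoted approximation of the threshold in quiet (in dB SPL). The formula is an assumption brought in for illustration and does not come from these slides; real codecs use their own psychoacoustic models. The `is_audible` helper is hypothetical.

```python
import math


def threshold_in_quiet_db(freq_hz):
    """Approximate absolute hearing threshold in dB SPL (Terhardt's formula)."""
    f = freq_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)


def is_audible(freq_hz, level_db):
    """True if a lone tone at this level lies above the threshold in quiet."""
    return level_db > threshold_in_quiet_db(freq_hz)


for freq in (100, 1000, 3300, 15000):
    print(f"{freq:>6} Hz: {threshold_in_quiet_db(freq):6.1f} dB SPL")
```

Under this approximation the ear is most sensitive around 3 to 4 kHz, so a 10 dB tone would be audible near 3.3 kHz but not at 100 Hz. An encoder can skip components that fall below the curve.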
 Frequency masking: a frequency component can be partly
or fully masked by another component that is close to it
in frequency
 A lower tone can effectively mask a higher tone
 This shifts the hearing threshold
◦ Temporal masking: a quieter sound can be masked by a
louder sound if they are temporally close. Sounds that
occur shortly before or shortly after a louder sound
can be masked
 Power-law quantization: larger values are quantized with
less accuracy
 Quantization: uniform or non-uniform.
 Coding: quantized spectral components are transmitted
either directly or as entropy-coded words (Huffman
coding)
 For better data compression, variable-length Huffman
codes are used to encode the quantized samples.
 The resulting bitstream is already smaller because of the
coarser quantization, and can be reduced further by
Huffman coding.
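The two steps above can be sketched together: a power-law quantizer followed by a Huffman code built over the quantized values. The 3/4 exponent mirrors the one commonly described for MP3, but the fixed step size, the missing scale factors, and the sample spectrum are simplifying assumptions, and the function names are hypothetical.

```python
import heapq
from collections import Counter


def quantize(value, step=1.0):
    """Power-law quantizer: compress the magnitude with exponent 3/4, then round."""
    sign = -1 if value < 0 else 1
    return sign * round((abs(value) / step) ** 0.75)


def dequantize(q, step=1.0):
    """Invert the power law (exponent 4/3); the rounding error is not recovered."""
    sign = -1 if q < 0 else 1
    return sign * step * abs(q) ** (4 / 3)


def huffman_code_lengths(symbols):
    """Return {symbol: code length in bits} for a Huffman code over `symbols`."""
    counts = Counter(symbols)
    if len(counts) == 1:
        return {next(iter(counts)): 1}
    # Heap entries: (weight, tie_breaker, {symbol: depth so far}).
    heap = [(w, i, {s: 0}) for i, (s, w) in enumerate(counts.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        w1, _, d1 = heapq.heappop(heap)
        w2, _, d2 = heapq.heappop(heap)
        merged = {s: depth + 1 for s, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (w1 + w2, tie, merged))
        tie += 1
    return heap[0][2]


# Assumed spectral values: many small components, a few large ones.
spectrum = [0.3, -0.8, 1.2, 0.1, 25.0, -0.4, 0.6, 90.0, 0.2, -1.1, 0.0, 0.5]
quantized = [quantize(x) for x in spectrum]

lengths = huffman_code_lengths(quantized)
huffman_bits = sum(lengths[q] for q in quantized)
fixed_bits = 3 * len(quantized)  # 5 distinct symbols need 3 bits each at fixed length

print("quantized:", quantized)
print("Huffman bits:", huffman_bits, "vs fixed-length bits:", fixed_bits)
```

The reconstruction levels returned by `dequantize` lie further apart for large values than for small ones, which is the sense in which larger values have less accuracy. Frequent symbols such as 0 receive short codes, so the Huffman-coded stream is shorter than a fixed-length encoding of the same values.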
 Bitstream formatting packs the encoded quantized samples
into an encoded bit stream, the final form in which the
compressed signal is transmitted.
 Header (first 4 bytes of a frame)
◦ Contains: frame sync, MPEG layer, sampling frequency,
number of channels, CRC, etc.
◦ Variable bit rate MP3s switch bitrate between frames
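Reading the fields out of the 4-byte header is a matter of bit masking. The sketch below handles only MPEG-1 Layer III headers and follows the commonly documented bit layout; the function name, the returned dictionary keys, and the sample header bytes are assumptions made for illustration.

```python
# Lookup tables for MPEG-1 Layer III (index 0 = "free" bitrate, index 15 = invalid).
BITRATES_KBPS = [None, 32, 40, 48, 56, 64, 80, 96, 112,
                 128, 160, 192, 224, 256, 320, None]
SAMPLE_RATES_HZ = [44100, 48000, 32000, None]
CHANNEL_MODES = ["stereo", "joint stereo", "dual channel", "mono"]


def parse_mp3_header(header: bytes) -> dict:
    """Decode selected fields of an MPEG-1 Layer III frame header."""
    if len(header) != 4:
        raise ValueError("a frame header is exactly 4 bytes")
    word = int.from_bytes(header, "big")
    if word >> 21 != 0x7FF:                    # 11 sync bits, all set
        raise ValueError("frame sync not found")
    version_bits = (word >> 19) & 0b11         # 0b11 = MPEG-1
    layer_bits = (word >> 17) & 0b11           # 0b01 = Layer III
    if version_bits != 0b11 or layer_bits != 0b01:
        raise ValueError("this sketch handles MPEG-1 Layer III only")
    return {
        "crc_protected": ((word >> 16) & 1) == 0,   # protection bit 0 means CRC follows
        "bitrate_kbps": BITRATES_KBPS[(word >> 12) & 0xF],
        "sample_rate_hz": SAMPLE_RATES_HZ[(word >> 10) & 0b11],
        "padding": bool((word >> 9) & 1),
        "channel_mode": CHANNEL_MODES[(word >> 6) & 0b11],
    }


info = parse_mp3_header(bytes([0xFF, 0xFB, 0x90, 0x00]))
print(info)
```

Because the bitrate index is stored in every frame header, a variable bit rate file can simply carry a different index from one frame to the next.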
 The decoder side is relatively simpler. The gain, scale
factors, and quantization steps are recovered and used to
reconstruct the filter bank responses
 Filter bank responses are combined to reconstruct
the decoded audio signal
 Advanced Audio Coding (AAC): now part of MPEG-4
Audio
 Supports up to 48 full-bandwidth audio channels
 Default audio format for iPhone, iPad, Nintendo,
PlayStation, Nokia, Android, BlackBerry
 Introduced in 1997 as MPEG-2 Part 7
 In 1999, updated and included in MPEG-4
 AAC is a standard for lossy digital audio compression.
Designed to be the successor of the MP3 format, AAC
generally achieves better sound quality than MP3 at
similar bit rates.
 Opus is a lossy audio coding format developed by the
Internet Engineering Task Force (IETF) that is
particularly suitable for interactive real-time
applications over the Internet.
 Opus incorporates technology from two other audio
coding formats: the speech-oriented SILK and the low
latency CELT.
 The modified discrete cosine transform (MDCT) was
proposed by Princen, Johnson, and Bradley[1] in 1987,
following earlier (1986) work by Princen and Bradley[2]
to develop the MDCT’s underlying principle of
time-domain aliasing cancellation (TDAC).
 In MP3, the MDCT is not applied to the audio signal
directly, but rather to the output of a 32-band polyphase
quadrature filter (PQF) bank. The output of this MDCT
is post-processed by an alias reduction formula to
reduce the typical aliasing of the PQF filter bank.
 Such a combination of a filter bank with an MDCT is
called a hybrid filter bank or a subband MDCT. AAC,
on the other hand, normally uses a pure MDCT; only
the (rarely used) MPEG-4 AAC-SSR variant (by Sony)
uses a four-band PQF bank followed by an MDCT.
 Similar to MP3, ATRAC uses stacked quadrature
mirror filters (QMF) followed by an MDCT.
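Time-domain aliasing cancellation can be demonstrated with a small pure MDCT (no filter bank in front, so closer to the AAC arrangement than to MP3's hybrid one). The sketch below is a direct, unoptimized implementation under stated assumptions: a sine window, 50% overlap, a synthesis scale of 2/N, and zero padding at both ends; the function names are hypothetical and real codecs use fast algorithms instead.

```python
import math


def sine_window(N):
    """Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1."""
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block, window):
    """2N windowed input samples -> N coefficients."""
    N = len(block) // 2
    return [sum(window[n] * block[n]
                * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                for n in range(2 * N))
            for k in range(N)]


def imdct(coeffs, window):
    """N coefficients -> 2N windowed output samples (still aliased)."""
    N = len(coeffs)
    return [window[n] * (2.0 / N)
            * sum(coeffs[k]
                  * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
                  for k in range(N))
            for n in range(2 * N)]


def analyze_synthesize(signal, N):
    """MDCT then IMDCT with 50% overlap-add; len(signal) must be a multiple of N."""
    window = sine_window(N)
    padded = [0.0] * N + list(signal) + [0.0] * N
    out = [0.0] * len(padded)
    for start in range(0, len(padded) - 2 * N + 1, N):
        coeffs = mdct(padded[start:start + 2 * N], window)
        for n, y in enumerate(imdct(coeffs, window)):
            out[start + n] += y          # aliasing cancels in the overlap
    return out[N:N + len(signal)]


signal = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
rebuilt = analyze_synthesize(signal, N=4)
print(max(abs(a - b) for a, b in zip(signal, rebuilt)))
```

Each block of 2N samples yields only N coefficients, so a single inverse transform contains aliasing; the aliased terms of neighboring blocks have opposite signs and cancel when the overlapping halves are added.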
 ISO/IEC
• MPEG-1 Layer III (MP3)
• MPEG-1 Layer II
• MPEG-1 Layer I
• AAC
• MPEG-4 ALS
• MPEG-4 SLS
• MPEG-D USAC
 ITU-T
• G.711
• G.718
• G.719
• G.722
• G.723
• G.726
• G.728
• G.729
• Audio compression is a key technology
• Many algorithms → many applications
• Better algorithms → better quality, more compression
Thank You

Audio compression
