A Tutorial on
MPEG/Audio
Compression
Davis Pan, IEEE Multimedia Journal,
Summer 1995
Presented by:
Randeep Singh Gakhal
CMPT 820, Spring 2004
Outline
 Introduction
 Technical Overview
 Polyphase Filter Bank
 Psychoacoustic Model
 Coding and Bit Allocation
 Conclusions and Future Work
Introduction
 What does MPEG-1 Audio provide?
A transparently lossy audio compression system based on
the weaknesses of the human ear.
 Can provide compression by a factor of 6 and
retain sound quality.
 One part of a three part standard that includes
audio, video, and audio/video synchronization.
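To put the factor-of-6 figure in numbers, here is a small sketch. It assumes 16-bit PCM samples, which the slides do not state, and the function name `pcm_bitrate_kbps` is only illustrative.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample=16, channels=1):
    """Bit rate of uncompressed PCM audio in kbps (16-bit samples are an assumption)."""
    return sample_rate_hz * bits_per_sample * channels / 1000.0

# Raw rate and the rate after 6:1 compression for each supported sampling rate.
for fs in (32000, 44100, 48000):
    raw = pcm_bitrate_kbps(fs)
    print(f"{fs} Hz mono: {raw:.1f} kbps raw, {raw / 6:.1f} kbps at 6:1")
```

Under that assumption, 48 kHz mono PCM is 768 kbps, so a 6:1 reduction lands at 128 kbps.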
Technical Overview
MPEG-I Audio Features
 PCM sampling rate of 32, 44.1, or 48 kHz
 Four channel modes:
 Monophonic and Dual-monophonic
 Stereo and Joint-stereo
 Three modes (layers in MPEG-I speak):
 Layer I: Computationally cheapest, bit rates > 128kbps
 Layer II: Bit rate ~ 128 kbps, used in VCD
 Layer III: Most complicated encoding/decoding, bit rates ~
64kbps, originally intended for streaming audio
Human Audio System (ear + brain)
 Human sensitivity to sound is non-linear
across audible range (20Hz – 20kHz)
 Audible range broken into regions within which
humans cannot perceive a difference in frequency,
 called the critical bands
MPEG-I Encoder Architecture[1] (figure not reproduced)
MPEG-I Encoder Architecture
 Polyphase Filter Bank: Transforms PCM samples
to frequency domain signals in 32 subbands
 Psychoacoustic Model: Calculates acoustically
irrelevant parts of signal
 Bit Allocator: Allots bits to subbands according to
input from psychoacoustic calculation.
 Frame Creation: Generates an MPEG-I compliant
bit stream.
The Polyphase
Filter Bank
Polyphase Filter Bank
 Divides audio signal into 32 equal width
subband streams in the frequency domain.
 Inverse filter at decoder cannot recover
signal without some, albeit inaudible, loss.
 Based on work by Rothweiler[2].
 Standard specifies 512 coefficient analysis
window, C[n]
Polyphase Filter Bank
 Buffer of 512 PCM samples with 32 new
samples, X[n], shifted in every computation cycle
 Calculate window samples for i=0…511: $Z[i] = C[i] \cdot X[i]$
 Partial calculation for i=0…63: $Y[i] = \sum_{j=0}^{7} Z[i + 64j]$
 Calculate 32 subsamples for i=0…31: $S[i] = \sum_{k=0}^{63} Y[k] \cdot M[i][k]$
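The three steps of one computation cycle can be sketched in NumPy as follows. This is a minimal sketch, not a conforming encoder: the window `C` below is a placeholder of the right length rather than the 512 coefficients the standard specifies, the input is random data, and the function names are illustrative. The cosine matrix is M[i][k] = cos((2i+1)(k-16)π/64).

```python
import numpy as np

def analysis_matrix():
    """M[i][k] = cos((2i + 1)(k - 16) * pi / 64), shape (32, 64)."""
    i = np.arange(32).reshape(-1, 1)
    k = np.arange(64).reshape(1, -1)
    return np.cos((2 * i + 1) * (k - 16) * np.pi / 64)

def subband_samples(X, C, M):
    """One computation cycle: 512-sample buffer X -> 32 subband samples."""
    Z = C * X                          # Z[i] = C[i] * X[i], i = 0..511
    Y = Z.reshape(8, 64).sum(axis=0)   # Y[k] = sum over j of Z[k + 64j]
    return M @ Y                       # S[i] = sum over k of M[i][k] * Y[k]

# Placeholder window: NOT the standard's C[n] table, only a stand-in array.
C = np.hanning(512)
X = np.random.default_rng(0).standard_normal(512)  # stand-in PCM buffer
S = subband_samples(X, C, analysis_matrix())
print(S.shape)
```

Reshaping `Z` to 8 rows of 64 puts element k + 64j at row j, column k, so summing over rows gives the partial sums directly.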
Polyphase Filter Bank
 Visualization of the filter[1]: (figure not reproduced)
Polyphase Filter Bank
 The net effect: $S[i] = \sum_{k=0}^{63} \sum_{j=0}^{7} M[i][k] \, C[k + 64j] \, X[k + 64j]$
 Analysis matrix: $M[i][k] = \cos\left[\frac{(2i+1)(k-16)\pi}{64}\right]$
 Requires 512 + 32×64 = 2560 multiplies.
 Each subband has bandwidth π/32T centered at
odd multiples of π/64T
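As a worked check of the multiply count and the subband layout, the sketch below converts the angular figures to Hz. It assumes T is the sampling period, so π/T rad/s corresponds to the Nyquist frequency fs/2; the function name `subband_layout` is illustrative.

```python
def subband_layout(sample_rate_hz, num_subbands=32):
    """Return (bandwidth_hz, list of center frequencies in Hz) for equal-width subbands."""
    # pi/(32T) rad/s is fs/64 in Hz when there are 32 subbands.
    bandwidth = sample_rate_hz / (2 * num_subbands)
    # Odd multiples of pi/(64T) rad/s, i.e. odd multiples of fs/128 Hz.
    centers = [(2 * i + 1) * bandwidth / 2 for i in range(num_subbands)]
    return bandwidth, centers

multiplies = 512 + 32 * 64   # windowing step plus the 32x64 matrix product
bw, centers = subband_layout(48000)
print(multiplies, bw, centers[0], centers[-1])
```

At 48 kHz each subband is 750 Hz wide, with the first centered at 375 Hz and the last at 23625 Hz.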
Polyphase Filter Bank
 Shortcomings:
 Equal width filters do not correspond with critical
band model of auditory system.
 Filter bank and its inverse are NOT lossless.
 Frequency overlap between subbands.
Polyphase Filter Bank
 Comparison of filter banks and critical bands[1]: (figure not reproduced)
Polyphase Filter Bank
 Frequency response of one subband[1]: (figure not reproduced)
Psychoacoustic
Model
The Weakness of the Human Ear
 Frequency dependent resolution:
 We do not have the ability to discern minute
differences in frequency within the critical bands.
 Auditory masking:
 When two signals of very close frequency are
both present, the louder will mask the softer.
 A masked signal must be louder than some
threshold for it to be heard, which gives us room to
introduce inaudible quantization noise.
MPEG-I Psychoacoustic Models
 MPEG-I standard defines two models:
 Psychoacoustic Model 1:
 Less computationally expensive
 Makes some serious compromises in what it
assumes a listener cannot hear
 Psychoacoustic Model 2:
 Provides more features suited for Layer III
coding, assuming of course, increased processor
bandwidth.
Psychoacoustic Model
 Convert samples to frequency domain
 Use a Hann weighting and then a DFT
 Reduces edge artifacts (from the finite window
size) in the frequency domain representation.
 Model 1 uses 512 (Layer I) or 1024 (Layers II
and III) sample window.
 Model 2 uses a 1024 sample window and two
calculations per frame.
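The Hann weighting followed by a DFT can be sketched with NumPy. This is a sketch under assumptions: the 3 kHz test tone, the 48 kHz rate, and the power normalization are arbitrary choices for illustration and are not taken from the standard.

```python
import numpy as np

def power_spectrum_db(samples):
    """Hann-weight a block of PCM samples and return its power spectrum in dB."""
    n = len(samples)
    window = np.hanning(n)                  # Hann weighting tapers the block edges
    spectrum = np.fft.rfft(samples * window)
    power = np.abs(spectrum) ** 2 / n       # illustrative normalization only
    return 10 * np.log10(power + 1e-12)     # small offset avoids log(0)

fs = 48000
t = np.arange(512) / fs                     # 512-sample window, as in Model 1 for Layer I
tone = np.sin(2 * np.pi * 3000 * t)         # 3 kHz test tone (arbitrary choice)
spec = power_spectrum_db(tone)
peak_bin = int(np.argmax(spec))
print(peak_bin, peak_bin * fs / 512)
```

With a 512-sample window at 48 kHz each spectral line is 93.75 Hz wide, so the test tone falls on line 32.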
Psychoacoustic Model
 Need to separate sound into “tones” and “noise”
components
 Model 1:
 Local peaks are tones, lump remaining spectrum per
critical band into noise at a representative frequency.
 Model 2:
 Calculate “tonality” index to determine likelihood of each
spectral point being a tone
 based on previous two analysis windows
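The Model 1 idea of treating local peaks as tones and lumping the rest into noise can be sketched as below. This is a simplified stand-in: the standard's own peak-picking rules are not reproduced, `margin_db` is a hypothetical parameter, and the input is assumed to be the spectral lines (in dB) of a single critical band.

```python
import numpy as np

def split_tones_and_noise(band_db, margin_db=7.0):
    """Return (tone line indices, lumped noise level in dB) for one band.

    A line counts as a tone if it exceeds both neighbours by margin_db
    (a hypothetical criterion, not the standard's exact rule).
    """
    s = np.asarray(band_db, dtype=float)
    tones = [k for k in range(1, len(s) - 1)
             if s[k] - s[k - 1] >= margin_db and s[k] - s[k + 1] >= margin_db]
    keep = np.ones(len(s), dtype=bool)
    for k in tones:
        keep[k] = False                     # exclude tonal lines from the noise sum
    rest = s[keep]
    # Sum the remaining lines in the power domain, then convert back to dB.
    noise_db = 10 * np.log10(np.sum(10 ** (rest / 10))) if rest.size else float("-inf")
    return tones, noise_db

band = [10.0, 12.0, 40.0, 11.0, 9.0, 10.0]  # made-up levels for illustration
tones, noise_db = split_tones_and_noise(band)
print(tones, round(noise_db, 1))
```

Here only the 40 dB line stands out from its neighbours, so it is the single tone and the other five lines are combined into one noise component.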
Psychoacoustic Model
 “Smear” each signal within its critical band
 Use either a masking (Model 1) or a spreading
function (Model 2).
 Adjust calculated threshold by incorporating
a “quiet” mask – masking threshold for
each frequency when no other frequencies
are present.
Psychoacoustic Model
 Calculate a masking threshold for each subband in the
polyphase filter bank
 Model 1:
 Selects minima of masking threshold values in range of each
subband
 Inaccurate at higher frequencies – recall how subbands are
linearly distributed, critical bands are NOT!
 Model 2:
 If subband wider than critical band:
 Use minimal masking threshold in subband
 If critical band wider than subband:
 Use average masking threshold in subband
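The mapping from per-line masking thresholds to one threshold per subband can be sketched as follows. Assumptions: thresholds arrive as dB values on equally spaced spectral lines, the boolean array `wide_subband` (True where the subband is wider than the critical band) is a hypothetical input, and averaging is done directly on the dB values. Setting every entry to True gives the Model 1 behaviour of always taking the minimum.

```python
import numpy as np

def subband_thresholds(threshold_db, wide_subband, num_subbands=32):
    """Pick the minimum or the average threshold of the lines in each subband."""
    t = np.asarray(threshold_db, dtype=float)
    per_band = t.reshape(num_subbands, -1)   # equal-width subbands: equal line counts
    mins = per_band.min(axis=1)
    means = per_band.mean(axis=1)            # averaging in dB is an assumption here
    return np.where(np.asarray(wide_subband, dtype=bool), mins, means)

lines = np.arange(256.0)                     # made-up thresholds, 8 lines per subband
wide = np.array([True] * 16 + [False] * 16)  # hypothetical split between the two cases
result = subband_thresholds(lines, wide)
print(result[0], result[31])
```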
Psychoacoustic Model
 The hard work is done – now, we just
calculate the signal-to-mask ratio (SMR)
per subband
 SMR = signal energy / masking threshold
 We pass our result on to the coding unit
which can now produce a compressed
bitstream
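Since the bit allocator works in dB, the ratio above is usually expressed on a log scale. A minimal sketch, assuming both inputs are in the same linear energy units:

```python
import math

def smr_db(signal_energy, masking_threshold):
    """Signal-to-mask ratio in dB: 10 * log10(signal energy / masking threshold)."""
    return 10 * math.log10(signal_energy / masking_threshold)

print(smr_db(1000.0, 10.0))   # energy ratio of 100
```

A positive SMR means the signal rises above the mask in that subband, so quantization noise there has to be kept correspondingly low.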
Psychoacoustic Model (example)
 Input[1]: (figure not reproduced)
Psychoacoustic Model (example)
 Transformation to perceptual domain[1]: (figure not reproduced)
Psychoacoustic Model (example)
 Calculation of masking thresholds[1]: (figure not reproduced)
Psychoacoustic Model (example)
 Signal-to-mask ratios[1]: (figure not reproduced)
Psychoacoustic Model (example)
 What we actually send[1]: (figure not reproduced)
Coding and Bit
Allocation
Layer Specific Coding
 Layer specific frame formats[1]: (figure not reproduced)
Layer Specific Coding
 Stream of samples is processed in groups[1]: (figure not reproduced)
Layer I Coding
 Group 12 samples from each of the 32 subbands and
encode them in each frame (12 × 32 = 384 samples)
 Each group encoded with 0-15 bits/sample
 Each group has 6-bit scale factor
Layer II Coding
 Similar to Layer I except:
 Each frame now holds 3 groups of 12 samples per subband
(3 × 12 × 32 = 1152 samples per frame)
 Can have up to 3 scale factors per subband to
avoid audible distortion in special cases
 Called scale factor selection information (SCFSI)
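The frame sizes for Layers I and II follow from the grouping, as this small sketch shows. The 48 kHz rate used for the durations is just one of the three supported sampling rates, and the helper names are illustrative.

```python
SUBBANDS = 32
SAMPLES_PER_GROUP = 12

def frame_samples(groups_per_subband):
    """PCM samples covered by one frame."""
    return groups_per_subband * SAMPLES_PER_GROUP * SUBBANDS

def frame_duration_ms(groups_per_subband, sample_rate_hz):
    """Time span of one frame in milliseconds."""
    return 1000.0 * frame_samples(groups_per_subband) / sample_rate_hz

for name, groups in (("Layer I", 1), ("Layer II", 3)):
    print(name, frame_samples(groups), frame_duration_ms(groups, 48000), "ms at 48 kHz")
```

At 48 kHz a Layer I frame spans 8 ms and a Layer II frame spans 24 ms.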
Layer III Coding
 Further subdivides subbands using Modified
Discrete Cosine Transform (MDCT) – a lossless
transform
 Larger frequency resolution => smaller time
resolution
 possibility of pre-echo
 Layer III encoder can detect and reduce pre-echo
by “borrowing bits” from a reservoir of bits saved in earlier frames
Bit Allocation
 Determine number of bits to allot for each
subband given SMR from psychoacoustic model.
 Layers I and II:
 Calculate mask-to-noise ratio:
 MNR = SNR – SMR (in dB)
 SNR given by MPEG-I standard (as function of quantization
levels)
 Now iterate until no bits to allocate left:
 Allocate bits to subband with lowest MNR.
 Re-calculate MNR for subband allocated more bits.
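The iteration for Layers I and II can be sketched as a greedy loop. This is a simplified sketch: the standard's SNR table is replaced by a rough stand-in of about 6.02 dB per bit, every bit count from 0 to `max_bits` is treated as available, and the bits spent on scale factors and side information are ignored. All names here are illustrative.

```python
def allocate_bits(smr_db, bit_pool, samples_per_subband=12, max_bits=15,
                  snr_per_bit_db=6.02):
    """Greedy allocation: give one more bit per sample to the subband with
    the lowest MNR = SNR - SMR until the pool is exhausted.

    snr_per_bit_db stands in for the standard's SNR table (an assumption).
    """
    bits = [0] * len(smr_db)

    def mnr(b):
        return bits[b] * snr_per_bit_db - smr_db[b]

    while bit_pool >= samples_per_subband:
        candidates = [b for b in range(len(bits)) if bits[b] < max_bits]
        if not candidates:
            break
        worst = min(candidates, key=mnr)    # subband with the lowest MNR
        bits[worst] += 1                    # MNR is re-evaluated on the next pass
        bit_pool -= samples_per_subband     # one extra bit for each sample in the group
    return bits

print(allocate_bits([20.0, 5.0, -3.0], bit_pool=60))
```

With these made-up SMR values the loud first subband collects most of the bits, while the third, whose signal is already below the mask, gets none.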
Bit Allocation
 Layer III:
 Employs “noise allocation”
 Quantizes each spectral value and employs
Huffman coding
 If Huffman encoding results in noise in excess of
allowed distortion for a subband, encoder
increases resolution on that subband
 Whole process repeats until one of three
specified stop conditions is met.
Conclusions and
Future Work
Conclusions
 MPEG-I provides tremendous compression
for relatively cheap computation.
 Not suitable for archival or audiophile grade
music as very seasoned listeners can
discern distortion.
 Modifying or searching MPEG-I content
requires decompression and is not cheap!
Future Work
 MPEG-1 audio lays the foundation for all modern
audio compression techniques
 Lots of progress since then (1994!)
 MPEG-2 (1996) extends MPEG audio
compression to support 5.1 channel audio
 MPEG-4 (1998) attempts to code based on
perceived audio objects in the stream
 Finally, MPEG-7 (2001) operates at an even
higher level of abstraction, focusing on meta-data
coding to make content searchable and
retrievable
References
[1] D. Pan, “A Tutorial on MPEG/Audio Compression”,
IEEE Multimedia Journal, 1995.
[2] J. H. Rothweiler, “Polyphase Quadrature Filters – a New
Subband Coding Technique”, Proc of the Int. Conf. IEEE
ASSP, 27.2, pp1280-1283, Boston 1983.

More Related Content

What's hot

Video Compression Basics - MPEG2
Video Compression Basics - MPEG2Video Compression Basics - MPEG2
Video Compression Basics - MPEG2
VijayKumarArya
 
Audio and video compression
Audio and video compressionAudio and video compression
Audio and video compressionneeraj9217
 
Audio compression 1
Audio compression 1Audio compression 1
Audio compression 1
Rajat Kumar
 
Mp3
Mp3Mp3
video compression techique
video compression techiquevideo compression techique
video compression techiqueAshish Kumar
 
Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compressionMr SMAK
 
Iain Richardson: An Introduction to Video Compression
Iain Richardson: An Introduction to Video CompressionIain Richardson: An Introduction to Video Compression
Iain Richardson: An Introduction to Video Compression
Iain Richardson
 
Video Compression
Video CompressionVideo Compression
Video Compression
Shreyash Patel
 
Unit 1
Unit 1Unit 1
Unit 1
swapnasalil
 
JPEG
JPEGJPEG
Transform coding
Transform codingTransform coding
Transform coding
Nancy K
 
A short history of video coding
A short history of video codingA short history of video coding
A short history of video coding
Iain Richardson
 
Image Compression
Image CompressionImage Compression
Image Compression
Paramjeet Singh Jamwal
 
Error control coding techniques
Error control coding techniquesError control coding techniques
Error control coding techniques
DhanashriNandre
 
Data compression
Data compressionData compression
Data compression
VIKAS SINGH BHADOURIA
 
Video coding standards ppt
Video coding standards pptVideo coding standards ppt
Video coding standards ppt
Lokesh Reddy Avula
 
Linear Predictive Coding
Linear Predictive CodingLinear Predictive Coding
Linear Predictive Coding
Shruti Bhatnagar Dasgupta
 
Motion Estimation - umit 5 (II).pdf
Motion Estimation  - umit 5 (II).pdfMotion Estimation  - umit 5 (II).pdf
Motion Estimation - umit 5 (II).pdf
HeenaSyed6
 
MPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video EncodingMPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video Encoding
Christian Kehl
 

What's hot (20)

Video Compression Basics - MPEG2
Video Compression Basics - MPEG2Video Compression Basics - MPEG2
Video Compression Basics - MPEG2
 
Audio and video compression
Audio and video compressionAudio and video compression
Audio and video compression
 
Audio compression 1
Audio compression 1Audio compression 1
Audio compression 1
 
Mp3
Mp3Mp3
Mp3
 
video compression techique
video compression techiquevideo compression techique
video compression techique
 
Hdtv technology
Hdtv technologyHdtv technology
Hdtv technology
 
Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compression
 
Iain Richardson: An Introduction to Video Compression
Iain Richardson: An Introduction to Video CompressionIain Richardson: An Introduction to Video Compression
Iain Richardson: An Introduction to Video Compression
 
Video Compression
Video CompressionVideo Compression
Video Compression
 
Unit 1
Unit 1Unit 1
Unit 1
 
JPEG
JPEGJPEG
JPEG
 
Transform coding
Transform codingTransform coding
Transform coding
 
A short history of video coding
A short history of video codingA short history of video coding
A short history of video coding
 
Image Compression
Image CompressionImage Compression
Image Compression
 
Error control coding techniques
Error control coding techniquesError control coding techniques
Error control coding techniques
 
Data compression
Data compressionData compression
Data compression
 
Video coding standards ppt
Video coding standards pptVideo coding standards ppt
Video coding standards ppt
 
Linear Predictive Coding
Linear Predictive CodingLinear Predictive Coding
Linear Predictive Coding
 
Motion Estimation - umit 5 (II).pdf
Motion Estimation  - umit 5 (II).pdfMotion Estimation  - umit 5 (II).pdf
Motion Estimation - umit 5 (II).pdf
 
MPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video EncodingMPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video Encoding
 

Viewers also liked

Video Compression Techniques
Video Compression TechniquesVideo Compression Techniques
Video Compression Techniques
cnssources
 
Compression presentation 415 (1)
Compression presentation 415 (1)Compression presentation 415 (1)
Compression presentation 415 (1)
Godo Dodo
 
ISDD Video Compression
ISDD Video CompressionISDD Video Compression
ISDD Video Compression
Forrester High School
 
Introduction To Video Compression
Introduction To Video CompressionIntroduction To Video Compression
Introduction To Video Compression
guestdd7ccca
 
Hw3 0972552
Hw3 0972552Hw3 0972552
Hw3 0972552s0972552
 
Standards De Compression Audio Et VidéO
Standards De Compression Audio Et VidéOStandards De Compression Audio Et VidéO
Standards De Compression Audio Et VidéO
briantais
 
28 h 264-avc_by_dhchang
28   h 264-avc_by_dhchang28   h 264-avc_by_dhchang
28 h 264-avc_by_dhchangBadri Patro
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004aniruddh Tyagi
 
MPEG Compression Standards
MPEG Compression StandardsMPEG Compression Standards
MPEG Compression Standards
Ajay
 
Compression: Video Compression (MPEG and others)
Compression: Video Compression (MPEG and others)Compression: Video Compression (MPEG and others)
Compression: Video Compression (MPEG and others)
danishrafiq
 
Video Compression Basics
Video Compression BasicsVideo Compression Basics
Video Compression Basics
Sanjiv Malik
 

Viewers also liked (14)

Video Compression Techniques
Video Compression TechniquesVideo Compression Techniques
Video Compression Techniques
 
Compression presentation 415 (1)
Compression presentation 415 (1)Compression presentation 415 (1)
Compression presentation 415 (1)
 
Chap55
Chap55Chap55
Chap55
 
Hw2
Hw2Hw2
Hw2
 
ISDD Video Compression
ISDD Video CompressionISDD Video Compression
ISDD Video Compression
 
Introduction To Video Compression
Introduction To Video CompressionIntroduction To Video Compression
Introduction To Video Compression
 
Hw3 0972552
Hw3 0972552Hw3 0972552
Hw3 0972552
 
Standards De Compression Audio Et VidéO
Standards De Compression Audio Et VidéOStandards De Compression Audio Et VidéO
Standards De Compression Audio Et VidéO
 
28 h 264-avc_by_dhchang
28   h 264-avc_by_dhchang28   h 264-avc_by_dhchang
28 h 264-avc_by_dhchang
 
video_compression_2004
video_compression_2004video_compression_2004
video_compression_2004
 
MPEG Compression Standards
MPEG Compression StandardsMPEG Compression Standards
MPEG Compression Standards
 
Compression: Video Compression (MPEG and others)
Compression: Video Compression (MPEG and others)Compression: Video Compression (MPEG and others)
Compression: Video Compression (MPEG and others)
 
Video Compression Basics
Video Compression BasicsVideo Compression Basics
Video Compression Basics
 
Compression
CompressionCompression
Compression
 

Similar to MPEG/Audio Compression

Final presentation
Final presentationFinal presentation
Final presentation
Meghasyam Tummalacherla
 
Audio Compression_2023.pptx
Audio Compression_2023.pptxAudio Compression_2023.pptx
Audio Compression_2023.pptx
zulhelmanz
 
Multimedia.pdf
Multimedia.pdfMultimedia.pdf
Multimedia.pdf
SunayanaShivthare1
 
A1mpeg12 2004
A1mpeg12 2004A1mpeg12 2004
A1mpeg12 2004
Thiago Skiba
 
add9.5.ppt
add9.5.pptadd9.5.ppt
add9.5.ppt
AshenafiGirma5
 
Multimedia Object - Audio
Multimedia Object - AudioMultimedia Object - Audio
Multimedia Object - Audio
Telkom Institute of Management
 
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 AudioNovel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
inventy
 
lect10-mpeg1.ppt
lect10-mpeg1.pptlect10-mpeg1.ppt
lect10-mpeg1.ppt
ssuser6606eb
 
Psychoacoustic Approaches to Audio Steganography
Psychoacoustic Approaches to Audio SteganographyPsychoacoustic Approaches to Audio Steganography
Psychoacoustic Approaches to Audio Steganography
Cody Ray
 
Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compressionMr SMAK
 
audiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptxaudiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptx
PawachMetharattanara
 
Speaker Segmentation (2006)
Speaker Segmentation (2006)Speaker Segmentation (2006)
Speaker Segmentation (2006)
Luís Gustavo Martins
 
Chapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.pptChapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.ppt
VasanthiMuniasamy2
 
C06-Broadcast_Systems1.ppt
C06-Broadcast_Systems1.pptC06-Broadcast_Systems1.ppt
C06-Broadcast_Systems1.ppt
Palanikumar72221
 
Compression of digital voice and video
Compression of digital voice and videoCompression of digital voice and video
Compression of digital voice and videosangusajjan
 
notes_Image Compression_edited.ppt
notes_Image Compression_edited.pptnotes_Image Compression_edited.ppt
notes_Image Compression_edited.ppt
HarisMasood20
 
M1L1-2.ppt
M1L1-2.pptM1L1-2.ppt
M1L1-2.ppt
shareea2002
 

Similar to MPEG/Audio Compression (20)

Final presentation
Final presentationFinal presentation
Final presentation
 
Audio Compression_2023.pptx
Audio Compression_2023.pptxAudio Compression_2023.pptx
Audio Compression_2023.pptx
 
Multimedia.pdf
Multimedia.pdfMultimedia.pdf
Multimedia.pdf
 
A1mpeg12 2004
A1mpeg12 2004A1mpeg12 2004
A1mpeg12 2004
 
add9.5.ppt
add9.5.pptadd9.5.ppt
add9.5.ppt
 
Multimedia Object - Audio
Multimedia Object - AudioMultimedia Object - Audio
Multimedia Object - Audio
 
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 AudioNovel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
 
lect10-mpeg1.ppt
lect10-mpeg1.pptlect10-mpeg1.ppt
lect10-mpeg1.ppt
 
Psychoacoustic Approaches to Audio Steganography
Psychoacoustic Approaches to Audio SteganographyPsychoacoustic Approaches to Audio Steganography
Psychoacoustic Approaches to Audio Steganography
 
Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compression
 
audiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptxaudiocompression-130624061221-phpapp02.pptx
audiocompression-130624061221-phpapp02.pptx
 
Speaker Segmentation (2006)
Speaker Segmentation (2006)Speaker Segmentation (2006)
Speaker Segmentation (2006)
 
Soundpres
SoundpresSoundpres
Soundpres
 
Chapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.pptChapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.ppt
 
C06-Broadcast_Systems1.ppt
C06-Broadcast_Systems1.pptC06-Broadcast_Systems1.ppt
C06-Broadcast_Systems1.ppt
 
Compression of digital voice and video
Compression of digital voice and videoCompression of digital voice and video
Compression of digital voice and video
 
Mixer v1.0.3
Mixer v1.0.3Mixer v1.0.3
Mixer v1.0.3
 
Speech Compression
Speech CompressionSpeech Compression
Speech Compression
 
notes_Image Compression_edited.ppt
notes_Image Compression_edited.pptnotes_Image Compression_edited.ppt
notes_Image Compression_edited.ppt
 
M1L1-2.ppt
M1L1-2.pptM1L1-2.ppt
M1L1-2.ppt
 

More from Daniel Brewster (20)

Evaluation question 2
Evaluation question 2Evaluation question 2
Evaluation question 2
 
Meeting minutes 21
Meeting minutes 21Meeting minutes 21
Meeting minutes 21
 
Meeting minutes 21
Meeting minutes 21Meeting minutes 21
Meeting minutes 21
 
Meeting minutes 23
Meeting minutes 23Meeting minutes 23
Meeting minutes 23
 
Meeting minutes 20
Meeting minutes 20Meeting minutes 20
Meeting minutes 20
 
Meeting minutes 23
Meeting minutes 23Meeting minutes 23
Meeting minutes 23
 
Meeting minutes 22
Meeting minutes 22Meeting minutes 22
Meeting minutes 22
 
Meeting minutes 21
Meeting minutes 21Meeting minutes 21
Meeting minutes 21
 
Meeting m=
Meeting m=Meeting m=
Meeting m=
 
Meeting minutes 19
Meeting minutes 19Meeting minutes 19
Meeting minutes 19
 
Meeting minutes 18
Meeting minutes 18Meeting minutes 18
Meeting minutes 18
 
Meeting minutes 17
Meeting minutes 17Meeting minutes 17
Meeting minutes 17
 
Meeting minutes 16
Meeting minutes 16Meeting minutes 16
Meeting minutes 16
 
Meeting minutes 15
Meeting minutes 15Meeting minutes 15
Meeting minutes 15
 
Meeting minutes 14
Meeting minutes 14Meeting minutes 14
Meeting minutes 14
 
Meeting minutes 13
Meeting minutes 13Meeting minutes 13
Meeting minutes 13
 
Short film analysis 2 losers
Short film analysis 2   losersShort film analysis 2   losers
Short film analysis 2 losers
 
Short film analysis - Tick Tock
Short film analysis - Tick TockShort film analysis - Tick Tock
Short film analysis - Tick Tock
 
Representation
RepresentationRepresentation
Representation
 
Representation
RepresentationRepresentation
Representation
 

Recently uploaded

一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
taqyed
 
Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
SuryaKalyan3
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
ZackSpencer3
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
zvaywau
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
Sandhya J.Nair
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
iraqartsandculture
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
Aine Greaney Ellrott
 
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
zeyhe
 
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
luforfor
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
CristianMestre
 
A Brief Introduction About Hadj Ounis
A Brief  Introduction  About  Hadj OunisA Brief  Introduction  About  Hadj Ounis
A Brief Introduction About Hadj Ounis
Hadj Ounis
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
zvaywau
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
angelicafronda7
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
luforfor
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
SuryaKalyan3
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
aditiyad2020
 
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
zeyhe
 
Fed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine ZorbaFed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine Zorba
mariavlachoupt
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
ClaireWilson398082
 

Recently uploaded (19)

一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
一比一原版(qut毕业证)昆士兰科技大学毕业证如何办理
 
Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)Memory Rental Store - The Chase (Storyboard)
Memory Rental Store - The Chase (Storyboard)
 
Codes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new newCodes n Conventionss copy (2).pptx new new
Codes n Conventionss copy (2).pptx new new
 
一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单一比一原版(DU毕业证)迪肯大学毕业证成绩单
一比一原版(DU毕业证)迪肯大学毕业证成绩单
 
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERSART FORMS OF KERALA: TRADITIONAL AND OTHERS
ART FORMS OF KERALA: TRADITIONAL AND OTHERS
 
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
The Last Polymath: Muntadher Saleh‎‎‎‎‎‎‎‎‎‎‎‎
 
IrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptxIrishWritersCtrsPersonalEssaysMay29.pptx
IrishWritersCtrsPersonalEssaysMay29.pptx
 
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
一比一原版(UniSA毕业证)南澳大学毕业证成绩单如何办理
 
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...2137ad  Merindol Colony Interiors where refugee try to build a seemengly norm...
2137ad Merindol Colony Interiors where refugee try to build a seemengly norm...
 
Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)Inter-Dimensional Girl Boards Segment (Act 3)
Inter-Dimensional Girl Boards Segment (Act 3)
 
A Brief Introduction About Hadj Ounis
A Brief  Introduction  About  Hadj OunisA Brief  Introduction  About  Hadj Ounis
A Brief Introduction About Hadj Ounis
 
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
一比一原版(GU毕业证)格里菲斯大学毕业证成绩单
 
acting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaaacting board rough title here lolaaaaaaa
acting board rough title here lolaaaaaaa
 
2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories2137ad - Characters that live in Merindol and are at the center of main stories
2137ad - Characters that live in Merindol and are at the center of main stories
 
Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)Memory Rental Store - The Ending(Storyboard)
Memory Rental Store - The Ending(Storyboard)
 
ashokathegreat project class 12 presentation
ashokathegreat project class 12 presentationashokathegreat project class 12 presentation
ashokathegreat project class 12 presentation
 
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
一比一原版(QUT毕业证)昆士兰科技大学毕业证成绩单如何办理
 
Fed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine ZorbaFed by curiosity and beauty - Remembering Myrsine Zorba
Fed by curiosity and beauty - Remembering Myrsine Zorba
 
Caffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire WilsonCaffeinated Pitch Bible- developed by Claire Wilson
Caffeinated Pitch Bible- developed by Claire Wilson
 

MPEG/Audio Compression

  • 1. A Tutorial on MPEG/Audio Compression Davis Pan, IEEE Multimedia Journal, Summer 1995 Presented by: Randeep Singh Gakhal CMPT 820, Spring 2004
  • 2. Outline  Introduction  Technical Overview  Polyphase Filter Bank  Psychoacoustic Model  Coding and Bit Allocation  Conclusions and Future Work
  • 3. Introduction  What does MPEG-1 Audio provide? A transparently lossy audio compression system based on the weaknesses of the human ear.  Can provide compression by a factor of 6 and retain sound quality.  One part of a three part standard that includes audio, video, and audio/video synchronization.
  • 5. MPEG-I Audio Features  PCM sampling rate of 32, 44.1, or 48 kHz  Four channel modes:  Monophonic and Dual-monophonic  Stereo and Joint-stereo  Three modes (layers in MPEG-I speak):  Layer I: Computationally cheapest, bit rates > 128kbps  Layer II: Bit rate ~ 128 kbps, used in VCD  Layer III: Most complicated encoding/decoding, bit rates ~ 64kbps, originally intended for streaming audio
  • 6. Human Audio System (ear + brain)  Human sensitivity to sound is non-linear across audible range (20Hz – 20kHz)  Audible range broken into regions where humans cannot perceive a difference  called the critical bands
  • 8. MPEG-I Encoder Architecture  Polyphase Filter Bank: Transforms PCM samples to frequency domain signals in 32 subbands  Psychoacoustic Model: Calculates acoustically irrelevant parts of signal  Bit Allocator: Allots bits to subbands according to input from psychoacoustic calculation.  Frame Creation: Generates an MPEG-I compliant bit stream.
  • 10. Polyphase Filter Bank  Divides audio signal into 32 equal width subband streams in the frequency domain.  Inverse filter at decoder cannot recover signal without some, albeit inaudible, loss.  Based on work by Rothweiler[2].  Standard specifies 512 coefficient analysis window, C[n]
  • 11. Polyphase Filter Bank  Buffer of 512 PCM samples with 32 new samples, X[n], shifted in every computation cycle  Calculate window samples for i=0…511:  Partial calculation for i=0…63:  Calculate 32 subsamples: ][][][ iXiCiZ ⋅= ∑= += 7 0 ]64[][ j jiZiY ∑= ⋅= 63 0 ]][[][][ k kiMiYiS
  • 12. Polyphase Filter Bank  Visualization of the filter[1] :
  • 13. Polyphase Filter Bank  The net effect:  Analysis matrix:  Requires 512 + 32x64 = 2560 multiplies.  Each subband has bandwidth π/32T centered at odd multiples of π/64T ]64[]64[]][[][ 63 0 7 0 jiXjiCkiMiS k j ++= ∑ ∑= =       −+ = 64 )16)(12( cos]][[ πki kiM
  • 14. Polyphase Filter Bank  Shortcomings:  Equal width filters do not correspond with critical band model of auditory system.  Filter bank and its inverse are NOT lossless.  Frequency overlap between subbands.
  • 15. Polyphase Filter Bank  Comparison of filter banks and critical bands[1]:
  • 16. Polyphase Filter Bank  Frequency response of one subband[1] :
  • 18. The Weakness of the Human Ear  Frequency dependent resolution:  We do not have the ability to discern minute differences in frequency within the critical bands.  Auditory masking:  When two signals of very close frequency are both present, the louder will mask the softer.  A masked signal must be louder than some threshold for it to be heard  gives us room to introduce inaudible quantization noise.
  • 19. MPEG-I Psychoacoustic Models  MPEG-I standard defines two models:  Psychoacoustic Model 1:  Less computationally expensive  Makes some serious compromises in what it assumes a listener cannot hear  Psychoacoustic Model 2:  Provides more features suited for Layer III coding, assuming of course, increased processor bandwidth.
  • 20. Psychoacoustic Model  Convert samples to the frequency domain  Apply a Hann weighting and then a DFT  The Hann weighting suppresses the edge artifacts caused by the finite window size, giving a cleaner frequency domain representation.  Model 1 uses a 512-sample (Layer I) or 1024-sample (Layers II and III) window.  Model 2 uses a 1024-sample window and two calculations per frame.
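
The Hann-weighted DFT on this slide can be sketched as follows. This is a minimal illustration, not the standard's procedure: it uses a direct O(N²) DFT from the Python standard library rather than an FFT, it omits the normalization to sound pressure level, and the function names and the test tone are assumptions made for the example.

```python
import cmath
import math

def hann(n):
    # Periodic Hann window: w[i] = 0.5 * (1 - cos(2*pi*i / n)).
    return [0.5 * (1.0 - math.cos(2.0 * math.pi * i / n)) for i in range(n)]

def power_spectrum_db(samples):
    """Hann-weight the block, take a direct DFT, return power in dB for bins 0..N/2."""
    n = len(samples)
    xw = [s * w for s, w in zip(samples, hann(n))]
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(xw[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        # A small floor avoids log10(0) for empty bins.
        spectrum.append(10.0 * math.log10(abs(acc) ** 2 + 1e-12))
    return spectrum

n = 512          # Model 1 window length for Layer I
tone_bin = 40    # hypothetical test tone centred exactly on bin 40
block = [math.sin(2.0 * math.pi * tone_bin * t / n) for t in range(n)]
spec = power_spectrum_db(block)
peak = max(range(len(spec)), key=lambda k: spec[k])
print(peak)
```

With the Hann weighting the tone's energy stays confined to the peak bin and its two immediate neighbours instead of leaking across the spectrum.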
  • 21. Psychoacoustic Model  Need to separate sound into “tone” and “noise” components  Model 1:  Local peaks are tones; lump the remaining spectrum in each critical band into a single noise component at a representative frequency.  Model 2:  Calculate a “tonality” index that estimates how likely each spectral point is to be a tone, based on the previous two analysis windows.
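
A heavily simplified sketch of the Model 1 idea (local peaks become tones, the rest of each critical band is lumped into one noise component) might look like this. It is not the standard's procedure: the peak test here is a bare comparison with the two neighbours, the representative frequency is simply the middle remaining line, and the band edges and spectrum values are hypothetical.

```python
import math

def find_local_peaks(spectrum_db):
    """Indices whose level is strictly above both immediate neighbours."""
    return [k for k in range(1, len(spectrum_db) - 1)
            if spectrum_db[k] > spectrum_db[k - 1] and spectrum_db[k] > spectrum_db[k + 1]]

def split_tones_and_noise(spectrum_db, band_edges):
    """Return (tone indices, list of (representative index, noise level in dB))."""
    tones = find_local_peaks(spectrum_db)
    # Lines next to a tone are treated as part of that tone (an assumption).
    tonal_lines = set()
    for k in tones:
        tonal_lines.update((k - 1, k, k + 1))
    noise = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        rest = [k for k in range(lo, hi) if k not in tonal_lines]
        if not rest:
            continue
        # Sum the remaining lines as power, then convert back to dB.
        power = sum(10.0 ** (spectrum_db[k] / 10.0) for k in rest)
        noise.append((rest[len(rest) // 2], 10.0 * math.log10(power)))
    return tones, noise

spectrum_db = [0, 0, 30, 0, 0, 0, 10, 10, 10, 40, 10, 0]  # made-up levels
band_edges = [0, 6, 12]                                    # two hypothetical bands
tones, noise = split_tones_and_noise(spectrum_db, band_edges)
print(tones, noise)
```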
  • 22. Psychoacoustic Model  “Smear” each signal within its critical band  Use either a masking function (Model 1) or a spreading function (Model 2).  Adjust the calculated threshold by incorporating a “quiet” mask: the masking threshold for each frequency when no other frequencies are present.
  • 23. Psychoacoustic Model  Calculate a masking threshold for each subband in the polyphase filter bank  Model 1:  Selects minima of masking threshold values in range of each subband  Inaccurate at higher frequencies – recall how subbands are linearly distributed, critical bands are NOT!  Model 2:  If subband wider than critical band:  Use minimal masking threshold in subband  If critical band wider than subband:  Use average masking threshold in subband
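
The step from per-line masking thresholds to one threshold per subband can be sketched as below. The sketch assumes the thresholds are already available per spectral line and that the lines divide evenly among the 32 subbands (for example 256 lines from a 512-point transform, 8 per subband). Averaging directly on dB values is a simplification, and the function names are invented for the example.

```python
def subband_threshold(line_thresholds_db, use_minimum=True):
    """Collapse the thresholds of the spectral lines inside one subband."""
    if use_minimum:
        # Model 1, and Model 2 when the subband is wider than the critical band.
        return min(line_thresholds_db)
    # Model 2 when the critical band is wider than the subband.
    return sum(line_thresholds_db) / len(line_thresholds_db)

def thresholds_per_subband(threshold_db, n_subbands=32, use_minimum=True):
    lines = len(threshold_db) // n_subbands  # spectral lines per subband
    return [subband_threshold(threshold_db[sb * lines:(sb + 1) * lines], use_minimum)
            for sb in range(n_subbands)]

threshold_db = [20.0 + 0.1 * k for k in range(256)]  # made-up rising threshold
model1 = thresholds_per_subband(threshold_db)
averaged = thresholds_per_subband(threshold_db, use_minimum=False)
print(model1[0], averaged[0])
```

Taking the minimum is the conservative choice: noise kept below the lowest threshold in the subband is masked everywhere in that subband.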
  • 24. Psychoacoustic Model  The hard work is done – now, we just calculate the signal-to-mask ratio (SMR) per subband  SMR = signal energy / masking threshold  We pass our result on to the coding unit which can now produce a compressed bitstream
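
Because the bit allocator works in decibels, the SMR ratio is normally expressed in dB. A minimal sketch, with made-up energies and hypothetical function names:

```python
import math

def smr_db(signal_energy, masking_threshold):
    """Signal-to-mask ratio in dB from linear energy values."""
    return 10.0 * math.log10(signal_energy / masking_threshold)

def smr_per_subband(signal_energies, masking_thresholds):
    return [smr_db(e, t) for e, t in zip(signal_energies, masking_thresholds)]

# Three hypothetical subbands: well above, at, and below the masking threshold.
ratios = smr_per_subband([1000.0, 10.0, 1.0], [10.0, 10.0, 10.0])
print(ratios)
```

A positive SMR means the signal rises above the mask and needs bits; a negative SMR means the subband is fully masked.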
  • 26. Psychoacoustic Model (example)  Transformation to perceptual domain[1]:
  • 27. Psychoacoustic Model (example)  Calculation of masking thresholds[1]:
  • 28. Psychoacoustic Model (example)  Signal-to-mask ratios[1]:
  • 29. Psychoacoustic Model (example)  What we actually send[1]:
  • 31. Layer Specific Coding  Layer specific frame formats[1]:
  • 32. Layer Specific Coding  Stream of samples is processed in groups[1]:
  • 33. Layer I Coding  Group 12 samples from each of the 32 subbands and encode them in each frame (= 384 samples)  Each group is encoded with 0-15 bits/sample  Each group has a 6-bit scale factor
  • 34. Layer II Coding  Similar to Layer I except:  Each subband now contributes 3 groups of 12 samples = 1152 samples per frame  Can have up to 3 scale factors per subband to avoid audible distortion in special cases  Scale factor selection information (SCFSI) tells the decoder how the scale factors are shared
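
The frame sizes quoted for Layers I and II follow from 32 subbands and groups of 12 samples. The short sketch below reproduces that arithmetic and converts it to a frame duration at one of the supported sampling rates; the function names are assumptions made for the example.

```python
SUBBANDS = 32
SAMPLES_PER_GROUP = 12
GROUPS_PER_FRAME = {1: 1, 2: 3}  # Layer I: one group, Layer II: three groups

def samples_per_frame(layer):
    return SUBBANDS * SAMPLES_PER_GROUP * GROUPS_PER_FRAME[layer]

def frame_duration_ms(layer, sample_rate_hz):
    return 1000.0 * samples_per_frame(layer) / sample_rate_hz

for layer in (1, 2):
    print(layer, samples_per_frame(layer), frame_duration_ms(layer, 48000))
```

At 48 kHz a Layer I frame covers 8 ms of audio and a Layer II frame covers 24 ms.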
  • 35. Layer III Coding  Further subdivides the subbands using the Modified Discrete Cosine Transform (MDCT), which is itself a lossless transform  Higher frequency resolution means lower time resolution, hence the possibility of pre-echo  The Layer III encoder can detect pre-echo conditions and reduce them by borrowing bits from a reservoir of bits left unused by earlier frames
  • 36. Bit Allocation  Determine the number of bits to allot to each subband given the SMR from the psychoacoustic model.  Layers I and II:  Calculate the mask-to-noise ratio:  MNR = SNR – SMR (in dB)  SNR is given by the MPEG-I standard (as a function of quantization levels)  Iterate until no bits are left to allocate:  Allocate bits to the subband with the lowest MNR.  Re-calculate the MNR for the subband that received more bits.
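
The Layer I/II allocation loop can be sketched as a greedy procedure. Several simplifications are assumed here and are not from the standard: SNR is approximated as 6.02 dB per bit instead of using the standard's table, one extra bit per sample is granted per step, and scale factor and header bits are ignored. All names are hypothetical.

```python
def snr_db(bits):
    # Assumed stand-in for the standard's SNR table: about 6.02 dB per bit.
    return 6.02 * bits

def allocate_bits(smr_db, bit_budget, samples_per_subband=12, max_bits=15):
    """Greedy allocation: keep giving a bit to the subband with the lowest MNR."""
    bits = [0] * len(smr_db)

    def mnr(sb):
        return snr_db(bits[sb]) - smr_db[sb]  # MNR = SNR - SMR (dB)

    remaining = bit_budget
    while remaining >= samples_per_subband:
        candidates = [sb for sb in range(len(bits)) if bits[sb] < max_bits]
        if not candidates:
            break
        worst = min(candidates, key=mnr)
        bits[worst] += 1                  # one more bit for each sample in the group
        remaining -= samples_per_subband  # 12 samples per subband group
    return bits

smr = [20.0, 5.0, -3.0, 12.0]              # made-up SMR values for four subbands
allocation = allocate_bits(smr, bit_budget=72)
print(allocation)
```

Subbands with a high SMR receive bits first, and a subband whose signal lies below its mask (negative SMR) may receive none at all.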
  • 37. Bit Allocation  Layer III:  Employs “noise allocation”  Quantizes each spectral value and employs Huffman coding  If Huffman encoding results in noise in excess of the allowed distortion for a subband, the encoder increases the resolution on that subband  The whole process repeats until one of three specified stop conditions is met.
  • 39. Conclusions  MPEG-I provides tremendous compression for relatively cheap computation.  Not suitable for archival or audiophile grade music as very seasoned listeners can discern distortion.  Modifying or searching MPEG-I content requires decompression and is not cheap!
  • 40. Future Work  MPEG-1 audio lays the foundation for all modern audio compression techniques  Lots of progress since then (1994!)  MPEG-2 (1996) extends MPEG audio compression to support 5.1 channel audio  MPEG-4 (1998) attempts to code based on perceived audio objects in the stream  Finally, MPEG-7 (2001) operates at an even higher level of abstraction, focusing on meta-data coding to make content searchable and retrievable
  • 41. References [1] D. Pan, “A Tutorial on MPEG/Audio Compression”, IEEE Multimedia Journal, Summer 1995. [2] J. H. Rothweiler, “Polyphase Quadrature Filters – A New Subband Coding Technique”, Proc. of the Int. Conf. IEEE ASSP, 27.2, pp. 1280-1283, Boston, 1983.