SlideShare a Scribd company logo
1 of 24
1
Audio Compression
Techniques
Prepared by
Razia Nisar Noorani
Lecture 8
2
Introduction
 Digital Audio Compression
 Removal of redundant or otherwise irrelevant
information from audio signal
 Audio compression algorithms are often referred to as
“audio encoders”
 Applications
 Reduces required storage space
 Reduces required transmission bandwidth
3
Audio Compression
 Audio signal – overview
 Sampling rate (# of samples per second)
 Bit rate (# of bits per second). Typically,
uncompressed stereo 16-bit 44.1KHz signal has a
1.4MBps bit rate
 Number of channels (mono / stereo / multichannel)
 Reduction by lowering those values or by data
compression / encoding
4
Audio Data Compression
 Redundant information
 Implicit in the remaining information
 Ex. oversampled audio signal
 oversampling is the process of sampling a signal with a
sampling frequency significantly higher than twice the
bandwidth or highest frequency of the signal being sampled
 Irrelevant information
 Perceptually insignificant
 Cannot be recovered from remaining information
5
Audio Data Compression
 Lossless Audio Compression
Removes redundant data
Resulting signal is same as original – perfect
reconstruction
 Lossy Audio Encoding
Removes irrelevant data
Resulting signal is similar to original
6
Audio Data Compression
 Audio vs. Speech Compression
Techniques
Speech Compression uses a human vocal
tract model to compress signals
Audio Compression does not use this
technique due to larger variety of possible
signal variations
7
Generic Audio Encoder
 Psychoacoustic Model
Psychoacoustics – study of how sounds are
perceived by humans
Uses perceptual coding
 eliminate information from audio signal that is
inaudible to the ear
Detects conditions under which different audio
signal components mask each other
8
Psychoacoustic Model
 Signal Masking
Threshold cut-off
Spectral (Frequency / Simultaneous) Masking
Temporal Masking
 Threshold cut-off and spectral masking
occur in frequency domain, temporal
masking occurs in time domain
9
Signal Masking
 Threshold cut-off
 Hearing threshold level
– a function of
frequency
 Any frequency
components below the
threshold will not be
perceived by human
ear
10
Signal Masking
 Spectral Masking
 A frequency
component can be
partly or fully masked
by another component
that is close to it in
frequency
 This shifts the hearing
threshold
11
Signal Masking
 Temporal Masking
 A quieter sound can
be masked by a louder
sound if they are
temporally close
 Sounds that occur
both (shortly) before
and after volume
increase can be
masked
12
Spectral Analysis
 a device or algorithm that identifies a frequency
domain representation of a time domain signal.
 Tasks of Spectral Analysis
 To derive masking thresholds to determine which
signal components can be eliminated
 To generate a representation of the signal to which
masking thresholds can be applied
 Spectral Analysis is done through transforms or
filter banks
13
Spectral Analysis
 Transforms
Fast Fourier Transform (FFT)
Discrete Cosine Transform (DCT) - similar to
FFT but uses cosine values only
Modified Discrete Cosine Transform (MDCT)
[used by MPEG-1 Layer-III, MPEG-2 AAC,
Dolby AC-3] – overlapped and windowed
version of DCT
14
Spectral Analysis
 Filter Banks
 a filter bank is an array of band-pass filters that
separates the input signal into multiple
components, each one carrying a single
frequency subband of the original signal
 Time sample blocks are passed through a set of
bandpass filters
 Masking thresholds are applied to resulting frequency
subband signals
 Poly-phase and wavelet banks are most popular filter
structures
15
Filter Bank Structures
 Polyphase Filter Bank
[used in all of the MPEG-1 encoders]
Signal is separated into subbands, the widths
of which are equal over the entire frequency
range
The resulting subband signals are
downsampled to create shorter signals (which
are later reconstructed during decoding
process)
16
Filter Bank Structures
 Wavelet Filter Bank
[used by Enhanced Perceptual Audio
Coder (EPAC) by Lucent]
Unlike polyphase filter, the widths of the
subbands are not evenly spaced (narrower for
higher frequencies)
This allows for better time resolution (ex. short
attacks), but at expense of frequency
resolution
17
Noise Allocation
 System Task: derive and apply shifted hearing
threshold to the input signal
 Anything below the threshold doesn’t need to be
transmitted
 Any noise below the threshold is irrelevant
 Frequency component quantization
 Tradeoff between space and noise
 Encoder saves on space by using just enough bits for
each frequency component to keep noise under the
threshold - this is known as noise allocation
18
Noise Allocation
 Pre-echo
 In case a single audio block contains silence followed
by a loud attack, pre-echo error occurs - there will be
audible noise in the silent part of the block after
decoding
 This is avoided by pre-monitoring audio data at
encoding stage and separating audio into shorter
blocks in potential pre-echo case
 This does not completely eliminate pre-echo, but can
make it short enough to be masked by the attack
(temporal masking)
19
Additional Encoding Techniques
 Other encoding techniques techniques are
available (alternative or in combination)
Predictive Coding
Coupling / Delta Encoding
Huffman Encoding
20
Additional Encoding Techniques
 Predictive Coding
 Often used in speech and image compression
 Estimates the expected value for each sample based
on previous sample values
 Transmits/stores the difference between the expected
and received value
 Generates an estimate for the next sample and then
adjusts it by the difference stored for the current
sample
 Used for additional compression in MPEG2 AAC
(Advance audio Coding)
21
Additional Encoding Techniques
 Coupling / Delta encoding
 Used in cases where audio signal consists of two or
more channels (stereo or surround sound)
 Similarities between channels are used for
compression
 A sum and difference between two channels are
derived; difference is usually some value close to zero
and therefore requires less space to encode
 This is a case of lossless encoding process
22
Additional Encoding Techniques
 Huffman Coding
 Information-theory-based technique
 An element of a signal that often reoccurs in the
signal is represented by a simpler symbol, and its
value is stored in a look-up table
 Implemented using a look-up tables in encoder and in
decoder
 Provides substantial lossless compression, but
requires high computational power and therefore is
not very popular
 Used by MPEG1 and MPEG2 AAC
23
Encoding - Final Stages
 Audio data packed into frames
 Frames stored or transmitted
24
Questions

More Related Content

What's hot

Fundamentals of Data compression
Fundamentals of Data compressionFundamentals of Data compression
Fundamentals of Data compressionM.k. Praveen
 
Chapter 5 - Data Compression
Chapter 5 - Data CompressionChapter 5 - Data Compression
Chapter 5 - Data CompressionPratik Pradhan
 
Audio Codec
Audio CodecAudio Codec
Audio Codeclesleyw
 
Multimedia fundamental concepts in video
Multimedia fundamental concepts in videoMultimedia fundamental concepts in video
Multimedia fundamental concepts in videoMazin Alwaaly
 
Video compression
Video compressionVideo compression
Video compressionDarkNight14
 
Chapter 3 - Fundamental Concepts in Video and Digital Audio.ppt
Chapter 3 - Fundamental Concepts in Video and Digital Audio.pptChapter 3 - Fundamental Concepts in Video and Digital Audio.ppt
Chapter 3 - Fundamental Concepts in Video and Digital Audio.pptBinyamBekele3
 
video compression techique
video compression techiquevideo compression techique
video compression techiqueAshish Kumar
 
Video Compression Basics - MPEG2
Video Compression Basics - MPEG2Video Compression Basics - MPEG2
Video Compression Basics - MPEG2VijayKumarArya
 
MPEG video compression standard
MPEG video compression standardMPEG video compression standard
MPEG video compression standardanuragjagetiya
 
MPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video EncodingMPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video EncodingChristian Kehl
 

What's hot (20)

Text compression
Text compressionText compression
Text compression
 
Fundamentals of Data compression
Fundamentals of Data compressionFundamentals of Data compression
Fundamentals of Data compression
 
Digital Audio in Multimedia
Digital Audio in MultimediaDigital Audio in Multimedia
Digital Audio in Multimedia
 
Chapter 5 - Data Compression
Chapter 5 - Data CompressionChapter 5 - Data Compression
Chapter 5 - Data Compression
 
Audio Codec
Audio CodecAudio Codec
Audio Codec
 
Multimedia fundamental concepts in video
Multimedia fundamental concepts in videoMultimedia fundamental concepts in video
Multimedia fundamental concepts in video
 
Compression techniques
Compression techniquesCompression techniques
Compression techniques
 
Video compression
Video compressionVideo compression
Video compression
 
data compression.
data compression.data compression.
data compression.
 
Linear Predictive Coding
Linear Predictive CodingLinear Predictive Coding
Linear Predictive Coding
 
Chapter 3 - Fundamental Concepts in Video and Digital Audio.ppt
Chapter 3 - Fundamental Concepts in Video and Digital Audio.pptChapter 3 - Fundamental Concepts in Video and Digital Audio.ppt
Chapter 3 - Fundamental Concepts in Video and Digital Audio.ppt
 
video compression techique
video compression techiquevideo compression techique
video compression techique
 
Data compression
Data compressionData compression
Data compression
 
Video Compression Basics - MPEG2
Video Compression Basics - MPEG2Video Compression Basics - MPEG2
Video Compression Basics - MPEG2
 
Audio and Video Compression
Audio and Video CompressionAudio and Video Compression
Audio and Video Compression
 
MPEG video compression standard
MPEG video compression standardMPEG video compression standard
MPEG video compression standard
 
MPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video EncodingMPEG-1 Part 2 Video Encoding
MPEG-1 Part 2 Video Encoding
 
SPEECH CODING
SPEECH CODINGSPEECH CODING
SPEECH CODING
 
Data compression
Data compressionData compression
Data compression
 
Digital Audio
Digital AudioDigital Audio
Digital Audio
 

Similar to Audio compression 1

Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compressionMr SMAK
 
PSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMPSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMirjes
 
PSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMPSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMIJRES Journal
 
Audio_Overview.pptx
Audio_Overview.pptxAudio_Overview.pptx
Audio_Overview.pptxBinhHoang71
 
Multimedia Compression and Communication
Multimedia Compression and CommunicationMultimedia Compression and Communication
Multimedia Compression and CommunicationBenesh Selvanesan
 
Audio and video compression
Audio and video compressionAudio and video compression
Audio and video compressionneeraj9217
 
PHOENIX AUDIO TECHNOLOGIES - A large Audio Signal Algorithm Portfolio
PHOENIX AUDIO TECHNOLOGIES  - A large Audio Signal Algorithm PortfolioPHOENIX AUDIO TECHNOLOGIES  - A large Audio Signal Algorithm Portfolio
PHOENIX AUDIO TECHNOLOGIES - A large Audio Signal Algorithm PortfolioHTCS LLC
 
Chapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.pptChapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.pptVasanthiMuniasamy2
 
NTSC Software Decoding Presentation
NTSC Software Decoding PresentationNTSC Software Decoding Presentation
NTSC Software Decoding PresentationPrateek Dayal
 
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 AudioNovel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audioinventy
 

Similar to Audio compression 1 (20)

Speech Compression
Speech CompressionSpeech Compression
Speech Compression
 
Lecture 8 audio compression
Lecture 8 audio compressionLecture 8 audio compression
Lecture 8 audio compression
 
PSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMPSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEM
 
PSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEMPSoC BASED SPEECH RECOGNITION SYSTEM
PSoC BASED SPEECH RECOGNITION SYSTEM
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Sub band project
Sub band projectSub band project
Sub band project
 
Audio_Overview.pptx
Audio_Overview.pptxAudio_Overview.pptx
Audio_Overview.pptx
 
Multimedia Compression and Communication
Multimedia Compression and CommunicationMultimedia Compression and Communication
Multimedia Compression and Communication
 
Audio and video compression
Audio and video compressionAudio and video compression
Audio and video compression
 
PHOENIX AUDIO TECHNOLOGIES - A large Audio Signal Algorithm Portfolio
PHOENIX AUDIO TECHNOLOGIES  - A large Audio Signal Algorithm PortfolioPHOENIX AUDIO TECHNOLOGIES  - A large Audio Signal Algorithm Portfolio
PHOENIX AUDIO TECHNOLOGIES - A large Audio Signal Algorithm Portfolio
 
Chapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.pptChapter 2- Digital Data Acquistion.ppt
Chapter 2- Digital Data Acquistion.ppt
 
Digital audio
Digital audioDigital audio
Digital audio
 
M1L1-2.ppt
M1L1-2.pptM1L1-2.ppt
M1L1-2.ppt
 
Mk3422222228
Mk3422222228Mk3422222228
Mk3422222228
 
Mp3
Mp3Mp3
Mp3
 
Digital audio
Digital audioDigital audio
Digital audio
 
MPEG/Audio Compression
MPEG/Audio CompressionMPEG/Audio Compression
MPEG/Audio Compression
 
Speaker Segmentation (2006)
Speaker Segmentation (2006)Speaker Segmentation (2006)
Speaker Segmentation (2006)
 
NTSC Software Decoding Presentation
NTSC Software Decoding PresentationNTSC Software Decoding Presentation
NTSC Software Decoding Presentation
 
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 AudioNovel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audio
 

Recently uploaded

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AIabhishek36461
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfROCENODodongVILLACER
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...asadnawaz62
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfAsst.prof M.Gokilavani
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerAnamika Sarkar
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx959SahilShah
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxwendy cai
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024hassan khalil
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEroselinkalist12
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girlsssuser7cb4ff
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIkoyaldeepu123
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxJoão Esperancinha
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024Mark Billinghurst
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .Satyam Kumar
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxk795866
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage examplePragyanshuParadkar1
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineeringmalavadedarshan25
 

Recently uploaded (20)

Past, Present and Future of Generative AI
Past, Present and Future of Generative AIPast, Present and Future of Generative AI
Past, Present and Future of Generative AI
 
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptxExploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
Exploring_Network_Security_with_JA3_by_Rakesh Seal.pptx
 
Risk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdfRisk Assessment For Installation of Drainage Pipes.pdf
Risk Assessment For Installation of Drainage Pipes.pdf
 
complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...complete construction, environmental and economics information of biomass com...
complete construction, environmental and economics information of biomass com...
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdfCCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
CCS355 Neural Networks & Deep Learning Unit 1 PDF notes with Question bank .pdf
 
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube ExchangerStudy on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
Study on Air-Water & Water-Water Heat Exchange in a Finned Tube Exchanger
 
Application of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptxApplication of Residue Theorem to evaluate real integrations.pptx
Application of Residue Theorem to evaluate real integrations.pptx
 
What are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptxWhat are the advantages and disadvantages of membrane structures.pptx
What are the advantages and disadvantages of membrane structures.pptx
 
Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024Architect Hassan Khalil Portfolio for 2024
Architect Hassan Khalil Portfolio for 2024
 
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETEINFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
INFLUENCE OF NANOSILICA ON THE PROPERTIES OF CONCRETE
 
Call Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call GirlsCall Girls Narol 7397865700 Independent Call Girls
Call Girls Narol 7397865700 Independent Call Girls
 
EduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AIEduAI - E learning Platform integrated with AI
EduAI - E learning Platform integrated with AI
 
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptxDecoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
Decoding Kotlin - Your guide to solving the mysterious in Kotlin.pptx
 
IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024IVE Industry Focused Event - Defence Sector 2024
IVE Industry Focused Event - Defence Sector 2024
 
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCRCall Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
Call Us -/9953056974- Call Girls In Vikaspuri-/- Delhi NCR
 
Churning of Butter, Factors affecting .
Churning of Butter, Factors affecting  .Churning of Butter, Factors affecting  .
Churning of Butter, Factors affecting .
 
Introduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptxIntroduction-To-Agricultural-Surveillance-Rover.pptx
Introduction-To-Agricultural-Surveillance-Rover.pptx
 
DATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage exampleDATA ANALYTICS PPT definition usage example
DATA ANALYTICS PPT definition usage example
 
Internship report on mechanical engineering
Internship report on mechanical engineeringInternship report on mechanical engineering
Internship report on mechanical engineering
 

Audio compression 1

  • 2. 2 Introduction  Digital Audio Compression  Removal of redundant or otherwise irrelevant information from audio signal  Audio compression algorithms are often referred to as “audio encoders”  Applications  Reduces required storage space  Reduces required transmission bandwidth
  • 3. 3 Audio Compression  Audio signal – overview  Sampling rate (# of samples per second)  Bit rate (# of bits per second). Typically, uncompressed stereo 16-bit 44.1KHz signal has a 1.4MBps bit rate  Number of channels (mono / stereo / multichannel)  Reduction by lowering those values or by data compression / encoding
  • 4. 4 Audio Data Compression  Redundant information  Implicit in the remaining information  Ex. oversampled audio signal  oversampling is the process of sampling a signal with a sampling frequency significantly higher than twice the bandwidth or highest frequency of the signal being sampled  Irrelevant information  Perceptually insignificant  Cannot be recovered from remaining information
  • 5. 5 Audio Data Compression  Lossless Audio Compression Removes redundant data Resulting signal is same as original – perfect reconstruction  Lossy Audio Encoding Removes irrelevant data Resulting signal is similar to original
  • 6. 6 Audio Data Compression  Audio vs. Speech Compression Techniques Speech Compression uses a human vocal tract model to compress signals Audio Compression does not use this technique due to larger variety of possible signal variations
  • 7. 7 Generic Audio Encoder  Psychoacoustic Model Psychoacoustics – study of how sounds are perceived by humans Uses perceptual coding  eliminate information from audio signal that is inaudible to the ear Detects conditions under which different audio signal components mask each other
  • 8. 8 Psychoacoustic Model  Signal Masking Threshold cut-off Spectral (Frequency / Simultaneous) Masking Temporal Masking  Threshold cut-off and spectral masking occur in frequency domain, temporal masking occurs in time domain
  • 9. 9 Signal Masking  Threshold cut-off  Hearing threshold level – a function of frequency  Any frequency components below the threshold will not be perceived by human ear
  • 10. 10 Signal Masking  Spectral Masking  A frequency component can be partly or fully masked by another component that is close to it in frequency  This shifts the hearing threshold
  • 11. 11 Signal Masking  Temporal Masking  A quieter sound can be masked by a louder sound if they are temporally close  Sounds that occur both (shortly) before and after volume increase can be masked
  • 12. 12 Spectral Analysis  a device or algorithm that identifies a frequency domain representation of a time domain signal.  Tasks of Spectral Analysis  To derive masking thresholds to determine which signal components can be eliminated  To generate a representation of the signal to which masking thresholds can be applied  Spectral Analysis is done through transforms or filter banks
  • 13. 13 Spectral Analysis  Transforms Fast Fourier Transform (FFT) Discrete Cosine Transform (DCT) - similar to FFT but uses cosine values only Modified Discrete Cosine Transform (MDCT) [used by MPEG-1 Layer-III, MPEG-2 AAC, Dolby AC-3] – overlapped and windowed version of DCT
  • 14. 14 Spectral Analysis  Filter Banks  a filter bank is an array of band-pass filters that separates the input signal into multiple components, each one carrying a single frequency subband of the original signal  Time sample blocks are passed through a set of bandpass filters  Masking thresholds are applied to resulting frequency subband signals  Poly-phase and wavelet banks are most popular filter structures
  • 15. 15 Filter Bank Structures  Polyphase Filter Bank [used in all of the MPEG-1 encoders] Signal is separated into subbands, the widths of which are equal over the entire frequency range The resulting subband signals are downsampled to create shorter signals (which are later reconstructed during decoding process)
  • 16. 16 Filter Bank Structures  Wavelet Filter Bank [used by Enhanced Perceptual Audio Coder (EPAC) by Lucent] Unlike polyphase filter, the widths of the subbands are not evenly spaced (narrower for higher frequencies) This allows for better time resolution (ex. short attacks), but at expense of frequency resolution
  • 17. 17 Noise Allocation  System Task: derive and apply shifted hearing threshold to the input signal  Anything below the threshold doesn’t need to be transmitted  Any noise below the threshold is irrelevant  Frequency component quantization  Tradeoff between space and noise  Encoder saves on space by using just enough bits for each frequency component to keep noise under the threshold - this is known as noise allocation
  • 18. 18 Noise Allocation  Pre-echo  In case a single audio block contains silence followed by a loud attack, pre-echo error occurs - there will be audible noise in the silent part of the block after decoding  This is avoided by pre-monitoring audio data at encoding stage and separating audio into shorter blocks in potential pre-echo case  This does not completely eliminate pre-echo, but can make it short enough to be masked by the attack (temporal masking)
  • 19. 19 Additional Encoding Techniques  Other encoding techniques techniques are available (alternative or in combination) Predictive Coding Coupling / Delta Encoding Huffman Encoding
  • 20. 20 Additional Encoding Techniques  Predictive Coding  Often used in speech and image compression  Estimates the expected value for each sample based on previous sample values  Transmits/stores the difference between the expected and received value  Generates an estimate for the next sample and then adjusts it by the difference stored for the current sample  Used for additional compression in MPEG2 AAC (Advance audio Coding)
  • 21. 21 Additional Encoding Techniques  Coupling / Delta encoding  Used in cases where audio signal consists of two or more channels (stereo or surround sound)  Similarities between channels are used for compression  A sum and difference between two channels are derived; difference is usually some value close to zero and therefore requires less space to encode  This is a case of lossless encoding process
  • 22. 22 Additional Encoding Techniques  Huffman Coding  Information-theory-based technique  An element of a signal that often reoccurs in the signal is represented by a simpler symbol, and its value is stored in a look-up table  Implemented using a look-up tables in encoder and in decoder  Provides substantial lossless compression, but requires high computational power and therefore is not very popular  Used by MPEG1 and MPEG2 AAC
  • 23. 23 Encoding - Final Stages  Audio data packed into frames  Frames stored or transmitted

Editor's Notes

  1. Hello, Today I will talk about the common techniques commonly used for digital audio compression of various audio filetype formats.
  2. -I will discuss the difference between redundant and irrelevant further in my presentation. -Depending on storage or transmission, there is an optimization in size