Lecture 8 audio compression
Published in: Education, Technology
  • Hello. Today I will talk about the techniques commonly used for digital audio compression across various audio file formats.
  • I will discuss the difference between redundant and irrelevant information later in the presentation. How far the size is optimized depends on whether the audio is being stored or transmitted.
  • Transcript

    • 1. Audio Compression Techniques
      Lecture 8
      Prepared by Razia Nisar Noorani
    • 2. Introduction
      Digital Audio Compression
        • Removal of redundant or otherwise irrelevant information from an audio signal
        • Audio compression algorithms are often referred to as "audio encoders"
      Applications
        • Reduces required storage space
        • Reduces required transmission bandwidth
    • 3. Audio Compression
      Audio signal: overview
        • Sampling rate (number of samples per second)
        • Bit rate (number of bits per second). Typically, an uncompressed stereo 16-bit 44.1 kHz signal has a bit rate of about 1.4 Mbps (44,100 × 16 × 2 = 1,411,200 bits per second)
        • Number of channels (mono / stereo / multichannel)
      The data rate is reduced by lowering those values or by data compression / encoding
    • 4. Audio Data Compression Redundant information  Implicit in the remaining information  Ex. oversampled audio signal  oversampling is the process of sampling a signal with a sampling frequency significantly higher than twice the bandwidth or highest frequency of the signal being sampled Irrelevant information  Perceptuallyinsignificant  Cannot be recovered from remaining information 4
    • 5. Audio Data Compression Lossless Audio Compression  Removes redundant data  Resulting signal is same as original – perfect reconstruction Lossy Audio Encoding  Removes irrelevant data  Resulting signal is similar to original 5
    • 6. Audio Data Compression Audio vs. Speech Compression Techniques  Speech Compression uses a human vocal tract model to compress signals  Audio Compression does not use this technique due to larger variety of possible signal variations 6
    • 7. Generic Audio Encoder Psychoacoustic Model  Psychoacoustics – study of how sounds are perceived by humans  Uses perceptual coding  eliminate information from audio signal that is inaudible to the ear  Detectsconditions under which different audio signal components mask each other 7
    • 8. Psychoacoustic Model Signal Masking  Threshold cut-off  Spectral (Frequency / Simultaneous) Masking  Temporal Masking Threshold cut-off and spectral masking occur in frequency domain, temporal masking occurs in time domain 8
    • 9. Signal Masking Threshold cut-off  Hearing threshold level – a function of frequency  Any frequency components below the threshold will not be perceived by human ear 9
    • 10. Signal Masking Spectral Masking A frequency component can be partly or fully masked by another component that is close to it in frequency  This shifts the hearing threshold 10
    • 11. Signal Masking Temporal Masking A quieter sound can be masked by a louder sound if they are temporally close  Sounds that occur both (shortly) before and after volume increase can be masked 11
    • 12. Spectral Analysis a device or algorithm that identifies a frequency domain representation of a time domain signal. Tasks of Spectral Analysis  To derive masking thresholds to determine which signal components can be eliminated  To generate a representation of the signal to which masking thresholds can be applied Spectral Analysis is done through transforms or filter banks 12
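To make "frequency-domain representation" concrete, here is a minimal sketch that computes the magnitude spectrum of one block with a naive discrete Fourier transform. It is written for clarity, not speed (a real encoder would use an FFT), and the block length and test tone are arbitrary choices for the example.

```python
import cmath
import math


def dft_magnitudes(block):
    """Magnitudes of DFT bins 0..N/2 for a real-valued block."""
    n = len(block)
    return [abs(sum(x * cmath.exp(-2j * math.pi * k * t / n)
                    for t, x in enumerate(block)))
            for k in range(n // 2 + 1)]


n = 64
# Test tone that completes exactly 5 cycles within the block
block = [math.sin(2 * math.pi * 5 * t / n) for t in range(n)]
mags = dft_magnitudes(block)
peak = max(range(len(mags)), key=mags.__getitem__)
print(peak)  # 5: the energy is concentrated in a single frequency bin
```

Once the signal is expressed as per-bin magnitudes like this, each bin can be compared against a masking threshold.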
    • 13. Spectral Analysis Transforms  Fast Fourier Transform (FFT)  Discrete Cosine Transform (DCT) - similar to FFT but uses cosine values only  Modified Discrete Cosine Transform (MDCT) [used by MPEG-1 Layer-III, MPEG-2 AAC, Dolby AC-3] – overlapped and windowed version of DCT 13
    • 14. Spectral Analysis Filter Banks a filter bank is an array of band-pass filters that separates the input signal into multiple components, each one carrying a single frequency subband of the original signal  Time sample blocks are passed through a set of bandpass filters  Masking thresholds are applied to resulting frequency subband signals  Poly-phase and wavelet banks are most popular filter structures 14
    • 15. Filter Bank Structures Polyphase Filter Bank [used in all of the MPEG-1 encoders]  Signal is separated into subbands, the widths of which are equal over the entire frequency range  The resulting subband signals are downsampled to create shorter signals (which are later reconstructed during decoding process) 15
    • 16. Filter Bank Structures Wavelet Filter Bank [used by Enhanced Perceptual Audio Coder (EPAC) by Lucent]  Unlike polyphase filter, the widths of the subbands are not evenly spaced (narrower for higher frequencies)  This allows for better time resolution (ex. short attacks), but at expense of frequency resolution 16
    • 17. Noise Allocation System Task: derive and apply shifted hearing threshold to the input signal  Anything below the threshold doesn’t need to be transmitted  Any noise below the threshold is irrelevant Frequency component quantization  Tradeoff between space and noise  Encoder saves on space by using just enough bits for each frequency component to keep noise under the threshold - this is known as noise allocation 17
    • 18. Noise Allocation Pre-echo  In case a single audio block contains silence followed by a loud attack, pre-echo error occurs - there will be audible noise in the silent part of the block after decoding  This is avoided by pre-monitoring audio data at encoding stage and separating audio into shorter blocks in potential pre-echo case  This does not completely eliminate pre-echo, but can make it short enough to be masked by the attack (temporal masking) 18
    • 19. Additional Encoding Techniques Other encoding techniques techniques are available (alternative or in combination)  Predictive Coding  Coupling / Delta Encoding  Huffman Encoding 19
    • 20. Additional Encoding Techniques Predictive Coding  Often used in speech and image compression  Estimates the expected value for each sample based on previous sample values  Transmits/stores the difference between the expected and received value  Generates an estimate for the next sample and then adjusts it by the difference stored for the current sample  Used for additional compression in MPEG2 AAC (Advance audio Coding) 20
    • 21. Additional Encoding Techniques Coupling / Delta encoding  Used in cases where audio signal consists of two or more channels (stereo or surround sound)  Similarities between channels are used for compression  A sum and difference between two channels are derived; difference is usually some value close to zero and therefore requires less space to encode  This is a case of lossless encoding process 21
    • 22. Additional Encoding Techniques Huffman Coding  Information-theory-based technique  An element of a signal that often reoccurs in the signal is represented by a simpler symbol, and its value is stored in a look-up table  Implemented using a look-up tables in encoder and in decoder  Provides substantial lossless compression, but requires high computational power and therefore is not very popular  Used by MPEG1 and MPEG2 AAC 22
    • 23. Encoding: Final Stages
        • Audio data is packed into frames
        • Frames are stored or transmitted
    • 24. Questions