Audio encoding principles


Principles of digital audio encoding and transcoding. The relationship between sound and audio data.

  1. Audio file compression
  2. Audio Compression
     • Audio compression can be lossless or lossy
       • Lossless compression reduces the file size (modestly) and keeps all of the information
       • Lossy compression reduces the file size dramatically but decreases sound quality (loses audible information)
     • Lossy audio compression uses psychoacoustic redundancy principles
       • Parts of a clip where quality is ‘important’ (singing, talking, complex harmonies, etc.) are written with more data
       • ‘Unimportant’ parts (silence, for example) are written with less data
  3. Lossless Audio Formats
     • Some common lossless audio formats
     • AIFF: Audio Interchange File Format
       • Designed by Apple
       • Used for uncompressed storage, generally on Mac systems
       • Used in audio processing / editing workflows
     • WAV: Waveform Audio File Format
       • Designed by Microsoft (with IBM)
       • Used for uncompressed storage, generally on Windows systems
       • Used in audio processing / editing workflows
     • Nowadays, both AIFF and WAV will generally play on either PC or Mac systems
     • FLAC: Free Lossless Audio Codec
       • Open source codec
  4. Lossy audio codecs
     • Some common lossy audio codecs
     • MP3 (MPEG-1 Audio Layer III)
       • A ubiquitous codec (plays on almost everything)
     • AAC (MPEG-4 Part 3), supported by the HTML5 player
       • Common iTunes codec, becoming universally playable
       • Popular choice for podcasts, as it can be encoded as an enhanced podcast containing images, chapters and URL links (on iTunes only)
       • Takes the extension .m4a, or .m4p in certain instances
     • (Ogg) Vorbis
       • Open source audio codec, also supported by the HTML5 player
     • WMA (Windows Media Audio)
       • Windows media file
       • Can encode surround sound
  5. Compressing Audio
     • Audio compression and the resulting aural quality are affected by changing the following properties of your source audio file
     • Sample rate (frequency)
       • The number of samples taken per second (temporal resolution)
     • Bit depth
       • The amount of information used at a single sample point (resolution)
     • Bit rate
       • Number of bits per second (measured in kbps)
     • Channels
       • Mono, stereo and surround sound files
     • Your ear will always have the final say in whether a file has been compressed to an acceptable quality
  6. Sample rate
     • The number of samples taken from an audio source in one second
     • Measured in kHz
       • 48 kHz (high-end DV)
       • 44.1 kHz (CD quality)
       • 32 kHz (some digital video)
  7. Bit depth (audio)
     • The amount of data (bits) used in each sample of audio (bit ‘word length’)
     • 2 bit: 00, 01, 11, 10 (4 possible values, or 2^2)
     • 4 bit: 0000, 0001, 0011, 0111, 1111, 1110, 1100, 1000, 1101, 1011, 1001, 1010, 0101, …, 0110 (16 possible values, or 2^4)
     • The more data recorded at each sample, the greater the fidelity to (the less degradation of) the original signal
  8. Bit depth (audio), continued
     • Recording and storing at a high bit depth allows freer use of effects in post-production before degradation occurs
     • Audio recorded at 24 bit can be heavily processed (effects) and output as 16 bit without audible degradation
     • Some common bit depths
       • Phone: 8 bit (VoIP). Enough data to capture the human voice range, but can remove certain qualities of the original voice
       • CD: 16 bit
       • DVD and Blu-ray: 20 to 24 bit
       • Post-production: 24 to 32 bit
  9. Bit rate: kilobits per second (kbps)
     • Bit rate = the amount of data required to store one second of audio (and to stream / play the file)
     • 128 kbps: acceptable ‘tipping point’ in terms of audible quality (default iTunes import conversion setting)
     • Lower bitrates give the higher frequencies in the sound a ‘sizzling’ effect (poor quality)
       • Compensate by a) increasing the bitrate or b) cutting the higher frequencies in the mix
     • Constant bitrate (CBR)
       • A bitrate is set and stays fixed for the duration of the clip
       • A fast method of encoding, done in one pass (through the data)
       • Used for live streamed audio (‘on the fly’)
  10. Relationship of bitrate to sample rate
      • Each sample is like a ‘slice’ of audio
      • The bits available in any given second (the bitrate, in kbps) are distributed amongst all the samples in that second
      • If the bitrate is fixed:
        • The fewer samples per second, the more data is given to each sample, and the higher the audible sound quality
        • The more samples per second, the less data is given to each sample, and the lower the sound quality
  11. Variable bitrate (VBR)
      • Adaptive encoding method: the software decides on an appropriate bitrate according to psychoacoustic redundancy principles (how much information is required for a given moment of audio?)
        • Quiet / silent parts are given less data (compressed more)
        • Complex / loud sections with more detailed sound are given more data (compressed less)
      • Longer files benefit from VBR in terms of file size
      • VBR can be constrained to maximum / minimum / average values
      • VBR encoding commonly uses two (or more) passes through the data in the file
        1. Pass one determines how much compression is required in a given part; the location of designated sections and the corresponding compression amount is logged
        2. Pass two applies the appropriate amount of compression to each of those sections
  12. Factors determining quality / file size in audio compression
      • Bitrate
        • How many bits (data) are available per second?
      • Channels
        • 1 channel (mono) will get all the available bits
        • 2 channels (stereo) will have half the available bits each
      • Sample rate
        • The bits are distributed amongst the samples
        • The more samples, the fewer bits available for each
  13. Sound ‘artifacting’
      • Audio such as music, exported at a bitrate of 128 kbps and a sample rate of 44.1 kHz, will generally sound acceptable
      • High frequencies will start to degrade at settings below that
        • A ‘hissing’ or phased effect will creep into the audio
      • Compensating solutions include:
        • Raise the bitrate / sample rate and remove one channel (if file size needs to be maintained)
        • Go back to your stereo project, bring down the high frequencies in the mix and re-export a version for low bandwidths
  14. Calculating data in a transcode / ADC / recording scenario (uncompressed audio, such as PCM)
      • Example: 34 sec, 44,100 Hz, 16 bit, stereo
        • Sample rate: 44,100 Hz
        • Bit depth: × 16
        • Channels: × 2
        • Total bits per second: 1,411,200
        • Divided by 8 = bytes per second: 176,400
        • Multiplied by duration (34 s): 5,997,600 bytes
        • Divided by 1024: 5,857.03125 KB
        • Divided by 1024: 5.719757080078125 MB
  15. Calculating bitrate / file size of compressed audio
      • Bitrate = file size / duration
      • File size = bitrate × duration
      • Duration = file size / bitrate
      • Bitrates are expressed in bits and file sizes in bytes, so convert between the two with a factor of 8
      • Exercises
        • A piece of audio is 10 seconds long and 320 KB in size. What is the bitrate (expressed in kilobits per second)?
        • A piece of audio has a bitrate of 128 kbps and is 5 seconds in duration. What is the file size (in kilobytes (KB))?
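
The arithmetic on the last few slides can be sketched in a few lines of Python. This is only an illustration, not part of the original deck: the function names are made up for the example, the compressed-audio helpers assume that "kilo" means the same thing for kilobits and kilobytes (so the two differ only by the factor of 8), and the bits-per-sample helper assumes 1 kbps = 1000 bits per second and gives an average budget rather than a description of how any real codec allocates bits.

```python
def uncompressed_size_bytes(sample_rate_hz, bit_depth, channels, duration_s):
    """Size of uncompressed (PCM-style) audio, as in the 34-second example."""
    bits_per_second = sample_rate_hz * bit_depth * channels
    return bits_per_second * duration_s // 8  # 8 bits per byte


def bitrate_from_size(file_size_kb, duration_s):
    """Bitrate = file size / duration, converted from kilobytes to kilobits."""
    return file_size_kb * 8 / duration_s


def size_from_bitrate(kbps, duration_s):
    """File size = bitrate * duration, converted from kilobits to kilobytes."""
    return kbps * duration_s / 8


def average_bits_per_sample(kbps, sample_rate_hz, channels):
    """Average bit budget per sample per channel at a fixed bitrate.

    Assumes 1 kbps = 1000 bits per second (an assumption of this sketch).
    """
    return kbps * 1000 / (sample_rate_hz * channels)


if __name__ == "__main__":
    size = uncompressed_size_bytes(44100, 16, 2, 34)
    print(size, "bytes =", size / 1024, "KB =", size / 1024 / 1024, "MB")
    print(bitrate_from_size(320, 10), "kbps")  # first exercise
    print(size_from_bitrate(128, 5), "KB")     # second exercise
    # Halving the channel count doubles the average budget per sample.
    print(average_bits_per_sample(128, 44100, 2))
    print(average_bits_per_sample(128, 44100, 1))
```

Under those assumptions, the two exercises work out to 256 kbps and 80 KB, and the uncompressed example reproduces the 5,997,600 bytes from the worked table.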