Abstract:
In the age of smartphones and internet-ready devices, audio/video transport and distribution has evolved from sharing low-quality files to providing high-quality mobile device streams, click-to-play content, over-the-air broadcasting, audio distribution in large facilities, and more. Each medium has several methods of compressing content by means of a codec. This session will explain which codecs are appropriate for which purposes, common misuses of audio codecs, and how to maintain audio quality by implementing codecs professionally.

Introduction:
What is a codec?
Codec (enCOder/DECoder or COmpressor/DECompressor): software (or hardware) that compresses and decompresses audio and video data streams.
The purpose of codecs is to reduce the size of digital audio samples and video frames in order to speed up transmission and save storage space.
According to Wikipedia, "A codec is a device or program capable of performing encoding and decoding on a digital data stream or signal." In plain English I'd put it this way: a codec allows one to read and save audio and video files, often for the purpose of saving space.
The best-known example of a codec is MP3. It compresses bulky audio files such as WAV into much smaller MP3 files.
All codecs involve a tradeoff between the amount of compression and the resultant quality. If you compress too much, the quality loss may become intolerable.
A codec can consist of two components: an encoder and a decoder. The encoder performs the compression (encoding) function and the decoder performs the decompression (decoding) function. Some codecs include both of these components and some include only one of them.
For example, when you rip a song from an audio CD to your computer, Windows Media Player uses the Windows Media Audio codec by default to compress the song into a compact WMA file.
When you play that WMA file (or any WMA file that might be streamed from a website), the Player uses the Windows Media Audio codec to decompress the file so the music can be played through your speakers.
Hence, a codec is software that is used to compress or decompress a digital media file, such as a song or video.

Why do we need codecs?
ANS: Because video and music files are large, they are difficult to transfer across the Internet quickly. To help speed up downloads, mathematical "codecs" were built to encode ("shrink") a signal for transmission and then decode it for viewing or editing. Without codecs, downloads would take three to five times longer than they do now.
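To make the size problem concrete, here is a minimal sketch of the arithmetic behind uncompressed data rates and download times. The figures are assumptions chosen for illustration, not measurements: CD audio at 44.1 kHz, 16-bit stereo; a 640x480 frame with 24-bit colour at 29.97 fps; a hypothetical 4-minute song; and a 128 kbps MP3. The function names are made up for this example.

```python
# Uncompressed data rates and ideal download times (illustrative arithmetic only).

def audio_bytes_per_second(sample_rate_hz, bits_per_sample, channels):
    """Raw PCM data rate in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

def video_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Raw video data rate in bytes per second."""
    return width * height * bytes_per_pixel * fps

def download_seconds(size_bytes, link_bits_per_second):
    """Ideal transfer time, ignoring protocol overhead."""
    return size_bytes * 8 / link_bits_per_second

# Assumption: CD audio is 44.1 kHz, 16-bit, stereo.
cd_rate = audio_bytes_per_second(44_100, 16, 2)
# Assumption: 640x480 square-pixel frame, 24-bit colour, 29.97 fps.
ntsc_rate = video_bytes_per_second(640, 480, 3, 29.97)

song_seconds = 4 * 60                     # hypothetical 4-minute song
wav_size = cd_rate * song_seconds         # uncompressed PCM payload
mp3_size = 128_000 / 8 * song_seconds     # assumed 128 kbps MP3

print(f"CD audio: {cd_rate / 1e3:.1f} kB/s")
print(f"Video + audio: {(ntsc_rate + cd_rate) / 1e6:.1f} MB/s")
print(f"WAV over 56K: {download_seconds(wav_size, 56_000) / 60:.0f} min")
print(f"MP3 over 56K: {download_seconds(mp3_size, 56_000) / 60:.0f} min")
```

Under these assumptions the combined raw video and audio rate comes to about 27.8 MB per second, and the uncompressed song takes roughly eleven times longer to download than the MP3. Different bitrates or frame formats would change both numbers.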
The data rate of a video file determines whether the target device can play it back. Originally CD drives were only 1X, and the data rate of video files had to fall within the limitations of that device for playback. To reach the planned target, some concessions had to be made to allow smooth playback, i.e. smaller frame resolutions, lower frame rates, and lower audio rates. This is where compression comes into play - all these factors are gathered together and crammed into a single stream for playback at the lowered target data rate.
The basic idea behind codecs and compression/decompression is not difficult: more compression means lower picture quality, and less compression means better picture quality. It is more difficult to understand how to compress the video in order to achieve the target playback you desire. There is always a trade-off when it comes to passing video over the web, particularly when users have 56K or slower connections. Unless the user has broadband, the image has to suffer in some way: it has to be compressed to lower data rates, and frame rates are cut down to 10 or 15 fps.
Compression has always had a lot to do with the ability to share video over the various connection speeds of the internet. Full-frame-resolution NTSC video with uncompressed CD-quality audio has a data rate of nearly 30 MB per second. Even with large amounts of compression, that stream would choke a 56K connection. Over the years many different types of codecs have been developed in the hope of achieving better-quality video at lower data rates.
This is where MPEG-2 compression outshines the rest, lowering the data rate from megabytes to megabits per second while still offering a high-quality picture with sound. Unlike many of the non-DV video algorithms, MPEG-2 does not have sub-format frame ratios.
Most video applications using software-based codecs assume a 4:3 ratio and the sub-ratios based upon it, i.e. 640x480, 320x240, 160x120, etc., all square-pixel ratios.
This is because DV was not around and analog capture was done in square-pixel 4:3 screen ratios. MPEG-2, on the other hand, expects the DV pixel ratio and frame resolution. There is a breaking point with any codec in relation to compression: the more the compression, the lower the picture quality, and the less the compression, the better the picture quality.

Lossy or Lossless codecs:
The goal of all codec designers is to maintain audio and video quality while compressing the binary data further.
Most codecs are LOSSY, in order to get a reasonably small file size. There are LOSSLESS codecs as well, but for most purposes the almost imperceptible increase in quality is not worth the considerable increase in data size. The main exception is if the data will undergo more processing in the future, in which case repeated lossy encoding would damage the eventual quality too much.
Examples of lossy file formats: AAC (Advanced Audio Coding), MP3, Vorbis (filename extension .OGG), lossy Windows Media Audio (filename extension .WMA)...
Examples of lossless file formats: Apple Lossless (filename extension .m4a), FLAC, Monkey's Audio (filename extension .APE), Shorten, TTA, lossless Windows Media Audio (filename extension .WMA), WavPack.
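The lossy/lossless distinction can be illustrated with a small sketch. It is a toy, not a real audio codec: zlib (a general-purpose compressor) stands in for a lossless format, and throwing away the low 8 bits of each 16-bit sample stands in for a lossy one. The test tone and the helper names are assumptions made for the example.

```python
import math
import struct
import zlib

# A short 16-bit test tone standing in for real audio (hypothetical data).
samples = [int(12000 * math.sin(2 * math.pi * 440 * n / 8000)) for n in range(8000)]
raw = struct.pack(f"<{len(samples)}h", *samples)

# Lossless: decompression restores every byte exactly.
lossless = zlib.compress(raw, 9)
assert zlib.decompress(lossless) == raw

def lossy_encode(values):
    """Keep only the top 8 bits of each 16-bit sample, then compress."""
    return zlib.compress(bytes((v >> 8) & 0xFF for v in values), 9)

def lossy_decode(blob):
    """Rebuild 16-bit samples; the discarded low bits cannot be recovered."""
    out = []
    for b in zlib.decompress(blob):
        signed = b - 256 if b >= 128 else b
        out.append(signed << 8)
    return out

lossy = lossy_encode(samples)
restored = lossy_decode(lossy)
max_error = max(abs(a - b) for a, b in zip(samples, restored))

print("raw bytes:", len(raw))
print("lossless bytes:", len(lossless))
print("lossy bytes:", len(lossy), "max sample error:", max_error)
```

The lossless path returns the original data bit for bit, while the lossy path returns only an approximation. Real lossy codecs choose what to discard far more carefully, using perceptual models rather than simple bit truncation.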
Classification of Codecs:
1. Audio Codecs:
In software, an audio codec is a computer program implementing an algorithm that compresses and decompresses digital audio data according to a given audio file format or streaming media audio format. The object of the algorithm is to represent the high-fidelity audio signal with the minimum number of bits while retaining the quality. This can effectively reduce the storage space and the bandwidth required for transmission of the stored audio file. Most codecs are implemented as libraries which interface to one or more multimedia players.
Software audio codecs can be classified as:
• Non-compression formats (LPCM, PDM, PAM)
• Lossless formats (ALAC, FLAC, MPEG-4 ALS)
• Lossy formats (Dolby Digital, MP1, MP2, MP3, AAC, WMA)
In hardware, an audio codec refers to a single device that encodes analog audio as digital signals and decodes digital back into analog. In other words, it contains both an analog-to-digital converter (ADC) and a digital-to-analog converter (DAC) running off the same clock. This is used in sound cards that support both audio in and out, for instance.
2. Video Codecs:
A video codec is a combination of hardware and/or software that creates a binary stream of data that represents the video and audio captured by a camera. Encoders differ from capture devices primarily in what they are intended to create as output. A capture card usually creates a binary stream that will be stored as a file. An encoder usually creates a stream of data that is to be transferred to a second device. This second device has various names, such as set-top box or decoder, but it essentially reverses the process carried out by the encoder and re-creates the representation of the scene picked up by the camera.
Video codecs basically do three things. If the camera is analog, they sample the output signal. The rate at which this is done is referred to as the sampling rate.
Each sample is then converted into a certain number of bits (often 8) during the analog-to-digital (A/D) conversion process. This is called quantization. Finally, the codec must compress the resulting bit stream, because it is usually too much information to be transmitted efficiently.
Video codecs can be classified as:
• Lossless compression (FFV1, Dirac Lossless, H.264 Lossless)
• Lossy compression (H.264/MPEG-4 AVC, H.263)
Hence, a video codec is software or a device that provides encoding and decoding for digital video, which may or may not include the use of video compression and/or decompression.
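The three steps described above (sampling, quantization, compression) can be sketched in a few lines. This is a one-dimensional toy under stated assumptions: the "analog" source is a hypothetical 5 Hz sine wave, zlib stands in for a real video compressor, and the function names are invented for the example.

```python
import math
import zlib

def sample(signal, duration_s, sampling_rate_hz):
    """Step 1: read the continuous signal at regular intervals."""
    count = int(duration_s * sampling_rate_hz)
    return [signal(n / sampling_rate_hz) for n in range(count)]

def quantize(values, bits=8):
    """Step 2: map each sample in [-1.0, 1.0] to an unsigned integer code."""
    levels = (1 << bits) - 1
    return [round((min(1.0, max(-1.0, v)) + 1.0) / 2.0 * levels) for v in values]

def compress(codes):
    """Step 3: shrink the bit stream (zlib is only a stand-in; needs 8-bit codes)."""
    return zlib.compress(bytes(codes), 9)

def analog(t):
    """Hypothetical analog source: a slow 5 Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)

samples = sample(analog, duration_s=1.0, sampling_rate_hz=1000)
codes = quantize(samples, bits=8)
stream = compress(codes)

print(len(samples), "samples ->", len(codes), "bytes ->", len(stream), "bytes compressed")
```

Raising the sampling rate or the number of bits per sample increases the amount of data that the compression step has to deal with, which is the trade-off the text describes.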
3. Text Codecs:
A text codec is a function that transforms text into (when encoding) or out of (when decoding) another kind of representation. Usually, the most human-readable representation is said to be "decoded". "Encoders" turn the (selected or whole) text into something less readable; "decoders" try to revert those effects as well as possible.
E.g.: ROT-13, Base64, URI codecs, Unicode codecs, case encoders, CMML, BiM
Codecs can also be classified into specialized codecs, such as "speech codecs", which are designed to deal with the characteristics of voice, while "audio codecs" are developed for music. The difference between speech and audio codecs is that speech codecs look for speech patterns in order to compress the data further.
Codecs may also be able to transcode from one digital format to another; for example, from PCM audio to MP3 audio.

What's the difference between decoding, encoding, recoding and transcoding?
• Decoding
Simply opening and watching video files with a video player (e.g. a DivX decoder for opening DivX files or an MPEG-1 decoder for opening an MPEG-1 video file).
• Encoding
Creating a video file in a particular format (e.g. DivX, MPEG-2, MPEG-4, ...). You'll need a DivX encoder in order to create DivX files and an MPEG-2 encoder in order to create MPEG-2 videos. You'll need these encoders for transcoding and recoding video files as well.
• Recoding
Converting a video file in a particular format with particular attributes into the same format with different attributes (e.g. a 2 h movie at 3000 Kbit/s into a 2 h movie at 2000 Kbit/s).
• Transcoding
Converting a video file in a particular format into another video format (e.g. DivX to DVD or MPEG-2 to DivX).
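The same vocabulary applies to the text codecs mentioned earlier, where it is easy to try out. The sketch below uses only Python standard-library calls (codecs, base64, urllib.parse) to show encoding, decoding and transcoding on a short string. The sample text and the helper name transcode_b64_to_uri are assumptions for illustration.

```python
import base64
import codecs
from urllib.parse import quote, unquote

text = "Codecs & containers"

# Encoding: readable text -> a less readable representation.
rot13 = codecs.encode(text, "rot_13")
b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")
uri = quote(text)

# Decoding: reverse the transformation and recover the original text.
assert codecs.decode(rot13, "rot_13") == text
assert base64.b64decode(b64).decode("utf-8") == text
assert unquote(uri) == text

def transcode_b64_to_uri(encoded):
    """Transcoding: decode from one format, then re-encode in another."""
    return quote(base64.b64decode(encoded).decode("utf-8"))

print(rot13)                      # Pbqrpf & pbagnvaref
print(uri)                        # Codecs%20%26%20containers
print(transcode_b64_to_uri(b64))  # same as the URI-encoded form
```

As with video, transcoding here is simply a decode followed by an encode in a different format. These text codecs are lossless, so nothing is lost along the way, whereas transcoding between lossy video formats degrades quality at each step.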
Understanding Various Video Codecs:
A video codec is a method of compressing/decompressing a video file, video data or a streaming video format. "Codec" stands for coder/decoder.
There are various kinds of video codecs available. Since these codecs have been implemented with different algorithms by a number of companies, they have different specifications and applications in various fields. These video codecs generally comply with industry standards.
The various software video codecs are:
• H.264 / MPEG-4 AVC
• MPEG-4
• DivX
• x264
• Real Video
• Sorenson
• MPEG-1
• MPEG-2
• H.261
• H.263
These video codecs are technically differentiated from each other by various factors, including compression technology/algorithm, platforms supported, sampling, OS supported, etc.
One can easily compare the various video codecs on various websites, but there is still confusion about which codec is appropriate. The answer depends on the application, but understanding the pros and cons of some of these codecs gives better information and deeper insight.

H.264 / MPEG-4 AVC
H.264 is also known as MPEG-4 AVC. H.264 uses the latest innovations in video compression technology to provide consistently crisp and clear video for the best possible viewing.
Pros
• H.264 delivers incredible video quality at data rates one-fourth to one-half the size of previous video formats
• H.264 offers dramatically lower bit rates and better picture quality than MPEG-2, MPEG-4 or H.263+
• It is about twice as efficient as MPEG-4, and its files are about one-third the size of comparable MPEG-2 files
• It is easy to integrate and covers a wide range of picture formats, hence it is used in a large application segment.
Cons
• H.264 requires longer encoding time
• It is not friendly to constrained, low-bandwidth environments
• More hardware overhead is also one of the limiting factors
• Licensing agreements are complicated.

MPEG-4
MPEG-4 is a standard currently under development for the delivery of interactive multimedia across networks.
As such, it is more than a single codec, and will include specifications for audio, video, and interactivity.
The video component of MPEG-4 is very similar to H.263. It is optimized for delivery of video at Internet data rates. One implementation of MPEG-4 video is included in Microsoft's NetShow.
Pros
• Good image quality at low data rates
Cons
• Standard is still being designed

DivX
DivX is a brand name of products created by DivX, Inc. The DivX codec uses lossy MPEG-4 Part 2 compression and is fully compliant with the MPEG-4 Advanced Simple Profile (MPEG-4 ASP).
Pros
• The DivX codec is quite simple to set up and use
• It is popular due to its ability to compress lengthy video segments into small sizes while maintaining relatively high visual quality.
Cons
• It's a commercial codec, so in order to get all the options you have to pay for it.

x264
x264 is a freely available open source implementation of the H.264 standard. H.264, or AVC as it is sometimes known, is a very advanced compression method that is part of the MPEG-4 standard.
Pros
• It offers the best quality at the smallest file size
Cons
• Video from x264 (or any H.264 codec, for that matter) can take a fair bit of CPU power to play

Real Video
Real Media currently has only two video codecs: RealVideo (Standard) and RealVideo (Fractal). Please bear in mind that this section only compares the one to the other.
Pros
• RealVideo (Standard) is usually best for data rates below 3 KBps.
• It works better with relatively static material than it does with higher action content.
• It usually encodes faster.
Cons
• RealVideo (Standard) is significantly more CPU intensive than the RealVideo (Fractal) codec.
• It usually requires a very fast PowerMac or Pentium for optimal playback.

Sorenson
The Sorenson Video codec produces excellent Web video suitable for playback on any Pentium or PowerMac. It also delivers outstanding-quality CD-ROM video at a fraction of traditional data rates.
Pros
• Provides much higher image quality than Cinepak, with smaller files. It is often possible to get twice the image quality at less than half the data rate.
• Tuned to work well from 2 - 100 KBps.
• Supports Media Cleaner Pro's variable bitrate encoding, which provides the best possible results at any data rate.
Cons
• Playback of CD-ROM video requires faster computers than Cinepak
• Movies larger than 320×240, or at data rates above 100 KBps, do not play smoothly except on high-end machines (such as a Macintosh G3). While picture quality is usually outstanding at higher rates, you should test these movies on your target machines to determine if playback performance is acceptable.

MPEG-1
MPEG-1 provides excellent image quality at CD-ROM data rates. One of the most popular uses of MPEG-1 is the VCD, or "white book" video CD. MPEG includes both audio and video compression.
The biggest problem with MPEG is that it has high requirements for playback. Either a dedicated MPEG decoder card must be installed, or a high-end CPU is required for software-only playback. Because of this limitation, MPEG-1 has not gained wide acceptance in consumer titles.
Pros
• Excellent image quality
Cons
• Very high playback requirements
• Majority of installed base not capable of viewing MPEG
• Licensing fees (typically US $0.04 - $0.40 per unit) are required to distribute MPEG-2 video. There may also be fees for MPEG-1; there is some uncertainty regarding this.
• Not well-suited to WWW video (the upcoming MPEG-4 standard will address this)

MPEG-2
MPEG-2 is a standard for broadcast-quality digitally encoded video. It offers outstanding image quality and resolution. MPEG-2 is the primary video standard for DVD-Video.
Pros
• Excellent image quality
Cons
• Very few people are currently capable of viewing MPEG-2
• Licensing fees (typically US $0.04 - $0.40 per unit) are required to distribute MPEG-2 video.

H.261
H.261 is a standard video-conferencing codec. As such, it is optimized for low data rates and relatively low motion.
Pros
• H.261 is optimized for low data rates.
• H.261 has a strong temporal compression component, and works best on movies in which there is little change between frames.
Cons
• Not generally as good quality as H.263.
• It may not play well on lower-end machines.

H.263
H.263 is a standard video-conferencing codec.
As such, it is optimized for low data rates and relatively low motion. H.263 is an advancement of the H.261 standard; it was mainly used as a starting point for the development of MPEG (which is optimized for higher data rates).
Pros
• H.263 is optimized for low data rates.
• Generally better quality than H.261
• H.263 has a strong temporal compression component, and works best on movies in which there is little change between frames.
Cons
• H.263 is CPU intensive
• It may not play well on lower-end machines.
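The remark that codecs with a strong temporal compression component work best when there is little change between frames can be demonstrated with a toy frame-differencing scheme. This sketch is not how H.261 or H.263 actually work (they use motion-compensated, DCT-based coding). It only stores each frame as a byte-wise difference from the previous one and lets zlib compress the result. The frame size, the random test frames and the function names are assumptions for the example.

```python
import random
import zlib

FRAME_BYTES = 64 * 48  # hypothetical tiny greyscale frame, one byte per pixel

def delta_encode(frames):
    """Keep the first frame; store each later frame as its byte-wise
    difference from the previous frame, then compress everything."""
    payload = bytearray(frames[0])
    for prev, cur in zip(frames, frames[1:]):
        payload += bytes((c - p) % 256 for p, c in zip(prev, cur))
    return zlib.compress(bytes(payload), 9)

def delta_decode(blob, frame_bytes=FRAME_BYTES):
    """Undo delta_encode by adding each difference back onto the previous frame."""
    data = zlib.decompress(blob)
    frames = [data[:frame_bytes]]
    for i in range(frame_bytes, len(data), frame_bytes):
        delta = data[i:i + frame_bytes]
        frames.append(bytes((p + d) % 256 for p, d in zip(frames[-1], delta)))
    return frames

rng = random.Random(0)
first = bytes(rng.randrange(256) for _ in range(FRAME_BYTES))

# Low motion: each frame changes only a handful of pixels.
low_motion = [first]
for _ in range(9):
    frame = bytearray(low_motion[-1])
    for _ in range(10):
        frame[rng.randrange(FRAME_BYTES)] = rng.randrange(256)
    low_motion.append(bytes(frame))

# High motion: every frame is completely new content.
high_motion = [first] + [
    bytes(rng.randrange(256) for _ in range(FRAME_BYTES)) for _ in range(9)
]

low_size = len(delta_encode(low_motion))
high_size = len(delta_encode(high_motion))
print("low motion:", low_size, "bytes; high motion:", high_size, "bytes")
```

With little change between frames, the differences are mostly zeros and compress to almost nothing. With constant change, the differences are as hard to compress as the frames themselves.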
Understanding Various Audio Codecs:
An audio codec is a method of compressing/decompressing an audio file, audio data or a streaming audio format. "Codec" stands for coder/decoder.
There are various kinds of audio codecs available. Since these codecs have been implemented with different algorithms by a number of companies, they have different specifications and applications in various fields. These audio codecs generally comply with industry standards.
The various software audio codecs are:
• AAC
• AAC+ / Enhanced AAC+
• AC-3 (Dolby Digital)
• Dolby Digital Plus
• Speex
• FLAC
• MIDI
• MP3
• MP3 Pro
• Monkey's Audio
• Ogg Vorbis
• QCELP
• Real Audio
• WMA
• Melody
• HVXC
These audio codecs are technically differentiated from each other by various factors, including compression technology/algorithm, platforms supported, sampling, OS supported, etc.
One can easily compare the various audio codecs on Wikipedia, but there is still confusion about which codec is appropriate. The answer depends on the application, but understanding the pros and cons of some of these codecs gives better information and deeper insight.

AAC
Pros
• An international standard approved by the ISO
• Flexible: supports several sampling rates (8000-96000 Hz), bit depths, and multichannel audio (up to 48 channels)
• Several implementations, including free and high-quality ones.
• Reaches transparency in most samples and for most users at around 150 kbps
• Part of the MPEG-4 specs
• Anyone can create their own implementation (specifications and demo sources available)
Cons
• Has problem cases that trip up all transform codecs
• Heavily patented
• Increased complexity
• AAC comes in different "flavors" (object types: AAC LC, AAC HE, AAC PS etc.). Many (especially portable) players only support LC (at the moment), so you can have files that are valid but that your player won't play.

AAC+ / Enhanced AAC
• AACplus (AAC+) is a variant of AAC, developed by Coding Technologies, that is optimized for low bit rates. It uses techniques including SBR (Spectral Band Replication) and PS (Parametric Stereo).
• Multi-channel support for 5.1, 7.1 and beyond (48 channels total)
• Optimized for speech and mixed speech/music down to 8 kbps mono

AC-3 / Dolby Digital / Dolby Digital Plus
Pros
• The Dolby Digital (AC-3) decoder is the industry standard for DTV and DVD media. Nearly all new DVD movies come with a Dolby Digital soundtrack
• AC-3 provides full-range channels, and its sound is considerably better in terms of quality. It is also backward compatible.
• Dolby Digital Plus also supports 7.1 channels
Cons
• Maximum support is 5.1-channel audio, limited to a 448 kbps maximum for Dolby Digital

Speex
Pros
• Speex is an open source/free software, patent-free audio compression format
• Speex is based on CELP and is designed to compress voice at bitrates ranging from 2 to 44 kbps
• Speex has a number of features that aren't in other codecs, such as intensity stereo encoding, integration of multiple sampling rates in the same bitstream, and a VBR mode
Cons
• Speex is mainly designed for only three different sampling rates: 8 kHz, 16 kHz and 32 kHz

FLAC
Pros
• FLAC is portable to many systems
• Open source and freely licensed
• The encoding of audio data incurs no loss of information.
• Hardware support and streaming support
• Extremely fast decoding
• Supports multi-channel and high-resolution streams
• Supports ReplayGain and cue sheets (with some limitations)
• Gaining wide use as the successor to Shorten
Cons
• Compresses less efficiently than other popular modern compressors (Monkey's Audio, OptimFROG)
• Higher compression modes are slow, for little gain over the default setting.

MP3
Pros
• Widespread acceptance, with support in nearly all hardware audio players and devices
• An ISO standard, part of the MPEG specs
• Fast decoding, lower complexity than AAC or Vorbis
• Anyone can create their own implementation (specs and demo sources available)
• Relaxed licensing schedule
Cons
• Lower performance/efficiency than modern codecs.
• Has problem cases that trip up all transform codecs.
• Sometimes the maximum bitrate (320 kbps) isn't enough.
• No multichannel implementations.
• Unusable for high-definition audio (sampling rates higher than 48 kHz).

Ogg Vorbis
Pros
• The (Ogg) Vorbis specification is in the public domain; it is free for commercial or noncommercial use (under both LGPL and BSD licenses)
• Easy-to-use high-level API (Application Programming Interface)
• Good all-round performance (above 48 kbps; a leading codec at 128 kbps)
• Well-written specs
• Supported by most portable (Ogg) DAPs
• Suitable for internet streaming (via Icecast and other methods)
• Fully gapless playback
• High potential for further tuning
• Structured to allow the design of a hybrid filterbank
Cons
• Limited official development (third-party development is always encouraged)
• Current implementations are more computationally intensive to decode than MP3
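The bitrates quoted for the audio codecs above translate directly into file sizes. The sketch below is simple arithmetic under stated assumptions: constant bitrate, a hypothetical 4-minute track, and CD-quality PCM (44.1 kHz, 16-bit, stereo) as the uncompressed reference. The function and variable names are made up for the example.

```python
def size_megabytes(bitrate_kbps, duration_s):
    """File size in megabytes (10^6 bytes) for a constant-bitrate stream."""
    return bitrate_kbps * 1000 * duration_s / 8 / 1_000_000

# Bitrates quoted in this section; the 240-second duration is an assumption.
quoted_bitrates_kbps = {
    "Vorbis at 128 kbps": 128,
    "AAC near transparency (about 150 kbps)": 150,
    "MP3 maximum (320 kbps)": 320,
    "Dolby Digital maximum (448 kbps)": 448,
}
CD_PCM_KBPS = 1411.2  # 44.1 kHz x 16 bit x 2 channels, for comparison

for label, kbps in quoted_bitrates_kbps.items():
    mb = size_megabytes(kbps, 240)
    ratio = CD_PCM_KBPS / kbps
    print(f"{label}: {mb:.1f} MB, {ratio:.1f}x smaller than CD PCM")
```

Variable-bitrate encodes and container overhead will shift these numbers somewhat, so treat them as rough planning figures rather than exact sizes.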