Compare the advantages and limitations of various audio file formats
In this chapter, you'll learn about:
The following file formats.
The characteristics of these formats.
The advantages and limitations of these formats.
What is “Raw” Audio?
It’s hard to talk about “Raw” because it’s not really
the name of a file format. Originally the term was
used within Photoshop for doing a desperation
import of a mystery file that contains no metadata
whatsoever, not even the most basic facts such as
file format, size, color mode, etc.
A newer (and basically unrelated) meaning of “raw”
— more formally called Camera Raw — has
become very common with the rise of digital
What is “Raw” Audio?
Raw audio is any recorded audio that is unedited
and unprocessed. Its advantage is that clients
receive the audio exactly as it was recorded.
However, a file that is “raw” can also be too
large to transfer easily.
MP3 is both an audio file extension and the common name for
MPEG Audio Layer 3. Layer 3 is one of three coding schemes
(Layer 1, Layer 2 and Layer 3) for the compression of audio
signals. Layer 3 uses perceptual audio coding and psychoacoustic
compression to remove superfluous information (more specifically,
the redundant and irrelevant parts of a sound signal: the parts
the human ear doesn't hear). It also adds an MDCT (Modified
Discrete Cosine Transform) that implements a filter bank, giving
a frequency resolution 18 times higher than that of Layer 2.
In practical terms, Layer 3 shrinks the original sound data from
a CD (with a bit rate of 1411.2 kilobits per second of stereo
music) by a factor of about 12 (down to 112-128 kbps) without sacrificing
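The bit-rate figures quoted above can be checked with a little arithmetic. The sketch below assumes the standard CD audio parameters (44,100 samples per second, 16 bits per sample, two channels), which the text does not state explicitly; the function names are illustrative only.

```python
# Assumed CD audio parameters (not given in the text above).
SAMPLE_RATE_HZ = 44_100
BIT_DEPTH = 16
CHANNELS = 2


def pcm_bit_rate_kbps(sample_rate_hz, bit_depth, channels):
    """Bit rate of uncompressed PCM audio in kilobits per second (1 kbit = 1000 bits)."""
    return sample_rate_hz * bit_depth * channels / 1000


def compression_ratio(source_kbps, target_kbps):
    """How many times smaller the target stream is than the source stream."""
    return source_kbps / target_kbps


cd_kbps = pcm_bit_rate_kbps(SAMPLE_RATE_HZ, BIT_DEPTH, CHANNELS)
print(f"CD audio: {cd_kbps} kbps")

# Compare against the MP3 bit rates mentioned in the text.
for mp3_kbps in (112, 128):
    ratio = compression_ratio(cd_kbps, mp3_kbps)
    print(f"{mp3_kbps} kbps MP3 is about {ratio:.1f} times smaller")
```

Under these assumptions the CD rate comes out at 1411.2 kbps, and the 112-128 kbps range corresponds to ratios of roughly 12.6 and 11.0, consistent with the "factor of 12" figure.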
Because MP3 files are small, they can easily be transferred across
the Internet, which is a real advantage over the other
audio formats. MP3 files are played on the computer via media
player software, such as Apple's iTunes and Microsoft's Windows
Media Player, as well as in countless iPods and other handheld
players. MP3 sound quality cannot fully match the original CD,
and true audiophiles complain bitterly, but millions of people
consider it "good enough" because they can pack thousands of
songs into a tiny pocket-sized player.
Converting a digital audio track from a music CD to the MP3
format (or other audio format) is called "ripping" or "importing,"
and this conversion function is built into iTunes, Windows Media
Player and other jukebox software. Stand-alone rippers are also
While 128 Kbps (kilobits per second) is considered the norm for
MP3 files, MP3s can be ripped at bit rates from 8 Kbps to 320 Kbps.
The higher the bit rate, the better the sound and the larger the file.
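The link between bit rate and file size is simple multiplication. The following is a rough sketch that assumes a constant bit rate and ignores tags and frame headers; the four-minute track length and the function name are hypothetical, chosen only for illustration.

```python
def mp3_size_mb(bit_rate_kbps, duration_s):
    """Approximate size in megabytes (10**6 bytes) of a constant-bit-rate stream."""
    total_bits = bit_rate_kbps * 1000 * duration_s
    return total_bits / 8 / 1_000_000


SONG_SECONDS = 4 * 60  # hypothetical four-minute track

# The lowest, typical and highest bit rates mentioned in the text.
for kbps in (8, 128, 320):
    print(f"{kbps:>3} Kbps -> {mp3_size_mb(kbps, SONG_SECONDS):.2f} MB")
```

For this hypothetical track the estimate ranges from about a quarter of a megabyte at 8 Kbps to just under 10 MB at 320 Kbps.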
According to CNN, many artists, executives and labels in the music
industry have lashed out at the MP3 format because it offers little
security and is therefore very easy to pirate.
Another potential disadvantage of the MP3 format is that its quality
is known to suffer at lower bit rates, while WMA sounds much
better at comparable low bit rates. Finally, MP3
files do not support 5.1 Surround Sound, which the WMA format
AIFF, short for Audio Interchange File Format, is a common format for
storing and transmitting sampled sound. The format was
developed by Apple Computer and is the standard audio format
for Macintosh computers. It is also used by Silicon Graphics
AIFF files generally end with a .AIF or .AIFF extension.
The AIFF format does not support data compression so AIFF files
tend to be large. However, there is another format called AIFF-
Compressed (AIFF-C or AIFC) that supports compression ratios as
high as 6:1.
The extension for this file type is ".aif" when it is used on a PC. On
a Mac, the file extension is not needed. A Mac file uses a Type and
Creator resource to identify itself to the operating system and the
applications that can open it.
An AIFF file contains the raw audio data, channel information
(monophonic or stereophonic), bit depth, sample rate, and application-
specific data areas. The application-specific data areas let different
applications add information to the file header that remains there even if
the file is opened and processed by another application.
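To make the header contents concrete, here is a minimal sketch that reads the channel count, bit depth and sample rate from an AIFF byte string. It assumes the usual AIFF layout (a big-endian FORM container whose COMM chunk holds these fields, with the sample rate stored as an 80-bit extended float); the helper names and the hand-built sample file are illustrative, not part of any library.

```python
import struct


def parse_extended(raw):
    """Decode a big-endian 80-bit extended float, as used for the AIFF sample rate."""
    exponent, mantissa = struct.unpack(">HQ", raw)
    sign = -1.0 if exponent & 0x8000 else 1.0
    exponent &= 0x7FFF
    if exponent == 0 and mantissa == 0:
        return 0.0
    return sign * mantissa * 2.0 ** (exponent - 16383 - 63)


def read_comm_chunk(data):
    """Walk the chunks of an AIFF byte string and return the COMM chunk fields."""
    form_id, _form_size, form_type = struct.unpack(">4sI4s", data[:12])
    if form_id != b"FORM" or form_type != b"AIFF":
        raise ValueError("not an AIFF file")
    pos = 12
    while pos + 8 <= len(data):
        chunk_id, chunk_size = struct.unpack(">4sI", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + chunk_size]
        if chunk_id == b"COMM":
            channels, frames, bits = struct.unpack(">hIh", body[:8])
            return {
                "channels": channels,
                "frames": frames,
                "bit_depth": bits,
                "sample_rate": parse_extended(body[8:18]),
            }
        pos += 8 + chunk_size + (chunk_size & 1)  # chunks are padded to even length
    raise ValueError("no COMM chunk found")


def build_sample_aiff():
    """Build a tiny AIFF byte string by hand: stereo, 16-bit, 44100 Hz, no samples."""
    rate_44100 = bytes.fromhex("400EAC44000000000000")  # 44100.0 as 80-bit extended
    comm_body = struct.pack(">hIh", 2, 0, 16) + rate_44100
    comm = b"COMM" + struct.pack(">I", len(comm_body)) + comm_body
    ssnd = b"SSND" + struct.pack(">I", 8) + struct.pack(">II", 0, 0)
    form_body = b"AIFF" + comm + ssnd
    return b"FORM" + struct.pack(">I", len(form_body)) + form_body


print(read_comm_chunk(build_sample_aiff()))
```

The same chunk walk is how an application would skip over application-specific chunks it does not understand: it reads the chunk ID and size, then jumps ahead without interpreting the body.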
MPEG, short for Moving Picture Experts Group, is a family of ISO/ITU
standards for compressing digital video. Pronounced "em-peg," it is the universal
standard for digital terrestrial, cable and satellite TV, DVDs and
digital video recorders (DVRs).
MPEG uses lossy compression within each frame similar to JPEG,
which means some of the original image data is permanently
discarded. It also uses interframe coding, which further
compresses the data by encoding only the differences relative to
periodic reference frames. MPEG performs the
actual compression using the discrete cosine transform (DCT)
method.
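The idea behind interframe coding can be illustrated with a toy example. The sketch below is a deliberately simplified, lossless stand-in: it treats a frame as a flat list of pixel values and stores only the pixels that changed. Real MPEG encoders work on blocks with motion compensation and lossy DCT coding, so the function names and data here are purely illustrative.

```python
def encode_delta(reference, frame):
    """Return (index, new_value) pairs for pixels that differ from the reference frame."""
    return [
        (i, new)
        for i, (old, new) in enumerate(zip(reference, frame))
        if old != new
    ]


def decode_delta(reference, delta):
    """Rebuild a frame by applying the stored changes to a copy of the reference frame."""
    frame = list(reference)
    for i, value in delta:
        frame[i] = value
    return frame


# A bright two-pixel object on a dark background moves one pixel to the right.
key_frame = [10, 10, 10, 10, 200, 200, 10, 10]
next_frame = [10, 10, 10, 10, 10, 200, 200, 10]

delta = encode_delta(key_frame, next_frame)
print(delta)  # only the two changed pixels are stored
print(decode_delta(key_frame, delta) == next_frame)
```

Only two of the eight pixels need to be stored for the second frame, which is the saving that interframe coding exploits when consecutive frames are similar.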
MPEG is an asymmetrical system. It takes longer to compress the
video than it does to decompress it in the DVD player, PC, set-top
box or digital TV set. As a result, in the early days, compression was
performed only in the studio. As chips advanced and became less
costly, they enabled digital video recorders, such as TiVos, to
convert analog TV to MPEG and record it on disk in real time.
The major MPEG standards include the following:
MPEG-1 (Video CDs) - Although MPEG-1 supports higher
resolutions, it is typically coded at 352x240 at 30 fps (NTSC) or
352x288 at 25 fps (PAL/SECAM). Full 704x480 and 704x576 frames
(BT.601) were scaled down for encoding and scaled up for
playback. MPEG-1 uses the YCbCr color space with 4:2:0
sampling, but did not provide a standard way of handling
interlaced video. Data rates were nominally limited to 1.8 Mbps,
but this limit was often exceeded.
MPEG-2 (DVD, Digital TV) - Provides broadcast-quality video with
resolutions up to 1920x1080. It supports a variety of audio/video
formats, including legacy TV, HDTV and five-channel surround
sound. MPEG-2 uses the YCbCr color space with 4:2:0, 4:2:2 and
4:4:4 sampling and supports interlaced video. Data rates are from
1.5 to 60 Mbps.
MPEG-4 (All Inclusive and Interactive)
MPEG-4 is an extremely comprehensive system for multimedia
representation and distribution. Based on a variation of Apple's
QuickTime file format, MPEG-4 offers a variety of compression
options, including low-bandwidth formats for transmitting to
wireless devices as well as high-bandwidth for studio processing.
MPEG-4 also incorporates AAC, which is a high-quality audio
encoder. MPEG-4 AAC is widely used as an audio-only format.
A major feature of MPEG-4 is its ability to identify and deal with
separate audio and video objects in the frame, which allows
separate elements to be compressed more efficiently and dealt
with independently. User-controlled interactive sequences that
include audio, video, text, 2D and 3D objects and animations are
all part of the MPEG-4 framework. For more information, visit the
MPEG Industry Forum at www.mpegif.org.
MPEG-7 is about describing multimedia objects and has nothing
to do with compression. It provides a library of core description
tools and an XML-based Description Definition Language (DDL) for
extending the library with additional multimedia objects. Color,
texture, shape and motion are examples of characteristics
defined by MPEG-7.
MPEG-21 (Digital Rights Infrastructure)
MPEG-21 provides a comprehensive framework for storing,
searching, accessing and protecting the copyrights of multimedia
assets. It was designed to provide a standard for digital rights
management as well as interoperability. MPEG-21 uses the "Digital
Item" as a descriptor for all multimedia objects. Like MPEG-7, it
does not deal with compression methods.
MPEG-2 and MPEG-4 are international standards for compressing
video signals; they are widely used in digital television
broadcasting, DVD discs, and mobile videophony. Using these
standards, video can be compressed in an encoder before
transmission or storage, perhaps to 1/50 of the original size, and
then decompressed in a decoder for playback.
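To get a feel for what a 1/50 reduction means, the sketch below estimates the uncompressed data rate of the typical MPEG-1 picture described earlier (352x240 at 30 frames per second with 4:2:0 sampling) and divides it by 50. It assumes 8 bits per sample, which the text does not state; the names are illustrative.

```python
# Average bits per pixel for 8-bit YCbCr at each chroma sampling scheme (assumed 8 bits per sample).
BITS_PER_PIXEL = {
    "4:4:4": 24,  # full-resolution luma and both chroma planes
    "4:2:2": 16,  # chroma halved horizontally
    "4:2:0": 12,  # chroma halved horizontally and vertically
}


def raw_video_bit_rate(width, height, fps, sampling):
    """Uncompressed video bit rate in bits per second."""
    return width * height * BITS_PER_PIXEL[sampling] * fps


raw_bps = raw_video_bit_rate(352, 240, 30, "4:2:0")
compressed_bps = raw_bps / 50  # the "perhaps 1/50" figure from the text

print(f"uncompressed: {raw_bps / 1_000_000:.2f} Mbps")
print(f"at 1/50:      {compressed_bps / 1_000_000:.2f} Mbps")
```

Under these assumptions the uncompressed stream is roughly 30 Mbps, so a 1/50 reduction brings it to well under 1 Mbps.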
Like the JPEG (Joint Photographic Experts Group) standard for
compressing still images, the MPEG-2 and MPEG-4 standards
exploit redundancies in the original signal and limitations of
human vision to discard unnecessary information. Depending on
the compressed bitrate selected, the decoded image can be
imperceptibly different from the original or badly flawed with
MPEG-2 and MPEG-4 offer several advantages over proprietary
codecs. The standards process allows careful, open, and
objective tests of competitive ideas and approaches to find the
best mix of efficiency, latency, and implementation complexity for
WAV is a format for storing sound in files, developed jointly by Microsoft
and IBM. Support for WAV files was built into Windows 95, making it
the de facto standard for sound on PCs. WAV sound files end with
a .wav extension and can be played by nearly all Windows
applications that support sound.
This is a high-quality audio file type generally used for applications
that require high quality, such as CDs. WAV files are generally
uncompressed, and therefore take up considerable disk space, unlike
MP3s or AACs, which are compressed.
Because WAV files are uncompressed, they keep all of the recorded
data and preserve more subtle detail than lossy formats do.
A WAV file generally needs about 10 MB for every minute of audio,
whereas an MP3 needs about 1 MB for every minute.
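The per-minute figure can be checked by actually writing a WAV file. The sketch below uses Python's standard `wave` module to render one second of CD-quality audio (assumed here to be 44,100 Hz, 16-bit, stereo) into memory and measures the result; the tone, the function name and the 128 kbps comparison rate are illustrative choices.

```python
import io
import math
import struct
import wave


def make_wav_bytes(seconds=1.0, sample_rate=44_100, channels=2, freq_hz=440.0):
    """Render a 16-bit PCM sine tone to an in-memory WAV file and return its bytes."""
    n_frames = int(seconds * sample_rate)
    frames = bytearray()
    for n in range(n_frames):
        sample = int(32767 * 0.5 * math.sin(2 * math.pi * freq_hz * n / sample_rate))
        frames += struct.pack("<h", sample) * channels  # same sample on every channel
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)  # 2 bytes = 16 bits per sample
        wav.setframerate(sample_rate)
        wav.writeframes(bytes(frames))
    return buf.getvalue()


one_second = make_wav_bytes()
wav_mb_per_minute = len(one_second) * 60 / 1_000_000
mp3_mb_per_minute = 128_000 / 8 * 60 / 1_000_000  # 128 kbps constant bit rate

print(f"WAV: about {wav_mb_per_minute:.1f} MB per minute")
print(f"MP3: about {mp3_mb_per_minute:.2f} MB per minute")
```

One second comes to 176,400 bytes of samples plus a small header, which extrapolates to a little over 10 MB per minute, against roughly 1 MB per minute for a 128 kbps MP3.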
ACT is a lossy ADPCM 8 kbit/s compressed audio format recorded
by most Chinese MP3 and MP4 players with a recording function,
and voice recorders. However, many models of recorder that use
the ACT format do so only for their lowest-quality recording setting;
if the quality setting is increased then a different format such as
WAV is used instead (albeit at the expense of using up recorder
memory more quickly).
There are different versions of ACT; files produced by later devices
cannot as of June 2009 be read by any free standard audio
player and converter software, only by the supplied MP3 utilities.
WMA, short for Windows Media Audio, is a file format developed by
Microsoft for encoding digital audio files. It is similar to MP3 but can
compress files at a higher rate than MP3. WMA files, which use the
".wma" file extension, can be of any size and can be compressed to match
many different connection speeds, or bandwidths. Known
originally as MSAudio, this proprietary format competes with the
MP3 and AAC methods. WMA encodes rapidly and is known to be
especially effective at low bit rates.
One of the largest advantages of the WMA format is its
support for DRM, or Digital Rights Management. This is an
anti-piracy mechanism that adds much more security to a
given music file. Although DRM doesn't make a file completely
pirate-proof, it makes piracy much less common; MP3
offers no DRM and is therefore much easier to copy illegally.
One of the largest disadvantages associated with the WMA
format is the lack of support from third-party devices such as MP3
players. Although some support WMA files, such as Microsoft's own
Zune, many do not. The Apple iPod does not support WMA and,
because it is the market's dominant MP3 player, compatibility can be a
huge issue if you own one. Also, many media players for the
computer do not play DRM-protected WMA files, making this
format's biggest advantage also a disadvantage in some cases.
When it comes to WMA and MP3, both sacrifice sound quality to
save space. With WMA Lossless this sacrifice is not made but the
sound file is still condensed down to about half of its original size.
The file extension for WMA Lossless is still .WMA. This format is less
popular than other lossless audio formats, such as FLAC.