2. Why audio?
• To create moods, enhance understanding, reinforce
concepts, enhance multimedia games through
background music and sound effects
• Scope of digital applications:
– Computer generated sound, sound storage and
processing, digital communications, answering service,
speech synthesis, speech recognition, computerized call
centre, presentation of data as sound
– Can give programs a more user-friendly interface
3. Processing and Representation of
Digital Audio
• Includes the processes of generation, propagation,
amplification and transformation
4. Uses of sound in multimedia
• It can stimulate emotional responses
• To convey an educational message
• To alert the users in computer games
• To enjoy music, video and audio narrations
• To aid in learning foreign languages – speaking
and listening
6. Representation of Sound
• Sound
– Continuous vibration caused by waves travelling through
the air
– Physical characteristics: frequency and amplitude
– Vibrations reaching our ear drums generate nerve
signals; when these reach the brain, they produce the
perception of sound
• Measures of auditory perception of our ear:
– Loudness and pitch
7. Analog vs. Digital Audio
• Analog:
– Normal sound perceived by our ear – voice, musical
instruments, all other audio sources
– Recording: on the disk as physical grooves or on the
tape as magnetic orientation of fine ferromagnetic
particles
– Drawback: quality of recording degrades with each
generation of duplication
• Digital:
– Sound is encoded on CD as numerical information
– A laser beam is used to read and translate the recording,
so the quality does not degrade with repeated use
– Advantages: portability, durability, options to record
and process, better sound quality
8. Producing Digital Audio
• 2 ways to convert analog to digital:
– Converting analog recording into digital form
– Direct digital recording of the sound
• Microphones: translate sound waves into electric
impulses, which are then converted to numbers
• Digital Audio Tape (DAT): converts analog to
digital at high quality
• Others: sound synthesizer or software
9. Microphones
• 3 basic designs of microphone:
– 1. Dynamic:
• Commonly used, lower cost
• Vibrations of a light, easy-to-move diaphragm produce an electric
signal by induction in a wire coil attached to the end of the cone
• The weight of these elements limits the frequency response
– 2. Condenser:
• More accurate, more sensitive, more expensive
• Uses large open-air capacitors to pick up sound waves
– 3. Crystal:
• Uses the unique properties of some crystals, which create electric
signals when vibrated
10. Digitisation of Analog signal
• Pulse Code Modulation (PCM): a 2-step
process:
– 1. Sampling: the analog signal is approximated by taking
samples at regular intervals; the sampling rate determines
the frequency resolution
– More samples -> better quality, and more bits needed to
store the data; e.g. 11.025, 22.05, 44.1
kHz
– 2. Quantisation: determines the resolution; e.g. 8 / 16 bits
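The 2-step PCM process above can be sketched in a few lines (a minimal illustration; the function names and the 440 Hz test tone are ours, not from the slides):

```python
import math

def sample_signal(freq_hz, sample_rate_hz, duration_s):
    """Step 1, sampling: take values of a sine tone at regular intervals."""
    n_samples = int(sample_rate_hz * duration_s)
    return [math.sin(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

def quantise(samples, bits):
    """Step 2, quantisation: map each sample in [-1, 1] to an integer code."""
    max_code = 2 ** bits // 2 - 1            # e.g. +/-127 for 8 bits
    return [round(s * max_code) for s in samples]

# A 440 Hz tone sampled at 11.025 kHz for 10 ms, quantised to 8 bits:
codes = quantise(sample_signal(440, 11025, 0.01), bits=8)
```

Raising the sampling rate (e.g. 22.05 or 44.1 kHz) or the bit depth (16 bits) improves fidelity at the cost of more data, exactly the trade-off the slide describes.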
12. Nyquist Sampling Criteria
• It states that
– A band-limited signal of maximum frequency fm can be
completely represented if the signal is uniformly
sampled at a rate greater than or equal to 2fm
– If the sampling rate fs > 2fm, the sampled values
contain all the information of the original signal
– fs > 40 kHz produces high quality (Audio CD) sound.
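A short sketch of what goes wrong when the criterion is violated (the 6 kHz / 8 kHz values are illustrative, not from the slides): a tone above fs/2 produces exactly the same samples as a lower-frequency "alias", so the original can no longer be recovered.

```python
import math

fs = 8000          # sampling rate (Hz) -- below 2*fm, violating Nyquist
fm = 6000          # tone frequency (Hz)
alias = fs - fm    # 2000 Hz: the frequency the samples actually represent

tone  = [math.cos(2 * math.pi * fm    * n / fs) for n in range(8)]
ghost = [math.cos(2 * math.pi * alias * n / fs) for n in range(8)]

# The two sample sequences are indistinguishable:
assert all(abs(a - b) < 1e-9 for a, b in zip(tone, ghost))
```

This is why CD audio samples at 44.1 kHz: it keeps fs above twice the ~20 kHz limit of human hearing.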
14. Quantisation
• It is the process of approximating a
continuous signal by a set of discrete
symbols or integer values.
• Turns samples into numbers
• Break the full +ve & -ve range of sample
amplitude values into N sections; code each in
log2(N) bits; the no. of bits determines the resolution
• Quantisation operator: Q(x) = round(f(x)),
where x is a real number and f(x) is an arbitrary real-
valued function
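The N-sections / log2(N)-bits idea above, as a minimal sketch (the helper name and example values are ours):

```python
import math

def uniform_quantise(x, n_levels):
    """Break the full [-1, 1] amplitude range into n_levels equal
    sections and return the section index (0 .. n_levels - 1)."""
    index = int((x + 1.0) / (2.0 / n_levels))
    return min(index, n_levels - 1)    # clamp x == 1.0 into the top section

N = 256                                # 256 sections ...
bits_needed = math.log2(N)             # ... need log2(256) = 8 bits per sample

code = uniform_quantise(0.5, N)        # mid-positive amplitude -> section 192
```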
15. Linear vs. Non-linear
quantisation
• Linear: each code word represents a
quantisation interval of equal length; low-
amplitude signals are more sensitive to noise
• Non-linear: more digits to represent samples at
some levels and fewer for samples at other levels
• μ-Law (for North America) & A-Law (for Europe)
– std telephone communications
• Quantisation error: the difference between the sampled
discrete values and the actual continuous sound
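Non-linear quantisation in telephone systems works by companding: small amplitudes are boosted before a linear quantiser, so quiet signals get finer effective resolution. A sketch of the standard μ-law compression curve (μ = 255; the sample values 0.01 and 0.9 are just illustrative):

```python
import math

MU = 255  # mu = 255 is the North American telephone standard

def mu_law_compress(x):
    """Compress a sample x in [-1, 1]: F(x) = sgn(x) * ln(1 + mu*|x|) / ln(1 + mu).
    Expands small amplitudes so a linear quantiser wastes fewer codes on them."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

# A quiet sample (0.01) is boosted far more than a loud one (0.9):
quiet = mu_law_compress(0.01)   # ~0.23
loud  = mu_law_compress(0.9)    # ~0.98
```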
16. Psychoacoustics
• The higher the amplitude, the louder the sound we hear
• Frequency = no. of times a wave cycle is
repeated per unit time
• Loudness and pitch correspond to amplitude and frequency
• Loudness is measured in decibels (dB): 50 dB to
120 dB
• Range of human hearing: 20 Hz to 20 kHz
• Most sensitive: 1 kHz to 5 kHz
• English speech: 500 Hz to 2 kHz
17. Masking
• Masking: loud tones can mask the perception
of other (quieter) tones at nearby frequencies
• Some tones therefore cannot be heard
• Types of masking:
– Non-parallel (time domain)
– Parallel (frequency domain)
18. Perceptual Encoding
• Perceptual encoding: allows production of very
compact but high-quality music
– Used in RealAudio (.rm), MP3, Windows Media Audio (.wma),
etc.
– Audio compression based on human hearing
– MPEG uses monophonic, dual-phonic, stereo and joint-
stereo models; AAC (Advanced Audio Coding) is used for
surround-sound effects
– Bit rates for compression schemes:
• MPEG Layer 1: 32 - 448 kbps
• MPEG Layer 2: 32 - 384 kbps
• MPEG Layer 3: 32 - 320 kbps (popular audio format)
• AAC: 32 - 192 kbps
19. Processing Sound
• Sound is recorded, digitised, processed and incorporated into
multimedia
• A Digital-to-Analog Converter (DAC) reproduces analog
signals
• Software has built-in tools for
noise reduction, addition of echo, mixing of sound
tracks, etc.; e.g. SoundForge, Audacity
• Additive synthesis: a number of different waveforms can
be combined to form a complex waveform
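Additive synthesis as a minimal sketch (the function name and the 220 Hz partial list are illustrative, not from the slides): each partial is a (frequency, amplitude) sine wave, and the output is simply their sum.

```python
import math

def additive(partials, sample_rate, duration):
    """Additive synthesis: sum several (freq_hz, amplitude) sine
    waveforms into one complex waveform."""
    n = int(sample_rate * duration)
    return [sum(a * math.sin(2 * math.pi * f * t / sample_rate)
                for f, a in partials)
            for t in range(n)]

# First three odd partials over a 220 Hz fundamental (square-wave-like):
wave = additive([(220, 1.0), (660, 1 / 3), (1100, 1 / 5)], 44100, 0.01)
```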
21. MIDI
• Musical Instrument Digital Interface – produces
synthetic sound (introduced 1982)
• A format for sending music information between
electronic music devices such as synthesisers and PC
sound cards
• Contains only a set of digital musical instructions
(values for each note's pitch, length and volume) that
can be interpreted by the PC's sound card
• A lot of music can be packed into a small MIDI file (.mid)
• It cannot record sounds
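To see why MIDI files are so small: a note is just a few instruction bytes, not audio samples. A sketch of building a standard MIDI "Note On" channel message (the helper name is ours; the byte layout – status byte 0x90 plus channel, then 7-bit pitch and velocity – follows the MIDI 1.0 convention):

```python
def note_on(channel, pitch, velocity):
    """Build the 3-byte MIDI Note On message: a status byte (0x90 | 4-bit
    channel) followed by 7-bit pitch and velocity (volume) values."""
    assert 0 <= channel < 16 and 0 <= pitch < 128 and 0 <= velocity < 128
    return bytes([0x90 | channel, pitch, velocity])

# Middle C (note 60) on channel 0 at velocity 100 is just three bytes --
# compare with the ~176,400 bytes one second of CD audio would need:
msg = note_on(0, 60, 100)
```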
22. Guidelines for effective use of
audio in multimedia applications
• If different music files are used, keep them in the same style
• Best to use the same voice for narration and voice-overs
• For each character, the voices should be distinct
• Coordinate the sound files with other media – graphics,
animation, etc.
• For web-based applications, use MIDI files
• Clearly label audio files offered for download
• Avoid using sound files recorded at different resolutions
23. Representation of Audio files
• The choice of multimedia sound format depends on:
– Target application or user
– Purpose
– Availability of hardware and software
– Cost involved in producing sound
24. Audio Codec
• Audio codec: a computer program that
compresses/decompresses digital audio data
using libraries (executed within a media player)
• How audio is compressed and stored is determined
by the codec (compressor/decompressor)
• The codec determines the file size
25. Main categories of audio file
formats
• Uncompressed: referred to as PCM (Pulse Code
Modulation) – large file sizes – .wav files
• Lossless compression: no information lost –
quality not degraded – .wma files
• Lossy compression: eliminates redundant/
unnecessary information – small file size – .mp3,
RealAudio, Ogg Vorbis files
26. Common audio file formats
• MP3: popular audio format
– Uses perceptual audio coding & psychoacoustic
compression to remove redundant information
– Uses the MDCT (Modified Discrete Cosine Transform) to
implement a filter bank, increasing frequency resolution
– Shrinks the original data by a factor of about 12
• Ogg Vorbis (.ogg): free and open; plug-ins available for
common media players
• Windows Media Audio (.wma): Microsoft – similar to
MP3 – can compress to any size to match different
connection bandwidths
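The factor-of-12 figure can be checked with simple arithmetic on the CD bit rate (a back-of-the-envelope sketch, not from the slides):

```python
# Uncompressed CD audio bit rate: 44.1 kHz * 16 bits * 2 channels
cd_bps = 44_100 * 16 * 2      # 1,411,200 bits per second

# Shrinking by a factor of 12 lands near the common 128 kbps MP3 setting:
mp3_bps = cd_bps / 12         # 117,600 bits per second
```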
27. Common audio file formats
• Waveform Audio File (.wav): supported on
Windows OS – uncompressed – large file size –
~10 MB / minute
• Audio File (.au): widely accepted cross-platform
format – web pages – similar to .wav
• Advanced Audio Coding (.aac): based on
MPEG-4 – Apple
• Audio Interchange File Format (.aif): Apple's
equivalent of WAV files – not cross-platform –
not supported by all browsers
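The ~10 MB / minute figure for uncompressed .wav follows directly from the CD-quality parameters (a quick check, not from the slides):

```python
# Uncompressed .wav at CD quality: 44.1 kHz rate, 16-bit (2-byte)
# samples, 2 channels, 60 seconds
bytes_per_minute = 44_100 * 2 * 2 * 60     # 10,584,000 bytes
megabytes = bytes_per_minute / 1_000_000   # ~10.6 MB per minute
```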
28. Audio Streaming and RealAudio
• RealAudio (.ra, .ram, .rm): also supports video –
allows streaming of audio over low bandwidths
• Lower quality – plays digital audio on the internet using
RealPlayer
• Proprietary method of compression and an
associated protocol for transmitting digital sound
• It does not have any error checking
• Streaming requirements: 1. the browser must accept
streams of data in RealAudio format; 2. uses UDP to
achieve a steady data flow