Digital Audio
Contents
 Digital Audio Fundamentals
 Sampling and Quantizing
 PCM
 Audio Compression
 Disk-Based Recording
 Rotary Head Digital Recorders
 Digital Audio Broadcasting
 Digital Filtering
 Stereophony and Multichannel Sound
Digital Audio Fundamentals
 Digital audio is simply an alternative means of
carrying an audio waveform
Digital Audio Fundamentals
 Digital audio is sound reproduction using pulse-code
modulation and digital signals
 Digital audio systems include analog-to-digital conversion
(ADC), digital-to-analog conversion (DAC), digital storage,
processing and transmission components
 A primary benefit of digital audio is the convenience with which it can be
stored, transmitted and retrieved
 Digital audio is useful in the recording, manipulation,
mass-production, and distribution of sound
 Modern distribution of music across the Internet via on-line
stores depends on digital recording and digital compression
algorithms
PCM (Pulse Code Modulation)
 PCM consists of three steps to digitize an analog
signal:
1. Sampling
2. Quantization
3. Binary encoding
 Before sampling, the signal must be low-pass (anti-aliasing) filtered to
limit its maximum frequency, because the sampling rate must be at least
twice the highest frequency present.
 The filtering should not otherwise distort the signal, i.e. it should not
remove high-frequency components that contribute to the signal shape.
PCM (Pulse Code Modulation)
Sampling
 The analog signal is sampled every Ts seconds.
 Ts is referred to as the sampling interval.
 fs = 1/Ts is called the sampling rate or sampling
frequency.
 There are 3 sampling methods:
 Ideal - an impulse at each sampling instant
 Natural - a pulse of short width with varying amplitude
 Flattop - sample and hold, like natural but with single
amplitude value
 The process is referred to as pulse amplitude
modulation (PAM), and the outcome is a signal with
analog (non-integer) amplitude values
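A minimal sketch (not from the slides, assuming NumPy) of the sampling step: a 1 kHz sine wave is sampled at an illustrative rate fs = 8 kHz, giving the sequence of analog-valued, PAM-style samples described above.

```python
import numpy as np

fs = 8000              # assumed sampling rate in Hz (fs = 1/Ts)
Ts = 1 / fs            # sampling interval in seconds
f0 = 1000              # frequency of the example analog signal in Hz

n = np.arange(40)      # sample indices
t = n * Ts             # sampling instants n*Ts
samples = np.sin(2 * np.pi * f0 * t)   # sample values: still real-valued, not yet quantized

print(samples[:8])
```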
Sampling
Quantization
 Sampling results in a series of pulses of varying amplitude
values ranging between two limits: a min and a max
 The amplitude values are infinite between the two limits.
 We need to map the infinite amplitude values onto a finite
set of known values
 This is achieved by dividing the distance between min and
max into L zones, each of height Δ = (max - min)/L
 The midpoint of each zone is assigned a value from 0 to L-1
(resulting in L values)
 Each sample falling in a zone is then approximated to the
value of the midpoint
Quantization Zones
 Assume we have a voltage signal with amplitudes
Vmin = -20V and Vmax = +20V.
 We want to use L = 8 quantization levels.
 Zone width Δ = (20 - (-20))/8 = 5
 The 8 zones are: -20 to -15, -15 to -10, -10 to -5, -5 to 0, 0
to +5, +5 to +10, +10 to +15, +15 to +20
 The midpoints are: -17.5, -12.5, -7.5, -2.5, 2.5, 7.5, 12.5,
17.5
Assigning Codes to Zones
 Each zone is then assigned a binary code.
 The number of bits required to encode the zones,
or the number of bits per sample as it is commonly
referred to, is obtained as follows:
nb = log2 L
 Given our example, nb = 3
 The 8 zone (or level) codes are therefore: 000, 001,
010, 011, 100, 101, 110, and 111
 Assigning codes to zones:
 000 will refer to zone -20 to -15
 001 to zone -15 to -10, etc.
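An illustrative sketch of the quantize-and-encode step, using the slides' own numbers (Vmin = -20 V, Vmax = +20 V, L = 8); the function name is made up for this example.

```python
import numpy as np

def quantize_and_encode(samples, v_min=-20.0, v_max=20.0, L=8):
    """Map analog sample values onto L zones and return (midpoints, binary codes)."""
    delta = (v_max - v_min) / L                  # zone height: (max - min)/L = 5 V here
    # Zone index 0..L-1 for each sample (clip keeps out-of-range samples in the end zones)
    zone = np.clip(np.floor((np.asarray(samples) - v_min) / delta), 0, L - 1).astype(int)
    midpoints = v_min + (zone + 0.5) * delta     # approximate each sample by its zone midpoint
    n_bits = int(np.ceil(np.log2(L)))            # bits per sample, 3 for L = 8
    codes = [format(z, f"0{n_bits}b") for z in zone]
    return midpoints, codes

mids, codes = quantize_and_encode([-18.2, -6.1, 0.3, 13.9])
print(mids)   # [-17.5  -7.5   2.5  12.5]
print(codes)  # ['000', '010', '100', '110']
```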
Quantization and encoding of a
sampled signal
PCM Decoder
 To recover an analog signal from a digitized signal
we follow the following steps:
 We use a hold circuit that holds the amplitude value of a
pulse till the next pulse arrives.
 We pass this signal through a low pass filter with a cutoff
frequency that is equal to the highest frequency in the
pre-sampled signal.
 The higher the value of L, the smaller the quantization
error and the less distorted the recovered signal
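A rough sketch of the two decoding steps, assuming NumPy and SciPy are available; the cutoff f_max is an assumed highest frequency of the pre-sampled signal (3.4 kHz is typical for telephone speech).

```python
import numpy as np
from scipy.signal import butter, filtfilt

def pcm_decode(quantized_values, fs, hold_factor=8, f_max=3400.0):
    """Hold each sample until the next one arrives, then low-pass filter it."""
    held = np.repeat(quantized_values, hold_factor)            # the hold circuit (zero-order hold)
    b, a = butter(4, f_max, btype="low", fs=fs * hold_factor)  # low-pass filter cut off at f_max
    return filtfilt(b, a, held)                                # smoothed, reconstructed waveform

# Usage (hypothetical): analog_estimate = pcm_decode(midpoint_values, fs=8000)
```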
PCM Decoder
Audio Compression
 In its native form, high-quality digital audio requires a
high data rate, which may be excessive for certain
applications
 One approach to the problem is to use compression,
which reduces that rate significantly with a moderate
loss of subjective quality
 While compression may achieve considerable
reduction in bit rate, it must be appreciated that
compression systems reintroduce the generation loss
of the analog domain to digital systems
Audio Compression
 One of the most popular compression standards for
audio and video is known as MPEG (Moving Picture
Experts Group)
 In practice, audio and video streams of this type can be
combined using multiplexing
 The program stream is optimized for recording and is
based on blocks of arbitrary size
 The transport stream is optimized for transmission
and is based on blocks of constant size
Audio Compression
 The bit stream
types of MPEG-2
Audio Compression
 Compression and the corresponding decoding are complex
processes and take time, adding to existing delays in signal
paths
 Concealment of uncorrectable errors is also more difficult
on compressed data
 The acceptable trade-off between loss of audio quality and
transmission or storage size depends upon the application
 For example, one 640 MB compact disc (CD) holds
approximately one hour of uncompressed high-fidelity
music, less than 2 hours of music compressed losslessly, or
7 hours of music compressed in the MP3 format at a
medium bit rate
 A digital sound recorder can typically store around 200
hours of clearly intelligible speech in 640 MB
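The "approximately one hour" figure can be checked from the CD-quality data rate (44.1 kHz, 16-bit, two channels), taking 1 MB as 10^6 bytes:

```python
fs = 44_100                     # CD sampling rate in Hz
bits_per_sample = 16
channels = 2

bytes_per_second = fs * bits_per_sample * channels // 8    # 176,400 bytes/s uncompressed
seconds_in_640mb = 640_000_000 / bytes_per_second          # about 3,628 s
print(seconds_in_640mb / 3600)                             # about 1.0 hour of music
```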
Disk-Based Recording
 The magnetic disk drive was perfected by the computer
industry to allow rapid random access to data, and so it
makes an ideal medium for editing
 Development of the optical disk was stimulated by the
availability of low-cost lasers
 Optical disks are available in many different types,
some of which can only be recorded once, whereas others
are erasable
 Optical disks have in common the fact that access is
generally slower than with magnetic drives and that it
is difficult to obtain high data rates, but most of them
are removable and can act as interchange media
Rotary Head Digital Recorders
 In a fixed tape head system, audio tape is drawn past
the head at a constant speed
 The head creates a fluctuating magnetic field in
response to the signal to be recorded, and the
magnetic particles on the tape are forced to line up
with the field at the head
 As the tape moves away, the magnetic particles carry
an imprint of the signal in their magnetic orientation
 If the tape moves too slowly, a high frequency signal
will not be imprinted: the particles' polarity will
simply oscillate in the vicinity of the head, to be left in
a random position
Rotary Head Digital Recorders
 Thus the bandwidth (channel capacity) of the recorded
signal can be seen to be related to tape speed: the faster the
speed, the higher the frequency that can be recorded
 Digital video and digital audio need considerably more
bandwidth than analog audio, so much so that tape would
have to be drawn past the heads at very high speed in order
to capture this signal
 This is impractical, since tapes of immense length would
be required
 The generally adopted solution is to rotate the head against
the tape at high speed, so that the relative velocity is high,
but the tape itself moves at a slow speed.
Rotary Head Digital Recorders
 To accomplish this, the head must be tilted so that at
each rotation of the head, a new area of tape is
brought into play; each segment of the signal is
recorded as a diagonal stripe across the tape
 This is known as a helical scan because the tape wraps
around the circular drum at an angle, travelling up like
a helix
 The rotary head recorder has the advantage that the
spinning heads create a high head-to-tape speed,
offering a high bit rate recording without high linear
tape speed
Digital Audio Broadcasting
 Digital Audio Broadcasting (DAB) is a digital radio
technology for broadcasting radio stations
 Advantages of DAB
 Broadcasting programs with good sound quality,
comparable to multimedia products such as MP3
 Offering stable reception and reduced noise
 Bringing a diversified choice of programs to audiences
 Enabling transmission of text / images
DAB sender
 Block diagram (figure): audio services pass through an audio encoder and
data services through a packet multiplexer; each stream is channel coded
and combined in the Main Service Channel (MSC) multiplexer, while service
and multiplex information form the Fast Information Channel (FIC); the
multiplexed streams are OFDM modulated onto many carriers and transmitted
as a DAB signal occupying about 1.5 MHz of radio-frequency bandwidth
 FIC: Fast Information Channel
 MSC: Main Service Channel
 OFDM: Orthogonal Frequency Division Multiplexing
DAB receiver
 Block diagram (figure): a tuner and OFDM demodulator recover the FIC and
the MSC; after channel decoding, the audio decoder delivers the audio
service and the packet demultiplexer delivers independent data services,
all coordinated by a controller connected to the user interface via a
control bus
Digital Filtering
 In electronics, computer science and mathematics,
a digital filter is a system that performs
mathematical operations on a sampled, discrete-
time signal to reduce or enhance certain aspects of
that signal
 A digital filter system usually consists of an
analog-to-digital converter to sample the input
signal, followed by a microprocessor and some
peripheral components such as memory to store
data and filter coefficients
Digital Filtering
 Digital filters are commonplace and an essential
element of everyday electronics such as radios,
cellphones, and stereo receivers
 Digital filters are defined by their impulse response,
h[n], or the filter output given a unit sample impulse
input signal
 A discrete-time unit impulse signal is defined by
δ[n] = 1 for n = 0, and δ[n] = 0 otherwise
Digital Filtering
 Digital filters are often best described in terms of their
frequency response, that is, how a sinusoidal signal of a
given frequency is affected by the filter
 The frequency response of a digital filter can be found by
taking the DFT (or FFT) of the filter impulse response
 The frequency response of a filter consists of its
magnitude and phase responses
 The magnitude response indicates the ratio of a filtered
sine wave's output amplitude to its input amplitude
 The phase response describes the phase offset or time
delay experienced by a sine wave passing through a filter
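A small sketch of that last point, assuming NumPy: the magnitude and phase responses can be read directly off the FFT of a filter's impulse response h[n]. The two-tap averager used here is only an example.

```python
import numpy as np

h = np.array([0.5, 0.5])        # impulse response of a simple two-tap averaging filter
H = np.fft.fft(h, n=256)        # zero-padded DFT gives a finely sampled frequency response
freqs = np.fft.fftfreq(256)     # normalized frequencies in cycles/sample

magnitude = np.abs(H)           # magnitude response: output/input amplitude ratio per frequency
phase = np.angle(H)             # phase response in radians

print(magnitude[0], magnitude[128])   # ~1.0 at DC, ~0.0 at the highest frequency
```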
Digital Filtering
 The filter implementation simply performs a
convolution of the time domain impulse response and
the sampled signal
 For discrete-time signals, convolution is defined as the sum of
the products of the two sequences after one is reversed and
shifted (delayed)
 What happens when we add a signal to a one-sample
delayed version of itself?
 y[n] = x[n] + x[n - 1]
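A minimal sketch (assuming NumPy) of this one-line filter: a constant signal passes through reinforced, while the fastest-alternating signal cancels out, which is why this simple sum behaves as a low-pass filter.

```python
import numpy as np

def add_delayed(x):
    """y[n] = x[n] + x[n-1], with x[-1] taken as 0."""
    x = np.asarray(x, dtype=float)
    y = np.empty_like(x)
    y[0] = x[0]
    y[1:] = x[1:] + x[:-1]
    return y

low = np.ones(8)                        # constant (lowest-frequency) input
high = np.array([1, -1] * 4, float)     # fastest alternating (highest-frequency) input
print(add_delayed(low))   # [1. 2. 2. 2. 2. 2. 2. 2.]  -> passed and reinforced
print(add_delayed(high))  # [1. 0. 0. 0. 0. 0. 0. 0.]  -> cancelled
```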
Digital Filtering
 Consider the following input signals:
 The filter's frequency response magnitude is shown
Digital Filtering
 Finite Impulse Response (FIR) Filters
 Finite Impulse Response (FIR) filters are defined by
scaled and time-delayed versions of the filter input
signal only, as given by a difference equation of the form:
y[n] = b0 x[n] + b1 x[n-1] + ... + bM x[n-M]
 The impulse response of an FIR filter is only as long as
the maximum delayed input term in its difference
equation
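A direct, unoptimized rendering of such a difference equation; the coefficient values below are illustrative only. Note how the impulse response is exactly as long as the coefficient list.

```python
import numpy as np

def fir_filter(x, b):
    """y[n] = b[0]*x[n] + b[1]*x[n-1] + ... + b[M]*x[n-M] (inputs only)."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for n in range(len(x)):
        for k, bk in enumerate(b):
            if n - k >= 0:
                y[n] += bk * x[n - k]
    return y

b = [0.25, 0.5, 0.25]                   # example 3-tap smoothing coefficients
print(fir_filter([1, 0, 0, 0, 0], b))   # impulse in -> impulse response [0.25 0.5 0.25 0. 0.]
```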
Digital Filtering
 An FIR filter can be represented by a block diagram as
shown
Digital Filtering
 What happens if we use a previous filter output value
to produce the filter's current output?
 y[n] = x[n] + y[n - 1]
 Consider the following input signals
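A small sketch of this feedback filter y[n] = x[n] + y[n-1]: a single impulse at the input produces an output that never returns to zero, so the impulse response is infinitely long.

```python
def accumulate(x):
    """y[n] = x[n] + y[n-1], with y[-1] taken as 0 (a running sum)."""
    y = []
    prev = 0.0
    for xn in x:
        prev = xn + prev
        y.append(prev)
    return y

print(accumulate([1, 0, 0, 0, 0]))   # [1, 1, 1, 1, 1]  -- the impulse response never dies out
print(accumulate([1, 2, 3, 4]))      # [1, 3, 6, 10]    -- a running sum of the input
```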
Digital Filtering
 The filter's frequency response magnitude is shown
Digital Filtering
 Infinite Impulse Response (IIR) filters include delayed
and scaled versions of the output signal which are fed
back into the current output
 IIR filters are described by a difference equation of the form:
y[n] = b0 x[n] + ... + bM x[n-M] + a1 y[n-1] + ... + aN y[n-N]
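A direct rendering of that general form, using the same plus-sign feedback convention as the y[n] = x[n] + y[n-1] example above; the coefficients shown are illustrative only.

```python
import numpy as np

def iir_filter(x, b, a):
    """y[n] = sum_k b[k]*x[n-k] + sum_k a[k]*y[n-k]; a[0] multiplies y[n-1], a[1] multiplies y[n-2], ..."""
    x = np.asarray(x, dtype=float)
    y = np.zeros_like(x)
    for n in range(len(x)):
        for k, bk in enumerate(b):          # feed-forward (input) terms
            if n - k >= 0:
                y[n] += bk * x[n - k]
        for k, ak in enumerate(a, start=1): # feedback (output) terms
            if n - k >= 0:
                y[n] += ak * y[n - k]
    return y

# A leaky integrator: y[n] = x[n] + 0.9*y[n-1]
print(iir_filter([1, 0, 0, 0, 0], b=[1.0], a=[0.9]))  # [1. 0.9 0.81 0.729 0.6561]
```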
Digital Filtering
 An IIR filter can be represented by a block diagram as
shown
Stereophony and Multichannel
Sound
 Stereophony is a method of sound recording in which the
recording contains information about the spatial
arrangement of the sound sources
 When a stereophonic recording is reproduced, the
listener hears a more natural sound that seems to
come from many separate sources and to be arranged
in the same way as during the recording
 The listener has the impression that the sound is
“three-dimensional” and possessed of an added
“depth.”
Stereophony and Multichannel
Sound
 This effect is achieved through the separate recording
of electrical signals from different microphones on
individual channels and through the separate
reproduction of the sound on each channel by
loudspeakers
Stereophony and Multichannel
Sound
 The arrangement of the loudspeakers must be similar
to that of the microphones; that is, the right and left
channels must coincide
 The quality of stereophonic sound reproduction
improves with the number of channels used
 However, the number of channels is usually kept
within certain limits to avoid undue complexity and
excessive cost
Stereophony and Multichannel
Sound
 5.1 channel sound is an industry standard sound
format for movies and music with five main channels
of sound and a sixth subwoofer channel used for
special movie effects and bass for music
 A 5.1 channel system consists of a stereo pair of
speakers, a center channel speaker placed between the
stereo speakers and two surround sound speakers
located behind the listener. 5.1 channel sound is found
on DVD movie and music discs and some CDs
Stereophony and Multichannel
Sound
 6.1 channel sound is a sound enhancement to 5.1
channel sound with an additional center surround
sound speaker located between the two surround
sound speakers directly behind the listener. 6.1
channel sound produces a more enveloping surround
sound experience.
 7.1 channel sound is a further sound enhancement to
5.1 channel sound with two additional side-surround
speakers located to the sides of the listener’s seating
position. 7.1 channel sound is used for greater sound
envelopment and more accurate positioning of sounds
