2. Audio Processing
• Critical part of “internet radio” station
– Very important for digital streaming to avoid artifacts of compression
algorithm
• Must operate on diverse array of files
– Think in grand scale: 80’s music to current day
• Instruments, mastering, music structure, recording technology
• Dynamically adjust to maintain high probability of clarity or articulation of
original file structure in spectral and temporal domains while providing a
uniform “sound signature” on these diverse files
– Articulation or intelligibility of complex music audio files is preserved and
enhanced
• Maintain transient “punch”
• Goal is to make sure “everyone wins” in the mix
– “Muddiness” must be avoided
– Preserve loudness, pitch and timbre
• Purpose
– A consistent “sonic signature” for the station format
7/30/2013 http://www.linkedin.com/in/gpbrefini/
3. Motivation: The Connected Dash
• Delivering Internet audio to the car is
hard
– Carrier signals are not ubiquitous… yet
– Every day more people access high-speed
content, particularly at rush hour
– Carriers have limited bandwidth
• Dash applications differ from one
auto manufacturer to another
• Early studies show humans want
Internet radio in car to work like
conventional radio
• NO ONE ARGUES: Internet radio is the
FUTURE!
– The automobile is the listening
“theater”
4. Good Audio Processing is
Multi-Band Processing!
• Perfected by Mike Dorrough (based on Altec-
Lansing design of the 1950’s)
– “Monolith” installed at KRLA, 8 band analog
processor
– DAP-310, 3 band analog
– Both had phase equalized pass-bands before
combining back to composite
• Minimize phase rotations at band edges
• Linear Group Delay!!!!
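As an illustration of why linear group delay matters when bands are recombined, here is a minimal digital sketch. It is not the Dorrough analog design: it assumes a symmetric (linear-phase) windowed-sinc FIR low-pass and a complementary high-pass, and all function names, tap counts, and cutoffs are hypothetical. Because both bands share the same constant delay, their sum is simply the input delayed by (N - 1) / 2 samples, with no phase rotation at the band edge.

```python
import math

def lowpass_taps(num_taps, cutoff):
    """Hamming-windowed sinc low-pass FIR.

    cutoff is a fraction of the sample rate (0 to 0.5); num_taps must be odd
    so the filter has a single centre tap.
    """
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        k = n - mid
        ideal = 2 * cutoff if k == 0 else math.sin(2 * math.pi * cutoff * k) / (math.pi * k)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    gain = sum(taps)
    return [t / gain for t in taps]  # unity gain at DC

def complementary_highpass(lp):
    """High-pass whose taps are (delayed unit impulse) minus the low-pass."""
    hp = [-t for t in lp]
    hp[(len(lp) - 1) // 2] += 1.0
    return hp

def fir(taps, x):
    """Direct-form FIR filtering; samples before the start are taken as zero."""
    out = []
    for n in range(len(x)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * x[n - k]
        out.append(acc)
    return out

low_taps = lowpass_taps(31, 0.1)               # hypothetical crossover choice
high_taps = complementary_highpass(low_taps)
delay = (len(low_taps) - 1) // 2               # constant group delay in samples

signal = [math.sin(0.3 * n) + 0.5 * math.sin(1.7 * n) for n in range(200)]
low = fir(low_taps, signal)
high = fir(high_taps, signal)
recombined = [a + b for a, b in zip(low, high)]

worst = max(abs(recombined[n] - signal[n - delay]) for n in range(delay, len(signal)))
print(f"group delay: {delay} samples, worst recombination error: {worst:.2e}")
```

In a real processor each band would be compressed or limited before summing; the point here is only that the split itself adds no phase error.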
5. “Process for the Stream”
• Streams use "lossy" data compression such as:
• MPEG, Real Audio, Microsoft's MSV2 codec
• Linear 16-bit 44.1 kHz stereo audio stream ≈ 1.4 Mb/s
• At 128 kb/s the MPEG Layer 3 compression ratio is approximately
11:1
• At 256 kb/s the MPEG Layer 3 compression ratio is about 5.5:1
• Critical area of these perceptual coding schemes is the
high-frequency area
– Maintain consistent amplitude near FS for codec
– Keep the upper spectrum free from clipping distortion or
excessive high-frequency processing
– Consistent spectral balance over a wide range of material
is a must!
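The compression ratios on this slide follow from simple arithmetic. A minimal sketch, assuming 16-bit samples (the slide does not state the bit depth) and using hypothetical function names:

```python
def pcm_bitrate(sample_rate_hz, bits_per_sample, channels):
    """Bit rate of uncompressed linear PCM, in bits per second."""
    return sample_rate_hz * bits_per_sample * channels

def compression_ratio(pcm_bps, stream_bps):
    """How many times smaller the coded stream is than the linear source."""
    return pcm_bps / stream_bps

# Assumption: 16-bit stereo at 44.1 kHz (CD-style linear audio).
cd_bps = pcm_bitrate(44_100, 16, 2)
print(f"linear PCM: {cd_bps / 1e6:.2f} Mb/s")

for kbps in (128, 256):
    ratio = compression_ratio(cd_bps, kbps * 1000)
    print(f"{kbps} kb/s stream -> about {ratio:.1f}:1")
```

Under that assumption the linear stream is about 1.41 Mb/s, which gives roughly 11:1 at 128 kb/s and roughly 5.5:1 at 256 kb/s.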
6. Codec Magic: Masking!
• Codecs remove redundant information that
humans will not perceive as missing
– Audio spectrum split into 500 bands
– Algorithm models human ear
• CODEC dynamically computes a “best frequency domain
fit” where certain signals present can be removed
• CODEC also performs “level masking” taking advantage of
how human hearing focuses on what’s going on in the
foreground
• Typically only about 20% of the original audio
data needs to be transmitted!
7. Avoiding “watery sound” of Internet
Radio
• Coders do not like hard-limited audio: clipping harmonics land in the
pass-band, where the algorithm cannot model them
• RMS is more important than peak of all waveforms
– It is a measure of energy over time
– Normalize to RMS level relative to FS (peaks can’t exceed 0 dBFS)
• Peak to RMS ratio is critical
• Contemporary Hit Music format uses processing to make it
more exciting
– As in movie production: frame by frame image is color
corrected & exposure corrected
• The better we understand the transmission system and its technical
challenges, and the more we can minimize or hide sonic problems,
the better we sound!
http://schedule.sxsw.com/2011/events/event_MP7661
http://www.digido.com
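The peak-to-RMS ratio (crest factor) mentioned on this slide can be measured directly. The sketch below is an illustration only: the function names are hypothetical, samples are assumed to be floats scaled so that full scale is ±1.0, and the hard limiter is a plain clipper used to show how limiting shrinks the ratio.

```python
import math

def peak_dbfs(samples):
    """Peak level in dB relative to full scale (full scale = 1.0)."""
    return 20 * math.log10(max(abs(s) for s in samples))

def rms_dbfs(samples):
    """RMS level in dBFS: a measure of energy over time."""
    mean_square = sum(s * s for s in samples) / len(samples)
    return 10 * math.log10(mean_square)

def crest_factor_db(samples):
    """Peak-to-RMS ratio in dB."""
    return peak_dbfs(samples) - rms_dbfs(samples)

def hard_limit(samples, ceiling=1.0):
    """Plain clipper, standing in for an aggressive limiter."""
    return [max(-ceiling, min(ceiling, s)) for s in samples]

# Ten full cycles of a full-scale sine wave.
sine = [math.sin(2 * math.pi * k / 100) for k in range(1000)]
# The same sine driven 6 dB hot into the clipper.
clipped = hard_limit([2.0 * s for s in sine])

print(f"sine:    peak {peak_dbfs(sine):6.2f} dBFS, crest {crest_factor_db(sine):.2f} dB")
print(f"clipped: peak {peak_dbfs(clipped):6.2f} dBFS, crest {crest_factor_db(clipped):.2f} dB")
```

Both signals peak at 0 dBFS, but the clipped one carries more RMS energy and a smaller crest factor, which is the hard-limited condition the slide warns that coders handle poorly.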
8. Articulation Processing Example
• Process to bring the kick drum beater slaps
forward
– Use linear phase to keep the transient rise/fall
times steep
• Bring near-infrasonics forward during
transients in mid-range (2.1 – 6.4 kHz)
– This gives the audio a subtle “thump”
9. Example:
• Modern Hit Music Station format
– Today’s Hits, the 2K’s, the 90’s and the 80’s
• Processing Challenges
– Modern Music has
• very limited dynamic range
• large bottom
• Digitally corrected vocals
– 80’s Music has
• larger dynamic range
• More traditional instruments, less synthesized
• SPARS code was most likely AAD (analog recording, analog mixing, digital mastering)
• Need processing that makes for a consistent air sound
10. The Nation’s Hit Music Station!
Press above for Radio XL5 live!
Press above for Radio XL5 website
11. The Audio Chain
• Mild multiband processing
with impact/thump
enhancements
– Articulation processing
• Second stage multi-band
processing
– More bands
– Clip/Bass distortion
correction
– Mild stereo enhancement
• Articulation processing
• Two-band DSP
limit/compression
• Analog “fast” compressor
• High-end Soundcards
[Diagram labels: BreakAway Proc, StereoTool Proc, Behringer Digital Proc, Alesis Analog Proc, “Proprietary System” Articulation Proc]
12. Internet “Air Sound” vs FM “Air Sound”
• Internet Radio
– Flat audio processing throughout the air chain
• FM Radio
– Requires pre-emphasis: about 17 dB of gain at 15 kHz for the
75 µs curve (US radio)!
• Most modern music is highly clipped/limited
– FM pre-emphasis really increases distortion
– Internet audio is flat, so de-clipping algorithms are
easy to apply
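The 17 dB figure can be checked from the standard first-order pre-emphasis response, |H(f)|² = 1 + (2πfτ)². A minimal sketch with a hypothetical function name; the 50 µs value is included only for comparison and is not from the slide:

```python
import math

def preemphasis_gain_db(freq_hz, tau_s=75e-6):
    """Gain in dB of a first-order pre-emphasis network with time constant tau_s."""
    w_tau = 2 * math.pi * freq_hz * tau_s
    return 10 * math.log10(1 + w_tau ** 2)

for tau_us in (75, 50):
    gain = preemphasis_gain_db(15_000, tau_us * 1e-6)
    print(f"{tau_us} us pre-emphasis at 15 kHz: {gain:.2f} dB")
```

With a 75 µs time constant the boost at 15 kHz comes out at roughly 17 dB, so any clipping products already present in the high end of a heavily limited track are raised by that amount before the FM limiter sees them.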
13. Summary
• Playback using high quality sources
• Multi-stage, multi-band processors
– Less is more
– Phase & bass correct
– Declipper function is important
– Goal is for spectral balance
• Apply multiple articulation/transient “punch” processing at front
and back end of chain
– Avoid anything that does not have linear group delay
– Preserve original transient information (spectral components).
• Minimal analog processing if possible
• Use high end sound cards
– A/D and D/A low jitter clocks, preferably locked to common source