Audio Essentials


An academic in-depth look into the physics behind sound and audio, and how the science comes into play in professional audio recording.

Published in: Education

  1. By Fred Ginsburg, CAS, PhD
  2. • Vertical axis = AMPLITUDE (aka loudness) • Horizontal axis = TIME; the number of wave cycles per second is the FREQUENCY • Positive and negative components of the wave • Frequency is expressed as Hz or kHz. Example: 22 kHz (or 22k for short)
  3. • Digital audio is a graphic “snapshot” of the sine wave • The “pixel grid” is divided into “vertical resolution” and “horizontal resolution”. • Vertical resolution is BITS (bit depth), as in 16-bit, 24-bit, 32-bit floating point. • BITS actually refer to the size/detail of the data packet that describes each sample • Horizontal resolution is SAMPLING RATE, slices per second, as in 44.1 kHz, 48 kHz
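
As a rough illustration of the “pixel grid” idea, the sketch below counts the amplitude steps an integer bit depth provides and the storage that uncompressed audio needs per second. The function names are made up for this example, uncompressed PCM is assumed, and the step count applies only to integer formats (32-bit floating point stores samples differently).

```python
def quantization_levels(bits: int) -> int:
    """Number of distinct amplitude steps at an integer bit depth (vertical resolution)."""
    return 2 ** bits


def pcm_bytes_per_second(sample_rate_hz: int, bits: int, channels: int = 1) -> float:
    """Storage needed for one second of uncompressed PCM audio (assumed format)."""
    return sample_rate_hz * bits * channels / 8


for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} amplitude steps")

# Stereo, 24-bit, 48 kHz: a common professional recording setting
print(f"{pcm_bytes_per_second(48_000, 24, channels=2):,.0f} bytes per second")
```

More bits give finer vertical steps; a higher sampling rate gives more slices per second. Both increase file size.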
  4. • Sampling rates commonly range from 8 kHz to 192 kHz. • Low-fidelity rates: 8, 11, 16, 22, 32 kHz. Poor freq response. • Used for radio comms, telephone, cheap recorders, along with lower bit depths. • Higher sampling rates: 44.1, 48, 88.2, 96, 192 kHz • 44.1 kHz = consumer audio CD (.CDA) • 48 kHz = professional audio files, digital cinema/DCP • 96 kHz = uber quality (music) recording intended for future editing and down-conversion to 48 or 44.1 kHz • 192 kHz = NASA-grade instrumentation recording of vibration
  5. • In order to get a clear “snapshot” of the freq response, we need to sample it twice, and then subtract a portion for housekeeping. • “Twice” has something to do with the fact that each sound wave has two parts to it, a positive curve and a negative curve. • Housekeeping requires 2k. • Sampling rate, divide by two, subtract another 2k = Freq Response. (Sample/2) minus 2 = Freq. • *Note: Prof. Ginsburg’s words, not the actual mathematical (geek speak) theorem, which is the Nyquist-Shannon sampling theorem.
  6. • Example: 48 kHz sampling rate • 48/2 = 24 • 24 minus 2 = 22 • Freq response of 48k is only 22 kHz, as in 20 Hz to 22 kHz • Example: 44.1 kHz sampling rate • 44.1/2 ≈ 22 • 22 minus 2 = 20 • Freq response of 44.1k is only 20 kHz, as in 20 Hz to 20 kHz
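
The slide arithmetic can be sketched in a few lines of Python. This follows Prof. Ginsburg's rule of thumb (half the sampling rate, minus 2 kHz of “housekeeping”), not the formal sampling theorem; the function names and the `housekeeping_khz` parameter are invented for this illustration. Note that the slide rounds the 44.1 kHz result to 20, while the unrounded figure comes out slightly higher.

```python
def half_rate_khz(sample_rate_khz: float) -> float:
    """Half the sampling rate: the theoretical upper limit of what can be captured."""
    return sample_rate_khz / 2


def rule_of_thumb_response_khz(sample_rate_khz: float, housekeeping_khz: float = 2.0) -> float:
    """(Sample / 2) minus 2, per the slides. The 2 kHz margin is the slides' figure."""
    return half_rate_khz(sample_rate_khz) - housekeeping_khz


for rate in (44.1, 48, 96):
    print(f"{rate} kHz sampling -> about {rule_of_thumb_response_khz(rate):.2f} kHz usable response")
```

Applied to 96 kHz, the same rule suggests a usable response well beyond human hearing, which is consistent with the slides treating 96 kHz as a format for later editing and down-conversion.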
  7. • Dither: noise purposely added to the digital track to mask the sterile, “on/off” listening experience of pure digital (1-0-1-0) sampling. • Kind of a way of smoothing off the rough edges, so to speak. • Think of it as adding a touch of diffusion to make a high-resolution “facial portrait” more flattering!
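
To make the “smoothing” idea concrete, here is a minimal sketch of adding noise before rounding a signal to whole steps. It assumes triangular (TPDF) noise, which is one common choice and not something the slides specify; the signal, step size, and names are all invented for the example. The test tone is so quiet (0.4 of one step) that plain rounding erases it entirely, while the noisy version still follows the wave on average.

```python
import math
import random


def tpdf_noise(rng: random.Random) -> float:
    """Triangular-distribution noise between -1 and +1 step (difference of two uniform values)."""
    return rng.random() - rng.random()


def mean(values):
    return sum(values) / len(values)


rng = random.Random(0)  # fixed seed so the sketch is repeatable

# A sine wave peaking at 0.4 of one quantization step: too quiet to survive rounding alone.
signal = [0.4 * math.sin(2 * math.pi * n / 100) for n in range(1000)]

plain = [round(s) for s in signal]                        # hard "on/off" rounding
dithered = [round(s + tpdf_noise(rng)) for s in signal]   # noise added before rounding

# Average the dithered output separately over the positive and negative halves of the wave.
positive_half = mean([d for s, d in zip(signal, dithered) if s > 0])
negative_half = mean([d for s, d in zip(signal, dithered) if s < 0])

print("values without added noise:", sorted(set(plain)))
print(f"average with noise, positive half: {positive_half:+.3f}")
print(f"average with noise, negative half: {negative_half:+.3f}")
```

The price is a slightly noisier track; the benefit is that very quiet detail is not simply switched off.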
  8. • Human hearing: freq response approx 20 Hz to 22 kHz, but the upper limit drops as we get older to just 12 kHz or 14 kHz. • Pain threshold approx 120 dB SPL (sound pressure level) • Hearing damage from 90 dB • Avoid loud earphones, monitor speakers!!! • Hearing damage is permanent.
  9. • The decibel (dB): a way to measure the volume (gain) of an audio signal • Doubling the sound pressure (voltage) corresponds to a measured level change of 6 dB. Doubling the sound intensity (acoustic energy) corresponds to a calculated level change of 3 dB. • Think of 3 dB as one F-stop of sound • We perceive equal (voltage) levels of sound as different “loudness” depending on their frequencies; these relationships are charted as the Fletcher-Munson curves. • We use weighted meters (such as A-weighted) to compensate. • Engineers use variations of the dB term to refer to actual voltage levels of the audio signal. Too geeky to get into.
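
The 6 dB and 3 dB figures come from the two standard decibel formulas: 20·log10 for pressure or voltage ratios, and 10·log10 for intensity or power ratios. A minimal sketch, with function names chosen just for this example:

```python
import math


def db_from_amplitude_ratio(ratio: float) -> float:
    """Level change in dB for a sound pressure or voltage ratio."""
    return 20 * math.log10(ratio)


def db_from_power_ratio(ratio: float) -> float:
    """Level change in dB for a sound intensity or power ratio."""
    return 10 * math.log10(ratio)


print(f"Doubling voltage:   {db_from_amplitude_ratio(2):.2f} dB")
print(f"Doubling intensity: {db_from_power_ratio(2):.2f} dB")
print(f"Halving intensity:  {db_from_power_ratio(0.5):.2f} dB")
```

Both results are slightly above the round numbers quoted on the slide, which is why the figures are usually stated as “about” 6 dB and 3 dB.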
  10. • 20 Hz to 120 Hz = Bass. • Most mics roll off around 80 Hz to filter out low-freq noise such as handling, wind, rumble/vibration, distant traffic, ventilation systems. • 100 Hz to 5 kHz = Mid-range. Average range of the human voice. • 5 kHz to 6 kHz = Upper mid-range to lower high freqs. • Sometimes we roll off sibilance (de-essing) of the human voice • 6 kHz to 22 kHz = High frequencies. Harmonics.
  11. • Reverb is characterized as random, blended repetitions of a sound occurring within thirty milliseconds after the sound is made. This is all the sound that immediately bounces off any nearby surfaces before it gets back to your ears. • Echo is defined by distinct repetitions of a sound occurring after 30 milliseconds. This is when you can unquestionably hear a distinct... well, echo of a sound coming back to you. • An echo is often a degraded copy of the signal, due to disproportionate frequency loss and delay.
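
The 30-millisecond dividing line can be turned into a quick calculation. The sketch below assumes a speed of sound of about 343 m/s (dry air at roughly room temperature), which is not a figure from the slides, and the function names are invented for the illustration. “Extra path” means how much farther the reflected sound travels than the direct sound.

```python
REVERB_ECHO_THRESHOLD_MS = 30.0   # dividing line used in the slides
SPEED_OF_SOUND_M_PER_S = 343.0    # assumption: dry air at about 20 degrees C


def reflection_delay_ms(extra_path_m: float) -> float:
    """Delay of a reflection that travels extra_path_m farther than the direct sound."""
    return extra_path_m / SPEED_OF_SOUND_M_PER_S * 1000


def classify_reflection(delay_ms: float) -> str:
    """Label a reflection as reverb (within the threshold) or echo (after it)."""
    return "reverb" if delay_ms <= REVERB_ECHO_THRESHOLD_MS else "echo"


for extra_path_m in (5, 20):
    delay = reflection_delay_ms(extra_path_m)
    print(f"{extra_path_m} m extra path -> {delay:.1f} ms -> {classify_reflection(delay)}")
```

Under these assumptions, a reflection has to travel a little over 10 m farther than the direct sound before it is heard as a separate echo.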
  12. • Impedance: a measurement of resistance, in ohms (Ω). Means different things depending on the context. • Mic Level output, 250 Ω, a low-level output. • Line Level output, 600 Ω, a much louder output. • Mic Level input, 250 Ω, expects a low-level input. • Line Level input, 600 Ω, wants a much louder input. • Approx 50 dB difference in volume between mic and line level • Headphones: 40 to 100 Ω is ideal. Less than 35 Ω is too loud and easily distorted. More than 100 Ω does not provide enough volume from field recorders.
  13. • Mic out to Mic in. Sounds OK, but the weaker signal is more prone to interference. • Line out to Line in. Sounds OK; a stronger signal is less prone to interference. Best settings to use if you can. • Mic out to Line in. Very low and faint volume. Think of sending 12 V into a 100 V light bulb. • Line out to Mic in. Too powerful a signal for that input. Audio will be loud and distorted. Think of 100 V going into a 12 V light bulb. • Line level is the normal level for most devices. Mic level requires a pre-amp to raise the level up to Line level for mixing or recording.
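
To put a number on the mismatch, the roughly 50 dB gap between mic and line level quoted earlier can be converted into a voltage ratio. This is a sketch using relative levels only (mic level is treated as the 0 dB reference); the names and the wording of the messages are invented for the example, and real equipment levels vary.

```python
MIC_TO_LINE_GAP_DB = 50.0  # approximate figure quoted in the slides

# Relative levels only: mic level is the reference, line level sits about 50 dB above it.
RELATIVE_LEVEL_DB = {"mic": 0.0, "line": MIC_TO_LINE_GAP_DB}


def voltage_ratio_from_db(db: float) -> float:
    """Convert a level difference in dB into a voltage ratio."""
    return 10 ** (db / 20)


def mismatch_db(output_level: str, input_level: str) -> float:
    """Positive: signal is hotter than the input expects. Negative: weaker."""
    return RELATIVE_LEVEL_DB[output_level] - RELATIVE_LEVEL_DB[input_level]


def describe_connection(output_level: str, input_level: str) -> str:
    diff = mismatch_db(output_level, input_level)
    if diff > 0:
        return "too hot: expect loud, distorted audio"
    if diff < 0:
        return "too weak: expect low, faint audio"
    return "matched"


print(f"50 dB is a voltage ratio of about {voltage_ratio_from_db(MIC_TO_LINE_GAP_DB):.0f} to 1")
for out_level, in_level in (("mic", "mic"), ("line", "line"), ("mic", "line"), ("line", "mic")):
    print(f"{out_level} out -> {in_level} in: {describe_connection(out_level, in_level)}")
```

A ratio in the hundreds shows why a line-level signal overwhelms a mic input, and why a mic signal needs a pre-amp before it reaches a line input.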
  14. • Why actors tell each other to “break a leg” • Has nothing to do with fracturing one’s bones! • During the medieval era, theater actors only got paid if they appeared on stage that day. • To “break a leg” meant that your leg cleared (aka broke) the edge of the stage curtains and you would be visible in the performance… • So you were entitled to wages.