
Audio Essentials


An academic in-depth look into the physics behind sound and audio, and how the science comes into play in professional audio recording.

Published in: Education


  1. 1. By Fred Ginsburg CAS PhD
  2. 2. • Vertical axis = AMPLITUDE (perceived as loudness) • Horizontal axis = TIME; the number of wave cycles per second is the FREQUENCY • Each cycle has positive and negative components • Frequency is expressed in Hz or kHz. Example: 22 kHz (or 22k for short)
  3. 3. • Digital audio is a graphic “snapshot of the sine wave” • The “pixel grid” is divided into “vertical resolution” and “horizontal resolution”. • Vertical resolution is BITS, as in 16-bit, 24-bit, or 32-bit floating point. • BITS actually refer to the size/detail of the data stored for each sample • Horizontal resolution is SAMPLING RATE, the number of slices per second, as in 44.1 kHz or 48 kHz
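The “pixel grid” idea on slide 3 can be put into numbers. The following is a minimal sketch, assuming plain integer samples (it does not apply to 32-bit floating point); the function names are made up for illustration and do not come from the slides.

```python
def quantization_levels(bit_depth: int) -> int:
    """Number of distinct amplitude steps an integer sample of this bit depth can hold."""
    return 2 ** bit_depth


def sample_interval_us(sampling_rate_hz: float) -> float:
    """Time between two successive samples ("slices"), in microseconds."""
    return 1_000_000 / sampling_rate_hz


# Vertical resolution: more bits means more amplitude steps.
for bits in (16, 24):
    print(f"{bits}-bit: {quantization_levels(bits):,} levels")

# Horizontal resolution: a higher sampling rate means thinner time slices.
for rate_hz in (44_100, 48_000):
    print(f"{rate_hz} Hz: one sample every {sample_interval_us(rate_hz):.2f} microseconds")
```

Going from 16 to 24 bits multiplies the number of amplitude steps by 256, while going from 44.1 kHz to 48 kHz only shortens each time slice by a couple of microseconds.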
  4. 4. • Sampling rates commonly range from 8 kHz to 192 kHz. • Low-fidelity rates: 8, 11.025, 16, 22.05, and 32 kHz. Poor freq response. • Used for radio comms, telephone, and cheap recorders, along with lower bit depths. • Higher sampling rates: 44.1, 48, 88.2, 96, and 192 kHz • 44.1 kHz = consumer audio CD (.CDA) • 48 kHz = professional audio files, digital cinema/DCP • 96 kHz = uber-quality (music) recording intended for future editing and down-conversion to 48 or 44.1 kHz • 192 kHz = NASA-grade instrumentation recording of vibration
  5. 5. • “In order to get a clear ‘snapshot’ of the freq response, we need to sample it twice, and then subtract a portion for housekeeping.” • “‘Twice’ has something to do with the fact that each sound wave has two parts to it, a positive curve and a negative curve.” • “Housekeeping requires 2k.” • Sampling rate, divided by two, minus another 2k = freq response: (Sample/2) minus 2 = Freq, with all values in kHz. • *Note: these are Prof Ginsburg’s words, not the actual mathematical (geek speak) theorem, which is the Nyquist-Shannon sampling theorem.
  6. 6. • Example: 48k sampling rate • 48/2 = 24 • 24 minus 2 = 22 • Freq response of 48k is only 22 kHz, as in 20 Hz to 22 kHz • Example: 44.1k sampling rate • 44.1/2 = 22.05, or roughly 22 • 22 minus 2 = 20 • Freq response of 44.1k is only 20 kHz, as in 40 Hz to 20 kHz
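The arithmetic on slides 5 and 6 can be sketched in a few lines. This is only an illustration of the rule of thumb quoted on the slides, not a statement of the sampling theorem; the function names and the `housekeeping_khz` parameter are invented for this sketch.

```python
def nyquist_limit_khz(sampling_rate_khz: float) -> float:
    """Theoretical ceiling: half the sampling rate."""
    return sampling_rate_khz / 2


def rule_of_thumb_response_khz(sampling_rate_khz: float,
                               housekeeping_khz: float = 2.0) -> float:
    """Slide rule of thumb: (sample rate / 2) minus 2 kHz of "housekeeping"."""
    return nyquist_limit_khz(sampling_rate_khz) - housekeeping_khz


for rate_khz in (44.1, 48.0, 96.0):
    ceiling = nyquist_limit_khz(rate_khz)
    usable = rule_of_thumb_response_khz(rate_khz)
    print(f"{rate_khz} kHz sampling: half rate {ceiling:.2f} kHz, "
          f"rule-of-thumb response about {usable:.2f} kHz")
```

The slides round 44.1 down to 44 before halving, which is why they arrive at a flat 20 kHz where this sketch gives 20.05 kHz.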
  7. 7. • Dither: noise purposely added to the digital track to mask the sterile, “on/off” listening experience of pure digital (1-0-1-0) sampling. • Kind of a way of smoothing off the rough edges, so to speak. • Think of it as adding a touch of diffusion to make a high resolution “facial portrait” more flattering!
  8. 8. • Human hearing: freq response approx 20 Hz to 22 kHz, but the upper limit drops as we get older to just 12k or 14k. • Pain threshold approx 120 dB SPL (sound pressure level) • Hearing damage from 90 dB SPL • Avoid loud earphones and monitor speakers!!! • Hearing damage is permanent.
  9. 9. • The decibel (dB): a way to measure the volume (gain) of an audio signal • Doubling the sound pressure (voltage) corresponds to a measured level change of 6 dB. Doubling the sound intensity (acoustic energy) corresponds to a calculated level change of 3 dB. • Think of 3 dB as one F-stop of sound • We perceive equal (voltage) levels of sound as different “loudness” depending on their frequencies; this relationship is described by the Fletcher-Munson curves. • We use weighted meters (such as A-weighted) to compensate. • Engineers use variations of the dB term to refer to actual voltage levels of the audio signal. Too geeky to get into.
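The 6 dB and 3 dB figures on slide 9 fall out of the standard decibel formulas, which use a factor of 20 for pressure or voltage ratios and a factor of 10 for intensity or power ratios. A minimal sketch, with function names chosen here for illustration:

```python
import math


def pressure_ratio_to_db(ratio: float) -> float:
    """Level change in dB for a sound pressure (or voltage) ratio."""
    return 20 * math.log10(ratio)


def intensity_ratio_to_db(ratio: float) -> float:
    """Level change in dB for a sound intensity (or power) ratio."""
    return 10 * math.log10(ratio)


print(f"Doubling pressure:  {pressure_ratio_to_db(2):.2f} dB")
print(f"Doubling intensity: {intensity_ratio_to_db(2):.2f} dB")
```

Both results are slightly above the round numbers on the slide (about 6.02 and 3.01), so “6 dB” and “3 dB” are convenient approximations.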
  10. 10. • 20 Hz to 120 Hz = Bass. • Most mics roll off around 80 Hz to filter out low freq noise such as handling, wind, rumble/vibration, distant traffic, and ventilation systems. • 100 Hz to 5 kHz = Mid-range. Average range of human voice. • 5 kHz to 6 kHz = Upper mid-range to lower high freqs. • Sometimes we roll off sibilance (de-essing) of human voice • 6 kHz to 22 kHz = High frequencies. Harmonics.
  11. 11. • Reverb is characterized as random, blended repetitions of a sound occurring within 30 milliseconds after the sound is made. This is all the sound that immediately bounces off any nearby surfaces before it gets back to your ears. • Echo is defined by distinct repetitions of a sound occurring after 30 milliseconds. This is when you can unquestionably hear a distinct... well, echo of a sound coming back to you. • An echo is often a degraded version of the signal, due to greater, disproportionate frequency loss and delay.
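To get a feel for the 30 millisecond boundary on slide 11, the delay of a reflection can be estimated from how far the sound has to travel. This is a rough sketch: the speed of sound (343 m/s, roughly dry air at room temperature) is an assumption not stated on the slides, and the function and constant names are invented for illustration.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # assumption: dry air at about 20 degrees C
ECHO_THRESHOLD_MS = 30.0        # boundary between reverb and echo, from slide 11


def reflection_delay_ms(path_length_m: float) -> float:
    """Time for sound to travel the given path length, in milliseconds."""
    return path_length_m / SPEED_OF_SOUND_M_PER_S * 1000


def classify_reflection(delay_ms: float) -> str:
    """Label a repetition as reverb (within 30 ms) or echo (after 30 ms)."""
    return "reverb" if delay_ms <= ECHO_THRESHOLD_MS else "echo"


for path_m in (5.0, 20.0):
    delay = reflection_delay_ms(path_m)
    print(f"{path_m} m path: {delay:.1f} ms -> {classify_reflection(delay)}")
```

Under that assumption, 30 milliseconds corresponds to a travel path of a little over 10 meters, which is why small rooms produce reverb while canyons and large halls produce distinct echoes.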
  12. 12. • Impedance: a measurement of resistance, in ohms (Ω). Means different things depending on the context. • Mic Level output, 250 Ω, a low level output. • Line Level output, 600 Ω, a much louder output. • Mic Level input, 250 Ω, expects a low level input. • Line Level input, 600 Ω, wants a much louder input. • Approx 50 dB difference in volume between mic and line level • Headphones: 40 to 100 Ω is ideal. Less than 35 Ω is too loud and easily distorted. More than 100 Ω does not provide enough volume from field recorders.
  13. 13. • Mic out to Mic in. Sounds OK, but the weaker signal is more prone to interference. • Line out to Line in. Sounds OK, and the stronger signal is less prone to interference. Best settings to use if you can. • Mic out to Line in. Very low and faint volume. Think of sending 12 V into a 100 V light bulb. • Line out to Mic in. Too powerful a signal for that input. Audio will be loud and distorted. Think of 100 V going into a 12 V light bulb. • Line level is the normal level for most devices. Mic level requires a pre-amp to raise the level up to Line level for mixing or recording.
  14. 14. • Why actors tell each other to “break a leg” • Has nothing to do with fracturing one’s bones! • During the medieval era, theater actors only got paid if they appeared on stage that day. • To “break a leg” meant that your leg cleared (aka broke) the edge of the stage curtains and you would be visible in the performance… • So you were entitled to wages.