Adam Sporka
Charles University
Tuesdays 10.40, Room SW1, Malostranské náměstí
NCGD009 ’19/’20
Audio Signal Representation
Technical remark: I drew a number of sketches on the whiteboard. These
were not archived in this deck of slides. Sorry about that, I’ll keep it in mind
next time!
Acoustics 101
Sound is an oscillation of atmospheric pressure that the ear can detect.
Audible range: roughly 20 Hz – 20,000 Hz
Signal Processing 101
• Signal: the amplitude of any quantity over time
• A/D
• Sampling frequency
• Bit depth (resolution)
• D/A
• Time domain
• Frequency domain
A/D Converter (ADC)
• Periodic measurement of the input signal
• Storing the values in a sequence
[Figure: input waveform over time t, measured as the sequence below]
0.000, 0.383, 0.707, 0.924, 1.000, 0.924, 0.707, 0.383, 0.000, -0.383, -0.707, -0.924, -1.000, -0.924, -0.707, -0.383, 0.000
D/A Converter (DAC)
• Processing a sequence of values
• Conversion of each value to amplitude
[Figure: the sequence below converted back to an output waveform over time t]
0.000, 0.383, 0.707, 0.924, 1.000, 0.924, 0.707, 0.383, 0.000, -0.383, -0.707, -0.924, -1.000, -0.924, -0.707, -0.383, 0.000
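One simple way to picture the conversion of each value to an amplitude is a zero-order hold: the output stays at the current sample's value until the next sample is due. Real converters add analog filtering on top of this, so the sketch below is only a simplified model with illustrative names, not a description of any particular DAC:

```c
/* Zero-order-hold model of a D/A converter (illustrative only):
   return the output amplitude at time t (seconds) for a sequence of
   n samples played back at sample_rate Hz. Outside the sequence the
   output is taken to be silence (0.0). */
double dac_output(const double *samples, int n, double sample_rate, double t)
{
    if (t < 0.0) {
        return 0.0;
    }
    long index = (long)(t * sample_rate);  /* truncation selects the current sample */
    if (index >= n) {
        return 0.0;
    }
    return samples[index];
}
```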
Audio Data Formats
Time domain-based formats
• Pulse Code Modulation
• WAV
• FLAC
• ...
Frequency domain-based formats
• MP3
• Ogg Vorbis
• ...
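To make the time-domain family concrete: an uncompressed WAV file is essentially raw PCM samples preceded by a small header. The sketch below builds the common 44-byte header for linear PCM. The function and helper names are illustrative, and only the simplest layout is covered (no extra chunks, no compressed formats):

```c
#include <stdint.h>
#include <string.h>

/* Store 16-bit and 32-bit values in little-endian byte order. */
static void put_u16(uint8_t *p, uint16_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)(v >> 8);
}

static void put_u32(uint8_t *p, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        p[i] = (uint8_t)(v >> (8 * i));
    }
}

/* Fill `h` with the 44-byte header of an uncompressed PCM WAV file.
   `data_bytes` is the size of the sample data that will follow. */
void wav_header(uint8_t h[44], uint32_t sample_rate, uint16_t channels,
                uint16_t bits_per_sample, uint32_t data_bytes)
{
    uint16_t block_align = (uint16_t)(channels * (bits_per_sample / 8));

    memcpy(h, "RIFF", 4);
    put_u32(h + 4, 36 + data_bytes);             /* file size minus 8 bytes   */
    memcpy(h + 8, "WAVE", 4);
    memcpy(h + 12, "fmt ", 4);
    put_u32(h + 16, 16);                         /* size of the fmt chunk     */
    put_u16(h + 20, 1);                          /* format tag 1 = linear PCM */
    put_u16(h + 22, channels);
    put_u32(h + 24, sample_rate);
    put_u32(h + 28, sample_rate * block_align);  /* bytes per second          */
    put_u16(h + 32, block_align);                /* bytes per sample frame    */
    put_u16(h + 34, bits_per_sample);
    memcpy(h + 36, "data", 4);
    put_u32(h + 40, data_bytes);
}
```

Writing these 44 bytes followed by the interleaved samples yields a file that common players accept, which is why WAV sits next to plain PCM in the list above.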
(Musical Instrument Digital Interface, MIDI)
Mentioned for the sake of completeness
Stores events and their placement in time, not audio samples
Multi-channel Audio
mono FC
stereo FL+FR
2.1 FL+FR+LFE
3.0 FL+FR+FC
3.0(back) FL+FR+BC
4.0 FL+FR+FC+BC
quad FL+FR+BL+BR
quad(side) FL+FR+SL+SR
3.1 FL+FR+FC+LFE
5.0 FL+FR+FC+BL+BR
5.0(side) FL+FR+FC+SL+SR
4.1 FL+FR+FC+LFE+BC
5.1 FL+FR+FC+LFE+BL+BR
5.1(side) FL+FR+FC+LFE+SL+SR
6.0 FL+FR+FC+BC+SL+SR
6.0(front) FL+FR+FLC+FRC+SL+SR
hexagonal FL+FR+FC+BL+BR+BC
6.1 FL+FR+FC+LFE+BC+SL+SR
6.1(back) FL+FR+FC+LFE+BL+BR+BC
6.1(front) FL+FR+LFE+FLC+FRC+SL+SR
7.0 FL+FR+FC+BL+BR+SL+SR
7.0(front) FL+FR+FC+FLC+FRC+SL+SR
7.1 FL+FR+FC+LFE+BL+BR+SL+SR
7.1(wide) FL+FR+FC+LFE+BL+BR+FLC+FRC
7.1(wide-side) FL+FR+FC+LFE+FLC+FRC+SL+SR
octagonal FL+FR+FC+BL+BR+BC+SL+SR
hexadecagonal FL+FR+FC+BL+BR+BC+SL+SR+TFL+TFC+TFR+TBL+TBC+TBR+WL+WR
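Each layout in the list is a set of named channels joined by `+`. Note that the ".1" in a layout name stands for the LFE channel, so 5.1 carries six channels. A small sketch that derives the channel count from such a string; the helper `count_channels` is hypothetical:

```c
#include <stddef.h>

/* Count the channels in a layout string such as "FL+FR+FC+LFE+BL+BR",
   where channel names are separated by '+'. Returns 0 for an empty or
   missing string. */
int count_channels(const char *layout)
{
    if (layout == NULL || layout[0] == '\0') {
        return 0;
    }
    int count = 1;
    for (const char *p = layout; *p != '\0'; p++) {
        if (*p == '+') {
            count++;
        }
    }
    return count;
}
```

The count matters for storage: one sample frame of interleaved PCM holds one value per channel, so its size grows in step with this number.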
Audio Playback
“Sample pump”
D/A < audio controller < operating system < process < playback routine
Aliasing
http://www.cs.berkeley.edu/~sequin/CS184/LECT_09/L28.html
Aliasing
• Occurs when sampling a signal that contains frequencies higher than the Nyquist frequency (half the sampling frequency)
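For a pure tone, the frequency that comes out after sampling can be computed by folding the input frequency around the nearest multiple of the sampling frequency. A minimal sketch, assuming an ideal sampler with no anti-aliasing filter; the function name is illustrative:

```c
/* Frequency (Hz) that a pure tone of `f` Hz appears to have after being
   sampled at `fs` Hz without any anti-aliasing filter. Tones at or below
   the Nyquist frequency fs/2 come back unchanged; higher tones fold
   back into the range 0..fs/2. Assumes f >= 0 and fs > 0. */
double alias_frequency(double f, double fs)
{
    long nearest = (long)(f / fs + 0.5);   /* index of the nearest multiple of fs */
    double folded = f - nearest * fs;
    return folded < 0.0 ? -folded : folded;
}
```

For example, a 30,000 Hz tone sampled at 48,000 Hz is above the 24,000 Hz Nyquist frequency and shows up as 18,000 Hz, while a 1,000 Hz tone is unaffected.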
Which Sampling Frequency?
• Depends on the highest pitch we wish to represent
[Figure: a 50 ms excerpt at sampling frequencies of 44100 Hz, 22050 Hz, 14700 Hz, 11025 Hz, and 8820 Hz]
Which Sampling Frequency?
[Figure: time axis 0 s – 2 s, frequency axis 10 Hz – 21 kHz]
Bit Depth
• Measured values are stored as separate numbers
• Resolution in bits
– Typical values:
8, 12, 16, 24, 32 bits
• Quantization error
– rounding error on each sample
– heard as noise added to the signal
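The quantization error can be made concrete by rounding a sample to the nearest level of a given bit depth and converting it back. A minimal sketch, assuming samples normalized to the range -1.0 to 1.0 and signed integer codes; the function names are illustrative:

```c
/* Quantize a sample x in [-1.0, 1.0] to a signed integer code with the
   given bit depth (1..16 here), rounding to the nearest level and
   clamping to the representable range. */
int quantize(double x, int bits)
{
    int scale = 1 << (bits - 1);             /* e.g. 128 for 8 bits */
    double scaled = x * scale;
    int code = (int)(scaled >= 0.0 ? scaled + 0.5 : scaled - 0.5);
    if (code > scale - 1) code = scale - 1;  /* +1.0 does not fit exactly */
    if (code < -scale) code = -scale;
    return code;
}

/* Map an integer code back to the range [-1.0, 1.0). */
double dequantize(int code, int bits)
{
    return (double)code / (1 << (bits - 1));
}
```

With 3 bits, the sample 0.3 becomes code 1, which converts back to 0.25: an error of 0.05. Away from the clamped top of the range, the error never exceeds half a step, and each additional bit halves the step size.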
Bit Depth
• Example (7 sec): 8-bit, 4-bit, 3-bit, 2-bit, 1-bit
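A low-level audio output API
// Main thread
audioAPI_set_callback(&prepare_buffer);
audioAPI_play();
// … some time after (or never):
audioAPI_stop();
// Callback function: fills the next buffer with a sine tone
#include <math.h>    // sin
#include <stdint.h>  // int8_t
static long cursor = 0;  // number of samples produced so far
void prepare_buffer(
    int8_t *dest_buff,
    int length)
{
    const double two_pi = 6.283185307179586;
    for (int a = 0; a < length; a++) {
        // Divide by 48000.0, not 48000: integer division would truncate
        double time = cursor / 48000.0;  // seconds at 48,000 Hz
        // 440 Hz is an assumed pitch; sin(time) alone oscillates far too slowly to hear
        dest_buff[a] = (int8_t)(127 * sin(two_pi * 440.0 * time));
        cursor++;
    }
}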
Buffering
[Diagram: timeline with two lanes, CALLBACKS (PREPARE BUFFER 1 … 7) and SOUND HW ACTIVITY (PLAY BUFFER 1 … 7); the next buffer is prepared while the current one plays, and a dropout is marked on the timeline]
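A dropout can be reasoned about with a very simple model: while the hardware plays one buffer, the callback has exactly one buffer's duration to prepare the next. The sketch below counts how often that deadline is missed. The model and the name `count_dropouts` are assumptions made for illustration, not a description of any specific driver:

```c
/* Count dropouts under a simple double-buffering model (an assumption):
   while the hardware plays one buffer, the callback must finish preparing
   the next. If preparing a buffer takes longer than one buffer lasts,
   the hardware runs out of data and a dropout occurs. */
int count_dropouts(const double *prepare_ms, int n, double buffer_ms)
{
    int dropouts = 0;
    for (int i = 0; i < n; i++) {
        if (prepare_ms[i] > buffer_ms) {
            dropouts++;
        }
    }
    return dropouts;
}
```

Given preparation times of 1.0, 2.0, 1.5, 7.0 and 1.0 ms against a 5.33 ms buffer, the function reports one dropout, caused by the 7.0 ms callback.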
Buffering
• Shorter, fewer buffers
– E.g.: 1 buffer, 256 samples @ 48,000 Hz
• 5.33 ms
• Nearly good enough for real-time music performance
• Zero margin for errors and delays
Buffering
• Longer, more buffers
– E.g.: 10 buffers, 1024 samples each @ 48,000 Hz
• 213 ms
• Safer
• Probably OK for phone calls
– Internet radio: 1 to 60 seconds of buffer
• Unsuitable for interactive tasks
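The latency figures on the two Buffering slides follow from one formula: number of buffers times samples per buffer, divided by the sampling frequency. A minimal sketch; the function name is illustrative:

```c
/* Total buffered audio in milliseconds: the delay between preparing a
   sample and hearing it when all buffers are full. */
double buffer_latency_ms(int buffers, int samples_per_buffer, double sample_rate)
{
    return 1000.0 * buffers * samples_per_buffer / sample_rate;
}
```

`buffer_latency_ms(1, 256, 48000.0)` gives about 5.33 ms and `buffer_latency_ms(10, 1024, 48000.0)` gives about 213 ms, matching the two examples.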
Playback Errors
• Data transmission errors
– Omission
– Pause
– Repetition
• Wrong sample rate
– Different playback speed
– Gradual loss of synchronization
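Both effects of a wrong sample rate follow from the ratio of the playback rate to the recorded rate. A minimal sketch, with illustrative function names, that computes the speed factor and the drift accumulated over time:

```c
/* Speed factor heard when audio recorded at `recorded_rate` Hz is
   played back at `playback_rate` Hz (1.0 means correct speed). */
double playback_speed(double recorded_rate, double playback_rate)
{
    return playback_rate / recorded_rate;
}

/* Seconds by which playback has run ahead (positive) or fallen behind
   (negative) the intended timeline after `elapsed_s` seconds of playback. */
double sync_drift_s(double recorded_rate, double playback_rate, double elapsed_s)
{
    return elapsed_s * (playback_speed(recorded_rate, playback_rate) - 1.0);
}
```

Material recorded at 44,100 Hz and played at 48,000 Hz runs almost 9% fast, which is clearly audible. A mismatch of only 0.1% (48,000 Hz against 48,048 Hz) is hard to hear, yet after ten minutes the audio is about 0.6 s ahead of anything it should be synchronized with.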
“Going above Zero”
• Distortion: