3D Spatial Response
 Introduction
 Goal
 Impulse Response
 Maximum Length Sequence
 Head Related Transfer Function
 ITD and ILD
 Head-shadow effect
 Measurement Setup
 Hardware Design
 Compensation
 Verification Test
 HRTF Database
 MIT HRTF Plots
 Labyrinth HRTF Calibration
 Labyrinth HRTF Plots
Spatial perception with and without hearing aids on.
Vent configuration    Hearing aids On              Hearing aids Off
BTE                   0 mm / 1 mm / 2 mm / 3 mm    0 mm / 1 mm / 2 mm / 3 mm
ITE                   0 mm / 1 mm / 2 mm / 3 mm    0 mm / 1 mm / 2 mm / 3 mm
OPEN                  No hearing aids, using KEMAR's in-ear microphones.
Measuring and analyzing 17 different scenarios (2 device styles x 4 vent sizes x 2 aid states, plus the open ear).
A relationship between the input and the output of a system:
input * transfer function = output
input * tf = output
A) If we know both the input and the output, then
tf = output / input
Problem: this division is not stable (it blows up wherever the input spectrum is small).
B) Convolving a delta function with anything returns that thing unchanged:
delta * x = x
So if the input is a delta function, the output is the transfer function itself:
tf = output
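A minimal MATLAB sketch of the two approaches above; the system h and all names are illustrative stand-ins, not the deck's code:
  % A hypothetical FIR system standing in for the unknown transfer function.
  h = [1, 0.5, 0.25, 0.125];
  % A) Spectral division: unstable wherever abs(fft(x)) is near zero.
  x = randn(1, 1024);
  y = filter(h, 1, x);
  H_est = fft(y) ./ fft(x);          % ill-conditioned bins blow up
  % B) Delta excitation: the output IS the impulse response.
  d = [1, zeros(1, 1023)];
  h_est = filter(h, 1, d);           % equals h, zero-padded to length 1024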
 “A Maximum-Length Sequence (MLS) is a
periodic two-level signal of length P = 2^N – 1,
where N is an integer and P is the periodicity,
which yields the impulse response of a linear
system under circular convolution. The impulse
response is extracted by the deconvolution of
the system’s output when excited with an MLS
signal.”
 http://www.commsp.ee.ic.ac.uk/~mrt102/projects/mls/MLS%20Theory.pdf
MLS: a pseudorandom binary sequence (shown with period 4) in the time domain.
 Magnitude spectrum of the MLS: approximately flat, the closest approximation to a delta function.
 We used an MLS signal 17 periods long.
 The signal was resampled by the ADC/DAC to 24414 Hz.
(Plots: the measured MLS in the time domain and the frequency domain.)
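A sketch of MLS generation with a linear feedback shift register, and impulse-response recovery by circular cross-correlation. N, the tap choice, and the test system h are illustrative:
  % Generate an MLS of length P = 2^N - 1 (taps {10,7} give a maximal sequence).
  N = 10; P = 2^N - 1;
  reg = ones(1, N); mls = zeros(1, P);
  for n = 1:P
      mls(n) = 2*reg(N) - 1;                 % map {0,1} -> {-1,+1}
      fb = xor(reg(10), reg(7));             % feedback bit
      reg = [fb, reg(1:N-1)];
  end
  % Excite a test system with two periods; keep the steady-state period.
  h = [1, 0.5, 0.25];                        % illustrative system under test
  y = filter(h, 1, repmat(mls, 1, 2));
  y = y(P+1:end);
  % Circular cross-correlation with the MLS recovers the impulse response
  % (up to the MLS's small DC offset).
  h_est = real(ifft(fft(y) .* conj(fft(mls)))) / P;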
Transfer function of one's sound localization system from a point in space to each ear.
It involves the shape of the pinna, shoulder reflections, hair, and more.
Left ear: HRTF_L
Right ear: HRTF_R
conv(Input (mono), Impulse Response (L,R)) = Output (L,R)
Input: desired sound
Impulse response: HRTF_alpha (L,R) => desired direction
Output: sound perceived as arriving from angle alpha
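A sketch of that convolution in MATLAB; the file name and the hrir pair for angle alpha are assumed to exist:
  [x, fs] = audioread('mono_source.wav');    % hypothetical mono input
  x = x(:, 1);
  out_L = conv(x, hrir_L);                   % hrir_L / hrir_R: measured pair
  out_R = conv(x, hrir_R);                   % for the desired angle alpha
  soundsc([out_L, out_R], fs);               % binaural playback on headphones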
(Diagram: an MLS + chirp signal is played from angle alpha, recorded at the ears, and deconvolved into the HRTF for angle alpha, which feeds the 3D audio reconstruction.)
 Interaural time difference - ITD
 Interaural level difference - ILD
 Spectral information
(Polar plots at 1000 Hz, 2000 Hz, 4000 Hz, and 8000 Hz.)
ITD cues become ambiguous (spatial aliasing) if the distance between the two ears > λ/2,
where λ = speed of sound / frequency.
For a normal head size, this happens for F > ~1600 Hz.
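The deck's arithmetic for that cutoff, worked in MATLAB (see also the ITD slide near the end; constants are the deck's own):
  c = 343;           % speed of sound, m/s
  d = 0.23;          % approximate distance between the ears, m
  t = d / c          % ~670 microseconds of maximum interaural delay
  f = 1 / t          % ~1500 Hz, rounded to ~1600 Hz in this deck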
 Room: Height: 115 cm, Width: 170 cm, Length: 265 cm
 T60 reverberation time: ?
 Noise floor: 20 dB SPL
 Loudspeaker: ?
 KEMAR: ?
 Preamp: ?
 Measurement Microphone: ?
 ADC/DAC: RM1, two I/O, sampling rate 24414 Hz
 Turntable: Outline
 Turntable: Outline ET 250-3D
Size: 350 x 455 mm
Resolution: 0.01 degree
Step size: 0.5 degree minimum
Control: Ethernet cable and TTL connector
Axial load: 1500 kg
• KEMAR rotates azimuthally.
• The speaker rotates only for elevation angles.
Azimuth: 0:5:355 degrees
Elevation: -30:10:90 degrees
• Both motors are driven clockwise/counterclockwise over the Ethernet cable, via an Arduino microcontroller controlled from Matlab.
Tweeter and Bass are on point.
KEMAR
Speaker
LCD shield
Ethernet shield
Arduino
Push button for changing the motors' rotation
 What impulse response do we want?
Loudspeaker? No.
Room? No.
Pre-amplifier? No.
Coupler? No.
Microphone? No.
RM1? No.
 We want KEMAR's ear response to the source's location in a room, regardless of the
Loudspeaker
Room
Pre-amplifier
Coupler
Microphone
RM1
responses.
 What we have:
Impulse response
(Loudspeaker+room+coupler+microphone+pre-amp+RM1+hrir)
 What we need:
hrir
 Solution:
(Loudspeaker+room+coupler+microphone+pre-amp+RM1+hrir)
minus
(Loudspeaker+room+coupler+microphone+pre-amp+RM1)
= hrir
(Cascaded responses convolve in time, so this "minus" is a deconvolution: a division in the frequency domain.)
 Procedure:
1st: Remove KEMAR and replace it with a measurement microphone, similar to the ones in KEMAR's ears, at the exact same location, to get
(Loudspeaker+room+coupler+microphone+pre-amp+RM1)
Let's call the combination of all these responses the 'room' response for now.
2nd: Compensate for the "room" response if necessary.
 How to compensate?
A) conv(room^-1, (room+hrir)) = hrir (1)
Same thing as
abs(FFT(room+hrir)) / abs(FFT(room))
(Note: once in the frequency domain, use the linear-convolution length for the number of FFT points to avoid time aliasing.)
Problem:
Ill-conditioned frequency bins introduce severe spectral coloration into the results.
(1) Project MaRIE. "CRTools Compatible HRIR." GN Resound
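A sketch of method A; room is the room-only measurement, room_hrir is the measurement with KEMAR in place, both names illustrative:
  nfft = length(room) + length(room_hrir) - 1;   % linear-convolution length
  H = fft(room_hrir, nfft) ./ fft(room, nfft);   % ill-conditioned bins blow up
  hrir = real(ifft(H));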
(Plots: room response and inverse room response; an ill-conditioned frequency bin is marked.)
 How to compensate?
B) Constant Regularization (1)
abs(FFT(room+hrir)) / (abs(FFT(room)) + β)
Avoids spectral coloration at the cost of losing some room compensation.
(1) Choueiri. Optimal Crosstalk Cancellation for Binaural Audio with Two Loudspeakers. BACCH Audio, Princeton University.
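A sketch of method B, magnitude-only as in this deck (the room phase is treated as linear and handled separately); beta's value is an illustrative choice:
  R = abs(fft(room, nfft));
  M = abs(fft(room_hrir, nfft));
  beta = 0.01 * max(R);              % illustrative constant
  Hmag = M ./ (R + beta);            % bounded even at ill-conditioned bins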
(Plots: room response and inverse room response. The magnitude at the ill-conditioned frequency dropped by a factor of 10 (20 dB); the ill-conditioned bin shifted.)
 How to compensate?
C) Frequency Dependent Regularization (1)
abs(FFT(room+hrir)) / (abs(FFT(room)) + β(frequency))
Avoids spectral coloration at the cost of losing a smaller amount of room compensation.
β(frequency) = 0    if Room(i) > threshold,   for i = 1 : #fftpoints/2
β(frequency) = β    if Room(i) <= threshold,  for i = 1 : #fftpoints/2
(1) Choueiri. Optimal Crosstalk Cancellation for Binaural Audio with Two Loudspeakers. BACCH Audio, Princeton University.
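A sketch of method C, continuing the previous sketch (R, M, beta as above; the threshold is an illustrative choice):
  thr = 0.1 * max(R);                % illustrative threshold
  reg = beta * (R < thr);            % beta where the room response is weak
  Hmag = M ./ (R + reg);             % strong bins get full compensation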
(Plots: room response and inverse room response. Only the bins below the threshold are boosted, getting some of the room compensation back relative to constant regularization.)
 How to compensate?
D) Filter Inversion
Example: treat the room response as a high-pass filter
=> flatten the magnitude response and deal with the phase separately.
1- Fit a curve to the room response to avoid compensating for FFT artifacts.
We only want to partially compensate for the shape of the filter without introducing new artifacts into the system.
2- Find the maximum value of the fitted curve and boost all frequency bins to that value.
The room response is boosted appropriately, resulting in a flat frequency response.
 As expected, the low-frequency bins are boosted the most to compensate for the room response.
This method is basically a variation of frequency-dependent regularization where the threshold is defined backward, as max(FFT(room)).
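A sketch of the magnitude part of method D; the polynomial order and the normalization are illustrative, and the phase is handled in the next step:
  Rdb = 20*log10(abs(fft(room, nfft)));
  half = Rdb(1:floor(nfft/2));                 % positive frequencies only
  k = (1:numel(half)) / numel(half);           % normalized bin index
  p = polyfit(k, half, 5);                     % smooth curve through the response
  fitdB = polyval(p, k);
  boost = max(fitdB) - fitdB;                  % dB needed to flatten each bin
  gain = 10 .^ (boost / 20);                   % linear per-bin gain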
3- Compensate for the phase: subtract the room phase from the acquired signal's phase (unnecessary if the room phase is linear).
(Plots: phase of the acquired signal before compensation; phase of the compensated signal.)
Note: the phase is conjugate symmetric.
The DC and middle (Nyquist) frequency bins must be ignored when reconstructing the new hrir response.
(Plots: first and final steps of the reconstruction.)
Goal:
Does the impulse response correspond to the right location?
How:
By comparing the resulting signals, both perceptually and by computing the quantization error between them.
 KEMAR is 145 cm from the loudspeaker, at 5 cm elevation.
 HRTF angle: ~60 degrees front-left in azimuth and ~5 degrees elevation upward.
 Sound pressure level at the left ear: ~75-80 dB
(Diagram: 0 degrees, 60 degrees, left and right ears.)
 Only the magnitude of the HRTFs was compensated, since the phase response of the room is linear.
(Plots: room response; original binaural; reconstructed with HRTF + room compensation; reconstructed with HRTF.)
Perceptually:
Difference:
After synchronizing and normalizing the gain, we have:
Difference        Reconstructed L/R    Compensated L/R
Binaural Left     20.38%               5.2%
Binaural Right    33.03%               15.55%
FYI:
Subtract the two STFTs at each frequency bin, average over time within each frequency bin, and take the norm of the resulting vector => Difference.
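A sketch of this metric; spectrogram is from the Signal Processing Toolbox, and the window settings and percentage normalization are illustrative since the deck does not specify them:
  % Assumes sig1 and sig2 are already synchronized and gain-normalized.
  S1 = abs(spectrogram(sig1, 512, 256, 512));   % reference (e.g. binaural)
  S2 = abs(spectrogram(sig2, 512, 256, 512));   % reconstructed or compensated
  d = mean(S1 - S2, 2);                         % average each bin over time
  difference = norm(d) / norm(mean(S1, 2));     % relative difference measure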
(Plots: binaural recording; difference between binaural and reconstructed; difference between binaural and compensated.)
 MIT
 CIPIC
 Andrew’s DI Database
 Labyrinth
 MIT Media Lab - 1994
 Sampling rate 44.1 kHz
 Azimuth: 0 to 355, ~5 degrees step
 Elevation: -45 to 90 degrees, ~15 degrees step
 1024 samples
 University of California Davis - 2001
 Sampling rate 44.1 kHz
 Azimuth: -80 to 80, ~5 degrees step
 Elevation: -45 to 270 degrees, ~15 degrees step
 200 samples
 GN Resound at Glenview, Illinois - 2014
 Sampling rate 48828 Hz
 Azimuth: 0 to 355, 5 degrees step
 Elevation: -30 to 90 degrees, 10 degrees step
 160 samples
 Better Ear Strategy: compare the SNR of the signal source between the ears and choose the signal with the most positive SNR. For the polar plot, choose the signal between the ears that has the most attenuation with reference to the on-axis signal.
 Audibility Strategy: compare the levels of the signal source between the ears and choose the signal with the most positive level. For the polar plot, choose the signal between the ears that has the least attenuation with reference to the on-axis signal.
(1) Andrew Dittberner, Chang Ma, and Paul Sexton. "BASS Benchmark Project." Labyrinth Program. GN Resound, 2014.
(Polar plots, Better Ear Strategy vs. Audibility Strategy:
 @ 1 kHz and 2 kHz, elevation -40 degrees
 elevation -40 degrees
 elevation 0 degrees
 elevation 80 degrees
 frequency 2000 Hz
 frequency 4000 Hz
 frequency 8000 Hz)
 Why calibration?
 Is KEMAR facing the speaker at (0az, 0el)?
 Does the robotic arm keep the same azimuth angle as it goes to higher elevation angles?
There are different ways to perform calibration on the system.
1- Set the speaker and KEMAR at (90az, 0el) by eyeballing it (or maybe use a level).
2- Define a reasonable azimuth and elevation threshold for the eyeballing error, e.g. ±15 degrees in azimuth and elevation.
3- Record the response at both ears for every point within the threshold. The step size defines the resolution of your calibration.
4- Take the RMS of the result from each ear and subtract them in the log domain.
5- The maximum value from step 4 corresponds to the actual (90az, 0el).
6- Move the motor to the corresponding angle from step 5 and set it to (90az, 0el).
 th = threshold
 The thresholds on azimuth and elevation are arbitrary and could be different values.
D(az, el) = 10*log10(rms(left(az, el))) - 10*log10(rms(right(az, el)))
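A sketch of this grid search; move_rig, record_both_ears, and set_origin are hypothetical placeholders for the actual rig-control code:
  azs = -15:1:15;  els = -15:1:15;              % search window and resolution
  D = zeros(numel(azs), numel(els));
  for i = 1:numel(azs)
      for j = 1:numel(els)
          move_rig(90 + azs(i), els(j));        % hypothetical motor command
          [L, R] = record_both_ears(mls);       % hypothetical measurement
          D(i, j) = 10*log10(rms(L)) - 10*log10(rms(R));
      end
  end
  [~, k] = max(D(:));                           % peak marks the true (90az, 0el)
  [i, j] = ind2sub(size(D), k);
  set_origin(90 + azs(i), els(j));              % hypothetical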
Verification: left and right RMS intersection.
(Plot: magnitude in dB vs. azimuth angle.)
The intersection is at 25 degrees.
 KEMAR without its pinnae on could form a directional microphone. The auditorium calibration might be more accurate if KEMAR's pinnae are removed (for higher frequency bands).
 A similar procedure must also be done for (0az, 90el). The level difference between left and right is expected to be zero there, since they are symmetric (we are looking for the minimum value here).
 A similar auditorium calibration procedure can be done by focusing on the ITD instead of the ILD. The index at which the left and right impulse responses are extracted must be the same at (0az, 0el).
When these two peaks are at the same index, KEMAR is located at (0az, 0el).
(Plots: peak index vs. azimuth and elevation.)
 The auditorium calibration assumes that the robotic arm moves on a straight line (keeping the same azimuth angle with respect to KEMAR) toward higher elevation angles. It turned out that's not the case here.
 The laser pointer follows the center of the KEMAR
head at all elevation angles.
 KEMAR receives reflections of the signal off the walls and other objects while receiving the direct signal from the speaker.
Where is it coming from?
The reflection arrives 100~150 samples after the peak for most azimuth angles. Given the sampling rate, and since the extra path is traveled out and back, this translates to 70~100 cm from the KEMAR.
Solution:
Use a measurement microphone as the second ear and place it in different positions.
Warmer or colder?
(Diagram: numbered candidate microphone positions around the room.)
The robotic arms were the main origin of the early reflections in the system.
A few acoustic absorption foam pads on each arm decreased the reflections by almost 40 dB.
 Early reflections become more important when using hearing aids, since hearing aids have longer impulse responses, which makes the early reflections harder to detect.
(Polar plots: open left ear at 1000 Hz, 2000 Hz, and 4000 Hz.)
 Open ear, Audibility Strategy, 5 kHz
 Open ear, Better Ear Strategy, 5 kHz
Open ear, Better Ear / Audibility Strategy, 1 kHz and 4 kHz
 Open ear @ (0az, 0el)
(Plots: frequency response (left/right) and group delay (left/right), magnitude in dB vs. frequency.)
 BTE hearing aids @ (0az, 0el)
(Plots: magnitude in dB vs. frequency; the response shifts with increasing vent size.)
 BTE hearing aids, left ear, 750 Hz
 BTE hearing aids, 1000 Hz (gain issue)
 BTE hearing aids, 2000 Hz
Clearly, the vent size affects the lower frequencies.
 BTE hearing aids, Audibility Strategy, 2000 Hz
 BTE hearing aids, Better Ear Strategy, 2000 Hz
 ITE hearing aids @ (0az, 0el)
(Plots: vent sizes 0, 1, 3; the response shifts with increasing vent size.)
ITE and BTE have opposite relationships with respect to the vent size.
 ITE hearing aids, right ear, 750 Hz (gain issue)
 ITE hearing aids, right ear, 2000 Hz
 BTE hearing aids, Audibility Strategy, 2000 Hz
 BTE hearing aids, Better Ear Strategy, 2000 Hz
 Collecting data for one elevation at 5-degree azimuth resolution takes about 923 seconds (~15 min); that's about 4.5 hours to collect data for all elevations at 10-degree resolution.
 Not a good idea to save your database to Matlab memory while measuring, no matter how much more convenient it is! Matlab WILL CRASH!
 Interpolation in time/Frequency domain
 ITD/ILD
 Reverberation Time
 3D Audio
 Beamforming for Source Localization
 CIPIC Polar Pattern
 CIPIC Delay and Sum Pattern
 MIT Polar Pattern
 MIT Delay and Sum Pattern
 And more glitzy plots!
 Why interpolation?
1. Higher resolution, easier analysis.
2. Smoother transitions when reconstructing 3D audio.
3. Easier comparison of HRTF databases with different step sizes.
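A sketch of the simplest version, linearly interpolating between two neighboring measurements (per the notes, the frequency domain is preferred; hrir_a and hrir_b are illustrative names for adjacent-angle HRIRs):
  w = 0.5;                                    % halfway between the two angles
  hrir_mid = (1 - w)*hrir_a + w*hrir_b;       % time-domain interpolation
  % Frequency-domain alternative: interpolate the magnitudes instead.
  Hmag_mid = (1 - w)*abs(fft(hrir_a)) + w*abs(fft(hrir_b));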
 Original database (levels getting bigger: ILD)
 Interpolated database (thicker vertically)
 Original database (peaks shifting to the right: ITD)
 Interpolated database (thicker horizontally)
 How? By cross-correlating the left and right hrir.
(Diagram: two microphones separated by 23 cm, i.e., the two ears.)
 How? By subtracting the magnitude squared of the HRTF at the left ear from that at the right ear.
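A sketch of both estimates; xcorr is from the Signal Processing Toolbox, and fs and the hrir pair are assumed given (the ILD is expressed here in dB):
  [c, lags] = xcorr(hrir_L, hrir_R);          % ITD: peak of the cross-correlation
  [~, k] = max(abs(c));
  itd = lags(k) / fs;                         % seconds; the sign gives the side
  ild = 10*log10(abs(fft(hrir_L)).^2 ./ abs(fft(hrir_R)).^2);  % dB per bin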
 T60 is the time it takes for a signal to drop by 60 dB. In a noisy environment, T60 is measured by fitting the linear region of the Energy Decay Curve,
EDC(t) = integral from t to infinity of h(tau)^2 dtau,
where h(tau) is the room impulse response.
https://ccrma.stanford.edu/~jos/pasp/Energy_Decay_Curve.html
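A sketch of T60 via Schroeder backward integration of the impulse response h; the -5 to -25 dB fitting region is one common choice, not the deck's stated setting:
  edc = flipud(cumsum(flipud(h(:).^2)));      % EDC(t) = integral from t of h^2
  edc_db = 10*log10(edc / edc(1));
  t = (0:numel(h)-1)' / fs;
  idx = edc_db < -5 & edc_db > -25;           % linear region of the decay
  p = polyfit(t(idx), edc_db(idx), 1);        % slope in dB per second
  T60 = -60 / p(1);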
 Goal: make a sound move smoothly through all the angles in the database.
This helps us identify the accuracy of the database and the possible spectral coloration it imposes on any desired signal.
3D audio for 0-degree elevation:
(Audio demos: CIPIC -80 to 80; MIT 0 to 355.)
CIPIC: original.
MIT:
- Low-pass filtered
- Low-frequency bins are even suppressed
- A few bandstop filters
 How does spatial aliasing affect us in localizing a sound source?
Assume the only cue we can use to localize a source is the ITD (like wearing a hearing aid?).
 We can use a set of techniques, called beamforming, to simulate localizing a sound source.
 The idea is:
Delay the reference signal until the sum of the energies of the two signals is at its maximum. That delay corresponds to the angle of arrival.
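A sketch of this two-microphone delay-and-sum estimate; x1 and x2 are the two ear signals (assumed given), and the constants come from this deck:
  fs = 5000; c = 343; d = 0.23;               % sampling rate, sound speed, spacing
  thetas = -90:1:90;                          % candidate angles of arrival
  E = zeros(size(thetas));
  for i = 1:numel(thetas)
      tau = d * sind(thetas(i)) / c;          % expected inter-ear delay
      y = x1 + circshift(x2, -round(tau * fs));  % undo the delay, then sum
      E(i) = sum(y.^2);                       % energy of the steered sum
  end
  [~, k] = max(E);
  doa = thetas(k);                            % estimated angle of arrival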
• Signals at 1 kHz forming a sound source at 20 degrees, at a 5 kHz sampling rate: angle of arrival detected correctly.
• Signals at 2.5 kHz forming a sound source at 20 degrees, at a 5 kHz sampling rate: angle of arrival detected incorrectly.
(Plot: beam pattern.)
 CIPIC database: -80 to 80 azimuth @ 0 elevation
 MIT database: 0 to 355 azimuth @ 0 elevation
Both from DC to 22 kHz, shown for the left ear.
 ITD: stronger at low frequencies.
 ILD: stronger at high frequencies.
Shorter wavelengths die off faster around the head, giving a bigger ILD.
 Distance between the ears ~ 23 cm
 23 cm = 343 m/s * t => t ~ 670 µs
 Frequency = 1/period => F ~ 1600 Hz
What does that mean?
(Plots, left vs. right ear:
 @ 1 kHz and 2 kHz, elevation -40 degrees
 @ 4 kHz and 8 kHz, elevation -40 degrees
 elevation -40 degrees
 elevation 0 degrees
 elevation 40 degrees
 elevation 80 degrees)
(Plots, Better Ear Strategy vs. Audibility Strategy:
 @ 4 kHz and 8 kHz, elevation -40 degrees
 elevation 40 degrees
 frequency 1000 Hz)
(1) Andrew Dittberner, Chang Ma, and Paul Sexton. "BASS Benchmark Project." Labyrinth Program. GN Resound, 2014.
Editor's Notes
  1. BASS BENCHMARK – proposal
  2. Bigger topics, brief introduction, no need to introduce linear system, go from impulse MLS sequence, then HRTF,
3. Purpose: spatial perception.
  4. Plug hearing aids/ changing TF, simulate another polar plots distorted and undistorted. Goal Directionality might be affected by the hearing aids.
  5. appendix
  6. So, how do we get a delta function? Maybe popping balloons? Won’t be accurate and won’t give you the response for every frequency.
  7. Same energy at all frequency bins.
  8. Even though the signal is not entirely binary anymore, the spectrum is still flat, though noisy, over the desired frequency range, so we’re safe.
9. We'll use method B to measure the impulse response (steering vectors) of one's sound localization system. To get the delta effect, we can use MLS or chirp signals, which are pseudorandom noise.
  10. Localization cues. ITD better for < 1600 Hz, headshadow effect comes to importance > 1600 ILD and nervous system not very good at ITD at low frequencies, and spatial aliasing, forming beams for > 1600 Hz.
  11. Too many, octave frequency 1 2 4 8 kHZ Produce by : Delay and sum beamforming
12. Super low frequencies: omni response. Narrow beam at zero degrees => good. But high energy at all these other angles; cannot distinguish the difference. Of course we almost never hear one frequency, but a wide range of frequencies. So the conclusion here is that there is more to sound localization besides ITD.
  13. Assumptions can be made here. Collect references, do they compensate? Papers associated with databases.
  14. Let’s assume loudspeakers and all are all room response, that they have flat response. Room response if different at. Just talk about it, w ref
  15. Reference. Need papers. Cited.
16. Based on the room and equipment, there might be few to many ill-conditioned frequency bins. These are the bins that would boost the signal to higher values and color the spectrum of the original signal. Too much detail.
  17. Source. Refer to the same database, different ways to compensate. Be consistent.
  18. Fix the peak.
  19. Room = fft(room);
  20. Room = fft(room);
  21. Room = fft(room);
  22. Room = fft(room);
  23. Room = fft(room);
  24. Solution is to simply subtract the phases from each other, so the phase of the signal won’t be affected by the room. If room phase is linear, this process is unnecessary. References. Citations.
25. My computation is right, then no need for room compensation? Verify a few points. Quantization noise; verify the process.
  26. My computation is right, then no need to room compensation? Verify a few point. Quantization noise, verify the process
27. Keep in mind, binaural sound always sounds better than reconstructed, and no one knows why yet!
  28. Open ear HRTF
  29. Pick one database.
30. Better Ear Strategy is analogous to a beamforming pattern, and Audibility to an omni pattern.
  31. Interpolation error based on the resolution at 355 to 5 degrees.
  32. Do in frequency domain
33. At azimuth -80 degrees: level differences over elevation angles as we walk through different elevation angles. The time delay between elevations is negligible.
  34. At elevation -45 degrees. As expected, time delay is more important over azimuth angles, and level differences is more important over elevation angles.
  35. Similar to microphone array and MIT results
  36. Low frequency localization for about 300 Hz to 1500 Hz pretty good at zero degree elevation.
  37. Similar, ITD doesn’t change over elevation angles with frequency. Huh! So, we can use ITD cues the same way at any elevation planes. It only depends on how big your head is! We can make the same beams toward the azimuth angles using ITD.
  38. As expected, ILD is higher for F > 1500, but it’s divided to two ranges, 2500 to 6500 and 8500 to 11000 Hz. As expected, ILD is higher at extreme angles, -80 and +80. 0 degrees is not very good, cone of confusion.
  39. Front bottom : two ranges Front 0 : the low range Front up : low range wider Top : low range wider and weaker Smiley face!
  40. Back Top: high range not very strong Back 0: high and low both Back bottom: high Appenidx They also look like happy, annoyed and angry faces!
  41. Appendix.
  42. Appendix.
  43. You be the judge of their angles . 0 degrees front- -80 degrees left same as 355 80 degrees right same as 0 degrees Appendix
  44. Appendix
  45. appendix
  46. appendix
  47. appendix
  48. -80 to 80 CIPIC for 0 elevation for selected frequency, azimuth to beam to spatial aliasing. Polar pattern. Put both polar plot.
  49. -80 to 80 front side for selected elevation angle. Plot left and right. See headshadow effect. What is the assumption? Beamforming + headshadow effect. Cut off 8 kHz. Same data sphere plot.
  50. All azimuth angles, limited elevation angles.
  51. Goes from 0 to 22 kHZ -80 left to 355 80 right to 0 MIT 355 t0 180 is like -80 to 80 in CIPIC Fft(hrir)
  52. fixed. Brain and detection. Remove
53. ITD is good for F < 1600 Hz; once the wavelength is smaller than the distance between the ears, we have spatial aliasing. High frequency: reflection problem.
  54. Interpolation error based on the resolution at 355 to 5 degrees.
  55. Interpolation error based on the resolution at 355 to 5 degrees.
  56. Better ear : min intersection Audibility: max intersection