HRTF Database Provides Cues for Sound Localization
1.
2. Humans train themselves to localize sounds with their
ears from birth and localize well even in adverse
conditions.
The head-related transfer function (HRTF), defined as
the ratio of the Fourier transform of the signal at the
listener’s eardrum to that at the center of the listener’s
head with the listener absent, characterizes the changes
that the listener's body induces in the incoming sound.
Head-related transfer functions (HRTFs) capture the
sound localization cues created by the scattering of
incident sound waves by the body, and play a central
role in spatial audio systems.
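The ratio definition above can be sketched numerically. This is only an illustration under stated assumptions: the function name `hrtf_from_recordings`, the small regularization constant `eps`, and the toy signals are all hypothetical, and a real measurement would use recorded eardrum and free-field signals.

```python
import numpy as np

def hrtf_from_recordings(ear_signal, reference_signal, n_fft, eps=1e-12):
    """Estimate H(f) = FFT(ear) / FFT(reference) on the rfft frequency grid."""
    ear_spec = np.fft.rfft(np.asarray(ear_signal, dtype=float), n_fft)
    ref_spec = np.fft.rfft(np.asarray(reference_signal, dtype=float), n_fft)
    # Regularized division (eps is an assumption) avoids dividing by ~0 bins.
    return ear_spec * np.conj(ref_spec) / (np.abs(ref_spec) ** 2 + eps)

# Toy data: a decaying exponential stands in for the signal at the head
# center with the listener absent; the "ear" signal is that reference
# filtered by a made-up 4-tap impulse response.
reference = 0.9 ** np.arange(64)
toy_hrir = np.array([0.0, 1.0, 0.5, -0.25])
ear = np.convolve(reference, toy_hrir)

n_fft = 256  # longer than the convolved signal, so no circular wrap-around
hrtf = hrtf_from_recordings(ear, reference, n_fft)
expected = np.fft.rfft(toy_hrir, n_fft)
```

In this toy case the estimated ratio matches the spectrum of the made-up impulse response, which is the sense in which the HRTF isolates the listener-induced filtering from the source signal.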
3. The CIPIC Interface Laboratory at U.C. Davis has
measured HRTFs at high spatial resolution for more
than 90 subjects.
In addition to including impulse responses for 1250
directions for each ear of each subject, the database
includes a set of anthropometric measurements that
can be used for scaling studies.
Release 1.0, a public-domain subset for 45 subjects
(including KEMAR with large and with small pinnae),
is available for download from the website
(http://interface.cipic.ucdavis.edu).
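The figure of 1250 directions can be reproduced from the sampling grid. The slide gives only the total; the specific angles below are an assumption taken from the CIPIC database documentation (25 azimuths by 50 elevations in interaural-polar coordinates) and should be checked against the release you download.

```python
import numpy as np

# Assumed CIPIC grid: azimuths are denser near the median plane,
# elevations are uniformly spaced in steps of 360/64 = 5.625 degrees.
azimuths = np.concatenate(
    ([-80, -65, -55], np.arange(-45, 50, 5), [55, 65, 80])
).astype(float)
elevations = -45.0 + 5.625 * np.arange(50)

# One impulse response per (azimuth, elevation) pair, per ear.
n_directions = azimuths.size * elevations.size
```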
4. Head-related impulse responses (HRIRs) are the output
of a linear time-invariant system formed by the
diffraction and reflections around the human head, the
outer ear, and the torso.
An attractive property of HRTFs is that they may be
modeled as minimum-phase structures.
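A minimum-phase version of an impulse response keeps the magnitude spectrum and replaces the phase. The slides do not say how this is computed; the sketch below assumes the common real-cepstrum (homomorphic) method, and the function name `minimum_phase_version` and the two-tap test signal are hypothetical.

```python
import numpy as np

def minimum_phase_version(h, n_fft=1024):
    """Return a minimum-phase response with the same magnitude spectrum as h."""
    h = np.asarray(h, dtype=float)
    spectrum = np.fft.fft(h, n_fft)
    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    log_mag = np.log(np.maximum(np.abs(spectrum), 1e-12))
    cepstrum = np.fft.ifft(log_mag).real
    # Fold the anti-causal half onto the causal half.
    fold = np.zeros(n_fft)
    fold[0] = 1.0
    fold[1:n_fft // 2] = 2.0
    fold[n_fft // 2] = 1.0
    min_phase_spectrum = np.exp(np.fft.fft(cepstrum * fold))
    return np.fft.ifft(min_phase_spectrum).real[: len(h)]

# Toy check: [1, -2] has a zero outside the unit circle (z = 2);
# its minimum-phase counterpart has the zero reflected inside (z = 0.5).
h = np.array([1.0, -2.0])
h_min = minimum_phase_version(h)
```

For the toy input the result is approximately `[2, -1]`, which has the same magnitude response as `[1, -2]` but concentrates its energy at the start of the response.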
5.
6. One of the advantages of measuring HRTF data at
high spatial resolution is that the data can be
represented as an image.
In Figure 5(a), each column in the image is one
impulse response at a particular azimuth, with
brightness coding the strength of the response.
In Figure 5(b), each column is the magnitude of
the HRTF in dB, after the power spectrum was
smoothed by a constant-Q filter (Q=8).
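The image construction described above can be sketched as follows. The slides do not specify the smoothing window, so this assumes a rectangular band of width f/Q centered on each frequency; the function name `constant_q_smooth_db`, the 44.1 kHz sampling rate, and the stand-in impulse responses are hypothetical.

```python
import numpy as np

def constant_q_smooth_db(hrir, fs, q=8.0, n_fft=512):
    """Smooth the power spectrum over bands of width f/q, then convert to dB."""
    power = np.abs(np.fft.rfft(hrir, n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
    smoothed = np.empty_like(power)
    for k, fc in enumerate(freqs):
        half_bw = fc / (2.0 * q)  # bandwidth grows in proportion to frequency
        band = (freqs >= fc - half_bw) & (freqs <= fc + half_bw)
        smoothed[k] = power[band].mean()
    return freqs, 10.0 * np.log10(np.maximum(smoothed, 1e-20))

fs = 44100.0  # assumed sampling rate
impulse = np.zeros(200)
impulse[0] = 1.0
# Stand-ins for measured HRIRs at three azimuths: pure delays, flat spectra.
hrirs = [np.roll(impulse, d) for d in range(3)]

# Each column of the image is one smoothed magnitude response in dB.
image = np.column_stack([constant_q_smooth_db(h, fs)[1] for h in hrirs])
```

With measured HRIRs in place of the stand-ins, `image` could be displayed with any plotting library, with brightness coding the dB value.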
7.
8. In Figure (a), the gray-scale value represents the
amplitude of the HRIR.
In Figure (b), the gray-scale value is the magnitude
of the HRTF in dB.
The composition of the response in terms of head
diffraction effects, head and torso reflections, pinna
effects, and knee reflection can be seen in both the
time domain and the frequency domain.
9.
10. The basis for the decomposition techniques presented
is spectral peaks and nulls, i.e., poles and zeros.
These poles and zeros are caused by different parts of
the body, such as the head, torso, knees, and pinna.
The challenging task is to isolate the prominent
spectral nulls caused by different acoustic phenomena.
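The link between poles and zeros and spectral peaks and nulls can be illustrated with a toy filter. Everything here is hypothetical (the helper names, the 44.1 kHz rate, and the choice of a pole pair near 4 kHz and a zero pair near 8 kHz); it only shows that a root close to the unit circle at angle theta produces a peak or null near the frequency theta * fs / (2 * pi).

```python
import numpy as np

def conjugate_pair(radius, freq_hz, fs):
    """Coefficients of 1 + c1*z^-1 + c2*z^-2 with roots radius*exp(+/-j*theta)."""
    theta = 2.0 * np.pi * freq_hz / fs
    return np.array([1.0, -2.0 * radius * np.cos(theta), radius ** 2])

def root_frequencies(coeffs, fs):
    """Frequencies (Hz) of the upper-half-plane roots of the polynomial."""
    roots = np.roots(coeffs)
    roots = roots[np.imag(roots) > 0]
    return np.sort(np.angle(roots) * fs / (2.0 * np.pi))

fs = 44100.0
b = conjugate_pair(0.98, 8000.0, fs)  # zeros: expect a null near 8 kHz
a = conjugate_pair(0.95, 4000.0, fs)  # poles: expect a peak near 4 kHz

n_fft = 4096
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
response = np.fft.rfft(b, n_fft) / np.fft.rfft(a, n_fft)
magnitude_db = 20.0 * np.log10(np.abs(response))

peak_hz = freqs[np.argmax(magnitude_db)]
null_hz = freqs[np.argmin(magnitude_db)]
```

The peak and null land close to, but not exactly at, the root frequencies, because each root's neighbors tilt the response slightly. That interaction is one reason isolating individual nulls in a measured HRTF is hard.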
11.
12.
13.
14. Features such as pinna resonant frequencies, pinna
nulls, and the delays due to torso and knee reflections
can be extracted using the decomposition technique
above.
The figure shows the frequency response of the
12th-order all-pole model for subject 10 at azimuth 0°
as a function of elevation, drawn as a mesh plot.
In the frequency domain, the torso reflection delay
appears as periodic comb-filter nulls.
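The comb-filter effect of a delayed reflection can be checked with a two-tap model: a direct sound plus one attenuated copy delayed by D samples has magnitude |1 + g*exp(-j*w*D)|, with nulls at odd multiples of fs / (2*D). The delay, gain, and sampling rate below are hypothetical values chosen for illustration, not measurements.

```python
import numpy as np

fs = 44100.0
delay_samples = 40     # hypothetical torso-reflection delay (about 0.9 ms)
reflection_gain = 0.5  # hypothetical reflection strength

# Direct path plus a single delayed, attenuated reflection.
h = np.zeros(delay_samples + 1)
h[0] = 1.0
h[delay_samples] = reflection_gain

# n_fft is a multiple of 2*delay_samples so the nulls fall exactly on bins.
n_fft = 4000
freqs = np.fft.rfftfreq(n_fft, d=1.0 / fs)
magnitude = np.abs(np.fft.rfft(h, n_fft))

# Nulls are interior local minima of the magnitude response.
interior = magnitude[1:-1]
is_null = (interior < magnitude[:-2]) & (interior < magnitude[2:])
null_hz = freqs[1:-1][is_null]

# Predicted nulls: odd multiples of fs / (2 * D) below the Nyquist frequency.
expected_hz = (2 * np.arange(20) + 1) * fs / (2 * delay_samples)
```

The nulls are spaced fs / D apart (1102.5 Hz here), so measuring the null spacing in an HRTF gives an estimate of the reflection delay.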
15.
16. High-spatial-resolution HRTF measurements clarify
the physical sources of HRTF behavior.
The HRTF can be composed from, and decomposed into,
different components, and features that could be
perceptually important for sound source localization
can be extracted.
Using the extracted features, interpolation can be done
in the feature domain.
These features can be related to the physical
dimensions of the human anatomy and the pinna so
that the HRTF can be customized.
17.
18.
19.
20. V. R. Algazi, R. O. Duda, D. M. Thompson, and C.
Avendano, “The CIPIC HRTF Database,” October 21-24,
2001, New Paltz, New York.
Vikas C. Raykar, Ramani Duraiswami, Larry Davis, and B.
Yegnanarayana, “Extracting Significant Features from
the HRTF,” July 6-9, 2003, Boston, MA.
Dmitry N. Zotkin, Jane Hwang, Ramani Duraiswami, and
Larry S. Davis, “HRTF Personalization Using
Anthropometric Measurements,” College Park, MD.