VALIDATION OF A REAL-TIME VIRTUAL AUDITORY SYSTEM FOR DYNAMIC SOUND STIMULI AND ITS APPLICATION TO SOUND LOCALIZATION

Brett Rinehold
Outline
- Motivation
- Introduction
- Background
- Loudspeaker Presentation
- HRTF Interpolation
- Acoustic Waveform Comparison
  - Static Sound Presentation
  - Dynamic Sound Presentation
  - Static Sound with a Dynamic Head Presentation
- Psychophysical Experiment
- Discussion
Motivation
- To validate a real-time system that updates head-related impulse responses.
- Goal is to show that the acoustic waveforms measured on KEMAR match between real and virtual presentations.
- Applications:
  - Explore the effects of dynamic sound presentation on sound localization.
Introduction: What is Real/Virtual Audio?
- Real audio consists of presenting sounds over loudspeakers.
- Virtual audio consists of presenting acoustic waveforms over headphones.
  - Advantages:
    - Cost-effective
    - Portable
    - Does not depend on room effects
  - Disadvantages:
    - Unrealistic
Introduction: Sound Localization
- Interaural Time Difference (ITD): difference between sound arrival times at the two ears.
  - Predominant cue at low frequencies (< 2 kHz).
- Interaural Level Difference (ILD): difference between sound levels at the two ears.
  - Predominant cue at higher frequencies (above ~2 kHz) due to head shadowing.
- Both cues are encoded in the Head-Related Transfer Function (HRTF):
  - ILD in the magnitude
  - ITD in the phase
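The ITD cue can be illustrated with a standard spherical-head approximation (Woodworth's formula; the head radius and speed of sound below are conventional textbook values, not measurements from this work, which derives ITDs from measured HRTF phase):

```python
import math

def woodworth_itd(azimuth_deg, head_radius=0.0875, c=343.0):
    """Approximate ITD in seconds for a source at the given azimuth,
    using the Woodworth spherical-head model: ITD = (a/c)(theta + sin theta).
    Illustrative only; not the ITD estimate used in the thesis."""
    theta = math.radians(azimuth_deg)
    return (head_radius / c) * (theta + math.sin(theta))
```

At 0 degrees the ITD is zero; it grows to a few hundred microseconds by 30 degrees, well inside the < 2 kHz regime where it dominates localization.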
      
Background of RTVAS System
- Developed by Jacob Scarpaci (2006).
- Uses a real-time kernel in Linux to update HRTF filters.
- Key to the system: the HRTF convolved with the input signal corresponds to the difference between where the sound should be and where the subject's head is pointing.
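A minimal sketch of that update rule (the names and block structure are our own, not Scarpaci's implementation): each processing block is filtered with the HRIR pair indexed by the source direction minus the tracked head direction.

```python
import numpy as np

def render_block(x, target_az, head_az, hrir_bank, step_deg=1):
    """Filter one block of input with the HRIR pair for the *relative*
    direction (where the sound should be minus where the head points).
    hrir_bank maps an integer azimuth index to (left, right) impulse
    responses; a real system would also crossfade between updates."""
    rel_az = target_az - head_az
    idx = int(round(rel_az / step_deg))   # nearest measured direction
    h_left, h_right = hrir_bank[idx]
    return np.convolve(x, h_left), np.convolve(x, h_right)
```

Because the filter index depends only on the relative angle, a head turn and an equal source movement are handled by the same mechanism.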
Project Motivation/Aims
- Goal is to validate that the Real-Time Virtual Auditory System (Scarpaci, 2006) correctly updates HRTFs in accordance with head location relative to sound location.
- Approach to validation:
  - Compare acoustic waveforms measured on KEMAR when sound is presented over headphones to those presented over loudspeakers (mathematical, signals approach).
  - Perform a behavioral task in which subjects track a dynamic sound played over headphones or loudspeakers (perceptual approach).
Methods: Real Presentation - Panning
- Loudspeaker setup creates a virtual speaker (shown as dashed outline) by interpolating between two speakers located symmetrically about 0 degrees azimuth.
- Nonlinear panning law (Leakey, 1959), for speakers at +/- theta_pos and a phantom source at theta:
  CH1 = 1/2 - sin(theta) / (2 sin(theta_pos))
  CH2 = 1/2 + sin(theta) / (2 sin(theta_pos))
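The sine law above can be sketched directly (`sine_pan_gains` is a hypothetical helper name; theta_pos is the half-angle between the speaker pair):

```python
import math

def sine_pan_gains(theta_deg, theta_pos_deg):
    """Leakey (1959) sine-law gains for a phantom source at theta between
    two loudspeakers at +/- theta_pos about 0 degrees azimuth.
    Returns the (CH1, CH2) channel gains."""
    s = math.sin(math.radians(theta_deg)) / (2.0 * math.sin(math.radians(theta_pos_deg)))
    return 0.5 - s, 0.5 + s
```

A centered source (theta = 0) gets equal gains of 1/2; at theta = +theta_pos the full signal goes to CH2, matching the physical speaker position.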
HRTF Measurement
- Empirical KEMAR:
  - 17th-order MLS used to measure the HRTF at every degree from -90 to 90 degrees.
- All measurements were windowed to 226 coefficients using a modified Hanning window to remove reverberations.
- Minimum-phase plus linear-phase interpolation:
  - Interpolated from empirical measurements taken every 5 degrees.
  - Magnitude function derived as a linearly weighted average of the log-magnitude functions from the empirical measurements.
  - Minimum-phase function derived from the magnitude function.
  - Linear-phase component added corresponding to the ITD calculated for that position.
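The interpolation steps above can be sketched as follows. This is a simplified sketch, assuming full-length linear magnitude spectra at the two neighboring measured angles; the real-cepstrum route to minimum phase is a standard construction and not necessarily the thesis code:

```python
import numpy as np

def interp_hrir(mag_a, mag_b, w, itd_samples):
    """Minimum-phase-plus-linear-phase interpolation sketch:
    1) linearly weight the log magnitudes of the two measurements,
    2) build the minimum-phase impulse response via the real cepstrum,
    3) add a linear-phase delay equal to the ITD for the target angle."""
    n = len(mag_a)
    log_mag = (1.0 - w) * np.log(mag_a) + w * np.log(mag_b)
    cep = np.fft.ifft(log_mag).real           # real cepstrum of target magnitude
    fold = np.zeros(n)
    fold[0] = cep[0]
    fold[1:n // 2] = 2.0 * cep[1:n // 2]      # fold anticausal part onto causal
    fold[n // 2] = cep[n // 2]
    h_min = np.fft.ifft(np.exp(np.fft.fft(fold))).real
    return np.roll(h_min, itd_samples)        # linear phase = pure delay
```

For a flat magnitude the result collapses to a pure delay of `itd_samples`, which is exactly the intended split: magnitude/ILD in the minimum-phase part, ITD in the linear-phase part.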


      
Acoustic Waveform Comparison: Static Sound/Static Head Methods
- Presented either a speech or a noise waveform at three static locations: 5, 23, and -23 degrees.
- During the free-field presentation, the positions were created from the loudspeakers using the panning technique outlined previously.
- Used 4 different KEMAR HRTF sets in the virtual presentation:
  - Empirical, Min-Phase Interpolated, Empirical + Headphone TF, Min-Phase + Headphone TF
- Recorded sounds on KEMAR with microphones located at the position corresponding to the human eardrum.
Static Sound/Static Head: Analysis
- Correlated the waveforms recorded over loudspeakers with the waveforms recorded over headphones for a given set of HRTFs.
  - Correlated time, magnitude, and phase functions.
  - Allowed a maximum delay of 4 ms in time to account for transmission delays.
- Broke signals into third-octave bands with the following center frequencies (Hz):
  [200 250 315 400 500 630 800 1000 1250 1600 2000 2500 3150 4000 5000 6300 8000 10000]
  - Correlated time, magnitude, and phase within each band and calculated the delay (lag) that had to be imposed on one signal to achieve maximum correlation.
  - Looked at differences in binaural cues within each band.
Across Time/Frequency Correlations of Static Noise
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Difference in ITDs from Free-Field and Headphones for Static Noise
Difference in ILDs from Free-Field and Headphones for Static Noise
Dynamic Sound/Static Head: Methods
- Presented a speech or noise waveform either over loudspeakers (using panning) or over headphones (using the convolution algorithm).
- Sound was presented moving from 0 to 30 degrees.
- Used the same 4 HRTF sets.
Across Time/Frequency Correlation of Dynamic Noise
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Difference in ITDs from Free-Field and Headphones for Dynamic Noise
Difference in ILDs from Free-Field and Headphones for Dynamic Noise
Static Sound/Dynamic Head: Methods
- A speech or noise waveform was presented over loudspeakers or headphones at a fixed position, 30 degrees.
- 4 HRTF sets were used.
- KEMAR was moved from the 30-degree to the 0-degree position while the sound played.
- Head position was monitored using an Intersense® IS900 VWT head tracker.
Static Sound/Dynamic Head: Analysis
- The same data analysis was performed as in the previous two cases.
- Only tracks that followed the same trajectory were correlated.
  - Acceptance criterion: less than a 1- to 1.5-degree difference between the tracks.
Across Time/Frequency Correlation for Dynamic Head/Static Noise
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Difference in ITDs from Free-Field and Headphones for Static Noise/Dynamic Head
Difference in ILDs from Free-Field and Headphones for Static Noise/Dynamic Head
Waveform Comparison Discussion
- Interaural cues match up very well across the different conditions as well as between loudspeakers and headphones.
  - This follows from the high correlations in the magnitude and phase functions.
- Differences (correlation) in the waveforms may not matter perceptually if the listener receives the same binaural cues.
- The output algorithm in the RTVAS appears to present correctly directed sounds and to adjust correctly to head movement.
Psychophysical Experiment: Details
- 6 normal-hearing subjects (4 male, 2 female).
- Sound was presented over headphones or loudspeakers.
- Task was to track a moving sound source with the head.
- HRTFs tested: Empirical KEMAR, Minimum-Phase KEMAR, Individual (interpolated using minimum phase).
Psychophysical Experiment: Details cont.
- Sound details:
  - White noise
    - Frequency content: 200 Hz to 10 kHz
    - Presented at 65 dB SPL
    - 5 seconds in duration
- Track details:
  - 15*(sin((2*pi/5)*t) + sin((2*pi/2)*t*rand))
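Under our reading that `rand` is a single random scalar drawn per trial (an assumption; the thesis may draw it differently), the track can be generated as:

```python
import numpy as np

def make_track(duration=5.0, fs=100, rng=None):
    """Generate one target trajectory (degrees azimuth over time) from
    15*(sin((2*pi/5)*t) + sin((2*pi/2)*t*rand)): a fixed 0.2 Hz component
    plus a second component whose frequency is randomized per trial,
    varying the track's complexity. fs is a hypothetical control rate."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(0.0, duration, 1.0 / fs)
    r = rng.random()   # one random scalar per trial (our assumption)
    return 15.0 * (np.sin((2 * np.pi / 5) * t) + np.sin((2 * np.pi / 2) * t * r))
```

Each term has amplitude 15, so the track stays within +/- 30 degrees, matching the azimuth range used elsewhere in the experiments.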
      
Psychophysical Experiment: Virtual Setup
- Head-movement training - subjects just moved the head (no sound):
  - 5 repetitions in which the subjects' task was to put a square (representing the head) inside another box.
  - Also centers the head.
- Training - all using Empirical KEMAR:
  - 10 trials in which the subject was shown, via plot, the path of the sound before it played.
  - 10 trials with the same track as before but no visual cue.
  - 10 trials in which the subject was shown the path via plot, but the path was random from trial to trial.
  - 10 trials with random tracks and no visualization.
Psychophysical Experiment: Setup cont.
- Experiment (headphones):
  - 10 trials using Empirical KEMAR HRTFs
  - 10 trials using Minimum-Phase KEMAR HRTFs
  - 10 trials using Individual HRTFs
  - Repeated 3 times
- Loudspeaker training:
  - Same as headphones, but reduced to 5 trials.
- Loudspeaker experiment:
  - 30 trials, repeated only once.
  - Subjects were instructed to press a button as soon as they heard the sound; this started the head tracking.
Individual Tracking Results
Individual RMS/RMS Error
Individual Response to Complexity of Tracks
Overall Coherence in Performance
Overall Latency in Tracking
RMS/RMS Error of Tracking
Complexity of Track Analysis
Deeper Look into Individual HRTF Case
Psychophysical Experiment: Discussion
- Coherence:
  - The coherence (correlation) measure for the empirical and minimum-phase interpolation cases is not statistically different from that over loudspeakers.
  - Coherence for individual HRTFs was surprisingly worse.
  - Coherence also stays strong as the complexity of the track varies.
- Latency:
  - Individual HRTFs show more variability in latency.
    - Subjects might be able to track changes more quickly using their own HRTFs.
  - Loudspeaker latency is negative, which means that subjects are predicting the path.
    - Could be because the sound always moves to the right first, as well as a result of the delay in pressing the button.
Psychophysical Experiment: Discussion Cont.
- RMS:
  - No significant difference in total RMS error or RMS undershoot error between the Empirical and Minimum-Phase HRTFs and loudspeakers.
  - Subjects generally undershoot the path of the sound.
    - Could be a motor problem (e.g., laziness) as well as perceptual.
Overall Conclusions
- Coherence of acoustic recordings may not be the best measure for validation.
  - Affected by reverberation and the panning technique.
- If perception is the only thing that matters, then we have to conclude that the algorithm works.
Future Work
- Look at different methods for presenting dynamic sound over loudspeakers.
- Try different room environments.
- Take a closer look at differences between headphones.
  - Particularly open-canal tube-phones, to see whether subjects can distinguish between real and virtual sources.
- Run various psychophysical experiments involving dynamic sound (speech, masking):
  - Sound localization
  - Source separation
Acknowledgements
- Committee:
  - Steven Colburn
  - Barb Shinn-Cunningham
  - Nathaniel Durlach
- Binaural Gang:
  - Todd Jennings
  - Le Wang
  - Tim Streeter
  - Varun Parmar
  - Akshay Navaladi
  - Antje Ihlefeld
- Other:
  - Dave Freedman
  - Jake Scarpaci
  - My Subjects
  - All in Attendance
THANK YOU
Backup Slides
Methods: Real Presentation Continued
- Input stimulus was a 17th-order MLS sequence sampled at 50 kHz.
  - Corresponds to a duration of ~2.6 sec.
- Waveforms were recorded on KEMAR (Knowles Electronic Manikin for Acoustic Research).

Title: Speaker Presentation
Source Speaker (deg)   Created Positions (deg)
10                     -5, 0, 5
15                     -10, 0, 10
30                     -20, -10, 0, 10, 20
45                     -40, -30, -10, 0, 10, 30, 40
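A 17th-order MLS can be generated with a linear-feedback shift register. A sketch follows; the recurrence a[n] = a[n-14] XOR a[n-17] (the primitive trinomial x^17 + x^14 + 1) is a known maximal-length choice for order 17, not necessarily the generator used in the thesis:

```python
def mls17():
    """Generate one full period (2**17 - 1 = 131071 samples) of a
    maximum-length sequence with a Fibonacci LFSR. At 50 kHz this
    period lasts ~2.62 s, matching the duration quoted above."""
    state = [1] * 17                 # any nonzero seed works
    out = []
    for _ in range(2 ** 17 - 1):
        fb = state[13] ^ state[16]   # taps at delays 14 and 17
        out.append(state[16])        # oldest bit is the output
        state = [fb] + state[:16]
    return out
```

Over one full period the sequence visits every nonzero register state once, so it contains exactly 2^16 ones, which is what gives the MLS its flat spectrum for impulse-response measurement.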
Results: Real Presentation
- HRTFs measured when sound was presented over loudspeakers using the linear and nonlinear interpolation functions (figure panels: Linear, Nonlinear).
Results: Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers

Title: Correlation Coefficients (between a virtual point source and a real source)
Speaker    Virtual     Linear Function        Non-linear Function
Location   Position    Left      Right        Left      Right
45         -40         0.98799   0.9758       0.98655   0.97769
45         -30         0.97427   0.96611      0.97534   0.96777
45         -10         0.96842   0.94612      0.96858   0.9466
45           0         0.95736   0.91602      0.95693   0.91709
45          10         0.96374   0.95282      0.96384   0.95276
45          30         0.97532   0.97095      0.97644   0.97084
45          40         0.98397   0.98194      0.98268   0.98177
30         -20         0.98372   0.97316      0.98385   0.97357
30         -10         0.98054   0.9564       0.98054   0.95649
30           0         0.97184   0.93755      0.97171   0.93774
30          10         0.97151   0.96414      0.97147   0.96448
30          20         0.97844   0.97768      0.97883   0.97762
15         -10         0.993     0.97775      0.99301   0.97787
15           0         0.97821   0.95517      0.97817   0.95503
15          10         0.98406   0.98576      0.98412   0.98572
10          -5         0.99326   0.97585      0.99328   0.97601
10           0         0.98927   0.96086      0.98924   0.96077
10           5         0.99319   0.98977      0.99312   0.98977

- Very strong correlation, generally, at all spatial locations.
- Weaker correlation as the speakers become more spatially separated.
- Weakest correlation when the created sound is furthest from both speakers (0 degrees).
Spatial Separation of Loudspeakers
- Correlation coefficients for a virtually created sound source at -10 degrees at various spatial separations of the loudspeakers.
- Correlation declines as the loudspeakers become more spatially separated.
Example of Pseudo-Anechoic HRTFs
- Correlation coefficients are slightly better when reverberations are removed from the impulse responses:
  - Linear Reverberant: 0.98054, 0.9564 (Left, Right ears)
  - Linear Pseudo-Anechoic: 0.98545, 0.96019 (Left, Right ears)
  - Nonlinear Reverberant: 0.98054, 0.95649 (Left, Right ears)
  - Nonlinear Pseudo-Anechoic: 0.9855, 0.96007 (Left, Right ears)
Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers (Pseudo-Anechoic)

Table 3. Correlation Coefficients for Pseudo-Anechoic HRTFs
Speaker    Virtual     Linear Function        Non-linear Function
Location   Position    Left      Right        Left      Right
45         -40         0.96567   0.99168      0.96416   0.98421
45         -30         0.96223   0.95356      0.96138   0.95815
45         -10         0.96348   0.93433      0.96299   0.93902
45           0         0.95471   0.89491      0.95436   0.89968
45          10         0.95856   0.93652      0.95913   0.93953
45          30         0.97678   0.945        0.97825   0.94013
45          40         0.99563   0.9814       0.99      0.98018
30         -20         0.98762   0.97555      0.98767   0.97663
30         -10         0.98545   0.96019      0.9855    0.96007
30           0         0.97281   0.93616      0.97284   0.93623
30          10         0.97927   0.96945      0.97912   0.96968
30          20         0.97904   0.98188      0.97846   0.98183
15         -10         0.99608   0.98114      0.99592   0.98167
15           0         0.97891   0.95475      0.9788    0.95461
15          10         0.9928    0.98922      0.99287   0.9892
10          -5         0.99738   0.98141      0.99736   0.98162
10           0         0.99329   0.96323      0.99333   0.9632
10           5         0.99731   0.9946       0.99736   0.99462

- Correlations are generally better when the reverberant energy is removed from the impulse responses.
HRTF Window Function
HRTF Magnitude Comparison
Headphone Transfer Function

The Genesis of BriansClub.cm Famous Dark WEb Platform
 
Pitch Deck Teardown: Kinnect's $250k Angel deck
Pitch Deck Teardown: Kinnect's $250k Angel deckPitch Deck Teardown: Kinnect's $250k Angel deck
Pitch Deck Teardown: Kinnect's $250k Angel deck
 
Industrial Tech SW: Category Renewal and Creation
Industrial Tech SW:  Category Renewal and CreationIndustrial Tech SW:  Category Renewal and Creation
Industrial Tech SW: Category Renewal and Creation
 
4 Benefits of Partnering with an OnlyFans Agency for Content Creators.pdf
4 Benefits of Partnering with an OnlyFans Agency for Content Creators.pdf4 Benefits of Partnering with an OnlyFans Agency for Content Creators.pdf
4 Benefits of Partnering with an OnlyFans Agency for Content Creators.pdf
 
list of states and organizations .pdf
list of  states  and  organizations .pdflist of  states  and  organizations .pdf
list of states and organizations .pdf
 
The Most Inspiring Entrepreneurs to Follow in 2024.pdf
The Most Inspiring Entrepreneurs to Follow in 2024.pdfThe Most Inspiring Entrepreneurs to Follow in 2024.pdf
The Most Inspiring Entrepreneurs to Follow in 2024.pdf
 
Digital Transformation Frameworks: Driving Digital Excellence
Digital Transformation Frameworks: Driving Digital ExcellenceDigital Transformation Frameworks: Driving Digital Excellence
Digital Transformation Frameworks: Driving Digital Excellence
 
Call8328958814 satta matka Kalyan result satta guessing
Call8328958814 satta matka Kalyan result satta guessingCall8328958814 satta matka Kalyan result satta guessing
Call8328958814 satta matka Kalyan result satta guessing
 
Income Tax exemption for Start up : Section 80 IAC
Income Tax  exemption for Start up : Section 80 IACIncome Tax  exemption for Start up : Section 80 IAC
Income Tax exemption for Start up : Section 80 IAC
 
一比一原版新西兰奥塔哥大学毕业证(otago毕业证)如何办理
一比一原版新西兰奥塔哥大学毕业证(otago毕业证)如何办理一比一原版新西兰奥塔哥大学毕业证(otago毕业证)如何办理
一比一原版新西兰奥塔哥大学毕业证(otago毕业证)如何办理
 
Cover Story - China's Investment Leader - Dr. Alyce SU
Cover Story - China's Investment Leader - Dr. Alyce SUCover Story - China's Investment Leader - Dr. Alyce SU
Cover Story - China's Investment Leader - Dr. Alyce SU
 
NIMA2024 | De toegevoegde waarde van DEI en ESG in campagnes | Nathalie Lam |...
NIMA2024 | De toegevoegde waarde van DEI en ESG in campagnes | Nathalie Lam |...NIMA2024 | De toegevoegde waarde van DEI en ESG in campagnes | Nathalie Lam |...
NIMA2024 | De toegevoegde waarde van DEI en ESG in campagnes | Nathalie Lam |...
 
Chapter 7 Final business management sciences .ppt
Chapter 7 Final business management sciences .pptChapter 7 Final business management sciences .ppt
Chapter 7 Final business management sciences .ppt
 
DearbornMusic-KatherineJasperFullSailUni
DearbornMusic-KatherineJasperFullSailUniDearbornMusic-KatherineJasperFullSailUni
DearbornMusic-KatherineJasperFullSailUni
 
How HR Search Helps in Company Success.pdf
How HR Search Helps in Company Success.pdfHow HR Search Helps in Company Success.pdf
How HR Search Helps in Company Success.pdf
 
Anny Serafina Love - Letter of Recommendation by Kellen Harkins, MS.
Anny Serafina Love - Letter of Recommendation by Kellen Harkins, MS.Anny Serafina Love - Letter of Recommendation by Kellen Harkins, MS.
Anny Serafina Love - Letter of Recommendation by Kellen Harkins, MS.
 
Digital Marketing with a Focus on Sustainability
Digital Marketing with a Focus on SustainabilityDigital Marketing with a Focus on Sustainability
Digital Marketing with a Focus on Sustainability
 

Thesis Defense Presentation

  • 6. Background of RTVAS System
    - Developed by Jacob Scarpaci (2006)
    - Uses a real-time kernel in Linux to update HRTF filters
    - Key to the system: the HRTF convolved with the input signal corresponds to the difference between where the sound should be and where the subject's head is positioned.
  • 7. Project Motivation/Aims
    - Goal is to validate that the Real-Time Virtual Auditory System, developed by Jacob Scarpaci (2006), correctly updates HRTFs in accordance with head location relative to sound location.
    - Approach to validation:
      - Compare acoustic waveforms measured on KEMAR when sound is presented over headphones to those presented over loudspeakers (a mathematical, signals approach).
      - Perform a behavioral task in which subjects track a dynamic sound played over headphones or loudspeakers (a perceptual approach).
  • 8. Methods: Real Presentation - Panning
    - Loudspeaker setup to create a virtual speaker (shown as dashed outline) by nonlinear interpolation (Leakey, 1959) between two speakers located symmetrically about 0 degrees azimuth:
      CH1 = 1/2 - sin(θ) / (2·sin(θ_pos))
      CH2 = 1/2 + sin(θ) / (2·sin(θ_pos))
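The sine panning law above can be sketched in a few lines of Python; the function name, the degree units, and the example speaker half-angle of 30 degrees are illustrative, not taken from the system itself:

```python
import math

def leakey_gains(theta_deg, theta_pos_deg):
    """Channel gains for a phantom source at azimuth theta_deg, created by
    two loudspeakers placed symmetrically at +/- theta_pos_deg."""
    ratio = math.sin(math.radians(theta_deg)) / (2 * math.sin(math.radians(theta_pos_deg)))
    return 0.5 - ratio, 0.5 + ratio  # (CH1, CH2)

# A source at 0 degrees azimuth splits equally between the two speakers:
print(leakey_gains(0, 30))  # -> (0.5, 0.5)
```

Note that the two gains always sum to 1, so the phantom source keeps a roughly constant level as it pans between the speakers.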
  • 9. HRTF Measurement
    - Empirical KEMAR
      - 17th-order MLS used to measure the HRTF at every degree from -90 to 90 degrees.
      - All measurements were windowed to 226 coefficients using a modified Hanning window to remove reverberations.
    - Minimum-Phase plus Linear-Phase Interpolation
      - Interpolated from empirical measurements taken every 5 degrees.
      - Magnitude function was derived using a linearly weighted average of the log-magnitude functions from the empirical measurements.
      - Minimum-phase function was derived from the magnitude function.
      - Linear-phase component was added corresponding to the ITD calculated for that position.
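One way to realize the minimum-phase-plus-ITD construction above is via the real cepstrum; the sketch below is an illustrative NumPy reconstruction (the function names and the circular-shift approximation of the ITD delay are mine, not the system's actual code):

```python
import numpy as np

def min_phase_from_magnitude(mag):
    """Minimum-phase impulse response whose magnitude spectrum is `mag`
    (a full-length, strictly positive, even-length FFT magnitude)."""
    n = len(mag)
    cep = np.fft.ifft(np.log(mag)).real      # real cepstrum of log magnitude
    fold = np.zeros(n)                       # fold onto the causal part
    fold[0] = cep[0]
    fold[1:n // 2] = 2 * cep[1:n // 2]
    fold[n // 2] = cep[n // 2]
    return np.fft.ifft(np.exp(np.fft.fft(fold))).real

def interpolated_hrir(mag_a, mag_b, w, itd_samples):
    """Weighted average of two log-magnitude responses, converted to a
    minimum-phase HRIR and delayed by the target position's ITD."""
    log_mag = (1 - w) * np.log(mag_a) + w * np.log(mag_b)
    h = min_phase_from_magnitude(np.exp(log_mag))
    return np.roll(h, itd_samples)           # linear-phase (pure delay) part
```

In the actual procedure the interpolation is between the two empirical measurements bracketing the target angle and the ITD is computed for that angle; here `w` and `itd_samples` are simply supplied by the caller.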
  • 10. Acoustic Waveform Comparison: Static Sound/Static Head Methods
    - Presented either a speech or a noise waveform at three static locations: 5, 23, and -23 degrees.
    - During the free-field presentation the positions were created by using the panning technique (outlined previously) from speakers.
    - Used 4 different KEMAR HRTF sets in the virtual presentation: Empirical, Min-Phase Interpolated, Empirical + Headphone TF, Min-Phase + Headphone TF.
    - Recorded sounds on KEMAR with microphones located at the position corresponding to the human eardrum.
  • 11. Static Sound/Static Head: Analysis
    - Correlated the waveforms recorded over loudspeakers with the waveforms recorded over headphones for a given set of HRTFs.
    - Correlated time, magnitude, and phase functions, allowing a maximum delay of 4 ms in time to account for transmission delays.
    - Broke signals into third-octave bands with the following center frequencies (Hz): [200 250 315 400 500 630 800 1000 1250 1600 2000 2500 3150 4000 5000 6300 8000 10000]
      - Correlated time, magnitude, and phase within each band and calculated the delay (lag) needed to impose on one signal to achieve maximum correlation.
    - Looked at differences in binaural cues within each band.
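The per-band lag-and-correlate step can be sketched as follows; the FFT-mask bandpass and the function names are illustrative simplifications of whatever filtering the analysis actually used:

```python
import numpy as np

def third_octave_filter(x, fs, f_center):
    """Keep only FFT bins inside the third-octave band around f_center."""
    lo, hi = f_center * 2 ** (-1 / 6), f_center * 2 ** (1 / 6)
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), 1 / fs)
    spec[(freqs < lo) | (freqs > hi)] = 0
    return np.fft.irfft(spec, len(x))

def band_lag_corr(x, y, fs, f_center, max_lag_s=0.004):
    """Best normalized correlation between x and y within one third-octave
    band over lags up to +/- max_lag_s, and the lag achieving it."""
    xb = third_octave_filter(x, fs, f_center)
    yb = third_octave_filter(y, fs, f_center)
    max_lag = int(max_lag_s * fs)
    best_r, best_lag = -1.0, 0
    for lag in range(-max_lag, max_lag + 1):
        r = np.corrcoef(xb, np.roll(yb, lag))[0, 1]
        if r > best_r:
            best_r, best_lag = r, lag
    return best_r, best_lag
```

At a 50 kHz sampling rate the 4 ms window corresponds to a search over ±200 samples.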
  • 13. Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
  • 14. Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
  • 15. Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
  • 16. Difference in ITDs from Free-Field and Headphones for Static Noise
  • 17. Difference in ILDs from Free-Field and Headphones for Static Noise
  • 18. Dynamic Sound/Static Head: Methods
    - Presented a speech or a noise waveform over either loudspeakers or headphones using the panning or convolution algorithm.
    - Sound was presented from 0 to 30 degrees.
    - Used the same 4 HRTF sets.
  • 20. Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
  • 21. Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
  • 22. Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
  • 23. Difference in ITDs from Free-Field and Headphones for Dynamic Noise
  • 24. Difference in ILDs from Free-Field and Headphones for Dynamic Noise
  • 25. Static Sound/Dynamic Head: Methods
    - Speech or noise waveform was presented over loudspeakers or headphones at a fixed position, 30 degrees.
    - 4 HRTF sets were used.
    - KEMAR was moved from the 30-degree to the 0-degree position while sound was presented.
    - Head position was monitored using an Intersense® IS-900 VWT head tracker.
  • 26. Static Sound/Dynamic Head: Analysis
    - Similar data analysis was performed in this case as in the previous two cases.
    - Only tracks that followed the same trajectory were correlated.
      - Acceptance criterion was less than a 1 or 1.5 degree difference between the tracks.
  • 27. Across Time/Frequency Correlation for Dynamic Head/Static Noise
  • 28. Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
  • 29. Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
  • 30. Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
  • 31. Difference in ITDs from Free-Field and Headphones for Static Noise/Dynamic Head
  • 32. Difference in ILDs from Free-Field and Headphones for Static Noise/Dynamic Head.
  • 33. Waveform Comparison Discussion
    - Interaural cues match up very well across the different conditions as well as between loudspeakers and headphones.
      - This results from the higher correlations in the magnitude and phase functions.
    - Differences (correlation) in waveforms may not matter perceptually if the listener receives the same binaural cues.
    - The output algorithm in the RTVAS appears to present correctly oriented directional sounds and to adjust correctly to head movement.
  • 34. Psychophysical Experiment: Details
    - 6 normal-hearing subjects (4 male, 2 female).
    - Sound was presented over headphones or loudspeakers.
    - Task was to track, using their head, a moving sound source.
    - HRTFs tested: Empirical KEMAR, Minimum-Phase KEMAR, Individual (interpolated using minimum phase).
  • 35. Psychophysical Experiment: Details cont.
    - Sound details: white noise, frequency content 200 Hz to 10 kHz, presented at 65 dB SPL, 5 seconds in duration.
    - Track details: 15·(sin((2π/5)t) + sin((2π/2)·t·rand))
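One reading of the track formula above, assuming `rand` is a per-trial random draw that scales the rate of the second sinusoid (the sample count and the example value of `rand` are illustrative):

```python
import numpy as np

def target_track(t, rand):
    """Target azimuth (degrees) at time t: a slow 1/5-Hz sine plus a second
    sine whose frequency is scaled by the per-trial draw `rand`."""
    return 15 * (np.sin((2 * np.pi / 5) * t) + np.sin((2 * np.pi / 2) * t * rand))

t = np.linspace(0, 5, 5001)           # the 5-second presentation
track = target_track(t, rand=0.7)     # one hypothetical trial
```

With both sinusoids bounded by 1, the target azimuth never exceeds ±30 degrees.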
  • 36. Psychophysical Experiment: Virtual Setup
    - Head-movement training (no sound): 5 repetitions in which the subjects' task was to put the square (representing the head) into another box; this also centers the head.
    - Training (all using Empirical KEMAR):
      - 10 trials where the subject was shown, via plot, the path of the sound before it played.
      - 10 trials where the same track as before was presented but no visual cue was available.
      - 10 trials where the subject was shown the path via plot, but the path was random from trial to trial.
      - 10 trials where tracks were random and no visualization was given.
  • 37. Psychophysical Experiment: Setup cont.
    - Experiment (headphones): 10 trials using Empirical KEMAR HRTFs, 10 trials using Minimum-Phase KEMAR HRTFs, 10 trials using Individual HRTFs; repeated 3 times.
    - Loudspeaker training: same as headphones, but trials were reduced to 5.
    - Loudspeaker experiment: 30 trials, repeated only once.
    - Subjects were instructed to press a button as soon as they heard the sound; this started the head tracking.
  • 40. Individual Response to Complexity of Tracks
  • 41. Overall Coherence in Performance
  • 42. Overall Latency in Tracking
  • 43. RMS/RMS Error of Tracking
  • 45. Deeper Look into Individual HRTF Case
  • 46. Psychophysical Experiment: Discussion
    - Coherence
      - The coherence (correlation) measure for the empirical and minimum-phase interpolation cases is not statistically different from that over loudspeakers.
      - Coherence with individual HRTFs was surprisingly worse.
      - Coherence also stays strong as the complexity of the track varies.
    - Latency
      - Individual HRTFs show more variability in latency; subjects might be able to track changes more quickly using their own HRTFs.
      - Loudspeaker latency is negative, which suggests subjects are predicting the path. This could be because the sound always goes to the right first, as well as a result of the delay in pressing the button.
  • 47. Psychophysical Experiment: Discussion Cont.
    - RMS
      - No significant difference in total RMS error or RMS undershoot error between the Empirical and Minimum-Phase HRTFs and the loudspeakers.
      - Subjects generally undershoot the path of the sound; this could be a motor issue (e.g., laziness) as well as a perceptual one.
  • 48. Overall Conclusions
    - Coherence of acoustic recordings may not be the best measure for validation (reverberation or panning techniques).
    - If perception is the only thing that matters, then we have to conclude that the algorithm works.
  • 49. Future Work
    - Look at different methods for presenting dynamic sound over loudspeakers; try different room environments.
    - Closer look at differences between headphones, particularly open-canal tube-phones, to see if subjects can distinguish between real and virtual sources.
    - Various psychophysical experiments that involve dynamic sound (speech, masking): sound localization, source separation.
  • 50. Acknowledgements
    - Committee: Steven Colburn, Barb Shinn-Cunningham, Nathaniel Durlach
    - Others: Dave Freedman, Jake Scarpaci
    - My subjects
    - Binaural Gang: Todd Jennings, Le Wang, Tim Streeter, Varun Parmar, Akshay Navaladi, Antje Ihlefeld
    - All in attendance
  • 53. Methods: Real Presentation Continued
    - Input stimulus was a 17th-order MLS sequence sampled at 50 kHz, corresponding to a duration of ~2.6 s.
    - Waveforms were recorded on KEMAR (Knowles Electronic Manikin for Acoustic Research).
    - Speaker presentation order (speaker location → virtual source positions, in degrees):
      10: -5, 0, 5
      15: -10, 0, 10
      30: -20, -10, 0, 10, 20
      45: -40, -30, -10, 0, 10, 30, 40
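A quick check of the stated duration (an order-n maximum-length sequence contains 2^n − 1 samples):

```python
# An order-17 MLS has 2**17 - 1 samples; at 50 kHz that lasts about 2.62 s,
# consistent with the ~2.6 s quoted above.
order, fs = 17, 50_000
n_samples = 2 ** order - 1
duration = n_samples / fs
print(n_samples, round(duration, 2))  # 131071 2.62
```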
  • 54. Results: Real Presentation
    - HRTFs measured when sound was presented over loudspeakers using the linear and nonlinear interpolation functions (Linear vs. Nonlinear).
  • 55. Results: Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers
    - Correlation between a virtual point source and a real source:

      Speaker   Virtual    Linear Function       Non-linear Function
      Location  Position   Left      Right       Left      Right
      45        -40        0.98799   0.9758      0.98655   0.97769
                -30        0.97427   0.96611     0.97534   0.96777
                -10        0.96842   0.94612     0.96858   0.9466
                 0         0.95736   0.91602     0.95693   0.91709
                 10        0.96374   0.95282     0.96384   0.95276
                 30        0.97532   0.97095     0.97644   0.97084
                 40        0.98397   0.98194     0.98268   0.98177
      30        -20        0.98372   0.97316     0.98385   0.97357
                -10        0.98054   0.9564      0.98054   0.95649
                 0         0.97184   0.93755     0.97171   0.93774
                 10        0.97151   0.96414     0.97147   0.96448
                 20        0.97844   0.97768     0.97883   0.97762
      15        -10        0.993     0.97775     0.99301   0.97787
                 0         0.97821   0.95517     0.97817   0.95503
                 10        0.98406   0.98576     0.98412   0.98572
      10        -5         0.99326   0.97585     0.99328   0.97601
                 0         0.98927   0.96086     0.98924   0.96077
                 5         0.99319   0.98977     0.99312   0.98977

    - Very strong correlation, generally, for all spatial locations.
    - Weaker correlation as the speakers become more spatially separated.
    - Weakest correlation when the created sound is furthest from both speakers (0 degrees).
  • 56. Spatial Separation of Loudspeakers
    - Correlation coefficients for a virtually created sound source at -10 degrees at various spatial separations of the loudspeakers.
    - Correlation declines as the loudspeakers become more spatially separated.
  • 57. Example of Pseudo-Anechoic HRTFs
    - Correlation coefficients are slightly better when reverberations are taken out of the impulse responses:
      - Linear, reverberant: 0.98054, 0.9564 (left, right ears)
      - Linear, pseudo-anechoic: 0.98545, 0.96019 (left, right ears)
      - Nonlinear, reverberant: 0.98054, 0.95649 (left, right ears)
      - Nonlinear, pseudo-anechoic: 0.9855, 0.96007 (left, right ears)
  • 58. Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers (Pseudo-Anechoic)
    - Table 3. Correlation Coefficients for Pseudo-Anechoic HRTFs:

      Speaker   Virtual    Linear Function       Non-linear Function
      Location  Position   Left      Right       Left      Right
      45        -40        0.96567   0.99168     0.96416   0.98421
                -30        0.96223   0.95356     0.96138   0.95815
                -10        0.96348   0.93433     0.96299   0.93902
                 0         0.95471   0.89491     0.95436   0.89968
                 10        0.95856   0.93652     0.95913   0.93953
                 30        0.97678   0.945       0.97825   0.94013
                 40        0.99563   0.9814      0.99      0.98018
      30        -20        0.98762   0.97555     0.98767   0.97663
                -10        0.98545   0.96019     0.9855    0.96007
                 0         0.97281   0.93616     0.97284   0.93623
                 10        0.97927   0.96945     0.97912   0.96968
                 20        0.97904   0.98188     0.97846   0.98183
      15        -10        0.99608   0.98114     0.99592   0.98167
                 0         0.97891   0.95475     0.9788    0.95461
                 10        0.9928    0.98922     0.99287   0.9892
      10        -5         0.99738   0.98141     0.99736   0.98162
                 0         0.99329   0.96323     0.99333   0.9632
                 5         0.99731   0.9946      0.99736   0.99462

    - Correlations generally are better when reverberant energy is removed from the impulse responses.