VALIDATION OF A REAL-TIME VIRTUAL AUDITORY SYSTEM FOR DYNAMIC SOUND STIMULI AND ITS APPLICATION TO SOUND LOCALIZATION

     Brett Rinehold
Outline
     Motivation
     Introduction
     Background
     Loudspeaker Presentation
     HRTF Interpolation
     Acoustic Waveform Comparison
          Static Sound Presentation
          Dynamic Sound Presentation
          Static Sound with a Dynamic Head Presentation
     Psychophysical Experiment
     Discussion
Motivation
     To validate a real-time system that updates head-related impulse responses
     Goal is to show that the acoustic waveforms measured on KEMAR match between real and virtual presentations
     Applications:
          Explore the effects of dynamic sound presentation on sound localization
Introduction: What is Real/Virtual Audio?
     Real audio consists of presenting sounds over loudspeakers
     Virtual audio consists of presenting acoustic waveforms over headphones
     Advantages of virtual audio:
          Cost-effective
          Portable
          Does not depend on room effects
     Disadvantages:
          Unrealistic
Introduction: Sound Localization
     Interaural Time Difference (ITD) – difference between sound arrival times at the two ears
          Predominant cue at low frequencies (< 2 kHz)
     Interaural Level Difference (ILD) – difference between sound levels at the two ears
          Predominant cue at higher frequencies (~> 2 kHz) due to head shadowing
     Both cues are encoded in the Head-Related Transfer Function (HRTF)
          ILD in the magnitude
          ITD in the phase
     (Both cues can also be estimated directly from ear-canal recordings; see the sketch below.)
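A minimal sketch of how these two cues can be estimated from a pair of ear recordings. The cross-correlation lag convention and the broadband RMS level measure are assumptions for illustration, not the analysis code used in the thesis.

```python
import numpy as np
from scipy.signal import correlate

def interaural_cues(left, right, fs):
    """Estimate ITD and ILD from left/right ear recordings sampled at fs (Hz)."""
    # ITD: lag (in samples) that maximizes the cross-correlation of the ear signals
    xc = correlate(left, right, mode="full")
    lag = np.argmax(xc) - (len(right) - 1)       # positive lag: left-ear signal lags the right
    itd = lag / fs                               # seconds
    # ILD: broadband level difference in dB
    rms = lambda x: np.sqrt(np.mean(np.square(x)))
    ild = 20 * np.log10(rms(left) / rms(right))  # dB; positive: left ear louder
    return itd, ild
```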
      
Background of RTVAS System
     Developed by Jacob Scarpaci (2006)
     Uses a real-time kernel in Linux to update HRTF filters
     Key to the system: the HRTF convolved with the input signal corresponds to the difference between where the sound should be and where the subject's head is pointing (see the sketch below).
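A minimal sketch of that relative-angle idea. The hrtf_bank dictionary, its azimuth keys, and the nearest-neighbour lookup are hypothetical stand-ins for illustration; the RTVAS itself performs this selection and convolution inside a real-time kernel.

```python
def relative_hrtf(hrtf_bank, source_az_deg, head_az_deg):
    """Pick the stored HRTF for the source direction relative to the head.

    hrtf_bank: hypothetical dict mapping azimuth in degrees -> (left_ir, right_ir).
    """
    rel = source_az_deg - head_az_deg        # where the sound should appear, head-relative
    rel = (rel + 180.0) % 360.0 - 180.0      # wrap into [-180, 180)
    nearest = min(hrtf_bank, key=lambda az: abs(az - rel))
    return hrtf_bank[nearest]
```

Each time the head tracker reports a new orientation, the filter pair returned here would be swapped in and convolved with the ongoing input signal.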
Project Motivation/Aims
     Goal is to validate that the Real-Time Virtual Auditory System, developed by Jacob Scarpaci (2006), correctly updates HRTFs in accordance with head location relative to sound location.
     Approach to validation:
          Compare acoustic waveforms measured on KEMAR when sound is presented over headphones to those measured when it is presented over loudspeakers.
               Mathematical, signals approach
          Perform a behavioral task where subjects track a dynamic sound played over headphones or loudspeakers.
               Perceptual approach
Methods: Real Presentation - Panning
     Loudspeaker setup to create a virtual speaker (shown as dashed outline) by interpolation between two speakers located symmetrically about 0 degrees azimuth.
     Nonlinear panning law (Leakey, 1959), where θ is the desired source azimuth and θ_pos the loudspeaker azimuth (see the sketch below):
          CH1 = 1/2 - sin(θ) / (2 sin(θ_pos))
          CH2 = 1/2 + sin(θ) / (2 sin(θ_pos))
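A short sketch of those two gain equations; the channel-to-speaker assignment and the example angles are assumptions for illustration.

```python
import numpy as np

def panning_gains(theta_deg, theta_pos_deg):
    """Gains for two loudspeakers at +/- theta_pos_deg that place a phantom
    source at theta_deg, using the nonlinear (sine) law after Leakey (1959)."""
    th, th_pos = np.radians(theta_deg), np.radians(theta_pos_deg)
    ch1 = 0.5 - np.sin(th) / (2.0 * np.sin(th_pos))
    ch2 = 0.5 + np.sin(th) / (2.0 * np.sin(th_pos))
    return ch1, ch2

# Example: phantom source at -10 degrees between speakers at +/-30 degrees
g1, g2 = panning_gains(-10.0, 30.0)
```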
HRTF Measurement
     Empirical KEMAR
          A 17th-order MLS was used to measure the HRTF at every degree from -90 to 90 degrees.
          All measurements were windowed to 226 coefficients using a modified Hanning window to remove reverberations.
     Minimum-Phase plus Linear-Phase Interpolation (see the sketch below)
          Interpolated from empirical measurements taken every 5 degrees.
          The magnitude function was derived using a linearly weighted average of the log-magnitude functions from the empirical measurements.
          The minimum-phase function was derived from that magnitude function.
          A linear-phase component was added corresponding to the ITD calculated for that position.
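A minimal sketch of that interpolation for one ear, assuming two neighbouring measured impulse responses h_a and h_b, an interpolation weight w, and an ITD already expressed in samples. The FFT size, the cepstral minimum-phase reconstruction, and the integer-sample delay are assumptions for illustration.

```python
import numpy as np

def interp_min_phase_hrir(h_a, h_b, w, itd_samples, nfft=512):
    """Interpolate between two measured HRIRs (weight w toward h_b)."""
    # 1. Linearly weighted average of the log-magnitude spectra
    A = np.abs(np.fft.rfft(h_a, nfft))
    B = np.abs(np.fft.rfft(h_b, nfft))
    log_mag = (1 - w) * np.log(A + 1e-12) + w * np.log(B + 1e-12)

    # 2. Minimum-phase impulse response rebuilt from the magnitude (real cepstrum)
    spec = np.exp(log_mag)
    full = np.concatenate([spec, spec[-2:0:-1]])      # full symmetric spectrum
    cep = np.fft.ifft(np.log(full)).real
    fold = np.zeros_like(cep)
    fold[0] = cep[0]
    fold[1:nfft // 2] = 2 * cep[1:nfft // 2]
    fold[nfft // 2] = cep[nfft // 2]
    h_min = np.fft.ifft(np.exp(np.fft.fft(fold))).real

    # 3. Linear-phase component: re-insert the ITD as a pure (integer-sample) delay
    return np.roll(h_min, int(round(itd_samples)))
```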


      
Acoustic Waveform Comparison: Static Sound/Static Head Methods
     Presented either a speech waveform or a noise waveform at three different static locations: 5, 23, and -23 degrees.
     During the free-field presentation the positions were created from the loudspeakers using the panning technique outlined previously.
     Used 4 different KEMAR HRTF sets in the virtual presentation:
          Empirical, Min-Phase Interp., Empirical Headphone TF, Min-Phase Headphone TF
     Recorded the sounds on KEMAR with microphones located at the position corresponding to the human eardrum.
Static Sound/Static Head: Analysis
     Correlated the waveforms recorded over loudspeakers with the waveforms recorded over headphones for a given set of HRTFs.
          Correlated the time, magnitude, and phase functions.
          Allowed a maximum delay of 4 ms in time to account for transmission delays.
     Broke the signals into third-octave bands with the following center frequencies (Hz):
          [200 250 315 400 500 630 800 1000 1250 1600 2000 2500 3150 4000 5000 6300 8000 10000]
          Correlated time, magnitude, and phase within each band and calculated the delay (lag) that had to be imposed on one signal to achieve maximum correlation (see the sketch below).
          Looked at differences in binaural cues within each band.
Across Time/Frequency Correlations of Static Noise
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Acoustic Waveform Comparisons: Static Sound/Static Head Results Cont.
Difference in ITDs from Free-Field and Headphones for Static Noise
Difference in ILDs from Free-Field and Headphones for Static Noise
Dynamic Sound/Static Head: Methods
     Presented a speech or a noise waveform either over loudspeakers or over headphones, using the panning or the convolution algorithm respectively.
     The sound moved from 0 to 30 degrees.
     Used the same 4 HRTF sets.
Across Time/Frequency Correlation of Dynamic Noise
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Acoustic Waveform Comparison: Dynamic Sound/Static Head Noise Results Cont.
Difference in ITDs from Free-Field and Headphones for Dynamic Noise
Difference in ILDs from Free-Field and Headphones for Dynamic Noise
Static Sound/Dynamic Head: Methods
     A speech or noise waveform was presented over loudspeakers or headphones at a fixed position, 30 degrees.
     The same 4 HRTF sets were used.
     KEMAR was moved from the 30-degree position to the 0-degree position while the sound was presented.
     Head position was monitored using an Intersense® IS900 VWT head tracker.
Static Sound/Dynamic Head: Analysis
     Similar data analysis was performed in this case as in the previous two cases.
     Only tracks that followed the same trajectory were correlated (see the sketch below).
          Acceptance criterion was less than a 1 or 1.5 degree difference between the tracks.
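One possible reading of that acceptance criterion, sketched below; whether the difference was taken as a maximum over the whole track is an assumption for illustration.

```python
import numpy as np

def tracks_match(track_a_deg, track_b_deg, tol_deg=1.0):
    """Accept a pair of head-movement tracks for waveform comparison only if
    their azimuths never differ by more than tol_deg (1 or 1.5 degrees here)."""
    diff = np.abs(np.asarray(track_a_deg) - np.asarray(track_b_deg))
    return np.max(diff) < tol_deg
```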
Across Time/Frequency Correlation for Dynamic Head/Static Noise
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Acoustic Waveform Comparison: Static Sound/Dynamic Head Noise Results Cont.
Difference in ITDs from Free-Field and Headphones for Static Noise/Dynamic Head
Difference in ILDs from Free-Field and Headphones for Static Noise/Dynamic Head
Waveform Comparison Discussion
     Interaural cues match up very well across the different conditions as well as between loudspeakers and headphones.
          This follows from the higher correlations in the magnitude and phase functions.
     Differences (correlation) in the waveforms may not matter perceptually if the listener receives the same binaural cues.
     The output algorithm of the RTVAS appears to present correctly oriented directional sound and to adjust correctly to head movement.
Psychophysical Experiment: Details
     6 normal-hearing subjects
          4 male, 2 female
     Sound was presented over headphones or loudspeakers.
     The task was to track a moving sound source with the head.
     HRTFs tested: Empirical KEMAR, Minimum-Phase KEMAR, and Individual (interpolated using minimum phase).
Psychophysical Experiment: Details cont.
     Sound details
          White noise
               Frequency content: 200 Hz to 10 kHz
               Presented at 65 dB SPL
               5 seconds in duration
     Track details (see the sketch below)
          15 (sin((2π/5) t) + sin((2π/2) t · rand))
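A minimal sketch that generates one such track; the sampling rate of the trajectory, the interpretation of the formula as azimuth in degrees, and the distribution of rand (a single uniform draw per trial) are assumptions for illustration.

```python
import numpy as np

def make_track(duration_s=5.0, fs_track=100.0, rng=None):
    """Target trajectory: 15*(sin((2*pi/5)*t) + sin((2*pi/2)*t*rand))."""
    rng = np.random.default_rng() if rng is None else rng
    t = np.arange(0.0, duration_s, 1.0 / fs_track)
    r = rng.random()                  # one random draw, fixed for the whole trial
    return 15.0 * (np.sin(2 * np.pi / 5 * t) + np.sin(2 * np.pi / 2 * t * r))
```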
      
Psychophysical Experiment: Virtual Setup
     Head movement training – subjects just moved the head (no sound)
          5 repetitions where the subjects' task was to put a square (representing the head) inside another box.
          This also centers the head.
     Training – all using Empirical KEMAR
          10 trials where the subject was shown, via a plot, the path of the sound before it played.
          10 trials where the same track as before was presented but no visual cue was available.
          10 trials where the subject was shown the path via a plot, but the path was random from trial to trial.
          10 trials where the tracks were random and there was no visualization.
Psychophysical Experiment: Setup cont.
     Experiment (headphones)
          10 trials using Empirical KEMAR HRTFs
          10 trials using Minimum-Phase KEMAR HRTFs
          10 trials using Individual HRTFs
          Repeated 3 times
     Loudspeaker training
          Same as for headphones, but reduced to 5 trials.
     Loudspeaker experiment
          30 trials, repeated only once
          Subjects were instructed to press a button as soon as they heard the sound; this started the head tracking.
Individual Tracking Results
Individual RMS/RMS Error
Individual Response to Complexity of Tracks
Overall Coherence in Performance
Overall Latency in Tracking
RMS/RMS Error of Tracking
Complexity of Track Analysis
Deeper Look into Individual HRTF Case
Psychophysical Experiment: Discussion
     Coherence
          The coherence (correlation) measure in the Empirical and Minimum-Phase interpolation cases is not statistically different from that over loudspeakers.
          Coherence with Individual HRTFs was surprisingly worse.
          Coherence also stays strong as the complexity of the track varies.
     Latency
          Individual HRTFs show more variability in latency.
               Subjects might be able to track changes more quickly using their own HRTFs.
          Loudspeaker latency is negative, which means that subjects are predicting the path.
               This could be because the sound always moves to the right first, as well as a result of the delay in pressing the button.
Psychophysical Experiment: Discussion Cont.
     RMS
          No significant difference in total RMS error or RMS undershoot error between the Empirical and Minimum-Phase HRTF cases and loudspeakers.
          Subjects generally undershoot the path of the sound.
               This could be a motor problem (i.e., laziness) as well as a perceptual one.
Overall Conclusions
     Coherence of the acoustic recordings may not be the best measure for validation.
          Reverberation or the panning technique may limit it.
     If perception is the only thing that matters, then we have to conclude that the algorithm works.
Future Work
     Look at different methods for presenting dynamic sound over loudspeakers.
     Try different room environments.
     Take a closer look at differences between headphones.
          Particularly open-canal tube-phones, to see whether subjects can distinguish between real and virtual sources.
     Run various psychophysical experiments that involve dynamic sound (speech, masking):
          Sound localization
          Source separation
Acknowledgements
     Committee
          Steven Colburn
          Barb Shinn-Cunningham
          Nathaniel Durlach
     Binaural Gang
          Todd Jennings
          Le Wang
          Tim Streeter
          Varun Parmar
          Akshay Navaladi
          Antje Ihlefeld
     Other
          Dave Freedman
          Jake Scarpaci
          My Subjects
          All in Attendance
THANK YOU
Backup Slides
Methods: Real Presentation Continued
     Input stimulus was a 17th-order MLS sequence sampled at 50 kHz (see the sketch below).
          Corresponds to a duration of ~2.6 s.
     Waveforms were recorded on KEMAR (Knowles Electronic Manikin for Acoustic Research).

     Table: Speaker Presentation
     Source Speaker (deg)    Created Positions (deg)
     10                      -5, 0, 5
     15                      -10, 0, 10
     30                      -20, -10, 0, 10, 20
     45                      -40, -30, -10, 0, 10, 30, 40
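A minimal sketch of generating that measurement stimulus; mapping the binary sequence to +/-1 is an assumption for illustration.

```python
import numpy as np
from scipy.signal import max_len_seq

fs = 50_000                    # sampling rate used for the measurements (Hz)
mls, _ = max_len_seq(17)       # 17th-order MLS: 2**17 - 1 = 131071 samples
stimulus = 2.0 * mls - 1.0     # map {0, 1} -> {-1, +1}
print(len(stimulus) / fs)      # ~2.62 s, matching the slide's ~2.6 s
```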
Results: Real Presentation
     HRTFs measured when sound was presented over loudspeakers using the linear and nonlinear interpolation functions.
     [Figures: Linear (left) and Nonlinear (right)]
Results: Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers

     Table: Correlation coefficients between a virtual point source and a real source (speaker location and virtual position in degrees)

     Speaker     Virtual      Linear Function           Non-linear Function
     Location    Position     Left        Right         Left        Right
     45          -40          0.98799     0.9758        0.98655     0.97769
     45          -30          0.97427     0.96611       0.97534     0.96777
     45          -10          0.96842     0.94612       0.96858     0.9466
     45            0          0.95736     0.91602       0.95693     0.91709
     45           10          0.96374     0.95282       0.96384     0.95276
     45           30          0.97532     0.97095       0.97644     0.97084
     45           40          0.98397     0.98194       0.98268     0.98177
     30          -20          0.98372     0.97316       0.98385     0.97357
     30          -10          0.98054     0.9564        0.98054     0.95649
     30            0          0.97184     0.93755       0.97171     0.93774
     30           10          0.97151     0.96414       0.97147     0.96448
     30           20          0.97844     0.97768       0.97883     0.97762
     15          -10          0.993       0.97775       0.99301     0.97787
     15            0          0.97821     0.95517       0.97817     0.95503
     15           10          0.98406     0.98576       0.98412     0.98572
     10           -5          0.99326     0.97585       0.99328     0.97601
     10            0          0.98927     0.96086       0.98924     0.96077
     10            5          0.99319     0.98977       0.99312     0.98977

     Very strong correlation, generally, for all spatial locations
     Weaker correlation as the speakers become more spatially separated
     Weakest correlation when the created sound is furthest from both speakers (0 degrees)
Spatial Separation of Loudspeakers
     Correlation coefficients for a virtually created sound source at -10 degrees at various spatial separations of the loudspeakers
     Correlation declines as the loudspeakers become more spatially separated
Example of Pseudo-Anechoic HRTFs
     Correlation coefficients are slightly better when reverberations are taken out of the impulse responses:
          Linear, Reverberant: 0.98054, 0.9564 (left, right ears)
          Linear, Pseudo-Anechoic: 0.98545, 0.96019 (left, right ears)
          Nonlinear, Reverberant: 0.98054, 0.95649 (left, right ears)
          Nonlinear, Pseudo-Anechoic: 0.9855, 0.96007 (left, right ears)
Correlation Coefficients at all Spatial Locations for Interpolated Sound over Loudspeakers (Pseudo-Anechoic)

     Table 3. Correlation coefficients for Pseudo-Anechoic HRTFs (speaker location and virtual position in degrees)

     Speaker     Virtual      Linear Function           Non-linear Function
     Location    Position     Left        Right         Left        Right
     45          -40          0.96567     0.99168       0.96416     0.98421
     45          -30          0.96223     0.95356       0.96138     0.95815
     45          -10          0.96348     0.93433       0.96299     0.93902
     45            0          0.95471     0.89491       0.95436     0.89968
     45           10          0.95856     0.93652       0.95913     0.93953
     45           30          0.97678     0.945         0.97825     0.94013
     45           40          0.99563     0.9814        0.99        0.98018
     30          -20          0.98762     0.97555       0.98767     0.97663
     30          -10          0.98545     0.96019       0.9855      0.96007
     30            0          0.97281     0.93616       0.97284     0.93623
     30           10          0.97927     0.96945       0.97912     0.96968
     30           20          0.97904     0.98188       0.97846     0.98183
     15          -10          0.99608     0.98114       0.99592     0.98167
     15            0          0.97891     0.95475       0.9788      0.95461
     15           10          0.9928      0.98922       0.99287     0.9892
     10           -5          0.99738     0.98141       0.99736     0.98162
     10            0          0.99329     0.96323       0.99333     0.9632
     10            5          0.99731     0.9946        0.99736     0.99462

     Correlations generally are better when the reverberant energy is taken out of the impulse responses.
HRTF Window Function
HRTF Magnitude Comparison
Headphone Transfer Function
