MSEE Defense


  1. 1. Enhancements to the Generalized Sidelobe Canceller for Audio Beamforming in an Immersive Environment Phil Townsend MSEE Candidate University of Kentucky www.vis.uky.edu | Dedicated to Research, Education and Industrial Outreach | 859.257.1257
  2. 2. Overview 1) Introduction - Adaptive Beamforming and the GSC 2) Amplitude Scaling Improvements - 1/r Model, Acoustic Physics, Statistical 3) Automatic Target Alignment - Thresholded Cross Correlation using PHAT-β 4) Array Geometry Analysis - Volumetric Beamfield Plots - Monte Carlo Test of Geometric Parameters 5) Final Conclusions and Questions
  3. 3. Part 1: Introduction • What's beamforming? • A spatial filter that enhances sound based on its spatial position through the coherent processing of signals from distributed microphones. – Reduce room noise/effects – Suppress interfering speakers
  4. 4. Adaptive Beamforming • Optimization of Generalized Filter Coefficients: $y[n] = W_{opt}^T[n]\,X[n]$ – Often requires minimizing output energy while keeping target component unchanged • Estimate statistics on the fly – Input Correlation Matrix unknown/changing • Gradient Descent Toward Optimal Taps – Constrained Lowest Energy Output Forms the Unique Minimum of a Bowl-Shaped Surface
  5. 5. Visualization of Gradient Descent From http://en.wikipedia.org/wiki/Gradient_descent; Image in Public Domain
  6. 6. Generalized Sidelobe Canceller (GSC) • Simplifies Frost's constrained adaptation into two stages – A fixed Delay-Sum Beamformer (DSB) – A Blocking Matrix (BM) whose outputs are adaptively filtered and subtracted – Adaptation can be any algorithm; we use NLMS here – Simplification comes mostly from enforcing distortionless response
  7. 7. [Figure only]
  8. 8. GSC (cont'd) • Upper branch: DSB result [equation not captured] • Lower branch: BM tracks are [equation not captured], where the traditional Blocking Matrix is [matrix not captured]
  9. 9. GSC (cont'd) • Final output is [equation not captured] • Adaptation algorithm for each BM track is [equation not captured] (NLMS, much faster than constrained adaptation)
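The equations on the two GSC slides did not survive in the transcript, so the following is only a minimal sketch of a textbook GSC with NLMS, not the author's exact formulation. It assumes the microphone signals are already time-aligned toward the target, uses the traditional adjacent-difference blocking matrix, and normalizes the step by the total power in all BM tap lines. The function name `gsc_nlms` and the parameters `num_taps`, `mu`, and `eps` are hypothetical.

```python
import numpy as np

def gsc_nlms(x, num_taps=16, mu=0.1, eps=1e-8):
    """Textbook GSC sketch. x: (M, N) array of pre-steered microphone signals."""
    M, N = x.shape
    dsb = x.mean(axis=0)              # upper branch: fixed delay-sum beamformer
    bm = x[:-1] - x[1:]               # lower branch: adjacent differences, shape (M-1, N)
    w = np.zeros((M - 1, num_taps))   # one adaptive FIR filter per BM track
    y = np.zeros(N)
    for n in range(num_taps - 1, N):
        # Tap-delay lines for every BM track, most recent sample first.
        u = bm[:, n - num_taps + 1:n + 1][:, ::-1]
        # Output: DSB minus the adaptively filtered BM tracks.
        y[n] = dsb[n] - np.sum(w * u)
        # NLMS update (assumption: step normalized by total BM power).
        w += mu * y[n] * u / (np.sum(u * u) + eps)
    return y
```

When every channel carries the same target signal, the BM tracks are zero and the output equals the DSB result, which is the distortionless response mentioned on slide 6. Any target amplitude mismatch between channels leaks into the BM tracks, which is the problem Part 2 addresses.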
  10. 10. Limitations of Current Models and Methods • Blocking Matrix Leakage – Farfield assumption not valid for immersive microphone arrays – Target steering might be incorrect • Most research limited to equispaced linear arrays – Hard to construct – Limited useful frequency range – Want to explore other geometries and find the best
  11. 11. Part 2: Amplitude Correction • Nearfield acoustics means target component has different amplitude in each microphone • Propose and test a few models to correct cancellation – 1/r Model – Sound propagation filtering – Statistical filtering
  12. 12. Simple 1/r Model • The acoustic wave equation is solved by a function inversely proportional to r • So make a BM using that fact (keep tracks in distance order)
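One way the 1/r idea could be turned into a blocking matrix is sketched below. The slide does not give the exact matrix, so this is an assumption: each aligned track is multiplied by its microphone-to-target distance to undo the 1/r attenuation, the tracks are sorted by distance, and adjacent tracks are subtracted. The name `one_over_r_blocking` and its arguments are hypothetical.

```python
import numpy as np

def one_over_r_blocking(x, mic_pos, target_pos):
    """x: (M, N) time-aligned signals; mic_pos: (M, 3); target_pos: (3,)."""
    r = np.linalg.norm(mic_pos - target_pos, axis=1)   # mic-to-target distances
    order = np.argsort(r)                              # keep tracks in distance order
    scaled = x[order] * r[order, None]                 # undo the assumed 1/r attenuation
    return scaled[:-1] - scaled[1:]                    # adjacent differences: (M-1, N)
```

If the target really arrives at each microphone as s[n]/r, every output track is zero, so no target leaks into the adaptive branch.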
  13. 13. ISO Acoustic Physics Model • Fluid dynamics can be taken into account to design a filter based on distance, temperature, humidity, and pressure (ISO standard 9613) • Might allow us to add easily obtainable information to enhance beamforming
  14. 14. Statistical Amplitude Scaling • Lump all corruptive effects together and minimize energy of difference of tracks • Carry out as a function of frequency to get [equation not captured]
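The resulting formula is missing from the transcript. A plausible reading, offered here only as an assumption, is the per-frequency least-squares scale a(ω) that minimizes the energy of X1(ω) − a(ω)·X2(ω) summed over frames, which gives a(ω) = Σ X1·conj(X2) / Σ |X2|². The function name `statistical_scale` and the parameter `frame_len` are hypothetical.

```python
import numpy as np

def statistical_scale(x1, x2, frame_len=256):
    """Per-bin least-squares scale a such that X1 is approximated by a * X2."""
    n_frames = min(len(x1), len(x2)) // frame_len
    n = n_frames * frame_len
    # Split both tracks into non-overlapping frames and transform each frame.
    X1 = np.fft.rfft(np.asarray(x1[:n]).reshape(n_frames, frame_len), axis=1)
    X2 = np.fft.rfft(np.asarray(x2[:n]).reshape(n_frames, frame_len), axis=1)
    num = np.sum(X1 * np.conj(X2), axis=0)        # cross-spectrum summed over frames
    den = np.sum(np.abs(X2) ** 2, axis=0) + 1e-12 # power of x2, guarded against zero
    return num / den                              # complex scale per frequency bin
```

The BM track for the pair would then be X1 − a·X2 in the frequency domain. Because the estimate is driven by whatever dominates the recordings, a poor SNR biases it, which is consistent with the result reported on slide 19.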
  15. 15. ISO and Statistical BMs • ISO Model (Frequency Domain) [equation not captured] • Statistical Scaling (Frequency Domain) [equation not captured]
  16. 16. A Perfect Blocking Matrix • Audio Cage data was collected with targets and speakers separate, so a perfect BM can be simulated • Shows upper bound on possible improvement
  17. 17. [Figure only]
  18. 18. Experimental Evaluation of Methods • Set initial intelligibility to around 0.3 • Beamform for many target and noise scenarios • Find mean correlation coefficient of BM tracks (want as low as possible) and overall output (want as large as possible)
  19. 19. Results • Most real methods make little difference – Statistical scaling a little worse because of bad SNR – ISO filtering a little better because of more info – 1/r model made no difference • Perfect BM made slight improvement, but array geometry was most important! • Listen to some examples...
  20. 20. Output Correlation Chart
  21. 21. BM Correlation Chart
  22. 22. Part 3: Automatic Steering • If steering delays aren't right, then target signal leakage occurs and the DSB is weaker. • Cross correlation is a highly robust technique for finding similarities between signals, so use it to fine-tune the delays • Apply window and correlation strength thresholds to try to improve performance in a poor SNR environment
  23. 23. GCC and PHAT-β • Find the cross correlation between tracks over only a small window of possible movements and whiten to make the spike stand out
  24. 24. Correlation Coefficient Threshold • Since environment is noisy and speaker might go silent, update only if max correlation is sufficiently strong
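The two slides above can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the PHAT-β weighting divides the cross-spectrum by its magnitude raised to β, the search is limited to ±`max_lag` samples, and the threshold test uses the ordinary correlation coefficient between the reference and the re-aligned track (the slides do not say how the coefficient was computed). Both tracks are assumed to have equal length; all names are hypothetical.

```python
import numpy as np

def gcc_phat_beta(x_ref, x, max_lag, beta=0.8):
    """Whitened cross correlation over lags -max_lag..max_lag."""
    n = len(x_ref) + len(x)                        # zero-pad to avoid wrap-around
    X_ref = np.fft.rfft(x_ref, n)
    X = np.fft.rfft(x, n)
    cross = X * np.conj(X_ref)                     # peak at +d if x lags x_ref by d
    cross = cross / np.maximum(np.abs(cross), 1e-12) ** beta   # partial whitening
    cc = np.fft.irfft(cross, n)
    lags = np.arange(-max_lag, max_lag + 1)
    return lags, cc[lags]                          # negative indices wrap to negative lags

def delay_correction(x_ref, x, max_lag, threshold=0.5, beta=0.8):
    """Return (correction_in_samples, rho); correction is 0 if the match is too weak."""
    lags, cc = gcc_phat_beta(x_ref, x, max_lag, beta)
    lag = int(lags[np.argmax(cc)])
    aligned = np.roll(x, -lag)                     # circular shift; edge samples ignored here
    rho = np.corrcoef(x_ref, aligned)[0, 1]
    return (lag, rho) if rho >= threshold else (0, rho)
```

With β = 1 this reduces to the standard PHAT weighting and with β = 0 to plain cross correlation. A small `max_lag` plays the role of the search window D discussed in the results.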
  25. 25. Experimental Evaluation • Same setup as before – Initial intelligibility ~0.3 – Find output correlation with closest mic • Vary correlation threshold from 0.1 to 0.9
  26. 26. Results • Tighter threshold better but updates never help vs original GSC – Low threshold: erratic focal point movement – High threshold: can't recover from bad updates – Low SNR makes good estimates very difficult • Retrace of lags (multilateration) shows search window D should be tighter • Array geometry still more important • Listen to some more examples...
  27. 27. Output Correlation Chart (Normal GSC Performance for Comparison)
  28. 28. [Figure only]
  29. 29. [Figure only]
  30. 30. Part 4: Array Geometry • Since array geometry is the most important factor, we need to find what the best layouts are and why • Start by generating beamfields to visualize array performance and look for patterns qualitatively • Then propose parameters and run computer simulations quantitatively
  31. 31. Volumetric Beamfield Plots • GSC beamfield changes over time, but the DSB is the root of the system and its performance is constant. • Need to see performance in three dimensions • Use layered approach with colors to indicate intensity and transparency to see features inside the space
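A beamfield value at each point in the room can be computed as the DSB gain for a source at that point when the array is focused elsewhere. The sketch below is a simplified single-frequency version that ignores amplitude decay and room effects; the thesis plots may well have been generated differently (for example, over a band of speech frequencies). The function name, `freq`, and the speed of sound `c` are assumptions.

```python
import numpy as np

def dsb_beamfield(mic_pos, focus, points, freq=1000.0, c=343.0):
    """DSB magnitude gain at each point for an array focused on `focus`.

    mic_pos: (M, 3), focus: (3,), points: (P, 3); returns (P,) gains in [0, 1].
    """
    r_focus = np.linalg.norm(mic_pos - focus, axis=1)                       # (M,)
    r_pts = np.linalg.norm(points[:, None, :] - mic_pos[None], axis=2)      # (P, M)
    # Residual phase per mic after steering delays for the focal point are removed.
    phase = -2j * np.pi * freq * (r_pts - r_focus) / c
    return np.abs(np.exp(phase).mean(axis=1))
```

Evaluating this on a 3-D grid of points gives the volume that the layered color and transparency plots would display; the gain is exactly 1 at the focal point and no larger anywhere else.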
  32. 32. Linear Array • Generally good performance – Office too small for sidelobes to appear • Mainlobe elongated toward array
  33. 33. [Figure only]
  34. 34. Perimeter Array • Also generally good – Very tight mainlobe • No height resolution – Not a problem in an office though – Motivation for ceiling arrays
  35. 35. [Figure only]
  36. 36. Random Arrays • Performance highly variable – One best of the lot, one very bad • Need to find ways to describe and select best random arrays (coming soon)
  37. 37. [Figure only]
  38. 38. [Figure only]
  39. 39. A Monte Carlo Experiment for Analysis of Geometry • Propose the following parameters for describing array geometry in 2D and evaluate array performance for many randomly-chosen geometries: – Centroid • Array center of gravity (mean position) – Dispersion • Mic spread (standard deviation of positions)
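Both parameters are simple to compute from the microphone coordinates. The sketch below takes the centroid as the mean position, as the slide states, and assumes the scalar dispersion is the RMS distance of the microphones from the centroid; the slide only says "standard deviation of positions", so the exact 2-D definition is an assumption, as is the function name.

```python
import numpy as np

def centroid_and_dispersion(mic_xy):
    """mic_xy: (M, 2) microphone positions in the plane."""
    centroid = mic_xy.mean(axis=0)                                   # mean position
    # Assumed definition: RMS distance of the mics from the centroid.
    dispersion = np.sqrt(np.mean(np.sum((mic_xy - centroid) ** 2, axis=1)))
    return centroid, dispersion
```

For four microphones at the corners of a 2 m square centered on the origin, this gives a centroid of (0, 0) and a dispersion of √2 m.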
  40. 40. Parameter Examples
  41. 41. Monte Carlo (cont'd) • For a given centroid and dispersion, evaluate the array based on: – PSR – Peak to Side lobe Ratio • Worst-case interference – MLW – Main Lobe Width • Tightness of enhancement area • Redefined in 2D to use x and y 3dB widths: $w_{3dB} = \sqrt{x_{3dB}^2 + y_{3dB}^2}$
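As an illustration of the 2-D main lobe width, the sketch below measures the 3 dB widths along x and y through the peak of a gridded power beamfield and combines them as the root of the sum of squares. That combination is a reading of the garbled formula on the slide, and the assumption that the field holds power (so 3 dB down is half the peak) is also not confirmed by the source. The function name and arguments are hypothetical.

```python
import numpy as np

def main_lobe_width(field, dx, dy):
    """field: 2-D power beamfield indexed [iy, ix]; dx, dy: grid spacing in meters."""
    iy, ix = np.unravel_index(np.argmax(field), field.shape)
    half = field[iy, ix] / 2.0                     # 3 dB below the peak (power)

    def width(profile, i, step):
        # Walk outward from the peak until the profile drops below half power.
        lo = i
        while lo > 0 and profile[lo - 1] >= half:
            lo -= 1
        hi = i
        while hi < len(profile) - 1 and profile[hi + 1] >= half:
            hi += 1
        return (hi - lo) * step

    w_x = width(field[iy, :], ix, dx)              # 3 dB width along x
    w_y = width(field[:, ix], iy, dy)              # 3 dB width along y
    return np.hypot(w_x, w_y)                      # sqrt(w_x**2 + w_y**2)
```

The widths are quantized to the grid spacing, so a finer grid gives a more accurate value.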
  42. 42. Monte Carlo Simulation • Test variation of one parameter while holding the other constant. • Generate random positions from an 8 x 8 m square and target a sound source 1 m below center • Choose 120 random geometries for each run (a “class” of arrays) • Compare to rectangular array
  43. 43. Layout
  44. 44. Centroid Displacement
  45. 45. Dispersion
  46. 46. Results • Centroid centered over target always best – Irregular arrays more robust when centroid shifts • Dispersion a classic tradeoff – Tightly-packed array: tight mainlobe but strong sidelobes – Widely-spread array: wide mainlobe but weak sidelobes
  47. 47. Part 5: Final Conclusions & Future Work • Statistical methods for improving GSC ineffective – Low SNR introduces large error • Introducing separate, concrete info helped – ISO model gave a tiny improvement – More accurate target position (laser, SSL) always best for steering • Array geometry is most important to improving performance – Linear array good, but random arrays have potential to do better – Found that a ceiling array should be centered over its intended target, but... – Open question: how does one describe the best array for beamforming on human speech?
  48. 48. Special Thanks • Advisor – Dr. Kevin Donohue • Thesis Committee Members – Dr. Jens Hannemann – Dr. Samson Cheung • Everyone at the UK Vis Center
  49. 49. Questions?
  50. 50. Extra Slides
  51. 51. Frost Algorithm • Solution to the constrained optimization [equation not captured] subject to the constraint [equation not captured] (C a selection matrix). The constraint vector dictates the sum of column weights, often F = [1 0 0 0...] • Solution (P and F constant matrices): [equation not captured]
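For reference, the standard textbook form of Frost's linearly constrained adaptive algorithm is written out below. The slide's own equations were not captured, so the notation here is an assumption: $R_{XX}$ is the input correlation matrix, $\mathcal{F}$ is the constraint vector (the slide's F = [1 0 0 0...]), and $F$ in the update is the constant vector derived from it.

```latex
\begin{aligned}
&\min_{W}\; W^T R_{XX} W \quad \text{subject to} \quad C^T W = \mathcal{F} \\
&W_{opt} = R_{XX}^{-1} C \left( C^T R_{XX}^{-1} C \right)^{-1} \mathcal{F} \\
&W[0] = F, \qquad W[n+1] = P \left( W[n] - \mu\, y[n]\, X[n] \right) + F \\
&P = I - C \left( C^T C \right)^{-1} C^T, \qquad F = C \left( C^T C \right)^{-1} \mathcal{F}
\end{aligned}
```

The projection $P$ and the vector $F$ are fixed once the constraints are chosen, which matches the slide's remark that they are constant. The GSC reaches the same constrained solution with an unconstrained update, which is why its NLMS adaptation is simpler.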
  52. 52. [Figure only]
