Dsp2015for ss

Invited talk in IEEE DSP2015

Published in: Engineering

  1. Statistical-Model-Based Speech Enhancement with Musical-Noise-Free Properties
     Hiroshi Saruwatari (The University of Tokyo, Japan)
     IEEE DSP2015 Invited Talk
  2. Outline
     1. Research background
     2. What is musical-noise-free?
     3. Conventional statistical-model-based speech enhancement
     4. Proposed method and analysis
     5. Experimental evaluation
     6. Conclusion
  3. Research Background and Goal
     • Single-channel speech enhancement
     • Spectral subtraction (SS) [Boll, 1979], Wiener filtering, Bayesian minimum mean-square error short-time spectral amplitude (MMSE-STSA) estimator [Ephraim, 1984], MAP estimator [Lotter, 2005], etc.
     • Harmful distortion owing to musical noise generation
     • Musical-noise-free speech enhancement [Miyazaki, Saruwatari et al., IEEE Trans. ASLP 2012]
     • Noise reduction without any musical noise
     • We have found that SS (maximum-likelihood amplitude estimator) has a musical-noise-free state.
     • Does the generalized Bayesian MMSE-STSA estimator also have a musical-noise-free state?
  4. Relation between Musical Noise and Kurtosis
     Proportional relation between human perception (musical noise score) and log kurtosis ratio [Saruwatari, 2008]
  5. What is Musical-Noise-Free?
  6. Musical-Noise-Free Speech Enhancement
     • Iterative noise reduction procedure with musical-noise-free condition [Miyazaki, Saruwatari, et al., IEEE Trans. ASLP 2012]
  7. MOSIE (generalized MMSE-STSA) Estimator
     Statistical speech amplitude estimator with parametric speech prior [Breithaupt, et al., IEEE Trans. 2011]
  8. How to Generate a Musical-Noise-Free State?
     Unfortunately, we cannot find any musical-noise-free state in the conventional MOSIE estimator.
     Figure annotations: no intersection; forgetting factor α increasing
  9. Analysis Strategy
  10. Calculation of Moment for Biased MOSIE (1/4)
      1. Derivation of p.d.f.
  11. Calculation of Moment for Biased MOSIE (2/4)
      2. Calculation of moment for
  12. Calculation of Moment for Biased MOSIE (3/4)
      3. Moment-cumulant transformation for
      4. Cumulant of noise power spectrum
  13. Calculation of Moment for Biased MOSIE (4/4)
      5. Cumulant-moment transformation for
      m1 is used for NRR, and m2 and m4 are used for kurtosis; all of these are functions of the bias value ε.
  14. Calculation of Moment for Biased MOSIE (4/4)
      Figure annotation: bias ε large
  15. Experiment 1: Existence of Musical-Noise-Free State
      Noise: White Gaussian noise at 0-dB SNR
      Speech prior: Gaussian model (ρ = 1)
      Forgetting factor in decision-directed (DD) estimation: 0.98
      Noise PSD estimation: Minimum Statistics Method [Martin, 1994]
      Figure panels: theoretical analysis; experimental results (annotations: bias ε = 0; ε large)
      By introducing bias ε, we find a musical-noise-free state in the statistical-model-based estimator.
  16. Experiment 2: Existence of Musical-Noise-Free State
      Noise: White Gaussian noise at 0-dB SNR
      Speech prior: Super-Gaussian model (ρ = 0.5)
      Forgetting factor in decision-directed (DD) estimation: 0.98
      Noise PSD estimation: Minimum Statistics Method [Martin, 1994]
      Figure panels: theoretical analysis; experimental results (annotations: bias ε = 0; ε large)
      A strong speech prior (small ρ) gives almost no musical-noise-free state in real processing.
  17. Experiment 3: Comparison with Other Methods
      Speech: 10 utterances
      Noise: White Gaussian noise at 0-dB SNR
      Speech prior: Super-Gaussian model (ρ = 0.5, β = 0.001)
      Forgetting factor in decision-directed (DD) estimation: 0.98
      Noise PSD estimation: Minimum Statistics Method [Martin, 1994]
      Target NRR: 16 dB
  18. Experiment 3: Comparison with Other Methods (conditions as on slide 17)
      Figure annotations: large musical noise methods; no musical noise methods
  19. Experiment 3: Comparison with Other Methods (conditions as on slide 17)
      Figure annotations: lowest speech distortion; large musical noise methods; no musical noise methods; richer speech prior
  20. Conclusion
      • By introducing bias ε, we find a musical-noise-free state in the Bayesian estimator.
      • The proposed biased MOSIE estimator achieves better cepstral distortion while its kurtosis ratio is held at exactly 1.0.
      • A strong speech prior (small ρ) gives almost no musical-noise-free state, so the prior should be selected carefully to maintain the quality of both the speech and the remaining noise.
      Thank you for your attention!
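
The two quantities the slides rely on, the kurtosis ratio (slides 4 and 20) and the noise reduction rate, NRR (slides 13 and 17), can be illustrated with a small numerical sketch. The transcript does not spell out their formulas, so the code assumes that kurtosis is m4 / m2^2 computed from raw moments of the noise power spectrum, and that NRR is the input-to-output ratio of mean noise power (m1) in dB, which is consistent with the remark on slide 13. The noise reduction step is textbook power spectral subtraction with a flooring parameter, not the biased MOSIE estimator of the talk. All function names and the parameters `beta` and `eta` are hypothetical and chosen for illustration only.

```python
import numpy as np


def kurtosis(power):
    """Kurtosis as m4 / m2**2 from raw moments of a power spectrum (assumed definition)."""
    m2 = np.mean(power ** 2)
    m4 = np.mean(power ** 4)
    return m4 / m2 ** 2


def kurtosis_ratio(before, after):
    """Kurtosis after processing divided by kurtosis before processing."""
    return kurtosis(after) / kurtosis(before)


def nrr_db(before, after):
    """Noise reduction rate in dB from the first moments (mean power) in a noise-only signal."""
    return 10.0 * np.log10(np.mean(before) / np.mean(after))


def spectral_subtraction(power, noise_psd, beta, eta):
    """Textbook power spectral subtraction.

    beta is the oversubtraction factor and eta the amplitude flooring factor
    (both hypothetical parameter names). Bins that would go negative are
    replaced by the floored input power eta**2 * power.
    """
    subtracted = power - beta * noise_psd
    return np.where(subtracted > 0.0, subtracted, (eta ** 2) * power)


def simulate_noise_power(n, seed=0):
    """Power spectrum samples of unit-variance complex white Gaussian noise.

    The power of such noise is exponentially distributed with mean 1, so its
    raw moments are m_k = k! and the kurtosis defined above is 24 / 4 = 6.
    """
    rng = np.random.default_rng(seed)
    spectrum = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2.0)
    return np.abs(spectrum) ** 2


if __name__ == "__main__":
    noise = simulate_noise_power(1_000_000)
    for beta, eta in [(0.0, 0.0), (2.0, 0.0), (2.0, 0.5)]:
        out = spectral_subtraction(noise, noise.mean(), beta, eta)
        print(f"beta={beta}, eta={eta}: "
              f"NRR={nrr_db(noise, out):.2f} dB, "
              f"kurtosis ratio={kurtosis_ratio(noise, out):.2f}")
```

With `beta = 0` the signal is unchanged, so the kurtosis ratio is 1 and the NRR is 0 dB. Subtraction without flooring reduces noise but raises the kurtosis ratio above 1. The musical-noise-free state discussed in the slides corresponds to settings where the ratio stays at 1 while the NRR is positive.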
