- 1. Online Divergence Switching for Superresolution-Based Nonnegative Matrix Factorization. Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan); Yu Takahashi, Kazunobu Kondo (Yamaha Corporation, Japan); Hirokazu Kameoka (The University of Tokyo, Japan). 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing, Speech Analysis (2), 2PM2-2.
- 2. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Directional clustering – Hybrid method • 3. Proposed method – Online divergence switching for hybrid method • 4. Experiments • 5. Conclusions
- 4. Research background • Music signal separation technologies have received much attention. Applications include automatic music transcription and 3D audio systems. • Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area. • The separation performance of supervised NMF (SNMF) degrades markedly when the mixture contains many sources. • We have proposed a new hybrid separation method for stereo music signals.
- 5. Research background • Our proposed hybrid method: input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal.
- 6. Research background • The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal. • Our aim in this presentation: we propose a new separation scheme for this hybrid method that separates the target signal with high accuracy under any spatial condition.
- 8. NMF [Lee, et al., 2001] • NMF is a sparse representation algorithm. • It can extract significant features from the observed matrix. • Model: the observed matrix (amplitude spectrogram, Ω × 𝑇) is approximated by the product of a basis matrix (spectral patterns, Ω × 𝐾) and an activation matrix (time-varying gains, 𝐾 × 𝑇). Ω: number of frequency bins, 𝑇: number of time frames, 𝐾: number of bases.
- 9. Optimization in NMF • The basis and activation matrices are optimized by minimizing the divergence between the observed matrix and their product. • Cost function: the sum, over all entries, of the divergence between the observed matrix and the product of the basis and activation matrices. • Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used as the divergence in the cost function. • In NMF-based separation, a cost function based on KL-divergence achieves high separation performance.
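
The optimization described on this slide can be sketched with the standard multiplicative update rules for NMF. This is a minimal illustration, not the authors' implementation: the names `Y`, `F`, `G`, `nmf`, and `divergence` are assumptions introduced here, and the KL case uses the generalized KL-divergence that is common in NMF.

```python
import numpy as np

def divergence(Y, X, kind="KL", eps=1e-12):
    """Total EUC-distance or generalized KL-divergence between Y and X."""
    if kind == "EUC":
        return float(np.sum((Y - X) ** 2))
    # Generalized KL-divergence: y * log(y / x) - y + x, summed over entries.
    return float(np.sum(Y * np.log((Y + eps) / (X + eps)) - Y + X))

def nmf(Y, n_bases, kind="KL", n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix Y (frequency x time) as F @ G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, n_bases)) + eps  # basis matrix (spectral patterns)
    G = rng.random((n_bases, n_time)) + eps  # activation matrix (time-varying gains)
    for _ in range(n_iter):
        if kind == "EUC":
            F *= (Y @ G.T) / (F @ G @ G.T + eps)
            G *= (F.T @ Y) / (F.T @ F @ G + eps)
        else:  # KL-divergence
            F *= ((Y / (F @ G + eps)) @ G.T) / (G.sum(axis=1)[None, :] + eps)
            G *= (F.T @ (Y / (F @ G + eps))) / (F.sum(axis=0)[:, None] + eps)
    return F, G
```

Calling `nmf(Y, n_bases=3, kind="EUC")` or `kind="KL"` on the same spectrogram gives a direct way to compare the two criteria mentioned above.
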
- 10. SNMF [Smaragdis, et al., 2007] • SNMF utilizes sample sounds of the target. • Training process: construct the supervised basis matrix (spectral dictionary) from sample sounds of the target signal, e.g., a musical scale. • Separation process: with the supervised basis matrix fixed, optimize the remaining matrices to decompose the mixed signal into the target signal and the other signal.
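
The separation process of SNMF can be sketched as follows, assuming a KL-divergence cost and a supervised basis matrix that has already been trained. The symbols `F` (fixed supervised bases), `G`, `H`, and `U`, and the function name `snmf_kl`, are assumptions for this illustration; the slide does not give them.

```python
import numpy as np

def snmf_kl(Y, F, n_other, n_iter=200, seed=0, eps=1e-12):
    """Decompose Y ~ F @ G + H @ U with the supervised basis F held fixed."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + eps  # target activations
    H = rng.random((Y.shape[0], n_other)) + eps     # bases for the other sounds
    U = rng.random((n_other, Y.shape[1])) + eps     # activations of the other sounds
    for _ in range(n_iter):
        # F is never updated: it was trained on the sample sounds beforehand.
        R = Y / (F @ G + H @ U + eps)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + eps)
        R = Y / (F @ G + H @ U + eps)
        H *= (R @ U.T) / (U.sum(axis=1)[None, :] + eps)
        R = Y / (F @ G + H @ U + eps)
        U *= (H.T @ R) / (H.sum(axis=0)[:, None] + eps)
    return F @ G, H @ U  # estimated target and other spectrograms
```

The training process could reuse an ordinary NMF on the sample sounds and keep only the resulting basis matrix as `F`.
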
- 11. Problem of SNMF • The separation performance of SNMF degrades markedly when many interference sources exist. Figure: separation in the two-source case compared with the five-source case, with residual components remaining after separation.
- 12. Directional clustering [Araki, et al., 2007] • Directional clustering utilizes differences between channels as a separation cue and is equivalent to binary masking in the spectrogram domain. • Problems: it cannot separate sources in the same direction, and artificial distortion arises owing to the binary masking. Figure: each time-frequency entry of the input stereo spectrogram is labeled left (L), center (C), or right (R); the entry-wise product of the spectrogram and a binary mask (1 for the selected direction, 0 otherwise) gives the separated signal.
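
The binary masking step can be illustrated with a deliberately simplified stand-in for directional clustering. Instead of the clustering used in the cited work, this sketch thresholds a level-difference index between the two channels; `direction_masks`, `apply_mask`, and the `center_width` parameter are hypothetical names and values chosen for the example.

```python
import numpy as np

def direction_masks(L, R, center_width=0.2, eps=1e-12):
    """Label each time-frequency bin as left, center, or right.

    L, R: nonnegative amplitude spectrograms of the two channels (same shape).
    center_width: assumed half-width of the center region on a -1..1 pan axis.
    """
    pan = (R - L) / (R + L + eps)  # -1: fully left, 0: center, +1: fully right
    left = pan < -center_width
    right = pan > center_width
    center = ~(left | right)
    return left, center, right

def apply_mask(S, mask):
    """Binary masking: entry-wise product of a spectrogram and a 0/1 mask."""
    return S * mask.astype(S.dtype)
```

Bins that are masked out for the chosen direction become the zero-valued gaps that the later slides call chasms.
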
- 13. Hybrid method [D. Kitamura, et al., 2013] • We have proposed a new SNMF called superresolution-based SNMF, and a hybrid method that uses it. • The hybrid method consists of directional clustering (spatial separation) followed by superresolution-based SNMF (spectral separation).
- 14. Superresolution-based SNMF • This SNMF reconstructs the spectrogram obtained from directional clustering using supervised basis extrapolation. Figure: directional clustering splits the input spectrogram into the target direction and the other directions, leaving chasms in the separated cluster; superresolution-based SNMF then produces the reconstructed spectrogram.
- 15. Superresolution-based SNMF • Spectral chasms arise in the separated cluster owing to directional clustering. • Treat these chasms as unseen observations and extrapolate the best-fitting supervised bases.
- 16. Superresolution-based SNMF. Figure: frequency of source components versus direction (left, center, right) for (a) the input signal, with the target marked, (b) after directional clustering (binary masking), and (c) after superresolution-based SNMF, showing the extrapolated components. A second panel shows the observed spectrogram (target and interference) reduced to the separated cluster by directional clustering, then extrapolated to the reconstructed data by superresolution-based SNMF using the supervised spectral bases.
- 17. Decomposition model and cost function • The divergence is defined at all grid points except the chasms by using the index matrix obtained from directional clustering. • Decomposition model: the supervised bases are fixed. • Cost function: in addition to the divergence, it contains a regularization term and a penalty term, and it is written using weighting parameters, a binary complement, and the Frobenius norm.
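
The idea of defining the divergence only outside the chasms can be sketched by weighting each entry with the index matrix. This covers only the masked divergence; the regularization and penalty terms named on the slide are omitted because their exact form is not given here. The name `masked_divergence` and the convention that `index == 1` marks observed bins are assumptions.

```python
import numpy as np

def masked_divergence(Y, X, index, kind="KL", eps=1e-12):
    """Sum the divergence between Y and X only where index == 1 (non-chasm bins)."""
    index = index.astype(float)
    if kind == "EUC":
        d = (Y - X) ** 2
    else:
        # Generalized KL-divergence per entry.
        d = Y * np.log((Y + eps) / (X + eps)) - Y + X
    return float(np.sum(index * d))
```

With this weighting, whatever the model predicts inside a chasm does not affect the cost, which is what leaves room for extrapolation there.
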
- 18. Update rules • We can obtain update rules for optimizing the variable matrices.
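
As a rough illustration of what such update rules look like, the sketch below fits only the activations of fixed supervised bases, using the standard weighted multiplicative updates with the index matrix as the weight. It is a simplification under stated assumptions: the interference bases, the regularization term, and the penalty term are all left out, and the names `masked_activation_update`, `F`, and `G` are introduced for this example.

```python
import numpy as np

def masked_activation_update(Y, F, index, kind="KL", n_iter=100, seed=0, eps=1e-12):
    """Fit activations G for fixed bases F using only bins where index == 1."""
    rng = np.random.default_rng(seed)
    I = index.astype(float)
    G = rng.random((F.shape[1], Y.shape[1])) + eps
    for _ in range(n_iter):
        X = F @ G + eps
        if kind == "EUC":
            G *= (F.T @ (I * Y)) / (F.T @ (I * X) + eps)
        else:  # KL-divergence
            G *= (F.T @ (I * Y / X)) / (F.T @ I + eps)
    return G
```

After fitting, `F @ G` also has values inside the chasms, which is the extrapolation of the supervised bases into the unseen bins.
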
- 20. Consideration for optimal divergence • Separation performance of conventional SNMF: KL-divergence compared with EUC-distance. • However, for superresolution-based SNMF it is unclear whether KL-divergence or EUC-distance is better: the optimal divergence depends on the amount of spectral chasms.
- 21. Consideration for optimal divergence • Superresolution-based SNMF has two tasks: signal separation and basis extrapolation. • Abilities of each divergence:

  | Divergence | Signal separation | Basis extrapolation |
  |---|---|---|
  | KL-divergence | Very good | Poor |
  | EUC-distance | Good | Good |

- 22. Consideration for optimal divergence • A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance. • A sparse basis is not suitable for extrapolation from the observable data. Figure: example basis spectra for KL-divergence and EUC-distance (amplitude [dB], -10 to 0, versus frequency [kHz], 0 to 5).
- 23. Consideration for optimal divergence • The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities. Figure: separation, extrapolation, and total performance plotted against sparseness, from anti-sparse (weak sparseness, EUC-distance) to sparse (strong sparseness, KL-divergence), with the corresponding basis spectra (amplitude [dB] versus frequency [kHz]).
- 24. Consideration for optimal divergence • The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms. • If there are many chasms, the extrapolation ability is required, so EUC-distance should be used. • If there are few or no chasms, the separation ability is required, so KL-divergence should be used.
- 25. Hybrid method for online input data • We consider applying the hybrid method to online input data. Figure: directional clustering applies a binary mask to the observed spectrogram, producing an online binary-masked spectrogram (frequency versus time).
- 26. Hybrid method for online input data • We divide the online spectrogram into blocks along the time axis and apply superresolution-based SNMF to the blocks in parallel.
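
Dividing the spectrogram into blocks can be sketched as a simple split along the time axis. The function name `split_into_blocks` and the fixed `block_len` parameter are assumptions; the slide does not state how the block length is chosen.

```python
import numpy as np

def split_into_blocks(S, block_len):
    """Split a (frequency x time) array into consecutive blocks of block_len frames.

    The last block may be shorter when the number of frames is not a multiple
    of block_len.
    """
    n_time = S.shape[1]
    return [S[:, start:start + block_len] for start in range(0, n_time, block_len)]
```

Because each block is handled independently, the per-block SNMF runs can be dispatched in parallel, as the slide indicates.
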
- 27. Online divergence switching • We calculate the rate of chasms in each block and compare it with a threshold value. • If the rate exceeds the threshold (there are many chasms), we use superresolution-based SNMF with EUC-distance. • If the rate is below the threshold (there are not many chasms), we use superresolution-based SNMF with KL-divergence.
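
The switching rule can be sketched as below. It follows the reasoning of the earlier slides (EUC-distance extrapolates better, KL-divergence separates better), so a block with many chasms is assigned EUC-distance. The function names, the convention that `index == 0` marks a chasm, and the default threshold of 0.5 are assumptions for illustration; the slides do not give the threshold value.

```python
import numpy as np

def chasm_rate(index_block):
    """Fraction of time-frequency bins in a block that are chasms (index == 0)."""
    return float(np.mean(index_block == 0))

def choose_divergence(index_block, threshold=0.5):
    """Pick EUC-distance when chasms dominate the block, otherwise KL-divergence."""
    return "EUC" if chasm_rate(index_block) > threshold else "KL"
```

In an online setting this decision would be made once per block, before running the SNMF on that block.
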
- 28. Procedure of proposed method
- 30. Experimental conditions • We used stereo-panning signals. • Each mixture consists of four instruments generated by a MIDI synthesizer. • As supervision for the training process, we used the same type of MIDI sounds as the target instruments. • Supervision sound: two octaves of notes that cover all the notes of the target signal. Figure: four source positions (1 to 4) arranged across left, center, and right, with the target source marked.
- 31. Experimental conditions • We compared three methods. – Hybrid method using only EUC-distance-based SNMF (Conventional method 1) – Hybrid method using only KL-divergence-based SNMF (Conventional method 2) – Proposed hybrid method that switches the divergence to the optimal one (Proposed method) • We used the signal-to-distortion ratio (SDR) as the evaluation score. – SDR indicates the total separation accuracy, which reflects both the quality of the separated target signal and the degree of separation.
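
To make the evaluation score concrete, here is a simplified energy-ratio form of SDR. The slides do not say how SDR was computed, and standard toolkits first decompose the estimate into target, interference, and artifact components, so this function (the hypothetical `simple_sdr`) is only an approximation of the idea.

```python
import numpy as np

def simple_sdr(reference, estimate, eps=1e-12):
    """Energy ratio in dB between the reference signal and the estimation error."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    ratio = (np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps)
    return float(10.0 * np.log10(ratio))
```

A higher value means the separated signal is closer to the reference target signal.
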
- 32. Experimental result • Average SDR scores for each method over 12 combinations obtained by shuffling the four instruments. • The proposed method outperforms the other methods. Figure: SDR [dB] (axis from 8.0 to 10.0; higher is better) for conventional method 1, conventional method 2, and the proposed method.
- 33. Conclusions • We proposed a new divergence switching scheme for superresolution-based SNMF. • The method separates online input signals using the optimal divergence for NMF. • The proposed method can be used under any spatial condition of the sources and separates the target signal with high accuracy. Thank you for your attention!