Presented at IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2013) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Kosuke Yagi, Kiyohiro Shikano, Yu Takahashi, Kazunobu Kondo, "Robust music signal separation based on supervised nonnegative matrix factorization with prevention of basis sharing," Proceedings of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2013), pp.392-397, Athens, Greece, December 2013.
Superresolution-based stereo signal separation via supervised nonnegative mat...Daichi Kitamura
This document presents a new method called regularized superresolution-based nonnegative matrix factorization (NMF) for multichannel music signal separation. The proposed method addresses limitations of existing approaches like directional clustering and penalized supervised NMF. It utilizes directional clustering to separate sources by direction, then applies superresolution-based NMF to extrapolate missing components from the target source using supervised bases, regularized by an index matrix. An evaluation compares this hybrid approach to other methods, finding it achieves higher source separation quality in terms of SDR, SIR and SAR scores.
Divergence optimization in nonnegative matrix factorization with spectrogram ...Daichi Kitamura
The document proposes a new supervised nonnegative matrix factorization (SNMF) method and hybrid method for multichannel signal separation. It analyzes the optimal divergence criterion for the SNMF with spectrogram restoration ability. The key points are:
1. A generalized cost function is introduced to extend SNMF to optimize the divergence criterion.
2. Theoretical analysis based on a data generation model finds the optimal divergence for basis extrapolation in spectrogram restoration is around Euclidean distance.
3. Experiments show the proposed hybrid method using Euclidean distance outperforms other methods for both instantaneous mixtures and real recordings, achieving the best separation quality measured by signal-to-distortion ratio.
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Daichi Kitamura
Presented at The 2015 European Signal Processing Conference (EUSIPCO 2015, international conference)
Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari, "Relaxation of rank-1 spatial constraint in overdetermined blind source separation," Proceedings of The 2015 European Signal Processing Conference (EUSIPCO 2015), pp.1271-1275, Nice, France, September 2015 (Invited Special Session).
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Daichi Kitamura
Presented at Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2014 (APSIPA 2014, international conference)
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka, "Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration," Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2014 (APSIPA 2014), Siem Reap, Cambodia, December 2014 (invited paper).
Online divergence switching for superresolution-based nonnegative matrix fact...Daichi Kitamura
This document describes a proposed method for online divergence switching in a hybrid music source separation system. The hybrid system uses directional clustering for spatial separation followed by supervised nonnegative matrix factorization (SNMF) for spectral separation. The optimal divergence for SNMF depends on the amount of spectral gaps ("chasms") caused by directional clustering: Euclidean distance is preferred when there are many chasms, because it extrapolates the missing components better, and KL divergence is preferred when chasms are few. The proposed method divides the online spectrogram into blocks and selects the divergence for each block based on its chasm rate, allowing real-time adaptation that maintains high separation accuracy across source spatial conditions. Experiments show the proposed method outperforms using a single divergence.
Regularized superresolution-based binaural signal separation with nonnegative...Daichi Kitamura
Presented at 5th International Conference on 3D Systems and Applications (3DSA 2013) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Kiyohiro Shikano, Kazunobu Kondo, Yu Takahashi, "Regularized superresolution-based binaural signal separation with nonnegative matrix factorization," Proceedings of 5th International Conference on 3D Systems and Applications (3DSA 2013), S10-4, Osaka, Japan, June 2013.
Efficient initialization for nonnegative matrix factorization based on nonneg...Daichi Kitamura
Daichi Kitamura, Nobutaka Ono, "Efficient initialization for nonnegative matrix factorization based on nonnegative independent component analysis," The 15th International Workshop on Acoustic Signal Enhancement (IWAENC 2016), Xi'an, China, September 2016.
Depth estimation of sound images using directional clustering and activation-...Daichi Kitamura
Presented at 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014) (international conference)
Tomo Miyauchi, Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, "Depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization," Proceedings of 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014), pp.437-440, Hawaii, USA, March 2014 (Student Paper Award).
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
Daichi Kitamura, "Blind source separation based on independent low-rank matrix analysis and its extension to Student's t-distribution," Télécom ParisTech, Invited Lecture, September 4th, 2017.
The document describes a proposed hybrid method for multichannel signal separation using supervised nonnegative matrix factorization (SNMF). The method combines directional clustering for spatial separation with SNMF incorporating spectrogram restoration for spectral separation. Experiments show the hybrid method achieves better separation performance than conventional single-channel SNMF or multichannel NMF methods, as measured by signal-to-distortion ratio. The optimal divergence for the SNMF component involves a tradeoff between separation ability and ability to restore missing spectral components.
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
Daichi Kitamura, "Blind source separation based on independent low-rank matrix analysis and its extensions," Ohio State University, Invited Lecture, December 15th, 2017.
Audio Source Separation Based on Low-Rank Structure and Statistical Independence Daichi Kitamura
Daichi Kitamura presented his research on audio source separation. He discussed using low-rank modeling of spectrograms and non-negative matrix factorization to separate sources based on their structural properties in supervised settings. He also discussed using statistical independence between sources and the central limit theorem as the basis for blind source separation via independent component analysis. The talk covered applications of source separation, demonstrations of techniques, and challenges like basis mismatch for supervised methods and permutation problems for blind separation.
Prior distribution design for music bleeding-sound reduction based on nonnega...Kitamura Laboratory
Yusaku Mizobuchi, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Yu Takahashi, and Kazunobu Kondo, "Prior distribution design for music bleeding-sound reduction based on nonnegative matrix factorization," Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2021), pp. 651–658, Tokyo, Japan, December 2021.
Blind audio source separation based on time-frequency structure models Kitamura Laboratory
Daichi Kitamura, "Blind audio source separation based on time-frequency structure models," Invited Overview Session in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2021), Tokyo, Japan, December 2021.
This document proposes a flexible microphone array system using informed source separation methods for a rescue robot. It aims to detect victim speech in disaster areas using multiple microphones on the robot's flexible body. The proposed method uses supervised rank-1 nonnegative matrix factorization (NMF) and statistical signal estimation to address two key problems: ego-noise basis mismatch due to the robot's self-vibrations, and speech model ambiguity. Experiments show the proposed approach outperforms conventional independent vector analysis and single-channel NMF, improving speech detection even with mismatched ego-noise recordings.
Shoichi Koyama, Naoki Murata, and Hiroshi Saruwatari. "Super-resolution in sound field recording and reproduction based on sparse representation"
presented at 5th Joint Meeting Acoustical Society of America and Acoustical Society of Japan (28 Nov. - 2 Dec. 2016, Honolulu, USA)
Linear multichannel blind source separation based on time-frequency mask obta...Kitamura Laboratory
This document proposes a new method for linear multichannel blind source separation (BSS) based on time-frequency masks obtained from harmonic/percussive sound separation (HPSS). The proposed method applies HPSS independently to temporarily estimated sources to generate harmonic and percussive masks, then smooths the masks and uses them in time-frequency masking-based BSS. Experiments show the proposed method achieves higher source separation quality than single-channel HPSS and outperforms other multichannel BSS methods, demonstrating the effectiveness of integrating HPSS with multichannel BSS.
DNN-based permutation solver for frequency-domain independent component analy...Kitamura Laboratory
1) The document proposes a DNN-based method to solve the permutation problem in frequency-domain independent component analysis (FDICA) for audio source separation.
2) Conventional permutation solvers sometimes fail to correctly align the separated signal components across frequencies. The proposed method trains a DNN on simulated permutation data to learn how to align components.
3) In experiments separating reverberant speech mixtures, the proposed DNN-based method improved the signal-to-distortion ratio by about 8 dB, outperforming other techniques and approaching the upper limit of performance.
Shoichi Koyama, "Source-Location-Informed Sound Field Recording and Reproduction: A Generalization to Arrays of Arbitrary Geometry"
Presented in 2016 AES International Conference on Sound Field Control (July 18-20 2016, Guildford, UK)
The document proposes an improved method for audio signal separation using supervised nonnegative matrix factorization (NMF) with time-variant basis deformation. The key contributions are:
1. Classifying supervised bases into time-variant attack and sustain parts and applying different all-pole model-based deformations to each.
2. Introducing discriminative training to avoid overfitting the interference signal and better separate the target.
3. Presenting an iterative approximate algorithm that searches for deformation matrices that represent the target signal while also being constrained to fit the mixture signal.
4. Showing experimentally, on instrument mixtures, that the proposed method achieves a better signal-to-distortion ratio than previous supervised NMF techniques.
DNN-based frequency component prediction for frequency-domain audio source se...Kitamura Laboratory
The paper proposes a new framework that combines frequency-domain audio source separation with a DNN to achieve high-quality separation at lower computational cost. The framework applies multichannel NMF to separate sources in the low-frequency band. A DNN then predicts the separated source components in the high-frequency band from the low-frequency separated sources and the mixture. Experiments show that the mixture components help the DNN expand the bandwidth of the separated sources, and that the proposed framework achieves separation quality similar to fullband NMF at half the computational cost.
Experimental analysis of optimal window length for independent low-rank matri...Daichi Kitamura
Daichi Kitamura, Nobutaka Ono, and Hiroshi Saruwatari, "Experimental analysis of optimal window length for independent low-rank matrix analysis," Proceedings of The 2017 European Signal Processing Conference (EUSIPCO 2017), pp. 1210–1214, Kos, Greece, August 2017 (Invited Special Session).
Presented at 25th European Signal Processing Conference (EUSIPCO) 2017, "SS14: Multivariate Analysis for Audio Signal Source Enhancement," 14:30-16:10, August 30, 2017.
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...奈良先端大 情報科学研究科
This document summarizes a research presentation on an online divergence switching method for a hybrid source separation technique. The hybrid method combines directional clustering for spatial separation with supervised nonnegative matrix factorization (SNMF) for spectral separation. The proposed method switches between KL divergence and Euclidean distance for the SNMF, depending on the amount of spectral gaps from the directional clustering. When there are many gaps, Euclidean distance is better for basis extrapolation. When gaps are fewer, KL divergence gives better separation. In experiments, the proposed online switching method outperformed using only one divergence, achieving higher signal-to-distortion ratios for music source separation.
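The block-wise selection rule summarized above can be sketched as follows. This is only an illustration under stated assumptions, not the authors' implementation: the function name `select_divergences`, the binary index-matrix layout, and the threshold value of 0.5 are all hypothetical, and the chasm rate is assumed here to be the fraction of time-frequency bins removed by directional clustering within each block.

```python
import numpy as np

def select_divergences(index_matrix, block_len, threshold=0.5):
    """Pick a divergence label for each block of frames.

    index_matrix: binary array (frequency x time); 1 marks a bin kept by
    directional clustering, 0 marks a chasm (assumed layout).
    threshold: hypothetical chasm-rate boundary between the two divergences.
    """
    n_frames = index_matrix.shape[1]
    choices = []
    for start in range(0, n_frames, block_len):
        block = index_matrix[:, start:start + block_len]
        chasm_rate = 1.0 - block.mean()  # fraction of missing bins in this block
        # Many chasms: favour Euclidean distance (better basis extrapolation).
        # Few chasms: favour KL divergence (better separation).
        choices.append("EUC" if chasm_rate >= threshold else "KL")
    return choices

# Toy example: the first block is complete, the second is mostly chasms.
mask = np.ones((4, 6))
mask[:, 3:] = 0
mask[0, 3] = 1
print(select_divergences(mask, block_len=3))  # ['KL', 'EUC']
```

In a real system the threshold would presumably be tuned experimentally, and the chosen label would select which SNMF update rules are run for that block.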
Depth Estimation of Sound Images Using Directional Clustering and Activation...奈良先端大 情報科学研究科
The document proposes two methods for estimating the depth of sound images using direction of arrival (DOA) information extracted from stereo signals.
Method 1 estimates depth based on the shape of the DOA distribution modeled using a generalized Gaussian distribution (GGD), where a smoother distribution indicates a source is farther away.
Method 2 applies activation-shared nonnegative matrix factorization (NMF) to extract features while maintaining directional information and reduce noise interfering with DOA estimation. Conventional NMF generates artificial fluctuations, while shared activation preserves source direction.
An experiment calculates the correlation between estimated and reference depths on 6 datasets containing varying source combinations. Results show the proposed method achieves high correlation, confirming its efficacy over conventional methods.
This document summarizes a research talk on statistical-model-based speech enhancement techniques that aim to reduce noise without generating musical noise artifacts. The talk outlines conventional enhancement methods like spectral subtraction and Wiener filtering that often cause musical noise. It then proposes a biased minimum mean-square error estimator that can achieve a musical-noise-free state by introducing a bias parameter. Analysis and experiments show this method can reduce noise while keeping the kurtosis ratio fixed at 1.0 to prevent musical noise, outperforming other techniques in terms of speech quality. A strong speech prior model is found to limit achieving musical-noise-free states, so the prior must be carefully selected.
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...奈良先端大 情報科学研究科
This document proposes a sound field reproduction system that utilizes an image sensor to estimate a listener's position and orientation. It summarizes challenges with existing spectral division methods that rely on a fixed reference listening position. The proposed system applies an "equiangular filter" to the spectral division method to locally synthesize sound fields around the listener by limiting the spatial bandwidth, as estimated from the image sensor. Simulation experiments and subjective assessments show the proposed system can accurately reproduce sound fields at the listener's position regardless of frequency, with improved directional perception over conventional methods. However, it did not clearly outperform alternatives in terms of sound quality. Overall, the system aims to address limitations of existing approaches by adapting sound field synthesis based on estimated listener parameters.
A Generalization of Laplace Nonnegative Matrix Factorization and Its Multichan...Hiroki_Tanji
This paper proposes a generalization of nonnegative matrix factorization (NMF) and multichannel NMF based on the complex Bessel distribution. This distribution generalizes three statistical models, including the Gaussian, exponential-function Laplace, and Bessel-function Laplace distributions. Optimization algorithms are derived for the proposed Bessel-NMF and Bessel-multichannel NMF. Simulations on music signal separation show the proposed method achieves better source-to-distortion ratio improvements than competing methods, demonstrating the effectiveness of modeling super-Gaussian observations.
1) The document discusses a semi-supervised nonnegative matrix factorization method with a cosine penalty condition for audio source separation.
2) It proposes adding a cosine similarity penalty term to penalize similarity between basis matrices, to improve on existing penalized SNMF methods.
3) Experiments show the proposed method achieves higher source separation performance than existing methods, measured by average and median SDR values, but performance is sensitive to the weight coefficient: the optimum is sharply peaked.
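A penalty of the kind described in point 2 might look like the sketch below. The summary does not give the exact formula, so this form (the sum of cosine similarities over all pairs of columns from the two basis matrices) is an assumption, as are the names `cosine_penalty`, `F` (supervised bases) and `H` (other bases).

```python
import numpy as np

def cosine_penalty(F, H, eps=1e-12):
    """Sum of cosine similarities between every column of F and every column of H.

    Assumed form of the penalty: for nonnegative bases it is zero when the two
    basis sets are mutually orthogonal and grows as they become more alike.
    """
    Fn = F / (np.linalg.norm(F, axis=0, keepdims=True) + eps)  # unit-norm columns
    Hn = H / (np.linalg.norm(H, axis=0, keepdims=True) + eps)
    return float(np.sum(Fn.T @ Hn))

# Toy example: one shared direction between the two basis sets.
F = np.array([[1.0, 0.0],
              [0.0, 1.0]])
H = np.array([[1.0],
              [0.0]])
print(cosine_penalty(F, H))  # close to 1.0: only the first column of F matches H
```

In a penalized cost this term would be multiplied by the weight coefficient mentioned in point 3 and added to the factorization error.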
This document provides an introduction to blind source separation and non-negative matrix factorization. It describes blind source separation as a method to estimate original signals from observed mixed signals. Non-negative matrix factorization is introduced as a constraint-based approach to solving blind source separation using non-negativity. The alternating least squares algorithm is described for solving the non-negative matrix factorization problem. Experiments applying these methods to artificial and real image data are presented and discussed.
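As a rough illustration of the alternating least squares idea mentioned above, the sketch below alternates unconstrained least-squares solves with projection onto the nonnegative orthant. This is one common ALS variant and is assumed here; the summarized document may use a different formulation. The function name `nmf_als` and its parameters are hypothetical.

```python
import numpy as np

def nmf_als(V, rank, n_iter=200, seed=0):
    """Approximate V (nonnegative, m x n) as W @ H with W, H >= 0.

    Projected ALS: solve each least-squares subproblem without constraints,
    then clip negative entries to zero. Convergence is not guaranteed.
    """
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank))
    for _ in range(n_iter):
        # Fix W, solve min ||V - W H|| for H, then project onto H >= 0.
        H = np.maximum(np.linalg.lstsq(W, V, rcond=None)[0], 0.0)
        # Fix H, solve min ||V - W H|| for W, then project onto W >= 0.
        W = np.maximum(np.linalg.lstsq(H.T, V.T, rcond=None)[0].T, 0.0)
    return W, H

# Toy example on a random nonnegative matrix.
V = np.random.default_rng(1).random((6, 8))
W, H = nmf_als(V, rank=2)
print(W.shape, H.shape)  # (6, 2) (2, 8)
```

The clipping step is what keeps the factors nonnegative; multiplicative update rules are a common alternative that preserve nonnegativity without projection.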
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...Hiroki_Tanji
This paper proposes a generalization of nonnegative matrix factorization (NMF) and multichannel NMF based on the complex Bessel distribution. This distribution generalizes three statistical models, including the Gaussian, exponential-function Laplace, and Bessel-function Laplace distributions. Optimization algorithms are derived for the proposed Bessel-NMF and Bessel-multichannel NMF. Simulations on music signal separation show the proposed method achieves better source-to-distortion ratio improvements than competing methods, demonstrating the effectiveness of modeling super-Gaussian observations.
1) The document discusses a semi-supervised nonnegative matrix factorization method with a cosine penalty condition for audio source separation.
2) It proposes adding a cosine similarity penalty term to penalize similarity between basis matrices, to improve on existing penalized SNMF methods.
3) Experiments show the proposed method achieves higher source separation performance compared to existing methods, measured by average and median SDR values, but the optimal weight coefficient values are peaky.
This document provides an introduction to blind source separation and non-negative matrix factorization. It describes blind source separation as a method to estimate original signals from observed mixed signals. Non-negative matrix factorization is introduced as a constraint-based approach to solving blind source separation using non-negativity. The alternating least squares algorithm is described for solving the non-negative matrix factorization problem. Experiments applying these methods to artificial and real image data are presented and discussed.
Robust music signal separation based on supervised nonnegative matrix factorization with prevention of basis sharing
1. Robust Music Signal Separation Based on Supervised Nonnegative Matrix Factorization with Prevention of Basis Sharing
Daichi Kitamura, Hiroshi Saruwatari,
Kosuke Yagi, Kiyohiro Shikano
(Nara Institute of Science and Technology, Japan)
Yu Takahashi, Kazunobu Kondo
(Yamaha Corporation, Japan)
IEEE International Symposium on Signal Processing and Information Technology
December 12-15, 2013 - Athens, Greece
Session T.B3: Speech – Audio - Music
4. Background
• Sound signal separation
– extracts a target source from an observed mixed signal.
– Examples: speech and noise separation, extraction of a specific instrumental sound, etc.
• Typical methods for sound signal separation
– operate in the time-frequency (spectrogram) domain.
[Figure: spectrogram (time vs. frequency) containing a first tone and a second tone; separation extracts each tone.]
6. Nonnegative matrix factorization (NMF) [Lee, et al., 2001]
• NMF
– is a sparse representation algorithm.
– can extract significant features from the observed matrix.
– approximates the observed matrix Y (spectrogram, Ω × 𝑇) by the product of the basis matrix F (spectral patterns, Ω × 𝐾) and the activation matrix G (time-varying gains, 𝐾 × 𝑇): Y ≈ FG.
• It is difficult to cluster the bases as specific sources.
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
[Figure: the observed spectrogram (time vs. frequency, amplitude) decomposed into the basis matrix and the activation matrix.]
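The factorization Y ≈ FG described on this slide can be sketched with the well-known multiplicative updates for the squared Euclidean cost. This is only an illustrative sketch: the function names `nmf` and `frobenius_error`, the random initialization, and the iteration count are assumptions, not details taken from the slides.

```python
import numpy as np

def nmf(Y, K, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix Y (freq x time) as F @ G.

    F (freq x K) holds spectral patterns, G (K x time) holds gains.
    Uses multiplicative updates for the squared Euclidean cost.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, K)) + eps
    G = rng.random((K, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep F and G nonnegative.
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
    return F, G

def frobenius_error(Y, F, G):
    """Euclidean (Frobenius) reconstruction error of the model F @ G."""
    return float(np.linalg.norm(Y - F @ G))
```

Nothing in this unsupervised form says which columns of F belong to which source, which is the clustering difficulty the slide mentions.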
7. Supervised NMF (SNMF) [Smaragdis, et al., 2007]
• SNMF utilizes some sample sounds of the target.
– Training process: construct the supervised basis matrix F (spectral dictionary) from sample sounds of the target signal (e.g., a musical scale).
– Separation process: decompose the mixed signal as Y ≈ FG + HU, where F is fixed and G, H, and U are optimized; FG is the target signal and HU is the other signal.
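A minimal sketch of the separation process, assuming a squared Euclidean cost and multiplicative updates (the slides use a general β-divergence; this choice, the function name `snmf_separate`, and its arguments are illustrative assumptions). The supervised basis F is kept fixed while G, H, and U are updated.

```python
import numpy as np

def snmf_separate(Y, F, n_other, n_iter=200, eps=1e-12, seed=0):
    """Decompose Y ~= F @ G + H @ U with F fixed (supervised bases).

    Returns the estimated target spectrogram F @ G and the estimated
    spectrogram of the other sources H @ U.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    for _ in range(n_iter):
        X = F @ G + H @ U                      # current model of the mixture
        G *= (F.T @ Y) / (F.T @ X + eps)       # F itself is never updated
        X = F @ G + H @ U
        H *= (Y @ U.T) / (X @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ Y) / (H.T @ X + eps)
    return F @ G, H @ U
```

In this sketch nothing prevents H from learning patterns that are already in F.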
8. Problem of SNMF
• Basis sharing problem in SNMF
– There is no constraint between F and H.
– The other bases H may also have the target spectral patterns.
– The estimated target signal loses some of the target signal.
– The cost function is only defined as the distance between Y and FG + HU.
[Figure: if H also has the target basis, the target signal is split between the estimated target signal FG and the estimated other signals HU.]
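The following toy example illustrates why the cost function alone cannot prevent basis sharing. It assumes a single target basis vector that appears in both F and H, and a squared-error cost; all values are made up for illustration.

```python
import numpy as np

# One spectral pattern, present both in the supervised basis F
# and (undesirably) in the other basis H.
f = np.array([[1.0], [2.0], [0.5]])
F = f
H = f.copy()

# The observed signal contains only the target.
g_true = np.array([[3.0, 1.0, 4.0, 2.0]])
Y = f @ g_true

def sq_error(G, U):
    """Squared error between Y and the model F @ G + H @ U."""
    return float(np.sum((Y - (F @ G + H @ U)) ** 2))

# Case 1: all activation is assigned to the target part.
cost_all_target = sq_error(g_true, np.zeros_like(g_true))
# Case 2: the activation is split evenly between G and U.
cost_split = sq_error(0.5 * g_true, 0.5 * g_true)
# In case 2 the estimated target F @ G is only half of the true target.
target_split = F @ (0.5 * g_true)
```

Both cases fit Y equally well, so minimizing the cost cannot distinguish the correct assignment from the split one.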
9-11. Basis sharing problem: example of SNMF
[Figure: spectrograms of the mixed signal, the target signal only (oracle), and the signal separated by SNMF (estimated).]
The estimated signal loses some of the target components because of the basis sharing problem.
13. Proposed method: Penalized SNMF (PSNMF)
• In SNMF, the other basis matrix H may have the same spectral patterns as the supervised basis matrix F (basis sharing problem).
• We propose to make H as different as possible from F by introducing a penalty term in the cost function.
[Figure: the mixed signal is decomposed into the target signal FG (F fixed) and the other signal HU; H is optimized to be as different as possible from F.]
14. Decomposition model and cost function
Decomposition model: Y ≈ FG + HU, where F is the supervised basis matrix (fixed).
Cost function in SNMF: J_SNMF = Dβ(Y | FG + HU)
Generalized divergence function Dβ: β-divergence [Eguchi, et al., 2001]
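For reference, a commonly used form of the β-divergence can be sketched as below. The normalization on the slides may differ (for example, by a constant factor), so this is an assumed convention rather than the authors' exact definition; the function name and the `eps` guard are also illustrative.

```python
import numpy as np

def beta_divergence(Y, X, beta, eps=1e-12):
    """Sum of element-wise beta-divergences between Y and X.

    beta = 0: Itakura-Saito divergence
    beta = 1: (generalized) Kullback-Leibler divergence
    beta = 2: half of the squared Euclidean distance
    """
    Y = np.asarray(Y, dtype=float) + eps   # eps avoids log(0) and 0/0
    X = np.asarray(X, dtype=float) + eps
    if beta == 0:
        r = Y / X
        return float(np.sum(r - np.log(r) - 1.0))
    if beta == 1:
        return float(np.sum(Y * np.log(Y / X) - Y + X))
    num = Y**beta + (beta - 1.0) * X**beta - beta * Y * X**(beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))
```

With this convention the divergence is zero when the two arguments are equal and positive otherwise.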
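15. Decomposition model and cost function
Decomposition model: Y ≈ FG + HU, where F is the supervised basis matrix (fixed).
Cost function in SNMF: J_SNMF = Dβ(Y | FG + HU)
Cost function in PSNMF: the SNMF cost function plus an additive penalty term.
We propose two types of penalty terms, which give the cost functions J1 and J2.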
16. Orthogonality penalty
• The orthogonality penalty optimizes H so as to minimize the inner product of the matrices F and H.
– If H includes a basis similar to one in F, the penalty term becomes larger.
• All the bases are normalized to one.
• Introduce a weighting parameter μ1.
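One way to turn "the inner product of F and H" into a scalar penalty is the squared Frobenius norm of FᵀH, sketched below. The squared-norm form, the unit Euclidean normalization of each basis, and the names `normalize_columns` and `orthogonality_penalty` are assumptions made for illustration; the slide text does not spell them out.

```python
import numpy as np

def normalize_columns(B, eps=1e-12):
    """Scale every basis (column) to unit Euclidean norm (assumed norm)."""
    return B / (np.linalg.norm(B, axis=0, keepdims=True) + eps)

def orthogonality_penalty(F, H, mu1=1.0):
    """mu1 times the sum of squared inner products between bases of F and H.

    The value is large when some basis in H resembles a basis in F
    and zero when every pair of bases is orthogonal.
    """
    Fn = normalize_columns(F)
    Hn = normalize_columns(H)
    return mu1 * float(np.sum((Fn.T @ Hn) ** 2))
```

Because spectra are nonnegative, the inner products are nonnegative too, so a zero penalty means that F and H occupy disjoint frequency bins.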
17. Maximum-divergence penalty
• The maximum-divergence penalty optimizes H so as to maximize the divergence between F and H.
– If H includes a basis similar to one in F, the divergence becomes smaller.
• All the bases are normalized to one.
• Introduce a weighting parameter and a sensitivity parameter λ.
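Maximizing a divergence can be recast as minimizing a decreasing function of it. The sketch below uses an exponential of the negated divergence, scaled by a weight and a sensitivity parameter λ. The exact functional form on the slide is not recoverable here, so the formula, the function name, and the parameter names `mu2` and `lam` are assumptions.

```python
import math

def max_divergence_penalty(divergence, mu2=1.0, lam=1.0):
    """Penalty that shrinks as the divergence between F and H grows.

    divergence: a nonnegative divergence value between the two basis sets.
    mu2: weight of the penalty (hypothetical name).
    lam: sensitivity; a larger value makes the penalty decay faster.
    """
    return mu2 * math.exp(-lam * divergence)

# Penalty values for a few divergences and two sensitivities.
table = {
    (d, lam): max_divergence_penalty(d, lam=lam)
    for d in (0.0, 0.5, 2.0)
    for lam in (1.0, 5.0)
}
```

Under this form the penalty is bounded between 0 and the weight, which keeps it on a comparable scale to the main cost.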
18. Derivation of optimal variables in PSNMF
• Derive the optimal variables G, H, and U.
• Auxiliary function method
– Optimization scheme that uses an upper bound function.
– Design the auxiliary functions for J1 and J2 as J1+ and J2+.
– Minimize the original cost functions indirectly by minimizing the auxiliary functions.
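The general principle of the auxiliary function method can be written as follows. Here θ stands for the variables being optimized and θ̃ for the auxiliary variables; this notation is generic and not taken from the slides.

```latex
% Auxiliary function: an upper bound that touches the cost function.
J(\theta) \le J^{+}(\theta, \tilde{\theta}), \qquad
J(\theta) = \min_{\tilde{\theta}} J^{+}(\theta, \tilde{\theta}).

% Alternating minimization:
\tilde{\theta}^{(t+1)} = \arg\min_{\tilde{\theta}} J^{+}(\theta^{(t)}, \tilde{\theta}), \qquad
\theta^{(t+1)} = \arg\min_{\theta} J^{+}(\theta, \tilde{\theta}^{(t+1)}).

% The original cost never increases:
J(\theta^{(t+1)}) \le J^{+}(\theta^{(t+1)}, \tilde{\theta}^{(t+1)})
\le J^{+}(\theta^{(t)}, \tilde{\theta}^{(t+1)}) = J(\theta^{(t)}).
```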
19. Derivation of optimal variables in PSNMF
• The second and third terms of Dβ are convex or concave depending on the value of β.
– Convex: Jensen’s inequality
– Concave: tangent line inequality
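The two inequalities named on this slide have the following standard forms, written for a generic function φ and nonnegative terms x_k; the symbols α_k and c are auxiliary variables introduced here for illustration.

```latex
% Jensen's inequality (convex \varphi), with \alpha_k > 0 and \sum_k \alpha_k = 1:
\varphi\Bigl(\sum_k x_k\Bigr) \le \sum_k \alpha_k \, \varphi\Bigl(\frac{x_k}{\alpha_k}\Bigr),
\qquad \text{with equality when } \alpha_k = \frac{x_k}{\sum_j x_j}.

% Tangent line inequality (concave, differentiable \varphi):
\varphi(x) \le \varphi(c) + \varphi'(c)\,(x - c),
\qquad \text{with equality when } c = x.
```

In both cases the bound separates or linearizes the sum inside φ, which is what makes closed-form update rules possible.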
20. Derivation of optimal variables in PSNMF
• The orthogonality penalty term is always convex.
– Convex: Jensen’s inequality (upper bound P+, with an auxiliary variable)
21. Derivation of optimal variables in PSNMF
• The auxiliary functions J1+ and J2+ are designed from these upper bounds.
• The update rules for optimization are obtained from the derivatives of the auxiliary functions with respect to G, H, and U.
22. Update rules for optimization of PSNMF
• Update rules with orthogonality penalty
23. Update rules for optimization of PSNMF
• Update rules with maximum-divergence penalty
25. Experimental conditions
• Produced four melodies using a MIDI synthesizer.
• Used the same MIDI sounds of the target instruments, containing two octaves of notes, as the supervision sound.
• Evaluated the two-source case and the four-source case.
– There are 12 combinations in the two-source case and 4 patterns in the four-source case.
Training sound: notes spanning two octaves that cover all the notes of the target signal.
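One counting that is consistent with these numbers treats each case as a choice of target instrument plus interfering instrument(s). The slides do not state how the cases were enumerated, so this is an assumed reading; the instrument names come from the speaker notes.

```python
from itertools import permutations

instruments = ["clarinet", "oboe", "piano", "cello"]

# Two-source case: ordered pairs (target, interferer).
two_source_cases = list(permutations(instruments, 2))

# Four-source case: each instrument in turn is the target,
# and the remaining three are the interference.
four_source_cases = [
    (target, tuple(i for i in instruments if i != target))
    for target in instruments
]
```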
26. Experimental conditions
• Evaluation score [Vincent, 2006]
– Source-to-distortion ratio (SDR)
– SDR indicates the total quality of the separated signal.
Observed signal: mixture of 2 or 4 signals with the same power
Training signal: the same MIDI sounds as the target signal, containing two octaves of notes
Divergence criteria: all combinations of β and βm
Number of bases: supervised bases: 100; other bases: 50
Parameters: experimentally determined
Methods: conventional SNMF, proposed PSNMF
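As a rough illustration of what an SDR value measures, the sketch below computes a simplified ratio of reference energy to error energy in decibels. The evaluation cited on the slide decomposes the estimate into target, interference, and artifact components before forming the ratio, so this simplified function (with the assumed name `simple_sdr_db`) is not a substitute for that procedure.

```python
import numpy as np

def simple_sdr_db(reference, estimate, eps=1e-12):
    """Simplified SDR: energy of the reference over energy of the error, in dB."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = estimate - reference
    ratio = (np.sum(reference**2) + eps) / (np.sum(distortion**2) + eps)
    return float(10.0 * np.log10(ratio))
```

A higher value means that the separated signal is closer to the reference signal.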
29. Example of separation (Cello & Oboe)
[Figure: spectrograms of the mixed signal, the cello signal, the signal separated by SNMF, and the signal separated by PSNMF (Ortho.).]
30. Conclusions
• Conventional supervised NMF has a basis sharing problem that degrades the separation performance.
• We propose adding a penalty term to the cost function, which forces the other bases to become uncorrelated with the supervised bases.
• Penalized supervised NMF achieves high separation accuracy.
Thank you for your attention!
Editor's Notes
Good afternoon, everyone. I’m Daichi Kitamura from Nara Institute of Science and Technology, Japan.
Today I’d like to talk about robust music signal separation based on supervised nonnegative matrix factorization with prevention of basis sharing.
This is the outline of my talk.
First, I will talk about the research background.
Sound signal separation is a technique for extracting a target signal from an observed mixed signal.
Examples include speech and noise separation, extraction of a specific instrumental sound like this, and so on.
Sound signal separation is typically performed in the time-frequency domain, namely, in the spectrogram domain.
There are two tones in this spectrogram. So, if we can separate these tones like this, sound separation is achieved.
Next, I will explain the conventional methods.
As a means of extracting features from the spectrogram, nonnegative matrix factorization, NMF for short, has been proposed.
This is a sparse representation algorithm, and it can extract the significant features from the observed matrix.
NMF approximately decomposes the observed spectrogram Y into two nonnegative matrices F and G.
Here, the first matrix F has frequently appearing spectral patterns as its bases.
The other matrix G has the time-varying gains of each spectral pattern.
So, the matrix F is called the ‘basis matrix,’ and the matrix G is called the ‘activation matrix.’
Therefore, if we knew which bases correspond to the target signal, we could reconstruct a spectrogram that has only the target sound.
However, it is very difficult to cluster these bases as specific sources.
To solve this problem, supervised NMF, SNMF for short, has been proposed.
SNMF utilizes some sample sounds of the target signal as a supervision signal.
For example, if we want to separate the piano signal from this mixed signal, the musical scale sound of the same piano should be used as the supervision.
This sample sound is decomposed by simple NMF, and the supervised basis matrix F is constructed in the training process.
Then, in the separation process, the mixed signal is decomposed as FG+HU using the supervised bases F.
The matrix F is fixed, and the other matrices G, H, and U are optimized.
Finally, the target piano signal is separated as FG, and the other signals, such as saxophone and bass, are separated as HU.
However, SNMF has a problem called basis sharing.
In SNMF, there is no constraint between the supervised matrix F and the other matrix H.
Therefore, the other bases H may also have the target spectral patterns.
For example, suppose the target signal is represented by this basis and this activation.
The supervised matrix F has this target basis because it is a dictionary of the target signal.
If H also has this target basis, the activation is split between G and U like this.
So, part of the target signal is taken by HU, and the estimated signal loses some of the target signal.
This is because the cost function is only defined as the distance between Y and FG+HU.
Even if the target components are split like this, the value of the cost function doesn’t change.
The upper left spectrogram is the mixed signal.
The lower left one is a spectrogram that has only the target signal.
As you can see,
these components are the non-target signal.
If we separate this signal using SNMF,
the estimated signal loses some of the target components because of the basis sharing problem.
Next, I will talk about our proposed method.
In conventional SNMF, the other basis matrix may have the same spectral patterns as the supervised basis matrix F. This is the basis sharing problem.
To solve this problem, we propose to make the other basis H as different as possible from the supervised basis F by introducing a penalty term in the cost function.
We call this method penalized SNMF, PSNMF for short.
This is the decomposition model of PSNMF. It is the same as that of conventional SNMF.
The cost function in conventional SNMF is defined as the divergence between Y and FG+HU, as in this equation, where Dβ indicates the generalized divergence function, which includes Euclidean distance, Kullback-Leibler divergence, and Itakura-Saito divergence.
In our proposed method, we add one of two types of penalty terms to this cost function.
These equations, J1 and J2, are the cost functions in our proposed PSNMF.
I will explain these penalty terms in the following slides.
The first one is an orthogonality penalty.
This is the optimization of H that minimizes the inner product of the supervised basis F and the other basis H, like this.
If H includes a basis similar to one in F, this penalty term becomes larger.
So we can make H as different as possible from F by minimizing this term.
This minimization corresponds to the maximization of orthogonality between F and H.
All the bases are normalized to one to avoid arbitrariness of the scale.
In addition, we introduce a weighting parameter μ1.
The second one is a maximum-divergence penalty.
This penalty is the optimization of H that maximizes the divergence between F and H, where we use the β-divergence in this penalty.
If H includes a basis similar to one in F, the value of the divergence becomes smaller.
So we can make H as different as possible from F by maximizing this term.
Similarly, all the bases are normalized to one.
In addition, to treat this penalty as a minimization problem, we invert the sign and introduce an exponential function like this.
We then derive the optimal variables G, H, and U, which minimize these cost functions.
However, it is quite difficult to differentiate these functions directly, so we use an auxiliary function method.
This method is an optimization scheme that uses an upper bound function as the auxiliary function.
In this method, we design the auxiliary functions for the cost functions J1 and J2 as J1+ and J2+.
Then we can minimize the original cost functions indirectly by minimizing the auxiliary functions.
To design the auxiliary function, we have to derive the upper bounds for Dβ and the orthogonality penalty term.
First, we derive the upper bound for Dβ.
This divergence function is described like this.
The second and third terms become convex or concave functions depending on the value of β.
For a convex function, Jensen’s inequality can be used to derive the upper bound.
On the other hand, for a concave function, we can use the tangent line inequality.
The upper bound function JSNMF+ has quite a complex form, so please refer to my paper.
Next, we derive the upper bound for this term.
This term is always a convex function, so we can derive the upper bound P+ using Jensen’s inequality.
Finally, we can design the auxiliary functions J1+ and J2+ like this.
The update rules for optimization are obtained from these derivatives.
These are the update rules of PSNMF with orthogonality penalty.
This term corresponds to the orthogonality penalty.
These are the update rules of PSNMF with maximum-divergence penalty.
Similarly, this term corresponds to the maximum-divergence penalty.
Next, I will explain the experiments.
In the experiment, we produced four melodies using a MIDI synthesizer, as in this score.
The instruments were clarinet, oboe, piano, and cello.
We used the same MIDI sounds of the target instruments, containing two octaves of notes, as the supervision sound.
In addition, we evaluated the two-source case and the four-source case.
In the two-source case, the observed signal was produced by mixing two sources selected from the four instruments with the same power.
In the four-source case, we produced an observed signal that consisted of the four instruments with the same power.
There are 12 combinations in the two-source case, and 4 patterns in the four-source case.
The evaluation scores are averaged in each case.
This table shows the experimental conditions.
The divergence criterion β affects the separation accuracy.
So we used these values of β and βm, where βm is the divergence criterion for the maximum-divergence penalty.
(These values correspond to Itakura-Saito divergence, Kullback-Leibler divergence, and Euclidean distance, respectively.)
(The number of supervised bases K was 100, and the number of the other bases was 50.)
We compare the conventional SNMF and our proposed PSNMF.
In addition, we used the SDR value as the evaluation score.
SDR is the source-to-distortion ratio, which indicates the total quality of the separated signal.
This is the result of the two-source-case experiment.
We show the results for β=0, 1, and 2, respectively.
The blue bar is the conventional SNMF, the red one is the PSNMF with the orthogonality penalty, and the green ones are the PSNMF with the maximum-divergence penalty for various βm.
From this result, we can confirm that the conventional SNMF cannot achieve high separation accuracy because of the basis sharing problem.
But our proposed methods consistently outperform the conventional SNMF.
Both the orthogonality penalty and the maximum-divergence penalty can avoid the basis sharing problem.
This is the result of the four-source-case experiment.
(The total scores decrease because the number of interference signals increases in this case.)
However, PSNMF still outperforms the conventional method.
This is an example of the separation of cello and oboe sounds.
The cello sound separated by conventional SNMF loses some of the target components because of the basis sharing problem.
But our proposed PSNMF separates the target with high performance.
Finally, I’ll play the sounds.
This is a mixed signal of cello and oboe sounds.
The next one is only the cello sound.
This is the cello sound separated by conventional SNMF.
And this is the one separated by our proposed method.
These are my conclusions.
Thank you for your attention.
The supervised method has an inherent problem.
That is, we cannot get a perfect supervision sound for the target signal.
Even if the supervision sounds come from the same type of instrument as the target sound, these sounds differ according to various conditions.
For example, individual styles of playing, the timbre individuality of each instrument, and so on.
When we want to separate this piano sound from the mixed signal, we may only be able to prepare a similar piano sound whose timbre is slightly different.
However, supervised NMF cannot separate the target well because the spectra of the supervision sound differ from those of the target sound.
To solve this problem, we have proposed a new supervised method that adapts the supervised bases to the target spectra by a basis deformation.
This is the decomposition model in this method.
We introduce the deformable term, which has both positive and negative values like this.
Then we optimize the matrices D, G, H, and U.
This figure indicates the spectral difference between the real sound and the artificial sound.
This is an example of separation in the four-source case. The target sound is a clarinet.
If λ increases, the divergence penalty term changes as shown in this graph.
So, the parameter λ controls the sensitivity of the divergence penalty term.
The optimization of the variables F and G in NMF is based on the minimization of the cost function.
The cost function is defined as the divergence between the observed spectrogram Y and the reconstructed spectrogram FG.
This minimization is an inequality-constrained optimization problem.
SDR is the total evaluation score for the separation performance.