Regularized superresolution-based binaural signal separation with nonnegative matrix factorization

Daichi Kitamura, Assistant Professor at National Institute of Technology, Kagawa College
Regularized Superresolution-Based
Binaural Signal Separation
with Nonnegative Matrix Factorization
Daichi Kitamura, Hiroshi Saruwatari,
Yusuke Iwao, Kiyohiro Shikano
(Nara Institute of Science and Technology, Nara, Japan)
Kazunobu Kondo, Yu Takahashi
(Yamaha Corporation Research & Development Center, Shizuoka, Japan)
Outline
• 1. Research background
• 2. Conventional method
– Nonnegative matrix factorization
– Penalized supervised nonnegative matrix factorization
– Directional clustering
– Hybrid method
• 3. Proposed method
– Regularized superresolution-based nonnegative matrix
factorization
• 4. Experiments
• 5. Conclusions
Background
• Music signal separation technologies have received much attention.
– Applications: automatic music transcription, 3D audio systems, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) has been a very active area of research.
• The extraction performance of NMF degrades markedly when the mixture contains many sources.
We propose a new method for multichannel signal separation with NMF that utilizes both the spectral and spatial cues included in mixtures of multiple instruments.
NMF
• NMF is a type of sparse representation algorithm that
decomposes a nonnegative matrix into two nonnegative
matrices. [D. D. Lee, et al., 2001]
[Figure: the observed matrix 𝒀 (spectrogram, frequency × time) is decomposed into the basis matrix 𝑭 (spectral bases) and the activation matrix 𝑮 (time-varying gains), 𝒀 ≈ 𝑭𝑮.]
Ω: Number of frequency bins
𝑇: Number of frames
𝐾: Number of bases
𝒀: Observed matrix (Ω × 𝑇)
𝑭: Basis matrix (Ω × 𝐾)
𝑮: Activation matrix (𝐾 × 𝑇)
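The decomposition of the observed matrix into a basis matrix and an activation matrix can be sketched as follows. This is a minimal illustration that assumes the squared Euclidean distance and the standard multiplicative updates; the slides do not state which divergence is used, and the function name `nmf` and its parameters are made up for this example.

```python
import numpy as np

def nmf(Y, K, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix Y (Omega x T) as F (Omega x K) @ G (K x T)."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, K)) + eps   # spectral bases
    G = rng.random((K, n_frames)) + eps  # time-varying gains
    for _ in range(n_iter):
        # Multiplicative updates for the squared Euclidean distance ||Y - FG||^2.
        # They keep F and G nonnegative because every factor is nonnegative.
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
    return F, G

# Example: a synthetic "spectrogram" built from 4 nonnegative bases.
rng = np.random.default_rng(1)
Y = rng.random((64, 4)) @ rng.random((4, 100))
F, G = nmf(Y, K=4)
relative_error = np.linalg.norm(Y - F @ G) / np.linalg.norm(Y)
```

Each column of `F` plays the role of one spectral basis and each row of `G` is its gain over time.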
Penalized Supervised NMF (PSNMF)
• In PSNMF, the observed spectrogram is decomposed into a target part, represented by the supervised bases 𝑭, and a part for the other sounds, under the condition that 𝑭 is known in advance. [Yagi, et al., 2012]
– Training process: the supervised bases 𝑭 of the target sound are trained from the supervision sound.
– Separation process: the trained bases 𝑭 are fixed and the other matrices are updated. The bases of the other sounds are forced to become uncorrelated with 𝑭.
Problem of PSNMF: When the signal includes many sources, the extraction performance markedly degrades.
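The separation process with fixed supervised bases can be sketched as below. This is only an illustration under stated assumptions: it uses the squared Euclidean distance and a penalty of the form `mu * ||F.T @ H||_F^2` to discourage correlation with the supervised bases, which may differ from the divergence and penalty in [Yagi, et al., 2012]. The names `H`, `U`, `mu`, and `n_other` are hypothetical and do not come from the slides.

```python
import numpy as np

def penalized_snmf(Y, F, n_other=8, mu=1.0, n_iter=200, eps=1e-12, seed=0):
    """Approximate Y by F @ G + H @ U with the supervised bases F kept fixed.

    H (bases of the other sounds) is penalized by mu * ||F.T @ H||_F^2.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ Y) / (F.T @ X + eps)
        X = F @ G + H @ U
        U *= (H.T @ Y) / (H.T @ X + eps)
        X = F @ G + H @ U
        # The gradient of the penalty, mu * F @ F.T @ H, enters the denominator.
        H *= (Y @ U.T) / (X @ U.T + mu * (F @ (F.T @ H)) + eps)
    return G, H, U

# Example with synthetic data: the target estimate is F @ G.
rng = np.random.default_rng(2)
F_sup = rng.random((32, 5))
Y_mix = F_sup @ rng.random((5, 40)) + rng.random((32, 3)) @ rng.random((3, 40))
G_est, H_est, U_est = penalized_snmf(Y_mix, F_sup)
target_estimate = F_sup @ G_est
```

Increasing `mu` pushes `H` further away from the span of the supervised bases, at the price of a looser fit to the mixture.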
Directional Clustering
• Directional clustering estimates the sources and their directions in a multichannel signal. [Araki, et al., 2007] [Miyabe, et al., 2009]
• This method separates sources using the spatial information in an observed signal.
[Figure: scatter plot of the L-ch input signal against the R-ch input signal, showing the source components and the centroid vector of each cluster.]
Problem of directional clustering: This method cannot separate sources located in the same direction.
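One way to picture directional clustering is a k-means-style loop on the normalized (L-ch, R-ch) magnitude vector of every time-frequency bin. The sketch below assumes cosine similarity and centroids initialized evenly between left and right; the cited methods may use different features and initialization, and the function name is made up for this example.

```python
import numpy as np

def directional_clustering(L_mag, R_mag, n_clusters=3, n_iter=20):
    """Assign each time-frequency bin to a direction cluster.

    L_mag, R_mag: nonnegative magnitude spectrograms of the same shape.
    Returns (labels, centroids); label 0 is closest to left-only bins and
    label n_clusters - 1 is closest to right-only bins at initialization.
    """
    shape = L_mag.shape
    X = np.stack([L_mag.ravel(), R_mag.ravel()], axis=1)
    X = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), 1e-12)
    theta = np.linspace(0.0, np.pi / 2, n_clusters)
    C = np.stack([np.cos(theta), np.sin(theta)], axis=1)  # unit centroid vectors
    for _ in range(n_iter):
        labels = np.argmax(X @ C.T, axis=1)  # nearest centroid by cosine similarity
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members) > 0:
                m = members.mean(axis=0)
                n = np.linalg.norm(m)
                if n > 0:
                    C[c] = m / n
    return labels.reshape(shape), C

# Example: three bins panned hard left, center, and hard right.
L_mag = np.array([[1.0, 1.0, 0.0]])
R_mag = np.array([[0.0, 1.0, 1.0]])
labels, centroids = directional_clustering(L_mag, R_mag)
```

Two sources panned to the same direction give the same normalized vector, so they land in the same cluster, which is the limitation stated above.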
Hybrid method
• Conventional hybrid method utilizes PSNMF after the
directional clustering. [Iwao, et al., 2012]
• This method consists of two techniques.
– Directional clustering
– PSNMF
[Figure: conventional hybrid method. The L-ch and R-ch inputs pass through directional clustering (spatial separation) and then PSNMF (source separation).]
Problem of hybrid method
• The signal extracted by the hybrid method suffers from the
generation of considerable distortion due to the binary
masking in directional clustering.
• The signal in the target direction, which is obtained by
directional clustering, has many spectral chasms.
• The resolution of the spectrogram is degraded.
[Figure: directional clustering as binary masking. The input spectrogram (frequency × time) is multiplied element-wise (Hadamard product) by a binary mask, whose entries are 1 for the target direction and 0 for the other directions, to give the separated cluster.]
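The binary masking step can be illustrated with a tiny example. The values and the variable names are invented for illustration; only the Hadamard product with a 0/1 mask comes from the slide.

```python
import numpy as np

# Toy magnitude spectrogram (2 frequency bins x 3 frames).
Y = np.array([[3.0, 1.0, 2.0],
              [4.0, 5.0, 6.0]])

# Cluster label of each bin, e.g. from directional clustering (1 = target direction).
labels = np.array([[1, 0, 1],
                   [2, 1, 1]])
target = 1

mask = (labels == target).astype(float)  # binary mask: 1 = target, 0 = other
separated = mask * Y                     # Hadamard (element-wise) product

# Bins set to zero by the mask are the spectral chasms.
n_chasms_per_frame = (1.0 - mask).sum(axis=0)
```

Every zero in `mask` removes the whole bin, including any target energy it contained, which is why the separated cluster has holes.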
Proposed hybrid method
[Figure: signal flow of the two hybrid methods.
Conventional hybrid method: input stereo signal (L-ch, R-ch) → STFT → directional clustering → center component (L-ch, R-ch) → PSNMF on each channel → ISTFT → mixing → extracted signal.
Proposed hybrid method: input stereo signal (L-ch, R-ch) → STFT → directional clustering → center component (L-ch, R-ch) and index of center cluster → superresolution-based SNMF on each channel → ISTFT → mixing → extracted signal.]
Employ a new supervised NMF algorithm as an alternative
to the conventional PSNMF in the hybrid method.
Regularized superresolution-based NMF
• In the proposed supervised NMF, the spectral chasms are treated as unseen observations using an index matrix.
[Figure: the separated cluster (frequency × time) contains chasms. The binary index matrix is 0 at the chasms, so that they are treated as unseen observations.]
Regularized superresolution-based NMF
• The spectrogram of the target sound is reconstructed using
more matched bases because chasms are treated as unseen.
• The components of the target sound lost after directional
clustering can be extrapolated using supervised bases.
[Figure: superresolution using the supervised bases turns the separated cluster, which contains chasms, into the reconstructed spectrogram.]
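The idea of fitting only the seen entries and letting the supervised bases fill the chasms can be sketched as follows. This is a simplified illustration, not the update rules of the slides: it assumes the squared Euclidean distance, keeps only the supervised bases `F` (no bases for the other sounds, no penalty, no regularization), and uses made-up function names.

```python
import numpy as np

def masked_cost(Y, I, F, G):
    """Squared error counted only where the index matrix I is 1 (seen entries)."""
    return float(np.sum(I * (Y - F @ G) ** 2))

def superresolution_activations(Y, I, F, n_iter=300, eps=1e-12, seed=0):
    """Estimate activations G from the seen entries only, with F fixed."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + eps
    for _ in range(n_iter):
        # Weighted multiplicative update: chasms (I == 0) do not contribute.
        G *= (F.T @ (I * Y)) / (F.T @ (I * (F @ G)) + eps)
    return G

# Example: hide about 30% of a synthetic target spectrogram, then extrapolate it.
rng = np.random.default_rng(3)
F_sup = rng.random((30, 4))
Y_full = F_sup @ rng.random((4, 20))
I = (rng.random((30, 20)) < 0.7).astype(float)  # 1 = seen, 0 = chasm
Y_seen = I * Y_full
G_est = superresolution_activations(Y_seen, I, F_sup)
reconstructed = F_sup @ G_est                   # also defined inside the chasms
```

Because `reconstructed` is the product of the bases and the activations, it has values at the chasm positions even though those positions never entered the cost.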
Regularized superresolution-based NMF
• Signal flow of the proposed hybrid method
[Figure: frequency of source components plotted against direction (left, center, right) at three stages.
(a) Observed spectra: the target source lies in the target (center) direction.
(b) After directional clustering: the center sources lose some of their components.
(c) After superresolution-based SNMF: the target source is extrapolated.]
Regularized superresolution-based NMF
• The basis extrapolation has an underlying problem.
• If almost all time-frequency components of a frame are unseen, which means that the indexes in that frame are almost all zero, a large extrapolation error may occur.
• It is necessary to regularize the extrapolation.
[Figure: separated cluster with an almost unseen frame, and a spectrogram (time 0 to 4 s, frequency 0 to 4 kHz) showing the resulting extrapolation error, which incorrectly modifies the activation.]
Regularized superresolution-based NMF
• We propose two types of regularizations.
– Regularization of the temporal continuity (each frame is compared with the previous frame)
– Regularization of the norm minimization
𝑰: Index matrix
Overline (‾): Binary complement
𝑖𝜔,𝑡: Entry of index matrix 𝑰
𝑔𝑘,𝑡: Entry of matrix 𝑮
𝑓𝜔,𝑘: Entry of matrix 𝑭
The intensity of these regularizations is proportional to the number of chasms in each frame.
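The exact regularization formulas are not preserved in this transcript, so the sketch below shows only one plausible form of each term, built from the symbols defined above. Both penalties are assumptions made for illustration: the per-frame weight is the number of chasms (the column sum of the binary complement of 𝑰), and the function names are made up.

```python
import numpy as np

def chasms_per_frame(I):
    """Number of unseen entries in each frame (column sums of the complement of I)."""
    return (1.0 - I).sum(axis=0)

def temporal_continuity_penalty(I, G):
    """Assumed form: squared change of each activation from the previous frame,
    weighted by the number of chasms in the current frame."""
    w = chasms_per_frame(I)[1:]
    diff = G[:, 1:] - G[:, :-1]
    return float(np.sum(w * diff ** 2))

def norm_penalty(I, F, G):
    """Assumed form: squared magnitude of the model F @ G inside the chasms,
    so frames with more chasms are penalized more strongly."""
    return float(np.sum((1.0 - I) * (F @ G) ** 2))

# Example: frame 0 is fully seen, frame 1 is fully unseen.
I = np.array([[1.0, 0.0],
              [1.0, 0.0]])
F = np.array([[1.0],
              [2.0]])
G = np.array([[1.0, 3.0]])
tc = temporal_continuity_penalty(I, G)
nm = norm_penalty(I, F, G)
```

In both forms a fully seen frame contributes nothing, so the regularization acts only where extrapolation is actually taking place.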
Regularized superresolution-based NMF
• The cost function in regularized superresolution-based NMF is defined using the index matrix and includes the following terms:
– Regularization term
– Penalty term to force the supervised bases 𝑭 and the bases of the other sounds to become uncorrelated with each other
– Weighting parameters
Regularized superresolution-based NMF
• The update rules that minimize the cost function are obtained
as follows:
Evaluation experiment
• We compared four methods.
– Conventional hybrid method using PSNMF (Conventional method)
– Proposed hybrid method using superresolution-based NMF without
regularization (Proposed method 1)
– Proposed hybrid method using superresolution-based NMF with
regularization of the temporal continuity (Proposed method 2)
– Proposed hybrid method using superresolution-based NMF with
regularization of the norm minimization (Proposed method 3)
Evaluation experiment
• We used stereo-panning signals and binaural-recorded signals containing four instruments, oboe (Ob.), flute (Fl.), trombone (Tb.), and piano (Pf.), generated by a MIDI synthesizer.
• The sources were mixed with equal power.
• The target source is always located in the center direction (No. 1).
• We used the same type of MIDI sounds of the target instruments as supervision for the training process.
– Supervision sound: two-octave notes that cover all notes of the target signal.
[Figure: layout of source positions 1 to 4 between left, center, and right; the target source is No. 1.]
Experimental results (panning signal)
• Average SDR, SIR, and SAR scores for each method, where the 4
instruments are shuffled with 12 combinations.
[Figure: bar charts of SDR, SIR, and SAR in dB for each method; higher is better.]
SDR: quality of the separated target sound
SIR: degree of separation between the target and other sounds
SAR: absence of artificial distortion
Proposed method 1: no regularization
Proposed method 2: regularization of temporal continuity
Proposed method 3: regularization of norm minimization
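As a rough intuition for these scores, the sketch below computes a simplified distortion ratio in dB between a reference signal and an estimate. It is an assumption-laden stand-in: SDR, SIR, and SAR are commonly obtained by first decomposing the estimate into target, interference, and artifact components, and this example skips that decomposition entirely. The function name is made up.

```python
import numpy as np

def simple_sdr_db(reference, estimate, eps=1e-12):
    """Energy of the reference over energy of the total error, in dB.

    Simplified stand-in for SDR: the error is not split into
    interference and artifact parts.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = estimate - reference
    return 10.0 * np.log10((np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps))

# Example: an error with 1/100 of the reference energy gives about 20 dB.
reference = np.ones(100)
estimate = reference + 0.1
score = simple_sdr_db(reference, estimate)
```

A higher value means the estimate is closer to the reference, which matches the "higher is better" reading of the charts.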
Experimental results (binaural signal)
• Average SDR, SIR, and SAR scores for each method, where the 4
instruments are shuffled with 12 combinations.
[Figure: bar charts of SDR, SIR, and SAR in dB for each method; higher is better.]
SDR: quality of the separated target sound
SIR: degree of separation between the target and other sounds
SAR: absence of artificial distortion
Proposed method 1: no regularization
Proposed method 2: regularization of temporal continuity
Proposed method 3: regularization of norm minimization
Conclusions
• We propose a new superresolution-based supervised NMF algorithm for the hybrid method to separate stereo or binaural signals.
• The proposed hybrid method separates the target signal with higher performance than the conventional method.
• The regularization of norm minimization is effective for the proposed supervised NMF algorithm.
Thank you for your attention!
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも) by
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)Daichi Kitamura
2.8K views67 slides
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank... by
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...Daichi Kitamura
8.3K views67 slides
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s... by
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...Daichi Kitamura
4.1K views26 slides
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen... by
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...Daichi Kitamura
2.1K views15 slides

More from Daichi Kitamura(10)

独立低ランク行列分析に基づく音源分離とその発展(Audio source separation based on independent low-rank... by Daichi Kitamura
独立低ランク行列分析に基づく音源分離とその発展(Audio source separation based on independent low-rank...独立低ランク行列分析に基づく音源分離とその発展(Audio source separation based on independent low-rank...
独立低ランク行列分析に基づく音源分離とその発展(Audio source separation based on independent low-rank...
Daichi Kitamura1.5K views
スペクトログラム無矛盾性を用いた 独立低ランク行列分析の実験的評価 by Daichi Kitamura
スペクトログラム無矛盾性を用いた独立低ランク行列分析の実験的評価スペクトログラム無矛盾性を用いた独立低ランク行列分析の実験的評価
スペクトログラム無矛盾性を用いた 独立低ランク行列分析の実験的評価
Daichi Kitamura1.1K views
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも) by Daichi Kitamura
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)
Windowsマシン上でVisual Studio Codeとpipenvを使ってPythonの仮想実行環境を構築する方法(Jupyter notebookも)
Daichi Kitamura2.8K views
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank... by Daichi Kitamura
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...
独立低ランク行列分析に基づくブラインド音源分離(Blind source separation based on independent low-rank...
Daichi Kitamura8.3K views
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s... by Daichi Kitamura
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...
独立深層学習行列分析に基づく多チャネル音源分離の実験的評価(Experimental evaluation of multichannel audio s...
Daichi Kitamura4.1K views
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen... by Daichi Kitamura
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...
独立深層学習行列分析に基づく多チャネル音源分離(Multichannel audio source separation based on indepen...
Daichi Kitamura2.1K views
近接分離最適化によるブラインド⾳源分離(Blind source separation via proximal splitting algorithm) by Daichi Kitamura
近接分離最適化によるブラインド⾳源分離(Blind source separation via proximal splitting algorithm)近接分離最適化によるブラインド⾳源分離(Blind source separation via proximal splitting algorithm)
近接分離最適化によるブラインド⾳源分離(Blind source separation via proximal splitting algorithm)
Daichi Kitamura2K views
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法 by Daichi Kitamura
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
Daichi Kitamura3.5K views
音源分離における音響モデリング(Acoustic modeling in audio source separation) by Daichi Kitamura
音源分離における音響モデリング(Acoustic modeling in audio source separation)音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)
Daichi Kitamura22.6K views
ICASSP2017読み会(関東編)・AASP_L3(北村担当分) by Daichi Kitamura
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
Daichi Kitamura4K views

Recently uploaded

Ansari: Practical experiences with an LLM-based Islamic Assistant by
Ansari: Practical experiences with an LLM-based Islamic AssistantAnsari: Practical experiences with an LLM-based Islamic Assistant
Ansari: Practical experiences with an LLM-based Islamic AssistantM Waleed Kadous
9 views29 slides
GPS Survery Presentation/ Slides by
GPS Survery Presentation/ SlidesGPS Survery Presentation/ Slides
GPS Survery Presentation/ SlidesOmarFarukEmon1
7 views13 slides
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc... by
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...csegroupvn
8 views210 slides
MongoDB.pdf by
MongoDB.pdfMongoDB.pdf
MongoDB.pdfArthyR3
49 views6 slides
Design_Discover_Develop_Campaign.pptx by
Design_Discover_Develop_Campaign.pptxDesign_Discover_Develop_Campaign.pptx
Design_Discover_Develop_Campaign.pptxShivanshSeth6
49 views20 slides
DESIGN OF SPRINGS-UNIT4.pptx by
DESIGN OF SPRINGS-UNIT4.pptxDESIGN OF SPRINGS-UNIT4.pptx
DESIGN OF SPRINGS-UNIT4.pptxgopinathcreddy
21 views47 slides

Recently uploaded(20)

Ansari: Practical experiences with an LLM-based Islamic Assistant by M Waleed Kadous
Ansari: Practical experiences with an LLM-based Islamic AssistantAnsari: Practical experiences with an LLM-based Islamic Assistant
Ansari: Practical experiences with an LLM-based Islamic Assistant
M Waleed Kadous9 views
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc... by csegroupvn
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
csegroupvn8 views
MongoDB.pdf by ArthyR3
MongoDB.pdfMongoDB.pdf
MongoDB.pdf
ArthyR349 views
Design_Discover_Develop_Campaign.pptx by ShivanshSeth6
Design_Discover_Develop_Campaign.pptxDesign_Discover_Develop_Campaign.pptx
Design_Discover_Develop_Campaign.pptx
ShivanshSeth649 views
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf by AlhamduKure
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdfASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf
AlhamduKure8 views
Unlocking Research Visibility.pdf by KhatirNaima
Unlocking Research Visibility.pdfUnlocking Research Visibility.pdf
Unlocking Research Visibility.pdf
KhatirNaima10 views
REACTJS.pdf by ArthyR3
REACTJS.pdfREACTJS.pdf
REACTJS.pdf
ArthyR337 views
Créativité dans le design mécanique à l’aide de l’optimisation topologique by LIEGE CREATIVE
Créativité dans le design mécanique à l’aide de l’optimisation topologiqueCréativité dans le design mécanique à l’aide de l’optimisation topologique
Créativité dans le design mécanique à l’aide de l’optimisation topologique
LIEGE CREATIVE8 views
Web Dev Session 1.pptx by VedVekhande
Web Dev Session 1.pptxWeb Dev Session 1.pptx
Web Dev Session 1.pptx
VedVekhande17 views
2023Dec ASU Wang NETR Group Research Focus and Facility Overview.pptx by lwang78
2023Dec ASU Wang NETR Group Research Focus and Facility Overview.pptx2023Dec ASU Wang NETR Group Research Focus and Facility Overview.pptx
2023Dec ASU Wang NETR Group Research Focus and Facility Overview.pptx
lwang78180 views
Integrating Sustainable Development Goals (SDGs) in School Education by SheetalTank1
Integrating Sustainable Development Goals (SDGs) in School EducationIntegrating Sustainable Development Goals (SDGs) in School Education
Integrating Sustainable Development Goals (SDGs) in School Education
SheetalTank19 views

Regularized superresolution-based binaural signal separation with nonnegative matrix factorization

  • 1. Regularized Superresolution-Based Binaural Signal Separation with Nonnegative Matrix Factorization Daichi Kitamura, Hiroshi Saruwatari, Yusuke Iwao, Kiyohiro Shikano (Nara Institute of Science and Technology, Nara, Japan) Kazunobu Kondo, Yu Takahashi (Yamaha Corporation Research & Development Center, Shizuoka, Japan)
  • 2. Outline • 1. Research background • 2. Conventional method – Nonnegative matrix factorization – Penalized supervised nonnegative matrix factorization – Directional clustering – Hybrid method • 3. Proposed method – Regularized superresolution-based nonnegative matrix factorization • 4. Experiments • 5. Conclusions
  • 4. Background • Music signal separation technologies have received much attention. Applications: automatic music transcription, 3D audio systems, etc. • Music signal separation based on nonnegative matrix factorization (NMF) has been a very active area of research. • The extraction performance of NMF markedly degrades for mixtures of many sources. • We propose a new method for multichannel signal separation with NMF utilizing both spectral and spatial cues included in mixtures of multiple instruments.
  • 6. NMF • NMF is a type of sparse representation algorithm that decomposes a nonnegative matrix into two nonnegative matrices: 𝒀 ≈ 𝑭𝑮. [D. D. Lee, et al., 2001] • 𝒀 (Ω × 𝑇): observed matrix (spectrogram) • 𝑭 (Ω × 𝐾): basis matrix (spectral bases) • 𝑮 (𝐾 × 𝑇): activation matrix (time-varying gains) • Ω: number of frequency bins, 𝑇: number of frames, 𝐾: number of bases
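
The decomposition on slide 6 can be sketched with the multiplicative update rules of Lee and Seung. This is a minimal illustration, not the authors' code: the squared Euclidean cost, the function name `nmf`, and the random toy spectrogram are all assumptions made for the example.

```python
import numpy as np

def nmf(Y, K, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix Y (freq x frames) as F @ G.

    Assumes the squared Euclidean cost ||Y - F G||^2; the slides do not
    state which divergence the authors use.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, K)) + eps      # basis matrix (spectral bases)
    G = rng.random((K, n_frames)) + eps    # activation matrix (gains)
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G

# Toy "spectrogram": an exactly rank-3 nonnegative matrix.
rng = np.random.default_rng(1)
Y = rng.random((64, 3)) @ rng.random((3, 100))
F, G = nmf(Y, K=3)
rel_err = np.linalg.norm(Y - F @ G) / np.linalg.norm(Y)
```

In the supervised setting described next, the same routine would be run on the supervision sound alone to obtain the trained bases 𝑭.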
  • 7. Penalized Supervised NMF (PSNMF) • In PSNMF, the decomposition 𝒀 ≈ 𝑭𝑮 + 𝑯𝑼 is addressed under the condition that 𝑭 is known in advance. [Yagi, et al., 2012] • Training process: the supervised bases 𝑭 of the target sound are trained on the supervision sound. • Separation process: fix the trained bases 𝑭 and update 𝑮, 𝑯, and 𝑼; 𝑯 is forced to become uncorrelated with 𝑭.
  • 8. Penalized Supervised NMF (PSNMF) • Problem of PSNMF: when the signal includes many sources, the extraction performance markedly degrades.
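
The separation process of PSNMF can be sketched as follows. The slides do not show the update rules, so this example assumes a squared Euclidean cost with the penalty μ‖𝑭ᵀ𝑯‖² and uses heuristic multiplicative updates derived from it; `psnmf_separate`, `n_other`, and `mu` are hypothetical names, and the random 𝑭 merely stands in for bases trained beforehand.

```python
import numpy as np

def psnmf_separate(Y, F, n_other=3, mu=0.1, n_iter=200, eps=1e-12, seed=0):
    """Fit Y ~ F G + H U with F fixed (assumed Euclidean cost).

    The term mu * ||F^T H||^2 discourages H from duplicating the
    supervised bases F.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ Y) / (F.T @ X + eps)
        X = F @ G + H @ U
        # The penalty gradient mu * F F^T H enters the denominator.
        H *= (Y @ U.T) / (X @ U.T + mu * (F @ (F.T @ H)) + eps)
        X = F @ G + H @ U
        U *= (H.T @ Y) / (H.T @ X + eps)
    return G, H, U

rng = np.random.default_rng(0)
F = rng.random((32, 2))                       # stand-in for trained bases
Y = F @ rng.random((2, 50)) + rng.random((32, 3)) @ rng.random((3, 50))
G, H, U = psnmf_separate(Y, F)
target_hat = F @ G                            # estimate of the target part
rel_err = np.linalg.norm(Y - (F @ G + H @ U)) / np.linalg.norm(Y)
```

Only 𝑮, 𝑯, and 𝑼 are updated; 𝑭 is never modified, which is what makes the method supervised.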
  • 9. Directional Clustering • Directional clustering can estimate sources and their directions in a multichannel signal. [Araki, et al., 2007] [Miyabe, et al., 2009] • This method can separate sources using spatial information in an observed signal. [Figure: source components and centroid vectors plotted as L-ch input signal versus R-ch input signal]
  • 10. Directional Clustering • Problem of directional clustering: this method cannot separate sources in the same direction.
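
One simple way to realize directional clustering is k-means on the unit-normalized amplitude vectors [|L|, |R|] of each time-frequency bin, using cosine similarity. This is a swapped-in, simplified stand-in for the cited methods, whose details are not given on the slides; the function name, the centroid initialization, and the toy panning gains are assumptions.

```python
import numpy as np

def directional_clustering(L_mag, R_mag, n_clusters=3, n_iter=50, eps=1e-12):
    """Assign each time-frequency bin to a direction cluster.

    Cluster 0 is left-dominant and cluster n_clusters - 1 is
    right-dominant, because centroids start evenly spaced in angle.
    """
    X = np.stack([L_mag.ravel(), R_mag.ravel()], axis=1)
    X = X / (np.linalg.norm(X, axis=1, keepdims=True) + eps)
    angles = np.linspace(0.0, np.pi / 2, n_clusters)
    C = np.stack([np.cos(angles), np.sin(angles)], axis=1)  # centroid vectors
    for _ in range(n_iter):
        labels = np.argmax(X @ C.T, axis=1)       # nearest centroid by cosine
        for c in range(n_clusters):
            members = X[labels == c]
            if len(members) > 0:
                m = members.sum(axis=0)
                C[c] = m / (np.linalg.norm(m) + eps)
    labels = np.argmax(X @ C.T, axis=1)
    return labels.reshape(L_mag.shape), C

# Toy stereo mixture: every bin is dominated by one of three panned sources.
rng = np.random.default_rng(0)
shape = (16, 20)
owner = rng.integers(0, 3, size=shape)            # 0: left, 1: center, 2: right
gains = np.array([[1.0, 0.1], [0.7, 0.7], [0.1, 1.0]])
amp = rng.random(shape) + 0.1
L_mag = amp * gains[owner, 0]
R_mag = amp * gains[owner, 1]
labels, centroids = directional_clustering(L_mag, R_mag)
center_mask = (labels == 1).astype(float)         # binary mask of the center
```

Two sources with the same panning gains would land in the same cluster, which is the limitation stated on slide 10.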
  • 11. Hybrid method • The conventional hybrid method applies PSNMF after directional clustering. [Iwao, et al., 2012] • This method consists of two techniques: directional clustering (spatial separation) and PSNMF (source separation). [Figure: L-ch and R-ch input → directional clustering → PSNMF]
  • 12. Problem of hybrid method • The signal extracted by the hybrid method suffers from considerable distortion due to the binary masking in directional clustering. • The signal in the target direction, which is obtained by directional clustering, has many spectral chasms. • The resolution of the spectrogram is degraded. [Figure: separated cluster = input spectrogram ∘ binary mask, where ∘ is the Hadamard product (product of each element); mask entries are 1 for the target direction and 0 for other directions; axes: time versus frequency]
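
The binary masking on slide 12 amounts to an element-wise product. The short sketch below uses random stand-in data (an assumption, not the authors' signals) to show how zeros in the mask become spectral chasms and how the fraction of chasms per frame can be counted.

```python
import numpy as np

rng = np.random.default_rng(0)
spectrogram = rng.random((6, 7)) + 0.1              # stand-in amplitude spectrogram
mask = (rng.random((6, 7)) > 0.5).astype(float)     # 1: target direction, 0: other

# Hadamard product: bins assigned to other directions are zeroed out.
cluster = mask * spectrogram

# Fraction of chasms (masked-out bins) in each time frame.
chasm_ratio_per_frame = 1.0 - mask.mean(axis=0)
```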
  • 14. Proposed hybrid method • Employ a new supervised NMF algorithm as an alternative to the conventional PSNMF in the hybrid method. • Conventional hybrid method: input stereo signal (L-ch, R-ch) → STFT → directional clustering → center component (L-ch, R-ch) → PSNMF on each channel → ISTFT → mixing → extracted signal. • Proposed hybrid method: input stereo signal (L-ch, R-ch) → STFT → directional clustering → center component (L-ch, R-ch) and index of the center cluster → superresolution-based SNMF on each channel → ISTFT → mixing → extracted signal.
  • 15. Regularized superresolution-based NMF • In the proposed supervised NMF, the spectral chasms are treated as unseen observations using an index matrix. [Figure: separated cluster with chasms and the corresponding binary index matrix; axes: time versus frequency]
  • 16. Regularized superresolution-based NMF • The spectrogram of the target sound is reconstructed using better-matched bases because chasms are treated as unseen. • The components of the target sound lost after directional clustering can be extrapolated using supervised bases. [Figure: separated cluster with chasms → superresolution using supervised bases → reconstructed spectrogram]
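
Treating chasms as unseen observations can be sketched by weighting the cost with the index matrix, so that only bins with index 1 influence the fit, and then evaluating 𝑭𝑮 on every bin to extrapolate into the chasms. This is an illustration under assumptions: a squared Euclidean cost, heuristic multiplicative updates, no regularization term, and hypothetical names (`superres_snmf`, `n_other`, `mu`); the authors' actual update rules are not shown in the transcript.

```python
import numpy as np

def superres_snmf(Y, I, F, n_other=2, mu=0.1, n_iter=300, eps=1e-12, seed=0):
    """Fit I * Y ~ I * (F G + H U) with F fixed; I is the binary index matrix."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    IY = I * Y
    for _ in range(n_iter):
        IX = I * (F @ G + H @ U)          # model restricted to observed bins
        G *= (F.T @ IY) / (F.T @ IX + eps)
        IX = I * (F @ G + H @ U)
        H *= (IY @ U.T) / (IX @ U.T + mu * (F @ (F.T @ H)) + eps)
        IX = I * (F @ G + H @ U)
        U *= (H.T @ IY) / (H.T @ IX + eps)
    return G, H, U

# Toy center cluster: target plus leaked interference, with 30% chasms.
rng = np.random.default_rng(1)
F = rng.random((32, 3))                               # stand-in trained bases
target = F @ rng.random((3, 40))
leak = 0.3 * (rng.random((32, 2)) @ rng.random((2, 40)))
I = (rng.random((32, 40)) < 0.7).astype(float)        # 1: observed, 0: chasm
Y = I * (target + leak)

G, H, U = superres_snmf(Y, I, F)
target_hat = F @ G                                    # defined on every bin
chasm = 1.0 - I
err = np.linalg.norm(chasm * (target - target_hat))   # error inside chasms
base = np.linalg.norm(chasm * target)                 # error of leaving zeros
```

A frame whose index column is entirely zero gives the update for 𝑮 nothing to fit, which is the extrapolation problem that motivates the regularization on the following slides.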
  • 17. Regularized superresolution-based NMF • Signal flow of the proposed hybrid method. [Figure (a), observed spectra: frequency of source components versus direction (left, center, right), with the target source in the center]
  • 18. Regularized superresolution-based NMF • Signal flow of the proposed hybrid method: (a) observed spectra → directional clustering → (b) after directional clustering. Within the target direction, center sources lose some of their components.
  • 19. Regularized superresolution-based NMF • Signal flow of the proposed hybrid method: (b) after directional clustering, center sources lose some of their components.
  • 20. Regularized superresolution-based NMF • Signal flow of the proposed hybrid method: (b) after directional clustering → superresolution-based NMF → (c) after superresolution-based SNMF, showing the extrapolated target source.
  • 21. Regularized superresolution-based NMF • The basis extrapolation includes an underlying problem. • If the time-frequency spectra are almost unseen in the spectrogram, which means that the indexes are almost all zero, a large extrapolation error may occur. • It is necessary to regularize the extrapolation. [Figure: separated cluster with an almost unseen frame, and a spectrogram (time in s versus frequency in kHz) showing the extrapolation error caused by incorrectly modifying the activation]
  • 22. Regularized superresolution-based NMF • We propose two types of regularization: regularization of the temporal continuity (with the previous frame) and regularization of the norm minimization. • 𝑰: index matrix; an overbar denotes the binary complement; 𝑖 𝜔,𝑡: entry of index matrix 𝑰; 𝑔 𝑘,𝑡: entry of matrix 𝑮; 𝑓𝜔,𝑘: entry of matrix 𝑭. • The intensity of these regularizations is proportional to the number of chasms in each frame.
  • 23. Regularized superresolution-based NMF • The cost function in regularized superresolution-based NMF is defined using the index matrix 𝑰. It contains a regularization term 𝑅ₙ (𝑛 = 1: temporal continuity, 𝑛 = 2: norm minimization), a penalty term ‖𝑭ᵀ𝑯‖² that forces 𝑭 and 𝑯 to become uncorrelated with each other, and weighting parameters.
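
The equations on slides 22 and 23 did not survive extraction. One plausible shape for them, written as a sketch only, is given below. The structure (index-weighted data term, regularization term, and the ‖𝑭ᵀ𝑯‖² penalty) follows the slide text and speaker notes; the squared Euclidean distance, the weight symbols λ and μ, and the exact forms of 𝑅₁ and 𝑅₂ are assumptions chosen so that each regularizer grows with the number of chasms in a frame.

```latex
\mathcal{J} =
  \left\| \boldsymbol{I} \circ
    \left( \boldsymbol{Y} - \boldsymbol{F}\boldsymbol{G} - \boldsymbol{H}\boldsymbol{U} \right)
  \right\|_{\mathrm{F}}^{2}
  + \lambda\, R_n(\boldsymbol{G})
  + \mu \left\| \boldsymbol{F}^{\mathsf{T}} \boldsymbol{H} \right\|_{\mathrm{F}}^{2},
  \qquad n \in \{1, 2\}

% Assumed form: temporal continuity with the previous frame
R_1(\boldsymbol{G}) =
  \sum_{k,t} \Big( \sum_{\omega} \bar{i}_{\omega,t} \Big)
  \left( g_{k,t} - g_{k,t-1} \right)^{2}

% Assumed form: norm minimization of the extrapolated components
R_2(\boldsymbol{G}) =
  \sum_{\omega,t} \bar{i}_{\omega,t}
  \Big( \sum_{k} f_{\omega,k}\, g_{k,t} \Big)^{2}
```

Here $\bar{i}_{\omega,t} = 1 - i_{\omega,t}$ is the binary complement of the index, so a frame with no chasms contributes nothing to either regularizer.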
  • 24. Regularized superresolution-based NMF • The update rules that minimize the cost function are obtained as follows: [equations shown on the slide]
  • 26. Evaluation experiment • We compared four methods. – Conventional hybrid method using PSNMF (Conventional method) – Proposed hybrid method using superresolution-based NMF without regularization (Proposed method 1) – Proposed hybrid method using superresolution-based NMF with regularization of the temporal continuity (Proposed method 2) – Proposed hybrid method using superresolution-based NMF with regularization of the norm minimization (Proposed method 3) [Figure: signal flows of the conventional and proposed hybrid methods, as on slide 14]
  • 27. Evaluation experiment • We used stereo-panning signals and binaural-recorded signals containing four instruments, Ob., Fl., Tb., and Pf., generated by a MIDI synthesizer. • The sources are mixed with equal power. • The target source is always located in the center direction (no. 1). • We used the same type of MIDI sounds of the target instruments as supervision for the training process. • Supervision sound: two octaves of notes that cover all notes of the target signal. [Figure: source positions 1 to 4 from left to right, with the target source (no. 1) in the center]
  • 28. Experimental results (panning signal) • Average SDR, SIR, and SAR scores for each method, where the 4 instruments are shuffled with 12 combinations. • SDR: quality of the separated target sound; SIR: degree of separation between the target and other sounds; SAR: absence of artificial distortion. • Proposed method 1: no regularization; Proposed method 2: regularization of temporal continuity; Proposed method 3: regularization of norm minimization. [Figure: bar charts of SDR, SIR, and SAR in dB for each method]
  • 29. Experimental results (binaural signal) • Average SDR, SIR, and SAR scores for each method, where the 4 instruments are shuffled with 12 combinations. • SDR: quality of the separated target sound; SIR: degree of separation between the target and other sounds; SAR: absence of artificial distortion. • Proposed method 1: no regularization; Proposed method 2: regularization of temporal continuity; Proposed method 3: regularization of norm minimization. [Figure: bar charts of SDR, SIR, and SAR in dB for each method]
  • 30. Conclusions • We propose a new supervised NMF algorithm, a superresolution-based method, for the hybrid method to separate stereo or binaural signals. • The proposed hybrid method can separate the target signal with higher performance than the conventional method. • The regularization of norm minimization is effective for the proposed supervised NMF algorithm. • Thank you for your attention!

Editor's Notes

  1. Thank you, chair. Good afternoon, everyone. I'm Daichi Kitamura from Nara Institute of Science and Technology, Japan. Today I'd like to talk about binaural signal separation using regularized superresolution-based nonnegative matrix factorization.
  2. This is the outline of my talk.
  3. First, I will talk about the research background.
  4. Recently, music signal separation technologies have received much attention. These technologies can be used to control each source in a music signal, for example in a 3D audio system. Music signal separation based on nonnegative matrix factorization, NMF for short, has been a very active area of research. NMF can extract the target signal to some extent, especially when the number of instruments is small. However, for mixtures of many sources, as in more realistic musical tunes, the extraction performance markedly degrades. To solve this problem, we propose a new method for multichannel signal separation with NMF that utilizes both spectral and spatial cues included in mixtures of multiple instruments.
  5. Next, we talk about the conventional methods.
  6. NMF is a type of sparse representation algorithm that decomposes a nonnegative matrix into two nonnegative matrices like this, where Y is an observed spectrogram, F is a nonnegative matrix that contains the spectral patterns of the observed signal as column vectors, and G is a nonnegative matrix that corresponds to the activation of each spectral pattern.
  7. Penalized supervised NMF, PSNMF for short, was proposed by Yagi and others. In PSNMF, an observed matrix is decomposed like this, where F contains the bases trained on the target supervision sound in the training process. The target signal is therefore extracted as the product of F and G. In addition, to prevent the simultaneous generation of similar spectral patterns in the matrices F and H, a specific penalty is imposed between F and H. This method uses spectral cues for the separation.
  8. However, PSNMF has a problem. When the input signal includes many instrumental sources, the extraction performance markedly degrades because several similar bases arise in both the target and the other instruments.
  9. Next, we explain the directional clustering method. Directional clustering can estimate sources and their directions in a multichannel signal. This method can separate sources using spatial information in an observed signal.
  10. However, this method cannot separate sources in the same direction, like this.
  11. To solve these problems, a hybrid method that applies PSNMF after directional clustering has been proposed. This method consists of two techniques. First, directional clustering is applied to the input signal to separate the target direction. However, directional clustering cannot separate sources in the same direction, so PSNMF is applied after the directional clustering to separate the target source. (This method uses a suitable decomposition for each separation problem; that is, this hybrid method is a divide-and-conquer method.)
  12. But the hybrid method also has a problem. The signal extracted by the hybrid method suffers from considerable distortion due to the binary masking in directional clustering, so the separated cluster has many spectral chasms. In other words, the resolution of the spectrogram is degraded.
  13. Next, we talk about the proposed method.
  14. In the proposed method, we employ a new supervised NMF algorithm as an alternative to the conventional PSNMF in the hybrid method.
  15. This is an example of a spectrum at one frame. There are many spectral chasms. This matrix is the index of the separated cluster; indexes of zero indicate the grid points of chasms in the spectrogram. In the proposed supervised NMF, the spectral chasms are treated as unseen observations using this index matrix, like this. Therefore, supervised NMF is applied only to the valid observed components, not to unseen observations such as these chasms. (Directional clustering is hard clustering, that is, binary masking, and its index matrix is obtained from the separation results, so we know where the chasms are. Ones mean observations and zeros mean unseen observations.)
  16. In addition, the spectrogram of the target sound is reconstructed using better-matched bases in the proposed NMF. The components of the target sound lost after directional clustering can be extrapolated using supervised bases. In other words, the resolution of the target spectrogram is recovered by superresolution through the supervised basis extrapolation.
  17. (Pointing at (a)) This is the directional source distribution of the observed stereo signal. The target source is in the center direction, and the other sources are distributed like this.
  18. Directional clustering is binary masking in the time-frequency domain, so the separated cluster is obtained like this. Left and right source components leak into the center cluster, and the center sources lose some of their components. These lost components correspond to the spectral chasms in the time-frequency domain.
  19. Then, after the directional clustering,
  20. we apply the superresolution-based NMF. This NMF separates the target source and reconstructs the lost components by basis extrapolation using the supervised bases.
  21. However, this basis extrapolation includes an underlying problem. If the time-frequency spectra are almost unseen in the spectrogram, a large extrapolation error may occur. So it is necessary to regularize this extrapolation.
  22. We propose two types of regularization. The first one uses temporal continuity with the previous frame in the spectrogram. The second one, norm minimization, is based on the assumption that a frame with many spectral chasms intrinsically does not contain much of the target components. Here, I bar means the binary complement of the index, so I bar represents the grid points of chasms. Therefore, the intensity of these regularizations is proportional to the number of chasms in each frame.
  23. The cost function in regularized superresolution-based NMF is defined like this, where Rn is the regularization term and n represents the type of regularization: n = 1 is the regularization of temporal continuity, and n = 2 is the norm minimization. In addition, this term (pointing at ‖FᵀH‖²) is a penalty term that forces F and H to become uncorrelated with each other, to avoid sharing the same basis.
  24. The update rules that minimize the cost function are obtained like this.
  25. Then, we talk about the experiments.
  26. In the experiment, we compared four methods, namely the conventional hybrid method using PSNMF, the proposed hybrid method using superresolution-based NMF without regularization, and the proposed hybrid method with each of the two types of regularization.
  27. We used stereo-panning and binaural-recorded signals containing four instruments, namely oboe, flute, trombone, and piano, generated by MIDI. These sources are mixed with equal power, and the target source is always located in the center. No. 1 is the target source, and Nos. 2, 3, and 4 are the other sources. In addition, we used the same type of MIDI sounds of the target instruments as the supervision sound, like this (pointing at the supervision score). This supervision sound consists of two octaves of notes that cover all notes of the target signal.
  28. These results are the average evaluation scores for the stereo-panning signal, where SDR indicates the quality of the separated target sound, SIR indicates the degree of separation between the target and other sounds, and SAR indicates the absence of artificial distortion. From these results, Proposed method 3, superresolution-based NMF with norm minimization, outperforms all other methods.
  29. This is the result for the binaural signal. Similar to the results for the panning signal, Proposed method 3 achieved the highest scores. The SIR of the conventional method was high, but the quality of the separated signal is not good because of the spectral chasms. Also, Proposed method 1 risks causing the extrapolation error. The SAR results show that the proposed regularizations can avoid such errors, and norm minimization is the better choice overall for the hybrid method. (This is because the norm minimization suppresses residual components of the other sources. This phenomenon is a side effect of the regularization.)
  30. These are the conclusions of my talk. Thank you for your attention.
  32. In addition, the spectrogram of the target sound is reconstructed using better-matched bases in the proposed NMF. (Pointing at (a)) This is the directional source distribution of the observed stereo signal. The target source is in the center direction, and the other sources are distributed like this. After directional clustering, the separated cluster loses some of its components. After superresolution-based NMF, the target components are restored using supervised bases. In other words, the resolution of the target spectrogram is recovered by superresolution through the supervised basis extrapolation.
  33. If the number of sources in the same direction as the target instrument increases, the separation performance of supervised NMF markedly degrades. This is because several similar bases arise in both the target and the other instruments.
  34. If the left and right sources are close to the center direction, the separation becomes difficult because directional clustering cannot separate them well. In addition, basis extrapolation also becomes difficult because the number of chasms in the separated cluster increases in this case. In contrast, if theta becomes larger, the separation becomes easier.
  35. This is the signal flow of the proposed hybrid method. In our experiment, superresolution-based supervised NMF is applied only to the center direction because the target source is located in the center direction. However, if the target source is located on the left or right side, we should apply this NMF to the direction that contains the target source, whether or not there is another source in that direction.
  36. SDR: quality of the separated target sound. SIR: degree of separation between the target and other sounds. SAR: absence of artificial distortion.
  37. SDR is the total evaluation score of the separation performance.
  38. Penalized supervised NMF, PSNMF for short, was proposed by Yagi and others. In PSNMF, an observed matrix is decomposed like this, where F is a nonnegative matrix that contains the target sound bases as column vectors, G is an activation matrix that corresponds to F, and H and U are nonnegative matrices. The target signal is therefore extracted as the product of F and G. In addition, to prevent the simultaneous formation of similar spectral patterns in the matrices F and H, a specific penalty is imposed between F and H. However, PSNMF has a problem. When the input signal includes many instrumental sources, the extraction performance markedly degrades (because several similar bases arise in both the target and the other instruments).