Hybrid Multichannel Signal Separation Using
Supervised Nonnegative Matrix Factorization
Daichi Kitamura, (The Graduate University for Advanced Studies, Japan)
Hiroshi Saruwatari, (The University of Tokyo, Japan)
Satoshi Nakamura, (Nara Institute of Science and Technology, Japan)
Yu Takahashi, (Yamaha Corporation, Japan)
Kazunobu Kondo, (Yamaha Corporation, Japan)
Hirokazu Kameoka, (The University of Tokyo, Japan)
Asia-Pacific Signal and Information Processing Association ASC 2014
Special session – Recent Advances in Audio and Acoustic Signal Processing
Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Multichannel NMF
• 3. Proposed method
– SNMF with spectrogram restoration and its Hybrid method
• 4. Experiments
– Closed data experiment
– Open data experiment
• 5. Conclusions
Research background
• Signal separation has received much attention.
– Applications: automatic music transcription, 3D audio systems, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.
• Supervised NMF (SNMF) achieves the highest separation performance.
• To improve its performance, an SNMF-based multichannel signal separation method is required.
Goal: separate the target signal from multichannel signals with high accuracy.
NMF [Lee, et al., 2001]
• NMF can extract significant spectral patterns.
– The basis matrix holds the spectral patterns that appear frequently in the observed matrix.
[Figure: the observed matrix (spectrogram, frequency × time) is approximated by the product of the basis matrix (spectral patterns, one basis per column) and the activation matrix (time-varying gains).]
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
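The factorization on this slide can be sketched with the multiplicative update rules for the Euclidean distance. This is a minimal illustration rather than the authors' implementation; the names `nmf_euclid`, `Y`, `F`, and `G` are assumptions chosen for the example, and the toy data are random.

```python
import numpy as np

def nmf_euclid(Y, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix Y (freq x time) as F @ G.

    F (freq x n_bases) plays the role of the basis matrix and
    G (n_bases x time) the role of the activation matrix.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, n_bases)) + eps
    G = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G

# Toy "spectrogram" built from three nonnegative patterns.
rng = np.random.default_rng(1)
Y = rng.random((64, 3)) @ rng.random((3, 100))
F, G = nmf_euclid(Y, n_bases=3)
relative_error = np.linalg.norm(Y - F @ G) / np.linalg.norm(Y)
print(f"relative reconstruction error: {relative_error:.4f}")
```

The Euclidean distance is used here only because its update rules are the shortest to write; other divergences lead to updates of the same multiplicative form.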
Supervised NMF [Smaragdis, et al., 2007]
• SNMF
– Supervised spectral separation method
– Training process: a supervised basis matrix (spectral dictionary) is learned from sample sounds of the target signal.
– Separation process: the mixed signal is decomposed into the target signal and the other signals; the supervised basis matrix is fixed and the remaining matrices are optimized.
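The training and separation processes can be sketched as two small routines. This is an illustrative sketch under assumptions: the Euclidean distance is used for brevity, and the names `train_basis`, `separate`, `F`, `G`, `H`, and `U` are hypothetical rather than taken from the slides.

```python
import numpy as np

EPS = 1e-12

def train_basis(Y_train, n_bases, n_iter=200, seed=0):
    """Training process: learn a spectral dictionary F from target samples."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, Y_train.shape[1])) + EPS
    for _ in range(n_iter):
        F *= (Y_train @ G.T) / (F @ G @ G.T + EPS)
        G *= (F.T @ Y_train) / (F.T @ F @ G + EPS)
    return F

def separate(Y_mix, F, n_other, n_iter=200, seed=0):
    """Separation process: F stays fixed; G, H, and U are optimized."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y_mix.shape[1])) + EPS   # target activations
    H = rng.random((Y_mix.shape[0], n_other)) + EPS      # bases for other signals
    U = rng.random((n_other, Y_mix.shape[1])) + EPS      # their activations
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ Y_mix) / (F.T @ X + EPS)
        X = F @ G + H @ U
        H *= (Y_mix @ U.T) / (X @ U.T + EPS)
        X = F @ G + H @ U
        U *= (H.T @ Y_mix) / (H.T @ X + EPS)
    return F @ G, H @ U   # target estimate, estimate of the other signals

# Toy data: the "target" is low rank, the interference is a single pattern.
rng = np.random.default_rng(2)
Y_target = rng.random((32, 2)) @ rng.random((2, 80))
F = train_basis(Y_target, n_bases=2)
Y_mix = Y_target[:, :40] + rng.random((32, 1)) @ rng.random((1, 40))
target_est, other_est = separate(Y_mix, F, n_other=1)
```

Keeping `F` fixed during separation is what makes the method supervised: only the part of the mixture that the dictionary can explain is assigned to the target.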
Problems of SNMF
• SNMF is designed only for single-channel signals.
– For a multichannel signal, SNMF cannot use the information between channels.
• When many interference sources exist, the separation performance of SNMF markedly degrades.
[Figure: after separation, residual components remain.]
Multichannel NMF [Sawada, et al., 2013]
• Multichannel NMF
– is a natural extension of NMF to multichannel signals (e.g., recorded with a microphone array)
– uses spatial information to cluster the bases, which achieves the unsupervised separation task.
• Problem: multichannel NMF depends strongly on the initial values and lacks robustness.
Motivation and strategy
• Sawada's multichannel NMF
– is a unified method that solves the spatial and spectral separations jointly.
– maximizes a likelihood that involves the observed spectrograms, the spatial direction of the target signal, and the source components of all signals (target and other).
– In the supervised situation, the target spectral patterns are given.
– The joint problem is too difficult to solve (it lacks robustness).
– It is computationally inefficient (long computation time).
Motivation and strategy
• Proposed hybrid method
– divides the problem into two parts as an approximation: unsupervised spatial separation (classical D.O.A. estimation) and supervised spectral separation (SNMF-based method).
– The spatial separation should be carried out with classical D.O.A. estimation methods.
• These methods are very efficient and stable.
– Divide-and-conquer approach
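Directional clustering [Araki, et al., 2007]
• Directional clustering
– Unsupervised spatial separation method
– k-means clustering (fast and stable)
• Problems
– Artificial distortion arises owing to the binary masking.
[Figure: the input stereo signal contains left, center, and right sources; binary masking yields the separated center signal.]
Binary masking: each time-frequency grid of the spectrogram is assigned to a direction (C: center, L: left, R: right). The binary mask is 1 where the grid is assigned to the target direction (here C), and the separated spectrogram is the entry-wise product of the spectrogram and the binary mask.
Spectrogram labels (rows: frequency, columns: time):
C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C
Binary mask (rows: frequency, columns: time):
1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1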
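The clustering and masking steps can be sketched as follows. This is a simplified stand-in, not the method of the cited paper: it assumes that the panning angle computed from the two channel magnitudes is a sufficient direction feature, and it uses a plain one-dimensional k-means. The function name, the toy panning gains, and the random magnitudes are all hypothetical.

```python
import numpy as np

def directional_clustering(X_left, X_right, n_clusters=3, n_iter=50):
    """Cluster time-frequency grids by an assumed panning-angle feature."""
    # 0 means fully left, pi/4 center, pi/2 fully right.
    angle = np.arctan2(np.abs(X_right), np.abs(X_left))
    centroids = np.linspace(0.0, np.pi / 2, n_clusters)
    for _ in range(n_iter):
        # Assign each grid to the nearest centroid, then move the centroids.
        labels = np.argmin(np.abs(angle[..., None] - centroids), axis=-1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centroids[c] = angle[labels == c].mean()
    labels = np.argmin(np.abs(angle[..., None] - centroids), axis=-1)
    return labels, centroids

# Toy stereo spectrogram in which each grid is dominated by one direction,
# following the C/L/R pattern shown on the slide.
pattern = ["CCCRLR", "CLLLRR", "CCCCRR", "CRRLLL", "CCCCCC"]
gains = {"L": (1.0, 0.1), "C": (0.7, 0.7), "R": (0.1, 1.0)}  # assumed panning
rng = np.random.default_rng(0)
magnitude = rng.random((5, 6)) + 0.1
X_left = np.array([[gains[ch][0] for ch in row] for row in pattern]) * magnitude
X_right = np.array([[gains[ch][1] for ch in row] for row in pattern]) * magnitude

labels, centroids = directional_clustering(X_left, X_right)
mask = (labels == 1).astype(float)   # cluster 1 is the middle (center) direction
separated_left = mask * X_left       # entry-wise product
separated_right = mask * X_right
print(mask)
```

The zeros in `mask` are the grids that are lost by the binary masking, which is the source of the artificial distortion noted above.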
Proposed method: hybrid separation
• Hybrid separation method
– Input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (SNMF with spectrogram restoration) → separated signal
SNMF with spectrogram restoration
• The separated cluster contains spectral holes (lost components).
• The proposed SNMF treats these holes as unseen observations.
• The fittest bases from the supervised basis (dictionary of the target signal) are extrapolated to fix up the holes.
[Figure: time-frequency plot of the separated cluster with the holes marked.]
SNMF with spectrogram restoration
[Figure: frequency of source components versus direction (left, center, right). (a) Input signal, with the target at the center. (b) After directional clustering (binary masking). (c) After super-resolution-based SNMF, with the extrapolated components.]
[Figure: the observed spectrogram (target and interference) is processed by directional clustering to give the separated cluster; SNMF with spectrogram restoration then extrapolates the supervised spectral bases to give the reconstructed data. Axes: time and frequency.]
Decomposition model and cost function
• The divergence is defined at all grids except for the holes by using the binary mask matrix.
Decomposition model: [equation] (the supervised bases are fixed)
Cost function: [equation] with three terms:
– Main divergence term, multiplied by the binary index to exclude the holes
– Regularization term
– Penalty term [Kitamura, et al., 2014]
Notation: entries of the matrices; the binary masking matrix obtained from directional clustering and its binary complement; weighting parameters; Frobenius norm.
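A cost of the kind described on this slide can be written as follows. This is a sketch under assumed notation, not a transcription of the slide: the symbols $\boldsymbol{Y}$ (observed spectrogram), $\boldsymbol{F}$ (fixed supervised bases), $\boldsymbol{G}$, $\boldsymbol{H}$, $\boldsymbol{U}$ (variable matrices), $i_{\omega,t}$ (binary mask entry), $\bar{i}_{\omega,t}$ (its binary complement), and the weights $\lambda$, $\mu$ are all chosen for illustration, and the exact arguments of the regularization and penalty terms are assumptions.

```latex
% Decomposition model (assumed notation): target part plus other part
\boldsymbol{Y} \simeq \boldsymbol{F}\boldsymbol{G} + \boldsymbol{H}\boldsymbol{U}

% Cost: masked divergence + regularization on the holes + penalty
\mathcal{J} =
  \sum_{\omega,t} i_{\omega,t}\,
    \mathcal{D}\!\left( y_{\omega,t} \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t}
                        + \sum_{l} h_{\omega,l}\, u_{l,t} \right)
  + \lambda \sum_{\omega,t} \bar{i}_{\omega,t}\,
    \mathcal{D}\!\left( 0 \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \right)
  + \mu \left\| \boldsymbol{F}^{\mathsf{T}} \boldsymbol{H} \right\|_{\mathrm{Fr}}^{2}
```

In this form the first term ignores the holes, the second term (weighted by the binary complement) keeps the extrapolated components in the holes from growing without bound, and the third term discourages the other bases from duplicating the supervised ones.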
Generalized divergence: β-divergence
• β-divergence [Eguchi, et al., 2001]
– EUC-distance (β = 2)
– KL-divergence (β = 1): the best criterion for signal separation [Kitamura, et al., 2014]
– IS-divergence (β = 0)
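The three special cases can be checked numerically with a small function. This sketch assumes the common parameterization in which β = 2, 1, and 0 give the EUC-distance (up to a factor of 1/2), the KL-divergence, and the IS-divergence; the function name `beta_divergence` is hypothetical, and both inputs are assumed to be strictly positive.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences D_beta(y || x) for y, x > 0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:   # generalized Kullback-Leibler divergence
        return np.sum(y * (np.log(y) - np.log(x)) + x - y)
    if beta == 0:   # Itakura-Saito divergence
        return np.sum(y / x - np.log(y / x) - 1.0)
    # General case; beta = 2 gives half the squared Euclidean distance.
    return np.sum(y**beta / (beta * (beta - 1.0))
                  + x**beta / beta
                  - y * x**(beta - 1.0) / (beta - 1.0))

y = np.array([1.0, 2.0, 3.0])
x = np.array([1.5, 1.0, 4.0])
for beta in (0, 1, 2, 3):
    print(beta, beta_divergence(y, x, beta))
```

Treating β as a continuous parameter is what allows the separation and restoration behavior to be compared across divergences with a single knob.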
Decomposition model and cost function
• We used two β-divergences, one for the main cost and one for the regularization cost, each with its own β.
Decomposition model: [equation] (the supervised bases are fixed)
Cost function: [equation]
Update rules
• We can obtain the update rules for the optimization of the three variable matrices.
Update rules: [equations]
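The idea of updating only the variable matrices while ignoring the holes can be sketched with a masked multiplicative update. This is a deliberately simplified illustration, not the update rules of the slide: it assumes the KL-divergence for the main cost and omits the regularization and penalty terms. The names `masked_snmf_kl`, `masked_kl`, `F`, `G`, `H`, `U`, and `mask` are hypothetical.

```python
import numpy as np

EPS = 1e-12

def masked_kl(Y, X, mask):
    """Generalized KL divergence summed over the observed grids only."""
    return np.sum(mask * (Y * np.log((Y + EPS) / (X + EPS)) - Y + X))

def masked_snmf_kl(Y, F, mask, n_other, n_iter=100, seed=0):
    """Fit Y ~ F @ G + H @ U on grids where mask == 1; F stays fixed."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y.shape[1])) + EPS
    H = rng.random((Y.shape[0], n_other)) + EPS
    U = rng.random((n_other, Y.shape[1])) + EPS
    for _ in range(n_iter):
        # R is the masked ratio Y / model; holes contribute nothing.
        R = mask * Y / (F @ G + H @ U + EPS)
        G *= (F.T @ R) / (F.T @ mask + EPS)
        R = mask * Y / (F @ G + H @ U + EPS)
        H *= (R @ U.T) / (mask @ U.T + EPS)
        R = mask * Y / (F @ G + H @ U + EPS)
        U *= (H.T @ R) / (H.T @ mask + EPS)
    return G, H, U

# Toy example: random holes in a spectrogram built from known target bases.
rng = np.random.default_rng(3)
F = rng.random((24, 2)) + 0.1
Y = F @ rng.random((2, 50)) + rng.random((24, 1)) @ rng.random((1, 50))
mask = (rng.random(Y.shape) < 0.7).astype(float)   # 1: observed, 0: hole
G, H, U = masked_snmf_kl(Y, F, mask, n_other=1)
restored_target = F @ G   # defined on every grid, including the holes
```

Because `F @ G` is defined on every grid, the supervised bases fill in the holes once their activations have been estimated from the observed grids. A frame with no observed grid at all would get zero activation in this simplified form, which is one reason a regularization term is needed in practice.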
Experimental condition
• The mixed signal includes four melodies (sources).
• Three compositions of instruments
– We evaluated the average score of 36 patterns.
• Supervision signal: 24 notes that cover all the notes in the target melody
[Figure: spatial arrangement of the four sources (1 to 4) across left, center, and right, with the target source indicated.]
Dataset   Melody 1   Melody 2   Midrange      Bass
No. 1     Oboe       Flute      Piano         Trombone
No. 2     Trumpet    Violin     Harpsichord   Fagotto
No. 3     Horn       Clarinet   Piano         Cello
Experimental result: closed data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and the absence of artificial distortion (higher is better).
[Figure: SDR [dB] (0 to 14) versus β of βNMF (0 to 4, with KL-divergence and EUC-distance marked), comparing conventional SNMF (single-channel SNMF), directional clustering, supervised multichannel NMF [Sawada], and the proposed hybrid method.]
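As a rough illustration of what an SDR value expresses, the ratio can be computed directly from a reference signal and its estimate. This is a simplified definition assumed for the example; standard evaluation toolkits first decompose the estimate into target, interference, and artifact components, which this sketch does not do. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate):
    """Energy ratio (in dB) between the reference and the total distortion."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = estimate - reference
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(distortion**2))

# Toy check: an estimate whose error has one tenth of the reference amplitude.
t = np.linspace(0.0, 1.0, 1000)
reference = np.sin(2 * np.pi * 5 * t)
estimate = 1.1 * reference
print(simple_sdr(reference, estimate))
```

With this definition, every 10 dB of SDR corresponds to a tenfold drop in distortion energy relative to the reference.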
SNMF with spectrogram restoration
• SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation.
• The optimal divergence for source separation is KL-divergence (β = 1).
• In contrast, a divergence with a higher β value is suitable for the basis extrapolation.
Trade-off: separation and restoration
• The optimal divergence for SNMF with spectrogram restoration and its hybrid method is determined by the trade-off between separation and restoration abilities.
[Figure: two spectra, amplitude [dB] (-10 to 0) versus frequency [kHz] (0 to 5), one with strong sparseness and one with weak sparseness.]
[Figure: performance versus β (0 to 4), showing separation, restoration, and the total performance of the hybrid method.]
Experimental condition
• Open data experiment
– used different tone generators for the training and test signals
• Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A
• Mixed signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB)
[Figure: spatial arrangement of the four sources (1 to 4) across left, center, and right, with the target source indicated.]
Experimental result: open data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and the absence of artificial distortion (higher is better).
[Figure: SDR [dB] (-4 to 10) versus β of βNMF (0 to 4, with KL-divergence and EUC-distance marked), comparing conventional SNMF (single-channel SNMF), directional clustering, supervised multichannel NMF [Sawada], and the proposed hybrid method.]
Conclusions
• We proposed a hybrid multichannel signal separation
method combining directional clustering and SNMF
with spectrogram restoration.
• There is a trade-off between separation and
restoration abilities.
Thank you for your attention!
You can hear a demonstration on my website.

More Related Content

What's hot

Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...Kitamura Laboratory
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...Kitamura Laboratory
 
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...Hiroki_Tanji
 
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...Hiroki_Tanji
 
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimoFares Zenaidi
 
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...a3labdsp
 
M.sc. presentation t.bagheri fashkhami
M.sc. presentation t.bagheri fashkhamiM.sc. presentation t.bagheri fashkhami
M.sc. presentation t.bagheri fashkhamitaherbagherif
 
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...Mohamed Elawady
 
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...Naoki Shibata
 
Ibfd presentation
Ibfd presentationIbfd presentation
Ibfd presentationFuyun Ling
 
Fourier Filtering Denoising Based on Genetic Algorithms
Fourier Filtering Denoising Based on Genetic AlgorithmsFourier Filtering Denoising Based on Genetic Algorithms
Fourier Filtering Denoising Based on Genetic Algorithmsijtsrd
 
IRJET- Music Genre Classification using MFCC and AANN
IRJET- Music Genre Classification using MFCC and AANNIRJET- Music Genre Classification using MFCC and AANN
IRJET- Music Genre Classification using MFCC and AANNIRJET Journal
 
Frequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsFrequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsCSCJournals
 
Contextual high-resolution image classification by markovian data fusion.pdf
Contextual high-resolution image classification by markovian data fusion.pdfContextual high-resolution image classification by markovian data fusion.pdf
Contextual high-resolution image classification by markovian data fusion.pdfgrssieee
 
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...Peter Morovic
 

What's hot (20)

Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...
 
Temporal Segment Network
Temporal Segment NetworkTemporal Segment Network
Temporal Segment Network
 
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
 
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...
Learning the Statistical Model of the NMF Using the Deep Multiplicative Updat...
 
Oceans13 Presentation
Oceans13 PresentationOceans13 Presentation
Oceans13 Presentation
 
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo
11-16-0316-00-00ay-low-complexity-beamtraining-for-hybrid-mimo
 
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
 
M.sc. presentation t.bagheri fashkhami
M.sc. presentation t.bagheri fashkhamiM.sc. presentation t.bagheri fashkhami
M.sc. presentation t.bagheri fashkhami
 
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...
(Reading Group) Automatic Detection of Action Potentials in a Noisy Neural Re...
 
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
BalloonNet: A Deploying Method for a Three-Dimensional Wireless Network Surro...
 
Ibfd presentation
Ibfd presentationIbfd presentation
Ibfd presentation
 
Fourier Filtering Denoising Based on Genetic Algorithms
Fourier Filtering Denoising Based on Genetic AlgorithmsFourier Filtering Denoising Based on Genetic Algorithms
Fourier Filtering Denoising Based on Genetic Algorithms
 
Max_Poster_FINAL
Max_Poster_FINALMax_Poster_FINAL
Max_Poster_FINAL
 
IRJET- Music Genre Classification using MFCC and AANN
IRJET- Music Genre Classification using MFCC and AANNIRJET- Music Genre Classification using MFCC and AANN
IRJET- Music Genre Classification using MFCC and AANN
 
Frequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsFrequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral components
 
Contextual high-resolution image classification by markovian data fusion.pdf
Contextual high-resolution image classification by markovian data fusion.pdfContextual high-resolution image classification by markovian data fusion.pdf
Contextual high-resolution image classification by markovian data fusion.pdf
 
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...
Analysis and Compression of Reflectance Data Using An Evolved Spectral Correl...
 

Viewers also liked

統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...Daichi Kitamura
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法Daichi Kitamura
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...Daichi Kitamura
 
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離奈良先端大 情報科学研究科
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Daichi Kitamura
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Daichi Kitamura
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...Daichi Kitamura
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)Daichi Kitamura
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...Daichi Kitamura
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...Daichi Kitamura
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...Daichi Kitamura
 

Viewers also liked (11)

統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
 
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離
直交化及び距離最大化則条件を用いた教師あり非負値行列因子分解による音楽信号分離
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
 

Similar to Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...
Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...奈良先端大 情報科学研究科
 
Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Kitamura Laboratory
 
Test vector compression in Digital Testing
Test vector compression in Digital Testing Test vector compression in Digital Testing
Test vector compression in Digital Testing Amr Abd El Latief
 
NIDM-Results. A standard for describing and sharing neuroimaging results: app...
NIDM-Results. A standard for describing and sharing neuroimaging results: app...NIDM-Results. A standard for describing and sharing neuroimaging results: app...
NIDM-Results. A standard for describing and sharing neuroimaging results: app...Camille Maumet
 
SPECFORMER: SPECTRAL GRAPH NEURAL NETWORKS MEET TRANSFORMERS.pptx
SPECFORMER: SPECTRAL GRAPH NEURAL NETWORKS MEET TRANSFORMERS.pptxSPECFORMER: SPECTRAL GRAPH NEURAL NETWORKS MEET TRANSFORMERS.pptx
SPECFORMER: SPECTRAL GRAPH NEURAL NETWORKS MEET TRANSFORMERS.pptxssuser2624f71
 
sub topics of NMR.pptx
sub topics of NMR.pptxsub topics of NMR.pptx
sub topics of NMR.pptxHajira Mahmood
 
Non-Uniform sampling and reconstruction of multi-band signals
Non-Uniform sampling and reconstruction of multi-band signalsNon-Uniform sampling and reconstruction of multi-band signals
Non-Uniform sampling and reconstruction of multi-band signalsmravendi
 
Design and Hardware Implementation of Low-Complexity Multiuser Precoders (ETH...
Design and Hardware Implementation of Low-Complexity Multiuser Precoders (ETH...Design and Hardware Implementation of Low-Complexity Multiuser Precoders (ETH...
Design and Hardware Implementation of Low-Complexity Multiuser Precoders (ETH...TSC University of Mondragon
 
automatic detection of pulmonary nodules in lung ct images
automatic detection of pulmonary nodules in lung ct imagesautomatic detection of pulmonary nodules in lung ct images
Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

  • 1. Hybrid Multichannel Signal Separation Using Supervised Nonnegative Matrix Factorization Daichi Kitamura, (The Graduate University for Advanced Studies, Japan) Hiroshi Saruwatari, (The University of Tokyo, Japan) Satoshi Nakamura, (Nara Institute of Science and Technology, Japan) Yu Takahashi, (Yamaha Corporation, Japan) Kazunobu Kondo, (Yamaha Corporation, Japan) Hirokazu Kameoka, (The University of Tokyo, Japan) Asia-Pacific Signal and Information Processing Association ASC 2014 Special session – Recent Advances in Audio and Acoustic Signal processing
  • 2. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 2
  • 3. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 3
  • 4. Research background • Signal separation has received much attention. • Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area. • Supervised NMF (SNMF) achieves the highest separation performance. • To improve its performance, an SNMF-based multichannel signal separation method is required. • Applications: automatic music transcription, 3D audio systems, etc. • Goal: separate the target signal from multichannel signals with high accuracy.
  • 5. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 5
  • 6. NMF [Lee, et al., 2001] • NMF can extract significant spectral patterns. – The basis matrix contains the spectral patterns that appear frequently in the observed matrix. [Figure: the observed matrix (spectrogram, frequency × time, amplitude) is approximated by the product of the basis matrix (spectral patterns) and the activation matrix (time-varying gains).] Ω: number of frequency bins, 𝑇: number of time frames, 𝐾: number of bases.
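The factorization described on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: it assumes the squared Euclidean distance as the cost and the multiplicative update rules of Lee and Seung, and the names `Y`, `F`, `G`, and `nmf` are chosen here because the transcript did not preserve the original symbols.

```python
import numpy as np

def nmf(Y, K, n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix Y (n_freq x n_frames) as F @ G.

    F (n_freq x K) holds spectral patterns, G (K x n_frames) their gains.
    Assumption: squared Euclidean cost with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, K)) + eps
    G = rng.random((K, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so F and G stay nonnegative.
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
    return F, G
```

With a magnitude spectrogram as `Y`, the columns of `F` play the role of the spectral patterns and the rows of `G` the time-varying gains.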
  • 7. Supervised NMF [Smaragdis, et al., 2007] • SNMF – Supervised spectral separation method – Training process: a supervised basis matrix (spectral dictionary) is learned from sample sounds of the target signal. – Separation process: the supervised basis matrix is fixed and the other matrices are optimized, which splits the mixed signal into the target signal and the other signal.
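The separation process can be sketched as follows, assuming a trained basis matrix is already available. The model `Y ≈ F @ G + H @ U`, the Euclidean cost, and all names (`snmf_separate`, `n_other`, `G`, `H`, `U`) are assumptions made for this illustration; the slide only states that the supervised bases stay fixed while the rest is optimized.

```python
import numpy as np

def snmf_separate(Y, F, n_other, n_iter=200, eps=1e-12, seed=0):
    """Split Y into a target part F @ G and a remainder H @ U.

    F is the supervised basis matrix and is never updated.
    Assumption: squared Euclidean cost with multiplicative updates.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps   # gains of the target bases
    H = rng.random((n_freq, n_other)) + eps        # bases for the other sources
    U = rng.random((n_other, n_frames)) + eps      # gains of the other bases
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ Y) / (F.T @ X + eps)
        X = F @ G + H @ U
        H *= (Y @ U.T) / (X @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ Y) / (H.T @ X + eps)
    return F @ G, H @ U
```

The first returned matrix is the estimate of the target spectrogram, the second collects everything the fixed dictionary could not explain.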
  • 8. Problems of SNMF • SNMF handles only single-channel signals. – For a multichannel signal, SNMF cannot use the information between channels. • When many interference sources exist, the separation performance of SNMF degrades markedly. [Figure: residual components remain after separation.]
  • 9. Multichannel NMF [Sawada, et al., 2013] • Multichannel NMF – is a natural extension of NMF to multichannel signals (recorded with a microphone array) – uses spatial information to cluster the bases, which achieves the unsupervised separation task. • Problems: multichannel NMF depends strongly on its initial values and lacks robustness.
  • 10. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – Motivation and strategy – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 10
  • 11. Motivation and strategy • Sawada's multichannel NMF – is a unified method that solves the spatial and spectral separations together. – maximizes a likelihood. [Equation: likelihood, with its terms labeled as observed spectrograms, spatial direction of target signal, and source components of all signals (target and other).] – In the supervised situation, the target spectral patterns are given. – The problem is too difficult to solve (lacks robustness). – It is computationally inefficient (long computation time).
  • 12. Motivation and strategy • Proposed hybrid method – divides the problem into two parts, as an approximation: unsupervised spatial separation (classical D.O.A. estimation) and supervised spectral separation (SNMF-based method). – The spatial separation should be carried out with classical D.O.A. estimation methods. • These methods are very efficient and stable. – This is a divide-and-conquer approach.
  • 13. Directional clustering [Araki, et al., 2007] • Directional clustering – Unsupervised spatial separation method – k-means clustering (fast and stable) • Problems – Artificial distortion arises owing to the binary masking. [Figure: the stereo input signal (L, R) is split into left, center, and right clusters by binary masking. Each time-frequency grid of the spectrogram is labeled L, C, or R; the binary mask of a cluster is 1 at the grids carrying its label and 0 elsewhere (the example shows the mask for C); the separated signal is the entry-wise product of the spectrogram and the binary mask.]
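A simplified sketch of directional clustering with binary masking is given below. It is an illustration under stated assumptions, not the method of Araki et al.: the feature is a hypothetical per-grid panning angle computed from the left and right magnitudes, the k-means runs in one dimension, and the function name and arguments are invented here.

```python
import numpy as np

def directional_clustering(L_mag, R_mag, n_clusters=3, n_iter=50):
    """Label each time-frequency grid by direction and build binary masks.

    Assumed feature: theta = arctan2(|R|, |L|), so 0 is hard left and
    pi/2 is hard right. Clusters are found by 1-D k-means on theta.
    """
    theta = np.arctan2(R_mag, L_mag)
    centers = np.linspace(0.0, np.pi / 2, n_clusters)
    for _ in range(n_iter):
        # Assign every grid to its nearest center, then move the centers.
        labels = np.argmin(np.abs(theta[..., None] - centers), axis=-1)
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = theta[labels == c].mean()
    labels = np.argmin(np.abs(theta[..., None] - centers), axis=-1)
    masks = [(labels == c).astype(float) for c in range(n_clusters)]
    return labels, masks
```

Multiplying a spectrogram entry-wise by one of the returned masks gives the separated cluster; the grids where the mask is 0 are the spectral holes that cause the distortion mentioned on the slide.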
  • 14. Proposed method: hybrid separation • Hybrid separation method: input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (SNMF with spectrogram restoration) → separated signal
  • 15. SNMF with spectrogram restoration • The separated cluster contains spectral holes (lost components). • The proposed SNMF treats these holes as unseen observations. • The supervised bases (dictionary of the target signal) are used to extrapolate the fittest bases and fix up the holes. [Figure: time-frequency plot of the separated cluster with the holes marked.]
  • 16. SNMF with spectrogram restoration [Figure: frequency of source component versus direction (left, center, right) for (a) the input signal, with the target marked, (b) after directional clustering (binary masking), and (c) after super-resolution-based SNMF (extrapolated components).] [Figure: observed spectrogram (target and interference) → directional clustering → separated cluster → SNMF with spectrogram restoration, which extrapolates with the supervised spectral bases → reconstructed data.]
  • 17. Decomposition model and cost function • The divergence is defined at all grids except for the holes by using the binary mask matrix. [Equation: decomposition model, with the supervised bases fixed.] [Equation: cost function.] Legend: entries of the matrices; weighting parameters; binary complement; Frobenius norm; binary masking matrix obtained from directional clustering.
  • 18. Decomposition model and cost function • Same equations as slide 17. Annotation: binary index to exclude the holes.
  • 19. Decomposition model and cost function • Same equations as slide 17. Annotations: binary index to exclude the holes; regularization term.
  • 20. Decomposition model and cost function • Same equations as slide 17. Annotations: binary index to exclude the holes; regularization term; penalty term [Kitamura, et al. 2014].
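The equations of this slide group did not survive in the transcript. The block below is one possible reconstruction that is consistent with the surviving labels (binary index, binary complement, weighting parameters, regularization term, penalty term, Frobenius norm). All symbols are assumptions introduced here: Y is the masked spectrogram, F the fixed supervised bases, G, H, and U the variable matrices, I the binary mask with entries i, and λ and μ the weights. The exact form of the regularization term in particular should be checked against the paper.

```latex
% Assumed decomposition model (F fixed; G, H, U optimized)
Y \approx FG + HU

% Assumed cost function: masked divergence + regularization on the holes + penalty
\mathcal{J} = \sum_{\omega,t} i_{\omega,t}\,
      D_{\beta}\!\left( y_{\omega,t} \,\middle\|\, \sum_{k} f_{\omega,k}\, g_{k,t} + \sum_{l} h_{\omega,l}\, u_{l,t} \right)
    + \lambda \sum_{\omega,t} \bar{i}_{\omega,t}\,
      D_{\beta_{\mathrm{reg}}}\!\left( 0 \,\middle\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \right)
    + \mu \left\| F^{\mathsf{T}} H \right\|_{\mathrm{Fr}}^{2},
\qquad \bar{i}_{\omega,t} = 1 - i_{\omega,t}
```

Read this way, the first term fits the model only where data was observed, the second keeps the extrapolated values in the holes from growing without bound, and the third discourages the free bases H from duplicating the supervised bases F.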
  • 21. Generalized divergence: β-divergence • β-divergence [Eguchi, et al., 2001] – EUC-distance (β = 2) – KL-divergence (β = 1): the best criterion for signal separation [Kitamura, et al., 2014] – IS-divergence (β = 0)
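The three special cases can be checked numerically with a small function. This is a sketch that assumes the common parametrization of the β-divergence, in which β = 2 gives half the squared Euclidean distance, β = 1 the generalized KL-divergence, and β = 0 the IS-divergence; the function name is invented here and inputs are assumed to be strictly positive.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences D_beta(y || x) for positive y, x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 0:
        # Itakura-Saito divergence (limit beta -> 0)
        r = y / x
        return np.sum(r - np.log(r) - 1.0)
    if beta == 1:
        # Generalized Kullback-Leibler divergence (limit beta -> 1)
        return np.sum(y * np.log(y / x) - y + x)
    # General case; beta = 2 reduces to 0.5 * (y - x) ** 2
    return np.sum(
        (y**beta + (beta - 1.0) * x**beta - beta * y * x ** (beta - 1.0))
        / (beta * (beta - 1.0))
    )
```

Sweeping `beta` from 0 to 4 with this function covers the range of divergences that the experiments compare.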
  • 22. Decomposition model and cost function • We used two β-divergences, one for the main cost and one for the regularization cost. [Equation: decomposition model, with the supervised bases fixed.] [Equation: cost function.]
  • 23. Update rules • We can obtain update rules that optimize the three variable matrices. [Equations: update rules.]
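The update rules themselves are not preserved in the transcript. As a rough illustration of how the holes can be excluded from the fit, the sketch below uses a deliberately simplified variant: squared Euclidean cost only, with no regularization or penalty term, so it is not the authors' update rule. The names `snmf_restore`, `I`, `G`, `H`, and `U` are assumptions.

```python
import numpy as np

def snmf_restore(Y, I, F, n_other, n_iter=200, eps=1e-12, seed=0):
    """Fit F @ G + H @ U to Y only where the binary mask I equals 1.

    F (supervised bases) stays fixed. Because G is estimated from the
    observed grids, F @ G also supplies values inside the holes (I == 0).
    Simplification: Euclidean cost, no regularization, no penalty term.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    G = rng.random((F.shape[1], n_frames)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_frames)) + eps
    IY = I * Y  # observed data with the holes zeroed out
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ IY) / (F.T @ (I * X) + eps)
        X = F @ G + H @ U
        H *= (IY @ U.T) / ((I * X) @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ IY) / (H.T @ (I * X) + eps)
    return F @ G, H @ U
```

The first returned matrix is the restored target spectrogram. Without the regularization term of the cost function, nothing limits the values extrapolated into the holes, which is one reason the full method carries that term.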
  • 24. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Multichannel NMF • 3. Proposed method – SNMF with spectrogram restoration and its Hybrid method • 4. Experiments – Closed data experiment – Open data experiment • 5. Conclusions 24
  • 25. Experimental condition • The mixed signal includes four melodies (sources). • Three compositions of instruments – We evaluated the average score of 36 patterns. • Supervision signal: 24 notes that cover all the notes in the target melody. • Datasets (Melody 1 / Melody 2 / Midrange / Bass): No. 1: Oboe / Flute / Piano / Trombone; No. 2: Trumpet / Violin / Harpsichord / Fagotto; No. 3: Horn / Clarinet / Piano / Cello. [Figure: sources 1 to 4 placed between left and right around the center, with the target source marked.]
  • 26. Experimental result: closed data • Signal-to-distortion ratio (SDR) – total quality of the separation, which includes the degree of separation and the absence of artificial distortion. [Figure: SDR [dB] (higher is better) versus β_NMF from 0 to 4, with KL-divergence and EUC-distance marked on the axis, for conventional SNMF (single-channel SNMF), the proposed hybrid method, directional clustering, and supervised multichannel NMF [Sawada].]
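For orientation, a simplified SDR can be computed as the energy ratio between a reference signal and the estimation error. This is an assumption-laden shortcut: evaluation toolkits normally decompose the error into interference and artifact components before forming the ratio, and the slides do not say which implementation was used. The function name is invented here.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Simplified signal-to-distortion ratio in dB (higher is better).

    Treats everything in (reference - estimate) as distortion.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference**2) / np.sum(distortion**2))
```

For example, a reference with energy 2 and an error with energy 0.04 gives 10·log10(50), roughly 17 dB.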
  • 27. SNMF with spectrogram restoration • SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation. • The optimal divergence for source separation is KL-divergence (β = 1). • In contrast, a divergence with a higher β value is suitable for the basis extrapolation.
  • 28. Trade-off: separation and restoration • The optimal divergence for SNMF with spectrogram restoration and its hybrid method is based on the trade-off between separation and restoration abilities. • [Figures: two basis spectra, amplitude [dB] versus frequency [kHz], one with strong sparseness and one with weak sparseness; a graph of separation performance, restoration performance, and total performance of the hybrid method over β from 0 to 4.]
  • 29. Experimental condition • Open data experiment – used different tone generators for the training and test signals. • Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A. • Test signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB). • [Figure: source positions 1 to 4 spread from left to right; the target source is in the center.]
  • 30. Experimental result: open data • Signal-to-distortion ratio (SDR): total quality of the separation, which includes the degree of separation and absence of artificial distortion. • [Figure: SDR [dB] (-4 to 10, higher is better) versus β_NMF (0 to 4, where β_NMF = 1 is KL-divergence and β_NMF = 2 is EUC-distance) for directional clustering, supervised multichannel NMF [Sawada], conventional SNMF (single-channel SNMF), and the proposed hybrid method.]
  • 31. Conclusions • We proposed a hybrid multichannel signal separation method combining directional clustering and SNMF with spectrogram restoration. • There is a trade-off between separation and restoration abilities. Thank you for your attention! You can hear a demonstration on my homepage!
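
The masked cost described on slides 20 to 23 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the update rules from the talk: it uses the standard textbook definition of the β-divergence, handles only the EUC case (β = 2) with generic weighted-NMF multiplicative updates, and omits the regularization and penalty terms. The function names and the default parameters are hypothetical.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Element-wise beta-divergence d_beta(y | x); x must be positive."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:   # KL-divergence (requires y > 0)
        return y * np.log(y / x) - y + x
    if beta == 0:   # IS-divergence (requires y > 0)
        return y / x - np.log(y / x) - 1.0
    # General case; beta = 2 gives half the squared Euclidean distance.
    return (y**beta + (beta - 1) * x**beta
            - beta * y * x**(beta - 1)) / (beta * (beta - 1))

def masked_cost(Y, X, mask, beta):
    """Sum the divergence over non-hole grids only (mask == 1)."""
    keep = mask.astype(bool)
    return float(np.sum(beta_divergence(Y[keep], X[keep], beta)))

def snmf_restore_euc(Y, F, mask, n_other=4, n_iter=200, seed=0, eps=1e-12):
    """Fit Y ~ F G + H U on non-hole grids with F fixed (EUC case only).

    Assumption: plain weighted-NMF multiplicative updates, without the
    regularization and penalty terms of the cost function in the talk.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    G = rng.random((F.shape[1], n_time)) + eps
    H = rng.random((n_freq, n_other)) + eps
    U = rng.random((n_other, n_time)) + eps
    M = mask.astype(float)
    MY = M * Y
    costs = []
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ MY) / (F.T @ (M * X) + eps)
        X = F @ G + H @ U
        H *= (MY @ U.T) / ((M * X) @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ MY) / (H.T @ (M * X) + eps)
        X = F @ G + H @ U
        costs.append(masked_cost(Y, X, mask, 2.0))
    return G, H, U, costs
```

After fitting, `F @ G` is evaluated on every grid, including the holes, which is where the supervised bases extrapolate the missing target components.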

Editor's Notes

  1. This is the outline of my talk.
  2. This is the outline of my talk.
  3. Recently, signal separation technologies have received much attention. These technologies can be used in many applications. Nonnegative matrix factorization, NMF for short, has been a very active area of signal separation research. In particular, supervised NMF (SNMF) achieves good separation performance. However, SNMF can be used only for single-channel signals. To improve its performance, an SNMF-based multichannel signal separation method is required.
  4. This is the outline of my talk.
  5. Before explaining supervised NMF, I will explain the basics of simple NMF. NMF is a powerful method for extracting significant features from a spectrogram. This method decomposes the input spectrogram Y into a product of a basis matrix F and an activation matrix G, where the basis matrix F has frequently appearing spectral patterns as basis vectors, like this, and the activation matrix G has the time-varying gains of each spectral pattern.
  6. To separate the target signal with NMF, supervised NMF has been proposed. In SNMF, first, we train on a sample sound of the target signal, which is like a musical scale. From this we construct the supervised basis F, which is a spectral dictionary of the target sound. Next, we separate the mixed signal using the supervised basis F, as FG + HU. The target signal is then obtained as FG, and the other signals are reconstructed as HU.
  7. The problem with SNMF is that it works only for single-channel signals, so we cannot use any information between channels. But almost all music signals are in stereo format, so we should extend simple SNMF to a multichannel SNMF. In addition, when many interfering sources exist, the separation performance of SNMF markedly degrades.
  8. As another means of multichannel signal separation, multichannel NMF has also been proposed by Sawada. This is a natural extension of NMF, and it uses spatial information for the clustering of bases to achieve unsupervised separation. However, this method is a very difficult optimization problem mathematically, so it strongly depends on the initial values.
  9. Sawada’s multichannel NMF is a unified method that solves the spatial and spectral separations simultaneously. This method maximizes a likelihood like this, where theta is the spatial direction of the target signal; F, G, H, and U are the source components of the target and other signals; and Y is the observed spectrogram of both channels. In the supervised situation, the target spectral patterns F are given like this. However, even if F is given, this optimization is too difficult to solve, so it lacks robustness. It also requires much computational time.
  10. Our proposed method approximately divides the problem into an unsupervised spatial separation and a supervised spectral separation. For the spatial separation, we can use classical D.O.A. estimation methods, which are very efficient and stable. Then SNMF is applied to the spectral separation problem. Therefore, this method can be considered a divide-and-conquer method: the most suitable method is applied to each separation.
  11. For the spatial separation, we used directional clustering because it is very fast and stable. This method utilizes the level difference between the left and right channels as a clustering cue, so we can separate the sources by direction. This is equivalent to binary masking in the spectrogram domain: we get the binary mask from the result of the clustering, and we calculate an entry-wise product. Finally, we obtain the separated direction. However, the separated direction has an artificial distortion owing to the binary masking.
  12. So we proposed a new SNMF-based method named SNMF with spectrogram restoration. This is the concept of our proposed hybrid method. First, the target direction is separated. Then, the target signal is extracted by this new SNMF.
  13. Here, the signal separated by directional clustering has many spectral holes owing to the binary masking. This spectrum is an example. However, the proposed SNMF treats these holes as unseen observations, like this. We exclude these components from the cost function. Then, the target bases are extrapolated using the best-fitting spectral pattern from the supervised bases F. As a result, the lost components are restored by the supervised basis extrapolation.
  14. This figure shows the directional distribution of the input stereo signal. The target source is in the center direction, and the other interfering sources are distributed like this. After directional clustering, left and right source components leak into the center cluster, and the center sources lose some of their components. These lost components correspond to the spectral holes. After SNMF with spectrogram restoration, the target components are separated and restored using the supervised bases. In other words, the resolution of the target spectrogram is recovered.
  15. This is the decomposition model of SNMF with spectrogram restoration. It is the same as that of simple SNMF. J is the cost function of the proposed SNMF. In this cost function,
  16. We introduce the binary index i, which excludes the holes from the total cost. This index is obtained from the binary mask matrix. Therefore, the divergence is defined at all spectrogram grids except for the spectral holes.
  17. For the grids of the holes, we impose a regularization term to avoid the extrapolation error.
  18. The third term is a penalty term to avoid sharing the same basis between F and H. This penalty improves the separation performance in SNMF.
  19. For the divergence measure, we propose to use the beta-divergence. This is a generalized distance function that includes the EUC-distance, KL-divergence, and IS-divergence when beta = 2, 1, and 0, respectively. In SNMF, it is reported that KL-divergence is the best criterion for signal separation.
  20. We used two beta-divergences for the main cost and the regularization cost, with parameters beta_NMF and beta_reg.
  21. From the minimization of the cost function, we can obtain the update rules for the optimization of the variable matrices G, H, and U.
  22. This is the outline of my talk.
  23. These are the experimental conditions. The mixed signal includes four melodies. The sound sources are located as in this figure, where the target source is always located in the center direction together with another interfering source. We prepared 3 compositions of instruments and evaluated the average score of 36 patterns. In addition, the supervision signal has 24 notes, like this score, which cover all the notes in the target melody.
  24. This is the result of the experiment. We show the average SDR score, where SDR indicates the total quality of the separation. Directional clustering cannot separate sources in the same direction, so its result was not good. Multichannel NMF strongly depends on the initial values, so its average score is poor. The hybrid method outperforms the conventional SNMF. The conventional SNMF achieves its highest score when beta equals 1, that is, KL-divergence. However, surprisingly, EUC-distance is preferable for the proposed hybrid method.
  25. This is because SNMF with spectrogram restoration has two tasks, namely, separation of the target signal and basis extrapolation for the restoration of the spectrogram. It is reported that KL-divergence is suitable for source separation. In contrast, a divergence with a higher beta value is suitable for the basis extrapolation. This fact is shown experimentally in our paper.
  26. The reason is that if we use a smaller beta value, such as KL-divergence, the obtained basis becomes sparse. (pointing at the figure) On the other hand, if we use a higher beta value, the sparseness of the basis becomes weak. A sparse basis is not suitable for basis extrapolation using only the observable data. Therefore, the optimal divergence for the hybrid method is around EUC-distance because of the trade-off between separation and restoration abilities, like this graph. The optimal beta is shifted from 1 to 2.
  27. Also, we conducted an open data experiment. Here we used different MIDI tone generators for the training and test signals. Therefore, the waveforms are not the same, but similar. In addition, we added background noise to the test signals at SNR = 10 dB.
  28. This is the result. Even if we use a different training sound, we can achieve good results. Sawada’s multichannel NMF does not work because this method cannot reduce the diffuse noise.
  29. These are the conclusions of my talk. Thank you for your attention.
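  30. The other experimental conditions are as shown here. We compare the evaluation scores for all combinations when the NMF divergence criterion beta_NMF is varied from 0 to 4. For the regularization divergence criterion, we show only beta_reg = 1, which gave the highest performance. We use SDR as the evaluation score. SDR is the total separation accuracy, which includes the degree of separation and the absence of artificial distortion.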
  31. Supervised methods have an inherent problem: we cannot get a perfect supervision sound for the target signal. Even if the supervision sounds come from the same type of instrument as the target sound, these sounds differ according to various conditions, for example, individual styles of playing and the timbre individuality of each instrument. When we want to separate this piano sound from the mixed signal, we may only be able to prepare a similar piano sound whose timbre is slightly different. In that case, supervised NMF cannot separate the target because the spectra of the supervision sound differ from those of the target sound.
  32. To solve this problem, we have proposed a new supervised method that adapts the supervised bases to the target spectra by a basis deformation. This is the decomposition model in this method. We introduce the deformable term, which has both positive and negative values, like this. Then we optimize the matrices D, G, H, and U. This figure indicates the spectral difference between the real sound and the artificial sound.
  33. This is the result of the experiment using real-recorded signals. From this result, we can confirm that the optimal divergence for the hybrid method is EUC-distance.
  34. In NMF decomposition, the cost function is defined as a distance or a divergence between the input matrix Y and the decomposed matrix FG. J_NMF indicates the cost function in NMF, and we minimize it to find F and G under the nonnegativity constraint. There are several criteria for the distance used in the cost function. These 3 criteria are often used in NMF decomposition.
  35. NMF decomposition is equivalent to a maximum likelihood estimation that implicitly assumes a generation model of the input data Y. If we select the parameter beta, the assumed generation model is fixed. In other words, the parameter beta defines the generation model of the input data.
  36. In this analysis, to compare the net extrapolation ability, we generated random input data Y that obey each generation model. We also prepared the binary-masked random data YI and attempted to restore it. In the training process, we construct the supervised basis F using the random data Y. Then we attempt to restore the binary-masked data using the trained basis F.
  37. The binary mask I was generated in a uniform manner, and we generated two types of binary masks whose densities of holes are 75% and 98%. By calculating the similarity between the input data Y and the restored data FG, we can evaluate the extrapolation ability and the accuracy of restoration. So SAR indicates the accuracy of restoration.
  38. These are the results of the analysis. The left one is the result for 75%-binary-masked data, and the right one is for 98%-binary-masked data. Beta equals 1, which means KL-divergence, is the optimal divergence for source separation. But, surprisingly, the optimal divergence for the restoration is around beta equals 3.
  39. We also conducted an experiment using real-recorded signals. In this experiment, the binaural mixed signal was recorded in a real environment. The other conditions are the same as those in the previous experiment.
  40. This is the result of the experiment using real-recorded signals. From this result, we can confirm that the optimal divergence for the hybrid method is EUC-distance.
  41. As I already said, the best divergence depends on the number of holes. If there are many holes, beta = 2 should be used, and if there are not so many holes, beta = 1 should be used. Therefore, the divergence should be switched to the optimal one with a threshold value. We propose a frame-wise multi-divergence.
  42. We define the multi-divergence case by case at each time frame, where r_t means the density of holes at frame t. The divergence is adapted according to the threshold value tau.
  43. Then we evaluated various patterns of spatial location of the sources, SP1 to SP4. SP4 leads to more spectral holes than SP1. From this result, we can confirm that the multi-divergence always achieves the highest performance.
  44. SDR is the total evaluation score for the separation performance.
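
The directional clustering step in note 11 can be illustrated with a short Python sketch. It is only a sketch under assumptions that the talk does not specify: the level-difference cue is taken as the right-channel share of the amplitude, the clustering is a small one-dimensional k-means, and three direction clusters are used. The function name `directional_masks` and its parameters are hypothetical.

```python
import numpy as np

def directional_masks(L, R, n_clusters=3, n_iter=20, eps=1e-12):
    """Cluster time-frequency grids by a level-difference cue.

    L, R: left and right spectrograms (complex or magnitude), same shape.
    Returns one binary mask per direction cluster and the cluster centers.
    """
    aL, aR = np.abs(L), np.abs(R)
    # Assumed cue: 0 = hard left, 0.5 = center, 1 = hard right.
    cue = aR / (aL + aR + eps)
    centers = np.linspace(0.0, 1.0, n_clusters)
    for _ in range(n_iter):
        # Assign each grid to the nearest center, then update the centers.
        labels = np.argmin(np.abs(cue[..., None] - centers), axis=-1)
        for k in range(n_clusters):
            if np.any(labels == k):
                centers[k] = cue[labels == k].mean()
    labels = np.argmin(np.abs(cue[..., None] - centers), axis=-1)
    masks = [(labels == k).astype(float) for k in range(n_clusters)]
    return masks, centers
```

With three clusters, `masks[1]` plays the role of the binary mask for the center direction, and its entry-wise product with the spectrogram gives the separated direction. The grids where that mask is zero are the spectral holes.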
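
The frame-wise switching in notes 41 and 42 can be sketched as follows. This Python example is an illustration under assumptions: the hole density r_t is computed as the fraction of zero entries of the binary mask in each time frame, and frames with r_t at or above tau use beta = 2 while the others use beta = 1. Whether the comparison is strict, and the function names, are assumptions.

```python
import numpy as np

def hole_density(mask):
    """r_t: fraction of hole grids (mask == 0) in each time frame (column)."""
    return 1.0 - np.asarray(mask, dtype=float).mean(axis=0)

def framewise_beta(mask, tau, beta_many_holes=2.0, beta_few_holes=1.0):
    """Choose one beta per time frame by thresholding the hole density."""
    r = hole_density(mask)
    return np.where(r >= tau, beta_many_holes, beta_few_holes)
```

Each frame's divergence is then evaluated with its own beta, so frames with many holes favor restoration and frames with few holes favor separation.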