Online divergence switching for superresolution-based nonnegative matrix factorization

Daichi Kitamura, Assistant Professor at National Institute of Technology, Kagawa College
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
Yu Takahashi, Kazunobu Kondo
(Yamaha Corporation, Japan)
Hirokazu Kameoka
(The University of Tokyo, Japan)
2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing
Speech Analysis (2), 2PM2-2
Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Directional clustering
– Hybrid method
• 3. Proposed method
– Online divergence switching for hybrid method
• 4. Experiments
• 5. Conclusions
Research background
• Music signal separation technologies have received much attention.
– Applications: automatic music transcription, 3D audio systems, etc.
• Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area.
• The separation performance of supervised NMF (SNMF) markedly degrades when the mixture contains many sources.
• We have proposed a new hybrid separation method for stereo music signals.
Research background
• Our proposed hybrid method
– Input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal
Research background
• The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal.
• Our aim in this presentation
– We propose a new separation scheme for this hybrid method that separates the target signal with high accuracy under any spatial condition.
NMF [Lee, et al., 2001]
• NMF
– is a sparse representation algorithm.
– can extract significant features from the observed matrix.
• The observed matrix (spectrogram, Ω × 𝑇) is approximated by the product of the basis matrix (spectral patterns, Ω × 𝐾) and the activation matrix (time-varying gains, 𝐾 × 𝑇).
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
[Figure: spectrogram (frequency vs. time), basis spectra (amplitude vs. frequency), and activations (amplitude vs. time)]
Optimization in NMF
• The variable matrices (the basis matrix and the activation matrix) are optimized by minimizing the divergence between the observed matrix and their product.
• The cost function is the sum of this divergence over all entries of the observed matrix.
• Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used as the divergence in the cost function.
• In NMF-based separation, a cost function based on KL-divergence achieves high separation performance.
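The optimization described on this slide can be sketched with the standard multiplicative update rules of Lee and Seung. This is a minimal illustration, not the authors' implementation: the symbol names `Y` (observed matrix), `F` (basis matrix), and `G` (activation matrix), the function name `nmf`, and the iteration count are all assumptions made for this example.

```python
import numpy as np

def nmf(Y, K, divergence="kl", n_iter=200, eps=1e-12, seed=0):
    """Factorize a nonnegative matrix Y (Omega x T) as F @ G.

    F (Omega x K) and G (K x T) are hypothetical names for the basis
    and activation matrices; the slides do not preserve the symbols.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, K)) + eps  # basis matrix
    G = rng.random((K, n_time)) + eps  # activation matrix
    for _ in range(n_iter):
        if divergence == "euc":
            # Multiplicative updates for the squared Euclidean distance
            F *= (Y @ G.T) / (F @ G @ G.T + eps)
            G *= (F.T @ Y) / (F.T @ F @ G + eps)
        elif divergence == "kl":
            # Multiplicative updates for the generalized KL-divergence
            F *= ((Y / (F @ G + eps)) @ G.T) / (G.sum(axis=1)[None, :] + eps)
            G *= (F.T @ (Y / (F @ G + eps))) / (F.sum(axis=0)[:, None] + eps)
        else:
            raise ValueError("divergence must be 'euc' or 'kl'")
    return F, G
```

Both update rules keep every entry nonnegative because they only multiply by nonnegative ratios, which is why no explicit projection step is needed.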
SNMF [Smaragdis, et al., 2007]
• SNMF utilizes some sample sounds of the target.
– Training process: construct the supervised basis matrix (spectral dictionary) of the target sound from sample sounds of the target signal (e.g., a musical scale).
– Separation process: decompose the mixed signal into the target signal and the other signal. The supervised basis matrix is fixed, and the remaining matrices are optimized.
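The separation process can be sketched as follows, assuming a plain KL-divergence SNMF in which the mixture is modeled as `F @ G + H @ U` with the trained basis `F` held fixed. The names `G`, `H`, `U`, `snmf_separate`, and `n_other` are hypothetical, and this sketch omits any penalty or regularization terms that a specific SNMF variant may add.

```python
import numpy as np

def snmf_separate(Y, F, n_other=10, n_iter=200, eps=1e-12, seed=0):
    """KL-divergence SNMF sketch: Y ~ F @ G + H @ U, with F fixed.

    F is the supervised basis matrix from the training process.
    G, H, U are the variables (hypothetical names for this example).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    G = rng.random((F.shape[1], n_time)) + eps  # activations of target bases
    H = rng.random((n_freq, n_other)) + eps     # bases for the other signal
    U = rng.random((n_other, n_time)) + eps     # activations of other bases
    for _ in range(n_iter):
        R = Y / (F @ G + H @ U + eps)
        G *= (F.T @ R) / (F.sum(axis=0)[:, None] + eps)
        R = Y / (F @ G + H @ U + eps)
        H *= (R @ U.T) / (U.sum(axis=1)[None, :] + eps)
        R = Y / (F @ G + H @ U + eps)
        U *= (H.T @ R) / (H.sum(axis=0)[:, None] + eps)
    # Estimated target spectrogram and estimated other spectrogram
    return F @ G, H @ U
```

Only `G`, `H`, and `U` are updated; `F` is never written to, which is what "fixed" means in the slide.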
Problem of SNMF
• The separation performance of SNMF markedly degrades when many interference sources exist.
[Figure: SNMF separation in a two-source case and in a five-source case, with residual components remaining in the separated signal]
Directional clustering [Araki, et al., 2007]
• Directional clustering
– utilizes differences between channels as a separation cue.
– is equivalent to binary masking in the spectrogram domain.
• Problems
– Sources in the same direction cannot be separated.
– Artificial distortion arises owing to the binary masking.
• Example: binary masking that extracts the center direction from a stereo input signal (C: center, L: left, R: right; rows are frequency bins, columns are time frames)

Spectrogram (direction assigned to each entry):
C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C

Binary mask:
1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1

The separated signal is the entry-wise product of the spectrogram and the binary mask.
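The binary masking on this slide can be reproduced in a few lines. The direction labels below are the ones shown on the slide; the spectrogram values are placeholders invented for this example, since the slide gives only the labels.

```python
import numpy as np

# Direction label of each time-frequency entry, as shown on the slide
# (C: center, L: left, R: right).
labels = np.array([
    list("CCCRLR"),
    list("CLLLRR"),
    list("CCCCRR"),
    list("CRRLLL"),
    list("CCCCCC"),
])

# Placeholder amplitude spectrogram (values are made up for illustration).
spectrogram = np.arange(1, 31, dtype=float).reshape(5, 6)

# Binary mask for the center direction: 1 where the entry is labeled C.
mask = (labels == "C").astype(float)

# Separated signal: entry-wise product of spectrogram and mask.
separated = mask * spectrogram
```

Entries where `mask` is 0 are set to zero in `separated`; these are the gaps that the later slides call spectral chasms.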
Hybrid method [D. Kitamura, et al., 2013]
• We have proposed a new SNMF called superresolution-based SNMF and a hybrid method that uses it.
• The hybrid method consists of directional clustering and superresolution-based SNMF.
– Directional clustering: spatial separation of the stereo input (L, R)
– Superresolution-based SNMF: spectral separation
Superresolution-based SNMF
• This SNMF reconstructs the spectrogram obtained from directional clustering using supervised basis extrapolation.
[Figure: the input spectrogram (target direction and other directions) passes through directional clustering to give the separated cluster, which contains chasms; superresolution-based SNMF then yields the reconstructed spectrogram. Axes: frequency vs. time.]
Superresolution-based SNMF
• Directional clustering leaves spectral chasms in the separated cluster.
– These chasms are treated as unseen observations.
– The fittest supervised bases are extrapolated into the chasms.
Superresolution-based SNMF
[Figure: frequency of source components versus direction (left, center, right) for (a) the input signal, (b) the signal after directional clustering (binary masking), and (c) the signal after superresolution-based SNMF, with the target and the extrapolated components marked]
[Figure: the observed spectrogram (target and interference) is reduced to the separated cluster by directional clustering; superresolution-based SNMF extrapolates with the supervised spectral bases to obtain the reconstructed data. Axes: frequency vs. time.]
Decomposition model and cost function
• The divergence is defined at all grids except for the chasms by using the index matrix obtained from directional clustering.
• Decomposition model: the supervised bases are fixed, and the remaining matrices are variables.
• Cost function: the divergence over the non-chasm grids, plus a regularization term and a penalty term.
– The cost function also involves weighting parameters, the binary complement of the index matrix, and the Frobenius norm.
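The masked divergence, which is the part of the cost function that the slide text describes explicitly, can be sketched as below. It assumes the index matrix `I` holds 1 at observed grids and 0 at chasms. The names `Y`, `Y_hat`, `I`, and `masked_divergence` are hypothetical, and the regularization and penalty terms are deliberately left out because their exact form is not recoverable from this transcript.

```python
import numpy as np

def masked_divergence(Y, Y_hat, I, divergence="kl", eps=1e-12):
    """Sum the entry-wise divergence between Y and Y_hat over grids
    where the index matrix I is 1 (assumed: 1 = observed, 0 = chasm)."""
    if divergence == "euc":
        D = (Y - Y_hat) ** 2
    elif divergence == "kl":
        # Generalized KL-divergence, evaluated entry by entry
        D = Y * np.log((Y + eps) / (Y_hat + eps)) - Y + Y_hat
    else:
        raise ValueError("divergence must be 'euc' or 'kl'")
    return float(np.sum(I * D))
```

Because chasm entries are multiplied by 0, the model is free to place energy there, which is what allows the supervised bases to be extrapolated into the chasms.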
Update rules
• We can obtain the update rules for the optimization of the three variable matrices.
[Equations: update rules for each variable matrix]
Consideration for optimal divergence
• Separation performance of conventional SNMF
– KL-divergence outperforms EUC-distance.
• However, for superresolution-based SNMF, it is not obvious which of KL-divergence and EUC-distance performs better.
– The optimal divergence depends on the amount of spectral chasms.
Consideration for optimal divergence
• Superresolution-based SNMF has two tasks: signal separation and basis extrapolation.
• Abilities of each divergence

                 Signal separation    Basis extrapolation
KL-divergence    Very good            Poor
EUC-distance     Good                 Good
Consideration for optimal divergence
• A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance.
• A sparse basis is not suitable for extrapolation from the observable data.
[Figure: example basis spectra obtained with KL-divergence and with EUC-distance; amplitude [dB] from -10 to 0 versus frequency [kHz] from 0 to 5]
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities.
– Sparseness is strong with KL-divergence and weak with EUC-distance.
[Figure: separation, extrapolation, and total performance plotted against sparseness, from anti-sparse to sparse, with example basis spectra for KL-divergence and EUC-distance (amplitude [dB] versus frequency [kHz])]
Consideration for optimal divergence
• The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms.
– If there are many chasms, the extrapolation ability is required, so EUC-distance should be used.
– If there are few chasms, the separation ability is required, so KL-divergence should be used.
Hybrid method for online input data
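• When we apply the hybrid method to online input data, directional clustering applies a binary mask to the observed spectrogram as it arrives, which yields an online binary-masked spectrogram.
[Figure: observed spectrogram, binary mask, and the resulting online binary-masked spectrogram; axes: frequency vs. time]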
Hybrid method for online input data
• We divide the online spectrogram into blocks along the time axis.
• Superresolution-based SNMF is applied to each block in parallel.
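The block division can be sketched as a simple split along the time axis. The function name `split_into_blocks` and the parameter `block_len` are hypothetical; the slides do not state the block length, so it is left to the caller.

```python
import numpy as np

def split_into_blocks(X, block_len):
    """Split a (frequency x time) matrix into consecutive time blocks.

    The last block may be shorter if the number of frames is not a
    multiple of block_len (an illustrative choice, not from the slides).
    """
    return [X[:, s:s + block_len] for s in range(0, X.shape[1], block_len)]
```

The same split would be applied to the binary-masked spectrogram and to the index matrix so that each block carries its own mask.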
Online divergence switching
• We calculate the rate of chasms in each block.
– If the rate exceeds a threshold value (there are many chasms), superresolution-based SNMF with EUC-distance is used.
– Otherwise (there are few chasms), superresolution-based SNMF with KL-divergence is used.
Procedure of proposed method
[Figure: flowchart of the proposed method]
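The switching step can be sketched as follows. Since the deck rates EUC-distance as better at basis extrapolation and KL-divergence as better at signal separation, this sketch selects EUC-distance for blocks with many chasms and KL-divergence otherwise. The chasm rate is assumed here to be the fraction of zero entries in the block's index matrix, and the threshold of 0.5 is a placeholder, since the slides do not give the actual value.

```python
import numpy as np

def chasm_rate(I_block):
    """Fraction of chasm grids in a block (assumed: I = 1 observed, 0 chasm)."""
    return 1.0 - float(np.mean(I_block))

def choose_divergence(I_block, threshold=0.5):
    """Pick the divergence for one block; threshold is a hypothetical value."""
    return "euc" if chasm_rate(I_block) > threshold else "kl"

def plan_divergences(I, block_len, threshold=0.5):
    """Return the divergence chosen for each consecutive time block of I."""
    return [
        choose_divergence(I[:, s:s + block_len], threshold)
        for s in range(0, I.shape[1], block_len)
    ]
```

Each block would then be passed to superresolution-based SNMF with the chosen divergence; that step is not shown here.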
Experimental conditions
• We used stereo-panning signals.
– Mixtures of four instruments generated by a MIDI synthesizer
• We used the same type of MIDI sounds of the target instruments as supervision for the training process.
– Supervision sound: two octaves of notes that cover all the notes of the target signal
[Figure: panning layout of the four sources (numbered 1 to 4) between left, center, and right, with the target source marked]
Experimental conditions
• We compared three methods.
– Hybrid method using only EUC-distance-based SNMF
(Conventional method 1)
– Hybrid method using only KL-divergence-based SNMF
(Conventional method 2)
– Proposed hybrid method that switches the divergence to
the optimal one (Proposed method)
• We used the signal-to-distortion ratio (SDR) as the evaluation score.
– SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
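As a rough illustration of what an SDR score measures, the sketch below computes a simplified energy ratio between a reference signal and the estimation error. This is an assumption made for illustration only: the slides do not say how SDR was computed, and standard evaluation toolkits use a more elaborate decomposition of the error. The function name `simple_sdr` is hypothetical.

```python
import numpy as np

def simple_sdr(reference, estimate, eps=1e-12):
    """Simplified SDR in dB: energy of the reference over energy of the error.

    Not the full evaluation procedure; a sketch for intuition only.
    """
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)
    error = reference - estimate
    return 10.0 * np.log10(
        (np.sum(reference ** 2) + eps) / (np.sum(error ** 2) + eps)
    )
```

A higher value means the estimate is closer to the reference, which matches the "higher is better" reading of the result chart.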
Experimental result
• Average SDR scores for each method, where the four instruments are shuffled into 12 combinations.
• The proposed method outperforms the other methods.
[Figure: bar chart of average SDR [dB] (axis from 8.0 to 10.0; higher is better) for conventional method 1, conventional method 2, and the proposed method]
Conclusions
• We proposed a new divergence switching scheme for superresolution-based SNMF.
• The method separates online input signals by using the optimal divergence in NMF.
• The proposed method can be applied under any spatial condition of the sources and separates the target signal with high accuracy.
Thank you for your attention!
Daichi Kitamura1.8K views
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese) by Daichi Kitamura
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
Daichi Kitamura5.9K views
Study on optimal divergence for superresolution-based supervised nonnegative ... by Daichi Kitamura
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...
Daichi Kitamura1K views

Recently uploaded

CHEMICAL KINETICS.pdf by
CHEMICAL KINETICS.pdfCHEMICAL KINETICS.pdf
CHEMICAL KINETICS.pdfAguedaGutirrez
8 views337 slides
LFA-NPG-Paper.pdf by
LFA-NPG-Paper.pdfLFA-NPG-Paper.pdf
LFA-NPG-Paper.pdfharinsrikanth
40 views13 slides
Object Oriented Programming with JAVA by
Object Oriented Programming with JAVAObject Oriented Programming with JAVA
Object Oriented Programming with JAVADemian Antony D'Mello
95 views28 slides
DevOps to DevSecOps: Enhancing Software Security Throughout The Development L... by
DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...
DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...Anowar Hossain
12 views34 slides
Update 42 models(Diode/General ) in SPICE PARK(DEC2023) by
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)Update 42 models(Diode/General ) in SPICE PARK(DEC2023)
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)Tsuyoshi Horigome
19 views16 slides
SNMPx by
SNMPxSNMPx
SNMPxAmatullahbutt
14 views12 slides

Recently uploaded(20)

DevOps to DevSecOps: Enhancing Software Security Throughout The Development L... by Anowar Hossain
DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...
DevOps to DevSecOps: Enhancing Software Security Throughout The Development L...
Anowar Hossain12 views
Update 42 models(Diode/General ) in SPICE PARK(DEC2023) by Tsuyoshi Horigome
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)Update 42 models(Diode/General ) in SPICE PARK(DEC2023)
Update 42 models(Diode/General ) in SPICE PARK(DEC2023)
Literature review and Case study on Commercial Complex in Nepal, Durbar mall,... by AakashShakya12
Literature review and Case study on Commercial Complex in Nepal, Durbar mall,...Literature review and Case study on Commercial Complex in Nepal, Durbar mall,...
Literature review and Case study on Commercial Complex in Nepal, Durbar mall,...
AakashShakya1257 views
An approach of ontology and knowledge base for railway maintenance by IJECEIAES
An approach of ontology and knowledge base for railway maintenanceAn approach of ontology and knowledge base for railway maintenance
An approach of ontology and knowledge base for railway maintenance
IJECEIAES12 views
Machine learning in drug supply chain management during disease outbreaks: a ... by IJECEIAES
Machine learning in drug supply chain management during disease outbreaks: a ...Machine learning in drug supply chain management during disease outbreaks: a ...
Machine learning in drug supply chain management during disease outbreaks: a ...
IJECEIAES10 views
Effect of deep chemical mixing columns on properties of surrounding soft clay... by AltinKaradagli
Effect of deep chemical mixing columns on properties of surrounding soft clay...Effect of deep chemical mixing columns on properties of surrounding soft clay...
Effect of deep chemical mixing columns on properties of surrounding soft clay...
AltinKaradagli6 views
Performance of Back-to-Back Mechanically Stabilized Earth Walls Supporting th... by ahmedmesaiaoun
Performance of Back-to-Back Mechanically Stabilized Earth Walls Supporting th...Performance of Back-to-Back Mechanically Stabilized Earth Walls Supporting th...
Performance of Back-to-Back Mechanically Stabilized Earth Walls Supporting th...
ahmedmesaiaoun12 views
Machine Element II Course outline.pdf by odatadese1
Machine Element II Course outline.pdfMachine Element II Course outline.pdf
Machine Element II Course outline.pdf
odatadese17 views
NEW SUPPLIERS SUPPLIES (copie).pdf by georgesradjou
NEW SUPPLIERS SUPPLIES (copie).pdfNEW SUPPLIERS SUPPLIES (copie).pdf
NEW SUPPLIERS SUPPLIES (copie).pdf
georgesradjou14 views
_MAKRIADI-FOTEINI_diploma thesis.pptx by fotinimakriadi
_MAKRIADI-FOTEINI_diploma thesis.pptx_MAKRIADI-FOTEINI_diploma thesis.pptx
_MAKRIADI-FOTEINI_diploma thesis.pptx
fotinimakriadi6 views
fakenews_DBDA_Mar23.pptx by deepmitra8
fakenews_DBDA_Mar23.pptxfakenews_DBDA_Mar23.pptx
fakenews_DBDA_Mar23.pptx
deepmitra812 views
A multi-microcontroller-based hardware for deploying Tiny machine learning mo... by IJECEIAES
A multi-microcontroller-based hardware for deploying Tiny machine learning mo...A multi-microcontroller-based hardware for deploying Tiny machine learning mo...
A multi-microcontroller-based hardware for deploying Tiny machine learning mo...
IJECEIAES12 views

Online divergence switching for superresolution-based nonnegative matrix factorization

  • 1. Online Divergence Switching for Superresolution-Based Nonnegative Matrix Factorization Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan) Yu Takahashi, Kazunobu Kondo (Yamaha Corporation, Japan) Hirokazu Kameoka (The University of Tokyo, Japan) 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing Speech Analysis(2),2PM2-2
  • 2. Outline • 1. Research background • 2. Conventional methods – Nonnegative matrix factorization – Supervised nonnegative matrix factorization – Directional clustering – Hybrid method • 3. Proposed method – Online divergence switching for hybrid method • 4. Experiments • 5. Conclusions
  • 3. Outline (repeated) • Next section: 1. Research background
  • 4. Research background • Music signal separation technologies have received much attention. Applications: automatic music transcription, 3D audio systems, etc. • Music signal separation based on nonnegative matrix factorization (NMF) is a very active research area. • The separation performance of supervised NMF (SNMF) markedly degrades for mixtures of many sources. • We have proposed a new hybrid separation method for stereo music signals.
  • 5. Research background • Our proposed hybrid method: input stereo signal (L, R) → spatial separation method (directional clustering) → SNMF-based separation method (superresolution-based SNMF) → separated signal.
  • 6. Research background • The optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal. • Our aim in this presentation: we propose a new optimal separation scheme for this hybrid method that separates the target signal with high accuracy under any type of spatial condition.
  • 7. Outline (repeated) • Next section: 2. Conventional methods
  • 8. NMF [Lee, et al., 2001] • NMF – is a sparse representation algorithm. – can extract significant features from the observed matrix. • Figure: the observed matrix Y (spectrogram, frequency × time) is approximated by the product of the basis matrix F (spectral patterns) and the activation matrix G (time-varying gains). Ω: number of frequency bins; T: number of time frames; K: number of bases.
  • 9. Optimization in NMF • The variable matrices F and G are optimized by minimizing the divergence between Y and FG. • Euclidean distance (EUC-distance) and Kullback-Leibler divergence (KL-divergence) are often used as the divergence in the cost function. • In NMF-based separation, a KL-divergence-based cost function achieves high separation performance. • Cost function: [equation not preserved in this transcript], defined over the entries of Y, F, and G.
  • 10. SNMF [Smaragdis, et al., 2007] • SNMF utilizes sample sounds of the target. – Training process: construct the supervised basis matrix F (a spectral dictionary) from sample sounds of the target signal, e.g., a musical scale. – Separation process: with F fixed, decompose the mixed signal into the target signal FG and the other signal HU by optimizing G, H, and U.
  • 11. Problem of SNMF • The separation performance of SNMF markedly degrades when many interference sources exist. • Figure: separation results for a two-source case and a five-source case, with residual components indicated.
  • 12. Directional clustering [Araki, et al., 2007] • Directional clustering – utilizes differences between channels as a separation cue. – is equivalent to binary masking in the spectrogram domain. • Problems – Sources in the same direction cannot be separated. – Artificial distortion arises owing to the binary masking. • Figure: each time-frequency bin of the stereo input spectrogram is labeled left (L), center (C), or right (R); the resulting binary mask is applied to the spectrogram as an entry-wise product to give the separated signal.
  • 13. Hybrid method [D. Kitamura, et al., 2013] • We have proposed a new SNMF called superresolution-based SNMF and its hybrid method. • The hybrid method consists of directional clustering (spatial separation) followed by superresolution-based SNMF (spectral separation).
  • 14. Superresolution-based SNMF • This SNMF reconstructs the spectrogram obtained from directional clustering using supervised basis extrapolation. • Figure: input spectrogram (target direction and other directions) → directional clustering → separated cluster with chasms → superresolution-based SNMF → reconstructed spectrogram.
  • 15. Superresolution-based SNMF • Spectral chasms arise in the separated cluster owing to directional clustering. • These chasms are treated as unseen observations, and the fittest bases are extrapolated from the supervised bases.
  • 16. Superresolution-based SNMF • Figure: frequency of source components versus direction (left, center, right) for (a) the input signal, with the target in the center, (b) after directional clustering (binary masking), and (c) after superresolution-based SNMF, with the extrapolated components. In the spectrogram domain: observed spectrogram (target and interference) → directional clustering → separated cluster → superresolution-based SNMF, which extrapolates using the supervised spectral bases → reconstructed data.
  • 17. Decomposition model and cost function • The divergence is defined at all grids except for the chasms by using the index matrix I obtained from directional clustering. • Decomposition model: Y ≈ FG + HU, where F is the supervised basis matrix (fixed). • Cost function: [equation not preserved in this transcript]; it includes a regularization term and a penalty term. The slide also defines the entries of the matrices, the weighting parameters, the binary complement of the index entries, and the Frobenius norm.
  • 18. Update rules • We can obtain the update rules for optimizing the variable matrices G, H, and U. • Update rules: [equations not preserved in this transcript]
  • 19. Outline (repeated) • Next section: 3. Proposed method
  • 20. Consideration for optimal divergence • Separation performance of conventional SNMF: KL-divergence outperforms EUC-distance. • However, for superresolution-based SNMF, the optimal divergence depends on the amount of spectral chasms, so either KL-divergence or EUC-distance may be the better choice.
  • 21. Consideration for optimal divergence • Superresolution-based SNMF has two tasks: signal separation and basis extrapolation. • Abilities of each divergence: KL-divergence: signal separation very good, basis extrapolation poor. EUC-distance: signal separation good, basis extrapolation good.
  • 22. Consideration for optimal divergence • A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance. • A sparse basis is not suitable for extrapolation using observable data. • Figure: example basis spectra (amplitude [dB] versus frequency [kHz]) for KL-divergence and EUC-distance.
  • 23. Consideration for optimal divergence • The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities. • Figure: separation, extrapolation, and total performance plotted against basis sparseness, from anti-sparse (weak sparseness, EUC-distance) to sparse (strong sparseness, KL-divergence), with example basis spectra (amplitude [dB] versus frequency [kHz]).
  • 24. Consideration for optimal divergence • The optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms. • If there are many chasms, the extrapolation ability is required, so EUC-distance should be used. • If there are few chasms, the separation ability is required, so KL-divergence should be used.
  • 25. Hybrid method for online input data • When we consider applying the hybrid method to online input data, directional clustering applies a binary mask to the observed spectrogram and yields an online binary-masked spectrogram.
  • 26. Hybrid method for online input data • We divide the online spectrogram into blocks along the time axis. • Superresolution-based SNMF is applied to each block in parallel.
  • 27. Online divergence switching • We calculate the rate of chasms r in each block and compare it with a threshold value τ. • If the block has few chasms, superresolution-based SNMF with KL-divergence is used. • If the block has many chasms, superresolution-based SNMF with EUC-distance is used.
  • 29. Outline (repeated) • Next section: 4. Experiments
  • 30. Experimental conditions • We used stereo-panning signals. • The mixture consists of four instruments generated by a MIDI synthesizer; the target source is in the center, with interfering sources to the left and right. • We used the same type of MIDI sounds of the target instruments as supervision for the training process. • Supervision sound: two octaves of notes that cover all the notes of the target signal.
  • 31. Experimental conditions • We compared three methods. – Hybrid method using only EUC-distance-based SNMF (Conventional method 1) – Hybrid method using only KL-divergence-based SNMF (Conventional method 2) – Proposed hybrid method that switches the divergence to the optimal one (Proposed method) • We used the signal-to-distortion ratio (SDR) as an evaluation score. – SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
  • 32. Experimental result • Average SDR scores for each method over 12 combinations of the four instruments. • The proposed method outperforms the other methods. • Figure: bar chart of SDR [dB] (axis from 8.0 to 10.0; higher is better) for Conventional method 1, Conventional method 2, and the Proposed method.
  • 33. Conclusions • We propose a new divergence switching scheme for superresolution-based SNMF. • This method separates online input signals using the optimal divergence in NMF. • The proposed method can be used for any type of spatial condition of the sources, and separates the target signal with high accuracy. Thank you for your attention!
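
Slide 9 says that F and G are found by minimizing a divergence between Y and FG, using either the EUC-distance or the KL-divergence, but the cost function itself did not survive in this transcript. The following is a minimal NumPy sketch of the well-known multiplicative update rules for these two divergences, which may differ in detail from the formulation on the slide. The function names `nmf` and `cost`, the random initialization, the small constant `eps`, and the iteration count are assumptions made for illustration.

```python
import numpy as np

def cost(Y, F, G, divergence, eps=1e-12):
    """Divergence between Y and its reconstruction F @ G."""
    X = F @ G + eps
    if divergence == "EUC":
        return float(np.sum((Y - X) ** 2))
    # Generalized KL-divergence
    return float(np.sum(Y * np.log((Y + eps) / X) - Y + X))

def nmf(Y, n_bases, divergence="KL", n_iter=200, seed=0, eps=1e-12):
    """Approximate a nonnegative matrix Y (frequency x time) as F @ G."""
    if divergence not in ("EUC", "KL"):
        raise ValueError("divergence must be 'EUC' or 'KL'")
    rng = np.random.default_rng(seed)
    n_freq, n_frames = Y.shape
    F = rng.random((n_freq, n_bases)) + eps    # basis matrix
    G = rng.random((n_bases, n_frames)) + eps  # activation matrix
    for _ in range(n_iter):
        if divergence == "EUC":
            F *= (Y @ G.T) / (F @ G @ G.T + eps)
            G *= (F.T @ Y) / (F.T @ F @ G + eps)
        else:
            F *= ((Y / (F @ G + eps)) @ G.T) / (G.sum(axis=1) + eps)
            G *= (F.T @ (Y / (F @ G + eps))) / (F.sum(axis=0)[:, None] + eps)
    return F, G
```

Because the updates are multiplicative, F and G stay nonnegative as long as they start nonnegative. The supervised variant on slide 10 would keep F fixed and update only the remaining matrices.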

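Slides 26 and 27 describe splitting the binary-masked spectrogram into blocks, measuring the rate of chasms in each block, and choosing the divergence by comparing that rate with a threshold. The sketch below shows one way this decision could be coded. It assumes the index matrix holds 1 for observed bins and 0 for chasms; the function name `choose_divergences`, the block length, the threshold value, and the strict `>` comparison are all illustrative assumptions, not values taken from the talk.

```python
import numpy as np

def choose_divergences(index_matrix, block_len, tau):
    """Pick a divergence per time block from the rate of chasms.

    index_matrix: binary array (frequency x time); 1 = observed, 0 = chasm.
    Returns a list of (start_frame, chasm_rate, divergence) tuples.
    """
    I = np.asarray(index_matrix, dtype=bool)
    choices = []
    for start in range(0, I.shape[1], block_len):
        block = I[:, start:start + block_len]
        rate = 1.0 - block.mean()  # fraction of bins that are chasms
        # Many chasms: extrapolation matters, so use EUC-distance.
        # Few chasms: separation matters, so use KL-divergence.
        divergence = "EUC" if rate > tau else "KL"
        choices.append((start, float(rate), divergence))
    return choices

# Hypothetical 2 x 4 index matrix split into two blocks of two frames.
I = [[1, 1, 0, 0],
     [1, 1, 0, 1]]
for start, rate, divergence in choose_divergences(I, block_len=2, tau=0.5):
    print(start, rate, divergence)
```

Each block would then be passed to superresolution-based SNMF with the chosen divergence, using supervised bases trained in advance with that same divergence.
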
Editor's Notes

  1. Good afternoon, everyone. I'm Daichi Kitamura from Nara Institute of Science and Technology, Japan. Today I'd like to talk about online divergence switching for superresolution-based nonnegative matrix factorization.
  2. This is the outline of my talk.
  3. First, I will talk about the research background.
  4. Recently, music signal separation technologies have received much attention. These technologies have many applications, such as automatic transcription and 3D audio systems. Music signal separation based on nonnegative matrix factorization, NMF for short, has been a very active research area. In particular, supervised NMF, SNMF for short, can separate the target signal with high accuracy. However, for mixtures of many sources, such as more realistic musical tunes, the separation performance markedly degrades. To solve this problem, we have proposed a new hybrid separation method for stereo music signals.
  5. Our proposed hybrid method concatenates a spatial separation method called directional clustering and an SNMF-based separation method called superresolution-based SNMF. In this hybrid method, the target direction is first separated by directional clustering. Then the target signal is separated by this SNMF.
  6. In previous studies, we confirmed that the optimal divergence criterion in superresolution-based SNMF depends on the spatial conditions of the input signal. In this presentation, we propose a new optimal separation scheme for this hybrid method to separate the target signal with high performance under any type of spatial condition.
  7. Next, I will talk about conventional methods.
  8. NMF has been proposed as a means of extracting features from the spectrogram. It is a sparse representation algorithm that can extract the significant features from the observed matrix. NMF approximately decomposes the observed spectrogram Y into two nonnegative matrices F and G. Here, the first decomposed matrix F has frequently appearing spectral patterns as its bases, and the other decomposed matrix G has the time-varying gains of each spectral pattern. So the matrix F is called the "basis matrix," and the matrix G is called the "activation matrix."
  9. In NMF decomposition, the variable matrices F and G are optimized by minimizing the divergence between the input data Y and the reconstructed data FG. This is the cost function in NMF. We can optimize F and G by minimizing this cost function. Here, the Euclidean distance and the KL-divergence are often used as the divergence in the cost function. In NMF-based signal separation, a KL-divergence-based cost function achieves high separation performance because of the sparseness of music spectrograms.
  10. To separate the target signal using NMF, SNMF has been proposed. SNMF utilizes sample sounds of the target signal as a supervision signal. For example, if we want to separate the piano signal from this mixed signal, a musical scale played by the same piano should be used as supervision. In the training process, this sample sound is decomposed by simple NMF, and the supervised basis matrix F is constructed. Then, in the separation process, the mixed signal is decomposed as FG+HU using the supervised bases F. The matrix F is fixed, and the other matrices G, H, and U are optimized. Finally, the target piano signal is separated as FG, and the other signals are separated as HU.
  11. SNMF can extract the target signal when the number of mixed signals is small. However, when many interfering sources exist, the separation performance markedly degrades.
  12. Next, I will explain the directional clustering method. This method utilizes differences between the left and right channels as a separation cue, which is equivalent to binary masking in the spectrogram domain. However, this method cannot separate sources in the same direction, like this. In addition, the separated signal has artificial distortion owing to the binary masking.
  13. To solve these problems of SNMF and directional clustering, we have proposed a new SNMF called "superresolution-based SNMF" and its hybrid method. This hybrid method consists of two techniques, namely directional clustering and superresolution-based SNMF. First, directional clustering is applied to the input stereo signal to separate the target direction. Then the target signal is separated by this SNMF.
  14. Here, the spectrogram separated by directional clustering has many spectral chasms, like this. This is due to the binary masking in directional clustering. However, our superresolution-based SNMF can reconstruct such a damaged spectrogram using supervised basis extrapolation.
  15. This spectrum is obtained by directional clustering. There are many spectral chasms owing to the binary masking. Superresolution-based SNMF treats these chasms as unseen observations, like this, and extrapolates the fittest target bases from the supervised bases F. As a result, the lost components are reconstructed by the supervised basis extrapolation.
  16. This figure shows the directional distribution of the input stereo signal. The target source is in the center direction, and the other interfering sources are distributed like this. After directional clustering, left and right source components leak into the center cluster, and the center sources lose some of their components. These lost components correspond to the spectral chasms in the spectrogram domain. After superresolution-based SNMF, the target components are separated and restored using supervised bases of the target sound trained in advance. In other words, the resolution of the target spectrogram is recovered by the supervised basis extrapolation.
  17. This is the decomposition model of superresolution-based SNMF. It is the same as that of conventional SNMF. This equation is the cost function. In this cost function, the divergence is defined at all spectrogram grids except for the spectral chasms by using the index matrix I obtained from directional clustering. For the grids of the chasms, we impose a regularization term for superresolution.
  18. By minimizing the cost function, we can obtain the update rules for optimizing the variable matrices G, H, and U.
  19. Next, I will talk about the proposed method.
  20. In conventional SNMF, KL-divergence-based SNMF always achieves higher separation performance than Euclidean-distance-based SNMF. However, in superresolution-based SNMF, the optimal divergence depends on the amount of spectral chasms.
  21. This is because superresolution-based SNMF has two tasks, namely signal separation and basis extrapolation for the superresolution of the damaged spectrogram. KL-divergence can separate signals with high accuracy, but it is not suitable for basis extrapolation. On the other hand, the Euclidean distance is good for basis extrapolation.
  22. A spectrum decomposed by NMF with KL-divergence tends to be sparser than one decomposed by NMF with EUC-distance. Such a sparse basis is not suitable for extrapolation using observable data.
  23. Therefore, the optimal divergence for superresolution-based SNMF depends on the amount of spectral chasms because of the trade-off between separation and extrapolation abilities.
  24. From these properties, if there are many chasms, EUC-distance should be used because the extrapolation ability is required. On the other hand, if there are few chasms, KL-divergence should be used because the separation ability is required.
  25. When we consider applying our hybrid method to online input data, we can obtain an online binary-masked spectrogram from directional clustering.
  26. We propose to divide this online spectrogram into blocks. Then superresolution-based SNMF is applied to each block of the spectrogram in parallel.
  27. Here, we can calculate the rate of chasms r in each block and decide the divergence using a threshold value tau. For example, this block Y(1) does not have many chasms, so KL-divergence is suitable because the separation ability is required. The next block Y(2) has many spectral chasms, so the Euclidean distance is suitable because we have to reconstruct this damaged spectrogram by superresolution.
  28. This is the procedure of the proposed divergence switching method, where the supervised bases for both the Euclidean distance and the KL-divergence, F(EUC) and F(KL), should be prepared in advance using the supervision sound of the target signal.
  29. Next, I will talk about the experiments.
  30. To confirm the effectiveness of the proposed divergence switching method, we conducted an evaluation experiment. In this experiment, we used stereo-panning music signals. Each stereo signal has four instrumental sources, and the target source is always located in the center direction. The left and right interfering sources are located at 15 degrees in the first half, and these sources are moved to the center direction, where theta equals 0, in the last half. Therefore, many chasms were produced by directional clustering in the first half compared with the last half. The signal contains four instruments, namely oboe, flute, trombone, and piano, generated by a MIDI synthesizer. These sources are mixed with the same power. In addition, we used the same type of MIDI sounds of the target instruments as the supervision sound, like this (pointing at the supervision score). This supervision sound consists of two octaves of notes that cover all the notes of the target signal.
  31. In this experiment, we compared three methods, namely the hybrid method using only EUC-distance-based SNMF, the hybrid method using only KL-divergence-based SNMF, and the proposed hybrid method that switches the divergence to the optimal one. In addition, we used the signal-to-distortion ratio as an evaluation score. SDR indicates the total separation accuracy, which includes both the quality of the separated target signal and the degree of separation.
  32. This result is the average of the evaluation scores over all combinations of the input signals. From this result, the proposed hybrid method outperforms the other methods. This shows the efficacy of the optimal divergence switching.
  33. These are the conclusions of my talk. Thank you for your attention.
  34. The conventional hybrid method is a simple method that concatenates normal SNMF and directional clustering, so it cannot reconstruct the lost components, that is, the spectral chasms. This proposed method, shown as the red line, uses a fixed divergence. We have already confirmed in the previous result that the divergence-switching method achieves a better result than this red line.
  35. Directional clustering utilizes a clustering method, such as K-means clustering. The feature used for the clustering is the difference in amplitude between channels, namely the direction of the sources. From the clustering result, we can obtain a binary mask matrix. The separation is then achieved by taking the product of the input spectrogram and this mask.
  36. As another means of addressing multichannel signal separation, multichannel NMF has also been proposed by Ozerov and Sawada. This method is a natural extension of NMF and uses both spectral and spatial cues. However, this unified method is a mathematically very difficult optimization problem because many variables must be optimized with one cost function. As a result, this method strongly depends on the initial values.
  37. If the number of sources in the same direction as the target instrument increases, the separation performance of supervised NMF markedly degrades. This is because several similar bases arise in both the target and the other instruments.
  38. If the left and right sources are close to the center direction, the separation becomes difficult because directional clustering cannot separate them well. In addition, basis extrapolation also becomes difficult because the number of chasms in the separated cluster increases in this case. In contrast, if theta becomes larger, the separation becomes easy.
  39. This is the signal flow of the proposed hybrid method. In our experiment, superresolution-based supervised NMF is applied only to the center direction because the target source is located there. However, if the target source is located on the left or right side, we should apply this NMF to the direction that contains the target source, whether or not there is another source in that direction.
  40. The optimization of the variables F and G in NMF is based on the minimization of the cost function. The cost function is defined as the divergence between the observed spectrogram Y and the reconstructed spectrogram FG. This minimization is an inequality-constrained optimization problem.
  41. SDR is the total evaluation score for separation performance.
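
Note 35 describes directional clustering as clustering time-frequency bins by the amplitude difference between channels (for example with K-means) and then applying the resulting binary mask to the spectrogram. The sketch below illustrates that idea on amplitude spectrograms. The panning-angle feature `arctan2(|R|, |L|)`, the plain one-dimensional K-means loop, the evenly spaced initial centroids, and the function name `directional_masks` are assumptions for illustration and may differ from the clustering used in the talk.

```python
import numpy as np

def directional_masks(L, R, n_clusters=3, n_iter=50, eps=1e-12):
    """Cluster time-frequency bins by panning angle.

    L, R: amplitude (or complex) spectrograms of the left and right channels.
    Returns (masks, centers): one boolean mask per cluster and the centroids.
    """
    # 0 means hard left, pi/4 center, pi/2 hard right.
    angle = np.arctan2(np.abs(R), np.abs(L) + eps)
    centers = np.linspace(0.0, np.pi / 2, n_clusters)
    for _ in range(n_iter):
        # Assign each bin to the nearest centroid.
        labels = np.argmin(np.abs(angle[..., None] - centers), axis=-1)
        # Move each centroid to the mean angle of its bins.
        for c in range(n_clusters):
            if np.any(labels == c):
                centers[c] = angle[labels == c].mean()
    labels = np.argmin(np.abs(angle[..., None] - centers), axis=-1)
    return [labels == c for c in range(n_clusters)], centers

# Toy example: each bin is dominated by a left, center, or right source.
L = np.array([[1.0, 1.0, 0.0], [0.0, 1.0, 1.0]])
R = np.array([[0.0, 1.0, 1.0], [1.0, 1.0, 0.0]])
masks, centers = directional_masks(L, R)
center_cluster = masks[1] * (L + R)  # entry-wise product with the mask
```

The `False` entries of the target-direction mask are the bins removed by the binary masking, which correspond to the spectral chasms that superresolution-based SNMF is meant to fill.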