Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration

Hybrid Multichannel Signal Separation Using
Supervised Nonnegative Matrix Factorization
Daichi Kitamura (The Graduate University for Advanced Studies, Japan)
Hiroshi Saruwatari (The University of Tokyo, Japan)
Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
Yu Takahashi (Yamaha Corporation, Japan)
Kazunobu Kondo (Yamaha Corporation, Japan)
Hirokazu Kameoka (The University of Tokyo, Japan)
Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) 2014
Special session – Recent Advances in Audio and Acoustic Signal Processing

Outline
• 1. Research background
• 2. Conventional methods
– Nonnegative matrix factorization
– Supervised nonnegative matrix factorization
– Multichannel NMF
• 3. Proposed method
– SNMF with spectrogram restoration and its hybrid method
• 4. Experiments
– Closed data experiment
– Open data experiment
• 5. Conclusions

Research background
• Signal separation has received much attention.
• Music signal separation based on nonnegative matrix
factorization (NMF) is a very active research area.
• Supervised NMF (SNMF) achieves the highest
separation performance.
• To improve its performance, an SNMF-based multichannel signal separation method is required.
• Applications
– Automatic music transcription
– 3D audio system, etc.
• Goal: separate the target signal from multichannel signals with high accuracy.


NMF [Lee, et al., 2001]
• NMF can extract significant spectral patterns.
– The basis matrix holds the spectral patterns that appear frequently in the observed matrix.
– The observed matrix (spectrogram, Ω × 𝑇) is approximated by the product of the basis matrix (spectral patterns, Ω × 𝐾) and the activation matrix (time-varying gains, 𝐾 × 𝑇).
Ω: Number of frequency bins
𝑇: Number of time frames
𝐾: Number of bases
[Figure: amplitude spectrogram (frequency versus time) decomposed into the basis matrix and the activation matrix]
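The decomposition can be sketched in a few lines of NumPy. This is a minimal illustration using the well-known multiplicative updates for the squared Euclidean cost, not the exact setup of the talk; the names `Y`, `F`, `G`, and `nmf` are illustrative assumptions, as is the random toy input.

```python
import numpy as np

def nmf(Y, n_bases, n_iter=200, eps=1e-12, seed=0):
    """Approximate a nonnegative matrix Y (freq x time) by F @ G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, n_bases)) + eps   # basis matrix (spectral patterns)
    G = rng.random((n_bases, n_time)) + eps   # activation matrix (time-varying gains)
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G

# Toy nonnegative "spectrogram": 6 frequency bins, 20 time frames.
Y = np.abs(np.random.default_rng(1).normal(size=(6, 20)))
F, G = nmf(Y, n_bases=3)
print("relative error:", np.linalg.norm(Y - F @ G) / np.linalg.norm(Y))
```

Each column of `F` plays the role of one spectral pattern, and the matching row of `G` says how strongly that pattern is active in each time frame.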

Supervised NMF [Smaragdis, et al., 2007]
• SNMF
– Supervised spectral separation method
– Training process: a supervised basis matrix (spectral dictionary) is learned from sample sounds of the target signal.
– Separation process: the supervised basis matrix is fixed, and the remaining matrices are optimized so that the mixed signal is decomposed into the target signal and the other signal.
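The two processes can be sketched as follows. This is a simplified illustration with squared Euclidean multiplicative updates; the function names, the symbols `F`, `G`, `H`, `U`, and the synthetic data are assumptions made for the example, not the configuration used in the talk.

```python
import numpy as np

EPS = 1e-12

def train_bases(Y_train, n_bases, n_iter=200, seed=0):
    """Training process: learn the supervised basis matrix F from sample sounds."""
    rng = np.random.default_rng(seed)
    F = rng.random((Y_train.shape[0], n_bases)) + EPS
    G = rng.random((n_bases, Y_train.shape[1])) + EPS
    for _ in range(n_iter):
        F *= (Y_train @ G.T) / (F @ G @ G.T + EPS)
        G *= (F.T @ Y_train) / (F.T @ F @ G + EPS)
    return F

def separate(Y_mix, F, n_other, n_iter=200, seed=0):
    """Separation process: F stays fixed; G, H, and U are optimized."""
    rng = np.random.default_rng(seed)
    G = rng.random((F.shape[1], Y_mix.shape[1])) + EPS   # activations of F
    H = rng.random((Y_mix.shape[0], n_other)) + EPS      # bases for the other signal
    U = rng.random((n_other, Y_mix.shape[1])) + EPS      # activations of H
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ Y_mix) / (F.T @ X + EPS)
        X = F @ G + H @ U
        H *= (Y_mix @ U.T) / (X @ U.T + EPS)
        X = F @ G + H @ U
        U *= (H.T @ Y_mix) / (H.T @ X + EPS)
    return F @ G, H @ U   # target estimate, other-signal estimate

# Synthetic data: the target and the training samples share the same patterns.
rng = np.random.default_rng(1)
patterns = rng.random((8, 3))
Y_train = patterns @ rng.random((3, 30))
Y_mix = patterns @ rng.random((3, 15)) + rng.random((8, 2)) @ rng.random((2, 15))

F = train_bases(Y_train, n_bases=3)
target_est, other_est = separate(Y_mix, F, n_other=2)
```

Only `G`, `H`, and `U` change during separation, which is what makes the dictionary "supervised".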

Problems of SNMF
• SNMF is only for a single-channel signal
– For multichannel signal, SNMF cannot use information
between channels.
• When many interference sources exist, separation
performance of SNMF markedly degrades.
[Figure: separation result in which residual components remain]

Multichannel NMF [Sawada, et al., 2013]
• Multichannel NMF
– is a natural extension of NMF for a multichannel signal
– uses spatial information for the clustering of bases to achieve the unsupervised separation task.
• Problems
– Multichannel NMF depends strongly on initial values and lacks robustness.
[Figure: microphone array]


Motivation and strategy
• Sawada's multichannel NMF
– is a unified method that solves the spatial and spectral separations together.
– maximizes a likelihood of the observed spectrograms, parameterized by the spatial direction of the target signal and the source components of all signals (target and other).
– In the supervised situation, the target spectral patterns are given.
– The problem is too difficult to solve (lack of robustness).
– It is computationally inefficient (long computation time).

Motivation and strategy
• Proposed hybrid method
– divides the problem into two stages, as an approximation of the unified approach:
– Unsupervised spatial separation (classical D.O.A. estimation)
– Supervised spectral separation (SNMF-based method)
– The spatial separation should be carried out with classical D.O.A. estimation methods.
• These methods are very efficient and stable.
– Divide and conquer method

Directional clustering [Araki, et al., 2007]
• Directional clustering
– Unsupervised spatial separation method
– k-means clustering (fast and stable)
• Problems
– Artificial distortion arises owing to the binary masking.
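[Figure: the input stereo signal (left, center, and right sources) is separated by binary masking]
Spectrogram, with the cluster label of each time-frequency grid (rows: frequency, columns: time; C: center, L: left, R: right):
C C C R L R
C L L L R R
C C C C R R
C R R L L L
C C C C C C
Binary mask for the center cluster:
1 1 1 0 0 0
1 0 0 0 0 0
1 1 1 1 0 0
1 0 0 0 0 0
1 1 1 1 1 1
The separated signal is the entry-wise product of the spectrogram and the binary mask.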
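The masking step on this slide can be reproduced directly. The sketch below takes the cluster labels shown on the slide as given (in practice they would come from k-means clustering of a direction feature) and uses random magnitudes as a stand-in spectrogram, which is an assumption made purely for illustration.

```python
import numpy as np

# Cluster label of each time-frequency grid, copied from the slide
# (C: center, L: left, R: right; rows: frequency, columns: time).
labels = np.array([
    list("CCCRLR"),
    list("CLLLRR"),
    list("CCCCRR"),
    list("CRRLLL"),
    list("CCCCCC"),
])

# Binary mask that keeps only the grids assigned to the center cluster.
mask = (labels == "C").astype(int)

# Hypothetical magnitude spectrogram with the same shape as the label grid.
spectrogram = np.random.default_rng(0).random(labels.shape)

# Entry-wise product: grids outside the center cluster become zero (holes).
separated = mask * spectrogram
print(mask)
```

Every grid set to zero is a component that the target may have partly occupied, which is where the artificial distortion comes from.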

Proposed method: hybrid separation
• Hybrid separation method
– Input stereo signal (L, R)
– Spatial separation method (directional clustering)
– SNMF-based separation method (SNMF with spectrogram restoration)
– Separated signal

SNMF with spectrogram restoration
• The separated cluster contains spectral holes (lost components).
• The proposed SNMF treats these holes as unseen observations.
• The fittest bases of the supervised basis (dictionary of target signal) are extrapolated to fix up the holes.
[Figure: separated cluster (frequency versus time) with the holes marked]

[Figure: frequency of source components versus direction (left, center, right). (a) Input signal, with the target in the center. (b) After directional clustering (binary masking). (c) After super-resolution-based SNMF, with the extrapolated components.]
[Figure: processing flow. Directional clustering turns the observed spectrogram (target and interference) into the separated cluster; SNMF with spectrogram restoration then extrapolates the supervised spectral bases to give the reconstructed data.]

Decomposition model and cost function
• The divergence is defined at all grids except for the holes by using the binary masking matrix obtained from directional clustering.
• Decomposition model: the supervised bases are fixed.
• Cost function: defined with the entries of the matrices, the weighting parameters, the binary complement of the masking matrix, and the Frobenius norm.

Cost function (built up term by term):
• Main term: a binary index excludes the holes.
• Regularization term
• Penalty term [Kitamura, et al. 2014]
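Read together, the three terms suggest a cost of the following shape. This is a hedged reconstruction, not a copy of the slide: the symbol names and the exact argument of each term are assumptions. Here $\boldsymbol{Y}$ is the separated cluster, $\boldsymbol{F}$ the fixed supervised bases, $\boldsymbol{G}$ their activations, $\boldsymbol{H}$ and $\boldsymbol{U}$ the bases and activations of the remaining components, $i_{\omega,t}$ the binary mask entry (1 on observed grids, 0 on holes), $\bar{i}_{\omega,t}$ its binary complement, and $\lambda$, $\mu$ the weighting parameters.

```latex
\begin{align}
\boldsymbol{Y} &\simeq \boldsymbol{F}\boldsymbol{G} + \boldsymbol{H}\boldsymbol{U}, \\
\mathcal{J} &= \sum_{\omega,t} i_{\omega,t}\,
  \mathcal{D}\Bigl( y_{\omega,t} \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} + \sum_{l} h_{\omega,l}\, u_{l,t} \Bigr)
  \notag \\
&\quad + \lambda \sum_{\omega,t} \bar{i}_{\omega,t}\,
  \mathcal{D}_{\mathrm{reg}}\Bigl( 0 \,\Big\|\, \sum_{k} f_{\omega,k}\, g_{k,t} \Bigr)
  + \mu \bigl\| \boldsymbol{F}^{\mathsf{T}} \boldsymbol{H} \bigr\|_{\mathrm{Fr}}^{2}
\end{align}
```

Under this reading, the first term fits the model only where data were observed, the second keeps the extrapolated target components inside the holes from growing without bound, and the third discourages the other bases from duplicating the supervised ones.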

Generalized divergence: β-divergence
• β-divergence [Eguchi, et al., 2001]
– EUC-distance (β = 2)
– KL-divergence (β = 1): the best criterion for signal separation [Kitamura, et al., 2014]
– IS-divergence (β = 0)
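For concreteness, one common parameterization of the β-divergence is sketched below, with the three named cases handled explicitly. The function name and the scaling convention (β = 2 gives half the squared error) are assumptions; the slides may use a different normalization.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Entry-wise beta-divergence D_beta(y || x) for positive y and x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:
        # Generalized Kullback-Leibler divergence (limit beta -> 1).
        return y * np.log(y / x) - y + x
    if beta == 0:
        # Itakura-Saito divergence (limit beta -> 0).
        return y / x - np.log(y / x) - 1
    # General case; beta = 2 reduces to 0.5 * (y - x) ** 2.
    return (y ** beta / (beta * (beta - 1))
            + x ** beta / beta
            - y * x ** (beta - 1) / (beta - 1))

for beta in (0, 1, 2):
    print(beta, beta_divergence(3.0, 1.0, beta))
```

In every case the divergence is zero when `y == x`, so a single parameter selects the criterion without changing the rest of the algorithm.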

• We used two β-divergences: one for the main cost and one for the regularization cost.

Update rules
• We can obtain the update rules for the optimization of the three variable matrices.
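A minimal sketch of what such updates can look like, under a deliberately simplified cost: squared Euclidean error on the unmasked grids only, with the regularization and penalty terms omitted. These are standard multiplicative updates for weighted NMF, not the authors' update rules, and all names (`masked_snmf`, `F`, `G`, `H`, `U`, `mask`) and the toy data are assumptions.

```python
import numpy as np

def masked_snmf(Y, mask, F, n_other=2, n_iter=200, eps=1e-12, seed=0):
    """Fit Y ~ F @ G + H @ U on grids where mask == 1, keeping F fixed."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    G = rng.random((F.shape[1], n_time)) + eps   # activations of the fixed bases
    H = rng.random((n_freq, n_other)) + eps      # bases for the other components
    U = rng.random((n_other, n_time)) + eps      # activations of the other bases
    MY = mask * Y                                # holes contribute nothing to the cost
    costs = []
    for _ in range(n_iter):
        X = F @ G + H @ U
        G *= (F.T @ MY) / (F.T @ (mask * X) + eps)
        X = F @ G + H @ U
        H *= (MY @ U.T) / ((mask * X) @ U.T + eps)
        X = F @ G + H @ U
        U *= (H.T @ MY) / (H.T @ (mask * X) + eps)
        costs.append(float(np.sum(mask * (Y - F @ G - H @ U) ** 2)))
    return G, H, U, costs

# Toy example: a target built from known bases, with random holes punched in.
rng = np.random.default_rng(1)
F = rng.random((8, 3))                               # stands in for trained bases
Y = F @ rng.random((3, 10))                          # toy target spectrogram
mask = (rng.random(Y.shape) > 0.3).astype(float)     # 0 marks a hole
G, H, U, costs = masked_snmf(Y * mask, mask, F)
restored = F @ G                                     # defined on the holes as well
```

Because `F @ G` is a full matrix, it supplies values inside the holes even though those grids never entered the cost, which is the extrapolation the method relies on.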


Experimental condition
• Mixed signal includes four melodies (sources).
• Three compositions of instruments
– We evaluated the average score of 36 patterns.
• Supervision signal: 24 notes that cover all the notes in the target melody
[Figure: spatial arrangement of the four sources (1 to 4) between left, center, and right, with the target source marked]
Dataset   Melody 1   Melody 2   Midrange      Bass
No. 1     Oboe       Flute      Piano         Trombone
No. 2     Trumpet    Violin     Harpsichord   Fagotto
No. 3     Horn       Clarinet   Piano         Cello

Experimental result: closed data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and absence of artificial distortion.
[Figure: SDR [dB] (axis from 0 to 14; higher is better) versus β_NMF (0 to 4, with KL-divergence and EUC-distance marked) for directional clustering, supervised multichannel NMF [Sawada], conventional SNMF (single-channel SNMF), and the proposed hybrid method]

• SNMF with spectrogram restoration has two tasks: source separation and basis extrapolation.
• The optimal divergence for source separation is KL-divergence (β = 1).
• In contrast, a divergence with a higher β value is suitable for the basis extrapolation.

Trade-off: separation and restoration
• The optimal divergence for SNMF with spectrogram
restoration and its hybrid method is based on the
trade-off between separation and restoration abilities.
[Figure: two spectra, amplitude [dB] (−10 to 0) versus frequency [kHz] (0 to 5); one panel labeled "Sparseness: strong", the other "Sparseness: weak"]
[Figure: performance versus β (0 to 4), showing separation, restoration, and the total performance of the hybrid method]

Experimental condition
• Open data experiment
– used different tone generators for the training and test signals
• Supervision signal: 24 notes that cover all the notes in the target melody, provided by tone generator A
• Test signal: provided by tone generator B (more realistic sound) + background noise (SNR = 10 dB)
[Figure: spatial arrangement of the four sources (1 to 4) between left, center, and right, with the target source marked]

Experimental result: open data
• Signal-to-distortion ratio (SDR)
– total quality of the separation, which includes the degree of separation and absence of artificial distortion.
[Figure: SDR [dB] (axis from −4 to 10; higher is better) versus β_NMF (0 to 4, with KL-divergence and EUC-distance marked) for directional clustering, supervised multichannel NMF [Sawada], conventional SNMF (single-channel SNMF), and the proposed hybrid method]

Conclusions
• We proposed a hybrid multichannel signal separation
method combining directional clustering and SNMF
with spectrogram restoration.
• There is a trade-off between separation and
restoration abilities.
Thank you for your attention!
You can hear a demonstration on my homepage!
