Relaxation of Rank-1 Spatial Constraint in Overdetermined Blind Source Separation
Daichi Kitamura (SOKENDAI)
Nobutaka Ono (NII/SOKENDAI)
Hiroshi Sawada (NTT)
Hirokazu Kameoka (The Univ. of Tokyo/NTT)
Hiroshi Saruwatari (The Univ. of Tokyo)
EUSIPCO 2015, 2 Sept., 14:30 - 16:10,
SS30 Acoustic scene analysis using microphone array
Research Background
• Blind source separation (BSS)
– Estimation of original sources from the mixture signal
– We only focus on overdetermined situations
• Number of sources ≤ Number of microphones
• Ex) Independent component analysis, independent vector analysis
• Applications of BSS
– Acoustic scene analysis, speech enhancement, music
analysis, reproduction of sound field, etc.
[Figure: original sources → mixing system (unknown) → observation (mixture) → BSS → estimated sources]
Problems and Motivations
• For reverberant signals
– ICA-based methods cannot separate sources well because a linear time-invariant mixing system is assumed
– When the number of microphones is greater than the number of sources, PCA is often applied before BSS
• Reverberation is also important information to
analyze acoustic scenes
– We should separate the sources with their own
reverberations.
[Figure: original sources → mixing (instantaneous mixing in the time-frequency domain) → observed signals → PCA (to remove weak (reverberant) components of all the sources) → dimension-reduced signals → BSS → estimated sources]
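The PCA preprocessing described on this slide can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the STFT is assumed to be stored as an array of shape (frequency bins, time frames, microphones), and the function name `pca_reduce` and its arguments are hypothetical.

```python
import numpy as np

def pca_reduce(X, n_sources):
    """Keep the n_sources strongest principal components in every frequency bin.

    X: complex STFT of shape (n_freq, n_frames, n_mics) (assumed layout).
    Returns an array of shape (n_freq, n_frames, n_sources).
    """
    n_freq, n_frames, n_mics = X.shape
    Y = np.empty((n_freq, n_frames, n_sources), dtype=complex)
    for i in range(n_freq):
        Xi = X[i]                                # rows are the vectors x_ij^T
        # Spatial covariance of this bin: average of x x^H over time frames
        R = Xi.T @ Xi.conj() / n_frames
        # eigh returns eigenvalues in ascending order for Hermitian matrices
        _, eigvecs = np.linalg.eigh(R)
        U = eigvecs[:, -n_sources:]              # strongest components
        Y[i] = Xi @ U.conj()                     # y_ij = U^H x_ij
    return Y
```

With four microphones and two sources, `pca_reduce(X, 2)` keeps the two strongest spatial components in each bin; the discarded components are the weak ones, which is where reverberant energy tends to be lost.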
Conventional Methods (1/4)
• Independent vector analysis (IVA) [Hiroe, 2006], [Kim, 2006]
– assumes independence between source vectors
– assumes linear time-invariant mixing system
• The mixing system can be represented by a mixing matrix in each frequency bin.
– can be optimized efficiently [Ono, 2011]
[Figure: original sources → mixing matrices → observed signals → demixing matrices → estimated sources]
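The per-frequency mixing and demixing model assumed by IVA can be illustrated numerically. The sketch below uses random data and an oracle inverse in place of a blind estimate; all array names and sizes are hypothetical, and it only demonstrates the model, not the IVA optimization itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq, n_frames, n_src = 4, 100, 2

# Hypothetical source STFT coefficients S[i, j, :] and mixing matrices A[i]
S = rng.standard_normal((n_freq, n_frames, n_src)) \
    + 1j * rng.standard_normal((n_freq, n_frames, n_src))
A = rng.standard_normal((n_freq, n_src, n_src)) \
    + 1j * rng.standard_normal((n_freq, n_src, n_src))

# Instantaneous mixing in each frequency bin: x_ij = A_i s_ij
X = np.einsum("imn,ijn->ijm", A, S)

# Oracle demixing matrices W_i = A_i^{-1}; IVA has to estimate these blindly
W = np.linalg.inv(A)
Y = np.einsum("inm,ijm->ijn", W, X)

# Under the linear time-invariant assumption, demixing recovers the sources
recovered = np.allclose(Y, S)
```

The point of the example is that one matrix per frequency bin is enough to undo the mixing only as long as the mixing really is instantaneous in the time-frequency domain.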
Conventional Methods (2/4)
• Nonnegative matrix factorization (NMF) [Lee, 2001]
– decomposes spectrogram into spectral bases
– Decomposed bases should be clustered into each source.
• Very difficult problem
– Multichannel extension of NMF has been proposed.
[Figure: the observed matrix (power spectrogram, frequency × time) is approximated by the product of a basis matrix (spectral patterns, frequency × basis) and an activation matrix (time-varying gains, basis × time); the dimensions are the number of frequency bins, the number of time frames, and the number of bases]
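A minimal sketch of the single-channel NMF decomposition described on this slide is given below. It uses the multiplicative update rules of [Lee, 2001] for the squared Euclidean cost; the slide does not state which cost function is used, so that choice, the function name `nmf`, and its parameters are assumptions made for illustration.

```python
import numpy as np

def nmf(spectrogram, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize a nonnegative matrix into basis and activation matrices.

    spectrogram: nonnegative array of shape (n_freq, n_frames).
    Returns (basis, activation) with shapes (n_freq, n_bases), (n_bases, n_frames).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = spectrogram.shape
    basis = rng.random((n_freq, n_bases)) + eps
    activation = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative
        basis *= (spectrogram @ activation.T) / (basis @ activation @ activation.T + eps)
        activation *= (basis.T @ spectrogram) / (basis.T @ basis @ activation + eps)
    return basis, activation
```

The factorization alone does not say which basis belongs to which source, which is the clustering difficulty the slide points out.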
Conventional Methods (3/4)
• Multichannel NMF (MNMF) [Ozerov, 2010], [Sawada, 2013]
[Figure: the multichannel observation vector gives time-frequency-wise channel correlations (instantaneous covariance), which are modeled by a spatial model (source-frequency-wise spatial covariances) and a source model (cluster indicator, basis matrix of spectral patterns, activation matrix of gains)]
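The quantities named on this slide can be written as one decomposition. The symbols were lost from the slide text, so the notation below is an assumption that follows the form commonly used for MNMF [Sawada, 2013]: x_ij is the multichannel observation at frequency bin i and time frame j, H_in is the spatial covariance of source n at frequency i, z_nk is the cluster indicator that assigns basis k to source n, and t_ik and v_kj are the entries of the basis and activation matrices.

```latex
\mathsf{X}_{ij} = \boldsymbol{x}_{ij}\,\boldsymbol{x}_{ij}^{\mathsf{H}},
\qquad
\hat{\mathsf{X}}_{ij} = \sum_{k} \Bigl( \sum_{n} \mathsf{H}_{in}\, z_{nk} \Bigr)\, t_{ik}\, v_{kj}
```

The left-hand term is the instantaneous covariance (time-frequency-wise channel correlations); the parenthesized sum is the spatial model and the product t_ik v_kj is the source model.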
Conventional Methods (4/4)
• MNMF with rank-1 spatial model (Rank-1 MNMF) [Kitamura, ICASSP 2015]
– Spatial model can be optimized by IVA
– Source model (basis and activation matrices) can be optimized by simple NMF
– Spatial covariances are modeled by rank-1 matrices (constraint) = linear mixing assumption, as in IVA
We can optimize all the variables using the update rules of IVA and simple NMF
[Figure: time-frequency-wise channel correlations modeled by source-frequency-wise spatial covariances, the cluster indicator, the basis matrix (spectral patterns), and the activation matrix (gains)]
Rank-1 Spatial Constraint
• Rank-1 spatial constraint = linear mixing assumption
– Instantaneous mixture in the time-frequency domain
– Mixing system can be represented by a mixing matrix
1. Sources can be modeled as point sources
2. Reverberation time is shorter than FFT length
[Figure: observed spectrogram (frequency × time); in each time-frequency slot the observed signal is the source signal multiplied by a time-invariant mixing matrix]
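What "rank-1" means for a spatial covariance can be checked numerically. In the sketch below, the steering vector `a` and the matrix `B` are random, hypothetical stand-ins for one frequency bin; the example only contrasts the two kinds of covariance and is not taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mics = 4

# Hypothetical steering vector of one point source in one frequency bin
a = rng.standard_normal(n_mics) + 1j * rng.standard_normal(n_mics)

# Rank-1 spatial covariance: the outer product a a^H
R_rank1 = np.outer(a, a.conj())
rank_of_rank1 = np.linalg.matrix_rank(R_rank1)

# Full-rank spatial covariance: energy spread over all spatial directions
B = rng.standard_normal((n_mics, n_mics)) + 1j * rng.standard_normal((n_mics, n_mics))
R_full = B @ B.conj().T
rank_of_full = np.linalg.matrix_rank(R_full)
```

A rank-1 covariance says that all of a source's energy arrives through a single fixed transfer vector, which is the point-source, short-reverberation situation listed above.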
Problem of Rank-1 Spatial Model
• When reverberation time is longer than FFT length,
– the impulse response becomes long
– reverberant components leak into the next time frame
Mixing system cannot be represented by using only a time-invariant mixing matrix.
The separation performance markedly degrades.
[Figure: observed spectrogram (frequency × time); components of the source signal leak into the observed signal of later time frames (leaked components)]
Summary of Conventional methods
• MNMF [Ozerov, 2010], [Sawada, 2013]
– Full-rank spatial model
• does not use rank-1 spatial constraint
– high computational cost
– strong dependence on initial values
• IVA [Hiroe, 2006], [Kim, 2006] & Rank-1 MNMF [Kitamura, 2015]
– Rank-1 spatial constraint (linear mixing assumption)
• Separation performance degrades for the reverberant signals
– Faster and more stable optimization
To achieve good and stable separation even for the reverberant signals, relax the rank-1 spatial constraint while maintaining efficient optimization.
Proposed Approach
• Dimensionality reduction with principal component analysis (PCA)
– removes reverberant components of all the sources
– But the reverberant components are important!
• Utilize extra observations to model direct and reverberant components simultaneously.
– microphones for sources, where
[Figure: conventional pipeline: original sources → mixing → observed signals → PCA → dimension-reduced signals → BSS → estimated sources]
Ex. sources, microphones ( )
[Figure: proposed pipeline: original sources → mixing → observed signals → BSS (IVA or Rank-1 MNMF) → separated components (a direct and a reverberant component for each source) → reconstruction → estimated sources]
• We assume the independence between not only
sources but also the direct and reverberant
components of the same sources.
Clustering of Separated Components
• Permutation problem of separated components
– Order of separated components depends on initial values
• We propose two methods to cluster the components
– 1. Using cross-correlations for IVA
– 2. Sharing basis matrices for Rank-1 MNMF
[Figure: separated components (which separated components belong to which source?) → clustering → clustered components (direct and reverberant components of source 1; direct and reverberant components of source 2) → reconstruction → estimated sources]
Clustering Using Spectrogram Correlation
• Direct and reverberant components of the same
source have a strong cross-correlation.
• Cross-correlation of two power spectrograms
– Calculate for all combinations of separated components
– Merge the components in descending order of cross-correlation
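The clustering step can be sketched as a greedy pairing. The exact correlation formula was lost from the slide, so the sketch assumes the Pearson correlation between flattened power spectrograms and assumes that each source is split into exactly two components; the function name `cluster_by_correlation` is hypothetical.

```python
import numpy as np
from itertools import combinations

def cluster_by_correlation(power_specs):
    """Greedily pair the components whose power spectrograms correlate most.

    power_specs: array of shape (n_components, n_freq, n_frames).
    Returns a list of index pairs (a, b) with a < b.
    """
    n = power_specs.shape[0]
    flat = power_specs.reshape(n, -1)
    corr = np.corrcoef(flat)          # assumed similarity measure
    # Visit all combinations in descending order of correlation
    pairs = sorted(combinations(range(n), 2), key=lambda p: corr[p], reverse=True)
    used, clusters = set(), []
    for a, b in pairs:
        if a not in used and b not in used:
            clusters.append((a, b))   # merge the two components
            used.update((a, b))
    return clusters
```

Each returned pair is then summed to reconstruct one estimated source from its direct and reverberant components.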
Auto-Clustering by Sharing Basis Matrix
• Direct and reverberant components can be modeled by the same bases (spectral patterns)
• Estimate signals with Basis-Shared Rank-1 MNMF
– Only for Rank-1 MNMF
• because IVA doesn't have an NMF source model
– By imposing a basis-shared source model, Rank-1 MNMF can automatically cluster the components.
[Figure: source model of Basis-Shared Rank-1 MNMF: the direct and reverberant components of source 1 use one shared basis matrix, and those of source 2 use another; reconstruction gives the estimated sources]
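The basis-shared source model can be illustrated with random matrices. In this sketch each source has one basis matrix that is used by both its direct and its reverberant component, while each component keeps its own activation matrix. The array sizes other than the 15 bases per source are arbitrary, and all variable names are hypothetical; the sketch shows only the structure of the model, not its optimization.

```python
import numpy as np

rng = np.random.default_rng(0)
n_freq, n_frames, n_bases, n_src = 64, 50, 15, 2

variances = np.empty((n_src, 2, n_freq, n_frames))
for n in range(n_src):
    basis = rng.random((n_freq, n_bases))             # shared by both components
    act_direct = rng.random((n_bases, n_frames))      # gains of the direct component
    act_reverb = rng.random((n_bases, n_frames))      # gains of the reverberant component
    variances[n, 0] = basis @ act_direct
    variances[n, 1] = basis @ act_reverb

# Both components of source 0 live in the span of the same 15 spectral patterns
stacked = np.hstack([variances[0, 0], variances[0, 1]])
shared_rank = np.linalg.matrix_rank(stacked)
```

Because the two components of a source are tied to the same spectral patterns from the start, no separate clustering step is needed afterwards.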
Experiments
• Conditions
– Original source: Professionally-produced music signals from SiSEC database; JR2 impulse response in RWCP database is used; two sources and four microphones
– Sampling frequency: Downsampled from 44.1 kHz to 16 kHz
– FFT length in STFT: 8192 points (128 ms, Hamming window)
– Shift length in STFT: 2048 points (64 ms)
– Number of bases: 15 bases for each source (30 bases for all the sources)
– Number of iterations: 200
– Number of trials: 10 times with various seeds of random initialization
– Evaluation criterion: Average SDR improvement and its deviation
[Figure: JR2 impulse response recording layout with Source 1 and Source 2; labels: reverberation time 470 ms, 2 m, 80, 60, microphone spacing 2.83 cm]
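The evaluation criterion, SDR improvement, can be sketched as the SDR of the estimate minus the SDR of the unprocessed mixture. The slides do not say how SDR is computed, so the simplified definition below (no allowed-distortion filtering) and the function names are assumptions for illustration only.

```python
import numpy as np

def sdr_db(reference, estimate):
    """Simplified signal-to-distortion ratio in dB."""
    distortion = reference - estimate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(distortion ** 2))

def sdr_improvement_db(reference, mixture, estimate):
    """How many dB the separation gained over the unprocessed mixture."""
    return sdr_db(reference, estimate) - sdr_db(reference, mixture)

# Toy signals: a target tone plus an interfering tone
t = np.arange(16000) / 16000.0
target = np.sin(2 * np.pi * 440.0 * t)
interferer = np.sin(2 * np.pi * 620.0 * t)
mixture = target + interferer
estimate = target + 0.1 * interferer      # interference attenuated by a factor of 10
improvement = sdr_improvement_db(target, mixture, estimate)
```

Attenuating the interference amplitude by a factor of 10 corresponds to an improvement of 20 dB under this definition.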
Experiments
• Compared methods (7 methods)
– PCA + 2ch IVA (conventional method)
• Apply PCA before IVA
– PCA + 2ch Rank-1 MNMF (conventional method)
• Apply PCA before Rank-1 MNMF
– 4ch IVA + Clustering (proposed method)
• Apply IVA without PCA, and cluster the components
– 4ch Basis-Shared Rank-1 MNMF (proposed method)
• Apply Basis-Shared Rank-1 MNMF without PCA
– 4ch MNMF-based BF (beamforming) (conventional method)
• Apply maximum SNR beamforming (time-invariant filtering) using full-rank covariance estimated by 4ch MNMF
– 4ch MNMF (conventional method)
• Apply conventional MNMF (full-rank model), and apply multichannel Wiener filtering (time-variant filtering)
– Ideal time-invariant filtering (reference score)
• The upper limit of time-invariant filtering (supervised)
Experiments
• Results (song: ultimate_nz_tour__snip_43_61)
– Source 1: Guitar
– Source 2: Vocals
[Figure: bar chart of SDR improvement [dB] (axis from 0 to 16) for Source 1 and Source 2 with each method: PCA + 2ch IVA and PCA + 2ch Rank-1 MNMF (rank-1 spatial model, time-invariant filter, 1/src); 4ch IVA + Clustering and 4ch Basis-Shared Rank-1 MNMF (rank-1 spatial model, time-invariant filter, 2/src); 4ch MNMF-based BF (full-rank model, time-invariant filter, 1/src); 4ch MNMF (full-rank spatial model, time-variant filter, 1/src); ideal time-invariant filtering (supervised; upper limit of time-invariant filter, 1/src)]
Experiments
• Results (song: bearlin-roads__snip_85_99)
– Source 1: Acoustic guitar
– Source 2: Piano
[Figure: bar chart of SDR improvement [dB] (axis from -4 to 12) for Source 1 and Source 2 with each method: PCA + 2ch IVA, PCA + 2ch Rank-1 MNMF, 4ch IVA + Clustering, 4ch Basis-Shared Rank-1 MNMF, 4ch MNMF-based BF, 4ch MNMF, and ideal time-invariant filtering (supervised)]
Experiments
• Comparison of computational times
– Conditions
• CPU: Intel Core i7-4790 (3.60GHz)
• MATLAB 8.3 (64-bit)
• Song: ultimate_nz_tour__snip_43_61 (18s, 16kHz sampling)
– PCA + 2ch IVA: 23.4 s
– PCA + 2ch Rank-1 MNMF: 29.4 s
– 4ch IVA + Clustering: 60.1 s
– 4ch Basis-Shared Rank-1 MNMF: 143.9 s (about 2.4 min)
– 4ch MNMF: 3611.8 s (about 1 h)
Achieve efficient optimization compared with MNMF
(The performance is comparable with MNMF)
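The reported times can be turned into speed-up factors relative to 4ch MNMF with a few lines of arithmetic. The times are the ones listed on the slide; the dictionary and variable names are only for illustration.

```python
# Computational times from the slide, in seconds
times_s = {
    "PCA + 2ch IVA": 23.4,
    "PCA + 2ch Rank-1 MNMF": 29.4,
    "4ch IVA + Clustering": 60.1,
    "4ch Basis-Shared Rank-1 MNMF": 143.9,
    "4ch MNMF": 3611.8,
}

# How many times faster each method is than conventional 4ch MNMF
speedup = {name: times_s["4ch MNMF"] / t for name, t in times_s.items()}

for name, factor in speedup.items():
    print(f"{name}: {factor:.1f}x")
```

On these figures the two proposed methods run roughly 60 times (4ch IVA + Clustering) and 25 times (4ch Basis-Shared Rank-1 MNMF) faster than 4ch MNMF.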
Conclusion
• For the case of reverberant signals
– Achieve both good performance and efficient optimization
• The proposed method
– can be applied when the number of microphones is greater
than twice the number of sources
– separately estimates direct and reverberant components
utilizing extra observations
– can be thought of as a relaxation of the rank-1 spatial constraint
• Experimental results show better performance
– The proposed method outperforms the upper limit of time-
invariant filtering in some cases
Thank you for your attention!

More Related Content

What's hot

Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Daichi Kitamura
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Kitamura Laboratory
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...Kitamura Laboratory
 
Blind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsBlind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsKitamura Laboratory
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...Kitamura Laboratory
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016SaruwatariLabUTokyo
 
Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Kitamura Laboratory
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceDaichi Kitamura
 
Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Daichi Kitamura
 
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...
Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...奈良先端大 情報科学研究科
 
Depth Estimation of Sound Images Using Directional Clustering and Activation...
Depth Estimation of Sound Images Using  Directional Clustering and Activation...Depth Estimation of Sound Images Using  Directional Clustering and Activation...
Depth Estimation of Sound Images Using Directional Clustering and Activation...奈良先端大 情報科学研究科
 
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...奈良先端大 情報科学研究科
 
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用Kitamura Laboratory
 
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...Hiroki_Tanji
 

What's hot (20)

Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...
 
Blind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsBlind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure models
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
 
Ica2016 312 saruwatari
Ica2016 312 saruwatariIca2016 312 saruwatari
Ica2016 312 saruwatari
 
Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
 
Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...
 
Apsipa2016for ss
Apsipa2016for ssApsipa2016for ss
Apsipa2016for ss
 
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...
Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...Online Divergence Switching for  Superresolution-Based  Nonnegative Matrix Fa...
Online Divergence Switching for Superresolution-Based Nonnegative Matrix Fa...
 
Depth Estimation of Sound Images Using Directional Clustering and Activation...
Depth Estimation of Sound Images Using  Directional Clustering and Activation...Depth Estimation of Sound Images Using  Directional Clustering and Activation...
Depth Estimation of Sound Images Using Directional Clustering and Activation...
 
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
 
Dsp2015for ss
Dsp2015for ssDsp2015for ss
Dsp2015for ss
 
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
 
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
A Generalization of Laplace Nonnegative Matrix Factorizationand Its Multichan...
 

Viewers also liked

音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)Daichi Kitamura
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)Daichi Kitamura
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Daichi Kitamura
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...Daichi Kitamura
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...Daichi Kitamura
 
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法Daichi Kitamura
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Daichi Kitamura
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法Daichi Kitamura
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)Daichi Kitamura
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...Daichi Kitamura
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...Daichi Kitamura
 
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)Daichi Kitamura
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...Daichi Kitamura
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...Daichi Kitamura
 
TensorFlow を使った 機械学習ことはじめ (GDG京都 機械学習勉強会)
TensorFlow を使った機械学習ことはじめ (GDG京都 機械学習勉強会)TensorFlow を使った機械学習ことはじめ (GDG京都 機械学習勉強会)
TensorFlow を使った 機械学習ことはじめ (GDG京都 機械学習勉強会)徹 上野山
 

Viewers also liked (15)

音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)音源分離における音響モデリング(Acoustic modeling in audio source separation)
音源分離における音響モデリング(Acoustic modeling in audio source separation)
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
 
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
非負値行列因子分解に基づくブラインド及び教師あり音楽音源分離の効果的最適化法
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
 
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
ICASSP2017読み会(関東編)・AASP_L3(北村担当分)
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
 
TensorFlow を使った 機械学習ことはじめ (GDG京都 機械学習勉強会)
TensorFlow を使った機械学習ことはじめ (GDG京都 機械学習勉強会)TensorFlow を使った機械学習ことはじめ (GDG京都 機械学習勉強会)
TensorFlow を使った 機械学習ことはじめ (GDG京都 機械学習勉強会)
 

Similar to Relaxation of rank-1 spatial constraint in overdetermined blind source separation

DNN-based frequency-domain permutation solver for multichannel audio source s...
DNN-based frequency-domain permutation solver for multichannel audio source s...DNN-based frequency-domain permutation solver for multichannel audio source s...
DNN-based frequency-domain permutation solver for multichannel audio source s...Kitamura Laboratory
 
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...
Approximation of Dynamic Convolution Exploiting Principal Component Analysis:...a3labdsp
 
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...
Online Stochastic Tensor Decomposition for Background Subtraction in Multispe...ActiveEon
 
Spatial Modulation
Spatial ModulationSpatial Modulation
Spatial ModulationAbdul Qudoos
 
IRJET- Compressed Sensing based Modified Orthogonal Matching Pursuit in DTTV ...
IRJET- Compressed Sensing based Modified Orthogonal Matching Pursuit in DTTV ...IRJET- Compressed Sensing based Modified Orthogonal Matching Pursuit in DTTV ...
IRJET- Compressed Sensing based Modified Orthogonal Matching Pursuit in DTTV ...IRJET Journal
 
GPU-based Accelerated Spectral Caustic Rendering of Homogeneous Caustic Objects
GPU-based Accelerated Spectral Caustic Rendering of Homogeneous Caustic ObjectsGPU-based Accelerated Spectral Caustic Rendering of Homogeneous Caustic Objects
Relaxation of rank-1 spatial constraint in overdetermined blind source separation

  • 1. Relaxation of Rank-1 Spatial Constraint in Overdetermined Blind Source Separation
      Daichi Kitamura (SOKENDAI), Nobutaka Ono (NII/SOKENDAI), Hiroshi Sawada (NTT), Hirokazu Kameoka (The Univ. of Tokyo/NTT), Hiroshi Saruwatari (The Univ. of Tokyo)
      EUSIPCO 2015, 2 Sept., 14:30 - 16:10, SS30 Acoustic scene analysis using microphone array
  • 2. Research Background
      • Blind source separation (BSS)
        – Estimation of original sources from the mixture signal
        – We focus only on overdetermined situations
          • Number of sources ≤ number of microphones
          • Ex) Independent component analysis, independent vector analysis
      • Applications of BSS
        – Acoustic scene analysis, speech enhancement, music analysis, reproduction of sound field, etc.
      [Figure: original sources → mixing system (unknown) → observation (mixture) → BSS → estimated sources]
  • 3. Problems and Motivations
      • For reverberant signals
        – ICA-based methods cannot separate sources well because a linear time-invariant mixing system is assumed (instantaneous mixing in the time-frequency domain)
        – When the number of microphones is greater than the number of sources, PCA is often applied before BSS to remove weak (reverberant) components of all the sources
      • Reverberation is also important information for analyzing acoustic scenes
        – We should separate the sources together with their own reverberations.
      [Figure: original sources → mixing → observed signals → PCA → dimension-reduced signals → BSS → estimated sources]
  • 4. Conventional Methods (1/4)
      • Independent vector analysis (IVA) [Hiroe, 2006], [Kim, 2006]
        – assumes independence between source vectors
        – assumes a linear time-invariant mixing system
          • The mixing system can be represented by a mixing matrix in each frequency bin.
        – can be optimized efficiently [Ono, 2011]
      [Figure: original sources → mixing matrices → observed signals → demixing matrices → estimated sources]
  • 5. Conventional Methods (2/4)
      • Nonnegative matrix factorization (NMF) [Lee, 2001]
        – decomposes a spectrogram into spectral bases
        – The decomposed bases must then be clustered into each source.
          • This is a very difficult problem
        – A multichannel extension of NMF has been proposed.
      [Figure: observed matrix (power spectrogram, frequency × time) ≈ basis matrix (spectral patterns) × activation matrix (time-varying gains); the dimensions are the number of frequency bins, the number of time frames, and the number of bases]
  • 6. Conventional Methods (3/4)
      • Multichannel NMF (MNMF) [Ozerov, 2010], [Sawada, 2013]
      [Figure: the multichannel observation (one multichannel vector per time-frequency slot) gives instantaneous covariances, i.e. time-frequency-wise channel correlations; these are decomposed into a spatial model (source-frequency-wise spatial covariances) and a source model (cluster indicator, basis matrix of spectral patterns, activation matrix of gains)]
  • 7. Conventional Methods (4/4)
      • MNMF with rank-1 spatial model (Rank-1 MNMF) [Kitamura, ICASSP 2015]
        – The spatial model can be optimized by IVA
        – The source model (basis and activation matrices) can be optimized by simple NMF
      • The source-frequency-wise spatial covariances are modeled by rank-1 matrices (constraint) = linear mixing assumption, as in IVA
      • We can optimize all the variables using the update rules of IVA and simple NMF
      [Figure: time-frequency-wise channel correlations decomposed into source-frequency-wise spatial covariances, cluster indicator, basis matrix (spectral patterns), and activation matrix (gains)]
  • 8. Rank-1 Spatial Constraint
      • Rank-1 spatial constraint = linear mixing assumption
        – Instantaneous mixture in the time-frequency domain
        – The mixing system can be represented by a time-invariant mixing matrix
      • 1. Sources can be modeled as point sources
      • 2. Reverberation time is shorter than the FFT length
      [Figure: observed spectrogram (frequency × time); observed signal = time-invariant mixing matrix × source signal]
  • 9. Problem of Rank-1 Spatial Model
      • When the reverberation time is longer than the FFT length,
        – the impulse response becomes long
        – reverberant components leak into the next time frame
      • The mixing system cannot be represented by the time-invariant mixing matrix alone.
      • The separation performance degrades markedly.
      [Figure: observed spectrogram (frequency × time) showing the observed signal, the source signal, and the leaked components]
  • 10. Summary of Conventional Methods
      • MNMF [Ozerov, 2010], [Sawada, 2013]
        – Full-rank spatial model
          • does not use the rank-1 spatial constraint
        – High computational cost
        – Strong dependence on initial values
      • IVA [Hiroe, 2006], [Kim, 2006] & Rank-1 MNMF [Kitamura, 2015]
        – Rank-1 spatial constraint (linear mixing assumption)
          • Separation performance degrades for reverberant signals
        – Faster and more stable optimization
      • To achieve good and stable separation even for reverberant signals, relax the rank-1 spatial constraint while maintaining efficient optimization
  • 11. Proposed Approach
      • Dimensionality reduction with principal component analysis (PCA)
        – removes the reverberant components of all the sources
        – But the reverberant components are important!
      • Utilize extra observations to model direct and reverberant components simultaneously.
        – Use more microphones than sources
      [Figure: original sources → mixing → observed signals → PCA → dimension-reduced signals → BSS → estimated sources]
  • 12. Proposed Approach
      • Utilize extra observations to model direct and reverberant components simultaneously.
        – Use more microphones than sources
      [Figure: original sources → mixing → observed signals → BSS (IVA or Rank-1 MNMF) → separated components → reconstruction → estimated sources]
  • 13. Proposed Approach
      • Utilize extra observations to model direct and reverberant components simultaneously.
        – Use more microphones than sources
      • We assume independence not only between sources but also between the direct and reverberant components of the same source.
      [Figure: original sources → mixing → observed signals → BSS → separated components (a direct and a reverb. component for each source) → reconstruction → estimated sources]
  • 14. Clustering of Separated Components
      • Permutation problem of separated components
        – The order of the separated components depends on the initial values
        – Which separated components belong to which source?
      • We propose two methods to cluster the components
        – 1. Using cross-correlations for IVA
        – 2. Sharing basis matrices for Rank-1 MNMF
      [Figure: separated components]
  • 15. Clustering of Separated Components
      • Permutation problem of separated components
        – The order of the separated components depends on the initial values
      • We propose two methods to cluster the components
        – 1. Using cross-correlations for IVA
        – 2. Sharing basis matrices for Rank-1 MNMF
      [Figure: separated components → clustering → clustered components (direct and reverb. components of source 1; direct and reverb. components of source 2) → reconstruction → estimated source]
  • 16. Clustering Using Spectrogram Correlation
      • Direct and reverberant components of the same source have a strong cross-correlation.
      • Cross-correlation of two power spectrograms
        – Calculated for all combinations of separated components
        – Merge the components in descending order of cross-correlation
      [Figure: power spectrograms of the separated components]
  • 17. Auto-Clustering by Sharing Basis Matrix
      • Direct and reverberant components can be modeled by the same bases (spectral patterns)
      • Estimate signals with Basis-Shared Rank-1 MNMF
        – Only for Rank-1 MNMF
          • because IVA doesn’t have an NMF source model
        – By imposing a basis-shared source model, Rank-1 MNMF can cluster the components automatically.
      [Figure: source model of Basis-Shared Rank-1 MNMF; the direct and reverb. components of source 1 share one basis matrix, the direct and reverb. components of source 2 share another, and reconstruction gives the estimated sources]
  • 18. Experiments
      • Conditions
        – Original source: professionally produced music signals from the SiSEC database
        – JR2 impulse response in the RWCP database is used; two sources and four microphones
        – Sampling frequency: downsampled from 44.1 kHz to 16 kHz
        – FFT length in STFT: 8192 points (128 ms, Hamming window)
        – Shift length in STFT: 2048 points (64 ms)
        – Number of bases: 15 bases for each source (30 bases for all the sources)
        – Number of iterations: 200
        – Number of trials: 10 times with various seeds of random initialization
        – Evaluation criterion: average SDR improvement and its deviation
      • JR2 impulse response
        – Reverberation time: 470 ms
        – Microphone spacing: 2.83 cm
      [Figure: recording layout for the JR2 impulse response with source 1 and source 2; figure labels: 2 m, 80, 60]
  • 19. Experiments
      • Compared methods (7 methods)
        – PCA + 2ch IVA (conventional)
          • Apply PCA before IVA
        – PCA + 2ch Rank-1 MNMF (conventional)
          • Apply PCA before Rank-1 MNMF
        – 4ch IVA + Clustering (proposed)
          • Apply IVA without PCA, and cluster the components
        – 4ch Basis-Shared Rank-1 MNMF (proposed)
          • Apply Basis-Shared Rank-1 MNMF without PCA
        – 4ch MNMF-based BF (beamforming) (conventional)
          • Apply maximum SNR beamforming (time-invariant filtering) using the full-rank covariance estimated by 4ch MNMF
        – 4ch MNMF (conventional)
          • Apply conventional MNMF (full-rank model), and apply multichannel Wiener filtering (time-variant filtering)
        – Ideal time-invariant filtering (reference score)
          • The upper limit of time-invariant filtering (supervised)
  • 20. Experiments
      • Results (song: ultimate_nz_tour__snip_43_61)
        – Source 1: Guitar
        – Source 2: Vocals
      [Figure: bar chart of SDR improvement [dB] for source 1 and source 2. Methods shown: PCA + 2ch IVA and PCA + 2ch Rank-1 MNMF (rank-1 spatial model, time-invariant filter, 1/src); 4ch IVA + Clustering and 4ch Basis-Shared Rank-1 MNMF (rank-1 spatial model, time-invariant filter, 2/src); 4ch MNMF-based BF (full-rank model, time-invariant filter, 1/src); 4ch MNMF (full-rank spatial model, time-variant filter, 1/src); ideal time-invariant filtering, supervised (upper limit of time-invariant filter, 1/src)]
  • 21. Experiments
      • Results (song: bearlin-roads__snip_85_99)
        – Source 1: Acoustic guitar
        – Source 2: Piano
      [Figure: bar chart of SDR improvement [dB] for source 1 and source 2. Methods shown: PCA + 2ch IVA, PCA + 2ch Rank-1 MNMF, 4ch IVA + Clustering, 4ch Basis-Shared Rank-1 MNMF, 4ch MNMF-based BF, 4ch MNMF, ideal time-invariant filtering (supervised)]
  • 22. Experiments
      • Comparison of computational times
        – Conditions
          • CPU: Intel Core i7-4790 (3.60 GHz)
          • MATLAB 8.3 (64-bit)
          • Song: ultimate_nz_tour__snip_43_61 (18 s, 16 kHz sampling)
        – Results
          • PCA + 2ch IVA: 23.4 s
          • PCA + 2ch Rank-1 MNMF: 29.4 s
          • 4ch IVA + Clustering: 60.1 s
          • 4ch Basis-Shared Rank-1 MNMF: 143.9 s (2.4 min)
          • 4ch MNMF: 3611.8 s (about 1 h)
      • Achieves efficient optimization compared with MNMF (the performance is comparable with MNMF)
  • 23. Conclusion
      • For the case of reverberant signals
        – Achieves both good performance and efficient optimization
      • The proposed method
        – can be applied when the number of microphones is greater than twice the number of sources
        – separately estimates direct and reverberant components utilizing extra observations
        – can be thought of as a relaxation of the rank-1 spatial constraint
      • Experimental results show better performance
        – The proposed method outperforms the upper limit of time-invariant filtering in some cases
      Thank you for your attention!
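The correlation-based clustering of slide 16 can be illustrated with a small Python sketch. The slide's exact correlation formula was not preserved in this transcript, so the mean-removed normalized correlation below is an assumption, as are the greedy merging rule and the names `spectrogram_correlation`, `cluster_by_correlation` and `toy_pair`. The toy data stand in for separated components and are not the signals used in the experiments.

```python
import numpy as np

def spectrogram_correlation(p, q):
    """Mean-removed, normalized correlation of two power spectrograms (assumed form)."""
    p = p.ravel() - p.mean()
    q = q.ravel() - q.mean()
    return float(p @ q / (np.linalg.norm(p) * np.linalg.norm(q) + 1e-12))

def cluster_by_correlation(specs, n_sources):
    """Merge components pairwise, most correlated pair first, until n_sources groups remain."""
    n = len(specs)
    pairs = sorted(
        ((spectrogram_correlation(specs[a], specs[b]), a, b)
         for a in range(n) for b in range(a + 1, n)),
        reverse=True,
    )
    label = list(range(n))  # every component starts in its own group
    n_groups = n
    for _, a, b in pairs:
        if n_groups == n_sources:
            break
        if label[a] != label[b]:
            old, new = label[b], label[a]
            label = [new if g == old else g for g in label]
            n_groups -= 1
    return label

rng = np.random.default_rng(0)
n_freq, n_frames = 64, 200

def toy_pair(rng):
    """Toy 'direct' and 'reverberant' spectrograms sharing one spectrum (illustrative only)."""
    spectrum = rng.random(n_freq)
    gains = rng.random(n_frames)
    direct = np.outer(spectrum, gains)
    smeared = 0.6 * gains + 0.4 * np.roll(gains, 1)  # energy leaking into the next frame
    reverb = 0.5 * np.outer(spectrum, smeared)
    return direct, reverb

d1, r1 = toy_pair(rng)
d2, r2 = toy_pair(rng)
components = [d1, d2, r1, r2]  # deliberately interleaved order
labels = cluster_by_correlation(components, n_sources=2)
print(labels)
```

Components that receive the same label would then be summed to reconstruct one source. A practical version might also cap the number of components per group, which this sketch does not do.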

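The single-channel NMF of slide 5 factorizes a nonnegative spectrogram into a basis matrix and an activation matrix. As a minimal sketch, the multiplicative update rules for the squared Euclidean distance are shown below. The choice of this distance, the function name `nmf_euclidean`, and the toy matrix are assumptions made for illustration; the slides do not state which divergence the authors use.

```python
import numpy as np

def nmf_euclidean(X, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize nonnegative X (freq x time) as T @ V with multiplicative updates."""
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape
    T = rng.random((n_freq, n_bases)) + eps    # basis matrix (spectral patterns)
    V = rng.random((n_bases, n_frames)) + eps  # activation matrix (time-varying gains)
    cost = []
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so T and V stay nonnegative.
        V *= (T.T @ X) / (T.T @ T @ V + eps)
        T *= (X @ V.T) / (T @ V @ V.T + eps)
        cost.append(float(np.sum((X - T @ V) ** 2)))
    return T, V, cost

# Toy "spectrogram" built from three nonnegative patterns (illustrative data only).
rng = np.random.default_rng(1)
X = rng.random((32, 3)) @ rng.random((3, 50))
T, V, cost = nmf_euclidean(X, n_bases=3)
print(cost[0], cost[-1])
```

The columns of `T` play the role of spectral patterns and the rows of `V` their gains over time. Assigning each basis to a source is the clustering problem that the slide calls very difficult.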
Editor's Notes

  1. Blind source separation is a technique to estimate original sources from the observed mixture signal, where the mixing system is unknown. Therefore, we cannot use any information about the recording environment or the locations of the sources and microphones. In this presentation, we focus only on overdetermined situations, which means the number of sources is equal to or smaller than the number of microphones. As you know, independent component analysis is a very famous method for overdetermined BSS. There are many applications of BSS. A very big one is acoustic scene analysis, because we have to separate the sources before we analyze the observations.
  2. However, for reverberant signals, ICA-based methods cannot separate the sources well. This is because these methods assume a linear time-invariant mixing system, which is an instantaneous mixing in the time-frequency domain. Also, when the number of microphones is greater than the number of sources, PCA is often applied before BSS. This process is expected to remove the reverberant components of all the sources and to make the linear time-invariant assumption valid. However, reverberation is also important information for analyzing acoustic scenes. Therefore, we should separate the sources together with their own reverberations.
  3. Let me introduce some conventional methods of BSS. The first one is independent vector analysis, IVA. This is an extension of frequency-domain ICA. IVA assumes independence between source vectors, each of which spans all the frequency bins. In addition, IVA assumes a linear time-invariant mixing system. Under this assumption, the mixing system can be represented by the mixing matrix A in each frequency bin. Recently, an efficient optimization scheme for IVA has been proposed by Prof. Ono.
  4. Another famous method is NMF. NMF decomposes a power spectrogram into two nonnegative matrices, T and V. T is a basis matrix, which contains spectral patterns, and V is an activation matrix, which contains the time-varying gains of each basis. So we can extract some significant spectra from the mixture spectrogram. Then, the decomposed bases should be clustered into each source, but this is a very difficult problem. So, a multichannel extension of NMF has been proposed.
  5. For the multichannel signal, we have M spectrograms. M is the number of microphones. We can calculate an M by M instantaneous covariance matrix like this. This matrix can be calculated at each time and frequency, like the tensor X in this figure. Multichannel NMF decomposes X into the spatial covariance H, cluster-indicator z, basis matrix T and activation matrix V. T and V are the same as in simple NMF: spectral patterns and their activations. H includes source-wise spatial covariances. So, MNMF clusters bases into the sources using the spatial model and the cluster-indicator. The problem of this method is that the optimization of these variables is very difficult. The result strongly depends on the initial values.
  6. Then, at the last ICASSP, we proposed a new efficient optimization scheme for MNMF, which utilizes a rank-1 spatial constraint. In this model, all of the spatial covariances in H must be rank-1 matrices. This means we assume a linear time-invariant mixing system, as in IVA. This new model can be optimized efficiently by alternately applying the update rules of IVA and simple NMF.
  7. As I already said, the rank-1 spatial constraint is equivalent to the linear mixing assumption. This can be thought of as an instantaneous mixture in the time-frequency domain. So, the frequency-wise mixing matrix Ai, which is time-invariant, can be defined. In this model, we assume that the sources can be modeled as point sources, and that the reverberation time is shorter than the FFT length.
  8. However, when the reverberation time is longer than the FFT length, the reverberant components leak into the next time frame, as in this figure. Therefore, we cannot represent the mixture signal x by using only Ai. The leaked component n, which comes from the previous time frame, is added. Since IVA and Rank-1 MNMF estimate the inverse of Ai, the separation performance markedly degrades in this reverberant case.
  9. This is a summary of the problems. MNMF can estimate a full-rank spatial model. So, it works for reverberant signals to some extent. But it requires a high computational cost, and it strongly depends on the initial values. IVA and Rank-1 MNMF use the rank-1 spatial constraint, that is, the linear mixing assumption. So the separation performance degrades for reverberant signals. But they have efficient optimization methods. To achieve good and stable separation even for reverberant signals, we propose to relax the rank-1 spatial constraint while maintaining efficient optimization.
  10. In this presentation, we propose to utilize extra observations to model direct and reverberant components simultaneously. Now we assume that there are M microphones for N sources, where M = PN. For example, there are 4 microphones and 2 sources, where P = 2. In general, (click) we apply PCA to reduce the dimension of the signal (click). Then, we apply BSS. This dimensionality reduction is expected to remove the reverberant components of all the sources. But the reverberant components are important for acoustic scene analysis. So we shouldn’t ignore them.
  11. In the proposed method (click), we apply IVA or Rank-1 MNMF with extra observations (click). We expect that the 2 original sources are separately obtained as (click) direct and reverberant components like this. Therefore, we assume independence not only between the sources but also between these components. Finally, we reconstruct each source by adding its components.
  13. However, in this method, there is a permutation problem of the separated components because the order of the separated components depends on the initial values. This means that we don’t know which separated components belong to which source. So we have to cluster them into each source, and reconstruct the estimated sources by adding the components of the same source. Here, we propose two methods to cluster these components, one for IVA and one for Rank-1 MNMF.
  15. The first method is for IVA. We can expect that the direct and reverberant components of the same source have a strong cross-correlation in the power spectrogram domain. So we calculate the cross-correlations between all the combinations of the components. Then, we merge them in descending order of the cross-correlations. This is a very simple way, and it actually works well.
  16. For Rank-1 MNMF, we can use another way for the clustering. We can expect that the direct and reverberant components can be modeled by the same bases. So we propose to share the basis matrix T for each source in Rank-1 MNMF. By imposing the basis-shared source model in advance, Rank-1 MNMF can automatically cluster the components into the sources. This means that the separated components are already clustered.
  17. We conducted a separation experiment. This table shows the experimental conditions. We used actual music signals and impulse responses. We produced 4-channel observed signals that include 2 sources. The important point is that the reverberation time is much longer than the FFT length. Also, we used the SDR value, which indicates the total separation quality.
  18. We compared 7 methods. The first and second methods are the conventional methods that apply PCA before BSS. The red ones are the proposed methods that utilize extra observations. We also evaluate the performance of conventional MNMF, which estimates a full-rank spatial model. This one applies maximum SNR beamforming after MNMF. The other one applies multichannel Wiener filtering after MNMF, so this method uses time-variant post-filtering. In addition, we show the upper limit of time-invariant filtering methods as a supervised method.
  19. This is the result for song 1. These methods have (click) different models, like this. The proposed approaches, which separately estimate direct and reverberant components, achieve better results than the methods with PCA. Since the proposed methods use a time-invariant filter for each of the direct and reverberant components, they use 2 filters for one source. Therefore, the proposed methods have the potential to outperform the upper limit of the ideal time-invariant filter. The performance of conventional MNMF is also high. It is comparable with the proposed method, but the results are not stable.
  20. This is the result for song 2. It is similar to the previous result.
  21. This table shows the actual computational time of each method. From these results, the proposed methods can achieve separation performance comparable to MNMF even for reverberant signals while maintaining efficient optimization.
  22. This is the result for song 3. We can confirm that the proposed method outperforms the upper limit.
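
The NMF decomposition described in the notes (a nonnegative spectrogram approximated by a basis matrix T times an activation matrix V) can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Euclidean-distance multiplicative updates of Lee and Seung, while the notes do not state which divergence the talk uses, and the function name `nmf` and its parameters are hypothetical.

```python
import numpy as np

def nmf(X, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Approximate a nonnegative matrix X (freq x frames) as T @ V.

    T (freq x n_bases) holds spectral patterns, V (n_bases x frames)
    holds their time-varying gains. Uses Euclidean-distance
    multiplicative updates (an assumption; see the lead-in).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = X.shape
    T = rng.random((n_freq, n_bases)) + eps
    V = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep T and V nonnegative.
        V *= (T.T @ X) / (T.T @ T @ V + eps)
        T *= (X @ V.T) / (T @ V @ V.T + eps)
    return T, V
```

Each column of T can then be read as one spectral pattern and each row of V as its gain over time; assigning those bases to sources is the clustering problem the notes mention.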
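
The mixing models described in the notes can be written compactly as below. Only Ai, x, and n are named in the notes; the indices i (frequency bin) and j (time frame), the source vector s, the separated components y, and the superscripts (d) and (r) for direct and reverberant parts are notational assumptions made here for illustration, shown for the case P = 2.

```latex
\begin{align}
  % Rank-1 spatial constraint: instantaneous mixing in each frequency bin
  x_{ij} &= A_i s_{ij} \\
  % Reverberation longer than the FFT length: a leaked component is added
  x_{ij} &= A_i s_{ij} + n_{ij} \\
  % Proposed relaxation: each source is recovered as the sum of its
  % separately estimated direct and reverberant components
  \hat{s}_{n,ij} &= y^{(d)}_{n,ij} + y^{(r)}_{n,ij}
\end{align}
```

In the first model, estimating the inverse of Ai is enough to recover the sources; in the second it is not, which is why the separation performance degrades in the reverberant case.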
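
The cross-correlation clustering described in the notes for IVA can be sketched as follows. This is an assumption-laden illustration rather than the authors' code: the function name `cluster_components`, the use of the Pearson correlation of flattened power spectrograms, and the group-size cap are all choices made here. With two components per source (P = 2) the greedy merge yields pairs; for larger P it may leave uneven groups.

```python
import numpy as np

def cluster_components(power_specs, n_sources):
    """Group separated components into sources by spectrogram correlation.

    power_specs: array of shape (K, freq, frames) with the power
    spectrograms of the K separated components (K = P * n_sources).
    Returns a sorted list of index groups, one group per source.
    """
    K = power_specs.shape[0]
    if K % n_sources != 0:
        raise ValueError("K must be a multiple of n_sources")
    per_source = K // n_sources
    # Pearson correlation between all pairs of flattened spectrograms.
    corr = np.corrcoef(power_specs.reshape(K, -1))
    pairs = [(corr[a, b], a, b) for a in range(K) for b in range(a + 1, K)]
    pairs.sort(reverse=True)  # strongest correlation first
    label = list(range(K))    # every component starts in its own group
    size = {k: 1 for k in range(K)}
    for _, a, b in pairs:
        la, lb = label[a], label[b]
        if la == lb or size[la] + size[lb] > per_source:
            continue  # already merged, or the group would grow too large
        for k in range(K):
            if label[k] == lb:
                label[k] = la
        size[la] += size.pop(lb)
    groups = {}
    for k, l in enumerate(label):
        groups.setdefault(l, []).append(k)
    return sorted(groups.values())
```

Given the complex spectrograms of the components in an array `components` of shape (K, freq, frames), each estimated source could then be reconstructed by summing its group, for example `components[group].sum(axis=0)`.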