Efficient initialization for nonnegative matrix factorization based on nonnegative independent component analysis

Daichi Kitamura (SOKENDAI, Japan)
Nobutaka Ono (NII/SOKENDAI, Japan)

IWAENC 2016, Sept. 16, 08:30 - 10:30
Session SPC-II: Student paper competition 2, SPC-II-04
• Nonnegative matrix factorization (NMF) [Lee, 1999]
– Dimensionality reduction with nonnegative constraint
– Unsupervised learning extracting meaningful features
– Sparse decomposition (implicitly)
Research background: what is NMF?
[Figure: input data matrix (power spectrogram, frequency × time) ≈ basis matrix (spectral patterns) × activation matrix (time-varying gains); annotated with the numbers of rows, columns, and bases]
2/19
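The factorization on this slide can be sketched with plain NumPy (a toy illustration; the variable names W, H and the matrix sizes are ours, not from the slide):

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K = 6, 8, 3          # rows, columns, and number of bases (toy sizes)
X = rng.random((I, J))     # nonnegative input data matrix (e.g., power spectrogram)
W = rng.random((I, K))     # basis matrix (spectral patterns)
H = rng.random((K, J))     # activation matrix (time-varying gains)

# NMF approximates X by the low-rank, entrywise-nonnegative product W @ H
approx = W @ H
assert approx.shape == X.shape and (W >= 0).all() and (H >= 0).all()
```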
• Optimization in NMF
– Define a cost function (data fidelity) and minimize it
– No closed-form solution for the basis and activation matrices
– Efficient iterative optimization
• Multiplicative update rules (auxiliary function technique) [Lee, 2001]
– Initial values for all the variables are required.
Research background: how to optimize?
3/19
(when the cost function is a squared Euclidean distance)
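The multiplicative update rules for the squared-Euclidean cost can be sketched as follows (a minimal NumPy illustration of the approach in [Lee, 2001]; sizes, seed, and iteration count are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((6, 8))             # nonnegative input data matrix
W = rng.random((6, 3))             # random initial bases
H = rng.random((3, 8))             # random initial activations
eps = 1e-12                        # avoids division by zero

cost = [np.sum((X - W @ H) ** 2)]
for _ in range(200):
    # Multiplicative updates keep W and H nonnegative by construction
    H *= (W.T @ X) / (W.T @ W @ H + eps)
    W *= (X @ H.T) / (W @ H @ H.T + eps)
    cost.append(np.sum((X - W @ H) ** 2))

# The auxiliary-function derivation guarantees the cost never increases
assert cost[-1] <= cost[0]
```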
• The results of all applications using NMF depend on
the initialization of the basis and activation matrices.
– Ex.: source separation via full-supervised NMF [Smaragdis, 2007]
• Motivation: an initialization method that consistently
gives good performance is desired.
Problem and motivation
4/19
[Figure: SDR improvements (dB) of full-supervised NMF for ten different random seeds (Rand1–Rand10); the gap between poor and good seeds exceeds 1 dB]
• With random values (not focused here)
– Directly use random values
– Search good values via genetic algorithm [Stadlthanner, 2006], [Janecek, 2011]
– Clustering-based initialization [Zheng, 2007], [Xue, 2008], [Rezaei, 2011]
• Cluster the input data into clusters, and set the centroid vectors as the
initial basis vectors.
• Without random values
– PCA-based initialization [Zhao, 2014]
• Apply PCA to the input data, extract orthogonal bases and coefficients,
and set their absolute values as the initial bases and activations.
– SVD-based initialization [Boutsidis, 2008]
• Apply a special SVD (nonnegative double SVD) to the input data and
set the nonnegative left and right singular vectors as the initial values.
Conventional NMF initialization techniques
5/19
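The clustering-based family above can be illustrated by a plain k-means on the data columns, with the centroids taken as initial bases (a generic sketch, not the exact method of any cited paper; `kmeans_init` is our name):

```python
import numpy as np

def kmeans_init(X, K, n_iter=50, seed=0):
    """Cluster the columns of X into K clusters and return the centroids
    as an initial basis matrix (one centroid per basis vector)."""
    rng = np.random.default_rng(seed)
    cols = X.T                                   # one sample per data column
    centers = cols[rng.choice(len(cols), K, replace=False)]
    for _ in range(n_iter):
        # assign each column to its nearest centroid
        d = ((cols[:, None, :] - centers[None]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned columns
        for k in range(K):
            if (labels == k).any():
                centers[k] = cols[labels == k].mean(axis=0)
    return centers.T                             # shape: (rows of X, K)

X = np.abs(np.random.default_rng(1).normal(size=(6, 40)))
W0 = kmeans_init(X, K=3)
assert W0.shape == (6, 3) and (W0 >= 0).all()
```

Because the centroids are means of nonnegative data columns, the resulting initial bases are automatically nonnegative.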
• Are orthogonal bases really better for NMF?
– PCA and SVD are orthogonal decompositions.
– A geometric interpretation of NMF [Donoho, 2003]
• The optimal bases in NMF are “along the edges of a convex cone”
that includes all the observed data points.
– Orthogonal bases might not be good initial values for NMF.
Bases orthogonality?
6/19
[Figure: a convex cone containing all data points. Optimal bases lie along the edges of the cone and are satisfactory for representing all the data points; orthogonal bases have a risk of representing meaningless areas outside the data; tight bases cannot represent all the data points]
• What can we do with only the input data?
– Independent component analysis (ICA) [Comon, 1994]
– ICA extracts non-orthogonal bases
• that maximize the statistical independence between sources.
– ICA estimates sparse sources
• when we assume a super-Gaussian prior.
• Propose to use ICA bases and estimated sources
as initial NMF values
– Objectives:
• 1. Deeper minimization
• 2. Faster convergence
• 3. Better performance
Proposed method: utilization of ICA
7/19
[Figure: value of the NMF cost function vs. number of update iterations, illustrating deeper minimization and faster convergence]
• The input data matrix is a mixture of some sources.
– The sources are mixed via a mixing matrix and then observed
as the input data matrix.
– ICA can estimate a demixing matrix and the
mutually independent sources.
• PCA for only the dimensionality reduction in NMF
• Nonnegative ICA for taking nonnegativity into account
• Nonnegativization for ensuring complete nonnegativity
Proposed method: concept
8/19
[Diagram: input data matrix = mixing matrix × source matrix. Processing flow: input data matrix → PCA (PCA matrix for dimensionality reduction) → NICA (ICA bases and mutually independent sources) → nonnegativization → initial values for NMF]
• Nonnegative ICA (NICA) [Plumbley, 2003]
– estimates a demixing matrix so that all of the separated
sources become nonnegative.
– finds a rotation matrix for the pre-whitened mixtures.
– Steepest gradient descent is used to estimate the rotation.
Nonnegative constrained ICA
9/19
Cost function: J(W) = (1/2) E[ ||y − y⁺||² ], where y⁺ = max(y, 0) is the rectified (nonnegative) part of the separated sources
[Diagram: observed mixtures → whitening without centering → pre-whitened signals → rotation (demixing) → separated sources]
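The steepest-gradient NICA update can be sketched as follows, assuming the cost penalizes the squared negative parts of the separated sources and the rotation is re-orthogonalized after each step (our reading of [Plumbley, 2003]; the step size and iteration count are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
# Nonnegative, mutually independent "true" sources, mixed by a random matrix
S = rng.exponential(size=(2, 1000))
A = rng.random((2, 2)) + 0.5
X = A @ S

# Whitening WITHOUT centering (centering would destroy the nonnegativity cue)
C = X @ X.T / X.shape[1]
d, E = np.linalg.eigh(C)
V = E @ np.diag(d ** -0.5) @ E.T
Z = V @ X                                  # pre-whitened mixtures

W = np.eye(2)                              # rotation (demixing) matrix
init_neg = np.mean(np.minimum(W @ Z, 0) ** 2)
mu = 0.1
for _ in range(500):
    Y = W @ Z
    Yneg = np.minimum(Y, 0)                # negative parts of the outputs
    # J(W) = 0.5 * E[||Y - max(Y, 0)||^2]; its gradient w.r.t. W is E[Yneg Z^T]
    W -= mu * (Yneg @ Z.T / Z.shape[1])
    # keep W a rotation via symmetric orthogonalization
    U, _, Vt = np.linalg.svd(W)
    W = U @ Vt

final_neg = np.mean(np.minimum(W @ Z, 0) ** 2)
```

On this toy mixture the mean squared negative energy of the outputs shrinks as the rotation aligns the whitened data with the nonnegative orthant.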
• Dimensionality reduction via PCA
• NMF variables obtained from the estimates of NICA
– Suppose the demixing is decomposed into the NICA rotation and
the PCA matrix;
– then the initial basis matrix is obtained from the ICA bases and
the initial activation matrix from the estimated sources.
Combine PCA for dimensionality reduction
10/19
[Diagram: the PCA matrix, whose rows are the top eigenvectors of the data correlation (sorted from high to low eigenvalue), reduces the dimensionality; the basis matrix is formed from the ICA bases and the activation matrix from the estimated sources, via the rotation matrix estimated by NICA]
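The construction on this slide can be sketched as follows; `W_rot` stands in for the rotation that NICA would estimate (here it is left as the identity, so the sketch only shows the shapes and the rank-K reconstruction):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.random((20, 300))                 # nonnegative input data matrix
K = 3                                     # number of bases

# PCA without centering: keep the top-K eigenvectors of the data correlation
C = X @ X.T / X.shape[1]
d, E = np.linalg.eigh(C)
P = E[:, np.argsort(d)[::-1][:K]].T       # K x 20 PCA matrix
Z = P @ X                                 # dimensionality-reduced data

# Placeholder for the NICA rotation (estimated by NICA in the actual method)
W_rot = np.eye(K)

S = W_rot @ Z                             # estimated sources -> activations
B = P.T @ W_rot.T                         # ICA bases mapped back -> bases

# Before nonnegativization, X is approximated by the rank-K product B @ S
recon = B @ S
assert recon.shape == X.shape
```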
• Even if we use NICA, nonnegativity is not guaranteed:
– The obtained sources may not become completely nonnegative
because of the dimensionality reduction by PCA.
– For the obtained basis (ICA bases), nonnegativity is
not assumed in NICA at all.
• Apply a "nonnegativization" to the obtained bases and sources:
– Method 1:
– Method 2:
– Method 3:
• where the scale-fitting coefficients depend on the
divergence of the subsequent NMF
Nonnegativization
11/19
[Annotations: the scale-fitting coefficients are computed from correlations between the corresponding matrices]
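Since the slide gives Methods 1–3 as formulas, the sketch below only illustrates two generic nonnegativization options (absolute value and half-wave rectification) with a least-squares scale fit; the paper's actual methods and coefficients may differ:

```python
import numpy as np

def nonnegativize(B, S, X, mode="abs"):
    """Force the ICA bases B and sources S to be nonnegative, then fit a
    scalar so that lam * (W @ H) best matches X in the least-squares sense.
    'abs' and 'rect' are generic illustrations, not the paper's Methods 1-3."""
    if mode == "abs":
        W, H = np.abs(B), np.abs(S)                  # absolute value
    else:
        W, H = np.maximum(B, 0), np.maximum(S, 0)    # half-wave rectification
    R = W @ H
    lam = np.sum(R * X) / max(np.sum(R * R), 1e-12)  # least-squares scale fit
    return W, lam * H

rng = np.random.default_rng(0)
X = rng.random((6, 8))                    # nonnegative input
B = rng.normal(size=(6, 3))               # ICA bases (may be negative)
S = rng.normal(size=(3, 8))               # estimated sources (may be negative)
W0, H0 = nonnegativize(B, S, X)
assert (W0 >= 0).all() and (H0 >= 0).all()
```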
• Power spectrogram of a mixture of vocals (Vo.) and guitar (Gt.)
– Song: "Actions – One Minute Smile" from SiSEC2015
– Size of power spectrogram: 2049 x 1290 (60 s)
– Number of bases:
Experiment: conditions
12/19
[Figure: power spectrogram of the mixture; frequency (kHz) vs. time (s)]
• Convergence of cost function in NICA
Experiment: results of NICA
13/19
[Figure: value of the NICA cost function (0.0–0.6) vs. number of iterations (0–2000) for steepest gradient descent]
• Convergence of EU-NMF
Experiment: results of Euclidian NMF
14/19
Processing time for initialization:
NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s
EU-NMF: 12.78 s (for 1000 iter.)
Rand1–10 are based on random initialization with different seeds.
• Convergence of KL-NMF
Experiment: results of Kullback-Leibler NMF
15/19
[Figure: convergence curves of the KL-NMF cost, with PCA and the proposed methods highlighted]
Processing time for initialization:
NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s
KL-NMF: 48.07 s (for 1000 iter.)
Rand1–10 are based on random initialization with different seeds.
• Convergence of IS-NMF
Experiment: results of Itakura-Saito NMF
16/19
[Figure: convergence curves of the IS-NMF cost (×10⁶), with PCA and the proposed methods highlighted]
Processing time for initialization:
NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s
IS-NMF: 214.26 s (for 1000 iter.)
Rand1–10 are based on random initialization with different seeds.
Experiment: full-supervised source separation
• Full-supervised NMF [Smaragdis, 2007]
– Simply use pre-trained source-wise bases for separation
17/19
[Diagram: training stage — source-wise bases are pre-trained on each source with NMF, initialized by the conventional or proposed method; separation stage — the pre-trained bases are fixed, and the activations are initialized based on the correlations between the input and the pre-trained bases]
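The full-supervised scheme can be sketched as follows: the concatenated pre-trained bases are fixed, only the activations are updated (Euclidean multiplicative updates here for brevity), and each source is reconstructed from its own block of bases (a simplified sketch, not the authors' exact setup):

```python
import numpy as np

def separate(X, W_list, n_iter=200, eps=1e-12):
    """Fix the concatenated pre-trained bases, update only the activations,
    and reconstruct each source from its own block of bases."""
    W = np.hstack(W_list)                      # fixed, source-wise bases
    H = np.random.default_rng(0).random((W.shape[1], X.shape[1]))
    for _ in range(n_iter):
        H *= (W.T @ X) / (W.T @ W @ H + eps)   # Euclidean multiplicative update
    parts, k = [], 0
    for Wn in W_list:                          # per-source reconstruction
        parts.append(Wn @ H[k:k + Wn.shape[1]])
        k += Wn.shape[1]
    return parts

rng = np.random.default_rng(1)
W1, W2 = rng.random((6, 2)), rng.random((6, 2))   # toy pre-trained bases
X = W1 @ rng.random((2, 10)) + W2 @ rng.random((2, 10))
s1, s2 = separate(X, [W1, W2])
assert s1.shape == X.shape and s2.shape == X.shape
```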
• Two-source separation using full-supervised NMF
– SiSEC2015 MUS dataset (professionally recorded music)
– Averaged SDR improvements over 15 songs
Experiment: results of separation
18/19
[Figure: averaged SDR improvements (dB), 0–12 dB for source 1 and 0–5 dB for source 2, comparing the proposed initializations (NICA1–NICA3) with the conventional ones (PCA, SVD, Rand1–Rand10)]
Conclusion
• Proposed an efficient initialization method for NMF
• Utilized statistical independence to obtain
non-orthogonal bases and sources
– Orthogonality may not be preferable for NMF.
• The proposed initialization gives
– deeper minimization
– faster convergence
– better performance for full-supervised source separation
19/19
Thank you for your attention!
Conventional and non-conventional methods for improvement of cucurbits.pptxConventional and non-conventional methods for improvement of cucurbits.pptx
Conventional and non-conventional methods for improvement of cucurbits.pptxgandhi976
18 views35 slides
Pollination By Nagapradheesh.M.pptx by
Pollination By Nagapradheesh.M.pptxPollination By Nagapradheesh.M.pptx
Pollination By Nagapradheesh.M.pptxMNAGAPRADHEESH
15 views9 slides
Open Access Publishing in Astrophysics by
Open Access Publishing in AstrophysicsOpen Access Publishing in Astrophysics
Open Access Publishing in AstrophysicsPeter Coles
725 views26 slides
"How can I develop my learning path in bioinformatics? by
"How can I develop my learning path in bioinformatics?"How can I develop my learning path in bioinformatics?
"How can I develop my learning path in bioinformatics?Bioinformy
21 views13 slides

Recently uploaded(20)

ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx by MN
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptxENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx
ENTOMOLOGY PPT ON BOMBYCIDAE AND SATURNIIDAE.pptx
MN6 views
Artificial Intelligence Helps in Drug Designing and Discovery.pptx by abhinashsahoo2001
Artificial Intelligence Helps in Drug Designing and Discovery.pptxArtificial Intelligence Helps in Drug Designing and Discovery.pptx
Artificial Intelligence Helps in Drug Designing and Discovery.pptx
abhinashsahoo2001118 views
Conventional and non-conventional methods for improvement of cucurbits.pptx by gandhi976
Conventional and non-conventional methods for improvement of cucurbits.pptxConventional and non-conventional methods for improvement of cucurbits.pptx
Conventional and non-conventional methods for improvement of cucurbits.pptx
gandhi97618 views
Pollination By Nagapradheesh.M.pptx by MNAGAPRADHEESH
Pollination By Nagapradheesh.M.pptxPollination By Nagapradheesh.M.pptx
Pollination By Nagapradheesh.M.pptx
MNAGAPRADHEESH15 views
Open Access Publishing in Astrophysics by Peter Coles
Open Access Publishing in AstrophysicsOpen Access Publishing in Astrophysics
Open Access Publishing in Astrophysics
Peter Coles725 views
"How can I develop my learning path in bioinformatics? by Bioinformy
"How can I develop my learning path in bioinformatics?"How can I develop my learning path in bioinformatics?
"How can I develop my learning path in bioinformatics?
Bioinformy21 views
How to be(come) a successful PhD student by Tom Mens
How to be(come) a successful PhD studentHow to be(come) a successful PhD student
How to be(come) a successful PhD student
Tom Mens460 views
A training, certification and marketing scheme for informal dairy vendors in ... by ILRI
A training, certification and marketing scheme for informal dairy vendors in ...A training, certification and marketing scheme for informal dairy vendors in ...
A training, certification and marketing scheme for informal dairy vendors in ...
ILRI11 views
Distinct distributions of elliptical and disk galaxies across the Local Super... by Sérgio Sacani
Distinct distributions of elliptical and disk galaxies across the Local Super...Distinct distributions of elliptical and disk galaxies across the Local Super...
Distinct distributions of elliptical and disk galaxies across the Local Super...
Sérgio Sacani30 views
Light Pollution for LVIS students by CWBarthlmew
Light Pollution for LVIS studentsLight Pollution for LVIS students
Light Pollution for LVIS students
CWBarthlmew5 views
Workshop Chemical Robotics ChemAI 231116.pptx by Marco Tibaldi
Workshop Chemical Robotics ChemAI 231116.pptxWorkshop Chemical Robotics ChemAI 231116.pptx
Workshop Chemical Robotics ChemAI 231116.pptx
Marco Tibaldi95 views
MODULE-9-Biotechnology, Genetically Modified Organisms, and Gene Therapy.pdf by KerryNuez1
MODULE-9-Biotechnology, Genetically Modified Organisms, and Gene Therapy.pdfMODULE-9-Biotechnology, Genetically Modified Organisms, and Gene Therapy.pdf
MODULE-9-Biotechnology, Genetically Modified Organisms, and Gene Therapy.pdf
KerryNuez121 views
CSF -SHEEBA.D presentation.pptx by SheebaD7
CSF -SHEEBA.D presentation.pptxCSF -SHEEBA.D presentation.pptx
CSF -SHEEBA.D presentation.pptx
SheebaD711 views
application of genetic engineering 2.pptx by SankSurezz
application of genetic engineering 2.pptxapplication of genetic engineering 2.pptx
application of genetic engineering 2.pptx
SankSurezz7 views
Experimental animal Guinea pigs.pptx by Mansee Arya
Experimental animal Guinea pigs.pptxExperimental animal Guinea pigs.pptx
Experimental animal Guinea pigs.pptx
Mansee Arya13 views

Efficient initialization for nonnegative matrix factorization based on nonnegative independent component analysis

  • 1. Daichi Kitamura (SOKENDAI, Japan) Nobutaka Ono (NII/SOKENDAI, Japan) Efficient initialization for NMF based on nonnegative ICA IWAENC 2016, Sept. 16, 08:30 - 10:30, Session SPS-II - Student paper competition 2 SPC-II-04
  • 2. Research background: what is NMF? • Nonnegative matrix factorization (NMF) [Lee, 1999] – Dimensionality reduction with a nonnegative constraint – Unsupervised learning that extracts meaningful features – Implicitly sparse decomposition. [Figure: input data matrix (power spectrogram, frequency vs. time) ≈ basis matrix (spectral patterns) × activation matrix (time-varying gains); legend gives the number of rows, the number of columns, and the number of bases.]
  • 3. Research background: how to optimize? • Optimization in NMF – Define a cost function (data fidelity) and minimize it – No closed-form solution for the bases and activations – Efficient iterative optimization • Multiplicative update rules (auxiliary function technique) [Lee, 2001], shown here for the case where the cost function is a squared Euclidean distance – Initial values for all the variables are required.
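As a concrete illustration, the Euclidean multiplicative updates mentioned above can be sketched in a few lines of NumPy (a minimal sketch, not the authors' code; the names D, F, G follow the slides, and the `eps` guard against division by zero is my addition):

```python
import numpy as np

def nmf_eu(D, K, n_iter=100, seed=0, eps=1e-12):
    """Euclidean NMF via Lee & Seung's multiplicative updates.

    D: (I, J) nonnegative data matrix; K: number of bases.
    The random initialization below is exactly what the talk
    proposes to replace with an informed (NICA-based) one.
    """
    I, J = D.shape
    rng = np.random.default_rng(seed)
    F = rng.random((I, K))          # initial bases
    G = rng.random((K, J))          # initial activations
    for _ in range(n_iter):
        # Ratios of nonnegative terms keep F and G nonnegative, and each
        # update monotonically decreases the squared Euclidean cost.
        F *= (D @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ D) / (F.T @ F @ G + eps)
    return F, G
```

With an informed initialization, the two random draws above would simply be replaced by externally supplied F and G.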
  • 4. Problem and motivation • The results of all applications using NMF always depend on the initialization of the bases and activations. – Ex.: source separation via full-supervised NMF [Smaragdis, 2007] • Motivation: an initialization method that consistently gives good performance is desired. [Figure: SDR improvement (dB) for ten different random seeds (Rand1–Rand10); the gap between the poorest and best seeds exceeds 1 dB.]
  • 5. Conventional NMF initialization techniques • With random values (not focused on here) – Directly use random values – Search for good values via a genetic algorithm [Stadlthanner, 2006], [Janecek, 2011] – Clustering-based initialization [Zheng, 2007], [Xue, 2008], [Rezaei, 2011]: cluster the input data and set the centroid vectors as the initial basis vectors • Without random values – PCA-based initialization [Zhao, 2014]: apply PCA to the input data, extract orthogonal bases and coefficients, and set their absolute values as the initial bases and activations – SVD-based initialization [Boutsidis, 2008]: apply a special SVD (nonnegative double SVD) to the input data and set the nonnegative left and right singular vectors as the initial values
  • 6. Bases orthogonality? • Are orthogonal bases really better for NMF? – PCA and SVD are orthogonal decompositions. – A geometric interpretation of NMF [Donoho, 2003]: the optimal bases in NMF are “along the edges of a convex cone” that includes all the observed data points. – Orthogonality might not give good initial values for NMF. [Figure: convex cone containing the data points; bases along the edges are sufficient to represent all the data points, orthogonal bases risk representing a meaningless area, and overly tight bases cannot represent all the data points.]
  • 7. Proposed method: utilization of ICA • What can we do from only the input data? – Independent component analysis (ICA) [Comon, 1994] – ICA extracts non-orthogonal bases that maximize the statistical independence between sources. – ICA estimates sparse sources when we assume a super-Gaussian prior. • We propose to use the ICA bases and estimated sources as the initial NMF values – Objectives: 1. deeper minimization, 2. faster convergence, 3. better performance. [Figure: NMF cost value versus number of update iterations, illustrating deeper minimization and faster convergence.]
  • 8. Proposed method: concept • The input data matrix is a mixture of some sources. – The sources (activations) are mixed via the ICA bases and observed as the dimensionality-reduced data – ICA can estimate a demixing matrix and the mutually independent sources. • PCA is used only for dimensionality reduction in NMF • Nonnegative ICA takes the nonnegativity into account • Nonnegativization ensures complete nonnegativity. [Diagram: input data matrix → PCA (dimensionality reduction) → NICA → nonnegativization → initial values for NMF.]
  • 9. Nonnegative constrained ICA • Nonnegative ICA (NICA) [Plumbley, 2003] – estimates the demixing matrix so that all of the separated sources become nonnegative – equivalently, finds a rotation matrix W for the pre-whitened mixtures z (whitening without centering) – Cost function: J(W) = (1/2) Σ min(y, 0)² with y = Wz, i.e., the power of the remaining negative components of the separated signals – W is estimated by steepest gradient descent.
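Under those definitions, the steepest-descent estimation of the rotation W might be sketched as follows (a simplification, assuming the cost J(W) = ½ Σ min(y, 0)² above; the polar-decomposition step that keeps W orthogonal and the step size `mu` are my substitutions for the geodesic rotation update in Plumbley's paper):

```python
import numpy as np

def nica_rotation(Z, n_iter=500, mu=0.1):
    """Estimate an orthogonal demixing matrix W for pre-whitened data Z
    (shape K x T) so that Y = W @ Z becomes approximately nonnegative."""
    K, T = Z.shape
    W = np.eye(K)
    for _ in range(n_iter):
        Y = W @ Z
        Y_neg = np.minimum(Y, 0.0)     # the negative parts being penalized
        grad = Y_neg @ Z.T / T         # gradient of 0.5 * mean(min(Y, 0)**2)
        W -= mu * grad                 # steepest gradient descent step
        # Project W back onto the orthogonal (rotation) group.
        U, _, Vt = np.linalg.svd(W)
        W = U @ Vt
    return W
```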
  • 10. Combine PCA for dimensionality reduction • Dimensionality reduction via PCA – the rows of the PCA matrix are eigenvectors of the data covariance, and the reduction matrix keeps the top-K eigenvectors (those with the highest eigenvalues) • NMF variables obtained from the NICA estimates – Suppose the reduced data are demixed by the rotation matrix estimated by NICA; then the initial basis matrix (ICA bases) and activation matrix (sources) follow by inverting the PCA and the rotation.
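Combining the pieces, the PCA reduction and the mapping back to NMF variables could be sketched as below. This is my reconstruction from the slide, not the paper's exact equations: P1 holds the top-K eigenvectors of D Dᵀ as rows, the reduced rows are scaled to unit power before NICA, and F, G are recovered so that F G reproduces the rank-K approximation of D; `nica_fn` is a placeholder for any routine returning the K×K NICA rotation.

```python
import numpy as np

def nica_init(D, K, nica_fn):
    """PCA dimensionality reduction -> NICA -> initial NMF variables F, G."""
    vals, vecs = np.linalg.eigh(D @ D.T)
    idx = np.argsort(vals)[::-1][:K]
    P1 = vecs[:, idx].T                   # (K, I): top-K eigenvectors as rows
    d = np.sqrt(np.maximum(vals[idx], 1e-12))
    Z = (P1 @ D) / d[:, None]             # whitened (unit-power) reduced data
    W = nica_fn(Z)                        # K x K rotation from NICA
    G = W @ Z                             # estimated sources -> activations
    F = (P1.T * d) @ W.T                  # ICA bases mapped back to data space
    return F, G
```

With the identity as a stand-in rotation, F @ G reduces to the rank-K PCA approximation of D, which is the sanity check used below.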
  • 11. Nonnegativization • Even if we use NICA, there is no guarantee that – the obtained sources become completely nonnegative, because of the dimensionality reduction by PCA – the obtained bases (ICA bases) are nonnegative, since nonnegativity of the bases is not assumed in NICA. • We therefore apply a “nonnegativization” to the obtained bases and activations: – Method 1: take the absolute values of both – Method 2: take the absolute values of the bases, then compute the activations from the correlations between the data and the bases – Method 3: take the absolute values of the activations, then compute the bases from the correlations between the data and the activations • where the scale-fitting coefficients depend on the divergence used in the following NMF.
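The three schemes could be sketched as follows (a sketch assuming the Euclidean case: the scale coefficient `alpha` below is fitted by least squares, which is one concrete instance of the divergence-dependent scale fitting the slide mentions):

```python
import numpy as np

def nonnegativize(D, F, G, method=2, eps=1e-12):
    """Force the NICA-derived F (bases) and G (activations) to be nonnegative."""
    if method == 1:
        Fn, Gn = np.abs(F), np.abs(G)          # Method 1: absolute values
    elif method == 2:
        Fn = np.abs(F)                         # Method 2: |F|, then G from
        Gn = np.maximum(Fn.T @ D, eps)         # correlations between D and F
    else:
        Gn = np.abs(G)                         # Method 3: |G|, then F from
        Fn = np.maximum(D @ Gn.T, eps)         # correlations between D and G
    # Scale fitting: alpha minimizing ||D - alpha * Fn Gn||_F (Euclidean case).
    R = Fn @ Gn
    alpha = np.sum(D * R) / (np.sum(R * R) + eps)
    return alpha * Fn, Gn
```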
  • 12. Experiment: conditions • Power spectrogram of a mixture of vocals (Vo.) and guitar (Gt.) – Song: “Actions – One Minute Smile” from SiSEC2015 – Size of power spectrogram: 2049 x 1290 (60 sec.) – Number of bases: 60. [Figure: power spectrogram, frequency (kHz) versus time (s).]
  • 13. Experiment: results of NICA • Convergence of the cost function in NICA. [Figure: NICA cost value versus number of steepest-gradient-descent iterations (0–2000); the cost converges at around 2000 iterations.]
  • 14. Experiment: results of Euclidean NMF • Convergence of EU-NMF • Processing time for initialization – NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s, EU-NMF: 12.78 s (for 1000 iter.) • Rand1–10 are based on random initialization with different seeds.
  • 15. Experiment: results of Kullback-Leibler NMF • Convergence of KL-NMF • Processing time for initialization – NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s, KL-NMF: 48.07 s (for 1000 iter.) • Rand1–10 are based on random initialization with different seeds. [Figure: convergence curves for PCA and the proposed methods.]
  • 16. Experiment: results of Itakura-Saito NMF • Convergence of IS-NMF • Processing time for initialization – NICA: 4.36 s, PCA: 0.98 s, SVD: 2.40 s, IS-NMF: 214.26 s (for 1000 iter.) • Rand1–10 are based on random initialization with different seeds. [Figure: convergence curves for PCA and the proposed methods.]
  • 17. Experiment: full-supervised source separation • Full-supervised NMF [Smaragdis, 2007] – Simply uses pre-trained sourcewise bases for separation – Training stage: the bases of each source are trained by NMF, initialized by the conventional or proposed method – Separation stage: the pre-trained bases are fixed, and the activations are initialized based on the correlations between the mixture and the sourcewise bases.
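For the separation stage specifically, a minimal sketch (Euclidean updates; the correlation-based activation initialization follows the slide, everything else is a generic implementation and not the authors' code):

```python
import numpy as np

def supervised_separate(X, F1, F2, n_iter=200, eps=1e-12):
    """Separation stage of full-supervised NMF: the pre-trained bases
    F1, F2 stay fixed and only the activations are updated."""
    F = np.hstack([F1, F2])
    G = np.maximum(F.T @ X, eps)      # init from correlations between X and bases
    for _ in range(n_iter):
        G *= (F.T @ X) / (F.T @ F @ G + eps)   # update activations only
    K1 = F1.shape[1]
    return F1 @ G[:K1], F2 @ G[K1:]   # sourcewise reconstructions
```

Because the bases are fixed, the separation quality hinges entirely on how well F1 and F2 were trained, i.e., on the initialization used in the training stage.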
  • 18. Experiment: results of separation • Two-source separation using full-supervised NMF – SiSEC2015 MUS dataset (professionally recorded music) – Averaged SDR improvements over 15 songs. [Figure: SDR improvement (dB) for source 1 and source 2, comparing the conventional initializations (Rand1–Rand10, PCA, SVD) with the proposed ones (NICA1–NICA3).]
  • 19. Conclusion • We proposed an efficient initialization method for NMF • It utilizes statistical independence to obtain non-orthogonal bases and sources – Orthogonality may not be preferable for NMF. • The proposed initialization gives – deeper minimization – faster convergence – better performance for full-supervised source separation. Thank you for your attention!

Editor's Notes

  1. This is a background. This presentation treats NMF, which is nonnegative matrix factorization. NMF is a dimensionality reduction with a nonnegative constraint, namely, all the decomposed components must be nonnegative, and the input matrix is represented by the linear combination of these nonnegative parts. From this nature, the decomposed nonnegative parts tend to be meaningful features. For example, for acoustic signals, we often use a power spectrogram as the input data matrix. In this case, by the NMF decomposition, we can extract some spectral patterns in the input spectrogram and their time-varying gains as nonnegative parts. These patterns are often called basis vectors, and the gains are called activations. From the nonnegative constraint, the decomposed parts implicitly tend to be sparse.
  2. In NMF, we define a cost function as a divergence between the input data D and the decomposed model FG, and minimize it to find the bases and activations under the nonnegative constraint. There is no closed-form solution, but efficient iterative update rules have been proposed: the multiplicative update rules, shown here for the case where the cost function is a squared Euclidean distance. However, this kind of optimization requires initial values for all the variables.
  3. And the problem is that / the results of all the applications using NMF / always depend on the initial values of F and G. This graph shows an example of source separation performance based on a full-supervised NMF. The vertical axis indicates the performance, and we tried with 10 random seeds. The difference is more than 1 dB, so it strongly depends on the initial values. That’s the problem. And our motivation is that / we want to develop an efficient initialization method / that always gives us a good performance.
  4. Some initialization techniques have already been proposed for NMF, and they can be roughly divided into two types: one uses random values, and the other does not. In this presentation, we focus only on the methods without random values, because they are more stable than those using random values. In this category, there have been two methods: PCA-based and SVD-based initialization. Both apply PCA or SVD to the data matrix D and extract orthogonal bases that represent D, which can then be used as the initial bases and activations.
  5. But the question is: are orthogonal bases really better for NMF? That is problematic, because a geometric interpretation of NMF reports that the optimal bases in NMF lie along the edges of a convex cone that includes all the observed data points. We show that here. Suppose we observe some data points. They must lie in the positive orthant because the data points are nonnegative, and the convex cone that includes all the data can be drawn as this yellow triangle. The optimal bases lie along the edges of the cone because they are sufficient to represent all the data. Orthogonal bases can also represent them, but they risk representing the meaningless area. On the other hand, overly tight bases cannot represent all the data points. Therefore, we suspected that orthogonal bases might not be good initial values for NMF.
  6. So what can we do from only the input data matrix? We focused on ICA, independent component analysis, because ICA can extract non-orthogonal bases using the statistical independence between the coefficients, which are called sources. Also, ICA can estimate sparse sources if we assume a super-Gaussian prior. Since NMF is also considered a sparse decomposition, this property may also be suitable. So, in this presentation, we propose to use the ICA bases and the estimated sources as the initial NMF values. The objectives are deeper minimization, faster convergence, and better performance in many applications. (Timing memo: if I am at around six minutes here, the talk will finish within fifteen minutes.)
  7. This slide shows the concept of the proposed method. We assume that the input data matrix is a mixture of some independent sources, namely, N sources in G, which are the activations, are mixed via ICA bases F, then observed as P1D. Here, the matrix P1 is just a PCA matrix for the dimensionality reduction. I will show you the detail in the following slide. Anyway, we assume the activations of each basis are mutually independent. And this / is showing the process flow of the initialization, and here / is the proposed method. First, we reduce the dimension of the input data matrix D by applying PCA. This is the orthogonal decomposition, but after that, we apply ICA. So it’s not a problem. In this presentation, we apply nonnegative ICA to take the nonnegativity into account. This is also explained in the next slide. Finally, nonnegativization is applied for ensuring complete nonnegativity, namely, we force all the estimates of NICA to be nonnegative.
  8. I briefly introduce nonnegative ICA. NICA estimates the demixing matrix so that all of the separated sources become nonnegative. This problem is equivalent to finding a rotation matrix W for the pre-whitened mixtures. Here we have an observed scatter x, and we apply whitening without data centering; z is the whitened data. Then, we simply rotate z so that all the data lie in the positive orthant, which corresponds to demixing the sources. This is the cost function in NICA: the minimization of the power of the remaining negative components. The rotation matrix W can be estimated by steepest gradient descent, as shown at the bottom.
  9. Also, we combine PCA for only the use of dimensionality reduction. P is a PCA matrix, so it has the eigenvectors of DD transpose, and P1 is the top-K eigenvectors. We can easily derive a relationship between NMF variables, F and G, and NICA estimates W using P.
  10. Finally, we have to make the variables nonnegative at the end of the initialization because, even if we use NICA, there is no guarantee that the obtained F and G become completely nonnegative. This is due to the dimensionality reduction. So we apply a nonnegativization to the obtained F and G. We propose three types of nonnegativization. Method 1 simply takes the absolute values of the obtained F and G. In Method 2, we first take the absolute values of the bases, and then calculate the nonnegative activations from the correlations between the data D and F. Method 3 is the inverse of Method 2, where alpha is just a scale-fitting coefficient, calculated like this.
  11. We conducted an experiment. This is the data matrix D. It’s a power spectrogram of the song obtained from SiSEC2015, and it includes vocals and guitar sounds. The matrix size is here, and we set the number of bases to 60.
  12. This is a result of cost function in NICA. We can confirm the convergence at around 2000 iterations.
  13. After the initialization, we iterate the NMF update rules. We compared PCA-based and SVD-based initializations, random initialization, and the proposed method with the three types of nonnegativization. This graph shows the convergence of the cost function in NMF: here are the random ones, the SVD and PCA ones, and the proposed methods. We can confirm that the proposed methods achieve faster and deeper convergence than the other methods. Also, the second type of nonnegativization gave the best performance in this experiment.
  14. Next, we evaluated the proposed initializations on a full-supervised source separation task. In full-supervised NMF, we train on a sample sound of each source in advance and produce the sourcewise bases F1 and F2. The training is a simple NMF, and we used the various initialization methods here. At the separation stage, F1 and F2 are fixed and used to separate the mixture, where the initial values of G1 and G2 are calculated based on the correlations between X and F1 or F2. Therefore, the separation performance depends only on the initialization in the training stage.
  15. These graphs show the averaged results over 15 songs: the left one is for source 1, and the right one is for source 2. Although deeper minimization of the NMF cost function does not directly guarantee better separation performance, the proposed methods achieve more accurate separation than the other methods. This might be because, for separation using full-supervised NMF, appropriate bases should be trained, and they should be the minimally sufficient bases, namely, the bases along the edges of the data cone. We expect that the proposed method induces the bases toward such meaningful parts along the edges.