Depth Estimation of Sound Images Using
Directional Clustering and Activation-Shared
Nonnegative Matrix Factorization
Tomo Miyauchi, Daichi Kitamura,
Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
Outline
• Background and related study
• Problem and purpose
• Proposed method 1
  - Depth estimation based on DOA distribution
• Proposed method 2
  - Activation-shared nonnegative matrix factorization
• Experiments
• Conclusions
Background
With the advent of 3D TV, the reproduction of 3D images has been realized.
Problem: the viewer feels uncomfortable because of the mismatch between the picture image and the sound image.
To solve this problem, sound field reproduction techniques, which can present the “direction” and “depth” of the sound images to the listener, have been studied actively.
However, a 3D sound reproduction system has not been established yet.
[Figure: 3D TV showing the positions of the picture image and the sound image]
Related study: wave field synthesis
Wave field synthesis (WFS) [A. J. Berkhout, et al., 1993]
- A sound field reproduction technique that can represent the "depth" of sound images.
- WFS allows us to create sound images in front of the loudspeakers.
[Figure: loudspeaker array, sound image, and listener]
Drawback of WFS
WFS requires the primary source information of sound images:
1. Individual sound sources
2. Localization information
This information has been lost in existing contents through down-mixing, so an up-mixing method is required:
1. Source separation (mixed signal → individual sources)
2. Localization estimation of sound images
Spatial sound system using existing contents
Stereo contents (mixed multichannel signal) → up-mixer → wave field synthesis → spatial sound reproduction
Flow of proposed up-mixer
1. Sound source separation and directional estimation (conventional methods)
2. Depth estimation (new depth estimation; this study)
Depth estimation of sound images has not been proposed.
Related study: directional clustering [Araki, et al., 2007]
Mixed stereo signal → Fourier transform → normalization → clustering → inverse Fourier transform → individual sources of each cluster
[Figure: scatter plots of L-ch input signal vs. R-ch input signal before normalization, after normalization, and after clustering; markers show source components and spatial representative vectors]
Problem and purpose
Problem: WFS requires specific localization information of individual sound sources to reproduce a sound field.
Up-mixer: directional estimation methods have been developed, e.g., directional estimation based on VBAP [Hirata, et al., 2011]. How can we get depth information?
Purpose: establishing a new depth estimation method.
Proposed method: depth estimation method using the direction of arrival (DOA) distribution.
Proposed method 1: depth estimation based on DOA
We estimate the depth using the distribution of the DOA (“direction of arrival” of sound waves).
Mixed signal → directional clustering → individual sources → weighted DOA histogram
- Directional information (DOA): amplitude ratio between the L-ch and R-ch components
- Weighting term: magnitude of each vector
[Figure: weighted DOA histogram; horizontal axis: direction of arrival (left, center, right); vertical axis: frequency of source components]
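The weighted DOA histogram can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes that the DOA of each time-frequency component is the angle of the vector (|L|, |R|) and that the weight is the magnitude of that vector. The function name `weighted_doa_histogram`, the angle convention, and the number of bins are hypothetical choices.

```python
import numpy as np

def weighted_doa_histogram(spec_l, spec_r, n_bins=90):
    """Weighted DOA histogram of a stereo spectrogram (illustrative sketch).

    spec_l, spec_r: complex STFT matrices of the L-ch and R-ch signals.
    Assumption: the DOA of a component is the angle of (|L|, |R|), so
    0 deg = fully left, 45 deg = center, 90 deg = fully right.
    """
    mag_l = np.abs(spec_l).ravel()
    mag_r = np.abs(spec_r).ravel()
    # Directional information: amplitude ratio between channels, as an angle.
    doa_deg = np.degrees(np.arctan2(mag_r, mag_l))
    # Weighting term: magnitude of each (L, R) vector.
    weight = np.hypot(mag_l, mag_r)
    hist, edges = np.histogram(doa_deg, bins=n_bins, range=(0.0, 90.0), weights=weight)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, hist

# Toy example: a component panned to the center plus weak noise in each channel.
rng = np.random.default_rng(0)
source = rng.standard_normal((257, 100)) + 1j * rng.standard_normal((257, 100))
noise_l = 0.1 * rng.standard_normal((257, 100))
noise_r = 0.1 * rng.standard_normal((257, 100))
centers, hist = weighted_doa_histogram(source + noise_l, source + noise_r)
peak_doa = centers[np.argmax(hist)]
print(f"histogram peak at about {peak_doa:.1f} degrees")
```

Weighting by the vector magnitude keeps low-energy components, whose amplitude ratio is unreliable, from dominating the histogram.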
Proposed method 1: depth estimation based on DOA
Difference of DOA histogram shape corresponding to source distance
• In sound fields, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion.
- Close source: the observed DOA histogram becomes a spiky shape.
- Far source: the observed DOA histogram becomes a smooth shape.
The observed DOA distribution of the target source can be used as a cue for depth estimation.
[Figure: DOA histograms of a close source and a far source; horizontal axis: direction of arrival; vertical axis: frequency of source components]
Proposed method 1: modeling of DOA distribution
• To model the DOA distribution, we propose a new modeling method using the GGD.
Generalized Gaussian distribution: GGD [Box, et al., 1973]
- A flexible family of probability density functions (PDFs).
- The shape of the GGD changes depending on the shape parameter βshape.
  - βshape = 2: Gaussian distribution PDF
  - βshape = 1: Laplacian distribution PDF
[Equation: definition of the GGD]
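For reference, one common parameterization of the GGD is given below. The symbols μ (location) and α (scale) are assumptions introduced here, and the notation on the original slide may differ.

```latex
p(x \mid \mu, \alpha, \beta_{\mathrm{shape}})
  = \frac{\beta_{\mathrm{shape}}}{2 \alpha \, \Gamma(1/\beta_{\mathrm{shape}})}
    \exp\!\left( - \left( \frac{|x - \mu|}{\alpha} \right)^{\beta_{\mathrm{shape}}} \right)
```

Setting βshape = 2 gives a Gaussian PDF with variance α²/2, and βshape = 1 gives a Laplacian PDF with scale α.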
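Proposed method 1: modeling of DOA distribution
Modeling of DOA distribution based on the GGD parameter
- Source is close ⇔ βshape is small (spiky DOA histogram)
- Source is far ⇔ βshape is large (smooth DOA histogram)
We propose a new depth estimation method based on the GGD: the shape parameter βshape is utilized as the depth metric.
[Figure: DOA histograms of a close source and a far source; horizontal axis: direction of arrival; vertical axis: frequency of source components]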
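To illustrate how a shape parameter can be obtained from a histogram, the sketch below uses the known relation between the kurtosis of a GGD and its shape parameter, Γ(5/β)Γ(1/β)/Γ(3/β)², and inverts it numerically by bisection. This is a stand-in for illustration only: the presentation derives a closed-form estimate instead, and the function names and search interval here are hypothetical.

```python
import math

def ggd_kurtosis(beta):
    """Kurtosis of a GGD with shape parameter beta: G(5/b) G(1/b) / G(3/b)^2."""
    log_k = (math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
             - 2.0 * math.lgamma(3.0 / beta))
    return math.exp(log_k)

def estimate_shape(kurtosis, lo=0.2, hi=20.0, tol=1e-10):
    """Invert ggd_kurtosis by bisection (the kurtosis decreases as beta grows)."""
    if not ggd_kurtosis(hi) < kurtosis < ggd_kurtosis(lo):
        raise ValueError("kurtosis outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurtosis:
            lo = mid  # model kurtosis too large, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)

def histogram_kurtosis(centers, heights):
    """Kurtosis of a histogram given its bin centers and bin heights."""
    total = sum(heights)
    mean = sum(h * c for c, h in zip(centers, heights)) / total
    m2 = sum(h * (c - mean) ** 2 for c, h in zip(centers, heights)) / total
    m4 = sum(h * (c - mean) ** 4 for c, h in zip(centers, heights)) / total
    return m4 / (m2 ** 2)

beta_gauss = estimate_shape(3.0)    # kurtosis of a Gaussian
beta_laplace = estimate_shape(6.0)  # kurtosis of a Laplacian
print(round(beta_gauss, 3), round(beta_laplace, 3))
```

A spiky histogram has a large kurtosis and therefore a small estimated shape parameter, which matches the close-source case above.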
Proposed method 2: problem in proposed method 1
• Background noise and artificial distortion generated by signal processing interfere with the DOA histogram.
- Problem of signal processing (normalization problem): small noise components in the binaural-recorded L-ch and R-ch input signals are enhanced.
- Countermeasure: feature extraction using activation-shared multichannel NMF, which removes the noise.
[Figure: L-ch vs. R-ch input signals and the resulting DOA histogram (left, center, right) disturbed by noise]
Proposed method 2: activation-shared multichannel NMF
Nonnegative matrix factorization: NMF [Lee, et al., 2001]
- is a sparse representation.
- can extract significant features from the observed matrix.
Observed matrix (spectrogram, Ω × T) ≈ basis matrix (spectral patterns, Ω × K) × activation matrix (time-varying gain, K × T)
Ω: number of frequency bins
T: number of time frames
K: number of bases
• The sparse representation provides high performance for noise reduction, compression, and feature extraction.
We eliminate background noise and artificial distortion.
Proposed method 2: problem of conventional NMF
Conventional NMF: NMFs are applied in parallel to the L-ch and R-ch signals.
- The bases of each channel are trained without correlation between the channels.
• Conventional NMFs generate an artificial fluctuation in the directional information (amplitude ratio), so the DOA information is disturbed.
Proposed method 2: activation-shared multichannel NMF
Proposed method: activation-shared multichannel NMF
- The activation matrix is shared through all channels (L-ch NMF and R-ch NMF).
- This reduces the dimensionality of the input signal while maintaining directional information.
Cost function: the sum over all channels of the β-divergence between the entries of the observed matrix and the entries of the product of that channel's basis matrix and the shared activation matrix.
β-divergence [Eguchi, et al., 2001]
A generalized divergence whose form varies with the parameter β:
- β = 2: Euclidean distance
- β = 1: generalized Kullback-Leibler divergence
- β = 0: Itakura-Saito divergence
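The β-divergence and an activation-shared cost function of the kind described here can be written as follows. The β-divergence is the standard definition; the cost function is a reconstruction under assumed notation (y for entries of the observed matrix, f for entries of the channel-wise basis matrix, g for entries of the shared activation matrix, c for the channel index), and the symbols on the original slide may differ.

```latex
D_\beta(y \,\|\, x) =
\begin{cases}
\dfrac{y^{\beta}}{\beta(\beta-1)} + \dfrac{x^{\beta}}{\beta}
  - \dfrac{y\,x^{\beta-1}}{\beta-1} & (\beta \neq 0, 1) \\[2ex]
y \log \dfrac{y}{x} - y + x & (\beta = 1) \\[2ex]
\dfrac{y}{x} - \log \dfrac{y}{x} - 1 & (\beta = 0)
\end{cases}

J = \sum_{c} \sum_{\omega=1}^{\Omega} \sum_{t=1}^{T}
    D_\beta\!\left( y_{c,\omega t} \,\Big\|\, \sum_{k=1}^{K} f_{c,\omega k}\, g_{k t} \right)
```

For β = 2 the first expression reduces to (y − x)²/2, which is the Euclidean distance case.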
Proposed method 2: activation-shared multichannel NMF
Derivation of optimal variables
The cost function is written using the β-divergence.
The auxiliary function method is an optimization scheme that uses an upper bound function.
1. Design an auxiliary function (upper bound) for the cost function.
2. Minimize the original cost function indirectly by minimizing the auxiliary function.
The first and second terms of the cost function become convex or concave functions depending on the value of β.
[Table: convexity or concavity of each term for each range of β]
Proposed method 2: activation-shared multichannel NMF
The upper bound function of each term of the cost function is defined by applying
• Convex function: Jensen’s inequality
• Concave function: tangent line inequality
• The update rules for optimization are obtained from the derivative of the auxiliary function w.r.t. each objective variable, i.e., the entries of the basis matrices and of the shared activation matrix.
Update rules
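The update equations themselves are not preserved in this transcript. As a hedged illustration of the idea, the sketch below uses the standard multiplicative updates for β-divergence NMF and shares one activation matrix across channels by summing the numerator and denominator of its update over the channels. The function names, the choice β = 1, and the toy data are assumptions, and the authors' rules derived with the auxiliary function method may differ (for example by a β-dependent exponent).

```python
import numpy as np

def beta_divergence(Y, X, beta):
    """Sum of the elementwise beta-divergence D_beta(Y | X) for positive Y, X."""
    if beta == 1:  # generalized Kullback-Leibler divergence
        return np.sum(Y * np.log(Y / X) - Y + X)
    if beta == 0:  # Itakura-Saito divergence
        return np.sum(Y / X - np.log(Y / X) - 1)
    return np.sum(Y**beta / (beta * (beta - 1)) + X**beta / beta
                  - Y * X**(beta - 1) / (beta - 1))

def activation_shared_nmf(Ys, n_bases, beta=1.0, n_iter=100, seed=0, eps=1e-12):
    """Factorize each channel Y_c ~ F_c @ G with one activation G shared by all channels."""
    rng = np.random.default_rng(seed)
    n_time = Ys[0].shape[1]
    Fs = [rng.random((Y.shape[0], n_bases)) + eps for Y in Ys]
    G = rng.random((n_bases, n_time)) + eps
    costs = []
    for _ in range(n_iter):
        # Channel-wise basis update with the shared activation fixed.
        for c, Y in enumerate(Ys):
            X = Fs[c] @ G + eps
            Fs[c] *= ((Y * X**(beta - 2)) @ G.T) / (X**(beta - 1) @ G.T + eps)
        # Shared activation update: contributions are summed over channels.
        num = np.zeros_like(G)
        den = np.zeros_like(G)
        for c, Y in enumerate(Ys):
            X = Fs[c] @ G + eps
            num += Fs[c].T @ (Y * X**(beta - 2))
            den += Fs[c].T @ X**(beta - 1)
        G *= num / (den + eps)
        costs.append(sum(beta_divergence(Y, F @ G + eps, beta)
                         for Y, F in zip(Ys, Fs)))
    return Fs, G, costs

# Toy stereo data: both channels share the same time-varying gains.
rng = np.random.default_rng(1)
G_true = rng.random((4, 80))
Y_l = rng.random((64, 4)) @ G_true + 1e-3
Y_r = rng.random((64, 4)) @ G_true + 1e-3
Fs, G, costs = activation_shared_nmf([Y_l, Y_r], n_bases=4)
print(f"cost: {costs[0]:.3f} -> {costs[-1]:.3f}")
```

Because both channels are reconstructed with the same activation, the ratio between the reconstructed L-ch and R-ch components of each basis is fixed by the basis matrices alone and does not fluctuate over time.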
Flow of proposed depth estimation method
1. Input stereo signal (L-ch, R-ch)
2. STFT
3. Clustering into cluster L, cluster C, and cluster R
4. Activation-shared NMF for each cluster
5. Weighted DOA histogram for each cluster
6. Depth estimation for each cluster
We can estimate depth information by calculating the shape parameter of the DOA histogram.
[Figure: DOA histogram of each cluster; horizontal axis: direction of arrival; vertical axis: frequency of source components]
Experimental conditions
Conditions
• Mixed stereo signals consist of 3 instruments.
• The target source is located at the center at 7 distances.
• There are 6 combination patterns of instruments and directions.
Mixing source parameters: test source 1, test source 2, test source 3, reverberation time, NMF β, number of NMF bases
Methods compared: conventional method 1, conventional method 2, and the proposed method, which differ in whether the weighted DOA histogram is not processed by NMF, processed by conventional NMF, or processed by the proposed NMF.
[Figure: arrangement of the target source and the interference sources; the target source is placed at regular intervals of distance]
Experimental conditions
Reference sound sources were generated using the image method.
Image method [Allen, et al., 1979]
A technique for simulating the room impulse response. The following can be set arbitrarily:
• Volume of room
• Source location
• Microphone location
• Absorption coefficient
[Figure: geometry of the image method (real source and image source)]
[Figure: example of a room impulse response (amplitude vs. time index)]
Experimental results
Results 1
Data set 1: target source: vocal; interference source (left): piano; interference source (right): guitar
• The results of the conventional methods do not agree with the oracle (image method).
• The proposed method correctly estimates the distance of the target source.
Experimental results: correlation coefficient
Results 2
Table: Correlation coefficient between the reference value and the estimated value for each method

Data set                      1       2       3       4       5       6
Target source                 Vocal   Vocal   Guitar  Guitar  Piano   Piano
Interference source (left)    Piano   Guitar  Piano   Vocal   Vocal   Guitar
Interference source (right)   Guitar  Piano   Vocal   Piano   Guitar  Vocal
Conventional method 1         0.350   0.532   0.154   0.277   0.602   0.496
Conventional method 2         0.189   0.165   0.044   -0.037  0.426   0.157
Proposed method               0.986   0.925   0.777   0.651   0.791   0.856

• A strong relation between the estimated value of the proposed method and the distance of the target source is indicated.
• The efficacy of the proposed method is confirmed.
Conclusions
• We proposed a new depth estimation method for sound sources in a mixed signal using the shape of the DOA distribution.
• The shape of the DOA distribution is modeled by the GGD.
• We also proposed a new feature extraction method for multichannel signals, activation-shared multichannel NMF.
• The experimental results indicated the efficacy of the proposed method.
Derivation of parameter βshape
• The maximum-likelihood-based shape parameter estimation has no closed-form solution for the GGD.
• We propose a closed-form parameter estimation algorithm based on an approximation and the kurtosis.
1. Calculate the kurtosis of the observed DOA histogram.
2. Use the moments of the GGD, which are expressed with the gamma function, to obtain the relation equation between the kurtosis and the shape parameter. There is no exact closed-form solution of its inverse function.
3. Approximate the gamma function by introducing the modified Stirling's formula, and take the logarithm.
4. This results in a quadratic equation to be solved, from which we can derive the closed-form estimate of the shape parameter.
Preparation of the depth estimation method is completed.
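The relation between the kurtosis and the shape parameter follows from the absolute moments of the GGD. The expressions below are the standard results for the parameterization with location μ and scale α, which are assumed symbols here; the slide's own notation and its modified Stirling approximation are not reproduced.

```latex
E\!\left[ |x - \mu|^{m} \right]
  = \alpha^{m} \,
    \frac{\Gamma\!\left( (m+1)/\beta_{\mathrm{shape}} \right)}
         {\Gamma\!\left( 1/\beta_{\mathrm{shape}} \right)}

\kappa
  = \frac{E\!\left[ |x - \mu|^{4} \right]}{E\!\left[ |x - \mu|^{2} \right]^{2}}
  = \frac{\Gamma\!\left( 5/\beta_{\mathrm{shape}} \right)
          \Gamma\!\left( 1/\beta_{\mathrm{shape}} \right)}
         {\Gamma\!\left( 3/\beta_{\mathrm{shape}} \right)^{2}}
```

As a check, βshape = 2 gives κ = 3 (Gaussian) and βshape = 1 gives κ = 6 (Laplacian). The scale α cancels, so the kurtosis depends on the shape parameter only.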
Proposed method 2: activation-shared multichannel NMF
Preliminary experiment: example of DOA histogram
Input: weighted DOA histogram of the center cluster of a mixed source (3 instruments)
- Conventional NMF (individually applied to L-ch and R-ch): fluctuations are generated in the DOA.
- Proposed NMF (activation-shared): feature extraction while maintaining directional information.
[Figure: weighted DOA histograms before NMF, after conventional NMF, and after the proposed NMF; horizontal axis: direction of arrival [degree]]

More Related Content

What's hot

Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Daichi Kitamura
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Kitamura Laboratory
 
Depth Estimation of Sound Images Using Directional Clustering and Activation...
Depth Estimation of Sound Images Using  Directional Clustering and Activation...Depth Estimation of Sound Images Using  Directional Clustering and Activation...
Depth Estimation of Sound Images Using Directional Clustering and Activation...奈良先端大 情報科学研究科
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...Kitamura Laboratory
 
Blind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsBlind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsKitamura Laboratory
 
Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Kitamura Laboratory
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...Kitamura Laboratory
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016SaruwatariLabUTokyo
 
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...奈良先端大 情報科学研究科
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Daichi Kitamura
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceDaichi Kitamura
 
Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Daichi Kitamura
 
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用Kitamura Laboratory
 

What's hot (20)

Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...Efficient initialization for nonnegative matrix factorization based on nonneg...
Efficient initialization for nonnegative matrix factorization based on nonneg...
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...Prior distribution design for music bleeding-sound reduction based on nonnega...
Prior distribution design for music bleeding-sound reduction based on nonnega...
 
Depth Estimation of Sound Images Using Directional Clustering and Activation...
Depth Estimation of Sound Images Using  Directional Clustering and Activation...Depth Estimation of Sound Images Using  Directional Clustering and Activation...
Depth Estimation of Sound Images Using Directional Clustering and Activation...
 
DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...DNN-based frequency component prediction for frequency-domain audio source se...
DNN-based frequency component prediction for frequency-domain audio source se...
 
Blind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure modelsBlind audio source separation based on time-frequency structure models
Blind audio source separation based on time-frequency structure models
 
Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...Linear multichannel blind source separation based on time-frequency mask obta...
Linear multichannel blind source separation based on time-frequency mask obta...
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
 
Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016Koyama ASA ASJ joint meeting 2016
Koyama ASA ASJ joint meeting 2016
 
Ica2016 312 saruwatari
Ica2016 312 saruwatariIca2016 312 saruwatari
Ica2016 312 saruwatari
 
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...Robust Sound Field Reproduction against  Listener’s Movement Utilizing Image ...
Robust Sound Field Reproduction against Listener’s Movement Utilizing Image ...
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Apsipa2016for ss
Apsipa2016for ssApsipa2016for ss
Apsipa2016for ss
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
 
Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...Experimental analysis of optimal window length for independent low-rank matri...
Experimental analysis of optimal window length for independent low-rank matri...
 
Dsp2015for ss
Dsp2015for ssDsp2015for ss
Dsp2015for ss
 
Temporal Segment Network
Temporal Segment NetworkTemporal Segment Network
Temporal Segment Network
 
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
コサイン類似度罰則条件付き半教師あり非負値行列因子分解と音源分離への応用
 

Viewers also liked

Divergence optimization based on trade-off between separation and extrapolati...
Divergence optimization based on trade-off between separation and extrapolati...Divergence optimization based on trade-off between separation and extrapolati...
Divergence optimization based on trade-off between separation and extrapolati...Daichi Kitamura
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...Daichi Kitamura
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法Daichi Kitamura
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...Daichi Kitamura
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...Daichi Kitamura
 
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)Daichi Kitamura
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)Daichi Kitamura
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Daichi Kitamura
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Daichi Kitamura
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...Daichi Kitamura
 
Evaluation of separation accuracy for various real instruments based on super...
Evaluation of separation accuracy for various real instruments based on super...Evaluation of separation accuracy for various real instruments based on super...
Evaluation of separation accuracy for various real instruments based on super...Daichi Kitamura
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)Daichi Kitamura
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...Daichi Kitamura
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...Daichi Kitamura
 

Viewers also liked (14)

Divergence optimization based on trade-off between separation and extrapolati...
Divergence optimization based on trade-off between separation and extrapolati...Divergence optimization based on trade-off between separation and extrapolati...
Divergence optimization based on trade-off between separation and extrapolati...
 
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...統計的独立性と低ランク行列分解理論に基づくブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
統計的独立性と低ランク行列分解理論に基づく ブラインド音源分離 –独立低ランク行列分析– Blind source separation based on...
 
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
半教師あり非負値行列因子分解における音源分離性能向上のための効果的な基底学習法
 
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
音響メディア信号処理における独立成分分析の発展と応用, History of independent component analysis for sou...
 
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
独立性に基づくブラインド音源分離の発展と独立低ランク行列分析 History of independence-based blind source sep...
 
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)
模擬ハムバッキング・ピックアップの弦振動応答 (in Japanese)
 
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
擬似ハムバッキングピックアップの弦振動応答 (in Japanese)
 
Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...Music signal separation using supervised nonnegative matrix factorization wit...
Music signal separation using supervised nonnegative matrix factorization wit...
 
Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...Study on optimal divergence for superresolution-based supervised nonnegative ...
Study on optimal divergence for superresolution-based supervised nonnegative ...
 
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
ランク1空間近似を用いたBSSにおける音源及び空間モデルの考察 Study on Source and Spatial Models for BSS wi...
 
Evaluation of separation accuracy for various real instruments based on super...
Evaluation of separation accuracy for various real instruments based on super...Evaluation of separation accuracy for various real instruments based on super...
Evaluation of separation accuracy for various real instruments based on super...
 
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
基底変形型教師ありNMFによる実楽器信号分離 (in Japanese)
 
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
独立性基準を用いた非負値行列因子分解の効果的な初期値決定法(Statistical-independence-based efficient initia...
 
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...非負値行列分解の確率的生成モデルと多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
非負値行列分解の確率的生成モデルと 多チャネル音源分離への応用 (Generative model in nonnegative matrix facto...
 

Similar to Depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization

A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfA_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfBala Murugan
 
Final presentation
Final presentationFinal presentation
Final presentationYash Bhalgat
 
Frequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsFrequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsCSCJournals
 
Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging
Depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization

  • 1. Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization Tomo Miyauchi, Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
  • 2. Outline: Background and related study; Problem and purpose; Proposed method 1: depth estimation based on DOA distribution; Proposed method 2: activation-shared nonnegative matrix factorization; Experiments; Conclusions.
  • 3. Background. With the advent of 3D TV, the reproduction of 3D images has been realized. Problem: the viewer feels uncomfortable because of the mismatch between the picture image and the sound image. To solve this problem, sound field reproduction techniques, which can present the "direction" and "depth" of sound images to the listener, have been studied actively. A 3D sound reproduction system has not been established yet.
  • 4. Related study: wave field synthesis. Wave field synthesis (WFS) [A. J. Berkhout, et al., 1993] is a sound field reproduction technique that can represent the "depth" of sound images; it allows us to create sound images in front of the loudspeakers. Drawback of WFS: WFS requires the primary source information of the sound images, namely (1) the individual sound sources and (2) their localization information. This information has been lost in existing contents through down-mixing, so an up-mixing method is required: (1) source separation (mixed signal → individual sources) and (2) localization estimation of sound images.
  • 5. Spatial sound system using existing contents. Stereo contents (mixed multichannel signal) → proposed up-mixer → wave field synthesis → spatial sound reproduction. Flow of the proposed up-mixer: (1) sound source separation; (2) directional estimation (conventional method) and depth estimation (new, this study). Depth estimation of sound images has not been proposed before.
  • 6. Related study: directional clustering [Araki, et al., 2007]. The mixed stereo signal is Fourier transformed, and each source component is plotted as L-ch input signal versus R-ch input signal. The components are normalized and then clustered around spatial representative vectors. The inverse Fourier transform of each cluster gives the individual sources.
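The normalize-then-cluster idea on slide 6 can be sketched as follows. This is only an illustrative simplification, not the method of Araki et al.: it assumes magnitude spectrograms as input, uses a spherical k-means with a fixed initialization, and the function name `directional_clustering` is hypothetical.

```python
import numpy as np

def directional_clustering(mag_l, mag_r, n_clusters=3, n_iter=50):
    """Assign each time-frequency bin to a direction cluster.

    mag_l, mag_r: nonnegative arrays of equal shape (L-ch and R-ch magnitudes).
    Returns (labels, centroids); with the initialization below, cluster 0
    starts nearest to the left channel and the last cluster nearest to the right.
    """
    shape = np.shape(mag_l)
    vecs = np.stack([np.ravel(mag_l), np.ravel(mag_r)], axis=1).astype(float)
    # Normalization: keep only the direction of each (L, R) vector.
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    unit = vecs / np.maximum(norms, 1e-12)
    # Assumed initialization: centroids spread evenly over the quarter circle.
    angles = np.linspace(0.0, np.pi / 2.0, n_clusters + 2)[1:-1]
    centroids = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    for _ in range(n_iter):
        # Assign each bin to the centroid with the highest cosine similarity.
        labels = np.argmax(unit @ centroids.T, axis=1)
        for k in range(n_clusters):
            members = unit[labels == k]
            if len(members):
                mean = members.mean(axis=0)
                length = np.linalg.norm(mean)
                if length > 0.0:
                    centroids[k] = mean / length
    return labels.reshape(shape), centroids
```

A binary mask built from `labels == k` could then be applied to the complex spectrogram before the inverse transform to obtain the signal of cluster `k`.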
  • 8. Problem and purpose. Problem: WFS requires specific localization information of the individual sound sources to reproduce a sound field. For the up-mixer, directional estimation methods have already been developed, e.g., directional estimation based on VBAP [Hirata, et al., 2011]. How can we get depth information? Purpose: establishing a new depth estimation method. Proposed method: a depth estimation method using the direction-of-arrival (DOA) distribution.
  • 10. Proposed method 1: depth estimation based on DOA. DOA is the "direction of arrival" of sound waves. We estimate the depth using the DOA distribution. The mixed signal is separated into individual sources by directional clustering, and a weighted DOA histogram (frequency of source components versus direction of arrival: left, center, right) is computed for each source. Directional information (DOA): the amplitude ratio between the channels. Weighting term: the magnitude of each vector.
  • 11. Proposed method 1: depth estimation based on DOA. In sound fields, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion. The shape of the DOA histogram therefore differs with the source distance: for a close source the observed DOA histogram becomes spiky, and for a far source it becomes smooth. The observed DOA distribution of the target source can be used as a cue for depth estimation.
  • 12. Proposed method 1: modeling of DOA distribution. To model the DOA distribution, we propose a new modeling method using the generalized Gaussian distribution (GGD) [Box, et al., 1973], a flexible family of probability density functions (PDFs). The shape of the GGD changes depending on the shape parameter βshape: βshape = 1 gives the Laplacian PDF and βshape = 2 gives the Gaussian PDF.
  • 13. Proposed method 1: modeling of DOA distribution. Modeling of the DOA distribution based on the GGD parameter: source is close ⇔ βshape is small; source is far ⇔ βshape is large. We propose a new depth estimation method based on the GGD, in which the shape parameter βshape is used as the metric.
  • 14. Proposed method 2: problem in proposed method 1. Background noise and artificial distortion generated by signal processing interfere with the DOA histogram. Normalization problem: for binaurally recorded signals, small noise components in the L-ch and R-ch are enhanced by the normalization. To suppress this noise, feature extraction by activation-shared multichannel NMF is introduced.
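A minimal sketch of the weighted DOA histogram on slide 10, under stated assumptions: the direction of each component is taken as the angle of the (L, R) magnitude vector (0 degrees for left only, 45 for center, 90 for right only), and the weight is the length of that vector. The angle convention and the function name `weighted_doa_histogram` are illustrative choices, not taken from the slides.

```python
import numpy as np

def weighted_doa_histogram(mag_l, mag_r, n_bins=90):
    """Histogram of per-component directions, weighted by vector magnitude.

    mag_l, mag_r: nonnegative arrays of equal shape (amplitude spectrograms).
    Returns (hist, edges) with edges in degrees over [0, 90].
    """
    mag_l = np.asarray(mag_l, dtype=float).ravel()
    mag_r = np.asarray(mag_r, dtype=float).ravel()
    # Direction from the amplitude ratio between the two channels.
    angle = np.degrees(np.arctan2(mag_r, mag_l))
    # Guard against rounding just outside the histogram range.
    angle = np.clip(angle, 0.0, 90.0)
    # Weighting term: magnitude of each (L, R) vector.
    weight = np.hypot(mag_l, mag_r)
    hist, edges = np.histogram(angle, bins=n_bins, range=(0.0, 90.0), weights=weight)
    return hist, edges
```

Weighting by magnitude keeps low-energy components, whose direction is unreliable, from dominating the histogram shape.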
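For reference, one common parameterization of the GGD density is p(x) = βshape / (2 α Γ(1/βshape)) · exp(−(|x − μ| / α)^βshape). The scale symbol α and location μ are assumptions here, since the defining equation did not survive in this transcript. The sketch below evaluates that density and shows the two special cases named on slide 12.

```python
import math

def ggd_pdf(x, mu=0.0, alpha=1.0, beta_shape=2.0):
    """Generalized Gaussian density with location mu, scale alpha, shape beta_shape."""
    coeff = beta_shape / (2.0 * alpha * math.gamma(1.0 / beta_shape))
    return coeff * math.exp(-((abs(x - mu) / alpha) ** beta_shape))

# beta_shape = 1 reduces to the Laplacian density 0.5 * exp(-|x|) for alpha = 1.
laplace_peak = ggd_pdf(0.0, alpha=1.0, beta_shape=1.0)
# beta_shape = 2 with alpha = sqrt(2) reduces to the standard Gaussian density.
gauss_peak = ggd_pdf(0.0, alpha=math.sqrt(2.0), beta_shape=2.0)
```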
  • 16. Proposed method 2: activation-shared multichannel NMF. Nonnegative matrix factorization (NMF) [Lee, et al., 2001] approximates the observed matrix (spectrogram, frequency × time) by the product of a basis matrix (spectral patterns) and an activation matrix (time-varying gains). Ω: number of frequency bins; T: number of time frames; K: number of bases. NMF is a sparse representation and can extract significant features from the observed matrix. The sparse representation provides high performance for noise reduction, compression, and feature extraction. We use it to eliminate background noise and artificial distortion.
  • 17. Proposed method 2: problem of conventional NMF. In the conventional approach, NMFs are applied to the L-ch and R-ch in parallel, so the bases of the two channels are trained without correlation. Conventional NMFs therefore generate an artificial fluctuation in the amplitude ratio (the directional information), and the DOA information is disturbed.
  • 18. Proposed method 2: activation-shared multichannel NMF. In the proposed method, the activation matrix is shared through all channels (L-ch NMF and R-ch NMF). This reduces the dimensionality of the input signal while maintaining directional information. The cost function is defined using the β-divergence between the entries of the matrices.
  • 19. Proposed method 2: activation-shared multichannel NMF. β-divergence [Eguchi, et al., 2001]: a generalized divergence whose form depends on the value of β. β = 2: Euclidean distance; β = 1: generalized Kullback-Leibler divergence; β = 0: Itakura-Saito divergence.
  • 20. Proposed method 2: activation-shared multichannel NMF. Derivation of optimal variables using the β-divergence. The auxiliary function method is an optimization scheme that uses an upper bound function: (1) design an auxiliary function that is an upper bound of the cost function; (2) minimize the original cost function indirectly by minimizing the auxiliary function.
  • 21. Proposed method 2: activation-shared multichannel NMF. Cost function: depending on the value of β, the first and second terms of the cost function each become either a convex or a concave function.
  • 22. Proposed method 2: activation-shared multichannel NMF. The upper bound function of each term of the cost function is defined by applying Jensen's inequality to the convex terms and the tangent line inequality to the concave terms.
  • 23. Proposed method 2: activation-shared multichannel NMF. Update rules: the update rules for optimization are obtained from the derivative of the auxiliary function with respect to each objective variable.
  • 24. Flow of proposed depth estimation method. Input stereo signal (L-ch, R-ch) → STFT → cluster L, cluster C, and cluster R → activation-shared NMF for each cluster → weighted DOA histogram estimation (frequency of source components versus direction of arrival) → depth estimation for each cluster → depth. We can estimate depth information by calculating the shape parameter of the DOA histogram.
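The idea of sharing one activation matrix across channels can be sketched with multiplicative updates. This is an assumption-laden illustration rather than the authors' update rules: it fixes the divergence to the generalized Kullback-Leibler case, uses the standard multiplicative form, and the names `activation_shared_nmf` and `kl_cost` are hypothetical.

```python
import numpy as np

def kl_cost(specs, bases, act, eps=1e-12):
    """Generalized KL divergence summed over all channels."""
    total = 0.0
    for y, f in zip(specs, bases):
        model = f @ act + eps
        total += np.sum(y * np.log((y + eps) / model) - y + model)
    return total

def activation_shared_nmf(specs, n_bases, n_iter=100, seed=0, eps=1e-12):
    """specs: list of nonnegative (freq x time) arrays, one per channel.

    Returns (bases, act): one basis matrix per channel and a single
    activation matrix shared by all channels.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = specs[0].shape
    bases = [rng.random((n_freq, n_bases)) + eps for _ in specs]
    act = rng.random((n_bases, n_time)) + eps
    ones = np.ones((n_freq, n_time))
    for _ in range(n_iter):
        # Each channel keeps its own spectral patterns.
        for c, y in enumerate(specs):
            model = bases[c] @ act + eps
            bases[c] *= ((y / model) @ act.T) / (ones @ act.T + eps)
        # The shared activation pools the statistics of every channel.
        num = np.zeros_like(act)
        den = np.zeros_like(act)
        for c, y in enumerate(specs):
            model = bases[c] @ act + eps
            num += bases[c].T @ (y / model)
            den += bases[c].T @ ones
        act *= num / (den + eps)
    return bases, act
```

Because both channels are forced to use the same time-varying gains, the ratio between the reconstructed L-ch and R-ch is determined by the basis matrices alone, which is one way to read the claim that directional information is maintained.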
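A scalar sketch of the β-divergence in its widely used form is given below. The normalization is an assumption: with this form, β = 2 gives half the squared Euclidean distance, and the slides may scale it differently. The cases β = 1 and β = 0 are the usual limits.

```python
import math

def beta_divergence(y, x, beta):
    """Beta-divergence between positive scalars y (observed) and x (model)."""
    if beta == 1:
        # Generalized Kullback-Leibler divergence.
        return y * math.log(y / x) - y + x
    if beta == 0:
        # Itakura-Saito divergence.
        return y / x - math.log(y / x) - 1.0
    return (y ** beta / (beta * (beta - 1.0))
            + x ** beta / beta
            - y * x ** (beta - 1.0) / (beta - 1.0))
```

For a whole spectrogram the divergence is summed over all entries, which is how it would enter an NMF cost function.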
  • 26. Experimental conditions. Mixed stereo signals consist of 3 instruments (one target source and two interference sources). The target source is located at the center at 7 distances. There are 6 combination patterns related to direction. Conditions listed: mixing source parameter, test sources 1 to 3, reverberation time, NMF beta, and NMF basis. Compared methods: the weighted DOA histogram not processed by NMF and the histogram processed by conventional NMF (conventional methods 1 and 2), and the histogram processed by the proposed NMF (proposed method).
  • 27. Experimental conditions. Reference sound sources were generated using the image method [Allen, et al., 1979], a technique for simulating a room impulse response from the real source and its image sources. The volume of the room, the source location, the microphone location, and the absorption coefficient can be set arbitrarily.
  • 28. Experimental results (results 1). Data set 1: target source, vocal; interference source (left), piano; interference source (right), guitar. The results of the conventional methods do not agree with the oracle (image method). The proposed method correctly estimates the distance of the target source.
  • 29. Experimental results: correlation coefficient. Results 2: correlation coefficient between reference value and estimated value.
    Table: Correlation coefficient of each method
    Data set                       1       2       3       4       5       6
    Target source                  Vocal   Vocal   Guitar  Guitar  Piano   Piano
    Interference source (left)     Piano   Guitar  Piano   Vocal   Vocal   Guitar
    Interference source (right)    Guitar  Piano   Vocal   Piano   Guitar  Vocal
    Conventional method 1          0.350   0.532   0.154   0.277   0.602   0.496
    Conventional method 2          0.189   0.165   0.044   -0.037  0.426   0.157
    Proposed method                0.986   0.925   0.777   0.651   0.791   0.856
    A strong relation between the estimated value of the proposed method and the distance of the target source is indicated. The efficacy of the proposed method is confirmed.
  • 30. Conclusions: We proposed a new depth estimation method for a sound source in a mixed signal using the shape of the DOA distribution. The shape of the DOA distribution is modeled by the GGD. We also proposed a new feature extraction method for multichannel signals, activation-shared multichannel NMF. The result of the experiment indicated the efficacy of the proposed method.
  • 32. Derivation of parameter βshape. Kurtosis of DOA histogram. The maximum-likelihood-based shape parameter estimation has no closed-form solution in the GGD. We propose a closed-form parameter estimation algorithm based on some approximation and kurtosis. (Equations on the slide appear as images: moments of the GGD and the relation between kurtosis and the shape parameter, written in terms of the observed DOA histogram and the gamma function.)
  • 33. Derivation of parameter βshape. There is no exact closed-form solution of the inverse function. Therefore, we introduce the modified Stirling's formula as an approximation of the gamma function and take a logarithm. (The formula appears on the slide as an image.)
  • 34. Derivation of parameter βshape. This results in a quadratic equation to be solved, from which we can derive the closed-form estimate of the shape parameter. Preparation of the depth estimation method is completed. (The quadratic equation and the resulting estimate appear on the slide as images.)
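The equations of slides 32 to 34 did not survive in the transcript. As a hedged reminder of the standard relations that such a derivation starts from, the block below writes out the textbook zero-mean GGD, its absolute moments, and the resulting kurtosis. The scale symbol \alpha and the kurtosis symbol \kappa are notation assumed here, not taken from the slides. The last line is the ordinary Stirling approximation; the authors' "modified" variant and their final closed-form estimate cannot be recovered from the slide text and are not reproduced.

```latex
\begin{align*}
% Zero-mean generalized Gaussian with scale \alpha and shape \beta (= \beta_{shape})
p(x) &= \frac{\beta}{2\alpha\,\Gamma(1/\beta)}
        \exp\!\left[-\left(\frac{|x|}{\alpha}\right)^{\beta}\right] \\
% m-th absolute moment
\mathrm{E}\!\left[|x|^{m}\right] &= \alpha^{m}\,
        \frac{\Gamma\!\left((m+1)/\beta\right)}{\Gamma(1/\beta)} \\
% Kurtosis depends on the shape parameter only
\kappa = \frac{\mathrm{E}[x^{4}]}{\mathrm{E}[x^{2}]^{2}}
       &= \frac{\Gamma(5/\beta)\,\Gamma(1/\beta)}{\Gamma(3/\beta)^{2}} \\
% Ordinary Stirling approximation of the gamma function
\Gamma(z) &\approx \sqrt{2\pi}\; z^{\,z-1/2}\, e^{-z}
\end{align*}
```

As a sanity check, \beta = 2 gives \kappa = 3 (Gaussian) and \beta = 1 gives \kappa = 6 (Laplacian), so a spikier histogram corresponds to a larger kurtosis and a smaller \beta.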
  • 35. Proposed method 2: activation-shared multichannel NMF. Preliminary experiment: example of DOA histogram (center cluster, DOA of a mixed source of 3 instruments). Panels: weighted DOA histogram; conventional NMF (individually applied to L-ch and R-ch), where fluctuations are generated in the DOA; proposed NMF (activation-shared across L-ch and R-ch), which extracts features while maintaining directional information. (Horizontal axis of each panel: direction of arrival [degree].)

Editor's Notes

  1. Hello, everyone. I’m Tomo Miyauchi from Nara Institute of Science and Technology, Japan. Today I’d like to talk about depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization.
  2. Here is the outline of today’s presentation. My presentation is divided into five parts. First, I will talk about the research background and related studies.
  3. Recently, with the advent of 3D TV, the reproduction of 3D images has been realized. On the other hand, a 3D sound reproduction system has not been established yet. Therefore, viewers feel uncomfortable due to the mismatch between the picture image and the sound image. To solve this problem, sound field reproduction techniques have been studied actively. Thanks to these techniques, we can present the direction and depth of the sound images to the listener.
  4. Wave field synthesis, WFS for short, is one of the sound field reproduction techniques. WFS allows us to create sound images in front of the loudspeakers. WFS requires the primary source information of the sound images, where primary source information means the individual sound sources and their localization information. However, this information has been lost in existing contents through down-mixing. This is the drawback of WFS. Therefore, an up-mixing method is required. The up-mix process can be divided into two steps. The first step is source separation, and the second step is localization estimation of the sound images.
  5. This is the flow of the proposed up-mixer. The signal of the stereo contents is processed by the sound source separation method and the localization estimation method. The processed signals are finally used in WFS. The localization method consists of directional estimation and depth estimation. In previous research, methods for sound source separation and directional estimation have been proposed. On the other hand, depth estimation has not been proposed yet. Therefore, in this study, we propose a new depth estimation method.
  6. Now, I will explain directional clustering, which is used as the source separation method in the proposed up-mixer. This is the procedure of clustering. First, the mixed stereo signal is processed by the Fourier transform. Next, the time-frequency components of the signal are represented in a two-dimensional space, where XL and XR are the amplitudes of each channel. Then, these components are normalized and separated by k-means clustering. Finally, the individual source of each cluster is obtained by the inverse Fourier transform.
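The clustering procedure in note 6 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the implementation of [Araki, et al., 2007]: the function name `directional_clustering`, the deterministic centroid initialization, and the choice of three clusters are all hypothetical, and the STFT and inverse STFT steps are left to the caller.

```python
import numpy as np

def directional_clustering(XL, XR, n_clusters=3, n_iter=50):
    """Assign each time-frequency bin to a direction cluster.

    XL, XR: STFT matrices (freq x time) of the left and right channels.
    Returns an integer label array with the same shape as XL.
    """
    # Amplitude vector (|XL|, |XR|) of every time-frequency bin.
    amp = np.stack([np.abs(XL).ravel(), np.abs(XR).ravel()], axis=1)
    norm = np.linalg.norm(amp, axis=1, keepdims=True)
    unit = amp / np.maximum(norm, 1e-12)  # normalize to unit length

    # Assumption: deterministic initial centroids spread over the quarter
    # circle, so that label 0 is L-dominant and the last label R-dominant.
    angles = (np.arange(n_clusters) + 0.5) * (np.pi / 2) / n_clusters
    centroids = np.stack([np.cos(angles), np.sin(angles)], axis=1)

    for _ in range(n_iter):
        # Squared distance of every bin to every centroid (plain k-means).
        dist = ((unit[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for k in range(n_clusters):
            members = unit[labels == k]
            if len(members) > 0:
                centroids[k] = members.mean(axis=0)
    return labels.reshape(XL.shape)
```

A binary mask such as `labels == 1` could then be applied to both channel spectrograms before the inverse transform to obtain the components of one cluster.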
  7. Next, I will explain the problem and purpose of this study.
  8. WFS requires specific localization information of individual sound sources to reproduce a sound field. This is the problem of the spatial sound system using WFS. As mentioned above, directional estimation methods have been developed. Therefore, the purpose of this study is to establish a new depth estimation method. In this study, we propose a depth estimation method using the direction-of-arrival distribution.
  9. Next, I will explain proposed method 1, depth estimation based on the DOA distribution.
  10. DOA means direction of arrival of sound waves. We estimate the depth information using the DOA distribution. In directional clustering, we use the amplitude ratio of the signals as the directional information. This parameter is reused as the DOA. Now, we calculate a weighted DOA histogram. In this process, the DOAs are calculated as θ. Then, the DOAs are weighted by the magnitude w of each vector.
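A weighted DOA histogram of the kind described in note 10 might be computed as in the sketch below. The exact definitions of θ and w are on the slide and not in the transcript, so two assumptions are made here: θ is taken as the angle of the amplitude vector (|XL|, |XR|), mapped to 0 to 90 degrees, and w is taken as the Euclidean length of that vector. The function name and the bin count are hypothetical.

```python
import numpy as np

def weighted_doa_histogram(XL, XR, n_bins=90):
    """Histogram of per-bin DOA values, weighted by amplitude-vector magnitude.

    XL, XR: STFT matrices (freq x time) of the left and right channels.
    Returns (hist, edges); hist sums to 1 unless the input is all zero.
    """
    mag_l = np.abs(XL).ravel()
    mag_r = np.abs(XR).ravel()
    # Assumed DOA proxy: 0 deg = only in L-ch, 45 deg = center, 90 deg = only in R-ch.
    theta = np.degrees(np.arctan2(mag_r, mag_l))
    # Assumed weight: magnitude of the amplitude vector of each bin.
    w = np.hypot(mag_l, mag_r)
    hist, edges = np.histogram(theta, bins=n_bins, range=(0.0, 90.0), weights=w)
    total = hist.sum()
    if total > 0:
        hist = hist / total
    return hist, edges
```

Weighting by w lets strong time-frequency components dominate the histogram, so that low-energy bins contribute little to its shape.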
  11. In sound fields, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion. If the source is close, the observed DOA histogram has a spiky shape. On the other hand, if the sound source is far, the observed DOA histogram has a smooth shape. Therefore, the shape of the observed DOA distribution of the target source can be used as a cue for depth estimation.
  12. To model the DOA distribution, we propose a new modeling method using the GGD. The generalized Gaussian distribution, GGD for short, is a flexible family of probability density functions. As can be seen, the shape of the GGD changes depending on βshape. A β of 2 corresponds to the Gaussian PDF, and a β of 1 corresponds to the Laplacian PDF.
  13. If β is small, the GGD has a spiky shape, and if β is large, the GGD has a smooth shape. Based on this property, we propose a new depth estimation method based on the GGD. In our method, the shape parameter is utilized as the metric. We regard the target source as close when β is small, and as far when β is large.
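To make the link between histogram shape and β concrete, the sketch below estimates the GGD shape parameter from the kurtosis of weighted samples. It relies on the standard GGD kurtosis relation Γ(5/β)Γ(1/β)/Γ(3/β)², and it deliberately swaps in a numerical bisection for the closed-form estimate described in the talk, because that formula is not recoverable from the transcript. All function names and the search interval are hypothetical.

```python
import math

def ggd_kurtosis(beta):
    """Kurtosis of a zero-mean GGD with shape parameter beta."""
    return math.exp(math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
                    - 2.0 * math.lgamma(3.0 / beta))

def weighted_kurtosis(theta, w):
    """Kurtosis of samples theta with nonnegative weights w."""
    total = sum(w)
    mean = sum(wi * ti for wi, ti in zip(w, theta)) / total
    m2 = sum(wi * (ti - mean) ** 2 for wi, ti in zip(w, theta)) / total
    m4 = sum(wi * (ti - mean) ** 4 for wi, ti in zip(w, theta)) / total
    return m4 / (m2 * m2)

def shape_from_kurtosis(kurt, lo=0.1, hi=20.0, tol=1e-10):
    """Solve ggd_kurtosis(beta) = kurt for beta by bisection.

    The GGD kurtosis decreases monotonically as beta grows, so a
    bracketing search is sufficient (numerical stand-in, not closed form).
    """
    if not ggd_kurtosis(hi) < kurt < ggd_kurtosis(lo):
        raise ValueError("kurtosis outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurt:
            lo = mid   # kurtosis still too large: beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With this sketch, a kurtosis of 3 maps back to β = 2 (Gaussian) and a kurtosis of 6 to β = 1 (Laplacian), matching the two reference cases named in the talk.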
  14. In the actual calculation process, background noise and artificial distortion generated by signal processing interfere with the DOA histogram. This noise has a negative effect on the depth estimation. Therefore, we propose a feature extraction method, activation-shared multichannel NMF.
  15. Next, I will describe proposed method 2, activation-shared multichannel NMF.
  16. Nonnegative matrix factorization, NMF for short, has been proposed. NMF is a sparse representation and can extract the significant features from the observed matrix. NMF decomposes the observed matrix, the spectrogram Y, into two nonnegative matrices F and G. Here, F contains frequently appearing spectral patterns, and G contains time-varying gains. So, F is called the ‘basis matrix,’ and G is called the ‘activation matrix.’ The aim of sparse representations is to reveal basis structures and to represent these structures in a compact form. Also, the sparse representation provides high performance for noise reduction, compression, and feature extraction. Using this property, we eliminate background noise and artificial distortion.
  17. However, if conventional NMFs are applied to each channel in parallel, artificial fluctuation is generated. This is because the bases of each channel are trained independently, without any correlation between channels. As a result, the DOA information is disturbed.
  18. Therefore, we propose activation-shared multichannel NMF. In this method, the activation matrix is shared across all channels. Thus, we can reduce the dimensionality of the input signal while maintaining directional information. This is the cost function of the proposed NMF.
  19. β-divergence is a generalized divergence of a variable x with respect to y. Dβ indicates the generalized divergence function, which includes the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence.
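For reference, the commonly used scalar form of the β-divergence is sketched below. The slide's own formula is not in the transcript, so the argument order (observed value `y` first, model value `x` second) and the function name are assumptions. In this convention β = 2 gives half the squared Euclidean distance, β = 1 the generalized Kullback-Leibler divergence, and β = 0 the Itakura-Saito divergence.

```python
import math

def beta_divergence(y, x, beta):
    """Scalar beta-divergence d_beta(y | x) for positive y (observed) and x (model)."""
    if beta == 0:
        # Itakura-Saito divergence (limit beta -> 0).
        return y / x - math.log(y / x) - 1.0
    if beta == 1:
        # Generalized Kullback-Leibler divergence (limit beta -> 1).
        return y * math.log(y / x) - y + x
    # General case; beta = 2 reduces to 0.5 * (y - x) ** 2.
    return (y ** beta + (beta - 1.0) * x ** beta
            - beta * y * x ** (beta - 1.0)) / (beta * (beta - 1.0))
```

A matrix-level cost is obtained by summing this value over all entries of the spectrogram and its model.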
  20. Next, we derive the optimal variables F and G that minimize these cost functions. The auxiliary function method is an optimization scheme that uses an upper bound function as the auxiliary function. In this method, we design the auxiliary function J plus for the cost function J. Then, we can minimize the original cost function indirectly by minimizing the auxiliary function. To design the auxiliary function, we have to derive the upper bounds. Using the β-divergence, the cost function is rewritten like this.
  21. The first and second terms become convex or concave functions depending on the value of β, like this.
  22. For the convex function, Jensen’s inequality can be used to derive the upper bound. On the other hand, for the concave function, we can use the tangent line inequality to derive the upper bound.
  23. The update rules for optimization are obtained from the derivative of the auxiliary function with respect to each objective variable. These are the update rules of the proposed NMF.
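The update rules themselves are shown on the slide and are missing from the transcript. As a hedged illustration of the idea, the sketch below covers only the Kullback-Leibler case (β = 1, the setting reported for the experiment) and uses the well-known multiplicative updates for KL-NMF, applied to two channels that share one activation matrix G. It is a sketch under these assumptions, not the authors' derivation for general β; the function names, random initialization, and iteration count are hypothetical.

```python
import numpy as np

def kl_cost(Y, F, G, eps=1e-12):
    """Generalized KL divergence between Y and the model F @ G."""
    V = F @ G + eps
    return float(np.sum(Y * np.log((Y + eps) / V) - Y + V))

def activation_shared_nmf(YL, YR, n_bases=4, n_iter=100, seed=0, eps=1e-12):
    """Factorize YL ~ FL @ G and YR ~ FR @ G with a shared activation G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = YL.shape
    FL = rng.random((n_freq, n_bases)) + eps
    FR = rng.random((n_freq, n_bases)) + eps
    G = rng.random((n_bases, n_time)) + eps
    ones = np.ones_like(YL)
    costs = []
    for _ in range(n_iter):
        # Basis matrices: one multiplicative update per channel.
        FL *= ((YL / (FL @ G + eps)) @ G.T) / (ones @ G.T + eps)
        FR *= ((YR / (FR @ G + eps)) @ G.T) / (ones @ G.T + eps)
        # Shared activation: numerator and denominator sum over both channels.
        num = FL.T @ (YL / (FL @ G + eps)) + FR.T @ (YR / (FR @ G + eps))
        den = FL.T @ ones + FR.T @ ones + eps
        G *= num / den
        costs.append(kl_cost(YL, FL, G) + kl_cost(YR, FR, G))
    return FL, FR, G, costs
```

Because G is common to both channels, each basis index describes the same temporal event in L-ch and R-ch, and only the per-channel basis spectra carry the level difference between the channels.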
  24. This is the flow of the proposed depth estimation method. First, the input stereo signal is processed by the Fourier transform. Next, the weighted DOA histogram is calculated. Then, the signal is separated by directional clustering. Activation-shared NMF is applied as the feature extraction method. Finally, we can estimate the depth of the sound images by calculating the shape parameter of the DOA histogram.
  25. Next, I will explain the experiments.
  26. In the experiment, we prepared mixed stereo signals, which consist of three instruments: vocal, piano, and guitar. The target source was located in the center at seven distances. In addition, there are six combination patterns of source directions. As for β, we conducted a preliminary experiment. Based on it, we set β to 1, which corresponds to the KL divergence. In this experiment, the signal not processed by NMF was evaluated as conventional method 1. Also, the signal processed by conventional NMF was evaluated as conventional method 2.
  27. We used the image method as a reference for this experiment, which is a technique of simulating the room impulse response. In this method, the volume of the room, source location, microphone location, and absorption coefficient can be set arbitrarily. The reference sound sources were generated using the image method.
  28. Here is the experimental result. In this graph, the gray line shows the reference values of the image method. As can be seen, the shape parameter increases with the distance between the source and the listener. In addition, the green triangles are conventional method 1, the blue circles are conventional method 2, and the red diamonds are the proposed method. From this graph, the results of the conventional methods do not agree with the oracle. On the other hand, the results of the proposed method correctly estimate the distance of the target source.
  29. In addition, this is the correlation coefficient between the reference value and the estimated value. As can be seen, the results of the proposed method have the highest value in all conditions. This result indicates a strong relation between the estimated value of the proposed method and the distance of the target source. Thus, the efficacy of the proposed method for depth estimation is confirmed.
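The talk does not state which correlation coefficient was used. Assuming the usual Pearson coefficient, the evaluation in note 29 could be computed as in the sketch below, where `reference` would hold the shape parameters obtained from the image-method signals and `estimated` those of the method under test. Both names and the function name are hypothetical, and no values from the experiment are reproduced.

```python
import math

def pearson_correlation(reference, estimated):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_e = sum(estimated) / n
    cov = sum((r - mean_r) * (e - mean_e) for r, e in zip(reference, estimated))
    var_r = sum((r - mean_r) ** 2 for r in reference)
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    return cov / math.sqrt(var_r * var_e)
```

A value near 1 means the estimated shape parameter rises and falls with the reference across the tested distances, while a value near 0 means the estimate carries little information about distance.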
  30. These are my conclusions. Thank you for your attention.