Depth Estimation of Sound Images Using
Directional Clustering and Activation-Shared
Nonnegative Matrix Factorization

Tomo Miyauchi, Daichi Kitamura,
Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation shared nonnegative matrix factorization
Experiments
Conclusions
Background
With the advent of 3D TV, the reproduction of 3D images has been realized.
A 3D sound reproduction system has not been established yet.
Problem
[Figure: 3D TV with the positions of the picture image and the sound image]
The viewer feels uncomfortable because of the mismatch between the picture image and the sound image.

To solve this problem, sound field reproduction techniques have been actively studied.
They can present the “direction” and “depth” of the sound images to the listener.

Related study: wave field synthesis
Sound field reproduction
Wave Field Synthesis (WFS) [A. J. Berkhout, et al., 1993]
WFS allows us to create sound images in front of the loudspeakers, so it can represent the “depth” of sound images.
[Figure: loudspeaker array, sound image, and listener]

× Drawback of WFS
WFS requires the primary source information of sound images:
1. Individual sound sources
2. Localization information
This information has been lost in existing contents by down-mixing.
↓
Up-mixing methods are required:
1. Source separation (mixed signal → individual sources)
2. Localization estimation of sound images

Flow of proposed up-mixer
Spatial sound system using existing contents
Stereo contents (mixed multichannel signal)
→ 1. Sound source separation (conventional method)
→ 2. Directional estimation (conventional method) and depth estimation (new; this study)
→ Wave field synthesis
→ Spatial sound reproduction

Depth estimation of sound images has not been proposed so far.

Related study: directional clustering [Araki, et al., 2007]
Mixed stereo signal (L-ch and R-ch input signals)
→ Fourier transform
→ Normalization of each source component
→ Clustering around the spatial representative vectors
→ Inverse Fourier transform
→ Individual sources of each cluster

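The normalization and clustering steps above can be sketched as follows. This is a minimal illustration and not the exact algorithm of [Araki, et al., 2007]: the function name, the use of magnitude spectrograms only, the cosine-similarity k-means, and the evenly spaced initial centroids are all assumptions made for the example.

```python
import numpy as np

def directional_clustering(mag_l, mag_r, n_clusters=3, n_iter=50):
    """Cluster time-frequency bins by their normalized (L, R) amplitude vector.

    mag_l, mag_r: nonnegative magnitude spectrograms of equal shape.
    Returns (labels, centroids); each centroid is a unit-norm spatial
    representative vector (hypothetical naming, following the slide).
    """
    x = np.stack([mag_l.ravel(), mag_r.ravel()], axis=1).astype(float)
    norm = np.linalg.norm(x, axis=1, keepdims=True)
    x = x / np.maximum(norm, 1e-12)  # normalization step

    # Assumption: initial centroids are spread evenly between left and right.
    angles = np.linspace(0.0, np.pi / 2, n_clusters + 2)[1:-1]
    centroids = np.stack([np.cos(angles), np.sin(angles)], axis=1)

    for _ in range(n_iter):
        # Assign each bin to the centroid with the largest cosine similarity.
        labels = np.argmax(x @ centroids.T, axis=1)
        for k in range(n_clusters):
            members = x[labels == k]
            if len(members) > 0:
                mean = members.mean(axis=0)
                centroids[k] = mean / np.maximum(np.linalg.norm(mean), 1e-12)

    labels = np.argmax(x @ centroids.T, axis=1)  # final assignment
    return labels.reshape(mag_l.shape), centroids
```

With three clusters, label 0 corresponds to bins dominated by the L-ch, label 1 to the center, and label 2 to bins dominated by the R-ch. Masking the spectrograms with each label and applying the inverse transform would give the individual sources of each cluster.
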
Problem and purpose
Problem
WFS requires specific localization information of individual sound sources to reproduce a sound field.
Up-mixer
Directional estimation methods have been developed, e.g., directional estimation based on VBAP [Hirata, et al., 2011].

Purpose
Establish a new depth estimation method.
How can we get depth information?

Proposed method
Depth estimation method using the direction of arrival (DOA) distribution

Proposed method 1: depth estimation based on DOA
DOA → “direction of arrival” of sound waves
We estimate the depth using the DOA distribution.

Mixed signal → directional clustering → individual sources → weighted DOA histogram

Weighted DOA histogram
Directional information: amplitude ratio between the L-ch and R-ch
Weighting term: magnitude of each vector
[Figure: weighted DOA histogram; x-axis: direction of arrival (left, center, right); y-axis: frequency of source components]

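A weighted DOA histogram of this kind can be sketched as below. The mapping from the amplitude ratio to an angle (arctangent, with 0 degrees as hard left, 45 degrees as center, and 90 degrees as hard right), the bin count, and the function name are assumptions for illustration; the slides do not give the exact definition.

```python
import numpy as np

def weighted_doa_histogram(mag_l, mag_r, n_bins=90):
    """Histogram of per-bin directions, weighted by the vector magnitude.

    mag_l, mag_r: nonnegative magnitude spectrograms of equal shape.
    Returns (hist, centers) with centers in degrees.
    """
    mag_l = np.asarray(mag_l, dtype=float)
    mag_r = np.asarray(mag_r, dtype=float)
    # Directional information: amplitude ratio mapped to an angle (assumption).
    theta = np.degrees(np.arctan2(mag_r, mag_l)).ravel()
    # Weighting term: magnitude of each (L, R) vector.
    weight = np.hypot(mag_l, mag_r).ravel()
    hist, edges = np.histogram(theta, bins=n_bins, range=(0.0, 90.0), weights=weight)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return hist, centers
```

Weighting by the magnitude keeps low-energy bins, whose amplitude ratio is unreliable, from dominating the histogram.
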
Proposed method 1: depth estimation based on DOA
In a sound field, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion.

Difference in the shape of the DOA histogram according to source distance
Close source: the observed DOA histogram becomes spiky.
Far source: the observed DOA histogram becomes smooth.
[Figure: DOA histograms of a close source and a far source; x-axis: direction of arrival; y-axis: frequency of source components]

The observed DOA distribution of the target source can be used as a cue for depth estimation.

Proposed method 1: modeling of DOA distribution
To model the DOA distribution, we propose a new modeling method using the GGD.
Generalized Gaussian distribution: GGD [Box, et al., 1973]
A flexible family of probability density functions (PDFs)
The shape of the GGD changes depending on βshape.
βshape = 2: Gaussian distribution PDF
βshape = 1: Laplacian distribution PDF
Definition of GGD

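For reference, the standard zero-mean GGD density with scale α and shape β is p(x) = β / (2αΓ(1/β)) · exp(−(|x|/α)^β). The sketch below evaluates it so the two special cases can be checked numerically; the parameter names `alpha`, `beta`, and `mu` are chosen here for illustration and may differ from the notation on the slide.

```python
import math

def ggd_pdf(x, alpha, beta, mu=0.0):
    """Generalized Gaussian PDF with scale alpha > 0 and shape beta > 0."""
    coef = beta / (2.0 * alpha * math.gamma(1.0 / beta))
    return coef * math.exp(-((abs(x - mu) / alpha) ** beta))

# beta = 2 with alpha = sqrt(2) * sigma is the Gaussian N(mu, sigma^2).
gauss = ggd_pdf(0.5, alpha=math.sqrt(2.0), beta=2.0)
# beta = 1 with alpha = b is the Laplacian with scale b.
laplace = ggd_pdf(0.5, alpha=1.0, beta=1.0)
```

A small β gives a spiky, heavy-tailed density and a large β gives a flatter one, which is the property the shape parameter is used for here.
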
Proposed method 1: modeling of DOA distribution
Modeling of the DOA distribution based on the GGD parameter
[Figure: DOA histograms of a close source and a far source; x-axis: direction of arrival; y-axis: frequency of source components]

We propose a new depth estimation method based on the GGD.
The shape parameter βshape is used as the metric.

Source is close ⇔ βshape is small
Source is far ⇔ βshape is large

Proposed method 2: problem in proposed method 1
× Problem of signal processing
Normalization problem: small noise components are enhanced.
[Figure: binaural-recorded L-ch and R-ch input signals and the resulting DOA histogram with noise; x-axis: direction of arrival (left, center, right); y-axis: frequency of source components]

Background noise and artificial distortion generated by signal processing interfere with the DOA histogram.
→ Feature extraction: activation-shared multichannel NMF

Proposed method 2: activation-shared multichannel NMF
Nonnegative matrix factorization: NMF [Lee, et al., 2001]
- is a sparse representation.
- can extract significant features from the observed matrix.

[Figure: observed matrix (spectrogram; frequency × time) ≈ basis matrix (spectral patterns; frequency × bases) × activation matrix (time-varying gain; bases × time)]
Dimensions: number of frequency bins (Ω), number of time frames, number of bases

The sparse representation provides high performance for noise reduction, compression, and feature extraction.
We use it to eliminate background noise and artificial distortion.

Proposed method 2: problem of conventional NMF
Conventional NMF
NMFs are applied to the L-ch and R-ch in parallel.
The bases of the two channels are trained without correlation between them.
Conventional NMFs therefore generate an artificial fluctuation in the amplitude ratio (directional information).
→ DOA information is disturbed.

Proposed method 2: activation-shared multichannel NMF
Proposed method: activation-shared multichannel NMF
[Figure: NMF applied to the L-ch and R-ch with a common activation matrix]
The activation matrix is shared across all channels.
This reduces the dimensionality of the input signal while maintaining directional information.

Cost function (defined with the β-divergence over the entries of the matrices)

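The idea of sharing one activation matrix across channels can be sketched for the simplest case. The code below assumes the Euclidean-distance cost (the β = 2 case), i.e. the sum over channels of ‖Y_c − F_c G‖², and uses the standard multiplicative updates for that cost; it is not the general β-divergence update from the slides. All names (`activation_shared_nmf`, `f_l`, `f_r`, `g`) are hypothetical.

```python
import numpy as np

def activation_shared_nmf(y_l, y_r, n_bases=10, n_iter=200, seed=0, eps=1e-12):
    """Factorize y_l ~ f_l @ g and y_r ~ f_r @ g with a shared activation g.

    y_l, y_r: nonnegative spectrograms of shape (n_freq, n_time).
    Minimizes the summed squared error of both channels (assumed cost).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_time = y_l.shape
    f_l = rng.random((n_freq, n_bases)) + eps
    f_r = rng.random((n_freq, n_bases)) + eps
    g = rng.random((n_bases, n_time)) + eps

    for _ in range(n_iter):
        # Channel-wise basis updates (same form as single-channel NMF).
        f_l *= (y_l @ g.T) / (f_l @ g @ g.T + eps)
        f_r *= (y_r @ g.T) / (f_r @ g @ g.T + eps)
        # Shared activation update: numerator and denominator sum over channels.
        num = f_l.T @ y_l + f_r.T @ y_r
        den = (f_l.T @ f_l + f_r.T @ f_r) @ g + eps
        g *= num / den
    return f_l, f_r, g
```

Because both channels are reconstructed with the same time-varying gains, the ratio between `f_l @ g` and `f_r @ g` is determined by the bases alone, which is why the amplitude ratio is not disturbed by independent fluctuations of two separate activation matrices.
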
Proposed method 2: activation-shared multichannel NMF
β-divergence [Eguchi, et al., 2001]
Generalized divergence whose form depends on the parameter β.
β = 2: Euclidean distance
β = 1: Generalized Kullback-Leibler divergence
β = 0: Itakura–Saito divergence

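The three special cases can be checked with a small function. This sketch uses the common form of the β-divergence between an observation y and a model x (both strictly positive), with the limits β → 1 and β → 0 written out explicitly; the function name and the summation over all entries are choices made for the example.

```python
import numpy as np

def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences d_beta(y | x) for y, x > 0."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 0:  # Itakura-Saito divergence
        r = y / x
        return float(np.sum(r - np.log(r) - 1.0))
    if beta == 1:  # generalized Kullback-Leibler divergence
        return float(np.sum(y * np.log(y / x) - y + x))
    # General case; beta = 2 gives half the squared Euclidean distance.
    num = y**beta + (beta - 1.0) * x**beta - beta * y * x ** (beta - 1.0)
    return float(np.sum(num / (beta * (beta - 1.0))))
```

For β = 2 the expression reduces to (y − x)² / 2, so the Euclidean case appears with a factor of one half under this convention.
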
Proposed method 2: activation-shared multichannel NMF
Derivation of optimal variables
The auxiliary function method is an optimization scheme that uses an upper bound function.
1. Design an auxiliary function that is an upper bound of the cost function.
2. Minimize the original cost function indirectly by minimizing the auxiliary function.
This scheme is applied to the cost function using the β-divergence.

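The two steps above can be written compactly. The notation is an assumption introduced here (θ for the model parameters, θ̃ for the auxiliary variables, J for the cost, J⁺ for the auxiliary function); the slides' own symbols were not recoverable.

```latex
\begin{align}
  \mathcal{J}(\theta) &= \min_{\tilde{\theta}} \mathcal{J}^{+}(\theta, \tilde{\theta}), \\
  \tilde{\theta}^{(t+1)} &= \operatorname*{arg\,min}_{\tilde{\theta}} \mathcal{J}^{+}(\theta^{(t)}, \tilde{\theta}), \qquad
  \theta^{(t+1)} = \operatorname*{arg\,min}_{\theta} \mathcal{J}^{+}(\theta, \tilde{\theta}^{(t+1)}), \\
  \mathcal{J}(\theta^{(t+1)}) &\le \mathcal{J}^{+}(\theta^{(t+1)}, \tilde{\theta}^{(t+1)})
    \le \mathcal{J}^{+}(\theta^{(t)}, \tilde{\theta}^{(t+1)}) = \mathcal{J}(\theta^{(t)}).
\end{align}
```

The last line shows why alternating the two minimizations never increases the original cost.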
Proposed method 2: activation-shared multichannel NMF
Cost function
The first and second terms become convex or concave functions depending on the value of β.
[Table: convexity of the two terms over three ranges of β. First row: concave, convex, convex. Second row: concave, concave, convex.]

Proposed method 2: activation-shared multichannel NMF
Cost function
The upper bound function of each term is defined by applying the following inequalities.
Convex function: Jensen’s inequality
Concave function: tangent line inequality

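For reference, the two inequalities take the following general form. The symbols f, g, λ_k, x_k, and x_0 are generic and introduced here for illustration; they are not the slide's notation.

```latex
\begin{align}
  \text{Convex } f:\quad
    & f\!\Big(\sum_{k} \lambda_k x_k\Big) \le \sum_{k} \lambda_k f(x_k),
    \qquad \lambda_k \ge 0,\ \sum_{k} \lambda_k = 1, \\
  \text{Concave } g:\quad
    & g(x) \le g(x_0) + g'(x_0)\,(x - x_0).
\end{align}
```

In NMF the argument is the model value, a sum over bases, so Jensen's inequality splits a convex term into per-basis terms, while the tangent line replaces a concave term with a linear one. Both bounds are tight at the current estimate.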
Proposed method 2: activation-shared multichannel NMF
The update rules for optimization are obtained from the derivative of the auxiliary function w.r.t. each objective variable.

Update rules
[Equations: update rules for the entries of the matrices]

Flow of proposed depth estimation method
Input stereo signal (L-ch, R-ch)
→ STFT
→ Directional clustering into cluster L, cluster C, and cluster R
→ Activation-shared NMF (applied to each cluster)
→ Weighted DOA histogram (x-axis: direction of arrival; y-axis: frequency of source components)
→ Depth estimation (for each cluster)

We can estimate depth information by calculating the shape parameter of the DOA histogram.

Experimental conditions
Conditions: mixing source parameters (test sources 1 to 3), reverberation time, intervals, NMF beta, and NMF basis
[Figure: arrangement of the target source and the interference sources]

Mixed stereo signals consist of 3 instruments.
The target source is located at the center at 7 distances.
There are 6 combination patterns with respect to direction.

Compared methods
Conventional method 1: weighted DOA histogram (not processed by NMF)
Conventional method 2: processed by conventional NMF
Proposed method: processed by proposed NMF

Experimental conditions
Image method [Allen, et al., 1979]
A technique for simulating room impulse responses.
The following can be set arbitrarily:
- Volume of room
- Source location
- Microphone location
- Absorption coefficient
Reference sound sources were generated using the image method.
[Figure: geometry of the image method (real source and image source)]
[Figure: example of room impulse response; x-axis: time index; y-axis: amplitude]

Experimental results
Results 1
Data set 1
Target source: Vocal
Interference source (left): Piano
Interference source (right): Guitar
[Figure: arrangement of the target source and the interference sources, and the estimation results]

・ The results of the conventional methods do not agree with the oracle (image method).
・ The proposed method correctly estimates the distance of the target source.

Experimental results: correlation coefficient
Results 2
Correlation coefficient between the reference value and the estimated value

Table: Correlation coefficient of each method
Data set  Target source  Interference (left)  Interference (right)  Conventional 1  Conventional 2  Proposed
1         Vocal          Piano                Guitar                 0.350           0.189          0.986
2         Vocal          Guitar               Piano                  0.532           0.165          0.925
3         Guitar         Piano                Vocal                  0.154           0.044          0.777
4         Guitar         Vocal                Piano                  0.277          -0.037          0.651
5         Piano          Vocal                Guitar                 0.602           0.426          0.791
6         Piano          Guitar               Vocal                  0.496           0.157          0.856

• A strong relation between the estimated value of the proposed method and the distance of the target source is indicated.
• The efficacy of the proposed method is confirmed.

Conclusions
We proposed a new method for estimating the depth of a sound source in a mixed signal using the shape of the DOA distribution.
The shape of the DOA distribution is modeled by the GGD.
We also proposed a new feature extraction method for multichannel signals, activation-shared multichannel NMF.
The results of the experiment indicated the efficacy of the proposed method.

Derivation of parameter βshape
× The maximum-likelihood-based shape parameter estimation has no closed-form solution for the GGD.
→ We propose a closed-form parameter estimation algorithm based on an approximation and the kurtosis.

Kurtosis of the DOA histogram (computed from the observed DOA histogram)
Moments of the GGD (expressed with the gamma function)
Relation equation between the kurtosis and the shape parameter

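The relation between kurtosis and shape parameter can be explored numerically. For a GGD the kurtosis is Γ(5/β)Γ(1/β) / Γ(3/β)², which equals 3 at β = 2 and 6 at β = 1. The sketch below computes the kurtosis of a weighted histogram and inverts the relation by bisection; this numerical inversion is a stand-in chosen for the example, not the closed-form estimator of the slides, and the function names and search interval are assumptions.

```python
import math
import numpy as np

def histogram_kurtosis(centers, weights):
    """Kurtosis (fourth central moment over squared variance) of a histogram."""
    c = np.asarray(centers, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()
    mean = np.sum(w * c)
    m2 = np.sum(w * (c - mean) ** 2)
    m4 = np.sum(w * (c - mean) ** 4)
    return float(m4 / m2**2)

def ggd_kurtosis(beta):
    """Kurtosis of a GGD with shape beta, computed via log-gamma for stability."""
    return math.exp(math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
                    - 2.0 * math.lgamma(3.0 / beta))

def shape_from_kurtosis(kurtosis, lo=0.2, hi=20.0, tol=1e-10):
    """Invert ggd_kurtosis by bisection; the kurtosis decreases as beta grows."""
    if not ggd_kurtosis(hi) < kurtosis < ggd_kurtosis(lo):
        raise ValueError("kurtosis outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurtosis:
            lo = mid  # kurtosis still too large, so beta must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

Passing the output of `histogram_kurtosis` to `shape_from_kurtosis` gives a shape estimate for an observed DOA histogram, which can serve as a reference when checking a closed-form approximation.
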
Derivation of parameter βshape
× There is no exact closed-form solution of the inverse function.
→ Introduce the modified Stirling's formula, an approximation of the gamma function.
Modified Stirling's formula
Take the logarithm.

Derivation of parameter βshape
This results in a quadratic equation to be solved.
By solving it, we can derive the closed-form estimate of the shape parameter.

The preparation of the depth estimation method is completed.

Proposed method 2: activation-shared multichannel NMF
Preliminary experiment
Example of DOA histograms: center cluster DOA of a mixed source (3 instruments)
[Figure: three DOA histograms; x-axis: direction of arrival [degree]]
- Weighted DOA histogram
- Conventional NMF (applied individually to the L-ch and R-ch): fluctuations are generated in the DOA histogram.
- Proposed NMF (activation-shared between the L-ch and R-ch): feature extraction while maintaining directional information.

More Related Content

What's hot

Ica2016 312 saruwatari
Ica2016 312 saruwatariIca2016 312 saruwatari
Ica2016 312 saruwatari
SaruwatariLabUTokyo
 
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Daichi Kitamura
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
SaruwatariLabUTokyo
 
Online divergence switching for superresolution-based nonnegative matrix fact...
Online divergence switching for superresolution-based nonnegative matrix fact...Online divergence switching for superresolution-based nonnegative matrix fact...
Online divergence switching for superresolution-based nonnegative matrix fact...
Daichi Kitamura
 
Apsipa2016for ss
Apsipa2016for ssApsipa2016for ss
Apsipa2016for ss
SaruwatariLabUTokyo
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...
Kitamura Laboratory
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
SaruwatariLabUTokyo
 
Divergence optimization in nonnegative matrix factorization with spectrogram ...
Divergence optimization in nonnegative matrix factorization with spectrogram ...Divergence optimization in nonnegative matrix factorization with spectrogram ...
Divergence optimization in nonnegative matrix factorization with spectrogram ...
Daichi Kitamura
 
Robust music signal separation based on supervised nonnegative matrix factori...
Robust music signal separation based on supervised nonnegative matrix factori...Robust music signal separation based on supervised nonnegative matrix factori...
Robust music signal separation based on supervised nonnegative matrix factori...
Daichi Kitamura
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
Daichi Kitamura
 
Dsp2015for ss
Dsp2015for ssDsp2015for ss
Dsp2015for ss
SaruwatariLabUTokyo
 
Regularized superresolution-based binaural signal separation with nonnegative...
Regularized superresolution-based binaural signal separation with nonnegative...Regularized superresolution-based binaural signal separation with nonnegative...
Regularized superresolution-based binaural signal separation with nonnegative...
Daichi Kitamura
 
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Daichi Kitamura
 
Isolated words recognition using mfcc, lpc and neural network
Isolated words recognition using mfcc, lpc and neural networkIsolated words recognition using mfcc, lpc and neural network
Isolated words recognition using mfcc, lpc and neural network
eSAT Journals
 
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color ImagesReduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
IDES Editor
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniquePankaj Kumar
 
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionSpeech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
IOSRJVSP
 
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
International Journal of Technical Research & Application
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Daichi Kitamura
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
ijsrd.com
 

What's hot (20)

Ica2016 312 saruwatari
Ica2016 312 saruwatariIca2016 312 saruwatari
Ica2016 312 saruwatari
 
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
Relaxation of rank-1 spatial constraint in overdetermined blind source separa...
 
Hybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invitedHybrid NMF APSIPA2014 invited
Hybrid NMF APSIPA2014 invited
 
Online divergence switching for superresolution-based nonnegative matrix fact...
Online divergence switching for superresolution-based nonnegative matrix fact...Online divergence switching for superresolution-based nonnegative matrix fact...
Online divergence switching for superresolution-based nonnegative matrix fact...
 
Apsipa2016for ss
Apsipa2016for ssApsipa2016for ss
Apsipa2016for ss
 
DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...DNN-based permutation solver for frequency-domain independent component analy...
DNN-based permutation solver for frequency-domain independent component analy...
 
Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016Koyama AES Conference SFC 2016
Koyama AES Conference SFC 2016
 
Divergence optimization in nonnegative matrix factorization with spectrogram ...
Divergence optimization in nonnegative matrix factorization with spectrogram ...Divergence optimization in nonnegative matrix factorization with spectrogram ...
Divergence optimization in nonnegative matrix factorization with spectrogram ...
 
Robust music signal separation based on supervised nonnegative matrix factori...
Robust music signal separation based on supervised nonnegative matrix factori...Robust music signal separation based on supervised nonnegative matrix factori...
Robust music signal separation based on supervised nonnegative matrix factori...
 
Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...Blind source separation based on independent low-rank matrix analysis and its...
Blind source separation based on independent low-rank matrix analysis and its...
 
Dsp2015for ss
Dsp2015for ssDsp2015for ss
Dsp2015for ss
 
Regularized superresolution-based binaural signal separation with nonnegative...
Regularized superresolution-based binaural signal separation with nonnegative...Regularized superresolution-based binaural signal separation with nonnegative...
Regularized superresolution-based binaural signal separation with nonnegative...
 
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...Hybrid multichannel signal separation using supervised nonnegative matrix fac...
Hybrid multichannel signal separation using supervised nonnegative matrix fac...
 
Isolated words recognition using mfcc, lpc and neural network
Isolated words recognition using mfcc, lpc and neural networkIsolated words recognition using mfcc, lpc and neural network
Isolated words recognition using mfcc, lpc and neural network
 
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color ImagesReduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color Images
 
Environmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC techniqueEnvironmental Sound detection Using MFCC technique
Environmental Sound detection Using MFCC technique
 
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionSpeech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction
 
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
APPRAISAL AND ANALOGY OF MODIFIED DE-NOISING AND LOCAL ADAPTIVE WAVELET IMAGE...
 
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical IndependenceAudio Source Separation Based on Low-Rank Structure and Statistical Independence
Audio Source Separation Based on Low-Rank Structure and Statistical Independence
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
 

Similar to Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization

Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging
Feature Extraction of Musical Instrument Tones using FFT and Segment AveragingFeature Extraction of Musical Instrument Tones using FFT and Segment Averaging
Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging
TELKOMNIKA JOURNAL
 
Frequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsFrequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral components
CSCJournals
 
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfA_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
Bala Murugan
 
Design of dfe based mimo communication system for mobile moving with high vel...
Design of dfe based mimo communication system for mobile moving with high vel...Design of dfe based mimo communication system for mobile moving with high vel...
Design of dfe based mimo communication system for mobile moving with high vel...
Made Artha
 
survey paper for image denoising
survey paper for image denoisingsurvey paper for image denoising
survey paper for image denoising
Arti Singh
 
Final presentation
Final presentationFinal presentation
Final presentation
Yash Bhalgat
 
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
CSCJournals
 
L011117884
L011117884L011117884
L011117884
IOSR Journals
 
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
IJERA Editor
 
Handling Ihnarmonic Series with Median-Adjustive Trajectories
Handling Ihnarmonic Series with Median-Adjustive TrajectoriesHandling Ihnarmonic Series with Median-Adjustive Trajectories
Handling Ihnarmonic Series with Median-Adjustive TrajectoriesMatthieu Hodgkinson
 
AURALISATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS: LISTENING TO LEARNED FEAT...
AURALISATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS: LISTENING TO LEARNED FEAT...AURALISATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS: LISTENING TO LEARNED FEAT...
AURALISATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS: LISTENING TO LEARNED FEAT...
NAVER LABS
 
A novel speech enhancement technique
A novel speech enhancement techniqueA novel speech enhancement technique
A novel speech enhancement technique
eSAT Publishing House
 
IR UWB TOA Estimation Techniques and Comparison
IR UWB TOA Estimation Techniques and ComparisonIR UWB TOA Estimation Techniques and Comparison
IR UWB TOA Estimation Techniques and Comparison
inventionjournals
 
Improving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsImproving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsiaemedu
 
Improving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsImproving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsiaemedu
 
Improving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsImproving the global parameter signal to distortion value in music signals
Improving the global parameter signal to distortion value in music signalsIAEME Publication
 
Introduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detectionIntroduction to deep learning based voice activity detection
Introduction to deep learning based voice activity detection
NAVER Engineering
 
3 D Sound
3 D Sound3 D Sound
3 D Sound
adityas87
 
Sparsity based Joint Direction-of-Arrival and Offset Frequency Estimator
Sparsity based Joint Direction-of-Arrival and Offset Frequency EstimatorSparsity based Joint Direction-of-Arrival and Offset Frequency Estimator
Sparsity based Joint Direction-of-Arrival and Offset Frequency Estimator
Jason Fernandes
 
N017428692
N017428692N017428692
N017428692
IOSR Journals
 

Similar to Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization (20)

Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging
Feature Extraction of Musical Instrument Tones using FFT and Segment AveragingFeature Extraction of Musical Instrument Tones using FFT and Segment Averaging
Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging
 
Frequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral componentsFrequency based criterion for distinguishing tonal and noisy spectral components
Frequency based criterion for distinguishing tonal and noisy spectral components
 
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdfA_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
A_Noise_Reduction_Method_Based_on_LMS_Adaptive_Fil.pdf
 
Design of dfe based mimo communication system for mobile moving with high vel...
Design of dfe based mimo communication system for mobile moving with high vel...Design of dfe based mimo communication system for mobile moving with high vel...
Design of dfe based mimo communication system for mobile moving with high vel...
 
survey paper for image denoising
survey paper for image denoisingsurvey paper for image denoising
survey paper for image denoising
 
Final presentation
Final presentationFinal presentation
Final presentation
 
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi...
 
L011117884
L011117884L011117884
L011117884
 
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No...
 
Handling Ihnarmonic Series with Median-Adjustive Trajectories
Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization

  • 1. Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization Tomo Miyauchi, Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura (Nara Institute of Science and Technology, Japan)
  • 2. Outline: Background and related study; Problem and purpose; Proposed method 1 (depth estimation based on DOA distribution); Proposed method 2 (activation-shared nonnegative matrix factorization); Experiments; Conclusions.
  • 3. Background: With the advent of 3D TV, the reproduction of 3D images has been realized, but a 3D sound reproduction system has not been established yet. Problem: the picture image and the sound image do not match, so the viewer feels uncomfortable. [Figure: 3D TV with picture image and sound image.] To solve this problem, sound field reproduction techniques have been studied actively. They can present the “direction” and “depth” of the sound images to the listener.
  • 4. Related study: wave field synthesis. Sound field reproduction and representation of the “depth” of sound images: Wave Field Synthesis (WFS) [A. J. Berkhout, et al., 1993]. WFS allows us to create sound images in front of the loudspeakers. Drawback of WFS: it requires the primary source information of the sound images, namely (1) the individual sound sources and (2) localization information. This information has been lost in existing contents through down-mixing, so up-mixing methods are required: (1) source separation (mixed signal → individual sources) and (2) localization estimation of sound images. [Figure: loudspeaker array, sound image, and listener.]
  • 5. Flow of proposed up-mixer: a spatial sound system using existing contents. Stereo contents (mixed multichannel signal) → (1) sound source separation and (2) directional estimation (conventional methods), plus depth estimation (new, this study) → wave field synthesis → spatial sound reproduction. Depth estimation of sound images has not been proposed before.
  • 6. Related study: directional clustering [Araki, et al., 2007]. [Figure: the L-ch and R-ch input signals of the mixed stereo signal are Fourier transformed; the source components are normalized and clustered around spatial representative vectors; the inverse Fourier transform yields the individual sources of each cluster.]
  • 7. Outline: Background and related study; Problem and purpose; Proposed method 1 (depth estimation based on DOA distribution); Proposed method 2 (activation-shared multichannel NMF); Experiments; Conclusions.
  • 8. Problem and purpose. Problem: WFS requires specific localization information of the individual sound sources to reproduce a sound field. Up-mixer: directional estimation methods have been developed, e.g., directional estimation based on VBAP [Hirata, et al., 2011]. Purpose: establishing a new depth estimation method. How can we get depth information? Proposed method: depth estimation using the direction of arrival (DOA) distribution.
  • 9. Outline: Background and related study; Problem and purpose; Proposed method 1 (depth estimation based on DOA distribution); Proposed method 2 (activation-shared multichannel NMF); Experiments; Conclusions.
  • 10. Proposed method 1: depth estimation based on DOA. DOA → “direction of arrival” of sound waves. We estimate the depth using the DOA distribution. Directional clustering separates the mixed signal into individual sources, and a weighted DOA histogram is built for each of them. Directional information: amplitude ratio. Weighting term: magnitude of each vector. [Figure: weighted DOA histogram; horizontal axis: direction of arrival (left, center, right); vertical axis: frequency of source components.]
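A weighted DOA histogram of the kind described on this slide can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name, the use of the angle of the (L, R) amplitude vector as the direction cue, and the Euclidean norm as the weighting term are all assumptions, since the slide's formulas did not survive extraction.

```python
import numpy as np


def weighted_doa_histogram(spec_l, spec_r, n_bins=90):
    """Histogram of per-bin direction angles, weighted by vector magnitude.

    spec_l, spec_r: complex STFT arrays of equal shape (frequency x time).
    Returns (hist, edges); angles are in degrees, 0 = fully left, 90 = fully right.
    """
    amp_l = np.abs(spec_l).ravel()
    amp_r = np.abs(spec_r).ravel()
    # Assumed direction cue: angle of the (L, R) amplitude vector.
    angle = np.degrees(np.arctan2(amp_r, amp_l))
    # Assumed weighting term: Euclidean norm of the (L, R) amplitude vector.
    weight = np.hypot(amp_l, amp_r)
    hist, edges = np.histogram(angle, bins=n_bins, range=(0.0, 90.0), weights=weight)
    return hist, edges
```

With this convention, a source panned to the center produces a peak near 45 degrees, and louder time-frequency components contribute more to the histogram than quiet ones.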
  • 11. Proposed method 1: depth estimation based on DOA. In sound fields, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion. The shape of the DOA histogram therefore differs with source distance. Close source: the observed DOA histogram has a spiky shape. Far source: the observed DOA histogram has a smooth shape. [Figure: DOA histograms of a close and a far source; horizontal axis: direction of arrival; vertical axis: frequency of source components.] The observed DOA distribution of the target source can be used as a cue for depth estimation.
  • 12. Proposed method 1: modeling of DOA distribution. To model the DOA distribution, we propose a new modeling method using the generalized Gaussian distribution (GGD) [Box, et al., 1973], a flexible family of probability density functions (PDFs). The shape of the GGD changes depending on βshape. βshape = 2: Gaussian distribution PDF. βshape = 1: Laplacian distribution PDF. [Equation: definition of GGD.]
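The definition on this slide was lost in extraction. For reference, one commonly used parameterization of the GGD is given below; the location symbol μ and scale symbol α are assumptions and may differ from the notation on the original slide.

```latex
p(x;\,\mu,\alpha,\beta_{\mathrm{shape}})
  = \frac{\beta_{\mathrm{shape}}}{2\alpha\,\Gamma\!\left(1/\beta_{\mathrm{shape}}\right)}
    \exp\!\left[-\left(\frac{|x-\mu|}{\alpha}\right)^{\beta_{\mathrm{shape}}}\right]
```

Setting βshape = 2 gives a Gaussian with variance α²/2, and βshape = 1 gives a Laplacian with scale α, which matches the two special cases named on the slide.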
  • 13. Proposed method 1: modeling of DOA distribution. The DOA distribution is modeled by the GGD parameters. [Figure: DOA histograms of a close and a far source; horizontal axis: direction of arrival; vertical axis: frequency of source components.] We propose a new depth estimation method based on the GGD, with the shape parameter βshape used as the metric. Source is close ⇔ βshape is small. Source is far ⇔ βshape is large.
  • 14. Proposed method 2: problem in proposed method 1. Normalization problem: small noise components are enhanced. [Figure: binaural-recorded L-ch and R-ch input signals and the resulting DOA histogram with noise; horizontal axis: DOA (left, center, right); vertical axis: frequency of source components.] Background noise and artificial distortion generated by signal processing interfere with the DOA histogram. Remedy: feature extraction by activation-shared multichannel NMF.
  • 15. Outline: Background and related study; Problem and purpose; Proposed method 1 (depth estimation based on DOA distribution); Proposed method 2 (activation-shared multichannel NMF); Experiments; Conclusions.
  • 16. Proposed method 2: activation-shared multichannel NMF. Nonnegative matrix factorization (NMF) [Lee, et al., 2001] is a sparse representation and can extract significant features from the observed matrix. The observed matrix (spectrogram, frequency × time) is approximated by the product of a basis matrix (spectral patterns) and an activation matrix (time-varying gains). Ω is the number of frequency bins; the other two dimensions are the number of time frames and the number of bases. The sparse representation provides high performance for noise reduction, compression, and feature extraction. We use it to eliminate background noise and artificial distortion.
  • 17. Proposed method 2: problem of conventional NMF. Conventional NMF: NMFs are applied to the L-ch and the R-ch in parallel. Because the bases of the two channels are trained without any correlation between them, conventional NMFs generate an artificial fluctuation in the amplitude ratio (the directional information), and the DOA information is disturbed.
  • 18. Proposed method 2: activation-shared multichannel NMF. Proposed method: the activation matrix is shared through all channels (L-ch and R-ch), while each channel keeps its own basis matrix. This reduces the dimensionality of the input signal while maintaining directional information. [Equation: cost function, defined with the β-divergence over the entries of the matrices.]
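The idea of sharing one activation matrix across channels can be sketched in code. The slide's β-divergence update rules were lost in extraction, so this sketch swaps in the standard multiplicative updates for the Euclidean cost (summed over both channels); the function name, parameter names, and defaults are hypothetical and not taken from the paper.

```python
import numpy as np


def activation_shared_nmf(y_l, y_r, n_bases=4, n_iter=200, eps=1e-12, seed=0):
    """Factorize Y_L ~ F_L @ G and Y_R ~ F_R @ G with one shared activation G.

    y_l, y_r: nonnegative arrays of equal shape (frequency x time).
    Minimizes the summed squared error of both channels (Euclidean case only).
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = y_l.shape
    f_l = rng.random((n_freq, n_bases)) + eps
    f_r = rng.random((n_freq, n_bases)) + eps
    g = rng.random((n_bases, n_frames)) + eps
    for _ in range(n_iter):
        # Channel-wise basis updates (standard Euclidean multiplicative rule).
        f_l *= (y_l @ g.T) / (f_l @ (g @ g.T) + eps)
        f_r *= (y_r @ g.T) / (f_r @ (g @ g.T) + eps)
        # Shared activation update: numerator and denominator sum over channels.
        num = f_l.T @ y_l + f_r.T @ y_r
        den = (f_l.T @ f_l + f_r.T @ f_r) @ g + eps
        g *= num / den
    return f_l, f_r, g
```

Because the same activation drives both channels, each basis pair (a column of `f_l` and the matching column of `f_r`) keeps a fixed inter-channel amplitude ratio over time, which is the property the slide describes as maintaining directional information.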
  • 19. Proposed method 2: activation-shared multichannel NMF. β-divergence [Eguchi, et al., 2001]: a generalized divergence whose form depends on the value of β. Special cases: Euclidean distance, generalized Kullback-Leibler divergence, and Itakura–Saito divergence.
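The β values attached to each special case were lost in extraction. As a hedged illustration, the sketch below uses the widely used convention in which β = 2 gives (half) the squared Euclidean distance, β = 1 the generalized Kullback-Leibler divergence, and β = 0 the Itakura–Saito divergence; it is an assumption that the slides follow the same convention, and the function name is hypothetical.

```python
import numpy as np


def beta_divergence(y, x, beta):
    """Sum of element-wise beta-divergences D_beta(y | x) for positive y and x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    if beta == 1:
        # Limit beta -> 1: generalized Kullback-Leibler divergence.
        return float(np.sum(y * np.log(y / x) - y + x))
    if beta == 0:
        # Limit beta -> 0: Itakura-Saito divergence.
        return float(np.sum(y / x - np.log(y / x) - 1.0))
    # General case; beta = 2 reduces to 0.5 * (y - x) ** 2.
    return float(np.sum(
        y ** beta / (beta * (beta - 1.0))
        + x ** beta / beta
        - y * x ** (beta - 1.0) / (beta - 1.0)
    ))
```

In every case the divergence is zero when `y` equals `x` and positive otherwise, so it can serve as the per-entry discrepancy inside an NMF cost function.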
  • 20. Proposed method 2: activation-shared multichannel NMF. Derivation of optimal variables using the β-divergence. The auxiliary function method is an optimization scheme that uses an upper bound function. 1. Design an auxiliary function (upper bound) for the cost function. 2. Minimize the original cost function indirectly by minimizing the auxiliary function.
  • 21. Proposed method 2: activation-shared multichannel NMF. Cost function: the first and second terms become convex or concave functions depending on the value of β. [Table: convexity (convex or concave) of each term for each range of β.]
  • 22. Proposed method 2: activation-shared multichannel NMF. Cost function: the upper bound function of each term is defined by applying Jensen’s inequality to convex functions and the tangent line inequality to concave functions.
  • 23. Proposed method 2: activation-shared multichannel NMF. The update rules for optimization are obtained from the derivative of the auxiliary function w.r.t. each objective variable. [Equations: update rules for the entries of the matrices.]
  • 24. Flow of proposed depth estimation method: input stereo signal (L-ch, R-ch) → STFT → clustering by direction of arrival into cluster L, cluster C, and cluster R → activation-shared NMF for each cluster → weighted DOA histogram for each cluster → depth estimation for each cluster. We can estimate depth information by calculating the shape parameter of the DOA histogram. [Figures: weighted DOA histograms; horizontal axis: direction of arrival; vertical axis: frequency of source components.]
  • 25. Outline: Background and related study; Problem and purpose; Proposed method 1 (depth estimation based on DOA distribution); Proposed method 2 (activation-shared multichannel NMF); Experiments; Conclusions.
  • 26. Experimental conditions. Mixed stereo signals consist of 3 instruments (test sources 1 to 3: one target source and two interference sources). The target source is located at the center at 7 distances. There are 6 patterns of combinations related to direction. [Table: conditions (mixing source parameter, reverberation time, intervals, NMF beta, NMF basis).] Compared methods: conventional method 1, weighted DOA histogram (not processed by NMF); conventional method 2, processed by conventional NMF; proposed method, processed by proposed NMF.
  • 27. Experimental conditions. Image method [Allen, et al., 1979]: a technique for simulating room impulse responses. [Figure: geometry of the image method, showing the real source and image sources.] The volume of the room, source location, microphone location, and absorption coefficient can be set arbitrarily. [Figure: example of a room impulse response; horizontal axis: time index; vertical axis: amplitude.] Reference sound sources were generated using the image method.
  • 28. Experimental results. Results 1, data set 1: target source: vocal; interference source (left): piano; interference source (right): guitar. The results of the conventional methods show no agreement with the oracle (image method). The results of the proposed method correctly estimate the distance of the target source.
  • 29. Experimental results: correlation coefficient. Results 2: correlation coefficient between the reference value and the estimated value.
    Table: Correlation coefficient of each method
    Data set | Target source | Interference source (left) | Interference source (right) | Conventional method 1 | Conventional method 2 | Proposed method
    1 | Vocal | Piano | Guitar | 0.350 | 0.189 | 0.986
    2 | Vocal | Guitar | Piano | 0.532 | 0.165 | 0.925
    3 | Guitar | Piano | Vocal | 0.154 | 0.044 | 0.777
    4 | Guitar | Vocal | Piano | 0.277 | -0.037 | 0.651
    5 | Piano | Vocal | Guitar | 0.602 | 0.426 | 0.791
    6 | Piano | Guitar | Vocal | 0.496 | 0.157 | 0.856
    A strong relation between the estimated value of the proposed method and the distance of the target source is indicated. The efficacy of the proposed method is confirmed.
  • 30. Conclusions. We proposed a new depth estimation method for sound sources in a mixed signal using the shape of the DOA distribution. The shape of the DOA distribution is modeled by the GGD. We also proposed a new feature extraction method for multichannel signals, activation-shared multichannel NMF. The experimental results indicated the efficacy of the proposed method.
  • 32. Derivation of parameter βshape. Problem: maximum-likelihood-based shape parameter estimation has no closed-form solution for the GGD. We propose a closed-form parameter estimation algorithm based on an approximation and the kurtosis. [Equations: kurtosis of the DOA histogram; moments of the GGD; relation between kurtosis and shape parameter, written in terms of the observed DOA histogram and the gamma function.]
  • 33. Derivation of parameter βshape. Problem: there is no exact closed-form solution of the inverse function. We therefore introduce the modified Stirling's formula as an approximation of the gamma function and take the logarithm. [Equations: modified Stirling's formula and the log-domain relation.]
  • 34. Derivation of parameter βshape. This results in a quadratic equation to be solved, from which we can derive the closed-form estimate of the shape parameter. [Equations: quadratic equation and closed-form estimate of the shape parameter.] Preparation of the depth estimation method is completed.
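The closed-form estimator on these slides did not survive extraction, so the sketch below swaps in a different technique: it inverts the standard GGD kurtosis relation, kurtosis = Γ(5/β)Γ(1/β) / Γ(3/β)², numerically by bisection instead of using the authors' Stirling-based closed form. All function names and the search interval are hypothetical.

```python
import math

import numpy as np


def ggd_kurtosis(beta):
    """Kurtosis of a GGD with shape beta: G(5/b) * G(1/b) / G(3/b)**2."""
    log_k = (math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
             - 2.0 * math.lgamma(3.0 / beta))
    return math.exp(log_k)


def histogram_kurtosis(centers, hist):
    """Kurtosis of a histogram treated as a discrete distribution."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    c = np.asarray(centers, dtype=float)
    mean = np.sum(p * c)
    m2 = np.sum(p * (c - mean) ** 2)
    m4 = np.sum(p * (c - mean) ** 4)
    return float(m4 / m2 ** 2)


def estimate_shape(kurtosis, lo=0.2, hi=20.0, tol=1e-8):
    """Invert ggd_kurtosis by bisection (kurtosis decreases as beta grows)."""
    if not ggd_kurtosis(hi) < kurtosis < ggd_kurtosis(lo):
        raise ValueError("kurtosis outside the range covered by [lo, hi]")
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurtosis:
            lo = mid  # kurtosis still too high: the shape must be larger
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A kurtosis of 3 maps back to a shape of 2 (Gaussian) and a kurtosis of 6 to a shape of 1 (Laplacian), so a spiky DOA histogram with high kurtosis yields a small shape parameter, in line with the close-source case described earlier in the slides.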
  • 35. Proposed method 2: activation-shared multichannel NMF. Preliminary experiment: example DOA histograms for the center cluster of a mixed source (3 instruments). [Figure: three panels, each with horizontal axis direction of arrival [degree]: weighted DOA histogram; (individually applied) conventional NMF on L-ch and R-ch; (activation-shared) proposed NMF on L-ch and R-ch.] With conventional NMF, fluctuations are generated in the DOA histogram. The proposed NMF performs feature extraction while maintaining directional information.