The document proposes two methods for estimating the depth of sound images using direction of arrival (DOA) information extracted from stereo signals.
Method 1 estimates depth based on the shape of the DOA distribution modeled using a generalized Gaussian distribution (GGD), where a smoother distribution indicates a source is farther away.
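As a rough illustration of how the shape of a DOA distribution could be quantified, the sketch below fits the shape parameter of a zero-mean generalized Gaussian by moment matching. The moment-matching estimator, the bisection bounds, and the function names are assumptions made for this example; the summary does not say which estimator the presentation uses, and no mapping from shape to depth is implied here.

```python
import math
import random


def ggd_moment_ratio(beta):
    """Return (E|x|)^2 / E[x^2] for a zero-mean generalized Gaussian with shape beta."""
    return math.exp(
        2 * math.lgamma(2 / beta) - math.lgamma(1 / beta) - math.lgamma(3 / beta)
    )


def estimate_ggd_shape(samples, lo=0.1, hi=20.0, iters=60):
    """Estimate the GGD shape parameter by matching the moment ratio (assumed method)."""
    mu = sum(samples) / len(samples)
    centered = [x - mu for x in samples]
    m1 = sum(abs(x) for x in centered) / len(centered)
    m2 = sum(x * x for x in centered) / len(centered)
    target = m1 * m1 / m2
    # The moment ratio increases with beta, so bisection is sufficient.
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if ggd_moment_ratio(mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-ins for DOA samples: a peaky and a smoother distribution.
    peaky = [rng.expovariate(1.0) * rng.choice((-1.0, 1.0)) for _ in range(20000)]
    smooth = [rng.gauss(0.0, 1.0) for _ in range(20000)]
    print("shape (Laplacian-like):", estimate_ggd_shape(peaky))
    print("shape (Gaussian-like):", estimate_ggd_shape(smooth))
```

A shape value near 1 corresponds to a Laplacian-like (sharply peaked) distribution and a value near 2 to a Gaussian.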
Method 2 applies activation-shared nonnegative matrix factorization (NMF) to extract features while maintaining directional information and to reduce noise that interferes with DOA estimation. Conventional NMF generates artificial fluctuations, whereas sharing the activation preserves the source direction.
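One way to picture activation sharing is the following minimal sketch, which assumes that the left and right channel spectrograms each get their own basis matrix while a single activation matrix is shared between them. Stacking the two channels along the frequency axis and running standard Euclidean multiplicative updates then enforces the shared activation. The cost function, the update rules, and all names are assumptions for illustration and may differ from the formulation in the presentation.

```python
import random


def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def shared_activation_nmf(X_left, X_right, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize X_left ~ W_left H and X_right ~ W_right H with one shared H."""
    X = X_left + X_right  # stack the two channels along the frequency axis
    rng = random.Random(seed)
    W = [[rng.random() + eps for _ in range(n_bases)] for _ in X]
    H = [[rng.random() + eps for _ in X[0]] for _ in range(n_bases)]
    for _ in range(n_iter):
        # Multiplicative update of the shared activation H.
        Wt = transpose(W)
        num, den = matmul(Wt, X), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(hr, nr, dr)]
             for hr, nr, dr in zip(H, num, den)]
        # Multiplicative update of the stacked bases W.
        Ht = transpose(H)
        num, den = matmul(X, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(wr, nr, dr)]
             for wr, nr, dr in zip(W, num, den)]
    split = len(X_left)
    return W[:split], W[split:], H


def reconstruction_error(X, W, H):
    """Squared Euclidean error between X and W H."""
    approx = matmul(W, H)
    return sum((x - a) ** 2 for xr, ar in zip(X, approx) for x, a in zip(xr, ar))
```

Because both channels are scaled by the same activation in every frame, the left/right level ratio of each basis stays fixed over time, which is the property the summary attributes to sharing the activation.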
An experiment calculates the correlation between estimated and reference depths on 6 datasets containing varying source combinations. Results show the proposed method achieves high correlation, confirming its efficacy over conventional methods.
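The evaluation metric can be sketched as a plain Pearson correlation coefficient between two lists of depth values. That the experiment uses the Pearson coefficient specifically is an assumption, and the function name is hypothetical.

```python
import math


def pearson_correlation(estimated, reference):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(estimated) != len(reference) or len(estimated) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r) for e, r in zip(estimated, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov / math.sqrt(var_e * var_r)
```

Passing the estimated depths and the reference depths for one dataset returns a value in [-1, 1]; values close to 1 mean the estimates rise and fall together with the reference.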
DNN-based frequency component prediction for frequency-domain audio source se... (Kitamura Laboratory)
Rui Watanabe, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, and Kazunobu Kondo, "DNN-based frequency component prediction for frequency-domain audio source separation," Proceedings of European Signal Processing Conference (EUSIPCO 2020), pp. 805–809, Amsterdam, Netherlands, January 2021.
Depth estimation of sound images using directional clustering and activation-... (Daichi Kitamura)
Presented at 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014) (international conference)
Tomo Miyauchi, Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, "Depth estimation of sound images using directional clustering and activation-shared nonnegative matrix factorization," Proceedings of 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014), pp.437-440, Hawaii, USA, March 2014 (Student Paper Award).
Shoichi Koyama, Naoki Murata, and Hiroshi Saruwatari. "Super-resolution in sound field recording and reproduction based on sparse representation"
presented at 5th Joint Meeting Acoustical Society of America and Acoustical Society of Japan (28 Nov. - 2 Dec. 2016, Honolulu, USA)
Linear multichannel blind source separation based on time-frequency mask obta... (Kitamura Laboratory)
Soichiro Oyabu, Daichi Kitamura, and Kohei Yatabe, "Linear multichannel blind source separation based on time-frequency mask obtained by harmonic/percussive sound separation," Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2021), pp. 201–205, Toronto, Canada, June 2021.
Blind audio source separation based on time-frequency structure models (Kitamura Laboratory)
Daichi Kitamura, "Blind audio source separation based on time-frequency structure models," Invited Overview Session in Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2021), Tokyo, Japan, December 2021.
Prior distribution design for music bleeding-sound reduction based on nonnega... (Kitamura Laboratory)
Yusaku Mizobuchi, Daichi Kitamura, Tomohiko Nakamura, Hiroshi Saruwatari, Yu Takahashi, and Kazunobu Kondo, "Prior distribution design for music bleeding-sound reduction based on nonnegative matrix factorization," Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2021), pp. 651–658, Tokyo, Japan, December 2021.
Relaxation of rank-1 spatial constraint in overdetermined blind source separa... (Daichi Kitamura)
Presented at The 2015 European Signal Processing Conference (EUSIPCO 2015, international conference)
Daichi Kitamura, Nobutaka Ono, Hiroshi Sawada, Hirokazu Kameoka, Hiroshi Saruwatari, "Relaxation of rank-1 spatial constraint in overdetermined blind source separation," Proceedings of The 2015 European Signal Processing Conference (EUSIPCO 2015), pp.1271-1275, Nice, France, September 2015 (Invited Special Session).
Online divergence switching for superresolution-based nonnegative matrix fact... (Daichi Kitamura)
Presented at 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka, "Online divergence switching for superresolution-based nonnegative matrix factorization," Proceedings of 2014 RISP International Workshop on Nonlinear Circuits, Communications and Signal Processing (NCSP 2014), pp.485-488, Hawaii, USA, March 2014 (Student Paper Award).
DNN-based permutation solver for frequency-domain independent component analy... (Kitamura Laboratory)
Shuhei Yamaji and Daichi Kitamura, "DNN-based permutation solver for frequency-domain independent component analysis in two-source mixture case," Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC 2020), pp. 781–787, Auckland, New Zealand, December 2020.
Shoichi Koyama, "Source-Location-Informed Sound Field Recording and Reproduction: A Generalization to Arrays of Arbitrary Geometry"
Presented in 2016 AES International Conference on Sound Field Control (July 18-20 2016, Guildford, UK)
Divergence optimization in nonnegative matrix factorization with spectrogram ... (Daichi Kitamura)
Presented at 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka, "Divergence optimization in nonnegative matrix factorization with spectrogram restoration for multichannel signal separation," Proceedings of 4th Joint Workshop on Hands-free Speech Communication and Microphone Arrays (HSCMA 2014), pp.92-96, Nancy, France, May 2014.
Robust music signal separation based on supervised nonnegative matrix factori... (Daichi Kitamura)
Presented at IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2013) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Kosuke Yagi, Kiyohiro Shikano, Yu Takahashi, Kazunobu Kondo, "Robust music signal separation based on supervised nonnegative matrix factorization with prevention of basis sharing," Proceedings of IEEE International Symposium on Signal Processing and Information Technology (ISSPIT 2013), pp.392-397, Athens, Greece, December 2013.
Blind source separation based on independent low-rank matrix analysis and its... (Daichi Kitamura)
Daichi Kitamura, "Blind source separation based on independent low-rank matrix analysis and its extension to Student's t-distribution," Télécom ParisTech, Invited Lecture, September 4th, 2017.
Regularized superresolution-based binaural signal separation with nonnegative... (Daichi Kitamura)
Presented at 5th International Conference on 3D Systems and Applications (3DSA 2013) (international conference)
Daichi Kitamura, Hiroshi Saruwatari, Kiyohiro Shikano, Kazunobu Kondo, Yu Takahashi, "Regularized superresolution-based binaural signal separation with nonnegative matrix factorization," Proceedings of 5th International Conference on 3D Systems and Applications (3DSA 2013), S10-4, Osaka, Japan, June 2013.
Hybrid multichannel signal separation using supervised nonnegative matrix fac... (Daichi Kitamura)
Presented at Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2014 (APSIPA 2014, international conference)
Daichi Kitamura, Hiroshi Saruwatari, Satoshi Nakamura, Yu Takahashi, Kazunobu Kondo, Hirokazu Kameoka, "Hybrid multichannel signal separation using supervised nonnegative matrix factorization with spectrogram restoration," Proceedings of Asia-Pacific Signal and Information Processing Association Annual Summit and Conference 2014 (APSIPA 2014), Siem Reap, Cambodia, December 2014 (invited paper).
Isolated words recognition using MFCC, LPC and neural network (eSAT Journals)
Abstract: Automatic speech recognition is an important topic in speech processing. This paper presents the use of an Artificial Neural Network (ANN) for isolated word recognition. After pre-processing, voiced speech is detected based on energy and zero crossing rates (ZCR). The features used for recognition are Mel Frequency Cepstral Coefficients (MFCC) alone and a combination of MFCC and Linear Predictive Coding (LPC) features. A back-propagation network is used as the classifier. Recognition accuracy increases when the combined LPC and MFCC features are used, compared with MFCC alone, with the neural network as the classifier. Keywords: Pre-processing, Mel Frequency Cepstral Coefficient (MFCC), Linear Predictive Coding (LPC), Artificial Neural Network (ANN).
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color Images (IDES Editor)
In this paper a novel filtering design intended for
the impulsive noise removal in color images is presented.
The described scheme utilizes the rank weighted cumulated
distances between the pixels belonging to the local filtering
window. The impulse detection scheme is based on the
difference between the aggregated weighted distances assigned
to the central pixel of the window and the minimum value,
which corresponds to the rank weighted vector median. If the
difference exceeds an adaptively determined threshold value,
then the processed pixel is replaced by the mean of the
neighboring pixels, which were found to be not corrupted,
otherwise it is retained. The important feature of the described
filtering framework is its ability to effectively suppress
impulsive noise, while preserving fine image details. The
comparison with the state-of-the-art denoising schemes
revealed that the proposed filter yields better restoration
results in terms of objective restoration quality measures.
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction (IOSRJVSP)
This paper aims to reduce background noise introduced into a speech signal during capture, storage, transmission and processing, using the Spectral Subtraction algorithm. Because colored noise corrupts the speech signal non-uniformly over different frequency bands, a Multi-Band Spectral Subtraction (MBSS) approach is used, in which the amount of noise subtracted from the noisy speech signal is decided by a weighting factor. The choice of weights determines the performance of the speech enhancement system. In this paper the weights are decided based on the SFM (Spectral Flatness Measure) rather than the conventional SNR (Signal to Noise Ratio) based rule, since SFM is able to distinguish between the speech signal and the noise signal. Spectrograms and Mean Opinion Scores show that speech enhanced by the proposed SFM-based MBSS has better perceptual quality and improved intelligibility than the existing SNR-based MBSS.
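For reference, the spectral flatness measure itself is commonly defined as the ratio of the geometric mean to the arithmetic mean of a power spectrum. The sketch below computes only that quantity; how the paper turns SFM into per-band subtraction weights is not stated in the abstract and is not attempted here. The function name and the `eps` guard are assumptions.

```python
import math


def spectral_flatness(power_spectrum, eps=1e-12):
    """Geometric mean divided by arithmetic mean of a power spectrum."""
    n = len(power_spectrum)
    # Average the logs to get the geometric mean without overflow or underflow.
    log_mean = sum(math.log(p + eps) for p in power_spectrum) / n
    arithmetic_mean = sum(power_spectrum) / n
    return math.exp(log_mean) / (arithmetic_mean + eps)
```

A flat, noise-like band gives a value near 1, while a band dominated by a few strong components gives a value near 0.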
Noise removal is a decisive step in the image restoration process, but image denoising remains a challenging problem in image processing. Denoising is used to remove noise from a corrupted image while keeping the edges and other fine details as intact as possible. Noise is introduced during acquisition, transmission and reception, and storage and retrieval. This paper considers two approaches for obtaining the denoised image: a modified denoising technique, which operates in the wavelet domain as well as the spatial domain, and a local adaptive wavelet image denoising technique, which operates in the wavelet domain. The two techniques are evaluated and compared based on the PSNR between the input image and the noisy image and the SNR between the input image and the denoised image. Simulation and experimental results for an image indicate that the local adaptive wavelet image denoising procedure performs worse than the modified denoising procedure in terms of mean square error, while its signal to noise ratio is better than that of the other approach. The image after denoising therefore has a superior visual effect. Both techniques are implemented in MATLAB.
Speaker Recognition System using MFCC and Vector Quantization Approach (ijsrd.com)
This paper presents an approach to speaker recognition using frequency spectral information with Mel frequency for the improvement of speech feature representation in a Vector Quantization codebook based recognition approach. The Mel frequency approach extracts the features of the speech signal to get the training and testing vectors. The VQ Codebook approach uses training vectors to form clusters and recognize accurately with the help of LBG algorithm.
Feature Extraction of Musical Instrument Tones using FFT and Segment Averaging (TELKOMNIKA JOURNAL)
This paper proposes a feature extraction method for musical instrument tones based on a transform domain approach. The aim is to obtain a small number of feature extraction coefficients. The method proceeds as follows. First, the input signal is transformed using the FFT (Fast Fourier Transform). Second, the left half of the transformed signal is divided into a number of segments. Finally, the averages of those segments are taken as the extracted features of the input signal. Based on the test results, the proposed feature extraction is highly efficient for tones that have many significant local peaks in the Fourier transform domain, because it requires as few as four feature extraction coefficients to represent each tone.
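The three steps in this abstract can be sketched as follows. To keep the example self-contained, a naive discrete Fourier transform stands in for the FFT (it produces the same spectrum, only more slowly), and using the magnitude of the spectrum is an assumption, since the abstract does not say which quantity is averaged. All names are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform, used here in place of an FFT."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def segment_average_features(signal, n_segments):
    """Average the magnitudes of the left half of the spectrum over equal segments."""
    magnitudes = [abs(c) for c in dft(signal)]
    half = magnitudes[: len(magnitudes) // 2]  # left half of the transformed signal
    seg_len = len(half) // n_segments
    if seg_len == 0:
        raise ValueError("too many segments for this signal length")
    return [
        sum(half[i * seg_len:(i + 1) * seg_len]) / seg_len
        for i in range(n_segments)
    ]


if __name__ == "__main__":
    # A pure tone at bin 5 of a 64-sample frame, described by 4 coefficients.
    tone = [math.sin(2 * math.pi * 5 * t / 64) for t in range(64)]
    print(segment_average_features(tone, 4))
```

With four segments, every tone is represented by four numbers regardless of its length, which matches the small feature count the abstract emphasizes.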
Frequency based criterion for distinguishing tonal and noisy spectral components (CSCJournals)
A frequency-based criterion for distinguishing tonal and noisy spectral components is proposed. For each local spectral maximum considered, two instantaneous frequency estimates are determined, and the difference between them is used to verify whether the component is noisy or tonal. Since one of the estimators was devised specifically for this application, its properties are examined in depth. The proposed criterion is applied to stationary and nonstationary sinusoids in order to examine its efficiency.
Image Restoration Using Particle Filters By Improving The Scale Of Texture Wi... (CSCJournals)
Traditional techniques restore image values using local smoothness constraints within fixed-bandwidth windows, without considering image structure. A common problem for such methods is how to choose the most appropriate bandwidth and the most suitable set of neighboring pixels to guide the reconstruction process. The present work proposes a denoising technique based on particle filtering using an MRF (Markov Random Field). It is an automatic technique for capturing the scale of texture. The contribution of our method is the selection of an appropriate window in the image domain. For this, we first construct a set containing all occurrences; the conditional pdf can then be estimated with a histogram of all center pixel values. Particle evolution is controlled by the image structure, leading to a filtering window adapted to the image content. Our method explores multiple neighbor sets (or hypotheses) that can be used for pixel denoising through a particle filtering approach. The technique assigns a weight to each hypothesis according to its relevance and its contribution to the denoising process.
Direction of Arrival Estimation Based on MUSIC Algorithm Using Uniform and No... (IJERA Editor)
In signal processing, the direction of arrival (DOA) denotes the direction from which a propagating wave arrives at a point where a set of antennas is located. An antenna array has an advantage over a single antenna in that it achieves improved performance when the Multiple Signal Classification (MUSIC) algorithm is applied. This paper focuses on estimating the DOA using a uniform linear array (ULA) and a non-uniform linear array (NLA) of antennas, and analyzes the performance factors that affect the accuracy and resolution of a system based on the MUSIC algorithm. DOA estimation is simulated on a MATLAB platform with a set of input parameters such as the number of array elements, the signal to noise ratio, the number of snapshots and the number of signal sources. An extensive simulation has been conducted, and the results show that the NLA with DOA estimation for a co-prime array can achieve accurate and efficient DOA estimation.
AURALISATION OF DEEP CONVOLUTIONAL NEURAL NETWORKS: LISTENING TO LEARNED FEAT... (NAVER LABS)
A paper presented at ISMIR (International Society for Music Information Retrieval Conference) 2015 on analyzing music using CNN-based deep learning.
Authors: Keunwoo Choi (Queen Mary University of London), Jeonghee Kim (NAVER LABS), George Fazekas (Queen Mary University of London), Mark Sandler (Queen Mary University of London)
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks – Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
As AI technology is pushing into IT I was wondering myself, as an “infrastructure container kubernetes guy”, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefit’s both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Transcript: Selling digital books in 2024: Insights from industry leaders - T...BookNet Canada
The publishing industry has been selling digital audiobooks and ebooks for over a decade and has found its groove. What’s changed? What has stayed the same? Where do we go from here? Join a group of leading sales peers from across the industry for a conversation about the lessons learned since the popularization of digital books, best practices, digital book supply chain management, and more.
Link to video recording: https://bnctechforum.ca/sessions/selling-digital-books-in-2024-insights-from-industry-leaders/
Presented by BookNet Canada on May 28, 2024, with support from the Department of Canadian Heritage.
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
We believe integration and automation are essential to user experience and the promise of efficient work through technology. Automation is the critical ingredient to realizing that full vision. We develop integration products and services for Bonterra Case Management software to support the deployment of automations for a variety of use cases.
This video focuses on the notifications, alerts, and approval requests using Slack for Bonterra Impact Management. The solutions covered in this webinar can also be deployed for Microsoft Teams.
Interested in deploying notification automations for Bonterra Impact Management? Contact us at sales@sidekicksolutionsllc.com to discuss next steps.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
• The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
• Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
• Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
• Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Builder.ai Founder Sachin Dev Duggal's Strategic Approach to Create an Innova...Ramesh Iyer
In today's fast-changing business world, Companies that adapt and embrace new ideas often need help to keep up with the competition. However, fostering a culture of innovation takes much work. It takes vision, leadership and willingness to take risks in the right proportion. Sachin Dev Duggal, co-founder of Builder.ai, has perfected the art of this balance, creating a company culture where creativity and growth are nurtured at each stage.
Monitoring Java Application Security with JDK Tools and JFR Events
Depth Estimation of Sound Images Using Directional Clustering and Activation-Shared Nonnegative Matrix Factorization
1. Depth Estimation of Sound Images Using
Directional Clustering and Activation-Shared
Nonnegative Matrix Factorization
Tomo Miyauchi, Daichi Kitamura,
Hiroshi Saruwatari, Satoshi Nakamura
(Nara Institute of Science and Technology, Japan)
2. Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation-shared nonnegative matrix factorization
Experiments
Conclusions
2
3. Background
With the advent of 3D TV, the reproduction of 3D images has been realized.
However, a 3D sound reproduction system has not been established yet.
Problem: the viewer feels uncomfortable because the picture image and the sound image do not match.
[Figure: a 3D TV with the positions of the picture image and the sound image]
To solve this problem, sound field reproduction techniques have been studied actively.
They can present the "direction" and "depth" of the sound images to the listener.
3
4. Related study: wave field synthesis
Sound field reproduction
Wave Field Synthesis (WFS) [A. J. Berkhout, et al., 1993]
WFS allows us to create sound images in front of the loudspeakers, that is, to represent the "depth" of sound images.
[Figure: loudspeaker array, sound image, and listener]
× Drawback of WFS
WFS requires the primary source information of sound images:
1. Individual sound source
2. Localization information
This information has been lost in existing contents by down-mixing.
↓
Up-mixing methods are required:
1. Source separation (mixed signal → individual sources)
2. Localization estimation of sound images
4
5. Flow of proposed up-mixer
Spatial sound system using existing contents
[Flow diagram: stereo contents (mixed multichannel signal) → 1. sound source separation → 2. directional estimation and depth estimation → wave field synthesis → spatial sound reproduction. Diagram labels: conventional method, this study, new (depth estimation).]
Depth estimation of sound images has not been proposed.
5
6. Related study: directional clustering [Araki, et al., 2007]
[Figure: flow of directional clustering. The L-ch and R-ch input signals of the mixed stereo signal are Fourier transformed, the source components are normalized and clustered around spatial representative vectors, and the individual sources of each cluster are obtained by the inverse Fourier transform.]
6
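The clustering step on this slide can be sketched roughly as follows. This is a minimal illustration, not the implementation of [Araki, et al., 2007]: it assumes magnitude spectrograms as input, uses a cosine-similarity k-means, and the function name `directional_clustering` and all parameters are hypothetical.

```python
import numpy as np

def directional_clustering(mag_l, mag_r, n_clusters=3, n_iter=20):
    """Cluster time-frequency bins by the direction of their (L, R) magnitude vector.

    Assumption: inputs are nonnegative magnitude spectrograms of equal shape.
    """
    # One 2-D vector per time-frequency bin.
    vecs = np.stack([mag_l.ravel(), mag_r.ravel()], axis=1)
    # Normalization: keep only the direction of each vector.
    norms = np.linalg.norm(vecs, axis=1, keepdims=True)
    unit = vecs / np.maximum(norms, 1e-12)
    # Initial representative vectors, evenly spaced from "all L" to "all R".
    angles = np.linspace(0.0, np.pi / 2, n_clusters)
    centroids = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    for _ in range(n_iter):
        # Assign each bin to the representative vector with the largest cosine similarity.
        labels = np.argmax(unit @ centroids.T, axis=1)
        for k in range(n_clusters):
            total = unit[labels == k].sum(axis=0)
            length = np.linalg.norm(total)
            if length > 0:
                centroids[k] = total / length
    return labels.reshape(mag_l.shape), centroids
```

Each cluster label can then be used as a binary mask on the stereo spectrogram before the inverse transform.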
7. Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation-shared multichannel NMF
Experiments
Conclusions
7
8. Problem and purpose
Problem: WFS requires specific localization information of individual sound sources to reproduce a sound field.
Up-mixer
Directional estimation methods have been developed, e.g., directional estimation based on VBAP [Hirata, et al., 2011].
Purpose: establishing a new depth estimation method.
How can we get depth information?
Proposed method: a depth estimation method using the direction of arrival (DOA) distribution.
8
9. Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation-shared multichannel NMF
Experiments
Conclusions
9
10. Proposed method 1: depth estimation based on DOA
DOA
→ “Direction of arrival” of sound waves
We estimate the depth using the DOA distribution.
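[Figure: the mixed signal is split into individual sources by directional clustering, and a weighted DOA histogram is built for each source. Directional information: amplitude ratio of the two channels. Weighting term: magnitude of each vector. Horizontal axis: direction of arrival (left, center, right). Vertical axis: frequency of source components.]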
10
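A weighted DOA histogram of this kind could be computed as in the sketch below. The mapping from amplitude ratio to an angle (0 degrees for hard left, 45 for center, 90 for hard right) is an assumption made for illustration, since the slide's own formula was not preserved; `weighted_doa_histogram` is a hypothetical name.

```python
import numpy as np

def weighted_doa_histogram(mag_l, mag_r, n_bins=90):
    """Histogram of panning angles, weighted by the magnitude of each (L, R) vector."""
    # Directional information: angle derived from the R/L amplitude ratio (assumed mapping).
    theta = np.degrees(np.arctan2(mag_r, mag_l)).ravel()
    # Weighting term: magnitude of each time-frequency vector.
    weight = np.hypot(mag_l, mag_r).ravel()
    hist, edges = np.histogram(theta, bins=n_bins, range=(0.0, 90.0), weights=weight)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return hist, centers
```

Weighting by magnitude keeps low-energy bins, whose direction is unreliable, from dominating the histogram.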
11. Proposed method 1: depth estimation based on DOA
In sound fields, when a sound source is far from the listener, sound waves arrive from various directions owing to sound diffusion.
Difference of DOA shape corresponding to source distance:
Close source: the observed DOA histogram becomes spiky.
Far source: the observed DOA histogram becomes smooth.
[Figure: DOA histograms of a close source and a far source. Horizontal axis: direction of arrival. Vertical axis: frequency of source components.]
The observed DOA distribution of the target source can be used as a cue for depth estimation.
11
12. Proposed method 1: modeling of DOA distribution
To model the DOA distribution, we propose a new modeling method using the GGD.
Generalized Gaussian distribution: GGD [Box, et al., 1973]
Flexible family of probability density functions (PDFs)
The shape of the GGD changes depending on the shape parameter βshape.
βshape = 2: Gaussian distribution PDF
βshape = 1: Laplacian distribution PDF
Definition of GGD
12
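For reference, one common parameterization of the GGD density is given below. The symbols μ (location) and α (scale) are assumptions introduced here, because the slide's own equation was not preserved; β corresponds to βshape.

```latex
p(x;\, \mu, \alpha, \beta)
  = \frac{\beta}{2\alpha\,\Gamma(1/\beta)}
    \exp\!\left( -\left( \frac{|x - \mu|}{\alpha} \right)^{\beta} \right)
```

With β = 2 this reduces to a Gaussian PDF and with β = 1 to a Laplacian PDF, matching the two special cases listed on the slide.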
13. Proposed method 1: modeling of DOA distribution
Modeling of DOA distribution based on GGD parameter
[Figure: DOA distributions of a close source and a far source. Horizontal axis: direction of arrival. Vertical axis: frequency of source components.]
We propose a new depth estimation method based on the GGD.
The shape parameter βshape is utilized as the metric.
Source is close ⇔ βshape is small
Source is far ⇔ βshape is large
13
14. Proposed method 2: problem in proposed method 1
× Problem of signal processing
Normalization problem: small noise components are enhanced.
[Figure: DOA histogram obtained from binaural-recorded L-ch and R-ch input signals, with noise visible. Horizontal axis: DOA (left, center, right). Vertical axis: frequency of source components.]
Background noise and artificial distortion generated by signal processing interfere with the DOA histogram.
→ Feature extraction: activation-shared multichannel NMF
14
15. Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation-shared multichannel NMF
Experiments
Conclusions
15
16. Proposed method 2: activation-shared multichannel NMF
Nonnegative matrix factorization: NMF [Lee, et al., 2001]
NMF is a sparse representation.
NMF can extract significant features from the observed matrix.
[Figure: the observed matrix (spectrogram, frequency × time) is approximated by the product of the basis matrix (spectral patterns) and the activation matrix (time-varying gain). Dimensions: Ω (number of frequency bins), number of time frames, number of bases.]
The sparse representation provides high performance for noise reduction, compression, and feature extraction.
We use it to eliminate background noise and artificial distortion.
16
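The factorization on this slide can be illustrated with the standard multiplicative updates for the squared Euclidean distance. This is a minimal sketch under that assumption; the names `nmf`, `F` (basis matrix), and `G` (activation matrix) are chosen here for illustration and are not taken from the slides.

```python
import numpy as np

def nmf(Y, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Approximate a nonnegative matrix Y (frequency x time) as F @ G."""
    rng = np.random.default_rng(seed)
    n_freq, n_time = Y.shape
    F = rng.random((n_freq, n_bases)) + eps   # basis matrix (spectral patterns)
    G = rng.random((n_bases, n_time)) + eps   # activation matrix (time-varying gain)
    for _ in range(n_iter):
        # Multiplicative updates keep every entry nonnegative.
        F *= (Y @ G.T) / (F @ G @ G.T + eps)
        G *= (F.T @ Y) / (F.T @ F @ G + eps)
    return F, G
```

Choosing fewer bases than frequency bins forces a low-rank approximation, which is what suppresses components that do not fit the learned spectral patterns.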
17. Proposed method 2: problem of conventional NMF
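Conventional NMF
[Figure: NMFs are applied in parallel to the L-ch and R-ch signals, and the directional information (amplitude ratio) is computed from their outputs.]
The bases of the two channels are trained independently, so they are uncorrelated.
Conventional NMFs generate an artificial fluctuation.
DOA information is disturbed.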
17
18. Proposed method 2: activation-shared multichannel NMF
Proposed method
Activation-shared multichannel NMF
[Figure: NMF is applied to the L-ch and R-ch signals with a common activation matrix.]
The activation matrix is shared through all channels.
This reduces the dimensionality of the input signal while maintaining directional information.
Cost function
[Equation; its legend defined the cost function, the β-divergence, and the entries of the matrices.]
18
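A cost function of the kind described on this slide can be written as below. This is a sketch under assumptions: the symbols are introduced here (c indexes channels, y the observed entries, f the entries of the channel-wise basis matrices, g the entries of the shared activation matrix), and the β-divergence is given in one standard parameterization, which may differ from the one used on the original slide.

```latex
\mathcal{J} = \sum_{c} \sum_{\omega, t}
  d_{\beta}\!\left( y_{c,\omega t} \,\middle|\, \sum_{k} f_{c,\omega k}\, g_{k t} \right),
\qquad
d_{\beta}(y \mid x) = \frac{y^{\beta}}{\beta(\beta - 1)} + \frac{x^{\beta}}{\beta}
  - \frac{y\, x^{\beta - 1}}{\beta - 1}
\quad (\beta \neq 0, 1)
```

The activation entries carry no channel index, which is what ties the channels together.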
20. Proposed method 2: activation-shared multichannel NMF
Derivation of optimal variables
The auxiliary function method is an optimization scheme that uses an upper bound function.
1. Design the auxiliary function as an upper bound of the cost function.
2. Minimize the original cost function indirectly by minimizing the auxiliary function.
Using β-divergence
20
21. Proposed method 2: activation-shared multichannel NMF
Cost function
The first and second terms become convex or concave functions depending on the value of β.
[Table: convexity (convex or concave) of each term for each range of β]
21
22. Proposed method 2: activation-shared multichannel NMF
Cost function
An upper bound function of each term is defined by applying
Convex function: Jensen's inequality
Concave function: tangent line inequality
22
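The two inequalities named on this slide have the following general forms. The notation (f, x_k, λ_k, and the expansion point x̃) is assumed here, since the slide's own equations were not preserved.

```latex
% Convex f: Jensen's inequality with weights \lambda_k > 0, \sum_k \lambda_k = 1
f\!\left( \sum_k x_k \right) \le \sum_k \lambda_k\, f\!\left( \frac{x_k}{\lambda_k} \right)

% Concave f: tangent line at an arbitrary point \tilde{x}
f(x) \le f(\tilde{x}) + f'(\tilde{x})\,(x - \tilde{x})
```

Both bounds are tight for a suitable choice of the auxiliary variables (λ_k proportional to x_k, and x̃ = x), which is why minimizing the bound also decreases the original cost.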
23. Proposed method 2: activation-shared multichannel NMF
The update rules for optimization are obtained from the derivative of the auxiliary function w.r.t. each objective variable.
Update rules
[Equations; their legend defined the entries of the matrices.]
23
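To make the idea of a shared activation concrete, the sketch below gives multiplicative updates for the squared Euclidean distance only. It is not the β-divergence update rule derived in the slides, which was not preserved; the function name `activation_shared_nmf` and its parameters are hypothetical.

```python
import numpy as np

def activation_shared_nmf(Y_list, n_bases, n_iter=200, seed=0, eps=1e-12):
    """Factorize each channel Y_c as F_c @ G with one activation matrix G for all channels.

    Assumption: squared Euclidean cost, all channels share the same number of time frames.
    """
    rng = np.random.default_rng(seed)
    n_time = Y_list[0].shape[1]
    F_list = [rng.random((Y.shape[0], n_bases)) + eps for Y in Y_list]
    G = rng.random((n_bases, n_time)) + eps
    for _ in range(n_iter):
        # Each channel keeps its own basis matrix.
        for c, Y in enumerate(Y_list):
            F_list[c] *= (Y @ G.T) / (F_list[c] @ G @ G.T + eps)
        # The activation update accumulates evidence from every channel.
        num = sum(F.T @ Y for F, Y in zip(F_list, Y_list))
        den = sum(F.T @ F @ G for F in F_list) + eps
        G *= num / den
    return F_list, G
```

Because both channels are reconstructed with the same time-varying gains, the amplitude ratio between channels is determined only by the bases, which avoids the independent fluctuations that parallel NMFs introduce.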
24. Flow of proposed depth estimation method
Input stereo signal (L-ch, R-ch)
→ STFT
→ Cluster L, Cluster C, Cluster R (by direction of arrival)
→ Activation-shared NMF (applied to each cluster)
→ Weighted DOA histogram (horizontal axis: direction of arrival; vertical axis: frequency of source components)
→ Depth estimation (for each cluster)
We can estimate depth information by calculating the shape parameter of the DOA histogram.
24
25. Outline
Background and related study
Problem and purpose
Proposed method 1
- Depth estimation based on DOA distribution
Proposed method 2
- Activation-shared multichannel NMF
Experiments
Conclusions
25
26. Experimental conditions
Conditions
[Table: mixing source parameter (test sources 1 to 3), reverberation time, intervals, NMF beta, NMF basis]
[Figure: layout of the target source and the interference sources]
Mixed stereo signals consist of 3 instruments.
The target source is located at the center with 7 distances.
There are 6 patterns of source combinations related to direction.
Compared methods:
Conventional method 1: weighted DOA histogram (not processed by NMF)
Conventional method 2: processed by conventional NMF
Proposed method: processed by proposed NMF
26
27. Experimental conditions
Image method [Allen, et al., 1979]
Technique of simulating room impulse response
[Figure: geometry of image method, showing the real source and an image source]
The following parameters can be set arbitrarily:
- Volume of room
- Source location
- Microphone location
- Absorption coefficient
Reference sound sources were generated using the image method.
[Figure: example of room impulse response. Horizontal axis: time index. Vertical axis: amplitude.]
27
28. Experimental results
Results 1
[Figure: estimation results for data set 1, with markers for the target source and the interference sources]
Data set 1
Target source: Vocal
Interference source (left): Piano
Interference source (right): Guitar
・ Results of the conventional methods do not agree with the oracle (image method).
・ The proposed method correctly estimates the distance of the target source.
28
29. Experimental results: correlation coefficient
Results 2
Correlation coefficient
between reference value
and estimated value
Table: Correlation coefficient of each method
Data set  Target source  Interference source (left)  Interference source (right)  Conventional method 1  Conventional method 2  Proposed method
1         Vocal          Piano                       Guitar                       0.350                  0.189                  0.986
2         Vocal          Guitar                      Piano                        0.532                  0.165                  0.925
3         Guitar         Piano                       Vocal                        0.154                  0.044                  0.777
4         Guitar         Vocal                       Piano                        0.277                  -0.037                 0.651
5         Piano          Vocal                       Guitar                       0.602                  0.426                  0.791
6         Piano          Guitar                      Vocal                        0.496                  0.157                  0.856
• A strong relation between the estimated value of the proposed method and the distance of the target source is indicated.
• The efficacy of the proposed method is confirmed.
29
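The correlation coefficient used for this evaluation can be computed as in the following sketch, assuming the Pearson coefficient is meant. The helper name `depth_correlation` is hypothetical, and the numbers in the usage note are made up for illustration and are not experimental data.

```python
import numpy as np

def depth_correlation(reference_depth, estimated_metric):
    """Pearson correlation between reference distances and an estimated depth metric."""
    return float(np.corrcoef(reference_depth, estimated_metric)[0, 1])
```

For example, `depth_correlation([1, 2, 3, 4], [2, 4, 6, 8])` gives a value of about 1 because the two sequences are perfectly linearly related, while a metric unrelated to distance would give a value near 0.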
30. Conclusions
We proposed a new depth estimation method for a sound source in a mixed signal using the shape of the DOA distribution.
The shape of the DOA distribution is modeled by the GGD.
We also proposed a new feature extraction method for multichannel signals, activation-shared multichannel NMF.
The results of the experiment indicated the efficacy of the proposed method.
30
32. Derivation of parameter βshape
× The maximum-likelihood-based shape parameter estimation has no closed-form solution in the GGD.
We propose a closed-form parameter estimation algorithm based on some approximation and kurtosis.
Kurtosis of DOA histogram
Moment of GGD
Relation equation of kurtosis and shape parameter
[Equations; their legend defined the observed DOA histogram and the gamma function.]
32
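The relation between kurtosis and the shape parameter can be illustrated as follows. For a GGD, the kurtosis equals Γ(5/β)Γ(1/β)/Γ(3/β)², which decreases monotonically as β grows. The sketch inverts this relation numerically by bisection, which is a substitute chosen here for simplicity and not the closed-form estimator proposed in the slides; all function names are hypothetical.

```python
import math
import numpy as np

def ggd_kurtosis(beta):
    """Kurtosis of a GGD with shape parameter beta (log-gamma used for stability)."""
    return math.exp(math.lgamma(5.0 / beta) + math.lgamma(1.0 / beta)
                    - 2.0 * math.lgamma(3.0 / beta))

def histogram_kurtosis(centers, hist):
    """Kurtosis of a histogram treated as a discrete distribution over bin centers."""
    p = hist / hist.sum()
    mean = np.sum(p * centers)
    m2 = np.sum(p * (centers - mean) ** 2)
    m4 = np.sum(p * (centers - mean) ** 4)
    return m4 / m2 ** 2

def estimate_shape(kurtosis, lo=0.2, hi=20.0, n_iter=100):
    """Find beta with ggd_kurtosis(beta) == kurtosis by bisection on [lo, hi]."""
    for _ in range(n_iter):
        mid = 0.5 * (lo + hi)
        if ggd_kurtosis(mid) > kurtosis:
            lo = mid   # kurtosis too high means beta is too small
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

A spiky histogram has high kurtosis and therefore a small estimated shape parameter, while a smooth histogram has lower kurtosis and a larger one, consistent with the close/far rule stated earlier.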
33. Derivation of parameter βshape
× There is no exact closed-form solution of the inverse function.
Introduce the modified Stirling's formula as an approximation of the gamma function.
Modified Stirling's formula
Take a logarithm
33
34. Derivation of parameter βshape
This results in a quadratic equation to be solved.
From it, we can derive the closed-form estimate of the shape parameter.
Preparation of the depth estimation method is completed.
34
35. Proposed method 2: activation-shared multichannel NMF
Preliminary experiment
Example of weighted DOA histogram: center cluster DOA of the mixed source (3 instruments)
(Individually applied) conventional NMF on L-ch and R-ch: fluctuations are generated in the DOA histogram.
(Activation-shared) proposed NMF on L-ch and R-ch: feature extraction while maintaining directional information.
[Figures: weighted DOA histograms for each case. Horizontal axis: direction of arrival [degree].]
35