IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Analysis of PEAQ Model using Wavelet Decomposition Techniquesidescitation
Digital broadcasting, internet audio and music database make use of audio
compression and coding techniques to reduce high quality audio signal without impairing its
perceptual quality. Audio signal compression is the lossy compression
technique, It
converts original converting audio signal into compressed bitstream. The compressed audio
bitstream is decoded at the decoder to produce a close approximation of the original signal.
For the purpose of improving the coding this work attempts to verify the perceptual
evaluation of audio quality (PEAQ) model in BS.1387 using wavelet decomposition
techniques. Finally the comparison of masking threshold for sub-bands using Wavelet
techniques and Fast Fourier transform (FFT) will be done
Audio Noise Removal – The State of the Artijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...IOSR Journals
This paper deals with residual musical noise which results from the perceptual speech enhancement
type algorithms and especially using wiener filtering approach. Perceptual speech enhancement techniques
perform better than the non perceptual techniques, most of them still return a trouble residual musical noise.
This is due to that only noise above the noise masking threshold (NMT) is filtered out then noise below the noise
masking threshold (NMT) can become audible if its maskers are filtered. It can affect the performance of
perceptual speech enhancement method that process the audible noise only (Residual noise is still present). In
order to overcome this drawback a new speech enhancement technique is proposed here.The main aim here is to improve the enhanced speech signal quality provided by perceptual wiener filtering and by controlling the
latter via a second filter regarded as a psychoacoustically motivated weighting factor. The simulation results
gives the information that the performance is improved compared to other perceptual speech enhancement
methods
Noise reduction in speech processing using improved active noise control (anc...eSAT Journals
Abstract An improved feed forward adaptive Active Noise Control (ANC) scheme is proposed by using Voice Activity Detector (VAD) and wiener filtering method. The ‘speech-plus-noise’ periods and ‘noise-only’ periods are separated using VAD and the unwanted noise is removed by adaptive filtering method. By using Speech Distortion Weighted- Multichannel Wiener Filtering (SDW-MWF) algorithm the noise periods which is present along with the speech signal, is processed and filtered out. The background noise along with the speech samples are removed by the iterative procedure of filtering process. Feed forward based ANC is used to achieve a system with a better noise reduction in speech processing. Adaptive filtering process is carried out and the speech signal without the background noise can be achieved. Key Words: Active Noise Control (ANC), Noise reduction, Adaptive filtering, Feed forward ANC
Noise reduction in speech processing using improved active noise control (anc...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Analysis of PEAQ Model using Wavelet Decomposition Techniquesidescitation
Digital broadcasting, internet audio and music database make use of audio
compression and coding techniques to reduce high quality audio signal without impairing its
perceptual quality. Audio signal compression is the lossy compression
technique, It
converts original converting audio signal into compressed bitstream. The compressed audio
bitstream is decoded at the decoder to produce a close approximation of the original signal.
For the purpose of improving the coding this work attempts to verify the perceptual
evaluation of audio quality (PEAQ) model in BS.1387 using wavelet decomposition
techniques. Finally the comparison of masking threshold for sub-bands using Wavelet
techniques and Fast Fourier transform (FFT) will be done
Audio Noise Removal – The State of the Artijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...IOSR Journals
This paper deals with residual musical noise which results from the perceptual speech enhancement
type algorithms and especially using wiener filtering approach. Perceptual speech enhancement techniques
perform better than the non perceptual techniques, most of them still return a trouble residual musical noise.
This is due to that only noise above the noise masking threshold (NMT) is filtered out then noise below the noise
masking threshold (NMT) can become audible if its maskers are filtered. It can affect the performance of
perceptual speech enhancement method that process the audible noise only (Residual noise is still present). In
order to overcome this drawback a new speech enhancement technique is proposed here.The main aim here is to improve the enhanced speech signal quality provided by perceptual wiener filtering and by controlling the
latter via a second filter regarded as a psychoacoustically motivated weighting factor. The simulation results
gives the information that the performance is improved compared to other perceptual speech enhancement
methods
Noise reduction in speech processing using improved active noise control (anc...eSAT Journals
Abstract An improved feed forward adaptive Active Noise Control (ANC) scheme is proposed by using Voice Activity Detector (VAD) and wiener filtering method. The ‘speech-plus-noise’ periods and ‘noise-only’ periods are separated using VAD and the unwanted noise is removed by adaptive filtering method. By using Speech Distortion Weighted- Multichannel Wiener Filtering (SDW-MWF) algorithm the noise periods which is present along with the speech signal, is processed and filtered out. The background noise along with the speech samples are removed by the iterative procedure of filtering process. Feed forward based ANC is used to achieve a system with a better noise reduction in speech processing. Adaptive filtering process is carried out and the speech signal without the background noise can be achieved. Key Words: Active Noise Control (ANC), Noise reduction, Adaptive filtering, Feed forward ANC
Noise reduction in speech processing using improved active noise control (anc...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...sipij
Usually, hearing impaired people use hearing aids which are implemented with speech enhancement
algorithms. Estimation of speech and estimation of nose are the components in single channel speech
enhancement system. The main objective of any speech enhancement algorithm is estimation of noise power
spectrum for non stationary environment. VAD (Voice Activity Detector) is used to identify speech pauses
and during these pauses only estimation of noise. MMSE (Minimum Mean Square Error) speech
enhancement algorithm did not enhance the intelligibility, quality and listener fatigues are the perceptual
aspects of speech. Novel evaluation approach SR (Signal to Residual spectrum ratio) based on uncertainty
parameter introduced for the benefits of hearing impaired people in non stationary environments to control
distortions. By estimation and updating of noise based on division of original pure signal into three parts
such as pure speech, quasi speech and non speech frames based on multiple threshold conditions. Different
values of SR and LLR demonstrate the amount of attenuation and amplification distortions. The proposed
method will compared with any one method WAT(Weighted Average Technique) Hence by using
parameters SR (signal to residual spectrum ratio) and LLR (log like hood ratio), MMSE (Minim Mean
Square Error) in terms of segmented SNR and LLR.
In this paper, the performances of adaptive noise cancelling system employing Least Mean Square (LMS) algorithm are studied considering both white Gaussian noise (Case 1) and colored noise (Case 2)
situations. Performance is analysed with varying number of iterations, Signal to Noise Ratio (SNR) and tap size with considering Mean Square Error (MSE) as the performance measurement criteria. Results show that the noise reduction is better as well as convergence speed is faster for Case 2 as compared with Case 1. It is also observed that MSE decreases with increasing SNR with relatively faster decrease of MSE in Case 2 as compared with Case 1, and on average MSE increases linearly with increasing number of filter
coefficients for both type of noise situations. All the experiments have been done using computer
simulations implemented on MATLAB platform.
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionIOSRJVSP
This paper is aimed to reduce background noise introduced in speech signal during capture, storage, transmission and processing using Spectral Subtraction algorithm. To consider the fact that colored noise corrupts the speech signal non-uniformly over different frequency bands, Multi-Band Spectral Subtraction (MBSS) approach is exploited wherein amount of noise subtracted from noisy speech signal is decided by a weighting factor. Choice of optimal values of weights decides the performance of the speech enhancement system. In this paper weights are decided based on SFM (Spectral Flatness Measure) than conventional SNR (Signal to Noise Ratio) based rule. Since SFM is able to provide true distinction between speech signal and noise signal. Spectrogram, Mean Opinion Score show that speech enhanced from proposed SFM based MBSS possess better perceptual quality and improved intelligibility than existing SNR based MBSS
Adaptive wavelet thresholding with robust hybrid features for text-independe...IJECEIAES
The robustness of speaker identification system over additive noise channel is crucial for real-world applications. In speaker identification (SID) systems, the extracted features from each speech frame are an essential factor for building a reliable identification system. For clean environments, the identification system works well; in noisy environments, there is an additive noise, which is affect the system. To eliminate the problem of additive noise and to achieve a high accuracy in speaker identification system a proposed algorithm for feature extraction based on speech enhancement and a combined features is presents. In this paper, a wavelet thresholding pre-processing stage, and feature warping (FW) techniques are used with two combined features named power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC) to improve the identification system robustness against different types of additive noises. Universal background model Gaussian mixture model (UBM-GMM) is used for features matching between the claim and actual speakers. The results showed performance improvement for the proposed feature extraction algorithm of identification system comparing with conventional features over most types of noises and different SNR ratios.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
발표자: 김준태 (KAIST 박사과정)
발표일: 2018.10
Voice activity detection (VAD) and speech enhancement (SE) are important front-end technologies for noise robust speech recognition system.
From incoming noisy signal, VAD detects the speech signal only and SE removes the noise signal while conserving the speech signal.
For VAD and SE, this presentation will cover the traditional methods, deep learning based methods, and our papers as follows:
1. J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
2. J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Also, this presentation will briefly introduce some experimental results in real-world environment (far-field, noisy environment), conducted on the embedded board.
For VAD,
Traditional VAD methods.
Deep learning based VAD methods.
Paper presentation: J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
End point detection based on VAD.
Experimental results of DNN-EPD on embedded board in real-world environment.
For SE,
Traditional SE methods.
Deep learning based SE methods.
Paper presentation: J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Experimental results in real-world environment.
Improvement of minimum tracking in Minimum Statistics noise estimation methodCSCJournals
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper we propose a new method for minimum tracking in Minimum Statistics (MS) noise estimation method. This noise estimation algorithm is proposed for highly nonstationary noise environments. This was confirmed with formal listening tests which indicated that the proposed noise estimation algorithm when integrated in speech enhancement was preferred over other noise estimation algorithms.
Speech Emotion Recognition is a recent research topic in the Human Computer Interaction (HCI) field. The need has risen for a more natural communication interface between humans and computer, as computers have become an integral part of our lives. A lot of work currently going on to improve the interaction between humans and computers. To achieve this goal, a computer would have to be able to distinguish its present situation and respond differently depending on that observation. Part of this process involves understanding a user‟s emotional state. To make the human computer interaction more natural, the objective is that computer should be able to recognize emotional states in the same as human does. The efficiency of emotion recognition system depends on type of features extracted and classifier used for detection of emotions. The proposed system aims at identification of basic emotional states such as anger, joy, neutral and sadness from human speech. While classifying different emotions, features like MFCC (Mel Frequency Cepstral Coefficient) and Energy is used. In this paper, Standard Emotional Database i.e. English Database is used which gives the satisfactory detection of emotions than recorded samples of emotions. This methodology describes and compares the performances of Learning Vector Quantization Neural Network (LVQ NN), Multiclass Support Vector Machine (SVM) and their combination for emotion recognition.
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color ImagesIDES Editor
In this paper a novel filtering design intended for
the impulsive noise removal in color images is presented.
The described scheme utilizes the rank weighted cumulated
distances between the pixels belonging to the local filtering
window. The impulse detection scheme is based on the
difference between the aggregated weighted distances assigned
to the central pixel of the window and the minimum value,
which corresponds to the rank weighted vector median. If the
difference exceeds an adaptively determined threshold value,
then the processed pixel is replaced by the mean of the
neighboring pixels, which were found to be not corrupted,
otherwise it is retained. The important feature of the described
filtering framework is its ability to effectively suppress
impulsive noise, while preserving fine image details. The
comparison with the state-of-the-art denoising schemes
revealed that the proposed filter yields better restoration
results in terms of objective restoration quality measures.
Broad phoneme classification using signal based featuresijsc
Speech is the most efficient and popular means of human communication Speech is produced as a sequence
of phonemes. Phoneme recognition is the first step performed by automatic speech recognition system. The
state-of-the-art recognizers use mel-frequency cepstral coefficients (MFCC) features derived through short
time analysis, for which the recognition accuracy is limited. Instead of this, here broad phoneme
classification is achieved using features derived directly from the speech at the signal level itself. Broad
phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified
useful for broad phoneme classification are voiced/unvoiced decision, zero crossing rate (ZCR), short time
energy, most dominant frequency, energy in most dominant frequency, spectral flatness measure and first
three formants. Features derived from short time frames of training speech are used to train a multilayer
feedforward neural network based classifier with manually marked class label as output and classification
accuracy is then tested. Later this broad phoneme classifier is used for broad syllable structure prediction
which is useful for applications such as automatic speech recognition and automatic language
identification.
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnijcsa
In this paper we present text dependent speaker recognition with an enhancement of detecting the emotion
of the speaker prior using the hybrid FFBN and GMM methods. The emotional state of the speaker
influences recognition system. Mel-frequency Cepstral Coefficient (MFCC) feature set is used for
experimentation. To recognize the emotional state of a speaker Gaussian Mixture Model (GMM) is used in
training phase and in testing phase Feed Forward Back Propagation Neural Network (FFBNN). Speech
database consisting of 25 speakers recorded in five different emotional states: happy, angry, sad, surprise
and neutral is used for experimentation. The results reveal that the emotional state of the speaker shows a
significant impact on the accuracy of speaker recognition.
Teager Energy Operation on Wavelet Packet Coefficients for Enhancing Noisy Sp...CSCJournals
In this paper a new thresholding based speech enhancement approach is presented, where the threshold is statistically determined by employing the Teager energy operation on the Wavelet Packet (WP) coefficients of noisy speech. The threshold thus obtained is applied on the WP coefficients of the noisy speech by using a hard thresholding function in order to obtain an enhanced speech. Detailed simulations are carried out in the presence of white, car, pink, and babble noises to evaluate the performance of the proposed method. Standard objective measures, spectrogram representations and subjective listening tests show that the proposed method outperforms the existing state-of-the-art thresholding based speech enhancement approaches for noisy speech from high to low levels of SNR.
This paper contains a report on an Audio-Visual Client Recognition System using Matlab software which identifies five clients and can be improved to identify as many clients as possible depending on the number of clients it is trained to identify which was successfully implemented. The implementation was accomplished first by visual recognition system implemented using The Principal Component Analysis, Linear Discriminant Analysis and Nearest Neighbour Classifier. A successful implementation of second part was achieved by audio recognition using Mel-Frequency Cepstrum Coefficient, Linear Discriminant Analysis and Nearest Neighbour Classifier the system was tested using images and sounds that have not been trained to the system to see whether it can detect an intruder which lead us to a very successful result with précised response to intruder.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...sipij
Usually, hearing impaired people use hearing aids which are implemented with speech enhancement
algorithms. Estimation of speech and estimation of nose are the components in single channel speech
enhancement system. The main objective of any speech enhancement algorithm is estimation of noise power
spectrum for non stationary environment. VAD (Voice Activity Detector) is used to identify speech pauses
and during these pauses only estimation of noise. MMSE (Minimum Mean Square Error) speech
enhancement algorithm did not enhance the intelligibility, quality and listener fatigues are the perceptual
aspects of speech. Novel evaluation approach SR (Signal to Residual spectrum ratio) based on uncertainty
parameter introduced for the benefits of hearing impaired people in non stationary environments to control
distortions. By estimation and updating of noise based on division of original pure signal into three parts
such as pure speech, quasi speech and non speech frames based on multiple threshold conditions. Different
values of SR and LLR demonstrate the amount of attenuation and amplification distortions. The proposed
method will compared with any one method WAT(Weighted Average Technique) Hence by using
parameters SR (signal to residual spectrum ratio) and LLR (log like hood ratio), MMSE (Minim Mean
Square Error) in terms of segmented SNR and LLR.
In this paper, the performances of adaptive noise cancelling system employing Least Mean Square (LMS) algorithm are studied considering both white Gaussian noise (Case 1) and colored noise (Case 2)
situations. Performance is analysed with varying number of iterations, Signal to Noise Ratio (SNR) and tap size with considering Mean Square Error (MSE) as the performance measurement criteria. Results show that the noise reduction is better as well as convergence speed is faster for Case 2 as compared with Case 1. It is also observed that MSE decreases with increasing SNR with relatively faster decrease of MSE in Case 2 as compared with Case 1, and on average MSE increases linearly with increasing number of filter
coefficients for both type of noise situations. All the experiments have been done using computer
simulations implemented on MATLAB platform.
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionIOSRJVSP
This paper is aimed to reduce background noise introduced in speech signal during capture, storage, transmission and processing using Spectral Subtraction algorithm. To consider the fact that colored noise corrupts the speech signal non-uniformly over different frequency bands, Multi-Band Spectral Subtraction (MBSS) approach is exploited wherein amount of noise subtracted from noisy speech signal is decided by a weighting factor. Choice of optimal values of weights decides the performance of the speech enhancement system. In this paper weights are decided based on SFM (Spectral Flatness Measure) than conventional SNR (Signal to Noise Ratio) based rule. Since SFM is able to provide true distinction between speech signal and noise signal. Spectrogram, Mean Opinion Score show that speech enhanced from proposed SFM based MBSS possess better perceptual quality and improved intelligibility than existing SNR based MBSS
Adaptive wavelet thresholding with robust hybrid features for text-independe...IJECEIAES
The robustness of speaker identification system over additive noise channel is crucial for real-world applications. In speaker identification (SID) systems, the extracted features from each speech frame are an essential factor for building a reliable identification system. For clean environments, the identification system works well; in noisy environments, there is an additive noise, which is affect the system. To eliminate the problem of additive noise and to achieve a high accuracy in speaker identification system a proposed algorithm for feature extraction based on speech enhancement and a combined features is presents. In this paper, a wavelet thresholding pre-processing stage, and feature warping (FW) techniques are used with two combined features named power normalized cepstral coefficients (PNCC) and gammatone frequency cepstral coefficients (GFCC) to improve the identification system robustness against different types of additive noises. Universal background model Gaussian mixture model (UBM-GMM) is used for features matching between the claim and actual speakers. The results showed performance improvement for the proposed feature extraction algorithm of identification system comparing with conventional features over most types of noises and different SNR ratios.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
발표자: 김준태 (KAIST 박사과정)
발표일: 2018.10
Voice activity detection (VAD) and speech enhancement (SE) are important front-end technologies for noise robust speech recognition system.
From incoming noisy signal, VAD detects the speech signal only and SE removes the noise signal while conserving the speech signal.
For VAD and SE, this presentation will cover the traditional methods, deep learning based methods, and our papers as follows:
1. J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
2. J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Also, this presentation will briefly introduce some experimental results in real-world environment (far-field, noisy environment), conducted on the embedded board.
For VAD,
Traditional VAD methods.
Deep learning based VAD methods.
Paper presentation: J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
End point detection based on VAD.
Experimental results of DNN-EPD on embedded board in real-world environment.
For SE,
Traditional SE methods.
Deep learning based SE methods.
Paper presentation: J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Experimental results in real-world environment.
Improvement of minimum tracking in Minimum Statistics noise estimation methodCSCJournals
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper we propose a new method for minimum tracking in Minimum Statistics (MS) noise estimation method. This noise estimation algorithm is proposed for highly nonstationary noise environments. This was confirmed with formal listening tests which indicated that the proposed noise estimation algorithm when integrated in speech enhancement was preferred over other noise estimation algorithms.
Speech Emotion Recognition is a recent research topic in the Human Computer Interaction (HCI) field. The need has risen for a more natural communication interface between humans and computer, as computers have become an integral part of our lives. A lot of work currently going on to improve the interaction between humans and computers. To achieve this goal, a computer would have to be able to distinguish its present situation and respond differently depending on that observation. Part of this process involves understanding a user‟s emotional state. To make the human computer interaction more natural, the objective is that computer should be able to recognize emotional states in the same as human does. The efficiency of emotion recognition system depends on type of features extracted and classifier used for detection of emotions. The proposed system aims at identification of basic emotional states such as anger, joy, neutral and sadness from human speech. While classifying different emotions, features like MFCC (Mel Frequency Cepstral Coefficient) and Energy is used. In this paper, Standard Emotional Database i.e. English Database is used which gives the satisfactory detection of emotions than recorded samples of emotions. This methodology describes and compares the performances of Learning Vector Quantization Neural Network (LVQ NN), Multiclass Support Vector Machine (SVM) and their combination for emotion recognition.
Reduced Ordering Based Approach to Impulsive Noise Suppression in Color ImagesIDES Editor
In this paper a novel filtering design intended for
the impulsive noise removal in color images is presented.
The described scheme utilizes the rank weighted cumulated
distances between the pixels belonging to the local filtering
window. The impulse detection scheme is based on the
difference between the aggregated weighted distances assigned
to the central pixel of the window and the minimum value,
which corresponds to the rank weighted vector median. If the
difference exceeds an adaptively determined threshold value,
then the processed pixel is replaced by the mean of the
neighboring pixels, which were found to be not corrupted,
otherwise it is retained. The important feature of the described
filtering framework is its ability to effectively suppress
impulsive noise, while preserving fine image details. The
comparison with the state-of-the-art denoising schemes
revealed that the proposed filter yields better restoration
results in terms of objective restoration quality measures.
Broad phoneme classification using signal based featuresijsc
Speech is the most efficient and popular means of human communication Speech is produced as a sequence
of phonemes. Phoneme recognition is the first step performed by automatic speech recognition system. The
state-of-the-art recognizers use mel-frequency cepstral coefficients (MFCC) features derived through short
time analysis, for which the recognition accuracy is limited. Instead of this, here broad phoneme
classification is achieved using features derived directly from the speech at the signal level itself. Broad
phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified
useful for broad phoneme classification are voiced/unvoiced decision, zero crossing rate (ZCR), short time
energy, most dominant frequency, energy in most dominant frequency, spectral flatness measure and first
three formants. Features derived from short time frames of training speech are used to train a multilayer
feedforward neural network based classifier with manually marked class label as output and classification
accuracy is then tested. Later this broad phoneme classifier is used for broad syllable structure prediction
which is useful for applications such as automatic speech recognition and automatic language
identification.
Automatic speech emotion and speaker recognition based on hybrid gmm and ffbnnijcsa
In this paper we present text dependent speaker recognition with an enhancement of detecting the emotion
of the speaker prior using the hybrid FFBN and GMM methods. The emotional state of the speaker
influences recognition system. Mel-frequency Cepstral Coefficient (MFCC) feature set is used for
experimentation. To recognize the emotional state of a speaker Gaussian Mixture Model (GMM) is used in
training phase and in testing phase Feed Forward Back Propagation Neural Network (FFBNN). Speech
database consisting of 25 speakers recorded in five different emotional states: happy, angry, sad, surprise
and neutral is used for experimentation. The results reveal that the emotional state of the speaker shows a
significant impact on the accuracy of speaker recognition.
Teager Energy Operation on Wavelet Packet Coefficients for Enhancing Noisy Sp...CSCJournals
In this paper a new thresholding based speech enhancement approach is presented, where the threshold is statistically determined by employing the Teager energy operation on the Wavelet Packet (WP) coefficients of noisy speech. The threshold thus obtained is applied on the WP coefficients of the noisy speech by using a hard thresholding function in order to obtain an enhanced speech. Detailed simulations are carried out in the presence of white, car, pink, and babble noises to evaluate the performance of the proposed method. Standard objective measures, spectrogram representations and subjective listening tests show that the proposed method outperforms the existing state-of-the-art thresholding based speech enhancement approaches for noisy speech from high to low levels of SNR.
This paper contains a report on an Audio-Visual Client Recognition System using Matlab software which identifies five clients and can be improved to identify as many clients as possible depending on the number of clients it is trained to identify which was successfully implemented. The implementation was accomplished first by visual recognition system implemented using The Principal Component Analysis, Linear Discriminant Analysis and Nearest Neighbour Classifier. A successful implementation of second part was achieved by audio recognition using Mel-Frequency Cepstrum Coefficient, Linear Discriminant Analysis and Nearest Neighbour Classifier the system was tested using images and sounds that have not been trained to the system to see whether it can detect an intruder which lead us to a very successful result with précised response to intruder.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Bsp customization and porting of linux on arm cortex based i.mx6 processor wi...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Behavior of square footing resting on reinforced sand subjected to incrementa...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Bit error rate analysis of miso system in rayleigh fading channeleSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
A systematic image compression in the combination of linear vector quantisati...eSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Analysis of element shape in the design for multi band applicationseSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Evaluation of reactances and time constants of synchronous generatoreSAT Publishing House
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
Comparative performance analysis of channel normalization techniqueseSAT Journals
Abstract A major part of the interaction between humans takes place via speech communication. The speech signal carries both useful and unwanted information. Processing of such signals involve enhancing the useful information. The intelligibility of speech signals is significantly reduced due to the presence of unwanted information such as noise. Channel normalization algorithms suppress such additive noise introduced in the speech signals by transmission channel or by recording environment conditions. Enhancing the quality and intelligibility of speech signals improve the performance of speech systems such as Automatic speech recognition (ASR) , voice communication and hearing aids to name the few. Based on the experimental results the comparative analysis of channel normalization techniques have been presented in this paper to find out the most suitable algorithm for enhancing the speech signals. Keywords: Cepstral Mean Normalization, Spectral Subtraction, Weiner filter, Signal to Noise Ratio
Suppression of noise in noisy speech signal is required in many speech enhancement applications like signal recording and transmission from one place to other. In this paper a novel single line noise cancellation system is proposed using derivative of normalized least mean spare algorithm. The proposed system has two phases. The first phase is generation of secondary reference signal from incoming primary signal itself at initial silence period and pause between two words, which is essential while adaptive filter using as noise canceller. Second phase is noise cancellation using proposed modified error data normalized step size (EDNSS) algorithm. The performance of the proposed algorithm is compared with normalized least mean square (NLMS) algorithm and original EDNSS algorithm using standard IEEE sentence (SP23) of Noizeus data base with different types of real-world noise at different level of signal to noise ratio (SNR). The output of proposed, NLMS and EDNSS algorithm are measured with output SNR, excessive mean square error (EMSE) and misadjustment (M). The results clearly illustrates that the proposed algorithm gives improved result over conventional NLMS and EDNSS algorithm. The speed of convergence is also maintained as same conventional NLMS algorithm.
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...IOSR Journals
Abstract- This paper deals with residual musical noise which results from the perceptual speech enhancement
type algorithms and especially using wiener filtering approach. Perceptual speech enhancement techniques
perform better than the non perceptual techniques, most of them still return a trouble residual musical noise.
This is due to that only noise above the noise masking threshold (NMT) is filtered out then noise below the noise
masking threshold (NMT) can become audible if its maskers are filtered. It can affect the performance of
perceptual speech enhancement method that process the audible noise only (Residual noise is still present). In
order to overcome this drawback a new speech enhancement technique is proposed here.The main aim here is
to improve the enhanced speech signal quality provided by perceptual wiener filtering and by controlling the
latter via a second filter regarded as a psychoacoustically motivated weighting factor. The simulation results
gives the information that the performance is improved compared to other perceptual speech enhancement
methods.
Speech enhancement using spectral subtraction technique with minimized cross ...eSAT Journals
Abstract The aim of speech enhancement is to get significant reduction of noise and enhanced speech from noisy speech. There are several
approaches for speech enhancement .earlier approaches didn’t consider cross spectral terms into account. Cross spectral terms
become prominent when processing window size becomes small i.e. 20ms-30ms. In this paper, an enhancement method is
proposed for significant reduction of noise, and improvement in the quality and perceptibility of speech degraded by correlated
additive background noise. The proposed method is based on the spectral subtraction technique. The simple spectral subtraction
technique results in poor reduction of noise. One of the main reasons for this is neglecting the cross spectral terms of speech and
noise, based on the appropriation that clean speech and noise signals are completely uncorrelated to each other, which is not true
on short time basis. In this paper an improvement in reduction of the noise is achieved as compared to the earlier methods. This
fact is mainly attributed to the cross spectral terms between speech and noise. This algorithm can be implemented and used in
hearing aids for the benefit of hearing impaired people. Objective speech quality measures, spectrogram analyses and subjective
listening tests conforms the proposed method is more effective in comparison with earlier speech enhancement techniques.
Keywords: Spectral Subtaction,Cross Spectral Components
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...IJORCS
In the field of speech signal processing, Spectral subtraction method (SSM) has been successfully implemented to suppress the noise that is added acoustically. SSM does reduce the noise at satisfactory level but musical noise is a major drawback of this method. To implement spectral subtraction method, transformation of speech signal from time domain to frequency domain is required. On the other hand, Wavelet transform displays another aspect of speech signal. In this paper we have applied a new approach in which SSM is cascaded with wavelet thresholding technique (WTT) for improving the quality of speech signal by removing the problem of musical noise to a great extent. Results of this proposed system have been simulated on MATLAB.
Audio Noise Removal – The State of the Artijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Event Management System Vb Net Project Report.pdfKamal Acharya
In present era, the scopes of information technology growing with a very fast .We do not see any are untouched from this industry. The scope of information technology has become wider includes: Business and industry. Household Business, Communication, Education, Entertainment, Science, Medicine, Engineering, Distance Learning, Weather Forecasting. Carrier Searching and so on.
My project named “Event Management System” is software that store and maintained all events coordinated in college. It also helpful to print related reports. My project will help to record the events coordinated by faculties with their Name, Event subject, date & details in an efficient & effective ways.
In my system we have to make a system by which a user can record all events coordinated by a particular faculty. In our proposed system some more featured are added which differs it from the existing system such as security.
Industrial Training at Shahjalal Fertilizer Company Limited (SFCL)MdTanvirMahtab2
This presentation is about the working procedure of Shahjalal Fertilizer Company Limited (SFCL). A Govt. owned Company of Bangladesh Chemical Industries Corporation under Ministry of Industries.
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Welcome to WIPAC Monthly the magazine brought to you by the LinkedIn Group Water Industry Process Automation & Control.
In this month's edition, along with this month's industry news to celebrate the 13 years since the group was created we have articles including
A case study of the used of Advanced Process Control at the Wastewater Treatment works at Lleida in Spain
A look back on an article on smart wastewater networks in order to see how the industry has measured up in the interim around the adoption of Digital Transformation in the Water Industry.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Courier management system project report.pdfKamal Acharya
It is now-a-days very important for the people to send or receive articles like imported furniture, electronic items, gifts, business goods and the like. People depend vastly on different transport systems which mostly use the manual way of receiving and delivering the articles. There is no way to track the articles till they are received and there is no way to let the customer know what happened in transit, once he booked some articles. In such a situation, we need a system which completely computerizes the cargo activities including time to time tracking of the articles sent. This need is fulfilled by Courier Management System software which is online software for the cargo management people that enables them to receive the goods from a source and send them to a required destination and track their status from time to time.
Quality defects in TMT Bars, Possible causes and Potential Solutions.PrashantGoswami42
Maintaining high-quality standards in the production of TMT bars is crucial for ensuring structural integrity in construction. Addressing common defects through careful monitoring, standardized processes, and advanced technology can significantly improve the quality of TMT bars. Continuous training and adherence to quality control measures will also play a pivotal role in minimizing these defects.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Democratizing Fuzzing at Scale by Abhishek Aryaabh.arya
Presented at NUS: Fuzzing and Software Security Summer School 2024
This keynote talks about the democratization of fuzzing at scale, highlighting the collaboration between open source communities, academia, and industry to advance the field of fuzzing. It delves into the history of fuzzing, the development of scalable fuzzing platforms, and the empowerment of community-driven research. The talk will further discuss recent advancements leveraging AI/ML and offer insights into the future evolution of the fuzzing landscape.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
Gen AI Study Jams _ For the GDSC Leads in India.pdf
A novel speech enhancement technique
1. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 98
A NOVEL SPEECH ENHANCEMENT TECHNIQUE
R.Dhivya1
, Judith Justin2
1
PG Scholar, Department of Biomedical Instrumentation Engineering, Avinashilingam University, TamilNadu, India
2
Faculty, Department of Biomedical Instrumentation Engineering, Avinashilingam University, TamilNadu, India
Abstract
This enhancement technique is a novel one and is based on the combination of Wavelet thresholding and Spectral Subtraction. Five
wavelet filters are compared and the best filter is selected based on their performance of Signal to Noise Ratio. The selected filter is
applied to the detail coefficients for thresholding. Approximation coefficient is applied to spectral subtraction filter. The reconstructed
signal is evaluated using the metrics such as SNR (Signal to Noise Ratio), Correlation coefficient and PESQ (Perceptual Evaluation of
Speech Quality). Real time data is recorded from Alaryngeal speakers and real world noise from Noizeus corpus is used for the study.
Keywords: Wavelet Thresholding, Signal to Noise Ratio, Correlation coefficient, Perceptual Evaluation of Speech
Quality.
----------------------------------------------------------------------***-----------------------------------------------------------------------
1. INTRODUCTION FOR SPEECH
ENHANCEMENT
Numerous speech enhancement algorithms have been
proposed to improve the performance of communication
devices working in normal environments surrounded in noise.
The rapid increase in usage of speech processing algorithms in
multi-media and telecommunication applications raises the
need for speech quality evaluation [1]. Accurate and reliable
assessment of speech quality is thus becoming vital for the
satisfaction of the end-user or customer of the deployed
speech processing systems (e.g., mobile phone, video
conferencing equipment, speech synthesis system, etc.). It is
possible for speech to be highly intelligible and still be of poor
quality. Also, although two different algorithms may produce
equal word intelligibility scores, listeners may perceive the
speech of one of the algorithms as being more natural,
pleasant and acceptable. Therefore there is a need to measure
the attributes of a speech signal. Reliable rating of speech
quality is a challenging task because quality assessment is
highly subjective and reliability of subjective measurements
becomes an issue.
In this research a novel method of speech enhancement
technique is proposed and its performance is evaluated using
some validated measures.
The data used for the study is unique. A speaker who suffered
from laryngeal carcinoma is surgically treated with Total
Laryngectomy, which is the removal f the larynx. This is
followed by a loss of voice after which voice prosthesis is
implanted. The speaker gets trained to speak again with the
help of a speech therapist. The voice thus produced is called
alaryngeal voice. The quality of the voice of the
laryngectomee is assessed through Signal to Noise Ratio
(SNR), Correlation coefficient and Perceptual Evaluation of
Speech Quality (PESQ) which is recommended by
International Telecommunication Union.
This paper is organized as follows: Section II describes the
materials and the method selected for the study, Section III
describes the proposed technique adopted for the
enhancement, and a study of an existing speech enhancement
algorithm and Section IV presents the quality metrics adopted
for the evaluation. Section V elaborates on the results obtained
and a comparison between the two algorithms. In section VI
Conclusions are drawn about the novel technique of speech
enhancement.
2. MATERIALS AND METHODS
A male speaker implanted with the Blom-singer voice
prosthesis was considered for the study. A sentence is
presented to the alaryngeal speaker from the IEEE sentence
database. The sentences in this database have the features of
being phonetically balanced. The sentence, “Kick the ball
straight and follow through” is used for the study. The voice
generated by the speaker after implantation with Blom-Singer
Duckbill Voice Prosthesis is recorded using a unidirectional
microphone in an anechoic room and is stored on a computer.
In order to study the performance of the algorithm in a real
world situations, real world noises like babble, car, street and
train at different levels (0 dB, 5 dB,10 dB and 15 dB) are added
and the performance of the algorithm are evaluated using
validated metrics. The quality metrics used for the evaluation
are SNR, Correlation coefficient and PESQ scores. The
proposed method is explained with a block diagram shown in
Figure 1.
2. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 99
3. PROPOSED TECHNIQUE FOR SPEECH
ENHANCEMENT
The recorded speech signal is added to different real world
noises - Babble, Car, Street and Train at four different levels 0,
5, 10 and 15 dB respectively. The noisy speech signal
generated is decomposed using wavelet filters [2], [3]. Mother
wavelet is carefully selected to better approximate and capture
the transient spikes of the original signal. “Mother wavelet” is
determined how well we estimate the original signal in terms of
the shape and at the same time it will affect the frequency
spectrum of the denoised signal. The choice of mother wavelet
can be based on modifying the detail coefficients using wavelet
thresholding techniques and keeps the approximation
Coefficients unaltered. The selection of wavelet [4], [5] is
based on Signal to Noise Ratio (SNR) among the signal of
interest and the wavelet-denoised signal shown in table1.
The signal is decomposed at one level, since the signal is
down-sampled by 2, approximate and detail coefficients. To
threshold the detail coefficients the threshold value is selected
based on soft threshold evaluator of unbiased risk, „Rigrsure‟.
Soft thresholding shrinks the coefficients that are larger than
the threshold. The selected threshold value is shown in
equation 1.
bl s w=
(1)
is the standard deviation of the noisy signal, ω is the squared
wavelet coefficient.
The detail coefficient is denoised and the approximation
coefficients are enhanced using the multiband spectral
subtraction method [6].
3.1 Wavelet Thresholding
Wavelet Transform is applied to the noisy signal to decompose
into approximation and Detail coefficients. Soft thresholding is
applied to the detail coefficients to denoise the noisy signal.
Assume that y(n) is a noisy signal and is given as in equation 2.
y(n)=x(n)+d(n) (2)
Where x(n) is original signal and d(n) is noise signal. High
frequency features may present in original signal, which is
well-preserved by wavelet transform. The detail coefficient of
the noisy signal is shrinked by soft thresholding given in
equation (3),
X =
sgn( )y y T- if y T³
0
if y T<
(3)
There are different types of wavelet filters such as [6]:
Daubechies filter is the extension of haar wavelets,
that produce smoother scaling and wavelet
functions.
Coiflets filter allow approximately equal number
of zero scaling function and wavelet moments.
Symlet filter is a modified version of Daubechies
wavelets with increased symmetry.
Biorthogonal filter describes a pair of topological
vector spaces that are in duality with a pair of
indexed subsets in a specific way.
Reverse biorthogonal
These wavelets are assessed through Signal to Noise Ratio to
assess the best performing denoising wavelet as shown in
Table 1.
Wavelet Reconstruction: After the detail coefficients are
shrinked and approximation coefficients enhanced, these two
coefficients are reconstructed using inverse discrete wavelet
transform with the help of the mother wavelet. The
reconstructed signal is evaluated using the quality metrics such
as SNR, Correlation Coefficient and PESQ.
3.2 Spectral Subtraction
Generally noise will not affect the speech signal uniformly over
the spectrum; some frequencies will be exaggerated based on
the spectral characteristics of the noise. In mband approach, the
spectrum is allocated into N overlapping bands, and spectral
subtraction is achieved independently in each band [7].
The estimate of clean speech spectrum in ith band is obtained
by the equation 4,
22 2
( ) ( ) . . ( )i k i k i i i kX Y Dw w a b w= -
i k ib ew£ £
(4)
Where kw
= 2 /k NP , are discrete frequencies,
2
( )i kD w
is
the estimated noise power spectrum, ib
and ie
are the
commencement and termination frequency bins of the ith
frequency band, ia
is the over-subtraction factor, and ib
is the
additional band-subtraction factor that can be set individually
for each frequency band to adapt the noise removal process.
3.3 Quality Metrics
Speech quality measure has three types of objectives [9] such
as perceptual domain, spectral domain, and time domain
measures. In analog or waveform coding systems, time domain
measures are used, which has the goal to reproduce the
waveform itself. Among them the most known methods are
SNR and Segmental SNR (SNRseg). Spectral domain
3. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 100
measures are the second type that ranges between 15 and 30 ms
long, calculated using speech segments. Spectral domain
measures are less sensitive to misalignments between the
original and distorted signal, so they are much more reliable
than time domain measures. Perceptual domain measures are
based on models of human auditory acuity that is in contrast to
spectral domain measures. It incorporates human auditory
models and transform speech signal into a perceptually
pertinent domain e.g., Bark spectrum or loudness domain.
3.3.1 SNR
Synchronization of original and distorted signals is necessary,
because the waveform is directly compared in time domain or
else the performance will be poor. Synchronization is difficult;
the simplest possible method is to calculate Signal to Noise
Ratio (SNR). It measures the distortion of waveform coders
that reproduce the input waveform, calculated as based on
equation 5:
2
1
10
1
(i)
10log
ˆ( ( ) ( ))
N
i
N
i
x
SNR
x i x i
=
=
=
-
å
å
(5)
Where x(i) and ˆx (i) are the original and processed speech
samples indexed by i and N is the total number of samples.
3.3.2 Correlation Coefficient
Correlation expresses the relationship between two signals
numerically. Correlation is expressed in terms of scale from 0
to1. If the correlation value is closer to 0 indicates weaker, and
if it is closer to 1 indicates stronger correlation.
3.3.3 PESQ
Perceptual domain measures are based on models of the human
auditory system, compared to time and spectral domain
measures and they have a higher chance of predicting the
subjective quality of speech [8]. The commonly used
perceptual quality measure is Perceptual Evaluation of Speech
Quality (PESQ). PESQ is an authenticated metric suggested by
International Telecommunication Union (ITU) for assessing
speech quality. PESQ predicts the subjective opinion score of
an enhanced speech. In PESQ algorithm, a reference signal and
enhanced signal are first allied in both time and level. The final
PESQ score is computed as a linear combination of the average
disturbance value dsym and the average asymmetrical
disturbance value dasym as given in equation 6:
0 1 2PESQ a a dsym a dasym= + × + × (6)
The assortment of the PESQ score will be a MOS-like score,
i.e., a score of rating of 1-2-3-4-5 is given for inadequate-
deprived-reasonable-worthy-outstanding on a listening quality
scale.
Where a0=4.5, a1= - 0.1 and a2= - 0.0309
4. RESULTS AND DISCUSSION
Different wavelet filters considered include Daubechies, coif
let, Symlet, demy, biorthogonal and reverse biorthogonal.
These wavelet filters are compared based on the Signal to
Noise Ratio and the results are shown in Table 1. From the
comparison, we find Symlet 7 wavelet filter to perform well in
denoising the signal.
Table -1: Selection of Wavelet Filter
Wavelet Filter SNR
db9 9.9256
coif4 9.92
sym7 9.9258
Demy 9.9183
bior3.7 9.922
4.1 SNR
The Signal to Noise Ratio is one among the metric considered
for evaluation of the proposed algorithm. The values are
increasing for higher noise levels. Maximum value for babble
noise at 15dB is 9.1733, for car noise at 15dB is 9.5701, for
street noise at 15dB is 9.3026 and for train noise at 15dB is
9.1018, shown in Figure 2. Proposed algorithm give best under
high noise levels, which makes it suitable for real life
situations.
4.2 Correlation Coefficient
Correlation values are increasing for higher dB noise level.
Maximum value for babble noise at 15dB is 0.9763, for car
noise at 15dB is 0.9813, for street noise at 15dB is 0.9768 and
for train noise at 15dB is 0.9763, shown in Figure 2. Proposed
algorithm is found to yield good correlation for higher noise
levels.
4.3 PESQ
PESQ values are increasing for higher dB noise level.
Maximum value for babble noise at 15dB is 3.1705, for car
noise at 15dB is 3.5105, for street noise at 15dB is 3.3086 and
for train noise at 15dB is 3.3086, shown in Figure 2. The PESQ
values indicate reasonable quality in the listening quality scale.
4. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 101
5. CONCLUSIONS
From assessment of quality metrics, results show that speech
enhancement using Wavelet Thresholding integrated with
Spectral Subtraction yields better performance. Comparing
with all noise levels, proposed algorithm works better in higher
noise dB level and also by comparing enhancement of different
noises, denoising of Car noise works better. The alaryngeal
speech produced by voice prosthesis tracheoesophageal
puncture (TEP) produces a pseudo speech which is of
reasonable quality in the presence of real world noises. Further
the proposed algorithm can be compared with other Speech
Enhancement algorithms to prove the efficacy of enhancement.
Fig -1: Steps for the Speech Enhancement
REFERENCES
[1] Boutaleb, R. , Meraoubi, H. ; Ykhlef, F. ; Benzaba,
W. ; Boucetta, Y. ; Bendaouia, L., “ Comparitive
Performance Study between Spectral Subtraction and
Discrete Wavelet Transform for Speech Enhancement”,
Computer Systems and Applications (AICCSA), 2013
ACS International Conference on 27-30 May 2013
[2] Dr. Mahesh S. Chavan, Mrs Manjusha N.Chavan & Dr.
M.S.Gaikwad, “Studies on Implementation of Wavelet
for Denoising Speech Signal”, International Journal of
Computer Applications (0975 – 8887)Volume 3 – No.2,
June 2010.
[3] V.S.R Kumari, Dileep Kumar Devarakonda,” A
Wavelet Based Denoising of Speech Signal”
International Journal of Engineering Trends and
Technology (IJETT) – Volume 5 number 2 - Nov 2013.
[4] Roopali Goel, Ritesh Jain,” Speech Signal Noise
Reduction by Wavelets”, International Journal of
Innovative Technology and Exploring Engineering
(IJITEE)
[5] Ritesh Jain, Suraiya Parveen , “Analysis of Different
Wavelets by Correlation “,International Journal of
Engineering and Advanced Technology (IJEAT)ISSN:
2249 – 8958, Volume-2, Issue-4, April 2013.
Speech Signal
+ Noise
Noisy Speech
Signal
Selection of
Wavelet Filter
SNR
Enhanced
Speech Signal
Wavelet
Reconstruction
Soft
Thresholding
Spectral
Subtraction
Detail
Coefficients
Approximation
Coefficients
Wavelet
Decomposition
CoC PESQ
5. IJRET: International Journal of Research in Engineering and Technology eISSN: 2319-1163 | pISSN: 2321-7308
__________________________________________________________________________________________
Volume: 03 Special Issue: 07 | May-2014, Available @ http://www.ijret.org 102
[6] Guang-Yan Wang ,” Speech enhancement based on the
Combination of spectral subtraction and wavelet
thresholding”, http://ieeexplore.ieee.org/xpl/23-25 Oct.
2009
[7] S. Kamath, and P. Loizou, "A multi-band spectral
subtraction method for enhancing speech corrupted by
colored noise. Proc. IEEE Int. Conf. Acoust., Speech,
Signal Processing, 2002
[8] Yi Hu, Philips C. Loizou, "Evaluation of objective
quality measures for speech enhancement," IEEE
Transactions on Audio, Speech & Language
Processing, vol.16(1), pp. 229-238, 2008.
[9] S. Mohamed, F. Cervantes and H. Afifi. ``Real-Time
Audio Quality Assessment in Packet Networks'',
In Network and Information Systems Journal, vol. 3, no.
3-4, 2000, pp. 595-609.
Chart -1: Quality metrics for Proposed Algorithm
BIOGRAPHIES
Dhivya.R completed B.E in Biomedical
Engineering from Vellalar Engineering
College, Erode and Currently pursuing M.E in
the department of Biomedical Instrumentation
Engineering in Avinashilingam University,
Coimbatore.
Judith Justin graduated from Government
College of Technology, Coimbatore and
completed M.Tech in Biomedical Engineering
from Indian Institute of Technology – Madras.
Her research interests are in the field of
Biosignal and Image Processing and Medical Instrumentation.
She has published around 15 papers in National and
International Conferences and has more than 12 papers
published in Peer reviewed International Journals to her credit.
She is a life member of ISTE and BMESI.