This document summarizes and compares speech enhancement methods for noisy Punjabi speech at the phoneme level. It describes spectral subtraction, Wiener filtering, Kalman filtering, and RASTA methods, and proposes a hybrid method that uses features from Wiener and Kalman filtering. The phonemes of the Punjabi language and the methodology are explained. Results obtained by applying various noise types to phonemes at different signal-to-noise ratios show that the proposed method produces the best enhancement compared to the other methods.
Teager Energy Operation on Wavelet Packet Coefficients for Enhancing Noisy Sp... | CSCJournals
In this paper a new thresholding-based speech enhancement approach is presented, in which the threshold is statistically determined by applying the Teager energy operation to the Wavelet Packet (WP) coefficients of noisy speech. The threshold thus obtained is applied to the WP coefficients of the noisy speech using a hard thresholding function in order to obtain enhanced speech. Detailed simulations are carried out in the presence of white, car, pink, and babble noise to evaluate the performance of the proposed method. Standard objective measures, spectrogram representations, and subjective listening tests show that the proposed method outperforms existing state-of-the-art thresholding-based speech enhancement approaches for noisy speech from high to low levels of SNR.
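The two building blocks named in this abstract can be sketched in a few lines. This is a minimal illustration, assuming the standard discrete Teager energy operator and a plain hard-thresholding rule; the function names are hypothetical, and the abstract does not say how the threshold is derived statistically from the Teager energy, so the threshold is simply passed in as a parameter here. The wavelet packet decomposition itself is not shown.

```python
from typing import List


def teager_energy(x: List[float]) -> List[float]:
    """Discrete Teager energy: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    if len(x) < 3:
        return [0.0] * len(x)
    psi = [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]
    # Replicate the edge values so the output has the same length as the input.
    return [psi[0]] + psi + [psi[-1]]


def hard_threshold(coeffs: List[float], threshold: float) -> List[float]:
    """Keep coefficients whose magnitude exceeds the threshold; zero the rest."""
    return [c if abs(c) > threshold else 0.0 for c in coeffs]
```

In a full pipeline, both functions would be applied per subband to the WP coefficients before the inverse transform.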
Broad phoneme classification using signal based features | ijsc
Speech is the most efficient and popular means of human communication. Speech is produced as a sequence of phonemes, and phoneme recognition is the first step performed by an automatic speech recognition system. State-of-the-art recognizers use mel-frequency cepstral coefficient (MFCC) features derived through short-time analysis, for which the recognition accuracy is limited. Here, broad phoneme classification is instead achieved using features derived directly from the speech at the signal level. The broad phoneme classes are vowels, nasals, fricatives, stops, approximants, and silence. The features found useful for broad phoneme classification are the voiced/unvoiced decision, zero crossing rate (ZCR), short-time energy, most dominant frequency, energy in the most dominant frequency, spectral flatness measure, and the first three formants. Features derived from short-time frames of training speech are used to train a multilayer feedforward neural network classifier with manually marked class labels as output, and the classification accuracy is then tested. This broad phoneme classifier is later used for broad syllable structure prediction, which is useful for applications such as automatic speech recognition and automatic language identification.
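Two of the signal-level features listed above, zero crossing rate and short-time energy, are simple enough to sketch directly. The definitions below are the common textbook ones, not necessarily the exact variants used in the paper, and the function names are hypothetical.

```python
from typing import List


def zero_crossing_rate(frame: List[float]) -> float:
    """Fraction of adjacent sample pairs whose signs differ."""
    if len(frame) < 2:
        return 0.0
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0.0) != (b >= 0.0)
    )
    return crossings / (len(frame) - 1)


def short_time_energy(frame: List[float]) -> float:
    """Mean squared amplitude of one short-time frame."""
    if not frame:
        return 0.0
    return sum(s * s for s in frame) / len(frame)
```

Unvoiced fricatives typically show a high ZCR with low energy, while vowels show the opposite pattern, which is why these two features are useful for separating broad classes.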
Comparative performance analysis of channel normalization techniques | eSAT Journals
A major part of the interaction between humans takes place via speech communication. The speech signal carries both useful and unwanted information, and processing such signals involves enhancing the useful information. The intelligibility of speech signals is significantly reduced by the presence of unwanted information such as noise. Channel normalization algorithms suppress such additive noise introduced into the speech signal by the transmission channel or by the recording environment. Enhancing the quality and intelligibility of speech signals improves the performance of speech systems such as automatic speech recognition (ASR), voice communication, and hearing aids, to name a few. Based on experimental results, a comparative analysis of channel normalization techniques is presented in this paper to find the most suitable algorithm for enhancing speech signals. Keywords: Cepstral Mean Normalization, Spectral Subtraction, Wiener filter, Signal to Noise Ratio
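Of the techniques named in the keywords, cepstral mean normalization is the simplest to illustrate. The sketch below assumes the usual per-utterance form, in which the mean of each cepstral coefficient over all frames is subtracted; the function name and the list-of-lists input layout are illustrative assumptions, not taken from the paper.

```python
from typing import List


def cepstral_mean_normalize(frames: List[List[float]]) -> List[List[float]]:
    """Subtract the per-coefficient mean computed over all frames."""
    if not frames:
        return []
    n_coeffs = len(frames[0])
    # Mean of each cepstral coefficient across the utterance.
    means = [sum(f[k] for f in frames) / len(frames) for k in range(n_coeffs)]
    return [[f[k] - means[k] for k in range(n_coeffs)] for f in frames]
```

A stationary channel shows up as a constant offset in the cepstral domain, so removing the mean removes that offset along with any constant component of the speech itself.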
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua... | sipij
Hearing-impaired people commonly use hearing aids that implement speech enhancement algorithms. Speech estimation and noise estimation are the two components of a single-channel speech enhancement system. The main objective of any speech enhancement algorithm is to estimate the noise power spectrum in a non-stationary environment. A VAD (Voice Activity Detector) is used to identify speech pauses, and noise is estimated only during these pauses. The MMSE (Minimum Mean Square Error) speech enhancement algorithm did not improve intelligibility, quality, or listener fatigue, which are the perceptual aspects of speech. A novel evaluation approach, SR (Signal to Residual spectrum ratio), based on an uncertainty parameter, is introduced to control distortions for the benefit of hearing-impaired people in non-stationary environments. Noise is estimated and updated by dividing the original signal into three kinds of frames (pure speech, quasi-speech, and non-speech) based on multiple threshold conditions. Different values of SR and LLR (log-likelihood ratio) indicate the amount of attenuation and amplification distortion. The proposed method is compared with one other method, WAT (Weighted Average Technique), and with MMSE, using the parameters SR and LLR and in terms of segmental SNR and LLR.
IJRET: International Journal of Research in Engineering and Technology is an international peer-reviewed online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching, and research in the fields of Engineering and Technology. We bring together Scientists, Academicians, Field Engineers, Scholars, and Students of related fields of Engineering and Technology.
International Journal of Computational Engineering Research (IJCER) | ijceronline
International Journal of Computational Engineering Research (IJCER) is an international online journal published monthly in English. The journal publishes original research work that contributes significantly to furthering scientific knowledge in engineering and technology.
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ... | IOSR Journals
This paper deals with the residual musical noise that results from perceptual speech enhancement algorithms, especially those using the Wiener filtering approach. Although perceptual speech enhancement techniques perform better than non-perceptual techniques, most of them still leave troublesome residual musical noise. This is because only noise above the noise masking threshold (NMT) is filtered out; noise below the NMT can then become audible if its maskers are filtered. This can affect the performance of perceptual speech enhancement methods that process only the audible noise (residual noise is still present). To overcome this drawback, a new speech enhancement technique is proposed. The main aim is to improve the quality of the enhanced speech signal provided by perceptual Wiener filtering by controlling the latter via a second filter regarded as a psychoacoustically motivated weighting factor. The simulation results indicate that performance is improved compared to other perceptual speech enhancement methods.
Speech enhancement using spectral subtraction technique with minimized cross ... | eSAT Journals
The aim of speech enhancement is to obtain a significant reduction of noise and recover enhanced speech from noisy speech. There are several approaches to speech enhancement, but earlier approaches did not take cross-spectral terms into account. Cross-spectral terms become prominent when the processing window is small, i.e. 20 ms to 30 ms. In this paper, an enhancement method is proposed for significant reduction of noise and improvement in the quality and perceptibility of speech degraded by correlated additive background noise. The proposed method is based on the spectral subtraction technique. Simple spectral subtraction results in poor noise reduction. One of the main reasons is that it neglects the cross-spectral terms of speech and noise, based on the assumption that the clean speech and noise signals are completely uncorrelated with each other, which is not true on a short-time basis. In this paper, an improvement in noise reduction is achieved compared to earlier methods, which is mainly attributed to the cross-spectral terms between speech and noise. The algorithm can be implemented and used in hearing aids for the benefit of hearing-impaired people. Objective speech quality measures, spectrogram analyses, and subjective listening tests confirm that the proposed method is more effective than earlier speech enhancement techniques.
Keywords: Spectral Subtraction, Cross Spectral Components
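The cross-spectral terms mentioned in this abstract come from expanding the power spectrum of an additive mixture. The symbols below are assumptions introduced for illustration (y for noisy speech, s for clean speech, n for noise, with capital letters for their short-time spectra); the paper's own notation is not given in the abstract.

```latex
y(t) = s(t) + n(t)
\quad\Longrightarrow\quad
|Y(\omega)|^{2} = |S(\omega)|^{2} + |N(\omega)|^{2}
  + 2\,\mathrm{Re}\{ S(\omega)\, N^{*}(\omega) \}
```

Basic power spectral subtraction estimates $|S(\omega)|^{2}$ as $|Y(\omega)|^{2} - |\hat{N}(\omega)|^{2}$, which implicitly sets the last term to zero. That term vanishes only in expectation for uncorrelated signals, not within a single short frame.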
A REVIEW OF LPC METHODS FOR ENHANCEMENT OF SPEECH SIGNALS | ijiert bestjournal
This paper presents a review of LPC methods for the enhancement of speech signals. The purpose of every speech enhancement method is to improve the quality of the speech signal by minimizing the background noise. This paper comments in particular on Power Spectral Subtraction, Multiband Spectral Subtraction, Non-Linear Spectral Subtraction, MMSE Spectral Subtraction, and Spectral Subtraction based on perceptual properties. Because spectral subtraction produces residual noise and musical noise, the Linear Predictive Analysis method is used to enhance the speech.
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing | IJECEIAES
Speech enhancement algorithms are used to overcome multiple limiting factors in recent applications such as mobile phones and communication channels. The challenge for corrupted speech lies in the trade-off between noise reduction and signal distortion. We used a modified Wiener filter and compressive sensing (CS) to investigate and evaluate the improvement of speech quality. The new method adapts the noise estimation and the Wiener filter gain function to increase the weight of the amplitude spectrum and improve mitigation of the signals of interest. CS is then applied using the gradient projection for sparse reconstruction (GPSR) technique as a study system to empirically investigate the interactive effects of the corrupting noise and obtain better perceptual improvement with respect to listener fatigue with noiseless reduction conditions. In objective assessment tests, the proposed algorithm outperforms other conventional algorithms under various noise types at SNRs of 0, 5, 10, and 15 dB. The proposed algorithm therefore significantly improves speech quality and achieves better noise reduction than other conventional algorithms.
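For orientation, the unmodified Wiener gain that such methods start from can be sketched as follows. This is the textbook per-bin rule G = SNR / (1 + SNR) with a simple power-subtraction SNR estimate, not the modified gain function the paper proposes; the function name and the fallback for a zero noise estimate are assumptions made for this example.

```python
from typing import List


def wiener_gain(noisy_power: List[float], noise_power: List[float]) -> List[float]:
    """Per-bin Wiener gain from noisy-speech and noise power estimates."""
    gains = []
    for p_y, p_n in zip(noisy_power, noise_power):
        if p_n <= 0.0:
            # No noise estimated in this bin: pass it through unchanged.
            gains.append(1.0)
            continue
        # Estimated SNR of the clean speech, floored at zero.
        snr = max(p_y / p_n - 1.0, 0.0)
        gains.append(snr / (1.0 + snr))
    return gains
```

Each gain lies between 0 and 1 and multiplies the corresponding bin of the noisy spectrum, so bins dominated by noise are attenuated most.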
Analysis of PEAQ Model using Wavelet Decomposition Techniques | idescitation
Digital broadcasting, internet audio, and music databases use audio compression and coding techniques to compress high-quality audio signals without impairing their perceptual quality. Audio signal compression is a lossy compression technique: it converts the original audio signal into a compressed bitstream. The compressed audio bitstream is decoded at the decoder to produce a close approximation of the original signal. To improve the coding, this work attempts to verify the perceptual evaluation of audio quality (PEAQ) model in BS.1387 using wavelet decomposition techniques. Finally, the masking thresholds for sub-bands obtained using wavelet techniques and the Fast Fourier transform (FFT) will be compared.
Speech Enhancement Using Spectral Flatness Measure Based Spectral Subtraction | IOSRJVSP
This paper aims to reduce the background noise introduced into a speech signal during capture, storage, transmission, and processing, using the Spectral Subtraction algorithm. Because colored noise corrupts the speech signal non-uniformly over different frequency bands, a Multi-Band Spectral Subtraction (MBSS) approach is used, in which the amount of noise subtracted from the noisy speech signal is decided by a weighting factor. The choice of optimal weight values determines the performance of the speech enhancement system. In this paper the weights are decided based on SFM (Spectral Flatness Measure) rather than the conventional SNR (Signal to Noise Ratio) based rule, since SFM is able to provide a true distinction between the speech signal and the noise signal. Spectrograms and Mean Opinion Scores show that speech enhanced by the proposed SFM-based MBSS has better perceptual quality and improved intelligibility than speech enhanced by the existing SNR-based MBSS.
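The spectral flatness measure itself is the ratio of the geometric mean to the arithmetic mean of a power spectrum. The sketch below computes only that ratio; the abstract does not give the rule that maps SFM to subtraction weights, so none is attempted. The function name and the small `eps` guard against zero-valued bins are assumptions.

```python
import math
from typing import List


def spectral_flatness(power_spectrum: List[float], eps: float = 1e-12) -> float:
    """Geometric mean divided by arithmetic mean of the power spectrum."""
    if not power_spectrum:
        raise ValueError("power_spectrum must not be empty")
    n = len(power_spectrum)
    # Geometric mean computed in the log domain for numerical stability.
    log_mean = sum(math.log(p + eps) for p in power_spectrum) / n
    arith_mean = sum(power_spectrum) / n
    return math.exp(log_mean) / (arith_mean + eps)
```

Values near 1 indicate a flat, noise-like spectrum, while values near 0 indicate a peaky spectrum such as voiced speech.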
Audio Noise Removal – The State of the Art | ijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Audio/Speech Signal Analysis for Depression | ijsrd.com
The word “depressed” is a common everyday word. People might say "I am depressed" when in fact they mean "I am fed up because I have had a row, or failed an exam, or lost my job", etc. These ups and downs of life are common and normal, and most people recover quite quickly. Depression can be identified by different methods. Here we identify depression using the MFCC (Mel Frequency Cepstral Coefficient) method. Different parameters are used to distinguish depressed speech from normal speech, but MFCC-based parameters are more applicable than other parameters because a depressive speech or audio signal can contain more information in the higher energy bands than normal speech.
Beyond the stories of collapse, devastation, and moral uncertainty in Iraq’s recent history there are tales of connections, relations, and the entanglements of lives which are named in forms such as friendship and family, and modes of comporting to others such as care, attention, and even love, which have yet to become part of how one thinks and writes about life after the invasion. In this article the authors draw attention to a picture of the lives of Iraqis as caught not merely in the forms and structures of tribal obligations and sectarianism, and the violence and destruction of terror, but also in the rough ground of mundane affairs and encounters. We argue that in the overlappings and relations of lives and intentionalities resides an intercorporeal ethics of the rough ground of the everyday. An ethics of the rough ground of the everyday is one understood not only in terms of the ways in which life is open to the pain, suffering, joy, and ennui of others, but in terms of how in the entanglements and relations of lives with other lives in the everyday, lines of care and concern emerge, are fostered, and also frayed.
Improvement of minimum tracking in Minimum Statistics noise estimation method | CSCJournals
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper we propose a new method for minimum tracking in the Minimum Statistics (MS) noise estimation method. The noise estimation algorithm is proposed for highly nonstationary noise environments. Its benefit was confirmed with formal listening tests, which indicated that the proposed noise estimation algorithm, when integrated into speech enhancement, was preferred over other noise estimation algorithms.
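The baseline idea that Minimum Statistics builds on can be sketched for a single frequency bin: smooth the noisy power over time, then take the minimum over a sliding window as the noise floor. This is not the improved tracking method the paper proposes, and it omits the bias compensation used in full Minimum Statistics; the function name and the `alpha` and `window` defaults are illustrative assumptions.

```python
from typing import List


def track_noise_floor(
    power: List[float], alpha: float = 0.8, window: int = 5
) -> List[float]:
    """Sliding-window minimum of recursively smoothed power for one bin."""
    smoothed: List[float] = []
    floor: List[float] = []
    for p in power:
        prev = smoothed[-1] if smoothed else p
        # First-order recursive smoothing of the periodogram value.
        smoothed.append(alpha * prev + (1.0 - alpha) * p)
        # Noise floor: minimum over the most recent `window` smoothed values.
        floor.append(min(smoothed[-window:]))
    return floor
```

A longer window gives a more stable estimate but reacts more slowly when the noise level rises, which is the trade-off that improved minimum-tracking schemes try to ease.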
METHOD FOR REDUCING OF NOISE BY IMPROVING SIGNAL-TO-NOISE-RATIO IN WIRELESS LAN | IJNSA Journal
The signal-to-noise ratio (SNR) is one of the important measures for reducing noise. A technique is proposed that uses a linear prediction error filter (LPEF) and an adaptive digital filter (ADF) to achieve noise reduction in speech and images degraded by additive background noise. Since a speech signal can be represented as a stationary signal over a short interval of time, most of the speech signal can be predicted by the LPEF. This estimation is performed by the ADF, which is used for system identification. Noise reduction is achieved by subtracting the reconstructed noise from the speech degraded by additive background noise. Most MR image acceleration methods suffer from degradation of the acquired images, which is often correlated with the degree of acceleration. Wideband MRI, however, is a novel technique that transcends such flaws. In this paper we propose the LPEF and ADF for reducing noise in speech, and we also demonstrate that Wideband MRI is capable of obtaining images of identical quality to conventional MR images in terms of SNR in wireless LAN.
Speech Enhancement for Nonstationary Noise Environmentssipij
In this paper, we present a simultaneous detection and estimation approach for speech enhancement in nonstationary noise environments. A detector for speech presence in the short-time Fourier transform domain is combined with an estimator, which jointly minimizes a cost function that takes into account both detection and estimation errors. Under speech-presence, the cost is proportional to a quadratic spectral amplitude error, while under speech-absence, the distortion depends on a certain attenuation factor. Experimental results demonstrate the advantage of using the proposed simultaneous detection and estimation approach which facilitate suppression of nonstationary noise with a controlled level of speech distortion.
Audio Noise Removal – The State of the Artijceronline
International Journal of Computational Engineering Research (IJCER) is dedicated to protecting personal information and will make every reasonable effort to handle collected information appropriately. All information collected, as well as related requests, will be handled as carefully and efficiently as possible in accordance with IJCER standards for integrity and objectivity.
Speech Analysis and synthesis using VocoderIJTET Journal
Abstract— In this paper, I proposed a speech analysis and synthesis using a vocoder. Voice conversion systems do not create new speech signals, but just transform existing one. The proposed speech vocoding is different from speech coding. To analyze the speech signal and represent it with less number of bits, so that bandwidth efficiency can be increased. The Synthesis of speech signal from the received bits of information. In this paper three aspects of analysis have been discussed: pitch refinement, spectral envelope estimation and maximum voiced frequency estimation. A Quasi-harmonic analysis model can be used to implement a pitch refinement algorithm which improves the accuracy of the spectral estimation. Harmonic plus noise model to reconstruct the speech signal from parameter. Finally to achieve the highest possible resynthesis quality using the lowest possible number of bits to transmit the speech signal. Future work aims at incorporating the phase information into the analysis and modeling process and also synthesis these three aspects in different pitch period.
A New Speech Enhancement Technique to Reduce Residual Noise Using Perceptual ...IOSR Journals
Abstract- This paper deals with residual musical noise which results from the perceptual speech enhancement
type algorithms and especially using wiener filtering approach. Perceptual speech enhancement techniques
perform better than the non perceptual techniques, most of them still return a trouble residual musical noise.
This is due to that only noise above the noise masking threshold (NMT) is filtered out then noise below the noise
masking threshold (NMT) can become audible if its maskers are filtered. It can affect the performance of
perceptual speech enhancement method that process the audible noise only (Residual noise is still present). In
order to overcome this drawback a new speech enhancement technique is proposed here. The main aim here is
to improve the enhanced speech signal quality provided by perceptual wiener filtering and by controlling the
latter via a second filter regarded as a psychoacoustically motivated weighting factor. The simulation results
gives the information that the performance is improved compared to other perceptual speech enhancement
methods.
General Kalman Filter & Speech Enhancement for Speaker Identificationijcisjournal
Presence of noise increases the dimension of the information. A noise suppression algorithm is developed
with an idea of combining the General Kalman Filter and Estimate Maximization (EM) framework. This
combination is helpful and effective in identifying noise characteristics of an acoustic environment.
Recursion between Estimate step and Maximization step enabled the algorithm to deal any model of noise.
The same Speech enhancement procedure in applied in the pre-processing stage of a conventional Speaker
identification method. Due to the non-stationary nature of noise and speech adaptive algorithms are
required. Algorithm is first applied for Speech enhancement problem and then extended to using it in the
pre-processing step of the Speaker identification. The present work is compared in terms of significant
metrics with existing and popular algorithms and results show that the developed algorithm is dominant
over them.
Broad Phoneme Classification Using Signal Based Features ijsc
Speech is the most efficient and popular means of human communication Speech is produced as a sequence of phonemes. Phoneme recognition is the first step performed by automatic speech recognition system. The state-of-the-art recognizers use mel-frequency cepstral coefficients (MFCC) features derived through short time analysis, for which the recognition accuracy is limited. Instead of this, here broad phoneme classification is achieved using features derived directly from the speech at the signal level itself. Broad phoneme classes include vowels, nasals, fricatives, stops, approximants and silence. The features identified useful for broad phoneme classification are voiced/unvoiced decision, zero crossing rate (ZCR), short time energy, most dominant frequency, energy in most dominant frequency, spectral flatness measure and first three formants. Features derived from short time frames of training speech are used to train a multilayer feedforward neural network based classifier with manually marked class label as output and classification accuracy is then tested. Later this broad phoneme classifier is used for broad syllable structure prediction which is useful for applications such as automatic speech recognition and automatic language identification.
IJERA (International journal of Engineering Research and Applications) is International online, ... peer reviewed journal. For more detail or submit your article, please visit www.ijera.com
International Journal of Engineering Research and Applications (IJERA) is an open access online peer reviewed international journal that publishes research and review articles in the fields of Computer Science, Neural Networks, Electrical Engineering, Software Engineering, Information Technology, Mechanical Engineering, Chemical Engineering, Plastic Engineering, Food Technology, Textile Engineering, Nano Technology & science, Power Electronics, Electronics & Communication Engineering, Computational mathematics, Image processing, Civil Engineering, Structural Engineering, Environmental Engineering, VLSI Testing & Low Power VLSI Design etc.
In this paper, the performances of adaptive noise cancelling system employing Least Mean Square (LMS) algorithm are studied considering both white Gaussian noise (Case 1) and colored noise (Case 2)
situations. Performance is analysed with varying number of iterations, Signal to Noise Ratio (SNR) and tap size with considering Mean Square Error (MSE) as the performance measurement criteria. Results show that the noise reduction is better as well as convergence speed is faster for Case 2 as compared with Case 1. It is also observed that MSE decreases with increasing SNR with relatively faster decrease of MSE in Case 2 as compared with Case 1, and on average MSE increases linearly with increasing number of filter
coefficients for both type of noise situations. All the experiments have been done using computer
simulations implemented on MATLAB platform.
A Gaussian Clustering Based Voice Activity Detector for Noisy Environments Us...CSCJournals
In this paper, a voice activity detector is proposed on the basis of Gaussian modeling of noise in the spectro-temporal space. Spectro-temporal space is obtained from auditory cortical processing. The auditory model that offers a multi-dimensional picture of the sound includes two stages: the initial stage is a model of inner ear and the second stage is the auditory central cortical modeling in the brain. In this paper, the speech noise in this picture has been modeled by a 3-D mono Gaussian cluster. At the start of suggested VAD process, the noise is modeled by a Gaussian shaped cluster. The average noise behavior is obtained in different spectrotemporal space in various points for each frame. In the stage of separation of speech from noise, the criterion is the difference between the average noise behavior and the speech signal amplitude in spectrotemporal domain. This was measured for each frame and was used as the criterion of classification. Using Noisex92, this method is tested in different noise models such as White, exhibition, Street, Office and Train noises. The results are compared to both auditory model and multifeature method. It is observed that the performance of this method in low signal-to-noise ratios (SNRs) conditions is better than other current methods.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Compatible with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Vaccine management system project report documentation..pdfKamal Acharya
The Division of Vaccine and Immunization is facing increasing difficulty monitoring vaccines and other commodities distribution once they have been distributed from the national stores. With the introduction of new vaccines, more challenges have been anticipated with this additions posing serious threat to the already over strained vaccine supply chain system in Kenya.
Automobile Management System Project Report.pdfKamal Acharya
The proposed project is developed to manage the automobile in the automobile dealer company. The main module in this project is login, automobile management, customer management, sales, complaints and reports. The first module is the login. The automobile showroom owner should login to the project for usage. The username and password are verified and if it is correct, next form opens. If the username and password are not correct, it shows the error message.
When a customer search for a automobile, if the automobile is available, they will be taken to a page that shows the details of the automobile including automobile name, automobile ID, quantity, price etc. “Automobile Management System” is useful for maintaining automobiles, customers effectively and hence helps for establishing good relation between customer and automobile organization. It contains various customized modules for effectively maintaining automobiles and stock information accurately and safely.
When the automobile is sold to the customer, stock will be reduced automatically. When a new purchase is made, stock will be increased automatically. While selecting automobiles for sale, the proposed software will automatically check for total number of available stock of that particular item, if the total stock of that particular item is less than 5, software will notify the user to purchase the particular item.
Also when the user tries to sale items which are not in stock, the system will prompt the user that the stock is not enough. Customers of this system can search for a automobile; can purchase a automobile easily by selecting fast. On the other hand the stock of automobiles can be maintained perfectly by the automobile shop manager overcoming the drawbacks of existing system.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. 
Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Cosmetic shop management system project report.pdfKamal Acharya
Buying new cosmetic products is difficult. It can even be scary for those who have sensitive skin and are prone to skin trouble. The information needed to alleviate this problem is on the back of each product, but it's tough to interpret those ingredient lists unless you have a background in chemistry.
Instead of buying and hoping for the best, we can use data science to help us predict which products may be good fits for us. It includes various function programs to do the above mentioned tasks.
Data file handling has been effectively used in the program.
The automated cosmetic shop management system should deal with the automation of the general workflow and administration process of the shop. The main processes of the system focus on the customer's request, where the system is able to search for the most appropriate products and deliver them to the customers. It should help the employees to quickly identify the list of cosmetic products that have reached the minimum quantity and also keep track of the expiry date of each cosmetic product. It should help the employees to find the rack number in which the product is placed. It is also a faster and more efficient approach.
Forklift Classes Overview by Intella PartsIntella Parts
Discover the different forklift classes and their specific applications. Learn how to choose the right forklift for your needs to ensure safety, efficiency, and compliance in your operations.
For more technical information, visit our website https://intellaparts.com
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Final project report on grocery store management system..pdfKamal Acharya
In today’s fast-changing business environment, it’s extremely important to be able to respond to client needs in the most effective and timely manner. If your customers wish to see your business online and have instant access to your products or services.
Online Grocery Store is an e-commerce website, which retails various grocery products. This project allows viewing various products available enables registered users to purchase desired products instantly using Paytm, UPI payment processor (Instant Pay) and also can place order by using Cash on Delivery (Pay Later) option. This project provides an easy access to Administrators and Managers to view orders placed using Pay Later and Instant Pay options.
In order to develop an e-commerce website, a number of Technologies must be studied and understood. These include multi-tiered architecture, server and client-side scripting techniques, implementation technologies, programming language (such as PHP, HTML, CSS, JavaScript) and MySQL relational databases. This is a project with the objective to develop a basic website where a consumer is provided with a shopping cart website and also to know about the technologies used to develop such a website.
This document will discuss each of the underlying technologies to create and implement an e- commerce website.
TECHNICAL TRAINING MANUAL GENERAL FAMILIARIZATION COURSEDuvanRamosGarzon1
AIRCRAFT GENERAL
The Single Aisle is the most advanced family aircraft in service today, with fly-by-wire flight controls.
The A318, A319, A320 and A321 are twin-engine subsonic medium range aircraft.
The family offers a choice of engines
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
The Benefits and Techniques of Trenchless Pipe Repair.pdf
Paper id 28201448
1. International Journal of Research in Advent Technology, Vol.2, No.8, August 2014
E-ISSN: 2321-9637
98
Speech Enhancement of Punjabi Language at Phoneme
Level using Digital Signal Processing Techniques
Jaismine Jassal1, Manjot Kaur Gill2
M.Tech. student, Dept. of Computer Science and Engineering1, Guru Nanak Dev Engg. College, Ludhiana1
Assistant Professor, Dept. of Information Technology2,Guru Nanak Dev Engg. College, Ludhiana2
Email:jassal.priya@yahoo.com1 , gill.manjot@gmail.com2
Abstract- This paper presents an overview of several of the most commonly used methods for the enhancement of degraded speech.
Common methods such as Spectral Subtraction, the Wiener Filter, the Kalman Filter and the RASTA Filter are explained, together with the Proposed Method,
which contains features from the methods mentioned. Each method uses certain Digital Signal Processing
(DSP) techniques. Framing, windowing, DFT (Discrete Fourier Transform), FFT (Fast Fourier Transform), noise
detection and SNR are the common steps and parameters used in each method. These methods are applied to phonemes of the Punjabi
language extracted from recorded words.
Keywords- Noise, speech enhancement, phonemes, SNR (Signal to Noise Ratio).
1. INTRODUCTION
Speech signals in real-world scenarios are often corrupted
by various types of degradation. The most common
degradations include background noise, reverberation and
speech from competing speaker(s). Degraded speech is
poor, both in terms of quality and intelligibility. Therefore,
there is a need to process the degraded speech to enhance
its perceptual quality and intelligibility. Several methods
have been proposed in the literature for this purpose. Degraded
speech is processed in the frequency domain to achieve
enhancement. In this work, different types of noise from the environment
were added and the results were computed and
compared.
This paper provides an overview of some of the
commonly used methods, a comparison between them, and
the proposed method. The rest of the paper is organised as
follows: Section 2 presents a review of the methods for
processing speech degraded by background noise. Section 3
describes the Punjabi language and its phonemes. Section 4
covers the methodology followed. Section 5 presents the
comparative results and a discussion of the methods
applied to the phonemes. The conclusion is discussed in
Section 6.
2. ENHANCEMENT OF NOISY SPEECH
Background noise is the most common factor that causes
degradation of the quality and intelligibility of speech. The
term background noise refers to any unwanted signal that is
added to the desired signal. Background noise can be stationary
or non-stationary and is assumed to be uncorrelated
and additive to the speech signal. Mathematically, speech
degraded by background noise can be expressed as the sum
of clean speech and background noise (Krishnamoorthy and
Prasanna, 2010) given as
s(n) = x(n) + p(n) (1)
where s(n), x(n) and p(n) denote the noisy speech, clean
speech and the background noise respectively. In the frequency
domain it can be represented as
S(f) = X(f) + P(f) (2)
where f is the index of frequency bin.
The problem of enhancing noisy speech has received
considerable attention in the literature and a variety of
methods have been proposed to address it. An overview
of each of them is given below.
2.1. Spectral Subtraction
Spectral Subtraction is a very popular method to enhance
the quality of speech that has been degraded by
additive noise. It is a form of spectral amplitude estimation
used to restore signals degraded by additive
noise, where the phase distortion can be ignored
(Saeed, 2005), since it is assumed that the human ear
is insensitive to phase. This method of enhancement
restores the signal by subtracting an estimate
of the noise spectrum from the noisy signal spectrum
(Saeed, 2005). In Spectral Subtraction the noise in the
degraded speech is estimated from the 'pauses' or
'quiet' periods in the speech signal, when nothing is
being said and only noise is present. The noise
spectrum is then usually updated as more frames of
noise or silent periods appear in the speech signal.
However, since the noise is random by nature, the resultant
spectrum can become negative when Spectral Subtraction
is applied. The negative values therefore
need to be set to a positive value. This in turn can also
cause distortion of the signal, but it reduces the distortion
caused when the spectrum turns negative. Spectral Subtraction
takes place in the frequency domain
rather than the time domain in which the signal is
given. The transformation to the frequency domain
is usually done using a Discrete Fourier Transform
(DFT). Here, the Fast Fourier Transform (FFT) is used
instead. The FFT computes the same result as the DFT but
more efficiently. It is therefore quicker and
uses fewer resources, making
the system more efficient (Paul, 2009).
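The subtraction step described above can be sketched for a single frame as follows. This is a minimal illustration and not the authors' Matlab implementation: the function names `dft`, `idft` and `spectral_subtract` and the `floor` parameter are hypothetical, a naive DFT stands in for the FFT so that the sketch needs only the Python standard library, and `noise_mag` is assumed to be a per-bin noise magnitude estimate obtained from pause frames.

```python
import cmath
import math


def dft(frame):
    """Naive O(N^2) DFT; an FFT returns the same values faster."""
    n = len(frame)
    return [
        sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse of dft(); keeps the real part of each reconstructed sample."""
    n = len(spectrum)
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
             for k in range(n)) / n).real
        for t in range(n)
    ]


def spectral_subtract(noisy_frame, noise_mag, floor=0.0):
    """Subtract a noise magnitude estimate per bin and keep the noisy phase.

    noise_mag is an assumed per-bin estimate (e.g. averaged over pauses).
    Magnitudes that would become negative are clamped to `floor`.
    """
    spectrum = dft(noisy_frame)
    cleaned = []
    for s_k, n_k in zip(spectrum, noise_mag):
        mag = max(abs(s_k) - n_k, floor)
        cleaned.append(cmath.rect(mag, cmath.phase(s_k)))
    return idft(cleaned)
```

With a zero noise estimate the frame is returned unchanged, and with an estimate larger than every bin the output is clamped to silence, which shows why the choice of floor value affects the residual distortion.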
2.2. Wiener Filtering Method
An improvement over Spectral Subtraction is the Wiener Filter. In
signal processing, the Wiener Filter is a filter used to
produce an estimate of a desired or target random process
by linear time-invariant filtering of an observed noisy
process, assuming known stationary signal and noise spectra,
and additive noise. The Wiener Filter minimizes
the mean square error between the estimated random
process and the desired process. The goal of the Wiener
Filter is to filter out noise that has corrupted a signal
(Paul, 2009).
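For uncorrelated additive noise, the minimum mean square error criterion leads to the familiar frequency-domain gain H(f) = Px(f) / (Px(f) + Pp(f)), where Px and Pp are the power spectra of the clean speech and the noise. The sketch below applies that gain per frequency bin; the function names are hypothetical, and the power spectra are assumed to be estimated elsewhere (in practice the clean-speech power is itself an estimate).

```python
def wiener_gain(signal_power, noise_power):
    """Per-bin Wiener gain H = Px / (Px + Pp); 0 where both powers are 0."""
    return [
        px / (px + pp) if (px + pp) > 0 else 0.0
        for px, pp in zip(signal_power, noise_power)
    ]


def apply_gain(noisy_spectrum, gain):
    """Scale each (possibly complex) spectral bin by its real-valued gain."""
    return [g * s for g, s in zip(gain, noisy_spectrum)]
```

Bins dominated by speech get a gain close to 1, while bins dominated by noise are attenuated towards 0.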
2.3. Kalman Filtering Method
The next method for improving the signal is Kalman
Filtering. It is an adaptive least square error filter
that provides an efficient recursive computational solution
for estimating a signal in the presence of Gaussian
noise. It is an algorithm which makes optimal use of
imprecise data on a linear (or nearly linear) system with
Gaussian errors to continuously update the best estimate
of the system's current state (Gannot et al, 1998).
Kalman Filter theory is based on a state-space approach
in which a state equation models the dynamics
of the signal generation process and an observation
equation models the noisy and distorted observation
signal.
This method, however, is best suited to the reduction
of white noise, in order to comply with the Kalman assumptions. In
deriving the Kalman equations it is normally assumed that
the process noise (the additive noise that is observed in
the observation vector) is uncorrelated and has a normal
distribution. This assumption extends to the whiteness
of the noise chosen. However, different
methods have been developed to fit the Kalman approach to
colored noise (Gannot et al, 1998).
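As a hedged illustration of the state-space recursion, the sketch below runs a scalar Kalman filter under the assumed model x[n] = a·x[n-1] + w[n], y[n] = x[n] + v[n], with white Gaussian w and v of variances q and r. The function name and all default parameter values are illustrative assumptions; speech enhancement systems typically use a higher-order autoregressive state model rather than this one-dimensional case.

```python
def kalman_1d(observations, a=1.0, q=1e-3, r=0.1, x0=0.0, p0=1.0):
    """Scalar Kalman filter for x[n] = a*x[n-1] + w, y[n] = x[n] + v.

    q and r are the assumed variances of the process and observation
    noise; x0 and p0 are the initial state estimate and its variance.
    """
    x, p = x0, p0
    estimates = []
    for y in observations:
        # Prediction from the state equation
        x_pred = a * x
        p_pred = a * p * a + q
        # Correction from the observation equation
        innovation = y - x_pred
        gain = p_pred / (p_pred + r)
        x = x_pred + gain * innovation
        p = (1.0 - gain) * p_pred
        estimates.append(x)
    return estimates
```

Feeding a constant observation sequence shows the recursive behaviour: the estimate starts near the initial guess and converges to the observed level as the error variance shrinks.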
2.4. RASTA Method
The next technique is RASTA, i.e. Relative Spectral
Analysis. To compensate for linear channel distortions,
the analysis library provides the ability to perform
RASTA Filtering. This method can be used in either
the log spectral or the cepstral domain. In effect, the filter
band-pass filters each feature coefficient. Linear channel
distortions appear as an additive constant in both the
log spectral and the cepstral domains. The high-pass
portion of the equivalent band-pass filter alleviates the
effect of convolutional noise introduced in the channel.
The low-pass filtering helps to smooth frame-to-frame
spectral changes (Urmila and Vilas, n.d).
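The band-pass behaviour can be sketched by filtering the time trajectory of one log-spectral (or cepstral) coefficient. The coefficients below follow the commonly cited RASTA transfer function, a numerator of 0.1·(2 + z⁻¹ − z⁻³ − 2z⁻⁴) with a single pole at 0.98; this is an assumption, since the text does not state which coefficients were used, and the function name is hypothetical.

```python
def rasta_filter(trajectory, pole=0.98):
    """Band-pass filter one feature trajectory (one value per frame).

    The numerator taps sum to zero, so any constant offset (a fixed
    channel effect in the log domain) is removed; the pole smooths
    frame-to-frame changes.
    """
    numer = [0.2, 0.1, 0.0, -0.1, -0.2]
    out = []
    prev = 0.0
    for n in range(len(trajectory)):
        fir = sum(b * trajectory[n - k]
                  for k, b in enumerate(numer) if n - k >= 0)
        prev = fir + pole * prev
        out.append(prev)
    return out
```

A constant input decays to zero after the start-up transient, which is the additive-constant suppression described above.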
2.5. The Proposed Method for Speech Enhancement
The Proposed Method uses features of the Wiener and
Kalman Filtering methods. The connection is not a simple
cascade; the blocks interact. The combination
of the Wiener and Kalman approaches can be termed a hybrid
approach, used to improve the performance even at
low SNRs (0-15 dB). This method is designed to enhance
speech (i.e. phonemes in our case) degraded
by noise. The method contains certain features of the Wiener
filter and some of the parameters and features used in
the Kalman filtering technique.
The features taken from the Wiener filter include doubling the magnitude
and eliminating negative magnitudes, because
sometimes the estimated noise can be larger than the
current signal and we end up with a negative magnitude.
This would lead to poor quality sound, so the magnitude needed
to be limited to positive values to reduce musical noise.
It was also necessary to keep the code flexible so that a
range of values could be tested for the different parameters.
The features taken from the Kalman filter consist of the innovation
process, the Kalman gain, and the recursive update. The Kalman
gain matrix acts as a coefficient to the innovation
sequence. Their product gives a correction factor that is
used to update the initial prediction of the state vector.
The final, optimal estimate is the sum of the initial predicted
value and the correction factor. Likewise, the a
priori error covariance is updated to give the a posteriori
error covariance matrix at time n. Along with this, the
SNR was also used. The tests were conducted using
the combination of all these factors to obtain enhanced
and better results than the filtering methods discussed
above.
3. PUNJABI LANGUAGE PHONEMES
A phoneme is the smallest segmental unit of sound used to
form contrasts between utterances (Phonemes, n.d).
The Punjabi language has 38 consonants, 10 non-nasal
vowels and 10 nasal vowels. These are shown as follows
(Vivek and Meenakshi, 2013):
Figure 1. Punjabi Consonants and Vowels
Consonants are further divided into aspirated and non-
aspirated consonants (Phonemes, n.d). Aspirated consonants
have the sound of ( h, B, P, T, J, C, D, K, d, G, Q),
whereas non-aspirated consonants (p, b, q, t, s, j, c, h,
d, r, V, S, g, l, n, x, v, X) have a single character sound.
The ten non-nasal vowels are divided into two forms,
i.e. independent vowels ( A, Aw, au, aU, ie, eI, AY, a
o, AO) and dependent vowels ( w, i, I, u, U, y, Y, o,
O ). There are three nasal symbols ( N, M, ` ) that produce
a double sound and three paireens ( h, v, r).
4. METHODOLOGY
Step 1. Input: Word-level input is fed into the system.
This can be done using a microphone to record the word.
Step 2. Phoneme extraction: Break words into phonemes.
This is done with the help of Sound Forge 5.0.
Step 3. Add noise: Different types of noise are added.
One is random noise generated in Matlab (7.12),
which has the same length as the signal (phoneme).
Apart from this, other types of noise such as cars, aircraft,
household, bells, water etc. were added, whose length
was truncated to the length of the speech (phoneme).
Step 4. DSP Techniques: Techniques like
digital filtering, blocking into frames, windowing,
noise detection, SNR (Signal-to-Noise Ratio), FFT (Fast
Fourier Transform) etc. are applied before the filtering
methods.
Step 5. Filtering methods: The methods explained
above in Section 2 are used and then the results are
computed and compared.
Step 6. Output: Enhanced speech.
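The framing and windowing mentioned in Step 4 can be sketched as follows. This is a minimal Python illustration, not the Matlab code used in this work; the Hamming window, the frame length and the hop size are assumptions, since the paper does not state which values were used.

```python
import math


def hamming(n):
    """Symmetric Hamming window of length n."""
    if n == 1:
        return [1.0]
    return [0.54 - 0.46 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def frame_signal(samples, frame_len, hop):
    """Split samples into overlapping frames and window each frame.

    Trailing samples that do not fill a complete frame are dropped.
    """
    window = hamming(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append([samples[start + i] * window[i] for i in range(frame_len)])
    return frames


# Hypothetical input: a constant signal, so each frame equals the window itself.
frames = frame_signal([1.0] * 10, frame_len=4, hop=2)
```

Each windowed frame would then be passed to the FFT and on to the filtering methods of Step 5.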
5. RESULTS AND DISCUSSION
Different types of noise were used along with different levels of SNR (Signal-to-Noise Ratio). Apart from this, the test for random noise generated in Matlab was also done at different SNR values. Throughout the development of the algorithms, tests were continuously carried out to verify that the filters were operating as required. These tests involved the developer listening to the filtered speech, viewing the spectrogram and examining graphs of the speech signals that had gone through the filters. Doing so helped to track the progress of the filtering methods. Once the algorithms were working, they were set up so that the SNR of the signal could be changed. This made it possible to choose any SNR value and run the filters to see how well they functioned under different levels of noise in the speech signals.
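One common way to obtain a chosen SNR is to rescale the noise before it is added to the speech. The sketch below shows this in Python rather than Matlab. The paper does not say how the SNR was set, so the power-based scaling here is an assumption, and the function names are hypothetical. Both signals are assumed to have the same length already.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a signal."""
    return sum(v * v for v in x) / len(x)


def scale_noise_to_snr(clean, noise, target_db):
    """Rescale noise so that 10*log10(P_clean / P_noise) equals target_db."""
    target_noise_power = power(clean) / (10.0 ** (target_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [gain * v for v in noise]


def measure_snr_db(clean, noise):
    """SNR in dB of a clean signal relative to a noise signal."""
    return 10.0 * math.log10(power(clean) / power(noise))


# Hypothetical test signals: a sine wave standing in for a phoneme, plus Gaussian noise.
random.seed(0)
clean = [math.sin(2 * math.pi * 5 * t / 200) for t in range(200)]
noise = [random.gauss(0.0, 1.0) for _ in range(200)]
scaled = scale_noise_to_snr(clean, noise, 20.0)
```

Adding `scaled` to `clean` sample by sample would then give a noisy signal at 20 dB SNR; repeating with 30.0 and 40.0 covers the range tested here.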
The test itself consisted of different speech samples. Each speech sample was then tested again at different SNR values, ranging from 20 dB to 40 dB. Most tests were held in a relaxed atmosphere at the PC, using either headphones or speakers. First of all, the phoneme is selected; afterwards the noise is chosen and added to the phoneme. The selected phoneme was extracted from the recorded word using Sound Forge 5.0. Then the lengths of the phoneme and the noise were made equal by truncation in Matlab (equation 4), where 'Len' stores the minimum of the lengths of the clean signal and the noise signal. The noisy signal is then computed using the addition operator in Matlab (equation 5). The index range (1:Len) is used to shorten both the clean and the noise signal to 'Len'.
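The truncation and addition described above can be sketched as follows, in Python rather than Matlab. The function name and the sample values are assumptions for illustration; only 'Len' and the (1:Len) indexing come from the text.

```python
def mix(clean, noise):
    """Truncate both signals to the shorter length, then add them sample by sample."""
    length = min(len(clean), len(noise))   # plays the role of 'Len'
    # Indexing 0..length-1 is the Python equivalent of Matlab's (1:Len).
    return [clean[i] + noise[i] for i in range(length)]


# Hypothetical values: a four-sample clean signal and a three-sample noise signal.
noisy = mix([1.0, 2.0, 3.0, 4.0], [0.5, -0.5, 0.25])
```

The result has three samples, because the longer signal is cut to the length of the shorter one before the addition.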
In the labelling of each figure, SS, WF, RF, KF and PM denote the Spectral Subtraction method, Wiener Filtering method, RASTA Filtering method, Kalman Filtering method and the Proposed Method respectively.
The graph for the original signal, i.e. phoneme (ey), is shown in Figure 2.
Figure 2. Original Signal
5.1. Graph of random noise for each method at different values of SNR
The graphs are plotted in Matlab 7.12 using the function 'plot', which has the syntax shown:
plot(X,Y); (5)
which creates a 2-D line plot of the data in Y versus the corresponding values in X, where X and Y are both vectors, both matrices, or one a vector and the other a matrix, of equal length.
5.1.1. Comparison at SNR 20.0 dB
Figure 3. (a) SS (b) WF (c) RF (d) KF (e) PM at 20 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.1.2. Comparison at SNR 30.0 dB
Figure 4. (a) SS (b) WF (c) RF (d) KF (e) PM at 30 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.1.3. Comparison at SNR 40.0 dB
Figure 5. (a) SS (b) WF (c) RF (d) KF (e) PM at 40 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.2. Graph of birds005.wav noise for each method at different values of SNR
From the previous graphs, we can clearly see that the proposed method produces the best result compared to the other filters, and it is observed that each filter works best when the SNR is increased. Apart from this, other types of noise were also introduced, which were truncated to the length of the phoneme. The result for each filter at SNR values ranging from 20 dB to 40 dB with the noise birds005.wav is shown below:
5.2.1. Comparison at SNR 20.0 dB
Figure 6. (a) SS (b) WF (c) RF (d) KF (e) PM at 20 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.2.2. Comparison at SNR 30.0 dB
Figure 7. (a) SS (b) WF (c) RF (d) KF (e) PM at 30 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.2.3. Comparison at SNR 40.0 dB
Figure 8. (a) SS (b) WF (c) RF (d) KF (e) PM at 40 dB, showing clean signal (blue), noisy signal (red) and filtered signal (green).
5.3. Spectrogram of birds005.wav noise for each method at different values of SNR
Another method used for comparison between the different filters is the spectrogram. It is a visual representation of the spectrum of frequencies in a sound or other signal as they vary with time or some other variable. Spectrograms are also commonly referred to as spectral waterfalls, voiceprints or voicegrams (Spectrogram, n.d.). Spectrograms can be used to identify spoken words phonetically, and they are used extensively in fields such as music, sonar, radar, speech processing and seismology (Spectrogram, n.d.).
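How a spectrogram is computed can be sketched with a short-time Fourier transform. The Python code below is an illustration only, not the routine used to produce the figures in this section; the Hann window, frame length, hop size and the direct DFT (used here in place of an FFT for brevity) are all assumptions.

```python
import cmath
import math


def spectrogram(samples, frame_len, hop):
    """Magnitude spectrogram: one list of frame_len // 2 + 1 magnitudes per frame."""
    # Periodic Hann window, applied to every frame before the transform.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    columns = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = []
        for k in range(frame_len // 2 + 1):
            # Direct DFT of the windowed frame at frequency bin k.
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                      for n in range(frame_len))
            mags.append(abs(acc))
        columns.append(mags)
    return columns


# Hypothetical test tone: exactly 4 cycles per 32-sample frame.
tone = [math.sin(2 * math.pi * 4 * n / 32) for n in range(128)]
spec = spectrogram(tone, frame_len=32, hop=16)
```

Each column of `spec` corresponds to one time frame and each row to one frequency bin; for this test tone the largest magnitude in every column falls in bin 4. Plotting the magnitudes as colour against time and frequency gives the kind of image shown in the figures of this section.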
5.3.1. Comparison at SNR 20.0 dB
Figure 9. (a) Original Signal (b) SS (c) WF (d) RF (e) KF (f) PM at 20 dB
5.3.2. Comparison at SNR 30.0 dB
Figure 10. (a) Original Signal (b) SS (c) WF (d) RF (e) KF (f) PM at 30 dB
International Journal of Research in Advent Technology, Vol.2, No.8, August 2014
E-ISSN: 2321-9637
5.3.3. Comparison at SNR 40.0 dB
Figure 11. (a) Original Signal (b) SS (c) WF (d) RF (e) KF
(f) PM at 40 dB
From the above spectrograms it can be seen more precisely that each filter performs better when the SNR (Signal-to-Noise Ratio) is increased. At the same time, they indicate that the proposed method performs better even at low SNR values, as described in the earlier comparison.
6. CONCLUSION
After studying and comparing all the filtering techniques, it is clear that the proposed method gives better results, both in random noise and in the other noises observed, recorded and used with these methods.
From the discussion in the previous section, it becomes clear that even at low SNR values the results of the proposed method are better than those of the other four filters. Table 1 shows ratings from 1 to 5, corresponding to very poor, poor, bad, good and very good respectively.
In Table 1 the five methods are labelled SS, WF, RF, KF and PM, denoting the Spectral Subtraction method, Wiener Filtering method, RASTA Filtering method, Kalman Filtering method and the proposed method respectively. The rating is based on the results computed and the comparison shown in the previous section.
Table 1: Rating for each method based on the testing results
                       Filtering Methods
Noise type (.wav)      SS   WF   RF   KF   PM
Randn                  2    4    4    3    5
cars002.wav            2    4    4    3    5
household018.wav       3    3    4    3    5
aircraft003.wav        3    4    4    2    5
animals006.wav         2    3    4    2    4
birds005.wav           2    3    3    2    4
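As a simple summary of Table 1, the ratings can be averaged per method. The Python sketch below uses only the values listed in the table; the averaging itself is an illustrative addition and not part of the original evaluation.

```python
# Ratings copied from Table 1, in the column order SS, WF, RF, KF, PM.
methods = ["SS", "WF", "RF", "KF", "PM"]
ratings = {
    "Randn":            [2, 4, 4, 3, 5],
    "cars002.wav":      [2, 4, 4, 3, 5],
    "household018.wav": [3, 3, 4, 3, 5],
    "aircraft003.wav":  [3, 4, 4, 2, 5],
    "animals006.wav":   [2, 3, 4, 2, 4],
    "birds005.wav":     [2, 3, 3, 2, 4],
}

# Mean rating of each method over the six noise types.
means = {
    name: sum(row[i] for row in ratings.values()) / len(ratings)
    for i, name in enumerate(methods)
}
best = max(means, key=means.get)
```

Here `best` is the method with the highest mean rating across the six noise types, which for these values is PM.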
REFERENCES
[1] P. Krishnamoorthy, S. R. Mahadeva Prasanna (2010), Temporal and Spectral Processing Methods for Processing of Degraded Speech: A Review.
[2] Paul Coffey (2009), Enhancement of Speech in Noisy Condition, Project Report, National University of Ireland, B.E. Electronic Engineering.
[3] Phonemes (n.d.), Available from: https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Phoneme.html
[4] S. Gannot, D. Burshtein, E. Weinstein (1998), Iterative and Sequential Kalman Filter-Based Speech Enhancement Algorithms, IEEE Transactions on Speech and Audio Processing, vol. 6, no. 4, pp. 373-385.
[5] Saeed V. Vaseghi (2005), Advanced Digital Signal Processing and Noise Reduction, Third edition.
[6] Spectrogram (n.d.), Available from: en.wikipedia.org/wiki/Spectrogram
[7] Urmila Shrawankar, Dr Vilas Thakare (n.d.), Techniques for Feature Extraction in Speech Recognition System: A Comparative Study, Available from: arxiv.org/ftp/arxiv/papers/1305/1305.1145.pdf
[8] Vivek Sharma, Meenakshi Sharma (2013), A Quantitative Study of the Automatic Speech Recognition Technique, International Journal of Advances in Science and Technology, vol. 1, issue 1.