Speech Enhancement by suppressing uncorrelated acoustically added noise has been a challenging topic of research for many years. These are the primary choice for real time applications due to the simplicity and comparatively low computational load. This paper shows VAD (Voice activity detection) technique that can detect the non speech segment from the speech signal. It is also shown that it can work powerfully in an unpredictable noise ambience. The technique is mostly done in microprocessors or DSP processors because of their flexibility. But there are several advantages of FPGA over DSP processors like high cost per logic element related to these processors makes them improper for large scale use. From the experimental results, VAD method is implemented on the FPGA chip.
Analysis of PEAQ Model using Wavelet Decomposition Techniquesidescitation
Digital broadcasting, internet audio and music database make use of audio
compression and coding techniques to reduce high quality audio signal without impairing its
perceptual quality. Audio signal compression is the lossy compression
technique, It
converts original converting audio signal into compressed bitstream. The compressed audio
bitstream is decoded at the decoder to produce a close approximation of the original signal.
For the purpose of improving the coding this work attempts to verify the perceptual
evaluation of audio quality (PEAQ) model in BS.1387 using wavelet decomposition
techniques. Finally the comparison of masking threshold for sub-bands using Wavelet
techniques and Fast Fourier transform (FFT) will be done
The Short-Time Silence of Speech Signal as Signal-To-Noise Ratio EstimatorIJERA Editor
It is proposed in this paper to use a small portion of the audio speech signal to estimate Signal-to-Noise Ratio
(SNR). It is found that, the first 30 ms duration has enough information about the SNR in advance. The first 30
ms of a recorded speech usually comes from the silence rather than speech. This is because the speaker usually
starts the recording process or wait for it before he/she can deliver the utterance. For testing and comparing the
proposed estimator, different noisy corpora are built upon the TIMIT data. The average estimation of the
suggested algorithm proves to get better results as compared to the Waveform Amplitude Distribution Analysis
(WADA) and the National Institute of Standard and Technology (NIST) SNR estimators. The complexity of the
STS-SNR estimator is less than both as it only processes a small portion of the audio samples
Analysis of PEAQ Model using Wavelet Decomposition Techniquesidescitation
Digital broadcasting, internet audio and music database make use of audio
compression and coding techniques to reduce high quality audio signal without impairing its
perceptual quality. Audio signal compression is the lossy compression
technique, It
converts original converting audio signal into compressed bitstream. The compressed audio
bitstream is decoded at the decoder to produce a close approximation of the original signal.
For the purpose of improving the coding this work attempts to verify the perceptual
evaluation of audio quality (PEAQ) model in BS.1387 using wavelet decomposition
techniques. Finally the comparison of masking threshold for sub-bands using Wavelet
techniques and Fast Fourier transform (FFT) will be done
The Short-Time Silence of Speech Signal as Signal-To-Noise Ratio EstimatorIJERA Editor
It is proposed in this paper to use a small portion of the audio speech signal to estimate Signal-to-Noise Ratio
(SNR). It is found that, the first 30 ms duration has enough information about the SNR in advance. The first 30
ms of a recorded speech usually comes from the silence rather than speech. This is because the speaker usually
starts the recording process or wait for it before he/she can deliver the utterance. For testing and comparing the
proposed estimator, different noisy corpora are built upon the TIMIT data. The average estimation of the
suggested algorithm proves to get better results as compared to the Waveform Amplitude Distribution Analysis
(WADA) and the National Institute of Standard and Technology (NIST) SNR estimators. The complexity of the
STS-SNR estimator is less than both as it only processes a small portion of the audio samples
This paper proposes a voice morphing system for people suffering from Laryngectomy, which is the surgical removal of all or part of the larynx or the voice box, particularly performed in cases of laryngeal cancer. A primitive method of achieving voice morphing is by extracting the source's vocal coefficients and then converting them into the target speaker's vocal parameters. In this paper, we deploy Gaussian Mixture Models (GMM) for mapping the coefficients from source to destination. However, the use of the traditional/conventional GMM-based mapping approach results in the problem of over-smoothening of the converted voice. Thus, we hereby propose a unique method to perform efficient voice morphing and conversion based on GMM, which overcomes the traditional-method effects of over-smoothening. It uses a technique of glottal waveform separation and prediction of excitations and hence the result shows that not only over-smoothening is eliminated but also the transformed vocal tract parameters match with the target. Moreover, the synthesized speech thus obtained is found to be of a sufficiently high quality. Thus, voice morphing based on a unique GMM approach has been proposed and also critically evaluated based on various subjective and objective evaluation parameters. Further, an application of voice morphing for Laryngectomees which deploys this unique approach has been recommended by this paper
METHOD FOR REDUCING OF NOISE BY IMPROVING SIGNAL-TO-NOISE-RATIO IN WIRELESS LANIJNSA Journal
The signal to noise ratio (SNR) is one of the important measures for reducing the noise.A technique that uses a linear prediction error filter (LPEF) and an adaptive digital filter (ADF) to achieve noise reduction in a speech and image degraded by additive background noise is proposed. Since a speech signal can be represented as the stationary signal over a short interval of time, most of speech signal can be predicted by the LPEF. This estimation is performed by the ADF which is used as system identification. Noise reduction is achieved by subtracting the reconstructed noise from the speech degraded by additive background noise. Most of the MR image accelerating methods suffers from degradation of acquired images, which is often correlated with the degree of acceleration. However, Wideband MRI is a novel technique that transcends such flaws.In this paper we proposed LPEF and ADF for reducing the noise in speech and also we demonstrate that Wideband MRI is capable of obtaining images with identical quality as conventional MR images in terms of SNR in wireless LAN.
This is my presentation on a Journal Club. It's based on the article: "Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners". You can find all the references in the slide at the end of the article. I review very basic techniques in noise reduction, and how the techniques are implemented in the area of deep neural-network.
PAPR Reduction using Tone Reservation Method in OFDM SignalIJASRD Journal
In this paper, we are going to propose ‘Tone Reservation’ technique to reduce PAPR (Peak to Average Power Ratio) in OFDM signal using new algorithm. It is less complex and also calculates its own threshold value at the time of communication. It also calculates its PRT signal while other algorithms requiring predetermined threshold and PRT. It modifies the data by ‘bit by bit’ comparison with a modified copy of itself (algorithm modified) thus scaling the peaks as and providing a decent BER and good PAPR reduction.
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...sipij
Usually, hearing impaired people use hearing aids which are implemented with speech enhancement
algorithms. Estimation of speech and estimation of nose are the components in single channel speech
enhancement system. The main objective of any speech enhancement algorithm is estimation of noise power
spectrum for non stationary environment. VAD (Voice Activity Detector) is used to identify speech pauses
and during these pauses only estimation of noise. MMSE (Minimum Mean Square Error) speech
enhancement algorithm did not enhance the intelligibility, quality and listener fatigues are the perceptual
aspects of speech. Novel evaluation approach SR (Signal to Residual spectrum ratio) based on uncertainty
parameter introduced for the benefits of hearing impaired people in non stationary environments to control
distortions. By estimation and updating of noise based on division of original pure signal into three parts
such as pure speech, quasi speech and non speech frames based on multiple threshold conditions. Different
values of SR and LLR demonstrate the amount of attenuation and amplification distortions. The proposed
method will compared with any one method WAT(Weighted Average Technique) Hence by using
parameters SR (signal to residual spectrum ratio) and LLR (log like hood ratio), MMSE (Minim Mean
Square Error) in terms of segmented SNR and LLR.
Linear Predictive Coding (LPC) is one of the most powerful speech analysis techniques, and one of the most useful methods for encoding good quality speech at a low bit rate. It provides extremely accurate estimates of speech parameters, and is relatively efficient for computation.
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
발표자: 김준태 (KAIST 박사과정)
발표일: 2018.10
Voice activity detection (VAD) and speech enhancement (SE) are important front-end technologies for noise robust speech recognition system.
From incoming noisy signal, VAD detects the speech signal only and SE removes the noise signal while conserving the speech signal.
For VAD and SE, this presentation will cover the traditional methods, deep learning based methods, and our papers as follows:
1. J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
2. J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Also, this presentation will briefly introduce some experimental results in real-world environment (far-field, noisy environment), conducted on the embedded board.
For VAD,
Traditional VAD methods.
Deep learning based VAD methods.
Paper presentation: J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
End point detection based on VAD.
Experimental results of DNN-EPD on embedded board in real-world environment.
For SE,
Traditional SE methods.
Deep learning based SE methods.
Paper presentation: J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Experimental results in real-world environment.
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audioinventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
FIR Filter Design using Particle Swarm Optimization with Constriction Factor ...IDES Editor
This paper presents an alternative approach for the
design of linear phase digital low pass FIR filter using Particle
Swarm Optimization with Constriction Factor and Inertia
Weight Approach (PSO-CFIWA). FIR filter design is a multimodal
optimization problem. The conventional gradient based
optimization techniques are not efficient for digital filter
design. Given the filter specification to be realized, PSO
algorithm generates a set of filter coefficients and tries to
meet the ideal frequency characteristic. In this paper, for the
given problem, the realization of the FIR filters of different
order has been performed. The simulation results have been
compared with the well accepted evolutionary algorithm such
as genetic algorithm (GA). The results justify that the proposed
filter design approach using PSO-CFIWA outperforms to that
of GA, not only in the accuracy of the designed filter but also
in the convergence speed and solution quality.
This paper proposes a voice morphing system for people suffering from Laryngectomy, which is the surgical removal of all or part of the larynx or the voice box, particularly performed in cases of laryngeal cancer. A primitive method of achieving voice morphing is by extracting the source's vocal coefficients and then converting them into the target speaker's vocal parameters. In this paper, we deploy Gaussian Mixture Models (GMM) for mapping the coefficients from source to destination. However, the use of the traditional/conventional GMM-based mapping approach results in the problem of over-smoothening of the converted voice. Thus, we hereby propose a unique method to perform efficient voice morphing and conversion based on GMM, which overcomes the traditional-method effects of over-smoothening. It uses a technique of glottal waveform separation and prediction of excitations and hence the result shows that not only over-smoothening is eliminated but also the transformed vocal tract parameters match with the target. Moreover, the synthesized speech thus obtained is found to be of a sufficiently high quality. Thus, voice morphing based on a unique GMM approach has been proposed and also critically evaluated based on various subjective and objective evaluation parameters. Further, an application of voice morphing for Laryngectomees which deploys this unique approach has been recommended by this paper
METHOD FOR REDUCING OF NOISE BY IMPROVING SIGNAL-TO-NOISE-RATIO IN WIRELESS LANIJNSA Journal
The signal to noise ratio (SNR) is one of the important measures for reducing the noise.A technique that uses a linear prediction error filter (LPEF) and an adaptive digital filter (ADF) to achieve noise reduction in a speech and image degraded by additive background noise is proposed. Since a speech signal can be represented as the stationary signal over a short interval of time, most of speech signal can be predicted by the LPEF. This estimation is performed by the ADF which is used as system identification. Noise reduction is achieved by subtracting the reconstructed noise from the speech degraded by additive background noise. Most of the MR image accelerating methods suffers from degradation of acquired images, which is often correlated with the degree of acceleration. However, Wideband MRI is a novel technique that transcends such flaws.In this paper we proposed LPEF and ADF for reducing the noise in speech and also we demonstrate that Wideband MRI is capable of obtaining images with identical quality as conventional MR images in terms of SNR in wireless LAN.
This is my presentation on a Journal Club. It's based on the article: "Auditory inspired machine learning techniques can improve speech intelligibility and quality for hearing-impaired listeners". You can find all the references in the slide at the end of the article. I review very basic techniques in noise reduction, and how the techniques are implemented in the area of deep neural-network.
PAPR Reduction using Tone Reservation Method in OFDM SignalIJASRD Journal
In this paper, we are going to propose ‘Tone Reservation’ technique to reduce PAPR (Peak to Average Power Ratio) in OFDM signal using new algorithm. It is less complex and also calculates its own threshold value at the time of communication. It also calculates its PRT signal while other algorithms requiring predetermined threshold and PRT. It modifies the data by ‘bit by bit’ comparison with a modified copy of itself (algorithm modified) thus scaling the peaks as and providing a decent BER and good PAPR reduction.
A Novel Uncertainty Parameter SR ( Signal to Residual Spectrum Ratio ) Evalua...sipij
Usually, hearing impaired people use hearing aids which are implemented with speech enhancement
algorithms. Estimation of speech and estimation of nose are the components in single channel speech
enhancement system. The main objective of any speech enhancement algorithm is estimation of noise power
spectrum for non stationary environment. VAD (Voice Activity Detector) is used to identify speech pauses
and during these pauses only estimation of noise. MMSE (Minimum Mean Square Error) speech
enhancement algorithm did not enhance the intelligibility, quality and listener fatigues are the perceptual
aspects of speech. Novel evaluation approach SR (Signal to Residual spectrum ratio) based on uncertainty
parameter introduced for the benefits of hearing impaired people in non stationary environments to control
distortions. By estimation and updating of noise based on division of original pure signal into three parts
such as pure speech, quasi speech and non speech frames based on multiple threshold conditions. Different
values of SR and LLR demonstrate the amount of attenuation and amplification distortions. The proposed
method will compared with any one method WAT(Weighted Average Technique) Hence by using
parameters SR (signal to residual spectrum ratio) and LLR (log like hood ratio), MMSE (Minim Mean
Square Error) in terms of segmented SNR and LLR.
Linear Predictive Coding (LPC) is one of the most powerful speech analysis techniques, and one of the most useful methods for encoding good quality speech at a low bit rate. It provides extremely accurate estimates of speech parameters, and is relatively efficient for computation.
Deep Learning Based Voice Activity Detection and Speech EnhancementNAVER Engineering
발표자: 김준태 (KAIST 박사과정)
발표일: 2018.10
Voice activity detection (VAD) and speech enhancement (SE) are important front-end technologies for noise robust speech recognition system.
From incoming noisy signal, VAD detects the speech signal only and SE removes the noise signal while conserving the speech signal.
For VAD and SE, this presentation will cover the traditional methods, deep learning based methods, and our papers as follows:
1. J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
2. J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Also, this presentation will briefly introduce some experimental results in real-world environment (far-field, noisy environment), conducted on the embedded board.
For VAD,
Traditional VAD methods.
Deep learning based VAD methods.
Paper presentation: J. Kim and M. Hahn, "Voice Activity Detection Using an Adaptive Context Attention Model," in IEEE Signal Processing Letters, vol. 25, no. 8, pp. 1181-1185, Aug. 2018.
End point detection based on VAD.
Experimental results of DNN-EPD on embedded board in real-world environment.
For SE,
Traditional SE methods.
Deep learning based SE methods.
Paper presentation: J. Kim and M. Hahn, "Speech Enhancement Using a Two Step Network," submitted to IEEE Signal Processing Letters, 2018.
Experimental results in real-world environment.
Novel Approach of Implementing Psychoacoustic model for MPEG-1 Audioinventy
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
FIR Filter Design using Particle Swarm Optimization with Constriction Factor ...IDES Editor
This paper presents an alternative approach for the
design of linear phase digital low pass FIR filter using Particle
Swarm Optimization with Constriction Factor and Inertia
Weight Approach (PSO-CFIWA). FIR filter design is a multimodal
optimization problem. The conventional gradient based
optimization techniques are not efficient for digital filter
design. Given the filter specification to be realized, PSO
algorithm generates a set of filter coefficients and tries to
meet the ideal frequency characteristic. In this paper, for the
given problem, the realization of the FIR filters of different
order has been performed. The simulation results have been
compared with the well accepted evolutionary algorithm such
as genetic algorithm (GA). The results justify that the proposed
filter design approach using PSO-CFIWA outperforms to that
of GA, not only in the accuracy of the designed filter but also
in the convergence speed and solution quality.
A Combined Voice Activity Detector Based On Singular Value Decomposition and ...CSCJournals
voice activity detector (VAD) is used to separate the speech data included parts from silence parts of the signal. In this paper a new VAD algorithm is represented on the basis of singular value decomposition. There are two sections to perform the feature vector extraction. In first section voiced frames are separated from unvoiced and silence frames. In second section unvoiced frames are silence frames. To perform the above sections, first, windowing the noisy signal then Hankel’s matrix is formed for each frame. The basis of statistical feature extraction of purposed system is slope of singular value curve related to each frame by using linear regression. It is shown that the slope of singular values curve per different SNRs in voiced frames is more than the other types and this property can be to achieve the goal the first part can be used. High similarity between feature vector of unvoiced and silence frame caused to approach for separation of the two categories above cannot be used. So in the second part, the frequency characteristics for identification of unvoiced frames from silent frames have been used. Simulation results show that high speed and accuracy are the advantages of the proposed system.
transportation is the back bone of countries economical development. providing good and effective transportation facility will help in developing countries economy. transportation will save time, money, and work can be done easily. intelligent transportation system provides more benefits to the nation.
The developments in the agricultural field are the buzzword in the market. In the field of agriculture, use of proper method of irrigation is important and it is well known that irrigation by drip is very economical and efficient. In the conventional drip irrigation system, the farmer has to keep watch on irrigation timetable, which is different for different crops and it is very difficult. This paper mainly focuses on designing of an accurate & cost effective Global System for Mobile (GSM) Based Automatic Drip Irrigation System using micro-controller. In order to fulfill these objectives we have used relay and solenoid valve along with a 16×2 Liquid Crystal Display (LCD) that can be connected to the microcontroller, which will displays the soil moisture level and ambient temperature. The developed irrigation method removes the need for workmanship for flooding irrigation. Efficient water management plays an important role in the irrigated agricultural cropping systems. Time based control mechanism; volume based control mechanism and priority based mechanism can be designed in one system.
The Solar Roadway generates electrical power from the sun and becomes a nation’s decentralized, intelligent, self-healing power grid, replacing our current deteriorating power distribution infrastructure.
Speech Enhancement Using Spectral Flatness Measure Based Spectral SubtractionIOSRJVSP
This paper is aimed to reduce background noise introduced in speech signal during capture, storage, transmission and processing using Spectral Subtraction algorithm. To consider the fact that colored noise corrupts the speech signal non-uniformly over different frequency bands, Multi-Band Spectral Subtraction (MBSS) approach is exploited wherein amount of noise subtracted from noisy speech signal is decided by a weighting factor. Choice of optimal values of weights decides the performance of the speech enhancement system. In this paper weights are decided based on SFM (Spectral Flatness Measure) than conventional SNR (Signal to Noise Ratio) based rule. Since SFM is able to provide true distinction between speech signal and noise signal. Spectrogram, Mean Opinion Score show that speech enhanced from proposed SFM based MBSS possess better perceptual quality and improved intelligibility than existing SNR based MBSS
VOWEL PHONEME RECOGNITION BASED ON AVERAGE ENERGY INFORMATION IN THE ZEROCROS...ijistjournal
Speech signal is modelled using the average energy of the signal in the zerocrossing intervals. Variation of these energies in the zerocrossing interval of the signal is studied and the distribution of this parameter through out the signal is evaluated. It is observed that the distribution patterns are similar for repeated utterances of the same vowels and varies from vowel to vowel. Credibility of the proposed parameter is verified over five Malayalam (one of the most popular Indian language) vowels using multilayer feed forward artificial neural network based recognition system. The performance of the system using additive white Gaussian noise corrupted speech is also studied for different SNR levels. From the experimental results it is evident that the average energy information in the zerocrossing intervals and its distributions can be effectively utilised for vowel phone classification and recognition.
Improving the Efficiency of Spectral Subtraction Method by Combining it with ...IJORCS
In the field of speech signal processing, Spectral subtraction method (SSM) has been successfully implemented to suppress the noise that is added acoustically. SSM does reduce the noise at satisfactory level but musical noise is a major drawback of this method. To implement spectral subtraction method, transformation of speech signal from time domain to frequency domain is required. On the other hand, Wavelet transform displays another aspect of speech signal. In this paper we have applied a new approach in which SSM is cascaded with wavelet thresholding technique (WTT) for improving the quality of speech signal by removing the problem of musical noise to a great extent. Results of this proposed system have been simulated on MATLAB.
Suppression of noise in noisy speech signal is required in many speech enhancement applications like signal recording and transmission from one place to other. In this paper a novel single line noise cancellation system is proposed using derivative of normalized least mean spare algorithm. The proposed system has two phases. The first phase is generation of secondary reference signal from incoming primary signal itself at initial silence period and pause between two words, which is essential while adaptive filter using as noise canceller. Second phase is noise cancellation using proposed modified error data normalized step size (EDNSS) algorithm. The performance of the proposed algorithm is compared with normalized least mean square (NLMS) algorithm and original EDNSS algorithm using standard IEEE sentence (SP23) of Noizeus data base with different types of real-world noise at different level of signal to noise ratio (SNR). The output of proposed, NLMS and EDNSS algorithm are measured with output SNR, excessive mean square error (EMSE) and misadjustment (M). The results clearly illustrates that the proposed algorithm gives improved result over conventional NLMS and EDNSS algorithm. The speed of convergence is also maintained as same conventional NLMS algorithm.
Single Channel Speech Enhancement using Wiener Filter and Compressive Sensing IJECEIAES
The speech enhancement algorithms are utilized to overcome multiple limitation factors in recent applications such as mobile phone and communication channel. The challenges focus on corrupted speech solution between noise reduction and signal distortion. We used a modified Wiener filter and compressive sensing (CS) to investigate and evaluate the improvement of speech quality. This new method adapted noise estimation and Wiener filter gain function in which to increase weight amplitude spectrum and improve mitigation of interested signals. The CS is then applied using the gradient projection for sparse reconstruction (GPSR) technique as a study system to empirically investigate the interactive effects of the corrupted noise and obtain better perceptual improvement aspects to listener fatigue with noiseless reduction conditions. The proposed algorithm shows an enhancement in testing performance evaluation of objective assessment tests outperform compared to other conventional algorithms at various noise type conditions of 0, 5, 10, 15 dB SNRs. Therefore, the proposed algorithm significantly achieved the speech quality improvement and efficiently obtained higher performance resulting in better noise reduction compare to other conventional algorithms.
International Journal of Engineering Research and Development (IJERD)IJERD Editor
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals,
yahoo journals, bing journals, International Journal of Engineering Research and Development, google journals, hard copy of journal
An Effective Approach for Chinese Speech Recognition on Small Size of Vocabularysipij
In this paper, an effective approach for Chinese speech recognition on small vocabulary size is proposed - the independent speech recognition of Chinese words based on Hidden Markov Model (HMM). The features of speech words are generated by sub-syllable of Chinese characters. Total 640 speech samples are recorded by 4 native males and 4 females with frequently speaking ability. The preliminary results of inside and outside testing achieve 89.6% and 77.5%, respectively. To improve the performance, keyword spotting criterion is applied into our system. The final precision rates for inside and outside testing in average achieve 92.7% and 83.8%. The results prove that the approach for Chinese speech recognition on small vocabulary is effective.
Research: Applying Various DSP-Related Techniques for Robust Recognition of A...Roman Atachiants
This paper approaches speaker recognition in a new way. A speaker recognition system has been realized that works on adult and child speakers, both male and female. Furthermore, the system employs text-dependent and text-independent algorithms, which makes robust speaker recognition possible in many applications. Single-speaker classication is achieved by age/sex pre-classication and is implemented using classic text-dependent techniques, as well as a novel technology for text-independent recognition. This new research uses Evolutionary Stable Strategies to model human speech and allows speaker recognition by analyzing just one vowel.
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
Slope at Zero Crossings (ZC) of Speech Signal for Multi-Speaker Activity Dete...ijcisjournal
Multi-Speaker activity (MSA) detection helps in detecting the presence of whether the speech signals has a
single speaker or multiple speaker speeches in the speech signal. It is easy to calculate the slope at ZCs
(zero crossings) of the speech signal and makes a comparison with a suitable threshold (Th). Multi-speaker
is declared as and when the zero crossing value exceeds the threshold. The impact of the proposed
technique is compared to the existing technique by calculating the sample-by-sample ZCR (Zero crossing
rate) value is demonstrated. Experimental results prove that the proposed ZCR technique achieves accurate
results than the traditional techniques for MSA detection that uses the cepstrum resynthesis residual
magnitude (CRRM) in the literature.
Comparative performance analysis of channel normalization techniqueseSAT Journals
Abstract A major part of the interaction between humans takes place via speech communication. The speech signal carries both useful and unwanted information. Processing of such signals involve enhancing the useful information. The intelligibility of speech signals is significantly reduced due to the presence of unwanted information such as noise. Channel normalization algorithms suppress such additive noise introduced in the speech signals by transmission channel or by recording environment conditions. Enhancing the quality and intelligibility of speech signals improve the performance of speech systems such as Automatic speech recognition (ASR) , voice communication and hearing aids to name the few. Based on the experimental results the comparative analysis of channel normalization techniques have been presented in this paper to find out the most suitable algorithm for enhancing the speech signals. Keywords: Cepstral Mean Normalization, Spectral Subtraction, Weiner filter, Signal to Noise Ratio
Compressive Sensing in Speech from LPC using Gradient Projection for Sparse R...IJERA Editor
This paper presents compressive sensing technique used for speech reconstruction using linear predictive coding because the
speech is more sparse in LPC. DCT of a speech is taken and the DCT points of sparse speech are thrown away arbitrarily.
This is achieved by making some point in DCT domain to be zero by multiplying with mask functions. From the incomplete
points in DCT domain, the original speech is reconstructed using compressive sensing and the tool used is Gradient
Projection for Sparse Reconstruction. The performance of the result is compared with direct IDCT subjectively. The
experiment is done and it is observed that the performance is better for compressive sensing than the DCT.
Improvement of minimum tracking in Minimum Statistics noise estimation methodCSCJournals
Noise spectrum estimation is a fundamental component of speech enhancement and speech recognition systems. In this paper we propose a new method for minimum tracking in Minimum Statistics (MS) noise estimation method. This noise estimation algorithm is proposed for highly nonstationary noise environments. This was confirmed with formal listening tests which indicated that the proposed noise estimation algorithm when integrated in speech enhancement was preferred over other noise estimation algorithms.
Similar to Cancellation of Noise from Speech Signal using Voice Activity Detection Method on FPGA (20)
Due to availability of internet and evolution of embedded devices, Internet of things can be useful to contribute in energy domain. The Internet of Things (IoT) will deliver a smarter grid to enable more information and connectivity throughout the infrastructure and to homes. Through the IoT, consumers, manufacturers and utility providers will come across new ways to manage devices and ultimately conserve resources and save money by using smart meters, home gateways, smart plugs and connected appliances. The future smart home, various devices will be able to measure and share their energy consumption, and actively participate in house-wide or building wide energy management systems. This paper discusses the different approaches being taken worldwide to connect the smart grid. Full system solutions can be developed by combining hardware and software to address some of the challenges in building a smarter and more connected smart grid.
A Survey Report on : Security & Challenges in Internet of Thingsijsrd.com
In the era of computing technology, Internet of Things (IoT) devices are now popular in each and every domains like e-governance, e-Health, e-Home, e-Commerce, and e-Trafficking etc. Iot is spreading from small to large applications in all fields like Smart Cities, Smart Grids, Smart Transportation. As on one side IoT provide facilities and services for the society. On the other hand, IoT security is also a crucial issues.IoT security is an area which totally concerned for giving security to connected devices and networks in the IoT .As, IoT is vast area with usability, performance, security, and reliability as a major challenges in it. The growth of the IoT is exponentially increases as driven by market pressures, which proportionally increases the security threats involved in IoT The relationship between the security and billions of devices connecting to the Internet cannot be described with existing mathematical methods. In this paper, we explore the opportunities possible in the IoT with security threats and challenges associated with it.
In today’s emerging world of Internet, each and every thing is supposed to be in connected mode with the help of billions of smart devices. By connecting all the devises used in our day to day life, make our life trouble less and easy. We are incorporated in a world where we are used to have smart phones, smart cars, smart gadgets, smart homes and smart cities. Different institutes and researchers are working for creating a smart world for us but real question which we need to emphasis on is how to make dumb devises talk with uncommon hardware and communication technology. For the same what kind of mechanism to use with various protocols and less human interaction. The purpose is to provide the key area for application of IoT and a platform on which various devices having different mechanism and protocols can communicate with an integrated architecture.
Study on Issues in Managing and Protecting Data of IOTijsrd.com
This paper discusses variety of issues for preserving and managing data produced by IoT. Every second large amount of data are added or updated in the IoT databases across the heterogeneous environment. While managing the data each phase of data processing for IoT data is exigent like storing data, querying, indexing, transaction management and failure handling. We also refer to the problem of data integration and protection as data requires to be fit in single layout and travel securely as they arrive in the pool from diversified sources in different structure. Finally, we confer a standardized pathway to manage and to defend data in consistent manner.
Interactive Technologies for Improving Quality of Education to Build Collabor...ijsrd.com
Today with advancement in Information Communication Technology (ICT) the way the education is being delivered is seeing a paradigm shift from boring classroom lectures to interactive applications such as 2-D and 3-D learning content, animations, live videos, response systems, interactive panels, education games, virtual laboratories and collaborative research (data gathering and analysis) etc. Engineering is emerging with more innovative solutions in the field of education and bringing out their innovative products to improve education delivery. The academic institutes which were once hesitant to use such technology are now looking forward to such innovations. They are adopting the new ways as they are realizing the vast benefits of using such methods and technology. The benefits are better comprehensibility, improved learning efficiency of students, and access to vast knowledge resources, geographical reach, quick feedback, accountability and quality research. This paper focuses on how engineering can leverage the latest technology and build a collaborative learning environment which can then be integrated with the national e-learning grid.
Internet of Things - Paradigm Shift of Future Internet Application for Specia...ijsrd.com
In the world more than 15% people are living with disability that also include children below age of 10 years. Due to lack of independent support services specially abled (handicap) people overly rely on other people for their basic needs, that excludes them from being financially and socially active. The Internet of Things (IoT) can give support system and a better quality of life as well as participation in routine and day to day life. For this purpose, the future solutions for current problems has been introduced in this paper. Daunting challenges have been considered as future research and glimpse of the IoT for specially abled person is given in the paper.
A Study of the Adverse Effects of IoT on Student's Lifeijsrd.com
Internet of things (IoT) is the most powerful invention and if used in the positive direction, internet can prove to be very productive. But, now a days, due to the social networking sites such as Face book, WhatsApp, twitter, hike etc. internet is producing adverse effects on the student life, especially those students studying at college Level. As it is rightly said, something which has some positive effects also has some of the negative effects on the other hand. In this article, we are discussing some adverse effects of IoT on student’s life.
Pedagogy for Effective use of ICT in English Language Learningijsrd.com
The use of information and communications technology (ICT) in education is a relatively new phenomenon and it has been the educational researchers' focus of attention for more than two decades. Educators and researchers examine the challenges of using ICT and think of new ways to integrate ICT into the curriculum. However, there are some barriers for the teachers that prevent them to use ICT in the classroom and develop supporting materials through ICT. The purpose of this study is to examine the high school English teachers’ perceptions of the factors discouraging teachers to use ICT in the classroom.
In recent years usage of private vehicles create urban traffic more and more crowded. As result traffic becomes one of the important problems in big cities in all over the world. Some of the traffic concerns are traffic jam and accidents which have caused a huge waste of time, more fuel consumption and more pollution. Time is very important parameter in routine life. The main problem faced by the people is real time routing. Our solution Virtual Eye will provide the current updates as in the real time scenario of the specific route. This research paper presents smart traffic navigation system, based on Internet of Things, which is featured by low cost, high compatibility, easy to upgrade, to replace traditional traffic management system and the proposed system can improve road traffic tremendously.
Ontological Model of Educational Programs in Computer Science (Bachelor and M...ijsrd.com
In this work there is illustrated an ontological model of educational programs in computer science for bachelor and master degrees in Computer science and for master educational program “Computer science as second competence†by Tempus project PROMIS.
Understanding IoT Management for Smart Refrigeratorijsrd.com
Lately the concept of Internet of Things (IoT) is being more elaborated and devices and databases are proposed thereby to meet the need of an Internet of Things scenario. IoT is being considered to be an integral part of smart house where devices will be connected to each other and also react upon certain environmental input. This will eventually include the home refrigerator, air conditioner, lights, heater and such other home appliances. Therefore, we focus our research on the database part for such an IoT’ fridge which we called as smart Fridge. We describe the potentials achievable through a database for an IoT refrigerator to manage the refrigerator food and also aid the creation of a monthly budget of the house for a family. The paper aims at the data management issue based on a proposed design for an intelligent refrigerator leveraging the sensor technology and the wireless communication technology. The refrigerator which identifies products by reading the barcodes or RFID tags is proposed to order the required products by connecting to the Internet. Thus the goal of this paper is to minimize human interaction to maintain the daily life events.
DESIGN AND ANALYSIS OF DOUBLE WISHBONE SUSPENSION SYSTEM USING FINITE ELEMENT...ijsrd.com
Double wishbone designs allow the engineer to carefully control the motion of the wheel throughout suspension travel. 3-D model of the Lower Wishbone Arm is prepared by using CAD software for modal and stress analysis. The forces and moments are used as the boundary conditions for finite element model of the wishbone arm. By using these boundary conditions static analysis is carried out. Then making the load as a function of time; quasi-static analysis of the wishbone arm is carried out. A finite element based optimization is used to optimize the design of lower wishbone arm. Topology optimization and material optimization techniques are used to optimize lower wishbone arm design.
A Review: Microwave Energy for materials processingijsrd.com
Microwave energy is a latest largest growing technique for material processing. This paper presents a review of microwave technologies used for material processing and its use for industrial applications. Advantages in using microwave energy for processing material include rapid heating, high heating efficiency, heating uniformity and clean energy. The microwave heating has various characteristics and due to which it has been become popular for heating low temperature applications to high temperature applications. In recent years this novel technique has been successfully utilized for the processing of metallic materials. Many researchers have reported microwave energy for sintering, joining and cladding of metallic materials. The aim of this paper is to show the use of microwave energy not only for non-metallic materials but also the metallic materials. The ability to process metals with microwave could assist in the manufacturing of high performance metal parts desired in many industries, for example in automotive and aeronautical industries.
Web Usage Mining: A Survey on User's Navigation Pattern from Web Logsijsrd.com
With an expontial growth of World Wide Web, there are so many information overloaded and it became hard to find out data according to need. Web usage mining is a part of web mining, which deal with automatic discovery of user navigation pattern from web log. This paper presents an overview of web mining and also provide navigation pattern from classification and clustering algorithm for web usage mining. Web usage mining contain three important task namely data preprocessing, pattern discovery and pattern analysis based on discovered pattern. And also contain the comparative study of web mining techniques.
APPLICATION OF STATCOM to IMPROVED DYNAMIC PERFORMANCE OF POWER SYSTEMijsrd.com
Application of FACTS controller called Static Synchronous Compensator STATCOM to improve the performance of power grid with Wind Farms is investigated .The essential feature of the STATCOM is that it has the ability to absorb or inject fastly the reactive power with power grid . Therefore the voltage regulation of the power grid with STATCOM FACTS device is achieved. Moreover restoring the stability of the power system having wind farm after occurring severe disturbance such as faults or wind farm mechanical power variation is obtained with STATCOM controller . The dynamic model of the power system having wind farm controlled by proposed STATCOM is developed . To validate the powerful of the STATCOM FACTS controller, the studied power system is simulated and subjected to different severe disturbances. The results prove the effectiveness of the proposed STATCOM controller in terms of fast damping the power system oscillations and restoring the power system stability.
Making model of dual axis solar tracking with Maximum Power Point Trackingijsrd.com
Now a days solar harvesting is more popular. As the popularity become higher the material quality and solar tracking methods are more improved. There are several factors affecting the solar system. Major influence on solar cell, intensity of source radiation and storage techniques The materials used in solar cell manufacturing limit the efficiency of solar cell. This makes it particularly difficult to make considerable improvements in the performance of the cell, and hence restricts the efficiency of the overall collection process. Therefore, the most attainable maximum power point tracking method of improving the performance of solar power collection is to increase the mean intensity of radiation received from the source used. The purposed of tracking system controls elevation and orientation angles of solar panels such that the panels always maintain perpendicular to the sunlight. The measured variables of our automatic system were compared with those of a fixed angle PV system. As a result of the experiment, the voltage generated by the proposed tracking system has an overall of about 28.11% more than the fixed angle PV system. There are three major approaches for maximizing power extraction in medium and large scale systems. They are sun tracking, maximum power point (MPP) tracking or both.
A REVIEW PAPER ON PERFORMANCE AND EMISSION TEST OF 4 STROKE DIESEL ENGINE USI...ijsrd.com
In day today's relevance, it is mandatory to device the usage of diesel in an economic way. In present scenario, the very low combustion efficiency of CI engine leads to poor performance of engine and produces emission due to incomplete combustion. Study of research papers is focused on the improvement in efficiency of the engine and reduction in emissions by adding ethanol in a diesel with different blends like 5%, 10%, 15%, 20%, 25% and 30% by volume. The performance and emission characteristics of the engine are tested observed using blended fuels and comparative assessment is done with the performance and emission characteristics of engine using pure diesel.
Study and Review on Various Current Comparatorsijsrd.com
This paper presents study and review on various current comparators. It also describes low voltage current comparator using flipped voltage follower (FVF) to obtain the single supply voltage. This circuit has short propagation delay and occupies a small chip area as compare to other current comparators. The results of this circuit has obtained using PSpice simulator for 0.18 μm CMOS technology and a comparison has been performed with its non FVF counterpart to contrast its effectiveness, simplicity, compactness and low power consumption.
Reducing Silicon Real Estate and Switching Activity Using Low Power Test Patt...ijsrd.com
Power dissipation is a challenging problem for today's system-on-chip design and test. This paper presents a novel architecture which generates the test patterns with reduced switching activities; it has the advantage of low test power and low hardware overhead. The proposed LP-TPG (test pattern generator) structure consists of modified low power linear feedback shift register (LP-LFSR), m-bit counter, gray counter, NOR-gate structure and XOR-array. The seed generated from LP-LFSR is EXCLUSIVE-OR ed with the data generated from gray code generator. The XOR result of the sequence is single input changing (SIC) sequence, in turn reduces the switching activity and so power dissipation will be very less. The proposed architecture is simulated using Modelsim and synthesized using Xilinx ISE9.2.The Xilinx chip scope tool will be used to test the logic running on FPGA.
Defending Reactive Jammers in WSN using a Trigger Identification Service.ijsrd.com
In the last decade, the greatest threat to the wireless sensor network has been Reactive Jamming Attack because it is difficult to be disclosed and defend as well as due to its mass destruction to legitimate sensor communications. As discussed above about the Reactive Jammers Nodes, a new scheme to deactivate them efficiently is by identifying all trigger nodes, where transmissions invoke the jammer nodes, which has been proposed and developed. Due to this identification mechanism, many existing reactive jamming defending schemes can be benefited. This Trigger Identification can also work as an application layer .In this paper, on one side we provide the several optimization problems to provide complete trigger identification service framework for unreliable wireless sensor networks and on the other side we also provide an improved algorithm with regard to two sophisticated jamming models, in order to enhance its robustness for various network scenarios.
About
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
Technical Specifications
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
Key Features
Indigenized remote control interface card suitable for MAFI system CCR equipment. Compatible for IDM8000 CCR. Backplane mounted serial and TCP/Ethernet communication module for CCR remote access. IDM 8000 CCR remote control on serial and TCP protocol.
• Remote control: Parallel or serial interface
• Compatible with MAFI CCR system
• Copatiable with IDM8000 CCR
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
Application
• Remote control: Parallel or serial interface.
• Compatible with MAFI CCR system.
• Compatible with IDM8000 CCR.
• Compatible with Backplane mount serial communication.
• Compatible with commercial and Defence aviation CCR system.
• Remote control system for accessing CCR and allied system over serial or TCP.
• Indigenized local Support/presence in India.
• Easy in configuration using DIP switches.
CFD Simulation of By-pass Flow in a HRSG module by R&R Consult.pptxR&R Consult
CFD analysis is incredibly effective at solving mysteries and improving the performance of complex systems!
Here's a great example: At a large natural gas-fired power plant, where they use waste heat to generate steam and energy, they were puzzled that their boiler wasn't producing as much steam as expected.
R&R and Tetra Engineering Group Inc. were asked to solve the issue with reduced steam production.
An inspection had shown that a significant amount of hot flue gas was bypassing the boiler tubes, where the heat was supposed to be transferred.
R&R Consult conducted a CFD analysis, which revealed that 6.3% of the flue gas was bypassing the boiler tubes without transferring heat. The analysis also showed that the flue gas was instead being directed along the sides of the boiler and between the modules that were supposed to capture the heat. This was the cause of the reduced performance.
Based on our results, Tetra Engineering installed covering plates to reduce the bypass flow. This improved the boiler's performance and increased electricity production.
It is always satisfying when we can help solve complex challenges like this. Do your systems also need a check-up or optimization? Give us a call!
Work done in cooperation with James Malloy and David Moelling from Tetra Engineering.
More examples of our work https://www.r-r-consult.dk/en/cases-en/
Saudi Arabia stands as a titan in the global energy landscape, renowned for its abundant oil and gas resources. It's the largest exporter of petroleum and holds some of the world's most significant reserves. Let's delve into the top 10 oil and gas projects shaping Saudi Arabia's energy future in 2024.
Hybrid optimization of pumped hydro system and solar- Engr. Abdul-Azeez.pdffxintegritypublishin
Advancements in technology unveil a myriad of electrical and electronic breakthroughs geared towards efficiently harnessing limited resources to meet human energy demands. The optimization of hybrid solar PV panels and pumped hydro energy supply systems plays a pivotal role in utilizing natural resources effectively. This initiative not only benefits humanity but also fosters environmental sustainability. The study investigated the design optimization of these hybrid systems, focusing on understanding solar radiation patterns, identifying geographical influences on solar radiation, formulating a mathematical model for system optimization, and determining the optimal configuration of PV panels and pumped hydro storage. Through a comparative analysis approach and eight weeks of data collection, the study addressed key research questions related to solar radiation patterns and optimal system design. The findings highlighted regions with heightened solar radiation levels, showcasing substantial potential for power generation and emphasizing the system's efficiency. Optimizing system design significantly boosted power generation, promoted renewable energy utilization, and enhanced energy storage capacity. The study underscored the benefits of optimizing hybrid solar PV panels and pumped hydro energy supply systems for sustainable energy usage. Optimizing the design of solar PV panels and pumped hydro energy supply systems as examined across diverse climatic conditions in a developing country, not only enhances power generation but also improves the integration of renewable energy sources and boosts energy storage capacities, particularly beneficial for less economically prosperous regions. Additionally, the study provides valuable insights for advancing energy research in economically viable areas. Recommendations included conducting site-specific assessments, utilizing advanced modeling tools, implementing regular maintenance protocols, and enhancing communication among system components.
Explore the innovative world of trenchless pipe repair with our comprehensive guide, "The Benefits and Techniques of Trenchless Pipe Repair." This document delves into the modern methods of repairing underground pipes without the need for extensive excavation, highlighting the numerous advantages and the latest techniques used in the industry.
Learn about the cost savings, reduced environmental impact, and minimal disruption associated with trenchless technology. Discover detailed explanations of popular techniques such as pipe bursting, cured-in-place pipe (CIPP) lining, and directional drilling. Understand how these methods can be applied to various types of infrastructure, from residential plumbing to large-scale municipal systems.
Ideal for homeowners, contractors, engineers, and anyone interested in modern plumbing solutions, this guide provides valuable insights into why trenchless pipe repair is becoming the preferred choice for pipe rehabilitation. Stay informed about the latest advancements and best practices in the field.
Overview of the fundamental roles in Hydropower generation and the components involved in wider Electrical Engineering.
This paper presents the design and construction of hydroelectric dams from the hydrologist’s survey of the valley before construction, all aspects and involved disciplines, fluid dynamics, structural engineering, generation and mains frequency regulation to the very transmission of power through the network in the United Kingdom.
Author: Robbie Edward Sayers
Collaborators and co editors: Charlie Sims and Connor Healey.
(C) 2024 Robbie E. Sayers
Cancellation of Noise from Speech Signal using Voice Activity Detection Method on FPGA
1. IJSRD - International Journal for Scientific Research & Development| Vol. 1, Issue 3, 2013 | ISSN (online): 2321-0613
All rights reserved by www.ijsrd.com 638
Cancellation of Noise from Speech Signal using Voice Activity Detection
Method on FPGA
Mr. Pratik P. Shah1
Prof. Kinnar G. Vaghela2
Prof. Anil J. Kshatriya3
1
PG Student 2
Associate Professor 3
Assistant Professor
1, 2, 3
E. & C. Department
1, 3
VGEC, Chandkheda, Gujarat, India, 2
GEC, Modasa, Gujarat, India
Abstract—Speech Enhancement by suppressing
uncorrelated acoustically added noise has been a challenging
topic of research for many years. These are the primary
choice for real time applications due to the simplicity and
comparatively low computational load. This paper shows
VAD (Voice activity detection) technique that can detect the
non speech segment from the speech signal. It is also shown
that it can work powerfully in an unpredictable noise
ambience. The technique is mostly done in microprocessors
or DSP processors because of their flexibility. But there are
several advantages of FPGA over DSP processors like high
cost per logic element related to these processors makes
them improper for large scale use. From the experimental
results, VAD method is implemented on the FPGA chip.
I. INTRODUCTION
The problem of enhancing speech degraded by additive
background noise has been widely studied in the past and it
is still an active field for dissertation as well as research.
Noise, added acoustically to speech, can degrade the
performance of digital voice processors used for speech
recognition, compression and authentication. In many
applications, such as mobile communication, audio noise
reductions, efficient noise cancellation techniques are
needed, which require a low computational load. Noise
suppression and speech enhancement techniques can be used
to improve Signal to Noise Ratio (SNR), to lower the
variance and to emphasize the important features of the
speech signal.
With the rapid development of microprocessors
and DSP processors, acoustic noise suppression has been
widely implemented with microprocessors or digital signal
processors. These approaches provide flexibility and are
suitable for many applications. However due to their very
high cost per logic element microprocessors and digital
signal processors are not suitable for widespread use.
In recent years, FPGA-based hardware
implementation technology has been used for signal
processing field successfully due to advantages of their
programmable hard-wired feature, fast time to market and
reusable IP (Intellectual Property) cores. Their cost per logic
element is reasonable too. Besides, the FPGA based system
can get a very high speed level, since it can carry out
parallel processing by means of hardware mode. Thus, one
Of the motivating factors of this work is the high capability
of modern FPGA.
II. VOICE ACTIVITY DETECTION TECHNIQUE FOR
SPEECH SIGNAL
Let, assume that a windowed noise signal n(k) has
been added to a windowed speech signal s(k), with their sum
denoted by x(k)[1].Then
x(k)= s(k) + n(k) (1)
A. Short Time Energy of Speech Signal
Generally, the amplitude of the unvoiced speech segments is
much lower than that of the voiced one and the amplitude of
the speech signal varies with time. These amplitude
variations are represented by the energy of the speech
signal. Short time energy can be defined En as[4]:
En = ∑_(m= -∞)^∞▒[x(m)w(n-m)]2 (2)
The nature of the short time energy representation
is determined by using of the appropriate window. We used
hamming window in our model which gives much greater
attenuation outside the bandpass than the comparable
rectangular window.
h(n) = 0.54 – 0.46 cos(2πn/(N-1)) , for 0≤n≤N-1
= 0, otherwise (3)
The attenuation of this window is independent of
the window duration. Increasing the length, N, decreases the
bandwidth, Fig. 1. If N is too small, the energy of sample En
will fluctuate very rapidly depending on the exact details of
the waveform. If N is too large, the energy of sample En
will change very slowly. Thus it will not adequately reflect
the changing properties of the speech signal. So the short-
term energy En can be used to distinguish voiced/unvoiced
speech, distinguish voiceful /voiceless sound, etc.
Fig. 1: Short time energy of speech signal[4]
2. Cancellation of Noise from Speech Signal Using Voice Activity Detection Method on FPGA
(IJSRD/Vol. 1/Issue 3/2013/0060)
All rights reserved by www.ijsrd.com
639
En is a variety measure of squared signal
magnitude. In some occasion, we can use another measure
of variety in speech magnitude called average short-term
magnitude Mn
[4]
.
Mn = En = ∑∞
∞ x(m)|w(n-m) = |x(n)|*w(m) (4)
For voiced speech, energy is mainly below 3.4kHz
and for unvoiced speech, energy is mainly above 3.4kHz
As shown in the block diagram, Fig.2, voice speech
is detected when ZCR < E else it is detected as unvoiced
speech signal. The rate of zero crossing is normally too high
for noisy frames.
Fig. 2: Block Diagram of VAD
B. Zero Crossing Rate Detection
If successive samples have different algebraic signs, a zero
crossing is said to occur in the context of discrete-time
signals. The rate at which zero crossings occur is a simple
measure of the frequency content of a signal. Zero-crossing
rate is a measure of number of times in a given time
interval/frame that the amplitude of the speech signals
passes through a value of zero[4]
.
Since speech signals are broadband signals,
interpretation of average zero-crossing rate is much less
precise. Rough estimations of spectral properties can be
obtained using a representation based on the short time
average zero-crossing rate.
Fig. 3: Computation of Voiced and Unvoiced Speech
Signal[4]
Mean ZC rate for unvoiced speech is 49 per 10
msec interval and Mean ZC rate for voiced speech is 14 per
10 msec interval.
A definition for zero-crossings rate[4]
Zn is
Zn = ∑ ∞
∞ sgn [x(m)] – sgn[x(m-1)] w(n-m) (5)
where
sgn [x(n)] = 1, for x
= -1, for x (6)
w(n) = , for 0
= 0, otherwise (7)
The energy of voiced speech is found below about
3 kHz. This is because of the spectrum fall of introduced by
the glottal wave. Most of the energy is found at higher
frequencies for unvoiced speech. There is a strong
correlation between zero-crossing rate and energy
distribution with frequency since high frequencies imply
high zero crossing rates and low frequencies imply low one.
It should be a reasonable generalization that if the zero-
crossing rate is high, the speech signal is unvoiced and if the
rate is low, the speech signal is voiced.
III. USE OF FPGA
In this paper, FPGA is used to implement the functional
blocks like short time energy calculations, zero crossing rate
calculations and takes the decision whether voiced speech is
present or unvoiced speech is present in actual speech
signal. Simulation results are shown for the Xilinx Spartan3
Device, XC3S400.
IV. SIMULATION RESULT AND ANALYSIS
As shown in Fig. 5, Speech 1 sampled is read by using
Xilinx 9.1i. All speech sampled are stored in buffer and they
are read by using readline function. As shown in the Fig. 6,
at the riding edge of the clock pulse and en = ‘1’, sampled
values are read from the buffer, perform the short time
energy calculation and stored result into the output file by
using writeline function.
Fig. 5: Short time energy calculation using Xilinx 9.1i ISE
As shown in Fig. 7, Speech 2 sampled is read by
using Xilinx 9.1i. All speech sampled are stored in buffer
and they are read by using readline function. As shown in
the Fig.7, at the riding edge of the clock pulse and en = ‘1’,
sampled values are read from the buffer, perform the short
time energy calculation and stored result into the output file
by using writeline function.
As shown in Fig. 9, entire speech signal is framed
and buff1 is used to store the speech samples which are read
through the readline function. The size of each buffer is 80
(integer). ZCR for each frame has been calculated when en
= 1 and wr =0. The dataout1 signal is used to calculate the
3. Cancellation of Noise from Speech Signal Using Voice Activity Detection Method on FPGA
(IJSRD/Vol. 1/Issue 3/2013/0060)
All rights reserved by www.ijsrd.com
640
ZCR for each frame. The buff3 is used to store the sampled
value before writing it to the output file.
Fig. 6: Short time energy calculation of Speech
If ZCR is less than 14, then it is assumed that the
corresponding frame contains speech only otherwise it is
considered as noise. To write data to the output file, the en
signal is cleared to 0 while wr signal is set to 1. If frame
contains the speech signal, then valid sampled values are
stored into the output file by using writeline function. If
frame contains noise, then the amplitude of the samples are
assign to zero and stored it into the output file.
Fig. 7: Zero crossing rate calculation by using Xilinx ISE
9.1i
As shown in the Fig. 7, if the ZCR of the frame is
greater than 14, then the value 0 is assigned to each sample
of the frame. If ZCR of the frame is less than 14, then the
value of the sample remains as it is.
Voiced speech and unvoiced speech are present in
speech signal. As shown in Fig. 9, if voiced speech is
detected in speech signal then VAD flag bit =1, otherwise
VAD flag bit = 0.
Fig. 8: Removing Noise from Speech using ZCR
Fig. 9 : Voice Activity Detection using Xilinx 9.1i
Fig. 10: Detection of Voiced and Unvoiced Speech Using
VAD
V. CONCLUSION
From these simulation results, it is concluded that VAD
(Voice activity detection) is an efficient method to
distinguish unvoiced speech segment from the voiced one in
speech signal. If a speech segment is detected as unvoiced
speech segment, then it assigns the value 0. If a speech
segment is detected as voiced speech, then the value of the
voiced speech segment remains as it is. It can remove the
noise from the unvoiced speech signal but not from the
voiced speech. So it is better to use the frequency domain
algorithms for the cancellation of noise from voiced speech
as well as unvoiced speech segment.
VI. FUTURE WORK
Various algorithms for noise cancellation in frequency
domain can be implemented which provides better
performance than time domain approaches.
REFERENCES
[1]. S. F. Boll, “Suppression of Acoustic Noise in Speech
Using Spectral Subtraction”, IEEE Transactions on
acoustics, speech, and signal processing, Vol. assp-27,
no. 2, April,1979.
[2]. U. Mahbub, T. Rahman and A. B. M. H. Rashid,
“FPGA Implementation of Real Time Acoustic Noise
Suppression by Spectral Subtraction using Dynamic
Moving Average Method”, IEEE Symposium on
Industrial Electronics and Applications, Kuala Lumpur,
Malaysia, October 4-6, 2009.
[3]. J. Whittington, K. Deo, T. Kleinschmidt and M. Mason,
“FPGA Implementation of Spectral Subtraction for
4. Cancellation of Noise from Speech Signal Using Voice Activity Detection Method on FPGA
(IJSRD/Vol. 1/Issue 3/2013/0060)
All rights reserved by www.ijsrd.com
641
Automotive Speech Recognition”, IEEE Transaction,
2009.
[4]. F. Beritelli, “A Robust Voice Activity Detector for
Wireless Communications Using Soft Computing”,
IEEE Transaction, 1998.
[5]. J. Xu, A. Ariyaeeinia, R. Sotudeh and Z. Ahmad, “Pre-
processing Speech Signals in FPGAs”, IEEE
Transaction, 2005.
[6]. A. B. Diggikar and S. S. Ardhapurkar, “Design and
Implementation of Adaptive filtering algorithm for
Noise Cancellation in speech signal on FPGA”,
International Conference on Computing, Electronics
and Electrical Technologies, 2012.
[7]. R. Fukane and S. L. Sahare, “Different Approaches of
Spectral Subtraction method for Enhancing the Speech
Signal in Noisy Environments”, Internation.al Journal
of Scientific & Engineering Research, Volume 2, Issue
5, May-2011.
[8]. V.K. Subramaniam, V. M. Reddy andS. S. Rao, “FPGA
Implementation of an Adaptive Noise Canceller with
Low Signal Distortion”, IEEE Transaction, 1999.