Y Prasanthi, K Jyothi, GIET / International Journal of Engineering Research and Applications (IJERA) ISSN: 2248-9622 www.ijera.com Vol. 2, Issue 4, July-August 2012, pp. 2162-2165

Frequency Depended Spectral Subtraction For Speech Enhancement

Y Prasanthi, K Jyothi, GIET

Abstract: The corruption of speech by additive background noise causes severe difficulties in various communication environments. This paper addresses the problem of reducing additive background noise in speech. The proposed approach is a frequency-dependent speech enhancement method based on the proven spectral subtraction method. Most implementations and variations of the basic spectral subtraction technique advocate subtraction of the noise spectrum estimate over the entire speech spectrum. However, real-world noise is mostly colored and does not affect the speech signal uniformly over the entire spectrum. In this paper, a new spectral subtraction method is proposed that takes into account the fact that colored noise affects the speech spectrum differently at various frequencies. This method provides a greater degree of flexibility and control over the noise subtraction levels, which reduces artifacts in the enhanced speech and results in improved speech quality. The method outperforms the conventional spectral subtraction method with respect to speech quality and reduced musical noise.

1. INTRODUCTION

A major part of the interaction between humans takes place via speech communication. Hence, research in speech and hearing sciences has been going on for centuries to understand the dynamics and processes involved in the production and perception of speech. The field of speech processing is essentially an application of signal processing techniques to acoustic signals, using the knowledge offered by researchers in the field of hearing sciences. The explosive advances in recent years in the field of digital computing have provided a tremendous boost to the field of speech processing. Digital signal processing techniques are more sophisticated and advanced than their analog counterparts. The ease and speed of representing, storing, retrieving and processing speech data have contributed to the development of efficient and effective speech processing techniques.

The presence of background noise in speech significantly reduces the intelligibility of speech. Degradation of speech severely affects the ability of a person, whether of impaired or normal hearing, to understand what the speaker is saying. Noise reduction or speech enhancement algorithms are used to suppress such background noise and improve the perceptual quality and intelligibility of speech. Even though speech is perceptible in a moderately noisy environment, many applications such as mobile communications, speech recognition and aids for the hearing handicapped, to name a few, drive the effort to build more effective noise reduction algorithms. Over the years engineers have developed a variety of theoretical and relatively effective techniques to combat this issue. However, the problem of cleaning noisy speech still poses a challenge to the area of signal processing. Removing various types of noise is difficult due to the random nature of the noise and the inherent complexities of speech. Noise reduction techniques usually involve a trade-off between the amount of noise removed and the speech distortion introduced by processing the speech signal. The complexity and ease of implementation of noise reduction algorithms are also of concern, especially in applications related to portable devices such as mobile communications and digital hearing aids.

The spectral subtraction method is a well-known noise reduction technique. Most implementations and variations of the basic technique advocate subtraction of the noise spectrum estimate over the entire speech spectrum. However, real-world noise is mostly colored and does not affect the speech signal uniformly over the entire spectrum. In this paper, we propose a multi-band spectral subtraction approach that takes into account the fact that colored noise affects the speech spectrum differently at various frequencies. This method outperforms the standard power spectral subtraction method, resulting in superior speech quality and largely reduced musical noise.

1.2 Short-term Spectral Amplitude Techniques

The short-term spectral amplitude (STSA) of speech has been exploited successfully in the development of various speech enhancement algorithms. The basic idea is to use the STSA of the noisy speech input and recover an estimate of the clean STSA by removing the part contributed by the additive noise.
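The idea of removing the noise contribution from the noisy short-term spectrum can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function name `spectral_subtract`, the default value of `alpha`, and the choice to clamp negative differences to zero are assumptions made for the example. The spectra are plain lists of per-bin values, and both lists are assumed to be in the same domain (all magnitudes or all powers).

```python
def spectral_subtract(noisy, noise_est, alpha=1.0):
    """Subtract a scaled noise estimate from a noisy spectrum, bin by bin.

    noisy, noise_est: per-bin spectral values in the same domain.
    alpha: over-subtraction factor (the default of 1.0 is an assumption).
    Negative differences are clamped to zero (half-wave rectification),
    which is one common convention and is assumed here.
    """
    if len(noisy) != len(noise_est):
        raise ValueError("spectra must have the same number of bins")
    return [max(y - alpha * d, 0.0) for y, d in zip(noisy, noise_est)]


# Toy 4-bin example: in the third bin the noise estimate exceeds the
# noisy value, so the difference is negative and is clamped to zero.
noisy = [4.0, 3.0, 1.0, 2.5]
noise = [1.0, 1.0, 1.5, 0.5]
clean_est = spectral_subtract(noisy, noise, alpha=1.0)
```

Raising `alpha` above 1 removes more noise at the cost of clamping more bins, which is the noise-removal versus distortion trade-off mentioned in the introduction.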
The estimate of the clean STSA is expressed by Equation (1), where S(k) and Y(k) are the magnitude spectra of the clean speech and the noisy speech, the estimated noise magnitude is calculated during non-speech periods, and α is an over-subtraction factor. A general representation of the STSA technique is given in Figure 1.

Figure 1: Diagrammatic representation of the short-time spectral magnitude enhancement system

The input to the system is the noise-corrupted signal y(n). While there are many methods for analysis-synthesis processing, the Short-term Fourier Transform (STFT) of the signal with OverLap and Add (OLA) is the most commonly used. Spectral subtraction is a well-known noise reduction method based on the STSA estimation technique. The basic power spectral subtraction technique, as proposed by Boll, is popular due to its simple underlying concept and its effectiveness in enhancing speech degraded by additive noise. Its drawback is the residual noise it leaves behind.

2. FREQUENCY-DEPENDENT SPECTRAL SUBTRACTION

Along with the actual noise suppression operation, some pre-processing methods are needed to achieve good speech quality. In frequency-dependent spectral subtraction, the speech spectrum is divided into non-overlapping bands, and spectral subtraction is performed independently in each band. This reflects the fact that colored noise affects the speech spectrum differently at various frequencies. Equation (2) gives the subtraction for the i-th band, where b_i and e_i are the beginning and ending frequency bins of the i-th frequency band and δ_i is a tweaking factor that can be set individually for each band to customize the noise removal properties. The negative values in the enhanced spectrum were floored to the noisy spectrum as in Equation (3), where the spectral floor parameter was set to β = 0.001.

3. IMPLEMENTATION AND PERFORMANCE EVALUATION

3.1 Objective Measures for Performance Evaluation

In the evaluation of a speech enhancement algorithm, it is necessary to identify the similarities and differences between perceived quality and subjectively measured intelligibility. Speech quality is an indicator of the naturalness of the processed speech signal. Intelligibility is a measure of the amount of speech information present in the signal that is responsible for conveying what the speaker is saying. The interrelationship between perceived speech quality and intelligibility is not clearly understood.

Performance evaluation tests can be done with subjective quality measures or objective quality measures. Subjective measures provide only a broad measure of performance, since a large difference in quality is needed before a listener can distinguish it. Hence, it becomes difficult to get a reliable measure of changes due to algorithm parameters. Objective measures, on the other hand, provide a measure that can be easily implemented and reliably reproduced.

It is necessary to conduct off-line simulations to check the validity and feasibility of an algorithm before it can be implemented on a real-time system. Implementation on a workstation permits modifications and changes to the algorithm without constraints of time, memory or computational power. The simulations were carried out using Matlab, a technical computing software package. The speech signal is first Hamming windowed using a 20-ms window and a 10-ms overlap between frames. The windowed speech frame is then analyzed using the Fast Fourier Transform (FFT). Smoothing of the magnitude spectrum was found to reduce the variance of the speech spectrum and contribute to the enhancement in speech quality. A weighted spectral average is taken over preceding and succeeding frames of data as given by Equation (4), where j is the frame index and 0 < W_i < 1. The averaging is done over M preceding and succeeding frames of speech. The number of frames M is limited to 2 to prevent smearing of the speech spectral content.
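The two processing steps described so far, weighted averaging of the magnitude spectrum over neighbouring frames and band-wise subtraction with a spectral floor, can be sketched as follows. This is an illustration under stated assumptions, not the authors' Matlab code: the function names, the edge handling (frames beyond either end are replaced by the nearest frame), the single `alpha` shared by all bands, the band split and the `deltas` values in the usage lines are all hypothetical. Only the five weights and β = 0.001 are taken from the paper.

```python
def smooth_frames(frames, weights):
    """Weighted average of each frame with its M neighbours on each side.

    frames: list of spectra, one list of per-bin values per frame.
    weights: list of 2*M + 1 weights; the centre weight applies to frame j.
    Frames beyond the ends are replaced by the nearest edge frame
    (an assumption; the text does not say how edges are handled).
    """
    if len(weights) % 2 != 1:
        raise ValueError("weights must have odd length (2*M + 1)")
    m = len(weights) // 2
    last = len(frames) - 1
    smoothed = []
    for j in range(len(frames)):
        acc = [0.0] * len(frames[j])
        for offset, w in zip(range(-m, m + 1), weights):
            src = frames[min(max(j + offset, 0), last)]
            for k, value in enumerate(src):
                acc[k] += w * value
        smoothed.append(acc)
    return smoothed


def multiband_subtract(noisy, noise_est, bands, deltas, alpha=1.0, beta=0.001):
    """Per-band subtraction followed by spectral flooring.

    bands: list of (b_i, e_i) inclusive bin ranges, assumed non-overlapping.
    deltas: one tweaking factor per band.
    alpha: over-subtraction factor, shared by all bands in this sketch.
    A difference that is not positive is replaced by beta * noisy[k].
    """
    if len(bands) != len(deltas):
        raise ValueError("need one delta per band")
    enhanced = list(noisy)  # bins outside every band are left unchanged
    for (b, e), delta in zip(bands, deltas):
        for k in range(b, e + 1):
            diff = noisy[k] - alpha * delta * noise_est[k]
            enhanced[k] = diff if diff > 0.0 else beta * noisy[k]
    return enhanced


W = [0.09, 0.25, 0.32, 0.25, 0.09]  # weights reported in the paper (M = 2)
frames = [[1.0, 4.0], [2.0, 4.0], [3.0, 4.0], [2.0, 4.0], [1.0, 4.0]]
smoothed = smooth_frames(frames, W)
enhanced = multiband_subtract(
    noisy=[10.0, 10.0, 10.0, 10.0],
    noise_est=[2.0, 2.0, 6.0, 6.0],
    bands=[(0, 1), (2, 3)],  # hypothetical two-band split
    deltas=[1.0, 2.0],       # illustrative values, not taken from the paper
)
```

In the usage lines the second band is over-subtracted, so its bins fall to the floor of β times the noisy value instead of going negative.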
The filter weights W_i were empirically determined and set to W = [0.09, 0.25, 0.32, 0.25, 0.09]. The band factor δ_i is set according to Equation (5), where f_i is the upper frequency of the i-th band and Fs is the sampling frequency in Hz. The motivation for using smaller δ_i values for the low-frequency bands is to minimize speech distortion, since most of the speech energy is present in the lower frequencies. Relaxed subtraction is also used for the high-frequency bands. Subtraction is performed over each band as indicated in the subtraction equation, and the negative values are rectified using the spectral floor. A small amount of the original noisy spectrum can be introduced back into the enhanced spectrum to mask any remaining musical noise. In this implementation, 5% of the original noisy spectrum within each band is added back, and the enhanced frame is obtained by taking the IFFT of the enhanced spectrum using the phase of the original noisy spectrum. Finally, the standard overlap-and-add method is used to obtain the enhanced signal.

3.2 Subjective Evaluation of Speech Intelligibility

Subjective tests are conducted by having human subjects listen to the prepared test speech files and evaluate them against given criteria. Intelligibility tests were carried out at the Callier Center for Communication Disorders/UTD on seven hearing-impaired subjects with severe to profound hearing loss. Speech enhanced by the multi-band spectral subtraction (MBSS) algorithm with four linearly spaced frequency bands was evaluated against the noisy speech. The sentences were corrupted using speech-shaped noise at 0 dB SNR. The noise-corrupted sentences were played in random order through speakers in a sound-insulated booth. The subjects were asked to repeat the sentence they heard. Intelligibility was measured in terms of the percentage of words correct. Figure 2 gives the bar plots of the intelligibility scores achieved by each subject and the average score for both test conditions. The results obtained by objective evaluation are an indicator of the best speech quality that can be obtained by the different configurations of the algorithm. The proposed multi-band spectral subtraction approach (with more than three bands) performed better than the power spectral subtraction (PSS) approach for both SNRs.

4. RESULTS

Figure 2: Intelligibility test results for seven subjects, scored on percentage of words correct

Figure 3: Signal representation of the sentence "I am David". The top representation is the original signal, the middle representation is the corrupted signal, and the bottom representation is the enhanced signal obtained by the frequency-dependent spectral subtraction method.

Figure 4: Spectrogram of the sentence "I am David". The top spectrogram is the original signal, the middle spectrogram is the corrupted signal, and the bottom spectrogram is the enhanced signal obtained by the frequency-dependent spectral subtraction method.

5. CONCLUSIONS

The work in this paper addressed the problem of enhancing speech in noisy conditions. A multi-band spectral subtraction method was presented, based on the direct estimation of the short-term spectral amplitude of speech and on the non-uniform effect of noise on speech. The results establish the superiority of the proposed method over the conventional spectral subtraction method.
The improvement is in the speech quality of the enhanced signal and in reduced residual noise.

The major contributions of this paper are:
(a) Development of a multi-band speech enhancement strategy based on the spectral subtraction method. Speech processed by the new algorithm shows reduced levels of residual noise and good speech quality.
(b) A band subtraction factor that provides greater control over the subtraction process in each band and can be tweaked to minimize speech distortion.
(c) Evaluation of various pre-processing strategies for improving the output speech quality. It was shown that spectral smoothing and weighted spectral averaging of the input speech spectrum helped preserve the speech content and improved speech quality.

Reference/Bibliography

[1] L. Arslan, A. McCree and V. Viswanathan, “New methods for adaptive noise suppression,” ICASSP, vol. 1, pp. 812-815, May 1995.
[2] M. Berouti, R. Schwartz and J. Makhoul, “Enhancement of speech corrupted by acoustic noise,” Proc. IEEE Int. Conf. on Acoust., Speech, Signal Procs., pp. 208-211, Apr. 1979.
[3] S. Boll, “Suppression of acoustic noise in speech using spectral subtraction,” IEEE Trans. Acoust., Speech, Signal Process., vol. 27, pp. 113-120, Apr. 1979.
[4] Y. Cheng and D. O'Shaughnessy, “Speech enhancement based conceptually on auditory evidence,” ICASSP, vol. 2, pp. 961-964, Apr. 1991.
[5] J. Deller Jr., J. Hansen and J. Proakis, “Discrete-Time Processing of Speech Signals,” NY: IEEE Press, 2000.
[6] Y. Ephraim, “Statistical-model-based speech enhancement systems,” Proc. IEEE, vol. 80, No. 10, pp. 1526-1555, Oct. 1992.
[7] Y. Ephraim and D. Malah, “Speech enhancement using a minimum mean-square error short-term spectral amplitude estimator,” IEEE Trans. on Acoust., Speech, Signal Proc., vol. ASSP-32, No. 6, pp. 1109-1121, Dec. 1984.
[8] Y. Ephraim and H. Van Trees, “A signal subspace approach for speech enhancement,” IEEE Trans. Speech Audio Procs., vol. 3, pp. 251-266, Jul. 1995.
[9] Z. Goh, K. Tan and T. Tan, “Postprocessing method for suppressing musical noise generated by spectral subtraction,” IEEE Trans. Speech Audio Procs., vol. 6, pp. 287-292, May 1998.
[10] J. Hansen and B. Pellom, “An effective quality evaluation protocol for speech enhancements algorithms,” Inter. Conf. on Spoken Language Processing, vol. 7, pp. 2819-2822, Sydney, Australia, Dec. 1998.
[11] C. He and G. Zweig, “Adaptive two-band spectral subtraction with multi-window spectral estimation,” ICASSP, vol. 2, pp. 793-796, 1999.
[12] Y. Hu, M. Bhatnagar and P. Loizou, “A cross-correlation technique for enhancing speech corrupted with correlated noise,” ICASSP, vol. 1, pp. 673-676, 2001.
[13] S. Kamath and P. Loizou, “A multi-band spectral subtraction method for enhancing speech corrupted by colored noise,” submitted to ICASSP 2002.
[14] P. Kasthuri, “Multichannel speech enhancement for a digital programmable hearing aid,” Master thesis, University of New Mexico, 1999.
[15] W. Kim, S. Kang and H. Ko, “Spectral subtraction based on phonetic dependency and masking effects,” Proc. IEEE Vis. Image Signal Procs., vol. 147, No. 5, Oct. 2000.
[16] H. Levitt, “Noise reduction in hearing aids: An overview,” Journal of Rehabilitation Research and Development, vol. 38, No. 1, January/February 2001.
[17] J. Lim and A. Oppenheim, “All-pole modeling of degraded speech,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 26, No. 3, pp. 197-210, June 1978.
[18] J. Lim and A. Oppenheim, “Enhancement and bandwidth compression of noisy speech,” Proc. IEEE, vol. 67, No. 12, pp. 221-239, Dec. 1979.