Azhar S. Abdulaziz.et al. Int. Journal of Engineering Research and Application www.ijera.com
ISSN : 2248-9622, Vol. 6, Issue 8, ( Part -1) August 2016, pp.99-103
www.ijera.com 99|P a g e
The Short-Time Silence of Speech Signal as Signal-To-Noise Ratio
Estimator
Azhar S. Abdulaziz (1, 2), Veton Z. Këpuska (3)
(1) Department of Electrical and Computer Engineering, Florida Institute of Technology, Florida, USA
(2) Department of Computer and Information, College of Electronic Engineering, Ninevah University, Mosul, Iraq
(3) Department of Electrical and Computer Engineering, Florida Institute of Technology, Florida, USA
ABSTRACT
This paper proposes using a small portion of the speech audio signal to estimate the Signal-to-Noise Ratio
(SNR). It is found that the first 30 ms carry enough information to estimate the SNR in advance. The first 30
ms of a recorded speech signal usually come from silence rather than speech, because the speaker usually
starts the recording process, or waits for it, before delivering the utterance. For testing and comparing the
proposed estimator, different noisy corpora are built upon the TIMIT data. On average, the proposed
Short-Time Silence (STS-SNR) estimator gives better results than the Waveform Amplitude Distribution Analysis
(WADA) and the National Institute of Standards and Technology (NIST) SNR estimators. The complexity of the
STS-SNR estimator is lower than both, as it processes only a small portion of the audio samples.
Keywords: SNR Estimation, Noisy Speech, Signal-to-Noise Ratio, Short-Time Silence.
I. INTRODUCTION
Signal-to-noise ratio (SNR) estimation
algorithms have been investigated deeply and used for
different applications. They can be used to
improve speech enhancement, detection and
recognition algorithms in different ways [1]. Those
estimators usually use all signal samples to compute
the SNR. For some applications, it is enough to know
whether the signal has high or low SNR in order to apply
further operations to the speech signal for better
recognition accuracy, as in [2], [3] and [4]. Many
SNR estimation approaches are based on either a
pre-specified weighting factor or prior
assumptions about some parameters in the signal model
[5]. The SNR of a spoken utterance is the ratio between the
powers of two random signals, the speech and the
noise. This variability and
randomness makes SNR estimation
more difficult to investigate [6].
The National Institute of Standards and
Technology (NIST) developed a well-known SNR
estimator, NIST-STNR [7]. It splits the entire speech
signal into 20 ms frames, with 50% overlap between
consecutive frames, and uses the root-mean-square (RMS)
power of each frame to update a histogram. When
the audio is finished, the resulting power histogram
is analyzed so that the 15% of the total area
from the left represents the noise, while the 85% of
the area from the left of the histogram represents the
signal [8].
Figure 1: NIST-STNR Estimation using RMS
power Histogram [8]
The NIST algorithm tries to separate the
high and the low power of the audio signal using
its probability density function (PDF). It is supposed
that the noise and signal PDF boundaries lie in the
selected regions with higher probability.
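The histogram idea described above can be illustrated with a minimal sketch. It is a simplified percentile reading of the description, not the NIST tool itself: the function name `percentile_snr_db`, the use of `np.percentile` in place of an explicit histogram, and the synthetic test signal are all assumptions made for illustration.

```python
import numpy as np

def percentile_snr_db(audio, fs=16000, frame_ms=20, noise_pct=15, signal_pct=85):
    """Rough percentile-based SNR estimate (illustrative only, not the NIST tool)."""
    x = np.asarray(audio, dtype=float)
    frame_len = int(fs * frame_ms / 1000)   # 20 ms frames
    hop = frame_len // 2                    # 50% overlap between frames
    starts = range(0, len(x) - frame_len + 1, hop)
    # Mean-square power of each frame in dB (small constant avoids log of zero).
    power_db = np.array([
        10.0 * np.log10(np.mean(x[s:s + frame_len] ** 2) + 1e-12)
        for s in starts
    ])
    noise_db = np.percentile(power_db, noise_pct)    # low-power region: noise
    signal_db = np.percentile(power_db, signal_pct)  # high-power region: signal
    return signal_db - noise_db

# Synthetic example: one second of faint noise followed by one second of a loud tone.
rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
quiet = 0.01 * rng.standard_normal(fs)
loud = np.sin(2 * np.pi * 200 * t) + 0.01 * rng.standard_normal(fs)
audio = np.concatenate([quiet, loud])
print(percentile_snr_db(audio, fs))
```

Note that every frame of the recording has to be visited before the percentiles can be read, which is the cost the proposed estimator tries to avoid.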
While the power PDF is the key for the NIST
SNR estimator, another approach
examines the amplitude PDF, as in the Waveform
Amplitude Distribution Analysis (WADA-SNR)
algorithm in [1]. This approach is based on the
assumption that the clean speech amplitude has
approximately a symmetric Gamma distribution.
According to Kim and Stern [1], when
white Gaussian noise is added to the clean speech,
there is a unique parameter Gz that
determines the SNR.
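As a rough illustration of such an amplitude-based statistic, the sketch below computes a Gz-like quantity. It assumes the form ln(E[|z|]) - E[ln|z|], which is how the parameter is understood here from [1]; the function name `wada_gz` and the handling of zero samples are illustrative assumptions, and the lookup table that maps Gz to an SNR value is not reproduced.

```python
import numpy as np

def wada_gz(z, eps=1e-12):
    """Gz-like statistic: ln(E[|z|]) - E[ln|z|] (assumed form, see lead-in)."""
    mag = np.abs(np.asarray(z, dtype=float))
    mag = mag[mag > eps]  # drop (near-)zero samples so the logarithm is defined
    return np.log(np.mean(mag)) - np.mean(np.log(mag))

rng = np.random.default_rng(0)
z = rng.standard_normal(10000)
print(wada_gz(z), wada_gz(3.0 * z))  # the statistic does not depend on signal gain
```

Because the statistic is invariant to gain, it depends only on the shape of the amplitude distribution, which is what allows a fixed table to relate it to the SNR.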
The researchers used a pre-calculated
lookup table to store the Gz parameter for different
SNRs. It is important to note that although the
WADA-SNR approach assumes that the noise has a
Gaussian distribution, the empirical results show
that it is superior to the NIST-STNR even for
other types of noise, like background speech and
music, as shown in figure 2 [1].
a- Results with WADA-SNR
b- Results with NIST-SNR
Figure 2: Comparison of the average SNR estimates
between WADA and NIST SNR algorithms on
DARPA-RM database [1]
The rest of the paper explains the
proposed Short-Time Silence SNR estimator (STS-SNR).
The simulation results of the suggested STS-SNR
algorithm are compared with the
WADA and NIST-STNR approaches in section 3,
while section 4 is dedicated to the conclusion
and future work.
II. THE SHORT-TIME SILENCE SNR
ESTIMATOR
As discussed in the
previous section, existing SNR estimators cannot give their
results until all the audio signal samples are
processed. Those full-length approaches
can tie up the resources of some real-time live
audio applications, like Automatic Speech
Recognition (ASR). The suggested algorithm
processes only a small number of samples from the
audio signal to estimate the SNR. The proposed
algorithm is referred to as the Short-Time Silence
(STS-SNR) estimator, and it considers only the
first 30 ms at the beginning of the audio. This
estimator assumes that the first 30 ms of the tested
audio represent silence rather than speech. This
assumption is realistic, as the
speaker cannot deliver his/her utterance within the
first 30 ms. It is also assumed in this research that
the SNR does not change during the time of interest.
The proposed Short-Time Silence SNR
(STS-SNR) estimator is described by the following
steps:
1. Take the first 30 ms duration directly after the
microphone is on, which is denoted here as the
Noise Frame NFrame.
2. Subtract the mean of the NFrame.
3. Estimate the Power Spectral Density (PSD) of
the NFrame using a 512-point Fast Fourier Transform (FFT),
taking only the 0 to 8 kHz band into
consideration. The PSD of the 30 ms frame, NPSD,
is:
$N_{PSD} = |N_{Frame}(\omega)|^2$ (1)
where $N_{Frame}(\omega)$ is the spectrum of the audio
frame.
4. Reform the PSD by taking the absolute
difference between it and its flipped version to produce a
white-like PSD, using the following process:
$N_{Reformed} = |N_{PSD} - N_{PSD}^{T}|$ (2)
where $N_{PSD}^{T}$ is the flipped version of the noise power
spectral density vector $N_{PSD}$.
5. Take the average of the first and last quarters of
the 8 kHz band of NReformed. The average of
those quarters is then considered as the
estimated noise power spectral density $N_{PSD}$.
6. The estimated SNR in dB is:
$SNR_{dB} = offset - 10 \times \log_{10}(N_{PSD})$ (3)
As non-white noise is also expected, the
NPSD is reformed in step 4 to produce a white-like
PSD. This step reduces the effect of some
colored noise that might appear while preserving the
white noise PSD. Figure 3 shows NReformed
for speech sampled at a 16 kHz sampling
rate with additive blue noise at an
SNR of 10 dB.
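Steps 1 to 6 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the authors' implementation: it assumes a 16 kHz sampling rate (so the one-sided 512-point FFT spans 0 to 8 kHz), no windowing, no PSD normalization, and amplitude scaling of the input for which the offset of 23 is appropriate. The function name `sts_snr_db` and the small constant `eps` are illustrative additions.

```python
import numpy as np

def sts_snr_db(audio, fs=16000, offset_db=23.0, frame_ms=30, n_fft=512, eps=1e-12):
    """Sketch of the STS-SNR steps; scaling conventions are assumptions."""
    n_samples = int(round(fs * frame_ms / 1000))        # step 1: 480 samples at 16 kHz
    frame = np.asarray(audio[:n_samples], dtype=float)
    frame = frame - frame.mean()                        # step 2: remove the mean
    spectrum = np.fft.rfft(frame, n=n_fft)              # step 3: bins from 0 to fs/2
    n_psd = np.abs(spectrum) ** 2                       # equation (1)
    n_reformed = np.abs(n_psd - n_psd[::-1])            # step 4: equation (2)
    quarter = len(n_reformed) // 4
    edges = np.concatenate([n_reformed[:quarter], n_reformed[-quarter:]])
    noise_level = edges.mean()                          # step 5: first and last quarters
    # step 6: equation (3); eps (an addition) guards against log of zero
    return offset_db - 10.0 * np.log10(noise_level + eps)

rng = np.random.default_rng(1)
noise = rng.standard_normal(16000)
print(sts_snr_db(noise), sts_snr_db(0.1 * noise))
```

Only the first 480 samples are touched, regardless of how long the recording is. In this sketch, reducing the noise amplitude by a factor of 10 raises the estimate by 20 dB, which reflects the linear relation between the logarithm of the noise level and the estimated SNR in equation (3).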
(a) PSD before reforming
(b) PSD after reforming
Figure 3: The reforming operation in step 4
According to [9], the pauses measured in a music concert
have an ambient noise of 30 dB, with each
frequency band contributing a different amount of
power. Hence, step 5 is designed to estimate the
maximum contribution of each frequency band by
averaging the reformed spectrum from step 4. The
NPSD in the final step is found to change linearly with
the SNR of the speech utterance. However, there is
an offset between the estimator mean and the true mean.
Experiments on noisy speech at different SNRs
showed that the offset in equation (3) does not vary
with the SNR. Experiments on the TIMIT test data and
NOIZEUS [10] show that an offset of 23 dB
gives the minimum error for different kinds
and levels of noise: it minimizes both the
Sum of Square Error (SSE) and the Mean Absolute
Error (MAE), as depicted in figure 4.
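The offset selection can be pictured as a simple grid search. The sketch below is illustrative only: the function name `pick_offset` is hypothetical, and the input values are synthetic numbers constructed so that the best offset is 10 by design; they are not data from the paper.

```python
import numpy as np

def pick_offset(noise_level_db, true_snr_db, candidates):
    """Return the candidate offsets that minimize SSE and MAE, respectively."""
    noise_level_db = np.asarray(noise_level_db, dtype=float)  # 10*log10(N_PSD) per file
    true_snr_db = np.asarray(true_snr_db, dtype=float)        # reference SNR per file
    candidates = np.asarray(candidates, dtype=float)
    # Equation (3) evaluated for every candidate offset and every file.
    estimates = candidates[:, None] - noise_level_db[None, :]
    errors = estimates - true_snr_db[None, :]
    sse = np.sum(errors ** 2, axis=1)
    mae = np.mean(np.abs(errors), axis=1)
    return candidates[np.argmin(sse)], candidates[np.argmin(mae)]

# Synthetic values (not from the paper): true SNR = 10 - noise level, by construction.
noise_level_db = [-10.0, -5.0, 0.0]
true_snr_db = [20.0, 15.0, 10.0]
best_sse, best_mae = pick_offset(noise_level_db, true_snr_db, np.arange(0, 41))
print(best_sse, best_mae)
```

With real measurements the two criteria need not agree exactly, since the SSE is minimized near the mean of the per-file offsets and the MAE near their median.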
(a) SSE for different Offsets
(b) MAE for different offsets
Figure 4: Testing different offsets for NFrame of
NOIZEUS noisy speech at 15 dB SNR
The offset value in equation (3) represents
the minimum power gap between moderate
speech and faint noise, so that the speech can be
differentiated from the noise by the speaker and his
audience. It is common for a speaker to raise his
vocal power to compete with the ambient noise energy, at
least to hear himself, which also allows the
target audience to hear clearly. According to Gordon
J. King [9], Faint Noise starts at 20 dB
Sound Pressure Level (SPL) for a whisper and goes up
to 40 dB SPL for public library noise. Moderate
speech lies in the pressure region that he calls
Moderate Noise, which is between 40 and 60 dB SPL
[9]. Any increase in the faint noise will make a
speaker who aims to speak moderately increase
his utterance power to compete with the additional noise.
Therefore, the constant offset of 23 dB in equation (3)
seems reasonable and close to theory.
III. SIMULATION RESULTS
The STS-SNR estimator was evaluated
using the DARPA TIMIT [11] and the noisy speech
corpus NOIZEUS that is described in [10].
However, as the speech and noise signals of the latter
corpus were filtered by the modified Intermediate
Reference System (IRS) filters, additional
pre-processing was needed to recover the
unfiltered spectrum.
Comparing the average estimated SNR
of the WADA, NIST-STNR and the proposed STS-SNR
estimators suggests that the latter has a better response.
Figure 5 shows that the proposed STS-SNR is
closer to the ideal case on average for the NOIZEUS
corpus.
Figure 5: The average estimated SNR comparison
for NOIZEUS corpus.
The audio of the TIMIT corpus was recorded
in a quiet room environment [12]; therefore, different
types of noise are added to it throughout this work.
Additive White Gaussian Noise (AWGN), blue noise and
pink noise are artificially generated and added to the
TIMIT audio. In addition, crowd babble noise is
considered, in which a crowd of people speak
randomly together. Both the artificial and the crowd babble
noise are added to make noisy TIMIT corpora with SNRs ranging
from 5 to 50 dB in 5 dB steps.
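Building a noisy corpus at a prescribed SNR amounts to scaling the noise before adding it. The sketch below shows one common way to do this; it is not taken from the paper. The function name `mix_at_snr` is hypothetical, a sine tone stands in for real speech, and the noise is assumed to be at least as long as the speech.

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale noise so that speech power / noise power matches snr_db, then add it."""
    speech = np.asarray(speech, dtype=float)
    noise = np.asarray(noise, dtype=float)[:len(speech)]  # assumes noise is long enough
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # Gain chosen so that 10*log10(speech_power / (gain**2 * noise_power)) == snr_db.
    gain = np.sqrt(speech_power / (noise_power * 10.0 ** (snr_db / 10.0)))
    scaled_noise = gain * noise
    return speech + scaled_noise, scaled_noise

rng = np.random.default_rng(0)
fs = 16000
t = np.arange(fs) / fs
speech = np.sin(2 * np.pi * 220 * t)  # stand-in tone, not real speech
noise = rng.standard_normal(fs)       # white Gaussian noise
for snr_db in range(5, 55, 5):        # 5 to 50 dB in 5 dB steps
    noisy, scaled = mix_at_snr(speech, noise, snr_db)
    measured = 10 * np.log10(np.mean(speech ** 2) / np.mean(scaled ** 2))
    print(snr_db, round(measured, 2))
```

The same function applies to pink, blue or babble noise once the corresponding noise samples are available.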
For white, pink and blue artificial additive
noise, the STS-SNR proves to be a better SNR
estimator than the other two approaches. Figure 6
shows the comparison for the TIMIT corpus
with white noise only, as the responses for pink
and blue noise follow the same pattern.
Figure 6: White Noise estimated SNR using TIMIT
corpus
For TIMIT with crowd babble noise, the mean
estimated SNR was generally better using the proposed
STS-SNR compared with the WADA and the NIST-STNR,
as shown in figure 7.
Figure 7: Crowd Babble Noise estimated SNR using
TIMIT corpus.
The WADA-SNR gives a good estimate
when the SNR is less than 20 dB. However, when
the test is run at higher SNR levels, the proposed
STS-SNR shows an advantage over the WADA
approach. The results in figures 5 and 6 show that
the proposed STS-SNR estimator tracks the
true mean effectively on average.
Meanwhile, for the test cases below 20 dB
SNR, the STS-SNR shows a higher
deviation from the mean than
the WADA algorithm. As shown in figure 8,
for SNRs below 20 dB, the Mean Absolute Error
(MAE) is higher for the NIST-STNR and the proposed
STS than for the WADA approach. However, the MAE
of the STS-SNR is still lower than that of both WADA and
NIST-STNR on average over the whole range.
Figure 8: The MAE Comparison for NIST, WADA
and STS Estimators.
For the complexity comparison among
those SNR estimators, the STS-SNR is considered
less expensive. As the proposed STS-SNR uses only
the 30 ms NFrame window of the signal, it is expected to
run faster than the NIST and WADA SNR
approaches. The computational cost of the three
estimators was tested using the Real Time ratio
(RT), as in the following equation:
$RT = \frac{\text{Processing Time}}{\text{Audio Duration}}$ (4)
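Equation (4) can be measured with a wall-clock timer. The sketch below is one possible way to do so, not the measurement setup used in the paper; the function name `real_time_ratio` and the toy estimator passed to it are illustrative assumptions.

```python
import time
import numpy as np

def real_time_ratio(estimator, audio, fs):
    """RT = processing time / audio duration, as in equation (4)."""
    start = time.perf_counter()
    estimator(audio)                        # run the SNR estimator once
    elapsed = time.perf_counter() - start   # processing time in seconds
    duration = len(audio) / fs              # audio duration in seconds
    return elapsed / duration

# Toy estimator (assumption): mean power in dB over the whole signal.
fs = 16000
audio = np.random.default_rng(0).standard_normal(3 * fs)
rt = real_time_ratio(lambda x: 10 * np.log10(np.mean(x ** 2)), audio, fs)
print(rt)
```

A single run is sensitive to system load, so averaging the ratio over many files gives a more stable figure.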
Table 1 shows a comparison among the
three SNR estimators, NIST, WADA and the
proposed STS approach.
Table 1: Average Real-time Ratio for SNR Estimators
SNR Estimator    Average RT
NIST-STNR        0.0182
WADA-SNR         0.0033
STS-SNR          0.0026
Those measurements are the average RTs
obtained when the algorithms were tested using the TIMIT
test corpus of 1680 speech audio files.
IV. CONCLUSION
The proposed algorithm for SNR
estimation has been shown to give a close estimate of the
ideal SNR for speech audio. It is less complicated
and more efficient for predicting the SNR in advance.
The suggested SNR estimator shows
a higher error for SNRs below 20 dB,
compared to the WADA approach. Nevertheless, for
higher SNRs, the proposed STS algorithm estimates
the SNR more accurately than the WADA estimator.
In general, the STS algorithm has an advantage over
the WADA and the NIST-STNR approaches. Even
though the WADA estimator is based on extensive
integration, its complexity is reduced by using the
offline pre-calculated table [1]. However, as
noted in table 1, the proposed STS-SNR estimator is
still less expensive than the WADA-SNR.
The drawbacks of the proposed STS-SNR
are that there should be 30 ms of silence at the
beginning and that the SNR should not change. If the noise
level changes later, another silence period should
be considered to re-estimate the new SNR. In
this case, a Voice Activity Detector (VAD) should
be used to pick silences that occur within the speech
as well. For Automatic Speech Recognition (ASR)
applications, there is another solution. Some
modern ASR systems, like CMU Sphinx, can detect
silence, and its samples can be easily extracted.
This is because silence is treated as a word that
represents a non-speech event and is stored in a filler
dictionary [13]. In future work, continuous
feedback from within-speech silences will be added,
allowing the STS to update the SNR continuously.
REFERENCES
[1]. C. Kim and R. M. Stern, “Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis,” in INTERSPEECH, 2008, pp. 2598–2601.
[2]. H. G. Hirsch and C. Ehrlicher, “Noise estimation techniques for robust speech recognition,” in Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on, vol. 1. IEEE, 1995, pp. 153–156.
[3]. J. Morales-Cordovilla, N. Ma, V. Sánchez, J. L. Carmona, A. M. Peinado, J. Barker et al., “A pitch based noise estimation technique for robust speech recognition with missing data,” in Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. IEEE, 2011, pp. 4808–4811.
[4]. H.-W. Park, A.-R. Khil, and M.-J. Bae, “Pitch detection based on signal-to-noise-ratio estimation and compensation for continuous speech signal,” in Convergence and Hybrid Information Technology. Springer, 2012, pp. 767–774.
[5]. T. Moazzeni, A. Amei, J. Ma, and Y. Jiang, “Statistical model based snr estimation method for speech signals,” Electronics letters, vol. 48, no. 12, pp. 727–729, 2012.
[6]. P. Papadopoulos, A. Tsiartas, J. Gibson, and S. Narayanan, “A supervised signal-to-noise ratio estimation of speech signals,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014, pp. 8237–8241.
[7]. National Institute of Standards and Technology, “NIST speech signal to noise ratio measurements,” 2008.
[8]. D. Ellis. (2011) Objective measures of speech quality (SNR). The Laboratory for the Recognition and Organization of Speech and Audio (LabROSA). [Online]. Available: http://labrosa.ee.columbia.edu/projects/snreval/
[9]. G. J. King, The audio handbook. Newnes-Butterworths Group, J. W. Arrowsmith Ltd., 1975.
[10]. Y. Hu and P. C. Loizou, “Subjective comparison and evaluation of speech enhancement algorithms,” Speech communication, vol. 49, no. 7, pp. 588–601, 2007.
[11]. W. M. Fisher, G. R. Doddington, and K. M. Goudie-Marshall, “The DARPA speech recognition research database: specifications and status,” in Proc. DARPA Workshop on speech recognition, 1986, pp. 93–99.
[12]. C. Becchetti and K. P. Ricotti, Speech Recognition: Theory and C++ Implementation (With CD). John Wiley & Sons, 2008.
[13]. Carnegie Mellon University. (2000) Manual for the sphinx-iii recognition system. [Online]. Available at: http://www.speech.cs.cmu.edu/sphinxman/. http://www.speech.cs.cmu.edu/sphinxman/scriptman1.html#00

Β 
Performance of Urban Transit in Jordan
Performance of Urban Transit in JordanPerformance of Urban Transit in Jordan
Performance of Urban Transit in Jordan
Β 
Proof collection from car black box using smart phone for accident detection
Proof collection from car black box using smart phone for accident detectionProof collection from car black box using smart phone for accident detection
Proof collection from car black box using smart phone for accident detection
Β 
Traffic Light Controller Using Fpga
Traffic Light Controller Using FpgaTraffic Light Controller Using Fpga
Traffic Light Controller Using Fpga
Β 
Air cooling effect of fins on a Honda shine bike
Air cooling effect of fins on a Honda shine bikeAir cooling effect of fins on a Honda shine bike
Air cooling effect of fins on a Honda shine bike
Β 

The Short-Time Silence of Speech Signal as Signal-To-Noise Ratio Estimator

Azhar S. Abdulaziz et al., Int. Journal of Engineering Research and Application, www.ijera.com, ISSN: 2248-9622, Vol. 6, Issue 8 (Part 1), August 2016, pp. 99-103

Azhar S. Abdulaziz (1, 2), Veton Z. Këpuska (3)
1 Department of Electrical and Computer Engineering, Florida Institute of Technology, Florida, USA
2 Department of Computer and Information, College of Electronic Engineering, Ninevah University, Mosul, Iraq
3 Department of Electrical and Computer Engineering, Florida Institute of Technology, Florida, USA

ABSTRACT
This paper proposes using a small portion of the audio speech signal to estimate the Signal-to-Noise Ratio (SNR). It is found that the first 30 ms contain enough information to estimate the SNR in advance. The first 30 ms of recorded speech usually come from silence rather than speech, because the speaker usually starts the recording process, or waits for it, before he/she can deliver the utterance. For testing and comparing the proposed estimator, different noisy corpora are built upon the TIMIT data. On average, the suggested algorithm gives better estimates than the Waveform Amplitude Distribution Analysis (WADA) and the National Institute of Standards and Technology (NIST) SNR estimators. The complexity of the STS-SNR estimator is lower than that of both, as it processes only a small portion of the audio samples.
Keywords: SNR Estimation, Noisy Speech, Signal-to-Noise Ratio, Short-Time Silence.

I. INTRODUCTION
Signal-to-noise ratio (SNR) estimation algorithms have been investigated deeply and used for different applications. They can be used to improve speech enhancement, detection and recognition algorithms in different ways [1]. Those estimators usually use all signal samples to compute the SNR. For some applications, it is enough to know whether the signal has a high or a low SNR in order to carry out further operations on the speech signal for better recognition accuracy, as in [2], [3] and [4]. Many SNR estimation approaches are based on either a pre-specified weighting factor or prior assumptions about some parameters of the signal model [5].

The SNR of a spoken utterance is a ratio between the powers of two random signals, the speech and the noise. This variability and randomness make the SNR more difficult to estimate [6]. The National Institute of Standards and Technology (NIST) developed a well-known SNR estimator, NIST-STNR [7]. It uses the entire speech signal, dividing it into 20 ms frames to estimate the histogram of the root-mean-square (RMS) power. The frames overlap each other by 50% and are used to update the histogram. When the audio is finished, the resulting power histogram is analyzed so that 15% of the total area from the left represents the noise, while 85% of the area from the left of the histogram represents the signal [8].

Figure 1: NIST-STNR Estimation using RMS power Histogram [8]

The NIST algorithm tries to separate the high and the low power of the audio signal using its probability density function (PDF). It is assumed that the noise and signal PDF boundaries lie in the selected regions with higher probability. While the power PDF is the key to the NIST SNR estimator, another approach examines the amplitude PDF, as in the Waveform Amplitude Distribution Analysis (WADA-SNR) algorithm in [1]. This approach is based on the assumption that the clean speech amplitude approximately follows a symmetrical Gamma distribution. According to Kim and Stern [1], when white Gaussian noise is added to the clean speech, there is a unique parameter Gz that can determine the SNR. The researchers used a pre-calculated lookup table to store the Gz parameter for different SNRs.
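The histogram reading attributed to NIST-STNR earlier in this section (20 ms frames, 50% overlap, 15% and 85% points of the frame-power distribution) can be illustrated with a minimal sketch. This is a simplification under stated assumptions, not the NIST tool itself: the function name `percentile_snr_db`, its parameters, and the use of plain percentiles in place of an explicit histogram analysis are all hypothetical choices made for illustration.

```python
import numpy as np


def percentile_snr_db(audio, fs=16000, frame_ms=20.0,
                      noise_pct=15.0, signal_pct=85.0):
    """Rough SNR in dB from percentiles of frame power (illustrative only)."""
    audio = np.asarray(audio, dtype=float)
    frame_len = int(round(fs * frame_ms / 1000.0))
    hop = frame_len // 2  # 50% overlap between consecutive frames
    if audio.size < frame_len:
        raise ValueError("audio is shorter than one frame")

    # Mean power of every frame, expressed in dB.
    starts = range(0, audio.size - frame_len + 1, hop)
    power_db = np.array([
        10.0 * np.log10(np.mean(audio[s:s + frame_len] ** 2) + 1e-12)
        for s in starts
    ])

    # Assumption: the 15% point stands for the noise level and the
    # 85% point for the signal level, as described in the text.
    noise_db = np.percentile(power_db, noise_pct)
    signal_db = np.percentile(power_db, signal_pct)
    return signal_db - noise_db
```

Because every frame of the recording contributes to the distribution, a sketch of this kind can only return a value once the whole signal is available.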
It is important to note that although the WADA-SNR approach assumes that the noise has a Gaussian distribution, the empirical results show that it is superior to NIST-STNR even for other types of noise, such as background speech and music, as shown in Figure 2 [1].

(a) Results with WADA-SNR
(b) Results with NIST-SNR
Figure 2: Comparison of the average SNR estimates between WADA and NIST SNR algorithms on DARPA-RM database [1]

The rest of the paper explains the proposed Short-Time Silence SNR estimator (STS-SNR). Section III compares the simulation results of the suggested STS-SNR algorithm with those of the WADA and NIST-STNR approaches, while Section IV is dedicated to the conclusion and future work.

II. THE SHORT-TIME SILENCE SNR ESTIMATOR
As discussed in the previous section, these SNR estimators cannot give a result until all the audio signal samples have been processed. Such full-length approaches tie up resources in real-time live audio applications such as Automatic Speech Recognition (ASR). The suggested algorithm processes only a small number of samples from the audio signal to estimate the SNR. The proposed algorithm is referred to as the Short-Time Silence (STS-SNR) estimator, and it considers only the first 30 ms at the beginning of the audio. This estimator assumes that the first 30 ms of the tested audio represent silence rather than speech. This is a realistic assumption, as the speaker cannot deliver his/her utterance within the first 30 ms. It is also assumed in this research that the SNR does not change during the time of interest.

The proposed Short-Time Silence SNR (STS-SNR) estimator is described by the following steps:
1. Take the first 30 ms directly after the microphone is switched on, denoted here as the noise frame $N_{Frame}$.
2. Subtract the mean of $N_{Frame}$.
3. Estimate the Power Spectral Density (PSD) of $N_{Frame}$ using a 512-point Fast Fourier Transform (FFT), taking only the 0 to 8 kHz band into consideration. The PSD of the 30 ms frame, $N_{PSD}$, is:
   $N_{PSD} = |N_{Frame}(\omega)|^2$   (1)
   where $N_{Frame}(\omega)$ is the spectrum of the audio frame.
4. Reform the PSD by taking the absolute difference between it and its flipped version to produce a white-like PSD:
   $N_{Reformed} = |N_{PSD} - N_{PSD}^{T}|$   (2)
   where $N_{PSD}^{T}$ is the flipped version of the noise power spectral density vector $N_{PSD}$.
5. Take the average of the first and last quarters of the 8 kHz band of $N_{Reformed}$. This average is then taken as the estimated noise power spectral density $N_{PSD}$.
6. The estimated SNR in dB is:
   $SNR_{dB} = offset - 10 \times \log_{10}(N_{PSD})$   (3)
   where $N_{PSD}$ is the estimate obtained in step 5.

As non-white noise is also expected, $N_{PSD}$ is reformed in step 4 to produce a white-like PSD. This step eliminates the effect of some colored noise that might appear while preserving the white-noise PSD. Figure 3 shows $N_{Reformed}$ for speech sampled at 16 kHz with additive blue noise at an SNR of 10 dB.

(a) PSD before reforming
  • 3. Azhar S. Abdulaziz.et al. Int. Journal of Engineering Research and Application www.ijera.com ISSN : 2248-9622, Vol. 6, Issue 8, ( Part -1) August 2016, pp.99-103 www.ijera.com 101|P a g e (b) PSD after reforming Figure 3: The reforming operation in step 4 The measured pauses in a music concert are an ambient noise of 30 dB in advance as each frequency band contributes a different amount of dB power [9]. Hence, step 5 is designed to estimate the maximum contribution of each frequencyband by averaging the reformed spectrum from step 4. The NPSD in the final stepshows to change linearly with the SNR of the speech utterance. However, there is an offset of the estimator mean from the real mean. Experiments on different SNR noisy speech showed that the offset in equation (3) is not varying with the SNR. It was found that use of value 23 as an offset to get minimum error for different kinds and levels of noise. This exact value of the offset is estimated by the minimum sum of square error (SSE) and the minimum mean absolute error (MAE) as depicted in figure 4. Experiments on the TIMIT test data and the NOIZEUS [10] show that this offset in equation 3 would be 23 dB as it gave the minimum for both the Sum of Square Error (SSE) and the Mean Absolute Error (MAE) as shown in figure 4 below. (a) SSE for different Offsets (b) MAE for different offsets Figure 4: Testing different offsets for NFrame of NOIZEUS noisy speech at 15 dB SNR The offset value in equation (3) represents the minimum power gap between the moderate speech and the faint noise. So that the speech is differentiated from the noise by the speaker and his audience. It is common for a speaker to raise his vocal power to compete the ambient noise energy at least to hear himself, and by doing so allowing the target audience to hear clearly. According to Gordon J. King in [9], the Faint Noise starts by the 20 dB Sound Pressure Level (SPL) for whisper and goes up to 40 dB SPL for public library noise. 
Moderate speech lies in the pressure region that he called Moderate Noise, which is between 40 and 60 dB SPL [9]. Any increase in the faint noise will make a speaker who aims to speak moderately raise his utterance power to compete with the additional noise. Therefore, the constant offset of 23 in equation (3) seems reasonable and close to theory.

III. SIMULATION RESULTS

The STS-SNR estimator was evaluated using the DARPA TIMIT corpus [11] and the noisy speech corpus NOIZEUS described in [10]. However, as the speech and noise signals of the latter corpus were filtered by the modified Intermediate Reference System (IRS) filters, additional pre-processing was needed to obtain the unfiltered spectrum.

Comparing the average estimated SNR of the WADA, NIST-STNR and the proposed STS-SNR estimators, the latter appears to have a better response. Figure 5 shows that the proposed STS-SNR is, on average, closer to the ideal case for the NOIZEUS corpus.
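The comparison against the ideal case can be made concrete with a small helper that groups estimates by their true SNR level and reports the mean estimate and the mean absolute error for each level. This is only an illustrative sketch: the function name `summarize_by_level` and the toy numbers are hypothetical and are not results from the paper.

```python
from collections import defaultdict

def summarize_by_level(true_snr_db, estimated_snr_db):
    """Return {level: (mean_estimate, mean_abs_error)} per true SNR level."""
    groups = defaultdict(list)
    for level, estimate in zip(true_snr_db, estimated_snr_db):
        groups[level].append(estimate)

    summary = {}
    for level, estimates in sorted(groups.items()):
        mean_estimate = sum(estimates) / len(estimates)
        mae = sum(abs(e - level) for e in estimates) / len(estimates)
        summary[level] = (mean_estimate, mae)
    return summary

# Toy data, made up for illustration: two utterances at each of two levels.
true_levels = [10, 10, 20, 20]
estimates = [8.0, 12.0, 21.0, 23.0]
summary = summarize_by_level(true_levels, estimates)
```

An ideal estimator would give a mean estimate equal to the level and an MAE of zero. The toy 10 dB level shows why both quantities are worth reporting: its mean estimate is exact, yet every individual estimate is off by 2 dB.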
Figure 5: The average estimated SNR comparison for NOIZEUS corpus.

The audio of the TIMIT corpus is recorded in a quiet room environment [12]; therefore, different types of noise are added to it throughout this work. Additive White Gaussian Noise (AWGN), blue noise and pink noise are artificially generated and added to the TIMIT audio. In addition, crowd babble noise, in which a crowd of people speak randomly together, is also considered. Both artificial and crowd babble noise are added to make noisy TIMIT corpora with SNRs ranging from 5 to 50 dB in 5 dB steps. For white, pink and blue artificial additive noise, the STS-SNR proves to be a better SNR estimator than the other two approaches. Figure 6 shows the comparison for the TIMIT corpus with white noise only, as the responses for pink and blue noise follow the same pattern.

Figure 6: White Noise estimated SNR using TIMIT corpus.

For crowd babble TIMIT, the mean estimated SNR was in most cases better using the proposed STS-SNR than using the WADA and the NIST-STNR, as shown in figure 7.

Figure 7: Crowd Babble Noise estimated SNR using TIMIT corpus.

The WADA-SNR gives a good estimate when the SNR is less than 20 dB. However, when the test is run over higher SNR levels, the proposed STS-SNR shows an advantage over the WADA approach. The results in figures 5 and 6 show that the proposed STS-SNR estimator detects the true mean effectively on average. Meanwhile, for test cases below 20 dB SNR, the STS-SNR gives a higher deviation from the mean than the WADA algorithm. As shown in figure 8, for SNR below 20 dB, the Mean Absolute Error (MAE) is higher for NIST-STNR and the proposed STS than for the WADA approach.
However, the MAE for STS-SNR is still lower than that of both WADA and NIST-STNR on average over the whole range.

Figure 8: The MAE Comparison for NIST, WADA and STS Estimators.

In terms of complexity, the STS-SNR is considered less expensive than the other SNR estimators. As the proposed STS-SNR uses only a 30 ms NFrame window of the signal, it is expected to run faster than the NIST and WADA SNR approaches. The computational expense of the three estimators was tested using the Real Time ratio (RT), defined as:
$$RT = \frac{\text{Processing Time}}{\text{Audio Duration}} \tag{4}$$
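Equation (4) can be measured with a wall-clock timer, as in the sketch below. The helper name `real_time_ratio` and the placeholder estimator `first_frame_energy` are hypothetical; the paper does not say how processing time was measured, so the use of `time.perf_counter` is an assumption.

```python
import time

def real_time_ratio(estimator, samples, fs):
    """Return processing time divided by audio duration, as in equation (4)."""
    duration_s = len(samples) / fs
    start = time.perf_counter()
    estimator(samples)
    elapsed_s = time.perf_counter() - start
    return elapsed_s / duration_s

# Placeholder "estimator" that only touches the first 30 ms of the signal.
def first_frame_energy(samples, fs=16000):
    frame = samples[: int(0.030 * fs)]
    return sum(x * x for x in frame)

audio = [0.0] * 16000  # one second of silence at 16 kHz
rt = real_time_ratio(first_frame_energy, audio, fs=16000)
```

An RT below 1 means the estimator finishes in less time than the audio takes to play. In practice the ratio would be averaged over many files, since a single timing is sensitive to system load.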
Table 1 shows a comparison among the three SNR estimators: NIST, WADA and the proposed STS approach.

Table 1: Average Real-time Ratio for SNR Estimators

SNR Estimator    Average RT
NIST-STNR        0.0182
WADA-SNR         0.0033
STS-SNR          0.0026

These measurements are the average RTs obtained when the algorithms were tested on the TIMIT test corpus of 1680 speech audio files.

IV. CONCLUSION

The proposed SNR estimation algorithm has been shown to give a close guess of the ideal SNR for audio speech. It is less complex and more efficient at predicting the SNR in advance. The suggested SNR estimator shows a higher error than the WADA approach for SNRs below 20 dB. Nevertheless, for higher SNRs, the proposed STS algorithm estimates the SNR more accurately than the WADA estimator. In general, the STS algorithm has an advantage over the WADA and the NIST-STNR approaches.

Even though the WADA estimator is based on extensive integration, its complexity is reduced by using an offline pre-calculated table [1]. However, as noted in table 1, the proposed STS-SNR estimator is still less expensive than the WADA-SNR.

The drawbacks of the proposed STS-SNR are that there must be 30 ms of silence at the beginning and that the SNR is assumed not to change. If the noise level changes later, another silence period should be used to re-estimate the new SNR. In this case, a Voice Activity Detector (VAD) should be used to pick silences that occur within the speech as well. For Automatic Speech Recognizer (ASR) applications, there is another solution. Some modern ASRs, like CMU Sphinx, can detect silence, and its samples can be easily extracted. This is because silence is treated as a word that represents a non-speech event and is stored in a filler dictionary [13].
In future work, continuous feedback from within-speech silences will be added, allowing the STS to update the SNR continuously.

REFERENCES

[1]. C. Kim and R. M. Stern, “Robust signal-to-noise ratio estimation based on waveform amplitude distribution analysis,” in INTERSPEECH, 2008, pp. 2598–2601.
[2]. H. G. Hirsch and C. Ehrlicher, “Noise estimation techniques for robust speech recognition,” in Acoustics, Speech, and Signal Processing, 1995. ICASSP-95., 1995 International Conference on, vol. 1. IEEE, 1995, pp. 153–156.
[3]. J. Morales-Cordovilla, N. Ma, V. Sánchez, J. L. Carmona, A. M. Peinado, J. Barker et al., “A pitch based noise estimation technique for robust speech recognition with missing data,” in Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on. IEEE, 2011, pp. 4808–4811.
[4]. H.-W. Park, A.-R. Khil, and M.-J. Bae, “Pitch detection based on signal-to-noise-ratio estimation and compensation for continuous speech signal,” in Convergence and Hybrid Information Technology. Springer, 2012, pp. 767–774.
[5]. T. Moazzeni, A. Amei, J. Ma, and Y. Jiang, “Statistical model based snr estimation method for speech signals,” Electronics letters, vol. 48, no. 12, pp. 727–729, 2012.
[6]. P. Papadopoulos, A. Tsiartas, J. Gibson, and S. Narayanan, “A supervised signal-to-noise ratio estimation of speech signals,” in Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on. IEEE, 2014, pp. 8237–8241.
[7]. National Institute of Standards and Technology, “NIST speech signal to noise ratio measurements,” 2008.
[8]. D. Ellis. (2011) Objective measures of speech quality (snr). The laboratory for the recognition and organization of speech and audio (labrosa). [Online]. Available: http://labrosa.ee.columbia.edu/projects/snreval/
[9]. G. J. King, The audio handbook. Newnes-Butterworths Group, J. W. Arrowsmith Ltd., 1975.
[10]. Y. Hu and P. C. Loizou, “Subjective comparison and evaluation of speech enhancement algorithms,” Speech communication, vol. 49, no. 7, pp. 588–601, 2007.
[11]. W. M. Fisher, G. R. Doddington, and K. M. Goudie-Marshall, “The DARPA speech recognition research database: specifications and status,” in Proc. DARPA Workshop on speech recognition, 1986, pp. 93–99.
[12]. C. Becchetti and K. P. Ricotti, Speech Recognition: Theory and C++ Implementation (With CD). John Wiley & Sons, 2008.
[13]. Carnegie Mellon University. (2000) Manual for the sphinx-iii recognition system. [Online]. Available at: http://www.speech.cs.cmu.edu/sphinxman/. http://www.speech.cs.cmu.edu/sphinxman/script man1.html#00