Wavelet-Based Speech Intelligibility Enhancement in Telugu Using Hybrid Threshold Transform

Speech Intelligibility Quality in Telugu Speech Patterns Using a
Wavelet-Based Hybrid Threshold Transform Method
Presentation by
Dr. S. China Venkateswarlu
Professor, Dept. of ECE, IARE (Autonomous)
Co-Authors
Dr. Naluguru Udaya Kumar
Associate Professor, Dept. of ECE, MLRITM (Autonomous)
Dr. Vallabhuni Vijay
Associate Professor, Dept. of ECE, IARE (Autonomous)
Paper ID: 287
9/28/2021
Dr. S. China Venkateswarlu, Professor of ECE, IARE-Hyderabad

 Presentation Outline
 Abstract
 Introduction
 Literature Survey
 Methodology
 Simulation Results
 Conclusion
 References
ICISSC-2021

 Abstract
This paper proposes a multiband spectral subtraction algorithm that improves speech quality and, with it, speech intelligibility.
In daily life, speech is the main means of conveying information to its destination.
In industrial areas the speech signal picks up additional noise, which disturbs the original signal and cannot be removed perfectly.
Different algorithms, such as spectral subtraction and the Wiener filter, are used to remove this noise.
These two algorithms are analysed with objective measures such as SNR, SSNR, MSE, PSNR, NRMSE, and PESQ.

 Abstract
These values are compared with those of multiband spectral subtraction, using objective measures such as
 SNR,
 Segmental SNR,
 Frequency segmental SNR,
 Cepstrum and overall SNR.
Using transform techniques such as the Haar and Daubechies transforms together with these algorithms, we can remove the noise and improve the quality of the speech signal.

 Introduction
Improving speech quality and intelligibility remains an open, unresolved challenge, and it has been an active research area for many years.
The demand for communication headsets for industrial use, new applications such as hands-free communication, and automatic speech recognition devices has fueled development in this field.
Non-stationary noise is one of the major problems for the current state of the technology.
Standard algorithms can detect non-stationary noise, but their output quality drops as the background noise level rises.
Hence, algorithms are needed that improve speech communication in industrial and heavily noisy environments.

Coefficient thresholding methods such as binary masking and the wavelet transform have also been widely used for speech enhancement.
Little work has been done on modulation channel methods.
Yet neither strategy has achieved optimum speech quality and intelligibility.
In other words, it is difficult to find effective ways to suppress noise at low SNR.
Previous approaches had limitations, including the introduction of spurious peaks (musical noise), a larger number of iterations, and low speech quality and intelligibility.

Speech Processing
Speech processing refers to the analysis of speech signals and the signal processing techniques applied to them.
Some of the techniques used in speech processing are:
Dynamic Time Warping (DTW) is an algorithm for measuring the similarity between two time series.
Generally speaking, DTW finds the optimal match between two sequences subject to certain restrictions and rules.
The optimal match is the one that satisfies all restrictions and rules and has the minimum cost, where the cost is the sum of the absolute differences between the values of each matched pair of indices.
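The DTW description above can be made concrete with a short sketch. This is a minimal, unconstrained version written for illustration (the function name `dtw_cost` is hypothetical, and no window or slope constraints are applied); it uses the sum of absolute differences as the cost, as described on the slide.

```python
def dtw_cost(a, b):
    """Minimum total |a[i] - b[j]| over all monotone alignments of a and b."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] = best cost of aligning the first i samples of a
    # with the first j samples of b
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # extend the cheapest of: step in a, step in b, step in both
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    return cost[n][m]


# The second sequence repeats one sample, so a perfect warp exists.
print(dtw_cost([1, 2, 3], [1, 2, 2, 3]))
```

Because the alignment may repeat samples, two sequences that differ only in local speed get a cost of zero, which is why DTW suits speech spoken at different rates.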

 Speech processing
Artificial Neural Networks are built on a set of connected units or nodes, called artificial neurons, that loosely model the neurons of a biological brain.
Each connection can relay a signal from one artificial neuron to another.
The main point here is that these algorithms are implemented in software.
The next technique is signal processing: speech processing is mainly regarded as a special case of digital signal processing.

 De-Noising Techniques
Speech de-noising is a long-standing problem.
Given a noisy input signal, we want to filter out the unwanted noise without damaging the signal of interest.
Imagine someone speaking during a video call while music is playing in the background.
In this case, it is the job of the speech de-noising system to remove the background noise and improve the speech signal.
This technology is particularly important for video and audio conferencing, among many other applications, where noise can greatly reduce speech intelligibility [1].
De-noising, or estimation, means reconstructing the signal as faithfully as possible from measurements of the useful signal corrupted by noise.

The figure shows de-noising using the wavelet transform.
The process starts with the input noisy speech pattern, that is, speech with noise added to it.
The noisy speech is then passed through the wavelet transform.
Figure 1: De-noising using wavelet transform.

Next, a threshold is estimated from the wavelet coefficients.
There are then two options, soft thresholding and hard thresholding, and either one can be applied.
The thresholded coefficients go through the inverse wavelet transform, which finally yields the de-noised speech.
The speech passes through while the noise that accompanies it is removed. This is the wavelet de-noising process.
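The two thresholding rules named above can be sketched as follows. These are the standard textbook definitions, not code from the presentation; the threshold value `t` is taken as given here, since the slides do not say how it is estimated.

```python
import numpy as np


def hard_threshold(coeffs, t):
    """Keep coefficients with |c| > t unchanged; set the rest to zero."""
    c = np.asarray(coeffs, dtype=float)
    return np.where(np.abs(c) > t, c, 0.0)


def soft_threshold(coeffs, t):
    """Zero coefficients with |c| <= t and shrink the others toward zero by t."""
    c = np.asarray(coeffs, dtype=float)
    return np.sign(c) * np.maximum(np.abs(c) - t, 0.0)


coeffs = np.array([-3.0, -0.5, 0.2, 1.0, 4.0])
print(hard_threshold(coeffs, 1.0))  # large coefficients survive untouched
print(soft_threshold(coeffs, 1.0))  # large coefficients are reduced by t
```

Hard thresholding leaves a jump at |c| = t, while soft thresholding is continuous but biases every surviving coefficient by t.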

 WAVELET TRANSFORMS
Wavelet transforms are widely used for the analysis, encoding, and reconstruction of signals in areas such as mechanics, engineering, seismography, and electronic data processing.
In mathematics, a wavelet series is a representation of a square-integrable (real- or complex-valued) function by an orthonormal series generated by a wavelet.
The description here follows the formal, mathematical definition of the orthonormal wavelet transform.
The fundamental idea of wavelet transforms is that the transformation should allow only changes in time extension, not in shape.
This is achieved by choosing suitable basis functions.
Changes in the time extension are expected to conform to the corresponding analysis frequency of the basis function.

The main drawback of the Fourier transform is that it captures only global frequency information, meaning frequencies that persist over an entire signal.
This kind of signal decomposition does not suit all applications, for example those where the signal has only short intervals of characteristic oscillation.
An alternative is the wavelet transform, which decomposes a function into a set of wavelets.

 DIFFERENT WAVELET TRANSFORMS
HAAR: The first and simplest wavelet is the Haar wavelet, which is the starting point for any discussion of wavelets.
The Haar wavelet is discontinuous and resembles a step function. It is the same wavelet as the Daubechies db1 wavelet.
DAUBECHIES: Ingrid Daubechies, one of the brightest stars in the world of wavelet research, invented what are called compactly supported orthonormal wavelets, thus making discrete wavelet analysis practicable.
The names of the Daubechies family wavelets are written dbN, where N is the order and db is the "surname" of the wavelet.
SYMLETS: The symlets are nearly symmetrical wavelets proposed by Daubechies as modifications to the db family. The properties of the two wavelet families are similar. The wavelet functions psi of the family are shown on the slide.
Typing waveinfo('sym') at the MATLAB command line gives an overview of the main properties of this family.
MORLET: This wavelet has no scaling function, but it is explicit.
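As an illustration of the simplest family in the list, one level of the orthonormal Haar transform and its inverse can be written in a few lines. This is a generic sketch, not the presentation's MATLAB code; the function names are hypothetical, the signal length is assumed to be even, and the sign convention of the detail coefficients may differ from a given toolbox.

```python
import numpy as np


def haar_dwt(x):
    """One level of the orthonormal Haar transform of an even-length signal."""
    x = np.asarray(x, dtype=float)
    approx = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # scaled pairwise sums
    detail = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # scaled pairwise differences
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt and return the original samples."""
    approx = np.asarray(approx, dtype=float)
    detail = np.asarray(detail, dtype=float)
    x = np.empty(2 * len(approx))
    x[0::2] = (approx + detail) / np.sqrt(2.0)
    x[1::2] = (approx - detail) / np.sqrt(2.0)
    return x


signal = np.array([4.0, 2.0, 5.0, 5.0])
a, d = haar_dwt(signal)
print(np.allclose(haar_idwt(a, d), signal))
```

The detail coefficient of the flat pair (5, 5) is zero, which is what makes thresholding small detail coefficients an effective way to suppress noise.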

 LITERATURE SURVEY
For adaptive noise cancellation, Widrow et al. proposed Wiener filters based on both the noise and the speech.
The complex spectrum of the clean speech (and hence its time-domain form) is easily recovered using a linear model between the observed and estimated signals.
The Wiener filter estimates the spectral amplitude of the clean speech, while the phase is taken directly from the noisy signal.
This approach suits digital signal processor (DSP) implementations for stationary noise, where the ideal Wiener filter minimises the estimation error.
The Wiener filter requires a larger number of iterations.
NOISY SPEECH ENHANCEMENT
The noise reduction problem considered in the paper is to recover the clean speech signal s(n) from the observed signal y(n), where n is the discrete time index:
y(n) = s(n) + u(n) (2.1)
where u(n) is the background noise, which is uncorrelated with s(n). Non-stationary signals are used as input data for the analysis and implementation of the proposed method.
In the frequency domain, the short-time Fourier transform (STFT) is used to estimate the clean speech patterns from the noisy speech patterns.

 NOISY SPEECH ENHANCEMENT
Y(k,m) = S(k,m) + U(k,m) (2.2)
where Y(k,m), S(k,m), and U(k,m) are the STFTs of y(n), s(n), and u(n), respectively, at frequency bin k ∈ {0, 1, 2, ..., K-1} and time frame m.
The variance of Y(k,m) is needed for the later threshold analysis, because s(n) and u(n) are uncorrelated by assumption.
Recently, speech enhancement has become an essential component of speech coding and speech recognition technologies.
Speech enhancement involves two main tasks: speech estimation and noise power estimation.
Speech estimation is based on the statistical speech model, the distortion criterion, and the estimated noise.

 EXISTING METHOD
In the paper this work builds on, the existing model uses both soft and hard thresholding.
Each has advantages and disadvantages.
To overcome some of the disadvantages, soft and hard thresholding are mixed, and the result is called the hybrid threshold.
HYBRID THRESHOLD: Soft thresholding and hard thresholding each have a few drawbacks when used on their own for noise reduction.
To overcome these drawbacks, the two techniques are combined into a new thresholding technique, the Hybrid Thresholding Method.
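The slides do not give the exact formula of the hybrid threshold, so the following is only one plausible way to combine the two rules: a weighted blend of the soft and hard outputs. The blend itself, the function name `hybrid_threshold`, and the mixing weight `alpha` are all assumptions made for illustration and should not be read as the authors' definition.

```python
import numpy as np


def hybrid_threshold(coeffs, t, alpha=0.5):
    """Blend of soft and hard thresholding (assumed form, not from the slides).

    alpha = 0 gives hard thresholding, alpha = 1 gives soft thresholding.
    For |c| > t the output is alpha * soft(c) + (1 - alpha) * hard(c),
    which simplifies to c - alpha * sign(c) * t.
    """
    c = np.asarray(coeffs, dtype=float)
    keep = np.abs(c) > t
    return np.where(keep, c - alpha * np.sign(c) * t, 0.0)


coeffs = np.array([-3.0, 0.5, 4.0])
print(hybrid_threshold(coeffs, 1.0, alpha=0.5))
```

A blend of this kind shrinks the surviving coefficients less than soft thresholding does, at the price of a smaller jump at the threshold than hard thresholding has.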

 HYBRID THRESHOLD
Soft thresholding removes the discontinuity of the signal, whereas with hard thresholding the discontinuity remains.
Hard thresholding is a keep-or-kill procedure: each coefficient is either kept or set to zero.
The two techniques are therefore combined into a new, more efficient technique named Hybrid Thresholding.
Figure: Block diagram of the spectral subtraction method

The input speech is first windowed and then passed through the fast Fourier transform (FFT).
The resulting spectra go to noise estimation and then to spectral subtraction.
The result is recombined into a complex spectrum and passed to the inverse fast Fourier transform (IFFT).
Finally, overlap-add produces the enhanced speech signal, a clear signal with the noise removed.
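The block diagram described above can be sketched end to end as follows. This is a minimal magnitude-domain version under stated assumptions: the first few frames are taken to contain noise only, a periodic Hann window with 50% overlap is used, and the noisy phase is reused for reconstruction. The function name and the parameters `noise_frames` and `frame_len` are hypothetical.

```python
import numpy as np


def spectral_subtraction(noisy, noise_frames=5, frame_len=256):
    """Windowing -> FFT -> noise estimate -> subtraction -> IFFT -> overlap-add.

    Assumes len(noisy) >= frame_len and that the first `noise_frames`
    frames contain noise only. Samples after the last full frame stay zero.
    """
    noisy = np.asarray(noisy, dtype=float)
    hop = frame_len // 2
    # Periodic Hann window: shifted copies sum to 1 at 50% overlap.
    window = np.hanning(frame_len + 1)[:-1]
    n_frames = 1 + (len(noisy) - frame_len) // hop
    frames = np.stack([noisy[i * hop:i * hop + frame_len] * window
                       for i in range(n_frames)])

    spectra = np.fft.rfft(frames, axis=1)
    magnitude, phase = np.abs(spectra), np.angle(spectra)

    # Noise estimate: average magnitude over the assumed noise-only frames.
    noise_mag = magnitude[:noise_frames].mean(axis=0)
    # Subtract and clip at zero so that magnitudes stay non-negative.
    clean_mag = np.maximum(magnitude - noise_mag, 0.0)

    # Complex spectrum from the enhanced magnitude and the noisy phase.
    enhanced = np.fft.irfft(clean_mag * np.exp(1j * phase), n=frame_len, axis=1)

    out = np.zeros(len(noisy))
    for i in range(n_frames):
        out[i * hop:i * hop + frame_len] += enhanced[i]  # overlap-add
    return out


rng = np.random.default_rng(0)
n = np.arange(8000)
clean = np.where(n >= 2000, np.sin(2 * np.pi * 0.05 * n), 0.0)
noisy = clean + 0.1 * rng.standard_normal(len(n))
enhanced = spectral_subtraction(noisy)
print(np.mean(noisy[256:1500] ** 2), np.mean(enhanced[256:1500] ** 2))
```

Clipping negative magnitudes to zero is what produces the isolated spectral peaks known as musical noise, one of the limitations of earlier approaches.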

 PRINCIPLE OF SPECTRAL SUBTRACTION METHOD
Consider a noisy signal in which the speech is corrupted by independent additive noise:
y[n] = s[n] + d[n]
where y[n] is the sampled noisy speech,
s[n] is the clean speech, and
d[n] is the additive noise.
The additive noise is assumed to have zero mean and to be uncorrelated with the clean speech.
Because speech signals are non-stationary and time-varying, the model is represented in the short-time Fourier transform (STFT) domain as
Y(ω, k) = S(ω, k) + D(ω, k) (2.3)
The speech can be approximated by subtracting a noise estimate from the received signal.

Averaging recent speech pause frames yields an estimate of the noise spectrum:
(2.5)
where M is the number of consecutive frames.
Spectral subtraction can be viewed as a filter: the enhanced spectrum is the product of the noisy speech spectrum and a spectral subtraction filter (SSF):
(2.6)
where H(ω) is the gain function, also known as the spectral subtraction filter (SSF).
H(ω) is a zero-phase filter whose magnitude response lies in the range 0 ≤ H(ω) ≤ 1:
(2.7)
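The equation bodies on this slide did not survive as text. One common magnitude-domain form that matches the descriptions above is sketched below. It is an assumption, not a recovery of the authors' exact expressions (a power-domain variant is equally plausible), and the symbol Y_j(ω) for the spectrum of the j-th speech-pause frame is introduced here for illustration.

```latex
\begin{align}
% candidate for (2.5): average of M speech-pause frames
|\hat{D}(\omega)| &= \frac{1}{M}\sum_{j=0}^{M-1} |Y_{j}(\omega)| \\
% candidate for (2.6): enhancement as a filtering operation
\hat{S}(\omega) &= H(\omega)\,Y(\omega) \\
% candidate for (2.7): zero-phase gain with 0 <= H <= 1
H(\omega) &= \max\!\left(0,\; 1 - \frac{|\hat{D}(\omega)|}{|Y(\omega)|}\right)
\end{align}
```

With this gain, |Ŝ(ω)| = |Y(ω)| − |D̂(ω)| wherever the difference is non-negative, which is the subtraction of a noise estimate from the received signal that the text describes.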

To reconstruct the signal, an estimate of the speech phase is needed.
The speech signal in a frame is therefore calculated by
(2.8)
The estimated speech is then recovered in the time domain by inverse Fourier transforming S(ω) and applying the overlap-add technique.
Although this spectral subtraction method reduces the majority of the noise, it also has some drawbacks, like it depends more.

 Hybrid Thresholding
The flow chart starts with the noisy speech signal, that is, the speech signal with noise added.
It continues to time-frequency analysis, where the signal is windowed, and then to wavelet decomposition.
Next comes hybrid thresholding which, as described above, is a mixture of soft and hard thresholding.
The thresholded signal then passes through the inverse wavelet transform (IWT), which finally yields the enhanced speech signal.
Here "enhanced" means that the noise accompanying the signal has been removed, leaving the clean, enhanced signal.
Figure 2.2: Hybrid thresholding

 Wiener Filter
The Wiener filter (WF) is a filter that minimises the mean square error (MSE). The gain function (GF) of the WF, H_Wiener(ω), is written in terms of the power spectral density (PSD) of the clean speech, Ps(ω), and of the noise, Pd(ω):
(2.9)
The drawback of the Wiener filter is that it has a fixed gain (FG) at every frequency and requires the PSDs of the clean signal and the noise to be estimated before filtering.
An adaptive WF is therefore used to approximate the WF gain function:
(2.10)
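For reference, the textbook Wiener gain in terms of the two PSDs is H(ω) = Ps(ω) / (Ps(ω) + Pd(ω)); it is assumed here that the missing equation (2.9) has this standard form. The sketch below evaluates it per frequency bin. The function name is hypothetical, and the adaptive variant of (2.10) is not reproduced because the slides do not specify it.

```python
import numpy as np


def wiener_gain(ps, pd):
    """Textbook Wiener gain Ps / (Ps + Pd), evaluated per frequency bin.

    ps: PSD of the clean speech, pd: PSD of the noise (same shape).
    """
    ps = np.asarray(ps, dtype=float)
    pd = np.asarray(pd, dtype=float)
    total = ps + pd
    # Where both PSDs are zero the gain is undefined; return 0 there.
    return np.divide(ps, total, out=np.zeros_like(total), where=total > 0)


print(wiener_gain([1.0, 0.0, 3.0], [1.0, 0.0, 1.0]))
```

The gain approaches 1 in bins dominated by speech and 0 in bins dominated by noise, so it always lies between 0 and 1.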

Problem Identification
During communication between two people in a laboratory, some noise is added to the speaker's original speech signal. This noise may come from environmental disturbances or from people around the speakers. As a result, the quality of the speech is degraded and the message cannot be conveyed effectively. This is a problem we face in daily life. To rectify it, we use an algorithm that enhances speech quality [6].

PROPOSED METHOD
Multiband spectral subtraction is an algorithm that removes the noise in speech signals by using different transform techniques. In real-life transmission of information, a carrier signal is added to the message signal so that it can travel farther. Noise is also added on the transmitter side and has to be removed on the receiver side before the actual information is retrieved [6]. Sometimes, because of environmental conditions, this noise is not removed completely and causes disturbances. To remove it, we use this algorithm together with different wavelet transform techniques, such as Haar and Daubechies, to de-noise the speech signals. The thresholding techniques used include hard thresholding and soft thresholding [13].

 Multiband spectral subtraction algorithm
The multiband spectral subtraction method assumes that the added noise is stationary and uncorrelated with the clean speech signal.
Figure 3.1: Hybrid Threshold

Multiband spectral subtraction

 Noise Estimation
The process starts with noise estimation, followed by windowing and the FFT.
It continues with the phase on one path and the multiband processing on the other.
Both paths feed the power spectral modification and then the IFFT.
Noise Estimation: In real-life situations, noise does not affect the speech signal uniformly over the whole spectrum.
Only some frequencies are affected. This kind of noise is known as colored noise. The estimated noise in the speech signal can be removed using dedicated algorithms.
One such algorithm is implemented in this paper.
From its output we can calculate the signal-to-noise ratio and other objective measures.
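Because the noise is not uniform across the spectrum, a multiband method subtracts a different amount in each frequency band. The sketch below shows this for the power spectrum of a single frame. It is an illustration under assumptions that are not in the slides: equal-width bands, a Berouti-style over-subtraction factor chosen from the band SNR, and a small spectral floor. All function and parameter names are hypothetical.

```python
import numpy as np


def oversubtraction_factor(snr_db):
    """Assumed rule: subtract more aggressively in low-SNR bands."""
    if snr_db < -5.0:
        return 4.75
    if snr_db > 20.0:
        return 1.0
    return 4.0 - 0.15 * snr_db  # linear between -5 dB and 20 dB


def multiband_subtract(noisy_power, noise_power, n_bands=4, floor=0.002):
    """Subtract a noise power estimate from one frame, band by band."""
    noisy_power = np.asarray(noisy_power, dtype=float)
    noise_power = np.asarray(noise_power, dtype=float)
    clean_power = np.empty_like(noisy_power)
    # Equal-width, non-overlapping bands (an assumption of this sketch).
    edges = np.linspace(0, len(noisy_power), n_bands + 1).astype(int)
    for lo, hi in zip(edges[:-1], edges[1:]):
        y, d = noisy_power[lo:hi], noise_power[lo:hi]
        band_snr = 10.0 * np.log10(max(y.sum(), 1e-12) / max(d.sum(), 1e-12))
        alpha = oversubtraction_factor(band_snr)
        # The floor keeps a fraction of the noisy power instead of negatives.
        clean_power[lo:hi] = np.maximum(y - alpha * d, floor * y)
    return clean_power


noisy_power = np.array([10.0] * 4 + [1.0] * 4)
noise_power = np.ones(8)
print(multiband_subtract(noisy_power, noise_power, n_bands=2))
```

In the example the low-frequency band has a 10 dB SNR and keeps most of its power, while the 0 dB band is pushed down to the spectral floor.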

 Results and Discussions
When we run the simulation in MATLAB with the speech corpus databases as input, we obtain the signal with noise and the original signal.
Figure: Original Signal and Denoised Signals

 Results and Discussions
Figure above: original signal. After de-noising the signal with noise, we recover the original signal as shown.
With the speech corpus05 database as input and a symlet wavelet for the analysis, we obtain the thresholded coefficients and the original coefficients shown above.
The left-hand graphs show the absolute values of the coefficients.
Here we use the threshold transform method.
Figure: Original Signal and De-noised Signals

Table 5.4: Window performance comparison in terms of speech quality objective measures with TELUGU-WHITE NOISE

WAVELET-SS HYBRID-TH-I Method
Measure    Input SNR Hamming    Kaiser     Gaussian   Dolph-Chebyshev  Polynomial
(dB)       (dB)      window     α=0.75     α=3.0      42dB             α=5
LLR        0         0.8198     0.7859     0.8706     0.8207           0.8514
LLR        5         0.6851     0.6331     0.7342     0.6793           0.7072
LLR        10        0.5581     0.5419     0.6017     0.5550           0.5706
LLR        15        0.4926     0.4774     0.5131     0.4802           0.4952
WSS        0         63.0372    60.0229    69.6377    62.2582          64.7961
WSS        5         55.4465    51.0245    59.2924    54.1031          56.6300
WSS        10        46.9684    43.4460    52.8675    47.7075          48.2365
WSS        15        39.9035    37.7773    44.7065    39.1082          40.3765
Cep        0         5.1610     5.0759     5.2814     5.1655           5.2293
Cep        5         4.7348     4.5360     4.7949     4.7024           4.7037
Cep        10        4.2650     4.2440     4.3355     4.2430           4.2303
Cep        15        4.0518     4.0402     4.0315     4.0114           3.9560
Seg-SNR    0         -0.7518    -0.5314    -0.1905    -0.7474          -0.4393
Seg-SNR    5         1.5062     1.5829     1.6394     1.5362           1.6449
Seg-SNR    10        2.6535     2.9403     2.7910     2.8194           3.1367
Seg-SNR    15        3.3839     3.5588     3.7003     3.5077           3.6447
FwSeg-SNR  0         6.0623     6.2012     5.9976     6.0612           6.1152
FwSeg-SNR  5         7.2927     7.3954     6.9876     7.3647           7.3131
FwSeg-SNR  10        8.2464     8.2006     8.0511     8.3065           8.4962
FwSeg-SNR  15        8.9426     8.7623     8.9249     9.0189           9.2436
SNR        0         2.3687     2.9373     3.7149     2.3985           3.2751
SNR        5         5.1252     5.2732     5.4671     5.1603           5.5309
SNR        10        6.3003     6.8604     6.5939     6.5564           7.5071
SNR        15        6.9450     7.2002     7.5288     7.0517           7.7279

WAVELET-WIENER HYBRID-TH-I Method
Measure    Input SNR Hamming    Kaiser     Gaussian   Dolph-Chebyshev  Polynomial
(dB)       (dB)      window     α=0.75     α=0.75     α=28dB           α=2
LLR        0         1.0614     1.5268     1.5263     1.6539           1.6050
LLR        5         0.9715     1.5296     1.5279     1.6854           1.6119
LLR        10        0.8725     1.7020     1.7034     1.6444           1.65121
LLR        15        0.7321     1.6856     1.6894     71.8352          1.6467
WSS        0         104.7435   85.4570    85.7418    71.8842          72.0651
WSS        5         87.2074    83.4607    83.4545    71.8888          72.0727
WSS        10        74.0772    82.4248    82.3000    71.8270          8.6418
WSS        15        62.9145    82.9782    81.7071    8.9883           72.0714
Cep        0         6.3116     7.8823     7.8767     8.7149           8.3530
Cep        5         6.0062     7.9101     7.9034     8.7827           8.3639
Cep        10        5.6471     8.5448     8.5538     8.6965           -0.1722
Cep        15        5.2007     8.5959     8.6052     8.9883           8.4223
Seg-SNR    0         0.0423     -0.1842    -0.1829    -0.1906          -0.1717
Seg-SNR    5         1.4420     -0.1840    -0.1826    -0.1906          -0.1718
Seg-SNR    10        2.9313     -0.1832    -0.1813    -0.1914          2.2274
Seg-SNR    15        4.3152     -0.1832    -0.1806    -0.1909          -0.1713
FwSeg-SNR  0         4.5607     2.6753     2.6582     2.8614           2.2916
FwSeg-SNR  5         5.3318     2.5912     2.5900     3.4275           2.4138
FwSeg-SNR  10        6.1478     2.4104     2.4082     3.0449           -0.2618
FwSeg-SNR  15        6.9929     2.7752     2.7038     10.5598          2.3408
SNR        0         1.4268     -0.3338    -0.3347    -0.2943          -0.2451
SNR        5         3.7845     -0.3376    -0.3288    -0.2929          -0.2448
SNR        10        6.3552     -0.3266    -0.3020    -0.2987          1.6467
SNR        15        8.4777     -0.3432    -0.3037    -0.2743          -0.2591
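Two of the measures in the table, overall SNR and segmental SNR, can be computed as sketched below. These are the usual definitions rather than the authors' evaluation code; the frame length and the clipping limits of -10 dB and 35 dB for the per-frame values are assumptions of this sketch, and the signals are assumed to be time-aligned and at least one frame long.

```python
import numpy as np


def snr_db(clean, enhanced):
    """Overall SNR in dB: clean energy over the energy of the error."""
    clean = np.asarray(clean, dtype=float)
    error = clean - np.asarray(enhanced, dtype=float)
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(error ** 2))


def segmental_snr_db(clean, enhanced, frame_len=256, lo=-10.0, hi=35.0):
    """Mean of per-frame SNRs, each clipped to the range [lo, hi] dB."""
    clean = np.asarray(clean, dtype=float)
    enhanced = np.asarray(enhanced, dtype=float)
    eps = 1e-12  # avoids division by zero and log of zero in silent frames
    values = []
    for start in range(0, len(clean) - frame_len + 1, frame_len):
        s = clean[start:start + frame_len]
        e = s - enhanced[start:start + frame_len]
        frame_snr = 10.0 * np.log10((np.sum(s ** 2) + eps) / (np.sum(e ** 2) + eps))
        values.append(min(max(frame_snr, lo), hi))
    return float(np.mean(values))


clean = np.ones(512)
enhanced = 0.9 * clean  # a uniform 10% amplitude error
print(snr_db(clean, enhanced), segmental_snr_db(clean, enhanced))
```

Averaging per-frame values in dB gives quiet frames the same weight as loud ones, which is why segmental SNR can differ noticeably from the overall SNR for the same signal.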

 CONCLUSION AND FUTURE SCOPE
The hybrid threshold approach was developed to improve the efficiency of speech communication systems in a noisy industrial setting.
On a shop floor, where employees interact with one another, any loss of speech is unacceptable.
For this reason, we present a technique in which non-stationary noise enhancement methods are combined with evolutionary computation for machine learning, with the aim of improving distorted speech in the audiological phase.
Some variables are more important for speech enhancement algorithms than others, depending on the needs of shop floor staff.
This paper considers a hybrid thresholding scheme for non-stationary, low-SNR noisy speech patterns, to find the best wavelet transform range.
In hybrid thresholding, the optimal selection of decomposition levels in the wavelet transform is more critical for speech quality and intelligibility.

Compared with traditional approaches, the hybrid approach performs better on measured parameters such as SNR, SSNR, MSE, PSNR, NRMSE, and PESQ for speech quality and intelligibility.
Spectrograms for highly non-stationary noise at negative SNR show a significant improvement with the proposed approach when the above parameters are used.
The proposed method uses simple wavelets, Haar and Daubechies, to solve the de-noising problem at various SNR levels in decibels.
Compared with traditional approaches, the hybrid scheme performs better in the experimental analysis, which covers both the parameters and the spectrograms.
Here we use the multiband spectral subtraction method to enhance the signal.

This work can be extended to increase the accuracy and intelligibility of speech processing in communication.
It can also be implemented on digital signal processors for faster transmission of information between source and destination.
Although we use the multiband spectral subtraction method to enhance the signal here, the Wiener filter can also be used alongside it.

REFERENCES
[1] Hirsch, H. G., & Pearce, D. (2000). The Aurora experimental framework for the performance evaluation of speech recognition systems under noisy conditions. In ISCA ITRW ASR2000, Paris, France, 18–20, 2000.
[2] Hamid, M. E., Molla, M. K. I., Dang, X., & Nakai, T. (2013). Single channel speech enhancement using adaptive soft-thresholding with bivariate EMD. ISRN Signal Processing, 2013(2013), 1–9.
[3] Singh, S., & Mutawa, A. M. (2016). A wavelet based transform method for quality improvement in noisy speech patterns of Arabic Language. International Journal of Speech Technology, 18(2), 157–166.
[4] Farouk, M. H. (2018). Application of wavelets in speech processing. Springer briefs in speech technology (2nd ed.). Springer.
[5] Polikar, R. (1996). The wavelet tutorial. [Internet] [Cited 2017 March 30].
[6] Kaur, H., & Talwar, R. (2015). Overlapping frame approach to estimate and reduce noise from single channel speech. International Journal of Signal Processing, Image Processing and Pattern Recognition, 8(4), 49–58.
[7] Manikanta, D., & Venkateswarlu, S. C. (2021). Performance in denser networks using IoT adaptive configurations. European Journal of Molecular & Clinical Medicine, 8(1), 1664–1686.
[8] Venkateswarlu, S. C., Kiran, Ch. S., Nayan, R. V. S., Vallabhuni, V., Babu, P. A., & Nagaraju, V. S. (2021). Artificial intelligence based smart home automation system using Internet of Things. Patent Office: India, Application number 202041057023, publication date 2021/9.
[9] Venkateswarlu, S. C., Kumar, N. U., & Karthik, A. (2021). Wavelet region implanting watermark upgrades the security framework in digital speech watermarking. IOP Conference Series: Materials Science and Engineering, volume 12013, issue ICCSSS 2020, 2021/3/10.
[10] Venkateswarlu, S. C., Naluguru, U. K., & Karthik, A. (2021). Speech enhancement using recursive least square based on real-time adaptive filtering algorithm. 2021 IEEE 6th International Conference for Convergence in Technology (I2CT), pp. 1–6.