SlideShare a Scribd company logo
1 of 5
Download to read offline
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 01, Volume 6 (January 2019) www.ijirae.com
_________________________________________________________________________________________________
IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco
(2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35
IJIRAE © 2014- 19, All Rights Reserved Page–8
SPEECH RECOGNITION USING SONOGRAM AND AANN
R.Thiruvengatanadhan
Department of Computer Science and Engineering, Annamalai University,
Annamalainagar, Tamilnadu, India
thiruvengatanadhan01@gmail.com
Manuscript History
Number: IJIRAE/RS/Vol.06/Issue01/JAAE10086
Received: 02, January 2019
Final Correction: 13, January 2019
Final Accepted: 20, January 2019
Published: January 2019
Citation: Thiruvengatanadhan(2019), .Speech Recognition Using Sonogram and AANN. IJIRAE:: International Journal
of Innovative Research in Advanced Engineering, Volume VI, 08-12. doi://10.26562/IJIRAE.2019.JAAE10086
Editor: Dr.A.Arul L.S, Chief Editor, IJIRAE, AM Publications, India
Copyright: ©2019 This is an open access article distributed under the terms of the Creative Commons Attribution
License, Which Permits unrestricted use, distribution, and reproduction in any medium, provided the original author
and source are credited
Abstract— Automatic recognition of speech using computers is a challenging issue. This paper describes a
technique that uses Auto associative Neural Network (AANN) to recognized speech based on features using
Sonogram. Modeling techniques such as AANN were used to model each individual word which is trained to the
system. Each isolated word Segment using Voice Activity Detection (VAD) from the test sentence is matched
against these models for finding the semantic representation of the test input speech. Experimental results of
AANN shows good performance in recognized rate.
Keywords—Feature Extraction; Voice Activity Detection (VAD); Sonogram and support vector machines (SVM);
I. INTRODUCTION
An audio signal represents the sound as an electrical voltage. Signal flow is nothing but a route taken by an audio
signal for travelling towards the speaker from the source. Audio signal is characterized by bandwidth, power and
voltage. Impedance of the signal path determines the relation between power and voltage [1]. Electrical signal is
used by analog processors but digital signals are mathematically deals by the digital processors. Due to storage
constraints, research related to speech indexing and retrieval has received much attention [2]. As storage has
become cheaper, large collection of spoken documents is available online, but there is a lack of adequate technology
to explain them. Manual transcription of speech is costly and also has privacy constraints [3]. Hence, the need to
explore automatic approaches to search and retrieve spoken documents has increased. Moreover, a wide variety of
multimedia data is available online and paves the way for development of new technologies to index and search
such media [4]. Speech recognition is a main core of spoken language systems. Speech recognition is a complex
classification task and classified by different mathematical approaches: acoustic-phonetic approach, pattern
recognition approach, artificial intelligence approach, dynamic dime warping, connectionist approaches and
support vector machine.
Proposed work aims to develop a system which has to convert spoken word into text using AANN modeling
technique using acoustic feature namely Sonogram. In this work the temporal envelop through RMS energy of the
signal is derived for segregating individual words out of the continuous speeches using voice activity detection
method. Features for each isolated word are extracted and those models were trained. AANN modeling technique is
used to model each individual utterance. Thus each isolated word segment from the test sentence is matched
against these models for finding the semantic representation of the test input dialogue.
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 01, Volume 6 (January 2019) www.ijirae.com
_________________________________________________________________________________________________
IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco
(2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35
IJIRAE © 2014- 19, All Rights Reserved Page–9
II. VOICE ACTIVITY DETECTION
Voice Activity Detection (VAD) is a technique for finding voiced segments in speech and plays an important role in
speech mining applications [5]. VAD ignores the additional signal information around the word under
consideration. It can be also viewed as a speaker independent word recognition problem. The basic principle of a
VAD algorithm is that it extracts acoustic features from the input signal and then compares these values with
thresholds usually extracted from silence. Voice activity is declared if the measured values exceed the threshold.
Otherwise, no speech activity is present [6].
VAD finds its usage in a variety of speech communication systems like coding of speech, recognizing speech, hands
free telephony, audio conferencing, speech enhancement and cancellation of audio [7]. It identifies where the
speech is voiced, unvoiced or sustained and makes smooth progress of the speech process [8]. A frame size of 20 ms,
with an overlap of 50%, is considered for VAD. RMS is extracted for each frame. Fig. 1 shows the isolated word
separation.
Fig. 1. Isolated Word Separations.
III. SONOGRAM
Pre-emphasis is performed for the speech signal followed by frame blocking and windowing. The speech segment is
then transformed using FFT into spectrogram representation [9]. Bark scale is applied and frequency bands are
grouped into 24 critical bands. Spectral masking effect is achieved using spreading function. The spectrum energy
values are transformed into decibel scale [10]. Equal loudness contour is incorporated to calculate the loudness
level. The loudness sensation per critical band is computed. STFT is computed for each segment of pre-processed
speech. A frame size of 20 ms is deployed with 50% overlap between the frames. The sampling frequency of 1
second duration is 16 kHz. The block diagram of sonogram extraction is shown in Fig. 2.
Fig. 2. Sonogram Feature Extractions.
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 01, Volume 6 (January 2019) www.ijirae.com
_________________________________________________________________________________________________
IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco
(2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35
IJIRAE © 2014- 19, All Rights Reserved Page–10
A perceptual scale known as bark scale is applied to the spectrogram and it groups the frequencies based upon the
perceptive pitch regions to critical bands. The occlusion of one sound to another is modelled by applying a spectral
masking spread function to the signal [11]. The spectrum energy values are then transformed into decibel scale.
Phone scale computation involves equal loudness curve which represents different perception of loudness at
different frequencies respectively. The values are then transformed into a sone-scale to reflect the loudness
sensation of the human auditory system [12].
IV. AUTOASSOCIATIVE NEURAL NETWORK (AANN)
Auto associative Neural Network (AANN) model consists of five layer network which captures the distribution of
the feature vector as shown in Figure 3. The input layer in the network has less number of units than the second
and the fourth layers. The first and the fifth layers have more number of units than the third layer [13]. The number
of processing units in the second layer can be either linear or non-linear. But the processing units in the first and
third layer are non-linear. Back propagation algorithm is used to train the network [14]. The activation functions at
the second, third and fourth layer are nonlinear. The structure of the AANN model used in our study is 13L 26N 4N
26N 13L for capturing the distribution of acoustic features, where L denotes a linear unit, and N denotes anon-
linear unit. The integer value indicates the number of units used in that layer [15]. The non-linear units use tanh(s)
as the activation function, where s is the activation value of the unit. Back propagation learning algorithm is used to
adjust the weights of the network to minimize the mean square error for each feature vector [16].
Fig. 3. Auto associate neural network
V. EXPERIMENTAL RESULTS
A. Dataset Collection
Experiments are conducted for indexing speech audio using Television broadcast speechdata collected from Tamil
news channels using a tuner card. A total dataset of 100 different speech dialogue clips, ranging from 5 to 10
seconds duration, sampled at 16 kHz and encoded by 16-bit is recorded. Voice activity detection is performed to
isolate the words in each speech file using RMS energy envelope. For each speech file, a database of the isolated
words is obtained using VAD.
B. Feature Extraction
VAD the isolated words are extracted from the sentences. Thus frames which are unvoiced excitations are removed
by thresholding the segment size. Feature Sonogram are extracted from each frame of size 320 window with an
overlap of 120 samples. During training process each isolated word is separated into 20ms overlapping windows
for extracting 22 Sonogram features.
C. Classification
Using VAD isolated words in a speech is separated. AANN are created for each isolated word. For training, isolated
words from were considered. The training process analyzes speech training data to find an optimal way to classify
speech frames into their respective classes. For testing 22 dimensional Sonogram feature vectors were given as
input. The feature vectors are given as input and compared with the output to calculate the error. In this
experiment the network is trained for 500 epochs.
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 01, Volume 6 (January 2019) www.ijirae.com
_________________________________________________________________________________________________
IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco
(2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35
IJIRAE © 2014- 19, All Rights Reserved Page–11
The confidence score is calculated from the normalized squared error and the category is decided based on
highest confidence score. The performance of speech recognition is studied by varying the number of units in the
compression layer as shown in Fig. 4.
Fig. 4. Performance of Speech Recognition in Terms of Number of Units in the Compression Layer.
The performance of speech recognition in terms of number of units in the expansion layer is shown in Fig. 5.The
network structures 22L 44N 4N 44N 22L gives a good performance and this structure is obtained after some trial
and error
Fig. 5. Performance of Speech Recognition in Terms of Number of Units in the Expansion Layer.
VI. CONCLUSIONS
In this paper, Voice Activity Detection (VAD) is used for segregating individual words out of the continuous
speeches. Features for each isolated word are extracted and those models were trained successfully. Sonogram is
calculated as features to characterize audio content. AANN is used to model each Individual utterance. Sonogram
is calculated as features to characterize audio content. AANN learning algorithm has been used for the recognized
speech by learning from training data. Experimental results show that the proposed audio AANN learning method
has good performance in 90% speech recognized rate.
REFERENCES
1. Albert Bregman, Auditory Scene Analysis, MIT Press, Cambridge, 1990.
2. Tsung-Hsien Wen, Hung-Yi Lee, Pei-hao Su and Lin-shan Lee, “Interactive Spoken Content Retrieval by
Extended Query Model and Continuous State Space Markov Decision Process,” IEEE International Conference
on Acoustics, Speech and Signal Processing, pp. 8510-8514, 2013.
3. Iswarya, P. and Radha, V, “Speech and Text Query Based Tamil - English Cross Language Information Retrieval
system,” International Conference on Computer Communication and Informatics, pp. 1-4, Coimbatore, 2014.
International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163
Issue 01, Volume 6 (January 2019) www.ijirae.com
_________________________________________________________________________________________________
IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco
(2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35
IJIRAE © 2014- 19, All Rights Reserved Page–12
4. Chien-Lin Huang, Chiori Hori and Hideki Kashioka, “Semantic Inference Based on Neural Probabilistic
Language Modeling for Speech Indexing,” IEEE International Conference on Acoustics, Speech and Signal
Processing, pp. 8480-8484, 2013.
5. Ivan Markovi, Sre´ ckoJuri´ Kavelj and Ivan Petrovi, “Partial Mutual Information Based Input Variable
Selection for Supervised Learning Approaches to Voice Activity Detection,” Applied Soft Computing
Elsevier, vol. 13, pp. 4383-4391, 2013.
6. Khoubrouy, S. A. and Panahi, I.M.S., “Voice Activation Detection using Teager-Kaiser Energy Measure,”
International Symposium on Image and Signal Processing and Analysis, pp. 388-392, 2013.
7. Saleh Khawatreh, Belal Ayyoub, Ashraf Abu-Ein and Ziad Alqadi. A Novel Methodology to Extract Voice Signal
Features. International Journal of Computer Applications 179(9):40-43, January 2018.
8. Tayseer M F Taha and Amir Hussain. A Survey on Techniques for Enhancing Speech. International Journal of
Computer Applications 179(17):1-14, February 2018.
9. Xiaowen Cheng, Jarod V. Hart, and James S. Walker, “Time-frequency Analysis of Musical Rhythm,” Notices of
AMS, vol. 56, no. 3, 2008.
10.Ausgef¨uhrt, Evaluation of New Audio Features and Their Utilization in Novel Music Retrieval Applications,
Master's thesis, Vienna University of Technology, December 2006.
11.Eberhard Zwicker and Hugo Fastl, “Psychoacoustics-Facts and Models,” Springer Series of Information
Sciences, Berlin, 1999.
12.M. R. Schroder, B. S. Atal, and J. L. Hall, “Optimizing Digital Speech Coders by Exploiting Masking Properties of
the Human Ear,” Journal of the Acoustical Society of America, vol. 66, pp. 1647-1652, 1979.
13.ShaojunRen, Fengqi Si, Jianxin Zhou, Zongliang Qiao, Yuanlin Cheng, “A new reconstruction-based auto-
associative neural network for fault diagnosis in nonlinear systems,” Chemometrics and Intelligent Laboratory
Systems, Volume 172, 15 January 2018, Pages 118-128N.
14.Nitananda, M. Haseyama, and H. Kitajima, “Accurate Audio-Segment Classification using Feature Extraction
Matrix,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 261-264, 2005.
15.G. Peeters, “A Large Set of Audio Features for Sound Description,” Technical representation, IRCAM, 2004.
16.K. Lee, “Identifying Cover Songs from Audio using Harmonic Representation,” International Symposium on
Music Information Retrieval, 2006.

More Related Content

What's hot

Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundWavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundTELKOMNIKA JOURNAL
 
COLEA : A MATLAB Tool for Speech Analysis
COLEA : A MATLAB Tool for Speech AnalysisCOLEA : A MATLAB Tool for Speech Analysis
COLEA : A MATLAB Tool for Speech AnalysisRushin Shah
 
Speaker recognition in android
Speaker recognition in androidSpeaker recognition in android
Speaker recognition in androidAnshuli Mittal
 
Signal denoising techniques
Signal denoising techniquesSignal denoising techniques
Signal denoising techniquesShwetaRevankar4
 
Adaptive wavelet thresholding with robust hybrid features for text-independe...
Adaptive wavelet thresholding with robust hybrid features  for text-independe...Adaptive wavelet thresholding with robust hybrid features  for text-independe...
Adaptive wavelet thresholding with robust hybrid features for text-independe...IJECEIAES
 
Speaker identification
Speaker identificationSpeaker identification
Speaker identificationTriloki Gupta
 
Wavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker RecognitionWavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker RecognitionCSCJournals
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)inventionjournals
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSijnlc
 
A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemVani011
 
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...ijtsrd
 
Speaker recognition systems
Speaker recognition systemsSpeaker recognition systems
Speaker recognition systemsNamratha Dcruz
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportCody Ray
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...ijceronline
 
Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker VerificationCody Ray
 

What's hot (20)

Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables SoundWavelet Based Feature Extraction for the Indonesian CV Syllables Sound
Wavelet Based Feature Extraction for the Indonesian CV Syllables Sound
 
COLEA : A MATLAB Tool for Speech Analysis
COLEA : A MATLAB Tool for Speech AnalysisCOLEA : A MATLAB Tool for Speech Analysis
COLEA : A MATLAB Tool for Speech Analysis
 
Speech Signal Analysis
Speech Signal AnalysisSpeech Signal Analysis
Speech Signal Analysis
 
speech enhancement
speech enhancementspeech enhancement
speech enhancement
 
Speaker recognition in android
Speaker recognition in androidSpeaker recognition in android
Speaker recognition in android
 
Signal denoising techniques
Signal denoising techniquesSignal denoising techniques
Signal denoising techniques
 
Animal Voice Morphing System
Animal Voice Morphing SystemAnimal Voice Morphing System
Animal Voice Morphing System
 
Speaker recognition.
Speaker recognition.Speaker recognition.
Speaker recognition.
 
1801 1805
1801 18051801 1805
1801 1805
 
Adaptive wavelet thresholding with robust hybrid features for text-independe...
Adaptive wavelet thresholding with robust hybrid features  for text-independe...Adaptive wavelet thresholding with robust hybrid features  for text-independe...
Adaptive wavelet thresholding with robust hybrid features for text-independe...
 
Speaker identification
Speaker identificationSpeaker identification
Speaker identification
 
Wavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker RecognitionWavelet Based Noise Robust Features for Speaker Recognition
Wavelet Based Noise Robust Features for Speaker Recognition
 
International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)International Journal of Engineering and Science Invention (IJESI)
International Journal of Engineering and Science Invention (IJESI)
 
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTSEFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
EFFECT OF MFCC BASED FEATURES FOR SPEECH SIGNAL ALIGNMENTS
 
A Survey on Speaker Recognition System
A Survey on Speaker Recognition SystemA Survey on Speaker Recognition System
A Survey on Speaker Recognition System
 
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...
Performance Analysis of Fading Channels on Cooperative Mode Spectrum Sensing ...
 
Speaker recognition systems
Speaker recognition systemsSpeaker recognition systems
Speaker recognition systems
 
Text-Independent Speaker Verification Report
Text-Independent Speaker Verification ReportText-Independent Speaker Verification Report
Text-Independent Speaker Verification Report
 
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...IJCER (www.ijceronline.com) International Journal of computational Engineerin...
IJCER (www.ijceronline.com) International Journal of computational Engineerin...
 
Text-Independent Speaker Verification
Text-Independent Speaker VerificationText-Independent Speaker Verification
Text-Independent Speaker Verification
 

Similar to SPEECH RECOGNITION USING SONOGRAM AND AANN

IRJET- Emotion recognition using Speech Signal: A Review
IRJET-  	  Emotion recognition using Speech Signal: A ReviewIRJET-  	  Emotion recognition using Speech Signal: A Review
IRJET- Emotion recognition using Speech Signal: A ReviewIRJET Journal
 
Cochlear implant acoustic simulation model based on critical band filters
Cochlear implant acoustic simulation model based on critical band filtersCochlear implant acoustic simulation model based on critical band filters
Cochlear implant acoustic simulation model based on critical band filtersIAEME Publication
 
IRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
IRJET- Survey on Efficient Signal Processing Techniques for Speech EnhancementIRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
IRJET- Survey on Efficient Signal Processing Techniques for Speech EnhancementIRJET Journal
 
Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...IJECEIAES
 
Speech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineSpeech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineIJECEIAES
 
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION ijcsit
 
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...IJMER
 
Estimating the quality of digitally transmitted speech over satellite communi...
Estimating the quality of digitally transmitted speech over satellite communi...Estimating the quality of digitally transmitted speech over satellite communi...
Estimating the quality of digitally transmitted speech over satellite communi...Alexander Decker
 
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...ijsc
 
Financial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsFinancial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsIJERA Editor
 
Classification of vehicles based on audio signals
Classification of vehicles based on audio signalsClassification of vehicles based on audio signals
Classification of vehicles based on audio signalsijsc
 
A survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech RecognitionA survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech RecognitionIRJET Journal
 
Analysis of spectrum based approach for detection of mobile signals
Analysis of spectrum based approach for detection of mobile signalsAnalysis of spectrum based approach for detection of mobile signals
Analysis of spectrum based approach for detection of mobile signalsIAEME Publication
 
A Survey On Audio Watermarking Methods
A Survey On Audio Watermarking  MethodsA Survey On Audio Watermarking  Methods
A Survey On Audio Watermarking MethodsIRJET Journal
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...TELKOMNIKA JOURNAL
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approachijsrd.com
 
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...ijtsrd
 

Similar to SPEECH RECOGNITION USING SONOGRAM AND AANN (20)

IRJET- Emotion recognition using Speech Signal: A Review
IRJET-  	  Emotion recognition using Speech Signal: A ReviewIRJET-  	  Emotion recognition using Speech Signal: A Review
IRJET- Emotion recognition using Speech Signal: A Review
 
Cochlear implant acoustic simulation model based on critical band filters
Cochlear implant acoustic simulation model based on critical band filtersCochlear implant acoustic simulation model based on critical band filters
Cochlear implant acoustic simulation model based on critical band filters
 
IRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
IRJET- Survey on Efficient Signal Processing Techniques for Speech EnhancementIRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
IRJET- Survey on Efficient Signal Processing Techniques for Speech Enhancement
 
Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...Intelligent Arabic letters speech recognition system based on mel frequency c...
Intelligent Arabic letters speech recognition system based on mel frequency c...
 
Speech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machineSpeech emotion recognition with light gradient boosting decision trees machine
Speech emotion recognition with light gradient boosting decision trees machine
 
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION
ADAPTIVE WATERMARKING TECHNIQUE FOR SPEECH SIGNAL AUTHENTICATION
 
K010416167
K010416167K010416167
K010416167
 
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
Development of Algorithm for Voice Operated Switch for Digital Audio Control ...
 
Estimating the quality of digitally transmitted speech over satellite communi...
Estimating the quality of digitally transmitted speech over satellite communi...Estimating the quality of digitally transmitted speech over satellite communi...
Estimating the quality of digitally transmitted speech over satellite communi...
 
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...
Classification of Vehicles Based on Audio Signals using Quadratic Discriminan...
 
Financial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech SignalsFinancial Transactions in ATM Machines using Speech Signals
Financial Transactions in ATM Machines using Speech Signals
 
Classification of vehicles based on audio signals
Classification of vehicles based on audio signalsClassification of vehicles based on audio signals
Classification of vehicles based on audio signals
 
A survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech RecognitionA survey on Enhancements in Speech Recognition
A survey on Enhancements in Speech Recognition
 
Analysis of spectrum based approach for detection of mobile signals
Analysis of spectrum based approach for detection of mobile signalsAnalysis of spectrum based approach for detection of mobile signals
Analysis of spectrum based approach for detection of mobile signals
 
1801 1805
1801 18051801 1805
1801 1805
 
A Survey On Audio Watermarking Methods
A Survey On Audio Watermarking  MethodsA Survey On Audio Watermarking  Methods
A Survey On Audio Watermarking Methods
 
A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...A comparison of different support vector machine kernels for artificial speec...
A comparison of different support vector machine kernels for artificial speec...
 
50120140505010
5012014050501050120140505010
50120140505010
 
Speaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization ApproachSpeaker Recognition System using MFCC and Vector Quantization Approach
Speaker Recognition System using MFCC and Vector Quantization Approach
 
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
Acoustic Scene Classification by using Combination of MODWPT and Spectral Fea...
 

More from AM Publications

DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...AM Publications
 
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...AM Publications
 
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNTHE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNAM Publications
 
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...AM Publications
 
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...AM Publications
 
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESAM Publications
 
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS AM Publications
 
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...AM Publications
 
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONHMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONAM Publications
 
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...AM Publications
 
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...AM Publications
 
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...AM Publications
 
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...AM Publications
 
OPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNOPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNAM Publications
 
DETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTDETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTAM Publications
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTAM Publications
 
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...AM Publications
 
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...AM Publications
 
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY AM Publications
 

More from AM Publications (20)

DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
DEVELOPMENT OF TODDLER FAMILY CADRE TRAINING BASED ON ANDROID APPLICATIONS IN...
 
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
TESTING OF COMPOSITE ON DROP-WEIGHT IMPACT TESTING AND DAMAGE IDENTIFICATION ...
 
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGNTHE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
THE USE OF FRACTAL GEOMETRY IN TILING MOTIF DESIGN
 
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
TWO-DIMENSIONAL INVERSION FINITE ELEMENT MODELING OF MAGNETOTELLURIC DATA: CA...
 
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
USING THE GENETIC ALGORITHM TO OPTIMIZE LASER WELDING PARAMETERS FOR MARTENSI...
 
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISESANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
ANALYSIS AND DESIGN E-MARKETPLACE FOR MICRO, SMALL AND MEDIUM ENTERPRISES
 
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
REMOTE SENSING AND GEOGRAPHIC INFORMATION SYSTEMS
 
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
EVALUATE THE STRAIN ENERGY ERROR FOR THE LASER WELD BY THE H-REFINEMENT OF TH...
 
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITIONHMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
HMM APPLICATION IN ISOLATED WORD SPEECH RECOGNITION
 
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
PEDESTRIAN DETECTION IN LOW RESOLUTION VIDEOS USING A MULTI-FRAME HOG-BASED D...
 
INTELLIGENT BLIND STICK
INTELLIGENT BLIND STICKINTELLIGENT BLIND STICK
INTELLIGENT BLIND STICK
 
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
EFFECT OF SILICON - RUBBER (SR) SHEETS AS AN ALTERNATIVE FILTER ON HIGH AND L...
 
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
UTILIZATION OF IMMUNIZATION SERVICES AMONG CHILDREN UNDER FIVE YEARS OF AGE I...
 
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
REPRESENTATION OF THE BLOCK DATA ENCRYPTION ALGORITHM IN AN ANALYTICAL FORM F...
 
OPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNNOPTICAL CHARACTER RECOGNITION USING RBFNN
OPTICAL CHARACTER RECOGNITION USING RBFNN
 
DETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECTDETECTION OF MOVING OBJECT
DETECTION OF MOVING OBJECT
 
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENTSIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
SIMULATION OF ATMOSPHERIC POLLUTANTS DISPERSION IN AN URBAN ENVIRONMENT
 
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
PREPARATION AND EVALUATION OF WOOL KERATIN BASED CHITOSAN NANOFIBERS FOR AIR ...
 
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
ANALYSIS ON LOAD BALANCING ALGORITHMS IMPLEMENTATION ON CLOUD COMPUTING ENVIR...
 
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
A MODEL BASED APPROACH FOR IMPLEMENTING WLAN SECURITY
 

Recently uploaded

(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxupamatechverse
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingrakeshbaidya232001
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...Call Girls in Nagpur High Profile
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...RajaP95
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxDeepakSakkari2
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxupamatechverse
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...Soham Mondal
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxAsutosh Ranjan
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAbhinavSharma374939
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZTE
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLDeelipZope
 

Recently uploaded (20)

(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptxIntroduction to Multiple Access Protocol.pptx
Introduction to Multiple Access Protocol.pptx
 
Porous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writingPorous Ceramics seminar and technical writing
Porous Ceramics seminar and technical writing
 
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
High Profile Call Girls Nashik Megha 7001305949 Independent Escort Service Na...
 
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
IMPLICATIONS OF THE ABOVE HOLISTIC UNDERSTANDING OF HARMONY ON PROFESSIONAL E...
 
Biology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptxBiology for Computer Engineers Course Handout.pptx
Biology for Computer Engineers Course Handout.pptx
 
Introduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptxIntroduction and different types of Ethernet.pptx
Introduction and different types of Ethernet.pptx
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur EscortsHigh Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
High Profile Call Girls Nagpur Meera Call 7001035870 Meet With Nagpur Escorts
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
OSVC_Meta-Data based Simulation Automation to overcome Verification Challenge...
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur EscortsCall Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
Call Girls Service Nagpur Tanvi Call 7001035870 Meet With Nagpur Escorts
 
Coefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptxCoefficient of Thermal Expansion and their Importance.pptx
Coefficient of Thermal Expansion and their Importance.pptx
 
Analog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog ConverterAnalog to Digital and Digital to Analog Converter
Analog to Digital and Digital to Analog Converter
 
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
ZXCTN 5804 / ZTE PTN / ZTE POTN / ZTE 5804 PTN / ZTE POTN 5804 ( 100/200 GE Z...
 
Current Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCLCurrent Transformer Drawing and GTP for MSETCL
Current Transformer Drawing and GTP for MSETCL
 

SPEECH RECOGNITION USING SONOGRAM AND AANN

  • 1. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 01, Volume 6 (January 2019) www.ijirae.com _________________________________________________________________________________________________ IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco (2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35 IJIRAE © 2014- 19, All Rights Reserved Page–8 SPEECH RECOGNITION USING SONOGRAM AND AANN R.Thiruvengatanadhan Department of Computer Science and Engineering, Annamalai University, Annamalainagar, Tamilnadu, India thiruvengatanadhan01@gmail.com Manuscript History Number: IJIRAE/RS/Vol.06/Issue01/JAAE10086 Received: 02, January 2019 Final Correction: 13, January 2019 Final Accepted: 20, January 2019 Published: January 2019 Citation: Thiruvengatanadhan(2019), .Speech Recognition Using Sonogram and AANN. IJIRAE:: International Journal of Innovative Research in Advanced Engineering, Volume VI, 08-12. doi://10.26562/IJIRAE.2019.JAAE10086 Editor: Dr.A.Arul L.S, Chief Editor, IJIRAE, AM Publications, India Copyright: ©2019 This is an open access article distributed under the terms of the Creative Commons Attribution License, Which Permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited Abstract— Automatic recognition of speech using computers is a challenging issue. This paper describes a technique that uses Auto associative Neural Network (AANN) to recognized speech based on features using Sonogram. Modeling techniques such as AANN were used to model each individual word which is trained to the system. Each isolated word Segment using Voice Activity Detection (VAD) from the test sentence is matched against these models for finding the semantic representation of the test input speech. Experimental results of AANN shows good performance in recognized rate. Keywords—Feature Extraction; Voice Activity Detection (VAD); Sonogram and support vector machines (SVM); I. INTRODUCTION An audio signal represents the sound as an electrical voltage. Signal flow is nothing but a route taken by an audio signal for travelling towards the speaker from the source. Audio signal is characterized by bandwidth, power and voltage. Impedance of the signal path determines the relation between power and voltage [1]. Electrical signal is used by analog processors but digital signals are mathematically deals by the digital processors. Due to storage constraints, research related to speech indexing and retrieval has received much attention [2]. As storage has become cheaper, large collection of spoken documents is available online, but there is a lack of adequate technology to explain them. Manual transcription of speech is costly and also has privacy constraints [3]. Hence, the need to explore automatic approaches to search and retrieve spoken documents has increased. Moreover, a wide variety of multimedia data is available online and paves the way for development of new technologies to index and search such media [4]. Speech recognition is a main core of spoken language systems. Speech recognition is a complex classification task and classified by different mathematical approaches: acoustic-phonetic approach, pattern recognition approach, artificial intelligence approach, dynamic dime warping, connectionist approaches and support vector machine. Proposed work aims to develop a system which has to convert spoken word into text using AANN modeling technique using acoustic feature namely Sonogram. In this work the temporal envelop through RMS energy of the signal is derived for segregating individual words out of the continuous speeches using voice activity detection method. Features for each isolated word are extracted and those models were trained. AANN modeling technique is used to model each individual utterance. Thus each isolated word segment from the test sentence is matched against these models for finding the semantic representation of the test input dialogue.
  • 2. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 01, Volume 6 (January 2019) www.ijirae.com _________________________________________________________________________________________________ IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco (2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35 IJIRAE © 2014- 19, All Rights Reserved Page–9 II. VOICE ACTIVITY DETECTION Voice Activity Detection (VAD) is a technique for finding voiced segments in speech and plays an important role in speech mining applications [5]. VAD ignores the additional signal information around the word under consideration. It can be also viewed as a speaker independent word recognition problem. The basic principle of a VAD algorithm is that it extracts acoustic features from the input signal and then compares these values with thresholds usually extracted from silence. Voice activity is declared if the measured values exceed the threshold. Otherwise, no speech activity is present [6]. VAD finds its usage in a variety of speech communication systems like coding of speech, recognizing speech, hands free telephony, audio conferencing, speech enhancement and cancellation of audio [7]. It identifies where the speech is voiced, unvoiced or sustained and makes smooth progress of the speech process [8]. A frame size of 20 ms, with an overlap of 50%, is considered for VAD. RMS is extracted for each frame. Fig. 1 shows the isolated word separation. Fig. 1. Isolated Word Separations. III. SONOGRAM Pre-emphasis is performed for the speech signal followed by frame blocking and windowing. The speech segment is then transformed using FFT into spectrogram representation [9]. Bark scale is applied and frequency bands are grouped into 24 critical bands. Spectral masking effect is achieved using spreading function. The spectrum energy values are transformed into decibel scale [10]. Equal loudness contour is incorporated to calculate the loudness level. The loudness sensation per critical band is computed. STFT is computed for each segment of pre-processed speech. A frame size of 20 ms is deployed with 50% overlap between the frames. The sampling frequency of 1 second duration is 16 kHz. The block diagram of sonogram extraction is shown in Fig. 2. Fig. 2. Sonogram Feature Extractions.
  • 3. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 01, Volume 6 (January 2019) www.ijirae.com _________________________________________________________________________________________________ IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco (2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35 IJIRAE © 2014- 19, All Rights Reserved Page–10 A perceptual scale known as bark scale is applied to the spectrogram and it groups the frequencies based upon the perceptive pitch regions to critical bands. The occlusion of one sound to another is modelled by applying a spectral masking spread function to the signal [11]. The spectrum energy values are then transformed into decibel scale. Phone scale computation involves equal loudness curve which represents different perception of loudness at different frequencies respectively. The values are then transformed into a sone-scale to reflect the loudness sensation of the human auditory system [12]. IV. AUTOASSOCIATIVE NEURAL NETWORK (AANN) Auto associative Neural Network (AANN) model consists of five layer network which captures the distribution of the feature vector as shown in Figure 3. The input layer in the network has less number of units than the second and the fourth layers. The first and the fifth layers have more number of units than the third layer [13]. The number of processing units in the second layer can be either linear or non-linear. But the processing units in the first and third layer are non-linear. Back propagation algorithm is used to train the network [14]. The activation functions at the second, third and fourth layer are nonlinear. The structure of the AANN model used in our study is 13L 26N 4N 26N 13L for capturing the distribution of acoustic features, where L denotes a linear unit, and N denotes anon- linear unit. The integer value indicates the number of units used in that layer [15]. The non-linear units use tanh(s) as the activation function, where s is the activation value of the unit. Back propagation learning algorithm is used to adjust the weights of the network to minimize the mean square error for each feature vector [16]. Fig. 3. Auto associate neural network V. EXPERIMENTAL RESULTS A. Dataset Collection Experiments are conducted for indexing speech audio using Television broadcast speechdata collected from Tamil news channels using a tuner card. A total dataset of 100 different speech dialogue clips, ranging from 5 to 10 seconds duration, sampled at 16 kHz and encoded by 16-bit is recorded. Voice activity detection is performed to isolate the words in each speech file using RMS energy envelope. For each speech file, a database of the isolated words is obtained using VAD. B. Feature Extraction VAD the isolated words are extracted from the sentences. Thus frames which are unvoiced excitations are removed by thresholding the segment size. Feature Sonogram are extracted from each frame of size 320 window with an overlap of 120 samples. During training process each isolated word is separated into 20ms overlapping windows for extracting 22 Sonogram features. C. Classification Using VAD isolated words in a speech is separated. AANN are created for each isolated word. For training, isolated words from were considered. The training process analyzes speech training data to find an optimal way to classify speech frames into their respective classes. For testing 22 dimensional Sonogram feature vectors were given as input. The feature vectors are given as input and compared with the output to calculate the error. In this experiment the network is trained for 500 epochs.
  • 4. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 01, Volume 6 (January 2019) www.ijirae.com _________________________________________________________________________________________________ IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco (2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35 IJIRAE © 2014- 19, All Rights Reserved Page–11 The confidence score is calculated from the normalized squared error and the category is decided based on highest confidence score. The performance of speech recognition is studied by varying the number of units in the compression layer as shown in Fig. 4. Fig. 4. Performance of Speech Recognition in Terms of Number of Units in the Compression Layer. The performance of speech recognition in terms of number of units in the expansion layer is shown in Fig. 5.The network structures 22L 44N 4N 44N 22L gives a good performance and this structure is obtained after some trial and error Fig. 5. Performance of Speech Recognition in Terms of Number of Units in the Expansion Layer. VI. CONCLUSIONS In this paper, Voice Activity Detection (VAD) is used for segregating individual words out of the continuous speeches. Features for each isolated word are extracted and those models were trained successfully. Sonogram is calculated as features to characterize audio content. AANN is used to model each Individual utterance. Sonogram is calculated as features to characterize audio content. AANN learning algorithm has been used for the recognized speech by learning from training data. Experimental results show that the proposed audio AANN learning method has good performance in 90% speech recognized rate. REFERENCES 1. Albert Bregman, Auditory Scene Analysis, MIT Press, Cambridge, 1990. 2. Tsung-Hsien Wen, Hung-Yi Lee, Pei-hao Su and Lin-shan Lee, “Interactive Spoken Content Retrieval by Extended Query Model and Continuous State Space Markov Decision Process,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8510-8514, 2013. 3. Iswarya, P. and Radha, V, “Speech and Text Query Based Tamil - English Cross Language Information Retrieval system,” International Conference on Computer Communication and Informatics, pp. 1-4, Coimbatore, 2014.
  • 5. International Journal of Innovative Research in Advanced Engineering (IJIRAE) ISSN: 2349-2163 Issue 01, Volume 6 (January 2019) www.ijirae.com _________________________________________________________________________________________________ IJIRAE: Impact Factor Value – Mendeley (Elsevier Indexed); Citefactor 1.9 (2017) ; SJIF: Innospace, Morocco (2016): 3.916 | PIF: 2.469 | Jour Info: 4.085 | ISRAJIF (2017): 4.011 | Indexcopernicus: (ICV 2016): 64.35 IJIRAE © 2014- 19, All Rights Reserved Page–12 4. Chien-Lin Huang, Chiori Hori and Hideki Kashioka, “Semantic Inference Based on Neural Probabilistic Language Modeling for Speech Indexing,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 8480-8484, 2013. 5. Ivan Markovi, Sre´ ckoJuri´ Kavelj and Ivan Petrovi, “Partial Mutual Information Based Input Variable Selection for Supervised Learning Approaches to Voice Activity Detection,” Applied Soft Computing Elsevier, vol. 13, pp. 4383-4391, 2013. 6. Khoubrouy, S. A. and Panahi, I.M.S., “Voice Activation Detection using Teager-Kaiser Energy Measure,” International Symposium on Image and Signal Processing and Analysis, pp. 388-392, 2013. 7. Saleh Khawatreh, Belal Ayyoub, Ashraf Abu-Ein and Ziad Alqadi. A Novel Methodology to Extract Voice Signal Features. International Journal of Computer Applications 179(9):40-43, January 2018. 8. Tayseer M F Taha and Amir Hussain. A Survey on Techniques for Enhancing Speech. International Journal of Computer Applications 179(17):1-14, February 2018. 9. Xiaowen Cheng, Jarod V. Hart, and James S. Walker, “Time-frequency Analysis of Musical Rhythm,” Notices of AMS, vol. 56, no. 3, 2008. 10.Ausgef¨uhrt, Evaluation of New Audio Features and Their Utilization in Novel Music Retrieval Applications, Master's thesis, Vienna University of Technology, December 2006. 11.Eberhard Zwicker and Hugo Fastl, “Psychoacoustics-Facts and Models,” Springer Series of Information Sciences, Berlin, 1999. 12.M. R. Schroder, B. S. Atal, and J. L. Hall, “Optimizing Digital Speech Coders by Exploiting Masking Properties of the Human Ear,” Journal of the Acoustical Society of America, vol. 66, pp. 1647-1652, 1979. 13.ShaojunRen, Fengqi Si, Jianxin Zhou, Zongliang Qiao, Yuanlin Cheng, “A new reconstruction-based auto- associative neural network for fault diagnosis in nonlinear systems,” Chemometrics and Intelligent Laboratory Systems, Volume 172, 15 January 2018, Pages 118-128N. 14.Nitananda, M. Haseyama, and H. Kitajima, “Accurate Audio-Segment Classification using Feature Extraction Matrix,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 261-264, 2005. 15.G. Peeters, “A Large Set of Audio Features for Sound Description,” Technical representation, IRCAM, 2004. 16.K. Lee, “Identifying Cover Songs from Audio using Harmonic Representation,” International Symposium on Music Information Retrieval, 2006.