SlideShare a Scribd company logo
International Association of Scientific Innovation and Research (IASIR)
(An Association Unifying the Sciences, Engineering, and Applied Research)
International Journal of Emerging Technologies in Computational
and Applied Sciences (IJETCAS)
www.iasir.net
IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 281
ISSN (Print): 2279-0047
ISSN (Online): 2279-0055
An SVM Based Speaker Independent Isolated Malayalam Word
Recognition System
Thushara P V1
, Gopakumar C2
1
M.Tech Scholar, Department of Electronics, College of Engineering, Chengannur
Chengannur, Alappuzha Dist., Kerala-689121, INDIA
2
Associate Professor, Department of Electronics, College of Engineering, Karunagapally
Karunagapally, Kollam Dist., Kerala-690523, INDIA
Abstract: As technology advances man-machine interaction is becoming an unavoidable activity. So an effective
method of communication with machines enhances the quality of life. If it is able to operate a system by simply
commanding, then it will be a great blessing to the users. Speech is the most effective mode of communication
used by humans. So by introducing voice user interfaces the interaction with the machines can be made more
user friendly. This paper implements a speaker independent speech recognition system for limited vocabulary
Malayalam Words in Raspberry Pi. Mel Frequency Cepstral Coefficients (MFCC) are the features for
classification and this paper proposes Radial Basis Function (RBF) kernel in Support Vector Machine (SVM)
classifier gives better accuracy in speech recognition than linear kernel. An overall accuracy of 91.8% is
obtained with this work.
Keywords: Speech Recognition; Feature Extraction; MFCC; Support Vector Machine; RBF kernel; Linear
kernel; Raspberry Pi
I. Introduction
Automatic speech recognition can be defined as a technology which enables a system to recognize the input
speech signals and interpret the meaning, after which the system should be able to generate some control signals
[1]. In this paper, the speech recognition system recognizes Malayalam words with limited vocabulary. There
are so many challenges in speech processing like acoustic and phonetic variability. Isolated speech input is used
since continuous speech is difficult to process. In continuous speech, it is difficult to find the start and end points
of words. Within all these considerations, system recognizes words with better accuracy and greater speed.
Malayalam is recently being considered among one of the classic languages in India. Speech recognition system
for Malayalam language helps people who are not conversant with English and unaware of using computer.
Malayalam is one among the 22 languages spoken in India with classical status [2]. Malayalam belongs to the
Dravidian family of languages and most of the Malayalam speakers live in the Kerala, one of the southern states
of India. There are 37 consonants and 16 vowels in the language. For the proper functioning of the system, there
should be distinct pauses between the words i.e. isolated words. Due to the memory constrains in the handheld
device, the vocabulary supported by the system is limited i.e. it is a limited vocabulary speaker independent
isolated word recognition system.
II. Literature Survey
Nowadays, innovation in scientific research is focused much more on the interactions between humans and
technology and automatic speech recognition is a driving force in this process. Speech recognition technology is
changing the way information is accessed, tasks are accomplished and business is done. There are two related
speech tasks: speech understanding and speech recognition. Speech understanding is getting the meaning of an
utterance such that one can respond properly whether or not one has correctly recognized all of the words.
Speech recognition is simply transcribing the speech without necessarily knowing the meaning of the utterance.
The two can be combined, but the task described here is purely recognition. Automatic speech recognition
(ASR) is the ability of a machine to convert the words that is spoken in to the microphone to recognized words.
A. Speech Production and Perception
Five different elements are associated with speech production and perception. They are speech formulation,
human vocal mechanism, acoustic air, perception of the ear, speech comprehension etc.
Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp.
281-285
IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 282
B. Types of ASR Systems
ASR can be classified in several ways: speaker dependent or independent, discrete or continuous, and small or
large vocabulary.
1. Speaker Independent (SI) systems or Speaker Dependent (SD) system: System can recognize a variety
of speakers, without any training. Such systems limit the number of words in a vocabulary. But speaker
dependent system can only recognize the speech of users it is trained to understand [3].
2. Discrete ASR or Continuous ASR: Discrete ASR recognizes isolated utterances. Here the user must
speak unnaturally, leaving distinct pauses between each word. In Continuous ASR, the user can speak
naturally, with normal conversational pauses, but it is more difficult for the system to detect the word
boundaries.
3. Small vocabulary or large vocabulary system: In small vocabulary ASR, all the words in the
vocabulary are trained at least once, whereas large vocabulary systems recognize sounds rather than
whole words and are able of recognizing words that have never been in the training set.
C. Factors Affecting Speech Recognition Performance
Different speech recognition systems have different parameters and design methodology according to the
application. But in most cases some factors are similar. For instance, Vocabulary size and variability of factors
[3].These factors have significant roles for the accuracy of the system. According to how much vocabulary can
be recognized, speech recognition can be divided into three different scales vocabulary speech recognition:
Small vocabulary speech recognition, Medium vocabulary speech recognition, large vocabulary speech
recognition. Small-scale can identify less than 100 vocabulary while medium-scale can identify more than 100
vocabulary and large-scale can identify more than 1000 vocabulary. Other factors that affect performance of
speech recognition system, are variability in speakers, environments, transmission channels and microphones.
The variability in speakers involves gender, speed of speaking, regional changes in language. Variability in
environment means whether it is noisy or clean. The bandwidth of transmission channel also plays an important
role in determining the accuracy of the system as number of samples transmitted changes when bandwidth
changes. According to the type of microphone and distance of microphone from mouth, reliability of system
may changes. And finally the performance of the system will be in the hands of experts behind the work.
III. Methodology
A speech recognition system is basically a pattern recognition system dedicated to recognize the words spoken
into microphone. Proposed automatic speech recognition system (ASR) starts with a feature extraction module
where, the input speech waveform is processed to extract the required acoustic feature vectors that are used to
characterize the spectral properties of the time varying speech signal. Feature extraction is processing the input
speech to extract compact and efficient set of parameters that uniquely represents the speech input .The second
stage is the classifier. This stage evaluates the similarities between the input feature vector sequence and trained
database to determine which words were most likely spoken. By feature extraction, we calculate Mel Frequency
Cepstral Coefficients (MFCCs) [4].We use MFCCs as the feature for classification because it allows better
processing of data, high accuracy for clean speech and it approximates the human auditory system's response.
The classifier used for training and testing is Support Vector Machine (SVM). The block diagram of the
proposed speech recognition system is shown in figure 1:
Figure 1. Block Diagram of Proposed System
A. Voice Activity Detection
Voice Activity Detection is a preprocessing technique which detects the start and end point of a word. In the
proposed method we use three different features per frame. The first feature is the widely used short-term
Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp.
281-285
IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 283
energy (E) [5]. Energy is the most common feature for speech/silence detection. Another feature selected for
separating speech and silence is zero crossing rate. It is observed that silence parts have high zero crossing rate
whereas in speech, zero crossing is low. Besides these two features, it was observed that the most dominant
frequency component of the speech frame spectrum can be very useful in discriminating between speech and
silence frames.
B. Mel Frequency Cepstral Coefficients (MFCCs)
Figure 2. Calculating MFCCs
C. Support Vector Machine (SVM) Classifier
Pattern recognition algorithm used for the proposed system is Support Vector Machine [6]. MFCCs of each of
the words is given to the classifier stage for training using SVM learning algorithm. Here 100 samples of each
word from different speakers is used to train the classifier and testing is conducted with another different
speakers in order to make the system speaker independent. Here we optimize the parameters of SVM by making
a comparison between the kernel functions. In this work, we use simple linear kernel and radial basis function
kernel for comparison.
IV. Results and Analysis
A. Database Used
For conducting this experiment we choose six Malayalam words. The samples stored in the database are
recorded by using a high quality studio-recording microphone at a sampling rate of 16 KHz. Malayalam
numerals from one to six is chosen to create the database. Twelve speakers are selected to record the words.
Each speaker utters six words with thirty samples each. We have used six male speakers and six female speakers
for creating the database. Thus the database consists of a total of 2160 utterances of the spoken words. Speech
database is shown in Table 1.
Table 1. Speech Database
NO WORDS
1 ഒന്ന് (onnu)
2 രണ്ട് (randu)
3 മൂന്ന് (moonnu)
4 നാല് (naalu)
5 അഞ്ച് (anj)
6 ആറു (aaru)
B. Voice Activity Detection
This important front end processing detects the start and end point of words. The result obtained by voice
activity detection of word ‘onnu’ is plotted in Figure 3.
C. Simulation Parameters of Feature Extraction Block
 Sampling frequency = 16000Hz
 Frame duration = 10ms
 Frame overlapping = 5ms
 Number of DFT points = 256
 Number of Mel filters = 24
 Number of MFCC coefficients = 19
Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp.
281-285
IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 284
Figure 3. Voice Activity Detection of word ‘ONNU’
D. Recognition Results
There are 100 input samples for training and testing is conducted with samples of different speakers for each
word. Training is performed with maximum iterations of 100. In SVM, it is possible to use RBF kernel and
linear kernel. In this paper, a comparison of accuracy in test results is made for both the cases when RBF and
Linear kernel are used. The accuracy obtained in both the experiments are shown in the table II:
Table 2. Accuracy Test Results
Words Accuracy (%)
Linear kernel RBF kernel
onnu 75 91.6
randu 91.6 91.6
moonnu 84 100
naalu 50 100
anj 68 84
aaru 84 84
Average 75.5 91.8
E. Comparison of Results
0
20
40
60
80
100
accuracy
Words
COMPARISON BETWEEN
KERNELS IN SVM
linearkernel
RBF kernel
Figure 4. Comparison of accuracies between linear and RBF kernel
The experimental results prove that training with RBF kernel gives better accuracy in recognition than with
linear kernel.System trained with linear kernel has got an average accuracy of 75.5% whereas for RBF kernel it
is 91.8%.
Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp.
281-285
IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 285
F. Hardware Implementation Results
The system consists of six Malayalam words and five different speakers for training and 5 speakers for testing.
The experiment was conducted in Raspberry Pi and obtained 91.8% accuracy.
Figure 5. Hardware platform: Raspberry Pi
V. Conclusion
The paper proposes a speech recognition system using MFCCs and support vector machine. The proposed
method improves the accuracy of recognition compared to existing methods such as Wavelet coefficients and
ANN [7], MFCCs and k-means clustering, Formant frequencies and ANN [8] etc. Misclassification rate in
support vector machine is analyzed by comparing the kernel functions for mapping input space to feature space.
The generalization capability of SVM classifier improves when we are using RBF kernel compared to linear
kernel. The system implemented in Raspberry Pi has got an accuracy of 91.8%.
References
[1] Shivanker Dev Dhingra , Geeta Nijhawan , Poonam Pandit , “Isolated speech recognition using mfcc and dtw”,International Journal
of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 2, Issue 8, August 2013
[2] Sreejith C, “Isolated Spoken Word Identification in Malayalam using Mel-frequency Cepstral Coefficients and K-means clustering”,
International Journal of Science and Research, Vol. 1, December 2012, pp.163-167
[3] Nitin Trivedi et al, “Speech Recognition by Wavelet Analysis”, International Journal of Computer Application, Volume 15, February
2011, pp.27-32
[4] Daniel Jurafsky & James H. Martin, “Speech and Language Processing: An introduction to natural language processing, computational
linguistics, and speech recognition”, Pearson Education, 2007, pp.327-335
[5] M. H. Moattar, M. M. Homayounpour, “A simple but efficient real-time voice activity detection Algorithm”, 17th European
Conference on Signal Processing, August 24-28, 2009, pp.2549-2553
[6] M.A.Anusuya, S.K.Katti, “Classification Techniques used in Speech Recognition Applications: A Review”, International Journal of
computer applications, Vol 2, AUGUST 2011, pp.910-954
[7] Sonia Sunny et al, “Development of a Speech Recognition System for Speaker Independent Isolated Malayalam Words”,
International Journal of Computer Science & Engineering Technology, Vol. 3, April 2012, pp.69-75
[8] Dipanwita Paul, “Automated speech recognition of isolated words using neural networks”, International Journal of Engineering
Science and Technology, Vol. 3, June 2011

More Related Content

What's hot

SMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk SystemSMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk System
CSCJournals
 
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURESSPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
acijjournal
 
G1802024651
G1802024651G1802024651
G1802024651
IOSR Journals
 
IRJET - Sign Language Converter
IRJET -  	  Sign Language ConverterIRJET -  	  Sign Language Converter
IRJET - Sign Language Converter
IRJET Journal
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327IJMER
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Editor IJARCET
 
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis SystemEvaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
IJERA Editor
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
IJERA Editor
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
IOSR Journals
 
IRJET- Comparative Analysis of Emotion Recognition System
IRJET- Comparative Analysis of Emotion Recognition SystemIRJET- Comparative Analysis of Emotion Recognition System
IRJET- Comparative Analysis of Emotion Recognition System
IRJET Journal
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
eSAT Publishing House
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
eSAT Journals
 
IRJET- Voice based E-mail system
IRJET- Voice based E-mail systemIRJET- Voice based E-mail system
IRJET- Voice based E-mail system
IRJET Journal
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
iosrjce
 
Assistive Examination System for Visually Impaired
Assistive Examination System for Visually ImpairedAssistive Examination System for Visually Impaired
Assistive Examination System for Visually Impaired
Editor IJCATR
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
IJCSEA Journal
 

What's hot (17)

SMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk SystemSMATalk: Standard Malay Text to Speech Talk System
SMATalk: Standard Malay Text to Speech Talk System
 
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURESSPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
SPEAKER VERIFICATION USING ACOUSTIC AND PROSODIC FEATURES
 
10
1010
10
 
G1802024651
G1802024651G1802024651
G1802024651
 
IRJET - Sign Language Converter
IRJET -  	  Sign Language ConverterIRJET -  	  Sign Language Converter
IRJET - Sign Language Converter
 
E0502 01 2327
E0502 01 2327E0502 01 2327
E0502 01 2327
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
 
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis SystemEvaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
Evaluation of Hidden Markov Model based Marathi Text-ToSpeech Synthesis System
 
Dy36749754
Dy36749754Dy36749754
Dy36749754
 
Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...Speech Recognized Automation System Using Speaker Identification through Wire...
Speech Recognized Automation System Using Speaker Identification through Wire...
 
IRJET- Comparative Analysis of Emotion Recognition System
IRJET- Comparative Analysis of Emotion Recognition SystemIRJET- Comparative Analysis of Emotion Recognition System
IRJET- Comparative Analysis of Emotion Recognition System
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
 
An optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phonesAn optimized approach to voice translation on mobile phones
An optimized approach to voice translation on mobile phones
 
IRJET- Voice based E-mail system
IRJET- Voice based E-mail systemIRJET- Voice based E-mail system
IRJET- Voice based E-mail system
 
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
Approach of Syllable Based Unit Selection Text- To-Speech Synthesis System fo...
 
Assistive Examination System for Visually Impaired
Assistive Examination System for Visually ImpairedAssistive Examination System for Visually Impaired
Assistive Examination System for Visually Impaired
 
Utterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANNUtterance Based Speaker Identification Using ANN
Utterance Based Speaker Identification Using ANN
 

Viewers also liked

Ijetcas14 572
Ijetcas14 572Ijetcas14 572
Ijetcas14 572
Iasir Journals
 
Ijetcas14 504
Ijetcas14 504Ijetcas14 504
Ijetcas14 504
Iasir Journals
 
Ijetcas14 580
Ijetcas14 580Ijetcas14 580
Ijetcas14 580
Iasir Journals
 
Ijetcas14 516
Ijetcas14 516Ijetcas14 516
Ijetcas14 516
Iasir Journals
 
Ijetcas14 647
Ijetcas14 647Ijetcas14 647
Ijetcas14 647
Iasir Journals
 
Ijetcas14 524
Ijetcas14 524Ijetcas14 524
Ijetcas14 524
Iasir Journals
 

Viewers also liked (19)

Ijetcas14 316
Ijetcas14 316Ijetcas14 316
Ijetcas14 316
 
Ijebea14 275
Ijebea14 275Ijebea14 275
Ijebea14 275
 
Ijetcas14 572
Ijetcas14 572Ijetcas14 572
Ijetcas14 572
 
Ijetcas14 472
Ijetcas14 472Ijetcas14 472
Ijetcas14 472
 
Ijebea14 253
Ijebea14 253Ijebea14 253
Ijebea14 253
 
Ijebea14 229
Ijebea14 229Ijebea14 229
Ijebea14 229
 
Aijrfans14 224
Aijrfans14 224Aijrfans14 224
Aijrfans14 224
 
Ijetcas14 504
Ijetcas14 504Ijetcas14 504
Ijetcas14 504
 
Ijetcas14 483
Ijetcas14 483Ijetcas14 483
Ijetcas14 483
 
Ijetcas14 489
Ijetcas14 489Ijetcas14 489
Ijetcas14 489
 
Ijetcas14 458
Ijetcas14 458Ijetcas14 458
Ijetcas14 458
 
Ijetcas14 468
Ijetcas14 468Ijetcas14 468
Ijetcas14 468
 
Aijrfans14 293
Aijrfans14 293Aijrfans14 293
Aijrfans14 293
 
Ijetcas14 580
Ijetcas14 580Ijetcas14 580
Ijetcas14 580
 
Ijetcas14 516
Ijetcas14 516Ijetcas14 516
Ijetcas14 516
 
Aijrfans14 288
Aijrfans14 288Aijrfans14 288
Aijrfans14 288
 
Ijetcas14 647
Ijetcas14 647Ijetcas14 647
Ijetcas14 647
 
Ijetcas14 385
Ijetcas14 385Ijetcas14 385
Ijetcas14 385
 
Ijetcas14 524
Ijetcas14 524Ijetcas14 524
Ijetcas14 524
 

Similar to Ijetcas14 390

International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
sophiabelthome
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
AIRCC Publishing Corporation
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
IJERA Editor
 
ACHIEVING SECURITY VIA SPEECH RECOGNITION
ACHIEVING SECURITY VIA SPEECH RECOGNITIONACHIEVING SECURITY VIA SPEECH RECOGNITION
ACHIEVING SECURITY VIA SPEECH RECOGNITION
ijistjournal
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification Techniques
Nicole Heredia
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01ijcsit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
ijcseit
 
Performance of different classifiers in speech recognition
Performance of different classifiers in speech recognitionPerformance of different classifiers in speech recognition
Performance of different classifiers in speech recognition
eSAT Journals
 
On Developing an Automatic Speech Recognition System for Commonly used Englis...
On Developing an Automatic Speech Recognition System for Commonly used Englis...On Developing an Automatic Speech Recognition System for Commonly used Englis...
On Developing an Automatic Speech Recognition System for Commonly used Englis...
rahulmonikasharma
 
De4201715719
De4201715719De4201715719
De4201715719
IJERA Editor
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
eSAT Publishing House
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
eSAT Journals
 
19 ijcse-01227
19 ijcse-0122719 ijcse-01227
19 ijcse-01227
Shivlal Mewada
 
A Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language SpecificationA Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language Specification
ijtsrd
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Editor IJARCET
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversionankit_saluja
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversionankit_saluja
 

Similar to Ijetcas14 390 (20)

International journal of signal and image processing issues vol 2015 - no 1...
International journal of signal and image processing issues   vol 2015 - no 1...International journal of signal and image processing issues   vol 2015 - no 1...
International journal of signal and image processing issues vol 2015 - no 1...
 
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
CURVELET BASED SPEECH RECOGNITION SYSTEM IN NOISY ENVIRONMENT: A STATISTICAL ...
 
Kc3517481754
Kc3517481754Kc3517481754
Kc3517481754
 
ACHIEVING SECURITY VIA SPEECH RECOGNITION
ACHIEVING SECURITY VIA SPEECH RECOGNITIONACHIEVING SECURITY VIA SPEECH RECOGNITION
ACHIEVING SECURITY VIA SPEECH RECOGNITION
 
A Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification TechniquesA Review On Speech Feature Techniques And Classification Techniques
A Review On Speech Feature Techniques And Classification Techniques
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
5215ijcseit01
5215ijcseit015215ijcseit01
5215ijcseit01
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMARSYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
SYLLABLE-BASED SPEECH RECOGNITION SYSTEM FOR MYANMAR
 
Performance of different classifiers in speech recognition
Performance of different classifiers in speech recognitionPerformance of different classifiers in speech recognition
Performance of different classifiers in speech recognition
 
On Developing an Automatic Speech Recognition System for Commonly used Englis...
On Developing an Automatic Speech Recognition System for Commonly used Englis...On Developing an Automatic Speech Recognition System for Commonly used Englis...
On Developing an Automatic Speech Recognition System for Commonly used Englis...
 
De4201715719
De4201715719De4201715719
De4201715719
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
 
Emotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifierEmotional telugu speech signals classification based on k nn classifier
Emotional telugu speech signals classification based on k nn classifier
 
19 ijcse-01227
19 ijcse-0122719 ijcse-01227
19 ijcse-01227
 
A Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language SpecificationA Survey on Speech Recognition with Language Specification
A Survey on Speech Recognition with Language Specification
 
Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189Volume 2-issue-6-2186-2189
Volume 2-issue-6-2186-2189
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 
Speech to text conversion
Speech to text conversionSpeech to text conversion
Speech to text conversion
 

More from Iasir Journals

ijetcas14 650
ijetcas14 650ijetcas14 650
ijetcas14 650
Iasir Journals
 
Ijetcas14 648
Ijetcas14 648Ijetcas14 648
Ijetcas14 648
Iasir Journals
 
Ijetcas14 643
Ijetcas14 643Ijetcas14 643
Ijetcas14 643
Iasir Journals
 
Ijetcas14 641
Ijetcas14 641Ijetcas14 641
Ijetcas14 641
Iasir Journals
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
Iasir Journals
 
Ijetcas14 632
Ijetcas14 632Ijetcas14 632
Ijetcas14 632
Iasir Journals
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
Iasir Journals
 
Ijetcas14 619
Ijetcas14 619Ijetcas14 619
Ijetcas14 619
Iasir Journals
 
Ijetcas14 615
Ijetcas14 615Ijetcas14 615
Ijetcas14 615
Iasir Journals
 
Ijetcas14 608
Ijetcas14 608Ijetcas14 608
Ijetcas14 608
Iasir Journals
 
Ijetcas14 605
Ijetcas14 605Ijetcas14 605
Ijetcas14 605
Iasir Journals
 
Ijetcas14 604
Ijetcas14 604Ijetcas14 604
Ijetcas14 604
Iasir Journals
 
Ijetcas14 598
Ijetcas14 598Ijetcas14 598
Ijetcas14 598
Iasir Journals
 
Ijetcas14 594
Ijetcas14 594Ijetcas14 594
Ijetcas14 594
Iasir Journals
 
Ijetcas14 593
Ijetcas14 593Ijetcas14 593
Ijetcas14 593
Iasir Journals
 
Ijetcas14 591
Ijetcas14 591Ijetcas14 591
Ijetcas14 591
Iasir Journals
 
Ijetcas14 589
Ijetcas14 589Ijetcas14 589
Ijetcas14 589
Iasir Journals
 
Ijetcas14 585
Ijetcas14 585Ijetcas14 585
Ijetcas14 585
Iasir Journals
 
Ijetcas14 584
Ijetcas14 584Ijetcas14 584
Ijetcas14 584
Iasir Journals
 
Ijetcas14 583
Ijetcas14 583Ijetcas14 583
Ijetcas14 583
Iasir Journals
 

More from Iasir Journals (20)

ijetcas14 650
ijetcas14 650ijetcas14 650
ijetcas14 650
 
Ijetcas14 648
Ijetcas14 648Ijetcas14 648
Ijetcas14 648
 
Ijetcas14 643
Ijetcas14 643Ijetcas14 643
Ijetcas14 643
 
Ijetcas14 641
Ijetcas14 641Ijetcas14 641
Ijetcas14 641
 
Ijetcas14 639
Ijetcas14 639Ijetcas14 639
Ijetcas14 639
 
Ijetcas14 632
Ijetcas14 632Ijetcas14 632
Ijetcas14 632
 
Ijetcas14 624
Ijetcas14 624Ijetcas14 624
Ijetcas14 624
 
Ijetcas14 619
Ijetcas14 619Ijetcas14 619
Ijetcas14 619
 
Ijetcas14 615
Ijetcas14 615Ijetcas14 615
Ijetcas14 615
 
Ijetcas14 608
Ijetcas14 608Ijetcas14 608
Ijetcas14 608
 
Ijetcas14 605
Ijetcas14 605Ijetcas14 605
Ijetcas14 605
 
Ijetcas14 604
Ijetcas14 604Ijetcas14 604
Ijetcas14 604
 
Ijetcas14 598
Ijetcas14 598Ijetcas14 598
Ijetcas14 598
 
Ijetcas14 594
Ijetcas14 594Ijetcas14 594
Ijetcas14 594
 
Ijetcas14 593
Ijetcas14 593Ijetcas14 593
Ijetcas14 593
 
Ijetcas14 591
Ijetcas14 591Ijetcas14 591
Ijetcas14 591
 
Ijetcas14 589
Ijetcas14 589Ijetcas14 589
Ijetcas14 589
 
Ijetcas14 585
Ijetcas14 585Ijetcas14 585
Ijetcas14 585
 
Ijetcas14 584
Ijetcas14 584Ijetcas14 584
Ijetcas14 584
 
Ijetcas14 583
Ijetcas14 583Ijetcas14 583
Ijetcas14 583
 

Recently uploaded

Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
Jisc
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
EverAndrsGuerraGuerr
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
joachimlavalley1
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
MIRIAMSALINAS13
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
EugeneSaldivar
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
Levi Shapiro
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
DhatriParmar
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
SACHIN R KONDAGURI
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
Anna Sz.
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
TechSoup
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
DeeptiGupta154
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
Celine George
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
Delapenabediema
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Thiyagu K
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
Peter Windle
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
Jisc
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
heathfieldcps1
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
Vivekanand Anglo Vedic Academy
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
kaushalkr1407
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
Jean Carlos Nunes Paixão
 

Recently uploaded (20)

Supporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptxSupporting (UKRI) OA monographs at Salford.pptx
Supporting (UKRI) OA monographs at Salford.pptx
 
Thesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.pptThesis Statement for students diagnonsed withADHD.ppt
Thesis Statement for students diagnonsed withADHD.ppt
 
Additional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdfAdditional Benefits for Employee Website.pdf
Additional Benefits for Employee Website.pdf
 
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXXPhrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
Phrasal Verbs.XXXXXXXXXXXXXXXXXXXXXXXXXX
 
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...TESDA TM1 REVIEWER  FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
TESDA TM1 REVIEWER FOR NATIONAL ASSESSMENT WRITTEN AND ORAL QUESTIONS WITH A...
 
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
June 3, 2024 Anti-Semitism Letter Sent to MIT President Kornbluth and MIT Cor...
 
The Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptxThe Accursed House by Émile Gaboriau.pptx
The Accursed House by Émile Gaboriau.pptx
 
"Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe..."Protectable subject matters, Protection in biotechnology, Protection of othe...
"Protectable subject matters, Protection in biotechnology, Protection of othe...
 
Polish students' mobility in the Czech Republic
Polish students' mobility in the Czech RepublicPolish students' mobility in the Czech Republic
Polish students' mobility in the Czech Republic
 
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup   New Member Orientation and Q&A (May 2024).pdfWelcome to TechSoup   New Member Orientation and Q&A (May 2024).pdf
Welcome to TechSoup New Member Orientation and Q&A (May 2024).pdf
 
Overview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with MechanismOverview on Edible Vaccine: Pros & Cons with Mechanism
Overview on Edible Vaccine: Pros & Cons with Mechanism
 
Model Attribute Check Company Auto Property
Model Attribute  Check Company Auto PropertyModel Attribute  Check Company Auto Property
Model Attribute Check Company Auto Property
 
The Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official PublicationThe Challenger.pdf DNHS Official Publication
The Challenger.pdf DNHS Official Publication
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
The approach at University of Liverpool.pptx
The approach at University of Liverpool.pptxThe approach at University of Liverpool.pptx
The approach at University of Liverpool.pptx
 
The basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptxThe basics of sentences session 5pptx.pptx
The basics of sentences session 5pptx.pptx
 
The French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free downloadThe French Revolution Class 9 Study Material pdf free download
The French Revolution Class 9 Study Material pdf free download
 
The Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdfThe Roman Empire A Historical Colossus.pdf
The Roman Empire A Historical Colossus.pdf
 
Lapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdfLapbook sobre os Regimes Totalitários.pdf
Lapbook sobre os Regimes Totalitários.pdf
 

Ijetcas14 390

  • 1. International Association of Scientific Innovation and Research (IASIR) (An Association Unifying the Sciences, Engineering, and Applied Research) International Journal of Emerging Technologies in Computational and Applied Sciences (IJETCAS) www.iasir.net IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 281 ISSN (Print): 2279-0047 ISSN (Online): 2279-0055 An SVM Based Speaker Independent Isolated Malayalam Word Recognition System Thushara P V1 , Gopakumar C2 1 M.Tech Scholar, Department of Electronics, College of Engineering, Chengannur Chengannur, Alappuzha Dist., Kerala-689121, INDIA 2 Associate Professor, Department of Electronics, College of Engineering, Karunagapally Karunagapally, Kollam Dist., Kerala-690523, INDIA Abstract: As technology advances man-machine interaction is becoming an unavoidable activity. So an effective method of communication with machines enhances the quality of life. If it is able to operate a system by simply commanding, then it will be a great blessing to the users. Speech is the most effective mode of communication used by humans. So by introducing voice user interfaces the interaction with the machines can be made more user friendly. This paper implements a speaker independent speech recognition system for limited vocabulary Malayalam Words in Raspberry Pi. Mel Frequency Cepstral Coefficients (MFCC) are the features for classification and this paper proposes Radial Basis Function (RBF) kernel in Support Vector Machine (SVM) classifier gives better accuracy in speech recognition than linear kernel. An overall accuracy of 91.8% is obtained with this work. Keywords: Speech Recognition; Feature Extraction; MFCC; Support Vector Machine; RBF kernel; Linear kernel; Raspberry Pi I. Introduction Automatic speech recognition can be defined as a technology which enables a system to recognize the input speech signals and interpret the meaning, after which the system should be able to generate some control signals [1]. In this paper, the speech recognition system recognizes Malayalam words with limited vocabulary. There are so many challenges in speech processing like acoustic and phonetic variability. Isolated speech input is used since continuous speech is difficult to process. In continuous speech, it is difficult to find the start and end points of words. Within all these considerations, system recognizes words with better accuracy and greater speed. Malayalam is recently being considered among one of the classic languages in India. Speech recognition system for Malayalam language helps people who are not conversant with English and unaware of using computer. Malayalam is one among the 22 languages spoken in India with classical status [2]. Malayalam belongs to the Dravidian family of languages and most of the Malayalam speakers live in the Kerala, one of the southern states of India. There are 37 consonants and 16 vowels in the language. For the proper functioning of the system, there should be distinct pauses between the words i.e. isolated words. Due to the memory constrains in the handheld device, the vocabulary supported by the system is limited i.e. it is a limited vocabulary speaker independent isolated word recognition system. II. Literature Survey Nowadays, innovation in scientific research is focused much more on the interactions between humans and technology and automatic speech recognition is a driving force in this process. Speech recognition technology is changing the way information is accessed, tasks are accomplished and business is done. There are two related speech tasks: speech understanding and speech recognition. Speech understanding is getting the meaning of an utterance such that one can respond properly whether or not one has correctly recognized all of the words. Speech recognition is simply transcribing the speech without necessarily knowing the meaning of the utterance. The two can be combined, but the task described here is purely recognition. Automatic speech recognition (ASR) is the ability of a machine to convert the words that is spoken in to the microphone to recognized words. A. Speech Production and Perception Five different elements are associated with speech production and perception. They are speech formulation, human vocal mechanism, acoustic air, perception of the ear, speech comprehension etc.
  • 2. Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp. 281-285 IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 282 B. Types of ASR Systems ASR can be classified in several ways: speaker dependent or independent, discrete or continuous, and small or large vocabulary. 1. Speaker Independent (SI) systems or Speaker Dependent (SD) system: System can recognize a variety of speakers, without any training. Such systems limit the number of words in a vocabulary. But speaker dependent system can only recognize the speech of users it is trained to understand [3]. 2. Discrete ASR or Continuous ASR: Discrete ASR recognizes isolated utterances. Here the user must speak unnaturally, leaving distinct pauses between each word. In Continuous ASR, the user can speak naturally, with normal conversational pauses, but it is more difficult for the system to detect the word boundaries. 3. Small vocabulary or large vocabulary system: In small vocabulary ASR, all the words in the vocabulary are trained at least once, whereas large vocabulary systems recognize sounds rather than whole words and are able of recognizing words that have never been in the training set. C. Factors Affecting Speech Recognition Performance Different speech recognition systems have different parameters and design methodology according to the application. But in most cases some factors are similar. For instance, Vocabulary size and variability of factors [3].These factors have significant roles for the accuracy of the system. According to how much vocabulary can be recognized, speech recognition can be divided into three different scales vocabulary speech recognition: Small vocabulary speech recognition, Medium vocabulary speech recognition, large vocabulary speech recognition. Small-scale can identify less than 100 vocabulary while medium-scale can identify more than 100 vocabulary and large-scale can identify more than 1000 vocabulary. Other factors that affect performance of speech recognition system, are variability in speakers, environments, transmission channels and microphones. The variability in speakers involves gender, speed of speaking, regional changes in language. Variability in environment means whether it is noisy or clean. The bandwidth of transmission channel also plays an important role in determining the accuracy of the system as number of samples transmitted changes when bandwidth changes. According to the type of microphone and distance of microphone from mouth, reliability of system may changes. And finally the performance of the system will be in the hands of experts behind the work. III. Methodology A speech recognition system is basically a pattern recognition system dedicated to recognize the words spoken into microphone. Proposed automatic speech recognition system (ASR) starts with a feature extraction module where, the input speech waveform is processed to extract the required acoustic feature vectors that are used to characterize the spectral properties of the time varying speech signal. Feature extraction is processing the input speech to extract compact and efficient set of parameters that uniquely represents the speech input .The second stage is the classifier. This stage evaluates the similarities between the input feature vector sequence and trained database to determine which words were most likely spoken. By feature extraction, we calculate Mel Frequency Cepstral Coefficients (MFCCs) [4].We use MFCCs as the feature for classification because it allows better processing of data, high accuracy for clean speech and it approximates the human auditory system's response. The classifier used for training and testing is Support Vector Machine (SVM). The block diagram of the proposed speech recognition system is shown in figure 1: Figure 1. Block Diagram of Proposed System A. Voice Activity Detection Voice Activity Detection is a preprocessing technique which detects the start and end point of a word. In the proposed method we use three different features per frame. The first feature is the widely used short-term
  • 3. Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp. 281-285 IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 283 energy (E) [5]. Energy is the most common feature for speech/silence detection. Another feature selected for separating speech and silence is zero crossing rate. It is observed that silence parts have high zero crossing rate whereas in speech, zero crossing is low. Besides these two features, it was observed that the most dominant frequency component of the speech frame spectrum can be very useful in discriminating between speech and silence frames. B. Mel Frequency Cepstral Coefficients (MFCCs) Figure 2. Calculating MFCCs C. Support Vector Machine (SVM) Classifier Pattern recognition algorithm used for the proposed system is Support Vector Machine [6]. MFCCs of each of the words is given to the classifier stage for training using SVM learning algorithm. Here 100 samples of each word from different speakers is used to train the classifier and testing is conducted with another different speakers in order to make the system speaker independent. Here we optimize the parameters of SVM by making a comparison between the kernel functions. In this work, we use simple linear kernel and radial basis function kernel for comparison. IV. Results and Analysis A. Database Used For conducting this experiment we choose six Malayalam words. The samples stored in the database are recorded by using a high quality studio-recording microphone at a sampling rate of 16 KHz. Malayalam numerals from one to six is chosen to create the database. Twelve speakers are selected to record the words. Each speaker utters six words with thirty samples each. We have used six male speakers and six female speakers for creating the database. Thus the database consists of a total of 2160 utterances of the spoken words. Speech database is shown in Table 1. Table 1. Speech Database NO WORDS 1 ഒന്ന് (onnu) 2 രണ്ട് (randu) 3 മൂന്ന് (moonnu) 4 നാല് (naalu) 5 അഞ്ച് (anj) 6 ആറു (aaru) B. Voice Activity Detection This important front end processing detects the start and end point of words. The result obtained by voice activity detection of word ‘onnu’ is plotted in Figure 3. C. Simulation Parameters of Feature Extraction Block  Sampling frequency = 16000Hz  Frame duration = 10ms  Frame overlapping = 5ms  Number of DFT points = 256  Number of Mel filters = 24  Number of MFCC coefficients = 19
  • 4. Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp. 281-285 IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 284 Figure 3. Voice Activity Detection of word ‘ONNU’ D. Recognition Results There are 100 input samples for training and testing is conducted with samples of different speakers for each word. Training is performed with maximum iterations of 100. In SVM, it is possible to use RBF kernel and linear kernel. In this paper, a comparison of accuracy in test results is made for both the cases when RBF and Linear kernel are used. The accuracy obtained in both the experiments are shown in the table II: Table 2. Accuracy Test Results Words Accuracy (%) Linear kernel RBF kernel onnu 75 91.6 randu 91.6 91.6 moonnu 84 100 naalu 50 100 anj 68 84 aaru 84 84 Average 75.5 91.8 E. Comparison of Results 0 20 40 60 80 100 accuracy Words COMPARISON BETWEEN KERNELS IN SVM linearkernel RBF kernel Figure 4. Comparison of accuracies between linear and RBF kernel The experimental results prove that training with RBF kernel gives better accuracy in recognition than with linear kernel.System trained with linear kernel has got an average accuracy of 75.5% whereas for RBF kernel it is 91.8%.
  • 5. Thushara P V et al., International Journal of Emerging Technologies in Computational and Applied Sciences, 8(3), March-May, 2043, pp. 281-285 IJETCAS 14-390; © 2014, IJETCAS All Rights Reserved Page 285 F. Hardware Implementation Results The system consists of six Malayalam words and five different speakers for training and 5 speakers for testing. The experiment was conducted in Raspberry Pi and obtained 91.8% accuracy. Figure 5. Hardware platform: Raspberry Pi V. Conclusion The paper proposes a speech recognition system using MFCCs and support vector machine. The proposed method improves the accuracy of recognition compared to existing methods such as Wavelet coefficients and ANN [7], MFCCs and k-means clustering, Formant frequencies and ANN [8] etc. Misclassification rate in support vector machine is analyzed by comparing the kernel functions for mapping input space to feature space. The generalization capability of SVM classifier improves when we are using RBF kernel compared to linear kernel. The system implemented in Raspberry Pi has got an accuracy of 91.8%. References [1] Shivanker Dev Dhingra , Geeta Nijhawan , Poonam Pandit , “Isolated speech recognition using mfcc and dtw”,International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 2, Issue 8, August 2013 [2] Sreejith C, “Isolated Spoken Word Identification in Malayalam using Mel-frequency Cepstral Coefficients and K-means clustering”, International Journal of Science and Research, Vol. 1, December 2012, pp.163-167 [3] Nitin Trivedi et al, “Speech Recognition by Wavelet Analysis”, International Journal of Computer Application, Volume 15, February 2011, pp.27-32 [4] Daniel Jurafsky & James H. Martin, “Speech and Language Processing: An introduction to natural language processing, computational linguistics, and speech recognition”, Pearson Education, 2007, pp.327-335 [5] M. H. Moattar, M. M. Homayounpour, “A simple but efficient real-time voice activity detection Algorithm”, 17th European Conference on Signal Processing, August 24-28, 2009, pp.2549-2553 [6] M.A.Anusuya, S.K.Katti, “Classification Techniques used in Speech Recognition Applications: A Review”, International Journal of computer applications, Vol 2, AUGUST 2011, pp.910-954 [7] Sonia Sunny et al, “Development of a Speech Recognition System for Speaker Independent Isolated Malayalam Words”, International Journal of Computer Science & Engineering Technology, Vol. 3, April 2012, pp.69-75 [8] Dipanwita Paul, “Automated speech recognition of isolated words using neural networks”, International Journal of Engineering Science and Technology, Vol. 3, June 2011