Speech disorders are very complicated in individuals suffering from Apraxia of Speech-AOS. In this paper ,
the pathological cases of speech disabled children affected with AOS are analyzed. The speech signal
samples of children of age between three to eight years are considered for the present study. These speech
signals are digitized and enhanced using the using the Speech Pause Index, Jitter,Skew ,Kurtosis analysis
This analysis is conducted on speech data samples which are concerned with both place of articulation and
manner of articulation. The speech disability of pathological subjects was estimated using results of above
analysis.
Kannada Phonemes to Speech Dictionary: Statistical ApproachIJERA Editor
Â
The input or output of a natural Language processing system can be either written text or speech. To process written text we need to analyze: lexical, syntactic, semantic knowledge about the language, discourse information, real world knowledge to process spoken language, we need to analyze everything required to process written text, along with the challenges of speech recognition and speech synthesis. This paper describes how articulatory phonetics of Kannada is used to generate the phoneme to speech dictionary for Kannada; the statistical computational approach is used to map the elements which are taken from input query or documents. The articulatory phonetics is the place of articulation of a consonant. It is the point of contact where an obstruction occurs in the vocal tract between an articulatory gesture, an active articulator, typically some part of the tongue, and a passive location, typically some part of the roof of the mouth. Along with the manner of articulation and the phonation, this gives the consonant its distinctive sound. The results are presented for the same.
Transliteration by orthography or phonology for hindi and marathi to english ...ijnlc
Â
e-Governance and Web based online commercial multilingual applications has given utmost importance to
the task of translation and transliteration. The Named Entities and Technical Terms occur in the source
language of translation are called out of vocabulary words as they are not available in the multilingual
corpus or dictionary used to support translation process. These Named Entities and Technical Terms need
to be transliterated from source language to target language without losing their phonetic properties. The
fundamental problem in India is that there is no set of rules available to write the spellings in English for
Indian languages according to the linguistics. People are writing different spellings for the same name at
different places. This fact certainly affects the Top-1 accuracy of the transliteration and in turn the
translation process. Major issue noticed by us is the transliteration of named entities consisting three
syllables or three phonetic units in Hindi and Marathi languages where people use mixed approach to
write the spelling either by orthographical approach or by phonological approach. In this paper authors
have provided their opinion through experimentation about appropriateness of either approach.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
Â
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Â
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
Myanmar named entity corpus and its use in syllable-based neural named entity...IJECEIAES
Â
Myanmar language is a low-resource language and this is one of the main reasons why Myanmar Natural Language Processing lagged behind compared to other languages. Currently, there is no publicly available named entity corpus for Myanmar language. As part of this work, a very first manually annotated Named Entity tagged corpus for Myanmar language was developed and proposed to support the evaluation of named entity extraction. At present, our named entity corpus contains approximately 170,000 name entities and 60,000 sentences. This work also contributes the first evaluation of various deep neural network architectures on Myanmar Named Entity Recognition. Experimental results of the 10-fold cross validation revealed that syllable-based neural sequence models without additional feature engineering can give better results compared to baseline CRF model. This work also aims to discover the effectiveness of neural network approaches to textual processing for Myanmar language as well as to promote future research works on this understudied language.
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEijnlc
Â
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and is monosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language
Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech
(POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
Â
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
Â
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Kannada Phonemes to Speech Dictionary: Statistical ApproachIJERA Editor
Â
The input or output of a natural Language processing system can be either written text or speech. To process written text we need to analyze: lexical, syntactic, semantic knowledge about the language, discourse information, real world knowledge to process spoken language, we need to analyze everything required to process written text, along with the challenges of speech recognition and speech synthesis. This paper describes how articulatory phonetics of Kannada is used to generate the phoneme to speech dictionary for Kannada; the statistical computational approach is used to map the elements which are taken from input query or documents. The articulatory phonetics is the place of articulation of a consonant. It is the point of contact where an obstruction occurs in the vocal tract between an articulatory gesture, an active articulator, typically some part of the tongue, and a passive location, typically some part of the roof of the mouth. Along with the manner of articulation and the phonation, this gives the consonant its distinctive sound. The results are presented for the same.
Transliteration by orthography or phonology for hindi and marathi to english ...ijnlc
Â
e-Governance and Web based online commercial multilingual applications has given utmost importance to
the task of translation and transliteration. The Named Entities and Technical Terms occur in the source
language of translation are called out of vocabulary words as they are not available in the multilingual
corpus or dictionary used to support translation process. These Named Entities and Technical Terms need
to be transliterated from source language to target language without losing their phonetic properties. The
fundamental problem in India is that there is no set of rules available to write the spellings in English for
Indian languages according to the linguistics. People are writing different spellings for the same name at
different places. This fact certainly affects the Top-1 accuracy of the transliteration and in turn the
translation process. Major issue noticed by us is the transliteration of named entities consisting three
syllables or three phonetic units in Hindi and Marathi languages where people use mixed approach to
write the spelling either by orthographical approach or by phonological approach. In this paper authors
have provided their opinion through experimentation about appropriateness of either approach.
EXTRACTING LINGUISTIC SPEECH PATTERNS OF JAPANESE FICTIONAL CHARACTERS USING ...kevig
Â
This study extracted and analyzed the linguistic speech patterns that characterize Japanese anime or game characters. Conventional morphological analyzers, such as MeCab, segment words with high performance, but they are unable to segment broken expressions or utterance endings that are not listed in the dictionary, which often appears in lines of anime or game characters. To overcome this challenge, we propose segmenting lines of Japanese anime or game characters using subword units that were proposed mainly for deep learning, and extracting frequently occurring strings to obtain expressions that characterize their utterances. We analyzed the subword units weighted by TF/IDF according to gender, age, and each anime character and show that they are linguistic speech patterns that are specific for each feature. Additionally, a classification experiment shows that the model with subword units outperformed that with the conventional method.
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONkevig
Â
Phonetic typing using the English alphabet has become widely popular nowadays for social media and chat services. As a result, a text containing various English and Bangla words and phrases has become increasingly common. Existing transliteration tools display poor performance for such texts. This paper proposes a robust Three-stage Hybrid Transliteration (THT) framework that can transliterate both English words and phonetic typed Bangla words satisfactorily. This is achieved by adopting a hybrid approach of dictionary-based and rule-based techniques. Experimental results confirm superiority of THT as it significantly outperforms the benchmark transliteration tool.
Myanmar named entity corpus and its use in syllable-based neural named entity...IJECEIAES
Â
Myanmar language is a low-resource language and this is one of the main reasons why Myanmar Natural Language Processing lagged behind compared to other languages. Currently, there is no publicly available named entity corpus for Myanmar language. As part of this work, a very first manually annotated Named Entity tagged corpus for Myanmar language was developed and proposed to support the evaluation of named entity extraction. At present, our named entity corpus contains approximately 170,000 name entities and 60,000 sentences. This work also contributes the first evaluation of various deep neural network architectures on Myanmar Named Entity Recognition. Experimental results of the 10-fold cross validation revealed that syllable-based neural sequence models without additional feature engineering can give better results compared to baseline CRF model. This work also aims to discover the effectiveness of neural network approaches to textual processing for Myanmar language as well as to promote future research works on this understudied language.
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEijnlc
Â
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and is monosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language
Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech
(POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
MORPHOLOGICAL ANALYZER USING THE BILSTM MODEL ONLY FOR JAPANESE HIRAGANA SENT...kevig
Â
This study proposes a method to develop neural models of the morphological analyzer for Japanese Hiragana sentences using the Bi-LSTM CRF model. Morphological analysis is a technique that divides text data into words and assigns information such as parts of speech. In Japanese natural language processing systems, this technique plays an essential role in downstream applications because the Japanese language does not have word delimiters between words. Hiragana is a type of Japanese phonogramic characters, which is used for texts for children or people who cannot read Chinese characters. Morphological analysis of Hiragana sentences is more difficult than that of ordinary Japanese sentences because there is less information for dividing. For morphological analysis of Hiragana sentences, we demonstrated the effectiveness of fine-tuning using a model based on ordinary Japanese text and examined the influence of training data on texts of various genres.
INTEGRATION OF PHONOTACTIC FEATURES FOR LANGUAGE IDENTIFICATION ON CODE-SWITC...kevig
Â
In this paper, phoneme sequences are used as language information to perform code-switched language
identification (LID). With the one-pass recognition system, the spoken sounds are converted into
phonetically arranged sequences of sounds. The acoustic models are robust enough to handle multiple
languages when emulating multiple hidden Markov models (HMMs). To determine the phoneme similarity
among our target languages, we reported two methods of phoneme mapping. Statistical phoneme-based
bigram language models (LM) are integrated into speech decoding to eliminate possible phone
mismatches. The supervised support vector machine (SVM) is used to learn to recognize the phonetic
information of mixed-language speech based on recognized phone sequences. As the back-end decision is
taken by an SVM, the likelihood scores of segments with monolingual phone occurrence are used to
classify language identity. The speech corpus was tested on Sepedi and English languages that are often
mixed. Our system is evaluated by measuring both the ASR performance and the LID performance
separately. The systems have obtained a promising ASR accuracy with data-driven phone merging
approach modelled using 16 Gaussian mixtures per state. In code-switched speech and monolingual
speech segments respectively, the proposed systems achieved an acceptable ASR and LID accuracy.
Accents of English have been investigated for many years both from the perspective of native and non-native speakers of the language. Various research results imply that non-native speakers of English language produce certain speech characteristics which are uncommon in native speakersâ speech. This is because non-native speakers do not produce the same tongue movement as native speakers. This paper presents an isolated English word recognition system devised with the speech of local Bangladeshi people, who are also non-native speakers of English language. Here, we have also noticed a different speech characteristic which is not available within the speech of native English speakers. Two acoustic features, âpitchâ and âformantsâ have been utilized to develop the system. The system is speaker-independent and stands on Template based approach. The recognition method applied here is very simple and the recognition accuracy is also very satisfactory.
Automatic Speech Recognition of Malayalam Language Nasal Class PhonemesEditor IJCATR
Â
Speech recognition applications are becoming common and useful in this day and age as many of the modern devices are designed and produced user-friendly for the convenience of general public. Speaking/communicating directly with the machine to achieve desired objectives make usage of modern devices easier and convenient. Although may interactive software applications are available, the use of these applications is limited due to language barriers. Hence development of speech recognition systems in local languages will help anyone to make use of this technological advancement. In this paper we discuss the results of the Nasal phonetic class wise speech recognition performance of Malayalam language
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEkevig
Â
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and ismonosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological
Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech (POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
Emotional telugu speech signals classification based on k nn classifiereSAT Publishing House
Â
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
Â
In this work a new Bangla speech corpus along with proper transcriptions has been developed; also
various acoustic feature extraction methods have been investigated using Long Short-Term Memory
(LSTM) neural network to find their effective integration into a state-of-the-art Bangla speech recognition
system. The acoustic features are usually a sequence of representative vectors that are extracted from
speech signals and the classes are either words or sub word units such as phonemes. The most commonly
used feature extraction method, known as linear predictive coding (LPC), has been used first in this work.
Then the other two popular methods, namely, the Mel frequency cepstral coefficients (MFCC) and
perceptual linear prediction (PLP) have also been applied. These methods are based on the models of the
human auditory system. A detailed review of the implementation of these methods have been described
first. Then the steps of the implementation have been elaborated for the development of an automatic
speech recognition system (ASR) for Bangla speech.
Regulatory and Linguistic Analysis For a New Proprietary Drug NameBrand Acumen, LLC
Â
Brand Acumen's Regulatory and Linguistic Analysis For a New Proprietary Drug Name. A look into what goes into the creation of a new pharmaceutical name. A Case Study: Janage
Speech is the important mode of communication and is the current research
topic. The concentration is mostly focused on synthesis and analyzing part.Apart of
synthesizing, text to speech system is developed.Speech synthesis is an artificial production
of human speech.A text to speech system (TTS) is to convert an arbitrary text into speech.In
India different languages have been spoken each being the mother tongue of tens of millions
of people.In this paper,the text to speech system is primarily developed for Telugu, a
Dravidian language predominantly spoken in Indian state of Andhra Pradesh.The
important qualities expected from this system are naturalness and intelligibility.Telugu TTS
can be developed using other synthesis methods like articulatory synthesis,formant synthesis
and concatenative synthesis.This paper describes a development of a Telugu text to speech
system using concatenative synthesis method on mobile based system OMAP 3530 (ARM
Cortex A-8 core) in Linux.
Segmentation Words for Speech Synthesis in Persian Language Based On Silencepaperpublications3
Â
Abstract: In speech synthesis in text to speech systems, the words usually break to different parts and use from recorded sound of each part for play words. This paper use silent in word's pronunciation for better quality of speech. Most algorithms divide words to syllable and some of them divide words to phoneme, but This paper benefit from silent in intonation and divide words at silent region and then set equivalent sound of each parts whereupon joining the parts is trusty and speech quality being more smooth . this paper concern Persian language but extendable to another language. This method has been tested with MOS test and intelligibility, naturalness and fluidity are better.
Keywords:TTS, SBS, Sillable, Diphone.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
Â
ABSTRACT
Natural Language processing is an interdisciplinary branch of linguistic and computer science studied under the Artificial Intelligence (AI) that gave birth to an allied area called
âComputational Linguisticâ which focuses on processing of natural languages on computational devices. A natural language consists of a large number of sentences which are linguistic units involving one or more words linked together in accordance with a set of predefined rules called grammar. Grammar checking is the task of validating sentences syntactically and is a prominent tool within language engineering. Our review draws on the recent development of various grammar checkers to look at past, present and the future in a new light. Our review covers grammar checkers of many languages with the aim of seeking their approaches, methodologies for developing new tool and system as a whole. The survey concludes with the discussion of various features included in existing grammar checkers of foreign languages as well as a few Indian Languages.
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...IJECEIAES
Â
In this paper, I present high-level speaker specific feature extraction considering intonation, linguistics rhythm, linguistics stress, prosodic features directly from speech signals. I assume that the rhythm is related to language units such as syllables and appears as changes in measurable parameters such as fundamental frequency ( ), duration, and energy. In this work, the syllable type features are selected as the basic unit for expressing the prosodic features. The approximate segmentation of continuous speech to syllable units is achieved by automatically locating the vowel starting point. The knowledge of high-level speakerâs specific speakers is used as a reference for extracting the prosodic features of the speech signal. High-level speaker-specific features extracted using this method may be useful in applications such as speaker recognition where explicit phoneme/syllable boundaries are not readily available. The efficiency of the particular characteristics of the specific features used for automatic speaker recognition was evaluated on TIMIT and HTIMIT corpora initially sampled in the TIMIT at 16 kHz to 8 kHz. In summary, the experiment, the basic discriminating system, and the HMM system are formed on TIMIT corpus with a set of 48 phonemes. Proposed ASR system shows 1.99%, 2.10%, 2.16% and 2.19 % of efficiency improvements compared to traditional ASR system for and of 16KHz TIMIT utterances.
COMPOSITE TEXTURE SHAPE CLASSIFICATION BASED ON MORPHOLOGICAL SKELETON AND RE...sipij
Â
After several decades of research, the development of an effective feature extraction method for texture
classification is still an ongoing effort. Therefore , several techniques have been proposed to resolve such
problems. In this paper a novel composite texture classification method based on innovative pre-processing
techniques, skeletonization and Regional moments (RM) is proposed. This proposed texture classification
approach, takes into account the ambiguity brought in by noise and the different caption and digitization
processes. To offer better classification rate, innovative pre-processing methods are applied on various
texture images first. Pre-processing mechanisms describe various methods of converting a grey level image
into binary image with minimal consideration of the noise model. Then shape features are evaluated using
RM on the proposed Morphological Skeleton (MS) method by suitable numerical characterization
measures for a precise classification. This texture classification study using MS and RM has given a good
performance. Good classification result is achieved from a single region moment RM10 while others failed
in classification.
Advances in automatic tuberculosis detection in chest x ray imagessipij
Â
Tuberculosis (TB) is very dangerous and rapidly spread disease in the world. In the investigating cases for
suspected tuberculosis (TB), chest radiography is not only the key techniques of diagnosis based on the
medical imaging but also the diagnostic radiology. So, Computer aided diagnosis (CAD) has been popular
and many researchers are interested in this research areas and different approaches have been proposed
for the TB detection and lung decease classification. In this paper, the medical background history of TB
decease in chest X-rays and a survey of the various approaches in TB detection and classification are
presented. The literature in the related methods is surveyed papers in this research area until now 2014.
S IGNAL A ND I MAGE P ROCESSING OF O PTICAL C OHERENCE T OMOGRAPHY AT 1310 NM...sipij
Â
OCT is a recently developed optical interferometric
technique for non-invasive diagnostic medical imag
ing
in vivo; the most sensitive optical imaging modalit
y.OCT finds its application in ophthalmology, blood
flow
estimation and cancer diagnosis along with many non
biomedical applications. The main advantage of
OCT is its high resolution which is in
Îź
m range and depth of penetration in mm range. Unlik
e other
techniques like X rays and CT scan, OCT does not co
mprise any x ray source and therefore no radiations
are involved. This research work discusses the basi
cs of spectral domain OCT (SD-OCT), experimental
setup, data acquisition and signal processing invol
ved in OCT systems. Simulation of OCT involving
modelling and signal processing, carried out on Lab
VIEW platform has been discussed. Using the
experimental setup, some of the non biomedical samp
les have been scanned. The signal processing and
image processing of the scanned data was carried ou
t in MATLAB and Lab VIEW, some of the results thus
obtained have been discussed in the end
Accents of English have been investigated for many years both from the perspective of native and non-native speakers of the language. Various research results imply that non-native speakers of English language produce certain speech characteristics which are uncommon in native speakersâ speech. This is because non-native speakers do not produce the same tongue movement as native speakers. This paper presents an isolated English word recognition system devised with the speech of local Bangladeshi people, who are also non-native speakers of English language. Here, we have also noticed a different speech characteristic which is not available within the speech of native English speakers. Two acoustic features, âpitchâ and âformantsâ have been utilized to develop the system. The system is speaker-independent and stands on Template based approach. The recognition method applied here is very simple and the recognition accuracy is also very satisfactory.
Automatic Speech Recognition of Malayalam Language Nasal Class PhonemesEditor IJCATR
Â
Speech recognition applications are becoming common and useful in this day and age as many of the modern devices are designed and produced user-friendly for the convenience of general public. Speaking/communicating directly with the machine to achieve desired objectives make usage of modern devices easier and convenient. Although may interactive software applications are available, the use of these applications is limited due to language barriers. Hence development of speech recognition systems in local languages will help anyone to make use of this technological advancement. In this paper we discuss the results of the Nasal phonetic class wise speech recognition performance of Malayalam language
ADVANCEMENTS ON NLP APPLICATIONS FOR MANIPURI LANGUAGEkevig
Â
Manipuri is both a minority and morphologically rich language with genetic features similar to Tibeto Burman languages. It has Subject-Object-Verb (SOV) order, agglutinative verb morphology and ismonosyllabic. Morphology and syntax are not clearly distinguished in this language. Natural Language Processing (NLP) is a useful research field of computer science that deals with processing of a large amount of natural language corpus. The NLP applications encompass E-Dictionary, Morphological
Analyzer, Reduplicated Multi-Word Expression (RMWE), Named Entity Recognition (NER), Part of Speech (POS) Tagging, Machine Translation (MT), Word Net, Word Sense Disambiguation (WSD) etc. In this paper, we present a study on the advancements in NLP applications for Manipuri language, at the same time presenting a comparison table of the approaches and techniques adopted and the results obtained of each of the applications followed by a detail discussion of each work.
Emotional telugu speech signals classification based on k nn classifiereSAT Publishing House
Â
IJRET : International Journal of Research in Engineering and Technology is an international peer reviewed, online journal published by eSAT Publishing House for the enhancement of research in various disciplines of Engineering and Technology. The aim and scope of the journal is to provide an academic medium and an important reference for the advancement and dissemination of research results that support high-level learning, teaching and research in the fields of Engineering and Technology. We bring together Scientists, Academician, Field Engineers, Scholars and Students of related fields of Engineering and Technology.
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
Â
In this work a new Bangla speech corpus along with proper transcriptions has been developed; also
various acoustic feature extraction methods have been investigated using Long Short-Term Memory
(LSTM) neural network to find their effective integration into a state-of-the-art Bangla speech recognition
system. The acoustic features are usually a sequence of representative vectors that are extracted from
speech signals and the classes are either words or sub word units such as phonemes. The most commonly
used feature extraction method, known as linear predictive coding (LPC), has been used first in this work.
Then the other two popular methods, namely, the Mel frequency cepstral coefficients (MFCC) and
perceptual linear prediction (PLP) have also been applied. These methods are based on the models of the
human auditory system. A detailed review of the implementation of these methods have been described
first. Then the steps of the implementation have been elaborated for the development of an automatic
speech recognition system (ASR) for Bangla speech.
Regulatory and Linguistic Analysis For a New Proprietary Drug NameBrand Acumen, LLC
Â
Brand Acumen's Regulatory and Linguistic Analysis For a New Proprietary Drug Name. A look into what goes into the creation of a new pharmaceutical name. A Case Study: Janage
Speech is the important mode of communication and is the current research
topic. The concentration is mostly focused on synthesis and analyzing part.Apart of
synthesizing, text to speech system is developed.Speech synthesis is an artificial production
of human speech.A text to speech system (TTS) is to convert an arbitrary text into speech.In
India different languages have been spoken each being the mother tongue of tens of millions
of people.In this paper,the text to speech system is primarily developed for Telugu, a
Dravidian language predominantly spoken in Indian state of Andhra Pradesh.The
important qualities expected from this system are naturalness and intelligibility.Telugu TTS
can be developed using other synthesis methods like articulatory synthesis,formant synthesis
and concatenative synthesis.This paper describes a development of a Telugu text to speech
system using concatenative synthesis method on mobile based system OMAP 3530 (ARM
Cortex A-8 core) in Linux.
Segmentation Words for Speech Synthesis in Persian Language Based On Silencepaperpublications3
Â
Abstract: In speech synthesis in text to speech systems, the words usually break to different parts and use from recorded sound of each part for play words. This paper use silent in word's pronunciation for better quality of speech. Most algorithms divide words to syllable and some of them divide words to phoneme, but This paper benefit from silent in intonation and divide words at silent region and then set equivalent sound of each parts whereupon joining the parts is trusty and speech quality being more smooth . this paper concern Persian language but extendable to another language. This method has been tested with MOS test and intelligibility, naturalness and fluidity are better.
Keywords:TTS, SBS, Sillable, Diphone.
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGEScsandit
Â
ABSTRACT
Natural Language processing is an interdisciplinary branch of linguistic and computer science studied under the Artificial Intelligence (AI) that gave birth to an allied area called
âComputational Linguisticâ which focuses on processing of natural languages on computational devices. A natural language consists of a large number of sentences which are linguistic units involving one or more words linked together in accordance with a set of predefined rules called grammar. Grammar checking is the task of validating sentences syntactically and is a prominent tool within language engineering. Our review draws on the recent development of various grammar checkers to look at past, present and the future in a new light. Our review covers grammar checkers of many languages with the aim of seeking their approaches, methodologies for developing new tool and system as a whole. The survey concludes with the discussion of various features included in existing grammar checkers of foreign languages as well as a few Indian Languages.
High Level Speaker Specific Features as an Efficiency Enhancing Parameters in...IJECEIAES
Â
In this paper, I present high-level speaker specific feature extraction considering intonation, linguistics rhythm, linguistics stress, prosodic features directly from speech signals. I assume that the rhythm is related to language units such as syllables and appears as changes in measurable parameters such as fundamental frequency ( ), duration, and energy. In this work, the syllable type features are selected as the basic unit for expressing the prosodic features. The approximate segmentation of continuous speech to syllable units is achieved by automatically locating the vowel starting point. The knowledge of high-level speakerâs specific speakers is used as a reference for extracting the prosodic features of the speech signal. High-level speaker-specific features extracted using this method may be useful in applications such as speaker recognition where explicit phoneme/syllable boundaries are not readily available. The efficiency of the particular characteristics of the specific features used for automatic speaker recognition was evaluated on TIMIT and HTIMIT corpora initially sampled in the TIMIT at 16 kHz to 8 kHz. In summary, the experiment, the basic discriminating system, and the HMM system are formed on TIMIT corpus with a set of 48 phonemes. Proposed ASR system shows 1.99%, 2.10%, 2.16% and 2.19 % of efficiency improvements compared to traditional ASR system for and of 16KHz TIMIT utterances.
COMPOSITE TEXTURE SHAPE CLASSIFICATION BASED ON MORPHOLOGICAL SKELETON AND RE...sipij
Â
After several decades of research, the development of an effective feature extraction method for texture
classification is still an ongoing effort. Therefore , several techniques have been proposed to resolve such
problems. In this paper a novel composite texture classification method based on innovative pre-processing
techniques, skeletonization and Regional moments (RM) is proposed. This proposed texture classification
approach, takes into account the ambiguity brought in by noise and the different caption and digitization
processes. To offer better classification rate, innovative pre-processing methods are applied on various
texture images first. Pre-processing mechanisms describe various methods of converting a grey level image
into binary image with minimal consideration of the noise model. Then shape features are evaluated using
RM on the proposed Morphological Skeleton (MS) method by suitable numerical characterization
measures for a precise classification. This texture classification study using MS and RM has given a good
performance. Good classification result is achieved from a single region moment RM10 while others failed
in classification.
Advances in automatic tuberculosis detection in chest x ray imagessipij
Â
Tuberculosis (TB) is very dangerous and rapidly spread disease in the world. In the investigating cases for
suspected tuberculosis (TB), chest radiography is not only the key techniques of diagnosis based on the
medical imaging but also the diagnostic radiology. So, Computer aided diagnosis (CAD) has been popular
and many researchers are interested in this research areas and different approaches have been proposed
for the TB detection and lung decease classification. In this paper, the medical background history of TB
decease in chest X-rays and a survey of the various approaches in TB detection and classification are
presented. The literature in the related methods is surveyed papers in this research area until now 2014.
S IGNAL A ND I MAGE P ROCESSING OF O PTICAL C OHERENCE T OMOGRAPHY AT 1310 NM...sipij
Â
OCT is a recently developed optical interferometric
technique for non-invasive diagnostic medical imag
ing
in vivo; the most sensitive optical imaging modalit
y.OCT finds its application in ophthalmology, blood
flow
estimation and cancer diagnosis along with many non
biomedical applications. The main advantage of
OCT is its high resolution which is in
Îź
m range and depth of penetration in mm range. Unlik
e other
techniques like X rays and CT scan, OCT does not co
mprise any x ray source and therefore no radiations
are involved. This research work discusses the basi
cs of spectral domain OCT (SD-OCT), experimental
setup, data acquisition and signal processing invol
ved in OCT systems. Simulation of OCT involving
modelling and signal processing, carried out on Lab
VIEW platform has been discussed. Using the
experimental setup, some of the non biomedical samp
les have been scanned. The signal processing and
image processing of the scanned data was carried ou
t in MATLAB and Lab VIEW, some of the results thus
obtained have been discussed in the end
ALEXANDER FRACTIONAL INTEGRAL FILTERING OF WAVELET COEFFICIENTS FOR IMAGE DEN...sipij
Â
The present paper, proposes an efficient denoising algorithm which works well for images corrupted with
Gaussian and speckle noise. The denoising algorithm utilizes the alexander fractional integral filter which
works by the construction of fractional masks window computed using alexander polynomial. Prior to the
application of the designed filter, the corrupted image is decomposed using symlet wavelet from which only
the horizontal, vertical and diagonal components are denoised using the alexander integral filter.
Significant increase in the reconstruction quality was noticed when the approach was applied on the
wavelet decomposed image rather than applying it directly on the noisy image. Quantitatively the results
are evaluated using the peak signal to noise ratio (PSNR) which was 30.8059 on an average for images
corrupted with Gaussian noise and 36.52 for images corrupted with speckle noise, which clearly
outperforms the existing methods.
A HYBRID FILTERING TECHNIQUE FOR ELIMINATING UNIFORM NOISE AND IMPULSE NOIS...sipij
Â
A new hybrid filtering technique is proposed to improving denoising process on digital images.
This technique is performed in two steps. In the first step, uniform noise and impulse noise is
eliminated using decision based algorithm (DBA). Image denoising process is further improved
by an appropriately combining DBA with Adaptive Neuro Fuzzy Inference System (ANFIS) at
the removal of uniform noise and impulse noise on the digital images. Three well known images
are selected for training and the internal parameters of the neuro-fuzzy network are adaptively
optimized by training. This technique offers excellent line, edge, and fine detail preservation
performance while, at the same time, effectively denoising digital images. Extensive simulation
results were realized for ANFIS network and different filters are compared. Results show that
the proposed filter is superior performance in terms of image denoising and edges and fine
details preservation properties.
ONTOLOGICAL MODEL FOR CHARACTER RECOGNITION BASED ON SPATIAL RELATIONSsipij
Â
In this paper, we present a set of spatial relations between concepts describing an ontological model for a
new process of character recognition. Our main idea is based on the construction of the domain ontology
modelling the Latin script. This ontology is composed by a set of concepts and a set of relations. The
concepts represent the graphemes extracted by segmenting the manipulated document and the relations are
of two types, is-a relations and spatial relations. In this paper we are interested by description of second
type of relations and their implementation by java code.
Analog signal processing approach for coarse and fine depth estimationsipij
Â
Imaging and Image sensors is a field that is continuously evolving. There are new products coming into the
market every day. Some of these have very severe Size, Weight and Power constraints whereas other
devices have to handle very high computational loads. Some require both these conditions to be met
simultaneously. Current imaging architectures and digital image processing solutions will not be able to
meet these ever increasing demands. There is a need to develop novel imaging architectures and image
processing solutions to address these requirements. In this work we propose analog signal processing as a
solution to this problem. The analog processor is not suggested as a replacement to a digital processor but
it will be used as an augmentation device which works in parallel with the digital processor, making the
system faster and more efficient. In order to show the merits of analog processing two stereo
correspondence algorithms are implemented. We propose novel modifications to the algorithms and new
imaging architectures which, significantly reduces the computation time
Image compression using embedded zero tree waveletsipij
Â
Compressing an image is significantly different than compressing raw binary data. compressing images is
used by this different compression algorithm. Wavelet transforms used in Image compression methods to
provide high compression rates while maintaining good image quality. Discrete Wavelet Transform (DWT)
is one of the most common methods used in signal and image compression .It is very powerful compared to
other transform because its ability to represent any type of signals both in time and frequency domain
simultaneously. In this paper, we will moot the use of Wavelet Based Image compression algorithm-
Embedded Zerotree Wavelet (EZW). We will obtain a bit stream with increasing accuracy from ezw
algorithm because of basing on progressive encoding to compress an image into . All the numerical results
were done by using matlab coding and the numerical analysis of this algorithm is carried out by sizing
Peak Signal to Noise Ratio (PSNR) and Compression Ratio (CR) for standard Lena Image .Experimental
results beam that the method is fast, robust and efficient enough to implement it in still and complex images
with significant image compression.
EEG S IGNAL Q UANTIFICATION B ASED ON M ODUL L EVELS sipij
Â
This article proposes a contribution to quantify EE
G signals outline. This technique uses two tools fo
r EEG
signal characteristics extraction. Our tests were r
ealized on the basis of 32 canals EEG canals using
Neuroscan software. EEG example demonstration is re
ferenced CZ and is sampled at 1000HZ. The
principal aim of this technique is to reduce the im
portant volume of EEG signal data Without losing an
y
information. EEG signals are quantified on the basi
s of a whole predefined levels The obtained results
show that an EEG alignment can be posted in a quant
ified form.
SkinCure: An Innovative Smart Phone Based Application to Assist in Melanoma E...sipij
Â
Melanoma spreads through metastasis, and therefore it has been proven to be very fatal. Statistical
evidence has revealed that the majority of deaths resulting from skin cancer are as a result of melanoma.
Further investigations have shown that the survival rates in patients depend on the stage of the infection;
early detection and intervention of melanoma implicates higher chances of cure. Clinical diagnosis and
prognosis of melanoma is challenging since the processes are prone to misdiagnosis and inaccuracies due
to doctorsâ subjectivity. This paper proposes an innovative and fully functional smart-phone based
application to assist in melanoma early detection and prevention. The application has two major
components; the first component is a real-time alert to help users prevent skin burn caused by sunlight; a
novel equation to compute the time for skin to burn is thereby introduced. The second component is an
automated image analysis module which contains image acquisition, hair detection and exclusion, lesion
segmentation, feature extraction, and classification. The proposed system exploits PH2 Dermoscopy image
database from Pedro Hispano Hospital for development and testing purposes. The image database
contains a total of 200 dermoscopy images of lesions, including normal, atypical, and melanoma cases.
The experimental results show that the proposed system is efficient, achieving classification of the normal,
atypical and melanoma images with accuracy of 96.3%, 95.7% and 97.5%, respectively.
Improve Captcha's Security Using Gaussian Blur Filtersipij
Â
Providing security for webservers against unwanted and automated registrations has become a big
concern. To prevent these kinds of false registrations many websites use CAPTCHAs. Among all kinds of
CAPTCHAs OCR-Based or visual CAPTCHAs are very common. Actually visual CAPTCHA is an image
containing a sequence of characters. So far most of visual CAPTCHAs, in order to resist against OCR
programs, use some common implementations such as wrapping the characters, random placement and
rotations of characters, etc. In this paper we applied Gaussian Blur filter, which is an image
transformation, to visual CAPTCHAs to reduce their readability by OCR programs. We concluded that this
technique made CAPTCHAs almost unreadable for OCR programs but, their readability by human users
still remained high.
n this paper, we present feature-based technique f
or construction of mosaic image from underwater vid
eo
sequence, which suffers from parallax distortion du
e to propagation properties of light in the underwa
ter
environment. The most of the available mosaic tools
and underwater image mosaicing techniques yields
final result with some artifacts such as blurring,
ghosting and seam due to presence of parallax in th
e input
images. The removal of parallax from input images m
ay not reduce its effects instead it must be correc
ted
in successive steps of mosaicing. Thus, our approac
h minimizes the parallax effects by adopting an eff
icient
local alignment technique after global registration
. We extract texture features using Centre Symmetri
c
Local Binary Pattern (CS-LBP) descriptor in order t
o find feature correspondences, which are used furt
her
for estimation of homography through RANSAC. In ord
er to increase the accuracy of global registration,
we perform preprocessing such as colour alignment b
etween two selected frames based on colour
distribution adjustment. Because of existence of 1
00% overlap in consecutive frames of underwater vid
eo,
we select frames with minimum overlap based on mutu
al offset in order to reduce the computation cost
during mosaicing. Our approach minimizes the parall
ax effects considerably in final mosaic constructed
using our own underwater video sequences.
IMAGE COMPRESSION BY EMBEDDING FIVE MODULUS METHOD INTO JPEGsipij
Â
The standard JPEG format is almost the optimum format in image compression. The compression ratio in
JPEG sometimes reaches 30:1. The compression ratio of JPEG could be increased by embedding the Five
Modulus Method (FMM) into the JPEG algorithm. The novel algorithm gives twice the time as the standard
JPEG algorithm or more. The novel algorithm was called FJPEG (Five-JPEG). The quality of the
reconstructed image after compression is approximately approaches the JPEG. Standard test images have
been used to support and implement the suggested idea in this paper and the error metrics have been
computed and compared with JPEG.
WAVELET PACKET BASED IRIS TEXTURE ANALYSIS FOR PERSON AUTHENTICATIONsipij
Â
There is considerable rise in the research of iris recognition system over a period of time. Most of the
researchers has been focused on the development of new iris pre-processing and recognition algorithms for
good quail iris images. In this paper, iris recognition system using Haar wavelet packet is presented.
Wavelet Packet Transform (WPT ) which is extension of discrete wavelet transform has multi-resolution
approach. In this iris information is encoded based on energy of wavelet packets.. Our proposed work
significantly decreases the error rate in recognition of noisy images. A comparison of this work with nonorthogonal Gabor wavelets method is done. Computational complexity of our work is also less as
compared to Gabor wavelets method.
Hybrid lwt svd watermarking optimized using metaheuristic algorithms along wi...sipij
Â
Medical image security provides challenges and opportunities, watermarking and encryption of medical
images provides the necessary control over the flow of medical information. In this paper a dual security approach is employed .A medical image is considered as watermark and is watermarked inside a natural image. This approach is to wean way the potential attacker by disguising the medical image as a natural image. To further enhance the security the watermarked image is encrypted using encryption algorithms. In this paper a multiâobjective optimization approach optimized using different metaheuristic approaches like Genetic Algorithm (GA), Differential Evolution ( DE) and Bacterial Foraging Optimization (BFOA) is proposed. Such optimization helps in preserving the structural integrity of the medical images, which is of utmost importance. The water marking is proposed to be implemented using both Lifted Wavelet
Transforms (LWT) and Singular Value Decomposition (SVD) technique. The encryption is done using RSA
and AES encryption algorithms. A Graphical User Interface (GUI) which enables the user to have ease of
operation in loading the image, watermark it, encrypt it and also retrieve the original image whenever necessary is also designed and presented in this paper.
VIDEO QUALITY ASSESSMENT USING LAPLACIAN MODELING OF MOTION VECTOR DISTRIBUTI...sipij
Â
Video/Image quality assessment (VQA/IQA) is fundamental in various fields of video/image processing.
VQA reflects the quality of a video as most people commonly perceive. This paper proposes a reducedreference
mobile VQA, in which one-dimensional (1-D) motion vector (MV) distributions are used as
features of videos. This paper focuses on reduction of data size using Laplacian modeling of MV
distributions because network resource is restricted in the case of mobile video. The proposed method is
more efficient than the conventional methods in view of the computation time, because the proposed quality
metric decodes MVs directly from video stream in the parsing process rather than reconstructing the
distorted video at a receiver. Moreover, in view of data size, the proposed method is efficient because a
sender transmits only 28 parameters. We adopt the Laplacian distribution for modeling 1-D MV
histograms. 1-D MV histograms accumulated over the whole video sequences are used, which is different
from the conventional methods that assess each image frame independently. For testing the similarity
between MV histogram of reference and distorted videos and for minimizing the fitting error in Laplacian
modeling process, we use the chi-square method. To show the effectiveness of our proposed method, we
compare the proposed method with the conventional methods with coded video clips, which are coded
under varying bit rate, image size, and frame rate by H.263 and H.264/AVC. Experimental results show
that the proposed method gives the performance comparable with the conventional methods, especially, the
proposed method requires much lower transmission data.
SIGNIFICANCE OF DIMENSIONALITY REDUCTION IN IMAGE PROCESSING sipij
Â
The aim of this paper is to present a comparative study of two linear dimension reduction methods namely
PCA (Principal Component Analysis) and LDA (Linear Discriminant Analysis). The main idea of PCA is to
transform the high dimensional input space onto the feature space where the maximal variance is
displayed. The feature selection in traditional LDA is obtained by maximizing the difference between
classes and minimizing the distance within classes. PCA finds the axes with maximum variance for the
whole data set where LDA tries to find the axes for best class seperability. The neural network is trained
about the reduced feature set (using PCA or LDA) of images in the database for fast searching of images
from the database using back propagation algorithm. The proposed method is experimented over a general
image database using Matlab. The performance of these systems has been evaluated by Precision and
Recall measures. Experimental results show that PCA gives the better performance in terms of higher
precision and recall values with lesser computational complexity than LDA
A DEEP LEARNING BASED EVALUATION OF ARTICULATION DISORDER AND LEARNING ASSIST...ijnlc
Â
Automatic Speech Recognition (ASR) is a thriving technology, which permits an application to detect the spoken words or the speaker using computer. This work isfocused on speech recognition for children with Aspergerâs Syndrome who belongs to the age group of five to ten. Aspergerâs Syndrome (AS) is one of the spectrum disorders in Autism Spectrum Disorder (ASD) in which most of affected individuals have high levelof IQ and sound language skills. But thespectrum disorder restricts for communication and social interaction. The implementation of this workaccomplished through two different stages. In the first stage, an attempt is made to develop an Automatic Speech Recognition system, to identify the articulation disorder inchildren with Aspergersâ Syndrome (AS) on the basis of 14 Malayalam vowels. A deep autoencoder is used for pre-training in an unsupervised manner with 27,300 features that includes MelFrequency Cepstral Coefficients (MFCC), Zero Crossing, Formants 1, Formants 2 and Formants 3. An additional layer is added and fine-tuned to predict the target vowel sounds, which enhance the performance of the autoâencoder. In the second stage, Learning Assistive System for Autistic Child (LASAC)is envisioned to provide speech training as well as impaired speech analysis for Malayalam vowels. A feature vector is used to create the acoustic model and the Euclidean Distance formula is used to analyze the percentage of impairment in AS studentâs speech by comparing their speech features against the acoustic model.LASAC provides an interactive interface that offers speech training, speech analysis and articulation tutorial. Most of the autistic students have difficulty to memorize words or letters but they are able to recollect pictures. This serves as a learning assistive system for Malayalam letters with their corresponding images and words. The DNN with auto-encoder has achieved an average accuracy of 65% with the impaired (autistic) dataset.
Upgrading the Performance of Speech Emotion Recognition at the Segmental Level IOSR Journals
Â
Abstract: This paper presents an efficient approach for maximizing the accuracy of automatic speech emotion
recognition in English, using minimal inputs, minimal features, lesser algorithmic complexity and reduced
processing time. Whereas the findings reported here are based on the exclusive use of vowel formants, most of
the related previous works used tens or even hundreds of other features. In spite of using a greater level of
signal processing, the recognition accuracy reported earlier was often lesser than that obtained by our
approach. This method is based on vowel utterances and the first step comprises statistical pre-processing of
the vowel formants. This is followed by the identification of the best formants using the KMeans, K-nearest
neighbor and Naive Bayes classifiers. The Artificial neural network that was used for the final classification
gave an accuracy of 95.6% on elicited emotional speech. Nearly 1500 speech files from ten female speakers in
the neutral and six basic emotions were used to prove the efficiency of the proposed approach. Such a result
has not been reported earlier for English and is of significance to researchers, sociologists and others interested in speech.
Keywords: Artificial Neural Networks, Emotions, Formants, Preprocessing,Vowels.
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
Vol. 2, No. 3 , Ahwaz Journal of Linguistics Studies
Ahwaz Journal of Linguistics Studies
(Peer-Reviewed International Quarterly Journal)
(ISSN: 2717-2643)
For more information, please visit the journal website:
WWW.AJLS.IR
The journal welcomes submissions in English, Arabic or Persian in any of the relevant fields:
A) Linguistics (Any issue related to either theoretical or applied linguistics)
B) Languages and dialects (Any linguistic issue related languages and dialects)
C) Translation (Any translation and interpreting issue related to languages and dialects)
D) Religious linguistics (Any linguistic study related to religious texts and speeches)
Please feel free to write if there is any query.
The AJLS Secretariat,
Ahwaz 61335-4619 Iran
Tel: (+98) 61-32931199
Fax: (+98) 61-32931198
Mobile: (+98) 916-5088772 (WhatsApp Number)
Email: info@pahi.ir
The Speech Sound Pics Approach has been created by the Reading Whisperer for Australian schools. This presentation shows the research on which SSP is based, as well as an overview regarding HOW to teach any child to read and spell before year 2.
www.facebook.com/readaustralia
Efficiency of the energy contained in modulators in the Arabic vowels recogni...IJECEIAES
Â
The speech signal is described as many acoustic properties that may contribute differently to spoken word recognition. Vowel characterization is an important process of studying the acoustic characteristics or behaviors of speech within different contexts. This current study focuses on the modulators characteristics of three Arabic vowels, we proposed a new approach to characterize the three Arabic vowels /a/, /i/ and /u/. The proposed method is based on the energy contained in the speech modulators. The coherent subband demodulation method related to the spectral center of gravity (COG) was used to calculate the energy of the speech modulators. The obtained results showed that the modulators energy help characterize the Arabic vowels /a/, /i/ and /u/ with an interesting recognition rate ranging from 86% to 100%.
Improving the intelligibility of dysarthric speech using a time domain pitch...IJECEIAES
Â
Dysarthria is a motor speech impairment that reduces the intelligibility of speech. Observations indicate that for different types of dysarthria, the fundamental frequency, intensity, and speech rate of speech are distinct from those of unimpaired speakers. Therefore, the proposed enhancement technique modifies these parameters so that they fall in the range for unimpaired speakers. The fundamental frequency and speech rate of dysarthric speech are modified using the time domain pitch synchronous overlap and add (TD-PSOLA) algorithm. Then its intensity is modified using the fast Fourier transform (FFT) and inverse fast Fourier transform (IFFT)-based approach. This technique is applied to impaired speech samples of ten dysarthric speakers. After enhancement, the intelligibility of impaired and enhanced dysarthric speech is evaluated. The change in the intelligibility of impaired and enhanced dysarthric speech is evaluated using the rating scale and word count methods. The improvement in intelligibility is significant for speakers whose original intelligibility was poor. In contrast, the improvement in intelligibility was minimal for speakers whose intelligibility was already high. According to the rating scale method, for diverse speakers, the change in intelligibility ranges from 9% to 53%. Whereas, according to the word count method, this change in intelligibility ranges from 0% to 53%.
Acoustic Duration of Arabic Utterances Spoken by the Indonesian Learners of A...iosrjce
Â
IOSR Journal of Humanities and Social Science is a double blind peer reviewed International Journal edited by International Organization of Scientific Research (IOSR).The Journal provides a common forum where all aspects of humanities and social sciences are presented. IOSR-JHSS publishes original papers, review papers, conceptual framework, analytical and simulation models, case studies, empirical research, technical notes etc.
Estimation of Severity of Speech Disability Through Speech Envelopesipij
Â
In this paper, envelope detection of speech is discussed to distinguish the pathological cases of speech disabled children. The speech signal samples of children of age between five to eight years are considered for the present study. These speech signals are digitized and are used to determine the speech envelope. The envelope is subjected to ratio mean analysis to estimate the disability. This analysis is conducted on ten speech signal samples which are related to both place of articulation and manner of articulation. Overall speech disability of a pathological subject is estimated based on the results of above analysis.
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
Â
call for paper 2012, hard copy of journal, research paper publishing, where to publish research paper,
journal publishing, how to publish research paper, Call For research paper, international journal, publishing a paper, IJERD, journal of science and technology, how to get a research paper published, publishing a paper, publishing of journal, publishing of research paper, reserach and review articles, IJERD Journal, How to publish your research paper, publish research paper, open access engineering journal, Engineering journal, Mathemetics journal, Physics journal, Chemistry journal, Computer Engineering, Computer Science journal, how to submit your paper, peer reviw journal, indexed journal, reserach and review articles, engineering journal, www.ijerd.com, research journals
October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it Welcome to Gboard clipboard, any text you copy will be saved here.Tap on a clip to paste it in the text box.Touch and hold a clip to pin it. Unpinned clips will be deleted after 1 hour.Use the edit icon to pin, add or delete clips.October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and the đ well with the possibility of what was with me I don't let it October and what was with you guys Kay and the systems engineer w c baby I hope you have been proposed what was easy and th
Segmentation Words for Speech Synthesis in Persian Language Based On Silencepaperpublications3
Â
Abstract: In speech synthesis in text to speech systems, the words usually break to different parts and use from recorded sound of each part for play words. This paper use silent in word's pronunciation for better quality of speech. Most algorithms divide words to syllable and some of them divide words to phoneme, but This paper benefit from silent in intonation and divide words at silent region and then set equivalent sound of each parts whereupon joining the parts is trusty and speech quality being more smooth . this paper concern Persian language but extendable to another language. This method has been tested with MOS test and intelligibility, naturalness and fluidity are better.Keywords:TTS, SBS, Sillable, Diphone.
Title:Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Author:Sohrab Hojjatkhah, Ali Jowharpour
International Journal of Recent Research in Mathematics Computer Science and Information Technology (IJRRMCSIT)
Paper Publications
Part of speech tagging is one of the basic steps in natural language processing. Although it has been
investigated for many languages around the world, very little has been done for Setswana language.
Setswana language is written disjunctively and some words play multiple functions in a sentence. These
features make part of speech tagging more challenging. This paper presents a finite state method for
identifying one of the compound parts of speech, the relative. Results show an 82% identification rate
which is lower than for other languages. The results also show that the model can identify the start of a
relative 97% of the time but fail to identify where it stops 13% of the time. The model fails due to the
limitations of the morphological analyser and due to more complex sentences not accounted for in the
model.
The current study was conducted for children with CP who are Hindi speakers, to analyze voice onset time (VOT) patterns associated with stop consonants. The purpose was to gain more detailed insights into how the coordination and timing needed for these articulations may be affected by the neurodevelopmental disorder ultimately opening up avenues for improved treatment solutions. Total 20 children in the age group of 6 - 8 years were taken for the study which was divided into two groups: 10 children diagnosed with spastic CP and 10 typical children speaking Hindi. Measuring VOT through voice analysis helped us gauge the interval lapsing between releasing a stop closure and voicing onset. The outcomes exhibited considerable variation in VOT values when comparing groups of children with cerebral palsy (CP) against typical children speaking Hindi. Specifically, CP children showed longer VOT durations indicating delayed voicing onset compared to controls. These findings point out that children who have cerebral palsy exhibit different timing and coordination patterns upon pronouncing stop consonants. It is essential to grasp the VOT traits in children with cerebral palsy. This will aid in creating suitable intervention techniques that enhance their speech comprehensibility and communication skills.
Similar to Sipij040305SPEECH EVALUATION WITH SPECIAL FOCUS ON CHILDREN SUFFERING FROM APRAXIA OF SPEECH (20)
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Â
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as âpredictable inferenceâ.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
Accelerate your Kubernetes clusters with Varnish CachingThijs Feryn
Â
A presentation about the usage and availability of Varnish on Kubernetes. This talk explores the capabilities of Varnish caching and shows how to use the Varnish Helm chart to deploy it to Kubernetes.
This presentation was delivered at K8SUG Singapore. See https://feryn.eu/presentations/accelerate-your-kubernetes-clusters-with-varnish-caching-k8sug-singapore-28-2024 for more details.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Â
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overviewâ
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Software Delivery At the Speed of AI: Inflectra Invests In AI-Powered QualityInflectra
Â
In this insightful webinar, Inflectra explores how artificial intelligence (AI) is transforming software development and testing. Discover how AI-powered tools are revolutionizing every stage of the software development lifecycle (SDLC), from design and prototyping to testing, deployment, and monitoring.
Learn about:
⢠The Future of Testing: How AI is shifting testing towards verification, analysis, and higher-level skills, while reducing repetitive tasks.
⢠Test Automation: How AI-powered test case generation, optimization, and self-healing tests are making testing more efficient and effective.
⢠Visual Testing: Explore the emerging capabilities of AI in visual testing and how it's set to revolutionize UI verification.
⢠Inflectra's AI Solutions: See demonstrations of Inflectra's cutting-edge AI tools like the ChatGPT plugin and Azure Open AI platform, designed to streamline your testing process.
Whether you're a developer, tester, or QA professional, this webinar will give you valuable insights into how AI is shaping the future of software delivery.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Â
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But thereâs more:
In a second workflow supporting the same use case, youâll see:
Your campaign sent to target colleagues for approval
If the âApproveâ button is clicked, a Jira/Zendesk ticket is created for the marketing design team
Butâif the âRejectâ button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Smart TV Buyer Insights Survey 2024 by 91mobiles.pdf91mobiles
Â
91mobiles recently conducted a Smart TV Buyer Insights Survey in which we asked over 3,000 respondents about the TV they own, aspects they look at on a new TV, and their TV buying preferences.
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Tobias Schneck
Â
As AI technology is pushing into IT I was wondering myself, as an âinfrastructure container kubernetes guyâ, how get this fancy AI technology get managed from an infrastructure operational view? Is it possible to apply our lovely cloud native principals as well? What benefitâs both technologies could bring to each other?
Let me take this questions and provide you a short journey through existing deployment models and use cases for AI software. On practical examples, we discuss what cloud/on-premise strategy we may need for applying it to our own infrastructure to get it to work from an enterprise perspective. I want to give an overview about infrastructure requirements and technologies, what could be beneficial or limiting your AI use cases in an enterprise environment. An interactive Demo will give you some insides, what approaches I got already working for real.
Generating a custom Ruby SDK for your web service or Rails API using Smithyg2nightmarescribd
Â
Have you ever wanted a Ruby client API to communicate with your web service? Smithy is a protocol-agnostic language for defining services and SDKs. Smithy Ruby is an implementation of Smithy that generates a Ruby SDK using a Smithy model. In this talk, we will explore Smithy and Smithy Ruby to learn how to generate custom feature-rich SDKs that can communicate with any web service, such as a Rails JSON API.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Â
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
Encryption in Microsoft 365 - ExpertsLive Netherlands 2024Albert Hoitingh
Â
In this session I delve into the encryption technology used in Microsoft 365 and Microsoft Purview. Including the concepts of Customer Key and Double Key Encryption.
State of ICS and IoT Cyber Threat Landscape Report 2024 previewPrayukth K V
Â
The IoT and OT threat landscape report has been prepared by the Threat Research Team at Sectrio using data from Sectrio, cyber threat intelligence farming facilities spread across over 85 cities around the world. In addition, Sectrio also runs AI-based advanced threat and payload engagement facilities that serve as sinks to attract and engage sophisticated threat actors, and newer malware including new variants and latent threats that are at an earlier stage of development.
The latest edition of the OT/ICS and IoT security Threat Landscape Report 2024 also covers:
State of global ICS asset and network exposure
Sectoral targets and attacks as well as the cost of ransom
Global APT activity, AI usage, actor and tactic profiles, and implications
Rise in volumes of AI-powered cyberattacks
Major cyber events in 2024
Malware and malicious payload trends
Cyberattack types and targets
Vulnerability exploit attempts on CVEs
Attacks on counties â USA
Expansion of bot farms â how, where, and why
In-depth analysis of the cyber threat landscape across North America, South America, Europe, APAC, and the Middle East
Why are attacks on smart factories rising?
Cyber risk predictions
Axis of attacks â Europe
Systemic attacks in the Middle East
Download the full report from here:
https://sectrio.com/resources/ot-threat-landscape-reports/sectrio-releases-ot-ics-and-iot-security-threat-landscape-report-2024/
State of ICS and IoT Cyber Threat Landscape Report 2024 preview
Â
Sipij040305SPEECH EVALUATION WITH SPECIAL FOCUS ON CHILDREN SUFFERING FROM APRAXIA OF SPEECH
1. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
DOI : 10.5121/sipij.2013.4305 57
SPEECH EVALUATION WITH SPECIAL FOCUS
ON CHILDREN SUFFERING FROM APRAXIA OF
SPEECH
Manasi Dixit1
and Shaila Apte2
1
Department of Electronics K.I.T.âs College of Engineering, Kolhapur,416234
Maharashtra, India
mrdixit@rediffmail.com
2
Department of Electronics and Communication Engineering,
Rajarshi Shahu College of Engineering, Pune, Maharashtra,India
sdapte@redifmail.com
ABSTRACT
Speech disorders are very complicated in individuals suffering from Apraxia of Speech-AOS. In this paper ,
the pathological cases of speech disabled children affected with AOS are analyzed. The speech signal
samples of children of age between three to eight years are considered for the present study. These speech
signals are digitized and enhanced using the using the Speech Pause Index, Jitter,Skew ,Kurtosis analysis
This analysis is conducted on speech data samples which are concerned with both place of articulation and
manner of articulation. The speech disability of pathological subjects was estimated using results of above
analysis.
KEY WORDS
Speech-Pause Index (SPI), Lexical Stress Index (LSI), Fundamental Frequency Analysis, Speech Intensity
Analysis, Coefficient of Variation Ratio (CVR), Formant Analysis ,Jitter,Skew,Kurtosis ,Consonant
production errors ,Speech intelligibility, Speech-recognition,
1. INTRODUCTION
Apraxia of Speech(AOS) is caused by a neurologic disorder or injury (e.g. cerebral palsy or
traumatic brain injury).. In this present work, the speech data utterances by children of age group
of 3 to 10 years were recorded and digitized. The digitized signal was further processed by using
a MATLAB platform. The speech data was analyzed to check the disorders due to AOS. The
acoustic analysis (e.g., spectrography,) is conducted on speech data. Perceptual assessment of
speech is done so as to get important information regarding articulation, resonance and speech
intelligibility[1]. Perceptual assessment provides important information regarding misarticulation,
resonance and speech intelligibility, In this present work ,it is observed that the speech disabled
children produce misarticulation errors mainly due to weak motor movements and inappropriate
closure of oral cavity uncoordinated articulator movements, breathing problems[6,7]. When
speech is affected, all aspects of speech production may be affected including respiration,
phonation, resonation, articulation, and prosody due to Rigidity of motor muscles ,
Velopharyngeal incompetency, Auditory processing, and Language impairments[2,10]. The paper
is organized as follows. Section 2 deals with the literature review .Section 3 discusses the
methodology for experimental work and section 4 presents the results.
2. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
58
2. LITERATURE REVIEW
According to R.Cmelja, J. Rusz the changes of vocal cords vibration are perceived as changes of
pitch period or fundamental frequency in voice. The fundamental frequency is influenced by
properties of the vocal cords for example elasticity, weight and length. Thus the variations in
fundamental frequency provide instability measurements and work as markers about the
condition and functionality of the vocal cords. Abnormal characteristics of the fundamental
frequency are found in a number of voice and speech disorders .Frequency instability or jitter is
derived from the instantaneous values of pitch periods[1] . John-Paul Hosom, Lawrence Shriberg,
Jordan R. Green suggested the method of estimation of Jitter . They have also discussed the
impact of lexical stress and coefficient variation ratio in case of evaluation of pathological voices
[2,10]. Darush Mehta discussed the liear source filter model of the vocal tract and the variations
in formant frequencies act as a diagnostic marker in case of speech diabled subjects[4]. Skew and
Kurtosis factors were evaluated using Praat.
3. METHODOLOGY
The present work is based on study of children with Marathi mother tongue. The speech data of
normal subjects/children and pathological subjects/children of the same age group between 3 to
10years is collected. The children were trained to utter similar words before recording. The
speech data consists of isolated words, connected words , fast uttered sentences and songs for eg.
Prarthana-School-Prayer, National anthem and Pledge,Nursery Rhymes,famous film songs etc.
The speech data was recorded using Sony Intelligent Portable Ocular Device (IPOD) in digital
form. The recording was carried out in a pleasant atmosphere and maintaining the children in
tension-stress free environment. The recorded signal is transformed into .wav file by using
GOLDWAVE software. The isolated word speech data is given in Table-1. The data was
collected at Chetana Vikas Mandir, a special school established to educate mentally retarded
children as well as children with various disorders. It is located at Kolhapur, India.
3.1. Segmental Analysis
The Segmental Analysis was carried out for utterances of particular isolated words as well as
complete sentences and paragraph /songs. The utterances made by 20 normal subjects and 10
pathological subjects were analyzed. Proposed diagnostic markers for suspected Apraxia of
Speech (AOS) are vowel duration analysis, vowel formant analysis, and consonant transition
index (CTI) analysis were carried out. Various misarticulation cases were analyzed in case of
pathological subjects. The spectrograms were studied for formant analysis[1,4]. Fast uttered
words or continuous sentences exhibit greater complexities with respect to speech intelligibility.
The results obtained with the help of proposed MATLAB Algorithm were verified using Praat
and SFS Open source softwares.
3.2. Suprasegmental Analysis
The Suprasegmental Analysis was carried out for utterances of particular isolated words as well
as complete sentences and paragraph /songs. The utterances made by 20 normal subjects and 10
pathological subjects were analyzed. Proposed diagnostic markers for suspected Apraxia of
Speech (AOS) are Speech-Pause Index (SPI), Fundamental Frequency Analysis[9], Speech
Intensity Analysis ,Jitter[11],Skew and Kurtosis analysis[1,2,10].
Skew and Kurtosis analysis represents the deviations due to stressed utterences compared to the
unstressed utterences in normal subjects as obtained from a speakerâs productions for isolated
3. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
59
word and sentences forms. Composite weightings for the three stress parameters were determined
from a principal components analysis. The Coefficient of Variation Ratio (CVR) expresses the
average normalized variability of durations of pause and speech events that were obtained from a
conversational speech sample. These diagnostic markers address temporal variations observed in
the speech of children with suspected AOS [2,10].
3.3 Suprasegmental factors concerned with Marathi language
The present study uses marathi words and sentences. The following factors are of concern for
marathi language.
Length: Vocalic length is mostly predictable. With the exception of É, the last vowel of a word is
long unless the vowels are followed by a combination of consonants such as nt, tr, kt. The length
is phonemic in i, u.
Nasal vowels: Use of nasal vowels as independent entities and they vary from speaker to speaker.
They are found in certain adverbs, nouns, and plural nouns in the context of case and
postpositions. They are phonemic in certain dialects.
Accent: Marathi is said to have a stress accent. Length, pitch, and sonority play a role in
determining the loudest accent.
Phonology :Modern Marathi script, known asâ balbodhiâ, is based on the Sanskrit Devnagari
script, with certain modifications. Unlike English, Devnagari is alpha syllabic. It uses certain
diacritics for vowels when combined with consonants. The diacritics distinguish long and short
vowels. There are special systems to denote consonant clusters.
Traditional Marathi alphabetic chart lists 16 vowels and 36 consonants based on Sanskrit. Today,
many of these alphabets are obsolete. Modern Marathi has 8 basic vowels and 34 consonants
including two semivowels. Table 1 and Table 2 indicate the vocalic and consonantal charts and
their respective features.
Table 1. DevanÄgarÄŤ alphabet for Marathi Vowels and vowel diacritics with IPA codes
4. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
60
Table 2. DevanÄgarÄŤ alphabet for Marathi Consonants with IPA codes
The isolated word speech data emphasized in the present work is described below. It indicates
case studies of different types of Articulation errors with respect to both place of articulation and
manner of articulation[6,7]. Table 3 lists the place of articulation and manner of articulation for
marathi words. Table 4 lists the fundamental frequencies for normal and pathological subjects and
Table 5 lists the second and third formants for normal and pathological subjects. Table-6. Lists
the Speech Intensity,Jitter,Speech âPause Analysis for Different Subjects.
Table-3. List of specific place of articulation and manner of articulation of Consonants
Sr.
No.
Place of
articulation
Manner of
articulation
English Letters Marathi Letters Marathi Words
1 bilabial plosive p,b Pa f ba Ba AapNa fijatI barI Baava
bilabial nasal m ma maamaa
bilabial approximant w va vaasaU
2 labio-dental fricative f, v f vh fursao vhya
3 dental fricative th, th t qa d Qa tara qaaoDI dUQa
4 alveolar plosive t, d T z D Z D^a@Tr Zga
alveolar nasal n na Na baaNa kana
alveolar fricative s, z sa saap
alveolar approximant r/l r la L rsaaL laLa
5 post-alveolar fricative sh,zh Ya Sa Ja YaTkaona SahamaRga Jagaa
post-alveolar affricate ch,j ca C ja ca,Mda CumaCuma jahaja
post-alveolar approximant y ya yaaoyaao
6 velar plosive kg k K ga Ga kaya ga Katosa Gar
velar nasal ng D. vaaD.maya
7 glottal fricative h h hvaa
Other Isolated Words Used for Analysis ga( AnaMt saMyau> raYT A&anaI )dya
5. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
61
Table-4.Fundamental Frequency Analysis - f 0 and variations of f 0 for different subjects
Sr.No. Subject f 0 Variations of f 0 Remarks
1 Speaker 1 255 Hz 96.7-319 Normal Female
2 Speaker 2 228.7Hz 141.3-319.6 Normal Female
3 Speaker 3 210.55 Hz 60-307.7 Patological Female Subject
4 Speaker 4 188.8 Hz 129.7-309.6 Normal Male
5 Speaker 5 198.8 Hz 129-265.6 Pathological Male Subject
6 Speaker 6 139.75 Hz 101-297.9 Pathologic Male Subject
7 Speaker 7 181.5 7Hz 77-297.9 Pathologic Male Subject
8 Speaker 8 159.54 Hz 82-315.5 Pathologic Male Subject
9 Speaker 9 163.4 Hz 69-319 Pathologic Male Subject
10 Speaker 10 240.21 Hz 91-319 Pathologic Male Subject
Table-5. Formant Analysis- f 1 , f 2 and f 3 for Different Subjects
Sr.No. Subject f 1 -Hz Bw of f 1 f 2 -Hz Bw of f 2 f 3 -Hz Bw of f 3
1 Speaker 1 580.94 379.6 1994.12 1349.86 2911.28 805.6
2 Speaker 2 618.21 154.03 1733.73 100.85 2740.9 351.59
3 Speaker 3 487.95 194.31 1530.16 143.32 2450.86 157.39
4 Speaker 4 640.57 70.6 1760.14 107.98 2738.32 261.8
5 Speaker 5 563.04 703.13 1397.31 372.92 2828.74 355.64
6 Speaker 6 811.47 503.09 1632.88 1496.1 2683.95 1069.35
7 Speaker 7 700.87 274.05 1598.82 1186.99 2548.72 749.50
8 Speaker 8 659.90 319 1751.96 424.39 2908.14 314.45
9 Speaker 9 1055.46 46.11 1875.62 202.5 2746.21 400.12
10 Speaker 10 1026.46 29.18 1761.64 1075.86 2561.33 247.57
Table-6. Speech Intensity,Jitter,Speech âPause Analysis for Different Subjects
Sr.
No.
Subject Sound
Amplitu
de mean-
pascal
Sound
Mean
Power
-dB
Total
Energy
in Air-
Joules/
m2
Intensity
Variation-
dB
Jitter
%
Speech-
Pause
Index
%
Skew Kurtosis
1 Speaker 1 0.41574 84.82 0.00660 60.72-
89.53
17.16 38.07 4.32 22.70
2 Speaker 2 0.05110 84.34 0.0063 55.9-91 27.28 27.06 4.2 16.81
3 Speaker 3 -0.00397 70.02 0.00051 45-79.98 13.02 64.55 2.47 4.76
4 Speaker 4 -0.0523 72.63 0.00028 56.87-
84.66
27.38 40 3.4 12.86
5 Speaker 5 0.30647 85.12 0.04907 60.38-
91.74
31.4 62.93 2.24 5.28
6 Speaker 6 0.38278 87.91 0.01469 69.38-
90.59
22.77 25.63 3.42 11.10
7 Speaker 7 0.3592 83.57 0.00760 56.28-87 2.29 48.8 2.79 6.47
8 Speaker 8 0.0001 88.41 0.16007 60.41-
91.50
35.8 21.85 2.99 9.41
9 Speaker 9 0.00018 83.67 0.0166 58.45-84.4 28.06 42.22 1.97 2.66
10 Speaker 10 0.00018 83.67 0.0166 38.54-
86.56
33.45 54.55 1.75 4.09
6. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
62
4. CONCLUSION
The pathological subjects affected with Apraxia of speech exhibit rigidity in motor movements.
Various types of misarticulation errors occur in different subjects. The observations are weak
consonant production and inappropriate closure of oral cavity while producing Bilabial plosives.
Utterances of some of the plosives and alveolar approximants were not possible in case of some
of the subjects. The Segmental and Suprasegmental Analysis indicators Speech-Pause Index
(SPI), was found to be high in pathological subjects. The Skew factor is observed below the
threshold level of 3.4 and the Kurtosis factor is observed below the threshold level of 12 in almost
all pathological subjects. High Pitch values falling in the female pitch range are observed in case
of pathological male subjects. The Formants are widely spread in case of pathological subjects.
High range of itensity variations,high % Jitter levels, high values of speech-pause index(SPI),low
skew levels and low level of Kurtosis factor is the diagnostic marker set of pathological
male/female speech.
ACKNOWLEDGEMENTS
The authors would like to thank the participants of research work- all the mentally retarded
children with speech disorders and the Director of the special school Chetana Vikas Mandir
Mr.Pawan Khebudkar.
REFERENCES
[1] R.Cmelja, J. Rusz âLaboratory Tasks From Voice Analysis in the Study of Biomedical Engineering
using MATLABâ Research Program MSM6840770012 Trans Disciplinary Research in Biomedical
Engineering-Voice and Speech Disorders, Charles university ,Prague, Czech Technical university
,Prague , Dallas university
[2] John-Paul Hosom, Lawrence Shriberg, Jordan R. GreenâDiagnostic Assessment of Childhood
Apraxia of Speech Using Automatic Speech Recognition (ASR) Methodsâ J Med Speech Lang
Pathol. 2004 December ; 12(4): 167â171.
[3] Marcelo de Oliveira Rosa, JosĂŠ Carlos Pereira, and Marcos Grelletâ Adaptive Estimation of Residue
Signal for Voice Pathology Diagnosisâ IEEE TRANSACTIONS ON BIOMEDICAL
ENGINEERING, VOL. 47, NO. 1, JANUARY 2000
[4] Darush Mehta âAspiration Noise During Phonation: Synthesis, Analysis, And Pitch -Scale
Modificationâ MASSACHUSETTS INSTITUTE OF TECHNOLOGY February 2006
[5] Lingyun Gu, John G. Harris,et.al. âDisordered Speech Assessment Using Automatic Methods Based
on Quantitative Measures â EURASIP Journal on Applied Signal Processing 2005:9, 1400â1409 c_
2005 Hindawi Publishing Corporation
[6] Oscar Saz,1 Javier Sim´ on,2 W.-Ricardo Rodr´Ĺguez,1 Eduardo Lleida,1 and Carlos Vaquero1
Research ArticleâAnalysis of Acoustic Features in Speakers with Cognitive Disorders and Speech
Impairmentsâ Hindawi Publishing Corporation EURASIP Journal on Advances in Signal Processing
Volume 2009, Article ID 159234, 11 pages doi:10.1155/2009/159234
[7] Juan Ignacio Godino-Llorente,1 Pedro G´omez-Vilda (EURASIPMember),2 and Tan Lee3 Editorial
âAnalysis and Signal Processing of Oesophageal and Pathological Voicesâ Hindawi Publishing
Corporation EURASIP Journal on Advances in Signal Processing Volume 2009, Article ID 283504, 4
pages doi:10.1155/2009/283504
7. Signal & Image Processing : An International Journal (SIPIJ) Vol.4, No.3, June 2013
63
[8] CatherineMiddag,1 Jean-PierreMartens,1 Gwen Van Nuffelen,2 andMarc De Bodt2 Research Article
âAutomated Intelligibility Assessment of Pathological Speech Using Phonological Features âHindawi
Publishing Corporation EURASIP Journal on Advances in Signal Processing Volume 2009, Article
ID 629030, 9 pages doi:10.1155/2009/629030
[9] Arundhati S. Mehendale and Manasi R. Dixit âSpeaker Identificationâ Signal & Image Processing :
An International Journal (SIPIJ) Vol.2, No.2, June 2011, DOI : 10.5121/sipij.2011.2206
[10] Lawrence D. Shriberg âStudies of Childhood Apraxia of Speech in Complex Neurodevelopmental
Disordersâ Conference on Motor Speech: Motor Speech Disorders ,Monterey, California,March 6-9,
2008
[11] D´arcio G. Silva, Lu´Ĺs C. Oliveira, andM´ario Andrea âJitter Estimation Algorithms for Detection of
Pathological Voicesâ Research Article Hindawi Publishing Corporation EURASIP Journal on
Advances in Signal Processing Volume 2009, Article ID 567875, 9 pages doi:10.1155/2009/567875
Authors
Mrs.Manasi Dixit is working as Associate Professor in KITâs College of
Engineering,Kolhapur .Her teaching experience is 28 years. Her main fields of
interest are Digital Signal Processing,Speech Processing,Image Processing and
Microwave Engineering. 16 PG students have completed their research work and
have been awarded M.E.(E&TC) degree under her guidance. She has published 14
papers in reputed international journals.She has worked in the capacity of the
SENATE member Shivaji University,Kolhapur and BOS-Board of Studies Member
for Electronics and Telecommunication Engineering, Shivaji University,Kolhapur.
Prof. Dr. Shaila Dinkar Apte is currently working as a professor on PG side in
Rajarshi Shahu College of Engineering, Pune and as reviewer for the International
Journal of Speech Technology by Springer Publication, International Journal of
Digital Signal Processing, Elsevier Publication. She is currently guiding 5 Ph.D.
candidates. Four candidates have completed their Ph.D. under her guidance. About
50 candidates completed their M.E. dissertations under her guidance. Almost all
dissertations are in the area of signal processing. She has a vast teaching experience
of 32 years in electronics engineering, and enjoys great popularity amongst students.
She has been teaching Digital Signal Processing and Advanced Digital Signal
Processing since last 18 years. Her previous designations include being an Assistant Professor in Walchand
College of Engineering, Sangli, for 27 years; a member of board of studies for Shivaji University and a
principle investigator for a research project sponsored by ARDE, New Delhi. She has published 28 papers
in reputed international journals, more than 40 papers in international conferences and about 15 papers in
national conferences. She has a patent published to her credit related to generation of mother wavelet from
speech signal. Her books titled âDigital Signal Processingâ and âSpeech and Audio Processingâ are
published by Wiley India. A second reprint of second edition of the first book is in the market. The book
titled âAdvanced Digital Signal Procesingâ will be published recently in June 2013 by Wiley publishers.