This work describes the construction of PADAS (Phonetics Arabic Database Automatically Segmented), based on a data-driven Markov process. A segmented database is necessary for speech synthesis and speech recognition. Manual segmentation is precise but inconsistent, since it is often produced by more than one labeler, and it requires time and money. MAUS segmentation and labeling exists for German and other languages but not for Arabic. It is therefore necessary to adapt MAUS to establish a segmental database for Arabic. The speech corpus contains a total of 600 sentences recorded by 3 native Arabic speakers from Tunisia (2 male and 1 female), 200 sentences each.
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...CSCJournals
Phonetization is the transcription from written text into sounds. It is used in many natural language processing tasks, such as speech processing, speech synthesis, and computer-aided pronunciation assessment. A common phonetization approach is the use of letter-to-sound rules developed by linguists for the transcription from grapheme to sound. In this paper, we address the problem of rule-based phonetization of standard Arabic. The paper's contributions can be summarized as follows: 1) Discussion of the transcription rules of standard Arabic which were used in the literature at the phonemic and phonetic levels. 2) Improvements to existing rules are suggested and new rules are introduced. Moreover, a comprehensive algorithm covering the phenomenon of pharyngealization in standard Arabic is proposed. Finally, the resulting rule set has been tested on large datasets. 3) We present a reliable automatic phonetic transcription of standard Arabic at five levels: phoneme, allophone, syllable, word, and sentence. An encoding which covers all sounds of standard Arabic is proposed, and several pronunciation dictionaries have been automatically generated. These dictionaries have been manually verified, yielding an accuracy higher than 99% for standard Arabic texts that do not contain dates, numbers, acronyms, abbreviations, and special symbols. The dictionaries are available for research purposes.
A STUDY FOR THE EFFECT OF THE EMPHATICNESS AND LANGUAGE AND DIALECT FOR VOIC...sipij
This study analyzes voice onset time (VOT) values for four stop consonants in Modern Standard Arabic: /d/, /t/, /dˤ/, and /tˤ/. The student researcher builds a database of carrier words with a CV-CV-CV syllable structure and computes VOT values. The main findings are that VOT values for the emphatic sounds (/dˤ/ and /tˤ/) are consistently lower than for their non-emphatic counterparts (/d/ and /t/), and that VOT can distinguish Arabic dialects. The study aims to address the lack of research analyzing phonetic features of Arabic and to help automatic speech and language processing systems.
TURN SEGMENTATION INTO UTTERANCES FOR ARABIC SPONTANEOUS DIALOGUES ...ijnlc
Text segmentation is an essential processing task for many Natural Language Processing (NLP) applications, such as text summarization, text translation, and dialogue language understanding, among others. Turn segmentation is considered the key player in the dialogue understanding task for building automatic human-computer systems. In this paper, we introduce a novel approach to turn segmentation into utterances for Egyptian spontaneous dialogues and Instant Messages (IM) using a Machine Learning (ML) approach, as part of the task of automatically understanding Egyptian spontaneous dialogues and IM. Due to the lack of an Egyptian dialect dialogue corpus, the system is evaluated on our own corpus, which includes 3001 turns collected, segmented, and annotated manually from Egyptian call centers. The system achieves an F1 score of 90.74% and an accuracy of 95.98%.
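As a rough illustration of how an F1 score and an accuracy figure like those above can be computed for a boundary-detection task, the sketch below assumes each token in a turn carries a binary label (1 if an utterance ends after it, 0 otherwise). The function name `boundary_scores` and the toy labels are hypothetical and are not taken from the paper.

```python
def boundary_scores(gold, predicted):
    """Return (precision, recall, f1, accuracy) for binary boundary labels."""
    if len(gold) != len(predicted):
        raise ValueError("label sequences must have the same length")

    pairs = list(zip(gold, predicted))
    tp = sum(1 for g, p in pairs if g == 1 and p == 1)  # boundary found
    fp = sum(1 for g, p in pairs if g == 0 and p == 1)  # spurious boundary
    fn = sum(1 for g, p in pairs if g == 1 and p == 0)  # missed boundary
    tn = sum(1 for g, p in pairs if g == 0 and p == 0)  # correct non-boundary

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    accuracy = (tp + tn) / len(pairs) if pairs else 0.0
    return precision, recall, f1, accuracy


# Toy example (invented labels): 8 tokens, 3 true utterance boundaries.
gold = [0, 0, 1, 0, 1, 0, 0, 1]
predicted = [0, 0, 1, 0, 0, 0, 1, 1]
precision, recall, f1, accuracy = boundary_scores(gold, predicted)
```

Because most tokens are not boundaries, accuracy tends to be higher than F1 on this kind of task, which is consistent with the gap between the two reported figures.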
Transliteration/Romanization of Urdu Processing by Rashida sharif Rashida Sharif
This document discusses transliteration of Urdu text into the Roman (English) script. It begins by introducing transliteration as the systematic mapping of text from one writing system to another. It then reviews previous work on transliterating Urdu, noting various systems that were developed but had limitations. The document proposes a new reversible transliteration scheme for mapping Urdu letters to English letters based on both letter conversions and phonetic approaches. It presents mapping tables and concludes the new scheme allows for recursive transliteration between English and Urdu without ambiguity issues seen in prior systems.
The impact of language planning, terminology planning, and arabicization, on ...Alexander Decker
This document summarizes a research paper on the impact of language planning, terminology planning, and Arabicization on military terminology planning and translation. It discusses how modern Arab armies have struggled with a lack of accurate Arabic equivalents for new military terminology as they have modeled their structure and equipment after Western armies. The document then reviews different approaches to language planning, terminology planning, and Arabicization and analyzes how they have been applied to developing standardized Arabic military terminology in Jordan. It concludes that language planning has played an important role in adapting to developments in military science and technology by coining new terms or Arabicizing foreign terms to find accurate Arabic equivalents.
A New Approach to Romanize Arabic Words IJERA Editor
Romanization of Arabic words has attracted the interest of researchers due to its importance in many
fields such as security and counter-terrorism, translation, religious purposes, etc.
In this paper, a method is proposed to address the drawbacks of available methods, such as the lack of
reverse recognition, the use of extra letters and punctuation characters, and neglect of the correlation between the letters
in a word.
The method was implemented and tested using a sample of 100 undergraduate Iraqi students and 150 Arabic
words, which were romanized using five well-known methods in addition to the proposed one. The test showed that
the proposed method outperforms the other methods in both recognition and reverse recognition by a
considerable margin.
XMODEL: An XML-based Morphological Analyzer for Arabic Language Waqas Tariq
Morphological analysis is an essential stage in language engineering applications. For Arabic, this stage is not easy to develop because the language has particularities such as agglutination and a high degree of morphological ambiguity. These factors make the design of a morphological analyzer for Arabic difficult and require many other tools and treatments. The volume of the lexicon is another major problem in the morphological analysis of Arabic, and it directly affects the analysis process. In this paper we present a Morphological Analyzer for Modern Standard Arabic based on the Arabic Morphological Automaton technique and using a new language (XMODEL) to represent Arabic morphological knowledge in an optimal way. Both the Arabic Morphological Analyzer and the Arabic Morphological Automaton are implemented in Java and use XML technology. The Buckwalter Arabic Morphological Analyzer and Xerox Arabic Finite State Morphology are two of the best known morphological analyzers for Modern Standard Arabic, and both are available and documented. Our Morphological Analyzer can be exploited by Natural Language Processing (NLP) applications such as machine translation, orthographical correction, information retrieval, and both syntactic and semantic analyzers. Finally, Xerox and our system are evaluated.
Deterministic Finite State Automaton of Arabic Verb System: A Morphological S...CSCJournals
Finite State Morphology serves as an important tool for investigators of natural language processing. Morphological analysis forms an essential preprocessing step in natural language processing. This paper discusses the morphological analysis and processing of verb forms in Arabic. It focuses on the inflected verb forms and discusses the perfective, imperfective and imperative. The deterministic finite state morphological parser for the verb forms can deal with the morphological and orthographic features of Arabic and with the morphological processes involved in Arabic verb formation and conjugation. We use this model to generate and add all the necessary information (prefix, suffix, stem, etc.) to each morpheme of the words, so we need subtags for each morpheme. We use a finite state tool to build the computational lexicon, which is usually structured as a list of the stems and affixes of the language together with a representation that tells us how words can be structured together and how the network of all forms can be represented.
Hybrid Phonemic and Graphemic Modeling for Arabic Speech Recognition Waqas Tariq
The document summarizes a study that proposes a hybrid approach for acoustic and pronunciation modeling in Arabic speech recognition. It combines phonemic and graphemic modeling techniques. Two baseline speech recognition systems were built using phonemic and graphemic acoustic models. These models were then fused into a hybrid acoustic model. Different hybrid techniques for pronunciation modeling were also proposed and evaluated on a broadcast news speech corpus, showing error rate reductions of 8.8-12.6% over the baselines. The hybrid approach aims to benefit from both vocalized and non-vocalized Arabic resources.
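To make the phonemic/graphemic distinction concrete, the following sketch shows one simple way a pronunciation lexicon could offer both kinds of entries. It is an illustrative assumption, not the fusion technique used in the study: the helper names (`strip_diacritics`, `graphemic_pronunciation`, `hybrid_lexicon`) and the toy phonemic entry are hypothetical. A graphemic pronunciation here is just the word's letter sequence with short-vowel diacritics removed, so it can be derived from non-vocalized text without a phonetic dictionary.

```python
import unicodedata


def strip_diacritics(word):
    """Remove combining marks (Unicode category 'Mn'), which include the
    Arabic short-vowel diacritics such as fatha, damma, kasra and shadda."""
    return "".join(ch for ch in word if unicodedata.category(ch) != "Mn")


def graphemic_pronunciation(word):
    """Graphemic modeling: every remaining letter is its own acoustic unit."""
    return list(strip_diacritics(word))


def hybrid_lexicon(words, phonemic_lexicon):
    """For each word, list a phonemic pronunciation when one is known,
    followed by the graphemic fallback (hypothetical combination scheme)."""
    lexicon = {}
    for word in words:
        variants = []
        if word in phonemic_lexicon:
            variants.append(phonemic_lexicon[word])
        grapheme_units = graphemic_pronunciation(word)
        if grapheme_units not in variants:
            variants.append(grapheme_units)
        lexicon[word] = variants
    return lexicon


# Toy data: one word with a hand-written phonemic entry, one without.
kataba = "\u0643\u062a\u0628"   # non-vocalized form of "kataba"
darasa = "\u062f\u0631\u0633"   # non-vocalized form of "darasa"
toy_phonemic = {kataba: ["k", "a", "t", "a", "b", "a"]}
lexicon = hybrid_lexicon([kataba, darasa], toy_phonemic)
```

In this toy setup the first word receives two pronunciation variants and the second only the graphemic one, which mirrors the idea of benefiting from both vocalized and non-vocalized resources.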
Automatic Phonetization-based Statistical Linguistic Study of Standard Arabic CSCJournals
This document describes an automatic phonetic analysis of Standard Arabic text conducted by researchers from the University of Erlangen-Nuremberg. They compiled a corpus of over 5 million words from Classical and Modern Standard Arabic texts. They developed software to automatically transcribe the text into linguistic units like phonemes, allophones, syllables and allosyllables. After testing the software and achieving over 99% accuracy, they used it to analyze their corpus. Their analysis included identifying the frequencies of linguistic units, determining the best curve equation to model the frequency distributions, and extracting other statistical information about Standard Arabic phonology and phonetics.
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM ijnlc
This article describes the morphological analysis of Standard Arabic for natural language processing, as
part of an electronic dictionary construction phase. A model of fully inflected three-letter verbs is formalized,
based on a linguistic classification, using the NOOJ platform. The classification yields certain representative
verbs that are considered lemmas; these verbs form our dictionary entries and are conjugated
according to our inflection paradigm, relying on specific morphological properties. This dictionary
will serve as an Arabic resource that helps NLP applications and the NOOJ platform to analyse
sophisticated Arabic corpora.
This document discusses speech recognition for Arabic and some of the challenges. It notes that there are differences between Modern Standard Arabic and Egyptian Colloquial Arabic in terms of vocabulary and grammar. It also describes some key differences between the written and spoken forms of Arabic. The document then discusses recent works on Arabic speech recognition, including developing systems for broadcast news transcription and using genetic algorithms and neural network language models to improve accuracy. However, error rates for conversational speech recognition remain high compared to other languages.
Arabic words stemming approach using arabic wordnet IJDKP
The rapid growth of Arabic internet content in recent years has raised the need for effective
stemming techniques for the Arabic language. Arabic stemming algorithms can be grouped into three
categories: root-based approaches (e.g. Khoja), stem-based approaches (e.g. Larkey), and statistical approaches
(e.g. N-gram). However, no stemmer for this language is perfect: the existing stemmers have low
efficiency. In this paper, we introduce a new stemming technique for Arabic words that also solves the
problem of the plural form of irregular nouns in Arabic, known as the broken plural. The
proposed stem extractor provides very accurate results in comparison with other algorithms.
Consequently, search effectiveness is improved.
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...gerogepatton
Many automatic translation works have addressed major European language pairs by taking advantage of large-scale parallel corpora, but very few research works have been conducted on the Amharic-Arabic language pair because of the scarcity of parallel data. Moreover, there is no benchmark parallel Amharic-Arabic text corpus available for the machine translation task. Therefore, a small parallel Quranic text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent Amharic translation text corpora available on Tanzile. Experiments are carried out on two Neural Machine Translation (NMT) models, one based on Long Short-Term Memory (LSTM) and one on Gated Recurrent Units (GRU), using an attention-based encoder-decoder architecture adapted from the open-source OpenNMT system. The LSTM-based and GRU-based NMT models and the Google Translation system are compared, and the LSTM-based OpenNMT model is found to outperform the GRU-based OpenNMT model and the Google Translation system, with BLEU scores of 12%, 11%, and 6% respectively.
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...ijaia
Many automatic translation works have addressed major European language pairs by taking advantage of large-scale parallel corpora, but very few research works have been conducted on the Amharic-Arabic language pair because of the scarcity of parallel data. Moreover, there is no benchmark parallel Amharic-Arabic text corpus available for the machine translation task. Therefore, a small parallel Quranic text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent Amharic translation text corpora available on Tanzile. Experiments are carried out on two Neural Machine Translation (NMT) models, one based on Long Short-Term Memory (LSTM) and one on Gated Recurrent Units (GRU), using an attention-based encoder-decoder architecture adapted from the open-source OpenNMT system. The LSTM-based and GRU-based NMT models and the Google Translation system are compared, and the LSTM-based OpenNMT model is found to outperform the GRU-based OpenNMT model and the Google Translation system, with BLEU scores of 12%, 11%, and 6% respectively.
DEVELOPING A SIMPLIFIED MORPHOLOGICAL ANALYZER FOR ARABIC PRONOMINAL SYSTEM kevig
This paper proposes an improved morphological analyser for the Arabic pronominal system using the finite state method. The main advantage of the finite state method is that it is very flexible, powerful and efficient. The most important result about FSAs relates the class of languages generated by finite state automata to certain closure properties. This result makes the theory of finite-state automata a very versatile and descriptive framework. The main contribution of this work is the full analysis and representation of all the inflections of pronoun forms in Arabic. In this paper we build a finite state network for the inflectional forms of the root words, restricted to the inflections and grammatical properties that generate the dependent and independent forms of pronouns in Arabic. The results show high accuracy in the output, with all the needed linguistic features; the output is evaluated using the f-score, which reaches 80% to 83%. The results also provide evidence that Arabic has strong concatenative word formation.
Segmentation Words for Speech Synthesis in Persian Language Based On Silence paperpublications3
Abstract: In text-to-speech synthesis, words are usually broken into parts, and the recorded sound of each part is used to play the word. This paper uses the silences in a word's pronunciation to improve speech quality. Most algorithms divide words into syllables, and some divide them into phonemes; this paper instead exploits silences in intonation, dividing words at silent regions and assigning the equivalent sound to each part, so that joining the parts is reliable and the speech is smoother. The paper concerns Persian, but the method is extendable to other languages. The method was evaluated with a MOS test, and intelligibility, naturalness and fluidity improved. Keywords: TTS, SBS, Syllable, Diphone.
Title: Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Author: Sohrab Hojjatkhah, Ali Jowharpour
International Journal of Recent Research in Mathematics Computer Science and Information Technology (IJRRMCSIT)
Paper Publications
International Journal on Natural Language Computing (IJNLC) Vol. 4, No.2,Apri...ijnlc
Building interactive dialogue systems has recently gained considerable attention, but most of the resources and systems built so far are tailored to English and other Indo-European languages. The need to design systems for other languages, such as Arabic, is increasing. For this reason, there is growing interest in the Arabic dialogue act classification task, because it is a key player in Arabic language understanding for building such systems. This paper surveys different techniques for dialogue act classification for Arabic. We describe the main existing techniques for utterance segmentation and classification, annotation schemas, and test corpora for Arabic dialogue understanding that have been introduced in the literature.
Research Inventy : International Journal of Engineering and Science is published by the group of young academic and industrial researchers with 12 Issues per year. It is an online as well as print version open access journal that provides rapid publication (monthly) of articles in all areas of the subject such as: civil, mechanical, chemical, electronic and computer engineering as well as production and information technology. The Journal welcomes the submission of manuscripts that meet the general criteria of significance and scientific excellence. Papers will be published by rapid process within 20 days after acceptance and peer review process takes only 7 days. All articles published in Research Inventy will be peer-reviewed.
High Quality Arabic Concatenative Speech Synthesis sipij
This paper describes the implementation of TD-PSOLA tools to improve the quality of an Arabic text-to-speech (TTS) system. The system is based on diphone concatenation with a TD-PSOLA modifier synthesizer. The paper describes techniques to improve the precision of prosodic modifications in Arabic speech synthesis using the TD-PSOLA (Time Domain Pitch Synchronous Overlap-Add) method. This approach is based on the decomposition of the signal into overlapping frames synchronized with the pitch period. The main objective is to preserve the consistency and accuracy of the pitch marks after prosodic modifications of the speech signal, together with adjustment and optimisation of a diphone database with integrated vowels.
Recent approaches to arabic dialogue acts classifications csandit
Building Arabic dialogue systems (spoken or written) has gained increasing interest in the last few years. For this reason, there is growing interest in the Arabic dialogue act classification task, because it is a key player in Arabic language understanding for building such systems. This paper describes the results of recent approaches to Arabic dialogue act classification and covers Arabic dialogue act corpora, annotation schemas, utterance segmentation, and classification tasks.
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
In this work a new Bangla speech corpus along with proper transcriptions has been developed; also
various acoustic feature extraction methods have been investigated using Long Short-Term Memory
(LSTM) neural network to find their effective integration into a state-of-the-art Bangla speech recognition
system. The acoustic features are usually a sequence of representative vectors that are extracted from
speech signals and the classes are either words or sub word units such as phonemes. The most commonly
used feature extraction method, known as linear predictive coding (LPC), has been used first in this work.
Then the other two popular methods, namely, the Mel frequency cepstral coefficients (MFCC) and
perceptual linear prediction (PLP) have also been applied. These methods are based on the models of the
human auditory system. A detailed review of the implementation of these methods has been described
first. Then the steps of the implementation have been elaborated for the development of an automatic
speech recognition system (ASR) for Bangla speech.
PERFORMANCE ANALYSIS OF DIFFERENT ACOUSTIC FEATURES BASED ON LSTM FOR BANGLA ...ijma
This document discusses performance analysis of different acoustic features for Bangla speech recognition using LSTM neural networks. It develops a Bangla speech corpus and extracts linear predictive coding (LPC), Mel frequency cepstral coefficients (MFCC), and perceptual linear prediction (PLP) acoustic features from the corpus. The features are then used to train LSTM models for Bangla speech recognition and their performance is evaluated based on sentence correct rates on test data sets consisting of male and female speakers.
Exploring Twitter as a Source of an Arabic Dialect Corpus CSCJournals
Given the lack of Arabic dialect text corpora in comparison with what is available for dialects of English and other languages, there is a need to create dialect text corpora for use in Arabic natural language processing. Moreover, Arabic dialects are increasingly used in social media, so this text is now considered an appropriate source for a corpus. We collected 210,915K tweets from five groups of Arabic dialects: Gulf, Iraqi, Egyptian, Levantine, and North African. This paper explores Twitter as a source and describes the methods that we used to extract tweets and classify them according to the geographic location of the sender. We classified Arabic dialects using the Waikato Environment for Knowledge Analysis (WEKA) data analytics tool, which contains many alternative filters and classifiers for machine learning. Our approach to classifying tweets achieved an accuracy of 79%.
A new framework based on KNN and DT for speech identification through emphat...nooriasukmaningtyas
This document presents a new framework that combines K-nearest neighbors (KNN) and decision trees (DT) for speech identification through emphatic letters in the Moroccan dialect of Arabic. The framework first uses KNN and DT individually to predict the gender of the speaker and the emphatic letter and diacritic pronounced. It then uses these predictions as additional features to improve the overall prediction of the sound content, achieving an accuracy of 71.43% - a 12.1% improvement over directly applying the classifiers. The study examines 720 speech samples from 12 speakers and evaluates the performance of hidden markov models, DT, and KNN applied individually, finding that KNN best recognizes diacritics while DT performs best for gender
Stemming is the process of reducing words to their stems or roots. Due to the morphological richness and
complexity of the Arabic language, stemming is an essential part of most Natural Language Processing
(NLP) tasks for this language. In this paper, we study the impact of different stemming approaches on the
Named Entity Recognition (NER) task for Arabic and explore the merits, limitations and differences
between light stemming and root-extraction methods. Our experiments are evaluated on the standard
ANERCorp dataset as well as the AQMAR Arabic Wikipedia Named Entity Corpus.
Hybrid Phonemic and Graphemic Modeling for Arabic Speech RecognitionWaqas Tariq
The document summarizes a study that proposes a hybrid approach for acoustic and pronunciation modeling in Arabic speech recognition. It combines phonemic and graphemic modeling techniques. Two baseline speech recognition systems were built using phonemic and graphemic acoustic models. These models were then fused into a hybrid acoustic model. Different hybrid techniques for pronunciation modeling were also proposed and evaluated on a broadcast news speech corpus, showing error rate reductions of 8.8-12.6% over the baselines. The hybrid approach aims to benefit from both vocalized and non-vocalized Arabic resources.
Automatic Phonetization-based Statistical Linguistic Study of Standard ArabicCSCJournals
This document describes an automatic phonetic analysis of Standard Arabic text conducted by researchers from the University of Erlangen-Nuremberg. They compiled a corpus of over 5 million words from Classical and Modern Standard Arabic texts. They developed software to automatically transcribe the text into linguistic units like phonemes, allophones, syllables and allosyllables. After testing the software and achieving over 99% accuracy, they used it to analyze their corpus. Their analysis included identifying the frequencies of linguistic units, determining the best curve equation to model the frequency distributions, and extracting other statistical information about Standard Arabic phonology and phonetics.
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORMijnlc
This article describes the morphological analysis of a standard Arabic natural language processing, as a
part of an electronic dictionary-constricting phase. A fully 3-lettered inflected verbs model are formalized
based on a linguistic classification, using NOOJ platform, the classification gives certain representative
verbs that will considered as lemmas, this verbs form our dictionary entries, they are also conjugated
according to our inflection paradigm relying on certain specific morphological properties. This dictionary
will be considered as an Arabic resource, which will help NLP applications and NOOJ platform to analyse
sophisticated Arabic corpora.
This document discusses speech recognition for Arabic and some of the challenges. It notes that there are differences between Modern Standard Arabic and Egyptian Colloquial Arabic in terms of vocabulary and grammar. It also describes some key differences between the written and spoken forms of Arabic. The document then discusses recent works on Arabic speech recognition, including developing systems for broadcast news transcription and using genetic algorithms and neural network language models to improve accuracy. However, error rates for conversational speech recognition remain high compared to other languages.
Arabic words stemming approach using arabic wordnetIJDKP
The big growth of the Arabic internet content in the last years has raised up the need for an effective
stemming techniques for Arabic language. Arabic stemming algorithms can be ranked, according to three
category, as root-based approach (ex. Khoja); stem-based approach (ex. Larkey); and statistical approach
(ex. N-Garm). However, no stemming of this language is perfect: The existing stemmers have a low
efficiency. In this paper, we introduce a new stemming technique for Arabic words that also solve the
problem of the plural form of irregular nouns in Arabic language, which called broken plural. The
proposed stem extractor provides very accurate results in comparisons with other algorithms.
Consequently the search effectiveness improved.
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...gerogepatton
Many automatic translation works have been addressed between major European language pairs, by taking advantage of large scale parallel corpora, but very few research works are conducted on the Amharic-Arabic language pair due to its parallel data scarcity. However, there is no benchmark parallel Amharic-Arabic text corpora available for Machine Translation task. Therefore, a small parallel Quranic text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent translation of Amharic language text corpora available on Tanzile. Experiments are carried out on Two Long ShortTerm Memory (LSTM) and Gated Recurrent Units (GRU) based Neural Machine Translation (NMT) using Attention-based Encoder-Decoder architecture which is adapted from the open-source OpenNMT system. LSTM and GRU based NMT models and Google Translation system are compared and found that LSTM based OpenNMT outperforms GRU based OpenNMT and Google Translation system, with a BLEU score of 12%, 11%, and 6% respectively.
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...gerogepatton
Many automatic translation works have been addressed between major European language pairs, by
taking advantage of large scale parallel corpora, but very few research works are conducted on the
Amharic-Arabic language pair due to its parallel data scarcity. However, there is no benchmark parallel
Amharic-Arabic text corpora available for Machine Translation task. Therefore, a small parallel Quranic
text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent translation
of Amharic language text corpora available on Tanzile. Experiments are carried out on Two Long ShortTerm Memory (LSTM) and Gated Recurrent Units (GRU) based Neural Machine Translation (NMT) using
Attention-based Encoder-Decoder architecture which is adapted from the open-source OpenNMT system.
LSTM and GRU based NMT models and Google Translation system are compared and found that LSTM
based OpenNMT outperforms GRU based OpenNMT and Google Translation system, with a BLEU score
of 12%, 11%, and 6% respectively.
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...ijaia
Many automatic translation works have been addressed between major European language pairs, by taking advantage of large scale parallel corpora, but very few research works are conducted on the Amharic-Arabic language pair due to its parallel data scarcity. However, there is no benchmark parallel Amharic-Arabic text corpora available for Machine Translation task. Therefore, a small parallel Quranic text corpus is constructed by modifying the existing monolingual Arabic text and its equivalent translation of Amharic language text corpora available on Tanzile. Experiments are carried out on Two Long ShortTerm Memory (LSTM) and Gated Recurrent Units (GRU) based Neural Machine Translation (NMT) using Attention-based Encoder-Decoder architecture which is adapted from the open-source OpenNMT system. LSTM and GRU based NMT models and Google Translation system are compared and found that LSTM based OpenNMT outperforms GRU based OpenNMT and Google Translation system, with a BLEU score of 12%, 11%, and 6% respectively
DEVELOPING A SIMPLIFIED MORPHOLOGICAL ANALYZER FOR ARABIC PRONOMINAL SYSTEMkevig
This paper proposes an improved morphological analyser for Arabic pronominal system using finite state method. The main advantage of the finite state method is very flexible, powerful and efficient. The most important results about FSAs, relates the class of languages generated by finite state automaton to certain closure properties. This result makes the theory of finite-state automata a very versatile and descriptive framework. The main contribution of this work is the full analysis and the representation of morphological analysis of all the inflections of pronoun forms in Arabic. In this paper we build a finite state network for the inflectional forms of the root words, restricted to all the inflections and grammatical properties of generating the dependent and independent forms of pronouns in Arabic language. The results show high score of accuracy in the output with all the needed linguistic features and the evaluation process of output is conducted using f-score test and the achievement is at the rate of 80% to 83%. The results from the study also provide the evidence that Arabic has strong concatenative word formations.
Segmentation Words for Speech Synthesis in Persian Language Based On Silencepaperpublications3
Abstract: In speech synthesis in text to speech systems, the words usually break to different parts and use from recorded sound of each part for play words. This paper use silent in word's pronunciation for better quality of speech. Most algorithms divide words to syllable and some of them divide words to phoneme, but This paper benefit from silent in intonation and divide words at silent region and then set equivalent sound of each parts whereupon joining the parts is trusty and speech quality being more smooth . this paper concern Persian language but extendable to another language. This method has been tested with MOS test and intelligibility, naturalness and fluidity are better.Keywords:TTS, SBS, Sillable, Diphone.
Title:Segmentation Words for Speech Synthesis in Persian Language Based On Silence
Author:Sohrab Hojjatkhah, Ali Jowharpour
International Journal of Recent Research in Mathematics Computer Science and Information Technology (IJRRMCSIT)
Paper Publications
1. Mohamed Khalil Krichi & Cherif Adnan
Signal Processing: An International Journal (SPIJ), Volume (8) : Issue (2) : 2014 10
The Arabic Speech Database:
PADAS
Mohamed Khalil Krichi krichi_mohal@yahoo.fr
Faculty of Sciences of Tunis/ Laboratory of
Signal Processing/ Physics Department
University of Tunis-Manar
TUNIS, 1060, TUNISIA
Cherif Adnan adnen2fr@yahoo.fr
Faculty of Sciences of Tunis/ Laboratory of
Signal Processing/ Physics Department
University of Tunis-Manar
TUNIS, 1060, TUNISIA
Abstract
This work describes the construction of PADAS (“Phonetics Arabic Database Automatically Segmented”),
based on a data-driven Markov process. A segmented database is necessary for both speech
synthesis and speech recognition. Manual segmentation is precise but inconsistent, since it is
often produced by more than one labeler, and it costs time and money. The MAUS segmentation
and labeling system exists for German and other languages but not for Arabic, so MAUS has to be
adapted in order to establish a segmental database for Arabic. The speech corpus contains a
total of 600 sentences recorded by 3 native Arabic speakers from Tunisia (2 male and 1 female),
200 sentences each.
Keywords: HTK, MAUS, Phonetic Database, Automatic Segmentation.
1. INTRODUCTION
Much research in fields such as automatic speech recognition and speech synthesis is now based
on databases, e.g. for English [1, 2, 3, 4]. To obtain good results, the database must be
balanced and segmented, and the noise introduced during recording must be reduced. In order to
build a robust speaker-independent continuous Arabic speech recognizer, a set of speech
recordings that is rich and balanced is required. It is rich in the sense that it must contain
all the phonemes of the Arabic language, and balanced in that it must preserve the phonetic
distribution of the language. This set of recordings must be based on a proper written set of
sentences and phrases created by experts. It is therefore crucial to create a high-quality
written (text) set of sentences and phrases before recording them. Any work that involves a
learning step requires a database to train the system and then evaluate it. There are several
international speech databases, such as TIMIT, which was developed by the DARPA committee for
American English. Other databases exist for widely studied languages, such as French and
German, and for less studied ones, such as Vietnamese and Turkish.
For Arabic, we have not found a standard database, only a few references, among them the KACST
database [5], developed by the King Abdulaziz City for Science and Technology in Saudi Arabia.
1.1 KACST
KACST created a database of Arabic language sounds in 1997. The aim was to create the smallest
possible set of phonetically rich Arabic words. The result was a list of 663 phonetically rich
words containing all Arabic phonemes.
This list is intended for Arabic ASR and text-to-speech synthesis applications.
In 2003 KACST produced a technical report for the project “Database of Arabic Sounds:
Sentences”. The sentences of this database were written using the 663 phonetically rich words.
The database consists of 367 sentences of 2 to 9 words each.
The purpose was to produce Arabic phrases and sentences that are phonetically rich and balanced,
based on the previously created list of 663 phonetically rich words [6].
1.2 ALGASD
The ALGerian Arabic Speech Database (ALGASD) [7] was developed for processing Algerian Arabic
speech, taking into account the different accents of the country's regions.
The unavailability of audio database resources prompted us to build our own database for
recognizing the numbers and operations of a standard calculator in Arabic for a single user.
We made 27 recordings of 28 vocabulary words.
A database is the most important tool in many domains, such as speech synthesis and speech
recognition. To be useful, a database must contain all the acoustic units in all possible
linguistic combinations. The quality of the final synthesis result depends directly on the
quality of the recordings made during the development of the acoustic units; a filtering step
for the dictionary is therefore mandatory.
The implementation stages can be summarized as follows:
a) The choice of dictionary (a set of sentences containing several examples of each phoneme).
b) Recording of the sentences.
c) Noise reduction.
d) Segmentation.
2. ARABIC LANGUAGE
Statistics show that Arabic is the first language (mother tongue) of 206 million native
speakers, which ranks it fourth after Mandarin, Spanish and English [8]. Arabic is a
derivational and inflectional language. Original Arabic is the language spoken by the Arabs; it
is also the sacred language of the Koran and of Islam. With the spread of Islam and of the
Qur'an, it became a liturgical language. It is spoken in 22 countries by more than 280 million
speakers [9].
2.1 Alphabet
2.1.1 Consonants
The Arabic alphabet consists of twenty-eight basic consonants (see Table 1), although some
authors treat the letter alif as a twenty-ninth consonant. The alif behaves as a long vowel and
is never found as a consonant of the root.
Unlike consonants, vowels are rarely written. They are noted only to resolve ambiguities, as in
editions of the Koran or in academic literature. Vowels nevertheless play an important role in
Arabic words, not only because they remove ambiguity but also because they indicate the
grammatical function of a word regardless of its position in the sentence. In other words,
vowels have a dual function: one morphological or semantic, the other syntactic. Arabic has two
sets of vowels, short and long.
2.1.2 Short Vowels
The short vowels (ُ ,ِ ,َ ) are written above or below consonants. A consonant that carries no
vowel is marked by the silent vowel sign ( ْ ).
2.1.3 Long Vowels
Long vowels are formed by a short vowel followed by one of the letters (ﻱ , ﺍ , ﻭ).
2.2 The Diacritics
Short vowels are represented by symbols called diacritics (see Figure 5). There are three of
them, transcribed as follows:
• The fetha [a] is a small stroke above the consonant (َﻡ / ma /)
• The damma [u] is a hook above the consonant (ُﻡ / mu /)
• The kasra [i] is a small stroke below the consonant (ِﻡ / mi /)
• The sukun, a small circle, is placed on a consonant that is not followed by any vowel.
2.2.1 The Tanwin
The tanwin sign is added to the end of indefinite words. It is mutually exclusive with the
definite article placed at the beginning of a word. There are three tanwin symbols, formed by
doubling the diacritics above, which phonetically adds the phoneme / n /:
[an]: (ًﻉ / AIan /)
[un]: (ٌﻉ / AIun /)
[in]: (ٍﻉ / AIin /)
2.2.2 The Chadda
The chadda sign can be placed over any consonant that is not in initial position. The consonant
that carries it is analyzed as a sequence of two identical consonants:
FIGURE 1: Example of a sentence / jalAIabuuna limuddati saAItin / ("They play for an hour").
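The diacritic rules of Section 2.2 can be sketched in a few lines of Python. This is only a minimal illustration, not the transcription procedure used for PADAS: the consonant map is a small hypothetical subset of Table 1, the function name is invented for the example, and unknown characters are simply ignored.

```python
# Hypothetical, partial consonant map (Arabic letter -> SAMPA-like symbol).
CONSONANTS = {"\u0645": "m", "\u0628": "b", "\u062F": "d", "\u0644": "l"}
SHORT_VOWELS = {"\u064E": "a", "\u064F": "u", "\u0650": "i"}  # fetha, damma, kasra
TANWIN = {"\u064B": "an", "\u064C": "un", "\u064D": "in"}     # doubled diacritics add /n/
CHADDA = "\u0651"                                             # doubles the consonant
# The sukun (U+0652) marks the absence of a vowel, so it adds no symbol.


def diacritics_to_symbols(text):
    """Turn a fully diacritized string into a flat symbol string."""
    out = []
    last_consonant = None  # index in `out` of the most recent consonant
    for ch in text:
        if ch in CONSONANTS:
            out.append(CONSONANTS[ch])
            last_consonant = len(out) - 1
        elif ch in SHORT_VOWELS:
            out.append(SHORT_VOWELS[ch])
        elif ch in TANWIN:
            out.append(TANWIN[ch])
        elif ch == CHADDA and last_consonant is not None:
            # Insert the copy right after the consonant, so the result is the
            # same whether the chadda is stored before or after the vowel sign.
            out.insert(last_consonant + 1, out[last_consonant])
        # sukun and characters outside the maps produce nothing
    return "".join(out)
```

For instance, a meem with fetha gives "ma", and a consonant carrying a chadda followed by fetha yields a doubled consonant before the vowel.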
TABLE 1: Arabic consonants and vowels and their SAMPA codes.
3. BALANCED SELECTION OF ARABIC WORDS
The syllabic structures of Arabic are limited in number and easily detectable. Every Arabic
syllable begins with a consonant followed by a vowel, which is called the nucleus of the
syllable. Short vowels are denoted by (V) and long vowels by (VV). Because the vowel always
occupies the second position of the syllable, syllabification is straightforward. Arabic
syllables can be classified either by their length or by their ending. Short syllables occur
only in the form CV, which is open because it ends with a vowel. Medium syllables are either
open (CVV) or closed (CVC). Long syllables have two closed forms, CVVC and CVCC. Arabic words
are composed of at least one syllable; most contain two or more. The longest word consists of
five syllables. Table 2 illustrates the Arabic syllable types. Some Arabic words are written
together, forming new long words of 6 syllables such as (َﻪَﻧﻭُﻠُﻛْﺄَﻳ), or 7 syllables such as
(ُﻪَﻧﻭُﻠِﺑْﻘَﺗْﺳَﻳ). Few Arabic data sets are suitable for HMM-based synthesis, which ideally
requires a large amount of Arabic speech from a single speaker together with the corresponding
phonetic transcriptions.
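The classification of syllables by length and by ending can be illustrated with a short Python sketch. It assumes a romanized syllable in which the vowels are written a, i, u and a long vowel is written as a doubled letter; the function names and this romanization are assumptions made for the example, not part of the paper.

```python
VOWELS = set("aiu")

# Length classes as described in the text above.
LENGTH_BY_PATTERN = {
    "CV": "short",
    "CVV": "medium",
    "CVC": "medium",
    "CVVC": "long",
    "CVCC": "long",
}


def cv_pattern(syllable):
    """Map each romanized character to C (consonant) or V (vowel)."""
    return "".join("V" if ch in VOWELS else "C" for ch in syllable)


def classify(syllable):
    """Return (length class, 'open' or 'closed') for one syllable."""
    pattern = cv_pattern(syllable)
    if pattern not in LENGTH_BY_PATTERN:
        raise ValueError(f"unsupported syllable pattern: {pattern}")
    ending = "open" if pattern.endswith("V") else "closed"
    return LENGTH_BY_PATTERN[pattern], ending
```

With this romanization, "li" is a short open syllable, "fii" a medium open one, "qul" a medium closed one, and "bahr" a long closed one.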
TABLE 2: Arabic Syllable Types.
3.1 Corpus Description
Creating a phonetically rich and balanced text corpus requires selecting a set of phonetically
rich words, which are combined to produce sentences and phrases. These sentences and phrases
are checked for a balanced phonetic distribution; some of them may be deleted or replaced by
others in order to achieve an adequate phonetic distribution [10]. The corpus used to build our
database is composed of 200 sentences, with an average of 5 words per sentence. These sentences
contain 1000 words, 2600 syllables and 7445 phonemes, including 2302 short and long vowels. The
sentences were read at an average speed (10 to 12 phonemes/second) by Tunisian speakers, two
male and one female. They were sampled at 16 kHz with 16 bits per sample.

Grapheme  Symbol    Grapheme  Symbol    Grapheme  Symbol    Grapheme  Symbol
ء         ?         ﺭ         r         ﻍ         g         ﻱ         j
ﺏ         b         ﺯ         z         ﻑ         f         َ◌         a
ﺕ         t         ﺱ         s         ﻕ         q         ً◌         a:
ﺙ         T         ﺵ         S         ﻙ         k         ِ◌         i
ﺝ         Z         ﺹ         s’        ﻝ         l         ٍ◌         i:
ﺡ         X         ﺽ         d’        ﻡ         m         ْ◌         u
ﺥ         x         ﻁ         t’        ﻥ         n         ُ◌         u:
ﺩ         d         ﻅ         D’        ﻩ         h
ﺫ         D         ﻉ         ?’        ﻭ         w

Syllable  Arabic example    English meaning
cv        ِﻝ li             to
cvv       ِﻓﻰ fii            in
cvc       ْﻝُﻗ qul            say
cvcc      ْﺭَْﺣﺑ bahr          sea
cvvc      ْﻝﺎَﻣ maAl          money
cvvcc     ّﺭﺍَﺯ zaArr         visit
3.2 Corpus Analysis
We carried out a statistical study of our corpus; Table 3 shows the results. We note the
following:
• The short vowel [a] and the long vowel [a:] appear with a frequency of 17%, followed by the
vowels [i] and [i:] with an occurrence frequency of 14.3%. The vowels [u] and [u:] represent 7%.
• Vowels (short and long) account for about 37% of occurrences.
• The most frequent Arabic consonants are [?] (15%), [n] (6.66%), [l] (6.63%), [m] (5.59%), etc.
Consonant or vowel   Phoneme repetitions   %
?      523    13.34%
b      102    2.60%
t      92     2.35%
T      70     1.79%
x      19     0.48%
X      20     0.51%
G      35     0.89%
d      39     0.99%
D      40     1.02%
r      102    2.60%
z      35     0.89%
s      48     1.22%
S      73     1.86%
s’     18     0.46%
d’     24     0.61%
t’     19     0.48%
D’     24     0.61%
?’     61     1.56%
g      23     0.59%
f      61     1.56%
q      61     1.56%
k      80     2.04%
l      260    6.63%
m      219    5.59%
n      261    6.66%
h      123    3.14%
w      51     1.30%
j      80     2.04%
a      400    10.20%
a:     254    6.48%
i      400    10.20%
i:     28     0.71%
u      252    6.43%
u:     23     0.59%
total  3920   100%
TABLE 3: Occurrence Frequency (%) of Arabic Consonants and Vowels.
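The percentages in Table 3 are relative frequencies, i.e. each phoneme count divided by the total number of counted phonemes. A minimal Python sketch of that computation follows; the counts are copied from a few rows of the table, while the function and variable names are chosen only for this illustration.

```python
# A few counts taken from Table 3; TOTAL is the table's "total" row.
COUNTS = {"?": 523, "n": 261, "l": 260, "m": 219, "a": 400, "a:": 254}
TOTAL = 3920


def relative_frequency(count, total):
    """Occurrence frequency in percent, rounded to two decimals."""
    return round(100.0 * count / total, 2)


def frequency_table(counts, total):
    """Return {phoneme: percent}, most frequent phoneme first."""
    ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {phoneme: relative_frequency(count, total) for phoneme, count in ordered}
```

Summing the counts of several phonemes before dividing gives the share of a whole class, for example the short and long variants of one vowel.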
3.3 Noise Reduction
Degradation of signals by noise is an omnipresent problem [11], and noise removal is a key
problem in almost all fields of signal processing. The wavelet transform is notable for its
great variety of types and modifications: a whole host of different scaling and wavelet
functions (or scaling and wavelet coefficients) provide plenty of possible adjustments and
tuning parameters [12]. Our audio recordings contained a continuous background noise, and our
goal is to reduce this undesirable component. Figures 2 and 3 show the time signal of one audio
file before and after filtering. Note in particular that the highlighted zone of silence is
closer to zero in the filtered signal than in the original signal.
FIGURE 2: Example for Original Speech of Database File.
FIGURE 3: Example for the Same Speech Denoising.
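The text does not state which wavelet family, decomposition depth or threshold was used for the denoising. Purely as an illustration of the general idea of wavelet denoising, the following Python sketch assumes a single-level Haar transform with soft thresholding of the detail coefficients; all names and parameter values are assumptions for the example.

```python
import math

SQRT2 = math.sqrt(2.0)


def haar_forward(signal):
    """One-level Haar transform of an even-length sequence."""
    approx = [(signal[i] + signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / SQRT2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_inverse(approx, detail):
    """Rebuild the signal from approximation and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) / SQRT2)
        out.append((a - d) / SQRT2)
    return out


def soft_threshold(value, threshold):
    """Shrink a coefficient towards zero by `threshold`."""
    return math.copysign(max(abs(value) - threshold, 0.0), value)


def denoise(signal, threshold):
    """Shrink the detail coefficients, then reconstruct the signal."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    approx, detail = haar_forward(signal)
    detail = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, detail)
```

With a threshold of zero the signal is reconstructed unchanged; a larger threshold removes small sample-to-sample fluctuations such as low-level background noise in a silent zone.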
3.4 Automatic Segmentation
Nowadays, practical applications of automatic S&L are implemented as a statistical search
for an S&L $\hat{k}$ in a space $\Psi$ of all possible S&Ls, which can be formulated as [13,14]:

$$\hat{k} = \arg\max_{k \in \Psi} P(k \mid O) = \arg\max_{k \in \Psi} \frac{P(k)\, p(O \mid k)}{P(O)} \qquad (1)$$

where $O$ is the acoustic observation on the corresponding speech signal. The MAUS system
models $P(k)$ as a graph for each recording $O$. Each path from the start node to the end node
represents a possible $k \in \Psi$ and accumulates to the probability $P(k)\, p(O \mid k)$,
which is determined by HMMs for each phonemic segment. A simple Viterbi search through the
graph yields the maximal $P(k)\, p(O \mid k)$.
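To make the Viterbi search concrete, the following is a minimal sketch under strong simplifying assumptions: the graph is reduced to a single left-to-right chain of phonemes (so the prior P(k) is constant and is ignored), each phoneme covers at least one frame, and the per-frame log-likelihoods log p(o_t | phoneme) are invented numbers rather than HMM outputs. It is not the MAUS implementation, and the names `viterbi_align` and `path_to_segments` are hypothetical.

```python
NEG_INF = float("-inf")


def viterbi_align(log_likelihoods):
    """Align T frames to N phonemes that must occur in the given order.

    log_likelihoods[t][j] is log p(o_t | phoneme j). Requires T >= N.
    Returns the phoneme index assigned to each frame.
    """
    n_frames = len(log_likelihoods)
    n_phones = len(log_likelihoods[0])
    score = [[NEG_INF] * n_phones for _ in range(n_frames)]
    came_from = [[0] * n_phones for _ in range(n_frames)]
    score[0][0] = log_likelihoods[0][0]  # the path must start in the first phoneme

    for t in range(1, n_frames):
        for j in range(n_phones):
            stay = score[t - 1][j]
            advance = score[t - 1][j - 1] if j > 0 else NEG_INF
            if advance > stay:
                best, prev = advance, j - 1
            else:
                best, prev = stay, j
            came_from[t][j] = prev
            if best > NEG_INF:
                score[t][j] = best + log_likelihoods[t][j]

    # Backtrack from the last phoneme at the last frame.
    path = [n_phones - 1]
    for t in range(n_frames - 1, 0, -1):
        path.append(came_from[t][path[-1]])
    path.reverse()
    return path


def path_to_segments(path, labels):
    """Collapse a frame-level path into (label, start_frame, end_frame) tuples."""
    segments = []
    start = 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((labels[path[start]], start, t))
            start = t
    return segments


# Invented log-likelihoods for 6 frames and the phoneme chain k, a, t.
frames = [
    [-1, -5, -5],
    [-1, -5, -5],
    [-5, -1, -5],
    [-5, -1, -5],
    [-5, -1, -5],
    [-5, -5, -1],
]
path = viterbi_align(frames)
segments = path_to_segments(path, ["k", "a", "t"])
print(segments)
```

Multiplying the frame indices by the frame shift of the acoustic front end (a value not given in the paper) would turn these segments into start and end times.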
The ’Munich Automatic Segmentation’ (MAUS) system was developed by the Department of
Phonetics, University of Munich. For more details about the MAUS method, refer to [15], [16]
and [17]. Its purpose is to analyze a spoken utterance: the input is a speech wave and some
orthographic form of the spoken text. The text is parsed into a chain of single words (punctuation
marks are stripped) and passed to a text-to-phoneme algorithm, which is either rule-based or a
combination of lexicon lookup and fallback to the rule-based system.
FIGURE 4: Processing in MAUS.
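The text processing step described above (strip punctuation, split into words, look each word up in a lexicon, fall back to rules) can be sketched as follows. The lexicon entry, the romanized input and the one-symbol-per-letter fallback are all placeholders invented for illustration; real letter-to-sound rules for Arabic are considerably richer, and the function names are hypothetical.

```python
import string

# Hypothetical toy lexicon: romanized word -> list of phoneme symbols.
LEXICON = {
    "kataba": ["k", "a", "t", "a", "b", "a"],
}


def tokenize(text):
    """Strip punctuation marks and split the text into single words.

    string.punctuation only covers ASCII marks; Arabic punctuation would
    have to be added for real input.
    """
    table = str.maketrans("", "", string.punctuation)
    return text.translate(table).split()


def rule_based(word):
    """Placeholder fallback: emit one symbol per letter."""
    return list(word)


def text_to_phonemes(text):
    """Return one phoneme list per word: lexicon first, rules as fallback."""
    result = []
    for word in tokenize(text):
        if word in LEXICON:
            result.append(LEXICON[word])
        else:
            result.append(rule_based(word))
    return result


phonemes = text_to_phonemes("kataba, darasa.")
print(phonemes)
```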
3.5 Corpus Segmentation and Labeling
Our continuous speech corpus was segmented and labeled with the automatic procedure “MAUS”.
This software requires as input a "wav" file of the sentence to be segmented and a text
file containing the phonetic transcription of the same sentence, namely a phonetic file (.par).
This file consists of the list of the sentence's phonemes with their prosodic characteristics.
FIGURE 5: Example MAUS segmentation and labeling taken from the Arabic corpus with SAMPA code.
4. RESULT AND EVALUATION
4.1 Result
A database is a collection of accumulated documents. Our database is defined as follows:
for each sentence of the corpus there is an audio file (.wav) and four transcription files
(txt, word, phn, textGrid), which contain respectively:
• The text of the marked sentence (.txt);
• The associated time-aligned word transcription (.word);
• The associated time-aligned phonetic transcription (.phn);
• The temporal description of each phoneme, i.e. its start time and end time (.textGrid).
FIGURE 6: Temporal description of each phoneme; start time and end time.
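As an illustration of how the start and end times of each phoneme could be read back from such a file, the sketch below assumes that the .textGrid files follow Praat's long text format with one interval tier. The sample content, the tier name and the times are invented, and the regular expression is a simplification that ignores escaped quotes inside labels.

```python
import re

# Invented sample in Praat's long TextGrid text format.
SAMPLE_TEXTGRID = '''File type = "ooTextFile"
Object class = "TextGrid"

xmin = 0
xmax = 0.30
tiers? <exists>
size = 1
item []:
    item [1]:
        class = "IntervalTier"
        name = "phonemes"
        xmin = 0
        xmax = 0.30
        intervals: size = 3
        intervals [1]:
            xmin = 0
            xmax = 0.08
            text = "k"
        intervals [2]:
            xmin = 0.08
            xmax = 0.21
            text = "a"
        intervals [3]:
            xmin = 0.21
            xmax = 0.30
            text = "t"
'''

# One interval: its header line followed by xmin, xmax and the label.
INTERVAL_PATTERN = re.compile(
    r'intervals \[\d+\]:\s*'
    r'xmin = ([0-9.eE+-]+)\s*'
    r'xmax = ([0-9.eE+-]+)\s*'
    r'text = "([^"]*)"'
)


def read_intervals(textgrid_text):
    """Return (label, start_time, end_time) for every interval found."""
    return [
        (label, float(start), float(end))
        for start, end, label in INTERVAL_PATTERN.findall(textgrid_text)
    ]


intervals = read_intervals(SAMPLE_TEXTGRID)
for label, start, end in intervals:
    print(f"{label}\t{start:.2f}\t{end:.2f}")
```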
4.2 Evaluation
In total, 600 sentences were segmented: 400 sentences for the two male speakers (200
sentences each) and 200 sentences for the third speaker (female). From each set of 200
segmented sentences, we randomly selected 10 sentences to be segmented manually. This task
was carried out by 6 students in our research laboratory, two for each set of 10 sentences.
The results are summarized in the following table:
Speaker                Manual segmentation    Automatic segmentation
First male speaker     99%                    94%
Second male speaker    99%                    94.4%
Female speaker         99%                    94.1%
TABLE 4: Evaluation Result.
5. CONCLUSION
This paper reports our work towards developing PADAS, the «Phonetic Arabic Database
Automatically Segmented», based on a phonetically rich and balanced speech corpus that is
automatically segmented with the MAUS system. This work includes creating the phonetically
rich and balanced speech corpus, building an Arabic phonetic dictionary, reducing noise with a
wavelet method, and evaluating the automatic segmentation. The current release of our database
contains 1 female and 2 male voices. The purpose of this work is to build a database that can
be used in all areas of speech processing. This variety is useful in speech synthesis and
speech recognition.
6. REFERENCES
[1] A. Black and K. Tokuda, “The Blizzard Challenge: Evaluating Corpus-Based Speech
Synthesis on Common Datasets,” in Proceedings of Interspeech, Portugal, pp. 77-80, 2005.
[2] S. D’Arcy and M. Russell, “Experiments with the ABI (Accents of the British Isles) Speech
Corpus,” in Proceedings of Interspeech 08, Australia, pp. 293-296, 2008.
[3] J. Garofolo, L. Lamel, W. Fisher, J. Fiscus, D. Pallett, N. Dahlgren, and V. Zue, “TIMIT
Acoustic-Phonetic Continuous Speech Corpus,” Technical Document, Trustees of the
University of Pennsylvania, Philadelphia, 1993.
[4] K. Tokuda, H. Zen, and A.W. Black, “An HMM-based speech synthesis system applied to
English,” in IEEE Speech Synthesis Workshop, 2002.
[5] M. Alghamdi, A. Alhamid, and M. Aldasuqi, “Database of Arabic Sounds: Sentences,”
Technical Report, Saudi Arabia, 2003.
[6] M.A. Mansour, “KACST Arabic Phonetics Database,” Riyadh, Kingdom of Saudi Arabia, 2004.
[7] G. Droua-Hamdani, “Algerian Arabic Speech Database (ALGASD),” December 2010.
[8] R. Gordon, “Ethnologue: Languages of the World,” SIL International, Dallas, Texas, 2005.
[9] A. Omar, “Dirasat Al-Swat Al-Lugawi,” Cairo: Alam Al-Kutub, 1985.
[10] L. Pineda, M. Gómez, D. Vaufreydaz, and J. Serignat, “Experiments on the Construction of a
Phonetically Balanced Corpus from the Web,” in Proceedings of 5th International Conference
on Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer
Science, Korea, pp. 416-419, 2004.
[11] L. Hadjileontiadis and S. Panas, “Separation of discontinuous adventitious sounds from
vesicular sounds using a wavelet based filter,” IEEE Trans. Biomed. Eng., vol. 44, no. 7, pp.
876-886, 1997.
[12] S. Mallat, “A wavelet tour of signal processing,” Academic Press, 1999.
[13] F. Schiel and J. Harrington, “Phonemic Segmentation and Labelling using the MAUS
Technique,” Workshop 'New Tools and Methods for Very-Large-Scale Phonetics Research',
University of Pennsylvania, January 28-31, 2011.
[14] F. Schiel, “MAUS Goes Iterative,” Proc. of the IV. International Conference on Language
Resources and Evaluation, Lisbon, Portugal, pp. 1015-1018, 2004.
[15] J.L. Fleiss, “Measuring nominal scale agreement among many raters,” Psychological Bulletin,
vol. 76, no. 5, pp. 378-382, 1971.
[16] S. Burger, K. Weilhammer, F. Schiel, and H. G. Tillmann, “Verbmobil Data Collection and
Annotation,” Foundations of Speech-to-Speech Translation (Ed. W. Wahlster), Springer,
Berlin, Heidelberg, 2000.
[17] F. Schiel, C. Heinrich, and S. Barfüßer, “Alcohol Language Corpus,” Language
Resources, 2011.