3. Table 1: Some differences between Modern Standard Arabic (MSA) and
Egyptian Colloquial Arabic (ECA)
GLOSS         ECA          MSA
summer        se:f         saif
‘he speaks’   yitkallim    yatakallam
table         tarabeeza    Tawila
4. Writing System
Arabic is written in a cursive script, from right to left. The
alphabet consists of twenty-eight letters, twenty-five
of which represent consonants. The remaining three
letters represent the long vowels of Arabic and, where
applicable, the corresponding semivowels. Each letter can appear
in up to four different shapes, depending on whether it
occurs at the beginning, in the middle, or at the end of
a word, or in isolation. Letters are mostly connected,
and there is no capitalization.
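The four positional shapes can be seen in Unicode itself, which encodes them as separate "presentation form" code points. The sketch below is only an illustration: it picks the letter BEH (U+0628) as an arbitrary example and uses Python's standard unicodedata module to read the positional tag of each form. The helper name positional_form is hypothetical.

```python
import unicodedata

# Presentation-form code points for the letter BEH (U+0628),
# taken from the Arabic Presentation Forms-B block.
BEH_FORMS = ["\uFE8F", "\uFE90", "\uFE91", "\uFE92"]

def positional_form(ch):
    """Return (position tag, base letter) for an Arabic presentation form."""
    # The decomposition looks like "<initial> 0628".
    tag, base_hex = unicodedata.decomposition(ch).split()
    return tag.strip("<>"), chr(int(base_hex, 16))

forms = [positional_form(ch) for ch in BEH_FORMS]
for ch, (tag, base) in zip(BEH_FORMS, forms):
    print(f"U+{ord(ch):04X}  {tag:9s} shape of U+{ord(base):04X}")
```

In ordinary text only the base letter is stored, and the rendering engine chooses the shape from the neighbouring letters.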
8. Examples of MSA pronominal and possessive
affixes (separated from stem by '-').
10. Error rates on conversational speech, by contrast, are
unacceptably high. The best error rate currently
reported, 55.5%, is higher than those
for comparable data in other languages.
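To make the figure concrete, here is a minimal sketch of how such an error rate is commonly computed, assuming it is a word error rate (the slide does not say which kind of error rate is meant). The function name word_error_rate and the placeholder tokens are illustrative only.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # dist[i][j] = edits needed to turn the first i reference words
    # into the first j hypothesis words.
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # i deletions
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # j insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)
    return dist[len(ref)][len(hyp)] / len(ref)

# Placeholder tokens: one substitution and one deletion in four words.
print(word_error_rate("a b c d", "a x c"))
```

Under this definition, an error rate of 55.5% means that more than one edit is needed for every two reference words.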
11. Problems
the mismatch between spoken and written representation
(missing pronunciation information in Arabic script);
the lack of conversational training data;
morphological complexity.
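The first problem can be illustrated with a short sketch. Everyday Arabic text usually omits the short-vowel marks, so differently pronounced words share one written form. The example assumes three well-known vocalized forms of the root k-t-b, and the helper name strip_diacritics is hypothetical.

```python
import unicodedata

def strip_diacritics(word):
    """Remove combining marks, including the Arabic short-vowel signs."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))

# Fully vocalized forms, written with explicit code points:
# KAF U+0643, TEH U+062A, BEH U+0628; FATHA U+064E, DAMMA U+064F, KASRA U+0650.
vocalized = {
    "kataba (he wrote)": "\u0643\u064E\u062A\u064E\u0628\u064E",
    "kutiba (it was written)": "\u0643\u064F\u062A\u0650\u0628\u064E",
    "kutub (books)": "\u0643\u064F\u062A\u064F\u0628",
}

# All three pronunciations collapse to the same three-letter written form.
written = {strip_diacritics(form) for form in vocalized.values()}
print(len(vocalized), "pronunciations ->", len(written), "written form")
```

A recognizer trained on unvocalized transcripts therefore has to recover the pronunciation from context or from a separate lexicon.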
12. Projects and contributions
IBM first built a system that recognized spoken Arabic
and converted it to text (on OS/2).
After that came two versions for the Windows system.
IBM then introduced ViaVoice Millennium, with speech
recognition used to answer phone calls and to
respond to users' voice commands.
The problem was the need for a very large vocabulary:
about 200,000 words to cover 97% of the language used
today.
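A coverage figure of this kind is normally measured as the share of running words in a corpus that fall inside the N most frequent word types. The sketch below shows the calculation on a toy token list. The function name coverage and the tokens w1 to w4 are placeholders, not Arabic data.

```python
from collections import Counter

def coverage(tokens, vocab_size):
    """Fraction of running words covered by the vocab_size most frequent types."""
    if not tokens:
        raise ValueError("token list is empty")
    counts = Counter(tokens)
    covered = sum(n for _, n in counts.most_common(vocab_size))
    return covered / len(tokens)

# Toy corpus of ten running words and four word types.
toy_tokens = "w1 w1 w1 w1 w2 w2 w2 w3 w3 w4".split()
for size in (1, 2, 3, 4):
    print(size, coverage(toy_tokens, size))
```

Because Arabic attaches many affixes to each stem, the number of distinct word forms grows quickly, which is one reason such a large vocabulary is needed to reach high coverage.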
13. Recent Works
Alghamdi et al. (2009) developed an Arabic broadcast news
transcription system.
Elmahdy et al. (2009) used acoustic models trained on a large MSA
news broadcast speech corpus as multilingual or multi-accent
models to decode colloquial Arabic.
Selouani and Alotaibi (2011) presented genetic algorithms to
adapt HMMs for non-native speech in a large-vocabulary speech
recognition system for MSA.
Saon et al. (2010) described the Arabic broadcast transcription
system.
Kuo et al. (2010) studied various syntactic and morphological
context features incorporated in a neural network language model
(NNLM) for Arabic speech recognition.