Research Inventy: International Journal of Engineering And Science
Vol.4, Issue 7 (July 2014), PP 06-11
Issn (e): 2278-4721, Issn (p):2319-6483, www.researchinventy.com
Phonetic Transcription- A Framework for Phonetic
Representation of Sound Structures
Dr. M Hanumanthappa¹, Rashmi S², Jyothi N M³
¹ Associate Professor, Department of Computer Science and Applications, Bangalore University, Bangalore-56.
² Research Scholar, Department of Computer Science and Applications, Bangalore University, Bangalore-56.
³ Assistant Professor, Department of MCA, Bapuji Institute of Engineering and Technology, Davangere.
ABSTRACT: Applying phonetics to natural language is a Herculean task. The first step is to generate a
phonetic dictionary, which plays a vital role in identifying the essential components of the very large
vocabulary handled by a natural-language speech recognition system. A language such as English needs
phonetic transcriptions because its spelling does not reliably tell us how a word is pronounced.
Pronunciation is essential to communication, and multilingual speech recognition is an active area of
research. Because a phonetic dictionary is of the utmost importance to any speech recognition system, the
key goal of this paper is to build one for the English language. The dictionary gives the phonetic
transcription of each letter of the English alphabet in the International Phonetic Alphabet (IPA) and in the
American phonetic alphabet. The paper also explains in detail the rules that govern the phonemes.
KEYWORDS: Natural Language Processing (NLP), Phonetics, Acoustic phonetics, Phonemes,
Phonetic transcriptions, WordNet.
I. INTRODUCTION
In the context of NLP, phonetics is, following the WordNet definition, concerned with speech production
and perception and with a detailed analysis of speech acoustics. Acoustic phonetics is the branch of
phonetics that studies the sounds produced by the human vocal organs. Acoustic knowledge is very important
for locating the underlying phonetic representation. Acoustics describes how sound travels from the
speaker's mouth to the listener's ear. Identifying the language and storing the corresponding sounds for
phonetic transcription is tedious work. The International Phonetic Alphabet (IPA) and the acoustic modeling
of various phonemes have brought reasonably good improvements to the implementation and understanding of
sound structure. Approaches such as Statistical Machine Translation (SMT) and language identification (LID)
bridge the gap between the encoding and decoding processes by using a multi-stream approach [1].
Phonemes are the basic building blocks of a language; they combine with other phonemes to build
meaningful units such as morphemes and words. A phoneme is an individual sound unit or a group of sound
units. Phonetic codes include the syllable code, syllable time and syllable rhythm. Phonemes form the
bedrock of the initial and final code forms, which represent the phonetic structure together with the
linguistic sub-information carried in the syllable code [2].
Phonetic transcriptions tell us how a word is to be pronounced. They are written in IPA, in which every
sound of English has its own symbol. For example, the IPA transcription of the word no is noʊ, and the
transcription of do is duː. Although both words end with the same letter, they are pronounced differently,
so their transcriptions differ. The transcription for
the word alphabet is . Phonetic transcriptions are helpful because there are no convincing
empirical results that tell us how best to organise pronunciation training [5]. The table below gives some
examples of phonetic transcriptions.
Feature             Transcription          Word
main stress         /ˌek.spekˈteɪ.ʃən/     expectation
secondary stress    /ˌriːˈtel/             retell
syllable division   /ˈsɪs.təm/             system
Table-1 phonetic transcriptions
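The three notational devices in Table-1 can be read mechanically. The following Python fragment is a minimal sketch, not part of the proposed dictionary: it assumes that ˈ marks main stress, ˌ marks secondary stress and a full stop marks a syllable division, and the function name analyse is hypothetical.

```python
def analyse(transcription):
    """Split an IPA transcription into (syllable, stress) pairs.

    Assumes 'ˈ' = main stress, 'ˌ' = secondary stress, '.' = syllable division.
    """
    body = transcription.strip("/")
    # A stress mark also opens a new syllable, so put a division before it.
    for mark in ("ˈ", "ˌ"):
        body = body.replace(mark, "." + mark)
    result = []
    for syllable in body.split("."):
        if not syllable:
            continue  # skip empty pieces created by adjacent divisions
        if syllable.startswith("ˈ"):
            result.append((syllable[1:], "main"))
        elif syllable.startswith("ˌ"):
            result.append((syllable[1:], "secondary"))
        else:
            result.append((syllable, "unstressed"))
    return result


print(analyse("/ˌek.spekˈteɪ.ʃən/"))
# [('ek', 'secondary'), ('spek', 'unstressed'), ('teɪ', 'main'), ('ʃən', 'unstressed')]
print(analyse("/ˈsɪs.təm/"))
# [('sɪs', 'main'), ('təm', 'unstressed')]
```

Treating a stress mark as an implicit syllable boundary keeps the sketch short; a dictionary that omits syllable divisions altogether would need a proper syllabifier instead.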
II. LITERATURE SURVEY
It is said that around 4000 languages are spoken across the world today, and for most of them only
limited linguistic knowledge and speech data are available. Akbacak et al. proposed a series of speech
retrieval algorithms that leverage already existing algorithms. The algorithms employ confusion-embedded
hybrid pronunciation networks and lattice-based phonetic search within a proper name retrieval task, with
Latin-American Spanish as the target language. After searching for queries consisting of Spanish proper
names in Spanish Broadcast News data, the authors demonstrate that retrieval performance degradations (due
to data sparseness during automatic speech recognition (ASR) deployment in the target language) are
compensated by employing English acoustic models [9]. Speech recognition systems have been studied and
developed for various languages. An Arabic speech recognition system was developed at Cambridge University
by Gales et al. [6]; the authors show a simple scheme for automatically generating associations among
diverse pronunciations for use in training and for reducing the phonetic out-of-vocabulary rate.
Palanisamy et al. developed a Tamil pronunciation dictionary that incorporates the visual actions of the
speech organs in order to improve communication skills [7]. Domokos et al. proposed phonetic transcriptions
for the Romanian language using the Speech Assessment Methods Phonetic Alphabet (SAMPA); their paper
explains the development of the phonetic dictionary and the system architecture [8].
The next section discusses methodologies for building the phonetic dictionary and the associated
rules.
III. METHODOLOGIES
Today almost all English dictionaries include audio recordings. Even so, phonetic transcriptions
remain necessary for the following reasons [3]:
• To communicate well in English we first have to learn its pronunciations, and English is highly
polysemous in nature. Learning how each word sounds is therefore essential.
• When listening to a pronunciation we may not be sure whether we heard ʊ or ə, ɒ or ʌ, s or z, and so
on, either because we lack experience or because the quality of the recording is poor. Reading the phonetic
transcription makes the phonemes clear, because we see the symbol for every sound in the word.
• Dictionaries often show multiple transcriptions for the same pronunciation, so the best way to resolve
this ambiguity is to study the information about the particular transcriptions.
• The computer in use may have no speakers, so the audio recording cannot be heard. Even when speakers
are available, we may not want to disturb the people nearby. In these situations transcriptions help
greatly.
Word Stress: A word is a combination of syllables. In every word, one or more syllables are pronounced
more strongly than the others. This is called word stress. The dictionary shown in this paper also marks
word stress. English pronunciation is broadly classified into American and British. In American English the
r sound is always pronounced, whereas in British English r is not pronounced unless a vowel follows it. For
example, arm and father are pronounced differently in the two accents.
IPA: IPA is the abbreviation for International Phonetic Alphabet. It defines a standard phonetic
symbol for every sound of the English language. IPA symbols are mostly based on Latin letters. IPA defines
the standard written representation of the sounds of spoken language and is considered the standard in
linguistics. However, an American phonetic alphabet also exists because of practical difficulties with IPA:
its symbols are hard to type on a normal keyboard, and handwritten IPA transcriptions are awkward to read,
which increases the error rate.
IV. BUILDING PHONETIC DICTIONARY
Building a phonetic dictionary is not as simple as it looks, because it is difficult to make a computer
detect a word automatically and give its pronunciation. Several rules need to be followed when building
phonetic transcriptions. The rules are [3]:
[1]. Almost all dictionaries use the e symbol for the vowel in bed. The problem with this convention is
that e in the IPA does not stand for the vowel in bed; it stands for a different vowel that is heard, for
example, in the German word Seele. The “proper” symbol for the bed vowel is ɛ (do not confuse with ɜ:).
The same goes for eə vs. ɛə.
[2]. In əʳ and ɜ:ʳ, the ʳ is not pronounced in BrE, unless the sound comes before a vowel (as in
answering, answer it). In AmE, the ʳ is always pronounced, and the sounds are sometimes written
as ɚ and ɝ.
[3]. In AmE, ɑ: and ɒ are one vowel, so calm and cot have the same vowel. In American transcriptions, hot is
written as hɑ:t.
[4]. About 40% of Americans pronounce ɔ: the same way as ɑ:, so that caught and cot have the same vowel.
See cot-caught merger.
[5]. In American transcriptions, ɔ: is often written as ɒ: (e.g. law = lɒ:), unless it is followed by r, in which
case it remains an ɔ:.
[6]. In British transcriptions, oʊ is usually represented as əʊ. For some BrE speakers, oʊ is more appropriate
(they use a rounded vowel); for others, the proper symbol is əʊ. For American speakers, oʊ is usually
more accurate.
[7]. In eəʳ ɪəʳ ʊəʳ, the r is not pronounced in BrE, unless the sound comes before a vowel (as in dearest, dear
Ann). In AmE, the r is always pronounced, and the sounds are often written as er ɪr ʊr.
[8]. All dictionaries use the r symbol for the first sound in red. The problem with this convention is that r in
the IPA does not stand for the British or American r; it stands for the “hard” r that is heard, for example,
in the Spanish word rey or Italian vero. The “proper” symbol for the red consonant is ɹ.
[9]. In American English, t is often pronounced as a flap t, which sounds like d or (more accurately) like the
quick, hard r heard e.g. in the Spanish word pero. For example: letter. Some dictionaries use
the t̬ symbol for the flap t.
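Several of the rules above amount to symbol substitutions between British-style and American-style transcriptions. The Python fragment below is a minimal sketch of that idea under stated assumptions: it covers only rules [3], [6] and [7], it uses plain string replacement, and the names BRE_TO_AME and to_american are hypothetical rather than part of the dictionary described in this paper.

```python
# Illustrative British-to-American rewrite rules taken from rules [3], [6], [7].
# The table is deliberately incomplete; it is a sketch, not a full converter.
BRE_TO_AME = [
    ("eəʳ", "er"),   # rule [7]: the r is always pronounced in AmE
    ("ɪəʳ", "ɪr"),   # rule [7]
    ("ʊəʳ", "ʊr"),   # rule [7]
    ("əʊ", "oʊ"),    # rule [6]: BrE əʊ corresponds to AmE oʊ
    ("ɒ", "ɑ:"),     # rule [3]: ɒ and ɑ: are one vowel in AmE
]


def to_american(transcription):
    """Apply each rewrite rule in order; longer patterns are listed first."""
    for british, american in BRE_TO_AME:
        transcription = transcription.replace(british, american)
    return transcription


print(to_american("hɒt"))    # hɑ:t
print(to_american("nəʊ"))    # noʊ
print(to_american("dɪəʳ"))   # dɪr
```

Context-dependent rules, such as the exception for ɔ: before r in rule [5], cannot be expressed as plain substitutions and would need pattern matching on the following symbol.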
Table II lists the letters of the English alphabet with their names transcribed in the IPA and the American
phonetic alphabet. Word stress is also marked.
Table II IPA and phonetic Transcriptions
Letter   IPA/American Phonetic Alphabet   Stress marked
a        e                                é
b        bi                               bí
c        si                               sí
d        di                               dí
e        i                                í
f        ɛf                               ɛ́f
g        ǰi/dʒi                           ǰí/dʒí
h        etʃ/eč                           étʃ/éč
i        aj/ay                            áj/áy
j        dʒe/ǰe                           dʒé/ǰé
k        ke                               ké
l        ɛl                               ɛ́l
m        ɛm                               ɛ́m
n        ɛn                               ɛ́n
o        o                                ó
p        pi                               pí
q        kyu/kju                          kyú/kjú
r        ɑr                               ár
s        ɛs                               ɛ́s
t        ti                               tí
u        ju/yu                            jú/yú
v        vi                               ví
w        dəbəlju/dəbəlyu                  də́bəlju/də́bəlyu
x        ɛks                              ɛ́ks
y        waj/way                          wáj/wáy
z        zi                               zí
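A table of this kind maps naturally onto a lookup structure. The sketch below is illustrative only: it stores a small subset of the letter names from Table II (the unstressed forms of b, c, d, k, p, t, v and z), and the names LETTER_NAMES and spell_out are assumptions introduced for the example.

```python
# Subset of Table II: letter -> transcription of the letter's name.
LETTER_NAMES = {
    "b": "bi", "c": "si", "d": "di", "k": "ke",
    "p": "pi", "t": "ti", "v": "vi", "z": "zi",
}


def spell_out(word):
    """Return the letter-name transcription for every letter of `word`.

    Letters that are not in the subset are reported as '?'.
    """
    return [LETTER_NAMES.get(letter, "?") for letter in word.lower()]


print(spell_out("TV"))    # ['ti', 'vi']
print(spell_out("PDA"))   # ['pi', 'di', '?']
```

Returning a placeholder for unknown letters, rather than raising an error, makes gaps in the dictionary visible while the table is still being filled in.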
Table III shows the IPA phonetic symbols and their phonetic descriptions. The shortcut key column gives the
keys that have to be typed in order to get the corresponding phonetic symbol.
Letter   Symbol   Phonetic description                              Shortcut key
A        ɑ        open back unrounded vowel                         (Ctrl+A)
         æ        near-open front unrounded vowel                   (Ctrl+AA)
         ɐ        near-open central vowel                           (Ctrl+AAA)
         ɑ̃        nasalized open back unrounded vowel               (Ctrl+AAAA)
B        β        voiced bilabial fricative                         (Ctrl+B)
         ɓ        voiced bilabial implosive                         (Ctrl+BB)
         ʙ        bilabial trill                                    (Ctrl+BBB)
C        ɕ        voiceless alveo-palatal fricative                 (Ctrl+C)
         ç        voiceless palatal fricative                       (Ctrl+CC)
D        ð        voiced dental fricative                           (Ctrl+D)
         d͡ʒ       voiced postalveolar affricate                     (Ctrl+DD)
         ɖ        voiced retroflex plosive                          (Ctrl+DDD)
         ɗ        voiced alveolar implosive                         (Ctrl+DDDD)
E        ə        mid-central vowel                                 (Ctrl+E)
         ɚ        rhotacized mid-central vowel                      (Ctrl+EE)
         ɵ        close-mid central rounded vowel                   (Ctrl+EEE)
         ɘ        close-mid central unrounded vowel                 (Ctrl+EEEE)
F        ɛ        open-mid front unrounded vowel                    (Ctrl+3)
         ɜ        open-mid central unrounded vowel                  (Ctrl+33)
         ɝ        rhotacized open-mid central unrounded vowel       (Ctrl+333)
         ɛ̃        nasalized open-mid front unrounded vowel          (Ctrl+3333)
         ɞ        open-mid central rounded vowel                    (Ctrl+33333)
G        ɠ        voiced velar implosive                            (Ctrl+G)
         ɢ        voiced uvular plosive                             (Ctrl+GG)
         ʛ        voiced uvular implosive                           (Ctrl+GGG)
H        ɥ        labial-palatal approximant                        (Ctrl+H)
         ɦ        voiced glottal fricative                          (Ctrl+HH)
         ħ        voiceless pharyngeal fricative                    (Ctrl+HHH)
         ɧ        Sje-sound                                         (Ctrl+HHHH)
         ʜ        voiceless epiglottal fricative                    (Ctrl+HHHHH)
I        ɪ        near-close near-front unrounded vowel             (Ctrl+I)
         ɪ̈        near-close central unrounded vowel                (Ctrl+II)
         ɨ        close central unrounded vowel                     (Ctrl+III)
J        ʝ        voiced palatal fricative                          (Ctrl+J)
         ɟ        voiced palatal plosive                            (Ctrl+JJ)
         ʄ        voiced palatal implosive                          (Ctrl+JJJ)
L        ɫ        velarized alveolar lateral approximant            (Ctrl+L)
         ɬ        voiceless alveolar lateral fricative              (Ctrl+LL)
         ʟ        velar lateral approximant                         (Ctrl+LLL)
         ɭ        retroflex lateral approximant                     (Ctrl+LLLL)
         ɮ        voiced alveolar lateral fricative                 (Ctrl+LLLLL)
M        ɱ        labiodental nasal                                 (Ctrl+M)
N        ŋ        velar nasal                                       (Ctrl+N)
         ɲ        palatal nasal                                     (Ctrl+NN)
         ɴ        uvular nasal                                      (Ctrl+NNN)
         ɳ        retroflex nasal                                   (Ctrl+NNNN)
O        ɔ        open-mid back rounded vowel                       (Ctrl+O)
         œ        open-mid front rounded vowel                      (Ctrl+OO)
         ø        close-mid front rounded vowel                     (Ctrl+OOO)
         ɒ        open back rounded vowel                           (Ctrl+OOOO)
         ɔ̃        nasalized open-mid back rounded vowel             (Ctrl+OOOOO)
         ɶ        open front rounded vowel                          (Ctrl+OOOOOO)
P        ɸ        voiceless bilabial fricative                      (Ctrl+P)
R        ɾ        alveolar tap                                      (Ctrl+R)
         ʁ        voiced uvular fricative                           (Ctrl+RR)
         ɹ        alveolar approximant                              (Ctrl+RRR)
         ɻ        retroflex approximant                             (Ctrl+RRRR)
         ʀ        uvular trill                                      (Ctrl+RRRRR)
         ɽ        retroflex flap                                    (Ctrl+RRRRRR)
         ɺ        alveolar lateral flap                             (Ctrl+RRRRRRR)
S        ʃ        voiceless postalveolar fricative                  (Ctrl+S)
         ʂ        voiceless retroflex fricative                     (Ctrl+SS)
T        θ        voiceless dental fricative                        (Ctrl+T)
         t͡ʃ       voiceless postalveolar affricate                  (Ctrl+TT)
         t͡s       voiceless alveolar affricate                      (Ctrl+TTT)
         ʈ        voiceless retroflex plosive                       (Ctrl+TTTT)
U        ʊ        near-close near-back rounded vowel                (Ctrl+U)
         ʊ̈        near-close central rounded vowel                  (Ctrl+UU)
         ʉ        close central rounded vowel                       (Ctrl+UUU)
V        ʌ        open-mid back unrounded vowel                     (Ctrl+V)
         ʋ        labiodental approximant                           (Ctrl+VV)
         ⱱ        labiodental flap                                  (Ctrl+VVV)
W        ʍ        voiceless labio-velar approximant                 (Ctrl+W)
         ɯ        close back unrounded vowel                        (Ctrl+WW)
         ɰ        velar approximant                                 (Ctrl+WWW)
X        χ        voiceless uvular fricative                        (Ctrl+X)
Y        ʎ        palatal lateral approximant                       (Ctrl+Y)
         ɣ        voiced velar fricative                            (Ctrl+YY)
         ʏ        near-close near-front rounded vowel               (Ctrl+YYY)
         ɤ        close-mid back unrounded vowel                    (Ctrl+YYYY)
Z        ʒ        voiced postalveolar fricative                     (Ctrl+Z)
         ʐ        voiced retroflex fricative                        (Ctrl+ZZ)
         ʑ        voiced alveolo-palatal fricative                  (Ctrl+ZZZ)
Table III Phonemes
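The shortcut scheme in Table III follows one pattern: pressing the same key n times under Ctrl selects the n-th symbol of that key's group. A minimal Python sketch of resolving such a shortcut is given below. It is an assumption about how a tool might implement the lookup, not a description of an existing one; only three groups from the table are included, and the names SHORTCUTS and symbol_for are hypothetical.

```python
# Subset of Table III: base key -> symbols in shortcut order
# (Ctrl+key pressed once, twice, three times, ...).
SHORTCUTS = {
    "A": ["ɑ", "æ", "ɐ"],
    "N": ["ŋ", "ɲ", "ɴ", "ɳ"],
    "S": ["ʃ", "ʂ"],
}


def symbol_for(shortcut):
    """Resolve a shortcut such as 'Ctrl+NN' to its phonetic symbol."""
    prefix, _, keys = shortcut.partition("+")
    # A valid shortcut is 'Ctrl+' followed by one key repeated 1..n times.
    if prefix != "Ctrl" or not keys or len(set(keys)) != 1:
        raise ValueError("not a valid shortcut: " + repr(shortcut))
    symbols = SHORTCUTS.get(keys[0], [])
    if len(keys) > len(symbols):
        raise ValueError("no symbol assigned to " + repr(shortcut))
    return symbols[len(keys) - 1]


print(symbol_for("Ctrl+N"))     # ŋ
print(symbol_for("Ctrl+NN"))    # ɲ
print(symbol_for("Ctrl+AAA"))   # ɐ
```

Storing each group as an ordered list keeps the table and the code in the same shape, so extending the sketch to the full table only means adding entries.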
V. CONCLUSION
Pronunciation is very important in today’s communication, and there has recently been a shift from
linguistic competence to a broader level of communicative competence. Effective communication always
rests on good pronunciation. Pronunciation is more than the production of the right phonemes; it forms the
foundation for the next level of speech analysis. With adequate pronunciation skills a speaker can reach
new horizons professionally. However, with the increasing ambiguities in natural language it is difficult
to judge the right pronunciation. The goal of this paper was to provide a phonetic dictionary. It forms the
basis for the later part of the research work, which is to build an interface for recognizing the phonetic
structure of sounds generated by natural language, mostly English. The dictionary shown in the paper covers
only the letters of the Latin alphabet as used in American English. The scope was limited intentionally in
order to limit the resources required. The same rules will form the groundwork for recognizing the sounds
of other native languages such as Kannada, Telugu and others. Researchers building sound recognition
systems can benefit from the rules and the dictionary given in the paper. Future work will focus on
developing a phonetic dictionary tool by applying the rules specified in the paper.
REFERENCES:
[1]. Ngoc Thang Vu, Dau-Cheng Lyu, Weiner, J., Telaar, D., Schlippe, T., Blaicher, F., Eng-Siong Chng, Schultz, T., Haizhou Li,
A first speech recognition system for Mandarin-English code-switch conversational speech, 2012 IEEE International
Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4889-4892, DOI: 10.1109/ICASSP.2012.6289015
[2]. Lu Qiao, Wan Pu, Zhang Li, Zhu Dao-yong, Notice of Retraction: A kind of Chinese language Phonetic input output system
code, 2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), Vol. 9, 2010,
pp. 508-511, DOI: 10.1109/ICCSIT.2010.5563773
[3]. Tomasz P. Szynalski, Proceedings of the International Research Conference “Metamorphoses of the Mind: Out of One’s Mind,
Losing One’s Mind, Lightmindedness, Mindlessness”, organized by the Latvian Academy of Culture and Goethe Institut Riga,
Riga, 21–23 September 2006, http://www.antimoon.com/how/pronunc-trans.htm
[4]. http://www.antimoon.com/how/pronunc-soundsipa.htm#chart
[5]. Marcus Otlowski, Pronunciation: What Are the Expectations?, The Internet TESL Journal, Vol. IV, No. 1
[6]. Gales, M.J.F. (Cambridge Univ.), Development of a phonetic system for large vocabulary Arabic speech recognition, IEEE
Workshop on Automatic Speech Recognition & Understanding (ASRU), 9-13 Dec. 2007, pp. 24-29, E-ISBN: 978-1-4244-1746-9,
Print ISBN: 978-1-4244-1746-9, INSPEC Accession Number: 9832782
[7]. Palanisamy, K., Audio visual based pronunciation dictionary for Indian languages, 2010 International Conference on
Technology for Education (T4E), 1-3 July 2010, pp. 82-84, E-ISBN: 978-1-4244-7361-8, Print ISBN: 978-1-4244-7362-5,
INSPEC Accession Number: 11476081
[8]. Domokos, J., Buza, O., Toderean, G., 100K+ words, machine-readable, pronunciation dictionary for the Romanian language,
Proceedings of the 20th European Signal Processing Conference (EUSIPCO), 27-31 Aug. 2012, pp. 320-324, ISSN: 2219-5491,
Print ISBN: 978-1-4673-068-0, INSPEC Accession Number: 13072796
[9]. Akbacak, M., Hansen, J.H.L., Spoken Proper Name Retrieval for Limited Resource Languages Using Multilingual Hybrid
Representations, IEEE Transactions on Audio, Speech, and Language Processing, Vol. 18, Issue 6, 2010, pp. 1486-1495,
DOI: 10.1109/TASL.2009.2035785

B047006011

  • 1.
    Research Inventy: InternationalJournal of Engineering And Science Vol.4, Issue 7 (July 2014), PP 06-11 Issn (e): 2278-4721, Issn (p):2319-6483, www.researchinventy.com 6 Phonetic Transcription- A Framework for Phonetic Representation of Sound Structures Dr. M Hanumanthappa1 , Rashmi S2 , Jyothi N M3 1 Associate Professor, Department of Computer Science and Applications, Bangalore University, Bangalore-56. 2 Research Scholar, Department of Computer Science and Applications, Bangalore University, Bangalore-56. 3 Assistant Professor, Department of MCA, Bapuji Institute of Engineering and Technology, Davangere. ABSTRACT: Implementing phonetics to Natural language is a Herculean task. The first step for the phonetic implementation is to generate a phonetic dictionary. Phonetic dictionary is very important as it plays a vital role in identifying the essential components of the gigantic vocabulary in the speech recognition system of natural language. Language like English needs phonetic transcriptions because the English spelling does not tell us how to pronounce it. Pronunciation is very important for communication as it vitalizes rapid transition to practical concern. There is lot of ongoing research in multilingual speech recognition. Phonetic dictionary is utmost important for any speech recognition system and hence the key goal of the paper is to build a phonetic dictionary for English language. The paper aims at building dictionary for identifying the phonetic transcriptions of International Phonetic Alphabet (IPA) and American English alphabets phonetically. The paper also gives a detailed explanation about the rules of the phonemes. KEYWORDS: Natural Language Processing (NLP), Phonetics, Acoustic phonetics, Phonemes, Phonetic transcriptions, WordNet. I. INTRODUCTION According to the WordNet the phonetics is defined as a branch of NLP which is concerned with the speech production and perception and a detailed analysis of acoustics. 
Acoustic phonetics is a branch of phonetics which deals with the study of sounds that is made by the vocal organs of the human in order to produce the sound. The acoustic knowledge is very important to locate the underlying phonetic representation. Acoustics discusses how the sound travels from speakers’ mouth to listeners’ ears. The challenges of identifying the language and storing the corresponding sound for phonetic transcription are tedious and tiresome. The International Phonetic Alphabet (IPA) and the acoustic modeling of various phonemes have shown rationally good improvement for the implementation and understanding of the sound structure. Various approaches such as Statistical Machine Translation (SMT), language identification system (LID) merges gap between the encoding and the decoding process by using multi stream approach [1]. Phonemes are the basic building block for the language which is combined with the other phoneme to build a meaningful unit such as words and morphemes. Phonemes are the individual or group of sound units. The phonetic codes include syllable code, syllable time and syllable rhythm. The phonemes form the bedrock for initial and final code form to represent the phonetic structure with sensibility and sub-information of linguistics in syllable code [2]. Phonetic transcriptions tell us how the word has to be pronounced. These phonetic transcriptions are written in IPA where each and every English alphabet has its own symbol. For example the IPA based phonetic transcription for the words such as no is noʊ, and the transcription of do is duː. Though both of these words end with the same letter they have different sound and the phonetic transcriptions are different. The transcription for the word alphabet is . Phonetic transcriptions are helpful because there are no convincing empirical results which help us to sort the pronunciation training [5]. The below table mention some of the examples for phonetic transcriptions. 
main stress /ˌek.spekˈteɪ.ʃə n/ expectation secondary stress /ˌriːˈtell/ retell syllable division /ˈsɪs.təm/ system Table-1 phonetic transcriptions
  • 2.
    Phonetic Transcription- AFramework for Phonetic Representation of Sound Structures 7 II. LITERATURE SURVEY It is said that there are around 4000 languages spoken today across the world and most of these languages have limited linguistic knowledge and speech data/resources are available. Author Akbacak, M et al have proposed series of speech retrieval algorithms in order to leverage the already existing algorithms. The algorithms employ confusion-embedded hybrid pronunciation networks, and lattice-based phonetic search within a proper name retrieval task. Latin-American Spanish are used as the target language. After searching for queries consisting of Spanish proper names in Spanish Broadcast News data, we demonstrate that retrieval performance degradations (due to data sparseness during automatic speech recognition (ASR) deployment in the target language) are compensated by employing English acoustic models [9]. Speech recognition system for various languages have been studied and developed. Arabic speech recognition system is developed at Cambridge University by Gales M J F et al [6]. Author has shown a simple scheme for automatically generating associations among diverse pronunciations for use in training and reducing the phonetic out-of-vocabulary rate. Author Palanisamy K et al, has developed Tamil pronunciation dictionary by incorporating the visual actions of the organs for improving communication skills [7]. Phonetic transcriptions by using Speech Assessment Method Phonetic Alphabet (SAMPA) for Romanian language has been proposed by Domokos, J et al. Development of phonetic dictionary and also the system architecture is explained in the paper [8]. In the next section various methodologies for building the phonetic dictionary and the associated rules are discussed. III. METHODOLOGIES Today almost all the English dictionaries have audio recordings. 
So even with this, the need of phonetic transcriptions is prominent because of the following reasons [3]  To have a good communication in English first we have learn the pronunciations and English language is polysemy in nature. So learning the pronunciations through sound is apt in these kinds of situations.  When we are listening to the pronunciations we might not be sure whether we heard ʊ or ə, ɒ or ʌ, s or z, etc., this is due to the lack of experience and the quality of the particular sound is poor. Hence reading phonetic transcriptions will make the phonemes clear as we will see the sound symbols of all the alphabets in a word  Dictionaries often show multiple transcriptions but have the same pronunciations so the better way to find the reasons for this ambiguity would be to study the information about the particular transcriptions.  The computer which we are using might not have the speakers and hence the audio recording of the particular sound cannot be heard. Sometimes even if the speakers are available we do not want to disturb the public with the sound. In these situations transcriptions will greatly help. Word Stress: A word is combination of syllables. In every word, one or more letters are pronounced strongly. This is called word stress. The dictionary which is shown in this paper also contains the word stress. Pronunciation for English language is broadly classified into American and British transcriptions. In American English, the sound of r is always stressed where as in British English r is not stressed. Example: arm/ father are pronounced differently in both of the accent. IPA: International Phonetic Alphabet is the abbreviation for IPA. It defines the standard phonetic symbol for every alphabet in English language. The IPA symbols are usually written in the Latin symbols. IPA defines the standard sound representation for oral language. IPA is considered as the standard for linguistics. 
However there is also the American phonetic alphabet because of the difficulties faced with IPA such as, it is tough to type the IPA symbols with the normal keyboard. IPA also increases the error rate by withholding the awkwardness reading the transcriptions which are hand written. IV. BUILDING PHONETIC DICTIONARY Phonetic dictionary is not as simple as it looks because applying sounding mechanisms to make computer auto detect the word and give the pronunciation is difficult. There are various rules that need to be followed while building the phonetic transcriptions. The rules are [3] [1]. Almost all dictionaries use the e symbol for the vowel in bed. The problem with this convention is that e in the IPA does not stand for the vowel in bed; it stands for a different vowel that is heard, for example, in the German word Seele. The ―proper‖ symbol for the bed vowel is ɚ (do not confuse with ɛ:). The same goes for eə vs. ɚə.
  • 3.
    Phonetic Transcription- AFramework for Phonetic Representation of Sound Structures 8 [2]. In əʳ and ɛ:ʳ, the ʳ is not pronounced in BrE, unless the sound comes before a vowel (as in answering, answer it). In AmE, the ʳ is always pronounced, and the sounds are sometimes written as ə and ɜ. [3]. In AmE, ɑ: and ɒ are one vowel, so calm and cot have the same vowel. In American transcriptions, hot is written as hɑ:t. [4]. About 40% of Americans pronounce ɔ: the same way as ɑ:, so that caught and cot have the same vowel. See cot-caught merger. [5]. In American transcriptions, ɔ: is often written as ɒ: (e.g. law = lɒ:), unless it is followed by r, in which case it remains an ɔ:. [6]. In British transcriptions, oʊ is usually represented as əʊ. For some BrE speakers, oʊ is more appropriate (they use a rounded vowel) — for others, the proper symbol is əʊ. For American speakers, oʊ is usually more accurate. [7]. In eəʳ ɪəʳ ʊəʳ, the r is not pronounced in BrE, unless the sound comes before a vowel (as in dearest, dear Ann). In AmE, the r is always pronounced, and the sounds are often written as er ɪr ʊr. [8]. All dictionaries use the r symbol for the first sound in red. The problem with this convention is that r in the IPA does not stand for the British or American r; it stands for the ―hard‖ r that is heard, for example, in the Spanish word rey or Italian vero. The ―proper‖ symbol for the red consonant is ɹ. [9]. In American English, t is often pronounced as a flap t, which sounds like d or (more accurately) like the quick, hard r heard e.g. in the Spanish word pero. For example: letter. Some dictionaries use the t ̬ symbol for the flap t. Table II describes the English alphabets and the corresponding IPA and American phonetic alphabets. Word stress is also highlighted. 
Table II IPA and phonetic Transcriptions

| English Alphabet | IPA/American Phonetic Alphabet | Stress Marked across the Alphabets |
|---|---|---|
| a | e | é |
| b | bi | bí |
| c | si | sí |
| d | di | dí |
| e | i | í |
| f | ɛf | ɛ́f |
| g | ǰi/dʒi | ǰí/dʒí |
| h | etʃ/eč | étʃ/éč |
| i | aj/ay | áj/áy |
| j | dʒe/ǰe | dʒé/ǰé |
| k | ke | ké |
| l | ɛl | ɛ́l |
| m | ɛm | ɛ́m |
| n | ɛn | ɛ́n |
| o | o | ó |
| p | pi | pí |
| q | kyu/kju | kyú/kjú |
| r | ɑr | ɑ́r |
| s | ɛs | ɛ́s |
| t | ti | tí |
| u | ju/yu | jú/yú |
| v | vi | ví |
| w | dəbəlju/dəbəlyu | də́bəlju/də́bəlyu |
| x | ɛks | ɛ́ks |
| y | waj/way | wáj/wáy |
| z | zi | zí |

Table III shows the IPA phonetic symbols and their phonetic descriptions. The shortcut key gives the keys that have to be typed in order to get the corresponding phonetic symbol.

| Alphabet | Phonetic symbol | Phonetic description for the symbol | Shortcut key |
|---|---|---|---|
| A | ɑ | open back unrounded vowel | (Ctrl+A) |
| | æ | near-open front unrounded vowel | (Ctrl+AA) |
| | ɐ | near-open central vowel | (Ctrl+AAA) |
| | ɑ̃ | nasalized open back unrounded vowel | (Ctrl+AAAA) |
| B | β | voiced bilabial fricative | (Ctrl+B) |
| | ɓ | voiced bilabial implosive | (Ctrl+BB) |
| | ʙ | bilabial trill | (Ctrl+BBB) |
| C | ɕ | voiceless alveolo-palatal fricative | (Ctrl+C) |
| | ç | voiceless palatal fricative | (Ctrl+CC) |
| D | ð | voiced dental fricative | (Ctrl+D) |
| | d͡ʒ | voiced postalveolar affricate | (Ctrl+DD) |
| | ɖ | voiced retroflex plosive | (Ctrl+DDD) |
| | ɗ | voiced alveolar implosive | (Ctrl+DDDD) |
| E | ə | mid-central vowel | (Ctrl+E) |
| | ɚ | rhotacized mid-central vowel | (Ctrl+EE) |
| | ɵ | close-mid central rounded vowel | (Ctrl+EEE) |
| | ɘ | close-mid central unrounded vowel | (Ctrl+EEEE) |
| F | ɛ | open-mid front unrounded vowel | (Ctrl+3) |
| | ɜ | open-mid central unrounded vowel | (Ctrl+33) |
| | ɝ | rhotacized open-mid central unrounded vowel | (Ctrl+333) |
| | ɛ̃ | nasalized open-mid front unrounded vowel | (Ctrl+3333) |
| | ɞ | open-mid central rounded vowel | (Ctrl+33333) |
| G | ɠ | voiced velar implosive | (Ctrl+G) |
| | ɢ | voiced uvular plosive | (Ctrl+GG) |
| | ʛ | voiced uvular implosive | (Ctrl+GGG) |
| H | ɥ | labial-palatal approximant | (Ctrl+H) |
| | ɦ | voiced glottal fricative | (Ctrl+HH) |
| | ħ | voiceless pharyngeal fricative | (Ctrl+HHH) |
| | ɧ | Sje-sound | (Ctrl+HHHH) |
| | ʜ | voiceless epiglottal fricative | (Ctrl+HHHHH) |
| I | ɪ | near-close near-front unrounded vowel | (Ctrl+I) |
| | ɪ̈ | near-close central unrounded vowel | (Ctrl+II) |
| | ɨ | close central unrounded vowel | (Ctrl+III) |
| J | ʝ | voiced palatal fricative | (Ctrl+J) |
| | ɟ | voiced palatal plosive | (Ctrl+JJ) |
| | ʄ | voiced palatal implosive | (Ctrl+JJJ) |
| L | ɫ | velarized alveolar lateral approximant | (Ctrl+L) |
| | ɬ | voiceless alveolar lateral fricative | (Ctrl+LL) |
| | ʟ | velar lateral approximant | (Ctrl+LLL) |
| | ɭ | retroflex lateral approximant | (Ctrl+LLLL) |
| | ɮ | voiced alveolar lateral fricative | (Ctrl+LLLLL) |
| M | ɱ | labiodental nasal | (Ctrl+M) |
| N | ŋ | velar nasal | (Ctrl+N) |
| | ɲ | palatal nasal | (Ctrl+NN) |
| | ɴ | uvular nasal | (Ctrl+NNN) |
| | ɳ | retroflex nasal | (Ctrl+NNNN) |
| O | ɔ | open-mid back rounded vowel | (Ctrl+O) |
| | œ | open-mid front rounded vowel | (Ctrl+OO) |
| | ø | close-mid front rounded vowel | (Ctrl+OOO) |
| | ɒ | open back rounded vowel | (Ctrl+OOOO) |
| | ɔ̃ | nasalized open-mid back rounded vowel | (Ctrl+OOOOO) |
| | ɶ | open front rounded vowel | (Ctrl+OOOOOO) |
| P | ɸ | voiceless bilabial fricative | (Ctrl+P) |
| R | ɾ | alveolar tap | (Ctrl+R) |
| | ʁ | voiced uvular fricative | (Ctrl+RR) |
| | ɹ | alveolar approximant | (Ctrl+RRR) |
| | ɻ | retroflex approximant | (Ctrl+RRRR) |
| | ʀ | uvular trill | (Ctrl+RRRRR) |
| | ɽ | retroflex flap | (Ctrl+RRRRRR) |
| | ɺ | alveolar lateral flap | (Ctrl+RRRRRRR) |
| S | ʃ | voiceless postalveolar fricative | (Ctrl+S) |
| | ʂ | voiceless retroflex fricative | (Ctrl+SS) |
| T | θ | voiceless dental fricative | (Ctrl+T) |
| | t͡ʃ | voiceless postalveolar affricate | (Ctrl+TT) |
| | t͡s | voiceless alveolar affricate | (Ctrl+TTT) |
| | ʈ | voiceless retroflex plosive | (Ctrl+TTTT) |
| U | ʊ | near-close near-back rounded vowel | (Ctrl+U) |
| | ʊ̈ | near-close central rounded vowel | (Ctrl+UU) |
| | ʉ | close central rounded vowel | (Ctrl+UUU) |
| V | ʌ | open-mid back unrounded vowel | (Ctrl+V) |
| | ʋ | labiodental approximant | (Ctrl+VV) |
| | ⱱ | labiodental flap | (Ctrl+VVV) |
| W | ʍ | voiceless labio-velar approximant | (Ctrl+W) |
| | ɯ | close back unrounded vowel | (Ctrl+WW) |
| | ɰ | velar approximant | (Ctrl+WWW) |
| X | χ | voiceless uvular fricative | (Ctrl+X) |
| Y | ʎ | palatal lateral approximant | (Ctrl+Y) |
| | ɣ | voiced velar fricative | (Ctrl+YY) |
| | ʏ | near-close near-front rounded vowel | (Ctrl+YYY) |
| | ɤ | close-mid back unrounded vowel | (Ctrl+YYYY) |
| Z | ʒ | voiced postalveolar fricative | (Ctrl+Z) |
| | ʐ | voiced retroflex fricative | (Ctrl+ZZ) |
| | ʑ | voiced alveolo-palatal fricative | (Ctrl+ZZZ) |

Table III Phonemes
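The two tables can be read as simple lookup rules, and a minimal Python sketch may make them concrete. It assumes, as Table III suggests, that repeating a key after Ctrl steps through the symbols grouped under that key, and that the stress-marked forms of Table II place an acute accent on the first vowel of the transcription. The names `SHORTCUT_SYMBOLS`, `symbol_for_shortcut`, `VOWELS` and `mark_stress` are hypothetical, the vowel inventory is an assumption made for illustration, and only a few rows of Table III are included.

```python
import re
import unicodedata

# Excerpt of Table III (hypothetical data structure): each key lists the
# symbols reached by typing that key once, twice, three times, ... after Ctrl.
SHORTCUT_SYMBOLS = {
    "A": ["ɑ", "æ", "ɐ", "ɑ̃"],
    "N": ["ŋ", "ɲ", "ɴ", "ɳ"],
    "S": ["ʃ", "ʂ"],
    "Z": ["ʒ", "ʐ", "ʑ"],
}

# "Ctrl+" followed by one key repeated one or more times, e.g. "Ctrl+AAA".
SHORTCUT_PATTERN = re.compile(r"Ctrl\+([A-Z0-9])\1*")


def symbol_for_shortcut(shortcut):
    """Return the phonetic symbol selected by a shortcut such as 'Ctrl+AA'."""
    compact = shortcut.replace(" ", "")
    match = SHORTCUT_PATTERN.fullmatch(compact)
    if match is None:
        raise ValueError(f"not a repeated-key shortcut: {shortcut!r}")
    key = match.group(1)
    presses = len(compact) - len("Ctrl+")
    symbols = SHORTCUT_SYMBOLS.get(key, [])
    if presses > len(symbols):
        raise KeyError(f"no symbol assigned to {shortcut!r}")
    return symbols[presses - 1]


# Assumed vowel inventory used to locate the stressed vowel (not from the paper).
VOWELS = set("aeiouɑɛəɪʊɔæ")
COMBINING_ACUTE = "\u0301"


def mark_stress(transcription):
    """Place an acute accent on the first vowel of a transcription."""
    for index, char in enumerate(transcription):
        if char in VOWELS:
            marked = (
                transcription[: index + 1]
                + COMBINING_ACUTE
                + transcription[index + 1 :]
            )
            # NFC composes the accent with the vowel where Unicode has a
            # precomposed character (i + accent becomes í); otherwise the
            # combining mark is kept after the vowel (as in ə́).
            return unicodedata.normalize("NFC", marked)
    return transcription
```

For example, `symbol_for_shortcut("Ctrl+AA")` returns `æ` and `mark_stress("bi")` returns `bí`. Keeping the symbols of each key in an ordered list mirrors the layout of Table III, so extending the sketch to the full table only requires adding the remaining rows.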
V. CONCLUSION
Pronunciation is central to modern communication, where the emphasis has shifted from narrow linguistic competence to broader communicative competence. Effective communication depends on good pronunciation, and pronunciation involves more than producing the right phonemes: it forms the foundation for the next level of speech analysis. Adequate pronunciation skills also support professional advancement. However, the ambiguities of natural language make it difficult to judge the correct pronunciation. The goal of this paper was to provide a phonetic dictionary. This dictionary is the basis for the next stage of the research, which is to build an interface for recognizing the phonetic structure of sounds produced in natural language, primarily English. The dictionary presented here covers only the Latin alphabet as used in American English; the scope was deliberately restricted in order to limit the resources required. The same rules lay the groundwork for recognizing the sounds of other languages such as Kannada and Telugu. Researchers building sound recognition systems can benefit from the rules and the dictionary given in this paper. Future work will focus on developing a phonetic dictionary tool that applies the rules specified in the paper.

REFERENCES:
[1]. Ngoc Thang Vu, Dau-Cheng Lyu, Weiner, J., Telaar, D., Schlippe, T., Blaicher, F., Eng-Siong Chng, Schultz, T., Haizhou Li, "A first speech recognition system for Mandarin-English code-switch conversational speech," 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2012, pp. 4889-4892, DOI: 10.1109/ICASSP.2012.6289015.
[2]. Lu Qiao, Wan Pu, Zhang Li, Zhu Dao-yong, "Notice of Retraction: A kind of Chinese language Phonetic input output system code," 2010 3rd IEEE International Conference on Computer Science and Information Technology (ICCSIT), vol. 9, 2010, pp. 508-511, DOI: 10.1109/ICCSIT.2010.5563773.
[3]. Tomasz P. Szynalski, Proceedings of the International Research Conference "Metamorphoses of the Mind: Out of One's Mind, Losing One's Mind, Lightmindedness, Mindlessness," organized by the Latvian Academy of Culture and Goethe Institut Riga, Riga, 21–23 September 2006, http://www.antimoon.com/how/pronunc-trans.htm
[4]. http://www.antimoon.com/how/pronunc-soundsipa.htm#chart
[5]. Marcus Otlowski, "Pronunciation: What Are the Expectations?," The Internet TESL Journal, Vol. IV, No. 1.
[6]. Gales, M.J.F. (Cambridge Univ.), "Development of a phonetic system for large vocabulary Arabic speech recognition," IEEE Workshop on Automatic Speech Recognition & Understanding (ASRU), 9-13 Dec. 2007, pp. 24-29, E-ISBN: 978-1-4244-1746-9, Print ISBN: 978-1-4244-1746-9, INSPEC Accession Number: 9832782.
[7]. Palanisamy, K., "Audio visual based pronunciation dictionary for Indian languages," 2010 International Conference on Technology for Education (T4E), 1-3 July 2010, pp. 82-84, E-ISBN: 978-1-4244-7361-8, Print ISBN: 978-1-4244-7362-5, INSPEC Accession Number: 11476081.
[8]. Domokos, J., Buza, O., Toderean, G., "100K+ words, machine-readable, pronunciation dictionary for the Romanian language," 2012 Proceedings of the 20th European Signal Processing Conference (EUSIPCO), IEEE, 27-31 Aug. 2012, pp. 320-324, ISSN: 2219-5491, Print ISBN: 978-1-4673-068-0, INSPEC Accession Number: 13072796.
[9]. Akbacak, M., Hansen, J.H.L., "Spoken Proper Name Retrieval for Limited Resource Languages Using Multilingual Hybrid Representations," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, 2010, pp. 1486-1495, DOI: 10.1109/TASL.2009.2035785.