9th National Research Conference

Abstract
Urdu is understood across most regions of South Asia and is a rapidly growing medium of communication in the Arab world. Transliteration, commonly known as Romanization when the target is the Roman script, is a way of mapping words from one writing system into another. The research carried out at the Center of Excellence for Urdu Informatics concerned the mapping of Romanized Urdu, written in English (Roman) script, to the Urdu script, with the aim of software that converts transliterated text into Urdu script for those who cannot write the Urdu script. In this research, the standard letters of the Urdu alphabet are listed in all their states (initial, medial, final, and isolated) together with their Romanized English equivalents. Similar lists were developed for letters that are represented by two English letters (for example, ‫بھ‬ ~ bh), as well as for Urdu vowels and diacritics (airaab). The rules of application discuss the letters that may be romanized in different ways depending on their context, and the Romanization of orthographic symbols other than letters and vowel signs. The work also examines how transliteration is affected by grammatical structure, special characters, and character modifiers.
Keywords: Transliteration, English, Urdu, Phonetic issues
I. INTRODUCTION
Transliteration is the systematic conversion of text from one writing system to another, mapping one script onto the other letter by letter. Transliteration and transcription are often mistaken for the same thing, but they work in opposite ways: transliteration maps the written letters of a language, whereas transcription maps its sounds. When Romanization attempts to transliterate the original script, the guiding principle is a one-to-one mapping of characters in the source language into the target script, with less emphasis on how the result sounds when pronounced according to the reader's language.
The Arabic script is so well established that communities do not readily accept the Roman script and oppose its adoption. Online, however, the Roman script is very popular, because support for the Arabic script is often unavailable: it is rarely implemented or still under development. Numerous websites, blogs, and technical forums use the Roman script, and the online community readily understands communication written in it.
People in some Central Asian countries cannot read or understand the Arabic script, and they have used transliteration for reciting the Quran in Arabic. Even the Arab nations, as owners of the language, accept this practice, and it has been standardized for languages such as Malay and Indonesian. These languages have standardized the letters of the Arabic alphabet to make transliteration possible; without standardization, transliteration is not possible. Because the Urdu character set is not yet completely implemented, how to standardize it so that transliteration can be relied upon is much discussed. Whereas transcription implies seeking the best way to render foreign words in a particular language, typing transliteration is a purely pragmatic process of inputting text in a particular language. There is therefore strong demand for Urdu to standardize its character set for transliteration.
Transliteration is most useful where the original script is not available for writing or is not understood by foreign-language users. It can help non-native or foreign-language users learn and understand a language using their local script, so that learning becomes easier.
II. EARLY HISTORY OF TRANSLITERATION
Work on transliteration began several hundred years ago in Asia (India and the Arab world). Indian phoneticians did extensive work, analyzing sound categories and the issues that arose. Further work on transliteration was done under the guidance of a committee set up at the Geneva Oriental Congress in September 1894, which broadly finalized the standard for the transliteration of Sanskrit.

Transliteration / Romanization for Urdu processing (June 2009)
Rashida Sharif
Center of Excellence for Urdu Informatics (CEUI),
National Language Authority, Islamabad
rashida.sharif@gmail.com

Previous Work
Numerous systems have been developed for transliterating Urdu into English and several other languages, but they are not reversible. The most popular are those of the British Library, the Library of Congress, and the Encyclopedia of Islam. Because Urdu and Hindi share the same grammar and a large number of words, speakers of each can easily understand the other's language. The only obstacle is the script. Pakistan chose the Arabic script rather than Devanagari for Urdu, and that script does not transliterate well into Hindi, or probably into Roman either.
1. Urdu Informatics, “Reversible Urdu Transliteration to be used in computer/Email/Internet”, by Dr. Attash Durrani, published by the National Language Authority in 2008. This article discusses the letters of the alphabet and their Roman properties and values in great detail. [Appendix B][1]
2. The letters of the Urdu alphabet are covered in the Library of Congress romanization scheme. [4]
3. A transliteration editor for Arabic, Persian, and Urdu was developed in India at Carnegie Mellon University, Hyderabad; its authors highlight the commonalities of Middle Eastern languages and discuss all the primary forms of the letters. [3]
4. British Library [6]
5. Google Labs (Google Indic Transliteration) [8]
6. The Encyclopedia of Islam [7]
7. Urdu orthography and its character set, including basic and secondary letters, diacritics (aerab), punctuation marks, and special symbols, are also explained by the Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences. [4]
The resources mentioned above do not provide a complete set of Urdu letters, and their transliteration schemes are not sufficient for developing a system that converts text exactly as the user expects.
III. SCHEMES/ SYSTEMS FOR TRANSLITERATION
Different languages have their own phonetic schemes, which address language-specific issues and make it possible to convert text from one writing system to another in a systematic way. For Urdu, the following schemes are considered worth following:
1. Speech Assessment Methods Phonetic Alphabet (SAMPA), a scheme widely used across the world for encoding the International Phonetic Alphabet (IPA).
2. Universal Intermediate Transcription (UIT), a scheme for transcribing text in Urdu, Punjabi, and Hindi, considered an unambiguous standard. [2]
3. ALA-LC Romanization scheme for letters of the alphabet: Transliteration Schemes for Non-Roman Scripts, approved by the Library of Congress and the American Library Association. [4]
Issues of Mapping Urdu Alphabets into English
We need a transliteration system based on both letter conversion and a phonetic approach. A great deal of work on Urdu transliteration has been done or is in progress, and it must handle a structure that is complex compared with other languages. Numerous systems have been completed, but many ambiguities remain for users. Mapping Urdu letters into English poses many problems for anyone seeking a perfectly reversible system, one that yields a 100% correct equivalent when transliterating from Roman back into Urdu, ideally without requiring extra symbols to be added before or after the letters. At FAST University, a scheme was developed for a corpus-based Urdu lexicon development system. I examined that scheme with particular attention to its list of standard Urdu letters, analyzed the ambiguities found, and suggest a possible solution.
SAMPA   Urdu letters
t       ت، ط
s       س، ص، ث
z       ذ، ض، ظ، ز
h       ہ، ح
a       ا، آ
@       ء، ع
Table A: Ambiguous letters in the Appendix A mapping
If both “ت” and “ط” are converted to “t”, reverse transliteration is not possible, because the mapping in this table supports transliteration in one direction only. The same holds for all the letters listed in Table A.
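The ambiguity in Table A can be checked mechanically. The following is a minimal Python sketch, not part of the original study: it copies the letter and symbol pairs from Table A and groups the Urdu letters by SAMPA symbol. The names `URDU_TO_SAMPA`, `find_collisions`, and `is_reversible` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Urdu letter -> SAMPA symbol, copied from Table A.
URDU_TO_SAMPA = {
    "ت": "t", "ط": "t",
    "س": "s", "ص": "s", "ث": "s",
    "ذ": "z", "ض": "z", "ظ": "z", "ز": "z",
    "ہ": "h", "ح": "h",
    "ا": "a", "آ": "a",
    "ء": "@", "ع": "@",
}


def find_collisions(mapping):
    """Return each target symbol that is shared by two or more source letters."""
    by_symbol = defaultdict(list)
    for letter, symbol in mapping.items():
        by_symbol[symbol].append(letter)
    return {s: letters for s, letters in by_symbol.items() if len(letters) > 1}


def is_reversible(mapping):
    """A mapping can be inverted only if no target symbol is shared."""
    return not find_collisions(mapping)


collisions = find_collisions(URDU_TO_SAMPA)
for symbol, letters in collisions.items():
    print(symbol, "<-", " ".join(letters))
```

Every symbol in Table A appears in `collisions`, so a reverse pass has no way to decide which Urdu letter a given Roman symbol came from.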
With the mapping given in Table C, the revised system can convert text back and forth repeatedly without loss, avoiding the problems that the Appendix A mapping exhibits or leaves unaddressed.
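To illustrate what reversibility means in practice, here is a small Python sketch under stated assumptions. Table C is not reproduced in this text, so the Roman symbols below (upper-case letters for the second member of each pair) are placeholders invented for the example and are not the entries of Table C. The function names are likewise hypothetical.

```python
# Hypothetical one-to-one mapping. The Roman symbols are placeholders
# chosen for illustration and are NOT taken from Table C.
URDU_TO_ROMAN = {
    "ت": "t", "ط": "T",
    "س": "s", "ص": "S",
    "ہ": "h", "ح": "H",
}


def invert(mapping):
    """Build the reverse mapping, refusing any symbol used by two letters."""
    inverse = {}
    for src, dst in mapping.items():
        if dst in inverse:
            raise ValueError(f"{dst!r} is shared by {inverse[dst]!r} and {src!r}")
        inverse[dst] = src
    return inverse


ROMAN_TO_URDU = invert(URDU_TO_ROMAN)


def transliterate(text, mapping):
    """Map each character through the table; unmapped characters pass through."""
    return "".join(mapping.get(ch, ch) for ch in text)


def round_trip(text):
    """Urdu -> Roman -> Urdu; equals the input when the mapping is one-to-one."""
    roman = transliterate(text, URDU_TO_ROMAN)
    return transliterate(roman, ROMAN_TO_URDU)
```

Because each Urdu letter has its own Roman symbol, `round_trip` returns its input unchanged for any string built from the mapped letters, however many times it is applied. Passing a many-to-one table such as Table A to `invert` raises an error instead.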
IV. CONCLUSION
As we need a transliteration system based both on letter
conversion and phonetic approach suggested Urdu set of
characters [table c] for reversible transliteration scheme is a
worth able to admit the developing systems rising day by day
feeling the need of writing to communicate among native user,
all alphabets are conversed and settled for English to Urdu
transliteration. In the list all consonants, Diagraph representing
Urdu aspirates, Urdu vowel and Diphthongs discretely
describe to evade the ambiguous of same sounds letters that’s
rebuff the level way to make transliteration possible.
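Digraphs for aspirates mean that Roman input cannot always be converted one character at a time. A common way to handle this is greedy longest-match lookup, sketched below in Python as an assumption about how such a converter might work, not as the method used in this study. Only the pair bh ~ بھ comes from the paper; the remaining table entries and the names `ROMAN_TO_URDU` and `roman_to_urdu` are illustrative.

```python
# Only "bh" ~ بھ is taken from the paper; the other entries are
# illustrative assumptions, not the paper's Table C.
ROMAN_TO_URDU = {
    "bh": "بھ",
    "ph": "پھ",
    "b": "ب",
    "p": "پ",
    "h": "ہ",
}
MAX_KEY_LEN = max(len(key) for key in ROMAN_TO_URDU)


def roman_to_urdu(text):
    """Greedy longest-match conversion: digraphs are tried before single letters."""
    out = []
    i = 0
    while i < len(text):
        for size in range(MAX_KEY_LEN, 0, -1):
            chunk = text[i:i + size]
            if chunk in ROMAN_TO_URDU:
                out.append(ROMAN_TO_URDU[chunk])
                i += len(chunk)
                break
        else:
            # No table entry starts here: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)
```

With this table, "bh" always becomes the aspirate and never the sequence of ب followed by ہ, which is exactly the kind of ambiguity a reversible scheme has to rule out by giving each case its own representation.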
Appendix A
Appendix B
REFERENCES
[1] Attash Durrani, “Reversible Urdu Transliteration”, in Urdu Informatics (book), 1st ed., vol. 1, Islamabad, 2008, pp. 48-50.
[2] Sarmad Hussain and Madiha Ijaz, “Corpus Based Urdu Lexicon Development” (article), Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, Lahore, 2007.
[3] M. G. Abbas Malik, Pushpak Bhattacharyya, Christian Boitet, “Hindi Urdu Machine Transliteration using Finite-state Transducers (HUMT)” (article), GETALP, Laboratoire d’Informatique de Grenoble, Université Joseph Fourier, France; Dept. of Computer Science and Engineering, IIT Bombay, India, 2008.
[4] ALA-LC, “Library of Congress” (online source), www.loc.gov
[5] SAMPA, “Speech Assessment Methods Phonetic Alphabet” (online source, article); see also Sarmad Hussain and Madiha Ijaz, “Corpus Based Urdu Lexicon Development” [2], http://phone.ucl.ac.uk/home/sampa/
[6] British Library, “Transliteration Scheme” (online source), Http://www.bl.uk/
[7] The Encyclopedia of Islam (online source), www.muslimphilosophy.com/ei2/list.htm
[8] Google Labs, “Google Indic Transliteration” (online source), http://www.google.com/transliterate/indic/Urdu
Rashida Sharif works at the Center of Excellence for Urdu Informatics, National Language Authority, Islamabad.

More Related Content

What's hot

Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
CSCJournals
 
A transformational generative approach towards understanding al-istifham
A transformational  generative approach towards understanding al-istifhamA transformational  generative approach towards understanding al-istifham
A transformational generative approach towards understanding al-istifham
Alexander Decker
 
Presentation 3 applied branch of translation studies
Presentation 3   applied branch of translation studiesPresentation 3   applied branch of translation studies
Presentation 3 applied branch of translation studies
Raeza Rizon
 

What's hot (16)

STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORMSTANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM
STANDARD ARABIC VERBS INFLECTIONS USING NOOJ PLATFORM
 
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
Rule-Based Standard Arabic Phonetization at Phoneme, Allophone, and Syllable ...
 
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATION
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATIONA ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATION
A ROBUST THREE-STAGE HYBRID FRAMEWORK FOR ENGLISH TO BANGLA TRANSLITERATION
 
An implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzerAn implementation of apertium based assamese morphological analyzer
An implementation of apertium based assamese morphological analyzer
 
Deterministic Finite State Automaton of Arabic Verb System: A Morphological S...
Deterministic Finite State Automaton of Arabic Verb System: A Morphological S...Deterministic Finite State Automaton of Arabic Verb System: A Morphological S...
Deterministic Finite State Automaton of Arabic Verb System: A Morphological S...
 
Lexicographical Techniques Adopted in Tranquebar Tamil-English Dictionary
Lexicographical Techniques Adopted in Tranquebar Tamil-English DictionaryLexicographical Techniques Adopted in Tranquebar Tamil-English Dictionary
Lexicographical Techniques Adopted in Tranquebar Tamil-English Dictionary
 
Translation periods By Christine Joanne Librero-Desacado
Translation periods   By Christine Joanne Librero-DesacadoTranslation periods   By Christine Joanne Librero-Desacado
Translation periods By Christine Joanne Librero-Desacado
 
Presentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-finalPresentation curras paper-emnlp2014-final
Presentation curras paper-emnlp2014-final
 
Lexical Borrowings in Boro
Lexical Borrowings in BoroLexical Borrowings in Boro
Lexical Borrowings in Boro
 
Coreference recognition in arabic
Coreference recognition in arabicCoreference recognition in arabic
Coreference recognition in arabic
 
A transformational generative approach towards understanding al-istifham
A transformational  generative approach towards understanding al-istifhamA transformational  generative approach towards understanding al-istifham
A transformational generative approach towards understanding al-istifham
 
Lexicography
 Lexicography Lexicography
Lexicography
 
Leah Dacheva & Richard Fay
Leah Dacheva & Richard FayLeah Dacheva & Richard Fay
Leah Dacheva & Richard Fay
 
Lexicography
 Lexicography Lexicography
Lexicography
 
Translation Services: A Brief Study
Translation Services: A Brief StudyTranslation Services: A Brief Study
Translation Services: A Brief Study
 
Presentation 3 applied branch of translation studies
Presentation 3   applied branch of translation studiesPresentation 3   applied branch of translation studies
Presentation 3 applied branch of translation studies
 

Similar to Transliteration/Romanization of Urdu Processing by Rashida sharif

CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
kevig
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
ijnlc
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Waqas Tariq
 
MoM2010: Arabic natural language processing
MoM2010: Arabic natural language processingMoM2010: Arabic natural language processing
MoM2010: Arabic natural language processing
Hend Al-Khalifa
 
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation SystemInterpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
ijtsrd
 
Design and Implementation of a Language Assistant for English – Arabic Texts
Design and Implementation of a Language Assistant for English – Arabic TextsDesign and Implementation of a Language Assistant for English – Arabic Texts
Design and Implementation of a Language Assistant for English – Arabic Texts
IJCSIS Research Publications
 
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
kevig
 

Similar to Transliteration/Romanization of Urdu Processing by Rashida sharif (20)

Azhary: An Arabic Lexical Ontology
Azhary: An Arabic Lexical OntologyAzhary: An Arabic Lexical Ontology
Azhary: An Arabic Lexical Ontology
 
TRANSLITERATION BY ORTHOGRAPHY OR PHONOLOGY FOR HINDI AND MARATHI TO ENGLISH:...
TRANSLITERATION BY ORTHOGRAPHY OR PHONOLOGY FOR HINDI AND MARATHI TO ENGLISH:...TRANSLITERATION BY ORTHOGRAPHY OR PHONOLOGY FOR HINDI AND MARATHI TO ENGLISH:...
TRANSLITERATION BY ORTHOGRAPHY OR PHONOLOGY FOR HINDI AND MARATHI TO ENGLISH:...
 
Arabic words stemming approach using arabic wordnet
Arabic words stemming approach using arabic wordnetArabic words stemming approach using arabic wordnet
Arabic words stemming approach using arabic wordnet
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
 
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text EditorDynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
 
Exploring the effects of stemming on
Exploring the effects of stemming onExploring the effects of stemming on
Exploring the effects of stemming on
 
Phonetic Dictionary for Natural Language Processing: Kannada
Phonetic Dictionary for Natural Language Processing: KannadaPhonetic Dictionary for Natural Language Processing: Kannada
Phonetic Dictionary for Natural Language Processing: Kannada
 
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITESANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
 
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITESANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
 
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITESANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
ANALYTICAL STUDY TO REVIEW OF ARABIC LANGUAGE LEARNING USING INTERNET WEBSITES
 
MoM2010: Arabic natural language processing
MoM2010: Arabic natural language processingMoM2010: Arabic natural language processing
MoM2010: Arabic natural language processing
 
The classification of the modern arabic poetry using machine learning
The classification of the modern arabic poetry using machine learningThe classification of the modern arabic poetry using machine learning
The classification of the modern arabic poetry using machine learning
 
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation SystemInterpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
Interpretation of Sadhu into Cholit Bhasha by Cataloguing and Translation System
 
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVALDICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
DICTIONARY BASED AMHARIC-ARABIC CROSS LANGUAGE INFORMATION RETRIEVAL
 
Design and Implementation of a Language Assistant for English – Arabic Texts
Design and Implementation of a Language Assistant for English – Arabic TextsDesign and Implementation of a Language Assistant for English – Arabic Texts
Design and Implementation of a Language Assistant for English – Arabic Texts
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
 
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
Construction of Amharic-arabic Parallel Text Corpus for Neural Machine Transl...
 
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
CONSTRUCTION OF AMHARIC-ARABIC PARALLEL TEXT CORPUS FOR NEURAL MACHINE TRANSL...
 
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
A GRAMMATICALLY AND STRUCTURALLY BASED PART OF SPEECH (POS) TAGGER FOR ARABIC...
 

Recently uploaded

Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Peter Udo Diehl
 

Recently uploaded (20)

"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
 
Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...Designing Great Products: The Power of Design and Leadership by Chief Designe...
Designing Great Products: The Power of Design and Leadership by Chief Designe...
 
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
AI for Every Business: Unlocking Your Product's Universal Potential by VP of ...
 
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
Kubernetes & AI - Beauty and the Beast !?! @KCD Istanbul 2024
 
How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...How world-class product teams are winning in the AI era by CEO and Founder, P...
How world-class product teams are winning in the AI era by CEO and Founder, P...
 
Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...Mission to Decommission: Importance of Decommissioning Products to Increase E...
Mission to Decommission: Importance of Decommissioning Products to Increase E...
 
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi IbrahimzadeFree and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
Free and Effective: Making Flows Publicly Accessible, Yumi Ibrahimzade
 
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptxUnpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
Unpacking Value Delivery - Agile Oxford Meetup - May 2024.pptx
 
Assuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyesAssuring Contact Center Experiences for Your Customers With ThousandEyes
Assuring Contact Center Experiences for Your Customers With ThousandEyes
 
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya HalderCustom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
Custom Approval Process: A New Perspective, Pavel Hrbacek & Anindya Halder
 
Powerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara LaskowskaPowerful Start- the Key to Project Success, Barbara Laskowska
Powerful Start- the Key to Project Success, Barbara Laskowska
 
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
From Daily Decisions to Bottom Line: Connecting Product Work to Revenue by VP...
 
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
De-mystifying Zero to One: Design Informed Techniques for Greenfield Innovati...
 
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
Exploring UiPath Orchestrator API: updates and limits in 2024 🚀
 
UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2UiPath Test Automation using UiPath Test Suite series, part 2
UiPath Test Automation using UiPath Test Suite series, part 2
 
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo DiehlFuture Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
Future Visions: Predictions to Guide and Time Tech Innovation, Peter Udo Diehl
 
IESVE for Early Stage Design and Planning
IESVE for Early Stage Design and PlanningIESVE for Early Stage Design and Planning
IESVE for Early Stage Design and Planning
 
Connector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a buttonConnector Corner: Automate dynamic content and events by pushing a button
Connector Corner: Automate dynamic content and events by pushing a button
 
JMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and GrafanaJMeter webinar - integration with InfluxDB and Grafana
JMeter webinar - integration with InfluxDB and Grafana
 
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...
 

Transliteration/Romanization of Urdu Processing by Rashida sharif

  • 1. 9th National Research Conference 1  Abstract Urdu is the language which is understandable by most of the regions in South Asia and rapidly growing medium of communication in Arab World .Transliteration commonly known as Romanization is a way of mapping words from one system of writing into another. The research taken place here in “Center of Excellence for Urdu Informatics” was about the mapping of Romanized alphabets of Urdu in English Script to the Urdu Script. Software for converting transliterates text into Urdu script for whom cannot write the Urdu script. In the research, standard alphabets of Urdu listed with their all states e.g. initial state, medial state, final state and alone states with their Romanized alphabets of English. Similarly developed lists of those words take place with two alphabets of English ‫بھ‬ ~ bh as well as Urdu vowels and diacritics (airaab). In the defined rules of application genuinely discussed the letters which may be romanized in different ways depending on their context, Romanization of orthographic symbols other than letters and vowel sign. Deeply hunt the transliteration as affected by grammatical structure and special characters and characters modifiers in transliteration. Keywords: Transliteration, English, Urdu, Phonetic issues I. INTRODUCTION he methodology of converting text from one writing system to another writing system in a systematic manner is known as transliteration. In systematic way transliteration is the mapping from one system of writing to another letter by letter. Mostly there mis-understands about transliteration and transcription that both are same but clarifying the confusion between transliteration and transcription they are situational opposite of each other i.e. 
transliteration Romanization attempts to transliterate the original script, the guiding principle is a one-to-one mapping of characters in the source language into the target script, with less emphasis on how the result sounds when pronounced according to the reader's language is mapping of words in a language where as transcription is mapping of sounds with words in a language . Arabic script is much into the popularity that Roman script is not easily acceptable by the communities and they oppose this trend. Meanwhile it is very popular on the cyber end because of the unavailability of Arabic Script as it is rarely Implemented or mostly underdevelopment. Numerous websites and blogs including technical forums are in Roman Script and communication is vastly understandable by the community on the cyber end. Some of the central Asian countries are not able to read or understand Arabic script and they have used transliteration for the recitation of Quran in Arabic even the Arab nations the owner of the language accept and standardized in Malaya, Indonesian, and The languages and they have standardized letters of the Arabic alphabets to make it possible unless the stranderazation is not existent transliteration nor possible. As the Urdu character set is not completely implemented it is highly in discussion the standardize and make it possible to count on to transliterate. While the transcription implies seeking the best way to render foreign words into a particular language, the typing transliteration is a purely pragmatic process of inputting text in a particular language therefore it is highly in demand for URDU to standardize its character set for transliteration. Mostly transliteration is feasible where original script is not available to write or understandable from the foreign language user. It can be beneficial for the learning and understanding the Languages by non native or foreign language users to use local Learning can be made possible in easy manner. II. 
EARLY HISTORY OF TRANSLITERATION
Work on transliteration began several hundred years ago in Asia (India and the Arab world). Indian phoneticians did much of this work, analysing the categories of sounds and the issues that arose.
Transliteration / Romanization for Urdu processing (June 2009)
Rashida Sharif, Center of Excellence for Urdu Informatics (CEUI), National Language Authority, Islamabad, rashida.sharif@gmail.com
Some work on transliteration was done
  • 2. under the guidance of a committee set up at the Geneva Oriental Congress in September 1894, which broadly finalized the standard for the transliteration of Sanskrit.
Previous Work
Numerous systems have been developed for transliterating Urdu into English and several other languages, but they are not reversible; the most popular are those of the British Library, the Library of Congress and the Encyclopedia of Islam. Since Urdu and Hindi are grammatically the same language and share a large number of words, it is easy for speakers of each to understand the other. The only obstacle is the script. Pakistan chose the Arabic script rather than Devanagari for Urdu, and that script does not transliterate well into Hindi, nor probably into Roman.
1. URDU INFORMATICS: “Reversible Urdu Transliteration to be used in computer/Email/Internet” by Dr. Attash Durrani, published by the National Language Authority in 2008. This article discusses the letters of the alphabet, their Roman properties and their values in great detail.[APPENDIX B][1]
2. The letters of the Urdu alphabet are discussed by the Library of Congress.[4]
3. A transliteration editor for Arabic, Persian and Urdu has been developed in India at Carnegie Mellon University, Hyderabad; its authors shed light on the commonalities of Middle Eastern languages and discuss all the primary forms of the letters.[3]
4. British Library[6]
5. Google Labs (Google Indic Transliteration)[8]
6. The Encyclopedia of Islam[7]
7. Urdu orthography is also explained, with its character set including basic and secondary letters, diacritics (aerab), punctuation marks and special symbols, by the Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences.[4]
The resources listed above do not provide a complete set of Urdu letters, and their transliteration schemes are not sufficient to build a system that delivers the fully intelligible conversion the user wants.
III. 
SCHEMES / SYSTEMS FOR TRANSLITERATION
Different languages have their own phonetic schemes for resolving native-language issues when converting text from one writing system to another in a systematic way. For Urdu, some worthwhile schemes to follow are:
1. Speech Assessment Methods Phonetic Alphabet (SAMPA), a scheme widely used across the world for encoding the International Phonetic Alphabet (IPA).
2. Universal Intermediate Transcription (UIT), a scheme for transcribing text in Urdu, Punjabi and Hindi, considered an unambiguous standard.[2]
3. ALA-LC Romanization scheme for letters of the alphabet: Transliteration Schemes for Non-Roman Scripts, approved by the Library of Congress and the American Library Association.[4]
Issues of Mapping Urdu Alphabets into English
We need a transliteration system based on both letter conversion and a phonetic approach. A great deal of work on Urdu transliteration has been done or is under way, complicated by the complexity of its structure compared with other languages. Numerous systems have been completed, but many ambiguities still remain for users. In mapping Urdu letters into English, many problems stand in the way of a perfectly reversible system, one that produces a 100% correct equivalent in Urdu from Roman text, with the expectation that no extra symbols need to be added with, before or after the letters. At FAST University, a scheme was developed for a corpus-based Urdu lexicon development system. I examined it specifically with respect to the standard Urdu letters it lists, analysed the ambiguities found, and suggest a possible solution.
SAMPA    Urdu letters
t        ت، ط
s        س، ص، ث
z        ذ، ض، ظ، ز
h        ہ، ح
a        ا، آ
@        ء، ع
Table A: ambiguous letters in Appendix A
If “ت” and “ط” are both converted to “t”, reverse transliteration is not possible, because the mapping in this table provides only unidirectional transliteration.
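The ambiguity in Table A can be checked mechanically. The following is a minimal sketch, not part of the original work: it encodes Table A as given above and lists every Roman symbol that more than one Urdu letter maps to. The function names are illustrative assumptions.

```python
from collections import defaultdict

# Table A as given in the text: SAMPA symbol -> Urdu letters that share it.
TABLE_A = {
    "t": ["ت", "ط"],
    "s": ["س", "ص", "ث"],
    "z": ["ذ", "ض", "ظ", "ز"],
    "h": ["ہ", "ح"],
    "a": ["ا", "آ"],
    "@": ["ء", "ع"],
}


def urdu_to_roman(table):
    """Build the forward (Urdu -> Roman) mapping implied by the table."""
    return {letter: symbol for symbol, letters in table.items() for letter in letters}


def ambiguous_symbols(forward):
    """Return the Roman symbols that more than one Urdu letter maps to."""
    inverse = defaultdict(list)
    for letter, symbol in forward.items():
        inverse[symbol].append(letter)
    return {symbol: letters for symbol, letters in inverse.items() if len(letters) > 1}


forward = urdu_to_roman(TABLE_A)
collisions = ambiguous_symbols(forward)
print(sorted(collisions))  # ['@', 'a', 'h', 's', 't', 'z']
```

Every symbol in the table appears in the result, so none of these Roman symbols can be converted back to a single Urdu letter without further information.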
The same holds for all the letters listed in Table A. With the mapping given in Table C, the revised system can convert text back and forth repeatedly without any problem, which the Appendix A mapping does not allow.
IV. CONCLUSION
We need a transliteration system based on both letter conversion and a phonetic approach. The suggested set of Urdu characters [Table C] for a reversible transliteration scheme is worth adopting by the systems being developed, whose number grows day by day with the need of native users to communicate in writing. All letters are covered and settled for English-to-Urdu transliteration. In the list, all consonants, the digraphs representing Urdu aspirates, and the Urdu vowels and diphthongs are described separately, to avoid the ambiguity of letters that share the same sound, which otherwise blocks a straightforward path to transliteration.
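The reversibility property discussed above can be illustrated with a small round-trip check. This is a sketch under stated assumptions: the table below is hypothetical and is not the paper's Table C. Only the digraph bh ~ بھ comes from the text, and the symbol "T" for ط is invented here purely to keep ت and ط apart.

```python
# HYPOTHETICAL one-to-one table for illustration only (not the paper's Table C).
ROMAN_TO_URDU = {
    "t": "ت",
    "T": "ط",    # assumed symbol, chosen so that ت and ط do not collide
    "b": "ب",
    "bh": "بھ",  # digraph for an aspirate, as in the text: بھ ~ bh
    "a": "ا",
}


def is_one_to_one(table):
    """A table can be inverted only if no two keys share a value."""
    return len(set(table.values())) == len(table)


def convert(text, table):
    """Greedy longest-match conversion; raises ValueError on unmapped input."""
    keys = sorted(table, key=len, reverse=True)  # try digraphs before single letters
    out, i = [], 0
    while i < len(text):
        for key in keys:
            if text.startswith(key, i):
                out.append(table[key])
                i += len(key)
                break
        else:
            raise ValueError(f"no mapping for {text[i]!r} at position {i}")
    return "".join(out)


# The reverse table exists only because the forward table is one-to-one.
URDU_TO_ROMAN = {urdu: roman for roman, urdu in ROMAN_TO_URDU.items()}


def round_trips(roman):
    """True if Roman -> Urdu -> Roman returns the original string."""
    return convert(convert(roman, ROMAN_TO_URDU), URDU_TO_ROMAN) == roman


print(is_one_to_one(ROMAN_TO_URDU), round_trips("bhat"), round_trips("Tab"))
```

Greedy matching is safe here only because no sequence of single letters spells a digraph. If the table also mapped "h" on its own, the sequence b + h would be indistinguishable from bh, and a real scheme would need another way to keep them apart.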
  • 6. REFERENCES
[1] Dr. Attash Durrani, “Reversible Urdu Transliteration” (book, Urdu Informatics), 1st ed., Vol. 1, Islamabad, 2008, pp. 48-50.
[2] Dr. Sarmad Hussain and Madiha Ijaz, “Corpus Based Urdu Lexicon Development” (article), Center for Research in Urdu Language Processing, National University of Computer and Emerging Sciences, Lahore, 2007.
[3] M. G. Abbas Malik, Pushpak Bhattacharyya and Christian Boitet, “Hindi Urdu Machine Transliteration using Finite-state Transducers (HUMT)” (article), GETALP, Laboratoire d'Informatique de Grenoble, Université Joseph Fourier, France; Dept. of Computer Science and Engineering, IIT Bombay, India, 2008.
[4] ALA-LC, “Library of Congress” (online source), www.loc.gov
[5] SAMPA, “Speech Assessment Methods Phonetic Alphabet” (online source, article), Dr. Sarmad Hussain and Madiha Ijaz, “Corpus Based Urdu Lexicon Development” [2], http://phone.ucl.ac.uk/home/sampa/
[6] British Library, “Transliteration scheme” (online source), Http://www.bl.uk/
[7] The Encyclopedia of Islam (online source), www.muslimphilosophy.com/ei2/list.htm
[8] Google Labs, “Google Indic Transliteration” (online source), http://www.google.com/transliterate/indic/Urdu
Rashida Sharif works at the Center of Excellence for Urdu Informatics, National Language Authority, Islamabad.