- Today the regional economies, societies, cultures and educations are integrated through a globe-spanning network of communication and trade.
- This globalization trend evokes for a homogeneous platform so that each member of the platform can apprehend what other intimates and perpetuates the discussion in a mellifluous way.
- However the barriers of languages throughout the world are continuously obviating the whole world from congregating into a single domain of sharing knowledge and information.
- Therefore researcher works on various languages and tries to give a platform where multi lingual people can communicate through their native language.
- Researcher analyze the language structure and form structural grammar and rules which used to translate one language to other.
- From the last few years several language-specific translation systems have been proposed.
- Since these systems are based on specific source and target languages, these have their own limitations.
- As a consequence United Nations University/Institute of Advanced Studies (UNU/IAS) were decided to develop an inter-language translation program .
- The corollary of their continuous research leads a common form of languages known as Universal Networking Language (UNL) and introduces UNL system.
- UNL system is an initiative to overcome the problem of language pairs in automated translation. UNL is an artificial language that is based on Interlingua approach.UNL acts as an intermediate form computer semantic language whereby any text written in a particular language is converted to text of any other forms of languages.
- UNL system consists of major three components:
- language resources
- software for processing language resources (parser) and
- supporting tools for maintaining and operating language processing software or developing language resources.
- The parser of UNL system take input sentence and start parsing based on rules and convert it into corresponding universal word from word dictionary.
- The challenge in detection of named is that such expressions are hard to analyze using UNL because they belong to the open class of expressions, i.e., there is an infinite variety and new expressions are constantly being invented.
- Bengali is the seventh popular language in the world, second in India and the national language of Bangladesh.
- So this is an important problem since search queries on UNL dictionary for proper nouns while all proper nouns(names) cannot be exhaustively maintained in the dictionary for automatic identification.
In this research project we do this task , Proper noun detection and conversion.
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Â
Development of analysis rules to identify proper noun from bengali sentence for universal networking language
1. Development of Analysis Rules to Identify
Proper Noun from Bengali Sentence for
Universal Networking language
Prepared By-
Md. Syeful Islam
Sr. Software Engineer at-
Samsung Research Institute Bangladesh
Google scholar : http://scholar.google.com/citations?user=b0QDD2AAAAAJ&hl=en
academia.edu: https://independent.academia.edu/SyefulIslam
2. Presentation coveredâĻ
âĸ Overview
âĸ Introducing with Universal Networking
Language(UNL)
âĸ Proposed Proper Noun Conversion Approach
âĸ Result Analysis
3. Overview
âĸ Today the regional economies, societies, cultures and educations are integrated through a
globe-spanning network of communication and trade.
âĸ This globalization trend evokes for a homogeneous platform so that each member of the
platform can apprehend what other intimates and perpetuates the discussion in a
mellifluous way.
âĸ However the barriers of languages throughout the world are continuously obviating the
whole world from congregating into a single domain of sharing knowledge and information.
âĸ Therefore researcher works on various languages and tries to give a platform where multi
lingual people can communicate through their native language.
âĸ Researcher analyze the language structure and form structural grammar and rules which
used to translate one language to other.
âĸ From the last few years several language-specific translation systems have been proposed.
âĸ Since these systems are based on specific source and target languages, these have their own
limitations.
âĸ As a consequence United Nations University/Institute of Advanced Studies (UNU/IAS) were
decided to develop an inter-language translation program .
âĸ The corollary of their continuous research leads a common form of languages known as
Universal Networking Language (UNL) and introduces UNL system.
4. OverviewâĻ
âĸ UNL system is an initiative to overcome the problem of language pairs in automated translation.
UNL is an artificial language that is based on Interlingua approach.UNL acts as an intermediate
form computer semantic language whereby any text written in a particular language is converted
to text of any other forms of languages.
âĸ UNL system consists of major three components:
- language resources
- software for processing language resources (parser) and
- supporting tools for maintaining and operating language processing software or
developing language resources.
âĸ The parser of UNL system take input sentence and start parsing based on rules and convert it into
corresponding universal word from word dictionary.
âĸ The challenge in detection of named is that such expressions are hard to analyze using UNL
because they belong to the open class of expressions, i.e., there is an infinite variety and new
expressions are constantly being invented.
âĸ Bengali is the seventh popular language in the world, second in India and the national language of
Bangladesh.
âĸ So this is an important problem since search queries on UNL dictionary for proper nouns while all
proper nouns(names) cannot be exhaustively maintained in the dictionary for automatic
identification.
In this research project we do this task , Proper noun detection and conversion.
5. Introducing with Universal Networking
Language(UNL)
What is the UNL?
âĸ The UNL consists of Universal words (UWs), Relations, Attributes, and UNL
Knowledge Base. The Universal words constitute the vocabulary of the UNL,
Relations and attribute constitutes the syntax of the UNL and UNL Knowledge Base
constitutes the semantics of the UNL.
Why the UNL is necessary?
âĸ A computer in future needs a capability to make knowledge processing.
Knowledge processing means a computer takes over thought and judgment of
humans using knowledge of humans. It is necessary to make a processing based
on contents. Computers need to have knowledge for knowledge processing. It is
necessary for computers to have a language to have knowledge like human. It is
also necessary to have a language to process contents like human. The UNL is a
language for computers to do so.
âĸ The UNL can express knowledge like a natural language. The UNL can express
contents like a natural language.
6. Introducing with Universal Networking
Language(UNL)
What is different from others?
âĸ Systems which can deal with knowledge and contents have already been developed. But,
their representation of knowledge or contents is different from each other. Moreover, their
representations are language dependent. Namely, concept primitives used to represent
knowledge are language dependent. Knowledge or contents of a system cannot be used in
other systems.
âĸ The situation is same as machine translation. For example, if we put all the result of research
and development of machine translation, we cannot realize multilingual machine translation
systems which can break language barriers.
Advantage of common language for computers
âĸ The UNL greatly reduces development cost of developing knowledge or contents necessary
to make knowledge processing by sharing knowledge and contents. Furthermore, if every
knowledgeâs necessary for doing something by software is described in a language for
computers such as the UNL, software only need to interpret instructions written in the
language to perform it functions. And those instructions could be shared by other software.
Then we can accumulate such knowledge for computer like a library for humans.
7. Introducing with Universal Networking Language(UNL)
How the UNL express information?
âĸ The UNL expresses information or knowledge in the form of semantic network
with hyper-node. UNL semantic network is made up of a set of binary relations,
each binary relation is composed of a relation and two UWs that old the relation.
Expression of Binary Relation
âĸ A binary relation of UNL is expressed in the following format:
âĸ <relation> ( <uw1>, <uw2> )
âĸ In <relation>, one of the relations defined in the UNL Specifications is described. In
<uw1> and <uw2>, the two UWs that hold the relation given at <relation> are
described. Such a binary relation is interpreted as follows:
Interpretation of Binary Relation
âĸ The semantic network [4] of UNL expression is a directed graph by means of the
binary relations. The three elements of each binary relation have the following
interrelationship:
<uw1> --- <relation> --> <uw2>
This interrelationship means that the UW given in <uw2>
âĸ Plays the role indicated by the relation given in <relation> and held by the UW
given in <uw1>; whereas the UW given in <uw1>
âĸ holds the relation given in <relation> with the UW given in <uw2>.
As mentioned above, all binary relations that compose a UNL expression has
directions, and the semantic network of UNL expression is a directed graph.
8. Introducing with Universal Networking
Language(UNL)
Hyper-graph
âĸ The UNL expression is a hyper semantic network. That is, each node of the graph,
<uw1> and <uw2> of a binary relation, can be replaced with a semantic network.
Such a node consists of a semantic network of a UNL expression and is called a
âscopeâ. A scope can be connected with other UWs or scopes. The UNL
expressions of in a scope is distinguished from others by assigning an ID to the
<relations> of the set of binary relations that belong to the scope. The general
description format of binary relations for a hyper-node of UNL expression is the
following:
âĸ <Relation> :< scope-id> (<node1>, <node2>)
Where,
<scope-id> is the ID for distinguishing a scope. <scope-id> is not necessary to
specify when a binary relation does not belong to any scope.
âĸ <node1> and <node2> can be a UW or a <scope node>.
âĸ A <scope node> is given in the format of â:< scope-id>â.
The UNL expresses information classifying objectivity and subjectivity. Objectivity
is expressed using UWs and relations. Subjectivity is expressed using attributes by
attaching them to UWs.
9. Introducing with Universal Networking
Language(UNL)
Figure: Sentence information is represented as a hyper-graph
10. Introducing with Universal Networking
Language(UNL)
UNL Expression
[S:2]
{org:es}
Hace tiempo, en la ciudad de Babilonia, la gente comenzo a construir una torre
enorme, que parecia alcanzar los cielos.
{/org}
{unl}
tim(begin(icl>do).@entry.@past, long ago(icl>ago))
mod(city(icl>region).@def, Babylon(icl>city))
plc(begin(icl>do).@entry.@past, city (icl>region).@def)
agt(begin(icl>do).@entry.@past, people(icl>person).@def)
obj(begin(icl>do).@entry.@past, build(icl>do))
agt(build(icl>do), people.@def)
obj(build(icl>do), tower(icl>building).@indef)
aoj(huge(icl>big), tower(icl>building).@indef)
aoj(seem(icl>be).@past, tower(icl>building).@indef)
obj(seem(icl>be).@past, reach(icl>come).@begin.@soon)
obj(reach(icl>come).@begin-soon, tower(icl>building).@indef)
gol(reach(icl>come).@begin-soon, heaven(icl>region).@def.@pl)
{/unl}
[/S]
11. Proposed Proper Noun Conversion Approach
âĸ The proper noun conversion is relatively simple in
English language, because such entities start with a
capital letter.
âĸ In Bangla we do not have concept of small or capital
letters.
âĸ Thus we find difficulties in understanding whether a
word is a proper noun or not.
âĸ In this paper we define our work as two parts-
- identified a proper noun from Bengali sentence and
- convert this proper noun into UNL form.
12. Proposed Proper Noun Conversion ApproachâĻ
âĸ In our research project we proposed a method for UNL
to proper noun conversion which is a combination of
- dictionary-based and
- rule-based approaches.
âĸ In this approach, UNL EnConverter identifies the
proper noun using rules and morphological analysis.
âĸ The Post converter convert it to corresponding
language(English)
âĸ Approaches are sequentially described into next slide.
14. Proper Noun detection approach
âĸ In UNL system firstly take an input sentence
and parse it word by word
âĸ Search the word from dictionary the relative
word.
âĸ If not found it try to recognize that the
temporary word is a proper noun based on
define rules.
âĸ If this process is fail then morphological
analysis is used.
15. Proper Noun detection approachâĻ
The approaches of proper noun detection are sequentially
described with three steps-
īŧDictionary based analysis for proper noun detection
īŧRule-based analysis for proper noun detection
īŧMorphological Analysis
16. Dictionary based analysis for proper
noun detection
âĸ If a dictionary is maintained where we try to attach most commonly used
proper noun and it is known as Name dictionary. Here we describe the
dictionary based proper noun detection technique sequentially.
âĸ Firstly the input sentence is processed on en-converter which finds the
word on word dictionary. If the word is found then it is converted into
relative universal word. If not in dictionary then EnConverter creates a
temporary entry for this word.
âĸ Secondly the en-converter finds the word with flag TEMP into name
dictionary. If it is found then it is conceder as the word may be noun and
apply rules to ensure.
âĸ Finally if the word is not in name dictionary then it sends into
morphological analyzer to conform that the word is proper noun.
17. Rule-based analysis for proper noun
detection
âĸ Rule-based approaches rely on some rules, one or
more of which is to be satisfied by the test word.
âĸ Some words which use in Bangla sentence as a part of
name.
âĸ Here we take a technique to identify proper noun
using such word (part of name).
âĸ Firstly we make dictionary entry with pof (parts of
name) and other relevant attribute.
âĸ To identify proper noun from Bangla sentence use pof
word some typical rules are given next slide.
18. Rule-based analysis for proper noun
detection
âĸ Rule 1. If the parser find the following word like ā§āĻŽāĻžāĻ, āĻŽā§āĻ, ā§āĻŽāĻžāĻ¸āĻžāĻŽā§āĻ¯āĻŽ ā§, ā§āĻŽāĻžāĻāĻžāĻŽā§āĻ¯āĻŽ ā§, āĻāĻ, āĻāĻŦā§āĻ¯āĻĻā§āĻ˛, āĻāĻŦā§āĻ¯āĻĻā§āĻ°, āĻŋāĻŽāĻ,
āĻŋāĻŽāĻ¸, āĻŋāĻŽā§āĻ¸āĻ¸, ā§āĻŦāĻāĻŽ, āĻŋāĻŦāĻŋāĻŦ, āĻĄāĻā§āĻ¯āĻāĻ°, āĻĄāĻ, āĻāĻ˛ā§, ā§āĻļā§āĻ, āĻ¸ā§āĻ¯āĻžāĻŦāĻŽā§, ā§āĻ¸ā§āĻĻ, ā§āĻ°āĻāĻžā§āĻ°āĻ¨ā§āĻ¯āĻĄ, āĻļā§ā§āĻ¯ā§āĻ°, āĻļā§ā§āĻ¯ā§āĻ°āĻ¯ā§āĻā§āĻ¯āĻ¤ etc. then the word is
considered as the first word of name and set the status of the word first part of name (FPN). The word or
collection of words after FPN with status TEMP is also considered as parts of name.
âĸ Rule 2. If the parser find the following word (title words and mid-name words to human names) like
ā§āĻā§āĻ§ā§āĻ°ā§, āĻŋāĻŽā§āĻž, āĻŋāĻŽāĻāĻž, āĻā§ā§āĻ¯āĻāĻāĻžāĻĒāĻžāĻ¯āĻ§ā§āĻ¯āĻž ā§, āĻŽā§āĻāĻĒāĻžāĻ§ā§āĻ¯āĻžāĻ¯ā§, āĻāĻžāĻ¨, ā§āĻšāĻžā§āĻ¸āĻ¨, ā§āĻšāĻžāĻāĻžāĻāĻ¨, āĻ°āĻšāĻŽāĻžāĻ¨, ā§āĻšāĻžāĻ¸āĻžāĻāĻ¨, ā§āĻāĻžāĻˇā§, ā§āĻŦāĻžāĻ¸, āĻŦāĻ¸ā§, āĻŋāĻŽāĻ¤ā§āĻ¯āĻ°, āĻ°āĻžā§,
āĻ¸āĻ°āĻāĻžāĻ°, āĻāĻžāĻ¨, āĻāĻšā§āĻŽāĻĻ, āĻ°āĻšāĻŽāĻžāĻ¨, āĻšāĻ etc. and āĻā§āĻŽāĻžāĻ°, āĻāĻ¨ā§āĻ¯āĻĻā§āĻ¯āĻ°, āĻ°āĻā§āĻ¯āĻāĻ¨, ā§āĻļā§āĻāĻ°, āĻĒā§āĻ¯āĻ¸āĻ°āĻžāĻĻ, āĻāĻ˛ā§, āĻāĻ˛āĻŽ
etc. after temporary entry word. Then LPN and temporary entry word along with such words are likely to
constitute a multi-word name(proper noun). For example, āĻ°āĻŋāĻŦ āĻŦāĻ¸āĻžāĻ, āĻĒāĻ˛ā§āĻ¯āĻ˛āĻŦ āĻā§āĻŽāĻžāĻ° āĻŽāĻŋā§āĻ¯āĻ˛āĻ˛āĻ are all name.
âĸ Rule 3. If there are two or more words in a sequence that represent the characters or spell like the
characters of Bangla or English, then they belong to the name. For example, āĻŋāĻŦ āĻ (BA), āĻāĻŽ āĻ (MA), āĻāĻŽ āĻŋāĻŦ
āĻŋāĻŦ āĻāĻ¸ (MBBS) are all name. Note that the rule will not distinguish between a proper name and common
name.
19. Rule-based analysis for proper noun
detection
âĸ Rule 4. If a substring like āĻŦāĻžāĻŦā§, āĻĻāĻžāĻĻāĻž, āĻĻāĻž, āĻ¸āĻžā§āĻšāĻŦ, āĻāĻžāĻā§, āĻāĻāĻā§āĻ¯, āĻā§āĻ¯āĻžāĻ°āĻŽ, āĻĒā§āĻ°, āĻā§, āĻ¨āĻāĻ° occurs at the end of
the temporary word, then temporary word is likely to be a name.
âĸ Rule 5. If a word which is in temporary entry ended with āĻ-āĻāĻžāĻ°, āĻāĻ°, āĻ°, āĻ°āĻž, āĻāĻ°āĻž, ā§āĻ, ā§āĻĻāĻ°, ā§āĻ¤, ā§
then the word is likely to be a name.
âĸ Rule 6. If a word like - āĻ¸āĻ°āĻ¨ā§, ā§āĻ°āĻžāĻĄ, āĻŋāĻ°āĻ¸ā§āĻ¯āĻā§āĻ¯āĻ, ā§āĻ˛āĻ¨, āĻĨāĻžāĻ¨āĻž, āĻ¸ā§āĻ¯āĻā§āĻ˛, āĻŋāĻŦāĻĻā§āĻ¯āĻžāĻ¯āĻ˛ā§, āĻā§āĻ˛āĻ, āĻ¨āĻĻā§, ā§āĻ˛āĻ, āĻšā§āĻ¯āĻĻāĻ°, āĻ¸āĻžāĻāĻ°, āĻŽāĻšāĻžāĻ¸āĻžāĻāĻ°,
āĻĒāĻžāĻšāĻžā§, āĻĒāĻŦā§āĻ¯āĻ¤āĻ° is found after temporary word then NW along with such word may belong to name.
For example - āĻŋāĻŦāĻā§ āĻ¸āĻ°āĻ¨ā§, āĻ°āĻžā§āĻ¸āĻ˛ āĻŋāĻ°āĻ¸ā§āĻ¯āĻā§āĻ¯āĻ, āĻŋāĻāĻŽā§ā§āĻ¯āĻāĻŦ āĻĒāĻžāĻšāĻžā§ all are name.
âĸ Rule 7. If the sentence containing āĻŦā§āĻ˛āĻ¨, āĻŦāĻ˛ā§āĻ˛āĻ¨, āĻŦāĻ˛āĻ˛, āĻļā§ā§āĻ¨āĻ˛, āĻšāĻžāĻ¸āĻ˛, āĻŋāĻ˛āĻāĻ˛, āĻŋāĻ˛āĻā§āĻ˛āĻ¨,ā§āĻā§āĻ˛āĻ¨, ā§āĻĻāĻāĻ˛ after
temporary word ten the temporary word is likely to be a proper noun.
âĸ Rule 8. If at the end of word there are suffixes like āĻāĻž,āĻāĻžāĻ¨āĻž, āĻāĻžāĻŋāĻ¨, āĻāĻžā§āĻ¤, āĻāĻžā§, āĻŋāĻā§āĻ, āĻāĻžā§āĻ, āĻāĻā§āĻ¨, āĻā§āĻ˛āĻž,
āĻā§ā§āĻ˛āĻž, āĻā§āĻŋāĻ˛ etc., and then word is usually not a proper noun.
20. Morphological Analysis for proper
noun detection
When previous two steps fail to identify proper noun or there is confusion
about the word is proper noun or not then we apply morphological
analysis to sure that the unknown word is proper noun. In this case we
conceder the structure of words and the position of word in the sentence
and identify the word type.
âĸ Rule 1. Proper noun always ended with 1st, 2nd, 5th and 6th verbal inflexions
(Bibhakti). So when parser fined an unknown word with 1st, 2nd, 5th and 6th
bibhakti then the word may be proper noun.
âĸ Rule 2. From sentence structure if parser find an unknown word is in the
position of karti kaarak and word is ended with 1st bibhakti then it is
conceder as a proper noun.
âĸ Rule 3. If the unknown word position is in the position of karma kaarak
and it is indirect object and word is ended with 2nd bibhakti then it is
conceder as a proper noun. But for direct object it is not a proper noun.
21. Morphological Analysis for proper
noun detection
âĸ Rule 4. If any sentence contains more than one word ended with 1st bibhakti then
the first word is with flag unknown word then must be karti kaarak and the word is
proper noun.
âĸ Rule 5. In case of apaadaan kaarak, if any word in the sentence ended with 5th
bibhakti and the word is unknown then it must be noun. If most of all otherâs noun
are in word dictionary so unknown word ended with 5th bibhakti must be proper
noun.
âĸ Rule 6. In case of adhikaran kaarak, if any word in the sentence ended with 7th
bibhakti and the word is unknown then it may be the name of place. If the word is
not in dictionary then it is conceder as proper noun (name of any pace).
22. Morphological Analysis for proper
noun detection
âĸ In that time, when EnConverter identify an word or collection of
word as a proper noun and EnConverter convert into UNL
expression it keep track temporary word using custom UNL relation
tpl and tpr.
âĸ We use two relation âtprâ and âtplâ to identify the word which
should converted.
âĸ The relation âtprâ use when EnConverter finds the temporary word
after pof (parts of name) attribute and âtprâ use when EnConverter
finds the temporary word before pof (parts of name ) attribute.
âĸ When EnConverter found âtprâ relation then it converts the word
which is after blank space. For the case of âtplâ it converts the first
word.
âĸ If proper noun contain only one word with attribute TEMP then
using this TEMP attribute converter convert.
23. Function of Post Converter
âĸ From previous slide we identify proper noun from bangle
sentence and convert the sentence into corresponding
intermediate UNL expression.
âĸ But there is little bit problem, EnConverter convert those
word which is in word dictionary. In the case of temporary
word which is already identified as a proper noun or part of
proper noun cannot converted and it is same as in Bengali
sentence. It is difficult to other people who cannot read
bangle language.
âĸ So it must be converted into English for UNL expression. Post
converters do this conversion.
25. Proposed enconversion process (Algorithm)
âĸ How post converter converts Bangla word for intermediate UNL expression to English
for final UNL expression. The process are describe step by step-
âĸ Step 1: In first step the UNL expression is the inputs of post converter for find the final
UNL expression.
âĸ Step 2: In second step post converter read the full expression and fined relation tpr or
tpl. The relation tpr and tpl are used to identify the word which should convert.
âĸ Step 3: At third step Post converter collect Bangla word which are converted within
this post converter using the help of above two relations. When EnConverter found
âtprâ relation then it collects the word which is after blank space and for tpl it collects
the first word.
âĸ Step 4: In this steps converter convert the word applying rules and finding the
corresponding English syllable or word form syllable dictionary.
âĸ Step 5: In this step we get the converted word which is placed on final UNL expression.
âĸ Step 6: This steps is the final steps which generate final UNL expression.
26. Dictionary Entries for Post Converter
Here we have listed some dictionary entries for post converter. Table 1 shows the
Bengali vowel and table 2 shows the shows the Bengali consonant and the
corresponding entries in dictionary. In future we define the full phonetics for
Bengali to English conversion.
Bangla vowel Dictionary entries
āĻ [āĻ ]{} "a" (TEMP) <.,0,0>
āĻ [āĻ]{} "a" (TEMP) <.,0,0>
āĻ [āĻ]{} "i" (TEMP) <.,0,0>
āĻ [āĻ]{} "ei" (TEMP) <.,0,0>
āĻ [āĻ]{} "oo" (TEMP) <.,0,0>
āĻ [āĻ]{} "u" (TEMP) <.,0,0>
āĻ [āĻ]{} "rri" (TEMP) <.,0,0>
āĻ [āĻ]{} "a" (TEMP) <.,0,0>
āĻ [āĻ]{} "oi" (TEMP) <.,0,0>
āĻ [āĻ]{} "o" (TEMP) <.,0,0>
āĻ [āĻ]{} "ou" (TEMP) <.,0,0>
28. Dictionary Entries for Post ConverterâĻ
Bangla parts of word Dictionary entries
āĻāĻž [āĻāĻž]{} "ka" (TEMP) <.,0,0>
āĻŋāĻ [āĻŋāĻ]{} "ki" (TEMP) <.,0,0>
āĻā§ [āĻā§]{} "kei" (TEMP) <.,0,0>
āĻā§ [āĻā§]{} "koo" (TEMP) <.,0,0>
āĻā§ [āĻā§]{} "ku" (TEMP) <.,0,0>
āĻā§ [āĻā§]{} krriu" (TEMP) <.,0,0>
ā§āĻ [ā§āĻ]{} "ka" (TEMP) <.,0,0>
ā§āĻ [ā§āĻ]{} "koi" (TEMP) <.,0,0>
ā§āĻāĻž [ā§āĻāĻž]{} "ko" (TEMP) <.,0,0>
ā§āĻā§ [ā§āĻā§]{} "kou" (TEMP) <.,0,0>
āĻā§āĻ° [āĻā§]āĻ°{} "kra" (TEMP) <.,0,0>
āĻā§āĻ¯ [āĻā§]āĻ¯{} "kka" (TEMP) <.,0,0>
Similarly for all consonant it should need to entries in word dictionary.
29. Example...
âĸ Letâs an intermediate UNL expression-
{unl}
agt(read(icl>see>do,agt>person,obj>information).@entry.@prese
nt.@progress, āĻāĻŋāĻ°āĻŽ: TEMP :05)
{/unl}
âĸ Post converter firstly read the full sentence. When it finds the word
âāĻāĻŋāĻ°āĻŽâ with attribute TEMP converter collect this word and push it into
Post Converter. Then applying rules it convert into UNL word âkarimâ.
Enconverter parse âāĻāĻŋāĻ°āĻŽâ letter by letter.
āĻ =āĻ + āĻ -> Ka
āĻŋāĻ° = -> ri
āĻŽ -> m
Thatâs mean âāĻāĻŋāĻ°āĻŽâ converted in âKarimâ
Thus Post converter converts all proper nouns Bengali to English.
30. Result Analysis
âĸ To convert any Bangla sentence we have used
the following files.
īInput file
īRules file
īDictionary
âĸ We have used an Encoder (EnCoL33.exe) and
here I present print screen of enconversion.
âĸ Screen print shows the Encoder that produces
Bangla to UNL expression or UNL to Bangla.
32. Result AnalysisâĻ
When users click on convert button it generates corresponding UNL expression.
The bellow screen shows this operation.
Figure: Enconversion process
33. Result AnalysisâĻ
When users click on convert button it generates corresponding UNL expression.
The bellow screen shows this operation.
Figure: Enconversion process
34. Result AnalysisâĻ
âĸ For example, âāĻāĻŋāĻ°āĻŽ āĻĒāĻŋā§ā§āĻ¤ā§āĻâ, pronounce as Karim
Poritechhe means âKarim is readingâ. Here subject Karim
initiates an action. So, agent (agt) relation is made between
subject âKarimâ and verb âreadâ.
âĸ The following dictionary entries are needed for conversion
the sentence
âĸ [āĻāĻŋāĻ°āĻŽ] {} âkarim (icl>name,iof>person,com>male)(N)
âĸ [āĻĒā§] {} âread (icl>see>do,agt>person,obj>information)
(ROOT,CEND,^ALT)
âĸ [āĻā§āĻ¤ā§āĻ] {} ââ (VI, CEND, CEG1, PRS, PRG, 3P)
36. Result AnalysisâĻ
âĸ But āĻāĻŋāĻ°āĻŽ is the name of a person. So it is not needed to entry on
dictionary. To convert this sentence into UNL expression when
EnConverter finds word which is not in dictionaries then it creates a
temporary entry. So if āĻāĻŋāĻ°āĻŽ is not in dictionary then the temporary entry
as like:
âĸ [āĻāĻŋāĻ°āĻŽ ]{} "āĻāĻŋāĻ°āĻŽ" (TEMP) <.,0,0>;.[]{}
âĸ Applied rules: R{:::}
{^PRON,^N,^VERB,^ROOT,^ADJ,^ADV,^ABY,^KBIV,^BIV:+N,+PROP::}
(BLK)P10;
âĸ After applying these rules it is conceder as a proper noun. And temporary
entry converted as:
âĸ [āĻāĻŋāĻ°āĻŽ]{}"āĻāĻŋāĻ°āĻŽ" (N,PROP,TEMP) <.,0,0>;.[]{}
âĸ To convert this sentence into UNL expression morphological analysis is
made between âāĻĒā§â (por) and âāĻā§āĻ¤ā§āĻâ (itechhe) and semantic analysis is
made between âāĻāĻŋāĻ°āĻŽâ (karim) and â āĻŋāĻĒā§ā§ā§āĻ¤āĻâ (poritechhe)
37. Result AnalysisâĻ
īą Rule of morphological analysis:
ī§ After applying some right shift rules when verb root âāĻĒā§â (por)
comes in the LAW and verbal inflexion âāĻā§āĻ¤ā§āĻâ (itechhe) comes in the
RAW then following rule is applied to perform morphological analysis:
+{ROOT,CEND,^ALT,^VERB:+VERB,-ROOT,+@::}{VI,CEND:::}P10;
īą Rule of semantic analysis:
ī§ After morphological analysis when âāĻāĻŋāĻ°āĻŽâ appears in the LAW and
verb âāĻĒāĻŋā§ā§āĻ¤ā§āĻâ (poritechhe) appears in the RAW the following
agent (agt) relation is made between âāĻāĻŋāĻ°āĻŽâ (karim) and
âāĻĒāĻŋā§ā§āĻ¤ā§āĻâ (poritechhe) to complete the semantic analysis.
ī§ >{N,SUBJ::agt:}{VERB,#AGT:+&@present,+&@progress::}P1;
ī§ where, NOM denotes the attributes for nouns or pronouns
38. Result AnalysisâĻ
{unl}
agt(read(icl>see>do,agt>person,obj>information).@entry.@prese
nt.@progress, āĻāĻŋāĻ°āĻŽ:TEMP: 05)
{/unl}
The intermediate language UNL form shows that the word âāĻāĻŋāĻ°āĻŽâ
same as before conversion. When the sentence âāĻāĻŋāĻ°āĻŽ āĻĒāĻŋā§ā§āĻ¤ā§āĻâ
converted into another language then the word âāĻāĻŋāĻ°āĻŽâ is not
understandable. Because of, in UNL expression the word âāĻāĻŋāĻ°āĻŽâ is
Bengali word. So this expression is invoked into Post Converter which
converts this word into English using proposed simple phonetic
method.
39. Result AnalysisâĻ
âĸ Post Converter takes the UNL expression and
converts the proper noun âāĻāĻŋāĻ°āĻŽâ Bengali to
English âkarimâ.
āĻ =āĻ + āĻ -> Ka
āĻŋāĻ° -> ri
āĻŽ -> m
Thatâs mean âāĻāĻŋāĻ°āĻŽâ converted in âkarimâ
40. Result AnalysisâĻ
âĸ After conversion the final UNL expression is
like as-
{unl}
agt(read(icl>see>do,agt>person,obj>information).@entry.@prese
nt.@progress, karim 05)
{/unl}
41. Conclusion
âĸ In this paper we define our work as two parts, firstly identified a
proper noun from Bengali sentence and secondly convert this
proper noun into UNL form.
âĸ In the second parts we use a converter named as post converter
which use simple phonetic method to convert Bengali to English
and we only mention the simple procedure not complete.
âĸ Our future plan in this regards is give the complete rules for post
converter.
âĸ We will also works on Bengali language and give the complete
analysis rules for enconversion program to convert Bangla to UNL
language and generation rules for deconversion program to convert
any UNL documents to Bangla language.
http://www.mecs-press.org/ijmecs/ijmecs-v6-n8/v6n8-1.html
http://www.academia.edu/7985833/Design_Analysis_Rules_to_Identify_Proper_Noun_from_Bengali_Sentence_for_Universal_Networking_Language