Imran Sarwar Bajwa, M. Imran Siddique, M. Abbas Choudhary, [2006], "Automatic Domain Specific Terminology Extraction using a Decision Support System", in IEEE 4th International Conference on Information and Communication Technology (ICICT 2006), Cairo, Egypt. pp:651-659
Automated Java Code Generation (ICDIM 2006)
Imran Sarwar Bajwa, Imran Siddique, M. Abbas Choudhary, [2006], "Rule Based Production System for Automatic Code Generation in Java", in IEEE 1st International Conference on Digital Information Management (ICDIM 2006), Bangalore, India, Dec 2006, pp:300-305
NL based Object Oriented modeling - EJSR 35(1)
Imran Sarwar Bajwa, Shahzad Mumtaz, Ali Samad [2009], "Object Oriented Software Modeling using NLP Based Knowledge Extraction", European Journal of Scientific Research, Aug 2009, Vol. 35 No. 01, pp:22-33
Imran Sarwar Bajwa, M. Abbas Choudhary [2006], "Natural Language Processing based Automated System for UML Diagrams Generation", in Saudi 18th National Conference on Computer Application, 2006, (18th NCCA) Riyadh, Kingdom of Saudi Arabia pp:171-176
Imran S. Bajwa, Irfan Hayder, [2007], "UCD-Generator - A LESSA Application for Use Case Design", in IEEE 1st International Conference on Information and Emerging Technologies 2007, Karachi, Pakistan Aug 2007, pp:1-5
Imran Sarwar Bajwa, [2010], "Markov Logics Based Automated Business Requirements Analysis", in International Journal of Computer and Electrical Engineering (IJCEE) 2(3) pp:481-485, June 2010
OOP and Its Calculated Measures in Programming Interactivity
This study examines object-oriented programming (OOP) and its calculated measures in programming interactivity in Nigeria. It focuses on the existing programming languages used by programmers and examines the need for integrating programming interactivity with OOP. A survey was conducted to measure interactivity amongst professionals using parameters such as flexibility, interactivity, speed, interoperability, scalability, dynamism, and the ability to solve real-life problems. Data was gathered using a questionnaire, and the analysis was carried out using frequency, percentage ratio, and mean to arrive at a more proactive stand. The results revealed that some of the parameters used strongly support programming interactivity with OOP.
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
Many Natural Language Processing (NLP) applications involve Named Entity Recognition (NER) as an important task, since it improves their overall performance. In this paper, deep learning techniques are used to perform NER on Hindi text, because Hindi NER is far less developed than English NER. This is a barrier for resource-scarce languages, as many resources are not readily available. Researchers have applied rule-based, machine-learning-based, and hybrid approaches to this problem, and deep-learning-based algorithms are now being developed at large scale as an innovative approach to advanced NER models. In this paper we devise a novel architecture based on residual networks over a Bidirectional Long Short-Term Memory (BiLSTM) with fastText word-embedding layers. We use pre-trained word embeddings to represent the words in the corpus, with the NER tags of the words defined by the annotated corpora used. Development of an NER system for Indian languages is a comparatively difficult task. We run experiments comparing NER results with normal embedding and fastText embedding layers, and analyse the performance of the word embeddings with different batch sizes when training the deep learning models. We present state-of-the-art results, measured by F1 score, with this approach.
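The core idea of a residual architecture like the one described above can be sketched independently of any deep-learning framework: a block's output is the transformation of its input plus the input itself, so gradients have a shortcut path around the layer. The following minimal NumPy sketch is illustrative only; the layer shapes and the tanh transformation are assumptions, not the paper's actual model, where the wrapped transformation would be a BiLSTM over fastText embeddings.

```python
import numpy as np

def residual_block(x, transform):
    """Apply a shape-preserving transformation and add the input back
    (skip connection). In the paper's setting `transform` would be a
    BiLSTM layer; here it is any shape-preserving function."""
    return transform(x) + x

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8)) * 0.1
layer = lambda x: np.tanh(x @ W)       # stand-in for a BiLSTM layer

tokens = rng.standard_normal((5, 8))   # 5 tokens, 8-dim embeddings
out = residual_block(tokens, layer)
print(out.shape)                       # (5, 8): shape preserved, input carried through
```

Because the block only adds a skip connection, it composes: stacking several such blocks gives the deep residual stack the abstract refers to.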
Speaker independent visual lip activity detection for human - computer inte...
Abstract: Recently there has been increased interest in using visual features for improved speech processing, and lip reading plays a vital role in visual speech processing. In this paper, a new approach for lip reading is presented. Visual speech recognition is applied in mobile phone applications and human-computer interaction, and also to recognize the spoken words of hearing-impaired persons. The visual speech video is taken as input to a face detection module, which detects the face region. The mouth region is identified based on the face region of interest (ROI), and the mouth images are passed to the feature extraction process. The features are extracted using every 10th coordinate, every 16th coordinate, a 16-point + Discrete Cosine Transform (DCT) method, and a Lip DCT method. These features are then applied as inputs for recognizing the visual speech using a Hidden Markov Model. Of the different feature extraction methods, the DCT method gives the best accuracy in the experiments. 10 participants uttered 35 different isolated words; for each word, 20 samples were collected for training and testing. Index Terms: Feature Extraction, HMM, Mouth ROI, DWT, Visual Speech Recognition
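The "DCT method" above keeps a small set of low-frequency transform coefficients of each mouth image as its feature vector. A minimal sketch of that step, assuming a square gray-level block and an illustrative choice of keeping the 4x4 low-frequency corner (the paper's exact coefficient count is not specified here):

```python
import numpy as np

def dct2_features(block, k=4):
    """2-D orthonormal DCT-II of a square image block, keeping the k x k
    low-frequency coefficients as a feature vector."""
    n = block.shape[0]
    i = np.arange(n)
    # DCT-II basis matrix: rows are frequencies, columns are pixels
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i[None, :] + 1) * i[:, None] / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)
    coeffs = C @ block @ C.T
    return coeffs[:k, :k].ravel()

mouth = np.outer(np.hanning(16), np.hanning(16))  # toy 16x16 "mouth ROI"
feat = dct2_features(mouth)
print(feat.shape)  # (16,) -> 4x4 low-frequency coefficients
```

Sequences of such per-frame vectors are what an HMM recognizer, as in the paper, would then model.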
Dynamic Construction of Telugu Speech Corpus for Voice Enabled Text Editor
In recent decades speech-interactive systems have gained increasing importance. The performance of an ASR system mainly depends on the availability of a large speech corpus. The conventional method of building a large-vocabulary speech recognizer for any language uses a top-down approach to speech, which requires a large speech corpus with sentence- or phoneme-level transcription of the speech utterances. The transcriptions must also cover a sufficient variety of speech so that the recognizer can build models for all the sounds present. But for Telugu, because of its complex nature, a very large, well-annotated speech database is very difficult to build. It is very difficult, if not impossible, to cover all the words of any Indian language, where each word may have thousands or even millions of word forms. A significant part of the grammar that is handled by syntax in English (and similar languages) is handled within morphology in Telugu; phrases comprising several words (that is, tokens) in English map onto a single word in Telugu. Telugu is also phonetic in nature in addition to being morphologically rich, which is why speech technology developed for English cannot be applied directly to Telugu. This paper highlights work carried out in an attempt to build a voice-enabled text editor with automatic term suggestion. The main claim of the paper is the recognition enhancement process we developed for highly inflecting, morphologically rich languages. This method increases speech recognition accuracy with a substantial reduction in corpus size. It also adapts Telugu words into the database dynamically, resulting in growth of the corpus.
The upsurge of deep learning for computer vision applications
Artificial intelligence (AI) is also helping a new breed of companies disrupt industries from medical examination to agriculture. Computers cannot yet replace humans, but they work superbly at handling the everyday tangle of our lives. The technology is reshaping big business and has been on the rise in recent years, grounded in the success of deep learning (DL). Cyber-security, the automotive industry, and healthcare are three industries innovating with AI and DL technologies, alongside banking, retail, finance, robotics, and manufacturing. The healthcare industry is one of the earliest adopters of AI and DL. DL is achieving exceptional levels of accuracy, to the point where DL algorithms can outperform humans at classifying videos and images. The major drivers behind the breakthrough of deep neural networks are the availability of huge amounts of training data, powerful computational infrastructure, and advances in academia. DL is heavily employed both in academia, to study intelligence, and in industry, to build intelligent systems that assist humans in varied tasks. DL systems thereby begin to surpass not only classical methods but also human benchmarks in numerous tasks such as image classification, action detection, natural language processing, and signal processing.
A Metamodel and Graphical Syntax for NS-2 Programing
One of the most important issues manufacturers worldwide pay special attention to is improving their activities so as to deliver high-quality products or services. A natural first step toward this goal is simulation, and simulation software packages with different properties have accordingly been made available. One of the most widely used simulators is NS-2, which suffers from internal complexity. Domain Specific Modeling Languages, on the other hand, can provide an abstraction level by which we can overcome the complexity of NS-2, increase development speed, and improve efficiency. In this paper we therefore introduce a Domain Specific Metamodel for NS-2. This new metamodel paves the way for introducing an abstract syntax and a modeling language syntax. In addition, the syntax created in the Domain Specific Modeling Language is supported by a graphical modeling tool.
Imran Sarwar Bajwa, M. Abbas Choudhary [2006], "A Rule Based System for Speech Language Context Understanding", International Journal of Donghua University (English Edition), Jun 2006, Vol. 23 No. 06, pp:39-42
A SURVEY OF GRAMMAR CHECKERS FOR NATURAL LANGUAGES
ABSTRACT
Natural Language Processing is an interdisciplinary branch of linguistics and computer science, studied under Artificial Intelligence (AI), that gave birth to an allied area called "Computational Linguistics", which focuses on processing natural languages on computational devices. A natural language consists of a large number of sentences, which are linguistic units involving one or more words linked together in accordance with a set of predefined rules called a grammar. Grammar checking is the task of validating sentences syntactically and is a prominent tool within language engineering. Our review draws on the recent development of various grammar checkers to look at the past, present, and future in a new light. It covers grammar checkers for many languages, with the aim of examining their approaches and methodologies for developing new tools and systems as a whole. The survey concludes with a discussion of the features included in existing grammar checkers for foreign languages as well as a few Indian languages.
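Grammar checking as surveyed above, in its simplest rule-based form, runs a set of error patterns over each sentence and reports matches. The sketch below shows that shape with two toy regex rules; real checkers in the survey use far richer rule sets over POS tags and parse trees, so both rules here are illustrative assumptions only.

```python
import re

# A toy rule-based grammar checker: each rule pairs a regex over a
# sentence with a diagnostic message.
RULES = [
    (re.compile(r"\ba\s+[aeiouAEIOU]"), "use 'an' before a vowel sound"),
    (re.compile(r"\b(\w+)\s+\1\b", re.IGNORECASE), "repeated word"),
]

def check(sentence):
    """Return a list of (message, offending text) pairs for a sentence."""
    problems = []
    for pattern, message in RULES:
        for m in pattern.finditer(sentence):
            problems.append((message, m.group(0)))
    return problems

print(check("This is a apple and the the end."))
```

Each additional rule is just another (pattern, message) pair, which is why rule-based checkers scale linearly with linguistic effort rather than with training data.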
Natural Language Processing Theory, Applications and Difficulties
The promise of a powerful computing device to help people in productivity as well as recreation can only be realized with proper human-machine communication. Automatic recognition and understanding of spoken language is the first step toward natural human-machine interaction. Research in this field has produced remarkable results, leading to many exciting expectations and new challenges. This field is known as Natural Language Processing. In this paper, natural language generation and natural language understanding are discussed, along with difficulties in NLU, applications, and a comparison with structured programming languages. Mrs. Anjali Gharat | Mrs. Helina Tandel | Mr. Ketan Bagade, "Natural Language Processing Theory, Applications and Difficulties", published in International Journal of Trend in Scientific Research and Development (ijtsrd), ISSN: 2456-6470, Volume-3, Issue-6, October 2019. URL: https://www.ijtsrd.com/papers/ijtsrd28092.pdf Paper URL: https://www.ijtsrd.com/engineering/computer-engineering/28092/natural-language-processing-theory-applications-and-difficulties/mrs-anjali-gharat
Exploiting rules for resolving ambiguity in Marathi language text
Abstract
Natural language ambiguity is a situation in which some words have multiple meanings or senses. This paper discusses natural language ambiguity and its types, and proposes a knowledge-based solution to resolve the various types of ambiguity occurring in Marathi language text. The task of resolving the semantic and lexical ambiguity of words to obtain their actual sense is referred to as Word Sense Disambiguation (WSD). Marathi is the official and commonly spoken language of Maharashtra state in India. Plenty of words in Marathi are spelled the same and uttered the same but are semantically (meaning-wise/sense-wise) different; during automatic translation, these words lead to ambiguity. Our method successfully removes the ambiguity by identifying the correct sense of the given text from the predefined possible senses available in Marathi WordNet using word and sentence rules. The method is applicable only to word-level ambiguity; structural ambiguity is not handled by this system. This system may be successfully used as a subsystem in other Natural Language Processing (NLP) applications.
Key Words: Word Sense Disambiguation, Natural Language Processing, Marathi, Marathi Wordnet, ambiguity,
knowledge based
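A knowledge-based WSD method of the kind described above can be sketched as gloss overlap (simplified Lesk): pick the sense whose dictionary definition shares the most words with the sentence context. The two-sense inventory below is a hypothetical English stand-in for entries in Marathi WordNet, not data from the paper.

```python
# Hypothetical mini sense inventory standing in for Marathi WordNet.
SENSES = {
    "bank": {
        "finance": "an institution that accepts deposits and lends money",
        "river":   "sloping land beside a body of water such as a river",
    }
}

def disambiguate(word, sentence):
    """Return the sense whose gloss overlaps most with the context words."""
    context = set(sentence.lower().split())
    best_sense, best_overlap = None, -1
    for sense, gloss in SENSES[word].items():
        overlap = len(context & set(gloss.split()))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense

print(disambiguate("bank", "he sat on the bank of the river fishing"))  # river
```

The paper's word and sentence rules would refine exactly this choice point, constraining which senses are admissible before the overlap comparison.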
Syracuse University
SURFACE
The School of Information Studies Faculty
Scholarship
School of Information Studies (iSchool)
2001
Natural Language Processing
Elizabeth D. Liddy
Syracuse University, [email protected]
Follow this and additional works at: http://surface.syr.edu/istpub
Part of the Library and Information Science Commons, and the Linguistics Commons
This Book Chapter is brought to you for free and open access by the School of Information Studies (iSchool) at SURFACE. It has been accepted for
inclusion in The School of Information Studies Faculty Scholarship by an authorized administrator of SURFACE. For more information, please contact
[email protected]
Recommended Citation
Liddy, E.D. 2001. Natural Language Processing. In Encyclopedia of Library and Information Science, 2nd Ed. NY. Marcel Decker, Inc.
Natural Language Processing
INTRODUCTION
Natural Language Processing (NLP) is the computerized approach to analyzing text that
is based on both a set of theories and a set of technologies. And, being a very active area
of research and development, there is not a single agreed-upon definition that would
satisfy everyone, but there are some aspects, which would be part of any knowledgeable
person's definition. The definition I offer is:
Definition: Natural Language Processing is a theoretically motivated range of
computational techniques for analyzing and representing naturally occurring texts
at one or more levels of linguistic analysis for the purpose of achieving human-like
language processing for a range of tasks or applications.
Several elements of this definition can be further detailed. Firstly the imprecise notion of
"range of computational techniques" is necessary because there are multiple methods or
techniques from which to choose to accomplish a particular type of language analysis.
"Naturally occurring texts" can be of any language, mode, genre, etc. The texts can be
oral or written. The only requirement is that they be in a language used by humans to
communicate to one another. Also, the text being analyzed should not be specifically
constru.
Natural Language Processing: State of The Art, Current Trends and Challenges
Diksha Khurana1, Aditya Koli1, Kiran Khatter1,2 and Sukhdev Singh1,2
1 Department of Computer Science and Engineering, Manav Rachna International University, Faridabad-121004, India
2 Accendere Knowledge Management Services Pvt. Ltd., India
The paper presents a model for developing intelligent query processing in Malayalam. For this, the investigator has selected the domain of a time-enquiry system in the Malayalam language. This work discusses issues involved in Natural Language Processing. NLQPS is a restricted-domain system dealing with natural language queries on time enquiry for different modes of transportation. The system performs a shallow syntactic and semantic analysis of the input query. After the knowledge-level understanding of the query, the system triggers a reasoning process to determine the type of query and the result slots that are required. The investigator tries to extract the hidden intent behind a natural language query submitted by a user.
Ali Akbar Dehkhoda, the prominent lexicographer, describes a person who has difficulty in grasping knowledge as someone who "cannot understand something without knowing all its details." If the knowledge required by somebody is in a language other than the person's mother tongue, access to this knowledge will surely meet special difficulties resulting from the person's lack of mastery over the second language. Any project that can monitor knowledge sources written in English and change them into the user's language by employing a simple, understandable model is capable of being a knowledge-based project with a world view regarding text simplification. This article creates a knowledge system, investigates some algorithms for analyzing the contents of complex texts, and presents solutions for changing such texts into simple and understandable ones. Texts are automatically analyzed and their ambiguous points are identified by software, but it is the author or the human agent who makes decisions concerning removal of the ambiguities or correction of the texts.
Phrase identification is one of the most critical and widely studied tasks in Natural Language Processing (NLP), and verb-phrase identification within a sentence is very useful for a variety of NLP applications. One of the core enabling technologies required in NLP applications is morphological analysis. This paper presents a Myanmar verb-phrase identification and translation algorithm and develops a Markov model with morphological analysis. The system is based on a rule-based maximum matching approach. In machine translation, a large amount of information is needed to guide the translation process. Myanmar is an inflected language, and very few lexicons have been created or researched for Myanmar compared to other languages such as English, French, and Czech. Therefore, this paper proposes a Myanmar verb-phrase identification and translation model based on the syntactic structure and morphology of the Myanmar language, using a Myanmar-English bilingual lexicon. A Markov model is also used to reformulate the translation probability of phrase pairs. Experimental results showed that the proposed system can improve translation quality by applying morphological analysis to the Myanmar language.
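The rule-based maximum matching approach mentioned above greedily takes, at each position, the longest lexicon entry that matches the remaining text. A minimal sketch, using a hypothetical toy English lexicon in place of the Myanmar-English bilingual lexicon the paper actually uses:

```python
# Toy lexicon: a hypothetical stand-in for the bilingual lexicon above.
LEXICON = {"go", "going", "to", "school", "is"}

def max_match(text, lexicon):
    """Segment an unspaced string into the longest known words, left to right."""
    tokens, i = [], 0
    while i < len(text):
        for j in range(len(text), i, -1):       # try longest candidate first
            if text[i:j] in lexicon:
                tokens.append(text[i:j])
                i = j
                break
        else:                                   # no entry matched: emit one char
            tokens.append(text[i])
            i += 1
    return tokens

print(max_match("goingtoschool", LEXICON))  # ['going', 'to', 'school']
```

Note the greedy choice: "going" wins over "go" because longer matches are tried first, which is exactly what distinguishes maximum matching from naive left-to-right lookup.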
Automatic classification of Bengali sentences based on sense definitions pres...
Based on the sense definitions of words available in the Bengali WordNet, an attempt is made to classify Bengali sentences automatically into different groups in accordance with their underlying senses. The input sentences are collected from 50 different categories of the Bengali text corpus developed in the TDIL project of the Govt. of India, while information about the different senses of a particular ambiguous lexical item is collected from the Bengali WordNet. On an experimental basis we have used the Naive Bayes probabilistic model as a classifier of sentences. We have applied the algorithm over 1747 sentences containing a particular Bengali lexical item which, because of its ambiguous nature, is able to trigger different senses that give the sentences different meanings. In our experiment we achieved around 84% accuracy in sense classification over the total input sentences. We have analyzed the residual sentences that did not comply with our experiment and affected the results, noting that in many cases wrong syntactic structures and insufficient semantic information are the main hurdles in the semantic classification of sentences. The applicational relevance of this study is attested in automatic text classification, machine learning, information extraction, and word sense disambiguation.
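A Naive Bayes sense classifier of the kind used above picks the sense c maximizing P(c) * prod_w P(w|c) over the words of the sentence, with add-one smoothing. The tiny corpus below is a hypothetical English stand-in for the Bengali training sentences, not the study's data.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy training data: sentences labeled with the sense of "bank".
train = [
    ("the river bank was muddy and wet", "river"),
    ("fish swim near the bank of the stream", "river"),
    ("the bank approved the loan quickly", "finance"),
    ("deposit the money at the bank counter", "finance"),
]

class_counts = Counter(label for _, label in train)
word_counts = defaultdict(Counter)
vocab = set()
for sent, label in train:
    for w in sent.split():
        word_counts[label][w] += 1
        vocab.add(w)

def classify(sentence):
    """argmax_c log P(c) + sum_w log P(w|c), with add-one smoothing."""
    best, best_lp = None, -math.inf
    for label, count in class_counts.items():
        lp = math.log(count / len(train))
        total = sum(word_counts[label].values())
        for w in sentence.split():
            lp += math.log((word_counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

print(classify("the loan from the bank"))  # finance
```

Working in log space avoids underflow when sentences grow long, which matters at the study's scale of 1747 sentences.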
Imran Sarwar Bajwa, [2010], "Context Based Meaning Extraction by Means of Markov Logic", in International Journal of Computer Theory and Engineering - (IJCTE) 2(1) pp:35-38, February 2010
Imran Sarwar Bajwa, [2010], "Virtual Telemedicine Using Natural Language Processing", International Journal of Information Technology and Web Engineering IJITWE 5(1):43-55, January 2010
Imran Sarwar Bajwa, Behzad Bordbar, Mark G. Lee. [2010] "OCL Constraints Generation from Natural Language Specification", in proceedings of EDOC 2010 - IEEE International EDOC Conference 2010, Vitoria, Brazil, Oct 2010, pp:204-213
BPM & SOA for Small Business Enterprises (ICIME 2009)
Imran Sarwar Bajwa, S. Mumtaz, A. Samad, R. Kazmi, A. Choudhary [2009], "BPM meeting with SOA: A Customized Solution for Small Business Enterprises", in proceedings of IEEE- International Conference on Information management & Engineering 2009, Kuala Lumpur Malaysia, Apr 2009, pp:677-682
Imran Sarwar Bajwa, A. H. S. Bukhari, [2006] "Speech Language based Engineering System for Automatic Generation of User Forms", in International Conference on Man-Machine Systems (ICOMMS 2006), Kangar, Malaysia
Imran Sarwar Bajwa, M. Abbas Choudhry [2006], "A Study for Prediction of Minerals in Rock Images using Back Propagation Neural Networks", in IEEE 1st International Conference on Advances in Space Technologies (ICAST 2006), Aug 2006, Islamabad, Pakistan. pp:185-189
Imran Sarwar Bajwa, Irfan Hyder, M. Abbas Choudhary. [2006], "Suitable Reusable Components Mining to Assist Developers using a Rule Based System", in Fifth International Conference on Information and Management Sciences (IMS 2006), Chengdu, China, Volume: 5 pp:266-270
M. Kashif Nazir, Imran Sarwar Bajwa, M. Imran Khan [2006], "A Conceptual Framework of Earthquake Disaster Management System (EDMS) for Quetta City using GIS", in IEEE 1st International Conference on Advances in Space Technologies, (ICAST 2006), Islamabad, Pakistan, Aug 2006, pp:117-120
Imran Sarwar Bajwa, M. Abbas Choudhary [2006], "Automatic Web Layout Generation using Natural Language Processing Techniques", in 2nd International Conference From Scientific Computing to Computational Engineering, (IC-SCCE 2006) Athens, Greece, pp:334-340
Imran Sarwar Bajwa, S. Irfan Hyder [2005], "PCA Based Image Classification of Single-Layered Cloud Types", in 1st IEEE International Conference on Emerging Technologies (ICET 2005), Islamabad, Pakistan, Jan 2005, pp:365-369
Feature Based Image Classification by using Principal Component Analysis
Classification of different types of cloud images is the primary issue used to forecast precipitation and other weather constituents. A PCA based classification system has been presented in this paper to classify the different types of single-layered and multi-layered clouds. Principal Component Analysis (PCA) provides enhanced accuracy in features based image identification and classification as compared to other techniques. PCA is a feature based classification technique that is characteristically used for image recognition. PCA is based on principal features of an image and these features discreetly represent an image. The used approach in this research uses the principal features of an image to identify different cloud image types with better accuracy. A classifier system has also been designed to exhibit this enhancement. The designed system reads features of gray-level images to create an image space. This image space is used for classification of images. In testing phase, a new cloud image is classified by comparing it with the specified image space using the PCA algorithm.
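The PCA pipeline described in this abstract can be sketched as follows, assuming images arrive as flattened gray-level vectors; the function names and the nearest-mean decision rule are illustrative, not the paper's actual implementation:

```python
import numpy as np

def fit_pca_space(images, n_components=3):
    """Build an image space from flattened gray-level training images.

    `images` is an (n_samples, n_pixels) array; returns the mean image
    and the top principal components of the centered data.
    """
    mean = images.mean(axis=0)
    centered = images - mean
    # SVD of the centered data yields the principal axes directly.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return mean, vt[:n_components]

def project(image, mean, components):
    """Project a flattened image into the PCA feature space."""
    return components @ (image - mean)

def classify(image, mean, components, class_means):
    """Assign the class whose mean projection is nearest (Euclidean)."""
    feats = project(image, mean, components)
    return min(class_means, key=lambda c: np.linalg.norm(feats - class_means[c]))
```

In the testing phase, a new cloud image would be flattened, projected into the learned image space, and compared against the stored per-class mean projections.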
Domain Specific Terminology Extraction (ICICT 2006)
AUTOMATIC DOMAIN SPECIFIC TERMINOLOGY EXTRACTION USING A DECISION SUPPORT SYSTEM
Imran Sarwar Bajwa1 , Imran Siddique2 and M. Abbas Choudhary3
1 Department of Computer Science and Information Technology
The Islamia University of Bahawalpur, Pakistan
Phone: +92 (62) 9255466
E-mail: imransbajwa@yahoo.com
2 Faculty of Computer and Emerging Sciences
3 Balochistan University of Information Technology and Management Sciences
Quetta, Pakistan.
Phone: +92 (81) 9202463 Fax: +92 (81) 9201064
E-mail: imran@buitms.edu.pk
E-mail:abbas@buitms.edu.pk
Abstract: Speech languages, or natural languages, are major tools of communication. This research paper presents a natural language processing based automated system for understanding speech language text. A new rule-based model is presented for analyzing natural languages and extracting the relevant meanings from the given text. The user writes a natural language scenario in simple English in a few paragraphs, and the designed system analyzes the script provided by the user. After composite analysis and extraction of the associated information, the designed system assigns particular meanings to an assortment of speech language text on the basis of its context. The designed system uses standard speech language rules that are clearly defined for all speech languages such as English, Urdu, Chinese, Arabic, French, etc. The designed system provides a quick and reliable way to comprehend speech language context and generate the respective meanings. An application with such abilities can be more intelligent and pertinent, specifically saving the user's time.
Keywords: Terminology Extraction, Automatic text understanding,
Speech language processing, Information extraction
1. INTRODUCTION
In daily life, natural languages or speech languages are the main source of communication. A particular and typical set of words and vocabulary collections is used in each field of life, and they are quite exclusive to their use. In computer science, Natural Language Processing (NLP) is the field used for the analysis of speech language text [4]. NLP introduces many new concepts and new terminologies to describe its models, techniques and processes. Some of these terms come directly from linguistics and the science of perception, while others were invented to describe discoveries that did not fit into any previous category [3]. Natural language understanding is a hefty field to grasp. Understanding and comprehending a speech language means knowing what concepts a word or a particular phrase stands for in a particular perspective. To give meanings to a particular sentence, a system should know how to link the concepts together in a meaningful way. It is ironic that speech languages are the easiest for human beings in terms of learning, understanding and using, yet the hardest for a computer to understand.
In the modern era, computer machines have attained the ability to solve complex mathematical and statistical problems with speed and grace, but they are still inefficient at comprehending the basics of spoken and written languages. The designed automated system can be useful in various business and technical software by only acquiring the requirements from the user. The designed system extracts the required information from the given text and provides it to the computer for further automated processing. Applications can be automated generation of UML diagrams, query processing, web mining, web template designing, user interface designing, etc. As far as time is concerned, the software takes far less than a minute to explore all the facilities and services, and this information is adequately useful for engineers and computer users.
2. RELATED WORK
Natural languages have been an area of interest for the last century. In the late nineteen-sixties and seventies, many researchers such as Noam Chomsky (1965) [9], Maron, M. E. and Kuhns, J. L. (1960) [10], and Chow, C. & Liu, C. (1968) [11] contributed to the area of information retrieval from natural languages. They contributed to the analysis and understanding of natural languages, but a lot of effort was still required for better understanding and analysis. Some authors concentrated on this area in the eighties and nineties, such as Losee, R. M. (1998) [6], Salton, G. & McGill, M. (1995) [8], and Maron, M. E. & Kuhns, J. (1997) [9]. These authors worked on lexical ambiguity [5] and information retrieval [7], probabilistic indexing [9], database handling [3] and many other related areas.
In this research paper, a newly designed rule-based framework is proposed that is able to read English language text and extract its meanings after analyzing and extracting the related information. The conducted research provides a robust solution to the addressed problem. The functionality of the conducted research is domain specific, but it can easily be enhanced in the future according to the requirements. The currently designed system incorporates the capability of mapping user requirements after reading the given requirements in plain text and drawing the set of speech language contents.
3. SPEECH LANGUAGE ANALYSIS
The understanding and multi-aspect processing of natural languages, which are also termed "speech languages", is one of the topics of greatest interest in the field of artificial intelligence [6]. Natural languages are irregular and asymmetrical. Traditionally, natural languages are based on informal grammars. There are geographical, psychological and sociological factors which influence the behaviours of natural languages [17]. There is an undefined set of words, and they change and vary from area to area and from time to time. Due to these variations and inconsistencies, natural languages have different flavours; for example, the English language has more than half a dozen renowned flavours all over the world. These flavours have different accents, sets of vocabularies and phonological aspects. These discrepancies and inconsistencies in natural languages make it a difficult task to process them as compared to formal languages [13].
In the process of analyzing and understanding natural languages, various problems are usually faced by researchers. The problems connected to the greater complexity of natural language are verb conjugation, verbal inflexion & paradigms, active & passive voice, lexical amplitude, the problem of ambiguity, etc. All these problems are significant to handle, but the problem of ambiguity is the most difficult. Ambiguity can be addressed at the syntax and semantic level by using a sound and robust rule-based system. During the interpretation process of English language text, the analysis is distinguished into four classes to handle the ambiguity: lexical analysis, syntactic analysis, semantic analysis and pragmatic analysis. Such a distinction is not exhaustive, but it is useful to focus on the problem and for practical purposes [7].
3.1. Lexical Analysis:
The problem of words not written or pronounced correctly is omitted from this problem scenario. These kinds of errors are simply solvable through comparison with the expressions contained in a dictionary. Lexical ambiguity is created when the same word assumes various meanings [3]. In this case, the ambiguity arises from the question of which meaning applies in which scenario. As an example, consider the adjective "cold" in the following sentences: "That room is cold." and "That person is cold." It turns out that the same adjective "cold" assumes different meanings in the two phrases. In the first sentence it indicates a temperature; in the second, a particular character of a person.
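The context-dependent reading of "cold" above can be sketched as a dictionary lookup keyed by the noun the adjective modifies; the sense inventory below is a hypothetical toy, not a real lexical resource:

```python
# Hypothetical sense inventory: which head nouns select which sense
# of an ambiguous adjective (illustrative assumption, not a real lexicon).
SENSES = {
    "cold": {
        "temperature": {"room", "water", "day", "weather"},
        "unemotional": {"person", "man", "woman", "reply"},
    }
}

def disambiguate(adjective, head_noun):
    """Pick the sense of `adjective` licensed by the noun it modifies."""
    for sense, nouns in SENSES.get(adjective, {}).items():
        if head_noun in nouns:
            return sense
    return "unknown"  # no rule fires: ambiguity remains unresolved
```

For "That room is cold." the lookup yields the temperature sense, while "That person is cold." yields the character sense, mirroring the distinction drawn in the text.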
3.2. Syntactic Analysis:
Syntax analysis is performed at the word level to recognize the word category. The syntactic analysis of the programs would have to be in a position to isolate subjects, verbs, objects and various complements. It is a fairly complex procedure. For example, in "Imran eats the apple.", the actual meaning is that "Imran eats the apple", but ambiguity can be asserted if this sentence is conceived as "the apple eats Imran".
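A toy illustration of isolating subject, verb and object, assuming a small hand-made part-of-speech lexicon; the word-order heuristic below is what rules out the "the apple eats Imran" reading:

```python
# Assumed toy lexicon mapping words to coarse categories.
LEXICON = {"Imran": "NOUN", "eats": "VERB", "the": "DET", "apple": "NOUN",
           "Ahmed": "NOUN", "hit": "VERB", "a": "DET", "ball": "NOUN"}

def parse_svo(sentence):
    """Return (subject, verb, object): the first noun before the verb is
    taken as subject, the first noun after it as object."""
    words = sentence.rstrip(".").split()
    tags = [(w, LEXICON.get(w, "UNK")) for w in words]
    verb_idx = next(i for i, (_, t) in enumerate(tags) if t == "VERB")
    subject = next(w for w, t in tags[:verb_idx] if t == "NOUN")
    obj = next(w for w, t in tags[verb_idx + 1:] if t == "NOUN")
    return subject, tags[verb_idx][0], obj
```

Because the heuristic keys on linear order around the verb, "Imran eats the apple." can only yield Imran as subject and apple as object.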
3.3. Semantic Analysis:
To analyze a phrase from the semantic point of view means to give it a meaning. This should make clear that we have arrived at a crucial point. Semantic ambiguities are most common due to the fact that generally a computer is not in a position to distinguish logical situations, as in "The bus hit the pole while it was moving." All of us would surely interpret the phrase as "The bus, while moving, hit the pole.", while nobody would dream of attributing to the sentence the meaning "the bus hit the pole while the pole was moving".
3.4. Pragmatic Analysis:
Pragmatic ambiguities arise when communication happens between two persons who do not share the same context. Take the following example: "I will arrive at the airport at 8 o'clock". In this example, if the subject person belongs to a different continent, the meaning can be totally changed.
4. DESIGNED SYSTEM ARCHITECTURE
The designed speech language contents system has the ability to draw speech language contents after reading the text scenario provided by the user. The system works in five modules: text input acquisition, text understanding, knowledge extraction, generation of speech language contents and finally multi-lingual code generation, as shown in the following figure.
4.1. Text input acquisition
This module helps to acquire the input text scenario. The user provides the business scenario in the form of paragraphs of text. This module reads the input text in the form of characters and generates words by concatenating the input characters. This module is the implementation of the lexical phase. Lexicons and tokens are generated in this module.
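The lexical phase described here, reading characters and concatenating them into words, can be sketched as follows (a hedged illustration, not the paper's code):

```python
def tokenize(text):
    """Lexical phase: scan characters, concatenate them into words, and
    emit (token, kind) lexemes; punctuation becomes separate tokens."""
    tokens, word = [], ""
    for ch in text:
        if ch.isalnum() or ch == "'":
            word += ch                 # build the word character by character
        else:
            if word:
                tokens.append((word, "WORD"))
                word = ""
            if not ch.isspace():
                tokens.append((ch, "PUNCT"))
    if word:
        tokens.append((word, "WORD"))
    return tokens
```

The resulting word stream is exactly what the next module (contents interpretation) consumes.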
4.2. Contents Interpretation
This module reads the input from module 1 in the form of words. These words are categorized into various classes such as verbs, helping verbs, nouns, pronouns, adjectives, prepositions, conjunctions, etc.
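A sketch of this categorization step over a small assumed lexicon; a real system would need a full dictionary and morphological rules:

```python
# Assumed toy lexicon; the fallback to "noun" for unknown words is a
# common heuristic for open-class vocabulary, not the paper's rule set.
WORD_CLASSES = {
    "eats": "verb", "hit": "verb", "played": "verb",
    "is": "helping verb", "was": "helping verb",
    "he": "pronoun", "it": "pronoun",
    "cold": "adjective", "red": "adjective",
    "in": "preposition", "with": "preposition",
    "and": "conjunction", "or": "conjunction",
}

def categorize(words):
    """Map each word to its class: verb, helping verb, pronoun, etc."""
    return [(w, WORD_CLASSES.get(w.lower(), "noun")) for w in words]
```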
Figure 1: Architecture of the designed Decision Support System for Domain Specific Terminology Extraction. The pipeline flows from Text Input Acquisition through Contents Interpretation and Knowledge Retrieval (matching terms against the domain knowledge) to Terminology Extraction, which outputs the required terms.
4.3. Knowledge Retrieval
This module extracts different objects and classes and their respective attributes on the basis of the input provided by the preceding module. Nouns are symbolized as classes and objects, and their associated attributes are termed attributes.
4.4. Terminology Extraction
This module finally uses speech language content symbols to constitute the various speech language contents by combining the available symbols according to the information extracted by the previous module.
5. USED METHODOLOGY
Conventional natural language processing based systems use rule-based systems. Agents are another way to address this problem [8]. In this research, a rule-based algorithm has been used which has a robust ability to read, understand and extract the desired information. First of all, the basic elements of the language grammar are extracted, such as verbs, nouns, adjectives, etc.; then, on the basis of this extracted information, further processing is performed. In linguistic terms, verbs often specify actions, and noun phrases the objects that participate in the action [16]. Each noun phrase's role then specifies how the object participates in the action. As in the following example:
"Ahmed hit a ball with a racket."
A procedure that understands such a sentence must discover that Ahmed is the agent because he performs the action of hitting, that the ball is the thematic object because it is the object hit, and that the racket is an instrument because it is the tool with which the hitting is done. Thus, sentence analysis requires, in part, the answers to such questions. The number of thematic roles embraced by various theories varies considerably. Some people use about half a dozen thematic roles [9]. Others use many times as many. The exact number does not matter much, as long as it is great enough to expose natural constraints on how verbs and thematic instances form sentences.
Agent: The agent causes the action to occur, as in "Ahmed hit the ball," where Ahmed is the agent who performs the task. In a passive sentence, however, the agent may also appear in a prepositional phrase: "The ball was hit by Ahmed."
Co-agent: The word with may introduce a noun phrase that serves as a partner to the principal agent. The two carry out the action together: "Ahmed played tennis with Ali."
Beneficiary: The beneficiary is the person for whom an action has been performed: "Ahmed bought the balls for Ali." In this sentence, Ali is the beneficiary.
Thematic object: The thematic object is the object the sentence is really all about, typically the object undergoing a change. Often the thematic object is the same as the syntactic direct object, as in "Ahmed hit the ball." Here the ball is the thematic object.
Conveyance: The conveyance is something in or on which one travels: "Aslam always goes by train."
Trajectory: Motion from source to destination takes place over a trajectory. In contrast to the other role possibilities, several prepositions can serve to introduce trajectory noun phrases: "Ahmed and Aslam went in through the front door."
Location: The location is where an action occurs. As in the trajectory role, several prepositions are possible, each of which conveys meaning in addition to serving as a signal that a noun phrase is a location: "Ahmed and Ali studied in the library, at a desk, by the wall, by a picture, near the door."
Time: Time specifies when an action occurs. Prepositions such as at, before, and after introduce noun phrases serving as time role fillers: "Ahmed and Ali left before noon."
Duration: Duration specifies how long an action takes. Prepositions such as for indicate duration: "Ali and Zia jogged for an hour."
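The prepositional cues listed above can be sketched as a lookup table; as the ambiguous entries show, a cue alone rarely determines the role, so a real analyzer needs further disambiguation (the table below is an illustrative subset):

```python
# Prepositional cues for candidate thematic roles (illustrative subset;
# entries with "/" mark cues that remain ambiguous without more context).
ROLE_CUES = {"with": "co-agent/instrument", "for": "beneficiary/duration",
             "by": "agent/conveyance", "before": "time", "after": "time",
             "in": "location/trajectory", "at": "location/time"}

def label_roles(sentence):
    """Label each prepositional phrase with its candidate thematic roles."""
    words = sentence.rstrip(".").split()
    roles = []
    for i, w in enumerate(words):
        if w in ROLE_CUES and i + 1 < len(words):
            phrase = " ".join(words[i + 1:i + 3])  # crude 2-word NP window
            roles.append((phrase, ROLE_CUES[w]))
    return roles
```

Applied to "Ali and Zia jogged for an hour.", the cue for yields the beneficiary/duration candidates, which a fuller analyzer would resolve to duration because the noun phrase denotes a time span.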
6. CONCLUSION
The designed system for Automatic Domain Specific Terminology Extraction using a Decision Support System is a robust framework for inferring the appropriate meanings from a given text. The accomplished research is related to the understanding of human languages. Human beings need specific linguistic knowledge for generating and understanding speech language contents. It is difficult for computers to perform this task. The speech language context understanding using a rule-based framework has the ability to read user-provided text, extract the related information and ultimately give meanings to the extracted contents. The designed system is very effective and has a high accuracy of up to 90%. The designed system depicts the meanings of the given sentence or paragraph efficiently. An elegant graphical user interface has also been provided for the user to enter the input scenario in a proper way and generate the speech language contents.
7. FUTURE WORK
The designed system analyzes and extracts the contents of a speech language such as English. The currently designed system only works for active-voice sentences. Passive-voice sentences are still to be worked on to make the system more efficient and effective for various business applications. There is also room for improvement in the algorithms for better extraction of language parts and understanding of the meanings of the speech language context. The current accuracy of generating meanings from a text is about 85% to 90%. It can be enhanced up to 95% or even more by improving the algorithms and inducing the ability of learning in the system.
8. REFERENCES
[1] - Andrade M. A. and Bork P. "Automated Extraction of Information in Molecular Biology", FEBS Letters, pp. 12-17, 2000.
[2] - Blaschke C., Andrade M. A., Ouzounis C. and Valencia A. "Automatic extraction of biological information from scientific text: protein-protein interactions". Ismb, pp. 60-67, 1999.
[3] - C. A. Thompson, R. J. Mooney and L. R. Tang, "Learning to parse natural language database queries into logical form", in: Workshop on Automata Induction, Grammatical Inference and Language Acquisition, 1997.
[4] - Pustejovsky J., Castaño J., Zhang J., Kotecki M., Cochran B. "Robust relational parsing over biomedical literature: Extracting inhibit relations". Pac. Symp. Biocomput., pp. 362-73, 2002.
[5] - Krovetz, R. & Croft, W. B. "Lexical ambiguity and information retrieval", ACM Transactions on Information Systems, 10, pp. 115-141, 1992.
[6] - Losee, R. M. "Parameter estimation for probabilistic document retrieval models". Journal of the American Society for Information Science, 39(1), pp. 8-16, 1988.
[7] - Partee, B. H., Meulen, A. t. & Wall, R. E. "Mathematical Methods in Linguistics". Kluwer, Dordrecht, The Netherlands, 1999.
[8] - Salton, G. & McGill, M. "Introduction to Modern Information Retrieval". McGraw-Hill, New York, 1995.
[9] - Maron, M. E. & Kuhns, J. L. "On relevance, probabilistic indexing, and information retrieval". Journal of the ACM, 7, pp. 216-244, 1997.
[10] - Chomsky, N. "Aspects of the Theory of Syntax". MIT Press, Cambridge, Mass., 1965.
[11] - Chow, C. & Liu, C. "Approximating discrete probability distributions with dependence trees". IEEE Transactions on Information Theory, IT-14(3), pp. 462-467, 1968.
[12] - C. A. Thompson, R. J. Mooney and L. R. Tang, "Learning to parse natural language database queries into logical form", Workshop on Automata Induction, Grammatical Inference and Language Acquisition, 1997.
[13] - Taffet, Mary D. (2001): GENTECH 2001 Scholarship Proposal: Automatic Tagging of Genealogical Data to Enhance Web-based Retrieval. Available at: <http://web.syr.edu/~mdtaffet/GENTECH_Scholarship_Proposal.htm>.
[14] - Nerbonne, John (2003): "Natural language processing in computer-assisted language learning". In: Mitkov, R. (ed.): The Oxford Handbook of Computational Linguistics. Oxford: 670-698.
[15] - Menzel, Wolfgang / Schröder, Ingo (1998): "Error diagnosis for language learning systems". Proceedings of NLP+IA'98 1: 526-530.
[16] - Etchegoyhen, Thierry / Wehrle, Thomas (1998): "Overview of GBGen: A large-scale, domain independent syntactic generator". In: Hovy, Eduard (ed.): Proceedings of the 9th International Workshop on Natural Language Generation. Niagara-on-the-Lake: 288-291.
[17] - Granger, Sylviane (2003): "Error-tagged Learner Corpora and CALL: A Promising Synergy". CALICO Journal 20/3: 465-480.
[18] - Menzel, Wolfgang / Schröder, Ingo (1998): "Error diagnosis for language learning systems". Proceedings of NLP+IA'98 1: 526-530.
[19] - Nerbonne, John (2003): "Natural language processing in computer-assisted language learning". In: Mitkov, R. (ed.): The Oxford Handbook of Computational Linguistics. Oxford: 670-698.