Ontologies have been applied in many areas in recent years, especially the Semantic Web, Information
Retrieval, Information Extraction, and Question Answering. The purpose of a domain-specific ontology
is to eliminate conceptual and terminological confusion. It accomplishes this by specifying a set of generic
concepts that characterize the domain, together with their definitions and interrelationships. This paper
describes algorithms for identifying semantic relations and constructing an Information Technology
ontology while extracting concepts and objects from different sources. The ontology is constructed
from three main resources: the ACM classification, Wikipedia, and unstructured files from the ACM Digital Library. Our
algorithms combine Natural Language Processing and Machine Learning. We use Natural Language
Processing tools such as OpenNLP and the Stanford dependency parser to analyze sentences.
We then extract sentences matching English patterns to build a training set. We use a
random sample from the 245 ACM categories to evaluate our results. The results generated show that our
system yields superior performance.
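The pattern-based extraction step described above can be illustrated with a minimal sketch. The Hearst-style pattern below ("X such as Y, Z") and the `extract_isa_pairs` helper are hypothetical stand-ins for illustration, not the paper's actual English patterns or parser output:

```python
import re

# One Hearst-style pattern: "X such as Y, Z" yields (hyponym, hypernym)
# candidates. A real system would use many such patterns plus parser output.
PATTERN = re.compile(
    r"(\w+(?:\s\w+)?)\s+such as\s+(\w+(?:\s\w+)?)((?:\s*,\s*\w+(?:\s\w+)?)*)"
)

def extract_isa_pairs(sentence):
    """Return (hyponym, hypernym) candidates from one sentence."""
    pairs = []
    for m in PATTERN.finditer(sentence):
        hypernym = m.group(1)
        pairs.append((m.group(2), hypernym))
        # Pick up any further comma-separated hyponyms after the first one.
        for extra in re.findall(r"\w+(?:\s\w+)?", m.group(3)):
            pairs.append((extra, hypernym))
    return pairs
```

Sentences matched by such patterns would then serve as positive examples for the training set.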
Ontology Matching Based on Hypernym, Hyponym, Holonym, and Meronym Sets in WordNet (dannyijwest)
Considerable research in the field of ontology matching has been performed, as information sharing
and reuse become necessary in ontology development. Lexical similarity in ontology
matching is commonly measured using synsets defined in WordNet. In this paper, we define a Super Word Set,
an aggregate set that includes the hypernym, hyponym, holonym, and meronym sets in WordNet.
The Super Word Set Similarity is calculated as the rate at which the words of a concept name and its synset
are included in the Super Word Set. To measure Super Word Set Similarity, we first extracted
Matched Concepts (MC), Matched Properties (MP), and Property Unmatched Concepts (PUC) from the
result of ontology matching. We compared these against two ontology matching tools, COMA++ and
LOM. The Super Word Set Similarity shows an average improvement of 12% over COMA++ and 19%
over LOM.
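The aggregation idea can be sketched in a few lines. The mini-lexicon below is an invented toy stand-in for WordNet, and the Jaccard overlap stands in for the paper's inclusion-rate formula, which is not reproduced here:

```python
# Toy stand-in for WordNet: each term maps to its relation sets.
MINI_WORDNET = {
    "car": {
        "hypernyms": {"vehicle"},
        "hyponyms": {"coupe", "sedan"},
        "holonyms": {"traffic"},
        "meronyms": {"wheel", "engine"},
    },
    "automobile": {
        "hypernyms": {"vehicle"},
        "hyponyms": {"sedan"},
        "holonyms": {"traffic"},
        "meronyms": {"engine"},
    },
}

def super_word_set(term):
    """Aggregate the hypernym, hyponym, holonym, and meronym sets of a term."""
    rels = MINI_WORDNET.get(term, {})
    out = set()
    for key in ("hypernyms", "hyponyms", "holonyms", "meronyms"):
        out |= rels.get(key, set())
    return out

def sws_similarity(a, b):
    """Overlap between the two Super Word Sets (Jaccard, for illustration)."""
    sa, sb = super_word_set(a), super_word_set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)
```

A real implementation would query WordNet for these sets rather than a hand-built table.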
This paper proposes a natural-language-based discourse analysis method for extracting
information from news articles in different domains. The discourse analysis uses Rhetorical Structure
Theory (RST) to find the coherent groups of text that are most prominent for extracting information.
RST uses the nucleus-satellite concept to identify the most prominent text in a
document. After discourse analysis, text analysis is performed to extract domain-related objects
and relate them. A knowledge-based system is used for the extraction, consisting of a domain
dictionary that holds a bag of words for the domain. The system is
evaluated against a gold-standard analysis and human judgment of the extracted information.
Data integration is a perennial challenge facing large-scale data scientists. Bio-ontologies are useful in this endeavour as sources of synonyms and also for rules-based fuzzy integration pipelines.
There is a vast amount of unstructured Arabic information on the Web; this data is often organized as
semi-structured text and cannot be used directly. This research proposes a semi-supervised technique that
extracts binary relations between two Arabic named entities from the Web. Several works have
addressed relation extraction from Latin-script texts, but as far as we know there is no prior work for Arabic
text using a semi-supervised technique. The goal of this research is to extract a large list or table of
named entities and relations in a specific domain. A small handful of instance relations is
required as input from the user. The system exploits summaries from the Google search engine as source
text. These instances are used to extract patterns, and the output is a set of new entities and their relations. The
results from four experiments show that precision and recall vary according to relation type. Precision
ranges from 0.61 to 0.75 while recall ranges from 0.71 to 0.83. The best result is obtained for the (player, club)
relationship: 0.72 precision and 0.83 recall.
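A single iteration of this seed-driven bootstrapping can be sketched as follows. The English sentences, entity names, and the "text between entities" pattern representation are invented for illustration; the paper works on Arabic search-engine summaries with its own pattern format:

```python
# Hypothetical one-iteration bootstrapping sketch for relation extraction.
def learn_patterns(sentences, seeds):
    """Collect the text between each seed entity pair as a candidate pattern."""
    patterns = set()
    for e1, e2 in seeds:
        for s in sentences:
            if e1 in s and e2 in s:
                i, j = s.index(e1), s.index(e2)
                if i < j:
                    patterns.add(s[i + len(e1):j])
    return patterns

def apply_patterns(sentences, patterns, entities):
    """Find new entity pairs joined by a learned pattern."""
    found = set()
    for s in sentences:
        for e1 in entities:
            for e2 in entities:
                if e1 != e2:
                    for p in patterns:
                        if e1 + p + e2 in s:
                            found.add((e1, e2))
    return found
```

Starting from the seed (Messi, Barcelona), the learned pattern " plays for " would recover the new pair (Salah, Liverpool) from a second sentence, which is the essence of the (player, club) experiment.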
COMPREHENSIVE ANALYSIS OF NATURAL LANGUAGE PROCESSING TECHNIQUE (Journal For Research)
Natural Language Processing (NLP) techniques are among the most widely used techniques in computer applications, and the field has become vast and advanced. Language is the means of communication among humans, and in the present scenario, when nearly everything is computerized, communication between computers and humans has become a necessity. To fulfill this necessity, NLP emerged as a means of interaction that narrows the gap between machines (computers) and humans. It evolved from the study of linguistics and was assessed through the Turing test, though early systems were limited to small data sets. Later, various algorithms were developed, along with concepts from Artificial Intelligence (AI), for the successful execution of NLP. In this paper, the main emphasis is on the different NLP techniques developed to date, their applications, and a comparison of those techniques across different parameters.
An important problem in Thai word segmentation is the sentential noun phrase. Existing
studies try to minimize the problem, but no research solves it directly. This study
investigates an approach to resolving the problem using conditional random fields (CRFs), a probabilistic
model for segmenting and labeling sequence data. The results show that more than 78.61% of noun phrases
were correctly detected with our technique.
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C... (Waqas Tariq)
A "sentence pattern" in modern Natural Language Processing is often considered to be a contiguous sequence of words (an n-gram). However, in many branches of linguistics, like Pragmatics or Corpus Linguistics, it has been noticed that simple n-gram patterns are not sufficient to reveal the full sophistication of grammar patterns. We present a language-independent architecture for extracting from sentences more sophisticated patterns than n-grams. In this architecture, a "sentence pattern" is considered to be an n-element ordered combination of sentence elements. Experiments showed that the method extracts significantly more frequent patterns than the usual n-gram approach.
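The difference between the two notions of "pattern" can be shown directly with the standard library. This is a sketch of the combinatorial definition only, not the paper's full extraction architecture:

```python
from itertools import combinations

def sentence_patterns(words, n):
    """All order-preserving n-element combinations of sentence elements."""
    return list(combinations(words, n))

def ngrams(words, n):
    """Contiguous n-grams, for comparison."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]
```

For a four-word sentence there are six ordered 2-element combinations but only three contiguous bigrams; pairs like ("this", "test") that skip intervening words are visible only to the combinatorial view.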
Analysis of Lexico-Syntactic Patterns for Antonym Pair Extraction from a Turkish Corpus (csandit)
Extraction of semantic relations from various sources such as corpora, web pages, dictionary
definitions, etc. is one of the most important issues in the study of Natural Language Processing
(NLP). Various methods have been used to extract semantic relations from various sources;
the pattern-based approach is one of the most popular among them. In this study, we
propose a model to extract antonym pairs from a Turkish corpus automatically. Using a set of
seeds, we automatically extract lexico-syntactic patterns (LSPs) for the antonym relation from
the corpus. A reliability score is calculated for each pattern, and the most reliable patterns are used to
generate new antonym pairs. The study is conducted only on adjective-adjective and noun-noun pairs.
Noun and adjective target words are used to measure the success of the method, and candidate
antonyms are generated using the reliable patterns. For each antonym pair consisting of a candidate
antonym and a target word, an antonym score is calculated, and pairs above a certain score are
accepted as antonym pairs. The proposed method shows good performance with 77.2% average
accuracy.
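The reliability-scoring step can be sketched as follows. The scoring rule (fraction of a pattern's extractions that are known seeds), the Turkish pairs, and the threshold are all invented for illustration; the paper's actual reliability formula is not reproduced here:

```python
# Toy reliability scoring for lexico-syntactic patterns.
def pattern_reliability(extracted_by_pattern, seed_antonyms):
    """Map pattern -> fraction of its extracted pairs found in the seeds."""
    scores = {}
    for pattern, pairs in extracted_by_pattern.items():
        if pairs:
            hits = sum(1 for p in pairs if p in seed_antonyms)
            scores[pattern] = hits / len(pairs)
        else:
            scores[pattern] = 0.0
    return scores

def reliable_patterns(scores, threshold=0.5):
    """Keep only patterns whose reliability meets the threshold."""
    return {p for p, s in scores.items() if s >= threshold}
```

Only the surviving patterns would then be applied to generate new candidate antonym pairs.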
FINDING OUT NOISY PATTERNS FOR RELATION EXTRACTION OF BANGLA SENTENCES (ijnlc)
Relation extraction is one of the most important parts of natural language processing: it is the process of extracting relationships from a text. Extracted relationships occur between two or more entities of a certain type, and these relations may follow different patterns. The goal of this paper is to find the noisy patterns for relation extraction of Bangla sentences. For the research work, seed tuples containing two entities and the relation between them were needed. Seed tuples can be obtained from Freebase, a large collaborative knowledge base and database of general, structured information for public use. But for the Bangla language no Freebase is available, so we built a Bangla Freebase, which was the real challenge, and it can be used for any other NLP-based work. We then identified the noisy patterns for relation extraction by measuring a conflict score.
Taxonomy extraction from automotive natural language requirements using unsup... (ijnlc)
In this paper we present a novel approach to semi-automatically learn concept hierarchies from natural
language requirements of the automotive industry. The approach is based on the distributional hypothesis
and the special characteristics of domain-specific German compounds. We extract taxonomies by using
clustering techniques in combination with general thesauri. Such a taxonomy can be used to support
requirements engineering in early stages by providing a common system understanding and an agreed-upon
terminology. This work is part of an ontology-driven requirements engineering process, which builds
on top of the taxonomy. Evaluation shows that this taxonomy extraction approach outperforms common
hierarchical clustering techniques.
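One of the "special characteristics of domain-specific German compounds" that such an approach can exploit is that a compound's head (its final element) usually names its hypernym: a Bremslicht (brake light) is a kind of Licht. The sketch below groups compounds under known heads to form a one-level taxonomy; the terms and the head list (standing in for a general thesaurus) are invented, and this is only one ingredient of the paper's clustering-based method:

```python
# Group each German compound under the longest known head it ends with.
def taxonomy_from_compounds(terms, heads):
    """Return {head: set of compounds whose hypernym is that head}."""
    tax = {}
    for term in terms:
        best = ""
        for head in heads:
            # A term is not its own hyponym; match heads case-insensitively.
            if term != head and term.lower().endswith(head.lower()):
                if len(head) > len(best):
                    best = head
        if best:
            tax.setdefault(best, set()).add(term)
    return tax
```

A fuller system would combine such head-based groupings with distributional clustering over the requirements corpus.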
A DOMAIN INDEPENDENT APPROACH FOR ONTOLOGY SEMANTIC ENRICHMENT (cscpconf)
Automatic ontology enrichment consists of automatically adding new concepts and/or new relations to an initial ontology built manually from basic domain knowledge. Concretely, enrichment first extracts concepts and relations from textual sources and then places them in their correct positions in the initial ontology. The main issue in this process is how to preserve the coherence of the ontology after the operation. For this purpose, we take the semantic aspect into account in the enrichment process by using similarity techniques between terms. Contrary to other approaches, ours is domain independent, and the enrichment process is based on a semantic analysis. Another advantage of our approach is that it takes into account both types of relations, taxonomic and non-taxonomic.
Survey On Building A Database Driven Reverse Dictionary (Editor IJMTER)
Reverse dictionaries are widely used as reference works organized by concepts,
phrases, or the definitions of words. This paper describes the many challenges inherent in building a
reverse lexicon and maps the problem to the well-known conceptual similarity problem. Standard web
search engines are basic versions of such a system; they take advantage of huge scale, which permits
inferring general interest in documents from link information. This paper presents a basic study of a
database-driven reverse dictionary using three large-scale datasets: person names, general English
words, and biomedical concepts. It also analyzes difficulties arising in the use of documents
produced by a reverse dictionary.
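The core of a database-driven reverse dictionary can be sketched as an inverted index from definition words to headwords. The two-entry lexicon and the overlap-count ranking below are invented for illustration; production systems add stemming, stop-word handling, and semantic expansion of the query:

```python
# Minimal reverse-dictionary sketch: index headwords by definition words,
# then rank headwords by overlap with the user's phrase.
def build_index(lexicon):
    """Invert {headword: definition} into {definition word: set of headwords}."""
    index = {}
    for word, definition in lexicon.items():
        for token in definition.lower().split():
            index.setdefault(token, set()).add(word)
    return index

def reverse_lookup(index, phrase):
    """Rank headwords by how many query words hit their definitions."""
    counts = {}
    for token in phrase.lower().split():
        for word in index.get(token, ()):
            counts[word] = counts.get(word, 0) + 1
    return sorted(counts, key=lambda w: -counts[w])
```

A query phrase that paraphrases a definition then surfaces the headword it describes.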
Semantic data integration is the process of using a conceptual representation of the data and of their relationships to eliminate possible heterogeneities.
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
The papers for publication in The International Journal of Engineering & Science are selected through rigorous peer reviews to ensure originality, timeliness, relevance, and readability.
Theoretical work submitted to the Journal should be original in its motivation or modeling structure. Empirical analysis should be based on a theoretical framework and should be capable of replication. It is expected that all materials required for replication (including computer programs and data sets) should be available upon request to the authors.
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING (cscpconf)
In the last decade, ontologies have played a key role as a technology for information sharing and agent interoperability in different application domains. In the semantic web domain, ontologies are used to face the great challenge of representing the semantics of data, in order to bring the web to its full power and hence achieve its objective. However, using ontologies as common, shared vocabularies requires a certain degree of interoperability between them. To meet this requirement, mapping ontologies is an unavoidable solution. Indeed, ontology mapping builds a meta layer that allows different applications and information systems to access and share their information, after resolving the different forms of syntactic, semantic, and lexical mismatches. In the contribution presented in this paper, we integrate the semantic aspect, based on the external lexical resource WordNet, to design a new algorithm for fully automatic ontology mapping. This fully automatic character is the main difference between our contribution and most existing semi-automatic ontology mapping algorithms, such as Chimaera, Prompt, Onion, and Glue. To enhance the performance of our algorithm, the mapping discovery stage is based on the combination of two sub-modules: the former analyzes the concepts' names and the latter analyzes their properties. Each of these two sub-modules is itself based on the combination of lexical and semantic similarity measures.
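The combination of lexical and semantic similarity measures can be sketched as follows. The synonym table is a toy stand-in for WordNet, and the equal 50/50 weighting is an arbitrary choice for illustration, not the paper's actual measure:

```python
from difflib import SequenceMatcher

# Toy synonym sets standing in for WordNet synsets.
SYNONYMS = {
    "car": {"car", "auto", "automobile"},
    "automobile": {"car", "auto", "automobile"},
    "apple": {"apple"},
}

def lexical_sim(a, b):
    """String-level similarity of the two names."""
    return SequenceMatcher(None, a, b).ratio()

def semantic_sim(a, b):
    """Jaccard overlap of the terms' synonym sets."""
    sa = SYNONYMS.get(a, {a})
    sb = SYNONYMS.get(b, {b})
    return len(sa & sb) / len(sa | sb)

def combined_sim(a, b, w=0.5):
    """Weighted combination of the lexical and semantic measures."""
    return w * lexical_sim(a, b) + (1 - w) * semantic_sim(a, b)
```

The semantic component is what lets lexically dissimilar names like "car" and "automobile" still match, which is the point of bringing WordNet into the mapping algorithm.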
Measures of semantic similarity, and of semantic relatedness in particular, are very important in the current scenario due to the huge demand for applications based on natural language processing, such as chatbots and information retrieval systems like knowledge-base-driven FAQ systems. Current approaches generally use similarity measures that do not exploit the context-sensitive relationships between words. This leads to erroneous similarity predictions and is not of much use in real-life applications. This work proposes a novel approach that gives an accurate relatedness measure for any two words in a sentence by taking their context into consideration. This context correction results in a more accurate similarity prediction, which in turn yields higher accuracy in information retrieval systems.
An Enhanced Suffix Tree Approach to Measure Semantic Similarity between Multi... (iosrjce)
IOSR Journal of Computer Engineering (IOSR-JCE) is a double blind peer reviewed International Journal that provides rapid publication (within a month) of articles in all areas of computer engineering and its applications. The journal welcomes publications of high quality papers on theoretical developments and practical applications in computer technology. Original research papers, state-of-the-art reviews, and high quality technical notes are invited for publications.
PATENT DOCUMENT SUMMARIZATION USING CONCEPTUAL GRAPHS (kevig)
In this paper, a methodology to mine concepts from documents and use them to generate an
objective summary of the claims section of patent documents is proposed. The Conceptual Graph (CG)
formalism proposed by Sowa (Sowa 1984) is used in this work for representing concepts and their
relationships. Automatic identification of concepts and conceptual relations from text documents is a
challenging task. In this work, the focus is on the analysis of patent documents, mainly the claims
section. There are several complexities in the writing style of these documents, as
they are technical as well as legal. It is observed that the general in-depth parsers available in the open
domain fail to parse the claims-section sentences in patent documents. This failure
motivated us to develop a methodology to extract CGs using other resources. Thus, in the present work,
shallow parsing, NER, and machine learning techniques are used for extracting concepts and conceptual
relationships from sentences in the claims section of patent documents. This paper therefore discusses i)
generation of a CG, a semantic network, and ii) generation of an abstractive summary of the claims section of
the patent. The aim is to generate a summary that is 30% of the length of the whole claims section. Here we use
Restricted Boltzmann Machines (RBMs), a deep learning technique, for automatically extracting CGs. We
have tested our methodology on a corpus of 5000 patent documents from the electronics domain. The results
obtained are encouraging and comparable with state-of-the-art systems.
Conceptual Similarity Measurement Algorithm For Domain Specific Ontology (Zac Darcy)
This paper presents the similarity measurement algorithm for domain specific terms collected in the
ontology based data integration system. This similarity measurement algorithm can be used in ontology
mapping and query service of ontology based data integration system. In this paper, we focus on the web
query service to apply this proposed algorithm. Concept similarity is important for the web query service
because the words in a user's input query are not always identical to the concepts in the ontology. So, we need to
extract the possible concepts that match or are related to the input words with the help of the machine-readable
dictionary WordNet. Sometimes, we use the generated mapping rules in the query generation procedure for
words whose similarity cannot be confirmed by WordNet. We demonstrate the effect of this
algorithm on two-degree semantic results of web mining by generating the concept results obtained from
the input query.
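As an illustration of the kind of concept matching WordNet enables, here is a minimal path-length similarity over a tiny hand-built hypernym hierarchy. Both the hierarchy and the scoring formula are hypothetical stand-ins for WordNet, not the paper's algorithm:

```python
# A tiny hand-built hypernym hierarchy standing in for WordNet;
# the concepts and the scoring formula are illustrative only.
HYPERNYMS = {
    "car": "vehicle", "truck": "vehicle", "vehicle": "artifact",
    "artifact": "entity", "dog": "animal", "cat": "animal",
    "animal": "entity",
}

def path_to_root(concept):
    """Chain of hypernyms from a concept up to the root."""
    path = [concept]
    while concept in HYPERNYMS:
        concept = HYPERNYMS[concept]
        path.append(concept)
    return path

def path_similarity(a, b):
    """1 / (1 + path length through the lowest common hypernym)."""
    pa, pb = path_to_root(a), path_to_root(b)
    for i, node in enumerate(pa):
        if node in pb:
            return 1.0 / (1 + i + pb.index(node))
    return 0.0

sim_siblings = path_similarity("car", "truck")  # share "vehicle"
sim_distant = path_similarity("car", "dog")     # share only "entity"
```

Siblings under a common hypernym score higher than concepts related only through the root, which is the ordering a query service needs when ranking candidate concepts for an input word.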
ANALYSIS OF LEXICO-SYNTACTIC PATTERNS FOR ANTONYM PAIR EXTRACTION FROM A TURK...cscpconf
Extraction of semantic relations from various sources such as corpora, web pages, dictionary
definitions, etc. is one of the most important issues in the study of Natural Language Processing
(NLP). Various methods have been used to extract semantic relations from these sources.
The pattern-based approach is one of the most popular among them. In this study, we
propose a model to extract antonym pairs from Turkish corpus automatically. Using a set of
seeds, we automatically extract lexico-syntactic patterns (LSPs) for antonym relation from
corpus. Reliability score is calculated for each pattern. The most reliable patterns are used to
generate new antonym pairs. The study is conducted only on adjective-adjective and noun-noun pairs.
Noun and adjective target words are used to measure the success of the method, and candidate
antonyms are generated using the reliable patterns. For each antonym pair consisting of a candidate
antonym and a target word, an antonym score is calculated. Pairs that reach a certain score are
accepted as antonym pairs. The proposed method shows good performance with 77.2% average
accuracy.
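The seed-driven reliability scoring described above can be sketched as follows. The seeds, corpus lines, patterns and the reliability formula are all hypothetical illustrations (and in English for readability; the paper works on a Turkish corpus with its own formula):

```python
import re

# Hypothetical seed antonym pairs, corpus and lexico-syntactic patterns.
SEEDS = {("hot", "cold"), ("big", "small")}
CORPUS = [
    "neither hot nor cold water",
    "neither big nor small rooms",
    "either hot or cold meals",
    "big and small boxes",
    "cats and dogs play outside",
]
PATTERNS = [r"neither (\w+) nor (\w+)",
            r"either (\w+) or (\w+)",
            r"(\w+) and (\w+)"]

def reliability(pattern):
    """Fraction of a pattern's matches that are known seed pairs."""
    matches = [m for line in CORPUS for m in re.findall(pattern, line)]
    if not matches:
        return 0.0
    hits = sum(1 for a, b in matches if (a, b) in SEEDS or (b, a) in SEEDS)
    return hits / len(matches)

scores = {p: reliability(p) for p in PATTERNS}
```

The most reliable patterns (here "neither X nor Y") are then re-applied to the corpus to harvest new candidate antonym pairs, while noisy patterns like "X and Y" are down-weighted.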
An Efficient Semantic Relation Extraction Method For Arabic Texts Based On Si...CSCJournals
Semantic relation extraction is an important component of ontologies that can support many applications, e.g. text mining, question answering, and information extraction. However, extracting semantic relations between concepts is not trivial and is one of the main challenges in the Natural Language Processing (NLP) field. In this paper, we propose a method for semantic relation extraction between concepts. The method relies on the definition of concept context and on semantic similarity measures to extract relations from a domain corpus. In this work, we implemented algorithms for concept context construction and for similarity computation based on different semantic similarity measures. We analyze the proposed methods and evaluate their performance. The preliminary experiments showed that the best precision of 83% is obtained with the Lin measure at minimum confidence = 0.50, and a precision of 85% with the Cosine and Jaccard similarity measures. The main advantage is the automatic and unsupervised operation; it doesn't need any pre-labeled training data, and it can be used effectively for relation extraction in various domains. The results show the high effectiveness of the proposed approach for extracting relations for Arabic ontology construction.
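Two of the similarity measures named in the abstract, Cosine and Jaccard, can be sketched over concept contexts, i.e. the bags of words that co-occur with each concept in the corpus. The contexts below are hypothetical:

```python
import math
from collections import Counter

# Cosine similarity over term-frequency vectors of two concept contexts.
def cosine(ctx_a, ctx_b):
    a, b = Counter(ctx_a), Counter(ctx_b)
    dot = sum(a[w] * b[w] for w in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

# Jaccard similarity over the sets of context words.
def jaccard(ctx_a, ctx_b):
    a, b = set(ctx_a), set(ctx_b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical contexts for two domain concepts.
ctx_virus = ["infection", "cell", "disease", "spread"]
ctx_bacteria = ["infection", "cell", "disease", "culture"]
cos_sim = cosine(ctx_virus, ctx_bacteria)
jac_sim = jaccard(ctx_virus, ctx_bacteria)
```

A high score under either measure suggests the two concepts are semantically related and is the trigger for proposing a relation between them.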
In recent years, great advances have been made in the speed, accuracy, and coverage of automatic word
sense disambiguator systems that, given a word appearing in a certain context, can identify the sense of
that word. In this paper we consider the problem of deciding whether identical words contained in different
documents are related to the same meaning or are homonyms. Our goal is to improve the estimate of the
similarity of documents in which some words may be used with different meanings. We present three new
strategies for solving this problem, which are used to filter out homonyms from the similarity computation.
Two of them are intrinsically non-semantic, whereas the other one has a semantic flavor and can also be
applied to word sense disambiguation. The three strategies have been embedded in an article document
recommendation system that one of the most important Italian ad-serving companies offers to its customers.
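One possible non-semantic strategy of the kind described above can be sketched as follows: a word shared by two documents contributes to their similarity only when its surrounding words overlap enough, otherwise it is treated as a homonym and filtered out. The context window, stop-list and threshold here are hypothetical, not necessarily the paper's:

```python
# Hypothetical stop-list; function words would dominate contexts otherwise.
STOP = {"the", "a", "and", "of", "we"}

def context(word, doc, window=2):
    """Set of non-stop words within `window` positions of `word` in `doc`."""
    words = doc.split()
    ctx = set()
    for i, w in enumerate(words):
        if w == word:
            ctx.update(words[max(0, i - window):i])
            ctx.update(words[i + 1:i + 1 + window])
    return ctx - STOP

def shared_non_homonyms(doc_a, doc_b, min_overlap=1):
    """Shared words whose contexts overlap; the rest are filtered as homonyms."""
    shared = (set(doc_a.split()) & set(doc_b.split())) - STOP
    return {w for w in shared
            if len(context(w, doc_a) & context(w, doc_b)) >= min_overlap}

doc1 = "the bank approved the loan and the bank raised rates"
doc2 = "we walked along the bank of the river"
doc3 = "the bank approved another loan yesterday"
```

Here "bank" is excluded from the doc1/doc2 comparison (financial vs. river sense) but kept for doc1/doc3, so the homonym no longer inflates the similarity estimate.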
A Semantic Retrieval System for Extracting Relationships from Biological Corpusijcsit
The World Wide Web holds a vast amount of heterogeneous information. While searching the World Wide Web, users do not always find the type of information they expect. In the field of information extraction, extracting semantic relationships between terms from documents has become a challenge. This
paper proposes a system that retrieves documents based on query expansion and tackles the extraction of semantic relationships from biological documents. The system retrieves documents that are relevant to the input terms and then detects the existence of a relationship. In this system, we use the Boolean
model and pattern recognition, which help in determining the relevant documents and locating the relationship in the biological document. The system constructs a term-relation table that accelerates the relation extraction step. The proposed method offers another usage of the system, so
researchers can use it to figure out the relationship between two biological terms through the information available in biological documents. For the retrieved documents, the system also measures precision and recall.
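The evaluation the abstract mentions is the standard precision/recall computation over the retrieved and relevant document sets, sketched here with hypothetical document IDs:

```python
def precision_recall(retrieved, relevant):
    """Precision and recall of a retrieved set against the relevant set."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)          # true positives
    p = tp / len(retrieved) if retrieved else 0.0
    r = tp / len(relevant) if relevant else 0.0
    return p, r

# Hypothetical run: 4 documents retrieved, 3 actually relevant overall.
p, r = precision_recall(["d1", "d2", "d3", "d4"], ["d1", "d2", "d5"])
```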
Similar to Identifying the semantic relations on (20)
INFORMATION THEORY BASED ANALYSIS FOR UNDERSTANDING THE REGULATION OF HLA GEN...ijistjournal
Considering information entropy (IE), the regulation of HLA surface expression (SE) is modeled as an information propagation channel with an amount of distortion. HLA gene SE is treated as the sink, regulated by the inducible transcription factors (TFs), which act as the source. In previous work, with a fixed bin size, the IEs of source and receiver were computed, and the mutual information characterized the dependencies of HLA gene SE on certain TFs in different cell types of the hematopoietic system under the condition of leukemia. Although information theory has recently been used for generating different kinds of biological knowledge, and rules are available in specific biomedical domains, no such attempt has been made regarding gene expression regulation, and hence no such rule is available. In this work, the IE calculation with a varying bin size, taking the number of bins to be approximately half the sample size of an attribute, also confirms the previous inferences.
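The entropy-with-binning step described above can be sketched as discretizing expression values into bins (here the number of bins is half the sample size, as in the abstract) and computing Shannon entropy over the bin frequencies. The expression values are hypothetical:

```python
import math

def entropy(values, n_bins):
    """Shannon entropy (bits) of values discretized into n_bins equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins or 1.0   # guard against all-equal values
    counts = [0] * n_bins
    for v in values:
        counts[min(int((v - lo) / width), n_bins - 1)] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Hypothetical expression samples; bins = half the sample size.
expression = [0.1, 0.4, 0.35, 0.8, 0.7, 0.15, 0.9, 0.5]
h = entropy(expression, n_bins=len(expression) // 2)
```

Mutual information between a TF (source) and HLA SE (sink) would then be H(source) + H(sink) minus their joint entropy, computed with the same binning.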
Call for Research Articles - 5th International Conference on Artificial Intel...ijistjournal
5th International Conference on Artificial Intelligence and Machine Learning (CAIML 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology and applications of Artificial Intelligence and Machine Learning. The Conference looks for significant contributions to all major fields of Artificial Intelligence and Machine Learning in theoretical and practical aspects. The aim of the Conference is to provide a platform for researchers and practitioners from both academia and industry to meet and share cutting-edge developments in the field.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences that describe significant advances in the areas of Computer Science, Engineering and Applications.
Online Paper Submission - International Journal of Information Sciences and T...ijistjournal
The International Journal of Information Science & Techniques (IJIST) focuses on information systems science and technology, covering a multitude of applications of information systems in business administration, social science, biosciences, humanities education, library science management, depiction of data and structural illustration, big data analytics, and information economics in real engineering and scientific problems.
This journal provides a forum that impacts the development of engineering, education, technology management, information theories and application validation. It also acts as a path to exchange novel and innovative ideas about Information systems science and technology.
SYSTEM IDENTIFICATION AND MODELING FOR INTERACTING AND NON-INTERACTING TANK S...ijistjournal
System identification from experimental data plays a vital role in model-based controller design. Derivation of a process model from first principles is often difficult due to its complexity. The first stage in the development of any control and monitoring system is the identification and modeling of the system. Each model is developed within the context of a specific control problem. Thus, the need for a general system identification framework is warranted. The proposed framework should be able to adapt and emphasize different properties based on the control objective and the nature of the behavior of the system. Therefore, system identification has been a valuable tool in identifying the model of the system based on the input and output data for the design of the controller. The present work is concerned with the identification of transfer function models using statistical model identification, the process reaction curve method, the ARX model, a genetic algorithm, and modeling using neural networks and fuzzy logic for interacting and non-interacting tank processes. The identification and modeling techniques used are prone to parameter changes and disturbances. The proposed methods are used for identifying the mathematical model and intelligent model of interacting and non-interacting processes from real-time experimental data.
Call for Research Articles - 4th International Conference on NLP & Data Minin...ijistjournal
4th International Conference on NLP & Data Mining (NLDM 2024) will provide an excellent international forum for sharing knowledge and results in theory, methodology and applications of Natural Language Computing and Data Mining.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences describing significant advances in, but not limited to, the following areas.
Implementation of Radon Transformation for Electrical Impedance Tomography (EIT)ijistjournal
Radon Transformation is generally used to construct an optical image (like a CT image) from projection data in biomedical imaging. In this paper, the concept of Radon Transformation is implemented to reconstruct an Electrical Impedance Tomographic image (conductivity or resistivity distribution) of a circular subject. A parallel resistance model of a subject is proposed for Electrical Impedance Tomography (EIT) or Magnetic Induction Tomography (MIT). A circular subject with embedded circular objects is segmented into equal-width slices from different angles. For each angle, the conductance and conductivity of each slice are calculated and stored in an array. A back projection method is used to generate a two-dimensional image from one-dimensional projections. As the back projection method, Inverse Radon Transformation is applied to the calculated conductance and conductivity to reconstruct two-dimensional images. These images are compared to the target image. During image reconstruction, different filters are used, and the resulting images are compared with each other and with the target image.
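The projection / back-projection idea behind this approach can be shown on a toy conductivity grid: project along rows and columns, then smear (back-project) the two 1-D projections into a 2-D reconstruction. A real EIT pipeline uses many angles and a filtered inverse Radon transform; this two-angle, unfiltered sketch only illustrates the principle, and the grid values are hypothetical:

```python
# Toy 4x4 conductivity grid with an embedded object in the centre.
grid = [
    [0, 0, 0, 0],
    [0, 5, 5, 0],
    [0, 5, 5, 0],
    [0, 0, 0, 0],
]
n = len(grid)

# Two 1-D projections: sums along rows (0 degrees) and columns (90 degrees).
row_proj = [sum(row) for row in grid]
col_proj = [sum(grid[r][c] for r in range(n)) for c in range(n)]

# Unfiltered back projection: each pixel receives the average of the
# projections passing through it.
recon = [[(row_proj[r] + col_proj[c]) / (2 * n) for c in range(n)]
         for r in range(n)]
```

Even with only two angles, the reconstruction is brightest where the embedded object sits; adding angles and a filter sharpens it toward the target image.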
Online Paper Submission - 6th International Conference on Machine Learning & ...ijistjournal
6th International Conference on Machine Learning & Applications (CMLA 2024) will provide an excellent international forum for sharing knowledge and results in the theory, methodology and applications of Machine Learning.
Authors are solicited to contribute to the conference by submitting articles that illustrate research results, projects, surveying works and industrial experiences describing significant advances in, but not limited to, the following areas.
BER Performance of MPSK and MQAM in 2x2 Alamouti MIMO Systemsijistjournal
Alamouti published the error performance of the 2x2 space-time transmit diversity scheme using BPSK. One of the key techniques employed for correcting such errors is Quadrature Amplitude Modulation (QAM), because of its efficiency in power and bandwidth. In this paper we explore the error performance of the 2x2 MIMO system using the Alamouti space-time codes for higher-order PSK and M-ary QAM. MATLAB was used to simulate the system, assuming a slow-fading Rayleigh channel and additive white Gaussian noise. The simulated performance curves were compared and evaluated against theoretical curves obtained using the BER tool in MATLAB by setting parameters for the random generators. The results show that the techniques used do find a place in correcting error rates of QAM systems with higher-order modulation schemes. The model can be used not only for adaptive modulation criteria but also as a platform to design other modulation systems.
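The theoretical reference curves such simulations are checked against can be sketched for the simplest case. This is the BPSK bit-error rate over a plain AWGN channel; the Rayleigh-faded Alamouti curves in the paper are different (and lie above this bound):

```python
import math

def ber_bpsk_awgn(ebn0_db):
    """Theoretical BPSK BER over AWGN: 0.5 * erfc(sqrt(Eb/N0))."""
    ebn0 = 10 ** (ebn0_db / 10)     # dB -> linear
    return 0.5 * math.erfc(math.sqrt(ebn0))

# BER at Eb/N0 = 0, 2, ..., 10 dB; should fall monotonically with SNR.
curve = [ber_bpsk_awgn(db) for db in range(0, 11, 2)]
```

Analogous closed-form approximations exist for M-PSK and Gray-coded M-QAM, which is how the simulated curves for higher-order schemes are validated.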
BRAIN TUMOR MRI IMAGE CLASSIFICATION WITH FEATURE SELECTION AND EXTRACTION USI...ijistjournal
Feature extraction is a method of capturing the visual content of an image: the process of representing a raw image in reduced form to facilitate decision making such as pattern classification. We have tried to address the problem of classifying MRI brain images by creating a robust and more accurate classifier which can act as an expert assistant to medical practitioners. The objective of this paper is to present a novel method of feature selection and extraction. This approach combines intensity, texture, and shape-based features and classifies the tumor as white matter, gray matter, CSF, abnormal, or normal area. The experiment is performed on 140 tumor-containing brain MR images from the Internet Brain Segmentation Repository. The proposed technique has been carried out over a larger database compared to previous work and is more robust and effective. PCA and Linear Discriminant Analysis (LDA) were applied on the training sets. The Support Vector Machine (SVM) classifier served as a comparison of nonlinear techniques versus linear ones. The PCA and LDA methods are used to reduce the number of features used. Feature selection using the proposed technique is more beneficial as it analyses the data according to the grouping class variable and gives a reduced feature set with high classification accuracy.
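The feature-reduction step can be illustrated with the simplest possible stand-in: rank features by variance and keep the top k. The paper itself uses PCA and LDA, which additionally decorrelate and use class labels; the sample data below is hypothetical:

```python
def top_k_features(samples, k):
    """Indices of the k highest-variance features, highest first."""
    n, d = len(samples), len(samples[0])
    variances = []
    for j in range(d):
        col = [s[j] for s in samples]
        mean = sum(col) / n
        variances.append((sum((x - mean) ** 2 for x in col) / n, j))
    return [j for _, j in sorted(variances, reverse=True)[:k]]

# Hypothetical samples: feature 0 varies a lot, feature 2 not at all.
samples = [[1.0, 5.0, 0.1],
           [2.0, 5.1, 0.1],
           [3.0, 4.9, 0.1],
           [4.0, 5.0, 0.1]]
```

The reduced feature vectors would then be fed to the SVM classifier; PCA/LDA improve on this by projecting onto combinations of features rather than discarding whole columns.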
A MEDIAN BASED DIRECTIONAL CASCADED WITH MASK FILTER FOR REMOVAL OF RVINijistjournal
In this paper, a Median Based Directional Cascaded with Mask (MBDCM) filter is proposed, based on three cascaded filtering windows of different sizes. The differences between the current pixel and its neighbors aligned with the four main directions are considered for impulse detection. A direction index is used for each edge aligned with a given direction. The minimum of these four direction indexes is used for impulse detection under each masking window. Depending on the minimum direction indexes among the three windows, a new value to substitute the noisy pixel is calculated. Extensive simulations showed that the MBDCM filter performs well in suppressing impulses from both gray-level and colored benchmark images corrupted with low noise levels as well as with highly dense impulses. The MBDCM filter gives better results than the MDWCMM filter in suppressing impulses from highly corrupted digital images.
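The direction-based impulse detection described above can be sketched for a single 3x3 window: take the minimum of the absolute differences between the centre pixel and its neighbours along the four main directions; a large minimum suggests an impulse, which is replaced by the window median. The threshold is a hypothetical value, and the full MBDCM filter cascades three window sizes:

```python
from statistics import median

def filter_pixel(win, threshold=50):
    """Return the centre of a 3x3 window, median-replaced if it is an impulse."""
    c = win[1][1]
    directions = [
        abs(win[0][1] - c) + abs(win[2][1] - c),  # vertical
        abs(win[1][0] - c) + abs(win[1][2] - c),  # horizontal
        abs(win[0][0] - c) + abs(win[2][2] - c),  # main diagonal
        abs(win[0][2] - c) + abs(win[2][0] - c),  # anti-diagonal
    ]
    # If even the best-aligned direction differs strongly, flag an impulse.
    if min(directions) > threshold:
        return median(v for row in win for v in row)
    return c
```

Taking the minimum over directions is what preserves edges: a pixel sitting on a genuine edge agrees with its neighbours along at least one direction and is left untouched.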
DECEPTION AND RACISM IN THE TUSKEGEE SYPHILIS STUDYijistjournal
During the twentieth century (1932-1972), white physicians representing the United States government
conducted a human experiment known as the Tuskegee Syphilis Study on black syphilis patients in Macon
County, Alabama. The creators of the study, who supported the idea of black inferiority and the concept
that black people’s bodies functioned differently from white people’s, observed the effects of a disease
called syphilis on untreated black patients in order to collect data for further research on syphilis. Black
individuals involved with the study believed that they were receiving treatment, although in truth,
treatments for syphilis were purposely held back from them. Not only this, but fluids from their bodies, such
as blood and spinal fluid, were extracted to serve as research material without their awareness of the
purpose of the collection. The physicians justified their approach by positioning it as mere observation,
asserting that they were not actively intervening with the patients participating in the experiment. However,
despite their claims of passivity, these white physicians engaged in various morally improper actions,
including deceit, which ultimately resulted in the deaths of numerous black patients who might have had a
chance at survival.
A NOVEL APPROACH FOR SEGMENTATION OF SECTOR SCAN SONAR IMAGES USING ADAPTIVE ...ijistjournal
SAR and SAS images are perturbed by a multiplicative noise called speckle, due to the coherent nature of the scattering phenomenon. If the background of an image is uneven, a fixed thresholding technique is not suitable for segmenting the image, and an adaptive thresholding method is needed. In this paper a new adaptive thresholding method is proposed to reduce the speckle noise while preserving the structural features and textural information of Sector Scan SONAR (Sound Navigation and Ranging) images. Due to the massive proliferation of SONAR images, the proposed method is very appealing in underwater applications. In fact, it is a pre-treatment required in any SONAR image analysis system. The results obtained from the proposed method were compared quantitatively and qualitatively with the results obtained from other speckle reduction techniques, demonstrating its higher performance for speckle reduction in SONAR images.
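The core idea that adaptive thresholding copes with an uneven background can be sketched as local-mean thresholding: each pixel is compared against the mean of its neighbourhood rather than one global value. The window size, offset and image below are hypothetical, not the paper's method:

```python
def adaptive_threshold(img, window=1, offset=0):
    """Binarize: pixel -> 1 if it exceeds its local neighbourhood mean + offset."""
    n, m = len(img), len(img[0])
    out = [[0] * m for _ in range(n)]
    for r in range(n):
        for c in range(m):
            patch = [img[i][j]
                     for i in range(max(0, r - window), min(n, r + window + 1))
                     for j in range(max(0, c - window), min(m, c + window + 1))]
            local_mean = sum(patch) / len(patch)
            out[r][c] = 1 if img[r][c] > local_mean + offset else 0
    return out

# A bright target on a background that brightens from left to right:
img = [
    [10, 20, 30, 40],
    [10, 90, 30, 40],
    [10, 20, 30, 40],
]
mask = adaptive_threshold(img, offset=10)
```

A single global threshold would either flag the bright right-hand background or miss the target; the local mean tracks the gradient, so only the embedded object survives.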
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
DevOps and Testing slides at DASA ConnectKari Kakkonen
Slides by me and Rik Marselis from the DASA Connect conference on 30 May 2024. We discuss what testing is, then what agile testing is, and finally what testing in DevOps looks like. We also held a lovely workshop with the participants, trying to find different ways to think about quality and testing in different parts of the DevOps infinity loop.
Dev Dives: Train smarter, not harder – active learning and UiPath LLMs for do...UiPathCommunity
💥 Speed, accuracy, and scaling – discover the superpowers of GenAI in action with UiPath Document Understanding and Communications Mining™:
See how to accelerate model training and optimize model performance with active learning
Learn about the latest enhancements to out-of-the-box document processing – with little to no training required
Get an exclusive demo of the new family of UiPath LLMs – GenAI models specialized for processing different types of documents and messages
This is a hands-on session specifically designed for automation developers and AI enthusiasts seeking to enhance their knowledge in leveraging the latest intelligent document processing capabilities offered by UiPath.
Speakers:
👨🏫 Andras Palfi, Senior Product Manager, UiPath
👩🏫 Lenka Dulovicova, Product Program Manager, UiPath
Slack (or Teams) Automation for Bonterra Impact Management (fka Social Soluti...Jeffrey Haguewood
Sidekick Solutions uses Bonterra Impact Management (fka Social Solutions Apricot) and automation solutions to integrate data for business workflows.
International Journal of Information Sciences and Techniques (IJIST) Vol.4, No.3, May 2014
DOI : 10.5121/ijist.2014.4301
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA
Chien D C Ta1 and Tuoi Phan Thi1
1Faculty of Computer Science and Engineering, HoChiMinh City University of Technology
Abstract
Ontologies have been applied in many applications in recent years, especially the Semantic Web, Information Retrieval, Information Extraction, and Question Answering. The purpose of a domain-specific ontology is to eliminate conceptual and terminological confusion. It accomplishes this by specifying a set of generic concepts that characterizes the domain, together with their definitions and interrelationships. This paper describes algorithms for identifying semantic relations and constructing an Information Technology Ontology while extracting concepts and objects from different sources. The ontology is constructed from three main resources: the ACM classification, Wikipedia, and unstructured files from the ACM Digital Library. Our algorithms combine Natural Language Processing and Machine Learning. We use Natural Language Processing tools, such as OpenNLP and the Stanford Lexical Dependency Parser, to explore sentences. We then extract sentences based on English patterns in order to build a training set. We use a random sample among the 245 ACM categories to evaluate our results. The generated results show that our system yields superior performance.
Keywords
Domain ontology, Knowledge based system, Semantic Relation, Information Extraction.
1. Introduction
Methods of human-computer interaction on the Internet, such as Google Search, Information Extraction, and Question Answering, have become important tools in modern society. End users employ these systems for various purposes, such as querying information, learning, e-commerce, etc. However, the precision of these systems remains a major issue that research must address. In order to achieve high precision, the systems should consider the semantics of sentences. Previous research has been done in the realm of semantic relation detection, but it remains challenging to this day. V. Malase et al. [1] use lexical-syntactic patterns to detect semantic relations between the main terms of a definition in order to help terminologists build structured terminology following these relations. With the dramatic increase of data on the Internet, identifying semantic relations plays an important role in semantic-oriented applications.

Our goal is to automatically identify some of the semantic relations that might be found in domain-specific corpora. Here, we describe methods for identifying semantic relations. For this purpose, we combine Natural Language Processing and Machine Learning. We also define English patterns and semantic roles to identify semantic relations in 2000 text files from the ACM Digital Library. We then propose a method to identify synonyms, hyponyms, and hypernyms of instances in a domain-specific ontology. Finally, we use three measures, Precision, Recall and F-Measure, to evaluate these methods. The evaluation results shown afterward demonstrate the effectiveness of the proposed methodology.
2. Related Work
Information extraction is an important research topic in Natural Language Processing (NLP) [2]. It tries to find semantic relations and relevant information in large collections of text documents and on the World Wide Web. Y. Jie et al. [3] focused on semantic rules to build an extraction system from LIDAR (Light Detection and Ranging) data. F. Gomez et al. [4] built a semantic interpreter to assign meaning to the grammatical relations of sentences when constructing a knowledge base about a given topic. K. Kongkachandra et al. [5] proposed semantic-based key-phrase recovery for domain-independent key-phrase extraction. In this method, a key-phrase recovery function is added as a post-process to conventional key-phrase extractors in order to reconsider failed key phrases by semantic matching based on sentence meaning. Z. Goudong et al. [6] proposed a novel tree kernel-based method with rich syntactic and semantic information for the extraction of semantic relations between named entities. A. B. Abacha et al. [7] built the MeTAE platform (Medical Texts Annotation and Exploration). This system allows the extraction and annotation of medical entities and relationships from medical text, relying on linguistic patterns to detect semantic relations in medical text files. A. D. S. Jayatilaka et al. [8] constructed ontologies from Web pages, introducing web usage patterns as a novel source of semantics in ontology learning; the proposed methodology combines web content mining with web usage mining in the knowledge extraction process. H. Li et al. [9] extract semantic relations between Chinese named entities based on semantic features and the Vector Space Model (VSM).

These research attempts aim to identify semantic relations from web documents or text files in order to develop ontologies. They use natural language processing, statistical, or machine learning approaches in the ontology learning process. Our research combines Natural Language Processing and Machine Learning to identify semantic relations on unstructured data.
3. Algorithms for Identifying Semantic Relations
3.1 Information Technology Ontology (ITO)
A domain ontology includes concepts, attributes and events. An ontology is a tuple O = (C, I, R, T, V, ≤, ⊥, ∈, =) [10], where:
C is the set of concepts; I is the set of individuals, including attributes and events.
R is the set of relations; T is the set of data types.
V is the set of values (C, I, R, T, V being pairwise disjoint).
≤ is a relation on (C×C) ∪ (R×R) ∪ (T×T) called specialization.
⊥ is a relation on (C×C) ∪ (R×R) ∪ (T×T) called exclusion.
∈ is a relation over (I×C) ∪ (V×T) called instantiation.
= is a relation over I×P×(I∪V) called assignment.
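The tuple definition above can be mirrored in a minimal data structure. This is an illustrative sketch only; the class name, field names and example entries below are ours, not the paper's:

```python
from dataclasses import dataclass, field

# Illustrative container for the tuple O = (C, I, R, T, V, <=, _|_, in, =).
# The four relations are stored as sets of pairs/triples, mirroring the
# definition above.
@dataclass
class Ontology:
    concepts: set = field(default_factory=set)        # C
    individuals: set = field(default_factory=set)     # I
    relations: set = field(default_factory=set)       # R
    datatypes: set = field(default_factory=set)       # T
    values: set = field(default_factory=set)          # V
    specialization: set = field(default_factory=set)  # (sub, super) pairs
    exclusion: set = field(default_factory=set)       # pairs of disjoint elements
    instantiation: set = field(default_factory=set)   # (individual, concept) pairs
    assignment: set = field(default_factory=set)      # (individual, property, value)

# Hypothetical example entries.
o = Ontology()
o.concepts.update({"Memory", "RAM"})
o.specialization.add(("RAM", "Memory"))   # RAM <= Memory
o.individuals.add("DDR4")
o.instantiation.add(("DDR4", "RAM"))      # DDR4 is an instance of RAM
print(("RAM", "Memory") in o.specialization)  # True
```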
We propose an ontology structure, as shown in Figure 1.
Figure 1. Structure of the Information Technology Ontology

There are four layers in this ontology:
The Topic layer includes 245 categories in the Information Technology area from the ACM classification [11].
The Ingredient layer includes many instances in the Information Technology area from different resources, such as Wikipedia and unstructured text files.
The Synset layer is a set of synonyms, hyponyms, and hypernyms of instances from the ingredient layer. We describe it in detail in the next section.
The last layer is the Sentence layer. We introduce it in detail in the next section.
3.2 Algorithm for Identifying Synonyms, Hyponyms and Hypernyms Relations
The Synset layer in this ontology includes synonyms, hyponyms and hypernyms of instances be-
longing to an ingredient layer. In order to find a set of synonyms, hyponyms and hypernyms of
instances from ingredient layer, we use WordNet. Similarly to Wikipedia, WordNet is an ontolo-
gy that includes many fields and languages. However, we will only focus on English language.
Our proposed algorithm is as follows.
Procedure Find_out_Synset()
  While (instance of Ingredient layer is not null)
  Begin
    Synset_List = QueryIntoWordNet(instance)
    If (Synset_List is not null)
      Link(instance to its synonyms, hyponyms and hypernyms)
    End If
  End While
End Procedure
Table 1 presents the results of applying the algorithm to some instances.

Table 1. Sets of synonyms, hyponyms and hypernyms corresponding to instances

Instance (Ingredient layer)   Synonyms                      Hyponyms                   Hypernyms
NLP                           Natural Language Processing   -                          Informatics, information processing
Data structure                -                             Hierarchical structure     Organization, system
Computer Network              -                             Internet, intranet, WAN    Electronic network
RAM                           Random Access Memory          Core memory                Volatile storage
From Table 1, we can identify semantic relations between an instance of the Ingredient layer and its synonyms, hyponyms and hypernyms, such as:
- NLP is Natural Language Processing
- NLP such as informatics, information processing
- Hierarchical structure includes Data structure
- Data structure such as organization, system
- RAM is Random Access Memory
- Core memory includes RAM
3.3 Algorithm for Identifying Semantic Relations Based on Syntax Patterns and Linguistics

To enrich the sentence layer of the ontology, we use English syntax patterns to extract sentences related to instances of the ingredient layer. This process is implemented during the extraction of instances. For this purpose, we use the OpenNLP tool to recognize sentences. Then, we use the Stanford Lexical Dependency Parser (SLDP) to refine them. SLDP can output a typed dependency tree representing the grammatical relationships between keywords in a sentence. After that, we apply linguistic markers to the sentences to recognize their semantic meaning. Our proposed model is shown in Figure 2.
Figure 2. Identifying Semantic Relations based on Patterns and Linguistics

Our training corpus (2000 papers) for testing focuses on the field of Information Technology. The electronic documents were automatically collected from the ACM Digital Library. We designed eleven English patterns to select sentences identified by OpenNLP. The English patterns are shown in Table 2.
Table 2. English patterns used to extract sentences

No   Pattern                                    Example
1    S + V                                      Computer is broken
2    S + V + Object                             He was mowing the lawn
3    S + V + Adjective                          The girl was tall
4    S + V + Indirect Object                    The woman went to the house
5    S + V + Direct Object                      The man hit the ball
6    S + V + Complement                         Some students in the class are engineers
7    S + V + Prepositional Phrase               The cat waited for its owner yesterday
8    S + V + Indirect Object + Direct Object    Granny left Gary all of her money
9    S + V + Direct Object + Adjective          The jury found the defendant guilty
10   S + V + Direct Object + NP                 The jury found the defendant guilty
11   S + V + Direct Object + Complement         The class picked Susie class representative.
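The pattern-based selection step can be approximated with simple lexico-syntactic matchers. The sketch below is our own simplification: it stands in for only two pattern shapes ("X is a Y" and "X such as Y") using regular expressions, whereas the paper's eleven patterns operate on OpenNLP output and full parses:

```python
import re

# Hypothetical, simplified stand-ins for two lexico-syntactic patterns.
PATTERNS = [
    ("IS-A", re.compile(r"^(?P<x>[\w ]+?) is (?:an? )?(?P<y>[\w ]+)\.?$", re.I)),
    ("SUCH-AS", re.compile(r"^(?P<x>[\w ]+?) such as (?P<y>[\w ,]+)\.?$", re.I)),
]

def match_sentence(sentence):
    """Return (pattern name, subject, object) for the first matching pattern, else None."""
    for name, rx in PATTERNS:
        m = rx.match(sentence.strip())
        if m:
            return name, m.group("x").strip(), m.group("y").strip()
    return None

print(match_sentence("RAM is random access memory"))
# → ('IS-A', 'RAM', 'random access memory')
print(match_sentence("NLP such as informatics, information processing"))
# → ('SUCH-AS', 'NLP', 'informatics, information processing')
```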
After extracting sentences based on the English patterns, we refine them with SLDP, eliminating the unnecessary words in each sentence. Below are some examples of such sentences.
Table 3. Examples of eliminating unnecessary words

Before applying SLDP                                                          After applying SLDP
COBOL is not popular programming language in recent years.                    COBOL is not popular programming language.
Oracle database is one of the Relational Database Management System.          Oracle is Relational Database Management System.
In my opinion Java Language is difficult to program                           Java Language is difficult to program
C Sharp in Microsoft Dot Net is also an Object Oriented programming language  C Sharp is Object Oriented programming language
In our ontology, most of the concepts have semantic relations with each other. In order to identify those semantic relations, we rely on linguistic roles (linguistic markers), which are defined as follows:

IS-A: this generic-specific relation reflects hierarchical inheritance in a network of concepts. All entities are categorized as instances of a particular class, and classes can themselves become instances of a particular class. Thus, any concept can be linked to its immediate superordinate concept. For example, Random Access Memory (concept) is a core memory (concept) in a computer.

PART-OF: this relation also reflects the hierarchical structure of the domain. It directly refers to the parts of each concept in a sentence. For example, Random Access Memory (RAM) (concept) is part of memory (concept).

MADE-OF: this relation links concepts to the material concepts they are made of. For example, an Integrated Circuit (IC) chip can be made of a semiconductor material.

DELIMITED-BY: this relation marks the boundaries dividing one concept from another. It is a domain-specific relation, mainly for concepts that belong to different topics in the field of Information Technology. This relation is usually represented by a number of verbs, such as include, delimit, limit, circumscribe, restrict, etc. For example, the processing of a computer is restricted by the CPU and RAM.

TAKES-PLACE-IN: this relation describes the context of processes related to spatial and temporal dimensions. A number of verbs represent this relation, such as happen, occur, take place in, etc. For example, in order to tackle process conflicts, time scheduling takes place in the Operating System.

ATTRIBUTE-OF: this relation is only used for concepts designated by specialized adjectives, such as strong, powerful, etc., or by nouns that define the properties of other concepts. For example, these router devices are powerful and useful in a network.

RESULT-OF: this relation is relevant to processes or entities that are derived from other processes. For example, as a result of the inconsistency, this file is considered corrupted.

AFFECTS: this relation, along with RESULT-OF, is a crucial semantic relation in the knowledge base, since both can relate all kinds of concepts in the ontology.
CAUSES: this relation is directly relevant to processes or concepts that are derived from other processes. In contrast to RESULT-OF, this relation usually carries a negative meaning.

These linguistic roles help us identify semantic relations between keywords in a sentence. Therefore, we can recognize sentence meaning and categorize sentences exactly. The semantic relations are represented as shown in Figure 3.
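A marker-driven labelling step of this kind can be sketched as a lookup from marker verbs to relation labels. The verb lists below are illustrative (drawn from the role definitions above), and the matching is naive substring search, whereas the paper applies markers to parsed sentences:

```python
# Hypothetical marker-to-relation lookup; checked in insertion order.
RELATION_MARKERS = {
    "DELIMITED-BY": ["include", "delimit", "limit", "circumscribe", "restrict"],
    "TAKES-PLACE-IN": ["happen", "occur", "place in"],
    "MADE-OF": ["made of"],
    "RESULT-OF": ["result of"],
}

def label_relation(sentence):
    """Return the first relation whose marker appears in the sentence, else None."""
    s = sentence.lower()
    for relation, markers in RELATION_MARKERS.items():
        if any(m in s for m in markers):
            return relation
    return None

print(label_relation("The processing of a computer is restricted by CPU, RAM"))
# → DELIMITED-BY
print(label_relation("Time scheduling takes place in the Operating System"))
# → TAKES-PLACE-IN
```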
Figure 3. Semantic relations in Information Technology Ontology
4. Experimental Results
Over 2000 text documents from the ACM Digital Library were used for testing purposes. These documents belonged to different topics within the field of Information Technology. In order to process the documents easily, we grouped them by category before extracting instances and concepts. In our paper, we chose random sample corpora among the 245 ACM categories.

The system's performance was calculated using three measures: Precision, Recall and F-Measure. They are calculated for each category in the domain-specific ontology as below:
Precision(Ci) = correct(Ci) / (correct(Ci) + bad(Ci))

Recall(Ci) = correct(Ci) / (correct(Ci) + missing(Ci))

F-Measure = (2 × Precision × Recall) / (Precision + Recall)

where Ci represents a category in the ITO, and correct, bad and missing represent the numbers of correct, wrong and missing results, respectively.
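The three measures can be computed directly from the per-category counts; the counts in the example call below are hypothetical:

```python
def precision_recall_f(correct, bad, missing):
    """Compute Precision, Recall and F-Measure from correct/bad/missing counts."""
    precision = correct / (correct + bad)
    recall = correct / (correct + missing)
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

p, r, f = precision_recall_f(correct=90, bad=10, missing=30)
print(round(p, 2), round(r, 2), round(f, 2))  # 0.9 0.75 0.82
```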
Experimental results are shown in Tables 4, 5, 6, 7, and 8.
Table 4. Experimental results on instances of the ingredient layer

Category                            Quantity of instances   Precision (%)   Recall (%)   F-Measure (%)
Application (AP_inst)               3672                    79.26           76.51        77.86
Artificial Intelligence (AI_inst)   5714                    82.94           78.92        80.88
Logic Design (LD_inst)              4644                    82.18           80.06        81.11
Operating System (OS_inst)          6785                    84.47           81.37        82.89
Process Management (PM_inst)        3056                    76.53           72.51        74.47
Software (Soft_inst)                4249                    81.64           79.62        80.62
Table 5. Experimental results on the set of synonyms

Category                            Quantity of synonyms    Precision (%)   Recall (%)   F-Measure (%)
Application (AP_syno)               524                     79.26           76.51        77.86
Artificial Intelligence (AI_syno)   689                     94.41           88.15        91.17
Logic Design (LD_syno)              472                     92.24           84.27        88.08
Operating System (OS_syno)          861                     96.18           91.58        93.82
Process Management (PM_syno)        517                     93.25           86.16        89.56
Software (SW_syno)                  583                     94.26           89.04        91.57
Table 6. Experimental results on the set of hyponyms

Category                            Quantity of hyponyms    Precision (%)   Recall (%)   F-Measure (%)
Application (AP_hypo)               714                     89.38           76.51        82.45
Artificial Intelligence (AI_hypo)   837                     96.14           88.29        92.04
Logic Design (LD_hypo)              718                     87.54           84.26        85.86
Operating System (OS_hypo)          972                     96.82           91.42        94.04
Process Management (PM_hypo)        728                     88.31           85.15        86.70
Software (SW_hypo)                  646                     85.64           81.04        83.28
Table 7. Experimental results on the set of hypernyms

Category                            Quantity of hypernyms   Precision (%)   Recall (%)   F-Measure (%)
Application (AP_hype)               916                     79.26           76.51        77.86
Artificial Intelligence (AI_hype)   1321                    92.41           91.17        91.79
Logic Design (LD_hype)              954                     84.62           79.37        81.91
Operating System (OS_hype)          1413                    95.04           96.81        95.92
Process Management (PM_hype)        834                     82.31           84.55        83.41
Software (SW_hype)                  893                     85.48           80.19        82.75
Table 8. Experimental results on the set of sentences

Category                            Quantity of sentences   Precision (%)   Recall (%)   F-Measure (%)
Application (AP_sent)               683                     83.26           80.75        81.98
Artificial Intelligence (AI_sent)   972                     87.93           85.64        86.77
Logic Design (LD_sent)              589                     81.52           79.18        80.33
Operating System (OS_sent)          1026                    92.41           88.32        90.32
Process Management (PM_sent)        647                     82.83           79.17        80.96
Software (SW_sent)                  762                     86.74           82.14        84.38
Figure 4. Experimental result evaluation of instances, synonyms, hyponyms and hypernyms based on Precision, Recall and F-Measure.
5. Conclusions
Our experiments identified semantic relations between nouns or noun phrases in unstructured files based on linguistic roles, in order to build a domain ontology for Information Technology. After using the OpenNLP tool to recognize words and sentences, we applied the English patterns we defined to extract them. We then refined them using the SLDP tool. As a result, our experimental results have high precision and high recall. In order to identify semantic relations, we applied linguistic roles; the semantic roles link directly to instances in the layers of the domain ontology. We also proposed an algorithm to identify semantic relations based on the synonyms, hyponyms and hypernyms of instances of the ingredient layer. Overall scores are computed using three measures: Precision, Recall and F-Measure. Efforts must also be invested in reducing the overall processing time of the system.

In future work, we will focus on building an Information Extraction system based on this ontology to solve a number of problems, such as ontology learning and Question Answering.
REFERENCES
[1] V. Malase et al., "Detecting semantic relations between terms in definitions," in The 3rd International Workshop on Computational Terminology - CompuTerm 2004, 2004.
[2] G. Zhou et al., "Tree Kernel-Based Semantic Relation Extraction with Rich Syntactic and Semantic Information," Information Sciences, vol. 180, no. 8, pp. 1313-1325, 2010.
[3] Y. Jie et al., "Building Extraction from LIDAR based Semantic Analysis," Geo-Spatial Information Science, vol. 9, no. 4, Sep. 2006.
[4] F. Gomez et al., "Semantic interpretation and knowledge extraction," Knowledge-Based Systems, vol. 20, no. 1, pp. 51-60, July 2006.
[5] K. Kongkachandra et al., "Abductive Reasoning for Keyword Recovering in Semantic-based Keyword Extraction," in The Fifth International Conference on Information Technology: New Generations - IEEE, 2008, pp. 714-719.
[6] Z. Goudong et al., "Tree kernel-based semantic relation extraction with rich syntactic and semantic information," Information Sciences, vol. 180, no. 8, pp. 1313-1325, Dec. 2009.
[7] A. B. Abacha et al., "Automatic Extraction of Semantic Relations between Medical Entities: a rule-based approach," Journal of Biomedical Semantics, 2011.
[8] A. D. S. Jayatilaka, "Knowledge Extraction for Semantic Web Using Web Mining," in The International Conference on Advances in ICT for Emerging Regions (ICTer 2011) - IEEE, 2011, pp. 89-94.
[9] H. Li et al., "A Relation Extraction Method of Chinese Named Entities based on Location and Semantic Features," Applied Intelligence, vol. 18, no. 1, pp. 1-14, May 2012.
[10] J. Euzenat et al., Ontology Matching. Springer, 2007.
[11] ACM. [Online]. http://www.acm.org/about/class/ccs98-html.
AUTHORS
Tuoi Phan Thi, Professor: She is a Professor at the Faculty of Computer Science and Engineering, HCMC University of Technology, Vietnam. She obtained her Ph.D. in Computer Science from Charles University, Czech Republic, in 1985. Her research interests are compilers, information retrieval, and natural language processing. She has been the Chief Investigator of national key projects and has published many papers in international journals and conference proceedings in those areas.

Chien D C Ta, Mr.: He is a Ph.D. student at the Faculty of Computer Science and Engineering, HCMC University of Technology, Vietnam. His research interests include Natural Language Processing and its applications, and Information Extraction.