The document describes a new bottom-up method called FCA-MERGE for merging ontologies. It extracts instances from documents for each ontology to generate formal contexts. It then merges the contexts and computes a concept lattice using techniques from Formal Concept Analysis. This lattice provides a structural description of the merging process. The final merged ontology is then generated from the lattice with human guidance. FCA-MERGE circumvents the problem of finding instances classified in both ontologies by extracting instances from relevant documents.
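The Formal Concept Analysis step at the heart of FCA-MERGE can be illustrated with a toy sketch (the context below is invented, not the paper's corpus): every subset of objects is closed into a formal concept, a pair (extent, intent), and the set of all such pairs forms the concept lattice.

```python
from itertools import combinations

def derive_attrs(objs, context):
    """Attributes shared by every object in objs (all attributes if objs is empty)."""
    sets = [context[o] for o in objs]
    if not sets:
        return {a for attrs in context.values() for a in attrs}
    return set.intersection(*sets)

def formal_concepts(context):
    """Enumerate the formal concepts (extent, intent) of a small context by
    closing every subset of objects. Exponential, but fine for toy data."""
    objects = list(context)
    concepts = set()
    for r in range(len(objects) + 1):
        for objs in combinations(objects, r):
            intent = derive_attrs(objs, context)
            # The extent is every object carrying the whole intent.
            extent = frozenset(o for o in objects if intent <= context[o])
            concepts.add((extent, frozenset(intent)))
    return concepts

# Toy merged context: instances (documents) x attributes from two ontologies
context = {
    "doc1": {"Hotel", "Accommodation"},
    "doc2": {"Hotel", "Accommodation", "Luxury"},
    "doc3": {"Campsite", "Accommodation"},
}
lattice = formal_concepts(context)
```

Real FCA implementations use the NextClosure algorithm instead of enumerating subsets, but the closure idea is the same.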
The document compares two sets of operators - Michalski's knowledge transmutations from machine learning and Baker's dialogue transformations that model knowledge negotiation in dialogue. It finds some overlap between the operators but also key differences related to the number of agents involved, how strategical aspects are represented, and the relationship between uttered and internal knowledge. The authors discuss how fusing these operator sets could help model collaborative learning and develop human-machine learning systems.
SEMANTIC INTEGRATION FOR AUTOMATIC ONTOLOGY MAPPING (cscpconf)
In the last decade, ontologies have played a key role as a technology for information sharing and agent interoperability in different application domains. In the semantic web domain, ontologies are used to face the great challenge of representing the semantics of data, in order to bring the current web to its full power and, hence, achieve its objective. However, using ontologies as common and shared vocabularies requires a certain degree of interoperability between them. To meet this requirement, mapping ontologies is a solution that cannot be avoided. Indeed, ontology mapping builds a meta layer that allows different applications and information systems to access and share their information, after resolving the different forms of syntactic, semantic, and lexical mismatch. In the contribution presented in this paper, we have integrated the semantic aspect, based on an external lexical resource, WordNet, to design a new algorithm for fully automatic ontology mapping. This fully automatic character is the main difference between our contribution and most existing semi-automatic ontology mapping algorithms, such as Chimaera, Prompt, Onion, and Glue. To enhance the performance of our algorithm, the mapping discovery stage is based on the combination of two sub-modules: the former analyzes the concepts' names and the latter analyzes their properties. Each of these two sub-modules is itself based on a combination of lexical and semantic similarity measures.
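A sketch of the two-sub-module idea follows. The weights, the bigram-based lexical measure, and the tiny synonym table (standing in for a WordNet lookup) are all illustrative assumptions, not the paper's actual algorithm.

```python
def lexical_sim(a, b):
    """Dice coefficient over character bigrams: a simple lexical measure."""
    grams = lambda s: {s[i:i + 2] for i in range(len(s) - 1)}
    ga, gb = grams(a.lower()), grams(b.lower())
    return 2 * len(ga & gb) / (len(ga) + len(gb)) if ga and gb else 0.0

# Hypothetical stand-in for a WordNet synonym lookup.
SYNONYMS = {"car": {"automobile", "auto"}, "person": {"human", "individual"}}

def semantic_sim(a, b):
    a, b = a.lower(), b.lower()
    if a == b or b in SYNONYMS.get(a, ()) or a in SYNONYMS.get(b, ()):
        return 1.0
    return 0.0

def concept_sim(name1, props1, name2, props2, w_name=0.6, w_prop=0.4):
    """Combine a name sub-module and a property sub-module, each taking the
    best of a lexical and a semantic similarity score."""
    name_score = max(lexical_sim(name1, name2), semantic_sim(name1, name2))
    if props1 and props2:
        prop_score = sum(
            max(max(lexical_sim(p, q), semantic_sim(p, q)) for q in props2)
            for p in props1) / len(props1)
    else:
        prop_score = 0.0
    return w_name * name_score + w_prop * prop_score

score = concept_sim("Car", ["color", "speed"], "Automobile", ["colour", "speed"])
```

Here "Car" and "Automobile" fail lexically but match semantically, while "color"/"colour" match lexically but not via the synonym table, showing why combining both measures helps.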
Conceptual Interoperability and Biomedical Data (Jim McCusker)
The goals of conceptual interoperability are:
Make similar but distinct data resources available for search, conversion, and inter-mapping in a way that mirrors human understanding of the data being searched.
Make data resources that use cross-cutting models (HL7-RIM, provenance models, etc.) interoperable with domain-specific models without explicit mappings between them.
1) Artificial Intelligence research draws from many disciplines including formal logic, probability theory, linguistics and philosophy. Computational logic combines and improves upon traditional logic and decision theory.
2) The paper argues that the abductive logic programming (ALP) agent model is a powerful model of both descriptive and normative thinking. It includes production systems and is compatible with classical logic and decision theory.
3) The ALP agent model treats beliefs as describing the world and goals as describing how the world should be. Its semantics aim to generate actions and assumptions to make goals and observations true based on beliefs.
The document introduces the Word-Sensibility Model as a way to represent commonsense knowledge for AI. It consists of several components, including quadranyms, micro-topics, and an ecological perspective. Quadranyms are four-part constructs that represent virtual units of orientation and constraint. Micro-topics help organize lexical information and abstract human contextual expectations. The model takes an ecological view of representing dynamic relationships between an agent's internal responses and external occurrences across different contextual levels.
ASSESSING SIMILARITY BETWEEN ONTOLOGIES: THE CASE OF THE CONCEPTUAL SIMILARITY (IJwest)
In ontology engineering, there are many cases where assessing similarity between ontologies is required, such as alignment activities, ontology evolution, and ontology comparison. This paper presents a new method for assessing similarity between concepts of ontologies. The method is based on set theory, edges, and feature similarity. We first determine the set of concepts that is shared by two ontologies and the sets of concepts in which they differ. Then, we evaluate the average similarity value for each set by using edge-based semantic similarity. Finally, we compute similarity between ontologies by using the average values of each set together with a feature-based similarity measure.
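One way the combination could look is sketched below, with a tiny taxonomy as the shared hierarchy. The edge-counting formula, the Jaccard feature score, and the equal weighting are assumptions chosen for illustration, not the paper's exact measures.

```python
def edge_sim(c1, c2, parent):
    """Edge-counting similarity: path length through the lowest common
    ancestor, scaled to (0, 1]. Assumes both concepts share a root."""
    anc = {}
    n, d = c1, 0
    while True:
        anc[n] = d
        if n not in parent:
            break
        n, d = parent[n], d + 1
    n, d2 = c2, 0
    while n not in anc:
        n, d2 = parent[n], d2 + 1
    return 1.0 / (1.0 + anc[n] + d2)

def ontology_sim(c1, c2, parent, alpha=0.5):
    """Blend a feature-based (Jaccard) score over the concept sets with the
    average edge-based score over the pairs of non-shared concepts."""
    shared, union = c1 & c2, c1 | c2
    feature = len(shared) / len(union) if union else 0.0
    diff = [(a, b) for a in c1 - c2 for b in c2 - c1]
    edge = (sum(edge_sim(a, b, parent) for a, b in diff) / len(diff)
            if diff else 1.0)
    return alpha * feature + (1 - alpha) * edge

taxonomy = {"Car": "Vehicle", "Truck": "Vehicle", "Vehicle": "Thing"}
```

For example, `ontology_sim({"Car", "Vehicle"}, {"Truck", "Vehicle"}, taxonomy)` credits the shared concept `Vehicle` through the feature term and the sibling pair `Car`/`Truck` through the edge term.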
Presentation of the Marcu 2000 ACL paper "The rhetorical parsing of unrestricted texts: A surface-based approach" for the Discourse Parsing and Language Technology seminar.
The spread and abundance of electronic documents require automatic techniques for extracting useful information from the text they contain. The availability of conceptual taxonomies can be of great help, but manually building them is a complex and costly task. Building on previous work, we propose a technique to automatically extract conceptual graphs from text and reason with them. Since automated learning of taxonomies needs to be robust with respect to missing or partial knowledge and flexible with respect to noise, this work proposes a way to deal with these problems. The case of poor data/sparse concepts is tackled by finding generalizations among disjoint pieces of knowledge. Noise is handled by introducing soft rather than hard relationships among concepts and by applying a probabilistic inferential setting. In particular, we propose to reason on the extracted graph using different kinds of relationships among concepts, where each arc/relationship is associated with a number that represents its likelihood among all possible worlds, and to face the problem of sparse knowledge by using generalizations among distant concepts as bridges between disjoint portions of knowledge.
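Reasoning over such a graph of weighted arcs can be sketched as a most-probable-chain search: if each arc's likelihood is treated as independent, the probability of a relation chain is the product of its arc likelihoods, and a Dijkstra search over negative log-probabilities finds the best chain. The toy graph below is an invented example, not the paper's data.

```python
import heapq
import math

# Hypothetical extracted graph: (source, relation, target) -> likelihood
arcs = {
    ("dog", "is_a", "mammal"): 0.9,
    ("mammal", "is_a", "animal"): 0.95,
    ("dog", "part_of", "pack"): 0.6,
}

def best_chain_prob(graph, start, goal):
    """Probability of the most probable relation chain from start to goal.
    Dijkstra on -log(p) turns 'maximize product' into 'minimize sum'."""
    out = {}
    for (s, _, t), p in graph.items():
        out.setdefault(s, []).append((t, -math.log(p)))
    dist, heap = {start: 0.0}, [(0.0, start)]
    while heap:
        d, n = heapq.heappop(heap)
        if n == goal:
            return math.exp(-d)
        if d > dist.get(n, math.inf):
            continue  # stale queue entry
        for t, w in out.get(n, []):
            if d + w < dist.get(t, math.inf):
                dist[t] = d + w
                heapq.heappush(heap, (d + w, t))
    return 0.0

p = best_chain_prob(arcs, "dog", "animal")  # 0.9 * 0.95
```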
Concept hierarchy is the backbone of ontology, and concept hierarchy acquisition has been a hot topic in the field of ontology learning. This paper proposes a hyponymy extraction method for domain ontology concepts based on cascaded conditional random fields (CCRFs) and hierarchical clustering. It takes free text as the extraction object and adopts CCRFs to identify the domain concepts. First, the low layer of the CCRFs is used to identify simple domain concepts; then the results are sent to the high layer, in which nested concepts are recognized. Next, we adopt hierarchical clustering to identify the hyponymy relations between domain ontology concepts. The experimental results demonstrate that the proposed method is effective.
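The clustering stage can be sketched with a toy agglomerative procedure: concepts merged earlier sit deeper in the resulting nesting, which is the structure used to propose hyponymy candidates. The vectors, average linkage, and data below are illustrative assumptions, not the paper's setup.

```python
def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = sum(a * a for a in u) ** 0.5
    nv = sum(b * b for b in v) ** 0.5
    return dot / (nu * nv) if nu and nv else 0.0

def agglomerate(vectors):
    """Average-linkage agglomerative clustering; returns nested tuples.
    Each merge joins the two most similar clusters and averages their vectors."""
    clusters = [(name, vec) for name, vec in vectors.items()]
    while len(clusters) > 1:
        best, pair = -1.0, None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                s = cosine(clusters[i][1], clusters[j][1])
                if s > best:
                    best, pair = s, (i, j)
        i, j = pair
        (na, va), (nb, vb) = clusters[i], clusters[j]
        merged = ((na, nb), [(a + b) / 2 for a, b in zip(va, vb)])
        clusters = [c for k, c in enumerate(clusters) if k not in (i, j)]
        clusters.append(merged)
    return clusters[0][0]

# Toy context-feature vectors for domain terms (hypothetical)
vecs = {"oak": [1, 0.9, 0], "pine": [0.9, 1, 0], "tree": [1, 1, 0.2],
        "rock": [0, 0.1, 1]}
hierarchy = agglomerate(vecs)
```

Here "oak" and "pine" merge first and only then join "tree", suggesting them as hyponym candidates of the broader term.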
This document discusses the computation of presuppositions and entailments from natural language text. It begins by defining presuppositions and entailments, and explaining how they can be computed using tree transformations on semantic representations. The paper then provides examples of elementary presuppositions and entailments. It describes a system that computes presuppositions and entailments while parsing sentences using an augmented transition network. The system applies tree transformations specified in the lexicon to the semantic representation to derive inferences. The paper concludes that presuppositions and entailments exhibit computational properties not shown by the general class of inferences, such as being tied to the semantic and syntactic structure of language.
An Entity-Driven Recursive Neural Network Model for Chinese Discourse Coheren... (ijaia)
Chinese discourse coherence modeling remains a challenging task in the Natural Language Processing field. Existing approaches mostly focus on feature engineering, adopting sophisticated features to capture the logical, syntactic, or semantic relationships across sentences within a text. In this paper, we present an entity-driven recursive deep model for Chinese discourse coherence evaluation, based on a current English discourse coherence neural network model. Specifically, to overcome that model's shortcomings in identifying entity (noun) overlap across sentences, our combined model incorporates entity information into the recursive neural network framework. Evaluation results on both sentence ordering and machine translation coherence rating tasks show the effectiveness of the proposed model, which significantly outperforms a strong existing baseline.
Ekaw ontology learning for cost effective large-scale semantic annotation (Shahab Mokarizadeh)
This document discusses using ontology learning to semantically annotate a corpus of 15,000 web service interfaces. It proposes extracting terms from the interfaces at a fine-grained level and using pattern-based methods to discover taxonomic and non-taxonomic relations to automatically generate an ontology. The method achieved 62% accuracy for common concepts and 71% for common instances compared to a golden ontology.
Introduction to Distributional Semantics (Andre Freitas)
This document provides an introduction to distributional semantics. It discusses how distributional semantic models (DSMs) represent word meanings as vectors based on their linguistic contexts in large corpora. This distributional hypothesis states that words that appear in similar contexts tend to have similar meanings. The document outlines how DSMs are built, important parameters like context type and weighting, and examples like latent semantic analysis. It also discusses how DSMs can support applications like semantic search. Finally, it introduces how compositional semantics explores representing the meanings of phrases and sentences compositionally based on the meanings of their parts.
Using construction grammar in conversational systems (CJ Jenkins)
This thesis explored using construction grammar and ontologies in conversational systems. The author built two early experimental systems using these techniques. Construction grammar represents language as constructions pairing form and meaning. Ontologies allow for more explicit semantics compared to databases. The author developed a stemmer called UEA-Lite and a system called KIA that incorporated construction grammar, ontologies, and machine learning to understand and respond to natural language.
Lecture 2: From Semantics To Semantic-Oriented Applications (Marina Santini)
From the "Natural Language Processing" LinkedIn group:
John Kontos, Professor of Artificial Intelligence
"I wonder whether translating into formal logic is nothing more than transliteration, which simply isolates the part of the text that can be reasoned upon using the simple inference mechanism of formal logic. The real problem, I think, lies with the part of text that CANNOT be translated, on the one hand, and the part that changes its meaning due to advances of civilization, on the other. My own proposal is to leave NL text alone and try building inference mechanisms for the UNTRANSLATED text depending on the task requirements.
All the best
John"
Representation of ontology by Classified Interrelated object model (Mihika Shah)
1. The document discusses representing ontology using the Classified Interrelated Object Model (CIOM) data modeling technique. CIOM represents ontology components like classes, subclasses, attributes, and relationships between classes.
2. Key components of an ontology like classes, subclasses, attributes, and inter-class relationships are described and examples are given of how each would be represented using CIOM notation.
3. CIOM provides a general purpose methodology for representing ontologies using existing database technologies and overcomes limitations of specialized ontology languages and tools.
Classical logic has a serious limitation in that it cannot cope with the issues of vagueness and uncertainty into which fall most modes of human reasoning. In order to provide a foundation for human knowledge representation and reasoning in the presence of vagueness, imprecision, and uncertainty, fuzzy logic should have the ability to deal with linguistic hedges, which play a very important role in the modification of fuzzy predicates. In this paper, we extend fuzzy logic in the narrow sense with graded syntax, introduced by Novák et al., with many hedge connectives. In one case, no hedge has a dual; in the other case, each hedge can have its own dual. The resulting logics are shown to also have Pavelka-style completeness.
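For intuition on what linguistic hedges do, Zadeh's classical power-function model is a useful baseline (the abstract's graded-syntax logic is a more general, symbolic treatment; the functions below are only the simple truth-functional version):

```python
def very(mu):
    """Concentration hedge: sharpens a fuzzy predicate (Zadeh's classic model)."""
    return mu ** 2

def more_or_less(mu):
    """Dilation hedge, the dual of `very` in this simple model."""
    return mu ** 0.5

tall = 0.64                         # membership degree of "tall" for some height
very_tall = very(tall)              # concentrated: harder to satisfy
roughly_tall = more_or_less(tall)   # dilated: easier to satisfy
```

Note how `very` and `more_or_less` form a dual pair here, matching the abstract's second case where each hedge has its own dual.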
The document discusses analogical reasoning and case-based reasoning. It provides an overview of research in these areas including structure mapping theory, models of analogical processing like SME and MAC/FAC, and case-based reasoning systems. It proposes an analogy ontology to integrate analogical processing and first-principles reasoning by providing a formal representation of analogy concepts and results.
Ontology Matching Based on hypernym, hyponym, holonym, and meronym Sets in Wo... (dannyijwest)
Considerable research in the field of ontology matching has been performed, as information sharing and reuse become necessary in ontology development. Measurement of lexical similarity in ontology matching is commonly performed using synsets, defined in WordNet. In this paper, we define a Super Word Set, which is an aggregate set that includes the hypernym, hyponym, holonym, and meronym sets in WordNet. The Super Word Set Similarity is calculated as the rate of inclusion of a concept name's words and its synset's words in the Super Word Set. In order to measure Super Word Set Similarity, we first extract Matched Concepts (MC), Matched Properties (MP), and Property Unmatched Concepts (PUC) from the result of ontology matching. We compared our measure against two ontology matching tools, COMA++ and LOM. The Super Word Set Similarity shows an average improvement of 12% over COMA++ and 19% over LOM.
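The aggregate-set idea can be sketched as follows. The relation tables below are tiny hypothetical stand-ins for WordNet lookups, and the inclusion-rate formula is a simplified reading of the measure described above.

```python
# Hypothetical stand-ins for WordNet relation lookups.
HYPERNYMS = {"car": {"vehicle", "machine"}}
HYPONYMS  = {"car": {"cab", "coupe"}}
HOLONYMS  = {"car": {"train"}}
MERONYMS  = {"car": {"wheel", "engine", "door"}}
SYNSET    = {"car": {"auto", "automobile"}}

def super_word_set(word):
    """Aggregate the hypernym, hyponym, holonym, and meronym sets."""
    return (HYPERNYMS.get(word, set()) | HYPONYMS.get(word, set())
            | HOLONYMS.get(word, set()) | MERONYMS.get(word, set()))

def sws_similarity(name, other):
    """Rate of inclusion of a concept name and its synset words in the
    other concept's Super Word Set."""
    words = {name.lower()} | SYNSET.get(name.lower(), set())
    sws = super_word_set(other.lower())
    return len(words & sws) / len(words) if words else 0.0
```

For instance, `sws_similarity("cab", "car")` scores high because "cab" appears in the hyponym set of "car", a relation plain synset comparison would miss.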
This document proposes ORL, an extension of OWL with Horn clause rules. ORL aims to overcome some expressive limitations of OWL, especially regarding properties, while maintaining compatibility with OWL's syntax and semantics. Rules are added as a new type of axiom and are given a formal abstract syntax and model-theoretic semantics as an extension of OWL DL. The addition of rules makes ontology consistency undecidable but provides greater expressive power for modeling relationships between properties. Examples are given and extensions to OWL's XML and RDF syntaxes are discussed to accommodate the new rule constructs.
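The kind of Horn rule ORL adds (here the classic uncle example) can be evaluated by a naive forward chainer. The sketch below uses an ad-hoc tuple encoding of atoms, not ORL's actual abstract syntax, and ignores OWL semantics entirely; it only shows the bottom-up rule evaluation idea.

```python
def match(atom, fact, binding):
    """Unify a rule atom against a ground fact under a partial binding.
    Terms starting with '?' are variables; returns the extended binding or None."""
    if atom[0] != fact[0] or len(atom) != len(fact):
        return None
    b = dict(binding)
    for term, val in zip(atom[1:], fact[1:]):
        if term.startswith("?"):
            if b.get(term, val) != val:
                return None
            b[term] = val
        elif term != val:
            return None
    return b

def forward_chain(facts, rules):
    """Naive bottom-up evaluation of Horn rules to a fixpoint."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        for body, head in rules:
            bindings = [{}]
            for atom in body:
                bindings = [b2 for b in bindings for f in facts
                            if (b2 := match(atom, f, b)) is not None]
            for b in bindings:
                new = tuple(b.get(t, t) for t in head)
                if new not in facts:
                    facts.add(new)
                    changed = True
    return facts

# hasParent(x, y) AND hasBrother(y, z) -> hasUncle(x, z)
rules = [([("hasParent", "?x", "?y"), ("hasBrother", "?y", "?z")],
          ("hasUncle", "?x", "?z"))]
facts = {("hasParent", "mary", "john"), ("hasBrother", "john", "bill")}
closure = forward_chain(facts, rules)
```

This property-chaining pattern is exactly what OWL DL alone cannot express, which is the expressive gap the rules close.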
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C... (Waqas Tariq)
A "sentence pattern" in modern Natural Language Processing is often considered as a contiguous string of words (n-grams). However, in many branches of linguistics, like Pragmatics or Corpus Linguistics, it has been noticed that simple n-gram patterns are not sufficient to reveal the whole sophistication of grammar patterns. We present a language-independent architecture for extracting from sentences more sophisticated patterns than n-grams. In this architecture a "sentence pattern" is considered as an n-element ordered combination of sentence elements. Experiments showed that the method extracts significantly more frequent patterns than the usual n-gram approach.
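The difference between contiguous n-grams and n-element ordered combinations can be shown directly (toy sentence; the real architecture adds frequency filtering on top of this enumeration):

```python
from itertools import combinations

def ngrams(tokens, n):
    """Contiguous n-grams: the usual baseline."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def combination_patterns(tokens, n):
    """All n-element ordered combinations of sentence elements, so patterns
    may skip words ('gapped' patterns that n-grams cannot capture)."""
    return [tuple(tokens[i] for i in idx)
            for idx in combinations(range(len(tokens)), n)]

sent = "the more you read the more you learn".split()
```

On this sentence the gapped pattern `("more", "more")` is found only by the combination approach, illustrating why it surfaces more frequent patterns than plain n-grams.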
ONTOLOGICAL MODEL FOR CHARACTER RECOGNITION BASED ON SPATIAL RELATIONS (sipij)
In this paper, we present a set of spatial relations between concepts describing an ontological model for a new character recognition process. Our main idea is based on the construction of a domain ontology modelling the Latin script. This ontology is composed of a set of concepts and a set of relations. The concepts represent the graphemes extracted by segmenting the manipulated document, and the relations are of two types: is-a relations and spatial relations. In this paper we are interested in the description of the second type of relations and their implementation in Java code.
Taxonomy extraction from automotive natural language requirements using unsup... (ijnlc)
In this paper we present a novel approach to semi-automatically learn concept hierarchies from natural
language requirements of the automotive industry. The approach is based on the distributional hypothesis
and the special characteristics of domain-specific German compounds. We extract taxonomies by using
clustering techniques in combination with general thesauri. Such a taxonomy can be used to support
requirements engineering in early stages by providing a common system understanding and an agreed-upon
terminology. This work is part of an ontology-driven requirements engineering process, which builds
on top of the taxonomy. Evaluation shows that this taxonomy extraction approach outperforms common
hierarchical clustering techniques.
This paper introduces approaches to combining logic, probability, and learning. It discusses past attempts to solve probabilistic logic learning and overviews different formalisms for defining probabilities on logical views. It also describes approaches that combine probabilistic reasoning and logical representation, such as Bayesian logic programs and probabilistic relational models. Learning probabilistic logics involves adapting probabilistic models based on data, including tasks of parameter estimation and structure learning. The paper provides an integrated survey of various concepts in this area.
A Semi-Automatic Ontology Extension Method for Semantic Web Services (IDES Editor)
This paper provides a novel semi-automatic ontology extension method for Semantic Web Services (SWS). This is significant since the ontology extension methods existing in the literature mostly deal with semantic description of static Web resources such as text documents. Hence, there is a need for methods that can serve dynamic Web resources such as SWS. The method developed in this paper avoids redundancy and respects consistency so as to assure the high quality of the resulting shared ontologies.
Distributional semantics is a research area that uses statistical analysis of linguistic contexts to develop theories and methods for determining the semantic similarities between words and linguistic items based on their distributional properties in large text corpora. It is based on the distributional hypothesis that words with similar distributions have similar meanings. Distributional semantic models represent words as vectors in a high-dimensional semantic space based on their co-occurrence with other words, allowing semantic similarity to be measured using vector similarity methods. Common distributional semantic models include term frequency-inverse document frequency (tf-idf), latent semantic analysis (LSA), latent Dirichlet allocation (LDA), and word embeddings.
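The core pipeline described above, counting co-occurrences within a context window and comparing the resulting word vectors by cosine similarity, can be sketched on a toy corpus (the sentences and window size are illustrative choices):

```python
from collections import defaultdict
import math

def cooccurrence_vectors(sentences, window=2):
    """Count each word's co-occurrences with neighbours within `window`
    positions; the rows of the resulting matrix are the word vectors."""
    vecs = defaultdict(lambda: defaultdict(int))
    for sent in sentences:
        toks = sent.lower().split()
        for i, w in enumerate(toks):
            for j in range(max(0, i - window), min(len(toks), i + window + 1)):
                if i != j:
                    vecs[w][toks[j]] += 1
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse (dict) vectors."""
    dot = sum(u.get(k, 0) * v.get(k, 0) for k in set(u) | set(v))
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

corpus = ["the cat drinks milk", "the dog drinks water",
          "a cat chases a mouse", "a dog chases a cat"]
vecs = cooccurrence_vectors(corpus)
```

Even on four sentences, "cat" and "dog" end up with more similar vectors than "cat" and "milk", which is the distributional hypothesis in miniature; tf-idf or PPMI weighting and dimensionality reduction (LSA) refine the same matrix.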
1) The document discusses a system called MaLTe (Machine Learning from Text) that aims to extract knowledge from technical expository texts using both natural language processing and machine learning techniques.
2) MaLTe will process texts containing narratives and examples, and output a representation of the knowledge in the form of Horn clauses. Some user interaction will be required during the translation process.
3) The document outlines several challenges in applying machine learning and natural language processing to knowledge extraction from real-world texts, including their logical structure and examples. It provides an example from a tax guide to illustrate these challenges.
The poet Olavo Bilac was asked by a friend to write an announcement advertising the sale of his small farm. Bilac wrote a poetic description highlighting the beauty of the property. Later, when Bilac asked if the farm had sold, his friend said he had changed his mind after reading Bilac's writing and realizing what a treasure he had. The friend was reminded not to underestimate the good things we have and to appreciate life's blessings like family, health, and friends rather than chasing false treasures.
Concept hierarchy is the backbone of ontology, and the concept hierarchy acquisition has been a hot topic in the field of ontology learning. this paper proposes a hyponymy extraction method of domain ontology concept based on cascaded conditional random field(CCRFs) and hierarchy clustering. It takes free text as extracting object, adopts CCRFs identifying the domain concepts. First the low layer of CCRFs is used to identify simple domain concept, then the results are sent to the high layer, in which the nesting concepts are recognized. Next we adopt hierarchy clustering to identify the hyponymy relation between domain ontology concepts. The experimental results demonstrate the proposed method is efficient.
This document discusses the computation of presuppositions and entailments from natural language text. It begins by defining presuppositions and entailments, and explaining how they can be computed using tree transformations on semantic representations. The paper then provides examples of elementary presuppositions and entailments. It describes a system that computes presuppositions and entailments while parsing sentences using an augmented transition network. The system applies tree transformations specified in the lexicon to the semantic representation to derive inferences. The paper concludes that presuppositions and entailments exhibit computational properties not shown by the general class of inferences, such as being tied to the semantic and syntactic structure of language.
An Entity-Driven Recursive Neural Network Model for Chinese Discourse Coheren...ijaia
Chinese discourse coherence modeling remains a challenge taskin Natural Language Processing
field.Existing approaches mostlyfocus on the need for feature engineering, whichadoptthe sophisticated
features to capture the logic or syntactic or semantic relationships acrosssentences within a text.In this
paper, we present an entity-drivenrecursive deep modelfor the Chinese discourse coherence evaluation
based on current English discourse coherenceneural network model. Specifically, to overcome the
shortage of identifying the entity(nouns) overlap across sentences in the currentmodel, Our combined
modelsuccessfully investigatesthe entities information into the recursive neural network
freamework.Evaluation results on both sentence ordering and machine translation coherence rating
task show the effectiveness of the proposed model, which significantly outperforms the existing strong
baseline.
Ekaw ontology learning for cost effective large-scale semantic annotationShahab Mokarizadeh
This document discusses using ontology learning to semantically annotate a corpus of 15,000 web service interfaces. It proposes extracting terms from the interfaces at a fine-grained level and using pattern-based methods to discover taxonomic and non-taxonomic relations to automatically generate an ontology. The method achieved 62% accuracy for common concepts and 71% for common instances compared to a golden ontology.
Introduction to Distributional SemanticsAndre Freitas
This document provides an introduction to distributional semantics. It discusses how distributional semantic models (DSMs) represent word meanings as vectors based on their linguistic contexts in large corpora. This distributional hypothesis states that words that appear in similar contexts tend to have similar meanings. The document outlines how DSMs are built, important parameters like context type and weighting, and examples like latent semantic analysis. It also discusses how DSMs can support applications like semantic search. Finally, it introduces how compositional semantics explores representing the meanings of phrases and sentences compositionally based on the meanings of their parts.
Using construction grammar in conversational systemsCJ Jenkins
This thesis explored using construction grammar and ontologies in conversational systems. The author built two early experimental systems using these techniques. Construction grammar represents language as constructions pairing form and meaning. Ontologies allow for more explicit semantics compared to databases. The author developed a stemmer called UEA-Lite and a system called KIA that incorporated construction grammar, ontologies, and machine learning to understand and respond to natural language.
Lecture 2: From Semantics To Semantic-Oriented ApplicationsMarina Santini
From the "Natural Language Processing" LinkedIn group:
John Kontos, Professor of Artificial Intelligence
I wonder whether translating into formal logic is nothing more than transliteration which simply isolates the part of the text that can be reasoned upon using the simple inference mechanism of formal logic. The real problem I think lies with the part of text that CANNOT be translated one the one hand and the one that changes its meaning due to civilization advances. My own proposal is to leave NL text alone and try building inference mechanisms for the UNTRANSLATED text depending on the task requirements.
All the best
John"
Representation of ontology by Classified Interrelated object modelMihika Shah
1. The document discusses representing ontology using the Classified Interrelated Object Model (CIOM) data modeling technique. CIOM represents ontology components like classes, subclasses, attributes, and relationships between classes.
2. Key components of an ontology like classes, subclasses, attributes, and inter-class relationships are described and examples are given of how each would be represented using CIOM notation.
3. CIOM provides a general purpose methodology for representing ontologies using existing database technologies and overcomes limitations of specialized ontology languages and tools.
Classical logic has a serious limitation in that it cannot cope with the issues of vagueness and uncertainty into which fall most modes of human reasoning. In order to provide a foundation for human knowledge representation and reasoning in the presence of vagueness, imprecision, and uncertainty, fuzzy logic should have the ability to deal with linguistic hedges, which play a very important role in the modification of fuzzy predicates. In this paper, we extend fuzzy logic in the narrow sense with graded syntax, introduced by Novák et al., with many hedge connectives. In one case, each hedge does not have any dual one; in the other case, each hedge can have its own dual. The resulting logics are shown to also have the Pavelka-style completeness.
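The idea of a hedge modifying a fuzzy predicate can be illustrated informally with Zadeh-style truth-functional modifiers: "very" concentrates a truth value and "more or less" dilates it. This is only a minimal sketch of hedges acting on fuzzy truth degrees, not the graded-syntax calculus developed in the paper; the membership degree below is invented.

```python
# Truth-functional hedges in the style of Zadeh: "very" concentrates a fuzzy
# truth value, "more or less" dilates it. An informal illustration of hedges
# modifying fuzzy predicates, not the paper's graded-syntax formalism.

def very(t: float) -> float:
    return t ** 2      # concentration: pushes truth values down

def more_or_less(t: float) -> float:
    return t ** 0.5    # dilation: pulls truth values up

tall = 0.7             # assumed degree to which "John is tall" holds
print(very(tall))          # "John is very tall" holds less strongly
print(more_or_less(tall))  # "John is more or less tall" holds more strongly
```

Duals, as discussed in the paper, would pair each such modifier with a counterpart; the two functions above are not duals of each other in that technical sense.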
The document discusses analogical reasoning and case-based reasoning. It provides an overview of research in these areas including structure mapping theory, models of analogical processing like SME and MAC/FAC, and case-based reasoning systems. It proposes an analogy ontology to integrate analogical processing and first-principles reasoning by providing a formal representation of analogy concepts and results.
Ontology Matching Based on hypernym, hyponym, holonym, and meronym Sets in Wo... - dannyijwest
Considerable research in the field of ontology matching has been performed where information sharing and reuse becomes necessary in ontology development. Measurement of lexical similarity in ontology matching is performed using synsets, defined in WordNet. In this paper, we define a Super Word Set, an aggregate set that includes the hypernym, hyponym, holonym, and meronym sets in WordNet. The Super Word Set Similarity is calculated as the rate at which the words of a concept name and its synset's words are included in the Super Word Set. In order to measure Super Word Set Similarity, we first extracted Matched Concepts (MC), Matched Properties (MP) and Property Unmatched Concepts (PUC) from the result of ontology matching. We compared these against two ontology matching tools, COMA++ and LOM. The Super Word Set Similarity shows an average improvement of 12% over COMA++ and 19% over LOM.
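The Super Word Set construction can be sketched as follows. The miniature "WordNet" below is hand-made illustrative data (not the real lexical database), and the Jaccard-style overlap score is a simplification of the paper's inclusion-rate measure.

```python
# Toy illustration of the Super Word Set idea: union a concept's synset with
# its hypernym, hyponym, holonym, and meronym sets, then score two concepts
# by set overlap. MINI_WORDNET is invented data standing in for WordNet.
MINI_WORDNET = {
    "car": {
        "synset":   {"car", "auto", "automobile"},
        "hypernym": {"motor_vehicle"},
        "hyponym":  {"cab", "coupe"},
        "holonym":  {"traffic"},
        "meronym":  {"wheel", "engine"},
    },
    "automobile": {
        "synset":   {"car", "auto", "automobile"},
        "hypernym": {"motor_vehicle"},
        "hyponym":  {"cab"},
        "holonym":  set(),
        "meronym":  {"engine"},
    },
}

def super_word_set(word):
    """Union of the synset and the four WordNet relation sets."""
    return set().union(*MINI_WORDNET[word].values())

def sws_similarity(a, b):
    """Share of common words between two Super Word Sets (Jaccard-style)."""
    sa, sb = super_word_set(a), super_word_set(b)
    return len(sa & sb) / len(sa | sb)

print(sws_similarity("car", "automobile"))
```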
This document proposes ORL, an extension of OWL with Horn clause rules. ORL aims to overcome some expressive limitations of OWL, especially regarding properties, while maintaining compatibility with OWL's syntax and semantics. Rules are added as a new type of axiom and are given a formal abstract syntax and model-theoretic semantics as an extension of OWL DL. The addition of rules makes ontology consistency undecidable but provides greater expressive power for modeling relationships between properties. Examples are given and extensions to OWL's XML and RDF syntaxes are discussed to accommodate the new rule constructs.
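The kind of rule ORL adds can be seen in a minimal forward-chaining step over binary properties. The rule and all facts below are invented examples; the classic case is a property composition (an "uncle" relation) that OWL's property constructors alone cannot express.

```python
# A minimal forward-chaining step for one Horn rule over binary properties,
# illustrating the property composition that motivates rules on top of OWL.
# All predicate names and facts are invented for the example.
facts = {
    ("hasParent", "alice", "john"),   # john is a parent of alice
    ("hasBrother", "john", "bill"),   # bill is a brother of john
}

def apply_uncle_rule(facts):
    """hasParent(x, y) & hasBrother(y, z) -> hasUncle(x, z)."""
    derived = set(facts)
    for (p, x, y1) in facts:
        for (b, y2, z) in facts:
            if p == "hasParent" and b == "hasBrother" and y1 == y2:
                derived.add(("hasUncle", x, z))
    return derived

closure = apply_uncle_rule(facts)
print(("hasUncle", "alice", "bill") in closure)   # True
```

A real rule engine would iterate such steps to a fixpoint over arbitrary rules; one hard-coded rule suffices to show the expressive gap.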
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C... - Waqas Tariq
A "sentence pattern" in modern Natural Language Processing is often considered as a subsequent string of words (n-grams). However, in many branches of linguistics, like Pragmatics or Corpus Linguistics, it has been noticed that simple n-gram patterns are not sufficient to reveal the whole sophistication of grammar patterns. We present a language independent architecture for extracting from sentences more sophisticated patterns than n-grams. In this architecture a "sentence pattern" is considered as an n-element ordered combination of sentence elements. Experiments showed that the method extracts significantly more frequent patterns than the usual n-gram approach.
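The contrast between contiguous n-grams and n-element ordered combinations can be shown in a few lines; the sentence is an invented example, and this sketch ignores the frequency-counting architecture of the paper.

```python
from itertools import combinations

def ngrams(words, n):
    """Contiguous n-grams: the usual baseline pattern."""
    return [tuple(words[i:i + n]) for i in range(len(words) - n + 1)]

def ordered_combinations(words, n):
    """All n-element ordered (possibly gappy) combinations of elements."""
    return list(combinations(words, n))

sentence = "the cat sat on the mat".split()
print(ngrams(sentence, 3))                     # 4 contiguous patterns
print(len(ordered_combinations(sentence, 3)))  # C(6, 3) = 20 patterns
```

Because `itertools.combinations` preserves the original order while allowing gaps, a pattern like ('the', 'sat', 'mat') is captured even though its elements are not adjacent.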
ONTOLOGICAL MODEL FOR CHARACTER RECOGNITION BASED ON SPATIAL RELATIONS - sipij
In this paper, we present a set of spatial relations between concepts describing an ontological model for a new process of character recognition. Our main idea is based on the construction of a domain ontology modelling the Latin script. This ontology is composed of a set of concepts and a set of relations. The concepts represent the graphemes extracted by segmenting the manipulated document, and the relations are of two types: is-a relations and spatial relations. In this paper we are interested in the description of the second type of relations and their implementation in Java code.
Taxonomy extraction from automotive natural language requirements using unsup... - ijnlc
In this paper we present a novel approach to semi-automatically learn concept hierarchies from natural language requirements of the automotive industry. The approach is based on the distributional hypothesis and the special characteristics of domain-specific German compounds. We extract taxonomies by using clustering techniques in combination with general thesauri. Such a taxonomy can be used to support requirements engineering in early stages by providing a common system understanding and an agreed-upon terminology. This work is part of an ontology-driven requirements engineering process, which builds on top of the taxonomy. Evaluation shows that this taxonomy extraction approach outperforms common hierarchical clustering techniques.
This paper introduces approaches to combining logic, probability, and learning. It discusses past attempts to solve probabilistic logic learning and overviews different formalisms for defining probabilities on logical views. It also describes approaches that combine probabilistic reasoning and logical representation, such as Bayesian logic programs and probabilistic relational models. Learning probabilistic logics involves adapting probabilistic models based on data, including tasks of parameter estimation and structure learning. The paper provides an integrated survey of various concepts in this area.
A Semi-Automatic Ontology Extension Method for Semantic Web Services - IDES Editor
This paper provides a novel semi-automatic ontology extension method for Semantic Web Services (SWS). This is significant since the ontology extension methods existing in the literature mostly deal with the semantic description of static Web resources such as text documents. Hence, there is a need for methods that can serve dynamic Web resources such as SWS. The method developed in this paper avoids redundancy and respects consistency so as to assure the high quality of the resulting shared ontologies.
Distributional semantics is a research area that uses statistical analysis of linguistic contexts to develop theories and methods for determining the semantic similarities between words and linguistic items based on their distributional properties in large text corpora. It is based on the distributional hypothesis that words with similar distributions have similar meanings. Distributional semantic models represent words as vectors in a high-dimensional semantic space based on their co-occurrence with other words, allowing semantic similarity to be measured using vector similarity methods. Common distributional semantic models include term frequency-inverse document frequency (tf-idf), latent semantic analysis (LSA), latent Dirichlet allocation (LDA), and word embeddings.
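The core of a count-based distributional semantic model fits in a few lines: build sparse co-occurrence vectors from a window around each word and compare them with cosine similarity. The corpus below is a tiny invented example; real DSMs use large corpora plus the weighting and dimensionality-reduction steps mentioned above.

```python
import math
from collections import Counter

def cooc_vectors(sentences, window=2):
    """Word vectors from co-occurrence counts within a +/- window context."""
    vecs = {}
    for sent in sentences:
        words = sent.lower().split()
        for i, w in enumerate(words):
            context = words[max(0, i - window):i] + words[i + 1:i + 1 + window]
            vecs.setdefault(w, Counter()).update(context)
    return vecs

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v.get(k, 0) for k in u)
    norm = math.sqrt(sum(x * x for x in u.values()))
    norm *= math.sqrt(sum(x * x for x in v.values()))
    return dot / norm if norm else 0.0

corpus = [                                # invented toy corpus
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog barked at the mailman",
]
v = cooc_vectors(corpus)
print(cosine(v["cat"], v["dog"]) > cosine(v["cat"], v["mailman"]))  # True
```

Even on this toy data the distributional hypothesis shows through: "cat" and "dog" share contexts ("the ... chased") and come out more similar than "cat" and "mailman".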
1) The document discusses a system called MaLTe (Machine Learning from Text) that aims to extract knowledge from technical expository texts using both natural language processing and machine learning techniques.
2) MaLTe will process texts containing narratives and examples, and output a representation of the knowledge in the form of Horn clauses. Some user interaction will be required during the translation process.
3) The document outlines several challenges in applying machine learning and natural language processing to knowledge extraction from real-world texts, including their logical structure and examples. It provides an example from a tax guide to illustrate these challenges.
The poet Olavo Bilac was asked by a friend to write an announcement advertising the sale of his small farm. Bilac wrote a poetic description highlighting the beauty of the property. Later, when Bilac asked if the farm had sold, his friend said he had changed his mind after reading Bilac's writing and realizing what a treasure he had. The friend was reminded not to underestimate the good things we have and to appreciate life's blessings like family, health, and friends rather than chasing false treasures.
This issue is all about Social Media for Business! In other words, how you can make money with social media. With, among other things, fun practical examples, tips, and more...!
The document summarizes ClimateWell's solar-powered indoor climate solution. It describes ClimateWell's proprietary triple-phase absorption heat pump technology, which uses a salt-based energy storage system to provide continuous heating and cooling that is powered by solar energy or waste heat. The technology has been implemented in residential and commercial projects in Spain, providing free heating, cooling, and hot water while reducing CO2 emissions.
The document summarizes key concepts about stars and the universe from an Earth science textbook chapter. It describes the properties and evolution of stars from their formation as protostars through their main sequence and red giant phases. It explains that stars die in different ways depending on their mass, ending as white dwarfs, neutron stars, or black holes. It also summarizes the different types of galaxies like spiral, elliptical, and irregular, as well as concepts about the expanding universe and theories like the Big Bang.
How To Use Open Content In An Interactive Museum Exhibit - Karen Drost
Open is as open does: How To Use Open Content In An Interactive Museum Exhibit
Netherlands Institute of Sound and Vision, Hilversum, The Netherlands
@ Museums and the Web 2016, Los Angeles, USA
Cooperating Techniques for Extracting Conceptual Taxonomies from Text - Fulvio Rotella
The document proposes a mixed approach using existing natural language processing techniques and novel techniques to automatically construct conceptual taxonomies from text. It identifies relevant concepts from text using keyword extraction, clustering, and computing relevance weights. It then generalizes similar concepts using WordNet to group concepts and disambiguate word senses. Preliminary evaluations show promising initial results.
A DOMAIN INDEPENDENT APPROACH FOR ONTOLOGY SEMANTIC ENRICHMENT - cscpconf
Ontology automatic enrichment consists of automatically adding new concepts and/or new relations to an initial ontology built manually using basic domain knowledge. Concretely, enrichment means first extracting concepts and relations from textual sources and then putting them in their right places in the initial ontology. However, the main issue in that process is how to preserve the coherence of the ontology after this operation. For this purpose, we consider the semantic aspect in the enrichment process by using similarity techniques between terms. Contrary to other approaches, our approach is domain independent and the enrichment process is based on a semantic analysis. Another advantage of our approach is that it takes into account the two types of relations, taxonomic and non-taxonomic ones.
ASSESSING SIMILARITY BETWEEN ONTOLOGIES: THE CASE OF THE CONCEPTUAL SIMILARITY - dannyijwest
In ontology engineering, there are many cases where assessing similarity between ontologies is required; this is the case for alignment activities, ontology evolution, ontology similarity, etc. This paper presents a new method for assessing similarity between concepts of ontologies. The method is based on set theory, edges, and feature similarity. We first determine the set of concepts shared by two ontologies and the sets of concepts that differ between them. Then, we evaluate the average value of similarity for each set by using edge-based semantic similarity. Finally, we compute similarity between ontologies by using the average values of each set and a feature-based similarity measure.
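The set-theoretic side of such a method can be sketched with a feature-based (Tversky-style) measure over concept-name sets. The two ontologies below are invented toy data, and this sketch deliberately omits the edge-based similarity applied inside each set.

```python
def tversky(a, b, alpha=0.5, beta=0.5):
    """Feature-based similarity: common vs. distinctive features (Tversky)."""
    common = len(a & b)
    return common / (common + alpha * len(a - b) + beta * len(b - a))

# Concept-name sets of two invented toy ontologies.
onto1 = {"Person", "Student", "Course", "Teacher"}
onto2 = {"Person", "Student", "Lecture", "Professor"}

shared = onto1 & onto2                      # concepts present in both
only1, only2 = onto1 - onto2, onto2 - onto1  # concepts specific to each
print(sorted(shared), sorted(only1), sorted(only2))
print(tversky(onto1, onto2))   # with alpha = beta = 0.5 this equals Dice
```

Setting alpha and beta asymmetrically would let one ontology count as the "variant" of the other, which is sometimes useful when one ontology is an extension of the other.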
HYPONYMY EXTRACTION OF DOMAIN ONTOLOGY CONCEPT BASED ON CCRFS AND HIERARCHY C... - dannyijwest
The document describes a method for extracting hyponymy (hierarchical) relationships between domain concepts in an ontology. It uses Cascaded Conditional Random Fields (CCRFs) to identify concepts in text, then performs hierarchy clustering on the concepts to identify hyponymy relationships. CCRFs use a two-layer approach, with the first layer identifying simple concepts and the second layer identifying nested concepts. Hierarchy clustering represents concepts as vectors based on co-occurrence and calculates similarity to group concepts into a taxonomy. The method aims to automatically construct ontology hierarchies from free text.
SWSN UNIT-3.pptx - gowthamnaidu0986
Ontology engineering involves constructing ontologies through various methods. It begins with defining the scope and evaluating existing ontologies for reuse. Terms are enumerated and organized in a taxonomy with defined properties, facets, and instances. The ontology is checked for anomalies and refined iteratively. Popular tools for ontology development include Protege and WebOnto. Methods like Meth ontology and On-To-Knowledge methodology provide processes for building ontologies from scratch or reusing existing ones. Ontology sharing requires mapping between ontologies to allow interoperability, and libraries exist for storing and accessing ontologies.
This document discusses an integrated approach to ontology development methodology and provides a case study using a shopping mall domain. It begins by reviewing existing ontology development methodologies and identifying their pitfalls. An integrated methodology is then proposed which aims to reduce these pitfalls. The key steps in the proposed methodology are: 1) capturing motivating user scenarios or keywords, 2) generating formal/informal questions and answers from the scenarios, 3) extracting terms and constraints, and 4) building the ontology using a top-down approach. The methodology is applied to developing an ontology for a shopping mall domain to provide multilingual information to visitors.
Tools for Ontology Building from Texts: Analysis and Improvement of the Resul... - IOSR Journals
Text2Onto is a tool that learns ontologies from textual data by extracting ontology components like concepts, relations, instances, and hierarchies. It analyzes texts through linguistic preprocessing using Gate to tokenize, tag parts of speech, and identify noun and verb phrases. Algorithms then extract ontology components and store them probabilistically in a Preliminary Ontology Model independent of any representation language. The study aimed to understand Text2Onto's architecture, analyze errors in its extractions, and attempt improvements by using a meta-model of the text to better classify concepts under core concepts.
Formal treatments of inheritance are rather scarce, and those that do exist are often more suited for analysis of existing systems than as guides to language designers. One problem that adds complexity to previous efforts is the need to pass a reference to the original invoking object throughout the method call tree. In this paper, a novel specification of inheritance semantics is given. The approach dispenses with self-reference, instead using static and dynamic scope to accomplish similar behaviour. The result is a methodology that is simpler than previous specification attempts, easy to understand, and sufficiently expressive. Moreover, an inheritance system based on this approach can be implemented with relatively few lines of code in environment-passing interpreters.
Automatic Annotation Of Historical Paper Documents - Martha Brown
The document describes research on automatically annotating and classifying historical documents. It discusses applying machine learning techniques to label documents in a digital archive of films from the 1920s-1930s. Specifically, it uses a first-order logic learning system called INTHELEX to learn definitions that classify documents and identify semantic labels within documents. Experimental results show it can accurately classify documents into types and identify block labels with predictive accuracies generally over 90%. The system provides a way to automatically structure and annotate a large collection of historical documents.
In this paper we present the SMalL Ontology for malicious software classification, SMalL Java Application for antivirus systems comparison and the SMalL knowledge based file format for malware related attacks. We believe that our ontology is able to aid the development of malware prevention software by offering a common knowledge base and a clear classification of the existing malicious software. The application is a prototype regarding how this ontology might be used in conjunction with known antivirus capabilities to offer a comprehensive comparison.
In this paper we try to correlate text sequences that provide common topics for semantic clues. We propose a two-step method for asynchronous text mining. Step one checks for the common topics in the sequences and isolates them with their timestamps. Step two takes a topic and tries to give the timestamp of the text document. After multiple repetitions of step two, we obtain an optimal result.
IDENTIFYING THE SEMANTIC RELATIONS ON UNSTRUCTURED DATA - ijistjournal
Ontologies have been applied to many applications in recent years, especially the Semantic Web, Information Retrieval, Information Extraction, and Question Answering. The purpose of a domain-specific ontology is to get rid of conceptual and terminological confusion. It accomplishes this by specifying a set of generic concepts that characterizes the domain as well as their definitions and interrelationships. This paper describes algorithms for identifying semantic relations and constructing an Information Technology Ontology, while extracting the concepts and objects from different sources. The Ontology is constructed based on three main resources: ACM, Wikipedia and unstructured files from the ACM Digital Library. Our algorithms combine Natural Language Processing and Machine Learning. We use Natural Language Processing tools, such as OpenNLP and the Stanford Lexical Dependency Parser, in order to explore sentences. We then extract these sentences based on English patterns in order to build a training set. We use a random sample among 245 categories of ACM to evaluate our results. Results generated show that our system yields superior performance.
The document discusses Lin Ma's PhD research on analyzing presuppositions in natural language requirements. Presuppositions are implicit commitments in language that simplify communication but can cause misunderstanding if not made explicit. The research aims to automatically detect presuppositions triggered by definite descriptions in requirements and identify which are not explicitly stated. It will use natural language processing techniques and knowledge sources to classify definite descriptions and analyze how presuppositions project in requirements texts.
The document presents a new ontology matching system based on a multi-agent architecture. The system takes ontologies described in XML, RDF Schema, and OWL as input. It uses multiple matchers and filtering to generate mappings between ontology entities. The mappings are then validated. The system is implemented as a multi-agent system with different agent types responsible for resources, matching, generating mappings, and filtering/validating mappings. The architecture allows for robust, flexible, and scalable ontology matching.
The document discusses stepwise methodologies for building ontologies. It outlines common steps such as identifying the purpose and scope, capturing concepts and relationships, coding the ontology formally, integrating existing ontologies, evaluation, and documentation. It emphasizes starting with a middle-out approach to capture definitions and discusses reaching consensus among those involved in building the ontology. Modularization of ontologies into reusable components is also presented as an important aspect of the methodology.
Sentence compression via clustering of dependency graph nodes - NLP-KE 2012 - Ayman El-Kilany
This paper proposes an unsupervised model for sentence compression based on clustering the nodes of a sentence's dependency graph. The model first clusters related nodes into chunks using the Louvain clustering method. It then merges chunks based on linguistic rules to improve coherence. Candidate compressions are generated by removing chunks, and scored based on language models and word importance to select the best compression. An experiment found the proposed method performed better than a recent supervised technique.
New Quantitative Methodology for Identification of Drug Abuse Based on Featur... - Carrie Wang
This project developed a new quantitative methodology using feature-based context-free grammar to analyze discourse semantics from social media discussions in order to identify potential drug abuse. The methodology was able to parse YouTube comments about recreational cough syrup use and perform anaphora resolution. This computational representation of discourse contributes to understanding human language structure and has applications in public health monitoring and clinical research.
FCA-MERGE: Bottom-Up Merging of Ontologies
Gerd Stumme and Alexander Maedche
Institute AIFB, University of Karlsruhe, 76128 Karlsruhe, Germany
{stumme,maedche}@aifb.uni-karlsruhe.de
http://www.aifb.uni-karlsruhe.de/WBS
Abstract for comparisons, these approaches do not offer a structural
description of the global merging process.
Ontologies have been established for knowledge We propose the new method FCA–M ERGE for merging
sharing and are widely used as a means for con- ontologies following a bottom-up approach which offers a
ceptually structuring domains of interest. With global structural description of the merging process. For both
the growing usage of ontologies, the problem of source ontologies, it extracts instances from a given set set of
overlapping knowledge in a common domain be- domain-specific text documents by applying natural language
comes critical. We propose the new method processing techniques. Based on the extracted instances we
FCA–MERGE for merging ontologies following a bottom-up approach which offers a structural description of the merging process. The method is guided by application-specific instances of the two given source ontologies that are to be merged. We apply techniques from natural language processing and formal concept analysis to derive a lattice of concepts as a structural result of FCA–MERGE. The generated result is then explored and transformed into the merged ontology with human interaction.

1 Introduction

Ontologies have been established for knowledge sharing and are widely used as a means for conceptually structuring domains of interest. With the growing usage of ontologies, the problem of overlapping knowledge in a common domain occurs more often and becomes critical. Domain-specific ontologies are modeled by multiple authors in multiple settings. These ontologies lay the foundation for building new domain-specific ontologies in similar domains by assembling and extending multiple ontologies from repositories.

The process of ontology merging takes as input two source ontologies and returns a merged ontology based on the given source ontologies. Manual ontology merging using conventional editing tools without intelligent support is difficult, labor-intensive, and error-prone. Therefore, several systems and frameworks for supporting the knowledge engineer in the ontology merging task have recently been proposed [Hovy, 1998; Chalupsky, 2000; Noy and Musen, 2000; McGuinness et al., 2000]. These approaches rely on syntactic and semantic matching heuristics which are derived from the behavior of ontology engineers when confronted with the task of merging ontologies, i.e., human behaviour is simulated. Although some of them locally use different kinds of logics, they do not offer a structural description of the global merging process.

Our method extracts instances from a given set of domain-specific text documents and, based on these instances, applies mathematically founded techniques taken from Formal Concept Analysis [Wille, 1982; Ganter and Wille, 1999] to derive a lattice of concepts as a structural result of FCA–MERGE. The produced result is explored and transformed to the merged ontology by the ontology engineer. The extraction of instances from text documents circumvents the problem that in most applications there are no objects which are simultaneously instances of both ontologies, and which could be used as a basis for identifying similar concepts.

The remainder of the paper is as follows. We briefly introduce some basic definitions, concentrating on a formal definition of what an ontology is, and recall the basics of Formal Concept Analysis in Section 2. Before we present our generic method for ontology merging in Section 4, we give an overview of existing and related work in Section 3. Section 5 provides a detailed description of FCA–MERGE. Section 6 summarizes the paper and concludes with an outlook on future work.

2 Ontologies and Formal Concept Analysis

In this section, we briefly introduce some basic definitions. We thereby concentrate on a formal definition of what an ontology is and recall the basics of Formal Concept Analysis.

2.1 Ontologies

There is no common formal definition of what an ontology is. However, most approaches share a few core items: concepts, a hierarchical IS-A relation, and further relations. For the sake of generality, we do not discuss more specific features like constraints, functions, or axioms here. We formalize the core in the following way.

Definition: A (core) ontology is a tuple O := (C, ≤_C, R, σ), where C is a set whose elements are called concepts, ≤_C is a partial order on C (i.e., a binary relation ≤_C ⊆ C × C which is reflexive, transitive, and anti-symmetric), R is a set whose elements are called relation names (or relations for short), and σ: R → C⁺ is a function which assigns to each relation name its arity.
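The core ontology tuple O := (C, ≤_C, R, σ) can be rendered as a small data structure. The following Python sketch is ours, not the paper's; the class name, field names, and the example hierarchy are illustrative only.

```python
from dataclasses import dataclass, field

@dataclass
class CoreOntology:
    """Illustrative sketch of a (core) ontology O := (C, <=_C, R, sigma)."""
    concepts: set                                   # C
    is_a: set = field(default_factory=set)          # <=_C as (sub, super) pairs
    relations: dict = field(default_factory=dict)   # R with sigma: name -> arity

    def subsumes(self, sub, sup):
        """Reflexive-transitive check of the IS-A partial order."""
        if sub == sup:
            return True
        return any((sub, mid) in self.is_a and self.subsumes(mid, sup)
                   for mid in self.concepts)

# A toy tourism-style ontology, loosely modeled on the paper's running example.
o1 = CoreOntology(
    concepts={"Root", "Accommodation", "Hotel", "Vacation", "Event", "Concert"},
    is_a={("Hotel", "Accommodation"), ("Accommodation", "Root"),
          ("Vacation", "Root"), ("Event", "Root"), ("Concert", "Event")},
    relations={"organizesEvent": 2},                # a binary relation name
)
print(o1.subsumes("Hotel", "Root"))                 # True, via transitivity
```

Representing ≤_C as explicit pairs plus a transitive check mirrors the definition directly; a real implementation would precompute the closure.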
As said above, the definition considers only the core elements of most languages for ontology representation. It is possible to map the definition to most types of ontology representation languages. Our implementation, for instance, is based on Frame Logic [Kifer et al., 1995]. Frame Logic has a well-founded semantics, but we do not refer to it in this paper.

2.2 Formal Concept Analysis

We recall the basics of Formal Concept Analysis (FCA) as far as they are needed for this paper. A more extensive overview is given in [Ganter and Wille, 1999]. To allow a mathematical description of concepts as being composed of extensions and intensions, FCA starts with a formal context defined as a triple K := (G, M, I), where G is a set of objects, M is a set of attributes, and I is a binary relation between G and M (i.e., I ⊆ G × M). (g, m) ∈ I is read "object g has attribute m".

Definition: For A ⊆ G, we define A′ := {m ∈ M | (g, m) ∈ I for all g ∈ A} and, for B ⊆ M, we define B′ := {g ∈ G | (g, m) ∈ I for all m ∈ B}.

A formal concept of a formal context (G, M, I) is defined as a pair (A, B) with A ⊆ G, B ⊆ M, A′ = B, and B′ = A. The sets A and B are called the extent and the intent of the formal concept (A, B). The subconcept–superconcept relation is formalized by (A₁, B₁) ≤ (A₂, B₂) :⟺ A₁ ⊆ A₂ (⟺ B₁ ⊇ B₂). The set of all formal concepts of a context K together with this partial order is always a complete lattice (i.e., for each set of formal concepts, there is always a greatest common subconcept and a least common superconcept), called the concept lattice of K and denoted by B(K).

A possible confusion might arise from the double use of the word 'concept' in FCA and in ontologies. This comes from the fact that FCA and ontologies are two models for the concept of 'concept' which arose independently. In order to distinguish both notions, we will always refer to the FCA concepts as 'formal concepts'. The concepts in ontologies are referred to just as 'concepts' or as 'ontology concepts'. There is no direct counterpart of formal concepts in ontologies. Ontology concepts are best compared to FCA attributes, as both can be considered as unary predicates on the set of objects.

3 Related Work

A first approach for supporting the merging of ontologies is described in [Hovy, 1998]. There, several heuristics are described for identifying corresponding concepts in different ontologies, e.g., comparing the names of two concepts, comparing the natural language definitions of two concepts by linguistic techniques, and checking the closeness of two concepts in the concept hierarchy.

The OntoMorph system [Chalupsky, 2000] offers two kinds of mechanisms for translating and merging ontologies: syntactic rewriting supports the translation between two different knowledge representation languages, while semantic rewriting offers means for inference-based transformations. It explicitly allows the preservation of semantics to be violated in trade-off for a more expressive, flexible transformation mechanism.

In [McGuinness et al., 2000] the Chimaera system is described. It provides support for merging ontological terms from different sources, for checking the coverage and correctness of ontologies, and for maintaining ontologies over time. Chimaera supports the merging of ontologies by coalescing two semantically identical terms from different ontologies and by identifying terms that should be related by subsumption or disjointness relationships. Chimaera offers a broad collection of functions, but the underlying assumptions about structural properties of the ontologies at hand are not made explicit.

Prompt [Noy and Musen, 2000] is an algorithm for ontology merging and alignment embedded in Protégé-2000. It starts with the identification of matching class names. Based on this initial step, an iterative approach is carried out for performing automatic updates, finding resulting conflicts, and making suggestions to remove these conflicts.

The tools described above offer extensive merging functionalities, most of them based on syntactic and semantic matching heuristics which are derived from the behaviour of ontology engineers when confronted with the task of merging ontologies. OntoMorph and Chimaera use a description-logics-based approach that influences the merging process locally, e.g., by checking subsumption relationships between terms. None of these approaches offers a structural description of the global merging process. FCA–MERGE can be regarded as complementary to existing work, offering a structural description of the overall merging process with an underlying mathematical framework.

The work closest to our approach is described in [Schmitt and Saake, 1998]. They apply Formal Concept Analysis to a related problem, namely database schema integration. As in our approach, a knowledge engineer has to interpret the results in order to make modeling decisions. Our technique differs in two points: there is no need for knowledge acquisition from a domain expert in the preprocessing phase; and it additionally suggests new concepts and relations for the target ontology.

4 Bottom-Up Ontology Merging

As said above, we propose a bottom-up approach for ontology merging. Our mechanism is based on application-specific instances of the two given ontologies O₁ and O₂ that are to be merged. The overall process of merging two ontologies is depicted in Figure 1 and consists of three steps, namely (i) instance extraction and computation of two formal contexts K₁ and K₂, (ii) the FCA–MERGE core algorithm, which derives a common context and computes a concept lattice, and (iii) the generation of the final merged ontology based on the concept lattice.

Our method takes as input data the two ontologies and a set D of natural language documents.
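As a sketch of the derivation operators defined in Section 2.2, the following few lines compute A′ and B′ and check the formal-concept condition for a toy context. The context itself is invented for illustration; it is not the one from the paper's example.

```python
# Toy formal context K = (G, M, I): documents as objects, ontology
# concepts as attributes (illustrative data, not Figure 2 from the paper).
G = {"doc1", "doc2", "doc3"}
M = {"Hotel", "Event", "Concert"}
I = {("doc1", "Hotel"), ("doc2", "Hotel"),
     ("doc2", "Event"), ("doc3", "Event"), ("doc3", "Concert")}

def prime_objects(A):
    """A' : the attributes shared by every object in A."""
    return {m for m in M if all((g, m) in I for g in A)}

def prime_attributes(B):
    """B' : the objects having every attribute in B."""
    return {g for g in G if all((g, m) in I for m in B)}

# (A, B) is a formal concept iff A' = B and B' = A.  Closing the single
# attribute "Event" yields one:
A = prime_attributes({"Event"})     # extent
B = prime_objects(A)                # intent
print(sorted(A), sorted(B))         # ['doc2', 'doc3'] ['Event']
```

Applying the two operators in succession always yields a closed pair, which is why every attribute m gives rise to the formal concept (m′, m″) used later for pruning.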
Figure 1: Ontology Merging Method (linguistic processing of O₁ and O₂ yields two contexts; FCA-Merge and lattice exploration yield the new ontology)

The documents have to be relevant to both ontologies, so that the documents are described by the concepts contained in the ontologies. The documents may be taken from the target application which requires the final merged ontology. From the documents in D, we extract instances. The mechanism for instance extraction is further described in Subsection 5.1. It returns, for each ontology, a formal context indicating which ontology concepts appear in which documents.

The extraction of the instances from documents is necessary because there are usually no instances which are already classified by both ontologies. However, if this situation is given, one can skip the first step and use the classification of the instances directly as input for the two formal contexts.

The second step of our ontology merging approach comprises the FCA–MERGE core algorithm. The core algorithm merges the two contexts and computes a concept lattice from the merged context using FCA techniques. More precisely, it computes a pruned concept lattice which has the same degree of detail as the two source ontologies. The techniques applied for generating the pruned concept lattice are described in Subsection 5.2 in more detail.

Instance extraction and the FCA–MERGE core algorithm are fully automatic. The final step of deriving the merged ontology from the concept lattice requires human interaction. Based on the pruned concept lattice and the sets of relation names R₁ and R₂, the ontology engineer creates the concepts and relations of the target ontology. We offer graphical means of the ontology engineering environment OntoEdit for supporting this process.

For obtaining good results, a few assumptions have to be met by the input data. Firstly, the documents have to be relevant to each of the source ontologies: a document from which no instance is extracted for either source ontology can be neglected for our task. Secondly, the documents have to cover all concepts from the source ontologies; concepts which are not covered have to be treated manually after our merging procedure (or the set of documents has to be expanded). And last but not least, the documents must separate the concepts well enough. If two concepts which are considered as different always appear in the same documents, FCA–MERGE will map them to the same concept in the target ontology (unless this decision is overruled by the knowledge engineer). When this situation appears too often, the knowledge engineer might want to add more documents which further separate the concepts.

5 The FCA–MERGE Method

In this section, we discuss the three steps of FCA–MERGE in more detail. We illustrate FCA–MERGE with a small example taken from the tourism domain, where we have built several specific ontology-based information systems. Our general experiments are based on tourism ontologies that have been modeled in an ontology engineering seminar. Different ontologies have been modeled for a given text corpus on the web, which is provided by a WWW provider for tourist information (URL: http://www.all-in-all.com). The corpus describes actual objects, like locations, accommodations, furnishings of accommodations, administrative information, and cultural events. For the scenario described here, we have selected two ontologies: the first ontology contains 67 concepts and 31 relations, and the second ontology contains 51 concepts and 22 relations. The underlying text corpus consists of 233 natural language documents taken from the WWW provider described above. For demonstration purposes, we restrict ourselves first to two very small subsets O₁ and O₂ of the two ontologies described above, and to 14 out of the 233 documents. These examples have been translated into English. In Subsection 5.3, we provide some examples from the merging of the larger ontologies.

5.1 Linguistic Analysis and Context Generation

The aim of this first step is to generate, for each ontology O_i (i ∈ {1, 2}), a formal context K_i := (G_i, M_i, I_i). The set of documents is taken as the object set (G_i := D), and the set of concepts is taken as the attribute set (M_i := C_i). While these sets come for free, the difficult step is generating the binary relation I_i. The relation (g, m) ∈ I_i shall hold whenever document g contains an instance of m.

The computation uses linguistic techniques as described in the sequel. We conceive an information-extraction-based approach for ontology-based extraction, which has been implemented on top of SMES (Saarbrücken Message Extraction System), a shallow text processor for German (cf. [Neumann et al., 1997]). The architecture of SMES comprises a tokenizer based on regular expressions, a lexical analysis component including a word and a domain lexicon, and a chunk parser. The tokenizer scans the text in order to identify boundaries of words and complex expressions like "$20.00" or "Mecklenburg–Vorpommern" (a region in the north-east of Germany), and to expand abbreviations.

The lexicon contains more than 120,000 stem entries and more than 12,000 subcategorization frames describing information used for lexical analysis and chunk parsing. Furthermore, the domain-specific part of the lexicon contains lexical entries that express natural language representations of concepts and relations. Lexical entries may refer to several concepts or relations, and one concept or relation may be referred to by several lexical entries.

Lexical analysis uses the lexicon to perform (1) morphological analysis, i.e., the identification of the canonical common stem of a set of related word forms and the analysis of compounds, (2) recognition of named entities, (3) part-of-speech tagging, and (4) retrieval of domain-specific information.
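The context-generation step of Subsection 5.1 can be sketched as follows: per-document concept hits become the incidence relation I₁, with the IS-A hierarchy compiled in by upward closure. The document-to-concept hits below are invented; in FCA–MERGE this information is produced by the SMES processor.

```python
# Invented extraction results: concepts found in each document.
hits = {
    "doc1": {"Hotel"},
    "doc2": {"Concert"},
}
# A toy IS-A hierarchy for ontology O1 (illustrative).
is_a = {("Hotel", "Accommodation"), ("Accommodation", "Root"),
        ("Concert", "Event"), ("Event", "Root")}

def upward_closure(concepts, is_a):
    """Close a set of concepts under IS-A, compiling in transitivity."""
    closed = set(concepts)
    changed = True
    while changed:
        changed = False
        for sub, sup in is_a:
            if sub in closed and sup not in closed:
                closed.add(sup)
                changed = True
    return closed

# The formal context's incidence relation: (g, m) whenever document g
# contains an instance of m or of any subconcept of m.
I1 = {(d, m) for d, ms in hits.items() for m in upward_closure(ms, is_a)}
print(("doc1", "Accommodation") in I1)   # True: a hotel is an accommodation
```

The fixed-point loop is a naive but correct way to realize the transitivity compilation; a production system would precompute the closure of ≤_C once.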
While steps (1), (2), and (3) can be viewed as standard for information extraction approaches, step (4) is of specific interest for our instance extraction mechanism. This step associates single words or complex expressions with a concept from the ontology if a corresponding entry in the domain-specific part of the lexicon exists. For instance, the expression "Hotel Schwarzer Adler" is associated with the concept Hotel. If the concept Hotel is in ontology O₁ and document g contains the expression "Hotel Schwarzer Adler", then the relation (g, Hotel) ∈ I₁ holds.

Finally, the transitivity of the ≤_C relation is compiled into the formal context, i.e., (g, m) ∈ I_i and m ≤_C n implies (g, n) ∈ I_i. This means that if (g, Hotel) ∈ I₁ holds and Hotel ≤_C Accommodation, then the document also describes an instance of the concept Accommodation: (g, Accommodation) ∈ I₁.

Figure 2 depicts the contexts K₁ and K₂ that have been generated from the documents for the small example ontologies. E.g., document doc5 contains instances of the concepts Event, Concert, and Root of ontology O₁, and of Musical and Root of ontology O₂. All other documents contain some information on hotels, as they contain instances of the concept Hotel both in ontology O₁ and in ontology O₂.

Figure 2: The two contexts K₁ and K₂ as result of the first step (incidence tables relating the documents doc1–doc14 to the concepts of O₁ and O₂)

5.2 Generating the Pruned Concept Lattice

The second step takes as input the two formal contexts K₁ and K₂ which were generated in the last step, and returns a pruned concept lattice (see below), which will be used as input in the next step.

First we merge the two formal contexts into a new formal context K, from which we will derive the pruned concept lattice. Before merging the two formal contexts, we have to disambiguate the attribute sets, since M₁ and M₂ may contain the same concepts: let M̃_i := {(m, i) | m ∈ M_i} for i ∈ {1, 2}. The disambiguation of the concepts allows for the possibility that the same concept exists in both ontologies but is treated differently. For instance, a Campground may be considered as an Accommodation in the first ontology, but not in the second one. The merged formal context is then obtained as K := (G, M̃₁ ∪ M̃₂, Ĩ) with (g, (m, i)) ∈ Ĩ :⟺ (g, m) ∈ I_i.

We will not compute the whole concept lattice of K, as it would provide too many, too specific concepts. We restrict the computation to those formal concepts which are above at least one formal concept generated by an (ontology) concept of the source ontologies. This assures that we remain within the range of specificity of the source ontologies. More precisely, the pruned concept lattice is given by B_p(K) := {(A, B) ∈ B(K) | ∃m ∈ M̃₁ ∪ M̃₂: (m′, m″) ≤ (A, B)}. For our example, the pruned concept lattice is shown in Figure 3. It consists of six formal concepts. Two formal concepts of the total concept lattice are pruned since they are too specific compared to the two source ontologies.

Figure 3: The pruned concept lattice (with formal concepts labeled Root_1/Root_2, Vacation_1, Event_1, Hotel_1/Hotel_2/Accommodation_2, and Concert_1/Musical_2)

The computation of the pruned concept lattice is done with the algorithm TITANIC [Stumme et al., 2000]. It is slightly modified to allow the pruning. Compared to other algorithms for computing concept lattices, TITANIC has, for our purpose, the advantage that it computes the formal concepts via their key sets (or minimal generators). A key set is a minimal set of attributes which generates a given formal concept, i.e., K ⊆ M is a key set for the formal concept (A, B) if and only if (K′, K″) = (A, B) and (X′, X″) ≠ (A, B) for all X ⊂ K.

In our application, key sets serve two purposes. Firstly, they indicate whether the generated formal concept gives rise to a new concept in the target ontology or not: a concept is new if and only if it has no key sets of cardinality one. Secondly, the key sets of cardinality two or more can be used as generic names for new concepts, and they indicate the arity of new relations.

5.3 Generating the New Ontology from the Concept Lattice

While the previous steps (instance extraction, context derivation, context merging, and TITANIC) are fully automatic, the derivation of the merged ontology from the concept lattice requires human interaction, since it heavily relies on background knowledge of the domain expert.

The result of the last step is a pruned concept lattice. From it we have to derive the target ontology. Each of the formal concepts of the pruned concept lattice is a candidate for a concept, a relation, or a new subsumption in the target ontology. In the sequel, we describe the strategy which underlies this derivation process. Of course, most of the technical details are hidden from the user.
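The attribute disambiguation and context union of Subsection 5.2 can be sketched in a few lines: each concept is tagged with the index of its ontology before the union is taken, so that Hotel from O₁ and Hotel from O₂ remain distinct attributes. The two toy contexts below are invented and are not those of Figure 2.

```python
# Two invented formal contexts K_i = (G_i, M_i, I_i) sharing the name "Hotel".
K1 = ({"doc1", "doc2"}, {"Hotel", "Root"},
      {("doc1", "Hotel"), ("doc1", "Root"), ("doc2", "Root")})
K2 = ({"doc1", "doc2"}, {"Hotel", "Musical"},
      {("doc2", "Hotel"), ("doc2", "Musical")})

def merge_contexts(K1, K2):
    """Merged context K := (G, M1~ u M2~, I~) with disambiguated attributes."""
    (G1, M1, I1), (G2, M2, I2) = K1, K2
    G = G1 | G2
    M = {(m, 1) for m in M1} | {(m, 2) for m in M2}          # tag with origin
    I = ({(g, (m, 1)) for (g, m) in I1} |
         {(g, (m, 2)) for (g, m) in I2})                      # (g,(m,i)) iff (g,m) in I_i
    return G, M, I

G, M, I = merge_contexts(K1, K2)
print(("Hotel", 1) in M and ("Hotel", 2) in M)   # True: both kept apart
```

Whether the two Hotel attributes should later collapse into one target concept is then decided by the lattice (equal extents), not by the shared name.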
There are a number of queries which may be used to focus on the most relevant parts of the pruned concept lattice. We discuss these queries after the description of the general strategy, which follows now.

As the documents are not needed for the generation of the target ontology, we restrict our attention to the intents of the formal concepts, which are sets of (ontology) concepts of the source ontologies. For each formal concept of the pruned concept lattice, we analyze the related key sets. For each formal concept, the following cases can be distinguished:

1. It has exactly one key set of cardinality 1.
2. It has two or more key sets of cardinality 1.
3. It has no key sets of cardinality 0 or 1.
4. It has the empty set as key set (which implies, by the definition of key sets, that the formal concept does not have another key set).

The generation of the target ontology starts with all concepts in one of the first two situations. The first case is the easiest: the formal concept is generated by exactly one ontology concept from one of the source ontologies. It can be included in the target ontology without interaction of the knowledge engineer. In our example, these are the two formal concepts labeled by Vacation_1 and by Event_1.

In the second case, two or more concepts of the source ontologies generate the same formal concept. This indicates that the concepts should be merged into one concept in the target ontology. The user is asked which of the names to retain. In the example, this is the case for two formal concepts: the key sets {Concert_1} and {Musical_2} generate the same formal concept and are thus suggested to be merged; and the key sets {Hotel_1}, {Hotel_2}, and {Accommodation_2} also generate the same formal concept. (Root_1 and Root_2 are no key sets, as each of them has a subset, namely the empty set, generating the same formal concept.) The latter case is interesting, since it includes two concepts of the same ontology. This means that the set of documents does not provide enough details to separate these two concepts. Either the knowledge engineer decides to merge the concepts (for instance because he observes that the distinction is of no importance in the target application), or he adds them as separate concepts to the target ontology. If there are too many suggestions to merge concepts which should be distinguished, this is an indication that the set of documents was not large enough. In such a case, the user might want to re-launch FCA–MERGE with a larger set of documents.

When all formal concepts in the first two cases are dealt with, all concepts from the source ontologies are included in the target ontology. Now all relations from the two source ontologies are copied into the target ontology. Possible conflicts and duplicates have to be resolved by the ontology engineer.

In the next step, we deal with all formal concepts covered by the third case. They are all generated by at least two concepts from the source ontologies, and are candidates for new ontology concepts or relations in the target ontology. The decision whether to add a concept or a relation to the target ontology (or to discard the suggestion) is a modeling decision, and is left to the user. The key sets provide suggestions either for the name of the new concept, or for the concepts which should be linked with the new relation. Only those key sets with minimal cardinality are considered, as they provide the shortest names for new concepts and minimal arities for new relations, respectively.

For instance, the formal concept in the middle of Figure 3 has {Hotel_2, Event_1}, {Hotel_1, Event_1}, and {Accommodation_2, Event_1} as key sets. The user can now decide whether to create a new concept with the default name HotelEvent (which is unlikely in this situation), or to create a new relation with arity (Hotel, Event), e.g., the relation organizesEvent.

Key sets of cardinality 2 serve yet another purpose: {m₁, m₂} being a key set implies that neither m₁ ≤_C m₂ nor m₂ ≤_C m₁ currently holds. Thus, when the user does not use a key set of cardinality 2 for generating a new concept or relation, she should check whether it is reasonable to add one of the two subsumptions to the target ontology. This case does not show up in our small example; an example from the large ontologies is given at the end of the section.

There is exactly one formal concept in the fourth case (as the empty set is always a key set). This formal concept gives rise to a new largest concept in the target ontology, the Root concept. It is up to the knowledge engineer to accept or to reject this concept. Many ontology tools require the existence of such a largest concept. In our example, this is the formal concept labeled by Root_1 and Root_2.

Finally, the ≤_C order on the concepts of the target ontology can be derived automatically from the pruned concept lattice: if the concepts c₁ and c₂ are derived from the formal concepts (A₁, B₁) and (A₂, B₂), respectively, then c₁ ≤_C c₂ if and only if B₁ ⊇ B₂ (or if the user explicitly modeled it based on a key set of cardinality 2).

Querying the pruned concept lattice. In order to support the knowledge engineer in the different steps, there are a number of queries for focusing his attention on the significant parts of the pruned concept lattice.

Two queries support the handling of the second case (in which different ontology concepts generate the same formal concept). The first is a list of all pairs (m₁, m₂) ∈ M₁ × M₂ with m₁′ = m₂′. It indicates which concepts from the different source ontologies should be merged. In our small example, this list contains for instance the pair (Concert_1, Musical_2). In the larger application (which is based on the German language), pairs like (Zoo_1, Tierpark_2) and (Zoo_1, Tiergarten_2) are listed. We decided to merge Zoo [zoo] and Tierpark [zoo], but not Zoo and Tiergarten [zoological garden].

The second query returns, for ontology O_i with i ∈ {1, 2}, the list of pairs (m, n) ∈ M_i × M_i with m′ = n′. It helps checking which concepts out of a single ontology might be subject to merging. The user might either conclude that some of these concept pairs can be merged because their differentiation is not necessary in the target application, or he might decide that the set of documents must be extended because it does not differentiate the concepts enough. In the small example, the list for O₁ contains only the pair (Hotel_1, Accommodation_1).
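The four cases above amount to a small classification over a formal concept's key sets. The following sketch is illustrative only: the key sets are given directly as Python sets of (concept, ontology-index) pairs, whereas in FCA–MERGE they are produced by TITANIC.

```python
def classify(key_sets):
    """Map a formal concept's key sets to one of the four cases above."""
    sizes = sorted(len(k) for k in key_sets)
    if sizes == [1]:
        return "case 1: copy the single generating concept into the target"
    if sizes and sizes[-1] == 1:
        return "case 2: suggest merging the generating concepts"
    if 0 in sizes:
        return "case 4: candidate for the new Root concept"
    return "case 3: candidate for a new concept or relation"

# Key sets drawn from the paper's small example (attribute = (name, ontology)).
print(classify([{("Vacation", 1)}]))                    # case 1
print(classify([{("Concert", 1)}, {("Musical", 2)}]))   # case 2
print(classify([{("Hotel", 2), ("Event", 1)},
                {("Hotel", 1), ("Event", 1)}]))         # case 3
print(classify([set()]))                                # case 4
```

Note that case 4 needs no further check for other key sets: as stated above, a concept whose key set is the empty set has no other key set.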
In the larger application, we additionally had pairs like (Räumliches, Gebiet) and (Auto, Fortbewegungsmittel). For the target application, we merged Räumliches [spatial thing] and Gebiet [region], but not Auto [car] and Fortbewegungsmittel [means of travel].

The number of suggestions provided for the third situation can be quite high. There are three queries which present only the most significant formal concepts; these queries can also be combined.

Firstly, one can fix an upper bound for the cardinality of the key sets. The lower the bound is, the fewer new concepts are presented. A typical value is 2, which allows retaining all concepts from the two source ontologies (as they are generated by key sets of cardinality 1) and discovering new binary relations between concepts from the different source ontologies, but no relations of higher arity. If one is interested in having exactly the old concepts and relations in the target ontology, and no suggestions for new concepts and relations, then the upper bound for the key set size is set to 1.

Secondly, one can fix a minimum support. This prunes all formal concepts where the cardinality of the extent is too low (compared to the overall number of documents). The default is no pruning, i.e., a minimum support of 0%. It is also possible to fix different minimum supports for different cardinalities of the key sets. The typical case is to set the minimum support to 0% for key sets of cardinality 1, and to a higher percentage for key sets of higher cardinality. This way, we retain all concepts from the source ontologies, and generate new concepts and relations only if they have a certain (statistical) significance.

Thirdly, one can consider only those key sets of cardinality 2 in which the two concepts come from one ontology each. This way, only those formal concepts are presented which give rise to concepts or relations linking the two source ontologies. This restriction is useful whenever the quality of each source ontology per se is known to be high, i.e., when there is no need to extend each of the source ontologies alone.

In the small example, there are no key sets with cardinality 3 or higher. The three key sets with cardinality 2 (as given above) all have a support of 11/14 ≈ 79%. In the larger application, we fixed 2 as the upper bound for the cardinality of the key sets. We obtained key sets like (Telefon_1 [telephone], Öffentliche Einrichtung_2 [public institution]) (support = 24.5%), (Unterkunft_1 [accommodation], Fortbewegungsmittel_2 [means of travel]) (1.7%), (Schloß_1 [castle], Bauwerk_2 [building]) (2.1%), and (Zimmer_1 [room], Bibliothek_2 [library]) (2.1%). The first gave rise to a new concept Telefonzelle [public phone booth], the second to a new binary relation hatVerkehrsanbindung [hasPublicTransportConnection], the third to a new subsumption Schloß ≤_C Bauwerk, and the fourth was discarded as meaningless.

6 Conclusion and Future Work

FCA–MERGE is a bottom-up technique for merging ontologies based on a set of documents. In this paper, we described the three steps of the technique: the linguistic analysis of the texts, which returns two formal contexts; the merging of the two contexts and the computation of the pruned concept lattice; and the semi-automatic ontology creation phase, which supports the user in modeling the target ontology. The paper described the underlying assumptions and discussed the methodology.

Future work includes the closer integration of the FCA–MERGE method in the ontology engineering environment OntoEdit. In particular, we will offer views on the pruned concept lattice based on the queries described in Subsection 5.3.

The evaluation of ontology merging is an open issue [Noy and Musen, 2000]. We plan to use FCA–MERGE to independently generate a set of merged ontologies (based on two given source ontologies). Comparing these merged ontologies using the standard information retrieval measures as proposed in [Noy and Musen, 2000] will allow us to evaluate the performance of FCA–MERGE.

On the theoretical side, an interesting open question is the extension of the formalism to features of specific ontology languages, like for instance functions or axioms. The question is (a) how they can be exploited for the merging process, and (b) how new functions and axioms describing the interplay between the source ontologies can be generated for the target ontology.

References

[Chalupsky, 2000] H. Chalupsky: OntoMorph: A translation system for symbolic knowledge. Proc. KR '00.

[Ganter and Wille, 1999] B. Ganter, R. Wille: Formal Concept Analysis: Mathematical Foundations. Springer.

[Hovy, 1998] E. Hovy: Combining and standardizing large-scale, practical ontologies for machine translation and other uses. Proc. 1st Intl. Conf. on Language Resources and Evaluation, Granada.

[Kifer et al., 1995] M. Kifer, G. Lausen, J. Wu: Logical foundations of object-oriented and frame-based languages. Journal of the ACM 42.

[McGuinness et al., 2000] D. L. McGuinness, R. Fikes, J. Rice, S. Wilder: An environment for merging and testing large ontologies. Proc. KR '00.

[Neumann et al., 1997] G. Neumann, R. Backofen, J. Baur, M. Becker, C. Braun: An information extraction core system for real world German text processing. Proc. ANLP-97, Washington.

[Noy and Musen, 2000] N. Fridman Noy, M. A. Musen: PROMPT: Algorithm and tool for automated ontology merging and alignment. Proc. AAAI '00.

[Schmitt and Saake, 1998] I. Schmitt, G. Saake: Merging inheritance hierarchies for database integration. Proc. CoopIS '98, IEEE Computer Science Press.

[Stumme et al., 2000] G. Stumme, R. Taouil, Y. Bastide, N. Pasquier, L. Lakhal: Fast computation of concept lattices using data mining techniques. Proc. KRDB '00, CEUR Workshop Proceedings, http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/

[Wille, 1982] R. Wille: Restructuring lattice theory: An approach based on hierarchies of concepts. In: I. Rival (ed.): Ordered Sets. Reidel, Dordrecht, 445–470.