The YAGO-SUMO integration incorporates millions of entities from YAGO, which is based on Wikipedia and WordNet, into the Suggested Upper Merged Ontology (SUMO), a highly axiomatized formal upper ontology. With the combined force of the two ontologies, an enormous, unprecedented corpus of formalized world knowledge is available for automated processing and reasoning, providing information about millions of entities such as people, cities, organizations, and companies.
Compared to the original YAGO, more advanced reasoning is possible thanks to the axiomatic knowledge delivered by SUMO. A reasoner can conclude, for example, that a child of a human must also be a human and cannot have been born before its parents, or that two people sharing the same parents must be siblings.
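The kind of inference this enables can be sketched as a toy forward-chaining loop in Python over two hand-written rules. This is illustrative only: SUMO's actual axioms are first-order formulas (in SUO-KIF) intended for theorem provers, not this simplified triple store, and the facts below are invented.

```python
# Toy fact base: each fact is a (relation, arg1, arg2) triple.
facts = {
    ("instanceOf", "alice", "Human"),
    ("parentOf", "alice", "bob"),
    ("parentOf", "alice", "carol"),
}

def infer(facts):
    """Apply two hand-written rules until no new facts appear."""
    facts = set(facts)
    changed = True
    while changed:
        changed = False
        new = set()
        parent_facts = [f for f in facts if f[0] == "parentOf"]
        for _, parent, child in parent_facts:
            # Rule 1: a child of a Human is also a Human.
            if ("instanceOf", parent, "Human") in facts:
                new.add(("instanceOf", child, "Human"))
            # Rule 2: distinct children of the same parent are siblings.
            for _, parent2, child2 in parent_facts:
                if parent == parent2 and child != child2:
                    new.add(("siblingOf", child, child2))
        if not new <= facts:
            facts |= new
            changed = True
    return facts

derived = infer(facts)
print(("instanceOf", "bob", "Human") in derived)  # True
print(("siblingOf", "bob", "carol") in derived)   # True
```

A real axiomatized ontology would also let the reasoner detect contradictions, e.g. rejecting a birth date for a child that precedes the parents' birth dates.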
Machine Learning Methods for Analysing and Linking RDF Data (Jens Lehmann)
Invited Talk at the 8th International Conference on Scalable Uncertainty Management (SUM)
The talk outlines applications of supervised structured machine learning and presents a specific refinement-operator-based approach for RDF/OWL. It also outlines how similar ideas can be used in other (formal) languages, in particular link specifications.
The document provides guidelines for creating an ontology, including defining what an ontology is, why ontologies are useful, and the basic components and methodology for building one. It discusses evaluating ontology taxonomies and provides two examples - an e-commerce ontology and a banking ontology - to demonstrate the concepts. The key steps outlined are identifying important terms and concepts, organizing them hierarchically, defining attributes and relationships, and evaluating for issues like redundant or incomplete information.
This document discusses exposing Nobel Prize data as linked open data. It describes a two phase approach: 1) exposing the data externally to spread information and enable other apps, and 2) using the linked data internally to improve data quality and enhance webpages. It provides details on publishing the dataset, interlinking it with other datasets, and technical implementations like a SPARQL endpoint and linked data cache. The goal is to increase the value of Nobel Prize information for their organization and audiences while also contributing to the larger linked open data cloud.
The document summarizes a seminar on ontology mapping presented by Samhati Soor. The seminar covered the need for ontology mapping due to the proliferation of ontologies, and the purpose of mapping ontologies to achieve interoperability and sharing knowledge. It defined ontologies and ontology mapping and discussed categories of mapping including between global and local ontologies, between local ontologies, and for merging ontologies. Tools for ontology mapping discussed included GLUE and SAM. Evaluation criteria and challenges of ontology mapping were also summarized along with conclusions and references.
Machine Learning Techniques for the Semantic Web (pauldix)
The document discusses using machine learning techniques on semantic web data represented in RDF triples. It describes representing the RDF triples as a vector space to find relationships between subjects. Dimensionality reduction techniques like latent semantic analysis can be applied to cluster similar subjects based on predicate relationships. Both supervised and unsupervised machine learning approaches are applicable, such as classification to map between ontologies or clustering for ontology learning.
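The vector-space step can be sketched as follows: each subject becomes a vector over (predicate, object) features, and cosine similarity then exposes related subjects. An approach like latent semantic analysis would additionally apply SVD to this matrix; the triples below are invented for illustration.

```python
import math

# Invented mini RDF graph as (subject, predicate, object) triples.
triples = [
    ("Berlin", "type", "City"), ("Berlin", "locatedIn", "Germany"),
    ("Paris", "type", "City"), ("Paris", "locatedIn", "France"),
    ("Einstein", "type", "Person"), ("Einstein", "bornIn", "Ulm"),
]

# One dimension per (predicate, object) pair; one vector per subject.
features = sorted({(p, o) for _, p, o in triples})

def vector(subject):
    present = {(p, o) for s, p, o in triples if s == subject}
    return [1.0 if f in present else 0.0 for f in features]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Cities share the ("type", "City") dimension, so they end up closer
# to each other than to a person.
print(cosine(vector("Berlin"), vector("Paris")))     # 0.5
print(cosine(vector("Berlin"), vector("Einstein")))  # 0.0
```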
'Meaning is its use' - Towards the use of distributional semantics for conten... (Cataldo Musto)
The document discusses using distributional semantics for content-based recommender systems. Distributional semantics represents words and documents as vectors based on their contexts in a large text corpus. This allows calculating similarities between words and documents to find semantic relationships. The speaker proposes an enhanced vector space model (eVSM) that represents users, items, and their profiles as vectors for recommender systems. Representations based on distributional semantics are inherently multilingual as word contexts are largely language-independent. This allows cross-language recommendations without additional costs.
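A minimal illustration of the distributional idea, with an invented toy corpus: each word is represented by the counts of words co-occurring with it inside a small window, and cosine similarity over those counts surfaces semantically related words.

```python
import math
from collections import Counter

# Invented toy corpus; real distributional models use large text corpora.
corpus = ("the cat sat on the mat . "
          "the dog sat on the rug . "
          "stocks fell on the market .").split()

def context_vector(word, window=2):
    """Count words co-occurring with `word` within +/- `window` tokens."""
    counts = Counter()
    for i, tok in enumerate(corpus):
        if tok == word:
            lo, hi = max(0, i - window), min(len(corpus), i + window + 1)
            counts.update(t for t in corpus[lo:hi] if t != word)
    return counts

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in set(u) | set(v))
    nu = math.sqrt(sum(c * c for c in u.values()))
    nv = math.sqrt(sum(c * c for c in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

# "cat" and "dog" occur in near-identical contexts, so their vectors align
# more closely than "cat" and "stocks".
print(cosine(context_vector("cat"), context_vector("dog")))
print(cosine(context_vector("cat"), context_vector("stocks")))
```

Because the vectors are built purely from co-occurrence statistics, the same construction works in any language, which is the basis of the multilinguality claim above.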
The document discusses processing OWL ontologies using the Jena ontology API in Java. It describes how to create an ontology model, read an existing ontology, retrieve classes and properties, and examine class and property hierarchies. Key points include getting ontology classes and iterating over them, examining class relationships and restrictions, retrieving object and datatype properties, and getting property domains and ranges. The document provides examples of working with ontologies using the Jena API in Java.
Support vector machines (SVM) are a supervised learning method used for classification and regression analysis. SVMs find a hyperplane that maximizes the margin between two classes of objects. They can handle non-linear classification problems by projecting data into a higher dimensional space. The training points closest to the separating hyperplane are called support vectors. SVMs learn the discrimination boundary between classes rather than modeling each class individually.
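The margin and support-vector notions can be made concrete with a toy example. Note that the hyperplane below is hand-chosen for illustration; a real SVM learns it by solving a quadratic optimization problem that maximizes the margin.

```python
import math

# Invented, linearly separable toy data: point -> class label.
points = {(2, 2): +1, (3, 3): +1, (1, 1): -1, (0, 0): -1}

# A hand-chosen (not learned) separating hyperplane w.x + b = 0.
w, b = (1.0, 1.0), -3.0

def distance(p):
    """Geometric distance from point p to the hyperplane."""
    return abs(w[0] * p[0] + w[1] * p[1] + b) / math.hypot(*w)

# The margin is fixed by the training points closest to the hyperplane;
# exactly those points are the support vectors.
margin = min(distance(p) for p in points)
support = [p for p in points if math.isclose(distance(p), margin)]
print(margin, support)  # 0.707... [(2, 2), (1, 1)]
```

Here (2, 2) and (1, 1) pin the margin from either side; moving any other point (without crossing the margin) would leave the learned boundary unchanged.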
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
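The core indexing idea can be sketched in a few lines of Python. The documents and the overlap score below are simplified assumptions; the paper's actual relatedness scoring compares foreground and background corpus statistics rather than a plain intersection ratio.

```python
# Toy corpus: document id -> text.
docs = {
    1: "java spark hadoop", 2: "java spring hibernate",
    3: "python spark hadoop", 4: "java hadoop spark",
}

# Inverted index: term (graph node) -> set of document ids (postings list).
index = {}
for doc_id, text in docs.items():
    for term in text.split():
        index.setdefault(term, set()).add(doc_id)

def edge_weight(a, b):
    """An edge between nodes a and b materializes from the documents in
    which both terms occur; here scored as |both| / |docs containing a|."""
    both = index[a] & index[b]
    return len(both) / len(index[a])

print(edge_weight("spark", "hadoop"))  # 1.0: every 'spark' doc has 'hadoop'
print(edge_weight("java", "spark"))    # 2 of 3 'java' docs have 'spark'
```

The key property shown is the layer of indirection: no edge is stored anywhere; any pair (or set) of terms can be combined on demand and scored against the postings lists.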
Semantic Perspectives for Contemporary Question Answering Systems (Andre Freitas)
This document summarizes an academic presentation about semantic perspectives for contemporary question answering systems. It discusses multiple perspectives of semantic representation including facts versus definitions and temporality. It also outlines lightweight semantic representation using RDF and distributional semantics. The document then discusses knowledge graph extraction from text including taxonomy extraction, n-ary relation extraction, and argumentation representation. It concludes with an overview of querying knowledge graphs using distributional semantics and semantic approximation as core operations.
Presentation of Domain Specific Question Answering System Using N-gram Approach (Tasnim Ara Islam)
Design of an application for a domain-specific question answering system. Built a solution for finding answers to factoid questions using an N-gram mining approach. Calculated the percentage of related answers for each specific question. Built the application on the Java platform.
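The N-gram mining step for factoid questions can be sketched as follows. The deck's exact pipeline is not given here, so the passages, stopword list, and ranking below are illustrative assumptions: candidate answers are the n-grams that recur across retrieved passages, after removing stopwords and question words.

```python
from collections import Counter

question = "who invented the telephone"
# Hypothetical passages, as if retrieved by a search engine for the question.
passages = [
    "alexander graham bell invented the telephone in 1876",
    "the telephone was patented by alexander graham bell",
    "bell is credited with inventing the first practical telephone",
]
stop = {"the", "in", "by", "was", "is", "with", "first"} | set(question.split())

def ngrams(tokens, n):
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

# Mine 1- to 3-grams from the passages (stopwords and question words
# removed) and rank candidate answers by how often they recur.
counts = Counter()
for p in passages:
    toks = [t for t in p.split() if t not in stop]
    for n in (1, 2, 3):
        counts.update(ngrams(toks, n))

print(counts.most_common(3))
```

Real systems typically weight longer n-grams upward so that a full answer like "alexander graham bell" can outrank its individual constituent words.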
State-space graph representation
Global Problem Solver
Uninformed (blind) search algorithms
Informed search algorithms
Depth First Search
Breadth First Search
Best First Search
A, A*
Heuristic function, admissible heuristic function
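The informed-search side of this material, A* with an admissible heuristic, can be sketched as follows. The state graph and heuristic values are invented for illustration; admissibility (the heuristic never overestimates the true remaining cost) is what guarantees A* returns an optimal path.

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A*: always expand the frontier node minimizing f = g + h, where
    g is the cost so far and h an admissible heuristic estimate."""
    frontier = [(h(start), 0, start, [start])]
    best_g = {}
    while frontier:
        f, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return g, path
        if node in best_g and best_g[node] <= g:
            continue  # already reached this state more cheaply
        best_g[node] = g
        for nxt, cost in neighbors[node]:
            heapq.heappush(frontier,
                           (g + cost + h(nxt), g + cost, nxt, [*path, nxt]))
    return None

# Toy state graph: vertex -> [(successor, step cost), ...]
graph = {"S": [("A", 1), ("B", 4)], "A": [("B", 2), ("G", 5)],
         "B": [("G", 1)], "G": []}
# Heuristic values chosen to be admissible for this graph.
hvals = {"S": 4, "A": 3, "B": 1, "G": 0}
print(a_star("S", "G", graph, hvals.get))  # (4, ['S', 'A', 'B', 'G'])
```

With h = 0 everywhere the same code degrades into uniform-cost search, one of the uninformed strategies listed above.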
Deep Learning Models for Question Answering (Sujit Pal)
This document discusses deep learning models for question answering. It provides an overview of common deep learning building blocks such as fully connected networks, word embeddings, convolutional neural networks and recurrent neural networks. It then summarizes the authors' experiments using these techniques on benchmark question answering datasets like bAbI and a Kaggle science question dataset. Their best model achieved an accuracy of 76.27% by incorporating custom word embeddings trained on external knowledge sources. The authors discuss future work including trying additional models and deploying the trained systems.
UMBEL: Subject Concepts Layer for the Web (Mike Bergman)
UMBEL is a lightweight ontology and subject concept framework, comprising around 20,000 concepts and their relationships, that aims to provide context for web content and datasets. It serves as a reference structure for placing information into context with other data by defining common subject concepts and mapping entities and datasets to these concepts. UMBEL is freely available under an open source license and relies on existing vocabularies and ontologies like SKOS, RDFS, and OWL to provide interoperability.
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco... (Marko Rodriguez)
A graph is a data structure that links a set of vertices by a set of edges. Modern graph databases support multi-relational graph structures, where there exist different types of vertices (e.g. people, places, items) and different types of edges (e.g. friend, lives at, purchased). By means of index-free adjacency, graph databases are optimized for graph traversals and are interacted with through a graph traversal engine. A graph traversal is defined as an abstract path whose instance is realized on a graph dataset. Graph databases and traversals can be used for searching, scoring, ranking, and in concert, recommendation. This presentation will explore graph structures, algorithms, traversal algebras, graph-related software suites, and a host of examples demonstrating how to solve real-world problems, in real-time, with graphs. This is a whirlwind tour of the theory and application of graphs.
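A one-edge-label-at-a-time traversal of such a multi-relational graph, composed into a simple recommendation query, might look like this in Python. The graph is a toy stand-in; real graph databases express the same pattern through a traversal language such as Gremlin.

```python
# Toy multi-relational graph: vertex -> [(edge_label, neighbor), ...]
graph = {
    "alice": [("friend", "bob"), ("purchased", "book")],
    "bob":   [("purchased", "book"), ("purchased", "laptop")],
    "carol": [("purchased", "laptop")],
}

def out(vertex, label):
    """Follow only the outgoing edges carrying the given label."""
    return [v for lbl, v in graph.get(vertex, []) if lbl == label]

def recommend(user):
    """Traversal 'user -friend-> * -purchased-> item', scored by how many
    friends purchased each item, excluding the user's own purchases."""
    owned = set(out(user, "purchased"))
    scores = {}
    for friend in out(user, "friend"):
        for item in out(friend, "purchased"):
            if item not in owned:
                scores[item] = scores.get(item, 0) + 1
    return sorted(scores, key=scores.get, reverse=True)

print(recommend("alice"))  # ['laptop']
```

The scoring step is what turns a plain traversal (searching) into ranking and, composed over friend edges, into recommendation, mirroring the progression the talk describes.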
This document summarizes support vector machines (SVMs), a machine learning technique for classification and regression. SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples in the training data. This is achieved by solving a convex optimization problem that minimizes a quadratic function under linear constraints. SVMs can perform non-linear classification by implicitly mapping inputs into a higher-dimensional feature space using kernel functions. They have applications in areas like text categorization due to their ability to handle high-dimensional sparse data.
In machine learning, support vector machines (SVMs, also called support vector networks) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
grammaticality, deep & surface structure, and ambiguity (Dedew Deviarini)
This document discusses English morphology and syntax. It covers several key topics:
1. What is syntax and syntactic structure, including parts of speech and phrase structure.
2. The difference between deep and surface structure, where deep structure is the underlying form and surface structure is the actual form after transformations.
3. Grammaticality, which refers to sentences that follow syntactic rules rather than other factors like meaning or truth.
4. Types of ambiguities, including lexical ambiguities due to ambiguous words, and structural ambiguities due to multiple possible syntactic trees.
Formalization and implementation of BFO 2 with a focus on the OWL implementation (golpedegato2)
Formalization and implementation of Basic Formal Ontology 2 with a focus on the OWL implementation.
With an introduction to some of the underlying technologies
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog... (dolleyj)
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To facilitate the addition and maintenance of new classes, we utilize ROBOT (a command line tool for working with Open Biomedical Ontologies) as part of our standard workflow. ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigation classes using logical definitions. ECO is currently in use by dozens of groups engaged in biological curation and the number of ECO users continues to grow. The ontology, in OWL and Open Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as the ECO web page (http://evidenceontology.org/).
The document discusses MIREOT (Minimal information to reference external ontology terms), an approach used by the Ontology for Biomedical Investigations (OBI) project to import terms from external ontologies. It describes three approaches to importing terms - creating duplicate terms, importing modules, and full imports. It proposes importing only the classes needed using a minimal set of information to unambiguously identify terms from external ontologies. This process has been implemented in OBI and an online tool called OntoFox has been developed to facilitate the MIREOT process.
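The minimal information set described above (the source ontology, the source term, and the superclass the term is placed under in the target ontology) can be captured in a small record. A sketch, with all IRIs as hypothetical placeholders rather than real ontology terms:

```python
from dataclasses import dataclass

# All IRIs below are hypothetical placeholders, not real ontology terms.
@dataclass(frozen=True)
class MireotTerm:
    source_ontology: str    # IRI of the ontology the term comes from
    source_term: str        # IRI of the external term itself
    target_superclass: str  # superclass it is placed under in the target

imported = [
    MireotTerm(
        source_ontology="http://example.org/source-ontology.owl",
        source_term="http://example.org/source-ontology.owl#Assay",
        target_superclass="http://example.org/target-ontology.owl#PlannedProcess",
    ),
]

# The importing ontology can now reference the external class without
# loading the rest of its source ontology.
print(imported[0].source_term)
```

This is the contrast with the other two strategies named above: duplicating terms loses provenance, and full imports pull in everything, while MIREOT keeps just enough to identify each borrowed term unambiguously.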
An Extension of Protégé for an Automatic Fuzzy-Ontology Building U... (ijcsit)
The process of building an ontology is a very complex and time-consuming process, especially when dealing with huge amounts of data. Unfortunately, current marketed tools are very limited and don't meet all user needs. Indeed, these tools build the core of the ontology from initial data that generates a large amount of information. In this paper, we aim to resolve these problems by adding an extension to the well-known ontology editor Protégé in order to work towards a complete FCA-based framework which resolves the limitations of other tools in building fuzzy ontologies. We give, in this paper, some details on our semi-automatic collaborative tool called FOD Tab Plug-in, which takes into consideration another degree of granularity in the generation process. In fact, it follows a bottom-up strategy based on conceptual clustering, fuzzy logic and Formal Concept Analysis (FCA), and it defines the ontology between classes resulting from a preliminary classification of the data rather than from the initial large amount of data.
The document discusses how ontologies and social media can support eLearning. It describes how ontologies can be enhanced with social tags to integrate formal and informal knowledge. An experiment used tags from Delicious to identify related tags and map them to concepts in a computing ontology. User evaluations found that beginners prefer tagged documents while advanced learners benefit from structured ontologies. Integrating ontologies, tags and social networks has potential to support knowledge discovery and recommendation across formal and informal learning resources and communities.
Towards Linked Ontologies and Data on the Semantic Web (Jie Bao)
The document outlines Jie Bao's research background and overview, including work on linked ontologies and linked data using semantic wikis. It discusses a modular ontology approach called P-DL that allows importing between ontologies similar to citation. It also describes using a semantic wiki to generate linked data from wiki revision histories and other semantic data. Future work includes applying these techniques to government data and improving scalability.
I gave this presentation at the first PKP Scholarly Publishing Conference in Vancouver, Canada, on July 12th, 2007. Check out the general conference blog if you want to know more about the event:
http://scholarlypublishing.blogspot.com/
You may also be interested in things marked with the "open-access" tag in my own blog:
http://corpblawg.ynada.com/
The document discusses the Portable Ontology Aligned Fragments (POAF) project. It describes how POAF aims to address issues with semantic integration by capturing relevant information from aligned ontologies in portable, machine-readable fragments. It provides an example of aligning terrorism-related ontologies and extracting a POAF. Future work areas are also outlined, such as dynamic namespace resolution and using POAF to enable faster semantic queries in distributed environments.
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
This document summarizes a workshop on data integration using ontologies. It discusses how data integration is challenging due to differences in schemas, semantics, measurements, units and labels across data sources. It proposes that ontologies can help with data integration by providing definitions for schemas and entities referred to in the data. Core challenges discussed include dealing with multiple synonyms for entities and relationships between biological entities that depend on context. The document advocates for shared community ontologies that can be extended and integrated to facilitate flexible and responsive data integration across multiple sources.
Presentation of the Semantic Knowledge Graph research paper at the 2016 IEEE 3rd International Conference on Data Science and Advanced Analytics (Montreal, Canada - October 18th, 2016)
Abstract—This paper describes a new kind of knowledge representation and mining system which we are calling the Semantic Knowledge Graph. At its heart, the Semantic Knowledge Graph leverages an inverted index, along with a complementary uninverted index, to represent nodes (terms) and edges (the documents within intersecting postings lists for multiple terms/nodes). This provides a layer of indirection between each pair of nodes and their corresponding edge, enabling edges to materialize dynamically from underlying corpus statistics. As a result, any combination of nodes can have edges to any other nodes materialize and be scored to reveal latent relationships between the nodes. This provides numerous benefits: the knowledge graph can be built automatically from a real-world corpus of data, new nodes - along with their combined edges - can be instantly materialized from any arbitrary combination of preexisting nodes (using set operations), and a full model of the semantic relationships between all entities within a domain can be represented and dynamically traversed using a highly compact representation of the graph. Such a system has widespread applications in areas as diverse as knowledge modeling and reasoning, natural language processing, anomaly detection, data cleansing, semantic search, analytics, data classification, root cause analysis, and recommendations systems. The main contribution of this paper is the introduction of a novel system - the Semantic Knowledge Graph - which is able to dynamically discover and score interesting relationships between any arbitrary combination of entities (words, phrases, or extracted concepts) through dynamically materializing nodes and edges from a compact graphical representation built automatically from a corpus of data representative of a knowledge domain.
Semantic Perspectives for Contemporary Question Answering SystemsAndre Freitas
This document summarizes an academic presentation about semantic perspectives for contemporary question answering systems. It discusses multiple perspectives of semantic representation including facts versus definitions and temporality. It also outlines lightweight semantic representation using RDF and distributional semantics. The document then discusses knowledge graph extraction from text including taxonomy extraction, n-ary relation extraction, and argumentation representation. It concludes with an overview of querying knowledge graphs using distributional semantics and semantic approximation as core operations.
Presentation of Domain Specific Question Answering System Using N-gram Approach.Tasnim Ara Islam
Design an application for a domain specific question answering system. Built a solution for finding answers of factoid questions by using N-gram Mining Approach. Calculated percentage about the related answers for the specific question. Built this application in Java platform.
Représentation sous forme de graphe d'états
Global Problem Solver
Algorithmes de Recherche Aveugles
Algorithmes de Recherche Informés
Depth First Search
Breadth First Search
Best First Search
A, A*
Fonction heuristique, Fonction heuristique admissible
Deep Learning Models for Question AnsweringSujit Pal
This document discusses deep learning models for question answering. It provides an overview of common deep learning building blocks such as fully connected networks, word embeddings, convolutional neural networks and recurrent neural networks. It then summarizes the authors' experiments using these techniques on benchmark question answering datasets like bAbI and a Kaggle science question dataset. Their best model achieved an accuracy of 76.27% by incorporating custom word embeddings trained on external knowledge sources. The authors discuss future work including trying additional models and deploying the trained systems.
UMBEL: Subject Concepts Layer for the WebMike Bergman
UMBEL is a lightweight ontology and subject concept framework comprised of around 20,000 concepts and their relationships that aims to provide context for web content and datasets. It serves as a reference structure for placing information into context with other data by defining common subject concepts and mapping entities and datasets to these concepts. UMBEL is freely available under an open source license and relies on existing vocabularies and ontologies like SKOS, RDFS, and OWL to provide interoperability.
Problem-Solving using Graph Traversals: Searching, Scoring, Ranking, and Reco...Marko Rodriguez
A graph is a data structure that links a set of vertices by a set of edges. Modern graph databases support multi-relational graph structures, where there exist different types of vertices (e.g. people, places, items) and different types of edges (e.g. friend, lives at, purchased). By means of index-free adjacency, graph databases are optimized for graph traversals and are interacted with through a graph traversal engine. A graph traversal is defined as an abstract path whose instance is realized on a graph dataset. Graph databases and traversals can be used for searching, scoring, ranking, and in concert, recommendation. This presentation will explore graph structures, algorithms, traversal algebras, graph-related software suites, and a host of examples demonstrating how to solve real-world problems, in real-time, with graphs. This is a whirlwind tour of the theory and application of graphs.
This document summarizes support vector machines (SVMs), a machine learning technique for classification and regression. SVMs find the optimal separating hyperplane that maximizes the margin between positive and negative examples in the training data. This is achieved by solving a convex optimization problem that minimizes a quadratic function under linear constraints. SVMs can perform non-linear classification by implicitly mapping inputs into a higher-dimensional feature space using kernel functions. They have applications in areas like text categorization due to their ability to handle high-dimensional sparse data.
In machine learning, support vector machines (SVMs, also support vector networks[1]) are supervised learning models with associated learning algorithms that analyze data and recognize patterns, used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier.
grammaticality, deep & surface structure, and ambiguityDedew Deviarini
This document discusses English morphology and syntax. It covers several key topics:
1. What is syntax and syntactic structure, including parts of speech and phrase structure.
2. The difference between deep and surface structure, where deep structure is the underlying form and surface structure is the actual form after transformations.
3. Grammaticality, which refers to sentences that follow syntactic rules rather than other factors like meaning or truth.
4. Types of ambiguities, including lexical ambiguities due to ambiguous words, and structural ambiguities due to multiple possible syntactic trees.
Formalization and implementation of BFO 2 with a focus on the OWL implementationgolpedegato2
Formalization and implementation of Basic Formal Ontology 2 with a focus on the OWL implementation.
With an introduction to some of the underlying technologies
ICBO 2018 Poster - Current Development in the Evidence and Conclusion Ontolog... (dolleyj)
The Evidence & Conclusion Ontology (ECO) has been developed to provide standardized descriptions for types of evidence within the biological domain. Best practices in biocuration require that when a biological assertion is made (e.g. linking a Gene Ontology (GO) term for a molecular function to a protein), the type of evidence supporting it is captured. In recent development efforts, we have been working with other ontology groups to ensure that ECO classes exist for the types of curation they support. These include the Ontology for Microbial Phenotypes and GO. In addition, we continue to support user-level class requests through our GitHub issue tracker. To facilitate the addition and maintenance of new classes, we utilize ROBOT (a command-line tool for working with Open Biomedical Ontologies) as part of our standard workflow. ROBOT templates allow us to define classes in a spreadsheet and convert them to Web Ontology Language (OWL) axioms, which can then be merged into ECO. ROBOT is also part of our automated release process. Additionally, we are engaged in ongoing work to map ECO classes to Ontology for Biomedical Investigation classes using logical definitions. ECO is currently in use by dozens of groups engaged in biological curation, and the number of ECO users continues to grow. The ontology, in OWL and Open Biomedical Ontology (OBO) formats, and associated resources can be accessed through our GitHub site (https://github.com/evidenceontology/evidenceontology) as well as the ECO web page (http://evidenceontology.org/).
The document discusses MIREOT (Minimal information to reference external ontology terms), an approach used by the Ontology for Biomedical Investigations (OBI) project to import terms from external ontologies. It describes three approaches to importing terms - creating duplicate terms, importing modules, and full imports. It proposes importing only the classes needed using a minimal set of information to unambiguously identify terms from external ontologies. This process has been implemented in OBI and an online tool called OntoFox has been developed to facilitate the MIREOT process.
An Extension of Protégé for an Automatic Fuzzy-Ontology Building U... (ijcsit)
Building an ontology is a complex and time-consuming process, especially when dealing with huge amounts of data. Unfortunately, current tools are very limited and do not meet all user needs: they build the core of the ontology directly from the initial data, which generates a large amount of information. In this paper, we aim to resolve these problems by adding an extension to the well-known ontology editor Protégé, working towards a complete FCA-based framework that overcomes the limitations of other tools in building fuzzy ontologies. We give some details on our semi-automatic collaborative tool, the FOD Tab plug-in, which takes into consideration another degree of granularity in the generation process. It follows a bottom-up strategy based on conceptual clustering, fuzzy logic, and Formal Concept Analysis (FCA), and it defines the ontology between classes resulting from a preliminary classification of the data rather than from the initial large amount of data.
The document discusses how ontologies and social media can support eLearning. It describes how ontologies can be enhanced with social tags to integrate formal and informal knowledge. An experiment used tags from Delicious to identify related tags and map them to concepts in a computing ontology. User evaluations found that beginners prefer tagged documents while advanced learners benefit from structured ontologies. Integrating ontologies, tags and social networks has potential to support knowledge discovery and recommendation across formal and informal learning resources and communities.
Towards Linked Ontologies and Data on the Semantic Web (Jie Bao)
The document outlines Jie Bao's research background and overview, including work on linked ontologies and linked data using semantic wikis. It discusses a modular ontology approach called P-DL that allows importing between ontologies similar to citation. It also describes using a semantic wiki to generate linked data from wiki revision histories and other semantic data. Future work includes applying these techniques to government data and improving scalability.
I gave this presentation at the first PKP Scholarly Publishing Conference in Vancouver, Canada, on July 12th 2007. Check out the general conference blog if you want to know more about the event:
http://scholarlypublishing.blogspot.com/
You may also be interested in things marked with the "open-access" tag in my own blog:
http://corpblawg.ynada.com/
The document discusses the Portable Ontology Aligned Fragments (POAF) project. It describes how POAF aims to address issues with semantic integration by capturing relevant information from aligned ontologies in portable, machine-readable fragments. It provides an example of aligning terrorism-related ontologies and extracting a POAF. Future work areas are also outlined, such as dynamic namespace resolution and using POAF to enable faster semantic queries in distributed environments.
Gadgets pwn us? A pattern language for CALL (Lawrie Hunter)
The document discusses creating a pattern language for computer-assisted language learning (CALL). It explores the concept of a pattern language as defined by Christopher Alexander and proposes a framework for creating a CALL pattern language in the era of web 2.0. The paper seeks to rework concepts from other fields, like "formal learning design expression" and "task arc," and have participants brainstorm elements to include through graphical challenges. The overall goal is to establish foundational patterns for CALL work.
The document discusses the impact of standardized terminologies and domain ontologies in multilingual information processing. It outlines how natural language processing (NLP) techniques can be used to semi-automatically populate ontologies by extracting information from text. Integrating knowledge from ontologies, NLP tools, and subject experts allows for more effective information access and management in an organization.
Cross-lingual ontology lexicalisation, translation and information extraction... (Tobias Wunner)
The document discusses cross-lingual ontology translation and lexicalization. It presents the lemon model for connecting ontology concepts to lexical information to facilitate tasks like machine translation. The lemon model represents lexical entries, forms, linguistic structure, meanings, and syntactic frames. It separates ontological semantics from lexical features to enable linking terminology to external resources for translation. The model supports representing multilingual labels and relating terms through concepts like narrower/broader. This enables cross-lingual information extraction and search over linked data.
The document discusses how modularity has allowed for increased evolvability in biological and technological systems by separating functions and allowing independent development of modules. It argues that a similar increase in modularity could benefit scientific communication by making more research processes and results transparent earlier through preprint servers, comments, and finer-grained publications. This would reduce wasted effort and increase opportunities for collaboration.
1) The document presents Topologos software, which allows modeling of both processes and objects to address issues with traditional separation of data and process modeling.
2) Traditional techniques model processes separately from data using techniques like data flow diagrams and class diagrams, but this separation makes it difficult to guarantee transparency of both processes and data.
3) Merging object and process modeling solves inheritance problems that occur when an object participates in multiple processes, as attributes depend on an object's location within a process.
Collaborative Construction of Large Biological Ontologies (Jie Bao)
This document discusses challenges with collaborative ontology building and proposes a modular ontology approach using fine-grained packages. Each package represents a fragment of the overall ontology and can be independently developed by different contributors concurrently. A Collaborative Ontology Building (COB) editor is presented to facilitate this package-based collaborative ontology development approach.
ANChor: A powerful approach to scientific communication (Josh Inouye)
In science, the failure to communicate effectively can mean the death of a proposal, the rejection of a paper, or failure to obtain a job. Coalescing the ideas of many experts in cognitive psychology and technical communication, I have created a general framework for scientific communication that I use extensively for scientific presentations, grant proposals, academic papers, and posters which I call ANChor (Assertions, Noise, and Cohesion). I propose that this approach is a powerful way to unify, clarify, and sharpen scientific messages for the benefit of both the audience and the author.
The methodology's effectiveness is evidenced by four presentation awards given to me and my colleagues for work that used this framework.
This document summarizes a PhD student's research on generating natural language explanations of entailments in OWL ontologies to help non-specialists understand and debug ontologies. The research aims to identify common justification patterns and develop an approach to explaining justifications in an accessible way using techniques from proof presentations. A preliminary study identified the most frequent patterns in a corpus of ontologies. The research will further analyze justification patterns and test explanations' effectiveness through user studies.
The document discusses 10 observations about using open access content and summarizing a lecture on the topic. It observes that the amount of scientific literature is increasing rapidly but can only be read fractionally. It notes that open access could change scholarly discourse by making literature freely available. It suggests merging databases and journals for a new learning experience, using semantic enrichment to better integrate content, and utilizing rich media like video to increase discovery rates.
The document discusses the problem of "posterior collapse" in variational autoencoders (VAEs), where the model ignores the latent variable z during training. The authors investigate this from the perspective of training dynamics, finding that the inference network fails to accurately approximate the true posterior distribution early in training, as it is a moving target. As a result, the model learns to ignore the latent encoding. To address this, the authors propose an approach where the inference network is aggressively optimized before each model update, depending on the current mutual information between z and x. Despite adding no new components, this approach avoids posterior collapse on text and image benchmarks, outperforming autoregressive baselines in terms of likelihood.
CDAO presentation.
The idea of the comparative analysis ontology has been presented worldwide, including at NESCent (USA), IGBMC (France), and UFRJ (Brazil). Providing a semantic framework for evolutionary analysis in a high-throughput way after next- and third-generation sequencing is the way to bring evolutionary studies into genome-wide analysis. The Darwinian core of reasoning also allows CDAO to be used with other entities.
Similar to YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology (20)
SEMAC Graph Node Embeddings for Link Prediction (Gerard de Melo)
We present a new graph representation learning approach called SEMAC that jointly exploits fine-grained node features as well as the overall graph topology. In contrast to the SGNS or SVD methods espoused in previous representation-based studies, our model represents nodes in terms of subgraph embeddings acquired via a form of convex matrix completion to iteratively reduce the rank, and thereby, more effectively eliminate noise in the representation. Thus, subgraph embeddings and convex matrix completion are elegantly integrated into a novel link prediction framework.
While traditional scholarship has tended to emphasize thorough reading, reflection, and learning, many researchers nowadays – both in academia and industry – find themselves in a fast-paced and demanding environment. A successful research career crucially depends on management-related skills, and devoting some time to such skills is likely to pay off very quickly. One important example is time and task management, which is critical when there are numerous conflicting demands and opportunities. Another example is being able to cope with challenges and failure. Researchers also need to be creative and bold in defending their ideas. This talk provides an overview of these and other skills that are vital in modern research environments.
Knowlywood: Mining Activity Knowledge from Hollywood Narratives (Gerard de Melo)
Knowlywood is a new knowledge graph mined from movies, TV series, and literature. It provides commonsense knowledge about human activities, e.g. participants, preceding and following activities, and so on.
Learning Multilingual Semantics from Big Data on the Web (Gerard de Melo)
This document summarizes Gerard de Melo's presentation on learning multilingual semantics from big data on the web. It discusses how lexical and taxonomic knowledge can be extracted at large scale from online resources like Wiktionary, Wikipedia, and WordNet. Methods are presented for merging structured data like knowledge graphs and integrating taxonomies across languages using techniques like linear program relaxation and belief propagation. The goal is to build large yet reasonably clean multilingual knowledge bases to power applications in areas like semantic search and the digital humanities.
Big Data is more than just hype. The vast quantities of data now available have led to two important challenges that are fundamentally changing the way we develop data-intensive systems. The first is at the data management level, where we are finally moving beyond vanilla MapReduce towards infrastructure that allows for more flexible data processing pipelines. The second challenge is transitioning from quantity to quality and distilling genuine knowledge from the raw data. For this, we still need innovative algorithms that facilitate data cleaning, unsupervised and semi-supervised learning, knowledge harvesting, and knowledge integration. Examples include data integration, large-scale knowledge bases such as UWN/MENTA, and collections of commonsense knowledge such as WebChild.
Scalable Learning Technologies for Big Data Mining (Gerard de Melo)
These are slides of a tutorial by Gerard de Melo and Aparna Varde presented at the DASFAA 2015 conference.
As data expands into big data, enhanced or entirely novel data mining algorithms often become necessary. The real value of big data is often only exposed when we can adequately mine and learn from it. We provide an overview of new scalable techniques for knowledge discovery. Our focus is on the areas of cloud data mining and machine learning, semi-supervised processing, and deep learning. We also give practical advice for choosing among different methods and discuss open research problems and concerns.
These are slides of a tutorial at ECIR by Gerard de Melo and Katja Hose.
Search is currently undergoing a major paradigm shift away from the traditional document-centric “10 blue links” towards more explicit and actionable information. Recent advances in this area are Google’s Knowledge Graph, Virtual Personal Assistants such as Siri and Google Now, as well as the now ubiquitous entity-oriented vertical search results for places, products, etc. Apart from novel query understanding methods, these developments are largely driven by structured data that is blended into the Web Search experience. We discuss efficient indexing and query processing techniques to work with large amounts of structured data. Finally, we present query interpretation and understanding methods to map user queries to these structured data sources.
From Linked Data to Tightly Integrated Data (Gerard de Melo)
Invited Talk at the 3rd Workshop on Linked Data in Linguistics: Multilingual Knowledge Resources and Natural Language Processing. Reykjavik, Iceland, 27th May 2014
The ideas behind the Web of Linked Data have great allure. Apart from the prospect of large amounts of freely available data, we are also promised nearly effortless interoperability. Common data formats and protocols have indeed made it easier than ever to obtain and work with information from different sources simultaneously, opening up new opportunities in linguistics, library science, and many other areas.
In this talk, however, I argue that the true potential of Linked Data can only be appreciated when extensive cross-linkage and integration engenders an even higher degree of interconnectedness. This can take the form of shared identifiers, e.g. those based on Wikipedia and WordNet, which can be used to describe numerous forms of linguistic and commonsense knowledge. An alternative is to rely on sameAs and similarity links, which can automatically be discovered using scalable approaches like the LINDA algorithm but need to be interpreted with great care, as we have observed in experimental studies. A closer level of linkage is achieved when resources are also connected at the taxonomic level, as exemplified by the MENTA approach to taxonomic data integration. Such integration means that one can buy into ecosystems already carrying a range of valuable pre-existing assets. Even more tightly integrated resources like Lexvo.org combine triples from multiple sources into unified, coherent knowledge bases. Finally, I also comment on how to address some remaining challenges that are still impeding a more widespread adoption of Linked Data on the Web. In the long run, I believe that such steps will lead us to significantly more tightly integrated Linked Data.
Information Extraction from Web-Scale N-Gram Data (Gerard de Melo)
Search engines are increasingly relying on structured data to provide direct answers to certain types of queries. However, extracting such structured data from text is challenging, especially due to the scarcity of explicitly expressed knowledge. Even when relying on large document collections, pattern-based information extraction approaches typically expose only insufficient amounts of information. This paper evaluates to what extent n-gram statistics, derived from volumes of texts several orders of magnitude larger than typical corpora, can allow us to overcome this bottleneck. An extensive experimental evaluation is provided for three different binary relations, comparing different sources of n-gram data as well as different learning algorithms.
UWN: A Large Multilingual Lexical Knowledge Base (Gerard de Melo)
We present UWN, a large multilingual lexical knowledge base that describes the meanings and relationships of words in over 200 languages. This paper explains how link prediction, information integration and taxonomy induction methods have been used to build UWN based on WordNet and extend it with millions of named entities from Wikipedia. We additionally introduce extensions to cover lexical relationships, frame-semantic knowledge, and language data. An online interface provides human access to the data, while a software API enables applications to look up over 16 million words and names.
Multilingual Text Classification using Ontologies (Gerard de Melo)
In this paper, we investigate strategies for automatically classifying documents in different languages thematically, geographically or according to other criteria. A novel linguistically motivated text representation scheme is presented that can be used with machine learning algorithms in order to learn classifications from pre-classified examples and then automatically classify documents that might be provided in entirely different languages. Our approach makes use of ontologies and lexical resources but goes beyond a simple mapping from terms to concepts by fully exploiting the external knowledge manifested in such resources and mapping to entire regions of concepts. For this, a graph traversal algorithm is used to explore related concepts that might be relevant. Extensive testing has shown that our methods lead to significant improvements compared to existing approaches.
Extracting Sense-Disambiguated Example Sentences From Parallel Corpora (Gerard de Melo)
Example sentences provide an intuitive means of grasping the meaning of a word, and are frequently used to complement conventional word definitions. When a word has multiple meanings, it is useful to have example sentences for specific senses (and hence definitions) of that word rather than indiscriminately lumping all of them together. In this paper, we investigate to what extent such sense-specific example sentences can be extracted from parallel corpora using lexical knowledge bases for multiple languages as a sense index. We use word sense disambiguation heuristics and a cross-lingual measure of semantic similarity to link example sentences to specific word senses. From the sentences found for a given sense, an algorithm then selects a smaller subset that can be presented to end users, taking into account both representativeness and diversity. Preliminary results show that a precision of around 80% can be obtained for a reasonable number of word senses, and that the subset selection yields convincing results.
Towards a Universal Wordnet by Learning from Combined Evidence (Gerard de Melo)
Lexical databases are invaluable sources of knowledge about words and their meanings, with numerous applications in areas like NLP, IR, and AI. We propose a methodology for the automatic construction of a large-scale multilingual lexical database where words of many languages are hierarchically organized in terms of their meanings and their semantic relations to other words. This resource is bootstrapped from WordNet, a well-known English-language resource. Our approach extends WordNet with around 1.5 million meaning links for 800,000 words in over 200 languages, drawing on evidence extracted from a variety of resources including existing (monolingual) wordnets, (mostly bilingual) translation dictionaries, and parallel corpora. Graph-based scoring functions and statistical learning techniques are used to iteratively integrate this information and build an output graph. Experiments show that this wordnet has a high level of precision and coverage, and that it can be useful in applied tasks such as cross-lingual text classification.
Not Quite the Same: Identity Constraints for the Web of Linked Data (Gerard de Melo)
Linked Data is based on the idea that information from different sources can flexibly be connected to enable novel applications that individual datasets do not support on their own. This hinges upon the existence of links between datasets that would otherwise be isolated. The most notable form, sameAs links, are intended to express that two identifiers are equivalent in all respects. Unfortunately, many existing ones do not reflect such genuine identity. This study provides a novel method to analyse this phenomenon, based on a thorough theoretical analysis, as well as a novel graph-based method to resolve such issues to some extent. Our experiments on a representative Web-scale set of sameAs links from the Web of Data show that our method can identify and remove hundreds of thousands of constraint violations.
Good, Great, Excellent: Global Inference of Semantic Intensities (Gerard de Melo)
Adjectives like good, great, and excellent are similar in meaning, but differ in intensity. Intensity order information is very useful for language learners as well as in several NLP tasks, but is missing in most lexical resources (dictionaries, WordNet, and thesauri). In this paper, we present a primarily unsupervised approach that uses semantics from Web-scale data (e.g., phrases like good but not excellent) to rank words by assigning them positions on a continuous scale. We rely on Mixed Integer Linear Programming to jointly determine the ranks, such that individual decisions benefit from global information. When ranking English adjectives, our global algorithm achieves substantial improvements over previous work on both pairwise and rank correlation metrics (specifically, 70% pairwise accuracy as compared to only 56% by previous work). Moreover, our approach can incorporate external synonymy information (increasing its pairwise accuracy to 78%) and extends easily to new languages.
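The paper's global model is a Mixed Integer Linear Program; as a much-simplified stand-in, phrase evidence like "good but not excellent" can be read as pairwise order constraints and chained into an intensity scale. The evidence pairs below are invented for illustration:

```python
# Simplified sketch: treat each "X but not Y" pattern as evidence that X is
# weaker than Y, then order the adjectives with a topological sort. The real
# approach jointly optimizes continuous scale positions with MILP, which also
# handles noisy, conflicting evidence; this toy version assumes consistency.
from graphlib import TopologicalSorter

# hypothetical pattern evidence: (weaker, stronger) pairs extracted from
# phrases such as "good but not great"
evidence = [("good", "great"), ("great", "excellent"), ("good", "excellent")]

deps = {}
for weaker, stronger in evidence:
    deps.setdefault(stronger, set()).add(weaker)  # stronger comes after weaker

ranking = list(TopologicalSorter(deps).static_order())  # weakest first
```

A topological sort only yields an ordering; the MILP additionally assigns positions on a continuous scale so that global information can override individual noisy pairs.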
HCL Notes and Domino License Cost Reduction in the World of DLAU (panagenda)
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your costs through an optimized configuration and keep them low going forward.
These topics will be covered:
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc.
- Practical examples and best practices to implement right away
"Choosing proper type of scaling", Olena Syrota (Fwdays)
Imagine an IoT processing system that is already quite mature and production-ready and for which client coverage is growing and scaling and performance aspects are life and death questions. The system has Redis, MongoDB, and stream processing based on ksqldb. In this talk, firstly, we will analyze scaling approaches and then select the proper ones for our system.
Skybuffer SAM4U tool for SAP license adoption (Tatiana Kojar)
Manage and optimize your license adoption and consumption with SAM4U, an SAP free customer software asset management tool.
SAM4U, an SAP complimentary software asset management tool for customers, delivers a detailed and well-structured overview of license inventory and usage with a user-friendly interface. We offer a hosted, cost-effective, and performance-optimized SAM4U setup in the Skybuffer Cloud environment. You retain ownership of the system and data, while we manage the ABAP 7.58 infrastructure, ensuring fixed Total Cost of Ownership (TCO) and exceptional services through the SAP Fiori interface.
Connector Corner: Seamlessly power UiPath Apps, GenAI with prebuilt connectors (DianaGray10)
Join us to learn how UiPath Apps can directly and easily interact with prebuilt connectors via Integration Service--including Salesforce, ServiceNow, Open GenAI, and more.
The best part is you can achieve this without building a custom workflow! Say goodbye to the hassle of using separate automations to call APIs. By seamlessly integrating within App Studio, you can now easily streamline your workflow, while gaining direct access to our Connector Catalog of popular applications.
We’ll discuss and demo the benefits of UiPath Apps and connectors including:
- Creating a compelling user experience for any software, without the limitations of APIs
- Accelerating the app creation process, saving time and effort
- Enjoying high-performance CRUD (create, read, update, delete) operations for seamless data management
Speakers:
Russell Alfeche, Technology Leader, RPA at qBotic and UiPath MVP
Charlie Greenberg, host
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe (Precisely)
Inconsistent user experience and siloed data, high costs, and changing customer expectations – Citizens Bank was experiencing these challenges while it was attempting to deliver a superior digital banking experience for its clients. Its core banking applications run on the mainframe and Citizens was using legacy utilities to get the critical mainframe data to feed customer-facing channels, like call centers, web, and mobile. Ultimately, this led to higher operating costs (MIPS), delayed response times, and longer time to market.
Ever-changing customer expectations demand more modern digital experiences, and the bank needed to find a solution that could provide real-time data to its customer channels with low latency and operating costs. Join this session to learn how Citizens is leveraging Precisely to replicate mainframe data to its customer channels and deliver on their “modern digital bank” experiences.
Introduction of Cybersecurity with OSS at Code Europe 2024 (Hiroshi SHIBATA)
I develop the Ruby programming language, RubyGems, and Bundler, which are package managers for Ruby. Today, I will introduce how to enhance the security of your application using open-source software (OSS) examples from Ruby and RubyGems.
The first topic is CVE (Common Vulnerabilities and Exposures). I have published CVEs many times. But what exactly is a CVE? I'll provide a basic understanding of CVEs and explain how to detect and handle vulnerabilities in OSS.
Next, let's discuss package managers. Package managers play a critical role in the OSS ecosystem. I'll explain how to manage library dependencies in your application.
I'll share insights into how the Ruby and RubyGems core team works to keep our ecosystem safe. By the end of this talk, you'll have a better understanding of how to safeguard your code.
Main news related to the CCS TSI 2023 (2023/1695) (Jakub Marek)
An English 🇬🇧 translation of the presentation for the speech I gave about the main changes introduced by CCS TSI 2023 at the biggest Czech conference on communications and signalling systems on railways, held at the Clarion Hotel Olomouc from 7th to 9th November 2023 (konferenceszt.cz). It was attended by around 500 participants and 200 online followers.
The original Czech 🇨🇿 version of the presentation can be found here: https://www.slideshare.net/slideshow/hlavni-novinky-souvisejici-s-ccs-tsi-2023-2023-1695/269688092 .
The videorecording (in Czech) from the presentation is available here: https://youtu.be/WzjJWm4IyPk?si=SImb06tuXGb30BEH .
Programming Foundation Models with DSPy - Meetup Slides (Zilliz)
Prompting language models is hard, while programming language models is easy. In this talk, I will discuss the state-of-the-art framework DSPy for programming foundation models with its powerful optimizers and runtime constraint system.
Dandelion Hashtable: beyond billion requests per second on a commodity server (Antonios Katsarakis)
This slide deck presents DLHT, a concurrent in-memory hashtable. Despite optimization efforts that go as far as sacrificing core functionality, state-of-the-art hashtable designs still incur multiple memory accesses per request and block request processing in three cases. First, most hashtables block while waiting for data to be retrieved from memory. Second, open-addressing designs, which represent the current state of the art, either cannot free index slots on deletes or must block all requests to do so. Third, index resizes block every request until all objects are copied to the new index. Defying folklore wisdom, DLHT forgoes open addressing and adopts a fully-featured and memory-aware closed-addressing design based on bounded cache-line chaining. This design (1) offers lock-free index operations and deletes that free slots instantly, (2) completes most requests with a single memory access, (3) utilizes software prefetching to hide memory latencies, and (4) employs a novel non-blocking and parallel resizing. On a commodity server with a memory-resident workload, DLHT surpasses 1.6B requests per second and provides 3.5x (12x) the throughput of the state-of-the-art closed-addressing (open-addressing) resizable hashtable on Gets (Deletes).
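To illustrate the closed-addressing design family the deck argues for, here is a toy chained hashtable in which a delete frees its slot immediately, with no tombstones or blocking. This is nothing like DLHT itself (no lock-freedom, no cache-line-bounded chains, no prefetching); it only shows the chaining idea that open addressing gives up:

```python
# Toy closed-addressing (chained) hashtable: each bucket holds a small chain
# of (key, value) entries, so deletes free their slot instantly.
class ChainedHashTable:
    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]

    def _bucket(self, key):
        return self.buckets[hash(key) % len(self.buckets)]

    def put(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default

    def delete(self, key):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket.pop(i)  # slot freed immediately, no tombstone needed
                return True
        return False
```

In an open-addressing table, the same delete would either leave a tombstone behind or require blocking other requests to compact the probe sequence, which is exactly the trade-off the abstract describes.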
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency (ScyllaDB)
Freshworks creates AI-boosted business software that helps employees work more efficiently and effectively. Managing data across multiple RDBMS and NoSQL databases was already a challenge at their current scale. To prepare for 10X growth, they knew it was time to rethink their database strategy. Learn how they architected a solution that would simplify scaling while keeping costs under control.
Driving Business Innovation: Latest Generative AI Advancements & Success Story (Safe Software)
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Conversational agents, or chatbots, are increasingly used to access all sorts of services using natural language. While open-domain chatbots - like ChatGPT - can converse on any topic, task-oriented chatbots - the focus of this paper - are designed for specific tasks, like booking a flight, obtaining customer support, or setting an appointment. Like any other software, task-oriented chatbots need to be properly tested, usually by defining and executing test scenarios (i.e., sequences of user-chatbot interactions). However, there is currently a lack of methods to quantify the completeness and strength of such test scenarios, which can lead to low-quality tests, and hence to buggy chatbots.
To fill this gap, we propose adapting mutation testing (MuT) for task-oriented chatbots. To this end, we introduce a set of mutation operators that emulate faults in chatbot designs, an architecture that enables MuT on chatbots built using heterogeneous technologies, and a practical realisation as an Eclipse plugin. Moreover, we evaluate the applicability, effectiveness and efficiency of our approach on open-source chatbots, with promising results.
How to Interpret Trends in the Kalyan Rajdhani Mix Chart (Chart Kalyan)
A Mix Chart displays historical data of numbers in a graphical or tabular form. The Kalyan Rajdhani Mix Chart specifically shows the results of a sequence of numbers over different periods.
Taking AI to the Next Level in Manufacturing (ssuserfac0301)
Read Taking AI to the Next Level in Manufacturing to gain insights on AI adoption in the manufacturing industry, such as:
1. How quickly AI is being implemented in manufacturing.
2. Which barriers stand in the way of AI adoption.
3. How data quality and governance form the backbone of AI.
4. Organizational processes and structures that may inhibit effective AI adoption.
5. Ideas and approaches to help build your organization's AI strategy.
YAGO-SUMO: Integrating YAGO into the Suggested Upper Merged Ontology
G. de Melo1, F. Suchanek1, A. Pease2
1: Max Planck Institute for Informatics, Germany
2: Articulate Software, USA
2008-11-03
G. de Melo, F. Suchanek, A. Pease: Integrating YAGO into the Suggested Upper Merged Ontology
Outline
1 Introduction
Ontologies and KBs
SUMO
Extending Ontologies
YAGO
2 Approach
Incorporation
Class Information
Statements
3 Conclusion
Ongoing Work
Summary
Introduction
Ontologies/KBs: provide background knowledge for intelligent applications
Schism: formal ontologies vs. large KBs
Goal: large-scale formal ontology
formal ontologies: complex axioms (e.g. in FOL), but quite small
large-scale KBs (e.g. based on Wikipedia): only simple facts
combine the best of both worlds!
SUMO
Suggested Upper Merged Ontology
open source
based on KIF rather than e.g. OWL
large formal ontology (20,000 terms, 70,000 axioms)
axiomatization of general and domain-specific concepts
for applications requiring basic “common sense”
origins: IEEE standard upper ontology group
core owned by IEEE (basically Public Domain), portions GPL
e.g.: OpenCyc doesn’t include axioms of commercial Cyc
peer review, community of experts and users
formal verification with ATP systems
OWL without additional rules is not very expressive
KIF variant standardized as ISO/IEC IS 24707:2007 (Common Logic)
SUMO Example
(=>
(and
(parent ?CHILD ?PARENT)
(subclass ?CLASS Organism)
(instance ?PARENT ?CLASS))
(instance ?CHILD ?CLASS))
This implies, for example, that a child of a Human is also a Human.
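The effect of this axiom can be mimicked with a few lines of forward chaining. The following Python sketch is not SUMO's actual reasoner; the facts, the toy subclass table, and the person names are purely illustrative:

```python
# Minimal forward-chaining sketch of the parent/Organism axiom above.
# The subclass table and facts are illustrative, not taken from SUMO.
subclass_of = {"Human": "Organism", "Organism": "Entity"}  # class -> parent class

def is_subclass(cls, ancestor):
    """Walk the (toy) subclass chain upwards."""
    while cls is not None:
        if cls == ancestor:
            return True
        cls = subclass_of.get(cls)
    return False

def apply_parent_axiom(instances, parents):
    """If (parent ?CHILD ?PARENT) and (instance ?PARENT ?CLASS) with
    ?CLASS a subclass of Organism, infer (instance ?CHILD ?CLASS)."""
    inferred = dict(instances)
    for child, parent in parents:
        for cls in instances.get(parent, set()):
            if is_subclass(cls, "Organism"):
                inferred.setdefault(child, set()).add(cls)
    return inferred

instances = {"ImmanuelKant": {"Human"}}
parents = [("JohannKant", "ImmanuelKant")]  # (child, parent) pairs
result = apply_parent_axiom(instances, parents)
print(result["JohannKant"])  # {'Human'}
```

The rule fires only below Organism in the hierarchy, exactly as the `(subclass ?CLASS Organism)` condition requires.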
additional domain ontologies
however, SUMO is mainly an upper ontology
not enough instances and ground facts
e.g. for geography, finance, transportation
e.g. people, cities, books
Extending Ontologies: Possible Approaches
Manual work
slow process, low coverage
Semantic Wikis not yet accepted enough
Information extraction from corpora / the Web
low accuracy
not canonical / not in line with the upper ontology
Import from existing databases
feasible, but not universal enough
YAGO
combine entities and facts from Wikipedia with an upper ontology
excellent coverage: around 2 million entities
millions of facts about them
high quality: e.g. birth dates of people, location of cities
original YAGO: WordNet for the upper level
mainly a lexical knowledge base
e.g. hyponymic relationships do not strictly imply subsumptions
lack of formal axioms
New goal: integrate with SUMO, so the class information actually is meaningful
Incorporation
Idea: most Wikipedia articles become new entities
Semi-automatic matching: although SUMO contains only few instances, some degree of overlap exists
use weighted string similarity measure
additional manual validation
→ equivalence table
Entity Generation: produce a new unique term name for each Wikipedia article not listed in the equivalence table, subject to the following desiderata:
prevent clashes with SUMO or other entities
conciseness
abide by KIF syntax (Wikipedia uses Unicode)
must be a proper entity (not: “List of ...”)
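These desiderata can be illustrated with a small Python sketch. The exact rules of the YAGO-SUMO pipeline differ; the heuristics below (NFKD transliteration, CamelCasing, numeric suffixes for clashes) are assumptions chosen only to demonstrate the idea:

```python
import re
import unicodedata

def make_term_name(title, existing):
    """Sketch of generating a unique, KIF-safe term name for a Wikipedia
    article title. Not the actual YAGO-SUMO rules; illustrative only."""
    if title.startswith("List of"):
        return None  # not a proper entity
    # Reduce Unicode to ASCII (KIF term names) and drop unsafe characters.
    ascii_title = unicodedata.normalize("NFKD", title).encode("ascii", "ignore").decode()
    words = re.findall(r"[A-Za-z0-9]+", ascii_title)
    if not words:
        return None
    name = "".join(w[0].upper() + w[1:] for w in words)  # concise CamelCase
    # Prevent clashes with SUMO terms or previously generated names.
    candidate, n = name, 2
    while candidate in existing:
        candidate = f"{name}{n}"
        n += 1
    existing.add(candidate)
    return candidate

existing = {"ImmanuelKant"}          # pretend this term already exists
print(make_term_name("Immanuel Kant", existing))   # ImmanuelKant2
print(make_term_name("Olomouc (city)", existing))  # OlomoucCity
print(make_term_name("List of rivers", existing))  # None
```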
Class Information
YAGO: From Wikipedia to WordNet
goal: each entity should have class membership information
use Wikipedia category system, however cannot use it directly
first link categories to WordNet, then map to SUMO
requirement: distinguish thematic categories from categories encoding class membership
categorization not transitive
members of subcategories often unrelated to parent category
check WordNet for premodifier + headword, or headword only
disambiguate using frequency information
result: relationship to WordNet-derived class
e.g. “American singers of German origin” becomes linked as a subclass of the WordNet-derived class Person
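A toy version of this headword step can be written in a few lines of Python. YAGO uses a proper noun-group parser and real WordNet sense frequencies; the stop-word list and the tiny sense inventory below are illustrative assumptions:

```python
def category_headword(category):
    """Toy headword heuristic for a Wikipedia category name: strip the
    modifier introduced by 'of'/'in'/'from'/'by', then take the last
    remaining word, naively singularized. Illustrative only."""
    words = category.split()
    for stop in ("of", "in", "from", "by"):
        if stop in words:
            words = words[: words.index(stop)]
            break
    if not words:
        return None
    head = words[-1].lower()
    return head[:-1] if head.endswith("s") else head

# Toy sense inventory: headword -> candidate classes, most frequent first.
senses = {"singer": ["Person"], "bank": ["FinancialInstitution", "Slope"]}

def category_class(category):
    """Link a category to a class via its headword, disambiguating by
    frequency (pick the most frequent sense)."""
    cls = senses.get(category_headword(category))
    return cls[0] if cls else None

print(category_class("American singers of German origin"))  # Person
print(category_class("Banks of Germany"))                   # FinancialInstitution
```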
Voting Procedure
problem:
regular polysemy: Wikipedia articles simultaneously cover several metonymically related senses
e.g. Brown University is both a College and a GroupOfPeople
will cause inconsistencies when the axioms are added
solution:
look at top-level branches for each proposed class (locations, artifacts, abstract entities, etc.)
voting procedure to determine the most salient branch (ties broken arbitrarily)
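The voting step amounts to a majority count over top-level branches. A minimal Python sketch, with an illustrative branch table that does not reflect the real SUMO hierarchy:

```python
from collections import Counter

def most_salient_branch(proposed_classes, branch_of):
    """Map each class proposed for an article to its top-level branch and
    pick the branch with the most votes (ties broken arbitrarily, matching
    the slide). Branch assignments here are illustrative."""
    votes = Counter(branch_of[c] for c in proposed_classes if c in branch_of)
    if not votes:
        return None
    return votes.most_common(1)[0][0]

# Illustrative top-level branch table (not the real SUMO hierarchy).
branch_of = {
    "College": "Agent", "GroupOfPeople": "Agent",
    "Building": "Artifact", "Campus": "Region",
}
# Brown University proposed as College, GroupOfPeople, and Building:
print(most_salient_branch(["College", "GroupOfPeople", "Building"], branch_of))
# Agent wins with 2 votes against 1, so only Agent-branch classes are kept.
```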
From WordNet to SUMO
in further cases, the mappings yield a property or relation
→ create new WordNet-based class, add axioms of the form
(=>
(instance ?ENTITY Guitarist)
(property ?ENTITY Musician))
Then recursively move up WordNet’s class hierarchy, adding parent classes, until a genuine parent class in SUMO is available.
Statements
Information Extraction
YAGO uses manual rules and heuristics to extract information about entities from Wikipedia pages
mainly based on categories and infoboxes, not on article text, e.g. geographical location, spouse, etc.
manual rewriting rules to express facts using SUMO’s terms
sample evaluation: for each relation, at least 95% of the statements are accurate
SUMO Integration
mapping rules
new relations added to SUMO when necessary
incl. additional rules for reasoning
(instance establishedOnDate BinaryRelation)
(domain establishedOnDate 1 Agent)
(domain establishedOnDate 2 TimeInterval)
(=> (establishedOnDate ?OBJ ?TIME)
(exists (?FOUNDING)
(and (instance ?FOUNDING Founding)
(result ?FOUNDING ?OBJ)
(overlapsTemporally (WhenFn ?FOUNDING) ?TIME))))
Statements with Literals
proper encoding of literals with units:
e.g. (MeasureFn 3.0 SquareMeter)
date ranges are recast:
(exists (?DAYNO ?MONTHNO ?YEARNO)
(and
(birthdate HerveyDeStanton
(DayFn ?DAYNO
(MonthFn ?MONTHNO
(YearFn ?YEARNO))))
(greaterThanOrEqualTo ?YEARNO 1270)
(lessThanOrEqualTo ?YEARNO 1279)))
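Emitting such a statement for a year range is pure string templating. A Python sketch under the assumption that only the year is uncertain (function name and signature are my own, not from the YAGO-SUMO code):

```python
def birthdate_range_axiom(entity, year_lo, year_hi):
    """Render the KIF pattern from the slide for a birth date known only
    up to a range of years. Purely string templating, no reasoning."""
    return (
        "(exists (?DAYNO ?MONTHNO ?YEARNO)\n"
        "  (and\n"
        f"    (birthdate {entity}\n"
        "      (DayFn ?DAYNO\n"
        "        (MonthFn ?MONTHNO\n"
        "          (YearFn ?YEARNO))))\n"
        f"    (greaterThanOrEqualTo ?YEARNO {year_lo})\n"
        f"    (lessThanOrEqualTo ?YEARNO {year_hi})))"
    )

print(birthdate_range_axiom("HerveyDeStanton", 1270, 1279))
```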
Additional Grounding
statements of the form
(representsInLanguage
"Immanuel Kant" ImmanuelKant EnglishLanguage)
produce a greater level of formal grounding of the semantics of term names
when names are ambiguous, providing such symbolic strings for multiple languages can further reduce the range of possible interpretations
classes are better specified due to their extensional characterization
Summary
SUMO: axiomatic representation of common sense knowledge, but lack of simple encyclopedic facts
YAGO methodology: add entities and statements about them from Wikipedia
semi-automatic techniques, basic amount of manual work
→ formal ontology with around two million entities and several million statements and axioms
SUMO is catapulted from an upper-level ontology to a full-fledged all-purpose KB
Open source, available online: http://www.demelo.org/yagosumo/