This tutorial provides detailed instruction on creating and using formalized ontologies derived from linked open data for advanced knowledge discovery, including consistency checking and answering sophisticated questions.
Automated reasoning in OWL offers the tantalizing possibility of undertaking advanced knowledge discovery, including verifying the consistency of conceptual schemata in information systems, verifying data integrity, and answering expressive queries over the conceptual schema and the data. Given that a large amount of structured knowledge is now available as linked data, the challenge is to formalize this knowledge so that the intended semantics become explicit and the reasoning is efficient and scalable. While using the full expressiveness of OWL 2 yields ontologies that can be used for consistency verification, classification and query answering, the use of less expressive OWL profiles enables efficient reasoning and supports different application scenarios. In this tutorial, we:
- describe how to generate OWL ontologies from linked data
- check the consistency of the resulting knowledge (a minimal sketch follows the lists below)
- automatically transform ontologies into OWL profiles
- use this knowledge in applications to integrate data and answer sophisticated questions across domains
Attendees will learn that:
- expressive ontologies enable data integration, consistency verification and question answering
- formalization of linked data creates new opportunities for knowledge discovery
- OWL 2 profiles support more efficient reasoning and query answering procedures
- recent technology facilitates the automatic conversion of OWL 2 ontologies into profiles
- OWL ontologies can dramatically extend the functionality of semantically enabled web sites
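The consistency-checking step can be sketched with owlready2 and its bundled HermiT reasoner. This is a minimal illustration under stated assumptions, not the tutorial's prescribed toolchain; the ontology IRI is a hypothetical placeholder, and a Java runtime is required for HermiT.

```python
# pip install owlready2  (HermiT ships with it; Java must be installed)
from owlready2 import (get_ontology, sync_reasoner, default_world,
                       OwlReadyInconsistentOntologyError)

# Hypothetical IRI: substitute an ontology generated from linked data.
onto = get_ontology("http://example.org/generated.owl").load()

try:
    with onto:
        sync_reasoner()  # classify the ontology with HermiT
    # Unsatisfiable classes signal modelling errors even when the
    # ontology as a whole remains consistent.
    for cls in default_world.inconsistent_classes():
        print("Unsatisfiable:", cls)
except OwlReadyInconsistentOntologyError:
    print("The ontology (schema + data) is inconsistent.")
```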
Data integration is a perennial challenge facing large-scale data scientists. Bio-ontologies are useful in this endeavour as sources of synonyms and also for rule-based fuzzy integration pipelines.
Semantic data integration is the process of using a conceptual representation of the data, and of their relationships, to eliminate possible heterogeneities.
Taxonomy extraction from automotive natural language requirements using unsup... (ijnlc)
In this paper we present a novel approach to semi-automatically learn concept hierarchies from natural language requirements of the automotive industry. The approach is based on the distributional hypothesis and the special characteristics of domain-specific German compounds. We extract taxonomies by using clustering techniques in combination with general thesauri. Such a taxonomy can be used to support requirements engineering in early stages by providing a common system understanding and an agreed-upon terminology. This work is part of an ontology-driven requirements engineering process, which builds on top of the taxonomy. Evaluation shows that this taxonomy extraction approach outperforms common hierarchical clustering techniques.
The Statement of Conjunctive and Disjunctive Queries in Object Oriented Datab... (Editor IJCATR)
With the arrival of object orientation in databases, relational databases have gradually been replaced by object-oriented databases in various fields. In parallel, several methods have been proposed to handle the uncertain data of the real world. One such approach to database modeling couples object-oriented database modeling with fuzzy logic. Many of the queries users pose are expressed in terms of linguistic variables; because classical databases cannot support these variables, fuzzy approaches are considered. In this study we investigate database queries in both simple and complex forms, using conjunctive and disjunctive queries for the complex case. We then use XML labels to express the queries in fuzzy form; entering the XML world also offers a reliable way to communicate with other parts of the software. We further refine conjunctive and disjunctive queries over a fuzzy object-oriented database using the concepts of dependency measure and weight, where weights are assigned to the different phrases of a query according to the user's emphasis. Another aim of this work is to map fuzzy queries to fuzzy XML, so that queries are simple to implement and their results come closer to users' needs and expectations. The results show that the proposed method expresses the possible conjunctive and disjunctive queries over the database in the form of fuzzy XML.
Different Semantic Perspectives for Question Answering Systems (Andre Freitas)
Question Answering systems define one of the most complex tasks in computational semantics. The intrinsic complexity of the QA task allows researchers of QA systems to investigate and explore different perspectives on semantics. However, this complexity also induces a bias towards a systems perspective, in which researchers are kept from reasoning more deeply about the semantic principles at work within the different components of the system. In this talk we will explore the semantic challenges, principles and perspectives behind the components of QA systems, aiming to provide a principled map and overview of the contribution of each component to the QA semantic interpretation goal.
Application of Ontology in Semantic Information Retrieval by Prof Shahrul Azm... (Khirulnizam Abd Rahman)
Application of Ontology in Semantic Information Retrieval
by Prof Shahrul Azman from FSTM, UKM
Presentation for MyREN Seminar 2014
Berjaya Hotel, Kuala Lumpur
27 November 2014
A Semi-Automatic Ontology Extension Method for Semantic Web Services (IDES Editor)
This paper provides a novel semi-automatic ontology extension method for Semantic Web Services (SWS). This is significant since the ontology extension methods existing in the literature mostly deal with the semantic description of static Web resources such as text documents; hence, there is a need for methods that can serve dynamic Web resources such as SWS. The method developed in this paper avoids redundancy and respects consistency, so as to assure the high quality of the resulting shared ontologies.
Formal and Computational Representations
The Semantics of First-Order Logic
Event Representations
Description Logics & the Web Ontology Language
Compositionality
Lambda calculus
Corpus-based approaches:
Latent Semantic Analysis
Topic models
Distributional Semantics
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study (Andre Freitas)
The growing size, heterogeneity and complexity of databases demand the creation of strategies to help users and systems consume data. Ideally, query mechanisms should be schema-agnostic or vocabulary-independent, i.e. they should be able to match user queries, in the user's own vocabulary and syntax, to the data, abstracting data consumers from the representation of the data. Despite being a central requirement across natural language interfaces and entity search, there is a lack of conceptual analysis of schema-agnosticism and of the associated semantic differences between queries and databases. This work provides an initial conceptualization of schema-agnostic queries and a fine-grained classification which can support the scoping, evaluation and development of semantic matching approaches for schema-agnostic queries.
How hard is this Query? Measuring the Semantic Complexity of Schema-agnostic ... (Andre Freitas)
The growing size, heterogeneity and complexity of databases demand the creation of strategies to help users and systems consume data. Ideally, query mechanisms should be schema-agnostic, i.e. they should be able to match user queries, in the user's own vocabulary and syntax, to the data, abstracting data consumers from the representation of the data. This work provides an information-theoretical framework to evaluate the semantic complexity involved in query-database communication under a schema-agnostic query scenario. Different entropy measures are introduced to quantify the semantic phenomena involved in user-database communication, including structural complexity, ambiguity, synonymy and vagueness. The entropy measures are validated using natural language queries over Semantic Web databases. The analysis of semantic complexity improves the understanding of the core semantic dimensions of the query-data matching process, supports the design of schema-agnostic query mechanisms, and defines measures which can be used to assess the semantic uncertainty, or difficulty, behind a schema-agnostic querying task.
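The paper's own entropy measures are not reproduced in this abstract; purely as an illustration of the idea, the sketch below computes a Shannon-entropy ambiguity score over a query term's candidate schema matches, with invented probabilities:

```python
import math

def ambiguity_entropy(match_probs):
    # Shannon entropy (bits) over a term's candidate schema matches;
    # higher entropy means a more ambiguous query term.
    return -sum(p * math.log2(p) for p in match_probs if p > 0)

# A term matching one predicate almost surely: low ambiguity.
print(ambiguity_entropy([0.9, 0.05, 0.05]))  # ~0.57 bits
# A term matching four predicates uniformly: high ambiguity.
print(ambiguity_entropy([0.25] * 4))         # 2.0 bits
```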
ONTOLOGICAL MODEL FOR CHARACTER RECOGNITION BASED ON SPATIAL RELATIONS (sipij)
In this paper, we present a set of spatial relations between concepts, describing an ontological model for a new character recognition process. Our main idea is based on the construction of a domain ontology modelling the Latin script. This ontology is composed of a set of concepts and a set of relations. The concepts represent the graphemes extracted by segmenting the manipulated document, and the relations are of two types: is-a relations and spatial relations. In this paper we are interested in the description of the second type of relations and their implementation in Java code.
Concept hierarchy is the backbone of an ontology, and concept hierarchy acquisition has been a hot topic in the field of ontology learning. This paper proposes a hyponymy extraction method for domain ontology concepts based on cascaded conditional random fields (CCRFs) and hierarchical clustering. It takes free text as the extraction object and adopts CCRFs to identify the domain concepts: first the low layer of the CCRFs is used to identify simple domain concepts, then the results are sent to the high layer, in which nested concepts are recognized. Hierarchical clustering is then used to identify the hyponymy relations between domain ontology concepts. The experimental results demonstrate that the proposed method is efficient.
Schema-agnostic queries over large-schema databases: a distributional semanti... (Andre Freitas)
The evolution of data environments towards growth in the size, complexity, dynamicity and decentralisation (SCoDD) of schemas drastically impacts contemporary data management. The SCoDD trend emerges as a central data management concern in Big Data scenarios, where users and applications demand more complete data, produced by independent data sources, under different semantic assumptions and contexts of use. Most Database Management Systems (DBMSs) today target a closed communication scenario, where the symbolic schema of the database is known a priori by the database user, who is able to interpret it in an unambiguous way; the context in which the data is consumed and produced is well defined and is typically the same context in which the data was created. In contrast, data management under SCoDD conditions targets an open communication scenario, where the symbolic system of the database is unknown to the user and multiple interpretation contexts are possible; the database can be created under a context different from the database user's. The emergence of this new data environment demands revisiting the semantic assumptions behind databases and designing data access mechanisms which can support semantically heterogeneous (open communication) data environments.
This work aims at filling this gap by proposing a complementary semantic model for databases, based on distributional semantic models. Distributional semantics provides a complementary perspective to the formal perspective of database semantics and supports semantic approximation as a first-class database operation. Unlike models that describe uncertain and incomplete data, or probabilistic databases, distributional-relational models focus on the construction of conceptual approximation approaches for databases, supported by a comprehensive semantic model automatically built from large-scale unstructured data external to the database, which serves as a semantic/commonsense knowledge base. The semantic model can be used to support schema-agnostic queries, i.e. to abstract the data consumer from the specific conceptualization behind the data.
The proposed distributional-relational semantic model is supported by a distributional structured vector space model, named τ-Space, which represents structured data under a distributional semantic representation and which, in coordination with a query planning approach, supports a schema-agnostic query mechanism for large-schema databases. The query mechanism is materialized in the Treo query engine and is evaluated using schema-agnostic natural language queries.
The evaluation of the query mechanism confirms that distributional semantics provides a high-recall, medium-high-precision and low-maintenance solution to cope with the abstraction and conceptual-level differences in schema-agnostic queries over large-schema and schema-less open-domain datasets.
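The τ-Space model itself is not reproduced here; the toy sketch below only illustrates the general mechanism of distributional matching, ranking schema elements by vector similarity to the user's vocabulary, with hand-made vectors standing in for corpus-derived ones:

```python
import numpy as np

def cosine(u, v):
    return float(np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v)))

# Hypothetical distributional vectors; in practice these are built from
# large-scale unstructured corpora external to the database.
vectors = {
    "spouse":     np.array([0.9, 0.1, 0.0]),
    "married to": np.array([0.8, 0.2, 0.1]),
    "birthplace": np.array([0.0, 0.9, 0.4]),
}

def best_schema_match(query_term, schema_terms):
    # Semantic approximation as a first-class operation: pick the schema
    # element distributionally closest to the user's wording.
    return max(schema_terms,
               key=lambda s: cosine(vectors[query_term], vectors[s]))

print(best_schema_match("married to", ["spouse", "birthplace"]))  # spouse
```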
Language Combinatorics: A Sentence Pattern Extraction Architecture Based on C... (Waqas Tariq)
A "sentence pattern" in modern Natural Language Processing is often taken to be a contiguous string of words (an n-gram). However, in many branches of linguistics, such as Pragmatics and Corpus Linguistics, it has been noticed that simple n-gram patterns are not sufficient to reveal the full sophistication of grammar patterns. We present a language-independent architecture for extracting more sophisticated patterns than n-grams from sentences. In this architecture a "sentence pattern" is an n-element ordered combination of sentence elements. Experiments showed that the method extracts significantly more frequent patterns than the usual n-gram approach.
An Entity-Driven Recursive Neural Network Model for Chinese Discourse Coheren... (ijaia)
Chinese discourse coherence modeling remains a challenging task in the Natural Language Processing field. Existing approaches mostly focus on feature engineering, adopting sophisticated features to capture the logical, syntactic or semantic relationships across sentences within a text. In this paper, we present an entity-driven recursive deep model for Chinese discourse coherence evaluation, based on a current English discourse coherence neural network model. Specifically, to overcome that model's shortcomings in identifying entity (noun) overlap across sentences, our combined model incorporates entity information into the recursive neural network framework. Evaluation results on both a sentence ordering and a machine translation coherence rating task show the effectiveness of the proposed model, which significantly outperforms a strong existing baseline.
The increased potential of ontologies to reduce human interference has a wide range of applications. This paper identifies requirements for an ontology development platform to support an artificially intelligent web. To facilitate this process, RDF and OWL have been developed as standard formats for the sharing and integration of data and knowledge, the latter in the form of rich conceptual schemas called ontologies. Based on this framework, an architectural paradigm is put forward for ontology engineering and the development of ontology applications, together with a development portal designed to support ontology engineering, content authoring and application development, with a view to maximal scalability in the size and complexity of semantic knowledge and flexible reuse of ontology models and ontology application processes in a distributed and collaborative engineering environment.
The logic-based machine-understandable framework of the Semantic Web often challenges naive users when they try to query ontology-based knowledge bases. Existing research efforts have approached this problem by introducing Natural Language (NL) interfaces to ontologies, which can construct SPARQL queries from NL user queries. However, most efforts were restricted to queries expressed in English, often benefiting from the advancement of English NLP tools; little research has been done to support querying Arabic content on the Semantic Web with NL queries. This paper presents a domain-independent approach to translate Arabic NL queries to SPARQL by leveraging linguistic analysis. With special consideration of Noun Phrases (NPs), our approach uses a language parser to extract NPs and relations from Arabic parse trees and match them to the underlying ontology. It then utilizes knowledge in the ontology to group NPs into triple-based representations. A SPARQL query is finally generated by extracting targets and modifiers and interpreting them into SPARQL. The interpretation of advanced semantic features, including negation and conjunctive and disjunctive modifiers, is also supported. The approach was evaluated using two datasets consisting of OWL test data and queries, and the obtained results confirm its feasibility for translating Arabic NL queries to SPARQL.
The purpose of properties is to enable inference: given all the explicit information that has been modeled, what information can be implied?
RDFS provides a very limited set of inference capabilities. The Web Ontology Language (OWL) provides more elaborate constraints on how information can be specified; a subset of these constraints is discussed in this presentation.
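To make the contrast concrete, here is a small sketch using rdflib with the owlrl package. Plain RDFS entailment has no rule for owl:inverseOf, while the OWL 2 RL closure derives the inverse triple; the example data are hypothetical:

```python
# pip install rdflib owlrl
from rdflib import Graph
import owlrl

g = Graph()
g.parse(data="""
    @prefix ex:  <http://example.org/> .
    @prefix owl: <http://www.w3.org/2002/07/owl#> .
    ex:hasChild owl:inverseOf ex:hasParent .
    ex:alice ex:hasParent ex:bob .
""", format="turtle")

ask = "PREFIX ex: <http://example.org/> ASK { ex:bob ex:hasChild ex:alice }"
print(g.query(ask).askAnswer)  # False: the triple is only implicit

# Materialize the OWL 2 RL closure; RDFS semantics alone would not
# apply the owl:inverseOf axiom.
owlrl.DeductiveClosure(owlrl.OWLRL_Semantics).expand(g)
print(g.query(ask).askAnswer)  # True: the inverse triple was inferred
```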
Formalization and implementation of BFO 2 with a focus on the OWL implementation (golpedegato2)
Formalization and implementation of Basic Formal Ontology 2 with a focus on the OWL implementation.
With an introduction to some of the underlying technologies
USING RELATIONAL MODEL TO STORE OWL ONTOLOGIES AND FACTS (csandit)
The storage and processing of OWL instances are important subjects in database modeling, and much research has focused on managing OWL instances efficiently. Some systems store and manage OWL instances using relational models to ensure their persistence. Nevertheless, several approaches keep only RDF triples as instances in relational tables, and the way instances are structured as a graph, with links between concepts, is not taken into account. In this paper, we propose an architecture that permits relational tables to behave as an OWL model by adapting relational tables to OWL instances and an OWL hierarchy structure. Two kinds of tables are used: instance (fact) tables, which hold the instances, and an OWLtable, which holds a specification of how the concepts are structured; instance tables must conform to the OWLtable to be valid. A mechanism for constructing the OWLtable and instance tables is defined in order to enable and enhance inference and semantic querying of OWL in a relational model context.
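As a rough illustration of the two-table idea (with hypothetical table and column names, since the paper's exact schema is not given in the abstract), an in-memory SQLite sketch:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()

# OWLtable: specifies how the concepts are structured (here, is-a links).
cur.execute("CREATE TABLE owltable (concept TEXT PRIMARY KEY, parent TEXT)")
cur.executemany("INSERT INTO owltable VALUES (?, ?)",
                [("Person", None), ("Student", "Person")])

# Instance table: holds individuals; to be valid, each row must name a
# concept declared in OWLtable.
cur.execute("CREATE TABLE instances (id TEXT PRIMARY KEY, "
            "concept TEXT REFERENCES owltable(concept))")
cur.execute("INSERT INTO instances VALUES ('alice', 'Student')")

# Semantic querying over the stored hierarchy: all instances of Person,
# including those typed by a subconcept.
cur.execute("""
    WITH RECURSIVE sub(c) AS (
        SELECT 'Person'
        UNION
        SELECT o.concept FROM owltable o JOIN sub ON o.parent = sub.c)
    SELECT id FROM instances WHERE concept IN (SELECT c FROM sub)""")
print(cur.fetchall())  # [('alice',)]
```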
2010 CASCON - Towards an integrated network of data and services for the life ... (Michel Dumontier)
Towards an integrated network of data and services for the life sciences. Modern biological knowledge discovery requires access to machine-understandable data that can be searched, retrieved, and subsequently analyzed using a wide array of analytical software and services. The Semantic Automated Discovery and Integration (SADI) framework is a set of conventions for formalizing web service inputs and outputs using OWL ontologies, enabling the automatic discovery and invocation of Semantic Web services. In this talk, I will walk through a worked example in the design and deployment of chemical semantic web services using the Chemistry Development Kit, chemical descriptors from the Chemical Information Ontology (CHEMINF), and the Semanticscience Integrated Ontology (SIO) as a unifying upper-level ontology of basic types and relations. I will discuss how one can use the SADI-enabled SHARE client to reason about data obtained from Bio2RDF, the largest linked open data project, and automatically invoke chemical semantic web services to determine a chemical's drug-likeness. If you want to see the potential of the Semantic Web being realized, this talk is for you.
Ontology is the study of, or concern about, what kinds of things exist: what entities there are in the universe. The word derives from the Greek onto (being) and logia (written or spoken discourse). Ontology is a branch of metaphysics, the study of first principles or the root of things.
Rulelog is in the process of industry standardization via RuleML and W3C:
RIF-Rulelog specification, version of May 24, 2013, Michael Kifer, ed. RIF-Rulelog is a powerful dialect of the W3C Rule Interchange Format (RIF) that is in draft as a submission from RuleML to W3C.
Several industry standards in these areas are based heavily on our team's contributions to the authoring/editing of the specifications and to the underlying research and earlier-phase standards design. These include, most notably, the two most important industry standards on rules knowledge:
W3C Rule Interchange Format (RIF), which is primarily based on the RuleML standards design (semantic web rules)
W3C OWL 2 RL Profile (rule-based web ontologies)
The team has also contributed to the development of W3C SPARQL and ISO Common Logic, and been strongly involved in other related standardization efforts at OMG and Oasis.
Linked Open Data Alignment and Enrichment Using Bootstrapping Based Techniques (Prateek Jain)
The recent emergence of the "Linked Data" approach for publishing data represents a major step forward in realizing the original vision of a web that can "understand and satisfy the requests of people and machines to use the web content", i.e. the Semantic Web. This new approach has resulted in the Linked Open Data (LOD) Cloud, which includes more than 70 large datasets contributed by experts belonging to diverse communities such as geography, entertainment, and life sciences. However, the current interlinks between datasets in the LOD Cloud, as we will illustrate, are too shallow to realize many of the promised benefits. If this limitation is left unaddressed, the LOD Cloud will merely be more data suffering from the same kinds of problems that plague the Web of Documents, and the vision of the Semantic Web will fall short.
This thesis presents a comprehensive solution to address these issues using a bootstrapping based approach. It showcases using bootstrapping based methods to identify and create richer relationships between LOD datasets. The BLOOMS project (http://wiki.knoesis.org/index.php/BLOOMS) and the PLATO project, both built as part of this research, have provided evidence to the feasibility and the applicability of the solution.
Resource Description Framework Approach to Data Publication and Federation (Pistoia Alliance)
Bob Stanley, CEO, IO Informatics, explains the utility of RDF as a standard way of defining and redefining data, which could have utility in managing life science information.
The increased availability of biomedical data, particularly in the public domain, offers the opportunity to better understand human health and to develop effective therapeutics for a wide range of unmet medical needs. However, data scientists remain stymied by the fact that data remain hard to find and to productively reuse, because data and their metadata i) are wholly inaccessible, ii) are in non-standard or incompatible representations, iii) do not conform to community standards, and iv) have unclear or highly restricted terms and conditions that preclude legitimate reuse. These limitations require a rethink of how data can be made machine- and AI-ready, the key motivation behind the FAIR Guiding Principles. Concurrently, while recent efforts have explored the use of deep learning to fuse disparate data into predictive models for a wide range of biomedical applications, these models often fail even when the correct answer is already known, and fail to explain individual predictions in terms that data scientists can appreciate. These limitations suggest that new methods to produce practical artificial intelligence are still needed.
In this talk, I will discuss our work in (1) building an integrative knowledge infrastructure to prepare FAIR and "AI-ready" data and services along with (2) neurosymbolic AI methods to improve the quality of predictions and to generate plausible explanations. Attention is given to standards, platforms, and methods to wrangle knowledge into simple, but effective semantic and latent representations, and to make these available into standards-compliant and discoverable interfaces that can be used in model building, validation, and explanation. Our work, and those of others in the field, creates a baseline for building trustworthy and easy to deploy AI models in biomedicine.
Bio
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University, founder and executive director of the Institute of Data Science, and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research explores socio-technological approaches for responsible discovery science, which includes collaborative multi-modal knowledge graphs, privacy-preserving distributed data mining, and AI methods for drug discovery and personalized medicine. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon Europe, the European Open Science Cloud, the US National Institutes of Health, and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
Knowledge graphs are an emerging paradigm for representing information, yet their discovery and reuse are hampered by insufficient or inadequate metadata. The COST Action Distributed Knowledge Graphs recently held a first workshop to develop a KG metadata schema; in this presentation, the progress and plans are discussed with the W3C Community Group on Knowledge Graph Construction.
Data-Driven Discovery Science with FAIR Knowledge Graphs (Michel Dumontier)
Data-Driven Discovery Science with FAIR Knowledge Graphs
Despite the existence of vast amounts of biomedical data, these remain difficult to find and to productively reuse in machine learning and other Artificial Intelligence technologies. In this talk, I will discuss the role of the FAIR Guiding Principles in making biomedical data AI-ready, and how representing those data as knowledge graphs not only enables powerful ontology-backed semantic queries, but can also be used to predict missing information and to check the quality of the collected knowledge.
The main idea of the talk is to introduce the FAIR principles (what they are and what they are not), and how their application with semantic web technologies (ontologies/linked data) creates improved possibilities for large scale data integration, answering sophisticated questions using automated reasoners, and predicting new relations/validating data using graph embeddings. The audience will gain insight into the state of the art in a carefully presented manner that introduces principles, approaches, and outcomes relevant to Health AI.
The FAIR (Findable, Accessible, Interoperable, Reusable) Guiding Principles light a path towards improving the discovery and reuse of digital objects (data, documents, software, web services, etc) by machines. Machine reusability is a crucial strategic component in building robust digital infrastructure that strengthens scholarship and opens new pathways for innovation on a truly global scale. However, as the FAIR principles do not specify any particular implementation, communities have the homework to devise, standardize and implement technical specifications to improve the ‘FAIRness’ of digital assets. In this seminar, I will focus on the history and state of the art in the FAIRness assessment, including manual, semi-automated and fully automated approaches, and how these can be used by developers and consumers alike. This seminar will serve as a springboard for community discussion and adoption of these services to incrementally and realistically improve the FAIRness of their resources.
The Role of the FAIR Guiding Principles for an effective Learning Health System (Michel Dumontier)
The learning health system (LHS) is an integrated social and technological system that embeds continuous improvement and innovation for the effective delivery of healthcare. A crucial part of the LHS lies in how the underlying information system will secure and take advantage of relevant knowledge assets to support complex and unusual clinical decision making, facilitate public health surveillance, and aid comparative effectiveness research. However, key knowledge assets remain difficult to obtain and reuse, particularly in a decentralized context. In this talk, I will discuss the role of the Findable, Accessible, Interoperable, and Reusable (FAIR) Guiding Principles in the realization of the LHS, along with emerging technologies to publish and refine clinical research and the knowledge derived therein.
Keynote given for 2021 Knowledge Representation for Health Care http://banzai-deim.urv.net/events/KR4HC-2021/
CIKM2020 Keynote: Accelerating discovery science with an Internet of FAIR dat... (Michel Dumontier)
Biomedicine has always been a fertile and challenging domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies, offer exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple, but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reuseable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services enable automated scientific discovery on a global scale.
Bio:
Dr. Michel Dumontier is the Distinguished Professor of Data Science at Maastricht University and co-founder of the FAIR (Findable, Accessible, Interoperable and Reusable) data principles. His research focuses on the development of computational methods for scalable and responsible discovery science. Dr. Dumontier obtained his BSc (Biochemistry) in 1998 from the University of Manitoba, and his PhD (Bioinformatics) in 2005 from the University of Toronto. Previously a faculty member at Carleton University in Ottawa and Stanford University in Palo Alto, Dr. Dumontier founded and directs the interfaculty Institute of Data Science at Maastricht University to develop sociotechnological systems for responsible data science by design. His work is supported through the Dutch National Research Agenda, the Netherlands Organisation for Scientific Research, Horizon 2020, the European Open Science Cloud, the US National Institutes of Health and a Marie-Curie Innovative Training Network. He is the editor-in-chief for the journal Data Science and is internationally recognized for his contributions in bioinformatics, biomedical informatics, and semantic technologies including ontologies and linked data.
This presentation was given on October 21, 2020 at CIKM2020.
The role of the FAIR Guiding Principles in a Learning Health System (Michel Dumontier)
The learning health system (LHS) is a concept for a socio-technological system that continuously improves the delivery of health care by coupling biomedical research with practice- and evidence- based medicine. Key aspects of the LHS are collecting, integrating, and analyzing data from different sources. While the increased digitalisation of healthcare is creating new data sources, these remain hard to find and use, let alone make use of as part of intelligent systems for the benefit of patients, healthcare providers, and researchers. This talk will examine recent developments towards making key parts of the LHS, such as clinical practice guidelines, Findable, Accessible, Interoperable, and Reusable (FAIR).
Accelerating biomedical discovery with an internet of FAIR data and services -... (Michel Dumontier)
With its focus on improving the health and well being of people, biomedicine has always been a fertile, if not challenging domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies, offer exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple, but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reuseable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services, which is built on Semantic Web technologies, be well positioned to support automated scientific discovery on a global scale.
Accelerating Biomedical Research with the Emerging Internet of FAIR Data and ... (Michel Dumontier)
With its focus on improving the health and well being of people, biomedicine has always been a fertile, if not challenging, domain for computational discovery science. Indeed, the existence of millions of scientific articles, thousands of databases, and hundreds of ontologies offers exciting opportunities to reuse our collective knowledge, were we not stymied by incompatible formats, overlapping and incomplete vocabularies, unclear licensing, and heterogeneous access points. In this talk, I will discuss our work to create computational standards, platforms, and methods to wrangle knowledge into simple but effective representations based on semantic web technologies that are maximally FAIR - Findable, Accessible, Interoperable, and Reusable - and to further use these for biomedical knowledge discovery. But only with additional crucial developments will this emerging Internet of FAIR data and services enable automated scientific discovery on a global scale.
Are we FAIR yet? And will it be worth it?
The FAIR Principles propose essential characteristics that all digital resources (e.g. datasets, repositories, web services) should possess to be Findable, Accessible, Interoperable, and Reusable by both humans and machines. The Principles act as a guide that researchers and data stewards should expect from contemporary digital resources, and in turn, the requirements on them when publishing their own scholarly products. As interest in, and support for the Principles has spread, the diversity of interpretations has also broadened, with some resources claiming to already “be FAIR”.
This talk will elaborate on what FAIR is, what it entails, and how we should evaluate FAIRness. I will describe new social and technological infrastructure to support the creation and evaluation of FAIR resources, and how FAIR fits into institutional, national and international efforts. Finally, I will discuss the merits of the FAIR principles (and what we ask of people) in the context of strengthening data-driven scientific inquiry.
Keynote given at NETTAB2018 - http://www.igst.it/nettab/2018/
The future of science and business - a UM Star Lecture (Michel Dumontier)
I discuss how data science is affecting our way of life and how we at Maastricht University are preparing the next generation of leaders to address opportunities and challenges in a responsible manner.
The FAIR Principles propose key characteristics that all digital resources (e.g. datasets, repositories, web services) should possess to be Findable, Accessible, Interoperable, and Reusable by people and machines. The Principles act as a guide that researchers should expect from contemporary digital resources, and in turn, the requirements on them when publishing their own scholarly products. As interest in, and support for the Principles has spread, the diversity of interpretations has also broadened, with some resources claiming to already “be FAIR”. This talk will elaborate on what FAIR is, why we need it, what it entails, and how we should evaluate FAIRness. I will describe new social and technological infrastructure to support the creation and evaluation of FAIR resources, and how FAIR fits into institutional, national and international efforts. Finally, I will discuss the merits of the FAIR principles (and what we ask of people) in the context of strengthening data-driven scientific inquiry.
A talk prepared for the workshop "Working on data stewardship? Meet your peers!"
Date: 3 October 2017
https://www.surf.nl/agenda/2017/10/workshop-working-on-data-stewardship-meet-your-peers/index.html
Towards metrics to assess and encourage FAIRness (Michel Dumontier)
With increased interest in the FAIR metrics, there is a need to develop tools and approaches that can assess the FAIRness of a digital resource. This talk begins to explore some ideas in this space, and invites people to participate in a working group focused on the development, application, and evaluation of FAIR metric efforts.
A presentation to the New Year's Event for Maastricht University's Knowledge Engineering @ Work Program. https://www.maastrichtuniversity.nl/news/kework-first-10-students-academic-workstudy-track-graduate
Today, after several years of existence, an extremely active community and an ultra-dynamic ecosystem, Kubernetes has established itself as the de facto standard in container orchestration. Thanks to a wide range of managed services, it has never been so easy to set up a ready-to-use Kubernetes cluster.
However, this ease of use means that the subject of security in Kubernetes is often left for later, or even neglected. This exposes companies to significant risks.
In this talk, I'll show you step-by-step how to secure your Kubernetes cluster for greater peace of mind and reliability.
LF Energy Webinar: Electrical Grid Modelling and Simulation Through PowSyBl -...DanBrown980551
Do you want to learn how to model and simulate an electrical network from scratch in under an hour?
Then welcome to this PowSyBl workshop, hosted by Rte, the French Transmission System Operator (TSO)!
During the webinar, you will discover the PowSyBl ecosystem as well as handle and study an electrical network through an interactive Python notebook.
PowSyBl is an open source project hosted by LF Energy, which offers a comprehensive set of features for electrical grid modelling and simulation. Among other advanced features, PowSyBl provides:
- A fully editable and extendable library for grid component modelling;
- Visualization tools to display your network;
- Grid simulation tools, such as power flows, security analyses (with or without remedial actions) and sensitivity analyses;
The framework is mostly written in Java, with a Python binding so that Python developers can access PowSyBl functionalities as well.
What you will learn during the webinar:
- For beginners: discover PowSyBl's functionalities through a quick general presentation and the notebook, without needing any expert coding skills;
- For advanced developers: master the skills to efficiently apply PowSyBl functionalities to your real-world scenarios.
JMeter webinar - integration with InfluxDB and GrafanaRTTS
Watch this recorded webinar about real-time monitoring of application performance. See how to integrate Apache JMeter, the open-source leader in performance testing, with InfluxDB, the open-source time-series database, and Grafana, the open-source analytics and visualization application.
In this webinar, we will review the benefits of leveraging InfluxDB and Grafana when executing load tests and demonstrate how these tools are used to visualize performance metrics.
Length: 30 minutes
Session Overview
-------------------------------------------
During this webinar, we will cover the following topics while demonstrating the integrations of JMeter, InfluxDB and Grafana:
- What out-of-the-box solutions are available for real-time monitoring JMeter tests?
- What are the benefits of integrating InfluxDB and Grafana into the load testing stack?
- Which features are provided by Grafana?
- Demonstration of InfluxDB and Grafana using a practice web application
To view the webinar recording, go to:
https://www.rttsweb.com/jmeter-integration-webinar
Epistemic Interaction - tuning interfaces to provide information for AI supportAlan Dix
Paper presented at SYNERGY workshop at AVI 2024, Genoa, Italy. 3rd June 2024
https://alandix.com/academic/papers/synergy2024-epistemic/
As machine learning integrates deeper into human-computer interactions, the concept of epistemic interaction emerges, aiming to refine these interactions to enhance system adaptability. This approach encourages minor, intentional adjustments in user behaviour to enrich the data available for system learning. This paper introduces epistemic interaction within the context of human-system communication, illustrating how deliberate interaction design can improve system understanding and adaptation. Through concrete examples, we demonstrate the potential of epistemic interaction to significantly advance human-computer interaction by leveraging intuitive human communication strategies to inform system design and functionality, offering a novel pathway for enriching user-system engagements.
Connector Corner: Automate dynamic content and events by pushing a buttonDianaGray10
Here is something new! In our next Connector Corner webinar, we will demonstrate how you can use a single workflow to:
Create a campaign using Mailchimp with merge tags/fields
Send an interactive Slack channel message (using buttons)
Have the message received by managers and peers along with a test email for review
But there’s more:
In a second workflow supporting the same use case, you’ll see:
Your campaign sent to target colleagues for approval
If the “Approve” button is clicked, a Jira/Zendesk ticket is created for the marketing design team
But—if the “Reject” button is pushed, colleagues will be alerted via Slack message
Join us to learn more about this new, human-in-the-loop capability, brought to you by Integration Service connectors.
And...
Speakers:
Akshay Agnihotri, Product Manager
Charlie Greenberg, Host
Neuro-symbolic is not enough, we need neuro-*semantic*Frank van Harmelen
Neuro-symbolic (NeSy) AI is on the rise. However, simply machine learning on just any symbolic structure is not sufficient to really harvest the gains of NeSy. These will only be gained when the symbolic structures have an actual semantics. I give an operational definition of semantics as “predictable inference”.
All of this illustrated with link prediction over knowledge graphs, but the argument is general.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...James Anderson
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. Constant focus on speed to release software to market, along with the traditional slow and manual security checks has caused gaps in continuous security as an important piece in the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
A little more semantics goes a lot further! Getting more out of Linked Data with OWL
1. A little more semantics goes a lot further!
Getting more out of Linked Data with OWL
Dr. Michel Dumontier
Dr. Robert Hoehndorf
2. Abstract
This tutorial will provide detailed instruction to create and make use of formalized ontologies
from linked open data for advanced knowledge discovery including consistency checking and
answering sophisticated questions.
Automated reasoning in OWL offers the tantalizing possibility to undertake advanced
knowledge discovery including verifying the consistency of conceptual schemata in
information systems, verifying data integrity and answering expressive queries over the
conceptual schema and the data. Given that a large amount of structured knowledge is now
available as linked data, the challenge is to formalize this knowledge so that the intended
semantics become explicit and the reasoning is efficient and scalable. While using the
full expressiveness of OWL 2 yields ontologies that can be used for consistency verification,
classification and query answering, the less expressive OWL profiles enable efficient
reasoning and support different application scenarios. In this tutorial,
- we describe how to generate OWL ontologies from linked data
- check the consistency of the resulting knowledge
- automatically transform ontologies into OWL profiles
- use this knowledge in applications to integrate data and answer sophisticated questions
across domains.
- expressive ontologies enable data integration, consistency verification and question
answering
- formalization of linked data will create new opportunities for knowledge discovery
- OWL 2 profiles support more efficient reasoning and query answering procedures
- recent technology facilitates the automatic conversion of OWL 2 ontologies into profiles
- OWL ontologies can dramatically extend the functionality of semantically-enabled web sites
3. skills obtained
• understand the nature and capability of a formal ontology and information system
• understand the subtle differences between OWL 2 and its profiles, including differences in
constructs, when to apply these profiles and how to convert ontologies into them
• understand the distinction between a class and an individual and their descriptions
• understand how to convert RDF triples in Linked Data into axioms for an OWL ontology
• understand how to execute standard reasoning services (classification, consistency
checking, realization, query answering) on an OWL ontology using the OWL API and an
OWL reasoner, with a focus on OWL EL ontologies and reasoners
• understand how to identify inconsistencies and simple patterns to remove or repair them
• understand how to convert large amounts of linked data into a large-scale OWL
knowledge base and enable tractable reasoning over it
4. 90 min Outline
1. introduction (10min)
• case study: SGD
• linked data vs ontology
• RDF vs OWL
• Motivation: can we use some features of OWL to organize, verify and exploit Linked Data?
2. Formalization
• OWL2 – elements, expressions and axioms
• Triples to axioms
• Role of top level ontologies (classes + relations)
• Axiom patterns
3. Practical Reasoning
• classification using CEL/CB/Pellet/HermiT/...
• OWL profiles
• Modularization (EL Vira)
• Diagnosis and Repair
• Explanations
• Inference of new triples
4. Conclusion
5. Saccharomyces Genome Database
A repository for all things yeast.
includes:
• molecular entities, their parts
o chromosomes; genes, open reading frames, etc.
o RNA, proteins; domains
• qualities, realizables (dispositions, functions)
• interactions and their participants
• complexes, their parts and their topology
• pathways and their components
• phenotypes and their basis
6. Hexokinase (HXK1)
The HXK1 gene encodes the HXK1 protein, which is
responsible for the conversion of glucose to
glucose-6-phosphate in the first step of glycolysis.
Gene:
(region of DNA)
Protein
(macromolecule)
7. Questions we may want to ask about HXK1:
• What kind of thing is HXK1?
• What are the implications of being a gene?
o In which chromosome does it appear?
o Which entities does it encode?
• What are the implications of being a protein?
o What is its function?
o Where is it located in the cell?
o If HXK1 participates in processes that involve other
cellular components, where else must HXK1 be located?
• Is the HXK1 annotation consistent?
o does the annotation contradict common biological
knowledge?
o is it possible for HXK1 to have multiple locations when it
can only be located on one chromosome?
8. SGD refers to other data sources
Gene Ontology
- functions, locations, processes
Ascomycetes Phenotype Ontology
- experiments, interactions and phenotypes
Pubmed
- abstracts of published research articles + MeSH terms
over 40 references to other molecular/data entities for which
the relation is unclear…
10. SGD as RDF-based Linked Open Data
11. Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch. http://lod-cloud.net/
SGD is provided by Bio2RDF and forms part of the
growing linked open data cloud
12. Semantic Integration
• Requires a level of abstraction/generalization
where the relationship between each resource is
formalized
– classes
– relations
• How do we ensure that our representation
facilitates integration across datasets?
• How can we get our formalization to interoperate
with ontologies?
15. Semantic Technologies: RDF vs OWL
RDF: simple triples, graph-based queries, supports
very large amounts of data
OWL: significantly more expressive language, strong
axioms, inference capabilities, consistency
verification, but can be rather slow
16. RDF-based Linked Data
• Provides the basis for simple data syndication and
syntactic data integration
o IRIs
o Statements (aka triples) take the form
<subject> <predicate> <object>
• Easy to implement
o stand-alone datasets
o logical layer over databases
• Limited reasoning
o class and property hierarchies
o domain/range restrictions
o can’t automatically discover inconsistency
• Standardized Queries - SPARQL
• Scalable - to billions of triples
17. OWL - The Web Ontology Language
• Enhanced vocabulary (strong axioms) to express
knowledge relating to classes, properties, individuals and
data values
o quantifiers (existential, universal, cardinality restriction)
o negation
o disjunction
o property characteristics
o complex classes in domain and range restrictions
o property chains
• Advanced reasoning
18. Advanced Reasoning
• Consistency: determines whether the ontology contains
contradictions.
• Satisfiability: determines whether classes can have
instances.
• Subsumption: is class C1 implicitly a subclass of C2?
• Classification: repeated application of subsumption to
discover implicit subclass links between named classes
• Realization: find the most specific class that an individual
belongs to.
19. OWL Challenges and Solutions
Inconsistency:
• needs to be resolved to ask any questions involving the
ontology
• Solution: explicitly accommodate multiple meanings,
remove contradictory axioms
Unsatisfiability (of a class):
• may indicate a modelling error
• needs to be resolved to ask meaningful questions about the
class
• Solution: explicitly accommodate multiple meanings,
redefine class, remove contradicting class restrictions
20. OWL Challenges and Solutions
Scalability:
• answering OWL queries requires reasoning
• inference in OWL is highly complex (worst case:
2NEXPTIME)
• highly optimized reasoners are getting better and better, but
can still be slow with large ontologies
• tractable OWL profiles (EL, QL, RL) enable more efficient
and guaranteed polynomial-time inferences
• use ontology modularization approaches to increase
performance
21. Linked data and OWL: Motivation
• use OWL reasoning to identify mistakes in RDF data
o incorrect content of assertions
o incorrect use of relations
o conflicting conceptualizations
o incorrect same-as assertions
• verify, fix and exploit Linked Data through expressive OWL
reasoning
• generate/infer new triples to write back into RDF and use
for efficient retrieval
Proposal:
Convert RDF to OWL to perform inferences and represent
inferences in RDF after classification.
22. OWL can help you create rich, machine-understandable
descriptions!
• transform our expert knowledge into axioms and
expressions that can be automatically reasoned about
o a transcription factor is
a protein
that binds to DNA
and regulates the expression of a gene.
o can we mine 'omic datasets to discover which
proteins are transcription factors?
• create rich expressions from combinations of classes,
relations and individuals
• assert statements of truth using axioms.
23. Elements of OWL 2
• The “ontology” of OWL 2 consists of:
• Classes
• Object properties
• Data properties
• Individuals
• Expressions
• Axioms
• Plus RDF stuff (like datatypes)
24. Classes and class axioms
• a class is a set of individuals that share one or more characteristics
o a protein
• classes can be organized in a hierarchy using SubClassOf axioms
o i.e. every member of C2 is a member of C1
o SubClassOf( protein molecule )
• special classes
o owl:Thing is the superclass of all things
o owl:Nothing is the subclass of all classes and denotes the empty set
• classes can be made disjoint from one another
o i.e. there is no member of C1 that is also a member of C2
o DisjointClasses( protein DNA )
• classes can be said to be equivalent
o i.e. all members of C1 are members of C2 and all members of C2
are members of C1
o EquivalentClasses( Peptide Polypeptide )
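The class axioms above map directly onto the OWL API, which the tutorial relies on later for its implementation. As a minimal sketch (not from the original slides; all IRIs and names are hypothetical):

    import org.semanticweb.owlapi.apibinding.OWLManager;
    import org.semanticweb.owlapi.model.*;

    public class ClassAxiomSketch {
        public static void main(String[] args) throws OWLOntologyCreationException {
            OWLOntologyManager manager = OWLManager.createOWLOntologyManager();
            OWLDataFactory df = manager.getOWLDataFactory();
            OWLOntology ont = manager.createOntology(IRI.create("http://example.org/demo"));

            // hypothetical example classes
            OWLClass protein = df.getOWLClass(IRI.create("http://example.org/demo#Protein"));
            OWLClass molecule = df.getOWLClass(IRI.create("http://example.org/demo#Molecule"));
            OWLClass dna = df.getOWLClass(IRI.create("http://example.org/demo#DNA"));
            OWLClass peptide = df.getOWLClass(IRI.create("http://example.org/demo#Peptide"));
            OWLClass polypeptide = df.getOWLClass(IRI.create("http://example.org/demo#Polypeptide"));

            // SubClassOf( protein molecule ): every protein is a molecule
            manager.addAxiom(ont, df.getOWLSubClassOfAxiom(protein, molecule));
            // DisjointClasses( protein DNA ): nothing is both a protein and DNA
            manager.addAxiom(ont, df.getOWLDisjointClassesAxiom(protein, dna));
            // EquivalentClasses( Peptide Polypeptide ): the two classes have the same members
            manager.addAxiom(ont, df.getOWLEquivalentClassesAxiom(peptide, polypeptide));
        }
    }

The later sketches in this transcript reuse the manager, df and ont objects declared here.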
25. Object Properties and axioms
• an object property OP is a relation between two individuals
o 'has part' is an object property that denotes the mereological
relation between two individuals
• OPs can be organized in a hierarchy
o given OP1 and OP2 where OP2 is a subproperty of OP1: if an
individual x is connected by OP2 to an individual y, then x is
also connected by OP1 to y
o subPropertyOf ('has proper part' 'has part')
o owl:TopObjectProperty, owl:BottomObjectProperty
• We can restrict the domain and range to allowed values
• ObjectPropertyDomain ('is participant in', 'physical entity')
• ObjectPropertyRange ('is participant in', 'process')
• We can also assert object properties to be disjoint or equivalent
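The object property axioms above can be asserted the same way; a sketch continuing the previous one (names hypothetical), with the domain/range restrictions for 'participates in' taken from the top-level-ontology discussion later in the deck:

    // continuing the earlier sketch (manager, df, ont); IRIs are hypothetical
    OWLObjectProperty hasPart = df.getOWLObjectProperty(IRI.create("http://example.org/demo#hasPart"));
    OWLObjectProperty hasProperPart = df.getOWLObjectProperty(IRI.create("http://example.org/demo#hasProperPart"));
    OWLObjectProperty participatesIn = df.getOWLObjectProperty(IRI.create("http://example.org/demo#participatesIn"));
    OWLClass materialObject = df.getOWLClass(IRI.create("http://example.org/demo#MaterialObject"));
    OWLClass process = df.getOWLClass(IRI.create("http://example.org/demo#Process"));

    // subPropertyOf ('has proper part' 'has part')
    manager.addAxiom(ont, df.getOWLSubObjectPropertyOfAxiom(hasProperPart, hasPart));
    // domain and range restrictions for 'participates in'
    manager.addAxiom(ont, df.getOWLObjectPropertyDomainAxiom(participatesIn, materialObject));
    manager.addAxiom(ont, df.getOWLObjectPropertyRangeAxiom(participatesIn, process));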
26. description of object properties
• Inverse
o we say that 'has part' is an inverse for 'is part of'
o we can also refer to this as inv('is part of')
• Symmetric
o applies when the inverse relation is the very same relation
o e.g. the inverse of 'is related to' is 'is related to'
• Transitive
o a relation is transitive if, whenever an individual x is connected to an
individual y and y is connected to an individual z, then x is also
connected to z
o e.g. 'has part' is transitive
27. description of object properties
• Reflexive
o a reflexive relation connects every individual back to
itself
o e.g. 'has part' is reflexive because a protein has itself as a part.
• Functional
o restricts the range of the relation to a single individual; therefore
all individuals in the range must be the same.
o e.g. 'has unique identifier'
• Inverse Functional
o restricts the domain of the relation to a single individual; therefore all
individuals in the domain must be the same
o e.g. 'is unique identifier of'
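Each of these property characteristics corresponds to a single OWL API axiom; a continuation sketch (again with hypothetical names):

    OWLObjectProperty isPartOf = df.getOWLObjectProperty(IRI.create("http://example.org/demo#isPartOf"));
    OWLObjectProperty hasUniqueIdentifier = df.getOWLObjectProperty(IRI.create("http://example.org/demo#hasUniqueIdentifier"));

    // 'has part' is an inverse of 'is part of'
    manager.addAxiom(ont, df.getOWLInverseObjectPropertiesAxiom(hasPart, isPartOf));
    // 'has part' is transitive and reflexive, as described on the slides
    manager.addAxiom(ont, df.getOWLTransitiveObjectPropertyAxiom(hasPart));
    manager.addAxiom(ont, df.getOWLReflexiveObjectPropertyAxiom(hasPart));
    // 'has unique identifier' is functional; its inverse would be inverse functional
    manager.addAxiom(ont, df.getOWLFunctionalObjectPropertyAxiom(hasUniqueIdentifier));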
28. Class Expressions
Class expressions are rich descriptions of classes through the
logical combination of ontological primitives (classes, object
properties, datatype properties, individuals)
Protein subClassOf
molecule and ‘has proper part’ min 2 ‘amino acid residues’
Combinations specified using logical operators
• conjunction (and), disjunction (or), negation (not)
Object or data property expressions provide a qualified cardinality
over the relation
o minimum: rel min # Y
o maximum: rel max # Y
o exact: rel exactly # Y (minimum + maximum)
o some: rel min 1 Y
29. Class Expressions
o The quantifications can be qualified by the object type
o rel only Y – the only values allowed are of type Y
• to form complex class expressions like
o 'molecule' and not 'dna'
o 'has part' min 2 'amino acid'
o 'is located in' only ('nucleus' or 'cytoplasm')
• which can be expressed as axioms in the ontology
Protein subClassOf
molecule and ‘has proper part’ min 2 ‘amino acid residues’
Transcription Factor equivalentClass
‘protein’
and ‘has disposition’ some ‘to bind to DNA’
and ‘has function’ some ‘to regulate gene expression’
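Both example axioms can be built with the OWL API's class-expression factory methods; a continuation sketch with hypothetical IRIs:

    OWLClass aminoAcidResidue = df.getOWLClass(IRI.create("http://example.org/demo#AminoAcidResidue"));
    OWLClass transcriptionFactor = df.getOWLClass(IRI.create("http://example.org/demo#TranscriptionFactor"));
    OWLClass toBindDNA = df.getOWLClass(IRI.create("http://example.org/demo#ToBindToDNA"));
    OWLClass toRegulate = df.getOWLClass(IRI.create("http://example.org/demo#ToRegulateGeneExpression"));
    OWLObjectProperty hasDisposition = df.getOWLObjectProperty(IRI.create("http://example.org/demo#hasDisposition"));
    OWLObjectProperty hasFunction = df.getOWLObjectProperty(IRI.create("http://example.org/demo#hasFunction"));

    // Protein subClassOf molecule and 'has proper part' min 2 'amino acid residue'
    manager.addAxiom(ont, df.getOWLSubClassOfAxiom(protein,
        df.getOWLObjectIntersectionOf(molecule,
            df.getOWLObjectMinCardinality(2, hasProperPart, aminoAcidResidue))));

    // Transcription Factor equivalentClass protein and 'has disposition' some ...
    // and 'has function' some ...
    manager.addAxiom(ont, df.getOWLEquivalentClassesAxiom(transcriptionFactor,
        df.getOWLObjectIntersectionOf(protein,
            df.getOWLObjectSomeValuesFrom(hasDisposition, toBindDNA),
            df.getOWLObjectSomeValuesFrom(hasFunction, toRegulate))));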
30. Triples to axioms
Convert RDF triples into OWL axioms.
Triple in RDF:
<Nucleus> <partOf> <Cell>
• Nucleus and Cell are classes
• partOf is a relation between 2 classes
• intended meaning:
every instance of Nucleus is partOf some instance of Cell
• formalize as OWL axiom:
Nucleus subClassOf
partOf some Cell
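A minimal helper (hypothetical, not the authors' code) capturing this conversion pattern:

    // <C1> <R> <C2>  =>  C1 SubClassOf: R some C2
    static OWLSubClassOfAxiom tripleToAxiom(OWLDataFactory df,
                                            IRI subject, IRI predicate, IRI object) {
        return df.getOWLSubClassOfAxiom(
            df.getOWLClass(subject),
            df.getOWLObjectSomeValuesFrom(df.getOWLObjectProperty(predicate),
                                          df.getOWLClass(object)));
    }

Applied to the triple above, tripleToAxiom yields Nucleus SubClassOf partOf some Cell.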
31. Triples to axioms
Triple in RDF:
<Cytosol> <isLocationOf> <HXK1>
• Cytosol and HXK1 are classes
• isLocationOf is a relation between 2 classes
• intended meaning:
• every instance of HXK1 is located at some instance of Cytosol
• not intended:
• for every instance of Cytosol, there is an instance of HXK1 located
in it.
HXK1 subClassOf
hasLocation some Cytosol
i.e. inv(isLocationOf) some Cytosol
32. Triples to axioms
Convert RDF triples into OWL axioms.
Triple in RDF:
<C1 R C2>
• C1 and C2 are classes, R a relation between 2 classes
• intended meaning:
o C1 SubClassOf: C2
o C1 SubClassOf: R some C2
o C1 SubClassOf: R only C2
o C2 SubClassOf: R some C1
o C1 SubClassOf: S some C2
o C1 DisjointFrom C2
o C1 and C2 SubClassOf: owl:Nothing
o R some C1 DisjointFrom: R some C2
o C1 EquivalentClasses C2
o ...
• in general: P(C1, C2), where P is an OWL axiom (template)
Challenge:
Formalizing data requires
one to commit to a
particular meaning – to
make an ontological
commitment
33. Triples to axioms
Formalizing RDF triples in OWL often introduces new OWL
object properties.
• Which object properties should be included?
• What axioms hold for included object properties?
• Can domain and range restrictions be generalized across
multiple domains, i.e., reused across multiple linked data
sources to ensure consistency between them?
Challenges
34. Top level ontologies contain
generalized (domain independent)
classes and relations
They can be used to constrain what can be said about these
entities (and hence will later be useful for checking the
consistency of data annotated using these terms).
35. Basic classes in top-level ontologies
• Material entity
• Example: Apple, Human, Cell, Planet
• Has mass as a quality
• Located in space and time
• Independent of other entities
• it exists as a whole whenever it exists
• Quality
• Example: mass, color, concentration
• Dependent: always the quality of some entity
• Quality of object: size, shape, length
• Quality of process: duration, rate
• Quality of quality: shade (of color), intensity
36. Basic classes in top-level ontologies
• Function
• e.g. to bind, to catalyze (a reaction), to kill bacteria
• Dependent: always the function of some thing
• Similar to a property of an object
• Represents the potential to do something (an action) in
some process
• capabilities, dispositions and tendencies
• Process
• Example: running a marathon, binding, cell division
• Located in space and time
• Independent of other entities
• Temporally extended
37. Top-level ontologies make a
commitment to these being different
things
Material object, Process, Function and Quality are mutually
disjoint.
39. Relations in top-level ontologies
• relations (object properties) in OWL hold between
instances
• domain and range restrictions from top-level ontology
can be applied for general relations, e.g.:
o ‘has part’ can be restricted with "Material object" as
both domain and range
o ‘participates in’ can be restricted with a domain of
"Material object" and a range of "Process"
o re-use of relations enables inferences across
resources
40. Enforce ontological commitment by
mapping to a top-level ontology
Foundation of domain classes and relations in top-level
ontology:
• every domain class becomes a subclass of a class in top-
level ontology
• every object property used in OWL axioms becomes a sub-
property of an object property in the top-level ontology
• assert additional axioms to restrict domain classes and
delimit them from other domains (where appropriate)
o e.g., if a particular resource uses (in RDF) the relation
part-of exclusively between processes, this additional
constraint can be added
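In OWL API terms, this foundation amounts to a batch of SubClassOf and SubObjectPropertyOf axioms; a continuation sketch (hypothetical names, and the property mapping should only be asserted once the implication has been verified, as the later slides show for located-in):

    OWLClass sequence = df.getOWLClass(IRI.create("http://example.org/demo#Sequence"));
    OWLObjectProperty locatedIn = df.getOWLObjectProperty(IRI.create("http://example.org/demo#locatedIn"));

    // every domain class becomes a subclass of a top-level class
    manager.addAxiom(ont, df.getOWLSubClassOfAxiom(sequence, materialObject));
    // every object property becomes a subproperty of a top-level property
    manager.addAxiom(ont, df.getOWLSubObjectPropertyOfAxiom(locatedIn, isPartOf));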
41. Top-level ontology
Application of a top-level ontology:
• can help to make the ontological commitment that is
employed within an information system explicit,
• can guarantee basic agreement about fundamental types,
• agreement about common relations,
• provides common domain and range restrictions across
multiple domains, and therefore
• enables re-use of relations and types across data sources,
domains, levels of granularity, and information systems.
42. Formalization of SGD’s Linked Data
SGD uses at least the following relations in RDF:
• isPartOf
• hasParticipant
• isFunctionOf
• isLocationOf
Can we create patterns from which linked data can be
appropriately formalized into OWL axioms?
axiom patterns
43. Formalization of SGD Linked Data
?X isPartOf ?Y
Can be translated to axiom pattern
?X subClassOf: part-of some ?Y
"part-of" is an object property contained in our top-level
ontology.
Example:
HXK1 isPartOf chromosome6_Crick
translated to
HXK1 subClassOf: part-of some chromosome6_Crick
44. Formalization of SGD Linked Data
?X hasParticipant ?Y
translated to axiom pattern
?Y subClassOf: participates-in some ?X
"participates-in" is an object property contained in our top-level
ontology.
Example:
GO:0005975 (carbohydrate metabolism) hasParticipant HXK1
translated to
HXK1 subClassOf: participates-in some GO:0005975
45. Formalization of SGD Linked Data
?X isLocationOf ?Y
translated to axiom schema
?Y subClassOf: located-in some ?X
Example:
GO:0005737 (cytoplasm) isLocationOf HXK1
translated to
HXK1 subClassOf: located-in some GO:0005737
What if "located-in" is not present in our top-level ontology…
46. Formalization of SGD Linked Data
Top-level foundation for located-in relation:
• declare located-in as sub-property of part-of
o verify how located-in is used within SGD, i.e., does
located-in imply part-of?
o counter-example: misfolded protein located-in chaperone
protein, but not misfolded protein part-of chaperone
protein
• create located-in as super-property of part-of in our top-
level ontology:
o does part-of imply located-in within SGD?
o cell body part-of cell, but not cell body located-in cell
47. Formalization of SGD Linked Data
Top-level foundation for located-in relation:
• add located-in to our top-level ontology
o adding the new relation allows its reuse across multiple
resources
o inclusion may require addition of further classes (e.g.,
spatial regions)
o relation to part-of must be clarified (and part-of may even
be replaced by located-in)
Establishing the relation between relations and classes
depends on how the relations and classes are being applied.
48. Formalization of SGD Linked Data
Top-level foundation:
Translate
HXK1 rdf:type OpenReadingFrame
to
HXK1 subClassOf: OpenReadingFrame
OpenReadingFrame (Sequence Ontology) is a subclass of
Sequence.
50. Formalization of SGD Linked Data
Foundation for SGD classes in top-level ontology:
• declare Sequence to be a subclass of Material object
• import (owl:imports) Sequence Ontology
• declare Biological Process (GO) subclass of Process
• declare Molecular Function (GO) subclass of Function
• import GO
• ...
to create a top-level foundation (i.e., super-class in top-level
ontology for all classes) for SGD
51. Implementation
• expand relations in RDF based on relational patterns
• relational patterns are OWL axioms with 2 variables (which
are filled by subject and object, respectively)
• implementation based on OWL API
• adopt implementation of relational patterns in OBO
language (http://code.google.com/p/obo2owl/)
Hoehndorf, Robert, Oellrich, Anika, Dumontier, Michel, Kelso, Janet, Herre, Heinrich,
and Rebholz-Schuhmann, Dietrich (2010). Relational patterns in OWL and their
application to OBO. OWL: Experiences and Directions (OWLED).
paper: http://www.webont.org/owled/2010/papers/owled2010_submission_3.pdf
presentation: http://www.slideshare.net/micheldumontier/relational-patterns-in-owl-and-their-application-to-obo
BMC Bioinformatics: http://www.biomedcentral.com/1471-2105/11/441
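One simple way (not necessarily the authors' implementation) to represent a relational pattern P(?X, ?Y) on top of the OWL API is as a function from two classes to an axiom; a continuation sketch:

    import java.util.function.BiFunction;

    // ?X subClassOf: part-of some ?Y
    BiFunction<OWLClass, OWLClass, OWLAxiom> partOfPattern =
        (x, y) -> df.getOWLSubClassOfAxiom(x,
            df.getOWLObjectSomeValuesFrom(isPartOf, y));

    // expanding the RDF triple <HXK1> <isPartOf> <chromosome6_Crick>:
    manager.addAxiom(ont, partOfPattern.apply(
        df.getOWLClass(IRI.create("http://example.org/demo#HXK1")),
        df.getOWLClass(IRI.create("http://example.org/demo#chromosome6_Crick"))));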
52. Another way?
• OPPL is an abstract formalism that allows for
manipulating ontologies written in OWL.
• Use OPPL to select triples and create the axioms
53. Operations on OWL ontologies
• Consistency: determines whether the ontology contains
contradictions.
• Satisfiability: determines whether classes can have
instances.
• Subsumption: is class C1 implicitly a subclass of C2?
• Classification: repeated application of subsumption to
discover implicit subclass links between named classes
• Realization: find the most specific class that an individual
belongs to.
OWL reasoners can perform these operations and make the
results accessible for further processing.
54. Practical reasoning with OWL
ontologies
• Ontology editors such as Protege interface with reasoners to
perform consistency and class satisfiability checking,
classification and realisation, and to provide explanations.
• Some reasoners can be run from the command line
to execute requests, including SPARQL queries.
• Programmatic use of reasoners via APIs offers maximal flexibility,
e.g., one can request all subclasses of a given class,
including implicit ones, or all entailed statements with a
specified subject and predicate
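A continuation sketch of the programmatic route, using HermiT through the standard OWLReasoner interface (any OWLReasonerFactory could be substituted):

    import org.semanticweb.owlapi.reasoner.*;

    OWLReasonerFactory rf = new org.semanticweb.HermiT.ReasonerFactory();
    OWLReasoner reasoner = rf.createReasoner(ont);

    boolean consistent = reasoner.isConsistent();                    // consistency
    Node<OWLClass> unsat = reasoner.getUnsatisfiableClasses();       // satisfiability
    reasoner.precomputeInferences(InferenceType.CLASS_HIERARCHY);    // classification
    NodeSet<OWLClass> subs = reasoner.getSubClasses(protein, false); // all subclasses, incl. implicit ones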
58. OWL Reasoners
OWL DL Reasoners
• Pellet: Clark & Parsia, dual-licensed, Java.
• FaCT++: Manchester University, open-source, C++ with a Java API.
• HermiT: Oxford University, open-source, Java.
• Racer Pro: Racer Systems, commercial, Lisp with a Java API.
OWL Profile/subset reasoners
• Jena: Hewlett-Packard, open-source, Java.
• OWLIM: Ontotext, dual-licensed, Java.
• CB:
• CEL:
• JCEL (Pellet)
• ELLY:
59. Automated reasoning over SGD
• SGD in OWL contains more than 800,000 axioms
• included ontologies contain several thousand axioms
o GO has approx. 35,000 classes
o ChEBI contains almost 100,000 classes
o complex definitions of classes create links between large
ontologies
• Reasoning in OWL 2 DL is highly complex (worst-case
2NEXPTIME-complete).
• Consequence: OWL reasoning can rarely be employed at
large scale.
• Expressive OWL reasoners fail to classify the formalized
SGD repository.
60. OWL Profiles
• OWL 2 defines three different tractable profiles:
• EL
o polynomial-time reasoning for schema and data
o Useful for ontologies with a large conceptual part
• QL
o fast (LOGSPACE) query answering using RDBMSs via SQL
o Useful for large datasets already stored in RDBs
• RL
o fast (polynomial) query answering using rule-extended
DBs
o Useful for large datasets stored as RDF triples
61. OWL RL
Features:
• identity of classes, instances, properties
• subproperties, subclasses, domains, ranges
• union and intersection of classes (some restrictions)
• property characteristics (functional, symmetric, etc.)
• property chains
• keys
• some property restrictions (but not all inferences are
possible)
Limitations:
• not all datatypes are available
• no datatype restrictions
• no minimum or exact cardinality restrictions
• maximum cardinality only with 0 and 1
• some consequences cannot be drawn
62. OWL EL
Features
• existential quantification to a class expression or data range
• existential quantification to an individual or a literal
• self-restriction
• enumerations involving a single individual or a single literal
• intersection of classes and data range
• class axioms: subClassOf, equivalence, disjointness
• property axioms: domain, range, equivalence, transitivity, reflexivity, inclusion with
or without property chains; functional data properties; keys
• assertions (sameAs, differentFrom, class, object property, data property,
negative object/data property assertions)
Not supported
• universal quantification to a class expression or a data range
• cardinality restrictions
• disjunction (union)
• class negation
• enumerations involving more than one individual
• object properties: disjoint, symmetric, asymmetric, irreflexive, inverse, functional
and inverse-functional
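Recent versions of the OWL API ship profile checkers that report exactly which axioms fall outside a profile; a sketch for EL, continuing the earlier sketches:

    import org.semanticweb.owlapi.profiles.*;

    OWLProfileReport report = new OWL2ELProfile().checkOntology(ont);
    if (!report.isInProfile()) {
        for (OWLProfileViolation violation : report.getViolations()) {
            System.out.println(violation); // each construct outside OWL EL
        }
    }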
63. Ontology modularization
Can we automatically extract a large (maximal) OWL (EL, QL,
RL) module from an ontology?
1. D EquivalentTo: not A (not EL)
2. C EquivalentTo: not B (not EL)
3. B subClassOf: A (EL)
Inference:
• D subClassOf: C (EL) (Inference from (1)-(3))
EL module of (1)-(3):
• {B subClassOf: A}, or
• {B subClassOf: A, D subClassOf: C}
64. EL Vira modularization
• ontology modularization
• identify EL, QL, RL axioms in deductive closure
• retain signature of ontology
• maximality is an open problem
http://el-vira.googlecode.com
65. Consistency repair
• Unsatisfiable classes result from contradictory class
definitions
• Conflict in asserted axioms, in imported ontologies or
through combination of both
• Conflicts can be hidden through domain/range
restrictions, subclass relations, axioms for relations, etc.
• Conflicting axioms may be challenging to identify!
68. Ontology repair and disambiguation
• Ontological constraints may have been too strong
• Complex relations (between classes) that are used in
multiple meanings can be relaxed by explicitly introducing
a disjunction that accommodates the different meanings,
e.g.:
o (1) Hxk1 part-of Chromosome6_Crick_strand
o (2) Hxk1 part-of Hxk1_ATP_complex
o (3) Hxk1 part-of Carbohydrate_metabolism
o only (1) is consistent with background knowledge that
Genes (as material objects) must be part of material
objects (more specifically DNA), and that Genes cannot
be part of protein complexes
69. Ontology repair and disambiguation
1. Hxk1 part-of Chromosome6_Crick_strand
2. Hxk1 part-of Hxk1_ATP_complex
3. Hxk1 part-of Carbohydrate_metabolism
part-of here means either
?X subClassOf: part-of some ?Y, or
?X subClassOf: encodes some (part-of some ?Y), or
?X subClassOf: participates-in some ?Y, or
?X subClassOf: encodes some (participates-in some ?Y)
70. Ontology repair and disambiguation
?X subClassOf: part-of some ?Y, or
?X subClassOf: encodes some (part-of some ?Y), or
?X subClassOf: participates-in some ?Y, or
?X subClassOf: encodes some (participates-in some ?Y)
All four interpretations are disjoint!
Create new interpretation for part-of:
?X subClassOf:
part-of some ?Y or
encodes some (part-of some ?Y) or
participates-in some ?Y or
encodes some (participates-in some ?Y)
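The relaxed reading is an ordinary union class expression; a continuation sketch for one concrete X/Y pair (hypothetical IRIs; note that ObjectUnionOf is not available in the EL profile):

    OWLObjectProperty encodes = df.getOWLObjectProperty(IRI.create("http://example.org/demo#encodes"));
    OWLClass hxk1 = df.getOWLClass(IRI.create("http://example.org/demo#Hxk1"));
    OWLClass target = df.getOWLClass(IRI.create("http://example.org/demo#Chromosome6_Crick_strand"));

    OWLClassExpression relaxed = df.getOWLObjectUnionOf(
        df.getOWLObjectSomeValuesFrom(isPartOf, target),
        df.getOWLObjectSomeValuesFrom(encodes, df.getOWLObjectSomeValuesFrom(isPartOf, target)),
        df.getOWLObjectSomeValuesFrom(participatesIn, target),
        df.getOWLObjectSomeValuesFrom(encodes, df.getOWLObjectSomeValuesFrom(participatesIn, target)));
    manager.addAxiom(ont, df.getOWLSubClassOfAxiom(hxk1, relaxed));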
71. Inference of revised RDF
representation
• Query OWL ontology for relational patterns that were used
in relation expansion
• generates deductive closure of a set of RDF triples with
respect to inferences in OWL
• naive implementation:
o given a pattern P(?X, ?Y), substitute all combinations of
named classes for ?X and ?Y
o runtime: O(n^2) for n named classes
o a more efficient implementation is work in progress
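A continuation sketch of the naive strategy for a single pattern, reusing the reasoner from the earlier sketch:

    // O(n^2) over the n named classes in the ontology signature
    for (OWLClass cx : ont.getClassesInSignature()) {
        for (OWLClass cy : ont.getClassesInSignature()) {
            OWLAxiom candidate = df.getOWLSubClassOfAxiom(cx,
                df.getOWLObjectSomeValuesFrom(isPartOf, cy));
            if (reasoner.isEntailed(candidate)) {
                // write the triple <cx> <partOf> <cy> back to RDF here
            }
        }
    }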
72. Inference of revised RDF
representation
In the definition:
?X subClassOf:
part-of some ?Y or
encodes some (part-of some ?Y) or
participates-in some ?Y or
encodes some (participates-in some ?Y)
one or more of the classes in the disjunction may become
unsatisfiable!
• reasoner can be used to decide which interpretation is
correct
• eliminate remaining interpretations
• useful to "split" relations in RDF that have multiple
conflicting meanings
73. Summary - RDF and OWL
RDF provides
• light-weight semantics
• fast queries
• highly scalable implementations
• large volumes of data (e.g., DBPedia, other Linked Data
repositories)
OWL provides
• Constructs to formalize the intended semantics
• The OWL API to develop, manage, and serialize OWL
ontologies
• Efficient reasoners to compute inferences, extract modules and
generate explanations
• Syntactic subsets (the profiles) for better performance, albeit some
inferences may be lost
74. Summary - Reasoning in OWL
• verification: reveal contradictory definitions of classes
(unsatisfiable classes), conflicting conceptualizations and
reveal hidden inferences (that may be considered invalid
through manual verification)
• repair: through explicit definitions using disjunction,
constraints can be relaxed and contradictions reduced
• more facts: OWL queries for relational patterns can be
used to generate RDF triples that are closed against the
constraints and axioms of an OWL knowledge base
• powerful queries: queries in OWL can be made for
instances and for classes satisfying complex expressions
75. Conclusions
• ontologies are tools for better knowledge management
• ontology (philosophy) is a useful source of well-developed
theories that can be applied to ontology design, but only
when put into practice as a formalized ontology
• formal ontologies can help in getting us closer to the goal of
large-scale integration, verification and analysis of data
across domains and levels of granularity
• The combination of formal ontologies + scalable reasoning
will be instrumental in making sense of the Semantic Web.