This presentation describes the public data service FactForge and discusses the value of inferred knowledge over Linked Open Data (LOD). FactForge is a reason-able view of a segment of the LOD cloud: the biggest body of heterogeneous general knowledge on which inference is performed, supplied with a reference layer for quick access. The presentation shows examples of inferred statements across LOD datasets.
- FactForge is a semantic data service that provides access to a large collection of heterogeneous linked open data through inference and a reference ontology.
- It allows exploration of inferred knowledge through SPARQL queries, an RDF search, and relationship browsing.
- Challenges include cleaning input data, detecting contradictions, consistency checking, and curating and upgrading the methodology. FactForge has been used to generate linked data from unstructured sources and integrate metadata.
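The "reason-able view" idea is that statements entailed by the combined datasets are materialised so they can be queried directly. As a rough illustration of inference across LOD datasets (not FactForge's actual rule engine), the sketch below forward-chains two simple rules, owl:sameAs symmetry and rdf:type propagation across sameAs links, over a toy triple set; all URIs are invented for the example:

```python
# Toy forward-chaining over RDF-style (subject, predicate, object) triples.
# Illustrative only -- not FactForge's actual inference engine.

SAME_AS = "owl:sameAs"
TYPE = "rdf:type"

def infer(triples):
    """Materialise sameAs symmetry/transitivity and propagate rdf:type
    statements across sameAs links until a fixpoint is reached."""
    closed = set(triples)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in closed:
            if p == SAME_AS:
                new.add((o, SAME_AS, s))               # symmetry
                for s2, p2, o2 in closed:
                    if s2 == o and p2 == SAME_AS:
                        new.add((s, SAME_AS, o2))      # transitivity
                    if s2 == o and p2 == TYPE:
                        new.add((s, TYPE, o2))         # type propagation
        if not new <= closed:
            closed |= new
            changed = True
    return closed

# Two "datasets" describing the same entity under different URIs:
dbpedia = {("dbpedia:Vienna", TYPE, "dbo:City")}
geonames = {("geonames:2761369", SAME_AS, "dbpedia:Vienna")}

inferred = infer(dbpedia | geonames)
# The GeoNames URI now carries the type asserted only in DBpedia.
```

After closure, a query against either URI sees the same facts, which is exactly what makes cross-dataset SPARQL queries over the merged body of knowledge useful.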
A Mathematical Approach to Ontology Authoring and Documentation (Christoph Lange)
This document proposes using OMDoc, a framework for representing formal knowledge, to improve ontology authoring and documentation. It describes how OMDoc can:
1) Provide better support for modularity, documentation at different granularities, and linking documentation to formal representations compared to languages like OWL.
2) Model existing ontologies and translate between OMDoc and OWL/RDF formats to leverage existing tools.
3) Allow comprehensive, integrated documentation of ontologies through features like literate programming. The approach is evaluated by reimplementing the FOAF ontology in OMDoc.
The Distributed Ontology Language (DOL): Use Cases, Syntax, and Extensibility (Christoph Lange)
The document discusses the Distributed Ontology Language (DOL) which aims to support semantic integration and interoperability across heterogeneous ontologies. DOL allows for logically heterogeneous ontologies, modular ontologies, and formal and informal links between ontologies. It has a formal semantics and can be serialized in XML, RDF, and text. Examples of applications that could benefit from DOL include an ontology repository engine and a multilingual map user interface driven by aligned ontologies.
The document discusses the concepts and implementation of linked data and the semantic web. It describes Cambridge University Library's COMET project which converted bibliographic records from MARC21 format to RDF triples and published them as linked open data with HTTP URIs. The project aimed to release data for open use and gain experience working with semantic web technologies like RDF, SPARQL and triplestores. Key challenges included dealing with IPR issues in MARC21 records and developing tools to transform and link the data.
UBL is an XML standard for business documents being developed to fulfill the promise of XML for business. UBL aims to define a standard cross-industry vocabulary for business documents to enable easier and cheaper electronic data interchange. UBL is committed to international semantic standardization and is fully compliant with ebXML Core Components, which provide a basis for standardizing business data while remaining syntax neutral. However, a concrete XML syntax is still needed to enable widespread adoption, and UBL is developing XML document schemas and design rules to serve this purpose.
This document discusses semantic technologies and digital data processing. It provides an overview of semantics and the semantic web, including XML, RDF, OWL, SPARQL, ontologies, and data models. It also discusses capturing semantics in XML documents, OWL, RDF schema, semantic web applications like cartographic searching, SKOS for knowledge organization systems, and the SKOS Play visualization tool.
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect match (Christoph Lange)
The Distributed Ontology Language is a meta-language for integrating
ontologies written in different languages. Our notion of “distributed”
comprises logical heterogeneity within ontologies, modularity and reuse,
and links across ontologies in different places of the Web. Not only
can ontologies be distributed across the Web, but DOL's set of
supported ontology languages can also be extended in a decentralized way.
For this functionality, DOL builds on the Linked Open Data (LOD)
principles. But DOL also contributes to LOD use cases. Many current
LOD applications are limited by the weak expressivity of the RDF and
RDFS languages commonly used to express data and vocabularies.
Completely switching to a more expressive language would impair
scalability to big datasets. DOL addresses the scalability and
expressivity requirements by allowing each aspect of a dataset
to be represented in the most suitable language while keeping
these different representations connected. This is particularly
useful in geographic
information systems, where big datasets (e.g. Linked Geo Data, the LOD
version of OpenStreetMap) need to be integrated with formalisations of
complex spatial notions (e.g. in the first-order language Common Logic).
This document provides an introduction to Linked Open Data (LOD) and how to create and work with LOD datasets. It explains that LOD combines Linked Data with open licenses to publish structured data on the web. It describes the 5 star open data model and shows how to make data available in different formats like RDF, JSON-LD and Turtle. It also introduces SPARQL and provides examples of querying LOD sources like Wikidata to find information. Finally, it discusses software for working with LOD and challenges like data quality and performance at large scales.
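At its core, a SPARQL SELECT query is a set of triple patterns matched against a graph. A minimal pure-Python sketch of basic graph pattern matching (variables prefixed with "?"; no real SPARQL parsing and no live endpoint access, the Wikidata-style URIs are just illustrative) looks like:

```python
# Minimal basic-graph-pattern matcher -- the core idea behind a SPARQL
# SELECT query. Variables are strings starting with "?".

def match(patterns, triples, binding=None):
    """Yield variable bindings under which every pattern matches a triple."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    (s, p, o), rest = patterns[0], patterns[1:]
    for triple in triples:
        b = dict(binding)
        ok = True
        for term, value in zip((s, p, o), triple):
            if term.startswith("?"):
                if b.get(term, value) != value:
                    ok = False          # variable already bound differently
                    break
                b[term] = value
            elif term != value:
                ok = False              # constant term does not match
                break
        if ok:
            yield from match(rest, triples, b)

data = {
    ("wd:Q1741", "rdfs:label", "Vienna"),
    ("wd:Q1741", "wdt:P31", "wd:Q515"),   # instance of: city
    ("wd:Q64",   "rdfs:label", "Berlin"),
    ("wd:Q64",   "wdt:P31", "wd:Q515"),
}
# Roughly: SELECT ?label WHERE { ?c wdt:P31 wd:Q515 . ?c rdfs:label ?label }
query = [("?c", "wdt:P31", "wd:Q515"), ("?c", "rdfs:label", "?label")]
labels = sorted(b["?label"] for b in match(query, data))
print(labels)  # ['Berlin', 'Vienna']
```

A real endpoint such as Wikidata's adds parsing, indexing, and many operators on top, but the join-by-shared-variable mechanism is the same.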
A Unified Approach for Representing Metametadata (Kai Eckert)
This document proposes a unified approach for representing metametadata, or statements about statements, using RDF reification. It discusses two scenarios where metametadata is needed: crosswalks between metadata schemas, and integrating metadata from different sources. For each scenario, examples are given of how metametadata could provide additional context about the origin and generation of statements to help debug errors, update rules, and assess statement quality. The approach uses RDF and SPARQL to represent and query this metametadata.
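RDF reification turns a statement into a resource about which further statements can be made. The following hand-rolled sketch (plain tuples rather than a real RDF library; the ex:generatedBy property and rule name are made up for illustration) shows how provenance can be attached to a single statement:

```python
# Sketch of RDF reification: a statement becomes a resource with
# rdf:subject / rdf:predicate / rdf:object arcs, so that provenance
# ("metametadata") can be attached to it.

def reify(graph, statement, statement_id, provenance):
    s, p, o = statement
    graph |= {
        (statement_id, "rdf:type", "rdf:Statement"),
        (statement_id, "rdf:subject", s),
        (statement_id, "rdf:predicate", p),
        (statement_id, "rdf:object", o),
    }
    for prov_p, prov_o in provenance:
        graph.add((statement_id, prov_p, prov_o))
    return graph

g = {("ex:book1", "dc:creator", "ex:alice")}
g = reify(
    g,
    ("ex:book1", "dc:creator", "ex:alice"),
    "ex:stmt1",
    [("dc:source", "ex:libraryCatalogue"),      # where the statement came from
     ("ex:generatedBy", "ex:crosswalkRule7")],  # hypothetical crosswalk rule id
)
```

With the origin recorded per statement, one can later query for everything produced by a given crosswalk rule in order to debug it or reassess its output, which is exactly the use case the document describes.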
The Dublin Core 1:1 Principle in the Age of Linked Data (Richard Urban)
Presentation given at the International Conference on Dublin Core and Metadata Applications, Austin, TX. October 9, 2014. See associated paper http://dcevents.dublincore.org/IntConf/dc-2014/paper/view/263
PalGov Tutorial 2, Session 13.3: Data Integration and Fusion using RDF (Mustafa Jarrar)
This document discusses data integration and fusion using RDF. It provides an example of integrating data from three governmental agencies about companies by transforming each database into RDF and then concatenating the RDF graphs. Entities are linked across datasets using URIs and properties like owl:sameAs. Integrating RDF data in this way provides an integrated view where applications can query the datasets as a single database. References are provided on linked data and the emerging web of linked data.
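The concatenate-then-link approach described above can be sketched in a few lines: union the per-agency graphs, then resolve owl:sameAs links so the merged graph can be queried as one database. The "smushing" below, rewriting linked URIs to a canonical one, is a naive illustration with invented URIs, not a production technique:

```python
# Sketch of RDF-based integration: concatenate graphs from different
# agencies, then "smush" owl:sameAs-linked URIs onto one canonical
# identifier so the merged graph behaves like a single database.

agency_a = {("a:comp42", "a:name", "Acme Ltd")}
agency_b = {("b:c-0042", "b:taxId", "12345")}
links    = {("b:c-0042", "owl:sameAs", "a:comp42")}

merged = agency_a | agency_b | links     # RDF graph merge is just set union

# Map each URI to a canonical representative chosen from its sameAs links.
canon = {}
for s, p, o in merged:
    if p == "owl:sameAs":
        canon[s] = o

def smush(graph):
    return {(canon.get(s, s), p, canon.get(o, o)) for s, p, o in graph}

fused = smush(merged)
# All facts about the company now share one subject URI ("a:comp42").
```

An application querying `fused` no longer needs to know which agency contributed which fact, which is the integrated-view benefit the document highlights.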
The document defines key terms related to semantic technologies and the semantic web including:
- Linked Open Data (LOD) which publishes open data according to semantic web standards and links it to other sources to create a web of data.
- LOD2, an EU project developing infrastructure for building LOD.
- OWL, a language for more expressive semantic modeling.
- R2RML, a standard for mapping data in relational databases to RDF.
- RDF, the standard data model using triples to represent information.
Matching and merging anonymous terms from web sources (IJwest)
This paper describes a workflow for simplifying and matching special-language terms in RDF generated from web sources.
Linked Open Data to support content-based Recommender Systems (Vito Ostuni)
This document proposes using Linked Open Data to support content-based recommender systems by providing rich item descriptions. It describes the main drawback of traditional content-based recommender systems as their limited ability to analyze content. A vector space model is then adapted to represent RDF graphs from Linked Open Data as vectors, allowing item similarity to be computed based on semantic relationships. An evaluation of this approach using MovieLens data demonstrates improvements in recommendation precision and recall.
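The adapted vector space model can be sketched as follows: each item becomes a vector over the RDF (predicate, object) features of its description, and item similarity is plain cosine similarity. This is a simplification for illustration, with made-up DBpedia-style URIs, not the paper's exact property-indexed formulation:

```python
import math
from collections import Counter

# Sketch: represent each item by the bag of (predicate, object) features
# from its RDF description, then compare items with cosine similarity.

def features(graph, item):
    return Counter((p, o) for s, p, o in graph if s == item)

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0

g = {
    ("ex:film1", "dbo:director", "dbr:Nolan"),
    ("ex:film1", "dbo:genre", "dbr:Thriller"),
    ("ex:film2", "dbo:director", "dbr:Nolan"),
    ("ex:film2", "dbo:genre", "dbr:SciFi"),
    ("ex:film3", "dbo:genre", "dbr:Comedy"),
}

sim12 = cosine(features(g, "ex:film1"), features(g, "ex:film2"))
sim13 = cosine(features(g, "ex:film1"), features(g, "ex:film3"))
# Films sharing a director score higher than films sharing nothing.
```

Because the features come from the LOD graph rather than free text, the similarity reflects semantic relationships (same director, same genre) that a purely textual content analysis would miss.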
The document provides an introduction to Dublin Core metadata, including its history and development. It describes Dublin Core elements, refinements, vocabularies, and provides examples of Dublin Core records at both the collection and item levels, including records for manuscripts, books, and other resources.
Bernhard Haslhofer is a postdoc researcher at Cornell University studying linked data, user-contributed data, and data interoperability. He discusses Linked (Open) Data, which uses URIs and RDF to publish and link structured data on the web. The key principles are using URIs to identify things, providing useful information about those URIs when dereferenced, and including links to other URIs. Enabling technologies include URIs, RDF, RDFS/OWL for vocabularies, SPARQL for querying, and best practices for publishing vocabularies and data. Useful tools are also presented.
Part 4 of tutorials at DC2008, Berlin. (International Conference on Dublin Core and Metadata Applications). See also part 1-3 by Jane Greenberg, Pete Johnston, and Mikael Nilsson on DC history, concepts, and other schemas. This part focuses on practical issues.
From Exploratory Search to Web Search and back - PIKM 2010 (Roku)
The power of search is without doubt one of the main reasons for the success of the Web. Currently available Web search engines return results with high precision. Nevertheless, if we limit our attention to lookup search, we miss another important search task. In exploratory search, the user wants not only to find documents relevant to her query but is also interested in learning, discovering, and understanding novel knowledge on complex and sometimes unknown topics.
In this paper we address this issue by presenting LED, a web-based system that aims to improve (lookup) Web search by enabling users to properly explore the knowledge associated with their queries. We rely on DBpedia to explore the semantics of the keywords within a query, suggesting potentially interesting related topics/keywords to the user.
“Publishing and Consuming Linked Data. (Lessons learnt when using LOD in an a... (Marta Villegas)
This document discusses lessons learned from using linked open data in applications. It describes converting metadata from a language resource catalogue into RDF triples, including resolving complex instances like people and organizations. Issues addressed include data enrichment, linking to external datasets, and making implicit relations explicit. The goals of displaying comprehensive data to users and aggregating external data sensitively are discussed.
This document provides an overview of RDF Schema (RDFS), which extends RDF to enable the description of classes and properties. RDFS defines classes using rdfs:Class and subclasses using rdfs:subClassOf. Properties are instances of rdf:Property, and their domains and ranges can be specified using rdfs:domain and rdfs:range. RDFS provides a basic vocabulary for defining other application-specific vocabularies through classes, subclasses, and properties.
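The RDFS vocabulary licenses simple entailments: if a property has an rdfs:domain C and the property is used on a subject, that subject is a C; and rdfs:subClassOf propagates rdf:type upward. A small hand-rolled sketch of these two rules (known as rdfs2 and rdfs9 in the RDF Semantics specification; the ex: URIs are invented):

```python
# Sketch of two RDFS entailment rules:
#   rdfs2: (p rdfs:domain C) and (s p o)            =>  (s rdf:type C)
#   rdfs9: (A rdfs:subClassOf B) and (s rdf:type A) =>  (s rdf:type B)

def rdfs_closure(g):
    g = set(g)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in g:
            for s2, p2, o2 in g:
                if p2 == "rdfs:domain" and p == s2:
                    new.add((s, "rdf:type", o2))                   # rdfs2
                if p == "rdf:type" and s2 == o and p2 == "rdfs:subClassOf":
                    new.add((s, "rdf:type", o2))                   # rdfs9
        if not new <= g:
            g |= new
            changed = True
    return g

entailed = rdfs_closure({
    ("ex:teaches", "rdfs:domain", "ex:Teacher"),
    ("ex:Teacher", "rdfs:subClassOf", "foaf:Person"),
    ("ex:alice", "ex:teaches", "ex:Logic101"),
})
# From one asserted fact about ex:alice, two type statements follow.
```

This is how an application-specific vocabulary defined with rdfs:Class, rdfs:subClassOf, rdfs:domain, and rdfs:range yields new statements from plain instance data.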
RDF and Open Linked Data, a first approach (horvadam)
This document discusses the potential benefits of libraries publishing their data as linked open data using semantic web technologies. It describes how linked data allows for standardized access to data across the web as a single API. Libraries can make their data more discoverable on the web and searchable by services like Google by publishing it as linked open data. Semantic web technologies like RDF and SPARQL allow for more powerful search capabilities. Several large libraries are already publishing portions of their data as linked open data, including authority files and entire catalogs. The document outlines some semantic web applications libraries could use to enhance discovery and provides examples of vocabularies for describing different types of metadata.
WebSpa is a tool that allows quick, intuitive (and even fun) querying of arbitrary SPARQL endpoints. WebSpa runs in the web browser and does not require the installation of any additional software. The tool manages a large variety of predefined SPARQL endpoints and allows new ones to be added. A user account makes it possible to save both queries and their results on the local computer, as well as to edit the queries further. The application is written in Java and Flex; it uses the Jena and ARQ application programming interfaces to execute the queries, and the results are processed and displayed using Flex.
The document provides an overview of the work done at DERI Galway, including developing technologies like SIOC, ActiveRDF, and BrowseRDF to interconnect online communities and enable semantic applications. It also describes JeromeDL, a digital library system that uses semantic metadata and services to allow users to collaboratively browse and share knowledge.
Exposing Bibliographic Information as Linked Open Data using Standards-based ... (Nikolaos Konstantinou)
The Linked Open Data (LOD) movement is steadily gaining worldwide acceptance. In this paper we describe how LOD is generated from digital repositories that contain bibliographic information, adopting international standards. The available options and the choices we made are presented and justified; we also provide a technical description, the methodology we followed, the possibilities and difficulties along the way, and the respective benefits and drawbacks. Detailed examples are provided regarding the implementation and query capabilities, and the paper concludes with a discussion of the results, the challenges associated with our approach, and our most important observations and future plans.
The document discusses exposing relational databases as Resource Description Framework (RDF). It provides an overview of relational database models (RDBM) and RDF data models. It then describes how to represent data from relational databases as RDF triples and query the semantic data using SPARQL. The key aspects covered are mapping relational schemas and data to RDF graphs, constructing RDF graphs from relational databases using SQL queries, and the advantages of representing data as RDF.
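The row-to-triples mapping described above can be sketched with the standard library alone: each row's primary key mints a subject URI and each remaining column becomes a predicate. The ex: URI scheme here is invented purely for illustration; a production mapping would follow a standard such as R2RML:

```python
import sqlite3

# Sketch: map relational rows to RDF-style triples. One subject URI per
# row (from the primary key), one predicate per column. Illustrative
# naming only -- a real mapping would be expressed in R2RML.

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE company (id INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany("INSERT INTO company VALUES (?, ?, ?)",
                 [(1, "Acme Ltd", "Vienna"), (2, "Globex", "Berlin")])

def table_to_triples(conn, table, key):
    cur = conn.execute(f"SELECT * FROM {table}")
    cols = [d[0] for d in cur.description]
    triples = set()
    for row in cur:
        subject = f"ex:{table}/{row[cols.index(key)]}"   # URI minted from the key
        for col, value in zip(cols, row):
            if col != key:
                triples.add((subject, f"ex:{table}#{col}", value))
    return triples

company_triples = table_to_triples(conn, "company", "id")
```

Once the rows are triples, the resulting graph can be loaded into a triplestore and queried with SPARQL alongside data from other sources, which is the advantage the document emphasises.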
The document compares the RDF and property graph data models. It discusses how RDF does not uniquely identify relationships between resources and cannot qualify relationships with attributes, whereas labeled property graphs (LPGs) can model both with unique relationship IDs and edge properties. It also notes that RDF uses quads for named graphs while property graphs currently have no equivalent, and that RDF supports multivalued properties using lists while property graphs use arrays. The document cautions that comparisons of RDF and graph databases should focus on their underlying models rather than specific storage implementations. It demonstrates that RDF can be modeled in a native graph database and that semantics are defined through optional rules and vocabularies rather than being inherent to RDF.
This document describes a new approach called BLOOMS+ for performing contextual ontology alignment of Linked Open Data datasets with an upper ontology. BLOOMS+ leverages contextual information from Wikipedia category hierarchies to compute similarities between concepts in different ontologies. It computes class similarity, contextual similarity between super classes, and an overall similarity to determine equivalence or subsumption relationships between concepts during alignment. The approach is evaluated on aligning several LOD ontologies to the PROTON upper ontology, outperforming existing solutions. Future work involves extending this approach to utilize more contextual sources and enable seamless querying across aligned datasets.
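BLOOMS+'s contextual similarity compares the Wikipedia category neighbourhoods of two concepts. The paper's actual scoring is more involved; as a loose illustration of the intuition only, a plain Jaccard overlap of (hypothetical) category sets already separates related classes from unrelated ones:

```python
# Loose illustration of contextual similarity between two ontology
# classes via overlap of their Wikipedia category neighbourhoods.
# This is a plain Jaccard coefficient, NOT the actual BLOOMS+ scoring.

def jaccard(a, b):
    return len(a & b) / len(a | b) if a | b else 0.0

# Hypothetical category neighbourhoods for three classes:
ctx_lod_class    = {"Cities", "Populated places", "Human settlements"}
ctx_proton_class = {"Populated places", "Human settlements", "Geography"}
ctx_unrelated    = {"Proteins", "Biochemistry"}

related_score   = jaccard(ctx_lod_class, ctx_proton_class)
unrelated_score = jaccard(ctx_lod_class, ctx_unrelated)
# The LOD class aligns with the PROTON class, not the unrelated one.
```

Combining such a contextual score with class-name similarity is what lets the alignment decide between equivalence and subsumption more reliably than name matching alone.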
Linked Open (Geo)Data and the Distributed Ontology Language – a perfect matchChristoph Lange
The Distributed Ontology Language is a meta-language for integrating
ontologies written in different languages. Our notion of “distributed”
comprises logical heterogeneity within ontologies, modularity and reuse,
and links across ontologies in different places of the Web. Not only
can ontologies be distributed across the Web, but DOL's supply of
supported ontology languages can also be extended in a decentral way.
For this functionality, DOL builds on the Linked Open Data (LOD)
principles. But DOL also contributes to LOD use cases. Many current
LOD applications are limited by the weak expressivity of the RDF and
RDFS languages commonly used to express data and vocabularies.
Completely switching to a more expressive language would impair
scalability to big datasets. DOL addresses the scalability and
expressivity requirements by allowing to represent each aspect of a
dataset in the most suitable language and keeping these different
representations connected. This is particularly useful in geographic
information systems, where big datasets (e.g. Linked Geo Data, the LOD
version of OpenStreetMap) need to be integrated with formalisations of
complex spatial notions (e.g. in the first-order language Common Logic).
This document provides an introduction to Linked Open Data (LOD) and how to create and work with LOD datasets. It explains that LOD combines Linked Data with open licenses to publish structured data on the web. It describes the 5 star open data model and shows how to make data available in different formats like RDF, JSON-LD and Turtle. It also introduces SPARQL and provides examples of querying LOD sources like Wikidata to find information. Finally, it discusses software for working with LOD and challenges like data quality and performance at large scales.
A Unified Approach for Representing MetametadataKai Eckert
This document proposes a unified approach for representing metametadata, or statements about statements, using RDF reification. It discusses two scenarios where metametadata is needed: crosswalks between metadata schemas, and integrating metadata from different sources. For each scenario, examples are given of how metametadata could provide additional context about the origin and generation of statements to help debug errors, update rules, and assess statement quality. The approach uses RDF and SPARQL to represent and query this metametadata.
The Dublin Core 1:1 Principle in the Age of Linked DataRichard Urban
Presentation given at the International Conference on Dublin Core and Metadata Applications, Austin, TX. October 9, 2014. See associated paper http://dcevents.dublincore.org/IntConf/dc-2014/paper/view/263
Pal gov.tutorial2.session13 3.data integration and fusion using rdfMustafa Jarrar
This document discusses data integration and fusion using RDF. It provides an example of integrating data from three governmental agencies about companies by transforming each database into RDF and then concatenating the RDF graphs. Entities are linked across datasets using URIs and properties like owl:sameAs. Integrating RDF data in this way provides an integrated view where applications can query the datasets as a single database. References are provided on linked data and the emerging web of linked data.
The document defines key terms related to semantic technologies and the semantic web including:
- Linked Open Data (LOD) which publishes open data according to semantic web standards and links it to other sources to create a web of data.
- LOD2, an EU project developing infrastructure for building LOD.
- OWL, a language for more expressive semantic modeling.
- R2RML, a standard for mapping data in relational databases to RDF.
- RDF, the standard data model using triples to represent information.
Matching and merging anonymous terms from web sourcesIJwest
This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis 
paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching spec This paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching specThis paper describes a workflow of simplifying and matching spec ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated ial language terms in RDF generated i
Linked Open Data to support content based Recommender SystemsVito Ostuni
This document proposes using Linked Open Data to support content-based recommender systems by providing rich item descriptions. It describes the main drawback of traditional content-based recommender systems as their limited ability to analyze content. A vector space model is then adapted to represent RDF graphs from Linked Open Data as vectors, allowing item similarity to be computed based on semantic relationships. An evaluation of this approach using MovieLens data demonstrates improvements in recommendation precision and recall.
The document provides an introduction to Dublin Core metadata, including its history and development. It describes Dublin Core elements, refinements, vocabularies, and provides examples of Dublin Core records at both the collection and item levels, including records for manuscripts, books, and other resources.
Bernhard Haslhofer is a postdoc researcher at Cornell University studying linked data, user-contributed data, and data interoperability. He discusses Linked (Open) Data, which uses URIs and RDF to publish and link structured data on the web. The key principles are using URIs to identify things, providing useful information about those URIs when dereferenced, and including links to other URIs. Enabling technologies include URIs, RDF, RDFS/OWL for vocabularies, SPARQL for querying, and best practices for publishing vocabularies and data. Useful tools are also presented.
Part 4 of tutorials at DC2008, Berlin. (International Conference on Dublin Core and Metadata Applications). See also part 1-3 by Jane Greenberg, Pete Johnston, and Mikael Nilsson on DC history, concepts, and other schemas. This part focuses on practical issues.
From Exploratory Search to Web Search and back - PIKM 2010Roku
The power of search is without doubt one of the main reasons for the success of the Web. Currently available Web search engines return results with high precision. Nevertheless, if we limit our attention to lookup search, we miss another important search task. In exploratory search, the user wants not only to find documents relevant to her query but is also interested in learning, discovering, and understanding novel knowledge on complex and sometimes unknown topics.
In the paper we address this issue by presenting LED, a web-based system that aims to improve (lookup) Web search by enabling users to properly explore the knowledge associated with their queries. We rely on DBpedia to explore the semantics of the keywords within the query, suggesting potentially interesting related topics/keywords to the user.
“Publishing and Consuming Linked Data. (Lessons learnt when using LOD in an a...Marta Villegas
This document discusses lessons learned from using linked open data in applications. It describes converting metadata from a language resource catalogue into RDF triples, including resolving complex instances like people and organizations. Issues addressed include data enrichment, linking to external datasets, and making implicit relations explicit. The goals of displaying comprehensive data to users and aggregating external data sensitively are discussed.
This document provides an overview of RDF Schema (RDFS), which extends RDF to enable the description of classes and properties. RDFS defines classes using rdfs:Class and subclasses using rdfs:subClassOf. Properties are instances of rdf:Property, and their domains and ranges can be specified using rdfs:domain and rdfs:range. RDFS provides a basic vocabulary for defining other application-specific vocabularies through classes, subclasses, and properties.
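The RDFS machinery summarized above can be illustrated with a minimal sketch in plain Python. The vocabulary (ex:Painter, ex:paints, and so on) is invented for the example; the two entailments shown correspond to the standard RDFS rules for rdfs:domain and rdfs:subClassOf.

```python
# Minimal sketch of RDFS entailment, using triples as Python tuples.
# The ex: vocabulary is invented for illustration.

RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"

triples = {
    ("ex:Painter", SUBCLASS, "ex:Artist"),       # schema
    ("ex:Artist", SUBCLASS, "ex:Person"),
    ("ex:paints", DOMAIN, "ex:Painter"),
    ("ex:daVinci", "ex:paints", "ex:MonaLisa"),  # data
}

def infer(graph):
    """Apply the rdfs2 (domain) and rdfs9/rdfs11 (subclass) rules to a fixpoint."""
    graph = set(graph)
    while True:
        new = set()
        for s, p, o in graph:
            # rdfs2: (p rdfs:domain C) and (s p o)  =>  (s rdf:type C)
            for _, _, c in {t for t in graph if t[0] == p and t[1] == DOMAIN}:
                new.add((s, RDF_TYPE, c))
            # rdfs9: (s rdf:type C) and (C rdfs:subClassOf D)  =>  (s rdf:type D)
            if p == RDF_TYPE:
                for _, _, d in {t for t in graph if t[0] == o and t[1] == SUBCLASS}:
                    new.add((s, RDF_TYPE, d))
            # rdfs11: rdfs:subClassOf is transitive
            if p == SUBCLASS:
                for _, _, d in {t for t in graph if t[0] == o and t[1] == SUBCLASS}:
                    new.add((s, SUBCLASS, d))
        if new <= graph:
            return graph
        graph |= new

closed = infer(triples)
```

From four input triples the closure makes ex:daVinci a Painter (via the domain of ex:paints) and therefore also an Artist and a Person.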
RDF and Open Linked Data, a first approachhorvadam
This document discusses the potential benefits of libraries publishing their data as linked open data using semantic web technologies. It describes how linked data allows for standardized access to data across the web as a single API. Libraries can make their data more discoverable on the web and searchable by services like Google by publishing it as linked open data. Semantic web technologies like RDF and SPARQL allow for more powerful search capabilities. Several large libraries are already publishing portions of their data as linked open data, including authority files and entire catalogs. The document outlines some semantic web applications libraries could use to enhance discovery and provides examples of vocabularies for describing different types of metadata.
WebSpa is a tool that allows quick, intuitive (and even fun) querying of arbitrary SPARQL endpoints. WebSpa runs in the web browser and does not require the installation of any additional software. The tool manages a large variety of pre-defined SPARQL endpoints and allows the addition of new ones. A user account makes it possible to save both queries and their results on the local computer, as well as to edit the queries further. The application is written in both Java and Flex. It uses the Jena and ARQ application programming interfaces to perform the queries, and the results are processed and displayed using Flex.
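The SPARQL Protocol access that a tool like WebSpa relies on can be sketched without any of its actual code: the query text is URL-encoded into a single GET parameter. The helper name is an assumption, DBpedia's public endpoint is used only as a well-known example, and the `format` parameter is a common endpoint convenience (content negotiation via the Accept header is the standard route).

```python
from urllib.parse import urlencode

# Sketch of the SPARQL Protocol GET binding: send the query as a
# URL-encoded 'query' parameter to the endpoint. Illustrative only.

def sparql_get_url(endpoint, query, fmt="application/sparql-results+json"):
    params = urlencode({"query": query, "format": fmt})
    return f"{endpoint}?{params}"

url = sparql_get_url(
    "https://dbpedia.org/sparql",
    "SELECT ?s WHERE { ?s a ?type } LIMIT 10",
)
```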
The document provides an overview of the work done at DERI Galway, including developing technologies like SIOC, ActiveRDF, and BrowseRDF to interconnect online communities and enable semantic applications. It also describes JeromeDL, a digital library system that uses semantic metadata and services to allow users to collaboratively browse and share knowledge.
Exposing Bibliographic Information as Linked Open Data using Standards-based ...Nikolaos Konstantinou
The Linked Open Data (LOD) movement is constantly gaining worldwide acceptance. In this paper we describe how LOD is generated in the case of digital repositories that contain bibliographic information, adopting international standards. The available options and the respective choices are presented and justified; we also provide a technical description, the methodology we followed, the possibilities and difficulties along the way, and the respective benefits and drawbacks. Detailed examples are provided regarding the implementation and query capabilities, and the paper concludes with a discussion of the results, the challenges associated with our approach, and our most important observations and future plans.
The document discusses exposing relational databases as Resource Description Framework (RDF). It provides an overview of relational database models (RDBM) and RDF data models. It then describes how to represent data from relational databases as RDF triples and query the semantic data using SPARQL. The key aspects covered are mapping relational schemas and data to RDF graphs, constructing RDF graphs from relational databases using SQL queries, and the advantages of representing data as RDF.
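The row-to-triples mapping described above can be sketched as follows. The table name, primary-key convention, and vocabulary URIs are invented for the example: each row becomes a subject URI, each column a predicate, each cell a value.

```python
# Sketch of a relational-to-RDF mapping. All names and URIs are invented.

BASE = "http://example.org/"

def row_to_triples(table, pk_col, row):
    """Map one relational row (a dict) to a list of RDF-style triples."""
    subject = f"{BASE}{table}/{row[pk_col]}"
    triples = [(subject, "rdf:type", f"{BASE}vocab/{table}")]
    for col, value in row.items():
        if col != pk_col and value is not None:
            triples.append((subject, f"{BASE}vocab/{table}#{col}", value))
    return triples

row = {"id": 7, "name": "Mona Lisa", "painter": "Leonardo da Vinci"}
triples = row_to_triples("painting", "id", row)
```

Real tools derive such mappings automatically from the schema (or from a declarative mapping language); the sketch only shows the shape of the result.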
The document compares the RDF and property graph data models. It discusses how RDF does not uniquely identify relationships between resources and cannot qualify relationships with attributes, whereas labeled property graphs (LPGs) can model these with unique relationship IDs and edge properties. It also notes that RDF uses quads for named graphs, for which property graphs currently have no equivalent, and that RDF supports multivalued properties using lists while property graphs use arrays. The document cautions against comparing RDF and graph databases by their specific storage implementations rather than their underlying models. It demonstrates that RDF can be modeled in a native graph database and that RDF semantics are defined through optional rules and vocabularies rather than being inherent to the model.
This document describes a new approach called BLOOMS+ for performing contextual ontology alignment of Linked Open Data datasets with an upper ontology. BLOOMS+ leverages contextual information from Wikipedia category hierarchies to compute similarities between concepts in different ontologies. It computes class similarity, contextual similarity between super classes, and an overall similarity to determine equivalence or subsumption relationships between concepts during alignment. The approach is evaluated on aligning several LOD ontologies to the PROTON upper ontology, outperforming existing solutions. Future work involves extending this approach to utilize more contextual sources and enable seamless querying across aligned datasets.
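BLOOMS+'s actual scoring formulas are more involved than this, but the core idea of comparing two concepts by the overlap of their Wikipedia category neighborhoods can be hinted at with a simple Jaccard measure; the category sets below are invented for illustration.

```python
# Illustrative only: not the BLOOMS+ formula, just the overlap intuition.

def jaccard(a, b):
    """Jaccard similarity of two sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0

# Invented Wikipedia-style category neighborhoods for two ontology classes.
tree_event = {"Events", "Occurrences", "Happenings"}
tree_meeting = {"Meetings", "Events", "Occurrences"}

score = jaccard(tree_event, tree_meeting)
```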
A Framework for Improved Access to Museum Databases in the Semantic WebMariana Damova, Ph.D
This paper presents a framework for processing museum databases according to a set of interlinked ontologies, including CIDOC-CRM, and loading them into a reason-able view of the web of data that provides additional links to datasets from the LOD cloud. The infrastructure allows accessing the data via SPARQL queries and verbalizing the query results in natural language using the GF formalism, which provides access in 18 natural languages.
The document defines ontologies as explicit descriptions of a domain that define concepts, properties, attributes, and constraints. It discusses the history of categorization in philosophy and the development of knowledge models like semantic nets and conceptual graphs. The document outlines different methods for building ontologies and different types of ontologies. It also discusses ontology tools like Protege and TopBraid Composer and how ontologies are used on the semantic web through languages like OWL.
This document discusses humanizing technologies and trends in developing technologies that are more human-centric. It provides examples of technologies being developed by Mozaika, a research center, to better integrate technologies into human lives in natural ways. Mozaika is working on projects involving summarization, skills matching, information management, publishing, and more using techniques like natural language processing, sentiment analysis, and semantic technologies.
1) The document discusses a vision for linking CRIS (Current Research Information Systems) and OAR (Open Access Repositories) data as Linked Data to improve interoperability in scholarly communication.
2) It proposes using common vocabularies and assigning persistent URIs to entities like publications, authors, projects and organizations to reduce duplication and improve data sharing across domains.
3) Next steps include adopting the KE CRIS-OAR data model and vocabulary and exploring how Linked Data approaches could be used to link publications, research data, and CRIS and OAR domains as part of the OpenAIREplus project.
Charleston 2012 - The Future of Serials in a Linked Data WorldProQuest
The educational objective of this session is to review today’s MARC-based environment in which the serial record predominates, and compare that with what might be possible in a future world of linked data. The session will inspire conversation and reflection on a number of questions. What will a world of statement-based rather than record-based metadata look like? What will a new environment mean for library systems, workflows, and information dissemination?
This document discusses linking open data with Drupal. It begins with an introduction to open data and the semantic web. It explains how to transform open data into linked data using ontologies and semantic metadata. Several Drupal modules are presented for importing, publishing, and querying linked data. The document concludes by proposing a hackathon where participants could consume, publish, and build applications with linked open government data and the Drupal framework.
This document provides an overview of a presentation on representing and connecting language data and metadata using linked data. It discusses the technological background of linked data and the collaborative research opportunities it provides for linguistics. It also outlines prospects for using linked data in linguistics by connecting annotated corpora, lexical-semantic resources, and linguistic databases to build a linguistic linked open data cloud.
The LOD2 project aims to make linked data the model of choice for next-generation IT systems. It focuses on very large RDF data management, enrichment and interlinking of data, and adaptive user interfaces. Run from 2010-2014 with a budget of 10.2 million euros, the LOD2 consortium includes universities, companies, and research organizations that develop and release linked data tools as part of an integrated technology stack to support the linked data lifecycle.
The document discusses the growth of RDF data volumes on the web. It notes that the Linked Open Data cloud currently contains over 325 datasets with more than 25 billion triples, and the size is doubling every year. It provides examples of popular datasets that are part of the Linked Open Data cloud.
The IP LodB project (for more details see iplod.io) capitalizes on LOD database thinking to build bridges between patent information and scientific knowledge, focusing on the individuals who codify new knowledge and their connected organizations, including those who apply patents in new products and services.
As its main outputs, IP LodB produced an intellectual property rights (IPR) linked open data (LOD) map (the IP LOD map) and tested the linkability of the European patent (EP) LOD database, while increasing the uniqueness of the data using different harmonization techniques.
These slides were developed for a NIPO workshop.
This document discusses the LODAC project which aims to connect museum data as linked open data. It describes gathering data from 14 Japanese museums and other sources, standardizing the data, and integrating it while maintaining links to original sources. The integrated data includes information on artworks, creators, and museums. Creator data is matched across sources to link artists to their works. The goals are to publish the data as reusable linked open data, create connections between cultural heritage information and other domains, and allow for new discoveries and uses of museum data through open connectivity.
This document summarizes a presentation about semantic technologies for big data. It discusses how semantic technologies can help address challenges related to the volume, velocity, and variety of big data. Specific examples are provided of large semantic datasets containing billions of triples and semantic applications that have integrated and analyzed disparate data sources. Semantic technologies are presented as a good fit for addressing big data's variety, and research is making progress in applying them to velocity and volume as well.
This document provides a summary of a talk given by Tope Omitola on using linked data for world sense-making. The talk discussed EnAKTing, a project focused on building ontologies from large-scale user participation and querying linked data. It also covered publishing and consuming public sector datasets as linked data, including challenges around data integration, normalization and alignment. The talk concluded with a discussion of linked data services and applications developed by the project to enhance findability, search, and visualization of linked data.
Information School, University of Washington, 2014-05-21: INFX 598 - Introducing Linked Data: concepts, methods and tools. Guest lecture (Module 9) "Doing Business with Semantic Technologies": Introduction to Ontotext and some of its products, clients and projects.
Also see the video: https://voicethread.com/myvoice/#thread/5784646/29625471/31274564
Linked Data Driven Data Virtualization for Web-scale Integrationrumito
- Linked data and data virtualization can help address challenges of growing data heterogeneity, complexity, and need for agility by providing a common data model and identifiers.
- Linked data uses RDF to represent information as graphs of triples connected by URIs, allowing different data sources to be integrated and queried together.
- As more data is published using common vocabularies and linking to existing URIs, it increases opportunities for discovery, integration and novel ways to extract value from diverse data sources.
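The integration idea in the bullets above can be given as a minimal sketch with invented URIs: two datasets that reuse the same URI merge by plain set union, and a query that joins them is ordinary triple-pattern matching.

```python
# Sketch of URI-based integration. All URIs and values are invented.

dataset_a = {
    ("ex:sofia", "ex:population", 1236000),
    ("ex:sofia", "ex:country", "ex:bulgaria"),
}
dataset_b = {
    ("ex:sofia", "ex:hostsEvent", "ex:aimsa2012"),
}

merged = dataset_a | dataset_b  # set union: shared URIs line up automatically

def match(graph, s=None, p=None, o=None):
    """Return triples matching a pattern; None acts as a wildcard."""
    return [t for t in graph
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]

# "Which events are hosted in places whose country we know?"
# joins across the two source datasets without any extra integration code.
events = [(o, match(merged, s, "ex:country")[0][2])
          for s, p, o in match(merged, p="ex:hostsEvent")]
```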
Aggregating Social Media for Enhancing Conference ExperiencesHouda khrouf
The document describes a system called Confomaton that aggregates social media to enhance conference experiences. Confomaton collects media items and event data from various sources, models the data semantically, and reconciles items to events in real-time. It provides a web interface and APIs to browse conferences and associated media. Future work aims to integrate more media sources, enrich event descriptions, refine reconciliation, and improve the user interface performance.
This document describes the generation of two Linked Data datasets from sensor data - LinkedSensorData and LinkedObservationData. LinkedSensorData contains descriptions of about 20,000 weather stations in the US with links to sensor observations. LinkedObservationData contains over a billion triples of sensor observations during major storms, linked to the weather stations. The datasets are generated by converting sensor data from MesoWest into the Observations and Measurements (O&M) format, then using an API to convert O&M to RDF and load it into a Virtuoso triplestore. The datasets are made available via SPARQL endpoints and a Pubby interface to allow querying and browsing of the sensor descriptions and observations.
Talk at the 3rd Keystone Training School - Keyword Search in Big Linked Data - Institute for Software Technology and Interactive Systems, TU Wien, Austria, 2017
Linked data driven EPCIS Event-based Traceability across Supply chain busine...Monika Solanki
This document discusses using semantic web and linked data principles to improve traceability in supply chains. It proposes a framework that uses GS1 standards like EPCIS and EPC to generate linked pedigrees from event data. Ontologies are developed to represent traceability data and processes. Extract, transform, and load processes are used to publish relational EPCIS data as linked open data. This allows traceability information to be interlinked and shared across supply chain partners.
The document discusses the CIARD (Coherence in Information for Agricultural Research for Development) framework for sharing agricultural data and information globally. It describes how CIARD is building various infrastructure elements like VocBench (a vocabulary server), the RING (a roadmap and registry of information nodes), and tools to convert data into linked open data using vocabularies. The goal is to aggregate and semantically link data from distributed repositories and make it accessible to those who need it.
Introduction to Ontology Concepts and TerminologySteven Miller
The document introduces an ontology tutorial that will cover basic concepts of the Semantic Web, Linked Data, and the Resource Description Framework data model as well as the ontology languages RDFS and OWL. The tutorial is intended for information professionals who want to gain an introductory understanding of ontologies, ontology concepts, and terminology. The tutorial will explain how to model and structure data as RDF triples and create basic RDFS ontologies.
This presentation gives insight into the overall Horizon 2020 Programme, and more specifically into the period 2018-2020, with emphasis on ICT. Mariana Damova is the National Contact Point for Horizon 2020 ICT in Bulgaria.
Geography of Letters - The Spirituality of Sofia in the Historic MemoryMariana Damova, Ph.D
Presentation of the project The Spirituality of Sofia in the Historic Memory at the Round table on the future perspectives for Digital humanities in SEE within the Summer School in Advanced Tools for Digital Humanities and IT
The document describes IndustryInform, a semantic-based search and recommendation service for business networking. It allows industrial enterprises to advertise themselves, helps potential clients and investors find matching businesses, and provides a data as a service facility (DaaS) through annual subscriptions or pay-per-query plans. The service uses semantic web technologies and linked data to power searches across a database of over 50 million information units about 300,000 companies in 7 countries. It has features like extended search, company/user registration, and results displayed in table or Google-like formats. The system was developed by Mozaika's Humanizing Technologies Lab and has an engineering team, graphic designer, and business/marketing team to manage it.
Mozaika is a research center and SME operating since 2013 in the areas of data science, natural language interfaces, and human insight. It provides consulting, R&D projects, and data as a service solutions tailored to human behavior. Mozaika has expertise in semantic technologies, cognitive systems, and multimodal interactivity. It has completed projects in business networking, human resources management, cultural heritage, education, and aerospace with clients and partners from both private companies and research organizations.
This document summarizes Mozaika, a research center focused on humanizing technologies. It discusses technologies that make emerging technologies more understandable and give people more control, including reducing data complexity through semantic technologies. It provides examples of Mozaika's projects involving skills matching, city experience summarization, satellite communications, linked open data, and e-publishing. The goal is for technology to better support and enhance humanity.
Communication channels for the european single digital marketMariana Damova, Ph.D
Presentation about the importance of tackling the multilinguality in the strategy agenda for the European Digital Single Market, and about the role of language technology and the European language technology community in solving this issue endorsed by public funding
This is a presentation targeted at leaders of cultural institutions in Bulgaria to inform them about the opportunities to publish cultural content in Bulgariana and in Europeana, and about the benefits of doing so.
NLIWoD ISWC 2014 - Multilingual Retrieval Interface for Structured data on th...Mariana Damova, Ph.D
This presentation describes a Multilingual Retrieval Interface for Structured Data on the Web, a talk given at the NLIWoD workshop at ISWC 2014. The approach is based on the Grammatical Framework and on Semantic Web and Linked Data technologies.
Presentation held at a meeting in Bulgaria (Varna Regional Library), co-organized by Europeana, Bulgariana, Varna Regional Library, and BBIA, about Europeana.
Multilingual Access to Cultural Heritage Content on the Semantic Web - Acl2013Mariana Damova, Ph.D
The document discusses building an ontology-based application to communicate museum content in multiple languages on the Semantic Web. It aims to make cultural heritage accessible to both humans and computers by generating natural language descriptions from semantic data. The application uses the Grammatical Framework to linearize multiple museum datasets and ontologies into 15 languages. It addresses challenges in the cross-linguistic representation of classes, properties, word order, tense, and reference. The system was demonstrated by generating descriptions of paintings from the Louvre museum in English and French.
This presentation is an overview of the Bulgarian participation in the virtual museum Europeana, and the path of establishing a National Aggregator to Europeana.
This document summarizes a presentation given by Mariana Damova on using semantic technologies and Europeana data. It discusses how Europeana data has been converted to RDF and loaded into the OWLIM semantic graph database. This allows linking Europeana data to other datasets to enable queries across multiple sources. Examples of queries over Europeana and other cultural heritage data are provided. Future work on projects like Europeana Creative is also mentioned.
FactForge AIMSA 2012
1. FactForge: Data Service or the Diversity of Inferred Knowledge over LOD
Mariana Damova, PhD, Kiril Simov, Zdravko Tashev, Atanas Kiryakov
AIMSA’2012
September 2012
2. Ontotext
– Top-5 provider of core Semantic Technology
– Established in year 2000; offices in Bulgaria, UK, USA
– Active both in research and commercial projects (FP7 funding for 10 years)
• 360° semantic technology – unique portfolio:
– Semantic Databases: high-performance RDF DBMS, scalable reasoning
– Semantic Search: text-mining (IE), metadata generation, Information Retrieval (IR)
– Web Mining: focused crawling, screen scraping, data fusion
– Linked Data Management and Data Integration
• Good recognition in the SemTech community
– Ontotext pages are ranked #1 for “semantic annotation” and “semantic repository” on GYM, and #3 for “linked data management” on Google
• Several joint ventures and subsidiaries
– Innovantage: a leading online recruitment intelligence provider in the UK
3. Ontotext Clients (selected)
British Broadcasting Corporation (BBC)
– Ran its World Cup 2010 site on top of OWLIM
– Since Mar ’12, the BBC Sport and 2012 Olympics sections have been driven by OWLIM and a Concept Extraction service developed by Ontotext
Press Association (UK)
– Analysis of Sports news
– Concept extraction
– Linked data generation
Top-3 USA media (not allowed to name)
The National Archives (UK) contracted Ontotext to implement a semantic KB and semantic search for the Government Web Archive
British Museum (UK): Ontotext leads the development of Phase 3 of the ResearchSpace project on collaborative research in cultural heritage; the British Museum’s public SPARQL endpoint is powered by OWLIM
De Bibliotheek (the Netherlands) – aggregation of data from 150 library databases
4. Semantic Web and Linked Open Data
• Semantic Web
a set of standards that enable computers to interpret the
semantics of data on the web
• Linked Open Data
a set of principles for publishing structured data and interlinking it so that it can be browsed the way HTML pages are
- Use URIs to identify things.
- Use HTTP URIs so that these things can be referred to and looked up
("dereferenced") by people and user agents.
- Provide useful information about the thing when its URI is dereferenced,
using standard formats such as RDF/XML.
- Include links to other, related URIs in the exposed data to improve discovery
of other related information on the Web.
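The second and third principles above amount to HTTP content negotiation: a user agent asks for RDF instead of HTML when dereferencing a URI. A minimal sketch using Python's standard library follows; the DBpedia URI is chosen purely as an illustration, and the request is only constructed, not sent:

```python
# Sketch of Linked Data dereferencing via HTTP content negotiation.
# The URI below is illustrative; any dereferenceable HTTP URI works the same way.
from urllib.request import Request

uri = "http://dbpedia.org/resource/Berlin"
req = Request(uri, headers={"Accept": "application/rdf+xml"})

# A user agent sending this request signals that it wants the RDF
# description of the resource rather than the human-readable page.
print(req.get_header("Accept"))  # application/rdf+xml
```

Sending the request (e.g. with `urllib.request.urlopen`) would then return an RDF/XML description of the resource, including links to related URIs.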
5. Linked Open Data cloud
[Diagrams of the LOD cloud in 2008, 2009, and 2011: by 2011, 295 datasets and more than 30 billion triples]
6. Linked Open Data is maturing
The LOD cloud grows by billions of triples yearly
Technologies and guidelines exist for:
– producing linked data fast
– assuring its quality
– providing vertically oriented data services
Related projects: LOD2, LATC, BaseKB
7. This talk is about reasoning and coping with the diversity of data on the Web of Data
8. Outline
• FactForge (beta)
• Reference Layer
• Access Modes
• Querying
– Airports around London
– US city – the subject of a novel
– US city – contactInformation
• Challenges
• Conclusion
9. FactForge (beta)
the largest body of heterogeneous general knowledge on which inference has been performed
– powered by OWLIM 5.2 – supporting SPARQL 1.1
10. Datasets
A REASON-ABLE VIEW of LOD datasets
Explicit statements: 1,686,804,539
Implicit statements: 1,264,199,839
Retrievable statements: 12,646,674,554
Datasets: CIA FactBook, DBpedia 3.7, Freebase, NY Times, Lexvo, WordNet 3.0, Geonames, Lingvoj, MusicBrainz
Materialization is performed with respect to the semantics of OWL-Horst (optimized)
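The three statement counts on this slide are related by simple arithmetic: the materialized total is explicit plus implicit (inferred) statements, while the larger retrievable count reflects query-time expansion (in OWLIM, owl:sameAs expansion is the usual source of this gap; that attribution is an interpretation, not stated on the slide). A quick check using only the slide's figures:

```python
# Statement counts taken from the slide (FactForge beta).
explicit = 1_686_804_539
implicit = 1_264_199_839
retrievable = 12_646_674_554

# Materialized total = explicit statements + inferred (implicit) statements.
materialized = explicit + implicit
print(materialized)  # 2951004378

# Expansion factor between retrievable and materialized statements
# (attributed to owl:sameAs expansion at query time - an assumption).
print(round(retrievable / materialized, 2))  # 4.29
```

So inference adds roughly 75% on top of the explicit statements, and query-time expansion multiplies the accessible knowledge by a further factor of about 4.3.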
11. Reference Layer
PROTON – a lightweight upper-level ontology
~500 classes, ~150 properties
http://www.ontotext.com/proton-ontology
Linking at the schema level:
(1) using rdfs:subClassOf and rdfs:subPropertyOf statements;
(2) using OWL expressions where there is a difference in conceptualization;
(3) using inference rules if additional individuals are necessary in the repository to support the mapping
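The subClassOf/subPropertyOf linking of case (1) can be illustrated with a toy forward-chaining materializer for the two relevant RDFS entailment rules. The term names below are simplified stand-ins for the PROTON and DBpedia/Geonames vocabularies, not the real URIs:

```python
# Toy forward-chaining materialization for the two RDFS mapping rules.
# Term names are illustrative stand-ins, not the real PROTON/DBpedia URIs.
triples = {
    ("dbp-ont:Politician", "rdfs:subClassOf", "proton:Politician"),
    ("geo-ont:parentFeature", "rdfs:subPropertyOf", "proton:subRegionOf"),
    ("dbpedia:Angela_Merkel", "rdf:type", "dbp-ont:Politician"),
    ("dbpedia:Hamburg", "geo-ont:parentFeature", "dbpedia:Germany"),
}

def materialize(graph):
    """Apply rdfs:subClassOf / rdfs:subPropertyOf rules up to a fixpoint."""
    closed = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, o in list(closed):
            # rdfs9: (x rdf:type C), (C rdfs:subClassOf D) => (x rdf:type D)
            if p == "rdf:type":
                for c, r, d in list(closed):
                    if r == "rdfs:subClassOf" and c == o and (s, "rdf:type", d) not in closed:
                        closed.add((s, "rdf:type", d))
                        changed = True
            # rdfs7: (s p o), (p rdfs:subPropertyOf q) => (s q o)
            for q, r, q2 in list(closed):
                if r == "rdfs:subPropertyOf" and q == p and (s, q2, o) not in closed:
                    closed.add((s, q2, o))
                    changed = True
    return closed

inferred = materialize(triples) - triples
for t in sorted(inferred):   # the two statements inferred across datasets
    print(t)
```

This is how a query phrased in PROTON terms (e.g. `prot:subRegionOf`) can match data asserted in dataset-specific terms (`geo-ont:parentFeature`): the mapping statements make the PROTON triples derivable.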
12. Access modes
RDF Search – retrieve a ranked list of URIs related to literals that contain specific keywords
13. Access modes (cont’d)
Exploration – traversing the data, one resource at a time
14. Access modes (cont’d)
Exploration – traversing the data, one resource at a time, inspecting inferred knowledge
– locatedIn – Bulgaria, Eastern Europe
– Geonames types/FeatureCodes (dc:type P.PPL)
– parentFeature – Bulgaria, Europe
– containsLocation – Cherno More Sports Complex, Varna Archeological Museum
– isBirthPlaceOf – Aleksander Kraev, Martin Hristov
…
15. Access modes (cont’d)
Exploration – traversing the data, one resource at a time, inspecting inferred knowledge
– locatedIn – Europe
– subRegionOf – Europe
– hasContactInfo – website via Freebase
– containsLocation
– partOf
…
18. Querying
Using LOD concepts:

    SELECT * WHERE {
        ?Person dbp-ont:birthPlace ?BirthPlace ;
                rdf:type dbp-ont:Politician .
        ?BirthPlace geo-ont:parentFeature dbpedia:Germany .
    }

Using the intermediary layer:

    SELECT * WHERE {
        ?Person prot:birthPlace ?BirthPlace ;
                rdf:type prot:Politician .
        ?BirthPlace prot:subRegionOf dbpedia:Germany .
    }
19. Find airports near London
Standard LOD query vs. PROTON query: 13 vs. 20 results (DBpedia alone vs. DBpedia and Geonames)
20. Find airports near London – results comparison
Using the geospatial index of OWLIM
21. US city – the subject of a novel by a science fiction author
22. OWLIM 5.0 and SPARQL 1.1
Example queries:
– GROUP BY with aggregates – minimal and maximal population counts of European countries
– Federated query between FactForge and LinkedLifeData – drugs that cure the disease Alexander Graham Bell died of
– Literal index over dates – world governors in office between 1980 and 2005
– Literal index over digits – European countries with a population above 20 million
– Geospatial index – the distance from London of airports located at most 50 miles away
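The last example can be approximated without a geospatial index: a plain haversine great-circle computation is enough to test the 50-mile radius. The coordinates below are approximate values for central London and Heathrow Airport, used only as an illustration:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two (lat, lon) points."""
    R = 3958.8  # mean Earth radius in miles
    p1, p2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * R * asin(sqrt(a))

# Approximate coordinates (assumed values): central London and Heathrow Airport.
london = (51.5074, -0.1278)
heathrow = (51.4700, -0.4543)

d = haversine_miles(*london, *heathrow)
print(f"distance: {d:.1f} miles; within 50 miles: {d <= 50}")
```

A geospatial index like OWLIM's avoids evaluating this formula against every point by pruning candidates spatially first, but the distance semantics of the query are the same.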
23. Challenges and usage
• Clean data
– cleaning up the input data
• At the model level
– contradiction detection
– consistency checking
• Curation and upgrade methodology
FactForge has been used as a data-layer infrastructure in FP7 projects such as RENDER.
It has also been used for tasks of:
– linked data generation from unstructured data
– metadata enrichment of structured data
– providing linkage to the entire LOD cloud
for example, for The National Archives (UK) and EDAMAM, a food recommendation app
24. Acknowledgements
Partial funding
Colleagues
Ivan Peikov, Ontotext
Rouslan Velkov, Ontotext
Barry Bishop, Ontotext
Barry Norton, Ontotext
Marin Dimitrov, Ontotext
Alex Simov, Ontotext
Jordan Dichev, Ontotext
Konstantin Penchev, Ontotext
Links
http://ff-dev.ontotext.com
http://www.ontotext.com/owlim
http://www.ontotext.com/factforge
Email:
info@factforge.net
25. Thank you for your attention!
mariana.damova@ontotext.com