This presentation discusses catalog enrichment using Linked Open Data. It begins with defining catalog enrichment as any addendum to catalog records, such as links to full texts or subjects. It then discusses techniques for enrichment including matching catalog records to external data sources and linking the records. The presentation demonstrates an implementation of catalog enrichment by linking records to data sets like DBpedia, Project Gutenberg and Open Library. It concludes that while catalog enrichment is possible without Linked Open Data, using LOD makes the process easier.
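To make the matching-and-linking step concrete, here is a minimal Python sketch of the kind of lookup described above: matching a catalog record to DBpedia by ISBN over the public SPARQL endpoint. This is a sketch under assumptions, not the presentation's actual code: it assumes the SPARQLWrapper package, and that the target resource carries its ISBN as a dbo:isbn literal (DBpedia's ISBN coverage and hyphenation are inconsistent, so a real pipeline would fall back to title/author matching).

```python
# Minimal sketch: match a catalog record to DBpedia by ISBN.
# Assumes `pip install SPARQLWrapper`; dbo:isbn coverage in DBpedia is spotty,
# so a real enrichment pipeline would also match on title/author.
from SPARQLWrapper import SPARQLWrapper, JSON

def match_by_isbn(isbn: str):
    isbn = isbn.replace("-", "")
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(f"""
        PREFIX dbo: <http://dbpedia.org/ontology/>
        SELECT ?book ?abstract WHERE {{
            ?book dbo:isbn ?i ;
                  dbo:abstract ?abstract .
            FILTER(REPLACE(STR(?i), "-", "") = "{isbn}")
            FILTER(LANG(?abstract) = "en")
        }} LIMIT 1""")
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [(r["book"]["value"], r["abstract"]["value"]) for r in rows]

print(match_by_isbn("0-14-143960-2"))  # hypothetical example ISBN
```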
The document discusses a webinar presented by LOD2 on creating knowledge from interlinked data. It describes LOD2 as an EU-funded project involving leading linked open data organizations. The webinar agenda includes discussing SIREn, a plugin for Elasticsearch that allows indexing and searching of JSON documents. It provides an overview of Elasticsearch and describes how to install SIREn, create an index, index documents, and perform searches on nested JSON data.
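The SIREn-specific setup is not reproduced here, but the baseline Elasticsearch workflow the webinar builds on (index JSON documents, then search them) can be sketched with the official Python client; the index name, document, and local node URL below are assumptions for illustration.

```python
# Generic Elasticsearch indexing and search of a nested JSON document.
# SIREn adds its own indexing/query extensions on top; this shows only the
# base workflow. Assumes `pip install elasticsearch` and a node on localhost.
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
es.index(index="books", id="1", document={
    "title": "Linked Data",
    "authors": [{"name": "Heath"}, {"name": "Bizer"}],
})
es.indices.refresh(index="books")  # make the document searchable immediately
hits = es.search(index="books", query={"match": {"authors.name": "Heath"}})
for h in hits["hits"]["hits"]:
    print(h["_id"], h["_source"]["title"])
```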
(http://lod2.eu/BlogPost/webinar-series) In this webinar Michael Martin presents CubeViz, a faceted browser for statistical data utilizing the RDF Data Cube vocabulary, which is the state of the art in representing statistical data in RDF. This vocabulary is compatible with SDMX and is increasingly being adopted. Based on the vocabulary and the encoded Data Cube, CubeViz generates a faceted browsing widget that can be used to interactively filter the observations to be visualized in charts. Based on the selected structure, CubeViz offers suitable chart types and options that can be selected by users.
If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the free LOD2 webinar series!
http://lod2.eu/BlogPost/webinar-series
This webinar in the course of the LOD2 webinar series will present the release 3.0 of the LOD2 stack, which contains updates to:
*) Virtuoso 7 [Openlink]: the original row store of the Virtuoso 6 universal server has been replaced by a column store, increasing the performance of SPARQL queries significantly; the store is now up to three times as fast as the previous major version.
*) Linked Open Data Manager Suite [SWC]: the 'lodms' application allows the user to quickly set up pipelines for transforming linked data through the use of its many extensions. It also provides operations for extracting RDF from other types of data.
*) dbpedia-spotlight-ui [ULEI]: a graphical user interface component that allows the user to use a remote DBpedia spotlight instance to annotate a text with DBpedia concepts.
*) sparqlify [ULEI]: a scalable SPARQL-SQL rewriter, allowing you to query an SQL database as if it were a triple store.
*) SIREn [DERI]: a Lucene plugin that allows you to efficiently index and query RDF, as well as any textual document with an arbitrary amount of metadata fields.
*) CubeViz [ULEI]: CubeViz allows visualization of the Data Cube linked data representation of statistical data. It has support for the more advanced DataCube features, such as slices. It also allows the selection of a remote SPARQL endpoint and export of a modified cube.
*) R2R [UMA]: the R2R mapping API is now included directly into the lod2 demonstrator application, allowing users to experience the full effect of the R2R semantic mapping language through a graphical user interface.
*) ontowiki-csvimport [ULEI]: an OntoWiki extension that transforms CSV files to RDF. The extension can create Data Cubes that can be visualized by CubeViz (a sketch of the resulting Data Cube structure follows below).
If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the free LOD2 webinar series!
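As a taste of the Data Cube structures that ontowiki-csvimport emits and CubeViz consumes, here is a hedged sketch (not the extension's actual code) that builds one minimal observation with rdflib; the dataset, dimension, and measure URIs under ex: are invented for illustration.

```python
# Sketch of a minimal RDF Data Cube observation, the kind of structure
# ontowiki-csvimport emits and CubeViz visualizes. URIs under ex: are invented.
from rdflib import Graph, Namespace, Literal
from rdflib.namespace import RDF, XSD

QB = Namespace("http://purl.org/linked-data/cube#")
EX = Namespace("http://example.org/")

g = Graph()
g.bind("qb", QB)
g.bind("ex", EX)
g.add((EX.dataset1, RDF.type, QB.DataSet))
obs = EX.obs1
g.add((obs, RDF.type, QB.Observation))
g.add((obs, QB.dataSet, EX.dataset1))
g.add((obs, EX.refYear, Literal("2013", datatype=XSD.gYear)))        # dimension
g.add((obs, EX.population, Literal(520000, datatype=XSD.integer)))   # measure

print(g.serialize(format="turtle"))
```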
Interactive exploration of complex relational data sets in a web - SemWeb.Pro... (Logilab)
The document discusses interactive exploration of complex relational datasets. It describes using the Cubicweb framework to store and query data using an entity-relationship model and RQL. Results can be visualized through standard views or processed into pivot tables and numerical arrays for array views like histograms, scatterplots and graphs. This allows flexible visualization and datamining of relational data through unique URLs.
UnifiedViews is a joint project currently maintained by Semantic Web Company (SWC) and Semantica.cz. It has been mainly developed at Charles University in Prague as a student project called ODCleanStore (version 2). It is based on the experience SWC gained with the LOD Management Suite (LODMS) used in WP7 and with ODCleanStore (version 1), developed by Charles University in Prague for the WP9a use case of the LOD2 FP7 project. In the next release of the LOD2 stack, UnifiedViews will replace LODMS as the stack's ETL tool, and the tool has already been adopted in other projects.
In the webinar we will give a brief overview of the UnifiedViews project (Helmut Nagy). The main part will be a presentation of the tool and its capabilities (Tomas Knap).
The document discusses year 2 deliverables for work packages 9 and 10 of the LOD2 project. It summarizes reports on improvements made to the Publicdata.eu portal including upgrades to CKAN and new features. Next steps include further technical enhancements to Publicdata.eu and engaging communities of data publishers and users. Deliverables from the Serbian CKAN team established their data portal and infrastructure. The Polish Ministry of Economy requirements analysis identified needs for publishing their data as linked open data.
LOD2 is a 4-year European Commission project comprising Linked Data researchers and companies from 12 countries. The project aims to integrate Linked Data into existing large-scale applications in media, publishing, corporate intranets, and eGovernment. The webinar series offers monthly free webinars on tools and services for acquiring, editing, composing, connecting, and publishing Linked Data.
In this webinar Lorenz Bühmann presents the ontology repair and enrichment tool ORE as well as DL-Learner, a machine learning tool for solving supervised learning tasks and supporting knowledge engineers in constructing knowledge. These two closely related tools in the LOD2 Stack are used for classification and the subsequent quality analysis of Linked Data.
This document provides an overview of relevant approaches for accessing open data programmatically and data-as-a-service (DaaS) solutions. It discusses common data access methods like web APIs, OData, and SPARQL and describes several DaaS platforms that simplify publishing and consuming open data. It also outlines requirements for a proposed open DaaS platform called DaPaaS that aims to address challenges in open data management and application development.
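Of the access methods listed, SPARQL needs no special client library at all: the SPARQL Protocol is plain HTTP. A minimal sketch using only the requests package against DBpedia's public endpoint:

```python
# Programmatic open-data access over the SPARQL protocol: a plain HTTP GET.
import requests

endpoint = "https://dbpedia.org/sparql"  # any SPARQL 1.1 endpoint works
query = "SELECT ?type (COUNT(*) AS ?n) WHERE { ?s a ?type } GROUP BY ?type LIMIT 5"
resp = requests.get(
    endpoint,
    params={"query": query},
    headers={"Accept": "application/sparql-results+json"},
    timeout=30,
)
resp.raise_for_status()
for row in resp.json()["results"]["bindings"]:
    print(row["type"]["value"], row["n"]["value"])
```

The same GET pattern works against any conformant endpoint, which is part of what makes SPARQL attractive as a generic data-as-a-service access method.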
OpenAIRE and the Case of Irish Repositories (RIANIreland)
This document discusses OpenAIRE and Irish repositories. It begins with a brief explanation of OpenAIRE, including its history and role in Horizon 2020. It then analyzes the status of Irish repositories in OpenAIRE and BASE, noting that about 27,000 documents are openly accessible. The document asks questions about other Irish repositories and CRIS systems. It also discusses important metadata properties for OpenAIRE, such as referencing funding sources. Finally, it covers how repositories can connect with OpenAIRE through services, plugins, and add-ons.
Linguistic Linked Open Data, Challenges, Approaches, Future Work (Sebastian Hellmann)
Hellmann keynote TKE (2016), Challenges, Approaches and Future Work for Linguistic Linked Open Data (LLOD)
While the Linguistic Linked Open Data (LLOD) Cloud (http://linguistic-lod.org/) has evolved beyond expectations - thanks to the effort of a vibrant community - overall progress has to be seen under a more scrutinizing light.
Initial challenges which have been formulated by Christian Chiarcos, Sebastian Nordhoff and me as early as 2011[1][2] have been discussed extensively in the LDL, MLODE and NLP & DBpedia workshop series and in several W3C community groups. In particular, the LIDER FP7 project (http://www.lider-project.eu/) - originally conceived to tackle these challenges and build a Linguistic Linked Open Data Cloud - rather gave them more shape and uncovered that there is yet quite a long road ahead to solve problems such as proper metadata, contextualisation of knowledge, data quality, hosting, open licensing and provenance, timely updated network links, knowledge integration and interoperability on the largest possible scale - the Web.
The invited talk attempts to give a full account of these abovementioned challenges and presents and critically evaluates pertinent efforts and approaches including evolving standards such as the NLP Interchange Format (NIF)[3][4], DataID[5], SHACL[6], lemon[7] and the LIDER guidelines[8] as well as practical services such as LingHub[9], LODVader[10], RDFUnit[11] (just to mention a few).
As a glimmer of hope, the talk will conclude with the recent efforts of the DBpedia community to coordinate the creation of a public data infrastructure for a large, multilingual, semantic knowledge graph, which is, of course, not a panacean golden hammer, but a potential step in the right direction to bridge the gap between language and knowledge.
________________
[1] Towards a Linguistic Linked Open Data cloud : The Open Linguistics Working Group (http://www.atala.org/IMG/pdf/Chiarcos-TAL52-3.pdf ) Christian Chiarcos, Sebastian Hellmann, and Sebastian Nordhoff. TAL 52(3):245 - 275 (2011)
[2] Linked Data in Linguistics. Representing Language Data and Metadata (http://www.springer.com/computer/ai/book/978-3-642-28248-5 ) Christian Chiarcos, Sebastian Nordhoff, and Sebastian Hellmann (Eds.). Springer, Heidelberg, (2012)
[3] http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core
[4] https://www.w3.org/community/ld4lt/
[5] http://wiki.dbpedia.org/projects/dbpedia-dataid
[6] http://w3c.github.io/data-shapes/shacl/
[7] https://www.w3.org/2016/05/ontolex/
[8] http://www.lider-project.eu/guidelines
[9] http://linghub.lider-project.eu/
[10] http://lodvader.aksw.org/
[11] http://aksw.org/Projects/RDFUnit
This webinar in the course of the LOD2 webinar series will present use cases and live demos of D2R (Free University Berlin) and Sparqlify (University of Leipzig).
D2R Server is a tool for publishing relational databases on the Semantic Web. It enables RDF and HTML browsers to navigate the content of the database, and allows applications to query the database using the SPARQL query language.
Sparqlify is a tool enabling one to define expressive RDF views on relational databases and query them with a subset of the SPARQL query language. By featuring a novel RDF view definition syntax, it aims at simplifying the RDB-RDF mapping process.
more to be found at:
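D2R and Sparqlify rewrite SPARQL into SQL on the fly rather than materializing triples. As a rough illustration of the table-to-triples mapping they automate, here is a sketch that does the mapping by hand with sqlite3 and rdflib (the ex: vocabulary and the book table are invented):

```python
# Illustration of the RDB-to-RDF mapping that D2R/Sparqlify perform virtually:
# here the rows are materialized as triples and then queried with SPARQL.
import sqlite3
from rdflib import Graph, Namespace, Literal, URIRef
from rdflib.namespace import RDF

EX = Namespace("http://example.org/")
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
db.execute("INSERT INTO book VALUES (1, 'Linked Data Basics')")

g = Graph()
for book_id, title in db.execute("SELECT id, title FROM book"):
    s = URIRef(f"http://example.org/book/{book_id}")
    g.add((s, RDF.type, EX.Book))
    g.add((s, EX.title, Literal(title)))

q = """PREFIX ex: <http://example.org/>
       SELECT ?b ?t WHERE { ?b a ex:Book ; ex:title ?t }"""
for b, t in g.query(q):
    print(b, t)
```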
The slideset used to conduct an introduction/tutorial on DBpedia use cases, concepts and implementation aspects, held during the DBpedia community meeting in Dublin on the 9th of February 2015. (Slide creators: M. Ackermann, M. Freudenberg; additional presenter: Ali Ismayilov)
ROI in Linking Content to CRM by Applying the Linked Data Stack (Martin Voigt)
Today, decision makers in enterprises have to rely more and more on a variety of data sets that are available internally but also externally, in heterogeneous formats. Therefore, intelligent processes are required to build an integrated knowledge base. Unfortunately, the adoption of the Linked Data lifecycle within enterprises, which targets the extraction, interlinking, publishing and analytics of distributed data, lags behind the public domain due to missing frameworks that are efficient to deploy and easy to use. In this paper, we present our adoption of the lifecycle through our generic, enterprise-ready Linked Data workbench. To judge its benefits, we describe its application within a real-world Customer Relationship Management scenario. It shows (1) that sales employees could significantly reduce their workload and (2) that the integration of sophisticated Linked Data tools comes with a clearly positive Return on Investment.
This document discusses linked data life cycles, including modeling, publishing, discovery, integration, and use cases. It describes key concepts like dataspaces, DSSPs, linked data principles, and the linked open data cloud. Challenges with linked data include schema mapping, write-enablement, authentication, and dataset dynamics as data sources change over time.
The document discusses data discovery, conversion, integration and visualization using RDF. It covers topics like ontologies, vocabularies, data catalogs, converting different data formats to RDF including CSV, XML and relational databases. It also discusses federated SPARQL queries to integrate data from multiple sources and different techniques for visualizing linked data including analyzing relationships, events, and multidimensional data.
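For the federated-query part, a small example may help: SPARQL 1.1's SERVICE keyword delegates part of a query to a remote endpoint. The sketch below runs such a query with rdflib, whose evaluator has basic SERVICE support; any endpoint that implements SPARQL 1.1 federation would accept the same query text.

```python
# Federated SPARQL: pull remote data into a local query via SERVICE.
# Assumes network access to the remote endpoint; rdflib's SPARQL evaluator
# has basic SERVICE support and performs the HTTP call for us.
from rdflib import Graph

q = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?label WHERE {
  SERVICE <https://dbpedia.org/sparql> {
    <http://dbpedia.org/resource/Berlin> rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }
}"""
for (label,) in Graph().query(q):
    print(label)
```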
The document discusses migrating from HDF5 1.6 to HDF5 1.8. It provides an overview of new features in HDF5 1.8, including a revised file format, improvements to group storage, new link types like external links, and enhanced error handling. The document aims to ease the transition to HDF5 1.8 by highlighting beneficial new features and raising awareness of compatibility issues when moving from 1.6 to 1.8.
This tutorial is designed for new HDF5 users. We will go over a brief history of HDF and HDF5 software, and will cover basic HDF5 Data Model objects and their properties; we will give an overview of the HDF5 Libraries and APIs, and discuss the HDF5 programming model. Simple C and Fortran examples, and Java tool HDFView will be used to illustrate HDF5 concepts.
Data 2 Documents: Modular and Distributive Content Management in RDF (Niels Ockeloen)
This document describes a system called Data 2 Documents (D2D) that aims to enable modular and distributive content management on the web using Linked Data and RDF. It discusses how D2D addresses issues with sharing content across different content management systems and websites by modeling the knowledge involved in content selection, composition and rendering. An evaluation involved experts and students performing tasks in D2D, and found that participants could complete the tasks and would consider using D2D for future website development. Future work is needed to develop graphical user interfaces and JavaScript implementations for D2D.
Ivan Herman - Semantic Web Activities @ W3C (sssw2012)
This document summarizes Ivan Herman's presentation on the Semantic Web and Linked Data at the W3C Business Group on Oil, Gas, and Chemicals meeting in Houston on February 13, 2012. The presentation covered topics such as using ontologies and vocabularies to structure and integrate data on the web, technologies like SPARQL and RDFa, converting relational databases to RDF, and ongoing work at the W3C on standards like RDF 1.1 and the Linked Data platform.
A short talk on the topic of "MarkLogic and the Linked Data Connection", about using MarkLogic with triple stores and running SPARQL queries via the SPARQL 1.1 Graph Store HTTP Protocol and the SPARQL Protocol.
The text for this presentation is in the GitHub project mentioned on slide 16.
Nicoletta Fornara and Fabio Marfia | Modeling and Enforcing Access Control Ob... (semanticsconference)
This document discusses modeling and enforcing access control obligations for SPARQL-DL queries. It proposes an approach using formal specifications of obligations to define fine-grained access control for inferred data in OWL 2 DL ontologies. An obligation enforcement module sits as a middle layer, rewriting queries before execution and enforcing obligations on results by modifying returned data based on obligation definitions. The approach allows complex queries while protecting inferred data through reasoning about access control conditions.
Mike Miller is the Co-Founder and Chief Scientist of Cloudant, a company that provides a globally distributed data layer for web applications. He has a background in machine learning, analysis, big data, and distributed systems. Cloudant was founded in 2009 by MIT data scientists and provides a hyper-scalable document database and analytics platform that runs across multiple data centers.
Catalog enrichment: importing Dewey Decimal Classification from external sour... (Stefano Bargioni)
Usually, important catalogs are accessed for copy-cataloguing whole records. It is possible to retrieve "atomic" information too, using unique keys like ISBN.
The library at the Pontificia Università della S. Croce developed a tool that allows Dewey retrieval and insertion into bibliographic records, in bulk mode as well as in single-record mode, i.e. during cataloguing.
During the bulk process, Dewey classification was added to about 20,000 records, retrieved from OCLC, the Library of Congress and some national libraries, drawing on up to 7 external sources.
The single-record mode was integrated into the Koha ILS, to make it easier to assign Dewey classification during cataloguing.
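A rough sketch of the bulk-mode idea (not the library's actual tool): loop over catalog ISBNs and ask an external SPARQL endpoint for a Dewey class. The endpoint URL is a placeholder, and the sketch assumes the source models DDC as dcterms:subject links to dewey.info class URIs, a convention several bibliographic LOD sources have used.

```python
# Hypothetical bulk Dewey lookup. ENDPOINT is a placeholder standing in for
# one of the external sources (OCLC, LoC, national libraries) the tool queries.
import requests

ENDPOINT = "https://example.org/sparql"  # placeholder, not a real endpoint

def dewey_for_isbn(isbn: str):
    query = f"""
    PREFIX bibo: <http://purl.org/ontology/bibo/>
    PREFIX dct:  <http://purl.org/dc/terms/>
    SELECT ?ddc WHERE {{
        ?book bibo:isbn "{isbn}" ;
              dct:subject ?ddc .
        FILTER(STRSTARTS(STR(?ddc), "http://dewey.info/class/"))
    }} LIMIT 1"""
    resp = requests.get(ENDPOINT, params={"query": query},
                        headers={"Accept": "application/sparql-results+json"},
                        timeout=30)
    rows = resp.json()["results"]["bindings"]
    return rows[0]["ddc"]["value"] if rows else None

for isbn in ["9780141439600", "9780262533058"]:  # batch from the catalog
    print(isbn, dewey_for_isbn(isbn))
```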
This document presents LDP-DL, a language for defining the design of Linked Data Platforms (LDPs). LDP-DL allows describing what resources an LDP contains, how they are organized into containers, and the content of each resource. An LDP-DL model can be interpreted to automatically generate the described LDP. The implementation generates LDPs from LDP-DL designs and heterogeneous data sources. Experiments show LDP-DL supports generating multiple LDPs from a single design, applying one design across data sources, and loose coupling between designs and generated LDPs.
This presentation addresses the main issues of Linked Data and scalability. In particular, it provides details on approaches and technologies for clustering, distributing, sharing, and caching data. Furthermore, it addresses the means for publishing data through cloud deployment and the relationship between Big Data and Linked Data, exploring how some of the solutions can be transferred to the context of Linked Data.
This document summarizes a presentation on Linked Open Data given by Silke Schomburg and Adrian Pohl at the Bielefeld Conference on April 26, 2012. It discusses the basics of Linked Open Data (LOD), the motivations for adopting LOD practices, and the LOD activities of the hbz library network, including lobid.org for publishing open bibliographic data as LOD and culturegraph.org for interconnecting datasets. It also explores future opportunities for web-based cataloging and building a more integrated library infrastructure based on open data and web services.
Open Educational Data - Datasets and APIs (Athens Green Hackathon 2012) (Stefan Dietze)
This document discusses linking educational data as linked open data. It describes several existing educational linked data projects and datasets, including SmartLink, mEducator, and the Linked Education Graph. The Linked Education Graph integrates datasets from various sources into a single RDF dataset with over 6 million resources and 97 million triples. The document outlines challenges in linking educational data and introduces the LinkedUp project which aims to further adoption of linked data in education through an open data competition and infrastructure to integrate and query educational datasets.
This document provides an overview of a presentation on representing and connecting language data and metadata using linked data. It discusses the technological background of linked data and the collaborative research opportunities it provides for linguistics. It also outlines prospects for using linked data in linguistics by connecting annotated corpora, lexical-semantic resources, and linguistic databases to build a linguistic linked open data cloud.
NISO Virtual Conference: BIBFRAME & Real World Applications of Linked Bibliographic Data
http://www.niso.org/news/events/2016/virtual_conference/jun15_virtualconf/
June 15, 2016
Opening Keynote: Landscape and Current Status of BIBFRAME and Related Initiatives
Improving the Performance of the DL-Learner SPARQL Component for Semantic We... (Sebastian Hellmann)
Presentation at JIST 2012. I forgot to add a link to http://en.wikipedia.org/wiki/Knowledge_extraction, which I mentioned during the presentation because some of its output would be compatible with SPARQL.
RDF2vec is a method for creating embedding vectors for entities in knowledge graphs. In this talk, I introduce the basic idea of RDF2vec, as well as the latest extensions and developments, like the use of different walk strategies, the order-aware flavour of RDF2vec, RDF2vec for dynamic knowledge graphs, and more.
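To make the basic idea concrete, here is a toy sketch (not the reference implementation): random walks over an RDF graph become "sentences" that word2vec turns into entity vectors. It assumes rdflib and gensim, and the three-node graph is a stand-in for a real knowledge graph.

```python
# Toy RDF2vec: random walks over an RDF graph become "sentences" for word2vec.
import random
from rdflib import Graph
from gensim.models import Word2Vec

g = Graph()
g.parse(data="""
@prefix ex: <http://example.org/> .
ex:alice ex:knows ex:bob . ex:bob ex:knows ex:carol .
ex:carol ex:knows ex:alice .
""", format="turtle")

def random_walk(start, depth=4):
    walk, node = [str(start)], start
    for _ in range(depth):
        hops = [(p, o) for _, p, o in g.triples((node, None, None))]
        if not hops:
            break
        p, o = random.choice(hops)
        walk += [str(p), str(o)]
        node = o
    return walk

entities = set(g.subjects())
walks = [random_walk(e) for e in entities for _ in range(10)]
model = Word2Vec(walks, vector_size=32, window=5, min_count=1, sg=1, epochs=20)
print(model.wv["http://example.org/alice"][:5])  # first 5 embedding dimensions
```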
Linked Data Publishing with Drupal (SWIB12 Lightning Talk) (Joachim Neubert)
see more recent http://www.slideshare.net/jneubert/linked-data-enhanced-publishing-for-special-collections-with-drupal and http://www.slideshare.net/jneubert/swib13-drupal-ws
Simplified minimalistic workflows for the publication of Linked Open Data (LinDa_FP7)
The LinDA project addresses one of the most significant challenges of the usage and publication of Linked Data: the renovation and conversion of existing data formats into structures that support the semantic enrichment and interlinking of data. The set of tools provided by LinDA will assist enterprises, especially SMEs which often cannot afford the development and maintenance of dedicated information analysis and management departments, in efficiently developing novel data-analytical services linked to the available public data, thereby improving their competitiveness and stimulating the emergence of innovative business models.
This is the project presentation from Samos 2015 Summit on ICT-enabled Governance, held on June 29 – July 3, 2015, Samos, Greece (http://samos-summit.blogspot.de/).
Simplified minimalistic workflows for the publication of Linked Open Data (Salvatore Virtuoso)
Our colleague Yuri Glikman of Fraunhofer FOKUS (LinDA partner) presented the LinDA transformation tool at the recent Samos Summit (http://samos-summit.blogspot.de/).
Sören Auer - LOD2 - creating knowledge out of Interlinked Data (Open City Foundation)
The document discusses the LOD2 project, which aims to create knowledge from interlinked open data. It focuses on very large RDF data management, knowledge enrichment through interlinking data from different sources, and developing semantic user interfaces. The project pursues use cases in media, enterprise, open government data, and public sector contracts. The goal is to develop an integrated Linked Data lifecycle management stack.
This document summarizes Linked Library Data initiatives and the role of the Dublin Core Metadata Initiative. It discusses how libraries are publishing structured data using vocabularies like FRBR, MARC, and RDA. It also outlines efforts to align library metadata with the Dublin Core Abstract Model and link library data on the web through the W3C Linked Data Incubator Group. The document concludes that distributing bibliographic control through linked data allows for greater interlinking and description of values as non-literal resources.
The document provides an introduction and overview of NoSQL databases. It discusses why NoSQL databases were created, the different categories of NoSQL databases including column stores, document stores, and key-value stores. It also provides an overview of Hadoop, describing it as a framework that allows distributed processing of large datasets across computer clusters.
DBpedia is a crowd-sourced effort to extract structured data from Wikipedia and Wikidata. It provides a public SPARQL endpoint to query this multi-domain, multilingual dataset. The DBpedia Association was founded in 2014 as a non-profit to oversee DBpedia and aims to improve uptime, data quality, and integration with other sources. It relies on funding and contributions from members to achieve goals like 99.99% uptime across languages and domains. The document promotes joining the DBpedia Association and participating in future events like a DBpedia meeting at the SEMANTiCS 2016 conference.
This webinar in the course of the LOD2 webinar series will present Zemanta and its LODRefine, a LOD-enabled version of OpenRefine (previously Google Refine), which is part of the LOD2 stack. LODRefine extends the cleansing and linking functionalities of OpenRefine by providing means to reconcile and augment your data with DBpedia or any other SPARQL endpoint, extract named entities using the Zemanta API, export data in one of the RDF formats, and recently also exploit available crowdsourcing services. In the webinar we will demonstrate several tasks that show the ease of use and versatility of LODRefine.
If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the free LOD2 webinar series: http://lod2.eu/BlogPost/webinar-series
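LODRefine drives reconciliation through OpenRefine's user interface; stripped of the UI, the core of reconciling a text label against DBpedia resembles this hedged sketch (a real reconciliation service also scores and ranks candidates):

```python
# Reconciliation-style lookup: find DBpedia resources for a text label.
# Assumes `pip install SPARQLWrapper`; real code should escape the label
# before interpolating it into the query.
from SPARQLWrapper import SPARQLWrapper, JSON

def reconcile(label: str, lang: str = "en"):
    sparql = SPARQLWrapper("https://dbpedia.org/sparql")
    sparql.setQuery(f"""
        PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
        SELECT DISTINCT ?s WHERE {{ ?s rdfs:label "{label}"@{lang} }} LIMIT 5
    """)
    sparql.setReturnFormat(JSON)
    rows = sparql.query().convert()["results"]["bindings"]
    return [r["s"]["value"] for r in rows]

print(reconcile("Berlin"))  # candidate URIs to attach to the record
```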
Linked data activities in the Deutsche Nationalbibliothek (Lars G. Svensson)
This presentation accompanied a lightning talk at the IFLA Semantic Web Special Interest Group's session at the World Library and Information Conference in Helsinki 2012
Presentation for the OCLC Linked Data Roundtable event for IFLA Helsinki 2012. Covers the reasoning behind the BL's linked open data version of the British National Bibliography, the processes needed to create the service and challenges to be addressed.
The document provides an overview of the activities and direction of Work Package 1 (WP1) of the NoTube project. WP1 focuses on developing shared datasets and services to support data integration and use case scenarios. In year 3, WP1 has moved from a single data warehouse to a more distributed model and is planning for sustainability beyond the project lifetime. WP1 datasets and services are being used to support other work packages and end-to-end demonstrations. WP1 also conducts outreach activities to promote metadata sharing and adoption of standards.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Full-RAG: A modern architecture for hyper-personalization (Zilliz)
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
Removing Uninteresting Bytes in Software Fuzzing (Aftab Hussain)
Imagine a world where software fuzzing, the process of mutating bytes in test seeds to uncover hidden and erroneous program behaviors, becomes faster and more effective. A lot depends on the initial seeds, which can significantly dictate the trajectory of a fuzzing campaign, particularly in terms of how long it takes to uncover interesting behaviour in your code. We introduce DIAR, a technique designed to speedup fuzzing campaigns by pinpointing and eliminating those uninteresting bytes in the seeds. Picture this: instead of wasting valuable resources on meaningless mutations in large, bloated seeds, DIAR removes the unnecessary bytes, streamlining the entire process.
In this work, we equipped AFL, a popular fuzzer, with DIAR and examined two critical Linux libraries -- Libxml's xmllint, a tool for parsing XML documents, and Binutils' readelf, an essential debugging and security analysis command-line tool used to display detailed information about ELF (Executable and Linkable Format) files. Our preliminary results show that AFL+DIAR does not only discover new paths more quickly but also achieves higher coverage overall. This work thus showcases how starting with lean and optimized seeds can lead to faster, more comprehensive fuzzing campaigns -- and DIAR helps you find such seeds.
- These are slides of the talk given at IEEE International Conference on Software Testing Verification and Validation Workshop, ICSTW 2022.
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only the fact that both are building blocks, or dependencies, of creative projects and software. In reality, the Lego brick and the XZ backdoor case have much more than that in common.
Join the presentation to dive into a story of interoperability, standards and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: Advocate of free software and of standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in several LibreOffice-related events, migrations and training activities. Previously she worked on LibreOffice migrations and training courses for several public administrations and private clients. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager, and when she is not pursuing her passion for computers and for Geeko, she cultivates her curiosity about astronomy (hence her nickname, deneb_alpha).
Enhancing adoption of Open Source Libraries. A case study on Albumentations.AI (Vladimir Iglovikov, Ph.D.)
Presented by Vladimir Iglovikov:
- https://www.linkedin.com/in/iglovikov/
- https://x.com/viglovikov
- https://www.instagram.com/ternaus/
This presentation delves into the journey of Albumentations.ai, a highly successful open-source library for data augmentation.
Created out of a necessity for superior performance in Kaggle competitions, Albumentations has grown to become a widely used tool among data scientists and machine learning practitioners.
This case study covers various aspects, including:
People: The contributors and community that have supported Albumentations.
Metrics: The success indicators such as downloads, daily active users, GitHub stars, and financial contributions.
Challenges: The hurdles in monetizing open-source projects and measuring user engagement.
Development Practices: Best practices for creating, maintaining, and scaling open-source libraries, including code hygiene, CI/CD, and fast iteration.
Community Building: Strategies for making adoption easy, iterating quickly, and fostering a vibrant, engaged community.
Marketing: Both online and offline marketing tactics, focusing on real, impactful interactions and collaborations.
Mental Health: Maintaining balance and not feeling pressured by user demands.
Key insights include the importance of automation, making the adoption process seamless, and leveraging offline interactions for marketing. The presentation also emphasizes the need for continuous small improvements and building a friendly, inclusive community that contributes to the project's growth.
Vladimir Iglovikov brings his extensive experience as a Kaggle Grandmaster, ex-Staff ML Engineer at Lyft, sharing valuable lessons and practical advice for anyone looking to enhance the adoption of their open-source projects.
Explore more about Albumentations and join the community at:
GitHub: https://github.com/albumentations-team/albumentations
Website: https://albumentations.ai/
LinkedIn: https://www.linkedin.com/company/100504475
Twitter: https://x.com/albumentations
Unlocking Productivity: Leveraging the Potential of Copilot in Microsoft 365, a presentation by Christoforos Vlachos, Senior Solutions Manager – Modern Workplace, Uni Systems
Communications Mining Series - Zero to Hero - Session 1 (DianaGray10)
This session provides an introduction to UiPath Communication Mining, its importance, and a platform overview. You will acquire a good understanding of the phases in Communication Mining as we go over the platform with you. Topics covered:
• Communication Mining Overview
• Why is it important?
• How it can help today’s business, and the benefits
• Phases in Communication Mining
• Demo on Platform overview
• Q/A
Sudheer Mechineni, Head of Application Frameworks, Standard Chartered Bank
Discover how Standard Chartered Bank harnessed the power of Neo4j to transform complex data access challenges into a dynamic, scalable graph database solution. This keynote will cover their journey from initial adoption to deploying a fully automated, enterprise-grade causal cluster, highlighting key strategies for modelling organisational changes and ensuring robust disaster recovery. Learn how these innovations have not only enhanced Standard Chartered Bank’s data infrastructure but also positioned them as pioneers in the banking sector’s adoption of graph technology.
Generative AI Deep Dive: Advancing from Proof of Concept to Production (Aggregage)
Join Maher Hanafi, VP of Engineering at Betterworks, in this new session where he'll share a practical framework to transform Gen AI prototypes into impactful products! He'll delve into the complexities of data collection and management, model selection and optimization, and ensuring security, scalability, and responsible use.
Alt. GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using ... (James Anderson)
Effective Application Security in Software Delivery lifecycle using Deployment Firewall and DBOM
The modern software delivery process (or the CI/CD process) includes many tools, distributed teams, open-source code, and cloud platforms. A constant focus on speed to release software to market, combined with traditionally slow and manual security checks, has caused gaps in continuous security, an important piece of the software supply chain. Today organizations feel more susceptible to external and internal cyber threats due to the vast attack surface in their applications supply chain and the lack of end-to-end governance and risk management.
The software team must secure its software delivery process to avoid vulnerability and security breaches. This needs to be achieved with existing tool chains and without extensive rework of the delivery processes. This talk will present strategies and techniques for providing visibility into the true risk of the existing vulnerabilities, preventing the introduction of security issues in the software, resolving vulnerabilities in production environments quickly, and capturing the deployment bill of materials (DBOM).
Speakers:
Bob Boule
Robert Boule is a technology enthusiast with PASSION for technology and making things work along with a knack for helping others understand how things work. He comes with around 20 years of solution engineering experience in application security, software continuous delivery, and SaaS platforms. He is known for his dynamic presentations in CI/CD and application security integrated in software delivery lifecycle.
Gopinath Rebala
Gopinath Rebala is the CTO of OpsMx, where he has overall responsibility for the machine learning and data processing architectures for Secure Software Delivery. Gopi also has a strong connection with our customers, leading design and architecture for strategic implementations. Gopi is a frequent speaker and well-known leader in continuous delivery and integrating security into software delivery.
Pushing the limits of ePRTC: 100ns holdover for 100 days (Adtran)
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
Goodbye Windows 11: Make Way for Nitrux Linux 3.5.0! (SOFTTECHHUB)
As the digital landscape continually evolves, operating systems play a critical role in shaping user experiences and productivity. The launch of Nitrux Linux 3.5.0 marks a significant milestone, offering a robust alternative to traditional systems such as Windows 11. This article delves into the essence of Nitrux Linux 3.5.0, exploring its unique features, advantages, and how it stands as a compelling choice for both casual users and tech enthusiasts.
Building RAG with self-deployed Milvus vector database and Snowpark Container... (Zilliz)
This talk will give hands-on advice on building RAG applications with an open-source Milvus database deployed as a docker container. We will also introduce the integration of Milvus with Snowpark Container Services.
GraphSummit Singapore | The Future of Agility: Supercharging Digital Transfor... (Neo4j)
Leonard Jayamohan, Partner & Generative AI Lead, Deloitte
This keynote will reveal how Deloitte leverages Neo4j’s graph power for groundbreaking digital twin solutions, achieving a staggering 100x performance boost. Discover the essential role knowledge graphs play in successful generative AI implementations. Plus, get an exclusive look at an innovative Neo4j + Generative AI solution Deloitte is developing in-house.
For the full video of this presentation, please visit: https://www.edge-ai-vision.com/2024/06/building-and-scaling-ai-applications-with-the-nx-ai-manager-a-presentation-from-network-optix/
Robin van Emden, Senior Director of Data Science at Network Optix, presents the “Building and Scaling AI Applications with the Nx AI Manager,” tutorial at the May 2024 Embedded Vision Summit.
In this presentation, van Emden covers the basics of scaling edge AI solutions using the Nx tool kit. He emphasizes the process of developing AI models and deploying them globally. He also showcases the conversion of AI models and the creation of effective edge AI pipelines, with a focus on pre-processing, model conversion, selecting the appropriate inference engine for the target hardware and post-processing.
van Emden shows how Nx can simplify the developer’s life and facilitate a rapid transition from concept to production-ready applications.He provides valuable insights into developing scalable and efficient edge AI solutions, with a strong focus on practical implementation.
2. License
This presentation, including the graphics made by the author, is licensed CC0:
https://creativecommons.org/about/cc0
Pictures from http://www.istockphoto.com/ on slides 5, 7, 8 and 41 are licensed CC BY-ND:
http://creativecommons.org/licenses/by-nd/3.0/de/
Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch: http://lod-cloud.net/
3. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
4. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
6. Catalog enrichment: definition
Any addendum to the records:
links to full texts/webpages/...
subjects, tags, reviews
covers
...
The source of the addendum does not matter
(users, libraries, companies...)
New features: only indirectly
9. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
10. Catalog enrichment: methods
Source of the pictures: http://findicons.com/about
local database vs. dynamic mashup
11. methods
Local DB:
+ elaborate combination of the data
+ data can be used for search, browsing and other features
- continuously high effort to integrate the data
Dynamic mashup:
+ data always up-to-date
+ relatively easy to integrate the data
- needs a (performant) API
- no search etc.
12. infrastructure
RDF-based storage with a SPARQL endpoint:
easy to add data
open to be used by customers
self-describing data
SPARQL is a (too?) powerful API
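To give a flavour of that API, a minimal sketch of a query a customer could send to the endpoint (the title property is an illustrative assumption; the actual lobid.org vocabulary may differ):

PREFIX dc: <http://purl.org/dc/elements/1.1/>

# List ten records with their titles. dc:title is an illustrative
# property choice, not necessarily the one lobid.org uses.
SELECT ?resource ?title
WHERE {
  ?resource dc:title ?title .
}
LIMIT 10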
13. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
14.
Source of the picture: http://www.flickr.com/photos/jhsum-commons/4419490136/
15. lobid.org
triple store with SPARQL endpoint: 4store
open data from the hbz union catalog
16 M records <=> 1 B triples
links to:
• 5,500 Project Gutenberg
• 12,000 DBpedia
• 70,000 B3Kat
• 200,000 Dewey Decimal Class.
• 270,000 DNB Nationalbiografie
• 420,000 OCLC
• 1,250,000 Open Library
• 700,000 ZDB
• 800,000 LOC ISO 639-2
• 22,000,000 GND authority file
• 32,000,000 lobid-organisations
16. Software
Silk
Culturegraph
Google Refine
Hadoop
...
17. Matching algorithms
depending on the data
interesting data reside “elsewhere”
=> other cataloging rules
DBpedia example:
creator, ISBN etc. are often missing => match only on the title
constraints:
German DBpedia
category:Literarisches_Werk, category:Lexikon,_Enzyklopädie
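As a sketch of how such a constrained candidate set can be pulled from the German DBpedia endpoint (the category IRI form is an assumption for illustration; the actual matching was done with tools like Silk):

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX rdfs:    <http://www.w3.org/2000/01/rdf-schema#>

# Title candidates restricted to the constraint category
# (assumed IRI form of the German DBpedia category).
SELECT ?work ?label
WHERE {
  ?work dcterms:subject <http://de.dbpedia.org/resource/Kategorie:Literarisches_Werk> ;
        rdfs:label ?label .
}
LIMIT 100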
18. Problem: disambiguation
matching is too fuzzy
post-processing:
allow only bundles with the same creator
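One hedged way to express this post-processing in SPARQL, assuming both data sets sit in one store; ex:candidateMatch is a hypothetical predicate holding the raw matcher output, and dc:creator / dbo:author are illustrative property choices:

PREFIX dc:   <http://purl.org/dc/elements/1.1/>
PREFIX dbo:  <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX ex:   <http://example.org/ns#>

# Keep a candidate link only if the catalog record and the DBpedia
# work name the same creator (plain string equality as a crude check).
SELECT ?record ?work
WHERE {
  ?record ex:candidateMatch ?work ;
          dc:creator ?creatorName .
  ?work dbo:author ?author .
  ?author rdfs:label ?authorLabel .
  FILTER (STR(?authorLabel) = STR(?creatorName))
}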
19. Bundle having the same creator
20. Bundle having different creators
21. LOW-HANGING FRUIT
Kai Schreiber, “Reiche Ernte”, 7 August 2005, via Flickr, CC BY-SA 2.0
22. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
23. triplification
Find predicates or mint them yourself:
rdrel:workManifested
=> triple:
<lobid-resource> rdrel:workManifested <dbpedia-resource>
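Written as a SPARQL 1.1 update, the triplification might look like this (both resource IRIs are placeholders; the rdrel namespace is an assumption based on the published RDA relationships vocabulary):

PREFIX rdrel: <http://rdvocab.info/RDARelationshipsWEMI/>

# Placeholder IRIs: substitute the real lobid and DBpedia resources.
INSERT DATA {
  <http://lobid.org/resource/example>
      rdrel:workManifested
      <http://de.dbpedia.org/resource/example> .
}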
24. indexing
What is the license?
import triples into the SPARQL endpoint
a dedicated “named graph” has advantages:
easily removable/changeable
provenance is stored
specific named graphs can be queried
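A sketch of the import using a named graph (the graph IRI and resource IRIs are made up for illustration, as is the rdrel namespace):

PREFIX rdrel: <http://rdvocab.info/RDARelationshipsWEMI/>

# A dedicated named graph keeps the enrichment separable from the
# catalog data: easy to query in isolation, easy to drop wholesale.
INSERT DATA {
  GRAPH <http://lobid.org/graph/dbpedia-links> {
    <http://lobid.org/resource/example>
        rdrel:workManifested
        <http://de.dbpedia.org/resource/example> .
  }
}

# Query only that graph:
#   SELECT ?s ?o
#   WHERE { GRAPH <http://lobid.org/graph/dbpedia-links>
#           { ?s rdrel:workManifested ?o } }
# Remove the enrichment in one step:
#   DROP GRAPH <http://lobid.org/graph/dbpedia-links>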
25. Named Graphs
26. What we achieved
12,000 “sure” links to 4,000 DBpedia resources
=> 4,000 new “work” levels (21,000 discarded links)
average size of a bundle: 3
links to Freebase: 3,000
0.1 % enrichment
27. What we achieved
5,500 links to 400 Project Gutenberg resources (full texts in different formats)
=> 0.05 % enrichment
1,200,000 links to the work level of the Open Library
=> 12.5 % enrichment
28. What we achieved
Sir Tim Berners-Lee:
Source of picture: http://www.w3.org/DesignIssues/LinkedData.html
29. LOW-HANGING FRUIT
Kai Schreiber, “Reiche Ernte”, 7 August 2005, via Flickr, CC BY-SA 2.0
30. What we achieved
DBpedia example:
„Die Heilige Johanna der Schlachthöfe“
31.
32.
33.
34. What we achieved
Open Library example:
„With reference to reference“
35.
36. Linking Example: LODUM
37. Integration into the catalog
What is allowed?
What should be integrated, what not?
human-readable presentation of the links/URIs
(some) data should be indexed locally (e.g. to be able to search)
...
38. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
39. Implementation demo
40. Implementation demo
41. Overview
Catalog enrichment
Definition
Technique
Matching
Linking
Implementation demo
Conclusion
44. conclusion
Everything that's possible with LOD could also be achieved without LOD.
It's just easier with LOD.
45. LOD - definition of “linked”
Ad astra? Ad data!
To boldly go where no data has gone before.
Source of the picture: http://hubblesite.org/gallery/album/star/pr2006050d
46. Open source
Culturegraph: http://sourceforge.net/projects/culturegraph/
4store: http://4store.org/
lobid: https://github.com/lobid/
Silk: https://www.assembla.com/spaces/silk
47. Thank you!
Pascal Christoph
christoph@hbz-nrw.de
semweb@hbz-nrw.de
48. List of references
- KIM: Empfehlungen zur Öffnung bibliothekarischer Daten
  https://wiki.d-nb.de/pages/viewpage.action?pageId=45419980
- Till Kreutzer (2010): Open Data – Freigabe von Daten aus Bibliothekskatalogen
  http://www.hbz-nrw.de/dokumentencenter/veroeffentlichungen/open-data-leitfaden.pdf
- Adrian Pohl (2010): Open Data im hbz-Verbund. Published in: ProLibris 3. Preprint:
  http://www.hbz-nrw.de/dokumentencenter/produkte/lod/aktuell/pohl_2010_open-data.pdf
- Tim Berners-Lee's talk on open data (2010): http://www.youtube.com/watch?v=3YcZ3Zqk0a8
- Jansen / Christoph: Dynamische Kataloganreicherung auf Basis von Linked Open Data
  http://de.slideshare.net/h_jansen/dynamische-kataloganreicherung-auf-basis-von-linked-open-data
- Blog post: First results using SILK to link to DBpedia
  https://wiki1.hbz-nrw.de/display/SEM/2012/05/03/First+results+using+SILK+to+link+to+DBpedia
- Blog post: 1.2 M links to Open Library
  https://wiki1.hbz-nrw.de/display/SEM/2012/05/23/1.2+M+links+to+Open+Library
- Oliver Flimm (2010): LOD und die Open Library
  http://de.slideshare.net/flimm/lod-openlibrary20100512
- Directory of data “thedatahub” aka CKAN: http://www.thedatahub.org/
- 49 bibliographic data sources as LOD: http://thedatahub.org/group/bibliographic?tags=lod