The document describes an upcoming summer school on Linked Open Data and smart cities to be held from June 7-12 in Cercedilla, Spain. It provides an agenda that will include topics like Linked Open Data guidelines for data generation, discussions, and hands-on sessions. Presenters will include researchers from universities and organizations in Spain. The goal is to discuss how open data can be used to power applications for smart cities and improve areas like transportation, accessibility, and public services.
Slides from our tutorial on Linked Data generation in the energy domain, presented at the Sustainable Places 2014 conference on October 2nd in Nice, France
The document describes the 1st Summer School on Smart Cities and Linked Open Data (LD4SC-15) which will take place from June 7-12 in Cercedilla, Spain. The summer school will focus on data linking activities like linking city, district, and civic structure data to datasets in DBpedia using the OpenRefine tool. The goal is to reconcile and link individual instances to create RDF output.
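The reconcile-and-link step described above can be sketched in a few lines. This is only an illustration of the idea, not how OpenRefine works internally: the candidate table, the `reconcile` and `to_ntriple` helpers, the `http://example.org/` subject URIs, and the 0.8 threshold are all assumptions made for the example, and a real run would query a reconciliation service rather than a local dictionary.

```python
from difflib import SequenceMatcher

# Hypothetical local candidate list (label -> target URI), used only for illustration.
CANDIDATES = {
    "Madrid": "http://dbpedia.org/resource/Madrid",
    "Cercedilla": "http://dbpedia.org/resource/Cercedilla",
    "Barcelona": "http://dbpedia.org/resource/Barcelona",
}


def reconcile(label, candidates=CANDIDATES, threshold=0.8):
    """Return (uri, score) for the best-matching candidate label, or None."""
    best_uri, best_score = None, 0.0
    cleaned = label.strip().lower()
    for name, uri in candidates.items():
        score = SequenceMatcher(None, cleaned, name.lower()).ratio()
        if score > best_score:
            best_uri, best_score = uri, score
    if best_score >= threshold:
        return best_uri, best_score
    return None


def to_ntriple(local_uri, target_uri):
    """Emit one owl:sameAs link as an N-Triples line."""
    return f"<{local_uri}> <http://www.w3.org/2002/07/owl#sameAs> <{target_uri}> ."


match = reconcile("Cercedila")  # misspelled on purpose
if match is not None:
    print(to_ntriple("http://example.org/city/1", match[0]))
```

The threshold trades recall for precision: a lower value links more instances but risks wrong matches, which is why reconciliation tools usually leave ambiguous cases for manual review.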
Overview of Open Data, Linked Data and Web Science Haklae Kim
This document provides an overview of open data, linked data, and web science through conceptual discussions, case studies, and proposed next steps. It begins with definitions of key concepts like open data and the semantic web. Case studies demonstrate current applications of open data through government initiatives and technologies like Google's Knowledge Graph and Apple's Siri. The document concludes by acknowledging challenges with open data strategies and advocating for interdisciplinary collaboration to realize the potential of linked open government data.
Linked Open Data Principles, Technologies and Examples Open Data Support
Theoretical and practical introduction to linked data, focusing on the value proposition, the theory and foundations, and practical examples. The material is tailored to the context of the EU institutions.
Within the course, we will present Linked Data as a set of best practices for publishing and connecting structured data on the Web. These best practices have been adopted by an increasing number of data providers over the past years, leading to the creation of a global data space that contains many billions of assertions – the Web of Linked Data.
This document discusses Linked Open Data and how to publish open government data. It explains that publishing data in open, machine-readable formats and linking it to other external data sources increases its value. It provides examples of published open government data and outlines best practices for making data open through licensing, standard formats like CSV and XML, using URIs as identifiers, and linking to related external data. The key benefits outlined are empowering others to build upon the data and improving transparency, competition and innovation.
Open Data Support - bridging open data supply and demand Open Data Support
This document provides an overview of the Open Data Support project, which aims to improve the accessibility and facilitate the reuse of open government datasets across Europe. The project offers publishing, training, and consulting services to help public administrations prepare and share open data. It has created a pan-European data portal that provides unified access to metadata descriptions of over 55,000 datasets from 9 member states. The training services teach skills for effectively publishing and using linked open government data. The presentation highlights the benefits for data reusers, including opportunities to discover, access, and reuse compliant metadata as linked open data.
This module supported the training on Linked Open Data delivered to the EU Institutions on 30 November 2015 in Brussels. https://joinup.ec.europa.eu/community/ods/news/ods-onsite-training-european-commission
This document provides an introduction to linked data and open data. It discusses the evolution of the web from documents to interconnected data. The four principles of linked data are explained: using URIs to identify things, making URIs accessible, providing useful information about the URI, and including links to other URIs. The differences between open data and linked data are outlined. Key milestones in linked government data are presented. Formats for publishing linked data like RDF and SPARQL are introduced. Finally, the 5 star scheme for publishing open data as linked data is described.
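As a minimal sketch of what the four principles look like in data, the following snippet describes one resource with an HTTP URI, attaches a label to it, and links it to another URI, then serializes the result as N-Triples. The `http://example.org/id/` namespace and the `term` and `to_ntriples` helpers are assumptions made for illustration; literal escaping is deliberately simplified.

```python
def term(value):
    """Serialize an HTTP(S) URI as <uri>; treat anything else as a plain literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one N-Triples line each."""
    return "\n".join(f"{term(s)} {term(p)} {term(o)} ." for s, p, o in triples)


EX = "http://example.org/id/"  # hypothetical namespace for our own identifiers

triples = [
    # Principles 1-3: an HTTP URI names the thing and carries useful information.
    (EX + "cercedilla", "http://www.w3.org/2000/01/rdf-schema#label", "Cercedilla"),
    # Principle 4: link to another URI so that more can be discovered.
    (EX + "cercedilla", "http://www.w3.org/2002/07/owl#sameAs",
     "http://dbpedia.org/resource/Cercedilla"),
]

print(to_ntriples(triples))
```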
LOD2 is a 4-year European Commission project comprising Linked Data researchers and companies from 12 countries. The project aims to integrate Linked Data into existing large-scale applications in media, publishing, corporate intranets, and eGovernment. The webinar series offers monthly free webinars on tools and services for acquiring, editing, composing, connecting, and publishing Linked Data.
PoolParty Semantic Suite: Management Briefing and Functional Overview Martin Kaltenböck
Slides for the presentation of PoolParty Semantic Suite on 12.11.2015 at KNVI Congres 2015 in Utrecht, the Netherlands, see: http://congres.knvi.info/ by Martin Kaltenböck in the Big Data & Linked Data Session.
PwC is a global network of firms providing assurance, tax, and advisory services. This training module covers best practices for designing and developing RDF vocabularies. It discusses modeling data by reusing existing vocabularies when possible, creating sub-classes and properties to specialize existing terms, and defining new terms following common conventions when needed. The module also addresses publishing and promoting vocabularies so they can be reused by others.
The document discusses a webinar presented by LOD2 on creating knowledge from interlinked data. It describes LOD2 as an EU-funded project involving leading linked open data organizations. The webinar agenda includes discussing SIREn, a plugin for Elasticsearch that allows indexing and searching of JSON documents. It provides an overview of Elasticsearch and describes how to install SIREn, create an index, index documents, and perform searches on nested JSON data.
Linking Open Data to Accelerate Low - Carbon Development Martin Kaltenböck
Presentation in the course of the Workshop in Abu Dhabi on 18.01.2012: Linking Open Data to Accelerate Low-Carbon Development - A workshop for decision makers in clean energy organisations - by Martin Kaltenböck, Semantic Web Company (SWC) including:
Linked Open Government Data: Open Government & Open Government Data; Putting the L in Front: from Open Data to Linked Open Data (LOD) -
(http://lod2.eu/BlogPost/webinar-series) In this webinar Michael Martin presents CubeViz, a faceted browser for statistical data that uses the RDF Data Cube vocabulary, the state of the art in representing statistical data in RDF. This vocabulary is compatible with SDMX and is increasingly being adopted. Based on the vocabulary and the encoded Data Cube, CubeViz generates a faceted browsing widget that can be used to interactively filter the observations to be visualized in charts. Based on the selected structure, CubeViz offers suitable chart types and options that users can select.
If you are interested in Linked (Open) Data principles and mechanisms, LOD tools & services and concrete use cases that can be realised using LOD then join us in the free LOD2 webinar series!
The document discusses year 2 deliverables for work packages 9 and 10 of the LOD2 project. It summarizes reports on improvements made to the Publicdata.eu portal including upgrades to CKAN and new features. Next steps include further technical enhancements to Publicdata.eu and engaging communities of data publishers and users. Deliverables from the Serbian CKAN team established their data portal and infrastructure. The Polish Ministry of Economy requirements analysis identified needs for publishing their data as linked open data.
Introduction to the Data Web, DBpedia and the Life-cycle of Linked Data Sören Auer
Over the past 4 years, the Semantic Web activity has gained momentum with the widespread publishing of structured data as RDF. The Linked Data paradigm has therefore evolved from a practical research idea into a very promising candidate for addressing one of the biggest challenges of computer science: the exploitation of the Web as a platform for data and information integration. To translate this initial success into a world-scale reality, a number of research challenges need to be addressed: the performance gap between relational and RDF data management has to be closed, coherence and quality of data published on the Web have to be improved, provenance and trust on the Linked Data Web must be established and generally the entrance barrier for data publishers and users has to be lowered. This tutorial will discuss approaches for tackling these challenges. As an example of a successful Linked Data project we will present DBpedia, which leverages Wikipedia by extracting structured information and by making this information freely accessible on the Web. The tutorial will also outline some recent advances in DBpedia, such as the mappings Wiki, DBpedia Live as well as the recently launched DBpedia benchmark.
This document summarizes a presentation about semantic technologies for big data. It discusses how semantic technologies can help address challenges related to the volume, velocity, and variety of big data. Specific examples are provided of large semantic datasets containing billions of triples and semantic applications that have integrated and analyzed disparate data sources. Semantic technologies are presented as a good fit for addressing big data's variety, and research is making progress in applying them to velocity and volume as well.
The document is a presentation by Prof. Dr. Stefan Gradmann from KU Leuven given on July 11, 2013 at Universidad Carlos III de Madrid titled "From Records to Graphs: Linked Data and Libraries". It discusses how libraries are moving from traditional catalog records to linked data graphs by embracing semantic web technologies like RDF. Specifically, it covers the Europeana Data Model (EDM) and how it enables libraries to publish linked cultural heritage data and support new types of context-driven research services.
http://lod2.eu/BlogPost/webinar-series
This webinar in the course of the LOD2 webinar series will present release 3.0 of the LOD2 stack, which contains updates to:
*) Virtuoso 7 [Openlink]: the original row store of the Virtuoso 6 universal server has now been replaced by a column store, significantly increasing the performance of SPARQL queries; the store is now up to three times as fast as the previous major version.
*) Linked Open Data Manager Suite [SWC]: the 'lodms' application allows the user to quickly set up pipelines for transforming linked data through the use of its many extensions. It also allows operations for extracting RDF from other types of data.
*) dbpedia-spotlight-ui [ULEI]: a graphical user interface component that allows the user to use a remote DBpedia spotlight instance to annotate a text with DBpedia concepts.
*) sparqlify [ULEI]: a scalable SPARQL-SQL rewriter, allowing you to query an SQL database as if it were a triple store.
*) SIREn [DERI]: a Lucene plugin that allows you to efficiently index and query RDF, as well as any textual document with an arbitrary number of metadata fields.
*) CubeViz [ULEI]: CubeViz allows visualization of the Data Cube linked data representation of statistical data. It has support for the more advanced DataCube features, such as slices. It also allows the selection of a remote SPARQL endpoint and export of a modified cube.
*) R2R [UMA]: the R2R mapping API is now included directly into the lod2 demonstrator application, allowing users to experience the full effect of the R2R semantic mapping language through a graphical user interface.
*) ontowiki-csvimport [ULEI]: an OntoWiki extension that transforms CSV files to RDF. The extension can create Data Cubes that can be visualized by CubeViz.
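To make the CSV-to-RDF idea from the last item concrete, here is a minimal sketch using only the Python standard library. It is not the OntoWiki extension's code and it emits plain triples, not a Data Cube structure: the `http://example.org/` namespaces, the `csv_to_ntriples` helper, and the sample data are assumptions made for illustration, and column names are assumed to be safe to use inside a URI.

```python
import csv
import io

BASE = "http://example.org/obs/"    # hypothetical namespace for row subjects
PROP = "http://example.org/prop/"   # hypothetical namespace for column properties


def csv_to_ntriples(text):
    """Turn each CSV row into one subject and each cell into one literal triple."""
    lines = []
    for i, row in enumerate(csv.DictReader(io.StringIO(text)), start=1):
        subject = f"<{BASE}{i}>"
        for column, value in row.items():
            escaped = value.replace("\\", "\\\\").replace('"', '\\"')
            lines.append(f'{subject} <{PROP}{column}> "{escaped}" .')
    return lines


SAMPLE = "region,year,value\nA,2014,10\nB,2014,12\n"  # made-up sample data

for line in csv_to_ntriples(SAMPLE):
    print(line)
```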
This webinar in the course of the LOD2 webinar series will present use cases and live demos of D2R (Free University Berlin) and Sparqlify (University of Leipzig).
D2R Server is a tool for publishing relational databases on the Semantic Web. It enables RDF and HTML browsers to navigate the content of the database, and allows applications to query the database using the SPARQL query language.
Sparqlify is a tool enabling one to define expressive RDF views on relational databases and query them with a subset of the SPARQL query language. By featuring a novel RDF view definition syntax, it aims at simplifying the RDB-RDF mapping process.
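The basic mapping idea behind such tools, where a table row becomes a resource and its columns become properties, can be sketched with an in-memory SQLite database. Note the difference: this sketch materializes triples directly, whereas D2R Server and Sparqlify answer SPARQL queries by rewriting them to SQL. The `rows_to_triples` helper, the `city` table, and the `http://example.org/` URIs are assumptions made for illustration.

```python
import sqlite3


def rows_to_triples(conn, table, key, base):
    """Expose every non-key, non-NULL column of `table` as a triple about a row URI.

    `table` is interpolated into the SQL text, so it must come from trusted code.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    triples = []
    for row in cursor:
        record = dict(zip(columns, row))
        subject = f"{base}{table}/{record[key]}"
        for column, value in record.items():
            if column != key and value is not None:
                triples.append((subject, f"{base}vocab/{column}", str(value)))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE city (id INTEGER PRIMARY KEY, name TEXT, country TEXT)")
conn.execute("INSERT INTO city VALUES (1, 'Nice', 'France')")

for triple in rows_to_triples(conn, "city", "id", "http://example.org/"):
    print(triple)
```

Materializing is simple but goes stale when the database changes; query rewriting avoids that at the cost of a more complex mapping layer.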
more to be found at:
The Danish case: What does the Danish web talk about WARCnet
The document discusses a project that analyzes billions of words from the Danish web from 2006-2016 to map the topics and development of discussions. It will use four approaches: analyzing frequent words each year, words of the year, specific events/topics identified for each year, and most searched terms. The data comes from the Danish web archive and methods like topic modeling and word embeddings will be used to analyze lexical structures and map where topics were discussed. The goal is to understand what the Danish web has talked about over time and develop methods for large-scale textual analysis.
For our September 19th lecture... Tim Berners-Lee's 5-Star Open Data scheme.
Who is Tim Berners-Lee? What is the 5-Star Open Data scheme? We will be giving an introduction to open data and how to apply the 5-Star Open Data scheme to an open data program.
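Applying the scheme to a dataset amounts to checking five cumulative criteria in order. The sketch below encodes that reading; the `Dataset` fields and the `stars` function are names invented for the example, and deciding whether a given dataset meets each criterion remains a human judgment.

```python
from dataclasses import dataclass


@dataclass
class Dataset:
    open_license: bool   # 1 star: on the web under an open license
    structured: bool     # 2 stars: machine-readable structured data
    open_format: bool    # 3 stars: non-proprietary format such as CSV
    uses_uris: bool      # 4 stars: URIs identify the things described
    linked: bool         # 5 stars: linked to other data for context


def stars(dataset):
    """Count stars; each level requires all the levels below it."""
    checks = [
        dataset.open_license,
        dataset.structured,
        dataset.open_format,
        dataset.uses_uris,
        dataset.linked,
    ]
    count = 0
    for ok in checks:
        if not ok:
            break
        count += 1
    return count


# An openly licensed CSV file without URIs rates three stars.
print(stars(Dataset(True, True, True, False, False)))
```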
UnifiedViews is a joint project currently maintained by Semantic Web Company (SWC) and Semantica.cz. It has been mainly developed by Charles University in Prague as a student project called ODCleanStore (version 2). It is based on the experience SWC obtained with the LOD Management Suite (LODMS) used in WP7 and on ODCleanStore (version 1), developed by Charles University in Prague for the WP9a use case of the LOD2 FP7 project. In the next release of the LOD2 stack, UnifiedViews will replace LODMS as the ETL tool, and the tool has already been adopted in other projects.
In the webinar we will give a brief overview of the UnifiedViews project (Helmut Nagy). The main part will be a presentation of the tool and its capabilities (Tomas Knap).
LODeX: Schema Summarization and automatic SPARQL query generation for Linked ... Fabio Benedetti
The document describes LODeX, a system that automatically extracts and summarizes schemas from Linked Open Data sources to help users discover and query these datasets. It extracts statistical indexes from the datasets and uses them to generate interactive schema summaries describing classes, properties, and attributes. Users can then build basic queries through a GUI, which are compiled into SPARQL and executed. The system was tested on over 500 datasets and evaluations found the schema summaries and query generation were over 88% accurate. Future work includes improving the interface based on user surveys and extending the system to cluster summaries for large datasets.
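A rough sketch of the kind of statistical index such a summary could start from is shown below: it counts instances per class and, for each class, how often each property is used. This is an assumption about the general approach, not LODeX's actual algorithm; the `summarize` function and the `ex:` sample data are invented for the example.

```python
from collections import Counter, defaultdict

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def summarize(triples):
    """Return (instances per class, usage count per (class, property) pair)."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)
    class_counts = Counter(c for classes in types.values() for c in classes)
    property_counts = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for c in types.get(s, ()):
            property_counts[(c, p)] += 1
    return class_counts, property_counts


# Made-up sample data with shortened identifiers.
SAMPLE = [
    ("ex:madrid", RDF_TYPE, "ex:City"),
    ("ex:nice", RDF_TYPE, "ex:City"),
    ("ex:madrid", "ex:name", "Madrid"),
    ("ex:nice", "ex:name", "Nice"),
    ("ex:nice", "ex:country", "ex:france"),
]

classes, properties = summarize(SAMPLE)
print(classes)
print(properties)
```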
This document introduces Linked Data Fragments, which is an approach to querying Linked Data in a scalable and reliable way by moving intelligence from centralized servers to distributed clients. It describes how basic Linked Data Fragments can be used to answer SPARQL queries by retrieving and combining relevant fragments. The vision is for clients to be able to query different Linked Data sources across the web using various types of fragments. All Linked Data Fragments software is available as open source.
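The division of labour described above can be illustrated with a toy example: the "server" only answers single triple patterns, and the client combines those answers into solutions for a multi-pattern query. Everything here is a simplified assumption. `fragment` stands in for an HTTP request, the `ex:` data is made up, and real clients also use paging and count metadata to choose which pattern to request first, which this sketch omits.

```python
DATA = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "ex:knows", "ex:carol"),
    ("ex:bob", "ex:name", "Bob"),
    ("ex:carol", "ex:name", "Carol"),
]


def fragment(pattern, data=DATA):
    """Server side: return the triples matching one pattern; None is a wildcard."""
    return [t for t in data if all(p is None or p == v for p, v in zip(pattern, t))]


def is_var(term):
    return term.startswith("?")


def solve(patterns, binding=None):
    """Client side: answer a conjunction of patterns using only fragment()."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    # Substitute values bound so far; unbound variables become wildcards.
    request = tuple(binding.get(t) if is_var(t) else t for t in first)
    for triple in fragment(request):
        extended = dict(binding)
        consistent = True
        for term, value in zip(first, triple):
            if is_var(term) and extended.setdefault(term, value) != value:
                consistent = False
                break
        if consistent:
            yield from solve(rest, extended)


# "Whom does someone know, and what is that person's name?"
query = [("?a", "ex:knows", "?b"), ("?b", "ex:name", "?n")]
for solution in solve(query):
    print(solution)
```

Because the server only ever evaluates single patterns, its cost per request stays low and predictable, while the join work moves to the client.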
The document discusses the benefits of a federated and decentralized approach to knowledge and data on the web. It argues that centralized approaches like Big Data fail at web scale, as knowledge is inherently distributed and heterogeneous. A federated future based on light interfaces like Triple Pattern Fragments is envisioned, one where clients can query multiple data sources simultaneously for better performance and reliability compared to centralized endpoints. Serendipity and realistic expectations are important principles for this vision.
There are 4 SPARQL query forms: SELECT, ASK, CONSTRUCT, and DESCRIBE. Each form serves a different purpose. SELECT returns variable bindings and is equivalent to an SQL SELECT query. ASK returns a boolean for whether a pattern matches or not. CONSTRUCT returns an RDF graph constructed from templates. DESCRIBE returns an RDF graph that describes resources found. Beyond their basic uses, the forms can be used for tasks like indexing, transformation, validation, and prototyping user interfaces.
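The difference between the four forms is easiest to see by their return types. The toy evaluator below mimics each form over a single triple pattern on an in-memory set of triples; it is a sketch of the semantics, not a SPARQL engine, and the function names and `ex:` data are invented for the example. The `describe` function shows one possible choice, since the SPARQL specification leaves the exact content of a DESCRIBE result to the implementation.

```python
TRIPLES = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:name", "Alice"),
    ("ex:bob", "ex:name", "Bob"),
}


def match(pattern, triples):
    """Yield variable bindings for one triple pattern; '?x' terms are variables."""
    for triple in sorted(triples):
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                if binding.setdefault(term, value) != value:
                    break
            elif term != value:
                break
        else:
            yield binding


def select(variables, pattern, triples):
    """SELECT: a table of variable bindings."""
    return [tuple(b[v] for v in variables) for b in match(pattern, triples)]


def ask(pattern, triples):
    """ASK: a boolean, true if the pattern matches at least once."""
    return any(True for _ in match(pattern, triples))


def construct(template, pattern, triples):
    """CONSTRUCT: a new graph built by filling a template with each binding."""
    return {tuple(b.get(t, t) for t in template) for b in match(pattern, triples)}


def describe(resource, triples):
    """DESCRIBE: here, every triple that mentions the resource."""
    return {t for t in triples if resource in (t[0], t[2])}


print(select(["?s", "?n"], ("?s", "ex:name", "?n"), TRIPLES))
print(ask(("ex:bob", "ex:knows", "?x"), TRIPLES))
print(construct(("?o", "ex:knownBy", "?s"), ("?s", "ex:knows", "?o"), TRIPLES))
print(describe("ex:bob", TRIPLES))
```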
The document summarizes an F-Interop meetup in London about improving IoT interoperability. It discusses the challenges of interoperability testing, including the barriers faced by SMEs. It then describes the F-Interop project, which is developing online remote testing tools to help address these challenges. The meetup covered the F-Interop platform roles and capabilities, supported protocols, and an upcoming open call for new testing tools and interoperability tests.
RESTdesc – Efficient runtime service discovery and consumption (Ruben Verborgh)
The document discusses RESTdesc, which provides a way to efficiently discover and consume Web services through functional descriptions at runtime. It describes RESTdesc's use of HTTP, RDF, and custom vocabularies to semantically describe a photo height service. The description captures that the service takes a photo ID as input, performs a GET request to retrieve the height in pixels, and returns that value associated with the photo. RESTdesc aims to improve automated service use through simple yet powerful functional descriptions of what services do rather than how they are implemented.
A Semantic Description Language for RESTful Data Services to Combat Semaphobia (Markus Lanthaler)
The document proposes a semantic description language (SEREDASj) to provide machine-readable descriptions of RESTful web services. It aims to address the lack of standards for describing REST APIs and help combat "semaphobia", the fear of semantics. The language builds on previous work but is tailored specifically for REST by focusing on simplicity and supporting many use cases including discovery and composition of RESTful services.
This document discusses the principles of REST APIs and hypermedia. It argues that while hypermedia is useful for navigating a single API, it is not sufficient on its own for describing cross-API interactions. The document advocates combining self-descriptive representations of an API's responses with self-descriptive descriptions of an API's functionality to fully support how clients may use the API.
Aligning Web Services with the Semantic Web to Create a Global Read-Write Gra... (Markus Lanthaler)
Presentation of the paper "Aligning Web Services with the Semantic Web to Create a Global Read-Write Graph of Data", given at the 9th IEEE European Conference on Web Services (ECOWS 2011) in Lugano, Switzerland.
Despite significant research and development efforts, the vision of the Semantic Web yielding to a Web of Data has not yet become reality. Even though initiatives such as Linking Open Data have gained traction recently, the Web of Data is still clearly outpaced by the growth of the traditional, document-based Web. Instead of releasing data in the form of RDF, many publishers choose to publish their data in the form of Web services. The reasons for this are manifold. Given that RESTful Web services closely resemble the document-based Web, they are not only perceived as less complex and disruptive, but also provide read-write interfaces to the underlying data. In contrast, the current Semantic Web is essentially read-only, which clearly inhibits network effects and engagement of the crowd. On the other hand, the prevalent use of proprietary schemas to represent the data published by Web services prevents generic browsers or crawlers from accessing and understanding this data; the consequence is islands of data instead of a global graph of data forming the envisioned Semantic Web. We thus propose a novel approach to integrate Web services into the Web of Data by introducing an algorithm to translate SPARQL queries to HTTP requests. The aim is to create a global read-write graph of data and to standardize the mashup development process. We try to keep the approach as familiar and simple as possible to lower the entry barrier and foster its adoption. Thus, we based our proposal on SEREDASj, a semantic description language for RESTful data services, for making proprietary JSON service schemas accessible.
Live DBpedia querying with high availability (Ruben Verborgh)
The document discusses improving the availability of querying live DBpedia data by using a simpler interface like Triple Pattern Fragments instead of SPARQL. Triple Pattern Fragments places less load on servers, allowing them to achieve high availability like other HTTP interfaces. Complex queries can still be handled by clients assembling the results from multiple fragment requests rather than burdening servers.
Initial Usage Analysis of DBpedia's Triple Pattern Fragments (Ruben Verborgh)
The document summarizes an analysis of the usage of DBpedia's Triple Pattern Fragments interface between November 2014 and February 2015. Over 4 million requests were made to the interface with 99.9994% uptime. The top clients were the TPF client library, crawlers and Chrome browser. Most requests came from Europe, US and China. The analysis found the interface provided highly available querying of DBpedia's data but more work is needed to understand specific queries and build applications for end users.
The document discusses the history and future of the web and hypermedia. It covers the early concepts of hypertext by Ted Nelson in the 1970s. It then discusses the development of the web in the 1990s by Tim Berners-Lee and the constraints of HTTP, HTML, and URLs that made it scalable but limited. It introduces REST and how the web can be viewed as a RESTful architecture. It discusses the semantic web and using URIs, RDF, and schemas to add meaning for machines. It concludes by discussing how a combination of semantics and hypermedia could solve the hypermedia paradox by enabling browsers to dynamically generate appropriate links.
This document discusses using smart SPARQL agents to distribute reasoning over linked data. The agents can outsource reasoning to infrastructure like client-side, server-side, or third-party reasoning services. This allows reasoning to be performed as a service. Reasoned SPARQL allows data consumers to choose inference rules for querying distributed data. Nested queries and workload balancing techniques are also described.
Presentation of SAPS at the 1st International Workshop on the Information-Centric Web (IC-Web 2011) at the 11th IEEE/IPSJ International Symposium on Applications and the Internet (SAINT 2011) in Munich, Germany
Invited talk at USEWOD2014 (http://people.cs.kuleuven.be/~bettina.berendt/USEWOD2014/)
A tremendous amount of machine-interpretable information is available in the Linked Open Data Cloud. Unfortunately, much of this data remains underused as machine clients struggle to use the Web. I believe this can be solved by giving machines interfaces similar to those we offer humans, instead of separate interfaces such as SPARQL endpoints. In this talk, I'll discuss the Linked Data Fragments vision on machine access to the Web of Data, and indicate how this impacts usage analysis of the LOD Cloud. We all can learn a lot from how humans access the Web, and those strategies can be applied to querying and analysis. In particular, we have to focus first on solving those use cases that humans can do easily, and only then consider tackling others.
- Web Worker context compared to SSJS context
- Mixed Synchronous / Asynchronous APIs
- Making Existing Client-side JS APIs recommendations adaptable to the server context
- Defining W3C recommendation for Server-side JavaScript APIs?
- Remote debugging for Remote (Server) Workers
- Potential common package/module format support (CommonJS, AMD, ECMAScript 6)
- DOM Events, ProgressEvent, EventSource, Server Events (EventEmitter?), & Client Events
- Feedback on previous work at CommonJS and from some SSJS implementations
- Feedback on our experiences in the Wakanda implementation
- start the activity of the community group
EU Tools for all Open Data harmonisation all over Europe (Marc Garriga)
The document discusses open data harmonization efforts across Europe. It outlines reasons why building open data services is difficult, including a lack of standardized formats and vocabularies. It then describes several European initiatives and tools that aim to address these challenges by promoting standards like DCAT and DCAT-AP for cataloguing open government data, as well as projects that develop federated open data portals and address legal interoperability issues.
Publishing Linked Open Data on the Web & the Role of Ontologies (María Poveda Villalón)
This document contains information about a presentation given by María Poveda Villalón on publishing linked open data on the web and the role of ontologies. It provides details about María's background and work at the Ontology Engineering Group in Madrid. It also gives an overview of the group's research areas including ontological engineering, linked data technologies and applications, and involvement in various projects and standardization activities.
Semantic Data Architecture and Ontological Infrastructure (ASIO) (AdrinSaavedraSerrano)
This document introduces linked open data and the semantic web. It discusses the evolution from a document-based web to a web of interlinked data using standards like RDF and URIs. It explains the differences between open data and linked data and outlines best practices for publishing linked open data using the 5 star framework. Key topics covered include linked data principles, publishing structured data using formats like RDFa and schema.org, and the economic and social benefits of linked open and government data.
The state of global research data initiatives: observations from a life on th... (Projeto RCAAP)
This document summarizes the state of global research data initiatives. It discusses that while interest in research data management is growing globally, challenges remain, including lack of advocacy, skills, and incentives. However, it also outlines strengths in many countries through investments in infrastructure and policies. It calls for increased international collaboration and coordination to help manage more research data according to FAIR and open principles.
Presentation during the 14th Association of African Universities (AAU) Conference and African Open Science Platform (AOSP)/Research Data Alliance (RDA) Workshop in Accra, Ghana, 7-8 June 2017.
Putting the L in front: from Open Data to Linked Open Data (Martin Kaltenböck)
Keynote presentation of Martin Kaltenböck (LOD2 project, Semantic Web Company) at the Government Linked Data Workshop in the course of the OGD Camp 2011 in Warsaw, Poland: Putting the L in front: from Open Data to Linked Open Data
The document discusses the evolution and history of the Internet and the Research Data Alliance (RDA). It provides details on:
- How the Internet originated from research networks developed by DARPA in the 1960s-70s.
- The RDA aims to build bridges for open sharing of research data globally by facilitating collaboration between experts. It is supported by funding from the EC, Australian NSD, and US NSF/NIST.
- The RDA works through Working and Interest Groups that develop standards and recommendations to advance data sharing at biannual plenary meetings. Several outputs addressing issues like metadata standards, data type registries, and PID information are expected in 2014.
In recent years governments and research institutions have emphasized the need for open data as a fundamental component of open science. But we need much more than the data themselves for them to be reusable and useful. We need descriptive and machine-readable metadata, of course, but we also need the software and the algorithms necessary to fully understand the data. We need the standards and protocols that allow us to easily read and analyze the data with the tools of our choice. We need to be able to trust the source and derivation of the data. In short, we need an interoperable data infrastructure, but it must be a flexible infrastructure able to work across myriad cultures, scales, and technologies. This talk will present a concept of infrastructure as a body of human, organisational, and machine relationships built around data. It will illustrate how a new organization, the Research Data Alliance, is working to build those relationships to enable functional data sharing and reuse.
The document discusses open data initiatives and tools for data sharing. It describes projects from the EDINA National Data Centre, DISC-UK DataShare project which investigated legal and technical issues around research data sharing, and tools for visualizing and sharing numeric and spatial data online like Many Eyes, Gapminder and OpenStreetMap. It also covers barriers to data sharing, harnessing collective intelligence through open science, and citizens contributing geographic data through tools like geograph.
NordForsk Open Access Reykjavik 14-15/8-2014: RDA (NordForsk)
The Research Data Alliance provides opportunities for global collaboration on data-related issues. It grew from the need to connect research computers and share data openly across technologies and borders. RDA works through Working and Interest Groups to develop standards and best practices around topics like data citation and metadata. Recent outputs include recommendations for data type registries and persistent identifier information types. RDA membership includes over 1,900 individuals from 83 countries and represents academia, government, and industry.
The tripscore Linked Data client: calculating specific summaries over large t... (David Chaves-Fraga)
The document describes the Tripscore Linked Data client, which calculates summaries over large public transportation time series data stored using the Linked Connections framework. It addresses the problem of expensive analytical queries over long periods of public transport data. The client moves processing to the client side by requesting summarized data from the server in order to improve performance. Next steps include transforming additional real-world public transport datasets into the Linked Connections format and improving metadata for discoverability.
Data management plans – EUDAT Best practices and case study (EUDAT)
Presentation given by Stéphane Coutin during the PRACE 2017 Spring School joint training event with the EU H2020 VI-SEEM project (https://vi-seem.eu/), organised by CaSToRC at The Cyprus Institute. Science, and more specifically projects using HPC, is facing a digital data explosion. Instruments and simulations are producing ever larger volumes; data can be shared, mined, cited, preserved… They are a great asset, but they face risks: we can run out of storage, we can lose them, they can be misused… To start this session, we will review why it is important to manage research data and how to do this by maintaining a Data Management Plan. This will be based on the best practices from the EUDAT H2020 project and the European Commission recommendation. During the second part we will interactively draft a DMP for a given use case.
Tracey P. Lauriault discusses open data and technological citizenship. She makes three key points:
1) Data are not objective or politically neutral, but are inseparable from the ideas, technologies, and contexts that produce them.
2) Technological citizenship involves engaging with data and technology as a form of political participation and action.
3) Various definitions and principles of open data have emerged over time from organizations aiming to make data accessible and shareable.
This document discusses linked statistical data and its benefits. It provides an overview of key concepts like open data, linked data, and the W3C RDF DataCube specification. It also presents a case study of a statistical office in Aragon, Spain that has published local government data as linked open data. Publishing data this way allows for easier reuse by both internal and external developers. It facilitates integration with other datasets and enables complex queries across multiple sources. Overall, representing statistical data using semantic web standards like RDF DataCube provides advantages for data sharing and reuse.
This document discusses how open data is turned into services. It notes that while open data initiatives aim to spur the creation of new services, the development of sustainable services based on open data has been disappointing. Several approaches are explored to better encourage the creation of open data services, such as addressing intellectual property rights, improving data discoverability, providing support like APIs and training, and running competitions. However, studies suggest that the potential for open data to generate new services may be overstated, as reusers are not strongly demanding open datasets and there is a lack of alignment between released datasets and created services.
What do analytics on learning analytics tell us? How can we make sense of this emerging field’s historical roots, current state, and future trends, based on how its members report and debate their research?
Challenge submissions should exploit the LAK Dataset for a meaningful purpose. This may include submissions which cover one or more of the following, non-exclusive list of topics:
Analysis & assessment of the emerging LAK community in terms of topics, people, citations or connections with other fields
Innovative applications to explore, navigate and visualise the dataset (and/or its correlation with other datasets)
Usage of the dataset as part of recommender systems
Analysis of the evolution of LAK discipline
Improvement or enrichment of the LAK Dataset
Linked Data Generation Process
1. LD4SC Summer School, 7th - 12th June, Cercedilla, Spain
1st Summer School on Smart Cities and Linked Open Data (LD4SC-15)
Linked Data Generation Process
Raúl García-Castro, Filip Radulovic, Oscar Corcho, María Poveda, Víctor Rodríguez-Doncel, Asunción Gómez-Pérez, Daniel Vila-Suero
Presenter: Raúl García-Castro
2. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
3. Data in smart cities
http://br.fiberhomegroup.com/pt/Enterprise/324/2282.aspx
4. Open data… for what?
• For example, (re)using open transport data
  – Provide travel information to persons
  – Allow better multimodal route planning
  – Facilitate public transport management
  – …
  – Accessibility
    • Which metro accesses are accessible for wheelchair users?
    • In which bus stops is it safer and more convenient for a wheelchair user to wait?
    • Is there any accessible parking space nearby a bus stop?
    • etc.
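As a rough illustration of the first accessibility question on this slide, the sketch below filters a handful of stop records for wheelchair access. The records and stop names are invented for illustration; the only borrowed convention is the `wheelchair_boarding` field of GTFS `stops.txt` (empty or 0 for no information, 1 for accessible, 2 for not accessible). It is a minimal sketch under those assumptions, not the method used in the summer school.

```python
# Hypothetical stop records; identifiers and names are invented for illustration.
# wheelchair_boarding follows the GTFS stops.txt convention:
# "" or "0" = no information, "1" = accessible, "2" = not accessible.
STOPS = [
    {"stop_id": "M1", "stop_name": "Metro access North", "wheelchair_boarding": "1"},
    {"stop_id": "M2", "stop_name": "Metro access South", "wheelchair_boarding": "2"},
    {"stop_id": "B1", "stop_name": "Bus stop Plaza", "wheelchair_boarding": ""},
]


def wheelchair_accessible(stops):
    """Return the names of stops explicitly marked as wheelchair accessible."""
    return [s["stop_name"] for s in stops if s.get("wheelchair_boarding") == "1"]


print(wheelchair_accessible(STOPS))
```

Stops with no accessibility information are deliberately excluded, since missing data should not be read as "accessible".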
5. Legal framework and open data initiatives
• Aarhus Convention (1998)
  – Right to participation and access; 41 countries and the EU
• Open Access Initiative (2001)
  – Scientific information on the Web; > 510 organisations
• PSI Directive
  – PSI Reuse (2003/98/EC)
• Convention for the access to official documents (2009)
  – Signed by 12 countries
  – Belgium, Finland, Norway, Sweden, Hungary, Estonia, Lithuania, Slovenia, Georgia, Montenegro, Serbia and Macedonia
• Law 37/2007. PSI Reuse
• Law 11/2007. Citizen access to public services and right to the quality of services
• RD 4/2010. National Interoperability Scheme
  – Open standards
  – Technology neutral
  – Open source software
• RD 1495/2011. It develops Law 37/2007
• Norma Técnica de Interoperabilidad (19/02/2013, BOE 4/3/2013)
Adapted from Antonio Rodríguez Pascual (IGN)
6. The problem: lack of interoperability
[Diagram labels, repeated three times: Publish, Extract]
"I want to publish data in an interoperable structure and format"
"I use GTFS"
"I use my own CSV structure"
"I provide a web service"
"Build an app that is available all over the world"
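The interoperability gap on this slide can be made concrete with a small sketch. It assumes two hypothetical publishers: one uses the GTFS `stops.txt` column names (`stop_id`, `stop_name`, `stop_lat`, `stop_lon`), the other uses its own CSV structure (the Spanish column names and the semicolon delimiter are invented for illustration, and the coordinates are illustrative values). An app developer needs one hand-written mapping per publisher before the two sources can be used together.

```python
import csv
import io

# Publisher A: GTFS-style stops.txt columns (coordinates are illustrative).
GTFS_STOPS = """stop_id,stop_name,stop_lat,stop_lon
A1,Cercedilla,40.74,-4.06
"""

# Publisher B: a hypothetical home-grown CSV structure (invented column names).
CUSTOM_STOPS = """codigo;nombre;latitud;longitud
B7;Puerto de Navacerrada;40.79,-4.00
""".replace(",-4.00", ";-4.00")


def read_gtfs(text):
    """Map GTFS-style rows to a shared record shape."""
    return [
        {"id": r["stop_id"], "name": r["stop_name"],
         "lat": float(r["stop_lat"]), "lon": float(r["stop_lon"])}
        for r in csv.DictReader(io.StringIO(text))
    ]


def read_custom(text):
    """Map the home-grown structure to the same shared record shape."""
    return [
        {"id": r["codigo"], "name": r["nombre"],
         "lat": float(r["latitud"]), "lon": float(r["longitud"])}
        for r in csv.DictReader(io.StringIO(text), delimiter=";")
    ]


# Only after both mappings exist can the app treat the stops uniformly.
stops = read_gtfs(GTFS_STOPS) + read_custom(CUSTOM_STOPS)
for stop in stops:
    print(stop["id"], stop["name"], stop["lat"], stop["lon"])
```

Every additional publisher with its own structure requires another mapping function of this kind, which is the cost that a shared, interoperable structure and format is meant to remove.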
7. Scenario: open transport data
Is there any open transport data already?
We are surrounded by them
8. Open data and how they are published
1) In notice boards
  – For those who have a lot of free time
  – Or those who are there at the right moment in time
Adapted from Antonio Rodríguez Pascual (IGN)
DATA
9. Open data and how they are published
2) In web pages and mobile apps
  – For people
Adapted from Antonio Rodríguez Pascual (IGN)
On the Web, open license
DATA
10. Open data and how they are published
2) In web pages and mobile apps
  – For people
Adapted from Antonio Rodríguez Pascual (IGN)
On the Web, open license
DATA
Machine-readable
Non-proprietary format
11. Open data and how they are published
3) As web files
  – So that they can be loaded by humans in their information systems (XML, HTML, CSV, etc.)
  – Hopefully it is not a scanned PDF
Adapted from Antonio Rodríguez Pascual (IGN)
On the Web, open license
DATA
Machine-readable
Non-proprietary format
12. Open data and how they are published
4) Via web services
  – For humans and machines
  – It allows generating added-value services
  – And can be integrated in the application business logic
Labels: DATA; On the Web, open license; Machine-readable; Non-proprietary format
Adapted from Antonio Rodríguez Pascual (IGN)
13. What is open data?
• Open data are data that can be freely used, reused and redistributed by anyone, subject only, at most, to the requirement to attribute and sharealike.
• The most important aspects to consider:
  – Availability and Access: data must be available as a whole and at no more than a reasonable reproduction cost, preferably by downloading over the Internet. Data must also be available in a convenient and modifiable form.
  – Reuse and Redistribution: data must be provided under terms that permit reuse and redistribution, including the intermixing with other datasets.
  – Universal Participation: everyone must be able to use, reuse and redistribute. There should be no discrimination against fields of endeavour or against persons or groups. For example, 'non-commercial' or 'only in education' restrictions are not allowed.
Source: Open Data Handbook
14. Scenario: open transport data
Is there any open transport data already?
Can we do it better?
15. Going into 4 and 5: Linked Data
★ Make your stuff available on the Web (whatever format) under an open license
★★ Make it available as structured data (e.g., Excel instead of image scan or a table)
★★★ Use non-proprietary formats (e.g., CSV instead of Excel)
★★★★ Use URIs to identify things, so that people can point at your stuff
★★★★★ Link your data to other data to provide context
16. USE URIs + RDF (RDF standards)
[Diagram: four sources, each converted to RDF]
  – API: José; Mobility impairment; Boardgames
  – CSV: Mirasierra; Ventisquero de la Condesa; Yes
  – CSV: Mega Games; Ventisquero de la Condesa; Yes
  – HTML: Mega Games; Conquer & Smash!; MG; 29,95
RDF statements:
  – José hasImpairment Mobility Impairment
  – Mobility Impairment requires Wheelchair Accessibility
  – José likes Boardgame
  – Mirasierra address Ventisquero de la Condesa
  – Mirasierra hasAccessibility Wheelchair Accessibility
  – Mega Games address Ventisquero de la Condesa
  – Mega Games hasAccessibility Wheelchair Accessibility
  – Mega Games sells Conquer & Smash!
  – Conquer & Smash! is a Boardgame
17. Link your data (Linked RDF)
[Diagram: the same four sources (API, CSV, CSV, HTML) and RDF statements as on slide 16]
18. Link your data (Linked RDF)
[Diagram: the same four sources and RDF statements as on slide 16, now connected through the shared nodes Wheelchair Accessibility, Ventisquero de la Condesa and Boardgame]
19. Make complex queries
  – "Where can I buy the Conquer & Smash! game?"
  – "Which are the most accessible routes for Christmas shopping?"
MG: "Expansion pack for Conquer & Smash! Take metro line 9 and in 35 minutes we can demo it to you! Or better take bus 231 because it is sunny and you can take a glance at the outdoor art exhibition in Plaza de Castilla"
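The kind of query on this slide can be sketched over the statements from the José / Mega Games example. This is only an illustration: the identifiers are shortened, made-up names rather than real URIs, and the pattern matcher is a toy stand-in for a SPARQL engine.

```python
# Toy triple store holding the example statements (hypothetical short names, not real URIs).
TRIPLES = [
    ("Jose", "hasImpairment", "MobilityImpairment"),
    ("MobilityImpairment", "requires", "WheelchairAccessibility"),
    ("Jose", "likes", "Boardgame"),
    ("Mirasierra", "address", "Ventisquero de la Condesa"),
    ("Mirasierra", "hasAccessibility", "WheelchairAccessibility"),
    ("MegaGames", "address", "Ventisquero de la Condesa"),
    ("MegaGames", "hasAccessibility", "WheelchairAccessibility"),
    ("MegaGames", "sells", "ConquerAndSmash"),
    ("ConquerAndSmash", "isA", "Boardgame"),
]


def match(s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [t for t in TRIPLES
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)]


def accessible_shops_selling(product, person):
    """Shops that sell the product and offer every accessibility feature the person needs."""
    needs = set()
    for _, _, impairment in match(s=person, p="hasImpairment"):
        for _, _, feature in match(s=impairment, p="requires"):
            needs.add(feature)
    result = []
    for shop, _, _ in match(p="sells", o=product):
        offers = {o for _, _, o in match(s=shop, p="hasAccessibility")}
        if needs <= offers:
            addresses = [o for _, _, o in match(s=shop, p="address")]
            result.append((shop, addresses))
    return result


ANSWER = accessible_shops_selling("ConquerAndSmash", "Jose")
print(ANSWER)
```

The answer combines statements that originally came from different sources (API, CSV, HTML), which is only possible because they share node names.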
20. Using Linked Open Transport Data
• Calculate accessible routes
  – Combined with geographical data (IGN)
  – Which stop should I use if I have mobility problems?
• Commercial routes by bus
  – Combined with Madrid's shop census (from Ayto. Madrid)
• Geomarketing decisions for entrepreneurs
  – Where should I open my shop? Based on the combination of the number of travellers per stop, demographic data, data about other businesses and shops around, etc.
• Personalised offers to travellers
  – With real-time data and data about consumption patterns (e.g., credit card transactions)
• …
21. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
22. Linked Data life cycle
[Diagram stages: Specification, Modelling, Generation, Publication, Exploitation, Linking]
23. Requirements (smart cities domain)
1. Tabular formats (i.e., SQL, XLS or CSV)
  – Other data structures (e.g., XML) are less important in practice, or are unstructured and would require much more work
2. Changing data (dynamic or streaming data), versioning, (automatic) data quality assurance and reliability
3. Data access through web services, proprietary APIs and data files
4. Legal aspects (e.g., licensing, data ownership)
5. Access rights management or mechanisms for extracting public data (plenty of confidential data)
24. Linked Data generation process
[Process diagram: steps and their outputs]
  – Select data source (output: data source)
  – Obtain access to data source (output: access, data)
  – Analyse licensing of the data source (output: license)
  – Analyse data source (output: schema, data)
  – Define resource naming strategy (output: resource naming strategy)
  – Develop ontology (output: ontology)
  – Transform data source (output: RDF data)
  – Link with other datasets (output: linked dataset)
F. Radulovic, M. Poveda-Villalón, D. Vila-Suero, V. Rodríguez-Doncel, R. García-Castro and A. Gómez-Pérez, Guidelines for Linked Data generation and publication: An example in building energy consumption, Automation in Construction, Special Issue on Linked Data in Architecture and Construction. Available online April 2015.
25. Linked Data generation process
[Process diagram as on slide 24; highlighted stage: DATA PREPARATION]
26. Select data source
• Select the data source that will be transformed into Linked Data
• Steps:
  – To define the requirements for selection
  – To select one or several data sources
• The data set may be:
  – Owned by your organization…
  – … or not (external data sources)
27. Select data source – Example
• Requirements
  – Real-world scenario in the smart city domain
  – Available for use
  – Available in machine-processable format (the more structured the data are, the better)
  – Can be linked with generic entities (e.g., location)
• Leeds City Council – energy consumption
  – http://data.gov.uk/dataset/council-energy-consumption
28. Obtain access to data source
• Data access means
  – Technical means to retrieve the data
  – Legal rights to use the data
• If the data is not accessible:
  – To identify the person to contact
  – To request the access
  – To obtain access and to retrieve the data
• Access alternatives:
  – file,
  – programming interface,
  – database,
  – data stream,
  – etc.
29. Obtain access to data source – Example
• Data set already available as a CSV file
30. Analysing licensing of the data source
• Licenses specify the legal terms under which a data set can be used and exploited
• There are neither legal prescriptions on how to declare licenses nor common standard practices for doing so
• Steps (not automatable):
  – To identify the rightsholder and the authoritative publisher
    • Rightsholder vs. authorized distributor
  – To find the applicable license
    • Web page, data set metadata, data themselves
    • Contact the publisher
  – To read the license and analyse legal terms
• Tips
  – Analysis should be performed upon all copies and formats of the data
  – Ensure license compatibility when integrating several data sources
31. Linked Data resources can be protected
  – Ontologies are intellectual works; they can be protected by copyright
  – RDF datasets can be considered as databases, which are also legally protected in the EU
32. Create, consume, aggregate, derive and publish Linked Data in a lawful environment
Always license your data …
[Diagram actors: Data shops, Government, Individuals]
33. Licensed Linked Data
  – Non-licensed Linked Data: unless there is a license allowing it, the resource cannot be copied, modified or published. In practice, non-licensed resources are useless in industrial settings
  – Licensed Linked Data (Linked Data + license): can be used
34. Licensed Linked Data in practice
  – Linked Open Data: published, open license
  – (Published) Linked Data: published, no open license
  – Linked Data: not published, no open license
35. Guidelines for licensing linked data
1. Add "rights" metadata in the dataset description (e.g., VoID, DCAT)
2. Use standard predicates to declare "rights" statements (e.g., Dublin Core terms: dc:rights, dct:license)
3. Standard license available?
  – Yes (3a): use the URI of a standard license, e.g., CC0
  – No (3b): use a rights declaration language, e.g., ODRL
ODRL: Open Digital Rights Language
DCAT: Data Catalog Vocabulary
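Guidelines 1, 2 and 3a can be illustrated with a small generator of a VoID dataset description in Turtle. This is a minimal sketch: the dataset URI and title are hypothetical, and the only vocabulary terms used are void:Dataset, dct:title and dct:license.

```python
CC0 = "http://creativecommons.org/publicdomain/zero/1.0/"  # URI of the CC0 dedication


def dataset_description(dataset_uri, title, license_uri):
    """Build a Turtle description that declares the license of a dataset."""
    safe_title = title.replace("\\", "\\\\").replace('"', '\\"')
    return "\n".join([
        "@prefix dct: <http://purl.org/dc/terms/> .",
        "@prefix void: <http://rdfs.org/ns/void#> .",
        "",
        "<" + dataset_uri + "> a void:Dataset ;",
        '    dct:title "' + safe_title + '" ;',
        "    dct:license <" + license_uri + "> .",
    ])


# Hypothetical dataset URI and title, used only for illustration.
DESCRIPTION = dataset_description(
    "http://example.org/dataset/ds01", "Example energy dataset", CC0)
print(DESCRIPTION)
```

When no standard license fits (case 3b), the object of dct:license would instead point to a policy written in a rights declaration language such as ODRL.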
36. Licensing Linked Data is Simple…
The British National Bibliography (BNB) lists the books and new journal titles published or distributed in the United Kingdom and Ireland since 1950.
37. … or complex, depending on your needs
Policies can be expressed with ODRL 2.0 to govern access to Linked Data
Example of access to Linked Data for a price (15 EUR for the dataset or 0.01 EUR for a triple thereof)
    # Prefix declarations for rdf, rdfs and odrl were missing on the slide.
    @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix odrl: <http://www.w3.org/ns/odrl/2/> .
    @prefix gr: <http://purl.org/goodrelations/v1#> .
    @prefix dcat: <http://www.w3.org/ns/dcat#> .

    <http://salonica.dia.fi.upm.es/ldr/policy/cdaddba4-fc2e-4ee0-a784-e62f1db259bf>
        a odrl:Set ;
        rdfs:label "License Offering Paid Linked Data" ;
        # Permission 1: reproduce the whole dataset for 15 EUR
        odrl:permission [ a odrl:Permission ;
            odrl:target <http://example.org/dataset/ds01> ;
            odrl:action odrl:reproduce ;
            odrl:duty [ a odrl:Duty ;
                rdfs:label "Pay" ;
                gr:UnitOfMeasurement dcat:Dataset ;
                gr:amountOfThisGood "1" ;
                odrl:action odrl:pay ;
                odrl:target "15,00 EUR"
            ]
        ] ,
        # Permission 2: reproduce a single triple for 0.01 EUR
        [ a odrl:Permission ;
            odrl:action odrl:reproduce ;
            odrl:target <http://example.org/dataset/ds01> ;
            odrl:duty [ a odrl:Duty ;
                rdfs:label "Pay" ;
                gr:UnitOfMeasurement rdf:Statement ;
                gr:amountOfThisGood "1" ;
                odrl:action odrl:pay ;
                odrl:target "0,01 EUR"
            ]
        ] .
The target can be an ontology, a dataset, a SPARQL endpoint… or a SPARQL query itself or a triple pattern: {mysubject, ?p, ?o}
38. And you have support for that
• Conditional access to Linked Data
  – http://conditional.linkeddata.es
• Dataset of licenses in RDF
  – http://rdflicense.appspot.com
• ODRL Profile for Linked Data
  – http://purl.oclc.org/NET/ldr/ns#
  – https://www.w3.org/community/odrl/profile/linkeddata/
40. Analyse data source
• Get insight into the data structure and organization
• Steps:
  – To analyse the characteristics of the data
    • Data values, data ranges, etc.
  – To obtain the schema of the data
    • Concepts and their relationships
• Data can be available as:
  – Structured data
  – Unstructured data
• If the schema does not exist:
  – Use a standard modeling language for describing the data schema (e.g., UML)
41. Analyse data source – Example
• Metadata not quite descriptive:
  – Different types of council sites (mostly buildings)
  – Electricity, gas and oil consumptions
  – 1-year intervals: 2010/11, 2011/12, 2012/13
• Analysis required contacting people from LCC open data
42. Analyse data source – Example
http://localhost:3333/
43. Analyse data source – Example
44. Analyse data source – Example
• Analyse the characteristics of data using facets
• Obtain the schema of the data
45. Data characteristics and schema – Example
Column            | Type    | Comments / Range (rounded) | Problems
uprn              | String  | Not unique, empty values   |
Site Name         | String  | Unique? Site types + name  | 4 repeated sites
Address 2         | String  | Not unique, empty values   |
Address 3         | String  | Not unique, empty values   | Village? Civil Parish?
Address 4         | String  | Not unique, empty values   | City? Metropolitan district? "leeds" vs "Leeds"
PostCode          | String  | Not unique, empty values   |
Electricity 10/11 | Decimal | 0 to 2,700,000             |
Electricity 11/12 | Decimal | 0 to 2,300,000             |
Electricity 12/13 | Decimal | 0 to 2,400,000             |
Gas 10/11         | Decimal | -100,000 to 6,100,000      | Negative values
Gas 11/12         | Decimal | -100,000 to 7,800,000      | Negative values
Gas 12/13         | Decimal | -100,000 to 8,300,000      | Negative values
Oil 12/13         | Decimal | -1,000,000 to 13,000,000   | Negative values
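A column profile like the one in this table can be approximated with a few lines of standard-library Python. The sketch below is an assumption about how such checks could be scripted, not the procedure used in the tutorial (which relies on OpenRefine facets); the sample rows are invented and only reuse the column names of the Leeds data set.

```python
import csv
import io

# Invented sample rows; only the column names come from the example data set.
SAMPLE = "uprn,Address 4,Gas 12/13\n1001,Leeds,1500.5\n1001,leeds,-20\n,Leeds,\n"


def profile(rows, column):
    """Summarise one column: empty cells, uniqueness, numeric range, negative values."""
    values = [row[column].strip() for row in rows]
    non_empty = [v for v in values if v]
    numbers = []
    for v in non_empty:
        try:
            numbers.append(float(v))
        except ValueError:
            pass  # non-numeric cell: ignored for the range computation
    return {
        "empty": len(values) - len(non_empty),
        "unique": len(set(non_empty)) == len(non_empty),
        "min": min(numbers) if numbers else None,
        "max": max(numbers) if numbers else None,
        "negative": sum(1 for n in numbers if n < 0),
    }


ROWS = list(csv.DictReader(io.StringIO(SAMPLE)))
GAS = profile(ROWS, "Gas 12/13")
UPRN = profile(ROWS, "uprn")
print(GAS)
print(UPRN)
```

Here the profile of "Gas 12/13" flags one negative value and one empty cell, and the profile of "uprn" reports that the identifier is not unique, mirroring the kinds of problems listed in the table.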
46. Linked Data generation process
[Process diagram as on slide 24; highlighted stage: DEFINE RESOURCE NAMING STRATEGY]
47. Hash and slash URIs
• Hash URIs (#)
  – http://www.energycompany.com/about#energyCompany
  – The fragment part has to be stripped off when the URI is requested from the server (i.e., the resource cannot be retrieved directly)
  – Hash URIs can be used to identify non-document resources
• Slash URIs (/)
  – http://www.energycompany.com/about/energyCompany
  – Imply a 303 redirection to the location of a document that represents the resource (+ content negotiation)
    • E.g., http://www.energycompany.com/about/energyCompany.rdf
  – Drawbacks: HTTP round-trip, redirects, web server configuration
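The fragment-stripping behaviour of hash URIs can be checked with the Python standard library. A minimal sketch: `request_target` is a hypothetical helper name, and it only shows which part of each URI would be sent to the server; it performs no HTTP request and does not model the 303 redirect.

```python
from urllib.parse import urldefrag


def request_target(uri):
    """Split a URI into the part sent to the server and the fragment kept by the client."""
    url, fragment = urldefrag(uri)
    return url, fragment


HASH_URI = "http://www.energycompany.com/about#energyCompany"
SLASH_URI = "http://www.energycompany.com/about/energyCompany"

# Hash URI: the server only ever sees .../about; the client resolves #energyCompany
# inside the returned document.
print(request_target(HASH_URI))
# Slash URI: the full path reaches the server, which can then answer with a 303 redirect.
print(request_target(SLASH_URI))
```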
48. Hash or slash?
• Depends on the data and on their expected use
• Small data:
  – Hash namespace
  – Access all the data as a whole
  – HTTP GET would return a single information resource with everything
• Large / frequently-updated / modular data:
  – Slash namespace
  – Access resources individually or in groups
  – Resource descriptions may be divided among many information resources or may be managed via a query service (e.g., SPARQL)
  – Progressively greater detail about resources may be retrieved through multiple accesses
49. Define resource naming strategy
• Steps:
  – To choose a URI form (hash or slash)
  – To choose a domain for the URIs
  – To choose a path for the URIs
  – To choose a pattern for classes and properties in the ontology, as well as for individuals
• Tips:
  – One URI must identify only one item (e.g., do not use the same URI for a web page and for the real-world object it describes)
  – URIs should be persistent and should not change over time (e.g., do not encode state information in them); PURL may support this
  – Use a domain that is under your control (or a service such as PURL)
  – Separate the ontology model from its instances
  – Define meaningful URIs
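A naming strategy of this kind can be written down as a pair of small functions. The sketch below is one possible choice, not the strategy used in the tutorial: the domain `example.org`, the `/ontology#` and `/resource/` paths and the helper names are all assumptions made for illustration. It uses a hash namespace for ontology terms and a slash namespace for individuals, which also keeps the model separate from its instances.

```python
import re
from urllib.parse import quote

BASE = "http://example.org"  # hypothetical domain; in practice one under your control


def slug(label):
    """Turn a human-readable label into a URI-safe, meaningful path segment."""
    collapsed = re.sub(r"[\W_]+", "-", label.strip()).strip("-")
    return quote(collapsed)


def ontology_term(name):
    """Hash URI for a class or property of the ontology."""
    return BASE + "/ontology#" + name


def resource_uri(class_name, label):
    """Slash URI for an individual, kept apart from the ontology namespace."""
    return BASE + "/resource/" + class_name + "/" + slug(label)


print(ontology_term("CulturalSite"))
print(resource_uri("Site", "Leeds Town Hall"))
```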
51. Linked Data generation process
[Process diagram as on slide 24; highlighted stage: DEVELOP ONTOLOGY]
52. Ontology development
1. Requirements definition
2. Terms extraction
3. Ontology conceptualization
  3.1 Initial model drafting
  3.2 Detailed model definition
4. Ontology search
5. Ontology selection
6. Ontology implementation
  6.1 Ontology integration
  6.2 Ontology completion
7. Ontology evaluation
Decision point: Can you represent all your data?
Note: You did this yesterday
53. Ontology development – Example
54. Linked Data generation process
[Process diagram as on slide 24; highlighted stage: TRANSFORM DATA]
55. Data transformation
• Steps:
  – To select the RDF serialization
    • RDF/XML, Turtle, N-Triples, JSON-LD
  – To select a tool. Depends on:
    • The format of the data (database, spreadsheets, etc.)
    • Concrete needs of the transformation process (e.g., dynamicity)
  – To transform the data into RDF
    • Usually requires a mapping between the data and the ontology
    • The mapping implements the resource naming strategy
  – To evaluate the obtained RDF data:
    • Syntax, Completeness, Accuracy, Conciseness, Modelling, Understandability, Versatility, Usage, Licensing, …
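The idea of a mapping between tabular data and an ontology can be sketched in plain Python that emits N-Triples. This is an illustration under stated assumptions, not the tool chain of the tutorial: the base URIs, the property names `siteName` and `electricity1213`, and the two sample rows are invented; only the column names come from the example data set.

```python
import csv
import io

BASE = "http://example.org/resource/site/"   # hypothetical namespace for individuals
ONT = "http://example.org/ontology#"         # hypothetical ontology namespace
XSD_DECIMAL = "http://www.w3.org/2001/XMLSchema#decimal"

# Invented rows that reuse column names from the example data set.
SAMPLE_CSV = """uprn,Site Name,Electricity 12/13
1001,Example Library,1234.5
1002,Example Depot,
"""


def literal(text):
    """Quote a string as an N-Triples literal, escaping backslashes and quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def row_to_ntriples(row):
    """Map one CSV row to triples; the subject URI follows the naming strategy."""
    subject = "<" + BASE + row["uprn"] + ">"
    triples = [subject + " <" + ONT + "siteName> " + literal(row["Site Name"]) + " ."]
    value = row["Electricity 12/13"].strip()
    if value:  # empty cells produce no triple
        triples.append(subject + " <" + ONT + "electricity1213> "
                       + literal(value) + "^^<" + XSD_DECIMAL + "> .")
    return triples


def transform(csv_text):
    lines = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        lines.extend(row_to_ntriples(row))
    return lines


NTRIPLES = transform(SAMPLE_CSV)
print("\n".join(NTRIPLES))
```

The `row_to_ntriples` function plays the role of the mapping: it decides which column becomes the subject URI and which columns become property values.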
56. Data transformation tools
• Database to RDF: morph-RDB, D2R Server, TopBraid Composer
• Data streams to RDF: morph-streams, D2R Server
• Spreadsheets to RDF: TopBraid Composer, Excel2RDF, RDF123, XLWrap, OpenRefine/LODRefine
• XML to RDF: XML2RDF, TopBraid Composer, OpenRefine/LODRefine
57. Data transformation tools
[Same tool lists as on slide 56]
Overview of OpenRefine
58. OpenRefine basic operations
• Installing
• Creating a new project
• Data analysis
  – Exploring data
  – Sorting data
  – Faceting data
  – Filtering data
• Basic data transformation (cleaning/preparing)
  – Columns:
    • Move
    • Rename
    • Remove columns
    • Collapse and expand
    • Common transformations
  – Rows:
    • Remove rows
• Export whole project
59. Adding derived columns
Edit column → Add column based on this column...
60. Splitting data across columns
Edit column → Split into several columns...
62. Rows and records
Show as: rows | records
[Screenshot labels: Record, Row]
63. Clustering similar cells
Edit cells → Cluster and edit...
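The idea behind clustering similar cells can be sketched as a key-collision method: values that normalise to the same key are proposed as one cluster. The code below is a simplified illustration of that idea (trim, lowercase, drop punctuation, sort and de-duplicate tokens); it is an assumption-laden sketch and not OpenRefine's actual implementation, which applies further normalisation.

```python
import re
from collections import defaultdict


def fingerprint(value):
    """Normalise a cell value into a key shared by its likely variants."""
    cleaned = value.strip().lower()
    cleaned = re.sub(r"[^\w\s]", "", cleaned)   # drop punctuation
    tokens = sorted(set(cleaned.split()))       # order-insensitive, no duplicates
    return " ".join(tokens)


def cluster(values):
    """Group distinct values sharing a fingerprint; singletons are not reported."""
    groups = defaultdict(set)
    for value in values:
        groups[fingerprint(value)].add(value)
    return [sorted(group) for group in groups.values() if len(group) > 1]


CLUSTERS = cluster(["leeds", "Leeds", "Wakefield", "Council, Leeds", "Leeds Council"])
print(CLUSTERS)
```

Applied to a column such as "Address 4", this would propose merging "leeds" and "Leeds", the inconsistency noted in the data analysis.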
64. Transposing rows and columns
Transpose → Transpose cells across columns into rows...
Transpose → Columnize by key/value columns...
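What "transpose cells across columns into rows" achieves can be shown on a single record. In this sketch the function name `unpivot` and the output field names `measure` and `value` are made up, and the consumption figures are invented; it turns one wide row with a column per period into one row per measurement, a shape that is easier to map to RDF.

```python
def unpivot(row, key_columns, value_columns):
    """Turn one wide row into one row per value column (wide to long)."""
    long_rows = []
    for column in value_columns:
        record = {key: row[key] for key in key_columns}
        record["measure"] = column      # the former column header becomes a value
        record["value"] = row[column]
        long_rows.append(record)
    return long_rows


# Invented figures; column names follow the example data set.
WIDE_ROW = {"Site Name": "Example Library",
            "Electricity 10/11": "10",
            "Electricity 11/12": "12"}

LONG_ROWS = unpivot(WIDE_ROW, ["Site Name"], ["Electricity 10/11", "Electricity 11/12"])
for long_row in LONG_ROWS:
    print(long_row)
```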
65. Other useful utilities
• Regular expressions
  – Java regular expressions
• Custom transformations
  – General Refine Expression Language (GREL)
  – Jython (Python implemented in Java)
  – Clojure (functional language that resembles Lisp)
66. Using the project history
• Project history:
  – Access operation history
  – Undo operations
  – Extract operations (in JSON)
  – Apply operations
• Caution:
  – Transformations are registered in the history; filters and facets are not
73. Evaluating the exported data
• Manual inspection
• Syntax evaluation (with syntax validator)
• Consistency with the ontologies (with reasoner)
• Usage evaluation (e.g., by running SPARQL queries)
  – Show all electricity consumptions and the related time periods for all council sites related to culture
  – Show all energy consumptions and the related time periods of council sites from the Wakefield district
74. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
76. Linked Data are just data
[Plots: electrical consumption per building (electric1011, electric1112, electric1213); electricity vs gas consumption 12/13; electricity vs oil consumption 12/13]
77. Benefits of linking data
[Plot: total electric consumption. Original data + geolocation]
[Plot: total electric consumption in locations with population > 20,000. Original data + geolocation + population]
78. Benefits of reasoning
[Plot: total electric consumption in cultural buildings]
[Class hierarchy: schema:CivicStructure, CulturalSite, Museum, Library]
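The benefit of reasoning shown here can be mimicked with a subclass lookup. The sketch assumes the hierarchy on the slide reads top-down (Museum and Library under CulturalSite, CulturalSite under schema:CivicStructure); the site names and consumption figures are invented. A query for cultural sites then also returns sites typed only as Museum or Library.

```python
# Assumed reading of the slide's hierarchy: child class -> parent class.
SUBCLASS_OF = {
    "Museum": "CulturalSite",
    "Library": "CulturalSite",
    "CulturalSite": "schema:CivicStructure",
}

# Invented sites: (asserted class, electricity consumption).
SITES = {
    "site1": ("Museum", 120.0),
    "site2": ("Library", 80.0),
    "site3": ("Depot", 300.0),
}


def superclasses(cls):
    """All ancestors of a class, following the subclass chain upwards."""
    ancestors = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        ancestors.append(cls)
    return ancestors


def instances_of(target):
    """Sites whose asserted class is the target or one of its subclasses."""
    return [site for site, (cls, _) in SITES.items()
            if cls == target or target in superclasses(cls)]


def total_consumption(target):
    return sum(SITES[site][1] for site in instances_of(target))


print(instances_of("CulturalSite"), total_consumption("CulturalSite"))
```

Without the subclass inference, no site is asserted to be a CulturalSite directly, so the same query would return nothing.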
79. Index
• Linked Open Data in Smart Cities
• Guidelines for the Generation of Linked Data
• Discussion
• Hands-on Description
80. What are we going to do?
[Linked Data life cycle stages: Specification, Modelling, Generation, Publication, Exploitation, Linking]
81. What are we going to do?
[Linked Data generation process diagram as on slide 24]
82. Hands-on task 1
• Goal: to get familiar with the first steps in the Linked Data generation process
• The students will have to take their selected dataset(s) and perform the following tasks:
  – Analyse Data Set
    • Both the data (quantities, value ranges, etc.) and the schema
  – Analyse Licensing of the Data Source
    • Who is the publisher and the rightsholder?
    • What is the license?
    • Which license will be used for the generated dataset?
  – Define Resource Naming Strategy
    • For the ontology and the data (URI form, content negotiation, URIs domain, path, patterns, etc.)
  – Finish Ontology Development
    • Lightweight ontology (i.e., classes, properties, domains and ranges)
83. Hands-on task 1 - Deliverables
• A document that includes:
  – The analyses performed over the data source
  – The licensing of the data source and the potential license
  – The resource naming strategy defined
• An OWL file with the ontology developed, according to the resource naming strategy defined
84. Hands-on task 2
• Goal: to get familiar with the transformation of CSV data into RDF using LODRefine
• The students will have to take their selected dataset(s) and perform the following tasks:
  – Import data into LODRefine
  – Analyse and fix data
    • Analysis performed in the previous class, but can be updated with new findings
    • Fix the data to remove errors
    • Transform the data to facilitate RDF generation
  – Export data to RDF
    • Define an RDF skeleton for the data
    • Export the data to RDF (Turtle syntax)
85. Hands-on task 2 - Deliverables
For each dataset:
• An RDF file in the Turtle syntax with the data transformed into RDF
86. 1st Summer School on Smart Cities and Linked Open Data (LD4SC-15)
Thank you for your attention!