This document introduces linked open data and the semantic web. It discusses the evolution from a document-based web to a web of interlinked data using standards like RDF and URIs. It explains the differences between open data and linked data and outlines best practices for publishing linked open data using the 5 star framework. Key topics covered include linked data principles, publishing structured data using formats like RDFa and schema.org, and the economic and social benefits of linked open and government data.
This document discusses the potential benefits of using linked data in libraries. It explains that linked data connects related data on the web using URIs and RDF triples. This allows data to be integrated, extended and reused. The document provides examples of how linked data could unlock library data, connect different library systems, and allow complex relationships to be modeled. Overall, it argues that linked data can help libraries share and integrate their data in new ways.
Overview of Open Data, Linked Data and Web ScienceHaklae Kim
This document provides an overview of open data, linked data, and web science through conceptual discussions, case studies, and proposed next steps. It begins with definitions of key concepts like open data and the semantic web. Case studies demonstrate current applications of open data through government initiatives and technologies like Google's Knowledge Graph and Apple's Siri. The document concludes by acknowledging challenges with open data strategies and advocating for interdisciplinary collaboration to realize the potential of linked open government data.
This document provides an overview of linked data and the SPARQL query language. It defines linked data as a method of publishing structured data on the web so that it can be interlinked and queried. The key aspects covered include linked data principles of using URIs to identify things and including links to other related data. SPARQL is introduced as the query language for retrieving and manipulating linked data.
Linked Data Integration and semantic webDiego Pessoa
This document discusses linked data and the semantic web. It explains that as data volumes on the web grow, linking related data from different sources becomes important. Linked data uses URIs and RDF to connect related data and establish links between resources on the web. The principles of linked data include using URIs to identify things, providing HTTP URIs so people can look up those names, and including links to other related resources. Guidelines are provided for publishing linked data, such as using dereferenceable URIs and creating RDF links. Both browsers and domain-specific applications can be used to consume linked data. Research challenges for linked data include user interfaces, application architectures, and maintaining links between data.
Presented Jan. 15, 2010 for the Technical Services 'Big Heads', as an introduction to the RDA Vocabularies and the opportunities provided by this different approach to data.
The slideset used to conduct an introduction/tutorial
on DBpedia use cases, concepts and implementation
aspects held during the DBpedia community meeting
in Dublin on the 9th of February 2015.
(slide creators: M. Ackermann, M. Freudenberg
additional presenter: Ali Ismayilov)
This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
This document discusses the potential benefits of using linked data in libraries. It explains that linked data connects related data on the web using URIs and RDF triples. This allows data to be integrated, extended and reused. The document provides examples of how linked data could unlock library data, connect different library systems, and allow complex relationships to be modeled. Overall, it argues that linked data can help libraries share and integrate their data in new ways.
Overview of Open Data, Linked Data and Web ScienceHaklae Kim
This document provides an overview of open data, linked data, and web science through conceptual discussions, case studies, and proposed next steps. It begins with definitions of key concepts like open data and the semantic web. Case studies demonstrate current applications of open data through government initiatives and technologies like Google's Knowledge Graph and Apple's Siri. The document concludes by acknowledging challenges with open data strategies and advocating for interdisciplinary collaboration to realize the potential of linked open government data.
This document provides an overview of linked data and the SPARQL query language. It defines linked data as a method of publishing structured data on the web so that it can be interlinked and queried. The key aspects covered include linked data principles of using URIs to identify things and including links to other related data. SPARQL is introduced as the query language for retrieving and manipulating linked data.
Linked Data Integration and semantic webDiego Pessoa
This document discusses linked data and the semantic web. It explains that as data volumes on the web grow, linking related data from different sources becomes important. Linked data uses URIs and RDF to connect related data and establish links between resources on the web. The principles of linked data include using URIs to identify things, providing HTTP URIs so people can look up those names, and including links to other related resources. Guidelines are provided for publishing linked data, such as using dereferenceable URIs and creating RDF links. Both browsers and domain-specific applications can be used to consume linked data. Research challenges for linked data include user interfaces, application architectures, and maintaining links between data.
Presented Jan. 15, 2010 for the Technical Services 'Big Heads', as an introduction to the RDA Vocabularies and the opportunities provided by this different approach to data.
The slideset used to conduct an introduction/tutorial
on DBpedia use cases, concepts and implementation
aspects held during the DBpedia community meeting
in Dublin on the 9th of February 2015.
(slide creators: M. Ackermann, M. Freudenberg
additional presenter: Ali Ismayilov)
This presentation gives details on technologies and approaches towards exploiting Linked Data by building LD applications. In particular, it gives an overview of popular existing applications and introduces the main technologies that support implementation and development. Furthermore, it illustrates how data exposed through common Web APIs can be integrated with Linked Data in order to create mashups.
DBpedia is a crowd-sourced effort to extract structured data from Wikipedia and Wikidata. It provides a public SPARQL endpoint to query this multi-domain, multilingual dataset. The DBpedia Association was founded in 2014 as a non-profit to oversee DBpedia and aims to improve uptime, data quality, and integration with other sources. It relies on funding and contributions from members to achieve goals like 99.99% uptime across languages and domains. The document promotes joining the DBpedia Association and participating in future events like a DBpedia meeting at the SEMANTiCS 2016 conference.
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageOntotext
Many issues are faced by scholars, book researchers, museum directors who try to find the underlying connection between resources. Scholars in particular continuously emphasizes the role of digital humanities and the value of linked data in cultural heritage information systems.
Introduction to DBpedia, the most popular and interconnected source of Linked Open Data. Part of EXPLORING WIKIDATA AND THE SEMANTIC WEB FOR LIBRARIES at METRO http://metro.org/events/598/
Publishing the British National Bibliography as Linked Open Data / Corine Del...CIGScotland
Presented at Linked Open Data: current practice in libraries and archives (Cataloguing & Indexing Group in Scotlland 3rd Linked Open Data Conference), Edinburgh, 18 Nov 2013
This document discusses techniques for discovering structured information from web sites. It presents three main contributions:
1. A method to extract structured data in the form of web lists that are split across multiple web pages, called logical lists.
2. An approach for automatically extracting sitemaps from web sites.
3. A technique for clustering web pages based on intra-page and extra-page features.
It19 20140721 linked data personal perspectiveJanifer Gatenby
A presentation made for Standards Australia's seminar. Outlines the basic aspects of linked data from a personal perspective and where it fits with direct and subject searching.
The document discusses the basics of linked data modeling, including defining classes and properties to describe resources, creating instances of classes, and interlinking related data across datasets using properties like rdfs:seeAlso and owl:sameAs. It explains key concepts like URIs, RDF, and SPARQL and compares linked data modeling to traditional database modeling in terms of tables, columns, rows, and relations. The goal of linked data modeling is to publish structured data that can be interlinked to become more useful and discoverable on the semantic web.
Information School, University of Washington, 2014-05-21: INFX 598 - Introducing Linked Data: concepts, methods and tools. Guest lecture (Module 9) "Doing Business with Semantic Technologies": Introduction to Ontotext and some of its products, clients and projects.
Also see video:https://voicethread.com/myvoice/#thread/5784646/29625471/31274564
Shrinking the silo boundary: data and schema in the Semantic WebGordon Dunsire
This document discusses representing local structured metadata as linked data in the Semantic Web. It notes that mapping from a local schema like MARC 21 to global linked data schemas can result in lost information. To avoid this, the document recommends publishing the local schema as its own RDF element set. This allows the local data and schema to be published as linked data without information loss, while still enabling mappings to other schemas. It presents advantages like other communities being able to understand and reuse the local schema. The document concludes by visualizing how this approach "shrinks the silo" by publishing the local data, schema, and mappings as RDF on the Semantic Web.
Publishing and Using Linked Open Data - Day 1 Richard Urban
This document provides an agenda and schedule for Monday's Linked Open Data class. The day includes introductions, sessions on introducing linked data and exploring use cases, breaks for discussion, and a concluding session on kicking off participant projects. Evening events include an outside lecture and networking social for graduate students.
The document discusses using linked open data and linked data principles for libraries. It covers key concepts like URIs, RDF triples, ontologies and vocabularies. It then outlines options for libraries to both consume and publish linked data, such as enriching existing catalog data by linking to external sources, creating new information aggregates, and publishing library holdings and metadata as linked open data. Challenges include a lack of common identifiers, FRBRization of existing data, and the need for content curation and new technical systems to fully realize the benefits of linked open data for libraries.
The document discusses the agenda for a presentation on the Semantic Web. The agenda includes an overview of the World Wide Web, an introduction to the Semantic Web, tools and applications for the Semantic Web, Linking Open Data, the Social Semantic Web, and Open Government. Each section provides details on the topic covered.
The document introduces the Semantic Web and the key technologies that enable it, including RDF, RDF Schema, OWL, and SPARQL. RDF allows for describing resources and relationships between them using triples. RDF Schema extends RDF with a vocabulary for describing properties and classes of resources. OWL builds on RDF and RDF Schema to provide additional expressive power for defining complex ontologies. SPARQL is a query language for retrieving and manipulating data stored in RDF format. These technologies work together to transform the existing web of documents into a web of linked data that can be processed automatically by machines.
PoolParty 5.0 - Five more reasons to lean on a world-class semantic platformSemantic Web Company
This document provides an overview and summary of a presentation about the PoolParty semantic platform. The presentation covers the following key points in 3 sentences:
The presentation provides an overview of the PoolParty semantic platform and its components for creating and managing taxonomies and knowledge graphs, linking data sources, and integrating knowledge graphs into applications via API. It demonstrates how PoolParty allows unstructured and structured data to be combined by extracting entities from text based on knowledge graphs. The presentation highlights new features of PoolParty 5 including improved entity extraction, disambiguation capabilities, deep integration with content management systems, a web crawler to extract terms from websites, and expanded ontology management.
Recognos is a semantic technology company established in 1999 with offices in California and Romania. They have 70 employees conducting research and development into semantic technologies. Their applications include finance, CRM, life sciences, and more. Semantic technology aims to teach machines human reasoning by representing knowledge as statements describing concepts, logic, and relationships. This allows for integrated querying across structured and unstructured data sources. Recognos can help companies like Netflix develop semantic applications such as integrated search across data sources and detecting similarities in film descriptions.
This document provides an adaptation of a webinar on querying linked data for Android devices using a modified triple store implementation. It outlines instructions for installing OntoQuad, deploying a preloaded MusicBrainz dataset, and includes sample SPARQL queries adapted from the original webinar to query the data within the limitations of mobile devices. The sample queries are demonstrated to retrieve album and track information for the band Queen from the loaded dataset.
Creating web applications with LODSPeaKrAlvaro Graves
LODSPeaKr is a framework for creating web applications using Linked Open Data. It allows for rapid development of applications with features like content negotiation, workflow execution across multiple SPARQL endpoints, and templates to generate HTML, RDF, and JSON representations from SPARQL queries. Developers can build applications like open data portals, APIs, and mobile apps using the templating system and services provided by LODSPeaKr without having to handle different representations manually. Future work aims to improve crawling, add better data management controls, and documentation.
Ifla swsig meeting - Puerto Rico - 20110817Figoblog
This summary provides an overview of the agenda and reports from the 1st Semantic Web SIG open session at IFLA 77th WLIC in August 2011. The agenda included reports from the W3C Library Linked Data incubator group, Namespaces task group, and RDA task group. It also discussed next steps and expectations from Library Linked Data implementations.
The Impact of Linked Data in Digital Curation and Application to the Catalogu...Hong (Jenny) Jing
(Full version of the presentation: https://www.youtube.com/watch?v=WS9Svbmp-YY)
Information organization and systems in libraries are in a state of significant flux. In systems there is a shift to XML and RDF-based schemas and ontologies while resource description content standards have changed from AACR2 to RDA. A move from MARC to BIBFRAME and other linked data applications is on the horizon. Linked data and the semantic web have become buzzwords, but what is linked data and why it is important for librarians? How can we use it in digital curation? What can libraries do now to “prepare” for this change in their current practice?
In light of these questions, the panel presentation will discuss two projects. First, there will be coverage of a sample project using the Fedora-based open source framework, Islandora to demonstrate the concepts of connecting related data across the Web with URIs, HTTP and RDF. The second half of the presentation will describe how a consortia has taken a holistic approach to writing an RDA workflow to help front-line cataloguers develop a wider perspective when it comes to resource description (creating more structured, future compatible metadata). Up for discussion: the current state and future possibilities of library metadata with a focus on the implications of linked data.
PoolParty is a world-leading semantic technology platform focusing on standards-based management of taxonomies and ontologies.
Its outstanding text mining capabilities based on controlled vocabularies open up new options for masterdata and information management.
Try out a powerful thesaurus management system and entity extractor. See how easily knowledge models can be generated with PoolParty, and learn how linked open data can enrich your own thesaurus. Get an impression how simple it is to publish a knowledge model as linked open data and test our text mining component.
PoolParty Thesaurus Server (PPT) is an advanced software platform to manage enterprise metadata and linked data based on semantic knowledge models (taxonomies, thesauri, ontologies and knowledge graphs). PPT´s metadata management is based on W3C´s Semantic Web standards RDF, SKOS and OWL and is combined with text mining and linked data mapping technologies. PoolParty´s API (based on W3C´s SPARQL standard) allows the integration of semantic technologies with other systems like search engines, CMS, DMS, web shops or Wikis. In addition, PoolParty Enterprise Server offers outstanding facilities based on PoolParty Extractor that allow text mining over large document collections.
PoolParty knowledge modeling approach combines best-of-breed approaches of the semantic web (text corpus analysis, entity extraction, linked data enrichment, SKOS thesaurus management). This enables thesaurus managers to build, maintain and publish even the largest and most complex knowledge models built on top of RDF Schema and SKOS.
Application scenarios
web-based thesaurus-, vocabulary- and taxonomy-management based on open standards;
(semi-)automatic annotation and categorisation of documents with high precision;
thesaurus- and linked data publishing on the web and on the intranet (Sharepoint, Confluence etc.) as a basis to build semantic mash-ups;
data integration from different sources (structured and unstructured) based on flexible metadata models;
What is linked data
What is open data
What is the difference between linked and open data
How to publish linked data (5-star schema)
The economic and social aspects of linked data.
morning session talk at the second Keystone Training School "Keyword search in Big Linked Data" held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
DBpedia is a crowd-sourced effort to extract structured data from Wikipedia and Wikidata. It provides a public SPARQL endpoint to query this multi-domain, multilingual dataset. The DBpedia Association was founded in 2014 as a non-profit to oversee DBpedia and aims to improve uptime, data quality, and integration with other sources. It relies on funding and contributions from members to achieve goals like 99.99% uptime across languages and domains. The document promotes joining the DBpedia Association and participating in future events like a DBpedia meeting at the SEMANTiCS 2016 conference.
Build Narratives, Connect Artifacts: Linked Open Data for Cultural HeritageOntotext
Many issues are faced by scholars, book researchers, museum directors who try to find the underlying connection between resources. Scholars in particular continuously emphasizes the role of digital humanities and the value of linked data in cultural heritage information systems.
Introduction to DBpedia, the most popular and interconnected source of Linked Open Data. Part of EXPLORING WIKIDATA AND THE SEMANTIC WEB FOR LIBRARIES at METRO http://metro.org/events/598/
Publishing the British National Bibliography as Linked Open Data / Corine Del...CIGScotland
Presented at Linked Open Data: current practice in libraries and archives (Cataloguing & Indexing Group in Scotlland 3rd Linked Open Data Conference), Edinburgh, 18 Nov 2013
This document discusses techniques for discovering structured information from web sites. It presents three main contributions:
1. A method to extract structured data in the form of web lists that are split across multiple web pages, called logical lists.
2. An approach for automatically extracting sitemaps from web sites.
3. A technique for clustering web pages based on intra-page and extra-page features.
It19 20140721 linked data personal perspectiveJanifer Gatenby
A presentation made for Standards Australia's seminar. Outlines the basic aspects of linked data from a personal perspective and where it fits with direct and subject searching.
The document discusses the basics of linked data modeling, including defining classes and properties to describe resources, creating instances of classes, and interlinking related data across datasets using properties like rdfs:seeAlso and owl:sameAs. It explains key concepts like URIs, RDF, and SPARQL and compares linked data modeling to traditional database modeling in terms of tables, columns, rows, and relations. The goal of linked data modeling is to publish structured data that can be interlinked to become more useful and discoverable on the semantic web.
Information School, University of Washington, 2014-05-21: INFX 598 - Introducing Linked Data: concepts, methods and tools. Guest lecture (Module 9) "Doing Business with Semantic Technologies": Introduction to Ontotext and some of its products, clients and projects.
Also see video:https://voicethread.com/myvoice/#thread/5784646/29625471/31274564
Shrinking the silo boundary: data and schema in the Semantic WebGordon Dunsire
This document discusses representing local structured metadata as linked data in the Semantic Web. It notes that mapping from a local schema like MARC 21 to global linked data schemas can result in lost information. To avoid this, the document recommends publishing the local schema as its own RDF element set. This allows the local data and schema to be published as linked data without information loss, while still enabling mappings to other schemas. It presents advantages like other communities being able to understand and reuse the local schema. The document concludes by visualizing how this approach "shrinks the silo" by publishing the local data, schema, and mappings as RDF on the Semantic Web.
Publishing and Using Linked Open Data - Day 1 Richard Urban
This document provides an agenda and schedule for Monday's Linked Open Data class. The day includes introductions, sessions on introducing linked data and exploring use cases, breaks for discussion, and a concluding session on kicking off participant projects. Evening events include an outside lecture and networking social for graduate students.
The document discusses using linked open data and linked data principles for libraries. It covers key concepts like URIs, RDF triples, ontologies and vocabularies. It then outlines options for libraries to both consume and publish linked data, such as enriching existing catalog data by linking to external sources, creating new information aggregates, and publishing library holdings and metadata as linked open data. Challenges include a lack of common identifiers, FRBRization of existing data, and the need for content curation and new technical systems to fully realize the benefits of linked open data for libraries.
The document discusses the agenda for a presentation on the Semantic Web. The agenda includes an overview of the World Wide Web, an introduction to the Semantic Web, tools and applications for the Semantic Web, Linking Open Data, the Social Semantic Web, and Open Government. Each section provides details on the topic covered.
The document introduces the Semantic Web and the key technologies that enable it, including RDF, RDF Schema, OWL, and SPARQL. RDF allows for describing resources and relationships between them using triples. RDF Schema extends RDF with a vocabulary for describing properties and classes of resources. OWL builds on RDF and RDF Schema to provide additional expressive power for defining complex ontologies. SPARQL is a query language for retrieving and manipulating data stored in RDF format. These technologies work together to transform the existing web of documents into a web of linked data that can be processed automatically by machines.
PoolParty 5.0 - Five more reasons to lean on a world-class semantic platformSemantic Web Company
This document provides an overview and summary of a presentation about the PoolParty semantic platform. The presentation covers the following key points in 3 sentences:
The presentation provides an overview of the PoolParty semantic platform and its components for creating and managing taxonomies and knowledge graphs, linking data sources, and integrating knowledge graphs into applications via API. It demonstrates how PoolParty allows unstructured and structured data to be combined by extracting entities from text based on knowledge graphs. The presentation highlights new features of PoolParty 5 including improved entity extraction, disambiguation capabilities, deep integration with content management systems, a web crawler to extract terms from websites, and expanded ontology management.
Recognos is a semantic technology company established in 1999 with offices in California and Romania. They have 70 employees conducting research and development into semantic technologies. Their applications include finance, CRM, life sciences, and more. Semantic technology aims to teach machines human reasoning by representing knowledge as statements describing concepts, logic, and relationships. This allows for integrated querying across structured and unstructured data sources. Recognos can help companies like Netflix develop semantic applications such as integrated search across data sources and detecting similarities in film descriptions.
This document provides an adaptation of a webinar on querying linked data for Android devices using a modified triple store implementation. It outlines instructions for installing OntoQuad, deploying a preloaded MusicBrainz dataset, and includes sample SPARQL queries adapted from the original webinar to query the data within the limitations of mobile devices. The sample queries are demonstrated to retrieve album and track information for the band Queen from the loaded dataset.
Creating web applications with LODSPeaKrAlvaro Graves
LODSPeaKr is a framework for creating web applications using Linked Open Data. It allows for rapid development of applications with features like content negotiation, workflow execution across multiple SPARQL endpoints, and templates to generate HTML, RDF, and JSON representations from SPARQL queries. Developers can build applications like open data portals, APIs, and mobile apps using the templating system and services provided by LODSPeaKr without having to handle different representations manually. Future work aims to improve crawling, add better data management controls, and documentation.
Ifla swsig meeting - Puerto Rico - 20110817Figoblog
This summary provides an overview of the agenda and reports from the 1st Semantic Web SIG open session at IFLA 77th WLIC in August 2011. The agenda included reports from the W3C Library Linked Data incubator group, Namespaces task group, and RDA task group. It also discussed next steps and expectations from Library Linked Data implementations.
The Impact of Linked Data in Digital Curation and Application to the Catalogu...Hong (Jenny) Jing
(Full version of the presentation: https://www.youtube.com/watch?v=WS9Svbmp-YY)
Information organization and systems in libraries are in a state of significant flux. In systems there is a shift to XML and RDF-based schemas and ontologies while resource description content standards have changed from AACR2 to RDA. A move from MARC to BIBFRAME and other linked data applications is on the horizon. Linked data and the semantic web have become buzzwords, but what is linked data and why it is important for librarians? How can we use it in digital curation? What can libraries do now to “prepare” for this change in their current practice?
In light of these questions, the panel presentation will discuss two projects. First, there will be coverage of a sample project using the Fedora-based open source framework, Islandora to demonstrate the concepts of connecting related data across the Web with URIs, HTTP and RDF. The second half of the presentation will describe how a consortia has taken a holistic approach to writing an RDA workflow to help front-line cataloguers develop a wider perspective when it comes to resource description (creating more structured, future compatible metadata). Up for discussion: the current state and future possibilities of library metadata with a focus on the implications of linked data.
PoolParty is a world-leading semantic technology platform focusing on standards-based management of taxonomies and ontologies.
Its outstanding text mining capabilities based on controlled vocabularies open up new options for masterdata and information management.
Try out a powerful thesaurus management system and entity extractor. See how easily knowledge models can be generated with PoolParty, and learn how linked open data can enrich your own thesaurus. Get an impression how simple it is to publish a knowledge model as linked open data and test our text mining component.
PoolParty Thesaurus Server (PPT) is an advanced software platform to manage enterprise metadata and linked data based on semantic knowledge models (taxonomies, thesauri, ontologies and knowledge graphs). PPT´s metadata management is based on W3C´s Semantic Web standards RDF, SKOS and OWL and is combined with text mining and linked data mapping technologies. PoolParty´s API (based on W3C´s SPARQL standard) allows the integration of semantic technologies with other systems like search engines, CMS, DMS, web shops or Wikis. In addition, PoolParty Enterprise Server offers outstanding facilities based on PoolParty Extractor that allow text mining over large document collections.
PoolParty knowledge modeling approach combines best-of-breed approaches of the semantic web (text corpus analysis, entity extraction, linked data enrichment, SKOS thesaurus management). This enables thesaurus managers to build, maintain and publish even the largest and most complex knowledge models built on top of RDF Schema and SKOS.
Application scenarios
web-based thesaurus-, vocabulary- and taxonomy-management based on open standards;
(semi-)automatic annotation and categorisation of documents with high precision;
thesaurus- and linked data publishing on the web and on the intranet (Sharepoint, Confluence etc.) as a basis to build semantic mash-ups;
data integration from different sources (structured and unstructured) based on flexible metadata models;
What is linked data
What is open data
What is the difference between linked and open data
How to publish linked data (5-star schema)
The economic and social aspects of linked data.
morning session talk at the second Keystone Training School "Keyword search in Big Linked Data" held in Santiago de Compostela.
https://eventos.citius.usc.es/keystone.school/
Linked Open Data Principles, Technologies and ExamplesOpen Data Support
Theoretical and practical introducton to linked data, focusing both on the value proposition, the theory/foundations, and on practical examples. The material is tailored to the context of the EU institutions.
This module supported the training on Linked Open Data delivered to the EU Institutions on 30 November 2015 in Brussels. https://joinup.ec.europa.eu/community/ods/news/ods-onsite-training-european-commission
An introduction to linked data (semantic web) for a Knowledge and Information Network (KIN) webinar. The presentation shows some examples of linked data in action, data visualization, difference between open and linked data and how linkd data is being used in UK gov and local gov.
Linked Data, the Semantic Web, and You discusses key concepts related to Linked Data and the Semantic Web. It defines Linked Data as a set of best practices for publishing and connecting structured data on the web using URIs, HTTP, RDF, and other standards. It also explains semantic web technologies like RDF, ontologies, SKOS, and SPARQL that enable representing and querying structured data on the web. Finally, it discusses how libraries are applying these concepts through projects like BIBFRAME, FAST, library linked data platforms, and the LD4L project to represent bibliographic data as linked open data.
Linked Data provides a standardized framework for publishing structured data on the web by linking data instead of documents. It uses URIs, HTTP, and RDF to link related data across different sources to create a global data space without silos. EnAKTing is a research project focused on building ontologies from large-scale user participation, querying linked data at web-scale, and visualizing the massive amounts of interconnected data. Some of its applications include services for discovering backlinks, geographical resources, and dataset equivalences in the Web of Data.
The document discusses the Semantic Web and how it provides a common framework to share and reuse data across applications and organizations. It describes Resource Description Framework (RDF) and how it represents relationships in a simple data structure using graphs. It also discusses Linked Data design principles and standards like RDFa and Microformats that embed semantics into web pages. Finally, it provides examples of how search engines like Google and Yahoo utilize structured data from RDFa and Microformats to enhance search results.
This document provides an overview of relevant approaches for accessing open data programmatically and data-as-a-service (DaaS) solutions. It discusses common data access methods like web APIs, OData, and SPARQL and describes several DaaS platforms that simplify publishing and consuming open data. It also outlines requirements for a proposed open DaaS platform called DaPaaS that aims to address challenges in open data management and application development.
Talk given at Open Knowledge Foundation 'Opening Up Metadata: Challenges, Standards and Tools' Workshop, Queen Mary University of London, 13th June 2012.
Info on the event at http://openglam.org/2012/05/31/last-places-left-for-opening-up-metadata-challenges-standards-and-tools/
This document discusses standardizing data on the web. It notes that data exists in many formats, from informal to curated, and machine to human readable. W3C has focused on integrating data at web scale using standards like RDF, SPARQL, and Linked Data principles. However, converting all data to RDF has challenges. Much data exists as CSV, JSON, XML and does not need full integration. The reality is data on the web is messy with many formats. Developers see converting data as too complex. The document discusses providing tools to publish Linked Data easily, or focusing on raw data without RDF. It notes different approaches can coexist and discusses a workshop on open data formats.
Notes for talk on 12th June 2013 to Open Innovation meeting, GlasgowPeterWinstanley1
The document discusses open innovation and TRIZ methodology for inventive problem solving. It notes that TRIZ asks designers to define the ideal solution without negative aspects, and looks for solutions across domains by removing domain-specific language. The document then discusses semantic interoperability of data through shared data models and vocabularies, and provides examples of existing linked open data sources and vocabularies that could be reused.
The document discusses the evolution of the web from Web 1.0 to the proposed Web 3.0. Web 1.0 consisted of static, text-based pages, while Web 2.0 added user-generated content and applications. Web 3.0, also called the Semantic Web, aims to make data on the web more accessible and useful by adding metadata and structure using technologies like RDF, RDFS, OWL, and SPARQL. Microformats are presented as a simpler way to add semantics by reusing existing web standards. Open data and APIs are also discussed as ways to freely share and combine data. Examples of sites using these approaches are provided.
The document discusses the Semantic Web and Linked Data. It provides an overview of key concepts like URIs, RDF, and standardized formats for representing semantic data like Turtle and JSON-LD. It also provides examples of representing personal profile information about individuals using these technologies and linking the data together.
Nelson Piedra , Janneth Chicaiza
and Jorge López, Universidad Técnica Particular de Loja, Edmundo
Tovar, Universidad Politécnica de Madrid,
and Oscar Martínez, Universitas
Miguel Hernández
Explore the advantages of using linked data with OERs.
This document discusses RDFa and linked open data. It provides examples of organizations that have implemented RDFa, including the Library of Congress, the New York Times, and UK government websites. It also discusses tools for working with RDFa and the benefits it can provide for publishing structured data on the web.
Discovering Resume Information using linked data dannyijwest
In spite of having different web applications to create and collect resumes, these web applications suffer
mainly from a common standard data model, data sharing, and data reusing. Though, different web
applications provide same quality of resume information, but internally there are many differences in terms
of data structure and storage which makes computer difficult to process and analyse the information from
different sources. The concept of Linked Data has enabled the web to share data among different data
sources and to discover any kind of information while resolving the issues like heterogeneity,
interoperability, and data reusing between different data sources and allowing machine process-able data
on the web.
OpenCalais is a web service that analyzes text and generates semantic metadata about entities, events, and facts mentioned. It links these entities to datasets like DBpedia, Wikipedia, and Geonames using URIs. This allows the data to be interconnected and explored through the web of linked data. OpenCalais follows the principles of linked data by assigning HTTP URIs to entities and linking them to other open datasets. It returns the extracted metadata in RDF format, integrating the analyzed content into the larger linked data cloud.
The document discusses the concepts of the semantic web and linked data. It explains that the semantic web aims to convert the web into a single database that can be understood by machines through linking data using URIs, RDF, and other standards. It provides examples of projects like DBpedia and the Linking Open Data cloud that publish open government and other data as linked data. The document outlines some of the technologies and best practices for publishing and connecting data as linked data.
Similar to Semantic Data Architecture and Ontological Infrastructure (ASIO) (20)
Securing BGP: Operational Strategies and Best Practices for Network Defenders...APNIC
Md. Zobair Khan,
Network Analyst and Technical Trainer at APNIC, presented 'Securing BGP: Operational Strategies and Best Practices for Network Defenders' at the Phoenix Summit held in Dhaka, Bangladesh from 23 to 24 May 2024.
Honeypots Unveiled: Proactive Defense Tactics for Cyber Security, Phoenix Sum...APNIC
Adli Wahid, Senior Internet Security Specialist at APNIC, delivered a presentation titled 'Honeypots Unveiled: Proactive Defense Tactics for Cyber Security' at the Phoenix Summit held in Dhaka, Bangladesh from 23 to 24 May 2024.
HijackLoader Evolution: Interactive Process HollowingDonato Onofri
CrowdStrike researchers have identified a HijackLoader (aka IDAT Loader) sample that employs sophisticated evasion techniques to enhance the complexity of the threat. HijackLoader, an increasingly popular tool among adversaries for deploying additional payloads and tooling, continues to evolve as its developers experiment and enhance its capabilities.
In their analysis of a recent HijackLoader sample, CrowdStrike researchers discovered new techniques designed to increase the defense evasion capabilities of the loader. The malware developer used a standard process hollowing technique coupled with an additional trigger that was activated by the parent process writing to a pipe. This new approach, called "Interactive Process Hollowing", has the potential to make defense evasion stealthier.
Semantic Data Architecture and Ontological Infrastructure (ASIO)
1. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Introducción a Linked Open Data (espacios enlazados y
enlazables)
Diego López-de-Ipiña
MORElab research group, Universidad de Deusto
dipina@deusto.es
2. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Temas a tratar
❑ What is linked data
❑ What is open data
❑ What is the difference between linked and open data
❑ How to publish linked data (5-star schema)
❑ The economic and social aspects of linked data.
3. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Web of Data: Limitaciones de la Web de Documentos
3
❑ Demasiada información con muy poca estructura y hecha además para consumo
humano
o Es una web sintáctica no semántica
o La búsqueda de contenidos es muy simplista
• Se requieren mejores métodos
❑ Los contenidos web son heterogéneos
o En términos de contenido
o En términos de estructura
o En términos de codificación de caracteres
❑ El futuro requiere integración de información inteligente
4. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
What is Linked Data?
4
❑ Evolution from a document-based Web to a Web of interlinked data
❑ The Web is evolving from a “Web of linked documents” into a “Web of linked data”...
Web of documents... Web of linked data...
5. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
The Web is evolving from a “Web of linked documents” into a “Web of
linked data”...
5
• The Web started as a collection of documents
published online – accessible at Web location
identified by a URL.
• These documents often contain data about
real-world resources which is mainly human-
readable and cannot be understood by
machines.
• The Web of Data is about in machine-readable
enabling the access to this data, by making it
available formats and connecting it using Uniform
Resource Identifiers (URIs), thus enabling people
and machines to collect the data, and put it
together to do all kinds of things with it
(permitted by the licence).
• Machine-readable data (or metadata)
is data in a format that can be
interpreted by a computer.
• 2 types of machine-readable data:
• human-readable data that is marked
up so that it can also be understood
by computers, e.g.
microformats, RDFa;
• data formats intended principally
for computers, e.g. RDF, XML and
JSON.
6. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Democratiando la web semántica: Metadatos empotrados
6
❑ Necesitamos que nuestros datos estén preparados para responder adecuadamente a las
preguntas de los navegadores y agentes software
o “Embedded metadata” son datos sobre datos empotrados en una página web que pueden ser extraídos
por buscadores y agentes de búsqueda
❑ Tres opciones principales:
o RDFa – sistema complejo conectado a XHTML
o Microformats – ampliamente usado y apoyado, usan etiquetas XHTML antiguas
<a href="http://jane-blog.example.org/" rel="sweetheart date met">Jane</a>
o Microdata – más nuevo, soportado por los buscadores, nivel de complejidad intermedio
<div itemscope itemtype="http://schema.org/SoftwareApplication">
<span itemprop="name">Angry Birds</span> -
REQUIRES <span itemprop="operatingSystem">ANDROID</span><br>
<link itemprop="applicationCategory" href="http://schema.org/GameApplication"/>
</div>
¡¡Todas juntas nos ayudarán a alcanzar la visión de una web con más
significado, pero todavía comprensible tanto a humanos como máquinas!!
7. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Democratizando las ontologías: Schema.org
7
❑ Initiative launched in 2011 by Bing, Google, Yahoo and then Yandex
❑ Objective: “create and support a common set of schemas for structured data mark-up on web pages.”
o Propose to use their schemas to annotate contents in a web page with metadata
❑ Metadata are recognized by search engines and other parsers, thus accessing to the “meaning” of portals
❑ Their vocabularies were inspired by earlier formats like Microformats, FOAF, GoodRelations and OpenCyc
❑ Offer schemas in the following domains (http://schema.org/docs/schemas.html):
o Events, health, organization, person, place, product, offer, revisión and so on.
❑ To map declarations in microdata to RDF the following tools can be used:
o http://tools.seochat.com/category/schema-generators
❑ More info at: http://schema.org/
❑ Examples:
❑ http://schema.org/CreativeWork
❑ http://paginaspersonales.deusto.es/dipina/ (microdata.reveal Chrome plugin)
8. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Defining Linked Data
8
• “Linked data is a set of design principles for sharing machine-readable data on the
Web for use by public administrations, business and citizens.”
• EC ISA Case Study: How Linked Data is transforming eGovernment
❑ The four design principles of Linked Data (by Tim Berners Lee):
1. Use Uniform Resource Identifiers (URIs) as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information, using the standards (RDF*, SPARQL).
4. Include links to other URIs so that they can discover more things.
9. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
LinkedData
9
❑ “A term used to describe a recommended best practice for exposing, sharing, and
connecting pieces of data, information, and knowledge on the Semantic Web
using URIs and RDF.“
❑ Permite descubrir, conectar, describir y reutilizar todo tipo de datos.
o Pasa de una Web de Documentos a una Web de Datos
• En Septiembre 2011 ya contenía 31 billones de tripletas RDF, ligadas por 504millones de enlaces
❑ Pensado para abrir y conectar diversos vocabularios e instancias semánticas, para
que puedan ser utilizados por la comunidad semántica
❑ URL: https://www.w3.org/standards/semanticweb/data
10. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Ejemplo de Linked Data
10
http://…/isb
n978
Programming the
Semantic Web
978-0-596-15381-6
Toby Segaran
http://…/publi
sher1
O’Reilly
title
name
author
publisher
isbn
http://…/isb
n978
sameAs
http://…/rev
iew1
Awesome
Book
http://…/rev
iewer
Juan
Sequeda
http://juanseque
da.com/id
hasReview
hasReviewer
description
name
sameAs
livesIn
Juan Sequeda
name
http://dbpedia.org/Austin
11. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Visualizing Linked Data
11
12. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Linked (open) government data (LOGD) – value proposition
12
❑ Flexible data integration: LOGD facilitates data integration and enables the
interconnection of previously disparate government datasets.
❑ Increase in data quality: The increased (re)use of LOGD triggers a growing demand to
improve data quality. Through crowd-sourcing and self-service mechanisms, errors
are progressively corrected.
❑ New services: The availability of LOGD gives rise to new services offered by the public
and/or private sector.
❑ Cost reduction: The reuse of LOGD in e-Government applications leads to
considerable cost reductions.
❑ Example:
o EU Open Data Portal: https://data.europa.eu/euodp/en/home
13. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
The four principles in practice... (1)
13
1. Use Uniform Resource Identifiers (URIs) as names for things.
2. Use HTTP URIs so that people can look up those names.
o E.g. for an organisation: UNICEF
http://publications.europa.eu/resource/authority/corporate-body/UNICEF
14. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
The four principles in practice... (2)
14
3. When someone looks up a URI, provide useful information, using the standards
(RDF*, SPARQL).
4. Include links to other URIs so that they can discover more things.
15. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Linked Data vs. Open Data
15
“Open data is data that can be freely used, reused and redistributed by anyone – subject
only, at most, to the requirement to attribute and sharealike.”
- OpenDefinition.org
Open data
Data can be published and be
publicly available under an open
licence without linking to other
data sources.
Linked data
Data can be linked to URIs from
other data sources, using open
standards such as RDF without
being publicly available under an
open licence.
16. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
4 reglas de Linked Data
16
1. Usa URIs para identificar cosas
2. Usa URIs HTTP para que estas cosas puedan ser referenciadas y dereferenciadas por
gente y agentes de usuario
3. Proporciona información útil (descripción estructurada y metadatos) sobre la
cosa/concepto al que referencia la URI
4. Incluye enlaces a otras URIs para mejorar el descubrimiento de información relacionada
en la Web
17. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Linked Data Foundations: URIs
17
❑ URIs for naming things, RDF for describing data and SPARQL for querying it
o “A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract
or physical resource.”
– ISA’s 10 Rules for Persistent URIs
❑ Examples:
o A country, e.g. Spain
o http://publications.europa.eu/resource/authority/country/ESP
o An organisation, e.g. the Publications Office
o http://publications.europa.eu/resource/authority/corporate-body/PUBL
o A dataset, e.g. Countries Named Authority List
o http://publications.europa.eu/resource/authority/country
18. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Linked Data Foundations: RDF & SPARQL
18
❑ The Resource Description Framework (RDF ) is a syntax for representing data and
resources in the Web
❑ SPARQL is a standardised language for querying RDF data.
❑ RDF breaks every piece of information down in triples:
o Subject – a resource, which may be identified with a URI.
o Predicate – a URI-identified reused specification of the relationship.
o Object – a resource or literal to which the subject is related.
http://example.org/place/Madrid is the capital of “Spain”.
OR
http://example.org/place/Madrid is the capital of http://example.org/place/Madrid.
Subject Predicate Object
19. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Manifestaciones de Linked Data
19
❑ Los datos publicados como LinkedData puede seguir la siguiente
clasificación, según Tim Bernes-Lee:
o 1 estrella: datos disponibles en la web (en cualquier formato), pero con una
licencia abierta
o 2 estrellas: datos disponibles son estructurados y legibles por máquinas. Por
ejemplo, Microsoft Excel en vez de una imagen escaneada de una tabla.
o 3 estrellas: los datos disponibles como en (2) pero no siguen un formato
propietario. Por ejemplo, CSV en vez de Excel.
o 4 estrellas: los datos son dispuestos de manera abierta usando un estándar
abierto de W3C (RDF y SPARQL) para identificar cosas, de modo que la gente
los pueda enlazar.
o 5 estrellas: los datos son dispuestos siguiendo lo anterior, incluyendo enlaces
externos a los datos de otra gente.
20. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
How to publish Linked Data?
20
❑ Paving the way towards 5-star linked data
★ Make your stuff available on the Web (whatever format) under an open license.
★★ Make it available as structured data (e.g., Excel instead of image scan of a table)
★★★ Use non-proprietary formats (e.g., CSV instead of Excel)
★★★★ Use URIs to denote things, so that people can point at your stuff
★★★★★ Link your data to other data to provide context
21. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
★ Make your stuff available on the web under an open license
21
22. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Pros & Cons of ★ Open Data
22
As a consumer... As a publisher...
✓You can access the data. ✓It is simple to publish.
✓You can store it locally. ✓You do not have explain repeatedly to
others that they can use your data.
✓You can enter the data into any other
system.
✓You can change the data.
✓You can share the data with anyone.
23. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
★ ★ Make it available as structured data
23
24. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Pros & Cons of ★ ★ Open Data
24
❑ All the benefits of ★ open data; plus
As a consumer... As a publisher...
✓You can directly process it with
proprietary software to aggregate it,
perform calculations, visualise it, etc.
✓It is still simple to publish.
✓You can export it into another
(structured and/or non proprietary)
format.
25. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
★ ★ ★ Use non-proprietary formats
25
❑ Proprietary: Excel, Word, PDF...
❑ Non-proprietary: XML, CSV, RDF, JSON, ODF...
❑ Road safety- Accidents 2006:
26. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Pros & Cons of ★ ★ ★ Open Data
26
❑ All the benefits of ★ ★ open data; plus
As a consumer... As a publisher...
✓You can manipulate the data in any
way you like, without being confined
by the capabilities of any particular
software.
✓It is still simple to publish.
- But, you may need converters or
plug-informat.s to export the data from
the proprietary
27. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
★ ★ ★ ★ Use URIs to denote things
27
❑ Por ejemplo, Comercios situados en el Municipio de Santander principalmente
dedicados a la venta al por menor.
https://datos.gob.es/es/catalogo/l01390759-comercios
28. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Pros & Cons of ★ ★ ★ ★ Open Data
28
❑ All the benefits of ★ ★ ★ ★ open data; plus
As a consumer... As a publisher...
✓You can link to it from any other place. ✓Other data publishers can now link into your data,
promoting it to 5 star.
✓You can bookmark it. ✓You will be able to reuse vocabularies, data and
metadata, and URI design patterns instead of creating
them from scratch.
✓You can access information about a particular resource
directly through its URI, without having to download the
complete dataset.
✓You may be able to reuse existing tools
and libraries.
- But you typically need to invest some time in identifying
the resources and assigning URIs.
✓You can combine the data with other data. - You need to invest in a stable policy, management and
infrastructure for persistent URIs.
- But understanding the technology requires effort and can have a steep learning curve.
29. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
★ ★ ★ ★ ★ Link your data to other data to provide context
29
30. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Pros & Cons of ★ ★ ★ ★ Open Data
30
❑ All the benefits of ★ ★ ★ ★ open data; plus
As a consumer... As a publisher...
✓You can discover more (related) data while
consuming the data.
✓You make your data discoverable.
✓You can directly learn about the data schema. ✓You increase the context, expressivity, quality
and value of your data (and consequently you
give visibility to your organisation).
✓You can combine data from different source,
be innovative, gain new knowledge, be an
entrepreneur...
- This requires an investment in time, money,
technology and competencies/ skills.
- But, you now have to deal with broken data links. Not all publishers/data sources will be reliable.
31. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Prepare your data for publishing – LOD lifecycle
31
❑ LOGD and metadata lifecycle focusing on supply and demand
Select Model Publish Find Integrate Re-use
Data management
Feedback
Data
Publisher
Data
Consumers
Supply Demand
32. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Selection of high-value data
32
❑ Several dimensions can be considered in the selection process of Linked Open
Government Data, both from the publisher’s and the re-user’s point of view:
o Transparency: Does the publication of the dataset increase transparency and openness of the
government towards its citizens?
o Legal requirements: Is there a law that makes open publication mandatory or is there no specific
obligation?
o Relation to public task: Is the data the direct result of the primary public task of government or is it
a product of a non-essential activity?
o Current status of open publication: Is the data already openly available or does it still need to be
opened up?
o Type of value: Is the data useful for social engagement or does it have commercial value?
o Audience: Is the data primarily intended for the public or is it primarily aimed at back-office
integration?
33. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Modelling your data & metadata is about ...
33
❑ Making your data available in a structured, comprehensible and machine-readable
way.
❑ Reusing what already exists in terms of vocabularies and reference data.
❑ Reaching the right quality level by cleansing your data.
❑ Providing licensing information so that data consumers know what the conditions of
reuse are.
❑ Providing a rich description (metadata).
❑ Using semantic technologies (RDF, HTTP URIs...) for describing your data.
34. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Cleansing your data & metadata
34
❑ To ensure data and metadata can be published with an appropriate level of quality
and minimum errors.
o This means:
• Fixing errors.
• Transforming/homogenising formats.
• Aligning inconsistencies in data and metadata.
• Removing duplicate/redundant information.
• Adding lacking information.
• Making sure the information is up-to-date.
OpenRefine tool
• Portal: https://openrefine.org/
• Video demostrador:
• https://www.youtube.com/watch?v=tzXExfZCA1w
• GitHub: https://github.com/OpenRefine/OpenRefine
35. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Cleansing your data – example
35
Company_name Registration date Country E-mail # Establishments
Nikè 1991-04-28 Belgium niké 7
BARCO 15 September 1986 BE Barco@email.be 2
Nikè België
Coca-Cola United States coca@cola.com 3
Formatting
issue
Missing
information
Duplicate error
Redundant
information
Inconsistent
information
Company_name Registration date Country E-mail
Nikè 1991-04-28 BE niké@sport.org
BARCO 1986-09-05 BE Barco@email.be
Coca-Cola 1964-03-26 US coca@cola.com
Cleansing
operations
36. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Publishing linked data is about ...
36
❑ Breaking down the walls of the silos in order to create more value.
o Making your data and metadata publicly and easily accessible on the Web.
o Linking your data and metadata to other data (or metadata) in order to:
• Attach meaning and content to it.
• Give context to it.
• Enrich it.
• Allow people to discover more.
37. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Ejemplos de fuentes de datos públicas reutilizables
37
❑ Datos Abiertos de Zaragoza https://www.zaragoza.es/sede/portal/datos-abiertos/api
❑ Dades Obertes Manlleu: https://dadesobertes.diba.cat/dades-obertes/documentacio-tecnica/api
❑ Aemet OpenData API: https://opendata.aemet.es/
❑ Catálogo de APIs Abiertas ISTAC: https://www3.gobiernodecanarias.org/aplicaciones/appsistac/api
❑ EMT mobilitylabs: https://mobilitylabs.emtmadrid.es/es/portal/opendata
38. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
WikiData & DBpedia
38
❑ Wikidata is a volunteer-created knowledge base of structured data that anyone can edit
o Focused on structured data: possible for humans and computers alike to use the data
o Many ways to contribute to Wikidata: translate, write apps, add and edit data.
o It works with:
• Items – abstract concepts with theirs own and a unique identifier (Q###) and optionally a label, description and aliases
• Statements are added to items: category of data as a property, while the data that describes an item for a given property is known as
a value.
o Example: entry for Everest mountain https://www.wikidata.org/wiki/Q513
o Documentation: https://www.wikidata.org/wiki/Wikidata:Tours
o Wikidata query service: https://query.wikidata.org/
❑ DBpedia, a project to create a graph from Wikipedia data – allows users to semantically query
relationships and properties associated with Wikipedia resources, including links to other related
datasets
o Wikipedia articles consist mostly of free text, but also include structured information embedded in the articles,
such as "infobox" tables, categorisation information, images, geo-coordinates and links to external Web pages.
• This structured information is extracted and put in a uniform dataset which can be queried: http://live.dbpedia.org/sparql
39. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
WikiData & DBpedia
39
❑ There are 4 main differences:
o Wikidata provides data to Wikipedia, while DBpedia extracts data from Wikipedia.
o Wikidata's ontology is community curated, and part of the data maintained on the site, while DBpedia’s ontology is
statically defined, and much stricter.
o Formally, Wikidata only asserts statements (who claims what), while DBpedia asserts facts, often causing
contradictions.
o Wikidata is licensed CC-0, and is this re-usable without any restrictions, while DBpedia is licensed CC-BY-SA, which
requires author attribution - which is a good thing generally, but impractical for a knowledge base automatically derived
from text.
❑ More info: https://www.quora.com/How-is-Wikidata-related-to-Wikipedia-in-a-way-different-from-
how-DBpedia-is-related-to-Wikipedia
❑ Examples:
o Listar los nombres en castellano de los países en Dbpedia
o Countries sorted by population in WikiData
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?nombre WHERE {
?pais rdf:type dbo:Country .
?pais rdfs:label ?nombre .
FILTER (lang(?nombre)='es')
}
40. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Iniciativa Aporta – https://datos.gob.es/es/acerca-de-la-iniciativa-aporta
40
❑ La Iniciativa Aporta es la estrategia nacional de coordinación y de impulso de la apertura de datos
procedentes del sector público, y de promoción del desarrollo de servicios avanzados basados ellos.
❑ Objetivos:
o Impulsar y coordinar la apertura de los datos generados por el sector público.
o Estimular un mercado ligado a la reutilización de la información del sector público.
o Contribuir a favorecer las condiciones para el desarrollo de la Estrategia europea de datos en España.
Sensibilización Análisis y
Estadísticas
Cooperación
Internacional
Regulación
Cooperación
Nacional
Catálogo Nacional
y Soporte Innovación
https://datos.gob.es/ Punto de encuentro entre las
administraciones, las empresas y los ciudadanos que forman
parte del ecosistema de datos abiertos en España
41. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
41
LOS DATOS, MOTOR DEL MUNDO
2018
33
zettabytes
almacenados en tabletas de 512 GB,
formarían una torre que llegaría a la luna.
bastarían para hacer cinco veces el camino
de ida y vuelta a la luna.
2025
175
zettabytes
5 veces más Estos datos son un elemento
fundamental para facilitar la
toma de decisiones y generar
valor económico y social.
En 2018, el valor de la economía de
datos superó los 300 mil millones de
euros en la UE28, con un crecimiento
del 12% con respecto al año anterior.*
Fuente: Unión Europea, 2020
Cada año crece el volumen de datos en el mundo.
Unos datos generados tanto por empresas, como por el conjunto de la sociedad.
CRECERÁ EL VOLUMEN GLOBAL DE DATOS:
*Fuente: DATA ASTHE ENGINE OF EUROPE’S DIGITAL FUTURE, IDC, 2019
42. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
42
• Los datos abiertos son la materia prima de nuevos servicios, productos y aplicaciones.
• Las empresas infomediarias crean productos y servicios de valor añadido con datos públicos y privados.
En España, el sector generó 2.000 millones de euros y empleó a más de 15.000 personas en 2018 *
*Fuente: informe “Del sector infomediario a la economía del dato. Parte I. Caracterización del Sector Infomediario“, ONTSI .
•+ 15,4%
• Volumen de negocio
(2015-2018)
•+ 14,3%
• Empleados
(2016-2018)
EJEMPLOS
Turismo:
análisis de tendencias de
viajeros
Energía:
medidores inteligentes que
tienen en cuenta las condiciones
atmosféricas
Sector Inmobiliario:
análisis de precios del
mercado y condiciones de las
distintas zonas
Agricultura:
datos del suelo o del tiempo
para optimizar el riego
EL USO DE LOS DATOS EN LAS EMPRESAS
43. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Conclusions
43
❑Linked data is a set of design principles for sharing machine-readable
data on the Web.
❑Linked data and open data are not the same.
❑URIs, RDF and SPARQL form the foundational layer for Linked data.
❑Linked data offers a number of advantages for:
o Data integration with small impact on legacy systems;
o Enables for semantic interoperability;
o Enables creativity and innovation through context and knowledge-creation.
44. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Referencias
44
❑ Iniciativa Aporta: https://datos.gob.es/es/acerca-de-la-iniciativa-aporta
o Presentación: http://ondemand2.redes.ondemand.flumotion.com/redes/ondemand2/EX-
24956/PresentacionMultimedia_07.09.20.pptx
❑ European Data Portal – Introduction to Linked Data, training module:
https://www.europeandataportal.eu/sites/default/files/d2.1.2_training_module_1.2_introduction_
to_linked_data_en_edp.pdf
❑ Guía práctica para la publicación de Datos Abiertos usando APIs – Iniciativa Aporta
o https://datos.gob.es/es/documentacion/guia-practica-para-la-publicacion-de-datos-abiertos-usando-apis
45. FONDO EUROPEO DE DESARROLLO REGIONAL (FEDER)
Una manera de hacer Europa
Red Ontologías Hércules – ROH
Diego López-de-Ipiña
MORElab research group, Universidad de Deusto
dipina@deusto.es