Linked Data Tutorial
1. February 16, 2012
Linked Data Tutorial*
Tomáš Knap, Jindřich Mynarz, Martin Nečaský, Jakub Stárka
Faculty of Mathematics and Physics
Charles University in Prague
(*Partially based on slides of Chris Bizer [8])
3. Motivational Scenario
[Figure: kinds of data (basic data, employees, departments, public contracts, budget, expenses) and where they are published (WWW page of the institution, Business Register, Buyer's Profile, ÚFIS, ISVZUS, gov.cz).]
Data Consumer: Show me the suppliers of public
contracts for the Ministry of Finance (MF) in the Liberec
region. Show me the data on Google Maps on an
iPhone. For every public contract, I am also looking for
the aggregate of all payments made by MF, a link
to its budget, and the responsible person.
• Where can I get the data about public
contracts, responsible persons, expenses, and
the budget of MF?
• How should I aggregate and link the data?
• How can I observe the data on the map?
16th February 2012 | Linked Data Tutorial 3
4. Current Common Practice
[Figure: the data sources from the motivational scenario. The consumer visits them by hand: 1. MF public contracts, 2. MF public contracts + employees, 3. expenses. Sources marked "?" were not discovered by the consumer.]
Information integration is very time-consuming, boring, and ineffective!
5. Linked Data - Basics
6. Linked Data
• A set of best practices for publishing structured
data on the Web in accordance with the
general architecture of the Web,
using Semantic Web technologies and standards
• The Semantic Web is the goal; Linked Data provides
the means to reach it
7. Linked Data Principles
1. Use URIs as names for things
2. Use HTTP URIs so that people can look up those
names.
3. When someone looks up a URI, provide useful RDF
information
4. Include RDF statements that link to other URIs so
that they can discover related things.
[Tim Berners-Lee, http://www.w3.org/DesignIssues/LinkedData.html, 2006]
8. Architecture of the Classic Web
Single global information space
Small set of simple standards:
‒ HTTP URI
• globally unique ID
• retrieval mechanism
‒ HTML as document format
‒ Hyperlinks to connect everything
Applications work on top of the
complete information space
9. Web 2.0 APIs and Mashups
No single global dataspace
Shortcomings:
‒ APIs have proprietary interfaces
‒ No hyperlinks between data items
within different APIs
‒ Mashups are based on a fixed set
of data sources
Web APIs slice the Web into Walled Gardens!
10. Linked Data
• Extend the Web with a single global dataspace
By using RDF to publish structured data on the Web
By setting links between data items within different data
sources.
Physically distributed, but behaves like a single dataspace
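A minimal sketch of why linked sources can behave like one dataspace: triples from two sources are merged with a plain set union, and a shared URI joins them. The triples, the use of `foaf:based_near`, and the `example.org` population property are illustrative assumptions, not data from the slides.

```python
# Triples are (subject, predicate, object) tuples; URIs are plain strings.
# All data below is hypothetical and only illustrates the idea.
BERLIN = "http://dbpedia.org/resource/Berlin"
CYGRI = "http://richard.cyganiak.de/foaf.rdf#cygri"
BASED_NEAR = "http://xmlns.com/foaf/0.1/based_near"
POPULATION = "http://example.org/vocab#population"  # assumed property

source_a = {(CYGRI, BASED_NEAR, BERLIN)}
source_b = {(BERLIN, POPULATION, "3500000")}  # assumed value


def merge(*graphs):
    """Merge graphs by set union; global URIs keep names from clashing."""
    merged = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph, subject):
    """Return all (predicate, object) pairs known about a subject."""
    return {(p, o) for s, p, o in graph if s == subject}


def follow(graph, subject, predicate):
    """Follow a link and return descriptions of the linked resources."""
    return {o: describe(graph, o) for s, p, o in graph
            if s == subject and p == predicate}


dataspace = merge(source_a, source_b)
linked = follow(dataspace, CYGRI, BASED_NEAR)
```

Because both sources name Berlin with the same HTTP URI, the link from the first source leads straight to the facts published by the second.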
11. RDF Data Model
• Flexible graph-based data model [2]
• HTTP URIs take the role of global primary keys.
pd:cygri = http://richard.cyganiak.de/foaf.rdf#cygri
dbpedia:Berlin = http://dbpedia.org/resource/Berlin
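The two abbreviations above are prefixed names that stand for full HTTP URIs. A small sketch of that expansion follows; the function names `expand` and `shorten` are hypothetical helpers, and only the two prefixes shown on the slide are assumed.

```python
# Prefix table taken from the two abbreviations on the slide.
PREFIXES = {
    "pd": "http://richard.cyganiak.de/foaf.rdf#",
    "dbpedia": "http://dbpedia.org/resource/",
}


def expand(curie, prefixes=PREFIXES):
    """Expand a prefixed name such as 'dbpedia:Berlin' to a full URI."""
    prefix, sep, local = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown prefix in {curie!r}")
    return prefixes[prefix] + local


def shorten(uri, prefixes=PREFIXES):
    """Abbreviate a full URI using the longest matching namespace."""
    matches = [(ns, p) for p, ns in prefixes.items() if uri.startswith(ns)]
    if not matches:
        return uri  # no known namespace: leave the URI as it is
    ns, prefix = max(matches, key=lambda m: len(m[0]))
    return f"{prefix}:{uri[len(ns):]}"
```

The full URI, not the short form, is the global key: two datasets may choose different prefixes and still refer to the same thing.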
12. Resolving URIs over the Web
• The HTTP protocol brings together
identification and retrieval
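As a hedged illustration of identification and retrieval in one step, the sketch below builds an HTTP GET request for a resource URI and asks for an RDF serialization through the `Accept` header. Which media types a given server honors is an assumption; the request is only constructed here, not sent.

```python
from urllib.request import Request, urlopen

# Preferred RDF serializations; whether a server supports them is assumed.
RDF_ACCEPT = "text/turtle, application/rdf+xml;q=0.9"


def build_rdf_request(uri):
    """Build a GET request that asks for RDF instead of HTML."""
    return Request(uri, headers={"Accept": RDF_ACCEPT})


def fetch_description(uri, timeout=10):
    """Dereference the URI; needs network access, so it is not called here."""
    with urlopen(build_rdf_request(uri), timeout=timeout) as response:
        return response.headers.get("Content-Type"), response.read()


request = build_rdf_request("http://dbpedia.org/resource/Berlin")
```

The same URI that names the thing is the address used to retrieve its description, which is what makes lookup possible without a separate registry.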
14. Pubby – Linked Data Browser
http://dbpedia.org/page/Český_Krumlov
15. Properties of the Web of Linked Data
• Global, distributed data space built on a simple
set of standards
RDF, URIs, HTTP
• Entities are connected by links
Links create a global data graph that spans data sources
Links enable the discovery of new data sources
• Data coexistence
Everyone can publish data to the Web of Linked Data
Everyone can express their personal view on things
16. Linked Data Deployment on the Web...
Is it real?
17. W3C Linking Open Data Project
• Grassroots community effort to
Publish existing openly licensed datasets as Linked
Data on the Web
Interlink things between different data sources
20. Linked Data Cloud 2011
http://richard.cyganiak.de/2007/10/lod/lod-datasets_2011-09-19_colored.pdf
http://thedatahub.org/
21. More Statistics
http://stats.lod2.eu/stats
22. Uptake in Governmental Domain
• The EU is publishing Linked Data
EuroStat
‒ http://estatwrap.ontologycentral.com/
• National efforts
The UK Government is releasing public data
‒ http://data.gov.uk/
‒ Lots of initiatives in Great Britain
Budget in Germany
‒ http://bund.offenerhaushalt.de/
Open Data in Catalonia
‒ http://opendata.gencat.cat/en/dades-obertes.html
23. Data.gov.uk
http://data.gov.uk/organogram/cabinet-office
24. Linked Data Applications
Linked Data Browsers
25. Search Engines - Sig.ma
http://sig.ma
26. Mashups – Public Contracts On the Map
http://gd.projekty.ms.mff.cuni.cz:2021/new/map.html
27. Mashups – Crime, Transport, Education
http://apps.seme4.com/see-uk/
28. Other Applications
• Browsers:
Disco Hyperdata Browser
‒ http://www4.wiwiss.fu-berlin.de/rdf_browser/
OpenLink RDF Browser
‒ http://ode.openlinksw.com/
• Search Engines
Falcons
‒ http://ws.nju.edu.cn/falcons/
Watson
‒ http://watson.kmi.open.ac.uk/WatsonWUI/
• Mashups
29. Linked Data Applications - Summary
Linked Data Browsers, Search Engines, Linked Data Mashups
31. Publishing Tasks – Bizer 38
• 1. Make data available as RDF via HTTP
Requires ways to serialize RDF data model
• 2. Set RDF links pointing at other data sources
• 3. Make your data self-descriptive
34. RDFa
• A way to add RDF directly to XHTML pages
Provides new attributes to handle additional
markup
• W3C Recommendation, 2008 [5]
• HTML is not extensible, but
most RDFa parsers will recognize RDFa attributes in any
version of HTML
35. RDFa
• Provides new attributes to handle additional
markup and reuses existing ones
New: about, resource, …
Reused: href, src, …
• Can be used with any supported element; preferred:
span, div (in the body)
a (linking element)
meta, link (in the head)
36. RDFa Example
• XHTML page http://example.com/alice/posts/42
• Original XHTML code
All content on this site is licensed under
<a href="http://cc.org/licenses/by/3.0/">a Creative Commons License</a>.
• XHTML + RDFa
All content on this site is licensed under
<a rel="cc:license" href="http://cc.org/licenses/by/3.0/">a Creative Commons License</a>.
• RDF triples distilled from XHTML+RDFa
<http://example.com/alice/posts/42> cc:license
<http://cc.org/licenses/by/3.0/>.
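The extraction step in this example can be sketched with Python's standard HTML parser. This is a deliberately simplified illustration that handles only `rel` plus `href` on `a` elements and uses the page URL as the subject; it is not a conforming RDFa processor, and the class name `RelLinkExtractor` is hypothetical.

```python
from html.parser import HTMLParser


class RelLinkExtractor(HTMLParser):
    """Collect (subject, rel, href) triples from <a rel=... href=...>.

    Simplified on purpose: a real RDFa processor also handles about,
    property, resource, prefix declarations, nesting, and more.
    """

    def __init__(self, base_uri):
        super().__init__()
        self.base_uri = base_uri
        self.triples = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and "rel" in attributes and "href" in attributes:
            self.triples.append(
                (self.base_uri, attributes["rel"], attributes["href"])
            )


PAGE = "http://example.com/alice/posts/42"
SNIPPET = (
    'All content on this site is licensed under '
    '<a rel="cc:license" href="http://cc.org/licenses/by/3.0/">'
    'a Creative Commons License</a>.'
)

extractor = RelLinkExtractor(PAGE)
extractor.feed(SNIPPET)
extractor.close()
```

After parsing, `extractor.triples` holds one triple whose subject is the page, whose predicate is `cc:license`, and whose object is the license URL, matching the triple shown above.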
37. RDF store + Linked Data Interface
• Virtuoso + Pubby
38. D2R server
• A way to publish data stored in relational databases as
Linked Data
• Requests from the Web are rewritten into SQL queries
via the mapping.
On-the-fly translation
Eliminates the need to replicate the data into a
dedicated RDF triple store.
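The idea of on-the-fly translation can be sketched as follows: a request for a resource URI is rewritten into an SQL query through a mapping, and the resulting row is turned into triples. This is only an illustration of the principle, not D2R Server's actual mapping language or implementation; the `example.org` URIs, the `contracts` table, and the `MAPPING` structure are all assumptions.

```python
import sqlite3

BASE = "http://example.org/resource/"   # hypothetical URI space
VOCAB = "http://example.org/vocab#"     # hypothetical vocabulary
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical mapping: URI path segment -> (table, key column, class).
MAPPING = {"contract": ("contracts", "id", VOCAB + "Contract")}


def describe(conn, uri):
    """Translate a request for a resource URI into SQL, then into triples."""
    if not uri.startswith(BASE):
        return []
    kind, _, key = uri[len(BASE):].partition("/")
    if kind not in MAPPING:
        return []
    table, key_column, rdf_class = MAPPING[kind]
    # Table and column names come from the trusted mapping; the key
    # taken from the URI is passed as a bound parameter.
    cursor = conn.execute(
        f"SELECT * FROM {table} WHERE {key_column} = ?", (key,)
    )
    row = cursor.fetchone()
    if row is None:
        return []
    columns = [d[0] for d in cursor.description]
    triples = [(uri, RDF_TYPE, rdf_class)]
    for column, value in zip(columns, row):
        if column != key_column and value is not None:
            triples.append((uri, VOCAB + column, str(value)))
    return triples


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contracts (id TEXT, title TEXT, price INTEGER)")
conn.execute("INSERT INTO contracts VALUES ('42', 'Road repair', 1000)")
triples = describe(conn, BASE + "contract/42")
```

Nothing is copied out of the database ahead of time: every request reads the current row, so the published RDF cannot drift from the relational source.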
39. Publishing Tasks
1. Make data available as RDF via HTTP
2. Set RDF links pointing at other data sources
3. Make your data self-descriptive
40. 2. Set RDF links
<http://dbpedia.org/resource/Berlin>
owl:sameAs
<http://sws.geonames.org/2950159> .
• There are tools to help you generate links
Silk [6]
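To give a feel for what link generation involves, here is a deliberately naive sketch that emits `owl:sameAs` links between resources whose labels match after normalization. It is not how Silk works; real link discovery tools use configurable similarity metrics and linkage rules. The two Berlin URIs come from the slide, while the labels and the extra `example.org` place are assumptions.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"


def normalize(label):
    """Lowercase and collapse whitespace so labels compare loosely."""
    return " ".join(label.lower().split())


def generate_links(source_labels, target_labels):
    """Emit owl:sameAs triples for resources whose labels match exactly.

    Both arguments map resource URI -> label. Exact matching on a
    normalized label is a deliberately naive rule for illustration.
    """
    by_label = {}
    for uri, label in target_labels.items():
        by_label.setdefault(normalize(label), []).append(uri)
    links = []
    for uri, label in source_labels.items():
        for target in by_label.get(normalize(label), []):
            links.append((uri, OWL_SAME_AS, target))
    return links


# The two Berlin URIs come from the slide; the labels are assumed.
dbpedia = {"http://dbpedia.org/resource/Berlin": "Berlin"}
geonames = {
    "http://sws.geonames.org/2950159": " berlin ",
    "http://example.org/place/1": "Bern",  # hypothetical non-match
}
links = generate_links(dbpedia, geonames)
```

Label equality alone would produce false links between different places that share a name, which is one reason dedicated tools combine several properties when comparing resources.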
41. Publishing Tasks
1. Make data available as RDF via HTTP
2. Set RDF links pointing at other data sources
3. Make your data self-descriptive
42. 3. Make your data self-descriptive
• Increase the usefulness of your data and ease
data integration
• Aspects of self-descriptiveness
1. Reuse terms from common vocabularies
2. Enable clients to retrieve the schema
3. Publish schema mappings for proprietary terms
4. Metadata
‒ Provide provenance metadata
‒ Provide licensing metadata
‒ Provide data-set-level metadata using voiD
43. About Vocabularies
• We have to be able to define the meaning of
subjects and properties
Vocabularies, e.g. the Public Contracts Ontology
44. Public Contracts Ontology
http://purl.org/procurement/public-contracts#
45. RDFS
• RDFS = RDF Schema
W3C recommendation
‒ http://www.w3.org/TR/rdf-schema/
Vocabulary for RDF
‒ Definition of classes
• is:Student rdf:type rdfs:Class
‒ Definition of properties
• is:name rdf:type rdf:Property
‒ Domains and ranges of properties
• is:name rdfs:domain is:Student
• is:name rdfs:range xsd:string
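One practical use of domain declarations is inference. The sketch below applies the RDFS domain rule once over a set of triples written with prefixed names as plain strings. The instance `is:alice` is a hypothetical example, and the code covers only this single rule, not full RDFS entailment.

```python
RDF_TYPE = "rdf:type"
RDFS_DOMAIN = "rdfs:domain"


def infer_types_from_domains(triples):
    """Apply the RDFS domain rule once and return the enlarged graph.

    Rule: if (p rdfs:domain C) and (s p o) are present, add (s rdf:type C).
    """
    domains = {}
    for s, p, o in triples:
        if p == RDFS_DOMAIN:
            domains.setdefault(s, set()).add(o)
    inferred = set(triples)
    for s, p, o in triples:
        for cls in domains.get(p, ()):
            inferred.add((s, RDF_TYPE, cls))
    return inferred


schema = {
    ("is:Student", RDF_TYPE, "rdfs:Class"),
    ("is:name", RDF_TYPE, "rdf:Property"),
    ("is:name", RDFS_DOMAIN, "is:Student"),
    ("is:name", "rdfs:range", "xsd:string"),
}
# Hypothetical instance data; is:alice does not appear in the slides.
data = {("is:alice", "is:name", "Alice")}
graph = infer_types_from_domains(schema | data)
```

Although the data never states that `is:alice` is a student, the domain of `is:name` lets a client conclude it.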
46. OWL
• OWL = Web Ontology Language
W3C recommendation
‒ http://www.w3.org/TR/owl2-overview/
Ontologies
‒ More complex constructs
• Class or property equivalences
• Cardinality restrictions
• …
48. 3.1 Reuse Terms from Common Vocabularies
• Common Vocabularies
Friend-of-a-Friend for describing people and their social network
SIOC for describing forums and blogs
SKOS for representing topic taxonomies
Organization Ontology for describing the structure of organizations
GoodRelations provides terms for describing products and business
entities
Music Ontology for describing artists, albums, and performances
Review Vocabulary provides terms for representing reviews
• Common sources of identifiers (URIs) for real world objects
LinkedGeoData and Geonames: locations
GeneID and UniProt: life science identifiers
DBpedia: a wide range of things
49. 3.2 Enable Clients to retrieve the Schema
• Clients can resolve the URIs that identify vocabulary
terms in order to get their RDFS or OWL definitions.
• If we discover this statement in the data:
<http://opendata.cz/data/p6/contract/ocz_art_5161>
<http://purl.org/procurement/public-contracts#awardDate>
"2011-11-11"^^<http://www.w3.org/2001/XMLSchema#date> ;
• We resolve the property URI and get its definition:
RDFS or OWL definition
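Since the property is a hash URI, an HTTP client does not send the fragment to the server: it retrieves the whole vocabulary document and then looks the term up inside it. A small sketch of that split, using only the standard library; the helper name `vocabulary_document` is hypothetical.

```python
from urllib.parse import urldefrag

AWARD_DATE = "http://purl.org/procurement/public-contracts#awardDate"


def vocabulary_document(term_uri):
    """Split a hash URI into the document to fetch and the term name."""
    document, fragment = urldefrag(term_uri)
    return document, fragment


document, term = vocabulary_document(AWARD_DATE)
```

Here `document` is the URL to request and `term` is the name to look for in the returned RDFS or OWL definitions.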
55. SPARQL Example - Result
56. Issues of the Simple Consuming Scenarios
• How to aggregate the data if links are
missing or the data models (ontologies) differ?
• How to deal with data quality?
Anyone can say anything!
• Solution: We are developing an infrastructure
for cleaning, linking, and aggregating Linked
Data
Reusing existing technologies, such as Silk
57. ODCleanStore
• Cleaning the data
Custom cleaners
• Linking the data
Silk
• Graphical user interface
• Smart data consuming
Data aggregation (due to links, ontology mappings)
Conflict resolution
Data provenance
58. Motivational Scenario - Recall
[Figure: kinds of data (basic data, employees, departments, public contracts, budget, expenses) and where they are published (WWW page of the institution, Business Register, Buyer's Profile, ÚFIS, ISVZUS, gov.cz).]
Data Consumer: Show me the suppliers of public
contracts for the Ministry of Finance (MF) in the
Liberec region. Show the data on Google Maps
on an iPhone. For every public contract, I am looking
for the aggregate of all payments made by
MF, a link to its budget, and the responsible person.
• Where can I get the data about public
contracts, responsible persons, expenses, and
the budget of MF?
• How should I aggregate and link the data?
• How can I observe the data on the map?
59. Goal
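[Figure: the same data sources as in the motivational scenario (basic data, employees, departments, public contracts, budget, expenses; WWW page of the institution, Business Register, Buyer's Profile, ÚFIS, ISVZUS, gov.cz), now shown together with ODCleanStore.]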
61. Linked Data vs. Open Data
Open data are raw data that are freely
available on the Web:
• To everyone
• At any time
• For any purpose
• Open data earn 3 stars!
4th star: a single, flexible data model (RDF) is missing
5th star: links
62. Conclusions and Take Away Message
The Power Of Linked Data (5 star data)
• Web-scale data publishing with web-based discovery
mechanisms
• Distributed annotation – make comments about
observations, data series, points on the map
• Easy to reuse
Huge potential when connecting to the cloud and linking the
data; the benefits grow as the amount of data
published as Linked Data increases
• Integration on data level
• Easy to extend (new data properties can be added as required,
with no need to plan them up front)
• Easy to merge – no name clashes!
63. Future Steps
• If you have interesting data, try to
publish it as Linked Data!
We can help you with the whole lifecycle: creating,
publishing, and maintaining the data
Just create the RDF data and we will publish it for you
Just let us know (send us the data) and we can publish it
Publish data in the same way, but use global
identifiers according to the LD principles
• When the infrastructure (ODCleanStore) is ready,
you can just send us the RDF data via a web
service and we will do the rest:
clean, link, and provide aggregated views.
65. References
• Textbook: Tom Heath, Christian Bizer: Linked Data:
Evolving the Web into a Global Data Space.
http://linkeddatabook.com/
• [2] http://www.w3.org/TR/rdf-primer/
• [3] http://www.w3.org/TR/REC-rdf-syntax/
• [4] http://www.w3.org/TeamSubmission/turtle/
• [5] http://www.w3.org/TR/rdfa-syntax/
• [6] http://www4.wiwiss.fu-berlin.de/bizer/silk/
• [7] http://www.w3.org/TR/rdf-sparql-query/
• [8] http://static.lod2.eu/isslod/Bizer-LOD2-IndianSummerSchool.pdf
Editor's Notes
Domain-specific applications using Linked Data from the Web
Publishing Linked Data: RDF data model, ontologies, important vocabularies
As a result:
• No or restricted reusability of the refined data
• No single model, no global identifiers, no general mechanism for linking
• Still reinventing the wheel when looking for a few pieces of information distributed among the data sources!
Web-scale data publishing with web-based discovery mechanisms
LOD2 develops tools (a stack of tools) to work with Linked Data