This presentation provides an overview of Linked Data, its underlying principles and applications. It further discusses benefits and business models for enterprises.
Held at the Tiroler IT Tag 2010
Linked Data - Overview and Potentials
1. Linked Data
Overview and Potentials
Tiroler IT Tag
Innsbruck, 28.10.2010
Funded by the BMWFJ, the BMVIT and the State of Salzburg
Dr. Tobias Bürger, Salzburg Research
tobias.buerger@salzburgresearch.at
2. Page 2
Motivation: The Digital Data Dilemma
247 billion emails sent per day
47 million added websites
126 million blogs
4 billion photos hosted by Flickr
2.5 billion photos uploaded to Facebook per month
1 billion videos served by YouTube per day
Google operates over 1 million servers worldwide
65 million tweets per day
Core question: How to separate the signal from the noise on the Web / in an enterprise?
(Figures from 2009)
Figure: IDC
3. Page 3
Motivation: The Integration Dilemma
Very large amounts of data -> problem of scale
Different data types and schemas -> problem of data heterogeneity
Different information systems -> problem of system heterogeneity
Core question: How to integrate data/systems and benefit from integrated information sources?
5. Page 5
Evolution of the Web
Hypertext, Hypermedia (“As We May Think”)
Web -> Connecting documents
Social Web (Web 2.0) -> Connecting people
Web of Data, Semantic Web -> Connecting data & knowledge
?
Pictures from [1] and [2]
6. Page 6
Evolution: From a Web of Documents to a Web of Data
Web of Documents
“Documents” connected by hyperlinks
Fundamental principles: names (URIs), documents (resources) described using HTML/XML, interactions (HTTP), hyperlinks
Web of Data
“Things” connected by typed links
Fundamental differences: the structure of the data in pages is explicit, URIs for things, typed links
7. Page 7
Vision of the Web of Data
Many common things are represented in multiple data sets
Linking identifiers links these data sets
Figure: Chris Bizer
8. Page 8
Linked Data Principles
1. Use URIs as names for things.
2. Use HTTP URIs so that people can look up those names.
3. When someone looks up a URI, provide useful information.
4. Include links to other URIs to allow discovery of related things.
9. Page 9
Foundational Principle: Machine-Understandable Data
“tbuerger.jpg is a picture and depicts a person called Tobias Bürger who is affiliated with a company called Salzburg Research which is located in Salzburg which is a city in Austria.”
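The sentence above can be broken into subject-predicate-object statements that a program can follow. The sketch below is only an illustration: the `http://example.org/` namespace and all property names (`depicts`, `affiliatedWith`, `locatedIn`, ...) are hypothetical and not taken from the slides.

```python
# Hypothetical namespace and property names, chosen only for illustration.
EX = "http://example.org/"

# Each tuple is one statement: (subject, predicate, object).
triples = [
    (EX + "tbuerger.jpg", EX + "type", EX + "Picture"),
    (EX + "tbuerger.jpg", EX + "depicts", EX + "TobiasBuerger"),
    (EX + "TobiasBuerger", EX + "type", EX + "Person"),
    (EX + "TobiasBuerger", EX + "affiliatedWith", EX + "SalzburgResearch"),
    (EX + "SalzburgResearch", EX + "type", EX + "Company"),
    (EX + "SalzburgResearch", EX + "locatedIn", EX + "Salzburg"),
    (EX + "Salzburg", EX + "type", EX + "City"),
    (EX + "Salzburg", EX + "locatedIn", EX + "Austria"),
]


def objects(subject, predicate):
    """Return all objects of statements with the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def country_of_depicted_person(picture):
    """Follow depicts -> affiliatedWith -> locatedIn -> locatedIn."""
    countries = []
    for person in objects(picture, EX + "depicts"):
        for organisation in objects(person, EX + "affiliatedWith"):
            for city in objects(organisation, EX + "locatedIn"):
                countries.extend(objects(city, EX + "locatedIn"))
    return countries
```

Because every relation is explicit, a question such as "in which country does the depicted person work?" becomes a walk along typed links rather than a text-parsing problem.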
10. Page 10
Foundational Principle: Linking in a Giant Global Graph
Figure: graph linking resources across data sets (DBpedia, IMDb, Geonames, Flickr, DNB, …); labels in the figure: “DBpedia: related to”, “Works”, “list”, “DBpedia: Works”
11. Page 11
The Enabling Technologies: URIs
Uniform Resource Identifiers (URI) identify things
Use dereferenceable HTTP URIs in the Web of Data
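Dereferencing an HTTP URI can be sketched with Python's standard library as follows. This is a minimal sketch, not a full Linked Data client: the `Accept` header value is one plausible choice, and whether a given server honours it is an assumption.

```python
from urllib.request import Request, urlopen


def rdf_request(uri):
    """Build a GET request that asks the server for an RDF representation."""
    # Assumed preference order: Turtle first, RDF/XML as a fallback.
    return Request(uri, headers={"Accept": "text/turtle, application/rdf+xml;q=0.9"})


def dereference(uri, timeout=10):
    """Look up the URI and return (final URL, content type, body bytes).

    Needs network access; urlopen follows HTTP redirects automatically.
    """
    with urlopen(rdf_request(uri), timeout=timeout) as response:
        return response.geturl(), response.headers.get("Content-Type"), response.read()
```

For example, `dereference("http://dbpedia.org/resource/Salzburg")` would attempt to look up a DBpedia resource; the URI is used here only as an illustration.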
12. Page 12
The Enabling Technologies: RDF
A data model for representing metadata on the Web
Everything is a resource (e.g. a person, document, company)
Several statements (triples) form a graph
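A set of triples can be written down in a line-based syntax such as N-Triples, one statement per line. The sketch below is a simplified serializer under stated assumptions: it handles only URIs and plain string literals (no blank nodes, datatypes or language tags), and the `Literal` helper is a name introduced here for illustration.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    """A plain string literal; every other term is treated as a URI string."""
    value: str


def term_to_ntriples(term):
    """Render one term: literals in double quotes, URIs in angle brackets."""
    if isinstance(term, Literal):
        escaped = (
            term.value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    return f"<{term}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) tuples, one statement per line."""
    return "\n".join(
        f"{term_to_ntriples(s)} {term_to_ntriples(p)} {term_to_ntriples(o)} ."
        for s, p, o in triples
    )
```

Each output line is one edge of the graph: the subject and object are the nodes and the predicate is the edge label.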
14. Page 14
Benefits of Using RDF in Linked Data
Clients can look up every URI in an RDF graph over the Web to retrieve additional information.
Information from different sources merges naturally.
Information expressed using different schemas can be captured in one model.
Any kind of structured/unstructured data can be modeled.
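The claim that information from different sources merges naturally can be illustrated with plain sets of triples: merging is a set union, and shared URIs connect the statements. The namespace, property names and both example sources below are hypothetical.

```python
EX = "http://example.org/"  # hypothetical namespace
LOCATED_IN = EX + "locatedIn"  # hypothetical property

# Two independent sources that happen to use the same URI for Salzburg.
source_a = {
    (EX + "SalzburgResearch", LOCATED_IN, EX + "Salzburg"),
}
source_b = {
    (EX + "Salzburg", LOCATED_IN, EX + "Austria"),
    (EX + "Salzburg", EX + "type", EX + "City"),
}

# Merging two RDF graphs is simply the union of their statements.
merged = source_a | source_b


def located_in(graph, start):
    """Return every place reachable from start via locatedIn links."""
    found, todo = set(), [start]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == LOCATED_IN and o not in found:
                found.add(o)
                todo.append(o)
    return found
```

Queried alone, `source_a` only knows that Salzburg Research is in Salzburg; after the union, the same query also reaches Austria, without any schema mapping step.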
15. Page 15
The Enabling Technologies: RDFS
A language for describing vocabularies in a machine-understandable way
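One thing an RDFS vocabulary makes possible is simple inference over class hierarchies. The sketch below applies a single RDFS rule (if x has type C and C is a subclass of D, then x also has type D) until nothing new is derived. It is an illustration of that one rule, not a complete RDFS reasoner; the function name is made up here.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_SUBCLASS_OF = "http://www.w3.org/2000/01/rdf-schema#subClassOf"


def infer_types(triples):
    """Add the rdf:type statements implied by rdfs:subClassOf."""
    graph = set(triples)
    # Direct superclasses of every class mentioned in the graph.
    superclasses = {}
    for s, p, o in graph:
        if p == RDFS_SUBCLASS_OF:
            superclasses.setdefault(s, set()).add(o)
    # Repeat until no new type statement appears (handles subclass chains).
    while True:
        new = {
            (x, RDF_TYPE, parent)
            for x, p, cls in graph
            if p == RDF_TYPE
            for parent in superclasses.get(cls, ())
        } - graph
        if not new:
            return graph
        graph |= new
```

With a vocabulary stating that City is a subclass of Place and Place a subclass of Thing, a resource typed only as City is also found when a client asks for all Places.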
16. Page 16
The Enabling Technologies: OWL
A more expressive language for expressing vocabularies and/or ontologies in a machine-understandable way
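As one small example of the extra expressiveness, OWL lets a vocabulary declare two properties as inverses of each other with `owl:inverseOf`. The sketch below derives the implied reverse statements. It covers only this one construct and assumes all objects are URIs; the function name is made up here.

```python
OWL_INVERSE_OF = "http://www.w3.org/2002/07/owl#inverseOf"


def apply_inverse_of(triples):
    """Add the statements implied by owl:inverseOf declarations."""
    graph = set(triples)
    # owl:inverseOf is symmetric, so record the declaration in both directions.
    inverse = {}
    for s, p, o in graph:
        if p == OWL_INVERSE_OF:
            inverse.setdefault(s, set()).add(o)
            inverse.setdefault(o, set()).add(s)
    # If x P y and P is the inverse of Q, then y Q x.
    inferred = {(o, q, s) for s, p, o in graph for q in inverse.get(p, ())}
    return graph | inferred
```

If a hypothetical property `depicts` is declared the inverse of `depictedIn`, a statement that a picture depicts a person yields the statement that the person is depicted in that picture.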
17. Page 17
The Enabling Technologies: SKOS
A language for defining controlled vocabularies.
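A controlled vocabulary maps the many labels people use onto one concept each. The sketch below builds a label index from SKOS `prefLabel` and `altLabel` statements. The concept scheme (`http://example.org/genres/`) and its labels are invented for illustration.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
EX = "http://example.org/genres/"  # hypothetical concept scheme

triples = [
    (EX + "horror", SKOS + "prefLabel", "Horror film"),
    (EX + "horror", SKOS + "altLabel", "Scary movie"),
    (EX + "horror", SKOS + "broader", EX + "film"),
    (EX + "film", SKOS + "prefLabel", "Film"),
    (EX + "film", SKOS + "altLabel", "Movie"),
]


def build_label_index(statements):
    """Map every preferred or alternative label (case-insensitive) to its concept."""
    index = {}
    for concept, prop, value in statements:
        if prop in (SKOS + "prefLabel", SKOS + "altLabel"):
            index[value.casefold()] = concept
    return index


def lookup(index, label):
    """Return the concept URI for a free-text label, or None if it is unknown."""
    return index.get(label.strip().casefold())
```

Free-text tags such as "scary movie" and "Horror film" then resolve to the same concept URI, which is what makes the vocabulary "controlled".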
18. Page 18
The Enabling Technologies: SPARQL
A query language and protocol for accessing RDF data on the Web
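PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
# Ten distinct resources in the category "1980s horror films"
SELECT DISTINCT ?x
WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
LIMIT 10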
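The protocol side of SPARQL can be sketched as follows: the query is sent to an endpoint as the `query` parameter of an HTTP GET request, and the answer can be requested in the SPARQL JSON results format. The endpoint URL `http://dbpedia.org/sparql` is an assumption (the slides do not name one), and the helper names are made up for this sketch.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://dbpedia.org/sparql"  # assumed endpoint, not named in the slides

QUERY = """PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?x
WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
LIMIT 10"""


def build_request(endpoint, query):
    """Build a SPARQL protocol GET request that asks for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


def values_of(results_json, variable):
    """Extract the values bound to one variable from a SPARQL JSON result."""
    data = json.loads(results_json)
    return [
        binding[variable]["value"]
        for binding in data["results"]["bindings"]
        if variable in binding
    ]
```

Sending the request, for example with `urllib.request.urlopen(build_request(ENDPOINT, QUERY))`, needs network access; the response body can then be passed to `values_of(body, "x")`.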
19. Page 19
Linking Open Data Project
What? Community project with W3C support
Follows the Linked Data principles to publish open data sets
“The goal of the W3C SWEO Linking Open Data community project is to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting RDF links between data items from different data sources.”
Basic idea:
Take existing (open) data sets and make them available on the Web in RDF.
Once published in RDF, interlink them with other data sets.
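The two steps of the basic idea can be sketched for a small tabular data set: each row becomes a resource with its own URI, and an `owl:sameAs` link points to the corresponding resource in another data set. The base URI, the row layout and the `same_as` column are assumptions made for this illustration.

```python
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
BASE = "http://example.org/city/"  # hypothetical base URI for the published data


def rows_to_triples(rows):
    """Turn table rows into triples and add links to external data sets."""
    triples = []
    for row in rows:
        subject = BASE + row["id"]
        # Step 1: publish the existing data as RDF statements.
        triples.append((subject, RDFS_LABEL, row["name"]))
        # Step 2: interlink with another data set where a match is known.
        if row.get("same_as"):
            triples.append((subject, OWL_SAME_AS, row["same_as"]))
    return triples


rows = [
    {"id": "1", "name": "Salzburg", "same_as": "http://dbpedia.org/resource/Salzburg"},
    {"id": "2", "name": "Innsbruck"},
]
```

In practice the links are often produced by a separate matching step; here they are simply given as a column to keep the sketch short.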
27. Page 27
Why should I publish my data based on the Linked Data principles?
Elimination of data silos
Ease of discovery
Ease of consumption
Ease of integration
Reduced redundancy
Increased reusability
Ease of (schema) updates
…
28. Page 28
Benefits for Enterprises
Benefits arise from:
(1) Publishing Linked Data
(2) Consuming Linked Data
(3) Adopting Linked Data
Linked Data as a means to reach the “holy grail of enterprise information systems”: complete, integrated access to all data.
Figure: http://bit.ly/21LmMA
29. Page 29
Benefits for Enterprises (2)
Publishing Linked Data
Structured analysis before publication stage => increased data quality
Increased visibility of own data => increased (external) reusability
Easier than creating and maintaining Web APIs => reduced costs
Entity-driven publishing is strongly web-optimized => improved search engine rankings
Consuming Linked Data
Added value for proprietary data
Easy integration of external data
Mashup of internal and externally maintained data
Adopting Linked Data
Ease of data integration
Ease of data consolidation (strong identifiers, clean modelling and linking rather than replicating data)
Structured queries over heterogeneous data sources
Unified infrastructure for managing both internal/external data
32. Page 32
Positive Effects of Using RDF and Linked Data in Search
SearchMonkey: An open platform for using structured data to build more useful and relevant search results
Before After
33. Page 33
Linked Data at the BBC
E.g. BBC Programmes, BBC Music, BBC WildLifeFinder
Linked Data driven publishing of information at the BBC, consolidating and interlinking several internal and external datasets
“One page for every programme/artist/species/etc.”
34. Page 34
Linked Data at the BBC (2)
Demo BBC Music
Demo BBC Programmes
Demo BBC WildLife Finder