Linked (Open) Data



Lecture slides about the basics of Linked Data




  1. 1. Linked (Open) Data • INFO 4302 - April 18, 2011 • Bernhard Haslhofer - Cornell University
  2. 2. Who am I? • Postdoc at Cornell Information Science • Research areas: linked data, user-contributed data (annotations), (meta-)data interoperability
  3. 3. Today we talk about...
  4. 4. Today we talk about... • Movies, actors, and other real-world entities • How to make data about these entities available on the Web (Linked Data) • Enabling technologies, best practices, and useful tools that help us do so • Other Linked Data projects (BBC, LoC)
  5. 5. Web Architecture Recap
  6. 6. The World Wide Web (WWW) • Internet != WWW != Google != Facebook • Fundamental technologies: URI (a simple and generic syntax for identifiers); HTML (a markup language without formal schema binding); HTTP (a simple protocol to access and manipulate resources and resource representations in a distributed environment) • World Wide Web Consortium (W3C)
  7. 7. URIs • Identification of resources via Uniform Resource Identifiers (URIs) • The generic syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment
     Generic syntax: URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
     The scheme and path components are required, though the path may be empty.
     [Diagram: an example foo:// URI with its scheme, authority, path, query, and fragment marked, alongside the labels URI, URL, and URN]
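
The generic syntax on this slide can be tried out with Python's standard `urllib.parse` module. This is only a small sketch: the example URI is borrowed from RFC 3986 and is an assumption, not necessarily the one shown on the slide, and `uri_components` is a helper name made up for this illustration.

```python
from urllib.parse import urlsplit

# Example URI borrowed from RFC 3986 (assumption: not taken from the slide).
EXAMPLE_URI = "foo://example.com:8042/over/there?name=ferret#nose"


def uri_components(uri: str) -> dict:
    """Split a URI into the five components of the generic syntax."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # urllib calls the authority "netloc"
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


if __name__ == "__main__":
    for name, value in uri_components(EXAMPLE_URI).items():
        print(f"{name:>9}: {value}")
```

Note that `urlsplit` reports the authority under the name `netloc`; the mapping to the slide's terminology is done by hand in the helper.
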
  8. 8. URIs / Resources • Information resource: web pages, images, product catalogs, etc.; all their essential characteristics can be conveyed in a message • Non-information resource: other things such as dogs, people, this classroom, concepts; their essence is not information
  9. 9. HTTP • A stateless request-response protocol in the client-server computing model • HTTP methods: GET, POST, PUT, DELETE, ... • Agents may use a URI to access the referenced resource; this is called dereferencing the URI
  10. 10. HTTP Content Negotiation • A URI is not (necessarily) a filename • Conneg = making multiple resource representations available via the same URI • [Diagram: one resource URI served as plain text (text/plain), HTML (en) (text/html), or HTML (jp) (text/html)]
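
To make the idea of content negotiation concrete, here is a deliberately simplified sketch of how a server might pick a representation from the `Accept` header. It is not a complete implementation: it ignores language negotiation and partial wildcards such as `text/*`, and the names `parse_accept`, `negotiate`, and `REPRESENTATIONS` are invented for this example.

```python
from __future__ import annotations

# Representations available for one (hypothetical) resource URI.
REPRESENTATIONS = {
    "text/plain": "The Shining (film)",
    "text/html": "<html><body>The Shining (film)</body></html>",
}


def parse_accept(header: str) -> list[tuple[str, float]]:
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    prefs = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so equal q-values keep their header order.
    return sorted(prefs, key=lambda p: p[1], reverse=True)


def negotiate(accept: str, available: dict[str, str]) -> tuple[str, str] | None:
    """Return (media_type, body) for the best match, or None if nothing fits."""
    for media_type, q in parse_accept(accept):
        if q <= 0:
            continue  # q=0 means "not acceptable"
        if media_type == "*/*" and available:
            first = next(iter(available))
            return first, available[first]
        if media_type in available:
            return media_type, available[media_type]
    return None


if __name__ == "__main__":
    print(negotiate("text/plain;q=0.5, text/html", REPRESENTATIONS))
    print(negotiate("application/rdf+xml", REPRESENTATIONS))
```

A server that gets `None` back would typically answer with 406 Not Acceptable or fall back to a default representation; which of the two is a design choice this sketch leaves open.
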
  11. 11. (X)HTML(5) • A resource representation data format... • ...for presentation markup: rendered by user agents (typically browsers); focus on readability; less formal, user-friendly syntax and semantics
  12. 12. Web Services • Application-to-application communication based on the Web architecture: simple and open standards (HTTP, XML, JSON, ...); send data from Application A to Application B through the Web; usually define some API • [Diagram: Application A and Application B communicating over the Web]
  13. 13. Linked Data
  14. 14. Why Linked Data?
  15. 15. Why Linked Data?
  16. 16. Why Linked Data?
  17. 17. Why Linked Data? • There is lots of information on the Web... • ...valuable information that can be (re-)used • Problem: information is usually expressed in the form of HTML documents; the underlying raw data are locked in closed data silos (mostly DBMS)
  18. 18. (c)
  19. 19. Why Linked Data? • The Web is successful because it provides uniform encoding (HTML), uniform addressing (URI), and uniform transportation (HTTP) for the exchange of documents • Why not apply the same mechanism to the underlying data?
  20. 20. What is Linked Data? • A method to build a Web of Data • An architectural style and a set of standards
  21. 21. What is Linked Data? • A set of four principles: (1) use URIs as names for things; (2) use HTTP URIs so that people can look up those names; (3) when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL); (4) include links to other URIs, so that they can discover more things
  22. 22. Enabling Technologies
  23. 23. Uniform Resource Identifiers (URI) • Name and identify things (resources) • Dereferenceable HTTP URIs • Example identifiers from the slide: The_Shining_(film), resource/film/2014, 04fjzv
  24. 24. Resource Description Framework (RDF) • A model for representing data on the Web • Several statements (triples) form a graph • Example graph:
      The_Shining_(film)  rdf:type               Film
      The_Shining_(film)  rdfs:label             "The Shining (film)"
      The_Shining_(film)  dbpprop:starring       Jack_Nicholson
      Jack_Nicholson      rdf:type               Person
      Jack_Nicholson      foaf:name              "Jack Nicholson"
      Jack_Nicholson      dbpedia-owl:birthDate  1937-04-22
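
The statement that triples form a graph can be illustrated without any RDF library. The sketch below stores the slide's example as plain (subject, predicate, object) tuples; which property belongs to which node is my reading of the flattened diagram, prefixed names are kept as plain strings instead of full URIs, and the `match` helper is a name invented here.

```python
# Triples read off the slide's example graph (assignment of properties to
# nodes is an assumption; prefixed names stand in for full URIs).
TRIPLES = {
    ("The_Shining_(film)", "rdf:type", "Film"),
    ("The_Shining_(film)", "rdfs:label", "The Shining (film)"),
    ("The_Shining_(film)", "dbpprop:starring", "Jack_Nicholson"),
    ("Jack_Nicholson", "rdf:type", "Person"),
    ("Jack_Nicholson", "foaf:name", "Jack Nicholson"),
    ("Jack_Nicholson", "dbpedia-owl:birthDate", "1937-04-22"),
}


def match(triples, s=None, p=None, o=None):
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return sorted(
        t for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


if __name__ == "__main__":
    # Follow an edge of the graph: who stars in the film?
    for _, _, actor in match(TRIPLES, s="The_Shining_(film)", p="dbpprop:starring"):
        print(actor, match(TRIPLES, s=actor))
```

Because the object of one triple (`Jack_Nicholson`) is the subject of others, following matches from node to node walks the graph.
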
  25. 25. RDF serialization (RDF/XML, N3, Turtle, etc.) • Data formats for RDF resource representations: RDF/XML, N3, Turtle, N-Triples, etc. • Used to transfer RDF data between applications • N3/Turtle example:
      @prefix rdf: <> .
      @prefix dbpedia-owl: <> .
      <> rdf:type dbpedia-owl:Work , dbpedia-owl:Film .
      @prefix dbpprop: <> .
      @prefix ns9: <> .
      <> dbpprop:runtime "146.0"^^ns9:minute ;
      (Example © Prof. Dr. Wolfgang Klas and Dr. Bernhard Haslhofer, WS 2010/11 - Multimediale Systeme 2)
  26. 26. RDF Vocabulary Description Language (RDFS) • A language for describing the syntax and semantics of vocabularies in a machine-understandable way • Example: Film rdfs:subClassOf Work
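
One practical consequence of rdfs:subClassOf is that class membership propagates upward. The following sketch shows that inference with plain Python sets. It assumes the slide's diagram means that Film is a subclass of Work, and the helper names `superclasses` and `inferred_types` are made up for the example.

```python
# Assumed reading of the slide's diagram: Film rdfs:subClassOf Work.
SUBCLASS_OF = {"Film": {"Work"}}


def superclasses(cls, subclass_of):
    """All classes reachable from cls via rdfs:subClassOf (transitively)."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted, subclass_of):
    """Asserted rdf:type values plus everything implied by the hierarchy."""
    types = set(asserted)
    for cls in asserted:
        types |= superclasses(cls, subclass_of)
    return types


if __name__ == "__main__":
    # A resource typed only as Film is also a Work.
    print(sorted(inferred_types({"Film"}, SUBCLASS_OF)))
```
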
  27. 27. OWL - Web Ontology Language • A more expressive (formal) language for defining the syntax and semantics of vocabularies • Solves RDFS shortcomings but introduces considerable complexity • Example: starring rdf:type owl:ObjectProperty; starring rdfs:domain Work; starring rdfs:range Person; starring rdfs:label "starring"
  28. 28. Simple Knowledge Organization System (SKOS) • A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes) • Example: The_Shining_(film) skos:subject Category:1980s_horror_films; Category:1980s_horror_films rdf:type skos:Concept; Category:1980s_horror_films skos:broader Category:1980s_films; Category:1980s_films rdf:type skos:Concept
  29. 29. Links between Resources • OWL defines properties for linking resources • Example: The_Shining_(film) dbpprop:starring Jack_Nicholson; three owl:sameAs links point to the identifiers resource/film/2014, N5761411277431266513, and 04fjzv
  30. 30. SPARQL • A query language and protocol for accessing RDF data on the Web • Example:
      SELECT DISTINCT ?x
      WHERE { ?x skos:subject <...Category:1980s_horror_films> }
      LIMIT 10
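
What the example query asks for can be mimicked in a few lines of Python. This is not a SPARQL engine, only a sketch of the semantics of one triple pattern combined with DISTINCT and LIMIT. The function name `select_distinct_subjects` and the entry `Hypothetical_Film` are invented for illustration.

```python
# Sample data; "Hypothetical_Film" is made up so that more than one row matches.
TRIPLES = [
    ("The_Shining_(film)", "skos:subject", "Category:1980s_horror_films"),
    ("The_Shining_(film)", "skos:subject", "Category:1980s_horror_films"),  # duplicate row
    ("The_Shining_(film)", "dbpprop:starring", "Jack_Nicholson"),
    ("Hypothetical_Film", "skos:subject", "Category:1980s_horror_films"),
]


def select_distinct_subjects(triples, predicate, obj, limit=10):
    """Mimic: SELECT DISTINCT ?x WHERE { ?x predicate obj } LIMIT limit."""
    if limit <= 0:
        return []
    results = []
    seen = set()
    for s, p, o in triples:
        if p == predicate and o == obj and s not in seen:
            seen.add(s)          # DISTINCT: report each binding of ?x once
            results.append(s)
            if len(results) == limit:
                break            # LIMIT: stop after enough solutions
    return results


if __name__ == "__main__":
    print(select_distinct_subjects(TRIPLES, "skos:subject", "Category:1980s_horror_films"))
```
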
  31. 31. Vocabulary / Data Publishing Best Practices
  32. 32. Publishing Vocabularies • Hash-based URIs: suited to grouping the descriptions of a moderate number of related terms into one RDF document; an agent can retrieve all terms with a single request • Slash-based URIs: suited to splitting the terms of a large vocabulary into one document per term; no need to download a massive document
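
The difference between the two URI styles comes from the fact that a client strips the fragment before making an HTTP request. The sketch below shows this with `urllib.parse.urldefrag`. All URIs live under example.org and are hypothetical, as are the helper names.

```python
from urllib.parse import urldefrag

# Hypothetical vocabulary terms in both styles.
HASH_TERMS = ["http://example.org/vocab#Film", "http://example.org/vocab#Person"]
SLASH_TERMS = ["http://example.org/vocab/Film", "http://example.org/vocab/Person"]


def document_for(term_uri: str) -> str:
    """URI that is actually requested: the fragment never goes over the wire."""
    return urldefrag(term_uri).url


def documents_needed(term_uris) -> set:
    """Distinct documents a client must fetch to look up all the terms."""
    return {document_for(uri) for uri in term_uris}


if __name__ == "__main__":
    print("hash: ", sorted(documents_needed(HASH_TERMS)))
    print("slash:", sorted(documents_needed(SLASH_TERMS)))
```

With hash URIs every term resolves to the same document, so one request covers the vocabulary; with slash URIs each term has its own document.
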
  33. 33. Provide either: human-readable content from the vocabulary URI
  34. 34. or: machine-readable content from the vocabulary URI... depending on what is requested.
  35. 35. Publishing Data • Distinguish between non-information and information resources • Sample non-information resource: one URI for the thing itself • Sample information resources: one URI for its HTML representation, one for its RDF representation
  36. 36. Publishing Data • Request for the non-information resource, then for the document it redirects to:
      GET
      Accept: application/rdf+xml

      303 See Other
      Location:

      GET
      Accept: application/rdf+xml

      200 OK
      ...
      <?xml version="1.0" encoding="utf-8"?>
      <rdf:RDF ...
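
The two-step exchange on this slide (303 See Other, then 200 OK) can be walked through with a tiny in-memory stand-in for a server, so nothing here touches the network. All URIs are hypothetical example.org addresses, and `handle_get` and `dereference` are names invented for this sketch. A real server would also parse full `Accept` headers instead of matching one exact media type.

```python
# Hypothetical URIs: one for the thing itself, one per document about it.
THING = "http://example.org/resource/The_Shining"
DOCS = {
    "application/rdf+xml": "http://example.org/data/The_Shining",
    "text/html": "http://example.org/page/The_Shining",
}
DOC_TYPES = {location: media_type for media_type, location in DOCS.items()}
BODIES = {
    "http://example.org/data/The_Shining": (
        '<?xml version="1.0" encoding="utf-8"?>'
        '<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>'
    ),
    "http://example.org/page/The_Shining": "<html>...</html>",
}


def handle_get(uri, accept):
    """Stand-in server: returns (status, headers, body)."""
    if uri == THING:
        # A non-information resource cannot be sent; point to a document.
        location = DOCS.get(accept)
        if location is None:
            return 406, {}, ""
        return 303, {"Location": location}, ""
    if uri in BODIES:
        return 200, {"Content-Type": DOC_TYPES[uri]}, BODIES[uri]
    return 404, {}, ""


def dereference(uri, accept, max_redirects=5):
    """Client side: follow 303 redirects until a final response arrives."""
    for _ in range(max_redirects + 1):
        status, headers, body = handle_get(uri, accept)
        if status != 303:
            return status, uri, body
        uri = headers["Location"]
    raise RuntimeError("too many redirects")


if __name__ == "__main__":
    print(handle_get(THING, "application/rdf+xml"))
    print(dereference(THING, "application/rdf+xml"))
```
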
  37. 37. The Linking Open Data Community Project
  38. 38. Linking? Open? Data Project? • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, and so on • Linked Data: a method and set of best practices for exposing, sharing, and connecting data using URIs and RDF • Linking Open Data: a W3C community project that aims to extend the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources
  39. 39. Useful Tools
  40. 40. RDF APIs • Java: Jena Semantic Web Framework, Sesame RDF API • PHP: ARC • Ruby: RDF.rb (Linked Data for Ruby) • Python: RDFLib • C: Redland RDF Libraries
  41. 41. RDF Stores • OpenLink Virtuoso (dataspace/dav/wiki/Main/) • 4Store • AllegroGraph (allegrograph/) • Oracle 11g (database/options/semantic-tech/index.html) • ...and many more
  42. 42. RDF / Linked Data Wrappers • D2RQ - SPARQL / Linked Data for relational databases (bizer/d2rq/) • OAI2LOD Server - expose any OAI-PMH source as Linked Data • TripFS - filesystem as Linked Data • TripCel - XLS spreadsheets as Linked Data • ...
  43. 43. Linked Data debugging • Start up your console / terminal (native on Linux / Mac OS X; available separately for Windows) • Dereference resources with cURL:
      curl -I -H "Accept: application/rdf+xml" <URI>
      curl -H "Accept: application/rdf+xml" <URI>
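
The same kind of check can be scripted with Python's standard library, which is handy when many URIs need testing. The sketch below only builds the request by default; `fetch` sends it and therefore needs network access. The function names and the example.org URI are placeholders chosen for this illustration.

```python
import urllib.request


def build_request(uri, accept="application/rdf+xml", head_only=False):
    """Prepare a request similar to `curl -H "Accept: ..."` (add -I for HEAD)."""
    return urllib.request.Request(
        uri,
        headers={"Accept": accept},
        method="HEAD" if head_only else "GET",
    )


def fetch(uri, accept="application/rdf+xml", head_only=False):
    """Send the request (needs network access).

    Returns (status, final URL, Content-Type). urlopen follows redirects
    by default, so the final URL may differ from the one requested.
    """
    with urllib.request.urlopen(build_request(uri, accept, head_only)) as resp:
        return resp.status, resp.geturl(), resp.headers.get("Content-Type")


if __name__ == "__main__":
    # Placeholder URI; replace it with the resource you want to debug.
    req = build_request("http://example.org/resource/The_Shining", head_only=True)
    print(req.get_method(), req.full_url, req.header_items())
```

Comparing the final URL with the requested one is a quick way to see whether a server redirects from a non-information resource to a document.
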
  44. 44. Linked Data debugging • Install the Raptor RDF Syntax Library (Mac: brew install raptor) • Use the rapper utility to dereference URIs:
      rapper -o rdfxml <URI>
  45. 45. Readings
  46. 46. Required Reading • T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space, Chapters 1-5
  47. 47. Recommended Readings • Linked Data Web Site • Linked Data / Semantic Web Introduction • Tim Berners-Lee. Linked Data Design Issues • Best Practice Recipes for Publishing RDF Vocabularies • How to Publish Linked Data on the Web