Linked (Open) Data

Lecture slides about the basics of Linked Data

  • 1. Linked (Open) Data
      INFO 4302 - April 18, 2011
      Bernhard Haslhofer - Cornell University
  • 2. Who am I?
      • Postdoc at Cornell Information Science
      • Research areas
          • linked data
          • user-contributed data (annotations)
          • (meta-)data interoperability
      • Contact:
  • 3. Today we talk about...
  • 4. Today we talk about...
      • Movies, actors, and other real-world entities
      • How to make data about these entities available on the Web (Linked Data)
      • Enabling technologies, best practices, and useful tools that help us do so
      • Other Linked Data projects (BBC, LoC)
  • 5. Web Architecture Recap
  • 6. The World Wide Web (WWW)
      • Internet != WWW != Google != Facebook
      • Fundamental technologies
          • URI - a simple and generic syntax for identifiers
          • HTML - a markup language without formal schema binding
          • HTTP - a simple protocol to access and manipulate resources and resource representations in a distributed environment
      • World Wide Web Consortium (W3C)
  • 7. URIs
      • Identification of resources via Uniform Resource Identifiers (URIs)
      • The generic syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment
      • Generic syntax: URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
      • Scheme and hier-part are required, though the path may be empty
      • [Figure: an example URI starting with foo://, annotated with its scheme, authority, path, query, and fragment components; a second diagram relating URI, URL, and URN]
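The component breakdown on slide 7 can be tried out with Python's standard library. This is only a sketch: the URI below is the example from RFC 3986, section 3, used here as an assumption because the slide's own example did not survive in the transcript.

```python
from urllib.parse import urlsplit

# Example URI from RFC 3986, section 3 (an assumption, not taken from the slide).
uri = "foo://example.com:8042/over/there?name=ferret#nose"

parts = urlsplit(uri)

# Map the standard-library field names onto the names used in the generic syntax.
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,  # urllib calls the authority "netloc"
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:>9}: {value}")
```

Note that `urlsplit` reports the authority under the name `netloc`; the mapping above just renames it.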
  • 8. URIs / Resources
      • Information Resource
          • web pages, images, product catalogs, etc.
          • all their essential characteristics can be conveyed in a message
          • e.g.,
      • Non-Information Resource
          • other things such as dogs, people, this classroom, concepts
          • their essence is not information
          • e.g.,
  • 9. HTTP
      • A stateless request-response protocol in the client-server computing model
      • HTTP methods: GET, POST, PUT, DELETE, ...
      • Agents may use a URI to access the referenced resource; this is called dereferencing the URI
  • 10. HTTP Content Negotiation
      • A URI is not (necessarily) a filename
      • Conneg = making multiple resource representations available via the same URI
      • [Figure: one URI identifies a resource with three representations: Plain Text (text/plain), HTML (en) (text/html), HTML (jp) (text/html)]
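Content negotiation as described on slide 10 can be sketched as a small server-side selection function. This is a simplified illustration, not a full implementation of HTTP's negotiation rules: the function names `parse_accept` and `negotiate` are hypothetical, and only media types and q-values are handled (no language negotiation, no specificity ordering).

```python
def parse_accept(header):
    """Return (media_type, q) pairs from an Accept header, highest q first."""
    prefs = []
    for part in header.split(","):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0]
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    q = float(field[2:])
                except ValueError:
                    q = 0.0
        prefs.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def negotiate(accept_header, available):
    """Pick the representation the client prefers, or None if nothing matches."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # "text/*" becomes "text/"
            for candidate in available:
                if candidate.startswith(prefix):
                    return candidate
    return None


available = ["text/html", "text/plain", "application/rdf+xml"]
print(negotiate("application/rdf+xml;q=0.9, text/html;q=0.5", available))
```

A real server would answer with 406 Not Acceptable, or fall back to a default representation, when `negotiate` returns `None`.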
  • 11. (X)HTML(5)
      • A resource representation data format...
      • ... for presentation markup
          • rendered by user agents (typically browsers)
          • focus on readability
          • less formal, user-friendly syntax and semantics
  • 12. Web Services
      • Application-to-application communication based on the Web architecture
          • simple and open standards (HTTP, XML, JSON, ...)
          • send data from Application A to Application B through the Web
          • usually define some API
      • [Figure: Application A and Application B exchanging data through the Web]
  • 13. Linked Data
  • 14. Why Linked Data?
  • 15. Why Linked Data?
  • 16. Why Linked Data?
  • 17. Why Linked Data?
      • There is lots of information on the Web
      • ...valuable information that can be (re-)used
      • Problem
          • information is usually expressed in the form of HTML documents
          • the underlying raw data are locked in closed data silos (mostly DBMS)
  • 18. (c)
  • 19. Why Linked Data?
      • The Web is successful because it provides
          • Uniform encoding (HTML)
          • Uniform addressing (URI)
          • Uniform transportation (HTTP)
        for the exchange of documents.
      • Why not apply the same mechanism to the underlying data?
  • 20. What is Linked Data?
      • A method to build a Web of Data
      • Architectural style, set of standards
  • 21. What is Linked Data?
      • A set of four principles
          • use URIs as names for things
          • use HTTP URIs so that people can look up those names
          • when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
          • include links to other URIs, so that they can discover more things
  • 22. Enabling Technologies
  • 23. Uniform Resource Identifiers (URI)
      • Name and identify things (resources)
      • Dereferenceable HTTP URIs
      • [Figure: example URIs ending in The_Shining_(film), resource/film/2014, and 04fjzv]
  • 24. Resource Description Framework (RDF)
      • A model for representing data on the Web
      • Several statements (triples) form a graph
      • [Figure: RDF graph with the nodes Film, Person, The_Shining_(film), and Jack_Nicholson; the edges rdf:type, dbpprop:starring, foaf:name, rdfs:label, and dbpedia-owl:birthDate; and the literals "The Shining (film)", "Jack Nicholson", and 1937-04-22]
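To make "several statements (triples) form a graph" concrete, here is a minimal sketch that stores triples as Python tuples and indexes them by subject. Which edge connects which nodes is an assumption read off the figure labels on slide 24, and the `dbpedia:` prefix on the resource names is a placeholder.

```python
from collections import defaultdict

# Each statement is a (subject, predicate, object) triple.
# The edge assignments are assumed from the labels in the slide's figure.
triples = [
    ("dbpedia:The_Shining_(film)", "rdf:type", "dbpedia-owl:Film"),
    ("dbpedia:The_Shining_(film)", "rdfs:label", '"The Shining (film)"'),
    ("dbpedia:The_Shining_(film)", "dbpprop:starring", "dbpedia:Jack_Nicholson"),
    ("dbpedia:Jack_Nicholson", "rdf:type", "dbpedia-owl:Person"),
    ("dbpedia:Jack_Nicholson", "foaf:name", '"Jack Nicholson"'),
    ("dbpedia:Jack_Nicholson", "dbpedia-owl:birthDate", '"1937-04-22"'),
]


def build_graph(statements):
    """Index triples by subject: subject -> list of (predicate, object) edges."""
    graph = defaultdict(list)
    for subject, predicate, obj in statements:
        graph[subject].append((predicate, obj))
    return graph


graph = build_graph(triples)
for predicate, obj in graph["dbpedia:The_Shining_(film)"]:
    print(predicate, "->", obj)
```

The object of the `dbpprop:starring` triple is itself a subject of other triples, which is what turns a flat list of statements into a graph.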
  • 25. RDF serialization (RDF/XML, N3, Turtle, etc.)
      • Data formats for RDF resource representations
      • RDF serialization formats: RDF/XML, N3, Turtle, N-Triples, etc.
      • Used to transfer RDF data between applications
      • N3/Turtle example:
          @prefix rdf: <> .
          @prefix dbpedia-owl: <> .
          <> rdf:type dbpedia-owl:Work , dbpedia-owl:Film .
          @prefix dbpprop: <> .
          @prefix ns9: <> .
          <> dbpprop:runtime "146.0"^^ns9:minute ;
      • © Prof. Dr. Wolfgang Klas and Dr. Bernhard Haslhofer, WS 2010/11 - Multimediale Systeme
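As an illustration of what a serialization format does, the sketch below writes triples in N-Triples, the simplest of the formats listed on slide 25: one statement per line, with full IRIs in angle brackets. The `Literal` class and function names are hypothetical, the DBpedia IRIs are assumptions (the IRIs in the slide's example were not preserved), and datatypes and language tags are left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Literal:
    """A plain string literal (no datatype or language tag in this sketch)."""
    text: str


def format_term(term):
    """Render an IRI (str) or a Literal in N-Triples syntax."""
    if isinstance(term, Literal):
        escaped = (
            term.text.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    return f"<{term}>"


def to_ntriples(triples):
    """Serialize (subject, predicate, object) triples, one per line."""
    return "".join(
        f"{format_term(s)} {format_term(p)} {format_term(o)} .\n"
        for s, p, o in triples
    )


# Standard RDF and RDFS IRIs; the DBpedia IRIs below are assumed for illustration.
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FILM = "http://dbpedia.org/resource/The_Shining_(film)"
FILM_CLASS = "http://dbpedia.org/ontology/Film"

print(to_ntriples([
    (FILM, RDF_TYPE, FILM_CLASS),
    (FILM, RDFS_LABEL, Literal("The Shining (film)")),
]), end="")
```

Turtle and N3 express the same triples more compactly by adding prefixes and the `,` and `;` shorthands seen in the slide's example.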
  • 26. RDF Vocabulary Description Language (RDFS)
      • A language for describing the syntax and semantics of vocabularies in a machine-understandable way
      • [Figure: Film rdfs:subClassOf Work]
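One practical consequence of rdfs:subClassOf is that class membership propagates upward: anything typed as a subclass is also an instance of its superclasses. A minimal sketch of that inference, assuming Film is a subclass of Work as in the slide's figure; the CreativeThing class and the function names are hypothetical.

```python
def superclasses(cls, subclass_of):
    """Return all direct and indirect superclasses of cls."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def infer_types(asserted_types, subclass_of):
    """Extend a resource's asserted types with every inherited superclass."""
    inferred = set(asserted_types)
    for cls in asserted_types:
        inferred |= superclasses(cls, subclass_of)
    return inferred


# subclass -> list of direct superclasses.
# "CreativeThing" is a made-up class that shows the transitive step.
subclass_of = {"Film": ["Work"], "Work": ["CreativeThing"]}

print(sorted(infer_types({"Film"}, subclass_of)))
```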
  • 27. OWL - Web Ontology Language
      • A more expressive (formal) language for defining the syntax and semantics of vocabularies
      • Addresses RDFS shortcomings but introduces considerable complexity
      • [Figure: the property starring with rdf:type owl#ObjectProperty, rdfs:domain Work, rdfs:range Person, and rdfs:label "starring"]
  • 28. Simple Knowledge Organization System (SKOS)
      • A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes)
      • [Figure: The_Shining_(film) skos:subject Category:1980s_horror_films; Category:1980s_horror_films skos:broader Category:1980s_films; the categories have rdf:type skos/core#Concept]
  • 29. Links between Resources
      • OWL defines properties for linking resources
      • [Figure: The_Shining_(film) dbpprop:starring Jack_Nicholson, with owl:sameAs links to identifiers in other data sets: resource/film/2014, N5761411277431266513, and 04fjzv]
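Because owl:sameAs is symmetric and transitive, a consumer can merge all identifiers that are linked directly or indirectly into one cluster per real-world thing. The sketch below does this with a small union-find structure. Which identifiers are linked to which is an assumption based on the figure labels, and the `other:` identifiers are made up.

```python
def same_as_clusters(pairs):
    """Group identifiers connected by owl:sameAs links into clusters."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_right] = root_left

    clusters = {}
    for node in parent:
        clusters.setdefault(find(node), set()).add(node)
    return list(clusters.values())


# Assumed links: the film's identifiers in three data sets, plus an unrelated pair.
links = [
    ("The_Shining_(film)", "resource/film/2014"),
    ("The_Shining_(film)", "04fjzv"),
    ("other:thing-a", "other:thing-b"),  # hypothetical identifiers
]

for cluster in same_as_clusters(links):
    print(sorted(cluster))
```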
  • 30. SPARQL
      • A query language and protocol for accessing RDF data on the Web
      • Example query:
          SELECT DISTINCT ?x WHERE { ?x skos:subject < gory:1980s_horror_films> } LIMIT 10
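What the example query on slide 30 asks for can be mimicked over an in-memory list of triples: find every distinct subject whose skos:subject is a given category, and stop after ten results. This is only a sketch of the matching logic, not a SPARQL engine; the function name and the sample data (apart from The_Shining_(film) and the category name) are hypothetical.

```python
def select_subjects(triples, predicate, obj, limit=10):
    """Mimic SELECT DISTINCT ?x WHERE { ?x <predicate> <obj> } LIMIT n."""
    results = []
    for s, p, o in triples:
        if len(results) >= limit:
            break
        if p == predicate and o == obj and s not in results:
            results.append(s)
    return results


HORROR = "Category:1980s_horror_films"

triples = [
    ("The_Shining_(film)", "skos:subject", HORROR),
    ("The_Shining_(film)", "rdf:type", "Film"),
    ("Hypothetical_Film", "skos:subject", HORROR),   # made-up resource
    ("The_Shining_(film)", "skos:subject", HORROR),  # duplicate statement
    ("Jack_Nicholson", "rdf:type", "Person"),
]

print(select_subjects(triples, "skos:subject", HORROR))
```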
  • 31. Vocabulary / Data Publishing Best Practices
  • 32. Publishing Vocabularies
      • Hash-based URIs
          • e.g.,
          • Suited to group the description of a moderate number of related terms into one RDF document
          • Agent can retrieve terms with a single request
      • Slash-based URIs
          • e.g.,
          • Suited to split terms in large vocabularies into one document per term
          • No need to download a massive document
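The difference between the two URI styles on slide 32 follows from how HTTP treats fragments: the part after `#` is never sent to the server, so all hash-based terms of a vocabulary resolve to the same document, while each slash-based term is requested separately. A small sketch, using made-up example.org vocabulary URIs:

```python
from urllib.parse import urldefrag


def document_uri(term_uri):
    """Return the URI an HTTP client actually requests for a term URI."""
    return urldefrag(term_uri).url


# Hypothetical vocabulary terms in both styles.
hash_terms = ["http://example.org/vocab#Film", "http://example.org/vocab#Actor"]
slash_terms = ["http://example.org/vocab/Film", "http://example.org/vocab/Actor"]

print({document_uri(t) for t in hash_terms})   # one shared document
print({document_uri(t) for t in slash_terms})  # one document per term
```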
  • 33. Provide either: human-readable content from vocabulary URI
  • 34. or: machine-readable content from vocabulary URI... depending on what is requested.
  • 35. Publishing Data
      • Distinguish between non-information and information resources
      • Sample non-information resource
      • Sample information resources: one HTML representation and one RDF representation
  • 36. Publishing Data
      • Request 1:
          GET
          Accept: application/rdf+xml
      • Response 1:
          303 See Other
          Location:
      • Request 2:
          GET
          Accept: application/rdf+xml
      • Response 2:
          200 OK
          ...
          <?xml version="1.0" encoding="utf-8"?>
          <rdf:RDF ...
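The exchange on slide 36 can be simulated without a network. In the sketch below a toy server answers requests for a non-information resource with 303 See Other, pointing at an RDF or HTML document depending on the Accept header, and a client follows the redirect. The /resource/, /data/, and /page/ path layout is an assumption for illustration, since the slide's actual URIs were not preserved.

```python
RDF_XML = "application/rdf+xml"


def respond(path, accept):
    """Toy server: return (status, headers) for a request path and Accept header."""
    if path.startswith("/resource/"):
        # Non-information resource: redirect to a document that describes it.
        name = path[len("/resource/"):]
        target = ("/data/" if RDF_XML in accept else "/page/") + name
        return 303, {"Location": target}
    if path.startswith("/data/"):
        return 200, {"Content-Type": RDF_XML}
    if path.startswith("/page/"):
        return 200, {"Content-Type": "text/html"}
    return 404, {}


def dereference(path, accept, max_redirects=5):
    """Toy client: follow 303 redirects until a final response arrives."""
    for _ in range(max_redirects):
        status, headers = respond(path, accept)
        if status != 303:
            return status, path, headers
        path = headers["Location"]
    raise RuntimeError("too many redirects")


print(dereference("/resource/The_Shining_(film)", RDF_XML))
print(dereference("/resource/The_Shining_(film)", "text/html"))
```

The 303 status matters because the requested URI names a thing rather than a document; the redirect tells the client that the response describes the thing but is not the thing itself.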
  • 37. The Linking Open Data Community Project
  • 38. Linking? Open? Data Project?
      • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, and so on
      • Linked Data: a method and set of best practices for exposing, sharing, and connecting data using URIs and RDF
      • Linking Open Data: a W3C community project with the goal of extending the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources
  • 39. Useful Tools
  • 40. RDF APIs
      • Java
          • Jena Semantic Web Framework
          • Sesame RDF API
      • PHP
          • ARC
      • Ruby
          • RDF.rb: Linked Data for Ruby
      • Python
          • RDFLib
      • C
          • Redland RDF Libraries
  • 41. RDF Stores
      • OpenLink Virtuoso ( dataspace/dav/wiki/Main/)
      • 4Store
      • AllegroGraph ( allegrograph/)
      • Oracle 11g ( database/options/semantic-tech/ index.html)
      • ...and many more:
  • 42. RDF / Linked Data Wrappers
      • D2RQ - SPARQL / Linked Data for relational databases ( bizer/d2rq/)
      • OAI2LOD Server - expose any OAI-PMH source as Linked Data
      • TripFS - filesystem as Linked Data
      • TripCel - XLS spreadsheets as Linked Data
      • ...
  • 43. Linked Data debugging
      • Start up your console / terminal
          - native on Linux / Mac OS X
          - Windows:
      • Dereference resources with cURL
          curl -I -H "Accept: application/rdf+xml"
          curl -H "Accept: application/rdf+xml"
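The same kind of debugging request can be built from Python's standard library. The sketch below only constructs the request objects so it runs without network access; the URI is a placeholder, and `build_request` is a hypothetical helper name. `head_only=True` corresponds to cURL's `-I` flag, and the Accept header to `-H`.

```python
import urllib.request


def build_request(uri, media_type="application/rdf+xml", head_only=False):
    """Build an HTTP request that asks for a specific representation."""
    return urllib.request.Request(
        uri,
        headers={"Accept": media_type},
        method="HEAD" if head_only else "GET",
    )


# Placeholder URI; replace it with the resource you want to inspect.
request = build_request("http://example.org/resource/thing", head_only=True)
print(request.get_method(), request.full_url, request.get_header("Accept"))

# To send it for real:
#     with urllib.request.urlopen(request) as response:
#         print(response.status, response.headers.get("Content-Type"))
```

`urlopen` follows redirects by default, so the status it reports is that of the final document rather than the initial 303.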
  • 44. Linked Data debugging
      • Install the Raptor RDF Syntax Library
          - Mac: brew install raptor
      • Use the rapper utility to dereference URIs
          rapper -o rdfxml
  • 45. Readings
  • 46. Required Reading
      • T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space, Chapters 1-5
  • 47. Recommended Readings
      • Linked Data Web Site:
      • Linked Data / Semantic Web Introduction: http://
      • Tim Berners-Lee. Linked Data Design Issues: http://
      • Best Practice Recipes for Publishing RDF Vocabularies:
      • How to Publish Linked Data on the Web: http://