Linked (Open) Data
INFO 4302 - April 18, 2011
Bernhard Haslhofer - Cornell University
Lecture slides about the basics of Linked Data

  1. Linked (Open) Data
     INFO 4302 - April 18, 2011
     Bernhard Haslhofer - Cornell University
  2. Who am I?
     • Postdoc at Cornell Information Science
     • Research areas
       • linked data
       • user-contributed data (annotations)
       • (meta-)data interoperability
     • Contact
       • bernhard.haslhofer@cornell.edu
  3. Today we talk about...
     http://www.youtube.com/watch?v=5Cb3ik6zP2I
  4. Today we talk about...
     • Movies, actors and other real-world entities
     • How to make data about these entities available on the Web (Linked Data)
     • Enabling technologies, best practices and useful tools that help us in doing so
     • Other Linked Data projects (BBC, LoC)
  5. Web Architecture Recap
  6. The World Wide Web (WWW)
     • Internet != WWW != Google != Facebook
     • Fundamental technologies
       • URI - a simple and generic syntax for identifiers
       • HTML - a markup language without formal schema binding
       • HTTP - a simple protocol to access and manipulate resources and resource representations in a distributed environment
     • W3C Consortium (http://www.w3.org)
  7. URIs
     • Identification of resources via Uniform Resource Identifiers (URIs)
     • The generic syntax consists of a hierarchical sequence of components: scheme, authority, path, query, and fragment.
       URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
     • Scheme and hier-part are required, though the path may be empty.
     • Example URI with components:
       foo://example.com:8042/over/there?name=ferret#nose
       • scheme: foo
       • authority: example.com:8042
       • path: /over/there
       • query: name=ferret
       • fragment: nose
     [Figure labels: URI, URL, URN]
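The component breakdown on the URIs slide can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library's `urllib.parse.urlsplit`; the dictionary that maps the library's field names (for example `netloc`) onto the slide's terms (for example "authority") is an illustrative choice, not part of the lecture.

```python
from urllib.parse import urlsplit

# Example URI taken from the slide above.
uri = "foo://example.com:8042/over/there?name=ferret#nose"
parts = urlsplit(uri)

# Map the standard-library field names onto the component names used on the slide.
components = {
    "scheme": parts.scheme,
    "authority": parts.netloc,
    "path": parts.path,
    "query": parts.query,
    "fragment": parts.fragment,
}

for name, value in components.items():
    print(f"{name:10} {value}")
```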
  8. URIs / Resources
     • Information Resource
       • web pages, images, product catalogs, etc.
       • all their essential characteristics can be conveyed in a message
       • e.g., http://www.flickr.com/user2/photos/image.jpg
     • Non-Information Resource
       • other things such as dogs, people, this classroom, concepts
       • their essence is not information
       • e.g., http://www.example.com/ontology/meter
  9. HTTP
     • A stateless request-response protocol in the client-server computing model
     • HTTP methods: GET, POST, PUT, DELETE, ...
     • Agents may use a URI to access the referenced resource (= dereferencing the URI)
  10. HTTP Content Negotiation
      • A URI is not (necessarily) a filename
      • Conneg = making available multiple resource representations via the same URI
      [Figure: one resource, identified by the URI http://example.com/The_Shining, with three representations]
      • Plain Text (text/plain)
      • HTML (en) (text/html)
      • HTML (jp) (text/html)
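To make the idea of content negotiation more concrete, here is a rough sketch of how a server might pick one of several representations based on the client's `Accept` header. The function names `parse_accept` and `negotiate` are hypothetical, and the logic is deliberately simplified: it honours q-values and `*/*`, but ignores partial ranges such as `text/*` and language negotiation (the en/jp variants on the slide would be selected via `Accept-Language`).

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for item in header.split(","):
        fields = [f.strip() for f in item.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # default quality when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep the client's order.
    return sorted(entries, key=lambda e: e[1], reverse=True)


def negotiate(accept_header, available):
    """Return the media type to serve, or None if nothing is acceptable."""
    for media_type, q in parse_accept(accept_header):
        if q <= 0:
            continue
        if media_type in available:
            return media_type
        if media_type == "*/*" and available:
            return available[0]  # server's own default representation
    return None


# Representations offered for http://example.com/The_Shining on the slide.
AVAILABLE = ["text/html", "text/plain"]

print(negotiate("text/plain", AVAILABLE))
print(negotiate("text/html;q=0.5, text/plain", AVAILABLE))
print(negotiate("application/rdf+xml", AVAILABLE))
```

In a real server a result of `None` would typically translate into a `406 Not Acceptable` response or a fallback to a default representation.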
  11. (X)HTML(5)
      • A resource representation data format...
      • ... for presentation markup
        • rendered by user agents (typically browsers)
        • focus on readability
        • less formal, user-friendly syntax and semantics
  12. Web Services
      • Application-to-application communication based on the Web architecture
        • simple and open standards (HTTP, XML, JSON, ...)
        • send data from Application A to Application B through the Web
        • usually define some API
      [Figure: Application A and Application B communicating through the Web]
  13. Linked Data
  14. Why Linked Data?
  15. Why Linked Data?
  16. Why Linked Data?
  17. Why Linked Data?
      • There is lots of information on the Web
      • ...valuable information that can be (re-)used
      • Problem
        • information is usually expressed in the form of HTML documents
        • the underlying raw data are locked in closed data silos (mostly DBMS)
  18. (c) http://www.flickr.com/photos/docsearls/5500714140
  19. Why Linked Data?
      • The Web is successful because it provides
        • Uniform encoding (HTML)
        • Uniform addressing (URI)
        • Uniform transportation (HTTP)
        for the exchange of documents.
      • Why not apply the same mechanism to the underlying data?
  20. What is Linked Data?
      • A method to build a Web of Data
      • Architectural style, set of standards
      [Figure label: Web]
  21. What is Linked Data?
      • A set of four principles
        • use URIs as names for things
        • use HTTP URIs so that people can look up those names
        • when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
        • include links to other URIs, so that they can discover more things
  22. Enabling Technologies
  23. Uniform Resource Identifiers (URI)
      • Name and identify things (resources)
      • Dereferenceable HTTP URIs
        • http://dbpedia.org/resource/The_Shining_(film)
        • http://data.linkedmdb.org/resource/film/2014
        • http://rdf.freebase.com/ns/m/04fjzv
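  24. Resource Description Framework (RDF)
      • A model for representing data on the Web
      • Several statements (triples) form a graph
      [Figure: graph of triples]
      • http://dbpedia.org/resource/The_Shining_(film) rdf:type http://dbpedia.org/ontology/Film
      • http://dbpedia.org/resource/The_Shining_(film) rdfs:label "The Shining (film)"
      • http://dbpedia.org/resource/The_Shining_(film) rdfs:label [second label, garbled in extraction]
      • http://dbpedia.org/resource/The_Shining_(film) dbpprop:starring http://dbpedia.org/resource/Jack_Nicholson
      • http://dbpedia.org/resource/Jack_Nicholson rdf:type http://xmlns.com/foaf/0.1/Person
      • http://dbpedia.org/resource/Jack_Nicholson foaf:name "Jack Nicholson"
      • http://dbpedia.org/resource/Jack_Nicholson dbpedia-owl:birthDate "1937-04-22"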
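The statement that several triples form a graph can be illustrated without any RDF library. The following is a toy sketch, not a real RDF implementation: triples are plain Python tuples, literals and URIs are both ordinary strings, and the `match` helper is a hypothetical name. The assignment of labels to nodes follows one reading of the slide's figure; the RDF and RDFS namespace URIs are the standard W3C ones.

```python
# Namespace prefixes (dbpedia and foaf ones appear on the slides).
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
DBP = "http://dbpedia.org/property/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"

film = DBR + "The_Shining_(film)"
actor = DBR + "Jack_Nicholson"

# Each statement is a (subject, predicate, object) triple; the set is the graph.
graph = {
    (film, RDF + "type", DBO + "Film"),
    (film, RDFS + "label", "The Shining (film)"),
    (film, DBP + "starring", actor),
    (actor, RDF + "type", FOAF + "Person"),
    (actor, FOAF + "name", "Jack Nicholson"),
    (actor, DBO + "birthDate", "1937-04-22"),
}


def match(graph, s=None, p=None, o=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Follow a link in the graph: who stars in the film, and what are their names?
for _, _, person in match(graph, s=film, p=DBP + "starring"):
    for _, _, name in match(graph, s=person, p=FOAF + "name"):
        print(name)
```

Because the object of one triple (the actor URI) is the subject of others, following links between statements is just repeated pattern matching.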
  25. RDF serialization (RDF/XML, N3, Turtle, N-Triples, etc.)
      • Data formats for RDF resource representations
      • Used to transfer RDF data between applications
      • N3/Turtle example:
        @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
        @prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
        <http://dbpedia.org/resource/The_Shining_%28film%29>
            rdf:type dbpedia-owl:Work , dbpedia-owl:Film .
        @prefix dbpprop: <http://dbpedia.org/property/> .
        @prefix ns9: <http://dbpedia.org/datatype/> .
        <http://dbpedia.org/resource/The_Shining_%28film%29>
            dbpprop:runtime "146.0"^^ns9:minute ;
      (Example © Prof. Dr. Wolfgang Klas and Dr. Bernhard Haslhofer, WS 2010/11 - Multimediale Systeme 2)
  26. RDF Vocabulary Description Language (RDFS)
      • A language for describing the syntax and semantics of vocabularies in a machine-understandable way
      [Figure: one triple]
      • http://dbpedia.org/ontology/Film rdfs:subClassOf http://dbpedia.org/ontology/Work
  27. OWL - Web Ontology Language
      • A more expressive (formal) language for defining the syntax and semantics of vocabularies
      • Solves RDFS shortcomings but introduces considerable complexity
      [Figure: description of the property http://dbpedia.org/ontology/starring]
      • rdf:type http://www.w3.org/2002/07/owl#ObjectProperty
      • rdfs:domain http://dbpedia.org/ontology/Work
      • rdfs:range http://dbpedia.org/ontology/Person
      • rdfs:label "starring"
  28. Simple Knowledge Organization System (SKOS)
      • A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes)
      [Figure: graph of triples]
      • http://dbpedia.org/resource/The_Shining_(film) skos:subject http://dbpedia.org/resource/Category:1980s_horror_films
      • http://dbpedia.org/resource/Category:1980s_horror_films rdf:type http://www.w3.org/2004/02/skos/core#Concept
      • http://dbpedia.org/resource/Category:1980s_horror_films skos:broader http://dbpedia.org/resource/Category:1980s_films
      • http://dbpedia.org/resource/Category:1980s_films rdf:type http://www.w3.org/2004/02/skos/core#Concept
  29. Links between Resources
      • OWL defines properties for linking resources
      [Figure: graph of links]
      • http://dbpedia.org/resource/The_Shining_(film) dbpprop:starring http://dbpedia.org/resource/Jack_Nicholson
      • owl:sameAs links in the figure point to:
        • http://data.linkedmdb.org/resource/film/2014
        • http://data.nytimes.com/N5761411277431266513
        • http://rdf.freebase.com/ns/m/04fjzv
  30. SPARQL - RDF Query Language
      • A query language and protocol for accessing RDF data on the Web
      • Example query:
        SELECT DISTINCT ?x
        WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
        LIMIT 10
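SPARQL is both a query language and a protocol: a query can be sent to an endpoint as the `query` parameter of an HTTP GET request. The sketch below only builds such a request URL and does not send it. The endpoint `http://dbpedia.org/sparql` is an assumption (the slide does not name one), `build_request_url` is a hypothetical helper, and the `PREFIX` line is added here because a standalone query needs the `skos:` prefix declared.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: the slide does not name an endpoint; DBpedia's public one is used here.
ENDPOINT = "http://dbpedia.org/sparql"

# The query from the slide, with the skos prefix declared explicitly.
QUERY = """
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?x
WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
LIMIT 10
"""


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET request."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again recovers the original query text.
decoded = parse_qs(urlsplit(url).query)["query"][0]
print(decoded == QUERY.strip())
```

The resulting URL could then be dereferenced with any HTTP client, using an `Accept` header to request a particular result format.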
  31. Vocabulary / Data Publishing Best Practices
  32. Publishing Vocabularies
      • Hash-based URIs
        • e.g., http://example.com/example1#ClassA
        • Suited to group the description of a moderate number of related terms into one RDF document
        • Agent can retrieve terms with a single request
      • Slash-based URIs
        • e.g., http://example.com/example1/ClassB
        • Suited to split terms in large vocabularies into one document per term
        • No need to download a massive document
  33. Provide either: human-readable content from vocabulary URI
  34. or: machine-readable content from vocabulary URI... depending on what is requested.
  35. Publishing Data
      • Distinguish between non-information and information resources
      • Sample non-information resource
        • http://dbpedia.org/resource/The_Shining_(film)
      • Sample information resources
        • http://dbpedia.org/page/The_Shining_(film) - HTML
        • http://dbpedia.org/data/The_Shining_(film) - RDF
  36. Publishing Data
      Request 1:
        GET http://dbpedia.org/resource/The_Shining_(film)
        Accept: application/rdf+xml
      Response 1:
        303 See Other
        Location: http://dbpedia.org/data/The_Shining_(film)
      Request 2:
        GET http://dbpedia.org/data/The_Shining_(film)
        Accept: application/rdf+xml
      Response 2:
        200 OK
        ...
        <?xml version="1.0" encoding="utf-8"?>
        <rdf:RDF ...
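The exchange above can be modelled as a small decision function on the server side. This is a sketch of the pattern, not DBpedia's actual implementation: the function `respond` is hypothetical, the `Accept` check is a plain substring test, and falling back to the HTML page when RDF is not requested is an assumption.

```python
RESOURCE_PREFIX = "http://dbpedia.org/resource/"  # non-information resources
DATA_PREFIX = "http://dbpedia.org/data/"          # RDF documents
PAGE_PREFIX = "http://dbpedia.org/page/"          # HTML documents


def respond(uri, accept):
    """Return (status, location) for a GET request, following the 303 pattern."""
    if uri.startswith(RESOURCE_PREFIX):
        name = uri[len(RESOURCE_PREFIX):]
        # A non-information resource cannot be sent over the wire, so the
        # server redirects to a document that describes it.
        if "application/rdf+xml" in accept:
            return 303, DATA_PREFIX + name
        return 303, PAGE_PREFIX + name  # assumption: HTML is the default
    if uri.startswith((DATA_PREFIX, PAGE_PREFIX)):
        return 200, None  # information resource: serve the document itself
    return 404, None


status, location = respond(
    "http://dbpedia.org/resource/The_Shining_(film)", "application/rdf+xml"
)
print(status, location)
print(respond(location, "application/rdf+xml"))
```

The two calls mirror the two request/response pairs on the slide: first the redirect, then the document.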
  37. The Linking Open Data Community Project
  38. Linking? Open? Data Project?
      • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, and so on
      • Linked Data: method / best practices for exposing, sharing, and connecting data using URIs and RDF
      • Linking Open Data: a W3C community project with the goal of extending the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources
  39. Useful Tools
  40. RDF APIs
      • Java
        • Jena Semantic Web Framework (http://openjena.org/)
        • Sesame RDF API (http://www.openrdf.org/)
      • PHP
        • ARC (http://arc.semsol.org/)
      • Ruby
        • RDF.rb: Linked Data for Ruby (http://rdf.rubyforge.org/)
      • Python
        • RDFLib (http://www.rdflib.net/)
      • C
        • Redland RDF Libraries (http://librdf.org/)
  41. RDF Stores
      • OpenLink Virtuoso (http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/)
      • 4Store (http://4store.org/)
      • AllegroGraph (http://www.franz.com/agraph/allegrograph/)
      • Oracle 11g (http://www.oracle.com/technetwork/database/options/semantic-tech/index.html)
      • ...and many more: http://www.w3.org/2001/sw/wiki/Tools
  42. RDF / Linked Data Wrappers
      • D2RQ - SPARQL / Linked Data for relational databases (http://www4.wiwiss.fu-berlin.de/bizer/d2rq/)
      • OAI2LOD Server - expose any OAI-PMH source as Linked Data
      • TripFS - filesystem as Linked Data
      • TripCel - XLS spreadsheets as Linked Data
      • ...
  43. Linked Data debugging
      Start up your console / terminal
        - native on Linux / Mac OS X
        - Windows: http://www.cygwin.com/
      Dereference resources with cURL (http://curl.haxx.se/)
        curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/The_Shining_%28film%29
        curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/The_Shining_%28film%29
  44. Linked Data debugging
      Install the Raptor RDF Syntax Library (http://librdf.org/raptor/)
        - Mac: brew install raptor
      Use the rapper utility to dereference URIs
        rapper http://dbpedia.org/resource/The_Shining_%28film%29
        rapper -o rdfxml http://dbpedia.org/resource/The_Shining_%28film%29
  45. Readings
  46. Required Reading
      • T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space, Chapters 1-5
        http://linkeddatabook.com/editions/1.0/
  47. Recommended Readings
      • Linked Data Web Site: http://linkeddata.org
      • Linked Data / Semantic Web Introduction: http://www.linkeddatatools.com/semantic-web-basics
      • Tim Berners-Lee. Linked Data Design Issues: http://www.w3.org/DesignIssues/LinkedData.html
      • Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/TR/swbp-vocab-pub/
      • How to Publish Linked Data on the Web: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/