Linked (Open) Data
Lecture slides about the basics of Linked Data

Linked (Open) Data: Presentation Transcript

  • Linked (Open) Data
    INFO 4302 - April 18, 2011
    Bernhard Haslhofer - Cornell University
  • Who am I?
      • Postdoc at Cornell Information Science
      • Research areas
          • linked data
          • user-contributed data (annotations)
          • (meta-)data interoperability
      • Contact: bernhard.haslhofer@cornell.edu
  • Today we talk about...
    http://www.youtube.com/watch?v=5Cb3ik6zP2I
  • Today we talk about...
      • Movies, actors and other real-world entities
      • How to make data about these entities available on the Web (Linked Data)
      • Enabling technologies, best practices and useful tools that help us in doing so
      • Other Linked Data projects (BBC, LoC)
  • Web Architecture Recap
  • The World Wide Web (WWW)
      • Internet != WWW != Google != Facebook
      • Fundamental technologies
          • URI - a simple and generic syntax for identifiers
          • HTML - a markup language without formal schema binding
          • HTTP - a simple protocol to access and manipulate resources and resource representations in a distributed environment
      • World Wide Web Consortium (W3C, http://www.w3.org)
  • URIs
      • Identification of resources via Uniform Resource Identifiers (URIs)
      • Generic syntax: a hierarchical sequence of components (scheme, authority, path, query, and fragment)
            URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
      • Scheme and hier-part are required, though the path may be empty
      • Example URI with components:
            foo://example.com:8042/over/there?name=ferret#nose
            \_/   \______________/\_________/ \_________/ \__/
             |           |            |            |        |
          scheme     authority       path        query   fragment
      • (Figure labels: URI, URL, URN)
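The component breakdown above can be reproduced with a few lines of Python. This is only a minimal sketch using the standard library's `urllib.parse.urlsplit`; the helper name `uri_components` is made up for illustration, and the standard library calls the authority component `netloc`.

```python
from urllib.parse import urlsplit


def uri_components(uri):
    """Split a URI into the five generic-syntax components."""
    parts = urlsplit(uri)
    return {
        "scheme": parts.scheme,
        "authority": parts.netloc,  # urllib calls the authority "netloc"
        "path": parts.path,
        "query": parts.query,
        "fragment": parts.fragment,
    }


# The example URI from the slide
components = uri_components("foo://example.com:8042/over/there?name=ferret#nose")
for name, value in components.items():
    print(f"{name:>9}: {value}")
```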
  • URIs / Resources
      • Information Resource
          • web pages, images, product catalogs, etc.
          • all their essential characteristics can be conveyed in a message
          • e.g., http://www.flickr.com/user2/photos/image.jpg
      • Non-Information Resource
          • other things such as dogs, people, this classroom, concepts
          • their essence is not information
          • e.g., http://www.example.com/ontology/meter
  • HTTP
      • A stateless request-response protocol in the client-server computing model
      • HTTP methods: GET, POST, PUT, DELETE, ...
      • Agents may use a URI to access the referenced resource; this is called dereferencing the URI
  • HTTP Content Negotiation
      • A URI is not (necessarily) a filename
      • Conneg = making multiple resource representations available via the same URI
      • [Figure: the resource identified by the URI http://example.com/The_Shining has three representations: Plain Text (text/plain), HTML (en) (text/html), HTML (jp) (text/html)]
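To make the content negotiation idea concrete, here is a rough sketch of how a server might pick one representation from the `Accept` header of a request. It is a simplification under stated assumptions: the function names and the file names in `REPRESENTATIONS` are hypothetical, only the `q` parameter is honoured, and partial wildcards such as `text/*` are ignored.

```python
def parse_accept(header):
    """Return the media types of an Accept header, most preferred first."""
    preferences = []
    for part in header.split(","):
        fields = part.strip().split(";")
        media_type = fields[0].strip()
        if not media_type:
            continue
        quality = 1.0  # default weight when no q parameter is given
        for param in fields[1:]:
            name, _, value = param.strip().partition("=")
            if name == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        preferences.append((media_type, quality))
    # sort() is stable, so header order is kept among equal weights
    preferences.sort(key=lambda pref: pref[1], reverse=True)
    return [media_type for media_type, quality in preferences if quality > 0]


# Hypothetical representations of the resource behind one URI
REPRESENTATIONS = {
    "text/plain": "The_Shining.txt",
    "text/html": "The_Shining.html",
}


def negotiate(accept_header, available=REPRESENTATIONS):
    """Pick the media type to serve, or None if nothing acceptable exists."""
    for media_type in parse_accept(accept_header):
        if media_type in available:
            return media_type
        if media_type == "*/*":
            return next(iter(available))  # any representation will do
    return None
```

For instance, `negotiate("text/html,text/plain;q=0.5")` selects the HTML representation, while a client asking only for `application/rdf+xml` gets `None`, which a real server would turn into a `406 Not Acceptable` response.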
  • (X)HTML(5)
      • A resource representation data format...
      • ... for presentation markup
          • rendered by user agents (typically browsers)
          • focus on readability
          • less formal, user-friendly syntax and semantics
  • Web Services
      • Application-to-application communication based on the Web architecture
          • simple and open standards (HTTP, XML, JSON, ...)
          • send data from Application A to Application B through the Web
          • usually define some API
      • [Figure: Application A and Application B communicating through the Web]
  • Linked Data
  • Why Linked Data?
      • There is lots of information on the Web
      • ...valuable information that can be (re-)used
      • Problem
          • information is usually expressed in the form of HTML documents
          • the underlying raw data are locked in closed data silos (mostly DBMS)
  • (c) http://www.flickr.com/photos/docsearls/5500714140
  • Why Linked Data?
      • The Web is successful because it provides
          • Uniform encoding (HTML)
          • Uniform addressing (URI)
          • Uniform transportation (HTTP)
        for the exchange of documents.
      • Why not apply the same mechanism to the underlying data?
  • What is Linked Data?
      • A method to build a Web of Data
      • Architectural style, set of standards
  • What is Linked Data?
      • A set of four principles
          • use URIs as names for things
          • use HTTP URIs so that people can look up those names
          • when someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
          • include links to other URIs, so that they can discover more things
  • Enabling Technologies
  • Uniform Resource Identifiers (URI)
      • Name and identify things (resources)
      • Dereferenceable HTTP URIs
          • http://dbpedia.org/resource/The_Shining_(film)
          • http://data.linkedmdb.org/resource/film/2014
          • http://rdf.freebase.com/ns/m/04fjzv
  • Resource Description Framework (RDF)
      • A model for representing data on the Web
      • Several statements (triples) form a graph
      • [Figure: RDF graph. http://dbpedia.org/resource/The_Shining_(film) has rdf:type http://dbpedia.org/ontology/Film and is linked via dbpprop:starring to http://dbpedia.org/resource/Jack_Nicholson, which has rdf:type http://xmlns.com/foaf/0.1/Person. Literal values attached via rdfs:label, foaf:name and dbpedia-owl:birthDate include "The Shining (film)", "Jack Nicholson" and 1937-04-22]
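As an illustration of "several statements form a graph", the sketch below holds three of the statements from the slide as plain Python tuples and looks up the edges leaving a node. This is not how an RDF library stores data; the helper names are invented for the example, and the namespace URIs are the ones that appear in the slides.

```python
DBR = "http://dbpedia.org/resource/"
DBO = "http://dbpedia.org/ontology/"
DBP = "http://dbpedia.org/property/"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

FILM = DBR + "The_Shining_(film)"
ACTOR = DBR + "Jack_Nicholson"

# Each statement is a (subject, predicate, object) triple
TRIPLES = {
    (FILM, RDF_TYPE, DBO + "Film"),
    (FILM, DBP + "starring", ACTOR),
    (ACTOR, RDF_TYPE, FOAF + "Person"),
}


def objects(triples, subject, predicate):
    """All objects of statements with the given subject and predicate."""
    return sorted(o for s, p, o in triples if s == subject and p == predicate)


def outgoing(triples, subject):
    """All (predicate, object) edges that leave a node of the graph."""
    return sorted((p, o) for s, p, o in triples if s == subject)


# Following the "starring" edge leads from the film node to the actor node,
# whose own statements can then be explored in the same way.
for actor in objects(TRIPLES, FILM, DBP + "starring"):
    print(actor, outgoing(TRIPLES, actor))
```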
  • RDF serialization (RDF/XML, N3, Turtle, N-Triples, etc.)
      • Data formats for RDF resource representations
      • Used to transfer RDF data from application to application
      • N3/Turtle example:
            @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
            @prefix dbpedia-owl: <http://dbpedia.org/ontology/> .
            @prefix dbpprop: <http://dbpedia.org/property/> .
            @prefix ns9: <http://dbpedia.org/datatype/> .

            <http://dbpedia.org/resource/The_Shining_%28film%29>
                rdf:type dbpedia-owl:Work , dbpedia-owl:Film .

            <http://dbpedia.org/resource/The_Shining_%28film%29>
                dbpprop:runtime "146.0"^^ns9:minute .
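To show what "serialization" means in practice, the following sketch writes the two kinds of statement from the Turtle example as N-Triples, the simplest line-based RDF format. It is a toy writer under stated assumptions: terms are modelled as hypothetical tagged tuples, and only backslashes and double quotes are escaped, so it is not a complete N-Triples implementation.

```python
def nt_term(term):
    """Format one term given as ("uri", value) or ("literal", value, datatype)."""
    kind = term[0]
    if kind == "uri":
        return f"<{term[1]}>"
    if kind == "literal":
        # Simplified escaping: backslash and double quote only
        escaped = term[1].replace("\\", "\\\\").replace('"', '\\"')
        datatype = term[2] if len(term) > 2 else None
        if datatype:
            return f'"{escaped}"^^<{datatype}>'
        return f'"{escaped}"'
    raise ValueError(f"unknown term kind: {kind}")


def to_ntriples(triples):
    """Serialize triples as N-Triples, one statement per line."""
    return "\n".join(
        f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in triples
    )


FILM = ("uri", "http://dbpedia.org/resource/The_Shining_%28film%29")
TRIPLES = [
    (FILM,
     ("uri", "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"),
     ("uri", "http://dbpedia.org/ontology/Film")),
    (FILM,
     ("uri", "http://dbpedia.org/property/runtime"),
     ("literal", "146.0", "http://dbpedia.org/datatype/minute")),
]

print(to_ntriples(TRIPLES))
```

The same graph could equally be written as RDF/XML or Turtle; only the syntax changes, not the statements.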
  • RDF Vocabulary Description Language (RDFS)
      • A language for describing the syntax and semantics of vocabularies in a machine-understandable way
      • Example: http://dbpedia.org/ontology/Film rdfs:subClassOf http://dbpedia.org/ontology/Work
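One thing a machine can do with an rdfs:subClassOf statement is derive additional types. The sketch below is a small illustration of that idea, assuming the class hierarchy is kept in a plain dictionary; the function names are invented, and a real RDFS reasoner covers many more rules.

```python
DBO = "http://dbpedia.org/ontology/"

# subclass -> set of direct superclasses (the statement from the slide)
SUBCLASS_OF = {
    DBO + "Film": {DBO + "Work"},
}


def superclasses(cls, subclass_of):
    """All direct and indirect superclasses of a class."""
    seen = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for parent in subclass_of.get(current, ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def inferred_types(asserted_types, subclass_of):
    """Asserted types plus every type implied by rdfs:subClassOf."""
    result = set(asserted_types)
    for cls in asserted_types:
        result |= superclasses(cls, subclass_of)
    return result


# A resource typed only as Film is, by inference, also a Work
print(sorted(inferred_types({DBO + "Film"}, SUBCLASS_OF)))
```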
  • OWL - Web Ontology Language
      • A more expressive (formal) language for defining the syntax and semantics of vocabularies
      • Solves RDFS shortcomings but adds considerable complexity
      • [Figure: http://dbpedia.org/ontology/starring has rdf:type http://www.w3.org/2002/07/owl#ObjectProperty, rdfs:domain http://dbpedia.org/ontology/Work, rdfs:range http://dbpedia.org/ontology/Person, and rdfs:label "starring"]
  • Simple Knowledge Organization System (SKOS)
      • A language for describing controlled vocabularies (taxonomies, thesauri, classification schemes)
      • [Figure: http://dbpedia.org/resource/The_Shining_(film) is linked via skos:subject to http://dbpedia.org/resource/Category:1980s_horror_films, which has skos:broader http://dbpedia.org/resource/Category:1980s_films; both categories have rdf:type http://www.w3.org/2004/02/skos/core#Concept]
  • Links between Resources
      • OWL defines properties for linking resources
      • [Figure: http://dbpedia.org/resource/The_Shining_(film) is linked via dbpprop:starring to http://dbpedia.org/resource/Jack_Nicholson; owl:sameAs links connect these DBpedia resources to http://data.linkedmdb.org/resource/film/2014, http://rdf.freebase.com/ns/m/04fjzv and http://data.nytimes.com/N5761411277431266513]
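A consumer of owl:sameAs links usually wants to know which URIs name the same thing. The sketch below groups URIs into such clusters with a small union-find structure. It assumes, as the earlier URI slide suggests, that the LinkedMDB and Freebase URIs both identify the film; the function name is invented for the example.

```python
def same_as_clusters(pairs):
    """Group URIs connected by owl:sameAs links into sets of aliases."""
    parent = {}

    def find(uri):
        parent.setdefault(uri, uri)
        while parent[uri] != uri:
            parent[uri] = parent[parent[uri]]  # path halving
            uri = parent[uri]
        return uri

    for left, right in pairs:
        root_left, root_right = find(left), find(right)
        if root_left != root_right:
            parent[root_left] = root_right

    clusters = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


FILM = "http://dbpedia.org/resource/The_Shining_(film)"
SAME_AS = [
    # Assumed subject of both links: the DBpedia film resource
    (FILM, "http://data.linkedmdb.org/resource/film/2014"),
    (FILM, "http://rdf.freebase.com/ns/m/04fjzv"),
]

for cluster in same_as_clusters(SAME_AS):
    print(sorted(cluster))
```

Because owl:sameAs is symmetric and transitive, data published about any URI in a cluster can be merged with data about the others.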
  • SPARQL
      • A query language and protocol for accessing RDF data on the Web
      • Example query:
            PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
            SELECT DISTINCT ?x
            WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
            LIMIT 10
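The "protocol" half of SPARQL means a query can be sent to an endpoint over plain HTTP, with the query text in a `query` parameter. The sketch below only builds such a GET URL and does not contact any server. The endpoint address `http://dbpedia.org/sparql` is an assumption, since the slides do not name one, and the function name is made up.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumed endpoint; the slides do not say where the query is sent
ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """\
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
SELECT DISTINCT ?x
WHERE { ?x skos:subject <http://dbpedia.org/resource/Category:1980s_horror_films> }
LIMIT 10
"""


def sparql_get_url(endpoint, query):
    """Build the URL of a SPARQL protocol query sent via HTTP GET."""
    return endpoint + "?" + urlencode({"query": query})


url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# Decoding the URL again yields the original query text
decoded = parse_qs(urlsplit(url).query)["query"][0]
```

A client would then issue a GET request for this URL, typically with an `Accept` header naming the desired result format.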
  • Vocabulary / Data Publishing Best Practices
  • Publishing Vocabularies
      • Hash-based URIs
          • e.g., http://example.com/example1#ClassA
          • Suited to group the description of a moderate number of related terms into one RDF document
          • Agent can retrieve terms with a single request
      • Slash-based URIs
          • e.g., http://example.com/example1/ClassB
          • Suited to split terms in large vocabularies into one document per term
          • No need to download a massive document
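The practical difference between the two URI styles comes from how HTTP clients treat the fragment: it is stripped before the request is sent. The sketch below shows this with the standard library's `urldefrag`; the helper name and the second term in each list are hypothetical additions to the examples from the slide.

```python
from urllib.parse import urldefrag


def document_for(term_uri):
    """URI of the document a client actually requests for a term URI."""
    # The fragment (the part after "#") never reaches the server
    return urldefrag(term_uri).url


hash_terms = [
    "http://example.com/example1#ClassA",
    "http://example.com/example1#ClassB",  # hypothetical second term
]
slash_terms = [
    "http://example.com/example1/ClassA",  # hypothetical second term
    "http://example.com/example1/ClassB",
]

# Hash URIs: every term resolves to the same document (one request)
print({document_for(uri) for uri in hash_terms})

# Slash URIs: every term has its own document (one request per term)
print({document_for(uri) for uri in slash_terms})
```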
  • Provide either: human-readable content from vocabulary URI
  • or: machine-readable content from vocabulary URI... depending on what is requested.
  • Publishing Data
      • Distinguish between non-information and information resource
      • Sample non-information resource
          • http://dbpedia.org/resource/The_Shining_(film)
      • Sample information resource
          • http://dbpedia.org/page/The_Shining_(film) - HTML
          • http://dbpedia.org/data/The_Shining_(film) - RDF
  • Publishing Data
        GET http://dbpedia.org/resource/The_Shining_(film)
        Accept: application/rdf+xml

        303 See Other
        Location: http://dbpedia.org/data/The_Shining_(film)

        GET http://dbpedia.org/data/The_Shining_(film)
        Accept: application/rdf+xml

        200 OK
        ...
        <?xml version="1.0" encoding="utf-8"?>
        <rdf:RDF ...
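The exchange above follows a simple rule on the server side: requests for the non-information resource are redirected with `303 See Other` to a document that matches the `Accept` header. The sketch below models that rule as a pure function, without any networking. It mirrors the `/resource/`, `/data/` and `/page/` paths from the slides, but the function names and the crude substring test on `Accept` are assumptions, not a description of how DBpedia is implemented.

```python
RDF_XML = "application/rdf+xml"
HTML = "text/html"


def respond(path, accept):
    """Return (status, headers) for a request, following the 303 pattern."""
    if path.startswith("/resource/"):
        # Non-information resource: redirect to a document about it
        name = path[len("/resource/"):]
        target = "/data/" if RDF_XML in accept else "/page/"
        return 303, {"Location": target + name}
    if path.startswith("/data/"):
        return 200, {"Content-Type": RDF_XML}
    if path.startswith("/page/"):
        return 200, {"Content-Type": HTML}
    return 404, {}


def dereference(path, accept, max_hops=5):
    """Follow 303 redirects as a client would; return (status, path, headers)."""
    for _ in range(max_hops):
        status, headers = respond(path, accept)
        if status != 303:
            return status, path, headers
        path = headers["Location"]
    raise RuntimeError("too many redirects")


print(dereference("/resource/The_Shining_(film)", RDF_XML))
print(dereference("/resource/The_Shining_(film)", HTML))
```

The redirect keeps the identifier of the film itself distinct from the identifiers of the documents that describe it.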
  • The Linking Open Data Community Project
  • Linking? Open? Data Project?
      • Open Data: a philosophy, practice, or policy that data are freely available to everyone without restrictions from copyright, patents, and so on
      • Linked Data: method / best practices for exposing, sharing, and connecting data using URIs and RDF
      • Linking Open Data: a W3C community project with the goal of extending the Web with a data commons by publishing various open data sets as RDF on the Web and by setting links between data items from different sources
  • Useful Tools
  • RDF APIs
      • Java
          • Jena Semantic Web Framework (http://openjena.org/)
          • Sesame RDF API (http://www.openrdf.org/)
      • PHP
          • ARC (http://arc.semsol.org/)
      • Ruby
          • RDF.rb: Linked Data for Ruby (http://rdf.rubyforge.org/)
      • Python
          • RDFLib (http://www.rdflib.net/)
      • C
          • Redland RDF Libraries (http://librdf.org/)
  • RDF Stores
      • OpenLink Virtuoso (http://virtuoso.openlinksw.com/dataspace/dav/wiki/Main/)
      • 4Store (http://4store.org/)
      • AllegroGraph (http://www.franz.com/agraph/allegrograph/)
      • Oracle 11g (http://www.oracle.com/technetwork/database/options/semantic-tech/index.html)
      • ...and many more: http://www.w3.org/2001/sw/wiki/Tools
  • RDF / Linked Data Wrappers
      • D2RQ - SPARQL / Linked Data for relational databases (http://www4.wiwiss.fu-berlin.de/bizer/d2rq/)
      • OAI2LOD Server - expose any OAI-PMH source as Linked Data
      • TripFS - filesystem as Linked Data
      • TripCel - XLS spreadsheets as Linked Data
      • ...
  • Linked Data debugging
      • Start up your console / terminal
          • native on Linux / Mac OS X
          • Windows: http://www.cygwin.com/
      • Dereference resources with cURL (http://curl.haxx.se/)
            curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/The_Shining_%28film%29
            curl -H "Accept: application/rdf+xml" http://dbpedia.org/data/The_Shining_%28film%29
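The same checks can be scripted. The sketch below is a rough Python counterpart to the two cURL calls, using only the standard library; the function names are invented, and `fetch` needs network access, so it is defined here but not called. One difference is worth knowing: `curl -I` shows the `303` response itself, whereas `urlopen` follows the redirect and reports the final document.

```python
from urllib.request import Request, urlopen


def build_request(uri, accept="application/rdf+xml", head_only=False):
    """Prepare a request with an Accept header (head_only mirrors curl -I)."""
    method = "HEAD" if head_only else "GET"
    return Request(uri, headers={"Accept": accept}, method=method)


def fetch(uri, accept="application/rdf+xml"):
    """Dereference a URI over the network; redirects are followed."""
    with urlopen(build_request(uri, accept), timeout=10) as response:
        return (
            response.status,
            response.geturl(),  # final URI after any redirect
            response.headers.get("Content-Type"),
        )


# Building the request does not touch the network
request = build_request(
    "http://dbpedia.org/resource/The_Shining_%28film%29", head_only=True
)
print(request.get_method(), request.full_url, request.get_header("Accept"))
```

Calling `fetch("http://dbpedia.org/resource/The_Shining_%28film%29")` on a machine with network access would then return the status, the final URI and the content type of whatever the server currently serves.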
  • Linked Data debugging
      • Install the Raptor RDF Syntax Library (http://librdf.org/raptor/)
          • Mac: brew install raptor
      • Use the rapper utility to dereference URIs
            rapper http://dbpedia.org/resource/The_Shining_%28film%29
            rapper -o rdfxml http://dbpedia.org/resource/The_Shining_%28film%29
  • Readings
  • Required Reading
      • T. Heath, C. Bizer. Linked Data: Evolving the Web into a Global Data Space, Chapters 1-5. http://linkeddatabook.com/editions/1.0/
  • Recommended Readings
      • Linked Data Web Site: http://linkeddata.org
      • Linked Data / Semantic Web Introduction: http://www.linkeddatatools.com/semantic-web-basics
      • Tim Berners-Lee. Linked Data Design Issues: http://www.w3.org/DesignIssues/LinkedData.html
      • Best Practice Recipes for Publishing RDF Vocabularies: http://www.w3.org/TR/swbp-vocab-pub/
      • How to Publish Linked Data on the Web: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/