Linked Data Tutorial


DERI, Galway, Michael Hausenblas 2009-03-05

Transcript

  • 1. Linked Data: A Practical Introduction, by Dr. Michael Hausenblas, DI2. Digital Enterprise Research Institute, www.deri.ie. © Copyright 2008 Digital Enterprise Research Institute. All rights reserved.
  • 2. Schedule
    – Linked Data Principles: 10%
    – Web of Data 101 (URI, HTTP, RDF): 40%
    – Linking Open Data community project: 20%
    – Tools and Applications: 30%
  • 3. Why?
    Web of Data = linked data + vocabularies + embedded metadata (RDFa, microformats, etc.)
    When publishing linked data you provide a standardised, uniform, and generic API for:
    – discovery, see also http://webofdata.wordpress.com/
    – integration/mashup
    – distributed query
    – uniform access to metadata and data
    – enabling serendipity
    See also [EXPL]
  • 4. What?
    In contrast to the full-fledged Semantic Web vision, linked data is mainly about publishing structured data in RDF using URIs rather than focusing on the ontological level or inference. This simplification, just as the Web simplified the established academic approaches of Hypertext systems, lowers the entry barrier for data providers and hence fosters widespread adoption. [EXPL]
  • 5. Linked Data Principles
    By Tim Berners-Lee, ca. 2006 [LD]
    – Use URIs to identify things (anything, not just documents)
    – Use HTTP URIs: globally unique names and distributed ownership, which allow people to look up things
    – Provide useful information in RDF when someone looks up a URI
    – Include RDF links to other URIs to enable discovery of related information
  • 6. Linked Data Principles: Issues
    – These are principles, not implementation advice
    – Many things are (deliberately?) kept blurry
    – Non-information resource vs. information resource debate (see also [AWWSW])
    – Content negotiation (CN) and 303 redirects, the httpRange-14 TAG issue [UA]
    – Formats: HTML + RDF/XML vs. RDFa
  • 7. Linked Data Principles: Ongoing work
    – Description/discovery
        – semantic sitemaps, see http://sw.deri.org/2007/07/sitemapextension/
        – voiD, see http://semanticweb.org/wiki/VoiD
    – Trust (SPOT09 at ESWC09, for example)
    – Multimedia/Fragments, see http://www.interlinkingmultimedia.info/
    – Foundational issues in TAG/AWWSW
    – Transforming the read-only Web of Data into a read/write Web of Data, see for example http://esw.w3.org/topic/PushBackDataToLegacySources
  • 8. Schedule
    – Linked Data Principles: 15%
    – Web of Data 101 (URI, HTTP, RDF): 40%
    – Linking Open Data community project: 15%
    – Tools and Applications: 30%
  • 9. Web of Data 101 - URI
    A Uniform Resource Identifier (URI) is a compact sequence of characters that identifies an abstract or physical resource. [RFC3986]
    Syntax:
        URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
    Example:
        foo://example.com:8042/over/there?name=ferret#nose
        \_/   \______________/\_________/ \_________/ \__/
         |           |            |            |        |
      scheme     authority       path        query   fragment
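The component breakdown on this slide can be reproduced with a generic URI parser. The following is a minimal sketch, assuming Python's standard `urllib.parse` module; the variable names are illustrative only.

```python
from urllib.parse import urlsplit

# Example URI taken from the slide (originally from RFC 3986).
uri = "foo://example.com:8042/over/there?name=ferret#nose"

parts = urlsplit(uri)

# Each attribute corresponds to one of the five generic components.
print("scheme:   ", parts.scheme)
print("authority:", parts.netloc)
print("path:     ", parts.path)
print("query:    ", parts.query)
print("fragment: ", parts.fragment)

# The authority can be broken down further into host and port.
print("host:     ", parts.hostname)
print("port:     ", parts.port)
```

Note that the parser splits the string purely by syntax; it does not need to know what the `foo` scheme means.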
  • 10. Web of Data 101 - URI
    Don't confuse scheme with protocol
    – Scheme: defines URI layout and (certain) semantics; new schemes are registered with IANA following [RFC4395]
    – Protocol: defines the means of communication between endpoints (such as HTTP, FTP, etc.)
    URI resolution (removing dot segments, as of [RFC3986]):

        STEP   OUTPUT BUFFER   INPUT BUFFER
        1:                     /a/b/c/./../../g
        2E:    /a              /b/c/./../../g
        2E:    /a/b            /c/./../../g
        2E:    /a/b/c          /./../../g
        2B:    /a/b/c          /../../g
        2C:    /a/b            /../g
        2C:    /a              /g
        2E:    /a/g
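The step table above traces the dot-segment removal routine of RFC 3986 (section 5.2.4). The following is a sketch of that routine, written for illustration; the function name and the use of a list as the output buffer are choices made here, and the step labels in the comments follow the table.

```python
def remove_dot_segments(path: str) -> str:
    """Remove "." and ".." segments from a path, following RFC 3986, section 5.2.4."""
    output = []   # output buffer, one entry per path segment
    inp = path    # input buffer
    while inp:
        # 2A: drop a leading "../" or "./"
        if inp.startswith("../"):
            inp = inp[3:]
        elif inp.startswith("./"):
            inp = inp[2:]
        # 2B: replace a leading "/./" (or a final "/.") with "/"
        elif inp.startswith("/./"):
            inp = inp[2:]
        elif inp == "/.":
            inp = "/"
        # 2C: replace a leading "/../" (or a final "/..") with "/"
        #     and remove the last segment from the output buffer
        elif inp.startswith("/../"):
            inp = inp[3:]
            if output:
                output.pop()
        elif inp == "/..":
            inp = "/"
            if output:
                output.pop()
        # 2D: a lone "." or ".." is discarded
        elif inp in (".", ".."):
            inp = ""
        # 2E: move the first segment (with its leading "/", if any) to the output
        else:
            start = 1 if inp.startswith("/") else 0
            end = inp.find("/", start)
            if end == -1:
                end = len(inp)
            output.append(inp[:end])
            inp = inp[end:]
    return "".join(output)


print(remove_dot_segments("/a/b/c/./../../g"))
```

Called with the path from the table, the function walks through the same sequence of 2E, 2B, and 2C steps.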
  • 11. Web of Data 101 - URI
    – URIrefs, URI references [RDF AS]: an RDF URI reference is a Unicode string that does not contain any control characters (#x00 - #x1F, #x7F - #x9F) and that would produce a valid URI character sequence representing an absolute URI when subjected to UTF-8 encoding along with %-escaping of non-US-ASCII octets.
    – QNames, Qualified Names [XML NS]: XML's way to allow namespaced elements/attributes, defined as QName = Prefix ':' LocalPart
    – CURIEs, Compact URIs [CURIE]: generic, abbreviated syntax for expressing URIs, currently deployed in SPARQL, RDFa, and XHTML2
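The idea behind QNames and CURIEs, abbreviating a URI as a prefix plus a local part, can be illustrated with a small expansion function. This is a simplified sketch, not an implementation of the CURIE specification: the prefix table and the function name `expand_curie` are hypothetical, and safe CURIEs and default prefixes are not handled.

```python
# Hypothetical prefix table for illustration; any prefix-to-namespace mapping would do.
PREFIXES = {
    "foaf": "http://xmlns.com/foaf/0.1/",
    "dbpedia": "http://dbpedia.org/resource/",
}


def expand_curie(curie: str, prefixes: dict) -> str:
    """Expand 'prefix:reference' into a full URI by string concatenation."""
    prefix, sep, reference = curie.partition(":")
    if not sep or prefix not in prefixes:
        raise ValueError(f"unknown or missing prefix in {curie!r}")
    return prefixes[prefix] + reference


print(expand_curie("dbpedia:Galway", PREFIXES))
print(expand_curie("foaf:name", PREFIXES))
```

The expansion is plain concatenation, which is why the namespace URI conventionally ends in `/` or `#`.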
  • 12. Web of Data 101 - HTTP
    The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. It is a generic, stateless, protocol which can be used for many tasks beyond its use for hypertext, such as name servers and distributed object management systems, through extension of its request methods, error codes and headers. A feature of HTTP is the typing and negotiation of data representation, allowing systems to be built independently of the data being transferred. [RFC2616]
  • 13. Web of Data 101 - HTTP
    HTTP messages consist of requests from client to server and responses from server to client
    The set of methods is predefined (such as GET, POST, etc.), but can be expanded
    The set of status codes is defined:
    – Informational 1xx: provisional response (100 Continue)
    – Successful 2xx: request successfully received, understood, and accepted (201 Created)
    – Redirection 3xx: further action needs to be taken by the user agent to fulfill the request (301 Moved Permanently)
    – Client Error 4xx: client erred (405 Method Not Allowed)
    – Server Error 5xx: server encountered an unexpected condition (501 Not Implemented)
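Because the first digit of a status code determines its class, a client can react sensibly even to codes it does not know in detail. A minimal sketch, assuming Python's standard `http.HTTPStatus` enumeration; the `STATUS_CLASSES` table and the `describe` helper are introduced here for illustration.

```python
from http import HTTPStatus

# Class names as used on the slide, keyed by the first digit of the status code.
STATUS_CLASSES = {
    1: "Informational",
    2: "Successful",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
}


def describe(code: int) -> str:
    """Return the code, its standard reason phrase, and its class."""
    status = HTTPStatus(code)  # raises ValueError for codes unknown to the library
    return f"{code} {status.phrase} ({STATUS_CLASSES[code // 100]})"


for code in (100, 201, 301, 405, 501):
    print(describe(code))
```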
  • 14. Web of Data 101 - HTTP
    REQUEST
        GET /html/rfc2616 HTTP/1.1
        Host: tools.ietf.org
        User-Agent: Mozilla/5.0
        Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
    RESPONSE
        HTTP/1.x 200 OK
        Date: Thu, 05 Mar 2009 08:17:33 GMT
        Server: Apache/2.2.11
        Content-Location: rfc2616.html
        Last-Modified: Tue, 20 Jan 2009 09:16:04 GMT
        Content-Type: text/html; charset=UTF-8
  • 15. Web of Data 101 - HTTP
    Content negotiation (CN, conneg) is the process of selecting the best representation for a given response when there are multiple representations available
    Three types of CN: server-driven, agent-driven, and transparent
    Example:
        curl -I -H "Accept: application/rdf+xml" http://dbpedia.org/resource/Galway

        HTTP/1.1 303 See Other
        Content-Type: application/rdf+xml
        Location: http://dbpedia.org/data/Galway.rdf
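In server-driven negotiation, the server compares the client's Accept header with the representations it can offer. The following is a deliberately simplified sketch of that selection: it orders media ranges by q-value only and ignores the specificity rules and other parameters of the full HTTP algorithm. All function names are hypothetical.

```python
def parse_accept(header: str) -> list:
    """Return (media_range, q) pairs from an Accept header, highest q first."""
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0]
        q = 1.0  # default quality when no q parameter is given
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((media_range, q))
    # sorted() is stable, so ranges with equal q keep their header order
    return sorted(ranges, key=lambda pair: pair[1], reverse=True)


def matches(media_range: str, media_type: str) -> bool:
    """Check a media type against a range such as 'text/*' or '*/*'."""
    if media_range == "*/*":
        return True
    r_type, _, r_sub = media_range.partition("/")
    m_type, _, m_sub = media_type.partition("/")
    return r_type == m_type and r_sub in ("*", m_sub)


def negotiate(accept: str, available: list):
    """Pick the first available media type acceptable to the client, or None."""
    for media_range, q in parse_accept(accept):
        if q == 0:
            continue  # q=0 means "not acceptable"
        for media_type in available:
            if matches(media_range, media_type):
                return media_type
    return None


available = ["text/html", "application/rdf+xml"]
print(negotiate("application/rdf+xml", available))
print(negotiate("text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", available))
```

The first call corresponds to the curl request above, the second to a typical browser request, which is why the same URI can yield RDF for one client and HTML for another.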
  • 16. Web of Data 101 - HTTP
    Caching (see the Cache-Control header field) is essential for scalability
    HTTPbis [HTTPbis], an IETF WG chaired by Mark Nottingham, is mainly about: patches, clarifications, deprecation of unused features, and documentation of security properties
  • 17. Web of Data 101 - HTTP
    Representational State Transfer [REST]

        Data element               Modern Web examples
        resource                   the intended conceptual target of a hypertext reference
        resource identifier        URL, URN
        representation             HTML document, JPEG image
        representation metadata    media type, last-modified time
        resource metadata          source link, alternates, vary
        control data               if-modified-since, cache-control
  • 18. Web of Data 101 - RDF
    As of [RDF AS], a data model: a directed, labeled graph based on URIs
    Triple: (subject predicate object)
    – subject: URIref or bNode
    – predicate: URIref
    – object: URIref or bNode or literal
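The constraints on what may appear in each position of a triple can be made concrete with a few small types. This is a sketch of the data model only, not of any RDF library's API; the class and function names are chosen for illustration, and the `locatedIn` predicate is a hypothetical URI.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class URIRef:
    value: str


@dataclass(frozen=True)
class BNode:
    label: str


@dataclass(frozen=True)
class Literal:
    value: str


def triple(subject, predicate, obj):
    """Build a (subject, predicate, object) tuple, enforcing the allowed term kinds."""
    if not isinstance(subject, (URIRef, BNode)):
        raise TypeError("subject must be a URIref or bNode")
    if not isinstance(predicate, URIRef):
        raise TypeError("predicate must be a URIref")
    if not isinstance(obj, (URIRef, BNode, Literal)):
        raise TypeError("object must be a URIref, bNode, or literal")
    return (subject, predicate, obj)


galway = URIRef("http://dbpedia.org/resource/Galway")
label = URIRef("http://www.w3.org/2000/01/rdf-schema#label")
located_in = URIRef("http://example.org/vocab#locatedIn")  # hypothetical predicate

# A graph is simply a set of triples.
graph = {
    triple(galway, label, Literal("Galway")),
    triple(galway, located_in, URIRef("http://dbpedia.org/resource/Ireland")),
}


def objects(graph, subject, predicate):
    """Return all objects of triples with the given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


print(objects(graph, galway, label))
```

Because subjects and objects share URIs, the set of triples forms a directed graph whose edges are labeled by predicates.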
  • 19. Web of Data 101 - RDF
  • 20. Web of Data 101 - Overview
    Web's Standard Retrieval Algorithm as of [SDD]:
    1. parse the URI and find the HTTP protocol
    2. look up the DNS name to determine the associated IP address
    3. open a TCP stream to port 80 at the IP address determined above
    4. format an HTTP GET request for the resource and send it to the server
    5. read the response from the server
    6. from the status code (200) determine that a representation of the resource is available
    7. inspect the returned Content-Type
    8. pass the entity-body to the HTML rendering engine
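Steps 1 to 7 of the retrieval algorithm can be sketched with nothing more than a URI parser and a TCP socket. The code below is an illustration under simplifying assumptions: it handles only `http` URIs, sends an HTTP/1.0 request so that the reply ends when the server closes the connection, and does no redirect or error handling. The function names are hypothetical, and rendering (step 8) is left out.

```python
import socket
from urllib.parse import urlsplit


def build_get_request(uri: str):
    """Steps 1 and 4: parse the URI and format an HTTP GET request."""
    parts = urlsplit(uri)
    if parts.scheme != "http":
        raise ValueError("this sketch only handles http URIs")
    host = parts.hostname
    port = parts.port or 80  # step 3 uses port 80 unless the URI says otherwise
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    request = (
        f"GET {target} HTTP/1.0\r\n"
        f"Host: {parts.netloc}\r\n"
        "\r\n"
    ).encode("ascii")
    return host, port, request


def parse_response(raw: bytes):
    """Steps 6 and 7: extract the status code, the Content-Type, and the entity body."""
    head, _, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    status = int(lines[0].split(" ", 2)[1])
    headers = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return status, headers.get("content-type"), body


def retrieve(uri: str):
    """Steps 2, 3, and 5: resolve the host, open a TCP stream, and read the reply."""
    host, port, request = build_get_request(uri)
    # create_connection performs the DNS lookup and opens the TCP stream
    with socket.create_connection((host, port), timeout=10) as conn:
        conn.sendall(request)
        chunks = []
        while True:
            chunk = conn.recv(4096)
            if not chunk:  # server closed the connection: response complete
                break
            chunks.append(chunk)
    return parse_response(b"".join(chunks))
```

A call such as `retrieve("http://dbpedia.org/resource/Galway")` would need network access; the two helper functions can be exercised offline with a hand-written response.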
  • 22. Web of Data 101 - Overview
  • 23. Web of Data 101 - Overview
  • 24. Schedule
    – Linked Data Principles: 15%
    – Web of Data 101 (URI, HTTP, RDF): 40%
    – Linking Open Data community project: 15%
    – Tools and Applications: 30%
  • 25. Linking Open Data Project
    – Community project with W3C support, started in early 2007 [LOD]
    – Idea: take existing (open) data sets and make them available on the Web in RDF
    – Interlink them with other data sets
    Kudos to Tom Heath and Richard Cyganiak; the material in this section is heavily based on their work.
  • 26. Linking Open Data Project: May 2007
  • 27. Linking Open Data Project: Feb 2009
  • 28. Linking Open Data Project: DBpedia
  • 29. Linking Open Data Project: Geonames
  • 30. Schedule
    – Linked Data Principles: 15%
    – Web of Data 101 (URI, HTTP, RDF): 40%
    – Linking Open Data community project: 15%
    – Tools and Applications: 30%
  • 31. Tools and Applications
    The Linking Open Data homepage [LOD] lists tools and applications for:
    – Browsing with Tabulator, VisiNav, DBpedia Mobile, etc.
    – Searching with Sindice, SWSE, Falcons, etc.
    – Mashups, e.g. Revyu, BBC Music, DERI Pipes
    See further http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData/Applications
  • 32. Tools and Applications: DBpedia Mobile
  • 33. Tools and Applications: BBC music beta
  • 34. Tools and Applications
    – Virtuoso: includes an RDF triple store and SPARQL access to data; open source edition available, http://virtuoso.openlinksw.com/
    – Talis Platform: SaaS, cloud-based storage for RDF data and binary objects, SPARQL access, REST APIs, http://www.talis.com/platform
    – ARC (PHP), http://arc.semsol.org/
    – Jena (Java), http://jena.sourceforge.net/
    – For a summary see http://www.semanticscripting.org/SFSW2005/SFSW-Toolkits.pdf
  • 35. Tools and Applications
    Frameworks
    – LOOMP, http://www.loomp.org/
    – Silk, http://www4.wiwiss.fu-berlin.de/bizer/silk/
    – SQUIN, http://squin.sourceforge.net/
    – Paget, http://code.google.com/p/paget
    Other tools (debugging, etc.)
    – curl
    – Live HTTP Headers (Firefox plug-in)
    – http://sparql.org/sparql.html
    – https://wiki.mozilla.org/Labs/Ubiquity/Commands_In_The_Wild (Web of Data, etc.)
  • 36. Tools and Applications
    Further resources regarding the publishing process:
    – http://linkeddata.org/docs/how-to-publish
    – http://events.linkeddata.org/iswc2008tutorial/
    – http://videolectures.net/iswc08_heath_hpldw/
    – http://www.w3.org/TR/swbp-vocab-pub/
    – http://vapour.sourceforge.net/
  • 37. References
    [EXPL]: 'Exploiting Linked Data For Building Web Applications', Hausenblas, 2009, accepted for publication in IEEE IC (pre-print version: http://sw-app.org/pub/exploit-lod-webapps-IEEEIC-preprint.pdf)
    [LD]: http://www.w3.org/DesignIssues/LinkedData.html
    [UA]: http://esw.w3.org/topic/FindingResourceDescriptions
    [RFC3986]: http://www.ietf.org/rfc/rfc3986.txt
    [RDF AS]: http://www.w3.org/TR/rdf-concepts/#section-Graph-URIref
    [XML NS]: http://www.w3.org/TR/xml-names/#ns-qualnames
    [CURIE]: http://www.w3.org/TR/curie/
    [RFC4395]: http://tools.ietf.org/html/rfc4395
    [RFC2616]: http://www.ietf.org/rfc/rfc2616.txt
    [HTTPbis]: http://tools.ietf.org/wg/httpbis/
    [REST]: 'Principled design of the modern Web architecture', Fielding and Taylor, 2002, http://portal.acm.org/citation.cfm?doid=514183.514185
    [SDD]: http://www.w3.org/2001/tag/doc/selfDescribingDocuments
    [AWWSW]: http://esw.w3.org/topic/AwwswHome
    [LOD]: http://esw.w3.org/topic/SweoIG/TaskForces/CommunityProjects/LinkingOpenData