Linked Data on the Web

A comprehensive overview of Linked Data, including an introduction, technical foundations, consuming Linked Data, and open research issues.

Published in: Technology, Education

  1. 1. Linked Data on the Web Olaf Hartig http://olafhartig.de/foaf.rdf#olaf Database and Information Systems Research Group Humboldt-Universität zu Berlin
  2. 2. Outline From a Web of Documents to a Web of Data Technical Foundations of Linked Data Consuming Linked Data Current Research Issues Olaf Hartig - Linked Data on the Web
  4. 4. The Traditional Web
      Traditional Web = Internet + Docs + Links
      ● HTML as shared content format
      ● HTTP to access documents on the Web
      ● URLs
        ● Globally unique identifiers for documents
        ● Retrieval mechanism
      ● Hyperlinks
        ● Single global information space
  8. 8. The Traditional Web
      So what is the problem?
      ● Web content is only loosely structured
      ● Difficult for applications to do smart things
      Solution:
      ● Increase the structure of Web content
      ● Publish data
      But wait… don't we do that already?
  11. 11. The Traditional Web
      ● Content providers offer access via Web APIs
      ● Mashups combine this data
      Shortcomings:
      ● APIs are proprietary
      ● Mashups are based on a fixed set of data sources
      ● You cannot set hyperlinks between data objects
      [Figure: four Web API boxes]
  18. 18. ● Use URIs as names for things
      ● Use HTTP URIs so that people can look up those names.
      ● When someone looks up a URI, provide useful information.
      ● Include links to other URIs so that they can discover more things.
      Tim Berners-Lee, July 2006
      [Figure: "My Movie DB" names its entities with the URIs http://mymovie.db/movie1342, http://mymovie.db/movie0362, http://mymovie.db/movie5112, and http://mymovie.db/movie2449. A client looks up http://mymovie.db/movie2449. The data links to http://geo.db/country21, http://geo.db/country7, http://geo.db/cityCJ, and http://geo.db/cityXA.]
      Olaf Hartig - Executing SPARQL Queries over the Web of Linked Data
  22. 22. Linked Data – An Example
      [Figure: RDF graph, written out here as triples]
          <http://data.linkedmdb.org/.../2014> rdf:type <http://data.linkedmdb.org/.../film> ;
              dc:title "The Shining" ;
              movie:relatedBook <http://www4.wi … /0743424425> ;
              foaf:based_near <http://sws.geonames.org/2635167/> .
          <http://sws.geonames.org/2635167/> gn:population 60943000 ;
              rdfs:label "United Kingdom" .
          <http://www4.wi … /0743424425> dc:title "The Shining" ;
              skos:subject <http://www4.wi … /Fiction> .
          <http://www4.wi … /1571884029> skos:subject <http://www4.wi … /Fiction> .
  24. 24. Properties of Linked Data
      ● Anyone can publish data to the Web of data
      ● Entities are connected by links
        ● Giant global data graph that spans data sources
      ● Data is self-describing
        ● Vocabulary terms are identified by URIs, too
        ● Look-up yields their RDFS or OWL definition
      ● The Web of data is open
        ● Applications can discover new data sources at run-time
      Is this real?
  25. 25. W3C Linking Open Data Project
      ● Grassroots community effort
      ● Publish existing, open license datasets as Linked Data
      ● Interlink things between different data sources
  26. 26. W3C Linking Open Data Project
      As of July 2007: > 500M triples, ca. 120,000 links
  28. 28. W3C Linking Open Data Project
      [Figure: interlinked datasets grouped by domain: Media, User generated content, Publications, Geographic, Cross-domain, Life Sciences]
      ca. 6.7B triples, ca. 150M links
  30. 30. Linked Data Publishers
      ● UK government
      ● US government
      ● Thomson Reuters (Open Calais)
      ● MetaWeb (Freebase)
      ● BBC
      ● NY Times
      ● Best Buy
      ● CNET
      etc.
      Can I become part?
  31. 31. Linked Data Publishing Tools
      ● Use HTTP URIs in your FOAF profile
      ● Legacy data in relational databases
        ● D2R Server, Triplify, Virtuoso, Ultrawrap, ...
      ● CMS
        ● Drupal
      ● Native RDF stores
        ● Sesame, AllegroGraph, Virtuoso
        ● Talis platform (Linked Data in the cloud)
      ● HTML with RDFa
  32. 32. Integrating the Traditional Web
      ● Annotate Web documents with Linked Data URIs
          <http://data.semanticweb.org/ … /eswc/2007/paper-69> dc:subject <http://dbpedia.org/resource/Machine_Learning>
      ● Annotation services using named entity recognition
        ● Open Calais (Thomson Reuters) for news
        ● Zemanta for blog posts
        ● Epiphany
  33. 33. Outline
      From a Web of Documents to a Web of Data
      Technical Foundations of Linked Data
      Consuming Linked Data
      Current Research Issues
  34. 34. Technical Foundations
      There is no magic – Linked Data is based on well-established (Semantic) Web technologies.
      ● HTTP
      ● URI
      ● RDF
      ● RDFS / OWL
  35. 35. URIs
      ● Hash URIs
          http://olafhartig.de/foaf.rdf#olaf
      ● Slash URIs
          http://data.linkedmdb.org/resource/film/2014
  36. 36. Looking up URIs
      Give me data about http://olafhartig.de/foaf.rdf#olaf
      HTTP Request for http://olafhartig.de/foaf.rdf
          GET /foaf.rdf HTTP/1.1
          User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6 OpenSSL/0.9.8l zlib/1.2.3
          Host: olafhartig.de
          Accept: */*
  37. 37. Looking up URIs
      HTTP Response:
          HTTP/1.1 200 OK
          Date: Thu, 11 Mar 2010 08:47:53 GMT
          Server: Apache/2.2.6 (Unix) mod_ssl/2.2.6 OpenSSL/0.9.8g
          Last-Modified: Fri, 05 Mar 2010 18:01:07 GMT
          ETag: "72a16-1946-7fe53ec0"
          Accept-Ranges: bytes
          Content-Length: 6470
          Content-Type: application/rdf+xml
          Content-Language: de

          <?xml version="1.0" encoding="UTF-8"?>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:dc="http://purl.org/dc/elements/1.1/"
                   xmlns:foaf="http://xmlns.com/foaf/0.1/">
            <foaf:PersonalProfileDocument rdf:about="">
              <foaf:maker rdf:resource="http://olafhartig.de/foaf.rdf#olaf"/>
              ...
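The look-up above requests http://olafhartig.de/foaf.rdf although the name being looked up is http://olafhartig.de/foaf.rdf#olaf. The reason is that the fragment of a hash URI is never sent to the server. A minimal Python sketch of that step follows; the helper name `document_url` is an illustrative assumption, not part of the slides.

```python
from urllib.parse import urldefrag


def document_url(uri: str) -> str:
    """Return the URL a client actually requests when it looks up `uri`.

    For a hash URI the fragment (#...) is stripped on the client side;
    a slash URI is requested unchanged.
    """
    url, _fragment = urldefrag(uri)
    return url


print(document_url("http://olafhartig.de/foaf.rdf#olaf"))
print(document_url("http://data.linkedmdb.org/resource/film/2014"))
```

The client then has to find the description of `#olaf` inside the retrieved document itself.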
  38. 38. HTTP Content Negotiation
      ● Request the resource in a specific format (representation)
      ● Use the HTTP header Accept to specify a media type
      Example:
          GET /data/dbprofs HTTP/1.1
          Host: researchersmap.informatik.hu-berlin.de
          Accept: text/rdf+n3
  39. 39. HTTP Content Negotiation
      HTTP Response:
          HTTP/1.1 200 OK
          Date: Thu, 11 Mar 2010 09:02:22 GMT
          Server: Apache/2.2.13 (Linux/SUSE)
          Content-Location: dbprofs.n3
          Vary: negotiate,accept
          TCN: choice
          Last-Modified: Tue, 05 Jan 2010 14:46:17 GMT
          ETag: "40e4d-2250-47c6be683f0e1;47c6be69482f5"
          Accept-Ranges: bytes
          Content-Length: 8784
          Content-Type: text/rdf+n3

          @prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
          @prefix foaf: <http://xmlns.com/foaf/0.1/> .
          @prefix contact: <http://www.w3.org/2000/10/swap/pim/contact#> .
          <> a foaf:Document ;
             foaf:maker <http://www.informatik.hu-berlin.de/~hartig/foaf.rdf#olaf> .
          ...
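As a rough illustration of the request above, the following Python sketch builds the same GET request with an Accept header using only the standard library. It constructs the request object without sending it; the helper name `rdf_request` is an assumption made for this example.

```python
from urllib.request import Request


def rdf_request(url: str, media_type: str = "text/rdf+n3") -> Request:
    """Build a GET request that asks for a specific representation."""
    return Request(url, headers={"Accept": media_type})


req = rdf_request("http://researchersmap.informatik.hu-berlin.de/data/dbprofs")
print(req.get_method(), req.selector)      # request line: method and path
print("Host:", req.host)
print("Accept:", req.get_header("Accept"))
```

Passing the object to `urllib.request.urlopen` would actually send it; whether the server from the slides still answers is not something this sketch assumes.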
  40. 40. URIs
      ● Hash URIs
          http://olafhartig.de/foaf.rdf#olaf
      ● Slash URIs
          http://data.linkedmdb.org/resource/film/2014
  42. 42. Redirections
      HTTP Request for http://data.linkedmdb.org/resource/film/2014
          GET /resource/film/2014 HTTP/1.1
          User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6
          Host: data.linkedmdb.org
          Accept: application/rdf+xml
      Response:
          HTTP/1.1 303 See Other
          Date: Thu, 11 Mar 2010 08:15:50 GMT
          Server: Jetty(6.1.4)
          Location: http://data.linkedmdb.org/data/film/2014
          Content-Length: 0
          Via: 1.1 data.linkedmdb.org
          Content-Type: text/plain
  44. 44. Redirections
      HTTP Request for http://data.linkedmdb.org/resource/film/2014
          GET /resource/film/2014 HTTP/1.1
          User-Agent: curl/7.19.6 (i686-pc-linux-gnu) libcurl/7.19.6
          Host: data.linkedmdb.org
          Accept: text/html
      Response:
          HTTP/1.1 303 See Other
          Date: Thu, 11 Mar 2010 08:15:50 GMT
          Server: Jetty(6.1.4)
          Location: http://data.linkedmdb.org/page/film/2014
          Content-Length: 0
          Via: 1.1 data.linkedmdb.org
          Content-Type: text/plain
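The two exchanges above show the same slash URI redirecting to different documents depending on the Accept header. The server-side decision might look roughly like the Python sketch below. It is a simplification under stated assumptions: the function name `redirect_target` is made up, the `/resource/` to `/data/` or `/page/` mapping is taken only from the LinkedMDB example, and real servers also weigh q-values in the Accept header.

```python
def redirect_target(resource_uri: str, accept: str) -> str:
    """Choose the Location of the 303 See Other response for a slash URI.

    HTML clients are sent to the human-readable page, everything else
    to the RDF data document.
    """
    kind = "page" if "text/html" in accept else "data"
    return resource_uri.replace("/resource/", f"/{kind}/", 1)


film = "http://data.linkedmdb.org/resource/film/2014"
print(redirect_target(film, "application/rdf+xml"))
print(redirect_target(film, "text/html"))
```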
  45. 45. Vocabularies and Ontologies
      ● Defined using RDFS or OWL
      ● Plenty of vocabularies exist:
        ● People
        ● Social media
        ● Commerce
        ● Events
        ● Radio and TV programmes
        ● Music
        etc.
  46. 46. owl:sameAs
          http://sws.geonames.org/2635167/
        = http://dbpedia.org/resource/United_Kingdom
        = http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000003e30b
        = http://www4.wiwiss.fu-berlin.de/factbook/resource/United_Kingdom
        = http://www4.wiwiss.fu-berlin.de/eurostat/resource/countries/United_Kingdom
  48. 48. owl:sameAs
      [Figure: RDF graph, written out here as triples]
          <http://data.linkedmdb.org/.../2014> rdf:type <http://data.linkedmdb.org/.../film> ;
              dc:title "The Shining" ;
              movie:relatedBook <http://www4.wi … /0743424425> ;
              foaf:based_near <http://sws.geonames.org/2635167/> .
          <http://sws.geonames.org/2635167/> gn:population 60943000 ;
              rdfs:label "United Kingdom" ;
              owl:sameAs <http://dbpedia.org/resource/United_Kingdom> .
          <http://dbpedia.org/resource/United_Kingdom> dbp:leader <http://dbpedia.org/resource/Gordon_Brown> ;
              db:callingCode 44 .
  49. 49. Outline
      From a Web of Documents to a Web of Data
      Technical Foundations of Linked Data
      Consuming Linked Data
      Current Research Issues
  50. 50. Consuming Linked Data … by Humans
      ● Linked Data browsers
      ● Faceted browsers
      ● On-the-fly Linked Data Mashups
      ● Linked Data based applications
  51. 51. Linked Data Browsers
      ● Provide a tabular view on retrieved RDF data
      ● Some integrate data from multiple sources
      ● Allow users to follow RDF links
      ● Multiple options:
        ● Tabulator
        ● Disco
        ● OpenLink Data Explorer
        ● Zitgist Data Viewer
        ● Marbles
        etc.
  52. 52. Faceted Browsers
      [Screenshot: http://dbpedia.neofonie.de]
  53. 53. On-the-fly Mashups
      [Screenshot: http://sig.ma]
  54. 54. Linked Data based Applications [SFSW'09]
  55. 55. New Kind of Applications
      ● Users retain full control over their data
      ● Users manage and publish data on their own
      ● All that is needed for the application is a URI
      http://researchersmap.informatik.hu-berlin.de/data/dbprofs
          …
          <http://www.dbis.informatik.hu-berlin.de/ … /freytag.rdf#me> rdf:type :DBProfessor .
          …
  56. 56. Users Really Own their Data
      http://www.dbis.informatik.hu-berlin.de/ ... /freytag.rdf
          …
          <http://www.dbis.informatik.hu-berlin.de/ … /freytag.rdf#me>
              contact:fullName "Prof. Johann-Christoph Freytag, Ph.D." ;
              contact:office [
                  contact:address [
                      contact:street "Rudower Chaussee 25" ;
                      contact:city "Berlin"^^xsd:string ;
                      contact:postalCode "12489"^^xsd:string ] ] ;
              foaf:topic_interest <http://dbpedia.org/resource/Query_optimization> ,
                                  <http://dbpedia.org/resource/Privacy> ,
                                  <http://dbpedia.org/resource/Data_quality> ,
                                  <http://dbpedia.org/resource/Data_warehouse> ;
              owl:sameAs <http://dblp.l3s.de/d2r/resource/authors/Johann_Christoph_Freytag> .
          …
  57. 57. Consuming Linked Data … in Applications
      ● Look up URIs and process the retrieved data
      ● Query with SPARQL
  58. 58. Brief Introduction to SPARQL
      ● Query language for RDF data
      ● Main idea: pattern matching
        ● Describe subgraphs of the queried RDF graph
        ● Subgraphs that match your description yield a result
        ● Means: graph patterns (i.e. RDF graphs with variables)
      Example graph pattern:
          ?v  rdf:type  http://.../Volcano
  59. 59. Brief Introduction to SPARQL
      Queried graph:
          http://.../Mount_Baker  rdf:type        http://.../Volcano
          http://.../Mount_Baker  p:lastEruption  "1880"
          http://.../Mount_Etna   rdf:type        http://.../Volcano
      Graph pattern:
          ?v  rdf:type  http://.../Volcano
      Results:
          ?v
          http://.../Mount_Baker
          http://.../Mount_Etna
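The pattern-matching idea can be sketched in a few lines of Python. This is a toy matcher for a single triple pattern over an in-memory list of triples, written for illustration only; it is not how a SPARQL engine is implemented, and the names `match`, `graph`, and `results` are assumptions of this example.

```python
from typing import Dict, Iterator, List, Tuple

Triple = Tuple[str, str, str]


def match(pattern: Triple, graph: List[Triple]) -> Iterator[Dict[str, str]]:
    """Yield one variable binding for every triple that matches the pattern."""
    for triple in graph:
        binding: Dict[str, str] = {}
        for p_term, t_term in zip(pattern, triple):
            if p_term.startswith("?"):
                # A variable matches any term, but must bind consistently.
                if binding.setdefault(p_term, t_term) != t_term:
                    break
            elif p_term != t_term:
                break  # a constant in the pattern must match exactly
        else:
            yield binding


# The queried graph from the slide, with its elided URIs kept as they are.
graph: List[Triple] = [
    ("http://.../Mount_Baker", "rdf:type", "http://.../Volcano"),
    ("http://.../Mount_Baker", "p:lastEruption", "1880"),
    ("http://.../Mount_Etna", "rdf:type", "http://.../Volcano"),
]

results = [b["?v"] for b in match(("?v", "rdf:type", "http://.../Volcano"), graph)]
print(results)
```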
  60. 60. Querying Linked Data with SPARQL
      ● Linked Data sources usually provide a SPARQL service
      ● Send your query, receive the result

      Data Source            Endpoint Address
      DBpedia                http://dbpedia.org/sparql
      Musicbrainz            http://dbtune.org/musicbrainz/sparql
      U.S. Census            http://www.rdfabout.com/sparql
      Semantic Crunchbase    http://cb.semsol.org/sparql

      More complete list: http://esw.w3.org/topic/SparqlEndpoints
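"Send your query" usually means an HTTP request in which the query text travels in a `query` parameter, as defined by the SPARQL protocol. The Python sketch below only builds such a GET URL and does not contact any endpoint; the helper name `sparql_get_url` and the example query are assumptions of this sketch.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint: str, query: str) -> str:
    """Encode a SPARQL query as the `query` parameter of a GET request."""
    return endpoint + "?" + urlencode({"query": query})


QUERY = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#> "
    "SELECT DISTINCT ?s WHERE { ?s rdfs:label ?label } LIMIT 10"
)

url = sparql_get_url("http://dbpedia.org/sparql", QUERY)
print(url)
```

A client would fetch this URL with an Accept header naming the result format it wants; which formats a given endpoint supports varies.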
  61. 61. Querying Linked Data with SPARQL
      Querying a single dataset is quite boring compared to:
      Issuing SPARQL queries over multiple datasets
      How can you do this?
      ● Issue follow-up queries to different endpoints
      ● Query a central collection of datasets
      ● Build store with copies of relevant datasets
      ● (Use query federation system)
      ● Use a link traversal based query system
  62. 62. Querying Linked Data with SPARQL
      Traditional approach 1: data centralization
      ● Querying a collection of copies from all relevant datasets
  63. 63. Querying Linked Data with SPARQL
      Traditional approach 2: federated query processing
      ● Querying a mediator which distributes subqueries to relevant sources and integrates the results
  64. 64. Main drawback:
      You have to know the relevant data sources in advance.
      You restrict yourself to the selected sources.
      You do not tap the full potential of the Web!
  65. 65. A novel approach: Link Traversal Based Query Execution [ISWC'09]
  66. 66. Main Idea
      ● Intertwine query evaluation with traversal of RDF links
      ● Alternately:
        ● Evaluate parts of the query on a continuously augmented set of data
        ● Look up URIs in intermediate solutions and add retrieved data to the queried data set
      [Figure: the queried data, initially empty]
  67. 67. Main Idea (slides 67 to 81 animate one example; the frames are condensed here)
      Query (graph pattern):
          <http://.../movie2449>  filmingLocation  ?loc .
          ?loc   statistics  ?stat .
          ?stat  unemp_rate  ?ur .
      Steps shown in the animation:
      1. Look up http://.../movie2449 and add the retrieved data to the queried data
      2. The queried data now contains: <http://.../movie2449> filmingLocation <http://geo.../Italy>
      3. Intermediate solution: ?loc = http://geo.../Italy
      4. Look up http://geo.../Italy and add the retrieved data to the queried data
      5. The queried data now contains: <http://geo.../Italy> statistics <http://stat.db/.../it>
      6. Intermediate solution: ?loc = http://geo.../Italy, ?stat = http://stat.db/.../it
  82. 82. In a Nutshell
      ● Link traversal based query execution:
        ● Evaluation on a continuously augmented dataset
        ● Discovery of potentially relevant data during execution
        ● Discovery driven by intermediate solutions
      ● Main advantage:
        ● No need to know all data sources in advance
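The alternation of evaluation and URI look-up can be sketched as follows. This is a toy Python model under loud assumptions, not the SQUIN implementation: the "Web" is a dictionary from URIs to the triples a look-up would return, all `example` URIs and the unemployment value are invented placeholders, and patterns are evaluated strictly one after another.

```python
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]
Binding = Dict[str, str]

# Hypothetical stand-in for the Web: URI -> triples returned by a look-up.
WEB: Dict[str, List[Triple]] = {
    "http://mymovie.example/movie2449": [
        ("http://mymovie.example/movie2449", "filmingLocation", "http://geo.example/Italy")],
    "http://geo.example/Italy": [
        ("http://geo.example/Italy", "statistics", "http://stat.example/it")],
    "http://stat.example/it": [
        ("http://stat.example/it", "unemp_rate", "placeholder-rate")],
}


def execute(patterns: List[Triple]) -> List[Binding]:
    data: Set[Triple] = set()          # the continuously augmented queried data
    solutions: List[Binding] = [{}]    # intermediate solutions
    for pattern in patterns:
        next_solutions: List[Binding] = []
        for binding in solutions:
            # Apply the intermediate solution to the next pattern.
            bound = tuple(binding.get(term, term) for term in pattern)
            # Traverse links: look up every URI and add what comes back.
            for term in bound:
                if term.startswith("http://"):
                    data.update(WEB.get(term, []))
            # Evaluate the pattern on the augmented data.
            for triple in data:
                new = dict(binding)
                if all(p == t or (p.startswith("?") and new.setdefault(p, t) == t)
                       for p, t in zip(bound, triple)):
                    next_solutions.append(new)
        solutions = next_solutions
    return solutions


QUERY: List[Triple] = [
    ("http://mymovie.example/movie2449", "filmingLocation", "?loc"),
    ("?loc", "statistics", "?stat"),
    ("?stat", "unemp_rate", "?ur"),
]
print(execute(QUERY))
```

No data source other than the URI in the query is named in advance; the other two documents are discovered through the intermediate solutions.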
  83. 83. Real-World Example
      Return phone numbers of authors of ontology engineering papers at ESWC'09.
          SELECT DISTINCT ?author ?phone WHERE {
            ?pub swc:isPartOf <http://data.semanticweb.org/conference/eswc/2009/proceedings> .
            ?pub swc:hasTopic ?topic .
            ?topic rdfs:label ?topicLabel .
            FILTER regex( str(?topicLabel), "ontology engineering", "i" ) .
            ?pub swrc:author ?author .
            { ?author owl:sameAs ?authorAlt }
            UNION
            { ?authorAlt owl:sameAs ?author }
            ?authorAlt foaf:phone ?phone
          }

      # of query results        2
      # of retrieved graphs     297
      # of accessed servers     16
      avg. execution time       1min 30sec
  84. 84. Application
      ● Researchers Map implemented with SQUIN
      ● Query interface to the whole Web of Data
          SELECT DISTINCT ?i ?label WHERE {
            ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> ;
                  foaf:topic_interest ?i .
            OPTIONAL { ?i rdfs:label ?label
                       FILTER( LANG(?label)="en" || LANG(?label)="") }
          } ORDER BY ?label
      [Figure labels: SQUIN, SemWeb Client Lib]
  85. 85. Application
          SELECT DISTINCT ?i ?label WHERE {
            ?prof rdf:type <http://res ... data/dbprofs#DBProfessor> .
            ?prof foaf:topic_interest ?i .
            OPTIONAL { ?i rdfs:label ?label
                       FILTER( LANG(?label)="en" || LANG(?label)="") }
          } ORDER BY ?label
  86. 86. Application
      ● Implementation of Researchers Map was very easy due to:
        ● SQUIN / SemWeb Client Lib
        ● Approx. 700 LOC JavaScript (incl. 100 for the queries)
        ● Approx. 50 LOC PHP (mainly to set up a server-side proxy due to the same-origin policy)
      ● Convenient access to SQUIN with SQUIN PHP tools
          $s = 'http:// …'; // address of the SQUIN service
          $q = new SparqlQuerySock( $s, '… SELECT ...' );
          $res = $q->getJsonResult(); // or getXmlResult()
      ● Try it: http://squin.org
  87. 87. Consuming Linked Data … getting started
      Issues people have when they want to start:
      ● Finding URIs
      ● Finding additional data
      ● Finding SPARQL endpoints
  88. 88. Finding URIs
      Problem: What URIs exist that identify the thing I'm interested in?
      Two options:
      ● Data source specific solutions
        ● Some Linked Data sources provide a keyword based search for things in their dataset(s)
      ● Search engines for the Web of data
  91. 91. Finding URIs
      What if there is no search possibility? You may try a SPARQL query:
          SELECT DISTINCT ?s WHERE {
            ?s rdfs:label ?label .
            FILTER regex( str(?label), "Berlin", "i" ) .
          }
  92. 92. Finding URIs
      ● Search engines for the Web of data provide keyword based search for things in different datasets
        ● Falcons http://iws.seu.edu.cn/services/falcons/
        ● Sindice http://sindice.com
        ● SWSE http://www.swse.org
        ● Watson http://watson.kmi.open.ac.uk
      ● They also have APIs
  96. 96. Finding Additional Data
      Problem: Given a URI, where do I find more data than what is available by looking it up?
      Three options:
      ● Follow links (e.g. rdfs:seeAlso, owl:sameAs)
      ● Use a search engine for the Web of data
      ● Use a co-reference service
        ● Co-reference services find different URIs that refer to the same thing
        ● They may also provide an API
  99. 99. Finding SPARQL Endpoints
      Problem:
      What relevant endpoints exist?
      Where is the SPARQL endpoint for a dataset?
      What is the data provided via a SPARQL endpoint about?
      ● Look at: http://esw.w3.org/topic/SparqlEndpoints
      ● Still an open issue
  100. 100. Outline
      From a Web of Documents to a Web of Data
      Technical Foundations of Linked Data
      Consuming Linked Data
      Current Research Issues
  101. 101. Linked Data Fusion
      Applications want an integrated view on all data that is available about a thing
      Requirements:
      ● Schema mapping: map data into a single schema
      ● Identity resolution: smush data from all sources
      ● Conflict resolution: resolve inconsistencies in the data
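One simple way to approach the identity resolution step is to group URIs that are connected by owl:sameAs links, so that data about all members of a group can be merged. The Python sketch below does this with a union-find structure. It is only an illustration under assumptions: the function name `same_as_clusters` is made up, and the particular pairing of the owl:sameAs links is invented for the example (the slides only state that these URIs denote the same thing).

```python
from typing import Dict, Iterable, List, Set, Tuple


def same_as_clusters(links: Iterable[Tuple[str, str]]) -> List[Set[str]]:
    """Group URIs into clusters connected by owl:sameAs links."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    for a, b in links:
        parent[find(a)] = find(b)  # union the two clusters

    clusters: Dict[str, Set[str]] = {}
    for uri in list(parent):
        clusters.setdefault(find(uri), set()).add(uri)
    return list(clusters.values())


GEONAMES_UK = "http://sws.geonames.org/2635167/"
DBPEDIA_UK = "http://dbpedia.org/resource/United_Kingdom"
FACTBOOK_UK = "http://www4.wiwiss.fu-berlin.de/factbook/resource/United_Kingdom"

clusters = same_as_clusters([(GEONAMES_UK, DBPEDIA_UK), (FACTBOOK_UK, DBPEDIA_UK)])
print(clusters)
```

Schema mapping and conflict resolution are separate problems that this sketch does not touch.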
  102. 102. User Interfaces and Interaction
      ● How do we build interfaces that operate over such a large amount of data?
      ● What will be their interaction paradigm?
      ● How to explain data provenance and data fusion?
  103. 103. Provenance, Quality, and Trust
      ● There are no facts on the Web – everything is a claim
      ● Increasing amount of research in this area
      ● W3C provenance incubator group
      ● Our contributions so far:
        ● A provenance model for the Web of data [LDOW'09]
        ● A provenance based Information Quality assessment method [SWPM'09]
        ● tSPARQL – a trust aware extension for SPARQL [ESWC'09]
  104. 104. Take-away Summary
      The traditional Web of documents evolves into a Web of data.
      ● Entities are connected by data links
      ● Data is self-describing
      ● Anyone can publish data to the Web of data
      ● Linked Data holds an enormous potential: users may benefit from a virtually unbounded set of data sources
      ● Learn more about Linked Data:
        ● “Linked Data – The Story So Far” by C. Bizer, T. Heath, T. Berners-Lee
        ● On consuming Linked Data: http://consuminglinkeddata.org
  105. 105. These slides have been created by Olaf Hartig http://olafhartig.de
      Some slides are based on slide sets provided by
      ● Christian Bizer
      ● Juan Sequeda
      This work is licensed under a Creative Commons Attribution-Share Alike 3.0 License (http://creativecommons.org/licenses/by-sa/3.0/)
